Advanced Calculus 2 for Electrical Engineers
MATH-212/ECE-206
FALL TERM, 2013
Andrew J. Heunis ©
Department of Electrical and Computer Engineering
University of Waterloo
Waterloo
Ontario N2L 3G1
February 20, 2014
Contents
1 Goals and Preview
2 Multidimensional Integration
2.1 Two Dimensional Integration
2.2 Three Dimensional Integration
3 Scalar and Vector Fields
3.1 Motivating Examples
3.2 Definition of Vector and Scalar Fields
4 Curves and Paths in Space
4.1 Motivating Examples
4.2 Paths and Parametric Representation of Curves
4.3 Derivatives Along a Path and Tangent to a Curve
4.4 Simple Curves and Closed Curves
5 Line Integral and Arc Length
5.1 Line Integral of a Vector Field
5.2 Line Integral of Scalar Field and Arc Length
6 Conservative Vector Fields
6.1 Gradient of a Scalar Field
6.2 Conservative Vector Fields
6.3 Conservation of Energy
7 Green’s Theorem in the Plane
7.1 Green’s Theorem for Rectangles
7.2 Green’s Theorem: General Case
8 Surfaces, Surface Area and Surface Integrals
8.1 Parametric Representation of Surfaces
8.2 Tangents to a Surface and Smooth Surfaces
8.3 Area of a Surface
8.4 Surface Integral of a Scalar Field
8.5 Surface Integral of a Vector Field
9 Vector Calculus
9.1 Differential Operators of Vector Calculus: Divergence, Curl, Laplacian
9.2 Theorem of Stokes
9.3 Divergence Theorem of Gauss-Ostrogradskii
9.4 The Continuity Equation
10 The Basic Laws of Electricity and Magnetism
10.1 Static Electric Fields
10.2 Static Magnetic Fields
10.3 Time Varying Fields
11 Maxwell’s Equations
11.1 The Ampere-Maxwell Law for Time Varying Fields
11.2 Maxwell’s Equations
11.3 Electromagnetic Waves without Sources
11.4 Electromagnetic Waves with Sources
12 Cylindrical and Spherical Coordinates
12.1 Polar Coordinates
12.2 Cylindrical Coordinates
12.3 Spherical Coordinates
Chapter 1
Goals and Preview
This course continues the sequence of calculus courses that you have taken during the last several
years, and is specifically about vector calculus and the calculus of complex variables. Vector calculus
builds upon the elementary calculus you have learned, but is far more powerful than this elementary
calculus. Indeed, comparing vector calculus with elementary calculus is like comparing a turbo-
charged Mercedes Benz with an oxcart. The power of vector calculus is really a consequence of its
main theorems; these are Green’s theorem, Stokes’ theorem and the Gauss-Ostrogradskii theorem,
and we shall see all of these in the course. The ideas and theorems of vector calculus are completely
indispensable for modern science and technology, and are used in electromagnetism, aerodynamics,
fluid mechanics, classical mechanics, quantum mechanics and gravitational physics. In particular,
vector calculus is the essential tool for really understanding the laws of electricity and magnetism,
for vector calculus enables one to effectively compress every single known law of electricity and
magnetism into a set of just four equations, called Maxwell’s equations. In this course we shall apply
the tools of vector calculus to a preliminary study of the main properties of Maxwell’s equations,
and you will be seeing much more on these equations in follow-up courses on electromagnetic fields
and electromagnetic waves.
Concerning Maxwell’s equations the physicist Richard P. Feynman (Nobel Prize in Physics,
1965) stated “From a long view of the history of mankind - seen from, say, ten thousand years from
now - there can be little doubt that the most significant event of the 19th century will be judged as
Maxwell’s discovery of the laws of electrodynamics”.¹ That is, the discovery of Maxwell’s equations
(c. 1861 - 1865) transcends in importance absolutely everything that took place in the 19th century,
including the depressing litany of wars, revolutions, colonizations, exploitations and land-grabs, and
the usual political double-dealing, finagling, back-stabbing and horse-trading. Ten thousand years
from now all of that will have been largely forgotten, but the priceless value of Maxwell’s equations
will remain. In fact, without Maxwell’s equations, there would be quite literally no modern science
or modern technology, and therefore of course no modern civilization either. Today Maxwell’s
equations are indispensable to scientists working at the very frontiers of physics, on problems of
high energy physics, gravitational physics and quantum electrodynamics, and are just as essential for
engineering applications, for these equations are the very key to radio, television, radar, microwave
ovens, microwave communications, space satellites, cell phones, the internet ... the list is virtually
endless!

¹ See Richard P. Feynman, “The Feynman Lectures on Physics, Volume II: Mainly Electromagnetism and Matter”,
Chapter 1, Section 6 (Electromagnetism in science and technology).
Here are Maxwell’s marvelous equations:
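$$\nabla \cdot \mathbf{E} = \frac{\rho}{\varepsilon_0},$$
$$\nabla \cdot \mathbf{B} = 0,$$
$$\nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} = 0,$$
$$\nabla \times \mathbf{B} = \mu_0 \mathbf{J} + \varepsilon_0 \mu_0 \frac{\partial \mathbf{E}}{\partial t}.$$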
These equations will likely look impenetrable to you now, but you will be quite familiar with them
by the end of this course. At this point you may perhaps recognize some of the symbols from
elementary physics, such as the electric field $\mathbf{E}$ and the magnetic field $\mathbf{B}$, and we will shortly find
out about the charge density field ρ and the current density field $\mathbf{J}$. However, what are the bizarre
looking symbols ∇· and ∇×, and how does one extract such an enormous history-transforming
punch out of these equations? The answers to these questions are to be found in the power of the
vector calculus that we are going to study in this course.
Chapter 2
Multidimensional Integration
Vector calculus involves a number of rather fancy integrals that we shall study later in the course,
such as line integrals and surface integrals. In fact, the basic theorems which give vector calculus its
extraordinary power (that is, Green’s theorem, the theorem of Stokes and the theorem of Gauss-Ostrogradskii)
can only be formulated in terms of line and surface integrals. However, these integrals, and especially
the surface integral, in their turn rely on the simpler ideas of multidimensional integration,
specifically two dimensional “dx dy” integrals (or double integrals) and three dimensional “dx dy dz”
integrals (or triple integrals). We are therefore going to devote this chapter to recalling the main
ideas of multidimensional integration. You should also note that multidimensional integrals serve
not only the requirements of vector calculus, but are also indispensable in areas such as probability
and communications, where vector calculus does not naturally arise. The ideas of the present
chapter are therefore also prerequisites for later courses that you will take on probability and
communications.
2.1 Two Dimensional Integration
In the present section we focus on two dimensional integration, that is the integration of functions
defined on portions of the plane R2. You are likely already familiar with two dimensional integration,
but in view of its huge importance we shall briefly recall the main aspects of two dimensional
integration here. Suppose we have a real-valued function
$$f : D \to \mathbb{R}, \tag{2.1.1}$$
in which $D \subset \mathbb{R}^2$ is the rectangle shown in Figure 2.1.

Figure 2.1: Rectangle D in the plane R2

The sides of the rectangle D are the intervals a ≤ x ≤ b and c ≤ y ≤ d, and in the notation of sets we write D as
$$D = \{(x, y) \in \mathbb{R}^2 \mid a \le x \le b \text{ and } c \le y \le d\}. \tag{2.1.2}$$
For the sake of brevity we will usually denote this rectangle in the following “mathematical” notation
$$D = [a, b] \times [c, d], \tag{2.1.3}$$
in which the intervals a ≤ x ≤ b and c ≤ y ≤ d are indicated by the abbreviated notations [a, b]
and [c, d]. We now define what is meant by the integral of the function f over the rectangle D. To
this end subdivide the interval a ≤ x ≤ b using n + 1 equally spaced points $x_j$ and subdivide the
interval c ≤ y ≤ d using n + 1 equally spaced points $y_k$, that is
$$a = x_0 < x_1 < \ldots < x_n = b, \qquad c = y_0 < y_1 < \ldots < y_n = d, \tag{2.1.4}$$
with spacing ∆x and ∆y between successive subdivision points
$$\Delta x := x_{j+1} - x_j = \frac{b - a}{n}, \qquad \Delta y := y_{k+1} - y_k = \frac{d - c}{n}, \tag{2.1.5}$$
(see Figure 2.2). Let $D_{jk}$ be the (small) rectangle given by
$$D_{jk} := \{(x, y) \in \mathbb{R}^2 \mid x_j \le x \le x_{j+1} \text{ and } y_k \le y \le y_{k+1}\} \equiv [x_j, x_{j+1}] \times [y_k, y_{k+1}], \tag{2.1.6}$$
Figure 2.2: Rectangles Djk and D in the plane R2
and fix some point $\mathbf{r}_{jk} := (\xi_j, \eta_k)$ in $D_{jk}$, that is $x_j \le \xi_j \le x_{j+1}$ and $y_k \le \eta_k \le y_{k+1}$ (again see
Figure 2.2).
Now define the Riemann sum of the function f on the rectangle D:
$$S_n := \sum_{j=0}^{n-1} \sum_{k=0}^{n-1} f(\xi_j, \eta_k)\, \Delta x\, \Delta y, \tag{2.1.7}$$
for each n = 1, 2, . . .. We can define the integral of the function f over the rectangle D as follows:
Definition 2.1.1. If the sequence of Riemann sums Sn, n = 1, 2, . . . converges to a limit S as
n → ∞, and the limit S is the same for every choice of points (ξj, ηk) in Djk, then S is called the
integral of the function f over the rectangle D.
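
To make Definition 2.1.1 concrete, here is a minimal Python sketch of the Riemann sum (2.1.7). The integrand f(x, y) = x + y, the rectangle [0, 1] × [0, 2] and the `pick` parameter (which selects the sample point inside each small rectangle) are illustrative assumptions and are not taken from the notes. For this choice the sums should approach the same limit, 3, whichever sample points are used.

```python
def riemann_sum(f, a, b, c, d, n, pick=0.0):
    """Riemann sum S_n of f over [a, b] x [c, d], following (2.1.7).

    pick in [0, 1] selects the sample point inside each small rectangle:
    0.0 is the lower-left corner, 0.5 the centre, 1.0 the upper-right corner.
    """
    dx = (b - a) / n
    dy = (d - c) / n
    total = 0.0
    for j in range(n):
        xi = a + (j + pick) * dx          # xi_j lies in [x_j, x_{j+1}]
        for k in range(n):
            eta = c + (k + pick) * dy     # eta_k lies in [y_k, y_{k+1}]
            total += f(xi, eta) * dx * dy
    return total


def f(x, y):
    # Illustrative integrand (an assumption, not from the notes).
    return x + y


if __name__ == "__main__":
    for n in (10, 50, 200):
        low = riemann_sum(f, 0.0, 1.0, 0.0, 2.0, n, pick=0.0)
        high = riemann_sum(f, 0.0, 1.0, 0.0, 2.0, n, pick=1.0)
        print(n, low, high)   # both columns should move towards 3 as n grows
```

The two extreme choices of sample point bracket the limit from below and above for this increasing integrand, which illustrates the requirement that the limit S must not depend on the choice of points.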
Remark 2.1.2. The various notations for the integral in Definition 2.1.1 are
$$\int_D f(x, y)\, dx\, dy, \qquad \int_D f(x, y)\, dA, \qquad \int_D f\, dx\, dy, \qquad \int_D f\, dA. \tag{2.1.8}$$
The essential elements in all of these notations are the subscript D attached to the integral, indicating
the region in R2 over which one integrates, and the integrand f indicating the function being
integrated. The symbol “ dA” is effectively just shorthand for “ dx dy”. The first two notations
explicitly remind us that we are integrating over D with respect to an underlying space variable in
R2 which is generically denoted by (x, y). It can be quite tedious to keep carrying the space variable
(x, y), and so, in the third and fourth notations of (2.1.8), this variable is suppressed, but always
understood to be present!
Remark 2.1.3. Definition 2.1.1 raises a number of questions. What is the situation if the sequence
of Riemann sums Sn, n = 1, 2, . . . fails to converge to any limit? In this case the integral of
the function f over the rectangle D does “not make sense” and is said to be undefined. Another
very natural question: under what conditions on f can we be sure that the integral of f over a
rectangle D “makes sense” (or is defined) in the sense of Definition 2.1.1? This is a rather profound
question, the answer to which is given by a branch of mathematics called the abstract theory of
Lebesgue measure and integration. Fortunately, we need never be concerned with this question, for
the abstract theory of measure and integration tells us that the class of functions which can be
integrated over D is simply huge, and we are completely safe in assuming that every function that
we shall encounter has an integral which is defined.
Remark 2.1.4. It is one thing to formulate the definition of an integral, as we have done in
Definition 2.1.1, quite another matter to actually calculate the integral over a rectangle D of a given
function f . For this we need an essential result called Fubini’s theorem. To state this result suppose
that f : D → R and D is the rectangle at (2.1.3) i.e. shown in Figure 2.1. Now define the function
h1(x) for all a ≤ x ≤ b as follows:
$$h_1(x) := \int_c^d f(x, y)\, dy, \quad \text{for all } a \le x \le b. \tag{2.1.9}$$
It is most important to understand the sense in which the integration on the right side of (2.1.9)
is meant: We fix some x in the interval a ≤ x ≤ b which leaves us with a function f(x, y) which
depends only on y in the interval c ≤ y ≤ d (since x is fixed). The right side of (2.1.9) is the integral
of this function with respect to y in the interval c ≤ y ≤ d. Of course, for different choices of x in
the interval a ≤ x ≤ b, we generally get different values for the integral, that is we get a function
h1(x) of x over the interval a ≤ x ≤ b. The important thing is that the integral in (2.1.9) is just
an ordinary single-variable integral over an interval, and this is usually quite easy to evaluate. In
exactly the same way we also define the function h2(y) for all c ≤ y ≤ d as follows:
$$h_2(y) := \int_a^b f(x, y)\, dx, \quad \text{for all } c \le y \le d. \tag{2.1.10}$$
Having defined the integrals at (2.1.9) and (2.1.10) we can state
Theorem 2.1.5 (Fubini theorem for rectangles in R2). Suppose that f : D → R where D is the
rectangle at (2.1.3), and h1(x) and h2(y) are defined by (2.1.9) and (2.1.10) respectively. Then
$$\int_D f(x, y)\, dx\, dy = \int_a^b h_1(x)\, dx = \int_c^d h_2(y)\, dy. \tag{2.1.11}$$
Remark 2.1.6. The equalities at (2.1.11) are usually written in a more complete and self-contained
way as follows:
$$\int_D f(x, y)\, dx\, dy = \int_a^b \left\{ \int_c^d f(x, y)\, dy \right\} dx = \int_c^d \left\{ \int_a^b f(x, y)\, dx \right\} dy. \tag{2.1.12}$$
Notice that in the right-hand integral in (2.1.12) we first keep y fixed and integrate with respect
to x to get the “inner integral” in braces, and then integrate with respect to y, whereas for the
middle integral in (2.1.12) we do things the other way around. The absolutely essential thing
about Fubini’s theorem is that it reduces evaluation of an integral over a rectangle to the successive
evaluations of two integrals over intervals. These are called iterated integrals. Either we can use
the iterated integral in the middle or the iterated integral on the right of (2.1.12). Both choices will
work (Fubini’s theorem guarantees this!) but in practice it is often the case that one choice involves
less work than the other.
Example 2.1.7. For the function f : D → R given by
$$f(x, y) = x^2 + y^2, \quad \text{with } D := [-1, 1] \times [0, 1], \tag{2.1.13}$$
evaluate the integral $\int_D f(x, y)\, dx\, dy$.
We use Fubini’s theorem with a = −1, b = 1, c = 0 and d = 1. Following (2.1.9), put
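$$\begin{aligned}
h_1(x) &:= \int_c^d f(x, y)\, dy \\
&= \int_0^1 [x^2 + y^2]\, dy \quad \text{(from (2.1.13))} \\
&= \left[ x^2 y + \frac{y^3}{3} \right]_{y=0}^{y=1} \quad \text{(keeping $x$ constant in the $dy$-integration)} \\
&= x^2 + \frac{1}{3}.
\end{aligned} \tag{2.1.14}$$
Now use Fubini’s theorem in the form of (2.1.11):
$$\begin{aligned}
\int_D f(x, y)\, dx\, dy &= \int_a^b h_1(x)\, dx \\
&= \int_{-1}^{1} \left[ x^2 + \frac{1}{3} \right] dx = \frac{4}{3}.
\end{aligned} \tag{2.1.15}$$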
You should now repeat this calculation, but using (2.1.10) instead of (2.1.9), to verify that you get
the same value for the integral.
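
As a numerical cross-check of Example 2.1.7, the following Python sketch evaluates both iterated integrals of (2.1.12) for the function at (2.1.13). The helper `simpson` (a composite Simpson rule) and the variable names `dy_then_dx` and `dx_then_dy` are our own illustrative choices and do not appear in the notes. Both orders of integration should return a value very close to 4/3.

```python
def simpson(g, lo, hi, n=200):
    """Composite Simpson rule for the single-variable integral of g on [lo, hi]."""
    if n % 2:
        n += 1                      # Simpson's rule needs an even number of strips
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3.0


def f(x, y):
    return x**2 + y**2              # integrand of Example 2.1.7


a, b, c, d = -1.0, 1.0, 0.0, 1.0    # D = [-1, 1] x [0, 1]


def h1(x):
    # (2.1.9): integrate over y with x held fixed
    return simpson(lambda y: f(x, y), c, d)


def h2(y):
    # (2.1.10): integrate over x with y held fixed
    return simpson(lambda x: f(x, y), a, b)


dy_then_dx = simpson(h1, a, b)      # middle integral of (2.1.12)
dx_then_dy = simpson(h2, c, d)      # right-hand integral of (2.1.12)

if __name__ == "__main__":
    print(dy_then_dx, dx_then_dy)   # both should be close to 4/3
```

Simpson's rule is exact for polynomials of degree at most three, so for this particular integrand the only error is floating-point rounding.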
Remark 2.1.8. Suppose we must integrate a function f : D → R when D ⊂ R2 is not a rectangle
with sides parallel to the x and y axes, which is the only kind of region we have integrated over so far. An obvious example
of such a non-rectangular region is the unit disc, that is the disc of unit radius centered at the
origin of R2. We cannot directly evaluate an integral over the unit disc by Fubini’s theorem, which
is restricted to integration over rectangular regions of R2. However, we can easily modify Fubini’s
theorem to integrate over a large class of non-rectangular regions D ⊂ R2 provided that these
regions are not too complicated. To formulate such a region suppose that φ1 : [a, b] → R and
φ2 : [a, b]→ R are given continuous functions over some fixed interval a ≤ x ≤ b such that
(2.1.16) φ1(x) ≤ φ2(x) for all a ≤ x ≤ b.
The region D ⊂ R2 is called y-simple with lower function φ1(x), upper function φ2(x) and common
interval of definition a ≤ x ≤ b, when D is the set of all points (x, y) such that (see Figure 2.3)
(2.1.17) a ≤ x ≤ b and φ1(x) ≤ y ≤ φ2(x),
that is
$$D = \{(x, y) \in \mathbb{R}^2 \mid a \le x \le b,\ \varphi_1(x) \le y \le \varphi_2(x)\}. \tag{2.1.18}$$
In short, a y-simple region D ⊂ R2 with lower function φ1 : [a, b] → R and upper function
φ2 : [a, b]→ R is lower bounded by the graph of the function φ1(x) and upper bounded by the graph
of the function φ2(x). From now on we fix constants c and d such that
(2.1.19) c < φ1(x) ≤ φ2(x) < d for all a ≤ x ≤ b.
It then follows that the region D is contained within the rectangle E defined by
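$$E := \{(x, y) \in \mathbb{R}^2 \mid a \le x \le b \text{ and } c \le y \le d\} = [a, b] \times [c, d], \tag{2.1.20}$$
(see Figure 2.3). Now define the function $f^*$ on the rectangle E as
$$f^*(x, y) := \begin{cases} f(x, y), & \text{for all } (x, y) \text{ in } D, \\ 0, & \text{for all } (x, y) \text{ in } E \text{ but outside } D. \end{cases} \tag{2.1.21}$$
It is then evident that
$$\int_D f(x, y)\, dx\, dy = \int_E f^*(x, y)\, dx\, dy. \tag{2.1.22}$$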
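
The zero-extension idea at (2.1.21) and (2.1.22) can be tried out numerically. The sketch below is only an illustration under assumed data: the lower function φ1(x) = x², the upper function φ2(x) = √x on 0 ≤ x ≤ 1, the integrand f(x, y) = x + y and the enclosing rectangle E = [0, 1] × [−0.5, 1.5] are our own choices, not taken from the notes. A Riemann sum of f* over the rectangle E should settle near 3/10, which is the value of the integral of f over D for this choice.

```python
import math


def phi1(x):
    return x * x            # lower function (illustrative assumption)


def phi2(x):
    return math.sqrt(x)     # upper function (illustrative assumption)


def f(x, y):
    return x + y            # integrand on D (illustrative assumption)


def f_star(x, y):
    """Zero extension (2.1.21): equal to f on D and to 0 elsewhere in E."""
    return f(x, y) if phi1(x) <= y <= phi2(x) else 0.0


def midpoint_sum(g, a, b, c, d, n):
    """Riemann sum of g over the rectangle [a, b] x [c, d] using cell centres."""
    dx, dy = (b - a) / n, (d - c) / n
    total = 0.0
    for j in range(n):
        x = a + (j + 0.5) * dx
        for k in range(n):
            total += g(x, c + (k + 0.5) * dy)
    return total * dx * dy


# E = [0, 1] x [-0.5, 1.5] satisfies c < phi1(x) <= phi2(x) < d, as in (2.1.19).
approx = midpoint_sum(f_star, 0.0, 1.0, -0.5, 1.5, 400)

if __name__ == "__main__":
    print(approx)           # should be close to 0.3
```

Because f* jumps to zero across the boundary of D, the Riemann sum converges only slowly, which is one practical reason for preferring an iterated integral over D itself.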
Figure 2.3: y-simple region D
We now evaluate the integral over the rectangle E on the right of (2.1.22) by Fubini’s theorem in
the form of the identity (2.1.12) with $f^*$ in place of f, and making use of the middle integral of
(2.1.12), that is
$$\int_E f^*(x, y)\, dx\, dy = \int_a^b \left\{ \int_c^d f^*(x, y)\, dy \right\} dx. \tag{2.1.23}$$
From (2.1.21) and (2.1.19), for each fixed x in the interval a ≤ x ≤ b we must have
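$$f^*(x, y) = \begin{cases} f(x, y), & \text{for all } \varphi_1(x) \le y \le \varphi_2(x), \\ 0, & \text{when either } c \le y < \varphi_1(x) \text{ or } \varphi_2(x) < y \le d. \end{cases} \tag{2.1.24}$$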
In view of (2.1.24), in the “inner integral” over c ≤ y ≤ d appearing in (2.1.23) the upper limit of
integration d can be replaced with φ2(x) and the lower limit of integration c can be replaced with
φ1(x) without changing the value of the integral, that is
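$$\int_c^d f^*(x, y)\, dy = \int_{\varphi_1(x)}^{\varphi_2(x)} f(x, y)\, dy \quad \text{for all } a \le x \le b, \tag{2.1.25}$$
so that (2.1.25) and (2.1.23) then give
$$\int_E f^*(x, y)\, dx\, dy = \int_a^b \left\{ \int_{\varphi_1(x)}^{\varphi_2(x)} f(x, y)\, dy \right\} dx. \tag{2.1.26}$$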
Upon combining (2.1.26) and (2.1.22) we find that
$$\int_D f(x, y)\, dx\, dy = \int_a^b \left\{ \int_{\varphi_1(x)}^{\varphi_2(x)} f(x, y)\, dy \right\} dx.$$
This result is so useful that we repeat it stated as a theorem:
Theorem 2.1.9 (Fubini for y-simple region in R2). Suppose that D is any y-simple region with
lower function φ1(x), upper function φ2(x) and common interval of definition a ≤ x ≤ b (see Figure
2.3), and f : D → R is a given function. Then
$$\int_D f(x, y)\, dx\, dy = \int_a^b \left\{ \int_{\varphi_1(x)}^{\varphi_2(x)} f(x, y)\, dy \right\} dx. \tag{2.1.27}$$
The nice thing about (2.1.27) is that the right hand side is often very easy to calculate as the
next example shows:
Example 2.1.10. D ⊂ R2 is a y-simple region with lower function φ1(x) and upper function φ2(x)
defined by
$$\varphi_1(x) := 0, \qquad \varphi_2(x) := \sqrt{1 + \cos(x)}, \quad \text{for all } 0 \le x \le 2\pi. \tag{2.1.28}$$
Note that 1 + cos(x) ≥ 0 for all 0 ≤ x ≤ 2π i.e. the square root in the definition of φ2(x) is real
(not imaginary) valued. The function f is defined by
(2.1.29) f(x, y) := 2y for all (x, y) in D.
Evaluate the integral of function f over the region D.
From (2.1.27) with a = 0 and b = 2π
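$$\int_D f(x, y)\, dx\, dy = \int_0^{2\pi} \left\{ \int_{\varphi_1(x)}^{\varphi_2(x)} f(x, y)\, dy \right\} dx. \tag{2.1.30}$$
For the inner dy-integral in (2.1.30) define
$$h_1(x) := \int_{\varphi_1(x)}^{\varphi_2(x)} f(x, y)\, dy, \quad \text{for all } 0 \le x \le 2\pi, \tag{2.1.31}$$
so that
$$\int_D f(x, y)\, dx\, dy = \int_0^{2\pi} h_1(x)\, dx. \tag{2.1.32}$$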
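From (2.1.31) with (2.1.29) and (2.1.28)
$$h_1(x) = \int_{\varphi_1(x)}^{\varphi_2(x)} (2y)\, dy = \left[ y^2 \right]_{y=\varphi_1(x)}^{y=\varphi_2(x)} = \varphi_2(x)^2 - \varphi_1(x)^2 = 1 + \cos(x). \tag{2.1.33}$$
From (2.1.32) with (2.1.33)
$$\int_D f(x, y)\, dx\, dy = \int_0^{2\pi} [1 + \cos(x)]\, dx = 2\pi. \tag{2.1.34}$$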
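
The recipe of Theorem 2.1.9 is easy to mechanize. The following Python sketch is one possible numerical version of the right-hand side of (2.1.27), applied to the data of Example 2.1.10. The function names `simpson` and `integrate_y_simple` are hypothetical helpers introduced here for illustration, and Simpson's rule is our own choice of quadrature. The result should agree with 2π up to a small numerical error.

```python
import math


def simpson(g, lo, hi, n=200):
    """Composite Simpson rule for the single-variable integral of g on [lo, hi]."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3.0


def integrate_y_simple(f, a, b, phi1, phi2, n=200):
    """Right-hand side of (2.1.27): integrate over y first, then over x."""
    def h1(x):
        # inner dy-integral between the lower and upper functions
        return simpson(lambda y: f(x, y), phi1(x), phi2(x), n)
    return simpson(h1, a, b, n)


value = integrate_y_simple(
    lambda x, y: 2.0 * y,                                  # f from (2.1.29)
    0.0, 2.0 * math.pi,                                    # a and b
    lambda x: 0.0,                                         # phi1 from (2.1.28)
    lambda x: math.sqrt(max(0.0, 1.0 + math.cos(x))),      # phi2 from (2.1.28)
)

if __name__ == "__main__":
    print(value, 2.0 * math.pi)    # the two numbers should nearly coincide
```

The `max(0.0, ...)` guard is a defensive choice against rounding and does not change the mathematics, since 1 + cos(x) is never negative.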
Remark 2.1.11. Complementary to the idea of a y-simple region is an x-simple region. To define
an x-simple region suppose that ψ1 : [c, d] → R and ψ2 : [c, d] → R are given continuous functions
over some fixed interval c ≤ y ≤ d such that
(2.1.35) ψ1(y) ≤ ψ2(y) for all c ≤ y ≤ d.
The region D ⊂ R2 is called x-simple with left function ψ1(y), right function ψ2(y) and common
interval of definition c ≤ y ≤ d, when D is the set of all points (x, y) such that (see Figure 2.4)
(2.1.36) c ≤ y ≤ d and ψ1(y) ≤ x ≤ ψ2(y),
that is
$$D = \{(x, y) \in \mathbb{R}^2 \mid c \le y \le d,\ \psi_1(y) \le x \le \psi_2(y)\}. \tag{2.1.37}$$
We see that an x-simple region D ⊂ R2 with left function ψ1 : [c, d] → R and right function
ψ2 : [c, d]→ R is bounded on the left by the graph of the function ψ1(y) and bounded on the right
by the graph of the function ψ2(y). We then have the following analog of Theorem 2.1.9:
Theorem 2.1.12 (Fubini for x-simple region in R2). Suppose that D is any x-simple region with
left function ψ1(y), right function ψ2(y) and common interval of definition c ≤ y ≤ d (see Figure
2.4), and f : D → R is a given function. Then
$$\int_D f(x, y)\, dx\, dy = \int_c^d \left\{ \int_{\psi_1(y)}^{\psi_2(y)} f(x, y)\, dx \right\} dy. \tag{2.1.38}$$
Remark 2.1.13. Particularly useful in applications are regions that are both x-simple and y-simple
at the same time. Such regions are called regular regions. A region D ⊂ R2 is therefore regular
when it is both lower bounded by a continuous function φ1 : [a, b] → R and upper bounded by a
continuous function φ2 : [a, b]→ R (where φ1 and φ2 satisfy (2.1.16)), as well as left bounded by a
Figure 2.4: x-simple region D
continuous function ψ1 : [c, d]→ R and right bounded by a continuous function ψ2 : [c, d]→ R (in
which ψ1 and ψ2 satisfy (2.1.35)). Put another way, a region D ⊂ R2 is regular when it is given by
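$$\begin{aligned}
D &= \{(x, y) \in \mathbb{R}^2 \mid a \le x \le b,\ \varphi_1(x) \le y \le \varphi_2(x)\} \\
&= \{(x, y) \in \mathbb{R}^2 \mid c \le y \le d,\ \psi_1(y) \le x \le \psi_2(y)\},
\end{aligned} \tag{2.1.39}$$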
for some continuous functions φ1 : [a, b] → R, φ2 : [a, b] → R, ψ1 : [c, d] → R, ψ2 : [c, d] → R. For
such a region it is clear that both Theorem 2.1.9 and Theorem 2.1.12 must hold, namely
Theorem 2.1.14 (Fubini for a regular region in R2). Suppose that D ⊂ R2 is a regular region, that
is both y-simple with lower function φ1(x), upper function φ2(x), and common interval of definition
a ≤ x ≤ b, as well as x-simple with left function ψ1(y), right function ψ2(y), and common interval
of definition c ≤ y ≤ d. For a function f : D → R we have
$$\int_D f(x, y)\, dx\, dy = \int_a^b \left\{ \int_{\varphi_1(x)}^{\varphi_2(x)} f(x, y)\, dy \right\} dx = \int_c^d \left\{ \int_{\psi_1(y)}^{\psi_2(y)} f(x, y)\, dx \right\} dy. \tag{2.1.40}$$
In order to integrate f on the regular region D we can evaluate either the middle integral or
the right hand integral in (2.1.40). As we shall see this flexibility of choice can be very useful;
sometimes one of these integrals is very much easier to evaluate than the other, in which case one
obviously evaluates the easier of the two integrals.
Example 2.1.15. Show that the triangular shaped region D ⊂ R2 in Figure 2.5 is a regular region.
Figure 2.5: Region D for Example 2.1.15

First show that D is y-simple: From Figure 2.5 it looks reasonable to define
$$a := 1, \quad b := 3, \quad \text{and} \quad \varphi_1(x) := 1 \quad \text{for all } 1 \le x \le 3. \tag{2.1.41}$$
To define the upper function φ2(x) we write out the equation of the straight line passing through
A and B, that is
$$y = \frac{x + 1}{2}. \tag{2.1.42}$$
Now (2.1.42) shows that we must define the upper function by
$$\varphi_2(x) := \frac{x + 1}{2} \quad \text{for all } 1 \le x \le 3. \tag{2.1.43}$$
Next show that D is also x-simple: From Figure 2.5 it looks reasonable to take
(2.1.44) c := 1, d := 2, and ψ2(y) := 3 for all 1 ≤ y ≤ 2.
It remains to define the left function ψ1(y). For this we just rewrite (2.1.42), but putting x in terms
of y, that is
(2.1.45) x = 2y − 1,
and (2.1.45) shows that we must define the left function by
(2.1.46) ψ1(y) := 2y − 1 for all 1 ≤ y ≤ 2.
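
A quick way to convince oneself that the y-simple and x-simple descriptions in Example 2.1.15 really define the same set is to test both on many points. The Python sketch below does this by random sampling. The function names, the sampling box [0, 4] × [0, 3] and the corner points used in the usage lines are assumptions read off from the formulas (2.1.41) to (2.1.46), since Figure 2.5 itself is not reproduced here.

```python
import random


def in_y_simple(x, y):
    # y-simple description: (2.1.41) and (2.1.43)
    return 1.0 <= x <= 3.0 and 1.0 <= y <= (x + 1.0) / 2.0


def in_x_simple(x, y):
    # x-simple description: (2.1.44) and (2.1.46)
    return 1.0 <= y <= 2.0 and 2.0 * y - 1.0 <= x <= 3.0


def descriptions_agree(samples=10_000, seed=0):
    """Return True if both descriptions classify every sampled point the same way."""
    rng = random.Random(seed)
    for _ in range(samples):
        x = rng.uniform(0.0, 4.0)
        y = rng.uniform(0.0, 3.0)
        if in_y_simple(x, y) != in_x_simple(x, y):
            return False
    return True


if __name__ == "__main__":
    print(descriptions_agree())
    print(in_y_simple(3.0, 2.0), in_x_simple(3.0, 2.0))   # the point with x = 3 on the line (2.1.42)
    print(in_y_simple(1.0, 2.0), in_x_simple(1.0, 2.0))   # a point above the line (2.1.42)
```

Agreement on sampled points is of course only evidence and not a proof; the proof is the algebra in the example itself.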
Example 2.1.16. Show that the disc of radius r centered at the point (α, β) in R2 (see Figure 2.6)
is a regular region.
Figure 2.6: Region D for Example 2.1.16
We first show that D is y-simple. For this we must determine lower and upper functions φ1(x)
and φ2(x) defined on some common interval a ≤ x ≤ b. From Figure 2.6 it looks reasonable to fix
(2.1.47) a := α− r, b := α + r.
Now the equation of the circle ABCE is of course
$$(x - \alpha)^2 + (y - \beta)^2 = r^2. \tag{2.1.48}$$
We use this to determine the lower function φ1(x) and upper function φ2(x). From (2.1.48) we find
$$(y - \beta)^2 = r^2 - (x - \alpha)^2,$$
that is
$$y = \beta \pm \sqrt{r^2 - (x - \alpha)^2}. \tag{2.1.49}$$
From (2.1.49) it follows that the equation of the lower arc AEC is
$$\varphi_1(x) = \beta - \sqrt{r^2 - (x - \alpha)^2}, \quad \text{for all } a \le x \le b, \tag{2.1.50}$$
and the equation of the upper arc ABC is
$$\varphi_2(x) = \beta + \sqrt{r^2 - (x - \alpha)^2}, \quad \text{for all } a \le x \le b. \tag{2.1.51}$$
This shows that D is y-simple with lower function φ1(x) (see (2.1.50)) and upper function φ2(x)
(see (2.1.51)), and common interval a ≤ x ≤ b given by (2.1.47).
We next show that D is x-simple. For this we must determine left and right functions ψ1(y)
and ψ2(y) defined on some common interval c ≤ y ≤ d. From Figure 2.6 it looks reasonable to fix
(2.1.52) c := β − r, d := β + r.
From (2.1.48) we get (exactly as at (2.1.49))
$$x = \alpha \pm \sqrt{r^2 - (y - \beta)^2}. \tag{2.1.53}$$
From (2.1.53) it follows that the equation of the left arc BAE is
$$\psi_1(y) = \alpha - \sqrt{r^2 - (y - \beta)^2}, \quad \text{for all } c \le y \le d, \tag{2.1.54}$$
and the equation of the right arc BCE is
$$\psi_2(y) = \alpha + \sqrt{r^2 - (y - \beta)^2}, \quad \text{for all } c \le y \le d. \tag{2.1.55}$$
This shows that D is x-simple with left function ψ1(y) (see (2.1.54)) and right function ψ2(y) (see
(2.1.55)) and common interval c ≤ y ≤ d given by (2.1.52).
Example 2.1.17. A region D ⊂ R2 is shown in Figure 2.7 and function f is defined on D by
$$f(x, y) := \exp\{y^3\}, \quad \text{for all } (x, y) \text{ in } D. \tag{2.1.56}$$
Determine the integral of f on region D. It is clear that D is a y-simple region. In fact, it is
immediate from Figure 2.7 that, for the lower and upper functions φ1 : [a, b]→ R and φ2 : [a, b]→ R,
we should take
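$$a := 0, \quad b := 1, \quad \varphi_1(x) := \sqrt{x} \quad \text{and} \quad \varphi_2(x) := 1 \quad \text{for all } 0 \le x \le 1. \tag{2.1.57}$$
Since D is y-simple, from (2.1.27) we get
$$\int_D f(x, y)\, dx\, dy = \int_0^1 \left\{ \int_{\varphi_1(x)}^{\varphi_2(x)} f(x, y)\, dy \right\} dx. \tag{2.1.58}$$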
Figure 2.7: Region D for Example 2.1.17
For the “inner” dy-integral at (2.1.58) define
$$h_1(x) := \int_{\varphi_1(x)}^{\varphi_2(x)} f(x, y)\, dy, \quad \text{for all } 0 \le x \le 1, \tag{2.1.59}$$
so that, from (2.1.59) and (2.1.58),
$$\int_D f(x, y)\, dx\, dy = \int_0^1 h_1(x)\, dx. \tag{2.1.60}$$
We must calculate the integral on the right of (2.1.60), and so we must first determine h1(x). From
(2.1.57) and (2.1.56),
$$h_1(x) = \int_{\sqrt{x}}^{1} \exp\{y^3\}\, dy, \quad \text{for all } 0 \le x \le 1. \tag{2.1.61}$$
We must now integrate the function $\exp\{y^3\}$. Here, however, we run into a serious problem. To
integrate this function we need to find some function g(y) such that
$$\frac{dg(y)}{dy} = \exp\{y^3\}. \tag{2.1.62}$$
Unfortunately, no function g(y) satisfying (2.1.62) can be written as an explicit formula in terms of
elementary functions, which means that we cannot explicitly calculate the integral at (2.1.61) for h1(x), and
therefore of course we cannot determine the integral on the right of (2.1.60). At this point we could
just give up and try to approximate the integral of f on D numerically. However, note that the
region D shown in Figure 2.7 is also x-simple (that is, D is a regular region). In fact, it is immediate
from Figure 2.7 that the left and right boundary functions ψ1 : [c, d] → R and ψ2 : [c, d] → R are
given by
$$c := 0, \quad d := 1, \quad \psi_1(y) := 0 \quad \text{and} \quad \psi_2(y) := y^2 \quad \text{for all } 0 \le y \le 1. \tag{2.1.63}$$
Since D is a regular region we also have available the third integral on the right side of (2.1.40),
that is
(2.1.64)
∫D
f(x, y) dx dy =
∫ d
c
∫ ψ2(y)
ψ1(y)
f(x, y) dx
dy.
For the “inner” dx-integral at (2.1.64) define
h2(y) :=
∫ ψ2(y)
ψ1(y)
f(x, y) dx
=
∫ y2
0
expy3 dx (from (2.1.63) and (2.1.56))
= expy3∫ y2
0
dx
= y2 expy3, for all 0 ≤ y ≤ 1.
(2.1.65)
Upon combining (2.1.65), (2.1.64) and (2.1.63) we obtain
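$$\int_D f(x, y)\, dx\, dy = \int_0^1 h_2(y)\, dy = \int_0^1 y^2 \exp\{y^3\}\, dy. \tag{2.1.66}$$
Now of course
$$\frac{d}{dy} \exp\{y^3\} = 3 y^2 \exp\{y^3\},$$
so that
$$\int_0^1 y^2 \exp\{y^3\}\, dy = \left[ \frac{\exp\{y^3\}}{3} \right]_{y=0}^{y=1} = \frac{e - 1}{3}. \tag{2.1.67}$$
From (2.1.67) and (2.1.66)
$$\int_D f(x, y)\, dx\, dy = \frac{e - 1}{3}. \tag{2.1.68}$$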
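
Although the inner integral at (2.1.61) has no explicit formula, it can still be approximated numerically, and doing so gives a useful sanity check on (2.1.68). The Python sketch below evaluates both iterated integrals of Example 2.1.17 with a composite Simpson rule. The helper `simpson` and the variable names are our own illustrative choices, and the tolerances one should expect are only rough: the dy-then-dx order converges more slowly because the lower limit √x has an unbounded derivative at x = 0.

```python
import math


def simpson(g, lo, hi, n=200):
    """Composite Simpson rule for the single-variable integral of g on [lo, hi]."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3.0


def f(x, y):
    return math.exp(y**3)           # integrand (2.1.56); it does not depend on x


def h2(y):
    # inner dx-integral of (2.1.64), from psi1(y) = 0 to psi2(y) = y^2
    return simpson(lambda x: f(x, y), 0.0, y * y)


def h1(x):
    # inner dy-integral of (2.1.58), from phi1(x) = sqrt(x) to phi2(x) = 1
    return simpson(lambda y: f(x, y), math.sqrt(x), 1.0)


dx_then_dy = simpson(h2, 0.0, 1.0)  # x-simple order
dy_then_dx = simpson(h1, 0.0, 1.0)  # y-simple order, numerical only
exact = (math.e - 1.0) / 3.0        # closed form from (2.1.68)

if __name__ == "__main__":
    print(dx_then_dy, dy_then_dx, exact)
```

Both numerical values should lie close to (e − 1)/3, which is consistent with Theorem 2.1.14: the two orders of integration agree even when only one of them can be carried out by hand.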
Remark 2.1.18. In Remark 2.1.13 we observed that, when the region D ⊂ R2 is regular, then
we have available both the middle and third iterated integrals in (2.1.40) with which to compute
the integral of function f on D. We also observed that one of these integrals may be difficult
to compute whereas the other integral may be easy to compute. Example 2.1.17 shows this very
clearly. Indeed, the middle integral of (2.1.40) is actually impossible to calculate (at least explicitly)
whereas the integral on the right of (2.1.40) is quite easy to evaluate.
Remark 2.1.19. The following elementary result about two dimensional integrals is
often useful: If D ⊂ R2 is some region which is either x-simple or y-simple, then taking f to be the
function with constant value
$$f(x, y) = 1 \quad \text{for all } (x, y) \text{ in } D,$$
we have
$$\int_D dx\, dy = \text{area of } D. \tag{2.1.69}$$
This follows immediately from Definition 2.1.1.
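
As an illustration of (2.1.69), the sketch below computes the area of the disc of Example 2.1.16 from its y-simple description. With f = 1 the inner integral in (2.1.27) is just φ2(x) − φ1(x), so only a single integral over a ≤ x ≤ b remains. The function name `area_y_simple`, the use of the midpoint rule, and the particular centre (α, β) = (2, −1) and radius r = 1.5 are illustrative assumptions. The computed value should be close to πr².

```python
import math


def area_y_simple(a, b, phi1, phi2, n=2000):
    """Area of a y-simple region via (2.1.69) and (2.1.27) with f = 1.

    The inner dy-integral equals phi2(x) - phi1(x); the remaining
    dx-integral is approximated by the midpoint rule with n strips.
    """
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        total += phi2(x) - phi1(x)
    return total * h


alpha, beta, r = 2.0, -1.0, 1.5     # illustrative centre and radius (assumptions)


def half_chord(x):
    # sqrt(r^2 - (x - alpha)^2), guarded against tiny negative rounding errors
    return math.sqrt(max(0.0, r * r - (x - alpha) ** 2))


area = area_y_simple(
    alpha - r, alpha + r,               # a and b from (2.1.47)
    lambda x: beta - half_chord(x),     # phi1 from (2.1.50)
    lambda x: beta + half_chord(x),     # phi2 from (2.1.51)
)

if __name__ == "__main__":
    print(area, math.pi * r * r)        # the two numbers should nearly coincide
```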
2.2 Three Dimensional Integration
In Section 2.1 we reviewed the main aspects of two dimensional integration, that is integration of
a real-valued function f on a region D in the plane R2. In this section we extend the ideas of two
dimensional integration to integration in three dimensions, following an approach which is a clear
extension of the approach to two dimensional integrals in Section 2.1. The goal of three dimensional
integration is to integrate a real-valued function f on a region Ω of three dimensional space R3.
Three dimensional integration is essential in many areas of physics and engineering; in particular
the forthcoming divergence theorem of Gauss-Ostrogradskii, an essential result of vector calculus,
relies on three dimensional integration.
Suppose we have a real-valued function
$$f : \Omega \to \mathbb{R}, \tag{2.2.70}$$
in which Ω ⊂ R3 is the rectangular parallelepiped shown in Figure 2.8. The sides of the parallelepiped
Ω are the intervals a ≤ x ≤ b, c ≤ y ≤ d and e ≤ z ≤ g, and in the notation of sets we write Ω as
$$\Omega = \{(x, y, z) \in \mathbb{R}^3 \mid a \le x \le b,\ c \le y \le d \text{ and } e \le z \le g\}. \tag{2.2.71}$$
Figure 2.8: Rectangular parallelepiped in R3
For the sake of brevity we will usually denote this parallelepiped in the following “mathematical” notation
$$\Omega = [a, b] \times [c, d] \times [e, g], \tag{2.2.72}$$
in which the intervals a ≤ x ≤ b, c ≤ y ≤ d and e ≤ z ≤ g are indicated by the abbreviated
notations [a, b], [c, d] and [e, g]. We now define the integral of the function f over the parallelepiped
Ω. To this end subdivide the interval a ≤ x ≤ b using n + 1 equally spaced points $x_j$, subdivide
the interval c ≤ y ≤ d using n + 1 equally spaced points $y_k$, and subdivide the interval e ≤ z ≤ g
using n + 1 equally spaced points $z_l$, that is
$$a = x_0 < x_1 < \ldots < x_n = b, \qquad c = y_0 < y_1 < \ldots < y_n = d, \qquad e = z_0 < z_1 < \ldots < z_n = g, \tag{2.2.73}$$
with spacing ∆x, ∆y and ∆z between successive subdivision points given by
$$\Delta x := x_{j+1} - x_j = \frac{b - a}{n}, \qquad \Delta y := y_{k+1} - y_k = \frac{d - c}{n}, \qquad \Delta z := z_{l+1} - z_l = \frac{g - e}{n}, \tag{2.2.74}$$
(cf. (2.1.4) and (2.1.5)). Let $\Omega_{jkl}$ be the (small) parallelepiped given by
$$\Omega_{jkl} := \{(x, y, z) \in \mathbb{R}^3 \mid x_j \le x \le x_{j+1},\ y_k \le y \le y_{k+1} \text{ and } z_l \le z \le z_{l+1}\} \equiv [x_j, x_{j+1}] \times [y_k, y_{k+1}] \times [z_l, z_{l+1}], \tag{2.2.75}$$
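fix some point $\mathbf{r}_{jkl} := (\xi_j, \eta_k, \zeta_l)$ in $\Omega_{jkl}$ (i.e. $x_j \le \xi_j \le x_{j+1}$, $y_k \le \eta_k \le y_{k+1}$ and $z_l \le \zeta_l \le z_{l+1}$) and
define the Riemann sum of the function f on the parallelepiped Ω as follows
$$S_n := \sum_{j=0}^{n-1} \sum_{k=0}^{n-1} \sum_{l=0}^{n-1} f(\xi_j, \eta_k, \zeta_l)\, \Delta x\, \Delta y\, \Delta z, \tag{2.2.76}$$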
for each n = 1, 2, . . .. Now we can define the integral of the function f over the parallelepiped Ω as
follows:
Definition 2.2.1. If the sequence of Riemann sums Sn, n = 1, 2, . . . defined by (2.2.76) converges
to a limit S as n → ∞, and the limit S is the same for every choice of points (ξj, ηk, ζl) in Ωjkl,
then S is called the integral of the function f over the parallelepiped Ω.
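
The three dimensional Riemann sum (2.2.76) can be sketched in Python in the same spirit as the two dimensional one. The integrand f(x, y, z) = xyz, the unit cube [0, 1]³ and the `pick` parameter for choosing the sample point in each small parallelepiped are illustrative assumptions, not taken from the notes. For this choice the sums should approach 1/8.

```python
def riemann_sum_3d(f, box, n, pick=0.0):
    """Riemann sum S_n of f over [a, b] x [c, d] x [e, g], following (2.2.76).

    box is ((a, b), (c, d), (e, g)); pick in [0, 1] selects the sample
    point in each small parallelepiped (0.0 corner, 0.5 centre, 1.0 far corner).
    """
    (a, b), (c, d), (e, g) = box
    dx, dy, dz = (b - a) / n, (d - c) / n, (g - e) / n
    total = 0.0
    for j in range(n):
        xi = a + (j + pick) * dx
        for k in range(n):
            eta = c + (k + pick) * dy
            for l in range(n):
                zeta = e + (l + pick) * dz
                total += f(xi, eta, zeta)
    return total * dx * dy * dz


def f(x, y, z):
    # Illustrative integrand (an assumption, not from the notes).
    return x * y * z


unit_cube = ((0.0, 1.0), (0.0, 1.0), (0.0, 1.0))

if __name__ == "__main__":
    for n in (5, 10, 40):
        print(n,
              riemann_sum_3d(f, unit_cube, n, pick=0.0),
              riemann_sum_3d(f, unit_cube, n, pick=1.0))
```

Note that the number of terms grows like n³, so direct Riemann sums become expensive quickly; this is part of the motivation for a three dimensional version of Fubini's theorem.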
Remark 2.2.2. The various notations for the integral in Definition 2.2.1 are
$$\int_\Omega f(x, y, z)\, dx\, dy\, dz, \qquad \int_\Omega f(x, y, z)\, dV, \qquad \int_\Omega f\, dx\, dy\, dz, \qquad \int_\Omega f\, dV. \tag{2.2.77}$$
The essential elements in all of these notations are the subscript Ω attached to the integral, indicating
the region in R3 over which one integrates, and the integrand f indicating the function being
integrated. The symbol “ dV ” is effectively just shorthand for “ dx dy dz”. The first two notations
explicitly remind us that we are integrating over Ω with respect to an underlying space variable
in R3 which is generically denoted by (x, y, z). It can be quite tedious to keep carrying the space
variable (x, y, z), and so, in the third and fourth notations of (2.2.77), this variable is suppressed,
but always understood to be present! (compare with Remark 2.1.2 for two dimensional integrals).
Remark 2.2.3. Exactly as for two dimensional integrals (recall Remark 2.1.3), Definition 2.2.1 raises
a number of questions. What is the situation if the sequence of Riemann sums Sn, n = 1, 2, . . . fails
to converge to any limit? In this case the integral of the function f over the parallelepiped
Ω does “not make sense” and is said to be undefined. Another very natural question: how can
we be sure that the integral of a function f over a parallelepiped Ω “makes sense” in the sense of
Definition 2.2.1? As in the case of two dimensional integrals, real analysis tells us that the class of
functions which can be integrated over Ω is simply huge, and we are completely safe in assuming
that every function that we shall encounter can be integrated.
Remark 2.2.4. In Remark 2.1.4 we observed that the actual calculation of two dimensional integrals
over a rectangle depended on a result called Fubini’s theorem. In exactly the same way, in
order to evaluate a three dimensional integral over a parallelepiped we need a version of Fubini’s
theorem extended to three dimensions. We develop this extension next. Define the rectangle D1 in
the x− y plane by (see Figure 2.8)
(2.2.78) D1 := {(x, y) ∈ R2 | a ≤ x ≤ b, c ≤ y ≤ d} = [a, b] × [c, d],
and define the function
(2.2.79) $h_1(x, y) := \int_e^g f(x, y, z)\,dz$, for all (x, y) in D1.
In (2.2.79) we have fixed some (x, y) in D1 so that f(x, y, z) is now just a function of z only, and
on the right of (2.2.79) we integrate this function of z over e ≤ z ≤ g to get a real number h1(x, y)
which depends on our choice of (x, y). Similarly, we can define the rectangle D2 in the x− z plane
by (see Figure 2.8)
(2.2.80) D2 := {(x, z) ∈ R2 | a ≤ x ≤ b, e ≤ z ≤ g} = [a, b] × [e, g],
and define the function
(2.2.81) $h_2(x, z) := \int_c^d f(x, y, z)\,dy$, for all (x, z) in D2,
and, likewise, we can define the rectangle D3 in the y − z plane by (see Figure 2.8)
(2.2.82) D3 := {(y, z) ∈ R2 | c ≤ y ≤ d, e ≤ z ≤ g} = [c, d] × [e, g],
and define the function
(2.2.83) $h_3(y, z) := \int_a^b f(x, y, z)\,dx$, for all (y, z) in D3.
We then have the following Fubini theorem for three dimensions:
Theorem 2.2.5 (Fubini for rectangular parallelepiped in R3). Suppose that f : Ω → R where Ω
is the rectangular parallelepiped at (2.2.71), the functions h1(x, y), h2(x, z) and h3(y, z) are defined
by (2.2.79), (2.2.81) and (2.2.83) respectively, and the rectangles D1, D2 and D3 are defined by
(2.2.78), (2.2.80) and (2.2.82) respectively. Then
(2.2.84) $\int_\Omega f\,dV = \int_{D_1} h_1(x, y)\,dx\,dy = \int_{D_2} h_2(x, z)\,dx\,dz = \int_{D_3} h_3(y, z)\,dy\,dz$.
Remark 2.2.6. Observe that the three integrals on the right of (2.2.84) are two dimensional
integrals (over the rectangles D1, D2 and D3), and each of these can be reduced by the Fubini
Theorem 2.1.5 to iterated integrals over intervals. Indeed, if we apply Fubini’s theorem for two
dimensional integrals in the form of the identity (2.1.12) to, say, the two dimensional integral
of h1(x, y) over D1, we obtain
(2.2.85) $\int_{D_1} h_1(x, y)\,dx\,dy = \int_a^b \left[ \int_c^d h_1(x, y)\,dy \right] dx = \int_c^d \left[ \int_a^b h_1(x, y)\,dx \right] dy$,
and similarly for the remaining two dimensional integrals over D2 and D3 at (2.2.84).
Example 2.2.7. A rectangular parallelepiped Ω is given by
(2.2.86) Ω := {(x, y, z) ∈ R3 | 0 ≤ x ≤ α, 0 ≤ y ≤ β and 0 ≤ z ≤ γ} = [0, α] × [0, β] × [0, γ],
in which α, β and γ are positive constants, and f : Ω→ R is defined by
(2.2.87) f(x, y, z) := xy² for all (x, y, z) in Ω.
Evaluate the integral $\int_\Omega f\,dV$. Following (2.2.78) and (2.2.79) define
(2.2.88) D1 := {(x, y) ∈ R2 | 0 ≤ x ≤ α, 0 ≤ y ≤ β} = [0, α] × [0, β],
and
(2.2.89) $h_1(x, y) := \int_0^\gamma f(x, y, z)\,dz$, for all (x, y) in D1.
From (2.2.89) and (2.2.87) we get
(2.2.90) $h_1(x, y) := \int_0^\gamma x y^2\,dz = x y^2 \int_0^\gamma dz = \gamma x y^2$.
From (2.2.90) and (2.1.12) (i.e. the Fubini theorem for two dimensions) we get
(2.2.91) $\int_{D_1} h_1(x, y)\,dx\,dy = \int_0^\alpha \left[ \int_0^\beta h_1(x, y)\,dy \right] dx = \int_0^\alpha \left[ \int_0^\beta \gamma x y^2\,dy \right] dx$.
For the dy-integral at (2.2.91) we have
(2.2.92) $\int_0^\beta \gamma x y^2\,dy = \gamma x \int_0^\beta y^2\,dy = \frac{\gamma x \beta^3}{3}$.
Now put (2.2.92) into (2.2.91) to get
(2.2.93) $\int_{D_1} h_1(x, y)\,dx\,dy = \int_0^\alpha \frac{\gamma x \beta^3}{3}\,dx = \frac{\gamma \beta^3}{3} \int_0^\alpha x\,dx = \frac{\alpha^2 \beta^3 \gamma}{6}$.
From (2.2.93) and the Fubini Theorem 2.2.5 (see (2.2.84))
(2.2.94) $\int_\Omega f\,dV = \int_{D_1} h_1(x, y)\,dx\,dy = \frac{\alpha^2 \beta^3 \gamma}{6}$.
From (2.2.84) it follows that one could also get the result at (2.2.94) by either integrating h2(x, z)
over D2 or integrating h3(y, z) over D3.
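The claim that all three routes in (2.2.84) give the same answer can be checked numerically for this example. The sketch below is an illustration only: the values chosen for α, β and γ, the helper names `midpoint_1d` and `midpoint_2d`, and the use of the midpoint rule in place of exact integration are assumptions, not part of the notes.

```python
from typing import Callable, Tuple


def midpoint_1d(g: Callable[[float], float], lo: float, hi: float, n: int = 20) -> float:
    """Midpoint-rule approximation of the integral of g over lo <= t <= hi."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))


def midpoint_2d(
    h_fn: Callable[[float, float], float],
    r1: Tuple[float, float],
    r2: Tuple[float, float],
    n: int = 20,
) -> float:
    """Iterated integral of h_fn over the rectangle r1 x r2, cf. (2.2.85)."""
    return midpoint_1d(
        lambda u: midpoint_1d(lambda v: h_fn(u, v), r2[0], r2[1], n), r1[0], r1[1], n
    )


ALPHA, BETA, GAMMA = 2.0, 3.0, 1.0  # hypothetical values of alpha, beta, gamma


def f(x: float, y: float, z: float) -> float:
    return x * y**2  # the integrand (2.2.87)


def h1(x: float, y: float) -> float:
    return midpoint_1d(lambda z: f(x, y, z), 0.0, GAMMA)  # integrate out z, cf. (2.2.79)


def h2(x: float, z: float) -> float:
    return midpoint_1d(lambda y: f(x, y, z), 0.0, BETA)  # integrate out y, cf. (2.2.81)


def h3(y: float, z: float) -> float:
    return midpoint_1d(lambda x: f(x, y, z), 0.0, ALPHA)  # integrate out x, cf. (2.2.83)


via_d1 = midpoint_2d(h1, (0.0, ALPHA), (0.0, BETA))
via_d2 = midpoint_2d(h2, (0.0, ALPHA), (0.0, GAMMA))
via_d3 = midpoint_2d(h3, (0.0, BETA), (0.0, GAMMA))
exact = ALPHA**2 * BETA**3 * GAMMA / 6  # closed form (2.2.94)
print(via_d1, via_d2, via_d3, exact)
```

The three approximations agree with each other and lie close to the closed form (2.2.94); the small remaining gap is the midpoint-rule error in the y-integration.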
Remark 2.2.8. It remains to define the integral of a function f defined over a region Ω ⊂ R3
which is not a rectangular parallelepiped. In this case we fix any rectangular parallelepiped Ξ ⊂ R3
which is large enough to contain the region Ω, that is Ω ⊂ Ξ. Now define f ∗ on the rectangular
parallelepiped Ξ by
(2.2.95) $f^*(x, y, z) := \begin{cases} f(x, y, z), & \text{for all } (x, y, z) \text{ in } \Omega, \\ 0, & \text{for all } (x, y, z) \text{ in } \Xi \text{ but outside } \Omega. \end{cases}$
We then define the integral of f over the region Ω by
(2.2.96) $\int_\Omega f\,dV = \int_\Xi f^*\,dV$.
Since Ξ is a rectangular parallelepiped the integral on the right of (2.2.96) can, at least in principle,
be evaluated using the three dimensional Fubini Theorem 2.2.5.
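The zero-extension idea of (2.2.95) and (2.2.96) translates directly into a Riemann sum over the enclosing box Ξ. The following is a rough sketch under stated assumptions: the region (a unit ball), the box Ξ = [−1, 1]³, the function names and the midpoint sampling are all chosen here for illustration and do not come from the notes.

```python
import math


def integrate_by_zero_extension(f, inside, box, n=50):
    """Midpoint Riemann sum of f* over the box Xi, with f* = f in Omega and 0 elsewhere."""
    (a, b), (c, d), (e, g) = box
    dx, dy, dz = (b - a) / n, (d - c) / n, (g - e) / n
    total = 0.0
    for j in range(n):
        x = a + (j + 0.5) * dx
        for k in range(n):
            y = c + (k + 0.5) * dy
            for l in range(n):
                z = e + (l + 0.5) * dz
                if inside(x, y, z):      # here f*(x, y, z) = f(x, y, z)
                    total += f(x, y, z)  # outside Omega, f* = 0 adds nothing
    return total * dx * dy * dz


def in_unit_ball(x, y, z):
    """Membership test for the hypothetical region Omega (solid unit ball)."""
    return x * x + y * y + z * z <= 1.0


# With f = 1 the integral is the volume of Omega, which for the unit ball is 4*pi/3.
volume = integrate_by_zero_extension(lambda x, y, z: 1.0, in_unit_ball, ((-1.0, 1.0),) * 3)
print(volume, 4.0 * math.pi / 3.0)
```

Because f* jumps at the boundary of Ω, this direct approach converges slowly, which is one reason the notes turn next to regions where Fubini's theorem gives iterated integrals with explicit limits.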
Remark 2.2.9. In practice, use of Theorem 2.2.5 to actually compute the integral at (2.2.96)
relies on the region Ω not being too complicated, much as was the case in Remark 2.1.8 for two
dimensional integrals. We now formulate a particularly useful type of region Ω ⊂ R3 over which we
can evaluate integrals. To this end from now on we write
(2.2.97) R2xy := the x− y plane in R3,
so that generic points in R2xy are denoted by (x, y). Suppose that γ1 and γ2 are real valued continuous
functions defined on the common region D ⊂ R2xy, that is γ1 : D → R and γ2 : D → R, and suppose
that
(2.2.98) γ1(x, y) ≤ γ2(x, y), for all (x, y) in D.
Let S1 be the set of points traced out in R3 by the point (x, y, γ1(x, y)) as the point (x, y) varies
throughout the set D ⊂ R2xy, that is
(2.2.99) S1 = {(x, y, γ1(x, y)) ∈ R3 | (x, y) in D}.
As shown in Figure 2.9 the set S1 is a surface in R3. Effectively, one can imagine D as a “floor”
and γ1(x, y) gives the “height” of a “roof” at the point (x, y) in D; then S1 represents the shape of
the roof. Similarly, corresponding to γ2 : D → R is the surface S2 given by
(2.2.100) S2 = {(x, y, γ2(x, y)) ∈ R3 | (x, y) in D}.
Now let Ω ⊂ R3 be the set of all points (x, y, z) in R3 which are between the surfaces S1 and S2 (see
Figure 2.9). Put another way, Ω is the set of all (x, y, z) in R3 such that (x, y) is a member of D,
and, for this (x, y), z is in the range γ1(x, y) ≤ z ≤ γ2(x, y). In set-theoretic terms we write this as
(2.2.101) Ω = {(x, y, z) ∈ R3 | (x, y) in D and γ1(x, y) ≤ z ≤ γ2(x, y)}.
By analogy with the simple regions in R2 discussed in Remark 2.1.8 this region Ω ⊂ R3 is called
z-simple with lower function γ1(x, y), upper function γ2(x, y) and common domain of definition
D ⊂ R2xy. By following very much the same argument that we used to obtain Theorem 2.1.9 one
can establish
Theorem 2.2.10 (Fubini for z-simple region in R3). Suppose that Ω is any z-simple region with
lower function γ1(x, y), upper function γ2(x, y) and common domain of definition D ⊂ R2xy (see
Figure 2.9). If f : Ω→ R is a given function then
(2.2.102) $\int_\Omega f\,dV = \int_D \left[ \int_{\gamma_1(x,y)}^{\gamma_2(x,y)} f(x, y, z)\,dz \right] dx\,dy$.
Equivalently, we can write (2.2.102) as
(2.2.103) $\int_\Omega f\,dV = \int_D h_1(x, y)\,dx\,dy$,
in which h1(x, y) is defined by the one dimensional integral
(2.2.104) $h_1(x, y) := \int_{\gamma_1(x,y)}^{\gamma_2(x,y)} f(x, y, z)\,dz$, for all (x, y) in D.
Figure 2.9: z-simple region Ω ⊂ R3
The nice thing about (2.2.103) is that it reduces calculation of the three dimensional integral over
Ω on the left to calculation of the two dimensional integral over D ⊂ R2xy on the right. We already
know from Section 2.1 how to deal with such two dimensional integrals. In fact, suppose that the
common domain of definition D is itself y-simple with lower function φ1(x), upper function φ2(x)
and common interval of definition a ≤ x ≤ b, that is D is given by
(2.2.105) D = {(x, y) ∈ R2 | a ≤ x ≤ b, φ1(x) ≤ y ≤ φ2(x)}
(recall Remark 2.1.8 and see (2.1.18)). Then, from (2.1.27) with h1 in place of f , we find
(2.2.106) $\int_D h_1(x, y)\,dx\,dy = \int_a^b \left[ \int_{\phi_1(x)}^{\phi_2(x)} h_1(x, y)\,dy \right] dx$,
so that (2.2.106) and (2.2.103) give
(2.2.107) $\int_\Omega f\,dV = \int_a^b \left[ \int_{\phi_1(x)}^{\phi_2(x)} h_1(x, y)\,dy \right] dx$.
Usually the definition of h1 at (2.2.104) is substituted into (2.2.107) to give the following compressed
version of (2.2.107)
(2.2.108) $\int_\Omega f\,dV = \int_a^b \left[ \int_{\phi_1(x)}^{\phi_2(x)} \left[ \int_{\gamma_1(x,y)}^{\gamma_2(x,y)} f(x, y, z)\,dz \right] dy \right] dx$,
which displays the integral of f over the three dimensional region Ω as three iterated integrals. In
(2.2.108) the common domain of definition D of the z-simple region Ω was assumed to be given
by (2.2.105), that is D is y-simple with lower function φ1(x), upper function φ2(x) and common
interval of definition a ≤ x ≤ b. Suppose instead that D is x-simple with left function ψ1(y), right
function ψ2(y) and common interval of definition c ≤ y ≤ d, that is
(2.2.109) D = {(x, y) ∈ R2 | c ≤ y ≤ d, ψ1(y) ≤ x ≤ ψ2(y)}
(see (2.1.37)). Repeating the argument which led to (2.2.108) we find of course that
(2.2.110) $\int_\Omega f\,dV = \int_c^d \left[ \int_{\psi_1(y)}^{\psi_2(y)} \left[ \int_{\gamma_1(x,y)}^{\gamma_2(x,y)} f(x, y, z)\,dz \right] dx \right] dy$.
If the domain D is regular, that is x-simple as well as y-simple and given by both (2.2.109) and
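(2.2.105) (see Remark 2.1.8) then (2.2.110) and (2.2.108) must hold together, that is
(2.2.111) $\int_\Omega f\,dV = \int_a^b \left[ \int_{\phi_1(x)}^{\phi_2(x)} \left[ \int_{\gamma_1(x,y)}^{\gamma_2(x,y)} f(x, y, z)\,dz \right] dy \right] dx = \int_c^d \left[ \int_{\psi_1(y)}^{\psi_2(y)} \left[ \int_{\gamma_1(x,y)}^{\gamma_2(x,y)} f(x, y, z)\,dz \right] dx \right] dy$.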
Example 2.2.11. The region Ω ⊂ R3 is the tetrahedron in Figure 2.10(a) with vertices O, A, B
and C, and the function f : Ω→ R is defined by
(2.2.112) f(x, y, z) := y for all (x, y, z) in Ω.
Evaluate the three dimensional integral $\int_\Omega f\,dV$.
We see that Ω is given in set-theoretic terms by
(2.2.113) Ω := {(x, y, z) ∈ R3 | x ≥ 0, y ≥ 0, z ≥ 0, x + y + z ≤ 1}.
Figure 2.10: (a) Tetrahedron Ω for the Example 2.2.11 (b) Common domain of definition D
We cannot directly use the formulation of Ω at (2.2.113) to evaluate the integral. Notice however
that Ω is a z-simple region. To see this observe that all points (x, y, z) on the triangular surface
with vertices A, B and C must satisfy the identity
(2.2.114) z = 1− x− y.
Let D ⊂ R2xy be the triangular region in the x− y plane with vertices O, A and C shown in Figure
2.10(b), and define the functions
(2.2.115) γ1(x, y) := 0, γ2(x, y) := 1− x− y, for all (x, y) in D.
With these definitions it is clear that Ω ⊂ R3 is the z-simple region given by
(2.2.116) Ω = {(x, y, z) ∈ R3 | (x, y) in D and γ1(x, y) ≤ z ≤ γ2(x, y)},
that is Ω is z-simple with lower function γ1(x, y) and upper function γ2(x, y) defined by (2.2.115),
and with common domain of definition D ⊂ R2xy in Figure 2.10(b). From Figure 2.10(b) we observe
that D is given by
(2.2.117) D = {(x, y) ∈ R2 | a ≤ x ≤ b, φ1(x) ≤ y ≤ φ2(x)},
in which
(2.2.118) a := 0, b := 1, φ1(x) := 0 and φ2(x) := 1− x for all 0 ≤ x ≤ 1,
that is D is y-simple with lower function φ1(x) and upper function φ2(x) defined by (2.2.118), and
with common interval of definition 0 ≤ x ≤ 1. Since Ω is z-simple and D is y-simple we can use
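(2.2.108), that is
(2.2.119) $\int_\Omega f\,dV = \int_a^b \left[ \int_{\phi_1(x)}^{\phi_2(x)} \left[ \int_{\gamma_1(x,y)}^{\gamma_2(x,y)} f(x, y, z)\,dz \right] dy \right] dx = \int_0^1 \left[ \int_0^{1-x} \left[ \int_0^{1-x-y} y\,dz \right] dy \right] dx$,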
in which the second equality at (2.2.119) follows from (2.2.112), (2.2.115) and (2.2.118). Now
evaluate the successive iterated integrals on the right hand side of (2.2.119):
(2.2.120) $\int_0^{1-x-y} y\,dz = y \int_0^{1-x-y} dz = y(1 - x - y)$,
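so that
(2.2.121) $\begin{aligned} \int_0^{1-x} \left[ \int_0^{1-x-y} y\,dz \right] dy &= \int_0^{1-x} y(1 - x - y)\,dy \quad \text{(from (2.2.120))} \\ &= (1 - x) \int_0^{1-x} y\,dy - \int_0^{1-x} y^2\,dy \\ &= (1 - x) \left[ \frac{y^2}{2} \right]_{y=0}^{y=1-x} - \left[ \frac{y^3}{3} \right]_{y=0}^{y=1-x} \\ &= \frac{(1 - x)^3}{2} - \frac{(1 - x)^3}{3} = \frac{(1 - x)^3}{6}. \end{aligned}$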
From (2.2.121) and (2.2.119) we find
$\int_\Omega f\,dV = \frac{1}{6} \int_0^1 (1 - x)^3\,dx = \frac{1}{6} \left[ -\frac{1}{4}(1 - x)^4 \right]_{x=0}^{x=1} = \frac{1}{24}$.
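The value 1/24 can be cross-checked by carrying out the three iterated integrals of (2.2.108) numerically. This is only a sketch: the helper names `midpoint_1d` and `integrate_z_simple`, and the use of the midpoint rule with 40 subintervals per variable, are assumptions introduced here.

```python
def midpoint_1d(g, lo, hi, n=40):
    """Midpoint-rule approximation of the integral of g over lo <= t <= hi."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))


def integrate_z_simple(f, a, b, phi1, phi2, gamma1, gamma2, n=40):
    """Iterated integrals in the order dz, then dy, then dx, as in (2.2.108)."""

    def inner_y(x):
        def inner_z(y):
            # innermost integral: z runs from gamma1(x, y) to gamma2(x, y)
            return midpoint_1d(lambda z: f(x, y, z), gamma1(x, y), gamma2(x, y), n)

        # middle integral: y runs from phi1(x) to phi2(x)
        return midpoint_1d(inner_z, phi1(x), phi2(x), n)

    # outer integral: x runs from a to b
    return midpoint_1d(inner_y, a, b, n)


# Data of Example 2.2.11: f = y, and the limits (2.2.115) and (2.2.118).
approx = integrate_z_simple(
    f=lambda x, y, z: y,
    a=0.0,
    b=1.0,
    phi1=lambda x: 0.0,
    phi2=lambda x: 1.0 - x,
    gamma1=lambda x, y: 0.0,
    gamma2=lambda x, y: 1.0 - x - y,
)
print(approx, 1.0 / 24.0)
```

The limit functions are passed in as arguments, so the same routine applies to any z-simple region whose base D is y-simple.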
Remark 2.2.12. In Remark 2.2.9 we defined a region Ω ⊂ R3 which is z-simple, with lower function
γ1(x, y), upper function γ2(x, y) and common domain of definition D ⊂ R2xy (see (2.2.101)). In much
the same way we can also formulate the analogous ideas of y-simple and x-simple regions in R3.
For this put (c.f. (2.2.97))
(2.2.122) R2xz := the x − z plane in R3, R2yz := the y − z plane in R3.
Now suppose that ρ1 : D → R and ρ2 : D → R are continuous functions defined on a common
region D ⊂ R2xz such that
(2.2.123) ρ1(x, z) ≤ ρ2(x, z) for all (x, z) in D.
Then Ω ⊂ R3 is called a y-simple region with lower function ρ1(x, z), upper function ρ2(x, z) and
common domain of definition D ⊂ R2xz when
(2.2.124) Ω = {(x, y, z) ∈ R3 | (x, z) in D and ρ1(x, z) ≤ y ≤ ρ2(x, z)},
(c.f. (2.2.101) for the analogous z-simple case). Similarly, if η1 : D → R and η2 : D → R are
continuous functions defined on some common region D ⊂ R2yz such that
(2.2.125) η1(y, z) ≤ η2(y, z) for all (y, z) in D,
then Ω ⊂ R3 is called an x-simple region with lower function η1(y, z), upper function η2(y, z) and
common domain of definition D ⊂ R2yz when
(2.2.126) Ω = {(x, y, z) ∈ R3 | (y, z) in D and η1(y, z) ≤ x ≤ η2(y, z)}.
Finally, the region Ω ⊂ R3 is regular when it is simultaneously z-simple, y-simple and x-simple, that
is given equivalently by (2.2.101), (2.2.124) and (2.2.126).
We can now give the ultimate version of Fubini’s theorem for three dimensional integration,
which for completeness repeats Theorem 2.2.10 for z-simple regions and includes the cases of the
x-simple, and y-simple regions of Remark 2.2.12:
Theorem 2.2.13 (Fubini for general regions in R3). Suppose that Ω ⊂ R3 is a given region and
f : Ω→ R is a given function.
(a) If Ω is a z-simple region with lower function γ1(x, y), upper function γ2(x, y) and common
domain of definition D ⊂ R2xy (see Figure 2.9) then
(2.2.127) $\int_\Omega f\,dV = \int_D \left[ \int_{\gamma_1(x,y)}^{\gamma_2(x,y)} f(x, y, z)\,dz \right] dx\,dy$.
(b) If Ω ⊂ R3 is a y-simple region with lower function ρ1(x, z), upper function ρ2(x, z), and common
domain of definition D ⊂ R2xz (see (2.2.124)) then
(2.2.128) $\int_\Omega f\,dV = \int_D \left[ \int_{\rho_1(x,z)}^{\rho_2(x,z)} f(x, y, z)\,dy \right] dx\,dz$.
(c) If Ω ⊂ R3 is an x-simple region with lower function η1(y, z), upper function η2(y, z) and common
domain of definition D ⊂ R2yz (see (2.2.126)) then
(2.2.129) $\int_\Omega f\,dV = \int_D \left[ \int_{\eta_1(y,z)}^{\eta_2(y,z)} f(x, y, z)\,dx \right] dy\,dz$.
(d) If Ω ⊂ R3 is regular (that is simultaneously x-simple, y-simple and z-simple) then (2.2.127),
(2.2.128) and (2.2.129) all hold.
Chapter 3
Scalar and Vector Fields
One of the most important ideas in physics and engineering is the notion of a field. Roughly speaking,
a field describes how a scalar-valued or vector-valued quantity varies through space, leading to the
more specialized ideas of a scalar field and a vector field.
3.1 Motivating Examples
Before stating the formal definitions of scalar and vector fields we give some motivating examples:
Example 3.1.1. A flat circular metal plate of radius 1 m is located with its centre at the origin of
the xy plane (which we shall denote by R2), and heated with a blow-torch. At each point (x, y) of
the unit disc centred at the origin (exactly the part of the plane occupied by the metal disc) denote
the temperature of the disc by T (x, y) (see Figure 3.1). We then have a scalar-valued function
T (x, y) defined for each point (x, y) in the unit disc. This function is an instance of a scalar field
defined on a region of two-dimensional space, namely the unit disc centred at the origin of the plane
R2.
Example 3.1.2. We can easily generalize Example 3.1.1 to the case of three dimensions as follows.
A solid metal ball of radius 1 m is located with its centre at the origin of three-dimensional xyz
space (which we shall denote by R3), and heated with a blow-torch. At each point (x, y, z) of the
unit sphere centred at the origin (exactly the part of space occupied by the metal ball) denote the
temperature of the ball by T (x, y, z). We then have a scalar-valued function T (x, y, z) defined for
each point (x, y, z) in the unit sphere. This function is an instance of a scalar field defined on a
region of three-dimensional space, namely the unit sphere centred at the origin of R3.
Figure 3.1: Temperature variation on the unit disc.
Example 3.1.3. A positive point charge of Q coul. is located at the origin of three-dimensional
space R3. If a positive test charge of 1 coul. is located at the point (x, y, z) then, according to
Coulomb’s law of electrostatics, a force is exerted on the test charge with a magnitude given by
(3.1.1) $\frac{Q}{4\pi\varepsilon_0 r^2}$, for $r := \sqrt{x^2 + y^2 + z^2}$,
(ε0 is a physical constant) and direction along the radial line from the origin to (x, y, z) and away
from the charge Q at the origin (since like charges repel). We denote this force by $\mathbf{E}(x, y, z)$; this
quantity is a vector, since it has both a magnitude given by (3.1.1) and a direction along the radial
line, and is called the electrostatic field in space due to the charge Q coul. at the origin (see Figure
3.2). Notice that the electrostatic field $\mathbf{E}(x, y, z)$ is undefined when (x, y, z) is at the origin (i.e.
(x, y, z) = 0) since r = 0, so that the magnitude at (3.1.1) is undefined; moreover, the radial line
from the origin to itself makes no sense, so there is not a well-defined direction either. However,
$\mathbf{E}(x, y, z)$ is defined for all (x, y, z) ≠ 0, i.e. all (x, y, z) not at the origin. This vector-valued function
is an instance of a vector field defined everywhere in three-dimensional space R3 except for the origin.
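As a concrete illustration, the field of Example 3.1.3 can be written as a function that returns a vector for every point except the origin. This is a sketch only: the function name, the tuple representation of vectors and the choice to raise an error at the origin are assumptions made here; the numerical value used for ε0 is the vacuum permittivity in SI units.

```python
import math

EPSILON_0 = 8.8541878128e-12  # vacuum permittivity in farads per metre


def electrostatic_field(q, x, y, z):
    """Field at (x, y, z) due to a point charge of q coulombs at the origin."""
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        # the origin is excluded from the domain of the field
        raise ValueError("the electrostatic field is undefined at the origin")
    magnitude = q / (4.0 * math.pi * EPSILON_0 * r * r)  # magnitude as in (3.1.1)
    # (x/r, y/r, z/r) is the unit vector along the radial line, pointing away from the origin
    return (magnitude * x / r, magnitude * y / r, magnitude * z / r)


E = electrostatic_field(1e-9, 0.0, 0.0, 2.0)  # hypothetical charge of 1 nC, point on the z-axis
print(E)
```

For a positive q the returned vector points away from the origin, matching the description in the example.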
Figure 3.2: Vector field $\mathbf{E}(x, y, z)$.
Example 3.1.4. Suppose that some electric charge is continuously spread or distributed or “smeared”
throughout some fixed region D of three-dimensional space R3 (i.e. D ⊂ R3). For example,
D could be the sphere with radius of 1 m. centered at the origin of the xyz-coordinate system, or D
could be the whole of R3 (i.e. D = R3), but all kinds of other choices for D are of course possible.
The case D = R3 is the simplest and most commonly occurring. Fix some point (x, y, z) in the
region D, and visualize another very small sphere of radius 0 < ε << 1 which is centred at (x, y, z)
(see Figure 3.3). If Vε is the volume of the small sphere and Qε is the total charge contained within
this sphere, then the ratio Qε/Vε (with units of coulombs per cubic metre) is the average charge
density in the small sphere. If the limit
(3.1.2) $\rho(x, y, z) = \lim_{\varepsilon \to 0} \frac{Q_\varepsilon}{V_\varepsilon}$
exists, then the scalar quantity ρ(x, y, z) defines the charge density at the point (x, y, z). If, furthermore,
the limit at (3.1.2) exists for each and every point (x, y, z) in the region D, then we have
a scalar-valued function ρ(x, y, z) defined at each point (x, y, z) in D. This function is a scalar field
giving the charge density at every point in the region D of three-dimensional space R3. Effectively,
ρ(x, y, z) gives the quantity of charge per unit volume concentrated at (x, y, z), that is, ρ describes
the local concentration of charge at each point in the region D.
Figure 3.3: Small sphere centered at (x, y, z).
Example 3.1.5 (Total enclosed charge). In this example we use three dimensional integration (see
Section 2.2) to relate total charge to the charge density of Example 3.1.4, and for simplicity we take
D = R3 in Example 3.1.4, so that the charge is spread “everywhere” in space. Fix some region
Ω ⊂ R3. For example, Ω could be a spherical shaped region or a parallelepiped in R3. Then the
total charge enclosed within Ω must be given by the integral
(3.1.3) $Q = \int_\Omega \rho\,dV$.
Later, in Section 9.4, we shall use the very simple relation (3.1.3) to obtain an essential result called
the continuity equation, which describes the movement of charge through space.
Example 3.1.6. Here we use the charge density scalar field ρ(x, y, z) of Example 3.1.4 to construct
a vector field called the current density field. Suppose that the charge in the fixed region D is in
motion, moving through the region, and in particular, at each point (x, y, z) in D, the charge moves
past that point with a velocity $\mathbf{v}(x, y, z)$. Notice that $\mathbf{v}(x, y, z)$ is a vector, since it involves both
the direction of motion of the charge as well as the speed at which the charge moves past (x, y, z).
For each (x, y, z) in D define
(3.1.4) $\mathbf{J}(x, y, z) := \rho(x, y, z)\,\mathbf{v}(x, y, z)$.
Then $\mathbf{J}(x, y, z)$ is a vector (since the product of a scalar and a vector is always a vector) defined
for each (x, y, z) in the region D. This vector-valued function is therefore a vector field defined
everywhere in the region D, called the current density field. Since the units of ρ(x, y, z) are coul./m³,
and the units of $\mathbf{v}(x, y, z)$ are m/sec., the units of $\mathbf{J}(x, y, z)$ must be coulombs per sec. per square
metre, that is amperes per square metre (or amps./m²). To get a better understanding of what the
current density field $\mathbf{J}(x, y, z)$ really means fix a plane or “flat” surface S with area A, and let $\mathbf{n}$ be
the unit vector normal to the surface S (see Figure 3.4).
Figure 3.4: Movement of charge perpendicular to S: $\mathbf{v}$ and $\mathbf{n}$ are collinear.
Suppose, to begin with, that the charge density ρ(x, y, z) has the constant value ρ for all (x, y, z),
and similarly suppose that the velocity $\mathbf{v}(x, y, z)$ of movement of the charge has the constant
value $\mathbf{v}$ for all (x, y, z), so that
(3.1.5) ρ(x, y, z) = ρ, $\mathbf{v}(x, y, z) = \mathbf{v}$ for all (x, y, z) in D.
From (3.1.4) and (3.1.5) the current density field $\mathbf{J}(x, y, z)$ must then have the constant value
(3.1.6) $\mathbf{J} = \rho\mathbf{v}$.
In the first instance suppose that the direction of movement of the charge is exactly in the direction
of the unit normal $\mathbf{n}$, that is $\mathbf{n}$ and $\mathbf{v}$ are collinear (see Figure 3.4). If v denotes the speed of
movement of the charge then of course
(3.1.7) $v = \|\mathbf{v}\| = \mathbf{n} \cdot \mathbf{v}$,
where the second equality follows by the collinearity of $\mathbf{v}$ and $\mathbf{n}$, and the fact that $\mathbf{n}$ has unit length.
Now fix some small ∆t > 0 (regarded as time). Since $\mathbf{v}$ is perpendicular to the surface S one sees
that the total “volume of space” that crosses surface S in the time ∆t must be Av∆t, so that the
total charge Q which flows across the surface S in the time ∆t must be this total “volume of space”
multiplied by the constant charge density ρ, that is the total charge Q flowing across the surface S
in the time ∆t is given by
(3.1.8) Q = (Av∆t)ρ when the charge velocity $\mathbf{v}$ is collinear with $\mathbf{n}$.
Using (3.1.7) in (3.1.8) then gives
(3.1.9) $Q = (A\,\Delta t)(\mathbf{n} \cdot \mathbf{v})\rho$ when the charge velocity $\mathbf{v}$ is collinear with $\mathbf{n}$.
Now suppose that the direction of charge movement is no longer collinear with the unit normal $\mathbf{n}$,
as previously, but is instead tangential to the surface S, so that the velocity vector $\mathbf{v}$ lies along S,
that is $\mathbf{v}$ is orthogonal to the unit vector $\mathbf{n}$ (see Figure 3.5).
Figure 3.5: Movement of charge tangential to S: $\mathbf{v}$ and $\mathbf{n}$ are orthogonal.
Since the direction of charge movement
is along the surface S and not through the surface, there can be no charge crossing S, so the total
charge Q that crosses S in the time ∆t is zero, that is
(3.1.10) Q = 0 when the charge velocity $\mathbf{v}$ is orthogonal to $\mathbf{n}$.
When $\mathbf{v}$ and $\mathbf{n}$ are orthogonal then of course $\mathbf{n} \cdot \mathbf{v} = 0$, so that we can write (3.1.10) as
(3.1.11) $Q = (A\,\Delta t)(\mathbf{n} \cdot \mathbf{v})\rho$ when the charge velocity $\mathbf{v}$ is orthogonal to $\mathbf{n}$.
Figure 3.6: Movement of charge across S: velocity $\mathbf{v}$ in a general direction.
Finally suppose that the direction of charge movement is neither collinear with $\mathbf{n}$ nor orthogonal
to $\mathbf{n}$, as in the previous cases, but instead $\mathbf{v}$ is just in some general direction, as shown in Figure
3.6. There is now a component of velocity $\mathbf{v}_1$ with magnitude v1 in the direction of the unit normal
$\mathbf{n}$, and a component of velocity $\mathbf{v}_2$ along the surface S and orthogonal to the unit vector $\mathbf{n}$. As we
have just seen from (3.1.8) and (3.1.10), all charge flowing across S is due to the first component
$\mathbf{v}_1$ collinear with $\mathbf{n}$, and none is due to the component $\mathbf{v}_2$ orthogonal to $\mathbf{n}$, so that the total charge
Q that crosses surface S in the time ∆t is given by (3.1.8) with v1 in place of v, that is
(3.1.12) Q = (Av1∆t)ρ.
But v1 is just the projection of $\mathbf{v}$ along $\mathbf{n}$ so that
(3.1.13) $v_1 = \mathbf{n} \cdot \mathbf{v}$,
and therefore, in this general case, from (3.1.13) and (3.1.12), we find that the total charge Q
passing through the surface S in the time ∆t is given by
(3.1.14) $Q = (A\,\Delta t)(\mathbf{n} \cdot \mathbf{v})\rho$ when the charge velocity $\mathbf{v}$ is in a general direction.
Observe that (3.1.9) and (3.1.11) are just special cases of the general relation (3.1.14). Now the
quantity I = Q/∆t is the total current passing through the surface S, therefore
$I = \frac{Q}{\Delta t} = A(\mathbf{n} \cdot \mathbf{v})\rho = A(\rho\mathbf{v}) \cdot \mathbf{n} = A(\mathbf{J} \cdot \mathbf{n})$,
where the first equality follows from (3.1.14), the second equality is clear, and the third equality
follows from (3.1.6). We therefore have
(3.1.15) $I = A(\mathbf{J} \cdot \mathbf{n})$,
that is the total current through the surface S is the product of the area A of S and the inner product
$\mathbf{J} \cdot \mathbf{n}$ of the current density $\mathbf{J}$ with the unit normal $\mathbf{n}$ to S.
We next remove the assumption, made at (3.1.5), that the charge density and charge velocity
are constant in space (i.e. constant with respect to (x, y, z)). This is just a matter of writing out
(3.1.15) for an infinitesimally small surface dS with infinitesimally small area dA. In fact, since
the surface dS is infinitesimally small, the current density field $\mathbf{J}(x, y, z)$ is effectively constant as
(x, y, z) varies through dS. Therefore we can use (3.1.15) with dA in place of A and $\mathbf{J}(x, y, z)$ in
place of $\mathbf{J}$, to see that the infinitesimal current passing through the infinitesimal surface dS with
unit normal $\mathbf{n}(x, y, z)$ is given by
(3.1.16) $dI = (\mathbf{J}(x, y, z) \cdot \mathbf{n}(x, y, z))\,dA$.
The relation (3.1.16) shows the usefulness of the current density vector field $\mathbf{J}(x, y, z)$; knowing this
vector field we can calculate the current dI flowing across a small planar surface dS with area dA
and unit normal vector $\mathbf{n}(x, y, z)$ at a point (x, y, z) in dS. Later, in Section 8.5, we shall extend
this relation for applications to Maxwell’s equations.
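The three cases treated in this example (velocity collinear with the normal, orthogonal to it, and in a general direction) can be reproduced with a few lines of code based on (3.1.15). The sketch below is illustrative only: the function names, the tuple representation of vectors and the numerical values of the density, velocity and area are all hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two vectors given as tuples."""
    return sum(ui * vi for ui, vi in zip(u, v))


def current_through_flat_surface(rho, v, n, area):
    """Current I = A (J . n) of (3.1.15), with constant J = rho * v as in (3.1.6)."""
    norm = math.sqrt(dot(n, n))
    n_hat = tuple(c / norm for c in n)  # rescale so that the normal has unit length
    J = tuple(rho * c for c in v)       # current density vector
    return area * dot(J, n_hat)


RHO, AREA, NORMAL = 2.0, 0.5, (0.0, 0.0, 1.0)  # hypothetical constant data

I_collinear = current_through_flat_surface(RHO, (0.0, 0.0, 3.0), NORMAL, AREA)   # v along n
I_tangential = current_through_flat_surface(RHO, (3.0, 0.0, 0.0), NORMAL, AREA)  # v along S
I_general = current_through_flat_surface(RHO, (1.0, 2.0, 3.0), NORMAL, AREA)     # general v
print(I_collinear, I_tangential, I_general)
```

Only the component of the velocity along the normal contributes, so the collinear and general cases above give the same current, while the tangential case gives zero.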
Remark 3.1.7. The notions of charge density formulated in Example 3.1.4 and current density
formulated in Example 3.1.6 are at this stage just illustrative cases of a scalar field and a vector
field. Later these quantities will take on a much deeper significance. In fact, we shall see that
charge and current density are absolutely indispensable in the formulation of Maxwell’s equations
of electromagnetism.
3.2 Definition of Vector and Scalar Fields
With the preceding examples in mind we can now formulate in general terms exactly what is meant
by a scalar field and a vector field:
Definition 3.2.1. A vector field in R3 comprises a specified region D ⊂ R3, called the domain
of the vector field, together with a function or mapping $\mathbf{F}$ : D → R3 which assigns to each point
(x, y, z) in D the vector $\mathbf{F}(x, y, z)$ in R3. We usually resolve the vector $\mathbf{F}(x, y, z)$ into its scalar
components along the standard $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ axes in the representation
(3.2.17) $\mathbf{F}(x, y, z) = F_1(x, y, z)\,\mathbf{i} + F_2(x, y, z)\,\mathbf{j} + F_3(x, y, z)\,\mathbf{k}$, for all (x, y, z) in D,
so that F1(x, y, z), F2(x, y, z) and F3(x, y, z) are respectively the x, y and z coordinates of the vector
$\mathbf{F}(x, y, z)$. In exactly the same way, a scalar field in R3 comprises a specified region D ⊂ R3, called
the domain of the scalar field, together with a function or mapping f : D → R which assigns to
each point (x, y, z) in D the real number f(x, y, z).
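Remark 3.2.2. The vector field $\mathbf{F}$ : D → R3 is called a C1-vector field when, for each i = 1, 2, 3,
the partial derivatives
$\frac{\partial F_i(x, y, z)}{\partial x}$, $\frac{\partial F_i(x, y, z)}{\partial y}$, $\frac{\partial F_i(x, y, z)}{\partial z}$,
all exist and are continuous functions of (x, y, z) in D. For the most part we shall be dealing with
C1-vector fields in this course. However, we shall sometimes also have to deal with vector fields
which are even better behaved in the following sense: A vector field $\mathbf{F}$ : D → R3 is called a C2-vector
field when $\mathbf{F}$ is a C1-vector field, and in addition, for each i = 1, 2, 3, the second partial
derivatives
$\frac{\partial^2 F_i(x, y, z)}{\partial x^2}$, $\frac{\partial^2 F_i(x, y, z)}{\partial y^2}$, $\frac{\partial^2 F_i(x, y, z)}{\partial z^2}$, $\frac{\partial^2 F_i(x, y, z)}{\partial x \partial y}$, $\frac{\partial^2 F_i(x, y, z)}{\partial y \partial z}$, $\frac{\partial^2 F_i(x, y, z)}{\partial x \partial z}$,
all exist and are continuous functions of (x, y, z) in D. A standard result from elementary calculus
says that, when $\mathbf{F}$ : D → R3 is a C2-vector field, then we always have
$\frac{\partial^2 F_i(x, y, z)}{\partial x \partial y} = \frac{\partial^2 F_i(x, y, z)}{\partial y \partial x}$, $\frac{\partial^2 F_i(x, y, z)}{\partial x \partial z} = \frac{\partial^2 F_i(x, y, z)}{\partial z \partial x}$, $\frac{\partial^2 F_i(x, y, z)}{\partial y \partial z} = \frac{\partial^2 F_i(x, y, z)}{\partial z \partial y}$,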
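for all (x, y, z) in D. That is, the second partial derivatives are equal regardless of the order in
which they are calculated. In exactly the same way, a scalar field f : D → R is called a C1-scalar
field when the partial derivatives
$\frac{\partial f(x, y, z)}{\partial x}$, $\frac{\partial f(x, y, z)}{\partial y}$, $\frac{\partial f(x, y, z)}{\partial z}$,
all exist and are continuous functions of (x, y, z) in D, and is called a C2-scalar field when it is a
C1-scalar field with the additional property that the second partial derivatives
$\frac{\partial^2 f(x, y, z)}{\partial x^2}$, $\frac{\partial^2 f(x, y, z)}{\partial y^2}$, $\frac{\partial^2 f(x, y, z)}{\partial z^2}$, $\frac{\partial^2 f(x, y, z)}{\partial x \partial y}$, $\frac{\partial^2 f(x, y, z)}{\partial y \partial z}$, $\frac{\partial^2 f(x, y, z)}{\partial x \partial z}$,
all exist and are continuous functions of (x, y, z) in D. Again, if f : D → R is a C2-scalar field then
$\frac{\partial^2 f(x, y, z)}{\partial x \partial y} = \frac{\partial^2 f(x, y, z)}{\partial y \partial x}$, $\frac{\partial^2 f(x, y, z)}{\partial x \partial z} = \frac{\partial^2 f(x, y, z)}{\partial z \partial x}$, $\frac{\partial^2 f(x, y, z)}{\partial y \partial z} = \frac{\partial^2 f(x, y, z)}{\partial z \partial y}$,
for all (x, y, z) in D.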
Remark 3.2.3. It is also useful, especially for simple examples, to have the notion of a vector
field in R2 and a scalar field in R2; we do not formulate these here, for they are obviously given by
Definition 3.2.1 with three-dimensional space R3 everywhere replaced by two-dimensional space R2.
Remark 3.2.4. The domain D of a field (either vector or scalar) occurring in Definition 3.2.1 may
represent a region of space that is natural or intrinsic to the field. In Example 3.1.2 the domain D of
the temperature field T (x, y, z) is naturally the unit sphere centred at the origin of R3, since this is
exactly the region of space occupied by the heated metal ball. In Example 3.1.3 the domain D of the
electrostatic field $\mathbf{E}(x, y, z)$ is all of R3 except for the origin (in mathematical notation D = R3 \ {0}),
since we have seen that the electrostatic field is undefined (i.e. does not make sense) when (x, y, z)
is at the origin. On the other hand, the domain D could simply be an arbitrary portion of space to
which we wish to restrict attention; this is the case in Example 3.1.4 and Example 3.1.6 where we
just want to focus on some designated portion of space in which we have a distribution of charge.
If our charge had been spread out through all space then we could have taken D = R3. In many
instances, for simplicity and to focus on the essentials, we shall just assume D = R3, that is our
fields are defined everywhere. However, as the electrostatic field of Example 3.1.3 makes clear, we
occasionally come across fields which cannot be defined everywhere, and in these cases we must
carefully specify the domain D; for Example 3.1.3 this is D = R3 \ {0}, that is all of R3 except for the origin.
Remark 3.2.5. It is clear from Definition 3.2.1 that a vector field describes how a vector-valued
quantity changes through space (or a portion of space identified by the domain D of the field) and
likewise for a scalar field. However, in most instances of interest in physics and engineering, one
comes across vector and scalar-valued quantities which change not only through space but also
vary with time, so that, with all dependencies displayed, a time varying vector field $\mathbf{F}$ should be
written $\mathbf{F}(t, x, y, z)$ and a time varying scalar field f should be written f(t, x, y, z), in which t of
course denotes time, and (x, y, z) is a general point in some domain D ⊂ R3. This additional
time-dependence is easily fitted within Definition 3.2.1: a time-varying vector field is one in which,
for each fixed instant t, we just have a vector field which maps each (x, y, z) in D into the vector
$\mathbf{F}(t, x, y, z)$ in R3 (i.e. t is kept constant and we think only of the dependence on the space variables
(x, y, z)). Similarly, a time varying scalar field is one in which, for each fixed instant t, we just have
a scalar field which maps each (x, y, z) in D into the real number f(t, x, y, z).
Chapter 4
Curves and Paths in Space
Curves and paths in space are among the essential building-blocks of vector calculus. We first
motivate the ideas of a path and a curve in space with a few simple examples.
4.1 Motivating Examples
Example 4.1.1. Define
(4.1.1) γγγ(t) := (cos(πt), sin(πt)), 0 ≤ t ≤ 1.
This defines a function or mapping
(4.1.2) γγγ : [0, 1]→ R2
which takes each 0 ≤ t ≤ 1 into the vector γγγ(t) in the plane R2 given by (4.1.1). This function is
called a path in the plane R2. If we plot γγγ(t) versus 0 ≤ t ≤ 1 then we get a semicircle on the plane
R2 shown in the next figure: This semicircle, which is the image or range of the function γγγ(t), is
called the curve or trace of the path. Notice that the curve has a natural direction as t increases
from t = 0 until t = 1.
Example 4.1.2. Define
\[ \boldsymbol{\gamma}(t) := (t^2, t^3), \quad 1 \le t \le 3. \tag{4.1.3} \]
This defines a function or mapping γγγ : [1, 3] → R2 which takes each 1 ≤ t ≤ 3 into the vector γγγ(t)
in the plane R2 given by (4.1.3). A plot of γγγ(t) versus 1 ≤ t ≤ 3 is shown at Figure 4.2. Again, this
plot is the curve or trace of the path and has a direction which corresponds to increasing t from
t = 1 until t = 3.
Figure 4.1: Path defined by (4.1.1)
4.2 Paths and Parametric Representation of Curves
With Example 4.1.1 and Example 4.1.2 in mind, we can now formulate in general terms what is
meant by a path and by the corresponding curve (or trace) of the path:
Definition 4.2.1. A two-dimensional path (or parametric function for a two-dimensional curve) is
a given function or mapping
γγγ : [a, b]→ R2,
from a specified interval [a, b] into R2, which maps each a ≤ t ≤ b into the vector γγγ(t) in R2. This
mapping is usually written in the scalar component form
\[ \boldsymbol{\gamma}(t) = (x(t), y(t)) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} \quad \text{for all } a \le t \le b. \tag{4.2.4} \]
The curve of the path is the set of points Γ in the plane R2 traced by γγγ(t) as t traverses the interval
a ≤ t ≤ b. In the notation of sets we write this as
\[ \Gamma := \{ \boldsymbol{\gamma}(t) \in \mathbb{R}^2 \mid a \le t \le b \}. \]
The interval a ≤ t ≤ b on which the path is defined is called the parametric interval, the variable
t is called the parametric variable of the path, and the whole function γ : [a, b] → R2 is called a
parametric representation of the curve Γ. The starting point of the path is the vector γ(a) while
the ending point of the path is the vector γ(b), and the curve Γ has a direction from the starting to
the ending point corresponding to t increasing from t = a until t = b.
Figure 4.2: Path defined by (4.1.3)
We can clearly formulate analogous ideas in three-dimensional R3 space rather than the plane
R2, by an obvious modification of the preceding definition. For completeness we give this next:
Definition 4.2.2. A three-dimensional path (or parametric function for a three-dimensional curve)
is a given function or mapping
γγγ : [a, b]→ R3,
from a specified interval [a, b] into R3, which maps each a ≤ t ≤ b into the vector γγγ(t) in R3. This
mapping is usually written in the scalar component form
\[ \boldsymbol{\gamma}(t) = (x(t), y(t), z(t)) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k} \quad \text{for all } a \le t \le b. \tag{4.2.5} \]
The curve of the path is the set of points Γ in the space R3 traced by γγγ(t) as t traverses the interval
a ≤ t ≤ b. In the notation of sets we write this as
\[ \Gamma := \{ \boldsymbol{\gamma}(t) \in \mathbb{R}^3 \mid a \le t \le b \}. \]
The interval a ≤ t ≤ b on which the path is defined is called the parametric interval, the variable
t is called the parametric variable of the path, and the whole function γγγ : [a, b] → R3 is called a
parametric representation of the curve Γ. The starting point of the path is the vector γγγ(a) while
the ending point of the path is the vector γγγ(b), and the curve Γ has a direction from the starting to
the ending point corresponding to t increasing from t = a until t = b.
Figure 4.3: Path γγγ : [a, b]→ R3
Remark 4.2.3. Definition 4.2.1 and Definition 4.2.2 are word-for-word identical except that every-
where we just replace the plane R2 with three-dimensional space R3. From now on we shall usually
just formulate ideas such as path and curve in the more general case of R3 and you will be left
to formulate the corresponding idea in the simpler case of R2. On the other hand, many of our
concrete examples will be for the two-dimensional case simply because it is much easier to draw
curves on R2 than in R3.
Remark 4.2.4. If the path γ : [a, b] → R3 is such that the first derivatives
\[ \frac{dx(t)}{dt}, \quad \frac{dy(t)}{dt}, \quad \frac{dz(t)}{dt}, \]
of the scalar components at (4.2.5) exist and are continuous for all a ≤ t ≤ b, then γ is called a
C1-path and the curve corresponding to γ is called a C1-curve (compare with C1-fields in Remark
3.2.2). Again, if γ : [a, b] → R3 is a C1-path with the further property that the second derivatives
of the scalar components
\[ \frac{d^2x(t)}{dt^2}, \quad \frac{d^2y(t)}{dt^2}, \quad \frac{d^2z(t)}{dt^2}, \]
exist and are continuous for all a ≤ t ≤ b, then γ is called a C2-path and the curve corresponding
to γ is called a C2-curve. In this course our focus will be almost exclusively on C1 and C2-paths
and curves. These ideas specialize in the obvious way for curves in the plane given by the path
γ : [a, b] → R2; one simply discards the derivatives of the scalar component z(t).
Remark 4.2.5. Notice from Definition 4.2.1 and Definition 4.2.2 the distinction between a path
and its curve or trace. The path refers to the whole mapping, including the interval of definition,
whereas the curve is just the totality of points in space (R2 or R3) that are successively occupied
by γγγ(t) as t increases through the interval of definition. In particular, all information concerning
dependence on the parametric variable t and the interval of definition [a, b] is lost if we are just
given the curve of a path, rather than the path itself. The next example illustrates that different
paths can nevertheless have identical curves:
Example 4.2.6. Define
(4.2.6) γγγ(t) := (cos(2πt), sin(2πt)), 0 ≤ t ≤ 1/2.
We then have a mapping γγγ : [0, 1/2] → R2, and the plot of γγγ(t) versus 0 ≤ t ≤ 1/2 is shown in
Figure 4.4:
Remark 4.2.7. Observe that the path γγγ(t) at (4.1.1) is different from the path γγγ(t) at (4.2.6) (the
intervals of definition and defining formulae are clearly different) but, upon comparing Figure 4.1 and
Figure 4.4, it becomes clear that these different paths have identical curves, namely the semicircles
ABC. One can think of the paths at (4.1.1) and (4.2.6) as distinct parametric representations of
the same curve, namely the semicircular arc ABC at Figure 4.1 and Figure 4.4.
Remark 4.2.8. Here we illustrate a general method for starting with a given path and changing
the underlying parametrization to get a generally different path which nevertheless has exactly the
same curve. To illustrate the idea in a specific case let
(4.2.7) γγγ1(t) := (cos(πt), sin(πt)), 0 ≤ t ≤ 1.
be the path in Example 4.1.1. We see from Figure 4.1 that the curve of this path is the semicircular
arc ABC. Now define the function
\[ \psi : [1, \sqrt{2}] \to \mathbb{R} \quad \text{as} \quad \psi(s) := s^2 - 1. \tag{4.2.8} \]
Figure 4.4: Path defined by (4.2.6)
We then see that
\[ \psi(1) = 0, \quad \psi(\sqrt{2}) = 1, \quad \psi^{(1)}(s) = 2s > 0 \quad \text{for all } 1 \le s \le \sqrt{2}, \tag{4.2.9} \]
(see the following Figure 4.5). It is clear that the function ψ(s) increases strictly monotonically
through the interval [0, 1] as s increases through the interval 1 ≤ s ≤ √2. Now define the path
\[ \boldsymbol{\gamma}_2 : [1, \sqrt{2}] \to \mathbb{R}^2 \]
as follows:
\[ \boldsymbol{\gamma}_2(s) := \boldsymbol{\gamma}_1(\psi(s)) = (\cos(\pi(s^2 - 1)), \sin(\pi(s^2 - 1))), \quad 1 \le s \le \sqrt{2}. \tag{4.2.10} \]
In the substitution at (4.2.10) the quantity ψ(s) = s^2 − 1 replaces every occurrence of t in (4.2.7).
In Figure 4.6 we show how γ2(s) moves through R2 for different values of 1 ≤ s ≤ √2, indicating
specifically s = 1, s = √(3/2), s = √2 (for which ψ(s) = 0, 1/2, 1 respectively). Comparing Figure 4.1
and Figure 4.6 we see that the two
paths defined by (4.2.7) and (4.2.10), which are clearly different paths, nevertheless have the same
curve, that is the arc ABC. The paths defined by (4.2.7) and (4.2.10) are distinct parametric
representations of the same curve ABC.
We can repeat this reparametrization of curves in complete generality. Suppose that
(4.2.11) γγγ1 : [a1, b1]→ R3,
Figure 4.5: Function ψ at (4.2.9)
is a given path. We are going to change the parametrization to get a different path with exactly
the same curve, much as we did in the preceding special case. To this end, fix some interval [a2, b2]
as well as some strictly increasing C1-function
(4.2.12) ψ : [a2, b2]→ R
such that
(4.2.13) ψ(a2) = a1, ψ(b2) = b1, ψ(1)(s) > 0 for all a2 ≤ s ≤ b2,
(see the following Figure 4.7). It is clear that the function ψ(s) increases strictly monotonically
through the interval [a1, b1] as s increases through the interval a2 ≤ s ≤ b2. Now define
(4.2.14) γγγ2(s) := γγγ1(ψ(s)), a2 ≤ s ≤ b2.
We then get a path
(4.2.15) γγγ2 : [a2, b2]→ R3,
which is clearly different from the path (4.2.11) (except for the trivial case where a1 = a2, b1 = b2
and ψ(s) = s). However, it is clear from (4.2.14) that the two paths nevertheless follow identical
curves in R3.
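The reparametrization of Remark 4.2.8 can be checked numerically. The following is a minimal sketch in Python, using only the paths (4.2.7) and (4.2.10) and the function ψ at (4.2.8); the helper names `sample` and `distance_to_unit_circle` are illustrative choices and do not come from the notes. It samples both paths and confirms that every sampled point lies on the upper unit semicircle, and that the two paths share their starting and ending points.

```python
import math

def gamma1(t):
    """Path (4.2.7): gamma1(t) = (cos(pi t), sin(pi t)), 0 <= t <= 1."""
    return (math.cos(math.pi * t), math.sin(math.pi * t))

def psi(s):
    """Change of parameter (4.2.8): psi(s) = s^2 - 1 on [1, sqrt(2)]."""
    return s * s - 1.0

def gamma2(s):
    """Reparametrized path (4.2.10): gamma2(s) = gamma1(psi(s))."""
    return gamma1(psi(s))

def sample(path, a, b, n):
    """Return the n + 1 points path(a), ..., path(b) at equally spaced parameters."""
    return [path(a + (b - a) * k / n) for k in range(n + 1)]

def distance_to_unit_circle(point):
    """Distance from a point of the plane to the unit circle."""
    x, y = point
    return abs(math.hypot(x, y) - 1.0)

points1 = sample(gamma1, 0.0, 1.0, 200)
points2 = sample(gamma2, 1.0, math.sqrt(2.0), 200)

# Every sampled point of both paths lies (up to rounding) on the upper semicircle.
on_semicircle = all(
    distance_to_unit_circle(p) < 1e-12 and p[1] > -1e-12
    for p in points1 + points2
)
print(on_semicircle)
print(gamma1(0.0), gamma2(1.0))             # common starting point A
print(gamma1(1.0), gamma2(math.sqrt(2.0)))  # common ending point C
```

The two paths visit the points of the arc at different parameter values, which is exactly the sense in which they are different paths with the same curve.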
Figure 4.6: Path defined by (4.2.10)
4.3 Derivatives Along a Path and Tangent to a Curve
Suppose that we are given a path
\[ \boldsymbol{\gamma} : [a, b] \to \mathbb{R}^3, \tag{4.3.16} \]
written in the scalar component form (cf. (4.2.5))
\[ \boldsymbol{\gamma}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k} = (x(t), y(t), z(t)), \quad a \le t \le b, \tag{4.3.17} \]
so that the right side of (4.3.17) gives the usual (x, y, z)-coordinates in three-dimensional space
at each value of the parameter t. We then define the derivative of the path with respect to the
parameter t to be the vector in R3 given by
\[
\begin{aligned}
\boldsymbol{\gamma}^{(1)}(t) &:= \frac{dx(t)}{dt}\,\mathbf{i} + \frac{dy(t)}{dt}\,\mathbf{j} + \frac{dz(t)}{dt}\,\mathbf{k} \\
&= \left( \frac{dx(t)}{dt}, \frac{dy(t)}{dt}, \frac{dz(t)}{dt} \right), \quad \text{for all instants } a \le t \le b,
\end{aligned}
\tag{4.3.18}
\]
in which the scalar components of γ(1)(t) are the t-derivatives of the scalar components of γ(t). An
alternative notation for the derivative γ(1)(t) at (4.3.18) is
\[ \frac{d\boldsymbol{\gamma}(t)}{dt} := \frac{dx(t)}{dt}\,\mathbf{i} + \frac{dy(t)}{dt}\,\mathbf{j} + \frac{dz(t)}{dt}\,\mathbf{k}. \tag{4.3.19} \]
Figure 4.7: Function ψ at (4.2.13)
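Since γ(1)(t) is a vector in R3 for each a ≤ t ≤ b, we end up with another path
\[ \boldsymbol{\gamma}^{(1)} : [a, b] \to \mathbb{R}^3, \tag{4.3.20} \]
given by the t-derivative γ(1)(t) in R3 at every instant a ≤ t ≤ b. In the same way it is natural to
define the second t-derivative of the path γ as
\[
\begin{aligned}
\boldsymbol{\gamma}^{(2)}(t) &:= \frac{d}{dt}\boldsymbol{\gamma}^{(1)}(t) \\
&= \frac{d^2x(t)}{dt^2}\,\mathbf{i} + \frac{d^2y(t)}{dt^2}\,\mathbf{j} + \frac{d^2z(t)}{dt^2}\,\mathbf{k} \\
&= \left( \frac{d^2x(t)}{dt^2}, \frac{d^2y(t)}{dt^2}, \frac{d^2z(t)}{dt^2} \right), \quad \text{for all instants } a \le t \le b.
\end{aligned}
\tag{4.3.21}
\]
An alternative notation for the derivative γ(2)(t) at (4.3.21) is
\[ \frac{d^2\boldsymbol{\gamma}(t)}{dt^2} := \frac{d^2x(t)}{dt^2}\,\mathbf{i} + \frac{d^2y(t)}{dt^2}\,\mathbf{j} + \frac{d^2z(t)}{dt^2}\,\mathbf{k}, \quad \text{for all instants } a \le t \le b. \tag{4.3.22} \]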
Notice that the right side of (4.3.21) also defines a point in R3 for each a ≤ t ≤ b, so that we have
yet another path
(4.3.23) γγγ(2) : [a, b]→ R3,
which gives the second t-derivative γγγ(2)(t) in R3 at every instant a ≤ t ≤ b.
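From (4.3.17) and (4.3.18), together with the fact that
\[ \frac{dx(t)}{dt} = \lim_{\Delta t \to 0} \frac{x(t + \Delta t) - x(t)}{\Delta t}, \]
and similarly for dy(t)/dt and dz(t)/dt,
it follows that we can write the first t-derivative γ(1)(t) as
\[ \boldsymbol{\gamma}^{(1)}(t) = \lim_{\Delta t \to 0} \frac{\boldsymbol{\gamma}(t + \Delta t) - \boldsymbol{\gamma}(t)}{\Delta t}, \quad a \le t \le b. \tag{4.3.24} \]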
From Figure 4.8 one sees that the vector difference γ(t + ∆t) − γ(t) is very close to tangential to
the curve of the path (4.3.16) at γ(t) when ∆t is small, and therefore the “rescaled” vector
\[ \frac{\boldsymbol{\gamma}(t + \Delta t) - \boldsymbol{\gamma}(t)}{\Delta t} \]
is again close to tangent to the curve of the path (4.3.16) at γ(t) when ∆t is small, and the limit
at (4.3.24) is exactly tangent to the curve of the path (4.3.16) at γ(t) (see Figure 4.9).
Figure 4.8: Approximation of first t-derivative
We therefore have the important fact that
the derivative γγγ(1)(t) is tangent to the curve of the path (4.3.16) at γγγ(t) for each instant a ≤ t ≤ b,
(see Figure 4.9). In exactly the same way one can see that
the derivative γγγ(2)(t) is tangent to the curve of the path (4.3.20) at γγγ(1)(t) for each instant a ≤ t ≤ b.
One can of course define further t-derivatives γγγ(n)(t), n = 3, 4, . . ., in exactly the same way (although
there is seldom a need for these).
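The limit at (4.3.24) can be watched numerically. The sketch below, offered only as an illustration, takes the path of Example 4.1.2 and compares the rescaled difference vector with the component-wise derivative from (4.3.18) at the (arbitrarily chosen) instant t = 2; the function names `difference_quotient` and `gamma_prime_exact` are assumptions made for this example.

```python
def gamma(t):
    """Path of Example 4.1.2: gamma(t) = (t^2, t^3), 1 <= t <= 3."""
    return (t ** 2, t ** 3)

def gamma_prime_exact(t):
    """Component-wise derivative as in (4.3.18): (2t, 3t^2)."""
    return (2.0 * t, 3.0 * t ** 2)

def difference_quotient(path, t, dt):
    """The rescaled vector (path(t + dt) - path(t)) / dt appearing in (4.3.24)."""
    p0, p1 = path(t), path(t + dt)
    return tuple((b - a) / dt for a, b in zip(p0, p1))

t = 2.0
exact = gamma_prime_exact(t)
errors = {}
for dt in (1e-1, 1e-2, 1e-3, 1e-4):
    approx = difference_quotient(gamma, t, dt)
    # Largest component-wise deviation from the exact derivative.
    errors[dt] = max(abs(a - e) for a, e in zip(approx, exact))
    print(dt, approx, errors[dt])
```

The printed error shrinks roughly in proportion to ∆t, consistent with the rescaled vector approaching the tangent vector γ(1)(t).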
Figure 4.9: First derivative γγγ(1)(t) is tangent to the curve at γγγ(t)
Remark 4.3.1. Given two paths
(4.3.25) γγγ1 : [a, b]→ R3 and γγγ2 : [a, b]→ R3,
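define the inner product of the vectors γ1(t) and γ2(t) in R3 for each a ≤ t ≤ b, giving the R-valued
function
\[ \varphi(t) := \boldsymbol{\gamma}_1(t) \cdot \boldsymbol{\gamma}_2(t), \quad \text{for all } a \le t \le b. \tag{4.3.26} \]
Then we recall the following rule for differentiation of products:
\[ \frac{d}{dt}\varphi(t) = \boldsymbol{\gamma}_1(t) \cdot \boldsymbol{\gamma}_2^{(1)}(t) + \boldsymbol{\gamma}_1^{(1)}(t) \cdot \boldsymbol{\gamma}_2(t), \quad \text{for all } a \le t \le b. \tag{4.3.27} \]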
Remark 4.3.2. The parametric variable for a given path
(4.3.28) γγγ : [a, b]→ R3,
typically indicated by the variable t (although symbols such as s, σ, τ , u etc. could equally
well be used) does not have any “physical” interpretation in Definition 4.2.2. However, there are
applications in which this variable is specifically interpreted as time, the parametric interval [a, b]
is a given time interval, and the vector γγγ(t) represents the point in three-dimensional space R3
occupied (e.g. by a particle, an electric charge etc.) at the instant of time a ≤ t ≤ b. Since we
are now talking about movement through space as time increases it makes sense to introduce the
corresponding velocity and acceleration. Of course the velocity is defined as the first t-derivative of
γγγ, namely
(4.3.29) vvv(t) := γγγ(1)(t), for all times t in a ≤ t ≤ b,
and the acceleration is defined as the second t-derivative of γγγ, namely
(4.3.30) aaa(t) := γγγ(2)(t), for all times t in a ≤ t ≤ b.
This has some very important consequences. In fact, suppose that FFF : D → R3 is a vector field
with domain which we shall take to be D := R3 for simplicity, and that, at each point (x, y, z) in
R3, the vector FFF (x, y, z) is the force acting on a particle of mass m located at (x, y, z). Suppose the
particle follows the path
γγγ : [a, b]→ R3
in response to this force. If the particle is at γγγ(t) at instant t then the force on the particle is given
by FFF (γγγ(t)). Newton’s second law then says that
(4.3.31) maaa(t) = FFF (γγγ(t)), for all times t in a ≤ t ≤ b.
Combining this with (4.3.30) gives
(4.3.32) mγγγ(2)(t) = FFF (γγγ(t)), for all times t in a ≤ t ≤ b.
This is a second order vector differential equation which can, in principle, be solved to get the path
γγγ : [a, b] → R3 followed by the particle if one knows the force vector field FFF . In practice of course
this equation may be very difficult to solve explicitly, but it can usually be solved numerically. In
fact, the path followed by a space probe moving through the solar system is calculated by solving
(4.3.32) (numerically); here FFF is the sum of the forces exerted on the probe by the sun and the
various planets. These forces are obtained from Newton’s law of universal gravitation.
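To make the idea of solving (4.3.32) numerically concrete, here is a small sketch under stated assumptions. The force field F(r) = −k r is an illustrative choice and is not taken from the notes, and the velocity Verlet scheme used in `solve_newton` (a hypothetical helper name) is just one of many standard time-stepping methods. With m = k = 1, starting point (1, 0, 0) and starting velocity (0, 1, 0), the exact path is (cos t, sin t, 0), which gives something to compare against.

```python
import math

def force(r, k=1.0):
    """Illustrative force field F(r) = -k r (an assumption, not from the notes)."""
    return tuple(-k * c for c in r)

def solve_newton(force_field, m, r0, v0, t_end, steps):
    """Integrate m r'' = F(r) by the velocity Verlet scheme; return (r, v) at t_end."""
    dt = t_end / steps
    r, v = tuple(r0), tuple(v0)
    a = tuple(f / m for f in force_field(r))
    for _ in range(steps):
        # Advance the position using the current velocity and acceleration.
        r = tuple(ri + vi * dt + 0.5 * ai * dt * dt for ri, vi, ai in zip(r, v, a))
        # Re-evaluate the acceleration at the new position, then advance the velocity.
        a_new = tuple(f / m for f in force_field(r))
        v = tuple(vi + 0.5 * (ai + an) * dt for vi, ai, an in zip(v, a, a_new))
        a = a_new
    return r, v

r_end, v_end = solve_newton(force, 1.0, r0=(1.0, 0.0, 0.0), v0=(0.0, 1.0, 0.0),
                            t_end=1.0, steps=1000)
exact = (math.cos(1.0), math.sin(1.0), 0.0)
print(r_end)
print(exact)
```

For a realistic force field, such as the gravitational forces mentioned above, only the function passed as `force_field` would change.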
4.4 Simple Curves and Closed Curves
Suppose that
(4.4.33) γγγ : [a, b]→ R2
is a path (or parametric function) with corresponding curve Γ (recall Definition 4.2.1) shown at the
left of Figure 4.10. The curve Γ clearly has the property that it does not “cross itself” anywhere.
What this means is that for any distinct t1 and t2 in the interval [a, b] (i.e. t1 ≠ t2) it must be the
case that γ(t1) ≠ γ(t2), since, if γ(t1) = γ(t2) for distinct t1 and t2, then the curve must cross itself
somewhere (shown at the right of Figure 4.10). The curve Γ at the left of Figure 4.10, which does
not cross itself anywhere, is called a simple curve, whereas the curve Γ at the right of Figure 4.10,
which does cross itself somewhere, is called a non-simple curve. Of course exactly the same basic
idea holds for the curves of paths in three dimensional space, that is
\[ \boldsymbol{\gamma} : [a, b] \to \mathbb{R}^3. \tag{4.4.34} \]
If the curve in R3 of the path (4.4.34) does not cross itself anywhere then it is a simple curve, and
if it does cross itself somewhere then it is a non-simple curve.
Figure 4.10: Simple and non-simple curves
For paths (4.4.33) and (4.4.34) the point γγγ(a) is called the starting point of the curve, and the
point γγγ(b) is called the ending point of the curve. A curve, simple or not, is called a closed curve
when the starting point and ending point coincide, that is
(4.4.35) γγγ(a) = γγγ(b).
A closed curve which is also simple (see the left of Figure 4.11) is called a simple closed curve
whereas a closed curve which does cross itself somewhere (at a point distinct from the “exceptional” point
γ(a) = γ(b)) is a non-simple closed curve. The study of simple and non-simple curves is quite deep
and is part of algebraic topology. We are going to see that simple closed curves in particular are
quite important in vector calculus.
Figure 4.11: Simple and non-simple closed curves
Chapter 5
Line Integral and Arc Length
You are all familiar with the integration of a given function f(t) for all t in some interval a ≤ t ≤ b.
In Chapter 2 we extended the idea of integration over intervals to integration over regions of R2 and
R3. Vector calculus acquires its extraordinary power from two further extensions of the integration
concept, namely line integrals and surface integrals. Indeed, the main theorems of vector calculus,
as well as the basic laws of electricity and magnetism, can only be stated in terms of line integrals
and surface integrals. Our goal in this chapter is to study the construction and main properties of
line integrals, deferring the more sophisticated notion of surface integrals to Chapter 8.
5.1 Line Integral of a Vector Field
To formulate what is meant by a line integral suppose that we are given a curve in the space R3
which starts at a point A and ends at a point B which we denote by Γ, and we are also given a
vector field FFF (x, y, z) (see Figure 5.1). For the sake of simplicity, and to focus on just the essentials,
we take the domain of the vector field to be all of R3.
Remark 5.1.1. In order to simplify the notation, from now on we are always going to write rrr for
the vector corresponding to a point (x, y, z) in R3, so that
\[ \mathbf{r} = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}, \quad \text{with length } \|\mathbf{r}\| = \sqrt{x^2 + y^2 + z^2}, \tag{5.1.1} \]
in which iii, jjj and kkk are the usual standard unit vectors along the x, y and z-axes respectively. With
this notation we will write FFF (rrr) as an alternative to the notation FFF (x, y, z) for the value of the
vector field FFF at the point rrr given by (5.1.1).
Figure 5.1: Curve Γ in R3
Now we can define the line integral of the vector field F(r) along the curve Γ as follows: Introduce
points r0, r1, . . . , ri, ri+1, . . . , rn along the curve as shown at Figure 5.1, with r0 and rn corresponding
to A and B respectively, and put
\[ \Delta\mathbf{r}_i := \mathbf{r}_{i+1} - \mathbf{r}_i, \quad i = 0, 1, \ldots, n-1. \tag{5.1.2} \]
Then the inner product F(ri) · ∆ri is a scalar. If the sum
\[ \sum_{i=0}^{n-1} \mathbf{F}(\mathbf{r}_i) \cdot \Delta\mathbf{r}_i \tag{5.1.3} \]
converges to a real number as n → ∞ and $\max_{0 \le i \le n-1} \|\Delta\mathbf{r}_i\| \to 0$, then this limit is called the line
integral of the vector field F(r) along the curve Γ and denoted by
\[ \int_{\Gamma} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} \quad \text{or} \quad \int_{\Gamma} \mathbf{F}(x, y, z) \cdot d\mathbf{r} \quad \text{or, more briefly, by} \quad \int_{\Gamma} \mathbf{F} \cdot d\mathbf{r}. \tag{5.1.4} \]
Remark 5.1.2. In the particular case where the starting and end points A and B of the curve Γ
coincide, so that Γ is a closed curve in R3 (see Figure 5.2) the notations of (5.1.4) are sometimes
modified to
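\[ \oint_{\Gamma} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} \quad \text{or} \quad \oint_{\Gamma} \mathbf{F}(x, y, z) \cdot d\mathbf{r} \quad \text{or, more briefly, by} \quad \oint_{\Gamma} \mathbf{F} \cdot d\mathbf{r}, \tag{5.1.5} \]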
the purpose of the small circle over the integral sign being to indicate that the line integral is over
a closed curve. In this case the line integral is called the circulation of the vector field FFF around the
closed curve Γ. In general, the small circle over the integral sign is quite redundant and does not
tell us anything new. We shall therefore avoid the notations at (5.1.5) and just use the notations
at (5.1.4) even when Γ is a closed curve.
Figure 5.2: Circulation of the vector field FFF
Remark 5.1.3. What is the significance of the line integral we have just defined? This depends
on the physical significance of the field FFF . Suppose that the vector field FFF is an electric field.
Then, with reference to Figure 5.1, F(ri) is the force exerted on a standard unit positive charge
(i.e. a positive charge of one coulomb) when it is located at point ri on Γ, the quantity F(ri) · ∆ri is
approximately the work done by the electric field in displacing the standard unit positive charge
along Γ from rrri to rrri+1, and the quantity at (5.1.3) is approximately the work done by the electric
field in moving the unit positive charge along Γ from A to B. It is clear that the limit given by the
line integral at (5.1.4), that is
\[ \int_{\Gamma} \mathbf{F} \cdot d\mathbf{r}, \tag{5.1.6} \]
is exactly the work done by the electric field in moving the standard unit positive charge along Γ
from A to B. This quantity is of course just the voltage difference between A and B. If Γ is the
closed curve in Figure 5.2 then the circulation of the electric field FFF around the closed curve Γ,
which is again the line integral at (5.1.6), is the total work done by the electric field in moving
a unit positive charge once completely around Γ. This quantity (measured in volts) is called the
electromotive force (or emf) of the electric field FFF around Γ. In conclusion, we see that line integrals
can have definite physical significance. In fact, line integrals are indispensable in much of physics
and engineering. In particular, as we shall see later, line integrals are essential for expressing the
basic laws of electromagnetism in mathematical form.
It is one thing to define the line integral as we have done, quite another thing to actually calculate
the line integral of a given vector field along a given curve. As one knows from ordinary calculus
the calculation of integrals can be a challenge. Fortunately, line integrals can be quite tractable
when we know a path
(5.1.7) γγγ : [a, b]→ R3
giving the curve Γ (recall Definition 4.2.2), which is nearly always the case when we actually have
to calculate a line integral. To see this, fix points ti in the interval a ≤ t ≤ b such that
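\[ a = t_0 < t_1 < \ldots < t_n = b \quad \text{and put} \quad \Delta t_i := t_{i+1} - t_i \quad \text{and} \quad \mathbf{r}_i := \boldsymbol{\gamma}(t_i). \tag{5.1.8} \]
Then of course
\[
\begin{aligned}
\Delta\mathbf{r}_i &= \mathbf{r}_{i+1} - \mathbf{r}_i = \boldsymbol{\gamma}(t_{i+1}) - \boldsymbol{\gamma}(t_i) \\
&\approx \boldsymbol{\gamma}^{(1)}(t_i)(t_{i+1} - t_i) \quad \text{(linear approximation when } \Delta t_i \text{ is small)} \\
&= \boldsymbol{\gamma}^{(1)}(t_i)\,\Delta t_i.
\end{aligned}
\tag{5.1.9}
\]
Then
\[ \sum_{i=0}^{n-1} \mathbf{F}(\mathbf{r}_i) \cdot \Delta\mathbf{r}_i \approx \sum_{i=0}^{n-1} \mathbf{F}(\boldsymbol{\gamma}(t_i)) \cdot \boldsymbol{\gamma}^{(1)}(t_i)\,\Delta t_i, \tag{5.1.10} \]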
as follows from (5.1.9) and (5.1.8). If we define the function
(5.1.11) g(t) := FFF (γγγ(t)) · γγγ(1)(t), a ≤ t ≤ b,
then it is clear that the right side of (5.1.10) converges to the ordinary integral
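\[ \int_a^b g(t)\,dt \equiv \int_a^b \mathbf{F}(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t)\,dt. \tag{5.1.12} \]
It follows that
\[ \int_{\Gamma} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = \int_a^b \mathbf{F}(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t)\,dt \tag{5.1.13} \]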
so that evaluation of the line integral on the left boils down to evaluation of the ordinary integral
on the right, this just being evaluation of the integral of g(t) at (5.1.11) over the interval a ≤ t ≤ b.
In view of the expansion at (4.2.5), that is
(5.1.14) γγγ(t) = x(t)iii+ y(t)jjj + z(t)kkk for all t in a ≤ t ≤ b,
we can “expand” the right side of (5.1.13). Indeed, from (5.1.14) we have
\[ \boldsymbol{\gamma}^{(1)}(t) = \frac{dx}{dt}(t)\,\mathbf{i} + \frac{dy}{dt}(t)\,\mathbf{j} + \frac{dz}{dt}(t)\,\mathbf{k} \quad \text{for all } a \le t \le b. \tag{5.1.15} \]
Moreover, expanding the vector field FFF according to (3.2.17), that is
(5.1.16) FFF (x, y, z) = F1(x, y, z)iii+ F2(x, y, z)jjj + F3(x, y, z)kkk, for all (x, y, z) in D,
we see from (5.1.16), (5.1.15) and (5.1.14) that
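\[
\begin{aligned}
\mathbf{F}(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t)
&= \left[ F_1(x(t), y(t), z(t))\,\mathbf{i} + F_2(x(t), y(t), z(t))\,\mathbf{j} + F_3(x(t), y(t), z(t))\,\mathbf{k} \right] \\
&\qquad \cdot \left[ \frac{dx}{dt}(t)\,\mathbf{i} + \frac{dy}{dt}(t)\,\mathbf{j} + \frac{dz}{dt}(t)\,\mathbf{k} \right] \\
&= F_1(x(t), y(t), z(t))\,\frac{dx}{dt}(t) + F_2(x(t), y(t), z(t))\,\frac{dy}{dt}(t) + F_3(x(t), y(t), z(t))\,\frac{dz}{dt}(t).
\end{aligned}
\tag{5.1.17}
\]
From (5.1.17) and (5.1.13) we are able to write the line integral in the expanded form
\[
\int_{\Gamma} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = \int_a^b \left[ F_1(x(t), y(t), z(t))\,\frac{dx}{dt}(t) + F_2(x(t), y(t), z(t))\,\frac{dy}{dt}(t) + F_3(x(t), y(t), z(t))\,\frac{dz}{dt}(t) \right] dt.
\tag{5.1.18}
\]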
Remark 5.1.4. Needless to say, all of the preceding trivially specializes to the case where we have
a curve Γ in the plane R2 (instead of in R3) and FFF (x, y) is a vector field in R2.
Example 5.1.5. A vector field in R2 is defined by
(5.1.19) FFF (x, y) := (y,−x) = yiii− xjjj for all (x, y) in R2.
Γ1 is a curve in R2 of the path
(5.1.20) γγγ : [0, π/2]→ R2, given by γγγ(t) := (cos(t), sin(t)).
Determine the line integral of F along Γ1.
We apply (5.1.13). From (5.1.20) and (5.1.19)
(5.1.21) FFF (γγγ(t)) = (sin(t),− cos(t)), 0 ≤ t ≤ π/2,
and
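\[ \boldsymbol{\gamma}^{(1)}(t) = \left( \frac{d}{dt}\cos(t), \frac{d}{dt}\sin(t) \right) = (-\sin(t), \cos(t)), \quad 0 \le t \le \pi/2. \tag{5.1.22} \]
From (5.1.22) and (5.1.21)
\[ \mathbf{F}(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t) = (\sin(t), -\cos(t)) \cdot (-\sin(t), \cos(t)) = -\sin^2(t) - \cos^2(t) = -1. \tag{5.1.23} \]
From (5.1.23) and (5.1.13) we have
\[ \int_{\Gamma_1} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = \int_0^{\pi/2} \mathbf{F}(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t)\,dt = \int_0^{\pi/2} (-1)\,dt = -\frac{\pi}{2}. \tag{5.1.24} \]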
Now suppose that Γ2 is another curve in R2, of the path
(5.1.25) γγγ : [0, 1]→ R2, given by γγγ(t) := (1− t, t).
We repeat the computation of the line integral of the vector field FFF at (5.1.19) but along the curve
Γ2 corresponding to the path at (5.1.25).
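From (5.1.19) and (5.1.25)
\[ \mathbf{F}(\boldsymbol{\gamma}(t)) = (t, t - 1), \quad 0 \le t \le 1, \tag{5.1.26} \]
and
\[ \boldsymbol{\gamma}^{(1)}(t) = \left( \frac{d}{dt}(1 - t), \frac{d}{dt}(t) \right) = (-1, 1), \quad 0 \le t \le 1, \tag{5.1.27} \]
so that
\[ \mathbf{F}(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t) = (t, t - 1) \cdot (-1, 1) = -1. \tag{5.1.28} \]
From (5.1.28) and (5.1.13) we have
\[ \int_{\Gamma_2} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = \int_0^1 \mathbf{F}(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t)\,dt = \int_0^1 (-1)\,dt = -1. \tag{5.1.29} \]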
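The two values found in Example 5.1.5 can be cross-checked directly against the defining sum (5.1.3), without using (5.1.13). The following sketch is one possible way to do this; the function name `line_integral_sum` and the choice of an equally spaced partition are assumptions made for illustration.

```python
import math

def F(x, y):
    """Vector field (5.1.19): F(x, y) = (y, -x)."""
    return (y, -x)

def gamma_arc(t):
    """Path (5.1.20): quarter circle from (1, 0) to (0, 1), 0 <= t <= pi/2."""
    return (math.cos(t), math.sin(t))

def gamma_segment(t):
    """Path (5.1.25): straight segment from (1, 0) to (0, 1), 0 <= t <= 1."""
    return (1.0 - t, t)

def line_integral_sum(field, path, a, b, n):
    """The sum (5.1.3) with r_i = path(t_i) for n equal steps of [a, b]."""
    total = 0.0
    for i in range(n):
        x0, y0 = path(a + (b - a) * i / n)
        x1, y1 = path(a + (b - a) * (i + 1) / n)
        fx, fy = field(x0, y0)
        # Inner product F(r_i) . (r_{i+1} - r_i).
        total += fx * (x1 - x0) + fy * (y1 - y0)
    return total

I1 = line_integral_sum(F, gamma_arc, 0.0, math.pi / 2.0, 10000)
I2 = line_integral_sum(F, gamma_segment, 0.0, 1.0, 10000)
print(I1, -math.pi / 2.0)
print(I2, -1.0)
```

The sums agree with (5.1.24) and (5.1.29) to within the discretization error, and they are visibly different from one another even though both curves run from (1, 0) to (0, 1).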
Figure 5.3: The curves Γ1 and Γ2
Remark 5.1.6. Example 5.1.5 illustrates something very important. In Figure 5.3 we have drawn
the curves Γ1 and Γ2 corresponding respectively to the paths γγγ at (5.1.20) and (5.1.25).
We see that Γ1 and Γ2 are distinct curves in R2, but do start at the common point A = (1, 0) and
end at the common point B = (0, 1), and that the line integrals of the vector field at (5.1.19) are
different. In general, if one has distinct curves Γ1 and Γ2 which nevertheless begin at a common
point A and end at a common point B, then the line integrals of a vector field over these curves
will typically be different. We will later identify an important class of vector fields which have the special
property that the line integral is the same for any curve from a given point A to a given point
B.
Remark 5.1.7. The equation (5.1.13) relates the line integral of the vector field FFF along a curve Γ
to a dt-integral which involves the parametric representation of Γ by some path (5.1.7). We know
from Example 4.1.1 and Example 4.2.6 that different paths can be the parametric representation of
the same curve. Furthermore, in Remark 4.2.8 we gave a general method for starting with the path
of a curve and constructing a generally different path with the same curve. With this in mind, it is
essential to check that the dt-integral on the right of (5.1.13) is the same regardless of which path
γγγ : [a, b] → R3 we use as a parametric representation of the curve Γ. The following result assures
us that this is the case:
Theorem 5.1.8. Suppose that FFF : R3 → R3 is a continuous vector field and that
(5.1.30) γγγ1 : [a1, b1]→ R3 and γγγ2 : [a2, b2]→ R3
are C1-paths having the same curve Γ, so that
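\[ \Gamma = \{ \boldsymbol{\gamma}_1(t) \in \mathbb{R}^3 \mid a_1 \le t \le b_1 \} \quad \text{and} \quad \Gamma = \{ \boldsymbol{\gamma}_2(t) \in \mathbb{R}^3 \mid a_2 \le t \le b_2 \}, \tag{5.1.31} \]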
i.e. γγγ1(t) traverses the curve Γ as t increases through the interval a1 ≤ t ≤ b1, and γγγ2(t) traverses
the identical curve Γ as t increases through the interval a2 ≤ t ≤ b2. Then
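\[ \int_{a_1}^{b_1} \mathbf{F}(\boldsymbol{\gamma}_1(t)) \cdot \boldsymbol{\gamma}_1^{(1)}(t)\,dt = \int_{a_2}^{b_2} \mathbf{F}(\boldsymbol{\gamma}_2(t)) \cdot \boldsymbol{\gamma}_2^{(1)}(t)\,dt. \tag{5.1.32} \]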
Proof: Suppose that the second parametric representation (or path)
(5.1.33) γγγ2 : [a2, b2]→ R3,
of the curve Γ is related to the first parametric representation
(5.1.34) γγγ1 : [a1, b1]→ R3,
of the same curve Γ by the construction seen at Remark 4.2.8, that is
(5.1.35) γγγ2(s) := γγγ1(ψ(s)), a2 ≤ s ≤ b2,
for some function
(5.1.36) ψ : [a2, b2]→ [a1, b1]
such that
(5.1.37) ψ(a2) = a1, ψ(b2) = b1, ψ(1)(s) > 0 for all a2 ≤ s ≤ b2.
To see that (5.1.32) holds we first evaluate the left side of (5.1.32) using integration by substitution.
Therefore define the substitution
(5.1.38) t := ψ(s), so that dt = ψ(1)(s) ds.
We will use this substitution to write the dt-integral on the left side of (5.1.32) as a ds-integral.
With this substitution the lower limit of the ds-integral is given by s1 such that a1 = ψ(s1), and
the upper limit is given by s2 such that b1 = ψ(s2), that is
\[ s_1 = \psi^{-1}(a_1), \quad s_2 = \psi^{-1}(b_1). \tag{5.1.39} \]
Then integration by substitution gives
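\[ \int_{a_1}^{b_1} \mathbf{F}(\boldsymbol{\gamma}_1(t)) \cdot \boldsymbol{\gamma}_1^{(1)}(t)\,dt = \int_{\psi^{-1}(a_1)}^{\psi^{-1}(b_1)} \mathbf{F}(\boldsymbol{\gamma}_1(\psi(s))) \cdot \boldsymbol{\gamma}_1^{(1)}(\psi(s))\,\psi^{(1)}(s)\,ds. \tag{5.1.40} \]
From (5.1.37) we have
\[ \psi^{-1}(a_1) = a_2, \quad \psi^{-1}(b_1) = b_2. \tag{5.1.41} \]
Now put (5.1.41) into (5.1.40):
\[
\begin{aligned}
\int_{a_1}^{b_1} \mathbf{F}(\boldsymbol{\gamma}_1(t)) \cdot \boldsymbol{\gamma}_1^{(1)}(t)\,dt
&= \int_{a_2}^{b_2} \mathbf{F}(\boldsymbol{\gamma}_1(\psi(s))) \cdot \boldsymbol{\gamma}_1^{(1)}(\psi(s))\,\psi^{(1)}(s)\,ds \\
&= \int_{a_2}^{b_2} \mathbf{F}(\boldsymbol{\gamma}_2(s)) \cdot \boldsymbol{\gamma}_1^{(1)}(\psi(s))\,\psi^{(1)}(s)\,ds,
\end{aligned}
\tag{5.1.42}
\]
(we used (5.1.35) at the second equality). Now evaluate the derivative $\boldsymbol{\gamma}_1^{(1)}(\psi(s))$; for this take the
s-derivative of each side of (5.1.35) to get
\[
\begin{aligned}
\boldsymbol{\gamma}_2^{(1)}(s) &= \frac{d}{ds}\boldsymbol{\gamma}_2(s) \\
&= \frac{d}{ds}\boldsymbol{\gamma}_1(\psi(s)) \quad \text{(from (5.1.35))} \\
&= \boldsymbol{\gamma}_1^{(1)}(\psi(s))\,\psi^{(1)}(s) \quad \text{(from the chain rule)},
\end{aligned}
\tag{5.1.43}
\]
and from (5.1.43) we get
\[ \int_{a_2}^{b_2} \mathbf{F}(\boldsymbol{\gamma}_2(s)) \cdot \boldsymbol{\gamma}_1^{(1)}(\psi(s))\,\psi^{(1)}(s)\,ds = \int_{a_2}^{b_2} \mathbf{F}(\boldsymbol{\gamma}_2(s)) \cdot \boldsymbol{\gamma}_2^{(1)}(s)\,ds. \tag{5.1.44} \]
Upon combining (5.1.44) and (5.1.42) we obtain
\[ \int_{a_1}^{b_1} \mathbf{F}(\boldsymbol{\gamma}_1(t)) \cdot \boldsymbol{\gamma}_1^{(1)}(t)\,dt = \int_{a_2}^{b_2} \mathbf{F}(\boldsymbol{\gamma}_2(s)) \cdot \boldsymbol{\gamma}_2^{(1)}(s)\,ds, \]
which is just (5.1.32).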
Of course Theorem 5.1.8 is an extremely important result, for it assures us that we get the same
line integral regardless of the path that we use for a parametric representation of the curve Γ. Were
this not the case then the whole notion of a line integral would not make any sense at all!
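Theorem 5.1.8 can be illustrated numerically with the two parametrizations from Remark 4.2.8. The sketch below assumes the vector field (5.1.19) and evaluates the dt-integral on the right of (5.1.13) by the midpoint rule, approximating the derivative of each path by a central difference; the helper name `dt_integral` and the step sizes are illustrative assumptions rather than anything prescribed by the notes.

```python
import math

def F(x, y):
    """Vector field (5.1.19): F(x, y) = (y, -x)."""
    return (y, -x)

def gamma1(t):
    """Path (4.2.7) on 0 <= t <= 1."""
    return (math.cos(math.pi * t), math.sin(math.pi * t))

def psi(s):
    """Change of parameter (4.2.8) on 1 <= s <= sqrt(2)."""
    return s * s - 1.0

def gamma2(s):
    """Reparametrized path (4.2.10): gamma2(s) = gamma1(psi(s))."""
    return gamma1(psi(s))

def dt_integral(field, path, a, b, n=2000, h=1e-6):
    """Midpoint-rule approximation of the right side of (5.1.13).

    The derivative of the path is approximated by a central difference with
    step h, which is kept smaller than half a subinterval so that the
    evaluation points stay inside [a, b]."""
    step = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * step
        x, y = path(t)
        xp, yp = path(t + h)
        xm, ym = path(t - h)
        dx, dy = (xp - xm) / (2.0 * h), (yp - ym) / (2.0 * h)
        fx, fy = field(x, y)
        total += (fx * dx + fy * dy) * step
    return total

I1 = dt_integral(F, gamma1, 0.0, 1.0)
I2 = dt_integral(F, gamma2, 1.0, math.sqrt(2.0))
print(I1, I2, -math.pi)
```

Both integrals come out close to −π, which is also the value obtained by hand, since for the path (4.2.7) the integrand F(γ1(t)) · γ1(1)(t) is the constant −π.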
5.2 Line Integral of Scalar Field and Arc Length
We have defined the line integral of a vector field along a given curve in R3. We next define the line
integral of a scalar field along a given curve in R3 by a very similar construction. We therefore fix
a given curve in the space R3 which starts at a point A and ends at a point B which we denote by
Γ, and we are also given a scalar field f(x, y, z) (see Figure 5.1). As at Remark 5.1.1 we identify a
point (x, y, z) with a vector rrr (see (5.1.1)) and write f(rrr) instead of f(x, y, z). Exactly as with the
definition of the line integral of a vector field we introduce points r0, r1, . . . , ri, ri+1, . . . , rn along
the curve as shown at Figure 5.1, with r0 and rn corresponding to A and B respectively, and put
\[ \Delta\mathbf{r}_i := \mathbf{r}_{i+1} - \mathbf{r}_i, \quad i = 0, 1, \ldots, n-1. \tag{5.2.45} \]
If
\[ \Delta s_i := \|\Delta\mathbf{r}_i\| \tag{5.2.46} \]
denotes the usual Euclidean length of the vector ∆ri then the product f(ri)∆si = f(ri) ‖∆ri‖ is a
scalar. If the sum
\[ \sum_{i=0}^{n-1} f(\mathbf{r}_i)\,\Delta s_i \tag{5.2.47} \]
converges to a real number as n → ∞ and $\max_{0 \le i \le n-1} \|\Delta\mathbf{r}_i\| \to 0$, then this limit is called the line
integral of the scalar field f along the curve Γ and denoted by
\[ \int_{\Gamma} f(\mathbf{r})\,ds \quad \text{or} \quad \int_{\Gamma} f(x, y, z)\,ds \quad \text{or, more briefly, by} \quad \int_{\Gamma} f\,ds. \tag{5.2.48} \]
Exactly as with the line integral of a vector field, evaluation of the line integral of a scalar field is
facilitated when the curve Γ is the curve of a path (5.1.7). To see this, fix points ti in the interval
a ≤ t ≤ b exactly as at (5.1.8). Then (5.1.9) continues to hold, so that in particular
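\[ \Delta s_i = \|\Delta\mathbf{r}_i\| \approx \left\| \boldsymbol{\gamma}^{(1)}(t_i) \right\| \Delta t_i, \tag{5.2.49} \]
and
\[ \sum_{i=0}^{n-1} f(\mathbf{r}_i)\,\Delta s_i \approx \sum_{i=0}^{n-1} f(\boldsymbol{\gamma}(t_i)) \left\| \boldsymbol{\gamma}^{(1)}(t_i) \right\| \Delta t_i, \tag{5.2.50} \]
as follows from (5.2.49) and (5.1.8). If we define the function
\[ g(t) := f(\boldsymbol{\gamma}(t)) \left\| \boldsymbol{\gamma}^{(1)}(t) \right\|, \quad a \le t \le b, \tag{5.2.51} \]
then the right side of (5.2.50) converges to the ordinary integral
\[ \int_a^b g(t)\,dt \equiv \int_a^b f(\boldsymbol{\gamma}(t)) \left\| \boldsymbol{\gamma}^{(1)}(t) \right\| dt. \tag{5.2.52} \]
It follows that
\[ \int_{\Gamma} f(\mathbf{r})\,ds = \int_a^b f(\boldsymbol{\gamma}(t)) \left\| \boldsymbol{\gamma}^{(1)}(t) \right\| dt. \tag{5.2.53} \]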
That is, evaluation of the line integral of a scalar field f along a curve Γ with a parametric repre-
sentation (5.1.7) reduces to evaluation of the ordinary integral of the function at (5.2.51) over the
interval a ≤ t ≤ b.
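Example 5.2.1. A scalar field in $\mathbb{R}^2$ is defined by
$$f(x, y) := x^2 + y \quad\text{for all } (x, y) \text{ in } \mathbb{R}^2, \tag{5.2.54}$$
and $\Gamma$ is a curve in $\mathbb{R}^2$ with the parametric representation
$$\boldsymbol{\gamma} : [0, 1] \to \mathbb{R}^2, \quad\text{given by}\quad \boldsymbol{\gamma}(t) := t\,\mathbf{i} - t\,\mathbf{j} \equiv (t, -t). \tag{5.2.55}$$
We apply (5.2.53). From (5.2.55) and (5.2.54)
$$f(\boldsymbol{\gamma}(t)) = f(x(t), y(t)) = f(t, -t) = t^2 - t, \quad 0 \le t \le 1, \tag{5.2.56}$$
and
$$\boldsymbol{\gamma}^{(1)}(t) = \frac{d}{dt}(t)\,\mathbf{i} + \frac{d}{dt}(-t)\,\mathbf{j} = \mathbf{i} - \mathbf{j} \equiv (1, -1), \quad 0 \le t \le 1, \tag{5.2.57}$$
so that
$$\|\boldsymbol{\gamma}^{(1)}(t)\| = \sqrt{1^2 + (-1)^2} = \sqrt{2}, \quad 0 \le t \le 1. \tag{5.2.58}$$
From (5.2.58) and (5.2.56)
$$f(\boldsymbol{\gamma}(t))\,\|\boldsymbol{\gamma}^{(1)}(t)\| = \sqrt{2}\,(t^2 - t), \tag{5.2.59}$$
and from (5.2.59) and (5.2.53) we have
$$\int_\Gamma f(\mathbf{r})\,ds = \sqrt{2} \int_0^1 (t^2 - t)\,dt = -\frac{\sqrt{2}}{6}. \tag{5.2.60}$$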
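As a numerical cross-check of Example 5.2.1, the following is a minimal sketch (not part of the original notes) which approximates the right side of (5.2.53) by a midpoint rule. The function name `scalar_line_integral` and the number of subintervals are our own illustrative choices.

```python
import math

def scalar_line_integral(f, gamma, dgamma, a, b, n=2000):
    """Midpoint-rule approximation of the integral in (5.2.53):
    integral over [a, b] of f(gamma(t)) * ||gamma'(t)|| dt."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h                       # midpoint of the i-th subinterval
        speed = math.sqrt(sum(c * c for c in dgamma(t)))
        total += f(*gamma(t)) * speed * h
    return total

# Data of Example 5.2.1: f(x, y) = x^2 + y along gamma(t) = (t, -t), 0 <= t <= 1
f = lambda x, y: x**2 + y
gamma = lambda t: (t, -t)
dgamma = lambda t: (1.0, -1.0)

approx = scalar_line_integral(f, gamma, dgamma, 0.0, 1.0)
exact = -math.sqrt(2) / 6                           # the value found at (5.2.60)
print(approx, exact)
```

The two printed numbers should agree to several decimal places; the small discrepancy is the discretization error of the midpoint rule.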
Remark 5.2.2. Returning to the partition of the curve $\Gamma$ shown in Figure 5.1, it is evident that if the sum
$$\sum_{i=0}^{n-1} \|\Delta\mathbf{r}_i\| \tag{5.2.61}$$
converges to a real number as $n \to \infty$ and $\max_{0 \le i \le n-1} \|\Delta\mathbf{r}_i\| \to 0$, then this limit must be the length of the curve $\Gamma$, so that the length of $\Gamma$ is just the line integral of the constant scalar field $f(x, y, z) \equiv 1$ along the curve $\Gamma$. In particular, the length of $\Gamma$ is given by (5.2.53) when we take $f(x, y, z) \equiv 1$, that is
$$\operatorname{length}(\Gamma) = \int_a^b \|\boldsymbol{\gamma}^{(1)}(t)\|\,dt. \tag{5.2.62}$$
An (undramatic) illustration of the use of (5.2.62) is

Example 5.2.3. Find the length of the circle of radius $\alpha > 0$ in the plane $\mathbb{R}^2$. The circle is the curve $\Gamma$. In order to use (5.2.62) we must parametrically represent $\Gamma$ as the curve of some path $\boldsymbol{\gamma} : [a, b] \to \mathbb{R}^2$. In this case a clear choice of path is
$$\boldsymbol{\gamma}(t) := (\alpha\cos(t), \alpha\sin(t)), \quad 0 \le t \le 2\pi. \tag{5.2.63}$$
Then, from (5.2.63),
$$\boldsymbol{\gamma}^{(1)}(t) = \left(\frac{d}{dt}(\alpha\cos(t)), \frac{d}{dt}(\alpha\sin(t))\right) = (-\alpha\sin(t), \alpha\cos(t)), \quad 0 \le t \le 2\pi, \tag{5.2.64}$$
and from (5.2.64)
$$\|\boldsymbol{\gamma}^{(1)}(t)\| = \sqrt{(-\alpha\sin(t))^2 + (\alpha\cos(t))^2} = \alpha. \tag{5.2.65}$$
From (5.2.65) and (5.2.62)
$$\operatorname{length}(\Gamma) = \int_0^{2\pi} \alpha\,dt = 2\pi\alpha.$$
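The sum (5.2.61) can also be evaluated directly, without differentiating the path. The following is a rough sketch under our own assumptions (the helper name `polygonal_length`, a uniform partition of the parameter interval, and the radius value 2 are all illustrative); it shows the chord sums for the circle of Example 5.2.3 increasing towards $2\pi\alpha$ as the partition is refined.

```python
import math

def polygonal_length(gamma, a, b, n):
    """Sum of the chord lengths ||r_{i+1} - r_i|| of (5.2.61) for the
    points r_i = gamma(t_i) on a uniform partition of [a, b]."""
    pts = [gamma(a + (b - a) * i / n) for i in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

alpha = 2.0                                          # illustrative radius
circle = lambda t: (alpha * math.cos(t), alpha * math.sin(t))

for n in (8, 64, 512):
    print(n, polygonal_length(circle, 0.0, 2 * math.pi, n))
print("2*pi*alpha =", 2 * math.pi * alpha)
```

Each chord sum is the perimeter of a regular polygon inscribed in the circle, which is why the values approach $2\pi\alpha$ from below.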
Chapter 6
Conservative Vector Fields
A conservative vector field is a particularly important type of vector field with the property that
“conservation of energy” always holds for these fields. Conservative vector fields occur everywhere
in physics and engineering including electromagnetism, gravitational physics and hydrodynamics.
Before defining a conservative vector field we must first formulate the idea of the gradient of a scalar
field and dispose of some “calculus” preliminaries.
6.1 Gradient of a Scalar Field
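Definition 6.1.1. Suppose that $f : D \to \mathbb{R}$ is a $C^1$-scalar field in $\mathbb{R}^3$ with domain $D \subset \mathbb{R}^3$ (see Definition 3.2.1). Then the gradient of the scalar field $f$ is the vector field $\operatorname{grad} f$ on the same domain $D$ defined by
$$\begin{aligned}
(\operatorname{grad} f)(x, y, z) &:= \left(\frac{\partial f}{\partial x}(x, y, z), \frac{\partial f}{\partial y}(x, y, z), \frac{\partial f}{\partial z}(x, y, z)\right) \\
&\equiv \frac{\partial f}{\partial x}(x, y, z)\,\mathbf{i} + \frac{\partial f}{\partial y}(x, y, z)\,\mathbf{j} + \frac{\partial f}{\partial z}(x, y, z)\,\mathbf{k}, \quad\text{for all } (x, y, z) \text{ in } D.
\end{aligned} \tag{6.1.1}$$
An alternative notation for $\operatorname{grad} f$ is $\nabla f$ (the symbol $\nabla$ is called "del" or "nabla") so that $(\operatorname{grad} f)(x, y, z)$ and $\nabla f(x, y, z)$ denote the same vector in $\mathbb{R}^3$ at each $(x, y, z)$ in $D$.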
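Remark 6.1.2. Often the defining relation (6.1.1) is written with the generic variable $(x, y, z)$ stripped away, that is
$$\operatorname{grad} f = \nabla f = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j} + \frac{\partial f}{\partial z}\,\mathbf{k}, \tag{6.1.2}$$
in which case it is understood that the domain is still the set $D$.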
Remark 6.1.3. In Definition 6.1.1 we begin with a scalar field $f$ and end up with a "new" field (namely a vector field) denoted by $\operatorname{grad} f$, or alternatively by $\nabla f$. We can imagine this whole process as given by a sort of "black box" in which the "input" is the whole scalar field $f$ and the "output" is the whole vector field $\operatorname{grad} f$ (or $\nabla f$), as shown in Figure 6.1. That is, the black box takes in the scalar field $f$ and by a process of partial differentiation "pummels" it into the vector field $\operatorname{grad} f$ appearing at the "output" of the box. In the circumstances it makes sense to label this box with the alternative symbols "grad" or "$\nabla$".

Figure 6.1: Black box for the gradient vector field

Put another way, we can regard the black box as an "operator" which takes the scalar field $f$ and "operates" on it to produce the vector field $\operatorname{grad} f$. Since this process of operation clearly involves partial differentiation the black box defines a so-called "differential operator". There is another useful way to think about this differential operator. Denote by grad (or $\nabla$) the "symbolic" three dimensional vector
$$\operatorname{grad} \equiv \nabla := \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right) \equiv \frac{\partial}{\partial x}\,\mathbf{i} + \frac{\partial}{\partial y}\,\mathbf{j} + \frac{\partial}{\partial z}\,\mathbf{k}. \tag{6.1.3}$$
Of course this is just a "symbolic" vector, and not a "real" vector, in the sense that its components are partial derivative symbols rather than actual numbers, but it is nevertheless very useful. If we imagine that there are "gaps" behind the symbols grad (or $\nabla$) as well as the partial derivative symbols $\partial/\partial x$, $\partial/\partial y$ and $\partial/\partial z$ appearing in (6.1.3), then by inserting $f(x, y, z)$ into each of these gaps we obtain
$$\operatorname{grad} f(x, y, z) = \nabla f(x, y, z) = \frac{\partial f}{\partial x}(x, y, z)\,\mathbf{i} + \frac{\partial f}{\partial y}(x, y, z)\,\mathbf{j} + \frac{\partial f}{\partial z}(x, y, z)\,\mathbf{k}, \tag{6.1.4}$$
which is exactly the relation (6.1.1). In this sense the symbolic vector at (6.1.3) describes how the black box in Figure 6.1 works.
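Remark 6.1.4. The gradient of a scalar field appears in the following useful calculation that we shall frequently use: Suppose that $f : D \to \mathbb{R}$ is a $C^1$-scalar field with domain $D = \mathbb{R}^3$ and $\boldsymbol{\gamma} : [a, b] \to \mathbb{R}^3$ is a path with the component-wise representation at (4.3.17). Then $f(\boldsymbol{\gamma}(t)) \equiv f(x(t), y(t), z(t))$ is a real valued function defined on $a \le t \le b$. By the chain rule the $t$-derivative of this function is given by
$$\begin{aligned}
\frac{d}{dt} f(\boldsymbol{\gamma}(t)) &= \frac{d}{dt} f(x(t), y(t), z(t)) \\
&= \frac{\partial f}{\partial x}(x(t), y(t), z(t))\,\frac{d}{dt}x(t) + \frac{\partial f}{\partial y}(x(t), y(t), z(t))\,\frac{d}{dt}y(t) + \frac{\partial f}{\partial z}(x(t), y(t), z(t))\,\frac{d}{dt}z(t) \\
&= \left(\frac{\partial f}{\partial x}(x(t), y(t), z(t)), \frac{\partial f}{\partial y}(x(t), y(t), z(t)), \frac{\partial f}{\partial z}(x(t), y(t), z(t))\right) \cdot \left(\frac{d}{dt}x(t), \frac{d}{dt}y(t), \frac{d}{dt}z(t)\right) \\
&= \nabla f(x(t), y(t), z(t)) \cdot \boldsymbol{\gamma}^{(1)}(t) \quad\text{(see (4.3.18) and (6.1.4))} \\
&= \nabla f(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t) \quad\text{(see (4.3.17))},
\end{aligned}$$
that is we have the general relation
$$\frac{d}{dt} f(\boldsymbol{\gamma}(t)) = \nabla f(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t), \quad\text{for all } t \text{ in } a \le t \le b, \tag{6.1.5}$$
which holds for any scalar field $f$ and any path $\boldsymbol{\gamma} : [a, b] \to \mathbb{R}^3$. Furthermore, integrating each side of (6.1.5) over the interval $a \le t \le b$ one finds
$$\begin{aligned}
\int_a^b \nabla f(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t)\,dt &= \int_a^b \frac{d}{dt} f(\boldsymbol{\gamma}(t))\,dt \\
&= f(\boldsymbol{\gamma}(b)) - f(\boldsymbol{\gamma}(a)),
\end{aligned}$$
that is we have the further general relation
$$\int_a^b \nabla f(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t)\,dt = f(\boldsymbol{\gamma}(b)) - f(\boldsymbol{\gamma}(a)), \tag{6.1.6}$$
which again holds for any scalar field $f$ and any path $\boldsymbol{\gamma} : [a, b] \to \mathbb{R}^3$.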
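The relation (6.1.6) is easy to test numerically. The sketch below is an illustration under our own assumptions: the scalar field $f(x, y, z) = xy + \sin(z)$, the path $\boldsymbol{\gamma}(t) = (\cos t, \sin t, t^2)$ and the interval $[0, 1.5]$ are arbitrary choices and do not come from the notes. It compares a midpoint-rule approximation of the left side of (6.1.6) with the right side.

```python
import math

def f(x, y, z):
    return x * y + math.sin(z)            # illustrative scalar field

def grad_f(x, y, z):
    return (y, x, math.cos(z))            # its gradient, computed by hand

def gamma(t):
    return (math.cos(t), math.sin(t), t * t)      # illustrative path

def dgamma(t):
    return (-math.sin(t), math.cos(t), 2 * t)     # gamma^(1)(t)

def left_side(a, b, n=4000):
    """Midpoint-rule approximation of the integral on the left of (6.1.6)."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        g, v = grad_f(*gamma(t)), dgamma(t)
        total += sum(gi * vi for gi, vi in zip(g, v)) * h
    return total

a, b = 0.0, 1.5
left = left_side(a, b)
right = f(*gamma(b)) - f(*gamma(a))       # right side of (6.1.6)
print(left, right)
```

Only the end points of the path enter the right side, which is the observation exploited later for conservative vector fields.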
6.2 Conservative Vector Fields
With the preceding preliminaries out of the way, we can define a conservative vector field:
Definition 6.2.1. A vector field $\mathbf{F} : D \to \mathbb{R}^3$ with domain $D \subset \mathbb{R}^3$ is a conservative vector field when
$$\mathbf{F}(x, y, z) = \nabla\Psi(x, y, z), \quad\text{for all } (x, y, z) \text{ in } D, \tag{6.2.7}$$
for some scalar field $\Psi : D \to \mathbb{R}$ with the same domain $D$. The scalar field $\Psi$ is called a potential function of the vector field. In short, a vector field is conservative if it is the gradient of some scalar field, called a potential function of the vector field.
Remark 6.2.2. Observe that a vector field $\mathbf{F}$ with a potential function $\Psi$ in fact has many potential functions. Indeed, if $c$ is a real constant, and we put
$$\Psi_1(x, y, z) := \Psi(x, y, z) + c, \quad\text{for all } (x, y, z) \text{ in } D, \tag{6.2.8}$$
then of course
$$\mathbf{F}(x, y, z) = \nabla\Psi_1(x, y, z), \quad\text{for all } (x, y, z) \text{ in } D,$$
so that $\Psi_1$ is also a potential function of $\mathbf{F}$.
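Example 6.2.3. We now give a simple but very important example of a conservative vector field, namely the electrostatic field $\mathbf{E}(x, y, z)$ from a single point charge $Q$ of Example 3.1.3. We must prove that this vector field is the gradient of some scalar field. Exactly as at Remark 5.1.1 we put $\mathbf{r}$ for the vector from the origin of $\mathbb{R}^3$ to the point $(x, y, z)$, so that
$$\mathbf{r} = x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}, \quad\text{with length}\quad \|\mathbf{r}\| = \sqrt{x^2 + y^2 + z^2}. \tag{6.2.9}$$
From Example 3.1.3 we know that $\mathbf{E}(x, y, z)$ has magnitude (or length) given by
$$\|\mathbf{E}(x, y, z)\| = \frac{Q}{4\pi\varepsilon_0 [x^2 + y^2 + z^2]}, \tag{6.2.10}$$
(see (3.1.1)). Moreover, the direction of $\mathbf{E}(x, y, z)$ is collinear with $\hat{\mathbf{r}}$, the unit vector in the direction of $\mathbf{r}$, namely
$$\hat{\mathbf{r}} := \frac{\mathbf{r}}{\|\mathbf{r}\|} = \frac{x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}}{\sqrt{x^2 + y^2 + z^2}}. \tag{6.2.11}$$
From (6.2.11) and (6.2.10)
$$\begin{aligned}
\mathbf{E}(x, y, z) &= \frac{Q}{4\pi\varepsilon_0 [x^2 + y^2 + z^2]}\,\frac{x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k}}{\sqrt{x^2 + y^2 + z^2}} \\
&= \frac{Q}{4\pi\varepsilon_0} \left(\frac{x}{[x^2 + y^2 + z^2]^{3/2}}, \frac{y}{[x^2 + y^2 + z^2]^{3/2}}, \frac{z}{[x^2 + y^2 + z^2]^{3/2}}\right).
\end{aligned} \tag{6.2.12}$$
Now define a scalar field
$$\Psi(x, y, z) := -\frac{Q}{4\pi\varepsilon_0}\,\frac{1}{\sqrt{x^2 + y^2 + z^2}}. \tag{6.2.13}$$
But (easy exercise!) we have
$$\begin{aligned}
\frac{\partial}{\partial x}\,\frac{1}{\sqrt{x^2 + y^2 + z^2}} &= \frac{-x}{[x^2 + y^2 + z^2]^{3/2}}, \\
\frac{\partial}{\partial y}\,\frac{1}{\sqrt{x^2 + y^2 + z^2}} &= \frac{-y}{[x^2 + y^2 + z^2]^{3/2}}, \\
\frac{\partial}{\partial z}\,\frac{1}{\sqrt{x^2 + y^2 + z^2}} &= \frac{-z}{[x^2 + y^2 + z^2]^{3/2}}.
\end{aligned} \tag{6.2.14}$$
From (6.2.12), (6.2.13) and (6.2.14) we get
$$\mathbf{E}(x, y, z) = \nabla\Psi(x, y, z), \tag{6.2.15}$$
as required to demonstrate that $\mathbf{E}$ is a conservative vector field with a potential function given by the scalar field $\Psi$ at (6.2.13).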
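For readers who want the "easy exercise" behind (6.2.14) spelled out, here is a short worked step for the $x$-derivative, obtained from the chain rule; the $y$ and $z$ cases follow in the same way by symmetry.

```latex
\frac{\partial}{\partial x}\,\bigl(x^2 + y^2 + z^2\bigr)^{-1/2}
  = -\tfrac{1}{2}\,\bigl(x^2 + y^2 + z^2\bigr)^{-3/2} \cdot 2x
  = \frac{-x}{[x^2 + y^2 + z^2]^{3/2}}
```

Multiplying by the constant $-Q/(4\pi\varepsilon_0)$ of (6.2.13) then reproduces the first component of (6.2.12), with the two minus signs cancelling.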
Remark 6.2.4. It is extremely important to be able to verify when a given vector field is conservative. In the previous example we did this by "guessing" a scalar function $\Psi$ and then checking that this is a potential function of the electric field $\mathbf{E}$. Clearly this "guesswork" is not a very satisfactory way in which to proceed in general. Later, we shall learn a mechanical (and easy!) test for verifying when a vector field is conservative.
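Remark 6.2.5. Here we demonstrate that it is easy to calculate the line integral of a conservative vector field when we know its potential function. Suppose that $\mathbf{F}$ is a conservative vector field in $\mathbb{R}^3$ with corresponding potential function $\Psi$, that is
$$\mathbf{F}(x, y, z) = \nabla\Psi(x, y, z), \quad\text{for all } (x, y, z) \text{ in } \mathbb{R}^3. \tag{6.2.16}$$
If $\Gamma$ is a curve from a point $A$ with coordinates $(x_0, y_0, z_0)$ to a point $B$ with coordinates $(x_1, y_1, z_1)$ then
$$\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = \Psi(x_1, y_1, z_1) - \Psi(x_0, y_0, z_0). \tag{6.2.17}$$
To verify this suppose that the path
$$\boldsymbol{\gamma} : [a, b] \to \mathbb{R}^3 \tag{6.2.18}$$
is some parametric representation of the curve $\Gamma$. Then
$$\begin{aligned}
\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} &= \int_a^b \mathbf{F}(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t)\,dt \quad\text{(see (5.1.13))} \\
&= \int_a^b \nabla\Psi(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t)\,dt \quad\text{(see (6.2.16))} \\
&= \Psi(\boldsymbol{\gamma}(b)) - \Psi(\boldsymbol{\gamma}(a)) \quad\text{(from (6.1.6) with } \Psi \text{ in place of } f\text{)}.
\end{aligned} \tag{6.2.19}$$
Now (6.2.17) follows from (6.2.19) since $\boldsymbol{\gamma}(a) = (x_0, y_0, z_0)$ and $\boldsymbol{\gamma}(b) = (x_1, y_1, z_1)$.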
Remark 6.2.6. The relation (6.2.17) shows that the line integral of a conservative vector field is easy to evaluate provided that we know a potential function of the conservative field. Much more important, however, is the following consequence of (6.2.17): Suppose that $\Gamma_1$ and $\Gamma_2$ are two curves in $\mathbb{R}^3$ starting at a common point $(x_0, y_0, z_0)$ and ending at the common point $(x_1, y_1, z_1)$ (see Figure 6.2). Then, from (6.2.17) we have
$$\int_{\Gamma_1} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = \int_{\Gamma_2} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r}, \tag{6.2.20}$$
that is, the line integral of a conservative vector field along a curve depends only on the end points of the curve and not on the form of the curve between these end points, i.e. the line integral of a conservative vector field is path independent.

Figure 6.2: Curves $\Gamma_1$ and $\Gamma_2$ in $\mathbb{R}^3$ with common starting and ending points

Another related, and very important, property of conservative vector fields also follows at once from (6.2.17). Suppose that $\Gamma$ is a closed curve in $\mathbb{R}^3$, that is a curve which begins and ends at the same point. Fix some point $(x_0, y_0, z_0)$ on $\Gamma$; then the curve starts at $(x_0, y_0, z_0)$ and ends at $(x_1, y_1, z_1) = (x_0, y_0, z_0)$. Upon putting $(x_1, y_1, z_1) = (x_0, y_0, z_0)$ in (6.2.17) we obtain the following (recall the notation (5.1.5)): For a conservative vector field we have
$$\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = 0 \quad\text{for every closed curve } \Gamma \text{ in } \mathbb{R}^3, \tag{6.2.21}$$
that is, the circulation of a conservative vector field around every closed curve in $\mathbb{R}^3$ is always zero.

How about a converse to the above statement? That is, if we know that (6.2.21) holds for some vector field $\mathbf{F}$, then is it necessarily the case that $\mathbf{F}$ is conservative? Actually, this converse is true but its proof relies on a result that we will only learn about much later, called Stokes' theorem. We therefore just state the following result, which not only gives the preceding converse, but, much more importantly, also provides a genuinely practical test for determining when a given vector field is conservative:
Theorem 6.2.7. Suppose that $\mathbf{F} : D \to \mathbb{R}^3$ is a $C^1$-vector field with domain $D = \mathbb{R}^3$, and we put $\mathbf{F}(x, y, z) = (F_1(x, y, z), F_2(x, y, z), F_3(x, y, z))$ for all $(x, y, z)$ in $\mathbb{R}^3$, i.e. $F_1(x, y, z)$, $F_2(x, y, z)$ and $F_3(x, y, z)$ are the real scalar components of the vector $\mathbf{F}(x, y, z)$ in $\mathbb{R}^3$. Then the following are equivalent:
(a) $\mathbf{F}$ is a conservative vector field;
(b) $\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = 0$ for every closed curve $\Gamma$ in $\mathbb{R}^3$;
(c) for all $(x, y, z)$ in $\mathbb{R}^3$ we have
$$\frac{\partial F_3}{\partial y}(x, y, z) = \frac{\partial F_2}{\partial z}(x, y, z), \qquad \frac{\partial F_1}{\partial z}(x, y, z) = \frac{\partial F_3}{\partial x}(x, y, z), \qquad \frac{\partial F_2}{\partial x}(x, y, z) = \frac{\partial F_1}{\partial y}(x, y, z).$$
Of course it is the equivalence of (a) and (c) in Theorem 6.2.7 which is of particular interest: if we can check the three conditions of (c) then the vector field is indeed conservative.
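The three conditions of Theorem 6.2.7(c) can be spot-checked on a computer before attempting them by hand. The sketch below is only a sanity check under stated assumptions: it estimates the partial derivatives by central differences at a few sample points (so it can expose a failure of (c) but cannot prove that (c) holds everywhere), and the helper names, step size, tolerance and the two test fields are our own illustrative choices.

```python
def partial(F, comp, var, p, h=1e-5):
    """Central-difference estimate of (d F_comp / d x_var) at the point p.
    comp and var are 0-based indices: 0 -> x, 1 -> y, 2 -> z."""
    hi, lo = list(p), list(p)
    hi[var] += h
    lo[var] -= h
    return (F(*hi)[comp] - F(*lo)[comp]) / (2 * h)

def condition_c_residuals(F, p):
    """The three differences which Theorem 6.2.7(c) requires to vanish at p."""
    return (
        partial(F, 2, 1, p) - partial(F, 1, 2, p),   # dF3/dy - dF2/dz
        partial(F, 0, 2, p) - partial(F, 2, 0, p),   # dF1/dz - dF3/dx
        partial(F, 1, 0, p) - partial(F, 0, 1, p),   # dF2/dx - dF1/dy
    )

def passes_condition_c(F, points, tol=1e-6):
    return all(abs(r) < tol for p in points for r in condition_c_residuals(F, p))

# Illustrative fields: the gradient of x^2 y + y z, and a rotation field
grad_field = lambda x, y, z: (2 * x * y, x * x + z, y)
rotation = lambda x, y, z: (-y, x, 0.0)

sample = [(0.3, -1.2, 0.7), (1.0, 2.0, -0.5), (-0.8, 0.4, 1.5)]
print(passes_condition_c(grad_field, sample))   # gradient field: passes
print(passes_condition_c(rotation, sample))     # dF2/dx - dF1/dy = 2: fails
```

The rotation field fails only the third condition, which is the one that survives in the two dimensional setting.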
Remark 6.2.8. Everything we have said above of course specializes trivially to the case of vector fields in $\mathbb{R}^2$, as we briefly indicate next: Suppose that $f : D \to \mathbb{R}$ is a scalar field in $\mathbb{R}^2$ with domain $D \subset \mathbb{R}^2$. Then the gradient of this scalar field is the vector field in $\mathbb{R}^2$ with the same domain $D$ defined by
$$\nabla f(x, y) = (\operatorname{grad} f)(x, y) := \left(\frac{\partial f}{\partial x}(x, y), \frac{\partial f}{\partial y}(x, y)\right), \quad\text{for all } (x, y) \text{ in } D, \tag{6.2.22}$$
in which $\nabla$ is now the two-dimensional operator defined by
$$\nabla := \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}\right). \tag{6.2.23}$$
Exactly as at (6.1.5) and (6.1.6), in the two-dimensional case we have
$$\begin{aligned}
\frac{d}{dt} f(\boldsymbol{\gamma}(t)) &= \nabla f(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t), \quad\text{for all } t \text{ in } a \le t \le b, \\
\int_a^b \nabla f(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t)\,dt &= f(\boldsymbol{\gamma}(b)) - f(\boldsymbol{\gamma}(a)),
\end{aligned} \tag{6.2.24}$$
which hold for any two-dimensional scalar field $f$ and any path $\boldsymbol{\gamma} : [a, b] \to \mathbb{R}^2$. Exactly as at Definition 6.2.1, a vector field $\mathbf{F} : D \to \mathbb{R}^2$ with domain $D \subset \mathbb{R}^2$ is a conservative vector field when
$$\mathbf{F}(x, y) = \nabla\Psi(x, y), \quad\text{for all } (x, y) \text{ in } D, \tag{6.2.25}$$
for some scalar field $\Psi : D \to \mathbb{R}$ with the same domain $D$, called a potential function of the vector field. Exactly as at (6.2.17), if $\mathbf{F}$ is a conservative vector field with a potential function $\Psi$ (that is, (6.2.25) holds) and $\Gamma$ is a curve in $\mathbb{R}^2$ from a point $A$ with coordinates $(x_0, y_0)$ to a point $B$ with coordinates $(x_1, y_1)$ then
$$\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = \Psi(x_1, y_1) - \Psi(x_0, y_0). \tag{6.2.26}$$
As a consequence of (6.2.26) we see the following: if $\Gamma_1$ and $\Gamma_2$ are two curves in $\mathbb{R}^2$ starting at a common point $(x_0, y_0)$ and ending at the common point $(x_1, y_1)$ then
$$\int_{\Gamma_1} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = \int_{\Gamma_2} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r}, \tag{6.2.27}$$
(c.f. (6.2.20)) that is, the line integral of a two-dimensional conservative vector field along a curve in $\mathbb{R}^2$ depends only on the end points of the curve and not on the form of the curve between these end points. In particular, for a conservative vector field in $\mathbb{R}^2$ we have
$$\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = 0 \quad\text{for every closed curve } \Gamma \text{ in } \mathbb{R}^2. \tag{6.2.28}$$
Finally, in the two dimensional case we note that the first two conditions in Theorem 6.2.7(c) fall away, since there is no dependence on $z$ and $F_3$ is identically zero in the case of vector fields in $\mathbb{R}^2$, so that we are left only with the third condition. That is, in place of Theorem 6.2.7, for two dimensional fields we have

Theorem 6.2.9. Suppose that $\mathbf{F} : D \to \mathbb{R}^2$ is a $C^1$-vector field with domain $D = \mathbb{R}^2$, and we put $\mathbf{F}(x, y) = (F_1(x, y), F_2(x, y))$ for all $(x, y)$ in $\mathbb{R}^2$, i.e. $F_1(x, y)$ and $F_2(x, y)$ are the real scalar components of the vector $\mathbf{F}(x, y)$ in $\mathbb{R}^2$. Then the following are equivalent:
(a) $\mathbf{F}$ is a conservative vector field;
(b) $\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = 0$ for every closed curve $\Gamma$ in $\mathbb{R}^2$;
(c) for all $(x, y)$ in $\mathbb{R}^2$ we have $\dfrac{\partial F_2}{\partial x}(x, y) = \dfrac{\partial F_1}{\partial y}(x, y)$.
Example 6.2.10. $\mathbf{F} : D \to \mathbb{R}^2$ is a two dimensional vector field with domain $D := \mathbb{R}^2$, defined by
$$\mathbf{F}(x, y) = (F_1(x, y), F_2(x, y)) \quad\text{for}\quad F_1(x, y) := 2xy, \quad F_2(x, y) := 1 + x^2. \tag{6.2.29}$$
Establish that $\mathbf{F}$ is a conservative vector field and determine a potential function.

We use the test of Theorem 6.2.9(c): From (6.2.29)
$$\frac{\partial F_1}{\partial y}(x, y) = 2x, \qquad \frac{\partial F_2}{\partial x}(x, y) = 2x, \tag{6.2.30}$$
so that the test of Theorem 6.2.9(c) is verified. From the equivalence of (a) and (c) of Theorem 6.2.9 we conclude that $\mathbf{F}$ is conservative. There is therefore a function $\Psi : \mathbb{R}^2 \to \mathbb{R}$ such that
$$\frac{\partial\Psi}{\partial x}(x, y) = F_1(x, y), \qquad \frac{\partial\Psi}{\partial y}(x, y) = F_2(x, y), \quad\text{for all } (x, y) \text{ in } \mathbb{R}^2, \tag{6.2.31}$$
which we must determine. From (6.2.31) and (6.2.29)
$$\frac{\partial\Psi}{\partial x}(x, y) = 2xy. \tag{6.2.32}$$
Integrate each side of (6.2.32) with respect to $x$:
$$\Psi(x, y) = x^2 y + h(y). \tag{6.2.33}$$
Notice that $h(y)$ is the constant of integration; since the integration is with respect to $x$ this constant may depend on $y$, as we have indicated. Now take $y$-derivatives of each side of (6.2.33) to get
$$\frac{\partial\Psi}{\partial y}(x, y) = x^2 + \frac{dh}{dy}(y). \tag{6.2.34}$$
From (6.2.31) and (6.2.29) we have
$$\frac{\partial\Psi}{\partial y}(x, y) = 1 + x^2, \tag{6.2.35}$$
so that (6.2.35) and (6.2.34) give
$$1 + x^2 = x^2 + \frac{dh}{dy}(y),$$
so that
$$\frac{dh}{dy}(y) = 1,$$
and therefore
$$h(y) = y + c, \tag{6.2.36}$$
for some constant $c$. Combining (6.2.36) and (6.2.33) gives the potential function
$$\Psi(x, y) = x^2 y + y + c, \quad\text{for all } (x, y) \text{ in } \mathbb{R}^2.$$
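A quick way to gain confidence in a potential function found by this method is to test (6.2.26) along some path. The sketch below does this for Example 6.2.10, under our own illustrative assumptions: the path is the parabola $\boldsymbol{\gamma}(t) = (t, 2t^2)$ from $(0, 0)$ to $(1, 2)$, the constant $c$ is taken to be zero, and the line integral is approximated by a midpoint rule.

```python
def F(x, y):
    return (2 * x * y, 1 + x * x)           # the field (6.2.29)

def Psi(x, y):
    return x * x * y + y                    # the potential found above, with c = 0

def line_integral(F, gamma, dgamma, a, b, n=4000):
    """Midpoint-rule approximation of the integral of F(gamma(t)) . gamma'(t) over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        F1, F2 = F(*gamma(t))
        dx, dy = dgamma(t)
        total += (F1 * dx + F2 * dy) * h
    return total

# Illustrative path from A = (0, 0) to B = (1, 2)
gamma = lambda t: (t, 2 * t * t)
dgamma = lambda t: (1.0, 4 * t)

work = line_integral(F, gamma, dgamma, 0.0, 1.0)
print(work, Psi(1, 2) - Psi(0, 0))
```

Here the integrand reduces to $8t^3 + 4t$, so both printed values should be close to 4; the constant $c$ plays no role because it cancels in the difference $\Psi(B) - \Psi(A)$.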
Example 6.2.11. $\mathbf{F} : D \to \mathbb{R}^3$ is a vector field with domain $D = \mathbb{R}^3$ defined by
$$\mathbf{F}(x, y, z) = F_1(x, y, z)\,\mathbf{i} + F_2(x, y, z)\,\mathbf{j} + F_3(x, y, z)\,\mathbf{k}, \tag{6.2.37}$$
with the scalar components
$$F_1(x, y, z) = y, \quad F_2(x, y, z) = z\cos(yz) + x, \quad F_3(x, y, z) = y\cos(yz). \tag{6.2.38}$$
Establish that $\mathbf{F}$ is a conservative vector field and determine a potential function.

We use the test of Theorem 6.2.7. From (6.2.38) we have
$$\frac{\partial F_3(x, y, z)}{\partial y} = \cos(yz) - yz\sin(yz), \qquad \frac{\partial F_2(x, y, z)}{\partial z} = \cos(yz) - yz\sin(yz), \tag{6.2.39}$$
which checks the first condition of Theorem 6.2.7(c); the remaining two conditions are similarly checked (indeed $\partial F_1/\partial z = 0 = \partial F_3/\partial x$ and $\partial F_2/\partial x = 1 = \partial F_1/\partial y$), showing that $\mathbf{F}$ is conservative. We then have a function $\Psi : \mathbb{R}^3 \to \mathbb{R}$ such that
$$\frac{\partial\Psi(x, y, z)}{\partial x} = F_1(x, y, z), \qquad \frac{\partial\Psi(x, y, z)}{\partial y} = F_2(x, y, z), \qquad \frac{\partial\Psi(x, y, z)}{\partial z} = F_3(x, y, z), \tag{6.2.40}$$
for all $(x, y, z)$ in $\mathbb{R}^3$. From (6.2.40) and (6.2.38)
$$\frac{\partial\Psi(x, y, z)}{\partial x} = y, \qquad \frac{\partial\Psi(x, y, z)}{\partial y} = z\cos(yz) + x, \qquad \frac{\partial\Psi(x, y, z)}{\partial z} = y\cos(yz), \tag{6.2.41}$$
for all $(x, y, z)$ in $\mathbb{R}^3$. Integrating the first relation of (6.2.41) with respect to $x$ gives
$$\Psi(x, y, z) = xy + h_1(y, z). \tag{6.2.42}$$
Notice that $h_1(y, z)$ is the constant of integration; since the integration is with respect to $x$ this constant may depend on $(y, z)$, as we have indicated. Now take derivatives of each side of (6.2.42) in $y$ to get
$$\frac{\partial\Psi(x, y, z)}{\partial y} = x + \frac{\partial h_1(y, z)}{\partial y}. \tag{6.2.43}$$
Upon combining (6.2.43) and the second relation of (6.2.41) we find
$$\frac{\partial h_1(y, z)}{\partial y} = z\cos(yz). \tag{6.2.44}$$
Integrating each side of (6.2.44) with respect to $y$ then gives
$$h_1(y, z) = \sin(yz) + h_2(z). \tag{6.2.45}$$
Again, notice that $h_2(z)$ is the constant of integration; since the integration is with respect to $y$, and the only variables appearing in (6.2.45) are $y$ and $z$, this constant generally depends on $z$ (but not on $x$), as we have indicated. Now put (6.2.45) into (6.2.42) to get
$$\Psi(x, y, z) = xy + \sin(yz) + h_2(z). \tag{6.2.46}$$
It remains to use the third relation of (6.2.41) (the only one so far not used). To this end take $z$-derivatives of each side of (6.2.46) to get
$$\frac{\partial\Psi(x, y, z)}{\partial z} = y\cos(yz) + \frac{dh_2(z)}{dz}, \tag{6.2.47}$$
and then, from (6.2.47) and the third relation of (6.2.41) we obtain
$$\frac{dh_2(z)}{dz} = 0,$$
so that
$$h_2(z) = c, \tag{6.2.48}$$
for a constant $c$. Putting (6.2.48) in (6.2.46) gives the potential function
$$\Psi(x, y, z) = xy + \sin(yz) + c. \tag{6.2.49}$$
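Path independence (6.2.20) can be observed numerically for the field of Example 6.2.11. In the sketch below the two paths from $(0, 0, 0)$ to $(1, 1, 1)$ are our own illustrative choices (a straight segment and a deliberately "twisted" path), and the line integrals are approximated by a midpoint rule; both should agree with $\Psi(1, 1, 1) - \Psi(0, 0, 0) = 1 + \sin(1)$.

```python
import math

def F(x, y, z):
    return (y, z * math.cos(y * z) + x, y * math.cos(y * z))   # the field (6.2.38)

def Psi(x, y, z):
    return x * y + math.sin(y * z)                             # (6.2.49) with c = 0

def line_integral(F, gamma, dgamma, a, b, n=20000):
    """Midpoint-rule approximation of the integral of F(gamma(t)) . gamma'(t) over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        total += sum(Fi * vi for Fi, vi in zip(F(*gamma(t)), dgamma(t))) * h
    return total

# Two illustrative paths from (0, 0, 0) to (1, 1, 1)
straight = lambda t: (t, t, t)
d_straight = lambda t: (1.0, 1.0, 1.0)
twisted = lambda t: (t**3, math.sin(math.pi * t / 2), t * t)
d_twisted = lambda t: (3 * t * t, (math.pi / 2) * math.cos(math.pi * t / 2), 2 * t)

I1 = line_integral(F, straight, d_straight, 0.0, 1.0)
I2 = line_integral(F, twisted, d_twisted, 0.0, 1.0)
target = Psi(1, 1, 1) - Psi(0, 0, 0)
print(I1, I2, target)
```

Agreement of the three printed values (up to discretization error) is consistent with (6.2.17), although of course a numerical experiment with two paths is not a proof.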
6.3 Conservation of Energy
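We end this chapter by demonstrating a very important physical property of conservative vector fields. Suppose, as in Remark 4.3.2, that $\mathbf{F} : D \to \mathbb{R}^3$ is a vector field with domain $D := \mathbb{R}^3$, and that, at each point $(x, y, z)$ in $\mathbb{R}^3$, the vector $\mathbf{F}(x, y, z)$ is the force acting on a particle of mass $m$ located at $(x, y, z)$. Now suppose that the force vector field $\mathbf{F}$ is conservative, with some potential function $\Psi$, so that
$$\mathbf{F}(x, y, z) = \nabla\Psi(x, y, z), \quad\text{for all } (x, y, z) \text{ in } \mathbb{R}^3. \tag{6.3.50}$$
If the particle moves through a point $\mathbf{r} = (x, y, z)$ with some velocity $\mathbf{v}$ then we define the total mechanical energy $E$ as
$$E = \frac{1}{2} m \|\mathbf{v}\|^2 - \Psi(\mathbf{r}). \tag{6.3.51}$$
The first term on the right side of (6.3.51) is the kinetic energy and the second term, $-\Psi(\mathbf{r})$, is the potential energy of the particle. If the particle follows a path
$$\boldsymbol{\gamma} : [a, b] \to \mathbb{R}^3$$
in response to this force, then, at instant $t$, the particle is at the point $\boldsymbol{\gamma}(t)$ and moving with velocity $\mathbf{v}(t) = \boldsymbol{\gamma}^{(1)}(t)$ (see (4.3.29)), so that the total mechanical energy at instant $t$ must be given by
$$E(t) = \frac{1}{2} m \|\boldsymbol{\gamma}^{(1)}(t)\|^2 - \Psi(\boldsymbol{\gamma}(t)), \quad\text{for all } t \text{ in } a \le t \le b, \tag{6.3.52}$$
(obtained by putting $\mathbf{r} = \boldsymbol{\gamma}(t)$ and $\mathbf{v} = \boldsymbol{\gamma}^{(1)}(t)$ in (6.3.51)). We will show that $E(t)$ is necessarily constant over $a \le t \le b$, so that mechanical energy is conserved by motion in a conservative force field. To this end observe that (6.3.52) can be written as
$$E(t) = \frac{1}{2} m \left(\boldsymbol{\gamma}^{(1)}(t) \cdot \boldsymbol{\gamma}^{(1)}(t)\right) - \Psi(\boldsymbol{\gamma}(t)), \quad\text{for all } t \text{ in } a \le t \le b. \tag{6.3.53}$$
But (from (4.3.26) - (4.3.27)) we have
$$\frac{d}{dt}\left(\boldsymbol{\gamma}^{(1)}(t) \cdot \boldsymbol{\gamma}^{(1)}(t)\right) = 2\,\boldsymbol{\gamma}^{(1)}(t) \cdot \boldsymbol{\gamma}^{(2)}(t), \quad\text{for all } t \text{ in } a \le t \le b, \tag{6.3.54}$$
and
$$\begin{aligned}
\frac{d}{dt}\Psi(\boldsymbol{\gamma}(t)) &= \nabla\Psi(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t) \quad\text{(from (6.1.5) with } \Psi \text{ in place of } f\text{)} \\
&= \mathbf{F}(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t) \quad\text{(see (6.3.50))}.
\end{aligned} \tag{6.3.55}$$
Now combine (6.3.53), (6.3.54) and (6.3.55) to see that
$$\frac{d}{dt}E(t) = \left[m\,\boldsymbol{\gamma}^{(2)}(t) - \mathbf{F}(\boldsymbol{\gamma}(t))\right] \cdot \boldsymbol{\gamma}^{(1)}(t), \quad\text{for all } t \text{ in } a \le t \le b. \tag{6.3.56}$$
But, at Remark 4.3.2, we have seen from Newton's second law of motion that the path satisfies the following differential equation (see (4.3.32)):
$$m\,\boldsymbol{\gamma}^{(2)}(t) = \mathbf{F}(\boldsymbol{\gamma}(t)), \quad\text{for all times } t \text{ in } a \le t \le b. \tag{6.3.57}$$
From (6.3.57) and (6.3.56) we get
$$\frac{d}{dt}E(t) = 0, \quad\text{for all } t \text{ in } a \le t \le b,$$
as required to establish conservation of mechanical energy.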
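Conservation of energy can be watched in a simple simulation. The sketch below is an illustration under our own assumptions and is not taken from the notes: the force is the linear restoring force $\mathbf{F}(\mathbf{r}) = -k\,\mathbf{r}$, with potential function $\Psi(\mathbf{r}) = -\tfrac{1}{2}k\|\mathbf{r}\|^2$ in the sense of (6.3.50); the values of $m$ and $k$, the initial data and the step size are arbitrary; and Newton's equation (6.3.57) is integrated with the classical fourth-order Runge-Kutta method, so the energy (6.3.51) is constant only up to a small numerical drift.

```python
m, k = 2.0, 3.0        # illustrative mass and force constant (assumed values)

def Psi(r):
    # Potential function in the sense of (6.3.50): grad Psi = F
    return -0.5 * k * sum(c * c for c in r)

def force(r):
    # F = grad Psi = -k r
    return tuple(-k * c for c in r)

def energy(r, v):
    # Total mechanical energy (6.3.51)
    return 0.5 * m * sum(c * c for c in v) - Psi(r)

def axpy(u, w, s):
    # Componentwise u + s * w
    return tuple(ui + s * wi for ui, wi in zip(u, w))

def rk4_step(r, v, dt):
    # One classical Runge-Kutta step for the system r' = v, v' = F(r) / m
    acc = lambda p: tuple(c / m for c in force(p))
    k1r, k1v = v, acc(r)
    k2r, k2v = axpy(v, k1v, dt / 2), acc(axpy(r, k1r, dt / 2))
    k3r, k3v = axpy(v, k2v, dt / 2), acc(axpy(r, k2r, dt / 2))
    k4r, k4v = axpy(v, k3v, dt), acc(axpy(r, k3r, dt))
    r_new = tuple(ri + dt / 6 * (p + 2 * q + 2 * s + w)
                  for ri, p, q, s, w in zip(r, k1r, k2r, k3r, k4r))
    v_new = tuple(vi + dt / 6 * (p + 2 * q + 2 * s + w)
                  for vi, p, q, s, w in zip(v, k1v, k2v, k3v, k4v))
    return r_new, v_new

r, v = (1.0, 0.0, 0.5), (0.0, 1.0, -0.2)    # illustrative initial position and velocity
E0 = energy(r, v)
max_drift = 0.0
for _ in range(5000):
    r, v = rk4_step(r, v, 1e-3)
    max_drift = max(max_drift, abs(energy(r, v) - E0))
print(E0, max_drift)
```

The printed drift should be tiny compared with the energy itself; any remaining drift is an artifact of the time discretization, not a failure of the argument above.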
Chapter 7
Green’s Theorem in the Plane
We are going to learn an extremely powerful and useful theorem of multivariable calculus, called
Green’s theorem in the plane. This is essentially a two dimensional result so our focus will be
exclusively on fields in $\mathbb{R}^2$ rather than $\mathbb{R}^3$. As we shall see later, this two dimensional result is
nevertheless the essential tool for establishing the main results on three dimensional vector calculus
(such as Stokes’ theorem and Gauss’ theorem) which are indispensable for physics and engineering.
7.1 Green’s Theorem for Rectangles
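As preparation for Green’s theorem we first look at line integrals of a vector field $\mathbf{F} : \mathbb{R}^2 \to \mathbb{R}^2$ (i.e. for simplicity the domain of $\mathbf{F}$ is all of $\mathbb{R}^2$) over the very simple curves $\Gamma_1$, $\Gamma_2$, $\Gamma_3$ and $\Gamma_4$ around the perimeter of a rectangle, as shown in Figure 7.1.

Figure 7.1: Curves $\Gamma_1$, $\Gamma_2$, $\Gamma_3$ and $\Gamma_4$ in the plane $\mathbb{R}^2$

The curve $\Gamma_1$ is from $(a, c)$ to $(b, c)$, and an obvious parametric representation of $\Gamma_1$ is
$$\boldsymbol{\gamma} : [a, b] \to \mathbb{R}^2 \tag{7.1.1}$$
defined by
$$\boldsymbol{\gamma}(t) := (t, c) \quad\text{for all } a \le t \le b. \tag{7.1.2}$$
Taking the $t$-derivative of (7.1.2) then gives
$$\boldsymbol{\gamma}^{(1)}(t) = (1, 0), \quad\text{for all } a \le t \le b. \tag{7.1.3}$$
We now have the line integral along the curve $\Gamma_1$:
$$\begin{aligned}
\int_{\Gamma_1} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} &= \int_a^b \mathbf{F}(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t)\,dt \quad\text{(from (5.1.13))} \\
&= \int_a^b (F_1(t, c), F_2(t, c)) \cdot (1, 0)\,dt \quad\text{(from (7.1.2) and (7.1.3))} \\
&= \int_a^b F_1(t, c)\,dt \\
&= \int_a^b F_1(x, c)\,dx.
\end{aligned} \tag{7.1.4}$$
At the last equality of (7.1.4) we have just re-named the variable of integration $x$ instead of $t$. This trivial change will actually be useful later on! In much the same way as for (7.1.4) we have the line integrals over $\Gamma_2$, $\Gamma_3$ and $\Gamma_4$ (simple exercise!):
$$\int_{\Gamma_2} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = \int_c^d F_2(b, y)\,dy, \tag{7.1.5}$$
$$\int_{\Gamma_3} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = -\int_a^b F_1(x, d)\,dx, \tag{7.1.6}$$
$$\int_{\Gamma_4} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = -\int_c^d F_2(a, y)\,dy. \tag{7.1.7}$$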
Now suppose that $\Gamma$ is the closed curve in $\mathbb{R}^2$ around the perimeter of the rectangle $[a, b] \times [c, d]$ shown in Figure 7.2.

Figure 7.2: Curve $\Gamma$ counter-clockwise around the perimeter of a rectangle in $\mathbb{R}^2$

Upon comparing Figure 7.2 and Figure 7.1 we see that the line integral of $\mathbf{F}$ along the closed curve $\Gamma$ is just the sum of the line integrals of $\mathbf{F}$ along $\Gamma_1$, $\Gamma_2$, $\Gamma_3$ and $\Gamma_4$ in succession, that is
$$\begin{aligned}
\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} &= \int_{\Gamma_1} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} + \int_{\Gamma_2} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} + \int_{\Gamma_3} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} + \int_{\Gamma_4} \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} \\
&= \int_a^b [F_1(x, c) - F_1(x, d)]\,dx + \int_c^d [F_2(b, y) - F_2(a, y)]\,dy,
\end{aligned} \tag{7.1.8}$$
(we have used (7.1.4) - (7.1.7) at the second equality of (7.1.8)). The formula (7.1.8) gives the line integral of a vector field in $\mathbb{R}^2$ around the perimeter of a rectangle. This is a very useful result. In fact, we are now going to use (7.1.8) to establish the following preliminary version of Green’s theorem in the plane:

Theorem 7.1.1 (Preliminary Green’s theorem). Suppose that $\Gamma$ is the closed path around the perimeter of the rectangle
$$D := [a, b] \times [c, d]$$
in the counter-clockwise direction, as shown in Figure 7.2, and $\mathbf{F} : \mathbb{R}^2 \to \mathbb{R}^2$ is a $C^1$-vector field (see Remark 3.2.2 in which we just formally put $F_3 = 0$). Then
$$\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = \int_D \left[\frac{\partial F_2}{\partial x}(x, y) - \frac{\partial F_1}{\partial y}(x, y)\right] dx\,dy. \tag{7.1.9}$$
Remark 7.1.2. Notice that (7.1.9) relates a line integral around the perimeter of a rectangle to a $dx\,dy$-integral over the rectangle. We shall see later that this relation of line integrals (circulations actually) to $dx\,dy$-integrals in the plane is very useful.

It remains to verify (7.1.9). Using (7.1.8) this is easy. First observe that
$$\int_c^d \frac{\partial F_1}{\partial y}(x, y)\,dy = F_1(x, d) - F_1(x, c). \tag{7.1.10}$$
Notice that (7.1.10) just follows from elementary calculus: the integral of the derivative of a function over an interval is the difference of the values of the function at the end points of the interval. In this case the derivative is just with respect to $y$ (the variable $x$ is held fixed and plays no role in any of this). In exactly the same way
$$\int_a^b \frac{\partial F_2}{\partial x}(x, y)\,dx = F_2(b, y) - F_2(a, y). \tag{7.1.11}$$
Now from (7.1.8), (7.1.10) and (7.1.11) we obtain
$$\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = \int_c^d \left[\int_a^b \frac{\partial F_2}{\partial x}(x, y)\,dx\right] dy - \int_a^b \left[\int_c^d \frac{\partial F_1}{\partial y}(x, y)\,dy\right] dx. \tag{7.1.12}$$
But, from Fubini’s theorem (see especially (2.1.12)) we have
$$\int_D \frac{\partial F_2}{\partial x}(x, y)\,dx\,dy = \int_c^d \left[\int_a^b \frac{\partial F_2}{\partial x}(x, y)\,dx\right] dy, \tag{7.1.13}$$
and
$$\int_D \frac{\partial F_1}{\partial y}(x, y)\,dx\,dy = \int_a^b \left[\int_c^d \frac{\partial F_1}{\partial y}(x, y)\,dy\right] dx. \tag{7.1.14}$$
Now (7.1.9) follows at once from (7.1.12), (7.1.13) and (7.1.14).
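The identity (7.1.9) can be checked numerically on a concrete rectangle. The sketch below rests on our own illustrative assumptions: the field $F_1(x, y) = xy^3$, $F_2(x, y) = \sin(x) + x^2 y$ and the rectangle $[0, 2] \times [0, 1]$ are arbitrary choices, and all integrals are approximated by a midpoint rule. The left side is computed from (7.1.8) and the right side as an iterated integral.

```python
import math

# Illustrative C^1 field and its relevant partial derivatives (computed by hand)
def F1(x, y): return x * y**3
def F2(x, y): return math.sin(x) + x * x * y
def dF2_dx(x, y): return math.cos(x) + 2 * x * y
def dF1_dy(x, y): return 3 * x * y * y

def midpoint(g, lo, hi, n=400):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * h) for i in range(n)) * h

a, b, c, d = 0.0, 2.0, 0.0, 1.0      # the rectangle D = [a, b] x [c, d]

# Left side of (7.1.9), using the boundary formula (7.1.8)
circulation = (midpoint(lambda x: F1(x, c) - F1(x, d), a, b)
               + midpoint(lambda y: F2(b, y) - F2(a, y), c, d))

# Right side of (7.1.9), as an iterated integral over D
double_integral = midpoint(
    lambda y: midpoint(lambda x: dF2_dx(x, y) - dF1_dy(x, y), a, b), c, d)

print(circulation, double_integral)
```

For this particular field both sides work out by hand to $\sin(2)$, so the two printed values should be close to each other and to that number.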
7.2 Green’s Theorem: General Case
The version of Green’s theorem given by Theorem 7.1.1 has one major limitation, namely it is restricted to integration over rectangles in the plane. Green’s theorem acquires real power when this restriction is removed and we are able to integrate over non-rectangular regions $D$ in the plane such as the one shown in Figure 7.3, in which the perimeter of $D$ is a simple closed curve (recall Section 4.4) with counter-clockwise direction.

Figure 7.3: Curve $\Gamma$ counter-clockwise around perimeter of non-rectangular region $D$ in $\mathbb{R}^2$

We state this more general version of Green’s theorem in the plane without proof:

Theorem 7.2.1 (Green’s theorem in the plane). Suppose that $D$ is a region in the plane as shown in Figure 7.3, and the perimeter of $D$ is a simple closed curve $\Gamma$ in the counter-clockwise direction. If $\mathbf{F} : \mathbb{R}^2 \to \mathbb{R}^2$ is a $C^1$-vector field, then
$$\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = \int_D \left[\frac{\partial F_2}{\partial x}(x, y) - \frac{\partial F_1}{\partial y}(x, y)\right] dx\,dy. \tag{7.2.15}$$
Example 7.2.2. Suppose that $D$ is the unit disc in the plane and $\mathbf{F} : \mathbb{R}^2 \to \mathbb{R}^2$ is a vector field defined by
$$\mathbf{F}(x, y) := (F_1(x, y), F_2(x, y)), \tag{7.2.16}$$
for
$$F_1(x, y) := x^2 e^x + y - \log(1 + x^2), \qquad F_2(x, y) := 8x - \sin(y). \tag{7.2.17}$$
Evaluate the line integral $\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r}$, in which $\Gamma$ is the circle around the unit disc in the counter-clockwise direction.

Direct evaluation of this line integral by choosing a parametric representation of $\Gamma$ and using (5.1.13) is difficult (try it and see!). Instead we shall use Theorem 7.2.1. From (7.2.17) we have
$$\frac{\partial F_1}{\partial y}(x, y) = 1, \qquad \frac{\partial F_2}{\partial x}(x, y) = 8. \tag{7.2.18}$$
From (7.2.18) and (7.2.15)
$$\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = \int_D [8 - 1]\,dx\,dy = 7\int_D dx\,dy = 7\operatorname{area}(D) = 7\pi,$$
where we have used Remark 2.1.19 at the third equality.
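Although the line integral of Example 7.2.2 is awkward to evaluate by hand, it is straightforward to approximate on a computer, which gives an independent check of the value $7\pi$. The following is a minimal sketch assuming the standard parametrization $\boldsymbol{\gamma}(t) = (\cos t, \sin t)$, $0 \le t \le 2\pi$, of the unit circle and a midpoint rule; the function name and the number of subintervals are our own choices.

```python
import math

def F(x, y):
    # The field (7.2.17)
    return (x * x * math.exp(x) + y - math.log(1 + x * x), 8 * x - math.sin(y))

def circulation_unit_circle(F, n=2000):
    """Midpoint-rule approximation of the line integral of F around the unit
    circle, using gamma(t) = (cos t, sin t) and gamma'(t) = (-sin t, cos t)."""
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        x, y = math.cos(t), math.sin(t)
        F1, F2 = F(x, y)
        total += (F1 * (-y) + F2 * x) * h
    return total

value = circulation_unit_circle(F)
print(value, 7 * math.pi)
```

The close agreement of the two printed numbers is consistent with Theorem 7.2.1, and also shows how much work Green’s theorem saves in this example.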
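Remark 7.2.3. Green’s theorem provides a useful formula for calculating the area of a region in $\mathbb{R}^2$. Indeed, fix the region $D$ in the plane shown in Figure 7.3, and let $\Gamma$ be the closed curve around the perimeter of this region in the counter-clockwise direction (exactly as in Theorem 7.2.1). Now take the special vector field $\mathbf{F} : \mathbb{R}^2 \to \mathbb{R}^2$ defined by
$$F_1(x, y) := -y, \qquad F_2(x, y) := x. \tag{7.2.19}$$
From (7.2.19) we have
$$\int_D \left[\frac{\partial F_2}{\partial x}(x, y) - \frac{\partial F_1}{\partial y}(x, y)\right] dx\,dy = 2\int_D dx\,dy = 2\operatorname{area}(D), \tag{7.2.20}$$
where we have used Remark 2.1.19 at the second equality. Now suppose that a parametric representation of the curve $\Gamma$ is
$$\boldsymbol{\gamma} : [a, b] \to \mathbb{R}^2 \quad\text{with}\quad \boldsymbol{\gamma}(t) = (x(t), y(t)) \text{ for all } a \le t \le b. \tag{7.2.21}$$
Then
$$\begin{aligned}
\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} &= \int_a^b \mathbf{F}(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t)\,dt \quad\text{(from (5.1.13))} \\
&= \int_a^b \bigl(F_1(x(t), y(t)), F_2(x(t), y(t))\bigr) \cdot \left(\frac{dx(t)}{dt}, \frac{dy(t)}{dt}\right) dt \quad\text{(see (7.2.21))} \\
&= \int_a^b \left[x(t)\frac{dy(t)}{dt} - y(t)\frac{dx(t)}{dt}\right] dt \quad\text{(from (7.2.19))}.
\end{aligned} \tag{7.2.22}$$
Now combine (7.2.22), (7.2.20) and (7.2.15) to obtain the area formula
$$\operatorname{area}(D) = \frac{1}{2}\int_a^b \left[x(t)\frac{dy(t)}{dt} - y(t)\frac{dx(t)}{dt}\right] dt. \tag{7.2.23}$$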
Example 7.2.4. A hypocycloid is a curve $\Gamma$ in $\mathbb{R}^2$ comprising all $(x, y)$ which satisfy the relation
$$x^{2/3} + y^{2/3} = 1, \tag{7.2.24}$$
that is
$$\Gamma = \{(x, y) \in \mathbb{R}^2 \mid x^{2/3} + y^{2/3} = 1\} \tag{7.2.25}$$
(see Figure 7.4). Determine the area of the region enclosed by the hypocycloid.

Figure 7.4: Hypocycloid in $\mathbb{R}^2$

Direct evaluation of this area is very complicated (you may want to try it!). We shall see that formula (7.2.23) makes the computation of area quite easy. A parametrization of the curve $\Gamma$ around the perimeter of the hypocycloid in the counter-clockwise direction is
$$\boldsymbol{\gamma}(t) := x(t)\,\mathbf{i} + y(t)\,\mathbf{j} \quad\text{where}\quad x(t) = [\cos(t)]^3, \quad y(t) = [\sin(t)]^3, \quad\text{for all } 0 \le t \le 2\pi. \tag{7.2.26}$$
In fact, from (7.2.26), we have
$$[x(t)]^{2/3} + [y(t)]^{2/3} = [\cos(t)]^2 + [\sin(t)]^2 = 1, \quad\text{for all } 0 \le t \le 2\pi, \tag{7.2.27}$$
so that the path $\boldsymbol{\gamma} : [0, 2\pi] \to \mathbb{R}^2$ traverses the curve $\Gamma$, and it is easily checked that the direction of traverse is counter-clockwise. We now use the path given by (7.2.26) in the area formula (7.2.23). From (7.2.26)
$$\frac{dx(t)}{dt} = -3\sin(t)[\cos(t)]^2, \qquad \frac{dy(t)}{dt} = 3\cos(t)[\sin(t)]^2. \tag{7.2.28}$$
We now evaluate the integrand of the area formula (7.2.23). From (7.2.28) and (7.2.26) we find
$$\begin{aligned}
x(t)\frac{dy(t)}{dt} - y(t)\frac{dx(t)}{dt} &= 3[\cos(t)]^4[\sin(t)]^2 + 3[\sin(t)]^4[\cos(t)]^2 \\
&= 3[\cos(t)\sin(t)]^2\left\{[\cos(t)]^2 + [\sin(t)]^2\right\} \\
&= 3[\cos(t)\sin(t)]^2 \\
&= \frac{3}{4}[\sin(2t)]^2 \quad\text{(recall the identity } \sin(2t) = 2\sin(t)\cos(t)\text{)} \\
&= \frac{3}{8}[1 - \cos(4t)] \quad\text{(recall } 2\sin^2(\theta) = 1 - \cos(2\theta)\text{)}.
\end{aligned} \tag{7.2.29}$$
From (7.2.29) and (7.2.23) with $a = 0$ and $b = 2\pi$ we obtain
$$\begin{aligned}
\operatorname{area} &= \frac{3}{16}\int_0^{2\pi} [1 - \cos(4t)]\,dt \\
&= \frac{3\pi}{8} - \frac{3}{16}\int_0^{2\pi} \cos(4t)\,dt = \frac{3\pi}{8},
\end{aligned} \tag{7.2.30}$$
since
$$\int_0^{2\pi} \cos(4t)\,dt = 0$$
by inspection.
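The area formula (7.2.23) translates directly into a few lines of code. The sketch below is an illustration with our own choice of helper name and quadrature (a midpoint rule); it evaluates (7.2.23) for the hypocycloid parametrization (7.2.26) and, as a second sanity check, for the unit circle, whose area is known to be $\pi$.

```python
import math

def area_from_boundary(x, y, dx, dy, a, b, n=2000):
    """Midpoint-rule approximation of the area formula (7.2.23) for a
    counter-clockwise boundary path (x(t), y(t)), a <= t <= b."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        total += (x(t) * dy(t) - y(t) * dx(t)) * h
    return 0.5 * total

# Hypocycloid parametrization (7.2.26) and its derivatives (7.2.28)
x = lambda t: math.cos(t) ** 3
y = lambda t: math.sin(t) ** 3
dx = lambda t: -3 * math.sin(t) * math.cos(t) ** 2
dy = lambda t: 3 * math.cos(t) * math.sin(t) ** 2

hypocycloid_area = area_from_boundary(x, y, dx, dy, 0.0, 2 * math.pi)
disc_area = area_from_boundary(math.cos, math.sin,
                               lambda t: -math.sin(t), math.cos,
                               0.0, 2 * math.pi)
print(hypocycloid_area, 3 * math.pi / 8)
print(disc_area, math.pi)
```

Note that traversing the boundary clockwise instead would change the sign of the result, which is why the counter-clockwise direction is insisted upon in Theorem 7.2.1.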
Chapter 8
Surfaces, Surface Area and Surface Integrals
We are going to study surfaces in three dimensional space, our final objective being the construction
of surface integrals. The theorems of Gauss-Ostrogradskii and Stokes on vector calculus that we
shall address later rely in an essential way on surface integrals. Furthermore, surface integrals are
likewise essential for stating in completely general terms the basic laws of electromagnetism, such
as Faraday’s law on electromagnetic induction and Ampere’s circulation law for magnetic fields, as
we shall see later in this chapter.
8.1 Parametric Representation of Surfaces
Intuitively, a surface in three dimensional space is a “thin”, essentially “two dimensional” object,
such as a sheet of paper or the roof of a tent. Our first task is to make this somewhat vague notion
mathematically precise in a clear definition. We first motivate this with some examples:
Example 8.1.1. The surface of the "top half" of a sphere of radius $r > 0$ in $\mathbb{R}^3$ with center at the origin is the set of all points $(x, y, z)$ in $\mathbb{R}^3$ which satisfy the relations
$$x^2 + y^2 + z^2 = r^2, \quad z \ge 0 \tag{8.1.1}$$
(see Figure 8.1).

Figure 8.1: Top half of sphere of radius $r$

The description of the hemispherical surface at (8.1.1) can be given an alternative (and more useful) formulation as follows: Define the disc $D \subset \mathbb{R}^2_{xy}$ of radius $r$ in the $x$-$y$ plane, that is
$$D = \{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 \le r^2\}, \tag{8.1.2}$$
and define the function
$$f(x, y) := \sqrt{r^2 - x^2 - y^2}, \quad\text{for all } (x, y) \text{ in } D, \tag{8.1.3}$$
(note that $r^2 - x^2 - y^2 \ge 0$ for all $(x, y)$ in $D$ so that the square-root is always a real number). It is clear that the point $(x, y, f(x, y))$ traces out the surface $S$ of the top half of the sphere in Figure 8.1 as $(x, y)$ varies through the disc $D$.
Example 8.1.2. It is very easy to generalize Example 8.1.1 as follows: Suppose that $f : D \to \mathbb{R}$ is a given continuous function defined on some given region $D \subset \mathbb{R}^2_{xy}$ in the $x$-$y$ plane. As $(x, y)$ varies throughout $D$ the point $(x, y, f(x, y))$ traces out a "surface" $S$ in three dimensional space $\mathbb{R}^3$ as shown in Figure 8.2. We call this surface $S$ the graph of the function $f : D \to \mathbb{R}$, and denote this surface in set-theoretic terms as
$$S = \{(x, y, f(x, y)) \in \mathbb{R}^3 \mid (x, y) \text{ in } D\}. \tag{8.1.4}$$
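To make the idea of a graph surface concrete, the following small sketch samples points $(x, y, f(x, y))$ of the set (8.1.4) for the hemisphere function (8.1.3) and checks that they satisfy the relations (8.1.1). The grid resolution, the radius value and the helper names are our own illustrative assumptions; the clamp to zero inside the square root only guards against tiny negative values caused by floating-point rounding on the edge of the disc.

```python
import math

r = 1.5                                    # illustrative radius

def f(x, y):
    # The function (8.1.3); max(0.0, ...) protects against rounding at the rim
    return math.sqrt(max(0.0, r * r - x * x - y * y))

def sample_graph(f, r, n=20):
    """Points (x, y, f(x, y)) for the grid points (x, y) lying in the disc (8.1.2)."""
    pts = []
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            x, y = r * i / n, r * j / n
            if x * x + y * y <= r * r:
                pts.append((x, y, f(x, y)))
    return pts

points = sample_graph(f, r)
worst = max(abs(x * x + y * y + z * z - r * r) for x, y, z in points)
print(len(points), worst)
```

Every sampled point has $z \ge 0$ and lies on the sphere up to rounding, but each $(x, y)$ contributes exactly one point, so the bottom half of the sphere is never reached by this construction.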
Remark 8.1.3. The graphs of functions f : D → R, as in Example 8.1.1, and more generally in
Example 8.1.2, represent an important class of surfaces, but it is an unfortunate fact that not every
92
Figure 8.2: Surface S; the graph of a function f : D → R
surface in R3 can be represented by the graph of a function. Shown in Figure 8.3 is the surface
of the whole sphere of radius r centered at the origin of R3, in contrast to just the “top half” of
the sphere in Figure 8.1. If D denotes the disc of radius r in the x− y plane (see (8.1.2)) then we
see from Figure 8.3 that the straight line parallel to the z-axis through the point A inside the disc
D (with coordinates (x, y)) passes through the surface at the distinct points B and C. One could
perhaps deal with this by defining the “multivalued” function
\[ f(x, y) := \pm\sqrt{r^2 - x^2 - y^2}, \quad \text{for all } (x, y) \text{ in } D, \tag{8.1.5} \]
(cf. (8.1.3)), the positive and negative values in (8.1.5) corresponding to the points B and C
respectively, and then attempt to regard the surface of the whole sphere as the “graph” of the
multivalued function f given by (8.1.5). However, “multivalued functions” bring a host of intractable
complications, so much so that in mathematics we never deal with anything except single-valued
Figure 8.3: Whole surface of a sphere of radius r
functions. This being the case, it follows that we cannot represent the surface of the whole sphere
as the graph of a (single-valued!) function f . Similar problems occur for the surface in Figure 8.4,
which includes a “fold”. Again, this surface cannot be the graph of a function f : D → R, for some
region D in the x− y plane, since, for some points (x, y), a straight line parallel to the z-axis and
passing through (x, y) necessarily passes through the surface at three distinct points A, B and C
on account of the “fold” in the surface. Here we would need a “three-valued function” to represent
this surface, giving the z-coordinate of each of the three points A, B and C for fixed (x, y), an
absolutely hideous prospect. We repeat again: we deal only with functions that are single-valued,
so the folded surface in Figure 8.4 cannot be represented by the graph of a function. A final example
of a surface which clearly cannot be represented by the graph of any function is the outer surface of
the deformed “donut” in Figure 8.5. The surfaces in Figures 8.3, 8.4 and 8.5 suggest that we must
be quite careful in formulating exactly what we mean by a surface. We shall build on one essential
piece of intuition: One can think of a surface as a portion of a generally deformed “flat surface” or
plane. Since it takes two coordinates to specify a point in a plane it follows that one should likewise
require two “coordinates” to specify a point on a surface. The following definition builds on this
Figure 8.4: Surface S includes a “fold”
intuition:
Definition 8.1.4. A parametric function for a surface S is a given function or mapping
\[ \boldsymbol{\Phi} : D \to \mathbb{R}^3, \tag{8.1.6} \]
defined on some given region D in a u − v plane $\mathbb{R}^2_{uv}$, written in the scalar component form
\[
\begin{aligned}
\boldsymbol{\Phi}(u, v) &= (x(u, v), y(u, v), z(u, v)) \\
&= x(u, v)\mathbf{i} + y(u, v)\mathbf{j} + z(u, v)\mathbf{k} \quad \text{for all } (u, v) \text{ in } D,
\end{aligned}
\tag{8.1.7}
\]
which maps each point (u, v) in the fixed region D into the vector $\boldsymbol{\Phi}(u, v)$ in $\mathbb{R}^3$. The surface S of
the parametric function is the set of points in $\mathbb{R}^3$ traced by $\boldsymbol{\Phi}(u, v)$ as (u, v) traverses the region D.
In the notation of sets we write this as
\[ S := \{ \boldsymbol{\Phi}(u, v) \in \mathbb{R}^3 \mid (u, v) \in D \}. \tag{8.1.8} \]
Figure 8.5: Surface of a deformed donut
The region D on which the parametric function is defined is called the domain of definition or
parametric domain, the variable (u, v) is called the parametric variable, and the whole function
ΦΦΦ : D → R3 is called a parametric representation of the surface.
In Example 8.1.2 the region D on which the surface is defined is a region in the x− y plane R2xy.
In contrast, the parametric domain D in the Definition 8.1.4 is a region in a general u−v plane R2uv
which could be different from the x − y plane. It is because of this flexibility in not being limited
just to regions in the x− y plane that Definition 8.1.4 describes a very much larger class of surfaces
than just the surfaces corresponding to graphs of functions that were seen in Example 8.1.2.
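Remark 8.1.5. Suppose that the parametric function $\boldsymbol{\Phi}$ is a $C^1$-function, that is (cf. Remark
3.2.2) the first partial derivatives
\[
\frac{\partial x}{\partial u}(u, v), \quad \frac{\partial y}{\partial u}(u, v), \quad \frac{\partial z}{\partial u}(u, v), \quad
\frac{\partial x}{\partial v}(u, v), \quad \frac{\partial y}{\partial v}(u, v), \quad \frac{\partial z}{\partial v}(u, v),
\]
of the scalar components of $\boldsymbol{\Phi}$ (see (8.1.7)) exist and are continuous functions of (u, v) in D. Then
the parametric representation (8.1.6) is called a $C^1$-parametric representation and the corresponding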
surface S is called a C1-surface. Our focus will be almost exclusively on the case of C1-parametric
representations and C1-surfaces.
Remark 8.1.6. Observe that if a surface S is the graph of a function f : D → R as in the
Example 8.1.2, then it has a parametric representation of the form specified in Definition 8.1.4 and
is therefore a surface in the sense of Definition 8.1.4. Indeed, suppose we are given the continuous
function f : D → R, in which D ⊂ R2xy is some region in the x − y plane, exactly as in Example
8.1.2. We now just take the u − v plane R2uv in Definition 8.1.4 to be identical to the x − y plane
R2xy. Define the parametric function ΦΦΦ : D → R3 by
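\[
\begin{aligned}
\boldsymbol{\Phi}(u, v) &:= u\mathbf{i} + v\mathbf{j} + f(u, v)\mathbf{k} \\
&\equiv (u, v, f(u, v)), \quad \text{for all } (u, v) \text{ in } D,
\end{aligned}
\tag{8.1.9}
\]
so that the scalar components of $\boldsymbol{\Phi}(u, v)$ are (cf. (8.1.7))
\[ x(u, v) = u, \quad y(u, v) = v, \quad z(u, v) = f(u, v), \quad \text{for all } (u, v) \text{ in } D. \tag{8.1.10} \]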
Then it is clear that the surface S given in Example 8.1.2 is exactly the set of points in R3 traced by
ΦΦΦ(u, v) as (u, v) traverses the region D, that is, the function at (8.1.9) is a parametric representation
of the surface S in Example 8.1.2. The advantage of Definition 8.1.4 is that it applies to a much
broader class of surfaces than the surfaces which are graphs of functions, and, as we have just noted,
also includes surfaces which are graphs of functions.
Example 8.1.7. We now return to Example 8.1.1 (the top half of a sphere of radius r centered
at the origin of R3 shown in Figure 8.1) which we already know is the surface of the graph of
the function f given by (8.1.3). We shall now give an alternative parametric representation of the
surface. To see how this works we reproduce part of Figure 8.1 in greater detail in Figure 8.6:
Fix some point A on the surface S of the sphere. Then the length of OA is the radius r of the
sphere. Introduce the angle φ from the z-axis to the ray OA, drop a perpendicular from A onto the
x− y plane to get the point B, and let θ be the angle in the x− y plane from the x-axis to the ray
OB. From the right-angle triangle OAE we find
\[ OE = r\cos(\phi), \qquad OB = AE = r\sin(\phi), \tag{8.1.11} \]
and from the right-angle triangle OBC with (8.1.11) we find
\[ OC = OB\cos(\theta) = r\sin(\phi)\cos(\theta), \qquad OD = OB\sin(\theta) = r\sin(\phi)\sin(\theta). \tag{8.1.12} \]
Figure 8.6: Spherical surface
But from Figure 8.6 we see that OC, OD and OE give the x, y and z coordinates of the point A
in terms of the angles θ and φ so that
(8.1.13) x(θ, φ) = r sin(φ) cos(θ), y(θ, φ) = r sin(φ) sin(θ), z(θ, φ) = r cos(φ).
This means that, if we fix the angles θ and φ then the point A in Figure 8.6 corresponding to these
angles has x− y − z coordinates given by (8.1.13). Put another way, if θ varies through the range
0 ≤ θ ≤ 2π and φ varies through the range 0 ≤ φ ≤ π/2 then the point with coordinates given by
(8.1.13) traces out the entire surface S, that is the top half of the surface of the sphere of radius
r shown in Figure 8.1. To write this in the formalism of Definition 8.1.4 we define the rectangle
D ⊂ R2 in the “θ − φ plane” by
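\[
\begin{aligned}
D &= \{ (\theta, \phi) \in \mathbb{R}^2 \mid 0 \le \theta \le 2\pi \text{ and } 0 \le \phi \le \pi/2 \} \\
&= [0, 2\pi] \times [0, \pi/2],
\end{aligned}
\tag{8.1.14}
\]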
in which we have used the abbreviated “mathematical” notation of (2.1.3) for the rectangle D.
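Using (8.1.13) we define the function $\boldsymbol{\Phi} : D \to \mathbb{R}^3$ by
\[
\begin{aligned}
\boldsymbol{\Phi}(\theta, \phi) &:= (x(\theta, \phi), y(\theta, \phi), z(\theta, \phi)) \\
&\equiv x(\theta, \phi)\mathbf{i} + y(\theta, \phi)\mathbf{j} + z(\theta, \phi)\mathbf{k} \\
&= r\sin(\phi)\cos(\theta)\mathbf{i} + r\sin(\phi)\sin(\theta)\mathbf{j} + r\cos(\phi)\mathbf{k},
\end{aligned}
\tag{8.1.15}
\]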
for all (θ, φ) in D defined by (8.1.14). It is clear that, as (θ, φ) traverses the rectangle D given by
(8.1.14) then the vector ΦΦΦ(θ, φ) defined by (8.1.15) traces out the surface S, namely the top half of
the sphere of radius r. We therefore have a parametric representation of the surface S in the sense
of Definition 8.1.4.
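As a quick sanity check on Example 8.1.7, the following minimal Python sketch samples the rectangle D of (8.1.14) on a grid and confirms numerically that each image point satisfies the relations (8.1.1). The function names `phi_sphere` and `on_top_half`, the grid size and the tolerance are illustrative choices made here and are not part of the notes.

```python
import math

def phi_sphere(theta, phi, r=1.0):
    """Parametric function (8.1.15): maps (theta, phi) to a point of the sphere of radius r."""
    return (r * math.sin(phi) * math.cos(theta),
            r * math.sin(phi) * math.sin(theta),
            r * math.cos(phi))

def on_top_half(point, r=1.0, tol=1e-12):
    """True when the point satisfies (8.1.1): x^2 + y^2 + z^2 = r^2 and z >= 0 (up to tol)."""
    x, y, z = point
    return abs(x * x + y * y + z * z - r * r) <= tol and z >= -tol

# Sample D = [0, 2*pi] x [0, pi/2] on a uniform grid (the grid size is arbitrary).
n = 20
samples = [phi_sphere(2 * math.pi * i / n, (math.pi / 2) * j / n, r=2.0)
           for i in range(n + 1) for j in range(n + 1)]
all_on_surface = all(on_top_half(p, r=2.0) for p in samples)
print(all_on_surface)
```

Letting φ run beyond π/2 produces points with z < 0, which is exactly the extension to the whole sphere taken up in Example 8.1.9.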
Remark 8.1.8. It follows that we have two distinct parametric representations for the surface S
which is the top half of a sphere of radius r centered at the origin of R3, namely the representation
constructed in Example 8.1.7 and the representation following from the fact that S is the graph
of a function f (recall Example 8.1.1 and Remark 8.1.6). This makes it clear that a given surface
generally has several different parametric representations. In such a situation is there any “right”
choice of parametric representation? The answer really depends on the particular problem one has
in mind. Later in this chapter we shall study integration over surfaces (or surface integrals) and
we shall see that evaluation of these integrals is often greatly simplified by choosing the “right”
parametric representation of the surface over which we must integrate. In the next example we
identify a major advantage of the parametric representation of the top half of the sphere established
in Example 8.1.7. Recall from Remark 8.1.3 that the representation of this surface as the graph of
the function (8.1.3) does not extend in any easy way to a description of the surface of the whole
sphere of radius r. By contrast, in the next example we see that the parametric representation of
Example 8.1.7 extends trivially to give a full description of the whole sphere of radius r.
Example 8.1.9. We want a parametric representation of the surface S which is now the whole
sphere of radius r centered at the origin of R3 (see Figure 8.3). Referring to Figure 8.6 we see that,
if θ varies through the interval 0 ≤ θ ≤ 2π (as in Example 8.1.7) and φ varies through the interval
0 ≤ φ ≤ π (in contrast to the range 0 ≤ φ ≤ π/2 in Example 8.1.7) then the point A must trace out
the surface of the whole sphere. It is trivial to repeat the analysis of Example 8.1.7 to see that a
parametric representation of the whole surface of the sphere is given by the region D ⊂ R2 defined
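by
\[
\begin{aligned}
D &= \{ (\theta, \phi) \in \mathbb{R}^2 \mid 0 \le \theta \le 2\pi \text{ and } 0 \le \phi \le \pi \} \\
&= [0, 2\pi] \times [0, \pi],
\end{aligned}
\tag{8.1.16}
\]
(compare (8.1.14)) with $\boldsymbol{\Phi} : D \to \mathbb{R}^3$ given again by (8.1.15), that is
\[ \boldsymbol{\Phi}(\theta, \phi) = r\sin(\phi)\cos(\theta)\mathbf{i} + r\sin(\phi)\sin(\theta)\mathbf{j} + r\cos(\phi)\mathbf{k}, \tag{8.1.17} \]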
for all (θ, φ) in D defined by (8.1.16).
In the next example we give a parametric representation of another surface which is again not the
graph of a function.
Example 8.1.10. In this example we write (r, θ) instead of (u, v) for the parametric variable since
we want to regard the first variable r as “radius” and the second variable as “angle”. Define
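\[
\begin{aligned}
\boldsymbol{\Phi}(r, \theta) &:= (r\cos(\theta), r\sin(\theta), \theta) \\
&= r\cos(\theta)\mathbf{i} + r\sin(\theta)\mathbf{j} + \theta\mathbf{k},
\end{aligned}
\tag{8.1.18}
\]
for all (r, θ) such that
\[ 0 \le r \le 1 \quad \text{and} \quad 0 \le \theta \le 2\pi. \tag{8.1.19} \]
We then have a parametric mapping $\boldsymbol{\Phi} : D \to \mathbb{R}^3$ in which the region D is the rectangle in the
“r − θ plane” given by (8.1.19), that is
\[
\begin{aligned}
D &= \{ (r, \theta) \in \mathbb{R}^2 \mid 0 \le r \le 1 \text{ and } 0 \le \theta \le 2\pi \} \\
&= [0, 1] \times [0, 2\pi].
\end{aligned}
\tag{8.1.20}
\]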
Comparing (8.1.18) with (8.1.7) we see that the scalar components of Φ are given by
(8.1.21) x(r, θ) = r cos(θ), y(r, θ) = r sin(θ), z(r, θ) = θ.
The surface in R3 traced by ΦΦΦ(r, θ) as (r, θ) traverses the rectangle D is called a helicoid and shown
in Figure 8.7.
If r is held fixed in the range 0 ≤ r ≤ 1 then the point (x(r, θ), y(r, θ)) = (r cos(θ), r sin(θ)) traces a
circle of radius r in the plane as θ varies through the range 0 ≤ θ ≤ 2π. However, the third relation
of (8.1.21) “winds” this circle into a “spiral” of radius r around the z-axis in R3. The totality of
these “spirals” of radius r for all 0 ≤ r ≤ 1 makes up the helicoid. It is clear that the helicoid
cannot possibly be the graph of a function.
Figure 8.7: Helicoid (taken from Wikimedia Commons)
Remark 8.1.11. It is worthwhile to compare Definition 4.2.2 with Definition 8.1.4, for they are very
similar. In each case we begin with a specified parametric function (also called a path in Definition
4.2.2) and define a corresponding curve or surface as the range of the parametric function when the
parametric variable (t in the case of a curve, (u, v) in the case of a surface) traverses through a basic
domain (an interval [a, b] in the case of a curve, a region D ⊂ R2 in the case of a surface). One
clear difference between these definitions is that the parametric variable in the case of a curve is a
single real number (usually denoted as t) whereas the parametric variable in the case of a surface is
a pair of real numbers (usually denoted as (u, v)). Of course this just reflects the fact that a curve
results from the deformation of a portion [a, b] of a straight line (i.e. a “one-dimensional” object)
whereas a surface results from the deformation of a portion D of a plane (i.e. a “two-dimensional”
object). The deformation is described by the parametric function γγγ in the case of a curve, and
by the parametric function ΦΦΦ in the case of a surface. Finally, note one (huge) difference between
Definitions 4.2.2 and 8.1.4, namely a curve has a natural direction corresponding to the increase
of t through a ≤ t ≤ b, whereas there seems to be no similarly natural “direction” for a surface
(since there is no obvious “direction” in which (u, v) traverses a region D ⊂ R2). In fact, there is a
sense of direction for a surface (called the orientation of the surface) but the formulation of this is
a highly technical business which properly belongs to the realm of differential geometry and which
we will not take up here.
8.2 Tangents to a Surface and Smooth Surfaces
Suppose that we are given a C1-parametric function ΦΦΦ : D → R3 (see Definition 8.1.4 and Remark
8.1.5) for which the region D ⊂ R2 is specifically a rectangle of the form
(8.2.22) D = [a, b]× [c, d],
(c.f. (2.1.3) and see Figure 2.1), and fix some (u0, v0) in D, that is u0 is in the range a ≤ u ≤ b and
v0 is in the range c ≤ v ≤ d. Then the mapping
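\[ \boldsymbol{\gamma}_{v_0} : [a, b] \to \mathbb{R}^3 \quad \text{defined by} \quad \boldsymbol{\gamma}_{v_0}(u) := \boldsymbol{\Phi}(u, v_0) \text{ for all } a \le u \le b \tag{8.2.23} \]
is the parametric representation of a curve $\Gamma_{v_0}$ traced in $\mathbb{R}^3$ by $\boldsymbol{\gamma}_{v_0}(u) = \boldsymbol{\Phi}(u, v_0)$ as u varies through
the interval a ≤ u ≤ b (see Figure 8.8).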
Figure 8.8: Curves Γv0 and Γu0 with tangent vectors
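Following (4.3.18) we take the derivative of the parametric function $\boldsymbol{\gamma}_{v_0}(u)$ at $u = u_0$, namely
\[
\frac{d\boldsymbol{\gamma}_{v_0}}{du}(u_0) = \frac{\partial\boldsymbol{\Phi}}{\partial u}(u_0, v_0)
= \left( \frac{\partial x}{\partial u}(u_0, v_0), \frac{\partial y}{\partial u}(u_0, v_0), \frac{\partial z}{\partial u}(u_0, v_0) \right).
\tag{8.2.24}
\]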
Here the first equality at (8.2.24) follows from (8.2.23) and the second equality follows from the
scalar component-wise representation of ΦΦΦ (see (8.1.7)). In view of Section 4.3 the vector at (8.2.24)
is tangent to the curve Γv0 at the point
(8.2.25) ΦΦΦ(u0, v0) = (x(u0, v0), y(u0, v0), z(u0, v0))
(see Figure 8.8). Similarly the mapping
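\[ \boldsymbol{\gamma}_{u_0} : [c, d] \to \mathbb{R}^3 \quad \text{defined by} \quad \boldsymbol{\gamma}_{u_0}(v) := \boldsymbol{\Phi}(u_0, v) \text{ for all } c \le v \le d \tag{8.2.26} \]
is the parametric representation of a curve $\Gamma_{u_0}$ traced in $\mathbb{R}^3$ by $\boldsymbol{\gamma}_{u_0}(v) = \boldsymbol{\Phi}(u_0, v)$ as v varies through
the interval c ≤ v ≤ d, and the vector
\[
\frac{d\boldsymbol{\gamma}_{u_0}}{dv}(v_0) = \frac{\partial\boldsymbol{\Phi}}{\partial v}(u_0, v_0)
= \left( \frac{\partial x}{\partial v}(u_0, v_0), \frac{\partial y}{\partial v}(u_0, v_0), \frac{\partial z}{\partial v}(u_0, v_0) \right)
\tag{8.2.27}
\]
is tangent to the curve $\Gamma_{u_0}$ at the point $\boldsymbol{\Phi}(u_0, v_0)$ (see Figure 8.8). We now define the vector
cross product of the vector $\partial\boldsymbol{\Phi}(u_0, v_0)/\partial u$ at (8.2.24) and the vector $\partial\boldsymbol{\Phi}(u_0, v_0)/\partial v$ at (8.2.27), that
is
\[ \mathbf{N}(u_0, v_0) := \frac{\partial\boldsymbol{\Phi}}{\partial u}(u_0, v_0) \times \frac{\partial\boldsymbol{\Phi}}{\partial v}(u_0, v_0). \tag{8.2.28} \]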
Using the familiar expression for calculating cross products of vectors, together with the formulas
for the vectors ∂ΦΦΦ(u0, v0)/∂u and ∂ΦΦΦ(u0, v0)/∂v at (8.2.24) and (8.2.27), we obtain from (8.2.28)
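\[
\mathbf{N}(u_0, v_0) =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
\frac{\partial x}{\partial u}(u_0, v_0) & \frac{\partial y}{\partial u}(u_0, v_0) & \frac{\partial z}{\partial u}(u_0, v_0) \\
\frac{\partial x}{\partial v}(u_0, v_0) & \frac{\partial y}{\partial v}(u_0, v_0) & \frac{\partial z}{\partial v}(u_0, v_0)
\end{vmatrix}.
\tag{8.2.29}
\]
Now expand the 3 × 3 determinant on the right of (8.2.29) by cofactors along its first row to get
\[
\begin{aligned}
\mathbf{N}(u_0, v_0) ={}& \mathbf{i}\left[ \frac{\partial y}{\partial u}(u_0, v_0)\frac{\partial z}{\partial v}(u_0, v_0) - \frac{\partial y}{\partial v}(u_0, v_0)\frac{\partial z}{\partial u}(u_0, v_0) \right] \\
&- \mathbf{j}\left[ \frac{\partial x}{\partial u}(u_0, v_0)\frac{\partial z}{\partial v}(u_0, v_0) - \frac{\partial x}{\partial v}(u_0, v_0)\frac{\partial z}{\partial u}(u_0, v_0) \right] \\
&+ \mathbf{k}\left[ \frac{\partial x}{\partial u}(u_0, v_0)\frac{\partial y}{\partial v}(u_0, v_0) - \frac{\partial x}{\partial v}(u_0, v_0)\frac{\partial y}{\partial u}(u_0, v_0) \right],
\end{aligned}
\tag{8.2.30}
\]
that is
\[
\mathbf{N}(u_0, v_0) = \mathbf{i}\,\frac{\partial(y, z)}{\partial(u, v)}(u_0, v_0) + \mathbf{j}\,\frac{\partial(z, x)}{\partial(u, v)}(u_0, v_0) + \mathbf{k}\,\frac{\partial(x, y)}{\partial(u, v)}(u_0, v_0),
\tag{8.2.31}
\]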
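in which we have introduced the usual 2 × 2 Jacobian determinants defined by
\[
\frac{\partial(y, z)}{\partial(u, v)}(u_0, v_0) :=
\begin{vmatrix}
\frac{\partial y}{\partial u}(u_0, v_0) & \frac{\partial y}{\partial v}(u_0, v_0) \\
\frac{\partial z}{\partial u}(u_0, v_0) & \frac{\partial z}{\partial v}(u_0, v_0)
\end{vmatrix}
= \frac{\partial y}{\partial u}(u_0, v_0)\frac{\partial z}{\partial v}(u_0, v_0) - \frac{\partial y}{\partial v}(u_0, v_0)\frac{\partial z}{\partial u}(u_0, v_0),
\tag{8.2.32}
\]
\[
\frac{\partial(z, x)}{\partial(u, v)}(u_0, v_0) :=
\begin{vmatrix}
\frac{\partial z}{\partial u}(u_0, v_0) & \frac{\partial z}{\partial v}(u_0, v_0) \\
\frac{\partial x}{\partial u}(u_0, v_0) & \frac{\partial x}{\partial v}(u_0, v_0)
\end{vmatrix}
= \frac{\partial z}{\partial u}(u_0, v_0)\frac{\partial x}{\partial v}(u_0, v_0) - \frac{\partial z}{\partial v}(u_0, v_0)\frac{\partial x}{\partial u}(u_0, v_0),
\tag{8.2.33}
\]
\[
\frac{\partial(x, y)}{\partial(u, v)}(u_0, v_0) :=
\begin{vmatrix}
\frac{\partial x}{\partial u}(u_0, v_0) & \frac{\partial x}{\partial v}(u_0, v_0) \\
\frac{\partial y}{\partial u}(u_0, v_0) & \frac{\partial y}{\partial v}(u_0, v_0)
\end{vmatrix}
= \frac{\partial x}{\partial u}(u_0, v_0)\frac{\partial y}{\partial v}(u_0, v_0) - \frac{\partial y}{\partial u}(u_0, v_0)\frac{\partial x}{\partial v}(u_0, v_0),
\tag{8.2.34}
\]
on the right side of (8.2.30) to get (8.2.31). It is clear that $\boldsymbol{\Phi}(u, v)$ traces the surface S shown in
Figure 8.8 as (u, v) traverses the rectangular region D.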
Definition 8.2.1. The surface S shown in Figure 8.8 is called smooth at the point $\boldsymbol{\Phi}(u_0, v_0)$ when
$\mathbf{N}(u_0, v_0) \neq \mathbf{0}$. The surface S is called smooth when it is smooth at $\boldsymbol{\Phi}(u_0, v_0)$ for each and every
$(u_0, v_0)$ in the region D, that is $\mathbf{N}(u_0, v_0) \neq \mathbf{0}$ for each and every $(u_0, v_0)$ in D. Throughout this
course we shall be interested only in smooth surfaces.
Remark 8.2.2. Assuming, as we shall always do, that the surface S is smooth in the sense of
Definition 8.2.1, we can define the unit vector
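\[ \mathbf{n}(u_0, v_0) := \frac{\mathbf{N}(u_0, v_0)}{\|\mathbf{N}(u_0, v_0)\|} \quad \text{for every } (u_0, v_0) \text{ in } D, \tag{8.2.35} \]
where, of course, from (8.2.31) and Pythagoras, for each $(u_0, v_0)$ in D we have
\[
\|\mathbf{N}(u_0, v_0)\| = \sqrt{ \left[ \frac{\partial(x, y)}{\partial(u, v)}(u_0, v_0) \right]^2 + \left[ \frac{\partial(y, z)}{\partial(u, v)}(u_0, v_0) \right]^2 + \left[ \frac{\partial(z, x)}{\partial(u, v)}(u_0, v_0) \right]^2 }.
\tag{8.2.36}
\]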
Since the vectors at (8.2.27) and (8.2.24) are tangent to the curves Γu0 and Γv0 respectively (as
already noted above), it follows that these vectors span the plane which is tangent to the surface
S at the point ΦΦΦ(u0, v0), and the unit vector nnn(u0, v0) is normal to this plane, and therefore also
normal to the surface S at the point ΦΦΦ(u0, v0).
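The construction of the normal vector can be tried out numerically. The sketch below is an illustration only: it approximates the two tangent vectors at (8.2.24) and (8.2.27) by central differences, forms their cross product as at (8.2.28), and normalizes as at (8.2.35). The helper names `cross`, `normal` and `unit_normal`, the step size `h` and the zero threshold are assumptions made here; the test surface is the helicoid of Example 8.1.10.

```python
import math

def cross(a, b):
    """Cross product a x b of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normal(Phi, u0, v0, h=1e-6):
    """Approximate N(u0, v0) = dPhi/du x dPhi/dv, as in (8.2.28), by central differences."""
    du = tuple((p - m) / (2 * h) for p, m in zip(Phi(u0 + h, v0), Phi(u0 - h, v0)))
    dv = tuple((p - m) / (2 * h) for p, m in zip(Phi(u0, v0 + h), Phi(u0, v0 - h)))
    return cross(du, dv)

def unit_normal(Phi, u0, v0):
    """Unit normal n = N / ||N|| as in (8.2.35); raises if N is numerically zero."""
    N = normal(Phi, u0, v0)
    length = math.sqrt(sum(c * c for c in N))
    if length < 1e-9:
        raise ValueError("N(u0, v0) is numerically zero: surface not smooth at this point")
    return tuple(c / length for c in N)

def helicoid(r, theta):
    """Helicoid of Example 8.1.10: (r cos(theta), r sin(theta), theta)."""
    return (r * math.cos(theta), r * math.sin(theta), theta)

N = normal(helicoid, 0.5, 1.0)
n = unit_normal(helicoid, 0.5, 1.0)
print(N, n)
```

For the helicoid the exact cross product is (sin θ, −cos θ, r), whose length √(1 + r²) is never zero, so this surface is smooth in the sense of Definition 8.2.1 at every point of its parametric domain.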
8.3 Area of a Surface
We can now obtain a useful formula for the area of a surface with the parametric representation
$\boldsymbol{\Phi} : D \to \mathbb{R}^3$, in which, exactly as in Section 8.2, we suppose for concreteness that the region D is
the rectangle at (8.2.22). Fix some u0, v0, and small ∆u > 0, ∆v > 0, such that
(8.3.37) a ≤ u0 < u0 + ∆u ≤ b, c ≤ v0 < v0 + ∆v ≤ d.
Then we get a small rectangle ∆D with the “corners” given by the points (u0, v0), (u0 + ∆u, v0),
(u0, v0 + ∆v) and (u0 + ∆u, v0 + ∆v), that is
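\[
\begin{aligned}
\Delta D &:= \{ (u, v) \mid u_0 \le u \le u_0 + \Delta u,\ v_0 \le v \le v_0 + \Delta v \} \\
&= [u_0, u_0 + \Delta u] \times [v_0, v_0 + \Delta v],
\end{aligned}
\tag{8.3.38}
\]
and $\boldsymbol{\Phi}$ maps ∆D onto the small “piece of surface”
\[ \Delta S := \{ \boldsymbol{\Phi}(u, v) \mid (u, v) \in \Delta D \}. \tag{8.3.39} \]
The small surface ∆S is approximately a flat parallelogram with edges AB and AC (see Figure 8.9).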
Since ∆u and ∆v are small the edges AB and AC are approximately given respectively by the
following vectors vvv1 and vvv2
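\[ \mathbf{v}_1 := \boldsymbol{\Phi}(u_0, v_0 + \Delta v) - \boldsymbol{\Phi}(u_0, v_0), \qquad \mathbf{v}_2 := \boldsymbol{\Phi}(u_0 + \Delta u, v_0) - \boldsymbol{\Phi}(u_0, v_0). \tag{8.3.40} \]
But, again since ∆u and ∆v are small, we also have the relations
\[ \mathbf{v}_1 \approx \frac{\partial\boldsymbol{\Phi}}{\partial v}(u_0, v_0)\,\Delta v, \qquad \mathbf{v}_2 \approx \frac{\partial\boldsymbol{\Phi}}{\partial u}(u_0, v_0)\,\Delta u. \tag{8.3.41} \]
Now we know that the area of the parallelogram ∆S with edges given by the vectors $\mathbf{v}_1$ and $\mathbf{v}_2$ is
given by the norm of the cross-product $\mathbf{v}_1 \times \mathbf{v}_2$, namely
\[ \text{area}\,\Delta S = \|\mathbf{v}_1 \times \mathbf{v}_2\|. \tag{8.3.42} \]
But, from (8.3.41),
\[
\begin{aligned}
\mathbf{v}_1 \times \mathbf{v}_2 &\approx \left( \frac{\partial\boldsymbol{\Phi}}{\partial v}(u_0, v_0) \times \frac{\partial\boldsymbol{\Phi}}{\partial u}(u_0, v_0) \right) \Delta u\,\Delta v \\
&= -\mathbf{N}(u_0, v_0)\,\Delta u\,\Delta v,
\end{aligned}
\tag{8.3.43}
\]
where the last step uses the definition of $\mathbf{N}(u_0, v_0)$ at (8.2.28) together with the fact that interchanging the
factors of a cross product changes its sign (which leaves the norm unchanged),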
Figure 8.9: Small rectangle ∆D and approximate parallelogram ∆S
and combining (8.3.43) with (8.3.42) we obtain
\[ \text{area}\,\Delta S \approx \|\mathbf{N}(u_0, v_0)\|\,\Delta u\,\Delta v. \tag{8.3.44} \]
As ∆u and ∆v shrink to the infinitesimals du and dv, the small rectangle ∆D with lower left
corner given by (u0, v0) shrinks to the infinitesimal rectangle dD (still with lower left corner given
by (u0, v0)), the piece of surface ∆S shrinks to the infinitesimal parallelogram dS (still with one
corner “anchored” at the point A given by ΦΦΦ(u0, v0) as in Figure 8.9), and the approximation at
(8.3.44) becomes exact, so that
\[ \text{area}\,dS = \|\mathbf{N}(u_0, v_0)\|\,du\,dv. \tag{8.3.45} \]
Now the total area of the surface S is the “sum” or integral of the elemental areas at (8.3.45) as
the infinitesimal rectangles dD cover the whole rectangle D, that is the total area of the surface S
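must be given by any of the following three equivalent expressions
\[
\begin{aligned}
\text{area}\,S &= \int_D \|\mathbf{N}(u, v)\|\,du\,dv \\
&= \int_D \left\| \frac{\partial\boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial\boldsymbol{\Phi}}{\partial v}(u, v) \right\| du\,dv \\
&= \int_D \sqrt{ \left[ \frac{\partial(x, y)}{\partial(u, v)}(u, v) \right]^2 + \left[ \frac{\partial(y, z)}{\partial(u, v)}(u, v) \right]^2 + \left[ \frac{\partial(z, x)}{\partial(u, v)}(u, v) \right]^2 }\ du\,dv,
\end{aligned}
\tag{8.3.46}
\]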
in which the second expression follows from (8.2.28) and the third expression follows from (8.2.36)
(replacing (u0, v0) with (u, v)).
Remark 8.3.1. Notice that, in order to evaluate the area of the surface S we can use any of
the three integrals on the right hand side of (8.3.46). As a practical matter the second integral
is typically the easiest to do calculations with, as well as displaying very clearly the role of the
parametric representation in the area formula. Each of the integrals just involves integrating over
a region D of R2 and therefore can be evaluated using Fubini’s theorem when D is a rectangle (see
Theorem 2.1.5 and Remark 2.1.6), or more generally either a y-simple region or x-simple region in
R2 (see Remark 2.1.8).
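To make the second expression in (8.3.46) concrete, here is a minimal numerical sketch for a rectangular parametric domain. It is not the method of these notes for evaluating integrals (which is Fubini's theorem applied by hand); it simply sums ‖∂Φ/∂u × ∂Φ/∂v‖ ∆u ∆v over a grid of midpoints, with the partial derivatives approximated by central differences. The function names `surface_area` and `flat_patch`, the grid size `n` and the step `h` are assumptions made for illustration.

```python
import math

def surface_area(Phi, a, b, c, d, n=200, h=1e-6):
    """Midpoint-rule approximation of the second integral in (8.3.46) over [a,b] x [c,d]."""
    du, dv = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        u = a + (i + 0.5) * du
        for j in range(n):
            v = c + (j + 0.5) * dv
            # Central-difference approximations of the two tangent vectors.
            tu = [(p - m) / (2 * h) for p, m in zip(Phi(u + h, v), Phi(u - h, v))]
            tv = [(p - m) / (2 * h) for p, m in zip(Phi(u, v + h), Phi(u, v - h))]
            # Components of the cross product tu x tv.
            nx = tu[1] * tv[2] - tu[2] * tv[1]
            ny = tu[2] * tv[0] - tu[0] * tv[2]
            nz = tu[0] * tv[1] - tu[1] * tv[0]
            total += math.sqrt(nx * nx + ny * ny + nz * nz) * du * dv
    return total

def flat_patch(u, v):
    """A tilted flat parallelogram, Phi(u, v) = (u, v, u + v), used as a simple test case."""
    return (u, v, u + v)

area_flat = surface_area(flat_patch, 0.0, 1.0, 0.0, 1.0, n=20)
print(area_flat, math.sqrt(3.0))
```

For the flat patch the cross product is the constant vector (−1, −1, 1), so the exact area over the unit square is √3, which the sketch reproduces up to rounding.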
Remark 8.3.2. The following important question arises in connection with the area formula given
by (8.3.46), namely the right side appears to depend on the particular parametric representation
that we have chosen for the surface S (recall (8.1.6) and (8.1.8)). This seeming dependence is shown
particularly by the second of the equivalent expressions on the right side of (8.3.46), in which the role
of the parametric representation ΦΦΦ : D → R3 is clearly indicated. However, the area of a surface is
intrinsic, and should not depend on the particular parametric representation we have chosen for the
surface. This means that if we choose to represent the same surface S by an alternative parametric
representation
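\[ \tilde{\boldsymbol{\Phi}} : \tilde{D} \to \mathbb{R}^3 \quad \text{for some region } \tilde{D} \subset \mathbb{R}^2, \tag{8.3.47} \]
so that in particular the surface S is traversed by $\tilde{\boldsymbol{\Phi}}(u, v)$ as (u, v) traverses the region $\tilde{D}$, that is
(compare (8.1.8))
\[ S := \{ \tilde{\boldsymbol{\Phi}}(u, v) \in \mathbb{R}^3 \mid (u, v) \in \tilde{D} \}, \tag{8.3.48} \]
then we had better have
\[ \text{area}\,S = \int_{\tilde{D}} \left\| \frac{\partial\tilde{\boldsymbol{\Phi}}}{\partial u}(u, v) \times \frac{\partial\tilde{\boldsymbol{\Phi}}}{\partial v}(u, v) \right\| du\,dv, \tag{8.3.49} \]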
if the area formula is to be “independent” of the parametric representation we have chosen for the
surface S! This involves showing that the quantities on the right sides of (8.3.49) and (8.3.46) are
equal. The following theorem, which we state without proof, guarantees that this is always the
case:
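Theorem 8.3.3. Suppose that
\[ \boldsymbol{\Phi} : D \to \mathbb{R}^3 \quad \text{and} \quad \tilde{\boldsymbol{\Phi}} : \tilde{D} \to \mathbb{R}^3 \tag{8.3.50} \]
are alternative $C^1$-parametric representations of a surface S, so that
\[ S := \{ \boldsymbol{\Phi}(u, v) \in \mathbb{R}^3 \mid (u, v) \in D \} \quad \text{and} \quad S := \{ \tilde{\boldsymbol{\Phi}}(u, v) \in \mathbb{R}^3 \mid (u, v) \in \tilde{D} \}, \tag{8.3.51} \]
i.e. $\boldsymbol{\Phi}(u, v)$ traverses S as (u, v) traverses D, and likewise $\tilde{\boldsymbol{\Phi}}(u, v)$ also traverses S as (u, v) traverses
$\tilde{D}$. Then
\[
\int_D \left\| \frac{\partial\boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial\boldsymbol{\Phi}}{\partial v}(u, v) \right\| du\,dv
= \int_{\tilde{D}} \left\| \frac{\partial\tilde{\boldsymbol{\Phi}}}{\partial u}(u, v) \times \frac{\partial\tilde{\boldsymbol{\Phi}}}{\partial v}(u, v) \right\| du\,dv.
\tag{8.3.52}
\]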
It follows from Theorem 8.3.3 that the formula (8.3.46) for the area of the surface S does not
depend on which parametric representation we use for S. This means, in particular, that if we have
several parametric representations of a surface S then we should use that particular representation
which involves the least amount of work in the integrations for calculating the area. This will become
clear in later examples.
Remark 8.3.4. Suppose a surface S is the graph of a function
(8.3.53) f : D → R,
as in Example 8.1.2 and Remark 8.1.6. In this case the area formula (8.3.46) simplifies, as we next
show. From Remark 8.1.6 we know that the surface S has the parametric representation ΦΦΦ : D → R3
in which
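\[ \boldsymbol{\Phi}(u, v) := u\mathbf{i} + v\mathbf{j} + f(u, v)\mathbf{k} \quad \text{for all } (u, v) \text{ in } D, \tag{8.3.54} \]
(cf. (8.1.9)). We next calculate the u-derivative and v-derivative of $\boldsymbol{\Phi}$ at (8.3.54):
\[ \frac{\partial\boldsymbol{\Phi}}{\partial u}(u, v) = 1\mathbf{i} + 0\mathbf{j} + \frac{\partial f}{\partial u}(u, v)\mathbf{k}, \tag{8.3.55} \]
\[ \frac{\partial\boldsymbol{\Phi}}{\partial v}(u, v) = 0\mathbf{i} + 1\mathbf{j} + \frac{\partial f}{\partial v}(u, v)\mathbf{k}. \tag{8.3.56} \]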
From (8.3.56) and (8.3.55) we get
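\[
\begin{aligned}
\frac{\partial\boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial\boldsymbol{\Phi}}{\partial v}(u, v)
&= \begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
1 & 0 & \frac{\partial f}{\partial u}(u, v) \\
0 & 1 & \frac{\partial f}{\partial v}(u, v)
\end{vmatrix} \\
&= -\frac{\partial f}{\partial u}(u, v)\mathbf{i} - \frac{\partial f}{\partial v}(u, v)\mathbf{j} + \mathbf{k},
\end{aligned}
\tag{8.3.57}
\]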
and calculating the Pythagorean length of the cross product vector at (8.3.57) we obtain
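\[
\left\| \frac{\partial\boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial\boldsymbol{\Phi}}{\partial v}(u, v) \right\|
= \sqrt{ 1 + \left[ \frac{\partial f}{\partial u}(u, v) \right]^2 + \left[ \frac{\partial f}{\partial v}(u, v) \right]^2 }.
\tag{8.3.58}
\]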
From (8.3.46), together with (8.3.58), we see that the area of the surface S given by the graph of
the function (8.3.53) is
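\[
\text{area}\,S = \int_D \sqrt{ 1 + \left[ \frac{\partial f}{\partial u}(u, v) \right]^2 + \left[ \frac{\partial f}{\partial v}(u, v) \right]^2 }\ du\,dv.
\tag{8.3.59}
\]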
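A short numerical illustration of the graph formula (8.3.59) may be helpful. The function f below is a hypothetical example chosen here (it does not appear in these notes) because its partial derivatives are simple and the integral can also be done by hand: for f(u, v) = (2/3)(u^{3/2} + v^{3/2}) on the unit square the integrand is √(1 + u + v), and two elementary integrations give (4/15)(3^{5/2} − 2·2^{5/2} + 1). The name `graph_area` and the grid size are likewise assumptions.

```python
import math

def graph_area(fu, fv, a, b, c, d, n=400):
    """Midpoint-rule approximation of (8.3.59) over the rectangle [a,b] x [c,d].

    fu and fv are the partial derivatives of f with respect to u and v."""
    du, dv = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        u = a + (i + 0.5) * du
        for j in range(n):
            v = c + (j + 0.5) * dv
            total += math.sqrt(1.0 + fu(u, v) ** 2 + fv(u, v) ** 2) * du * dv
    return total

# Hypothetical example: f(u, v) = (2/3) * (u**1.5 + v**1.5), so f_u = sqrt(u) and f_v = sqrt(v).
approx = graph_area(lambda u, v: math.sqrt(u), lambda u, v: math.sqrt(v), 0.0, 1.0, 0.0, 1.0)
exact = (4.0 / 15.0) * (3.0 ** 2.5 - 2.0 * 2.0 ** 2.5 + 1.0)
print(approx, exact)
```

The two printed values should agree to several decimal places; the small discrepancy is the discretization error of the midpoint rule.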
Example 8.3.5. In this example we shall determine the area of the surface S which is the top half
of the sphere of radius r centered at the origin of R3 (see Example 8.1.1). We know that S is the
graph of the function f given by (8.1.3) that is
\[ f(u, v) := \sqrt{r^2 - u^2 - v^2}, \quad \text{for all } (u, v) \text{ in } D, \tag{8.3.60} \]
defined on the disc of radius r in the u− v plane given by (8.1.2) that is
\[ D = \{ (u, v) \in \mathbb{R}^2 \mid u^2 + v^2 \le r^2 \}. \tag{8.3.61} \]
Evaluation of the area is therefore just a matter of substituting f and D given by (8.3.60) and
(8.3.61) into the formula given by (8.3.59) for the area of a surface which is the graph of a function
f , that is
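\[
\text{area}\,S = \int_D \sqrt{ 1 + \left[ \frac{\partial f}{\partial u}(u, v) \right]^2 + \left[ \frac{\partial f}{\partial v}(u, v) \right]^2 }\ du\,dv.
\tag{8.3.62}
\]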
From (8.3.60) we get
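\[
\frac{\partial f}{\partial u}(u, v) = \frac{-u}{\sqrt{r^2 - u^2 - v^2}}, \qquad
\frac{\partial f}{\partial v}(u, v) = \frac{-v}{\sqrt{r^2 - u^2 - v^2}},
\tag{8.3.63}
\]
and, from (8.3.63), we find
\[
1 + \left[ \frac{\partial f}{\partial u}(u, v) \right]^2 + \left[ \frac{\partial f}{\partial v}(u, v) \right]^2 = \frac{r^2}{r^2 - u^2 - v^2}.
\tag{8.3.64}
\]
Combining (8.3.64) and (8.3.62) we get
\[ \text{area}\,S = \int_D \frac{r}{\sqrt{r^2 - u^2 - v^2}}\ du\,dv. \tag{8.3.65} \]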
Evaluation of this integral is quite laborious and complicated (although not impossible), because the
integrand involves the reciprocal of a square root (which is usually quite awkward to deal with) and
because the integration is over the disc D (see (8.3.61)) rather than over a nice simple region such
as a rectangle. For this reason let us see if there is not an easier way to determine the area. Recall
from Theorem 8.3.3 that we can use any parametric representation of S in the area formula (8.3.46),
and recall from Example 8.1.7 that we have an alternative parametric representation ΦΦΦ : D → R3
of the surface S in which (see (8.1.14) and (8.1.15))
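\[
\begin{aligned}
D &= \{ (\theta, \phi) \in \mathbb{R}^2 \mid 0 \le \theta \le 2\pi \text{ and } 0 \le \phi \le \pi/2 \} \\
&= [0, 2\pi] \times [0, \pi/2],
\end{aligned}
\tag{8.3.66}
\]
and
\[ \boldsymbol{\Phi}(\theta, \phi) = r\sin(\phi)\cos(\theta)\mathbf{i} + r\sin(\phi)\sin(\theta)\mathbf{j} + r\cos(\phi)\mathbf{k}, \quad \text{for all } (\theta, \phi) \text{ in } D. \tag{8.3.67} \]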
Of the equivalent expressions given by (8.3.46) the second expression is typically the easiest to use,
that is
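\[ \text{area}\,S = \int_D \left\| \frac{\partial\boldsymbol{\Phi}}{\partial\theta}(\theta, \phi) \times \frac{\partial\boldsymbol{\Phi}}{\partial\phi}(\theta, \phi) \right\| d\theta\,d\phi, \tag{8.3.68} \]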
where we have just replaced the generic parametric variable (u, v) with the parametric variable (θ, φ)
specific to the surface S given by (8.3.66) - (8.3.67). We now calculate the θ-partial derivatives of
the scalar components of ΦΦΦ(θ, φ) at (8.3.67):
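\[ \frac{\partial\boldsymbol{\Phi}}{\partial\theta}(\theta, \phi) = -r\sin(\phi)\sin(\theta)\mathbf{i} + r\sin(\phi)\cos(\theta)\mathbf{j} + 0\mathbf{k}. \tag{8.3.69} \]
Similarly for the φ-partial derivatives:
\[ \frac{\partial\boldsymbol{\Phi}}{\partial\phi}(\theta, \phi) = r\cos(\phi)\cos(\theta)\mathbf{i} + r\cos(\phi)\sin(\theta)\mathbf{j} - r\sin(\phi)\mathbf{k}. \tag{8.3.70} \]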
Now calculate the vector cross product of the partial derivative vectors at (8.3.69) and (8.3.70) to
get
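\[
\frac{\partial\boldsymbol{\Phi}}{\partial\theta}(\theta, \phi) \times \frac{\partial\boldsymbol{\Phi}}{\partial\phi}(\theta, \phi) =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
-r\sin(\phi)\sin(\theta) & r\sin(\phi)\cos(\theta) & 0 \\
r\cos(\phi)\cos(\theta) & r\cos(\phi)\sin(\theta) & -r\sin(\phi)
\end{vmatrix}.
\tag{8.3.71}
\]
Expanding the right side of (8.3.71) gives
\[
\begin{aligned}
\frac{\partial\boldsymbol{\Phi}}{\partial\theta}(\theta, \phi) \times \frac{\partial\boldsymbol{\Phi}}{\partial\phi}(\theta, \phi)
={}& \mathbf{i}\,[(r\sin(\phi)\cos(\theta))(-r\sin(\phi)) - 0] \\
&- \mathbf{j}\,[(-r\sin(\phi)\sin(\theta))(-r\sin(\phi)) - 0] \\
&+ \mathbf{k}\,[(-r\sin(\phi)\sin(\theta))(r\cos(\phi)\sin(\theta)) - (r\cos(\phi)\cos(\theta))(r\sin(\phi)\cos(\theta))] \\
={}& -r^2 \{ \sin^2(\phi)\cos(\theta)\mathbf{i} + \sin^2(\phi)\sin(\theta)\mathbf{j} + \sin(\phi)\cos(\phi)(\sin^2(\theta) + \cos^2(\theta))\mathbf{k} \} \\
={}& -r^2 \{ \sin^2(\phi)\cos(\theta)\mathbf{i} + \sin^2(\phi)\sin(\theta)\mathbf{j} + \sin(\phi)\cos(\phi)\mathbf{k} \}.
\end{aligned}
\tag{8.3.72}
\]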
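We now determine the Pythagorean length of the vector at (8.3.72):
\[
\begin{aligned}
\left\| \frac{\partial\boldsymbol{\Phi}}{\partial\theta}(\theta, \phi) \times \frac{\partial\boldsymbol{\Phi}}{\partial\phi}(\theta, \phi) \right\|
&= r^2 \sqrt{ \sin^4(\phi)\cos^2(\theta) + \sin^4(\phi)\sin^2(\theta) + \sin^2(\phi)\cos^2(\phi) } \\
&= r^2 \sqrt{ \sin^4(\phi) + \sin^2(\phi)\cos^2(\phi) } \\
&= r^2 \sqrt{ \sin^2(\phi)[\sin^2(\phi) + \cos^2(\phi)] } \\
&= r^2 |\sin(\phi)|.
\end{aligned}
\tag{8.3.73}
\]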
Now put (8.3.73) into (8.3.68) to get
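\[
\begin{aligned}
\text{area}\,S &= \int_D \left\| \frac{\partial\boldsymbol{\Phi}}{\partial\theta}(\theta, \phi) \times \frac{\partial\boldsymbol{\Phi}}{\partial\phi}(\theta, \phi) \right\| d\theta\,d\phi \\
&= r^2 \int_0^{2\pi} \left\{ \int_0^{\pi/2} |\sin(\phi)|\,d\phi \right\} d\theta \\
&= r^2 \int_0^{2\pi} 1\,d\theta \\
&= 2\pi r^2.
\end{aligned}
\tag{8.3.74}
\]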
Note that we used (8.3.73), the fact that D is a rectangle (see (8.3.66)), and the Fubini Theorem
(recall (2.1.12)) at the second equality at (8.3.74).
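The value 2πr² can be confirmed numerically. The following minimal sketch evaluates the iterated integral in (8.3.74) by the midpoint rule; since the integrand r²|sin(φ)| does not depend on θ, the outer integration simply contributes a factor 2π. The function name `hemisphere_area` and the number of subintervals are illustrative assumptions.

```python
import math

def hemisphere_area(r, n=400):
    """Midpoint-rule value of the iterated integral in (8.3.74)."""
    dphi = (math.pi / 2) / n
    # Inner integral over phi of |sin(phi)| on [0, pi/2].
    inner = sum(abs(math.sin((j + 0.5) * dphi)) for j in range(n)) * dphi
    # The integrand is independent of theta, so the outer integral is a factor 2*pi.
    return r * r * inner * 2 * math.pi

print(hemisphere_area(3.0), 2 * math.pi * 3.0 ** 2)
```

Changing the upper limit of the inner integral from π/2 to π (the domain (8.1.16) of the whole sphere) would double the inner integral and so give 4πr².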
Remark 8.3.6. Example 8.3.5 illustrates a very important aspect of the area formula (8.3.46),
namely the choice of parametric representation of the surface S can substantially influence the
amount of work involved in using this formula. Indeed, we saw that the parametric representation
of S as the graph of the function (8.3.60) over the region D given by (8.3.61) leads to the rather
complicated integral at (8.3.65), whereas for the parametric representation of S given by (8.3.66)
and (8.3.67) the area formula is quite easy to use.
Example 8.3.7. Determine the area of the helicoid given in Example 8.1.10. The parametric
representation
ΦΦΦ : D → R3
of the helicoid is given by (8.1.18) and the region D is given by (8.1.19) (equivalently by (8.1.20)),
that is
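\[
\begin{aligned}
D &= \{ (r, \theta) \in \mathbb{R}^2 \mid 0 \le r \le 1 \text{ and } 0 \le \theta \le 2\pi \} \\
&= [0, 1] \times [0, 2\pi],
\end{aligned}
\tag{8.3.75}
\]
and
\[ \boldsymbol{\Phi}(r, \theta) = r\cos(\theta)\mathbf{i} + r\sin(\theta)\mathbf{j} + \theta\mathbf{k}, \quad \text{for all } (r, \theta) \text{ in } D. \tag{8.3.76} \]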
We see that (8.3.46) gives three equivalent expressions for the area. As we have already noted the
second of these expressions is typically the easiest to use, so the area is given by
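\[ \text{area of helicoid} = \int_D \left\| \frac{\partial\boldsymbol{\Phi}}{\partial r}(r, \theta) \times \frac{\partial\boldsymbol{\Phi}}{\partial\theta}(r, \theta) \right\| dr\,d\theta \tag{8.3.77} \]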
in which we have replaced the generic parametric variables (u, v) in (8.3.46) with the parametric
variables (r, θ) that are specific to the helicoid. We now calculate the r and θ-partial derivatives of
the scalar components of ΦΦΦ(r, θ) given by (8.3.76):
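\[
\begin{aligned}
\frac{\partial\boldsymbol{\Phi}}{\partial r}(r, \theta) &= \cos(\theta)\mathbf{i} + \sin(\theta)\mathbf{j} + 0\mathbf{k}, \\
\frac{\partial\boldsymbol{\Phi}}{\partial\theta}(r, \theta) &= -r\sin(\theta)\mathbf{i} + r\cos(\theta)\mathbf{j} + 1\mathbf{k}.
\end{aligned}
\tag{8.3.78}
\]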
Now calculate the vector cross product of the partial derivative vectors at (8.3.78) to get
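\[
\frac{\partial\boldsymbol{\Phi}}{\partial r}(r, \theta) \times \frac{\partial\boldsymbol{\Phi}}{\partial\theta}(r, \theta) =
\begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
\cos(\theta) & \sin(\theta) & 0 \\
-r\sin(\theta) & r\cos(\theta) & 1
\end{vmatrix},
\tag{8.3.79}
\]
that is
\[
\begin{aligned}
\frac{\partial\boldsymbol{\Phi}}{\partial r}(r, \theta) \times \frac{\partial\boldsymbol{\Phi}}{\partial\theta}(r, \theta)
&= \sin(\theta)\mathbf{i} - \cos(\theta)\mathbf{j} + [r\cos^2(\theta) + r\sin^2(\theta)]\mathbf{k} \\
&= \sin(\theta)\mathbf{i} - \cos(\theta)\mathbf{j} + r\mathbf{k}.
\end{aligned}
\tag{8.3.80}
\]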
Now calculate the Pythagorean length of the vector at (8.3.80):
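\[
\left\| \frac{\partial\boldsymbol{\Phi}}{\partial r}(r, \theta) \times \frac{\partial\boldsymbol{\Phi}}{\partial\theta}(r, \theta) \right\|
= \sqrt{ \sin^2(\theta) + \cos^2(\theta) + r^2 } = \sqrt{1 + r^2}.
\tag{8.3.81}
\]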
From (8.3.81) and (8.3.77)
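\[
\begin{aligned}
\text{area of helicoid} &= \int_D \sqrt{r^2 + 1}\ dr\,d\theta \\
&= \int_0^{2\pi} \left\{ \int_0^1 \sqrt{r^2 + 1}\ dr \right\} d\theta \quad \text{(from Remark 2.1.6 and (8.3.75))}.
\end{aligned}
\tag{8.3.82}
\]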
A rather lengthy and tedious integration by substitution gives
\[ \int_0^1 \sqrt{r^2 + 1}\ dr = \frac{1}{2}\left[ \sqrt{2} + \log(1 + \sqrt{2}) \right], \tag{8.3.83} \]
and then (8.3.82) with (8.3.83) give
\[ \text{area of helicoid} = \pi\left[ \sqrt{2} + \log(1 + \sqrt{2}) \right]. \]
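Since the integration leading to (8.3.83) is easy to get wrong by hand, a numerical cross-check is worthwhile. The sketch below (an illustration only; the name `helicoid_area` and the number of subintervals are assumptions) evaluates the iterated integral (8.3.82) by the midpoint rule and compares it with the closed form, in which log denotes the natural logarithm.

```python
import math

def helicoid_area(n=1000):
    """Midpoint-rule value of (8.3.82): integral of sqrt(1 + r^2) over [0,1] x [0, 2*pi]."""
    dr = 1.0 / n
    # Inner integral over r; the integrand does not depend on theta.
    inner = sum(math.sqrt(1.0 + ((i + 0.5) * dr) ** 2) for i in range(n)) * dr
    return 2 * math.pi * inner

closed_form = math.pi * (math.sqrt(2.0) + math.log(1.0 + math.sqrt(2.0)))
print(helicoid_area(), closed_form)
```

Both values come out at roughly 7.21, in agreement with the closed-form area.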
Remark 8.3.8. For later use we repeat the area formulae (8.3.44) and (8.3.45). We are given the
parametric representation $\boldsymbol{\Phi} : D \to \mathbb{R}^3$ of a surface S for some region $D \subset \mathbb{R}^2_{uv}$. Fix a rectangle
\[ \Delta D := \{ (u, v) \mid u_0 \le u \le u_0 + \Delta u,\ v_0 \le v \le v_0 + \Delta v \}, \tag{8.3.84} \]
contained within the region D, in which ∆u > 0 and ∆v > 0 are small, and let ∆S be the
corresponding small surface which is the image of ∆D, that is
\[ \Delta S := \{ \boldsymbol{\Phi}(u, v) \mid (u, v) \in \Delta D \}, \tag{8.3.85} \]
(see Figure 8.10). Since the edges ∆u and ∆v are small the area of the surface ∆S is approximately
given by
(8.3.86) area∆S ≈ ‖NNN(u0, v0)‖∆u ∆v,
Figure 8.10: Small rectangle ∆D and approximate parallelogram ∆S
(see (8.3.44)), in which NNN(u0, v0) is the vector normal to the approximately flat surface ∆S, and
given by
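\[ \mathbf{N}(u_0, v_0) = \frac{\partial\boldsymbol{\Phi}}{\partial u}(u_0, v_0) \times \frac{\partial\boldsymbol{\Phi}}{\partial v}(u_0, v_0), \tag{8.3.87} \]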
(see (8.2.28)). As ∆u and ∆v shrink to the infinitesimals du and dv, the small rectangle ∆D
with lower left corner given by (u0, v0) shrinks to the infinitesimal rectangle dD (still with lower
left corner given by (u0, v0)), the piece of surface ∆S shrinks to the infinitesimal parallelogram dS
(still with one corner “anchored” at the point A given by ΦΦΦ(u0, v0) as in Figure 8.10), and the
approximation at (8.3.86) becomes exact, so that
(8.3.88) area dS = ‖NNN(u0, v0)‖ du dv.
We shall use the area formulae (8.3.86) and (8.3.88) in Section 8.4 on surface integrals of scalar
fields, and in Section 8.5 on surface integrals of vector fields.
8.4 Surface Integral of a Scalar Field
In Section 8.3 we obtained a formula for the area of a surface S with a parametric representation
ΦΦΦ : D → R3 (see (8.3.46) and (8.2.36)). In this section our goal is to generalize this idea to con-
struct the integral of a given scalar field over the surface S. For a concrete instance of how this
type of integral could be useful suppose that the surface S describes an infinitesimally thin sheet
of plastic, and for each point (x, y, z) on S the function value f(x, y, z) gives the charge density (in
units coul./m2) concentrated on the surface S at the point (x, y, z). The integral that we are going
to define will enable us to determine the total charge on the surface.
To fix ideas suppose that a surface S has the parametric representation ΦΦΦ : D → R3, in which
we suppose for concreteness that the region D ⊂ R2uv is the rectangle at (8.2.22), that is
(8.4.89) $$D = \{(u, v) \mid a \le u \le b,\ c \le v \le d\} = [a, b] \times [c, d],$$
and f : R3 → R is a given continuous scalar field. As in Section 8.3 we will divide the region D
into small rectangles ∆D (exactly as at (8.3.38) and (8.3.84)). Then Φ maps ∆D onto the piece of
surface ∆S given by (8.3.85), that is (see Figure 8.10)
(8.4.90) $$\Delta S := \{\boldsymbol{\Phi}(u, v) \mid (u, v) \in \Delta D\},$$
which is an approximate parallelogram with area given by (8.3.86), that is
(8.4.91) $$\operatorname{area} \Delta S \approx \|\mathbf{N}(u_0, v_0)\|\, \Delta u\, \Delta v,$$
in which, from (8.3.87),
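(8.4.92) $$\mathbf{N}(u_0, v_0) := \frac{\partial \boldsymbol{\Phi}}{\partial u}(u_0, v_0) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u_0, v_0).$$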
We now multiply the area at (8.4.91) by the value of the scalar field f at point ΦΦΦ(u0, v0) corre-
sponding to the corner A in Figure 8.10, that is by the value f(ΦΦΦ(u0, v0)), to get
(8.4.93) $$f(\boldsymbol{\Phi}(u_0, v_0))\, \operatorname{area} \Delta S \approx f(\boldsymbol{\Phi}(u_0, v_0))\, \|\mathbf{N}(u_0, v_0)\|\, \Delta u\, \Delta v.$$
What is the significance of the quantity at (8.4.93)? To get some idea of this suppose that at
each point (x, y, z) on the surface S the value f(x, y, z) gives the density of charge per unit area
concentrated on the surface at (x, y, z) (with units coul./m2). Then, in particular, the charge density
on the surface at the point ΦΦΦ(u0, v0) in Figure 8.10 (i.e. given by (x, y, z) = ΦΦΦ(u0, v0)) is f(ΦΦΦ(u0, v0))
coul./m2, so that the quantity at (8.4.93) is approximately the total charge on the small piece of
surface ∆S. Exactly as at (8.3.45), as ∆u and ∆v shrink to the infinitesimals du and dv the small
rectangle ∆D in Figure 8.10 with lower left corner given by (u0, v0) shrinks to the infinitesimal
rectangle dD (still with lower left corner given by (u0, v0)), the piece of surface ∆S shrinks to the
infinitesimal parallelogram dS (still with one corner “anchored” at the point A given by ΦΦΦ(u0, v0)
as in Figure 8.10), and the approximation at (8.4.93) becomes exact, so that
(8.4.94) $$f(\boldsymbol{\Phi}(u_0, v_0))\, \operatorname{area} dS = f(\boldsymbol{\Phi}(u_0, v_0))\, \|\mathbf{N}(u_0, v_0)\|\, du\, dv.$$
We now “sum” or integrate the elemental quantities at (8.4.94) as the infinitesimal rectangles dD
cover the whole rectangle D to get the quantity
(8.4.95) $$\int_D f(\boldsymbol{\Phi}(u, v))\, \|\mathbf{N}(u, v)\|\, du\, dv.$$
This quantity is known as the surface integral of the scalar field f over the surface S. If we recall the
particular interpretation of f(x, y, z) being the charge density at any point (x, y, z) on the surface
then it is clear that the surface integral at (8.4.95) gives the total charge on the surface S.
In view of (8.4.92) and (8.2.36) (but replacing (u0, v0) with (u, v)) we can equally well write the
surface integral at (8.4.95) as
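(8.4.96) $$\int_D f(\boldsymbol{\Phi}(u, v)) \left\| \frac{\partial \boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u, v) \right\| du\, dv$$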
as well as
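(8.4.97) $$\int_D f(\boldsymbol{\Phi}(u, v)) \sqrt{\left[\frac{\partial(x, y)}{\partial(u, v)}(u, v)\right]^2 + \left[\frac{\partial(y, z)}{\partial(u, v)}(u, v)\right]^2 + \left[\frac{\partial(z, x)}{\partial(u, v)}(u, v)\right]^2}\, du\, dv$$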
in the sense that all three integrals at (8.4.95), (8.4.96) and (8.4.97) are equal. Among these various
notations that given by (8.4.96) is typically the most convenient for actual calculations, and this is
the notation we shall use from now on.
A question very similar to that addressed in Remark 8.3.2 for the area formula arises in con-
nection with surface integrals, namely are surface integrals of scalar fields independent of the
parametrization that we choose for the surface S? Again, they had better be! In fact, going
back to our motivating interpretation of f as the charge per unit area on the surface, we have noted
that the surface integrals (8.4.95), (8.4.96) and (8.4.97) all give the total charge on the surface S,
and this of course should not depend on the particular parametric representation of the surface
S. That surface integrals are indeed independent of whichever parametric representation for the
surface S that we use is guaranteed by the following analog (in fact generalization) of Theorem
8.3.3 which we again state without proof:
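Theorem 8.4.1. Suppose that $f : \mathbb{R}^3 \to \mathbb{R}$ is a continuous scalar field, and that
(8.4.98) $$\boldsymbol{\Phi} : D \to \mathbb{R}^3 \quad \text{and} \quad \tilde{\boldsymbol{\Phi}} : \tilde{D} \to \mathbb{R}^3$$
are alternative $C^1$-parametric representations of a surface S, so that
(8.4.99) $$S := \{\boldsymbol{\Phi}(u, v) \in \mathbb{R}^3 \mid (u, v) \in D\} \quad \text{and} \quad S := \{\tilde{\boldsymbol{\Phi}}(\tilde{u}, \tilde{v}) \in \mathbb{R}^3 \mid (\tilde{u}, \tilde{v}) \in \tilde{D}\},$$
i.e. $\boldsymbol{\Phi}(u, v)$ traverses S as $(u, v)$ traverses the region $D$, and likewise $\tilde{\boldsymbol{\Phi}}(\tilde{u}, \tilde{v})$ traverses the same
surface S as $(\tilde{u}, \tilde{v})$ traverses the region $\tilde{D}$. Then
(8.4.100) $$\int_D f(\boldsymbol{\Phi}(u, v)) \left\| \frac{\partial \boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u, v) \right\| du\, dv = \int_{\tilde{D}} f(\tilde{\boldsymbol{\Phi}}(\tilde{u}, \tilde{v})) \left\| \frac{\partial \tilde{\boldsymbol{\Phi}}}{\partial \tilde{u}}(\tilde{u}, \tilde{v}) \times \frac{\partial \tilde{\boldsymbol{\Phi}}}{\partial \tilde{v}}(\tilde{u}, \tilde{v}) \right\| d\tilde{u}\, d\tilde{v}.$$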
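Theorem 8.4.1 is stated without proof, but its claim is easy to test numerically. The sketch below is an illustration and not part of the original notes: it takes the surface $z = xy$ over the unit square, parametrizes it once as $(u, v, uv)$ and once as $(s^2, t^3, s^2 t^3)$, both on $[0, 1] \times [0, 1]$, and evaluates the integral (8.4.96) for each with a midpoint rule. The surface, the scalar field $f(x, y, z) = x + y + z$ and all function names are assumptions chosen for the example.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(a[0] ** 2 + a[1] ** 2 + a[2] ** 2)

def scalar_surface_integral(f, phi, phi_1, phi_2, n=200):
    """Midpoint-rule estimate of (8.4.96) over the parameter square [0,1] x [0,1]."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        p = (i + 0.5) * h
        for j in range(n):
            q = (j + 0.5) * h
            total += f(*phi(p, q)) * norm(cross(phi_1(p, q), phi_2(p, q)))
    return total * h * h

def f(x, y, z):
    return x + y + z

# First parametrization: Phi(u, v) = (u, v, u v) with its partial derivatives.
phi_a = lambda u, v: (u, v, u * v)
phi_a_u = lambda u, v: (1.0, 0.0, v)
phi_a_v = lambda u, v: (0.0, 1.0, u)

# Second parametrization of the same surface: (s^2, t^3, s^2 t^3).
phi_b = lambda s, t: (s * s, t ** 3, s * s * t ** 3)
phi_b_s = lambda s, t: (2 * s, 0.0, 2 * s * t ** 3)
phi_b_t = lambda s, t: (0.0, 3 * t * t, 3 * s * s * t * t)

integral_a = scalar_surface_integral(f, phi_a, phi_a_u, phi_a_v)
integral_b = scalar_surface_integral(f, phi_b, phi_b_s, phi_b_t)
print(integral_a, integral_b)
```

The two printed values should agree up to the discretization error of the midpoint rule, even though the two integrands look quite different as functions of their parameters.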
Remark 8.4.2. A large variety of notations are commonly encountered in the literature to serve
as short-hand for the surface integrals at (8.4.95), (8.4.96) and (8.4.97), in particular
(8.4.101) $$\int_S f(x, y, z)\, dA, \quad \int_S f(x, y, z)\, d\sigma, \quad \text{and} \quad \int_S f(x, y, z)\, dS,$$
as well as
(8.4.102) $$\int_S f\, dA, \quad \int_S f\, d\sigma, \quad \text{and} \quad \int_S f\, dS.$$
In all these notations the essential elements are the subscript S of the integral, indicating the
surface over which one integrates, and of course the integrand f , which indicates the scalar field
being integrated. The notations at (8.4.101) are quite explicit, and remind us that we are integrating
over S with respect to an underlying space variable in R3 generically denoted by (x, y, z). It can
often be rather tedious to keep carrying the space variable argument (x, y, z), and this is the reason
for introducing the notations at (8.4.102), in which the space variable is suppressed (but always
understood to be present!). In this course we shall typically use the first of the notations at (8.4.102),
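so that we write
(8.4.103) $$\begin{aligned} \int_S f\, dA &= \int_D f(\boldsymbol{\Phi}(u, v))\, \|\mathbf{N}(u, v)\|\, du\, dv \\ &= \int_D f(\boldsymbol{\Phi}(u, v)) \left\| \frac{\partial \boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u, v) \right\| du\, dv \\ &= \int_D f(\boldsymbol{\Phi}(u, v)) \sqrt{\left[\frac{\partial(x, y)}{\partial(u, v)}(u, v)\right]^2 + \left[\frac{\partial(y, z)}{\partial(u, v)}(u, v)\right]^2 + \left[\frac{\partial(z, x)}{\partial(u, v)}(u, v)\right]^2}\, du\, dv. \end{aligned}$$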
Observe that our notation on the left side of (8.4.103) for the surface integral of the scalar field
f over the surface S completely suppresses all mention of the particular parametric representation
ΦΦΦ : D → R3 of the surface S. This is exactly as things should be, for we know from Theorem 8.4.1
that the surface integral does not in fact depend on which particular parametric representation of
the surface S is used. Again, we recall that the second of the three expressions on the right of
(8.4.103) is usually the easiest to use in actual calculations of the surface integral. We illustrate
this in Example 8.4.3 which follows. Finally, note that if the function f is such that
f(x, y, z) = 1, for all (x, y, z) in S
then (8.4.103) just reduces to the area formula (8.3.46) as one would expect.
Example 8.4.3. Suppose that S is the helicoid of Example 8.1.10 and the scalar field is
(8.4.104) $$f(x, y, z) := \sqrt{x^2 + y^2 + 1} \quad \text{for all } (x, y, z) \text{ in } \mathbb{R}^3.$$
Determine the surface integral $\int_S f\, dA$.
In Example 8.1.10 we have seen that the helicoid has the parametric representation ΦΦΦ : D → R3 in
which
(8.4.105) $$D = \{(r, \theta) \in \mathbb{R}^2 \mid 0 \le r \le 1 \text{ and } 0 \le \theta \le 2\pi\} = [0, 1] \times [0, 2\pi],$$
and
(8.4.106) $$\boldsymbol{\Phi}(r, \theta) = r\cos(\theta)\,\mathbf{i} + r\sin(\theta)\,\mathbf{j} + \theta\,\mathbf{k}, \quad \text{for all } (r, \theta) \text{ in } D.$$
Of the three equivalent expressions given by (8.4.103) the easiest to use is usually the second
expression, that is
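(8.4.107) $$\int_S f\, dA = \int_D f(\boldsymbol{\Phi}(r, \theta)) \left\| \frac{\partial \boldsymbol{\Phi}}{\partial r}(r, \theta) \times \frac{\partial \boldsymbol{\Phi}}{\partial \theta}(r, \theta) \right\| dr\, d\theta,$$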
in which the generic parametric variable (u, v) of (8.4.103) is replaced with the parametric variable
(r, θ) specific to the helicoid. In Example 8.3.7 (in which we determined the area of the helicoid)
we have already calculated
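(8.4.108) $$\left\| \frac{\partial \boldsymbol{\Phi}}{\partial r}(r, \theta) \times \frac{\partial \boldsymbol{\Phi}}{\partial \theta}(r, \theta) \right\| = \sqrt{1 + r^2}$$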
(see (8.3.81)). Moreover, from (8.4.106) and (8.4.104), we have
(8.4.109) $$f(\boldsymbol{\Phi}(r, \theta)) = \sqrt{r^2\cos^2(\theta) + r^2\sin^2(\theta) + 1} = \sqrt{r^2 + 1}.$$
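Now put (8.4.109) and (8.4.108) into (8.4.107) to get
(8.4.110) $$\begin{aligned} \int_S f\, dA &= \int_D [r^2 + 1]\, dr\, d\theta \\ &= \int_0^{2\pi} \left\{ \int_0^1 [r^2 + 1]\, dr \right\} d\theta \quad \text{(from Remark 2.1.6 and (8.4.105))} \\ &= \int_0^{2\pi} \frac{4}{3}\, d\theta = \frac{8\pi}{3}. \end{aligned}$$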
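As a sanity check on Example 8.4.3, the value $8\pi/3$ can be reproduced numerically without using the simplifications (8.4.108) and (8.4.109). The following sketch is an illustration added here, not part of the original notes: it evaluates the right side of (8.4.107) with a midpoint rule, computing the cross product directly from the partial derivatives of (8.4.106). The grid size and function names are arbitrary choices.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(a[0] ** 2 + a[1] ** 2 + a[2] ** 2)

def phi(r, theta):
    """Helicoid parametrization (8.4.106)."""
    return (r * math.cos(theta), r * math.sin(theta), theta)

def phi_r(r, theta):
    return (math.cos(theta), math.sin(theta), 0.0)

def phi_theta(r, theta):
    return (-r * math.sin(theta), r * math.cos(theta), 1.0)

def f(x, y, z):
    """Scalar field (8.4.104)."""
    return math.sqrt(x * x + y * y + 1.0)

def helicoid_integral(n=200):
    """Midpoint-rule estimate of (8.4.107) over D = [0, 1] x [0, 2*pi]."""
    hr, ht = 1.0 / n, 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * hr
        for j in range(n):
            theta = (j + 0.5) * ht
            n_vec = cross(phi_r(r, theta), phi_theta(r, theta))
            total += f(*phi(r, theta)) * norm(n_vec)
    return total * hr * ht

value = helicoid_integral()
print(value, 8.0 * math.pi / 3.0)
```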
8.5 Surface Integral of a Vector Field
In Section 8.4 we obtained a formula for the surface integral of a scalar function f over a surface S
with a parametric representation ΦΦΦ : D → R3 (see (8.4.103)). In this section our goal is to define
an analogous integral, but this time for a vector field, leading to the surface integral of a vector
field over a given surface. Our construction of this surface integral will very closely parallel the
construction of the surface integral of a scalar field in Section 8.4.
In fact, proceeding exactly as in Section 8.4, suppose that the surface S has the parametric
representation ΦΦΦ : D → R3, in which we again suppose for concreteness that the region D is the
rectangle
(8.5.111) $$D = \{(u, v) \mid a \le u \le b,\ c \le v \le d\} = [a, b] \times [c, d],$$
but now FFF : R3 → R3 is a given continuous vector field. As in Section 8.3 we will divide the region
D into small rectangles ∆D (exactly as at (8.3.38) and (8.3.84)). Then Φ maps ∆D onto the piece
of surface ∆S given by (8.3.85), that is (see Figure 8.10)
(8.5.112) $$\Delta S := \{\boldsymbol{\Phi}(u, v) \mid (u, v) \in \Delta D\},$$
which is an approximate parallelogram with area given by (8.3.86), that is
(8.5.113) $$\operatorname{area} \Delta S \approx \|\mathbf{N}(u_0, v_0)\|\, \Delta u\, \Delta v,$$
in which, from (8.3.87),
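(8.5.114) $$\mathbf{N}(u_0, v_0) := \frac{\partial \boldsymbol{\Phi}}{\partial u}(u_0, v_0) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u_0, v_0).$$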
However, instead of multiplying the area at (8.5.113) by f(ΦΦΦ(u0, v0)), as we did at (8.4.93) in the
construction of the surface integral of a scalar field f , we now multiply this area by the scalar
quantity given by the inner product FFF (ΦΦΦ(u0, v0)) ·nnn(u0, v0), in which nnn(u0, v0) is the unit normal to
the surface ∆S at the point ΦΦΦ(u0, v0) given by
(8.5.115) $$\mathbf{n}(u_0, v_0) := \frac{\mathbf{N}(u_0, v_0)}{\|\mathbf{N}(u_0, v_0)\|} \quad \text{for every } (u_0, v_0) \text{ in } D,$$
(recall (8.2.35)). That is, we calculate
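$$\begin{aligned} \mathbf{F}(\boldsymbol{\Phi}(u_0, v_0)) \cdot \mathbf{n}(u_0, v_0)\, \operatorname{area} \Delta S &\approx \mathbf{F}(\boldsymbol{\Phi}(u_0, v_0)) \cdot \mathbf{n}(u_0, v_0)\, \|\mathbf{N}(u_0, v_0)\|\, \Delta u\, \Delta v \\ &\approx \mathbf{F}(\boldsymbol{\Phi}(u_0, v_0)) \cdot \mathbf{N}(u_0, v_0)\, \Delta u\, \Delta v, \end{aligned}$$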
where the first ≈ follows from (8.5.113) and the second ≈ follows from (8.5.115). We therefore have
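(8.5.116) $$\mathbf{F}(\boldsymbol{\Phi}(u_0, v_0)) \cdot \mathbf{n}(u_0, v_0)\, \operatorname{area} \Delta S \approx \mathbf{F}(\boldsymbol{\Phi}(u_0, v_0)) \cdot \mathbf{N}(u_0, v_0)\, \Delta u\, \Delta v.$$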
What is the significance of the quantity on the left side of (8.5.116)? To get a sense of this suppose
that we have electric charge moving through space, and that at each point (x, y, z) in R3 the vector
FFF (x, y, z) represents the current density at (x, y, z). In the notation of Example 3.1.6, where the
current density was introduced, we should really write JJJ(x, y, z) instead of FFF (x, y, z) for the current
density, but we will continue to use FFF (x, y, z). We know from Example 3.1.6 that the quantity on
the left of (8.5.116) is the total current passing through the surface ∆S. Of course, we would like
to determine the total current passing through the whole surface S, and this means “adding up”
or “integrating” the currents passing through the small surfaces ∆S given by (8.5.116). To this
end we proceed just as we did in Section 8.4. Exactly as at (8.3.45), as ∆u and ∆v shrink to the
infinitesimals du and dv the small rectangle ∆D in Figure 8.10 with lower left corner given by
(u0, v0) shrinks to the infinitesimal rectangle dD (still with lower left corner given by (u0, v0)), the
piece of surface ∆S shrinks to the infinitesimal parallelogram dS (still with one corner “anchored”
at the point A given by ΦΦΦ(u0, v0) as in Figure 8.10), and the approximation at (8.5.116) becomes
exact, so that
(8.5.117) $$\mathbf{F}(\boldsymbol{\Phi}(u_0, v_0)) \cdot \mathbf{n}(u_0, v_0)\, \operatorname{area} dS = \mathbf{F}(\boldsymbol{\Phi}(u_0, v_0)) \cdot \mathbf{N}(u_0, v_0)\, du\, dv.$$
We now “sum” or integrate the elemental quantities at (8.5.117) as the infinitesimal rectangles dD
cover the whole rectangle D to get the quantity
(8.5.118) $$\int_D \mathbf{F}(\boldsymbol{\Phi}(u, v)) \cdot \mathbf{N}(u, v)\, du\, dv.$$
This quantity is known as the surface integral of the vector field FFF over the surface S. We note that
the surface integral of a vector field, like the surface integral of a scalar field, is a scalar quantity
(and definitely not a vector quantity!). If we recall the particular interpretation of FFF (x, y, z) being
the current density at any point (x, y, z) on the surface then it is clear that the surface integral at
(8.5.118) gives the total current passing through the surface S.
In view of (8.2.28) (but replacing (u0, v0) with (u, v)) we can equally well write the surface
integral at (8.5.118) as
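(8.5.119) $$\int_D \mathbf{F}(\boldsymbol{\Phi}(u, v)) \cdot \left[ \frac{\partial \boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u, v) \right] du\, dv$$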
in the sense that all the integrals at (8.5.118) and (8.5.119) are equal. The integral at (8.5.119) is
typically the most convenient for actual calculations, and this is what we shall use from now on.
A question very similar to that addressed in Remark 8.3.2 for the area formula, and addressed
by Theorem 8.4.1 for surface integrals of scalar fields, of course also arises in connection with surface
integrals of vector fields, namely is the surface integral of a vector field independent of the particular
parametrization that we choose for the surface S? That this is indeed the case is guaranteed by the
following analog of Theorem 8.4.1 which we again state without proof:
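Theorem 8.5.1. Suppose that $\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^3$ is a continuous vector field, and that
(8.5.120) $$\boldsymbol{\Phi} : D \to \mathbb{R}^3 \quad \text{and} \quad \tilde{\boldsymbol{\Phi}} : \tilde{D} \to \mathbb{R}^3$$
are alternative $C^1$-parametric representations of a surface S, so that
(8.5.121) $$S := \{\boldsymbol{\Phi}(u, v) \in \mathbb{R}^3 \mid (u, v) \in D\} \quad \text{and} \quad S := \{\tilde{\boldsymbol{\Phi}}(\tilde{u}, \tilde{v}) \in \mathbb{R}^3 \mid (\tilde{u}, \tilde{v}) \in \tilde{D}\},$$
i.e. $\boldsymbol{\Phi}(u, v)$ traverses S as $(u, v)$ traverses the region $D$, and likewise $\tilde{\boldsymbol{\Phi}}(\tilde{u}, \tilde{v})$ traverses the same
surface S as $(\tilde{u}, \tilde{v})$ traverses the region $\tilde{D}$. Then
(8.5.122) $$\int_D \mathbf{F}(\boldsymbol{\Phi}(u, v)) \cdot \left[ \frac{\partial \boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u, v) \right] du\, dv = \int_{\tilde{D}} \mathbf{F}(\tilde{\boldsymbol{\Phi}}(\tilde{u}, \tilde{v})) \cdot \left[ \frac{\partial \tilde{\boldsymbol{\Phi}}}{\partial \tilde{u}}(\tilde{u}, \tilde{v}) \times \frac{\partial \tilde{\boldsymbol{\Phi}}}{\partial \tilde{v}}(\tilde{u}, \tilde{v}) \right] d\tilde{u}\, d\tilde{v}.$$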
Remark 8.5.2. Exactly as for the case of surface integrals of a scalar field (see Remark 8.4.2) a
variety of notations are commonly encountered in the literature to denote the surface integral of a
vector field FFF over some surface S, namely
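(8.5.123) $$\int_S \mathbf{F}(x, y, z) \cdot d\mathbf{A}, \quad \int_S \mathbf{F}(x, y, z) \cdot d\boldsymbol{\sigma}, \quad \text{and} \quad \int_S \mathbf{F}(x, y, z) \cdot d\mathbf{S}$$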
as well as
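(8.5.124) $$\int_S \mathbf{F} \cdot d\mathbf{A}, \quad \int_S \mathbf{F} \cdot d\boldsymbol{\sigma}, \quad \text{and} \quad \int_S \mathbf{F} \cdot d\mathbf{S}.$$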
In all these notations the essential elements are the subscript S of the integral, indicating the
surface over which one integrates, and of course the integrand FFF , which indicates the vector field
being integrated. The notations at (8.5.123) are quite explicit, and remind us that we are integrating
over S with respect to an underlying space variable in R3 generically denoted by (x, y, z). As is
the case with surface integrals of scalar fields, it can often be rather tedious to keep carrying the
space variable argument (x, y, z), and this is the reason for introducing the notations at (8.5.124),
in which the space variable is suppressed (but always understood to be present!). In this course we
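shall typically use the first of the notations at (8.5.124), so that we write
(8.5.125) $$\begin{aligned} \int_S \mathbf{F} \cdot d\mathbf{A} &:= \int_D \mathbf{F}(\boldsymbol{\Phi}(u, v)) \cdot \left[ \frac{\partial \boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u, v) \right] du\, dv \\ &= \int_D \mathbf{F}(\boldsymbol{\Phi}(u, v)) \cdot \mathbf{N}(u, v)\, du\, dv, \end{aligned}$$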
for the integrals at (8.5.119) and (8.5.118). Observe that our notation on the left side of (8.5.125)
for the surface integral of the vector field FFF over the surface S completely suppresses all mention
of the particular parametric representation ΦΦΦ : D → R3 of the surface S. This is exactly as things
should be, for we know from Theorem 8.5.1 that the surface integral does not in fact depend on
which particular parametric representation of the surface S is used. Finally, we recall that the first
of the two expressions on the right of (8.5.125) is usually the easiest to use in actual calculations
of the surface integral.
Remark 8.5.3. Just as in Remark 8.3.4 suppose that the surface S has the special form of the
graph of a function
(8.5.126) f : D → R,
(as in Example 8.1.2 and Remark 8.1.6). In this case we can simplify the general formula for the
integral of a vector field FFF : R3 → R3 over the surface S given by (8.5.125), that is
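(8.5.127) $$\int_S \mathbf{F} \cdot d\mathbf{A} := \int_D \mathbf{F}(\boldsymbol{\Phi}(u, v)) \cdot \left[ \frac{\partial \boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u, v) \right] du\, dv.$$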
From Remark 8.1.6 we know that the surface S has the parametric representation ΦΦΦ : D → R3 in
which
(8.5.128) ΦΦΦ(u, v) := uiii+ vjjj + f(u, v)kkk for all (u, v) in D,
(c.f. (8.1.9)), and for ΦΦΦ given by (8.5.128) we have already seen that
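(8.5.129) $$\frac{\partial \boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u, v) = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 0 & \frac{\partial f}{\partial u}(u, v) \\ 0 & 1 & \frac{\partial f}{\partial v}(u, v) \end{vmatrix} = -\frac{\partial f}{\partial u}(u, v)\,\mathbf{i} - \frac{\partial f}{\partial v}(u, v)\,\mathbf{j} + \mathbf{k},$$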
(see (8.3.57)). Writing FFF in the scalar component form (3.2.17), that is
(8.5.130) FFF (x, y, z) = F1(x, y, z)iii+ F2(x, y, z)jjj + F3(x, y, z)kkk, for all (x, y, z) in R3,
one sees from (8.5.130) and (8.5.128) that
(8.5.131) FFF (ΦΦΦ(u, v)) = F1(u, v, f(u, v))iii+ F2(u, v, f(u, v))jjj + F3(u, v, f(u, v))kkk,
for all (u, v) in D (recall (8.5.126)). From (8.5.131) and (8.5.129) we get
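(8.5.132) $$\begin{aligned} \mathbf{F}(\boldsymbol{\Phi}(u, v)) \cdot \left[ \frac{\partial \boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u, v) \right] = {} & \left[ F_1(u, v, f(u, v))\,\mathbf{i} + F_2(u, v, f(u, v))\,\mathbf{j} + F_3(u, v, f(u, v))\,\mathbf{k} \right] \\ & \cdot \left[ -\frac{\partial f}{\partial u}(u, v)\,\mathbf{i} - \frac{\partial f}{\partial v}(u, v)\,\mathbf{j} + \mathbf{k} \right]. \end{aligned}$$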
Multiplying out the right side of (8.5.132) and inserting into the integrand on the right of (8.5.127)
we obtain
(8.5.133) $$\int_S \mathbf{F} \cdot d\mathbf{A} = \int_D \left[ F_3(u, v, f(u, v)) - F_1(u, v, f(u, v)) \frac{\partial f}{\partial u}(u, v) - F_2(u, v, f(u, v)) \frac{\partial f}{\partial v}(u, v) \right] du\, dv.$$
This gives the surface integral of FFF over S directly in terms of the scalar components of FFF (see
(8.5.130)) and the function f whose graph defines the surface S. We emphasize that the formula
(8.5.133) is applicable only when the surface S is the graph of a function f , as at (8.5.126). When
S is not the graph of a function then we must resort to the more general expression (8.5.127) when
we evaluate the surface integral of a vector field FFF . In fact, I always prefer to use (8.5.127), even
when S is the graph of a function f , and completely avoid the use of (8.5.133).
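The agreement between the graph formula (8.5.133) and the general formula (8.5.127) can be confirmed on a concrete case. The sketch below is illustrative only: the function $f(u, v) = u^2 + v$ on $D = [0, 1] \times [0, 1]$ and the vector field $\mathbf{F}(x, y, z) = x\,\mathbf{i} + yz\,\mathbf{j} + z^2\,\mathbf{k}$ are assumptions made for the example, not taken from the notes. Both integrands are evaluated on the same midpoint grid, so the two sums should agree to rounding error.

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

# Illustrative graph function f and its partial derivatives.
def f(u, v):
    return u * u + v

def f_u(u, v):
    return 2.0 * u

def f_v(u, v):
    return 1.0

# Illustrative vector field F = (F1, F2, F3).
def F(x, y, z):
    return (x, y * z, z * z)

def flux_general(n=100):
    """Midpoint rule for (8.5.127) with Phi(u, v) = (u, v, f(u, v))."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        for j in range(n):
            v = (j + 0.5) * h
            phi_u = (1.0, 0.0, f_u(u, v))
            phi_v = (0.0, 1.0, f_v(u, v))
            total += dot(F(u, v, f(u, v)), cross(phi_u, phi_v))
    return total * h * h

def flux_graph(n=100):
    """Midpoint rule for the graph formula (8.5.133)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        for j in range(n):
            v = (j + 0.5) * h
            F1, F2, F3 = F(u, v, f(u, v))
            total += F3 - F1 * f_u(u, v) - F2 * f_v(u, v)
    return total * h * h

general_value = flux_general()
graph_value = flux_graph()
print(general_value, graph_value)
```

Since (8.5.133) is just (8.5.127) multiplied out, nothing is lost by always working from the general formula, as the text recommends.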
Remark 8.5.4. The surface integral
$$\int_S \mathbf{F} \cdot d\mathbf{A}$$
of a vector field FFF over a surface S that we have constructed is called the flux of the vector field
FFF through the surface S, and has the interpretation of the “aggregate flow” of the vector field FFF
through the surface S. Is there a more precise “physical interpretation” of the flux? This depends
on the physical meaning of the vector field FFF . We have already noted that when FFF (x, y, z) is
identified with the current density $\mathbf{J}(x, y, z)$ then the surface integral
$$\int_S \mathbf{J} \cdot d\mathbf{A}$$
(i.e. the flux of the current density JJJ over the surface S) gives the total current passing through
the surface S. Of course, this current should not depend in any way on the particular parametric
representation we use for the surface S, and Theorem 8.5.1 guarantees this. Later, when we come
to Maxwell’s equations, we shall need the surface integrals of the electric field EEE(x, y, z) and the
magnetic field $\mathbf{B}(x, y, z)$ over a surface S, that is
$$\int_S \mathbf{E} \cdot d\mathbf{A} \quad \text{and} \quad \int_S \mathbf{B} \cdot d\mathbf{A}.$$
These quantities are known, respectively, as the electric flux of the electric field EEE and the magnetic
flux of the magnetic field BBB through the surface S.
Example 8.5.5. A surface S in R3 has the parametric representation ΦΦΦ : D → R3 defined by
(8.5.134) $$D = \{(u, v) \in \mathbb{R}^2 \mid 0 \le u \le 2 \text{ and } 0 \le v \le 3\} = [0, 2] \times [0, 3],$$
and
(8.5.135) $$\boldsymbol{\Phi}(u, v) = u\,\mathbf{i} + u^2\,\mathbf{j} + v\,\mathbf{k}, \quad \text{for all } (u, v) \text{ in } D.$$
Current flows through the surface S, with a current density given by
(8.5.136) $$\mathbf{J}(x, y, z) = 3z^2\,\mathbf{i} + 6\,\mathbf{j} + 6xz\,\mathbf{k}, \quad \text{for all } (x, y, z) \text{ in } \mathbb{R}^3.$$
Determine the total current passing through the surface S.
From (8.5.125), but with JJJ in place of FFF , we get
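(8.5.137) $$\int_S \mathbf{J} \cdot d\mathbf{A} := \int_D \mathbf{J}(\boldsymbol{\Phi}(u, v)) \cdot \left[ \frac{\partial \boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u, v) \right] du\, dv,$$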
so evaluation of the current through S is just a matter of evaluating the integral on the right side
of (8.5.137). Note that we can write (8.5.135) in the form
(8.5.138) ΦΦΦ(u, v) = x(u, v)iii+ y(u, v)jjj + z(u, v)kkk,
for
(8.5.139) $$x(u, v) := u, \quad y(u, v) := u^2, \quad z(u, v) := v.$$
From (8.5.139), (8.5.138) and (8.5.136), we obtain
(8.5.140) $$\mathbf{J}(\boldsymbol{\Phi}(u, v)) = 3z^2(u, v)\,\mathbf{i} + 6\,\mathbf{j} + 6x(u, v)z(u, v)\,\mathbf{k} = 3v^2\,\mathbf{i} + 6\,\mathbf{j} + 6uv\,\mathbf{k}.$$
Taking u-partial derivatives and v-partial derivatives of the scalar components at (8.5.135) gives
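(8.5.141) $$\frac{\partial \boldsymbol{\Phi}}{\partial u}(u, v) = 1\,\mathbf{i} + 2u\,\mathbf{j} + 0\,\mathbf{k}, \qquad \frac{\partial \boldsymbol{\Phi}}{\partial v}(u, v) = 0\,\mathbf{i} + 0\,\mathbf{j} + 1\,\mathbf{k},$$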
so that from (8.5.141) we get
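(8.5.142) $$\frac{\partial \boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u, v) = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 2u & 0 \\ 0 & 0 & 1 \end{vmatrix} = 2u\,\mathbf{i} - \mathbf{j} + 0\,\mathbf{k}.$$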
From (8.5.142) and (8.5.140)
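(8.5.143) $$\mathbf{J}(\boldsymbol{\Phi}(u, v)) \cdot \left[ \frac{\partial \boldsymbol{\Phi}}{\partial u}(u, v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u, v) \right] = (3v^2\,\mathbf{i} + 6\,\mathbf{j} + 6uv\,\mathbf{k}) \cdot (2u\,\mathbf{i} - \mathbf{j} + 0\,\mathbf{k}) = 6(uv^2 - 1).$$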
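From (8.5.143) and (8.5.137), the total current passing through surface S is given by
(8.5.144) $$\begin{aligned} \int_S \mathbf{J} \cdot d\mathbf{A} &= 6 \int_D (uv^2 - 1)\, du\, dv \\ &= 6 \int_0^3 \left\{ \int_0^2 (uv^2 - 1)\, du \right\} dv \quad \text{(from (8.5.134) and Remark 2.1.6)} \\ &= 72. \end{aligned}$$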
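The result of Example 8.5.5 can be cross-checked by evaluating the right side of (8.5.137) numerically. The sketch below is an added illustration, not part of the original notes. It applies a midpoint rule on $D = [0, 2] \times [0, 3]$ and forms the cross product from the partial derivatives (8.5.141) rather than using the simplified integrand (8.5.143). The grid size and function names are arbitrary choices.

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def J(x, y, z):
    """Current density (8.5.136)."""
    return (3.0 * z * z, 6.0, 6.0 * x * z)

def phi(u, v):
    """Parametric representation (8.5.135)."""
    return (u, u * u, v)

def phi_u(u, v):
    return (1.0, 2.0 * u, 0.0)

def phi_v(u, v):
    return (0.0, 0.0, 1.0)

def total_current(n=200):
    """Midpoint-rule estimate of (8.5.137) over D = [0, 2] x [0, 3]."""
    hu, hv = 2.0 / n, 3.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * hu
        for j in range(n):
            v = (j + 0.5) * hv
            total += dot(J(*phi(u, v)), cross(phi_u(u, v), phi_v(u, v)))
    return total * hu * hv

current = total_current()
print(current)  # should be close to the exact value 72
```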
Example 8.5.6. In Example 6.2.3 we saw the following: If a single point charge Q is located at
the origin of R3 (recall Example 3.1.3), then the electric field at any point (x, y, z) is given by
(8.5.145) $$\mathbf{E}(x, y, z) = \frac{Q}{4\pi\varepsilon_0 [x^2 + y^2 + z^2]^{3/2}}\,(x\,\mathbf{i} + y\,\mathbf{j} + z\,\mathbf{k})$$
provided that (x, y, z) is not at the origin of R3 (see (6.2.12)). Suppose that the surface S is the top
half of the sphere of radius r centered at the origin of R3, exactly as at Example 8.3.5. Determine
the electric flux through the surface S.
From Remark 8.5.4 we see that we must compute the surface integral
(8.5.146) $$\int_S \mathbf{E} \cdot d\mathbf{A}.$$
In Example 8.3.5 we saw that there are at least two parametric representations of the surface
S, namely a “Cartesian representation”, in terms of the graph of the function f(x, y) given by
(8.3.60) defined for all (x, y) in the disc D given by (8.3.61), and a “polar representation”, given
by ΦΦΦ : D → R3 in which D is the rectangle defined by (8.3.66) and ΦΦΦ is the function defined by
(8.3.67). We know from Theorem 8.5.1 that we can use either of these parametric representations
of S to compute the surface integral (8.5.146), so the question arises which is the better (in the
sense of involving least work) of the two representations to use? We saw in Example 8.3.5 that
the amount of effort involved in computing the area depends dramatically on which parametric
representation is used, so we can expect much the same thing when we compute (8.5.146). We
know that the electric field from a point charge Q at the origin of R3 is radially symmetric around
the origin. Although this is not readily apparent from the Cartesian formulation of the electric
field given by (8.5.145) it is immediate from the description of the field in Example 3.1.3. Since the
parametric representation of the surface S given by (8.3.66) and (8.3.67) is also in a sense radially
symmetric let us try to calculate the surface integral using this representation of S, which we repeat
for convenience as follows:
(8.5.147) $$D = \{(\theta, \phi) \in \mathbb{R}^2 \mid 0 \le \theta \le 2\pi \text{ and } 0 \le \phi \le \pi/2\} = [0, 2\pi] \times [0, \pi/2],$$
and
(8.5.148) ΦΦΦ(θ, φ) = x(θ, φ)iii+ y(θ, φ)jjj + z(θ, φ)kkk, for all (θ, φ) in D,
for
(8.5.149) x(θ, φ) := r sin(φ) cos(θ), y(θ, φ) := r sin(φ) sin(θ), z(θ, φ) := r cos(φ).
(c.f. (8.3.66) and (8.3.67)). From (8.5.125), but with EEE in place of FFF and replacing the generic
parametric variables (u, v) in (8.5.125) with the parametric variables (θ, φ) of (8.5.147) - (8.5.149),
we get
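(8.5.150) $$\int_S \mathbf{E} \cdot d\mathbf{A} := \int_D \mathbf{E}(\boldsymbol{\Phi}(\theta, \phi)) \cdot \left[ \frac{\partial \boldsymbol{\Phi}}{\partial \theta}(\theta, \phi) \times \frac{\partial \boldsymbol{\Phi}}{\partial \phi}(\theta, \phi) \right] d\theta\, d\phi,$$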
so that evaluation of the electric flux through S reduces to evaluating the integral on the right side
of (8.5.150). In Example 8.3.5 we have calculated
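(8.5.151) $$\frac{\partial \boldsymbol{\Phi}}{\partial \theta}(\theta, \phi) \times \frac{\partial \boldsymbol{\Phi}}{\partial \phi}(\theta, \phi) = -r^2 \sin(\phi) \left\{ \sin(\phi)\cos(\theta)\,\mathbf{i} + \sin(\phi)\sin(\theta)\,\mathbf{j} + \cos(\phi)\,\mathbf{k} \right\},$$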
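(c.f. (8.3.72)). We next determine $\mathbf{E}(\boldsymbol{\Phi}(\theta, \phi))$. From (8.5.148) and (8.5.145) we have
(8.5.152) $$\begin{aligned} \mathbf{E}(\boldsymbol{\Phi}(\theta, \phi)) &= \mathbf{E}(x(\theta, \phi), y(\theta, \phi), z(\theta, \phi)) \\ &= \frac{Q}{4\pi\varepsilon_0 [x^2(\theta, \phi) + y^2(\theta, \phi) + z^2(\theta, \phi)]^{3/2}} \left\{ x(\theta, \phi)\,\mathbf{i} + y(\theta, \phi)\,\mathbf{j} + z(\theta, \phi)\,\mathbf{k} \right\}, \end{aligned}$$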
and from (8.5.149) we get
(8.5.153) $$\begin{aligned} x^2(\theta, \phi) + y^2(\theta, \phi) + z^2(\theta, \phi) &= r^2\sin^2(\phi)\cos^2(\theta) + r^2\sin^2(\phi)\sin^2(\theta) + r^2\cos^2(\phi) \\ &= r^2. \end{aligned}$$
Now put (8.5.153) and (8.5.149) into (8.5.152) to get
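(8.5.154) $$\mathbf{E}(\boldsymbol{\Phi}(\theta, \phi)) = \frac{Q}{4\pi\varepsilon_0 r^3} \left\{ r\sin(\phi)\cos(\theta)\,\mathbf{i} + r\sin(\phi)\sin(\theta)\,\mathbf{j} + r\cos(\phi)\,\mathbf{k} \right\},$$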
and then, from (8.5.154) and (8.5.151) we find
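(8.5.155) $$\begin{aligned} \mathbf{E}(\boldsymbol{\Phi}(\theta, \phi)) \cdot \left[ \frac{\partial \boldsymbol{\Phi}}{\partial \theta}(\theta, \phi) \times \frac{\partial \boldsymbol{\Phi}}{\partial \phi}(\theta, \phi) \right] &= \frac{Q}{4\pi\varepsilon_0 r^3}\,(r) \left\{ \sin(\phi)\cos(\theta)\,\mathbf{i} + \sin(\phi)\sin(\theta)\,\mathbf{j} + \cos(\phi)\,\mathbf{k} \right\} \\ &\qquad \cdot\, (-r^2\sin(\phi)) \left\{ \sin(\phi)\cos(\theta)\,\mathbf{i} + \sin(\phi)\sin(\theta)\,\mathbf{j} + \cos(\phi)\,\mathbf{k} \right\} \\ &= -\frac{Q\sin(\phi)}{4\pi\varepsilon_0} \left\{ \sin(\phi)\cos(\theta)\,\mathbf{i} + \sin(\phi)\sin(\theta)\,\mathbf{j} + \cos(\phi)\,\mathbf{k} \right\} \\ &\qquad \cdot \left\{ \sin(\phi)\cos(\theta)\,\mathbf{i} + \sin(\phi)\sin(\theta)\,\mathbf{j} + \cos(\phi)\,\mathbf{k} \right\} \\ &= -\frac{Q\sin(\phi)}{4\pi\varepsilon_0} \left\{ \sin^2(\phi)\cos^2(\theta) + \sin^2(\phi)\sin^2(\theta) + \cos^2(\phi) \right\} \\ &= -\frac{Q\sin(\phi)}{4\pi\varepsilon_0} \left\{ \sin^2(\phi) + \cos^2(\phi) \right\} \\ &= -\frac{Q\sin(\phi)}{4\pi\varepsilon_0}. \end{aligned}$$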
From (8.5.155) and (8.5.150)
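(8.5.156) $$\int_S \mathbf{E} \cdot d\mathbf{A} = -\frac{Q}{4\pi\varepsilon_0} \int_D \sin(\phi)\, d\theta\, d\phi = -\frac{Q}{4\pi\varepsilon_0} \int_0^{2\pi} \left\{ \int_0^{\pi/2} \sin(\phi)\, d\phi \right\} d\theta = -\frac{Q}{2\varepsilon_0},$$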
where we have used the rectangular form of D (see (8.5.147)) and the Fubini theorem (see Remark
2.1.6) at the second equality of (8.5.156).
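The value $-Q/(2\varepsilon_0)$ in (8.5.156) can also be reproduced numerically from (8.5.150). The following sketch is an added illustration, not part of the original notes. It uses the parametrization (8.5.147) to (8.5.149) with a midpoint rule and evaluates the field (8.5.145) and the cross product directly. The normalized values $Q = 1$, $\varepsilon_0 = 1$ and the radius $r = 2$ are assumptions made purely for the check (the result should not depend on the radius).

```python
import math

Q = 1.0      # assumed charge (normalized units)
EPS0 = 1.0   # assumed permittivity (normalized units)
R = 2.0      # assumed sphere radius; the flux should not depend on it

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def E(x, y, z):
    """Electric field (8.5.145) of a point charge at the origin."""
    c = Q / (4.0 * math.pi * EPS0 * (x * x + y * y + z * z) ** 1.5)
    return (c * x, c * y, c * z)

def phi(theta, ph):
    """Parametrization (8.5.148)-(8.5.149) of the upper hemisphere."""
    return (R * math.sin(ph) * math.cos(theta),
            R * math.sin(ph) * math.sin(theta),
            R * math.cos(ph))

def phi_theta(theta, ph):
    return (-R * math.sin(ph) * math.sin(theta),
            R * math.sin(ph) * math.cos(theta),
            0.0)

def phi_phi(theta, ph):
    return (R * math.cos(ph) * math.cos(theta),
            R * math.cos(ph) * math.sin(theta),
            -R * math.sin(ph))

def electric_flux(n=200):
    """Midpoint-rule estimate of (8.5.150) over D = [0, 2*pi] x [0, pi/2]."""
    ht, hp = 2.0 * math.pi / n, 0.5 * math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * ht
        for j in range(n):
            ph = (j + 0.5) * hp
            n_vec = cross(phi_theta(theta, ph), phi_phi(theta, ph))
            total += dot(E(*phi(theta, ph)), n_vec)
    return total * ht * hp

flux = electric_flux()
print(flux, -Q / (2.0 * EPS0))
```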
Remark 8.5.7. In Chapter 5 we defined line integrals, and in the present chapter we have devoted
considerable effort to the definition of surface integrals. Why all this effort on such seemingly
strange integrals? In this remark we are going to see how these integrals are absolutely essential
for stating one of the most fundamental and bedrock laws of physics namely Ampere’s circuital
law. From elementary physics one is familiar with the qualitative phenomenon that a current i
passing through a conductor causes a magnetic field vector BBB(x, y, z) at all points (x, y, z) in the
space surrounding the conductor. The question arises: is there any quantitative (or mathematical)
relationship between the current i and the magnetic vector field BBB that it causes? Suppose that
i is a time-constant current flowing through a long and very thin metallic conductor, and fix any
simple closed curve Γ which “loops” just once around the conductor. Assign to Γ a direction in
accordance with the usual right hand rule i.e. if the thumb of the right hand is aligned along the
conductor in the direction of the current then Γ is assigned the direction of the forefingers as shown
in Figure 8.11. It has been determined by experiment that under these conditions we always have
Figure 8.11: Current i and closed curve Γ for Ampere’s law
(8.5.157) $$\int_\Gamma \mathbf{B} \cdot d\mathbf{r} = \mu_0 i,$$
in which the quantity on the left of (8.5.157) is the usual line integral of the magnetic field $\mathbf{B}$
around the closed curve Γ, as defined in Section 5.1, and µ0 is a constant called the magnetic
permeability of free space. This physical law is known as Ampere’s circuital law. Notice how this
law is naturally stated in terms of line integrals, and notice also its great generality; the relation
(8.5.157) holds for every imaginable closed curve Γ looping around the current-carrying conductor.
Despite its generality the circuital law in the form of (8.5.157) has the disadvantage that the cause
of the magnetic field is assumed to be current through a conductor. From the point of view of
electromagnetism it is actually much more useful to have a circuital law in which the cause of the
magnetic field is not a current through a conductor but rather a current density arising from the
diffuse or distributed movement of charge through space (recall Example 3.1.6). This is because, in
contrast to electrical circuits where one always deals with currents passing through conductors, in
electromagnetism it is much more natural to deal with charge which moves diffusely through space
rather than in concentrated fashion along the narrow confines of a conductor. Suppose therefore
that we have a diffuse movement of charge through space given by a time-constant current density
vector field JJJ , as in Example 3.1.6, with domain D = R3 for simplicity. That is, at each (x, y, z) in
R3 the current density is JJJ(x, y, z), and moreover the current density at each (x, y, z) does not vary
with time. It is a basic physical fact that this diffuse movement of charge causes a magnetic field
BBB(x, y, z) at each point (x, y, z) (much as a current flowing through a conductor causes a magnetic
field) and we would like to quantify the relationship between the current density vector field JJJ and
magnetic vector field BBB that it creates. To state this relationship fix some surface S in R3 with
boundary curve Γ, as shown in Figure 8.12. We emphasize that S is a purely theoretical surface
Figure 8.12: Current density JJJ and surface S for Ampere’s law
that leaves the movement of charge completely unaffected, and is not in any sense a physical surface
or barrier which impedes or disturbs the movement of charge described by the current density JJJ .
It has been determined by experiment that under these conditions the vector fields JJJ and BBB are
always related by
(8.5.158) $$\int_\Gamma \mathbf{B} \cdot d\mathbf{r} = \mu_0 \int_S \mathbf{J} \cdot d\mathbf{A}.$$
Exactly as at (8.5.157) the quantity on the left of (8.5.158) is the line integral of the magnetic
field BBB around the closed curve Γ defined in Section 5.1. As for the quantity on the right side of
(8.5.158), this is of course the surface integral of the current density vector field over the surface
S that has been defined in this section. This physical law is also known as Ampere’s circuital law,
and it is an essential halfway-house in getting to Maxwell’s equations of electromagnetism, as we
shall see in later chapters. Notice how indispensable both line integrals and surface integrals are
in the statement of this basic physical law. Notice also the universality built into the statement at
(8.5.158), namely this relation between JJJ and BBB holds for every possible choice of the finite open
surface S with boundary Γ. Finally notice that, for a given surface S, the surface integral on the
right of (8.5.158) is nothing but the total current flowing through the surface S, as we have already
seen. In this sense there is consistency between the circuital laws at (8.5.158) and (8.5.157).
Remark 8.5.8. In this remark we are going to state another bedrock law of physics, namely
Faraday’s law of electromagnetic induction. Exactly as with Ampere’s circuital law of Remark
8.5.7 we shall see that surface integrals and line integrals are completely indispensable for the very
formulation and meaning of Faraday’s law. Suppose that BBB is a time varying magnetic field, that is
at each point (x, y, z) in R3 the magnetic field vector is BBB(t, x, y, z) for each instant t, and generally
changes as t changes. One could for example obtain such a time varying magnetic field by moving
a permanent magnet through space. We briefly noted in Remark 3.2.4 the possibility of vector
fields which can vary not just through space but also with time, but until now we have been concerned
with fields that vary only through space and do not depend on time. Here we absolutely must deal
with fields which change not just through space but also with time, for we are going to see that
the essential element in Faraday’s law is the time-changing magnetic field BBB. It turns out that
this poses no serious difficulties; the mathematical tools we have developed for fields which depend
only on space are easily extended and adapted to fields which depend on time as well as space.
In essence Faraday’s law of electromagnetic induction states that a time varying magnetic field BBB
Figure 8.13: Magnetic field BBB and electric field EEE for Faraday’s law
causes a time varying electric field EEE, that is at each point (x, y, z) in R3 we get an electric field
EEE(t, x, y, z) which also varies with time t. Naturally we would like a quantitative or mathematical
relation between the time varying fields BBB and EEE, and this we state next. Fix some surface S in
R3 with boundary curve Γ, as shown in Figure 8.13, and define the flux of the magnetic field BBB (or
magnetic flux) through the surface S at each instant t by
(8.5.159) $$\Phi_{\mathrm{mag}}(t) := \int_S \mathbf{B} \cdot d\mathbf{A},$$
(recall Remark 8.5.4). Before going any further we should note a seeming oddity of the notation
at (8.5.159), namely the left side indicates a clear dependence on the time t but there is no corre-
sponding mention of t on the right side of (8.5.159). This is happening because on the right side we
are suppressing all variables in the notation for the surface integral. In fact, for each t the quantity
on the right is understood to mean the surface integral over S of the vector field
(8.5.160) FFF (x, y, z) := BBB(t, x, y, z), for all (x, y, z) in R3,
obtained from BBB by keeping t fixed. The t-dependence implicit in the right side of (8.5.159) can be
made quite explicit if we fix some parametric representation ΦΦΦ : D → R3 of the surface S (recall
Definition 8.1.4). Then the surface integral of any vector field FFF : R3 → R3 over S is given in terms
of the parametric representation by
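$$\int_S \mathbf{F} \cdot d\mathbf{A} = \int_D \mathbf{F}(\boldsymbol{\Phi}(u,v)) \cdot \left[ \frac{\partial \boldsymbol{\Phi}}{\partial u}(u,v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u,v) \right] du\,dv, \tag{8.5.161}$$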
(recall (8.5.125)). For each t the quantity on the right side of (8.5.159) is then given by (8.5.161)
with FFF defined by (8.5.160), that is
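$$\int_S \mathbf{B} \cdot d\mathbf{A} = \int_D \mathbf{B}(t,\boldsymbol{\Phi}(u,v)) \cdot \left[ \frac{\partial \boldsymbol{\Phi}}{\partial u}(u,v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u,v) \right] du\,dv, \tag{8.5.162}$$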
for each t. The t-dependence of the quantity on the right side of (8.5.159) is now clearly apparent
from the right side of (8.5.162). Having cleared up the interpretation of (8.5.159) we can state
Faraday’s law of electromagnetic induction in full as follows: the electric field EEE caused by the time
varying magnetic field BBB always satisfies the relation
$$\int_\Gamma \mathbf{E} \cdot d\mathbf{r} = -\frac{d\Phi_{\mathrm{mag}}(t)}{dt}, \tag{8.5.163}$$
for all t. As with Ampere’s circuital law (8.5.158) one should note the universality incorporated
in (8.5.163) (partnered with the definition (8.5.159)), namely this relation holds regardless of how
one chooses the surface S with boundary Γ. Another point to notice is that we have a notational
peculiarity at (8.5.163) not unlike that which we saw at (8.5.159), that is the right side of (8.5.163)
clearly depends on time t but no such dependence on t is explicitly indicated in the line integral
on the left side of (8.5.163). Again this t-dependence is certainly present but “hidden” because all
variables are suppressed in the notation for the line integral on the left of (8.5.163). To make this
t-dependence explicit fix some parametric representation
(8.5.164) γγγ : [a, b]→ R3
of the closed curve Γ (see Definition 4.2.2). From (5.1.13) we recall that the line integral of any
vector field FFF : R3 → R3 along Γ is always given by
$$\int_\Gamma \mathbf{F} \cdot d\mathbf{r} = \int_a^b \mathbf{F}(\boldsymbol{\gamma}(u)) \cdot \boldsymbol{\gamma}^{(1)}(u)\,du. \tag{8.5.165}$$
For each t the quantity on the left of (8.5.163) is understood to mean the line integral along Γ of
the vector field
(8.5.166) FFF (x, y, z) := EEE(t, x, y, z), for all (x, y, z) in R3,
obtained from EEE by keeping t fixed (much as was the case at (8.5.160), which we used to unravel the
hidden t-dependence on the right side of (8.5.159)). From (8.5.165), with FFF defined by (8.5.166) at
each t, we see that the left side of (8.5.163) is really given by
$$\int_\Gamma \mathbf{E} \cdot d\mathbf{r} = \int_a^b \mathbf{E}(t,\boldsymbol{\gamma}(u)) \cdot \boldsymbol{\gamma}^{(1)}(u)\,du, \tag{8.5.167}$$
for each and every t. The hidden t-dependence on the left side of (8.5.163) is now clearly displayed
on the right side of (8.5.167). Usually, (8.5.163) and (8.5.159) are combined to yield Faraday’s law
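of electromagnetic induction in the following form:
$$\int_\Gamma \mathbf{E} \cdot d\mathbf{r} = -\frac{d}{dt}\int_S \mathbf{B} \cdot d\mathbf{A} = -\int_S \frac{\partial \mathbf{B}}{\partial t} \cdot d\mathbf{A}, \tag{8.5.168}$$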
which displays the relation between the given time varying magnetic field BBB and the resulting time
varying electric field EEE.
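To see the parametric formulas (8.5.162) and (8.5.167) at work, the following is a minimal numerical sketch in Python. Everything specific in it is an assumption made purely for illustration and is not taken from these notes: the fields B = B0 t k and E = (B0/2)(y i − x j), the constant B0, the choice of S as the unit disc in the x−y plane with parametric representation Φ(u, v) = (u cos v, u sin v, 0), and the helper names flux and circulation. The integrals are approximated by a midpoint rule and the t-derivative by a central difference, so the two sides of (8.5.163) should agree only up to a small numerical error.

```python
import math

B0 = 2.0  # hypothetical rate of growth of the magnetic field (assumption)

def B(t, x, y, z):
    """Assumed magnetic field: uniform, pointing along k, growing linearly in t."""
    return (0.0, 0.0, B0 * t)

def E(t, x, y, z):
    """Assumed electric field, chosen so that curl E = -dB/dt."""
    return (0.5 * B0 * y, -0.5 * B0 * x, 0.0)

def flux(t, n=200):
    """Midpoint-rule version of (8.5.162) for Phi(u, v) = (u cos v, u sin v, 0)."""
    du, dv = 1.0 / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        for j in range(n):
            v = (j + 0.5) * dv
            normal = (0.0, 0.0, u)  # dPhi/du x dPhi/dv for this parametrization
            b = B(t, u * math.cos(v), u * math.sin(v), 0.0)
            total += sum(bk * nk for bk, nk in zip(b, normal)) * du * dv
    return total

def circulation(t, n=2000):
    """Midpoint-rule version of (8.5.167) for gamma(s) = (cos s, sin s, 0)."""
    ds = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * ds
        e = E(t, math.cos(s), math.sin(s), 0.0)
        tangent = (-math.sin(s), math.cos(s), 0.0)  # gamma'(s)
        total += sum(ek * tk for ek, tk in zip(e, tangent)) * ds
    return total

t, h = 1.5, 1e-4
lhs = circulation(t)                               # left side of (8.5.163)
rhs = -(flux(t + h) - flux(t - h)) / (2 * h)       # right side of (8.5.163)
print(lhs, rhs)
```

For these assumed fields both numbers should be close to −πB0, and the hidden t-dependence discussed above appears in the code simply as the argument t passed to flux and circulation.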
Chapter 9
Vector Calculus
In this chapter our goal is to introduce several distinct ways of taking the “derivative” of given
vector and scalar fields, in the course of which we shall define the so-called divergence, curl and
Laplacian differential operators. We then establish two fundamental theorems of vector calculus,
namely the theorem of Stokes and the theorem of Gauss-Ostrogradskii, which are stated in terms of
these differential operators, as well as the surface integrals studied in Chapter 8, the line integrals
studied in Chapter 5, and the volume integrals studied in Chapter 2. As we shall see in Chapter
10 the theorems of Gauss-Ostrogradskii and Stokes are essential tools for understanding Maxwell’s
equations.
9.1 Differential Operators of Vector Calculus: Divergence,
Curl, Laplacian
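In Section 6.1 we defined the gradient of a scalar field f to be the vector field grad f given by
$$(\operatorname{grad} f)(x,y,z) := \frac{\partial f}{\partial x}(x,y,z)\,\mathbf{i} + \frac{\partial f}{\partial y}(x,y,z)\,\mathbf{j} + \frac{\partial f}{\partial z}(x,y,z)\,\mathbf{k}, \tag{9.1.1}$$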
(see Definition 6.1.1). As we noted in Remark 6.1.3 the symbol grad (or ∇) denotes a differential
operator which “operates on” a given scalar field f to generate a vector field gradf (or ∇f) by
a process of partial differentiation. In this section our goal is to define two further differential
operators, namely the divergence and the curl which, rather like grad, operate on a given field
by partial differentiation to produce another field. In this case, however, the divergence and curl
operate on a vector field, in contrast to grad which always operates on a scalar field.
Definition 9.1.1. Suppose that FFF : D → R3 is a C1-vector field in R3 with domain D ⊂ R3 (recall
Definition 3.2.1 and Remark 3.2.2) given by
(9.1.2) FFF (x, y, z) := F1(x, y, z)iii+ F2(x, y, z)jjj + F3(x, y, z)kkk, for all (x, y, z) in D.
Then the divergence of the vector field FFF is the scalar field div FFF on the same domain D defined by
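$$(\operatorname{div}\mathbf{F})(x,y,z) := \frac{\partial F_1}{\partial x}(x,y,z) + \frac{\partial F_2}{\partial y}(x,y,z) + \frac{\partial F_3}{\partial z}(x,y,z), \quad \text{for all } (x,y,z) \text{ in } D. \tag{9.1.3}$$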
Remark 9.1.2. Notice that the right side of (9.1.3) is a scalar quantity for each (x, y, z) in the
domain D of the given vector field FFF , and is therefore indeed a scalar field with the same domain
as the vector field FFF . In short, div denotes an operator which “operates” on a given vector field
FFF to produce a scalar field div FFF having the same domain as FFF . Much like the operator grad the
operator div involves partial differentiation, and hence is a “differential operator”, but it is clearly
a very different differential operator from grad.
Remark 9.1.3. We can use the symbolic vector ∇ defined by (6.1.3), that is
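$$\nabla := \frac{\partial}{\partial x}\,\mathbf{i} + \frac{\partial}{\partial y}\,\mathbf{j} + \frac{\partial}{\partial z}\,\mathbf{k}, \tag{9.1.4}$$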
to give an alternative and frequently used notation for the divergence. If we form a “symbolic”
inner product of the vectors on the right sides of (9.1.4) and (9.1.2), pretending that the partial
derivatives in (9.1.4) are actual numbers which we can “multiply” into the scalar components F1,
F2 and F3 of FFF , then we get (at least formally)
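$$\begin{aligned}
\nabla \cdot \mathbf{F}(x,y,z) &= \left( \frac{\partial}{\partial x}\,\mathbf{i} + \frac{\partial}{\partial y}\,\mathbf{j} + \frac{\partial}{\partial z}\,\mathbf{k} \right) \cdot \left( F_1(x,y,z)\,\mathbf{i} + F_2(x,y,z)\,\mathbf{j} + F_3(x,y,z)\,\mathbf{k} \right) \\
&= \frac{\partial F_1}{\partial x}(x,y,z) + \frac{\partial F_2}{\partial y}(x,y,z) + \frac{\partial F_3}{\partial z}(x,y,z).
\end{aligned} \tag{9.1.5}$$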
Comparing (9.1.5) and (9.1.3) we see that div FFF and ∇·FFF are identical, that is ∇·FFF is an alternative
notation for the divergence div FFF , in much the same way that ∇f is alternative notation for the
gradient grad f of a scalar field f (see Definition 6.1.1). Do note that the “dot” in ∇ ·FFF is essential
since it indicates the formal inner product which gives (9.1.5) - the notation ∇FFF (i.e. without the
“dot”) makes no sense at all. In general the notation div FFF is common in the older and more
classical books, whereas the notation ∇ ·FFF is preferred in more modern books. We shall typically
use the notation ∇ ·FFF in these notes.
Remark 9.1.4. What is the physical significance of the divergence ∇ · FFF of a vector field FFF?
This is by no means immediately obvious from the defining formula (9.1.3). We shall see from the
Divergence Theorem of Gauss-Ostrogradskii, to be established later in this chapter, that the scalar
value ∇·FFF (x, y, z) at a point (x, y, z) in R3 effectively measures the local “divergence” (or “flowing
away”) of the vector field FFF from the point (x, y, z). This local divergence is an important property
for electric and magnetic fields. In fact, we shall see in the next chapter that two of Maxwell’s
equations (originating from the Gauss laws for electric fields and magnetic fields) exactly describe
the divergence div EEE (or ∇·EEE) of the electric field EEE and the divergence div BBB (or ∇·BBB) of the magnetic
field BBB, and that these divergence properties are essential for Maxwell’s theory of electromagnetic
waves.
Remark 9.1.5. A vector field FFF : D → R3 with domain D ⊂ R3 is called solenoidal (or incom-
pressible) when div FFF (x, y, z) = 0 for all (x, y, z) in D. We shall see later that the magnetic field BBB
is always solenoidal.
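The defining formula (9.1.3) is easy to try out numerically. The following Python sketch is only an illustration under stated assumptions: the helper names partial and div, the step size h, the sample point p and the two test fields are all our own choices, and the partial derivatives are approximated by central differences rather than computed exactly.

```python
def partial(g, p, i, h=1e-5):
    """Central-difference estimate of the i-th partial derivative of g at the point p."""
    lo, hi = list(p), list(p)
    lo[i] -= h
    hi[i] += h
    return (g(*hi) - g(*lo)) / (2 * h)

def div(F, p):
    """Numerical version of (9.1.3): dF1/dx + dF2/dy + dF3/dz evaluated at p."""
    return sum(partial(lambda x, y, z, k=k: F(x, y, z)[k], p, k) for k in range(3))

# Illustrative fields chosen here (assumptions for this sketch only).
radial = lambda x, y, z: (x, y, z)                       # 1 + 1 + 1 = 3
solenoidal = lambda x, y, z: (x * x, x * y, -3 * x * z)  # 2x + x - 3x = 0

p = (0.7, -1.2, 2.0)  # arbitrary sample point
div_radial = div(radial, p)
div_solenoidal = div(solenoidal, p)
print(div_radial, div_solenoidal)
```

The second field has zero divergence at every point, so it is solenoidal in the sense of Remark 9.1.5; the code of course only checks this at the single sample point p.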
We next define another rather fancy differential operator, namely the curl of a vector field, which
is just as important as the divergence:
Definition 9.1.6. Suppose that FFF : D → R3 is a C1-vector field in R3 with domain D ⊂ R3 (recall
Definition 3.2.1 and Remark 3.2.2) given by
(9.1.6) FFF (x, y, z) := F1(x, y, z)iii+ F2(x, y, z)jjj + F3(x, y, z)kkk, for all (x, y, z) in D.
Then the curl of the vector field FFF is the vector field curl FFF on the same domain D defined by
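$$\begin{aligned}
(\operatorname{curl}\mathbf{F})(x,y,z) := {} & \left[ \frac{\partial F_3}{\partial y}(x,y,z) - \frac{\partial F_2}{\partial z}(x,y,z) \right]\mathbf{i} + \left[ \frac{\partial F_1}{\partial z}(x,y,z) - \frac{\partial F_3}{\partial x}(x,y,z) \right]\mathbf{j} \\
& + \left[ \frac{\partial F_2}{\partial x}(x,y,z) - \frac{\partial F_1}{\partial y}(x,y,z) \right]\mathbf{k},
\end{aligned} \tag{9.1.7}$$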
for all (x, y, z) in D.
Remark 9.1.7. Notice that the right side of (9.1.7) is a vector for each (x, y, z) in the domain D of
the given vector field FFF , and is therefore indeed a vector field with the same domain as the vector
field FFF . That is curl denotes an operator which “operates” on a given vector field FFF to produce
another vector field curl FFF having the same domain as FFF . Much like the operators grad and div,
the operator curl again involves partial differentiation, and hence is a “differential operator”, but
it is clearly a very different differential operator from both grad and div. In particular, we know
from Definition 6.1.1 that grad converts a given scalar field f into a vector field grad f (or ∇f),
while from Definition 9.1.1 we see that div converts a given vector field FFF into a scalar field div FFF
(or ∇ · FFF ), and from Definition 9.1.6 we see that curl converts a given vector field FFF into another
vector field curl FFF . In view of these comments one can reasonably ask if there are any useful
differential operators which convert a given scalar field into another scalar field. There is indeed
such an operator, called the Laplacian operator, which we define later.
Remark 9.1.8. The definition of curl at (9.1.7) looks like a confusing jumble of symbols which
seems not only to have no intuitive or physical significance but appears to be almost impossible to
remember. We shall see later that the curl of a vector field in fact has definite physical significance.
Here we use the symbolic vector at (9.1.4) to get an alternative notation for curl FFF , in much the
same way that we obtained ∇·FFF as alternative notation for div FFF in Remark 9.1.3. This alternative
notation also makes it easy to remember the seemingly strange definition at (9.1.7). Much as in
Remark 9.1.3 we again form a “symbolic” product of the vectors on the right sides of (9.1.4) and
(9.1.2), but now we formally calculate the vector cross product of the right sides of (9.1.4) and
(9.1.2), rather than the inner product as we did at (9.1.5):
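$$\begin{aligned}
\nabla \times \mathbf{F}(x,y,z) &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ F_1(x,y,z) & F_2(x,y,z) & F_3(x,y,z) \end{vmatrix} \\
&= \left[ \frac{\partial F_3}{\partial y}(x,y,z) - \frac{\partial F_2}{\partial z}(x,y,z) \right]\mathbf{i} + \left[ \frac{\partial F_1}{\partial z}(x,y,z) - \frac{\partial F_3}{\partial x}(x,y,z) \right]\mathbf{j} \\
&\quad + \left[ \frac{\partial F_2}{\partial x}(x,y,z) - \frac{\partial F_1}{\partial y}(x,y,z) \right]\mathbf{k}.
\end{aligned} \tag{9.1.8}$$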
The second equality at (9.1.8) follows from formal calculation of the determinant, pretending as
usual that the partial derivatives occurring in the determinant at (9.1.8) are actual numbers which
we can “multiply” into the scalar components F1, F2 and F3 of FFF . From (9.1.8) and (9.1.7) we get
(9.1.9) ∇×FFF (x, y, z) = curl FFF (x, y, z) for all (x, y, z) in D.
It is clear that (9.1.8) gives a useful mnemonic for remembering the definition of curl FFF , and also
gives the alternative notation ∇ × FFF for curl FFF . In general the notation curl F is common in the
older and more classical books, whereas the notation ∇×FFF is preferred in more modern books. We
shall typically use the notation ∇×FFF in these notes.
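The component formula (9.1.7) can be tried out numerically in the same spirit. In the Python sketch below the helper names partial and curl, the step size h, the test field F and the sample point are all assumptions introduced for illustration, and the derivatives are central-difference approximations.

```python
def partial(g, p, i, h=1e-5):
    """Central-difference estimate of the i-th partial derivative of g at the point p."""
    lo, hi = list(p), list(p)
    lo[i] -= h
    hi[i] += h
    return (g(*hi) - g(*lo)) / (2 * h)

def curl(F, p):
    """Numerical version of (9.1.7) at the point p; returns the (i, j, k) components."""
    def d(k, i):
        # partial derivative of the component F_(k+1) with respect to variable number i
        return partial(lambda x, y, z: F(x, y, z)[k], p, i)
    return (d(2, 1) - d(1, 2),   # dF3/dy - dF2/dz
            d(0, 2) - d(2, 0),   # dF1/dz - dF3/dx
            d(1, 0) - d(0, 1))   # dF2/dx - dF1/dy

# Illustrative field (an assumption): F = xz i + xyz j - y^2 k.
F = lambda x, y, z: (x * z, x * y * z, -y * y)
c = curl(F, (1.0, 2.0, 3.0))
print(c)
```

Working (9.1.7) by hand for this field gives curl F = (−2y − xy) i + x j + yz k, that is −6 i + j + 6 k at the point (1, 2, 3), which the numerical values should reproduce up to rounding.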
We can now restate Theorem 6.2.7 in terms of the curl operator defined at (9.1.8):
Theorem 9.1.9. Suppose that FFF : D → R3 is a C1-vector field with domain D = R3 (recall
Definition 3.2.1 and Remark 3.2.2), and put
FFF (x, y, z) = F1(x, y, z)iii+ F2(x, y, z)jjj + F3(x, y, z)kkk,
for all (x, y, z) in R3 i.e. F1(x, y, z), F2(x, y, z), and F3(x, y, z) are the real scalar components of
the vector FFF (x, y, z) in R3. Then the following are equivalent:
(a) FFF is a conservative vector field;
(b) $\int_\Gamma \mathbf{F}(\mathbf{r}) \cdot d\mathbf{r} = 0$ for every closed curve Γ in R3;
(c) (∇×FFF )(x, y, z) = 0 for all (x, y, z) in R3.
Remark 9.1.10. What “physical significance” does the curl operator have? Here we briefly indicate
one instance, from fluid mechanics, in which the curl has a very definite physical interpretation.
Suppose that the vector field VVV is the velocity flow field of a fluid moving through space (e.g. a
current of water), that is the vector VVV (x, y, z) gives the velocity of a moving fluid at each point
(x, y, z) in R3. Attached to an axis, which rotates in a bearing that you hold, are very small
paddles. The bearing is fixed at a point (x, y, z) in the moving fluid with the axis of spin aligned
along some fixed unit vector nnn (see Figure 9.1). For reasons which will become clear we call this
device a “curl meter”. There is viscous friction in the bearing, so it can be shown from elementary
Figure 9.1: Curl meter
physics that the angular speed (not angular acceleration!) of the curl meter is directly proportional
to the torque around the axis of the curl meter arising from the impact of the moving fluid on the
paddles (the constant of proportionality depends on the geometry of the paddles, the coefficient of
viscous friction of the fluid and several other factors). Using the principles of fluid mechanics one
can establish that
(9.1.10) angular speed of curl meter = κ|curl VVV (x, y, z) · nnn|,
where κ is another constant of proportionality. From (9.1.10) we see that there is a very direct
relation between curl VVV and the angular speed of the curl meter. Observe from (9.1.10) that when
the axis of spin of the curl meter (determined by the unit vector nnn) is collinear with curl VVV (x, y, z)
then the curl meter rotates fastest, whereas when the axis of spin is orthogonal to curl VVV (x, y, z)
then the curl meter does not rotate at all. In this way we see that curl VVV tells us the “local spin”
in the fluid velocity field VVV at any point (x, y, z) around an axis aligned with any unit vector nnn.
Remark 9.1.11. We shall see from the Theorem of Stokes, to be established later in this chapter,
that the intuitive picture of curl established in Remark 9.1.10 for the velocity field VVV of a moving
fluid extends to a more general setting, namely the vector curl FFF (x, y, z) (or∇×FFF (x, y, z)) at a point
(x, y, z) in R3 effectively measures the local “curling” (or “turning” or “rotation” or “vorticity”)
of a general vector field FFF at point (x, y, z) (this will be discussed in Remark 9.2.7). In fact, in
older textbooks curl FFF was sometimes denoted by “vort FFF” or “rot FFF” (“vort” for “vorticity”, “rot”
for “rotation”) but these semi-comical notations were soon discarded in favor of the currently used
curl FFF and ∇ × FFF . This local rotation, like the divergence discussed in Remark 9.1.4, is also an
important property of electric and magnetic fields. We shall see in the next chapter that Faraday’s
law of electromagnetic induction (which has been previewed in Remark 8.5.8) actually describes
the local rotation curl EEE (or ∇ × EEE) of the electric field, while Ampere’s magnetic circuital law
(previewed in Remark 8.5.7) describes the local rotation curl BBB (or ∇×BBB) of the magnetic field.
Remark 9.1.12. A vector field FFF : D → R3 with domain D ⊂ R3 is called irrotational when its
curl is identically zero on the domain D, that is ∇ × FFF (x, y, z) = 0 for all (x, y, z) in D. From
Theorem 9.1.9 one sees, in particular, that
(9.1.11) FFF : R3 → R3 is conservative if and only if FFF is irrotational.
The next result shows that the gradient of a C2-scalar field is always irrotational:
Theorem 9.1.13. Suppose that f : D → R is a C2-scalar field with domain D ⊂ R3. Then the
vector field ∇f (see Definition 6.1.1) is irrotational, that is
(9.1.12) (∇× (∇f))(x, y, z) = 0 for all (x, y, z) in D.
Proof: From Definition 6.1.1 we have
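$$\nabla f(x,y,z) = \frac{\partial f}{\partial x}(x,y,z)\,\mathbf{i} + \frac{\partial f}{\partial y}(x,y,z)\,\mathbf{j} + \frac{\partial f}{\partial z}(x,y,z)\,\mathbf{k} \tag{9.1.13}$$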
for all (x, y, z) in D. Now take FFF := ∇f so that, from (9.1.13), we have
$$F_1(x,y,z) = \frac{\partial f}{\partial x}(x,y,z), \quad F_2(x,y,z) = \frac{\partial f}{\partial y}(x,y,z), \quad F_3(x,y,z) = \frac{\partial f}{\partial z}(x,y,z), \tag{9.1.14}$$
for all (x, y, z) in D. From (9.1.8) and (9.1.14) we get
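$$\begin{aligned}
\nabla \times (\nabla f)(x,y,z) &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ \dfrac{\partial f}{\partial x}(x,y,z) & \dfrac{\partial f}{\partial y}(x,y,z) & \dfrac{\partial f}{\partial z}(x,y,z) \end{vmatrix} \\
&= \left[ \frac{\partial^2 f}{\partial y\,\partial z}(x,y,z) - \frac{\partial^2 f}{\partial z\,\partial y}(x,y,z) \right]\mathbf{i} - \left[ \frac{\partial^2 f}{\partial x\,\partial z}(x,y,z) - \frac{\partial^2 f}{\partial z\,\partial x}(x,y,z) \right]\mathbf{j} \\
&\quad + \left[ \frac{\partial^2 f}{\partial x\,\partial y}(x,y,z) - \frac{\partial^2 f}{\partial y\,\partial x}(x,y,z) \right]\mathbf{k},
\end{aligned} \tag{9.1.15}$$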
for all (x, y, z) in D, where the second equality at (9.1.15) follows from formally expanding the
determinant. Since f is a C2-function we know that the mixed partial derivatives are equal, e.g. for
the mixed y − z partial derivatives we have
$$\frac{\partial^2 f}{\partial y\,\partial z}(x,y,z) = \frac{\partial^2 f}{\partial z\,\partial y}(x,y,z), \quad \text{for all } (x,y,z) \text{ in } D \tag{9.1.16}$$
and similarly for the mixed x− z and x− y partial derivatives. From (9.1.16) etc. and (9.1.15) we
obtain ∇× (∇f)(x, y, z) = 0 for all (x, y, z) in D as required.
The next result is quite similar to Theorem 9.1.13 but shows that the curl of a vector field is
always solenoidal. The proof is omitted since it is very similar to the proof of Theorem 9.1.13:
Theorem 9.1.14. Suppose that GGG : D → R3 is a C2-vector field with domain D ⊂ R3. Then the
vector field ∇×GGG (see Definition 9.1.6 and (9.1.9)) is solenoidal (see Remark 9.1.5), that is
(9.1.17) (∇ · (∇×GGG))(x, y, z) = 0 for all (x, y, z) in D.
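For the reader who would like to see the omitted calculation, here is a sketch. We write G1, G2 and G3 for the scalar components of GGG (these names are introduced here, by analogy with (9.1.6)), apply (9.1.7) to GGG and then (9.1.3) to the result, suppressing the argument (x, y, z) throughout:

```latex
\begin{aligned}
\nabla \cdot (\nabla \times \mathbf{G})
&= \frac{\partial}{\partial x}\left[ \frac{\partial G_3}{\partial y} - \frac{\partial G_2}{\partial z} \right]
 + \frac{\partial}{\partial y}\left[ \frac{\partial G_1}{\partial z} - \frac{\partial G_3}{\partial x} \right]
 + \frac{\partial}{\partial z}\left[ \frac{\partial G_2}{\partial x} - \frac{\partial G_1}{\partial y} \right] \\
&= \left[ \frac{\partial^2 G_3}{\partial x\,\partial y} - \frac{\partial^2 G_3}{\partial y\,\partial x} \right]
 + \left[ \frac{\partial^2 G_1}{\partial y\,\partial z} - \frac{\partial^2 G_1}{\partial z\,\partial y} \right]
 + \left[ \frac{\partial^2 G_2}{\partial z\,\partial x} - \frac{\partial^2 G_2}{\partial x\,\partial z} \right]
 = 0.
\end{aligned}
```

Each bracket in the second line vanishes because G1, G2 and G3 are C2-functions, so that their mixed partial derivatives are equal, exactly as at (9.1.16).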
Example 9.1.15. Determine the divergence and curl of the vector field defined by
(9.1.18) FFF (x, y, z) := xiii+ yjjj + zkkk for all (x, y, z) in D := R3.
We compute the divergence first. From (9.1.18) and (9.1.6) we have
(9.1.19) F1(x, y, z) = x, F2(x, y, z) = y, F3(x, y, z) = z, for all (x, y, z) in R3.
From (9.1.19) and (9.1.5)
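$$\nabla \cdot \mathbf{F}(x,y,z) = \frac{\partial x}{\partial x} + \frac{\partial y}{\partial y} + \frac{\partial z}{\partial z} = 3, \quad \text{for all } (x,y,z) \text{ in } \mathbb{R}^3. \tag{9.1.20}$$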
For the curl, from (9.1.19) and (9.1.7) we get
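$$(\operatorname{curl}\mathbf{F})(x,y,z) = \left[ \frac{\partial z}{\partial y} - \frac{\partial y}{\partial z} \right]\mathbf{i} + \left[ \frac{\partial x}{\partial z} - \frac{\partial z}{\partial x} \right]\mathbf{j} + \left[ \frac{\partial y}{\partial x} - \frac{\partial x}{\partial y} \right]\mathbf{k} = \mathbf{0}, \tag{9.1.21}$$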
for all (x, y, z) in R3. It follows that FFF is irrotational (see Remark 9.1.12).
Example 9.1.16. Show that the vector field defined by
(9.1.22) FFF (x, y, z) := yiii− xjjj + 0kkk for all (x, y, z) in D := R3,
cannot be a conservative vector field.
In view of Remark 9.1.12 it is enough to prove that the vector field FFF is not irrotational. We
therefore calculate the curl of FFF . From (9.1.22) we have
(9.1.23) F1(x, y, z) = y, F2(x, y, z) = −x, F3(x, y, z) = 0, for all (x, y, z) in R3,
(compare (9.1.6)) and inserting (9.1.23) into (9.1.8) we obtain
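$$\nabla \times \mathbf{F}(x,y,z) = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ y & -x & 0 \end{vmatrix} = -2\,\mathbf{k}, \quad \text{for all } (x,y,z) \text{ in } \mathbb{R}^3. \tag{9.1.24}$$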
We therefore do not have ∇ × FFF (x, y, z) = 0 for all (x, y, z) ∈ R3. It follows that FFF cannot be
irrotational, and then, from Remark 9.1.12, it cannot be conservative.
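The same conclusion can be cross-checked through condition (b) of Theorem 9.1.9, by computing the circulation of the field (9.1.22) around one particular closed curve. The Python sketch below does this numerically; the choice of the unit circle in the x−y plane as the closed curve, its parametric representation γ(s) = (cos s, sin s, 0), the helper name circulation and the midpoint rule are all assumptions made for illustration.

```python
import math

def F(x, y, z):
    """The vector field of (9.1.22): F = y i - x j + 0 k."""
    return (y, -x, 0.0)

def circulation(F, n=1000):
    """Midpoint-rule approximation of the line integral of F around the unit circle,
    using the assumed path gamma(s) = (cos s, sin s, 0) for 0 <= s <= 2 pi."""
    ds = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * ds
        point = (math.cos(s), math.sin(s), 0.0)      # gamma(s)
        tangent = (-math.sin(s), math.cos(s), 0.0)   # gamma'(s)
        f = F(*point)
        total += sum(fk * tk for fk, tk in zip(f, tangent)) * ds
    return total

value = circulation(F)
print(value)
```

The integrand here is −sin² s − cos² s = −1, so the circulation is −2π, which is not zero. Since condition (b) fails for this closed curve, the field cannot be conservative, in agreement with the conclusion reached above from the curl.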
Example 9.1.17. Show that the vector field GGG defined by
$$\mathbf{G}(x,y,z) := x^3 y\,\mathbf{i} + z\,\mathbf{j} + xz\,\mathbf{k} \quad \text{for all } (x,y,z) \text{ in } D := \mathbb{R}^3, \tag{9.1.25}$$
cannot be the curl of another vector field.
This is similar to Example 9.1.16 except that now we use Theorem 9.1.14. Suppose in fact that
GGG is the curl of another vector field FFF : D → R3, that is
(9.1.26) GGG(x, y, z) = ∇×FFF (x, y, z) for all (x, y, z) in D := R3.
Then
(9.1.27) ∇ ·GGG(x, y, z) = ∇ · (∇×FFF )(x, y, z) = 0 for all (x, y, z) in D := R3,
in which the first equality follows from (9.1.26) and the second from Theorem 9.1.14. However,
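from (9.1.25) and (9.1.5)
$$\begin{aligned}
\nabla \cdot \mathbf{G}(x,y,z) &= \frac{\partial G_1}{\partial x}(x,y,z) + \frac{\partial G_2}{\partial y}(x,y,z) + \frac{\partial G_3}{\partial z}(x,y,z) \\
&= \frac{\partial (x^3 y)}{\partial x} + \frac{\partial (z)}{\partial y} + \frac{\partial (xz)}{\partial z} = 3x^2 y + x,
\end{aligned} \tag{9.1.28}$$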
for all (x, y, z) in R3. Since (9.1.28) contradicts (9.1.27) we see that GGG cannot be the curl of another
vector field.
We next introduce a differential operator which converts a given scalar field into another scalar
field, as promised at the end of Remark 9.1.7:
Definition 9.1.18. Suppose that f : D → R is a C2-scalar field with domain D ⊂ R3 (see Definition
3.2.1 and Remark 3.2.2). Then the Laplacian of the scalar field f is another scalar field denoted by
∇2f and defined on the same domain D by
$$\nabla^2 f(x,y,z) := \frac{\partial^2 f}{\partial x^2}(x,y,z) + \frac{\partial^2 f}{\partial y^2}(x,y,z) + \frac{\partial^2 f}{\partial z^2}(x,y,z) \tag{9.1.29}$$
for all (x, y, z) in D.
Finally, we extend the Laplacian operator, defined above for scalar fields, to vector fields:
Definition 9.1.19. Suppose that FFF : D → R3 is a C2-vector field in R3 with domain D ⊂ R3 (see
Definition 3.2.1 and Remark 3.2.2) given by
(9.1.30) FFF (x, y, z) := F1(x, y, z)iii+ F2(x, y, z)jjj + F3(x, y, z)kkk, for all (x, y, z) in D.
The Laplacian of the vector field FFF is another vector field ∇2FFF on the same domain D defined by
(9.1.31) (∇2FFF )(x, y, z) := (∇2F1)(x, y, z)iii+ (∇2F2)(x, y, z)jjj + (∇2F3)(x, y, z)kkk,
for all (x, y, z) in D. Here the functions ∇2Fi in (9.1.31) are of course defined by (9.1.29) with
f := Fi.
Remark 9.1.20. Laplacian operators occur all over physics and engineering, and in particular are
essential in the study of electromagnetism, as we shall see. The next theorem illustrates that the
Laplacian operator on a scalar field is really just the successive application of the gradient and
divergence to the scalar field:
Theorem 9.1.21. Suppose that f : D → R is a C2-scalar field with domain D ⊂ R3. Then the
Laplacian of f is the divergence of the gradient of f , that is
(9.1.32) ∇2f(x, y, z) = (∇ ·GGG)(x, y, z) where GGG(x, y, z) := (∇f)(x, y, z),
for all (x, y, z) ∈ D, more compactly
(9.1.33) ∇2f(x, y, z) = ∇ · (∇f)(x, y, z) for all (x, y, z) in D.
Proof: From Definition 6.1.1 the gradient of f is given by
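$$\nabla f(x,y,z) = \frac{\partial f}{\partial x}(x,y,z)\,\mathbf{i} + \frac{\partial f}{\partial y}(x,y,z)\,\mathbf{j} + \frac{\partial f}{\partial z}(x,y,z)\,\mathbf{k}, \quad \text{for all } (x,y,z) \text{ in } D. \tag{9.1.34}$$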
Now define the vector field FFF : D → R3 by
(9.1.35) FFF (x, y, z) := ∇f(x, y, z), for all (x, y, z) in D.
From (9.1.35) and (9.1.34) the scalar components of FFF are
$$F_1(x,y,z) = \frac{\partial f}{\partial x}(x,y,z), \quad F_2(x,y,z) = \frac{\partial f}{\partial y}(x,y,z), \quad F_3(x,y,z) = \frac{\partial f}{\partial z}(x,y,z), \tag{9.1.36}$$
for all (x, y, z) in D. Then the divergence of the gradient of f is
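$$\begin{aligned}
\nabla \cdot (\nabla f)(x,y,z) &= \nabla \cdot \mathbf{F}(x,y,z) && \text{(see (9.1.35))} \\
&= \frac{\partial F_1}{\partial x}(x,y,z) + \frac{\partial F_2}{\partial y}(x,y,z) + \frac{\partial F_3}{\partial z}(x,y,z) && \text{(see (9.1.5))} \\
&= \frac{\partial^2 f}{\partial x^2}(x,y,z) + \frac{\partial^2 f}{\partial y^2}(x,y,z) + \frac{\partial^2 f}{\partial z^2}(x,y,z) && \text{(see (9.1.36))},
\end{aligned} \tag{9.1.37}$$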
for all (x, y, z) in D. From (9.1.37) and (9.1.29) we obtain (9.1.33).
Remark 9.1.22. Suppose that f : D → R is a C2-scalar field whose gradient is solenoidal
(Remark 9.1.5), that is
(9.1.38) ∇ · (∇f)(x, y, z) = 0, for all (x, y, z) in D.
From (9.1.38) and Theorem 9.1.21 (see (9.1.33)) we get
(9.1.39) ∇2f(x, y, z) = 0, for all (x, y, z) in D,
that is (see (9.1.29))
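$$\frac{\partial^2 f}{\partial x^2}(x,y,z) + \frac{\partial^2 f}{\partial y^2}(x,y,z) + \frac{\partial^2 f}{\partial z^2}(x,y,z) = 0, \quad \text{for all } (x,y,z) \text{ in } D. \tag{9.1.40}$$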
The relation (9.1.40) is a particular instance of a partial differential equation and is known as
Laplace’s equation. Any scalar function f which satisfies the relation (9.1.40) is called a solution of
Laplace’s equation. We see, therefore, that any scalar field whose gradient is solenoidal is necessarily
a solution of Laplace’s equation. Laplace’s equation is ubiquitous throughout mathematical physics
and engineering, and in particular is indispensable in the study of electromagnetism.
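As a simple illustration (the particular function is our own choice and does not come from elsewhere in these notes), take f(x, y, z) := x² − y² on D := R3. Then the gradient of f is solenoidal and f solves Laplace's equation, as the following short calculation shows:

```latex
\begin{aligned}
\nabla f(x,y,z) &= 2x\,\mathbf{i} - 2y\,\mathbf{j} + 0\,\mathbf{k}, \\
\nabla \cdot (\nabla f)(x,y,z) &= \frac{\partial (2x)}{\partial x} + \frac{\partial (-2y)}{\partial y} + \frac{\partial (0)}{\partial z} = 2 - 2 + 0 = 0, \\
\nabla^2 f(x,y,z) &= \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} = 2 + (-2) + 0 = 0.
\end{aligned}
```

The second and third lines agree, as they must by Theorem 9.1.21.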
Remark 9.1.23. We have defined the gradient operator for scalar fields (see Definition 6.1.1), the
divergence operator (see Definition 9.1.1 and (9.1.5)) and curl operator (see Definition 9.1.6 and
(9.1.8)) for vector fields when these fields do not depend on t, that is are functions of (x, y, z) only.
In this section, as well as in later chapters, we are going to apply these operators to scalar and
vector fields which are time varying in the sense of Remark 3.2.5, that is are functions of (t, x, y, z).
It goes without saying that we just ignore t and apply the x, y and z-partial derivatives as usual.
In particular, if f(t, x, y, z) is a time varying scalar field then we define the gradient of f by
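$$(\nabla f)(t,x,y,z) = \frac{\partial f}{\partial x}(t,x,y,z)\,\mathbf{i} + \frac{\partial f}{\partial y}(t,x,y,z)\,\mathbf{j} + \frac{\partial f}{\partial z}(t,x,y,z)\,\mathbf{k}, \tag{9.1.41}$$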
for all (t, x, y, z) (c.f. (6.1.4)), and we define the Laplacian of f by
$$(\nabla^2 f)(t,x,y,z) := \frac{\partial^2 f}{\partial x^2}(t,x,y,z) + \frac{\partial^2 f}{\partial y^2}(t,x,y,z) + \frac{\partial^2 f}{\partial z^2}(t,x,y,z) \tag{9.1.42}$$
for all (t, x, y, z) (c.f. (9.1.29)). Similarly, if FFF (t, x, y, z) is a time varying vector field with the scalar
component representation
(9.1.43) FFF (t, x, y, z) = F1(t, x, y, z)iii+ F2(t, x, y, z)jjj + F3(t, x, y, z)kkk,
then we define the divergence of FFF by
$$(\nabla \cdot \mathbf{F})(t,x,y,z) = \frac{\partial F_1}{\partial x}(t,x,y,z) + \frac{\partial F_2}{\partial y}(t,x,y,z) + \frac{\partial F_3}{\partial z}(t,x,y,z), \tag{9.1.44}$$
for all (t, x, y, z) (c.f. (9.1.5)), the curl of FFF by
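$$\begin{aligned}
(\nabla \times \mathbf{F})(t,x,y,z) &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ F_1(t,x,y,z) & F_2(t,x,y,z) & F_3(t,x,y,z) \end{vmatrix} \\
&= \left[ \frac{\partial F_3}{\partial y}(t,x,y,z) - \frac{\partial F_2}{\partial z}(t,x,y,z) \right]\mathbf{i} + \left[ \frac{\partial F_1}{\partial z}(t,x,y,z) - \frac{\partial F_3}{\partial x}(t,x,y,z) \right]\mathbf{j} \\
&\quad + \left[ \frac{\partial F_2}{\partial x}(t,x,y,z) - \frac{\partial F_1}{\partial y}(t,x,y,z) \right]\mathbf{k},
\end{aligned} \tag{9.1.45}$$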
for all (t, x, y, z) (c.f. (9.1.8)), and the Laplacian of FFF by
(9.1.46) (∇2FFF )(t, x, y, z) := (∇2F1)(t, x, y, z)iii+ (∇2F2)(t, x, y, z)jjj + (∇2F3)(t, x, y, z)kkk,
for all (t, x, y, z) (c.f. (9.1.31)). Here the functions ∇2Fi(t, x, y, z) in (9.1.46) are of course defined
by (9.1.42) with f := Fi.
We now state for future reference a useful result on interchanging the divergence and curl op-
erators with partial t-derivatives for time varying vector fields FFF (t, x, y, z). This very simple result
will be useful in later applications, particularly when we look at Maxwell’s equations. Put
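$$\mathbf{G}_1(t,x,y,z) := (\nabla \cdot \mathbf{F})(t,x,y,z), \tag{9.1.47}$$
$$\mathbf{G}_2(t,x,y,z) := \frac{\partial \mathbf{F}(t,x,y,z)}{\partial t} = \frac{\partial F_1(t,x,y,z)}{\partial t}\,\mathbf{i} + \frac{\partial F_2(t,x,y,z)}{\partial t}\,\mathbf{j} + \frac{\partial F_3(t,x,y,z)}{\partial t}\,\mathbf{k}, \tag{9.1.48}$$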
for all (t, x, y, z). It is easy, although tedious, to verify that
$$\frac{\partial \mathbf{G}_1(t,x,y,z)}{\partial t} = (\nabla \cdot \mathbf{G}_2)(t,x,y,z), \tag{9.1.49}$$
for all (t, x, y, z). The relation (9.1.49) is usually written as
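$$\frac{\partial}{\partial t}\left[ (\nabla \cdot \mathbf{F})(t,x,y,z) \right] = \nabla \cdot \left[ \frac{\partial \mathbf{F}(t,x,y,z)}{\partial t} \right], \tag{9.1.50}$$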
that is we can interchange the divergence with the partial t-derivative. Similarly, upon defining
(9.1.51) GGG3(t, x, y, z) := (∇×FFF )(t, x, y, z),
one can again check by simple but tedious calculation that
$$\frac{\partial \mathbf{G}_3(t,x,y,z)}{\partial t} = (\nabla \times \mathbf{G}_2)(t,x,y,z), \tag{9.1.52}$$
for all (t, x, y, z). The relation (9.1.52) is usually written as
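$$\frac{\partial}{\partial t}\left[ (\nabla \times \mathbf{F})(t,x,y,z) \right] = \nabla \times \left[ \frac{\partial \mathbf{F}(t,x,y,z)}{\partial t} \right], \tag{9.1.53}$$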
that is we can interchange the curl with the partial t-derivative.
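The calculation behind (9.1.50) can be sketched in a few lines. We assume here (an assumption we make explicit, since it is what justifies swapping the order of differentiation) that the scalar components F1, F2 and F3 have continuous second order partial derivatives in (t, x, y, z). Suppressing the argument (t, x, y, z), we get from (9.1.44) and (9.1.48):

```latex
\begin{aligned}
\frac{\partial}{\partial t}\left[ \nabla \cdot \mathbf{F} \right]
&= \frac{\partial}{\partial t}\left[ \frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y} + \frac{\partial F_3}{\partial z} \right]
 = \frac{\partial^2 F_1}{\partial t\,\partial x} + \frac{\partial^2 F_2}{\partial t\,\partial y} + \frac{\partial^2 F_3}{\partial t\,\partial z} \\
&= \frac{\partial^2 F_1}{\partial x\,\partial t} + \frac{\partial^2 F_2}{\partial y\,\partial t} + \frac{\partial^2 F_3}{\partial z\,\partial t}
 = \frac{\partial}{\partial x}\left[ \frac{\partial F_1}{\partial t} \right] + \frac{\partial}{\partial y}\left[ \frac{\partial F_2}{\partial t} \right] + \frac{\partial}{\partial z}\left[ \frac{\partial F_3}{\partial t} \right]
 = \nabla \cdot \left[ \frac{\partial \mathbf{F}}{\partial t} \right].
\end{aligned}
```

The only non-trivial step is the equality of the mixed partial derivatives between the first and second lines. The calculation for the curl at (9.1.53) is the same, applied to each of the three components in (9.1.45) in turn.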
Remark 9.1.24. Here we collect some of the more useful identities for the gradient, divergence,
curl and Laplacian operators defined previously. Typically these identities are established by easy
(although sometimes tedious) calculations, or follow from theorems we have already established. In
the following f and g are time varying C1-scalar fields, FFF and GGG are time varying C1-vector fields
(recall Remark 3.2.2), and c is any real constant. For maximum generality we state the identities
in the time varying case, in terms of (t, x, y, z). Of course, the identities also hold for time constant
fields, in which case we just everywhere replace (t, x, y, z) with (x, y, z).
(9.1.54) (∇(cf))(t, x, y, z) = c(∇f)(t, x, y, z).
(9.1.55) (∇(f + g))(t, x, y, z) = (∇f)(t, x, y, z) + (∇g)(t, x, y, z).
(9.1.56) (∇(fg))(t, x, y, z) = g(t, x, y, z)(∇f)(t, x, y, z) + f(t, x, y, z)(∇g)(t, x, y, z).
$$(\nabla(f/g))(t,x,y,z) = \frac{g(t,x,y,z)(\nabla f)(t,x,y,z) - f(t,x,y,z)(\nabla g)(t,x,y,z)}{g^2(t,x,y,z)}, \tag{9.1.57}$$
for all (t, x, y, z) in D such that g(t, x, y, z) ≠ 0.
(9.1.58) (∇ · (cFFF ))(t, x, y, z) = c(∇ ·FFF )(t, x, y, z).
(9.1.59) (∇ · (FFF +GGG))(t, x, y, z) = (∇ ·FFF )(t, x, y, z) + (∇ ·GGG)(t, x, y, z).
(9.1.60) (∇× (cFFF ))(t, x, y, z) = c(∇×FFF )(t, x, y, z).
(9.1.61) (∇× (FFF +GGG))(t, x, y, z) = (∇×FFF )(t, x, y, z) + (∇×GGG)(t, x, y, z).
(9.1.62) (∇× (∇×FFF ))(t, x, y, z) = (∇(∇ ·FFF ))(t, x, y, z)− (∇2FFF )(t, x, y, z).
$$(\nabla^2(fg))(t,x,y,z) = g(t,x,y,z)(\nabla^2 f)(t,x,y,z) + f(t,x,y,z)(\nabla^2 g)(t,x,y,z) + 2((\nabla f) \cdot (\nabla g))(t,x,y,z), \tag{9.1.63}$$
(here f and g are C2-scalar fields, recall Remark 3.2.2).
(9.1.64) ∇ · (FFF ×GGG)(t, x, y, z) = GGG(t, x, y, z) · (∇×FFF )(t, x, y, z)−FFF (t, x, y, z) · (∇×GGG)(t, x, y, z).
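Identities such as these are easy to spot-check numerically before one attempts the (sometimes tedious) verification by hand. The Python sketch below checks (9.1.64) at a single point. It is an illustration only, under several assumptions of our own: the fields F and G are time constant polynomial fields chosen arbitrarily, the helper names partial, div, curl, cross and dot are ours, and all derivatives are central-difference approximations, so agreement is only up to a small numerical error and at one point proves nothing in general.

```python
def partial(g, p, i, h=1e-5):
    """Central-difference estimate of the i-th partial derivative of g at the point p."""
    lo, hi = list(p), list(p)
    lo[i] -= h
    hi[i] += h
    return (g(*hi) - g(*lo)) / (2 * h)

def div(F, p):
    """Numerical divergence, following (9.1.3)."""
    return sum(partial(lambda x, y, z, k=k: F(x, y, z)[k], p, k) for k in range(3))

def curl(F, p):
    """Numerical curl, following (9.1.7)."""
    d = lambda k, i: partial(lambda x, y, z: F(x, y, z)[k], p, i)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

def cross(a, b):
    """Vector cross product a x b."""
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Inner product a . b."""
    return sum(ak * bk for ak, bk in zip(a, b))

# Arbitrary illustrative fields (assumptions for this sketch only).
F = lambda x, y, z: (y * z, x * x, z)
G = lambda x, y, z: (x, y * z, x * y)
p = (0.5, -1.0, 2.0)

lhs = div(lambda x, y, z: cross(F(x, y, z), G(x, y, z)), p)   # left side of (9.1.64)
rhs = dot(G(*p), curl(F, p)) - dot(F(*p), curl(G, p))         # right side of (9.1.64)
print(lhs, rhs)
```

For these particular fields a hand calculation gives ∇ · (F × G) = 3x²y − 2xyz + 2y²z, which equals 5.25 at the chosen point, and both computed values should be close to this number.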
9.2 Theorem of Stokes
Stokes’ theorem is effectively a generalization of Green’s theorem (see Theorem 7.2.1) from two to
three dimensions.
Remark 9.2.1. In Definition 8.1.4 we defined a surface S as the image or range of a parametric
function
(9.2.65) ΦΦΦ : D → R3,
for some region D ⊂ R2, specifically
(9.2.66) S := {ΦΦΦ(u, v) ∈ R3 | (u, v) ∈ D},
(see (8.1.8)). The surface S is called closed when it is the boundary of some region in R3, or
equivalently completely encloses some region in R3. For example, the surface of a sphere in R3 is a
closed surface. Surfaces which are not closed are called open. For example, a flat disc in R3 with
circular boundary of radius r is an open surface. Similarly, the x−y plane in R3 is an open surface.
We shall mainly be interested in finite open surfaces, that is open surfaces of finite size. The x − y
plane is an open surface but clearly of infinite extent, therefore not a finite open surface. A disc
with circular boundary of finite radius r is a finite open surface. We are going to assume that our
finite open surface S always has a boundary Γ which is a closed curve, that is a curve which begins
and ends at the same point (see Figure 9.2). Clearly one can traverse the closed curve Γ in two
possible directions, namely from A to B to C then back to A, or in the opposite direction from
A to C to B then back to A. For purposes of reference we need to standardize an unambiguous
direction of traverse. We always give Γ that particular direction which is such that the surface S
enclosed by Γ is on your left when you traverse Γ in this direction (see Figure 9.2). The boundary
curve Γ is then called positively oriented.
We are now able to state Stokes’ theorem:
Theorem 9.2.2 (Stokes’ theorem). Suppose that S is a finite open surface in R3 with closed posi-
tively oriented boundary curve Γ (see Remark 9.2.1 and Figure 9.2), and FFF : R3 → R3 is a C1-vector
field (recall Remark 3.2.2). Then
$$\int_\Gamma \mathbf{F} \cdot d\mathbf{r} = \int_S (\nabla \times \mathbf{F}) \cdot d\mathbf{A}. \tag{9.2.67}$$
Figure 9.2: Open surface S with positively oriented boundary Γ
Remark 9.2.3. In the alternative notation curl FFF for ∇× FFF (see Remark 9.1.8) the theorem
of Stokes is frequently written
$$\int_\Gamma \mathbf{F} \cdot d\mathbf{r} = \int_S (\operatorname{curl}\mathbf{F}) \cdot d\mathbf{A}. \tag{9.2.68}$$
Remark 9.2.4. Recall that the surface integral on the right sides of (9.2.68) and (9.2.67) is defined
by (8.5.125) in terms of any parametric representation ΦΦΦ : D → R3 of the surface S, but of course
with the vector field ∇×FFF (or curl FFF ) in place of FFF in (8.5.125), that is
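$$\int_S (\nabla \times \mathbf{F}) \cdot d\mathbf{A} := \int_D (\nabla \times \mathbf{F})(\boldsymbol{\Phi}(u,v)) \cdot \left[ \frac{\partial \boldsymbol{\Phi}}{\partial u}(u,v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u,v) \right] du\,dv. \tag{9.2.69}$$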
As for the line integral on the left sides of (9.2.68) and (9.2.67), recall that this is defined by (5.1.13),
that is
$$\int_\Gamma \mathbf{F} \cdot d\mathbf{r} = \int_a^b \mathbf{F}(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}^{(1)}(t)\,dt, \tag{9.2.70}$$
in which γγγ : [a, b] → R3 is any path (recall Definition 4.2.2) which traverses the boundary curve
Γ in the positively oriented direction shown in Figure 9.2 as the parametric variable t increases
through the interval a ≤ t ≤ b. Since Γ is a closed curve the quantity at (9.2.70) is the circulation
of the vector field FFF around Γ (see Remark 5.1.2). We can therefore paraphrase Stokes’ theorem
as follows: “The circulation of a C1-vector field FFF around the positively oriented boundary Γ of a
finite open surface S is equal to the surface integral of ∇×FFF over S”.
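Before turning to the proof, it may help to see both sides of (9.2.67) computed for one concrete case. The Python sketch below is a numerical illustration under assumptions of our own choosing: the field F = z i + x j + y k, the surface S given as the graph z = 1 − u² − v² over the unit disc D, its boundary Γ the unit circle in the x−y plane traversed counterclockwise when viewed from above, and the helper names surface_integral and line_integral. The integrals (9.2.69) and (9.2.70) are approximated by midpoint rules, with the integral over the disc D carried out in polar coordinates.

```python
import math

def F(x, y, z):
    """Assumed vector field F = z i + x j + y k."""
    return (z, x, y)

def curl_F(x, y, z):
    """Curl of the assumed field, computed by hand from (9.1.7): i + j + k."""
    return (1.0, 1.0, 1.0)

def f(u, v):
    """Assumed height function; S is the graph z = f(u, v) over the unit disc."""
    return 1.0 - u * u - v * v

def normal(u, v):
    """dPhi/du x dPhi/dv = (-df/du, -df/dv, 1) for Phi(u, v) = (u, v, f(u, v))."""
    return (2.0 * u, 2.0 * v, 1.0)

def surface_integral(n=400):
    """Midpoint-rule version of (9.2.69); D is covered by u = r cos(th), v = r sin(th),
    so du dv becomes r dr dth."""
    dr, dth = 1.0 / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            th = (j + 0.5) * dth
            u, v = r * math.cos(th), r * math.sin(th)
            c, nvec = curl_F(u, v, f(u, v)), normal(u, v)
            total += sum(ck * nk for ck, nk in zip(c, nvec)) * r * dr * dth
    return total

def line_integral(n=2000):
    """Midpoint-rule version of (9.2.70) for gamma(t) = (cos t, sin t, 0)."""
    dt = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        point = (math.cos(t), math.sin(t), 0.0)      # gamma(t), where f = 0
        tangent = (-math.sin(t), math.cos(t), 0.0)   # gamma'(t)
        total += sum(a * b for a, b in zip(F(*point), tangent)) * dt
    return total

lhs = line_integral()
rhs = surface_integral()
print(lhs, rhs)
```

For this example both sides can also be evaluated by hand and equal π. Note that the normal vector has positive kkk-component and the boundary is traversed counterclockwise, so that the surface is on the left, consistent with the positive orientation of Remark 9.2.1.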
Proof of Theorem 9.2.2: We shall prove Theorem 9.2.2 in the special case for which the finite
open surface S is the graph of a function
(9.2.71) f : D → R,
with the parametric representation given in Remark 8.3.4, that is
(9.2.72) ΦΦΦ(u, v) := uiii+ vjjj + f(u, v)kkk for all (u, v) in D,
(see (8.3.54) and Figure 9.3). We have already calculated
Figure 9.3: Open surface S in proof of Stokes’ theorem
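$$\frac{\partial \boldsymbol{\Phi}}{\partial u}(u,v) \times \frac{\partial \boldsymbol{\Phi}}{\partial v}(u,v) = -\frac{\partial f}{\partial u}(u,v)\,\mathbf{i} - \frac{\partial f}{\partial v}(u,v)\,\mathbf{j} + \mathbf{k}, \quad \text{for all } (u,v) \text{ in } D, \tag{9.2.73}$$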
(see (8.3.57)). From (9.1.8) and (9.2.72)
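$$\begin{aligned}
(\nabla \times \mathbf{F})(\boldsymbol{\Phi}(u,v)) = {} & \left[ \frac{\partial F_3}{\partial y}(u,v,f(u,v)) - \frac{\partial F_2}{\partial z}(u,v,f(u,v)) \right]\mathbf{i} \\
& + \left[ \frac{\partial F_1}{\partial z}(u,v,f(u,v)) - \frac{\partial F_3}{\partial x}(u,v,f(u,v)) \right]\mathbf{j} \\
& + \left[ \frac{\partial F_2}{\partial x}(u,v,f(u,v)) - \frac{\partial F_1}{\partial y}(u,v,f(u,v)) \right]\mathbf{k},
\end{aligned} \tag{9.2.74}$$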
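for all $(u,v)$ in $D$. From (9.2.74) and (9.2.73),
\[ \begin{aligned} (\nabla\times\mathbf{F})(\boldsymbol{\Phi}(u,v)) \cdot \left[ \frac{\partial\boldsymbol{\Phi}}{\partial u}(u,v) \times \frac{\partial\boldsymbol{\Phi}}{\partial v}(u,v) \right] ={}& \left[ \frac{\partial F_2}{\partial z}(u,v,f(u,v)) - \frac{\partial F_3}{\partial y}(u,v,f(u,v)) \right] \frac{\partial f}{\partial u}(u,v) \\ &+ \left[ \frac{\partial F_3}{\partial x}(u,v,f(u,v)) - \frac{\partial F_1}{\partial z}(u,v,f(u,v)) \right] \frac{\partial f}{\partial v}(u,v) \\ &+ \left[ \frac{\partial F_2}{\partial x}(u,v,f(u,v)) - \frac{\partial F_1}{\partial y}(u,v,f(u,v)) \right], \end{aligned} \tag{9.2.75} \]
for all $(u,v)$ in $D$, and from (9.2.75) together with (9.2.69) we get
\[ \begin{aligned} \int_S (\nabla\times\mathbf{F})\cdot d\mathbf{A} ={}& \int_D (\nabla\times\mathbf{F})(\boldsymbol{\Phi}(u,v)) \cdot \left[ \frac{\partial\boldsymbol{\Phi}}{\partial u}(u,v) \times \frac{\partial\boldsymbol{\Phi}}{\partial v}(u,v) \right] du\,dv \\ ={}& \int_D \left[ \frac{\partial F_2}{\partial z}(u,v,f(u,v)) - \frac{\partial F_3}{\partial y}(u,v,f(u,v)) \right] \frac{\partial f}{\partial u}(u,v)\,du\,dv \\ &+ \int_D \left[ \frac{\partial F_3}{\partial x}(u,v,f(u,v)) - \frac{\partial F_1}{\partial z}(u,v,f(u,v)) \right] \frac{\partial f}{\partial v}(u,v)\,du\,dv \\ &+ \int_D \left[ \frac{\partial F_2}{\partial x}(u,v,f(u,v)) - \frac{\partial F_1}{\partial y}(u,v,f(u,v)) \right] du\,dv. \end{aligned} \tag{9.2.76} \]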
We next calculate the line integral on the left of (9.2.67). Exactly as at Remark 9.2.4 we fix a path
\[ \boldsymbol{\gamma} : [a,b] \to \mathbb{R}^3 \tag{9.2.77} \]
which traverses the boundary curve $\Gamma$ in the direction of positive orientation (see Figure 9.3). We write $\boldsymbol{\gamma}$ in the scalar component form
\[ \boldsymbol{\gamma}(t) = (x(t), y(t), z(t)) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k} \quad \text{for all } t \text{ in } a \le t \le b, \tag{9.2.78} \]
(see (4.2.5)), so that the line integral then becomes
\[ \int_\Gamma \mathbf{F}(\mathbf{r})\cdot d\mathbf{r} = \int_a^b \left[ F_1(x(t),y(t),z(t))\,\frac{dx}{dt}(t) + F_2(x(t),y(t),z(t))\,\frac{dy}{dt}(t) + F_3(x(t),y(t),z(t))\,\frac{dz}{dt}(t) \right] dt \tag{9.2.79} \]
(see (5.1.18)). Since the surface $S$ is the graph of the function $f$ (see (9.2.71)) it follows that
\[ z(t) = f(x(t), y(t)), \quad \text{for all } t \text{ in } a \le t \le b, \tag{9.2.80} \]
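(see Figure 9.3) so that from (9.2.80) and (9.2.78) we obtain
\[ \boldsymbol{\gamma}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + f(x(t),y(t))\,\mathbf{k} \quad \text{for all } t \text{ in } a \le t \le b, \tag{9.2.81} \]
and, from (9.2.80) and the chain rule, we get
\[ \frac{dz}{dt}(t) = \frac{\partial f}{\partial x}(x(t),y(t))\,\frac{dx}{dt}(t) + \frac{\partial f}{\partial y}(x(t),y(t))\,\frac{dy}{dt}(t). \tag{9.2.82} \]
We next insert (9.2.82) and (9.2.80) in (9.2.79) to get
\[ \begin{aligned} \int_\Gamma \mathbf{F}(\mathbf{r})\cdot d\mathbf{r} ={}& \int_a^b \bigg[ F_1(x(t),y(t),f(x(t),y(t)))\,\frac{dx}{dt}(t) + F_2(x(t),y(t),f(x(t),y(t)))\,\frac{dy}{dt}(t) \\ &\quad + F_3(x(t),y(t),f(x(t),y(t))) \left( \frac{\partial f}{\partial x}(x(t),y(t))\,\frac{dx}{dt}(t) + \frac{\partial f}{\partial y}(x(t),y(t))\,\frac{dy}{dt}(t) \right) \bigg]\,dt \\ ={}& \int_a^b \left[ G_1(x(t),y(t))\,\frac{dx}{dt}(t) + G_2(x(t),y(t))\,\frac{dy}{dt}(t) \right] dt, \end{aligned} \tag{9.2.83} \]
where we have defined
\[ G_1(x,y) := F_1(x,y,f(x,y)) + F_3(x,y,f(x,y))\,\frac{\partial f}{\partial x}(x,y), \tag{9.2.84} \]
\[ G_2(x,y) := F_2(x,y,f(x,y)) + F_3(x,y,f(x,y))\,\frac{\partial f}{\partial y}(x,y), \tag{9.2.85} \]
for all $(x,y)$ in $\mathbb{R}^2$. We next define the path in the plane by
\[ \boldsymbol{\eta}(t) := (x(t), y(t)), \quad \text{for all } t \text{ in } a \le t \le b, \tag{9.2.86} \]
where (9.2.81) relates $x(t)$ and $y(t)$ to $\boldsymbol{\gamma}(t)$. We next let $\Delta$ denote the boundary of the region $D$ in $\mathbb{R}^2$. Since $\boldsymbol{\gamma}(t)$ traverses $\Gamma$ in the positively oriented direction as $t$ increases through $a \le t \le b$ it follows that $\boldsymbol{\eta}(t)$ traverses $\Delta$ in the counterclockwise direction (see Figure 9.3). From (9.2.86) we have
\[ \int_a^b \left[ G_1(x(t),y(t))\,\frac{dx}{dt}(t) + G_2(x(t),y(t))\,\frac{dy}{dt}(t) \right] dt = \int_a^b \mathbf{G}(\boldsymbol{\eta}(t)) \cdot \boldsymbol{\eta}^{(1)}(t)\,dt = \int_\Delta \mathbf{G}(\mathbf{r})\cdot d\mathbf{r}, \tag{9.2.87} \]
in which the vector field $\mathbf{G} : \mathbb{R}^2 \to \mathbb{R}^2$ is defined by
\[ \mathbf{G}(x,y) := (G_1(x,y), G_2(x,y)), \quad \text{for all } (x,y) \text{ in } \mathbb{R}^2. \tag{9.2.88} \]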
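Since $\boldsymbol{\eta}(t)$, $a \le t \le b$, traverses $\Delta$ counterclockwise, from Green's theorem (see Theorem 7.2.1) we find
\[ \int_\Delta \mathbf{G}(\mathbf{r})\cdot d\mathbf{r} = \int_D \left[ \frac{\partial G_2}{\partial x}(x,y) - \frac{\partial G_1}{\partial y}(x,y) \right] dx\,dy. \tag{9.2.89} \]
Combining (9.2.89), (9.2.87), and (9.2.83) gives
\[ \int_\Gamma \mathbf{F}(\mathbf{r})\cdot d\mathbf{r} = \int_D \left[ \frac{\partial G_2}{\partial x}(x,y) - \frac{\partial G_1}{\partial y}(x,y) \right] dx\,dy. \tag{9.2.90} \]
We next evaluate the integrand on the right side of (9.2.90). From (9.2.84) and (9.2.85), together with the chain rule for partial derivatives, we obtain
\[ \begin{aligned} \frac{\partial G_2}{\partial x}(x,y) - \frac{\partial G_1}{\partial y}(x,y) ={}& \bigg[ \frac{\partial F_2}{\partial x}(x,y,f(x,y)) + \frac{\partial F_2}{\partial z}(x,y,f(x,y))\,\frac{\partial f}{\partial x}(x,y) \\ &\quad + \frac{\partial F_3}{\partial x}(x,y,f(x,y))\,\frac{\partial f}{\partial y}(x,y) + \frac{\partial F_3}{\partial z}(x,y,f(x,y))\,\frac{\partial f}{\partial x}(x,y)\,\frac{\partial f}{\partial y}(x,y) \\ &\quad + F_3(x,y,f(x,y))\,\frac{\partial^2 f}{\partial x\,\partial y}(x,y) \bigg] \\ &- \bigg[ \frac{\partial F_1}{\partial y}(x,y,f(x,y)) + \frac{\partial F_1}{\partial z}(x,y,f(x,y))\,\frac{\partial f}{\partial y}(x,y) \\ &\quad + \frac{\partial F_3}{\partial y}(x,y,f(x,y))\,\frac{\partial f}{\partial x}(x,y) + \frac{\partial F_3}{\partial z}(x,y,f(x,y))\,\frac{\partial f}{\partial y}(x,y)\,\frac{\partial f}{\partial x}(x,y) \\ &\quad + F_3(x,y,f(x,y))\,\frac{\partial^2 f}{\partial x\,\partial y}(x,y) \bigg]. \end{aligned} \tag{9.2.91} \]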
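Now the last two terms in the first square braces cancel the last two terms in the second square braces. After this simplification, from (9.2.91) and (9.2.90) we get
\[ \begin{aligned} \int_\Gamma \mathbf{F}(\mathbf{r})\cdot d\mathbf{r} ={}& \int_D \bigg\{ \left[ \frac{\partial F_2}{\partial x}(x,y,f(x,y)) + \frac{\partial F_2}{\partial z}(x,y,f(x,y))\,\frac{\partial f}{\partial x}(x,y) + \frac{\partial F_3}{\partial x}(x,y,f(x,y))\,\frac{\partial f}{\partial y}(x,y) \right] \\ &\quad - \left[ \frac{\partial F_1}{\partial y}(x,y,f(x,y)) + \frac{\partial F_1}{\partial z}(x,y,f(x,y))\,\frac{\partial f}{\partial y}(x,y) + \frac{\partial F_3}{\partial y}(x,y,f(x,y))\,\frac{\partial f}{\partial x}(x,y) \right] \bigg\}\,dx\,dy \\ ={}& \int_D \left[ \frac{\partial F_2}{\partial z}(x,y,f(x,y)) - \frac{\partial F_3}{\partial y}(x,y,f(x,y)) \right] \frac{\partial f}{\partial x}(x,y)\,dx\,dy \\ &+ \int_D \left[ \frac{\partial F_3}{\partial x}(x,y,f(x,y)) - \frac{\partial F_1}{\partial z}(x,y,f(x,y)) \right] \frac{\partial f}{\partial y}(x,y)\,dx\,dy \\ &+ \int_D \left[ \frac{\partial F_2}{\partial x}(x,y,f(x,y)) - \frac{\partial F_1}{\partial y}(x,y,f(x,y)) \right] dx\,dy. \end{aligned} \tag{9.2.92} \]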
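We find that the right sides of (9.2.92) and (9.2.76) are equal (they differ only in the names $(x,y)$ and $(u,v)$ of the variables of integration over $D$), and therefore the left sides are also equal, giving
\[ \int_S (\nabla\times\mathbf{F})\cdot d\mathbf{A} = \int_\Gamma \mathbf{F}(\mathbf{r})\cdot d\mathbf{r}, \]
which establishes the theorem.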
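The two sides of Stokes' theorem, as they are set up in the proof above, can be compared symbolically for a concrete graph surface. The following is only a sketch under stated assumptions: the field `F` and the height function `f` are hypothetical test data chosen for illustration (they do not come from the notes), and SymPy is assumed to be available.

```python
import sympy as sp

x, y, z, t, r, th = sp.symbols("x y z t r theta", real=True)

# Hypothetical test data, not taken from the notes.
F = sp.Matrix([-y + z, x * z, x * y])   # a C^1 vector field
f = 1 - x**2 - y**2                     # S is the graph of f over the unit disc D

def curl(field):
    """Componentwise curl in Cartesian coordinates."""
    return sp.Matrix([
        sp.diff(field[2], y) - sp.diff(field[1], z),
        sp.diff(field[0], z) - sp.diff(field[2], x),
        sp.diff(field[1], x) - sp.diff(field[0], y),
    ])

# Surface side: by (9.2.73) the normal vector is (-f_x, -f_y, 1).
normal = sp.Matrix([-sp.diff(f, x), -sp.diff(f, y), 1])
integrand = curl(F).subs(z, f).dot(normal)
# Integrate over the unit disc D in polar coordinates (Jacobian r).
polar = integrand.subs({x: r * sp.cos(th), y: r * sp.sin(th)}) * r
surface_integral = sp.integrate(polar, (r, 0, 1), (th, 0, 2 * sp.pi))

# Line side: (9.2.70) along the counterclockwise unit circle, where z = f = 0.
gamma = sp.Matrix([sp.cos(t), sp.sin(t), 0])
F_on_gamma = F.subs({x: gamma[0], y: gamma[1], z: gamma[2]})
line_integral = sp.integrate(F_on_gamma.dot(gamma.diff(t)), (t, 0, 2 * sp.pi))

print(sp.simplify(surface_integral), sp.simplify(line_integral))
```

For this particular choice both integrals evaluate to $\pi$, as the theorem predicts; other polynomial choices of `F` and `f` can be substituted to experiment further.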
Remark 9.2.5. Suppose that $S_1$ and $S_2$ are two finite open surfaces in $\mathbb{R}^3$ having a common closed positively oriented boundary curve $\Gamma$. Applying Stokes' Theorem 9.2.2 to the surface $S_1$ we get
\[ \int_\Gamma \mathbf{F}\cdot d\mathbf{r} = \int_{S_1} (\nabla\times\mathbf{F})\cdot d\mathbf{A}, \tag{9.2.93} \]
and similarly, applying Stokes' Theorem 9.2.2 to the surface $S_2$ we also get
\[ \int_\Gamma \mathbf{F}\cdot d\mathbf{r} = \int_{S_2} (\nabla\times\mathbf{F})\cdot d\mathbf{A}. \tag{9.2.94} \]
Combining (9.2.93) and (9.2.94),
\[ \int_{S_1} (\nabla\times\mathbf{F})\cdot d\mathbf{A} = \int_{S_2} (\nabla\times\mathbf{F})\cdot d\mathbf{A}. \tag{9.2.95} \]
We can often use (9.2.95) to replace evaluation of a difficult surface integral with the evaluation of an easier surface integral, as the next example illustrates.
Example 9.2.6. A vector field $\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^3$ is defined by
\[ \mathbf{F}(x,y,z) := x\,\mathbf{i} + (x+y)\,\mathbf{j} + (x+y+z)\,\mathbf{k}, \quad \text{for all } (x,y,z) \text{ in } \mathbb{R}^3, \tag{9.2.96} \]
and the surface $S$ is the top half of the sphere of radius $r$ centered at the origin of $\mathbb{R}^3$ (see Example 8.1.1 and Figure 9.4). We must calculate the surface integral of the curl vector field $\nabla\times\mathbf{F}$ over $S$. From (9.2.96) we easily get
\[ (\nabla\times\mathbf{F})(x,y,z) = \mathbf{i} - \mathbf{j} + \mathbf{k}, \quad \text{for all } (x,y,z) \text{ in } \mathbb{R}^3. \tag{9.2.97} \]
However, direct evaluation of the surface integral
\[ \int_S (\nabla\times\mathbf{F})\cdot d\mathbf{A} \]
is quite tedious despite the simplicity of (9.2.97) (the reader may want to try the direct evaluation to see this). We see from Figure 9.4 that if $\Gamma$ is the circle of radius $r$ in the $x$-$y$ plane with counterclockwise direction then $\Gamma$ is a closed positively oriented boundary curve of $S$, and therefore from Stokes' theorem 9.2.2 we get
\[ \int_\Gamma \mathbf{F}\cdot d\mathbf{r} = \int_S (\nabla\times\mathbf{F})\cdot d\mathbf{A}, \tag{9.2.98} \]
Figure 9.4: Surfaces S and S1 and closed curve Γ for Example 9.2.6
so we could try to evaluate the line integral on the left of (9.2.98) to get the required surface integral. Unfortunately this line integral is also quite tedious to compute because the vector field $\mathbf{F}$ does not have any obvious symmetry properties. However, observe from Figure 9.4 that the flat surface
\[ S_1 = \{ (x,y,z) \mid x^2 + y^2 \le r^2,\ z = 0 \}, \tag{9.2.99} \]
is also a finite open surface in $\mathbb{R}^3$ with $\Gamma$ as its closed positively oriented boundary curve, that is $\Gamma$ is the common closed positively oriented boundary curve of the surfaces $S$ and $S_1$. From Remark 9.2.5 we then get
\[ \int_S (\nabla\times\mathbf{F})\cdot d\mathbf{A} = \int_{S_1} (\nabla\times\mathbf{F})\cdot d\mathbf{A}. \tag{9.2.100} \]
We shall evaluate the surface integral on the right side of (9.2.100) using (8.5.125) (with $\nabla\times\mathbf{F}$ instead of $\mathbf{F}$), which we expect to be easier to deal with than the surface integral on the left side of (9.2.100) since the surface $S_1$ is "flat" not "curved". To use (8.5.125) we need a parametric representation of the surface $S_1$. Clearly such a representation is the mapping $\boldsymbol{\Phi} : D \to \mathbb{R}^3$ in which $D \subset \mathbb{R}^2_{uv}$ is
\[ D = \{ (u,v) \mid u^2 + v^2 \le r^2 \}, \tag{9.2.101} \]
and $\boldsymbol{\Phi}$ is defined by
\[ \boldsymbol{\Phi}(u,v) := u\,\mathbf{i} + v\,\mathbf{j} + 0\,\mathbf{k}, \quad \text{for all } (u,v) \text{ in } D, \tag{9.2.102} \]
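for it is immediately clear from (9.2.102) and (9.2.101) that
\[ S_1 = \{ \boldsymbol{\Phi}(u,v) \mid (u,v) \in D \}. \tag{9.2.103} \]
This very simple parametric representation results of course from the fact that the surface $S_1$ is "flat". Note also that, in this very simple case, the surface $S_1$ being represented by $\boldsymbol{\Phi}$ is really just identical to the parametric domain $D$ of the mapping $\boldsymbol{\Phi}$ (recall Definition 8.1.4). From (9.2.102)
\[ \frac{\partial\boldsymbol{\Phi}}{\partial u}(u,v) = \mathbf{i} \quad \text{and} \quad \frac{\partial\boldsymbol{\Phi}}{\partial v}(u,v) = \mathbf{j}, \tag{9.2.104} \]
for all $(u,v)$ in $D$, and from (9.2.104) we get
\[ \frac{\partial\boldsymbol{\Phi}}{\partial u}(u,v) \times \frac{\partial\boldsymbol{\Phi}}{\partial v}(u,v) = \mathbf{i}\times\mathbf{j} = \mathbf{k}, \quad \text{for all } (u,v) \text{ in } D. \tag{9.2.105} \]
From (9.2.97) and (9.2.102) we find
\[ (\nabla\times\mathbf{F})(\boldsymbol{\Phi}(u,v)) = \mathbf{i} - \mathbf{j} + \mathbf{k}, \quad \text{for all } (u,v) \text{ in } D, \tag{9.2.106} \]
and from (9.2.106) and (9.2.105) we obtain
\[ (\nabla\times\mathbf{F})(\boldsymbol{\Phi}(u,v)) \cdot \left[ \frac{\partial\boldsymbol{\Phi}}{\partial u}(u,v) \times \frac{\partial\boldsymbol{\Phi}}{\partial v}(u,v) \right] = (\mathbf{i} - \mathbf{j} + \mathbf{k})\cdot\mathbf{k} = 1, \quad \text{for all } (u,v) \text{ in } D. \tag{9.2.107} \]
From (8.5.125), with $\nabla\times\mathbf{F}$ substituted in place of $\mathbf{F}$, we get
\[ \begin{aligned} \int_{S_1} (\nabla\times\mathbf{F})\cdot d\mathbf{A} &= \int_D (\nabla\times\mathbf{F})(\boldsymbol{\Phi}(u,v)) \cdot \left[ \frac{\partial\boldsymbol{\Phi}}{\partial u}(u,v) \times \frac{\partial\boldsymbol{\Phi}}{\partial v}(u,v) \right] du\,dv \\ &= \int_D (1)\,du\,dv \quad \text{(from (9.2.107))} \\ &= \operatorname{area}(D) = \pi r^2. \end{aligned} \tag{9.2.108} \]
From (9.2.108) and (9.2.100) we find
\[ \int_S (\nabla\times\mathbf{F})\cdot d\mathbf{A} = \pi r^2. \tag{9.2.109} \]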
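As a cross-check on Example 9.2.6, the three integrals related by (9.2.98) and (9.2.100) can be evaluated symbolically. This is a sketch, not part of the original notes: it assumes SymPy, and the spherical-angle parametrization of the hemisphere used in part (b) is a standard one introduced here for illustration (the notes refer to Example 8.1.1 for their own representation).

```python
import sympy as sp

x, y, z, t, theta, phi = sp.symbols("x y z t theta phi", real=True)
r = sp.symbols("r", positive=True)

F = sp.Matrix([x, x + y, x + y + z])          # the field (9.2.96)
curl_F = sp.Matrix([
    sp.diff(F[2], y) - sp.diff(F[1], z),
    sp.diff(F[0], z) - sp.diff(F[2], x),
    sp.diff(F[1], x) - sp.diff(F[0], y),
])                                            # should reproduce (9.2.97)

# (a) Flat disc S1: the normal is k and curl F is constant, so the
#     integral is the k-component of curl F times area(D).
flat_disc = curl_F[2] * sp.pi * r**2

# (b) Hemisphere S in spherical angles; the cross product
#     Phi_phi x Phi_theta points away from the origin (upward on S).
Phi = r * sp.Matrix([sp.sin(phi) * sp.cos(theta),
                     sp.sin(phi) * sp.sin(theta),
                     sp.cos(phi)])
normal = Phi.diff(phi).cross(Phi.diff(theta))
hemisphere = sp.integrate(curl_F.dot(normal),
                          (theta, 0, 2 * sp.pi), (phi, 0, sp.pi / 2))

# (c) Line integral of F around the counterclockwise circle Gamma.
gamma = r * sp.Matrix([sp.cos(t), sp.sin(t), 0])
F_on_gamma = F.subs({x: gamma[0], y: gamma[1], z: gamma[2]})
circulation = sp.integrate(F_on_gamma.dot(gamma.diff(t)), (t, 0, 2 * sp.pi))

print(curl_F.T, sp.simplify(flat_disc), sp.simplify(hemisphere),
      sp.simplify(circulation))
```

All three quantities should simplify to $\pi r^2$, in agreement with (9.2.109).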
Remark 9.2.7. In Remark 9.1.10 we saw that the curl of the velocity field $\mathbf{V}$ of a moving fluid has a very direct physical interpretation. For general vector fields $\mathbf{F}$ it is not possible to be quite so specific about the physical significance of the curl vector field $\nabla\times\mathbf{F}$. Nevertheless, with the aid of Stokes' Theorem 9.2.2, we can at least get a partial intuitive sense of what the curl vector field
means. To see this suppose we have a vector field $\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^3$ (for simplicity we take the domain of $\mathbf{F}$ to be $D := \mathbb{R}^3$), and fix some point $(x,y,z)$ in $\mathbb{R}^3$ together with a unit vector $\mathbf{n}$ passing through $(x,y,z)$. Let the surface $S_\rho$ be a flat disc of radius $\rho$ with center at $(x,y,z)$ and lying in the plane which is orthogonal to the unit vector $\mathbf{n}$ (see Figure 9.5). Finally, let $\Gamma_\rho$ be the closed positively oriented boundary of $S_\rho$ (that is $\Gamma_\rho$ is just a circle of radius $\rho$ centered at $(x,y,z)$ with the sense of direction indicated in Figure 9.5).
Figure 9.5: Flat disc $S_\rho$ with circular boundary curve $\Gamma_\rho$
From Stokes' Theorem 9.2.2 we get
\[ \int_{\Gamma_\rho} \mathbf{F}\cdot d\mathbf{r} = \int_{S_\rho} (\nabla\times\mathbf{F})\cdot d\mathbf{A}. \tag{9.2.110} \]
Now suppose that the radius $\rho$ of the disc $S_\rho$ is very small. Then, from the definition of a surface integral, one sees that
\[ \int_{S_\rho} (\nabla\times\mathbf{F})\cdot d\mathbf{A} \approx \left[ (\nabla\times\mathbf{F})(x,y,z)\cdot\mathbf{n} \right] \operatorname{area}(S_\rho), \tag{9.2.111} \]
since the vector field $\nabla\times\mathbf{F}$ is approximately constant with value $(\nabla\times\mathbf{F})(x,y,z)$ at all points on the small surface $S_\rho$. From (9.2.111) and (9.2.110)
\[ (\nabla\times\mathbf{F})(x,y,z)\cdot\mathbf{n} \approx \frac{1}{\operatorname{area}(S_\rho)} \int_{\Gamma_\rho} \mathbf{F}\cdot d\mathbf{r}. \tag{9.2.112} \]
Now the circulation of $\mathbf{F}$ around $\Gamma_\rho$ given by the line integral on the right side of (9.2.112) is a measure of the aggregate "turning" or "rotation" of the vector field $\mathbf{F}$ around the very small curve $\Gamma_\rho$, so that the quantity on the right side of (9.2.112) is the aggregate "turning" of the vector field $\mathbf{F}$ around $\Gamma_\rho$ per unit area of the surface $S_\rho$ enclosed by $\Gamma_\rho$. Taking $\rho \to 0$ in (9.2.112) then gives
\[ (\nabla\times\mathbf{F})(x,y,z)\cdot\mathbf{n} = \lim_{\rho\to 0} \frac{1}{\operatorname{area}(S_\rho)} \int_{\Gamma_\rho} \mathbf{F}\cdot d\mathbf{r}. \tag{9.2.113} \]
To get a more detailed intuitive interpretation of $\nabla\times\mathbf{F}$ from (9.2.113) we need to assign a physical interpretation to the vector field $\mathbf{F}$ itself. For example, suppose that the vector field $\mathbf{F}$ is actually the velocity field $\mathbf{V}$ that was discussed in Remark 9.1.10. Then, applying the principles of fluid mechanics (we shall not undertake this here) one can actually establish the relation (9.1.10) on the basis of (9.2.113). For a second example, suppose that the vector field $\mathbf{F}$ is an electric field. In Remark 5.1.3, we have seen that the "turning" of $\mathbf{F}$ around $\Gamma_\rho$ represented by the circulation on the right side of (9.2.113) is actually the electromotive force (a very physical entity!) generated by the electric field around $\Gamma_\rho$. We then see from (9.2.113) that in this case the quantity $(\nabla\times\mathbf{F})(x,y,z)\cdot\mathbf{n}$ is the limit of electromotive force per unit area enclosed by $\Gamma_\rho$ as $\rho \to 0$. This interpretation of $\nabla\times\mathbf{F}$ when $\mathbf{F}$ is an electric field is very useful in electromagnetism.
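The limit (9.2.113) can be observed numerically. The sketch below is illustrative only: the test field $\mathbf{F} = (-y^3, x^3, 0)$, its hand-computed curl, the center point and the normal direction are all assumptions made for this demonstration, and the helper name `circulation_per_area` is hypothetical. NumPy is assumed.

```python
import numpy as np

def F(p):
    """Hypothetical test field F = (-y^3, x^3, 0); p has shape (N, 3)."""
    x, y = p[:, 0], p[:, 1]
    return np.stack([-y**3, x**3, np.zeros_like(x)], axis=1)

def curl_F(c):
    """Curl of the test field, computed by hand: (0, 0, 3x^2 + 3y^2)."""
    x, y, _ = c
    return np.array([0.0, 0.0, 3 * x**2 + 3 * y**2])

def circulation_per_area(c, n, rho, samples=400):
    """Right side of (9.2.112): circulation around the circle of radius rho
    centered at c in the plane orthogonal to n, divided by the disc area."""
    n = n / np.linalg.norm(n)
    # Build an orthonormal pair e1, e2 in the plane with e1 x e2 = n.
    helper = np.array([1.0, 0.0, 0.0]) if abs(n[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    e1 = np.cross(n, helper)
    e1 /= np.linalg.norm(e1)
    e2 = np.cross(n, e1)
    t = np.linspace(0.0, 2 * np.pi, samples, endpoint=False)
    gamma = c + rho * (np.outer(np.cos(t), e1) + np.outer(np.sin(t), e2))
    dgamma = rho * (np.outer(-np.sin(t), e1) + np.outer(np.cos(t), e2))
    # Uniform-sample quadrature of F(gamma(t)) . gamma'(t) over one period.
    circulation = np.sum(F(gamma) * dgamma) * (2 * np.pi / samples)
    return circulation / (np.pi * rho**2)

c = np.array([1.0, 1.0, 5.0])
n = np.array([1.0, 2.0, 1.0])
target = curl_F(c) @ (n / np.linalg.norm(n))
for rho in (1e-1, 1e-2, 1e-3):
    print(rho, circulation_per_area(c, n, rho), target)
```

As $\rho$ shrinks, the printed circulation per unit area should approach `target`, the value of $(\nabla\times\mathbf{F})\cdot\mathbf{n}$ at the center, with a discrepancy that decreases roughly like $\rho^2$ for this field.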
Example 9.2.8 (from nanotechnology). A time varying electric field is given by
\[ \mathbf{E}(t,x,y,z) = t(x+y)\,\mathbf{i} - 2t^2x^2\,\mathbf{j} + txy\,\mathbf{k}, \quad (x,y,z) \in \mathbb{R}^3. \tag{9.2.114} \]
As in Remark 9.2.7 fix some point $(x,y,z)$ in $\mathbb{R}^3$ and some unit vector $\mathbf{n}$ passing through $(x,y,z)$, and let $S_\rho$ be the flat disc with radius $\rho$ and center at $(x,y,z)$ lying in the plane which is orthogonal to the unit vector $\mathbf{n}$ (see Figure 9.5). A circular metallic loop with total resistance $R$ is placed along the boundary curve $\Gamma_\rho$. Determine the current in the loop in terms of time $t$ when (all distances in meters)
\[ \rho = 10^{-3}, \quad (x,y,z) = (1,1,5), \quad \mathbf{n} = \frac{\mathbf{i} + 2\mathbf{j} + \mathbf{k}}{\sqrt{6}}, \quad R = 10 \text{ ohm}. \tag{9.2.115} \]
From basic physics the voltage $v(t)$ generated in the circular loop is the electromotive force generated by the electric field around $\Gamma_\rho$, that is
\[ v(t) = \int_{\Gamma_\rho} \mathbf{E}\cdot d\mathbf{r}, \tag{9.2.116} \]
so that the current through the loop is, by Ohm's law,
\[ i(t) = \frac{v(t)}{R} = \frac{1}{R} \int_{\Gamma_\rho} \mathbf{E}\cdot d\mathbf{r}. \tag{9.2.117} \]
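Direct evaluation of the line integral on the right side of (9.2.117) for the electric field given by (9.2.114) is not at all easy! But, since $\rho$ is so small, we know from the approximation (9.2.112) that
\[ \int_{\Gamma_\rho} \mathbf{E}\cdot d\mathbf{r} \approx \operatorname{area}(S_\rho)\,\left[ (\nabla\times\mathbf{E})(t,x,y,z)\cdot\mathbf{n} \right]. \tag{9.2.118} \]
From (9.2.118) and (9.2.117)
\[ i(t) \approx \frac{\pi\rho^2}{R}\,\left[ (\nabla\times\mathbf{E})(t,x,y,z)\cdot\mathbf{n} \right], \tag{9.2.119} \]
and from (9.2.114)
\[ \begin{aligned} (\nabla\times\mathbf{E})(t,x,y,z) &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ t(x+y) & -2t^2x^2 & txy \end{vmatrix} = tx\,\mathbf{i} - ty\,\mathbf{j} - (4t^2x + t)\,\mathbf{k} \\ &= t\,\mathbf{i} - t\,\mathbf{j} - (4t^2 + t)\,\mathbf{k} \quad \text{at } (x,y,z) = (1,1,5). \end{aligned} \tag{9.2.120} \]
From (9.2.120), (9.2.119) and (9.2.115), and noting that $\pi\rho^2/R = \pi\,10^{-6}/10 = \pi\,10^{-7}$,
\[ \begin{aligned} i(t) &\approx \frac{10^{-6}\pi}{10\sqrt{6}}\,\left[ t\,\mathbf{i} - t\,\mathbf{j} - (4t^2 + t)\,\mathbf{k} \right] \cdot \left[ \mathbf{i} + 2\mathbf{j} + \mathbf{k} \right] \\ &= -\frac{10^{-7}(2\pi)}{\sqrt{6}}\,(t + 2t^2). \end{aligned} \tag{9.2.121} \]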
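The curl and the resulting current in Example 9.2.8 can be recomputed symbolically, which also gives a way to check the numerical prefactor $\pi\rho^2/R$ in (9.2.121). This is a sketch assuming SymPy; the variable names are chosen here for illustration.

```python
import sympy as sp

t, x, y, z = sp.symbols("t x y z", real=True)

E = sp.Matrix([t * (x + y), -2 * t**2 * x**2, t * x * y])   # field (9.2.114)
curl_E = sp.Matrix([
    sp.diff(E[2], y) - sp.diff(E[1], z),
    sp.diff(E[0], z) - sp.diff(E[2], x),
    sp.diff(E[1], x) - sp.diff(E[0], y),
])

# Data (9.2.115), kept exact with rationals.
n = sp.Matrix([1, 2, 1]) / sp.sqrt(6)
rho = sp.Rational(1, 1000)
R = 10
point = {x: 1, y: 1, z: 5}

# Approximation (9.2.119): i(t) ~ (pi rho^2 / R) * (curl E . n) at the point.
curl_at_point = curl_E.subs(point)
current = sp.simplify(sp.pi * rho**2 / R * curl_at_point.dot(n))

print(curl_E.T)
print(sp.factor(current))
```

With $\rho = 10^{-3}$ and $R = 10$ the prefactor is $\pi\rho^2/R = \pi\,10^{-7}$, so the printed current is a multiple of $10^{-7}(t + 2t^2)$.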
9.3 Divergence Theorem of Gauss-Ostrogradskii
We come to the second major theorem of vector calculus namely the theorem of Gauss-Ostrogradskii
or divergence theorem. In contrast to Stokes’ theorem, which involves the surface integral of a vector
field over an open surface (see Remark 9.2.1), the divergence theorem involves the surface integral
of a vector field over a closed surface. We therefore expand on the notion of a closed surface which
was briefly introduced in Remark 9.2.1:
Remark 9.3.1. A surface $S$ in $\mathbb{R}^3$ is a closed surface when it is the boundary of some region in $\mathbb{R}^3$. Examples of closed surfaces are the surface of a sphere in $\mathbb{R}^3$ and the surface of a rectangular parallelepiped in $\mathbb{R}^3$. In the divergence theorem we shall be interested in finite closed surfaces, that is closed surfaces which enclose a region of finite volume or finite extent. Fix a point on some closed
finite surface $S$ (e.g. think of a point on the surface of a sphere or a rectangular parallelepiped in $\mathbb{R}^3$). At this point there are two possible choices of unit vector which are orthogonal to the surface,
namely a unit vector which points out of the enclosed region and alternatively a unit vector which
points into the enclosed region. From now on we shall say that a finite closed surface has outward
orientation when the unit vector normal to the surface points out of the enclosed region at every
point on the surface. Analogously, a finite closed surface is said to have inward orientation when
the unit vector normal to the surface points into the enclosed region at every point on the surface
(see Figure 9.6 in which points A, B and C are on the surface S and unit vectors normal to the
surface are indicated at these points).
Figure 9.6: Closed surfaces S with outward orientation (a) and inward orientation (b)
We then have
Theorem 9.3.2 (Divergence theorem of Gauss-Ostrogradskii). Suppose that $S$ is a closed surface with outward orientation (see Remark 9.3.1) which encloses a finite region $\Omega \subset \mathbb{R}^3$, and $\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^3$ is a $C^1$-vector field (see Remark 3.2.2). Then
\[ \int_\Omega (\operatorname{div}\mathbf{F})\,dV = \int_S \mathbf{F}\cdot d\mathbf{A}, \tag{9.3.122} \]
(recall the definition of $\operatorname{div}\mathbf{F}$ at Definition 9.1.1).
Remark 9.3.3. In the alternative notation $\nabla\cdot\mathbf{F}$ for $\operatorname{div}\mathbf{F}$ the divergence theorem is written
\[ \int_\Omega (\nabla\cdot\mathbf{F})\,dV = \int_S \mathbf{F}\cdot d\mathbf{A}. \tag{9.3.123} \]
Remark 9.3.4. Recall that the surface integral on the right sides of (9.3.122) and (9.3.123) is defined by (8.5.125) in terms of any parametric representation $\boldsymbol{\Phi} : D \to \mathbb{R}^3$ of the surface $S$, that is
\[ \int_S \mathbf{F}\cdot d\mathbf{A} := \int_D \mathbf{F}(\boldsymbol{\Phi}(u,v)) \cdot \left[ \frac{\partial\boldsymbol{\Phi}}{\partial u}(u,v) \times \frac{\partial\boldsymbol{\Phi}}{\partial v}(u,v) \right] du\,dv. \tag{9.3.124} \]
As for the three dimensional integral appearing on the left of (9.3.122) and (9.3.123), this is formulated in Definition 2.2.1 in the case where $\Omega$ is a rectangular parallelepiped, and at Remark 2.2.8 for general $\Omega \subset \mathbb{R}^3$, with $f$ taken to be the scalar field $\nabla\cdot\mathbf{F}$.
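Proof of Theorem 9.3.2: We shall prove Theorem 9.3.2 in the special case where $\Omega$ is a regular region in the sense of Remark 2.2.12. Writing the vector field $\mathbf{F}$ in the usual componentwise form
\[ \mathbf{F}(x,y,z) = F_1(x,y,z)\,\mathbf{i} + F_2(x,y,z)\,\mathbf{j} + F_3(x,y,z)\,\mathbf{k}, \quad \text{for all } (x,y,z) \text{ in } \mathbb{R}^3, \tag{9.3.125} \]
(c.f. (3.2.17)), we have from (9.1.5)
\[ (\nabla\cdot\mathbf{F})(x,y,z) = \frac{\partial F_1}{\partial x}(x,y,z) + \frac{\partial F_2}{\partial y}(x,y,z) + \frac{\partial F_3}{\partial z}(x,y,z), \quad \text{for all } (x,y,z) \text{ in } \mathbb{R}^3, \tag{9.3.126} \]
so that, from (9.3.126), the volume integral on the left of (9.3.123) is the sum
\[ \int_\Omega (\nabla\cdot\mathbf{F})\,dV = \int_\Omega \frac{\partial F_1}{\partial x}\,dV + \int_\Omega \frac{\partial F_2}{\partial y}\,dV + \int_\Omega \frac{\partial F_3}{\partial z}\,dV. \tag{9.3.127} \]
As for the surface integral on the right of (9.3.123), we have
\[ \begin{aligned} \int_S \mathbf{F}\cdot d\mathbf{A} &= \int_S (F_1\,\mathbf{i} + F_2\,\mathbf{j} + F_3\,\mathbf{k})\cdot d\mathbf{A} \quad \text{(from (9.3.125))} \\ &= \int_S \mathbf{G}_1\cdot d\mathbf{A} + \int_S \mathbf{G}_2\cdot d\mathbf{A} + \int_S \mathbf{G}_3\cdot d\mathbf{A}, \end{aligned} \tag{9.3.128} \]
in which we have defined the simple vector fields
\[ \mathbf{G}_1(x,y,z) := F_1(x,y,z)\,\mathbf{i}, \quad \mathbf{G}_2(x,y,z) := F_2(x,y,z)\,\mathbf{j}, \quad \mathbf{G}_3(x,y,z) := F_3(x,y,z)\,\mathbf{k}, \tag{9.3.129} \]
for all $(x,y,z)$ in $\mathbb{R}^3$. We shall establish the relations
\[ \int_\Omega \frac{\partial F_1}{\partial x}\,dV = \int_S \mathbf{G}_1\cdot d\mathbf{A}, \quad \int_\Omega \frac{\partial F_2}{\partial y}\,dV = \int_S \mathbf{G}_2\cdot d\mathbf{A}, \quad \int_\Omega \frac{\partial F_3}{\partial z}\,dV = \int_S \mathbf{G}_3\cdot d\mathbf{A}, \tag{9.3.130} \]
since, with (9.3.130) established, we obtain (9.3.123) from (9.3.128) and (9.3.127). We establish the third of the relations at (9.3.130), the remaining relations being proved in an identical way. Since $\Omega$ is a regular region in the sense of Remark 2.2.12 it is, in particular, $z$-simple with lower function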
$\gamma_1(x,y)$, upper function $\gamma_2(x,y)$ and common domain of definition $D \subset \mathbb{R}^2_{xy}$ (see (2.2.101)), so that, in set theory notation, $\Omega$ is given by (2.2.101), that is
\[ \Omega = \{ (x,y,z) \in \mathbb{R}^3 \mid (x,y) \text{ in } D \text{ and } \gamma_1(x,y) \le z \le \gamma_2(x,y) \}. \tag{9.3.131} \]
It follows from Remark 2.2.9 that the integral of any real valued function $f$ over $\Omega$ is given by (2.2.102). Identifying $f$ with $\partial F_3/\partial z$ we obtain
\[ \int_\Omega \frac{\partial F_3}{\partial z}\,dV = \int_D \left\{ \int_{\gamma_1(x,y)}^{\gamma_2(x,y)} \frac{\partial F_3}{\partial z}(x,y,z)\,dz \right\} dx\,dy. \tag{9.3.132} \]
By the fundamental theorem of calculus
\[ \int_{\gamma_1(x,y)}^{\gamma_2(x,y)} \frac{\partial F_3}{\partial z}(x,y,z)\,dz = F_3(x,y,\gamma_2(x,y)) - F_3(x,y,\gamma_1(x,y)), \tag{9.3.133} \]
so that, from (9.3.132) and (9.3.133),
\[ \int_\Omega \frac{\partial F_3}{\partial z}\,dV = \int_D \left\{ F_3(x,y,\gamma_2(x,y)) - F_3(x,y,\gamma_1(x,y)) \right\} dx\,dy. \tag{9.3.134} \]
Next, evaluate the surface integral on the right side of the third relation at (9.3.130). To this end, since $\Omega$ is given by (9.3.131), it necessarily has the form shown in Figure 9.7, in which the upper surface $S_2$ is the graph of the function $\gamma_2(x,y)$, namely
\[ S_2 = \{ (x,y,\gamma_2(x,y)) \in \mathbb{R}^3 \mid (x,y) \text{ in } D \}, \tag{9.3.135} \]
(c.f. (2.2.100)), while the lower surface $S_1$ is the graph of the function $\gamma_1(x,y)$, namely
\[ S_1 = \{ (x,y,\gamma_1(x,y)) \in \mathbb{R}^3 \mid (x,y) \text{ in } D \}, \tag{9.3.136} \]
and the "side surface" $S_3$ is such that
\[ \text{each unit vector normal to } S_3 \text{ is also normal to the unit vector } \mathbf{k}. \tag{9.3.137} \]
Now the surfaces $S_1$, $S_2$ and $S_3$ are disjoint and cover the surface $S$ which encloses $\Omega$, that is
\[ S = S_1 \cup S_2 \cup S_3, \quad S_i \cap S_j = \emptyset \text{ when } i \ne j. \tag{9.3.138} \]
From (9.3.138) it follows that
\[ \int_S \mathbf{G}_3\cdot d\mathbf{A} = \int_{S_1} \mathbf{G}_3\cdot d\mathbf{A} + \int_{S_2} \mathbf{G}_3\cdot d\mathbf{A} + \int_{S_3} \mathbf{G}_3\cdot d\mathbf{A}. \tag{9.3.139} \]
Figure 9.7: Region Ω and surfaces S1, S2 and S3 in the proof of Theorem 9.3.2
It remains to evaluate each of the integrals on the right of (9.3.139). One sees from the third relation of (9.3.129) that $\mathbf{G}_3$ is in the direction of the unit vector $\mathbf{k}$. In view of this fact, together with (9.3.137) and the definition of surface integral, it is immediate that
\[ \int_{S_3} \mathbf{G}_3\cdot d\mathbf{A} = 0. \tag{9.3.140} \]
We now evaluate the second integral on the right side of (9.3.139), that is the integral over the upper surface $S_2$. To this end we recall Remark 8.5.3 in which we established the formula (8.5.133) for the surface integral of a general vector field $\mathbf{F}$ over a surface which is the graph of some function $f$. In fact we shall use (8.5.133), but taking $\gamma_2$ in place of $f$ (since $S_2$ is the graph of $\gamma_2$, as shown at (9.3.135)) and taking $\mathbf{G}_3$ in place of $\mathbf{F}$ (since $\mathbf{G}_3$ is the vector field we are integrating). From the third relation of (9.3.129) we see that
\[ \mathbf{G}_3(x,y,z) = 0\,\mathbf{i} + 0\,\mathbf{j} + F_3(x,y,z)\,\mathbf{k}, \tag{9.3.141} \]
so that in (8.5.133) we identify $F_1 = 0$, $F_2 = 0$ to get
\[ \int_{S_2} \mathbf{G}_3\cdot d\mathbf{A} = \int_D F_3(u,v,\gamma_2(u,v))\,du\,dv. \tag{9.3.142} \]
In exactly the same way, for the surface integral of $\mathbf{G}_3$ over the lower surface $S_1$ which is the graph of the function $\gamma_1$ (see (9.3.136)), we obtain
\[ \int_{S_1} \mathbf{G}_3\cdot d\mathbf{A} = -\int_D F_3(u,v,\gamma_1(u,v))\,du\,dv. \tag{9.3.143} \]
Why the negative sign on the right of (9.3.143), in contrast to the right side of (9.3.142)? This sign is to compensate for unit vectors $\mathbf{n}$ normal to $S_1$ pointing outwards (by the assumed outward orientation of $S$) and hence downwards (since $S_1$ is the lower surface), whereas the unit vectors normal to $S_2$ point upwards (again by the assumed outward orientation of $S$, see Figure 9.7). Combining (9.3.139), (9.3.140), (9.3.142) and (9.3.143),
\[ \int_S \mathbf{G}_3\cdot d\mathbf{A} = \int_D \left\{ F_3(u,v,\gamma_2(u,v)) - F_3(u,v,\gamma_1(u,v)) \right\} du\,dv. \tag{9.3.144} \]
Comparison of (9.3.134) with (9.3.144) shows
\[ \int_S \mathbf{G}_3\cdot d\mathbf{A} = \int_\Omega \frac{\partial F_3}{\partial z}\,dV, \tag{9.3.145} \]
which is the third relation of (9.3.130). The first and second relations of (9.3.130) are established in the same way, using the fact that $\Omega$ is also $x$-simple and $y$-simple.
The Gauss-Ostrogradskii Theorem 9.3.2 can often be used to simplify the calculation of a surface integral, reducing this to a volume integral which may often be easier to calculate, as the next example demonstrates.
Example 9.3.5. A vector field $\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^3$ is defined by
\[ \mathbf{F}(x,y,z) := 4x\,\mathbf{i} - 2y^2\,\mathbf{j} + z^2\,\mathbf{k}, \quad \text{for all } (x,y,z) \text{ in } \mathbb{R}^3. \tag{9.3.146} \]
The surface $S$ encloses the region
\[ \Omega = \{ (x,y,z) \in \mathbb{R}^3 \mid x^2 + y^2 \le 4,\ 0 \le z \le 3 \}, \tag{9.3.147} \]
that is $\Omega$ is the cylinder-shaped region of radius 2 and height 3 shown in Figure 9.8. We must evaluate the surface integral
\[ \int_S \mathbf{F}\cdot d\mathbf{A}. \]
Direct evaluation of this integral is not impossible, but is nevertheless quite lengthy and complicated, as we have to evaluate the surface integral separately over each of the three bounding surfaces $S_1$
Figure 9.8: Cylindrical region Ω for the Example 9.3.5
(“lower” surface), $S_2$ (“top” surface), and $S_3$ (“side” surface) in Figure 9.8 (you may want to try the calculation!). On the other hand, from Theorem 9.3.2 we have that
\[ \int_S \mathbf{F}\cdot d\mathbf{A} = \int_\Omega (\nabla\cdot\mathbf{F})\,dV, \tag{9.3.148} \]
so we will try to evaluate the volume integral on the right side of (9.3.148). First calculate the divergence. From (9.3.146) and Definition 9.1.1
\[ (\nabla\cdot\mathbf{F})(x,y,z) = \frac{\partial(4x)}{\partial x} + \frac{\partial(-2y^2)}{\partial y} + \frac{\partial(z^2)}{\partial z} = 4 - 4y + 2z. \tag{9.3.149} \]
We must therefore evaluate the three dimensional integral
\[ \int_\Omega f\,dV \tag{9.3.150} \]
for
\[ f(x,y,z) := (\nabla\cdot\mathbf{F})(x,y,z) = 4 - 4y + 2z. \tag{9.3.151} \]
We see at once, from Remark 2.2.9 and Figure 9.8, that the region $\Omega$ given by (9.3.147) is $z$-simple with lower function $\gamma_1(x,y)$, upper function $\gamma_2(x,y)$, and common domain of definition $D \subset \mathbb{R}^2_{xy}$, defined by
\[ D := \{ (x,y) \in \mathbb{R}^2 \mid x^2 + y^2 \le 4 \}, \quad \gamma_1(x,y) := 0, \quad \gamma_2(x,y) := 3, \quad \text{for all } (x,y) \text{ in } D. \tag{9.3.152} \]
We can therefore use Fubini's theorem in the form of (2.2.102), namely
\[ \int_\Omega f\,dV = \int_D \left\{ \int_{\gamma_1(x,y)}^{\gamma_2(x,y)} f(x,y,z)\,dz \right\} dx\,dy. \tag{9.3.153} \]
From (9.3.152) and (9.3.151) we get
\[ \int_{\gamma_1(x,y)}^{\gamma_2(x,y)} f(x,y,z)\,dz = \int_0^3 (4 - 4y + 2z)\,dz = 21 - 12y, \tag{9.3.154} \]
and, from (9.3.154) and (9.3.153), we get
\[ \int_\Omega f\,dV = \int_D g(x,y)\,dx\,dy, \quad \text{where } g(x,y) := 21 - 12y \text{ for all } (x,y) \text{ in } D. \tag{9.3.155} \]
Now one sees from (9.3.152) that the region $D$ is a disc of radius $r = 2$ centered at the origin of $\mathbb{R}^2$, so it follows, exactly as at Example 2.1.16, that $D$ is a $y$-simple region with upper function $\varphi_2(x)$, lower function $\varphi_1(x)$, and common interval of definition $a \le x \le b$, with
\[ a := -2, \quad b := 2, \quad \varphi_1(x) := -\sqrt{4 - x^2}, \quad \varphi_2(x) := \sqrt{4 - x^2}, \quad \text{for all } -2 \le x \le 2. \tag{9.3.156} \]
We can now use Fubini's theorem in the form of (2.1.27) to evaluate the integral on the right side of (9.3.155), that is
\[ \begin{aligned} \int_D g(x,y)\,dx\,dy &= \int_a^b \left\{ \int_{\varphi_1(x)}^{\varphi_2(x)} g(x,y)\,dy \right\} dx \\ &= \int_{-2}^{2} \left\{ \int_{-\sqrt{4-x^2}}^{\sqrt{4-x^2}} (21 - 12y)\,dy \right\} dx \quad \text{(from (9.3.156) and (9.3.155))}. \end{aligned} \tag{9.3.157} \]
For the "inner" $dy$-integral we have
\[ \int_{-\sqrt{4-x^2}}^{\sqrt{4-x^2}} (21 - 12y)\,dy = \left[ 21y - 6y^2 \right]_{y=-\sqrt{4-x^2}}^{y=\sqrt{4-x^2}} = 42\sqrt{4 - x^2}. \tag{9.3.158} \]
From (9.3.158) and (9.3.157) we get
\[ \int_D g(x,y)\,dx\,dy = 42 \int_{-2}^{2} \sqrt{4 - x^2}\,dx. \tag{9.3.159} \]
To evaluate the integral on the right side of (9.3.159) make the substitution
\[ x = 2\sin(\theta) \quad \text{so that} \quad dx = 2\cos(\theta)\,d\theta. \tag{9.3.160} \]
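Moreover, from (9.3.160),
\[ \theta = \arcsin(x/2) \quad \text{so that} \quad \theta = \tfrac{\pi}{2} \text{ when } x = 2 \text{ and } \theta = -\tfrac{\pi}{2} \text{ when } x = -2. \tag{9.3.161} \]
Then, from (9.3.160) and (9.3.161),
\[ \int_{-2}^{2} \sqrt{4 - x^2}\,dx = \int_{-\pi/2}^{\pi/2} \sqrt{4 - 4\sin^2(\theta)}\,(2\cos(\theta))\,d\theta = 4 \int_{-\pi/2}^{\pi/2} \cos^2(\theta)\,d\theta. \tag{9.3.162} \]
Now observe that
\[ \frac{d[\sin(\theta)\cos(\theta)]}{d\theta} = \cos^2(\theta) - \sin^2(\theta) = 2\cos^2(\theta) - 1 \]
so that
\[ \frac{d}{d\theta} \left[ \frac{\theta + \sin(\theta)\cos(\theta)}{2} \right] = \cos^2(\theta). \tag{9.3.163} \]
From (9.3.163) and (9.3.162) we find
\[ \int_{-2}^{2} \sqrt{4 - x^2}\,dx = 4 \left[ \frac{\theta + \sin(\theta)\cos(\theta)}{2} \right]_{\theta=-\pi/2}^{\theta=\pi/2} = 2\pi. \tag{9.3.164} \]
From (9.3.164) and (9.3.159)
\[ \int_D g(x,y)\,dx\,dy = 42(2\pi) = 84\pi. \tag{9.3.165} \]
From (9.3.165), the first relation of (9.3.155), (9.3.151) and (9.3.148) we obtain
\[ \int_S \mathbf{F}\cdot d\mathbf{A} = \int_\Omega (\nabla\cdot\mathbf{F})\,dV = \int_\Omega f\,dV = \int_D g(x,y)\,dx\,dy = 84\pi. \]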
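The value $84\pi$ can be cross-checked on both sides of the divergence theorem. The sketch below assumes SymPy and uses cylindrical coordinates, which is a choice made here for convenience (the notes instead iterate the integral in Cartesian form); the decomposition into top, bottom and side follows the three bounding surfaces of Figure 9.8.

```python
import sympy as sp

x, y, z, r, theta = sp.symbols("x y z r theta", real=True)

F = sp.Matrix([4 * x, -2 * y**2, z**2])      # field (9.3.146)
div_F = sp.diff(F[0], x) + sp.diff(F[1], y) + sp.diff(F[2], z)

cyl = {x: r * sp.cos(theta), y: r * sp.sin(theta)}

# Volume side: integrate div F over the cylinder (Jacobian r).
volume_integral = sp.integrate(div_F.subs(cyl) * r,
                               (z, 0, 3), (r, 0, 2), (theta, 0, 2 * sp.pi))

# Surface side, one bounding surface at a time, with outward normals.
# Top disc z = 3, normal +k; bottom disc z = 0, normal -k.
top = sp.integrate(F[2].subs(z, 3) * r, (r, 0, 2), (theta, 0, 2 * sp.pi))
bottom = sp.integrate(-F[2].subs(z, 0) * r, (r, 0, 2), (theta, 0, 2 * sp.pi))
# Side wall r = 2, normal (cos(theta), sin(theta), 0), dA = 2 dtheta dz.
side_normal = sp.Matrix([sp.cos(theta), sp.sin(theta), 0])
side = sp.integrate(F.subs(cyl).subs(r, 2).dot(side_normal) * 2,
                    (z, 0, 3), (theta, 0, 2 * sp.pi))
flux = top + bottom + side

print(div_F, volume_integral, flux)
```

Both `volume_integral` and `flux` should come out as $84\pi$; the surface contributions are $36\pi$ from the top, $0$ from the bottom and $48\pi$ from the side.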
Remark 9.3.6. In Remark 9.2.7 we saw that Stokes' Theorem 9.2.2 could be used to shed some light on the physical significance of the curl of a vector field. In much the same way we can use the Gauss-Ostrogradskii Theorem 9.3.2 to get some understanding of the physical significance of the divergence of a vector field. To see this suppose we have a vector field $\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^3$ (for simplicity we take the domain of $\mathbf{F}$ to be $D := \mathbb{R}^3$), and take the region $\Omega_\rho \subset \mathbb{R}^3$ to be a sphere with radius $\rho$ centered at some point $(x,y,z)$ in $\mathbb{R}^3$, and let $S_\rho$ be the outward oriented surface of $\Omega_\rho$. From Theorem 9.3.2 we obtain
\[ \int_{\Omega_\rho} (\nabla\cdot\mathbf{F})\,dV = \int_{S_\rho} \mathbf{F}\cdot d\mathbf{A}. \tag{9.3.166} \]
Now suppose that the radius $\rho$ of the sphere $\Omega_\rho$ is very small. Then, from the definition of the three dimensional integral, one sees that
\[ \int_{\Omega_\rho} (\nabla\cdot\mathbf{F})\,dV \approx (\nabla\cdot\mathbf{F})(x,y,z)\,\operatorname{vol}(\Omega_\rho), \tag{9.3.167} \]
(where $\operatorname{vol}(\Omega_\rho)$ denotes the volume of the sphere $\Omega_\rho$) since the scalar field $\nabla\cdot\mathbf{F}$ is approximately constant with value $(\nabla\cdot\mathbf{F})(x,y,z)$ at all points in the small region $\Omega_\rho$ centered at $(x,y,z)$. From (9.3.167) and (9.3.166) we get
\[ (\nabla\cdot\mathbf{F})(x,y,z) \approx \frac{1}{\operatorname{vol}(\Omega_\rho)} \int_{S_\rho} \mathbf{F}\cdot d\mathbf{A}, \tag{9.3.168} \]
and upon taking $\rho \to 0$ at (9.3.168) we obtain
\[ (\nabla\cdot\mathbf{F})(x,y,z) = \lim_{\rho\to 0} \frac{1}{\operatorname{vol}(\Omega_\rho)} \int_{S_\rho} \mathbf{F}\cdot d\mathbf{A}, \quad \text{for all } (x,y,z) \text{ in } \mathbb{R}^3. \tag{9.3.169} \]
We recall from Remark 8.5.4 that the surface integrals over the spherical surface $S_\rho$ on the right sides of (9.3.169) and (9.3.168) are the flux of the vector field $\mathbf{F}$ through $S_\rho$. Then (9.3.168) effectively says that the divergence $(\nabla\cdot\mathbf{F})(x,y,z)$ is approximately the flux of $\mathbf{F}$ through the small spherical surface $S_\rho$ centered at $(x,y,z)$ per volume of the region $\Omega_\rho$ enclosed by $S_\rho$. Moreover, (9.3.169) says that this approximation becomes exact when the radius $\rho$ of the sphere $S_\rho$ becomes "infinitesimally small". If, for example, we identify the vector field $\mathbf{F}$ with a current density vector field $\mathbf{J}$, then (9.3.169) becomes
\[ (\nabla\cdot\mathbf{J})(x,y,z) = \lim_{\rho\to 0} \frac{1}{\operatorname{vol}(\Omega_\rho)} \int_{S_\rho} \mathbf{J}\cdot d\mathbf{A}, \quad \text{for all } (x,y,z) \text{ in } \mathbb{R}^3. \tag{9.3.170} \]
But we know from Remark 8.5.4 that the surface integral
\[ \int_{S_\rho} \mathbf{J}\cdot d\mathbf{A} \]
on the right hand side of (9.3.170) is in fact the total current flowing out of $\Omega_\rho$ through the boundary surface $S_\rho$. It then follows from (9.3.170) that $(\nabla\cdot\mathbf{J})(x,y,z)$ must be this total out-flowing current per volume of the spherical region $\Omega_\rho$ enclosed by $S_\rho$ for infinitesimally small radius $\rho$. In particular, if
\[ (\nabla\cdot\mathbf{J})(x,y,z) > 0, \tag{9.3.171} \]
then there must be a source of current located at the point $(x,y,z)$, and if
\[ (\nabla\cdot\mathbf{J})(x,y,z) < 0, \tag{9.3.172} \]
then there must be a sink of current located at the point $(x,y,z)$.
Example 9.3.7. A current density vector field is given by
\[ \mathbf{J}(x,y,z) = x^3\,\mathbf{i} - 2xy\,\mathbf{j} + yz\,\mathbf{k}, \quad \text{for all } (x,y,z) \text{ in } \mathbb{R}^3. \tag{9.3.173} \]
As in Remark 9.3.6 let $\Omega_\rho$ be a spherical region with radius $\rho > 0$ and center at a point $(x,y,z)$, and let $S_\rho$ be the surface of $\Omega_\rho$. Determine the total current through the surface $S_\rho$ when
\[ \rho = 10^{-6}, \quad (x,y,z) = (1,2,1), \tag{9.3.174} \]
(all distances in meters). We know that the current through $S_\rho$ is given by the surface integral
\[ i = \int_{S_\rho} \mathbf{J}\cdot d\mathbf{A} \tag{9.3.175} \]
(recall Section 8.5). Direct evaluation of the surface integral at (9.3.175) is not at all easy. We could modify the parametric representation for the whole sphere worked out in Example 8.1.9 to account for the fact that the center is at $(x,y,z) = (1,2,1)$ to get the representation $\boldsymbol{\Phi} : D \to \mathbb{R}^3$ in which
\[ D = \{ (\theta,\varphi) \in \mathbb{R}^2 \mid 0 \le \theta \le 2\pi \text{ and } 0 \le \varphi \le \pi \} = [0,2\pi] \times [0,\pi], \tag{9.3.176} \]
and
\[ \boldsymbol{\Phi}(\theta,\varphi) = [1 + \rho\sin(\varphi)\cos(\theta)]\,\mathbf{i} + [2 + \rho\sin(\varphi)\sin(\theta)]\,\mathbf{j} + [1 + \rho\cos(\varphi)]\,\mathbf{k}, \tag{9.3.177} \]
for all $(\theta,\varphi)$ in $D$ defined by (9.3.176) (compare with (8.1.17)). Using the general formula for surface integral given by (8.5.125) we then see that the current at (9.3.175) is given by
\[ i = \int_{S_\rho} \mathbf{J}\cdot d\mathbf{A} = \int_D \mathbf{J}(\boldsymbol{\Phi}(\theta,\varphi)) \cdot \left[ \frac{\partial\boldsymbol{\Phi}}{\partial\theta}(\theta,\varphi) \times \frac{\partial\boldsymbol{\Phi}}{\partial\varphi}(\theta,\varphi) \right] d\theta\,d\varphi. \tag{9.3.178} \]
However, substitution of (9.3.177) and (9.3.173) into (9.3.178) leads to some rather laborious integrals! Since $\rho$ is so small (see (9.3.174)) we can instead use the approximation for the divergence established in Remark 9.3.6. From (9.3.168) we have
\[ (\nabla\cdot\mathbf{J})(x,y,z) \approx \frac{1}{\operatorname{vol}(\Omega_\rho)} \int_{S_\rho} \mathbf{J}\cdot d\mathbf{A}, \]
that is
\[ \int_{S_\rho} \mathbf{J}\cdot d\mathbf{A} \approx \operatorname{vol}(\Omega_\rho)\,(\nabla\cdot\mathbf{J})(x,y,z). \tag{9.3.179} \]
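From (9.3.173) and (9.3.174)
\[ (\nabla\cdot\mathbf{J})(x,y,z) = \frac{\partial(x^3)}{\partial x} + \frac{\partial(-2xy)}{\partial y} + \frac{\partial(yz)}{\partial z} = 3x^2 - 2x + y = 3. \tag{9.3.180} \]
Upon combining (9.3.180), (9.3.179), (9.3.175) and (9.3.174) we get
\[ i \approx \frac{4}{3}\pi\rho^3\,(\nabla\cdot\mathbf{J})(x,y,z) = \frac{4}{3}\pi(10^{-6})^3(3) = 4\pi \times 10^{-18} \text{ amps}. \]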
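To see how good the approximation (9.3.179) is for this example, one can compare it with the exact flux, which by the divergence theorem equals the integral of $\nabla\cdot\mathbf{J}$ over the ball. The sketch below assumes SymPy; the shifted spherical coordinates `(s, phi, theta)` centered at $(1,2,1)$ are introduced here for illustration and are not notation from the notes.

```python
import sympy as sp

x, y, z = sp.symbols("x y z", real=True)
s, phi, theta = sp.symbols("s phi theta", real=True)
rho = sp.Rational(1, 10**6)                  # radius from (9.3.174)
center = {x: 1, y: 2, z: 1}

J = sp.Matrix([x**3, -2 * x * y, y * z])     # field (9.3.173)
div_J = sp.diff(J[0], x) + sp.diff(J[1], y) + sp.diff(J[2], z)

# Approximation (9.3.179): volume of the ball times div J at the center.
approx = sp.Rational(4, 3) * sp.pi * rho**3 * div_J.subs(center)

# Exact flux via the divergence theorem: integrate div J over the ball
# in spherical coordinates centered at (1, 2, 1) (Jacobian s^2 sin(phi)).
ball = {x: 1 + s * sp.sin(phi) * sp.cos(theta),
        y: 2 + s * sp.sin(phi) * sp.sin(theta),
        z: 1 + s * sp.cos(phi)}
exact = sp.integrate(div_J.subs(ball) * s**2 * sp.sin(phi),
                     (s, 0, rho), (phi, 0, sp.pi), (theta, 0, 2 * sp.pi))

print(approx, sp.simplify(exact - approx))
```

The difference between the exact flux and the approximation is $4\pi\rho^5/5$, a relative correction of $\rho^2/5$, which is negligible for $\rho = 10^{-6}$.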
9.4 The Continuity Equation
Our goal in this section is to use the divergence Theorem 9.3.2 to establish the continuity equation, a basic result which is central to fluid dynamics, aerodynamics, electromagnetism and several other parts of physics and engineering. We shall obtain the continuity equation in the context of electric charge moving diffusely through space, since it is this version of the continuity equation which is of most relevance in the study of electromagnetism. In Example 3.1.4 we introduced the charge density scalar field $\rho$, and in Example 3.1.5 we saw how this could be used to obtain the total charge $Q$ enclosed in a region $\Omega \subset \mathbb{R}^3$, namely
\[ Q = \int_\Omega \rho(x,y,z)\,dx\,dy\,dz \tag{9.4.181} \]
(see (3.1.3)). Now suppose that charge is in motion through space. This means that the charge density at a point $(x,y,z)$ generally changes with time $t$, in other words is given by $\rho(t,x,y,z)$, which is a function of time $t$ for each $(x,y,z)$. The charge density $\rho$ is therefore a time varying scalar field in the sense of Remark 3.2.5. Fix some arbitrary region $\Omega \subset \mathbb{R}^3$ having the closed surface $S$ as its boundary. We shall suppose that $S$ is outwardly oriented, that is the unit vector normal to the surface points out of the region $\Omega$ at every point on the surface (see Remark 9.3.1). We emphasize that $S$ is a purely theoretical surface that leaves the movement of charge completely unaffected, and is not in any sense a physical surface or barrier which impedes or disturbs the movement of charge. In accordance with (9.4.181) the total charge contained within $\Omega$ at each instant $t$ is given by
\[ Q(t) = \int_\Omega \rho(t,x,y,z)\,dx\,dy\,dz. \tag{9.4.182} \]
From now on we are going to resort increasingly to the custom (which is standard throughout the literature) of suppressing variables. Accordingly, (9.4.182) will be written in the abbreviated form
\[ Q(t) = \int_\Omega \rho(t)\,dV, \tag{9.4.183} \]
so that the reader must “mentally insert” the missing variables (x, y, z) at ρ(t) (recall from Remark
2.2.2 that dV is shorthand for dx dy dz), or written in even more stripped down form as
(9.4.184) Q(t) =
∫Ω
ρ dV,
in which case the reader must “mentally insert” all the missing variables (t, x, y, z) at ρ. One soon
gets used to this! Basing ourselves on the totally stripped down notation at (9.4.184) it follows that
the rate of increase of the total charge contained within the region Ω is given by
(9.4.185)dQ(t)
dt=
d
dt
∫Ω
ρ dV =
∫Ω
∂ρ
∂tdV.
With all variables displayed (9.4.186) of course reads
(9.4.186)dQ(t)
dt=
∫Ω
∂ρ(t, x, y, z)
∂tdx dy dz,
but from now on we are going to avoid such detailed notation unless it is really needed (which
occasionally it is). To repeat, from (9.4.185) we have
(9.4.187) rate of increase of the total charge contained within the region Ω =
∫Ω
∂ρ
∂tdV.
Now recall from Section 8.5 that the total current passing through S is given by the surface integral
of the current density vector field JJJ of the moving charge over surface the S, that is
(9.4.188) total current passing through the boundary surface S =
∫S
JJJ · dAAA,
(recall Example 3.1.6 for the definition of the current density vector field). Since S is outwardly
oriented, the current at (9.4.188) is the rate at which charge leaves the region Ω by flowing through
the boundary surface S. It then follows, by a sign-change, that
(9.4.189)  rate at which charge enters region Ω through the boundary surface S $= -\int_{S} \mathbf{J} \cdot d\mathbf{A}.$
We now appeal to a basic physical law, called the law of conservation of charge which, in the present
setting, says
(9.4.190)
the rate of increase of the total charge contained within the region Ω is equal to
the rate at which charge enters region Ω through the boundary surface S.
The law of conservation of charge is completely justified by experiment. Like Ampere’s law and
Faraday’s law, this law is a bedrock principle of physics. The law of conservation of charge means
in particular that charge cannot be created or destroyed, that is within any region Ω there are never
any sources of charge (i.e. places at which charge just “appears”) or sinks of charge (i.e. places
at which charge just “vanishes”). In view of (9.4.190) the quantities at (9.4.189) and at (9.4.187)
must be equal, that is
(9.4.191)  $\int_{\Omega} \frac{\partial \rho}{\partial t}\, dV = -\int_{S} \mathbf{J} \cdot d\mathbf{A}.$
The relation (9.4.191) therefore expresses in mathematical form the basic principle of conservation
of charge. Unfortunately this relation is not very easy to use since the combination of a volume
integral on the left side and a surface integral on the right side is very difficult to deal with. We
are now going to use the tremendous power of vector calculus (in this case the divergence Theorem
9.3.2) to refine or “process” the relation (9.4.191) into a form which is extremely useful. From
Theorem 9.3.2, interpreting the general vector field $\mathbf{F}$ at (9.3.123) as the current density vector
field $\mathbf{J}$, we have
(9.4.192)  $\int_{S} \mathbf{J} \cdot d\mathbf{A} = \int_{\Omega} (\nabla \cdot \mathbf{J})\, dV,$
and combining (9.4.192) with (9.4.191) we get
(9.4.193)  $\int_{\Omega} \frac{\partial \rho}{\partial t}\, dV = -\int_{\Omega} (\nabla \cdot \mathbf{J})\, dV.$
Thanks to the divergence theorem we now have the same kind of integral (a volume integral) on
each side of (9.4.193), in contrast to (9.4.191) with its awkward mix of a volume integral on one side
and a surface integral on the other side. As we shall see, the relation (9.4.193) in terms of volume
integrals alone is much easier to deal with than (9.4.191). From (9.4.193)
(9.4.194)  $\int_{\Omega} \left[ (\nabla \cdot \mathbf{J}) + \frac{\partial \rho}{\partial t} \right] dV = 0.$
At this point it is important to realize that (9.4.194) holds for each and every region Ω ⊂ R3.
Indeed, in our development of (9.4.194) we made absolutely no assumptions about Ω except that it
is just a region in R3. This is very useful for it allows us to use the following technical result which
we state without proof:
Theorem 9.4.1 (du Bois Reymond). Suppose the scalar function $g : \mathbb{R}^3 \to \mathbb{R}$ is such that
(9.4.195)  $\int_{\Omega} g\, dV = 0$
for each and every region $\Omega \subset \mathbb{R}^3$. Then $g(x, y, z) = 0$ for all (x, y, z) in $\mathbb{R}^3$.
In order to use Theorem 9.4.1 at (9.4.194) it pays to write this relation displaying all the variables
which have been suppressed, that is we have
(9.4.196)  $\int_{\Omega} \left[ (\nabla \cdot \mathbf{J})(t, x, y, z) + \frac{\partial \rho(t, x, y, z)}{\partial t} \right] dV = 0,$
for each t and each region Ω ⊂ R3. Now fix some arbitrary value of t, and for this fixed t put
(9.4.197)  $g(x, y, z) := (\nabla \cdot \mathbf{J})(t, x, y, z) + \frac{\partial \rho(t, x, y, z)}{\partial t}$, for all (x, y, z) in $\mathbb{R}^3$.
It follows from (9.4.197) and (9.4.196) that, at this fixed instant t, we have
(9.4.198)  $\int_{\Omega} g\, dV = 0,$
for each and every region Ω ⊂ R3. Now we can apply Theorem 9.4.1 to (9.4.198) to conclude that
(9.4.199) g(x, y, z) = 0 for all (x, y, z) in R3.
Now it follows from (9.4.199) and (9.4.197), together with the arbitrary choice of t that
(9.4.200)  $(\nabla \cdot \mathbf{J})(t, x, y, z) + \frac{\partial \rho(t, x, y, z)}{\partial t} = 0$, for all t and points (x, y, z) in $\mathbb{R}^3$.
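In (9.4.200), in accordance with (9.1.44), by $(\nabla \cdot \mathbf{J})(t, x, y, z)$ we just mean the quantity
(9.4.201)  $(\nabla \cdot \mathbf{J})(t, x, y, z) = \frac{\partial J_1}{\partial x}(t, x, y, z) + \frac{\partial J_2}{\partial y}(t, x, y, z) + \frac{\partial J_3}{\partial z}(t, x, y, z).$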
Needless to say, we will always write (9.4.200) in the stripped down form
(9.4.202)  $(\nabla \cdot \mathbf{J}) + \frac{\partial \rho}{\partial t} = 0,$
in which all the variables (t, x, y, z) are omitted, but it should always be remembered that (9.4.202)
is just shorthand for the more detailed formulation at (9.4.200).
Remark 9.4.2. The relation (9.4.202) is called the continuity equation and expresses the conserva-
tion of charge in very convenient mathematical form. The continuity equation will be indispensable
in our study of Maxwell’s equations.
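As a numerical sanity check of the continuity equation (9.4.202), the following Python sketch uses a hypothetical pair of fields chosen purely for illustration (they do not come from these notes): a spatially uniform charge density ρ = ρ0 e^(−t), together with the current density J = (ρ0 e^(−t)/3)(x i + y j + z k). Central differences estimate (∇ · J) + ∂ρ/∂t at a point, and the result should vanish up to rounding error. The names `RHO0`, `rho`, `J` and `continuity_residual` are our own.

```python
import math

RHO0 = 2.0  # hypothetical charge-density scale, chosen only for illustration


def rho(t, x, y, z):
    """Illustrative charge density: uniform in space, decaying in time."""
    return RHO0 * math.exp(-t)


def J(t, x, y, z):
    """Illustrative current density carrying the charge radially outward."""
    s = RHO0 * math.exp(-t) / 3.0
    return (s * x, s * y, s * z)


def continuity_residual(t, x, y, z, h=1e-5):
    """Central-difference estimate of (div J) + d(rho)/dt at one point."""
    dJ1_dx = (J(t, x + h, y, z)[0] - J(t, x - h, y, z)[0]) / (2 * h)
    dJ2_dy = (J(t, x, y + h, z)[1] - J(t, x, y - h, z)[1]) / (2 * h)
    dJ3_dz = (J(t, x, y, z + h)[2] - J(t, x, y, z - h)[2]) / (2 * h)
    drho_dt = (rho(t + h, x, y, z) - rho(t - h, x, y, z)) / (2 * h)
    return dJ1_dx + dJ2_dy + dJ3_dz + drho_dt


# The residual should be zero up to finite-difference and rounding error.
print(continuity_residual(0.3, 1.0, -2.0, 0.5))
```

Here the divergence of J equals ρ0 e^(−t), which exactly cancels ∂ρ/∂t = −ρ0 e^(−t); any other pair (ρ, J) obeying conservation of charge could be substituted.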
Chapter 10
The Basic Laws of Electricity and
Magnetism
In this chapter we are going to formulate the basic, experimentally determined, laws of electricity
and magnetism. We have already seen two of these laws, namely Ampere’s circuital law and
Faraday’s law of electromagnetic induction, which were stated just to illustrate the essential role of
line integrals and surface integrals in formulating the basic physical laws of electricity and magnetism
(see Remark 8.5.7 and Remark 8.5.8). In this chapter we shall look at these and the other laws
of electricity and magnetism more thoroughly and use the tools of vector calculus to reformulate
the laws in mathematically very convenient form. All of this is preparation for the next chapter, in
which we study Maxwell’s equations of electromagnetism.
10.1 Static Electric Fields
In this section we focus on static electric fields, that is electric fields EEE(x, y, z) which vary from
one point (x, y, z) to another but are constant with respect to time. The basic experimental fact
concerning static electric fields is Coulomb’s force law which effectively provides the definition of
the electric field EEE. We have already seen Coulomb’s law at Example 3.1.3, and we state this again
as follows:
Law 10.1.1 (Coulomb’s law of electrostatics). Suppose that a point charge of Q coul. is located
at point (u, v, w) in R3, and (x, y, z) is some other point in R3 distinct from (u, v, w) as shown in
Figure 10.1. Denote by $\mathbf{r}(u, v, w; x, y, z)$ the unit vector from (u, v, w) to (x, y, z), that is
(10.1.1)  $\mathbf{r}(u, v, w; x, y, z) := \frac{(x-u)\mathbf{i} + (y-v)\mathbf{j} + (z-w)\mathbf{k}}{\sqrt{(x-u)^2 + (y-v)^2 + (z-w)^2}}.$
Then the electric field $\mathbf{E}(x, y, z)$ at point (x, y, z) due to the charge Q is the force exerted on a
positive test charge of 1 coul. at (x, y, z), which according to Coulomb’s inverse square force law is
given by
(10.1.2)  $\mathbf{E}(x, y, z) = \frac{Q}{4\pi\varepsilon_0 [(x-u)^2 + (y-v)^2 + (z-w)^2]}\, \mathbf{r}(u, v, w; x, y, z).$
Here $\varepsilon_0$ is a constant called the electrostatic permittivity of free space, which is given by
(10.1.3)  $\varepsilon_0 = 8.854 \times 10^{-12}\ \mathrm{coul}^2/(\mathrm{Newton\ meter}^2).$
Usually (10.1.1) and (10.1.2) are combined into the following single expression for the electric field:
(10.1.4)  $\mathbf{E}(x, y, z) = \frac{Q\,[(x-u)\mathbf{i} + (y-v)\mathbf{j} + (z-w)\mathbf{k}]}{4\pi\varepsilon_0 [(x-u)^2 + (y-v)^2 + (z-w)^2]^{3/2}}.$
Figure 10.1: Electric field $\mathbf{E}(x, y, z)$ at (x, y, z) due to point charge Q at (u, v, w)
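The combined formula (10.1.4) translates directly into a few lines of code. The following is a minimal Python sketch, in which the function name `point_charge_field` and the sample charge of 1 nanocoulomb are our own illustrative choices; the value of ε0 is the one quoted at (10.1.3).

```python
import math

EPS0 = 8.854e-12  # permittivity of free space, from (10.1.3)


def point_charge_field(Q, source, point):
    """Electric field (10.1.4) at `point` due to a charge Q located at `source`.

    Both `source` = (u, v, w) and `point` = (x, y, z) are 3-tuples of floats.
    """
    dx = point[0] - source[0]
    dy = point[1] - source[1]
    dz = point[2] - source[2]
    r2 = dx * dx + dy * dy + dz * dz
    if r2 == 0.0:
        raise ValueError("the field is undefined at the location of the charge")
    k = Q / (4.0 * math.pi * EPS0 * r2 ** 1.5)
    return (k * dx, k * dy, k * dz)


# Hypothetical example: 1 nanocoulomb at the origin, observed 1 meter away.
E = point_charge_field(1e-9, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
print(E)  # roughly 8.99 newton/coul in the i-direction
```

Doubling the distance from the charge reduces each component by a factor of four, which is the inverse square behaviour built into (10.1.2).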
Point charges, such as the charge Q which is “concentrated” at the point (u, v, w) in the state-
ment of Coulomb’s law, are “singularities” and constitute rather unnatural objects in the theory of
electromagnetism. It is much more usual to deal, not with point charges, but rather with charge
diffusely spread through space and described by a charge density scalar field ρ of the kind introduced
in Example 3.1.4, and throughout this section we shall suppose that this is the case. To focus on
the main question and avoid distraction by secondary issues we shall also suppose that the domain
D of the charge density scalar field ρ is all of R3, that is D = R3 in the notation of Example
3.1.4. Finally, we shall suppose that the charge density is time constant or static in the sense that
the charge density ρ(x, y, z) at any specified point (x, y, z) is constant with time t (but generally
varies from one point (x, y, z) to another). We saw at Example 3.1.4 that the significance of ρ is
the following: if (u, v, w) is any point within an infinitesimal cube dV in R3, with infinitesimal
side-lengths du, dv and dw (with reference to the u, v and w-axes of R3), then the total charge
enclosed in the cube is the infinitesimal quantity
(10.1.5) dQ = ρ(u, v, w) du dv dw,
so that the total charge Q, diffusely spread throughout space according to the density ρ, is just the
“sum” of the elements dQ at (10.1.5) expressed in terms of a volume integral namely
(10.1.6)  $Q = \int_{\mathbb{R}^3} dQ = \int_{\mathbb{R}^3} \rho(u, v, w)\, du\, dv\, dw.$
With this observation we can easily state Coulomb’s law in terms of charge diffusely spread through
space with charge density ρ. Fix some point (u, v, w) in R3 such that (u, v, w) lies in an infinitesimal
cube with side-lengths du, dv and dw, and fix some other point (x, y, z) (see Figure 10.2). As a
Figure 10.2: Electric field $d\mathbf{E}(x, y, z)$ at (x, y, z) due to infinitesimal point charge dQ at (u, v, w)
result of the charge dQ given by (10.1.5) the electric field at (x, y, z) is the infinitesimal vector
given by (10.1.4) with dQ in place of Q namely
(10.1.7)  $d\mathbf{E}(x, y, z) = \frac{1}{4\pi\varepsilon_0}\, \frac{(x-u)\mathbf{i} + (y-v)\mathbf{j} + (z-w)\mathbf{k}}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{3/2}}\, \rho(u, v, w)\, du\, dv\, dw.$
Our goal is to determine the total electric field at (x, y, z) as a result of all the charge contained
within R3. This means that we must “add up” the infinitesimal electric field vectors at (10.1.7)
for all infinitesimal cubes contained within R3. Using the calculus of three dimensional integrals
worked out in Section 2.2 this is easy. In fact
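(10.1.8)  $\mathbf{E}(x, y, z) = \int_{\mathbb{R}^3} d\mathbf{E}(x, y, z) = \frac{1}{4\pi\varepsilon_0} \int_{\mathbb{R}^3} \frac{(x-u)\mathbf{i} + (y-v)\mathbf{j} + (z-w)\mathbf{k}}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{3/2}}\, \rho(u, v, w)\, du\, dv\, dw,$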
in which we substituted from (10.1.7) at the second equality of (10.1.8). The statement (10.1.8)
amounts to Coulomb’s law for the electric field caused by charge diffusely spread through space
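with charge density ρ. Expanding (10.1.8) we get
(10.1.9)
$\mathbf{E}(x, y, z) = \left[ \frac{1}{4\pi\varepsilon_0} \int_{\mathbb{R}^3} \frac{(x-u)\,\rho(u, v, w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{3/2}}\, du\, dv\, dw \right] \mathbf{i}$
$\quad + \left[ \frac{1}{4\pi\varepsilon_0} \int_{\mathbb{R}^3} \frac{(y-v)\,\rho(u, v, w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{3/2}}\, du\, dv\, dw \right] \mathbf{j}$
$\quad + \left[ \frac{1}{4\pi\varepsilon_0} \int_{\mathbb{R}^3} \frac{(z-w)\,\rho(u, v, w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{3/2}}\, du\, dv\, dw \right] \mathbf{k}.$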
Observe that each of the three integrals on the right side of (10.1.9) is a standard three dimensional
integral, in which we integrate with respect to the space variables u, v and w, while keeping x, y
and z fixed.
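To make the integrals in (10.1.9) concrete, the following Python sketch approximates them by a midpoint Riemann sum. It assumes, purely for illustration, a charge density equal to a constant `rho0` inside a cube centred at the origin and zero elsewhere; the function name `field_of_uniform_cube` and the grid size `n` are our own choices. The observation point (x, y, z) is taken outside the cube so that the integrand stays bounded.

```python
import math

EPS0 = 8.854e-12  # permittivity of free space, from (10.1.3)


def field_of_uniform_cube(rho0, side, point, n=8):
    """Midpoint-rule approximation of (10.1.9) for a density equal to rho0 on a
    cube of the given side centred at the origin, and zero elsewhere.

    `point` = (x, y, z) should lie outside the cube.
    """
    h = side / n          # side length of each small cube
    dV = h ** 3           # volume element du dv dw
    x, y, z = point
    Ex = Ey = Ez = 0.0
    for i in range(n):
        u = -side / 2 + (i + 0.5) * h
        for j in range(n):
            v = -side / 2 + (j + 0.5) * h
            for k in range(n):
                w = -side / 2 + (k + 0.5) * h
                r3 = ((x - u) ** 2 + (y - v) ** 2 + (z - w) ** 2) ** 1.5
                Ex += (x - u) * rho0 * dV / r3
                Ey += (y - v) * rho0 * dV / r3
                Ez += (z - w) * rho0 * dV / r3
    c = 1.0 / (4.0 * math.pi * EPS0)
    return (c * Ex, c * Ey, c * Ez)


# Far from the cube the field should resemble that of a point charge
# Q = rho0 * side**3 placed at the origin, as given by (10.1.4).
E = field_of_uniform_cube(1e-9, 1.0, (5.0, 0.0, 0.0))
E_point = 1e-9 / (4.0 * math.pi * EPS0 * 5.0 ** 2)
print(E[0], E_point)
```

The three running sums correspond to the three bracketed integrals in (10.1.9), with x, y and z held fixed while u, v and w range over the cells.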
We shall now establish that the electric field EEE at (10.1.9) is conservative (recall Definition
6.2.1). To this end define
(10.1.10)  $\Psi(x, y, z) := -\frac{1}{4\pi\varepsilon_0} \int_{\mathbb{R}^3} \frac{\rho(u, v, w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{1/2}}\, du\, dv\, dw,$
for all (x, y, z) in R3, and observe (easy calculus!) that
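(10.1.11)
$\frac{\partial}{\partial x} \left[ \frac{\rho(u, v, w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{1/2}} \right] = -\frac{(x-u)\,\rho(u, v, w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{3/2}},$
$\frac{\partial}{\partial y} \left[ \frac{\rho(u, v, w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{1/2}} \right] = -\frac{(y-v)\,\rho(u, v, w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{3/2}},$
$\frac{\partial}{\partial z} \left[ \frac{\rho(u, v, w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{1/2}} \right] = -\frac{(z-w)\,\rho(u, v, w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{3/2}},$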
for all (x, y, z), in which we have kept u, v and w constant when evaluating the partial derivatives.
Recalling the gradient operator (see Definition 6.1.1) we then have
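(10.1.12)  $\nabla\Psi(x, y, z) = \frac{\partial\Psi}{\partial x}(x, y, z)\,\mathbf{i} + \frac{\partial\Psi}{\partial y}(x, y, z)\,\mathbf{j} + \frac{\partial\Psi}{\partial z}(x, y, z)\,\mathbf{k}.$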
Now from (10.1.10)
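(10.1.13)
$\frac{\partial\Psi}{\partial x}(x, y, z) = -\frac{1}{4\pi\varepsilon_0}\, \frac{\partial}{\partial x} \int_{\mathbb{R}^3} \frac{\rho(u, v, w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{1/2}}\, du\, dv\, dw$
$\quad = -\frac{1}{4\pi\varepsilon_0} \int_{\mathbb{R}^3} \frac{\partial}{\partial x} \left[ \frac{\rho(u, v, w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{1/2}} \right] du\, dv\, dw$
$\quad = \frac{1}{4\pi\varepsilon_0} \int_{\mathbb{R}^3} \frac{(x-u)\,\rho(u, v, w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{3/2}}\, du\, dv\, dw$  (from (10.1.11)).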
In exactly the same way
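(10.1.14)
$\frac{\partial\Psi}{\partial y}(x, y, z) = \frac{1}{4\pi\varepsilon_0} \int_{\mathbb{R}^3} \frac{(y-v)\,\rho(u, v, w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{3/2}}\, du\, dv\, dw,$
$\frac{\partial\Psi}{\partial z}(x, y, z) = \frac{1}{4\pi\varepsilon_0} \int_{\mathbb{R}^3} \frac{(z-w)\,\rho(u, v, w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{3/2}}\, du\, dv\, dw,$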
where we have used the second and third relations of (10.1.11). Upon combining (10.1.14), (10.1.13),
(10.1.12) and (10.1.9) we find
(10.1.15)  $\mathbf{E}(x, y, z) = (\nabla\Psi)(x, y, z)$ for all (x, y, z) in $\mathbb{R}^3$,
which we write with variables suppressed as
(10.1.16)  $\mathbf{E} = \nabla\Psi.$
Remark 10.1.2. In Example 6.2.3 we saw that the electric field due to a point charge Q is conser-
vative. The result (10.1.16) tells us that an electric field arising from charge diffusely distributed
through space according to a specified charge density field ρ is also conservative with a potential
function given by (10.1.10).
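The relation E = ∇Ψ is easy to check numerically in the simplest case of a single point charge. The Python sketch below assumes a point charge Q at the origin and uses the potential Ψ = −Q/(4πε0 r), which is the point-charge analogue of (10.1.10) in the sign convention of these notes (this specialization is our own illustration, not a formula stated above). The gradient is estimated by central differences and compared with Coulomb’s law (10.1.4).

```python
import math

EPS0 = 8.854e-12  # permittivity of free space, from (10.1.3)
Q = 1e-9          # hypothetical point charge at the origin (coul.)


def psi(x, y, z):
    """Assumed point-charge potential, with the sign convention E = grad(Psi)."""
    return -Q / (4.0 * math.pi * EPS0 * math.sqrt(x * x + y * y + z * z))


def grad_psi(x, y, z, h=1e-6):
    """Central-difference estimate of the gradient of psi at (x, y, z)."""
    return ((psi(x + h, y, z) - psi(x - h, y, z)) / (2 * h),
            (psi(x, y + h, z) - psi(x, y - h, z)) / (2 * h),
            (psi(x, y, z + h) - psi(x, y, z - h)) / (2 * h))


def coulomb_field(x, y, z):
    """Field of the point charge at the origin, from (10.1.4)."""
    k = Q / (4.0 * math.pi * EPS0 * (x * x + y * y + z * z) ** 1.5)
    return (k * x, k * y, k * z)


# The two vectors should agree to within finite-difference error.
print(grad_psi(1.0, 2.0, -2.0))
print(coulomb_field(1.0, 2.0, -2.0))
```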
Remark 10.1.3. From (10.1.16) and Theorem 9.1.13 we immediately obtain the important identity
(10.1.17)  $\nabla \times \mathbf{E} = 0,$
which tells us that the electric field $\mathbf{E}$ caused by a diffuse distribution of charge with charge density
ρ is irrotational (see Remark 9.1.12).
There is an alternative way of stating Coulomb’s law which is extremely important:
Law 10.1.4 (Gauss’ law for static electric fields). Suppose that $\mathbf{E}$ is the electric field arising from
the charge density ρ (see (10.1.8)). Then, for any region $\Omega \subset \mathbb{R}^3$ with the closed outwardly oriented
surface S as boundary (recall Remark 9.3.1), we have
(10.1.18)  $\int_{S} \mathbf{E} \cdot d\mathbf{A} = \frac{1}{\varepsilon_0} \int_{\Omega} \rho\, dV,$
in which ε0 is given by (10.1.3).
Remark 10.1.5. We recognize the integral on the right of (10.1.18) as giving the total charge
contained within the region Ω. There is of course plenty of charge in space outside the region Ω,
but according to Gauss’ Law 10.1.4 the total flux of the electric field EEE through S has nothing
whatever to do with this “outside” charge and is determined only by the charge inside Ω! The
basic experimental fact leading to Gauss’ Law 10.1.4 is Coulomb’s law in the form of the statement
(10.1.9), and Gauss’ law is really just Coulomb’s law but stated in more esoteric mathematical
language (involving surface integrals!). The reason we prefer the esoteric statement at (10.1.18) to
the more down-to-earth inverse square law of Coulomb given by (10.1.9) is that (10.1.18) is often
much easier to use than the inverse square law, for both theoretical investigation and practical
applications. The reason for this is Gauss’ law in the form (10.1.18) is very well suited to application
of the tools of vector calculus, whereas Coulomb’s law in the form (10.1.9) is not.
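The claim that the flux through S depends only on the charge inside Ω can be illustrated numerically. The Python sketch below considers a single point charge (an idealization used here only for illustration) and the cube [−1, 1]³ as the region Ω. It estimates the outward flux of the field (10.1.4) through the six faces by a midpoint rule and returns the flux multiplied by ε0/Q, so that Gauss’ law predicts the value 1 when the charge is inside the cube and 0 when it is outside. The function name `normalized_flux` and the grid size are our own choices.

```python
import math


def normalized_flux(charge_pos, half=1.0, n=40):
    """Estimate (eps0 / Q) times the outward flux of the point-charge field
    through the surface of the cube [-half, half]^3.

    The factor Q / (4 pi eps0) of (10.1.4) cancels against eps0 / Q, leaving
    (1 / 4 pi) times the surface integral of (d . normal) / |d|^3.
    """
    h = 2.0 * half / n
    total = 0.0
    for axis in range(3):                      # faces perpendicular to x, y, z
        a, b = [k for k in range(3) if k != axis]
        for sign in (1.0, -1.0):               # outward normal is sign * e_axis
            for i in range(n):
                for j in range(n):
                    p = [0.0, 0.0, 0.0]
                    p[axis] = sign * half
                    p[a] = -half + (i + 0.5) * h
                    p[b] = -half + (j + 0.5) * h
                    d = [p[k] - charge_pos[k] for k in range(3)]
                    r3 = (d[0] ** 2 + d[1] ** 2 + d[2] ** 2) ** 1.5
                    total += sign * d[axis] / r3 * h * h
    return total / (4.0 * math.pi)


print(normalized_flux((0.2, 0.1, -0.3)))  # charge inside the cube: close to 1
print(normalized_flux((3.0, 0.0, 0.0)))   # charge outside the cube: close to 0
```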
Remark 10.1.6. Gauss’ law 10.1.4 is a global statement since it gives an aggregate or net property
of the electric field $\mathbf{E}$ in the form of the surface integral at (10.1.18). We are now going to use the
divergence Theorem 9.3.2 to rewrite Gauss’ law in a local or pointwise or differential form which
says something about $\mathbf{E}(x, y, z)$ at every individual point (x, y, z). To this end we use Theorem
9.3.2 (in the form of (9.3.123), with $\mathbf{E}$ in place of the generic vector field $\mathbf{F}$) to get
(10.1.19)  $\int_{\Omega} (\nabla \cdot \mathbf{E})\, dV = \int_{S} \mathbf{E} \cdot d\mathbf{A},$
and upon combining (10.1.19) and (10.1.18) we find
$\int_{\Omega} (\nabla \cdot \mathbf{E})\, dV = \frac{1}{\varepsilon_0} \int_{\Omega} \rho\, dV,$
that is
(10.1.20)  $\int_{\Omega} \left[ (\nabla \cdot \mathbf{E}) - \frac{1}{\varepsilon_0}\rho \right] dV = 0.$
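Now (10.1.20) holds for each and every region $\Omega \subset \mathbb{R}^3$, so we can apply Theorem 9.4.1 with
$g(x, y, z) = (\nabla \cdot \mathbf{E})(x, y, z) - \frac{1}{\varepsilon_0}\rho(x, y, z),$
to conclude that
(10.1.21)  $(\nabla \cdot \mathbf{E}) = \frac{1}{\varepsilon_0}\rho,$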
(in which we have suppressed the underlying variable (x, y, z)!). The statement at (10.1.21) may
be regarded as the local or pointwise or differential version of the global version of Gauss’ law for
static electric fields at (10.1.18). This relation will be of immense value when we study Maxwell’s
equations (in fact, it is one of Maxwell’s equations!). The preceding derivation of (10.1.21) illustrates
just how useful the seemingly esoteric statement at (10.1.18) can be; it would have been effectively
impossible to derive (10.1.21) directly on the basis of Coulomb’s inverse square law, even though
this law is (physically) completely equivalent to Gauss’ law 10.1.4.
Remark 10.1.7. The following question arises: Given a charge density field ρ how does one deter-
mine the electric field EEE caused by the charge density field? Of course Coulomb’s law (10.1.9) in
principle gives EEE(x, y, z) at each point (x, y, z) by direct integration. However, the integrals on the
right side of (10.1.9) are usually difficult to evaluate, so this is not a very practical way to determine
the EEE-field. The way around this obstacle is to recall from Remark 10.1.2 that EEE is conservative,
that is (see (10.1.16))
(10.1.22)  $\mathbf{E} = \nabla\Psi$ for a scalar potential field $\Psi : \mathbb{R}^3 \to \mathbb{R}$.
We are now going to use the local form of Gauss’ law, given by (10.1.21), to show that the scalar
potential Ψ necessarily satisfies a partial differential equation known as Poisson’s equation. To this
end we first take the divergence of each side of (10.1.22), that is
(10.1.23)  $\nabla \cdot \mathbf{E} = \nabla \cdot (\nabla\Psi).$
From Theorem 9.1.21 we have
(10.1.24)  $\nabla \cdot (\nabla\Psi) = \nabla^2\Psi,$
so that (10.1.24) and (10.1.23) give
(10.1.25)  $\nabla \cdot \mathbf{E} = \nabla^2\Psi.$
Upon combining (10.1.25) and the local form of Gauss’ law (10.1.21) we get
(10.1.26)  $\nabla^2\Psi = \frac{1}{\varepsilon_0}\rho.$
Recalling the Laplacian operator (see Definition 9.1.18) we see that (10.1.26) can be written explic-
itly in terms of second partial derivatives as
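(10.1.27)  $\frac{\partial^2\Psi}{\partial x^2} + \frac{\partial^2\Psi}{\partial y^2} + \frac{\partial^2\Psi}{\partial z^2} = \frac{1}{\varepsilon_0}\rho,$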
in which, as usual, we have suppressed the basic variable (x, y, z). The nice thing about the partial
differential equation (10.1.27) (that is (10.1.26)) is that there are available extremely powerful
numerical methods for determining functions Ψ which satisfy this equation when one is given the
charge density field ρ. Having determined this Ψ we then easily obtain the electric field from
(10.1.22) by calculating the x, y and z-partial derivatives of Ψ. This is a much more competitive
and feasible approach for calculating EEE than by direct evaluation of the integrals in (10.1.9).
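To give a flavour of the numerical methods alluded to here, the following Python sketch applies Gauss-Seidel sweeps to the standard seven-point finite-difference form of an equation of the type (10.1.27). Several simplifying assumptions are ours and are not part of these notes: the equation is solved on the unit cube rather than on all of R3, the unknown is set to zero on the boundary of the cube, and the right side is a hypothetical test function chosen so that the exact solution sin(πx) sin(πy) sin(πz) is known.

```python
import math


def solve_poisson(source, n=9, sweeps=200):
    """Gauss-Seidel sweeps for  laplacian(f) = source  on the unit cube, with
    f = 0 on the boundary and n interior grid points along each axis."""
    h = 1.0 / (n + 1)
    N = n + 2
    f = [[[0.0] * N for _ in range(N)] for _ in range(N)]
    for _ in range(sweeps):
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                for k in range(1, n + 1):
                    neighbours = (f[i + 1][j][k] + f[i - 1][j][k]
                                  + f[i][j + 1][k] + f[i][j - 1][k]
                                  + f[i][j][k + 1] + f[i][j][k - 1])
                    # Discrete equation: (neighbours - 6 f) / h^2 = source.
                    f[i][j][k] = (neighbours - h * h * source(i * h, j * h, k * h)) / 6.0
    return f, h


def exact(x, y, z):
    """Known solution used to manufacture the test problem."""
    return math.sin(math.pi * x) * math.sin(math.pi * y) * math.sin(math.pi * z)


def source(x, y, z):
    """Right side for which `exact` satisfies the Poisson equation."""
    return -3.0 * math.pi ** 2 * exact(x, y, z)


f, h = solve_poisson(source)
err = max(abs(f[i][j][k] - exact(i * h, j * h, k * h))
          for i in range(len(f)) for j in range(len(f)) for k in range(len(f)))
print(err)  # small compared with the maximum value 1 of the exact solution
```

Once such a grid approximation of Ψ is available, the components of the electric field follow from (10.1.22) by differencing neighbouring grid values. Production solvers use far faster schemes than plain Gauss-Seidel; the sketch is only meant to show the structure of the computation.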
Remark 10.1.8. The relation (10.1.26) (equivalently (10.1.27)) is a particular instance of a partial
differential equation, called the Poisson equation, which can be formulated in general terms as
(10.1.28)  $\nabla^2 f = \psi,$
or equivalently, in less codified terms, as
(10.1.29)  $\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} = \psi,$
in which ψ : R3 → R is a given or known scalar field, and our goal is to determine some function
f : R3 → R which satisfies (10.1.28) (equivalently (10.1.29)). Poisson equations are of central im-
portance and occur all over physics and engineering. We have just seen in Remark 10.1.7 how any
potential function Ψ of a time constant electric field EEE arising from a given time constant charge
density ρ satisfies a Poisson equation, and we shall see later that Poisson equations arise very natu-
rally in connection with magnetic fields as well. Furthermore, one also encounters Poisson equations
in fluid dynamics and aerodynamics, thermodynamics, elasticity theory, quantum mechanics and
gravitational physics, to mention just a few areas where this equation occurs. It is easy, although
tedious, to verify by direct substitution that a function f which satisfies (10.1.28) is given by
(10.1.30)  $f(x, y, z) = -\frac{1}{4\pi} \int_{\mathbb{R}^3} \frac{\psi(u, v, w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{1/2}}\, du\, dv\, dw,$
for all (x, y, z) in $\mathbb{R}^3$.
10.2 Static Magnetic Fields
In this section we state the basic experimentally determined laws concerning static magnetic fields.
Just as the basic entity giving rise to a static electric field EEE in the previous Section 10.1 is a static
charge density scalar field ρ, so the basic entity giving rise to a static magnetic field BBB is a static
current density vector field JJJ of the kind introduced in Example 3.1.6, and throughout this section
we assume given a current density vector field JJJ , defined for simplicity on the domain D := R3 (in
the notation of Example 3.1.6), and static in the sense that at each (x, y, z) the current density
JJJ(x, y, z) is constant with respect to time t.
In the same way that the basic experimental fact concerning static electric fields is Coulomb’s
law, so the basic experimental fact concerning static magnetic fields is the Biot-Savart law. This
law states that a time constant or static current density vector field JJJ causes a time constant or
static magnetic field BBB, and that BBB is given in terms of JJJ by the integral
(10.2.31)  $\mathbf{B}(x, y, z) = \frac{\mu_0}{4\pi} \int_{\mathbb{R}^3} \mathbf{J}(u, v, w) \times \frac{(x-u)\mathbf{i} + (y-v)\mathbf{j} + (z-w)\mathbf{k}}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{3/2}}\, du\, dv\, dw,$
for each (x, y, z) in $\mathbb{R}^3$, in which $\mu_0$ is a constant called the magnetic permeability of free space with
the value
(10.2.32)  $\mu_0 = 4\pi \times 10^{-7}$ henry/meter.
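As an illustration of how (10.2.31) is evaluated in practice, the Python sketch below treats a thin straight wire lying on the z-axis between w = −L and w = L and carrying a current I in the k-direction. The thin-wire idealization, in which J du dv dw is replaced by I k dw, is our own simplifying assumption and is not made in these notes; the function names are likewise hypothetical. For comparison, the integral for a finite straight wire can be done in closed form, and the code checks the numerical sum against that elementary result.

```python
import math

MU0 = 4.0 * math.pi * 1e-7  # permeability of free space, from (10.2.32)


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def thin_wire_field(I, L, point, n=10000):
    """Midpoint-rule Biot-Savart sum for a thin wire on the z-axis from -L to L
    carrying current I in the +z direction (J dV replaced by I k dw)."""
    x, y, z = point
    h = 2.0 * L / n
    B = [0.0, 0.0, 0.0]
    for m in range(n):
        w = -L + (m + 0.5) * h
        d = (x, y, z - w)                      # from source point to field point
        r3 = (d[0] ** 2 + d[1] ** 2 + d[2] ** 2) ** 1.5
        c = cross((0.0, 0.0, I * h), d)        # current element crossed with d
        for i in range(3):
            B[i] += MU0 / (4.0 * math.pi) * c[i] / r3
    return tuple(B)


# Hypothetical example: 1 amp, wire half-length 50 m, field point 1 m from the wire.
B = thin_wire_field(1.0, 50.0, (1.0, 0.0, 0.0))
closed_form = MU0 * 1.0 / (2.0 * math.pi * 1.0) * 50.0 / math.sqrt(50.0 ** 2 + 1.0)
print(B[1], closed_form)
```

The field points in the j-direction at the point (1, 0, 0), that is, it circulates around the wire, which reflects the cross product in the integrand of (10.2.31).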
Remark 10.2.1. There are definite similarities between the Biot-Savart law (10.2.31) and Coulomb’s
law in the form of (10.1.8). The denominators in both integrands are identical, indicating that
both of the laws are “inverse square” laws. In each law there is an “input” or “cause” in the inte-
grand on the right side; this is the charge density scalar field ρ in Coulomb’s law (10.1.8), and the
current density vector field JJJ in the Biot-Savart law (10.2.31), and each law involves integration
over $\mathbb{R}^3$. On the other hand there is also one huge difference between the two laws: in (10.1.8) at
each (u, v, w) in $\mathbb{R}^3$ the vector $(x-u)\mathbf{i} + (y-v)\mathbf{j} + (z-w)\mathbf{k}$ is multiplied by the scalar charge density
ρ(u, v, w) causing the electric field, whereas in (10.2.31) the same vector $(x-u)\mathbf{i} + (y-v)\mathbf{j} + (z-w)\mathbf{k}$
is cross-multiplied by the vector current density $\mathbf{J}(u, v, w)$ causing the magnetic field. We could
easily calculate the cross product in (10.2.31) and expand $\mathbf{B}(x, y, z)$ in vector form but shall not
do this. Indeed, the value of the Biot-Savart law (10.2.31) lies in the fact that it lends itself, after
a lengthy and rather complicated mathematical analysis that we shall certainly not give here, to
restatement in the form of two laws, namely Ampere’s circuital law (which we previewed at Remark
8.5.7) and Gauss’ law for magnetic fields (in much the same way that Coulomb’s law (10.1.8) lends
itself to restatement in the form of Gauss’ Law 10.1.4 for electrostatic fields). We state these laws
next.
Law 10.2.2 (Ampere’s circuital law and Gauss’ law for static magnetic fields). Suppose that $\mathbf{B}$ is
the static magnetic field arising from the static current density $\mathbf{J}$ (see (10.2.31)). Then, for each
finite open surface S with boundary curve Γ, we have
(10.2.33)  $\int_{\Gamma} \mathbf{B} \cdot d\mathbf{r} = \mu_0 \int_{S} \mathbf{J} \cdot d\mathbf{A},$
and for each closed surface S we have
(10.2.34)  $\int_{S} \mathbf{B} \cdot d\mathbf{A} = 0.$
Remark 10.2.3. We have already seen the statement (10.2.33), known as Ampere’s circuital law
(recall Remark 8.5.7). Observe that (10.2.34) is a statement about the flux of the magnetic field
through a closed surface, much like Gauss’ Law (10.1.18) is a statement about the flux of an electric
field through a closed surface. For this reason the assertion (10.2.34) is called Gauss’ law for magnetic
fields.
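Ampere’s circuital law (10.2.33) can be illustrated with a short numerical experiment. The Python sketch below assumes the standard textbook field of an infinitely long thin wire on the z-axis carrying current I in the k-direction, namely B = µ0 I (−y i + x j) / (2π(x² + y²)); this formula is not derived in these notes and is used here only as an assumed input. The code evaluates the line integral of B around a square loop in the plane z = 0, traversed counterclockwise, and the result should be close to µ0 I when the loop encircles the wire and close to zero when it does not.

```python
import math

MU0 = 4.0 * math.pi * 1e-7  # permeability of free space, from (10.2.32)


def wire_field(I, x, y):
    """Assumed field (Bx, By) of an infinitely long thin wire on the z-axis."""
    s = MU0 * I / (2.0 * math.pi * (x * x + y * y))
    return (-s * y, s * x)


def circulation(I, cx, cy, half=1.0, n=2000):
    """Midpoint-rule estimate of the line integral of B . dr, counterclockwise
    around the square of half-width `half` centred at (cx, cy) in the plane z = 0."""
    corners = [(cx - half, cy - half), (cx + half, cy - half),
               (cx + half, cy + half), (cx - half, cy + half)]
    total = 0.0
    for k in range(4):
        (x0, y0), (x1, y1) = corners[k], corners[(k + 1) % 4]
        dx, dy = (x1 - x0) / n, (y1 - y0) / n
        for m in range(n):
            x = x0 + (m + 0.5) * dx
            y = y0 + (m + 0.5) * dy
            Bx, By = wire_field(I, x, y)
            total += Bx * dx + By * dy
    return total


I = 2.0  # hypothetical current in amps
print(circulation(I, 0.3, -0.2) / (MU0 * I))  # loop encircles the wire: close to 1
print(circulation(I, 5.0, 0.0) / (MU0 * I))   # loop misses the wire: close to 0
```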
Remark 10.2.4. The relations (10.2.33) and (10.2.34) are global statements about the magnetic
field BBB stated in terms of line integrals and surface integrals of BBB. Exactly as we wrote the global
form (10.1.18) of Gauss’ law of electrostatics in the local form (10.1.21), we are now going to write
the laws (10.2.33) and (10.2.34) in local form. Using the divergence theorem and just repeating the
steps (10.1.19) to (10.1.21) (with $\mathbf{B}$ in place of $\mathbf{E}$ and zero in place of ρ) we get the local form of
(10.2.34), namely
(10.2.35)  $\nabla \cdot \mathbf{B} = 0.$
We are now going to establish the local form of Ampere’s circuital law (10.2.33), and for this we
shall need Stokes’ Theorem 9.2.2. Indeed, using Stokes’ theorem in the form of (9.2.67), with $\mathbf{B}$ in
place of the generic vector field $\mathbf{F}$, we have
(10.2.36)  $\int_{\Gamma} \mathbf{B} \cdot d\mathbf{r} = \int_{S} (\nabla \times \mathbf{B}) \cdot d\mathbf{A},$
and with (10.2.36) we can write Ampere’s circuital law (10.2.33) as
$\int_{S} (\nabla \times \mathbf{B}) \cdot d\mathbf{A} = \mu_0 \int_{S} \mathbf{J} \cdot d\mathbf{A},$
or
(10.2.37)  $\int_{S} [(\nabla \times \mathbf{B}) - \mu_0 \mathbf{J}] \cdot d\mathbf{A} = 0.$
From Law 10.2.2 we know that (10.2.33) holds for each and every finite open surface S, and
therefore (10.2.37) must also hold for each and every finite open surface S. This is hugely important,
for it allows us to use the following “surface integral” analog of Theorem 9.4.1:
Theorem 10.2.5 (du Bois Reymond mk.II). Suppose the vector field $\mathbf{G} : \mathbb{R}^3 \to \mathbb{R}^3$ is such that
(10.2.38)  $\int_{S} \mathbf{G} \cdot d\mathbf{A} = 0$
for each and every finite open surface S in $\mathbb{R}^3$. Then $\mathbf{G}(x, y, z) = 0$ for all (x, y, z) in $\mathbb{R}^3$.
From (10.2.37) and Theorem 10.2.5 (with $\mathbf{G} := (\nabla \times \mathbf{B}) - \mu_0 \mathbf{J}$) we obtain
(10.2.39)  $\nabla \times \mathbf{B} = \mu_0 \mathbf{J}.$
The relation (10.2.39) is the local form of Ampere’s circuital law (10.2.33), in much the same way
that (10.2.35) is the local form of Gauss’ law (10.2.34) for magnetic fields, and (10.1.21) is the local
form of Gauss’ law (10.1.18) for electric fields. In particular, (10.2.39) tells us that the curl (or
“rotation” or “turning” or “vorticity”) of the magnetic field BBB at each (x, y, z) in R3 is directly
proportional to the vector JJJ(x, y, z) (recall the intuitive significance of curl discussed at Remark
9.1.11 and Remark 9.2.7). We see from (10.2.39) that in the nontrivial case where $\mathbf{J}$ is not identically
zero (so that $\mathbf{J}(x, y, z) \neq 0$ for some (x, y, z) in $\mathbb{R}^3$), $\nabla \times \mathbf{B}$ also cannot be identically zero; it
then follows from the equivalence of (a) and (c) in Theorem 9.1.9 that the static magnetic field $\mathbf{B}$
arising from a static current density $\mathbf{J}$ cannot possibly be a conservative vector field. This is clearly
very different from a static electric field $\mathbf{E}$ arising from a static charge density ρ, which is always
conservative, as we have seen at Remark 10.1.2. It follows that it is generally not possible to write
$\mathbf{B}$ as the gradient of some scalar potential function, that is we generally do not have
(10.2.40)  $\mathbf{B} = \nabla\Psi$
for a scalar potential function Ψ : R3 → R (recall Definition 6.2.1). This lack of a scalar potential
function makes magnetic fields intrinsically more difficult to deal with than electric fields. It turns
out that we can, nevertheless, always write BBB as the curl of another vector field, and this provides
some partial compensation for not having a scalar potential function at our disposal. The essence
of the matter is discussed in the next few remarks.
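The local laws ∇ · B = 0 and ∇ × B = µ0 J can be checked on a simple example. The Python sketch below assumes a uniform current density J = J0 k and the field B = (µ0 J0 / 2)(−y i + x j), which is the standard textbook field inside a long straight conductor carrying uniform current (this example is our own and is not derived in these notes). Central differences estimate the curl and the divergence at a sample point; the helper names `partial`, `curl` and `div` are hypothetical.

```python
import math

MU0 = 4.0 * math.pi * 1e-7  # permeability of free space, from (10.2.32)
J0 = 3.0                    # hypothetical uniform current density along k


def B(x, y, z):
    """Assumed field inside a conductor with uniform current density J0 k."""
    s = 0.5 * MU0 * J0
    return (-s * y, s * x, 0.0)


def partial(F, i, axis, p, h=1e-3):
    """Central difference of component i of F along coordinate `axis` at p."""
    a, b = list(p), list(p)
    a[axis] += h
    b[axis] -= h
    return (F(*a)[i] - F(*b)[i]) / (2.0 * h)


def curl(F, p):
    """Finite-difference curl of the vector field F at the point p."""
    return (partial(F, 2, 1, p) - partial(F, 1, 2, p),
            partial(F, 0, 2, p) - partial(F, 2, 0, p),
            partial(F, 1, 0, p) - partial(F, 0, 1, p))


def div(F, p):
    """Finite-difference divergence of the vector field F at the point p."""
    return sum(partial(F, i, i, p) for i in range(3))


p = (0.2, -0.1, 0.4)
print(curl(B, p), MU0 * J0)  # third component of the curl should match MU0 * J0
print(div(B, p))             # should vanish
```

Since the curl is nonzero wherever J0 is nonzero, this field cannot be the gradient of a scalar potential, in line with the discussion above.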
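Remark 10.2.6. For the moment forget about magnetic fields and consider a general $C^1$-vector
field $\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^3$. If it is true that $\mathbf{F} = \nabla \times \mathbf{G}$ for some $C^1$-vector field $\mathbf{G} : \mathbb{R}^3 \to \mathbb{R}^3$ then we know
from (the very elementary) Theorem 9.1.14 that $\nabla \cdot \mathbf{F} = \nabla \cdot (\nabla \times \mathbf{G}) = 0$, that is
(10.2.41)  $\mathbf{F} = \nabla \times \mathbf{G}$ for some vector field $\mathbf{G}$ $\Rightarrow$ $\nabla \cdot \mathbf{F} = 0.$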
Is the converse of (10.2.41) also true? That is, if we know that ∇ ·FFF = 0 then is it necessarily the
case that FFF = ∇ ×GGG for some vector field GGG? This converse is far from obvious, and is in fact
decidedly difficult to establish, but nevertheless is true, according to the following very profound
theorem which we make no attempt to establish here:
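Theorem 10.2.7 (Poincaré). Suppose that $\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^3$ is a $C^1$-vector field. If $(\nabla \cdot \mathbf{F})(x, y, z) = 0$
for all (x, y, z) in $\mathbb{R}^3$ then $\mathbf{F}$ is necessarily given by
(10.2.42)  $\mathbf{F}(x, y, z) = (\nabla \times \mathbf{G})(x, y, z)$ for all (x, y, z) in $\mathbb{R}^3$,
for some $C^1$-vector field $\mathbf{G} : \mathbb{R}^3 \to \mathbb{R}^3$.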
Remark 10.2.8. If a vector field $\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^3$ is given by the curl of a vector field $\mathbf{G} : \mathbb{R}^3 \to \mathbb{R}^3$,
that is
(10.2.43)  $\mathbf{F}(x, y, z) = (\nabla \times \mathbf{G})(x, y, z)$, for all (x, y, z) in $\mathbb{R}^3$,
or more briefly
(10.2.44)  $\mathbf{F} = \nabla \times \mathbf{G},$
(with the variables (x, y, z) suppressed), then the vector field $\mathbf{G}$ is called a vector potential of the
vector field $\mathbf{F}$, and correspondingly we say that the vector field $\mathbf{F}$ has a vector potential $\mathbf{G}$. If the
vector field $\mathbf{F}$ has a vector potential $\mathbf{G}$, that is $\mathbf{F}$ and $\mathbf{G}$ are related by (10.2.44), then $\mathbf{F}$ in fact has
infinitely many vector potentials. To see this put
(10.2.45)  $\tilde{\mathbf{G}} := \mathbf{G} + \nabla g$
for any $C^1$-function $g : \mathbb{R}^3 \to \mathbb{R}$. From Theorem 9.1.13 we know
(10.2.46)  $\nabla \times (\nabla g) = 0.$
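Then
$\nabla \times \tilde{\mathbf{G}} = \nabla \times (\mathbf{G} + \nabla g)$  (from (10.2.45))
$\quad = \nabla \times \mathbf{G} + \nabla \times (\nabla g)$  (from (9.1.61))
$\quad = \nabla \times \mathbf{G} + 0$  (from (10.2.46))
$\quad = \mathbf{F}$  (from (10.2.44)),
that is
(10.2.47)  $\mathbf{F} = \nabla \times \tilde{\mathbf{G}}.$
In short, if $\mathbf{G}$ is a vector potential of $\mathbf{F}$ then $\tilde{\mathbf{G}}$ defined by (10.2.45) for any $C^1$-function $g : \mathbb{R}^3 \to \mathbb{R}$
is also a vector potential of $\mathbf{F}$, that is $\mathbf{F}$ has infinitely many vector potentials.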
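The non-uniqueness of vector potentials is easy to see numerically. In the Python sketch below, the vector field G = (−y/2) i + (x/2) j and the scalar function g = xyz + sin x are hypothetical choices made only for illustration; the code adds ∇g to G and confirms by central differences that the curl is unchanged.

```python
import math


def G(x, y, z):
    """Hypothetical vector potential; its curl is the constant field k."""
    return (-0.5 * y, 0.5 * x, 0.0)


def grad_g(x, y, z):
    """Gradient of the hypothetical scalar function g = x*y*z + sin(x)."""
    return (y * z + math.cos(x), x * z, x * y)


def G_tilde(x, y, z):
    """Modified potential G + grad(g), as in (10.2.45)."""
    a, b = G(x, y, z), grad_g(x, y, z)
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def curl(F, p, h=1e-4):
    """Central-difference curl of the vector field F at the point p."""
    def d(i, axis):
        a, b = list(p), list(p)
        a[axis] += h
        b[axis] -= h
        return (F(*a)[i] - F(*b)[i]) / (2.0 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))


p = (0.7, -1.2, 0.4)
print(curl(G, p))        # approximately (0, 0, 1)
print(curl(G_tilde, p))  # the same vector, up to rounding error
```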
Remark 10.2.9. From Theorem 10.2.7 we see that a solenoidal vector field $\mathbf{F}$ (i.e. $\nabla \cdot \mathbf{F} = 0$,
recall Remark 9.1.5) always has some vector potential $\mathbf{G}$. In accordance with Remark 10.2.8 the
solenoidal vector field $\mathbf{F}$ then has infinitely many vector potentials $\tilde{\mathbf{G}}$ of the form
$\tilde{\mathbf{G}} = \mathbf{G} + \nabla g,$
corresponding to every $C^1$-scalar field $g : \mathbb{R}^3 \to \mathbb{R}$.
Remark 10.2.10. One sees from (10.2.35) that the magnetic field $\mathbf{B}$ is solenoidal, so that we can
apply Theorem 10.2.7 to conclude that
(10.2.48)  $\mathbf{B} = \nabla \times \mathbf{A}$ for some vector field $\mathbf{A} : \mathbb{R}^3 \to \mathbb{R}^3$.
Any vector field $\mathbf{A}$ which satisfies (10.2.48) is a magnetic vector potential of the magnetic field $\mathbf{B}$. We
know from Remark 10.2.8 that, if $\mathbf{A}$ is a magnetic vector potential of $\mathbf{B}$, then for every $C^1$-function
$g : \mathbb{R}^3 \to \mathbb{R}$, the vector field
(10.2.49)  $\tilde{\mathbf{A}} := \mathbf{A} + \nabla g,$
is also a magnetic vector potential of $\mathbf{B}$.
Remark 10.2.11. In Remark 10.1.7 we addressed the question of how to determine an electric
field EEE in terms of the (known) charge density field ρ causing the electric field. Here we consider an
analogous question for static magnetic fields: if we know the static current density field JJJ how can
we determine the static magnetic field BBB in terms of JJJ? In principle the Biot-Savart law (10.2.31)
gives the answer, but (much as with the direct application of Coulomb’s law noted in Remark
10.1.7) the integrals appearing in (10.2.31) are usually difficult to evaluate. In Remark 10.1.7 we
saw that the calculation of an electric field $\mathbf{E}$ in terms of the known charge density field ρ can be
reduced to the solution of a Poisson equation giving a potential function Ψ of the electric field. This
is a very satisfactory state of affairs because we understand clearly how to solve Poisson equations
(typically by numerical analysis or numerical methods). In the present remark we shall see that,
rather similarly, the calculation of the magnetic field BBB in terms of JJJ can be reduced to the solution
of three Poisson equations, each equation giving a scalar component of a magnetic vector potential
AAA of the magnetic field BBB. It will soon become clear that the path we must follow in attaining this
goal is a good deal longer and more complicated than the rather simple analysis of Remark 10.1.7.
This just reflects the fact that magnetic fields, not being conservative, are a lot more challenging to
deal with than electric fields. Gauss’ law and Ampere’s law for magnetic fields in local form
state that a time constant current density $\mathbf{J}$ causes a time constant magnetic field $\mathbf{B}$ which satisfies
the relations
(10.2.50)  $\nabla \cdot \mathbf{B} = 0, \qquad \nabla \times \mathbf{B} = \mu_0 \mathbf{J}$
(see (10.2.35) and (10.2.39)). Determining the magnetic fieldBBB therefore means that we must “solve
for” or “extract” BBB in terms of the known current density JJJ using the relations at (10.2.50). There
is a very powerful theorem of Helmholtz (or von Helmholtz) which says that one can indeed “solve”
for BBB in terms of JJJ from the relations at (10.2.50) and even provides a formula which determines
BBB. However, the Helmholtz theorem is an advanced result which is rather above the level of this
introductory course. Instead, we shall use the notion of magnetic vector potential in Remark 10.2.10
and a direct argument to see how to “solve” (10.2.50) for the vector field BBB. In the course of this
we shall introduce the very clever idea of a gauge transformation. In view of Remark 10.2.10 we
can write $\mathbf{B}$ in the form
(10.2.51)  $\mathbf{B} = \nabla \times \mathbf{A}$ for a magnetic vector potential $\mathbf{A} : \mathbb{R}^3 \to \mathbb{R}^3$,
and we know from Remark 10.2.10 that there are actually infinitely many such vector potentials $\mathbf{A}$. It
turns out that we can use the fact that $\mathbf{B}$ has many magnetic vector potentials to find a particularly
nice vector potential $\tilde{\mathbf{A}}$ of $\mathbf{B}$ with the further property that it is solenoidal, i.e. $\nabla \cdot \tilde{\mathbf{A}} = 0$, that is
we can actually show the following:
(10.2.52)  $\mathbf{B} = \nabla \times \tilde{\mathbf{A}}$ for some vector potential $\tilde{\mathbf{A}} : \mathbb{R}^3 \to \mathbb{R}^3$ such that $\nabla \cdot \tilde{\mathbf{A}} = 0.$
For the moment we by-pass the question of how to establish (10.2.52) and instead concentrate on
how to “solve” the relations (10.2.50) for BBB in terms of JJJ assuming that (10.2.52) holds.
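We have
$\nabla \times \mathbf{B} = \nabla \times (\nabla \times \tilde{\mathbf{A}})$  (from (10.2.52))
$\quad = \nabla(\nabla \cdot \tilde{\mathbf{A}}) - \nabla^2 \tilde{\mathbf{A}}$  (from the identity (9.1.62)),
that is
(10.2.53)  $\nabla \times \mathbf{B} = \nabla(\nabla \cdot \tilde{\mathbf{A}}) - \nabla^2 \tilde{\mathbf{A}}.$
From (10.2.52) we also have $\nabla \cdot \tilde{\mathbf{A}} = 0$, and therefore of course
(10.2.54)  $\nabla(\nabla \cdot \tilde{\mathbf{A}}) = 0.$
In view of (10.2.54) and (10.2.53) we get
(10.2.55)  $\nabla \times \mathbf{B} = -\nabla^2 \tilde{\mathbf{A}},$
and combining (10.2.55) with Ampere’s law (the second relation of (10.2.50)) we find
(10.2.56)  $-\nabla^2 \tilde{\mathbf{A}} = \mu_0 \mathbf{J}.$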
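The step in which the curl of a curl collapses to minus the Laplacian, valid for a solenoidal vector field, can be checked numerically. The Python sketch below uses the hypothetical field V = sin y i + sin z j + sin x k, chosen only because its divergence is zero; it compares a finite-difference estimate of ∇ × (∇ × V) with a finite-difference estimate of −∇²V, the Laplacian being taken component by component. The helper names are our own.

```python
import math


def V(x, y, z):
    """Hypothetical solenoidal vector field (its divergence is zero)."""
    return (math.sin(y), math.sin(z), math.sin(x))


def curl(F, h=1e-3):
    """Return a function that evaluates the central-difference curl of F."""
    def curl_F(x, y, z):
        def d(i, axis):
            a, b = [x, y, z], [x, y, z]
            a[axis] += h
            b[axis] -= h
            return (F(*a)[i] - F(*b)[i]) / (2.0 * h)
        return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))
    return curl_F


def vector_laplacian(F, x, y, z, h=1e-3):
    """Componentwise Laplacian of F using second central differences."""
    centre = F(x, y, z)
    total = [-6.0 * c for c in centre]
    for axis in range(3):
        for step in (h, -h):
            p = [x, y, z]
            p[axis] += step
            value = F(*p)
            for i in range(3):
                total[i] += value[i]
    return tuple(t / (h * h) for t in total)


curl_curl_V = curl(curl(V))(0.3, 1.1, -0.8)
minus_lap_V = tuple(-c for c in vector_laplacian(V, 0.3, 1.1, -0.8))
print(curl_curl_V)
print(minus_lap_V)  # should agree with the line above to several digits
```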
Now (10.2.56) is a very nice relation indeed. In fact, recalling the componentwise expansion of the
vector fields $\mathbf{J}$ and $\tilde{\mathbf{A}}$, namely
(10.2.57)  $\mathbf{J} = J_1 \mathbf{i} + J_2 \mathbf{j} + J_3 \mathbf{k}, \qquad \tilde{\mathbf{A}} = \tilde{A}_1 \mathbf{i} + \tilde{A}_2 \mathbf{j} + \tilde{A}_3 \mathbf{k},$
and recalling the componentwise expansion of $\nabla^2 \tilde{\mathbf{A}}$ from Definition 9.1.19, that is
(10.2.58)  $(\nabla^2 \tilde{\mathbf{A}}) = (\nabla^2 \tilde{A}_1)\mathbf{i} + (\nabla^2 \tilde{A}_2)\mathbf{j} + (\nabla^2 \tilde{A}_3)\mathbf{k},$
(as follows from (9.1.31) with $\tilde{\mathbf{A}}$ in place of $\mathbf{F}$), we can equate the scalar components in the vector
relation (10.2.56) to get the scalar relations
(10.2.59)  $\nabla^2 \tilde{A}_i = -\mu_0 J_i,$
for i = 1, 2, 3. Now each (10.2.59) is a scalar Poisson equation of the form (10.1.28), for which we
already know the solution (see (10.1.30)). Indeed, just matching (10.2.59) with (10.1.28) we see
from (10.1.30) that each $\tilde{A}_i$ is given by
(10.2.60)  $\tilde{A}_i(x, y, z) = \frac{\mu_0}{4\pi} \int_{\mathbb{R}^3} \frac{J_i(u, v, w)}{[(x-u)^2 + (y-v)^2 + (z-w)^2]^{1/2}}\, du\, dv\, dw,$
187
for all (x, y, z) in R3, which gives each Ai in terms of the (known) current density components Ji.
We have therefore obtained the vector field AAA, and using this we can now immediately determine
BBB by calculating the curl of AAA as at (10.2.52). In this way we have determined the magnetic field
BBB which satisfies the relations (10.2.50).
It remains to establish that (10.2.52) actually holds. To this end we fix any arbitrary magnetic
vector potential $\mathbf{A}$ of the magnetic field $\mathbf{B}$ (recall (10.2.51)) so that
(10.2.61) $\mathbf{B} = \nabla \times \mathbf{A}$.
With this arbitrary choice of magnetic vector potential $\mathbf{A}$ let f be any scalar field which satisfies
the relation
(10.2.62) $\nabla^2 f = -\nabla \cdot \mathbf{A}$,
in which, of course, $\mathbf{A}$ is the vector field that we have just fixed (the relation (10.2.62) is of course
just a Poisson equation of the form (10.1.28), although we will not actually need to solve this
equation for f, as we will soon see). Using the arbitrarily chosen magnetic vector potential $\mathbf{A}$,
together with the f satisfying (10.2.62), define
(10.2.63) $\tilde{\mathbf{A}} := \mathbf{A} + \nabla f$.
From Remark 10.2.10 and (10.2.61) we know that $\tilde{\mathbf{A}}$ is also a magnetic vector potential of $\mathbf{B}$, that
is
(10.2.64) $\mathbf{B} = \nabla \times \tilde{\mathbf{A}}$.
The transformation of our arbitrarily chosen $\mathbf{A}$ into $\tilde{\mathbf{A}}$ at (10.2.63) is called a gauge transformation
of $\mathbf{A}$. We are now going to see that $\tilde{\mathbf{A}}$ is actually the “nice” magnetic vector potential that we want
in that $\nabla \cdot \tilde{\mathbf{A}} = 0$. In fact
$\nabla \cdot \tilde{\mathbf{A}} = \nabla \cdot (\mathbf{A} + \nabla f)$ (from (10.2.63))
$= \nabla \cdot \mathbf{A} + \nabla \cdot (\nabla f)$ (from (9.1.59))
$= \nabla \cdot \mathbf{A} + \nabla^2 f$ (from Theorem 9.1.21)
$= 0$ (from (10.2.62)),
that is
(10.2.65) $\nabla \cdot \tilde{\mathbf{A}} = 0$.
Now (10.2.52) follows from (10.2.65) and (10.2.64). Notice that we do not have to determine the
function f which satisfies (10.2.62), even though $\tilde{\mathbf{A}}$ (which we need in order to obtain $\mathbf{B}$ from
(10.2.64)) is actually defined in terms of this f by (10.2.63). The reason of course is that $\tilde{\mathbf{A}}$ has
been defined in such a way that it also satisfies the Poisson equations (10.2.59) and we can determine
$\tilde{\mathbf{A}}$ from these equations without having to find f. In fact, the function f occurring at (10.2.62) is just
a very clever device which ensures that $\tilde{\mathbf{A}}$ defined by (10.2.63) actually satisfies the all-important
Poisson equations (10.2.59) for which we know the solution. The definition of $\tilde{\mathbf{A}}$ at (10.2.63) is
called a Coulomb gauge transformation.
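The gauge transformation at (10.2.63) can be spot-checked numerically on a toy example. The sketch below is only an illustration: the vector potential, the scalar field f (picked by hand so that (10.2.62) holds), the sample point and the finite-difference step are all assumptions made here and are not taken from the notes. It checks that adding ∇f leaves the curl unchanged while removing the divergence.

```python
# Finite-difference check of the gauge transformation (10.2.63) on a toy example.
# The fields A and f, the sample point and the step H are illustrative assumptions.

H = 1e-4  # step for central differences


def A(x, y, z):
    """An arbitrary vector potential; here div A = 2x and curl A = (0, 0, 2)."""
    return (x * x - y, x, 0.0)


def f(x, y, z):
    """Scalar field chosen by hand so that laplacian(f) = -div A = -2x."""
    return -x ** 3 / 3.0


def grad(g, p):
    """Central-difference gradient of a scalar field g at the point p."""
    out = []
    for i in range(3):
        hi = list(p)
        lo = list(p)
        hi[i] += H
        lo[i] -= H
        out.append((g(*hi) - g(*lo)) / (2 * H))
    return tuple(out)


def A_tilde(x, y, z):
    """Gauge-transformed potential A + grad f, as in (10.2.63)."""
    a = A(x, y, z)
    g = grad(f, (x, y, z))
    return tuple(ai + gi for ai, gi in zip(a, g))


def partial(F, comp, var, p):
    """Central difference of component `comp` of F with respect to coordinate `var`."""
    hi = list(p)
    lo = list(p)
    hi[var] += H
    lo[var] -= H
    return (F(*hi)[comp] - F(*lo)[comp]) / (2 * H)


def div(F, p):
    return sum(partial(F, i, i, p) for i in range(3))


def curl(F, p):
    return (
        partial(F, 2, 1, p) - partial(F, 1, 2, p),
        partial(F, 0, 2, p) - partial(F, 2, 0, p),
        partial(F, 1, 0, p) - partial(F, 0, 1, p),
    )


point = (0.7, -0.3, 1.2)
curl_A = curl(A, point)
curl_A_tilde = curl(A_tilde, point)
div_A = div(A, point)
div_A_tilde = div(A_tilde, point)

print("curl A       =", curl_A)
print("curl A_tilde =", curl_A_tilde)
print("div A        =", div_A)
print("div A_tilde  =", div_A_tilde)
```

In this toy case the two curls agree up to rounding, while the divergence drops from 2x to (numerically) zero, which is the content of (10.2.64) and (10.2.65).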
10.3 Time Varying Fields
Up until now we have concentrated on the laws of electricity and magnetism which pertain to time
constant or static electric and magnetic fields. These are Gauss’ Law 10.1.4 for static electric fields
(also expressed in local form at (10.1.21)), and the Ampere and Gauss Law 10.2.2 for static magnetic
fields (also expressed in local form at (10.2.39) and (10.2.35) respectively). In this section we are
going to focus on the basic experimentally determined laws of electricity and magnetism for time
varying electric and magnetic fields. Recall from Remark 3.2.5 that a time varying vector field FFF
is denoted more completely by FFF (t, x, y, z), which indicates that FFF generally varies with respect to
time t at each fixed (x, y, z) in R3, and generally varies with respect to space points (x, y, z) at each
fixed instant t. Similarly, a time varying scalar field f is denoted more completely by f(t, x, y, z)
with the same interpretation. We begin with the time varying version of Gauss’ law for electric
fields:
Law 10.3.1 (Gauss’ law for time varying electric fields). A time varying charge density ρ causes a
time varying electric field EEE, and the fields ρ and EEE are related by
(10.3.66) $(\nabla \cdot \mathbf{E})(t, x, y, z) = \frac{1}{\varepsilon_0} \rho(t, x, y, z),$
for each instant t and each point (x, y, z) in R3.
Remark 10.3.2. Recall that the divergence of the time varying electric field on the left of (10.3.66)
is to be interpreted in accordance with Remark 9.1.23 (see in particular (9.1.44)). As usual we will
write (10.3.66) with the variables (t, x, y, z) suppressed, that is
(10.3.67) $(\nabla \cdot \mathbf{E}) = \frac{1}{\varepsilon_0} \rho,$
so that (t, x, y, z) must be mentally substituted into (10.3.67). We have chosen to state Gauss’
law for time varying fields in local form, because this will be particularly useful when we come to
Maxwell’s equations, but we could just as easily have stated this law in global form in terms of
surface integrals. We emphasize that Law 10.3.1 is a consequence of experimental observation, in the
same way that Law 10.1.4 for static electric fields is also a consequence of experimental observation.
In exactly the same way we can state Gauss’ law for time varying magnetic fields. Again, we
choose to state the time varying law in local form since this is the form most useful for Maxwell’s
equations:
Law 10.3.3 (Gauss’ law for time varying magnetic fields). A time varying current density JJJ causes
a time varying magnetic field BBB which satisfies
(10.3.68) (∇ ·BBB)(t, x, y, z) = 0,
for each instant t and each point (x, y, z) in R3.
Remark 10.3.4. Needless to say we will write the relation (10.3.68) with the variables (t, x, y, z)
suppressed, that is
(10.3.69) ∇ ·BBB = 0.
Law 10.3.3 is, like Law 10.3.1, a consequence of experimental observation.
We next come to Faraday’s law of electromagnetic induction which we have already seen at
Remark 8.5.8. For completeness we state this law again:
Law 10.3.5 (Faraday’s law of electromagnetic induction). A time varying magnetic field BBB causes
a time varying electric field EEE. Moreover, for each finite open surface S with boundary curve Γ, the
fields BBB and EEE are related by
(10.3.70) $\int_\Gamma \mathbf{E} \cdot d\mathbf{r} = -\int_S \frac{\partial \mathbf{B}}{\partial t} \cdot d\mathbf{A}.$
We are now going to write Faraday’s law in local form, in much the same way that we extracted
the local form (10.2.39) of Ampere’s law from the global form (10.2.33). To this end we use Stokes’
theorem in the form of (9.2.67), with EEE in place of the generic vector field FFF , to get
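(10.3.71) $\int_\Gamma \mathbf{E} \cdot d\mathbf{r} = \int_S (\nabla \times \mathbf{E}) \cdot d\mathbf{A},$
so that (10.3.71) and (10.3.70) give
$\int_S (\nabla \times \mathbf{E}) \cdot d\mathbf{A} = -\int_S \frac{\partial \mathbf{B}}{\partial t} \cdot d\mathbf{A},$
that is
(10.3.72) $\int_S \left[ (\nabla \times \mathbf{E}) + \frac{\partial \mathbf{B}}{\partial t} \right] \cdot d\mathbf{A} = 0.$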
Notice that (10.3.72) holds for each and every finite open surface S. We are now going to apply
Theorem 10.2.5. To this end write (10.3.72) with all the variables displayed, that is
(10.3.73) $\int_S \left[ (\nabla \times \mathbf{E})(t, x, y, z) + \frac{\partial \mathbf{B}(t, x, y, z)}{\partial t} \right] \cdot d\mathbf{A} = 0.$
Recall that the curl (∇ × EEE)(t, x, y, z) in the integrand of (10.3.73) is to be interpreted in
accordance with Remark 9.1.23 (see in particular (9.1.45)). The relation (10.3.73) holds for each
instant t and each finite open surface S (notice that the space variables (x, y, z) have been integrated
out). Now fix some arbitrary value of t, and for this t define the vector field
(10.3.74) $\mathbf{G}(x, y, z) := (\nabla \times \mathbf{E})(t, x, y, z) + \frac{\partial \mathbf{B}(t, x, y, z)}{\partial t}$, for all (x, y, z) in $\mathbb{R}^3$.
It follows from (10.3.74) and (10.3.73) that, at this fixed instant t, we have
(10.3.75) $\int_S \mathbf{G} \cdot d\mathbf{A} = 0,$
for each and every finite open surface S, so that Theorem 10.2.5 gives
(10.3.76) $\mathbf{G}(x, y, z) = 0$ for all (x, y, z) in $\mathbb{R}^3$.
In view of (10.3.76), (10.3.74) and the arbitrary choice of instant t, it follows that
(10.3.77) $(\nabla \times \mathbf{E})(t, x, y, z) + \frac{\partial \mathbf{B}(t, x, y, z)}{\partial t} = 0$
for each instant t and each (x, y, z) in $\mathbb{R}^3$. From now on we shall, of course, write (10.3.77) with
the variables (t, x, y, z) suppressed, that is
(10.3.78) $(\nabla \times \mathbf{E}) + \frac{\partial \mathbf{B}}{\partial t} = 0.$
The relation (10.3.78) is Faraday’s law of electromagnetic induction in local form.
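To see the global form (10.3.70) and the local form (10.3.78) agree on a concrete case, the following sketch compares the line integral and the surface integral numerically. Everything specific in it is an assumption chosen for illustration: a spatially uniform field B = (0, 0, B0 t), the electric field E = (B0/2)(y, −x, 0) (whose curl is (0, 0, −B0), so (10.3.78) holds), the unit disc in the x-y plane as the surface S, and simple midpoint quadrature.

```python
import math

B0 = 2.0  # assumed constant rate of change of the uniform field B = (0, 0, B0 * t)


def E(x, y, z):
    """Assumed electric field E = (B0/2) * (y, -x, 0); its curl is (0, 0, -B0)."""
    return (0.5 * B0 * y, -0.5 * B0 * x, 0.0)


def dB_dt(x, y, z):
    """Time derivative of the assumed field B = (0, 0, B0 * t)."""
    return (0.0, 0.0, B0)


def line_integral_unit_circle(F, n=2000):
    """Midpoint rule for the integral of F . dr, counterclockwise round the unit circle."""
    dtheta = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        th = (k + 0.5) * dtheta
        x, y = math.cos(th), math.sin(th)
        fx, fy, _ = F(x, y, 0.0)
        # dr = (-sin(th), cos(th), 0) dtheta on the unit circle
        total += (-fx * math.sin(th) + fy * math.cos(th)) * dtheta
    return total


def flux_unit_disc(F, nr=200, nth=200):
    """Midpoint rule for the flux of F through the unit disc with upward normal k."""
    dr = 1.0 / nr
    dth = 2.0 * math.pi / nth
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) * dr
        for j in range(nth):
            th = (j + 0.5) * dth
            fz = F(r * math.cos(th), r * math.sin(th), 0.0)[2]
            total += fz * r * dr * dth  # dA = r dr dtheta
    return total


lhs = line_integral_unit_circle(E)   # left side of (10.3.70)
rhs = -flux_unit_disc(dB_dt)         # right side of (10.3.70)
print("line integral of E    =", lhs)
print("minus flux of dB/dt   =", rhs)
```

The counterclockwise boundary orientation is paired with the upward normal, which is the orientation convention under which Stokes' theorem applies; both sides come out equal to −π B0 for this choice of fields.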
Remark 10.3.6. We have formulated the basic laws of electricity and magnetism in global form, in
terms of line integrals and surface integrals, and then used the tools of vector calculus (in particular
Stokes’ theorem and the divergence theorem) to reformulate these laws in local form. Why have
we reduced the laws to local form? It is important to understand that experimental observation
always gives us the laws of electricity and magnetism in global form. However, it is invariably the
case that the local form of these laws is the most useful in applications. This holds whether one
has the really ambitious goal of using the laws of electricity and magnetism to advance the state of
fundamental physics (such as the quantum theory of electrodynamics), or on the other hand one
just wants to apply the laws to some engineering problem (such as the design of antennas for cell
phones); regardless of the particular application it is typically the local form of the laws which is
the most useful. In fact, from now on we shall concentrate almost exclusively on the local form of
the laws of electricity and magnetism.
Remark 10.3.7. In the preceding account of the laws of electricity and magnetism for time varying
fields we have so far not discussed a possible time varying version of Ampere’s circuital law. Given
how nicely and easily the Gauss laws for electric and magnetic fields extend to the time varying case
(see (10.3.66) and (10.3.68)) it is reasonable to expect, on the basis of the local version of Ampere’s
law for static fields (see (10.2.39)), that the time varying version of this law might look like
(10.3.79) (∇×BBB)(t, x, y, z) = µ0JJJ(t, x, y, z),
for all instants t and points (x, y, z) in R3 (for completeness we have displayed all variables at
(10.3.79)). In fact, experimental evidence in support of the “time varying law” (10.3.79) proved
very difficult to come by. It turns out that there is indeed an extension of Ampere’s law to time
varying fields but this extension is not given by (10.3.79)! It was Maxwell who discovered that
(10.3.79) is false and then used the tools of vector calculus to get a “corrected version” of Ampere’s
law for time varying fields which turns out to be in agreement with experimental evidence, as well
as consistent with Ampere’s law in the special case of time constant fields. This constitutes one
of the very greatest discoveries in all of physics. In the next chapter we shall follow Maxwell, and
use the tools of vector calculus to see that (10.3.79) is incorrect and get the correct extension of
Ampere’s law to time varying fields.
Chapter 11
Maxwell’s Equations
In this chapter we first address the problem of extending Ampere’s law to time varying fields, as
discussed in Remark 10.3.7. We then state Maxwell’s equations in full and develop some of the
simplest consequences of these equations which have completely revolutionized physics.
11.1 The Ampere-Maxwell Law for Time Varying Fields
In Remark 10.3.7 we noted that the relation (10.3.79), that is
(11.1.1) ∇×BBB = µ0JJJ
(with the variables (t, x, y, z) suppressed), looks like a plausible extension of Ampere’s law to time
varying fields. Following Maxwell, we are going to use the tools of vector calculus to see that (11.1.1)
cannot possibly be true for genuinely time varying fields. To this end we require the continuity
equation (9.4.202) already established in Section 9.4, that is
(11.1.2) $(\nabla \cdot \mathbf{J}) + \frac{\partial \rho}{\partial t} = 0.$
Now assume that (11.1.1) holds for time varying fields. Take the divergence of each side of (11.1.1)
to get
(11.1.3) ∇ · (∇×BBB) = µ0(∇ · JJJ).
From Theorem 9.1.14 (with BBB in place of GGG) we have
(11.1.4) ∇ · (∇×BBB) = 0,
so that (11.1.4) and (11.1.3) give
(11.1.5) ∇ · JJJ = 0.
Upon combining (11.1.5) and (11.1.2) we find
(11.1.6) $\frac{\partial \rho(t, x, y, z)}{\partial t} = 0,$
(displaying all variables). It follows from (11.1.6) that the charge density scalar field ρ is time
constant, that is if (11.1.1) holds then ρ must be time constant. However, in experiments one
can easily create time varying charge density fields; indeed, such time varying fields occur all over
physics and engineering. It follows that the supposition that (11.1.1) holds for time varying fields
is contradicted by physical evidence, and therefore (11.1.1) cannot possibly be true for time varying
fields. This being the case, can we somehow get a “corrected” or “extended” version of (11.1.1)
which is true for time varying fields? Maxwell’s idea was to “guess” that a corrected version of
(11.1.1) would look like
(11.1.7) ∇×BBB = µ0JJJ +GGG
for some time varying vector field GGG. That is, one tries to correct (11.1.1) by adding a correction
term GGG to the right side. It remains to determine exactly what GGG actually is. From (11.1.7) we
have
(11.1.8) GGG = ∇×BBB − µ0JJJ.
Now take the divergence of each side of (11.1.8) to see that
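$\nabla \cdot \mathbf{G} = \nabla \cdot [\nabla \times \mathbf{B} - \mu_0 \mathbf{J}]$
$= \nabla \cdot (\nabla \times \mathbf{B}) - \mu_0 (\nabla \cdot \mathbf{J})$
$= -\mu_0 (\nabla \cdot \mathbf{J})$ (since Theorem 9.1.14 gives $\nabla \cdot (\nabla \times \mathbf{B}) = 0$),
that is
(11.1.9) $(\nabla \cdot \mathbf{J}) = -\frac{1}{\mu_0} (\nabla \cdot \mathbf{G}).$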
From Gauss’ law for electric fields at (10.3.67) we get
(11.1.10) ρ = ε0(∇ ·EEE).
Taking partial t-derivatives on each side of (11.1.10) gives
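(11.1.11) $\frac{\partial \rho}{\partial t} = \varepsilon_0 \frac{\partial (\nabla \cdot \mathbf{E})}{\partial t} = \varepsilon_0 \nabla \cdot \left[ \frac{\partial \mathbf{E}}{\partial t} \right],$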
(where the interchange of partial t-derivative and divergence at the second equality of (11.1.11) is
justified by (9.1.50) with EEE in place of FFF ). Now substitute (11.1.11) and (11.1.9) in the continuity
equation (11.1.2) to obtain
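$\varepsilon_0 \nabla \cdot \left[ \frac{\partial \mathbf{E}}{\partial t} \right] - \frac{1}{\mu_0} (\nabla \cdot \mathbf{G}) = 0,$
that is (multiplying through by $\mu_0$)
$\varepsilon_0 \mu_0 \nabla \cdot \left[ \frac{\partial \mathbf{E}}{\partial t} \right] - (\nabla \cdot \mathbf{G}) = 0,$
that is
(11.1.12) $\nabla \cdot \left[ \varepsilon_0 \mu_0 \frac{\partial \mathbf{E}}{\partial t} - \mathbf{G} \right] = 0.$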
The simplest relation between the vector fields EEE and GGG which is consistent with (11.1.12) is of
course to make the argument in square brackets zero, that is
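(11.1.13) $\mathbf{G} = \varepsilon_0 \mu_0 \frac{\partial \mathbf{E}}{\partial t}.$
Finally, substitute (11.1.13) for $\mathbf{G}$ in (11.1.7) to get
(11.1.14) $\nabla \times \mathbf{B} = \mu_0 \mathbf{J} + \varepsilon_0 \mu_0 \frac{\partial \mathbf{E}}{\partial t}.$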
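The argument leading to (11.1.14) can be illustrated on a hand-built example in which the charge density genuinely varies with time. The sketch below is not from the notes: the fields E, B, J and ρ, the normalized units ε0 = µ0 = 1, the sample point and the finite-difference step are all assumptions made for illustration. The fields are chosen so that Gauss' law, Faraday's law and the continuity equation (11.1.2) hold; the code then shows that the uncorrected relation (11.1.1) leaves a non-zero residual while (11.1.14) is satisfied.

```python
import math

# Normalized units (an assumption for this sketch), not SI values.
EPS0 = 1.0
MU0 = 1.0
H = 1e-5  # step for central differences


def E(t, x, y, z):
    """Assumed field E = sin(t) * grad(x^2 + y*z); it is curl free."""
    s = math.sin(t)
    return (2.0 * x * s, z * s, y * s)


def B(t, x, y, z):
    """Assumed magnetic field: identically zero."""
    return (0.0, 0.0, 0.0)


def J(t, x, y, z):
    """Assumed current density J = -eps0 * dE/dt."""
    c = math.cos(t)
    return (-2.0 * EPS0 * x * c, -EPS0 * z * c, -EPS0 * y * c)


def rho(t, x, y, z):
    """Charge density from Gauss' law: rho = eps0 * div E = 2 * eps0 * sin(t)."""
    return 2.0 * EPS0 * math.sin(t)


def d(fn, var, p):
    """Central difference of scalar fn w.r.t. argument var (0 = t, 1 = x, 2 = y, 3 = z)."""
    hi = list(p)
    lo = list(p)
    hi[var] += H
    lo[var] -= H
    return (fn(*hi) - fn(*lo)) / (2 * H)


def comp(F, i):
    """Scalar function returning component i of the vector field F."""
    return lambda *q: F(*q)[i]


def div(F, p):
    return sum(d(comp(F, i), i + 1, p) for i in range(3))


def curl(F, p):
    fx, fy, fz = comp(F, 0), comp(F, 1), comp(F, 2)
    return (
        d(fz, 2, p) - d(fy, 3, p),
        d(fx, 3, p) - d(fz, 1, p),
        d(fy, 1, p) - d(fx, 2, p),
    )


p = (0.4, 1.0, 2.0, -0.5)  # sample (t, x, y, z)
curl_B = curl(B, p)
dE_dt = tuple(d(comp(E, i), 0, p) for i in range(3))
j = J(*p)

continuity_residual = div(J, p) + d(rho, 0, p)                      # (11.1.2)
ampere_residual = tuple(cb - MU0 * ji for cb, ji in zip(curl_B, j))  # (11.1.1)
maxwell_residual = tuple(
    cb - MU0 * ji - EPS0 * MU0 * de for cb, ji, de in zip(curl_B, j, dE_dt)
)                                                                    # (11.1.14)

print("continuity residual    :", continuity_residual)
print("residual of (11.1.1)   :", ampere_residual)
print("residual of (11.1.14)  :", maxwell_residual)
```

Here the residual of (11.1.1) is clearly non-zero because ρ changes in time, whereas the extra term ε0µ0 ∂E/∂t brings the residual of (11.1.14) down to rounding level.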
Remark 11.1.1. The relation (11.1.14) constitutes Maxwell’s extension of Ampere’s law to time
varying fields, and is usually called the Ampere-Maxwell law. Notice that, when the fields are
time constant, the partial t-derivative on the right of (11.1.14) is of course identically zero, so
that (11.1.14) is completely consistent with Ampere’s law (11.1.1) for time constant fields, which is
known to be true from experimental evidence based on the Biot-Savart law (see the discussion for
Section 10.2). However, the fundamental question remains: is the Ampere-Maxwell law (11.1.14)
actually true for time varying fields? The derivation of (11.1.14) given above certainly looks sound
enough, but we must remember that it began with a guess, namely that the corrected version of
Ampere’s law has to look like (11.1.7) for some vector field GGG. What if this guess is wrong, and the
actual correction of Ampere’s law involves some more complicated modification than just “adding
in” a term GGG? In this case the law (11.1.14), that was established based on this guess, would
certainly not be true! The answer to the question is therefore not so obvious. As is always the
case, the final determination rests on experimental evidence, and experimental evidence confirming
the correctness or otherwise of the Ampere-Maxwell law turned out to be frustratingly difficult to
obtain. However, during the period 1887 - 1891, in a brilliantly innovative series of experiments, the
physicist Heinrich Hertz showed convincingly that the Ampere-Maxwell law is in fact perfectly true.
In fact, numerous modern technological devices, such as cell-phones, radio, television, radar, microwave
ovens, the internet, etc., rely crucially on the correctness of the Ampere-Maxwell
law; the fact that these devices work in the way they do really constitutes daily “experimental
verification” of this law.
Remark 11.1.2. A further point to notice about (11.1.14) is the following: Suppose that the
current density JJJ is identically zero, but there is still a time varying electric field EEE. According to
the Ampere-Maxwell law the time varying electric field EEE causes a magnetic field BBB, and the two
fields are related by (11.1.14) with JJJ = 0, that is
(11.1.15) $\nabla \times \mathbf{B} - \varepsilon_0 \mu_0 \frac{\partial \mathbf{E}}{\partial t} = 0.$
This is very symmetric with respect to Faraday’s Law 10.3.5, for this says that a time varying
magnetic field causes an electric field, and the two fields are related by (10.3.78), which indeed
looks very similar to (11.1.15) when EEE and BBB are interchanged (the quantity ε0µ0 is less important
than it looks, for it is really just a consequence of our choice of standard MKS-units; in fact, by going
over to a different system of units - called “Gaussian units” - we can actually make the coefficients
of the t-partial derivative terms in (11.1.15) and (10.3.78) equal in magnitude). To this extent
Maxwell’s correction of Ampere’s law supplies a new physical law which is a symmetric counterpart
of Faraday’s law. It was in fact the search for this “missing symmetry” which motivated Maxwell to
propose the correction to Ampere’s law in the form of (11.1.7). Notice that the symmetry between
(11.1.15) and (10.3.78) is not quite complete; it is more like “skew-symmetry” because of the minus
sign preceding the partial t-derivative in (11.1.15) compared with the plus sign preceding the partial
t-derivative in (10.3.78).
11.2 Maxwell’s Equations
We are now finally ready to state Maxwell’s equations in all their glory. These equations comprise
Gauss’s laws for electric and magnetic fields, Faraday’s law of electromagnetic induction and the
Ampere-Maxwell law. We collect these together in the following massively important statement:
Law 11.2.1 (Maxwell). A given time varying charge density ρ, together with a given time varying
current density JJJ , causes a time varying electric field EEE and a time varying magnetic field BBB.
Moreover, these fields are related to each other, as well as to the given “sources” ρ and JJJ , by the
equations:
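(11.2.16)
(a) $(\nabla \cdot \mathbf{E}) = \frac{1}{\varepsilon_0} \rho$ (Gauss for electric fields)
(b) $(\nabla \cdot \mathbf{B}) = 0$ (Gauss for magnetic fields)
(c) $(\nabla \times \mathbf{E}) + \frac{\partial \mathbf{B}}{\partial t} = 0$ (Faraday)
(d) $(\nabla \times \mathbf{B}) = \mu_0 \mathbf{J} + \varepsilon_0 \mu_0 \frac{\partial \mathbf{E}}{\partial t}$ (Ampere-Maxwell)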
(see (10.3.66), (10.3.68), (10.3.78) and (11.1.14)).
Remark 11.2.2. Notice that we have stated the preceding laws in local form. We could just
as easily have stated the laws in global form, in terms of surface and line integrals, but these
are seldom needed and the local versions are much more useful. The equations (11.2.16)(a) - (d)
are Maxwell’s equations. One may reasonably question why these are called Maxwell’s equations,
since (11.2.16)(a) - (c) and the static version of (11.2.16)(d) were known well before the time of
Maxwell. The term Maxwell’s equations is nevertheless completely appropriate, for it was Maxwell
who first truly understood the enormous power of these equations when used together. Indeed,
Maxwell used the equations to discover the existence of self-sustaining electromagnetic waves or
electromagnetic radiation. This discovery completely changed the whole science of physics, and
is the reason why Maxwell, together with Galileo, Newton and Einstein, is counted among the
four greatest physicists of all time. One especially important consequence of the discovery of
electromagnetic radiation is that it was the prime motivator leading Einstein to the later formulation
of the theory of relativity. On a more pragmatic level, electromagnetic radiation is of course the
entire basis of our modern civilization. In all of this, the term ε0µ0∂EEE/(∂t) added by Maxwell to
Ampere’s law to get (11.2.16)(d), is of central importance. In fact, it is the presence of this very term
which is really at the root of the discovery of electromagnetic radiation, as we shall see in the next
few sections.
11.3 Electromagnetic Waves without Sources
In the present section we look at Maxwell’s equations in the source free case, in which the charge
density ρ and current density JJJ are identically zero, that is
(11.3.17) ρ = 0, JJJ = 0.
We are going to use the tools of vector calculus to see that the phenomenon of electromagnetic
waves is predicted by Maxwell’s equations. In view of (11.3.17), the Maxwell equations (11.2.16)
reduce to
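(11.3.18)
(a) $(\nabla \cdot \mathbf{E}) = 0,$
(b) $(\nabla \cdot \mathbf{B}) = 0,$
(c) $(\nabla \times \mathbf{E}) + \frac{\partial \mathbf{B}}{\partial t} = 0,$
(d) $(\nabla \times \mathbf{B}) = \varepsilon_0 \mu_0 \frac{\partial \mathbf{E}}{\partial t}.$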
Remark 11.3.1. One may reasonably ask how it is possible to even get non-zero electric and
magnetic fields in the source free case. Indeed, Maxwell’s equations at (11.3.18) are obviously
trivially satisfied by the zero fields EEE(t, x, y, z) = 0 and BBB(t, x, y, z) = 0 for all (t, x, y, z). We get
non-zero electric and magnetic fields satisfying (11.3.18) as a consequence of electromagnetic energy
being radiated into free space by some transmitting agent. A radiator of electromagnetic energy
comprises an antenna which absorbs energy from a voltage source, as shown in Figure 11.1. This
energy sets up a time varying charge density field ρ and a time varying current density field JJJ ,
both of which are localized to the antenna, that is the charge and current density fields are zero
everywhere outside the antenna. By a very sophisticated application of Maxwell’s equations in the
general form (11.2.16), which allows for the charge and current density fields ρ and JJJ localized to
the antenna, one can show that the energy absorbed from the voltage source by the antenna is
carried into space, away from the antenna, by a non-zero electric field EEE and a non-zero magnetic
field BBB. Outside the antenna, where the charge and current density fields are zero (as we have
noted), these fields are governed by the source free equations (11.3.18). It is the behavior of these
fields outside the antenna that is our main concern in this section.
Take the curl of each side of (11.3.18)(c) to get
(11.3.19) $\nabla \times (\nabla \times \mathbf{E}) + \nabla \times \left[ \frac{\partial \mathbf{B}}{\partial t} \right] = 0.$
Figure 11.1: Radiation of electromagnetic energy from an antenna
From (9.1.53) (with BBB in place of FFF ),
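(11.3.20) $\frac{\partial (\nabla \times \mathbf{B})}{\partial t} = \nabla \times \left[ \frac{\partial \mathbf{B}}{\partial t} \right],$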
and from (9.1.62) (with EEE in place of FFF ),
(11.3.21) (∇× (∇×EEE)) = (∇(∇ ·EEE))− (∇2EEE) = −(∇2EEE),
(suppressing the variables (t, x, y, z)), in which the second equality at (11.3.21) follows from (11.3.18)(a).
Now put (11.3.21) and (11.3.20) into (11.3.19) to get
(11.3.22) $-(\nabla^2 \mathbf{E}) + \frac{\partial (\nabla \times \mathbf{B})}{\partial t} = 0.$
From (11.3.22) and (11.3.18)(d) we finally obtain
(11.3.23) $c^2 (\nabla^2 \mathbf{E}) = \frac{\partial^2 \mathbf{E}}{\partial t^2}$, in which $c := \frac{1}{\sqrt{\varepsilon_0 \mu_0}}$.
We note, from (11.3.23), (10.2.32) and (10.1.3), with some easy dimensional analysis, that
(11.3.24) $c \approx 2.998 \times 10^8$ meters/second,
which amazingly enough is the speed of light in a vacuum. The full significance will shortly become
clear! We next obtain an equation for BBB which is analogous to (11.3.23). To this end, take the curl
of each side of (11.3.18)(d), so that
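(11.3.25) $\nabla \times (\nabla \times \mathbf{B}) = \varepsilon_0 \mu_0 \nabla \times \left[ \frac{\partial \mathbf{E}}{\partial t} \right].$
From (9.1.62) (with $\mathbf{B}$ in place of $\mathbf{F}$), together with (11.3.18)(b), we find
(11.3.26) $(\nabla \times (\nabla \times \mathbf{B})) = -(\nabla^2 \mathbf{B}),$
(exactly as at (11.3.21)), and of course
(11.3.27) $\frac{\partial (\nabla \times \mathbf{E})}{\partial t} = \nabla \times \left[ \frac{\partial \mathbf{E}}{\partial t} \right],$
from (9.1.53) with $\mathbf{E}$ in place of $\mathbf{F}$. Now (11.3.27), (11.3.26) and (11.3.25) give
(11.3.28) $-(\nabla^2 \mathbf{B}) = \varepsilon_0 \mu_0 \frac{\partial (\nabla \times \mathbf{E})}{\partial t},$
and, from (11.3.28) along with (11.3.18)(c), we find
(11.3.29) $c^2 (\nabla^2 \mathbf{B}) = \frac{\partial^2 \mathbf{B}}{\partial t^2}$, in which $c := \frac{1}{\sqrt{\varepsilon_0 \mu_0}}$.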
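The numerical value of the constant c defined at (11.3.23) and (11.3.29) is easy to reproduce. The sketch below assumes commonly quoted SI values for ε0 and µ0; the values the notes themselves give at (10.1.3) and (10.2.32) are not reproduced here and may be rounded differently.

```python
import math

# Assumed SI values (not copied from the notes):
EPSILON_0 = 8.8541878128e-12  # permittivity of free space, farads per metre
MU_0 = 4.0e-7 * math.pi       # permeability of free space, henries per metre

# c := 1 / sqrt(eps0 * mu0), as in (11.3.23)
c = 1.0 / math.sqrt(EPSILON_0 * MU_0)
print(f"c = {c:.4e} metres/second")
```

With these inputs the result agrees with the accepted speed of light in a vacuum, 299 792 458 metres per second, to well within the precision of the inputs.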
Expanding the vector fields EEE and BBB in the usual componentwise form, that is
(11.3.30) EEE = E1iii+ E2jjj + E3kkk, BBB = B1iii+B2jjj +B3kkk,
and recalling Definition 9.1.19, we can write the vector relations (11.3.29) and (11.3.23) in componentwise
form as follows:
(11.3.31) $c^2 (\nabla^2 E_r) = \frac{\partial^2 E_r}{\partial t^2}, \quad c^2 (\nabla^2 B_r) = \frac{\partial^2 B_r}{\partial t^2}, \quad r = 1, 2, 3,$
in which the Er and Br are time varying scalar fields. The relations at (11.3.31) constitute six
partial differential equations called wave equations, each of which can be solved individually for the
relevant Er and Br. In the next remark we summarize the main properties of wave equations.
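As a sanity check that the source free equations (11.3.18) do admit non-zero solutions, the sketch below tests one candidate pair of fields by finite differences. The particular fields (a sinusoid travelling along the x-axis, with E along j and B along k), the normalized units ε0 = µ0 = 1, the wave number K, the sample point and the step H are all assumptions made for illustration and are not taken from the notes.

```python
import math

# Normalized units (an assumption for this sketch), so that c = 1.
EPS0 = 1.0
MU0 = 1.0
C = 1.0 / math.sqrt(EPS0 * MU0)
K = 2.0    # assumed wave number
H = 1e-5   # step for central differences


def E(t, x, y, z):
    """Assumed electric field: cos(K (x - C t)) along the y-axis."""
    return (0.0, math.cos(K * (x - C * t)), 0.0)


def B(t, x, y, z):
    """Assumed magnetic field: cos(K (x - C t)) / C along the z-axis."""
    return (0.0, 0.0, math.cos(K * (x - C * t)) / C)


def partial(F, comp, var, p):
    """Central difference of component comp of F w.r.t. argument var (0 = t, 1 = x, 2 = y, 3 = z)."""
    hi = list(p)
    lo = list(p)
    hi[var] += H
    lo[var] -= H
    return (F(*hi)[comp] - F(*lo)[comp]) / (2 * H)


def div(F, p):
    return sum(partial(F, i, i + 1, p) for i in range(3))


def curl(F, p):
    return (
        partial(F, 2, 2, p) - partial(F, 1, 3, p),  # dFz/dy - dFy/dz
        partial(F, 0, 3, p) - partial(F, 2, 1, p),  # dFx/dz - dFz/dx
        partial(F, 1, 1, p) - partial(F, 0, 2, p),  # dFy/dx - dFx/dy
    )


def d_dt(F, p):
    return tuple(partial(F, i, 0, p) for i in range(3))


p = (0.3, 0.8, -1.1, 2.5)  # sample (t, x, y, z)
res_a = div(E, p)                                                        # (11.3.18)(a)
res_b = div(B, p)                                                        # (11.3.18)(b)
res_c = tuple(a + b for a, b in zip(curl(E, p), d_dt(B, p)))             # (11.3.18)(c)
res_d = tuple(a - EPS0 * MU0 * b for a, b in zip(curl(B, p), d_dt(E, p)))  # (11.3.18)(d)

print("residual (a):", res_a)
print("residual (b):", res_b)
print("residual (c):", res_c)
print("residual (d):", res_d)
```

All four residuals are at rounding level for this choice, so the pair is a non-trivial solution of (11.3.18); each component is also a function of x − ct only.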
Remark 11.3.2. Each of the relations at (11.3.31) constitutes a special kind of partial differential
equation, called a wave equation, which is of the form
(11.3.32) $c^2 (\nabla^2 u) = \frac{\partial^2 u}{\partial t^2},$
in which c is a positive constant. The significance of the constant c will shortly become clear, but
let us note at this point that, regardless of the units attached to the quantity u (e.g. meters, volts,
amps, bars, joules, dimensionless etc.), for dimensional consistency in (11.3.32) to hold the units of
c must be meters per second. This suggests that the constant c has something to do with speed or
velocity. Our study of (11.3.32) will soon confirm that this is indeed the case. With the variables
(t, x, y, z) explicitly shown the equation (11.3.32) looks like
(11.3.33) $c^2 (\nabla^2 u(t, x, y, z)) = \frac{\partial^2 u(t, x, y, z)}{\partial t^2}.$
Expanding the Laplacian ∇2 (recall Definition 9.1.18) we can write the wave equation even more
explicitly as follows:
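(11.3.34) $c^2 \left[ \frac{\partial^2 u}{\partial x^2}(t, x, y, z) + \frac{\partial^2 u}{\partial y^2}(t, x, y, z) + \frac{\partial^2 u}{\partial z^2}(t, x, y, z) \right] = \frac{\partial^2 u(t, x, y, z)}{\partial t^2}.$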
Any function u(t, x, y, z) which satisfies the wave equation is best thought of as a time varying scalar
field in the sense of Remark 3.2.5. Solving the wave equation is a matter of finding a scalar field
u(t, x, y, z) such that the relation (11.3.32) (i.e. (11.3.33) and (11.3.34)) is satisfied, and any such
scalar field is called a solution of the wave equation. There is an extensive theory associated with
wave equations and their solutions, much of it at a very advanced level. Fortunately, in order to
study the electromagnetic phenomena arising from the wave equations at (11.3.31), we shall require
only the most basic aspects of wave equations, and we summarize these in the present remark.
Notice first that there is one very obvious solution of the wave equation, namely the identically zero
function
(11.3.35) u(t, x, y, z) = 0, for all t and for all (x, y, z),
since this function trivially satisfies (11.3.34). Fortunately, there are other, more interesting solutions
of the wave equation. To see this fix
(11.3.36) some unit vector $\mathbf{n} = n_1\mathbf{i} + n_2\mathbf{j} + n_3\mathbf{k}$, and some $C^2$-function $\psi : \mathbb{R} \to \mathbb{R}$.
With the $\mathbf{n}$ and $\psi$ fixed at (11.3.36) define the time varying scalar field
(11.3.37) $u(t, x, y, z) := \psi(\mathbf{n} \cdot (x\mathbf{i} + y\mathbf{j} + z\mathbf{k}) - ct) = \psi(n_1 x + n_2 y + n_3 z - ct),$
for all (t, x, y, z). It is easy to show that u is a solution of the wave equation (11.3.32). In fact, from
(11.3.37), we get
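(11.3.38)
$\frac{\partial^2 u(t, x, y, z)}{\partial t^2} = c^2 \psi^{(2)}(n_1 x + n_2 y + n_3 z - ct),$
$\frac{\partial^2 u(t, x, y, z)}{\partial x^2} = (n_1)^2 \psi^{(2)}(n_1 x + n_2 y + n_3 z - ct),$
$\frac{\partial^2 u(t, x, y, z)}{\partial y^2} = (n_2)^2 \psi^{(2)}(n_1 x + n_2 y + n_3 z - ct),$
$\frac{\partial^2 u(t, x, y, z)}{\partial z^2} = (n_3)^2 \psi^{(2)}(n_1 x + n_2 y + n_3 z - ct).$
Then
(11.3.39)
$c^2 \left[ \frac{\partial^2 u}{\partial x^2}(t, x, y, z) + \frac{\partial^2 u}{\partial y^2}(t, x, y, z) + \frac{\partial^2 u}{\partial z^2}(t, x, y, z) \right] = c^2 [(n_1)^2 + (n_2)^2 + (n_3)^2] \psi^{(2)}(n_1 x + n_2 y + n_3 z - ct)$
$= c^2 \psi^{(2)}(n_1 x + n_2 y + n_3 z - ct)$
$= \frac{\partial^2 u(t, x, y, z)}{\partial t^2}.$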
Here we have used (11.3.38) at the first equality of (11.3.39), the fact that nnn is a unit vector at the
second equality, and (11.3.38) again at the third equality. We see from (11.3.39) that u defined by
(11.3.37) is a solution of the wave equation (11.3.32) for each and every choice of the unit vector
nnn and C2-function ψ (see (11.3.36)). Since there are many such choices of nnn and ψ it follows that
there are many possible scalar fields u which satisfy the wave equation (11.3.32), that is the wave
equation has not just one but many solutions! We next look at the structure of the scalar field u
given by (11.3.37); among other things this will reveal why (11.3.32) is called a wave equation. In
the first instance take the unit vector nnn along the x-axis,
(11.3.40) nnn = iii, i.e. n1 = 1, n2 = 0, n3 = 0.
In this case the function u at (11.3.37) simplifies to
(11.3.41) u(t, x, y, z) = ψ(x− ct), for all (t, x, y, z).
To get a feel for this solution fix a real constant α, take the point αnnn = αiii on the x-axis, and let Pα
be the plane in R3 parallel to the y− z plane and passing through point αnnn, that is perpendicularly
intersecting the x-axis at point x = α (see Figure 11.2). Mathematically we can express Pα as
(11.3.42) $P_\alpha = \{ (x, y, z) \mid x = \alpha \}.$
Figure 11.2: Plane Pα through x = α and parallel to the y − z plane
From (11.3.41) and (11.3.42) it follows that for every fixed instant t we have
(11.3.43) u(t, x, y, z) = ψ(α− ct), for all (x, y, z) in Pα,
that is, at each fixed instant t the function u(t, x, y, z) has the constant value ψ(α − ct) for each
and every (x, y, z) in the plane Pα. To see how the solution (11.3.41) describes a “wave” we look at
the function on the right side of (11.3.41) at the fixed instants t = 0, t = t1 and t = t2, for
(11.3.44) 0 < t1 < t2.
For the sake of illustration we choose a function ψ with a “bell-shaped” profile having a maximum
at α0, but any C2-function will suffice (see Figure 11.3). Now plot ψ(x − ct) against x keeping t
fixed at the values t = 0, t = t1 and t = t2 (recall (11.3.44)), as shown in Figure 11.4: It is clear
that the graph of ψ(x− ct) (seen as a function of x for each fixed t) maintains the same “shape” or
“profile” but “propagates” or “undulates” to the right with increasing t, that is we have a “wave”
which moves to the right, and if we watch the “maximum” point A, located at α0 when t = 0, we
see that it shifts to α0 + ct1 at t = t1, and then shifts to α0 + ct2 at t = t2, which indicates that the
wave moves to the right at speed c. This confirms what we noted above, namely that the constant
c appearing in the wave equation (11.3.32) has something to do with speed or velocity. As we have
already noted, at each fixed instant t the “wave function” u(t, x, y, z) is constant for all (x, y, z)
Figure 11.3: Profile of ψ(α) versus α
in the planar surface Pα (the constant value being ψ(α − ct)). For this reason the wave given by
(11.3.41) is called a plane wave.
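The statement that the profile of ψ(x − ct) keeps its shape and moves to the right at speed c can be checked by tracking the maximum on a grid. In the sketch below the Gaussian profile, the values of c and α0, the three instants and the grid are all illustrative assumptions; any bell-shaped C2-function would serve equally well.

```python
import math

C = 2.0        # assumed wave speed
ALPHA0 = 1.0   # assumed location of the maximum of psi


def psi(s):
    """Assumed bell-shaped profile with its maximum at s = ALPHA0."""
    return math.exp(-(s - ALPHA0) ** 2)


def u(t, x):
    """The solution (11.3.41); it does not depend on y or z."""
    return psi(x - C * t)


def peak_location(t, x_min=-5.0, x_max=15.0, n=20001):
    """Grid point at which x -> u(t, x) is largest (grid spacing 0.001)."""
    step = (x_max - x_min) / (n - 1)
    xs = [x_min + i * step for i in range(n)]
    return max(xs, key=lambda x: u(t, x))


times = (0.0, 0.5, 1.5)  # play the role of t = 0, t1, t2 in (11.3.44)
peaks = [peak_location(t) for t in times]
for t, peak in zip(times, peaks):
    print(f"t = {t}: peak found near x = {peak:.3f}, predicted {ALPHA0 + C * t:.3f}")
```

Up to the grid resolution, the located maximum sits at α0 + ct at each instant, which is the shift described for the point A in the text.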
Let us now return to the more general solution of the wave equation given by (11.3.37) in terms
of a general unit vector nnn (rather than the special unit vector at (11.3.40)). We shall associate with
a generic point (x, y, z) in R3 the vector rrr in the usual way, that is
(11.3.45) rrr = xiii+ yjjj + zkkk.
Let OA denote the straight line in R3 collinear with nnn and fix some real constant α. Then αnnn is a
point on OA. Let Pα be the plane in R3 which is perpendicular to the line OA and intersects line
OA at the point αnnn (see Figure 11.5). It is then evident that, for each rrr = xiii+ yjjj + zkkk in Pα, the
vector rrr−αnnn must be perpendicular to the line OA, and in particular rrr−αnnn must be perpendicular
to nnn (since OA is collinear with nnn) that is
(rrr − αnnn) · nnn = 0 for each point rrr = xiii+ yjjj + zkkk in Pα
or, since nnn · nnn = 1,
(11.3.46) rrr · nnn = (αnnn) · nnn = αnnn · nnn = α for each point rrr = xiii+ yjjj + zkkk in Pα.
Figure 11.4: ψ(x− ct) versus x at instants t = 0, t = t1 and t = t2 for 0 < t1 < t2
That is, for each point rrr = xiii+ yjjj + zkkk in Pα we have
$n_1 x + n_2 y + n_3 z = \mathbf{n} \cdot \mathbf{r}$ (from (11.3.36))
(11.3.47) $= \alpha$ (from (11.3.46)).
In view of (11.3.47) and (11.3.37) we obtain the following: for each fixed instant t
(11.3.48) u(t, x, y, z) = ψ(α− ct) for each point (x, y, z) in Pα.
In view of (11.3.48) we see that it is enough to look at the dependence of ψ(α − ct) on α (which
indicates displacement along line OA) for each fixed t in order to understand the whole function u
given by (11.3.37). This dependence is obviously pretty similar to the dependence we saw at Figure
11.4, but for completeness we illustrate matters in Figure 11.6, in which we show the dependence of
ψ(α−ct) on α for the fixed instants t = 0, t = t1, and t = t2 (recall (11.3.44)). It is clear from Figure
11.6 and (11.3.48) that the wave given by (11.3.37) “propagates” or “undulates” in the direction of
the line OA, that is in the direction of the unit normal nnn, at a speed c. Moreover it follows from
(11.3.48) that, at each fixed instant t, the scalar field u(t, x, y, z) has the constant value ψ(α − ct)
for all (x, y, z) in the plane Pα, that is the wave fronts are the planes Pα, and therefore the wave
described by the solution u(t, x, y, z) at (11.3.37) is again a plane wave.
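Both claims about the general solution (11.3.37), namely that it satisfies the wave equation (11.3.32) and that it is constant on each plane Pα, can be spot-checked numerically. The sketch below is an illustration under stated assumptions: the unit vector, the profile ψ, the speed c, the sample points and the finite-difference step are all chosen here and are not taken from the notes.

```python
import math

C = 1.5                             # assumed wave speed
N = (2.0 / 3.0, -1.0 / 3.0, 2.0 / 3.0)  # assumed unit vector: 4/9 + 1/9 + 4/9 = 1
H = 1e-3                            # step for second differences


def psi(s):
    """An assumed smooth profile; any C^2 function would do."""
    return math.sin(s) + 0.5 * math.exp(-s * s)


def u(t, x, y, z):
    """The plane-wave candidate (11.3.37)."""
    return psi(N[0] * x + N[1] * y + N[2] * z - C * t)


def second_partial(var, p):
    """Central second difference of u w.r.t. argument var (0 = t, 1 = x, 2 = y, 3 = z)."""
    hi = list(p)
    lo = list(p)
    hi[var] += H
    lo[var] -= H
    return (u(*hi) - 2.0 * u(*p) + u(*lo)) / (H * H)


# Residual of the wave equation (11.3.32) at a sample (t, x, y, z).
p = (0.4, 1.0, -2.0, 0.5)
laplacian = sum(second_partial(v, p) for v in (1, 2, 3))
u_tt = second_partial(0, p)
residual = C * C * laplacian - u_tt
print("wave equation residual:", residual)

# Constancy on the plane P_alpha: points alpha*N + a*V1 + b*V2 with V1, V2 perpendicular to N.
ALPHA = 0.9
T_FIXED = 0.4
V1 = (1.0, 2.0, 0.0)   # V1 . N = 2/3 - 2/3 = 0
V2 = (1.0, 0.0, -1.0)  # V2 . N = 2/3 - 2/3 = 0
plane_values = []
for a, b in ((0.0, 0.0), (1.0, -2.0), (-3.0, 0.5)):
    r = tuple(ALPHA * N[i] + a * V1[i] + b * V2[i] for i in range(3))
    plane_values.append(u(T_FIXED, *r))
expected = psi(ALPHA - C * T_FIXED)  # the constant value in (11.3.48)
print("values on the plane:", plane_values, "expected:", expected)
```

The residual is small, limited by the finite-difference truncation error, and the sampled values on the plane all equal ψ(α − ct), in line with (11.3.48).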
Figure 11.5: The plane Pα is perpendicular to OA and intersects OA at αnnn
We have shown that any time varying scalar field u(t, x, y, z) given by (11.3.37), subject to
(11.3.36), qualifies as a solution of the wave equation (11.3.32), and furthermore the waves described
by u(t, x, y, z) are plane waves with wave fronts perpendicular to the unit vector nnn. Of particular
importance is the case where the external factors setting up the wave dictate that the solution of
the wave equation (11.3.32) be in the specific form
(11.3.49) u(t, x, y, z) = η(x, y, z) cos(ωt),
in which ω is a constant angular frequency and η : R3 → R is some C2-function. Thus u(t, x, y, z)
at (11.3.49) is periodic in time at each fixed point (x, y, z), and furthermore is “separated” in the
sense that the dependence of u on the space variables (x, y, z) is “locked” in the function η, which
does not depend on time t, whereas the dependence of u on time t is locked in the periodic function
cos(ωt), which does not depend on the space variables (x, y, z). Such separated functions with
sinusoidal dependence on time t occur very naturally in applications. We are going to see that, if
the “separated” solution u at (11.3.49) has planar wave fronts which are perpendicular to some unit
vector nnn, then u can be put in the general form of (11.3.37) in which the function ψ is also periodic,
so that sinusoidal dependence in the time variable t, as at (11.3.49), leads to sinusoidal dependence
in the space variables (x, y, z). First note that life is a lot easier if, at (11.3.49), we represent the
Figure 11.6: Dependence of ψ(α− ct) on α for t = 0, t1, t2 with 0 < t1 < t2
sinusoid cos(ωt) in exponential form using Euler’s formula, so that we write (11.3.49) as follows
(11.3.50) $u(t, x, y, z) = \eta(x, y, z)\, e^{-j\omega t}$,
remembering that mentally we just take the real part of the right side of (11.3.50). From (11.3.50)
we have
(11.3.51) $(\nabla^2 u)(t, x, y, z) = (\nabla^2 \eta)(x, y, z)\, e^{-j\omega t}$,
(see (9.1.42)), and it is immediate that
(11.3.52) $\dfrac{\partial^2 u(t, x, y, z)}{\partial t^2} = -\omega^2\, \eta(x, y, z)\, e^{-j\omega t}$.
Putting (11.3.52) and (11.3.51) into (11.3.33) then gives
$\left[ c^2 (\nabla^2 \eta)(x, y, z) + \omega^2 \eta(x, y, z) \right] e^{-j\omega t} = 0$,
so that
(11.3.53) $(\nabla^2 \eta)(x, y, z) + \kappa^2 \eta(x, y, z) = 0$,
where we have defined
(11.3.54) $\kappa := \dfrac{\omega}{c}$.
We must now determine scalar fields η(x, y, z) which satisfy (11.3.53), since each such scalar field
substituted into the right side of (11.3.50) leads to a solution of the wave equation (11.3.32). To
this end fix some complex constant γ given by
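(11.3.55) $\gamma = A e^{j\theta}$,
fix any unit vector $\mathbf{n}$ in the form of (11.3.36), and define
(11.3.56) $\eta(x, y, z) := \gamma \exp\{ j\kappa\, \mathbf{n} \cdot (x\mathbf{i} + y\mathbf{j} + z\mathbf{k}) \} = \gamma \exp\{ j\kappa (n_1 x + n_2 y + n_3 z) \}$.
From (11.3.56)
$\dfrac{\partial \eta(x, y, z)}{\partial x} = \gamma (j\kappa n_1) \exp\{ j\kappa (n_1 x + n_2 y + n_3 z) \}$,
and therefore
(11.3.57) $\dfrac{\partial^2 \eta(x, y, z)}{\partial x^2} = \gamma (j\kappa n_1)^2 \exp\{ j\kappa (n_1 x + n_2 y + n_3 z) \} = -\gamma \kappa^2 (n_1)^2 \exp\{ j\kappa (n_1 x + n_2 y + n_3 z) \}$.
Similarly
(11.3.58) $\dfrac{\partial^2 \eta(x, y, z)}{\partial y^2} = -\gamma \kappa^2 (n_2)^2 \exp\{ j\kappa (n_1 x + n_2 y + n_3 z) \}$,
and
(11.3.59) $\dfrac{\partial^2 \eta(x, y, z)}{\partial z^2} = -\gamma \kappa^2 (n_3)^2 \exp\{ j\kappa (n_1 x + n_2 y + n_3 z) \}$.
From Definition 9.1.18 together with (11.3.57) - (11.3.59) we obtain
(11.3.60) $(\nabla^2 \eta)(x, y, z) = \dfrac{\partial^2 \eta(x, y, z)}{\partial x^2} + \dfrac{\partial^2 \eta(x, y, z)}{\partial y^2} + \dfrac{\partial^2 \eta(x, y, z)}{\partial z^2}$
$= -\gamma \kappa^2 [(n_1)^2 + (n_2)^2 + (n_3)^2] \exp\{ j\kappa (n_1 x + n_2 y + n_3 z) \}$
$= -\gamma \kappa^2 \exp\{ j\kappa (n_1 x + n_2 y + n_3 z) \}$ (since $\mathbf{n}$ is a unit vector)
$= -\kappa^2 \eta(x, y, z)$ (from (11.3.56)).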
We see from (11.3.60) that, if η is defined by (11.3.56) in terms of any complex number γ and
any unit vector nnn, then η satisfies (11.3.53). Upon combining (11.3.56) and (11.3.50) we find that
solutions u(t, x, y, z) of the wave equation (11.3.32) having the special form (11.3.50) are given by
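(11.3.61) $u(t, x, y, z) = \gamma \exp\{ j\kappa (n_1 x + n_2 y + n_3 z) \}\, e^{-j\omega t}$
$= \gamma \exp\{ j[\kappa (n_1 x + n_2 y + n_3 z) - \omega t] \}$
$= \gamma \exp\{ j[\kappa\, \mathbf{n} \cdot (x\mathbf{i} + y\mathbf{j} + z\mathbf{k}) - \omega t] \}$
$= \gamma \exp\{ j\kappa [\mathbf{n} \cdot (x\mathbf{i} + y\mathbf{j} + z\mathbf{k}) - ct] \}$.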
Now remembering that we just want the real part on the right side of (11.3.61), we obtain the
solutions
(11.3.62) $u(t, x, y, z) = A \cos\{ \kappa [\mathbf{n} \cdot (x\mathbf{i} + y\mathbf{j} + z\mathbf{k}) - ct] + \theta \}$
in which the amplitude A and phase angle θ follow from (11.3.55). We see that u at (11.3.62) is in
exactly the form of (11.3.37) with ψ given by
(11.3.63) ψ(α) := A cos(κα + θ) for all α.
Notice that ψ(α) is a periodic function of α with period given by
(11.3.64) $\dfrac{2\pi}{\kappa} = \dfrac{2\pi c}{\omega}$,
as follows from (11.3.54). This is the spatial periodicity of the solution u at (11.3.62) in the direction
of the unit vector nnn for every fixed t, usually called the wavelength of the wave. Finally, one sees
from (11.3.62) that, at each point (x, y, z) in R3, the time periodicity is given by
(11.3.65) $\dfrac{2\pi}{\kappa c} = \dfrac{2\pi}{\omega}$,
which of course just confirms the time periodicity we began with at (11.3.49).
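As a sanity check on (11.3.53), (11.3.56) and (11.3.64), the following Python sketch approximates the Laplacian of η by central differences and confirms that the residual of (11.3.53) is small, and that η repeats after a displacement of one wavelength 2π/κ in the direction of the unit vector. The numerical values of ω, c, A, θ and the unit vector, the step size h, and the helper names are all assumptions made for the illustration.

```python
import cmath
import math

def eta(x, y, z, gamma, kappa, n):
    # eta = gamma * exp{ j kappa (n1 x + n2 y + n3 z) }, as in (11.3.56).
    return gamma * cmath.exp(1j * kappa * (n[0] * x + n[1] * y + n[2] * z))

def laplacian(f, p, h=1e-3):
    # Central-difference approximation of the Laplacian of f at the point p.
    total = 0.0
    for i in range(3):
        plus, minus = list(p), list(p)
        plus[i] += h
        minus[i] -= h
        total += (f(*plus) - 2.0 * f(*p) + f(*minus)) / (h * h)
    return total

omega = 2.0 * math.pi * 5.0            # assumed angular frequency
c = 4.0                                # assumed wave speed
kappa = omega / c                      # wave number, as in (11.3.54)
gamma = 1.5 * cmath.exp(0.3j)          # gamma = A e^{j theta} with A = 1.5, theta = 0.3
n = (2.0 / 3.0, -1.0 / 3.0, 2.0 / 3.0) # assumed unit vector

f = lambda x, y, z: eta(x, y, z, gamma, kappa, n)
p = (0.4, -1.1, 2.0)

# Residual of (11.3.53), scaled by the size of kappa^2 * eta.
residual = laplacian(f, p) + kappa ** 2 * f(*p)
relative_residual = abs(residual) / (kappa ** 2 * abs(f(*p)))

# Moving one wavelength along n should leave eta unchanged.
wavelength = 2.0 * math.pi / kappa
shifted = tuple(p[i] + wavelength * n[i] for i in range(3))
period_error = abs(f(*shifted) - f(*p))
print(relative_residual, period_error)
```

The residual is not exactly zero because the Laplacian is only approximated; it shrinks roughly like h² as the step h is reduced.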
In the present remark we have summarized the essential aspects of the wave equation (11.3.32)
that we shall need for addressing the wave equations for the electric and magnetic fields at (11.3.31)
that we obtained from Maxwell’s equations. In brief, we have learned that scalar fields u of the
form at (11.3.37), defined in terms of a given unit vector nnn and a C2-function ψ (recall (11.3.36)),
constitute solutions of the wave equation (11.3.32). Moreover, these solutions have planar wave
fronts comprising planes that are perpendicular to the vector nnn, and the wave represented by u
moves in the direction of nnn at a speed c (see Figure 11.5 and Figure 11.6). Finally, if “external
conditions” are such that the solution u not only has planar wave fronts perpendicular to a given
unit vector nnn, but also has the special separated form at (11.3.49), with time periodicity 2π/ω, then
ψ is also periodic with periodicity at (11.3.64), this being the spatial periodicity (or wavelength) of
the wave in the direction of nnn.
We return to the wave equations at (11.3.31) for the three scalar components Er and Br, r =
1, 2, 3. Each of these equations matches (11.3.32), u being identified in turn with each of the Er
and Br, and the constant c being the speed of light at (11.3.24). It follows that all properties of
wave equations established in Remark 11.3.2 immediately carry over to the equations (11.3.31). In
particular, each of the scalar components Er and Br propagates through space as a wave moving at
the speed of light! (Although we shall not pursue the matter here, one can use Maxwell’s equations
to demonstrate that light is an electromagnetic wave in which the electric and magnetic fields are
governed by the source free equations (11.3.18).) Of particular interest in applications is the case
where the voltage source for the antenna in Figure 11.1 is sinusoidal, that is
(11.3.66) v(t) = V cos(ωt).
One can show by a rather deep analysis involving Maxwell’s equations that with this sinusoidal
voltage source the solutions Er and Br of the wave equations at (11.3.31) have the special separated
form of (11.3.49) that is
(11.3.67) $E_r(t, x, y, z) = \eta(x, y, z)\, e^{-j\omega t}$,
(with a similar formula for the Br). In view of Remark 11.3.2 it then follows that we have plane
wave solutions of the form (compare (11.3.62))
(11.3.68) $E_r(t, x, y, z) = A \cos\{ \kappa [\mathbf{n} \cdot (x\mathbf{i} + y\mathbf{j} + z\mathbf{k}) - ct] + \theta \}$,
(with a similar formula for the Br), in which the wave number κ is given by (11.3.54) with c being
the speed of light at (11.3.24). The amplitude A, unit vector nnn and phase angle θ in (11.3.68) are
functions of the amplitude V in (11.3.66) and the geometry of the antenna. These matters are
addressed in the theory of antennas, which is itself built on Maxwell’s equations.
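To get a feeling for the sizes involved, the following Python sketch computes the wave number (11.3.54) and the wavelength (11.3.64) for a sinusoidal source. It assumes the standard relation c = 1/√(ε0µ0) for the speed of light, approximate SI values for ε0 and µ0, and a hypothetical source frequency of 100 MHz; none of these numerical values is taken from these notes.

```python
import math

MU_0 = 4.0e-7 * math.pi        # permeability of free space in H/m (approximate)
EPSILON_0 = 8.8541878128e-12   # permittivity of free space in F/m (approximate)

def wave_parameters(frequency_hz):
    # Returns (c, kappa, wavelength) for a sinusoidal source of the given frequency.
    c = 1.0 / math.sqrt(EPSILON_0 * MU_0)   # assumed relation for the speed of light
    omega = 2.0 * math.pi * frequency_hz    # angular frequency of the source
    kappa = omega / c                       # wave number, as in (11.3.54)
    wavelength = 2.0 * math.pi / kappa      # spatial period, as in (11.3.64)
    return c, kappa, wavelength

c, kappa, wavelength = wave_parameters(100.0e6)   # hypothetical 100 MHz source
print(c, kappa, wavelength)
```

For this assumed frequency the wavelength comes out at roughly three metres.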
11.4 Electromagnetic Waves with Sources
In Section 11.3 we addressed the source-free Maxwell equations (11.3.18). The identity (9.1.62) for
the “double curl” of a vector field was used to show that the electric and magnetic fields EEE and BBB
given by the source free equations (11.3.18) are easily reduced to the wave equations (11.3.31), each
of which can be separately “solved” for the components Er and Br according to Remark 11.3.2. In
the present section we return to Maxwell’s equations (11.2.16) with sources, which we repeat here:
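(11.4.69)
(a) $(\nabla \cdot \mathbf{E}) = \dfrac{1}{\varepsilon_0} \rho$,
(b) $(\nabla \cdot \mathbf{B}) = 0$,
(c) $(\nabla \times \mathbf{E}) + \dfrac{\partial \mathbf{B}}{\partial t} = 0$,
(d) $(\nabla \times \mathbf{B}) = \mu_0 \mathbf{J} + \varepsilon_0 \mu_0 \dfrac{\partial \mathbf{E}}{\partial t}$.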
There are many reasons why we want to study Maxwell’s equations (11.4.69) with sources present.
For example, in Remark 11.3.1 we briefly noted that a voltage source connected to an antenna
sets up charge and current density fields ρ and JJJ within the antenna. In order to design efficient
and properly functioning antennas not only must we understand the EEE and BBB fields outside the
antenna (something we addressed in Section 11.3), but we must also thoroughly understand these
fields within the antenna and how they are related to the source fields ρ and JJJ created inside the
antenna by the voltage source. All of this is given by Maxwell’s equations (11.4.69).
The presence of the charge and current density sources ρ and JJJ in (11.4.69)(a)(d) means that
we cannot use the rather simple approach of Section 11.3 for the source free case to obtain wave
equations for Er and Br similar to (11.3.31). In fact, the equations (11.4.69), with sources present, are
definitely more challenging than the source free equations (11.3.18). To get a possible clue on how to
proceed let us recall Remark 10.2.11, in which we addressed the problem of extracting the magnetic
field BBB from the relations (10.2.50) in terms of the source JJJ (for time constant fields), and for which
we used the very clever idea of a Coulomb gauge. Equations (11.4.69) present us with a problem
which is not unlike the problem of Remark 10.2.11, in that we want to extract fields (now both an
electric field EEE and a magnetic field BBB) from the relations (11.4.69) in terms of the given sources
ρ and JJJ , in the same way that we extracted the magnetic field BBB from the relations (10.2.50) in
terms of the source JJJ . Of course the set of equations (11.4.69) is clearly more complicated than the
set of equations (10.2.50), and also involves time varying fields in contrast to the time static case of
Remark 10.2.11, so the problem we are dealing with now is obviously significantly more challenging
than the problem addressed in Remark 10.2.11. It turns out, however, that the clever idea of a
gauge, which proved so effective in Remark 10.2.11, can actually be extended to the present problem
as well. Exactly as was the case in Remark 10.2.11 the magic key to all this is to be found in the
vector calculus we have learned.
From (11.4.69)(b) with Theorem 10.2.7 we can put
(11.4.70) BBB = ∇×AAA
for some vector field AAA called the magnetic vector potential of BBB (much as at (10.2.51) in the case
of static magnetic fields). Then
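(11.4.71) $\dfrac{\partial \mathbf{B}}{\partial t} = \dfrac{\partial (\nabla \times \mathbf{A})}{\partial t} = \nabla \times \left[ \dfrac{\partial \mathbf{A}}{\partial t} \right]$, in which the first equality is from (11.4.70) and the second from (9.1.53),
and then
$\nabla \times \left[ \mathbf{E} + \dfrac{\partial \mathbf{A}}{\partial t} \right] = \nabla \times \mathbf{E} + \nabla \times \left[ \dfrac{\partial \mathbf{A}}{\partial t} \right]$ (from (9.1.61))
$= \nabla \times \mathbf{E} + \dfrac{\partial \mathbf{B}}{\partial t} = 0$ (from (11.4.71) and (11.4.69)(c)),
that is
(11.4.72) $\nabla \times \left[ \mathbf{E} + \dfrac{\partial \mathbf{A}}{\partial t} \right] = 0$.
From Theorem 9.1.9 we see that the vector field in square brackets in (11.4.72) is conservative, and
so, from Definition 6.2.1, there is some scalar field $\Psi$ such that
$\mathbf{E} + \dfrac{\partial \mathbf{A}}{\partial t} = -\nabla \Psi$,
that is
(11.4.73) $\mathbf{E} = -\dfrac{\partial \mathbf{A}}{\partial t} - \nabla \Psi$.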
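We know in fact that there are actually many vector fields $\mathbf{A}$ such that (11.4.70) holds, and many
scalar fields $\Psi$ such that (11.4.73) holds. It turns out that we can actually find a rather special
vector field $\hat{\mathbf{A}}$ and a rather special scalar field $\hat{\Psi}$ such that
(11.4.74) $\mathbf{B} = \nabla \times \hat{\mathbf{A}}, \quad \mathbf{E} = -\left[ \dfrac{\partial \hat{\mathbf{A}}}{\partial t} + \nabla \hat{\Psi} \right], \quad (\nabla \cdot \hat{\mathbf{A}}) + \varepsilon_0 \mu_0 \dfrac{\partial \hat{\Psi}}{\partial t} = 0$.
While there are many vector fields $\hat{\mathbf{A}}$ and scalar fields $\hat{\Psi}$ which satisfy the first two relations of
(11.4.74), it is still far from obvious that we can choose these fields in such a way that the
third relation of (11.4.74) is also satisfied. Later we will demonstrate that we can in fact do this,
but now let us just suppose we have fields $\hat{\mathbf{A}}$ and $\hat{\Psi}$ satisfying all of (11.4.74) at our disposal and see
how we can use these fields to “solve” Maxwell’s equations (11.4.69) for the electric and magnetic
fields $\mathbf{E}$ and $\mathbf{B}$. From the first two relations of (11.4.74) and (11.4.69)(d) we get
(11.4.75) $\nabla \times (\nabla \times \hat{\mathbf{A}}) = \mu_0 \mathbf{J} - \varepsilon_0 \mu_0 \dfrac{\partial}{\partial t} \left[ \dfrac{\partial \hat{\mathbf{A}}}{\partial t} + \nabla \hat{\Psi} \right]$.
Now expand the left side of (11.4.75) by the identity (9.1.62) to obtain
$\nabla (\nabla \cdot \hat{\mathbf{A}}) - \nabla^2 \hat{\mathbf{A}} = \mu_0 \mathbf{J} - \varepsilon_0 \mu_0 \dfrac{\partial}{\partial t} \left[ \dfrac{\partial \hat{\mathbf{A}}}{\partial t} + \nabla \hat{\Psi} \right]$,
that is
(11.4.76) $\nabla^2 \hat{\mathbf{A}} - \varepsilon_0 \mu_0 \dfrac{\partial^2 \hat{\mathbf{A}}}{\partial t^2} - \nabla \left[ (\nabla \cdot \hat{\mathbf{A}}) + \varepsilon_0 \mu_0 \dfrac{\partial \hat{\Psi}}{\partial t} \right] = -\mu_0 \mathbf{J}$.
But the quantity in square brackets on the left side of (11.4.76) is identically zero, by the third
relation of (11.4.74), so that (11.4.76) simplifies to
(11.4.77) $\nabla^2 \hat{\mathbf{A}} - \varepsilon_0 \mu_0 \dfrac{\partial^2 \hat{\mathbf{A}}}{\partial t^2} = -\mu_0 \mathbf{J}$.
Now (11.4.77) is a very nice relation indeed. In fact, recalling the componentwise expansion of the
vector fields $\mathbf{J}$ and $\hat{\mathbf{A}}$, namely
(11.4.78) $\mathbf{J} = J_1 \mathbf{i} + J_2 \mathbf{j} + J_3 \mathbf{k}, \quad \hat{\mathbf{A}} = A_1 \mathbf{i} + A_2 \mathbf{j} + A_3 \mathbf{k}$,
it is immediate that we have
(11.4.79) $\dfrac{\partial^2 \hat{\mathbf{A}}}{\partial t^2} = \dfrac{\partial^2 A_1}{\partial t^2} \mathbf{i} + \dfrac{\partial^2 A_2}{\partial t^2} \mathbf{j} + \dfrac{\partial^2 A_3}{\partial t^2} \mathbf{k}$.
Furthermore, recall the componentwise expansion of $\nabla^2 \hat{\mathbf{A}}$ from Definition 9.1.19, that is
(11.4.80) $(\nabla^2 \hat{\mathbf{A}}) = (\nabla^2 A_1) \mathbf{i} + (\nabla^2 A_2) \mathbf{j} + (\nabla^2 A_3) \mathbf{k}$,
(as follows from (9.1.31) with $\hat{\mathbf{A}}$ in place of $\mathbf{F}$). In view of (11.4.78), (11.4.79) and (11.4.80), we can
equate the scalar components in the vector relation (11.4.77) to get the scalar relations
(11.4.81) $\nabla^2 A_i - \varepsilon_0 \mu_0 \dfrac{\partial^2 A_i}{\partial t^2} = -\mu_0 J_i$,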
for i = 1, 2, 3. Observe that, in the case of time constant fields, the double t-derivative in (11.4.81)
is identically zero so that (11.4.81) reduces at once to the Poisson equations (10.2.59) that we
already obtained in Remark 10.2.11 in which we addressed the problem of static fields. In Remark
10.2.11 we saw that the Poisson equations (10.2.59) have nice solutions (see (10.2.60)). Can we
get comparably explicit solutions of the equations at (11.4.81) which are effectively “time varying”
Poisson equations featuring a double t-derivative of Ai in addition to the usual Laplacian of Ai?
There is indeed such an explicit formula for the solutions Ai of (11.4.81), which can be obtained
by the method of Green’s functions applied to the equations (11.4.81). The method of Green’s
functions is part of the theory of partial differential equations and is somewhat outside the scope of
this course. Accordingly, we shall not develop this method here but will just state the result that
one gets for the solutions of (11.4.81), namely
(11.4.82) $A_i(t, x, y, z) = \dfrac{\mu_0}{4\pi} \displaystyle\int_{\mathbb{R}^3} \dfrac{J_i\left( t - \sqrt{\varepsilon_0 \mu_0 [(x - u)^2 + (y - v)^2 + (z - w)^2]},\, u, v, w \right)}{[(x - u)^2 + (y - v)^2 + (z - w)^2]^{1/2}} \, du \, dv \, dw$,
for all (t, x, y, z), which gives each $A_i$ in terms of the (known) time varying current density
components $J_i(t, x, y, z)$. Notice that the numerator in the integrand of (11.4.82) is obtained by
substituting the quadruple of numbers $(t - \sqrt{\varepsilon_0 \mu_0 [(x - u)^2 + (y - v)^2 + (z - w)^2]},\, u, v, w)$ in place of
the generic variable (t, x, y, z) in $J_i(t, x, y, z)$. When $\mathbf{J}$ is time constant, so that we have $J_i(x, y, z)$
instead of $J_i(t, x, y, z)$, there is no “place” for the number $t - \sqrt{\varepsilon_0 \mu_0 [(x - u)^2 + (y - v)^2 + (z - w)^2]}$,
which is substituted into the “first argument” corresponding to the variable t, so the numerator
just reduces to Ji(u, v, w), which is precisely what we have at (10.2.60).
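The substitution into the first argument of the current density in (11.4.82) can be made concrete with a small Python sketch. The function names, the approximate SI values of ε0 and µ0, and the sample current densities below are assumptions made for the illustration; the sketch evaluates only the integrand at one source point and does not attempt the integration over R3.

```python
import math

MU_0 = 4.0e-7 * math.pi        # approximate SI value, assumed here
EPSILON_0 = 8.8541878128e-12   # approximate SI value, assumed here

def delayed_time(t, field_point, source_point):
    # The number substituted into the first argument of J_i in (11.4.82).
    dist_sq = sum((p - q) ** 2 for p, q in zip(field_point, source_point))
    return t - math.sqrt(EPSILON_0 * MU_0 * dist_sq)

def integrand(J_i, t, field_point, source_point):
    # Integrand of (11.4.82), without the constant factor mu_0 / (4 pi).
    distance = math.dist(field_point, source_point)
    return J_i(delayed_time(t, field_point, source_point), *source_point) / distance

def J_static(t, u, v, w):
    # Hypothetical time constant current density component: t is ignored.
    return math.exp(-(u * u + v * v + w * w))

def J_varying(t, u, v, w):
    # Hypothetical time varying current density component.
    return math.cos(2.0e6 * math.pi * t) * math.exp(-(u * u + v * v + w * w))

P = (299.792458, 0.0, 0.0)   # field point (x, y, z), in metres
Q = (0.0, 0.0, 0.0)          # source point (u, v, w)

delay = delayed_time(0.0, P, Q)   # close to -1.0e-6 for this separation
static_same = integrand(J_static, 0.0, P, Q) == integrand(J_static, 5.0, P, Q)
print(delay, static_same, integrand(J_varying, 0.0, P, Q))
```

For the time constant component the integrand does not depend on t at all, which is the reduction to the static case described above.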
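In view of (11.4.82) we have now obtained the vector field $\hat{\mathbf{A}}$, and by calculating the curl of $\hat{\mathbf{A}}$
we get the magnetic field $\mathbf{B}$ from the first relation of (11.4.74). It remains to determine the electric
field $\mathbf{E}$. For this we observe
$\dfrac{1}{\varepsilon_0} \rho = \nabla \cdot \mathbf{E}$ (from (11.4.69)(a))
$= \nabla \cdot \left[ -\dfrac{\partial \hat{\mathbf{A}}}{\partial t} - \nabla \hat{\Psi} \right]$ (from the second relation of (11.4.74))
$= -\nabla \cdot \left[ \dfrac{\partial \hat{\mathbf{A}}}{\partial t} \right] - \nabla^2 \hat{\Psi}$
$= -\dfrac{\partial (\nabla \cdot \hat{\mathbf{A}})}{\partial t} - \nabla^2 \hat{\Psi}$,
that is
(11.4.83) $\dfrac{\partial (\nabla \cdot \hat{\mathbf{A}})}{\partial t} + \nabla^2 \hat{\Psi} = -\dfrac{1}{\varepsilon_0} \rho$.
From the third relation of (11.4.74) we have
$(\nabla \cdot \hat{\mathbf{A}}) = -\varepsilon_0 \mu_0 \dfrac{\partial \hat{\Psi}}{\partial t}$
and putting this in (11.4.83) we find
(11.4.84) $\nabla^2 \hat{\Psi} - \varepsilon_0 \mu_0 \dfrac{\partial^2 \hat{\Psi}}{\partial t^2} = -\dfrac{1}{\varepsilon_0} \rho$.
This equation is of exactly the same form as (11.4.81), except that the “source” term on the right
hand side involves the given charge density $\rho$ instead of the given component $J_i$ of the current
density $\mathbf{J}$, and the “thing” we must extract from (11.4.84) is the scalar field $\hat{\Psi}$. Applying the
method of Green’s functions to (11.4.84) we find that $\hat{\Psi}$ is given by
(11.4.85) $\hat{\Psi}(t, x, y, z) = \dfrac{1}{4\pi\varepsilon_0} \displaystyle\int_{\mathbb{R}^3} \dfrac{\rho\left( t - \sqrt{\varepsilon_0 \mu_0 [(x - u)^2 + (y - v)^2 + (z - w)^2]},\, u, v, w \right)}{[(x - u)^2 + (y - v)^2 + (z - w)^2]^{1/2}} \, du \, dv \, dw$,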
for all (t, x, y, z), which gives $\hat{\Psi}$ in terms of the (known) time varying charge density field $\rho(t, x, y, z)$.
It remains to determine the electric field $\mathbf{E}$, but now we have all the information required for this.
We just determine $\nabla \hat{\Psi}$ from $\hat{\Psi}$ at (11.4.85), and determine $\partial \hat{\mathbf{A}} / \partial t$ from $\hat{\mathbf{A}}$ given by (11.4.82),
and then $\mathbf{E}$ is given by the second relation of (11.4.74).
It remains to address the all-important question of how to find some vector field $\hat{\mathbf{A}}$ and some
scalar field $\hat{\Psi}$ which satisfy all three relations of (11.4.74). The situation is not unlike that which
we faced when we looked at static magnetic fields in Remark 10.2.11, only there we had the simpler
task of just choosing a vector field such that the two relations at (10.2.52) hold, whereas here we
must choose both a vector field $\hat{\mathbf{A}}$ and a scalar field $\hat{\Psi}$ such that all three relations in (11.4.74) hold,
obviously a more difficult task. Nevertheless, we will proceed by much the same sort of clever idea
we used in Remark 10.2.11, namely a gauge transformation. To define the gauge transformation we
fix any vector field $\mathbf{A}$ such that (11.4.70) holds and fix any scalar field $\Psi$ such that (11.4.73) holds.
We know that there are actually many vector fields $\mathbf{A}$ such that (11.4.70) holds and many scalar
fields $\Psi$ such that (11.4.73) holds; for now we just make an arbitrary choice from among the myriad
possibilities of some $\mathbf{A}$ satisfying (11.4.70) and some $\Psi$ satisfying (11.4.73), and stick with these
choices from now on (this is similar to what we did in Remark 10.2.11, except that there things
were simpler in that we just had to arbitrarily fix a vector field $\mathbf{A}$ and not worry about a scalar
field $\Psi$). Now suppose that f is a time varying scalar field which satisfies the relation
(11.4.86) $\nabla^2 f - \varepsilon_0 \mu_0 \dfrac{\partial^2 f}{\partial t^2} = -\left[ (\nabla \cdot \mathbf{A}) + \varepsilon_0 \mu_0 \dfrac{\partial \Psi}{\partial t} \right]$,
in which the $\mathbf{A}$ and $\Psi$ on the right hand side are the vector and scalar fields we have just chosen,
and define the time varying vector field $\hat{\mathbf{A}}$ and the time varying scalar field $\hat{\Psi}$ in terms of $\mathbf{A}$, $\Psi$ and
f as follows:
(11.4.87) $\hat{\mathbf{A}} := \mathbf{A} + \nabla f, \quad \hat{\Psi} := \Psi - \dfrac{\partial f}{\partial t}$.
We shall now see that $\hat{\mathbf{A}}$ and $\hat{\Psi}$ defined at (11.4.87) satisfy all relations at (11.4.74). This incredibly
clever transformation of $\mathbf{A}$ into $\hat{\mathbf{A}}$ and $\Psi$ into $\hat{\Psi}$ in terms of the function f satisfying (11.4.86) is
called the Lorentz gauge transformation. We get
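$\nabla \times \hat{\mathbf{A}} = \nabla \times [\mathbf{A} + \nabla f]$ (from (11.4.87))
$= \nabla \times \mathbf{A} + \nabla \times (\nabla f)$ (from (9.1.61))
$= \nabla \times \mathbf{A} + 0$ (from Theorem 9.1.13)
$= \mathbf{B}$ (from (11.4.70)),
that is
(11.4.88) $\mathbf{B} = \nabla \times \hat{\mathbf{A}}$,
so that the first relation of (11.4.74) is established. Moreover
$\dfrac{\partial \hat{\mathbf{A}}}{\partial t} + \nabla \hat{\Psi} = \dfrac{\partial}{\partial t} [\mathbf{A} + \nabla f] + \nabla \left[ \Psi - \dfrac{\partial f}{\partial t} \right]$ (from (11.4.87))
$= \dfrac{\partial \mathbf{A}}{\partial t} + \dfrac{\partial (\nabla f)}{\partial t} + \nabla \Psi - \nabla \left[ \dfrac{\partial f}{\partial t} \right]$
$= \dfrac{\partial \mathbf{A}}{\partial t} + \dfrac{\partial (\nabla f)}{\partial t} + \nabla \Psi - \dfrac{\partial (\nabla f)}{\partial t}$
$= \dfrac{\partial \mathbf{A}}{\partial t} + \nabla \Psi$
$= -\mathbf{E}$ (from (11.4.73)),
that is
(11.4.89) $\mathbf{E} = -\left[ \dfrac{\partial \hat{\mathbf{A}}}{\partial t} + \nabla \hat{\Psi} \right]$,
which gives the second relation of (11.4.74). As for the third relation of (11.4.74), we have
(11.4.90) $(\nabla \cdot \hat{\mathbf{A}}) + \varepsilon_0 \mu_0 \dfrac{\partial \hat{\Psi}}{\partial t} = \nabla \cdot [\mathbf{A} + \nabla f] + \varepsilon_0 \mu_0 \dfrac{\partial}{\partial t} \left[ \Psi - \dfrac{\partial f}{\partial t} \right]$ (from (11.4.87))
$= \nabla \cdot \mathbf{A} + \nabla^2 f + \varepsilon_0 \mu_0 \dfrac{\partial \Psi}{\partial t} - \varepsilon_0 \mu_0 \dfrac{\partial^2 f}{\partial t^2}$
$= 0$ (from (11.4.86)),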
as required.
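The fact that the gauge transformation leaves the fields E and B unchanged can also be seen numerically. The following Python sketch is a rough illustration under stated assumptions: the potentials A and Psi and the function f are hypothetical smooth functions chosen for convenience, all derivatives are approximated by central differences, and the helper names are invented here. Since the first two relations of (11.4.74) hold for every smooth f, the sketch does not require f to solve (11.4.86), and it makes no attempt to check the third relation.

```python
import math

H = 1e-3  # step for central differences

def partial(g, i, p):
    # Central-difference derivative of scalar g along entry i of p = (t, x, y, z).
    hi, lo = list(p), list(p)
    hi[i] += H
    lo[i] -= H
    return (g(*hi) - g(*lo)) / (2.0 * H)

def grad(g, p):
    return tuple(partial(g, i, p) for i in (1, 2, 3))

def component(F, k):
    return lambda *q: F(*q)[k]

def curl(F, p):
    d = lambda k, i: partial(component(F, k), i, p)   # derivative of F_k along coordinate i
    return (d(2, 2) - d(1, 3), d(0, 3) - d(2, 1), d(1, 1) - d(0, 2))

# Hypothetical smooth fields, chosen only for illustration.
def A(t, x, y, z):
    return (math.sin(y - t), x * z * math.cos(t), x * x + y * t)

def Psi(t, x, y, z):
    return x * y * math.exp(-t) + z * z

def f(t, x, y, z):
    return math.sin(x + 2.0 * y) * math.cos(z - t)

# Transformed potentials, following the pattern of (11.4.87).
def A_new(t, x, y, z):
    a, g = A(t, x, y, z), grad(f, (t, x, y, z))
    return tuple(a[k] + g[k] for k in range(3))

def Psi_new(t, x, y, z):
    return Psi(t, x, y, z) - partial(f, 0, (t, x, y, z))

def E_from(potential_A, potential_Psi, p):
    # E = -(dA/dt + grad Psi), as in (11.4.73).
    dA_dt = tuple(partial(component(potential_A, k), 0, p) for k in range(3))
    g = grad(potential_Psi, p)
    return tuple(-dA_dt[k] - g[k] for k in range(3))

p = (0.3, 0.5, -0.2, 1.1)   # a sample point (t, x, y, z)
B_gap = max(abs(a - b) for a, b in zip(curl(A, p), curl(A_new, p)))
E_gap = max(abs(a - b) for a, b in zip(E_from(A, Psi, p), E_from(A_new, Psi_new, p)))
print(B_gap, E_gap)
```

Both gaps should be tiny, since the curl of a gradient vanishes and the time and space derivatives of f commute.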
Chapter 12
Cylindrical and Spherical Coordinates
We are all familiar with the standard Cartesian coordinate system in three dimensional space R3,
in which a vector vvv is expressed in terms of Cartesian coordinates with reference to some fixed set
of mutually orthogonal unit vectors iii, jjj,kkk. Of all the various coordinate systems one can install
in R3 the Cartesian coordinate system is by far the simplest, the most universal and the most
important. There are, nevertheless, some situations for which the Cartesian coordinate system is
not entirely ideal. These typically involve scalar or vector fields which exhibit some kind of inherent
symmetry, such as cylindrical symmetry around a straight axis, or spherical symmetry (also called
radial symmetry) around a fixed point. Such symmetry is an extremely valuable property, which
can enormously simplify the solution of problems, and which therefore should be exploited as much
as possible. The cylindrical and spherical coordinate systems studied here are designed for just
this purpose. An important halfway-house to both of these coordinate systems is the familiar
system of polar coordinates for representing a point in the plane which we recall in the following
section.
12.1 Polar Coordinates
From Figure 12.1 one sees that a point A in the plane can be represented by Cartesian coordinates
comprising a pair of real numbers (x, y) giving the “x-coordinate” and “y-coordinate”. Alternatively,
one can represent the same point A by a distance r from the origin together with an angle
θ relative to the x-axis of the line from the origin to the point A and measured in the
counter-clockwise direction. The point A is again represented by a pair of real numbers, namely (r, θ), but
this pair of course has a very different interpretation from the Cartesian pair (x, y).
Figure 12.1: Polar coordinates
Notice in particular that the Cartesian coordinates (x, y) take values in the range
(12.1.1) −∞ < x <∞, −∞ < y <∞,
while the polar coordinates (r, θ) naturally take values in the range
(12.1.2) 0 ≤ r <∞, 0 ≤ θ < 2π.
In particular, we do not allow the value θ = 2π at (12.1.2) since this merely replicates the case
of θ = 0. The relation between polar and Cartesian coordinates is extremely simple. Indeed, if
point A has polar coordinates (r, θ) with r ≥ 0 and 0 ≤ θ < 2π then the corresponding Cartesian
coordinates (x, y) are of course given by
(12.1.3) x = r cos(θ), y = r sin(θ).
That is, if the point A in the plane is given by the polar coordinates (r, θ) then, in terms of the
Cartesian basis iii, jjj, one sees from (12.1.3) it must be given by the vector
(12.1.4) vvv(r, θ) = r cos(θ)iii+ r sin(θ)jjj.
Effectively, the relation (12.1.4) (i.e. the relations (12.1.3)) tells us how the Cartesian representation
of a point changes when we change its polar coordinates. Conversely, given the Cartesian coordinates
(x, y) of a point, one sees from Figure 12.1 that the corresponding polar coordinates (r, θ) are given
by
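(12.1.5) $r = \sqrt{x^2 + y^2}, \quad \theta = \begin{cases} \arctan(y/x), & \text{when } x > 0 \text{ and } y > 0, \\ \pi + \arctan(y/x), & \text{when } x < 0 \text{ and } -\infty < y < \infty, \\ 2\pi + \arctan(y/x), & \text{when } x > 0 \text{ and } y < 0, \end{cases}$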
(recall that arctan(α) always takes values in the range −π/2 to +π/2 with arctan(α) < 0 when
α < 0).
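The conversions (12.1.3) and (12.1.5) can be sketched in Python as follows. This is one possible implementation, not the only one: instead of the three cases of (12.1.5) it uses the library function math.atan2, which returns an angle in the range −π to π, and then shifts negative angles by 2π so that θ lands in the range (12.1.2). The function names are hypothetical. A side effect of using atan2 is that points on the axes, such as x = 0, are also handled.

```python
import math

def polar_to_cartesian(r, theta):
    # The relations (12.1.3).
    return r * math.cos(theta), r * math.sin(theta)

def cartesian_to_polar(x, y):
    # Same result as (12.1.5) in the three cases listed there.
    r = math.hypot(x, y)
    theta = math.atan2(y, x)      # angle between -pi and pi
    if theta < 0.0:
        theta += 2.0 * math.pi    # shift into the range 0 <= theta < 2*pi
    return r, theta

third_quadrant = cartesian_to_polar(-1.0, -1.0)    # expect theta = pi + arctan(1)
fourth_quadrant = cartesian_to_polar(1.0, -1.0)    # expect theta = 2*pi + arctan(-1)
round_trip = cartesian_to_polar(*polar_to_cartesian(2.0, 4.0))
print(third_quadrant, fourth_quadrant, round_trip)
```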
Remark 12.1.1. When are polar coordinates preferable to Cartesian coordinates? Suppose a
particle of mass M is located at the origin of an x−y Cartesian coordinate system in the plane, and
another particle of mass m moves in the plane under the influence of the gravitational force exerted
by the particle of mass M , and given by Newton’s law of universal gravitation. For instance, one
could think of M as the mass of the earth concentrated at the origin and the particle of mass m
could be a satellite orbiting the earth. Newton’s law of gravitation states that the force FFF is one of
attraction, along the radial line joining the particles, with magnitude given by
(12.1.6) $\|\mathbf{F}\| = \dfrac{k}{r^2}$,
in which r is the distance between the two particles, and k is a constant determined by the masses
M and m. We see that the force is a vector field which is radially symmetric around the origin. In
particular, this force depends very simply on the radial distance r and has the wonderfully nice
property that its magnitude does not depend at all on the angle θ (see Figure 12.2). In this kind
of situation one will always use polar coordinates in preference to Cartesian coordinates, in order
to take advantage of the radial symmetry of the force field FFF . In particular, by working in polar
coordinates, and using the laws of classical mechanics, it is actually quite easy to show that the
particle of mass m follows an orbit in the plane in which the polar coordinates (r, θ) of the particle
are related by
(12.1.7) $r = \dfrac{c}{1 + \varepsilon \cos(\theta)}$;
here c and ε are constants depending on the masses m and M (the constant ε is called the eccentricity
of the orbit). Equation (12.1.7) is known as Kepler’s first law and is a basic result in orbital
mechanics which tells us that the particle of mass m moves along a curve which is either an ellipse,
a parabola or a hyperbola (corresponding to ε < 1, ε = 1 and ε > 1 respectively). Kepler’s first law
would be extremely difficult to derive - and even just to write down - in Cartesian coordinates.
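As a numerical illustration of (12.1.7), the following Python sketch evaluates r around one orbit for hypothetical constants c = 2 and ε = 0.5, and checks the defining property of an ellipse, namely that the sum of the distances from each orbit point to the two foci is constant. The helper names and the sample constants are assumptions, and the sketch relies on two standard facts about ellipses that are not derived in these notes: the semi-major axis is a = c/(1 − ε²), and with one focus at the origin the other focus lies at (−2aε, 0).

```python
import math

def orbit_radius(theta, c, eps):
    # Radius from (12.1.7): r = c / (1 + eps*cos(theta)).
    return c / (1.0 + eps * math.cos(theta))

def conic_type(eps):
    # Classification of the orbit by eccentricity, as stated in the text.
    if eps < 1.0:
        return "ellipse"
    if eps == 1.0:
        return "parabola"
    return "hyperbola"

c, eps = 2.0, 0.5                      # hypothetical orbit constants
a = c / (1.0 - eps * eps)              # semi-major axis (standard ellipse fact)
second_focus = (-2.0 * a * eps, 0.0)   # the first focus is the origin

focal_sums = []
for k in range(12):
    theta = 2.0 * math.pi * k / 12
    r = orbit_radius(theta, c, eps)
    x, y = r * math.cos(theta), r * math.sin(theta)
    # Distance to the origin is r; add the distance to the second focus.
    focal_sums.append(r + math.hypot(x - second_focus[0], y - second_focus[1]))

print(conic_type(eps), 2.0 * a, focal_sums)
```

Every entry of focal_sums should agree with 2a up to rounding, which is consistent with the orbit being an ellipse when ε < 1.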
Figure 12.2: Radially symmetric gravitational force field in the plane
Remark 12.1.2. The standard Cartesian unit vectors iii and jjj enable one to express any vector
vvv = (x, y) in the plane in the Cartesian form
(12.1.8) vvv = xiii+ yjjj.
When we use polar coordinates then these Cartesian vectors are no longer very appropriate and we
must develop alternative “standard” vectors which are better suited to polar coordinates. To this
end fix some point A in the plane with polar coordinates (r0, θ0) (see Figure 12.3) and use (12.1.4)
to define the path
(12.1.9) $\boldsymbol{\gamma}_1(r) := \mathbf{v}(r, \theta_0) = r \cos(\theta_0)\mathbf{i} + r \sin(\theta_0)\mathbf{j}$, for all $0 \le r < \infty$,
in which θ0 is held fixed and the parametric variable is r (c.f. Definition 4.2.1). Then the curve Γ1
of this path is the straight line from the origin passing through A shown in Figure 12.3. Similarly
to (12.1.9), define the path
(12.1.10) $\boldsymbol{\gamma}_2(\theta) := \mathbf{v}(r_0, \theta) = r_0 \cos(\theta)\mathbf{i} + r_0 \sin(\theta)\mathbf{j}$, for all $0 \le \theta < 2\pi$,
in which r0 is held fixed and θ is the parametric variable. Clearly the curve Γ2 of this path is the
circle of radius r0 passing through A in a counter-clockwise direction shown in Figure 12.3.
Figure 12.3: Curves Γ1 and Γ2 and basis vectors $\mathbf{e}_r(r_0, \theta_0)$ and $\mathbf{e}_\theta(r_0, \theta_0)$
We now define the tangent to the curve Γ1 at the point A, namely
(12.1.11) $\boldsymbol{\gamma}_1^{(1)}(r_0) := \dfrac{d\boldsymbol{\gamma}_1}{dr}(r) \Big|_{r = r_0}$.
From (12.1.11) and (12.1.9) we find
(12.1.12) $\boldsymbol{\gamma}_1^{(1)}(r_0) = \cos(\theta_0)\mathbf{i} + \sin(\theta_0)\mathbf{j}$,
in particular $\boldsymbol{\gamma}_1^{(1)}(r_0)$ is in the direction of the straight line joining 0 to A. Now let $\mathbf{e}_r(r_0, \theta_0)$ denote
the unit vector with the same direction as $\boldsymbol{\gamma}_1^{(1)}(r_0)$ namely
(12.1.13) $\mathbf{e}_r(r_0, \theta_0) := \dfrac{\boldsymbol{\gamma}_1^{(1)}(r_0)}{\left\| \boldsymbol{\gamma}_1^{(1)}(r_0) \right\|} = \cos(\theta_0)\mathbf{i} + \sin(\theta_0)\mathbf{j}$,
in which the final equality follows from (12.1.12) since it is clear that $\left\| \boldsymbol{\gamma}_1^{(1)}(r_0) \right\| = 1$. Again,
$\mathbf{e}_r(r_0, \theta_0)$ is in the direction of the straight line joining 0 to A (see Figure 12.3).
Similarly to (12.1.11) we can also define the tangent to the curve Γ2 at the point A, namely
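(12.1.14) $\boldsymbol{\gamma}_2^{(1)}(\theta_0) := \dfrac{d\boldsymbol{\gamma}_2}{d\theta}(\theta) \Big|_{\theta = \theta_0} = -r_0 \sin(\theta_0)\mathbf{i} + r_0 \cos(\theta_0)\mathbf{j}$,
in which the last equality at (12.1.14) follows from (12.1.10). Now let $\mathbf{e}_\theta(r_0, \theta_0)$ denote the unit
vector with the same direction as $\boldsymbol{\gamma}_2^{(1)}(\theta_0)$ namely
(12.1.15) $\mathbf{e}_\theta(r_0, \theta_0) := \dfrac{\boldsymbol{\gamma}_2^{(1)}(\theta_0)}{\left\| \boldsymbol{\gamma}_2^{(1)}(\theta_0) \right\|} = -\sin(\theta_0)\mathbf{i} + \cos(\theta_0)\mathbf{j}$,
in which the final equality follows from (12.1.14) since it is clear that $\left\| \boldsymbol{\gamma}_2^{(1)}(\theta_0) \right\| = r_0$ (with $r_0 > 0$). Since $\boldsymbol{\gamma}_2^{(1)}(\theta_0)$
is tangent to the curve Γ2 at point A so also is the vector $\mathbf{e}_\theta(r_0, \theta_0)$ (see Figure 12.3). We see from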
Figure 12.3 that “attached” to the point A with polar coordinates (r0, θ0) is a pair of orthogonal
basis vectors eeer(r0, θ0), eeeθ(r0, θ0) called the coordinate frame at the point A. Notice that these
basis vectors change direction (but of course not the unit length) as the point (r0, θ0) moves (see
Figure 12.4), so that we have a moving coordinate frame. Put another way
the basis vectors eeer(r, θ), eeeθ(r, θ) are functions of the point (r, θ) to which they are attached.
This is in direct contrast to the Cartesian unit basis vectors iii, jjj which of course have a constant
direction parallel to the x and y axes respectively.
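The properties of the moving coordinate frame at (12.1.13) and (12.1.15) are easy to confirm numerically. In the following Python sketch the function names and the sample angles are hypothetical; it checks that the two basis vectors have unit length and are orthogonal at several angles, and that an arbitrary vector in the plane is recovered from its components along them.

```python
import math

def e_r(theta):
    # Radial basis vector of (12.1.13), as Cartesian components (i, j).
    return (math.cos(theta), math.sin(theta))

def e_theta(theta):
    # Angular basis vector of (12.1.15), as Cartesian components (i, j).
    return (-math.sin(theta), math.cos(theta))

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

checks = []
for theta in (0.0, 0.3, 2.0, 5.0):
    er, et = e_r(theta), e_theta(theta)
    # Each entry: (length^2 of e_r, length^2 of e_theta, e_r . e_theta).
    checks.append((dot(er, er), dot(et, et), dot(er, et)))

# Express an arbitrary vector w in the frame attached at angle 0.3, then rebuild it.
w = (1.2, -0.7)
er, et = e_r(0.3), e_theta(0.3)
w_r, w_theta = dot(w, er), dot(w, et)
rebuilt = (w_r * er[0] + w_theta * et[0], w_r * er[1] + w_theta * et[1])
print(checks, rebuilt)
```

The basis vectors depend only on the angle θ and not on r, which is why the functions take a single argument.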
For later reference we rewrite the relations (12.1.13) and (12.1.15) but replacing the generic
polar coordinates (r0, θ0) with (r, θ) (just to lighten the notation):
(12.1.16) $\mathbf{e}_r(r, \theta) = \cos(\theta)\mathbf{i} + \sin(\theta)\mathbf{j}$,
(12.1.17) $\mathbf{e}_\theta(r, \theta) = -\sin(\theta)\mathbf{i} + \cos(\theta)\mathbf{j}$.
Remark 12.1.3. Another important question deals with how length changes when we change the
polar coordinates. To fix ideas we first look at this question in the simpler setting of Cartesian
coordinates. Suppose point A has Cartesian coordinates (x, y) and we make small perturbations
dx and dy in the coordinates to get the point B with Cartesian coordinates (x+ dx, y + dy) (see
Figure 12.5).
Figure 12.4: Coordinate frame eeer, eeeθ at the points (r0, θ0) and (r1, θ1)
We must determine the small distance ds between A and B. Of course, the answer is immediate
from Pythagoras, namely
$ds = \sqrt{(dx)^2 + (dy)^2}$,
or, as we shall usually write in order to get rid of the awkward square-root sign,
(12.1.18) $(ds)^2 = (dx)^2 + (dy)^2$.
We now consider the same question, but in polar coordinates. That is, suppose point A has polar
coordinates (r, θ) and we make small perturbations dr and dθ in the polar coordinates to get the
point B with polar coordinates (r + dr, θ + dθ) (see Figure 12.6). Again, we must determine the
resulting small distance ds between A and B. Clearly AD is of length dr, and since dθ is small
the circular arc AC is effectively a straight line with length given by r dθ. Moreover, again since
dθ is small, we see that AC and AD are effectively orthogonal, so that ADBCA is effectively a
rectangle. We summarize the situation as follows:
(12.1.19) length of AD is dr, length of AC is r dθ, and ADBCA is a rectangle.
Figure 12.5: Perturbations in Cartesian coordinates
It is now immediate from (12.1.19) and Pythagoras that
(12.1.20) $(ds)^2 = (\text{length of } AB)^2 = (\text{length of } AD)^2 + (\text{length of } AC)^2 = (dr)^2 + r^2 (d\theta)^2$.
For later reference we customarily write (12.1.20) in the seemingly more complicated Riemannian
form
(12.1.21) $(ds)^2 = [h_r(r, \theta)\, dr]^2 + [h_\theta(r, \theta)\, d\theta]^2$
in which hr and hθ are the so-called Riemannian scale functions defined (in this case) by
(12.1.22) hr(r, θ) := 1, hθ(r, θ) := r.
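The approximation behind (12.1.20) can be tested numerically by comparing the line element with the exact straight-line distance between the two points. In the following Python sketch the sample point, the sizes of the perturbations and the function names are assumptions; the scale functions are those of (12.1.22).

```python
import math

def to_cartesian(r, theta):
    return r * math.cos(theta), r * math.sin(theta)

def exact_distance(r, theta, dr, dtheta):
    # Straight-line distance between the points (r, theta) and (r + dr, theta + dtheta).
    x0, y0 = to_cartesian(r, theta)
    x1, y1 = to_cartesian(r + dr, theta + dtheta)
    return math.hypot(x1 - x0, y1 - y0)

def line_element(r, theta, dr, dtheta):
    # ds from (12.1.21) with the scale functions h_r = 1 and h_theta = r.
    h_r, h_theta = 1.0, r
    return math.sqrt((h_r * dr) ** 2 + (h_theta * dtheta) ** 2)

r, theta = 2.0, 0.7          # sample point
dr, dtheta = 1.0e-4, 2.0e-4  # small perturbations

ds_exact = exact_distance(r, theta, dr, dtheta)
ds_approx = line_element(r, theta, dr, dtheta)
rel_error = abs(ds_exact - ds_approx) / ds_exact
print(ds_exact, ds_approx, rel_error)
```

The relative error is small but not zero, and it decreases further as the perturbations are made smaller, in line with the word “effectively” used in the argument above.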
12.2 Cylindrical Coordinates
Having summarized the main aspects of plane polar coordinates in the preceding section we are
now ready to look at coordinate systems in three dimensional space R3. Of course we already have
the familiar Cartesian coordinate system for which the standard basis vectors are the triple iii, jjj,kkk
Figure 12.6: Perturbations in polar coordinates
of orthogonal unit vectors parallel to the x, y and z-axes respectively, in which a generic vector vvv
is represented in the form
(12.2.23) vvv = xiii+ yjjj + zkkk,
or, equivalently, by the triplet of Cartesian coordinates (x, y, z), in which x, y and z are real scalars
in the range
(12.2.24) −∞ < x <∞, −∞ < y <∞, −∞ < z <∞.
Although the Cartesian coordinate system is extraordinarily useful it nevertheless fails to take ad-
vantage of any symmetries that may be available as part of a problem. We would like to make the
most of such symmetries (when present) for symmetry can hugely simplify the solution of the prob-
lem. For this reason we introduce two coordinate systems in R3 namely the cylindrical coordinate
system and the spherical coordinate system. The cylindrical coordinate system takes advantage of
any symmetry about an axis, nearly always chosen (just for convenience) to be the z-axis, while
the inherently more complex spherical coordinate system takes advantage of any radial symmetry
around the origin of R3.
We begin with the simpler case of a cylindrical coordinate system. Suppose that A is a point
in R3 with Cartesian coordinates (x, y, z). Then the pair (x, y) gives the Cartesian coordinates of
the point B in the x− y-plane (see Figure 12.7).
Figure 12.7: Cylindrical coordinates
Let (r, θ) be the polar coordinates of the point B; then of course r and θ are given in terms of the
Cartesian coordinates (x, y) of B by (12.1.5), repeated here for convenience as follows:
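(12.2.25) $r = \sqrt{x^2 + y^2}, \quad \theta = \begin{cases} \arctan(y/x), & \text{when } x > 0 \text{ and } y > 0, \\ \pi + \arctan(y/x), & \text{when } x < 0 \text{ and } -\infty < y < \infty, \\ 2\pi + \arctan(y/x), & \text{when } x > 0 \text{ and } y < 0. \end{cases}$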
It is clear from Figure 12.7 that the triplet of real numbers (r, θ, z) completely specifies the point
A; this triplet constitutes the cylindrical coordinates of the point A. Clearly these cylindrical
coordinates naturally take values in the range
(12.2.26) 0 ≤ r <∞, 0 ≤ θ < 2π, −∞ < z <∞.
We see that the parameters r and θ in the cylindrical coordinates (r, θ, z) are given in terms of the
Cartesian coordinates (x, y, z) of a point A by (12.2.25). Conversely, if one is given the cylindrical
coordinates (r, θ, z) of a point A then the parameters x and y in the corresponding Cartesian
coordinates (x, y, z) must be given by
(12.2.27) x = r cos(θ), y = r sin(θ).
Put another way, if the point A is given by the cylindrical coordinates (r, θ, z) then, in terms of the
Cartesian basis $\mathbf{i}, \mathbf{j}, \mathbf{k}$, one sees from (12.2.27) that it must be given by the vector
(12.2.28) $\mathbf{v}(r,\theta,z) = r\cos(\theta)\,\mathbf{i} + r\sin(\theta)\,\mathbf{j} + z\,\mathbf{k}.$
Effectively, the relation (12.2.28) (equivalently the relations (12.2.27)) tells us how the Cartesian
representation of a point changes when we change its cylindrical coordinates.
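As a minimal sketch of (12.2.25) and (12.2.27), the following Python functions convert between Cartesian and cylindrical coordinates. The function names are illustrative and not part of these notes. The sketch uses `math.atan2`, reduced modulo 2π, in place of the case analysis in (12.2.25); the two give the same angle wherever (12.2.25) is defined.

```python
import math

def cartesian_to_cylindrical(x, y, z):
    """Return (r, theta, z) with r >= 0 and theta reduced modulo 2*pi."""
    r = math.hypot(x, y)                         # r = sqrt(x^2 + y^2), as in (12.2.25)
    theta = math.atan2(y, x) % (2.0 * math.pi)   # polar angle of (x, y), shifted to be non-negative
    return r, theta, z

def cylindrical_to_cartesian(r, theta, z):
    """Return (x, y, z) using x = r cos(theta), y = r sin(theta), as in (12.2.27)."""
    return r * math.cos(theta), r * math.sin(theta), z

# Illustrative point with x < 0 and y > 0 (second case of (12.2.25)).
r, theta, z = cartesian_to_cylindrical(-1.0, 1.0, 2.0)
x, y, z_back = cylindrical_to_cartesian(r, theta, z)
```

For the point (−1, 1, 2) this gives r = √2 and θ = 3π/4, and converting back recovers the original Cartesian coordinates up to rounding.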
We now construct a triple of orthogonal basis vectors for cylindrical coordinates which are
an analog (indeed a simple extension) of the moving coordinate frame $\mathbf{e}_r(r,\theta)$, $\mathbf{e}_\theta(r,\theta)$ that we
constructed in Remark 12.1.2 for polar coordinates. We proceed exactly as we did in Remark 12.1.2,
that is, fix some point A in $\mathbb{R}^3$ with the cylindrical coordinates $(r_0,\theta_0,z_0)$. Then (by analogy with
(12.1.9)) define the path
(12.2.29) $\boldsymbol{\gamma}_1(r) := \mathbf{v}(r,\theta_0,z_0) = r\cos(\theta_0)\,\mathbf{i} + r\sin(\theta_0)\,\mathbf{j} + z_0\,\mathbf{k}, \quad \text{for all } 0 \le r < \infty,$
in which (θ, z) in (12.2.28) is held fixed at (θ0, z0), and r is the parametric variable. The curve Γ1
of this path is clearly the straight line passing through A and parallel to the line OB (see Figure
12.8 which is adapted from Mathematica).
Likewise (c.f. (12.1.10)), define the path
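(12.2.30) $\boldsymbol{\gamma}_2(\theta) := \mathbf{v}(r_0,\theta,z_0) = r_0\cos(\theta)\,\mathbf{i} + r_0\sin(\theta)\,\mathbf{j} + z_0\,\mathbf{k}, \quad \text{for all } 0 \le \theta < 2\pi,$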
in which (r, z) in (12.2.28) is held fixed at (r0, z0), and θ is the parametric variable. Clearly the
curve Γ2 of this path is the circle of radius r0, lying parallel to the x − y-plane at the “height” z0
(see Figure 12.8). Finally, define the path
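(12.2.31) $\boldsymbol{\gamma}_3(z) := \mathbf{v}(r_0,\theta_0,z) = r_0\cos(\theta_0)\,\mathbf{i} + r_0\sin(\theta_0)\,\mathbf{j} + z\,\mathbf{k}, \quad \text{for all } -\infty < z < \infty,$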
in which (r, θ) in (12.2.28) is held fixed at (r0, θ0), and z is the parametric variable. Clearly the
curve Γ3 of this path is the straight line passing through the point B and parallel to the z-axis (see
Figure 12.8). Exactly as at (12.1.11), we define the tangent to the curve Γ1 at the point A, namely
(12.2.32) $\boldsymbol{\gamma}_1^{(1)}(r_0) := \dfrac{d\boldsymbol{\gamma}_1}{dr}(r)\Big|_{r=r_0} = \cos(\theta_0)\,\mathbf{i} + \sin(\theta_0)\,\mathbf{j} + 0\,\mathbf{k},$
Figure 12.8: Curves Γ1, Γ2, Γ3, and basis vectors $\mathbf{e}_r(r_0,\theta_0,z_0)$, $\mathbf{e}_\theta(r_0,\theta_0,z_0)$, $\mathbf{e}_z(r_0,\theta_0,z_0)$
in which the last equality follows from (12.2.29). In the same way, from (12.2.30), we define the
tangent to the curve Γ2 at the point A, that is
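(12.2.33) $\boldsymbol{\gamma}_2^{(1)}(\theta_0) := \dfrac{d\boldsymbol{\gamma}_2}{d\theta}(\theta)\Big|_{\theta=\theta_0} = -r_0\sin(\theta_0)\,\mathbf{i} + r_0\cos(\theta_0)\,\mathbf{j} + 0\,\mathbf{k},$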
and, from (12.2.31), we define the tangent to the curve Γ3 at the point A, that is
(12.2.34) $\boldsymbol{\gamma}_3^{(1)}(z_0) := \dfrac{d\boldsymbol{\gamma}_3}{dz}(z)\Big|_{z=z_0} = 0\,\mathbf{i} + 0\,\mathbf{j} + \mathbf{k}.$
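The three tangent vectors (12.2.32), (12.2.33) and (12.2.34) can be checked numerically. The sketch below is an added illustration, not part of the original derivation: it differentiates the paths (12.2.29), (12.2.30) and (12.2.31) by a symmetric difference quotient at an arbitrarily chosen point A and keeps the closed-form tangents alongside for comparison. The helper names, the step size `h` and the sample point are all assumptions made for the example.

```python
import math

def v(r, theta, z):
    """Position vector (12.2.28) as a Cartesian tuple (x, y, z)."""
    return (r * math.cos(theta), r * math.sin(theta), z)

def central_difference(path, t0, h=1e-6):
    """Approximate d(path)/dt at t0 by a symmetric difference quotient."""
    forward, backward = path(t0 + h), path(t0 - h)
    return tuple((f - b) / (2.0 * h) for f, b in zip(forward, backward))

r0, theta0, z0 = 2.0, 0.7, -1.0   # arbitrary illustrative point A

# Numerical tangents along the three coordinate curves through A.
tangent_1 = central_difference(lambda r: v(r, theta0, z0), r0)
tangent_2 = central_difference(lambda th: v(r0, th, z0), theta0)
tangent_3 = central_difference(lambda z: v(r0, theta0, z), z0)

# Closed-form tangents from (12.2.32), (12.2.33) and (12.2.34).
exact_1 = (math.cos(theta0), math.sin(theta0), 0.0)
exact_2 = (-r0 * math.sin(theta0), r0 * math.cos(theta0), 0.0)
exact_3 = (0.0, 0.0, 1.0)
```

Each numerical tangent should agree with its closed-form counterpart up to the error of the difference quotient.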
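Exactly as at (12.1.13) and (12.1.15), we define unit vectors $\mathbf{e}_r(r_0,\theta_0,z_0)$, $\mathbf{e}_\theta(r_0,\theta_0,z_0)$ and $\mathbf{e}_z(r_0,\theta_0,z_0)$,
having the same direction as $\boldsymbol{\gamma}_1^{(1)}(r_0)$, $\boldsymbol{\gamma}_2^{(1)}(\theta_0)$ and $\boldsymbol{\gamma}_3^{(1)}(z_0)$ respectively, that is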
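(12.2.35) $\mathbf{e}_r(r_0,\theta_0,z_0) := \dfrac{\boldsymbol{\gamma}_1^{(1)}(r_0)}{\big\|\boldsymbol{\gamma}_1^{(1)}(r_0)\big\|} = \cos(\theta_0)\,\mathbf{i} + \sin(\theta_0)\,\mathbf{j} + 0\,\mathbf{k},$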
in which the final equality follows from (12.2.32) since it is clear that $\big\|\boldsymbol{\gamma}_1^{(1)}(r_0)\big\| = 1$. Similarly,
from (12.2.33), we have $\big\|\boldsymbol{\gamma}_2^{(1)}(\theta_0)\big\| = r_0$, and therefore
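(12.2.36) $\mathbf{e}_\theta(r_0,\theta_0,z_0) := \dfrac{\boldsymbol{\gamma}_2^{(1)}(\theta_0)}{\big\|\boldsymbol{\gamma}_2^{(1)}(\theta_0)\big\|} = -\sin(\theta_0)\,\mathbf{i} + \cos(\theta_0)\,\mathbf{j} + 0\,\mathbf{k},$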
and, from (12.2.34)
(12.2.37) $\mathbf{e}_z(r_0,\theta_0,z_0) := \dfrac{\boldsymbol{\gamma}_3^{(1)}(z_0)}{\big\|\boldsymbol{\gamma}_3^{(1)}(z_0)\big\|} = 0\,\mathbf{i} + 0\,\mathbf{j} + \mathbf{k}.$
Now we know from (12.2.32) that $\boldsymbol{\gamma}_1^{(1)}(r_0)$ is tangent to the curve Γ1 at point A, so it follows from
(12.2.35) that
(12.2.38) the unit vector $\mathbf{e}_r(r_0,\theta_0,z_0)$ is tangent to curve Γ1 at A,
(see Figure 12.8). Similarly, from (12.2.33) and (12.2.36), and from (12.2.34) and (12.2.37), we have
(12.2.39) the unit vector $\mathbf{e}_\theta(r_0,\theta_0,z_0)$ is tangent to curve Γ2 at A,
(12.2.40) the unit vector $\mathbf{e}_z(r_0,\theta_0,z_0)$ is tangent to curve Γ3 at A,
(see Figure 12.8). Moreover, calculating the inner product (or “dot product”) of $\mathbf{e}_r(r_0,\theta_0,z_0)$ with
$\mathbf{e}_\theta(r_0,\theta_0,z_0)$ using (12.2.35) and (12.2.36) we get
(12.2.41) $\mathbf{e}_r(r_0,\theta_0,z_0) \cdot \mathbf{e}_\theta(r_0,\theta_0,z_0) = -\sin(\theta_0)\cos(\theta_0) + \sin(\theta_0)\cos(\theta_0) = 0.$
Similarly, from (12.2.35), (12.2.36) and (12.2.37) we find
(12.2.42) $\mathbf{e}_r(r_0,\theta_0,z_0) \cdot \mathbf{e}_z(r_0,\theta_0,z_0) = \mathbf{e}_\theta(r_0,\theta_0,z_0) \cdot \mathbf{e}_z(r_0,\theta_0,z_0) = 0.$
From (12.2.41) and (12.2.42) it follows that, for each and every (r0, θ0, z0), we have
(12.2.43) $\mathbf{e}_r(r_0,\theta_0,z_0)$, $\mathbf{e}_\theta(r_0,\theta_0,z_0)$, $\mathbf{e}_z(r_0,\theta_0,z_0)$ is a triplet of mutually orthogonal unit vectors.
We see from Figure 12.8 that the triplet of orthogonal unit vectors $\mathbf{e}_r(r_0,\theta_0,z_0)$, $\mathbf{e}_\theta(r_0,\theta_0,z_0)$, $\mathbf{e}_z(r_0,\theta_0,z_0)$
is “attached” to the point A with cylindrical coordinates $(r_0,\theta_0,z_0)$ and constitutes a coordinate
frame at the point A. Notice that these basis vectors change direction (but not of course the unit
length) as the point A moves, so that (exactly as for the case of polar coordinates) we have a moving
coordinate frame. This is in direct contrast to the Cartesian unit basis vectors $\mathbf{i}, \mathbf{j}, \mathbf{k}$ which of
course have a constant direction parallel to the x, y and z axes respectively. For later reference
we rewrite the relations (12.2.35), (12.2.36) and (12.2.37), but replacing the generic cylindrical
coordinates (r0, θ0, z0) with (r, θ, z) (to lighten the notation):
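(12.2.44) $\mathbf{e}_r(r,\theta,z) = \cos(\theta)\,\mathbf{i} + \sin(\theta)\,\mathbf{j} + 0\,\mathbf{k},$
(12.2.45) $\mathbf{e}_\theta(r,\theta,z) = -\sin(\theta)\,\mathbf{i} + \cos(\theta)\,\mathbf{j} + 0\,\mathbf{k},$
(12.2.46) $\mathbf{e}_z(r,\theta,z) = 0\,\mathbf{i} + 0\,\mathbf{j} + 1\,\mathbf{k},$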
(c.f. (12.1.16) and (12.1.17) for similar relations in the case of polar coordinates).
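To make the idea of a moving coordinate frame concrete, the following sketch evaluates (12.2.44), (12.2.45) and (12.2.46) at two different angles and forms the matrix of pairwise dot products, which should be the identity for an orthonormal triple. The function names and the two sample angles are assumptions chosen for illustration only.

```python
import math

def cylindrical_frame(theta):
    """Return (e_r, e_theta, e_z) from (12.2.44)-(12.2.46) as Cartesian tuples."""
    e_r = (math.cos(theta), math.sin(theta), 0.0)
    e_theta = (-math.sin(theta), math.cos(theta), 0.0)
    e_z = (0.0, 0.0, 1.0)
    return e_r, e_theta, e_z

def dot(a, b):
    """Ordinary dot product of two Cartesian tuples."""
    return sum(p * q for p, q in zip(a, b))

def gram_matrix(frame):
    """Matrix of all pairwise dot products of the frame vectors."""
    return [[dot(a, b) for b in frame] for a in frame]

frame_a = cylindrical_frame(0.0)            # frame at any point with theta = 0
frame_b = cylindrical_frame(math.pi / 2.0)  # frame at any point with theta = pi/2

gram_a = gram_matrix(frame_a)
gram_b = gram_matrix(frame_b)
```

Both matrices come out as the identity up to rounding, in agreement with (12.2.43). The frames themselves differ: the first component of `frame_a` points along the x-axis while that of `frame_b` points along the y-axis (up to rounding), and the third component is the same in both.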
In Remark 12.1.3 we established an expression in Riemannian form for the change ds in distance in
the plane resulting from a small change in the polar coordinates (see (12.1.21) and (12.1.22)). We
are now going to establish an analogous expression for the change ds in three dimensional space
resulting from a small change in cylindrical coordinates.
Suppose point A has cylindrical coordinates (r, θ, z) and we make small perturbations dr, dθ and
dz in the cylindrical coordinates to get point B with cylindrical coordinates (r+ dr, θ+ dθ, z+ dz)
(see Figure 12.9). We must determine the distance ds between A and B.
From Figure 12.9 we see that the straight line AD has length dr, and since dθ is small the circular
arc AC is effectively a straight line with length given by r dθ. Moreover, the straight line AE clearly
has length dz. Also, it is clear from Figure 12.9 that AD is collinear with the unit vector $\mathbf{e}_r(r,\theta,z)$,
AC is collinear with the unit vector $\mathbf{e}_\theta(r,\theta,z)$, and AE is collinear with the unit vector $\mathbf{e}_z(r,\theta,z)$.
Put another way,
(12.2.47) $AD = \mathbf{e}_r(r,\theta,z)\,(dr), \quad AC = \mathbf{e}_\theta(r,\theta,z)\,(r\,d\theta), \quad AE = \mathbf{e}_z(r,\theta,z)\,(dz),$
and, again from Figure 12.9, we see that
(12.2.48) $AB = \mathbf{e}_r(r,\theta,z)\,(dr) + \mathbf{e}_\theta(r,\theta,z)\,(r\,d\theta) + \mathbf{e}_z(r,\theta,z)\,(dz).$
In view of (12.2.48), (12.2.43), and Pythagoras, we get
(12.2.49) $(ds)^2 := (\text{length of } AB)^2 = (dr)^2 + r^2\,(d\theta)^2 + (dz)^2.$
Figure 12.9: Perturbations in cylindrical coordinates
Exactly as at (12.1.21) for polar coordinates we can put (12.2.49) into Riemannian form, that is
(12.2.50) $(ds)^2 = [h_r(r,\theta,z)\,dr]^2 + [h_\theta(r,\theta,z)\,d\theta]^2 + [h_z(r,\theta,z)\,dz]^2,$
in which $h_r$, $h_\theta$ and $h_z$ are the Riemannian scale functions defined by
(12.2.51) $h_r(r,\theta,z) := 1, \quad h_\theta(r,\theta,z) := r, \quad h_z(r,\theta,z) := 1.$
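As a quick check of the scale functions, one can move along each of the curves Γ1, Γ2 and Γ3 of Figure 12.8 in turn, taking the increments to be positive. The short calculation below is an added illustration rather than part of the original derivation:

```latex
\begin{align*}
\text{along } \Gamma_1 \ (d\theta = dz = 0): &\quad ds = h_r\,dr = dr, \\
\text{along } \Gamma_2 \ (dr = dz = 0): &\quad ds = h_\theta\,d\theta = r_0\,d\theta,
  \quad \text{so the length of } \Gamma_2 \text{ is } \int_0^{2\pi} r_0\,d\theta = 2\pi r_0, \\
\text{along } \Gamma_3 \ (dr = d\theta = 0): &\quad ds = h_z\,dz = dz.
\end{align*}
```

The middle line recovers the familiar circumference of a circle of radius r0, which is what one expects since Γ2 is exactly such a circle.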
12.3 Spherical Coordinates
The preceding completes our introduction to the main aspects of cylindrical coordinates in three
dimensional space, and we now move on to the spherical coordinate system in three dimensional
space $\mathbb{R}^3$. Suppose that A is a point in $\mathbb{R}^3$ with Cartesian coordinates (x, y, z). Then the pair (x, y)
gives the Cartesian coordinates of the point B in the x-y plane (see Figure 12.10).
Let (r, θ) be the polar coordinates of the point B; then of course r and θ are given in terms of the
Figure 12.10: Spherical coordinates
Cartesian coordinates (x, y) of B by (12.1.5), repeated here as
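(12.3.52) $r = \sqrt{x^2 + y^2}, \qquad \theta = \begin{cases}
\arctan(y/x), & \text{when } x > 0 \text{ and } y > 0, \\
\pi + \arctan(y/x), & \text{when } x < 0 \text{ and } -\infty < y < \infty, \\
2\pi + \arctan(y/x), & \text{when } x > 0 \text{ and } y < 0.
\end{cases}$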
Moreover, from Pythagoras, the radial length of the vector from the origin O to point A is
(12.3.53) ρ =√x2 + y2 + z2,
and it follows from the right-angle triangle OAC in Figure 12.10 that the angle φ between the
positive z-axis and the radial vector OA is related to r and ρ by
(12.3.54) $\sin(\varphi) = \dfrac{r}{\rho}$, that is $r = \rho\sin(\varphi)$.
Moreover, again from the right-angle triangle OAC in Figure 12.10, we also have
(12.3.55) $\cos(\varphi) = \dfrac{z}{\rho}$, that is $z = \rho\cos(\varphi)$.
Combining the first relation of (12.3.55) and (12.3.53) then gives
(12.3.56) $\varphi = \arccos\left(\dfrac{z}{\sqrt{x^2 + y^2 + z^2}}\right).$
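A short worked example may help fix the formulas (12.3.52), (12.3.53) and (12.3.56). The point chosen below is an arbitrary illustration and does not come from the original notes:

```latex
\begin{align*}
(x, y, z) &= (1,\ 1,\ \sqrt{2}), \\
r &= \sqrt{1^2 + 1^2} = \sqrt{2}, \\
\rho &= \sqrt{1^2 + 1^2 + (\sqrt{2})^2} = 2, \\
\theta &= \arctan(1/1) = \frac{\pi}{4} \quad (\text{since } x > 0 \text{ and } y > 0), \\
\varphi &= \arccos\!\left(\frac{\sqrt{2}}{2}\right) = \frac{\pi}{4}.
\end{align*}
```

As a consistency check, (12.3.54) gives ρ sin(φ) = 2 · (√2/2) = √2, which agrees with the value of r computed directly.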
It is clear from Figure 12.10 that the triplet of real numbers (ρ, θ, φ) completely specifies the point A;
this triplet constitutes the spherical coordinates of the point A. Clearly these spherical coordinates
naturally take values in the range
(12.3.57) 0 ≤ ρ <∞, 0 ≤ θ < 2π, 0 ≤ φ ≤ π.
We see then that the spherical coordinates (ρ, θ, φ) are given in terms of the Cartesian coordinates
(x, y, z) of a point A by (12.3.53), the second relation of (12.3.52) and (12.3.56) respectively.
Suppose, conversely, that one is given the spherical coordinates (ρ, θ, φ) of a point A; how does one
determine the corresponding Cartesian coordinates (x, y, z)? From the right-angle triangle ODB in
Figure 12.10 we see that
(12.3.58) x = r cos(θ), y = r sin(θ).
Upon combining (12.3.58) with the second relation of (12.3.54) we see that the Cartesian coordinates
(x, y, z) are given in terms of the spherical coordinates (ρ, θ, φ) by
(12.3.59) x = ρ sin(φ) cos(θ), y = ρ sin(φ) sin(θ), z = ρ cos(φ),
(the last relation of (12.3.59) follows from the last relation of (12.3.55)). Put another way, if the
point A is given by the spherical coordinates (ρ, θ, φ) then, in terms of the Cartesian basis $\mathbf{i}, \mathbf{j}, \mathbf{k}$,
one sees from (12.3.59) that it must be given by the vector
(12.3.60) $\mathbf{v}(\rho,\theta,\varphi) = \rho\sin(\varphi)\cos(\theta)\,\mathbf{i} + \rho\sin(\varphi)\sin(\theta)\,\mathbf{j} + \rho\cos(\varphi)\,\mathbf{k}.$
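The conversions (12.3.53), (12.3.56) and (12.3.59) can be sketched in a few lines of Python. The function names are illustrative and not part of these notes. Two choices here are assumptions of the sketch: `math.atan2` reduced modulo 2π stands in for the case analysis in (12.3.52), and the origin is rejected because neither θ nor φ is determined there.

```python
import math

def cartesian_to_spherical(x, y, z):
    """Return (rho, theta, phi) for a point other than the origin."""
    rho = math.sqrt(x * x + y * y + z * z)        # radial length, as in (12.3.53)
    if rho == 0.0:
        raise ValueError("theta and phi are not determined at the origin")
    theta = math.atan2(y, x) % (2.0 * math.pi)    # angle of (x, y) in the x-y plane
    cos_phi = max(-1.0, min(1.0, z / rho))        # clamp to guard against rounding
    phi = math.acos(cos_phi)                      # angle from the positive z-axis, as in (12.3.56)
    return rho, theta, phi

def spherical_to_cartesian(rho, theta, phi):
    """Return (x, y, z) from (12.3.59)."""
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

# Round trip for an arbitrary illustrative point.
rho, theta, phi = cartesian_to_spherical(-1.0, 2.0, -3.0)
x, y, z = spherical_to_cartesian(rho, theta, phi)
```

Since `math.acos` returns values between 0 and π, the computed φ automatically lies in the range required by (12.3.57).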
Effectively, the relations (12.3.59) (equivalently the relation (12.3.60)) tell us how the Cartesian
representation of a point changes when we change its spherical coordinates. We now construct
a triple of orthogonal basis vectors for spherical coordinates which are an analog of the moving
coordinate frame $\mathbf{e}_r(r,\theta,z)$, $\mathbf{e}_\theta(r,\theta,z)$, $\mathbf{e}_z(r,\theta,z)$ that we constructed for cylindrical coordinates
(c.f. (12.2.44), (12.2.45) and (12.2.46)). We proceed exactly as we did in the case of cylindrical
coordinates, that is, fix some point A in $\mathbb{R}^3$ with spherical coordinates $(\rho_0,\theta_0,\varphi_0)$. By analogy with
(12.2.29) define the path
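(12.3.61) $\boldsymbol{\gamma}_1(\rho) := \mathbf{v}(\rho,\theta_0,\varphi_0) = \rho\sin(\varphi_0)\cos(\theta_0)\,\mathbf{i} + \rho\sin(\varphi_0)\sin(\theta_0)\,\mathbf{j} + \rho\cos(\varphi_0)\,\mathbf{k}, \quad \text{for all } 0 \le \rho < \infty,$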
in which (θ, φ) in (12.3.60) is held fixed at (θ0, φ0), and ρ is the parametric variable. Then the curve
Γ1 of this path is clearly the straight line collinear with the vector OA (see Figure 12.11 which is
adapted from Mathematica). Likewise (c.f. (12.2.30)) define the path
Figure 12.11: Curves Γ1, Γ2, Γ3, and basis vectors $\mathbf{e}_\rho(\rho_0,\theta_0,\varphi_0)$, $\mathbf{e}_\theta(\rho_0,\theta_0,\varphi_0)$, $\mathbf{e}_\varphi(\rho_0,\theta_0,\varphi_0)$
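(12.3.62) $\boldsymbol{\gamma}_2(\theta) := \mathbf{v}(\rho_0,\theta,\varphi_0) = \rho_0\sin(\varphi_0)\cos(\theta)\,\mathbf{i} + \rho_0\sin(\varphi_0)\sin(\theta)\,\mathbf{j} + \rho_0\cos(\varphi_0)\,\mathbf{k}, \quad \text{for all } 0 \le \theta < 2\pi,$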
in which (ρ, φ) in (12.3.60) is held fixed at (ρ0, φ0), and θ is the parametric variable. Clearly the
curve Γ2 of this path is the circle of radius
(12.3.63) r0 := ρ0 sin(φ0),
lying parallel to the x− y plane at the “height”
(12.3.64) z0 := ρ0 cos(φ0),
(see Figure 12.11). Finally (c.f. (12.2.31)) define the path
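(12.3.65) $\boldsymbol{\gamma}_3(\varphi) := \mathbf{v}(\rho_0,\theta_0,\varphi) = \rho_0\sin(\varphi)\cos(\theta_0)\,\mathbf{i} + \rho_0\sin(\varphi)\sin(\theta_0)\,\mathbf{j} + \rho_0\cos(\varphi)\,\mathbf{k}, \quad \text{for all } 0 \le \varphi \le \pi,$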
in which (ρ, θ) in (12.3.60) is held fixed at (ρ0, θ0), and φ is the parametric variable. Clearly the
curve Γ3 of this path is the circle of radius ρ0 “vertical” to the x − y plane and lying in the plane
which contains the triangle OAB (see Figure 12.11). Exactly as at (12.2.32), we define the tangent
to the curve Γ1 at the point A, namely
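(12.3.66) $\boldsymbol{\gamma}_1^{(1)}(\rho_0) := \dfrac{d\boldsymbol{\gamma}_1}{d\rho}(\rho)\Big|_{\rho=\rho_0} = \sin(\varphi_0)\cos(\theta_0)\,\mathbf{i} + \sin(\varphi_0)\sin(\theta_0)\,\mathbf{j} + \cos(\varphi_0)\,\mathbf{k},$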
in which the last equality follows from (12.3.61). In the same way, from (12.3.62), we define the
tangent to the curve Γ2 at the point A, that is
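(12.3.67) $\boldsymbol{\gamma}_2^{(1)}(\theta_0) := \dfrac{d\boldsymbol{\gamma}_2}{d\theta}(\theta)\Big|_{\theta=\theta_0} = -\rho_0\sin(\varphi_0)\sin(\theta_0)\,\mathbf{i} + \rho_0\sin(\varphi_0)\cos(\theta_0)\,\mathbf{j} + 0\,\mathbf{k},$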
and, from (12.3.65), we define the tangent to the curve Γ3 at the point A, that is
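(12.3.68) $\boldsymbol{\gamma}_3^{(1)}(\varphi_0) := \dfrac{d\boldsymbol{\gamma}_3}{d\varphi}(\varphi)\Big|_{\varphi=\varphi_0} = \rho_0\cos(\varphi_0)\cos(\theta_0)\,\mathbf{i} + \rho_0\cos(\varphi_0)\sin(\theta_0)\,\mathbf{j} - \rho_0\sin(\varphi_0)\,\mathbf{k}.$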
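Exactly as at (12.2.35), (12.2.36) and (12.2.37) we define unit vectors $\mathbf{e}_\rho(\rho_0,\theta_0,\varphi_0)$, $\mathbf{e}_\theta(\rho_0,\theta_0,\varphi_0)$
and $\mathbf{e}_\varphi(\rho_0,\theta_0,\varphi_0)$, having the same direction as $\boldsymbol{\gamma}_1^{(1)}(\rho_0)$, $\boldsymbol{\gamma}_2^{(1)}(\theta_0)$ and $\boldsymbol{\gamma}_3^{(1)}(\varphi_0)$ respectively, that is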
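(12.3.69) $\mathbf{e}_\rho(\rho_0,\theta_0,\varphi_0) := \dfrac{\boldsymbol{\gamma}_1^{(1)}(\rho_0)}{\big\|\boldsymbol{\gamma}_1^{(1)}(\rho_0)\big\|} = \sin(\varphi_0)\cos(\theta_0)\,\mathbf{i} + \sin(\varphi_0)\sin(\theta_0)\,\mathbf{j} + \cos(\varphi_0)\,\mathbf{k},$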
in which the final equality follows from (12.3.66) since it is clear that $\big\|\boldsymbol{\gamma}_1^{(1)}(\rho_0)\big\| = 1$. Similarly,
from (12.3.67), we have $\big\|\boldsymbol{\gamma}_2^{(1)}(\theta_0)\big\| = \rho_0\sin(\varphi_0)$, and therefore
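(12.3.70) $\mathbf{e}_\theta(\rho_0,\theta_0,\varphi_0) := \dfrac{\boldsymbol{\gamma}_2^{(1)}(\theta_0)}{\big\|\boldsymbol{\gamma}_2^{(1)}(\theta_0)\big\|} = -\sin(\theta_0)\,\mathbf{i} + \cos(\theta_0)\,\mathbf{j} + 0\,\mathbf{k},$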
and, from (12.3.68), we have $\big\|\boldsymbol{\gamma}_3^{(1)}(\varphi_0)\big\| = \rho_0$, and therefore
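(12.3.71) $\mathbf{e}_\varphi(\rho_0,\theta_0,\varphi_0) := \dfrac{\boldsymbol{\gamma}_3^{(1)}(\varphi_0)}{\big\|\boldsymbol{\gamma}_3^{(1)}(\varphi_0)\big\|} = \cos(\varphi_0)\cos(\theta_0)\,\mathbf{i} + \cos(\varphi_0)\sin(\theta_0)\,\mathbf{j} - \sin(\varphi_0)\,\mathbf{k}.$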
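The two norms quoted above follow from (12.3.67) and (12.3.68) by a direct calculation, spelled out here as an added intermediate step:

```latex
\begin{align*}
\big\|\boldsymbol{\gamma}_2^{(1)}(\theta_0)\big\|^2
  &= \rho_0^2\sin^2(\varphi_0)\,\big[\sin^2(\theta_0) + \cos^2(\theta_0)\big]
   = \rho_0^2\sin^2(\varphi_0), \\
\big\|\boldsymbol{\gamma}_3^{(1)}(\varphi_0)\big\|^2
  &= \rho_0^2\cos^2(\varphi_0)\,\big[\cos^2(\theta_0) + \sin^2(\theta_0)\big]
   + \rho_0^2\sin^2(\varphi_0)
   = \rho_0^2 .
\end{align*}
```

Taking square roots gives the norms ρ0 sin(φ0) and ρ0 respectively, where the first uses sin(φ0) ≥ 0 for 0 ≤ φ0 ≤ π. One added observation: the divisions in (12.3.70) and (12.3.71) only make sense when these norms are nonzero, that is when ρ0 sin(φ0) > 0, so the frame is defined at points A that do not lie on the z-axis.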
Now we know from (12.3.66) that $\boldsymbol{\gamma}_1^{(1)}(\rho_0)$ is tangent to the curve Γ1 at point A, so it follows
from (12.3.69) that
(12.3.72) the unit vector $\mathbf{e}_\rho(\rho_0,\theta_0,\varphi_0)$ is tangent to curve Γ1 at A,
(see Figure 12.11). Similarly, from (12.3.67) and (12.3.70), and from (12.3.68) and (12.3.71), we
have
(12.3.73) the unit vector $\mathbf{e}_\theta(\rho_0,\theta_0,\varphi_0)$ is tangent to curve Γ2 at A,
(12.3.74) the unit vector $\mathbf{e}_\varphi(\rho_0,\theta_0,\varphi_0)$ is tangent to curve Γ3 at A,
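(see Figure 12.11). Moreover, calculating the inner product of $\mathbf{e}_\rho(\rho_0,\theta_0,\varphi_0)$ with $\mathbf{e}_\theta(\rho_0,\theta_0,\varphi_0)$ using
(12.3.69) and (12.3.70) we get
(12.3.75) $\mathbf{e}_\rho(\rho_0,\theta_0,\varphi_0) \cdot \mathbf{e}_\theta(\rho_0,\theta_0,\varphi_0) = -\sin(\varphi_0)\,[\sin(\theta_0)\cos(\theta_0) - \sin(\theta_0)\cos(\theta_0)] = 0.$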
Similarly, from (12.3.69), (12.3.70) and (12.3.71) we find
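(12.3.76) $\mathbf{e}_\rho(\rho_0,\theta_0,\varphi_0) \cdot \mathbf{e}_\varphi(\rho_0,\theta_0,\varphi_0) = \mathbf{e}_\theta(\rho_0,\theta_0,\varphi_0) \cdot \mathbf{e}_\varphi(\rho_0,\theta_0,\varphi_0) = 0.$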
From (12.3.75) and (12.3.76) it follows that, for each and every (ρ0, θ0, φ0), we have
(12.3.77) $\mathbf{e}_\rho(\rho_0,\theta_0,\varphi_0)$, $\mathbf{e}_\theta(\rho_0,\theta_0,\varphi_0)$, $\mathbf{e}_\varphi(\rho_0,\theta_0,\varphi_0)$ is a triplet of mutually orthogonal unit vectors.
We see from Figure 12.11 that the orthogonal unit vectors $\mathbf{e}_\rho(\rho_0,\theta_0,\varphi_0)$, $\mathbf{e}_\theta(\rho_0,\theta_0,\varphi_0)$, $\mathbf{e}_\varphi(\rho_0,\theta_0,\varphi_0)$
are “attached” to the point A with spherical coordinates $(\rho_0,\theta_0,\varphi_0)$ and constitute a coordinate
frame at the point A. Notice that these basis vectors change direction (but not of course the unit
length) as the point A moves, so that (exactly as for the cases of polar and cylindrical coordinates) we
have a moving coordinate frame. Again, this is in direct contrast to the Cartesian unit basis vectors
$\mathbf{i}, \mathbf{j}, \mathbf{k}$ which of course have a constant direction parallel to the x, y and z axes respectively. For
later reference we rewrite the relations (12.3.69), (12.3.70) and (12.3.71), but replacing the generic
spherical coordinates (ρ0, θ0, φ0) with (ρ, θ, φ) (to lighten the notation):
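(12.3.78) $\mathbf{e}_\rho(\rho,\theta,\varphi) = \sin(\varphi)\cos(\theta)\,\mathbf{i} + \sin(\varphi)\sin(\theta)\,\mathbf{j} + \cos(\varphi)\,\mathbf{k},$
(12.3.79) $\mathbf{e}_\theta(\rho,\theta,\varphi) = -\sin(\theta)\,\mathbf{i} + \cos(\theta)\,\mathbf{j} + 0\,\mathbf{k},$
(12.3.80) $\mathbf{e}_\varphi(\rho,\theta,\varphi) = \cos(\varphi)\cos(\theta)\,\mathbf{i} + \cos(\varphi)\sin(\theta)\,\mathbf{j} - \sin(\varphi)\,\mathbf{k},$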
(c.f. (12.2.44), (12.2.45) and (12.2.46) for similar relations in the case of cylindrical coordinates).
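The orthonormality statement (12.3.77) holds for every choice of angles, and this can be spot-checked numerically. The sketch below evaluates (12.3.78), (12.3.79) and (12.3.80) on a grid of angles and records the largest deviation of any pairwise dot product from its expected value (1 for a vector with itself, 0 otherwise). The function names and the grid size are assumptions made for illustration.

```python
import math

def spherical_frame(theta, phi):
    """Return (e_rho, e_theta, e_phi) from (12.3.78)-(12.3.80) as Cartesian tuples."""
    e_rho = (math.sin(phi) * math.cos(theta), math.sin(phi) * math.sin(theta), math.cos(phi))
    e_theta = (-math.sin(theta), math.cos(theta), 0.0)
    e_phi = (math.cos(phi) * math.cos(theta), math.cos(phi) * math.sin(theta), -math.sin(phi))
    return e_rho, e_theta, e_phi

def dot(a, b):
    """Ordinary dot product of two Cartesian tuples."""
    return sum(p * q for p, q in zip(a, b))

def max_gram_deviation(frame):
    """Largest |e_i . e_j - delta_ij| over all pairs of frame vectors."""
    return max(abs(dot(a, b) - (1.0 if i == j else 0.0))
               for i, a in enumerate(frame)
               for j, b in enumerate(frame))

steps = 12    # grid resolution, chosen arbitrarily
worst = 0.0
for i in range(steps):
    theta = 2.0 * math.pi * i / steps      # 0 <= theta < 2*pi
    for j in range(steps + 1):
        phi = math.pi * j / steps          # 0 <= phi <= pi
        worst = max(worst, max_gram_deviation(spherical_frame(theta, phi)))
```

The value of `worst` should be at the level of floating-point rounding. Note that the formulas (12.3.78) to (12.3.80) do not involve ρ, so the frame depends only on the two angles.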
For cylindrical coordinates we established an expression in Riemannian form for the change ds in
distance resulting from a small change in the cylindrical coordinates (see (12.2.50) and (12.2.51)).
We now get a comparable expression for the case of spherical coordinates. Suppose point A has
spherical coordinates (ρ, θ, φ) so that
(12.3.81) point A is given by $\mathbf{v}(\rho,\theta,\varphi)$ (see (12.3.60))
and we make small perturbations dρ, dθ and dφ in the spherical coordinates to get point B with
the spherical coordinates (ρ+ dρ, θ + dθ, φ+ dφ) so that
(12.3.82) point B is given by $\mathbf{v}(\rho+d\rho,\theta+d\theta,\varphi+d\varphi)$ (again see (12.3.60))
(see Figure 12.12). We must determine the distance ds between A and B.
Figure 12.12: Perturbations in spherical coordinates
In the case of cylindrical coordinates it was easy to calculate this distance just by looking at
Figure 12.9, for this led immediately to (12.2.48) which in turn gave us the desired relation (12.2.49)
(which we then wrote in the Riemannian form (12.2.50) and (12.2.51)). In the case of spherical
coordinates it is not quite so easy to see what is going on by looking at Figure 12.12, and in fact one
can easily extract misleading information by incorrectly interpreting this figure. Accordingly, we
shall instead proceed just by the use of ordinary calculus and not rely on any figures or pictures at
all. From (12.3.81) and (12.3.82) we see
(12.3.83) $(ds)^2 = (\text{length of } AB)^2 = \|\mathbf{v}(\rho+d\rho,\theta+d\theta,\varphi+d\varphi) - \mathbf{v}(\rho,\theta,\varphi)\|^2,$
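so we must first calculate the vector difference $[\mathbf{v}(\rho+d\rho,\theta+d\theta,\varphi+d\varphi) - \mathbf{v}(\rho,\theta,\varphi)]$. But this is
easy using the formulas we have already worked out. In fact, by ordinary calculus (keeping only the
first-order terms in the small perturbations), we have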
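(12.3.84) $\mathbf{v}(\rho+d\rho,\theta+d\theta,\varphi+d\varphi) - \mathbf{v}(\rho,\theta,\varphi) = \dfrac{\partial\mathbf{v}}{\partial\rho}(\rho,\theta,\varphi)\,(d\rho) + \dfrac{\partial\mathbf{v}}{\partial\theta}(\rho,\theta,\varphi)\,(d\theta) + \dfrac{\partial\mathbf{v}}{\partial\varphi}(\rho,\theta,\varphi)\,(d\varphi).$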
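Now substitute $\mathbf{v}$ given by (12.3.60) into the right side of (12.3.84). We get
(12.3.85) $\begin{aligned}
&\mathbf{v}(\rho+d\rho,\theta+d\theta,\varphi+d\varphi) - \mathbf{v}(\rho,\theta,\varphi) \\
&\quad= \frac{\partial}{\partial\rho}\big[\rho\sin(\varphi)\cos(\theta)\,\mathbf{i} + \rho\sin(\varphi)\sin(\theta)\,\mathbf{j} + \rho\cos(\varphi)\,\mathbf{k}\big]\,(d\rho) \\
&\qquad+ \frac{\partial}{\partial\theta}\big[\rho\sin(\varphi)\cos(\theta)\,\mathbf{i} + \rho\sin(\varphi)\sin(\theta)\,\mathbf{j} + \rho\cos(\varphi)\,\mathbf{k}\big]\,(d\theta) \\
&\qquad+ \frac{\partial}{\partial\varphi}\big[\rho\sin(\varphi)\cos(\theta)\,\mathbf{i} + \rho\sin(\varphi)\sin(\theta)\,\mathbf{j} + \rho\cos(\varphi)\,\mathbf{k}\big]\,(d\varphi) \\
&\quad= \big[\sin(\varphi)\cos(\theta)\,\mathbf{i} + \sin(\varphi)\sin(\theta)\,\mathbf{j} + \cos(\varphi)\,\mathbf{k}\big]\,(d\rho) \\
&\qquad+ \big[-\sin(\theta)\,\mathbf{i} + \cos(\theta)\,\mathbf{j}\big]\,(\rho\sin(\varphi)\,d\theta) \\
&\qquad+ \big[\cos(\varphi)\cos(\theta)\,\mathbf{i} + \cos(\varphi)\sin(\theta)\,\mathbf{j} - \sin(\varphi)\,\mathbf{k}\big]\,(\rho\,d\varphi) \quad \text{(evaluating the partial derivatives)} \\
&\quad= (d\rho)\,\mathbf{e}_\rho(\rho,\theta,\varphi) + (\rho\sin(\varphi)\,d\theta)\,\mathbf{e}_\theta(\rho,\theta,\varphi) + (\rho\,d\varphi)\,\mathbf{e}_\varphi(\rho,\theta,\varphi),
\end{aligned}$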
in which we have used (12.3.78), (12.3.79) and (12.3.80) at the last equality. To summarize, in
(12.3.85) we have shown
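(12.3.86) $\mathbf{v}(\rho+d\rho,\theta+d\theta,\varphi+d\varphi) - \mathbf{v}(\rho,\theta,\varphi) = (d\rho)\,\mathbf{e}_\rho(\rho,\theta,\varphi) + (\rho\sin(\varphi)\,d\theta)\,\mathbf{e}_\theta(\rho,\theta,\varphi) + (\rho\,d\varphi)\,\mathbf{e}_\varphi(\rho,\theta,\varphi).$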
In view of (12.3.86), (12.3.77) and Pythagoras we get
(12.3.87) $\|\mathbf{v}(\rho+d\rho,\theta+d\theta,\varphi+d\varphi) - \mathbf{v}(\rho,\theta,\varphi)\|^2 = (d\rho)^2 + (\rho\sin(\varphi))^2\,(d\theta)^2 + \rho^2\,(d\varphi)^2.$
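The appeal to Pythagoras can be made explicit. In the added step below, a, b and c are shorthand introduced only for this calculation, and $\mathbf{w}$ denotes the vector difference on the left of (12.3.86):

```latex
\begin{align*}
\mathbf{w} &= a\,\mathbf{e}_\rho + b\,\mathbf{e}_\theta + c\,\mathbf{e}_\varphi,
  \qquad a := d\rho, \quad b := \rho\sin(\varphi)\,d\theta, \quad c := \rho\,d\varphi, \\
\|\mathbf{w}\|^2 = \mathbf{w}\cdot\mathbf{w}
  &= a^2\,(\mathbf{e}_\rho\cdot\mathbf{e}_\rho) + b^2\,(\mathbf{e}_\theta\cdot\mathbf{e}_\theta)
     + c^2\,(\mathbf{e}_\varphi\cdot\mathbf{e}_\varphi) \\
  &\quad + 2ab\,(\mathbf{e}_\rho\cdot\mathbf{e}_\theta) + 2ac\,(\mathbf{e}_\rho\cdot\mathbf{e}_\varphi)
     + 2bc\,(\mathbf{e}_\theta\cdot\mathbf{e}_\varphi) \\
  &= a^2 + b^2 + c^2 .
\end{align*}
```

The last equality uses (12.3.77): each basis vector has unit length, so the first three dot products equal 1, and the vectors are mutually orthogonal, so the three cross terms vanish.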
Now combine (12.3.87) and (12.3.83). We get the distance between A and B (recall (12.3.81)
and (12.3.82)) in terms of small changes ( dρ, dθ, dφ) in the spherical coordinates (ρ, θ, φ) in the
Riemannian form
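(12.3.88) $(ds)^2 = [h_\rho(\rho,\theta,\varphi)\,d\rho]^2 + [h_\theta(\rho,\theta,\varphi)\,d\theta]^2 + [h_\varphi(\rho,\theta,\varphi)\,d\varphi]^2,$
in which $h_\rho$, $h_\theta$ and $h_\varphi$ are the Riemannian scale functions defined by
(12.3.89) $h_\rho(\rho,\theta,\varphi) := 1, \quad h_\theta(\rho,\theta,\varphi) := \rho\sin(\varphi), \quad h_\varphi(\rho,\theta,\varphi) := \rho.$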
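Since (12.3.88) was obtained by keeping first-order terms only, it is instructive to compare it with the exact straight-line distance between A and B for a small but finite perturbation. The sketch below does this for one arbitrarily chosen point and perturbation; the function names and numerical values are assumptions made for illustration.

```python
import math

def v(rho, theta, phi):
    """Position vector (12.3.60) as a Cartesian tuple (x, y, z)."""
    return (rho * math.sin(phi) * math.cos(theta),
            rho * math.sin(phi) * math.sin(theta),
            rho * math.cos(phi))

def euclidean_distance(p, q):
    """Exact straight-line distance between two Cartesian points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def riemannian_ds(rho, theta, phi, d_rho, d_theta, d_phi):
    """Distance predicted by (12.3.88) with the scale functions (12.3.89)."""
    h_rho, h_theta, h_phi = 1.0, rho * math.sin(phi), rho
    return math.sqrt((h_rho * d_rho) ** 2 + (h_theta * d_theta) ** 2 + (h_phi * d_phi) ** 2)

rho, theta, phi = 2.0, 0.8, 1.1              # arbitrary illustrative point A
d_rho, d_theta, d_phi = 1e-4, -2e-4, 3e-4    # small perturbations leading to point B

exact = euclidean_distance(v(rho, theta, phi),
                           v(rho + d_rho, theta + d_theta, phi + d_phi))
approx = riemannian_ds(rho, theta, phi, d_rho, d_theta, d_phi)
relative_error = abs(exact - approx) / exact
```

The relative error is small here and should shrink further as the perturbations are made smaller, which is the sense in which (12.3.88) describes the distance for infinitesimal changes in the coordinates.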