
Mathematical Analysis

Min Yan

February 4, 2008


Contents

1 Limit and Continuity
  1.1 Limit of Sequence
    1.1.1 Definition
    1.1.2 Property
    1.1.3 Infinity and Infinitesimal
    1.1.4 Additional Exercise
  1.2 Convergence of Sequence Limit
    1.2.1 Necessary Condition
    1.2.2 Supremum and Infimum
    1.2.3 Monotone Sequence
    1.2.4 Convergent Subsequence
    1.2.5 Convergence of Cauchy Sequence
    1.2.6 Open Cover
    1.2.7 Additional Exercise
  1.3 Limit of Function
    1.3.1 Definition
    1.3.2 Variation
    1.3.3 Property
    1.3.4 Limit of Trigonometric Function
    1.3.5 Limit of Exponential Function
    1.3.6 More Property
    1.3.7 Order of Infinity and Infinitesimal
    1.3.8 Additional Exercise
  1.4 Continuous Function
    1.4.1 Definition
    1.4.2 Uniformly Continuous Function
    1.4.3 Maximum and Minimum
    1.4.4 Intermediate Value Theorem
    1.4.5 Invertible Continuous Function
    1.4.6 Inverse Trigonometric and Logarithmic Functions
    1.4.7 Additional Exercise
2 Differentiation
  2.1 Approximation and Differentiation
    2.1.1 Approximation
    2.1.2 Differentiation
    2.1.3 Derivative


    2.1.4 Tangent Line and Rate of Change
    2.1.5 Rules of Computation
    2.1.6 Basic Example
    2.1.7 Derivative of Inverse Function
    2.1.8 Additional Exercise
  2.2 Application of Differentiation
    2.2.1 Maximum and Minimum
    2.2.2 Mean Value Theorem
    2.2.3 Monotone Function
    2.2.4 L'Hospital's Rule
    2.2.5 Additional Exercise
  2.3 High Order Approximation
    2.3.1 Quadratic Approximation
    2.3.2 High Order Derivative
    2.3.3 Taylor Expansion
    2.3.4 Remainder
    2.3.5 Maximum and Minimum
    2.3.6 Convex and Concave
    2.3.7 Additional Exercise
3 Integration
  3.1 Riemann Integration
    3.1.1 Riemann Sum
    3.1.2 Integrability Criterion
    3.1.3 Integrability of Continuous and Monotone Functions
    3.1.4 Properties of Integration
    3.1.5 Additional Exercise
  3.2 Antiderivative
    3.2.1 Fundamental Theorem of Calculus
    3.2.2 Antiderivative
    3.2.3 Integration by Parts
    3.2.4 Change of Variable
    3.2.5 Additional Exercise
  3.3 Topics on Integration
    3.3.1 Integration of Rational Functions
    3.3.2 Improper Integration
    3.3.3 Riemann-Stieltjes Integration
    3.3.4 Bounded Variation Function
    3.3.5 Additional Exercise
4 Series
  4.1 Series of Numbers
    4.1.1 Sum of Series
    4.1.2 Comparison Test
    4.1.3 Conditional Convergence
    4.1.4 Additional Exercise
  4.2 Series of Functions


    4.2.1 Uniform Convergence
    4.2.2 Properties of Uniform Convergence
    4.2.3 Power Series
    4.2.4 Fourier Series
    4.2.5 Additional Exercise
5 Multivariable Function
  5.1 Limit and Continuity
    5.1.1 Limit in Euclidean Space
    5.1.2 Topology in Euclidean Space
    5.1.3 Multivariable Function
    5.1.4 Continuous Function
    5.1.5 Multivariable Map
    5.1.6 Exercise
  5.2 Multivariable Algebra
    5.2.1 Linear Transform
    5.2.2 Bilinear and Quadratic Form
    5.2.3 Multilinear Map and Polynomial
6 Multivariable Differentiation
  6.1 Differentiation
    6.1.1 Differentiability and Derivative
    6.1.2 Partial Derivative
    6.1.3 Rules of Differentiation
    6.1.4 Directional Derivative
  6.2 Inverse and Implicit Function
    6.2.1 Inverse Differentiation
    6.2.2 Implicit Differentiation
    6.2.3 Hypersurface
    6.2.4 Exercise
  6.3 High Order Differentiation
    6.3.1 Quadratic Approximation
    6.3.2 High Order Partial Derivative
    6.3.3 Taylor Expansion
    6.3.4 Maximum and Minimum
    6.3.5 Constrained Extreme
    6.3.6 Exercise
7 Multivariable Integration
  7.1 Riemann Integration
    7.1.1 Volume in Euclidean Space
    7.1.2 Riemann Sum
    7.1.3 Properties of Integration
    7.1.4 Fubini Theorem
    7.1.5 Volume in Vector Space
    7.1.6 Change of Variable
    7.1.7 Improper Integration


    7.1.8 Exercise
  7.2 Integration on Hypersurface
    7.2.1 Rectifiable Curve
    7.2.2 Integration of Function on Curve
    7.2.3 Integration of 1-Form on Curve
    7.2.4 Surface Area
    7.2.5 Integration of Function on Surface
    7.2.6 Integration of 2-Form on Surface
    7.2.7 Volume and Integration on Hypersurface
    7.2.8 Exercise
  7.3 Stokes Theorem
    7.3.1 Green Theorem
    7.3.2 Independence of Integral on Path
    7.3.3 Stokes Theorem
    7.3.4 Gauss Theorem
    7.3.5 Exercise
8 Calculus on Manifold
  8.1 Differentiable Manifold
    8.1.1 Manifold
    8.1.2 Tangent Space
    8.1.3 Differential Form
    8.1.4 Topics
  8.2 Integration on Manifold
    8.2.1 Partition of Unity
    8.2.2 Integration
    8.2.3 Stokes Theorem


Chapter 1

Limit and Continuity


1.1 Limit of Sequence

A sequence is an infinite list

$$x_1, x_2, x_3, \dots, x_n, x_{n+1}, \dots.$$

The sequence can be briefly denoted as $\{x_n\}$. The subscript $n$ is called the index and does not have to start from 1. For example,

$$x_5, x_6, x_7, \dots, x_n, x_{n+1}, \dots$$

is also a sequence, with the index starting from 5.

In this chapter, the terms $x_n$ of a sequence are assumed to be real numbers and can be plotted on the real number line. A sequence has a limit $l$ if the following implication holds:

$$n \text{ is big} \implies x_n \text{ is close to } l.$$

Intuitively, this means that the sequence "accumulates" around $l$. However, to give a rigorous definition, more consideration is required for the meaning of "big", "close" and "implies". The bigness of a number $n$ is usually measured by $n > N$ for some big $N$ (for example, we say $n$ is "in the thousands" if $N = 1{,}000$). The closeness between two numbers $u$ and $v$ is usually measured by (the smallness of) the size of $|u - v|$. The implication means that the predetermined smallness of $|x_n - l|$ may be achieved by the bigness of $n$.

1.1.1 Definition

Definition 1.1.1. A sequence $\{x_n\}$ of real numbers has limit $l$ (or converges to $l$), denoted $\lim_{n\to\infty} x_n = l$, if for any $\varepsilon > 0$, there is $N$, such that

$$n > N \implies |x_n - l| < \varepsilon. \tag{1.1.1}$$

A sequence is convergent if it has a (finite) limit. Otherwise, the sequence is divergent.

Figure 1.1: for any $\varepsilon$, there is $N$. (Number line marking $l$, the terms $x_1, \dots, x_5$, and the terms $x_{N+1}, x_{N+2}, x_{N+3}, x_{N+4}$ within $\varepsilon$ of $l$.)

Note the logical relation between $\varepsilon$ and $N$. The predetermined smallness $\varepsilon$ for $|x_n - l|$ is arbitrarily given, while the size $N$ for $n$ is to be found after $\varepsilon$ is given. Thus the choice of $N$ usually depends on $\varepsilon$ and is often expressed as a function of $\varepsilon$.

Figure 1.2: another plot of a converging sequence. (Terms plotted against the index, with ticks at $1$, $5$, $N+1$, $N+4$ and a band of width $\varepsilon$ on each side of $l$.)

Since the limit is about the long term behavior of a sequence getting closer to a target, only small $\varepsilon$ and big $N$ need to be considered in the argument. For example, the limit of a sequence is not changed if the first 100 terms are replaced by other arbitrary numbers. See Exercise 1.1.9 for more examples.

Example 1.1.1. Intuitively, the bigger $n$ is, the smaller $\frac{1}{n}$ gets. This suggests $\lim_{n\to\infty} \frac{1}{n} = 0$. Rigorously following the definition, for any $\varepsilon > 0$, choose $N = \frac{1}{\varepsilon}$. Then

$$n > N \implies \left|\frac{1}{n} - 0\right| = \frac{1}{n} < \frac{1}{N} = \varepsilon.$$

Thus the implication (1.1.1) is established for the sequence $\left\{\frac{1}{n}\right\}$.

How did we find the suitable $N$? Our goal is to achieve $\left|\frac{1}{n} - 0\right| < \varepsilon$. This is the same as $n > \frac{1}{\varepsilon}$, which suggests taking $N = \frac{1}{\varepsilon}$. Note that our choice of $N$ may not be a natural number.

Example 1.1.2. The sequence $\left\{\frac{n}{n + (-1)^n}\right\}$, with index starting from $n = 2$, is

$$\frac{2}{3},\ \frac{3}{2},\ \frac{4}{5},\ \frac{5}{4},\ \frac{6}{7},\ \frac{7}{6},\ \dots.$$

Plotting the sequence suggests $\lim_{n\to\infty} \frac{n}{n + (-1)^n} = 1$.

For the rigorous argument, we observe that

$$\left|\frac{n}{n + (-1)^n} - 1\right| = \frac{1}{n + (-1)^n} \le \frac{1}{n - 1}.$$

In order for the left side to be less than $\varepsilon$, it is sufficient to make $\frac{1}{n - 1} < \varepsilon$, which is the same as $n > \frac{1}{\varepsilon} + 1$.

Based on the analysis, for any $\varepsilon > 0$, choose $N = \frac{1}{\varepsilon} + 1$. Then

$$n > N \implies \left|\frac{n}{n + (-1)^n} - 1\right| = \frac{1}{n + (-1)^n} \le \frac{1}{n - 1} < \frac{1}{N - 1} = \varepsilon.$$
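As a hedged numerical companion to Example 1.1.2, the sketch below lists the first terms of the sequence and tests the implication for a few values of $\varepsilon$ with $N = \frac{1}{\varepsilon} + 1$. The function names and the finite horizon are illustrative assumptions, and the check is evidence rather than a proof.

```python
from fractions import Fraction
import math

def x(n):
    """n-th term of the sequence in Example 1.1.2 (defined for n >= 2)."""
    return Fraction(n, n + (-1) ** n)

def holds_beyond(eps, horizon=2000):
    """Check |x(n) - 1| < eps for the first `horizon` integers n > N = 1/eps + 1."""
    N = 1 / Fraction(eps) + 1
    start = max(math.floor(N) + 1, 2)  # smallest admissible integer n > N
    return all(abs(x(n) - 1) < eps for n in range(start, start + horizon))

print([str(x(n)) for n in range(2, 8)])
print(all(holds_beyond(Fraction(1, k)) for k in (2, 10, 75)))
```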

Example 1.1.3. Consider the sequence

$$1.4,\ 1.41,\ 1.414,\ 1.4142,\ 1.41421,\ 1.414213,\ 1.4142135,\ 1.41421356,\ \dots$$

of finer and finer decimal approximations of $\sqrt{2}$. Intuition suggests that $\lim_{n\to\infty} x_n = \sqrt{2}$. The rigorous verification means that for any $\varepsilon > 0$, we need to find $N$, such that

$$n > N \implies |x_n - \sqrt{2}| < \varepsilon.$$

Since the $n$-th term $x_n$ is the expansion up to the $n$-th decimal place, it satisfies $|x_n - \sqrt{2}| < 10^{-n}$. Therefore it suffices to find $N$ so that the following implication holds:

$$n > N \implies 10^{-n} < \varepsilon.$$

Assume $1 > \varepsilon > 0$. Then $\varepsilon$ has the decimal expansion

$$\varepsilon = 0.00\cdots 0E_N E_{N+1} E_{N+2} \cdots,$$

where $E_N$ is from $\{1, 2, \dots, 9\}$. In other words, $N$ is the location of the first nonzero digit in the decimal expansion of $\varepsilon$. Then for $n > N$ we have

$$\varepsilon \ge 0.00\cdots 0E_N \ge 0.00\cdots 01 = 10^{-N} > 10^{-n}.$$

The argument above assumes $1 > \varepsilon > 0$. This is not a problem because if we can achieve $|x_n - l| < 0.5$ for $n > N$, then we can certainly achieve $|x_n - l| < \varepsilon$ for any $\varepsilon \ge 1$ and for the same $n > N$. In other words, we may add the assumption that $\varepsilon$ is less than a certain fixed number without hurting the overall rigorous argument for the limit.
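The two ingredients of Example 1.1.3, the truncated decimal expansions of $\sqrt{2}$ and the position $N$ of the first nonzero digit of $\varepsilon$, can be computed exactly with integers. The sketch below is one possible way to do this; the function names are assumptions made for illustration, and the truncations are obtained from the integer square root rather than from a stored decimal expansion.

```python
from fractions import Fraction
from math import isqrt

def sqrt2_truncated(n):
    """Decimal expansion of sqrt(2) cut off after n decimal places, as an exact fraction."""
    return Fraction(isqrt(2 * 10 ** (2 * n)), 10 ** n)

def first_nonzero_digit(eps):
    """Position N of the first nonzero decimal digit of eps, assuming 0 < eps < 1."""
    N = 1
    while eps * 10 ** N < 1:
        N += 1
    return N

def within_ten_to_minus_n(n):
    """Exact test of 0 <= sqrt(2) - x_n < 10**(-n), done by comparing squares."""
    x = sqrt2_truncated(n)
    return x * x <= 2 < (x + Fraction(1, 10 ** n)) ** 2

eps = Fraction(3, 1000)  # 0.003: the first nonzero digit sits at position 3
N = first_nonzero_digit(eps)
print(N, [float(sqrt2_truncated(n)) for n in range(1, 6)])
print(all(within_ten_to_minus_n(n) and Fraction(1, 10 ** n) < eps
          for n in range(N + 1, N + 50)))
```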

Exercise 1.1.1. Rigorously verify the limits.

1. $\lim_{n\to\infty} \frac{2n}{n - 2} = 2$.

2. $\lim_{n\to\infty} \frac{1}{\sqrt{n}} = 0$.

3. $\lim_{n\to\infty} (\sqrt{n + 1} - \sqrt{n}) = 0$.

4. $\lim_{n\to\infty} \frac{1}{n^{2/3} - n^{1/2}} = 0$.

5. $\lim_{n\to\infty} \frac{\cos n}{n} = 0$.

6. $\lim_{n\to\infty} \frac{\sqrt{n} - \cos n}{\sqrt{n} + \sin n} = 1$.

Exercise 1.1.2. Let a positive real number $a > 0$ have the decimal expansion

$$a = X.Z_1 Z_2 \cdots Z_n Z_{n+1} \cdots,$$

where $X$ is a non-negative integer, and $Z_n$ is a single digit integer from $\{0, 1, 2, \dots, 9\}$ at the $n$-th decimal place. Prove the sequence

$$X.Z_1,\ X.Z_1 Z_2,\ X.Z_1 Z_2 Z_3,\ X.Z_1 Z_2 Z_3 Z_4,\ \dots$$

of finer and finer decimal approximations converges to $a$.


Exercise 1.1.3. Suppose $x_n \le l \le y_n$ and $\lim_{n\to\infty} (x_n - y_n) = 0$. Prove $\lim_{n\to\infty} x_n = \lim_{n\to\infty} y_n = l$.

Exercise 1.1.4. Suppose $|x_n - l| \le y_n$ and $\lim_{n\to\infty} y_n = 0$. Prove $\lim_{n\to\infty} x_n = l$.

Exercise 1.1.5. Suppose $\lim_{n\to\infty} x_n = l$. Prove $\lim_{n\to\infty} |x_n| = |l|$. Is the converse true?

Exercise 1.1.6. Suppose $\lim_{n\to\infty} x_n = l$. Prove $\lim_{n\to\infty} x_{n+3} = l$. Is the converse true? (See Proposition 1.2.1 for a more general statement.)

Exercise 1.1.7. Prove that the limit is not changed if finitely many terms are modified. In other words, if there is $N$, such that $x_n = y_n$ for $n > N$, then $\lim_{n\to\infty} x_n = l$ if and only if $\lim_{n\to\infty} y_n = l$.

Exercise 1.1.8. Prove the uniqueness of the limit. In other words, if $\lim_{n\to\infty} x_n = l$ and $\lim_{n\to\infty} x_n = l'$, then $l = l'$.

Exercise 1.1.9. Prove the following are equivalent definitions of $\lim_{n\to\infty} x_n = l$.

1. For any $c > \varepsilon > 0$, where $c$ is some fixed number, there is $N$, such that $|x_n - l| < \varepsilon$ for all $n > N$.

2. For any $\varepsilon > 0$, there is a natural number $N$, such that $|x_n - l| < \varepsilon$ for all $n > N$.

3. For any $\varepsilon > 0$, there is $N$, such that $|x_n - l| \le \varepsilon$ for all $n > N$.

4. For any $\varepsilon > 0$, there is $N$, such that $|x_n - l| < \varepsilon$ for all $n \ge N$.

5. For any $\varepsilon > 0$, there is $N$, such that $|x_n - l| \le 2\varepsilon$ for all $n > N$.

Exercise 1.1.10. Which are equivalent to the definition of $\lim_{n\to\infty} x_n = l$?

1. For $\varepsilon = 0.001$, we have $N = 1000$, such that $|x_n - l| < \varepsilon$ for all $n > N$.

2. For any $\varepsilon$ satisfying $0.001 \ge \varepsilon > 0$, there is $N$, such that $|x_n - l| < \varepsilon$ for all $n > N$.

3. For any $\varepsilon > 0.001$, there is $N$, such that $|x_n - l| < \varepsilon$ for all $n \ge N$.

4. For any $\varepsilon > 0$, there is a natural number $N$, such that $|x_n - l| \le \varepsilon$ for all $n \ge N$.

5. For any $\varepsilon > 0$, there is $N$, such that $|x_n - l| < 2\varepsilon^2$ for all $n > N$.

6. For any $\varepsilon > 0$, there is $N$, such that $|x_n - l| < 2\varepsilon^2 + 1$ for all $n > N$.

7. For any $\varepsilon > 0$, we have $N = 1000$, such that $|x_n - l| < \varepsilon$ for all $n > N$.

8. For any $\varepsilon > 0$, there are infinitely many $n$, such that $|x_n - l| < \varepsilon$.

9. For infinitely many $\varepsilon > 0$, there is $N$, such that $|x_n - l| < \varepsilon$ for all $n > N$.

10. For any $\varepsilon > 0$, there is $N$, such that $l - 2\varepsilon < x_n < l + \varepsilon$ for all $n > N$.

11. For any natural number $K$, there is $N$, such that $|x_n - l| < \frac{1}{K}$ for all $n > N$.

The subsequent examples are important basic limits. The analysis leading to the suitable choice of $N$ for the given $\varepsilon$ will be given. The rigorous argument in line with the definition of limit is left to the reader.


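Example 1.1.4. For any $a > 0$, we have

$$\lim_{n\to\infty} \frac{1}{n^a} = 0. \tag{1.1.2}$$

The inequality $\left|\frac{1}{n^a}\right| < \varepsilon$ is the same as $\frac{1}{n} < \varepsilon^{\frac{1}{a}}$. Thus choosing $N = \varepsilon^{-\frac{1}{a}}$ should make the implication (1.1.1) hold.

Example 1.1.5. If $|a| < 1$, then we have

$$\lim_{n\to\infty} a^n = 0. \tag{1.1.3}$$

The case $a = 0$ is trivial. Otherwise, let $\frac{1}{|a|} = 1 + b$. Then $b > 0$ and

$$\frac{1}{|a^n|} = (1 + b)^n = 1 + nb + \frac{n(n - 1)}{2} b^2 + \cdots > nb.$$

This implies $|a^n| < \frac{1}{nb}$. In order to get $|a^n| < \varepsilon$, therefore, it suffices to make sure $\frac{1}{nb} < \varepsilon$. This suggests choosing $N = \frac{1}{b\varepsilon}$.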
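To see the choice $N = \frac{1}{b\varepsilon}$ from Example 1.1.5 at work, here is a small exact-arithmetic sketch. The function name `check_power_limit` and the number of indices tested are illustrative assumptions; the estimate $|a^n| < \frac{1}{nb}$ is crude, so this $N$ is typically far larger than the smallest index that would do.

```python
from fractions import Fraction
import math

def check_power_limit(a, eps, extra=50):
    """For 0 < |a| < 1, return (N, ok) with N = 1/(b*eps), where 1/|a| = 1 + b,
    and ok telling whether |a**n| < eps for the first `extra` integers n > N."""
    a, eps = Fraction(a), Fraction(eps)
    b = 1 / abs(a) - 1
    N = 1 / (b * eps)
    start = math.floor(N) + 1  # smallest integer strictly greater than N
    ok = all(abs(a) ** n < eps for n in range(start, start + extra))
    return N, ok

N, ok = check_power_limit(Fraction(9, 10), Fraction(1, 100))
print(N, ok)
```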

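Example 1.1.6. We have

$$\lim_{n\to\infty} \sqrt[n]{n} = 1. \tag{1.1.4}$$

Let $x_n = \sqrt[n]{n} - 1$. Then $x_n > 0$ for $n > 1$ and

$$n = (\sqrt[n]{n})^n = (1 + x_n)^n = 1 + n x_n + \frac{n(n - 1)}{2} x_n^2 + \cdots > \frac{n(n - 1)}{2} x_n^2.$$

This implies $x_n^2 < \frac{2}{n - 1}$. In order to get $|\sqrt[n]{n} - 1| = x_n < \varepsilon$, therefore, it suffices to make sure $\frac{2}{n - 1} < \varepsilon^2$. Thus we may choose $N = \frac{2}{\varepsilon^2} + 1$.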
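The choice $N = \frac{2}{\varepsilon^2} + 1$ in Example 1.1.6 can be tested without computing any roots, because $\sqrt[n]{n} - 1 < \varepsilon$ is equivalent to $n < (1 + \varepsilon)^n$. The sketch below uses this reformulation with exact rationals; the function name and the finite range are assumptions for illustration only.

```python
from fractions import Fraction
import math

def root_n_within(eps, extra=200):
    """Return (N, ok) with N = 2/eps**2 + 1 and ok telling whether
    n**(1/n) - 1 < eps for the first `extra` integers n > N."""
    eps = Fraction(eps)
    N = 2 / eps ** 2 + 1
    start = math.floor(N) + 1
    # n**(1/n) - 1 < eps is equivalent to n < (1 + eps)**n, which is exact for rationals.
    ok = all(n < (1 + eps) ** n for n in range(start, start + extra))
    return N, ok

print(root_n_within(Fraction(1, 10)))
```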

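Example 1.1.7. For any $a$, we have

$$\lim_{n\to\infty} \frac{a^n}{n!} = 0. \tag{1.1.5}$$

Suppose $M > |a|$ is an integer. Then for $n > M$, we have

$$\left|\frac{a^n}{n!}\right| = \frac{|a|^M}{M!} \cdot \frac{|a|}{M + 1} \cdot \frac{|a|}{M + 2} \cdots \frac{|a|}{n} \le \frac{|a|^M}{M!} \cdot \frac{|a|}{n}.$$

Thus in order to get $\left|\frac{a^n}{n!}\right| < \varepsilon$, we only need to make sure $\frac{|a|^M}{M!} \cdot \frac{|a|}{n} < \varepsilon$. This leads to the choice $N = \max\left\{M, \frac{|a|^{M+1}}{M!\,\varepsilon}\right\}$.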
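The recipe of Example 1.1.7 translates directly into a short computation. In the sketch below, the function name and the particular choice $M = \lfloor |a| \rfloor + 1$ are assumptions made for illustration (any integer $M > |a|$ would serve), and only finitely many indices are tested.

```python
from fractions import Fraction
import math

def factorial_limit_check(a, eps, extra=20):
    """Return (M, N, ok) for the sequence a**n / n!, with N chosen as in Example 1.1.7."""
    a, eps = Fraction(a), Fraction(eps)
    M = math.floor(abs(a)) + 1  # an integer with M > |a|
    N = max(Fraction(M), abs(a) ** (M + 1) / (math.factorial(M) * eps))
    start = math.floor(N) + 1
    ok = all(abs(a) ** n / math.factorial(n) < eps
             for n in range(start, start + extra))
    return M, N, ok

print(factorial_limit_check(2, Fraction(1, 100)))
```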

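Exercise 1.1.11. Prove $\frac{n!}{n^n} \le \frac{1}{n}$ and $\frac{(n!)^2}{(2n)!} < \frac{1}{n + 1}$ for $n > 1$. Then use this to prove $\lim_{n\to\infty} \frac{n!}{n^n} = \lim_{n\to\infty} \frac{(n!)^2}{(2n)!} = 0$.

Exercise 1.1.12. Use the binomial expansion of $2^n = (1 + 1)^n$ to prove $2^n > \frac{n(n - 1)}{2}$. Then prove $\lim_{n\to\infty} \frac{n}{2^n} = 0$.

Exercise 1.1.13. Prove $\lim_{n\to\infty} \sqrt[n]{3} = 1$ and $\lim_{n\to\infty} \sqrt[n]{n^2 + 2n + 3} = 1$.

Exercise 1.1.14. Prove $\sqrt[n]{n!} > \frac{\sqrt{n}}{2}$. Then use this to prove $\lim_{n\to\infty} \frac{1}{\sqrt[n]{n!}} = 0$.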


1.1.2 Property

A sequence is bounded if there is a constant $B$, such that $|x_n| \le B$ for any $n$. This is equivalent to the existence of constants $B_1$ and $B_2$, such that $B_1 \le x_n \le B_2$ for any $n$. The constants $B$, $B_1$, $B_2$ are respectively called a bound, a lower bound and an upper bound.

Proposition 1.1.2. Convergent sequences are bounded.

Proof. Suppose $\lim_{n\to\infty} x_n = l$. For $\varepsilon = 1 > 0$, there is $N$, such that

$$n > N \implies |x_n - l| < 1.$$

Moreover, by taking a bigger natural number if necessary, $N$ may be further assumed to be a natural number. Then $x_{N+1}, x_{N+2}, \dots$ have upper bound $l + 1$ and lower bound $l - 1$, and the whole sequence has

$$\max\{x_1, x_2, \dots, x_N, l + 1\}, \quad \min\{x_1, x_2, \dots, x_N, l - 1\}$$

as upper and lower bounds.

Exercise 1.1.15. Prove that if $|x_n| < B$ for $n > N$, then the whole sequence $\{x_n\}$ is bounded. This implies that the boundedness is not changed by modifying finitely many terms in a sequence.

Exercise 1.1.16. Suppose $\lim_{n\to\infty} x_n = 0$ and $y_n$ is bounded. Prove $\lim_{n\to\infty} x_n y_n = 0$.

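Proposition 1.1.3 (Arithmetic Rule). Suppose

$$\lim_{n\to\infty} x_n = l, \quad \lim_{n\to\infty} y_n = k.$$

Then

$$\lim_{n\to\infty} (x_n + y_n) = l + k, \quad \lim_{n\to\infty} x_n y_n = lk, \quad \lim_{n\to\infty} \frac{x_n}{y_n} = \frac{l}{k},$$

where $y_n \ne 0$ and $k \ne 0$ are assumed in the third equality.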

Proof. For any $\varepsilon > 0$, there are $N_1$ and $N_2$, such that

$$n > N_1 \implies |x_n - l| < \frac{\varepsilon}{2}, \quad n > N_2 \implies |y_n - k| < \frac{\varepsilon}{2}.$$

Then

$$n > \max\{N_1, N_2\} \implies |(x_n + y_n) - (l + k)| \le |x_n - l| + |y_n - k| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$

This completes the proof that $\lim_{n\to\infty} (x_n + y_n) = l + k$.

By Proposition 1.1.2, we have $|y_n| < B$ for a fixed number $B$ and all $n$. For any $\varepsilon > 0$, there are $N_1$ and $N_2$, such that

$$n > N_1 \implies |x_n - l| < \frac{\varepsilon}{2B}, \quad n > N_2 \implies |y_n - k| < \frac{\varepsilon}{2|l|}$$

(if $l = 0$, the second term in the estimate below vanishes and any $N_2$ works). Then

$$n > N = \max\{N_1, N_2\} \implies |x_n y_n - lk| = |(x_n y_n - l y_n) + (l y_n - lk)| \le |x_n - l||y_n| + |l||y_n - k| < \frac{\varepsilon}{2B} B + |l| \frac{\varepsilon}{2|l|} = \varepsilon.$$

This completes the proof that $\lim_{n\to\infty} x_n y_n = lk$.

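Assume $y_n \ne 0$ and $k \ne 0$. We will prove $\lim_{n\to\infty} \frac{1}{y_n} = \frac{1}{k}$. By the product property of the limit, this implies

$$\lim_{n\to\infty} \frac{x_n}{y_n} = \lim_{n\to\infty} x_n \lim_{n\to\infty} \frac{1}{y_n} = l \cdot \frac{1}{k} = \frac{l}{k}.$$

For any $\varepsilon > 0$, we have $\varepsilon' = \min\left\{\frac{\varepsilon |k|^2}{2}, \frac{|k|}{2}\right\} > 0$. Then there is $N$, such that

$$\begin{aligned}
n > N &\implies |y_n - k| < \varepsilon' \\
&\iff |y_n - k| < \frac{\varepsilon |k|^2}{2}, \ |y_n - k| < \frac{|k|}{2} \\
&\implies |y_n - k| < \frac{\varepsilon |k|^2}{2}, \ |y_n| > \frac{|k|}{2} \\
&\implies \left|\frac{1}{y_n} - \frac{1}{k}\right| = \frac{|y_n - k|}{|y_n k|} < \frac{\frac{\varepsilon |k|^2}{2}}{\frac{|k|}{2} |k|} = \varepsilon.
\end{aligned}$$

This completes the proof that $\lim_{n\to\infty} \frac{1}{y_n} = \frac{1}{k}$.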

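Example 1.1.8. By the limit (1.1.2) and the arithmetic rule, we have

$$\lim_{n\to\infty} \frac{2n}{n - 2} = \lim_{n\to\infty} \frac{2}{1 - 2 \cdot \frac{1}{n}} = \frac{\lim_{n\to\infty} 2}{\lim_{n\to\infty} 1 - 2 \lim_{n\to\infty} \frac{1}{n}} = \frac{2}{1 - 2 \cdot 0} = 2.$$

Here is a more complicated example.

$$\begin{aligned}
\lim_{n\to\infty} \frac{n^3 + 2n + 2}{2n^3 + 10n^2 + 1}
&= \lim_{n\to\infty} \frac{1 + 2 \frac{1}{n^2} + 2 \frac{1}{n^3}}{2 + 10 \frac{1}{n} + \frac{1}{n^3}} \\
&= \frac{1 + 2 \left(\lim_{n\to\infty} \frac{1}{n}\right)^2 + 2 \left(\lim_{n\to\infty} \frac{1}{n}\right)^3}{2 + 10 \left(\lim_{n\to\infty} \frac{1}{n}\right) + \left(\lim_{n\to\infty} \frac{1}{n}\right)^3} \\
&= \frac{1 + 2 \cdot 0^2 + 2 \cdot 0^3}{2 + 10 \cdot 0 + 0^3} = \frac{1}{2}.
\end{aligned}$$

The idea can be generalized to obtain

$$\lim_{n\to\infty} \frac{a_p n^p + a_{p-1} n^{p-1} + \cdots + a_1 n + a_0}{b_q n^q + b_{q-1} n^{q-1} + \cdots + b_1 n + b_0} =
\begin{cases}
0 & \text{if } p < q,\ b_q \ne 0, \\
\dfrac{a_p}{b_q} & \text{if } p = q,\ b_q \ne 0.
\end{cases} \tag{1.1.6}$$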
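The limit of the second quotient in Example 1.1.8 and the rule (1.1.6) can be compared numerically. The sketch below is illustrative only: the helper `leading_term_limit` is a hypothetical name introduced here, it takes coefficient lists ordered from the highest degree down, and it covers only the two cases stated in (1.1.6).

```python
from fractions import Fraction

def x(n):
    """The second sequence of Example 1.1.8, evaluated exactly."""
    return Fraction(n**3 + 2*n + 2, 2*n**3 + 10*n**2 + 1)

def leading_term_limit(num, den):
    """Limit predicted by (1.1.6) for coefficient lists given from highest degree down."""
    p, q = len(num) - 1, len(den) - 1
    if den[0] == 0 or p > q:
        raise ValueError("(1.1.6) requires b_q != 0 and p <= q")
    return Fraction(0) if p < q else Fraction(num[0], den[0])

predicted = leading_term_limit([1, 0, 2, 2], [2, 10, 0, 1])
for n in (10, 100, 1000, 10**6):
    # The distance to the predicted limit shrinks roughly like a constant over n.
    print(n, float(x(n)), float(abs(x(n) - predicted)))
```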


Exercise 1.1.17. Suppose $\lim_{n\to\infty} x_n = l$, $\lim_{n\to\infty} y_n = k$. Prove $\lim_{n\to\infty} \max\{x_n, y_n\} = \max\{l, k\}$, $\lim_{n\to\infty} \min\{x_n, y_n\} = \min\{l, k\}$. You may use the formula $\max\{x, y\} = \frac{1}{2}(x + y + |x - y|)$ and the similar one for $\min\{x, y\}$.

Proposition 1.1.4 (Order Rule). Suppose both $\{x_n\}$ and $\{y_n\}$ converge.

1. If $x_n \ge y_n$ for big $n$, then $\lim_{n\to\infty} x_n \ge \lim_{n\to\infty} y_n$.

2. If $\lim_{n\to\infty} x_n > \lim_{n\to\infty} y_n$, then $x_n > y_n$ for big $n$.

A special case of the property is that $\lim_{n\to\infty} x_n < l$ implies $x_n < l$ for sufficiently big $n$, and $x_n \le l$ implies $\lim_{n\to\infty} x_n \le l$.

Proof. We prove the second statement first. Suppose $\lim_{n\to\infty} x_n > \lim_{n\to\infty} y_n$. Then by Proposition 1.1.3, $\lim_{n\to\infty} (x_n - y_n) = \lim_{n\to\infty} x_n - \lim_{n\to\infty} y_n > 0$. For $\varepsilon = \lim_{n\to\infty} (x_n - y_n) > 0$, there is $N$, such that

$$n > N \implies |(x_n - y_n) - \varepsilon| < \varepsilon \implies x_n - y_n > \varepsilon - \varepsilon = 0 \iff x_n > y_n.$$

By exchanging $x_n$ and $y_n$ in the second statement, we find that

$$\lim_{n\to\infty} x_n < \lim_{n\to\infty} y_n \implies x_n < y_n \text{ for big } n.$$

This further implies that we cannot have $x_n \ge y_n$ for big $n$. The combined implication

$$\lim_{n\to\infty} x_n < \lim_{n\to\infty} y_n \implies \text{opposite of } (x_n \ge y_n \text{ for big } n)$$

is equivalent to the first statement.

In the second part of the proof above, we used the logical fact that "$A \implies B$" is the same as "(not $B$) $\implies$ (not $A$)". Moreover, we note that the following two statements are not opposite of each other.

1. There is $N$, such that $x_n < y_n$ for $n > N$.

2. There is $N$, such that $x_n \ge y_n$ for $n > N$.

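Proposition 1.1.5 (Sandwich Rule). Suppose

$$x_n \le y_n \le z_n, \quad \lim_{n\to\infty} x_n = \lim_{n\to\infty} z_n = l.$$

Then

$$\lim_{n\to\infty} y_n = l.$$

Proof. For any $\varepsilon > 0$, there are $N_1$ and $N_2$, such that

$$n > N_1 \implies |x_n - l| < \varepsilon, \quad n > N_2 \implies |z_n - l| < \varepsilon.$$

Then

$$n > \max\{N_1, N_2\} \implies -\varepsilon < x_n - l \le y_n - l \le z_n - l < \varepsilon \implies |y_n - l| < \varepsilon.$$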


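Example 1.1.9. To find the limit of the sequence $\left\{\frac{\cos n}{n}\right\}$, we compare it with the sequence $\left\{\frac{1}{n}\right\}$ in Example 1.1.1. Since $-\frac{1}{n} \le \frac{\cos n}{n} \le \frac{1}{n}$ and $\lim_{n\to\infty} \frac{1}{n} = \lim_{n\to\infty} \left(-\frac{1}{n}\right) = 0$, the sandwich rule gives $\lim_{n\to\infty} \frac{\cos n}{n} = 0$.

For similar reasons, we get $\lim_{n\to\infty} \frac{\sin n}{n} = 0$ and $\lim_{n\to\infty} \frac{(-1)^n}{n} = 0$. Then by the arithmetic rule,

$$\begin{aligned}
\lim_{n\to\infty} \frac{n - \cos n}{n^2 + (-1)^n \sin n}
&= \lim_{n\to\infty} \frac{1}{n} \cdot \frac{1 - \frac{\cos n}{n}}{1 + \frac{\sin n}{n} \cdot \frac{(-1)^n}{n}} \\
&= \left(\lim_{n\to\infty} \frac{1}{n}\right) \frac{1 - \left(\lim_{n\to\infty} \frac{\cos n}{n}\right)}{1 + \left(\lim_{n\to\infty} \frac{\sin n}{n}\right) \left(\lim_{n\to\infty} \frac{(-1)^n}{n}\right)} \\
&= 0 \cdot \frac{1 - 0}{1 + 0 \cdot 0} = 0.
\end{aligned}$$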
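A quick floating point look at Example 1.1.9 is sketched below. It checks the sandwich inequalities for $\frac{\cos n}{n}$ over a finite range and evaluates the quotient at a few large indices; the function names are illustrative assumptions, and floating point evaluation only suggests the limit.

```python
import math

def x(n):
    """The quotient treated in Example 1.1.9."""
    return (n - math.cos(n)) / (n**2 + (-1) ** n * math.sin(n))

def sandwiched(n):
    """Check -1/n <= cos(n)/n <= 1/n for a single index n."""
    return -1 / n <= math.cos(n) / n <= 1 / n

print(all(sandwiched(n) for n in range(1, 2000)))
for n in (10, 1000, 10**6):
    # x(n) behaves like 1/n, so the values shrink toward 0.
    print(n, x(n))
```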

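Example 1.1.10. Suppose $\{x_n\}$ is a sequence satisfying $|x_n - l| < \frac{1}{n}$. Then we have $l - \frac{1}{n} < x_n < l + \frac{1}{n}$. Since $\lim_{n\to\infty} \left(l - \frac{1}{n}\right) = \lim_{n\to\infty} \left(l + \frac{1}{n}\right) = l$, by the sandwich rule, we get $\lim_{n\to\infty} x_n = l$.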

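Example 1.1.11. For any $a > 1$ and $n > a$, we have $1 < \sqrt[n]{a} < \sqrt[n]{n}$. Thus by the limit (1.1.4) and the sandwich rule, we have $\lim_{n\to\infty} \sqrt[n]{a} = 1$. On the other hand, for $0 < a < 1$, we have $b = \frac{1}{a} > 1$ and

$$\lim_{n\to\infty} \sqrt[n]{a} = \lim_{n\to\infty} \frac{1}{\sqrt[n]{b}} = \frac{1}{\lim_{n\to\infty} \sqrt[n]{b}} = 1.$$

Combining all the cases, we get $\lim_{n\to\infty} \sqrt[n]{a} = 1$ for any $a > 0$. Furthermore, we have

$$1 < (n + a)^{\frac{1}{n+b}} < (2n)^{\frac{2}{n}}, \quad 1 < (n^2 + an + b)^{\frac{1}{n+c}} < (2n^2)^{\frac{2}{n}}$$

for sufficiently big $n$. By

$$\lim_{n\to\infty} (2n)^{\frac{2}{n}} = \left(\lim_{n\to\infty} \sqrt[n]{2} \lim_{n\to\infty} \sqrt[n]{n}\right)^2 = (1 \cdot 1)^2 = 1,$$

$$\lim_{n\to\infty} (2n^2)^{\frac{2}{n}} = \left(\lim_{n\to\infty} \sqrt[n]{2}\right)^2 \left(\lim_{n\to\infty} \sqrt[n]{n}\right)^4 = 1^2 \cdot 1^4 = 1,$$

and the sandwich rule, we get $\lim_{n\to\infty} (n + a)^{\frac{1}{n+b}} = \lim_{n\to\infty} (n^2 + an + b)^{\frac{1}{n+c}} = 1$.

The same idea leads to the limit

$$\lim_{n\to\infty} (a_p n^p + a_{p-1} n^{p-1} + \cdots + a_1 n + a_0)^{\frac{1}{n+a}} = 1. \tag{1.1.7}$$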

Example 1.1.12. Consider $\lim_{n\to\infty} \frac{n^p}{a^n}$ for $|a| > 1$ and any $p$. The special case $p = 0$ is the limit (1.1.3), and the special case $p = 1$, $a = 2$ is Exercise 1.1.12. Let $|a| = 1 + b$. Since $|a| > 1$, we have $b > 0$. Fix a natural number $P \ge p$. Then for $n > P$,

\[
|a|^n = 1 + nb + \frac{n(n-1)}{2} b^2 + \cdots + \frac{n(n-1)\cdots(n-P)}{(P+1)!} b^{P+1} + \cdots
> \frac{n(n-1)\cdots(n-P)}{(P+1)!} b^{P+1}.
\]

Thus

\[
0 < \left|\frac{n^p}{a^n}\right| \le \frac{n^P}{|a|^n}
< \frac{(P+1)! \, n^P}{n(n-1)\cdots(n-P) \, b^{P+1}}
= \frac{1}{n} \cdot \frac{n}{n-1} \cdot \frac{n}{n-2} \cdots \frac{n}{n-P} \cdot \frac{(P+1)!}{b^{P+1}}.
\]

By $\lim_{n\to\infty} \frac{1}{n} = 0$, $\lim_{n\to\infty} \frac{n}{n-k} = 1$, the fact that $P$ and $b$ are fixed constants, and the arithmetic rule, the right side has limit 0 as $n \to \infty$. By the sandwich rule, we conclude that

\[ \lim_{n\to\infty} \frac{n^p}{a^n} = 0 \text{ for } |a| > 1 \text{ and any } p. \tag{1.1.8} \]
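The estimate in Example 1.1.12 can be explored numerically. In the sketch below, `ratio` computes $\left|\frac{n^p}{a^n}\right|$ and `proof_bound` computes the right side of the inequality in the proof; both names, and the sample values $a = 1.5$, $p = P = 3$, $b = 0.5$, are our own choices for illustration.

```python
import math

def ratio(n, p, a):
    """Absolute value of n**p / a**n."""
    return n**p / abs(a)**n

def proof_bound(n, P, b):
    """(P+1)! * n**P / (n (n-1) ... (n-P) * b**(P+1)), meaningful for n > P."""
    falling = math.prod(n - k for k in range(P + 1))  # n (n-1) ... (n-P)
    return math.factorial(P + 1) * n**P / (falling * b**(P + 1))

if __name__ == "__main__":
    a, p = 1.5, 3          # sample values with |a| > 1
    b, P = abs(a) - 1, 3   # |a| = 1 + b and P >= p, as in the example
    for n in (4, 10, 50, 100, 200):
        # The ratio stays below the bound, and the bound behaves like C / n.
        print(n, ratio(n, p, a), proof_bound(n, P, b))
```

The ratio first grows (for small $n$ the power $n^p$ dominates) and then collapses to 0, while the bound decays only like a constant multiple of $\frac{1}{n}$. The bound is crude, but it is enough for the sandwich rule.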

Exercise 1.1.18. Redo Exercise 1.1.4 by using the sandwich rule.

Exercise 1.1.19. Let $a > 0$ be a constant. Then $\frac{1}{n} < a < n$ for big $n$. Use this and the limit (1.1.4) to prove $\lim_{n\to\infty} \sqrt[n]{a} = 1$.

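Exercise 1.1.20. Compute the limits.

1. $\lim_{n\to\infty} \frac{2n^{7/4} - 3n^{3/2}}{(3n^{3/4} - n^{1/2} + 1)(n + 2)}$.

2. $\lim_{n\to\infty} \left(\sqrt{n^2 + n} - n\right)$.

3. $\lim_{n\to\infty} \frac{3^{n-1} - 8 \cdot 7^n + (-1)^{n+1}}{8^{n-1} + 2(n+1)(-5)^n}$.

4. $\lim_{n\to\infty} \frac{(-1)^n n + 1}{\sqrt{n^3} + (-1)^n}$.

5. $\lim_{n\to\infty} \frac{n! + 10^n}{n^{10} + n^n}$.

6. $\lim_{n\to\infty} \frac{(n^2 + n + 3)(n! \, 2^n + 5^n)}{(n+2)! \, 2^n + 5^n}$.

7. $\lim_{n\to\infty} \sqrt[n]{\sqrt{n} + \cos n + \sin n}$.

8. $\lim_{n\to\infty} \sqrt[n]{11 \cdot 2^n + 2 \cdot 5^n + 5 \cdot 11^n}$.

9. $\lim_{n\to\infty} \sqrt[n]{11^n n - 5^n (n^2 + 1)}$.

10. $\lim_{n\to\infty} \sqrt[n]{1 + \sqrt[n]{2 + \sqrt[n]{3 + n}}}$.

11. $\lim_{n\to\infty} \sqrt[n]{\frac{1 \cdot 3 \cdot 5 \cdots (2n-1)}{2 \cdot 4 \cdot 6 \cdots 2n}}$.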

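Exercise 1.1.21. Compute the limits.

1. $\lim_{n\to\infty} \frac{a^n}{a^n + 1}$, where $a \ne -1$.

2. $\lim_{n\to\infty} (n + a)^{\frac{n}{n^2 + bn + c}}$.

3. $\lim_{n\to\infty} \sqrt[n]{a^n + b^n + c^n}$, where $a, b, c > 0$.

4. $\lim_{n\to\infty} \left(\frac{1}{\sqrt{1 + n^2}} + \frac{1}{\sqrt{2 + n^2}} + \cdots + \frac{1}{\sqrt{n + n^2}}\right)$.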


1.1.3 Infinity and Infinitesimal

A changing numerical quantity is an infinity if it tends to get arbitrarily big. For sequences, this means the following.

Definition 1.1.6. A sequence $\{x_n\}$ diverges to infinity, denoted $\lim_{n\to\infty} x_n = \infty$, if for any $b$, there is $N$, such that

\[ n > N \implies |x_n| > b. \tag{1.1.9} \]

It diverges to positive infinity, denoted $\lim_{n\to\infty} x_n = +\infty$, if for any $b$, there is $N$, such that

\[ n > N \implies x_n > b. \tag{1.1.10} \]

It diverges to negative infinity, denoted $\lim_{n\to\infty} x_n = -\infty$, if for any $b$, there is $N$, such that

\[ n > N \implies x_n < b. \tag{1.1.11} \]

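Example 1.1.13. We rigorously verify $\lim_{n\to\infty} \frac{n^2}{n + (-1)^n} = +\infty$. For any $b > 0$, choose $N = 2b$. Then

\[ n > N \implies \frac{n^2}{n + (-1)^n} \ge \frac{n^2}{n + 1} > \frac{n^2}{2n} = \frac{n}{2} > \frac{N}{2} = b. \]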
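The choice $N = 2b$ in Example 1.1.13 can be tested with exact rational arithmetic. In the sketch below the names `x` and `check` and the parameter `span` are ours; the check starts at $n \ge 2$ because the denominator $n + (-1)^n$ vanishes at $n = 1$, and it only samples finitely many indices, so it illustrates the definition rather than proving it.

```python
from fractions import Fraction

def x(n):
    """n^2 / (n + (-1)^n), defined for n >= 2."""
    return Fraction(n * n, n + (-1) ** n)

def check(b, span=1000):
    """Check x(n) > b for the first `span` integers n > N = 2b (with n >= 2)."""
    N = 2 * Fraction(b)
    start = max(int(N) + 1, 2)  # smallest admissible integer n > N
    return all(x(n) > b for n in range(start, start + span))

if __name__ == "__main__":
    for b in (Fraction(1, 4), Fraction(7, 2), 50, 1000):
        print(b, check(b))
```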

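Exercise 1.1.22. Rigorously verify the divergence to infinity.

1. $\lim_{n\to\infty} (100 + 10n - n^2) = -\infty$.

2. $\lim_{n\to\infty} \frac{(-1)^n \sqrt{n}}{2 + \sin n} = \infty$.

3. $\lim_{n\to\infty} \sqrt{n + \sqrt{(n-1) + \cdots + \sqrt{2 + \sqrt{1}}}} = +\infty$.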

Exercise 1.1.23. Infinities must be unbounded. Is the converse true?

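Exercise 1.1.24. Suppose $\lim_{n\to\infty} x_n = +\infty$, $\lim_{n\to\infty} y_n = +\infty$. Prove $\lim_{n\to\infty} (x_n + y_n) = +\infty$, $\lim_{n\to\infty} x_n y_n = +\infty$.

Exercise 1.1.25. Suppose $\lim_{n\to\infty} x_n = \infty$ and $|x_n - x_{n+1}| < c$ for some constant $c$. Prove that either $\lim_{n\to\infty} x_n = +\infty$ or $\lim_{n\to\infty} x_n = -\infty$. Furthermore, if $\lim_{n\to\infty} x_n = +\infty$ and $x > x_1$, then prove $x < x_n < x + c$ for some $n$.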

A changing numerical quantity is an infinitesimal if it tends to get arbitrarily small. For sequences, this means that for any $\epsilon > 0$, there is $N$, such that

\[ n > N \implies |x_n| < \epsilon. \tag{1.1.12} \]

This is simply $\lim_{n\to\infty} x_n = 0$. Note that the implication (1.1.9) for $\{x_n\}$ is equivalent to the implication (1.1.12) for $\left\{\frac{1}{x_n}\right\}$, by taking $\epsilon = \frac{1}{b}$. Therefore we have

\[ \{x_n\} \text{ is an infinity} \iff \left\{\frac{1}{x_n}\right\} \text{ is an infinitesimal}. \]


For example, the infinitesimals (1.1.2), (1.1.3), (1.1.5), (1.1.8) tell us that $\{n^a\}$ (for $a > 0$), $\{a^n\}$ (for $|a| > 1$), $\left\{\frac{n!}{a^n}\right\}$, and $\left\{\frac{a^n}{n^p}\right\}$ (for $|a| > 1$) are infinities. Moreover, the first case in the limit (1.1.6) tells us

\[ \lim_{n\to\infty} \frac{a_p n^p + a_{p-1} n^{p-1} + \cdots + a_1 n + a_0}{b_q n^q + b_{q-1} n^{q-1} + \cdots + b_1 n + b_0} = \infty \text{ if } p > q, \ a_p \ne 0. \]

On the other hand, since $\lim_{n\to\infty} x_n = l$ is equivalent to $\lim_{n\to\infty} (x_n - l) = 0$, we have

\[ \{x_n\} \text{ converges to } l \iff \{x_n - l\} \text{ is an infinitesimal}. \]

For example, the limit (1.1.4) tells us that $\{\sqrt[n]{n} - 1\}$ is an infinitesimal.

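Exercise 1.1.26. How to characterize a positive infinity $\{x_n\}$ in terms of the infinitesimal $\left\{\frac{1}{x_n}\right\}$?

Exercise 1.1.27. Explain the infinities.

1. $\lim_{n\to\infty} \frac{n!}{a^n} = \infty$ for any $a \ne 0$.

2. $\lim_{n\to\infty} \frac{n!}{a^n + b^n} = \infty$ if $a + b \ne 0$.

3. $\lim_{n\to\infty} \frac{1}{\sqrt[n]{n} - 1} = +\infty$.

4. $\lim_{n\to\infty} \frac{1}{\sqrt[n]{n} - \sqrt[n]{2n}} = -\infty$.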

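Some properties of finite limits can be extended to infinities and infinitesimals. For example, if $\lim_{n\to\infty} x_n = +\infty$ and $\lim_{n\to\infty} y_n = +\infty$, then $\lim_{n\to\infty} (x_n + y_n) = +\infty$. The property can be denoted as the arithmetic rule $(+\infty) + (+\infty) = +\infty$. Moreover, if $\lim_{n\to\infty} x_n = 1$, $\lim_{n\to\infty} y_n = 0$ and $y_n < 0$ for big $n$, then $\lim_{n\to\infty} \frac{x_n}{y_n} = -\infty$. Thus we have another arithmetic rule $\frac{1}{0^-} = -\infty$. Common sense suggests more arithmetic rules such as

\[ c + \infty = \infty, \quad c \cdot \infty = \infty \ (\text{for } c \ne 0), \quad \infty \cdot \infty = \infty, \quad \frac{c}{0} = \infty \ (\text{for } c \ne 0), \quad \frac{c}{\infty} = 0, \]

where $c$ is a finite number and represents a sequence convergent to $c$.

We must be careful in applying arithmetic rules involving infinities and infinitesimals. For example, we have

\[ \lim_{n\to\infty} n^{-1} = 0, \quad \lim_{n\to\infty} 2n^{-1} = 0, \quad \lim_{n\to\infty} n^{-2} = 0, \]

\[ \lim_{n\to\infty} \frac{n^{-1}}{2n^{-1}} = \frac{1}{2}, \quad \lim_{n\to\infty} \frac{n^{-1}}{n^{-2}} = +\infty, \quad \lim_{n\to\infty} \frac{n^{-2}}{2n^{-1}} = 0, \]

so that $\frac{0}{0}$ has no definite value.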
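The three quotients above can be computed exactly to see why $\frac{0}{0}$ has no definite value. The sketch below is an illustration only; the function name `ratios` is ours.

```python
from fractions import Fraction

def ratios(n):
    """Quotients of the infinitesimals 1/n, 2/n and 1/n^2 at index n."""
    a = Fraction(1, n)       # n^{-1}
    b = Fraction(2, n)       # 2 n^{-1}
    c = Fraction(1, n * n)   # n^{-2}
    return a / b, a / c, c / b

if __name__ == "__main__":
    for n in (10, 1000, 10**6):
        # First quotient is constantly 1/2, second equals n, third equals 1/(2n).
        print(n, ratios(n))
```

All three numerators and denominators tend to 0, yet the quotients stay at $\frac{1}{2}$, grow without bound, and shrink to 0 respectively.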


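Example 1.1.14. By the arithmetic rule, we have

\[ \lim_{n\to\infty} (n^2 + 3)(-2)^n = \lim_{n\to\infty} (n^2 + 3) \lim_{n\to\infty} (-2)^n = \infty \cdot \infty = \infty, \]

\[ \lim_{n\to\infty} \left(n + \frac{1}{n}\right) = \lim_{n\to\infty} n + \lim_{n\to\infty} \frac{1}{n} = (+\infty) + 0 = +\infty, \]

\[ \lim_{n\to\infty} \frac{\sqrt[n]{n}}{\sqrt[n]{n} - 1} = \frac{\lim_{n\to\infty} \sqrt[n]{n}}{\lim_{n\to\infty} (\sqrt[n]{n} - 1)} = \frac{1}{0^+} = +\infty. \]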

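Exercise 1.1.28. Prove the properties of infinities.

1. (bounded) $+ \infty = \infty$: If $\{x_n\}$ is bounded and $\lim_{n\to\infty} y_n = \infty$, then $\lim_{n\to\infty} (x_n + y_n) = \infty$.

2. $\min\{+\infty, +\infty\} = +\infty$: If $\lim_{n\to\infty} x_n = \lim_{n\to\infty} y_n = +\infty$, then $\lim_{n\to\infty} \min\{x_n, y_n\} = +\infty$.

3. Sandwich rule: If $x_n \ge y_n$ and $\lim_{n\to\infty} y_n = +\infty$, then $\lim_{n\to\infty} x_n = +\infty$.

4. $(> c > 0) \cdot (+\infty) = +\infty$: If $x_n > c$ for some constant $c > 0$ and $\lim_{n\to\infty} y_n = +\infty$, then $\lim_{n\to\infty} x_n y_n = +\infty$.

Exercise 1.1.29. Show that it is not necessarily true that $\infty + \infty = \infty$ by constructing examples of sequences $\{x_n\}$ and $\{y_n\}$ that diverge to $\infty$ but one of the following holds.

1. $\lim_{n\to\infty} (x_n + y_n) = 2$.

2. $\lim_{n\to\infty} (x_n + y_n) = +\infty$.

3. $\{x_n + y_n\}$ is bounded and divergent.

Exercise 1.1.30. Show that one cannot make a definite conclusion on $0 \cdot \infty$ by constructing examples of sequences $\{x_n\}$ and $\{y_n\}$, such that $\lim_{n\to\infty} x_n = 0$ and $\lim_{n\to\infty} y_n = \infty$ but one of the following holds.

1. $\lim_{n\to\infty} x_n y_n = 2$.

2. $\lim_{n\to\infty} x_n y_n = 0$.

3. $\lim_{n\to\infty} x_n y_n = \infty$.

4. $\{x_n y_n\}$ is bounded and divergent.

Exercise 1.1.31. Provide counterexamples to the wrong arithmetic rules.

\[ \frac{+\infty}{+\infty} = 1, \quad (+\infty) - (+\infty) = 0, \quad 0 \cdot \infty = 0, \quad 0 \cdot \infty = \infty, \quad 0 \cdot \infty = 1. \]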

1.1.4 Additional Exercise

Ratio Rule

Exercise 1.1.32. Suppose $\left|\frac{x_{n+1}}{x_n}\right| \le \left|\frac{y_{n+1}}{y_n}\right|$.

1. Prove that $|x_n| \le c|y_n|$ for some constant $c$.

2. Prove that $\lim_{n\to\infty} y_n = 0$ implies $\lim_{n\to\infty} x_n = 0$.

3. Prove that $\lim_{n\to\infty} x_n = \infty$ implies $\lim_{n\to\infty} y_n = \infty$.

Exercise 1.1.33. Suppose $\lim_{n\to\infty} \frac{x_{n+1}}{x_n} = l$. What can you say about $\lim_{n\to\infty} x_n$ by looking at the value of $l$?

Exercise 1.1.34. Use the ratio rule to study the limits (1.1.8) and $\lim_{n\to\infty} \frac{(n!)^2 a^n}{(2n)!}$.

Power Rule

Exercise 1.1.35. Suppose $\lim_{n\to\infty} x_n = l > 0$. Prove $\lim_{n\to\infty} x_n^\alpha = l^\alpha$ by the following steps.

1. Assume $x_n \ge 1$ and $l = 1$. By using the sandwich rule and the fact that $\lim_{n\to\infty} x_n^a = 1$ for any integer $a$, prove that $\lim_{n\to\infty} x_n^\alpha = 1$ for any number $\alpha$. The same argument applies to the case $x_n \le 1$.

2. Use $\min\{x_n, 1\} \le x_n \le \max\{x_n, 1\}$, Exercise 1.1.17 and the sandwich rule to remove the assumption $x_n \ge 1$ in the first part.

3. Use the arithmetic rule to prove the limit for general $l$.

Average Rule

Exercise 1.1.36. Suppose $\lim_{n\to\infty} x_n = l$. Let $y_n = \frac{x_1 + x_2 + \cdots + x_n}{n}$.

1. Prove that if $|x_n - l| < \epsilon$ for $n > N$, where $N$ is a natural number, then

\[ n > N \implies |y_n - l| < \frac{|x_1| + |x_2| + \cdots + |x_N| + N|l|}{n} + \epsilon. \]

2. Use the first part and Proposition 1.1.2 to prove $\lim_{n\to\infty} y_n = l$.

3. What happens if $l = +\infty$ or $\infty$?

Exercise 1.1.37. Find a suitable condition on a sequence $\{a_n\}$ of positive numbers, such that $\lim_{n\to\infty} x_n = l$ implies $\lim_{n\to\infty} \frac{a_1 x_1 + a_2 x_2 + \cdots + a_n x_n}{a_1 + a_2 + \cdots + a_n} = l$.

1.2 Convergence of Sequence Limit

The discussion of convergent sequences in Section 1.1 is based on the explicit value of the limit. However, there are many cases in which a sequence must be convergent, but the value of the limit is not known. The limit of the world record in the 100 meter dash is one such example. In such cases, the existence of the limit cannot be established by using the definition alone. A more fundamental theory is needed.


1.2.1 Necessary Condition

A subsequence of a sequence $\{x_n\}$ is obtained by selecting some terms from the sequence. The indices of the selected terms can be arranged as a strictly increasing sequence $n_1 < n_2 < \cdots < n_k < \cdots$, and the subsequence can be denoted as $\{x_{n_k}\}$. The following are two examples of subsequences.

\[ \{x_{3k}\}: x_3, x_6, x_9, x_{12}, x_{15}, x_{18}, \dots \]

\[ \{x_{2^k}\}: x_2, x_4, x_8, x_{16}, x_{32}, x_{64}, \dots \]

Note that if $\{x_n\}$ starts from $n = 1$, then $n_k \ge k$. Thus by reindexing the terms if necessary, we will always assume $n_k \ge k$ in subsequent proofs.

Proposition 1.2.1. Suppose a sequence converges to $l$. Then all its subsequences converge to $l$.

Proof. Suppose $\lim_{n\to\infty} x_n = l$. For any $\epsilon > 0$, there is $N$, such that $n > N$ implies $|x_n - l| < \epsilon$. Then

\[ k > N \implies n_k \ge k > N \implies |x_{n_k} - l| < \epsilon. \]

Example 1.2.1. By $\lim_{n\to\infty} \sqrt[n]{n} = 1$, we have $\lim_{k\to\infty} \sqrt[2k]{2k} = 1$. Taking the square of the limit, we get $\lim_{n\to\infty} \sqrt[n]{2n} = \left(\lim_{k\to\infty} \sqrt[2k]{2k}\right)^2 = 1$.

Example 1.2.2. The sequence $\{(-1)^n\}$ has subsequences $\{(-1)^{2k}\} = \{1\}$ and $\{(-1)^{2k+1}\} = \{-1\}$. Since the two subsequences have different limits, the original sequence $\{(-1)^n\}$ diverges.

Exercise 1.2.1. Prove the sequences diverge.

1. $\frac{(-1)^n 2n + 1}{n + 2}$.

2. $\frac{(-1)^n 2n(n + 1)}{(\sqrt{n} + 2)^3}$.

3. $\sqrt{n}\left(\sqrt{n + (-1)^n} - \sqrt{n - (-1)^n}\right)$.

4.n sin

3n cos

2+ 2

.

5. $\frac{(-1)^n n}{n + 1}$.

6. $x_{2n} = \frac{1}{n}$, $x_{2n+1} = \sqrt[n]{n}$.

7. $\sqrt[n]{2^n + 3^{(-1)^n n}}$.

8. $1 + n \sin \frac{n\pi}{2}$.

9. $\cos^n \frac{2n\pi}{3}$.

Exercise 1.2.2. Prove that $\lim_{n\to\infty} x_n = l$ if and only if $\lim_{k\to\infty} x_{2k} = l$ and $\lim_{k\to\infty} x_{2k+1} = l$.

Exercise 1.2.3. What is wrong with the following application of Propositions 1.1.3 and 1.2.1: The sequence $x_n = (-1)^n$ satisfies $x_{n+1} = -x_n$. Therefore $\lim_{n\to\infty} x_n = \lim_{n\to\infty} x_{n+1} = -\lim_{n\to\infty} x_n$, and we get $\lim_{n\to\infty} x_n = 0$.

Exercise 1.2.4. Prove that if a sequence diverges to infinity, then all its subsequences diverge to infinity.


Theorem 1.2.2 (Cauchy¹ Criterion). Suppose a sequence $\{x_n\}$ converges. Then for any $\epsilon > 0$, there is $N$, such that

\[ m, n > N \implies |x_m - x_n| < \epsilon. \]

Proof. Suppose $\lim_{n\to\infty} x_n = l$. For any $\epsilon > 0$, there is $N$, such that $n > N$ implies $|x_n - l| < \frac{\epsilon}{2}$. Then $m, n > N$ implies

\[ |x_m - x_n| = |(x_m - l) - (x_n - l)| \le |x_m - l| + |x_n - l| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon. \]

Cauchy criterion plays a critical role in analysis. Therefore the criterion is called a theorem instead of just a proposition. Moreover, the property described in the criterion is given a special name.

Definition 1.2.3. A sequence $\{x_n\}$ is called a Cauchy sequence if for any $\epsilon > 0$, there is $N$, such that

\[ m, n > N \implies |x_m - x_n| < \epsilon. \tag{1.2.1} \]

Theorem 1.2.2 says that convergent sequences must be Cauchy sequences. The converse that any Cauchy sequence is convergent is also true and is one of the most fundamental results in analysis.

Example 1.2.3. Consider the sequence $\{(-1)^n\}$. For $\epsilon = 1 > 0$ and any $N$, we can find an even $n > N$. Then $m = n + 1 > N$ is odd and $|x_m - x_n| = 2 > \epsilon$. Therefore the Cauchy criterion fails and the sequence diverges.

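Example 1.2.4 (Oresme²). The harmonic sequence

\[ x_n = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} \]

satisfies

\[ x_{2n} - x_n = \frac{1}{n+1} + \frac{1}{n+2} + \cdots + \frac{1}{2n} \ge \frac{1}{2n} + \frac{1}{2n} + \cdots + \frac{1}{2n} = \frac{n}{2n} = \frac{1}{2}. \]

Thus for $\epsilon = \frac{1}{2}$ and any $N$, we have $|x_m - x_n| \ge \frac{1}{2}$ by taking any natural number $n > N$ and $m = 2n$. Therefore the Cauchy criterion fails and the harmonic sequence diverges.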
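The key inequality $x_{2n} - x_n \ge \frac{1}{2}$ of Example 1.2.4 can be confirmed with exact rational arithmetic for small $n$. The sketch below is only a finite check; the names `harmonic` and `gap` are ours.

```python
from fractions import Fraction

def harmonic(n):
    """Exact partial sum 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

def gap(n):
    """x_{2n} - x_n, the block of n terms from 1/(n+1) to 1/(2n)."""
    return harmonic(2 * n) - harmonic(n)

if __name__ == "__main__":
    for n in (1, 2, 5, 10, 50):
        # The gap never drops below 1/2, so the sequence is not Cauchy.
        print(n, gap(n), float(gap(n)))
```

Equality holds only at $n = 1$. For larger $n$ the gap increases towards a limit between $\frac{1}{2}$ and 1, which is more than the Cauchy criterion can tolerate for $\epsilon = \frac{1}{2}$.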

¹ Augustin Louis Cauchy, born 1789 in Paris (France), died 1857 in Sceaux (France). His contributions to mathematics can be seen by the numerous mathematical terms bearing his name, including Cauchy integral theorem (complex functions), Cauchy-Kovalevskaya theorem (differential equations), Cauchy-Riemann equations, Cauchy sequences. He produced 789 mathematics papers and his collected works were published in 27 volumes.

² Nicole Oresme, born 1323 in Allemagne (France), died 1382 in Lisieux (France).

Example 1.2.5. We show the sequence $\{\sin n\}$ diverges. The real line is divided by $k\pi \pm \frac{\pi}{4}$ into intervals of length $\frac{\pi}{2} > 1$. Therefore for any integer $k$, there are integers $m$ and $n$ satisfying

\[ 2k\pi + \frac{\pi}{4} < m < 2k\pi + \frac{3\pi}{4}, \quad 2k\pi - \frac{\pi}{4} > n > 2k\pi - \frac{3\pi}{4}. \]

Moreover, by taking $k$ to be a big positive number, $m$ and $n$ can be as big as we wish. Then $\sin m > \frac{1}{\sqrt{2}}$, $\sin n < -\frac{1}{\sqrt{2}}$, and we have $|\sin m - \sin n| > \sqrt{2}$. Thus the sequence $\{\sin n\}$ is not Cauchy and must diverge.
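The integers $m$ and $n$ of Example 1.2.5 can be produced explicitly. The sketch below picks the smallest integer above the left endpoint of each interval, which lies inside the interval because its length $\frac{\pi}{2}$ exceeds 1. The name `witnesses` is ours, and the computation relies on floating-point values of $\pi$, so it should be trusted only for moderate $k$.

```python
import math

def witnesses(k):
    """Integers m in (2k*pi + pi/4, 2k*pi + 3pi/4) and n in (2k*pi - 3pi/4, 2k*pi - pi/4)."""
    m = math.floor(2 * k * math.pi + math.pi / 4) + 1
    n = math.floor(2 * k * math.pi - 3 * math.pi / 4) + 1
    return m, n

if __name__ == "__main__":
    for k in (1, 2, 10, 40):
        m, n = witnesses(k)
        # sin m exceeds 1/sqrt(2) and sin n is below -1/sqrt(2).
        print(k, m, n, math.sin(m), math.sin(n))
```

For $k = 1$ this gives $m = 8$ and $n = 4$, with $\sin 8 \approx 0.989$ and $\sin 4 \approx -0.757$.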

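Exercise 1.2.5. Prove the sequences diverge.

1. $x_n = 1 + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + \cdots + \frac{1}{\sqrt{n}}$.

2. $x_n = \frac{1}{2} + \frac{2}{5} + \frac{3}{10} + \cdots + \frac{n}{n^2 + 1}$.

Exercise 1.2.6. Prove that if $\lim_{n\to\infty} x_n = +\infty$ and $|x_n - x_{n+1}| < c$ for some constant $c < \pi$, then $\{\sin x_n\}$ diverges.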

1.2.2 Supremum and Infimum

To discuss the converse of Theorem 1.2.2, we have to consider the difference between rational and real numbers. Specifically, consider the converse of Cauchy criterion stated for the real and the rational number systems:

1. Real number Cauchy sequences always have real number limits.

2. Rational number Cauchy sequences always have rational number limits.

The key distinction here is that a sequence of rational numbers may have an irrational number as the limit. For example, the rational number sequence of the decimal approximations of $\sqrt{2}$ in Example 1.1.3 is a Cauchy sequence but has no rational number limit. This shows that the second statement is wrong.

Therefore the truthfulness of the first statement is closely related to the fundamental question of the definition of real numbers. In other words, the establishment of the first property must also point to the key difference between the rational and the real number systems. One solution to the fundamental question is to simply use the converse of Cauchy criterion as the way of constructing real numbers from rational numbers (by requiring that all Cauchy sequences converge). This is the topological approach and can be dealt with in the larger context of the completion of metric spaces. Alternatively, real numbers can be constructed by considering the order among the numbers. The subsequent discussion will be based on this more intuitive approach, which is called the Dedekind³ cut.

³ Julius Wilhelm Richard Dedekind, born 1831 and died 1916 in Braunschweig (Germany). Dedekind came up with the idea of the cut on November 24 of 1858 while thinking how to teach calculus. He made important contributions to algebraic number theory. His work introduced a new style of mathematics that influenced generations of mathematicians.

The order relation between real numbers enables us to introduce the following concept.

Definition 1.2.4. Let $X$ be a nonempty set of numbers. An upper bound of $X$ is a number $B$ such that $x \le B$ for any $x \in X$. The supremum of $X$ is the least upper bound of the set and is denoted $\sup X$.

The supremum $\lambda = \sup X$ is characterized by the following properties.

1. $\lambda$ is an upper bound: For any $x \in X$, we have $x \le \lambda$.

2. Any number smaller than $\lambda$ is not an upper bound: For any $\epsilon > 0$, there is $x \in X$, such that $x > \lambda - \epsilon$.

The lower bound and the infimum $\inf X$ can be similarly defined and characterized.

Example 1.2.6. Both the set $\{1, 2\}$ and the interval $[0, 2]$ have 2 as the supremum. In general, the maximum of a set $X$ is a number $\xi \in X$ such that $\xi \ge x$ for any $x \in X$, and the maximum (if it exists) is always the supremum.

On the other hand, the interval $(0, 2)$ has no maximum but still has 2 as the supremum. A similar discussion can be made for the minimum.

Example 1.2.7. $\sqrt{2}$ is the supremum of the set

\[ \{1.4, 1.41, 1.414, 1.4142, 1.41421, 1.414213, 1.4142135, 1.41421356, \dots\} \]

of its decimal expansions. It is also the supremum of the set

\[ \left\{\frac{m}{n} : m \text{ and } n \text{ are natural numbers satisfying } m^2 < 2n^2\right\} \]

of positive rational numbers whose squares are less than 2.

Example 1.2.8. Let $L_n$ be the length of an edge of the inscribed regular $n$-gon in a circle of radius 1. Then $2\pi$ is the supremum of the set $\{3L_3, 4L_4, 5L_5, \dots\}$ of the perimeters of the $n$-gons.
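Example 1.2.8 can be illustrated numerically. The sketch below assumes the standard chord formula $L_n = 2\sin\frac{\pi}{n}$ for the edge of the regular $n$-gon inscribed in the unit circle, which is not stated in the text; the function name `inscribed_perimeter` is ours.

```python
import math

def inscribed_perimeter(n):
    """Perimeter n * L_n of the regular n-gon inscribed in the unit circle,
    using the chord length L_n = 2 sin(pi / n)."""
    return n * 2 * math.sin(math.pi / n)

if __name__ == "__main__":
    for n in (3, 6, 12, 100, 10**4):
        p = inscribed_perimeter(n)
        # Every perimeter is below 2*pi, and the shortfall shrinks as n grows.
        print(n, p, 2 * math.pi - p)
```

The printed shortfall $2\pi - nL_n$ is positive for every $n$ and becomes arbitrarily small, which matches the two properties characterizing the supremum.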

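Exercise 1.2.7. Find the suprema and the infima.

1. $\{a + b : a, b \text{ are rational}, \ a^2 < 3, \ |2b + 1| < 5\}$.

2. $\left\{\frac{n}{n+1} : n \text{ is a natural number}\right\}$.

3. $\left\{\frac{(-1)^n n}{n+1} : n \text{ is a natural number}\right\}$.

4. $\left\{\frac{m}{n} : m \text{ and } n \text{ are natural numbers satisfying } m^2 > 3n^2\right\}$.

5. $\left\{\frac{1}{2^m} + \frac{1}{3^n} : m \text{ and } n \text{ are natural numbers}\right\}$.

6. $\left\{\frac{nR_n}{2} : n \ge 3 \text{ is a natural number}\right\}$, where $R_n$ is the length of an edge of the regular $n$-gon circumscribed around a circle of radius 1.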

Exercise 1.2.8. Prove the supremum is unique.

Exercise 1.2.9. Suppose $X$ is a nonempty bounded set of numbers. Prove that $\lambda = \sup X$ is characterized by the following two properties.

1. $\lambda$ is an upper bound: For any $x \in X$, we have $x \le \lambda$.

2. $\lambda$ is the limit of a sequence in $X$: There are $x_n \in X$, such that $\lambda = \lim_{n\to\infty} x_n$.

The following are some properties of the supremum and infimum.


Proposition 1.2.5. Suppose $X$ and $Y$ are nonempty bounded sets of numbers.

1. If $x \le y$ for any $x \in X$ and $y \in Y$, then $\sup X \le \inf Y$.

2. If $|x - y| \le c$ for any $x \in X$ and $y \in Y$, then $|\sup X - \sup Y| \le c$ and $|\sup X - \inf Y| \le c$.

3. If $X + Y = \{x + y : x \in X, y \in Y\}$, then $\sup(X + Y) = \sup X + \sup Y$ and $\inf(X + Y) = \inf X + \inf Y$.

4. If $cX = \{cx : x \in X\}$, then $\sup(cX) = c \sup X$ when $c > 0$ and $\sup(cX) = c \inf X$ when $c < 0$.

5. If $XY = \{xy : x \in X, y \in Y\}$ and all numbers in $X$, $Y$ are positive, then $\sup(XY) = \sup X \sup Y$ and $\inf(XY) = \inf X \inf Y$.

6. If $X^{-1} = \{x^{-1} : x \in X\}$ and all numbers in $X$ are positive, then $\sup X^{-1} = (\inf X)^{-1}$.

Proof. In the first property, fix any $y \in Y$. Then $y$ is an upper bound of $X$. Therefore $\sup X \le y$. Since $\sup X \le y$ for any $y \in Y$, $\sup X$ is a lower bound of $Y$. Therefore $\sup X \le \inf Y$.

Now consider the second property. For any $\epsilon > 0$, there are $x \in X$ and $y \in Y$, such that $\sup X - \epsilon < x \le \sup X$ and $\sup Y - \epsilon < y \le \sup Y$. Then

\[ (\sup X - \epsilon) - \sup Y < x - y < \sup X - (\sup Y - \epsilon), \]

which means exactly $|(x - y) - (\sup X - \sup Y)| < \epsilon$. This further implies

\[ |\sup X - \sup Y| < |x - y| + \epsilon \le c + \epsilon. \]

Since this holds for any $\epsilon > 0$, we conclude that $|\sup X - \sup Y| \le c$. The inequality $|\sup X - \inf Y| \le c$ can be similarly proved.

Next we prove the third property. For any $x \in X$ and $y \in Y$, we have $x + y \le \sup X + \sup Y$, so that $\sup X + \sup Y$ is an upper bound of $X + Y$. On the other hand, for any $\epsilon > 0$, there are $x \in X$ and $y \in Y$, such that $x > \sup X - \epsilon$ and $y > \sup Y - \epsilon$. Then $x + y \in X + Y$ satisfies $x + y > \sup X + \sup Y - 2\epsilon$. Since $\epsilon > 0$ is arbitrary, this shows that any number smaller than $\sup X + \sup Y$ is not an upper bound of $X + Y$. Therefore $\sup X + \sup Y$ is the supremum of $X + Y$.

The proofs of the remaining properties are left as exercises.
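Several identities of Proposition 1.2.5 can be sanity-checked on finite sets, where the supremum is simply the maximum and the infimum is the minimum. The sketch below is such a check, not a proof; the sets `X`, `Y` and all helper names are our own choices.

```python
from fractions import Fraction
from itertools import product

def sum_set(X, Y):
    """X + Y = {x + y : x in X, y in Y}."""
    return {x + y for x, y in product(X, Y)}

def scale_set(c, X):
    """cX = {c x : x in X}."""
    return {c * x for x in X}

def product_set(X, Y):
    """XY = {x y : x in X, y in Y}."""
    return {x * y for x, y in product(X, Y)}

def inverse_set(X):
    """X^{-1} = {1/x : x in X}, for sets of nonzero numbers."""
    return {1 / x for x in X}

# Two finite sets of positive rationals, chosen arbitrarily.
X = {Fraction(1, 2), Fraction(3, 4), Fraction(2)}
Y = {Fraction(1, 3), Fraction(5, 2)}

if __name__ == "__main__":
    print(max(sum_set(X, Y)) == max(X) + max(Y))       # property 3
    print(max(scale_set(-3, X)) == -3 * min(X))        # property 4, c < 0
    print(max(product_set(X, Y)) == max(X) * max(Y))   # property 5
    print(max(inverse_set(X)) == 1 / min(X))           # property 6
```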

Exercise 1.2.10. Finish the proof of Proposition 1.2.5.

Exercise 1.2.11. Suppose $X_i$ are nonempty sets of numbers. Let $X = \cup_i X_i$ and $l_i = \sup X_i$. Prove that $\sup X = \sup_i l_i$.

The existence of the supremum is what distinguishes the real numbers from the rational numbers.


Definition 1.2.6. The real numbers form a set with the usual arithmetic operations and the order satisfying the usual properties, and the additional property that any bounded set of real numbers has a supremum.

The arithmetic operations are addition, subtraction, multiplication and division. An order on a set $S$ is a relation $x < y$ defined for pairs $x, y \in S$, satisfying the following properties:

• transitivity: $x < y$ and $y < z \implies x < z$.

• exclusivity: If $x < y$, then $y < x$ does not hold.

The following are some (but not all) of the usual arithmetic and order properties.

• commutativity: $a + b = b + a$, $ab = ba$.

• distributivity: $a(b + c) = ab + ac$.

• unit: There is a special number 1 such that $1a = a$.

• order compatible with addition: $a < b \implies a + c < b + c$.

• order compatible with multiplication: $a < b$, $0 < c \implies ac < bc$.

Because of these properties, the real numbers form an ordered field. Since the rational numbers also have the arithmetic operations and the order satisfying these usual properties, the rational numbers also form an ordered field. Thus the key distinction between the real and rational numbers is the existence of the supremum. A bounded set of rational numbers may not have a rational number supremum. A bounded set of real numbers always has a real number supremum. Due to the existence of the supremum, the real numbers form a complete ordered field.

1.2.3 Monotone Sequence

A sequence $\{x_n\}$ is increasing if $x_{n+1} \ge x_n$. It is strictly increasing if $x_{n+1} > x_n$. The concepts of decreasing and strictly decreasing sequences can be similarly defined. A sequence is monotone if it is either increasing or decreasing.

Proposition 1.2.7. Bounded monotone sequences of real numbers are convergent. Unbounded monotone sequences of real numbers diverge to infinity.

Since an increasing sequence $\{x_n\}$ satisfies $x_n \ge x_1$, the sequence has $x_1$ as a lower bound. Therefore it is bounded if and only if it has an upper bound, and the proposition says that an increasing sequence with upper bound must be convergent. Similar remarks can be made for decreasing sequences.


Proof. Let $\{x_n\}$ be a bounded increasing sequence. The sequence has a real number supremum $l = \sup\{x_n\}$. For any $\epsilon > 0$, by the second property characterizing the supremum, there is $N$, such that $x_N > l - \epsilon$. Then because the sequence is increasing, $n > N$ implies $x_n \ge x_N > l - \epsilon$. We also have $x_n \le l$ because $l$ is an upper bound. Thus we conclude that

\[ n > N \implies l - \epsilon < x_n \le l \implies |x_n - l| < \epsilon. \]

This proves that the sequence converges to $l$.

Let $\{x_n\}$ be an unbounded increasing sequence. Then it has no upper bound. In other words, for any $b$, there is $N$, such that $x_N > b$. Since the sequence is increasing, we have

\[ n > N \implies x_n \ge x_N > b. \]

Thus we conclude that $\{x_n\}$ diverges to $+\infty$.

The proof for the decreasing sequences is similar.

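Example 1.2.9. The sequence in Example 1.2.4 is clearly increasing. Since it is divergent, the sequence has no upper bound. In fact, the proof of Proposition 1.2.7 tells us

\[ \lim_{n\to\infty} \left(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}\right) = +\infty. \]

On the other hand, the increasing sequence

\[ x_n = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \cdots + \frac{1}{n^2} \]

satisfies, for $n \ge 2$,

\[
x_n < 1 + \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \cdots + \frac{1}{(n-1)n}
= 1 + \left(\frac{1}{1} - \frac{1}{2}\right) + \left(\frac{1}{2} - \frac{1}{3}\right) + \cdots + \left(\frac{1}{n-1} - \frac{1}{n}\right)
= 1 + \frac{1}{1} - \frac{1}{n} < 2,
\]

and must be convergent.

Much later on, we will see that the sequence

\[ x_n = 1 + \frac{1}{2^p} + \frac{1}{3^p} + \cdots + \frac{1}{n^p} \]

converges if and only if $p > 1$.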
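The telescoping bound of Example 1.2.9 says that $x_n < 2 - \frac{1}{n}$ for the sums of $\frac{1}{k^2}$ with $n \ge 2$. The sketch below checks this exactly for small $n$; the function name `partial_sum` and the default exponent are ours.

```python
from fractions import Fraction

def partial_sum(n, p=2):
    """Exact value of 1 + 1/2^p + ... + 1/n^p for a natural number exponent p."""
    return sum(Fraction(1, k**p) for k in range(1, n + 1))

if __name__ == "__main__":
    for n in (2, 5, 10, 50):
        s = partial_sum(n)
        # The telescoping estimate gives s < 2 - 1/n, hence s < 2.
        print(n, float(s), float(2 - Fraction(1, n)))
```

The partial sums are increasing and stay below 2, which is exactly the situation covered by Proposition 1.2.7.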

Example 1.2.10. Suppose a sequence is given inductively by $x_1 = 1$, $x_{n+1} = \sqrt{2 + x_n}$. We claim that the sequence is increasing. It is easy to see that $x_{n+1} > x_n$ is equivalent to $x_n^2 - x_n - 2 = (x_n - 2)(x_n + 1) < 0$. Since the sequence is clearly positive, the problem becomes $x_n < 2$. First $x_1 < 2$. Second if $x_n < 2$, then $x_{n+1} < \sqrt{2 + 2} = 2$. The inequality $x_n < 2$ is therefore proved by induction.

Since the sequence is increasing and has upper bound 2, it is convergent. The limit $l$ satisfies

\[ l^2 = \lim_{n\to\infty} x_{n+1}^2 = 2 + \lim_{n\to\infty} x_n = 2 + l. \]

Solving the equation for $l \ge 0$, we conclude that the limit is $l = 2$.
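The recursion of Example 1.2.10 is easy to iterate. The sketch below lists the first terms in floating point so that the monotone approach to 2 can be observed; the function name `iterate` is ours, and only the first 20 terms are inspected because rounding eventually makes consecutive floating-point terms indistinguishable.

```python
import math

def iterate(count, x1=1.0):
    """First `count` terms of x_1 = x1, x_{n+1} = sqrt(2 + x_n)."""
    terms = [x1]
    while len(terms) < count:
        terms.append(math.sqrt(2 + terms[-1]))
    return terms

if __name__ == "__main__":
    for n, value in enumerate(iterate(20), start=1):
        # Each term is larger than the previous one and smaller than 2.
        print(n, value, 2 - value)
```

The printed distance $2 - x_n$ shrinks roughly by a factor of 4 at each step, consistent with the limit $l = 2$.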


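Example 1.2.11. Let $x_n = \left(1 + \frac{1}{n}\right)^n$. The binomial expansion tells us

\[
\begin{aligned}
x_n &= 1 + n\left(\frac{1}{n}\right) + \frac{n(n-1)}{2!}\left(\frac{1}{n}\right)^2 + \frac{n(n-1)(n-2)}{3!}\left(\frac{1}{n}\right)^3 + \cdots + \frac{n(n-1)\cdots 1}{n!}\left(\frac{1}{n}\right)^n \\
&= 1 + \frac{1}{1!} + \frac{1}{2!}\left(1 - \frac{1}{n}\right) + \frac{1}{3!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) + \cdots + \frac{1}{n!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right)\cdots\left(1 - \frac{n-1}{n}\right).
\end{aligned}
\]

By comparing with the similar formula for $x_{n+1}$, we find the sequence is strictly increasing. The formula also tells us (see Example 1.2.9)

\[ x_n < 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \cdots + \frac{1}{n!} \le 1 + 1 + \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \cdots + \frac{1}{(n-1)n} < 3, \]

so that the sequence has an upper bound. Therefore the sequence converges. The limit has a special notation

\[ e = \lim_{n\to\infty} \left(1 + \frac{1}{n}\right)^n = 2.71828182845904\cdots \]

and is a fundamental constant of nature.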
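The two facts used in Example 1.2.11, that $\left(1 + \frac{1}{n}\right)^n$ is strictly increasing and bounded above by 3, can be checked exactly for small $n$ with rational arithmetic. The sketch below does this and compares one term with Python's built-in value `math.e`; the function name `e_approx` is ours.

```python
import math
from fractions import Fraction

def e_approx(n):
    """Exact value of (1 + 1/n)^n as a rational number."""
    return Fraction(n + 1, n) ** n

if __name__ == "__main__":
    for n in (1, 2, 10, 100, 1000):
        value = float(e_approx(n))
        # The terms increase towards e from below.
        print(n, value, math.e - value)
```

The convergence is slow: the printed error $e - x_n$ decays only about like $\frac{e}{2n}$, so $n = 1000$ still gives just two or three correct decimals.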

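Exercise 1.2.12. Prove the sequences converge.

1. $x_n = 1 + \frac{1}{2^3} + \frac{1}{3^3} + \cdots + \frac{1}{n^3}$.

2. $x_n = 1 + \frac{1}{2!} + \frac{1}{3!} + \cdots + \frac{1}{n!}$.

3. $x_n = 1 + \frac{1}{2^2} + \frac{1}{3^3} + \cdots + \frac{1}{n^n}$.

4. $x_n = 1 + \frac{1}{2} + \frac{1}{2^2} + \cdots + \frac{1}{2^n}$.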

Exercise 1.2.13. Consider the sequence $x_n = 1 + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + \cdots + \frac{1}{\sqrt{n}} - 2\sqrt{n + \alpha}$.

1. Prove that if $\alpha \le \frac{1}{2}$, then the sequence is strictly decreasing.

2. Prove that if $\alpha > \frac{1}{2}$, then the sequence is strictly increasing for $n > \frac{1}{16\alpha - 8}$.

3. Prove the sequence converges to the same limit for all $\alpha$.

Exercise 1.2.14. For any $a > 0$, let $x_n = \sqrt{a + \sqrt{a + \sqrt{a + \cdots + \sqrt{a}}}}$, where the square root appears $n$ times. Prove $x_n < x_{n+1} < \frac{1 + \sqrt{4a + 1}}{2}$ and find the limit of the sequence. (The result can be extended to a sequence defined by $x_1 \le b$, $x_{n+1} = f(x_n)$, where $f(x)$ is a function satisfying $b \ge f(x) \ge x$ for $x \le b$. See Figure 1.3.)

Exercise 1.2.15. Prove the inductively defined sequences converge and find thelimits.

Page 30: Mathematical Analysismajhu/Math203/book.pdfn!1(p n+ 1 n) = 0. 4. lim n!1 1 n2=3 n1=2 = 0. 5. lim n!1 cosn n = 0. 6. lim n!1 p n cosn p n+ sinn = 1. Exercise 1.1.2. Let a positive real

30 CHAPTER 1. LIMIT AND CONTINUITY

.............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................. ..............

................

................

................

................

................

................

................

................

................

................

................

................

................

................

................

................

................

................

................

................

................

................

................

................

................

................

................

................

...............................

......................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................

...............

...............

...............

...............

...............

...............

.............

................

................

................

................

................

................

................

................

................

............

...............

...............

...............

...............

...............

Figure 1.3: recursively defined convergent sequence (axis labels: $b$, $x_1$, $x_2$, $x_3$, $x_4$, $x_5$)

1. $x_1 = 1$, $x_{n+1} = 1 + \frac{2}{x_n}$.

2. $x_1 = 1$, $x_{n+1} = \frac{x_n^2 + 2}{2x_n}$.
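The two recursions above can be explored numerically before attempting a proof. The following is a minimal sketch, not part of the original text; the helper name `iterate` and the number of terms are assumptions chosen for illustration, and the printed values are only numerical evidence.

```python
def iterate(step, x1, count):
    """Return the first `count` terms of x_{n+1} = step(x_n), starting at x_1 = x1."""
    terms = [x1]
    for _ in range(count - 1):
        terms.append(step(terms[-1]))
    return terms

# Problem 1: x_{n+1} = 1 + 2 / x_n.
seq1 = iterate(lambda x: 1 + 2 / x, 1.0, 40)
# Problem 2: x_{n+1} = (x_n^2 + 2) / (2 x_n).
seq2 = iterate(lambda x: (x * x + 2) / (2 * x), 1.0, 10)

print(seq1[:5], seq1[-1])
print(seq2[:5], seq2[-1])
```

The first sequence alternates around its apparent limit, while the second is monotone after the first step, which suggests treating the two problems with different arguments.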

Exercise 1.2.16. Discuss the convergence of the sequence defined by $x_1 = \alpha$, $x_{n+1} = \beta + \gamma x_n$. Do the same for the sequence defined by $x_1 = \alpha$, $x_{n+1} = \beta + \frac{\gamma}{x_n}$. (This generalizes the first problem in Exercise 1.2.15.)

Exercise 1.2.17. Let $a, b > 0$. Define sequences $a_1 = a$, $b_1 = b$, $a_n = \frac{a_{n-1} + b_{n-1}}{2}$, $b_n = \frac{2a_{n-1}b_{n-1}}{a_{n-1} + b_{n-1}}$. Use $\frac{a+b}{2} \ge \frac{2ab}{a+b}$ to prove the sequences converge. Moreover, find the limits.

Exercise 1.2.18. Let $x_n = \left(1 + \frac{1}{n}\right)^n$ and $y_n = \left(1 + \frac{1}{n}\right)^{n+1}$.

1. Use induction to prove $(1+x)^n \ge 1 + nx$ for $x > -1$ and any natural number $n$.

2. By showing $\frac{x_{n+1}}{x_n} > 1$ and $\frac{y_{n-1}}{y_n} > 1$, prove $\{x_n\}$ is increasing and $\{y_n\}$ is decreasing.

3. Prove $\{x_n\}$ and $\{y_n\}$ converge to the same limit $e$.

4. Prove $e - x_n < \frac{e}{n}$.

5. Prove $\lim_{n\to\infty} \left(1 - \frac{1}{n}\right)^{-n} = e$.
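The claims of Exercise 1.2.18 can be checked numerically for a few values of $n$. This is only an illustrative sketch (the function names `x` and `y` mirror the sequences in the exercise, and `math.e` is used as the reference value); it does not replace the proofs.

```python
import math

def x(n):
    """x_n = (1 + 1/n)^n, expected to increase towards e."""
    return (1 + 1 / n) ** n

def y(n):
    """y_n = (1 + 1/n)^(n+1), expected to decrease towards e."""
    return (1 + 1 / n) ** (n + 1)

for n in (1, 10, 100, 1000):
    # Columns: n, x_n, y_n, the gap e - x_n, and the claimed bound e / n.
    print(n, x(n), y(n), math.e - x(n), math.e / n)
```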

Exercise 1.2.19. Prove that $e \ge 1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{n!}$ for any $n$.

1.2.4 Convergent Subsequence

By Proposition 1.1.2, any convergent sequence is bounded. While bounded sequences may not converge (see Example 1.2.2), the following still holds.


Theorem 1.2.8 (Bolzano⁴-Weierstrass⁵ Theorem). A bounded sequence of real numbers has a convergent subsequence.

Recall that by Proposition 1.2.1, any subsequence of a convergent sequence is convergent. Theorem 1.2.8 shows that if the original sequence is only assumed to be bounded, then “any subsequence” should be changed to “some subsequence”.

Proof. Let $\{x_n\}$ be a bounded sequence. Then all $x_n$ lie in a bounded interval $I = [a, b]$.

Divide $I$ into two equal halves $I' = \left[a, \frac{a+b}{2}\right]$ and $I'' = \left[\frac{a+b}{2}, b\right]$. Then either $I'$ or $I''$ must contain $x_n$ for infinitely many $n$. We denote this interval by $I_1 = [a_1, b_1]$ and find $x_{n_1} \in I_1$.

Further divide $I_1$ into two equal halves $I_1' = \left[a_1, \frac{a_1+b_1}{2}\right]$ and $I_1'' = \left[\frac{a_1+b_1}{2}, b_1\right]$. Then either $I_1'$ or $I_1''$ must contain $x_n$ for infinitely many $n$. We denote this interval by $I_2 = [a_2, b_2]$. Because $I_2$ contains $x_n$ for infinitely many $n$, we can find $x_{n_2} \in I_2$ with $n_2 > n_1$.

Continuing in this way, we get a sequence of intervals

$$I = [a, b] \supset I_1 = [a_1, b_1] \supset I_2 = [a_2, b_2] \supset \cdots \supset I_k = [a_k, b_k] \supset \cdots$$

with the length of $I_k$ being $b_k - a_k = \frac{b-a}{2^k}$. Moreover, we have a subsequence $\{x_{n_k}\}$ satisfying $x_{n_k} \in I_k$.

The inclusion relation between the intervals tells us

$$a \le a_1 \le a_2 \le \cdots \le a_k \le \cdots \le b_k \le \cdots \le b_2 \le b_1 \le b.$$

Thus $\{a_k\}$ and $\{b_k\}$ are bounded and monotone sequences. By Proposition 1.2.7, both sequences converge. Moreover, the length of $I_k$ and the limit (1.1.3) tell us $\lim_{k\to\infty}(b_k - a_k) = 0$. Therefore the two sequences have the same limit. Denote $l = \lim_{k\to\infty} a_k = \lim_{k\to\infty} b_k$.

The property $x_{n_k} \in I_k$ means $a_k \le x_{n_k} \le b_k$. By the sandwich rule, we get $\lim_{k\to\infty} x_{n_k} = l$. Thus we find a convergent subsequence $\{x_{n_k}\}$.
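The bisection in the proof can be imitated on a finite list of terms. The sketch below is an illustration under a stated assumption: since a program cannot test "contains infinitely many terms", it keeps the half that contains at least as many of the listed terms. The function name `bisection_subsequence` and the sample sequence $x_n = (-1)^n\left(1 + \frac{1}{n}\right)$ are hypothetical choices, not taken from the text.

```python
def bisection_subsequence(terms, a, b, steps):
    """Pick indices n_1 < n_2 < ... by repeatedly halving [a, b].

    `terms` is a finite list standing in for the sequence. The test
    "contains infinitely many terms" is replaced by "contains at least
    as many of the listed terms as the other half".
    """
    indices = []
    last = -1
    for _ in range(steps):
        mid = (a + b) / 2
        left = sum(1 for t in terms if a <= t <= mid)
        right = sum(1 for t in terms if mid <= t <= b)
        if left >= right:
            b = mid
        else:
            a = mid
        # First index after the previous pick whose term lies in the kept half.
        nxt = next((i for i in range(last + 1, len(terms)) if a <= terms[i] <= b), None)
        if nxt is None:
            break  # the finite list ran out of usable terms
        indices.append(nxt)
        last = nxt
    return indices, (a, b)

# Sample bounded sequence x_n = (-1)^n (1 + 1/n), n = 1, ..., 1000, inside [-2, 2].
terms = [(-1) ** n * (1 + 1 / n) for n in range(1, 1001)]
indices, interval = bisection_subsequence(terms, -2.0, 2.0, 8)
print(indices)
print([terms[i] for i in indices])
print(interval)
```

The selected terms are squeezed into intervals whose length halves at every step, which is the finite shadow of the sandwich argument at the end of the proof.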

⁴ Bernard Placidus Johann Nepomuk Bolzano, born 1781 and died 1848 in Prague (Bohemia, now Czech). Bolzano insisted that many results which were thought “obvious” required rigorous proof and made fundamental contributions to the foundation of mathematics. He understood the need to redefine and enrich the concept of number itself and defined the Cauchy sequence four years before Cauchy’s work appeared.

⁵ Karl Theodor Wilhelm Weierstrass, born 1815 in Ostenfelde, Westphalia (now Germany), died 1897 in Berlin (Germany). In 1864, he found a continuous but nowhere differentiable function. His lectures on analytic functions, elliptic functions, abelian functions and calculus of variations influenced many generations of mathematicians, and his approach still dominates the teaching of analysis today.


Example 1.2.12. Let $\{x_n\}$, $\{y_n\}$, $\{z_n\}$ be sequences converging to $l_1$, $l_2$, $l_3$, respectively. Then for the sequence

$$x_1, y_1, z_1, x_2, y_2, z_2, x_3, y_3, z_3, \dots, x_n, y_n, z_n, \dots$$

the limits of convergent subsequences are $l_1$, $l_2$, $l_3$.

We need to explain that any $l \ne l_1, l_2, l_3$ is not the limit of any subsequence. There is $\varepsilon > 0$ such that (take $\varepsilon = \frac{1}{2}\min\{|l - l_1|, |l - l_2|, |l - l_3|\}$, for example)

$$|l - l_1| \ge 2\varepsilon, \quad |l - l_2| \ge 2\varepsilon, \quad |l - l_3| \ge 2\varepsilon.$$

Then there are $N_1$, $N_2$, $N_3$, such that

$$n > N_1 \implies |x_n - l_1| < \varepsilon,$$

$$n > N_2 \implies |y_n - l_2| < \varepsilon,$$

$$n > N_3 \implies |z_n - l_3| < \varepsilon.$$

Since $|l - l_1| \ge 2\varepsilon$ and $|x_n - l_1| < \varepsilon$ imply $|x_n - l| > \varepsilon$, we have

$$n > \max\{N_1, N_2, N_3\} \implies |x_n - l| > \varepsilon, \ |y_n - l| > \varepsilon, \ |z_n - l| > \varepsilon.$$

This implies that $l$ cannot be the limit of any convergent subsequence.

A more direct argument about the limits of convergent subsequences is the following. Let $l$ be the limit of a convergent subsequence $\{w_m\}$ of the combined sequence. The subsequence $\{w_m\}$ must contain infinitely many terms from at least one of the three sequences. If $\{w_m\}$ contains infinitely many terms from $\{x_n\}$, then it contains a subsequence $\{x_{n_k}\}$ of $\{x_n\}$. By Proposition 1.2.1, we get

$$l = \lim w_m = \lim x_{n_k} = \lim x_n = l_1.$$

The second equality is due to the fact that $\{x_{n_k}\}$ is a subsequence of $\{w_m\}$. The third equality is due to $\{x_{n_k}\}$ being a subsequence of $\{x_n\}$.

Exercise 1.2.20. For sequences in Exercise 1.2.1, find all the limits of convergent subsequences.

Exercise 1.2.21. Any real number is the limit of a sequence of the form $\frac{n_1}{10}, \frac{n_2}{100}, \frac{n_3}{1000}, \dots$, where $n_k$ are integers. Based on this observation, construct a sequence so that the limits of convergent subsequences are all the numbers between 0 and 1.

Exercise 1.2.22. Prove that a number is the limit of a convergent subsequence of $\{x_n\}$ if and only if it is the limit of a convergent subsequence of $\{x_n \sqrt[n]{n}\}$.

Exercise 1.2.23. Suppose $\{x_n\}$ and $\{y_n\}$ are two bounded sequences. Prove that there are $n_k$, such that both subsequences $\{x_{n_k}\}$ and $\{y_{n_k}\}$ converge.

The following technical result provides a criterion for a number to be the limit of a subsequence.

Proposition 1.2.9. $l$ is the limit of a convergent subsequence of $\{x_n\}$ if and only if for any $\varepsilon > 0$ and $N$, there is $n > N$, such that $|x_n - l| < \varepsilon$.


Proof. Let $l$ be the limit of a convergent subsequence $\{x_{n_k}\}$. Let $\varepsilon > 0$ and $N$ be given. Then there is $K$, such that $k > K$ implies $|x_{n_k} - l| < \varepsilon$. It is easy to find $k > K$, such that $n_k > N$ (take $k = \max\{K, N\} + 1$, for example). Then for $n = n_k$, we have $n > N$ and $|x_n - l| < \varepsilon$.

Conversely, suppose for any $\varepsilon > 0$ and $N$, there is $n > N$, such that $|x_n - l| < \varepsilon$. Then for $\varepsilon = 1$, there is $n_1$ such that $|x_{n_1} - l| < 1$. Next, for $\varepsilon = \frac{1}{2}$ and $N = n_1$, there is $n_2 > n_1$ such that $|x_{n_2} - l| < \frac{1}{2}$. Continuing in this way, by taking $\varepsilon = \frac{1}{k+1}$ and $N = n_k$ to find $x_{n_{k+1}}$, we construct a subsequence $\{x_{n_k}\}$ satisfying $|x_{n_k} - l| < \frac{1}{k}$. The inequality implies $\lim_{k\to\infty} x_{n_k} = l$.
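The construction in the second half of the proof is effectively an algorithm: for $k = 1, 2, \dots$ search beyond the previous index for a term within $\frac{1}{k}$ of $l$. Below is a minimal sketch of that search; the function name, the `search_limit` safeguard, and the sample sequence $x_n = (-1)^n\left(1 + \frac{1}{n}\right)$ with $l = 1$ are assumptions made for illustration.

```python
def subsequence_converging_to(x, l, count, search_limit=10**6):
    """Find n_1 < n_2 < ... < n_count with |x(n_k) - l| < 1/k, as in the proof."""
    indices = []
    n = 0
    for k in range(1, count + 1):
        n += 1  # the next index must exceed the previous one (N = n_{k-1})
        while abs(x(n) - l) >= 1 / k:
            n += 1
            if n > search_limit:
                raise ValueError("no suitable term found within the search limit")
        indices.append(n)
    return indices

def sample(n):
    """Hypothetical test sequence x_n = (-1)^n (1 + 1/n)."""
    return (-1) ** n * (1 + 1 / n)

print(subsequence_converging_to(sample, 1.0, 6))
```

The search terminates for every $k$ exactly when the criterion of Proposition 1.2.9 holds; the `search_limit` guard only exists because a program cannot search forever.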

Here is a remark that is very useful for the discussion of subsequences. Suppose $P$ is a property about terms in a sequence ($x_n > l$ or $x_n > x_{n+1}$, for example). Then the following statements are equivalent:

1. For any $N$, there is $n > N$, such that $x_n$ has property $P$.

2. There are infinitely many $x_n$ with property $P$.

3. There is a subsequence $\{x_{n_k}\}$ such that each term $x_{n_k}$ has property $P$.

In particular, the criterion for $l$ to be the limit of a convergent subsequence of $\{x_n\}$ is that for any $\varepsilon > 0$, there are infinitely many $x_n$ satisfying $|x_n - l| < \varepsilon$.

Exercise 1.2.24. Let $\{x_n\}$ be a sequence. Suppose $\lim_{k\to\infty} l_k = l$ and for each $k$, $l_k$ is the limit of a convergent subsequence of $\{x_n\}$. Prove that $l$ is also the limit of a convergent subsequence.

Let $\{x_n\}$ be a bounded sequence. Then the set $\mathrm{LIM}\{x_n\}$ of all the limits of convergent subsequences of $\{x_n\}$ is also bounded. The supremum of $\mathrm{LIM}\{x_n\}$ is called the upper limit and denoted $\overline{\lim}_{n\to\infty} x_n$. The infimum of $\mathrm{LIM}\{x_n\}$ is called the lower limit and denoted $\underline{\lim}_{n\to\infty} x_n$. For example, for the sequence in Example 1.2.12, the upper limit is $\max\{l_1, l_2, l_3\}$ and the lower limit is $\min\{l_1, l_2, l_3\}$.

The following characterizes the upper limit. The lower limit can be similarly characterized.

Proposition 1.2.10. Suppose $\{x_n\}$ is a bounded sequence and $l$ is a number.

1. If $l < \overline{\lim}_{n\to\infty} x_n$, then there are infinitely many $x_n > l$.

2. If $l > \overline{\lim}_{n\to\infty} x_n$, then there are only finitely many $x_n > l$.

Proof. If $l < \overline{\lim}_{n\to\infty} x_n$, then by the definition of the upper limit, some $l' > l$ is the limit of a subsequence. By Proposition 1.1.4, all the terms in the subsequence except finitely many will be bigger than $l$. Thus we find infinitely many $x_n > l$.

The second statement is the same as the following: If there are infinitely many $x_n > l$, then $l \le \overline{\lim}_{n\to\infty} x_n$. We will prove this equivalent statement.


Since there are infinitely many $x_n > l$, there is a subsequence $\{x_{n_k}\}$ satisfying $x_{n_k} > l$. By Theorem 1.2.8, the subsequence has a further convergent subsequence $\{x_{n_{k_p}}\}$. Then $x_{n_k} > l$ implies $\lim x_{n_{k_p}} \ge l$. This gives a number in $\mathrm{LIM}\{x_n\}$ that is no less than $l$, so that $\overline{\lim}_{n\to\infty} x_n = \sup \mathrm{LIM}\{x_n\} \ge l$.

Proposition 1.2.11. The upper and lower limits are limits of convergent subsequences. Moreover, the sequence converges if and only if the upper and lower limits are equal.

The first conclusion is $\overline{\lim}_{n\to\infty} x_n, \underline{\lim}_{n\to\infty} x_n \in \mathrm{LIM}\{x_n\}$. In the second conclusion, the equality $\overline{\lim}_{n\to\infty} x_n = \underline{\lim}_{n\to\infty} x_n = l$ means $\mathrm{LIM}\{x_n\} = \{l\}$, which basically says that all convergent subsequences have the same limit.

Proof. Denote $l = \overline{\lim}\, x_n$. For any $\varepsilon > 0$, we have $l + \varepsilon > \overline{\lim}\, x_n$ and $l - \varepsilon < \overline{\lim}\, x_n$. Applying Proposition 1.2.10, we know there are infinitely many $x_n > l - \varepsilon$ and only finitely many $x_n > l + \varepsilon$. Therefore there are infinitely many $x_n$ satisfying $l + \varepsilon \ge x_n > l - \varepsilon$. Thus we have proved that for any $\varepsilon > 0$, there are infinitely many $x_n$ satisfying $|x_n - l| \le \varepsilon$. By Proposition 1.2.9 (and the remark after the proof), this shows that $l$ is the limit of a convergent subsequence.

For the second part, Proposition 1.2.1 says that if $\{x_n\}$ converges to $l$, then $\mathrm{LIM}\{x_n\} = \{l\}$, so that $\overline{\lim}\, x_n = \underline{\lim}\, x_n = l$. Conversely, suppose $\overline{\lim}\, x_n = \underline{\lim}\, x_n = l$. Then for any $\varepsilon > 0$, applying the second part of Proposition 1.2.10 to $l + \varepsilon > \overline{\lim}\, x_n$, we find only finitely many $x_n > l + \varepsilon$. Applying the similar property for the lower limit to $l - \varepsilon < \underline{\lim}\, x_n$, we also find only finitely many $x_n < l - \varepsilon$. Thus $|x_n - l| \le \varepsilon$ holds for all but finitely many $x_n$. If $N$ is the biggest index for those $x_n$ that do not satisfy $|x_n - l| \le \varepsilon$, then we get $|x_n - l| \le \varepsilon$ for all $n > N$. This proves that $\{x_n\}$ converges to $l$.

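Exercise 1.2.25. Find all the upper and lower limits of bounded sequences in Exercise 1.2.1.

Exercise 1.2.26. Prove the properties of upper and lower limits.

1. $\overline{\lim}_{n\to\infty}(-x_n) = -\underline{\lim}_{n\to\infty} x_n$.

2. $\overline{\lim}_{n\to\infty} x_n + \overline{\lim}_{n\to\infty} y_n \ge \overline{\lim}_{n\to\infty}(x_n + y_n) \ge \overline{\lim}_{n\to\infty} x_n + \underline{\lim}_{n\to\infty} y_n$.

3. If $x_n > 0$, then $\overline{\lim}_{n\to\infty} \frac{1}{x_n} = \frac{1}{\underline{\lim}_{n\to\infty} x_n}$.

4. If $x_n \ge 0$ and $y_n \ge 0$, then $\overline{\lim}_{n\to\infty} x_n \cdot \overline{\lim}_{n\to\infty} y_n \ge \overline{\lim}_{n\to\infty}(x_n y_n) \ge \overline{\lim}_{n\to\infty} x_n \cdot \underline{\lim}_{n\to\infty} y_n$.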

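Exercise 1.2.27. Prove that if $\overline{\lim}_{n\to\infty} \left|\frac{x_{n+1}}{x_n}\right| < 1$, then $\lim_{n\to\infty} x_n = 0$. Prove that if $\underline{\lim}_{n\to\infty} \left|\frac{x_{n+1}}{x_n}\right| > 1$, then $\lim_{n\to\infty} x_n = \infty$.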

Exercise 1.2.28. Prove that the upper limit $l$ of a bounded sequence $\{x_n\}$ is characterized by the following two properties.


1. $l$ is the limit of a convergent subsequence.

2. For any $\varepsilon > 0$, there is $N$, such that $x_n < l + \varepsilon$ for any $n > N$.

The characterization may be compared with the one for the supremum in Exercise 1.2.9.

Exercise 1.2.29. Let $\{x_n\}$ be a bounded sequence. Let $y_n = \sup\{x_n, x_{n+1}, x_{n+2}, \dots\}$. Then $\{y_n\}$ is a bounded decreasing sequence. Prove that $\lim_{n\to\infty} y_n = \overline{\lim}_{n\to\infty} x_n$. Find the similar formula for $\underline{\lim}_{n\to\infty} x_n$.
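The tail suprema in Exercise 1.2.29 can be approximated on a computer by truncating each tail at a finite index. The sketch below does this for the hypothetical sample sequence $x_n = (-1)^n\left(1 + \frac{1}{n}\right)$; the function name `tail_sup_inf` and the cutoff `horizon` are assumptions, and in general a truncated tail only gives a lower estimate of the true supremum.

```python
def tail_sup_inf(x, n, horizon):
    """Max and min of x(n), x(n+1), ..., x(horizon): a finite stand-in for the tail sup and inf."""
    tail = [x(m) for m in range(n, horizon + 1)]
    return max(tail), min(tail)

def sample(n):
    """Hypothetical test sequence x_n = (-1)^n (1 + 1/n)."""
    return (-1) ** n * (1 + 1 / n)

for n in (1, 10, 100, 1000):
    print(n, tail_sup_inf(sample, n, 5000))
```

For this sample the first component decreases towards 1 and the second increases towards -1 as $n$ grows, matching the upper and lower limits of the sequence.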

1.2.5 Convergence of Cauchy Sequence

Now we are ready to prove the converse of Theorem 1.2.2.

Theorem 1.2.12. Any Cauchy sequence of real numbers is convergent.

Proof. Let $\{x_n\}$ be a Cauchy sequence. We claim the sequence is bounded. For $\varepsilon = 1 > 0$, there is $N$, such that $m, n > N$ implies $|x_m - x_n| < 1$. Taking $m = N + 1$, we find $n > N$ implies $x_{N+1} - 1 < x_n < x_{N+1} + 1$. Therefore $\max\{x_1, x_2, \dots, x_N, x_{N+1} + 1\}$ is an upper bound for the sequence, and $\min\{x_1, x_2, \dots, x_N, x_{N+1} - 1\}$ is a lower bound.

By Theorem 1.2.8, there is a subsequence $\{x_{n_k}\}$ converging to a limit $l$. Thus for any $\varepsilon > 0$, there is $K$, such that

$$k > K \implies |x_{n_k} - l| < \frac{\varepsilon}{2}.$$

On the other hand, since $\{x_n\}$ is a Cauchy sequence, there is $N$, such that

$$m, n > N \implies |x_m - x_n| < \frac{\varepsilon}{2}.$$

Now for any $n > N$, we can easily find some $k > K$, such that $n_k > N$ ($k = \max\{K, N\} + 1$, for example). Then we have both $|x_{n_k} - l| < \frac{\varepsilon}{2}$ and $|x_{n_k} - x_n| < \frac{\varepsilon}{2}$. The inequalities imply $|x_n - l| < \varepsilon$. Thus we established the implication

$$n > N \implies |x_n - l| < \varepsilon.$$

The proof of Theorem 1.2.12 does not use the upper and lower limits. Alternatively, we observe that for any $\varepsilon > 0$, the Cauchy condition implies that any two subsequences will be within $\varepsilon$ of each other after finitely many terms. This implies that the difference between the limits of any two convergent subsequences cannot be more than $\varepsilon$. Since $\varepsilon$ can be arbitrarily small, this implies the upper and the lower limits must be the same.

Example 1.2.13. For the sequence

$$x_n = 1 - \frac{1}{2^2} + \frac{1}{3^2} - \cdots + \frac{(-1)^{n+1}}{n^2},$$

and $m > n$, we have

$$\begin{aligned}
|x_m - x_n| &= \left| \frac{(-1)^{n+2}}{(n+1)^2} + \frac{(-1)^{n+3}}{(n+2)^2} + \cdots + \frac{(-1)^{m+1}}{m^2} \right| \\
&< \frac{1}{n(n+1)} + \frac{1}{(n+1)(n+2)} + \cdots + \frac{1}{(m-1)m} \\
&= \left(\frac{1}{n} - \frac{1}{n+1}\right) + \left(\frac{1}{n+1} - \frac{1}{n+2}\right) + \cdots + \left(\frac{1}{m-1} - \frac{1}{m}\right) \\
&= \frac{1}{n} - \frac{1}{m} < \frac{1}{n}.
\end{aligned}$$

For any $\varepsilon > 0$, let $N = \frac{1}{\varepsilon}$. Then for $m > n > N$, we have

$$|x_m - x_n| < \frac{1}{n} < \varepsilon.$$

Thus $\{x_n\}$ is a Cauchy sequence and must converge.
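The estimate $|x_m - x_n| < \frac{1}{n}$ from this example can be spot-checked numerically. The following sketch computes the partial sums directly; the function name `x` and the chosen index pairs are illustrative assumptions, and a finite check of course does not prove the inequality.

```python
def x(n):
    """Partial sum 1 - 1/2^2 + 1/3^2 - ... with n terms (term k carries the sign (-1)^(k+1))."""
    return sum((-1) ** (k + 1) / k ** 2 for k in range(1, n + 1))

for n in (1, 5, 50):
    for m in (n + 1, n + 10, n + 500):
        # Columns: n, m, the difference |x_m - x_n|, and the bound 1/n.
        print(n, m, abs(x(m) - x(n)), 1 / n)
```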

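Exercise 1.2.30. Prove the sequences converge.

1. $x_n = 1 + \frac{1}{2!} + \frac{1}{3!} + \cdots + \frac{1}{n!}$.

2. $x_n = 1 + \frac{1}{2^2} + \frac{1}{3^3} + \cdots + \frac{1}{n^n}$.

3. $x_n = 1 - \frac{1}{2^3} + \frac{1}{3^3} - \cdots + (-1)^{n+1}\frac{1}{n^3}$.

4. $x_n = \sin 1 + \frac{\sin 2}{2^3} + \frac{\sin 3}{3^3} + \cdots + \frac{\sin n}{n^3}$.

Exercise 1.2.31. Let $|a| \le 1$. Define a sequence by $x_1 = a$, $x_{n+1} = \frac{1}{4}(1 + x_n^2)$.

1. Prove $|x_n| \le 1$ and $|x_{n+1} - x_n| < \frac{1}{2^{n-2}}$.

2. Prove $\frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \cdots + \frac{1}{2^n} = 1 - \frac{1}{2^n}$.

3. Prove the sequence $\{x_n\}$ converges and find the limit.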

1.2.6 Open Cover

A set $X$ of numbers is closed if $x_n \in X$ and $\lim_{n\to\infty} x_n = l$ imply $l \in X$. For example, the order rule on the limits tells us that closed intervals $[a, b]$ are closed sets. The proof of the Bolzano-Weierstrass Theorem also gives us the following result, which in modern topological language says that bounded and closed sets of numbers are compact.

Theorem 1.2.13 (Heine⁶-Borel⁷ Theorem). Suppose $X$ is a bounded and closed set of numbers. Suppose $\{(a_i, b_i)\}$ is a collection of open intervals such that $X \subset \cup (a_i, b_i)$. Then $X \subset (a_{i_1}, b_{i_1}) \cup (a_{i_2}, b_{i_2}) \cup \cdots \cup (a_{i_n}, b_{i_n})$ for finitely many intervals in the collection.

⁶ Heinrich Eduard Heine, born 1821 in Berlin (Germany), died 1881 in Halle (Germany).

⁷ Felix Edouard Justin Emile Borel, born 1871 in Aveyron (France), died 1956 in Paris (France).


When $X \subset \cup (a_i, b_i)$ happens, we say $U = \{(a_i, b_i)\}$ is an open cover of $X$. The theorem says that if $X$ is bounded and closed, then any open cover by open intervals has a finite subcover.

Proof. Suppose $X \subset I = [\alpha, \beta]$ for a bounded and closed interval $I$. Suppose $X$ cannot be covered by finitely many open intervals in $U = \{(a_i, b_i)\}$.

Similar to the proof of the Bolzano-Weierstrass Theorem, we divide the interval into two equal halves $I' = \left[\alpha, \frac{\alpha+\beta}{2}\right]$ and $I'' = \left[\frac{\alpha+\beta}{2}, \beta\right]$. Then either $X' = X \cap I'$ or $X'' = X \cap I''$ cannot be covered by finitely many open intervals in $U$. We denote the corresponding interval by $I_1 = [\alpha_1, \beta_1]$ and denote $X_1 = X \cap I_1$.

Further divide $I_1$ into two equal halves $I_1' = \left[\alpha_1, \frac{\alpha_1+\beta_1}{2}\right]$ and $I_1'' = \left[\frac{\alpha_1+\beta_1}{2}, \beta_1\right]$. Then either $X_1' = X_1 \cap I_1'$ or $X_1'' = X_1 \cap I_1''$ cannot be covered by finitely many open intervals in $U$. We denote the corresponding interval by $I_2 = [\alpha_2, \beta_2]$ and denote $X_2 = X \cap I_2$.

Continuing in this way, we get a sequence of intervals

$$I = [\alpha, \beta] \supset I_1 = [\alpha_1, \beta_1] \supset I_2 = [\alpha_2, \beta_2] \supset \cdots \supset I_k = [\alpha_k, \beta_k] \supset \cdots$$

with the length of $I_k$ being $\beta_k - \alpha_k = \frac{\beta - \alpha}{2^k}$, and $X_k = X \cap I_k$ cannot be covered by finitely many open intervals in $U$.

As argued in the proof of the Bolzano-Weierstrass Theorem, the sequences $\{\alpha_k\}$ and $\{\beta_k\}$ converge to the same limit $l = \lim_{k\to\infty} \alpha_k = \lim_{k\to\infty} \beta_k$. Moreover, each $X_k$ is nonempty (the empty set is trivially covered by finitely many intervals), so we may pick $x_k \in X_k$, and we have $\alpha_k \le x_k \le \beta_k$. By the sandwich rule, we get $l = \lim_{k\to\infty} x_k$. Then by the assumption that $X$ is closed, we get $l \in X$.

Since $X \subset \cup (a_i, b_i)$, we have $l \in (a_{i_0}, b_{i_0})$ for some interval $(a_{i_0}, b_{i_0}) \in U$. Then by $l = \lim_{k\to\infty} \alpha_k = \lim_{k\to\infty} \beta_k$, we have $X_k \subset I_k = [\alpha_k, \beta_k] \subset (a_{i_0}, b_{i_0})$ for sufficiently big $k$. In particular, $X_k$ can be covered by one open interval in $U$. The contradiction shows that $X$ must be covered by finitely many open intervals from $U$.

Exercise 1.2.32. Find a collection $U = \{(a_i, b_i)\}$ that covers $(0, 1]$, but $(0, 1]$ cannot be covered by finitely many intervals in $U$. Find a similar counterexample for $[0, +\infty)$ in place of $(0, 1]$.

Exercise 1.2.33 (Lebesgue). Suppose $[\alpha, \beta]$ is covered by the collection $U = \{(a_i, b_i)\}$. Denote

$$X = \{x \in [\alpha, \beta] : [\alpha, x] \text{ is covered by finitely many intervals in } U\}.$$

1. Prove that $\sup X \in X$.

2. Prove that if $x \in X$ and $x < \beta$, then $x + \delta \in X$ for some $\delta > 0$.

3. Prove that $\sup X = \beta$.

This proves the Heine-Borel Theorem for bounded and closed intervals.
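The idea behind Exercise 1.2.33, pushing the covered part $[\alpha, x]$ further to the right one interval at a time, can be sketched for a finite list of open intervals. The code below is an illustration only: it assumes the cover is given as a finite Python list of pairs, so it shows the stepping argument rather than the theorem itself (which concerns arbitrary collections). The function name `finite_subcover` and the sample intervals are hypothetical.

```python
def finite_subcover(intervals, alpha, beta):
    """Greedily select open intervals (a, b) from `intervals` that cover [alpha, beta].

    At each step, among the intervals containing the current point, take the one
    reaching furthest to the right, then continue from its right endpoint.
    """
    chosen = []
    point = alpha
    while True:
        containing = [(a, b) for (a, b) in intervals if a < point < b]
        if not containing:
            raise ValueError(f"the point {point} is not covered")
        best = max(containing, key=lambda ab: ab[1])
        chosen.append(best)
        if best[1] > beta:
            return chosen
        # The right endpoint itself is not in the open interval, so cover it next.
        point = best[1]

cover = [(-0.1, 0.3), (0.2, 0.5), (0.25, 0.7), (0.4, 0.6), (0.65, 1.1), (0.9, 1.05)]
print(finite_subcover(cover, 0.0, 1.0))
```

The loop terminates for a finite list because the current point strictly increases and always lands on a right endpoint of a listed interval.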


Exercise 1.2.34. Prove the Heine-Borel Theorem for a bounded and closed set $X$ in the following way. Suppose $X$ is covered by the collection $U = \{(a_i, b_i)\}$.

1. Prove that there is $\delta > 0$, such that for any $x \in X$, $(x - \delta, x + \delta) \subset (a_i, b_i)$ for some $(a_i, b_i) \in U$.

2. Use the boundedness of $X$ to find finitely many numbers $c_1, c_2, \dots, c_n$, such that $X \subset (c_1, c_1 + \delta) \cup (c_2, c_2 + \delta) \cup \cdots \cup (c_n, c_n + \delta)$.

3. Prove that if $X \cap (c_j, c_j + \delta) \ne \emptyset$, then $(c_j, c_j + \delta) \subset (a_i, b_i)$ for some $(a_i, b_i) \in U$.

4. Prove that $X$ is covered by no more than $n$ open intervals in $U$.

1.2.7 Additional Exercise

Extended Supremum and Extended Upper Limit

Exercise 1.2.35. Extend the number system by including the “infinite numbers” $+\infty$, $-\infty$ and introduce the order $-\infty < x < +\infty$ for any real number $x$. Then for any nonempty set $X$ of real numbers and possibly $+\infty$ or $-\infty$, we have $\sup X$ and $\inf X$ similarly defined. Prove that there are exactly three possibilities for $\sup X$.

1. If $X$ has no upper bound or $+\infty \in X$, then $\sup X = +\infty$.

2. If $X$ has a finite number as an upper bound and contains at least one finite real number, then $\sup X$ is a finite real number.

3. If $X = \{-\infty\}$, then $\sup X = -\infty$.

Write down the similar statements for $\inf X$.

Exercise 1.2.36. For a not necessarily bounded sequence $\{x_n\}$, extend the definition of $\mathrm{LIM}\{x_n\}$ by adding $+\infty$ if there is a subsequence diverging to $+\infty$, and adding $-\infty$ if there is a subsequence diverging to $-\infty$. Define the upper and lower limits as the supremum and infimum of $\mathrm{LIM}\{x_n\}$, using the extension of the concepts in Exercise 1.2.35. Prove the following extensions of Proposition 1.2.11.

1. A sequence with no upper bound must have a subsequence diverging to $+\infty$. This means $\overline{\lim}_{n\to\infty} x_n = +\infty$.

2. If there is no subsequence with finite limit and no subsequence diverging to $-\infty$, then the whole sequence diverges to $+\infty$.

Supremum and Infimum in Ordered Set

Recall that an order on a set is a relation $x < y$ between pairs of elements satisfying the transitivity and the exclusivity. The concepts of upper bound, lower bound, supremum and infimum can be defined for subsets of an ordered set in a way similar to numbers.

Exercise 1.2.37. Provide a characterization of the supremum similar to numbers.

Exercise 1.2.38. Prove that the supremum, if it exists, must be unique.


Exercise 1.2.39. An order is defined for all subsets of the plane $\mathbb{R}^2$ by $A \le B$ if $A$ is contained in $B$. Let $R$ be the set of all rectangles centered at the origin and with circumference 1. Find the supremum and infimum of $R$.

Alternative Proof of Bolzano-Weierstrass Theorem

We say a term $x_n$ in a sequence has property $P$ if there is $M$, such that $m > M$ implies $x_m > x_n$.

Exercise 1.2.40. Suppose there are infinitely many terms in a sequence $\{x_n\}$ with property $P$. Prove that for any term $x_n$ with property $P$, there is $m > n$ such that $x_m > x_n$ and $x_m$ also has property $P$. Then construct an increasing subsequence $\{x_{n_k}\}$ in which each $x_{n_k}$ has property $P$.

Exercise 1.2.41. Suppose there are only finitely many terms in a sequence $\{x_n\}$ with property $P$. Prove that there is $N$, such that for any $n > N$, there is $m > n$ such that $x_m \le x_n$. Then use this to construct a decreasing subsequence $\{x_{n_k}\}$.

Exercise 1.2.42. Suppose the sequence $\{x_n\}$ is bounded. Prove that the increasing subsequence constructed in Exercise 1.2.40 converges to $\underline{\lim}\, x_n$, and the decreasing subsequence constructed in Exercise 1.2.41 converges to $\overline{\lim}\, x_n$.

Set Version of Bolzano-Weierstrass Theorem

$l$ is called an accumulation point of a set $X$ of numbers if for any $\varepsilon > 0$, there is $x \in X$ satisfying $0 < |x - l| < \varepsilon$. The set version of the Bolzano-Weierstrass Theorem says that any bounded and infinite set of numbers has an accumulation point. Theorem 1.2.8 may be considered as the sequence version of the Bolzano-Weierstrass Theorem.

Exercise 1.2.43. What does it mean for a point $l$ not to be an accumulation point? Use your answer and the Heine-Borel Theorem to prove the set version of the Bolzano-Weierstrass Theorem.

Exercise 1.2.44. Use the set version of the Bolzano-Weierstrass Theorem to prove the sequence version.

1.3 Limit of Function

The limit process is not restricted to sequences only. For a function $f(x)$ defined near (but not necessarily at) $a$, we may also consider its tendency as $x$ approaches $a$. This leads to the definition of the limit of functions.

Many properties of the limit of sequences can be extended to functions. The two types of limits are also closely related.

1.3.1 Definition

The definition of the limit of sequences can be easily modified to obtain the definition of the limit of functions.

Definition 1.3.1. A function $f(x)$ defined near $a$ has limit $l$ at $a$, denoted $\lim_{x\to a} f(x) = l$, if for any $\varepsilon > 0$, there is $\delta > 0$, such that

$$0 < |x - a| < \delta \implies |f(x) - l| < \varepsilon. \tag{1.3.1}$$


Figure 1.4: for any ε, there is δ (labels: $a$, $L$, $\varepsilon$, $\delta$)

Similar to the limit of sequences, the predetermined smallness $\varepsilon$ for $|f(x) - l|$ is arbitrarily given, while the size $\delta$ for $|x - a|$ is to be found after $\varepsilon$ is given. Thus the choice of $\delta$ usually depends on $\varepsilon$ and is often expressed as a function of $\varepsilon$. Moreover, since the limit is about what happens when numbers are close, only small $\varepsilon$ and $\delta$ need to be considered.

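Example 1.3.1. The graph of the function $f(x) = x^2$ suggests $\lim_{x\to 2} x^2 = 4$. Rigorously following the definition, for any $\varepsilon > 0$, choose $\delta = \min\left\{1, \frac{\varepsilon}{5}\right\}$. Then

$$\begin{aligned}
0 < |x - 2| < \delta &\implies |x - 2| < 1, \ |x - 2| < \frac{\varepsilon}{5} \\
&\implies |x + 2| < 5, \ |x - 2| < \frac{\varepsilon}{5} \\
&\implies |x^2 - 4| = |x + 2||x - 2| < 5 \cdot \frac{\varepsilon}{5} = \varepsilon.
\end{aligned}$$

How did we choose $\delta$? We try to achieve $|x^2 - 4| = |x + 2||x - 2| < \varepsilon$ by requiring $0 < |x - 2| < \delta$. Note that when $x$ is close to 2, $|x + 2|$ is close to 4, so $|x^2 - 4|$ is bounded by roughly $4\delta$. More precisely, if we give some room to $x$ by choosing $\delta \le 1$, so that $|x + 2|$ is less than 5, then we end up with the requirement $5\delta \le \varepsilon$. Combining $\delta \le 1$ and $5\delta \le \varepsilon$ together yields our choice for $\delta$.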
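The choice $\delta = \min\left\{1, \frac{\varepsilon}{5}\right\}$ can be sanity-checked by sampling points with $0 < |x - 2| < \delta$ and testing $|x^2 - 4| < \varepsilon$. The sketch below is only a numerical spot check under assumed names (`delta_for_square`, `check`) and an assumed sample count; it cannot replace the argument above.

```python
def delta_for_square(eps):
    """The delta chosen in Example 1.3.1 for the limit of x^2 at 2."""
    return min(1.0, eps / 5)

def check(eps, samples=10001):
    """Test |x^2 - 4| < eps at evenly spaced points with 0 < |x - 2| < delta."""
    d = delta_for_square(eps)
    for i in range(samples):
        x = 2 - d + 2 * d * i / (samples - 1)  # points spread over [2 - d, 2 + d]
        if 0 < abs(x - 2) < d and not abs(x * x - 4) < eps:
            return False
    return True

for eps in (10.0, 1.0, 0.001):
    print(eps, delta_for_square(eps), check(eps))
```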

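Example 1.3.2. The graph of the function $f(x) = \sqrt{x}$ suggests that $\lim_{x\to 4} \sqrt{x} = 2$. Rigorously following the definition, for any $\varepsilon > 0$, choose $\delta = \varepsilon$. Then, since $\sqrt{x} + 2 \ge 2$,

$$0 < |x - 4| < \delta \implies |\sqrt{x} - 2| = \frac{|\sqrt{x} - 2||\sqrt{x} + 2|}{|\sqrt{x} + 2|} = \frac{|x - 4|}{|\sqrt{x} + 2|} < \delta = \varepsilon.$$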

Example 1.3.3. The graph of the function $f(x) = \frac{1}{x}$ suggests $\lim_{x\to 1} \frac{1}{x} = 1$. To rigorously argue by the definition, we estimate how small

$$\left|\frac{1}{x} - 1\right| = \frac{|x - 1|}{|x|}$$

can be when $0 < |x - 1| < \delta$. Note that in the quotient, the numerator satisfies $|x - 1| < \delta$ and the denominator $|x|$ should be close to 1 for small $\delta$. Specifically, if $\delta \le \frac{1}{2}$, then $|x - 1| < \delta$ will imply $|x| > \frac{1}{2}$, and we get $\frac{|x - 1|}{|x|} < 2\delta$. We conclude that $0 < |x - 1| < \delta = \min\left\{\frac{1}{2}, \frac{\varepsilon}{2}\right\}$ implies $\left|\frac{1}{x} - 1\right| < \varepsilon$.

Example 1.3.4. Note that the limit of $f(x)$ at $a$ does not depend on $f(a)$. In fact, the function $f$ does not even need to be defined at $a$. For example, the function

$$f(x) = \begin{cases} \frac{1}{x} & \text{if } x \ne 1 \\ 2 & \text{if } x = 1 \end{cases}$$

has limit $\lim_{x\to 1} f(x) = \lim_{x\to 1} \frac{1}{x} = 1$. In fact, the verification for the limit in Example 1.3.3 can be used here without any change.

Exercise 1.3.1. Rigorously verify the limits.

1. $\lim_{x\to -2} x^2 = 4$.

2. $\lim_{x\to 2} 3x^3 = 24$.

3. $\lim_{x\to 1} \sqrt{x} = 1$.

4. $\lim_{x\to -2} \frac{1}{x} = -\frac{1}{2}$.

5. $\lim_{x\to -2} \frac{2}{x^2} = \frac{1}{2}$.

6. $\lim_{x\to a} |x| = |a|$.

Exercise 1.3.2. Prove that $\lim_{x\to a} f(x) = l$ implies $\lim_{x\to a} |f(x)| = |l|$. Prove the converse is true if $l = 0$. What about the converse in case $l \ne 0$?

Exercise 1.3.3. Prove the following are equivalent definitions of $\lim_{x\to a} f(x) = l$.

1. For any $\varepsilon > 0$, there is $\delta > 0$, such that $0 < |x - a| \le \delta$ implies $|f(x) - l| < \varepsilon$.

2. For any $c > \varepsilon > 0$, where $c$ is some fixed number, there is $\delta > 0$, such that $0 < |x - a| < \delta$ implies $|f(x) - l| \le \varepsilon$.

3. For any natural number $n$, there is $\delta > 0$, such that $0 < |x - a| < \delta$ implies $|f(x) - l| \le \frac{1}{n}$.

4. For any $1 > \varepsilon > 0$, there is $\delta > 0$, such that $0 < |x - a| < \delta$ implies $|f(x) - l| < \frac{\varepsilon}{1 - \varepsilon}$.

Exercise 1.3.4. Which are equivalent to the definition of $\lim_{x\to a} f(x) = l$?

1. For $\varepsilon = 0.001$, we have $\delta = 0.01$, such that $0 < |x - a| \le \delta$ implies $|f(x) - l| < \varepsilon$.

2. For any $\varepsilon > 0$, there is $\delta > 0$, such that $|x - a| < \delta$ implies $|f(x) - l| < \varepsilon$.

3. For any $\varepsilon > 0$, there is $\delta > 0$, such that $0 < |x - a| < \delta$ implies $0 < |f(x) - l| < \varepsilon$.

4. For any $\varepsilon$ satisfying $0.001 \ge \varepsilon > 0$, there is $\delta > 0$, such that $0 < |x - a| \le 2\delta$ implies $|f(x) - l| \le \varepsilon$.

5. For any $\varepsilon > 0.001$, there is $\delta > 0$, such that $0 < |x - a| \le 2\delta$ implies $|f(x) - l| \le \varepsilon$.

6. For any $\varepsilon > 0$, there is a rational number $\delta > 0$, such that $0 < |x - a| < \delta$ implies $|f(x) - l| < \varepsilon$.


7. For any $\varepsilon > 0$, there is a natural number $N$, such that $0 < |x - a| < \frac{1}{N}$ implies $|f(x) - l| < \varepsilon$.

8. For any $\varepsilon > 0$, there is $\delta > 0$, such that $0 < |x - a| < \delta$ implies $|f(x) - l| < \varepsilon^2$.

9. For any $\varepsilon > 0$, there is $\delta > 0$, such that $0 < |x - a| < \delta$ implies $|f(x) - l| < \varepsilon^2 + 1$.

10. For any $\varepsilon > 0$, there is $\delta > 0$, such that $0 < |x - a| < \delta + 1$ implies $|f(x) - l| < \varepsilon^2$.

1.3.2 Variation

In the definition of $\lim_{x\to a} f(x)$, $x$ may approach $a$ from the right (i.e., $x > a$) or from the left (i.e., $x < a$). The two approaches may be treated separately, leading to the one-sided limits.

Definition 1.3.2. A function $f(x)$ defined for $x > a$ and near $a$ has right limit $l$ at $a$, denoted $\lim_{x\to a^+} f(x) = l$, if for any $\varepsilon > 0$, there is $\delta > 0$, such that

$$0 < x - a < \delta \implies |f(x) - l| < \varepsilon. \tag{1.3.2}$$

A function $f(x)$ defined for $x < a$ and near $a$ has left limit $l$ at $a$, denoted $\lim_{x\to a^-} f(x) = l$, if for any $\varepsilon > 0$, there is $\delta > 0$, such that

$$-\delta < x - a < 0 \implies |f(x) - l| < \varepsilon. \tag{1.3.3}$$

The one-sided limits are often denoted as $f(a^+) = \lim_{x\to a^+} f(x)$ and $f(a^-) = \lim_{x\to a^-} f(x)$. Moreover, the one-sided limits and the usual (two-sided) limit are related as follows. The proof is left as an exercise.

Proposition 1.3.3. $\lim_{x\to a} f(x) = l$ if and only if $\lim_{x\to a^+} f(x) = \lim_{x\to a^-} f(x) = l$.

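Example 1.3.5. For the function

$$f(x) = \begin{cases} \dfrac{1}{x} & \text{if } x > 1 \\ 3x - 2 & \text{if } x < 1, \end{cases}$$

we have

$$\lim_{x\to 1^+} f(x) = \lim_{x\to 1} \frac{1}{x} = 1, \quad \lim_{x\to 1^-} f(x) = \lim_{x\to 1} (3x - 2) = 1.$$

Therefore $\lim_{x\to 1} f(x) = 1$. On the other hand, for the function

$$g(x) = \begin{cases} \dfrac{1}{x} & \text{if } x > 1 \\ 3x & \text{if } x < 1, \end{cases}$$

we have

$$\lim_{x\to 1^+} g(x) = \lim_{x\to 1} \frac{1}{x} = 1, \quad \lim_{x\to 1^-} g(x) = \lim_{x\to 1} 3x = 3.$$

Since the left and right limits are not equal, $g(x)$ diverges at 1.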
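The one side limits in Example 1.3.5 can be explored numerically. The following is only a sketch: the helper name `one_sided_samples` and the sample points $1 \pm 10^{-k}$ are choices made here for illustration, and sampling suggests a limit but does not prove it.

```python
def f(x):
    # Example 1.3.5: f(x) = 1/x for x > 1 and 3x - 2 for x < 1 (not defined at 1).
    if x > 1:
        return 1 / x
    if x < 1:
        return 3 * x - 2
    raise ValueError("f is not defined at x = 1")


def g(x):
    # Example 1.3.5: g(x) = 1/x for x > 1 and 3x for x < 1 (not defined at 1).
    if x > 1:
        return 1 / x
    if x < 1:
        return 3 * x
    raise ValueError("g is not defined at x = 1")


def one_sided_samples(func, a, side, steps=6):
    """Hypothetical helper: values func(a + side * 10**-k) for k = 1..steps.

    side = +1 samples from the right of a, side = -1 from the left.
    """
    return [func(a + side * 10.0 ** (-k)) for k in range(1, steps + 1)]


if __name__ == "__main__":
    for name, func in (("f", f), ("g", g)):
        print(name, "from the right:", one_sided_samples(func, 1, +1))
        print(name, "from the left: ", one_sided_samples(func, 1, -1))
```

The samples of $f$ approach the same value from both sides, while the samples of $g$ approach different values, in line with Proposition 1.3.3.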

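Example 1.3.6. If $\alpha > 0$, then for any $\varepsilon > 0$, we have

$$0 < x < \delta = \varepsilon^{\frac{1}{\alpha}} \implies 0 < x^\alpha < \delta^\alpha = \varepsilon.$$

This shows $\lim_{x\to 0^+} x^\alpha = 0$.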

Exercise 1.3.5. Let $[x]$ be the biggest integer $n$ satisfying $n \le x$. For example, $[1.1] = 1$, $[0.99] = 0$, $[-3.2] = -4$. Compute the limits.

1. $\lim_{x\to 2^+} [x]$.

2. $\lim_{x\to 2^-} [x]$.

3. $\lim_{x\to \sqrt{2}^+} [x]$.

4. $\lim_{x\to \sqrt{2}^-} [x]$.

5. $\lim_{x\to 4^+} [\sqrt{x}]$.

6. $\lim_{x\to 4^-} [\sqrt{x}]$.

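Exercise 1.3.6. Determine whether limits exist.

1. $\lim_{x\to 0} \begin{cases} \sqrt{x} & \text{if } x > 0 \\ -\sqrt{-x} & \text{if } x < 0 \end{cases}$.

2. $\lim_{x\to 1} \begin{cases} 2 & \text{if } x > 1 \\ x & \text{if } x < 1 \end{cases}$.

3. $\lim_{x\to 1} \begin{cases} 1 & \text{if } x > 1 \\ x & \text{if } x < 1 \end{cases}$.

4. $\lim_{x\to 1} \begin{cases} 1 & \text{if } x > 2 \\ x & \text{if } x < 2 \end{cases}$.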

Another variation of the function limit is to consider what happens when $x$ gets very big.

Definition 1.3.4. A function $f(x)$ has limit $l$ at $\infty$, denoted $\lim_{x\to\infty} f(x) = l$, if for any $\varepsilon > 0$, there is $N$, such that

$$|x| > N \implies |f(x) - l| < \varepsilon. \tag{1.3.4}$$

The limit at infinity can also be split into the limits $\lim_{x\to+\infty} f(x)$, $\lim_{x\to-\infty} f(x)$ at positive and negative infinities. Proposition 1.3.3 still holds for the limit at infinity.

The index $n$ of a sequence is usually taken to be positive only, so that the sequence limit $\lim_{n\to\infty} x_n$ really means $\lim_{n\to+\infty} x_n$. For the function limit, if $\infty$ is used without sign, then both negative and positive infinities are included.

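Example 1.3.7. We verify that $\lim_{x\to\infty} \frac{2x+1}{x-1} = 2$. For any $\varepsilon > 0$, take $N = \frac{3}{\varepsilon} + 1$. Then

$$|x| > N \implies |x - 1| \ge |x| - 1 > \frac{3}{\varepsilon} \implies \left|\frac{2x+1}{x-1} - 2\right| = \frac{3}{|x-1|} < \varepsilon.$$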
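The choice $N = \frac{3}{\varepsilon} + 1$ in Example 1.3.7 can be spot-checked numerically. This is a sketch only: the function names and the sampled points are assumptions made here, and a finite sample illustrates the estimate without replacing the proof.

```python
def n_for(eps):
    # The bound N = 3/eps + 1 chosen in Example 1.3.7.
    return 3 / eps + 1


def error(x):
    # |(2x + 1)/(x - 1) - 2|, which equals 3/|x - 1| for x != 1.
    return abs((2 * x + 1) / (x - 1) - 2)


def check(eps, factors=(1.001, 2, 10, 1000)):
    """Test error(x) < eps at sample points with |x| > N, on both signs of x."""
    n = n_for(eps)
    xs = [sign * n * t for t in factors for sign in (1, -1)]
    return all(error(x) < eps for x in xs)


if __name__ == "__main__":
    for eps in (1, 0.1, 1e-3, 1e-6):
        print(eps, check(eps))
```

Both signs of $x$ are sampled because the limit at $\infty$ without sign includes the positive and the negative infinities.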

Example 1.3.8. By treating $n$ in Example 1.1.1 as a real number instead of an integer, the same argument leads to $\lim_{x\to\infty} \frac{1}{x} = 0$. More generally, the argument for the limit (1.1.2) also works and gives us

$$\lim_{x\to+\infty} \frac{1}{x^\alpha} = 0 \text{ for } \alpha > 0. \tag{1.3.5}$$

The limit can be rephrased as

$$\lim_{x\to+\infty} x^\alpha = 0 \text{ for } \alpha < 0. \tag{1.3.6}$$

Note that only $+\infty$ is considered here because $x^\alpha$ may not be defined for negative $x$ and non-integer $\alpha$. Of course in the special case $n$ is a natural number, the same argument leads to

$$\lim_{x\to\infty} \frac{1}{x^n} = 0 \text{ for natural number } n. \tag{1.3.7}$$

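Example 1.3.9. The analogue of the limit (1.1.3) is

$$\lim_{x\to+\infty} \alpha^x = 0 \text{ for } 0 < \alpha < 1. \tag{1.3.8}$$

The proof, however, needs to be modified. Again let $\frac{1}{\alpha} = 1 + \beta$. Then $\beta > 0$. For any $\varepsilon > 0$, take $N = \frac{1}{\beta\varepsilon} + 1$. Then

$$\begin{aligned} x > N &\implies x > n > \frac{1}{\beta\varepsilon} \text{ for some integer } n \\ &\implies \frac{1}{\alpha^x} > \frac{1}{\alpha^n} > n\beta > \frac{1}{\varepsilon} \\ &\implies 0 < \alpha^x < \varepsilon, \end{aligned}$$

where the inequality $\frac{1}{\alpha^n} > n\beta$ was proved in Example 1.2.12.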

Exercise 1.3.7. Rigorously verify the limits.

1. $\lim_{x\to\infty} \frac{x}{x^2+1} = 0$.

2. $\lim_{x\to\infty} \frac{\sin x}{x} = 0$.

3. $\lim_{x\to\infty} \frac{x^2-1}{x^2+1} = 1$.

4. $\lim_{x\to\infty} \frac{x^2+x-1}{x^2+1} = 1$.

Exercise 1.3.8. Prove

$$\lim_{x\to-\infty} \alpha^x = 0 \text{ for } \alpha > 1. \tag{1.3.9}$$

Exercise 1.3.9. Prove the extension

$$\lim_{x\to+\infty} x^\beta \alpha^x = 0 \text{ for } 0 < \alpha < 1 \tag{1.3.10}$$

of the limit (1.3.8).

The final variation is the divergence to infinity.

Definition 1.3.5. A function $f(x)$ diverges to infinity at $a$, denoted $\lim_{x\to a} f(x) = \infty$, if for any $b$, there is $\delta > 0$, such that

$$0 < |x - a| < \delta \implies |f(x)| > b. \tag{1.3.11}$$

The divergence to positive and negative infinities, denoted $\lim_{x\to a} f(x) = +\infty$ and $\lim_{x\to a} f(x) = -\infty$ respectively, can be similarly defined. Moreover, the divergence to infinity at the left of $a$, at the right of $a$, or when $a$ is one of the various kinds of infinity, can also be similarly defined.

Similar to the sequences diverging to infinity, we know $f(x)$ is an infinity if and only if $\lim_{x\to a} \frac{1}{f(x)} = 0$, i.e., the reciprocal is an infinitesimal.

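Example 1.3.10. We verify that $\lim_{x\to 1^-} \frac{x}{x^2-1} = -\infty$. For any $b > 0$, choose $\delta = \min\left\{\frac{1}{2}, \frac{1}{b}\right\}$. Then

$$\begin{aligned} -\delta < x - 1 < 0 &\implies -\delta < x - 1 < 0,\ x > 1 - \delta \ge \frac{1}{2},\ x + 1 < 2 \\ &\implies \frac{x}{x^2-1} = \frac{x}{(x+1)(x-1)} < \frac{\frac{1}{2}}{2(-\delta)} \le -\frac{b}{4}. \end{aligned}$$

The last step uses $\delta \le \frac{1}{b}$. Since $b$ can be arbitrarily big, $\frac{b}{4}$ can also be arbitrarily big. Therefore $\frac{x}{x^2-1}$ is a negative infinity at the left of 1.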
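The choice $\delta = \min\left\{\frac{1}{2}, \frac{1}{b}\right\}$ in Example 1.3.10 can be spot-checked in the same spirit. The sketch below samples points of the interval $(1 - \delta, 1)$; the function names and the sampling grid are assumptions made for illustration only.

```python
def delta_for(b):
    # The choice delta = min{1/2, 1/b} from Example 1.3.10 (b > 0).
    return min(0.5, 1 / b)


def h(x):
    # The function x / (x^2 - 1) of Example 1.3.10.
    return x / (x * x - 1)


def check(b, samples=1000):
    """Test h(x) < -b/4 at evenly spaced points x with -delta < x - 1 < 0."""
    d = delta_for(b)
    xs = [1 - d * k / (samples + 1) for k in range(1, samples + 1)]
    return all(h(x) < -b / 4 for x in xs)


if __name__ == "__main__":
    for b in (0.1, 2, 100, 1e6):
        print(b, delta_for(b), check(b))
```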

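Example 1.3.11. Taking the reciprocals of the infinitesimals in the Examples 1.3.6 and 1.3.8, we get some infinities. By considering the sign, we actually get positive or negative infinities. Combining the results with the earlier examples, we get the limits of the power functions

$$\lim_{x\to 0^+} x^\alpha = \begin{cases} 0 & \text{if } \alpha > 0 \\ 1 & \text{if } \alpha = 0 \\ +\infty & \text{if } \alpha < 0 \end{cases}, \quad \lim_{x\to+\infty} x^\alpha = \begin{cases} 0 & \text{if } \alpha < 0 \\ 1 & \text{if } \alpha = 0 \\ +\infty & \text{if } \alpha > 0 \end{cases}. \tag{1.3.12}$$

Taking the reciprocal of the limits of the exponential function in Example 1.3.9 and in (1.3.9) will give us

$$\lim_{x\to+\infty} \alpha^x = \begin{cases} 0 & \text{if } 0 < \alpha < 1 \\ 1 & \text{if } \alpha = 1 \\ +\infty & \text{if } \alpha > 1 \end{cases}, \quad \lim_{x\to-\infty} \alpha^x = \begin{cases} 0 & \text{if } \alpha > 1 \\ 1 & \text{if } \alpha = 1 \\ +\infty & \text{if } 0 < \alpha < 1 \end{cases}. \tag{1.3.13}$$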

Exercise 1.3.10. Rigorously verify the limits.

1. $\lim_{x\to 0} \frac{1-x}{x^2(1+x)} = +\infty$.

2. $\lim_{x\to 0^+} \frac{1-x}{x} = +\infty$.

3. $\lim_{x\to-\infty} \frac{x^2+1}{x+1} = -\infty$.

4. $\lim_{x\to+\infty} \frac{x^2+1}{2x-5} = +\infty$.

1.3.3 Property

The limits of functions have properties similar to those of the limits of sequences.

Proposition 1.3.6. The limits of functions have the following properties.

1. Boundedness: A function convergent at $a$ is bounded near $a$.

2. Arithmetic: Suppose $\lim_{x\to a} f(x) = l$ and $\lim_{x\to a} g(x) = k$. Then

$$\lim_{x\to a} (f(x) + g(x)) = l + k, \quad \lim_{x\to a} f(x)g(x) = lk, \quad \lim_{x\to a} \frac{f(x)}{g(x)} = \frac{l}{k},$$

where $g(x) \ne 0$ and $k \ne 0$ are assumed in the third equality.

3. Order: Suppose $\lim_{x\to a} f(x) = l$ and $\lim_{x\to a} g(x) = k$. If $f(x) \ge g(x)$ for $x$ close to $a$, then $l \ge k$. Conversely, if $l > k$, then $f(x) > g(x)$ for $x$ close to $a$.

4. Sandwich: Suppose $f(x) \le g(x) \le h(x)$ and $\lim_{x\to a} f(x) = \lim_{x\to a} h(x) = l$. Then $\lim_{x\to a} g(x) = l$.

5. Composition: Suppose $\lim_{x\to a} g(x) = b$, $\lim_{y\to b} f(y) = c$. If $g(x) \ne b$ for $x$ near $a$, or $f(b) = c$, then $\lim_{x\to a} f(g(x)) = c$.

When something happens near $a$, we mean that there is $\delta > 0$, such that it happens when $0 < |x - a| < \delta$. For example, $f(x)$ is bounded near $a$ if there is $\delta > 0$ and $B$, such that $0 < |x - a| < \delta$ implies $|f(x)| < B$.

The first four properties are parallel to the Propositions 1.1.2, 1.1.3, 1.1.4, 1.1.5 for the limit of sequences, and can be proved in a similar way. The fifth property means that for $y = g(x)$ and $z = f(y)$, we have

$$\lim_{x\to a} y = b, \ \lim_{y\to b} z = c \implies \lim_{x\to a} z = c.$$

Its analogue for the sequence is Proposition 1.2.1. The property can be proved by combining the following two implications together

$$0 < |x - a| < \delta \implies |g(x) - b| < \mu,$$

$$0 < |y - b| < \mu \implies |f(y) - c| < \varepsilon,$$

where $\mu$ is found for the given $\varepsilon$, and then $\delta$ is found for the given $\mu$. However, there is the technical problem that the right side of the first implication does not quite match the left side of the second implication. To make them match, the implications should be modified either as

$$0 < |x - a| < \delta \implies 0 < |g(x) - b| < \mu,$$

$$0 < |y - b| < \mu \implies |f(y) - c| < \varepsilon,$$

or as

$$0 < |x - a| < \delta \implies |g(x) - b| < \mu,$$

$$|y - b| < \mu \implies |f(y) - c| < \varepsilon.$$

The first modification simply additionally requires that $g(x) \ne b$ for $x$ near $a$. The second modification simply additionally requires that $f(b) = c$.
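To see why one of the two extra requirements is needed, consider a small illustration that is not taken from the text: the constant function $g(x) = 0$ and a function $f$ that vanishes away from $0$ but has $f(0) = 1$. Here $b = 0$ and $c = 0$, yet neither $g(x) \ne b$ near $a$ nor $f(b) = c$ holds, and the composition has limit $1$ instead of $c$. The sketch below only evaluates the two functions at sample points.

```python
def g(x):
    # Constant inner function: lim g(x) = b = 0, and g(x) == b for every x.
    return 0.0


def f(y):
    # Outer function with lim_{y -> 0} f(y) = c = 0 but f(0) = 1, so f(b) != c.
    return 1.0 if y == 0 else 0.0


if __name__ == "__main__":
    samples = (0.1, -0.01, 1e-9)
    print("f near 0:   ", [f(y) for y in samples])     # values used for lim f(y)
    print("f(g(x)) at: ", [f(g(x)) for x in samples])  # the composition is constant
```

The composition $f(g(x))$ is constantly $1$, so its limit at any $a$ is $1$, not $c = 0$.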

Example 1.3.12. A polynomial is a function of the form

$$p(x) = c_n x^n + c_{n-1} x^{n-1} + \cdots + c_1 x + c_0.$$

By repeatedly applying the arithmetic rule to $\lim_{x\to a} c = c$ ($c$ is a constant) and $\lim_{x\to a} x = a$, we get

$$\lim_{x\to a} p(x) = c_n a^n + c_{n-1} a^{n-1} + \cdots + c_1 a + c_0 = p(a). \tag{1.3.14}$$

More generally, a rational function

$$r(x) = \frac{c_n x^n + c_{n-1} x^{n-1} + \cdots + c_1 x + c_0}{d_m x^m + d_{m-1} x^{m-1} + \cdots + d_1 x + d_0}$$

is a quotient of two polynomials. By applying the arithmetic rule to (1.3.14), we get

$$\lim_{x\to a} r(x) = \frac{c_n a^n + c_{n-1} a^{n-1} + \cdots + c_1 a + c_0}{d_m a^m + d_{m-1} a^{m-1} + \cdots + d_1 a + d_0} = r(a), \tag{1.3.15}$$

as long as the denominator is nonzero.

Example 1.3.13. We have $\lim_{x\to 1} (x^2 + x + 2) = 4$ by Example 1.3.12 and $\lim_{x\to 4} \sqrt{x} = 2$ by Example 1.3.2. Then we get $\lim_{x\to 1} \sqrt{x^2 + x + 2} = 2$ by the composition rule.

Proposition 1.3.6 was stated for the two side limit $\lim_{x\to a} f(x) = l$ with finite $a$ and $l$ only. The properties also hold when $a$ is replaced by $a^+$, $a^-$, $\infty$, $+\infty$ and $-\infty$. Some properties still hold when $l$ is replaced by various infinities. Here are some examples.

1. The arithmetic rule $(+\infty) + (+\infty) = +\infty$ for sequences can be extended to functions. For example, if $\lim_{x\to a^+} f(x) = +\infty$ and $\lim_{x\to a^+} g(x) = +\infty$, then $\lim_{x\to a^+} (f(x) + g(x)) = +\infty$. In general, all the valid arithmetic rules for sequences that involve infinities and infinitesimals are still valid for functions. However, as in the sequence case, the same care needs to be taken in applying the arithmetic rules to infinities and infinitesimals.

2. In the composition rule, $a$, $b$, $c$ can be basically any symbols. For example, if $\lim_{x\to\infty} g(x) = b$, $g(x) > b$ and $\lim_{y\to b^+} f(y) = c$, then $\lim_{x\to\infty} f(g(x)) = c$.

3. If $f(x) \ge g(x)$ and $\lim_{x\to a} g(x) = +\infty$, then $\lim_{x\to a} f(x) = +\infty$. This extends the sandwich rule to $+\infty$. There is a similar sandwich rule for $-\infty$ but no sandwich rule for $\infty$.

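Example 1.3.14. By repeatedly applying the arithmetic rule to $\lim_{x\to\infty} c = c$ ($c$ is a constant) and $\lim_{x\to\infty} \frac{1}{x} = 0$, we get

$$\lim_{x\to\infty} \frac{2x^5 + 10}{-3x + 1} = \lim_{x\to\infty} x^4 \, \frac{2 + 10\frac{1}{x^5}}{-3 + \frac{1}{x}} = (+\infty) \frac{2 + 10 \cdot 0}{-3 + 0} = -\infty,$$

$$\lim_{x\to\infty} \frac{x^3 - 2x^2 + 1}{-x^3 + 1} = \lim_{x\to\infty} \frac{1 - 2\frac{1}{x} + \frac{1}{x^3}}{-1 + \frac{1}{x^3}} = \frac{1 - 2 \cdot 0 + 0^3}{-1 + 0} = -1.$$

In general, we have the limit

$$\lim_{x\to\infty} \frac{c_n x^n + c_{n-1} x^{n-1} + \cdots + c_1 x + c_0}{d_m x^m + d_{m-1} x^{m-1} + \cdots + d_1 x + d_0} = \begin{cases} 0 & \text{if } m > n,\ d_m \ne 0 \\ \dfrac{c_n}{d_m} & \text{if } m = n,\ d_m \ne 0 \\ \infty & \text{if } m < n,\ c_n \ne 0 \end{cases} \tag{1.3.16}$$

of a rational function at $\infty$.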
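The case distinction in (1.3.16) depends only on the degrees and the leading coefficients, so it is easy to mechanize. The following sketch assumes integer coefficients stored in ascending order (`coeffs[k]` multiplies $x^k$) and uses `math.inf` as a stand-in for the unsigned $\infty$; the function names and these conventions are illustrative choices, not part of the text.

```python
import math
from fractions import Fraction


def degree(coeffs):
    """Index of the last nonzero coefficient; coeffs[k] multiplies x**k."""
    for k in range(len(coeffs) - 1, -1, -1):
        if coeffs[k] != 0:
            return k
    raise ValueError("the zero polynomial has no degree")


def rational_limit_at_infinity(num, den):
    """Limit at (unsigned) infinity of num(x)/den(x), following (1.3.16).

    Returns Fraction(0), the ratio of leading coefficients as a Fraction,
    or math.inf, which here only marks divergence to unsigned infinity.
    """
    n, m = degree(num), degree(den)
    if m > n:
        return Fraction(0)
    if m == n:
        return Fraction(num[n], den[m])
    return math.inf


if __name__ == "__main__":
    # (2x^5 + 10) / (-3x + 1): numerator degree is larger, so the quotient is an infinity.
    print(rational_limit_at_infinity([10, 0, 0, 0, 0, 2], [1, -3]))
    # (x^2 - 1) / (x^2 + 1): equal degrees, ratio of leading coefficients.
    print(rational_limit_at_infinity([-1, 0, 1], [1, 0, 1]))
    # x / (x^2 + 1): denominator degree is larger.
    print(rational_limit_at_infinity([0, 1], [1, 0, 1]))
```

The sign of an infinite limit is deliberately not tracked, since (1.3.16) only asserts divergence to $\infty$ without sign.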

Example 1.3.15. The power function $x^\alpha$ is defined for any $\alpha$ and $x > 0$. In Example 1.3.11, we obtained the limits at $0^+$ and $+\infty$. Now we may consider the limit at any finite $a > 0$.

Fix a natural number $n > |\alpha|$. Then for $x > 1$, we have $\frac{1}{x^n} < x^\alpha < x^n$. By $\lim_{x\to 1} x^n = 1$, $\lim_{x\to 1} \frac{1}{x^n} = 1$ from (1.3.15) and the sandwich rule, we get $\lim_{x\to 1^+} x^\alpha = 1$. Note that the limit is taken from the right because the sandwich inequality holds for $x > 1$ only. Similarly, from the inequality $\frac{1}{x^n} > x^\alpha > x^n$ for $0 < x < 1$, we get $\lim_{x\to 1^-} x^\alpha = 1$. Thus we conclude that $\lim_{x\to 1} x^\alpha = 1$. For the limit of the power function at any $a > 0$, the composition rule gives us

$$\lim_{x\to a} x^\alpha = \lim_{x\to 1} (ax)^\alpha = a^\alpha \lim_{x\to 1} x^\alpha = a^\alpha. \tag{1.3.17}$$

Example 1.3.16. Suppose $\lim_{x\to 0^+} f(x) = l$. By $\lim_{x\to 0} x^2 = 0$, $x^2 > 0$ for $x \ne 0$, and the composition rule, we have $\lim_{x\to 0} f(x^2) = l$.

Conversely, suppose $\lim_{x\to 0} f(x^2) = l$. Then by the composition rule and $\lim_{x\to 0^+} \sqrt{x} = 0$ from (1.3.12), we get $\lim_{x\to 0^+} f(x) = \lim_{x\to 0^+} f((\sqrt{x})^2) = l$.

Thus we conclude that $\lim_{x\to 0} f(x^2) = \lim_{x\to 0^+} f(x)$. The equality means that one limit exists if and only if the other also exists. Moreover, the two limits are equal when they exist.

By the extension of the composition rule, the equality $\lim_{x\to 0} f(x^2) = \lim_{x\to 0^+} f(x)$ also holds even if the limits diverge to infinity.

Exercise 1.3.11. Rewrite the limits as $\lim_{x\to a} f(x)$ for suitable $a$.

1. $\lim_{x\to a^-} f(-x)$.

2. $\lim_{x\to 0} f\left(\frac{1}{x}\right)$.

3. $\lim_{x\to 0} f((x+1)^3)$.

4. $\lim_{x\to 2^+} f\left(\frac{1}{x}\right)$.

Exercise 1.3.12. Compute the limits.

1. limx→1x2 + 3

√x+ 1

(√x+ 1)(

√x+ 2)

.

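2. $\lim_{x\to -3} \sqrt[3]{x - 5}$.

3. $\lim_{x\to 0} \frac{\sqrt{2x+1} - \sqrt{x+1}}{\sqrt{x+2} - \sqrt{2x+2}}$.

4. $\lim_{x\to+\infty} \frac{\sqrt{2x+1} - \sqrt{x+1}}{\sqrt{x+2} - \sqrt{2x+2}}$.

5. $\lim_{x\to 1} \frac{\sqrt[3]{x} - 1}{x - 1}$.

6. $\lim_{x\to 0} \frac{x}{\sqrt[3]{2x+1} - \sqrt[3]{x+1}}$.

7. $\lim_{x\to+\infty} \frac{3x^{\frac{8}{3}} + 5x^{\frac{5}{3}} + 2}{x^{\frac{5}{2}} - 3x^{\frac{3}{2}} + 2x^{\frac{1}{2}}}$.

8. $\lim_{x\to 1^+} \frac{3x^{\frac{8}{3}} + 5x^{\frac{5}{3}} + 2}{x^{\frac{5}{2}} - 3x^{\frac{3}{2}} + 2x^{\frac{1}{2}}}$.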

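9. $\lim_{x\to+\infty} x\left(\sqrt{\frac{x+1}{x-1}} - 1\right)$.

10. $\lim_{x\to 1^+} x\left(\sqrt{\frac{x+1}{x-1}} - 1\right)$.

11. $\lim_{x\to+\infty} \frac{x + 1 - \sqrt{x^2+3}}{x - 1}$.

12. $\lim_{x\to 1^+} \frac{x + 1 - \sqrt{x^2+3}}{x - 1}$.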

Exercise 1.3.13. Prove the properties of limit.

1. If $\lim_{x\to a} f(x) = l$ and $\lim_{x\to a} g(x) = k$, then $\lim_{x\to a} \max\{f(x), g(x)\} = \max\{l, k\}$ and $\lim_{x\to a} \min\{f(x), g(x)\} = \min\{l, k\}$.

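2. If $\lim_{x\to a^+} f(x) = \infty$ and there are $c > 0$ and $\delta > 0$, such that $0 < x - a < \delta$ implies $g(x) > c$, then $\lim_{x\to a^+} f(x)g(x) = \infty$.

3. If $\lim_{x\to a} g(x) = +\infty$ and $\lim_{y\to+\infty} f(y) = c$, then $\lim_{x\to a} f(g(x)) = c$.

4. If $f(x) \le g(x)$ and $\lim_{x\to+\infty} g(x) = -\infty$, then $\lim_{x\to+\infty} f(x) = -\infty$.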

1.3.4 Limit of Trigonometric Function

We already know the limits (1.3.12) and (1.3.17) of the power function $x^\alpha$. We also know the limits (1.3.15) and (1.3.16) of the polynomials and rational functions. In this section, we will study the limits of trigonometric functions.

We begin by establishing an important inequality. In Figure 1.5 are a circle of radius 1 and an angle $0 < x < \frac{\pi}{2}$. The arc $PB$ has length $x$, and the area of the fan $OBP$ is $\frac{1}{2}x$. Both the triangles $OBP$ and $OBQ$ have the base $OB = 1$ and respective heights $AP = \sin x$, $BQ = \tan x$. Since the fan $OBP$ is sandwiched between the triangles $OBP$ and $OBQ$, we have the inequality between the areas

$$\frac{1}{2}\sin x < \frac{1}{2}x < \frac{1}{2}\tan x \text{ for } 0 < x < \frac{\pi}{2}.$$

Therefore we get

$$0 < \sin x < x \text{ for } 0 < x < \frac{\pi}{2}, \tag{1.3.18}$$

and

$$\cos x < \frac{\sin x}{x} < 1 \text{ for } 0 < x < \frac{\pi}{2}. \tag{1.3.19}$$

[Figure 1.5: trigonometric function. A circle of radius 1 with center O, the angle x, and the labeled points A, B, P, Q.]

Applying the sandwich rule to (1.3.18), we get $\lim_{x\to 0^+} \sin x = 0$. By the composition rule, $\lim_{x\to 0^-} \sin x = \lim_{x\to 0^+} \sin(-x) = -\lim_{x\to 0^+} \sin x = 0$. Thus by Proposition 1.3.3,

$$\lim_{x\to 0} \sin x = 0.$$

This further implies

$$\lim_{x\to 0} \cos x = \lim_{x\to 0} \left(1 - 2\sin^2\frac{x}{2}\right) = 1 - 2\left(\lim_{x\to 0} \sin\frac{x}{2}\right)^2 = 1.$$

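Then by the addition formulae for the sine and the cosine functions, we have

$$\begin{aligned} \lim_{x\to a} \sin x &= \lim_{x\to 0} \sin(a + x) = \lim_{x\to 0} (\sin a \cos x + \cos a \sin x) \\ &= \sin a \cdot 1 + \cos a \cdot 0 = \sin a, \end{aligned} \tag{1.3.20}$$

$$\begin{aligned} \lim_{x\to a} \cos x &= \lim_{x\to 0} \cos(a + x) = \lim_{x\to 0} (\cos a \cos x - \sin a \sin x) \\ &= \cos a \cdot 1 - \sin a \cdot 0 = \cos a. \end{aligned} \tag{1.3.21}$$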

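Finally, by the arithmetic rule, we get

$$\lim_{x\to a} \tan x = \frac{\lim_{x\to a} \sin x}{\lim_{x\to a} \cos x} = \frac{\sin a}{\cos a} = \tan a, \tag{1.3.22}$$

and similarly for the limits of the other trigonometric functions.

Applying $\lim_{x\to 0} \cos x = 1$ and the sandwich rule to (1.3.19), we get

$$\lim_{x\to 0} \frac{\sin x}{x} = 1. \tag{1.3.23}$$

This implies

$$\lim_{x\to 0} \frac{\tan x}{x} = \frac{\lim_{x\to 0} \frac{\sin x}{x}}{\lim_{x\to 0} \cos x} = 1. \tag{1.3.24}$$

$$\lim_{x\to 0} \frac{1 - \cos x}{x^2} = \lim_{x\to 0} \frac{2\sin^2\frac{x}{2}}{x^2} = \lim_{x\to 0} \frac{2\sin^2 x}{(2x)^2} = \frac{1}{2}\left(\lim_{x\to 0} \frac{\sin x}{x}\right)^2 = \frac{1}{2}. \tag{1.3.25}$$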
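The limits (1.3.23), (1.3.24), (1.3.25) and the inequality (1.3.19) can be observed numerically. The sketch below uses sample points $x = 10^{-k}$ chosen here for illustration; much smaller $x$ would run into floating point cancellation in $1 - \cos x$, so the output is only suggestive.

```python
import math


def ratios(x):
    """Return (sin x / x, tan x / x, (1 - cos x) / x^2) for x != 0."""
    return (math.sin(x) / x, math.tan(x) / x, (1 - math.cos(x)) / x ** 2)


def sandwich_holds(x):
    # Inequality (1.3.19): cos x < sin x / x < 1 for 0 < x < pi/2.
    return math.cos(x) < math.sin(x) / x < 1


if __name__ == "__main__":
    for k in range(1, 5):
        x = 10.0 ** (-k)
        print(x, ratios(x), sandwich_holds(x))
```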

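Example 1.3.17. In the limits (1.3.23) and (1.3.24), we substitute $x$ by $x - \frac{\pi}{2}$ and get (the composition rule is used here)

$$\lim_{x\to\frac{\pi}{2}} \frac{\cos x}{x - \frac{\pi}{2}} = -1, \quad \lim_{x\to\frac{\pi}{2}} \frac{1}{\left(x - \frac{\pi}{2}\right)\tan x} = -1.$$

Taking the reciprocal of the second limit, we get

$$\lim_{x\to\frac{\pi}{2}} \left(x - \frac{\pi}{2}\right)\tan x = -1.$$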

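Example 1.3.18. We have $\lim_{x\to+\infty} (\sqrt{x+1} - \sqrt{x}) = \lim_{x\to+\infty} \frac{1}{\sqrt{x+1} + \sqrt{x}} = 0$. By the composition rule, we have

$$\lim_{x\to+\infty} \sin\frac{\sqrt{x+1} - \sqrt{x}}{2} = 0, \quad \lim_{x\to+\infty} \frac{\sin(\sqrt{x+1} - \sqrt{x})}{\sqrt{x+1} - \sqrt{x}} = 1.$$

Then by the fact that $\cos$ is bounded, we have

$$\lim_{x\to+\infty} (\sin\sqrt{x+1} - \sin\sqrt{x}) = \lim_{x\to+\infty} 2\sin\frac{\sqrt{x+1} - \sqrt{x}}{2}\cos\frac{\sqrt{x+1} + \sqrt{x}}{2} = 0.$$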

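Moreover,

$$\begin{aligned} \lim_{x\to+\infty} \sqrt{x}\sin(\sqrt{x+1} - \sqrt{x}) &= \lim_{x\to+\infty} \sqrt{x}(\sqrt{x+1} - \sqrt{x}) \lim_{x\to+\infty} \frac{\sin(\sqrt{x+1} - \sqrt{x})}{\sqrt{x+1} - \sqrt{x}} \\ &= \lim_{x\to+\infty} \frac{\sqrt{x}}{\sqrt{x+1} + \sqrt{x}} = \lim_{x\to+\infty} \frac{1}{\sqrt{1 + \frac{1}{x}} + 1} = \frac{1}{2}. \end{aligned}$$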

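Exercise 1.3.14. Compute the limits.

1. $\lim_{x\to\pi} \frac{\sin x}{x - \pi}$.

2. $\lim_{x\to\infty} \frac{\sin x}{x - \pi}$.

3. $\lim_{x\to 0} \frac{\sin x}{x - \pi}$.

4. $\lim_{x\to -\frac{3\pi}{2}} \frac{x + \frac{3\pi}{2}}{\cos x}$.

5. $\lim_{x\to\pi} \frac{1 + \cos x}{(x - \pi)^2}$.

6. $\lim_{x\to 0} \frac{\tan 2x}{\tan 3x}$.

7. $\lim_{x\to 0} \frac{\sin(\tan x)}{x}$.

8. $\lim_{x\to 0} \frac{1 - \cos x}{\sin x}$.

9. $\lim_{x\to 0} \frac{1 - \cos x}{\sin x}\sin\frac{1}{x^2}$.

10. $\lim_{x\to 0} \frac{\tan x - \sin x}{x^3}$.

11. $\lim_{x\to 0} \frac{\sin x^2}{(\sin x)^2}$.

12. $\lim_{x\to 0} \frac{1 - \cos x^2}{(1 - \cos x)^2}$.

13. $\lim_{x\to 0} \frac{\cos x - \cos 2x}{x^2}$.

14. $\lim_{x\to 0} \frac{\sqrt{\cos x} - \sqrt[3]{\cos x}}{\sin^2 x}$.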

1.3.5 Limit of Exponential Function

In this section, we study the limits of exponential functions.

We first establish

$$\lim_{x\to 0^+} x^x = 1. \tag{1.3.26}$$

Note that for the special case $x = \frac{1}{n}$, we have $x^x = \frac{1}{\sqrt[n]{n}}$. So the limit is closely related to the limit (1.1.4).

For $0 < x < 1$, we have $\frac{1}{n+1} \le x < \frac{1}{n}$ for some natural number $n$. This implies

$$1 > x^x > \left(\frac{1}{n+1}\right)^{\frac{1}{n}} = \frac{1}{(n+1)^{\frac{1}{n}}},$$

which further implies

$$|x^x - 1| < \left|\frac{1}{(n+1)^{\frac{1}{n}}} - 1\right|.$$

By the limit (1.1.7), we know $\lim_{n\to\infty} \frac{1}{(n+1)^{\frac{1}{n}}} = 1$. Therefore for any $\varepsilon > 0$, there is $N > 0$, such that

$$n > N \implies \left|\frac{1}{(n+1)^{\frac{1}{n}}} - 1\right| < \varepsilon.$$

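If we take $\delta = \frac{1}{N+1}$, then

$$\begin{aligned} 0 < x < \delta &\implies \frac{1}{n+1} \le x < \frac{1}{n} \text{ for some natural number } n > N \\ &\implies |x^x - 1| < \left|\frac{1}{(n+1)^{\frac{1}{n}}} - 1\right| < \varepsilon. \end{aligned}$$

This completes the proof of (1.3.26).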
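The comparison used in the proof of (1.3.26) can be traced numerically. The sketch below finds the natural number $n$ with $\frac{1}{n+1} \le x < \frac{1}{n}$ and evaluates the lower bound $\left(\frac{1}{n+1}\right)^{\frac{1}{n}}$; the function name and the sample values of $x$ are assumptions made for illustration, and the samples avoid reciprocals of integers so that rounding does not affect the choice of $n$.

```python
import math


def lower_bound(x):
    """For 0 < x < 1, return (1/(n+1))**(1/n) where 1/(n+1) <= x < 1/n."""
    n = math.ceil(1 / x) - 1  # n < 1/x <= n + 1
    return (1 / (n + 1)) ** (1 / n)


if __name__ == "__main__":
    for x in (0.3, 0.03, 0.003, 3e-6):
        # The proof asserts lower_bound(x) < x**x < 1.
        print(x, lower_bound(x), x ** x)
```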

Let $a > 0$ be a constant. For sufficiently small $x > 0$, we have $x < a < \frac{1}{x}$, which further implies $x^x < a^x < \frac{1}{x^x}$. Then by (1.3.26) and the sandwich rule, we get $\lim_{x\to 0^+} a^x = 1$. Further by the composition rule, we have $\lim_{x\to 0^-} a^x = \lim_{x\to 0^+} a^{-x} = \frac{1}{\lim_{x\to 0^+} a^x} = 1$. Thus we conclude that

$$\lim_{x\to 0} a^x = 1. \tag{1.3.27}$$

This is a special case of $\lim_{x\to b} a^x$, which we expect to be $a^b$. The equality $\lim_{x\to b} a^x = a^b$ can be derived from the following useful result.

Proposition 1.3.7 (Exponential Rule). Suppose $\lim_{x\to a} f(x) = l > 0$ and $\lim_{x\to a} g(x) = k$. Then $\lim_{x\to a} f(x)^{g(x)} = l^k$.

Proof. We prove the case $a$ is finite. The proof for the other cases is similar.

By the limit (1.3.17) and the composition rule, we have $\lim_{x\to a} f(x)^k = l^k$. On the other hand, choose $A > 1$ satisfying $A^{-1} < l < A$. Then $\lim_{x\to a} f(x) = l$ tells us that there is $\delta > 0$, such that $0 < |x - a| < \delta$ implies $A^{-1} < f(x) < A$. This further implies

$$A^{-|g(x)-k|} < f(x)^{g(x)-k} < A^{|g(x)-k|}.$$

The assumption $\lim_{x\to a} g(x) = k$ implies $\lim_{x\to a} |g(x) - k| = 0$. By the limit (1.3.27) and the composition rule, we have

$$\lim_{x\to a} A^{-|g(x)-k|} = \lim_{x\to a} A^{|g(x)-k|} = 1.$$

Then by the sandwich rule, we get $\lim_{x\to a} f(x)^{g(x)-k} = 1$. Thus we conclude that

$$\lim_{x\to a} f(x)^{g(x)} = \lim_{x\to a} f(x)^k \lim_{x\to a} f(x)^{g(x)-k} = l^k \cdot 1 = l^k.$$

Other forms of the exponential rule can be found in Exercise 1.3.17.

Example 1.3.19. By the limit (1.3.26), we have $\lim_{x\to 0^-} |x|^x = \lim_{x\to 0^+} x^{-x} = \frac{1}{\lim_{x\to 0^+} x^x} = 1$. Thus $\lim_{x\to 0} |x|^x = 1$.

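Example 1.3.20. By the limit (1.3.26) and the exponential rule, we have

$$\lim_{x\to 0} (5 - 3x + 4x^2)^x = 5^0 = 1.$$

$$\lim_{x\to 0} (2x^2 - x^3)^x = \lim_{x\to 0} (|x|^x)^2 (2 - x)^x = 1^2 \cdot 2^0 = 1.$$

$$\lim_{x\to 0^+} (2x + x^3)^{\sqrt{x}} = \lim_{x\to 0^+} (2x^2 + x^6)^x = \lim_{x\to 0} (|x|^x)^2 (2 + x^4)^x = 1^2 \cdot 2^0 = 1.$$

Example 1.3.21. By the limits (1.3.23) and the exponential rule, we have

$$\lim_{x\to 0^+} (\sin x)^x = \lim_{x\to 0^+} \left(\frac{\sin x}{x}\right)^x x^x = 1^0 \cdot 1 = 1.$$

$$\lim_{x\to 0} (1 - \cos x)^{\tan x} = \lim_{x\to 0} \left(\frac{1 - \cos x}{x^2}\right)^{\tan x} (x^x)^{2\frac{\tan x}{x}} = \left(\frac{1}{2}\right)^0 \cdot 1^2 = 1.$$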

Exercise 1.3.15. Compute the limits.

1. $\lim_{x\to 0^+} (x^x)^x$.

2. $\lim_{x\to 0^+} x^{(x^x)}$.

3. $\lim_{x\to 0^+} (\tan x)^x$.

4. $\lim_{x\to 0} |x|^{\tan x}$.

5. $\lim_{x\to 0} (\cos x)^x$.

6. $\lim_{x\to+\infty} x^{-x}$.

7. $\lim_{x\to+\infty} x^{\frac{1}{x}}$.

8. $\lim_{x\to+\infty} (2^x + 3^x)^{\frac{1}{x}}$.

9. $\lim_{x\to+\infty} (2^x + 3^x)^{\frac{x}{x^2+1}}$.

Exercise 1.3.16. Prove $\lim_{x\to 0} |p(x)|^x = 1$ for any nonzero polynomial $p(x)$.

Exercise 1.3.17. Prove the following exponential rules.

1. $l^{+\infty} = +\infty$ for $l > 1$: If $\lim_{x\to a} f(x) = l > 1$ and $\lim_{x\to a} g(x) = +\infty$, then $\lim_{x\to a} f(x)^{g(x)} = +\infty$.

2. $(0^+)^k = 0$ for $k > 0$: If $f(x) > 0$, $\lim_{x\to a} f(x) = 0$ and $\lim_{x\to a} g(x) = k > 0$, then $\lim_{x\to a} f(x)^{g(x)} = 0$.

From the two rules, further derive the following exponential rules.

1. $l^{+\infty} = 0$ for $0 < l < 1$.

2. $l^{-\infty} = 0$ for $l > 1$.

3. $(0^+)^k = +\infty$ for $k < 0$.

4. $(+\infty)^k = 0$ for $k < 0$.

Exercise 1.3.18. Provide counterexamples to the wrong exponential rules.

$$(+\infty)^0 = 1, \quad 1^{+\infty} = 1, \quad 0^0 = 1, \quad 0^0 = 0.$$

Next we consider $\lim_{x\to\infty} \left(1 + \frac{1}{x}\right)^x$. Similar to the way the limit (1.3.26) is derived, we compare with the natural number case, which is the definition of $e$ in Example 1.2.11.

For $x > 1$, we have $n \le x \le n + 1$ for some natural number $n$. This implies

$$\left(1 + \frac{1}{n+1}\right)^n \le \left(1 + \frac{1}{x}\right)^x \le \left(1 + \frac{1}{n}\right)^{n+1}. \tag{1.3.28}$$

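From Example 1.2.11, we know

$$\lim_{n\to\infty} \left(1 + \frac{1}{n}\right)^{n+1} = \lim_{n\to\infty} \left(1 + \frac{1}{n}\right)^n \lim_{n\to\infty} \left(1 + \frac{1}{n}\right) = e,$$

$$\lim_{n\to\infty} \left(1 + \frac{1}{n+1}\right)^n = \frac{\lim_{n\to\infty} \left(1 + \frac{1}{n+1}\right)^{n+1}}{\lim_{n\to\infty} \left(1 + \frac{1}{n+1}\right)} = e.$$

Therefore for any $\varepsilon > 0$, there is $N > 0$, such that

$$n > N \implies \left|\left(1 + \frac{1}{n}\right)^{n+1} - e\right| < \varepsilon, \quad \left|\left(1 + \frac{1}{n+1}\right)^n - e\right| < \varepsilon. \tag{1.3.29}$$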

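Then for $x > N + 1$, we have $n \le x \le n + 1$ for some natural number $n > N$. By the inequalities (1.3.28) and (1.3.29), this further implies

$$-\varepsilon < \left(1 + \frac{1}{n+1}\right)^n - e \le \left(1 + \frac{1}{x}\right)^x - e \le \left(1 + \frac{1}{n}\right)^{n+1} - e < \varepsilon.$$

This proves that $\lim_{x\to+\infty} \left(1 + \frac{1}{x}\right)^x = e$, which further implies

$$\lim_{x\to-\infty} \left(1 + \frac{1}{x}\right)^x = \lim_{x\to+\infty} \left(1 - \frac{1}{x}\right)^{-x} = \lim_{x\to+\infty} \left(1 + \frac{1}{x-1}\right)^{x-1} \left(1 + \frac{1}{x-1}\right) = e.$$

Thus we conclude that

$$\lim_{x\to\infty} \left(1 + \frac{1}{x}\right)^x = e. \tag{1.3.30}$$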
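The sandwich (1.3.28) and the limit (1.3.30) can be observed numerically. In the sketch below the function names are illustrative and $n$ is taken to be the integer part of $x$, which is one admissible choice of $n$ with $n \le x \le n + 1$; the values only suggest the limit and carry rounding error for very large $|x|$.

```python
import math


def power(x):
    # The expression (1 + 1/x)^x, defined for x > 0 and for x < -1.
    return (1 + 1 / x) ** x


def sandwich(x):
    """For x > 1 return the three terms of (1.3.28) with n = floor(x)."""
    n = math.floor(x)
    return ((1 + 1 / (n + 1)) ** n, power(x), (1 + 1 / n) ** (n + 1))


if __name__ == "__main__":
    print(sandwich(7.5))
    for x in (1e2, 1e4, 1e6, -1e2, -1e4, -1e6):
        print(x, power(x), power(x) - math.e)
```

Both signs of $x$ are sampled because (1.3.30) is a limit at $\infty$ without sign.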

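Example 1.3.22. By the composition rule, we have

$$\lim_{x\to\infty} \left(1 + \frac{a}{x}\right)^x = \lim_{x\to\infty} \left(1 + \frac{1}{x}\right)^{ax} = \left(\lim_{x\to\infty} \left(1 + \frac{1}{x}\right)^x\right)^a = e^a.$$

$$\lim_{x\to 0} (1 - x)^{\frac{1}{x}} = \lim_{x\to\infty} \left(1 + \frac{1}{x}\right)^{-x} = \left(\lim_{x\to\infty} \left(1 + \frac{1}{x}\right)^x\right)^{-1} = e^{-1}.$$

By the exponential rule and the composition rule, we have

$$\lim_{x\to-\infty} \left(1 + \frac{1}{x}\right)^{x^2} = \lim_{x\to-\infty} \left(\left(1 + \frac{1}{x}\right)^x\right)^x = e^{-\infty} = 0.$$

$$\begin{aligned} \lim_{x\to 0} (1 + x + x^2)^{\frac{1}{x}} &= \lim_{x\to 0} \left((1 + x + x^2)^{\frac{1}{x+x^2}}\right)^{\frac{x+x^2}{x}} = \left(\lim_{x\to 0} (1 + x + x^2)^{\frac{1}{x+x^2}}\right)^{\lim_{x\to 0} \frac{x+x^2}{x}} \\ &= \left(\lim_{y\to\infty} \left(1 + \frac{1}{y}\right)^y\right)^{\lim_{x\to 0} (1+x)} = e^1 = e. \end{aligned}$$

The examples also show that there is no exponential rule for $1^{+\infty}$ or $1^{-\infty}$.

Exercise 1.3.19. Compute the limits.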

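1. $\lim_{x\to 0} (1 + x)^{-\frac{2}{x}}$.

2. $\lim_{x\to\infty} \left(1 + \frac{1}{x^2}\right)^x$.

3. $\lim_{x\to 0} \left(\frac{2+x}{2+3x}\right)^{\frac{1}{x}}$.

4. $\lim_{x\to\infty} \left(\frac{2+x}{2+3x}\right)^{\frac{1}{x}}$.

5. $\lim_{x\to 0} \left(\frac{2+x+x^2}{2-3x+2x^2}\right)^{\frac{1}{x^2+x}}$.

6. $\lim_{x\to\infty} \left(\frac{2+x+x^2}{2-3x+2x^2}\right)^{\frac{1}{x^2+x}}$.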

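Exercise 1.3.20. Prove that $n \le x \le n + 1$ implies $\left(1 - \frac{1}{n+1}\right)^{-n} \le \left(1 - \frac{1}{x}\right)^{-x} \le \left(1 - \frac{1}{n}\right)^{-(n+1)}$. Then use this and Exercise 1.2.18 to prove $\lim_{x\to-\infty} \left(1 + \frac{1}{x}\right)^x = e$.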

1.3.6 More Property

In Section 1.3.5, we saw the close relation between the sequence limit and the function limit. In fact, we used the sequence limit to derive the function limit.

Proposition 1.3.8. For a function $f(x)$, $\lim_{x\to a} f(x) = l$ if and only if $\lim_{n\to\infty} f(x_n) = l$ for any sequence $\{x_n\}$ satisfying $x_n \ne a$ and $\lim_{n\to\infty} x_n = a$.

Proof. We prove the case $a$ and $l$ are finite. The proof for the other cases is similar.

Suppose $\lim_{x\to a} f(x) = l$. Suppose $\{x_n\}$ is a sequence satisfying $x_n \ne a$ and $\lim_{n\to\infty} x_n = a$. Then for any $\varepsilon > 0$, there are $\delta > 0$ and $N$, such that

$$0 < |x - a| < \delta \implies |f(x) - l| < \varepsilon.$$

$$n > N \implies |x_n - a| < \delta.$$

The assumption $x_n \ne a$ further implies $|x_n - a| > 0$. Thus combining the two implications together leads to

$$n > N \implies |f(x_n) - l| < \varepsilon,$$

and we conclude $\lim_{n\to\infty} f(x_n) = l$.

Conversely, assume $\lim_{x\to a} f(x) \ne l$ (which means either the limit does not exist, or the limit exists but is not equal to $l$). Then there is $\varepsilon > 0$, such that for any $\delta > 0$, there is $x$ satisfying $0 < |x - a| < \delta$ and $|f(x) - l| \ge \varepsilon$. Specifically, by choosing $\delta = \frac{1}{n}$ for natural numbers $n$, we find a sequence $\{x_n\}$ satisfying $0 < |x_n - a| < \frac{1}{n}$ and $|f(x_n) - l| \ge \varepsilon$. The first inequality implies $x_n \ne a$ and $\lim_{n\to\infty} x_n = a$. The second inequality implies $\lim_{n\to\infty} f(x_n) \ne l$. In conclusion, we find a sequence satisfying $x_n \ne a$ and $\lim_{n\to\infty} x_n = a$ but $\lim_{n\to\infty} f(x_n) \ne l$.

Example 1.3.23. Consider $\lim_{x\to+\infty} \cos x$. The sequences $x_n = 2n\pi$, $y_n = (2n+1)\pi$ diverge to $+\infty$. However, the sequences $\{\cos x_n\} = \{1\}$ and $\{\cos y_n\} = \{-1\}$ converge to different limits. By Proposition 1.3.8, $\cos x$ diverges at $+\infty$.
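The two sequences of Example 1.3.23 can be tabulated directly. The sketch below evaluates $\cos$ along $2n\pi$ and $(2n+1)\pi$ for the first few $n$; the helper name `cos_along` is an illustrative choice, and floating point values of $\pi$ make the results approximate.

```python
import math


def cos_along(seq, terms=5):
    """Values cos(seq(n)) for n = 1..terms, where seq maps n to a real number."""
    return [math.cos(seq(n)) for n in range(1, terms + 1)]


if __name__ == "__main__":
    print(cos_along(lambda n: 2 * n * math.pi))        # along x_n = 2n*pi
    print(cos_along(lambda n: (2 * n + 1) * math.pi))  # along y_n = (2n+1)*pi
```

Two sequences diverging to $+\infty$ with different limits of $\cos$ along them are exactly what Proposition 1.3.8 rules out for a convergent function limit.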

Example 1.3.24. The Dirichlet$^8$ function is

$$D(x) = \begin{cases} 1 & \text{if } x \text{ is rational} \\ 0 & \text{if } x \text{ is irrational} \end{cases}.$$

For any $a$, we can find a sequence $\{x_n\}$ of rational numbers and a sequence $\{y_n\}$ of irrational numbers convergent to but not equal to $a$ (for $a = 0$, take $x_n = \frac{1}{n}$ and $y_n = \frac{1}{n + \sqrt{2}}$, for example). Then $D(x_n) = 1$, $D(y_n) = 0$, so that $\lim_{n\to\infty} D(x_n) = 1$ and $\lim_{n\to\infty} D(y_n) = 0$. This implies that the Dirichlet function diverges everywhere.

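Example 1.3.25. The restriction of the limits (1.3.23), (1.3.26), (1.3.30) to $\left\{\frac{1}{n}\right\}$, $\left\{\frac{1}{\sqrt{n}}\right\}$, $\{\sqrt{n}\}$ gives

$$\lim_{n\to+\infty} n\sin\frac{1}{n} = 1.$$

$$\lim_{n\to\infty} n^{\frac{1}{\sqrt{n}}} = \left(\lim_{n\to\infty} (\sqrt{n})^{\frac{1}{\sqrt{n}}}\right)^2 = 1.$$

$$\lim_{n\to\infty} \left(1 + \frac{1}{\sqrt{n}}\right)^{\sqrt{n}} = e.$$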

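Exercise 1.3.21. Determine convergence.

1. $\lim_{x\to\frac{\pi}{2}} \tan x$.

2. $\lim_{x\to\frac{\pi}{2}} \left(x - \frac{\pi}{2}\right)\tan x$.

3. $\lim_{x\to\frac{\pi}{2}} \frac{\tan x + 1}{\tan x - 1}$.

4. $\lim_{x\to\pi} \tan x$.

5. $\lim_{x\to 1} \sin\frac{x}{x^2 - 1}$.

6. $\lim_{x\to 1} (\sqrt{1+x} - \sqrt{2x})\sin\frac{x}{x^2 - 1}$.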

Exercise 1.3.22. Compute the limits.

1. limn→∞

(n2 − 12n+ 1

)sin 1√n−1

.

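2. $\lim_{n\to\infty} \left(\sin\frac{1}{n}\right)^{\tan\frac{1}{n}}$.

3. $\lim_{n\to\infty} \left(\sin\frac{1}{n}\right)^{\frac{1}{\sqrt{n}}}$.

4. $\lim_{n\to\infty} (\sin\sqrt{n+1} - \sin\sqrt{n})$.

5. $\lim_{n\to\infty} \left(1 + \sin\frac{1}{n}\right)^n$.

6. $\lim_{n\to\infty} \left(1 + \frac{(-1)^n}{n^2}\right)^n$.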

$^8$ Johann Peter Gustav Lejeune Dirichlet, born 1805 in Düren (French Empire, now Germany), died in Göttingen (Hanover, now Germany). He proved the famous Fermat's Last Theorem for the case n = 5 in 1825. He made fundamental contributions to analytic number theory, partial differential equations, and Fourier series. He introduced his famous function in 1829.

Exercise 1.3.23. Prove that $\lim_{x\to a} f(x)$ converges if and only if $\lim_{n\to\infty} f(x_n)$ converges for any sequence $\{x_n\}$ satisfying $x_n \ne a$ and $\lim_{n\to\infty} x_n = a$.

Exercise 1.3.24. Prove that $\lim_{x\to a^+} f(x) = l$ if and only if $\lim_{n\to\infty} f(x_n) = l$ for any strictly decreasing sequence $\{x_n\}$ converging to $a$. Moreover, state the similar criterion for $\lim_{x\to+\infty} f(x) = l$.

Exercise 1.3.25. Prove the exponential rule for the sequence limits similar to Proposition 1.3.7: If $\lim_{n\to\infty} x_n = l > 0$ and $\lim_{n\to\infty} y_n = k$, then $\lim_{n\to\infty} x_n^{y_n} = l^k$. Moreover, extend the exponential rule similar to Exercise 1.3.17.

A function is increasing if

x > y =⇒ f(x) ≥ f(y).

It is strictly increasing if

x > y =⇒ f(x) > f(y).

The concepts of decreasing and strictly decreasing functions can be similarly defined. A function is monotone if it is either increasing or decreasing.

Proposition 1.2.7 on the convergence of monotone sequences can be extended to the one side limits and the limits at signed infinities of monotone functions. The following result is stated for the right limit.

Proposition 1.3.9. Suppose $f(x)$ is a monotone function defined and bounded for $x > a$ and $x$ near $a$. Then $\lim_{x\to a^+} f(x)$ converges.

The proposition can be proved by using the same idea for the proof of Proposition 1.2.7.

We will use derivatives to determine whether certain functions are monotone. The method can be combined with the proposition to show the convergence of limits.

Exercise 1.3.26. Suppose $f(x)$ is an increasing function defined and unbounded for $x > a$ and $x$ near $a$. Prove that $\lim_{x\to a^+} f(x) = -\infty$.

Finally, the Cauchy criterion can also be applied to the limit of functions.

Proposition 1.3.10. A function $f(x)$ has limit at $a$ if and only if for any $\varepsilon > 0$, there is $\delta > 0$, such that

$$0 < |x - a| < \delta, \ 0 < |y - a| < \delta \implies |f(x) - f(y)| < \varepsilon.$$

Similar statements can be made for the one side limits and the limits at infinities.

Proof. For the convergence implying the Cauchy condition, the proof of Theorem 1.2.2 can be adapted without much change.

Conversely, assume $f(x)$ satisfies the Cauchy condition. Choose a sequence $\{x_n\}$ satisfying $x_n \ne a$ and $\lim_{n\to\infty} x_n = a$. We prove that $\{f(x_n)\}$ is a Cauchy sequence. For any $\varepsilon > 0$, we have $\delta > 0$ as given by the Cauchy condition for $f(x)$. Then for $\delta > 0$, there is $N$, such that

$$n > N \implies |x_n - a| < \delta.$$

The assumption $x_n \ne a$ also implies $0 < |x_n - a| < \delta$. Combined with the Cauchy condition for $f(x)$, we get

$$\begin{aligned} m, n > N &\implies 0 < |x_m - a| < \delta, \ 0 < |x_n - a| < \delta \\ &\implies |f(x_m) - f(x_n)| < \varepsilon. \end{aligned}$$

Therefore $\{f(x_n)\}$ is a Cauchy sequence and must converge to a limit $l$.

Next we prove that $l$ is also the limit of the function at $a$. For any $\varepsilon > 0$, we have $\delta > 0$ as given by the Cauchy condition for $f(x)$. Then we can find $x_n$ satisfying $0 < |x_n - a| < \delta$ and $|f(x_n) - l| < \varepsilon$. Moreover, for $x$ satisfying $0 < |x - a| < \delta$, we may apply the Cauchy condition to $x$ and $x_n$ to get $|f(x) - f(x_n)| < \varepsilon$. Therefore

$$|f(x) - l| \le |f(x_n) - l| + |f(x) - f(x_n)| < 2\varepsilon.$$

This completes the proof that $\lim_{x\to a} f(x) = l$.

Example 1.3.26. Consider $\lim_{x\to+\infty} \cos x$. For any $N$, we can find a natural number $n$ such that $x = 2n\pi$, $y = (2n+1)\pi > N$, but $|\cos x - \cos y| = 2$. Thus the Cauchy criterion fails for $\varepsilon = 2$, and $\lim_{x\to+\infty} \cos x$ diverges.

Exercise 1.3.27. Show the limits diverge by using Proposition 1.3.10.

1. $\lim_{x\to 1} \frac{1+x}{1-x}$.

2. $\lim_{x\to 0} \sin\frac{1}{x}$.

3. $\lim_{x\to\infty} \frac{2+\sin x}{2-\sin x}$.

1.3.7 Order of Infinity and Infinitesimal

The limit $\lim_{x\to a} f(x) = l$ means $f(x) - l$ is an infinitesimal. The limit $\lim_{x\to a} f(x) = \infty$ means $f(x)$ is an infinity. Thus the concept of limit is really a matter of infinitesimals and infinities.

The infinities can be compared. For example, as $x \to \infty$, the infinity $x^2$ gets bigger much faster than the infinity $x$ does. In general, bigger infinities are considered to have higher order. The following are more specific ways of comparing infinities $f(x)$ and $g(x)$ at $a$. The comparisons can also be applied to sequences.

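If $\lim_{x\to a} \frac{f(x)}{g(x)} = 0$ (or equivalently, $\lim_{x\to a} \frac{g(x)}{f(x)} = \infty$), then we denote $f(x) = o(g(x))$ and call $g(x)$ an infinity of higher order than $f(x)$. For example,

$x^\alpha = o(x^\beta)$ at $+\infty$ for $\beta > \alpha > 0$,

$x^\alpha = o(x^\beta)$ at $0^+$ for $\beta < \alpha < 0$,

$a^x = o(b^x)$ at $+\infty$ for $b > a > 1$,

$x^\alpha = o(a^x)$ at $+\infty$ for $\alpha > 0$, $a > 1$,

$n^\alpha = o(n^\beta)$ for $\beta > \alpha > 0$,

$n^\alpha = o(a^n)$ for $\alpha > 0$, $a > 1$,

$a^n = o(n!)$,

$n! = o(n^n)$.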
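The last three sequence comparisons can be illustrated numerically. The sketch below works with logarithms of the ratios to avoid overflow; the function names and the sample values $\alpha = 3$, $a = 1.5$ and $a = 10$ are assumptions chosen only for illustration. A ratio tends to 0 exactly when its logarithm tends to $-\infty$.

```python
import math

def log_ratio_power_exp(n, alpha, a):
    # log of n**alpha / a**n
    return alpha * math.log(n) - n * math.log(a)

def log_ratio_exp_factorial(n, a):
    # log of a**n / n!, using lgamma(n + 1) = log(n!)
    return n * math.log(a) - math.lgamma(n + 1)

def log_ratio_factorial_power(n):
    # log of n! / n**n
    return math.lgamma(n + 1) - n * math.log(n)

for n in (10, 100, 1000):
    print(n,
          log_ratio_power_exp(n, 3.0, 1.5),
          log_ratio_exp_factorial(n, 10.0),
          log_ratio_factorial_power(n))
```

For small $n$ a ratio may still be bigger than 1 (a positive logarithm). Only the behavior for large $n$ matters for the comparison of orders.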


If $|f(x)| \le c|g(x)|$ for a constant $c$ and $x$ close to $a$, then we denote $f(x) = O(g(x))$ and say the order of $f(x)$ is no higher than that of $g(x)$. For example, $x(1 + \sin x) = O(x)$ at $\infty$, $n^{2+(-1)^n} = O(n^3)$. We also denote $f(x) = O(1)$ to mean that $f(x)$ is bounded, although the constant 1 is not an infinity. For example, $\sin x = O(1)$ and $\cos x = O(1)$ at $\infty$.

If $c_1|g(x)| \le |f(x)| \le c_2|g(x)|$ for constants $c_2 > c_1 > 0$ and $x$ close to $a$, that is, $f(x) = O(g(x))$ and $g(x) = O(f(x))$, then $f(x)$ and $g(x)$ are infinities of the same order. For example, a polynomial of degree $n$ has the same order as $x^n$ at $\infty$, and $x(3 + 2\sin x)$ and $x$ have the same order at $\infty$.

If $\lim_{x\to a} \frac{f(x)}{g(x)} = 1$, then we denote $f(x) \sim g(x)$ and call $f(x)$ and $g(x)$ equivalent infinities. For example, $x^3 + 2x - 5 \sim x^3$ and $2^x + x^4 \sim 2^x$ at $\infty$, $\frac{3}{2(x-1)} \sim \frac{x+2}{x^2-1}$ at 1, $3^n - 2^{n+2} \sim 3^n$.

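Example 1.3.27. By

$$\frac{\sqrt{x + \sqrt{x + \sqrt{x}}}}{\sqrt{x}} = \sqrt{1 + \frac{\sqrt{x + \sqrt{x}}}{x}} = \sqrt{1 + \sqrt{\frac{1}{x} + \frac{1}{x^{3/2}}}},$$

we get $\lim_{x\to+\infty} \frac{\sqrt{x + \sqrt{x + \sqrt{x}}}}{\sqrt{x}} = 1$ and conclude $\sqrt{x + \sqrt{x + \sqrt{x}}} \sim \sqrt{x}$ at $+\infty$.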
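A quick numerical check of this equivalence, as a sketch only (the function name `ratio` is introduced here for illustration):

```python
import math

def ratio(x):
    # sqrt(x + sqrt(x + sqrt(x))) divided by sqrt(x)
    return math.sqrt(x + math.sqrt(x + math.sqrt(x))) / math.sqrt(x)

for x in (1e2, 1e4, 1e8, 1e12):
    print(x, ratio(x))
```

The printed ratios decrease toward 1 as $x$ grows, in agreement with the limit computed in the example.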

Exercise 1.3.28. Compare the infinities at $+\infty$.

$\sqrt{x+1} - 1$, $\sqrt[3]{\sqrt{x+1} - 1}$, $\sqrt{1 + \sqrt{1 + \sqrt{1 + x}}}$, $x^2(2 + \cos x)$, $x^x$, $2^x$.

Exercise 1.3.29. Find $\alpha$, so that the infinities at $\infty$ have the same order as $x^\alpha$.

$x(2 + \cos x) + x^2(2 - \sin x)$, $\sqrt{6x^3 + 5x^5}$, $\sqrt{x + \sqrt{x^2 + \sqrt{x^3}}}$.

Exercise 1.3.30. Is it true that $f_1(x) = o(g(x))$ and $f_2(x) = o(g(x))$ imply $f_1(x) + f_2(x) = o(g(x))$? Discuss the similar properties for product, composition, etc. Discuss the similar properties for other types of comparisons.

Exercise 1.3.31. Suppose $f_1(x), f_2(x), \dots, f_n(x), \dots$ is a sequence of functions that are infinities as $x \to a$. Construct a function $f(x)$ such that $f_n(x) = o(f(x))$ for every $n$. In other words, $f(x)$ diverges to infinity much faster than any $f_n(x)$.

Similar to the infinities, the infinitesimals can also be compared. In general, smaller infinitesimals are considered to have higher order. The following comparisons are defined for infinitesimals $f(x)$ and $g(x)$ at $a$.

If $\lim_{x\to a} \frac{f(x)}{g(x)} = 0$, then we denote $f(x) = o(g(x))$ and call $f(x)$ an infinitesimal of higher order than $g(x)$. For example,

$x^\beta = o(x^\alpha)$ at $0^+$ for $\beta > \alpha > 0$,

$b^x = o(a^x)$ at $-\infty$ for $b > a > 1$,

$n^\beta = o(n^\alpha)$ for $\beta < \alpha < 0$,

$a^n = o(n^\alpha)$ for $\alpha < 0$, $0 < a < 1$.


If $|f(x)| \le c|g(x)|$ for a constant $c$ and $x$ close to $a$, then we denote $f(x) = O(g(x))$ and say the order of $g(x)$ is no higher than that of $f(x)$. For example, $\sin x \sin\frac{1}{x} = O(x)$ at 0, $\frac{\cos x}{x} = O\left(\frac{1}{x}\right)$ at $\infty$.

If $c_1|g(x)| \le |f(x)| \le c_2|g(x)|$ for constants $c_2 > c_1 > 0$ and $x$ close to $a$, that is, $f(x) = O(g(x))$ and $g(x) = O(f(x))$, then $f(x)$ and $g(x)$ are infinitesimals of the same order. For example, $x^2 + 2x - 3$ and $x - 1$ have the same order at 1, and $1 - \cos x$ and $x^2$ have the same order at 0.

If $\lim_{x\to a} \frac{f(x)}{g(x)} = 1$, then we denote $f(x) \sim g(x)$ and call $f(x)$ and $g(x)$ equivalent infinitesimals. For example, $\sin x \sim x$ and $\tan x \sim x$ at 0, $x^2 + 2x - 3 \sim 4(x - 1)$ at 1.

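Example 1.3.28. By

$$\frac{\sqrt{x + \sqrt{x + \sqrt{x}}}}{x^{1/8}} = \sqrt{x^{3/4} + \frac{\sqrt{x + \sqrt{x}}}{x^{1/4}}} = \sqrt{x^{3/4} + \sqrt{x^{1/2} + 1}},$$

we get $\lim_{x\to 0^+} \frac{\sqrt{x + \sqrt{x + \sqrt{x}}}}{x^{1/8}} = 1$ and conclude $\sqrt{x + \sqrt{x + \sqrt{x}}} \sim x^{1/8}$ at $0^+$.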

Example 1.3.29. The signs $o(g(x))$ and $O(g(x))$ are often used to stand for a function $f(x)$ satisfying the corresponding comparison with $g(x)$. For example, we may write

$$\sin x = x + o(x), \quad \cos x = 1 - \frac{1}{2}x^2 + o(x^2)$$

to mean that

$$\sin x = x + f_1(x), \quad \cos x = 1 - \frac{1}{2}x^2 + f_2(x),$$

with

$$\lim_{x\to 0} \frac{f_1(x)}{x} = \lim_{x\to 0} \frac{f_2(x)}{x^2} = 0.$$
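The two limits can be observed numerically. This is a sketch under the notation of the example; the function names are assumptions introduced here, and the inputs are kept moderately small because floating point cancellation spoils the quotients for extremely small $x$.

```python
import math

def sin_remainder_ratio(x):
    # f1(x) / x, where f1(x) = sin x - x
    return (math.sin(x) - x) / x

def cos_remainder_ratio(x):
    # f2(x) / x^2, where f2(x) = cos x - 1 + x^2 / 2
    return (math.cos(x) - 1 + x * x / 2) / (x * x)

for x in (1e-1, 1e-2, 1e-3):
    print(x, sin_remainder_ratio(x), cos_remainder_ratio(x))
```

Both columns shrink toward 0 as $x$ approaches 0, which is what $o(x)$ and $o(x^2)$ mean here.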

Exercise 1.3.32. Compare the infinitesimals at 0.

$\sqrt{x+1} - 1$, $\sqrt[3]{\sqrt{x+1} - 1}$, $\tan x - x$, $\sin x + \cos x - 1$, $\sin x \sin\frac{1}{x}$.

Exercise 1.3.33. Find $\alpha$, so that the infinitesimals at 0 have the same order as $x^\alpha$.

$\sin 2x - 2\sin x$, $2\sqrt{6x^3 + 5x^5}$, $\sqrt{x + \sqrt{x^2 + \sqrt{x^3}}}$.

Exercise 1.3.34. Do Exercise 1.3.30 again for the infinitesimals.

Exercise 1.3.35. How do the comparisons change when we use the reciprocal $f(x) \leftrightarrow \frac{1}{f(x)}$ to convert between infinities and infinitesimals?

1.3.8 Additional Exercise

Extended Supremum and Extended Upper Limit

If $\lim_{x\to 0} \frac{f(x)}{g(x)} = 1$, then we expect $f(x_1) + f(x_2) + \cdots + f(x_n)$ and $g(x_1) + g(x_2) + \cdots + g(x_n)$ to be very close to each other when $x_1, x_2, \dots, x_n$ are very small. The following exercises indicate some cases in which our expectation is fulfilled.


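Exercise 1.3.36. Suppose $f(x)$ is a function on $(0, 1]$ satisfying $\lim_{x\to 0^+} \frac{f(x)}{x} = 1$. Prove

$$\lim_{n\to\infty} \left( f\left(\frac{1}{n^2}\right) + f\left(\frac{2}{n^2}\right) + f\left(\frac{3}{n^2}\right) + \cdots + f\left(\frac{n}{n^2}\right) \right) = \lim_{n\to\infty} \left( \frac{1}{n^2} + \frac{2}{n^2} + \frac{3}{n^2} + \cdots + \frac{n}{n^2} \right) = \lim_{n\to\infty} \frac{n+1}{2n} = \frac{1}{2}.$$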
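The statement can be observed numerically before proving it. The sketch below takes $f(x) = \sin x$, which is only one sample function satisfying $\lim_{x\to 0^+} \frac{f(x)}{x} = 1$ (an assumption made for illustration), and the helper name `partial_sum` is introduced here.

```python
import math

def partial_sum(f, n):
    # f(1/n^2) + f(2/n^2) + ... + f(n/n^2)
    return sum(f(k / n**2) for k in range(1, n + 1))

for n in (10, 100, 1000):
    # Compare the sum for f = sin with the sum for the identity function.
    print(n, partial_sum(math.sin, n), (n + 1) / (2 * n))
```

The two columns approach each other and both approach $\frac{1}{2}$. The computation is not a proof, but it shows what the exercise asks for.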

Exercise 1.3.37. Suppose $g(x) > 0$ and $\lim_{x\to 0} \frac{f(x)}{g(x)} = 1$. Suppose for each natural number $n$, there are nonzero numbers $x_{n,1}, x_{n,2}, \dots, x_{n,k_n}$, so that $\{x_{n,k}\}$ uniformly converges to 0: For any $\epsilon > 0$, there is $N$, such that $n > N$ implies $|x_{n,k}| < \epsilon$ for all $1 \le k \le k_n$. Prove that if

$$\lim_{n\to\infty} \left( g(x_{n,1}) + g(x_{n,2}) + \cdots + g(x_{n,k_n}) \right) = a,$$

then

$$\lim_{n\to\infty} \left( f(x_{n,1}) + f(x_{n,2}) + \cdots + f(x_{n,k_n}) \right) = a.$$

Upper and Lower Limits of Functions

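Suppose $f(x)$ is defined near (but not necessarily at) $a$. Let

$$\mathrm{LIM}_a f = \left\{ \lim_{n\to\infty} f(x_n) : x_n \ne a,\ \lim_{n\to\infty} x_n = a,\ \lim_{n\to\infty} f(x_n) \text{ converges} \right\}.$$

Define

$$\overline{\lim}_{x\to a} f(x) = \sup \mathrm{LIM}_a f, \quad \underline{\lim}_{x\to a} f(x) = \inf \mathrm{LIM}_a f.$$

Similar definitions can be made when $a$ is replaced by $a^+$, $a^-$, $\infty$, $+\infty$ and $-\infty$.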

Exercise 1.3.38. Prove the analogue of Proposition 1.2.9: $l \in \mathrm{LIM}_a f$ if and only if for any $\epsilon > 0$ and $\delta > 0$, there is $x$ satisfying $0 < |x - a| < \delta$ and $|f(x) - l| < \epsilon$.

Exercise 1.3.39. Prove the analogue of Exercise 1.2.24: If $\lim_{k\to\infty} l_k = l$ and $l_k \in \mathrm{LIM}_a f$ for each $k$, then $l \in \mathrm{LIM}_a f$.

Exercise 1.3.40. Prove the analogue of Proposition 1.2.10:

1. If $l < \overline{\lim}_{x\to a} f(x)$, then for any $\delta > 0$, there is $x$ satisfying $0 < |x - a| < \delta$ and $f(x) > l$.

2. If $l > \overline{\lim}_{x\to a} f(x)$, then there is $\delta > 0$, such that $0 < |x - a| < \delta$ implies $f(x) < l$.

The two properties completely characterize the upper limit.

Exercise 1.3.41. Prove the analogue of Proposition 1.2.11: The upper and the lower limits of $f(x)$ at $a$ belong to $\mathrm{LIM}_a f$, and $f(x)$ converges at $a$ if and only if the upper and the lower limits at $a$ are equal.

Exercise 1.3.42. Prove the extension of Proposition 1.3.3:

$$\mathrm{LIM}_a f = \mathrm{LIM}_{a^+} f \cup \mathrm{LIM}_{a^-} f.$$


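In particular, we have

$$\overline{\lim}_{x\to a} f(x) = \max\left\{ \overline{\lim}_{x\to a^+} f(x),\ \overline{\lim}_{x\to a^-} f(x) \right\},$$

$$\underline{\lim}_{x\to a} f(x) = \min\left\{ \underline{\lim}_{x\to a^+} f(x),\ \underline{\lim}_{x\to a^-} f(x) \right\}.$$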

Exercise 1.3.43. Extend the arithmetic and order properties of upper and lower limits in Exercise 1.2.26.

Exercise 1.3.44. Prove the analogue of Exercise 1.2.29:

$$\overline{\lim}_{x\to a} f(x) = \lim_{\delta\to 0^+} \sup\{f(x) : 0 < |x - a| < \delta\}.$$

1.4 Continuous Function

Changing quantities are often described by functions. Most of the changes in the real world are smooth, gradual and well behaved. For example, people do not often press the brake when driving a car, and the weather does not suddenly change from summer to winter. The functions describing such well behaved changes are at least continuous.

1.4.1 Definition

A function is continuous if its graph “does not break”. The graph of a function $f(x)$ may break at $a$ for various reasons. But the breaks are always one of the two types: Either $\lim_{x\to a} f(x)$ diverges or the limit converges but the limit value is not $f(a)$.

In Figure 1.6, the function is continuous at $a_4$ and is not continuous at the other four points. Specifically, $\lim_{x\to a_1} f(x)$ exists but is not equal to $f(a_1)$. $\lim_{x\to a_2} f(x)$ does not exist because the left and the right limits are not the same. $\lim_{x\to a_3} f(x)$ does not exist because the function is not bounded near $a_3$. $\lim_{x\to a_5} f(x)$ does not exist because the left limit does not exist.

Figure 1.6: continuity and discontinuity (graph of a function with the points $a_1$, $a_2$, $a_3$, $a_4$, $a_5$ marked on the horizontal axis)

The observation leads to the following definition.

Definition 1.4.1. A function $f(x)$ defined near (and including) $a$ is continuous at $a$ if $\lim_{x\to a} f(x) = f(a)$.


Using the ε-δ language, the continuity of $f(x)$ at $a$ means that for any $\epsilon > 0$, there is $\delta > 0$, such that

$$|x - a| < \delta \implies |f(x) - f(a)| < \epsilon. \tag{1.4.1}$$

A function is right continuous at $a$ if $\lim_{x\to a^+} f(x) = f(a)$, and is left continuous if $\lim_{x\to a^-} f(x) = f(a)$. For example, the function in Figure 1.6 is left continuous at $a_2$ and right continuous at $a_5$. A function is continuous at $a$ if and only if it is both left and right continuous at $a$.

A function defined on an open interval $(a, b)$ is continuous if it is continuous at every point on the interval. A function defined on a closed interval $[a, b]$ is continuous if it is continuous at every point on $(a, b)$, is right continuous at $a$, and is left continuous at $b$. Continuities for functions on other kinds of intervals can be defined similarly.

Most basic functions are continuous. The limits (1.3.15), (1.3.17), (1.3.20), (1.3.21), (1.3.22), (1.3.27) show that polynomials, rational functions, trigonometric functions, power functions, and exponential functions are continuous (at the places where the functions are defined). Then the arithmetic rule and the composition rule in Proposition 1.3.6 imply the following.

Proposition 1.4.2. The arithmetic combinations and the compositions of continuous functions are still continuous.

By the proposition, functions such as $\sin^2 x + \tan x^2$, $\frac{x}{\sqrt{x+1}}$, $2^x \cos\frac{\tan x}{x^2+1}$ are continuous. Thus examples of discontinuity can only be found among more exotic functions.

Proposition 1.3.8 implies the following criterion for the continuity in terms of sequences.

Proposition 1.4.3. A function $f(x)$ is continuous at $a$ if and only if $\lim_{n\to\infty} f(x_n) = f(a)$ for any sequence $\{x_n\}$ converging to $a$.

The conclusion of the proposition can be written as

$$\lim_{n\to\infty} f(x_n) = f\left( \lim_{n\to\infty} x_n \right). \tag{1.4.2}$$

Moreover, by the composition rule for the limit of functions, the continuity of $f(x)$ at $a = \lim_{y\to b} g(y)$ also means

$$\lim_{y\to b} f(g(y)) = f\left( \lim_{y\to b} g(y) \right). \tag{1.4.3}$$

Therefore the function and the limit can be interchanged if the function is continuous.

Example 1.4.1. The sign function

$$\operatorname{sign}(x) = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{if } x = 0 \\ -1 & \text{if } x < 0 \end{cases}$$

is continuous everywhere except at 0. The Dirichlet function $D(x)$ in Example 1.3.24 is not continuous anywhere. Multiplying the Dirichlet function by $x$ produces a function

$$xD(x) = \begin{cases} x & \text{if } x \text{ is rational} \\ 0 & \text{if } x \text{ is irrational} \end{cases}$$

that is continuous only at 0.

Figure 1.7: discontinuous functions (graphs of $\operatorname{sign}(x)$, $D(x)$ and $xD(x)$, with the rational and irrational branches labeled)

Example 1.4.2 (Thomae⁹). For a rational number $x = \frac{p}{q}$, where $p$ is an integer, $q$ is a natural number, and $p$, $q$ are coprime, define $R(x) = \frac{1}{q}$. For an irrational number $x$, define $R(x) = 0$. Finally define $R(0) = 1$. We will show that $R(x)$ is continuous at precisely all the irrational numbers.

By the way $R(x)$ is defined, for any natural number $N$, the only numbers $x$ satisfying $R(x) \ge \frac{1}{N}$ are the rational numbers $\frac{p}{q}$ with $q \le N$. Therefore on any bounded interval, we have $R(x) < \frac{1}{N}$ for all except finitely many rational numbers. For any irrational $a$, let $\epsilon > 0$ be the smallest distance between $a$ and these finitely many rational numbers in a bounded interval around $a$, such as $(a - 1, a + 1)$. Then we have $|R(x) - R(a)| = R(x) < \frac{1}{N}$ for all $x$ satisfying $|x - a| < \epsilon$. This proves that $R(x)$ is continuous at $a$. On the other hand, for rational $a = \frac{p}{q}$, we can find irrational numbers $x$ arbitrarily close to $a$, but with $|R(x) - R(a)| = R(a) = \frac{1}{q}$. Since $q$ does not depend on $x$, we do not have $\lim_{x\to a} R(x) = R(a)$, and the function is not continuous at the rational point.
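The key step of the argument is that only finitely many points of a bounded interval have $R(x) \ge \frac{1}{N}$. The following sketch lists them with exact rational arithmetic. The function names are assumptions introduced here, and the code only handles rational inputs given as `Fraction` or `int`, since irrational numbers cannot be represented exactly.

```python
import math
from fractions import Fraction

def thomae(x):
    """R(x) for a rational x given as a Fraction or an int."""
    # Fraction stores p/q in lowest terms with q > 0, so R(p/q) = 1/q.
    return Fraction(1, Fraction(x).denominator)

def rationals_with_value_at_least(lo, hi, N):
    """All rationals in [lo, hi] with R >= 1/N, that is, with reduced denominator q <= N."""
    found = set()
    for q in range(1, N + 1):
        p = math.ceil(lo * q)           # smallest p with p/q >= lo
        while Fraction(p, q) <= hi:
            found.add(Fraction(p, q))   # the set removes duplicates such as 2/4 = 1/2
            p += 1
    return sorted(found)

print(thomae(Fraction(3, 6)))                    # value of R at 1/2
print(rationals_with_value_at_least(0, 1, 4))    # a finite list of points
```

For an irrational $a$ in the interval, the distance from $a$ to the nearest point of this finite list plays the role of the positive number chosen in the example.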

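Exercise 1.4.1. Which functions are not continuous? Where do the discontinuities happen, and for what reason?

1. $|x|$.

2. $\frac{x^2 \sin x}{x^2 + \sin x}$.

3. $[x] \sin x$.

4. $[x] \sin \pi x$.

5. $\begin{cases} \frac{|x|}{x} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0 \end{cases}$.

6. $\begin{cases} \frac{|x|}{x} \sin x & \text{if } x \ne 0 \\ 0 & \text{if } x = 0 \end{cases}$.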

⁹ Karl Johannes Thomae, born 1840 in Laucha (Germany), died 1921 in Jena (Germany). Thomae made important contributions to function theory. In 1870 he showed that continuity in each variable does not imply joint continuity. He constructed the example here in 1875.


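7. $\begin{cases} \frac{|x|}{x} \sin x & \text{if } x \ne 0 \\ 1 & \text{if } x = 0 \end{cases}$.

8. $\begin{cases} \sin\frac{1}{x} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0 \end{cases}$.

9. $\begin{cases} x \sin\frac{1}{x} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0 \end{cases}$.

10. $\begin{cases} x^x & \text{if } x > 0 \\ 0 & \text{if } x \le 0 \end{cases}$.

11. $\begin{cases} x^x & \text{if } x > 0 \\ (-x)^x & \text{if } x < 0 \\ 1 & \text{if } x = 0 \end{cases}$.

12. $\begin{cases} x & \text{if } x \text{ is rational} \\ 1 - x & \text{if } x \text{ is irrational} \end{cases}$.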

Exercise 1.4.2. Construct functions on $(0, 2)$ satisfying the requirements.

1. $f(x)$ is not continuous at $\frac{1}{2}$, 1 and $\frac{3}{2}$ and is continuous everywhere else.

2. $f(x)$ is continuous at $\frac{1}{2}$, 1 and $\frac{3}{2}$ and is not continuous anywhere else.

3. $f(x)$ is continuous everywhere except at $\frac{1}{n}$ for all natural numbers $n$.

4. $f(x)$ is not left continuous at $\frac{1}{2}$, not right continuous at 1, neither side continuous at $\frac{3}{2}$, and continuous everywhere else (including the right of $\frac{1}{2}$ and the left of 1).

Exercise 1.4.3. Prove that a function $f(x)$ is continuous at $a$ if and only if there is $l$, such that for any $\epsilon > 0$, there is $\delta > 0$, such that

$$|x - a| < \delta \implies |f(x) - l| < \epsilon.$$

By (1.4.1), all you need to do here is to show $l = f(a)$.

Exercise 1.4.4. Prove that if a function is continuous on $(a, b]$ and $[b, c)$, then it is continuous on $(a, c)$. What about other types of intervals?

Exercise 1.4.5. Suppose $f(x)$ and $g(x)$ are continuous. Prove $\max\{f(x), g(x)\}$ and $\min\{f(x), g(x)\}$ are also continuous.

Exercise 1.4.6. Suppose $f(x)$ is continuous on $[a, b]$ and $f(r) = 0$ for all rational numbers $r \in [a, b]$. Prove $f(x) = 0$ on the whole interval.

Exercise 1.4.7. Prove that a continuous function $f(x)$ on $(a, b)$ is the restriction of a continuous function on $[a, b]$ if and only if the limits $\lim_{x\to a^+} f(x)$ and $\lim_{x\to b^-} f(x)$ exist.

Exercise 1.4.8. Suppose $f(x)$ is an increasing function on $[a, b]$. Prove that if every number in $[f(a), f(b)]$ is a value of $f(x)$, then the function is continuous.

Exercise 1.4.9. Suppose $f(x)$ and $g(x)$ are continuous functions on $(a, b)$. Find the places where the following function is continuous.

$$h(x) = \begin{cases} f(x) & \text{if } x \text{ is rational} \\ g(x) & \text{if } x \text{ is irrational} \end{cases}$$

Exercise 1.4.10. Suppose $f(x)$ is an increasing function on $[a, b]$. By Proposition 1.3.9, the limits $f(c^+) = \lim_{x\to c^+} f(x)$ and $f(c^-) = \lim_{x\to c^-} f(x)$ exist at any $c \in (a, b)$.


1. Prove that for any $\epsilon > 0$, there are only finitely many $c$ satisfying $f(c^+) - f(c^-) > \epsilon$.

2. Prove that $f(x)$ fails to be continuous at no more than countably many points.

1.4.2 Uniformly Continuous Function

Theorem 1.4.4. Suppose $f(x)$ is a continuous function on a bounded closed interval $[a, b]$. Then for any $\epsilon > 0$, there is $\delta > 0$, such that for $x, y \in [a, b]$,

$$|x - y| < \delta \implies |f(x) - f(y)| < \epsilon. \tag{1.4.4}$$

In the ε-δ formulation (1.4.1) of the continuity, only one variable $x$ is allowed to change. This means that, in addition to being dependent on $\epsilon$, the choice of $\delta$ may also be dependent on the location $a$ of the continuity. The property (1.4.4) says that the choice of $\delta$ can be the same for all the points on the interval, so that it depends on $\epsilon$ only. Therefore the property described in the theorem is called the uniform continuity, and the theorem basically says a continuous function on a bounded closed interval is uniformly continuous.

Proof of Theorem 1.4.4. Suppose $f(x)$ is not uniformly continuous. Then there is $\epsilon > 0$, such that for any natural number $n$, there are $x_n, y_n \in [a, b]$, such that

$$|x_n - y_n| < \frac{1}{n}, \quad |f(x_n) - f(y_n)| \ge \epsilon. \tag{1.4.5}$$

Since the sequence $\{x_n\}$ is bounded by $a$ and $b$, by Theorem 1.2.8, there is a convergent subsequence $\{x_{n_k}\}$. Then by $x_{n_k} - \frac{1}{n_k} < y_{n_k} < x_{n_k} + \frac{1}{n_k}$ and the sandwich rule, the subsequence $\{y_{n_k}\}$ also converges to the same limit. Denote $\lim_{k\to\infty} x_{n_k} = \lim_{k\to\infty} y_{n_k} = c$.

By $a \le x_n, y_n \le b$, we have $a \le c \le b$. Therefore by the assumption, $f(x)$ is continuous at $c$. By Proposition 1.4.3, we have

$$\lim_{k\to\infty} f(x_{n_k}) = \lim_{k\to\infty} f(y_{n_k}) = f(c).$$

Then by the second inequality in (1.4.5), we have

$$\epsilon \le \left| \lim_{k\to\infty} f(x_{n_k}) - \lim_{k\to\infty} f(y_{n_k}) \right| = |f(c) - f(c)| = 0.$$

The contradiction shows that it was wrong to assume that the function is not uniformly continuous.

Example 1.4.3. Consider the function $x^2$ on $[0, 2]$. For any $\epsilon > 0$, take $\delta = \frac{\epsilon}{4}$. Then for $0 \le x, y \le 2$, we have

$$|x - y| < \delta \implies |x^2 - y^2| = |x - y||x + y| \le 4|x - y| < \epsilon.$$

Thus $x^2$ is uniformly continuous on $[0, 2]$.


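Now consider the same function on $[0, \infty)$. For any $\delta > 0$, take $x = \frac{1}{\delta}$, $y = \frac{1}{\delta} + \frac{\delta}{2}$. Then $|x - y| < \delta$, but $|x^2 - y^2| = x\delta + \frac{\delta^2}{4} > x\delta = 1$. Thus (1.4.4) fails for $\epsilon = 1$, and $x^2$ is not uniformly continuous on $[0, \infty)$.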
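The counterexample can be replayed in exact arithmetic. This is a sketch only: the helper name `witness` is an assumption, and `Fraction` is used so that the comparison with 1 is not affected by rounding.

```python
from fractions import Fraction

def witness(delta):
    """The pair x = 1/delta, y = 1/delta + delta/2 from the example."""
    delta = Fraction(delta)
    x = 1 / delta
    y = x + delta / 2
    return x, y

for delta in (Fraction(1), Fraction(1, 10), Fraction(1, 1000)):
    x, y = witness(delta)
    gap = abs(x**2 - y**2)
    # |x - y| < delta holds, yet the gap between the squares exceeds 1.
    print(delta, abs(x - y) < delta, gap > 1, gap == 1 + delta**2 / 4)
```

However small $\delta$ is chosen, the pair stays within distance $\delta$ while the values of $x^2$ differ by more than 1, so no single $\delta$ works for $\epsilon = 1$.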

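Example 1.4.4. Consider the function $\sqrt{x}$ on $[1, \infty)$. For any $\epsilon > 0$, take $\delta = \epsilon$. Then for $x, y \ge 1$, we have

$$|x - y| < \delta \implies |\sqrt{x} - \sqrt{y}| = \frac{|x - y|}{|\sqrt{x} + \sqrt{y}|} \le \frac{|x - y|}{2} < \epsilon.$$

Thus $\sqrt{x}$ is uniformly continuous on $[1, \infty)$.

We also know $\sqrt{x}$ is uniformly continuous on $[0, 1]$, by Theorem 1.4.4. Then by using Exercise 1.4.12, $\sqrt{x}$ is uniformly continuous on $[0, \infty)$.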

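Exercise 1.4.11. Which are uniformly continuous?

1. $f(x) = x^2$ on $(0, 3)$.

2. $f(x) = \frac{1}{x}$ on $[1, 3]$.

3. $f(x) = \frac{1}{x}$ on $(0, 3]$.

4. $f(x) = \sqrt[3]{x}$ on $(-\infty, \infty)$.

5. $f(x) = x^{3/2}$ on $[1, \infty)$.

6. $f(x) = \sin x$ on $(-\infty, \infty)$.

7. $f(x) = \sin\frac{1}{x}$ on $(0, 1]$.

8. $f(x) = x \sin\frac{1}{x}$ on $(0, 1]$.

9. $f(x) = x^x$ on $(0, 1]$.

10. $f(x) = \left(1 + \frac{1}{x}\right)^x$ on $(0, \infty)$.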

Exercise 1.4.12. Prove that if a function is uniformly continuous on $(a, b]$ and $[b, c)$, then it is uniformly continuous on $(a, c)$. What about other types of intervals?

Exercise 1.4.13. A function $f(x)$ is called Lipschitz¹⁰ if there is a constant $L$ such that $|f(x) - f(y)| \le L|x - y|$ for any $x$ and $y$. Prove that Lipschitz functions are uniformly continuous.

Exercise 1.4.14. A function $f(x)$ on the whole real line is called periodic if there is a nonzero constant $p$ such that $f(x + p) = f(x)$ for any $x$ (the number $p$ is called the period of the function). Prove that continuous periodic functions are uniformly continuous.

Exercise 1.4.15. Is the sum of uniformly continuous functions uniformly continuous? What about the product, the maximum and the composition of uniformly continuous functions?

Exercise 1.4.16. Suppose $f(x)$ is a continuous function on $[a, b]$. Prove that $g(x) = \sup\{f(t) : a \le t \le x\}$ is continuous.

Exercise 1.4.17. Let $f(x)$ be a continuous function on $(a, b)$.

1. Prove that if the limits $\lim_{x\to a^+} f(x)$, $\lim_{x\to b^-} f(x)$ exist, then $f(x)$ is uniformly continuous.

2. Prove that if $(a, b)$ is bounded and $f(x)$ is uniformly continuous, then the limits $\lim_{x\to a^+} f(x)$, $\lim_{x\to b^-} f(x)$ exist.

¹⁰ Rudolf Otto Sigismund Lipschitz, born 1832 in Königsberg (Germany, now Kaliningrad, Russia), died 1903 in Bonn (Germany). He made important contributions in number theory, Fourier series, differential equations, and mechanics.


3. Use $\sqrt{x}$ to show that the bounded condition in the second part is necessary.

4. Use the second part to show that $\sin\frac{1}{x}$ is not uniformly continuous on $(0, 1)$.

Exercise 1.4.18 (Dirichlet). For any continuous function $f(x)$ on a bounded and closed interval $[a, b]$ and $\epsilon > 0$, inductively define

$$c_0 = a, \quad c_n = \sup\{c : |f(x) - f(c_{n-1})| < \epsilon \text{ on } [c_{n-1}, c]\}.$$

Prove that the process must stop after finitely many steps, which means that there is $n$, such that $|f(x) - f(c_n)| < \epsilon$ on $[c_n, b]$. Then use this to give another proof of Theorem 1.4.4.

1.4.3 Maximum and Minimum

Theorem 1.4.5. A continuous function on a bounded closed interval is bounded and reaches its maximum and minimum.

The theorem says that there are $x_0, x_1 \in [a, b]$, such that $f(x_0) \le f(x) \le f(x_1)$ for any $x \in [a, b]$. Thus the values of the function are bounded by $f(x_0)$ and $f(x_1)$. Moreover, the function reaches its minimum at $x_0$ and reaches its maximum at $x_1$.

We also note that although the theorem tells us the existence of the maximum and the minimum, it does not tell us how to find the extremes. The method for finding the extremes will be developed in Section 2.2.1.

Figure 1.8: maximum and minimum (graph of a function on $[a, b]$, with the points $x_0$, $x_1$ and the levels min, max labeled)

Proof of Theorem 1.4.5. Let $f(x)$ be a continuous function on a bounded closed interval $[a, b]$. We first prove the function is bounded. By Theorem 1.4.4, for $\epsilon = 1 > 0$, there is $\delta > 0$, such that $x, y \in [a, b]$ and $|x - y| < \delta$ implies $|f(x) - f(y)| < 1$. In other words, $f(x)$ is bounded by $f(y) + 1$ and $f(y) - 1$ on any open interval $(y - \delta, y + \delta)$ of length $2\delta$. Since the bounded interval $[a, b]$ can be covered by finitely many (any number bigger than $\frac{b - a}{2\delta}$ is enough) such intervals, the function is bounded on the whole interval $[a, b]$.


Let $\beta = \sup\{f(x) : x \in [a, b]\}$. Showing $f(x)$ reaches its maximum is the same as proving $\beta$ is a value of $f(x)$. By the characterization of supremum, for any natural number $n$, there is $x_n \in [a, b]$, such that $\beta - \frac{1}{n} < f(x_n) \le \beta$. By the sandwich rule, we get

$$\lim_{n\to\infty} f(x_n) = \beta. \tag{1.4.6}$$

On the other hand, since the sequence $\{x_n\}$ is bounded by $a$ and $b$, by Theorem 1.2.8, there is a convergent subsequence $\{x_{n_k}\}$ with $c = \lim_{k\to\infty} x_{n_k} \in [a, b]$. Then we have

$$\lim_{k\to\infty} f(x_{n_k}) = f(c). \tag{1.4.7}$$

Combining the limits (1.4.6), (1.4.7) and using Proposition 1.2.1, we get $f(c) = \beta$. The proof for the function to reach its minimum is similar.
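The proof gives existence only. As a rough numerical companion, one may sample the function on a grid and take the largest sampled value. This is a naive sketch and not the method of Section 2.2.1; the helper name `grid_max` and the sample function $x(1 - x)$ on $[0, 1]$ are assumptions made for illustration, and a grid can miss narrow peaks.

```python
def grid_max(f, a, b, n):
    """Largest value of f among n + 1 equally spaced points of [a, b], and the point where it occurs."""
    points = [a + (b - a) * k / n for k in range(n + 1)]
    best = max(points, key=f)
    return best, f(best)

# Sample function x(1 - x), whose maximum on [0, 1] is 1/4 at x = 1/2.
x1, beta = grid_max(lambda x: x * (1 - x), 0.0, 1.0, 1000)
print(x1, beta)
```

The sampled points play a role similar to the sequence $\{x_n\}$ in the proof: their values approach the supremum $\beta$ as the grid is refined.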

Exercise 1.4.19. Construct functions satisfying the requirements.

1. f(x) is continuous and not bounded on (0, 1).

2. f(x) is continuous and bounded on (0, 1) but does not reach its maximum.

3. f(x) is continuous and bounded on (0, 1). Moreover, f(x) also reaches its maximum and minimum.

4. f(x) is not continuous and not bounded on [0, 1].

5. f(x) is not continuous on [0, 1] but reaches its maximum and minimum.

6. f(x) is continuous and bounded on (−∞, ∞) but does not reach its maximum.

7. f(x) is continuous and bounded on (−∞, ∞). Moreover, f(x) also reaches its maximum and minimum.

What do your examples say about Theorem 1.4.5?

Exercise 1.4.20. Suppose a continuous function f(x) on (a, b) satisfies $\lim_{x\to a^+} f(x) = \lim_{x\to b^-} f(x) = -\infty$. Prove that the function reaches its maximum on the interval.

Exercise 1.4.21. Suppose f(x) is continuous on a bounded closed interval [a, b]. Suppose for any a ≤ x ≤ b, there is a ≤ y ≤ b, such that $|f(y)| \le \frac{1}{2}|f(x)|$. Prove that f(c) = 0 for some a ≤ c ≤ b. Does the conclusion still hold if the closed interval is changed to an open one?

Exercise 1.4.22. Prove that a uniformly continuous function on a bounded interval must be bounded.

1.4.4 Intermediate Value Theorem

Theorem 1.4.6 (Intermediate Value Theorem). Suppose f(x) is a continuous function on a bounded closed interval [a, b]. If y is a number between f(a) and f(b), then there is c ∈ [a, b], such that f(c) = y.


By Theorem 1.4.5, for the function f(x) in the theorem above, there are $x_0, x_1 \in [a, b]$, such that

$$f(x_0) = \min\{f(x) : a \le x \le b\}, \quad f(x_1) = \max\{f(x) : a \le x \le b\}.$$

Denote $\alpha = f(x_0)$ and $\beta = f(x_1)$. Applying the intermediate value theorem to f(x) on $[x_0, x_1]$ (or $[x_1, x_0]$ if $x_0 > x_1$), we see that f(x) reaches any value between α and β. This proves the following result.

Proposition 1.4.7. Suppose f(x) is a continuous function on a bounded closed interval [a, b]. Then the values of the function on the interval again form a bounded closed interval:

f([a, b]) = {f(x) : a ≤ x ≤ b} = [α, β].

Conversely, the proposition implies both Theorems 1.4.5 and 1.4.6.

Figure 1.9: intermediate value theorem (labels in the figure: a, b, c, x0, x1, α, β, y, max, min)

Proof of Theorem 1.4.6. Without loss of generality, assume f(a) ≤ y ≤ f(b). Let

X = {x ∈ [a, b] : f(x) ≤ y}.

The set is not empty because a ∈ X. The set is also bounded by a and b. Therefore we have c = sup X ∈ [a, b]. We expect f(c) = y.

If f(c) > y, then by the order property in Proposition 1.3.6, there is δ > 0, such that f(x) > y for any x satisfying c − δ < x ≤ c (the right side of c may not be allowed in case c = b). On the other hand, by c = sup X, there is c − δ < x′ ≤ c, such that f(x′) ≤ y. Therefore we have a contradiction at x′.

If f(c) < y, then by f(b) ≥ y, we have c < b. Again by the order property in Proposition 1.3.6, there is δ > 0, such that f(x) < y for any x satisfying |x − c| < δ. In particular, any x′ ∈ (c, c + δ) with x′ ≤ b will satisfy x′ > c and f(x′) < y. This contradicts the assumption that c is an upper bound of X.

Thus we conclude that f(c) = y, and the proof is complete.


Example 1.4.5. For the function $f(x) = x^5 + 2x^3 - 5x^2 + 1$, we have f(0) = 1, f(1) = −1, f(2) = 29. By the intermediate value theorem, there are a ∈ [0, 1] and b ∈ [1, 2] such that f(a) = f(b) = 0.
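The two zeros promised by the intermediate value theorem can be located numerically. The following is a minimal sketch, not part of the original text: it uses bisection (a technique swapped in here for illustration, whereas the proof of Theorem 1.4.6 works with a supremum), and the names `bisect`, `root_a`, `root_b` and the tolerance `tol` are assumptions made for this example.

```python
def f(x):
    """The polynomial from Example 1.4.5."""
    return x**5 + 2 * x**3 - 5 * x**2 + 1


def bisect(func, lo, hi, tol=1e-10):
    """Locate a zero of func in [lo, hi], assuming func is continuous and
    func(lo), func(hi) have opposite signs (the hypothesis of the
    intermediate value theorem)."""
    f_lo, f_hi = func(lo), func(hi)
    if f_lo * f_hi > 0:
        raise ValueError("func(lo) and func(hi) must have opposite signs")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if f_mid == 0:
            return mid
        # Keep the half interval on which the sign change persists.
        if f_lo * f_mid < 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2


root_a = bisect(f, 0.0, 1.0)  # a zero in [0, 1], since f(0) > 0 > f(1)
root_b = bisect(f, 1.0, 2.0)  # a zero in [1, 2], since f(1) < 0 < f(2)
print(root_a, root_b)
```

Each halving step keeps an interval whose endpoint values have opposite signs, so the theorem guarantees a zero inside it at every stage.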

Example 1.4.6. A number a is a root of a function f(x) if f(a) = 0. For a polynomial $f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$ with $a_n > 0$ and n odd, we have

$$f(x) = x^n g(x), \quad g(x) = a_n + \frac{a_{n-1}}{x} + \cdots + \frac{a_1}{x^{n-1}} + \frac{a_0}{x^n}.$$

Since $\lim_{x\to\infty} g(x) = a_n > 0$ and n is odd, we have f(x) > 0 for sufficiently big and positive x, and f(x) < 0 for sufficiently big and negative x. Then by the intermediate value theorem, f(a) = 0 for some a (between two sufficiently big numbers of opposite signs). A similar argument also works for the case $a_n < 0$. Therefore we conclude that a polynomial of odd degree must have a real root.

Example 1.4.7. The function tan x is continuous on $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$. By

$$\lim_{x\to -\frac{\pi}{2}^+} \tan x = -\infty, \quad \lim_{x\to \frac{\pi}{2}^-} \tan x = +\infty,$$

for any number y, we can find $-\frac{\pi}{2} < a < b < \frac{\pi}{2}$, such that tan a < y < tan b. Then by the intermediate value theorem, we have y = tan c for some c ∈ [a, b]. This shows that any real number is the tangent of some angle.

Exercise 1.4.23. Prove that for any polynomial of odd degree, any real number is the value of the polynomial. Moreover, estimate a solution of the equation $x^5 - 3x^4 + 5x = 4$ to the accuracy of the first decimal point (i.e., find n such that you are certain that there is a solution between $\frac{n}{10}$ and $\frac{n+1}{10}$).

Exercise 1.4.24. Show that $2^x = 3x$ has a solution on (0, 1). Show that $3^x = x^2$ has a solution.

Exercise 1.4.25. Suppose a continuous function on an interval is never zero. Prove that it is always positive or always negative.

Exercise 1.4.26. Suppose a continuous function f(x) on (a, b) satisfies $\lim_{x\to a^+} f(x) = \alpha$ and $\lim_{x\to b^-} f(x) = \beta$. Prove that any number in (α, β) is the value of the function. The statement holds even when some of a, b, α, β are infinities.

Exercise 1.4.27. Suppose f(x) is a continuous function on (a, b). Prove that if f(x) only takes rational numbers as values, then f(x) is a constant.

Exercise 1.4.28. Suppose f(x) and g(x) are continuous functions on [a, b]. Prove that if f(a) < g(a) and f(b) > g(b), then f(c) = g(c) for some c ∈ (a, b).

Exercise 1.4.29. Suppose f : [0, 1] → [0, 1] is a continuous function. Prove that f(c) = c for some c ∈ [0, 1]. Such a number c is called a fixed point of f.

Exercise 1.4.30. Suppose f(x) is a two-to-one function on [a, b]. In other words, for any x ∈ [a, b], there is exactly one other y ∈ [a, b] such that x ≠ y and f(x) = f(y). Prove that f(x) is not continuous.

1.4.5 Invertible Continuous Function

Functions are maps. By writing a function in the form f : [a, b] → [α, β], we mean the function is defined on the domain [a, b] and its values lie in the range [α, β]. The function is onto (or surjective) if any γ ∈ [α, β] can be written as γ = f(c) for some c ∈ [a, b]. It is one-to-one (or injective) if $x_1 \ne x_2$ implies $f(x_1) \ne f(x_2)$. It is invertible (or bijective) if there is another function g : [α, β] → [a, b] such that g(f(x)) = x for any x ∈ [a, b] and f(g(y)) = y for any y ∈ [α, β]. The function g is called the inverse of f and is denoted $g = f^{-1}$. It is a basic fact that a function is invertible if and only if it is onto and one-to-one.

In the discussion above, the domain and the range are expressed as closed intervals. The discussion is still valid if other kinds of intervals (or any sets of real numbers) are used.

Theorem 1.4.8. A continuous function on an interval is invertible if and only if it is strictly monotone. Moreover, the inverse is also continuous and strictly monotone.

Proof. We only prove the case f(x) is a continuous function on a bounded and closed interval [a, b].

Suppose f(x) is strictly increasing. By the intermediate value theorem (Theorem 1.4.6), the function is an onto map f : [a, b] → [α, β] = [f(a), f(b)]. Moreover,

$$x_1 \ne x_2 \iff x_1 > x_2 \text{ or } x_1 < x_2$$

$$\implies f(x_1) > f(x_2) \text{ or } f(x_1) < f(x_2)$$

$$\iff f(x_1) \ne f(x_2).$$

Thus the function is one-to-one. Since onto and one-to-one maps are invertible, we conclude that f(x) is invertible.

Conversely, suppose f : [a, b] → [α, β] is invertible and f(a) < f(b). We claim f(x) is strictly increasing. If f(x) is not strictly increasing, then there are $a \le x_1 < x_2 \le b$ satisfying $f(x_1) \ge f(x_2)$. Since the invertibility implies one-to-one, we must have $f(x_1) > f(x_2)$. Then we have two possibilities.

1. If $f(a) \ge f(x_1)$, then $f(b) > f(a) \ge f(x_1) > f(x_2)$, and we have $f(a) > f(x_2) < f(b)$.

2. If $f(a) < f(x_1)$, then $f(a) < f(x_1) > f(x_2)$.

In either possibility, by f being one-to-one, there are $y_1 < y_2 < y_3$ in [a, b] satisfying either

$$y_1 < y_2 < y_3, \quad f(y_1) > f(y_2) < f(y_3),$$

or

$$y_1 < y_2 < y_3, \quad f(y_1) < f(y_2) > f(y_3).$$

For the first case, we have either $f(y_1) \le f(y_3)$ or $f(y_1) \ge f(y_3)$. If $f(y_1) \le f(y_3)$, then by the intermediate value theorem, there is $y_4$ with $y_1 < y_2 \le y_4 \le y_3$ satisfying $f(y_1) = f(y_4)$. Similarly, if $f(y_1) \ge f(y_3)$, then there is $y_4$ with $y_1 \le y_4 \le y_2 < y_3$ satisfying $f(y_4) = f(y_3)$. Both contradict the one-to-one property. Thus the first case cannot happen. The proof for the impossibility of the second case is similar. This completes the proof that f(x) must be strictly increasing.


It remains to prove that if f(x) is continuous, strictly increasing and invertible, then its inverse $f^{-1}(y)$ is also continuous and strictly increasing.

Let $y_1 = f(x_1)$ and $y_2 = f(x_2)$. Then

$$x_1 \ge x_2 \implies y_1 \ge y_2.$$

The implication is the same as

$$y_1 < y_2 \implies x_1 < x_2,$$

which means exactly that $f^{-1}$ is strictly increasing.

Finally, we prove the continuity of $f^{-1}$ at γ ∈ [α, β] = [f(a), f(b)]. We just proved that $f^{-1}$ is strictly increasing, so that by Proposition 1.3.9, the right side limit $\lim_{y\to\gamma^+} f^{-1}(y) = c$ exists. Applying the continuous function f(x) and (1.4.3), we get

$$\gamma = \lim_{y\to\gamma^+} y = \lim_{y\to\gamma^+} f(f^{-1}(y)) = f(c).$$

Therefore $\lim_{y\to\gamma^+} f^{-1}(y) = c = f^{-1}(\gamma)$, which means $f^{-1}$ is right continuous at γ. By a similar argument, the inverse function is also left continuous at γ.

For a strictly increasing and continuous function f(x) on a (not necessarily bounded) open interval (a, b), the same argument as before shows it is one-to-one. Moreover, by Proposition 1.3.9, the limits $\alpha = \lim_{x\to a^+} f(x)$ and $\beta = \lim_{x\to b^-} f(x)$ exist. Then by Exercise 1.4.26, the map f : (a, b) → (α, β) is onto. Thus the function is invertible.

For an invertible and continuous function f(x) on (a, b), the restriction on any closed interval [a′, b′] ⊂ (a, b) is still invertible and continuous. Then by Theorem 1.4.8 for the closed intervals, the restriction on [a′, b′] is strictly monotone. Since [a′, b′] is arbitrary, we conclude that f(x) is strictly monotone on (a, b).

The proof of the continuity of the inverse function can also be applied to the strictly increasing and continuous functions on (a, b). Moreover, the proof also shows

$$\lim_{y\to\alpha^+} f^{-1}(y) = a, \quad \lim_{y\to\beta^-} f^{-1}(y) = b.$$

Exercise 1.4.31. Determine which of the following functions are invertible.

1. $x^2$ : [0, 1] → [0, 1].

2. $x^2$ : (0, 2) → (0, 4).

3. $x^2$ : (0, 2) → (0, 5).

4. $x^2$ : (−2, 2) → (0, 4).

5. $x^2$ : [−2, 0] → [0, 4].

6. $x^2$ : (0, ∞) → (0, ∞).

7. $x^2$ : (−∞, 0] → [0, ∞).

8. $x^2$ : (−∞, ∞) → (−∞, ∞).

Exercise 1.4.32. Suppose f(x) is a strictly decreasing and continuous function on [a, b). Let α = f(a) and $\beta = \lim_{x\to b^-} f(x)$. Prove that f : [a, b) → (β, α] is invertible and $\lim_{y\to\beta^+} f^{-1}(y) = b$.


1.4.6 Inverse Trigonometric and Logarithmic Functions

Theorem 1.4.8 may be used to introduce some important inverse functions. The functions

$$\sin\colon \left[-\frac{\pi}{2}, \frac{\pi}{2}\right] \to [-1, 1],$$

$$\cos\colon [0, \pi] \to [-1, 1],$$

$$\tan\colon \left(-\frac{\pi}{2}, \frac{\pi}{2}\right) \to (-\infty, \infty),$$

are onto, strictly monotone and continuous (see (1.3.20), (1.3.21), (1.3.22)). Therefore they are invertible, and the inverse trigonometric functions

$$\arcsin\colon [-1, 1] \to \left[-\frac{\pi}{2}, \frac{\pi}{2}\right],$$

$$\arccos\colon [-1, 1] \to [0, \pi],$$

$$\arctan\colon (-\infty, \infty) \to \left(-\frac{\pi}{2}, \frac{\pi}{2}\right),$$

are also strictly monotone and continuous. The other three inverse trigonometric functions may be defined similarly. The equality $\cos\left(\frac{\pi}{2} - x\right) = \sin x$ implies that

$$\arcsin x + \arccos x = \frac{\pi}{2}.$$

Similar equalities imply similar equations between other inverse trigonometric functions. Moreover, by the remark made after the proof of Theorem 1.4.8, we have

$$\lim_{x\to-\infty} \arctan x = -\frac{\pi}{2}, \quad \lim_{x\to+\infty} \arctan x = \frac{\pi}{2}.$$

Similar limits hold for other inverse trigonometric functions.
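As a numerical sanity check (an illustration added here, not a proof), the identity arcsin x + arccos x = π/2 and the two limits of arctan can be sampled with Python's standard `math` module. The sample points and the name `identity_gap` are arbitrary choices made for this sketch.

```python
import math

samples = (-1.0, -0.5, 0.0, 0.3, 1.0)

# arcsin x + arccos x should equal pi/2 at every point of [-1, 1].
for x in samples:
    total = math.asin(x) + math.acos(x)
    print(f"x = {x:5.2f}   arcsin x + arccos x = {total:.12f}")

# Largest deviation from pi/2 over the sample points (rounding error only).
identity_gap = max(abs(math.asin(x) + math.acos(x) - math.pi / 2)
                   for x in samples)

# arctan x approaches pi/2 as x -> +infinity and -pi/2 as x -> -infinity.
for x in (1e2, 1e4, 1e6):
    print(f"x = {x:.0e}   pi/2 - arctan(x) = {math.pi / 2 - math.atan(x):.3e}"
          f"   arctan(-x) + pi/2 = {math.atan(-x) + math.pi / 2:.3e}")
```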

Example 1.4.8. The continuity of the inverse sine function says that $\lim_{x\to 0} \arcsin x = \arcsin 0 = 0$. Moreover, by the composition rule, the variable x in the limit (1.3.23) may be substituted by the continuous function arcsin x to get $\lim_{x\to 0} \frac{x}{\arcsin x} = 1$. Taking reciprocal, we get $\lim_{x\to 0} \frac{\arcsin x}{x} = 1$.

Exercise 1.4.33. Prove the equalities.

1. $\arcsin(-x) = -\arcsin x$.

2. $\arccos(-x) = \pi - \arccos x$.

3. $\arctan(-x) = -\arctan x$.

4. $\arcsin(\cos x) = \frac{\pi}{2} - x$ for $0 \le x \le \pi$.

5. $\arctan x + \arctan\frac{1}{x} = \frac{\pi}{2}$ for $x > 0$.

6. $\cos(\arcsin x) = \sqrt{1 - x^2}$.

7. $\tan(\arcsin x) = \frac{x}{\sqrt{1 - x^2}}$.

8. $\tan(\arccos x) = \frac{\sqrt{1 - x^2}}{x}$.

Exercise 1.4.34. Compute the limits.

1. $\lim_{x\to -1^+} \frac{2\arcsin x + \pi}{x + 1}$.

2. $\lim_{x\to 1^-} \frac{(\arccos x)^2}{x - 1}$.

3. $\lim_{x\to 0} \frac{\arctan x}{x}$.

4. $\lim_{x\to 0} \frac{\tan(\arcsin x)}{\sin(\arctan x)}$.

5. $\lim_{x\to 0^+} (\sin x)^{\arcsin x}$.

6. $\lim_{x\to +\infty} x\left(\arctan x - \frac{\pi}{2}\right)$.

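By (1.3.27), the exponential function $\alpha^x$ based on a constant α > 0 is continuous. For α > 1, the function is strictly increasing, and

$$\lim_{x\to-\infty} \alpha^x = 0, \quad \lim_{x\to+\infty} \alpha^x = +\infty.$$

For 0 < α < 1, the function is strictly decreasing, and

$$\lim_{x\to-\infty} \alpha^x = +\infty, \quad \lim_{x\to+\infty} \alpha^x = 0.$$

Thus the exponential function

$$\alpha^x\colon (-\infty, \infty) \to (0, \infty), \quad 0 < \alpha \ne 1$$

is invertible. The inverse function

$$\log_\alpha x\colon (0, \infty) \to (-\infty, \infty), \quad 0 < \alpha \ne 1$$

is the logarithmic function, which is also continuous, strictly increasing for α > 1 and strictly decreasing for 0 < α < 1. Moreover,

$$\lim_{x\to 0^+} \log_\alpha x = \begin{cases} -\infty & \text{if } \alpha > 1 \\ +\infty & \text{if } 0 < \alpha < 1 \end{cases}, \tag{1.4.8}$$

$$\lim_{x\to +\infty} \log_\alpha x = \begin{cases} +\infty & \text{if } \alpha > 1 \\ -\infty & \text{if } 0 < \alpha < 1 \end{cases}. \tag{1.4.9}$$

The following equalities for the exponential function

$$\alpha^0 = 1, \quad \alpha^1 = \alpha, \quad \alpha^x \alpha^y = \alpha^{x+y}, \quad (\alpha^x)^y = \alpha^{xy}$$

imply the following equalities for the logarithmic functions

$$\log_\alpha 1 = 0, \quad \log_\alpha \alpha = 1, \quad \log_\alpha(xy) = \log_\alpha x + \log_\alpha y,$$

$$\log_\alpha x^y = y \log_\alpha x, \quad \log_\alpha x = \frac{\log x}{\log \alpha}.$$

The logarithmic function with the special base α = e is called the natural logarithmic function and is denoted by log or ln. In particular, we have log e = 1.

Next we derive some important limits involving the exponential and the logarithmic functions.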


Taking $\alpha = \frac{1}{e}$ in the limit (1.3.10), we have $\lim_{x\to+\infty} \frac{x^\beta}{e^x} = 0$. By $\lim_{x\to+\infty} \log x = +\infty$ and the composition rule, we may substitute x by log x and get

$$\lim_{x\to+\infty} \frac{(\log x)^\beta}{x} = 0. \tag{1.4.10}$$

This means that at +∞, the logarithmic function goes to infinity much slower than x. Even raising the power of the logarithmic function does not help the speed of growth.

Substituting x by $\frac{1}{x}$ in the limit (1.3.30), we get $\lim_{x\to 0} (1 + x)^{\frac{1}{x}} = e$. By the continuity of log, we have $\lim_{x\to 0} \log\left((1 + x)^{\frac{1}{x}}\right) = \log \lim_{x\to 0} (1 + x)^{\frac{1}{x}} = \log e = 1$. Then by the property of log, the limit is

$$\lim_{x\to 0} \frac{\log(1 + x)}{x} = 1. \tag{1.4.11}$$

The continuity of $e^x$ tells us $\lim_{x\to 0} (e^x - 1) = 0$. Substituting x in (1.4.11) by $e^x - 1$, we get $\lim_{x\to 0} \frac{x}{e^x - 1} = \lim_{x\to 0} \frac{\log(1 + e^x - 1)}{e^x - 1} = 1$. Taking reciprocal, we have

$$\lim_{x\to 0} \frac{e^x - 1}{x} = 1. \tag{1.4.12}$$

The continuity of log tells us $\lim_{x\to 0} \log(1 + x) = 0$. Substituting x in (1.4.12) by α log(1 + x) and using $e^{\alpha \log(1 + x)} = (1 + x)^\alpha$, we get $\lim_{x\to 0} \frac{(1 + x)^\alpha - 1}{\log(1 + x)} = \alpha$. Multiplying the limit with (1.4.11), we get

$$\lim_{x\to 0} \frac{(1 + x)^\alpha - 1}{x} = \alpha. \tag{1.4.13}$$
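The three limits (1.4.11), (1.4.12), (1.4.13) can be observed numerically. The sketch below is an added illustration under stated assumptions: the exponent `alpha = 2.5` and the sample values of x are arbitrary, and `math.log1p` and `math.expm1` are used only because they compute log(1 + x) and e^x − 1 accurately for small x.

```python
import math

alpha = 2.5  # sample exponent for (1.4.13); chosen arbitrarily


def quotients(x):
    """Return the three quotients from (1.4.11)-(1.4.13) at a nonzero x."""
    return (
        math.log1p(x) / x,            # log(1 + x) / x          -> 1
        math.expm1(x) / x,            # (e^x - 1) / x           -> 1
        ((1 + x) ** alpha - 1) / x,   # ((1 + x)^alpha - 1) / x -> alpha
    )


for x in (1e-1, 1e-3, 1e-5, -1e-5):
    q_log, q_exp, q_pow = quotients(x)
    print(f"x = {x:+.0e}   {q_log:.8f}   {q_exp:.8f}   {q_pow:.8f}")
```

The first two columns approach 1 and the third approaches the chosen exponent as x shrinks, from either side of 0.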

Example 1.4.9. For β > 0, taking the $\frac{1}{\beta}$ power of the limit (1.4.10) gives us

$$\lim_{x\to+\infty} \frac{\log x}{x^\beta} = 0 \text{ for } \beta > 0. \tag{1.4.14}$$

Then substituting x by $\frac{1}{x}$ and using $\log\frac{1}{x} = -\log x$, we get

$$\lim_{x\to 0^+} x^\beta \log x = 0 \text{ for } \beta > 0. \tag{1.4.15}$$

The limits (1.4.10), (1.4.14), (1.4.15) can be extended to the logarithms based on any 0 < α ≠ 1 by using $\log_\alpha x = \frac{\log x}{\log \alpha}$.

Example 1.4.10. Substituting x by x log α in (1.4.12) and using $e^{x \log \alpha} = \alpha^x$, we get a more general formula

$$\lim_{x\to 0} \frac{\alpha^x - 1}{x} = \log \alpha. \tag{1.4.16}$$


Example 1.4.11. We already know $\lim_{x\to 0^+} x^x = 1$. To further find out how close $x^x$ is to 1, we let $y = x^x$. Then x log x = log y and $\frac{x^x - 1}{x \log x} = \frac{y - 1}{\log y}$. As x → 0+, we have y → 1, and $\lim_{y\to 1} \frac{y - 1}{\log y} = \lim_{z\to 0} \frac{z}{\log(1 + z)} = 1$ by (1.4.11). Thus we conclude that

$$x^x = 1 + x \log x + o(x \log x),$$

where x log x is an infinitesimal at 0+.
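The estimate can be checked numerically: the ratio of $x^x - 1$ to x log x should approach 1 as x → 0+. This is an added sketch, with the function name `ratio` and the sample points chosen for illustration; `math.expm1` is used to evaluate $e^{x \log x} - 1$ without cancellation error.

```python
import math


def ratio(x):
    """(x^x - 1) / (x log x) for x in (0, 1), expected to approach 1."""
    t = x * math.log(x)        # t = x log x, which tends to 0 as x -> 0+
    return math.expm1(t) / t   # x^x - 1 = e^(x log x) - 1


for x in (1e-1, 1e-2, 1e-4, 1e-8):
    print(f"x = {x:.0e}   x^x = {x ** x:.10f}   ratio = {ratio(x):.10f}")
```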

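Exercise 1.4.35. Compute the limits.

1. $\lim_{x\to 1} \frac{\log x}{\sqrt{x} - 1}$.

2. $\lim_{x\to 0^+} \frac{\log x}{x}$.

3. $\lim_{x\to +\infty} \frac{\log(\log x)}{x}$.

4. $\lim_{x\to 1^+} (x - 1) \log(\log x)$.

5. $\lim_{x\to 2} \frac{\log x - \log 2}{x - 2}$.

6. $\lim_{x\to +\infty} \frac{\log x}{\sqrt{x} + \log x}$.

7. $\lim_{x\to +\infty} \frac{\log(2 + 3x)}{\log(2 + 7x)}$.

8. $\lim_{x\to 0} \frac{\log(2 + 3x)}{\log(2 + 7x)}$.

9. $\lim_{x\to -\frac{2}{7}^+} \frac{\log(2 + 3x)}{\log(2 + 7x)}$.

10. $\lim_{x\to 0} \frac{\log(1 + \sqrt{x})}{\log(1 + x)}$.

11. $\lim_{x\to +\infty} \frac{\log(1 + \sqrt{x})}{\log(1 + x)}$.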

12. $\lim_{x\to +\infty} \frac{\log_3(2 + x^3)}{\log_4(3 + x^4)}$.

13. $\lim_{x\to +\infty} \frac{x}{\sqrt{\sqrt{x} + \log x}}$.

14. $\lim_{x\to 0} \frac{\log_\alpha(1 + x)}{\log_\alpha(1 + 2x - x^2)}$.

15. $\lim_{x\to 0} \frac{(1 + x + x^2)^{\frac{7}{9}} - 1}{x}$.

16. $\lim_{x\to 0} \frac{(1 + x + x^2)^{\frac{7}{9}} - (1 + x)^{\frac{7}{9}}}{x^2}$.

17. $\lim_{x\to 0^+} \frac{x^{x^2} - 1}{x \log x}$.

18. $\lim_{x\to +\infty} x(\sqrt[x]{x} - 1)$.

19. $\lim_{x\to +\infty} ((x + 1)^\alpha - x^\alpha)$.

Exercise 1.4.36. Use Exercise 1.1.36 and the continuity of the logarithmic function to prove that if $x_n > 0$ and $\lim_{n\to\infty} x_n = l$, then $\lim_{n\to\infty} \sqrt[n]{x_1 x_2 \cdots x_n} = l$. What about the case $\lim_{n\to\infty} x_n = +\infty$?

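Exercise 1.4.37. Prove that if $x_n > 0$ and $\lim_{n\to\infty} \frac{x_n}{x_{n-1}} = l$, then $\lim_{n\to\infty} \sqrt[n]{x_n} = l$. Use the conclusion to compute the following limits

$$\lim_{n\to\infty} \frac{\sqrt[n]{n!}}{n}, \quad \lim_{n\to\infty} \frac{\sqrt[n]{(2n)!}}{n^2}, \quad \lim_{n\to\infty} \sqrt[n]{\frac{(2n)!}{(n!)^2}}.$$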

Exercise 1.4.38. Use Exercise 1.2.18 to derive $\frac{1}{1 + n} < \log\left(1 + \frac{1}{n}\right) < \frac{1}{n}$. Then use the inequality to show that $x_n = 1 + \frac{1}{2} + \cdots + \frac{1}{n} - \log n$ is strictly decreasing and $y_n = 1 + \frac{1}{2} + \cdots + \frac{1}{n} - \log(n + 1)$ is strictly increasing. Moreover, prove that both $x_n$ and $y_n$ converge to the same limit, called the Euler¹¹-Mascheroni¹² constant:

$$\lim_{n\to\infty} \left(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} - \log n\right) = 0.5772156649015328\cdots. \tag{1.4.17}$$

A vast extension of the exercise appears in Exercise 4.1.61.
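The behaviour described in Exercise 1.4.38 can be observed numerically before it is proved. The following sketch is an added illustration, with the function names `x_n` and `y_n` chosen to match the exercise: it tabulates both sequences, which should squeeze the constant in (1.4.17) from above and below.

```python
import math


def x_n(n):
    """x_n = 1 + 1/2 + ... + 1/n - log n."""
    return sum(1.0 / k for k in range(1, n + 1)) - math.log(n)


def y_n(n):
    """y_n = 1 + 1/2 + ... + 1/n - log(n + 1)."""
    return sum(1.0 / k for k in range(1, n + 1)) - math.log(n + 1)


for n in (1, 10, 100, 1000, 10000):
    print(f"n = {n:5d}   y_n = {y_n(n):.6f}   x_n = {x_n(n):.6f}")
```

Since $x_n - y_n = \log\left(1 + \frac{1}{n}\right)$, the gap between the two columns shrinks roughly like $\frac{1}{n}$, so the convergence is slow.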

1.4.7 Additional Exercise

Additive and Multiplicative Functions

Exercise 1.4.39. Suppose f(x) is a continuous function on R satisfying f(x + y) = f(x) + f(y).

1. Prove that f(nx) = nf(x) for integers n.

2. Prove that f(rx) = rf(x) for rational numbers r.

3. Prove that f(x) = ax for some constant a.

Exercise 1.4.40. Suppose f(x) is a continuous function on R satisfying f(x + y) = f(x)f(y). Prove that $f(x) = a^x$ for some constant a > 0.

Cauchy Criterion for Continuity

Exercise 1.4.41. Prove that a function f(x) is continuous at a if and only if for any ε > 0, there is δ > 0, such that

|x − a| < δ, |y − a| < δ ⟹ |f(x) − f(y)| < ε.

Exercise 1.4.42. Prove that if a function f(x) is not continuous at a, then one of the following will happen.

1. There is ε > 0, such that for any δ > 0, there are a − δ < x < a and a < y < a + δ, such that |f(x) − f(a)| ≥ ε and |f(y) − f(a)| ≥ ε.

2. There is ε > 0, such that for any δ > 0, there are a − δ < x < a and a < y < a + δ, such that |f(x) − f(y)| ≥ ε.

Note that x and y are on different sides of a.

One Side Invertible Monotone Function

Let f(x) be an increasing but not necessarily continuous function on [a, b]. By Exercise 1.4.10, the function has only countably many discontinuities. A function g(x) on [f(a), f(b)] is a left inverse of f(x) if g(f(x)) = x, and is a right inverse if f(g(y)) = y.

Exercise 1.4.43. Prove that if f(x) has a left inverse g(y), then f(x) is strictly increasing, and $g(f(x^-)) = g(f(x^+)) = x$ for any x ∈ [a, b].

¹¹ Leonhard Euler, born 1707 in Basel (Switzerland), died 1783 in St. Petersburg (Russia).

¹² Lorenzo Mascheroni, born 1750 in Lombardo-Veneto (now Italy), died 1800 in Paris (France).


Exercise 1.4.44. Suppose f(x) is strictly increasing and has only one discontinuity at c ∈ (a, b). Prove that f(x) has a unique left inverse g(y). Moreover, g(y) is strictly increasing on $[f(a), f(c^-)]$ and $[f(c^+), f(b)]$, and g(y) = c on $[f(c^-), f(c^+)]$.

Exercise 1.4.45. Suppose f(x) is strictly increasing. Prove that for any y ∈ [f(a), f(b)], there is a unique x ∈ [a, b] satisfying $f(x^-) \le y \le f(x^+)$. Moreover, for such unique x, the definition g(y) = x gives the unique left inverse of f(x).

Exercise 1.4.46. Prove that f(x) has a right inverse if and only if f(x) is continuous. Moreover, the right inverse must be strictly increasing, and the right inverse is unique if and only if f(x) is strictly increasing.


Chapter 2

Differentiation


2.1 Approximation and Differentiation

Precalculus mathematics deals with static quantities, which always have specific and fixed values. For example, the angle in an equilateral triangle is 60 degrees. The area of the rectangle with width 3 and height 2 is 6.

Differentiation is the mathematical tool for studying changing quantities. Such quantities often depend on other quantities. The dependence is often expressed as functions. For example, if two angles in a triangle are α and β degrees, then the third angle is γ = 180 − α − β degrees. Moreover, the area of a rectangle depends on the width and the height (and is given by the product function).

Of course, functions may be evaluated to get static values. For example, for width = 3 and height = 2, we get area = 6. Such substitutions simply fix the changing quantities and cannot answer the following questions.

1. Does the area increase when the width and the height are increasing?

2. What is the biggest area of a rectangle with perimeter 10?

To answer such questions, we have to think of the whole function instead of specific evaluation. We have to study how the function changes. We have to consider the whole evolving process instead of specific moments of the process.

The changes of general functions may be described in terms of approximations by simple functions. The simplest and the most useful function for the approximation purpose is the linear function. The linear approximation is called the differential and the key coefficient in the linear approximation is called the derivative.

2.1.1 Approximation

Let us start with some common sense. Suppose P(A) is a problem about an object A. To solve the problem, we may consider a simpler object B that closely approximates A. Because B is simpler, P(B) is easier to solve. Because B is close to A, the (easily obtained) answer to P(B) is also pretty much the answer to P(A).

Example 2.1.1. To find the distance from Hong Kong to Shanghai, we take a map and use the ruler to measure the distance between the two cities and then multiply by the scaling ratio of the map. In this process, P is the distance, A is the cities Hong Kong and Shanghai, and B is the two dots representing the two cities on a map.

The map is only an approximation of the actual world. Of course, the bigger the map, the more accurate the measurement is. A better measurement can be obtained by a sphere shaped map (the globe found in stationery stores) because the sphere is a better approximation of the world than the flat plane (some ancient people may disagree, though).

On the other hand, how accurate we wish the answer to be depends on our purpose. If our purpose is only to estimate the time needed for flying from Hong Kong to Shanghai, a poster sized map is more than sufficient.


Example 2.1.2. Suppose we want to count how much money there is in a big bag of one dollar coins. We may weigh the whole bag, weigh one coin, and then divide the two numbers. In this process, A is the collection of real coins, with slight variation in size and weight, and B is the imaginary ideal collection in which each coin is identical to the one we choose to weigh. If the big bag is not too big (say two kilograms), the error would be small enough and the answer would be quite reliable.

Exercise 2.1.1. Think again how you solve problems in everyday life. Do you actually solve the problem as is? Or is what you solve just an approximation of the problem?

Many real world problems can often be mathematically interpreted as quantities related by complicated functions. To solve the problem, we may try to approximate the complicated functions by simple ones and solve the same problem for the simple functions.

What are the simple functions? Although the answer could be rather subjective, most people would agree that the following functions are listed from simple to complicated.

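type | one variable examples | two variable examples
zero | $0$ | $0$
constant | $-1$, $-3$ | $-1$, $-3$
linear | $3 + x$, $4 - 5x$ | $3 + x + 2y$, $5x - 7y$
quadratic | $1 + x^2$, $-4 + 5x - 2x^2$ | $xy$, $x^2 + xy - 3y^2 + y + 1$
cubic | $2x - 2x^3$ | $2x^2 - 2y^3$, $x^2 y + xy^2$
rational | $\frac{2 + x^2}{x(4 - x)}$, $\frac{1}{1 + x}$ | $\frac{1}{x + y}$, $\frac{x + xy + 3y^2}{x(y - x)}$
algebraic | $\sqrt{x}$, $(1 + x^{1/2})^{1/3}$ | $\sqrt{xy}$, $(x^3 - 3y^2)^{1/3}$
transcendental | $\sin x$, $e^x + \cos x$ | $\sin xy$, $e^{x+y} + \cos(x + y)$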

What do we mean by approximating a function with a simple (class of) functions? Consider measuring a certain length (say the height of a person) by a ruler with only centimeters. We expect to get an approximate reading of the length with the accuracy within millimeters, or significantly smaller than the base unit of 1cm for the ruler:

|actual length − reading from ruler| ≤ ε(1cm).

Similarly, approximating a function f(x) near $x_0$ by a function p(x) of some class would mean the difference |f(x) − p(x)| is significantly smaller than the “base unit” for the class. Specifically, for any ε > 0, there is δ > 0, such that

$$|x - x_0| < \delta \implies |f(x) - p(x)| \le \varepsilon(\text{unit}). \tag{2.1.1}$$

The “base unit” for the class of constant functions is 1. Thus the approximation of a function f(x) by a constant function p(x) = a near $x_0$ means that for any ε > 0, there is δ > 0, such that

$$|x - x_0| < \delta \implies |f(x) - a| \le \varepsilon. \tag{2.1.2}$$


The condition can be split into two parts. First, for $x = x_0$, we get $|f(x_0) - a| \le \varepsilon$ for any ε > 0. This means exactly $a = f(x_0)$. Second, for $x \ne x_0$, we get

$$0 < |x - x_0| < \delta \implies |f(x) - a| \le \varepsilon.$$

This means exactly $\lim_{x\to x_0} f(x) = a$. Combined with $a = f(x_0)$, we conclude that a function f(x) is approximated by a constant near $x_0$ if and only if it is continuous at $x_0$. Moreover, the approximating constant is $f(x_0)$.

The approximation by constant functions, which happens for continuous functions only, is good enough for the problem of the sign of the function.

Example 2.1.3. Consider the positivity of functions. The cubic function $f(x) = x^3 - 5x + 2$ is approximated by the constant f(3) = 14 near 3. Since the constant function is positive, the cubic function is also positive near 3.

Note that the farther away from 3, the less certain we are about the positivity for the cubic function. The reason is that the approximation becomes worse farther away from 3. As a matter of fact, if we move away by distance 1 (say at x = 2), the positivity is lost.
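A few sample evaluations make the remark concrete. The short sketch below is an added illustration (the sample points are arbitrary): it evaluates the cubic at points moving from 3 toward 2 and reports whether the value is still positive.

```python
def f(x):
    """The cubic from Example 2.1.3."""
    return x**3 - 5 * x + 2


print(f(3))  # the approximating constant at x0 = 3

# Move away from 3 toward 2 and watch the sign of f(x).
for x in (3.5, 3.0, 2.5, 2.1, 2.0):
    print(f"x = {x:.1f}   f(x) = {f(x):+.3f}   positive: {f(x) > 0}")
```

The values stay positive near 3, while f(2) = 0, so the positivity guaranteed by the constant approximation holds only on a sufficiently small interval around 3.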

Exercise 2.1.2. Rigorously prove that if f(x) is continuous at $x_0$ and $f(x_0) > 0$, then there is δ > 0, such that f(x) > 0 for any $x \in (x_0 - \delta, x_0 + \delta)$.

2.1.2 Differentiation

The approximation by constant functions is too crude for most problems. As the next simplest, the linear functions A + Bx often need to be used. For the convenience of discussion, we introduce

$$\Delta x = x - x_0$$

(Δ is the Greek equivalent of D, here used to stand for Difference) and rewrite

$$A + B(x_0 + \Delta x) = (A + Bx_0) + B\Delta x = a + b\Delta x.$$

What is the base unit of a + bΔx as x approaches $x_0$? The base unit of the constant term a is 1. The base unit of the difference term bΔx is Δx, which is very small compared with the unit 1. Therefore the base unit for the linear function a + bΔx is Δx. The discussion may be compared with the expression a m + b cm (a meters and b centimeters). Since 1cm is much smaller than 1m, the base unit for a m + b cm is 1cm.

The approximation of a function f(x) by a linear function a + bΔx at $x_0$ means the following.

Definition 2.1.1. A function $f(x)$ is differentiable at $x_0$ if there is a linear function $a + b\Delta x$, such that for any $\varepsilon > 0$, there is $\delta > 0$, such that

$$|\Delta x| = |x - x_0| < \delta \implies |f(x) - a - b\Delta x| \le \varepsilon|\Delta x|. \tag{2.1.3}$$

Example 2.1.4. To find the linear approximation of $x^2$ at $x_0 = 1$, we rewrite the function in terms of $\Delta x = x - 1$ near 1.

$$x^2 = (1 + \Delta x)^2 = 1 + 2\Delta x + \Delta x^2 = 1 + 2\Delta x + o(\Delta x).$$

Therefore the linear approximation is $1 + 2\Delta x$.
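The condition (2.1.3) says the error of the linear approximation is small even when measured in units of $\Delta x$. A minimal Python sketch of this for Example 2.1.4 follows; the helper name `remainder_ratio` and the chosen step sizes are assumptions made for illustration.

```python
def remainder_ratio(dx):
    """|f(x) - (1 + 2*dx)| / |dx| for f(x) = x**2 at x0 = 1, where x = 1 + dx."""
    x = 1 + dx
    return abs(x**2 - (1 + 2*dx)) / abs(dx)

# The ratio equals |dx| up to rounding, so it tends to 0 as dx does.
ratios = [remainder_ratio(10.0**(-k)) for k in range(1, 6)]
print(ratios)
```

Since the remainder is exactly $\Delta x^2$, the printed ratios shrink in proportion to the step, which is what $o(\Delta x)$ expresses.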


Exercise 2.1.3. Show that $x^2$ is linearly approximated by $x_0^2 + 2x_0\Delta x$ at $x_0$. What about the linear approximation of $x^3$?

For $x = x_0$, the condition (2.1.3) tells us the constant coefficient $a = f(x_0)$. Let

$$\Delta f = f(x) - a = f(x) - f(x_0)$$

be the change of the function caused by the change $\Delta x$ of the variable. Then the right side of the implication (2.1.3) becomes $|\Delta f - b\Delta x| \le \varepsilon|\Delta x|$, so that the condition becomes

$$\Delta f = b\Delta x + o(\Delta x). \tag{2.1.4}$$

In other words, the scaling $b\Delta x$ of the change of variable is the linear approximation of the change of the function. The viewpoint is captured in the notation of the differential

$$df = b\,dx \tag{2.1.5}$$

of the function at $x_0$.

Note that the symbols $df$ and $dx$ have not yet been specified as numerical quantities. Thus $b\,dx$ should be, at least for the moment, considered as an integrated notation instead of the product of two quantities. On the other hand, the notation is motivated by $b\Delta x$, which is indeed a product of two numbers. So it is allowed to add two differentials and to multiply differentials by numbers. In more advanced mathematics, the differential symbols will indeed be defined as quantities in some linear approximation spaces (called tangent spaces). However, one has to be careful in multiplying differentials together because this is not a valid operation within linear spaces. Finally, dividing differentials is not allowed.

2.1.3 Derivative

Taking $x = x_0$ in (2.1.3) tells us $a = f(x_0)$. Taking $x \ne x_0$, the condition (2.1.3) becomes

$$0 < |\Delta x| = |x - x_0| < \delta \implies \left|\frac{f(x) - f(x_0)}{x - x_0} - b\right| \le \varepsilon. \tag{2.1.6}$$

This tells us how to compute the coefficient $b$.

Definition 2.1.2. The derivative of a function $f(x)$ at $x_0$ is

$$f'(x_0) = \frac{df}{dx} = \lim_{x \to x_0}\frac{f(x) - f(x_0)}{x - x_0} = \lim_{\Delta x \to 0}\frac{\Delta f}{\Delta x} = \lim_{\Delta x \to 0}\frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x}.$$

We already saw that differentiability implies the existence of the derivative. Conversely, if the derivative exists, then we have the implication (2.1.6), which is the same as (2.1.3) with $a = f(x_0)$ and $0 < |x - x_0| < \delta$. Since (2.1.3) always holds with $a = f(x_0)$ and $x = x_0$, we conclude that the condition for differentiability holds with $a = f(x_0)$ and $|x - x_0| < \delta$.

Proposition 2.1.3. A function $f(x)$ is differentiable at $x_0$ if and only if it has the derivative at $x_0$. Moreover, the function is continuous at $x_0$ and its linear approximation is $f(x_0) + f'(x_0)\Delta x$.


[Figure: graph of f(x) and its linear approximation at x0. Labels: x0, x, f(x0), f(x), ∆x, ∆y; slope of the linear approximation = f′(x0); slope = ∆y/∆x.]

Figure 2.1: linear approximation and derivative of f(x) at x0

Proof. It remains to prove the continuity. Taking $\varepsilon = 1$ in the definition of differentiability, we find $\delta > 0$, such that

$$|\Delta x| = |x - x_0| < \delta \implies |f(x) - f(x_0) - b\Delta x| \le |\Delta x| \implies |f(x) - f(x_0)| \le (|b| + 1)|\Delta x|.$$

Then for any $\varepsilon > 0$, we get

$$|x - x_0| < \min\left\{\delta, \frac{\varepsilon}{|b| + 1}\right\} \implies |f(x) - f(x_0)| \le (|b| + 1)|x - x_0| \le \varepsilon.$$

This shows $f(x)$ is continuous at $x_0$.

We emphasize here that although the existence of linear approximation is equivalent to the existence of derivative, the two play different roles. Linear approximation is the motivation and the concept. Derivative is the computation of the linear approximation and is derived from it. Therefore linear approximation is much more important in understanding the essence of calculus. As a matter of fact, for multivariable functions, linear approximations may be similarly computed by partial derivatives. However, the existence of partial derivatives does not imply the existence of linear approximation.

Mathematical concepts are always derived from common sense. The formulas for computing the concepts are only obtained after mathematically analysing the common sense. Never equate the formula with the concept itself!

Example 2.1.5. The linear approximation to a linear function $f(x) = A + Bx$ must be the linear function itself. Moreover, since $\Delta f = f(x) - f(x_0) = B(x - x_0) = B\Delta x$, the derivative $f'(x_0) = B$ by comparing with (2.1.4).

Example 2.1.6. For a quadratic function $f(x) = A + Bx + Cx^2$, we have

$$\Delta f = f(x_0 + \Delta x) - f(x_0) = (B + 2Cx_0)\Delta x + C\Delta x^2 = (B + 2Cx_0)\Delta x + o(\Delta x).$$

By comparing with (2.1.4), the function is differentiable at $x_0$, with the differential $df = (B + 2Cx_0)\,dx$ at $x_0$. Usually we write $df = (B + 2Cx)\,dx$ for the differential at any $x$. We also have $f'(x) = B + 2Cx$.


Example 2.1.7. For $f(x) = \sin x$, we have

$$f'(0) = \lim_{x \to 0}\frac{f(x) - f(0)}{x} = \lim_{x \to 0}\frac{\sin x}{x} = 1.$$

Therefore the sine function is differentiable at 0, with $f(0) + f'(0)(x - 0) = x$ as the linear approximation.

Example 2.1.8. For $f(x) = |x|$, the limit $\lim_{x \to 0}\frac{f(x) - f(0)}{x} = \lim_{x \to 0}\frac{|x|}{x}$ diverges because the left and the right limits are different. Therefore the absolute value function is not differentiable at 0.
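The two one-sided behaviours in Example 2.1.8 are easy to see numerically. In this Python sketch the helper name `quotient` and the sample points are illustrative assumptions.

```python
def quotient(x):
    """Difference quotient (f(x) - f(0)) / x for f(x) = |x|."""
    return (abs(x) - abs(0)) / x

# Approaching 0 from the right and from the left.
right = [quotient(10.0**(-k)) for k in range(1, 6)]
left = [quotient(-(10.0**(-k))) for k in range(1, 6)]
print(right, left)
```

The quotient is constantly 1 on the right and constantly -1 on the left, so no single number $b$ can serve as the limit.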

Exercise 2.1.4. Find the differential and the derivative for the cubic function $f(x) = x^3$ by computing $\Delta f = f(x_0 + \Delta x) - f(x_0)$.

Exercise 2.1.5. Determine the differentiability and find the linear approximations if possible ($\alpha, \beta > 0$).

1. $\sqrt{|x|}$ at 0.

2. $\cos x$ at $a$.

3. $\sin^2 x$ at 0.

4. $e^x$ at $a$.

5. $\log(1 + x)$ at 0.

6. $|\sin \pi x^3|$ at 0 and 1.

7. $\begin{cases} x & \text{if } x \ne 0 \\ 1 & \text{if } x = 0 \end{cases}$ at 0.

8. $\begin{cases} -x^\alpha & \text{if } x \ge 0 \\ (-x)^\beta & \text{if } x < 0 \end{cases}$ at 0.

9. $\begin{cases} x^\alpha & \text{if } x \ge 0 \\ (-x)^\beta & \text{if } x < 0 \end{cases}$ at 0.

Exercise 2.1.6. Study the differentiability of the function $|x^3(x - 1)(x - 2)^2|$.

Exercise 2.1.7. Study the differentiability of Thomae's function in Example 1.4.2.

Exercise 2.1.8. Suppose $f(0) = 0$ and $f(x) \ge |x|$. Show that $f(x)$ is not differentiable at 0. What if $f(x) \le |x|$?

Exercise 2.1.9. Suppose $f(x)$ is differentiable on a bounded interval $(a - \varepsilon, b + \varepsilon)$. Suppose $f(x) = 0$ for infinitely many $x \in [a, b]$. Prove that there is $c \in [a, b]$, such that $f(c) = 0$ and $f'(c) = 0$.

Exercise 2.1.10. Suppose $f(x)$ is differentiable at $x_0$, with $f(x_0) \ne 0$. Find

$$\lim_{t \to 0}\left(\frac{f(x_0 + t)}{f(x_0)}\right)^{\frac{1}{t}}.$$

Exercise 2.1.11. Prove that $f(x)$ is differentiable at $x_0$ if and only if $f(x) = f(x_0) + (x - x_0)g(x)$ for a function $g(x)$ continuous at $x_0$.

2.1.4 Tangent Line and Rate of Change

Geometrically, the differentiation may be understood through the graphs of functions. The graph of a function is a curve, and the graph of a linear function is a straight line. Therefore the linear approximation is the approximation of a curve by a straight line. Such a straight line is the “best fit” among all the straight lines passing through $(x_0, f(x_0))$. To construct the best fit, we let $L_x$ be the line passing through the two points $(x_0, f(x_0))$ and $(x, f(x))$ on the graph of the function. Then the limit of the line $L_x$ as $x$ approaches


$x_0$ is the linear approximation. Since the slope of $L_x$ is $\frac{\Delta f}{\Delta x}$, the slope of the linear approximation is the limit of the quotient, or the derivative.

The differentiation can also be viewed as the rate of change. To understand how a quantity $y$ depends on another quantity $x$, it suffices to understand how the change in $x$ induces the change in $y$. To illustrate the idea, suppose $x$ is the age of a person and $y = f(x) = 100 + x^2$ is the wage the person earns at age $x$. Then we know the wage changes by $Df(x) = f(x + 1) - f(x) = 2x + 1$ when the person becomes one year older. Conversely, suppose we know the starting salary $f(20) = 500$ at the age 20 and we also know that every year the wage changes by $Df(x) = 2x + 1$. Then we may recover the actual salary $f(x)$ by adding up the changes. For example, the salary at age 25 is

$$\begin{aligned} f(25) &= f(20) + Df(20) + Df(21) + Df(22) + Df(23) + Df(24) \\ &= 500 + 41 + 43 + 45 + 47 + 49 \\ &= 725. \end{aligned}$$

In general, it is not difficult to recover the formula $f(x) = 100 + x^2$. The process of measuring changes is differentiation, and the process of recovering the whole by adding up the changes is integration.
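The recovery of the salary from its yearly changes can be sketched in Python as follows. The function names `Df` and `recover` are illustrative; only the data $f(20) = 500$ and $Df(x) = 2x + 1$ come from the text.

```python
def Df(x):
    """Yearly change of the wage: Df(x) = f(x + 1) - f(x) = 2x + 1."""
    return 2*x + 1

def recover(start_age, start_wage, age):
    """Recover f(age) by adding the yearly changes to the starting wage."""
    return start_wage + sum(Df(k) for k in range(start_age, age))

wage_25 = recover(20, 500, 25)
print(wage_25)
```

Adding up the changes for other ages gives values that agree with $100 + x^2$, which is the discrete analogue of integration described above.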

Exercise 2.1.12. Given the initial term $x_1$ of a sequence and the difference $x_{n+1} - x_n$ between consecutive terms, recover the sequence.

1. $x_1 = 1$ and $x_{n+1} - x_n = 1$.

2. $x_1 = 1$ and $x_{n+1} - x_n = n$.

3. $x_1 = 1$ and $x_{n+1} - x_n = (-1)^n$.

4. $x_1 = 1$ and $x_{n+1} - x_n = \frac{1}{n(n + 1)}$.

5. $x_1 = 1$ and $x_{n+1} - x_n = \frac{1}{\sqrt{n} + \sqrt{n + 1}}$.

In the example above, the variable is taken to be integers only. In reality, many variables are real numbers and the changes are continuous. In this case, we have to consider the change $\Delta f = f(x + \Delta x) - f(x)$ of a function when $x$ is changed by $\Delta x$. Since $\Delta f$ usually becomes smaller as $\Delta x$ gets smaller, it is more sensible to measure the rate of change $\frac{\Delta f}{\Delta x}$ for the period between $x$ and $x + \Delta x$. When $\Delta x$ is approaching zero, the limit $\lim_{\Delta x \to 0}\frac{\Delta f}{\Delta x}$ then becomes the rate of change at the instant of $x$. The process of computing the rate of change is differentiation, and the process of recovering the whole from the rate of change is integration.

Example 2.1.9. The height of a free falling object on the face of the earth is

$$h(t) = h_0 - \frac{1}{2}gt^2,$$


where $h_0$ is the initial height, $g$ is the gravitational constant, and $t$ is the time. As the time is changed from $t$ to $t + \Delta t$, the height is changed by

$$\Delta h = h(t + \Delta t) - h(t) = -gt\Delta t - \frac{1}{2}g\Delta t^2.$$

The average speed of the falling object during the period is

$$\frac{\Delta h}{\Delta t} = -gt - \frac{1}{2}g\Delta t.$$

As $\Delta t \to 0$, we get the speed

$$h'(t) = \lim_{\Delta t \to 0}\frac{\Delta h}{\Delta t} = -gt$$

at the time $t$. Note that the speed is getting bigger as time goes by, which is consistent with the observation.

As a matter of fact, the formula for the height $h(t)$ is obtained in the reverse way (i.e., by integration). From Newton¹'s second law of mechanics, we know the speed of the falling object is $v(t) = -gt$. To recover the height of the object is the same as finding a function $h(t)$ satisfying $h'(t) = v(t) = -gt$. Indeed $h(t) = h_0 - \frac{1}{2}gt^2$ is the function with the required derivative.
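The passage from average speed to instantaneous speed in Example 2.1.9 can be illustrated numerically. In the Python sketch below, the value `G = 9.8`, the initial height 100 and the time `t = 2.0` are assumptions chosen only for illustration.

```python
G = 9.8  # assumed value of g (m/s^2), for illustration only

def h(t, h0=100.0):
    """Height of the falling object, h(t) = h0 - g t^2 / 2."""
    return h0 - 0.5 * G * t**2

def average_speed(t, dt):
    """Average speed over the period from t to t + dt."""
    return (h(t + dt) - h(t)) / dt

t = 2.0
steps = [10.0**(-k) for k in range(1, 6)]
speeds = [average_speed(t, dt) for dt in steps]
print(speeds)
```

Each average speed differs from $-gt$ by the term $-\frac{1}{2}g\Delta t$, so the values approach $-gt$ as the period shrinks.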

Example 2.1.10. Let $f(t)$ be the amount of money in a savings account. The increase of the amount, due to the interest, can be measured by the rate of change, which at a particular time $t$ is proportional to the amount of money $f(t)$ at the time. In other words, we have $f'(t) = cf(t)$, where the constant $c$ is the interest rate. Now the question is what kind of function satisfies the (differential) equation.

To get some idea, let us consider the discrete case. Suppose the interest is paid out yearly instead of continuously. Then the amount at the $(t + 1)$-st year is $f(t + 1) = (1 + c)f(t)$. Thus for any natural number $t$ we get

$$f(t) = (1 + c)f(t - 1) = (1 + c)^2 f(t - 2) = \cdots = (1 + c)^t f(0),$$

which is an exponential function of $t$. If the interest payment is made monthly at the yearly rate of $c$, then we have $f\left(t + \frac{1}{12}\right) = \left(1 + \frac{c}{12}\right)f(t)$, from which we can similarly get $f(t) = \left(1 + \frac{c}{12}\right)^{12t} f(0)$. If the interest payment is made $n$ times a year, then we will get $f(t) = \left(1 + \frac{c}{n}\right)^{nt} f(0)$. As $n$ goes to infinity, we get the formula

$$f(t) = \lim_{n \to \infty}\left(1 + \frac{c}{n}\right)^{nt} f(0) = f(0)e^{ct}$$

for the account for which the interest is paid continuously. We will rigorously show later that the solutions to $f'(t) = cf(t)$ are indeed scalar multiples of $e^{ct}$.
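The convergence of the compounded balance to $f(0)e^{ct}$ can be observed numerically. In this Python sketch, the rate `c = 0.05`, the horizon `t = 10` and the function name `compound` are illustrative assumptions.

```python
import math

def compound(c, t, n, initial=1.0):
    """Balance after t years when interest at yearly rate c is paid n times a year."""
    return initial * (1 + c / n) ** (n * t)

c, t = 0.05, 10
balances = {n: compound(c, t, n) for n in (1, 12, 365, 10**6)}
continuous = math.exp(c * t)  # the limit f(0) e^{ct} with f(0) = 1
print(balances, continuous)
```

More frequent payments give larger balances, and the values approach the continuous limit from below.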

¹ Isaac Newton, born 1643 in Woolsthorpe (England), died 1727 in London (England). Newton gave the first explicit statement of the fundamental theorem of calculus. So for Newton, integration means to recover a function from its derivative.


2.1.5 Rules of Computation

Proposition 2.1.4. The sum, the product and the composition of differentiable functions are differentiable. Moreover,

$$\begin{aligned} (f(x) + g(x))' &= f'(x) + g'(x), \\ (f(x)g(x))' &= f'(x)g(x) + f(x)g'(x), \\ (g(f(x)))' &= g'(f(x))f'(x). \end{aligned}$$

The rules can also be written as

$$\frac{d(f + g)}{dx} = \frac{df}{dx} + \frac{dg}{dx}, \quad \frac{d(fg)}{dx} = \frac{df}{dx}g + f\frac{dg}{dx}, \quad \frac{dg}{dx} = \frac{dg}{df}\frac{df}{dx},$$

or in the differential form,

$$d(f + g) = df + dg, \quad d(fg) = g\,df + f\,dg, \quad d(g \circ f) = (g' \circ f)\,df.$$

The formula for the product is called the Leibniz² rule. The formula for the composition is called the chain rule.

A special case of the Leibniz rule is that for any constant c, we have

(cf)′ = c′f + cf ′ = 0f + cf ′ = cf ′.

The formula for the derivative of the sum can be explained through the linear approximation. Suppose the functions $f(x)$ and $g(x)$ are approximated by the linear functions

$$L(x) = f(x_0) + f'(x_0)\Delta x, \quad K(x) = g(x_0) + g'(x_0)\Delta x$$

near $x_0$. Then we expect $f(x) + g(x)$ to be approximated by the sum

$$L(x) + K(x) = (f(x_0) + g(x_0)) + (f'(x_0) + g'(x_0))\Delta x$$

of the linear functions. In particular, this implies the derivative $(f + g)'(x_0)$ is the coefficient $f'(x_0) + g'(x_0)$ of $\Delta x$.

Similarly, we expect $f(x)g(x)$ to be approximated by the product

$$L(x)K(x) = f(x_0)g(x_0) + (f'(x_0)g(x_0) + f(x_0)g'(x_0))\Delta x + f'(x_0)g'(x_0)\Delta x^2$$

of linear functions. Although $L(x)K(x)$ is not linear, it is further approximated by the linear function

$$M(x) = f(x_0)g(x_0) + (f'(x_0)g(x_0) + f(x_0)g'(x_0))\Delta x.$$

Therefore $f(x)g(x)$ is also approximated by the linear function $M(x)$. In particular, $(fg)'(x_0)$ is the coefficient $f'(x_0)g(x_0) + f(x_0)g'(x_0)$ of $\Delta x$.
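The claim that $M(x)$ approximates the product can be tested on a concrete pair of functions. The Python sketch below uses $f(x) = x^2$ and $g(x) = x^3$ at $x_0 = 1$; these functions and all names are choices made for illustration, and their derivatives at $x_0$ are entered by hand.

```python
def f(x):
    return x**2

def g(x):
    return x**3

x0 = 1.0
df, dg = 2 * x0, 3 * x0**2  # known derivatives f'(x0) and g'(x0)

def M(dx):
    """Linear function f(x0)g(x0) + (f'(x0)g(x0) + f(x0)g'(x0)) dx."""
    return f(x0) * g(x0) + (df * g(x0) + f(x0) * dg) * dx

def ratio(dx):
    """|f(x)g(x) - M(x)| / |dx| with x = x0 + dx."""
    x = x0 + dx
    return abs(f(x) * g(x) - M(dx)) / abs(dx)

ratios = [ratio(10.0**(-k)) for k in range(1, 6)]
print(ratios)
```

The ratios shrink roughly in proportion to the step, which is consistent with the remainder $fg - M$ being $o(\Delta x)$.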

² Gottfried Wilhelm von Leibniz, born 1646 in Leipzig, Saxony (now Germany), died 1716 in Hannover (now Germany).


For the composition $g(f(x))$, the functions $f(x)$ and $g(y)$ are approximated by the linear functions

$$L(x) = f(x_0) + f'(x_0)\Delta x, \quad K(y) = g(y_0) + g'(y_0)\Delta y$$

near $x_0$ and $y_0 = f(x_0)$. Then we expect $g(f(x))$ to be approximated by the composition

$$K(L(x)) = g(y_0) + g'(y_0)f'(x_0)\Delta x$$

of linear functions. In particular, this implies the derivative $(g \circ f)'(x_0)$ is the coefficient $g'(y_0)f'(x_0) = g'(f(x_0))f'(x_0)$ of $\Delta x$.

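Proof of Proposition 2.1.4. We prove the formulae by rigorously verifying the claims made above about the linear approximations.

Consider the sum first. For any $\varepsilon_1 > 0$, there are $\delta_1, \delta_2 > 0$, such that

$$|\Delta x| < \delta_1 \implies |f(x) - L(x)| \le \varepsilon_1|\Delta x|, \tag{2.1.7}$$

$$|\Delta x| < \delta_2 \implies |g(x) - K(x)| \le \varepsilon_1|\Delta x|. \tag{2.1.8}$$

This implies

$$|\Delta x| < \min\{\delta_1, \delta_2\} \implies |(f(x) + g(x)) - (L(x) + K(x))| \le 2\varepsilon_1|\Delta x|. \tag{2.1.9}$$

Thus for any $\varepsilon > 0$, we take $\varepsilon_1 = \frac{\varepsilon}{2}$ and find $\delta_1, \delta_2 > 0$, such that (2.1.7) and (2.1.8) hold. Then (2.1.9) holds and becomes

$$|\Delta x| < \min\{\delta_1, \delta_2\} \implies |(f(x) + g(x)) - (L(x) + K(x))| \le \varepsilon|\Delta x|.$$

This proves that $f(x) + g(x)$ is approximated by the linear function $L(x) + K(x)$.

Now consider the product. For $|\Delta x| < \min\{\delta_1, \delta_2\}$, we have

$$\begin{aligned} |f(x)g(x) - L(x)K(x)| &\le |f(x)g(x) - L(x)g(x)| + |L(x)g(x) - L(x)K(x)| \\ &\le \varepsilon_1|\Delta x||g(x)| + \varepsilon_1|L(x)||\Delta x| \\ &\le \varepsilon_1(|g(x)| + |L(x)|)|\Delta x|. \end{aligned}$$

Then

$$\begin{aligned} |f(x)g(x) - M(x)| &\le |f(x)g(x) - L(x)K(x)| + |f'(x_0)g'(x_0)\Delta x^2| \\ &\le \left[\varepsilon_1(|g(x)| + |L(x)|) + |f'(x_0)g'(x_0)\Delta x|\right]|\Delta x|. \end{aligned}$$

Since $|g(x)|$ and $|L(x)|$ are continuous at $x_0$, they are bounded near $x_0$ by Proposition 1.3.6. Therefore for any $\varepsilon > 0$, it is not difficult to find $\delta_3 > 0$ and $\varepsilon_1 > 0$, such that

$$|\Delta x| < \delta_3 \implies \varepsilon_1(|g(x)| + |L(x)|) + |f'(x_0)g'(x_0)\Delta x| \le \varepsilon.$$

Next for this $\varepsilon_1$, we may find $\delta_1, \delta_2 > 0$, such that (2.1.7) and (2.1.8) hold. Then we have

$$|\Delta x| < \min\{\delta_1, \delta_2, \delta_3\} \implies |f(x)g(x) - M(x)| \le \varepsilon|\Delta x|.$$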


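This proves that $f(x)g(x)$ is approximated by the linear function $M(x)$.

Finally consider the composition. For any $\varepsilon_1, \varepsilon_2 > 0$, there are $\delta_1, \delta_2 > 0$, such that

$$|\Delta x| = |x - x_0| < \delta_1 \implies |f(x) - L(x)| \le \varepsilon_1|\Delta x|, \tag{2.1.10}$$

$$|\Delta y| = |y - y_0| < \delta_2 \implies |g(y) - K(y)| \le \varepsilon_2|\Delta y|. \tag{2.1.11}$$

Then for $y = f(x)$, we have

$$\begin{aligned} |\Delta x| < \delta_1 \implies |\Delta y| = |f(x) - f(x_0)| &\le |f(x) - L(x)| + |f'(x_0)\Delta x| \\ &\le (\varepsilon_1 + |f'(x_0)|)|\Delta x| < (\varepsilon_1 + |f'(x_0)|)\delta_1, \end{aligned} \tag{2.1.12}$$

and

$$\begin{aligned} |\Delta x| < \delta_1,\ |\Delta y| < \delta_2 \implies |g(f(x)) - K(L(x))| &\le |g(f(x)) - K(f(x))| + |K(f(x)) - K(L(x))| \\ &= |g(y) - K(y)| + |g'(y_0)||f(x) - L(x)| \\ &\le \varepsilon_2|\Delta y| + |g'(y_0)|\varepsilon_1|\Delta x| \\ &\le \left[\varepsilon_2(\varepsilon_1 + |f'(x_0)|) + \varepsilon_1|g'(y_0)|\right]|\Delta x|. \end{aligned} \tag{2.1.13}$$

Suppose for any $\varepsilon > 0$, we can find suitable $\delta_1, \delta_2, \varepsilon_1, \varepsilon_2$, such that

$$(\varepsilon_1 + |f'(x_0)|)\delta_1 \le \delta_2, \tag{2.1.14}$$

$$\varepsilon_2(\varepsilon_1 + |f'(x_0)|) + \varepsilon_1|g'(y_0)| \le \varepsilon, \tag{2.1.15}$$

and (2.1.10), (2.1.11) hold. Then (2.1.12) tells us $|\Delta x| < \delta_1$ implies $|\Delta y| < \delta_2$, and (2.1.13) becomes

$$|\Delta x| < \delta_1 \implies |g(f(x)) - K(L(x))| \le \left[\varepsilon_2(\varepsilon_1 + |f'(x_0)|) + \varepsilon_1|g'(y_0)|\right]|\Delta x| \le \varepsilon|\Delta x|.$$

This proves that $g(f(x))$ is approximated by the linear function $K(L(x))$.

It remains to find $\delta_1, \delta_2, \varepsilon_1, \varepsilon_2$ such that (2.1.10), (2.1.11), (2.1.14) and (2.1.15) hold. For any $\varepsilon > 0$, we first find $\varepsilon_1, \varepsilon_2 > 0$ satisfying (2.1.15). For this $\varepsilon_2 > 0$, we find $\delta_2 > 0$, such that the implication (2.1.11) holds. Then for the $\varepsilon_1 > 0$ we already found, it is easy to find $\delta_1 > 0$ such that the implication (2.1.10) holds and (2.1.14) is also satisfied.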

Exercise 2.1.13. A function is odd if $f(-x) = -f(x)$. It is even if $f(-x) = f(x)$. A function is periodic with period $p$ if $f(x + p) = f(x)$. Prove that the derivatives of odd, even, periodic functions are respectively even, odd, periodic.

2.1.6 Basic Example

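We establish the following derivatives.

$$\begin{aligned} (x^\alpha)' &= \alpha x^{\alpha - 1}, \\ (\sin x)' &= \cos x, \\ (\cos x)' &= -\sin x, \\ (\tan x)' &= \sec^2 x, \\ (e^x)' &= e^x, \\ (\alpha^x)' &= \alpha^x \log\alpha. \end{aligned}$$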


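By the limit (1.4.13), for $x > 0$ we have

$$\frac{dx^\alpha}{dx} = \lim_{\Delta x \to 0}\frac{(x + \Delta x)^\alpha - x^\alpha}{\Delta x} = \lim_{u \to 0}\frac{(x + xu)^\alpha - x^\alpha}{xu} = \lim_{u \to 0} x^{\alpha - 1}\frac{(1 + u)^\alpha - 1}{u} = \alpha x^{\alpha - 1}.$$

For $x < 0$ and integer $\alpha$, let $y = -x$. By the chain rule,

$$\frac{dx^\alpha}{dx} = \frac{d((-y)^\alpha)}{dy}\frac{dy}{dx} = (-1)^\alpha\frac{d(y^\alpha)}{dy}\frac{d(-x)}{dx} = (-1)^\alpha\alpha y^{\alpha - 1}(-1) = \alpha x^{\alpha - 1}.$$

For natural number $\alpha$, we also have

$$\left.\frac{dx^\alpha}{dx}\right|_{x=0} = \lim_{\Delta x \to 0}\frac{(0 + \Delta x)^\alpha - 0^\alpha}{\Delta x} = \lim_{\Delta x \to 0}(\Delta x)^{\alpha - 1} = \begin{cases} 0 & \text{if } \alpha > 1 \\ 1 & \text{if } \alpha = 1 \end{cases} = \left.\alpha x^{\alpha - 1}\right|_{x=0}.$$

Thus the formula for the derivative of the power function holds whenever the function is defined.

The derivative of power functions leads to the derivative for polynomials

$$(c_n x^n + c_{n-1}x^{n-1} + \cdots + c_1 x + c_0)' = nc_n x^{n-1} + (n - 1)c_{n-1}x^{n-2} + \cdots + c_1.$$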
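The polynomial rule acts on coefficients alone, so it is straightforward to sketch in Python. The list representation `[c0, c1, ..., cn]` and the function name `poly_derivative` are assumptions made for this illustration.

```python
def poly_derivative(coeffs):
    """Given [c0, c1, ..., cn], return the coefficients [c1, 2*c2, ..., n*cn]
    of the derivative, following (c_k x^k)' = k c_k x^(k-1)."""
    return [k * c for k, c in enumerate(coeffs)][1:]

# Example: the cubic x^3 - 5x + 2 has derivative 3x^2 - 5.
cubic = [2, -5, 0, 1]
print(poly_derivative(cubic))
```

A constant polynomial yields the empty list, which represents the zero polynomial in this convention.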

On the other hand, by the derivative of $x^{-1}$ and the chain rule, we get the derivative

$$\left(\frac{1}{f}\right)' = (-1)(f)^{-2}f' = -\frac{f'}{f^2}$$

of the reciprocal of any differentiable function. By further applying the Leibniz rule, we get the derivative

$$\left(\frac{f}{g}\right)' = f'\frac{1}{g} + f\left(\frac{1}{g}\right)' = \frac{f'g - fg'}{g^2} \tag{2.1.16}$$

of the quotient of differentiable functions. In particular, we are able to compute the derivative of any rational function.

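The limits (1.3.23) and (1.3.25) tell us

$$\left.\frac{d\sin x}{dx}\right|_{x=0} = \lim_{x \to 0}\frac{\sin x - \sin 0}{x} = \lim_{x \to 0}\frac{\sin x}{x} = 1,$$

$$\left.\frac{d\cos x}{dx}\right|_{x=0} = \lim_{x \to 0}\frac{\cos x - \cos 0}{x} = \lim_{x \to 0} x\frac{\cos x - 1}{x^2} = 0.$$

At any $x = a$, let $y = x - a$. Then by the chain rule,

$$\begin{aligned} \left.\frac{d\sin x}{dx}\right|_{x=a} &= \left.\frac{d\sin(y + a)}{dy}\right|_{y=0}\left.\frac{dy}{dx}\right|_{x=a} = \left.\frac{d}{dy}(\sin y\cos a + \cos y\sin a)\right|_{y=0} \\ &= \left.\frac{d\sin y}{dy}\right|_{y=0}\cos a + \left.\frac{d\cos y}{dy}\right|_{y=0}\sin a = 1\cos a + 0\sin a = \cos a. \end{aligned}$$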


Similar computation gives us the derivative of the cosine function. Further applying (2.1.16) gives us the derivative of the tangent function.

$$\frac{d\tan x}{dx} = \frac{d}{dx}\left(\frac{\sin x}{\cos x}\right) = \frac{\cos x\cos x - (-\sin x)\sin x}{\cos^2 x} = \frac{1}{\cos^2 x} = \sec^2 x.$$

The limit (1.4.12) tells us

$$\frac{de^x}{dx} = \lim_{\Delta x \to 0}\frac{e^{x + \Delta x} - e^x}{\Delta x} = \lim_{\Delta x \to 0} e^x\frac{e^{\Delta x} - 1}{\Delta x} = e^x.$$

The derivative of the other exponential functions can be obtained by the chain rule

$$\frac{d\alpha^x}{dx} = \frac{de^{x\log\alpha}}{dx} = \left.\frac{de^y}{dy}\right|_{y = x\log\alpha}\frac{d(x\log\alpha)}{dx} = e^{x\log\alpha}\log\alpha = \alpha^x\log\alpha.$$

Note that all the exponential functions are the solutions of the differential equations of the form $f' = cf$, where $c$ is a constant (see Example 2.2.6 for the converse). Such functions are characterized by the important property that the rate of change is proportional to the value of the function itself (see the discussion on interest rate in Example 2.1.10). From this viewpoint, the case $c = 1$ is the purest and the most natural case. Only the exponential function based on $e$ corresponds to $c = 1$. This is why $e^x$ is called the natural exponential function.

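Example 2.1.11. The function $f(x) = \sqrt{1 + \sin^2 x}$ is the composition of $f = \sqrt{u}$, $u = 1 + v^2$, $v = \sin x$. Therefore

$$\frac{df}{dx} = \frac{d\sqrt{u}}{du}\frac{d(1 + v^2)}{dv}\frac{d(\sin x)}{dx} = \frac{1}{2\sqrt{u}}\,2v\cos x = \frac{\sin x\cos x}{\sqrt{1 + \sin^2 x}}.$$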

Example 2.1.12. The equation $x^2 + \frac{y^2}{4} = 1$ is an ellipse. The ellipse has upper and lower parts given respectively by $y = 2\sqrt{1 - x^2}$ and $y = -2\sqrt{1 - x^2}$. By the chain rule, the derivative of the upper part is

$$\frac{dy}{dx} = \left.\frac{d(2u^{\frac{1}{2}})}{du}\right|_{u = 1 - x^2}\frac{d(1 - x^2)}{dx} = 2\cdot\frac{1}{2}u^{-\frac{1}{2}}(-2x) = -\frac{2x}{\sqrt{1 - x^2}}.$$

Alternatively, instead of finding the explicit formula for $y$ in terms of $x$, we keep in mind that $y$ is a function of $x$ and take the equation to mean $x^2 + \frac{y(x)^2}{4} = 1$. Then taking $\frac{d}{dx}$ on both sides of the equation gives us

$$0 = \frac{d}{dx}\left(x^2 + \frac{y^2}{4}\right) = \frac{d(x^2)}{dx} + \frac{1}{4}\frac{d(y^2)}{dy}\frac{dy}{dx} = 2x + \frac{1}{4}\,2y\frac{dy}{dx}.$$

Solving the equation, we get $\frac{dy}{dx} = -\frac{4x}{y}$.


Yet another way of finding $\frac{dy}{dx}$ is by the parametrization $x = \cos t$, $y = 2\sin t$ of the ellipse. The chain rule $\frac{dy}{dt} = \frac{dy}{dx}\frac{dx}{dt}$ tells us

$$\frac{dy}{dx} = \frac{\;\frac{dy}{dt}\;}{\frac{dx}{dt}} = \frac{2\cos t}{-\sin t} = -2\cot t.$$

The reader is left to verify that the three ways give the same result.
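Before carrying out the algebraic verification, the three formulas can be compared numerically at one point of the upper part of the ellipse. In this Python sketch the parameter value `t = 1.0` and the variable names are arbitrary illustrative choices; agreement at a sample point is a sanity check, not a proof.

```python
import math

t = 1.0  # a parameter value with sin t > 0, so the point lies on the upper part
x, y = math.cos(t), 2 * math.sin(t)

explicit = -2 * x / math.sqrt(1 - x**2)        # from y = 2 sqrt(1 - x^2)
implicit = -4 * x / y                          # from implicit differentiation
parametric = -2 * math.cos(t) / math.sin(t)    # -2 cot t from the parametrization

print(explicit, implicit, parametric)
```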

Exercise 2.1.14. Compute the derivatives of the trigonometric functions $\cot x$, $\sec x$, $\csc x$.

Exercise 2.1.15. Compute the derivative of $\log x$ by making use of the limit (1.4.11).

Exercise 2.1.16. Compute the derivatives.

1. $\frac{1}{x^2 + 1}$.

2. $\frac{2x^2 + 1}{x^2 + 1}$.

3. $\left(\frac{2x^2 + 1}{x^2 + 1}\right)^4$.

4. $\sqrt{1 + x^2}$.

5. $\sqrt{1 + \sqrt{2 + \sqrt{3 + x}}}$.

6. $(\sin 2x + \cos 3x)^7$.

7. $\cos(\sin(\tan x))$.

8. $\sin^2\frac{1}{x}$.

9. $2^{\sin 3x}$.

10. $e^{2x}(\cos x - 2\sin x)$.

11.23x − 3x2

32x + 2x3.

Exercise 2.1.17. The hyperbolic functions are

$$\operatorname{sh} x = \frac{e^x - e^{-x}}{2}, \quad \operatorname{ch} x = \frac{e^x + e^{-x}}{2}, \quad \operatorname{th} x = \frac{e^x - e^{-x}}{e^x + e^{-x}}, \quad \coth x = \frac{e^x + e^{-x}}{e^x - e^{-x}}.$$

Express the derivatives of hyperbolic functions in terms of hyperbolic functions.

Exercise 2.1.18. Suppose $f(x)$ and $g(x)$ are differentiable functions satisfying $f(2) = 3$, $g(2) = 2$, $f'(2) = 1$, $g'(2) = -1$. Compute the derivatives of the following functions at 2.

$$f(x)g(x), \quad e^{f(x)}\sin\pi g(x), \quad f(g(x)), \quad g(2f(x)^2 - 8g(x)).$$

Exercise 2.1.19. Suppose $u = u(x)$ and $v = v(x)$ are differentiable functions satisfying $u(0) = 1$, $v(0) = -1$ and the given equations. Compute the derivatives of $u$ and $v$ at 0.

1. $u^2 + uv + v^2 = 1$, $(1 + x^2)u + (1 - x^2)v = x$.

2. $xu + (x + 1)v = e^x$, $e^u + xe^{-v} = e^{x+1}$.

Exercise 2.1.20. Suppose $y$ is a function of $x$ satisfying the given equation. Compute the derivative $\frac{dy}{dx}$ at the given point.

1. $y^3 + 2xy^2 - 2x^3 = 1$, at $x = 1$, $y = 1$.

2. $x\sin(x - y) = y\cos(x + y)$, at $x = \pi$, $y = 0$.

3. $e^y = xy$, at $x = -e^{-1}$, $y = -1$.

4. $(1 + y)e^x + (1 + x)e^y = xy + 2$, at $x = 0$, $y = 0$.


5. $y^2\sin x = x^2\cos y$, at $x = \frac{\pi}{4}$, $y = \frac{\pi}{4}$.

Exercise 2.1.21. For the parametrized curves, compute the derivative $\frac{dy}{dx}$ at the given point. Then find the equation for the tangent line of the curve.

1. Cycloid: $x = t - \sin t$, $y = 1 - \cos t$, at $t = \frac{\pi}{4}$.

2. Spiral: $x = t\cos t$, $y = t\sin t$, at $t = \pi$.

3. Involute of circle: $x = \cos t + t\sin t$, $y = \sin t - t\cos t$, at $t = \pi$.

4. Four-leaved rose: $x = \cos 2t\cos t$, $y = \cos 2t\sin t$, at $t = \frac{\pi}{4}$.

2.1.7 Derivative of Inverse Function

The inverse trigonometric functions and the logarithmic functions were defined as inverse functions. In order to compute their derivatives, we need to know how to compute the derivatives of inverse functions.

If a function $y = f(x)$ is approximated by a linear function $y = A + Bx$, then the inverse function $x = f^{-1}(y)$ should be approximated by the inverse linear function $x = -B^{-1}A + B^{-1}y$. This suggests that $(f^{-1})' = B^{-1} = \frac{1}{f'}$.

Proposition 2.1.5. Suppose a continuous function $f(x)$ is invertible near $x_0$ and is differentiable at $x_0$. If $f'(x_0) \ne 0$, then the inverse function is also differentiable at $y_0 = f(x_0)$, with

$$(f^{-1})'(y_0) = \frac{1}{f'(x_0)}. \tag{2.1.17}$$

The formula can also be written as

$$\frac{dx}{dy} = \left(\frac{dy}{dx}\right)^{-1} \quad \text{for } y = f(x),\ x = f^{-1}(y). \tag{2.1.18}$$

Proof. Let $f(x)$ be invertible and differentiable. By Proposition 2.1.3, it is continuous. By the composition rule, we may substitute $x, x_0$ by $f^{-1}(y), f^{-1}(y_0)$ in the limit

$$f'(x_0) = \lim_{x \to x_0}\frac{f(x) - f(x_0)}{x - x_0}$$

and get

$$f'(x_0) = \lim_{y \to y_0}\frac{y - y_0}{f^{-1}(y) - f^{-1}(y_0)}.$$

If $f'(x_0) \ne 0$, then this implies

$$\frac{1}{f'(x_0)} = \lim_{y \to y_0}\frac{f^{-1}(y) - f^{-1}(y_0)}{y - y_0}.$$

Thus $f^{-1}(y)$ is differentiable at $y_0$ with $\frac{1}{f'(x_0)}$ as the derivative.


For more detailed discussion on the relation between the invertibility of $f(x)$ near $x_0$ and $f'(x_0) \ne 0$, see Exercises 2.1.22, 2.1.24, and the remark after Proposition 2.2.4.

Exercise 2.1.22. Suppose $f(x)$ is invertible near $x_0$ and is differentiable at $x_0$. Prove that if the inverse function is differentiable at $y_0 = f(x_0)$, then $f'(x_0) \ne 0$. This is the “conditional” converse of Proposition 2.1.5.

Exercise 2.1.23. Suppose $f(x)$ is invertible near $x_0$ and is differentiable at $x_0$. Prove directly that if $L(x) = f(x_0) + b\Delta x$ is the linear approximation of $f(x)$ at $x_0$ with $b \ne 0$, then $L^{-1}(y) = x_0 + b^{-1}\Delta y$ is the linear approximation of $f^{-1}(y)$ at $f(x_0)$. This gives an alternative proof of Proposition 2.1.5.

Exercise 2.1.24. Consider the function

$$f(x) = \begin{cases} \dfrac{n}{n^2 + 1} & \text{if } x = \dfrac{1}{n},\ n \in \mathbb{N} \\ x & \text{otherwise.} \end{cases}$$

Verify that $f'(0) = 1$ but $f(x)$ is not one-to-one. This shows that the invertibility condition is necessary in Proposition 2.1.5. In particular, $f'(x_0) \ne 0$ does not necessarily imply the function is invertible near $x_0$.

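Now we are ready to derive the following derivatives.

$$\begin{aligned} (\arcsin x)' &= \frac{1}{\sqrt{1 - x^2}}, \\ (\arccos x)' &= -\frac{1}{\sqrt{1 - x^2}}, \\ (\arctan x)' &= \frac{1}{1 + x^2}, \\ (\operatorname{arcsec} x)' &= \frac{1}{x\sqrt{x^2 - 1}}, \\ (\log|x|)' &= \frac{1}{x}, \\ (\log_\alpha|x|)' &= \frac{1}{x\log\alpha}. \end{aligned}$$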

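Let $y = \arcsin x$. Then $x = \sin y$, $-\frac{\pi}{2} \le y \le \frac{\pi}{2}$, so that $\cos y \ge 0$, and

$$\frac{d\arcsin x}{dx} = \left(\frac{d\sin y}{dy}\right)^{-1} = \frac{1}{\cos y} = \frac{1}{\sqrt{1 - \sin^2 y}} = \frac{1}{\sqrt{1 - x^2}}.$$

The derivative of the inverse cosine can be derived similarly, or derived from $\arcsin x + \arccos x = \frac{\pi}{2}$. For the derivative of the inverse tangent, let $y = \arctan x$. Then $x = \tan y$, $-\frac{\pi}{2} < y < \frac{\pi}{2}$, and

$$\frac{d\arctan x}{dx} = \left(\frac{d\tan y}{dy}\right)^{-1} = \frac{1}{\sec^2 y} = \frac{1}{1 + \tan^2 y} = \frac{1}{1 + x^2}.$$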
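The two formulas just derived can be compared with symmetric difference quotients. The Python sketch below is only a numerical sanity check; the helper `central_diff`, the step `h` and the sample point `x = 0.3` are assumptions chosen for illustration.

```python
import math

def central_diff(func, x, h=1e-6):
    """Symmetric difference quotient (func(x + h) - func(x - h)) / (2h)."""
    return (func(x + h) - func(x - h)) / (2 * h)

x = 0.3
asin_numeric = central_diff(math.asin, x)
asin_formula = 1 / math.sqrt(1 - x**2)
atan_numeric = central_diff(math.atan, x)
atan_formula = 1 / (1 + x**2)

print(asin_numeric, asin_formula)
print(atan_numeric, atan_formula)
```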


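Using $\sec' x = \sec x\tan x$, we also get the derivative for inverse secant.

$$\frac{d\operatorname{arcsec} x}{dx} = \left.\frac{1}{\sec y\tan y}\right|_{y = \operatorname{arcsec} x} = \left.\frac{1}{\sec y\sqrt{\sec^2 y - 1}}\right|_{y = \operatorname{arcsec} x} = \frac{1}{x\sqrt{x^2 - 1}}.$$

The derivative of the logarithmic function can be obtained from the derivative of the exponential function. For $x > 0$,

$$\frac{d\log x}{dx} = \left.\left(\frac{de^y}{dy}\right)^{-1}\right|_{y = \log x} = \left.\frac{1}{e^y}\right|_{y = \log x} = \frac{1}{x}.$$

For $x < 0$, $y = \log|x|$, we have $x = -|x| = -e^y$ and

$$\frac{d\log|x|}{dx} = \left.\left(\frac{d(-e^y)}{dy}\right)^{-1}\right|_{y = \log|x|} = \left.\frac{1}{-e^y}\right|_{y = \log|x|} = \frac{1}{-|x|} = \frac{1}{x}.$$

Thus the formula holds for any $x \ne 0$. The derivative for the other logarithmic functions can be then obtained from $\log_\alpha x = \frac{\log x}{\log\alpha}$.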

Example 2.1.13. To compute the derivative of $f(x) = \frac{(x+2)^7\sqrt{x^2+1}}{(x^2-x+1)^6}$, we take the log and get

$$\log f(x) = 7\log(x+2) + \frac{1}{2}\log(x^2+1) - 6\log(x^2-x+1).$$

Taking the derivative, we get

$$\frac{f'(x)}{f(x)} = 7\,\frac{1}{x+2} + \frac{2x}{2(x^2+1)} - 6\,\frac{2x-1}{x^2-x+1}.$$

Thus

$$f'(x) = \frac{(x+2)^7\sqrt{x^2+1}}{(x^2-x+1)^6}\left(\frac{7}{x+2} + \frac{x}{x^2+1} - \frac{12x-6}{x^2-x+1}\right).$$
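As a hedged check of the logarithmic differentiation in Example 2.1.13, the sketch below compares the formula $f'(x) = f(x)\,(\log f)'(x)$ with a symmetric difference quotient. The function names, the step size and the sample point are assumptions made for illustration.

```python
import math


def f(x):
    """The function from Example 2.1.13."""
    return (x + 2) ** 7 * math.sqrt(x * x + 1) / (x * x - x + 1) ** 6


def f_prime_log(x):
    """f'(x) computed as f(x) times the derivative of log f(x)."""
    log_derivative = 7 / (x + 2) + x / (x * x + 1) - (12 * x - 6) / (x * x - x + 1)
    return f(x) * log_derivative


def f_prime_numeric(x, h=1e-6):
    """Symmetric difference quotient, used only as an independent check."""
    return (f(x + h) - f(x - h)) / (2 * h)


if __name__ == "__main__":
    x = 1.0  # arbitrary sample point
    print(f_prime_log(x), f_prime_numeric(x))
```

The two printed values should agree to several digits; the advantage of the logarithmic route is that products, quotients and powers turn into sums before differentiating.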

Exercise 2.1.25. Compute the derivatives.

1. $\arcsin\frac{1}{x}$.

2. $\frac{\arcsin x}{\arccos x}$.

3. $(\arctan(1+x^2))^3$.

4. $x(\log x - 1)$.

5. $\log(\log x)$.

6. $\log(x+\sqrt{1+x^2})$.

7. $\log|\sin x|$.

8. $\arctan(\log x)$.

9. $\arcsin(\arccos x)$.

Exercise 2.1.26. Suppose $f(x)$ and $g(x)$ are differentiable functions satisfying $f(2) = 3$, $g(2) = 2$, $f'(2) = 1$, $g'(2) = -1$. Compute the derivatives of the following functions at 2.

$$\log(f(x)+g(x)), \quad \arcsin\frac{f(x)}{3\sqrt{g(x)}}, \quad f(1+\log_2 g(x)), \quad g\left(\frac{6\arctan\sqrt{f(x)}}{\pi}\right).$$

Exercise 2.1.27. Suppose $y$ is a function of $x$ satisfying $\arctan\frac{y}{x} = \log\sqrt{x^2+y^2}$. Compute the derivative of $y$ and express the result in terms of $x$ and $y$.

Exercise 2.1.28. Compute the derivatives by making use of the logarithmic function.


1. $\frac{(x^2+1)^3\sqrt{x-2}}{(x+3)^9\sqrt[3]{x^3-x+2}}$.

2. $x^x$.

3. $x^{\frac{\sqrt{1+x}}{\sin x+\cos x}}$.

4. $f(x)^{g(x)}$.

5. $x^{(x^x)}$.

6. $(1+\sin x)^{\arccos x}$.

Exercise 2.1.29. Find the inverse of hyperbolic functions in Exercise 2.1.17. Then compute the derivatives in two ways: by direct computation and by using Proposition 2.1.5.

2.1.8 Additional Exercise

One Side Derivative

Define one side derivatives

$$f'(x_0^+) = \lim_{x\to x_0^+}\frac{f(x)-f(x_0)}{x-x_0}, \qquad f'(x_0^-) = \lim_{x\to x_0^-}\frac{f(x)-f(x_0)}{x-x_0}.$$

By Proposition 1.3.3, the usual (two side) derivative exists if and only if the left and right side derivatives exist and are equal.

Exercise 2.1.30. Define one side differentiability and prove the equivalence between the right differentiability and the existence of the right derivative.

Exercise 2.1.31. For $\alpha > 0$, compute the right derivative of the power function $x^\alpha$ at 0.

Exercise 2.1.32. Determine whether the function

$$\begin{cases} x^x & \text{if } x > 0, \\ 1 & \text{if } x = 0 \end{cases}$$

is right differentiable at $x = 0$.

Exercise 2.1.33. Prove right differentiability implies right continuity.

Exercise 2.1.34. Prove the properties $(f+g)'(x^+) = f'(x^+) + g'(x^+)$ and $(fg)'(x^+) = f'(x^+)g(x) + f(x)g'(x^+)$ for the right derivative.

Exercise 2.1.35. Suppose $f'(x_0^+) > 0$. Prove that $f(x) > f(x_0)$ for $x > x_0$ and near $x_0$.

Exercise 2.1.36. Suppose $f(x)$ and $g(y)$ are right differentiable at $x_0$ and $y_0 = f(x_0)$. Prove that under one of the following conditions, the composition $g(f(x))$ is right differentiable at $x_0$, and we have the chain rule $(g\circ f)'(x_0^+) = g'(y_0^+)f'(x_0^+)$.

1. $f(x) \ge f(x_0)$ for $x \ge x_0$.

2. $g(y)$ is (two side) differentiable at $y_0$.

Note that by Exercise 2.1.35, the first condition is satisfied if $f'(x_0^+) > 0$. Can you find the other chain rules for one side derivatives?

Exercise 2.1.37. State and prove the one side derivative version of Proposition 2.1.5.

2.2 Application of Differentiation

Having defined and computed the linear approximations, we are ready to solve problems related to the change of functions.


2.2.1 Maximum and Minimum

A function $f(x)$ has a local maximum at $x_0$ if $f(x_0)$ is the biggest value among all the values near $x_0$. In other words, there is $\delta > 0$, such that

$$|x - x_0| < \delta \implies f(x_0) \ge f(x). \tag{2.2.1}$$

Similarly, $f(x)$ has a local minimum at $x_0$ if there is $\delta > 0$, such that

$$|x - x_0| < \delta \implies f(x_0) \le f(x). \tag{2.2.2}$$

Local maxima and local minima are also called local extremes.

In the definition above, the function is assumed to be defined on both sides of $x_0$. A similar definition can be made when the function is defined on only one side of $x_0$. For example, suppose $f(x)$ is defined on a bounded closed interval $[a, b]$. Then $a$ is a local maximum if there is $\delta > 0$, such that

$$0 \le x - a < \delta \implies f(a) \ge f(x).$$

A function $f(x)$ has an absolute maximum at $x_0$ if $f(x_0)$ is the biggest value among all the values of $f(x)$. In other words, we have $f(x_0) \ge f(x)$ for any $x$ in the domain of $f$. The concept of absolute minimum is similarly defined. Absolute extremes are clearly also local extremes.

Figure 2.2: maximum and minimum (graph of a function over $[a, b]$, with the absolute maximum, the absolute minimum, two local maxima and a local minimum marked)

The following result tells us how to find local extremes.

Proposition 2.2.1. Suppose a function $f(x)$ is defined on both sides of $x_0$ and has a local extreme at $x_0$. If $f(x)$ is differentiable at $x_0$, then $f'(x_0) = 0$.

Proof. Suppose $f(x)$ is differentiable at $x_0$, with $f'(x_0) > 0$. Take $\epsilon = \frac{f'(x_0)}{2}$. Then there is $\delta > 0$, such that

$$|x - x_0| < \delta \implies |f(x) - f(x_0) - f'(x_0)(x - x_0)| \le \epsilon|x - x_0|.$$


For $\delta > x - x_0 > 0$, this tells us

$$f(x) - f(x_0) \ge f'(x_0)(x - x_0) - \epsilon|x - x_0| > 0.$$

For $0 > x - x_0 > -\delta$, this tells us

$$f(x) - f(x_0) \le f'(x_0)(x - x_0) + \epsilon|x - x_0| < 0.$$

Thus $x_0$ is not a local extreme. Similarly, if $f'(x_0) < 0$, then $x_0$ is also not a local extreme.

Note that the proof makes critical use of both sides of $x_0$. For a function $f(x)$ defined on a bounded closed interval $[a, b]$, this means that the proposition may be applied to the interior points of the interval where the function is differentiable. Therefore the proposition tells us that a local extreme point $x_0$ must be one of the following three cases.

1. $a < x_0 < b$, $f'(x_0)$ does not exist.

2. $a < x_0 < b$, $f'(x_0)$ exists and is 0.

3. $x_0 = a$ or $b$.

Typically, the three possibilities would provide finitely many candidates for the potential local extremes. If we take the maximum and minimum of the values at these points, we will get the absolute maximum and minimum.

Example 2.2.1. The function $f(x) = x^2e^x$ is differentiable everywhere, with the derivative $f'(x) = x(x+2)e^x$. From $f'(x) = 0$, we get two possible local extremes $x = -2$ and $x = 0$. By $\lim_{x\to-\infty} f(x) = 0$, $\lim_{x\to\infty} f(x) = +\infty$, $f(x) \ge 0$ for any $x$, $f(-2) = 4e^{-2}$, $f(0) = 0$, we conclude that the function has the absolute minimum at 0, and the function has no absolute maximum. As to whether $x = -2$ is a local extreme, see Example 2.2.7.

Example 2.2.2. For the function $f(x) = (x+1)x^{2/3}$ on $[-1, 1]$, we have $f'(x) = \frac{1}{3}(5x+2)x^{-1/3}$. The derivative vanishes at $x = -\frac{2}{5}$ and does not exist at $x = 0$. Thus the possible local extremes are these two points and the end points $-1$, $1$. By $f(-1) = 0$, $f\left(-\frac{2}{5}\right) = \frac{3\sqrt[3]{20}}{25}$, $f(0) = 0$, $f(1) = 2$, we conclude that the function has absolute minimum at $-1$ and $0$ and has absolute maximum at $1$.
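The candidate-point method used in Example 2.2.2 can be sketched in a few lines. The variable names are illustrative, and writing $x^{2/3}$ as $(x^2)^{1/3}$ is an implementation choice made here so that negative inputs stay real in floating point.

```python
def f(x):
    # (x + 1) * x^(2/3), with x^(2/3) computed as (x^2)^(1/3) to avoid complex values for x < 0
    return (x + 1) * (x * x) ** (1.0 / 3.0)


# Candidates from the example: end points, the zero of f', and the point where f' does not exist
candidates = [-1.0, -0.4, 0.0, 1.0]
values = {x: f(x) for x in candidates}

absolute_max = max(values.values())
absolute_min = min(values.values())
arg_max = [x for x, v in values.items() if v == absolute_max]
arg_min = [x for x, v in values.items() if v == absolute_min]

if __name__ == "__main__":
    print(values)
    print("max at", arg_max, "min at", arg_min)
```

Comparing finitely many candidate values is all that is needed once the candidates are known; finding the candidates is where the derivative does the work.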

Example 2.2.3. What is the biggest rectangle with perimeter 1?

Let one side of the rectangle be $x$. From practical consideration, we have $0 \le x \le \frac{1}{2}$, and the area is $A(x) = x\left(\frac{1}{2} - x\right)$. Thus the problem becomes finding the maximum of $A(x)$ on the interval $\left[0, \frac{1}{2}\right]$. Since $A(x)$ is differentiable, the potential candidates are either the end points of the interval or the interior points where $A'(x) = \frac{1}{2} - 2x = 0$. So the possible local extremes are $0$, $\frac{1}{4}$, $\frac{1}{2}$. The values of $A$ at the three points are $0$, $\frac{1}{16}$, $0$, respectively. The biggest value $\frac{1}{16}$ appears at $x = \frac{1}{4}$, in which case the rectangle is a square.


Example 2.2.4. The function $f(x) = x^3$ satisfies $f'(0) = 0$. However, we have $f(x) < f(0)$ for $x < 0$ and $f(x) > f(0)$ for $x > 0$. Therefore 0 is not a local extreme for $f(x)$. The example shows that $f'(x_0) = 0$ is a necessary but not a sufficient condition for local extremes (of differentiable functions).

Exercise 2.2.1. Find the maximum and the minimum on the given range.

1. $x(x+1)(x+2)$ on $[-3, 3]$.

2. $x(x+1)(x+2)$ on $[-3, 0]$.

3. $|x|e^x$ on $[-2, 1]$.

4. $\sin x + \cos x$ on $[0, 2\pi]$.

5. $\sin x + \cos x$ on the whole $\mathbb{R}$.

6. $(x+1)\cos x + (x-1)\sin x$ on $[-\pi, \pi]$.

Exercise 2.2.2. How much of the corners of a square with side length 1 should be cut so that the tray made from it has the biggest volume?

Exercise 2.2.3. Given the surface area of a cylinder, when is the volume biggest?

Exercise 2.2.4. Find the distance from the point $(1, 1)$ to the parabola $y^2 = 2x$.

Exercise 2.2.5. An arrow is shot at the angle $\alpha$. The location of the arrow at time $t$ is

$$x = tv\cos\alpha, \qquad y = tv\sin\alpha - \frac{1}{2}gt^2,$$

where $v$ is the initial speed, $x$ is the distance traveled and $y$ is the height. For what angle $\alpha$ does the arrow travel the furthest?

Exercise 2.2.6. Suppose $f(x)$ is left differentiable at $x_0$ (see Exercise 2.1.30). Prove that if $x_0$ is a local maximum, then $f'(x_0^-) \ge 0$. What about the right differentiable case?

2.2.2 Mean Value Theorem

Theorem 2.2.2 (Mean Value Theorem). Suppose $f(x)$ is continuous on $[a, b]$ and differentiable on $(a, b)$. Then there is $a < c < b$, such that

$$f'(c) = \frac{f(b)-f(a)}{b-a}. \tag{2.2.3}$$

The quotient on the right is the slope of the line segment that connects the end points of the graph of $f(x)$ on $[a, b]$. The theorem says the line segment is parallel to some tangent line.

The conclusion of the theorem can also be written as

$$f(b) - f(a) = f'(c)(b-a) \text{ for some } a < c < b, \tag{2.2.4}$$

or

$$f(x+\Delta x) - f(x) = f'(x+\theta\Delta x)\Delta x \text{ for some } 0 < \theta < 1. \tag{2.2.5}$$

Also note that $a$ and $b$ may be exchanged in the theorem, so that we do not have to insist $a < b$ (or $\Delta x > 0$) for the theorem to hold.
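The theorem only asserts that some $c$ exists. As an illustration, the sketch below locates such a $c$ for $f(x) = e^x$ on $[0, 1]$ by bisection on $f'(c) - \frac{f(b)-f(a)}{b-a}$. The function name `mean_value_point`, the tolerance, and the requirement of a sign change at the end points are assumptions of this sketch, not part of the theorem.

```python
import math


def mean_value_point(f, f_prime, a, b, tol=1e-12):
    """Find c in (a, b) with f'(c) equal to the secant slope, by bisection.

    Assumes f' is continuous and f' - slope changes sign between a and b,
    which is stronger than what the mean value theorem itself requires.
    """
    slope = (f(b) - f(a)) / (b - a)

    def g(c):
        return f_prime(c) - slope

    lo, hi = a, b
    if g(lo) * g(hi) > 0:
        raise ValueError("bisection needs a sign change of f' - slope on [a, b]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


# For f = exp on [0, 1] the exact answer is c = log(e - 1)
c = mean_value_point(math.exp, math.exp, 0.0, 1.0)

if __name__ == "__main__":
    print(c, math.log(math.e - 1))
```

For this particular $f$ the point $c$ is unique because $f'$ is strictly increasing; in general there may be several such points.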


Figure 2.3: mean value theorem (graph of $f$ over $[a, b]$ with the line $L$ through $(a, f(a))$ and $(b, f(b))$; the points $c_1$, $c_2$ are where $f - L$ reaches $\max\{f-L\}$ and $\min\{f-L\}$)

Proof. The line connecting $(a, f(a))$ and $(b, f(b))$ is

$$L(x) = f(a) + \frac{f(b)-f(a)}{b-a}(x-a).$$

As suggested by Figure 2.3, the tangent lines parallel to $L(x)$ can be found where the difference

$$h(x) = f(x) - L(x) = f(x) - f(a) - \frac{f(b)-f(a)}{b-a}(x-a)$$

reaches maximum or minimum.

Since $h(x)$ is continuous, by Theorem 1.4.5, it reaches maximum and minimum at $c_1, c_2 \in [a, b]$. If both $c_1$ and $c_2$ are end points $a$ and $b$, then the maximum and the minimum of $h(x)$ are $h(a) = h(b) = 0$. This implies $h(x)$ is constantly zero on the interval, so that $h'(c) = 0$ for any $a < c < b$. If one of $c_1$ and $c_2$, denoted $c$, is not an end point, then by Proposition 2.2.1, we have $h'(c) = 0$. In any case, we have $a < c < b$ satisfying

$$h'(c) = f'(c) - \frac{f(b)-f(a)}{b-a} = 0.$$

Example 2.2.5. We prove $\frac{x}{1+x} \le \log(1+x) \le x$ for $x > -1$. At $x = 0$ all three expressions are 0, so assume $x \ne 0$. Taking $f(x)$, $a$, $b$ in Theorem 2.2.2 to be $\log(1+x)$, $0$, $x$, we have

$$\log(1+x) = \log(1+x) - \log(1+0) = \left.\frac{d\log(1+u)}{du}\right|_{u=\theta x}(x-0) = \frac{x}{1+\theta x},$$

where $0 < \theta < 1$. For $x > 0$, we have $1 < 1+\theta x < 1+x$. For $-1 < x < 0$, we have $1 > 1+\theta x > 1+x$. In both cases, we conclude

$$\frac{x}{1+x} < \frac{x}{1+\theta x} < x.$$
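A quick numerical spot check of the inequality in Example 2.2.5 is sketched below. The sample points are arbitrary choices, and `math.log1p` is used only because it evaluates $\log(1+x)$ accurately for small $x$.

```python
import math


def bounds_hold(x):
    """Check x/(1+x) <= log(1+x) <= x for a single x > -1."""
    lower = x / (1 + x)
    middle = math.log1p(x)
    return lower <= middle <= x


# Arbitrary sample points on both sides of 0, including 0 itself
samples = [-0.9, -0.5, -0.1, 0.0, 0.1, 1.0, 10.0, 1000.0]
all_hold = all(bounds_hold(x) for x in samples)

if __name__ == "__main__":
    print(all_hold)
```

Such a check can only expose a mistake; the proof by the mean value theorem is what covers every $x > -1$.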

Exercise 2.2.7. Find c in the mean value theorem.


1. $x^3$ on $[-1, 1]$.

2. $\frac{1}{x}$ on $[1, 2]$.

3. $2^x$ on $[0, 1]$.

Exercise 2.2.8. Let $f(x) = |x - 1|$. Is there $0 < c < 3$ such that $f(3) - f(0) = f'(c)(3 - 0)$? Does your conclusion contradict the mean value theorem?

Exercise 2.2.9. Suppose $f(x)$ is continuous on $[a, b]$ and differentiable on $(a, b)$. Prove that $f(x)$ is Lipschitz if and only if $f'(x)$ is bounded on $(a, b)$ (see Exercise 1.4.13).

Exercise 2.2.10. Use the mean value theorem to prove $|\sin x - \sin y| < |x - y|$ and $|\arctan x - \arctan y| < |x - y|$.

Exercise 2.2.11. Prove that $\frac{x}{1+x^2} < \arctan x < x$ for $x > 0$.

Exercise 2.2.12 (Rolle³’s Theorem). Suppose $f(x)$ is continuous on $[a, b]$ and differentiable on $(a, b)$. Prove that if $f(a) = f(b)$, then $f'(c) = 0$ for some $a < c < b$.

Exercise 2.2.13. Suppose $f(x)$ has continuous derivative on a bounded and closed interval $[a, b]$. Prove that for any $\epsilon > 0$, there is $\delta > 0$, such that $|\Delta x| < \delta$ implies $|f(x+\Delta x) - f(x) - f'(x)\Delta x| < \epsilon|\Delta x|$. In other words, $f(x)$ is uniformly differentiable.

Exercise 2.2.14. Suppose $f(x)$ is continuous on $[a, b]$ and differentiable on $(a, b)$. Suppose $f(a) = 0$ and $|f'(x)| \le A|f(x)|$ for some constant $A$. First prove that $f(x) = 0$ on $\left[a, a+\frac{1}{2A}\right]$. Then proceed to prove that $f(x) = 0$ on the whole interval $[a, b]$.

Exercise 2.2.15. Suppose $f(x)$ is continuous on $[a, b]$ and is left and right differentiable on $(a, b)$ (see Exercise 2.1.30). Prove that there is $a < c < b$, such that $\frac{f(b)-f(a)}{b-a}$ lies between $f'(c^-)$ and $f'(c^+)$. A further extension of the mean value theorem appears in Exercise 2.2.37.

The mean value theorem has the following important consequence, which basically says that a non-changing quantity must be a constant.

Proposition 2.2.3. Suppose $f'(x) = 0$ for all $x$ on an interval. Then $f(x)$ is a constant on the interval.

For any two points $x_1$, $x_2$ in the interval, we apply the mean value theorem to get

$$f(x_1) - f(x_2) = f'(c)(x_1 - x_2) = 0(x_1 - x_2) = 0 \text{ for some } c \text{ between } x_1 \text{ and } x_2.$$

This proves the proposition.

Example 2.2.6. Suppose $f(x)$ satisfies $f'(x) = af(x)$, where $a$ is a constant. Then

$$(e^{-ax}f(x))' = -ae^{-ax}f(x) + e^{-ax}f'(x) = -ae^{-ax}f(x) + ae^{-ax}f(x) = 0.$$

Therefore $e^{-ax}f(x) = c$ is a constant, and $f(x) = ce^{ax}$ is a multiple of the exponential function.

³ Michel Rolle, born 1652 in Ambert (France), died 1719 in Paris (France). Rolle invented the notation $\sqrt[n]{x}$ for the $n$-th root of $x$. His theorem appeared in an obscure book in 1691.


Exercise 2.2.16. Suppose $f(x)$ satisfies $f'(x) = xf(x)$. Prove that $f(x) = ce^{\frac{x^2}{2}}$ for some constant $c$.

Exercise 2.2.17. Suppose $f(x)$ satisfies $|f(x) - f(y)| \le |x - y|^\alpha$ for some constant $\alpha > 1$. Prove that $f(x)$ is constant.

Exercise 2.2.18. Prove that if $f'(x) = g'(x)$ for all $x$, then $f(x) = g(x) + c$ for some constant $c$. Then prove the following equalities.

1. $2\arctan x + \arcsin\frac{2x}{1+x^2} = \pi$ for $x \ge 1$.

2. $3\arccos x - \arccos(3x - 4x^3) = \pi$ for $|x| \le \frac{1}{2}$.

2.2.3 Monotone Function

Proposition 2.2.4. Suppose $f(x)$ is continuous on an interval and is differentiable on the interior of the interval. Then $f(x)$ is increasing if and only if $f'(x) \ge 0$. Moreover, if $f'(x) > 0$, then $f(x)$ is strictly increasing.

The similar statement for decreasing functions is also true. Moreover, by combining with Theorem 1.4.8, we find that $f'(x) \ne 0$ near $x_0$ implies $f(x)$ is invertible near $x_0$ (compare Proposition 2.1.5).

From the approximation viewpoint, the proposition can be interpreted as follows. The monotone property for $f(x)$ near $x_0$ is pretty much the same as the monotone property of its linear approximation $f(x_0) + f'(x_0)\Delta x$. Then the condition $f'(x_0) \ge 0$ for the linear approximation to be increasing is very likely also the condition for the function $f(x)$ to be increasing near $x_0$.

To find out whether $f(x)$ has the increasing property near $x_0$, we consider the linear approximation $f(x_0) + f'(x_0)\Delta x$. The linear approximation is increasing if and only if the coefficient $f'(x_0) \ge 0$. Because of the approximation, the condition $f'(x_0) \ge 0$ very likely will imply that $f(x)$ is also increasing near $x_0$.

Proof. Suppose $f(x)$ is increasing. Then either $f(x) = f(y)$ or $f(x) - f(y)$ has the same sign as $x - y$. Therefore $\frac{f(y)-f(x)}{y-x} \ge 0$ for any $x \ne y$. This implies $f'(x) \ge 0$ for any $x$.

Conversely, for $x > y$, the mean value theorem tells us $f(x) - f(y) = f'(c)(x - y)$ for some $x > c > y$. Then the condition $f'(c) \ge 0$ implies $f(x) \ge f(y)$, and the condition $f'(c) > 0$ implies $f(x) > f(y)$.

Example 2.2.7. The derivative of $f(x) = x^2e^x$ is $f'(x) = x(x+2)e^x$. The possible local extremes found in Example 2.2.1 divide the whole real line into three sections $(-\infty, -2]$, $[-2, 0]$ and $[0, \infty)$, on the interiors of which we have respectively $f'(x) > 0$, $f'(x) < 0$ and $f'(x) > 0$. Therefore the function is strictly increasing, strictly decreasing and strictly increasing again on the three sections.

Since the function changes from increasing to decreasing at $x = -2$, $f(-2) = 4e^{-2}$ is a local maximum. Since the function changes from decreasing to increasing at $x = 0$, $f(0) = 0$ is a local minimum. Combined with $\lim_{x\to-\infty} f(x) = 0$,


Figure 2.4: graph of $x^2e^x$ (the local maximum value $4e^{-2}$ at $x = -2$ and two points $a$, $b$ are marked)

$\lim_{x\to+\infty} f(x) = +\infty$, we get a rough sketch for the graph of the function in Figure 2.4. The numbers $a$ and $b$ in the graph will be explained in Example 2.3.21.
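The sign analysis in Example 2.2.7 can be mirrored by sampling $f'(x) = x(x+2)e^x$ inside each of the three sections. This is only a sketch: the helper `sign_on` and the sample points are illustrative assumptions, and sampling finitely many points does not replace the argument in the example.

```python
import math


def f(x):
    return x * x * math.exp(x)


def f_prime(x):
    return x * (x + 2) * math.exp(x)


def sign_on(points):
    """Return +1 or -1 if f' has that sign at every sample point, else 0."""
    signs = {1 if f_prime(x) > 0 else -1 for x in points}
    return signs.pop() if len(signs) == 1 else 0


# Sample points chosen inside (-inf, -2), (-2, 0) and (0, inf)
left = sign_on([-10.0, -5.0, -2.5])
middle = sign_on([-1.9, -1.0, -0.1])
right = sign_on([0.1, 1.0, 5.0])

if __name__ == "__main__":
    print(left, middle, right)
    print(f(-2.0), 4 * math.exp(-2))
```

The pattern increasing, decreasing, increasing is what makes $x = -2$ a local maximum and $x = 0$ a local minimum.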

Example 2.2.8. By taking the derivative of the function $f(x) = (x+1)x^{2/3}$, we found $x = -\frac{2}{5}$ and $x = 0$ are the possible local extremes in Example 2.2.2. To determine whether they are indeed local extremes, we note that $f'(x) > 0$, $f'(x) < 0$ and $f'(x) > 0$ in the interiors of the intervals $\left(-\infty, -\frac{2}{5}\right]$, $\left[-\frac{2}{5}, 0\right]$ and $[0, \infty)$. Thus the function changes from increasing to decreasing at $x = -\frac{2}{5}$ and changes from decreasing to increasing at $x = 0$. This implies that $f\left(-\frac{2}{5}\right) = \frac{3\sqrt[3]{20}}{25}$ is a local maximum and $f(0) = 0$ is a local minimum.

Example 2.2.9. We prove $e^x > 1 + x$ for $x \ne 0$. The problem is the same as showing $f(x) = e^x - x > 1 = f(0)$. For $x > 0$, we have $f'(x) = e^x - 1 > 0$. Thus $f(x)$ is strictly increasing and $f(x) > f(0)$ for $x > 0$. The argument for the case $x < 0$ is similar.

Exercise 2.2.19. Study the monotone property of functions and find local maxima and minima.

1. $x(x+1)(x+2)$.

2. $\frac{x}{1+x^2}$.

3. $x - \frac{1}{x}$.

4. $(x-1)x^{2/3}$.

5. $\sin x + \cos x$.

6. $|x|e^x$.

7. $(x+1)\cos x + (x-1)\sin x$.

8. $x - 2\sin x$.

9. $x^2 - \log x$.

10. $e^x\cos x$.

11. $x - 2\arctan x$.

12. $x^x$.

Exercise 2.2.20. Prove the inequalities.

1. $\tan x > x$ for $0 < x < \frac{\pi}{2}$. (Hint: Consider $\sin x - x\cos x$)

2. $\sin x > \frac{2}{\pi}x$ for $0 < x < \frac{\pi}{2}$. (Hint: Consider $\frac{\sin x}{x}$)


3. $\sin x + \cos x > 1 + x - \frac{x^2}{\sqrt{2}}$ for $x > 0$.

4. $\cos x > 1 - \frac{x^2}{2}$ for $0 < x < \frac{\pi}{2}$.

5. $x - \frac{x^2}{2(1+x)} > \log(x+1) > x - \frac{x^2}{2}$ for $x > 0$.

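Exercise 2.2.21. Prove that $\left(1+\frac{1}{x}\right)^x$ is strictly increasing and $\left(1+\frac{1}{x}\right)^{x+1}$ is strictly decreasing on $(0, +\infty)$. Moreover, prove that $\left(1+\frac{1}{x}\right)^x < e < \left(1+\frac{1}{x}\right)^{x+1}$ and $e - \left(1+\frac{1}{x}\right)^x < \frac{e}{x}$ for $x > 0$.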

Exercise 2.2.22. The following steps show that the smallest $\alpha$ such that $\left(1+\frac{1}{x}\right)^{x+\alpha} > e$ for all $x > 0$ is $\alpha = \frac{1}{2}$.

1. Convert the problem to $\alpha > f\left(\frac{1}{x}\right)$, with $f(x) = \frac{1}{\log(1+x)} - \frac{1}{x}$. Thus the problem is the same as finding the supremum of $f(x)$ on $(0, +\infty)$.

2. Prove that $u - \frac{1}{u} > 2\log u$ for $u > 1$ and $u - \frac{1}{u} < 2\log u$ for $0 < u < 1$. Then use the inequality to show that $f$ is a decreasing function.

3. Show that the supremum of $f(x)$ is $\lim_{x\to 0^+} f(x) = \frac{1}{2}$.

4. Find the smallest $\alpha$ and biggest $\beta$ such that $\left(1+\frac{1}{n}\right)^{n+\alpha} > e > \left(1+\frac{1}{n}\right)^{n+\beta}$ for all natural numbers $n$.

Exercise 2.2.23. Suppose $f(x)$ is continuous for $x \ge 0$ and differentiable for $x > 0$. Prove that if $f'(x)$ is strictly increasing and $f(0) = 0$, then $\frac{f(x)}{x}$ is also strictly increasing.

Exercise 2.2.24. Suppose $f(x)$ is left and right differentiable on an interval. Prove that if $f'(x^+) \ge 0$ and $f'(x^-) \ge 0$, then $f(x)$ is increasing. Moreover, if the inequalities are strict, then $f(x)$ is strictly increasing.

2.2.4 L’Hospital’s Rule

The derivative is a useful tool for the computation of function limits of the types $\frac{0}{0}$ or $\frac{\infty}{\infty}$.

Proposition 2.2.5 (L’Hospital⁴’s Rule). Suppose $f(x)$ and $g(x)$ are differentiable on $(a-\delta, a) \cup (a, a+\delta)$ for some $\delta > 0$. Assume

⁴ Guillaume Francois Antoine Marquis de L’Hospital, born 1661 in Paris (France), died 1704 in Paris (France). His famous rule was found in his 1696 book “Analyse des infiniment petits pour l’intelligence des lignes courbes”, which was the first textbook in differential calculus.


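1. Either $\lim_{x\to a} f(x) = \lim_{x\to a} g(x) = 0$ or $\lim_{x\to a} f(x) = \lim_{x\to a} g(x) = \infty$.

2. $\lim_{x\to a}\frac{f'(x)}{g'(x)}$ exists.

Then $\lim_{x\to a}\frac{f(x)}{g(x)}$ also exists and $\lim_{x\to a}\frac{f(x)}{g(x)} = \lim_{x\to a}\frac{f'(x)}{g'(x)}$.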

The following counterexample

$$\lim_{x\to 0}\frac{1+x}{2+x} = \frac{1}{2}, \qquad \lim_{x\to 0}\frac{(1+x)'}{(2+x)'} = \lim_{x\to 0}\frac{1}{1} = 1$$

shows the necessity of the first condition. The second condition means that the existence of the limit for the derivative quotient implies the existence of the limit for the original quotient. The converse of this implication is not necessarily true.
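To see both situations numerically, the sketch below evaluates the original quotient and the derivative quotient near the limit point. The $\frac{0}{0}$ example $\frac{1-\cos x}{x^2}$ is a choice made here for illustration (it is not from the text); the second pair is the counterexample above.

```python
import math

x = 1e-3  # a point close to the limit point 0, chosen for illustration

# A 0/0 case (assumed example): (1 - cos x) / x^2, with derivative quotient sin x / (2x)
original = (1 - math.cos(x)) / (x * x)
derivative_quotient = math.sin(x) / (2 * x)

# The counterexample: the first condition fails, so the two quotients need not agree
bad_original = (1 + x) / (2 + x)
bad_derivative_quotient = 1.0 / 1.0

if __name__ == "__main__":
    print(original, derivative_quotient)
    print(bad_original, bad_derivative_quotient)
```

In the first pair both values are close to $\frac{1}{2}$, as the rule predicts; in the second pair they stay near $\frac{1}{2}$ and $1$ respectively, so the rule must not be applied there.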

L’Hospital’s rule as stated above is only about the finite limit at a finite $a$. The subsequent proof shows that the technique also applies to one side limits. By converting $x$ to $\frac{1}{x}$, it is also easy to show that the technique holds for $x$ approaching various kinds of infinities. Moreover, the rule can also be applied if the limit of the quotient is some kind of infinity. This can be shown by inverting the quotient $\frac{f(x)}{g(x)}$ to $\frac{g(x)}{f(x)}$.

To prove L’Hospital’s rule, we need the following extension of the mean value theorem.

Theorem 2.2.6 (Cauchy’s Mean Value Theorem). Suppose $f(x)$ and $g(x)$ are continuous on $[a, b]$ and differentiable on $(a, b)$. If $g'(x)$ is never zero, then there is $a < c < b$, such that

$$\frac{f'(c)}{g'(c)} = \frac{f(b)-f(a)}{g(b)-g(a)}. \tag{2.2.6}$$

Geometrically, consider $(g(x), f(x))$ as a parametrized curve in $\mathbb{R}^2$. The vector from one point at $x = a$ to another point at $x = b$ is $(g(b)-g(a), f(b)-f(a))$. Cauchy’s mean value theorem says that it should be parallel to a tangent vector $(g'(c), f'(c))$ at another point $x = c$ on the curve. This suggests that the theorem may be proved by imitating the proof of the mean value theorem, by considering

$$h(x) = f(x) - f(a) - \frac{f(b)-f(a)}{g(b)-g(a)}(g(x)-g(a)).$$

The details are left to the reader.
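A concrete instance may help when filling in the details. The sketch below takes $f(x) = x^3$ and $g(x) = x^2$ on $[1, 2]$ (choices made here for illustration, not from the text), builds the auxiliary function $h$, and checks that $h(a) = h(b) = 0$ and that the point $c$ solving $h'(c) = 0$ satisfies (2.2.6).

```python
def f(x):
    return x ** 3


def g(x):
    return x ** 2


a, b = 1.0, 2.0
ratio = (f(b) - f(a)) / (g(b) - g(a))  # equals 7/3 for this choice


def h(x):
    """Auxiliary function from the suggested proof."""
    return f(x) - f(a) - ratio * (g(x) - g(a))


# h'(c) = 3c^2 - ratio * 2c = 0 with c != 0 gives c = 2 * ratio / 3
c = 2 * ratio / 3

if __name__ == "__main__":
    print(h(a), h(b), c)
    print((3 * c * c) / (2 * c), ratio)
```

Here $c = \frac{14}{9}$ lies strictly between 1 and 2, and $\frac{f'(c)}{g'(c)} = \frac{3c}{2}$ matches the quotient on the right of (2.2.6).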

Proof of L’Hospital’s Rule. We will prove for the limit of the type $\lim_{x\to a^+}$ only, with $a$ a finite number. Thus $f(x)$ and $g(x)$ are assumed to be differentiable on $(a, a+\delta)$.


Figure 2.5: Cauchy’s mean value theorem (the curve $(g(x), f(x))$ with the points $(g(a), f(a))$, $(g(b), f(b))$, $(g(c), f(c))$ and the tangent vector $(g'(c), f'(c))$ marked)

First assume $\lim_{x\to a^+} f(x) = \lim_{x\to a^+} g(x) = 0$. Then $f(x)$ and $g(x)$ can be extended to continuous functions on $[a, a + \delta)$ by defining $f(a) = g(a) = 0$. Cauchy’s mean value theorem then tells us that for any $a < x < a + \delta$, we have

\[ \frac{f(x)}{g(x)} = \frac{f(x) - f(a)}{g(x) - g(a)} = \frac{f'(c)}{g'(c)} \tag{2.2.7} \]

for some $c$ satisfying $a < c < x$ (and $c$ depends on $x$). As $x \to a^+$, we have $c \to a^+$. Therefore if the limit of the right side of (2.2.7) exists, so does the limit of the left side, and the two limits are the same.

Now consider the case $\lim_{x\to a^+} f(x) = \lim_{x\to a^+} g(x) = \infty$. The technical difficulty here is that the functions cannot be extended to $x = a$ as before. Still, we try to establish something similar to (2.2.7) by replacing $\frac{f(x) - f(a)}{g(x) - g(a)}$ with $\frac{f(x) - f(b)}{g(x) - g(b)}$, where $b > a$ is very close to $a$. The second equality in (2.2.7) still holds. Although the first equality no longer holds, it is sufficient to show that $\frac{f(x)}{g(x)}$ and $\frac{f(x) - f(b)}{g(x) - g(b)}$ are very close. Of course all these should be put together in logical order.

Denote $\lim_{x\to a^+} \frac{f'(x)}{g'(x)} = l$. For any $\epsilon > 0$, there is $\delta_1 > 0$, such that

\[ a < x \le b = a + \delta_1 \implies \left| \frac{f'(x)}{g'(x)} - l \right| < \epsilon. \]

Then by Cauchy’s mean value theorem,

\[ a < x < b \implies \left| \frac{f(x) - f(b)}{g(x) - g(b)} - l \right| = \left| \frac{f'(c)}{g'(c)} - l \right| < \epsilon, \tag{2.2.8} \]

where $a < x < c < b$. In particular, the quotient $\frac{f(x) - f(b)}{g(x) - g(b)}$ is bounded.


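Moreover, by the assumption $\lim_{x\to a^+} f(x) = \lim_{x\to a^+} g(x) = \infty$, we have

\[ \lim_{x\to a^+} \frac{1 - \frac{g(b)}{g(x)}}{1 - \frac{f(b)}{f(x)}} = 1. \]

Therefore, there is $\delta_1 \ge \delta > 0$, such that

\[ a < x < a + \delta \implies \left| \frac{f(x)}{g(x)} - \frac{f(x) - f(b)}{g(x) - g(b)} \right| = \left| \frac{f(x) - f(b)}{g(x) - g(b)} \right| \left| \frac{1 - \frac{g(b)}{g(x)}}{1 - \frac{f(b)}{f(x)}} - 1 \right| < \epsilon. \]

Since $a < x < a + \delta$ implies $a < x < b$, the conclusion of (2.2.8) also holds. Thus

\[ a < x < a + \delta \implies \left| \frac{f(x)}{g(x)} - l \right| \le \left| \frac{f(x)}{g(x)} - \frac{f(x) - f(b)}{g(x) - g(b)} \right| + \left| \frac{f(x) - f(b)}{g(x) - g(b)} - l \right| < 2\epsilon. \]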

Example 2.2.10. To find $\lim_{x\to 0} \frac{\sin x}{x}$, we note that both $\sin x$ and $x$ are continuous and vanish at $x = 0$. Moreover,

\[ \lim_{x\to 0} \frac{(\sin x)'}{x'} = \lim_{x\to 0} \frac{\cos x}{1} = \cos 0 = 1. \]

Therefore by L’Hospital’s rule, $\lim_{x\to 0} \frac{\sin x}{x} = \lim_{x\to 0} \frac{(\sin x)'}{x'} = 1$.

Unfortunately, the argument above is circular, because the limit $\lim_{x\to 0} \frac{\sin x}{x} = 1$ was used to compute the derivative of $\sin x$. Similarly, the computations of $\lim_{x\to 0} \frac{e^x - 1}{x}$ and $\lim_{x\to 0} \frac{\log(x + 1)}{x}$ by L’Hospital’s rule are also circular.

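Example 2.2.11. The limit $\lim_{x\to 1} \left( \frac{1}{x - 1} - \frac{1}{\log x} \right)$ may be found by making use of L’Hospital’s rule twice.

\begin{align*}
\lim_{x\to 1} \left( \frac{1}{x - 1} - \frac{1}{\log x} \right)
&= \lim_{x\to 1} \frac{\log x - x + 1}{(x - 1)\log x}
\overset{(1)}{=} \lim_{x\to 1} \frac{\frac{1}{x} - 1}{\log x + \frac{x - 1}{x}} \\
&= \lim_{x\to 1} \frac{1 - x}{x\log x + x - 1}
\overset{(2)}{=} \lim_{x\to 1} \frac{-1}{\log x + 2} = -\frac{1}{2}.
\end{align*}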


Note that one should not blindly apply L’Hospital’s rule without checking the conditions. For the computation above, since $1 - x = x\log x + x - 1 = 0$ at $x = 1$ and $\lim_{x\to 1} \frac{-1}{\log x + 2}$ exists, the equality marked (2) holds. Then since $\log x - x + 1 = (x - 1)\log x = 0$ at $x = 1$ and $\lim_{x\to 1} \frac{\frac{1}{x} - 1}{\log x + \frac{x - 1}{x}}$ exists, the equality marked (1) holds.
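As a quick numerical sanity check of Example 2.2.11 (not a proof), the following sketch evaluates the expression at points close to 1 from both sides. The function name `g` and the sample step sizes are illustrative choices, not part of the text.

```python
import math

def g(x):
    """The expression 1/(x - 1) - 1/log(x) from Example 2.2.11 (x > 0, x != 1)."""
    return 1.0 / (x - 1.0) - 1.0 / math.log(x)

# Sample points approaching 1 from the right and from the left.
for h in (1e-1, 1e-2, 1e-3):
    print(h, g(1.0 + h), g(1.0 - h))
```

The printed values should drift toward $-\frac{1}{2}$ as the step shrinks. Much smaller steps are not recommended here, since the two large terms cancel and floating point error starts to dominate.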

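Example 2.2.12. The limit $\lim_{x\to +\infty} (x + \sqrt{1 + x^2})^{\frac{1}{\log x}}$ may be found by first applying L’Hospital’s rule to the log of the function.

\[ \lim_{x\to +\infty} \frac{\log(x + \sqrt{1 + x^2})}{\log x} = \lim_{x\to +\infty} \frac{\dfrac{1 + \frac{x}{\sqrt{1 + x^2}}}{x + \sqrt{1 + x^2}}}{\dfrac{1}{x}} = \lim_{x\to +\infty} \frac{x}{\sqrt{1 + x^2}} = 1. \]

Note that L’Hospital’s rule may be applied because $\lim_{x\to +\infty} \log(x + \sqrt{1 + x^2}) = \lim_{x\to +\infty} \log x = \infty$ and the limit in the middle exists. By taking the exponential, we get

\[ \lim_{x\to +\infty} (x + \sqrt{1 + x^2})^{\frac{1}{\log x}} = e^1 = e. \]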

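Exercise 2.2.25. Compute the limits.

1. $\lim_{x\to 0} \frac{x - \sin x}{x^3}$.

2. $\lim_{x\to 0} \frac{e^x - 1 - x}{x^2}$.

3. $\lim_{x\to 0} \left( \frac{\cos x - 1}{x^4} + \frac{1}{2x^2} \right)$.

4. $\lim_{x\to 0} \sin x \log x$.

5. $\lim_{x\to 0} \left( \frac{1}{x} - \cot x \right)$.

6. $\lim_{x\to 0} (\cos x)^{\frac{1}{x^2}}$.

7. $\lim_{x\to 0} \frac{x}{\sqrt{1 + x} - e^x}$.

8. $\lim_{x\to +\infty} (\pi - 2\arctan x)\log x$.

9. $\lim_{x\to \infty} \left( x \sin\frac{1}{x} \right)^{x^2}$.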

Exercise 2.2.26. Discuss whether L’Hospital’s rule can be applied to the limits.

1. $\lim_{x\to \infty} \frac{x + \sin x}{x - \cos x}$.

2. $\lim_{x\to +\infty} \frac{x}{x + \sin x}$.

3. $\lim_{x\to 0} \frac{x^2 \sin\frac{1}{x}}{\sin x}$.

Exercise 2.2.27. The mean value theorem tells us

\[ \log(1 + x) - \log 1 = x\,\frac{1}{1 + \theta x}, \qquad e^x - 1 = x e^{\theta x}, \]

for some $0 < \theta < 1$. Prove that in both cases, $\lim_{x\to 0} \theta = \frac{1}{2}$.

Exercise 2.2.28. Prove L’Hospital’s rule for the case $a = +\infty$. Moreover, discuss L’Hospital’s rule for the case $l = \infty$.


2.2.5 Additional Exercise

Ratio Rule

By specializing the ratio rule in Exercise 1.1.32 to $y_n = a^n$, we get the limits in Exercises 1.1.33 and 1.1.34. By making other choices of $y_n$ and using the Taylor expansion to estimate the quotient $\frac{y_{n+1}}{y_n}$, we get other concrete forms of the ratio rule.

Exercise 2.2.29. Prove that if $a > b > c > 0$, then $1 - ax < (1 - x)^b < 1 - cx$ and $1 + ax > (1 + x)^b > 1 + cx$ for small and positive $x$. Then prove the following.

1. If $\left| \frac{x_{n+1}}{x_n} \right| \le 1 - \frac{a}{n}$ for some $a > 0$ and big $n$, then $\lim_{n\to\infty} x_n = 0$.

2. If $\left| \frac{x_{n+1}}{x_n} \right| \ge 1 + \frac{a}{n}$ for some $a > 0$ and big $n$, then $\lim_{n\to\infty} x_n = \infty$.

Exercise 2.2.30. Study the limits.

1. $\lim_{n\to\infty} \frac{(n!)^2 a^n}{(2n)!}$.

2. $\lim_{n\to\infty} \frac{(n + a)^{n+b}}{c^n n!}$.

Exercise 2.2.31. Rephrase the rules in Exercise 2.2.29 in terms of the quotient $\left| \frac{x_n}{x_{n+1}} \right|$. Then prove that if $\lim_{n\to\infty} n \left( \left| \frac{x_n}{x_{n+1}} \right| - 1 \right) > 0$, then $\lim_{n\to\infty} x_n = 0$. Find the similar condition for the conclusion $\lim_{n\to\infty} x_n = \infty$.

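Exercise 2.2.32. Prove that if $a > b > c > 0$, then

\[ 1 - \frac{a}{x\log x} < \frac{(\log(x - 1))^b}{(\log x)^b} < 1 - \frac{c}{x\log x} \quad\text{and}\quad 1 + \frac{a}{x\log x} > \frac{(\log(x + 1))^b}{(\log x)^b} > 1 + \frac{c}{x\log x} \]

for big and positive $x$. Then prove the following.

1. If $\left| \frac{x_{n+1}}{x_n} \right| \le 1 - \frac{a}{n\log n}$ for some $a > 0$ and big $n$, then $\lim_{n\to\infty} x_n = 0$.

2. If $\left| \frac{x_{n+1}}{x_n} \right| \ge 1 + \frac{a}{n\log n}$ for some $a > 0$ and big $n$, then $\lim_{n\to\infty} x_n = \infty$.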

Darboux5’s Intermediate Value Theorem

Exercise 2.2.33. Suppose $f(x)$ is differentiable on $[a, b]$. By considering the extremes of $f(x) - \gamma x$, prove that for any $\gamma$ between $f'(a)$ and $f'(b)$, there is $c \in [a, b]$, such that $f'(c) = \gamma$.

Exercise 2.2.34. Find a function $f(x)$ differentiable everywhere on $[0, 1]$, such that $f'(x)$ is not continuous. The example shows that Darboux’s intermediate value theorem is not a consequence of the usual intermediate value theorem.

Extension of Mean Value Theorem6

5 Jean Gaston Darboux, born 1842 in Nimes (France), died 1917 in Paris (France). Darboux made important contributions to differential geometry and analysis.

6 See “Some Remarks on Functions with One-Sided Derivatives” by Miller and Vyborny, American Math Monthly 93 (1986) 471-475.


In Exercise 2.2.15, the mean value theorem is extended to continuous functions that are both left and right differentiable. In what follows, we consider continuous functions that are left or right differentiable.

Exercise 2.2.35. Prove that for any function $f(x)$ on $[a, b]$ and $l < \frac{f(b) - f(a)}{b - a}$, there is a linear function $L(x)$ satisfying $L'(x) = l$ and $L(a) > f(a)$, $L(b) < f(b)$.

Exercise 2.2.36. Suppose $f(x)$ is a continuous function on $[a, b]$ and $L$ is a linear function with the properties in Exercise 2.2.35. Prove that $c = \sup\{x \in (a, b) : f \le L \text{ on } [a, x]\}$ satisfies $c \in (a, b)$ and $f(c) = L(c)$. Moreover, if $f(x)$ has any one side derivative at $c$, then the one side derivative is no less than $L'(c) = l$.

Exercise 2.2.37. Suppose $f(x)$ is a continuous function on $[a, b]$ and is left or right differentiable at any point on $(a, b)$. Let $f'_*(x)$ be one of the one side derivatives at $x$. Prove that

\[ \inf_{(a,b)} f'_* \le \frac{f(b) - f(a)}{b - a} \le \sup_{(a,b)} f'_*. \]

Exercise 2.2.38. Use Exercise 2.2.37 to further extend the criterion for the monotonicity in Exercise 2.2.24.

Basic Inequalities

The monotone property can be used to prove some important basic inequalities. Let $p$ and $q$ be real numbers satisfying

\[ \frac{1}{p} + \frac{1}{q} = 1. \]

Exercise 2.2.39. For $x > 0$, prove that $x^{\frac{1}{p}} \le \frac{1}{p}x + \frac{1}{q}$ in case $p > 1$ and $x^{\frac{1}{p}} \ge \frac{1}{p}x + \frac{1}{q}$ in case $p < 1$.

Exercise 2.2.40 (Young$^7$ Inequality). For $a, b > 0$, prove

\[ ab \le \frac{1}{p}a^p + \frac{1}{q}b^q \quad\text{for } p > 1 \]

and

\[ ab \ge \frac{1}{p}a^p + \frac{1}{q}b^q \quad\text{for } p < 1. \]

When does the equality hold?

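Exercise 2.2.41 (Hölder$^8$ Inequality). Suppose $p, q > 0$. For positive numbers $a_1, a_2, \dots, a_n$, $b_1, b_2, \dots, b_n$, by taking $a = \frac{a_i}{\left(\sum a_i^p\right)^{\frac{1}{p}}}$ and $b = \frac{b_i}{\left(\sum b_i^q\right)^{\frac{1}{q}}}$ in the Young inequality, prove

\[ \sum a_i b_i \le \left( \sum a_i^p \right)^{\frac{1}{p}} \left( \sum b_i^q \right)^{\frac{1}{q}}. \tag{2.2.9} \]

When does the equality hold?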

7 William Henry Young, born 1863 in London (England), died 1942 in Lausanne (Switzerland). Young discovered a form of Lebesgue integration independently. He wrote an influential book “The fundamental theorems of the differential calculus” in 1910.

8 Otto Ludwig Hölder, born 1859 in Stuttgart (Germany), died 1937 in Leipzig (Germany). He discovered the inequality in 1884. Hölder also made fundamental contributions to group theory.


Exercise 2.2.42 (Minkowski$^9$ Inequality). Suppose $p > 1$. By applying the Hölder inequality to $a_1, a_2, \dots, a_n$, $(a_1 + b_1)^{p-1}, (a_2 + b_2)^{p-1}, \dots, (a_n + b_n)^{p-1}$ and then to $b_1, b_2, \dots, b_n$, $(a_1 + b_1)^{p-1}, (a_2 + b_2)^{p-1}, \dots, (a_n + b_n)^{p-1}$, prove

\[ \left( \sum (a_i + b_i)^p \right)^{\frac{1}{p}} \le \left( \sum a_i^p \right)^{\frac{1}{p}} + \left( \sum b_i^p \right)^{\frac{1}{p}}. \tag{2.2.10} \]

When does the equality hold?

2.3 High Order Approximation

The linear approximation has many nice properties and can be used to solve many problems. Still, there are problems that require approximations more refined than the linear one. In this case, polynomials of higher and higher degrees can be used. This leads to higher order derivatives and Taylor series.

2.3.1 Quadratic Approximation

A function is approximated by a quadratic function $a + b\Delta x + c\Delta x^2$ at $x_0$, if for any $\epsilon > 0$, there is $\delta > 0$, such that

\[ |\Delta x| < \delta \implies |f(x) - a - b\Delta x - c\Delta x^2| \le \epsilon |\Delta x|^2. \tag{2.3.1} \]

Similar to the linear approximation, the condition (2.3.1) for quadratic approximation means exactly that $a = f(x_0)$, the derivative $b = f'(x_0)$ exists, and the limit

\[ c = \lim_{\Delta x\to 0} \frac{f(x_0 + \Delta x) - f(x_0) - f'(x_0)\Delta x}{\Delta x^2} = \lim_{x\to x_0} \frac{f(x) - f(x_0) - f'(x_0)(x - x_0)}{(x - x_0)^2} \tag{2.3.2} \]

exists. It is rather tempting to define $c$ to be the second order derivative. But the following result suggests that it is better to call $2c$ the second order derivative.

Proposition 2.3.1. Suppose $f(x)$ is differentiable near $x_0$, and the derivative function $f'(x)$ has further derivative $f''(x_0)$ at $x_0$. Then $f(x)$ has a quadratic approximation at $x_0$, given by $f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2}(x - x_0)^2$.

Proof. Consider the difference (called remainder)

\[ R_2(x) = f(x) - f(x_0) - f'(x_0)(x - x_0) - \frac{f''(x_0)}{2}(x - x_0)^2 \]

between the function and the expected quadratic approximation. The difference satisfies

\[ R_2(x_0) = R_2'(x_0) = R_2''(x_0) = 0. \]

9 Hermann Minkowski, born 1864 in Alexotas (Russia, now Kaunas of Lithuania), died 1909 in Göttingen (Germany). Minkowski’s fundamental contribution to geometry provided the mathematical foundation of Einstein’s theory of relativity.


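By Cauchy’s mean value theorem, we have

\[ \frac{R_2(x)}{(x - x_0)^2} = \frac{R_2(x) - R_2(x_0)}{(x - x_0)^2 - (x_0 - x_0)^2} = \frac{R_2'(c)}{2(c - x_0)} = \frac{R_2'(c) - R_2'(x_0)}{2(c - x_0)}, \]

for some $c$ between $x_0$ and $x$. As $x \to x_0$, we have $c \to x_0$, so that

\[ \lim_{x\to x_0} \frac{R_2(x)}{(x - x_0)^2} = \lim_{c\to x_0} \frac{R_2'(c) - R_2'(x_0)}{2(c - x_0)} = \frac{R_2''(x_0)}{2} = 0. \]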

Because of the proposition, we define the second order differential to be

\[ d^2 f = 2c\,dx^2, \]

where $dx^2$ is indeed considered as the square of $dx$, at least symbolically. We have $d^2 f = f''(x_0)\,dx^2$ if the second order derivative exists at $x_0$.

Example 2.3.1. Suppose $f(x)$ has second order derivative at $x_0$. Then we have

\begin{align*}
f(x_0 + \Delta x) &= f(x_0) + f'(x_0)\Delta x + \frac{f''(x_0)}{2}\Delta x^2 + o(\Delta x^2), \\
f(x_0 + 2\Delta x) &= f(x_0) + 2f'(x_0)\Delta x + 2f''(x_0)\Delta x^2 + o(\Delta x^2).
\end{align*}

This implies

\[ f(x_0 + 2\Delta x) - 2f(x_0 + \Delta x) + f(x_0) = f''(x_0)\Delta x^2 + o(\Delta x^2), \]

and we get another way of expressing the second order derivative as a limit.

\[ f''(x_0) = \lim_{\Delta x\to 0} \frac{f(x_0 + 2\Delta x) - 2f(x_0 + \Delta x) + f(x_0)}{\Delta x^2}. \]
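The limit formula of Example 2.3.1 can be tried numerically. The sketch below is only an illustration: the helper name `second_difference`, the test function $e^x$ and the step sizes are choices made here, not taken from the text.

```python
import math

def second_difference(f, x0, h):
    """Quotient (f(x0 + 2h) - 2 f(x0 + h) + f(x0)) / h^2 from Example 2.3.1."""
    return (f(x0 + 2 * h) - 2 * f(x0 + h) + f(x0)) / (h * h)

# For f = exp at x0 = 0 the second order derivative is exp(0) = 1.
for h in (1e-1, 1e-2, 1e-3):
    print(h, second_difference(math.exp, 0.0, h))
```

The quotient approaches the second order derivative only at a linear rate in the step, and very small steps suffer from cancellation, so moderate step sizes are used.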

Exercise 2.3.1. Find suitable conditions among constants $a, b, \lambda, \mu$ so that $\lambda f(x_0 + a\Delta x) + \mu f(x_0 + b\Delta x) + f(x_0) = f''(x_0)\Delta x^2 + o(\Delta x^2)$ holds for twice differentiable functions. Then derive some formula for the second order derivative similar to the one in Example 2.3.1.

Exercise 2.3.2. Suppose $f''(0)$ exists and $f''(0) \ne 0$. Prove that in the mean value theorem $f(x) - f(0) = xf'(\theta x)$, we have $\lim_{x\to 0} \theta = \frac{1}{2}$. This generalizes the observation in Exercise 2.2.27.

Exercise 2.3.3. Suppose $f(x)$ has second order derivative at $x_0$. Let $h$ and $k$ be small, distinct and nonzero numbers. Find the quadratic function $q(x) = a + b\Delta x + c\Delta x^2$ satisfying

\[ q(x_0) = f(x_0), \quad q(x_0 + h) = f(x_0 + h), \quad q(x_0 + k) = f(x_0 + k). \]

Then prove that $\lim_{h,k\to 0} b = f'(x_0)$ and $\lim_{h,k\to 0} c = \frac{f''(x_0)}{2}$ as long as $\frac{h}{h - k}$ is kept bounded. This provides the geometrical interpretation of the quadratic approximation.

Exercise 2.3.4. Proposition 2.3.1 basically says that the existence of the second order derivative implies the second order differentiability. Show that the converse is not true (in contrast to what Proposition 2.1.3 says about the first order case) by considering the following functions at $x_0 = 0$.


1. $f(x) = \begin{cases} x^3 & \text{if } x \text{ is rational} \\ 0 & \text{if } x \text{ is irrational} \end{cases}$.

2. $f(x) = \begin{cases} x^3 \sin\frac{1}{x^2} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0 \end{cases}$.

Exercise 2.3.5. Determine the existence of the quadratic approximation and the existence of the second order derivative of functions in Exercise 2.1.5.

Exercise 2.3.6. Study the existence of the quadratic approximation and the existence of the second order derivative of the function $|x^3(x - 1)(x - 2)^2|$.

Exercise 2.3.7. Suppose $P(x)$ and $Q(x)$ are quadratic approximations of $f(x)$ and $g(x)$ at $x_0$.

1. Prove that $P(x) + Q(x)$ is the quadratic approximation of $f(x) + g(x)$ at $x_0$.

2. Prove that although $P(x)Q(x)$ has degree 4, the second order truncation of the product is the quadratic approximation of $f(x)g(x)$.

3. Suppose $f(x)$ and $g(x)$ have second order derivatives. What do the two conclusions tell you about the second order derivatives of $f(x) + g(x)$ and $f(x)g(x)$?

Exercise 2.3.8. Suppose $P(x)$ is the quadratic approximation of $f(x)$ at $x_0$. Suppose $Q(y)$ is the quadratic approximation of $g(y)$ at $y_0 = f(x_0)$.

1. Prove that the second order truncation of $Q(P(x))$ is the quadratic approximation of $g(f(x))$.

2. Suppose $f(x)$ and $g(y)$ have second order derivatives at $x_0$ and $y_0$. What can you say about the second order derivative of $g(f(x))$?

2.3.2 High Order Derivative

The quadratic approximation is computed by taking the derivative twice. Approximations by higher order polynomials are expected to be computed by taking the derivative many times.

Given a differentiable function $f(x)$, its derivative $f'(x)$ is also a function. If the function $f'(x)$ is also differentiable, then we have the second order derivative $f''(x)$. If the function $f''(x)$ is again differentiable, then we get the third order derivative $f'''(x)$. In general, the $n$-th order derivative of $f(x)$ is denoted $f^{(n)}(x)$ or $\frac{d^n f}{dx^n}$.

It is easy to show (by induction, for example), that the high order derivatives of the power, the exponential, and the logarithmic functions are

\begin{align*}
(x^\alpha)^{(n)} &= \alpha(\alpha - 1)\cdots(\alpha - n + 1)x^{\alpha - n}, \\
(e^x)^{(n)} &= e^x, \\
(\alpha^x)^{(n)} &= \alpha^x(\log\alpha)^n, \\
(\log|x|)^{(n)} &= (-1)^{n-1}(n - 1)!\,x^{-n}.
\end{align*}

Note that if $\alpha$ is a natural number, then $(x^\alpha)^{(n)} = 0$ for $n > \alpha$. The high order derivatives of the sine and the cosine functions have a periodic pattern.

\begin{align*}
\sin' x &= \cos x, & \sin'' x &= -\sin x, & \sin''' x &= -\cos x, & \sin'''' x &= \sin x, \dots \\
\cos' x &= -\sin x, & \cos'' x &= -\cos x, & \cos''' x &= \sin x, & \cos'''' x &= \cos x, \dots
\end{align*}


However, there is no clear pattern for the derivatives of the tangent function.

\begin{align*}
\tan' x &= \sec^2 x, \\
\tan'' x &= 2\sec^2 x \tan x, \\
\tan''' x &= 4\sec^2 x \tan^2 x + 2\sec^4 x = 6\sec^4 x - 4\sec^2 x, \\
\tan'''' x &= (24\sec^4 x - 8\sec^2 x)\tan x.
\end{align*}

The formulae for the derivatives of the sum and the scalar multiplication can be directly extended to the high order derivatives.

\[ (f + g)^{(n)} = f^{(n)} + g^{(n)}, \qquad (cf)^{(n)} = cf^{(n)}. \]

The Leibniz rule can also be extended.

\begin{align*}
(fg)' &= f'g + fg', \\
(fg)'' &= f''g + 2f'g' + fg'', \\
(fg)''' &= f'''g + 3f''g' + 3f'g'' + fg''', \\
(fg)'''' &= f''''g + 4f'''g' + 6f''g'' + 4f'g''' + fg''''.
\end{align*}

By induction, it is not hard to show that the coefficients in the extended Leibniz rule are the same as the ones in the binomial expansion.

\[ (fg)^{(n)} = f^{(n)}g + nf^{(n-1)}g' + \frac{n(n - 1)}{2}f^{(n-2)}g'' + \cdots + fg^{(n)}. \]
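The extended Leibniz rule can be checked exactly on polynomials, where derivatives and products are finite computations on coefficient lists. The following is a minimal sketch under that assumption; the helper names (`deriv`, `nth_deriv`, `mul`, `add`, `trim`, `leibniz`) and the two sample polynomials are hypothetical and chosen only for illustration.

```python
from math import comb

def deriv(p):
    """Derivative of a polynomial given by coefficients [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(p)][1:] or [0]

def nth_deriv(p, n):
    """Apply deriv n times."""
    for _ in range(n):
        p = deriv(p)
    return p

def mul(p, q):
    """Product of two coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def add(p, q):
    """Sum of two coefficient lists of possibly different lengths."""
    m = max(len(p), len(q))
    p = p + [0] * (m - len(p))
    q = q + [0] * (m - len(q))
    return [a + b for a, b in zip(p, q)]

def trim(p):
    """Drop trailing zero coefficients so equal polynomials compare equal."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def leibniz(f, g, n):
    """Right side of the extended Leibniz rule: sum of C(n, k) f^(n-k) g^(k)."""
    total = [0]
    for k in range(n + 1):
        term = mul(nth_deriv(f, n - k), nth_deriv(g, k))
        total = add(total, [comb(n, k) * c for c in term])
    return trim(total)

f = [1, 2, 0, 3]      # 1 + 2x + 3x^3
g = [0, -1, 4, 0, 5]  # -x + 4x^2 + 5x^4
for n in range(4):
    print(n, leibniz(f, g, n) == trim(nth_deriv(mul(f, g), n)))
```

Integer coefficients keep every step exact, so the comparison is a genuine equality test rather than a floating point approximation.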

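There is no clean extension of the chain rule. The following is the chain rule for the second order derivative of $z = g(y) = g(f(x))$.

\begin{align*}
\frac{d^2 z}{dx^2}
&= \frac{d}{dx}\left( \frac{dz}{dx} \right) && \text{(definition of } f'') \\
&= \frac{d}{dx}\left( \frac{dz}{dy}\frac{dy}{dx} \right) && \text{(chain rule)} \\
&= \frac{d}{dx}\left( \frac{dz}{dy} \right)\frac{dy}{dx} + \frac{dz}{dy}\frac{d}{dx}\left( \frac{dy}{dx} \right) && \text{(Leibniz rule)} \\
&= \frac{d}{dy}\left( \frac{dz}{dy} \right)\frac{dy}{dx}\frac{dy}{dx} + \frac{dz}{dy}\frac{d^2 y}{dx^2} && \text{(chain rule and definition of } f'') \\
&= \frac{d^2 z}{dy^2}\left( \frac{dy}{dx} \right)^2 + \frac{dz}{dy}\frac{d^2 y}{dx^2}. && \text{(definition of } f'')
\end{align*}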
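To see the second order chain rule at work, the sketch below compares it with a central difference quotient for one concrete composite. The choices $g(y) = \sin y$ and $f(x) = x^2$, the sample point and the step size are assumptions made here for illustration only.

```python
import math

def z(x):
    """Composite z = g(f(x)) with the illustrative choices g(y) = sin y, f(x) = x^2."""
    return math.sin(x * x)

def d2z_by_chain_rule(x):
    """Second order chain rule: z'' = g''(y) (y')^2 + g'(y) y''."""
    y = x * x
    dy_dx, d2y_dx2 = 2.0 * x, 2.0
    dz_dy, d2z_dy2 = math.cos(y), -math.sin(y)
    return d2z_dy2 * dy_dx ** 2 + dz_dy * d2y_dx2

def d2z_by_central_difference(x, h=1e-4):
    """Numerical second derivative (z(x+h) - 2 z(x) + z(x-h)) / h^2."""
    return (z(x + h) - 2.0 * z(x) + z(x - h)) / (h * h)

x = 0.7
print(d2z_by_chain_rule(x), d2z_by_central_difference(x))
```

The two printed numbers should agree to several digits; the remaining gap comes from the truncation and rounding error of the difference quotient.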

If the $n$-th order derivative exists, then the $n$-th order differential is

\[ d^n f = f^{(n)}(x)\,dx^n, \]

where $dx^n$ is symbolically the $n$-th power of $dx$. The high order differential can be generally computed by $d^n f = d(d^{n-1}f)$, $d(dx^n) = 0$ for the variable $x$, and the Leibniz rule $d(\alpha\beta) = (d\alpha)\beta + \alpha\,d\beta$. For example,

\[ d^2 f = d(df) = d(f'dx) = (df')dx + f'd(dx) = (f''dx)dx = f''dx^2, \]

and

\[ d^3 f = d(d^2 f) = d(f''dx^2) = (df'')dx^2 + f''d(dx^2) = (f'''dx)dx^2 = f'''dx^3. \]


Example 2.3.2. Let

\[ f(x) = \begin{cases} x^2 & \text{if } x \ge 0 \\ -x^2 & \text{if } x < 0 \end{cases}. \]

Then

\[ f'(x) = \begin{cases} 2x & \text{if } x \ge 0 \\ -2x & \text{if } x < 0 \end{cases} = 2|x|. \]

By Example 2.1.8, the function $f'(x)$ is not differentiable at 0. Therefore $f(x)$ has derivative at 0 only up to the first order, and has all the high order derivatives at any $x \ne 0$.

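Example 2.3.3. We have $\left( \sqrt{1 + \sin^2 x} \right)' = \frac{\sin x\cos x}{\sqrt{1 + \sin^2 x}}$ from Example 2.1.11. Then

\begin{align*}
\left( \sqrt{1 + \sin^2 x} \right)''
&= \frac{(\sin x\cos x)'\sqrt{1 + \sin^2 x} - \sin x\cos x\left( \sqrt{1 + \sin^2 x} \right)'}{\left( \sqrt{1 + \sin^2 x} \right)^2} \\
&= \frac{(\cos^2 x - \sin^2 x)\sqrt{1 + \sin^2 x} - \sin x\cos x\,\frac{\sin x\cos x}{\sqrt{1 + \sin^2 x}}}{1 + \sin^2 x} \\
&= \frac{(1 - 2\sin^2 x)(1 + \sin^2 x) - \sin^2 x(1 - \sin^2 x)}{\left( 1 + \sin^2 x \right)^{\frac{3}{2}}} \\
&= \frac{1 - 2\sin^2 x - \sin^4 x}{\left( 1 + \sin^2 x \right)^{\frac{3}{2}}}.
\end{align*}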

Example 2.3.4. By $(x^3)^{(n)} = 0$ for $n > 3$ and the Leibniz rule,

\begin{align*}
(x^3 e^x)^{(n)}
&= x^3(e^x)^{(n)} + n(x^3)'(e^x)^{(n-1)} + \frac{n(n - 1)}{2}(x^3)''(e^x)^{(n-2)} + \frac{n(n - 1)(n - 2)}{6}(x^3)'''(e^x)^{(n-3)} \\
&= (x^3 + 3nx^2 + 3n(n - 1)x + n(n - 1)(n - 2))e^x.
\end{align*}
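The closed form in Example 2.3.4 can be confirmed by repeated differentiation, using the fact that $(p(x)e^x)' = (p(x) + p'(x))e^x$ for a polynomial $p$. The sketch below tracks only the polynomial factor; the function names are hypothetical and the coefficient lists are ordered from the constant term upward.

```python
def deriv_poly_times_exp(p):
    """Given p = [c0, c1, ...], return q = p + p', so that (p e^x)' = q e^x."""
    dp = [k * c for k, c in enumerate(p)][1:] + [0]
    return [a + b for a, b in zip(p, dp)]

def nth_derivative_coeffs(n):
    """Polynomial factor of the n-th derivative of x^3 e^x."""
    p = [0, 0, 0, 1]  # x^3, lowest degree first
    for _ in range(n):
        p = deriv_poly_times_exp(p)
    return p

def closed_form(n):
    """Coefficients of x^3 + 3n x^2 + 3n(n-1) x + n(n-1)(n-2), lowest degree first."""
    return [n * (n - 1) * (n - 2), 3 * n * (n - 1), 3 * n, 1]

for n in range(5):
    print(n, nth_derivative_coeffs(n), closed_form(n))
```

For each $n$ the two printed lists should coincide, which is the statement of the example written in coefficients.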

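Example 2.3.5. From the formula for the high order derivatives of $x^\alpha$, it is easy to deduce

\[ ((ax + b)^\alpha)^{(n)} = \alpha(\alpha - 1)\cdots(\alpha - n + 1)a^n(ax + b)^{\alpha - n}. \]

Then by $\frac{1 + x}{\sqrt{1 - x}} = 2(1 - x)^{-\frac{1}{2}} - (1 - x)^{\frac{1}{2}}$, we get

\begin{align*}
\left( \frac{1 + x}{\sqrt{1 - x}} \right)^{(n)}
&= 2\cdot\frac{1}{2}\cdot\frac{3}{2}\cdots\frac{2n - 1}{2}(1 - x)^{-\frac{2n+1}{2}} - \left( -\frac{1}{2} \right)\frac{1}{2}\cdots\frac{2n - 3}{2}(1 - x)^{-\frac{2n-1}{2}} \\
&= \frac{1\cdot 3\cdots(2n - 3)}{2^n}(1 - x)^{-\frac{2n+1}{2}}(4n - 1 - x).
\end{align*}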

Example 2.3.6. In Example 2.1.12, the derivative $\frac{dy}{dx}$ for the ellipse $x^2 + \frac{y^2}{4} = 1$ was computed in three ways. Continuing the three ways, the second order derivative can also be computed.


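In the first way, the upper part $y = 2\sqrt{1 - x^2}$ has the first order derivative

\[ \frac{dy}{dx} = -\frac{2x}{\sqrt{1 - x^2}}, \]

and the second order derivative

\begin{align*}
\frac{d^2 y}{dx^2}
&= -\frac{(2x)'\sqrt{1 - x^2} - 2x(\sqrt{1 - x^2})'}{(\sqrt{1 - x^2})^2}
= -\frac{2\sqrt{1 - x^2} + 2x\,\frac{x}{\sqrt{1 - x^2}}}{1 - x^2} \\
&= -\frac{2(1 - x^2) + 2x^2}{(1 - x^2)^{\frac{3}{2}}}
= -\frac{2}{(1 - x^2)^{\frac{3}{2}}}.
\end{align*}

In the second way, $y$ is implicitly considered as a function of $x$. Taking $\frac{d}{dx}$ on both sides of the equation $x^2 + \frac{y^2}{4} = 1$ and using the chain rule, we get

\[ 2x + \frac{1}{2}y\frac{dy}{dx} = 0. \]

Taking $\frac{d}{dx}$ again, we get

\[ 2 + \frac{1}{2}\left( \frac{dy}{dx} \right)^2 + \frac{1}{2}y\frac{d^2 y}{dx^2} = 0. \]

In Example 2.1.12, the first equation was solved to give $\frac{dy}{dx} = -\frac{4x}{y}$. Substituting this into the second equation and solving for the second order derivative, we get

\[ \frac{d^2 y}{dx^2} = -\frac{16\left( x^2 + \frac{y^2}{4} \right)}{y^3} = -\frac{16}{y^3}. \]

In the third way, the ellipse is parametrized by $x = \cos t$, $y = 2\sin t$. The first order derivative $\frac{dy}{dx} = -2\cot t$ was computed by the chain rule in Example 2.1.12. Continuing with the same idea, we get

\[ \frac{d^2 y}{dx^2} = \frac{d(-2\cot t)}{dx} = \frac{\frac{d(-2\cot t)}{dt}}{\frac{dx}{dt}} = \frac{2\csc^2 t}{-\sin t} = -\frac{2}{\sin^3 t}. \]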
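The three formulas of Example 2.3.6 describe the same quantity on the upper half of the ellipse, which can be confirmed numerically at a sample point. In the sketch below the function names and the parameter value $t = 1$ are illustrative choices; any $0 < t < \pi$ should behave the same way.

```python
import math

def d2y_explicit(x):
    """First way: -2 / (1 - x^2)^(3/2), valid on the upper half y = 2 sqrt(1 - x^2)."""
    return -2.0 / (1.0 - x * x) ** 1.5

def d2y_implicit(y):
    """Second way: -16 / y^3."""
    return -16.0 / y ** 3

def d2y_parametric(t):
    """Third way: -2 / sin^3 t, with x = cos t, y = 2 sin t."""
    return -2.0 / math.sin(t) ** 3

t = 1.0
x, y = math.cos(t), 2.0 * math.sin(t)
print(d2y_explicit(x), d2y_implicit(y), d2y_parametric(t))
```

All three printed values should agree up to rounding, since $1 - x^2 = \sin^2 t$ and $y^3 = 8\sin^3 t$ on this part of the curve.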

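Example 2.3.7. Let $p(x)$ be a polynomial and

\[ f(x) = \begin{cases} p\left( \frac{1}{x} \right)e^{-\frac{1}{x^2}} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0 \end{cases}. \]

At $x \ne 0$, we have

\[ f'(x) = \left[ -p'\left( \frac{1}{x} \right)\frac{1}{x^2} + p\left( \frac{1}{x} \right)\frac{2}{x^3} \right]e^{-\frac{1}{x^2}} = q\left( \frac{1}{x} \right)e^{-\frac{1}{x^2}}, \]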


where $q(x) = -p'(x)x^2 + 2p(x)x^3$ is also a polynomial. Moreover, by

\[ \lim_{x\to 0} x^k e^{-\frac{1}{x^2}} = \lim_{x\to \infty} x^{-k}e^{-x^2} = \lim_{x\to +\infty} \frac{x^{-\frac{k}{2}}}{e^x} = 0 \]

for any $k$, we have

\[ f'(0) = \lim_{x\to 0} \frac{f(x)}{x} = \lim_{x\to 0} \frac{1}{x}p\left( \frac{1}{x} \right)e^{-\frac{1}{x^2}} = 0. \]

Therefore $f'(x)$ is of the same type as $f(x)$, with another polynomial $q(x)$ in place of $p(x)$. In particular, we conclude that $f(x)$ has derivatives of all orders, and $f^{(n)}(0) = 0$ for any $n$.

Exercise 2.3.9. Compute the derivatives up to the third order.

1. $\frac{1}{x^2 + 1}$.

2. $\frac{2x^2 + 1}{x^2 + 1}$.

3. $\cot x$.

4. $\sec x$.

5. $\csc x$.

6. $\arcsin x$.

7. $\arctan x$.

8. $\operatorname{arcsec} x$.

9. $(\sin 2x + \cos 3x)^7$.

10. $2^{\sin 3x}$.

11. $e^{2x}(\cos x - 2\sin x)$.

12. $\arcsin\frac{1}{x}$.

13. $x(\log x - 1)$.

14. $\log(\log x)$.

15. $\log(x + \sqrt{1 + x^2})$.

16. $\log|\sin x|$.

Exercise 2.3.10. Compute all the high order derivatives.

1. $\log(2 - 3x)$.

2. $\frac{x^2 + x + 1}{4x^2 - 1}$.

3. $\frac{x}{\sqrt{2x + 1}}$.

4. $x^4\log x$.

Exercise 2.3.11. Prove $\frac{d^n(f(ax + b))}{dx^n} = a^n f^{(n)}(ax + b)$.

Exercise 2.3.12. Suppose $u = u(x)$ and $v = v(x)$ are differentiable functions satisfying $u(0) = 1$, $v(0) = -1$ and the given equations. Compute the second order derivatives of $u$ and $v$ at 0.

1. $u^2 + uv + v^2 = 1$, $(1 + x^2)u + (1 - x^2)v = x$.

2. $xu + (x + 1)v = e^x$, $e^u + xe^{-v} = e^{x+1}$.

Exercise 2.3.13. Suppose $y$ is a function of $x$ satisfying the given equation. Compute the second order derivative $\frac{d^2 y}{dx^2}$ at the given point.

1. $y^3 + 2xy^2 - 2x^3 = 1$, at $x = 1$, $y = 1$.

2. $x\sin(x - y) = y\cos(x + y)$, at $x = \pi$, $y = 0$.


3. $e^y = xy$, at $x = -e^{-1}$, $y = -1$.

4. $(1 + y)e^x + (1 + x)e^y = xy + 2$, at $x = 0$, $y = 0$.

Exercise 2.3.14. For the parametrized curves, compute the second order derivative $\frac{d^2 y}{dx^2}$ at the given point.

1. Cycloid: $x = t - \sin t$, $y = 1 - \cos t$, at $t = \frac{\pi}{4}$.

2. Spiral: $x = t\cos t$, $y = t\sin t$, at $t = \pi$.

3. Involute of circle: $x = \cos t + t\sin t$, $y = \sin t - t\cos t$, at $t = \pi$.

4. Four-leaved rose: $x = \cos 2t\cos t$, $y = \cos 2t\sin t$, at $t = \frac{\pi}{4}$.

Exercise 2.3.15. Find the formulae for the derivatives of quotient, composition and inverse functions up to the third order.

Exercise 2.3.16. Prove the functions $y = \arctan x$ and $y = (\arcsin x)^2$ satisfy the equations $(1 + x^2)y'' + 2xy' = 0$ and $(1 - x^2)y'' - xy' = 2$. Then use the equations to compute all the high order derivatives of $\arctan x$ and $(\arcsin x)^2$ at 0.

Exercise 2.3.17. Suppose $x = x(t)$ and $y = y(t)$ is a parametrized curve. Prove that

\[ \frac{d^2 y}{dx^2} = \frac{y''x' - y'x''}{x'^3}, \]

where $x'$, $x''$, $y'$, $y''$ are the derivatives with respect to $t$. Also find the formula for the third order derivative.

2.3.3 Taylor Expansion

Using high order derivatives, the discussion on the quadratic approximations can be extended to high order approximations.

Theorem 2.3.2. Suppose $f(x)$ has the $n$-th order derivative $f^{(n)}(x_0)$ at $x_0$. Then for the $n$-th degree polynomial

\[ T_n(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2}(x - x_0)^2 + \cdots + \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n, \tag{2.3.3} \]

we have

\[ \lim_{x\to x_0} \frac{f(x) - T_n(x)}{(x - x_0)^n} = 0. \]

Note that the existence of $f^{(n)}(x_0)$ implicitly assumes the existence of $f^{(k)}(x)$ for all $k < n$ and all $x$ near $x_0$. The function $T_n(x)$ is a polynomial of degree $n$ characterized by the property

\[ f(x) = T_n(x) + o(\Delta x^n), \qquad \Delta x = x - x_0. \tag{2.3.4} \]

Because of the property, we say $f(x)$ is $n$-th order differentiable. The polynomial $T_n(x)$ is called the $n$-th order Taylor expansion.


Proof of Theorem 2.3.2. The theorem can be proved in the same way as Proposition 2.3.1. The remainder $R_n(x) = f(x) - T_n(x)$ satisfies

\[ R_n(x_0) = R_n'(x_0) = R_n''(x_0) = \cdots = R_n^{(n)}(x_0) = 0. \]

Therefore by Cauchy’s mean value theorem,

\begin{align*}
\frac{R_n(x)}{(x - x_0)^n}
&= \frac{R_n(x) - R_n(x_0)}{(x - x_0)^n - (x_0 - x_0)^n} \\
&= \frac{R_n'(c_1)}{n(c_1 - x_0)^{n-1}} = \frac{R_n'(c_1) - R_n'(x_0)}{n((c_1 - x_0)^{n-1} - (x_0 - x_0)^{n-1})} = \cdots \\
&= \frac{R_n^{(n-1)}(c_{n-1})}{n(n - 1)\cdots 2\,(c_{n-1} - x_0)} \tag{2.3.5}
\end{align*}

for some $c_1$ between $x_0$ and $x$, $c_2$ between $x_0$ and $c_1$, \dots, and $c_{n-1}$ between $x_0$ and $c_{n-2}$. Then we have

\[ \lim_{x\to x_0} \frac{R_n(x)}{(x - x_0)^n} = \lim_{c_{n-1}\to x_0} \frac{R_n^{(n-1)}(c_{n-1}) - R_n^{(n-1)}(x_0)}{n!\,(c_{n-1} - x_0)} = \frac{R_n^{(n)}(x_0)}{n!} = 0. \]

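The computation of the high order derivatives in Section 2.3.2 immediately gives the following Taylor expansions at 0.

\begin{align*}
(1 + x)^\alpha &= 1 + \alpha x + \frac{\alpha(\alpha - 1)}{2!}x^2 + \frac{\alpha(\alpha - 1)(\alpha - 2)}{3!}x^3 + \cdots + \frac{\alpha(\alpha - 1)\cdots(\alpha - n + 1)}{n!}x^n + o(x^n), \\
\frac{1}{1 - x} &= 1 + x + x^2 + x^3 + \cdots + x^n + o(x^n), \\
e^x &= 1 + x + \frac{1}{2!}x^2 + \frac{1}{3!}x^3 + \cdots + \frac{1}{n!}x^n + o(x^n), \\
\log(1 + x) &= x - \frac{1}{2}x^2 + \frac{1}{3}x^3 + \cdots + \frac{(-1)^{n-1}}{n}x^n + o(x^n), \\
\sin x &= x - \frac{1}{3!}x^3 + \frac{1}{5!}x^5 + \cdots + \frac{(-1)^{n+1}}{(2n - 1)!}x^{2n-1} + o(x^{2n}), \\
\cos x &= 1 - \frac{1}{2!}x^2 + \frac{1}{4!}x^4 + \cdots + \frac{(-1)^n}{(2n)!}x^{2n} + o(x^{2n+1}).
\end{align*}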
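The meaning of the remainder term $o(x^n)$ can be observed numerically. The following sketch does this for $e^x$ with $n = 3$; the helper names and the sample points are assumptions of this illustration, not part of the text.

```python
import math

def taylor_exp(x, n):
    """n-th order Taylor polynomial of e^x at 0."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))

def scaled_remainder(x, n):
    """(e^x - T_n(x)) / x^n, which should tend to 0 as x -> 0."""
    return (math.exp(x) - taylor_exp(x, n)) / x ** n

for x in (0.5, 0.1, 0.02):
    print(x, scaled_remainder(x, 3))
```

The printed ratios shrink roughly in proportion to $x$, which is what the property (2.3.4) predicts for a function with a fourth order derivative.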

Example 2.3.8. By rewriting the function $x^4$ as a polynomial in $(x - 1)$

\[ x^4 = (1 + (x - 1))^4 = 1 + 4(x - 1) + 6(x - 1)^2 + 4(x - 1)^3 + (x - 1)^4, \]

and using the characterization (2.3.4), we get the Taylor expansion of various orders of $x^4$ at 1.

\begin{align*}
T_1(x) &= 1 + 4(x - 1), \\
T_2(x) &= 1 + 4(x - 1) + 6(x - 1)^2, \\
T_3(x) &= 1 + 4(x - 1) + 6(x - 1)^2 + 4(x - 1)^3, \\
T_n(x) &= 1 + 4(x - 1) + 6(x - 1)^2 + 4(x - 1)^3 + (x - 1)^4, \quad\text{for } n \ge 4.
\end{align*}


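Example 2.3.9. To find the Taylor expansion of $\log x$ at 2, we use the Taylor expansion of $\log(1 + x)$ at 0 to get

\begin{align*}
\log x &= \log 2 + \log\left( 1 + \frac{x - 2}{2} \right) \\
&= \log 2 + \frac{x - 2}{2} - \frac{1}{2}\frac{(x - 2)^2}{2^2} + \frac{1}{3}\frac{(x - 2)^3}{2^3} + \cdots + \frac{(-1)^{n-1}}{n}\frac{(x - 2)^n}{2^n} + o((x - 2)^n).
\end{align*}

Thus the Taylor expansion of $\log x$ at 2 is

\[ T_n(x) = \log 2 + \frac{1}{2}(x - 2) - \frac{1}{8}(x - 2)^2 + \frac{1}{24}(x - 2)^3 + \cdots + \frac{(-1)^{n-1}}{n2^n}(x - 2)^n. \]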
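A short numerical comparison shows how well the polynomial of Example 2.3.9 tracks $\log x$ near 2. The function name `taylor_log_at_2` and the sample points are hypothetical choices for this sketch.

```python
import math

def taylor_log_at_2(x, n):
    """T_n(x) = log 2 + sum over k = 1..n of (-1)^(k-1) (x - 2)^k / (k 2^k)."""
    return math.log(2.0) + sum(
        (-1) ** (k - 1) * (x - 2.0) ** k / (k * 2 ** k) for k in range(1, n + 1)
    )

for x in (2.5, 2.1, 2.01):
    print(x, math.log(x), taylor_log_at_2(x, 3))
```

The gap between the two columns shrinks quickly as $x$ moves toward 2, in line with the $o((x - 2)^3)$ remainder.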

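Example 2.3.10. The Taylor expansion of $e^x\sin x$ at 0 can be obtained by multiplying the Taylor expansions of $e^x$ and $\sin x$ together. In particular, to get the 5-th order Taylor expansion, we have

\begin{align*}
e^x\sin x
&= \left( 1 + x + \frac{1}{2!}x^2 + \frac{1}{3!}x^3 + \frac{1}{4!}x^4 + o(x^4) \right)\left( x - \frac{1}{3!}x^3 + \frac{1}{5!}x^5 + o(x^6) \right) \\
&= \left( 1 + x + \frac{1}{2!}x^2 + \frac{1}{3!}x^3 + \frac{1}{4!}x^4 \right)\left( x - \frac{1}{3!}x^3 + \frac{1}{5!}x^5 \right) + o(x^5) \\
&= x + x^2 + \frac{1}{2!}x^3 + \frac{1}{3!}x^4 + \frac{1}{4!}x^5 - \frac{1}{3!}x^3 - \frac{1}{3!}x^4 - \frac{1}{2!\cdot 3!}x^5 + \frac{1}{5!}x^5 + o(x^5).
\end{align*}

The second equality uses $x^m o(x^n) = o(x^{m+n})$. The third equality uses $x^m = o(x^n)$ for $m > n$. Thus the 5-th order Taylor expansion of $e^x\sin x$ at 0 is

\[ T_5(x) = x + x^2 + \frac{1}{3}x^3 - \frac{1}{30}x^5. \]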
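The multiplication in Example 2.3.10 is a truncated product of coefficient lists, which can be carried out exactly with rational arithmetic. The sketch below assumes nothing beyond the two expansions quoted in the text; the names `truncated_product`, `exp_coeffs` and `sin_coeffs` are illustrative.

```python
from fractions import Fraction
from math import factorial

N = 5  # keep terms up to x^5

def truncated_product(a, b, order):
    """Multiply two coefficient lists and discard every term above x^order."""
    out = [Fraction(0)] * (order + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= order:
                out[i + j] += ai * bj
    return out

# e^x: coefficient of x^k is 1/k!.
exp_coeffs = [Fraction(1, factorial(k)) for k in range(N + 1)]
# sin x: coefficient of x^k is 0 for even k and (-1)^((k-1)/2) / k! for odd k.
sin_coeffs = [
    Fraction(0) if k % 2 == 0 else Fraction((-1) ** ((k - 1) // 2), factorial(k))
    for k in range(N + 1)
]

product = truncated_product(exp_coeffs, sin_coeffs, N)
print([str(c) for c in product])
```

Reading the result from the constant term upward reproduces the coefficients of $T_5(x)$, including the vanishing $x^4$ term.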

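Example 2.3.11. To find the 5-th order Taylor expansion of $\sec x$ at 0, we use the Taylor expansions of $\cos x$ and $(1 - x)^{-1}$ at 0 to get

\begin{align*}
\sec x = \frac{1}{\cos x}
&= \left( 1 - \frac{1}{2!}x^2 + \frac{1}{4!}x^4 + o(x^5) \right)^{-1} \\
&= 1 + \left( \frac{1}{2!}x^2 - \frac{1}{4!}x^4 + o(x^5) \right) + \left( \frac{1}{2!}x^2 - \frac{1}{4!}x^4 + o(x^5) \right)^2 \\
&\quad + \left( \frac{1}{2!}x^2 - \frac{1}{4!}x^4 + o(x^5) \right)^3 + o\left( \left( \frac{1}{2!}x^2 - \frac{1}{4!}x^4 + o(x^5) \right)^3 \right) \\
&= 1 + \frac{1}{2!}x^2 - \frac{1}{4!}x^4 + \left( \frac{1}{2!}x^2 \right)^2 + o(x^5).
\end{align*}

Thus the Taylor expansion of $\sec x$ at 0 is

\[ T_5(x) = 1 + \frac{1}{2}x^2 + \frac{5}{24}x^4. \]

As a consequence of the Taylor expansion, we get a limit

\begin{align*}
\lim_{x\to 0} \frac{\cos x + \sec x - 2}{(e^{x^2} - 1)^2}
&= \lim_{x\to 0} \frac{1 - \frac{1}{2!}x^2 + \frac{1}{4!}x^4 + 1 + \frac{1}{2}x^2 + \frac{5}{24}x^4 - 2 + o(x^5)}{(1 + x^2 + o(x^3) - 1)^2} \\
&= \lim_{x\to 0} \frac{\frac{1}{4}x^4 + o(x^5)}{x^4 + o(x^5)} = \frac{1}{4}.
\end{align*}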
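Both parts of Example 2.3.11 can be cross-checked by machine. The sketch below inverts the truncated cosine series with exact fractions, using the standard recurrence for the reciprocal of a power series rather than the geometric series substitution of the text, and then samples the quotient whose limit is computed. All names are hypothetical.

```python
from fractions import Fraction
import math

def reciprocal_series(a, order):
    """Coefficients b[0..order] with (sum a_k x^k)(sum b_k x^k) = 1 up to x^order; needs a[0] != 0."""
    b = [Fraction(1) / a[0]]
    for n in range(1, order + 1):
        s = sum(a[k] * b[n - k] for k in range(1, n + 1) if k < len(a))
        b.append(-s / a[0])
    return b

# cos x up to x^5: 1 - x^2/2 + x^4/24.
cos_coeffs = [Fraction(1), Fraction(0), Fraction(-1, 2), Fraction(0), Fraction(1, 24), Fraction(0)]
sec_coeffs = reciprocal_series(cos_coeffs, 5)
print([str(c) for c in sec_coeffs])

def quotient(x):
    """(cos x + sec x - 2) / (e^(x^2) - 1)^2, using expm1 for accuracy near 0."""
    return (math.cos(x) + 1.0 / math.cos(x) - 2.0) / math.expm1(x * x) ** 2

for x in (0.5, 0.2, 0.1):
    print(x, quotient(x))
```

The series coefficients match $T_5(x)$, and the sampled quotients move toward $\frac{1}{4}$ as $x$ decreases.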


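Example 2.3.12. Here is one more example of using Taylor expansions to compute the limit.

\begin{align*}
\lim_{x\to 0} \left( \frac{1}{x} - \frac{1}{\sin x} \right)
&= \lim_{x\to 0} \left( \frac{1}{x} - \frac{1}{x - \frac{1}{3!}x^3 + o(x^4)} \right)
= \lim_{x\to 0} \frac{1}{x}\left( 1 - \frac{1}{1 - \frac{1}{3!}x^2 + o(x^3)} \right) \\
&= \lim_{x\to 0} \frac{1}{x}\left( 1 - 1 - \frac{1}{3!}x^2 + o(x^2) \right)
= \lim_{x\to 0} \left( -\frac{1}{6}x + o(x) \right) = 0.
\end{align*}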

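Example 2.3.13. The derivative of $f(x) = \arcsin x$ is

\begin{align*}
g(x) = f'(x) = (1 - x^2)^{-\frac{1}{2}}
&= 1 + \frac{1}{2}x^2 + \cdots + \frac{1}{n!}\frac{1\cdot 3\cdots(2n - 1)}{2^n}x^{2n} + o(x^{2n}) \\
&= 1 + \frac{1}{2}x^2 + \cdots + \frac{(2n)!}{2^{2n}(n!)^2}x^{2n} + o(x^{2n}).
\end{align*}

This tells us

\[ \frac{f^{(2n)}(0)}{(2n - 1)!} = \frac{g^{(2n-1)}(0)}{(2n - 1)!} = 0, \qquad \frac{f^{(2n+1)}(0)}{(2n)!} = \frac{g^{(2n)}(0)}{(2n)!} = \frac{(2n)!}{2^{2n}(n!)^2}. \]

Thus we get

\[ f^{(2n)}(0) = 0, \qquad f^{(2n+1)}(0) = (2n)!\,\frac{(2n)!}{2^{2n}(n!)^2} = \left( \frac{(2n)!}{2^n n!} \right)^2. \]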
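The closed form for the odd order derivatives of $\arcsin x$ at 0 can be compared with the coefficients obtained by integrating the binomial series of $(1 - x^2)^{-\frac{1}{2}}$ term by term. This is a small exact check with hypothetical function names, not part of the original argument.

```python
from fractions import Fraction
from math import factorial

def arcsin_odd_derivative(n):
    """f^(2n+1)(0) for f = arcsin, from the closed form ((2n)! / (2^n n!))^2."""
    return (factorial(2 * n) // (2 ** n * factorial(n))) ** 2

def arcsin_coefficient(n):
    """Coefficient of x^(2n+1) in arcsin x: the x^(2n) coefficient of (1 - x^2)^(-1/2) divided by 2n + 1."""
    return Fraction(factorial(2 * n), 4 ** n * factorial(n) ** 2) / (2 * n + 1)

for n in range(5):
    print(n, arcsin_odd_derivative(n), arcsin_coefficient(n) * factorial(2 * n + 1))
```

The two columns agree for each $n$, since multiplying the coefficient of $x^{2n+1}$ by $(2n + 1)!$ recovers the derivative.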

Example 2.3.14 (Cauchy). From Example 2.3.7, we know the derivatives of the function

\[ f(x) = \begin{cases} e^{-\frac{1}{x^2}} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0 \end{cases} \]

at 0 vanish at any order. Thus the Taylor expansion of the function at 0 is 0 at any order.

Exercise 2.3.18. Find the Taylor expansions.

1. $x^3 + 5x - 1$ at $-1$.

2. $x^3 + 5x - 1$ at 0.

3. $\alpha^x$ at 1.

4. $x^\alpha$ at 1.

5. $\sin 2x$ at $\frac{\pi}{4}$.

6. $\log\frac{3 + x^2}{2 + x}$ at 0.

Exercise 2.3.19. Find the 5-th order Taylor expansions at 0.

1. $\sqrt{x + 1}\,e^{x^2}\tan x$.

2. $(1 + x)^x$.

3. $\arcsin x$.

4. $\log\frac{\sin x}{x}$.

5. $\log\cos x$.

6. $\frac{x}{e^x - 1}$.

Exercise 2.3.20. Use Taylor expansions to compute limits.

1. $\lim_{x\to 0} \frac{x - \tan x}{x - \sin x}$.

2. $\lim_{x\to 0} \left( \frac{1}{x^2} - \frac{1}{\tan^2 x} \right)$.

3. $\lim_{x\to 0} \left( \frac{\sin x}{x} \right)^{\frac{1}{\log(1 - x^2)}}$.

4. $\lim_{x\to 0} (\cos x + \sin x)^{\frac{1}{x(x+1)}}$.

5. $\lim_{x\to 0} (\cos x)^{\frac{1}{x^2}}$.

6. $\lim_{x\to \infty} x\left( e - \left( 1 + \frac{1}{x} \right)^x \right)$.

7. $\lim_{x\to 0} \frac{xe^x - \log(1 + x)}{x^2}$.

8. $\lim_{x\to \infty} x^2\log\left( x\sin\frac{1}{x} \right)$.

9. $\lim_{x\to 1} \frac{(x - 1)\log x}{1 + \cos\pi x}$.

10. $\lim_{x\to 1} \left( \frac{1}{x - 1} - \frac{1}{\log x} \right)$.

11. $\lim_{x\to 0} \left( \frac{1}{x} - \cot x \right)$.

12. $\lim_{x\to 0} \frac{(1 + 2x + x^2)^{\frac{1}{x}} - (1 + 2x - x^2)^{\frac{1}{x}}}{x}$.

Exercise 2.3.21. Find the high order derivatives of $\arctan x$ at 0.

Exercise 2.3.22. Suppose $f(x)$ has second order derivative at 0 and satisfies

\[
\lim_{x\to 0}\left(1 + x + \frac{f(x)}{x}\right)^{\frac{1}{x}} = e^{\lambda}.
\]

1. Find $f(0)$, $f'(0)$ and $f''(0)$.

2. Find $\lim_{x\to 0}\left(1 + \frac{f(x)}{x}\right)^{\frac{1}{x}}$.

Exercise 2.3.23. Suppose $f(x)$ has derivatives of any order. How are the high order derivatives of $f(x)$ and $f(x^2)$ at 0 related?

Exercise 2.3.24. Prove that the Taylor expansion at 0 of an odd function contains only terms of odd power. What about an even function?

Exercise 2.3.25. Suppose $f(x)$ and $g(x)$ are approximated by linear functions $a_0 + a_1\Delta x$ and $b_0 + b_1\Delta x$ at $x_0$. Suppose $a_1 > 0$. Without computing the derivatives, find the linear approximation of f(x)g(x) at $x_0$. Moreover, extend the result to quadratic approximations.

As explained after the statement of Theorem 2.3.2, like the quadratic situation, the existence of the n-th order derivative at $x_0$ implies the n-th order differentiability at $x_0$. On the other hand, Exercise 2.3.4 shows that the n-th order differentiability does not necessarily imply the existence of the n-th order derivative. This is rather different from the first order case, where differentiability is equivalent to the existence of the derivative.

Example 2.3.15. It is easy to see that $|x|^\alpha$ is n-th order differentiable at 0 when one of the following happens:

1. $\alpha > n$: The n-th order approximation is 0.

2. $\alpha$ is an even natural number: We have $|x|^\alpha = x^\alpha$. The n-th order approximation is $x^\alpha$ if $\alpha \le n$ and is 0 if $\alpha > n$.

We claim the converse is also true.

Let $m$ be the unique natural number satisfying $m + 1 \ge \alpha > m$. Then for any natural number $n \ge \alpha$, we have $n \ge m + 1$. Therefore n-th order differentiability implies (m + 1)-st order differentiability. Moreover, by the first statement above, the m-th order approximation is 0, so that the (m + 1)-st order approximation is $bx^{m+1}$. In other words, for any $\varepsilon > 0$, there is $\delta > 0$, such that

\[
|x| < \delta \implies \left| |x|^\alpha - bx^{m+1} \right| < \varepsilon |x|^{m+1}.
\]

This is equivalent to the existence of the limit $b = \lim_{x\to 0}\frac{|x|^\alpha}{x^{m+1}}$. Since $m + 1 \ge \alpha$, this happens exactly when $\alpha = m + 1$ is an even number. Therefore the converse is proved.

As for the existence of the n-th order derivative, because this implies the n-th order differentiability, the conditions above are necessary. Conversely, if $\alpha > n \ge k > 0$, then we have $(|x|^\alpha)^{(k)} = \alpha(\alpha - 1)\cdots(\alpha - k + 1)x^{\alpha - k}$ for $x > 0$ or $x = 0^+$, and $(|x|^\alpha)^{(k)} = (-1)^k\alpha(\alpha - 1)\cdots(\alpha - k + 1)(-x)^{\alpha - k}$ for $x < 0$ or $x = 0^-$. Therefore $|x|^\alpha$ has n-th order derivative 0 at 0. Moreover, if $\alpha$ is an even natural number, then $|x|^\alpha = x^\alpha$ has derivatives of any order. Therefore the condition for the existence of the n-th order derivative is the same as the condition for the n-th order differentiability.

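Exercise 2.3.26. Study the n-th order differentiability and the existence of the n-th order derivative of the functions at 0 ($\alpha, \beta > 0$, $a, b \ne 0$).

1. $\begin{cases} ax^\alpha & \text{if } x \ge 0 \\ b(-x)^\beta & \text{if } x < 0 \end{cases}$.

2. $\begin{cases} |x|^\alpha \sin\frac{1}{|x|^\beta} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0 \end{cases}$.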

2.3.4 Remainder

Let $T_n(x)$ be the n-th order Taylor expansion of $f(x)$ at $x_0$. Under the condition of Theorem 2.3.2, all we know about the remainder is $R_n(x) = f(x) - T_n(x) = o(\Delta x^n)$. Under a slightly stronger assumption, however, more can be said about the remainder.

Proposition 2.3.3 (Lagrange¹⁰). Suppose $f(t)$ has (n + 1)-st order derivative between $x$ and $x_0$. Then there is $c$ between $x$ and $x_0$, such that

\[
R_n(x) = \frac{f^{(n+1)}(c)}{(n+1)!}(x - x_0)^{n+1}. \tag{2.3.6}
\]

Note that when $n = 0$, the conclusion of the proposition is exactly the mean value theorem.

The expression for $R_n(x)$ in Proposition 2.3.3 is called the Lagrange form of the remainder. In Exercise 2.3.47 and Example 3.3.1, two other formulae for the remainder will be given.

¹⁰ Joseph-Louis Lagrange, born 1736 in Turin (now Italy), died 1813 in Paris (France).


Proof. Under the assumption that $f$ has (n + 1)-st order derivative between $x_0$ and $x$, we have the following computation similar to (2.3.5),

\[
\frac{R_n(x)}{(x - x_0)^{n+1}}
= \frac{R_n(x) - R_n(x_0)}{(x - x_0)^{n+1} - (x_0 - x_0)^{n+1}}
= \frac{R_n'(c_1)}{(n+1)(c_1 - x_0)^n}
= \frac{R_n'(c_1) - R_n'(x_0)}{(n+1)\left((c_1 - x_0)^n - (x_0 - x_0)^n\right)} = \cdots
\]
\[
= \frac{R_n^{(n)}(c_n)}{(n+1)n(n-1)\cdots 2\,(c_n - x_0)}
= \frac{R_n^{(n)}(c_n) - R_n^{(n)}(x_0)}{(n+1)!\,(c_n - x_0)}
= \frac{R_n^{(n+1)}(c)}{(n+1)!},
\]

where $c$ is between $x_0$ and $x$. Since $T_n$ is a polynomial of degree $n$, its (n + 1)-st order derivative is zero. Therefore $R_n^{(n+1)}(c) = f^{(n+1)}(c)$. The formula for the remainder then follows.
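The Lagrange form gives a computable error bound as soon as $f^{(n+1)}$ can be bounded. As a minimal numerical sketch, assuming $f(x) = e^x$ and $x_0 = 0$ (the case treated in Example 2.3.16), the actual remainder can be compared with the bound $\frac{e^{|x|}|x|^{n+1}}{(n+1)!}$. The function names below are illustrative only.

```python
import math

def taylor_exp(x: float, n: int) -> float:
    """n-th order Taylor polynomial T_n(x) of e^x at 0."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))

def lagrange_bound_exp(x: float, n: int) -> float:
    """Bound e^{|x|} |x|^{n+1} / (n+1)! for |R_n(x)|, obtained from (2.3.6)."""
    return math.exp(abs(x)) * abs(x) ** (n + 1) / math.factorial(n + 1)

# Compare the actual remainder |e - T_n(1)| with the bound at x = 1.
for n in (3, 6, 9):
    actual = abs(math.exp(1.0) - taylor_exp(1.0, n))
    print(f"n = {n}: |R_n(1)| = {actual:.3e}, bound = {lagrange_bound_exp(1.0, n):.3e}")
```

The bound is not sharp, since it replaces the unknown $c$ by the worst case $|c| = |x|$, but it decays at the same factorial rate as the true remainder.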

Example 2.3.16. The remainder of the Taylor expansion for $e^x$ at 0 is

\[
R_n(x) = \frac{e^c}{(n+1)!}x^{n+1}.
\]

Since $|c| < |x|$, for each fixed $x$, we have $|R_n(x)| < \frac{e^{|x|}|x|^{n+1}}{(n+1)!}$, which converges to 0 as $n \to \infty$. Therefore

\[
e^x = \lim_{n\to\infty}\left(1 + x + \frac{1}{2!}x^2 + \frac{1}{3!}x^3 + \cdots + \frac{1}{n!}x^n\right)
= 1 + x + \frac{1}{2!}x^2 + \frac{1}{3!}x^3 + \cdots + \frac{1}{n!}x^n + \cdots.
\]

Moreover, since $|R_9(1)| < \frac{e}{10!} < \frac{3}{10!} < 10^{-6}$, we find

\[
e \approx 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \cdots + \frac{1}{9!} \approx 2.718282
\]

is accurate up to the 6th digit.

Example 2.3.17. Suppose $f(x)$ has second order derivative on $[0, 1]$. Suppose $|f(0)| \le 1$, $|f(1)| \le 1$ and $|f''(x)| \le 1$. We would like to estimate the size of $f'(x)$.

Fix any $0 < x < 1$. By the Taylor expansion at $x$ and the remainder formula (2.3.6) with $n = 1$, we have

\[
f(0) = f(x) + f'(x)(0 - x) + \frac{f''(c_1)}{2}(0 - x)^2, \quad 0 < c_1 < x,
\]
\[
f(1) = f(x) + f'(x)(1 - x) + \frac{f''(c_2)}{2}(1 - x)^2, \quad x < c_2 < 1.
\]

Subtracting the two, we get

\[
f'(x) = f(1) - f(0) + \frac{f''(c_1)}{2}x^2 - \frac{f''(c_2)}{2}(x - 1)^2.
\]

Therefore by the assumption on the sizes of $f(1)$, $f(0)$ and $f''(x)$, we get

\[
|f'(x)| \le 2 + \frac{1}{2}\left(x^2 + (x - 1)^2\right) \le \frac{5}{2}.
\]

Exercise 2.3.27. Estimate the errors of approximations for $|x| \le 0.2$.

1. $\cos x \approx 1 - \frac{1}{2}x^2 + \frac{1}{24}x^4$.

2. $\frac{1}{\sqrt{1 + x}} \approx 1 - \frac{1}{2}x + \frac{3}{8}x^2$.

3. $\log(1 + x) \approx x - \frac{1}{2}x^2$.

4. $2^x \approx 1 + x\log 2 + \frac{(\log 2)^2}{2}x^2$.

Exercise 2.3.28. Compute the values up to the 4-th digit.

1. $\sqrt{e}$.

2. $\log 0.9$.

3. $\sqrt[5]{30}$.

Exercise 2.3.29. Show that the Taylor expansions of $\sin x$ and $\cos x$ converge to the respective functions for any $x$ as $n \to \infty$.

Exercise 2.3.30. Prove that if there is $M$, such that $|f^{(n)}(x)| \le M$ for all $n$ and $x \in [a, b]$, then the Taylor expansion of $f(x)$ converges to $f(x)$ for $x \in [a, b]$.

Exercise 2.3.31. Prove that for $x > 0$, we have

\[
x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots + \frac{x^{2k}}{2k} - \frac{x^{2k+1}}{(2k+1)(1+x)^{2k+1}} > \log(1 + x) > x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots + \frac{x^{2k}}{2k} - \frac{x^{2k+1}}{2k+1},
\]

and

\[
x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots - \frac{x^{2k-1}}{2k-1} + \frac{x^{2k}}{2k(1+x)^{2k}} < \log(1 + x) < x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots - \frac{x^{2k-1}}{2k-1} + \frac{x^{2k}}{2k}.
\]

This extends the inequality in Exercise 2.2.20. Also derive the similar inequalities for $0 > x > -1$. Then use the inequalities to discuss the convergence of the Taylor series of $\log(1 + x)$.

Exercise 2.3.32. Suppose $f(x)$ has the third order derivative on $[-1, 1]$, such that $f(-1) = 0$, $f(0) = 0$, $f(1) = 1$, $f'(0) = 0$. Prove that there are $-1 < x < 0$ and $0 < y < 1$, such that $f'''(x) + f'''(y) = 6$.

2.3.5 Maximum and Minimum

Suppose $f(x)$ is differentiable at $x_0$. By Proposition 2.2.1, a necessary condition for $x_0$ to be a local extreme is $f'(x_0) = 0$. To further determine whether $x_0$ is indeed a local extreme, high order approximations can be used.

Proposition 2.3.4. Suppose $f(x)$ has n-th order approximation $a + b(x - x_0)^n$ at $x_0$, with $b \ne 0$.

1. If $n$ is odd, then $x_0$ is not a local extreme.

2. If $n$ is even and $b > 0$, then $x_0$ is a local minimum. If $n$ is even and $b < 0$, then $x_0$ is a local maximum.

Proof. The n-th order approximation means that for any $\varepsilon > 0$, there is $\delta > 0$, such that

\[
|x - x_0| < \delta \implies |f(x) - a - b(x - x_0)^n| \le \varepsilon|(x - x_0)^n|.
\]

This implies $a = f(x_0)$ by taking $x = x_0$. In what follows, we will fix $\varepsilon$ to be any number satisfying $0 < \varepsilon < |b|$, so that $b - \varepsilon$ and $b$ have the same sign.

Suppose $n$ is odd and $b > 0$. Then for $\delta > x - x_0 > 0$, we have

\[
f(x) - f(x_0) \ge b(x - x_0)^n - \varepsilon|(x - x_0)^n| = (b - \varepsilon)(x - x_0)^n > 0,
\]

and for $0 > x - x_0 > -\delta$, we have

\[
f(x) - f(x_0) \le b(x - x_0)^n + \varepsilon|(x - x_0)^n| = (b - \varepsilon)(x - x_0)^n < 0.
\]

Thus $x_0$ is not a local extreme. Similarly, if $b < 0$, then $x_0$ is also not a local extreme.

Suppose $n$ is even and $b > 0$. Then $|x - x_0| < \delta$ implies

\[
f(x) - f(x_0) \ge b(x - x_0)^n - \varepsilon|(x - x_0)^n| = (b - \varepsilon)(x - x_0)^n \ge 0.
\]

Thus $x_0$ is a local minimum. Similarly, if $b < 0$, then $x_0$ is a local maximum.

Suppose $f(x)$ has derivatives of sufficiently high order. Then the function has high order approximations by the Taylor expansion. To apply the proposition, we assume

\[
f'(x_0) = f''(x_0) = \cdots = f^{(n-1)}(x_0) = 0, \quad f^{(n)}(x_0) \ne 0.
\]

Then we conclude the following.

1. If $n$ is odd, then $x_0$ is not a local extreme.

2. If $n$ is even, then $x_0$ is a local minimum when $f^{(n)}(x_0) > 0$, and is a local maximum when $f^{(n)}(x_0) < 0$.

On the other hand, since the n-th order differentiability is weaker than the existence of the n-th order derivative, Proposition 2.3.4 is a technically stronger statement.
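The derivative test just stated is mechanical enough to write down as a small procedure. The following is a sketch under the assumption that the derivatives $f'(x_0), f''(x_0), \dots$ are supplied exactly by the caller; the function name `classify` and the returned strings are illustrative, not notation from the text.

```python
import math
from typing import Sequence

def classify(derivatives: Sequence[float]) -> str:
    """Classify x0 from [f'(x0), f''(x0), ...] using the first nonzero entry.

    The values are assumed exact: a derivative that is zero must be given as 0.
    """
    for n, value in enumerate(derivatives, start=1):
        if value == 0:
            continue
        if n % 2 == 1:
            return "not a local extreme"
        return "local minimum" if value > 0 else "local maximum"
    # All supplied derivatives vanish: the test gives no information.
    return "undetermined"

# f(x) = x^2 e^x has f'(0) = 0, f''(0) = 2 and f'(-2) = 0, f''(-2) = -2 e^{-2}.
print(classify([0.0, 2.0]))
print(classify([0.0, -2.0 * math.exp(-2.0)]))
# f(x) = x^3 at 0: the first nonzero derivative has odd order 3.
print(classify([0.0, 0.0, 6.0]))
```

The "undetermined" outcome corresponds to cases such as Example 2.3.14, where every derivative vanishes and the point must be studied by other means.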

Example 2.3.18. By Example 2.2.1, we know the possible local extremes of the function $f(x) = x^2e^x$ are $0$ and $-2$. Then by $f''(x) = (x^2 + 4x + 2)e^x$, $f''(0) = 2 > 0$, $f''(-2) = -2e^{-2} < 0$, we conclude that $0$ is a local minimum and $-2$ is a local maximum. This confirms the same conclusion in Example 2.2.7, which was obtained by studying the monotone properties of the function.

Example 2.3.19. For the function $f(x) = (x + 1)x^{\frac{8}{3}}$, we have $f'(x) = \frac{1}{3}(11x + 8)x^{\frac{5}{3}}$, $f''(x) = \frac{8}{9}(11x + 5)x^{\frac{2}{3}}$. From $f'(x)$ we find two candidates $-\frac{8}{11}$ and $0$ for the local extremes. Since $f''\left(-\frac{8}{11}\right) = -\frac{8}{3}\left(\frac{8}{11}\right)^{\frac{2}{3}} < 0$, we know $-\frac{8}{11}$ is a local maximum. Since $f''(0) = 0$, no immediate conclusion can be drawn for the other candidate $0$ by Proposition 2.3.4. Moreover, since $f'''(0)$ does not exist, the proposition cannot be applied to determine whether $x = 0$ is a local extreme. On the other hand, for $x$ close to $0$, we do have $f(x) \ge 0 = f(0)$, so that $0$ is indeed a local minimum by the definition.

Exercise 2.3.33. Find local extrema.

1. $x + \frac{1}{x}$.

2. $6x^{10} - 10x^6$.

3. $\sin x + \cos x$.

4. $\frac{\sin x}{2 + \cos x}$.

5. $\cos x + \frac{1}{2}\cos 2x$.

6. $(x^2 + 1)e^x$.

7. $x\log x$.

8. $\frac{(\log x)^2}{x}$.

9. $\left(1 + x + \frac{1}{2}x^2\right)e^{-x}$.

10. $\left(1 + x + \frac{1}{2}x^2 + \frac{1}{6}x^3\right)e^{-x}$.

11. $\arctan x - \frac{1}{2}\log(1 + x^2)$.

Exercise 2.3.34. Does the second part of Proposition 2.3.4 hold if $>$ and $<$ are replaced by $\ge$ and $\le$?

Exercise 2.3.35. Study the local extremes of the function

\[
f(x) = \begin{cases} \frac{1}{x^4}e^{-\frac{1}{x^2}} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0 \end{cases}.
\]

2.3.6 Convex and Concave

A function is convex if the line segment $L_{x,y}$ connecting any two points $(x, f(x))$ and $(y, f(y))$ on the graph of $f$ lies above the graph of $f$. In other words, for any $x < z < y$, the point $(z, f(z))$ lies below $L_{x,y}$. See Figure 2.6. From the picture, it is easy to see that the condition is equivalent to any one of the following.

1. slope of $L_{z,y}$ ≥ slope of $L_{x,y}$.

2. slope of $L_{x,y}$ ≥ slope of $L_{x,z}$.

3. slope of $L_{z,y}$ ≥ slope of $L_{x,z}$.

Algebraically, the slope of $L_{x,y}$ is $\frac{f(y) - f(x)}{y - x}$, and it is not difficult to verify by direct computation that the three conditions are equivalent. Moreover, the line segment $L_{x,y}$ is given by

\[
L_{x,y}(z) = f(y) + \frac{f(y) - f(x)}{y - x}(z - y).
\]

Thus the convexity of $f(x)$ means $L_{x,y}(z) \ge f(z)$, which is easily seen to be equivalent to the first condition.
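The direct computation behind the equivalence of the three slope conditions can be sketched in one identity. The shorthand $s_{u,v}$ for the slope of $L_{u,v}$ is introduced here only for this illustration and is not notation from the text.

```latex
% Shorthand (an assumption of this sketch): s_{u,v} = slope of L_{u,v}.
\[
s_{u,v} = \frac{f(v) - f(u)}{v - u}.
\]
% Split f(y) - f(x) at the intermediate point z, where x < z < y:
\[
s_{x,y} = \frac{(f(z) - f(x)) + (f(y) - f(z))}{y - x}
        = \frac{z - x}{y - x}\, s_{x,z} + \frac{y - z}{y - x}\, s_{z,y}.
\]
% The two weights are positive and add up to 1, so s_{x,y} lies between
% s_{x,z} and s_{z,y}. Consequently
\[
s_{x,z} \le s_{z,y}
\iff s_{x,z} \le s_{x,y}
\iff s_{x,y} \le s_{z,y}.
\]
```

In words, the slope of the long chord is a weighted average of the slopes of the two short chords, so comparing any two of the three slopes decides the order of all three.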

Convex functions can also be characterized by straight lines below the graph.

Proposition 2.3.5. A function $f(x)$ on an open interval is convex if and only if for any $z$, there is a linear function $K(x)$ such that $K(z) = f(z)$ and $K(x) \le f(x)$ for all $x$.

[Figure 2.6: convex function. Labels in the figure: $x$, $z$, $y$, $\lambda$, $1 - \lambda$, $L_{x,z}$, $L_{z,y}$, $L_{x,y}$, $K$.]

Proof. For a convex function $f(x)$, by the third convexity condition, we have

\[
\sup_{x < z}(\text{slope of } L_{x,z}) \le \inf_{y > z}(\text{slope of } L_{z,y}).
\]

Let $B = B(z)$ be a number between the supremum and the infimum. Then

\[
\text{slope of } L_{x,z} \le B \le \text{slope of } L_{z,y}
\]

for any $x < z < y$. Let $K(x) = f(z) + B(x - z)$ be the linear function with slope $B$ satisfying $K(z) = f(z)$. Then the relation above between $B$ and the slopes tells us that $f(x) \ge K(x)$ for any $x$.

Conversely, suppose the linear function $K$ exists with the claimed property, and let $B$ be its slope. Then geometrically it is clear that the properties of $K$ imply slope of $L_{x,z} \le B$ and slope of $L_{z,y} \ge B$ for any $x < z < y$. This implies the third convexity condition.

The convexity condition can also be rephrased as follows. Write $z = (1 - \lambda)x + \lambda y$, where $x < z < y$ is equivalent to $0 < \lambda < 1$ (and $z < x$, $z = x$, $x < z < y$, $z = y$, $z > y$ are respectively equivalent to $\lambda < 0$, $\lambda = 0$, $0 < \lambda < 1$, $\lambda = 1$, $\lambda > 1$). Either geometrical consideration or algebraic computation tells us

\[
L_{x,y}(z) = (1 - \lambda)f(x) + \lambda f(y).
\]

Thus the convexity is the same as

\[
0 < \lambda < 1 \implies (1 - \lambda)f(x) + \lambda f(y) \ge f((1 - \lambda)x + \lambda y). \tag{2.3.7}
\]
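Inequality (2.3.7) lends itself to a quick randomized test. The sketch below samples random $x$, $y$, $\lambda$ and looks for a violation; the function name `violates_convexity`, the tolerance and the sample size are choices made here for illustration. Sampling can only refute convexity on an interval, never prove it.

```python
import math
import random

def violates_convexity(f, a: float, b: float,
                       trials: int = 10_000, tol: float = 1e-12,
                       seed: int = 0) -> bool:
    """Return True if some sampled (x, y, lam) breaks inequality (2.3.7) on [a, b]."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = rng.uniform(a, b), rng.uniform(a, b)
        lam = rng.random()
        z = (1 - lam) * x + lam * y
        # Convexity requires the chord value to be at least f(z).
        if (1 - lam) * f(x) + lam * f(y) < f(z) - tol:
            return True
    return False

print(violates_convexity(math.exp, -2.0, 2.0))      # e^x is convex
print(violates_convexity(math.sin, 0.0, math.pi))   # sin is concave on [0, pi]
```

The tolerance guards against rounding errors when $x$ and $y$ are nearly equal and both sides of (2.3.7) almost coincide.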

A function is concave if the line segment connecting two points on the graph of $f$ lies below the graph of $f$. By exchanging the directions of the inequalities, the characterizations of convex functions become the characterizations of concave functions.

A point of inflection is where the function changes from concave to convex, or from convex to concave.

[Figure 2.7: concave function]

Proposition 2.3.6. Suppose $f(x)$ is differentiable on an interval. Then $f(x)$ is convex if and only if $f'(x)$ is increasing.

Combining Propositions 2.2.4 and 2.3.6, a function $f(x)$ with second order derivative is convex if and only if $f''(x) \ge 0$. The similar statement for concave functions is also true. The points of inflection are then the places where $f''(x)$ changes sign.

Proof. Suppose $f(x)$ is convex. Then for fixed $x < y$ and changing $z$ between $x$ and $y$, we have

\[
f'(x) = \lim_{z\to x^+}(\text{slope of } L_{x,z}) \le \text{slope of } L_{x,y} \le \lim_{z\to y^-}(\text{slope of } L_{z,y}) = f'(y).
\]

Conversely, suppose $f'$ is increasing and $x < z < y$. By the mean value theorem, we have

\[
\text{slope of } L_{x,z} = f'(c), \quad \text{slope of } L_{z,y} = f'(d),
\]

for some $x < c < z$ and $z < d < y$. Since $c < d$, we have $f'(c) \le f'(d)$, so that the third condition for the convexity holds.

Example 2.3.20. Since $(-\log x)'' = \frac{1}{x^2} > 0$, the derivative $(-\log x)'$ is an increasing function and $-\log x$ is convex. Therefore if $p, q > 0$ satisfy $\frac{1}{p} + \frac{1}{q} = 1$, then for $x, y > 0$ we have

\[
\log xy = \frac{1}{p}\log x^p + \frac{1}{q}\log y^q \le \log\left(\frac{1}{p}x^p + \frac{1}{q}y^q\right).
\]

Taking the exponential, we get the Young inequality in Exercise 2.2.40

\[
xy \le \frac{1}{p}x^p + \frac{1}{q}y^q.
\]
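The Young inequality is easy to probe numerically. In the sketch below, the function name `young_gap` is illustrative; it returns the right side minus the left side, which should be nonnegative for $x, y > 0$ and $p > 1$, and should vanish when $x^p = y^q$ (the equality case of the convexity inequality for $-\log$).

```python
def young_gap(x: float, y: float, p: float) -> float:
    """Return x^p / p + y^q / q - x*y, where 1/p + 1/q = 1 and p > 1."""
    q = p / (p - 1.0)  # conjugate exponent
    return x ** p / p + y ** q / q - x * y

# The gap should be nonnegative on a grid of positive sample points.
samples = [0.1, 0.5, 1.0, 2.0, 7.5]
print(min(young_gap(x, y, p) for x in samples for y in samples for p in (1.5, 2.0, 3.0)))

# Equality case: p = 3, q = 3/2, x = 2, y = 4 gives x^p = y^q = 8.
print(young_gap(2.0, 4.0, 3.0))
```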

Example 2.3.21. The function $f(x) = x^2e^x$ in Examples 2.2.7 and 2.3.18 has $f''(x) = (x^2 + 4x + 2)e^x = (x - a)(x - b)e^x$, where $a = -2 - \sqrt{2}$, $b = -2 + \sqrt{2}$. From the signs of the second order derivative, we know that the function is convex on $(-\infty, a]$ and $[b, \infty)$, and is concave on $[a, b]$. The points $a$ and $b$, where concavity and convexity are exchanged, are the points of inflection. This adds additional information to the graph of the function in Figure 2.4.


Example 2.3.22. Let $f(x) = \frac{x}{1 + x^2}$. From $f'(x) = \frac{1 - x^2}{(1 + x^2)^2}$, we find $f(x)$ is decreasing on $(-\infty, -1]$, increasing on $[-1, 1]$, and decreasing again on $[1, \infty)$. Therefore $f(-1) = -\frac{1}{2}$ is a local minimum, and $f(1) = \frac{1}{2}$ is a local maximum.

From $f''(x) = \frac{-2x(3 - x^2)}{(1 + x^2)^3}$, we find $f(x)$ is concave on $(-\infty, -\sqrt{3}]$, convex on $[-\sqrt{3}, 0]$, concave again on $[0, \sqrt{3}]$, and then convex again on $[\sqrt{3}, \infty)$. Therefore $f(-\sqrt{3}) = -\frac{\sqrt{3}}{4}$, $f(0) = 0$ and $f(\sqrt{3}) = \frac{\sqrt{3}}{4}$ are points of inflection.

Combining the information with $\lim_{x\to\infty} f(x) = 0$, we get a rough sketch of the function in Figure 2.8.

[Figure 2.8: graph of $\frac{x}{1 + x^2}$. Axis marks in the figure: $2$, $-2$, $\frac{1}{2}$, $-\frac{1}{2}$, $\sqrt{3}$, $-\sqrt{3}$.]
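The sign pattern of $f''$ claimed in Example 2.3.22 can be cross-checked without symbolic differentiation. The sketch below compares the stated formula for $f''$ with a central second difference; the helper names and the step size `h` are choices made for this illustration.

```python
import math

def f(x: float) -> float:
    return x / (1.0 + x * x)

def f_second(x: float) -> float:
    """The formula f''(x) = -2x(3 - x^2) / (1 + x^2)^3 from Example 2.3.22."""
    return -2.0 * x * (3.0 - x * x) / (1.0 + x * x) ** 3

def second_difference(x: float, h: float = 1e-3) -> float:
    """Central difference approximation of f''(x), with error of order h^2."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

# One test point inside each interval cut out by -sqrt(3), 0, sqrt(3).
for x in (-3.0, -1.0, 1.0, 3.0):
    shape = "convex" if f_second(x) > 0 else "concave"
    print(f"x = {x:+.1f}: f'' = {f_second(x):+.4f}, difference = {second_difference(x):+.4f}, {shape}")
```

The four test points should report concave, convex, concave, convex, in agreement with the intervals listed in the example.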

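Example 2.3.23. Consider the function $f(x) = (x + 1)x^{\frac{2}{3}}$ in Example 2.2.8. From $f'(x) = \frac{1}{3}(5x + 2)x^{-\frac{1}{3}}$, we find $f(x)$ is increasing on $\left(-\infty, -\frac{2}{5}\right]$, decreasing on $\left[-\frac{2}{5}, 0\right]$, and increasing again on $[0, \infty)$. Therefore $f\left(-\frac{2}{5}\right) = \frac{3\sqrt[3]{20}}{25}$ is a local maximum, and $f(0) = 0$ is a local minimum.

From $f''(x) = \frac{2}{9}(5x - 1)x^{-\frac{4}{3}}$, we find $f(x)$ is concave on $(-\infty, 0]$ and on $\left[0, \frac{1}{5}\right]$ (the two intervals are treated separately because $f$ is not differentiable at $0$), and convex on $\left[\frac{1}{5}, \infty\right)$. Therefore $f\left(\frac{1}{5}\right) = \frac{6\sqrt[3]{5}}{25}$ is a point of inflection.

Combined with $\lim_{x\to\infty}\frac{f(x)}{x^{\frac{5}{3}}} = 1$, we get a rough sketch of the function in Figure 2.9.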

[Figure 2.9: graph of $(x + 1)x^{\frac{2}{3}}$. Axis marks in the figure: $-\frac{2}{5}$, $\frac{1}{5}$, $\frac{3\sqrt[3]{20}}{25}$.]

Exercise 2.3.36. Sketch the graph of the functions.

1. $x^3 + 6x^2 - 15x - 20$.

2. $x - \frac{1}{x}$.

3. $\frac{1}{1 + x^2}$.

4. $(x - 1)^{\frac{2}{3}}(x + 1)^{\frac{1}{3}}$.

5. $\log(1 + x^2)$.

6. $xe^{-x}$.

7. $x - \arctan x$.

8. $\frac{\log x}{x}$.

9. $e^{-x}\cos x$.

Exercise 2.3.37. Are the sum, product, composition, maximum, minimum of two convex functions still convex?

Exercise 2.3.38. Verify the convexity of $x\log x$ and then use the property to prove the inequality $\left(\frac{x + y}{2}\right)^{x+y} \le x^x y^y$.

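Exercise 2.3.39. Suppose $p \ge 1$. Show that $x^p$ is convex. Then for non-negative $a_1, a_2, \dots, a_n$, $b_1, b_2, \dots, b_n$, take

\[
x = \frac{a_i}{\left(\sum a_i^p\right)^{\frac{1}{p}}}, \quad
y = \frac{b_i}{\left(\sum b_i^p\right)^{\frac{1}{p}}}, \quad
\lambda = \frac{\left(\sum a_i^p\right)^{\frac{1}{p}}}{\left(\sum a_i^p\right)^{\frac{1}{p}} + \left(\sum b_i^p\right)^{\frac{1}{p}}},
\]

in the inequality (2.3.7) and derive the Minkowski inequality in Exercise 2.2.42.

Exercise 2.3.40. Suppose $f(x)$ is a convex function. For any $\lambda_1, \lambda_2, \dots, \lambda_n$ satisfying $\lambda_1 + \lambda_2 + \cdots + \lambda_n = 1$ and $0 < \lambda_i < 1$, prove the Jensen¹¹ inequality

\[
f(\lambda_1 x_1 + \lambda_2 x_2 + \cdots + \lambda_n x_n) \le \lambda_1 f(x_1) + \lambda_2 f(x_2) + \cdots + \lambda_n f(x_n).
\]

Then use this to prove that for $x_i > 0$, we have

\[
\sqrt[n]{x_1 x_2 \cdots x_n} \le \frac{x_1 + x_2 + \cdots + x_n}{n} \le \sqrt[p]{\frac{x_1^p + x_2^p + \cdots + x_n^p}{n}} \quad \text{for } p \ge 1,
\]

and

\[
(x_1 x_2 \cdots x_n)^{\frac{x_1 + x_2 + \cdots + x_n}{n}} \le x_1^{x_1} x_2^{x_2} \cdots x_n^{x_n}.
\]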

Exercise 2.3.41. Prove that if $f''(x_0) = 0$ and $f'''(x_0) \ne 0$, then $x_0$ is a point of inflection.

Exercise 2.3.42. Prove that a continuous function on an interval is convex if and only if $\frac{f(x) + f(y)}{2} \ge f\left(\frac{x + y}{2}\right)$ for any $x$ and $y$ on the interval.

Exercise 2.3.43. Prove that a function $f(x)$ on an open interval $(a, b)$ is convex if and only if for any $a < x < y < b$, we have $f(z) \ge L_{x,y}(z)$ for any $z \in (a, x)$ and $z \in (y, b)$. Then prove that a convex function on an open interval must be continuous.

2.3.7 Additional Exercise

Estimation of sin x and cos x

We know the inequality $\sin x > \frac{2}{\pi}x$ for $0 < x < \frac{\pi}{2}$ from 1.3.18 and Exercise 2.2.20. The subsequent exercises extend the estimation to higher order and to $\cos x$.

¹¹ Johan Jensen, born 1859 in Nakskov (Denmark), died 1925 in Copenhagen (Denmark). He proved the inequality in 1906.


Exercise 2.3.44. Let

\[
f_k(x) = x - \frac{x^3}{3!} + \cdots + (-1)^{k-1}\frac{x^{2k-1}}{(2k-1)!} - \sin x,
\]
\[
g_k(x) = 1 - \frac{x^2}{2!} + \cdots + (-1)^k\frac{x^{2k}}{(2k)!} - \cos x.
\]

Verify that $g_k' = -f_k$ and $f_{k+1}' = g_k$. Then use the equalities to prove that

\[
x - \frac{x^3}{3!} + \cdots - \frac{x^{4k-1}}{(4k-1)!} < \sin x < x - \frac{x^3}{3!} + \cdots - \frac{x^{4k-1}}{(4k-1)!} + \frac{x^{4k+1}}{(4k+1)!}
\]

for $x > 0$. Also derive the similar inequalities for $\cos x$.

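Exercise 2.3.45. Prove that

\[
x - \frac{x^3}{3!} + \cdots - \frac{x^{4k-1}}{(4k-1)!} + \frac{x^{4k+1}}{(4k+1)!} < \sin x < x - \frac{x^3}{3!} + \cdots + \frac{x^{4k+1}}{(4k+1)!} - \frac{2}{\pi}\frac{x^{4k+3}}{(4k+3)!},
\]

and

\[
1 - \frac{x^2}{2!} + \cdots - \frac{x^{4k-2}}{(4k-2)!} + \frac{x^{4k}}{(4k)!} < \cos x < 1 - \frac{x^2}{2!} + \cdots + \frac{x^{4k}}{(4k)!} - \frac{2}{\pi}\frac{x^{4k+2}}{(4k+2)!}
\]

for $0 < x < \frac{\pi}{2}$.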

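Exercise 2.3.46. Let

\[
f_n(x) = 1 + x - \frac{x^2}{2!} - \frac{x^3}{3!} + \cdots + s_1(n)\frac{x^n}{n!}, \quad
s_1(n) = \begin{cases} 1 & \text{if } n = 4k, 4k+1 \\ -1 & \text{if } n = 4k+2, 4k+3 \end{cases},
\]
\[
g_n(x) = 1 - x - \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + s_2(n)\frac{x^n}{n!}, \quad
s_2(n) = \begin{cases} 1 & \text{if } n = 4k-1, 4k \\ -1 & \text{if } n = 4k+1, 4k+2 \end{cases}.
\]

Prove that

\[
f_{4k+1}(x) - \sqrt{2}\frac{x^{4k+2}}{(4k+2)!} < \cos x + \sin x < f_{4k-1}(x) + \sqrt{2}\frac{x^{4k}}{(4k)!},
\]

and

\[
g_{4k}(x) - \sqrt{2}\frac{x^{4k+1}}{(4k+1)!} < \cos x - \sin x < g_{4k+2}(x) + \sqrt{2}\frac{x^{4k+3}}{(4k+3)!}.
\]

Moreover, derive similar inequalities for $a\cos x + b\sin x$.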

Cauchy Form of the Remainder

The Lagrange form (2.3.6) is the simplest form of the remainder. However, for certain functions, it is more suitable to use the Cauchy form

\[
R_n(x) = \frac{f^{(n+1)}(c)}{n!}(x - c)^n(x - x_0) \tag{2.3.8}
\]

of the remainder. The proof makes use of the function

\[
F(t) = f(x) - f(t) - f'(t)(x - t) - \frac{f''(t)}{2}(x - t)^2 - \cdots - \frac{f^{(n)}(t)}{n!}(x - t)^n
\]

defined for any fixed $x$ and $x_0$.


Exercise 2.3.47. By applying the mean value theorem to $F(t)$ for $t$ between $x_0$ and $x$, prove the Cauchy form of the remainder.

Exercise 2.3.48. By applying Cauchy's mean value theorem to $F(t)$ and $G(t) = (x - t)^{n+1}$, derive the Lagrange form (2.3.6) for the remainder.

Exercise 2.3.49. Prove the remainder of the Taylor series of $(1 + x)^\alpha$ satisfies

\[
|R_n| \le \rho_n = A\left|\frac{\alpha(\alpha - 1)\cdots(\alpha - n)}{n!}x^{n+1}\right| \quad \text{for } |x| < 1,
\]

where $A = (1 + |x|)^{\alpha - 1}$ for $\alpha \ge 1$ and $A = (1 - |x|)^{\alpha - 1}$ for $\alpha < 1$. Then use Exercise 1.1.33 to show that $\lim_{n\to\infty}\rho_n = 0$. This shows that the Taylor series of $(1 + x)^\alpha$ converges for $|x| < 1$.

Exercise 2.3.50. Study the convergence of the Taylor series of $\log(1 + x)$.

Relation between the Bounds of a Function and its Derivatives

In Example 2.3.17, we saw bounds on a function and its second orderderivative will induce a bound on the first order derivative. The subsequentexercises provides more examples.

Exercise 2.3.51. Suppose $f(x)$ is a function on $[0,1]$ with second order derivative and satisfying $f(0) = f'(0) = 0$, $f(1) = 1$. Prove that if $f''(x) \le 2$ for any $0 < x < 1$, then $f(x) = x^2$. In other words, unless $f(x) = x^2$, we will have $f''(x) > 2$ somewhere on $(0,1)$.

Exercise 2.3.52. Consider functions $f(x)$ on $[0,1]$ with second order derivative and satisfying $f(0) = f(1) = 0$ and $\min_{[0,1]} f(x) = -1$. What would be the “lowest bound” $a$ for $f''(x)$? In other words, find the biggest $a$, such that any such function $f(x)$ will have $f''(x) \ge a$ somewhere on $(0,1)$.

Exercise 2.3.53. Study the constraint on the second order derivative for functions on $[a,b]$ satisfying $f(a) = A$, $f(b) = B$ and $\min_{[a,b]} f(x) = m$.

Exercise 2.3.54. Suppose $f(x)$ has the second order derivative on $(a,b)$. Suppose $M_0$, $M_1$, $M_2$ are the suprema of $|f(x)|$, $|f'(x)|$, $|f''(x)|$ on the interval. By rewriting the remainder formula as an expression of $f'$ in terms of $f$ and $f''$, prove that

\[ |f'(x)| \le \frac{h}{2}M_2 + \frac{2}{h}M_0 \]

for any $a < x < b$ and $0 < h < \max\{x-a, b-x\}$. Then prove $M_1 \le 2\sqrt{M_2M_0}$ in case $b = +\infty$. Moreover, verify that the equality happens for

\[ f(x) = \begin{cases} 2x^2 - 1 & \text{if } -1 < x < 0, \\ \dfrac{x^2-1}{x^2+1} & \text{if } x \ge 0. \end{cases} \]

Exercise 2.3.55. Suppose $f(x)$ has the second order derivative on $(a,+\infty)$. Prove that if $f''(x)$ is bounded and $\lim_{x\to+\infty} f(x) = 0$, then $\lim_{x\to+\infty} f'(x) = 0$.

Convexity Criterion by the One Side Derivative

Convex functions are continuous (see Exercise 2.3.43) but not necessarily differentiable ($|x|$ is convex, for example). So a convexity criterion more general than Proposition 2.3.6 is needed.


Exercise 2.3.56. Prove that a convex function $f(x)$ on an open interval is left and right differentiable, and the one side derivatives satisfy

\[ f'(x^+) \le \frac{f(y)-f(x)}{y-x} \le f'(y^-) \quad \text{for } x < y. \]

Exercise 2.3.57. Prove the following are equivalent for a function $f(x)$ on an open interval.

1. $f(x)$ is convex.

2. $f(x)$ is left and right differentiable, with $x < y$ implying $f'(x^-) \le f'(x^+) \le f'(y^-) \le f'(y^+)$.

3. $f(x)$ is left continuous and right differentiable, with increasing $f'(x^+)$.


Chapter 3

Integration


3.1 Riemann Integration

Discovered independently by Newton and Leibniz, integration was originally the method of using the antiderivative to find the areas under curves. Thus the method started as an application of differentiation. Then Riemann¹ studied the limiting process leading to the area and established integration as an independent subject. The new viewpoint further led to other integration theories, among which the most significant is the Lebesgue² integration.

3.1.1 Riemann Sum

Let $f(x)$ be a function on a bounded interval $[a,b]$. To compute the area of the region between the graph of the function and the $x$-axis, we choose a partition

\[ P : a = x_0 < x_1 < x_2 < \cdots < x_n = b \tag{3.1.1} \]

of the interval and approximate the region by a sequence of rectangles with base $[x_{i-1}, x_i]$ and height $f(x_i^*)$, where $x_i^* \in [x_{i-1}, x_i]$. The total area of the rectangles is the Riemann sum

\[ S(P,f) = \sum_{i=1}^n f(x_i^*)(x_i - x_{i-1}) = \sum_{i=1}^n f(x_i^*)\Delta x_i. \tag{3.1.2} \]

Note that $S(P,f)$ also depends on the choices of $x_i^*$, although the choice does not appear in the notation.
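The Riemann sum (3.1.2) translates directly into a short computation. The following Python sketch is only an illustration: the names `riemann_sum`, `points` and `tags` are assumptions made here and do not come from the text. It makes the dependence on the sample points explicit by taking them as a separate argument.

```python
from typing import Callable, Sequence


def riemann_sum(f: Callable[[float], float],
                points: Sequence[float],
                tags: Sequence[float]) -> float:
    """Return S(P, f) for the partition `points` and sample points `tags`."""
    if len(tags) != len(points) - 1:
        raise ValueError("need exactly one sample point per subinterval")
    total = 0.0
    for i in range(1, len(points)):
        left, right = points[i - 1], points[i]
        tag = tags[i - 1]
        if not (left < right and left <= tag <= right):
            raise ValueError("invalid partition or sample point")
        total += f(tag) * (right - left)  # area of one rectangle
    return total


# Partition 0 < 0.5 < 1 of [0, 1] with two different choices of sample points.
points = [0.0, 0.5, 1.0]
left_sum = riemann_sum(lambda x: x, points, [0.0, 0.5])   # left endpoints
right_sum = riemann_sum(lambda x: x, points, [0.5, 1.0])  # right endpoints
```

The same partition gives 0.25 with left endpoints and 0.75 with right endpoints, so the value really depends on the choice of the sample points.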

[Figure 3.1: Riemann sum. The horizontal axis carries the labels $a$, $x_{i-1}$, $x_i$, $b$; the rectangle over $[x_{i-1}, x_i]$ has height $f(x_i^*)$.]

¹ Georg Friedrich Bernhard Riemann, born 1826 in Breselenz, Hanover (now Germany), died 1866 in Selasca (Italy).

² Henri Léon Lebesgue, born 1875 in Beauvais (France), died 1941 in Paris (France). His 1901 paper “Sur une généralisation de l'intégrale définie” introduced the concept of measure and revolutionized the integral calculus. He also made major contributions in other areas of mathematics, including topology, potential theory, the Dirichlet problem, the calculus of variations, set theory, the theory of surface area and dimension theory.


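Example 3.1.1. To find the area under the function $f(x) = x$ over the interval $[0,1]$, we choose a partition $P_n$ given by $x_i = \frac{i}{n}$ and choose $x_i^* = \frac{i}{n}$. Then

\[ S(P_n, x) = \sum_{i=1}^n \frac{i}{n}\cdot\frac{1}{n} = \frac{1}{n^2}\cdot\frac{n(n+1)}{2} = \frac{n+1}{2n}. \]

If the middle point $x_i^* = \frac{1}{2}\left(\frac{i-1}{n} + \frac{i}{n}\right) = \frac{2i-1}{2n}$ of the interval $[x_{i-1}, x_i]$ is chosen, then

\[ S(P_n, x) = \sum_{i=1}^n \frac{2i-1}{2n}\cdot\frac{1}{n} = \frac{n(n+1)-n}{2n^2} = \frac{1}{2}. \]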
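The two closed forms can be checked with exact rational arithmetic. This is a minimal sketch; the function names `right_endpoint_sum` and `midpoint_sum` are chosen here for illustration and are not part of the text.

```python
from fractions import Fraction


def right_endpoint_sum(n: int) -> Fraction:
    """S(P_n, x) with x_i = i/n and sample points x_i^* = i/n."""
    return sum(Fraction(i, n) * Fraction(1, n) for i in range(1, n + 1))


def midpoint_sum(n: int) -> Fraction:
    """S(P_n, x) with x_i = i/n and sample points x_i^* = (2i - 1)/(2n)."""
    return sum(Fraction(2 * i - 1, 2 * n) * Fraction(1, n)
               for i in range(1, n + 1))
```

For every `n`, `right_endpoint_sum(n)` equals `Fraction(n + 1, 2 * n)`, which tends to 1/2, while `midpoint_sum(n)` equals 1/2 exactly.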

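Exercise 3.1.1. Compute the Riemann sums.

1. $f(x) = x$, $x_i = \frac{i}{n}$, $x_i^* = \frac{i-1}{n}$.

2. $f(x) = x^2$, $x_i = \frac{i}{n}$, $x_i^* = \frac{i}{n}$.

3. $f(x) = x^2$, $x_i = \frac{i}{n}$, $x_i^* = \frac{2i-1}{2n}$.

4. $f(x) = \alpha^x$, $x_i = \frac{i}{n}$, $x_i^* = \frac{i-1}{n}$.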

The Riemann sum is only an approximation of the area of the region. We expect the approximation to get more accurate when the mesh

\[ \|P\| = \max_{1\le i\le n} \Delta x_i \]

gets smaller. This leads to the definition of the definite integral.

Definition 3.1.1. A function $f(x)$ on a bounded interval $[a,b]$ is Riemann integrable, with integral $I$, if for any $\varepsilon > 0$, there is $\delta > 0$, such that

\[ \|P\| < \delta \implies |S(P,f) - I| < \varepsilon. \tag{3.1.3} \]

Because of the similarity to the definition of limits, we write

\[ I = \int_a^b f(x)dx = \lim_{\|P\|\to 0} S(P,f). \]

The numbers $a$ and $b$ are called the lower limit and the upper limit of the integral.

As pointed out at the beginning of the chapter, there are several integration theories. Therefore there are different meanings of integrability. In this course, unless otherwise indicated, integrability will always mean Riemann integrability.

Example 3.1.2. For the constant function $f(x) = c$ on $[a,b]$ and any partition $P$, we have

\[ S(P,c) = \sum_{i=1}^n c\,\Delta x_i = c\sum_{i=1}^n \Delta x_i = c(b-a). \]

Therefore the constant function is integrable, with $\int_a^b c\,dx = c(b-a)$.


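Example 3.1.3. Consider the function

\[ d_c(x) = \begin{cases} 0 & \text{if } x \ne c, \\ 1 & \text{if } x = c, \end{cases} \]

that is constantly zero except at $x = c$. For any partition $P$ of a bounded closed interval $[a,b]$ containing $c$, we have

\[ S(P, d_c) = \begin{cases} 0 & \text{if no } x_i^* = c, \\ \Delta x_k & \text{if } x_k^* = c \text{ for exactly one } k, \\ \Delta x_k + \Delta x_{k+1} & \text{if } x_k^* = x_{k+1}^* = x_k = c. \end{cases} \]

In every case $0 \le S(P, d_c) \le 2\|P\|$. This implies that $d_c(x)$ is integrable, and $\int_a^b d_c(x)dx = 0$.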

Example 3.1.4. For the Dirichlet function $D(x)$ in Example 1.3.24 and any partition $P$ of $[a,b]$, we have $S(P,D) = b-a$ if all $x_i^*$ are rational numbers and $S(P,D) = 0$ if all $x_i^*$ are irrational numbers. Thus the Dirichlet function is not integrable.

Example 3.1.5. Consider Thomae's function $R(x)$ in Example 1.4.2. For any natural number $N$, let $A_N$ be the set of rational numbers in $[0,1]$ with denominators $\le N$. Then $A_N$ is finite, containing $\nu_N$ numbers. For any partition $P$ of $[0,1]$ and choices of $x_i^*$, the Riemann sum $S(P,R)$ can be divided into two parts. The first part consists of those intervals with $x_i^* \in A_N$, and the second part has $x_i^* \notin A_N$. The number of terms in the first part is $\le 2\nu_N$, with $0 < R(x_i^*) \le 1$. In the second part we have $0 \le R(x_i^*) \le \frac{1}{N}$. Thus we conclude

\[ 0 \le S(P,R) \le 2\nu_N\|P\| + \frac{1}{N}(1-0) = 2\nu_N\|P\| + \frac{1}{N}. \]

By taking $\|P\| < \delta = \frac{1}{2N\nu_N}$, for example, we get $0 \le S(P,R) < \frac{2}{N}$. Thus we conclude that the function is integrable on $[0,1]$, with $\int_0^1 R(x)dx = 0$.
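The estimate can be watched numerically. The sketch below assumes the usual convention $R(p/q) = 1/q$ for a rational $p/q$ in lowest terms (the definition in Example 1.4.2 is not reproduced here), and it uses only rational sample points, which is the least favorable choice because $R$ vanishes at irrational points. The names `thomae` and `thomae_right_sum` are illustrative.

```python
from fractions import Fraction


def thomae(x: Fraction) -> Fraction:
    """Assumed convention: R(p/q) = 1/q for a rational p/q in lowest terms."""
    return Fraction(1, x.denominator)  # Fraction is stored in lowest terms


def thomae_right_sum(n: int) -> Fraction:
    """Riemann sum of R on [0, 1] for x_i = i/n with sample points x_i^* = i/n."""
    return sum(thomae(Fraction(i, n)) * Fraction(1, n)
               for i in range(1, n + 1))
```

For instance, `thomae_right_sum(10)` is 27/100, while `thomae_right_sum(1000)` is already below 2/100, in line with the integral being 0.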

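Exercise 3.1.2. Study the integrability. For the integrable ones, find the integrals.

1. $\begin{cases} 0 & \text{if } 0 \le x < 1 \\ 1 & \text{if } 1 \le x \le 2 \end{cases}$ for $x \in [0,2]$.

2. $\begin{cases} x & \text{if } x \text{ is rational} \\ 0 & \text{if } x \text{ is irrational} \end{cases}$ for $x \in [0,1]$.

3. $\begin{cases} 1 & \text{if } x = \frac{1}{n},\ n \in \mathbb{Z} \\ 0 & \text{otherwise} \end{cases}$ for $x \in [0,1]$.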

Exercise 3.1.3. Prove that $\int_0^1 x\,dx = \frac{1}{2}$ in the following steps.

1. For any partition $P$, if $x_i^* = \frac{x_i + x_{i-1}}{2}$ is the middle point of the intervals, then the Riemann sum $S_{\mathrm{mid}}(P,x) = \frac{1}{2}$.

2. For any partition $P$ and choices of $x_i^*$, we have $|S(P,x) - S_{\mathrm{mid}}(P,x)| \le \frac{1}{2}\|P\|$.

Finally, we remark that since the Riemann sum $S(P,f)$ takes into account the sign of the function $f(x)$, the integral is actually the signed area, which counts the part of the area corresponding to $f(x) < 0$ as negative.

[Figure 3.2: $\int_a^b f(x)dx = -\mathrm{area}(A) + \mathrm{area}(B)$. Labels: $a$, $b$ and the regions $A$, $B$.]

3.1.2 Integrability Criterion

In Examples 3.1.2 through 3.1.5, we saw that a function may or may not be integrable. The following is a simple necessary condition for integrability.

Proposition 3.1.2. Riemann integrable functions are bounded.

Example 3.1.4 shows the converse is not true.

Proof. Let $f(x)$ be integrable on a bounded interval $[a,b]$ and let $I$ be the integral. Then for $\varepsilon = 1 > 0$, there is a partition $P$, such that

\[ \left|\sum_{i=1}^n f(x_i^*)\Delta x_i - I\right| = |S(P,f) - I| < 1 \]

for any choices of $x_i^*$. Now we fix $x_2^*, x_3^*, \ldots, x_n^*$, so that $\sum_{i=2}^n f(x_i^*)\Delta x_i$ is a fixed number. Then

\[ |f(x_1^*)\Delta x_1| \le \left|\sum_{i=2}^n f(x_i^*)\Delta x_i - I\right| + 1 \]

for any $x_1^* \in [x_0, x_1]$. In particular, this shows that $f(x)$ is bounded by $\frac{1}{\Delta x_1}\left(\left|\sum_{i=2}^n f(x_i^*)\Delta x_i - I\right| + 1\right)$ on the first interval $[x_0, x_1]$ of the partition.

A similar argument shows that the function is bounded on any interval of the partition. Since the partition contains finitely many intervals, the function is bounded on $[a,b]$.


Similar to the convergence of sequences, a more refined criterion for integrability can be obtained by considering the Cauchy criterion. By a proof similar to the convergence of sequences and functions, the Riemann sum converges if and only if for any $\varepsilon > 0$, there is $\delta > 0$, such that

\[ \|P\|, \|P'\| < \delta \implies |S(P,f) - S(P',f)| < \varepsilon. \tag{3.1.4} \]

Note that hidden in the notation are the choices of $x_i^*$ and $x_i'^*$ for $P$ and $P'$. For the special case $P = P'$, we have

\[ S(P,f) - S(P',f) = \sum_{i=1}^n (f(x_i^*) - f(x_i'^*))\Delta x_i, \]

and the supremum of the difference for all possible choices of $x_i^*$ and $x_i'^*$ is

\[ \sup_{\text{all } x_i^*} S(P,f) - \inf_{\text{all } x_i^*} S(P,f) = \sum_{i=1}^n \left(\sup_{[x_{i-1},x_i]} f(x) - \inf_{[x_{i-1},x_i]} f(x)\right)\Delta x_i. \]

Define the oscillation of a bounded function $f(x)$ on an interval $[a,b]$ to be

\[ \omega_{[a,b]}(f) = \sup_{a\le x\le y\le b} |f(x) - f(y)| = \sup_{[a,b]} f(x) - \inf_{[a,b]} f(x). \]

Then

\[ \sup_{\text{all } x_i^*} S(P,f) - \inf_{\text{all } x_i^*} S(P,f) = \sum_{i=1}^n \omega_{[x_{i-1},x_i]}(f)\Delta x_i \]

is the Riemann sum of the oscillations, and the specialization of the Cauchy criterion (3.1.4) to the case $P = P'$ becomes

\[ \|P\| < \delta \implies \sum_{i=1}^n \omega_{[x_{i-1},x_i]}(f)\Delta x_i < \varepsilon. \tag{3.1.5} \]
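The Riemann sum of oscillations can be estimated numerically. The sketch below is a rough illustration, not an exact computation: it approximates the supremum and infimum on each subinterval by sampling, which can only underestimate the oscillation and is exact only when both are attained at sampled points. The names `oscillation_sum`, `uniform_partition` and `samples` are assumptions made for this example.

```python
from typing import Callable, List, Sequence


def oscillation_sum(f: Callable[[float], float],
                    points: Sequence[float],
                    samples: int = 50) -> float:
    """Estimate the sum of omega_[x_{i-1}, x_i](f) * dx_i by sampling f."""
    total = 0.0
    for i in range(1, len(points)):
        lo, hi = points[i - 1], points[i]
        # Sample points in [lo, hi], with both endpoints included.
        xs = [lo + (hi - lo) * j / samples for j in range(samples)] + [hi]
        values = [f(x) for x in xs]
        total += (max(values) - min(values)) * (hi - lo)
    return total


def uniform_partition(a: float, b: float, n: int) -> List[float]:
    """Partition of [a, b] into n subintervals of equal length."""
    return [a + (b - a) * i / n for i in range(n)] + [b]


# For f(x) = x^2 on [-1, 1] the estimate shrinks as the mesh 2/n shrinks.
coarse = oscillation_sum(lambda x: x * x, uniform_partition(-1.0, 1.0, 10))
fine = oscillation_sum(lambda x: x * x, uniform_partition(-1.0, 1.0, 100))
```

Here $x^2$ is monotone on every subinterval, so the sampled values at the endpoints already give the exact oscillation, and the sum is $4/n$ up to rounding: about 0.4 for $n = 10$ and 0.04 for $n = 100$.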

[Figure 3.3: Riemann sum of oscillations. The horizontal axis carries the labels $a$, $x_{i-1}$, $x_i$, $b$; the rectangle over $[x_{i-1}, x_i]$ has height $\omega(f)$.]


It turns out this specialized Cauchy criterion also implies the general Cauchy criterion (3.1.4). As a preparation for the proof, note that for any $a \le c \le b$, we have

\[ |f(c)(b-a) - S(P,f)| = \left|\sum_{i=1}^n (f(c) - f(x_i^*))\Delta x_i\right| \le \sum_{i=1}^n |f(c) - f(x_i^*)|\Delta x_i \le \sum_{i=1}^n \omega_{[a,b]}(f)\Delta x_i \le \omega_{[a,b]}(f)(b-a). \tag{3.1.6} \]

Theorem 3.1.3 (Riemann Criterion). A bounded function $f(x)$ on a bounded interval $[a,b]$ is Riemann integrable if and only if for any $\varepsilon > 0$, there is $\delta > 0$, such that $\|P\| < \delta$ implies $\sum_{i=1}^n \omega_{[x_{i-1},x_i]}(f)\Delta x_i < \varepsilon$.

Proof. Assume the implication (3.1.5) holds. Let $P$ and $P'$ be partitions satisfying $\|P\|, \|P'\| < \delta$. Let $Q$ be the partition obtained by combining the partition points in $P$ and $P'$ together. Make arbitrary choices of $x_i^*$ for $Q$ and form the Riemann sum $S(Q,f)$. Note that $Q$ is obtained by adding more points into $P$ (we say $Q$ is a refinement of $P$). For any interval $[x_{i-1}, x_i]$ in the partition $P$, denote by $Q_{[x_{i-1},x_i]}$ the part of the partition $Q$ lying inside the interval. Then

\[ S(Q,f) = \sum_{i=1}^n S(Q_{[x_{i-1},x_i]}, f), \]

and by the inequality (3.1.6), we have

\[ |S(P,f) - S(Q,f)| = \left|\sum_{i=1}^n f(x_i^*)\Delta x_i - \sum_{i=1}^n S(Q_{[x_{i-1},x_i]}, f)\right| \le \sum_{i=1}^n \left|f(x_i^*)(x_i - x_{i-1}) - S(Q_{[x_{i-1},x_i]}, f)\right| \le \sum_{i=1}^n \omega_{[x_{i-1},x_i]}(f)\Delta x_i. \]

Since $\|P\| < \delta$, the right side is less than $\varepsilon$. By the same reason, we get $|S(P',f) - S(Q,f)| < \varepsilon$. Therefore

\[ |S(P,f) - S(P',f)| \le |S(P,f) - S(Q,f)| + |S(P',f) - S(Q,f)| < 2\varepsilon. \]

Example 3.1.6. For the function $f(x) = x$ in Example 3.1.1, we have $\omega_{[x_{i-1},x_i]}(f) = x_i - x_{i-1} \le \|P\|$ and $\sum \omega_{[x_{i-1},x_i]}(f)\Delta x_i \le \sum \|P\|\Delta x_i = (b-a)\|P\|$. Then it is easy to see that the Riemann criterion is satisfied, so that $f(x) = x$ is integrable. Moreover, the computation in Example 3.1.1 tells us $\int_0^1 x\,dx = \frac{1}{2}$.

Example 3.1.7. For the Dirichlet function in Exercise 1.3.24, we have $\omega_{[x_{i-1},x_i]}(f) = 1$ and $\sum \omega_{[x_{i-1},x_i]}(f)\Delta x_i = b-a$. By the Riemann criterion, the function is not integrable.


Exercise 3.1.4. Prove that the function $f(x) = x^2$ is integrable on $[0,1]$.

Exercise 3.1.5. Study the integrability of the functions in Exercise 3.1.2 again by using Theorem 3.1.3.

Exercise 3.1.6. Prove the inequality $\omega(|f|) \le \omega(f)$. Then prove that the integrability of $f(x)$ implies the integrability of $|f(x)|$.

Exercise 3.1.7. Prove that the integrability of $f(x)$ implies the integrability of $f(x)^2$.

Exercise 3.1.8. Suppose a bounded function $f(x)$ on $[a,b]$ is integrable on $[c,b]$ for any $a < c < b$. Prove that $f(x)$ is integrable on $[a,b]$. In fact, we also have $\int_a^b f(x)dx = \lim_{c\to a^+} \int_c^b f(x)dx$ by Theorem 3.2.1.

3.1.3 Integrability of Continuous and Monotone Functions

By using the integrability criterion in Theorem 3.1.3, we may identify some important classes of integrable functions.

Proposition 3.1.4. Continuous functions on bounded closed intervals are Riemann integrable.

Proof. Let $f(x)$ be a continuous function on a bounded closed interval $[a,b]$. By Theorem 1.4.4, for any $\varepsilon > 0$, there is $\delta > 0$, such that $|x-y| < \delta$ implies $|f(x) - f(y)| < \varepsilon$. Suppose $\|P\| < \delta$. Then for any $x, y \in [x_{i-1}, x_i]$, we have $|x-y| \le |x_i - x_{i-1}| < \delta$, which implies $|f(x) - f(y)| < \varepsilon$. Therefore the oscillation $\omega_{[x_{i-1},x_i]}(f) \le \varepsilon$ and $\sum \omega_{[x_{i-1},x_i]}(f)\Delta x_i \le \varepsilon(b-a)$. By Theorem 3.1.3, the function is integrable.

Proposition 3.1.5. Monotone functions on bounded closed intervals are Riemann integrable.

Proof. Let $f(x)$ be an increasing function on a bounded closed interval $[a,b]$. Then $\omega_{[x_{i-1},x_i]}(f) = f(x_i) - f(x_{i-1})$, and

\[ \sum \omega_{[x_{i-1},x_i]}(f)\Delta x_i = \sum (f(x_i) - f(x_{i-1}))\Delta x_i \le \|P\|\sum (f(x_i) - f(x_{i-1})) = \|P\|(f(b) - f(a)). \]

By Theorem 3.1.3, this implies that the function is integrable.
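For an increasing function the oscillation on each subinterval is the difference of the endpoint values, so the bound by $\|P\|(f(b) - f(a))$ can be verified exactly on a concrete partition. The following sketch uses rational arithmetic; the function names and the sample partition are chosen here purely for illustration.

```python
from fractions import Fraction
from typing import Callable, Sequence


def monotone_oscillation_sum(f: Callable[[Fraction], Fraction],
                             points: Sequence[Fraction]) -> Fraction:
    """Sum of (f(x_i) - f(x_{i-1})) * dx_i for an increasing function f."""
    return sum((f(points[i]) - f(points[i - 1])) * (points[i] - points[i - 1])
               for i in range(1, len(points)))


def mesh(points: Sequence[Fraction]) -> Fraction:
    """The mesh ||P||, the largest subinterval length."""
    return max(points[i] - points[i - 1] for i in range(1, len(points)))


def cube(x: Fraction) -> Fraction:
    """An increasing test function, f(x) = x^3."""
    return x ** 3


# A non-uniform partition of [0, 2].
points = [Fraction(0), Fraction(1, 3), Fraction(1), Fraction(3, 2), Fraction(2)]
lhs = monotone_oscillation_sum(cube, points)              # oscillation sum
rhs = mesh(points) * (cube(points[-1]) - cube(points[0]))  # ||P|| (f(b) - f(a))
```

On this partition the oscillation sum is 673/162 and the bound is 16/3, so the inequality holds with room to spare; refining the partition shrinks the bound in proportion to the mesh.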

Example 3.1.8. The function $x$ is integrable on $[a,b]$ by either Proposition 3.1.4 or Proposition 3.1.5. The computation in Example 3.1.1 tells us $\int_0^1 x\,dx = \lim_{n\to\infty} S(P_n, x) = \frac{1}{2}$. An extension of the computation shows $\int_a^b x\,dx = \frac{1}{2}(b^2 - a^2)$.

The function $x^2$ is integrable on $[a,b]$ by Proposition 3.1.4. To find the integral, we take a partition $P_n$ to consist of $x_i = a + \frac{i}{n}(b-a)$ and choose $x_i^* = x_i$. Then

\[ S(P_n, x^2) = \sum_{i=1}^n \left(a + \frac{i}{n}(b-a)\right)^2 \frac{1}{n}(b-a) \]

\[ = \sum_{i=1}^n \left(\frac{1}{n}a^2 + 2\frac{i}{n^2}a(b-a) + \frac{i^2}{n^3}(b-a)^2\right)(b-a) \]

\[ = \frac{n}{n}a^2(b-a) + \frac{n(n+1)}{n^2}a(b-a)^2 + \frac{n(n+1)(2n+1)}{6n^3}(b-a)^3. \]

Taking the limit as $n \to \infty$, we get

\[ \int_a^b x^2dx = a^2(b-a) + a(b-a)^2 + \frac{1}{3}(b-a)^3 = \frac{1}{3}(b^3 - a^3). \]
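The closed form of $S(P_n, x^2)$ can be checked against a direct summation, and its limit as $n \to \infty$, which works out to $\frac{1}{3}(b^3 - a^3)$, can be compared with the sum for a large $n$. The sketch below uses exact rational arithmetic; the function names and the choice $a = 1$, $b = 3$ are assumptions made for the illustration.

```python
from fractions import Fraction


def right_sum_of_square(a: Fraction, b: Fraction, n: int) -> Fraction:
    """S(P_n, x^2) with x_i = a + (i/n)(b - a) and sample points x_i^* = x_i."""
    dx = (b - a) / n
    return sum((a + i * dx) ** 2 * dx for i in range(1, n + 1))


def closed_form(a: Fraction, b: Fraction, n: int) -> Fraction:
    """Closed form of S(P_n, x^2) after summing i and i^2."""
    return (a ** 2 * (b - a)
            + Fraction(n * (n + 1), n ** 2) * a * (b - a) ** 2
            + Fraction(n * (n + 1) * (2 * n + 1), 6 * n ** 3) * (b - a) ** 3)


a, b = Fraction(1), Fraction(3)
limit = (b ** 3 - a ** 3) / 3  # limit of the closed form as n grows
gap = abs(right_sum_of_square(a, b, 1000) - limit)
```

The direct sum and the closed form agree for every `n` tried, and for `n = 1000` the sum differs from the limit 26/3 by less than 1/100.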

Exercise 3.1.9. Suppose $f(x)$ is a convex function on $[a,b]$. Use the inequalities in Exercises 2.3.56 and 2.3.57 to prove that $f(b) - f(a) = \int_a^b f'(x^-)dx = \int_a^b f'(x^+)dx$.

Proposition 3.1.6. Suppose $f(x)$ is an integrable function on a bounded interval $[a,b]$. Suppose the values of $f(x)$ lie in a finite union $U$ of closed intervals, and $\varphi(y)$ is a continuous function on $U$. Then the composition $\varphi(f(x))$ is integrable.

Proof. By Theorems 1.4.4 and 1.4.5, the continuous function $\varphi$ is uniformly continuous and bounded on each interval inside $U$. Since $U$ contains finitely many closed intervals, $\varphi$ is also uniformly continuous and bounded on $U$. In other words, for any $\varepsilon > 0$, there is $\delta > 0$, such that $|y - y'| < \delta$ implies $|\varphi(y) - \varphi(y')| < \varepsilon$. In particular, $\omega(f) < \delta$ implies $\omega(\varphi\circ f) \le \varepsilon$.

By Theorem 3.1.3, since $f(x)$ is integrable, there is $\delta' > 0$ satisfying $\delta' < \delta$, such that $\|P\| < \delta'$ implies $\sum \omega_{[x_{i-1},x_i]}(f)\Delta x_i < \delta\varepsilon$. Then the sum

\[ \sum \omega_{[x_{i-1},x_i]}(\varphi\circ f)\Delta x_i = \sum_{<\delta} + \sum_{\ge\delta}, \]

where $\sum_{<\delta}$ and $\sum_{\ge\delta}$ consist respectively of the terms with $\omega_{[x_{i-1},x_i]}(f) < \delta$ and with $\omega_{[x_{i-1},x_i]}(f) \ge \delta$. We have

\[ \sum_{<\delta} \omega_{[x_{i-1},x_i]}(\varphi\circ f)\Delta x_i \le (b-a)\varepsilon, \]

and

\[ \delta\sum_{\ge\delta} \Delta x_i \le \sum_{\ge\delta} \omega_{[x_{i-1},x_i]}(f)\Delta x_i \le \sum \omega_{[x_{i-1},x_i]}(f)\Delta x_i < \delta\varepsilon. \]

If $\varphi$ is bounded by $B$ on $U$, then $\omega_{[x_{i-1},x_i]}(\varphi\circ f) \le 2B$ and

\[ \sum_{\ge\delta} \omega_{[x_{i-1},x_i]}(\varphi\circ f)\Delta x_i \le 2B\sum_{\ge\delta} \Delta x_i \le 2B\varepsilon. \]

Thus we conclude

\[ \sum \omega_{[x_{i-1},x_i]}(\varphi\circ f)\Delta x_i = \sum_{<\delta} + \sum_{\ge\delta} \le (b-a)\varepsilon + 2B\varepsilon. \]

Since $a$, $b$, $B$ are all fixed constants, by Theorem 3.1.3, this implies that $\varphi\circ f$ is integrable.

Example 3.1.9. If $f(x)$ is integrable, then $f(x)^2$ and $|f(x)|$ are also integrable. If we further have $f(x) \ge 0$, then $\sqrt{f(x)}$ is integrable. If we further have $|f(x)| > c > 0$ for a constant $c$, then by taking $U = [-B,-c] \cup [c,B]$, where $B$ is the bound for $f$, we find $\frac{1}{f}$ to be integrable.

Exercise 3.1.10. Does the integrability of $|f(x)|$ imply the integrability of $f(x)$? What about $f(x)^2$? What about $f(x)^3$?

Exercise 3.1.11. Suppose $\varphi(x)$ satisfies $A(x_2 - x_1) < \varphi(x_2) - \varphi(x_1) < B(x_2 - x_1)$ for some constants $A, B > 0$ and all $a \le x_1 < x_2 \le b$.

1. Prove that $\omega_{[x_1,x_2]}(f\circ\varphi) = \omega_{[\varphi(x_1),\varphi(x_2)]}(f)$.

2. Prove that if $f(y)$ is integrable on $[\varphi(a), \varphi(b)]$, then $f(\varphi(x))$ is integrable on $[a,b]$.

Moreover, prove that if $\varphi(x)$ is continuous on $[a,b]$ and differentiable on $(a,b)$, satisfying $A < \varphi'(x) < B$ for all $x \in (a,b)$, then $A(x_2 - x_1) < \varphi(x_2) - \varphi(x_1) < B(x_2 - x_1)$ for all $a \le x_1 < x_2 \le b$.

3.1.4 Properties of Integration

Being defined as a certain type of limit, integration has properties similar to those of the limit. The properties $\lim_{n\to\infty}(x_n + y_n) = \lim_{n\to\infty} x_n + \lim_{n\to\infty} y_n$ and $\lim_{n\to\infty} cx_n = c\lim_{n\to\infty} x_n$ have the following analogues.

Proposition 3.1.7. Suppose $f(x)$ and $g(x)$ are integrable on $[a,b]$. Then $f(x) + g(x)$ and $cf(x)$ are also integrable on $[a,b]$, and

\[ \int_a^b (f(x) + g(x))dx = \int_a^b f(x)dx + \int_a^b g(x)dx, \tag{3.1.7} \]

\[ \int_a^b cf(x)dx = c\int_a^b f(x)dx. \tag{3.1.8} \]

Proof. Denote $I = \int_a^b f(x)dx$ and $J = \int_a^b g(x)dx$. For any $\varepsilon > 0$, there is $\delta > 0$, such that

\[ \|P\| < \delta \implies |S(P,f) - I| < \varepsilon,\ |S(P,g) - J| < \varepsilon. \]

On the other hand, by choosing the same partition $P$ and the same $x_i^*$ for both $f(x)$ and $g(x)$, we have

\[ S(P, f+g) = \sum (f(x_i^*) + g(x_i^*))\Delta x_i = \sum f(x_i^*)\Delta x_i + \sum g(x_i^*)\Delta x_i = S(P,f) + S(P,g). \]

Thus we conclude

\[ \|P\| < \delta \implies |S(P, f+g) - I - J| \le |S(P,f) - I| + |S(P,g) - J| < 2\varepsilon. \]

This shows that $f + g$ is also integrable, and $\int_a^b (f(x) + g(x))dx = I + J$. The proof of $\int_a^b cf(x)dx = cI$ is similar.

Example 3.1.10. Let $L(x) = A + Bx$ be the linear function satisfying $L(a) = k$ and $L(b) = l$. Then $k + l = 2A + B(a+b)$. By Examples 3.1.2 and 3.1.8, we have

\[ \int_a^b L(x)dx = A\int_a^b dx + B\int_a^b x\,dx = A(b-a) + \frac{B}{2}(b^2 - a^2) = \frac{1}{2}(b-a)(k+l). \]

This is indeed the area of the trapezoid under the line.

Example 3.1.11. Suppose $f(x)$ and $g(x)$ are integrable. By Proposition 3.1.7, $f(x) + g(x)$ is integrable. By Proposition 3.1.6, $f(x)^2$, $g(x)^2$ and $(f(x) + g(x))^2$ are also integrable. Then by Proposition 3.1.7 again, the product

\[ f(x)g(x) = \frac{1}{2}\left[(f(x) + g(x))^2 - f(x)^2 - g(x)^2\right] \]

is also integrable. However, there is no formula expressing the integral of $f(x)g(x)$ in terms of the integrals of $f(x)$ and $g(x)$.

Moreover, if $|g(x)| > c > 0$ for a constant $c$, then by the discussion in Example 3.1.9, the quotient $\frac{f(x)}{g(x)} = f(x)\frac{1}{g(x)}$ is integrable.

Example 3.1.12. Suppose $f(x)$ is integrable on $[a,b]$. Suppose $g(x) = f(x)$ for all $x$ except at $c \in [a,b]$. Then $g(x) = f(x) + \lambda d_c(x)$, where $d_c(x)$ is the function in Example 3.1.3 and $\lambda = g(c) - f(c)$. In particular, by the computation in Example 3.1.3, $g(x)$ is also integrable and

\[ \int_a^b g(x)dx = \int_a^b f(x)dx + \lambda\int_a^b d_c(x)dx = \int_a^b f(x)dx. \]

The example shows that changing an integrable function at finitely many places does not change the integrability and the integral. In particular, it makes sense to talk about the integrability of a function $f(x)$ on a bounded open interval $(a,b)$ because any numbers may be assigned as $f(a)$ and $f(b)$ without affecting the integrability of (the extended) $f(x)$ on $[a,b]$.

Example 3.1.13. In Example 3.1.5, we showed that Thomae's function $R(x)$ in Example 1.4.2 is integrable. By Example 3.1.12, we also know that the function

\[ f(x) = \begin{cases} 1 & \text{if } 0 < x \le 1, \\ 0 & \text{if } x = 0 \end{cases} \]

is integrable. However, the composition $f(R(x))$ is the Dirichlet function, which is not integrable by Example 3.1.4.

Exercise 3.1.12. Prove that if $f(x)$ and $g(x)$ are integrable, then $\max\{f(x), g(x)\}$ and $\min\{f(x), g(x)\}$ are also integrable.


Exercise 3.1.13. Find a function $f(x)$ on $[0, 1]$ that is never zero and is integrable, such that $\frac{1}{f(x)}$ is not integrable.

Proposition 3.1.8. Suppose $f(x)$ and $g(x)$ are integrable on $[a, b]$. If $f(x) \le g(x)$, then

\[ \int_a^b f(x)dx \le \int_a^b g(x)dx. \]

Proof. Choosing the same partition $P$ and the same $x_i^*$ for both functions, we have

\[ S(P, f) = \sum f(x_i^*)\Delta x_i \le \sum g(x_i^*)\Delta x_i = S(P, g). \]

Then by (3.1.7) for $f$ and $g$, for any $\varepsilon > 0$, there is $\delta > 0$, such that $\|P\| < \delta$ implies

\[ \int_a^b f(x)dx - \varepsilon < S(P, f) \le S(P, g) < \int_a^b g(x)dx + \varepsilon. \]

Since this holds for any $\varepsilon > 0$, we conclude that $\int_a^b f(x)dx \le \int_a^b g(x)dx$.

Example 3.1.14. If $f(x)$ is integrable, then $|f(x)|$ is integrable, and $-|f(x)| \le f(x) \le |f(x)|$. By Proposition 3.1.8, we have $-\int_a^b |f(x)|dx \le \int_a^b f(x)dx \le \int_a^b |f(x)|dx$. This is the same as

\[ \left|\int_a^b f(x)dx\right| \le \int_a^b |f(x)|dx. \tag{3.1.9} \]

Exercise 3.1.14. Suppose $f(x)$ is integrable on $[a, b]$. Prove that

\[ (b-a)\inf_{[a,b]} f \le \int_a^b f(x)dx \le (b-a)\sup_{[a,b]} f. \tag{3.1.10} \]

Exercise 3.1.15. Suppose $f(x)$ is continuous on $[a, b]$. Prove that there is $a < c < b$, such that

\[ \int_a^b f(x)dx = f(c)(b-a). \tag{3.1.11} \]

More generally, prove the first integral mean value theorem: For any non-negative integrable function $g(x)$ on $[a, b]$, there is $a < c < b$, such that

\[ \int_a^b g(x)f(x)dx = f(c)\int_a^b g(x)dx. \tag{3.1.12} \]

Exercise 3.1.16. Suppose $f(x)$ is integrable on $[a, b]$. Prove that

\[ \left|f(c)(b-a) - \int_a^b f(x)dx\right| \le \omega_{[a,b]}(f)(b-a) \tag{3.1.13} \]

for any $a \le c \le b$. Moreover, prove that

\[ \left|S(P, f) - \int_a^b f(x)dx\right| \le \sum \omega_{[x_{i-1},x_i]}(f)\Delta x_i. \tag{3.1.14} \]


Exercise 3.1.17. Suppose $f(x)$ is a continuous function on an open interval containing $[a, b]$. Prove that $\lim_{t\to 0}\int_a^b |f(x+t) - f(x)|dx = 0$. We will see in Exercise 3.1.46 that the continuity assumption is not needed.

If a region is divided into non-overlapping parts, then the whole area is the sum of the areas of the parts. The following result reflects this intuition.

Proposition 3.1.9. Suppose $a < b < c$ and $f(x)$ is a function on $[a, c]$. Then $f(x)$ is integrable on $[a, c]$ if and only if its restrictions on $[a, b]$ and $[b, c]$ are integrable. Moreover,

\[ \int_a^c f(x)dx = \int_a^b f(x)dx + \int_b^c f(x)dx. \tag{3.1.15} \]

Proof. The proof is based on the study of the relation of the Riemann sums of the function on $[a, b]$, $[b, c]$ and $[a, c]$. Let $P$ be a partition of $[a, c]$.

If $P$ contains $b$ as a partition point, then $P$ is obtained by combining a partition $P'$ of $[a, b]$ and a partition $P''$ of $[b, c]$ together. For any choices of $x_i^*$ for $P$ and the same choices for $P'$ and $P''$, we have $S(P, f) = S(P', f) + S(P'', f)$.

If $P$ does not contain $b$ as a partition point, then $x_{k-1} < b < x_k$ for some $k$, and the new partition $\tilde{P} = P \cup \{b\}$ is still obtained by combining a partition $P'$ of $[a, b]$ and a partition $P''$ of $[b, c]$ together. For any choices of $x_i^*$ for $P$, we keep all $x_i^*$ with $i \ne k$ and introduce $x_{k-1} < x_k'^* < b$, $b < x_k''^* < x_k$ for $\tilde{P}$. Then $S(\tilde{P}, f) = S(P', f) + S(P'', f)$ as before, and

\begin{align*}
|S(P, f) - S(P', f) - S(P'', f)| &= |S(P, f) - S(\tilde{P}, f)| \\
&= |f(x_k^*)(x_k - x_{k-1}) - f(x_k'^*)(b - x_{k-1}) - f(x_k''^*)(x_k - b)| \\
&\le 2\sup_{[x_{k-1},x_k]} |f| \, \|P\|.
\end{align*}

Suppose $f$ is integrable on $[a, b]$ and $[b, c]$. Then by Proposition 3.1.2, $f(x)$ is bounded on the two intervals. Thus $|f(x)| < B$ for some constant $B$ and all $x \in [a, c]$. Denote $I = \int_a^b f(x)dx$ and $J = \int_b^c f(x)dx$. For any $\varepsilon > 0$, there is $\delta > 0$, such that for partitions $P'$ of $[a, b]$ and $P''$ of $[b, c]$ satisfying $\|P'\| < \delta$ and $\|P''\| < \delta$, we have $|S(P', f) - I| < \varepsilon$ and $|S(P'', f) - J| < \varepsilon$. Then for any partition $P$ of $[a, c]$ satisfying $\|P\| < \delta$, we always have (regardless of whether we are in the first or the second case above)

\begin{align*}
|S(P, f) - I - J| &\le |S(P, f) - S(P', f) - S(P'', f)| + |S(P', f) - I| + |S(P'', f) - J| \\
&< 2\delta B + 2\varepsilon.
\end{align*}

Since $\delta$ may be further reduced so that $\delta B < \varepsilon$, this implies that $f(x)$ is integrable on $[a, c]$, with $\int_a^c f(x)dx = I + J$.

It remains to show that the integrability of $f(x)$ on $[a, c]$ implies the integrability of $f$ on $[a, b]$ and $[b, c]$. By the Cauchy criterion, for any $\varepsilon > 0$,


there is $\delta > 0$, such that for partitions $P$ and $Q$ of $[a, c]$ satisfying $\|P\| < \delta$ and $\|Q\| < \delta$, we have $|S(P, f) - S(Q, f)| < \varepsilon$. Now suppose $P'$ and $Q'$ are partitions of $[a, b]$ satisfying $\|P'\| < \delta$ and $\|Q'\| < \delta$. Let $R$ be any partition of $[b, c]$ satisfying $\|R\| < \delta$. By adding $R$ to $P'$ and $Q'$, we get partitions $P$ and $Q$ of $[a, c]$ satisfying $\|P\| < \delta$ and $\|Q\| < \delta$. Moreover, the choices of $x_i^*$ (which may be different for $P'$ and $Q'$) may be extended by adding the same $x_i^*$ for $R$. Then we get

\[ |S(P', f) - S(Q', f)| = |(S(P', f) + S(R, f)) - (S(Q', f) + S(R, f))| = |S(P, f) - S(Q, f)| < \varepsilon. \]

This proves the integrability for $f$ on $[a, b]$. The proof of the integrability on $[b, c]$ is similar.

Example 3.1.15. A function $f(x)$ on $[a, b]$ is a step function if there is a partition $P$ and constants $c_i$, such that $f(x) = c_i$ for $x_{i-1} < x < x_i$ (it does not matter what $f(x_i)$ are). Then

\begin{align*}
\int_a^b f(x)dx &= \sum \int_{x_{i-1}}^{x_i} f(x)dx && \text{(Proposition 3.1.9)} \\
&= \sum \int_{x_{i-1}}^{x_i} c_i\,dx && \text{(Example 3.1.12)} \\
&= \sum c_i(x_i - x_{i-1}). && \text{(Example 3.1.2)}
\end{align*}
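The step-function formula is easy to try out numerically. The sketch below is illustrative only: the helper names, the sample partition and the constants are assumptions made for the demonstration, and the values assigned at the partition points are arbitrary, as the example allows.

```python
def step_integral(points, values):
    """Sum of c_i * (x_i - x_{i-1}) for a step function on the partition `points`."""
    return sum(c * (x1 - x0) for c, x0, x1 in zip(values, points, points[1:]))


def make_step_function(points, values, value_at_nodes=0.0):
    """Return f with f(x) = c_i on each open interval (x_{i-1}, x_i)."""
    def f(x):
        for c, x0, x1 in zip(values, points, points[1:]):
            if x0 < x < x1:
                return c
        return value_at_nodes  # the value at partition points does not matter
    return f


def riemann_sum(f, a, b, n):
    """Riemann sum with n equal subintervals and midpoints as sample points."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx


# Hypothetical partition 0 < 1 < 2.5 < 4 with constants 2, -1, 3.
points = [0.0, 1.0, 2.5, 4.0]
values = [2.0, -1.0, 3.0]

exact = step_integral(points, values)
approx = riemann_sum(make_step_function(points, values), points[0], points[-1], 4000)
print(exact, approx)
```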

Example 3.1.16. Combining Example 3.1.10 with Proposition 3.1.9, we know the integral of a piecewise linear function is equal to the geometrical area of the region under the graph.

Exercise 3.1.18. Suppose $f(x)$ is a continuous function on $[a, b]$. Prove the following are equivalent.

1. $f(x) = 0$ for all $x$.

2. $\int_a^b |f(x)|dx = 0$.

3. $\int_c^d f(x)dx = 0$ for any $[c, d] \subset [a, b]$.

4. $\int_a^b f(x)g(x)dx = 0$ for any continuous function $g(x)$.

Exercise 3.1.19. Suppose $f(x)$ is a continuous function on $[a, b]$. Prove that $\left|\int_a^b f(x)dx\right| = \int_a^b |f(x)|dx$ if and only if $f(x)$ does not change sign.

Exercise 3.1.20. Suppose $f(x)$ is a continuous function on $[a, b]$. Prove that if $\int_a^b f(x)dx = \int_a^b xf(x)dx = 0$, then there are at least two distinct points $a < c_1, c_2 < b$, such that $f(c_1) = f(c_2) = 0$. Extend the result to more points.


Exercise 3.1.21. Suppose $f(x) \ge 0$ is a concave function on $[a, b]$. Then for any $y \in [a, b]$, $f(x)$ is bigger than the function obtained by connecting straight lines from $(a, 0)$ to $(y, f(y))$ and then to $(b, 0)$. Use this to prove that $f(y) \le \frac{2}{b-a}\int_a^b f(x)dx$. Moreover, determine when the equality holds.

Exercise 3.1.22. Suppose $f(x)$ is continuous on $[a, b]$ and differentiable on $(a, b)$. Suppose $m \le f' \le M$ on $(a, b)$ and denote $\mu = \frac{f(b) - f(a)}{b - a}$. By comparing $f(x)$ with suitable piecewise linear functions, prove that

\[ \left|\int_a^b f(x)dx - \frac{f(a) + f(b)}{2}(b-a)\right| \le \frac{(M - \mu)(\mu - m)}{2(M - m)}(b-a)^2. \tag{3.1.16} \]

Exercise 3.1.23. Suppose $f(x)$ is a non-negative and strictly increasing function on $[a, b]$.

1. Prove that if $f(b) \le 1$, then $\lim_{n\to\infty}\int_a^b f(x)^n dx = 0$.

2. Prove that if $f(b) > 1$ and $f(x)$ is continuous at $b$, then $\lim_{n\to\infty}\int_a^b f(x)^n dx = +\infty$.

Extend the result to $\lim_{n\to\infty}\int_a^b f(x)^n g(x)dx$, where $g(x)$ is non-negative and integrable on $[a, b]$.

Exercise 3.1.24. Suppose $f(x)$ is continuous on $[a, b]$ and $f(x) > 0$ on $(a, b)$. Suppose $g(x)$ is integrable on $[a, b]$. Prove that $\lim_{n\to\infty}\int_a^b g(x)\sqrt[n]{f(x)}\,dx = \int_a^b g(x)dx$.

Exercise 3.1.25. Suppose $f(x) \ge 0$ is continuous on $[a, b]$. Prove that

\[ \lim_{p\to+\infty}\left(\int_a^b f(x)^p dx\right)^{\frac{1}{p}} = \max_{[a,b]} f(x). \]

Exercise 3.1.26. Suppose $f(x)$ satisfies the Lipschitz condition $|f(x) - f(x')| < L|x - x'|$ on $[a, b]$. Prove that for any partition $P$ and any choices of $x_i^*$, we have

\[ \left|S(P, f) - \int_a^b f(x)dx\right| \le \frac{L}{2}\sum \Delta x_i^2. \]

This gives an estimate of how close the Riemann sum is to the actual integral.
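To see the size of this estimate in a concrete case, the following sketch compares the actual error of a Riemann sum with the bound $\frac{L}{2}\sum \Delta x_i^2$. The choices here are assumptions made for illustration: the function is $\sin x$ on $[0, 2]$ (Lipschitz with constant 1), the partition is uniform, and the sample points are the left endpoints. It illustrates the inequality and is not a proof.

```python
import math


def riemann_sum_left(f, xs):
    """Riemann sum on the partition xs using left endpoints as sample points."""
    return sum(f(x0) * (x1 - x0) for x0, x1 in zip(xs, xs[1:]))


def lipschitz_bound(lipschitz_const, xs):
    """The bound (L / 2) * (sum of squared subinterval lengths)."""
    return 0.5 * lipschitz_const * sum((x1 - x0) ** 2 for x0, x1 in zip(xs, xs[1:]))


a, b, n = 0.0, 2.0, 50
xs = [a + (b - a) * i / n for i in range(n + 1)]

exact = 1.0 - math.cos(2.0)        # integral of sin over [0, 2]
error = abs(riemann_sum_left(math.sin, xs) - exact)
bound = lipschitz_bound(1.0, xs)   # sin has Lipschitz constant 1
print(error, bound)
```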

Exercise 3.1.27. Suppose $g(x)$ is integrable on $[a, b]$. Prove that for any partition $P$ of $[a, b]$ and choices of $x_i^*$, we have

\[ \left|\sum f(x_i^*)\int_{x_{i-1}}^{x_i} g(x)dx - \int_a^b f(x)g(x)dx\right| \le \sup_{[a,b]}|g| \sum \omega_{[x_{i-1},x_i]}(f)\Delta x_i. \]

In particular, if $f(x)$ is integrable, then

\[ \lim_{\|P\|\to 0}\sum f(x_i^*)\int_{x_{i-1}}^{x_i} g(x)dx = \int_a^b f(x)g(x)dx. \]


The definition of the Riemann integral $\int_a^b f(x)dx$ implicitly assumes $a < b$. If $a > b$, then we also define

\[ \int_a^b f(x)dx = -\int_b^a f(x)dx. \tag{3.1.17} \]

Moreover, we define $\int_a^a f(x)dx = 0$ (which can be considered as a special case of the original definition of the Riemann integral). Then the equality

\[ \int_a^c f(x)dx = \int_a^b f(x)dx + \int_b^c f(x)dx \]

still holds for any order between $a$, $b$, $c$. Proposition 3.1.7 still holds for $a \ge b$, and the direction of the inequality in Proposition 3.1.8 needs to be reversed for $a \ge b$.

3.1.5 Additional Exercise

Modified Riemann Sum and Riemann Product

Exercise 3.1.28. Let $\varphi(t)$ be a function defined near 0. For any partition $P$ of $[a, b]$ and choices of $x_i^*$, define the “modified Riemann sum”

\[ S_\varphi(P, f) = \sum_{i=1}^n \varphi(f(x_i^*)\Delta x_i). \]

Prove that if $\varphi$ is differentiable at 0, such that $\varphi(0) = 0$ and $\varphi'(0) = 1$, and $f(x)$ is integrable on $[a, b]$, then $\lim_{\|P\|\to 0} S_\varphi(P, f) = \int_a^b f(x)dx$.

Exercise 3.1.29. For any partition $P$ of $[a, b]$ and choices of $x_i^*$, define the “Riemann product”

\[ \Pi(P, f) = (1 + f(x_1^*)\Delta x_1)(1 + f(x_2^*)\Delta x_2)\cdots(1 + f(x_n^*)\Delta x_n). \]

Prove that if $f(x)$ is integrable on $[a, b]$, then $\lim_{\|P\|\to 0}\Pi(P, f) = e^{\int_a^b f(x)dx}$.
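A numerical experiment can make the statement about the Riemann product plausible before it is proved. The sketch below rests on illustrative assumptions: $f(x) = x$ on $[0, 1]$, uniform partitions, and midpoints as $x_i^*$; the helper name `riemann_product` is hypothetical.

```python
import math


def riemann_product(f, a, b, n):
    """Product of (1 + f(x_i*) * dx) over n equal subintervals, midpoints as x_i*."""
    dx = (b - a) / n
    product = 1.0
    for i in range(n):
        product *= 1.0 + f(a + (i + 0.5) * dx) * dx
    return product


def identity(x):
    return x


target = math.exp(0.5)  # exp of the integral of x over [0, 1]
for n in (10, 100, 1000, 10000):
    print(n, riemann_product(identity, 0.0, 1.0, n), target)
```

The printed products should approach $e^{1/2}$ as the partition gets finer.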

Integrability and Continuity

The converse of Proposition 3.1.4 is not true. So how continuous must an integrable function be?

Exercise 3.1.30. Suppose $f(x)$ is integrable on $[a, b]$. Prove that for any $\varepsilon > 0$, there is $\delta > 0$, such that for any partition $P$ satisfying $\|P\| < \delta$, we have $\omega_{[x_{i-1},x_i]}(f) < \varepsilon$ for some interval $[x_{i-1}, x_i]$ in the partition.

Exercise 3.1.31. Suppose there is a sequence of intervals $[a, b] \supset [a_1, b_1] \supset [a_2, b_2] \supset \cdots$, such that $a_n < c < b_n$ for all $n$ and $\lim_{n\to\infty}\omega_{[a_n,b_n]}(f) = 0$. Prove that $f(x)$ is continuous at $c$.

Exercise 3.1.32. Prove that an integrable function must be continuous somewhere. In fact, prove that for any $(c, d) \subset [a, b]$, an integrable function on $[a, b]$ is continuous somewhere in $(c, d)$. In other words, the continuous points of the integrable function must be dense.


Exercise 3.1.33. Define the oscillation

\[ \omega(x) = \lim_{\delta\to 0^+}\omega_{[x-\delta,x+\delta]}(f) = \varlimsup_{y\to x} f(y) - \varliminf_{y\to x} f(y) \]

of a function at a point (see the definition before Exercise 1.3.38 for the upper and lower limit of functions). Prove that $f(x)$ is continuous at $x_0$ if and only if $\omega(x_0) = 0$.

Exercise 3.1.34. Prove that if $f(x)$ is integrable, then for any $\varepsilon > 0$ and $\delta > 0$, there is a union $U$ of finitely many intervals, such that the sum of the lengths of the intervals in $U$ is $< \varepsilon$, and $\omega(x) \ge \delta$ implies $x \in U$.

Exercise 3.1.35 (Hankel³). Prove that if $f(x)$ is integrable, then for any $\varepsilon > 0$, there is a union $U$ of countably many intervals, such that the sum of the lengths of the intervals in $U$ is $< \varepsilon$, and all discontinuous points of $f(x)$ are inside $U$. This basically says that the set of discontinuous points of a Riemann integrable function has Lebesgue measure 0. The converse is also true.

Strict Inequality in Integration

The existence of the continuous points proved in Exercise 3.1.32 for integrable functions enables us to change the inequalities in Proposition 3.1.8 to become strict.

Exercise 3.1.36. Prove that if $f(x) > 0$ is integrable on $[a, b]$, then $\int_a^b f(x)dx > 0$. In particular, this shows

\[ f(x) < g(x) \implies \int_a^b f(x)dx < \int_a^b g(x)dx. \]

Exercise 3.1.37. Suppose $f(x)$ is an integrable function on $[a, b]$. Prove the following are equivalent.

1. $\int_c^d f(x)dx = 0$ for any $[c, d] \subset [a, b]$.

2. $\int_a^b |f(x)|dx = 0$.

3. $\int_a^b f(x)g(x)dx = 0$ for any continuous function $g(x)$.

4. $\int_a^b f(x)g(x)dx = 0$ for any integrable function $g(x)$.

5. $f(x) = 0$ at continuous points.

Refinement of Partition and Integrability Criterion

The integrability criterion in Theorem 3.1.3 requires the Riemann sum of oscillations to be small for all partitions $P$ satisfying $\|P\| < \delta$. The following exercises show that it is sufficient for this to happen for just one partition $P$.

³ Hermann Hankel, born 1839 in Halle (Germany), died 1873 in Schramberg (Germany). Hankel was Riemann’s student, and his study of Riemann’s integral prepared for the discovery of the Lebesgue integral.


Exercise 3.1.38. Suppose a partition $Q: a = y_0 < y_1 < y_2 < \cdots < y_m = b$ is obtained by adding $k$ points to a partition $P: a = x_0 < x_1 < x_2 < \cdots < x_n = b$. Prove that

\[ \sum \omega_{[y_{j-1},y_j]}(f)\Delta y_j \le \sum \omega_{[x_{i-1},x_i]}(f)\Delta x_i \]

and

\[ \sum \omega_{[x_{i-1},x_i]}(f)\Delta x_i \le \sum \omega_{[y_{j-1},y_j]}(f)\Delta y_j + 2k\|P\|\omega_{[a,b]}(f). \]

Exercise 3.1.39. Prove that a function $f(x)$ is integrable if and only if for any $\varepsilon > 0$, there is a partition $P$, such that $\sum \omega_{[x_{i-1},x_i]}(f)\Delta x_i < \varepsilon$.

Darboux Sum and Darboux Integral

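For a function $f(x)$ on $[a, b]$ and a partition $P$ of $[a, b]$, the upper and lower Darboux sums are

\begin{align*}
U(P, f) &= \sup_{\text{all } x_i^*} S(P, f) = \sum_{i=1}^n \sup_{[x_{i-1},x_i]} f(x)\,\Delta x_i, \tag{3.1.18} \\
L(P, f) &= \inf_{\text{all } x_i^*} S(P, f) = \sum_{i=1}^n \inf_{[x_{i-1},x_i]} f(x)\,\Delta x_i. \tag{3.1.19}
\end{align*}

For bounded $f(x)$, the upper and lower Darboux integrals are

\[ \overline{\int_a^b} f(x)dx = \inf_{\text{all } P} U(P, f), \qquad \underline{\int_a^b} f(x)dx = \sup_{\text{all } P} L(P, f). \tag{3.1.20} \]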
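The definitions can be made concrete with a small computation. The sketch below is a minimal illustration under a simplifying assumption: the function is increasing, so on each subinterval the infimum is attained at the left endpoint and the supremum at the right endpoint. The function $x^2$ on $[0, 1]$ and the helper name `darboux_sums_increasing` are illustrative choices.

```python
def darboux_sums_increasing(f, a, b, n):
    """Lower and upper Darboux sums of an increasing f on n equal subintervals.

    For an increasing function the infimum on [x0, x1] is f(x0) and the
    supremum is f(x1); this shortcut is not valid for a general function.
    """
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    lower = sum(f(x0) * (x1 - x0) for x0, x1 in zip(xs, xs[1:]))
    upper = sum(f(x1) * (x1 - x0) for x0, x1 in zip(xs, xs[1:]))
    return lower, upper


def square(x):
    return x * x


for n in (10, 100, 1000):
    lower, upper = darboux_sums_increasing(square, 0.0, 1.0, n)
    print(n, lower, upper, upper - lower)
```

In this case the gap between the two sums is $(f(1) - f(0))/n$, so both sums squeeze toward the common value $1/3$ as $n$ grows.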

Exercise 3.1.40. Prove that if $Q$ is a refinement of $P$, then $U(P, f) \ge U(Q, f) \ge L(Q, f) \ge L(P, f)$.

Exercise 3.1.41. Prove that

\[ \overline{\int_a^b} f(x)dx = \lim_{\|P\|\to 0} U(P, f), \qquad \underline{\int_a^b} f(x)dx = \lim_{\|P\|\to 0} L(P, f). \]

Exercise 3.1.42. Prove that

\[ \overline{\int_a^b} f(x)dx \ge \underline{\int_a^b} f(x)dx, \]

and the equality holds if and only if $f(x)$ is Riemann integrable on $[a, b]$. Moreover, the Riemann integral $\int_a^b f(x)dx$ is the common value.

Exercise 3.1.43. Prove that if $f(x)$ is integrable, then

\[ \int_a^b f(x)dx = \lim_{\|P\|\to 0}\sum \varphi_i\Delta x_i, \]

where $\varphi_i$ is any number satisfying $\inf_{[x_{i-1},x_i]} f(x) \le \varphi_i \le \sup_{[x_{i-1},x_i]} f(x)$.

Exercise 3.1.44. Study the properties in Section 3.1.4 for the Darboux integral.


Integral Continuity

The equality $\lim_{t\to 0}|f(x+t) - f(x)| = 0$ means the continuity of $f$ at $x$. Therefore $\lim_{t\to 0}\int_a^b |f(x+t) - f(x)|dx = 0$ means the “integral continuity” of $f$ on $[a, b]$. Exercise 3.1.17 says that continuity implies integral continuity. The following exercises show that the continuity of $f$ is unnecessary for the integral continuity.

Exercise 3.1.45. Suppose $f$ is integrable on an open interval containing $[a, b]$. Suppose $P$ is a partition of $[a, b]$ by intervals of equal length $\delta$. Prove that if $|t| < \delta$, then

\[ \int_{x_{i-1}}^{x_i} |f(x+t) - f(x)|dx \le \delta\left(\omega_{[x_{i-2},x_{i-1}]}(f) + \omega_{[x_{i-1},x_i]}(f) + \omega_{[x_i,x_{i+1}]}(f)\right). \]

Exercise 3.1.46. Suppose $f$ is integrable on an open interval containing $[a, b]$. Prove that $\lim_{t\to 0}\int_a^b |f(x+t) - f(x)|dx = 0$ and $\lim_{t\to 1}\int_a^b |f(tx) - f(x)|dx = 0$.

Integral Inequalities for Convex Functions

Exercise 3.1.47. Suppose $f(x)$ is a convex function on $[a, b]$. By comparing $f(x)$ with linear functions in Figure 2.6, prove

\[ f\left(\frac{a+b}{2}\right)(b-a) \le \int_a^b f(x)dx \le \frac{f(a) + f(b)}{2}(b-a). \]

Exercise 3.1.48. A weight on $[a, b]$ is a function $\lambda(x)$ satisfying

\[ \lambda(x) \ge 0, \qquad \frac{1}{b-a}\int_a^b \lambda(x)dx = 1. \tag{3.1.21} \]

We have $\frac{1}{b-a}\int_a^b \lambda(x)x\,dx = (1-\mu)a + \mu b$ for some $0 < \mu < 1$. For a convex function on $[a, b]$, prove that

\[ f((1-\mu)a + \mu b) \le \frac{1}{b-a}\int_a^b \lambda(x)f(x)dx \le (1-\mu)f(a) + \mu f(b). \]

The left inequality is the integral version of the Jensen inequality in Exercise 2.3.40. What do you get by applying the integral Jensen inequality to $x^2$, $e^x$ and $\log x$?

Exercise 3.1.49. Suppose $f(x)$ is a convex function on $[a, b]$ and $\varphi(t)$ is an integrable function on $[\alpha, \beta]$ satisfying $a \le \varphi(t) \le b$. Suppose $\lambda(t)$ is a weight function on $[\alpha, \beta]$ as defined in Exercise 3.1.48. Prove that

\[ f\left(\frac{1}{\beta-\alpha}\int_\alpha^\beta \lambda(t)\varphi(t)dt\right) \le \frac{1}{\beta-\alpha}\int_\alpha^\beta \lambda(t)f(\varphi(t))dt. \]

This further extends the integral Jensen inequality.

Hölder and Minkowski Inequalities


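Exercise 3.1.50. Suppose $p, q > 0$ satisfy $\frac{1}{p} + \frac{1}{q} = 1$. Prove the integral versions of the Hölder and Minkowski inequalities in Exercises 2.2.41 and 2.2.42.

\begin{align*}
\int_a^b |f(x)g(x)|dx &\le \left(\int_a^b |f(x)|^p dx\right)^{\frac{1}{p}}\left(\int_a^b |g(x)|^q dx\right)^{\frac{1}{q}}, \tag{3.1.22} \\
\left(\int_a^b |f(x) + g(x)|^p dx\right)^{\frac{1}{p}} &\le \left(\int_a^b |f(x)|^p dx\right)^{\frac{1}{p}} + \left(\int_a^b |g(x)|^p dx\right)^{\frac{1}{p}}. \tag{3.1.23}
\end{align*}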

Estimation of Integral

Exercise 3.1.51. Suppose $f(x)$ is continuous on $[a, b]$ and differentiable on $(a, b)$. By comparing $f(x)$ with the straight line $L(x) = f(a) + m(x-a)$ for $m = \sup_{(a,b)} f'$ or $m = \inf_{(a,b)} f'$, prove that

\[ \frac{\inf_{(a,b)} f'}{2}(b-a)^2 \le \int_a^b f(x)dx - f(a)(b-a) \le \frac{\sup_{(a,b)} f'}{2}(b-a)^2. \tag{3.1.24} \]

Then use Darboux’s intermediate value theorem in Exercise 2.2.33 to show

\[ \int_a^b f(x)dx = f(a)(b-a) + \frac{f'(c)}{2}(b-a)^2 \tag{3.1.25} \]

for some $a < c < b$.

Exercise 3.1.52. Suppose $f(x)$ is continuous on $[a, b]$ and differentiable on $(a, b)$. Suppose $g(x)$ is non-negative and integrable on $[a, b]$. Prove that

\[ \int_a^b f(x)g(x)dx = f(a)\int_a^b g(x)dx + f'(c)\int_a^b (x-a)g(x)dx \tag{3.1.26} \]

for some $a < c < b$.

Exercise 3.1.53. Suppose $f(x)$ is continuous on $[a, b]$ and differentiable on $(a, b)$. Use Exercise 3.1.52 to prove that

\[ \left|\int_a^b f(x)dx - \frac{f(a) + f(b)}{2}(b-a)\right| \le \frac{\omega_{(a,b)}(f')}{8}(b-a)^2. \tag{3.1.27} \]

In fact, this estimation can also be derived from Exercise 3.1.22.

Exercise 3.1.54. Suppose $f(x)$ is continuous on $[a, b]$ and has second order derivative on $(a, b)$. Use the Taylor expansion at $\frac{a+b}{2}$ to prove that

\[ \int_a^b f(x)dx = f\left(\frac{a+b}{2}\right)(b-a) + \frac{f''(c)}{24}(b-a)^3 \tag{3.1.28} \]

for some $a < c < b$.

3.2 Antiderivative

Riemann’s definition of integration is independent of differentiation. Before Riemann, however, Newton and Leibniz considered integration as the inverse of differentiation. The relation enables us to compute the integrals.


3.2.1 Fundamental Theorem of Calculus

Let us consider the integral of $f(x)$ as a function by changing the upper limit. The result is a function with $f(x)$ as the derivative.

Theorem 3.2.1. Suppose $f(x)$ is integrable. Then $F(x) = \int_a^x f(t)dt$ is a continuous function. Moreover, if $f(x)$ is continuous at $x_0$, then $F(x)$ is differentiable at $x_0$, with $F'(x_0) = f(x_0)$.

Proof. By Proposition 3.1.2, there is $B$ such that $|f(x)| < B$. Then by Proposition 3.1.8 (see also (3.1.9)) and Proposition 3.1.9,

\[ |F(x) - F(x_0)| = \left|\int_{x_0}^x f(t)dt\right| \le B|x - x_0|. \]

This implies $\lim_{x\to x_0} F(x) = F(x_0)$.

Now further assume that $f(x)$ is continuous at $x_0$. For any $\varepsilon > 0$, there is $\delta > 0$, such that $|x - x_0| < \delta$ implies $|f(x) - f(x_0)| < \varepsilon$. Then

\begin{align*}
|F(x) - F(x_0) - f(x_0)(x - x_0)| &= \left|\int_{x_0}^x f(t)dt - f(x_0)(x - x_0)\right| \\
&= \left|\int_{x_0}^x (f(t) - f(x_0))dt\right| \le \varepsilon|x - x_0|.
\end{align*}

This means that $F(x_0) + f(x_0)(x - x_0)$ is the linear approximation of $F(x)$ at $x_0$, with $F'(x_0) = f(x_0)$.

The theorem suggests that to compute the integral of a continuous function $f(x)$, we may consider a function $\varphi(x)$ satisfying $\varphi'(x) = f(x)$. Although the functions $\varphi(x)$ and $F(x)$ may not be equal, the property $\varphi'(x) = f(x) = F'(x)$ implies, by Proposition 2.2.3, that $\varphi(x) = F(x) + C$ for a constant $C$. Then, since $F(a) = 0$,

\[ \int_a^b f(x)dx = F(b) = F(b) - F(a) = \varphi(b) - \varphi(a). \]

For the obvious reason, the function $\varphi(x)$ is called an antiderivative of $f(x)$. It is unique up to adding a constant.

Example 3.2.1. For the sign function $\mathrm{sign}(x)$ in Example 1.4.1, we have

\[ \int_0^x \mathrm{sign}(t)dt = \begin{cases} \int_0^x 1\,dt = x & \text{if } x > 0 \\ 0 & \text{if } x = 0 \\ \int_0^x (-1)\,dt = -x & \text{if } x < 0 \end{cases} \]

by considering the antiderivative of $1$ for $x > 0$ and the antiderivative of $-1$ for $x < 0$. Thus $\int_0^x \mathrm{sign}(t)dt = |x|$ is continuous and is differentiable except at $x = 0$, which is the place where $\mathrm{sign}(t)$ is not continuous.


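Example 3.2.2. The functions $c$, $x$, $x^2$ are continuous, with $cx$, $\frac{1}{2}x^2$, $\frac{1}{3}x^3$ as antiderivatives. Therefore

\[ \int_a^b c\,dx = cb - ca, \qquad \int_a^b x\,dx = \frac{1}{2}b^2 - \frac{1}{2}a^2, \qquad \int_a^b x^2 dx = \frac{1}{3}b^3 - \frac{1}{3}a^3. \]

Example 3.2.3. The functions $\cos^2 x = \frac{1}{2}(1 + \cos 2x)$, $\tan^2 x = \sec^2 x - 1$ are continuous, with $\frac{1}{4}(2x + \sin 2x)$, $\tan x - x$ as antiderivatives. Therefore

\begin{align*}
\int_0^{\frac{\pi}{2}} \cos^2 x\,dx &= \frac{1}{4}\left(2\cdot\frac{\pi}{2} + \sin\pi\right) - \frac{1}{4}(2\cdot 0 + \sin 0) = \frac{\pi}{4}, \\
\int_0^{\frac{\pi}{4}} \tan^2 x\,dx &= \left(\tan\frac{\pi}{4} - \frac{\pi}{4}\right) - (\tan 0 - 0) = 1 - \frac{\pi}{4}.
\end{align*}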

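Example 3.2.4. The function $\frac{1}{x}$ is integrable on $[1, 2]$. Consider the partition $P: 1 < \frac{n+1}{n} < \frac{n+2}{n} < \cdots < \frac{2n}{n} = 2$ and take the right ends $\frac{n+1}{n}, \frac{n+2}{n}, \ldots, \frac{2n}{n}$ as $x_i^*$. Then the Riemann sum

\[ S\left(P, \frac{1}{x}\right) = \frac{1}{\frac{n+1}{n}}\frac{1}{n} + \frac{1}{\frac{n+2}{n}}\frac{1}{n} + \cdots + \frac{1}{\frac{2n}{n}}\frac{1}{n} = \frac{1}{n+1} + \frac{1}{n+2} + \cdots + \frac{1}{2n}. \]

Thus

\[ \lim_{n\to\infty}\left(\frac{1}{n+1} + \frac{1}{n+2} + \cdots + \frac{1}{2n}\right) = \lim_{\|P\|\to 0} S\left(P, \frac{1}{x}\right) = \int_1^2 \frac{1}{x}dx = \log 2, \]

where the last equality is obtained from the fact that $\log x$ is an antiderivative of $\frac{1}{x}$.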
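The convergence of the sums $\frac{1}{n+1} + \cdots + \frac{1}{2n}$ to $\log 2$ can be watched directly. This is a small illustrative sketch; the helper name `harmonic_block` is hypothetical.

```python
import math


def harmonic_block(n):
    """The sum 1/(n+1) + 1/(n+2) + ... + 1/(2n)."""
    return sum(1.0 / k for k in range(n + 1, 2 * n + 1))


for n in (10, 100, 1000, 10000):
    s = harmonic_block(n)
    print(n, s, math.log(2) - s)
```

The last column is the gap to $\log 2$, which shrinks as $n$ grows; the sums increase toward the limit from below.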

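Example 3.2.5. The function $g(x) = \int_0^{x^2} \frac{t\,dt}{1 + t^3}$ can be written as $g(x) = h(x^2)$, with $h(y) = \int_0^y \frac{t\,dt}{1 + t^3}$. By Theorem 3.2.1 and the chain rule, we have

\[ g'(x) = h'(x^2)\,2x = 2x\,\frac{x^2}{1 + (x^2)^3} = \frac{2x^3}{1 + x^6}. \]

By a similar idea, we have

\begin{align*}
\frac{d}{dx}\left(\int_{\log x}^{e^x} \frac{t\,dt}{1 + t^3}\right) &= \frac{d}{dx}\left(\int_0^{e^x} \frac{t\,dt}{1 + t^3} - \int_0^{\log x} \frac{t\,dt}{1 + t^3}\right) \\
&= e^x\,\frac{e^x}{1 + (e^x)^3} - \frac{1}{x}\,\frac{\log x}{1 + (\log x)^3} \\
&= \frac{e^{2x}}{1 + e^{3x}} - \frac{\log x}{x(1 + (\log x)^3)}.
\end{align*}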
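The first derivative formula of the example, $g'(x) = \frac{2x^3}{1 + x^6}$, can be cross-checked numerically. The sketch below is only an illustration under assumptions: the integral is replaced by a midpoint Riemann sum, the derivative by a symmetric difference quotient, and the sample point $x = 1.2$ is arbitrary.

```python
def integral(f, a, b, n=20000):
    """Midpoint Riemann sum approximating the integral of f over [a, b]."""
    dt = (b - a) / n
    return sum(f(a + (i + 0.5) * dt) for i in range(n)) * dt


def integrand(t):
    return t / (1.0 + t ** 3)


def g(x):
    """g(x) = integral of t / (1 + t^3) from 0 to x^2."""
    return integral(integrand, 0.0, x * x)


def g_prime_formula(x):
    """Derivative predicted by Theorem 3.2.1 combined with the chain rule."""
    return 2.0 * x ** 3 / (1.0 + x ** 6)


def central_difference(func, x, h=1e-4):
    return (func(x + h) - func(x - h)) / (2 * h)


x = 1.2
print(central_difference(g, x), g_prime_formula(x))
```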

Exercise 3.2.1. Compute the integrals.


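1. $\int_0^b x^n dx$, $n \in \mathbb{N}$.

2. $\int_1^b x^\alpha dx$, $b > 0$.

3. $\int_0^b \sin x\,dx$.

4. $\int_0^b \cos 2x\,dx$.

5. $\int_0^b e^x dx$.

6. $\int_0^b 2^x dx$.

Exercise 3.2.2. Compute the derivatives.

1. $\int_0^x \sin t^2 dt$.

2. $\int_0^{x^2} \sin t\,dt$.

3. $\int_0^{\sin x} t^2 dt$.

4. $\int_{\sin x}^x \sin t^2 dt$.

5. $\int_0^{|x|} \sin t^2 dt$.

6. $\int_0^{\sin x} |t|dt$.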

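Exercise 3.2.3. For non-negative integers $m$ and $n$, prove that

\begin{align*}
\int_0^{2\pi} \cos mx \sin nx\,dx &= 0, \\
\int_0^{2\pi} \cos mx \cos nx\,dx &= \begin{cases} 0 & \text{if } m \ne n \\ \pi & \text{if } m = n \ne 0 \\ 2\pi & \text{if } m = n = 0 \end{cases}, \tag{3.2.1} \\
\int_0^{2\pi} \sin mx \sin nx\,dx &= \begin{cases} 0 & \text{if } m \ne n \text{ or } m = n = 0 \\ \pi & \text{if } m = n \ne 0 \end{cases}.
\end{align*}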

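Exercise 3.2.4. Compute the limits by relating to the Riemann sums of suitable integrals.

1. $\lim_{n\to\infty} \frac{1^2\cdot 1 + 2^2\cdot 3 + \cdots + n^2\cdot(2n-1)}{n^4}$.

2. $\lim_{n\to\infty} \frac{1}{n}\left(\cos\frac{\pi}{n} + \cos\frac{2\pi}{n} + \cdots + \cos\frac{(n-1)\pi}{n}\right)$.

3. $\lim_{n\to\infty} \frac{\sqrt[n]{n!}}{n}$.

4. $\lim_{n\to\infty} \frac{1^\alpha + 3^\alpha + \cdots + (2n+1)^\alpha}{n^{\alpha+1}}$, $\alpha > 0$.

5. $\lim_{n\to\infty} \frac{1^\alpha + 3^\alpha + \cdots + (2n-1)^\alpha}{2^\alpha + 4^\alpha + \cdots + (2n)^\alpha}$, $\alpha > 0$.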

Exercise 3.2.5. If $f(x)$ is not continuous at $x_0$, is it true that $F(x) = \int_a^x f(t)dt$ is not differentiable at $x_0$?

Exercise 3.2.6. Suppose $f(x)$ is integrable and $F(x) = \int_a^x f(t)dt$. Prove that if $f(x)$ has left limit at $x_0$, then $F'(x_0^-) = f(x_0^-)$. In particular, this shows that if $f$ has different left and right limits at $x_0$, then $F(x)$ is not differentiable at $x_0$.

Exercise 3.2.7. Suppose $f(x)$ is integrable on $[a, b]$. Prove that there is $a < c < b$ such that $\int_a^c f(x)dx = \frac{1}{2}\int_a^b f(x)dx$.


Exercise 3.2.8. Find continuous functions on given intervals satisfying the equalities.

1. $\int_0^x f(t)dt = \int_x^1 f(t)dt$ on $[0, 1]$.

2. $A\int_0^x tf(t)dt = x\int_0^x f(t)dt$ on $(0, +\infty)$.

3. $(f(x))^2 = 2\int_0^x f(t)dt$ on $(-\infty, +\infty)$.

Exercise 3.2.9. Find continuous functions $f(x)$ on $(0, +\infty)$ such that $\int_a^{ab} f(x)dx$ is independent of $a > 0$ for all $b > 0$.

Exercise 3.2.10. Suppose $f(x)$ is continuous on $[a, b]$, differentiable on $(a, b)$, and satisfies $f(a) = 0$, $1 \ge f'(x) \ge 0$. Prove that $\left(\int_a^b f(x)dx\right)^2 \ge \int_a^b f(x)^3 dx$.

As a matter of fact, the continuity assumption can be weakened in order for the integral to be the antiderivative.

Theorem 3.2.2. Suppose $f(x)$ is integrable on $[a, b]$. Suppose $F(x)$ is continuous on $[a, b]$ and is differentiable on $(a, b)$. If $F'(x) = f(x)$, then $\int_a^b f(x)dx = F(b) - F(a)$.

The theorem is almost the inverse of Theorem 3.2.1. Put together, they are called the Fundamental Theorem of Calculus.

Proof. For a partition $P$ of $[a, b]$, we have

\[ F(b) - F(a) = \sum (F(x_i) - F(x_{i-1})) = \sum f(x_i^*)(x_i - x_{i-1}), \]

where the second equality comes from the mean value theorem and $F'(x) = f(x)$. Therefore $F(b) - F(a)$ is a Riemann sum with suitable choices of $x_i^*$. When $\|P\| \to 0$, by the integrability of $f(x)$, we get $F(b) - F(a) = \int_a^b f(x)dx$.

Example 3.2.6. By Exercise 3.1.8 and Proposition 3.1.4, the function

\[ f(x) = \begin{cases} 2x\sin\frac{1}{x} - \cos\frac{1}{x} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0 \end{cases} \]

is not continuous but still integrable on any bounded interval. Moreover, the function

\[ F(x) = \begin{cases} x^2\sin\frac{1}{x} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0 \end{cases} \]

is an antiderivative of $f(x)$. Therefore Theorem 3.2.2 can be applied to give us $\int_0^1 f(x)dx = F(1) - F(0) = \sin 1$. Note that the discussion after Theorem 3.2.1 does not apply to the example because of the discontinuity.
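The value $\sin 1$ can be compared with a Riemann sum of the discontinuous integrand. The sketch below is illustrative only: it assumes a uniform partition of $[0, 1]$ with midpoints as sample points, and because the integrand oscillates rapidly near 0, the agreement is only approximate for any fixed partition.

```python
import math


def f(x):
    """The integrand of the example, with the value 0 assigned at x = 0."""
    if x == 0.0:
        return 0.0
    return 2.0 * x * math.sin(1.0 / x) - math.cos(1.0 / x)


def F(x):
    """The antiderivative x^2 sin(1/x), extended by 0 at x = 0."""
    if x == 0.0:
        return 0.0
    return x * x * math.sin(1.0 / x)


def riemann_sum(func, a, b, n):
    """Riemann sum with n equal subintervals and midpoints as sample points."""
    dx = (b - a) / n
    return sum(func(a + (i + 0.5) * dx) for i in range(n)) * dx


approx = riemann_sum(f, 0.0, 1.0, 200000)
exact = F(1.0) - F(0.0)  # equals sin 1
print(approx, exact)
```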


Example 3.2.7. Consider $F(x) = x^2\sin\frac{1}{x^2}$ for $x \ne 0$ and $F(0) = 0$. The function is differentiable with the derivative

\[ F'(x) = \begin{cases} 2x\sin\frac{1}{x^2} - \frac{2}{x}\cos\frac{1}{x^2} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0. \end{cases} \]

However, $F'(x)$ is not integrable on $[0, 1]$ because it is not bounded. So the existence of antiderivative does not necessarily imply the integrability.

On the other hand, we note that the limit $\lim_{a\to 0^+}\int_a^1 F'(x)dx = \lim_{a\to 0^+}(F(1) - F(a)) = \sin 1$ exists. What we have here is that $\int_0^1 F'(x)dx$ is an improper integral with $\sin 1$ as the value.

Exercise 3.2.11. Prove that Theorem 3.2.2 still holds if $F(x)$ is differentiable at all but finitely many points (so $F(x)$ is piecewise differentiable).

Exercise 3.2.12. Suppose $f(x)$ is differentiable. Prove that $f'(x)$ is integrable if and only if there is an integrable function $g(x)$ such that $f(x) = f(a) + \int_a^x g(t)dt$.

Exercise 3.2.13. Suppose $f(x)$ is differentiable and $f'(x)$ is integrable. Let $[x]$ be the biggest integer $\le x$. Compute $\int_a^b [x]f'(x)dx$.

Exercise 3.2.14. Suppose $f(x)$ has integrable derivative on $[a, b]$.

1. Prove that if $f(x)$ vanishes somewhere on $[a, b]$, then $|f(x)| \le \int_a^b |f'(x)|dx$.

2. Prove that $\int_a^b |f(x)|dx \le \max\left\{\left|\int_a^b f(x)dx\right|, (b-a)\int_a^b |f'(x)|dx\right\}$.

Exercise 3.2.15. Suppose $f(x)$ is continuous on $[a, b]$ and differentiable on $(a, b)$, such that $f'(x)$ is integrable on $[a, b]$. Use the Hölder inequality (3.1.22) in Exercise 3.1.50 to prove that

\[ (f(b) - f(a))^2 \le (b-a)\int_a^b f'(x)^2 dx. \tag{3.2.2} \]

Then prove the following.

1. If $f(a) = 0$, then $\int_a^b f(x)^2 dx \le \frac{(b-a)^2}{2}\int_a^b f'(x)^2 dx$.

2. If $f(a) = f(b) = 0$, then $\int_a^b f(x)^2 dx \le \frac{(b-a)^2}{4}\int_a^b f'(x)^2 dx$.

3. If $f\left(\frac{a+b}{2}\right) = 0$, then $\int_a^b f(x)^2 dx \le \frac{(b-a)^2}{4}\int_a^b f'(x)^2 dx$.


3.2.2 Antiderivative

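Because of the fundamental theorem of calculus, we use $\int f(x)dx$ to denote all the antiderivatives of $f(x)$. In other words, if $F(x)$ is a differentiable function satisfying $F'(x) = f(x)$, then

\[ \int f(x)dx = F(x) + C, \]

where $C$ is an arbitrary constant. Here are some basic examples.

\begin{align*}
\int x^\alpha dx &= \begin{cases} \dfrac{x^{\alpha+1}}{\alpha+1} + C & \text{if } \alpha \ne -1 \\ \log|x| + C & \text{if } \alpha = -1, \end{cases} \\
\int \sin x\,dx &= -\cos x + C, \\
\int \cos x\,dx &= \sin x + C, \\
\int \tan x\,dx &= -\log|\cos x| + C, \\
\int \sec^2 x\,dx &= \tan x + C, \\
\int \sec x\tan x\,dx &= \sec x + C, \\
\int e^x dx &= e^x + C, \\
\int a^x dx &= \frac{a^x}{\log a} + C, \\
\int \log|x|\,dx &= x\log|x| - x + C, \\
\int \frac{dx}{\sqrt{1 - x^2}} &= \arcsin x + C, \\
\int \frac{dx}{1 + x^2} &= \arctan x + C, \\
\int \frac{dx}{x\sqrt{x^2 - 1}} &= \operatorname{arcsec} x + C.
\end{align*}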
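Any entry of such a list can be spot-checked by differentiating the right side numerically and comparing with the integrand. The sketch below does this for a few of the entries; the sample points, the base $a = 3$ and the helper name `central_difference` are arbitrary illustrative choices, and a numerical match at one point is evidence, not a proof.

```python
import math


def central_difference(F, x, h=1e-5):
    """Symmetric difference quotient approximating F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)


# (antiderivative F, integrand f, sample point x) for a few of the listed formulas
table = [
    (lambda x: -math.log(abs(math.cos(x))), math.tan, 0.7),
    (lambda x: x * math.log(abs(x)) - x, lambda x: math.log(abs(x)), -1.5),
    (math.asin, lambda x: 1.0 / math.sqrt(1.0 - x * x), 0.3),
    (math.atan, lambda x: 1.0 / (1.0 + x * x), 2.0),
    (lambda x: 3.0 ** x / math.log(3.0), lambda x: 3.0 ** x, 1.1),
]

for F, f, x in table:
    print(central_difference(F, x), f(x))
```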

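Example 3.2.8. By $\frac{1}{1 - x^2} = \frac{1}{2}\left(\frac{1}{1 - x} + \frac{1}{1 + x}\right)$, we get

\begin{align*}
\int \frac{dx}{1 - x^2} &= \frac{1}{2}\left(\int \frac{dx}{1 - x} + \int \frac{dx}{1 + x}\right) \\
&= \frac{1}{2}(-\log|1 - x| + \log|1 + x|) + C = \frac{1}{2}\log\left|\frac{1 + x}{1 - x}\right| + C.
\end{align*}

Example 3.2.9. By $\sin x\sin 2x = \frac{1}{2}(\cos x - \cos 3x)$, we get

\[ \int \sin x\sin 2x\,dx = \frac{1}{2}\sin x - \frac{1}{6}\sin 3x + C. \]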


By $\sin^2 x + \cos^2 x = 1$, we get

\begin{align*}
\int \frac{dx}{\sin^2 x\cos^2 x} &= \int \frac{\sin^2 x + \cos^2 x}{\sin^2 x\cos^2 x}dx \\
&= \int (\sec^2 x + \csc^2 x)dx = \tan x - \cot x + C.
\end{align*}

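Exercise 3.2.16. Verify the antiderivatives.

1. $\int \frac{dx}{\sqrt{x^2 + a}} = \log|x + \sqrt{x^2 + a}| + C$.

2. $\int \sqrt{1 - x^2}\,dx = \frac{1}{2}(\arcsin x + x\sqrt{1 - x^2}) + C$.

3. $\int e^{ax}\cos bx\,dx = e^{ax}\frac{a\cos bx + b\sin bx}{a^2 + b^2} + C$.

4. $\int \log|x|\,dx = x\log|x| - x + C$.

Exercise 3.2.17. Compute the antiderivatives.

1. $\int \frac{dx}{x(1 + x)}$.

2. $\int x(1 + x)^9 dx$.

3. $\int \frac{x^2}{(1 + x)^9}dx$.

4. $\int x^2(1 + x)^n dx$.

5. $\int \frac{x^2 - x + 1}{(1 + x)^n}dx$.

6. $\int (2^x + 2^{-x})^2 dx$.

7. $\int |x|dx$.

8. $\int \sin 2x\cos 3x\,dx$.

9. $\int \cos^2 x\,dx$.

10. $\int \tan^2 x\,dx$.

11. $\int \sin^3 x\,dx$.

12. $\int \frac{\cos 2x\,dx}{\sin^2 x\cos^2 x}$.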

The formulae for derivatives can be directly translated into formulae for antiderivatives. For example, the formula $(f(x) + g(x))' = f'(x) + g'(x)$ corresponds to the formula

\[ \int (f(x) + g(x))dx = \int f(x)dx + \int g(x)dx. \tag{3.2.3} \]

The Leibniz rule corresponds to the following. Suppose

\[ \int f(x)dx = F(x) + C, \qquad \int g(x)dx = G(x) + C. \]

Then

\[ \int F(x)g(x)dx = F(x)G(x) - \int f(x)G(x)dx. \tag{3.2.4} \]

This is the formula of the integration by parts.

Using the differential notation, we have $dF(x) = f(x)dx$ and $dG(x) = g(x)dx$. Thus the formula can also be written as

\[ \int F(x)dG(x) = F(x)G(x) - \int G(x)dF(x). \tag{3.2.5} \]


In other words, the integration by parts is a way of exchanging the functions “inside” and “outside” the differential notation. The viewpoint suggests the usefulness of including $dx$ in the integration (or antiderivative) notation $\int f(x)dx$. Indeed, in the more advanced mathematics, integrations are carried out for the differential forms $f(x)dx$ instead of the functions only.

Example 3.2.10. The antiderivative of the logarithmic function is

\[ \int \log|x|\,dx = x\log|x| - \int x\,d\log|x| = x\log|x| - \int x\frac{1}{x}dx = x\log|x| - x + C. \]

Example 3.2.11. To find the antiderivative of $e^x\sin x$, we apply the integration by parts twice.

\begin{align*}
\int e^x\sin x\,dx &= -\int e^x d\cos x \\
&= -e^x\cos x + \int \cos x\,de^x \\
&= -e^x\cos x + \int e^x\cos x\,dx \\
&= -e^x\cos x + \int e^x d\sin x \\
&= -e^x\cos x + e^x\sin x - \int \sin x\,de^x \\
&= -e^x\cos x + e^x\sin x - \int e^x\sin x\,dx.
\end{align*}

Solving for $\int e^x\sin x\,dx$, we get

\[ \int e^x\sin x\,dx = \frac{1}{2}e^x(\sin x - \cos x) + C. \]
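The antiderivative just obtained can be tested against a definite integral through Theorem 3.2.2. The sketch below assumes the interval $[0, \pi]$ and a midpoint Riemann sum as the numerical integrator; both are illustrative choices, and the helper names are hypothetical.

```python
import math


def antiderivative(x):
    """The result of the integration by parts: (1/2) e^x (sin x - cos x)."""
    return 0.5 * math.exp(x) * (math.sin(x) - math.cos(x))


def integrand(x):
    return math.exp(x) * math.sin(x)


def riemann_sum(f, a, b, n=10000):
    """Riemann sum with n equal subintervals and midpoints as sample points."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx


a, b = 0.0, math.pi
approx = riemann_sum(integrand, a, b)
exact = antiderivative(b) - antiderivative(a)  # equals (e^pi + 1) / 2
print(approx, exact)
```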

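Exercise 3.2.18. Compute the antiderivatives.

1. $\int e^{ax}\sin bx\,dx$.

2. $\int a^x\cos bx\,dx$.

3. $\int x^2 2^x dx$.

4. $\int xe^x\sin x\,dx$.

5. $\int x^2 e^{-x}\cos 2x\,dx$.

6. $\int x^n\arctan x\,dx$.

Exercise 3.2.19. Prove the recursive relations.

1. $\int \sin^n x\,dx = -\frac{1}{n}\sin^{n-1}x\cos x + \frac{n-1}{n}\int \sin^{n-2}x\,dx$.

2. $\int \cos^n x\,dx = \frac{1}{n}\cos^{n-1}x\sin x + \frac{n-1}{n}\int \cos^{n-2}x\,dx$.

3. $\int \tan^n x\,dx = \frac{\tan^{n-1}x}{n-1} - \int \tan^{n-2}x\,dx$.

4. $\int \sec^n x\,dx = \frac{\sec^{n-2}x\tan x}{n-1} + \frac{n-2}{n-1}\int \sec^{n-2}x\,dx$.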

Page 167: Mathematical Analysismajhu/Math203/book.pdfn!1(p n+ 1 n) = 0. 4. lim n!1 1 n2=3 n1=2 = 0. 5. lim n!1 cosn n = 0. 6. lim n!1 p n cosn p n+ sinn = 1. Exercise 1.1.2. Let a positive real

3.2. ANTIDERIVATIVE 167

5.∫xnexdx = xnex − n

∫xn−1exdx.

6.∫xα(log |x|)ndx =

1α+ 1

xα+1(log |x|)n − n

α+ 1

∫xα(log |x|)n−1dx.

7.∫ex sinn xdx =

1n2 − 1

ex sinn−1 x(n cosx− sinx) +n

n+ 1

∫ex sinn−2 xdx.

8.∫xn sinxdx = −xn cosx+ nxn−1 sinx− n(n− 1)

∫xn−2 sinxdx.

9.∫xn cosxdx = xn sinx+ nxn−1 cosx− n(n− 1)

∫xn−2 cosxdx.

10.∫

(1 + ax2)ndx =x(1 + ax2)n

2n+ 1+

2n2n+ 1

∫(1 + ax2)n−1dx.

Then compute the antiderivatives.

1.∫

sin6 xdx.

2.∫

sin5 x cos4 xdx.

3.∫

tan6 xdx.

4.∫

tan−6 xdx

5.∫x3(x+ 1)exdx.

6.∫ √

x(log x)2dx.

7.∫ex sin4 xdx.

8.∫x4 sinxdx.

9.∫

dx

(1 + x2)2.

Exercise 3.2.20. Let $I(m,n) = \int \cos^m x \sin^n x\,dx$. Use
\[
d\sin^n x = n\sin^{n-1} x \cos x\,dx, \quad d\cos^n x = -n\cos^{n-1} x \sin x\,dx
\]
to derive the recursive relations
\[
\begin{aligned}
I(m,n) &= -\frac{\cos^{m+1} x \sin^{n-1} x}{m+1} + \frac{n-1}{m+1}I(m+2, n-2) && \text{if } m \ne -1 \\
&= -\frac{\cos^{m+1} x \sin^{n-1} x}{m+n} + \frac{n-1}{m+n}I(m, n-2) && \text{if } m+n \ne 0.
\end{aligned}
\]
Find the similar relation by exchanging the sine and cosine functions. Then compute the antiderivatives $\int \cos^4 x \sin^5 x\,dx$, $\int \dfrac{\sin^4 x}{\cos^3 x}\,dx$.

Exercise 3.2.21. Let $I(m,n) = \int (x-a)^m (x-b)^n\,dx$. Find a recursive relation between $I(m,n)$ and $I(m-1,n+1)$. Then use the relation to compute the integral $\int_{-1}^{1} (x-1)^3 (x+1)^{10}\,dx$.

The chain rule for the derivative corresponds to the following. Suppose
\[
\int f(y)\,dy = F(y) + C.
\]
Then for any differentiable function $\varphi(x)$,
\[
\left.\int f(y)\,dy\,\right|_{y=\varphi(x)} = F(\varphi(x)) + C = \int f(\varphi(x))\varphi'(x)\,dx. \tag{3.2.6}
\]
This is the formula for the change of variable.


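Example 3.2.12. To compute $\int \dfrac{dx}{1+\sqrt{x}}$, we introduce $x = t^2$.
\[
\begin{aligned}
\int \frac{dx}{1+\sqrt{x}} &= \int \frac{d(t^2)}{1+t} = \int \frac{2t\,dt}{1+t} = \int 2\left(1 - \frac{1}{1+t}\right)dt \\
&= 2(t - \log|1+t|) + C = 2(\sqrt{x} - \log|1+\sqrt{x}|) + C.
\end{aligned}
\]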
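The change of variable in Example 3.2.12 can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the author's method: the helper `simpson` and the interval $x \in [1, 4]$ (so $t \in [1, 2]$) are chosen only for the demonstration. It checks that the integral in $x$, the transformed integral in $t$, and the closed form all agree.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Original variable: integrate 1/(1 + sqrt(x)) over x in [1, 4].
in_x = simpson(lambda x: 1 / (1 + math.sqrt(x)), 1.0, 4.0)

# After x = t**2 (dx = 2t dt): integrate 2t/(1 + t) over t in [1, 2].
in_t = simpson(lambda t: 2 * t / (1 + t), 1.0, 2.0)

def F(x):
    # Antiderivative from Example 3.2.12, taking C = 0.
    return 2 * (math.sqrt(x) - math.log(1 + math.sqrt(x)))

closed = F(4.0) - F(1.0)
print(in_x, in_t, closed)
```

Note that the limits of integration change together with the variable, which is exactly what the definite version of the formula requires.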

Example 3.2.13. To compute the antiderivative of $\dfrac{x}{\sqrt{5+4x-x^2}}$, we note $5+4x-x^2 = 9-(x-2)^2$ and introduce $x-2 = 3\sin t$.
\[
\begin{aligned}
\int \frac{x\,dx}{\sqrt{5+4x-x^2}} &= \int \frac{(2+3\sin t)\,3\,d\sin t}{\sqrt{9-9\sin^2 t}} = \int (2+3\sin t)\,dt \\
&= 2t - 3\cos t + C = 2\arcsin\frac{x-2}{3} - \sqrt{5+4x-x^2} + C.
\end{aligned}
\]
In general, to compute an antiderivative of the form $\int f(x, \sqrt{ax^2+bx+c})\,dx$, we may complete the square for the quadratic function $ax^2+bx+c$. Then a suitable trigonometric function can be used to change the variable.

Example 3.2.14. We compute the antiderivative of the secant function.
\[
\int \sec x\,dx = \int \frac{\cos x\,dx}{\cos^2 x} = \int \frac{d\sin x}{1-\sin^2 x} = \frac{1}{2}\log\left|\frac{1+\sin x}{1-\sin x}\right| + C.
\]
In the last step, the computation of Example 3.2.8 is used. Note that by
\[
\frac{1+\sin x}{1-\sin x} = \frac{(1+\sin x)^2}{(1-\sin x)(1+\sin x)} = \frac{(1+\sin x)^2}{\cos^2 x} = (\sec x + \tan x)^2,
\]
the antiderivative can also be written as
\[
\int \sec x\,dx = \log|\sec x + \tan x| + C.
\]

Example 3.2.15. The antiderivative of $\dfrac{1}{\sqrt{x^2+1}}$ may be computed by introducing $x = \tan t$.
\[
\int \frac{dx}{\sqrt{x^2+1}} = \int \frac{\sec^2 t\,dt}{\sec t} = \int \sec t\,dt = \log|\sec t + \tan t| + C = \log(x + \sqrt{x^2+1}) + C.
\]
The integration of $\sec t$ is computed in Example 3.2.14. Similarly, the antiderivative of $\dfrac{1}{\sqrt{x^2-1}}$ may be computed by introducing $x = \sec t$.
\[
\int \frac{dx}{\sqrt{x^2-1}} = \int \frac{\sec t \tan t\,dt}{\tan t} = \int \sec t\,dt = \log|x + \sqrt{x^2-1}| + C.
\]

Example 3.2.16. To compute the antiderivative of the inverse sine function, we introduce $x = \sin t$.
\[
\begin{aligned}
\int \arcsin x\,dx &= \int t\,d\sin t = t\sin t - \int \sin t\,dt \\
&= t\sin t + \cos t + C = x\arcsin x + \sqrt{1-x^2} + C.
\end{aligned}
\]
Note that the integration by parts is also used.

Exercise 3.2.22. Compute the antiderivatives.

1. $\int \dfrac{f'(x)}{f(x)^\alpha}\,dx$.

2. $\int \dfrac{f'(x)}{1+f(x)^2}\,dx$.

3. $\int 2^{f(x)} f'(x)\,dx$.

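Exercise 3.2.23. Compute the antiderivatives.

1. $\int \dfrac{dx}{\sqrt{x} + \sqrt[3]{x}}$.

2. $\int \dfrac{\sqrt{x}\,dx}{1 - \sqrt[3]{x}}$.

3. $\int (1 + \sqrt[3]{x})^{10}\,dx$.

4. $\int \dfrac{dx}{x^2+2x+5}$.

5. $\int \dfrac{x\,dx}{x^2+2x+5}$.

6. $\int \dfrac{x^3\,dx}{x^2+2x+5}$.

7. $\int \dfrac{dx}{(x^2+2x+5)^2}$.

8. $\int \dfrac{x^3\,dx}{(x^2+2x+5)^2}$.

9. $\int \sqrt{x^2+2x+5}\,dx$.

10. $\int \dfrac{dx}{\sqrt{2x-x^2}}$.

11. $\int x\sqrt{5+4x-x^2}\,dx$.

12. $\int \dfrac{x\,dx}{(x^2+2x+2)^{\frac{3}{2}}}$.

13. $\int \dfrac{(2x+1)\,dx}{\sqrt{x(x+1)}}$.

14. $\int \dfrac{dx}{x\log x}$.

15. $\int \dfrac{\log x}{x}\,dx$.

16. $\int \dfrac{dx}{e^x + e^{-x}}$.

17. $\int \cot x\,dx$.

18. $\int \csc x\,dx$.

19. $\int \tan^3 x\,dx$.

20. $\int \sec^3 x\,dx$.

21. $\int \dfrac{\sin^4 x}{\cos^3 x}\,dx$.

22. $\int \dfrac{\sin^5 x}{\cos^3 x}\,dx$.

23. $\int (\arcsin x)^2\,dx$.

24. $\int x(\arcsin x)^2\,dx$.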

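Exercise 3.2.24. Compute the antiderivatives ($a > 0$).

1. $\int (ax+b)^\alpha\,dx$.

2. $\int \dfrac{dx}{a^2+x^2}$.

3. $\int \dfrac{dx}{(a^2+x^2)^{\frac{3}{2}}}$.

4. $\int \dfrac{dx}{\sqrt{a^2-x^2}}$.

5. $\int \dfrac{dx}{\sqrt{a^2+x^2}}$.

6. $\int \dfrac{dx}{\sqrt{x^2-a^2}}$.

7. $\int \sqrt{a^2-x^2}\,dx$.

8. $\int \sqrt{a^2+x^2}\,dx$.

9. $\int \sqrt{x^2-a^2}\,dx$.

10. $\int \dfrac{x\,dx}{\sqrt{x^2-a^2}}$.

11. $\int \dfrac{dx}{x\sqrt{x^2-a^2}}$.

12. $\int \dfrac{\sqrt{x^2-a^2}}{x}\,dx$.

13. $\int x\sqrt{a^2-x^2}\,dx$.

14. $\int x\sqrt{a^2+x^2}\,dx$.

15. $\int x\sqrt{x^2-a^2}\,dx$.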

3.2.3 Integration by Parts

By the fundamental theorem of calculus, the integration by parts formula (3.2.4) for the antiderivatives gives the integration by parts formula for the Riemann integral.

Theorem 3.2.3 (Integration by Parts). Suppose $F(x)$ and $G(x)$ are continuous on $[a,b]$ and differentiable on $(a,b)$. Suppose $f(x) = F'(x)$ and $g(x) = G'(x)$ are integrable on $[a,b]$. Then
\[
\int_a^b F(x)g(x)\,dx = F(b)G(b) - F(a)G(a) - \int_a^b f(x)G(x)\,dx. \tag{3.2.7}
\]


Note that although $f(x)$ and $g(x)$ are not defined at $a$ and $b$, arbitrary numbers may be assigned as the values at the two points. By Example 3.1.12, this does not affect the integrability and the integral.

The conditions in the theorem are included to make sure the antiderivative can be used to compute the integrals. The condition is weakened in Exercise 3.2.57. The reason behind the weakened condition will be revealed in Theorem 3.3.4.

Finally, similar to (3.2.5), the integration by parts formula can also be written as
\[
\int_a^b F(x)\,dG(x) = [F(b)G(b) - F(a)G(a)] - \int_a^b G(x)\,dF(x). \tag{3.2.8}
\]

Example 3.2.17. Suppose $f(x)$ has integrable $(n+1)$-st order derivative. Then
\[
\begin{aligned}
f(x) &= f(x_0) + \int_{x_0}^{x} f'(t)\,dt = f(x_0) + \int_{x_0}^{x} f'(t)\,d(t-x) \\
&= f(x_0) + f'(x)(x-x) - f'(x_0)(x_0-x) - \int_{x_0}^{x} (t-x)f''(t)\,dt \\
&= f(x_0) + f'(x_0)(x-x_0) - \frac{1}{2}\int_{x_0}^{x} f''(t)\,d(t-x)^2 \\
&= f(x_0) + f'(x_0)(x-x_0) + \frac{1}{2}f''(x_0)(x-x_0)^2 + \frac{1}{2}\int_{x_0}^{x} (t-x)^2 f'''(t)\,dt \\
&= \cdots \\
&= f(x_0) + f'(x_0)(x-x_0) + \frac{1}{2!}f''(x_0)(x-x_0)^2 + \cdots + \frac{1}{n!}f^{(n)}(x_0)(x-x_0)^n \\
&\quad + (-1)^n\frac{1}{n!}\int_{x_0}^{x} (t-x)^n f^{(n+1)}(t)\,dt.
\end{aligned}
\]
This gives the integral form
\[
R_n(x) = \frac{1}{n!}\int_{x_0}^{x} (x-t)^n f^{(n+1)}(t)\,dt \tag{3.2.9}
\]
of the remainder of the Taylor series.
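The integral form (3.2.9) can be tested on a concrete function. The following minimal sketch takes $f = \exp$ (so that every derivative is again $\exp$), $x_0 = 0$, $x = 1$ and $n = 4$; these choices, and the helper names `simpson`, `remainder_integral` and `remainder_direct`, are assumptions made only for illustration. It compares the integral with the difference between $e^x$ and its Taylor polynomial.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def remainder_integral(x, n, x0=0.0):
    """Integral form (3.2.9) for f = exp, whose (n+1)-st derivative is exp."""
    integrand = lambda t: (x - t) ** n * math.exp(t)
    return simpson(integrand, x0, x) / math.factorial(n)

def remainder_direct(x, n, x0=0.0):
    """exp(x) minus its degree-n Taylor polynomial at x0."""
    taylor = sum(math.exp(x0) * (x - x0) ** k / math.factorial(k)
                 for k in range(n + 1))
    return math.exp(x) - taylor

x, n = 1.0, 4
print(remainder_integral(x, n), remainder_direct(x, n))
```

Both values should agree up to the quadrature error, which is far smaller than the remainder itself for this smooth integrand.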

Exercise 3.2.25 (Jean Bernoulli⁴). Suppose $f(t)$ has continuous $n$-th order derivative on $[0,x]$. Prove that
\[
\int_0^x f(t)\,dt = xf(x) - \frac{x^2}{2!}f'(x) + \cdots + (-1)^{n-1}\frac{x^n}{n!}f^{(n-1)}(x) + (-1)^n\frac{1}{n!}\int_0^x t^n f^{(n)}(t)\,dt.
\]

Exercise 3.2.26. Suppose $u(x)$ and $v(x)$ have continuous $n$-th order derivative on $[a,b]$. Prove that
\[
\int_a^b uv^{(n)}\,dx = \left[uv^{(n-1)} - u'v^{(n-2)} + \cdots + (-1)^{n-1}u^{(n-1)}v\right]_{x=a}^{x=b} + (-1)^n\int_a^b u^{(n)}v\,dx.
\]
Then apply the formula to $\int_{x_0}^{x} (x-t)^n f^{(n+1)}(t)\,dt$ to prove the integral form (3.2.9) of the remainder.

⁴ Jean Bernoulli, born 1667 and died 1748 in Basel (Switzerland).


Exercise 3.2.27. Suppose $f'(x)$ is integrable and $\lim_{x\to+\infty} f'(x) = 0$. Prove that $\lim_{b\to+\infty}\dfrac{1}{b}\int_a^b f(x)\sin x\,dx = 0$. Moreover, extend the result to higher order derivatives.

3.2.4 Change of Variable

By the fundamental theorem of calculus, the change of variable formula (3.2.6) for the antiderivatives gives the change of variable formula for the Riemann integral.

Theorem 3.2.4 (Change of Variable). Suppose $\varphi(x)$ is differentiable, with $\varphi'(x)$ integrable on $[a,b]$. Suppose $f(y)$ is continuous on $\varphi([a,b])$. Then
\[
\int_{\varphi(a)}^{\varphi(b)} f(y)\,dy = \int_a^b f(\varphi(x))\varphi'(x)\,dx. \tag{3.2.10}
\]

Similar to the integration by parts, the differential notation can be used in the change of variable formula to get
\[
\int_{\varphi(a)}^{\varphi(b)} f(y)\,dy = \int_a^b f(\varphi(x))\,d\varphi(x). \tag{3.2.11}
\]

The formulation makes the meaning of the change of variable rather clear. In particular, the change should also be made for the variable $y$ inside the differential $dy$. This suggests again the usefulness of including the differential notation in the integration.

The conditions of the theorem are included to make sure that for fixed $b$, the derivatives in $a$ of both sides of the formula are equal. The following result shows that the change of variable formula also holds under a stricter condition on $\varphi(x)$ and a less strict condition on $f(y)$. However, the result cannot be proved by applying the fundamental theorem of calculus. The Riemann sum has to be used.

Theorem 3.2.5 (Change of Variable). Suppose $\varphi(x)$ is increasing and differentiable, with $\varphi'(x)$ integrable on $[a,b]$. Suppose $f(y)$ is integrable on $[\varphi(a),\varphi(b)]$. Then
\[
\int_{\varphi(a)}^{\varphi(b)} f(y)\,dy = \int_a^b f(\varphi(x))\varphi'(x)\,dx.
\]

Proof. Since $\varphi'$ and $f$ are integrable, they are both bounded. Assume $|\varphi'(x)| < A$ and $|f(y)| < B$ for all $x \in [a,b]$ and $y \in [\varphi(a),\varphi(b)]$.

Let
\[
P : a = x_0 < x_1 < x_2 < \cdots < x_n = b
\]
be a partition of $[a,b]$. Denote by
\[
\varphi(P) : \varphi(a) = \varphi(x_0) \le \varphi(x_1) \le \varphi(x_2) \le \cdots \le \varphi(x_n) = \varphi(b)
\]
the corresponding partition of $[\varphi(a),\varphi(b)]$. Note that although it might happen that $\varphi(x_{i-1}) = \varphi(x_i)$ for some $i$, it is not hard to see that the definition of Riemann sum is not changed if some equality in the partition is allowed. The assumption $|\varphi'(x)| < A$ implies that $\|\varphi(P)\| < A\|P\|$.

For a Riemann sum
\[
S(P, f(\varphi(x))\varphi'(x)) = \sum f(\varphi(x_i^*))\varphi'(x_i^*)(x_i - x_{i-1})
\]
of $f(\varphi(x))\varphi'(x)$ with respect to the partition $P$, we consider a corresponding Riemann sum
\[
\begin{aligned}
S(\varphi(P), f(y)) &= \sum f(\varphi(x_i^*))(\varphi(x_i) - \varphi(x_{i-1})) \\
&= \sum f(\varphi(x_i^*))\varphi'(x_i^{**})(x_i - x_{i-1})
\end{aligned}
\]
of $f(y)$ with respect to the partition $\varphi(P)$, where the second equality is from the mean value theorem. Then
\[
\begin{aligned}
|S(P, f(\varphi(x))\varphi'(x)) - S(\varphi(P), f(y))| &\le \sum |f(\varphi(x_i^*))|\,|\varphi'(x_i^*) - \varphi'(x_i^{**})|\,(x_i - x_{i-1}) \\
&\le B\sum \omega_{[x_{i-1},x_i]}(\varphi')\Delta x_i.
\end{aligned}
\]

Since $\varphi'$ and $f$ are integrable, for any $\epsilon > 0$, there are $\delta_1, \delta_2 > 0$, such that $\|P\| < \delta_1$ and $\|\varphi(P)\| < \delta_2$ imply
\[
\sum \omega_{[x_{i-1},x_i]}(\varphi')\Delta x_i < \epsilon, \quad \left|S(\varphi(P), f(y)) - \int_{\varphi(a)}^{\varphi(b)} f(y)\,dy\right| < \epsilon.
\]
Combined with $\|\varphi(P)\| < A\|P\|$, we conclude that $\|P\| < \min\left\{\delta_1, \dfrac{\delta_2}{A}\right\}$ implies
\[
\left|S(P, f(\varphi(x))\varphi'(x)) - \int_{\varphi(a)}^{\varphi(b)} f(y)\,dy\right| < B\epsilon + \epsilon.
\]
This proves that $f(\varphi(x))\varphi'(x)$ is integrable and the change of variable formula holds.

Example 3.2.18. For the special case $\varphi(x) = x + c$, we get
\[
\int_{a+c}^{b+c} f(x)\,dx = \int_a^b f(x+c)\,dx.
\]
For the special case $\varphi(x) = cx$, $c \ne 0$, we get
\[
\int_{ca}^{cb} f(x)\,dx = c\int_a^b f(cx)\,dx.
\]

Example 3.2.19. To compute $I = \int_0^\pi \dfrac{x\sin x\,dx}{1+\cos^2 x}$, we introduce $x = \pi - t$.
\[
I = -\int_\pi^0 \frac{(\pi-t)\sin(\pi-t)\,dt}{1+\cos^2(\pi-t)} = \int_0^\pi \frac{(\pi-t)\sin t\,dt}{1+\cos^2 t} = \pi\int_0^\pi \frac{\sin t\,dt}{1+\cos^2 t} - I.
\]
Thus
\[
I = \frac{\pi}{2}\int_0^\pi \frac{\sin t\,dt}{1+\cos^2 t} = -\frac{\pi}{2}\int_0^\pi \frac{d\cos t}{1+\cos^2 t} = -\frac{\pi}{2}\int_1^{-1} \frac{dx}{1+x^2} = \frac{\pi}{2}(\arctan 1 - \arctan(-1)) = \frac{\pi^2}{4}.
\]
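The value $\dfrac{\pi^2}{4}$ obtained in Example 3.2.19 can be confirmed by direct numerical integration. This is only a rough check under the assumption that a composite Simpson rule (the hypothetical helper `simpson` below) is accurate enough for this smooth integrand.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def integrand(x):
    return x * math.sin(x) / (1 + math.cos(x) ** 2)

numeric = simpson(integrand, 0.0, math.pi)
expected = math.pi ** 2 / 4
print(numeric, expected)
```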

Exercise 3.2.28. Compute the integrals.

1. $\int_0^1 xe^{x^2}\,dx$.

2. $\int_0^2 \dfrac{x\,dx}{1+x^2}$.

3. $\int_1^3 \dfrac{dx}{x\sqrt{x+1}}$.

Exercise 3.2.29. Suppose $f(x)$ is integrable on $[-a,a]$.

1. If $f(x)$ is an even function, prove that $\int_{-a}^{a} f(x)\,dx = 2\int_0^a f(x)\,dx$.

2. If $f(x)$ is an odd function, prove that $\int_{-a}^{a} f(x)\,dx = 0$.

Exercise 3.2.30. Suppose $f(x)$ is a continuous function on $[-a,a]$. Prove the following are equivalent.

1. $f(x)$ is an odd function.

2. $\int_{-b}^{b} f(x)\,dx = 0$ for any $0 < b < a$.

3. $\int_{-a}^{a} f(x)g(x)\,dx = 0$ for any even continuous function $g(x)$.

4. $\int_{-a}^{a} f(x)g(x)\,dx = 0$ for any even integrable function $g(x)$.

Exercise 3.2.31. Use the formula $\log x = \int_1^x \dfrac{dt}{t}$ for $x > 0$ to prove the property $\log x + \log y = \log(xy)$ of the logarithmic function.

Exercise 3.2.32. Suppose $f(x)$ is integrable on an open interval containing $[a,b]$ and is continuous at $a$ and $b$. Prove that
\[
\lim_{h\to 0}\int_a^b \frac{f(x+h) - f(x)}{h}\,dx = f(b) - f(a).
\]
The result should be compared with the equality
\[
\int_a^b \lim_{h\to 0}\frac{f(x+h) - f(x)}{h}\,dx = \int_a^b f'(x)\,dx = f(b) - f(a),
\]
which by Theorem 3.2.2 holds when $f(x)$ is differentiable and $f'(x)$ is integrable.


3.2.5 Additional Exercise

Estimation of Integral

The estimations on the integrals in Exercises 3.1.51, 3.1.53 and 3.1.54 may be extended by using higher order derivatives.

Exercise 3.2.33. Suppose $f(x)$ is continuous on $[a,b]$ and has $n$-th order derivative on $(a,b)$. By either integrating the Taylor expansion at $a$ or considering the Taylor expansion of the function $F(x) = \int_a^x f(t)\,dt$, prove
\[
\int_a^b f(x)\,dx = f(a)(b-a) + \frac{f'(a)}{2!}(b-a)^2 + \cdots + \frac{f^{(n-1)}(a)}{n!}(b-a)^n + \frac{f^{(n)}(c)}{(n+1)!}(b-a)^{n+1}, \tag{3.2.12}
\]
where $a < c < b$.

Exercise 3.2.34. Suppose $f(x)$ is continuous on $[a,b]$ and has $n$-th order derivative on $(a,b)$. Prove
\[
\left|\int_a^b f(x)\,dx - \sum_{k=0}^{n-1}\frac{f^{(k)}(a) + (-1)^k f^{(k)}(b)}{(k+1)!\,2^{k+1}}(b-a)^{k+1}\right| \le \frac{\omega_{(a,b)}(f^{(n)})}{(n+1)!\,2^{n+1}}(b-a)^{n+1} \tag{3.2.13}
\]
for odd $n$ and
\[
\int_a^b f(x)\,dx = \sum_{k=0}^{n-1}\frac{f^{(k)}(a) + (-1)^k f^{(k)}(b)}{(k+1)!\,2^{k+1}}(b-a)^{k+1} + \frac{f^{(n)}(c)}{(n+1)!\,2^{n}}(b-a)^{n+1} \tag{3.2.14}
\]
for even $n$ and some $a < c < b$.

Exercise 3.2.35. Suppose $f(x)$ is continuous on $[a,b]$ and has $2n$-th order derivative on $(a,b)$. Prove
\[
\int_a^b f(x)\,dx = \sum_{k=0}^{n-1}\frac{1}{(2k+1)!\,2^{2k}}f^{(2k)}\left(\frac{a+b}{2}\right)(b-a)^{2k+1} + \frac{f^{(2n)}(c)}{(2n+1)!\,2^{2n}}(b-a)^{2n+1} \tag{3.2.15}
\]
for some $a < c < b$.

Estimation of Special Riemann Sums

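Consider the partition of $[a,b]$ by evenly distributed partition points $x_i = a + \dfrac{i}{n}(b-a)$. We can form the Riemann sums by choosing the left, right and middle points of the partition intervals
\[
\begin{aligned}
S_{\mathrm{left},n}(f) &= \sum f(x_{i-1})\Delta x_i = \frac{b-a}{n}\sum f(x_{i-1}), \\
S_{\mathrm{right},n}(f) &= \sum f(x_i)\Delta x_i = \frac{b-a}{n}\sum f(x_i), \\
S_{\mathrm{middle},n}(f) &= \sum f\left(\frac{x_{i-1}+x_i}{2}\right)\Delta x_i = \frac{b-a}{n}\sum f\left(\frac{x_{i-1}+x_i}{2}\right).
\end{aligned}
\]
The question is how close these are to the actual integral.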
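To get a feeling for the question, the three sums can be computed for a concrete function. The sketch below is illustrative only: the function name `riemann_sums` and the test case $f = \exp$ on $[0,1]$ are assumptions, not taken from the text. It prints the errors of the left, right and middle sums for several $n$.

```python
import math

def riemann_sums(f, a, b, n):
    """Return the (left, right, middle) Riemann sums on n equal subintervals."""
    h = (b - a) / n
    xs = [a + i * h for i in range(n + 1)]  # partition points x_0, ..., x_n
    left = h * sum(f(xs[i - 1]) for i in range(1, n + 1))
    right = h * sum(f(xs[i]) for i in range(1, n + 1))
    middle = h * sum(f((xs[i - 1] + xs[i]) / 2) for i in range(1, n + 1))
    return left, right, middle

exact = math.e - 1.0  # integral of exp over [0, 1]
for n in (10, 100, 1000):
    left, right, middle = riemann_sums(math.exp, 0.0, 1.0, n)
    print(n, exact - left, exact - right, exact - middle)
```

In this experiment the errors of the left and right sums shrink roughly like $\dfrac{1}{n}$ with opposite signs, while the error of the middle sum shrinks roughly like $\dfrac{1}{n^2}$.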


Exercise 3.2.36. Suppose $f(x)$ is continuous on $[a,b]$ and differentiable on $(a,b)$, such that $f'(x)$ is integrable. Use the estimation in Exercise 3.1.51 to prove that
\[
\begin{aligned}
\lim_{n\to\infty} n\left(\int_a^b f(x)\,dx - S_{\mathrm{left},n}(f)\right) &= \frac{1}{2}(f(b) - f(a))(b-a), \\
\lim_{n\to\infty} n\left(\int_a^b f(x)\,dx - S_{\mathrm{right},n}(f)\right) &= -\frac{1}{2}(f(b) - f(a))(b-a).
\end{aligned}
\]

Exercise 3.2.37. Suppose $f(x)$ is continuous on $[a,b]$ and has second order derivative on $(a,b)$, such that $f''(x)$ is integrable on $[a,b]$. Use the estimation in Exercise 3.1.54 to prove that
\[
\lim_{n\to\infty} n^2\left(\int_a^b f(x)\,dx - S_{\mathrm{middle},n}(f)\right) = \frac{1}{24}(f'(b) - f'(a))(b-a)^2.
\]

Exercise 3.2.38. Use Exercises 3.2.33, 3.2.34 and 3.2.35 to derive higher order approximation formulae for the integral $\int_a^b f(x)\,dx$.

Average of Functions

The average of an integrable function on $[a,b]$ is
\[
\mathrm{Av}_{[a,b]}(f) = \frac{1}{b-a}\int_a^b f(x)\,dx. \tag{3.2.16}
\]

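Exercise 3.2.39. Prove the properties of the average.

1. $\mathrm{Av}_{[a+c,b+c]}(f(x+c)) = \mathrm{Av}_{[a,b]}(f(x))$, $\mathrm{Av}_{[\lambda a,\lambda b]}(f(\lambda x)) = \mathrm{Av}_{[a,b]}(f(x))$.

2. If $c = \lambda a + (1-\lambda)b$ for some $0 < \lambda < 1$, then $\mathrm{Av}_{[a,b]}(f) = (1-\lambda)\mathrm{Av}_{[a,c]}(f) + \lambda\mathrm{Av}_{[c,b]}(f)$, because $[a,c]$ has length $(1-\lambda)(b-a)$. In particular, $\mathrm{Av}_{[a,b]}(f)$ lies between $\mathrm{Av}_{[a,c]}(f)$ and $\mathrm{Av}_{[c,b]}(f)$.

3. $f \ge g$ implies $\mathrm{Av}_{[a,b]}(f) \ge \mathrm{Av}_{[a,b]}(g)$.

4. If $f(x)$ is continuous, then $\mathrm{Av}_{[a,b]}(f) = f(c)$ for some $a < c < b$.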

Exercise 3.2.40. Suppose $f(x)$ is integrable on $[0,a]$ for any $a > 0$. Consider the average function $g(x) = \mathrm{Av}_{[0,x]}(f) = \dfrac{1}{x}\int_0^x f(t)\,dt$.

1. Prove that if $\lim_{x\to+\infty} f(x) = l$, then $\lim_{x\to+\infty} g(x) = l$ (compare Exercise 1.1.36).

2. Prove that if $f(x)$ is increasing, then $g(x)$ is also increasing.

3. Prove that if $f(x)$ is convex, then $g(x)$ is also convex.

Exercise 3.2.41. For a weight function $\lambda(x)$ defined in (3.1.21) in Exercise 3.1.47, the weighted average of an integrable function $f(x)$ is
\[
\mathrm{Av}^{\lambda}_{[a,b]}(f(x)) = \frac{1}{b-a}\int_a^b \lambda(x)f(x)\,dx. \tag{3.2.17}
\]
Can you extend the properties of average in Exercise 3.2.39 to weighted average?


Integration of Periodic Function and Riemann-Lebesgue Lemma

Suppose f(x) is a periodic integrable function with period T .

Exercise 3.2.42. Prove that $\int_a^{a+T} f(x)\,dx = \int_0^T f(x)\,dx$.

Exercise 3.2.43. Prove that $\lim_{b\to+\infty}\dfrac{1}{b}\int_a^b f(x)\,dx = \dfrac{1}{T}\int_0^T f(x)\,dx$. This says that the limit of the average on bigger and bigger intervals is the average on an interval of the period length.

Exercise 3.2.44. Prove Riemann-Lebesgue Lemma: Suppose $f(x)$ is a periodic integrable function with period $T$ and $g(x)$ is integrable on $[a,b]$. Then
\[
\lim_{t\to\infty}\int_a^b f(tx)g(x)\,dx = \frac{1}{T}\int_0^T f(x)\,dx\int_a^b g(x)\,dx.
\]

Trigonometric Integration

Exercise 3.2.45. Let $a$, $b$, $n$ be given.

1. Prove that there are $A$, $B$, $C$, such that
\[
\int \frac{dx}{(a\sin x + b\cos x)^n} = \frac{A\sin x + B\cos x}{(a\sin x + b\cos x)^{n-1}} + C\int \frac{dx}{(a\sin x + b\cos x)^{n-2}}.
\]

2. Prove that if $|a| \ne |b|$, then there are $A$, $B$, $C$, such that
\[
\int \frac{dx}{(a + b\cos x)^n} = \frac{A\sin x}{(a + b\cos x)^{n-1}} + B\int \frac{dx}{(a + b\cos x)^{n-1}} + C\int \frac{dx}{(a + b\cos x)^{n-2}}.
\]

Exercise 3.2.46. Compute the antiderivative $\int \dfrac{dx}{\cos(x+a)\cos(x+b)}$ by using
\[
\tan(x+a) - \tan(x+b) = \frac{\sin(a-b)}{\cos(x+a)\cos(x+b)}.
\]
Use the similar idea to compute the following antiderivatives.

1. $\int \dfrac{dx}{\sin(x+a)\cos(x+b)}$.

2. $\int \tan(x+a)\tan(x+b)\,dx$.

3. $\int \dfrac{dx}{\sin x - \sin a}$.

4. $\int \dfrac{dx}{\cos x + \cos a}$.

Wallis⁵ Formula

For $p, q \ge 0$, define $w(p,q) = \int_0^1 (1 - x^{\frac{1}{p}})^q\,dx$.

Exercise 3.2.47. Prove $w(p,q) = \dfrac{q}{p+q}w(p,q-1)$.

Exercise 3.2.48. Prove $w(p,q) = w(q,p)$ by changing the variable $y = (1 - x^{\frac{1}{p}})^q$.

Exercise 3.2.49. Prove $w(m,n) = \dfrac{m!\,n!}{(m+n)!}$ for natural numbers $m$ and $n$.

Exercise 3.2.50. Prove $w$ is strictly decreasing in $p$ and in $q$.

Exercise 3.2.51. Show that $w\left(\dfrac{1}{2},\dfrac{1}{2}\right) = \dfrac{\pi}{4}$.

Exercise 3.2.52. Use $w(n+1,n+1) < w\left(n+\dfrac{1}{2}, n+\dfrac{1}{2}\right) < w(n,n)$ to prove
\[
\frac{2^2\cdot 4^2\cdot 6^2\cdots(2n)^2}{1\cdot 3^2\cdot 5^2\cdots(2n-1)^2\cdot(2n+1)} < \frac{\pi}{2} < \frac{2^2\cdot 4^2\cdot 6^2\cdots(2n)^2\cdot(2n+2)}{1\cdot 3^2\cdot 5^2\cdots(2n-1)^2\cdot(2n+1)^2}.
\]

Exercise 3.2.53. Prove that in the inequality in Exercise 3.2.52, the left is increasing, the right is decreasing, and both have $\dfrac{\pi}{2}$ as the limit. This leads to Wallis infinite product formula
\[
\frac{\pi}{2} = \frac{2}{1}\cdot\frac{2}{3}\cdot\frac{4}{3}\cdot\frac{4}{5}\cdot\frac{6}{5}\cdot\frac{6}{7}\cdots.
\]

⁵ John Wallis, born 1616 in Kent (England), died 1703 in Oxford (England).
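The two-sided estimate in Exercise 3.2.52 is easy to evaluate numerically, which gives a feeling for how slowly the Wallis product converges. The sketch below is only an illustration; the function name `wallis_bounds` is an assumption, and the code simply evaluates the two fractions of the inequality in floating point.

```python
import math

def wallis_bounds(n):
    """Lower and upper bounds for pi/2 from the inequality in Exercise 3.2.52."""
    ratio = 1.0
    for k in range(1, n + 1):
        # Accumulate (2*4*...*2n)^2 / (1*3*...*(2n-1))^2 factor by factor.
        ratio *= (2 * k) ** 2 / (2 * k - 1) ** 2
    lower = ratio / (2 * n + 1)
    upper = ratio * (2 * n + 2) / (2 * n + 1) ** 2
    return lower, upper

for n in (1, 10, 100, 1000):
    lower, upper = wallis_bounds(n)
    print(n, lower, math.pi / 2, upper)
```

For $n = 1$ the bounds are $\dfrac{4}{3}$ and $\dfrac{16}{9}$; the gap between the two bounds shrinks roughly like $\dfrac{1}{n}$.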

Integration by Parts under Weaker Condition

Suppose $f(x)$ and $g(x)$ are integrable on $[a,b]$, and
\[
F(x) = \int_a^x f(t)\,dt, \quad G(x) = \int_a^x g(t)\,dt.
\]
The subsequent exercises will prove the integration by parts formula (3.2.7) under the assumption. Note that up to adding constants to $F$ and $G$, the assumption is strictly weaker than the one in Theorem 3.2.3.

For a partition $P : a = x_0 < x_1 < \cdots < x_n = b$ and the choices $x_i^* = x_i$, define the “partial Riemann sums” $S_i(P,f) = \sum_{k=1}^{i} f(x_k)\Delta x_k$.

Exercise 3.2.54. The partial Riemann sum is the Riemann sum for the integral $F(x_i) = \int_a^{x_i} f(t)\,dt$ and should approximate the integral. Then by replacing $F(x_i)$ with the approximation $S_i(P,f)$, the sum $\sum_{i=1}^{n} S_i(P,f)g(x_i)\Delta x_i$ should approximate the Riemann sum $S(P,Fg)$. By using the estimation (3.1.14) in Exercise 3.1.16, prove the estimation
\[
\left|S(P,Fg) - \sum_{i=1}^{n} S_i(P,f)g(x_i)\Delta x_i\right| \le \left(\sum_{i=1}^{n}\omega_{[x_{i-1},x_i]}(f)\Delta x_i\right)S(P,|g|).
\]

Exercise 3.2.55. Exercise 3.2.54 provides approximations for the Riemann sums of $Fg$ and $fG$. Prove the sum of the two approximations is
\[
\sum_{i=1}^{n} S_i(P,f)g(x_i)\Delta x_i + \sum_{i=1}^{n} f(x_i)\Delta x_i S_i(P,g) = S(P,f)S(P,g) + \sum_{i=1}^{n} f(x_i)g(x_i)\Delta x_i^2.
\]

Exercise 3.2.56. Prove the integration by parts formula
\[
\int_a^b (F(x)g(x) + f(x)G(x))\,dx = F(b)G(b)
\]
for the special case $F(a) = G(a) = 0$.


Exercise 3.2.57. Suppose $f(x)$ and $g(x)$ are integrable on $[a,b]$ and
\[
F(x) = \int_a^x f(t)\,dt + c, \quad G(x) = \int_a^x g(t)\,dt + d,
\]
where $c$ and $d$ are some constants. Prove the integration by parts formula (3.2.7).

Second Integral Mean Value Theorem

The first integral mean value theorem appeared in Exercise 3.1.15. Suppose $f(x)$ is differentiable with integrable $f'(x)$ on $[a,b]$. Suppose $g(x)$ is continuous on $[a,b]$.

Exercise 3.2.58. Suppose $f(x) \ge 0$ and is decreasing. Suppose $G(x) = \int_a^x g(t)\,dt$ satisfies $m \le G(x) \le M$ for $x \in [a,b]$. Prove that
\[
f(a)m \le \int_a^b f(x)g(x)\,dx \le f(a)M.
\]
Then use this to prove that
\[
\int_a^b f(x)g(x)\,dx = f(a)\int_a^c g(x)\,dx
\]
for some $a < c < b$. What if $f(x)$ is increasing?

Exercise 3.2.59. Suppose $f(x)$ is monotone. Prove that
\[
\int_a^b f(x)g(x)\,dx = f(a)\int_a^c g(x)\,dx + f(b)\int_c^b g(x)\,dx
\]
for some $a < c < b$.

Young Inequality

Exercise 3.2.60. Suppose $f(x)$ is differentiable for $x \ge 0$ and satisfies $f'(x) > 0$, $f(0) = 0$. Prove that for any $a, b > 0$, we have the Young inequality
\[
\int_0^a f(x)\,dx + \int_0^b f^{-1}(y)\,dy \ge ab. \tag{3.2.18}
\]
Then apply to the function $x^{p-1}$ for $p > 1$ and derive the Young inequality in Exercise 2.2.40.

The inequality actually holds under weaker condition. See Exercise 3.3.27.

3.3 Topics on Integration

3.3.1 Integration of Rational Functions

It is a basic algebraic fact that any rational function is a sum of functions of the forms $Ax^n$, $\dfrac{A}{(x-a)^n}$ and $\dfrac{Bx+C}{(x^2+ax+b)^n}$, where $x^2+ax+b$ has no real roots. Thus the antiderivative of the rational function is the sum of the antiderivatives of these three types of functions. The antiderivatives of the first two types of functions are very simple. To compute the antiderivative of the last type, we complete the square and get $x^2+ax+b = (x+\alpha)^2+\beta^2$. Then
\[
\int \frac{(Bx+C)\,dx}{(x^2+ax+b)^n} = B\int \frac{(x+\alpha)\,dx}{((x+\alpha)^2+\beta^2)^n} + (C - B\alpha)\int \frac{dx}{((x+\alpha)^2+\beta^2)^n}.
\]
The first antiderivative is
\[
\int \frac{(x+\alpha)\,dx}{((x+\alpha)^2+\beta^2)^n} =
\begin{cases}
\dfrac{1}{2}\log(x^2+ax+b) + C & \text{if } n = 1, \\[2ex]
-\dfrac{1}{2(n-1)(x^2+ax+b)^{n-1}} + C & \text{if } n > 1.
\end{cases}
\]
The change of variable $\beta t = x+\alpha$ reduces the computation of the second antiderivative to $\int \dfrac{dt}{(t^2+1)^n}$, up to a constant factor. This may be computed by the last recursive relation in Exercise 3.2.19.

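Example 3.3.1. To compute the antiderivative of the rational function
\[
f(x) = \frac{x^4+x^3+x^2-2x+1}{x^5+x^4-2x^3-2x^2+x+1},
\]
we find the factorization $x^5+x^4-2x^3-2x^2+x+1 = (x-1)^2(x+1)^3$ of the denominator and write
\[
f(x) = \frac{A_1}{x-1} + \frac{A_2}{(x-1)^2} + \frac{B_1}{x+1} + \frac{B_2}{(x+1)^2} + \frac{B_3}{(x+1)^3}.
\]
This is the same as
\[
\begin{aligned}
x^4+x^3+x^2-2x+1 &= A_1(x-1)(x+1)^3 + A_2(x+1)^3 \\
&\quad + B_1(x-1)^2(x+1)^2 + B_2(x-1)^2(x+1) + B_3(x-1)^2.
\end{aligned}
\]
Then we get the following equations.
\[
\begin{aligned}
2 &= 8A_2, && (\text{at } x = 1) \\
4 &= 4B_3, && (\text{at } x = -1) \\
7 &= 8A_1 + 12A_2, && (\tfrac{d}{dx} \text{ at } x = 1) \\
-5 &= 4B_2 - 4B_3, && (\tfrac{d}{dx} \text{ at } x = -1) \\
1 &= A_1 + B_1. && (\text{coefficient of } x^4)
\end{aligned}
\]
It is easy to solve the equations and get
\[
\begin{aligned}
\int f(x)\,dx &= \int \left(\frac{1}{2(x-1)} + \frac{1}{4(x-1)^2} + \frac{1}{2(x+1)} - \frac{1}{4(x+1)^2} + \frac{1}{(x+1)^3}\right)dx \\
&= \frac{1}{2}\log|x-1| - \frac{1}{4(x-1)} + \frac{1}{2}\log|x+1| + \frac{1}{4(x+1)} - \frac{1}{2(x+1)^2} + C \\
&= \frac{1}{2}\log|x^2-1| - \frac{x}{(x-1)(x+1)^2} + C.
\end{aligned}
\]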
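A partial fraction decomposition such as the one in Example 3.3.1 is easy to check by evaluating both sides at a few points away from the poles $x = \pm 1$. The following minimal sketch does this with plain floating point arithmetic; the function names and the sample points are illustrative assumptions.

```python
def f(x):
    """The rational function of Example 3.3.1."""
    num = x**4 + x**3 + x**2 - 2 * x + 1
    den = x**5 + x**4 - 2 * x**3 - 2 * x**2 + x + 1
    return num / den

def partial_fractions(x):
    """The decomposition with A1 = 1/2, A2 = 1/4, B1 = 1/2, B2 = -1/4, B3 = 1."""
    return (1 / (2 * (x - 1)) + 1 / (4 * (x - 1) ** 2)
            + 1 / (2 * (x + 1)) - 1 / (4 * (x + 1) ** 2)
            + 1 / (x + 1) ** 3)

samples = [-3.0, -0.5, 0.0, 0.5, 2.0, 10.0]  # arbitrary points, none equal to +1 or -1
max_gap = max(abs(f(x) - partial_fractions(x)) for x in samples)
print(max_gap)
```

Agreement at more sample points than there are unknown coefficients is strong evidence that the coefficients were solved correctly.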


Example 3.3.2. To compute the antiderivative of the rational function
\[
f(x) = \frac{4x^3-x^2+2x}{8x^6+4x^5-4x^4-8x^3-2x^2+x+1},
\]
we find the factorization $(x-1)(2x-1)(2x^2+2x+1)^2$ of the denominator and write
\[
f(x) = \frac{A}{x-1} + \frac{B}{2x-1} + \frac{C_1x+D_1}{2x^2+2x+1} + \frac{C_2x+D_2}{(2x^2+2x+1)^2}.
\]
This is the same as
\[
\begin{aligned}
4x^3-x^2+2x &= A(2x-1)(2x^2+2x+1)^2 + B(x-1)(2x^2+2x+1)^2 \\
&\quad + (C_1x+D_1)(x-1)(2x-1)(2x^2+2x+1) + (C_2x+D_2)(x-1)(2x-1).
\end{aligned}
\]
By various methods (comparing coefficients of $x^k$, evaluating at specific values of $x$, taking derivative and evaluating at specific values of $x$, etc.), we get a system of equations. Solving the system, we get
\[
f(x) = \frac{1}{5(x-1)} - \frac{2}{5(2x-1)} - \frac{1}{5(2x^2+2x+1)} + \frac{x}{(2x^2+2x+1)^2}.
\]
We have (using $\tan t = 2x+1$ in the last computation)
\[
\begin{aligned}
\int \frac{dx}{x-1} &= \log|x-1|, \\
\int \frac{dx}{2x-1} &= \frac{1}{2}\int \frac{d(2x-1)}{2x-1} = \frac{1}{2}\log|2x-1|, \\
\int \frac{dx}{2x^2+2x+1} &= \int \frac{d(2x+1)}{(2x+1)^2+1} = \arctan(2x+1), \\
\int \frac{x\,dx}{(2x^2+2x+1)^2} &= \int \frac{4x\,dx}{((2x+1)^2+1)^2} = \int \frac{(\tan t - 1)\,d\tan t}{(\tan^2 t+1)^2} \\
&= \int (\sin t\cos t - \cos^2 t)\,dt = \frac{1}{2}(\sin^2 t - \sin t\cos t - t) + C \\
&= \frac{1}{2}(\cos^2 t(\tan^2 t - \tan t) - t) + C \\
&= \frac{(2x+1)^2 - (2x+1)}{2((2x+1)^2+1)} - \frac{1}{2}\arctan(2x+1) + C \\
&= \frac{x(2x+1)}{2(2x^2+2x+1)} - \frac{1}{2}\arctan(2x+1) + C.
\end{aligned}
\]
Thus we conclude
\[
\int f(x)\,dx = \frac{1}{5}\log\left|\frac{x-1}{2x-1}\right| - \frac{7}{10}\arctan(2x+1) + \frac{x(2x+1)}{2(2x^2+2x+1)} + C.
\]

Exercise 3.3.1. Compute the antiderivatives of the rational functions.

1. $\int \dfrac{dx}{x^2(1+x)}$.

2. $\int \dfrac{dx}{x(1+x)(2+x)}$.

3. $\int \dfrac{x^4\,dx}{1-x^2}$.

4. $\int \dfrac{x\,dx}{x^2+x-2}$.

5. $\int \dfrac{x^5\,dx}{x^2+x-2}$.

6. $\int \dfrac{dx}{1-x^4}$.

7. $\int \dfrac{x^2\,dx}{(1-x^4)^2}$.

8. $\int \dfrac{dx}{(1-x^4)^2}$.

9. $\int \dfrac{x\,dx}{(x-1)^2(x^2+2x+2)}$.

10. $\int \dfrac{(x^4+4x^3+4x^2+4x+4)\,dx}{x(x+2)(x^2+2x+2)^2}$.

11. $\int \dfrac{(2x^2+3)\,dx}{x^3+x^2-2}$.

12. $\int \dfrac{dx}{x^4+4}$.

13. $\int \dfrac{x\,dx}{x^3+1}$.

14. $\int \dfrac{dx}{(x+1)(x^2+1)(x^3+1)}$.

By suitable change of variables, many antiderivatives may be converted to the antiderivatives of rational functions.

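Example 3.3.3. By the change $t = \tan\dfrac{x}{2}$, we have
\[
\int R(\sin x, \cos x)\,dx = \int R\left(\frac{2t}{1+t^2}, \frac{1-t^2}{1+t^2}\right)\frac{2\,dt}{1+t^2}.
\]
If $R$ is a rational function, then the computation is reduced to the antiderivative of some rational function.

As a concrete example, for $a \ne 0$, we have
\[
\int \frac{dx}{a+\sin x} = \int \frac{2\,dt}{\left(a + \dfrac{2t}{1+t^2}\right)(1+t^2)} = \int \frac{2\,dt}{a\left(\left(t+\dfrac{1}{a}\right)^2 + 1 - \dfrac{1}{a^2}\right)}.
\]
If $|a| > 1$, then with $t + \dfrac{1}{a} = \sqrt{1-\dfrac{1}{a^2}}\,u$, we further have
\[
\begin{aligned}
\int \frac{dx}{a+\sin x} &= \int \frac{2\sqrt{1-\dfrac{1}{a^2}}\,du}{a\left(1-\dfrac{1}{a^2}\right)(u^2+1)} = \frac{2}{\operatorname{sign}(a)\sqrt{a^2-1}}\arctan\frac{a\tan\dfrac{x}{2}+1}{\operatorname{sign}(a)\sqrt{a^2-1}} + C \\
&= \frac{2}{\sqrt{a^2-1}}\arctan\frac{a\tan\dfrac{x}{2}+1}{\sqrt{a^2-1}} + C,
\end{aligned}
\]
where the two factors $\operatorname{sign}(a)$ cancel because arctan is an odd function.

If $|a| = 1$, then
\[
\int \frac{dx}{a+\sin x} = \int \frac{2\,dt}{a(t+a)^2} = \frac{-2}{a(t+a)} + C = \frac{-2}{a\tan\dfrac{x}{2}+1} + C.
\]
If $|a| < 1$, then
\[
\left(t+\frac{1}{a}\right)^2 + 1 - \frac{1}{a^2} = (t-\alpha)(t-\beta), \quad \alpha = -\frac{1}{a} + \sqrt{\frac{1}{a^2}-1}, \quad \beta = -\frac{1}{a} - \sqrt{\frac{1}{a^2}-1},
\]
and we further have
\[
\begin{aligned}
\int \frac{dx}{a+\sin x} &= \int \frac{2\,dt}{a(t-\alpha)(t-\beta)} = \frac{2}{a(\alpha-\beta)}\log\left|\frac{t-\alpha}{t-\beta}\right| + C \\
&= \frac{1}{\sqrt{1-a^2}}\log\left|\frac{a\tan\dfrac{x}{2}+1-\sqrt{1-a^2}}{a\tan\dfrac{x}{2}+1+\sqrt{1-a^2}}\right| + C.
\end{aligned}
\]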


Example 3.3.4. Although the change $t = \tan\dfrac{x}{2}$ can always be used for rational functions of trigonometric functions, it may not be the most efficient one. For example, by $t = \cos x$, we have
\[
\int \frac{dx}{(a+\cos x)\sin x} = \int \frac{-d\cos x}{(a+\cos x)\sin^2 x} = \int \frac{-dt}{(a+t)(1-t^2)}.
\]
If $a \ne \pm 1$, then
\[
\frac{-1}{(a+t)(1-t^2)} = \frac{1}{(a^2-1)(t+a)} - \frac{1}{2(a-1)(t+1)} + \frac{1}{2(a+1)(t-1)},
\]
and
\[
\begin{aligned}
\int \frac{dx}{(a+\cos x)\sin x} &= \frac{\log|t+a|}{a^2-1} - \frac{\log|t+1|}{2(a-1)} + \frac{\log|t-1|}{2(a+1)} + C \\
&= \frac{\log|\cos x+a|}{a^2-1} - \frac{\log|\cos x+1|}{2(a-1)} + \frac{\log|\cos x-1|}{2(a+1)} + C.
\end{aligned}
\]

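Example 3.3.5. By the change $t = \sqrt[n]{\dfrac{ax+b}{cx+d}}$, we have
\[
\int R\left(x, \sqrt[n]{\frac{ax+b}{cx+d}}\right)dx = \int R\left(\frac{\alpha t^n+\beta}{\gamma t^n+\delta}, t\right)\frac{(\alpha\delta-\beta\gamma)nt^{n-1}}{(\gamma t^n+\delta)^2}\,dt,
\]
where $x = \dfrac{\alpha t^n+\beta}{\gamma t^n+\delta}$ is obtained by solving for $x$ in terms of $t$ (one may take $\alpha = d$, $\beta = -b$, $\gamma = -c$, $\delta = a$).

Applying the idea to the antiderivative $\int \sqrt{\dfrac{x}{(x+1)^3}}\,dx$, we introduce $t = \sqrt{\dfrac{x}{x+1}}$. Then $x = \dfrac{t^2}{1-t^2}$, $dx = \dfrac{2t\,dt}{(1-t^2)^2}$, and
\[
\begin{aligned}
\int \sqrt{\frac{x}{(x+1)^3}}\,dx &= \int |1-t^2|\,t\,\frac{2t\,dt}{(1-t^2)^2} = \pm\int \left(\frac{1}{t+1} - \frac{1}{t-1} - 2\right)dt \\
&= \pm\left(\log\left|\frac{t+1}{t-1}\right| - 2t\right) + C \\
&= \pm\left(2\log(\sqrt{x}+\sqrt{x+1}) - 2\sqrt{\frac{x}{x+1}}\right) + C.
\end{aligned}
\]
The sign is positive if $x > 0$, and is negative if $x < -1$.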

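Example 3.3.6. To compute the antiderivative $\int \sqrt{\dfrac{e^x-1}{e^x+1}}\,dx$, we introduce $t = e^x$. Then $dt = e^x\,dx = t\,dx$, and the antiderivative becomes $\int \sqrt{\dfrac{t-1}{t+1}}\,\dfrac{dt}{t}$. By further introducing $u = \sqrt{\dfrac{t-1}{t+1}} = \sqrt{\dfrac{e^x-1}{e^x+1}}$, we get
\[
\begin{aligned}
\int \sqrt{\frac{e^x-1}{e^x+1}}\,dx &= \int \frac{4u^2\,du}{(1+u^2)(1-u^2)} = \int \left(\frac{1}{u+1} - \frac{1}{u-1} - \frac{2}{u^2+1}\right)du \\
&= \log\left|\frac{u+1}{u-1}\right| - 2\arctan u + C \\
&= 2\log(\sqrt{e^x-1}+\sqrt{e^x+1}) - 2\arctan\sqrt{\frac{e^x-1}{e^x+1}} + C.
\end{aligned}
\]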

Exercise 3.3.2. Compute the antiderivatives.

1. $\int \dfrac{(1+\sin x)\,dx}{\sin x(1+\cos x)}$.

2. $\int \sqrt{\tan x}\,dx$.

3. $\int \dfrac{(\sin x+\cos x)\,dx}{\sin x(\sin x-\cos x)}$.

4. $\int \dfrac{dx}{a+\cos x}$.

5. $\int \dfrac{dx}{(a+\cos^2 x)\sin x}$.

6. $\int \dfrac{dx}{a+\tan x}$.

7. $\int \dfrac{dx}{a\sin x+b\cos x+c}$.

8. $\int \dfrac{dx}{a^2\sin^2 x+b^2\cos^2 x}$.

9. $\int \sqrt{\dfrac{1-x}{1+x}}\,dx$.

10. $\int \dfrac{1}{x^2}\sqrt{\dfrac{1-x}{1+x}}\,dx$.

11. $\int \dfrac{dx}{\sqrt{1+e^x}+\sqrt{1-e^x}}$.

12. $\int \dfrac{dx}{\sqrt{a^x+b}}$.

Exercise 3.3.3. Suppose $R$ is a rational function. Suppose $r$, $s$ are rational numbers such that $r+s$ is an integer. Find a suitable change of variable, such that $\int R(x, (ax+b)^r, (cx+d)^s)\,dx$ is changed into the antiderivative of a rational function.

Exercise 3.3.4. Suppose $r$, $s$, $t$ are rational numbers. For each of the following cases, find a suitable change of variable, such that $\int x^r(a+bx^s)^t\,dx$ is changed into the antiderivative of a rational function.

1. $t$ is an integer.

2. $\dfrac{r+1}{s}$ is an integer.

3. $\dfrac{r+1}{s} + t$ is an integer.

A theorem by Chebyshev⁶ says that these are the only cases that the antiderivative can be changed to the antiderivative of a rational function.

3.3.2 Improper Integration

So far the integrability was defined only for bounded functions on bounded intervals. When the functions or the intervals are unbounded, we may integrate on the bounded part and then take limit.

Suppose $f(x)$ is integrable on $[a,c]$ for any $a < c < b$. If $\lim_{c\to b^-}\int_a^c f(x)\,dx$ converges, then we say the improper integral $\int_a^b f(x)\,dx$ converges and write
\[
\int_a^b f(x)\,dx = \lim_{c\to b^-}\int_a^c f(x)\,dx.
\]

⁶ Pafnuty Lvovich Chebyshev, born 1821 in Okatovo (Russia), died 1894 in St Petersburg (Russia). Chebyshev’s work touches many fields of mathematics, including analysis, probability, number theory and mechanics. Chebyshev introduced his famous polynomials in 1854 and later generalized to the concept of orthogonal polynomials.


The definition also applies to the case $b = +\infty$, and a similar definition can be made when $f(x)$ is integrable on $[c,b]$ for any $a < c < b$.

By Exercise 3.1.8, if $f(x)$ is a bounded function on a bounded interval $[a,b]$ and is integrable on $[a,c]$ for any $a < c < b$, then it is integrable on $[a,b]$. Therefore an integral on a bounded interval becomes improper only if the function is not bounded.

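Example 3.3.7. For any $\alpha \ne -1$, the limit $\lim_{a\to 0^+}\int_a^1 x^\alpha\,dx = \lim_{a\to 0^+}\dfrac{1-a^{\alpha+1}}{\alpha+1}$ converges if and only if $\alpha+1 > 0$. Therefore the improper integral $\int_0^1 x^\alpha\,dx$ (which is in fact improper only when $\alpha < 0$) converges if and only if $\alpha > -1$, and $\int_0^1 x^\alpha\,dx = \dfrac{1}{\alpha+1}$.

On the other hand, the limit $\lim_{a\to+\infty}\int_1^a x^\alpha\,dx = \lim_{a\to+\infty}\dfrac{a^{\alpha+1}-1}{\alpha+1}$ converges if and only if $\alpha+1 < 0$. Therefore the improper integral $\int_1^{+\infty} x^\alpha\,dx$ converges if and only if $\alpha < -1$, and $\int_1^{+\infty} x^\alpha\,dx = \dfrac{-1}{\alpha+1}$.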
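The dichotomy in Example 3.3.7 can be seen numerically by evaluating the truncated integral $\int_a^1 x^\alpha\,dx$ for smaller and smaller $a$. The sketch below uses the exact formula from the example; the function name `truncated_integral` and the two sample exponents are illustrative assumptions.

```python
def truncated_integral(alpha, a):
    """Exact value of the integral of x**alpha over [a, 1], for alpha != -1."""
    return (1 - a ** (alpha + 1)) / (alpha + 1)

for alpha in (-0.5, -1.5):
    # Shrink the lower limit a towards 0 and watch the behaviour of the value.
    values = [truncated_integral(alpha, 10.0 ** (-k)) for k in (2, 4, 8)]
    print(alpha, values)
```

For $\alpha = -0.5$ the values approach $\dfrac{1}{\alpha+1} = 2$, while for $\alpha = -1.5$ they grow without bound, in agreement with the criterion $\alpha > -1$.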

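Example 3.3.8. The integral $\int_{-\infty}^{+\infty}\dfrac{dx}{1+x^2}$ is improper at $-\infty$ and $+\infty$. Since
\[
\lim_{a\to-\infty,\,b\to+\infty}\int_a^b \frac{dx}{1+x^2} = \lim_{a\to-\infty,\,b\to+\infty}(\arctan b - \arctan a) = \frac{\pi}{2} - \left(-\frac{\pi}{2}\right) = \pi,
\]
the improper integral converges, and $\int_{-\infty}^{+\infty}\dfrac{dx}{1+x^2} = \pi$.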

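Exercise 3.3.5. Determine the convergence of the improper integrals and evaluate the convergent ones.

1. $\int_2^{\infty}\frac{dx}{x(\log x)^\alpha}$.
2. $\int_0^1\frac{dx}{x(-\log x)^\alpha}$.
3. $\int_0^{+\infty}x^\alpha\,dx$.
4. $\int_0^1\log x\,dx$.
5. $\int_0^{+\infty}a^x\,dx$.
6. $\int_{-\infty}^0 a^x\,dx$.
7. $\int_{-1}^1\frac{dx}{1-x^2}$.
8. $\int_2^{+\infty}\frac{dx}{1-x^2}$.
9. $\int_{-1}^1\frac{dx}{\sqrt{1-x^2}}$.
10. $\int_0^{\pi/2}\tan x\,dx$.
11. $\int_0^{+\infty}e^{-x}\sin x\,dx$.
12. $\int_0^{+\infty}e^{-x}|\sin x|\,dx$.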

Exercise 3.3.6. Suppose f(x) is continuous for x ≥ 0, and $\lim_{x\to+\infty}f(x) = l$. Prove that for any a, b > 0,

$$\int_0^{+\infty}\frac{f(ax)-f(bx)}{x}\,dx = (f(0)-l)\log\frac{b}{a}.$$

The convergent improper integrals have properties similar to the normal integrals. For example, the integration by parts and the change of variable still hold.

The convergence of improper integrals can be determined by the Cauchy criterion. The following is the criterion stated for the case that the integral becomes improper at +∞.


Proposition 3.3.1 (Cauchy Criterion). Suppose f(x) is integrable on [a, b] for any b > a. Then the improper integral $\int_a^{+\infty}f(x)\,dx$ converges if and only if for any ε > 0, there is N, such that

$$b, c > N \implies \left|\int_b^c f(x)\,dx\right| < \varepsilon.$$

A consequence of the criterion is the following useful method for determining the convergence, again stated for integrals that are improper at +∞.

Proposition 3.3.2 (Comparison Test). Suppose f(x) and g(x) are integrable on [a, b] for any b > a. If |f(x)| ≤ g(x) and $\int_a^{+\infty}g(x)\,dx$ converges, then $\int_a^{+\infty}f(x)\,dx$ also converges.

Proof. For any ε > 0, applying the Cauchy criterion to the convergence of $\int_a^{+\infty}g(x)\,dx$ tells us that there is N, such that c > b > N implies $\int_b^c g(x)\,dx < \varepsilon$. Then c > b > N implies

$$\left|\int_b^c f(x)\,dx\right| \le \int_b^c |f(x)|\,dx \le \int_b^c g(x)\,dx < \varepsilon.$$

Thus the Cauchy criterion for the convergence of $\int_a^{+\infty}f(x)\,dx$ is verified.

A special case of the comparison test is that if both f(x) and g(x) are positive, and $\lim_{x\to+\infty}\frac{f(x)}{g(x)} = l$ exists, then the convergence of $\int_a^{+\infty}g(x)\,dx$ implies the convergence of $\int_a^{+\infty}f(x)\,dx$. If l ≠ 0, then the two convergences are equivalent. More generally, if $c_1 g(x) \le f(x) \le c_2 g(x)$ for some $c_1, c_2 > 0$, then the two convergences are equivalent.

Example 3.3.9. Let $r(x) = \frac{p(x)}{q(x)}$ be a rational function. Let m and n be the degrees of the polynomials p(x) and q(x). Then $\lim_{x\to\infty}\frac{r(x)}{x^{m-n}}$ converges to a nonzero number (the quotient of the leading coefficients). By changing the sign of the leading coefficients if necessary, we may assume p(x) and q(x) are positive for large x. Since $\int_a^{+\infty}x^{m-n}\,dx$ converges if and only if m − n ≤ −2, by the comparison test, we conclude that $\int_a^{+\infty}r(x)\,dx$ converges if and only if m − n ≤ −2.

Example 3.3.10. The integral $\int_0^2\frac{\sin x\,dx}{\sqrt{|x(x-1)|}}$ is improper at 1 (in fact at both $1^+$ and $1^-$). Since $\lim_{x\to 1}\frac{\sin x}{\sqrt{|x|}} = \sin 1 \ne 0$, the convergence of the improper integral is equivalent to the convergence of $\int_0^2\frac{dx}{\sqrt{|x-1|}}$. The latter converges (on both sides of 1) because $\int_0^1\frac{dx}{\sqrt{x}}$ converges. Therefore $\int_0^2\frac{\sin x\,dx}{\sqrt{|x(x-1)|}}$ converges.

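Example 3.3.11. The improper integral $\int_0^1\log x\,dx$ converges. Since $\sin x \ge \frac{2}{\pi}x$ on $\left[0,\frac{\pi}{2}\right]$, we have $\log\frac{2x}{\pi} \le \log\sin x \le 0$ for $0 < x \le \frac{\pi}{2}$, and we find by the comparison test that $\int_0^{\pi/2}\log\sin x\,dx$ also converges.

To compute the improper integral, we change the variable.

$$\begin{aligned}
\int_0^{\pi/2}\log\sin x\,dx
&= \int_0^{\pi/2}\log\left(2\sin\frac{x}{2}\cos\frac{x}{2}\right)dx
 = 2\int_0^{\pi/4}\log(2\sin x\cos x)\,dx \\
&= \frac{\pi}{2}\log 2 + 2\int_0^{\pi/4}\log\sin x\,dx + 2\int_0^{\pi/4}\log\cos x\,dx \\
&= \frac{\pi}{2}\log 2 + 2\int_0^{\pi/4}\log\sin x\,dx - 2\int_{\pi/2}^{\pi/4}\log\sin x\,dx \\
&= \frac{\pi}{2}\log 2 + 2\int_0^{\pi/2}\log\sin x\,dx.
\end{aligned}$$

Therefore $\int_0^{\pi/2}\log\sin x\,dx = -\frac{\pi}{2}\log 2$.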
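As a sanity check on Example 3.3.11, the value of the improper integral of log sin x over (0, π/2] can be approximated numerically and compared with the known closed form −(π/2) log 2. The sketch below uses a composite midpoint rule, which never evaluates the integrand at the singular endpoint 0; the helper name `midpoint_rule` and the number of cells are assumptions made for this illustration.

```python
import math


def midpoint_rule(f, a, b, n):
    """Composite midpoint approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def log_sin_integral(n=100000):
    """Approximate the improper integral of log(sin x) over (0, pi/2]."""
    return midpoint_rule(lambda x: math.log(math.sin(x)), 0.0, math.pi / 2, n)


if __name__ == "__main__":
    approx = log_sin_integral()
    closed_form = -math.pi / 2 * math.log(2)
    print(approx, closed_form, abs(approx - closed_form))
```

Because of the logarithmic singularity at 0, the midpoint rule converges only at first order here, so the agreement is modest but sufficient to distinguish between candidate closed forms.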

Example 3.3.12. The comparison test is not the only way to derive the convergence of improper integrals. Consider the improper integral $\int_1^{+\infty}\frac{\sin x}{x}\,dx$. The integral of the corresponding absolute value function satisfies

$$\int_{(n-1)\pi}^{n\pi}\left|\frac{\sin x}{x}\right|dx \ge \frac{1}{n\pi}\int_{(n-1)\pi}^{n\pi}|\sin x|\,dx = \frac{2}{n\pi}.$$

The divergence of $\sum\frac{1}{n}$ then implies that $\int_1^a\left|\frac{\sin x}{x}\right|dx$ is not bounded. Therefore $\int_1^{+\infty}\left|\frac{\sin x}{x}\right|dx$ diverges, and the comparison test cannot be used.

On the other hand, for a, b > 1, we use integration by parts to get

$$\int_a^b\frac{\sin x}{x}\,dx = -\int_a^b\frac{1}{x}\,d\cos x = -\frac{\cos b}{b} + \frac{\cos a}{a} - \int_a^b\frac{\cos x}{x^2}\,dx.$$

Therefore if $b > a > \frac{1}{\varepsilon}$, then

$$\left|\int_a^b\frac{\sin x}{x}\,dx\right| \le \left|\frac{\cos b}{b}\right| + \left|\frac{\cos a}{a}\right| + \left|\int_a^b\frac{\cos x}{x^2}\,dx\right| \le \frac{1}{b} + \frac{1}{a} + \int_a^b\frac{1}{x^2}\,dx < 3\varepsilon.$$

By the Cauchy criterion, the improper integral $\int_1^{+\infty}\frac{\sin x}{x}\,dx$ converges.

The idea here will be elaborated to become the Dirichlet and Abel tests in Exercise 3.3.47.
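The contrast in Example 3.3.12 between the signed and the absolute integrals can be observed numerically. The sketch below is an illustration under stated assumptions: the helper names `signed_tail` and `absolute_tail` and the midpoint rule with 200 cells per unit length are choices made here, not part of the text. It compares the integral of sin x / x over [a, 2a] with the bound 2/a obtained from the integration by parts, and shows that the corresponding integral of |sin x / x| does not become small.

```python
import math


def midpoint_rule(f, a, b, cells_per_unit=200):
    """Composite midpoint approximation of the integral of f over [a, b]."""
    n = max(1, int(cells_per_unit * (b - a)))
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def signed_tail(a, b):
    """Approximate integral of sin(x)/x over [a, b]."""
    return midpoint_rule(lambda x: math.sin(x) / x, a, b)


def absolute_tail(a, b):
    """Approximate integral of |sin(x)/x| over [a, b]."""
    return midpoint_rule(lambda x: abs(math.sin(x)) / x, a, b)


if __name__ == "__main__":
    for a in (10.0, 100.0, 1000.0):
        b = 2 * a
        # Columns: a, signed integral over [a, 2a], the bound 2/a,
        # and the absolute integral over [a, 2a].
        print(a, signed_tail(a, b), 2 / a, absolute_tail(a, b))
```

The signed column shrinks with a, as the Cauchy criterion requires, while the absolute column stays bounded away from zero, which is consistent with the divergence of the absolute integral.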

Exercise 3.3.7. Compute the improper integrals.

1. $\int_0^{\pi}\frac{x\sin x\,dx}{1-\cos x}$.
2. $\int_0^{\pi}x\log\sin x\,dx$.
3. $\int_0^{+\infty}x^n e^{-x}\,dx$.
4. $\int_0^1(\log x)^n\,dx$.
5. $\int_0^1\frac{x^n\,dx}{\sqrt{1-x}}$.
6. $\int_{-\infty}^{+\infty}\frac{dx}{(1+x^2)^n}$.

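Exercise 3.3.8. Determine the convergence of improper integrals.

1. $\int_2^{\infty}x^\alpha|\log x|^\beta\,dx$.
2. $\int_0^1 x^\alpha|\log x|^\beta\,dx$.
3. $\int_0^{+\infty}x^\alpha a^x\,dx$.
4. $\int_0^{\pi}\frac{1}{\sin x}\,dx$.
5. $\int_0^{+\infty}x^\alpha|\sin x|\,dx$.
6. $\int_0^1 x^\alpha(1-x)^\beta\,dx$.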

Exercise 3.3.9. Prove that

$$\int_0^{+\infty}f\left(ax+\frac{b}{x}\right)dx = \frac{1}{a}\int_0^{+\infty}f\left(\sqrt{x^2+4ab}\right)dx,$$

provided a, b > 0 and both sides converge.

Exercise 3.3.10. Suppose f(x) < g(x) < h(x). Prove that if $\int_a^b f(x)\,dx$ and $\int_a^b h(x)\,dx$ converge, then $\int_a^b g(x)\,dx$ converges.

Exercise 3.3.11. Prove that if f(x) ≥ 0 and $\int_a^{+\infty}f(x)\,dx$ converges, then there is an increasing sequence $\{x_n\}$ diverging to +∞, such that $\lim_{n\to\infty}f(x_n) = 0$. Moreover, prove that in the special case that f(x) is monotone, we have $\lim_{x\to+\infty}f(x) = 0$.

Exercise 3.3.12. Suppose f(x) ≥ 0.

1. If [a, b] is a bounded interval and $\int_a^b f(x)^2\,dx$ converges, prove that $\int_a^b f(x)\,dx$ also converges.

2. If $\lim_{x\to+\infty}f(x) = 0$ and $\int_a^{+\infty}f(x)\,dx$ converges, prove that $\int_a^{+\infty}f(x)^2\,dx$ also converges.

Exercise 3.3.13. Prove that if α > β > 0, then $\lim_{a\to+\infty}a^\beta\int_a^{+\infty}\frac{\sin x\,dx}{x^\alpha} = 0$. Then prove $\lim_{a\to 0}\frac{1}{a}\int_0^a\sin\frac{1}{x}\,dx = 0$.

3.3.3 Riemann-Stieltjes Integration

Let α be a function on a bounded interval [a, b]. The Riemann sum may be extended to the Riemann-Stieltjes⁷ sum

$$S(P,f,\alpha) = \sum_{i=1}^n f(x_i^*)(\alpha(x_i)-\alpha(x_{i-1})) = \sum_{i=1}^n f(x_i^*)\Delta\alpha_i. \tag{3.3.1}$$

⁷ Thomas Jan Stieltjes, born 1856 in Zwolle (Netherlands), died 1894 in Toulouse (France). He is often called the father of the analytic theory of continued fractions and is best remembered for his integral.


We say f has Riemann-Stieltjes integral (or simply Stieltjes integral) I and denote $\int_a^b f\,d\alpha = I$, if for any ε > 0, there is δ > 0, such that

$$\|P\| < \delta \implies |S(P,f,\alpha)-I| < \varepsilon. \tag{3.3.2}$$

When α(x) = x, we get the Riemann integral.

Example 3.3.13. Suppose f(x) = c is a constant. Then S(P, f, α) = c(α(b) − α(a)). Therefore $\int_a^b c\,d\alpha = c(\alpha(b)-\alpha(a))$.

Example 3.3.14. Suppose $\alpha(x) = \alpha_0$ is a constant. Then $\Delta\alpha_i = 0$. Therefore $\int_a^b f\,d\alpha_0 = 0$. Since f can be arbitrary, we see that Proposition 3.1.2 cannot be extended without additional conditions.

Example 3.3.15. Suppose a < c < b and

$$\alpha(x) = \begin{cases}\alpha_- & \text{if } x < c,\\ \alpha_0 & \text{if } x = c,\\ \alpha_+ & \text{if } x > c\end{cases}$$

is a step function, in which the three values are not all the same. Then

$$S(P,f,\alpha) = \begin{cases} f(x_i^*)(\alpha_+-\alpha_-) & \text{if } x_{i-1} < c < x_i,\\ f(x_{i+1}^*)(\alpha_+-\alpha_0) + f(x_i^*)(\alpha_0-\alpha_-) & \text{if } c = x_i.\end{cases}$$

If f(x) is continuous at c, then it is easy to see that ‖P‖ small implies $|S(P,f,\alpha)-f(c)(\alpha_+-\alpha_-)|$ small. Therefore $\int_a^b f(x)\,d\alpha = f(c)(\alpha_+-\alpha_-)$.

If f(x) is not continuous at c and $\alpha_+ \ne \alpha_-$, then we choose P with $x_{i-1} < c < x_i$. The discontinuity at c tells us that there is ε > 0, such that no matter how small ‖P‖ is, there are x, y satisfying $x_{i-1} < x, y < x_i$ and |f(x) − f(y)| ≥ ε. Taking x and y as $x_i^*$ respectively and keeping all the other $x_j^*$ the same, we get two Riemann-Stieltjes sums. The difference of the two sums is $|(f(x)-f(y))(\alpha_+-\alpha_-)| \ge \varepsilon|\alpha_+-\alpha_-|$. Therefore f(x) is not Riemann-Stieltjes integrable with respect to α.

If f(x) is not continuous at c and $\alpha_+ = \alpha_-$, then $\alpha_0 \ne \alpha_+ = \alpha_-$ and we choose P with $x_i = c$. Assume f(x) is not left continuous at c. Then there is ε > 0, such that no matter how small ‖P‖ is, there are x, y satisfying $x_{i-1} < x, y \le x_i = c$ and |f(x) − f(y)| ≥ ε. Taking x and y as $x_i^*$ respectively and keeping all the other $x_j^*$ the same, we get two Riemann-Stieltjes sums. The difference of the two sums is $|(f(x)-f(y))(\alpha_0-\alpha_-)| \ge \varepsilon|\alpha_0-\alpha_-|$. Therefore f(x) is not Riemann-Stieltjes integrable with respect to α. The case that f(x) is not right continuous at c is similar.

We conclude that f is Riemann-Stieltjes integrable with respect to α if and only if f is continuous at c.
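The step-function computation in Example 3.3.15 can be illustrated with a small numerical experiment. The Python sketch below is an assumption-laden illustration: the helper names `rs_sum`, `step` and `uniform_partition`, the choice f = cos, and the values of the step function are all made up for this purpose. The jump point c = 1 is a partition point in each run, so the sum has the two-term form from the example, and the value assigned at c drops out in the limit.

```python
import math


def rs_sum(f, alpha, points, tags):
    """Riemann-Stieltjes sum S(P, f, alpha); tags[i] lies in [points[i], points[i+1]]."""
    return sum(
        f(t) * (alpha(points[i + 1]) - alpha(points[i]))
        for i, t in enumerate(tags)
    )


def step(c, left, mid, right):
    """Step function equal to left for x < c, mid at x = c, right for x > c."""
    def alpha(x):
        if x < c:
            return left
        if x > c:
            return right
        return mid
    return alpha


def uniform_partition(a, b, n):
    """Partition of [a, b] into n intervals of equal length."""
    return [a + (b - a) * i / n for i in range(n + 1)]


if __name__ == "__main__":
    alpha = step(1.0, 0.0, 5.0, 3.0)  # alpha_- = 0, alpha_0 = 5, alpha_+ = 3
    target = math.cos(1.0) * (3.0 - 0.0)  # f(c) * (alpha_+ - alpha_-)
    for n in (10, 100, 1000):
        pts = uniform_partition(0.0, 2.0, n)
        tags = [(pts[i] + pts[i + 1]) / 2 for i in range(n)]
        print(n, rs_sum(math.cos, alpha, pts, tags), target)
```

Replacing `math.cos` by a function with a jump at 1 makes the sums depend on the choice of tags, in line with the non-integrability argument above.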

Exercise 3.3.14. Prove that if α and f have a common point of discontinuity in [a, b], then f is not Riemann-Stieltjes integrable with respect to α. You may need the characterization of discontinuity in Exercise 1.4.42.

Exercise 3.3.15. Find suitable α on [0, 2], such that $\int_0^2 f\,d\alpha = f(0)+f(1)+f(2)$ for any continuous f on [0, 2].


Exercise 3.3.16. Prove that if the Dirichlet function is Riemann-Stieltjes integrable with respect to α, then α is constant. In particular, this shows that the condition that $\int_a^b f\,d\alpha = 0$ for every function α with respect to which f is Riemann-Stieltjes integrable does not necessarily imply that f is zero.

Exercise 3.3.17. Prove that the only functions that are Riemann-Stieltjes integrable with respect to the Dirichlet function are the constant functions. In particular, this shows that the condition that $\int_a^b f\,d\alpha = 0$ for every function f that is Riemann-Stieltjes integrable with respect to α does not necessarily imply that α is a constant.

The Riemann-Stieltjes integrability can be verified by the Cauchy criterion. The results from Section 3.1.4 remain mostly true. Proposition 3.1.7 can be adopted without much change (see Exercise 3.3.19). Proposition 3.1.8 still holds if α is an increasing function (see Exercise 3.3.22). Proposition 3.1.9 holds in one direction (see Exercises 3.3.20 and 3.3.21).

Exercise 3.3.18. Suppose f is Riemann-Stieltjes integrable with respect to α and β. Suppose c is a constant. Prove that f is Riemann-Stieltjes integrable with respect to α + β and cα, and

$$\int_a^b f\,d(\alpha+\beta) = \int_a^b f\,d\alpha + \int_a^b f\,d\beta, \qquad \int_a^b f\,d(c\alpha) = c\int_a^b f\,d\alpha. \tag{3.3.3}$$

Exercise 3.3.19. Suppose f and g are Riemann-Stieltjes integrable with respect to α. Prove that f + g and cf are Riemann-Stieltjes integrable with respect to α and

$$\int_a^b (f+g)\,d\alpha = \int_a^b f\,d\alpha + \int_a^b g\,d\alpha, \qquad \int_a^b cf\,d\alpha = c\int_a^b f\,d\alpha. \tag{3.3.4}$$

Exercise 3.3.20. Suppose a < b < c and f is Riemann-Stieltjes integrable with respect to α on [a, c]. Prove that f is Riemann-Stieltjes integrable with respect to α on [a, b] and [b, c], and

$$\int_a^c f\,d\alpha = \int_a^b f\,d\alpha + \int_b^c f\,d\alpha. \tag{3.3.5}$$

Exercise 3.3.21. Suppose a < b < c and f is Riemann-Stieltjes integrable with respect to α on [a, b] and [b, c]. Assume f and α are bounded.

1. Prove that if f is continuous at b, then f is Riemann-Stieltjes integrable on [a, c].

2. Prove that if α is continuous at b, then f is Riemann-Stieltjes integrable on [a, c].

3. Construct functions f and α on [a, c], such that f is Riemann-Stieltjes integrable with respect to α on [a, b] and [b, c], but neither f nor α is continuous at b. By Exercise 3.3.14, f is not Riemann-Stieltjes integrable with respect to α on [a, c].

Exercise 3.3.22. Suppose f and g are Riemann-Stieltjes integrable with respect to an increasing α on [a, b]. Prove that

$$f \le g \implies \int_a^b f\,d\alpha \le \int_a^b g\,d\alpha.$$

Moreover, if α is strictly increasing and f and g are continuous, then the equality holds if and only if f = g.

Exercise 3.3.23. Suppose f is Riemann-Stieltjes integrable with respect to an increasing α on [a, b]. Prove that

$$\left|\int_a^b f\,d\alpha\right| \le \int_a^b |f|\,d\alpha,$$

$$(\alpha(b)-\alpha(a))\inf_{[a,b]}f \le \int_a^b f\,d\alpha \le (\alpha(b)-\alpha(a))\sup_{[a,b]}f,$$

$$\left|f(c)(\alpha(b)-\alpha(a)) - \int_a^b f\,d\alpha\right| \le \omega_{[a,b]}(f)(\alpha(b)-\alpha(a)),$$

$$\left|S(P,f,\alpha) - \int_a^b f\,d\alpha\right| \le \sum\omega_{[x_{i-1},x_i]}(f)\Delta\alpha_i.$$

Moreover, extend the first integral mean value theorem in Exercise 3.1.15 to the Riemann-Stieltjes integral.

Exercise 3.3.24. Suppose f is Riemann-Stieltjes integrable with respect to α, and $F(x) = \int_a^x f\,d\alpha$. Prove that if α is strictly monotone and f(x) is continuous at $x_0$, then

$$\lim_{h\to 0}\frac{F(x_0+h)-F(x_0)}{\alpha(x_0+h)-\alpha(x_0)} = f(x_0).$$

This extends the fundamental theorem of calculus.

The Riemann-Stieltjes integral can often be computed by the ordinary Riemann integral.

Theorem 3.3.3. Suppose f is bounded, β is Riemann integrable and $\alpha(x) = \int_a^x\beta(t)\,dt$. Then f is Riemann-Stieltjes integrable with respect to α if and only if fβ is Riemann integrable. Moreover,

$$\int_a^b f\,d\alpha = \int_a^b f\beta\,dx.$$

For the special case that β is continuous, we have α′ = β and the formula becomes

$$\int_a^b f\,d\alpha = \int_a^b f\alpha'\,dx. \tag{3.3.6}$$

This provides the rule for moving α from inside d to outside d, which is consistent with the notation dα = α′dx for the differential.

Proof. Suppose |f| < B for a constant B. Since β is Riemann integrable, for any ε > 0, there is δ > 0, such that ‖P‖ < δ implies $\sum\omega_{[x_{i-1},x_i]}(\beta)\Delta x_i < \varepsilon$. Now suppose ‖P‖ < δ and $x_i^*$ are chosen. The Riemann-Stieltjes sum of f with respect to α is

$$S(P,f,\alpha) = \sum f(x_i^*)(\alpha(x_i)-\alpha(x_{i-1})) = \sum f(x_i^*)\int_{x_{i-1}}^{x_i}\beta(t)\,dt,$$

and the Riemann sum of fβ is

$$S(P,f\beta) = \sum f(x_i^*)\beta(x_i^*)\Delta x_i.$$

Then

$$|S(P,f,\alpha)-S(P,f\beta)| \le \sum|f(x_i^*)|\left|\int_{x_{i-1}}^{x_i}\beta(t)\,dt-\beta(x_i^*)\Delta x_i\right| \le \sum B\,\omega_{[x_{i-1},x_i]}(\beta)\Delta x_i \le B\varepsilon.$$

This implies that the Riemann-Stieltjes sum of f with respect to α converges if and only if the Riemann sum of fβ converges. Moreover, the two limits are the same.
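The estimate in the proof of Theorem 3.3.3 says that, on the same tagged partition, the Riemann-Stieltjes sum of f with respect to α and the Riemann sum of fβ differ by at most B times the oscillation sum of β. The sketch below illustrates this under assumptions chosen only for the example: f = exp and β = cos on [0, 1], so that α = sin, with left endpoints as tags; the function names are hypothetical.

```python
import math


def rs_sum(f, alpha, points, tags):
    """Riemann-Stieltjes sum S(P, f, alpha) for the given partition and tags."""
    return sum(
        f(t) * (alpha(points[i + 1]) - alpha(points[i]))
        for i, t in enumerate(tags)
    )


def riemann_sum(g, points, tags):
    """Ordinary Riemann sum S(P, g) for the given partition and tags."""
    return sum(g(t) * (points[i + 1] - points[i]) for i, t in enumerate(tags))


def compare(n):
    """Return (S(P, f, alpha), S(P, f*beta)) on [0, 1] with left-endpoint tags."""
    points = [i / n for i in range(n + 1)]
    tags = points[:-1]
    f = math.exp
    beta = math.cos
    alpha = math.sin  # alpha(x) is the integral of cos over [0, x]
    return (
        rs_sum(f, alpha, points, tags),
        riemann_sum(lambda x: f(x) * beta(x), points, tags),
    )


if __name__ == "__main__":
    for n in (10, 100, 1000):
        stieltjes, riemann = compare(n)
        # Here |f| <= e and the oscillation of cos on an interval of length
        # 1/n is at most 1/n, so the proof's bound on the gap is e/n.
        print(n, stieltjes, riemann, abs(stieltjes - riemann), math.e / n)
```

Both sums approach the integral of e^x cos x over [0, 1], and the gap between them stays below the bound from the proof.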

The relation between the Riemann-Stieltjes integral and the ordinary Riemann integral suggests that the integration by parts and the change of variable formulae can be extended. The integration by parts for the Riemann-Stieltjes integral turns out to be more symmetric and elegant.

Theorem 3.3.4. Suppose f is Riemann-Stieltjes integrable with respect to α. Then α is Riemann-Stieltjes integrable with respect to f. Moreover,

$$\int_a^b f\,d\alpha + \int_a^b \alpha\,df = f(b)\alpha(b)-f(a)\alpha(a). \tag{3.3.7}$$

Proof. Let

$$P: a = x_0 < x_1 < x_2 < \cdots < x_n = b$$

be a partition of [a, b], and let $x_{i-1} \le x_i^* \le x_i$ be chosen for 1 ≤ i ≤ n. Consider

$$Q: a = x_0^* \le x_1^* \le x_2^* \le \cdots \le x_n^* \le x_{n+1}^* = b.$$

Q is almost a partition, except that some repetition may happen among the partition points. By choosing $x_{i-1}^* \le x_{i-1} \le x_i^*$ for 1 ≤ i ≤ n + 1, we still use the notation

$$S(Q,f,\alpha) = \sum_{i=1}^{n+1} f(x_{i-1})(\alpha(x_i^*)-\alpha(x_{i-1}^*))$$

to denote the Riemann-Stieltjes sum of f with respect to α. In case a repetition $x_i^* = x_{i-1}^*$ happens, the corresponding term in the sum simply vanishes. Therefore S(Q, f, α) is the same as the Riemann-Stieltjes sum after removing all the repetitions, and we may pretend there is no repetition in S(Q, f, α) in the subsequent argument. In particular, by the integrability of f with respect to α, for any ε > 0, there is δ, such that

$$\|Q\| < \delta \implies \left|S(Q,f,\alpha)-\int_a^b f\,d\alpha\right| < \varepsilon.$$

Since $x_i^*-x_{i-1}^* \le x_i-x_{i-2} \le 2\|P\|$, we have ‖Q‖ ≤ 2‖P‖ and

$$\|P\| < \frac{\delta}{2} \implies \left|S(Q,f,\alpha)-\int_a^b f\,d\alpha\right| < \varepsilon.$$

Note that

$$\begin{aligned}
S(Q,f,\alpha)
&= \sum_{i=1}^{n+1} f(x_{i-1})\alpha(x_i^*) - \sum_{i=1}^{n+1} f(x_{i-1})\alpha(x_{i-1}^*) \\
&= \sum_{i=1}^{n} f(x_{i-1})\alpha(x_i^*) + f(b)\alpha(b) - \sum_{i=2}^{n+1} f(x_{i-1})\alpha(x_{i-1}^*) - f(a)\alpha(a) \\
&= \sum_{i=1}^{n} f(x_{i-1})\alpha(x_i^*) - \sum_{i=1}^{n} f(x_i)\alpha(x_i^*) + f(b)\alpha(b) - f(a)\alpha(a) \\
&= -S(P,\alpha,f) + f(b)\alpha(b) - f(a)\alpha(a).
\end{aligned}$$

Therefore we have

$$\|P\| < \frac{\delta}{2} \implies \left|S(P,\alpha,f)-\left(f(b)\alpha(b)-f(a)\alpha(a)-\int_a^b f\,d\alpha\right)\right| < \varepsilon.$$

This proves that α is Riemann-Stieltjes integrable with respect to f, and the formula (3.3.7) holds.
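The algebraic identity at the heart of this proof, S(Q, f, α) = −S(P, α, f) + f(b)α(b) − f(a)α(a), holds exactly for every tagged partition, so it can be checked in floating point up to rounding. The following sketch is an illustration only; the function names, the choice f = exp, α = sin on [0, 1], and the midpoint tags are assumptions made here.

```python
import math


def rs_sum(f, alpha, points, tags):
    """Riemann-Stieltjes sum: sum of f(tags[i]) * (alpha(points[i+1]) - alpha(points[i]))."""
    return sum(
        f(t) * (alpha(points[i + 1]) - alpha(points[i]))
        for i, t in enumerate(tags)
    )


def parts_identity(f, alpha, a, b, n):
    """Return both sides of S(Q, f, alpha) = -S(P, alpha, f) + f(b)alpha(b) - f(a)alpha(a)."""
    P = [a + (b - a) * i / n for i in range(n + 1)]      # x_0, ..., x_n
    tags = [(P[i] + P[i + 1]) / 2 for i in range(n)]     # x_1^*, ..., x_n^*
    Q = [a] + tags + [b]                                 # x_0^*, ..., x_{n+1}^*
    lhs = rs_sum(f, alpha, Q, P)                         # tags of Q are the points of P
    rhs = -rs_sum(alpha, f, P, tags) + f(b) * alpha(b) - f(a) * alpha(a)
    return lhs, rhs


if __name__ == "__main__":
    lhs, rhs = parts_identity(math.exp, math.sin, 0.0, 1.0, 50)
    print(lhs, rhs, abs(lhs - rhs))
```

Since the identity is purely a rearrangement of finite sums, the two printed values agree up to rounding error for any n, not only in the limit.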

Exercise 3.3.25. Suppose f(x) is a non-negative and decreasing function on [a, b] that is Riemann-Stieltjes integrable with respect to α. Suppose m ≤ α ≤ M on [a, b]. Prove

$$f(a)(m-\alpha(a)) \le \int_a^b f\,d\alpha \le f(a)(M-\alpha(a))$$

and derive the second integral mean value theorem in Exercises 3.2.58 and 3.2.59 without assuming the differentiability.

In view of Theorem 3.3.3, the following extends Theorem 3.2.5 for the change of variable.

Theorem 3.3.5. Suppose φ is increasing and continuous on [a, b]. Suppose f is Riemann-Stieltjes integrable with respect to α on [φ(a), φ(b)]. Then f ∘ φ is Riemann-Stieltjes integrable with respect to α ∘ φ on [a, b]. Moreover,

$$\int_{\phi(a)}^{\phi(b)} f\,d\alpha = \int_a^b (f\circ\phi)\,d(\alpha\circ\phi). \tag{3.3.8}$$

Proof. Let P be a partition of [a, b]. Similar to the proof of Theorem 3.2.5, we have a partition φ(P) of [φ(a), φ(b)]. Choose $x_i^*$ for P and choose the corresponding $\phi(x_i^*)$ for φ(P). Then the Riemann-Stieltjes sum of f ∘ φ with respect to α ∘ φ is

$$S(P,f\circ\phi,\alpha\circ\phi) = \sum f(\phi(x_i^*))(\alpha(\phi(x_i))-\alpha(\phi(x_{i-1}))) = S(\phi(P),f,\alpha).$$

Since f is Riemann-Stieltjes integrable with respect to α, for any ε > 0, there is δ > 0, such that

$$\|Q\| < \delta \implies \left|S(Q,f,\alpha)-\int_{\phi(a)}^{\phi(b)} f\,d\alpha\right| < \varepsilon.$$

The continuity of φ on the closed interval implies the uniform continuity. Therefore there is δ′ > 0, such that ‖P‖ < δ′ implies ‖φ(P)‖ < δ. Then ‖P‖ < δ′ implies

$$\left|S(P,f\circ\phi,\alpha\circ\phi)-\int_{\phi(a)}^{\phi(b)} f\,d\alpha\right| = \left|S(\phi(P),f,\alpha)-\int_{\phi(a)}^{\phi(b)} f\,d\alpha\right| < \varepsilon.$$

This proves that f ∘ φ is Riemann-Stieltjes integrable with respect to α ∘ φ and the formula (3.3.8) holds.

Exercise 3.3.26. What will happen to Theorem 3.3.5 if φ is decreasing? What if φ(x) is not assumed to be continuous?

Exercise 3.3.27. Suppose f and g are increasing functions satisfying g(f(x)) = x and f(0) = 0. Prove that if f is continuous, then for any a, b > 0, we have the Young inequality

$$\int_0^a f(x)\,dx + \int_0^b g(y)\,dy \ge ab.$$

3.3.4 Bounded Variation Function

The discussion of the integrability in Section 3.1.2 critically depends on the inequality (3.1.6). The inequality may be extended to the Riemann-Stieltjes sum

$$|f(c)(\alpha(b)-\alpha(a))-S(P,f,\alpha)| \le \omega_{[a,b]}(f)V_P(\alpha), \tag{3.3.9}$$

where

$$V_P(\alpha) = |\alpha(x_0)-\alpha(x_1)| + |\alpha(x_1)-\alpha(x_2)| + \cdots + |\alpha(x_{n-1})-\alpha(x_n)|$$

is the variation of α with respect to the partition P. We say α has bounded variation if there is a constant V, such that $V_P(\alpha) \le V$ for any partition P. If a partition Q refines the partition P, then we clearly have

$$V_Q(\alpha) \ge V_P(\alpha).$$

This suggests defining the variation of α on an interval to be

$$V_{[a,b]}(\alpha) = \sup\{V_P(\alpha) : P \text{ is a partition of } [a,b]\}.$$

It is easy to see that monotone functions have bounded variations with $V_{[a,b]}(\alpha) = |\alpha(b)-\alpha(a)|$, and linear combinations of bounded variation functions also have bounded variations. Moreover, it is easy to establish the following properties for the variations.

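Proposition 3.3.6. Suppose α and β are bounded variation functions on [a, b]. Then

$$\begin{aligned}
V_{[a,b]}(\alpha) &\ge |\alpha(b)-\alpha(a)|,\\
V_{[a,c]}(\alpha) &= V_{[a,b]}(\alpha)+V_{[b,c]}(\alpha) \quad\text{for } a < b < c,\\
V_{[a,b]}(\lambda\alpha) &= |\lambda|V_{[a,b]}(\alpha),\\
V_{[a,b]}(\alpha+\beta) &\le V_{[a,b]}(\alpha)+V_{[a,b]}(\beta).
\end{aligned}$$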


Proof. The difference |α(b) − α(a)| is the variation with respect to the partition $a = x_0 < x_1 = b$. Therefore the first inequality follows from the definition of the variation $V_{[a,b]}(\alpha)$.

For any partition P of [a, c], we denote by P ∪ {b} the partition of [a, c] obtained by adding b to P, and denote by $P_{[x,y]}$ the partition of [x, y] obtained by combining x, y and those points in P lying in [x, y]. Then

$$V_{P_{[a,b]}}(\alpha)+V_{P_{[b,c]}}(\alpha) = V_{P\cup\{b\}}(\alpha) \ge V_P(\alpha).$$

This implies $V_{[a,c]}(\alpha) \le V_{[a,b]}(\alpha)+V_{[b,c]}(\alpha)$. Conversely, for any partitions P′ of [a, b] and P″ of [b, c], we denote by P′ ∪ P″ the partition of [a, c] obtained by combining points in P′ and P″ together. Then

$$V_{P'\cup P''}(\alpha) = V_{P'}(\alpha)+V_{P''}(\alpha).$$

This implies that $V_{[a,c]}(\alpha) \ge V_{[a,b]}(\alpha)+V_{[b,c]}(\alpha)$. This completes the proof of the second equality.

The third equality follows from $V_P(\lambda\alpha) = |\lambda|V_P(\alpha)$.

The fourth inequality follows from $V_P(\alpha+\beta) \le V_P(\alpha)+V_P(\beta)$.

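Example 3.3.16. Suppose β is Riemann integrable and $\alpha(x) = \int_a^x\beta(t)\,dt$. Then $V_P(\alpha) = \sum\left|\int_{x_{i-1}}^{x_i}\beta(t)\,dt\right|$, and by (3.1.13) in Exercise 3.1.16,

$$\begin{aligned}
|S(P,|\beta|)-V_P(\alpha)|
&\le \sum\left||\beta(x_i^*)|\Delta x_i-\left|\int_{x_{i-1}}^{x_i}\beta(t)\,dt\right|\right| \\
&\le \sum\left|\beta(x_i^*)\Delta x_i-\int_{x_{i-1}}^{x_i}\beta(t)\,dt\right| \\
&\le \sum\omega_{[x_{i-1},x_i]}(\beta)\Delta x_i.
\end{aligned}$$

This implies α has bounded variation, and

$$V_{[a,b]}(\alpha) = \int_a^b|\beta(t)|\,dt.$$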
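The formula of Example 3.3.16 can be illustrated by computing the variation with respect to finer and finer partitions. In the sketch below, the choice β = cos on [0, 2π] (so that α = sin and the integral of |β| equals 4) and the helper names are assumptions made for the illustration.

```python
import math


def variation(alpha, points):
    """Variation V_P(alpha) of alpha with respect to the partition `points`."""
    return sum(
        abs(alpha(points[i + 1]) - alpha(points[i]))
        for i in range(len(points) - 1)
    )


def uniform_partition(a, b, n):
    """Partition of [a, b] into n intervals of equal length."""
    return [a + (b - a) * i / n for i in range(n + 1)]


if __name__ == "__main__":
    for n in (3, 7, 50, 1000):
        pts = uniform_partition(0.0, 2 * math.pi, n)
        # Every V_P(sin) is at most the total variation 4, and the values
        # approach 4 as the partition becomes fine.
        print(n, variation(math.sin, pts))
```

Coarse partitions that miss the turning points π/2 and 3π/2 give visibly smaller values, which reflects the supremum in the definition of the variation.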

Exercise 3.3.28. Prove that any Lipschitz function has bounded variation.

Exercise 3.3.29. Prove that any bounded variation function is Riemann integrable.

Exercise 3.3.30. Suppose a function α has bounded variation on [a, b]. Prove that α is increasing if and only if $V_{[a,b]}(\alpha) = \alpha(b)-\alpha(a)$.

Exercise 3.3.31. Suppose f is Riemann-Stieltjes integrable with respect to α. Suppose a ≤ c ≤ b. Prove that

$$\left|\int_a^b f\,d\alpha\right| \le \sup_{[a,b]}|f|\,V_{[a,b]}(\alpha),$$

$$\left|f(c)(\alpha(b)-\alpha(a))-\int_a^b f\,d\alpha\right| \le \omega_{[a,b]}(f)V_{[a,b]}(\alpha).$$


Exercise 3.3.32. Suppose f is Riemann-Stieltjes integrable with respect to a bounded variation function α, and $F(x) = \int_a^x f\,d\alpha$. Prove that if α is not continuous at $x_0$ and $f(x_0) \ne 0$, then F is not continuous at $x_0$.

For a function α(x) with bounded variation on [a, b], define the variation function

$$v(x) = V_{[a,x]}(\alpha).$$

Intuitively, the change of α is positive when α is increasing and negative when α is decreasing. The variation function v(x) keeps track of both as positive changes. Then we may further define the positive variation function and the negative variation function

$$v^+ = \frac{v+\alpha}{2}, \qquad v^- = \frac{v-\alpha}{2}.$$

Intuitively, $v^+(x)$ keeps track of the positive changes only and becomes constant on the intervals on which α is decreasing, and $v^-(x)$ keeps track of the negative changes only. We have $\alpha = v^+-v^-$. Moreover, for a ≤ x < y ≤ b, by Proposition 3.3.6, we have

$$v^+(y)-v^+(x) = \frac{V_{[x,y]}(\alpha)+\alpha(y)-\alpha(x)}{2} \ge \frac{V_{[x,y]}(\alpha)-|\alpha(y)-\alpha(x)|}{2} \ge 0.$$

Thus $v^+(x)$ is increasing. Similarly, $v^-(x)$ is also increasing. Therefore we have proved the necessity part of the following.

Proposition 3.3.7. A function has bounded variation if and only if it is the difference of two increasing functions.

The sufficiency follows from the fact that monotone functions have bounded variations and linear combinations of bounded variation functions still have bounded variations.

Exercise 3.3.33. Suppose β is Riemann integrable and $\alpha(x) = \int_a^x\beta(t)\,dt$. Prove that $v^+(x) = \int_a^x\max\{0,\beta(t)\}\,dt$ and $v^-(x) = -\int_a^x\min\{0,\beta(t)\}\,dt$.

Exercise 3.3.34. Suppose a function α has bounded variation, with positive and negative variation functions $v^+$ and $v^-$. Suppose $\alpha = u^+-u^-$, where $u^+$ and $u^-$ are increasing functions.

1. Prove that $V_{[a,b]}(\alpha) \le (u^+(b)-u^+(a))+(u^-(b)-u^-(a))$.

2. Prove that $v^+(y)-v^+(x) \le u^+(y)-u^+(x)$ and $v^-(y)-v^-(x) \le u^-(y)-u^-(x)$ for any a ≤ x < y ≤ b.

3. Prove that the equality in the first part holds if and only if $u^+ = v^++c$ and $u^- = v^-+c$ for some constant c.

The result shows that $\alpha = v^+-v^-$ is the "most efficient" way of expressing a bounded variation function as the difference of increasing functions.

Next we establish properties of continuous bounded variation functions.


Proposition 3.3.8. A bounded variation function is continuous if and only if its variation function is continuous.

Proof. Let α be a bounded variation function on [a, b] and $v(x) = V_{[a,x]}(\alpha)$. By Proposition 3.3.6, for a ≤ x < y ≤ b, we have

$$|\alpha(y)-\alpha(x)| \le V_{[x,y]}(\alpha) = v(y)-v(x).$$

This implies that if v is continuous, then α is also continuous.

Conversely, for any ε > 0, there is a partition P of [a, b], such that

$$V_P(\alpha) > V_{[a,b]}(\alpha)-\varepsilon = v(b)-\varepsilon.$$

If α is continuous, then there is δ > 0, such that a ≤ x < y ≤ b and y − x < δ imply |α(y) − α(x)| < ε. Let $\delta_P = \min\Delta x_i$ be the length of the smallest interval in P. Then for a ≤ x < y ≤ b satisfying $y-x < \delta' = \min\{\delta,\delta_P\}$, the interval (x, y) contains at most one point from P. Denote by Q = P ∪ {x, y} the partition of [a, b] obtained by adding x, y to P. Denote by $Q_{[x,y]}$ the partition of [x, y] obtained by combining x, y and those points in P lying in [x, y] (as in the proof of Proposition 3.3.6), and similarly for $Q_{[a,x]}$ and $Q_{[y,b]}$. Then

$$\begin{aligned}
V_{Q_{[a,x]}}(\alpha)+V_{Q_{[x,y]}}(\alpha)+V_{Q_{[y,b]}}(\alpha) &= V_Q(\alpha) \ge V_P(\alpha) > V_{[a,b]}(\alpha)-\varepsilon \\
&= V_{[a,x]}(\alpha)+V_{[x,y]}(\alpha)+V_{[y,b]}(\alpha)-\varepsilon.
\end{aligned}$$

By $V_{Q_{[a,x]}}(\alpha) \le V_{[a,x]}(\alpha)$ and $V_{Q_{[y,b]}}(\alpha) \le V_{[y,b]}(\alpha)$, we get

$$V_{Q_{[x,y]}}(\alpha) \ge V_{[x,y]}(\alpha)-\varepsilon.$$

Since (x, y) contains at most one point from P, the partition $Q_{[x,y]}$ is either x < y or x < c < y for some c ∈ P. Since y − x < δ, we have either

$$V_{Q_{[x,y]}}(\alpha) = |\alpha(y)-\alpha(x)| < \varepsilon,$$

or

$$V_{Q_{[x,y]}}(\alpha) = |\alpha(y)-\alpha(c)|+|\alpha(c)-\alpha(x)| < 2\varepsilon.$$

Therefore in either case, we get

$$v(y)-v(x) = V_{[x,y]}(\alpha) \le V_{Q_{[x,y]}}(\alpha)+\varepsilon \le 3\varepsilon.$$

Since the only condition for this to hold is y − x < δ′, this proves the continuity of v.

The following is a technical result about continuous bounded variation functions that will be useful for proving further results.

Proposition 3.3.9. Suppose α is a continuous bounded variation function on [a, b]. Then for any ε > 0, there is δ > 0, such that ‖P‖ < δ implies $V_P(\alpha) > V_{[a,b]}(\alpha)-\varepsilon$.


Proof. Example 3.3.16 shows the close relation between the variation and the Riemann integral. The subsequent proof is similar to Exercises 3.1.38 and 3.1.39.

By definition, there is a partition Q, such that $V_Q(\alpha) > V_{[a,b]}(\alpha)-\varepsilon$. Suppose Q contains n partition points. By the continuity of α, there is δ > 0, such that

$$|x-y| < \delta \implies |\alpha(x)-\alpha(y)| < \frac{\varepsilon}{2n}.$$

Now assume P is any partition satisfying ‖P‖ < δ. Then the oscillations of α on the intervals in P are $< \frac{\varepsilon}{2n}$. Denote by P ∪ Q the partition of [a, b] obtained by combining points of P and Q together. Then

$$V_{P\cup Q}(\alpha) \ge V_Q(\alpha) > V_{[a,b]}(\alpha)-\varepsilon.$$

On the other hand, P ∪ Q is obtained from P by adding no more than n points. These n points fall into at most n intervals in P, and the sums for $V_{P\cup Q}(\alpha)$ and $V_P(\alpha)$ differ only at these intervals. Specifically, if $[x_{i-1},x_i]$ is one such interval that contains k points from Q, then the term $|\alpha(x_i)-\alpha(x_{i-1})|$ in $V_P(\alpha)$ is replaced by a sum of at most k + 1 terms in $V_{P\cup Q}(\alpha)$. Since $\omega_{[x_{i-1},x_i]}(\alpha) < \frac{\varepsilon}{2n}$, the sum of these k + 1 terms is $< (k+1)\frac{\varepsilon}{2n}$. Adding up all these differences, we get (note that $\sum(k_i+1) \le 2n$)

$$V_{P\cup Q}(\alpha) < V_P(\alpha)+2n\frac{\varepsilon}{2n} = V_P(\alpha)+\varepsilon.$$

Combining the two inequalities together, we conclude that

$$\|P\| < \delta \implies V_P(\alpha)+\varepsilon > V_{[a,b]}(\alpha)-\varepsilon.$$

Exercise 3.3.35. Given the conclusion of Proposition 3.3.9, prove that for any partition P of any interval [c, d] ⊂ [a, b] satisfying ‖P‖ < δ, we have $V_P(\alpha) > V_{[c,d]}(\alpha)-\varepsilon$. Use this observation to give another proof of Proposition 3.3.8.

Exercise 3.3.36. Suppose f is Riemann-Stieltjes integrable with respect to α, and $F(x) = \int_a^x f\,d\alpha$. Prove that if α is continuous and has bounded variation, then F is continuous. Compare with Exercise 3.3.32.

Now we are ready to extend Theorem 3.1.3 on the criterion for Riemann integrability to the Riemann-Stieltjes integrability.

Theorem 3.3.10. Suppose f is bounded and α has bounded variation.

1. If f is Riemann-Stieltjes integrable with respect to α, then for any ε > 0, there is δ > 0, such that

$$\|P\| < \delta \implies \sum\omega_{[x_{i-1},x_i]}(f)|\Delta\alpha_i| < \varepsilon.$$

2. If for any ε > 0, there is δ > 0, such that

$$\|P\| < \delta \implies \sum\omega_{[x_{i-1},x_i]}(f)V_{[x_{i-1},x_i]}(\alpha) < \varepsilon,$$

then f is Riemann-Stieltjes integrable with respect to α.

Moreover, if α is monotone, then the two parts are converse to each other and give a necessary and sufficient condition for the Riemann-Stieltjes integrability. If α is continuous, then the converse of the second part is also true and gives a necessary and sufficient condition for the Riemann-Stieltjes integrability.

Proof. The proof is very similar to the proof of Theorem 3.1.3. For the first part, we note that by taking P = P′ but choosing different $x_i^*$ for P and $x_i'^*$ for P′, we have

$$S(P,f,\alpha)-S(P',f,\alpha) = \sum(f(x_i^*)-f(x_i'^*))\Delta\alpha_i.$$

Then for each i, we may choose $f(x_i^*)-f(x_i'^*)$ to have the same sign as $\Delta\alpha_i$ and $|f(x_i^*)-f(x_i'^*)|$ to be as close to the oscillation $\omega_{[x_{i-1},x_i]}(f)$ as possible. The result is that S(P, f, α) − S(P′, f, α) is very close to $\sum\omega_{[x_{i-1},x_i]}(f)|\Delta\alpha_i|$. The rest of the proof is the same.

For the second part, the key is the following estimation for a refinement Q of P.

$$\begin{aligned}
|S(P,f,\alpha)-S(Q,f,\alpha)|
&= \left|\sum_{i=1}^n f(x_i^*)\Delta\alpha_i-\sum_{i=1}^n S(Q_{[x_{i-1},x_i]},f,\alpha)\right| \\
&\le \sum_{i=1}^n\left|f(x_i^*)(\alpha(x_i)-\alpha(x_{i-1}))-S(Q_{[x_{i-1},x_i]},f,\alpha)\right| \\
&\le \sum_{i=1}^n\omega_{[x_{i-1},x_i]}(f)V_{Q_{[x_{i-1},x_i]}}(\alpha) \\
&\le \sum_{i=1}^n\omega_{[x_{i-1},x_i]}(f)V_{[x_{i-1},x_i]}(\alpha),
\end{aligned}$$

where the second inequality follows from (3.3.9). The rest of the proof is the same.

If α is monotone, then $|\Delta\alpha_i| = V_{[x_{i-1},x_i]}(\alpha)$, so that the two parts are converse to each other.

If α is continuous, then by Proposition 3.3.9, for any ε > 0, there is δ > 0, such that ‖P‖ < δ implies $V_P(\alpha) > V_{[a,b]}(\alpha)-\varepsilon$. Now for Riemann-Stieltjes integrable f, by the first part, there is δ′ > 0, such that ‖P‖ < δ′ implies $\sum\omega_{[x_{i-1},x_i]}(f)|\Delta\alpha_i| < \varepsilon$. If |f| < B, then ‖P‖ < min{δ, δ′} implies

$$\begin{aligned}
\sum\omega_{[x_{i-1},x_i]}(f)V_{[x_{i-1},x_i]}(\alpha)
&\le \sum\omega_{[x_{i-1},x_i]}(f)|\Delta\alpha_i|+\sum\omega_{[x_{i-1},x_i]}(f)(V_{[x_{i-1},x_i]}(\alpha)-|\Delta\alpha_i|) \\
&\le \varepsilon+2B\sum(V_{[x_{i-1},x_i]}(\alpha)-|\Delta\alpha_i|) \\
&= \varepsilon+2B(V_{[a,b]}(\alpha)-V_P(\alpha)) < (2B+1)\varepsilon.
\end{aligned}$$


With the help of the theorem, the results in Section 3.1.3 may be extended. The proofs are left as exercises.

Proposition 3.3.11. Any continuous function is Riemann-Stieltjes integrable with respect to any function with bounded variation.

By Theorem 3.3.4, we have the following result that extends Proposition 3.1.5.

Proposition 3.3.12. Any function with bounded variation is Riemann-Stieltjes integrable with respect to any continuous function.

If α is monotone or continuous, then Theorem 3.3.10 gives us necessary and sufficient criteria for the Riemann-Stieltjes integrability. Based on this, Proposition 3.1.6 can be proved just as before.

Proposition 3.3.13. Suppose f is Riemann-Stieltjes integrable with respect to α, which is either monotone or continuous with bounded variation. Suppose the values of f lie in a finite union U of closed intervals, and φ is a continuous function on U. Then the composition φ ∘ f is also Riemann-Stieltjes integrable with respect to α.

For the same reason as in Riemann integration, under the assumption about α in Proposition 3.3.13, the products of functions Riemann-Stieltjes integrable with respect to α are still Riemann-Stieltjes integrable with respect to α.

3.3.5 Additional Exercise

Trigonometric Integration

Exercise 3.3.37. By writing $a\sin x + b\cos x$ as a linear combination of $c\sin x + d\cos x$ and $(c\sin x + d\cos x)'$, compute the antiderivatives

$$\int \frac{a\sin x + b\cos x}{c\sin x + d\cos x}\,dx \quad\text{and}\quad \int \frac{a\sin x + b\cos x}{(c\sin x + d\cos x)^2}\,dx.$$

Use a similar idea to compute the antiderivative

$$\int \frac{a\sin x + b\cos x + \lambda}{c\sin x + d\cos x + \mu}\,dx.$$

Exercise 3.3.38. By writing $Au^2 + 2Buv + Cv^2$ in the form $(\alpha u + \beta v)(au + bv) + \gamma(u^2 + v^2)$, compute the antiderivative

$$\int \frac{A\sin^2 x + 2B\sin x\cos x + C\cos^2 x}{a\sin x + b\cos x}\,dx.$$

Gamma Function

The Gamma function is

$$\Gamma(x) = \int_0^{+\infty} t^{x-1}e^{-t}\,dt.$$

Exercise 3.3.39. Prove that the function is defined and continuous for $x > 0$.

Exercise 3.3.40. Prove $\lim_{x\to 0^+} \Gamma(x) = \lim_{x\to+\infty} \Gamma(x) = +\infty$.


Exercise 3.3.41. Prove the other formulae for the Gamma function

$$\Gamma(x) = 2\int_0^{\infty} t^{2x-1}e^{-t^2}\,dt = a^x\int_0^{\infty} t^{x-1}e^{-at}\,dt.$$

Exercise 3.3.42. Prove the equalities for the Gamma function

Γ(x+ 1) = xΓ(x), Γ(n) = (n− 1)!.
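
The two equalities in Exercise 3.3.42 can be sanity-checked numerically before attempting a proof. The sketch below is only an illustration, not part of the exercise: it relies on the standard library's `math.gamma`, and the helper names `check_gamma_recursion` and `gamma_matches_factorial` are hypothetical.

```python
import math


def check_gamma_recursion(xs, rel_tol=1e-9):
    """Check Gamma(x + 1) == x * Gamma(x) numerically at each sample point."""
    return all(
        math.isclose(math.gamma(x + 1), x * math.gamma(x), rel_tol=rel_tol)
        for x in xs
    )


def gamma_matches_factorial(n_max):
    """Check Gamma(n) == (n - 1)! for n = 1, ..., n_max."""
    return all(
        math.isclose(math.gamma(n), math.factorial(n - 1), rel_tol=1e-12)
        for n in range(1, n_max + 1)
    )
```

Agreement at a few sample points is of course no substitute for the integration by parts argument the exercise asks for.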

Beta Function

The Beta function is

$$B(x, y) = \int_0^1 t^{x-1}(1-t)^{y-1}\,dt.$$

In Exercise 7.1.70, we will see that the Beta function can be written in terms of the Gamma function.

Exercise 3.3.43. Prove that the function is defined for $x, y > 0$.

Exercise 3.3.44. Use the change of variables $t = \dfrac{1}{1+u}$ to prove the other formulae for the Beta function

$$B(x, y) = \int_0^{\infty} \frac{t^{y-1}}{(1+t)^{x+y}}\,dt = \int_0^1 \frac{t^{x-1} + t^{y-1}}{(1+t)^{x+y}}\,dt.$$

Exercise 3.3.45. Prove

$$B(x, y) = 2\int_0^{\frac{\pi}{2}} \cos^{2x-1} t\,\sin^{2y-1} t\,dt.$$

Exercise 3.3.46. Prove the equalities for the Beta function

$$B(x, y) = B(y, x), \qquad B(x+1, y) = \frac{x}{x+y}B(x, y).$$

Dirichlet Test and Abel⁸ Test

In Example 3.3.12, we showed the convergence of an improper integral for which the convergence cannot be established by the comparison test. The idea can be elaborated into the following tests.

Suppose we are interested in the convergence of the improper integral $\int_a^{+\infty} f(x)g(x)\,dx$. The Dirichlet test verifies the following conditions.

1. $f(x)$ is monotone and satisfies $\lim_{x\to+\infty} f(x) = 0$.

2. There is $M$, such that $\left|\int_a^c g(x)\,dx\right| < M$ for all $c \in [a, +\infty)$.

⁸ Niels Henrik Abel, born 1802 in Frindoe (Norway), died 1829 in Froland (Norway). In 1824, Abel proved the impossibility of solving the general equation of fifth degree in radicals. Abel also made contributions to elliptic functions. Abel's name is enshrined in the term abelian, which describes the commutative property.


The Abel test verifies the following conditions.

1. $f(x)$ is monotone and bounded.

2. $\int_a^{+\infty} g(x)\,dx$ converges.

Exercise 3.3.47. By using the idea of Example 3.3.12, prove the convergence of $\int_a^{+\infty} f(x)g(x)\,dx$ under either set of conditions. The use of Theorem 3.2.3 on integration by parts requires the additional assumption that $f$ has integrable derivative. However, by using the more general Theorem 3.3.4, the condition can be relaxed.

Exercise 3.3.48. Derive the Abel test from the Dirichlet test.

Exercise 3.3.49. Can you extend the test to other kinds of improper integrals, such as an unbounded function on a bounded interval?

Exercise 3.3.50. Show that the improper integrals $\int_0^{+\infty} \dfrac{\sin x}{x}\,dx$ and $\int_0^{+\infty} \sin x^2\,dx$ converge.


Chapter 4

Series


4.1 Series of Numbers

When $n$ approaches infinity, the Taylor expansion becomes a series

$$T(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2}(x - x_0)^2 + \cdots + \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n + \cdots.$$

This can be considered as the infinite degree approximation of $f(x)$ at $x_0$. The value of the series would be a limit, and the natural question is whether the value is equal to the function itself.

Another important and useful series is the Fourier¹ series

$$\frac{1}{2}a_0 + a_1\cos x + b_1\sin x + a_2\cos 2x + b_2\sin 2x + \cdots + a_n\cos nx + b_n\sin nx + \cdots$$

defined for an integrable periodic function $f(x)$ with period $2\pi$, where

$$a_n = \frac{1}{\pi}\int_0^{2\pi} f(x)\cos nx\,dx, \qquad b_n = \frac{1}{\pi}\int_0^{2\pi} f(x)\sin nx\,dx.$$

Again the central question here is whether the limit of the series is equal to the function itself.

The Taylor series and the Fourier series are series of functions. In this part, we discuss the more elementary theory of the series of numbers. The series of functions will be discussed in the next part.

4.1.1 Sum of Series

A series is an infinite sum

$$\sum_{n=1}^{\infty} x_n = x_1 + x_2 + \cdots + x_n + \cdots.$$

The partial sum of the series is the sequence

$$s_n = \sum_{k=1}^{n} x_k = x_1 + x_2 + \cdots + x_n.$$

Definition 4.1.1. A series $\sum_{n=1}^{\infty} x_n$ has sum $s$, denoted $\sum_{n=1}^{\infty} x_n = s$, if $\lim_{n\to\infty} s_n = s$.

If the limit exists, then the series converges. Otherwise, the series diverges. If $\lim_{n\to\infty} s_n = \infty$, we also say the series diverges to infinity and write $\sum_{n=1}^{\infty} x_n = \infty$.

The Cauchy criterion for the convergence of sequences leads immediately to the Cauchy criterion for the convergence of series: A series $\sum_{n=1}^{\infty} x_n$ converges if and only if for any $\epsilon > 0$, there is $N$, such that

$$n \ge m > N \implies |x_m + x_{m+1} + \cdots + x_n| < \epsilon. \tag{4.1.1}$$

¹ Jean Baptiste Joseph Fourier, born 1768 in Bourgogne (France), died 1830 in Paris (France).


For the special case that $m = n$, we find that a necessary condition for a series $\sum_{n=1}^{\infty} x_n$ to converge is

$$\lim_{n\to\infty} x_n = 0.$$

Like sequences, a series does not have to start with index 1. On the other hand, without loss of generality, we can always assume that a series starts with index 1 in theoretical studies. Moreover, modifying, adding or deleting finitely many terms in a series does not change the convergence of the series.

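Example 4.1.1. The geometric series is $\sum_{n=0}^{\infty} a^n = 1 + a + a^2 + \cdots + a^n + \cdots$. The partial sum satisfies

$$(1 - a)s_n = (1 + a + a^2 + \cdots + a^n) - (a + a^2 + a^3 + \cdots + a^{n+1}) = 1 - a^{n+1}.$$

Therefore $s_n = \dfrac{1 - a^{n+1}}{1 - a}$ for $a \ne 1$, and

$$\sum_{n=0}^{\infty} a^n = \begin{cases} \dfrac{1}{1-a} & \text{if } |a| < 1, \\ \text{diverges} & \text{if } |a| \ge 1. \end{cases} \tag{4.1.2}$$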
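
The closed form for the partial sum can be compared against direct summation. The following is a small numerical sketch, with hypothetical helper names `geometric_partial_sum` and `geometric_closed_form` chosen only for this illustration.

```python
def geometric_partial_sum(a: float, n: int) -> float:
    """Return s_n = 1 + a + a^2 + ... + a^n by direct summation."""
    total, term = 0.0, 1.0
    for _ in range(n + 1):
        total += term
        term *= a
    return total


def geometric_closed_form(a: float, n: int) -> float:
    """Return (1 - a^(n+1)) / (1 - a), valid for a != 1."""
    return (1 - a ** (n + 1)) / (1 - a)
```

For $|a| < 1$ both values approach $\frac{1}{1-a}$ as $n$ grows; for $|a| \ge 1$ the terms $a^n$ do not tend to $0$, so the partial sums cannot converge.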

Example 4.1.2. If $x_n \ge 0$, then the partial sum is an increasing sequence. Therefore the convergence of the non-negative series $\sum x_n$ is equivalent to the boundedness of the partial sums. This is the reason behind the divergence of the harmonic series $\sum_{n=1}^{\infty} \dfrac{1}{n}$ and the convergence of the series $\sum_{n=1}^{\infty} \dfrac{1}{n^2}$ in Example 1.2.9. We also note that the computation in Example 1.2.9 tells us

$$\sum_{n=1}^{\infty} \frac{1}{n(n+1)} = \frac{1}{1\cdot 2} + \frac{1}{2\cdot 3} + \cdots + \frac{1}{(n-1)n} + \cdots = \lim_{n\to\infty}\left(1 - \frac{1}{n}\right) = 1.$$

Example 4.1.3. In Example 2.3.16, the estimation of the remainder of the Taylor series tells us that $\sum_{n=0}^{\infty} \dfrac{x^n}{n!}$ converges to $e^x$ for any $x$. In particular, we have

$$1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{n!} + \cdots = e.$$

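Exercise 4.1.1. Compute the sums of convergent series.

1. $\sum_{n=1}^{\infty} na^n$.

2. $\sum_{n=1}^{\infty} \dfrac{n}{2^{n-1}}$.

3. $\sum_{n=1}^{\infty} \dfrac{(-1)^n}{(2n)!}$.

4. $\sum_{n=1}^{\infty} \dfrac{(-1)^n}{(2n+1)!}$.

5. $\sum_{n=2}^{\infty} \log\left(1 - \dfrac{1}{n^2}\right)$.

6. $\sum_{n=1}^{\infty} \dfrac{1}{\sqrt[n]{a}}$.

7. $\sum_{n=1}^{\infty} \dfrac{1}{(a+n)(a+n+1)}$.

8. $\sum_{n=1}^{\infty} \dfrac{1}{n(n+1)(n+2)}$.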

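Exercise 4.1.2. Use Exercise 1.4.38 to show

$$1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \frac{1}{7} - \frac{1}{8} + \cdots = \log 2,$$

$$1 + \frac{1}{3} - \frac{1}{2} + \frac{1}{5} + \frac{1}{7} - \frac{1}{4} + \frac{1}{9} + \frac{1}{11} - \frac{1}{6} + \cdots = \frac{3}{2}\log 2.$$

Note that the second series is obtained from the first by rearranging the terms.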
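
The effect of the rearrangement can be observed numerically. The sketch below (an illustration only, with hypothetical function names) sums the original alternating series and the rearranged one, in which two positive terms are followed by one negative term.

```python
import math


def alternating_partial_sum(n_terms: int) -> float:
    """Partial sum of 1 - 1/2 + 1/3 - 1/4 + ... with n_terms terms."""
    return sum((-1) ** (k + 1) / k for k in range(1, n_terms + 1))


def rearranged_partial_sum(blocks: int) -> float:
    """Partial sum of 1 + 1/3 - 1/2 + 1/5 + 1/7 - 1/4 + ... over whole blocks.

    Each block consists of two consecutive odd reciprocals (positive)
    followed by one even reciprocal (negative).
    """
    total = 0.0
    odd, even = 1, 2
    for _ in range(blocks):
        total += 1.0 / odd + 1.0 / (odd + 2)
        odd += 4
        total -= 1.0 / even
        even += 2
    return total
```

With many terms, the first value approaches $\log 2$ and the second approaches $\frac{3}{2}\log 2$, although both sums use exactly the same terms. A numerical agreement is not a proof; the exercise asks for one based on Exercise 1.4.38.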


Exercise 4.1.3. Prove that if $y$ is not an integer multiple of $2\pi$, then

$$\sum_{k=0}^{n} \cos(x + ky) = \frac{1}{2\sin\frac{y}{2}}\left(\sin\left(x + \frac{2n+1}{2}y\right) - \sin\left(x - \frac{1}{2}y\right)\right),$$

$$\sum_{k=0}^{n} \sin(x + ky) = \frac{1}{2\sin\frac{y}{2}}\left(-\cos\left(x + \frac{2n+1}{2}y\right) + \cos\left(x - \frac{1}{2}y\right)\right).$$

Use integration between $y$ and $\pi$ to find the partial sum of the series $\sum \dfrac{\sin ny}{n}$ for $0 < y < 2\pi$. Then apply the Riemann-Lebesgue Lemma in Exercise 3.2.44 to get $\sum_{n=1}^{\infty} \dfrac{\sin ny}{n} = \dfrac{\pi - y}{2}$ for $0 < y < 2\pi$.

Exercise 4.1.4. Suppose $x_n > 0$. Compute $\sum_{n=1}^{\infty} \dfrac{x_n}{(1+x_1)(1+x_2)\cdots(1+x_n)}$.

Exercise 4.1.5. Prove that a sequence $x_n$ converges if and only if the series $\sum (x_{n+1} - x_n)$ converges.

Exercise 4.1.6. Prove that if the series $\sum x_n$ and $\sum y_n$ converge, then the series $\sum (x_n + y_n)$ and $\sum cx_n$ also converge. Moreover,

$$\sum (x_n + y_n) = \sum x_n + \sum y_n, \qquad \sum cx_n = c\sum x_n.$$

Exercise 4.1.7. Prove that if a series converges, then the series obtained by combining successive terms also converges. In other words, if $\sum_{n=1}^{\infty} x_n$ converges, then for any strictly increasing sequence of natural numbers $n_k$, the series $\sum_{k=1}^{\infty} (x_{n_k} + x_{n_k+1} + \cdots + x_{n_{k+1}-1})$ also converges.

Exercise 4.1.8. Prove that if $\lim_{n\to\infty} x_n = 0$, then $\sum x_n$ converges if and only if $\sum (x_{2n-1} + x_{2n})$ converges, and the two sums are the same. What about combining three consecutive terms?

Exercise 4.1.9. Suppose $x_n$ is decreasing and positive. Prove that $\sum x_n$ converges if and only if $\sum 2^n x_{2^n}$ converges. Then study the convergence of $\sum \dfrac{1}{n^\alpha}$ and $\sum \dfrac{1}{n(\log n)^\alpha}$.

An infinite product is

$$\prod_{n=1}^{\infty} x_n = x_1 x_2 \cdots x_n \cdots.$$

The partial product is the sequence $p_n = x_1 x_2 \cdots x_n$. If $\lim_{n\to\infty} p_n = p$ and $p \ne 0$, then we say the infinite product converges to (or has the product) $p$.

Because the limit $\lim_{n\to\infty} p_n$ is assumed to be nonzero, we have $x_n \ne 0$, so that the convergence is not affected if we modify, add or delete finitely many nonzero terms. Moreover, for a convergent infinite product we have

$$\lim_{n\to\infty} x_n = \lim_{n\to\infty} \frac{p_n}{p_{n-1}} = \frac{\lim_{n\to\infty} p_n}{\lim_{n\to\infty} p_{n-1}} = \frac{p}{p} = 1.$$

In particular, the terms $x_n$ will be positive for big enough $n$. Since the finitely many possibly negative (and nonzero) terms will not affect the convergence,


we may pretend all terms are positive and use log to convert the infinite product to a series

$$\sum_{n=1}^{\infty} \log x_n = \log x_1 + \log x_2 + \cdots + \log x_n + \cdots,$$

whose partial sum is

$$\log x_1 + \log x_2 + \cdots + \log x_n = \log p_n.$$

Thus the infinite product $\prod x_n$ converges if and only if the series $\sum \log x_n$ converges.

Exercise 4.1.10. Compute convergent infinite products.

1. $\prod_{n=1}^{\infty} \left(1 + \dfrac{1}{n}\right)$.

2. $\prod_{n=2}^{\infty} \left(1 + \dfrac{(-1)^n}{n}\right)$.

3. $\prod_{n=1}^{\infty} 2^{\frac{1}{n}}$.

4. $\prod_{n=1}^{\infty} 2^{\frac{(-1)^n}{n!}}$.

5. $\prod_{n=1}^{\infty} \cos\dfrac{x}{2^n}$.

Exercise 4.1.11. Use the Cauchy criterion for the convergence of the series $\sum \log x_n$ to get the Cauchy criterion for the convergence of the infinite product $\prod x_n$.

Exercise 4.1.12. Establish properties for infinite products similar to Exercises 4.1.5 and 4.1.6.

Exercise 4.1.13. Why do we have to consider the infinite product as divergent when $\lim_{n\to\infty} p_n = 0$? What bad things may happen if the case $\lim_{n\to\infty} p_n = 0$ is considered as convergent?

4.1.2 Comparison Test

Similar to the convergence of improper integrals, the convergence of series may be determined by comparing with other series.

Proposition 4.1.2 (Comparison Test). Suppose $|x_n| \le y_n$. If $\sum y_n$ converges, then $\sum x_n$ converges.

Proof. By the Cauchy criterion for the convergence of $\sum y_n$, for any $\epsilon > 0$, there is $N$, such that (4.1.1) holds for $\sum y_n$. Then for $n \ge m > N$, we have

$$|x_m + x_{m+1} + \cdots + x_n| \le |x_m| + |x_{m+1}| + \cdots + |x_n| \le y_m + y_{m+1} + \cdots + y_n < \epsilon.$$

This verifies the Cauchy criterion for the convergence of $\sum x_n$.

A series $\sum x_n$ absolutely converges if $\sum |x_n|$ converges. The series $\sum x_n$ in the proposition absolutely converges. The proposition also tells us that absolute convergence implies convergence. Moreover, by the discussion in Example 4.1.2, a series $\sum x_n$ absolutely converges if and only if the partial sum of $\sum |x_n|$ is bounded.


A special case of the comparison test is that if both $x_n$ and $y_n$ are positive, and $\lim_{n\to\infty} \dfrac{x_n}{y_n} = l$ exists, then the convergence of $\sum y_n$ implies the convergence of $\sum x_n$. If $l \ne 0$, then the two convergences are equivalent. More generally, if $c_1 y_n \le x_n \le c_2 y_n$ for some $c_1, c_2 > 0$, then the two convergences are also equivalent.

A series conditionally converges if it converges but does not absolutely converge.

Example 4.1.4. Since $\dfrac{1}{n^\alpha} \le \dfrac{1}{n^2}$ for $\alpha \ge 2$, the convergence of $\sum \dfrac{1}{n^2}$ in Example 4.1.2 implies that $\sum \dfrac{1}{n^\alpha}$ converges for $\alpha \ge 2$. On the other hand, since $\dfrac{1}{n^\alpha} \ge \dfrac{1}{n}$ for $0 < \alpha \le 1$, the divergence of the harmonic series $\sum \dfrac{1}{n}$ implies that $\sum \dfrac{1}{n^\alpha}$ diverges for $0 < \alpha \le 1$.

Example 4.1.5. Suppose $x_n$ satisfies $\sqrt[n]{|x_n|} \le a$ for some constant $a < 1$. Then $|x_n| \le a^n$ and the convergence of the geometric series $\sum a^n$ in Example 4.1.1 implies that $\sum x_n$ absolutely converges. This is called the root test.

For a specific example, let $p(t)$ be a nonzero polynomial. Then

$$\lim_{n\to\infty} \sqrt[n]{|p(n)a^n|} = |a| \lim_{n\to\infty} \sqrt[n]{|p(n)|} = |a|.$$

If $|a| < 1$, then the limit implies that for any $|a| < b < 1$, we have $\sqrt[n]{|p(n)a^n|} < b$ for sufficiently big $n$. Since the convergence of a series is independent of the choice of finitely many terms, we conclude that $\sum p(n)a^n$ converges for $|a| < 1$.

If $|a| \ge 1$, then $p(n)a^n$ does not converge to $0$ as $n \to \infty$. Thus $\sum p(n)a^n$ diverges for $|a| \ge 1$.

Example 4.1.6. Suppose $x_n$ satisfies $\left|\dfrac{x_{n+1}}{x_n}\right| \le a$ for some constant $a < 1$. Then

$$|x_n| = |x_1| \left|\frac{x_2}{x_1}\right| \left|\frac{x_3}{x_2}\right| \cdots \left|\frac{x_n}{x_{n-1}}\right| \le |x_1| a^{n-1}.$$

The convergence of the geometric series $|x_1| \sum a^{n-1}$ implies that $\sum x_n$ absolutely converges. This is called the ratio test.

For a specific example, the series $\sum \dfrac{(n!)^2 a^n}{(2n)!}$ satisfies

$$\lim_{n\to\infty} \left|\frac{\dfrac{(n!)^2 a^n}{(2n)!}}{\dfrac{((n-1)!)^2 a^{n-1}}{(2n-2)!}}\right| = \lim_{n\to\infty} \frac{n^2 |a|}{2n(2n-1)} = \frac{|a|}{4}.$$

Thus for $|a| < 4$, the ratio is $< \dfrac{|a|+1}{5} < 1$ for sufficiently big $n$, and the series converges. If $|a| \ge 4$, then the absolute value of the quotient is $> 1$ and the individual term does not converge to $0$ (by Exercise 1.1.33, the limit is in fact $\infty$), so that the series diverges.

Exercise 4.1.14. Suppose $p(x)$ and $q(x)$ are polynomials. Determine the convergence of $\sum \dfrac{p(n)}{q(n)}$.


Exercise 4.1.15. Is the convergence of $\sum x_n$ and $\sum y_n$ related to the convergence of $\sum \max\{x_n, y_n\}$ and $\sum \min\{x_n, y_n\}$?

Exercise 4.1.16. Suppose $\left|\dfrac{x_{n+1}}{x_n}\right| \le \left|\dfrac{y_{n+1}}{y_n}\right|$ for sufficiently big $n$. Prove that if $\sum y_n$ absolutely converges, then $\sum x_n$ absolutely converges.

Exercise 4.1.17 (Root Test). Prove that $\sum x_n$ absolutely converges if $\lim_{n\to\infty} \sqrt[n]{|x_n|} < 1$ and diverges if $\lim_{n\to\infty} \sqrt[n]{|x_n|} > 1$.

Exercise 4.1.18 (Ratio Test). Prove that $\sum x_n$ absolutely converges if $\lim_{n\to\infty} \left|\dfrac{x_{n+1}}{x_n}\right| < 1$ and diverges if $\lim_{n\to\infty} \left|\dfrac{x_{n+1}}{x_n}\right| > 1$.

Exercise 4.1.19. Prove that

$$\varlimsup_{n\to\infty} \sqrt[n]{|x_n|} \le \varlimsup_{n\to\infty} \left|\frac{x_{n+1}}{x_n}\right|, \qquad \varliminf_{n\to\infty} \sqrt[n]{|x_n|} \ge \varliminf_{n\to\infty} \left|\frac{x_{n+1}}{x_n}\right|.$$

What do the inequalities tell you about the relation between the root and ratio tests?

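Exercise 4.1.20. Determine the convergence of series.

1. $\sum \dfrac{1}{\sqrt{n(n-1)}}$.

2. $\sum \dfrac{\sin n}{\sqrt{(n^2-1)(n^2-2)}}$.

3. $\sum a^{n+(-1)^n}$.

4. $\sum a^{n^2}$.

5. $\sum n^2 a^{n^2}$.

6. $\sum \sqrt{a^n + b^n}$.

7. $\sum \dfrac{1}{\sqrt{a^n + b^n}}$.

8. $\sum \left(a^{\frac{1}{n}} - 1\right)$.

9. $\sum \left(e^{\frac{1}{n}} - 1 - \dfrac{1}{n}\right)$.

10. $\sum \left(\dfrac{an+b}{cn+d}\right)^n$.

11. $\sum \dfrac{\log n}{n^3}$.

12. $\sum \dfrac{1}{(\log n)^2}$.

13. $\sum \dfrac{1}{(\log n)^n}$.

14. $\sum n!\,a^n$.

15. $\sum \dfrac{na^n}{(n+1)!}$.

16. $\sum \dfrac{\sqrt{(2n)!}\,a^n}{n!}$.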

Exercise 4.1.21. Use Exercise 4.1.8 and the comparison test to show that the series $\sum \dfrac{(-1)^{n-1}}{n} = 1 - \dfrac{1}{2} + \dfrac{1}{3} - \dfrac{1}{4} + \cdots$ converges.

Exercise 4.1.22. Suppose $x_n \ne -1$.

1. Prove that if $\sum x_n$ converges, then $\prod (1 + x_n)$ converges if and only if $\sum x_n^2$ converges.

2. Prove that if $\sum x_n^2$ converges, then $\prod (1 + x_n)$ converges if and only if $\sum x_n$ converges.

The convergence of series may also be determined by comparing with the convergence of improper integrals.

Proposition 4.1.3 (Integral Comparison Test). Suppose $f(x)$ is a decreasing function on $[a, +\infty)$ satisfying $\lim_{x\to+\infty} f(x) = 0$. Then the series $\sum f(n)$ converges if and only if the improper integral $\int_a^{+\infty} f(x)\,dx$ converges.


Proof. Since the convergence is not changed if finitely many terms are modified or deleted, we may assume $a = 1$ without loss of generality.

Since $f(x)$ is decreasing, we have $f(k) \ge \int_k^{k+1} f(x)\,dx \ge f(k+1)$. Then

$$\begin{aligned}
f(1) + f(2) + \cdots + f(n-1)
&\ge \int_1^n f(x)\,dx = \int_1^2 f(x)\,dx + \int_2^3 f(x)\,dx + \cdots + \int_{n-1}^n f(x)\,dx \\
&\ge f(2) + f(3) + \cdots + f(n).
\end{aligned}$$

This implies that $\int_1^n f(x)\,dx$ is bounded if and only if the partial sums of the series $\sum f(n)$ are bounded. Since $f(x) \ge 0$, the boundedness is equivalent to the convergence. Therefore $\int_a^{+\infty} f(x)\,dx$ converges if and only if $\sum f(n)$ converges.
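
The sandwich inequality in the proof can be checked on a concrete function. The sketch below takes $f(x) = \frac{1}{x^2}$, for which $\int_1^n f(x)\,dx = 1 - \frac{1}{n}$ is known exactly; the choice of $f$ and the name `sandwich_for_inverse_square` are assumptions made only for this illustration.

```python
def sandwich_for_inverse_square(n: int):
    """Return (lower, integral, upper) for f(x) = 1/x^2 on [1, n].

    lower    = f(2) + f(3) + ... + f(n)
    integral = integral of f from 1 to n, which equals 1 - 1/n
    upper    = f(1) + f(2) + ... + f(n-1)
    """
    def f(x):
        return 1.0 / (x * x)

    upper = sum(f(k) for k in range(1, n))
    lower = sum(f(k) for k in range(2, n + 1))
    integral = 1.0 - 1.0 / n
    return lower, integral, upper
```

For every $n \ge 2$ the returned triple should satisfy `lower <= integral <= upper`, so the partial sums and the integrals are bounded together.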

Example 4.1.7. The series $\sum \dfrac{1}{n^\alpha}$ converges if and only if $\int_1^{+\infty} \dfrac{dx}{x^\alpha}$ converges, which by Example 3.3.7 means $\alpha > 1$. The Riemann zeta function

$$\zeta(\alpha) = 1 + \frac{1}{2^\alpha} + \frac{1}{3^\alpha} + \cdots + \frac{1}{n^\alpha} + \cdots \tag{4.1.3}$$

is then defined for $\alpha > 1$.

Example 4.1.8. The series $\sum \dfrac{1}{n(\log n)^\alpha}$ converges if and only if $\int_2^{+\infty} \dfrac{dx}{x(\log x)^\alpha}$ converges. It is easy to see that the improper integral converges if and only if $\alpha > 1$. Therefore the series converges if and only if $\alpha > 1$.

Exercise 4.1.23. Show that the number of $n$ digit numbers that do not contain the digit 9 is $8 \cdot 9^{n-1}$. Then use the fact to prove that if we delete the terms in the harmonic series that contain the digit 9, then the series becomes convergent. What about deleting the terms that contain some other digit? What about the numbers expressed in a base other than 10? What about deleting similar terms in the series $\sum \dfrac{1}{n^\alpha}$?

Exercise 4.1.24. Determine the convergence of series.

1.∑ sinn√

n(n− 1)(n− 2).

2.∑ 1

(log n)α logn.

3.∑ 1

nα + (log n)β.

4.∑nα(log n)β.

5.∑(

1− a log nn

)n.

6.∑∫ 1

n

0

1 + x2dx.

Exercise 4.1.25. Prove that if $\sum x_n^2$ converges, then $\sum \dfrac{x_n}{n^\alpha}$ also converges for any $\alpha > \dfrac{1}{2}$. What if $\alpha = \dfrac{1}{2}$?

Exercise 4.1.26. Use the idea of the proof of Proposition 4.1.3 to prove $\log (n-1)! < n(\log n - 1) + 1 < \log n!$. Then derive the inequality $\dfrac{n^n}{e^{n-1}} < n! < \dfrac{(n+1)^{n+1}}{e^n}$.


4.1.3 Conditional Convergence

A series $\sum x_n$ is alternating if the signs of $x_n$ and $x_{n+1}$ are different. Another way of expressing an alternating series is $\sum (-1)^n x_n$, with $x_n \ge 0$.

Proposition 4.1.4 (Leibniz Test). Suppose $x_n$ is decreasing and $\lim_{n\to\infty} x_n = 0$. Then $\sum (-1)^n x_n$ converges. Moreover, the sum $s$ and the partial sum $s_n$ satisfy $|s_n - s| \le x_{n+1}$.

Proof. The partial sum satisfies

$$s_{2n+1} - s_{2n-1} = x_{2n} - x_{2n+1} \ge 0, \qquad s_{2n+2} - s_{2n} = -x_{2n+1} + x_{2n+2} \le 0.$$

Thus $s_{2n+1}$ is increasing and $s_{2n}$ is decreasing. Moreover, by $s_{2n} - s_{2n+1} = x_{2n+1} \ge 0$, $s_{2n+1}$ has upper bound and $s_{2n}$ has lower bound. Therefore the sequences converge. Then $\lim_{n\to\infty} (s_{2n} - s_{2n+1}) = 0$ implies that the limits are the same.

The estimation $|s_n - s| \le x_{n+1}$ follows from

$$0 \le s_{2n} - s \le s_{2n} - s_{2n+1} = x_{2n+1}, \qquad 0 \le s - s_{2n-1} \le s_{2n} - s_{2n-1} = x_{2n}.$$
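
The error estimate $|s_n - s| \le x_{n+1}$ can be observed on the series $\sum_{n\ge 1} \frac{(-1)^n}{n}$, whose sum is $-\log 2$ (this follows from the first formula in Exercise 4.1.2). The sketch below is a numerical illustration with a hypothetical function name, not a proof.

```python
import math


def leibniz_bound_holds(n_max: int) -> bool:
    """Check |s_n - s| <= x_{n+1} for sum_{n>=1} (-1)^n / n, with x_n = 1/n.

    The sum s = -log 2 is taken as known. A tiny slack absorbs
    floating-point rounding in the accumulated partial sum.
    """
    s = -math.log(2)
    partial = 0.0
    for n in range(1, n_max + 1):
        partial += (-1) ** n / n
        if abs(partial - s) > 1.0 / (n + 1) + 1e-12:
            return False
    return True
```

The bound tells us in advance how many terms are needed for a given accuracy: to get within $10^{-3}$ of the sum it suffices to take $n$ with $x_{n+1} \le 10^{-3}$.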

Example 4.1.9. The series $\sum \dfrac{(-1)^n}{n^\alpha}$ converges for $\alpha > 0$. Note that by Example 4.1.7, the series absolutely converges only for $\alpha > 1$. Therefore the series conditionally converges for $0 < \alpha \le 1$.

Exercise 4.1.27. Suppose $b \ne 0$. Find all the combinations of $a$ and $b$ such that $\sum n^a b^n$ converges.

Exercise 4.1.28. Find all $a \ne e^{-1}$ such that $\sum \dfrac{(na)^n}{n!}$ converges (the case $a = e^{-1}$ will be dealt with in Exercise 4.1.45).

Exercise 4.1.29. Construct a convergent series $\sum x_n$ such that the series $\sum x_n^2$ diverges. By Exercise 4.1.22, we see that the convergence of $\sum x_n$ does not necessarily imply the convergence of $\prod (1 + x_n)$.

Exercise 4.1.30. Suppose $x_n$ is decreasing and $\lim_{n\to\infty} x_n = 1$. Prove that the "alternating infinite product" $\prod x_n^{(-1)^n}$ converges. Then find suitable $x_n$, such that the series $\sum y_n$ defined by $1 + y_n = x_n^{(-1)^n}$ diverges. This shows that the convergence of $\prod (1 + y_n)$ does not necessarily imply the convergence of $\sum y_n$.

A rearrangement of a series $\sum x_n$ is $\sum x_{k_n}$, where $n \to k_n$ is a one-to-one correspondence from the index set to itself (i.e., a rearrangement of the indices).

Proposition 4.1.5. Any rearrangement of an absolutely convergent series is still absolutely convergent. Moreover, the sum is the same.

Proof. Let $s = \sum x_n$. For any $\epsilon > 0$, there is a natural number $N$, such that $\sum_{i=N+1}^{\infty} |x_i| < \epsilon$. Let $N' = \max\{i : k_i \le N\}$. Then $\sum_{i=1}^{N'} x_{k_i}$ contains all the


terms $x_1, x_2, \dots, x_N$. Therefore for $n > N'$, the difference $\sum_{i=1}^{n} x_{k_i} - \sum_{i=1}^{N} x_i$ is a sum of some non-repeating terms in $\sum_{i=N+1}^{\infty} x_i$, and we have

$$\left|\sum_{i=1}^{n} x_{k_i} - s\right| \le \left|\sum_{i=1}^{n} x_{k_i} - \sum_{i=1}^{N} x_i\right| + \left|\sum_{i=1}^{N} x_i - s\right| \le 2\sum_{i=N+1}^{\infty} |x_i| < 2\epsilon.$$

The absolute convergence of the rearrangement may be obtained by applying what was just proved to $\sum |x_n|$.

In contrast to absolutely convergent series, Exercise 4.1.2 shows that the rearrangement of conditionally convergent series may change the sum. In fact, the rearrangement can produce any behavior we wish to have.

Proposition 4.1.6. A conditionally convergent series may be rearranged to have any number as the sum, or to become divergent.

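Proof. Suppose $\sum x_n$ conditionally converges. Let $\sum x'_n$ and $\sum x''_n$ be the series obtained by respectively taking only the non-negative terms and the negative terms. If $\sum x'_n$ converges, then $\sum x''_n = \sum x_n - \sum x'_n$ also converges. Therefore $\sum |x_n| = \sum x'_n - \sum x''_n$ converges. Since $\sum |x_n|$ is assumed to diverge, the contradiction shows that $\sum x'_n$ diverges. Because $x'_n \ge 0$, we conclude $\sum x'_n = +\infty$. Similarly, we have $\sum x''_n = -\infty$.

Let $s$ be any number. By $\sum x'_n = +\infty$, there is $m_1$, such that

$$\sum_{k=1}^{m_1-1} x'_k \le s < \sum_{k=1}^{m_1} x'_k.$$

Then by $\sum x''_n = -\infty$, there is $n_1$, such that

$$\sum_{k=1}^{n_1-1} x''_k \ge s - \sum_{k=1}^{m_1} x'_k > \sum_{k=1}^{n_1} x''_k.$$

Then by $\sum_{n>m_1} x'_n = +\infty$, there is $m_2$, such that

$$\sum_{k=m_1+1}^{m_2-1} x'_k \le s - \sum_{k=1}^{m_1} x'_k - \sum_{k=1}^{n_1} x''_k < \sum_{k=m_1+1}^{m_2} x'_k.$$

Continuing in this way, we get

$$-x''_{n_p} \ge s - \sum_{k=1}^{m_p} x'_k - \sum_{k=1}^{n_p} x''_k > 0, \qquad -x'_{m_p} \le s - \sum_{k=1}^{m_p} x'_k - \sum_{k=1}^{n_{p-1}} x''_k < 0.$$

Then for $m_{p-1} < j \le m_p$, by $x'_k \ge 0$, we have

$$-x'_{m_p} \le s - \sum_{k=1}^{m_p} x'_k - \sum_{k=1}^{n_{p-1}} x''_k \le s - \sum_{k=1}^{j} x'_k - \sum_{k=1}^{n_{p-1}} x''_k \le s - \sum_{k=1}^{m_{p-1}} x'_k - \sum_{k=1}^{n_{p-1}} x''_k \le -x''_{n_{p-1}},$$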


and for $n_{p-1} < j \le n_p$, by $x''_k < 0$, we have

$$-x''_{n_p} \ge s - \sum_{k=1}^{m_p} x'_k - \sum_{k=1}^{n_p} x''_k \ge s - \sum_{k=1}^{m_p} x'_k - \sum_{k=1}^{j} x''_k \ge s - \sum_{k=1}^{m_p} x'_k - \sum_{k=1}^{n_{p-1}} x''_k \ge -x'_{m_p}.$$

The convergence of $\sum x_n$ implies $\lim_{n\to\infty} x_n = 0$. Then the two estimations above show that the rearranged series

$$x'_1 + \cdots + x'_{m_1} + x''_1 + \cdots + x''_{n_1} + x'_{m_1+1} + \cdots + x'_{m_2} + x''_{n_1+1} + \cdots + x''_{n_2} + \cdots$$

converges to $s$.
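
The construction in the proof is effectively a greedy algorithm: take non-negative terms while the running sum has not passed $s$, then take negative terms until it drops back below. The sketch below applies this idea to the alternating harmonic series $1 - \frac{1}{2} + \frac{1}{3} - \cdots$; the function name `rearrange_to_target` and the choice of series are assumptions made for illustration, and the handling of equality is simplified compared with the proof.

```python
from itertools import count


def rearrange_to_target(target: float, num_terms: int) -> float:
    """Greedily rearrange 1 - 1/2 + 1/3 - ... so the partial sums approach target.

    Positive terms 1, 1/3, 1/5, ... are used while the running sum is at most
    the target; negative terms -1/2, -1/4, ... are used while it is above.
    Each term of the original series is used at most once, in its original
    relative order within its sign class. Returns the partial sum after
    num_terms terms of the rearranged series.
    """
    positives = (1.0 / k for k in count(1, 2))
    negatives = (-1.0 / k for k in count(2, 2))
    total = 0.0
    for _ in range(num_terms):
        if total <= target:
            total += next(positives)
        else:
            total += next(negatives)
    return total
```

Because the terms tend to $0$, the overshoot at each switch shrinks, which is exactly the role of $\lim_{n\to\infty} x_n = 0$ in the final step of the proof.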

Exercise 4.1.31. Rearrange the series $1 - \dfrac{1}{2} + \dfrac{1}{3} - \dfrac{1}{4} + \cdots$ so that $p$ positive terms are followed by $q$ negative terms and the pattern is repeated. Show that the sum of the new series is $\log 2 + \dfrac{1}{2}\log\dfrac{p}{q}$. For any real number, expand the idea to construct a rearrangement to have the given number as the limit.

Exercise 4.1.32. Prove that if the rearrangement satisfies $|k_n - n| < M$ for a constant $M$, then $\sum x_{k_n}$ converges if and only if $\sum x_n$ converges, and the sums are the same.

Exercise 4.1.33. Suppose $\sum x_n$ conditionally converges. Prove that for any $s$ and $t$ satisfying $s < t$, there is a rearrangement with the partial sum $s'_n$ satisfying $\varliminf_{n\to\infty} s'_n = s$, $\varlimsup_{n\to\infty} s'_n = t$. Moreover, prove that any number between $s$ and $t$ is the limit of a convergent subsequence of $\{s'_n\}$.

The product of two series $\sum x_n$ and $\sum y_n$ involves the product $x_m y_n$ for all $m$ and $n$. In general, the product series $\sum x_m y_n$ makes sense only after we arrange all the terms into a linear sequence $\sum (xy)_k = \sum x_{m_k} y_{n_k}$, which is given by a one-to-one correspondence $k \mapsto (m_k, n_k) : \mathbb{N} \to \mathbb{N} \times \mathbb{N}$. For example, the following is the "diagonal arrangement"

$$\begin{aligned}
\sum (xy)_k = {} & x_1y_1 + x_1y_2 + x_2y_1 + \cdots \\
& + x_1y_{n-1} + x_2y_{n-2} + \cdots + x_{n-1}y_1 + \cdots,
\end{aligned} \tag{4.1.4}$$

and the following is the "square arrangement"

$$\begin{aligned}
\sum (xy)_k = {} & x_1y_1 + x_1y_2 + x_2y_2 + x_2y_1 + \cdots \\
& + x_1y_n + x_2y_n + \cdots + x_{n-1}y_n + x_ny_n + x_ny_{n-1} + \cdots + x_ny_1 + \cdots.
\end{aligned} \tag{4.1.5}$$

In view of Proposition 4.1.5, the following result shows that the arrangement does not matter if the series absolutely converge.

Proposition 4.1.7. Suppose $\sum x_n$ and $\sum y_n$ absolutely converge. Then $\sum x_m y_n$ also absolutely converges, and $\sum x_m y_n = \left(\sum x_m\right)\left(\sum y_n\right)$.

Proof. Let $s = \sum x_m$ and $t = \sum y_n$. For any $\epsilon > 0$, there is a natural number $N$, such that $\sum_{m>N} |x_m| < \epsilon$ and $\sum_{n>N} |y_n| < \epsilon$. Let $K = \max\{k : m_k \le N \text{ and } n_k \le N\}$. Then $\sum_{i=1}^{K} (xy)_i$ contains all the terms $x_m y_n$ with $1 \le m, n \le N$. Therefore for $k > K$, the difference $\sum_{i=1}^{k} (xy)_i - \left(\sum_{m=1}^{N} x_m\right)\left(\sum_{n=1}^{N} y_n\right)$ is a sum of some non-repeating terms $x_m y_n$ with either $m > N$ or $n > N$, and we have

$$\begin{aligned}
\left|\sum_{i=1}^{k} (xy)_i - st\right|
&\le \left|\sum_{i=1}^{k} (xy)_i - \left(\sum_{m=1}^{N} x_m\right)\left(\sum_{n=1}^{N} y_n\right)\right| + \left|\sum_{m=1}^{N} x_m - s\right| \left|\sum_{n=1}^{N} y_n\right| + |s| \left|\sum_{n=1}^{N} y_n - t\right| \\
&\le \sum_{m>N \text{ or } n>N} |x_m y_n| + \left(\sum_{m>N} |x_m|\right)\left(\sum_{n=1}^{N} |y_n|\right) + |s| \left(\sum_{n>N} |y_n|\right) \\
&\le 2\epsilon\left(\sum |x_m| + \sum |y_n|\right).
\end{aligned}$$

The absolute convergence of the product series may be obtained by applying what was just proved to $\sum |x_m|$ and $\sum |y_n|$.

[Figure 4.1: diagonal and square arrangements. Two grids of points indexed by $m$ and $n$, with the terms $(xy)_1, (xy)_2, \dots$ numbered along the diagonals in the first grid and along the square borders in the second.]

Example 4.1.10. The geometric series absolutely converges to $\dfrac{1}{1-a}$ for $|a| < 1$. The product of two copies of the geometric series is

$$\sum_{i,j\ge 0} a^i a^j = \sum_{n=0}^{\infty} \sum_{i+j=n} a^n = \sum_{n=0}^{\infty} (n+1)a^n.$$

Thus we conclude

$$1 + 2a + 3a^2 + \cdots + (n+1)a^n + \cdots = \frac{1}{(1-a)^2}.$$
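
The diagonal grouping used in the example can be carried out numerically. The sketch below (with a hypothetical function name) sums the products $a^i a^j$ diagonal by diagonal and can be compared with $\frac{1}{(1-a)^2}$.

```python
def diagonal_product_sum(a: float, num_diagonals: int) -> float:
    """Sum a^i * a^j over all i, j >= 0 with i + j < num_diagonals.

    The terms are grouped by the diagonals i + j = n, which mirrors the
    diagonal arrangement of the product of two geometric series.
    """
    powers = [a ** i for i in range(num_diagonals)]
    total = 0.0
    for n in range(num_diagonals):
        # The diagonal i + j = n has n + 1 terms, each equal to a^n.
        total += sum(powers[i] * powers[n - i] for i in range(n + 1))
    return total
```

For $a = \frac{1}{2}$ the value approaches $4$, in agreement with $\frac{1}{(1-a)^2}$.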

Exercise 4.1.34. Use the Taylor series of $e^x$ to verify $e^x e^y = e^{x+y}$.

Exercise 4.1.35. Suppose $\sum x_n$ and $\sum y_n$ converge (not necessarily absolutely). Does the square arrangement (4.1.5) converge to $\left(\sum x_n\right)\left(\sum y_n\right)$?

Exercise 4.1.36. Suppose $\sum x_n$ absolutely converges and $\sum y_n$ converges. Prove that the diagonal arrangement (4.1.4) converges. Moreover, show that the condition of absolute convergence is necessary by considering the product of $\sum \dfrac{(-1)^n}{\sqrt{n}}$ with itself.


Exercise 4.1.37. Let $p_n$, $n = 1, 2, \dots$, be prime numbers in increasing order. Let $S_n$ be the set of natural numbers whose prime factors are among $p_1, p_2, \dots, p_n$. For example, $20 \notin S_2$ and $20 \in S_3$ because the prime factors of 20 are 2 and 5.

1. Prove that $\prod_{i=1}^{k} \left(1 - \dfrac{1}{p_i^\alpha}\right)^{-1} = \sum_{n\in S_k} \dfrac{1}{n^\alpha}$.

2. When does the infinite product $\prod \left(1 - \dfrac{1}{p_n^\alpha}\right)$ converge?

3. When does the series $\sum \dfrac{1}{p_n^\alpha}$ converge?

Note that the zeta function (see Example 4.1.7) satisfies $\zeta(\alpha) = \prod_{i=1}^{\infty} \left(1 - \dfrac{1}{p_i^\alpha}\right)^{-1}$ when the right side converges. The equality relates the function to number theory.

4.1.4 Additional Exercise

Convergence of Series

Exercise 4.1.38. Suppose $x_n > 0$ and $x_n$ is increasing. Prove that $\sum \frac{1}{x_n}$ converges if and only if $\sum \frac{n}{x_1 + x_2 + \cdots + x_n}$ converges.

Exercise 4.1.39. Suppose $x_n > 0$. Prove that $\sum x_n$ converges if and only if $\sum \frac{x_n}{x_1 + x_2 + \cdots + x_n}$ converges.

Exercise 4.1.40. Suppose $x_n \ge 0$. Prove that $\sum \frac{x_n}{(x_1 + x_2 + \cdots + x_n)^2}$ converges.

Exercise 4.1.41. Suppose $x_n$ is a non-negative and decreasing sequence. Prove that if $\lim_{n\to\infty} x_n = 0$ and $\sum_{i=1}^{n}(x_i - x_n) = \sum_{i=1}^{n} x_i - n x_n \le B$ for a fixed bound $B$, then $\sum x_n$ converges.

Raabe2 and Bertrand3 Tests

Exercise 4.1.16 is the mother of ratio tests. By applying the exercise to the power series, we get the ratio test in Example 4.1.8 and Exercise 4.1.18. By applying the exercise to the series in Examples 4.1.7 and 4.1.8, we get the Raabe and Bertrand tests.

Exercise 4.1.42. Prove that if $\left|\frac{x_{n+1}}{x_n}\right| \le 1 - \frac{a}{n}$ for some $a > 1$ and big $n$, then $\sum x_n$ absolutely converges.

Exercise 4.1.43. Prove that if $\left|\frac{x_{n+1}}{x_n}\right| \ge 1 - \frac{1}{n-a}$ for some constant $a$ and big $n$, then $\sum x_n$ does not absolutely converge.

Exercise 4.1.44. Rephrase the Raabe test in Exercises 4.1.42 and 4.1.43 in terms of the quotient $\left|\frac{x_n}{x_{n+1}}\right|$.

2 Joseph Ludwig Raabe, born 1801 in Brody (now Ukraine), died 1859 in Zurich (Switzerland).

3 Joseph Louis Francois Bertrand, born 1822 and died 1900 in Paris (France).


Exercise 4.1.45. Determine the convergence of the series $\sum \frac{\alpha(\alpha+1)\cdots(\alpha+n)}{\beta(\beta+1)\cdots(\beta+n)}$ and $\sum \frac{n^n}{e^n n!}$.

Exercise 4.1.46. Derive the condition for the absolute convergence of $\sum x_n$ in terms of the comparison between $\left|\frac{x_{n+1}}{x_n}\right|$ and $1 - \frac{a}{n \log n}$, or the comparison between $\left|\frac{x_n}{x_{n+1}}\right|$ and $1 + \frac{a}{n \log n}$.

Kummer4 Test

Exercise 4.1.47. Prove that if there are $c_n > 0$ and $\delta > 0$, such that $c_n - c_{n+1}\left|\frac{x_{n+1}}{x_n}\right| \ge \delta$ for sufficiently big $n$, then $\sum x_n$ absolutely converges.

Exercise 4.1.48. Prove that if $x_n > 0$, and there are $c_n > 0$, such that $c_n - c_{n+1}\frac{x_{n+1}}{x_n} \le 0$ and $\sum \frac{1}{c_n}$ diverges, then $\sum x_n$ diverges.

Exercise 4.1.49. Prove that if $\sum x_n$ absolutely converges, then there are $c_n > 0$, such that $c_n - c_{n+1}\left|\frac{x_{n+1}}{x_n}\right| \ge 1$ for all $n$.

Exercise 4.1.50. Derive ratio, Raabe and Bertrand tests from the Kummer test.

Absolute Convergence of Infinite Product

Suppose $x_n \ne -1$. An infinite product $\prod (1 + x_n)$ absolutely converges if $\prod (1 + |x_n|)$ converges.

Exercise 4.1.51. Prove that $\prod (1 + x_n)$ absolutely converges if and only if the series $\sum x_n$ absolutely converges. Moreover, $\prod (1 + |x_n|) = +\infty$ if and only if $\sum |x_n| = +\infty$.

Exercise 4.1.52. Prove that if the infinite product absolutely converges, then the infinite product converges.

Exercise 4.1.53. Suppose $0 < x_n < 1$. Prove that $\prod (1 + x_n)$ converges if and only if $\prod (1 - x_n)$ converges. Moreover, $\prod (1 + x_n) = +\infty$ if and only if $\prod (1 - x_n) = 0$.

Exercise 4.1.54. Suppose $\prod (1 + x_n)$ converges to a positive number but $\prod (1 + |x_n|)$ diverges. Prove that by rearranging the terms in $\prod (1 + x_n)$, the infinite product can be made to have any positive number as the limit, or to diverge.

Ratio Rule

By relating $x_n$ to the partial product of $\prod \frac{x_{n+1}}{x_n}$, the limit of the sequence $x_n$ can be studied by considering the ratio $\frac{x_{n+1}}{x_n}$. This leads to the extension of the ratio rules in Exercises 1.1.32, 2.2.29, 2.2.32.

Exercise 4.1.55. Suppose $\left|\frac{x_{n+1}}{x_n}\right| \le 1 - y_n$ and $0 < y_n < 1$. Use Exercises 4.1.51 and 4.1.53 to prove that if $\sum y_n = +\infty$, then $\lim_{n\to\infty} x_n = 0$.

4 Ernst Eduard Kummer, born 1810 in Sorau (now Żary, Poland), died 1893 in Berlin (Germany).


Exercise 4.1.56. Suppose $\left|\frac{x_{n+1}}{x_n}\right| \ge 1 + y_n$ and $0 < y_n < 1$. Prove that if $\sum y_n = +\infty$, then $\lim_{n\to\infty} x_n = \infty$.

Exercise 4.1.57. Suppose $1 - y_n \le \frac{x_{n+1}}{x_n} \le 1 + z_n$ and $0 < y_n, z_n < 1$. Prove that if $\sum y_n$ and $\sum z_n$ converge, then $x_n$ converges to a nonzero limit.

Exercise 4.1.58. Study $\lim_{n\to\infty} \frac{(n+a)^{n+\frac{1}{2}}}{(\pm e)^n n!}$, the case not yet settled in Exercise 2.2.30.

An Example by Borel

Suppose $a_n > 0$ and $\sum \sqrt{a_n}$ converges. Suppose $\{r_n\}$ is all the rational numbers in $[0, 1]$. We study the convergence of the series $\sum \frac{a_n}{|x - r_n|}$ for $x \in [0, 1]$.

Exercise 4.1.59. Prove that if $x \notin \cup_n (r_n - c\sqrt{a_n}, r_n + c\sqrt{a_n})$, then the series converges.

Exercise 4.1.60. Use the Heine-Borel theorem to prove that if $\sum \sqrt{a_n} < \frac{1}{2c}$, then $[0, 1] \not\subset \cup_n (r_n - c\sqrt{a_n}, r_n + c\sqrt{a_n})$. By Exercise 4.1.59, this implies that the series converges for some $x \in [0, 1]$.

Approximate Partial Sum by Integral

The proof of Proposition 4.1.3 gives an estimation of the partial sum for $\sum f(n)$ by the integral of $f(x)$ on suitable intervals. The idea leads to an estimation of $n!$ in Exercise 4.1.26. In what follows, we study the approximation in general.

Suppose $f(x)$ is a decreasing function on $[1, +\infty)$ satisfying $\lim_{x\to+\infty} f(x) = 0$. Denote $d_n = f(1) + f(2) + \cdots + f(n-1) - \int_1^n f(x)dx$.

Exercise 4.1.61. Prove that $d_n$ is increasing and $0 \le d_n \le f(1) - f(n)$. This implies that $d_n$ converges to a limit $\gamma$.
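The behavior of $d_n$ claimed in Exercise 4.1.61 can be observed numerically. The sketch below is only an illustration under stated assumptions: it takes the specific choice $f(x) = \frac{1}{x}$, uses the exact integral $\int_1^n \frac{dx}{x} = \log n$, and the names `d`, `f`, `integral` are hypothetical. For this choice the limit is the Euler-Mascheroni constant.

```python
import math

def d(n, f, integral):
    """d_n = f(1) + ... + f(n-1) minus the integral of f over [1, n]."""
    return sum(f(k) for k in range(1, n)) - integral(1, n)

f = lambda x: 1.0 / x                               # decreasing, tends to 0
integral = lambda a, b: math.log(b) - math.log(a)   # exact integral of 1/x

values = [d(n, f, integral) for n in range(1, 2001)]  # values[n-1] = d_n

for n in (1, 10, 100, 2000):
    # d_n should increase with n and stay below f(1) - f(n)
    print(n, values[n - 1], f(1) - f(n))
```

This does not prove the exercise; it only shows the two inequalities holding for one function and finitely many n.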

Exercise 4.1.62. Prove that if $f(x)$ is convex, then $d_n \ge \frac{1}{2}(f(1) - f(n))$.

Exercise 4.1.63. By using Exercise 3.1.53, prove that if $f(x)$ is convex and differentiable, then for $m > n$, we have

$$\left| d_n - d_m + \frac{1}{2}(f(n) - f(m)) \right| \le \frac{1}{8}(f'(m) - f'(n)).$$

Exercise 4.1.64. For convex and differentiable $f(x)$, prove

$$\left| d_n - \gamma + \frac{1}{2} f(n) \right| \le -\frac{1}{8} f'(n).$$

Exercise 4.1.65. By using Exercise 3.2.34, prove that if $f(x)$ has second order derivative, then for $m > n$, we have

$$\frac{1}{24} \sum_{k=n}^{m-1} \inf_{[k,k+1]} f'' \le d_n - d_m + \frac{1}{2}(f(n) - f(m)) - \frac{1}{8}(f'(n) - f'(m)) \le \frac{1}{24} \sum_{k=n}^{m-1} \sup_{[k,k+1]} f''.$$


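Exercise 4.1.66. Let $\gamma$ be the Euler-Mascheroni constant in Exercise 1.4.38. Prove

$$\frac{1}{24(n+1)^2} \le 1 + \frac{1}{2} + \cdots + \frac{1}{n-1} - \log n - \gamma + \frac{1}{2n} + \frac{1}{8n^2} \le \frac{1}{24(n-1)^2}.$$

Exercise 4.1.67. Estimate $1 + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + \cdots + \frac{1}{\sqrt{n-1}} - 2\sqrt{n}$ (see Exercise 1.2.13).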

Dirichlet Test and Abel Test

The Dirichlet and Abel tests in Exercise 3.3.47 for the convergence of improper integrals have analogues for the convergence of series.

Let $s_n$ be the partial sum of $\sum y_n$.

Exercise 4.1.68. Prove that $\sum_{k=1}^{n} x_k y_k = (x_1 - x_2)s_1 + (x_2 - x_3)s_2 + \cdots + (x_{n-1} - x_n)s_{n-1} + x_n s_n$. The equality is the discrete version of the integration by parts.
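The summation by parts identity in Exercise 4.1.68 is easy to test on concrete data. The following sketch is an illustration only: the function names `direct_sum` and `abel_sum` and the sample sequences $x_k = \frac{1}{k}$, $y_k = (-1)^k$ are assumptions, and exact rational arithmetic is used so that the two sides can be compared with `==`.

```python
from fractions import Fraction
from itertools import accumulate

def direct_sum(xs, ys):
    """Left side: x_1 y_1 + ... + x_n y_n."""
    return sum(x * y for x, y in zip(xs, ys))

def abel_sum(xs, ys):
    """Right side: (x_1 - x_2) s_1 + ... + (x_{n-1} - x_n) s_{n-1} + x_n s_n."""
    s = list(accumulate(ys))   # s[k] is the partial sum y_1 + ... + y_{k+1}
    n = len(xs)
    head = sum((xs[k] - xs[k + 1]) * s[k] for k in range(n - 1))
    return head + xs[n - 1] * s[n - 1]

# Sample data (assumption): x_k = 1/k is decreasing, y_k = (-1)^k has bounded partial sums.
xs = [Fraction(1, k) for k in range(1, 9)]
ys = [Fraction((-1) ** k) for k in range(1, 9)]

print(direct_sum(xs, ys), abel_sum(xs, ys))
```

The right side only involves the differences of $x_k$ and the partial sums $s_k$, which is what makes the identity useful when $x_k$ is monotone and $s_k$ is bounded.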

Exercise 4.1.69. Suppose $x_n \ge 0$ and $x_n$ is decreasing. Prove that if $m \le s_k \le M$ for $1 \le k \le n$, then $x_1 m \le \sum_{k=1}^{n} x_k y_k \le x_1 M$.

Exercise 4.1.70 (Dirichlet Test). Prove that if $x_n$ is decreasing, $\lim_{n\to\infty} x_n = 0$, and the partial sums of $\sum y_n$ are bounded, then $\sum x_n y_n$ converges.

Exercise 4.1.71 (Abel Test). Prove that if $x_n$ is decreasing and bounded, and $\sum y_n$ converges, then $\sum x_n y_n$ converges.

Exercise 4.1.72. Derive the Leibniz test and the Abel test from the Dirichlet test.

Exercise 4.1.73. Determine the convergence of the series $\sum \frac{\sin na}{n^{\alpha}}$.

Exercise 4.1.74. Prove that if $\beta > \alpha$, then the convergence of $\sum \frac{a_n}{n^{\alpha}}$ implies the convergence of $\sum \frac{a_n}{n^{\beta}}$.

4.2 Series of Functions

We study series of functions, especially the Taylor and Fourier series. To preserve properties such as limit, differentiation and integration of the functions under the infinite sum, some condition on the uniformity of the convergence is needed. We first introduce such a uniformity condition for sequences and series of functions. Then we will study Taylor and Fourier series in detail.

4.2.1 Uniform Convergence

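A sequence of functions $f_n(x)$ converges to a function $f(x)$ if $\lim_{n\to\infty} f_n(x) = f(x)$ for each $x$. The following are some examples.

$$\lim_{n\to\infty} \frac{1}{n+x} = 0, \quad x \in (-\infty, +\infty)$$

$$\lim_{n\to\infty} x^n = \begin{cases} 0 & \text{if } |x| < 1 \\ 1 & \text{if } x = 1 \end{cases}, \quad x \in (-1, 1]$$

$$\lim_{n\to\infty} \left(1 + \frac{x}{n}\right)^n = e^x, \quad x \in (-\infty, +\infty)$$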


The second example indicates that the continuity of $f_n(x)$ does not necessarily imply the continuity of the limit function. In other words, the equality $\lim_{n\to\infty} \lim_{x\to a} f_n(x) = \lim_{x\to a} f(x) = \lim_{x\to a} \lim_{n\to\infty} f_n(x)$ does not necessarily hold. The next example shows that the limit and the integration may not commute.

Example 4.2.1. Consider the function $f_n(x)$ in Figure 4.2. We have $\lim_{n\to\infty} n f_n(x) = 0$ and $\int_0^1 n f_n(x)dx = \frac{1}{2}$. In particular,

$$\lim_{n\to\infty} \int_0^1 n f_n(x)dx = \frac{1}{2} \ne \int_0^1 \lim_{n\to\infty} n f_n(x)dx = 0.$$

[Figure 4.2: a non-uniform convergent sequence. Graph of $f_n(x)$ on $[0, 1]$, with the labels $0$, $\frac{1}{n}$ and $1$.]

Given $\lim_{n\to\infty} f_n(x) = f(x)$ on $[a, b]$, how can we prove $\lim_{n\to\infty} \int_a^b f_n(x)dx = \int_a^b f(x)dx$? We have

$$\left| \int_a^b f_n(x)dx - \int_a^b f(x)dx \right| \le \int_a^b |f_n(x) - f(x)|dx.$$

If for any $\varepsilon > 0$, there is $N$, such that $|f_n(x) - f(x)| < \varepsilon$ for all $n > N$ and $x \in [a, b]$, then we have $\left| \int_a^b f_n(x)dx - \int_a^b f(x)dx \right| \le \varepsilon(b - a)$ for $n > N$. The proof leads to the following definition.

Definition 4.2.1. A sequence of functions $f_n(x)$ uniformly converges to a function $f(x)$ on $[a, b]$ if for any $\varepsilon > 0$, there is $N$, such that

$$n > N,\ x \in [a, b] \implies |f_n(x) - f(x)| < \varepsilon. \tag{4.2.1}$$

The key point of the definition is that the inequality holds for all $x$ in the defining domain. In other words, $N$ is independent of the choice of $x$. Of course, the defining domain does not have to be a closed interval. The interval $[a, b]$ in the definition can be replaced by any set of numbers.
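The requirement that $N$ be independent of $x$ can be illustrated by measuring the largest gap $|f_n(x) - f(x)|$ over a sample of the domain. The sketch below is an assumption-laden illustration, not a proof: it uses the sequence $x^n$ with limit $0$, samples the domain on a finite grid (so it only gives a lower bound for the true supremum), and the names `sup_gap`, `grid`, `power` are hypothetical.

```python
def sup_gap(fn, f, xs):
    """Largest |fn(x) - f(x)| over the sample points xs (a lower bound for the sup)."""
    return max(abs(fn(x) - f(x)) for x in xs)

def grid(lo, hi, count=10001):
    """Evenly spaced sample points of [lo, hi]."""
    step = (hi - lo) / (count - 1)
    return [lo + i * step for i in range(count)]

def power(n):
    """The function x -> x^n."""
    return lambda x: x ** n

zero = lambda x: 0.0

inner = grid(0.0, 0.9)        # the closed interval [0, 0.9]
near_one = grid(0.0, 0.9999)  # sample of [0, 1) reaching close to 1

for n in (10, 100, 1000):
    print(n, sup_gap(power(n), zero, inner), sup_gap(power(n), zero, near_one))
```

On $[0, 0.9]$ the gap is $0.9^n$, which falls below any $\varepsilon$ for large $n$. On the second grid the gap stays large for the values of $n$ shown; on the full interval $[0, 1)$ the true supremum equals $1$ for every $n$, which a finite grid can only approximate.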

It is not difficult to derive the Cauchy criterion for the uniform convergence (the defining domain is omitted in the statement).


[Figure 4.3: uniform convergence. On $[a, b]$, the graph of $f_n$ lies between the graphs of $f - \varepsilon$ and $f + \varepsilon$.]

Proposition 4.2.2 (Cauchy Criterion). A sequence $f_n(x)$ uniformly converges if and only if for any $\varepsilon > 0$, there is $N$, such that

$$m, n > N \implies |f_m(x) - f_n(x)| < \varepsilon. \tag{4.2.2}$$

Example 4.2.2. For any fixed $R$ and $\varepsilon > 0$, we have

$$x \ge R,\ n > \frac{1}{\varepsilon} - R \implies n + x > \frac{1}{\varepsilon} \implies \left| \frac{1}{n+x} \right| < \varepsilon.$$

Therefore $\frac{1}{n+x}$ uniformly converges to 0 on $[R, +\infty)$ for any $R$.

On the other hand, the sequence is defined on the set $X$ of all real numbers except negative integers. However, the sequence is not uniformly convergent on $X$. Specifically, for $\varepsilon = 1$ and any $N$, pick a natural number $n > N$. Then for $x = -n + \frac{1}{2} \in X$, we have $\left| \frac{1}{n+x} - 0 \right| = 2 > \varepsilon$.

Example 4.2.3. Let $0 < R < 1$. For any $\varepsilon > 0$, we have

$$|x| \le R,\ n > \log_R \varepsilon = \frac{\log \varepsilon}{\log R} \implies |x^n| \le R^n < R^{\log_R \varepsilon} = \varepsilon.$$

Therefore $x^n$ uniformly converges to 0 on $[-R, R]$ for any $0 < R < 1$.

On the other hand, for any $0 < \varepsilon < 1$ and any $N$, pick a natural number $n > N$. Then we have $0 < x = \sqrt[n]{\varepsilon} < 1$ and $x^n = \varepsilon$. This shows the convergence is not uniform on $(0, 1)$.

Example 4.2.4. We have $\lim_{n\to\infty} n \log\left(1 + \frac{x}{n}\right) = x$ for any $x$. Moreover, for any fixed $R$ and $\varepsilon > 0$, by Taylor expansion we have

$$|x| \le R,\ n > \frac{R^2}{2\varepsilon} \implies \left| n \log\left(1 + \frac{x}{n}\right) - x \right| = \left| n\left(\frac{x}{n} - \frac{c^2}{2n^2}\right) - x \right| \le \frac{R^2}{2n} < \varepsilon,$$

where $|c| < |x|$. Therefore the convergence is uniform on $[-R, R]$ for any $R > 0$.

Taking the exponential, we get $\lim_{n\to\infty} \left(1 + \frac{x}{n}\right)^n = e^x$. To see that the convergence is also uniform on $[-R, R]$, we use the uniform continuity of $e^x$ on $[-R-1, R+1]$. For any $\varepsilon > 0$, there is $0 < \delta < 1$, such that

$$|x| \le R + 1,\ |y| \le R + 1,\ |x - y| < \delta \implies |e^x - e^y| < \varepsilon.$$


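Then by the estimation above, we get

$$|x| \le R,\ n > \frac{R^2}{2\delta} \implies \left| n \log\left(1 + \frac{x}{n}\right) - x \right| \le \delta < 1 \implies \left| \left(1 + \frac{x}{n}\right)^n - e^x \right| = \left| e^{n \log(1 + \frac{x}{n})} - e^x \right| < \varepsilon.$$

On the other hand, for any fixed $n$, we have $\lim_{x\to+\infty} \left( \left(1 + \frac{x}{n}\right)^n - e^x \right) = \infty$. This implies that for $\varepsilon = 1$ and any natural number $n$, there is $x$, such that $\left| \left(1 + \frac{x}{n}\right)^n - e^x \right| > \varepsilon$. This shows that the convergence is not uniform on $(-\infty, +\infty)$.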
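The two conclusions of Example 4.2.4 can be observed numerically. The sketch below is an illustration under assumptions: the names `gap` and `sup_gap_on`, the choice $R = 2$, and the grid size are hypothetical, and the grid maximum only approximates the supremum over $[-R, R]$. It requires $n > R$ so that $1 + \frac{x}{n} > 0$ on the interval.

```python
import math

def gap(n, x):
    """|(1 + x/n)^n - e^x| at a single point."""
    return abs((1.0 + x / n) ** n - math.exp(x))

def sup_gap_on(n, R, count=2001):
    """Largest gap over an evenly spaced sample of [-R, R]."""
    xs = [-R + 2.0 * R * i / (count - 1) for i in range(count)]
    return max(gap(n, x) for x in xs)

R = 2.0  # illustrative choice
for n in (10, 100, 10000):
    print(n, sup_gap_on(n, R))   # shrinks as n grows: uniform on [-R, R]

# For a fixed n the gap is unbounded in x, so no single N works on the whole line.
print(gap(10, 50.0))
```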

Exercise 4.2.1. Determine the intervals on which the sequences uniformly converge.

1. $x^{\frac{1}{n}}$.

2. $n(x^{\frac{1}{n}} - 1)$.

3. $\frac{1}{nx + 1}$.

4. $\frac{\sin nx}{n}$.

5. $\sin \frac{x}{n}$.

6. $\sqrt[n]{1 + x^n}$.

7. $\left(x + \frac{1}{n}\right)^{\alpha}$.

8. $\left(1 + \frac{x}{n}\right)^{\alpha}$.

9. $\log\left(1 + \frac{x}{n}\right)$.

Exercise 4.2.2. Suppose $\lim_{n\to\infty} f_n(x) = f(x)$ uniformly on $[a, b]$ and uniformly on $[b, c]$. Prove that $\lim_{n\to\infty} f_n(x) = f(x)$ uniformly on $[a, c]$. What about other combinations of intervals?

Exercise 4.2.3. Prove that $\lim_{n\to\infty} f_n(x) = f(x)$ uniformly if and only if $\lim_{n\to\infty} \sup |f_n(x) - f(x)| = 0$, where the supremum is taken over all $x$ in the defining domain.

Exercise 4.2.4. Suppose $\lim_{n\to\infty} f_n(x) = f(x)$ uniformly on $X$. Suppose $g(y)$ is a function on $Y$ such that $g(y) \in X$ for any $y \in Y$. Prove that $\lim_{n\to\infty} f_n(g(y)) = f(g(y))$ uniformly on $Y$. Is it also true that $\lim_{n\to\infty} f_n(x) = f(x)$ not uniformly implies $\lim_{n\to\infty} f_n(g(y)) = f(g(y))$ not uniformly?

Exercise 4.2.5. Suppose $\lim_{n\to\infty} f_n(x) = f(x)$ uniformly on $X$. Suppose $f_n(x) \in Y$ for all $n$ and all $x \in X$. Prove that if $g(y)$ is uniformly continuous on $Y$, then $\lim_{n\to\infty} g(f_n(x)) = g(f(x))$ uniformly on $X$.

Exercise 4.2.6. Are the sum, product, composition, maximum, etc., of two uniformly convergent sequences of functions still uniformly convergent?

Exercise 4.2.7. Suppose $f(x)$ is integrable on $[a, b+1]$. Prove that the sequence $f_n(x) = \frac{1}{n} \sum_{i=0}^{n-1} f\left(x + \frac{i}{n}\right)$ converges uniformly to $\int_x^{x+1} f(t)dt$ on $[a, b]$.

A series $\sum u_n(x)$ of functions uniformly converges if the sequence of partial sum functions uniformly converges. Applying the Cauchy criterion, we find $\sum u_n(x)$ uniformly converges if and only if for any $\varepsilon > 0$, there is $N$, such that

$$n \ge m > N \implies |u_m(x) + u_{m+1}(x) + \cdots + u_n(x)| < \varepsilon. \tag{4.2.3}$$

Based on this, it is possible to extend various results on the convergence of series of numbers to the uniform convergence of series of functions. For example, the uniform convergence of a series $\sum u_n(x)$ would imply that the sequence $u_n(x)$ uniformly converges to 0. The comparison test can also be extended.


Proposition 4.2.3 (Comparison Test). Suppose $|u_n(x)| \le v_n(x)$. If $\sum v_n(x)$ uniformly converges, then $\sum u_n(x)$ uniformly converges.

Example 4.2.5. We have $\left| \frac{(-1)^n}{n^2 + x^2} \right| \le \frac{1}{n^2}$. By considering $\frac{1}{n^2}$ as constant functions, the series $\sum \frac{1}{n^2}$ uniformly converges. Therefore by the comparison test, the series $\sum \frac{(-1)^n}{n^2 + x^2}$ also uniformly converges.

Example 4.2.6. The partial sum of the geometric series $1 + x + x^2 + \cdots + x^n + \cdots$ is $\frac{1 - x^{n+1}}{1 - x}$, and the sum is $\frac{1}{1 - x}$ for $|x| < 1$. By

$$\left| \frac{1 - x^{n+1}}{1 - x} - \frac{1}{1 - x} \right| = \frac{|x^{n+1}|}{1 - x},$$

and an argument similar to the one in Example 4.2.3, we find the series uniformly converges on $[-R, R]$ for any $0 < R < 1$, but does not uniformly converge on $(-1, 1)$.

Example 4.2.7. The series $\sum \frac{(-1)^n}{n} x^n$ converges to a function $f(x)$ on $(-1, 1]$. We expect $f(x) = -\log(1 + x)$ but the fact is yet to be established. Where does it converge uniformly?

For any $0 < R < 1$, we have $\left| \frac{(-1)^n}{n} x^n \right| \le R^n$ on $[-R, R]$. By considering each $R^n$ as a constant function, the series $\sum R^n$ converges uniformly. Therefore by the comparison test, the series $\sum \frac{(-1)^n}{n} x^n$ converges uniformly on $[-R, R]$.

The convergence is not uniform on $(-1, 0]$. For any $0 < x < 1$, we have

$$\left| \sum_{i=n+1}^{2n} \frac{(-1)^i}{i} (-x)^i \right| = \frac{x^{n+1}}{n+1} + \frac{x^{n+2}}{n+2} + \cdots + \frac{x^{2n}}{2n} \ge \frac{x^{2n}}{2}.$$

For $\varepsilon = \frac{1}{3}$ and any $n$, we can find $x$ close to 1, such that $\frac{x^{2n}}{2} > \varepsilon$. Therefore the Cauchy criterion for the uniform convergence fails.

The convergence is uniform on $[0, 1]$. For each $x \in [0, 1]$, the series is an alternating series and $\frac{1}{n} x^n$ is decreasing. By the estimation in Proposition 4.1.4, we get

$$\left| \sum_{i=1}^{n} \frac{(-1)^i}{i} x^i - f(x) \right| \le \frac{1}{n+1} x^{n+1} \le \frac{1}{n+1}$$

for all $x \in [0, 1]$. From this it is easy to see the series uniformly converges on $[0, 1]$. The uniform convergence on $[0, 1]$ will be extended in a later Proposition 4.2.10.

Example 4.2.8. By Example 2.3.16, the Taylor series $1 + \frac{1}{1!}x + \frac{1}{2!}x^2 + \cdots + \frac{1}{n!}x^n + \cdots$ of $e^x$ converges to $e^x$. By the remainder formula (2.3.6) in Proposition 2.3.3, the difference between the sum and the partial sum is

$$\left| \sum_{k=0}^{n} \frac{1}{k!} x^k - e^x \right| \le e^{|x|} \frac{|x|^{n+1}}{(n+1)!}.$$


Then it is easy to see that for any $R$, the Taylor series uniformly converges for $|x| \le R$.

Alternatively, for $|x| \le R$, we have $\left| \frac{x^n}{n!} \right| \le \frac{R^n}{n!}$. Then the series $\sum \frac{R^n}{n!}$, in which each term is considered as a constant function, uniformly converges. Therefore by the comparison test, the Taylor series uniformly converges.

On the other hand, the Taylor series of $e^x$ does not uniformly converge on $(-\infty, +\infty)$ because the sequence $\frac{x^n}{n!}$ does not uniformly converge to 0.
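The remainder estimate used in Example 4.2.8 can be compared with the actual error on an interval. The sketch below is an illustration only: the names `taylor_exp`, `remainder_bound`, `sup_error`, the choice $R = 3$ and the grid size are assumptions, and the grid maximum approximates the supremum over $[-R, R]$.

```python
import math

def taylor_exp(x, n):
    """Partial sum 1 + x/1! + ... + x^n/n! of the Taylor series of e^x."""
    term, total = 1.0, 1.0
    for k in range(1, n + 1):
        term *= x / k          # term is now x^k / k!
        total += term
    return total

def remainder_bound(R, n):
    """e^R * R^(n+1) / (n+1)!, the bound from the remainder formula on |x| <= R."""
    return math.exp(R) * R ** (n + 1) / math.factorial(n + 1)

def sup_error(R, n, count=2001):
    """Largest |partial sum - e^x| over an evenly spaced sample of [-R, R]."""
    xs = [-R + 2.0 * R * i / (count - 1) for i in range(count)]
    return max(abs(taylor_exp(x, n) - math.exp(x)) for x in xs)

R = 3.0  # illustrative choice
for n in (5, 10, 20):
    print(n, sup_error(R, n), remainder_bound(R, n))
```

For each fixed $R$ the bound depends only on $n$ and tends to 0, which is exactly the uniformity on $|x| \le R$. The bound grows with $R$, consistent with the failure of uniform convergence on the whole line.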

Exercise 4.2.8. Determine the intervals on which the series uniformly converge.

1. $\sum x^n e^{-nx}$.

2. $\sum \frac{x^n}{\sqrt{n}}$.

3. $\sum \left( \frac{x(x+n)}{n} \right)^n$.

4. $\sum \frac{1}{n^{\alpha} + x^{\alpha}}$.

5. $\sum \frac{1}{x + a^n}$.

6. $\sum \frac{x^n}{1 - x^n}$.

Exercise 4.2.9. Show that $\sum (-1)^n x^n (1 - x)$ uniformly converges on $[0, 1]$ and absolutely converges for each $x \in [0, 1]$. However, the series $\sum |(-1)^n x^n (1 - x)|$ does not uniformly converge.

Exercise 4.2.10. Can you establish the uniform convergence version of Exercise 4.1.16?

Exercise 4.2.11 (Leibniz Test). Prove that if $u_n(x)$ is a monotone sequence and uniformly converges to 0, then $\sum (-1)^n u_n(x)$ uniformly converges.

4.2.2 Properties of Uniform Convergence

Proposition 4.2.4. Suppose $f_n(x)$ uniformly converges to $f(x)$ for $x \ne a$. If $\lim_{x\to a} f_n(x) = l_n$ exists, then both $\lim_{n\to\infty} l_n$ and $\lim_{x\to a} f(x)$ converge and are equal.

The conclusion is

$$\lim_{x\to a} \lim_{n\to\infty} f_n(x) = \lim_{n\to\infty} \lim_{x\to a} f_n(x). \tag{4.2.4}$$

Therefore uniform convergence implies that the two limits $\lim_{x\to a}$ and $\lim_{n\to\infty}$ commute.

Proof. For any $\varepsilon > 0$, there is $N$, such that

$$m, n > N,\ x \ne a \implies |f_m(x) - f_n(x)| < \varepsilon. \tag{4.2.5}$$

Taking $\lim_{x\to a}$ of the right side of (4.2.5), we get $|l_m - l_n| \le \varepsilon$. Therefore $\lim_{n\to\infty} l_n$ converges by the Cauchy criterion. Let $l = \lim_{n\to\infty} l_n$. Then

$$n > N \implies |l - l_n| \le \varepsilon.$$

On the other hand, taking $\lim_{m\to\infty}$ of the right side of (4.2.5), we get

$$n > N,\ x \ne a \implies |f(x) - f_n(x)| \le \varepsilon.$$


Fix one n > N . Since limx→a fn(x) = ln, there is δ > 0, such that

0 < |x− a| < δ =⇒ |fn(x)− ln| < ε.

Then for the fixed choice of n, we have

0 < |x− a| < δ =⇒ |f(x)− l| ≤ |f(x)− fn(x)|+ |fn(x)− ln|+ |l− ln| < 3ε.

This proves that limx→a f(x) = l.

An immediate consequence of Proposition 4.2.4 is that the uniform con-vergence preserves the continuity.

Proposition 4.2.5. Suppose fn(x) is continuous and the sequence fn(x)converges to f(x).

1. If the convergence is uniform, then f(x) is also continuous.

2. (Dini Theorem) If fn(x) is monotone in n and f(x) is continuous, thenthe convergence is uniform on any bounded and closed interval.

Proof. By taking ln = limx→a fn(x) = fn(a), the first part is a consequenceof Proposition 4.2.4.

Now suppose fn(x) is monotone in n and f(x) is continuous. If the limitlim fn(x) = f(x) is not uniform on [a, b], then there is ε > 0, a subsequencefnk(x), and a sequence xk ∈ [a, b], such that |fnk(xk) − f(xk)| ≥ ε. In thebounded and closed interval [a, b], xk has a convergent subsequence. Thuswithout loss of generality, we may assume xk converges to c ∈ [a, b]. By themonotone assumption, we have

n ≤ nk =⇒ |fn(xk)− f(xk)| ≥ |fnk(xk)− f(xk)| ≥ ε.

Thus for each fixed n, by taking k → ∞ and the continuity of f(x) andfn(x), we get |fn(c) − f(c)| ≥ ε. This contradicts with the assumption thatfn(c) converges to f(c). Thus we proved that fn(x) must converge to f(x)uniformly on [a, b].

Note that in the proof of the second part, we only need fn(x) to bemonotone for each x. It is not required that the sequence should be increasingfor all x or decreasing for all x.

Example 4.2.9. Since xn are continuous and the limit

limn→∞

xn =

{0 if 0 ≤ x < 11 if x = 1

is not continuous on [0, 1], the convergence is not uniform on [0, 1]. On the otherhand, for any 0 < R < 1, the sequence xn is decreasing in n on [0, R] and the limitfunction 0 is continuous on the interval. By Dini Theorem, the limit limn→∞ x

n = 0is uniform on [0, R].


Example 4.2.10. By Example 4.2.4, we know the function $e^x$ is the uniform limit of the polynomials $\left(1 + \frac{x}{n}\right)^n$ on $[-R, R]$ for any $R > 0$. Since polynomials are continuous, we conclude that $e^x$ is continuous on any $[-R, R]$. In other words, $e^x$ is continuous on $(-\infty, +\infty)$.

Instead of deriving the continuity of $e^x$ from the uniform convergence, we may also use the continuity of $e^x$ and the fact that the sequence $\left(1 + \frac{x}{n}\right)^n$ is monotone to derive that the convergence is uniform on $[-R, R]$ for any $R > 0$.

Example 4.2.11. The function $f_n(x)$ given in Figure 4.2 is continuous, and the limit $\lim_{n\to\infty} f_n(x) = 0$ is also continuous. However, the convergence is not uniform. Note that the sequence $f_n(x)$ is not monotone, so that Dini Theorem cannot be applied.

Exercise 4.2.12. Suppose $f_n(x)$ uniformly converges to $f(x)$. Prove that if $f_n(x)$ are uniformly continuous, then $f(x)$ is uniformly continuous.

The discussion leading to the definition of uniform convergence makes us expect the following.

Proposition 4.2.6. Suppose $f_n(x)$ is integrable on a bounded interval $[a, b]$ and the sequence $f_n(x)$ uniformly converges. Then $\lim_{n\to\infty} f_n(x)$ is integrable, with

$$\int_a^b \lim_{n\to\infty} f_n(x)dx = \lim_{n\to\infty} \int_a^b f_n(x)dx. \tag{4.2.6}$$

Proof. For any $\varepsilon > 0$, there is $N$, such that

$$m, n > N,\ x \in [a, b] \implies |f_m(x) - f_n(x)| < \varepsilon.$$

This further implies

$$m, n > N \implies \left| \int_a^b f_m(x)dx - \int_a^b f_n(x)dx \right| \le \varepsilon(b - a).$$

By Cauchy criterion, the limit $\lim_{n\to\infty} \int_a^b f_n(x)dx$ converges. Denote the limit by $I$.

Let $f(x) = \lim_{n\to\infty} f_n(x)$. For any $\varepsilon > 0$, there is $N$, such that

$$n > N,\ x \in [a, b] \implies |f_n(x) - f(x)| < \varepsilon.$$

This implies that for any partition $P$ of $[a, b]$ and the same choice of $x_i^*$ for all functions, we have

$$n > N \implies |S(P, f_n(x)) - S(P, f(x))| < \varepsilon(b - a).$$

Now we fix one $n > N$ satisfying

$$\left| \int_a^b f_n(x)dx - I \right| < \varepsilon.$$

For the fixed integrable function $f_n(x)$, there is $\delta > 0$, such that

$$\|P\| < \delta \implies \left| S(P, f_n(x)) - \int_a^b f_n(x)dx \right| < \varepsilon.$$


Combining everything together, we find $\|P\| < \delta$ implies

$$\begin{aligned} |S(P, f(x)) - I| &\le |S(P, f_n(x)) - S(P, f(x))| + \left| S(P, f_n(x)) - \int_a^b f_n(x)dx \right| + \left| \int_a^b f_n(x)dx - I \right| \\ &< \varepsilon(b - a + 2). \end{aligned}$$

This shows that $f(x)$ is integrable, with $\int_a^b f(x)dx = I$.

Example 4.2.12. Let $r_n$, $n = 1, 2, \dots$, be all the rational numbers in $[0, 1]$. Then the functions

$$f_n(x) = \begin{cases} 1 & \text{if } x = r_1, r_2, \dots, r_n \\ 0 & \text{otherwise} \end{cases}$$

are integrable. However, the limit $\lim_{n\to\infty} f_n(x) = D(x)$ is the Dirichlet function, which is not Riemann integrable. Of course the limit is not uniform.

The example shows that the limit of Riemann integrable functions is not necessarily Riemann integrable. Thus we often need to attach the uniform convergence condition if we want to consider the limit of Riemann integrals. The annoyance will be resolved by the introduction of the Lebesgue integral. The Lebesgue integral is an extension of the Riemann integral that allows more functions (such as the Dirichlet function) to be integrable, and has the property that the limit of Lebesgue integrable functions is (almost always) Lebesgue integrable.

Example 4.2.13. By Example 4.2.3, the limit

$$\lim_{n\to\infty} x^n = f(x) = \begin{cases} 0 & \text{if } |x| < 1 \\ 1 & \text{if } x = 1 \end{cases}$$

is not uniform on $[0, 1]$. However, we still have

$$\lim_{n\to\infty} \int_0^1 x^n dx = \lim_{n\to\infty} \frac{1}{n+1} = 0 = \int_0^1 f(x)dx.$$

In general, we have the dominated convergence theorem in the theory of Lebesgue integral: If $f_n(x)$ are uniformly bounded and Lebesgue integrable, then $f(x) = \lim_{n\to\infty} f_n(x)$ is Lebesgue integrable, and the equality (4.2.6) holds.

Exercise 4.2.13. Suppose $f_n(x)$ is integrable on a bounded interval $[a, b]$ and $\lim_{n\to\infty} f_n(x) = f(x)$ uniformly. Prove that the convergence $\lim_{n\to\infty} \int_a^x f_n(t)dt = \int_a^x \lim_{n\to\infty} f_n(t)dt$ is uniform for $x \in [a, b]$.

Exercise 4.2.14. Extend Proposition 4.2.6 to the Riemann-Stieltjes integral.

Exercise 4.2.15. Suppose $f$ is Riemann-Stieltjes integrable with respect to each $\alpha_n$. Will the uniform convergence of $\alpha_n$ tell you something about the limit of $\int_a^b f d\alpha_n$?


Proposition 4.2.7. Suppose $f_n(x)$ are differentiable on an interval, such that $f_n(x_0)$ converges at some point $x_0$ and $f_n'(x)$ uniformly converges to $g(x)$. Then $f_n(x)$ uniformly converges and

$$\left( \lim_{n\to\infty} f_n(x) \right)' = \lim_{n\to\infty} f_n'(x). \tag{4.2.7}$$

Proof. For any $\varepsilon > 0$, there is $N$, such that

$$m, n > N \implies |f_m'(x) - f_n'(x)| < \varepsilon,\ |f_m(x_0) - f_n(x_0)| < \varepsilon.$$

Applying the mean value theorem to $f_m(x) - f_n(x)$, we get

$$|(f_m(y) - f_n(y)) - (f_m(x) - f_n(x))| = |f_m'(c) - f_n'(c)||y - x| \le \varepsilon|x - y| \tag{4.2.8}$$

for $m, n > N$ and any $x, y$ in the interval. In particular, we get

$$|f_m(x) - f_n(x)| \le \varepsilon|x_0 - x| + |f_m(x_0) - f_n(x_0)| \le \varepsilon(|x_0 - x| + 1)$$

for $m, n > N$ and any $x$. This implies that the sequence $f_n(x)$ uniformly converges on any bounded interval.

Let $f(x) = \lim_{n\to\infty} f_n(x)$. By taking $\lim_{m\to\infty}$ in (4.2.8), we get

$$|(f(y) - f(x)) - (f_n(y) - f_n(x))| = |(f(y) - f_n(y)) - (f(x) - f_n(x))| \le \varepsilon|y - x|.$$

This means that for fixed $x$, the function $g_n(y) = \frac{f_n(y) - f_n(x)}{y - x}$ uniformly converges to $g(y) = \frac{f(y) - f(x)}{y - x}$ for all $y$ not equal to $x$. By Proposition 4.2.4, we conclude that $\lim_{y\to x} g(y)$ converges and

$$f'(x) = \lim_{y\to x} g(y) = \lim_{n\to\infty} \lim_{y\to x} g_n(y) = \lim_{n\to\infty} f_n'(x).$$

Exercise 4.2.16. Derive the derivative of $e^x$ from $e^x = \lim_{n\to\infty} \left(1 + \frac{x}{n}\right)^n$ or $e^x = \lim_{n\to\infty} \left(1 + \frac{1}{1!}x + \frac{1}{2!}x^2 + \cdots + \frac{1}{n!}x^n\right)$ and the uniform convergence.

The properties of uniform convergence can be rephrased for series.

1. Suppose $\sum u_n(x)$ uniformly converges. Then taking the limit commutes with the sum: $\sum \lim_{x\to a} u_n(x) = \lim_{x\to a} \sum u_n(x)$.

2. Suppose $u_n(x)$ is continuous and $\sum u_n(x)$ uniformly converges. Then $\sum u_n(x)$ is also continuous.

3. (Dini Theorem) Suppose $u_n(x)$ is continuous, $u_n(x) \ge 0$ and $\sum u_n(x)$ converges to a continuous function. Then $\sum u_n(x)$ uniformly converges.


4. Suppose $u_n(x)$ is integrable and $\sum u_n(x)$ uniformly converges. Then $\sum u_n(x)$ is integrable and the integration commutes with the sum: $\int_a^b \sum u_n(x)dx = \sum \int_a^b u_n(x)dx$.

5. Suppose $\sum u_n(x_0)$ converges and $\sum u_n'(x)$ uniformly converges. Then $\sum u_n(x)$ uniformly converges and is differentiable, and the derivative commutes with the sum: $\left( \sum u_n(x) \right)' = \sum u_n'(x)$.

Example 4.2.14. By Example 4.1.7, the Riemann zeta function

$$\zeta(x) = 1 + \frac{1}{2^x} + \frac{1}{3^x} + \cdots + \frac{1}{n^x} + \cdots$$

is defined on $(1, +\infty)$. For any $R > 1$, we have

$$x \ge R \implies 0 < \frac{1}{n^x} \le \frac{1}{n^R}.$$

Since the series $\sum \frac{1}{n^R}$ of numbers (considered as constant functions) converges, the series $\sum \frac{1}{n^x}$ uniformly converges on $[R, +\infty)$. Therefore $\zeta(x)$ is continuous on $[R, +\infty)$ for any $R > 1$. Since $R > 1$ is arbitrary, we conclude that $\zeta(x)$ is continuous on $(1, +\infty)$.

Consider the series $-\sum \frac{\log n}{n^x}$ obtained by differentiating $\zeta(x)$ term by term. For any $R > 1$, choose $R'$ satisfying $R > R' > 1$. Then $0 < \frac{\log n}{n^x} \le \frac{\log n}{n^R} < \frac{1}{n^{R'}}$ for $x \ge R$ and sufficiently big $n$. The convergence of the series $\sum \frac{1}{n^{R'}}$ of numbers implies that the series $-\sum \frac{\log n}{n^x}$ uniformly converges on $[R, +\infty)$ for any $R > 1$. This implies that $\zeta(x)$ is differentiable and $\zeta'(x) = -\sum \frac{\log n}{n^x}$. Further argument shows that $\zeta(x)$ has derivatives of any order.

Example 4.2.15. Let $h(x)$ be given by $h(x) = |x|$ on $[-1, 1]$ and $h(x + 2) = h(x)$ for any $x$. The function is continuous and satisfies $0 \le h(x) \le 1$. Therefore

$$f(x) = \sum_{n=0}^{\infty} \left(\frac{3}{4}\right)^n h(4^n x)$$

uniformly converges and is a continuous function. However, we will show that the function is not differentiable anywhere.

Let $\delta_k = \pm\frac{1}{2 \cdot 4^k}$. Then for any $n$, by $|h(x) - h(y)| \le |x - y|$, we have

$$\left| \frac{h(4^n(a + \delta_k)) - h(4^n a)}{\delta_k} \right| \le \frac{4^n |\delta_k|}{|\delta_k|} = 4^n.$$

For $n > k$, $4^n \delta_k$ is a multiple of 2, and we have

$$\frac{h(4^n(a + \delta_k)) - h(4^n a)}{\delta_k} = 0.$$


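For $n = k$, we have $4^k\delta_k = \pm\frac{1}{2}$. By choosing the $\pm$ sign so that there is no integer between $4^k a$ and $4^k a \pm \frac{1}{2}$, we can make sure that $|h(4^k(a+\delta_k)) - h(4^k a)| = |4^k(a+\delta_k) - 4^k a| = \frac{1}{2}$. Then

$$\left|\frac{h(4^k(a+\delta_k)) - h(4^k a)}{\delta_k}\right| = 4^k.$$

Thus for any fixed $a$, by choosing a sequence $\delta_k$ with suitable $\pm$ sign, we get

$$\left|\frac{f(a+\delta_k) - f(a)}{\delta_k}\right| \ge \left(\frac{3}{4}\right)^k 4^k - \sum_{n=0}^{k-1}\left(\frac{3}{4}\right)^n 4^n = 3^k - \frac{3^k - 1}{3 - 1} = \frac{3^k + 1}{2}.$$

This implies that $\lim_{\delta\to 0}\frac{f(a+\delta) - f(a)}{\delta}$ diverges.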
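The estimate in Example 4.2.15 can be checked with exact rational arithmetic. The following is only a sketch: the helper names `h` and `difference_quotient` are introduced here for illustration, and the point `a` is assumed to be a rational number stored as a `Fraction`. Because the terms with $n > k$ cancel exactly, the finite sum over $n \le k$ gives the exact difference quotient.

```python
from fractions import Fraction
from math import floor


def h(x):
    """Period-2 function with h(x) = |x| on [-1, 1]."""
    return abs((x + 1) % 2 - 1)


def difference_quotient(a, k):
    """Exact (f(a + d) - f(a)) / d for d = +-1/(2 * 4^k).

    The sign is chosen as in the example: no integer lies strictly
    between 4^k * a and 4^k * a +- 1/2.
    """
    y = 4**k * a
    sign = 1 if y - floor(y) < Fraction(1, 2) else -1
    d = Fraction(sign, 2 * 4**k)
    # Terms with n > k contribute 0, so summing up to n = k is exact.
    diff = sum(
        Fraction(3, 4) ** n * (h(4**n * (a + d)) - h(4**n * a))
        for n in range(k + 1)
    )
    return diff / d


if __name__ == "__main__":
    a = Fraction(1, 3)
    for k in range(6):
        q = difference_quotient(a, k)
        print(k, float(abs(q)), (3**k + 1) / 2)
```

The printed absolute values should stay at or above $(3^k+1)/2$, which is the lower bound derived in the example.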

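Exercise 4.2.17. Justify the equalities.

1. $\int_0^1 x^{ax}\,dx = \sum_{n=1}^{\infty} \frac{(-a)^{n-1}}{n^n}$.

2. $\int_{\lambda a}^{a} \left(\sum_{n=0}^{\infty} \lambda^n \tan \lambda^n x\right) dx = -\log|\cos a|$ for $|a| < \frac{\pi}{2}$, $|\lambda| < 1$.

3. $\int_x^{+\infty} (\zeta(t) - 1)\,dt = \sum_{n=2}^{\infty} \frac{1}{n^x \log n}$ for $x > 1$.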

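Exercise 4.2.18. Find the places where the series converge and have derivatives. Also find the highest order of the derivative.

1. $\sum \frac{1}{n(\log n)^x}$.

2. $\sum \left(x + \frac{1}{n}\right)^n$.

3. $\sum_{n=-\infty}^{+\infty} \frac{1}{|n - x|^{\alpha}}$.

4. $\sum \frac{(-1)^n}{n^x}$.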

4.2.3 Power Series

A power series is a series of the form

$$\sum_{n=0}^{\infty} a_n x^n = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n + \cdots,$$

or more generally, of the form $\sum_{n=0}^{\infty} a_n (x - x_0)^n$. Taylor series are power series.

Theorem 4.2.8. Suppose

$$R = \frac{1}{\limsup_{n\to\infty} \sqrt[n]{|a_n|}}. \tag{4.2.9}$$

Then the power series absolutely converges for $|x| < R$ and diverges for $|x| > R$. Moreover, for any $0 < r < R$, the power series uniformly converges for $|x| \le r$.


Proof. For any $0 < r < R$, choose $r'$ satisfying $r < r' < R$. Then we have $\frac{1}{r'} > \limsup_{n\to\infty} \sqrt[n]{|a_n|}$. By the second part of Proposition 1.2.10, this means that $\frac{1}{r'} > \sqrt[n]{|a_n|}$ for all but finitely many $n$. Thus for $|x| \le r$, we have

$$|a_n x^n| = \left(\sqrt[n]{|a_n|}\,|x|\right)^n \le \left(\frac{|x|}{r'}\right)^n \le \left(\frac{r}{r'}\right)^n$$

for all but finitely many $n$. Since $0 < r < r'$, the series $\sum \left(\frac{r}{r'}\right)^n$ of numbers converges. Then by the comparison test, the series $\sum a_n x^n$ of functions uniformly converges for $|x| \le r$.

Since the series converges for $|x| \le r$, where $r$ can be any number satisfying $0 < r < R$, we conclude that the series converges for $|x| < R$. On the other hand, if $|x| > R$, then $\frac{1}{|x|} < \limsup_{n\to\infty} \sqrt[n]{|a_n|}$. By the first part of Proposition 1.2.10, there are infinitely many $n$ satisfying $\frac{1}{|x|} < \sqrt[n]{|a_n|}$, or $|a_n x^n| > 1$. Thus the terms of the series $\sum a_n x^n$ do not converge to 0, and the series diverges.

The number $R$ in Theorem 4.2.8 is called the radius of convergence for the power series. By the theorem, the radius is characterized by the property that the power series converges for $|x| < R$ and diverges for $|x| > R$.
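The root formula (4.2.9) can be explored numerically by evaluating $\sqrt[n]{|a_n|}$ at a large index. The sketch below is an illustration only: the function names are made up here, the coefficient sequences $1/n$, $2^n + 3^n$ and $n!$ are chosen as samples, and a single large $n$ merely suggests the value of the upper limit rather than proving it. Logarithms are used to avoid overflow.

```python
import math


def root_estimate(log_abs_coeff, n):
    """Return |a_n|^(1/n), computed from log|a_n|."""
    return math.exp(log_abs_coeff(n) / n)


def log_reciprocal(n):
    """log|a_n| for a_n = 1/n."""
    return -math.log(n)


def log_two_three(n):
    """log|a_n| for a_n = 2^n + 3^n, written as n*log 3 + log(1 + (2/3)^n)."""
    return n * math.log(3) + math.log1p((2 / 3) ** n)


def log_factorial(n):
    """log|a_n| for a_n = n!, using the log-gamma function."""
    return math.lgamma(n + 1)


if __name__ == "__main__":
    print(root_estimate(log_reciprocal, 10**6))  # close to 1, so R is about 1
    print(root_estimate(log_two_three, 200))     # close to 3, so R is about 1/3
    print(root_estimate(log_factorial, 1000))    # large and growing, so R = 0
```

The radius of convergence is then estimated as the reciprocal of the printed value.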

Example 4.2.16. By Example 4.2.6, the geometric series $\sum x^n$ converges for $|x| < 1$ and diverges for $|x| \ge 1$. Thus the radius of convergence of the geometric series is 1. Alternatively, the radius of convergence can be obtained from $\limsup_{n\to\infty} \sqrt[n]{|a_n|} = \lim_{n\to\infty} \sqrt[n]{1} = 1$.

By Example 4.2.8 (or Example 2.3.16), the Taylor series $\sum \frac{x^n}{n!}$ of $e^x$ converges for any $x$. Therefore the radius of convergence is $\infty$. For a similar reason, the radius of convergence for the Taylor series of $\sin x$ and $\cos x$ is also $\infty$.

The Taylor series of $\log(1+x)$ is $\sum \frac{(-1)^{n+1}}{n} x^n$. Since $\limsup_{n\to\infty} \sqrt[n]{|a_n|} = \lim_{n\to\infty} \sqrt[n]{\frac{1}{n}} = 1$, the radius of convergence is $\frac{1}{1} = 1$.

Example 4.2.17. By $\lim_{n\to\infty} \sqrt[n]{n!} = \infty$, the radius of convergence for the series $\sum n!\,x^n$ is 0. In other words, the series diverges for all $x \ne 0$.

By $\lim_{n\to\infty} \sqrt[n]{2^n + 3^n} = 3$, the radius of convergence for the series $\sum (2^n + 3^n) x^n$ is $\frac{1}{3}$.

By $\lim_{n\to\infty} \sqrt[n]{\frac{1}{n^n}} = 0$, the radius of convergence for the series $\sum \frac{(-1)^n}{n^n} x^n$ is $\infty$. In other words, the series converges for all $x$.

Exercise 4.2.19. Prove that the radius of convergence $R$ of a power series $\sum a_n x^n$ satisfies

$$\liminf_{n\to\infty} \left|\frac{a_n}{a_{n+1}}\right| \le R \le \limsup_{n\to\infty} \left|\frac{a_n}{a_{n+1}}\right|.$$

Then use the result to show that the radius of convergence for the Taylor series of $(1+x)^{\alpha}$ and $\log(1+x)$ is 1.


Exercise 4.2.20. Suppose the radii of convergence for the power series $\sum a_n x^n$ and $\sum b_n x^n$ are $R$ and $R'$.

1. Prove that the radius of convergence for the sum power series $\sum (a_n + b_n) x^n$ is at least $\min\{R, R'\}$.

2. Prove that if $R \ne R'$, then the radius of convergence for the sum power series is equal to $\min\{R, R'\}$.

What about the radius of convergence for the product power series $\sum (a_0 b_n + a_1 b_{n-1} + \cdots + a_n b_0) x^n$?

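Exercise 4.2.21. Find the radius of convergence.

1. $\sum x^{n^2}$.

2. $\sum \frac{2^n}{n} x^n$.

3. $\sum 2^n x^{n^2 - 1}$.

4. $\sum \frac{(2n)!}{(n!)^2} x^n$.

5. $\sum a^{n^2} x^n$.

6. $\sum \left(\frac{n+1}{n}\right)^n x^n$.

7. $\sum (-1)^n \left(\frac{n+1}{n}\right)^{n^2} x^n$.

8. $\sum (-1)^n \left(\frac{n+1}{n}\right)^{n^2} x^{n^2}$.

9. $\sum (-1)^n \left(\frac{n+1}{2n}\right)^n x^n$.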

Due to the uniform convergence, the calculus of the power series can be done term by term.

Proposition 4.2.9. Suppose $R$ is the radius of convergence of a power series $\sum a_n x^n$. Then the sum has derivatives of any order for $|x| < R$. Moreover, the derivative and integral can be taken term by term for $|x| < R$:

$$(a_0 + a_1 x + \cdots + a_n x^n + \cdots)' = a_1 + 2a_2 x + \cdots + n a_n x^{n-1} + \cdots,$$

$$\int_0^x (a_0 + a_1 t + \cdots + a_n t^n + \cdots)\,dt = a_0 x + \frac{a_1}{2} x^2 + \frac{a_2}{3} x^3 + \cdots + \frac{a_n}{n+1} x^{n+1} + \cdots.$$

Note that by the formula (4.2.9), the radius of convergence for the derivative and integral power series remains the same.

Example 4.2.18. The power series $\sum n x^n = x + 2x^2 + 3x^3 + \cdots$ has radius of convergence 1. To find the sum, we note that

$$1 + 2x + 3x^2 + \cdots = (1 + x + x^2 + x^3 + \cdots)' = \left(\frac{1}{1-x}\right)' = \frac{1}{(1-x)^2}.$$

Therefore

$$\sum n x^n = x(1 + 2x + 3x^2 + \cdots) = \frac{x}{(1-x)^2}.$$
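The closed form in Example 4.2.18 can be compared with partial sums at a few points inside the radius of convergence. This is a minimal sketch; the function names and the sample points are chosen here for illustration.

```python
def partial_sum(x, terms):
    """Partial sum x + 2x^2 + ... + terms * x^terms."""
    return sum(n * x**n for n in range(1, terms + 1))


def closed_form(x):
    """The sum x / (1 - x)^2 from the example, valid for |x| < 1."""
    return x / (1 - x) ** 2


if __name__ == "__main__":
    for x in (0.5, -0.3, 0.9):
        print(x, partial_sum(x, 2000), closed_form(x))
```

Closer to $|x| = 1$ more terms are needed before the two columns agree, which reflects that the convergence is uniform only on $|x| \le r$ with $r < 1$.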

Example 4.2.19. By the equality

$$\frac{1}{1-x} = 1 + x + x^2 + \cdots + x^n + \cdots, \quad |x| < 1,$$

we get

$$\frac{1}{1+x} = 1 - x + x^2 + \cdots + (-1)^n x^n + \cdots, \quad |x| < 1.$$


By integration, we get

$$\log(1+x) = x - \frac{1}{2} x^2 + \frac{1}{3} x^3 + \cdots + \frac{(-1)^n}{n+1} x^{n+1} + \cdots, \quad |x| < 1.$$

Exercise 4.2.22. Find the sum of power series.

1. $\sum_{n=1}^{\infty} n^2 x^n$.

2. $\sum_{n=0}^{\infty} \frac{x^{2n+1}}{2n+1}$.

3. $\sum_{n=1}^{\infty} \frac{x^n}{n(n+1)}$.

4. $\sum_{n=1}^{\infty} \frac{x^n}{n(n+1)(n+2)}$.

Exercise 4.2.23. Find the Taylor series of the function and the radius of convergence. Then explain why the sum of the Taylor series is the given function.

1. $\arcsin x$.

2. $\int_0^x \frac{\sin t}{t}\,dt$.

3. $\arctan x$.

4. $\int_0^x e^{-t^2}\,dt$.

5. $\log(x + \sqrt{1 + x^2})$.

6. $\int_0^x \frac{dt}{\sqrt{1 - t^4}}$.

Exercise 4.2.24. Verify that the functions defined by the power series are the solutions of the differential equations.

1. $f(x) = \sum_{n=0}^{\infty} \frac{x^{4n}}{(4n)!}$, $f^{(4)}(x) = f(x)$.

2. $f(x) = \sum_{n=0}^{\infty} \frac{x^n}{(n!)^2}$, $x f''(x) + f'(x) - f(x) = 0$.

Exercise 4.2.25. Show that $f(x) = \sum_{n=1}^{\infty} \frac{x^n}{n^2} = -\int_0^x \frac{\log(1-t)}{t}\,dt$ and then verify that $f(x) + f(1-x) + \log x \log(1-x) = f(1)$.

Exercise 4.2.26 (Euler). The vibration of a circular drumhead is described by the differential equation

$$f''(x) + \frac{1}{x} f'(x) + \left(a^2 - \frac{b^2}{x^2}\right) f(x) = 0.$$

Verify that

$$f(x) = x^b \left[1 - \frac{1}{b+1}\left(\frac{ax}{2}\right)^2 + \frac{1}{2!(b+1)(b+2)}\left(\frac{ax}{2}\right)^4 - \frac{1}{3!(b+1)(b+2)(b+3)}\left(\frac{ax}{2}\right)^6 + \cdots\right]$$

is a solution of the equation.

Exercise 4.2.27. Derive the derivative of $e^x$, $\sin x$ and $\cos x$ from the uniform convergence of their Taylor series.

Exercise 4.2.28. Prove that if the radius of convergence for the power series $\sum a_n x^n$ is nonzero, then the sum function $f(x)$ satisfies $f^{(n)}(0) = n!\,a_n$.

The results so far concern only what happens within the radius of convergence. The following describes what happens at the radius of convergence.


Proposition 4.2.10 (Abel Theorem). Suppose $R$ is the radius of convergence of a power series. If the power series converges at $x = R$, then the series uniformly converges on $[0, R]$. If the power series converges at $x = -R$, then the series uniformly converges on $[-R, 0]$.

By Proposition 4.2.4, a consequence of the Abel theorem is that if $\sum a_n x^n$ converges at $x = R > 0$, then

$$\lim_{x\to R^-} \sum a_n x^n = \sum a_n R^n.$$

A similar equality holds when the power series converges at $x = -R$.

Proof. The convergence at $x = \pm R$ may be converted to the convergence at $x = 1$ by the change of variable $x \to \frac{x}{\pm R}$. Thus it suffices to consider the case $R = 1$ only. In other words, we assume $\sum a_n$ converges and would like to prove that $\sum a_n x^n$ uniformly converges on $[0, 1]$.

Applying the Cauchy criterion to the convergent series $\sum a_n$, for any $\varepsilon > 0$, there is $N$, such that $n \ge m > N$ implies $|a_m + a_{m+1} + \cdots + a_n| < \varepsilon$. Then

$$\begin{aligned}
& a_m x^m + a_{m+1} x^{m+1} + \cdots + a_n x^n \\
={}& a_m (x^m - x^{m+1}) + (a_m + a_{m+1})(x^{m+1} - x^{m+2}) + \cdots \\
& + (a_m + a_{m+1} + \cdots + a_{n-1})(x^{n-1} - x^n) + (a_m + a_{m+1} + \cdots + a_n) x^n.
\end{aligned}$$

By $a_m + a_{m+1} + \cdots + a_k < \varepsilon$ for any $k \ge m > N$ and $x^k \ge x^{k+1}$, we find that $n \ge m > N$ implies

$$\begin{aligned}
& a_m x^m + a_{m+1} x^{m+1} + \cdots + a_n x^n \\
\le{}& \varepsilon (x^m - x^{m+1}) + \varepsilon (x^{m+1} - x^{m+2}) + \cdots + \varepsilon (x^{n-1} - x^n) + \varepsilon x^n = \varepsilon x^m.
\end{aligned}$$

By $a_m + a_{m+1} + \cdots + a_k > -\varepsilon$ and the similar argument, we also find that $n \ge m > N$ implies $a_m x^m + a_{m+1} x^{m+1} + \cdots + a_n x^n \ge -\varepsilon x^m$. Thus we have

$$n \ge m > N \implies |a_m x^m + a_{m+1} x^{m+1} + \cdots + a_n x^n| \le \varepsilon x^m \le \varepsilon.$$

This proves the uniform convergence of the series on $[0, 1]$.

We note that the proof above is a special case of the Abel test in Exercise 4.2.66.

Example 4.2.20. By Example 4.2.16, the Taylor series of $\log(1+x)$ has radius of convergence 1. The series also converges at $x = 1$ because it is alternating. By Example 4.2.19 and the remark made after Proposition 4.2.10, we have

$$\begin{aligned}
1 - \frac{1}{2} + \frac{1}{3} + \cdots + \frac{(-1)^{n+1}}{n} + \cdots
&= \lim_{x\to 1^-} \left(x - \frac{1}{2} x^2 + \frac{1}{3} x^3 + \cdots + \frac{(-1)^n}{n+1} x^{n+1} + \cdots\right) \\
&= \lim_{x\to 1^-} \log(1+x) = \log 2.
\end{aligned}$$
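The two sides of the equality in Example 4.2.20 can be approached numerically from both directions: partial sums at $x = 1$, and the power series evaluated at a point slightly below 1. The sketch below is only an illustration, with function names and sample values chosen here. The error of an alternating series with decreasing terms is at most the first omitted term, which is why the partial sums at $x = 1$ converge slowly.

```python
import math


def alternating_harmonic(terms):
    """Partial sum 1 - 1/2 + 1/3 - ... with the given number of terms."""
    return sum((-1) ** (n + 1) / n for n in range(1, terms + 1))


def log_series(x, terms):
    """Partial sum of x - x^2/2 + x^3/3 - ... ."""
    return sum((-1) ** (n + 1) * x**n / n for n in range(1, terms + 1))


if __name__ == "__main__":
    print(alternating_harmonic(10000), math.log(2))
    print(log_series(0.999, 20000), math.log(1.999))
```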


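Exercise 4.2.29. Use the Taylor series of $\arctan x$ to show that

$$1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - \frac{1}{11} + \cdots = \frac{\pi}{4}.$$

Then show

$$1 + \frac{1}{2} - \frac{1}{3} - \frac{1}{4} + \frac{1}{5} + \frac{1}{6} - \frac{1}{7} - \frac{1}{8} + \cdots = \frac{\pi}{4} + \frac{\log 2}{2}.$$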

Exercise 4.2.30. Discuss the convergence of the Taylor series of $\arcsin x$ at the radius of convergence.

Exercise 4.2.31. By making use of the Taylor series of $(1+x)^{\alpha}$, prove that

$$\sum_{n=0}^{\infty} \frac{\alpha(\alpha-1)\cdots(\alpha-n+1)}{n!} = 2^{\alpha}, \quad \alpha > -1,$$

and

$$\sum_{n=0}^{\infty} \frac{\alpha(\alpha+1)\cdots(\alpha+n-1)}{n!} = 0, \quad \alpha < 0.$$

Note that the convergence can be obtained by the Leibniz test and the Raabe test (see Example 4.1.42).

Exercise 4.2.32. The series $\sum c_n$ with $c_n = a_0 b_n + a_1 b_{n-1} + \cdots + a_n b_0$ is obtained by combining terms in the diagonal arrangement of the product of the series $\sum a_n$ and $\sum b_n$. By considering the power series $\sum a_n x^n$, $\sum b_n x^n$, $\sum c_n x^n$ at $x = 1$, prove that if $\sum a_n$, $\sum b_n$ and $\sum c_n$ converge, then $\sum c_n = (\sum a_n)(\sum b_n)$. Compare with Exercise 4.1.36.

4.2.4 Fourier Series

A trigonometric series is an infinite sum of trigonometric functions

$$\frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx). \tag{4.2.10}$$

If the series uniformly converges to a continuous function $f(x)$ (this happens, for example, when $\sum (|a_n| + |b_n|)$ converges), then the sum $f(x)$ is necessarily a periodic function with period $2\pi$:

$$f(x + 2\pi) = f(x).$$

The inner product of two periodic integrable real functions with period $2\pi$ is

$$\langle f, g \rangle = \int_0^{2\pi} f(x) g(x)\,dx. \tag{4.2.11}$$

Because of the periodicity, the integral can be taken over any interval of length $2\pi$. The two functions are said to be orthogonal if $\langle f, g \rangle = 0$. A collection of such functions is called an orthogonal system if they are pairwise orthogonal.


The inner product is linear with respect to $f$:

$$\langle f_1 + f_2, g \rangle = \langle f_1, g \rangle + \langle f_2, g \rangle, \quad \langle cf, g \rangle = c \langle f, g \rangle.$$

It is also symmetric:

$$\langle f, g \rangle = \langle g, f \rangle.$$

Therefore the inner product is also linear in $g$, and we say the inner product is bilinear. Moreover, the inner product has the positivity property:

$$\langle f, f \rangle \ge 0.$$

By Exercise 3.2.3, the functions

$$\phi_1(x) = \frac{1}{2},\ \phi_2(x) = \cos x,\ \phi_3(x) = \sin x,\ \phi_4(x) = \cos 2x,\ \phi_5(x) = \sin 2x,\ \dots$$

used in the construction of the trigonometric series form an orthogonal system. Denote the corresponding coefficients by $c_1 = a_0$, $c_2 = a_1$, $c_3 = b_1$, .... Suppose $\sum |c_n|$ converges. Then by the uniform convergence and the bilinear property of the inner product, we have

$$\begin{aligned}
\langle f, \phi_n \rangle &= \int_0^{2\pi} f(x) \phi_n(x)\,dx \\
&= c_1 \int_0^{2\pi} \phi_1(x) \phi_n(x)\,dx + c_2 \int_0^{2\pi} \phi_2(x) \phi_n(x)\,dx + c_3 \int_0^{2\pi} \phi_3(x) \phi_n(x)\,dx + \cdots \\
&= c_1 \langle \phi_1, \phi_n \rangle + c_2 \langle \phi_2, \phi_n \rangle + c_3 \langle \phi_3, \phi_n \rangle + \cdots \\
&= c_n \langle \phi_n, \phi_n \rangle.
\end{aligned}$$

Therefore

$$c_n = \frac{\langle f, \phi_n \rangle}{\langle \phi_n, \phi_n \rangle}. \tag{4.2.12}$$

In our case (again by the computation in Exercise 3.2.3), we get

$$a_n = \frac{1}{\pi} \int_0^{2\pi} f(x) \cos nx\,dx, \quad b_n = \frac{1}{\pi} \int_0^{2\pi} f(x) \sin nx\,dx. \tag{4.2.13}$$

These are the Fourier coefficients.

Conversely, for any periodic integrable function with period $2\pi$, we may compute the Fourier coefficients as above and construct the Fourier series

$$f(x) \sim \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx).$$

The notation $\sim$ only means that the function and the series are related. The two will be equal only under certain conditions.

Example 4.2.21. Let $0 \le a \le 2\pi$. Then

$$f(x) = \begin{cases} 0 & \text{if } 2n\pi < x < a + 2n\pi \\ 1 & \text{if } a + 2n\pi < x < 2(n+1)\pi \end{cases}, \quad n \in \mathbb{Z}$$


is the periodic function with period $2\pi$ that uniquely extends the function

$$f(x) = \begin{cases} 0 & \text{if } 0 < x < a \\ 1 & \text{if } a < x < 2\pi \end{cases}$$

on $(0, 2\pi)$. Note that since the Fourier coefficients are defined as integrals, the values of the function at 0 and $a$ are not important. The Fourier series is

$$1 - \frac{a}{2\pi} - \sum \frac{1}{n\pi} \left(\sin na \cos nx + (1 - \cos na) \sin nx\right).$$

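Example 4.2.22. Let $f(x)$ be the periodic function with period $2\pi$ given by $f(x) = x^2$ on $(0, 2\pi)$. Then

$$a_0 = \frac{1}{\pi} \int_0^{2\pi} x^2\,dx = \frac{8\pi^2}{3},$$

$$a_n = \frac{1}{\pi} \int_0^{2\pi} x^2 \cos nx\,dx = -\frac{2}{n\pi} \int_0^{2\pi} x \sin nx\,dx = \frac{2}{n^2\pi} \left(2\pi - \int_0^{2\pi} \cos nx\,dx\right) = \frac{4}{n^2},$$

$$b_n = \frac{1}{\pi} \int_0^{2\pi} x^2 \sin nx\,dx = -\frac{1}{n\pi} \left(4\pi^2 - 2\int_0^{2\pi} x \cos nx\,dx\right) = -\frac{4\pi}{n} - \frac{2}{n^2\pi} \int_0^{2\pi} \sin nx\,dx = -\frac{4\pi}{n}.$$

Therefore the Fourier series is

$$\frac{4\pi^2}{3} + \sum \left(\frac{4}{n^2} \cos nx - \frac{4\pi}{n} \sin nx\right).$$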
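The coefficients found in Example 4.2.22 can be cross-checked by approximating the integrals in (4.2.13) numerically. The sketch below uses the midpoint rule, which is an arbitrary choice made here (any quadrature would do); the function name `fourier_coefficients` and the number of sample points are assumptions for illustration.

```python
import math


def fourier_coefficients(f, n, samples=20000):
    """Midpoint-rule approximation of a_n and b_n in (4.2.13) on (0, 2*pi)."""
    step = 2 * math.pi / samples
    a = 0.0
    b = 0.0
    for i in range(samples):
        x = (i + 0.5) * step
        a += f(x) * math.cos(n * x)
        b += f(x) * math.sin(n * x)
    return a * step / math.pi, b * step / math.pi


def square(x):
    """The function x^2 on (0, 2*pi) from the example."""
    return x * x


if __name__ == "__main__":
    for n in range(1, 4):
        a_n, b_n = fourier_coefficients(square, n)
        print(n, a_n, 4 / n**2, b_n, -4 * math.pi / n)
```

Each row prints the numerical value next to the closed form from the example, so the two pairs of columns should agree to several digits.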

Exercise 4.2.33. Compute the Fourier series for the periodic functions with period $2\pi$ satisfying the given formulae.

1. f(x) = x on (0, 2π).

2. f(x) = ax on (0, 2π).

3. f(x) = ax on (−π, π).

4. f(x) = | sinx|.

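5. $f(x) = \begin{cases} 0 & \text{if } 0 < x < a \\ x - a & \text{if } a < x < 2\pi \end{cases}$.

6. $f(x) = \begin{cases} x^2 & \text{if } 0 < x < \pi \\ -x^2 & \text{if } \pi < x < 2\pi \end{cases}$.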

Exercise 4.2.34. Suppose $l > 0$ and $f(x)$ is a periodic function with period $l$: $f(x + l) = f(x)$. Then $f\left(\frac{l}{2\pi} x\right)$ is a periodic function with period $2\pi$. Use the observation to derive the formula for the coefficients of the Fourier series

$$\frac{a_0}{2} + \sum_{n=1}^{\infty} \left(a_n \cos \frac{2n\pi x}{l} + b_n \sin \frac{2n\pi x}{l}\right).$$

Exercise 4.2.35. What can you say about the Fourier series of odd periodic functions? What about even periodic functions?

Exercise 4.2.36. How can you extend a function $f(x)$ on $\left(0, \frac{\pi}{2}\right)$ so that its Fourier series is of the form $\sum b_{2n-1} \sin(2n-1)x$?

Exercise 4.2.37. Suppose $k$ is an integer. How are the Fourier series of $f(x)$ and $f(kx + a)$ related?


Exercise 4.2.38. Suppose $f(x)$ is a periodic differentiable function with period $2\pi$, such that $f'(x)$ is integrable. How are the Fourier series of $f(x)$ and $f'(x)$ related?

Exercise 4.2.39. Show that $\{\cos nx\}$ is an orthogonal system on $(0, \pi)$ with respect to the inner product $\langle f, g \rangle = \int_0^{\pi} f(x) g(x)\,dx$. Show that $\{\sin nx\}$ is also an orthogonal system on $(0, \pi)$. However, the combined system is not an orthogonal system on $(0, \pi)$.

Similar to the Taylor series, we are interested in whether the Fourier series of $f(x)$ converges to $f(x)$. We start the discussion with the following result.

Proposition 4.2.11 (Bessel Inequality). For any periodic integrable function $f(x)$ of period $2\pi$, the Fourier coefficients satisfy

$$\frac{a_0^2}{2} + \sum (a_n^2 + b_n^2) \le \frac{1}{\pi} \int_0^{2\pi} f(x)^2\,dx. \tag{4.2.14}$$

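Proof. Let $s_N = \sum_{n=1}^{N} c_n \phi_n$. Then

$$\begin{aligned}
\langle f, s_N \rangle &= \sum_{n=1}^{N} c_n \langle f, \phi_n \rangle && \text{(inner product is bilinear)} \\
&= \sum_{n=1}^{N} c_n^2 \langle \phi_n, \phi_n \rangle && \text{(definition of } c_n\text{)} \\
&= \sum_{m=1}^{N} \sum_{n=1}^{N} c_m c_n \langle \phi_m, \phi_n \rangle && \text{(} \phi_n \text{ is orthogonal)} \\
&= \left\langle \sum_{m=1}^{N} c_m \phi_m, \sum_{n=1}^{N} c_n \phi_n \right\rangle = \langle s_N, s_N \rangle. && \text{(inner product is bilinear)}
\end{aligned}$$

Thus by the bilinearity and the symmetric property of the inner product, we have

$$\begin{aligned}
0 \le \langle f - s_N, f - s_N \rangle &= \langle f, f \rangle - \langle f, s_N \rangle - \langle s_N, f \rangle + \langle s_N, s_N \rangle \\
&= \langle f, f \rangle - \langle f, s_N \rangle = \int_0^{2\pi} f(x)^2\,dx - \sum_{n=1}^{N} c_n^2 \langle \phi_n, \phi_n \rangle.
\end{aligned}$$

In our case, we have

$$\left\langle \frac{1}{2}, \frac{1}{2} \right\rangle = \frac{\pi}{2}, \quad \langle \cos nx, \cos nx \rangle = \pi, \quad \langle \sin nx, \sin nx \rangle = \pi,$$

and the inequality (4.2.14) follows.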

Note that the Bessel inequality applies to any orthogonal system $\{\phi_n\}$:

$$\langle f, f \rangle \ge \sum_{n=1}^{\infty} c_n^2 \langle \phi_n, \phi_n \rangle, \quad \text{where } c_n = \frac{\langle f, \phi_n \rangle}{\langle \phi_n, \phi_n \rangle}. \tag{4.2.15}$$

As a matter of fact, the inequality becomes an equality for the trigonometric series (the fact is related to the completeness of the trigonometric system).
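The Bessel inequality (4.2.14) can be observed numerically for the function $x^2$ on $(0, 2\pi)$, whose coefficients $a_0 = \frac{8\pi^2}{3}$, $a_n = \frac{4}{n^2}$, $b_n = -\frac{4\pi}{n}$ are computed in Example 4.2.22. The right side is $\frac{1}{\pi}\int_0^{2\pi} x^4\,dx = \frac{32\pi^4}{5}$. The following is a sketch with names chosen here for illustration; the partial sums of the left side should increase towards the right side without exceeding it.

```python
import math

# (1/pi) * integral of (x^2)^2 over (0, 2*pi), computed by hand.
RIGHT_SIDE = 32 * math.pi**4 / 5


def bessel_left_side(terms):
    """Partial sum a_0^2/2 + sum of (a_n^2 + b_n^2) for f(x) = x^2 on (0, 2*pi)."""
    a0 = 8 * math.pi**2 / 3
    total = a0**2 / 2
    for n in range(1, terms + 1):
        a_n = 4 / n**2
        b_n = -4 * math.pi / n
        total += a_n**2 + b_n**2
    return total


if __name__ == "__main__":
    for terms in (10, 100, 1000, 10000):
        print(terms, bessel_left_side(terms), RIGHT_SIDE)
```

The gap shrinks roughly like $16\pi^2/N$, consistent with the remark that the inequality becomes an equality for the trigonometric system.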


Proposition 4.2.12 (Riemann-Lebesgue Lemma). Suppose $f(x)$ is an integrable function on a bounded interval $[a, b]$. Then

$$\lim_{n\to\infty} \int_a^b f(x) \cos nx\,dx = 0, \quad \lim_{n\to\infty} \int_a^b f(x) \sin nx\,dx = 0. \tag{4.2.16}$$

Proof. For a periodic integrable function $f(x)$ with period $2\pi$, Proposition 4.2.11 tells us that the Fourier coefficients converge to 0, which means

$$\lim_{n\to\infty} \int_0^{2\pi} f(x) \cos nx\,dx = 0, \quad \lim_{n\to\infty} \int_0^{2\pi} f(x) \sin nx\,dx = 0.$$

If $|b - a| \le 2\pi$, then $f(x)$ may be extended to a periodic integrable function $F(x)$ with period $2\pi$ satisfying

$$F(x) = \begin{cases} f(x) & \text{if } a < x < b \\ 0 & \text{if } b < x < a + 2\pi. \end{cases}$$

Then

$$\int_a^b f(x) \cos nx\,dx = \int_a^{a+2\pi} F(x) \cos nx\,dx = \int_0^{2\pi} F(x) \cos nx\,dx.$$

We just proved that this approaches 0 as $n \to \infty$.

In general, for bounded $[a, b]$, the integration $\int_a^b$ may be divided into finitely many integrations on intervals of length $\le 2\pi$. Then the integral also approaches 0.

A more general version of the Riemann-Lebesgue Lemma can be found in Exercise 3.2.44.

Note that by $\cos(n + \lambda)x = \cos \lambda x \cos nx - \sin \lambda x \sin nx$ and the similar formula for $\sin(n + \lambda)x$, we also have

$$\lim_{n\to\infty} \int_a^b f(x) \cos(n + \lambda)x\,dx = 0, \quad \lim_{n\to\infty} \int_a^b f(x) \sin(n + \lambda)x\,dx = 0. \tag{4.2.17}$$

Exercise 4.2.40. Suppose $f(x)$ is a periodic function with period $2\pi$ that is monotone on $(a, a + 2\pi)$ for some $a$. Prove that there is a constant $M$, such that the Fourier coefficients satisfy $|a_n| \le \frac{M}{n}$ and $|b_n| \le \frac{M}{n}$. Can you draw a stronger conclusion if $f'(x)$ is monotone?

Exercise 4.2.41. Suppose $f(x)$ is decreasing and $\lim_{x\to+\infty} f(x) = 0$. Prove that $\int_a^{+\infty} f(x) \cos nx\,dx$ converges and $\lim_{n\to\infty} \int_a^{+\infty} f(x) \cos nx\,dx = 0$.

Now we are ready for a simple convergence criterion for the Fourier series.


Theorem 4.2.13. Suppose $f(x)$ is a periodic integrable function with period $2\pi$. Suppose at $a$, $f(x)$ has the left limit $f(a^-)$ and the right limit $f(a^+)$, and there are $M$ and $\delta > 0$, such that

$$0 < t < \delta \implies |f(a + t) - f(a^+)| \le Mt, \quad |f(a - t) - f(a^-)| \le Mt.$$

Then the Fourier series converges to $\frac{f(a^+) + f(a^-)}{2}$ at $a$.

The condition of the theorem is satisfied everywhere if the function is piecewise Lipschitz.

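Proof. The partial sum of the Fourier series is

$$\begin{aligned}
s_N(a) &= \frac{a_0}{2} + \sum_{n=1}^{N} (a_n \cos na + b_n \sin na) \\
&= \frac{1}{\pi} \int_0^{2\pi} f(t) \left(\frac{1}{2} + \sum_{n=1}^{N} (\cos nt \cos na + \sin nt \sin na)\right) dt \\
&= \frac{1}{\pi} \int_0^{2\pi} f(t) \left(\frac{1}{2} + \sum_{n=1}^{N} \cos n(t - a)\right) dt \\
&= \frac{1}{\pi} \int_0^{2\pi} f(t) D_N(t - a)\,dt = \frac{1}{\pi} \int_{-a}^{2\pi - a} f(a + t) D_N(t)\,dt \\
&= \frac{1}{\pi} \int_{-\pi}^{\pi} f(a + t) D_N(t)\,dt,
\end{aligned}$$

where the last equality is due to the fact that the integrand is periodic. The Dirichlet kernel function

$$D_N(t) = \frac{1}{2} + \sum_{n=1}^{N} \cos nt = \frac{\sin\left(N + \frac{1}{2}\right)t}{2 \sin \frac{t}{2}} \tag{4.2.18}$$

satisfies

$$\frac{1}{\pi} \int_{-\pi}^{0} D_N(t)\,dt = \frac{1}{\pi} \int_0^{\pi} D_N(t)\,dt = \frac{1}{2}.$$

Therefore

$$\begin{aligned}
& s_N(a) - \frac{f(a^+) + f(a^-)}{2} \\
={}& \frac{1}{\pi} \int_{-\pi}^{\pi} f(a + t) D_N(t)\,dt - \frac{1}{\pi} \int_{-\pi}^{0} f(a^-) D_N(t)\,dt - \frac{1}{\pi} \int_0^{\pi} f(a^+) D_N(t)\,dt \\
={}& \frac{1}{\pi} \int_{-\pi}^{0} (f(a + t) - f(a^-)) D_N(t)\,dt + \frac{1}{\pi} \int_0^{\pi} (f(a + t) - f(a^+)) D_N(t)\,dt.
\end{aligned}$$

Note that for any fixed $a$, the function $\frac{f(a + t) - f(a^-)}{2 \sin \frac{t}{2}}$ is integrable on $[-\pi, -\varepsilon]$ for any $\varepsilon > 0$. Moreover, the assumption tells us that the function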


is bounded on $[-\pi, 0]$. Thus by Exercise 3.1.8, the function is integrable on $[-\pi, 0]$. Then by (4.2.17),

$$\int_{-\pi}^{0} (f(a + t) - f(a^-)) D_N(t)\,dt = \int_{-\pi}^{0} \frac{f(a + t) - f(a^-)}{2 \sin \frac{t}{2}} \sin\left(N + \frac{1}{2}\right)t\,dt$$

approaches 0 as $N \to \infty$. For a similar reason, $\int_0^{\pi} (f(a + t) - f(a^+)) D_N(t)\,dt$ also converges to 0.
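The closed form (4.2.18) of the Dirichlet kernel used in the proof can be compared with the defining cosine sum at a few sample points. This is only a numerical sanity check, not a proof; the function names are chosen here, and the closed form is evaluated away from multiples of $2\pi$, where its denominator vanishes.

```python
import math


def dirichlet_sum(N, t):
    """D_N(t) as the finite sum 1/2 + cos t + ... + cos Nt."""
    return 0.5 + sum(math.cos(n * t) for n in range(1, N + 1))


def dirichlet_closed(N, t):
    """D_N(t) as sin((N + 1/2) t) / (2 sin(t/2)); requires sin(t/2) != 0."""
    return math.sin((N + 0.5) * t) / (2 * math.sin(t / 2))


if __name__ == "__main__":
    for N in (1, 5, 20):
        for t in (0.3, 1.7, -2.5):
            print(N, t, dirichlet_sum(N, t), dirichlet_closed(N, t))
```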

Example 4.2.23. The function in Example 4.2.22 satisfies the condition of Theorem 4.2.13. By evaluating the Fourier series at $x = 0$, we get

$$\frac{(2\pi)^2 + 0^2}{2} = \frac{4\pi^2}{3} + \sum \frac{4}{n^2}.$$

From this we get

$$\frac{1}{1^2} + \frac{1}{2^2} + \cdots + \frac{1}{n^2} + \cdots = \frac{\pi^2}{6}.$$
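The value $\frac{\pi^2}{6}$ obtained in Example 4.2.23 can be compared with partial sums. The sketch below uses a function name chosen here; the comparison relies on the standard integral estimate $\frac{1}{N+1} < \sum_{n>N} \frac{1}{n^2} < \frac{1}{N}$ for the tail, which is not part of the example itself.

```python
import math


def basel_partial(terms):
    """Partial sum 1/1^2 + 1/2^2 + ... + 1/terms^2."""
    return sum(1 / n**2 for n in range(1, terms + 1))


if __name__ == "__main__":
    target = math.pi**2 / 6
    for terms in (10, 100, 1000):
        print(terms, basel_partial(terms), target - basel_partial(terms))
```

The printed gap should be close to the reciprocal of the number of terms.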

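Exercise 4.2.42. By evaluating the Fourier series in Example 4.2.22, show that $\sum \frac{\sin nx}{n} = \frac{\pi - x}{2}$ on $(0, 2\pi)$. Then derive the sum of the following infinite series.

1. $1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - \frac{1}{11} + \cdots$.

2. $1 + \frac{1}{5} - \frac{1}{7} - \frac{1}{11} + \frac{1}{13} + \frac{1}{17} + \cdots$.

3. $1 + \frac{1}{2} - \frac{1}{4} - \frac{1}{5} + \frac{1}{7} + \frac{1}{8} - \frac{1}{10} - \frac{1}{11} + \cdots$.

4. $1 + \frac{1}{3} - \frac{1}{5} - \frac{1}{7} + \frac{1}{9} + \frac{1}{11} - \frac{1}{13} - \frac{1}{15} + \cdots$.

Exercise 4.2.43. By studying the Fourier series of $x^4$, compute $\sum \frac{1}{n^4}$.

Exercise 4.2.44. Prove that if $f(x)$ has continuous second order derivative, then the Fourier series uniformly converges to $f(x)$.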

4.2.5 Additional Exercise

Uniform Convergence of Double Sequence

A double sequence $x_{m,n}$ is indexed by natural numbers $m$ and $n$. We write $\lim_{n\to\infty} x_{m,n} = l_m$ if the limit holds for any fixed $m$. We say the convergence is uniform if for any $\varepsilon > 0$, there is $N$, such that

$$n > N \implies |x_{m,n} - l_m| < \varepsilon.$$

The key point here is that $N$ depends on $\varepsilon$ only, and is independent of $m$.

The uniform convergence of series $\sum_n x_{m,n}$ can also be defined accordingly.

Exercise 4.2.45. Determine uniform convergence.


1. limn→∞1

m+ n= 0.

2. limn→∞m

m+ n= 0.

3. limn→∞1mn

= 0.

4. limn→∞1nm

= 0 (n > 1).

5. limn→∞ n√m = 1.

6. limn→∞m

n= 0.

Exercise 4.2.46. Construct a double sequence $x_{m,n}$ satisfying $\lim_{m\to\infty} x_{m,n} = 0$ and $\lim_{n\to\infty} x_{m,n} = 1$. In particular, the double limits $\lim_{m\to\infty} \lim_{n\to\infty} x_{m,n}$ and $\lim_{n\to\infty} \lim_{m\to\infty} x_{m,n}$ are different.

Exercise 4.2.47. Are the sum, product, maximum, etc., of two uniformly convergent double sequences still uniformly convergent?

Exercise 4.2.48. Suppose $\lim_{n\to\infty} x_{m,n} = l_m$ uniformly and $\lim_{m\to\infty} x_{m,n} = k_n$ exists. Prove that both $\lim_{m\to\infty} l_m$ and $\lim_{n\to\infty} k_n$ converge and are equal.

Exercise 4.2.49. State the Cauchy criterion for the uniform convergence of the series $\sum_n x_{m,n}$. State the corresponding comparison test.

Exercise 4.2.50. Determine uniform convergence (α, β > 0).

1.∑

n amn.

2.∑

n

1mαnβ

.

3.∑

n

1nαm

.

4.∑

n

(m+ n)!m!n!

an.

Double Series

A double series $\sum_{m,n\ge 1} x_{m,n}$ may converge in several different ways.

First, the double series converges to sum $s$ if for any $\varepsilon > 0$, there is $N$, such that

$$m, n > N \implies \left|\sum_{i=1}^{m} \sum_{j=1}^{n} x_{i,j} - s\right| < \varepsilon.$$

Second, the double series has repeated sum $\sum_m \sum_n x_{m,n}$ if $\sum_n x_{m,n}$ converges for each $m$ and the series $\sum_m \left(\sum_n x_{m,n}\right)$ again converges. Of course, there is another repeated sum $\sum_n \sum_m x_{m,n}$.

Third, a one-to-one correspondence $k \in \mathbb{N} \mapsto (m(k), n(k)) \in \mathbb{N}^2$ arranges the double series into a single series $\sum_k x_{m(k),n(k)}$, and we may consider the sum of the single series.

Fourth, for any finite subset $A \subset \mathbb{N}^2$, we may define the partial sum

$$s_A = \sum_{(m,n)\in A} x_{m,n}.$$

Then for any sequence $A_k$ of finite subsets satisfying $A_k \subset A_{k+1}$ and $\cup A_k = \mathbb{N}^2$, we say the double series converges to $s$ with respect to the sequence $A_k$ if $\lim_{k\to\infty} s_{A_k} = s$. For example, we have the spherical sum by considering $A_k = \{(m,n) \in \mathbb{N}^2 : m^2 + n^2 \le k^2\}$ and the triangular sum by considering $A_k = \{(m,n) \in \mathbb{N}^2 : m + n \le k\}$.

Finally, the double series absolutely converges (see Exercise 4.2.52 for the reason for the terminology) to $s$ if for any $\varepsilon > 0$, there is $N$, such that $|s_A - s| < \varepsilon$ for any $A$ containing all $(m,n)$ satisfying $m \le N$ and $n \le N$.


Exercise 4.2.51. State the Cauchy criterion for the convergence of the double series $\sum_{m,n\ge 1} x_{m,n}$ (in the first sense). State the corresponding comparison test.

Exercise 4.2.52. Prove that a double series $\sum_{m,n\ge 1} x_{m,n}$ absolutely converges (in the final sense) if and only if $\sum_{m,n\ge 1} |x_{m,n}|$ converges (in the first sense).

Exercise 4.2.53. Prove that a double series $\sum_{m,n\ge 1} x_{m,n}$ absolutely converges if and only if all the arrangement series $\sum_k x_{m(k),n(k)}$ converge. Moreover, the arrangement series have the same sum.

Exercise 4.2.54. Prove that a double series $\sum_{m,n\ge 1} x_{m,n}$ absolutely converges if and only if the double series converges with respect to all the sequences $A_k$. Moreover, the sums with respect to all the sequences $A_k$ are the same.

Exercise 4.2.55. Prove that if a double series $\sum_{m,n\ge 1} x_{m,n}$ absolutely converges, then the two repeated sums converge and have the same value.

Exercise 4.2.56. If a double series does not converge absolutely, what can happen to various sums?

Exercise 4.2.57. Study the convergence and values of the following double series.

1.∑

m,n≥1 amn.

2.∑

m,n≥1

1(m+ n)α

.

3.∑

m,n≥1

(−1)m+n

(m+ n)α.

4.∑

m,n≥2

1nm

.

Uniform Convergence of Two Variable Function

Let $f(x, t)$ be a two variable function. If $\lim_{t\to a} f(x, t) = l(x)$, then we say $f(x, t)$ converges to $l(x)$ as $t \to a$. The convergence is uniform if for any $\varepsilon > 0$, there is $\delta > 0$, such that

$$0 < |t - a| < \delta \implies |f(x, t) - l(x)| < \varepsilon.$$

Similar definitions can be made for the one-sided limit and the limit at infinity.

Exercise 4.2.58. Determine the intervals on which the limits are uniformly conver-gent.

1. limt→∞1xt

= 0.

2. limt→0 xt = 1.

3. limt→0 t(xt − 1) = 1.

4. limt→∞1

tx+ 1= 0.

5. limt→0

√x+ t =

√x.

6. $\lim_{t\to\infty} \left(1 + \frac{x}{t}\right)^t = e^x$.

Exercise 4.2.59. State the Cauchy criterion for $f(x, t)$ to uniformly converge to $l(x)$ as $t \to a$.

Exercise 4.2.60. Prove that $f(x, t)$ uniformly converges to $l(x)$ as $t \to a$ if and only if $f(x, t_n)$ uniformly converges to $l(x)$ for any sequence $t_n$ satisfying $t_n \ne a$ and $\lim_{n\to\infty} t_n = a$.

Exercise 4.2.61. Suppose $f(x)$ is continuous on an open interval containing $[a, b]$. Prove that $f(x + t)$ uniformly converges to $f(x)$ on $[a, b]$ as $t \to 0$.


Exercise 4.2.62. Suppose $f(x)$ has continuous derivative on an open interval containing $[a, b]$. Prove that $\frac{f(x + t) - f(x)}{t}$ uniformly converges to $f'(x)$ on $[a, b]$ as $t \to 0$.

Exercise 4.2.63. Suppose $f(x)$ is continuous on an open interval containing $[a, b]$. Prove that the limit in the fundamental theorem of calculus

$$\lim_{t\to 0} \frac{1}{t} \int_x^{x+t} f(s)\,ds = f(x)$$

converges uniformly.

Exercise 4.2.64. What properties of the uniform convergence of $f_n(x)$ still hold for the uniform convergence of $f(x, t)$?

Dirichlet Test and Abel Test

A sequence of functions $f_n(x)$ is uniformly bounded if there is $M$, such that $|f_n(x)| < M$ for all $n$ and $x$.

Exercise 4.2.65 (Dirichlet Test). Prove that if $u_n(x)$ is a monotone sequence and uniformly converges to 0, and the partial sums of $\sum v_n(x)$ are uniformly bounded, then $\sum u_n(x) v_n(x)$ uniformly converges.

Exercise 4.2.66 (Abel Test). Prove that if $u_n(x)$ is a monotone and uniformly bounded sequence, and $\sum v_n(x)$ uniformly converges, then $\sum u_n(x) v_n(x)$ uniformly converges.

Exercise 4.2.67. Using the formula in Exercise 4.1.3, prove that if $a_n$ is monotone and converges to 0, then $\sum a_n \cos nx$ and $\sum a_n \sin nx$ uniformly converge on $[\delta, 2\pi - \delta]$ for any $\delta > 0$. Moreover, show that $\sum \frac{\sin nx}{n}$ does not uniformly converge on $[0, 2\pi]$.

The Series $\sum_{n=1}^{\infty} \frac{a_n}{n^x}$

The series $\sum_{n=1}^{\infty} \frac{a_n}{n^x}$ has many properties similar to the power series.

Exercise 4.2.68. Use the Abel test in Exercise 4.2.66 to show that if $\sum \frac{a_n}{n^r}$ converges, then $\sum \frac{a_n}{n^x}$ uniformly converges on $[r, +\infty)$, and $\sum \frac{a_n}{n^r} = \lim_{x\to r^+} \sum \frac{a_n}{n^x}$.

Exercise 4.2.69. Prove that there is $R$, such that $\sum \frac{a_n}{n^x}$ converges on $(R, +\infty)$ and diverges on $(-\infty, R)$. Moreover, prove that $R \ge \limsup_{n\to\infty} \frac{\log |a_n|}{\log n}$.

Exercise 4.2.70. Prove that we can take termwise integration and derivative of any order of the series on $(R, +\infty)$.

Exercise 4.2.71. Prove that there is $R'$, such that the series absolutely converges on $(R', +\infty)$ and does not absolutely converge on $(-\infty, R')$. Moreover, prove that $\limsup_{n\to\infty} \frac{\log |a_n|}{\log n} + 1 \ge R' \ge \liminf_{n\to\infty} \frac{\log |a_n|}{\log n} + 1$.

Exercise 4.2.72. Give an example such that the inequalities in Exercises 4.2.69 and 4.2.71 are strict.

Continuous But Not Differentiable Function


Exercise 4.2.73. Let $a_n$ be any sequence of points. Let $|b| < 1$. Prove that the function $f(x) = \sum b^n |x - a_n|$ is continuous and is not differentiable precisely at $\{a_n\}$.

Exercise 4.2.74 (Riemann). Let $((x))$ be the periodic function with period 1, determined by $((x)) = x$ for $-\frac{1}{2} < x < \frac{1}{2}$ and $((\frac{1}{2})) = 0$. Let $f(x) = \sum_{n=1}^{\infty} \frac{((nx))}{n^2}$.

1. Prove that $f(x)$ is not continuous precisely at rational numbers with even denominators, i.e., numbers of the form $r = \frac{a}{2b}$, where $a$ and $b$ are odd integers.

2. Compute $f(r^+) - f(r)$ and $f(r^-) - f(r)$ at discontinuous points (you may need the conclusion of Exercise 4.2.23 for the precise value).

3. Prove that $f(x)$ is integrable, and $F(x) = \int_0^x f(t)\,dt$ is not differentiable precisely at rational numbers with even denominators.

Exercise 4.2.75 (Weierstrass). Suppose $0 < b < 1$ and $a$ is an odd integer satisfying $ab > 1 + \frac{3\pi}{2}$. Prove that $\sum_{n=0}^{\infty} b^n \cos(a^n \pi x)$ is continuous but nowhere differentiable.

Exercise 4.2.76 (Weierstrass). Let $\{a_n\}$ be a bounded countable set of numbers. Let $h(x) = x + \frac{x}{2} \sin \log |x|$, $h(0) = 0$, and $0 < b < 1$. Prove that $\sum_{n=1}^{\infty} b^n h(x - a_n)$ is continuous, strictly increasing, and is not differentiable precisely at $\{a_n\}$.


Chapter 5

Multivariable Function


5.1 Limit and Continuity

The extension of the analysis from single variable to multivariable means that we are going from the real line $\mathbb{R}$ to the Euclidean space $\mathbb{R}^n$. While many concepts and results may be extended, the Euclidean space is more complicated in various aspects. The first complication is the many possible choices of the distance in a Euclidean space, which we will show to be all equivalent as far as the mathematical analysis is concerned. The second complication is that it is not sufficient to just do analysis on the rectangles, which are the obvious generalizations of the intervals on the real line. For example, to analyze the temperature around the globe, we need to deal with a function defined on the 2-dimensional sphere inside the 3-dimensional Euclidean space. Thus we need to set up proper topological concepts that extend closed interval (closed subsets), bounded closed interval (compact subsets), and open interval (open subsets). Once the concepts are established, we find that most of the discussion about the limits and continuity of single variable functions may be extended to multivariable functions (the only exceptions are those dealing with orders among real numbers, such as monotone properties).

5.1.1 Limit in Euclidean Space

The n-dimensional (real) Euclidean space $\mathbb{R}^n$ is the collection of n-tuples of real numbers

$$\vec{x} = (x_1, x_2, \dots, x_n), \quad x_i \in \mathbb{R}.$$

Geometrically, $\mathbb{R}^1$ is the usual real line, and $\mathbb{R}^2$ is a plane with origin. An element of $\mathbb{R}^n$ can be considered as a point or as an arrow starting from the origin and ending at the point. By the latter viewpoint, an element is also called a vector.

The operations of addition and scalar multiplication may be applied to Euclidean vectors

$$\vec{x} + \vec{y} = (x_1 + y_1, x_2 + y_2, \dots, x_n + y_n), \quad c\vec{x} = (cx_1, cx_2, \dots, cx_n). \tag{5.1.1}$$

The operations satisfy the usual properties such as the commutativity and the associativity. In general, a vector space is a set with two operations satisfying these usual properties.

The operation

$$\vec{x} \cdot \vec{y} = x_1y_1 + x_2y_2 + \dots + x_ny_n \tag{5.1.2}$$

is called the dot product and satisfies the usual properties such as the bilinearity and the positivity. In general, an inner product on a vector space is an operation with numerical value satisfying these usual properties.

The dot product, or the inner product in general, induces the length (or the Euclidean norm)

$$\|\vec{x}\|_2 = \sqrt{\vec{x} \cdot \vec{x}} = \sqrt{x_1^2 + x_2^2 + \dots + x_n^2}. \tag{5.1.3}$$


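It also induces the angle $\theta$ between two nonzero vectors by the formula

$$\cos\theta = \frac{\vec{x} \cdot \vec{y}}{\|\vec{x}\|_2 \|\vec{y}\|_2}. \tag{5.1.4}$$

The definition of the angle is justified by the Schwarz inequality

$$|\vec{x} \cdot \vec{y}| \le \|\vec{x}\|_2 \|\vec{y}\|_2. \tag{5.1.5}$$

By the angle formula, two vectors are orthogonal, denoted $\vec{x} \perp \vec{y}$, if $\vec{x} \cdot \vec{y} = 0$. Moreover, the orthogonal projection of a vector $\vec{x}$ on another vector $\vec{y}$ is

$$\mathrm{proj}_{\vec{y}}\,\vec{x} = \|\vec{x}\|_2 \cos\theta \, \frac{\vec{y}}{\|\vec{y}\|_2} = \frac{\vec{x} \cdot \vec{y}}{\vec{y} \cdot \vec{y}}\, \vec{y}. \tag{5.1.6}$$

Finally, the area of the parallelogram formed by two vectors is

$$A(\vec{x}, \vec{y}) = \|\vec{x}\|_2 \|\vec{y}\|_2 |\sin\theta| = \sqrt{(\vec{x} \cdot \vec{x})(\vec{y} \cdot \vec{y}) - (\vec{x} \cdot \vec{y})^2}. \tag{5.1.7}$$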
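The formulas (5.1.2) through (5.1.7) can be tried out numerically. The following is a minimal Python sketch, not part of the original text; the function names `dot`, `norm2`, `angle`, `proj` and `area` are chosen here purely for illustration, and vectors are represented as tuples of floats.

```python
import math

def dot(x, y):
    """Dot product (5.1.2)."""
    return sum(a * b for a, b in zip(x, y))

def norm2(x):
    """Euclidean norm (5.1.3)."""
    return math.sqrt(dot(x, x))

def angle(x, y):
    """Angle between two nonzero vectors, from (5.1.4)."""
    c = dot(x, y) / (norm2(x) * norm2(y))
    # Clamp to [-1, 1] to guard against rounding just outside the domain of acos.
    return math.acos(max(-1.0, min(1.0, c)))

def proj(x, y):
    """Orthogonal projection of x on a nonzero vector y, second form of (5.1.6)."""
    c = dot(x, y) / dot(y, y)
    return tuple(c * b for b in y)

def area(x, y):
    """Area of the parallelogram spanned by x and y, second form of (5.1.7)."""
    # max(0, ...) guards against a tiny negative value caused by rounding.
    return math.sqrt(max(0.0, dot(x, x) * dot(y, y) - dot(x, y) ** 2))

x, y = (3.0, 4.0), (1.0, 0.0)
print(norm2(x), proj(x, y), area(x, y), angle(x, y))
```

For these two sample vectors, the Schwarz inequality (5.1.5) reads $3 \le 5 \cdot 1$, and the area agrees with base times height of the parallelogram.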

A norm on a vector space is a function $\|\vec{x}\|$ satisfying

1. Positivity: $\|\vec{x}\| \ge 0$, and $\|\vec{x}\| = 0$ if and only if $\vec{x} = \vec{0} = (0, 0, \dots, 0)$.

2. Scalar Property: $\|c\vec{x}\| = |c|\,\|\vec{x}\|$.

3. Triangle Inequality: $\|\vec{x} + \vec{y}\| \le \|\vec{x}\| + \|\vec{y}\|$.

The norm induces the distance $\|\vec{x} - \vec{y}\|$ between two vectors. The most often used norms are the Euclidean norm $\|\vec{x}\|_2$ and

$$\|\vec{x}\|_1 = |x_1| + |x_2| + \dots + |x_n|, \quad \|\vec{x}\|_\infty = \max\{|x_1|, |x_2|, \dots, |x_n|\}.$$

In general, for any $p \ge 1$, the $L^p$-norm is defined in Exercise 5.1.3.
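As a small numerical illustration of these norms, here is a Python sketch. The helper `norm_p` is a name introduced here, not taken from the text; it uses the $L^p$ formula (5.1.8) from Exercise 5.1.3 and treats `float("inf")` as the $L^\infty$ case.

```python
def norm_p(x, p):
    """L^p-norm of a tuple x for p >= 1; p = float('inf') gives the max norm."""
    if p == float("inf"):
        return max(abs(a) for a in x)
    return sum(abs(a) ** p for a in x) ** (1.0 / p)

x = (3.0, -4.0)
n1 = norm_p(x, 1)               # |3| + |-4|
n2 = norm_p(x, 2)               # sqrt(9 + 16)
ninf = norm_p(x, float("inf"))  # max(|3|, |-4|)
print(n1, n2, ninf)
```

For this sample vector the three values are ordered as $\|\vec{x}\|_\infty \le \|\vec{x}\|_2 \le \|\vec{x}\|_1$, which is consistent with the inequalities of Exercise 5.1.3.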

Figure 5.1: balls with respect to different norms (three panels, each centered at $\vec{x}$ with radius $\epsilon$: $B_{L^\infty}(\vec{x}, \epsilon)$, $B_{L^1}(\vec{x}, \epsilon)$, $B_{L^2}(\vec{x}, \epsilon)$)

Given any norm, we have the (open) ball and the closed ball

$$B(\vec{x}, \epsilon) = \{\vec{y} : \|\vec{y} - \vec{x}\| < \epsilon\}, \quad \bar{B}(\vec{x}, \epsilon) = \{\vec{y} : \|\vec{y} - \vec{x}\| \le \epsilon\}$$

of radius $\epsilon$ centered at $\vec{x}$. Moreover, for the Euclidean norm, we have the (closed) Euclidean ball and the sphere of radius $R$

$$B^n_R = \{\vec{x} : \|\vec{x}\|_2 \le R\} = \{(x_1, x_2, \dots, x_n) : x_1^2 + x_2^2 + \dots + x_n^2 \le R^2\},$$
$$S^{n-1}_R = \{\vec{x} : \|\vec{x}\|_2 = R\} = \{(x_1, x_2, \dots, x_n) : x_1^2 + x_2^2 + \dots + x_n^2 = R^2\}.$$

For the radius $R = 1$, we have the unit ball $B^n = B^n_1$ and the unit sphere $S^{n-1} = S^{n-1}_1$.


Exercise 5.1.1. Find all the norms on R.

Exercise 5.1.2. Directly verify the Schwarz inequality for the dot product

$$|x_1y_1 + x_2y_2 + \dots + x_ny_n| \le \sqrt{x_1^2 + x_2^2 + \dots + x_n^2}\,\sqrt{y_1^2 + y_2^2 + \dots + y_n^2},$$

and find the condition for the equality to hold. Moreover, derive the triangle inequality for the norm $\|\vec{x}\|_2$ from the Schwarz inequality.

Note that the Hölder inequality in Exercise 2.2.41 generalizes the Schwarz inequality.

Exercise 5.1.3. Prove that for any $p \ge 1$, the $L^p$-norm

$$\|\vec{x}\|_p = \sqrt[p]{|x_1|^p + |x_2|^p + \dots + |x_n|^p} \tag{5.1.8}$$

satisfies the three conditions for the norm. Then prove that the $L^p$-norms satisfy

$$\|\vec{x}\|_\infty \le \|\vec{x}\|_p \le \sqrt[p]{n}\,\|\vec{x}\|_\infty.$$

Exercise 5.1.4. Prove that for any positive numbers $a_1, a_2, \dots, a_n$, and $p \ge 1$,

$$\|\vec{x}\| = \sqrt[p]{a_1|x_1|^p + a_2|x_2|^p + \dots + a_n|x_n|^p}$$

is a norm.

Exercise 5.1.5. Suppose $\|\vec{x}\|$ and $|||\vec{y}|||$ are norms on $\mathbb{R}^m$ and $\mathbb{R}^n$. Prove that $\|(\vec{x}, \vec{y})\| = \max\{\|\vec{x}\|, |||\vec{y}|||\}$ is a norm on $\mathbb{R}^m \times \mathbb{R}^n = \mathbb{R}^{m+n}$. Prove that if $m = n$, then $\|\vec{x}\| + |||\vec{x}|||$ is a norm on $\mathbb{R}^n$.

Exercise 5.1.6. Prove that for any norm and any vector $\vec{x}$, there is a number $r \ge 0$ and a vector $\vec{u}$, such that $\vec{x} = r\vec{u}$ and $\|\vec{u}\| = 1$. The expression $\vec{x} = r\vec{u}$ is the polar decomposition that describes a vector as characterized by the length $r$ and the direction $\vec{u}$.

Exercise 5.1.7. Prove that any norm satisfies $|\|\vec{x}\| - \|\vec{y}\|| \le \|\vec{x} - \vec{y}\|$.

Exercise 5.1.8. Prove that if $\vec{y} \in B(\vec{x}, \epsilon)$, then $B(\vec{y}, \delta) \subset B(\vec{x}, \epsilon)$ for some radius $\delta > 0$. In fact, we can take $\delta = \epsilon - \|\vec{y} - \vec{x}\|$.

Comparing the single variable (represented by $\mathbb{R}$) with the multivariable (represented by $\mathbb{R}^n$), we find that the addition is generalized in a unique way, and the multiplication is generalized to the scalar multiplication and the dot product. Moreover, the distance between real numbers is generalized to many possible choices of norms.

A sequence of vectors $\{\vec{x}_n\}$ converges to a vector $\vec{l}$ with respect to a norm $\|\vec{x}\|$, denoted $\lim_{n\to\infty} \vec{x}_n = \vec{l}$, if for any $\epsilon > 0$, there is $N$, such that

$$n > N \implies \|\vec{x}_n - \vec{l}\| < \epsilon. \tag{5.1.9}$$

The right side is the same as $\vec{x}_n \in B(\vec{l}, \epsilon)$. The definition appears to depend on the choice of norm. For example, $\lim_{n\to\infty}(x_n, y_n) = (l, k)$ with respect to the Euclidean norm means

$$\lim_{n\to\infty} \sqrt{|x_n - l|^2 + |y_n - k|^2} = 0,$$


and the same limit with respect to the $L^\infty$-norm means each coordinate converges

$$\lim_{n\to\infty} x_n = l, \quad \lim_{n\to\infty} y_n = k.$$

On the other hand, if two norms $\|\vec{x}\|$ and $|||\vec{x}|||$ satisfy

$$c_1\|\vec{x}\| \le |||\vec{x}||| \le c_2\|\vec{x}\| \tag{5.1.10}$$

for some $c_1, c_2 > 0$, then it is easy to see that the convergence with respect to $\|\vec{x}\|$ is equivalent to the convergence with respect to $|||\vec{x}|||$. Because of this remark, two norms related as in (5.1.10) are said to be equivalent. For example, since all $L^p$-norms are equivalent by Exercise 5.1.3, the $L^p$-convergence in $\mathbb{R}^n$ also means that each coordinate converges.
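The equivalence can be observed numerically. The Python sketch below uses a sample sequence $\vec{x}_n = (1/n, 1 + 1/n^2)$ with limit $(0, 1)$, chosen here only for illustration; the helper names `norm_p` and `distances` are likewise not from the text. It prints the distance to the limit in the $L^1$-, $L^2$- and $L^\infty$-norms, which shrink together because of the inequality in Exercise 5.1.3.

```python
def norm_p(x, p):
    """L^p-norm of a tuple x; p = float('inf') gives the max norm."""
    if p == float("inf"):
        return max(abs(a) for a in x)
    return sum(abs(a) ** p for a in x) ** (1.0 / p)

def distances(n, limit=(0.0, 1.0)):
    """Distance from x_n = (1/n, 1 + 1/n^2) to the limit in three norms."""
    x_n = (1.0 / n, 1.0 + 1.0 / n ** 2)
    diff = tuple(a - b for a, b in zip(x_n, limit))
    return {p: norm_p(diff, p) for p in (1, 2, float("inf"))}

for n in (1, 10, 100, 1000):
    d = distances(n)
    print(n, d[1], d[2], d[float("inf")])
```

In dimension 2 the inequality of Exercise 5.1.3 gives $\|\vec{x}\|_\infty \le \|\vec{x}\|_p \le \sqrt[p]{2}\,\|\vec{x}\|_\infty$, so each column of the printed table tends to 0 as soon as one of them does.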

In Theorem 5.1.12, we will prove that all norms on $\mathbb{R}^n$ (and on finite dimensional vector spaces in general) are equivalent, so that the convergence is independent of the choice of the norm. For the moment, however, all the subsequent discussions can be rigorously verified with regard to the $L^p$-norm. After Theorem 5.1.12 is established, the discussion becomes valid for any norm.

Because the $L^\infty$-convergence is the same as the convergence of each coordinate, by applying the Cauchy criterion to each coordinate, we get the Cauchy criterion for the convergence of vectors: For any $\epsilon > 0$, there is $N$, such that $m, n > N$ implies $\|\vec{x}_m - \vec{x}_n\| < \epsilon$. This extends Theorems 1.2.2 and 1.2.12.

By considering individual coordinates, Proposition 1.2.1, Theorem 1.2.8 and Proposition 1.2.9 can be extended to vectors. Because of the lack of order among vectors, Proposition 1.2.7 cannot be extended directly to vectors, but can still be applied to individual coordinates. For the same reason, the concept of upper and lower limits cannot be extended.

Exercise 5.1.9. Prove that if the first norm is equivalent to the second norm, and the second norm is equivalent to the third norm, then the first norm is equivalent to the third norm.

Exercise 5.1.10. Prove that if $\lim_{n\to\infty} \vec{x}_n = \vec{l}$, then $\lim_{n\to\infty} \|\vec{x}_n\| = \|\vec{l}\|$. Prove the converse is true if $\vec{l} = \vec{0}$. In particular, convergent sequences of vectors are bounded.

Exercise 5.1.11. Prove that if $\lim_{n\to\infty} \vec{x}_n = \vec{l}$ and $\lim_{n\to\infty} \vec{y}_n = \vec{k}$, then $\lim_{n\to\infty} (\vec{x}_n + \vec{y}_n) = \vec{l} + \vec{k}$ and $\lim_{n\to\infty} c\vec{x}_n = c\vec{l}$.

Exercise 5.1.12. Prove that if $\lim_{n\to\infty} c_n = c$ and $\lim_{n\to\infty} \vec{x}_n = \vec{x}$, then $\lim_{n\to\infty} c_n\vec{x}_n = c\vec{x}$.

Exercise 5.1.13. Prove that if $\lim_{n\to\infty} \vec{x}_n = \vec{x}$ and $\lim_{n\to\infty} \vec{y}_n = \vec{y}$ with respect to the Euclidean norm $\|\vec{x}\|_2$, then $\lim_{n\to\infty} \vec{x}_n \cdot \vec{y}_n = \vec{x} \cdot \vec{y}$. Of course the Euclidean norm can be replaced by any norm after Theorem 5.1.12 is established.

5.1.2 Topology in Euclidean Space

The theory of one variable functions is often discussed over intervals. For multivariable functions, it is no longer sufficient to only consider rectangles.


We need some basic concepts and results for the most useful type of subsets of the Euclidean space.

The subsequent definitions are made with respect to any norm. The results are stated for any norm. The discussion and proofs are mostly based on the $L^\infty$-norm. After Theorem 5.1.12 is established, the discussion and results are valid for any norm.

Many important results on one-variable functions are stated on a bounded and closed interval. Since the key property used for proving the results is that any sequence in the interval has a subsequence converging to a number in the interval, we introduce the following concept.

Definition 5.1.1. A subset of a Euclidean space is compact if any sequence in the subset has a convergent subsequence with the limit still in the subset.

A subset $K$ is bounded if there is a constant $M$ such that $\|\vec{x}\| < M$ for any $\vec{x} \in K$. By applying Theorem 1.2.8 to each coordinate (for the $L^\infty$-norm, with the help of Exercise 1.2.23), we find that any sequence in a bounded subset $K$ has a convergent subsequence. For $K$ to be compact (with respect to the $L^\infty$-norm, at the moment), it remains to make sure the limit is still inside $K$. This leads to the following concept.

Definition 5.1.2. A subset of a Euclidean space is closed if the limit of any convergent sequence in the subset is still in the subset.

In other words, a subset is closed if it contains all its limits.

Proposition 5.1.3. A subset of the Euclidean space is compact if and only if it is bounded and closed.

For the special case of the $L^\infty$-norm, we have argued that bounded and closed subsets are compact. The proof of the converse is given below, for any norm.

Proof. Suppose $K$ is not bounded. Then there is a sequence $\{\vec{x}_n\}$ in $K$ satisfying $\lim_{n\to\infty} \|\vec{x}_n\| = \infty$. This implies that $\lim_{k\to\infty} \|\vec{x}_{n_k}\| = \infty$ for any subsequence $\{\vec{x}_{n_k}\}$, so that any subsequence is not bounded and must diverge (see Exercise 5.1.10). Thus $K$ is not compact.

Suppose $K$ is not closed. Then there is a convergent sequence $\{\vec{x}_n\}$ in $K$ such that $\lim_{n\to\infty} \vec{x}_n = \vec{y} \notin K$. In particular, any subsequence of $\{\vec{x}_n\}$ converges to $\vec{y} \notin K$. Thus no subsequence has limit in $K$, and $K$ is not compact.

Exercise 5.1.14. Prove that if two norms are equivalent, then a subset is bounded, compact, or closed with respect to one norm if and only if it has the same property with respect to the other norm.

Exercise 5.1.15. Prove that the subsets $\{\vec{x} : \|\vec{x}\|_\infty = 1\}$ and $\{\vec{x} : \|\vec{x}\|_\infty \le 1\}$ are compact with respect to the $L^\infty$-norm.

Exercise 5.1.16. Suppose $K$ is a compact subset. Prove that there are $\vec{a}, \vec{b} \in K$, such that $\|\vec{a} - \vec{b}\| = \max_{\vec{x}, \vec{y} \in K} \|\vec{x} - \vec{y}\|$.


Exercise 5.1.17. Prove that the closed ball $\bar{B}(\vec{x}, \epsilon)$ is a closed subset (so the name is appropriate).

Exercise 5.1.18. Prove that the intersection of closed subsets is closed, the union of finitely many closed subsets is also closed, and the product of closed subsets is closed (use the product norm in Exercise 5.1.5). What about compact subsets?

The definition of closed subsets suggests that, to make any subset closed, we should add all the limits of convergent sequences in the subset.

Definition 5.1.4. The closure $\bar{A}$ of a subset $A$ consists of the limits of all convergent sequences in $A$.

Clearly, a subset $A$ is closed if and only if $\bar{A} \subset A$. The closure is also characterized by the following result.

Proposition 5.1.5. The closure of $A$ is closed. It consists of points $\vec{y}$ satisfying the property that for any $\epsilon > 0$, we have $\|\vec{y} - \vec{x}\| < \epsilon$ for some $\vec{x} \in A$. Moreover, it is the smallest closed subset containing $A$.

The property in the proposition means $B(\vec{y}, \epsilon) \cap A \ne \emptyset$ for any $\epsilon > 0$. The geometric meaning is illustrated in Figure 5.2.

Figure 5.2: characterization of points in the closure (the ball of radius $\epsilon$ around $\vec{y}$ has nonempty intersection with $A$)

If a subset $A$ is bounded, then it is contained in the closed cube $[-M, M]^n$ for some big $M$. The proposition implies that the closure $\bar{A}$ is also contained in $[-M, M]^n$ and is therefore bounded. Then by Proposition 5.1.3, the closure is compact.

Proof. Any $\vec{y} \in \bar{A}$ is the limit of a sequence $\{\vec{x}_n\}$ in $A$. Thus for any $\epsilon > 0$, we have $\|\vec{y} - \vec{x}_n\| < \epsilon$ for big $n$, where we also note that $\vec{x}_n \in A$.

Conversely, suppose $\vec{y}$ has the property described in the proposition. Then for $\epsilon = \frac{1}{n}$, we have $\vec{x}_n \in A$ satisfying $\|\vec{y} - \vec{x}_n\| < \frac{1}{n}$. This implies $\vec{y}$ is the limit of the sequence $\{\vec{x}_n\}$ in $A$.

Next we prove the closure is closed. By the definition of closure, this is the same as showing $\bar{\bar{A}} \subset \bar{A}$. By the characterization just proved, if $\vec{z} \in \bar{\bar{A}}$, then for any $\epsilon > 0$, we have $\|\vec{z} - \vec{y}\| < \epsilon$ for some $\vec{y} \in \bar{A}$. Using the characterization of the closure again, we have $\|\vec{y} - \vec{x}\| < \epsilon$ for some $\vec{x} \in A$. Therefore we get

$$\|\vec{z} - \vec{x}\| \le \|\vec{z} - \vec{y}\| + \|\vec{y} - \vec{x}\| < 2\epsilon.$$

By the characterization of the closure, this shows that $\vec{z} \in \bar{A}$ and completes the proof that $\bar{A}$ is closed.


Finally, we prove that the closure is the smallest closed subset containing $A$. Since any $\vec{x} \in A$ is the limit of the constant sequence $\{\vec{x}\}$ in $A$, we have $A \subset \bar{A}$. Therefore the closure is a closed subset containing $A$. Moreover, suppose $C$ is a closed subset containing $A$. Then any $\vec{y} \in \bar{A}$ is the limit of a convergent sequence in $A$. Since $A \subset C$, the sequence also lies in $C$. Since $C$ is closed, it contains the limit $\vec{y}$ of the convergent sequence. This proves that $\bar{A} \subset C$. Therefore the closure is the smallest closed subset containing $A$.

Definition 5.1.6. The boundary $\partial A$ of a subset $A$ consists of points that are simultaneously limits of sequences in $A$ and of sequences in $\mathbb{R}^n - A$.

The definition means $\partial A = \bar{A} \cap \overline{\mathbb{R}^n - A}$. Equivalently, $\vec{y} \in \partial A$ means $B(\vec{y}, \epsilon) \cap A \ne \emptyset$ and $B(\vec{y}, \epsilon) \cap (\mathbb{R}^n - A) \ne \emptyset$ for any $\epsilon > 0$.

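Exercise 5.1.19. Prove that for any norm, the closed ball $\bar{B}(\vec{x}, \epsilon)$ is the closure of $B(\vec{x}, \epsilon)$, and the sphere $\bar{B}(\vec{x}, \epsilon) - B(\vec{x}, \epsilon)$ is the boundary of $B(\vec{x}, \epsilon)$.

Exercise 5.1.20. Prove that $A$ is closed if and only if $\bar{A} = A$.

Exercise 5.1.21. Prove that if two norms are equivalent, then the closure and the boundary of a subset with respect to one norm are the same as the closure and the boundary with respect to the other norm.

Exercise 5.1.22. Prove properties of the closure and the boundary (the norm on the product of Euclidean spaces is given in Exercise 5.1.5).

1. $A \subset B \implies \bar{A} \subset \bar{B}$.

2. $\overline{A \cup B} = \bar{A} \cup \bar{B}$.

3. $\overline{A \cap B} \subset \bar{A} \cap \bar{B}$.

4. $\overline{A \times B} = \bar{A} \times \bar{B}$.

5. $\bar{A} = \partial A \cup A$.

6. $\partial A = \partial(\mathbb{R}^n - A)$.

7. $\partial(A \cup B) \subset \partial A \cup \partial B$.

8. $\partial(A \cap B) \subset \partial A \cup \partial B$.

9. $\partial(A - B) \subset \partial A \cup \partial B$.

10. $\partial(A \times B) = (\partial A \times \bar{B}) \cup (\bar{A} \times \partial B)$.

Exercise 5.1.23. Suppose $\vec{x} \in A$ and $\vec{y} \notin A$. Suppose $\phi(t) = \vec{x} + t(\vec{y} - \vec{x})$ and $\tau = \sup\{t : \phi(t) \in A\}$. Prove that $\phi(\tau) \in \partial A$.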

In the analysis of one-variable functions, such as differentiability, we often require functions to be defined at all the points near a given point. For multivariable functions, the requirement becomes the following concept.

Definition 5.1.7. A subset of a Euclidean space is open if for any point in the subset, all the points near the point are also in the subset.

Thus a subset $U$ is open if

$$\vec{x} \in U \implies B(\vec{x}, \epsilon) \subset U \text{ for some } \epsilon > 0. \tag{5.1.11}$$

The geometrical meaning is illustrated in Figure 5.3.

Proposition 5.1.8. A subset $U$ of $\mathbb{R}^n$ is open if and only if the complement $\mathbb{R}^n - U$ is closed.


Figure 5.3: definition of open subset

Proof. The subset $U$ is open if and only if (5.1.11) holds. Since the right side of (5.1.11) is the same as $B(\vec{x}, \epsilon) \cap (\mathbb{R}^n - U) = \emptyset$ for some $\epsilon > 0$, by Proposition 5.1.5, we see that $U$ is open if and only if

$$\vec{x} \in U \implies \vec{x} \notin \overline{\mathbb{R}^n - U}.$$

This is logically the same as

$$\vec{x} \in \overline{\mathbb{R}^n - U} \implies \vec{x} \notin U.$$

Since the right side is the same as $\vec{x} \in \mathbb{R}^n - U$, the implication means exactly $\overline{\mathbb{R}^n - U} \subset \mathbb{R}^n - U$, which by the definition means the complement $\mathbb{R}^n - U$ is closed.

The following result says that if an open subset $U$ contains a compact subset $K$, then $U$ contains an $\epsilon$-neighborhood of $K$.

Proposition 5.1.9. Suppose a compact subset $K$ is contained in an open subset $U$. Then there is $\epsilon > 0$, such that $\|\vec{y} - \vec{x}\| < \epsilon$ and $\vec{x} \in K$ implies $\vec{y} \in U$.

Proof. Suppose the conclusion is not true. Then for any $n$, there are $\vec{x}_n \in K$ and $\vec{y}_n \notin U$, such that $\|\vec{y}_n - \vec{x}_n\| < \frac{1}{n}$. Since $K$ is compact, there is a subsequence $\{\vec{x}_{n_k}\}$ converging to $\vec{z} \in K$. By $\|\vec{y}_n - \vec{x}_n\| < \frac{1}{n}$, we know the subsequence $\{\vec{y}_{n_k}\}$ also converges to $\vec{z}$. Since $\vec{y}_{n_k} \in \mathbb{R}^n - U$ and $\mathbb{R}^n - U$ is closed by Proposition 5.1.8, we get $\vec{z} \in \mathbb{R}^n - U$. This contradicts $\vec{z} \in K \subset U$.

Finally, the Heine-Borel Theorem in Section 1.2.6 can be extended to Euclidean space.

Theorem 5.1.10. Suppose $K$ is a compact subset. Suppose $\{U_i\}$ is a collection of open subsets such that $K \subset \cup U_i$. Then $K \subset U_{i_1} \cup U_{i_2} \cup \dots \cup U_{i_k}$ for finitely many open subsets in the collection.

The extension can be proved for the $L^\infty$-norm by adapting the original Heine-Borel Theorem with little modification. The compact subset $K$ is contained in a bounded rectangle $I = [\alpha_1, \beta_1] \times [\alpha_2, \beta_2] \times \dots \times [\alpha_n, \beta_n]$. By replacing each component interval $[\alpha_i, \beta_i]$ with either $\left[\alpha_i, \frac{\alpha_i + \beta_i}{2}\right]$ or $\left[\frac{\alpha_i + \beta_i}{2}, \beta_i\right]$, the rectangle can be divided into $2^n$ rectangles. If $K$ cannot be covered by finitely many open subsets from $\{U_i\}$, then for one of the $2^n$ rectangles, denoted $I_1$, the intersection $K_1 = K \cap I_1$ cannot be covered by finitely many open subsets from $\{U_i\}$. The rest of the construction and the argument can proceed as before.
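The subdivision step in this argument can be written out concretely. The Python sketch below is only an illustration of how halving each component interval produces $2^n$ sub-rectangles; the function name `bisect_rectangle` and the representation of a rectangle as a list of `(alpha_i, beta_i)` pairs are choices made here, not notation from the text.

```python
from itertools import product

def bisect_rectangle(rect):
    """Split a rectangle, given as a list of (alpha_i, beta_i) pairs,
    into the 2^n rectangles obtained by halving every component interval."""
    halves = []
    for a, b in rect:
        mid = (a + b) / 2
        # Each component interval is replaced by its left or right half.
        halves.append([(a, mid), (mid, b)])
    # One sub-rectangle for every choice of a half in each coordinate.
    return [list(choice) for choice in product(*halves)]

subs = bisect_rectangle([(0.0, 1.0), (0.0, 2.0), (-1.0, 1.0)])
print(len(subs))   # number of sub-rectangles for n = 3
print(subs[0])     # the sub-rectangle using the left half in every coordinate
```

Repeating the step on the chosen sub-rectangle $I_1$ halves every side length again, which is what makes the nested rectangles shrink to a single point in the original one-variable argument.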

Exercise 5.1.24. Prove that if two norms are equivalent, then a subset is open with respect to one norm if and only if it is open with respect to the other norm.

Exercise 5.1.25. Prove that for any norm, the ball $B(\vec{x}, \epsilon)$ is open.

Exercise 5.1.26. Prove that the intersection of finitely many open subsets is open, the union of open subsets is also open, and the product of open subsets is open (use the product norm in Exercise 5.1.5).

Exercise 5.1.27. Suppose $f(x)$ is a continuous function on $\mathbb{R}$. Prove that the subset $\{(x, y) : y < f(x)\}$ is open, with the subset $\{(x, y) : y = f(x)\}$ as the boundary. What if $f(x)$ is not continuous?

5.1.3 Multivariable Function

Multivariable functions may be defined on all kinds of subsets of the Euclidean space. Compared with the single variable case, we often need to pay more attention to the subset on which the function is defined.

A multivariable function $f$ on a subset $A$ may be visualized either by the graph $\{(\vec{x}, f(\vec{x})) : \vec{x} \in A\}$ or the levels $\{\vec{x} \in A : f(\vec{x}) = c\}$.

Figure 5.4: graph and level of $x^2 + y^2$ (left panel: the graph $z = x^2 + y^2$ with axes $x$, $y$, $z$; right panel: the levels $f = 1$, $f = 4$, $f = 9$ in the plane with axes $x$, $y$)

Given a norm, a multivariable function $f(\vec{x}) = f(x_1, x_2, \dots, x_n)$ has limit $l$ at $\vec{a} = (a_1, a_2, \dots, a_n)$ if for any $\epsilon > 0$, there is $\delta > 0$, such that

$$0 < \|\vec{x} - \vec{a}\| < \delta \implies |f(\vec{x}) - l| < \epsilon. \tag{5.1.12}$$

For the $L^\infty$-norm, this means

$$|x_i - a_i| < \delta \text{ for all } i \text{ and } x_i \ne a_i \text{ for some } i \implies |f(x_1, x_2, \dots, x_n) - l| < \epsilon.$$

After we show that all norms are equivalent in Theorem 5.1.12, we will know that the limit (and therefore the continuity) is independent of the choice of the norm.


In general, if a function is defined on a subset $A$, then the limit is denoted as $\lim_{\vec{x} \in A, \vec{x} \to \vec{a}} f(\vec{x}) = l$. If a function is defined for all $\vec{x}$ satisfying $0 < \|\vec{x} - \vec{a}\| < \delta$ for some $\delta$ (i.e., $\vec{x}$ is in the punctured ball $B(\vec{a}, \delta) - \{\vec{a}\}$), then we simply denote $\lim_{\vec{x} \to \vec{a}} f(\vec{x}) = l$ or $\lim_{x_1 \to a_1, \dots, x_n \to a_n} f(x_1, x_2, \dots, x_n) = l$.

If $B \subset A$, then by restricting the definition of the limit from $\vec{x} \in A$ to $\vec{x} \in B$, we find

$$\lim_{\vec{x} \in A, \vec{x} \to \vec{a}} f(\vec{x}) = l \implies \lim_{\vec{x} \in B, \vec{x} \to \vec{a}} f(\vec{x}) = l.$$

In particular, if the restrictions of a function on different subsets give different limits, then the limit diverges.
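The restriction principle can be seen numerically on the function $\frac{xy}{x^2 + y^2}$ of Example 5.1.2. The Python sketch below, with helper names `f` and `along_line` introduced here for illustration, evaluates the function at points approaching the origin along the lines $y = cx$ for a few sample slopes.

```python
def f(x, y):
    """The function xy / (x^2 + y^2), defined away from the origin."""
    return x * y / (x * x + y * y)

def along_line(c, t):
    """Value of f at the point (t, c t) on the line y = c x."""
    return f(t, c * t)

for c in (0.0, 1.0, 2.0):
    # Sample points (t, c t) with t = 10^-1, ..., 10^-5 approaching the origin.
    values = [along_line(c, 10.0 ** -k) for k in range(1, 6)]
    print(c, values)
```

Along each line the values stay at the constant $\frac{c}{1 + c^2}$, so the restrictions to lines of different slopes have different limits. Numerical sampling of this kind can only suggest divergence; the proof is the algebraic identity $f(x, cx) = \frac{c}{1 + c^2}$.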

The limits of multivariable functions have all the usual properties as before. We may also define various variations at $\infty$ by replacing $\|\vec{x} - \vec{a}\| < \delta$ with $\|\vec{x}\| > N$ or replacing $|f(\vec{x}) - l| < \epsilon$ by $|f(\vec{x})| > b$.

Example 5.1.1. Consider the function $f(x, y) = \frac{xy(x^2 - y^2)}{x^2 + y^2}$ defined for $(x, y) \ne (0, 0)$. Since $|x^2 - y^2| \le x^2 + y^2$, we have $|f(x, y)| \le |xy|$, and we find $\|(x, y)\|_\infty < \delta = \sqrt{\epsilon}$ implies $|f(x, y)| \le |xy| < \delta^2 = \epsilon$. Therefore we have $\lim_{x, y \to 0} f(x, y) = 0$ in the $L^\infty$-norm. By the equivalence between the $L^p$-norms, the limit also holds for other $L^p$-norms.
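The choice $\delta = \sqrt{\epsilon}$ in Example 5.1.1 can be spot-checked by random sampling. The following Python sketch is an illustration only and proves nothing; the function `check`, the number of trials and the fixed seed are assumptions made here. It draws points with $\|(x, y)\|_\infty \le \sqrt{\epsilon}$ and reports whether $|f(x, y)| < \epsilon$ held at every sampled point.

```python
import math
import random

def f(x, y):
    """The function xy(x^2 - y^2) / (x^2 + y^2) of Example 5.1.1."""
    return x * y * (x * x - y * y) / (x * x + y * y)

def check(eps, trials=10000, seed=0):
    """Return True if |f| < eps at all sampled points of the square of half-width sqrt(eps)."""
    rng = random.Random(seed)
    delta = math.sqrt(eps)
    for _ in range(trials):
        x = rng.uniform(-delta, delta)
        y = rng.uniform(-delta, delta)
        if x * x + y * y == 0.0:
            continue  # f is not defined at the origin
        if abs(f(x, y)) >= eps:
            return False
    return True

print(check(1e-2), check(1e-6))
```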

Example 5.1.2. Consider the function $f(x, y) = \frac{xy}{x^2 + y^2}$ defined for $(x, y) \ne (0, 0)$. We have $f(x, cx) = \frac{c}{1 + c^2}$, so that $\lim_{y = cx, (x, y) \to (0, 0)} f(x, y) = \frac{c}{1 + c^2}$. Thus the restriction of the function to straight lines of different slopes gives different limits. We conclude that the function diverges at $(0, 0)$.

Exercise 5.1.28. Describe the graphs and the levels of functions.

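1. $ax + by$.

2. $ax + by + cz$.

3. $\frac{(x - x_0)^2}{a^2} + \frac{(y - y_0)^2}{b^2}$.

4. $\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2}$.

5. $\frac{(x - x_0)^2}{a^2} - \frac{(y - x_0)^2}{b^2}$.

6. $\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2}$.

7. $\frac{x^2}{a^2} - \frac{y^2}{b^2} - \frac{z^2}{c^2}$.

8. $(x^2 + y^2)^2$.

9. $|x|^p + |y|^p$.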

10. xy.

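Exercise 5.1.29. What is the relation between $\lim_{\vec{x} \in A, \vec{x} \to \vec{a}} f(\vec{x})$, $\lim_{\vec{x} \in B, \vec{x} \to \vec{a}} f(\vec{x})$, $\lim_{\vec{x} \in A \cup B, \vec{x} \to \vec{a}} f(\vec{x})$, and $\lim_{\vec{x} \in A \cap B, \vec{x} \to \vec{a}} f(\vec{x})$?

Exercise 5.1.30. Prove that $\lim_{\vec{x} \to \vec{a}} f(\vec{x}) = l$ if and only if $\lim_{n\to\infty} f(\vec{x}_n) = l$ for any sequence $\{\vec{x}_n\}$ satisfying $\vec{x}_n \ne \vec{a}$ and $\lim_{n\to\infty} \vec{x}_n = \vec{a}$.

Exercise 5.1.31. For a subset $A \subset \mathbb{R}^n$, the characteristic function is

$$\chi_A(\vec{x}) = \begin{cases} 1 & \text{if } \vec{x} \in A \\ 0 & \text{if } \vec{x} \notin A \end{cases}$$

Find the condition for $\lim_{\vec{x} \to \vec{a}} \chi_A(\vec{x})$ to exist.

Exercise 5.1.32. Find the condition for $\lim_{x, y \to 0^+} \frac{x^p y^q}{(x^m + y^n)^k} = 0$, where all the parameters are positive. Extend the discussion to more variables.

Exercise 5.1.33. Compute the convergent limits. All parameters are positive.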


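1. $\lim_{x \to 1, y \to 1} \frac{1}{x - y}$.

2. $\lim_{x \to 0, y \to 0, z \to 0} \frac{xyz}{x^2 + y^2 + z^2}$.

3. $\lim_{x \to 0, y \to 0} (x - y) \sin\frac{1}{x^2 + y^2}$.

4. $\lim_{x \to \infty, y \to \infty} (x - y) \sin\frac{1}{x^2 + y^2}$.

5. $\lim_{x \to \infty, y \to \infty} \frac{(x^2 + y^2)^p}{(x^4 + y^4)^q}$.

6. $\lim_{x \to 0, y \to 0, 0 < x < y^2} \frac{x^p y}{x^2 + y^2}$.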

7. limx→∞,y→∞,ax≤y≤bx1xy

.

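8. $\lim_{x \to 0^+, y \to 0^+} (x + y)^{xy}$.

9. $\lim_{x \to \infty, y \to 0} \left(1 + \frac{1}{x}\right)^{\frac{x^2}{x + y}}$.

10. $\lim_{x \to +\infty, y \to +\infty} \left(1 + \frac{1}{x}\right)^{\frac{x^2}{x + y}}$.

11. $\lim_{x \to +\infty, y \to +\infty} \frac{x^2 + y^2}{e^{x + y}}$.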

5.1.4 Continuous Function

A multivariable function $f(\vec{x})$ is continuous at $\vec{a}$ if it is defined at $\vec{a}$ and $\lim_{\vec{x}\to\vec{a}} f(\vec{x}) = f(\vec{a})$. For the $L^\infty$-norm, this means that for any $\epsilon > 0$, there is $\delta > 0$, such that

$$|x_i - a_i| < \delta \text{ for all } i \implies |f(x_1, x_2, \dots, x_n) - f(a_1, a_2, \dots, a_n)| < \epsilon.$$

Like the single variable case, it is easy to see that the arithmetic combinations of continuous functions are continuous, and the exponentials of continuous functions are continuous.

A function $f(\vec{x})$ on $A$ is uniformly continuous if for any $\epsilon > 0$, there is $\delta > 0$, such that

$$\vec{x}, \vec{y} \in A,\ \|\vec{x} - \vec{y}\| < \delta \implies |f(\vec{x}) - f(\vec{y})| < \epsilon. \tag{5.1.13}$$

The following extension of Theorem 1.4.5 is proved for any norm.

Theorem 5.1.11. A continuous function on a compact subset is bounded, uniformly continuous, and reaches its maximum and minimum.

Proof. Suppose $f(\vec{x})$ is continuous on a compact subset $K$. If $f(\vec{x})$ is not bounded, then there is a sequence $\{\vec{x}_n\}$ in $K$ such that $\lim_{n\to\infty} f(\vec{x}_n) = \infty$. Since $K$ is compact, there is a subsequence $\{\vec{x}_{n_k}\}$ converging to $\vec{a} \in K$. By the continuity of $f(\vec{x})$ on $K$, we have $\lim_{k\to\infty} f(\vec{x}_{n_k}) = f(\vec{a})$. This contradicts $\lim_{n\to\infty} f(\vec{x}_n) = \infty$.

Since the function is bounded, we may let $\beta = \sup_{\vec{x}\in K} f(\vec{x})$. There is a sequence $\{\vec{x}_n\}$ in $K$ such that $\lim_{n\to\infty} f(\vec{x}_n) = \beta$. By an argument similar to the one before, there is a subsequence converging to $\vec{a} \in K$, such that $f(\vec{a}) = \lim_{k\to\infty} f(\vec{x}_{n_k}) = \beta$. Thus the maximum is reached at $\vec{a}$. The minimum is treated in the same way.

The proof of Theorem 1.4.4 can be adapted to prove the uniform continuity.

Exercise 5.1.34. Prove that any norm is a continuous function with respect to itself. Is the continuity uniform?


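Exercise 5.1.35. Prove that if $f(\vec{x})$ and $g(\vec{x})$ are continuous at $\vec{x}_0$, and $h(u, v)$ is continuous at $(f(\vec{x}_0), g(\vec{x}_0))$ with respect to the $L^\infty$-norm on $\mathbb{R}^2$, then $h(f(\vec{x}), g(\vec{x}))$ is continuous at $\vec{x}_0$.

Exercise 5.1.36. Prove that if $f(x, y)$ is monotone and continuous in $x$ and in $y$, then $f(x, y)$ is continuous with respect to the $L^\infty$-norm. What if the function is monotone and continuous in $x$ but is only continuous in $y$?

Exercise 5.1.37. Study the continuity.

1. $\begin{cases} \frac{\sin xy}{x} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0 \end{cases}$.

2. $\begin{cases} \frac{\sin xy}{|x| + |y|} & \text{if } (x, y) \ne (0, 0) \\ 0 & \text{if } (x, y) = (0, 0) \end{cases}$.

3. $\begin{cases} y & \text{if } x \text{ is rational} \\ -y & \text{if } x \text{ is irrational} \end{cases}$.

4. $\begin{cases} x(x^2 + y^2)^p & \text{if } y \ne 0 \\ 0 & \text{if } y = 0 \end{cases}$.

5. $\begin{cases} y \log(x^2 + y^2) & \text{if } (x, y) \ne (0, 0) \\ 0 & \text{if } (x, y) = (0, 0) \end{cases}$.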

6.

{exy if y 6= 0

0 if y = 0.

Exercise 5.1.38. Suppose $\phi(x)$ is uniformly continuous. Is $f(x, y) = \phi(x)$ uniformly continuous with respect to the $L^\infty$-norm?

Exercise 5.1.39. Prove that with respect to the $L^\infty$-norm, any uniformly continuous function on a bounded subset is bounded.

Exercise 5.1.40. Prove that the sum of uniformly continuous functions is continuous. What about the product?

By using Theorem 5.1.11, we can establish the following result, which allows us to freely use any norm in the discussion of vectors and multivariable functions. For example, a subset is compact with respect to any norm if and only if it is closed and bounded.

Theorem 5.1.12. All norms on a finite dimensional vector space are equivalent.

Proof. We prove the claim for norms on a Euclidean space. Since linear algebra tells us that any finite dimensional vector space is isomorphic to a Euclidean space, the result applies to any finite dimensional vector space.

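We compare any norm $\|\vec{x}\|$ with the $L^\infty$-norm $\|\vec{x}\|_\infty$. Let $\vec{e}_1 = (1, 0, \dots, 0)$, $\vec{e}_2 = (0, 1, \dots, 0)$, ..., $\vec{e}_n = (0, 0, \dots, 1)$ be the standard basis of $\mathbb{R}^n$. By the conditions for norm, we have

$$\begin{aligned}
\|\vec{x}\| &= \|x_1\vec{e}_1 + x_2\vec{e}_2 + \cdots + x_n\vec{e}_n\| \\
&\le |x_1|\|\vec{e}_1\| + |x_2|\|\vec{e}_2\| + \cdots + |x_n|\|\vec{e}_n\| \\
&\le (\|\vec{e}_1\| + \|\vec{e}_2\| + \cdots + \|\vec{e}_n\|) \max\{|x_1|, |x_2|, \dots, |x_n|\} \\
&= (\|\vec{e}_1\| + \|\vec{e}_2\| + \cdots + \|\vec{e}_n\|)\|\vec{x}\|_\infty = c_1\|\vec{x}\|_\infty.
\end{aligned}$$

We remark that this implies (see Exercise 5.1.7)

$$\|\vec{x} - \vec{a}\|_\infty \le \delta \implies \big|\|\vec{x}\| - \|\vec{a}\|\big| \le \|\vec{x} - \vec{a}\| \le c_1\delta.$$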


Based on this, it is easy to see that the function $\|\vec{x}\|$ is continuous with respect to the $L^\infty$-norm.

It remains to prove that $\|\vec{x}\| \ge c_2\|\vec{x}\|_\infty$ for some $c_2 > 0$. The inequality implies

$$\|\vec{x}\|_\infty = 1 \implies \|\vec{x}\| \ge c_2. \tag{5.1.14}$$

Conversely, suppose the implication (5.1.14) holds. For any $\vec{x}$, we write $\vec{x} = r\vec{u}$ with $r = \|\vec{x}\|_\infty$ and $\|\vec{u}\|_\infty = 1$ (see Exercise 5.1.6). Then

$$\|\vec{x}\| = \|r\vec{u}\| = r\|\vec{u}\| \ge rc_2 = c_2\|\vec{x}\|_\infty.$$

Therefore we conclude the inequality $\|\vec{x}\| \ge c_2\|\vec{x}\|_\infty$ is equivalent to the implication (5.1.14).

The subset $K = \{\vec{x} : \|\vec{x}\|_\infty = 1\}$ is bounded and closed with respect to the $L^\infty$-norm (see Exercise 5.1.15). By Proposition 5.1.3, which was fully proved for the $L^\infty$-norm, $K$ is compact with respect to the $L^\infty$-norm. Then by Theorem 5.1.11, the continuity of the function $\|\vec{x}\|$ on $K$ with respect to the $L^\infty$-norm implies that $\|\vec{x}\|$ reaches its minimum on $K$ at some $\vec{a} \in K$. Then we conclude the implication (5.1.14) holds for $c_2 = \|\vec{a}\|$, with

$$\vec{a} \in K \implies \|\vec{a}\|_\infty = 1 \implies \vec{a} \ne \vec{0} \implies c_2 = \|\vec{a}\| > 0.$$
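The two constants in the proof can be made concrete in a small numerical sketch. The choice of the Euclidean norm on $\mathbb{R}^3$ as "any norm" and the helper names are assumptions made only for illustration. In this case $c_1 = \|\vec{e}_1\| + \|\vec{e}_2\| + \|\vec{e}_3\| = 3$, and the minimum of the Euclidean norm on $K$ is $1$, reached at the standard basis vectors.

```python
import math
import random

def norm_inf(x):
    """L-infinity norm: the largest absolute coordinate."""
    return max(abs(t) for t in x)

def norm_2(x):
    """Euclidean norm, playing the role of the arbitrary norm in the proof."""
    return math.sqrt(sum(t * t for t in x))

n = 3
basis = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

# c1 from the first half of the proof: the sum of the norms of the basis vectors.
c1 = sum(norm_2(e) for e in basis)

# c2 from the second half: the minimum of the norm on K = {x : norm_inf(x) = 1}.
# For the Euclidean norm this minimum is 1, reached at the basis vectors.
c2 = 1.0

def equivalence_holds(samples=1000, seed=0):
    """Check c2 * |x|_inf <= |x|_2 <= c1 * |x|_inf on random sample vectors."""
    rng = random.Random(seed)
    for _ in range(samples):
        x = [rng.uniform(-5.0, 5.0) for _ in range(n)]
        lower, middle, upper = c2 * norm_inf(x), norm_2(x), c1 * norm_inf(x)
        if not (lower <= middle + 1e-12 and middle <= upper + 1e-12):
            return False
    return True

print(c1, equivalence_holds())
```

The constant $c_1 = 3$ is cruder than the sharp value $\sqrt{3}$ for this norm, which is consistent with the proof: it only needs some constant, not the best one.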

5.1.5 Multivariable Map

Multivariable analysis is not restricted to multivariable functions. We may also consider maps from one multivariable to another multivariable. These are the maps between Euclidean spaces.

A map from $A \subset \mathbb{R}^n$ to $\mathbb{R}^m$ has $m$-dimensional vectors as values. Instead of using many arrows, we denote the map by

$$F(\vec{x}) = (f_1(\vec{x}), f_2(\vec{x}), \dots, f_m(\vec{x})) : A \subset \mathbb{R}^n \to \mathbb{R}^m.$$

Two maps into the same Euclidean space may be added. A map into a Euclidean space can be multiplied by a number. If $F : A \subset \mathbb{R}^n \to \mathbb{R}^m$ and $G : B \subset \mathbb{R}^k \to \mathbb{R}^n$ are maps such that $G(\vec{x}) \in A$ for any $\vec{x} \in B$, then we have the composition $(F \circ G)(\vec{x}) = F(G(\vec{x})) : B \subset \mathbb{R}^k \to \mathbb{R}^m$.

The map converges to $\vec{l} \in \mathbb{R}^m$ at $\vec{a} \in \mathbb{R}^n$, denoted $\lim_{\vec{x}\to\vec{a}} F(\vec{x}) = \vec{l}$, if for any $\epsilon > 0$, there is $\delta > 0$, such that

$$0 < \|\vec{x} - \vec{a}\| < \delta \implies \|F(\vec{x}) - \vec{l}\| < \epsilon. \tag{5.1.15}$$

By using the $L^\infty$-norm, it is easy to see that

$$\lim_{\vec{x}\to\vec{a}} F(\vec{x}) = \vec{l} \iff \lim_{\vec{x}\to\vec{a}} f_i(\vec{x}) = l_i \text{ for all } i. \tag{5.1.16}$$

The property also holds for any norm because any norm is equivalent to the $L^\infty$-norm.


The map is continuous at $\vec{a}$ if $\lim_{\vec{x}\to\vec{a}} F(\vec{x}) = F(\vec{a})$. By (5.1.16), a map is continuous if and only if its coordinate functions are continuous.

Many properties can be extended from functions to maps. For example, the sum and the composition of continuous maps are still continuous. The scalar product of a continuous function and a continuous map is a continuous map. The dot product of two continuous maps is a continuous function.

Some special cases of maps can be visualized in various ways. For example, a function is a map to $\mathbb{R}$ and can be visualized by its graph or its levels. On the other hand, a (parametrized) curve (or a path) in $\mathbb{R}^n$ is a continuous map

$$\phi(t) = (x_1(t), x_2(t), \dots, x_n(t)) : [a, b] \to \mathbb{R}^n.$$

The continuity means that each coordinate function $x_i(t)$ is continuous. For example, the straight line passing through $\vec{a}$ and $\vec{b}$ is

$$\phi(t) = (1 - t)\vec{a} + t\vec{b} = \vec{a} + t(\vec{b} - \vec{a}), \tag{5.1.17}$$

and the unit circle on the plane is

$$\phi(t) = (\cos t, \sin t) : [0, 2\pi] \to \mathbb{R}^2. \tag{5.1.18}$$

We say the path connects $\phi(a)$ to $\phi(b)$. A subset $A \subset \mathbb{R}^n$ is path connected if any two points in $A$ are connected by a path in $A$ (i.e., $\phi(t) \in A$ for any $t \in [a, b]$).

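Similar to curves, a map $\mathbb{R}^2 \to \mathbb{R}^n$ may be considered as a (parametrized) surface. For example, the sphere in $\mathbb{R}^3$ may be parametrized by

$$\sigma(\phi, \theta) = (\sin\phi \cos\theta, \sin\phi \sin\theta, \cos\phi) : [0, \pi] \times [0, 2\pi] \to \mathbb{R}^3, \tag{5.1.19}$$

and the torus by ($a > b > 0$)

$$\sigma(\phi, \theta) = ((a + b\cos\phi)\cos\theta, (a + b\cos\phi)\sin\theta, b\sin\phi) : [0, 2\pi] \times [0, 2\pi] \to \mathbb{R}^3. \tag{5.1.20}$$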
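As a quick sanity check of the parametrizations (5.1.19) and (5.1.20), the following sketch evaluates them on a grid and verifies the implicit equations $x^2 + y^2 + z^2 = 1$ for the sphere and $(\sqrt{x^2 + y^2} - a)^2 + z^2 = b^2$ for the torus. The function names and the sample values $a = 2$, $b = 1$ are assumptions chosen for illustration.

```python
import math

def sphere(phi, theta):
    """Parametrization (5.1.19) of the unit sphere."""
    return (math.sin(phi) * math.cos(theta),
            math.sin(phi) * math.sin(theta),
            math.cos(phi))

def torus(phi, theta, a=2.0, b=1.0):
    """Parametrization (5.1.20) of the torus, assuming a > b > 0."""
    r = a + b * math.cos(phi)
    return (r * math.cos(theta), r * math.sin(theta), b * math.sin(phi))

def sphere_residual(p):
    """x^2 + y^2 + z^2 - 1, which vanishes exactly on the unit sphere."""
    x, y, z = p
    return x * x + y * y + z * z - 1.0

def torus_residual(p, a=2.0, b=1.0):
    """(sqrt(x^2 + y^2) - a)^2 + z^2 - b^2, which vanishes on the torus."""
    x, y, z = p
    return (math.hypot(x, y) - a) ** 2 + z * z - b * b

steps = 24
worst_sphere = max(
    abs(sphere_residual(sphere(math.pi * i / steps, 2 * math.pi * j / steps)))
    for i in range(steps + 1) for j in range(steps + 1))
worst_torus = max(
    abs(torus_residual(torus(2 * math.pi * i / steps, 2 * math.pi * j / steps)))
    for i in range(steps + 1) for j in range(steps + 1))

print(worst_sphere, worst_torus)
```

Both residuals stay at the level of floating point rounding, which is what the identities $\sin^2 + \cos^2 = 1$ predict.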

A map $\mathbb{R}^n \to \mathbb{R}^n$ may be considered as a change of variable, or a transform, or a vector field. For example,

$$(x, y) = (r\cos\theta, r\sin\theta) : [0, \infty) \times \mathbb{R} \to \mathbb{R}^2$$

is the polar coordinate. Moreover,

$$R_\theta(x, y) = (x\cos\theta - y\sin\theta, x\sin\theta + y\cos\theta) : \mathbb{R}^2 \to \mathbb{R}^2$$

transforms the plane by rotation of angle $\theta$, and

$$\vec{x} \mapsto \vec{a} + \vec{x} : \mathbb{R}^n \to \mathbb{R}^n$$

shifts the whole Euclidean space by $\vec{a}$. A vector field assigns an arrow to a point. For example,

$$F(\vec{x}) = \vec{x} : \mathbb{R}^n \to \mathbb{R}^n$$

is a vector field in the radial direction, while

$$F(x, y) = (y, -x) : \mathbb{R}^2 \to \mathbb{R}^2$$

is a clockwise rotating vector field (at the point $(1, 0)$ the arrow is $(0, -1)$), just like the water flow in the sink.
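The rotation $R_\theta$ and the shift can be tried out directly. The sketch below (function names are hypothetical) checks two facts that follow from the formulas above: a rotation preserves the Euclidean length, and rotating by $\beta$ and then by $\alpha$ is the same as rotating by $\alpha + \beta$.

```python
import math

def rotate(theta, p):
    """R_theta(x, y) = (x cos(theta) - y sin(theta), x sin(theta) + y cos(theta))."""
    x, y = p
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))

def shift(a, p):
    """The shift x -> a + x."""
    return tuple(ai + pi for ai, pi in zip(a, p))

def length(p):
    """Euclidean length of a plane vector."""
    return math.hypot(*p)

p = (3.0, 4.0)
quarter_turn = rotate(math.pi / 2, p)       # approximately (-4, 3)
twice = rotate(0.3, rotate(0.5, p))         # rotate by 0.5, then by 0.3
once = rotate(0.8, p)                       # rotate by 0.8 in one step

print(quarter_turn, length(quarter_turn))
print(twice, once)
print(shift((1.0, 1.0), p))
```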


[Figure 5.5: rotation and shifting. Labels: $\vec{a}$, $\vec{b}$ and their rotations $R_\theta(\vec{a})$, $R_\theta(\vec{b})$ by the angle $\theta$; a subset $A$ with points $\vec{x}$, $\vec{y}$, and its shift $\vec{a} + A$ with points $\vec{a} + \vec{x}$, $\vec{a} + \vec{y}$.]

[Figure 5.6: radial and rotational vector fields]

Example 5.1.3. By direct argument, it is easy to see that the product function $\mu(x, y) = xy : \mathbb{R}^2 \to \mathbb{R}$ is continuous. Suppose $f(\vec{x})$ and $g(\vec{x})$ are continuous. Then $F(\vec{x}) = (f(\vec{x}), g(\vec{x})) : \mathbb{R}^n \to \mathbb{R}^2$ is also continuous because each coordinate function is continuous. The composition $(\mu \circ F)(\vec{x}) = f(\vec{x})g(\vec{x})$ of the two continuous maps is also continuous. Therefore we conclude that the product of continuous functions is continuous.

Exercise 5.1.41. Describe the maps in suitable ways.

1. $F(x) = (\cos x, \sin x, x)$.

2. $F(x, y) = (x, y, x + y)$.

3. $F(x, y, z) = (y, z, x)$.

4. $F(x, y, z) = (y, -x, z)$.

5. $F(x, y) = (x^2, y^2)$.

6. $F(x, y) = (x^2 - y^2, 2xy)$.

7. $F(\vec{x}) = 2\vec{x}$.

8. $F(\vec{x}) = \vec{a} - \vec{x}$.

The meaning of the sixth map can be seen through the polar coordinate.

Exercise 5.1.42. Prove that the composition of two continuous maps is continuous.

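Exercise 5.1.43. Prove that the addition $(\vec{x}, \vec{y}) \mapsto \vec{x} + \vec{y} : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ and the scalar multiplication $(c, \vec{x}) \mapsto c\vec{x} : \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}^n$ are continuous.

The following extends the similar Propositions 1.3.8 and 1.4.3 for single variable functions. The same proof applies here.

Proposition 5.1.13. For a map $F(\vec{x})$, $\lim_{\vec{x}\to\vec{a}} F(\vec{x}) = \vec{l}$ if and only if $\lim_{n\to\infty} F(\vec{x}_n) = \vec{l}$ for any sequence $\{\vec{x}_n\}$ satisfying $\vec{x}_n \ne \vec{a}$ and $\lim_{n\to\infty} \vec{x}_n = \vec{a}$. In particular, $F$ is continuous at $\vec{a}$ if and only if $\lim_{n\to\infty} F(\vec{x}_n) = F(\vec{a})$ for any sequence $\{\vec{x}_n\}$ satisfying $\lim_{n\to\infty} \vec{x}_n = \vec{a}$.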


Theorem 5.1.11 on the boundedness and the uniform continuity can also be extended. The statement about the maximum and minimum is meaningless for maps.

Theorem 5.1.14. Suppose $F(\vec{x})$ is a continuous map on a compact subset $K$. Then $F(\vec{x})$ is bounded and uniformly continuous: For any $\epsilon > 0$, there is $\delta > 0$, such that

$$\vec{x}, \vec{y} \in K,\ \|\vec{x} - \vec{y}\| < \delta \implies \|F(\vec{x}) - F(\vec{y})\| < \epsilon. \tag{5.1.21}$$

We do not expect the intermediate value theorem to extend to multivariable functions in general because the theorem makes critical use of the interval. For the multivariable case, the interval may be replaced by the path connected condition.

Theorem 5.1.15. Suppose $f(\vec{x})$ is a continuous function on a path connected subset $A$. Then for any $\vec{a}, \vec{b} \in A$ and $y$ between $f(\vec{a})$ and $f(\vec{b})$, there is $\vec{c} \in A$, such that $f(\vec{c}) = y$.

Proof. Since $A$ is path connected, there is a continuous path $\phi(t) : [a, b] \to A$ such that $\phi(a) = \vec{a}$ and $\phi(b) = \vec{b}$. The composition $g(t) = f(\phi(t))$ is then a continuous function for $t \in [a, b]$, and $y$ is between $g(a) = f(\vec{a})$ and $g(b) = f(\vec{b})$. By Theorem 1.4.6, there is $c \in [a, b]$, such that $g(c) = y$. The conclusion is the same as $f(\vec{c}) = y$ for $\vec{c} = \phi(c)$.
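The proof reduces the multivariable statement to the single variable function $g(t) = f(\phi(t))$. The sketch below mirrors that reduction numerically: it uses bisection on $g$ (a numerical stand-in for Theorem 1.4.6, not the argument of the text) along the straight line (5.1.17). The function $f(x, y) = x^2 + y^2$, the endpoints, and all helper names are assumptions for illustration.

```python
def line_path(a, b):
    """The straight line phi(t) = (1 - t) a + t b for t in [0, 1]."""
    def phi(t):
        return tuple((1 - t) * ai + t * bi for ai, bi in zip(a, b))
    return phi

def intermediate_point(f, phi, y, lo=0.0, hi=1.0, tol=1e-12):
    """Find c = phi(t) with f(c) close to y by bisection on g(t) = f(phi(t)) - y."""
    g = lambda t: f(phi(t)) - y
    g_lo, g_hi = g(lo), g(hi)
    if g_lo * g_hi > 0:
        raise ValueError("y is not between the values at the two endpoints")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = g(mid)
        if g_lo * g_mid <= 0:
            hi = mid                 # a sign change remains in [lo, mid]
        else:
            lo, g_lo = mid, g_mid    # otherwise it is in [mid, hi]
    return phi((lo + hi) / 2)

f = lambda p: p[0] ** 2 + p[1] ** 2
phi = line_path((0.0, 0.0), (3.0, 4.0))   # f = 0 at the start, f = 25 at the end
c = intermediate_point(f, phi, 10.0)

print(c, f(c))
```

The point found lies on the segment from $(0, 0)$ to $(3, 4)$, so it belongs to any path connected set containing that segment, exactly as in the proof.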

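We cannot talk about multivariable monotone maps. Thus the only part of Theorem 1.4.8 that can be extended is the continuity.

Theorem 5.1.16. Suppose $F : K \subset \mathbb{R}^n \to \mathbb{R}^m$ is a one-to-one and continuous map on a compact set $K$. Then the inverse map $F^{-1} : F(K) \subset \mathbb{R}^m \to \mathbb{R}^n$ is continuous.

Proof. We claim that $F^{-1}$ is in fact uniformly continuous. Suppose it is not uniformly continuous. Then there are $\epsilon > 0$ and $\vec{\xi}_k = F(\vec{x}_k)$, $\vec{\eta}_k = F(\vec{y}_k) \in F(K)$, such that $\|\vec{\xi}_k - \vec{\eta}_k\| \to 0$ as $k \to \infty$ and $\|F^{-1}(\vec{\xi}_k) - F^{-1}(\vec{\eta}_k)\| \ge \epsilon$. By the compactness of $K$, we can find $k_p$, such that both $\lim_{p\to\infty} \vec{x}_{k_p} = \vec{a}$ and $\lim_{p\to\infty} \vec{y}_{k_p} = \vec{b}$ exist. Then by the continuity of $F$, we have $\lim_{p\to\infty} \vec{\xi}_{k_p} = \lim_{p\to\infty} F(\vec{x}_{k_p}) = F(\vec{a})$ and $\lim_{p\to\infty} \vec{\eta}_{k_p} = \lim_{p\to\infty} F(\vec{y}_{k_p}) = F(\vec{b})$. We conclude that $\|F(\vec{a}) - F(\vec{b})\| = \lim_{p\to\infty} \|\vec{\xi}_{k_p} - \vec{\eta}_{k_p}\| = 0$, so that $F(\vec{a}) = F(\vec{b})$. Since $F$ is one-to-one, we get $\vec{a} = \vec{b}$. This is in contradiction with $\lim_{p\to\infty} \vec{x}_{k_p} = \vec{a}$, $\lim_{p\to\infty} \vec{y}_{k_p} = \vec{b}$, and $\|\vec{x}_k - \vec{y}_k\| = \|F^{-1}(\vec{\xi}_k) - F^{-1}(\vec{\eta}_k)\| \ge \epsilon$.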

Exercise 5.1.44. Prove the following are equivalent for a map $F$.

1. The map is continuous.

2. The preimage $F^{-1}(C) = \{\vec{x} : F(\vec{x}) \in C\} \subset \mathbb{R}^n$ of any closed subset $C \subset \mathbb{R}^m$ is closed.

3. The preimage $F^{-1}(U)$ of any open subset $U \subset \mathbb{R}^m$ is open.


4. The preimage $F^{-1}(B(\vec{b}, \epsilon))$ of any open ball $B(\vec{b}, \epsilon)$ is open.

Exercise 5.1.45. Prove the following are equivalent for a map $F$.

1. The map is continuous.

2. $F(\bar{A}) \subset \overline{F(A)}$ for any subset $A$.

3. $F^{-1}(\bar{B}) \supset \overline{F^{-1}(B)}$ for any subset $B$.

Exercise 5.1.46. Suppose $F$ is a continuous map on a compact subset $K$. Prove that the image subset $F(K) = \{F(\vec{x}) : \vec{x} \in K\}$ is compact.

Exercise 5.1.47. A map $F(\vec{x})$ is Lipschitz if $\|F(\vec{x}) - F(\vec{y})\| \le L\|\vec{x} - \vec{y}\|$ for some constant $L$. Prove that Lipschitz maps are uniformly continuous.

5.1.6 Exercise

Exercise 5.1.48. Suppose $f(x)$ has continuous derivative. Prove that

$$\lim_{(x,y)\to(a,a)} \frac{f(x) - f(y)}{x - y} = f'(a).$$

What if the continuity condition is dropped? Is there a similar conclusion for the second order derivative?

Exercise 5.1.49. Prove that $F : \mathbb{R}^n \to \mathbb{R}^m$ is continuous if and only if $F \cdot \vec{a}$ is a continuous function for any $\vec{a} \in \mathbb{R}^m$.

Repeated Extreme

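Exercise 5.1.50. For a function $f(\vec{x}, \vec{y})$ on $A \times B$, prove that

$$\inf_{\vec{y}\in B} \sup_{\vec{x}\in A} f(\vec{x}, \vec{y}) \ge \sup_{\vec{x}\in A} \inf_{\vec{y}\in B} f(\vec{x}, \vec{y}) \ge \inf_{\vec{x}\in A} \inf_{\vec{y}\in B} f(\vec{x}, \vec{y}) = \inf_{\vec{x}\in A,\,\vec{y}\in B} f(\vec{x}, \vec{y}).$$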
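Before attempting the proof, the chain of inequalities can be tested on finite sets, where infimum and supremum become minimum and maximum. The sketch below uses the hypothetical example $f(x, y) = (x - y)^2$ on $A = B = \{0, 0.5, 1\}$; the helper names are likewise assumptions.

```python
def inf_sup(f, A, B):
    """inf over y in B of sup over x in A (min and max on finite sets)."""
    return min(max(f(x, y) for x in A) for y in B)

def sup_inf(f, A, B):
    """sup over x in A of inf over y in B."""
    return max(min(f(x, y) for y in B) for x in A)

def inf_inf(f, A, B):
    """inf over all pairs (x, y) in A x B."""
    return min(f(x, y) for x in A for y in B)

A = B = [0.0, 0.5, 1.0]
f = lambda x, y: (x - y) ** 2

values = (inf_sup(f, A, B), sup_inf(f, A, B), inf_inf(f, A, B))
print(values)
```

For this example the first inequality is strict: the worst case over $x$ is at least $0.25$ for every $y$, while for every $x$ the choice $y = x$ gives $0$.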

Exercise 5.1.51. For any $a \ge b_1 \ge c_1 \ge d$ and $a \ge b_2 \ge c_2 \ge d$, can you construct a function $f(x, y)$ on $[0, 1] \times [0, 1]$ such that

$$\sup_{\vec{x}\in A,\,\vec{y}\in B} f = a, \quad \inf_{\vec{x}\in A,\,\vec{y}\in B} f = d,$$

$$\inf_{\vec{y}\in B} \sup_{\vec{x}\in A} f = b_1, \quad \sup_{\vec{x}\in A} \inf_{\vec{y}\in B} f = c_1,$$

$$\inf_{\vec{x}\in A} \sup_{\vec{y}\in B} f = b_2, \quad \sup_{\vec{y}\in B} \inf_{\vec{x}\in A} f = c_2?$$

Exercise 5.1.52. Suppose $f(x, y)$ is a function on $[0, 1] \times [0, 1]$. Prove that if $f(x, y)$ is increasing in $x$, then $\inf_{y\in[0,1]} \sup_{x\in[0,1]} f(x, y) = \sup_{x\in[0,1]} \inf_{y\in[0,1]} f(x, y)$. Can the interval $[0, 1]$ be changed to $(0, 1)$?

Exercise 5.1.53. Extend the discussion to three or more repeated extremes. For example, can you give a simple criterion for comparing two strings of repeated extremes?

Repeated Limit

Exercise 5.1.54. Study the limit $\lim_{x\to 0,\,y\to 0}$ and the repeated limits $\lim_{x\to 0} \lim_{y\to 0}$, $\lim_{y\to 0} \lim_{x\to 0}$ for the following functions.


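1. $\frac{x - y + x^2 + y^2}{x + y}$.

2. $x \sin\frac{1}{x} + y \cos\frac{1}{x}$.

3. $\frac{|x|^\alpha |y|^\beta}{(x^2 + y^2)^\gamma}$.

4. $\frac{x^2 y^2}{x^3 + y^3}$.

5. $(x + y) \sin\frac{1}{x} \sin\frac{1}{y}$.

6. $\frac{e^x - e^y}{\sin xy}$.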

Exercise 5.1.55. Establish a concept of the uniform convergence and find the condition for the commutativity of the repeated limits such as $\lim_{x\to a} \lim_{y\to b} f(x, y) = \lim_{y\to b} \lim_{x\to a} f(x, y)$, similar to Proposition 4.2.4.

Exercise 5.1.56. Prove that if $\lim_{x\to a,\,y\to b} f(x, y)$ exists and $\lim_{y\to b} f(x, y) = g(x)$ exists for each $x$, then $\lim_{x\to a,\,y\to b} f(x, y) = \lim_{x\to a} \lim_{y\to b} f(x, y)$.

Exercise 5.1.57. Construct a function $f(x, y)$ defined for $(x, y) \ne (0, 0)$, such that the repeated limits $\lim_{x\to 0} \lim_{y\to 0} f(x, y)$ and $\lim_{y\to 0} \lim_{x\to 0} f(x, y)$ exist and are equal, but $\lim_{x\to 0,\,y\to 0} f(x, y)$ does not converge. What other situations are possible?

Limit Along Any Path

Exercise 5.1.58. Prove that if $\lim_{\vec{x}\to\vec{a}} f(\vec{x})$ converges along any continuous path leading to $\vec{a}$, then the limit exists.

Exercise 5.1.59. Show that $\lim_{x\to 0,\,y\to 0} \frac{xy^2}{x^2 + y^4}$ does not exist, although the limit converges to zero along any straight line leading to $(0, 0)$.
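A numerical experiment suggests what to look for in Exercise 5.1.59. This is only a sketch with hypothetical helper names, not a proof: along the line $y = cx$ the function equals $\frac{c^2 x}{1 + c^4 x^2}$, which tends to $0$, while along the parabola $x = y^2$ it is constantly $\frac{1}{2}$.

```python
def f(x, y):
    """The function x y^2 / (x^2 + y^4), defined away from the origin."""
    return x * y ** 2 / (x ** 2 + y ** 4)

def along_line(c, t):
    """Value at the point (t, c t) on the straight line y = c x."""
    return f(t, c * t)

def along_parabola(t):
    """Value at the point (t^2, t) on the parabola x = y^2."""
    return f(t * t, t)

for t in (1e-1, 1e-2, 1e-4):
    # The first two columns shrink with t, the last column stays near 0.5.
    print(t, along_line(1.0, t), along_line(2.0, t), along_parabola(t))
```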

Homogeneous and Multihomogeneous Function

A function $f(\vec{x})$ is homogeneous of degree $\alpha$ if $f(c\vec{x}) = c^\alpha f(\vec{x})$ for any $c > 0$. More generally, a function is multihomogeneous if

$$cf(x_1, x_2, \dots, x_n) = f(c^{\beta_1}x_1, c^{\beta_2}x_2, \dots, c^{\beta_n}x_n).$$

The latter part of the proof of Theorem 5.1.12 makes use of the fact that any norm is a homogeneous function of degree 1 (also see Exercise 5.1.62).

Exercise 5.1.60. Prove that two homogeneous functions of the same degree are equal away from $\vec{0}$ if and only if their restrictions on the unit sphere $S^{n-1}$ are equal.

Exercise 5.1.61. Prove that a homogeneous function is bigger than another homogeneous function of the same degree away from $\vec{0}$ if and only if the inequality holds on $S^{n-1}$.

Exercise 5.1.62. Suppose $f(\vec{x})$ is a continuous homogeneous function of degree $\alpha$ satisfying $f(\vec{x}) > 0$ for $\vec{x} \ne \vec{0}$. Prove that there is $c > 0$, such that $f(\vec{x}) \ge c\|\vec{x}\|^\alpha$ for any $\vec{x}$.

Exercise 5.1.63. Prove that a homogeneous function is continuous away from $\vec{0}$ if and only if its restriction on $S^{n-1}$ is continuous. Then find the condition for a homogeneous function to be continuous at $\vec{0}$.

Continuous Map and Function on Compact Set

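Exercise 5.1.64. Suppose $F(\vec{x}, \vec{y})$ is a continuous map on $A \times K$, where $A \subset \mathbb{R}^m$ and $K \subset \mathbb{R}^n$. Prove that if $K$ is compact, then for any $\vec{a} \in A$ and $\epsilon > 0$, there is $\delta > 0$, such that $\vec{x} \in A$, $\vec{y} \in K$, and $\|\vec{x} - \vec{a}\| < \delta$ imply $\|F(\vec{x}, \vec{y}) - F(\vec{a}, \vec{y})\| < \epsilon$.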


Exercise 5.1.65. Suppose $f(\vec{x}, \vec{y})$ is a continuous function on $A \times K$. Prove that if $K$ is compact, then $g(\vec{x}) = \max_{\vec{y}\in K} f(\vec{x}, \vec{y})$ is a continuous function on $A$.

Exercise 5.1.66. Suppose $f(\vec{x}, y)$ is a continuous function on $A \times [0, 1]$. Prove that $g(\vec{x}) = \int_0^1 f(\vec{x}, y)\,dy$ is a continuous function on $A$.

Continuity in Coordinates

A function $f(\vec{x}, \vec{y})$ is continuous in $\vec{x}$ if $\lim_{\vec{x}\to\vec{a}} f(\vec{x}, \vec{y}) = f(\vec{a}, \vec{y})$. It is uniformly continuous in $\vec{x}$ if the limit is uniform in $\vec{y}$: For any $\epsilon > 0$, there is $\delta > 0$, such that $\|\vec{x} - \vec{x}'\| < \delta$ implies $|f(\vec{x}, \vec{y}) - f(\vec{x}', \vec{y})| \le \epsilon$ for any $\vec{y}$.

Exercise 5.1.67. Prove that if $f(\vec{x}, \vec{y})$ is continuous, then $f(\vec{x}, \vec{y})$ is continuous in $\vec{x}$. Moreover, show that the function

$$f(x, y) = \begin{cases} \frac{xy}{x^2 + y^2} & \text{if } (x, y) \ne (0, 0) \\ 0 & \text{if } (x, y) = (0, 0) \end{cases}$$

is continuous in both $x$ and $y$, but is not a continuous function.

Exercise 5.1.68. Prove that if $f(\vec{x}, \vec{y})$ is continuous in $\vec{x}$ and is uniformly continuous in $\vec{y}$, then $f(\vec{x}, \vec{y})$ is continuous.

Exercise 5.1.69. Suppose $f(\vec{x}, y)$ is continuous in $\vec{x}$ and in $y$. Prove that if $f(\vec{x}, y)$ is monotone in $y$, then $f(\vec{x}, y)$ is continuous.

5.2 Multivariable Algebra

To extend the differentiation from single to multivariable, we need to consider multivariable linear and polynomial maps. We introduce the necessary linear and multilinear algebras for multivariable analysis, including linear transform, linear functional, bilinear form, multilinear form, and exterior algebra. The discussion will be made on Euclidean spaces and can be easily extended to general finite dimensional vector spaces.

5.2.1 Linear Transform

A map $L : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transform if it preserves the addition and scalar multiplication

$$L(\vec{x} + \vec{y}) = L(\vec{x}) + L(\vec{y}), \quad L(c\vec{x}) = cL(\vec{x}). \tag{5.2.1}$$

As a consequence, a linear transform also preserves the linear combinations

$$L(c_1\vec{x}_1 + c_2\vec{x}_2 + \cdots + c_k\vec{x}_k) = c_1L(\vec{x}_1) + c_2L(\vec{x}_2) + \cdots + c_kL(\vec{x}_k).$$

In particular, for the standard basis $\vec{e}_1 = (1, 0, \dots, 0)$, $\vec{e}_2 = (0, 1, \dots, 0)$, ..., $\vec{e}_n = (0, 0, \dots, 1)$ of $\mathbb{R}^n$ and the image $\vec{a}_i = L(\vec{e}_i)$ of the standard basis, we get

$$L(\vec{x}) = x_1\vec{a}_1 + x_2\vec{a}_2 + \cdots + x_n\vec{a}_n. \tag{5.2.2}$$

Conversely, any map given by the formula (5.2.2) is a linear transform. Denote $\vec{a}_j = (a_{1j}, a_{2j}, \dots, a_{mj})$. Then the $i$-th coordinate of $L(\vec{x})$ is

$$l_i(\vec{x}) = a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n.$$


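By expressing vectors as vertical lists of coordinates, the linear transform becomes

$$L(\vec{x}) = \begin{pmatrix} l_1(\vec{x}) \\ l_2(\vec{x}) \\ \vdots \\ l_m(\vec{x}) \end{pmatrix} = \begin{pmatrix} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\ \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = A\vec{x},$$

where $A$ is an $m \times n$ matrix. Therefore matrices are really the explicit presentations of linear transforms. A linear transform $L$ produces a matrix $A = \begin{pmatrix} L(\vec{e}_1) & L(\vec{e}_2) & \cdots & L(\vec{e}_n) \end{pmatrix}$. Conversely, a matrix $A$ produces a linear transform $L(\vec{x}) = A\vec{x}$. Under the correspondence, the addition, scalar multiplication and composition of linear transforms become the addition, scalar multiplication and product of matrices.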
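The correspondence between a linear transform and its matrix can be sketched in a few lines: the columns of $A$ are the images $L(\vec{e}_j)$ of the standard basis, and $A\vec{x}$ then reproduces $L(\vec{x})$. The sample transform `L` from $\mathbb{R}^3$ to $\mathbb{R}^2$ and the helper names are hypothetical, chosen only for illustration.

```python
def matrix_of(L, n):
    """Build the matrix whose j-th column is L(e_j), stored as a list of rows."""
    columns = [L(tuple(1.0 if i == j else 0.0 for i in range(n)))
               for j in range(n)]
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def apply(A, x):
    """Matrix-vector product A x."""
    return tuple(sum(a * t for a, t in zip(row, x)) for row in A)

def L(v):
    """A sample linear transform from R^3 to R^2."""
    x, y, z = v
    return (x + 2 * y, 3 * z - y)

A = matrix_of(L, 3)
x = (1.0, 1.0, 1.0)

print(A)
print(apply(A, x), L(x))
```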

A number valued linear transform $l : \mathbb{R}^n \to \mathbb{R}$ is called a linear functional. A linear functional can be written as

$$l(\vec{x}) = a_1x_1 + a_2x_2 + \cdots + a_nx_n = \vec{a} \cdot \vec{x} \tag{5.2.3}$$

for a unique vector $\vec{a} = (a_1, a_2, \dots, a_n)$. The levels $l(\vec{x}) = c$ of the linear functional are $(n - 1)$-dimensional hyperplanes in $\mathbb{R}^n$ orthogonal to $\vec{a}$.

Exercise 5.2.1. Verify that a map given by the formula (5.2.2) is a linear transform. In other words, the map satisfies (5.2.1).

Exercise 5.2.2. Suppose $L, K : \mathbb{R}^n \to \mathbb{R}^m$ are linear transforms and $c$ is a number. Verify that the maps $K + L$ and $cK$ defined by

$$(K + L)(\vec{x}) = K(\vec{x}) + L(\vec{x}), \quad (cK)(\vec{x}) = c(K(\vec{x}))$$

are linear transforms. Moreover, prove that if $K$ and $L$ correspond to $m \times n$ matrices $A$ and $B$, then $K + L$ and $cK$ correspond to the matrices $A + B$ and $cA$ (the right way is to study the column vectors of $K + L$ and $cK$).

Exercise 5.2.3. Suppose $L : \mathbb{R}^n \to \mathbb{R}^m$ and $K : \mathbb{R}^k \to \mathbb{R}^n$ are linear transforms. Verify that the composition $L \circ K : \mathbb{R}^k \to \mathbb{R}^m$ is a linear transform. Moreover, prove that if $K$ and $L$ correspond to matrices $A$ and $B$, then $L \circ K$ corresponds to the product matrix $BA$ (again, study the column vectors of the matrix corresponding to the composition).

Exercise 5.2.4. Suppose $L : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transform. Verify that for any $\vec{y} \in \mathbb{R}^m$, $\vec{x} \mapsto L(\vec{x}) \cdot \vec{y}$ is a linear functional on $\mathbb{R}^n$. The linear functional (of $\vec{x}$) can be expressed as $L(\vec{x}) \cdot \vec{y} = \vec{x} \cdot \vec{z}$ for a unique $\vec{z} \in \mathbb{R}^n$.

1. Verify that $L^*(\vec{y}) = \vec{z} : \mathbb{R}^m \to \mathbb{R}^n$ is a linear transform. Note that $L^*$ is the unique linear transform satisfying

$$L(\vec{x}) \cdot \vec{y} = \vec{x} \cdot L^*(\vec{y}) \text{ for all } \vec{x} \in \mathbb{R}^n \text{ and } \vec{y} \in \mathbb{R}^m. \tag{5.2.4}$$


2. Prove that $(L + K)^* = L^* + K^*$, $(cK)^* = cK^*$, $(L \circ K)^* = K^* \circ L^*$.

3. Prove that if $L$ corresponds to the $m \times n$ matrix $A$, then $L^*$ corresponds to the transpose $A^T$ of $A$. In particular, this implies $(BA)^T = A^TB^T$.

The linear transform L∗ is called the adjoint of L.

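Given a linear transform $L : \mathbb{R}^n \to \mathbb{R}^m$ and norms on $\mathbb{R}^n$ and $\mathbb{R}^m$, we have

$$\|L(\vec{x})\| \le |x_1|\|\vec{a}_1\| + |x_2|\|\vec{a}_2\| + \cdots + |x_n|\|\vec{a}_n\| \le (\|\vec{a}_1\| + \|\vec{a}_2\| + \cdots + \|\vec{a}_n\|)\|\vec{x}\|_\infty.$$

By the equivalence of norms on $\mathbb{R}^n$, we can find $\lambda$, such that $\|L(\vec{x})\| \le \lambda\|\vec{x}\|$. By $\|c\vec{x}\| = |c|\|\vec{x}\|$ and $L(c\vec{x}) = cL(\vec{x})$, the smallest such $\lambda$ is

$$\begin{aligned}
\|L\| &= \inf\{\lambda : \|L(\vec{x})\| \le \lambda\|\vec{x}\| \text{ for any } \vec{x}\} \\
&= \inf\{\lambda : \|L(\vec{x})\| \le \lambda \text{ for any } \vec{x} \text{ satisfying } \|\vec{x}\| = 1\} \\
&= \sup\{\|L(\vec{x})\| : \|\vec{x}\| = 1\}.
\end{aligned} \tag{5.2.5}$$

The number $\|L\|$ is the norm of the linear transform (with respect to the given norms on the Euclidean spaces). The norm $\|A\|$ of a matrix $A$ is the norm of the corresponding linear transform. The inequality

$$\|L(\vec{x}) - L(\vec{y})\| = \|L(\vec{x} - \vec{y})\| \le \lambda\|\vec{x} - \vec{y}\|$$

also implies that linear transforms from finite dimensional vector spaces are continuous.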
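For a concrete feel of (5.2.5), take the $L^\infty$-norm on both Euclidean spaces. The following sketch relies on a standard fact not stated in the text: since $\|A\vec{x}\|_\infty$ is a convex function of $\vec{x}$, its supremum over the unit cube is reached at a vertex $(\pm 1, \dots, \pm 1)$, and the resulting norm equals the largest absolute row sum of $A$. The matrix and the function names are assumptions for illustration.

```python
import itertools

def norm_inf(x):
    """L-infinity norm of a vector."""
    return max(abs(t) for t in x)

def apply(A, x):
    """Matrix-vector product A x."""
    return tuple(sum(a * t for a, t in zip(row, x)) for row in A)

def operator_norm_inf(A):
    """sup of |A x|_inf over |x|_inf = 1, evaluated at the vertices of the cube."""
    n = len(A[0])
    return max(norm_inf(apply(A, s))
               for s in itertools.product((-1.0, 1.0), repeat=n))

def max_row_sum(A):
    """Largest absolute row sum, the closed form of the same norm."""
    return max(sum(abs(a) for a in row) for row in A)

A = [[1.0, -2.0],
     [3.0, 4.0]]

print(operator_norm_inf(A), max_row_sum(A))
```

The vertex enumeration costs $2^n$ evaluations, so it is only meant to illustrate the definition; the row sum formula is the practical way to compute this particular norm.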

The linear functions used in the differentiation of single variable functions will be extended to maps of the form $\vec{a} + L(\vec{x})$. We will call such maps linear maps (the formal name is affine maps), in contrast to linear transform (for which $\vec{a} = \vec{0}$).

Theorem 5.2.1. The norm of linear transform satisfies the three axioms for the norm. Moreover, the norm of composition satisfies $\|L \circ K\| \le \|L\|\|K\|$.

Proof. By the definition, we have $\|L\| \ge 0$. If $\|L\| = 0$, then $\|L(\vec{x})\| = 0$ for any $\vec{x}$. By the positivity of norm, we get $L(\vec{x}) = \vec{0}$ for any $\vec{x}$. In other words, $L$ is the zero transform. This verifies the positivity of $\|L\|$.

By the definition of $\|L\|$ and $\|K\|$, we have

$$\|L(\vec{x}) + K(\vec{x})\| \le \|L(\vec{x})\| + \|K(\vec{x})\| \le \|L\|\|\vec{x}\| + \|K\|\|\vec{x}\| \le (\|L\| + \|K\|)\|\vec{x}\|.$$

Then by the definition of $\|L + K\|$, we get $\|L + K\| \le \|L\| + \|K\|$. This verifies the triangle inequality.

The scalar property can be similarly proved.

Finally, we have

$$\|(L \circ K)(\vec{x})\| = \|L(K(\vec{x}))\| \le \|L\|\|K(\vec{x})\| \le \|L\|\|K\|\|\vec{x}\|.$$

Then by the definition of $\|L \circ K\|$, we have $\|L \circ K\| \le \|L\|\|K\|$.
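The inequalities of Theorem 5.2.1 can be checked on sample matrices. The sketch below assumes the $L^\infty$-norm on the Euclidean spaces, for which the norm of a matrix is its largest absolute row sum (a standard fact, not derived in the text); the two matrices and the helper names are arbitrary illustrations.

```python
def norm(A):
    """Norm of a matrix with respect to the L-infinity norms: largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

def add(A, B):
    """Matrix of the sum of two linear transforms."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def compose(A, B):
    """Matrix product A B, the matrix of the composition."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

L = [[1.0, -2.0],
     [3.0, 4.0]]
K = [[0.0, 1.0],
     [-1.0, 2.0]]

triangle = (norm(add(L, K)), norm(L) + norm(K))
product = (norm(compose(L, K)), norm(L) * norm(K))

print(triangle)
print(product)
```

In both printed pairs the first number is bounded by the second, in line with the triangle inequality and with $\|L \circ K\| \le \|L\|\|K\|$.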


Exercise 5.2.5. Prove that with respect to the Euclidean norm on $\mathbb{R}^n$ and the absolute value on $\mathbb{R}$, the norm of the linear functional $l(\vec{x}) = \vec{a} \cdot \vec{x} : \mathbb{R}^n \to \mathbb{R}$ is $\|\vec{a}\|_2$. Then extend the conclusion to linear transforms $\mathbb{R}^n \to \mathbb{R}^m$ with the Euclidean norm on $\mathbb{R}^n$ and the $L^\infty$-norm on $\mathbb{R}^m$.

Exercise 5.2.6. Prove that if the Euclidean spaces are given the Euclidean norms, then $\|L\| = \sup_{\|\vec{x}\|=\|\vec{y}\|=1} L(\vec{x}) \cdot \vec{y}$. This implies that the adjoint linear transform in Exercise 5.2.4 satisfies $\|L^*\| = \|L\|$ with respect to the Euclidean norms.

Exercise 5.2.7. Suppose $A = (a_{ij})$ is the matrix of a linear transform. Prove that $\|A\| \le \sqrt{\sum a_{ij}^2}$ with respect to the Euclidean norms.

Exercise 5.2.8. Prove that the supremum in the definition of the norm of linear transform is in fact maximum. In other words, the supremum is reached at some vector of unit length.

Exercise 5.2.9. Suppose a continuous map $F : \mathbb{R}^n \to \mathbb{R}^m$ is additive: $F(\vec{x} + \vec{y}) = F(\vec{x}) + F(\vec{y})$. Prove that $F$ is a linear transform.

By using matrices, the collection of all the linear transforms from $\mathbb{R}^n$ to $\mathbb{R}^m$ is easily identified with the Euclidean space $\mathbb{R}^{mn}$. On the space $\mathbb{R}^{mn}$, the preferred norm is the norm of linear transform, although the choice is often not important because all the norms are equivalent.

Example 5.2.1. For an $m \times n$ matrix $A$, the left multiplication $L_A(X) = AX : \mathbb{R}^{nk} \to \mathbb{R}^{mk}$ is a linear transform. By $\|AX\| \le \|A\|\|X\|$, we have $\|L_A\| \le \|A\|$. Similarly, the right multiplication $R_A(X) = XA : \mathbb{R}^{km} \to \mathbb{R}^{kn}$ is a linear transform with $\|R_A\| \le \|A\|$.

Example 5.2.2. Consider the map of taking squares $F(L) = L^2 : \mathbb{R}^{n^2} \to \mathbb{R}^{n^2}$. If $H$ is a linear transform satisfying $\|H\| < \epsilon$, then by Theorem 5.2.1, we have

\[
\begin{aligned}
\|(L + H)^2 - L^2\| &= \|LH + HL + H^2\| \le \|LH\| + \|HL\| + \|H^2\| \\
&\le \|L\|\|H\| + \|H\|\|L\| + \|H\|^2 < 2\epsilon\|L\| + \epsilon^2.
\end{aligned}
\]

From this it is easy to deduce that the square map is continuous.

Exercise 5.2.10. Prove that with respect to the norm of linear transform, the addition, scalar multiplication and composition of linear transforms are continuous maps.

Exercise 5.2.11. Is it possible to have $\|L_A\| = \|A\|$ for some choices of norms on the Euclidean space? Does the equality always hold?

Exercise 5.2.12. Suppose $\mathbb{R}^n$ is given a norm, which induces a norm for the collection $\mathbb{R}^{n^2}$ of linear transforms from $\mathbb{R}^n$ to itself. Study the inverse of linear transforms.

1. Prove that if $\|L\| < 1$, then $\sum_{n=0}^{\infty} L^n = I + L + L^2 + \cdots$ converges in $\mathbb{R}^{n^2}$ and is the inverse of $I - L$ ($I$ is the identity transform).

2. Prove that if $L$ is invertible, then for any linear transform $K$ satisfying $\|K - L\| < \dfrac{1}{\|L^{-1}\|}$, $K$ is also invertible and
\[
\|K^{-1} - L^{-1}\| \le \frac{\|K - L\|\|L^{-1}\|^2}{1 - \|K - L\|\|L^{-1}\|}.
\]
This implies that the collection $GL(n)$ of invertible linear transforms on $\mathbb{R}^n$ is an open subset of $\mathbb{R}^{n^2}$.

3. Prove that $L \mapsto L^{-1}$ is a continuous map on $GL(n)$.

5.2.2 Bilinear and Quadratic Form

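A map $B : \mathbb{R}^m \times \mathbb{R}^n \to \mathbb{R}^k$ is bilinear if it is linear in the vector in $\mathbb{R}^m$

\[
B(\vec{x} + \vec{x}', \vec{y}) = B(\vec{x}, \vec{y}) + B(\vec{x}', \vec{y}), \quad B(c\vec{x}, \vec{y}) = cB(\vec{x}, \vec{y}), \tag{5.2.6}
\]

and is also linear in the vector in $\mathbb{R}^n$

\[
B(\vec{x}, \vec{y} + \vec{y}') = B(\vec{x}, \vec{y}) + B(\vec{x}, \vec{y}'), \quad B(\vec{x}, c\vec{y}) = cB(\vec{x}, \vec{y}). \tag{5.2.7}
\]

Bilinear maps can be considered as generalized products because the scalar product

\[
c\vec{x} : \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}^n,
\]

the dot product

\[
\vec{x} \cdot \vec{y} : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R},
\]

the 3-dimensional cross product

\[
(x_1, x_2, x_3) \times (y_1, y_2, y_3) = (x_2y_3 - x_3y_2,\ x_3y_1 - x_1y_3,\ x_1y_2 - x_2y_1) : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}^3,
\]

and the matrix product

\[
AB : \mathbb{R}^{mn} \times \mathbb{R}^{nk} \to \mathbb{R}^{mk}
\]

are all bilinear.

The addition and scalar multiplication of bilinear maps are still bilinear. Moreover, if $B$ is bilinear and $L$ and $K$ are linear, then $B(L(\vec{u}), K(\vec{v}))$ is bilinear (note that $(L, K)(\vec{u}, \vec{v}) = (L(\vec{u}), K(\vec{v}))$ is a linear transform). On the other hand, if $L$ is linear, then $L(B(\vec{x}, \vec{y}))$ is bilinear.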

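For the standard bases $\vec{e}_i$ and $\vec{f}_j$ of $\mathbb{R}^m$ and $\mathbb{R}^n$, we get

\[
B(\vec{x}, \vec{y}) = B\Big(\sum_i x_i\vec{e}_i, \sum_j y_j\vec{f}_j\Big) = \sum_{i,j} x_iy_jB(\vec{e}_i, \vec{f}_j) = \sum_{i,j} x_iy_j\vec{a}_{ij}. \tag{5.2.8}
\]

Conversely, any map given by the formula (5.2.8) is bilinear. Moreover, we have

\[
\|B(\vec{x}, \vec{y})\| \le \sum_{i,j} |x_i||y_j|\|\vec{a}_{ij}\| \le \Big(\sum_{i,j} \|\vec{a}_{ij}\|\Big)\|\vec{x}\|_\infty\|\vec{y}\|_\infty.
\]

This implies that the bilinear map is continuous, and we may define the norm of the bilinear map to be

\[
\begin{aligned}
\|B\| &= \inf\{c : \|B(\vec{x}, \vec{y})\| \le c\|\vec{x}\|\|\vec{y}\| \text{ for any } \vec{x}, \vec{y}\} \\
&= \sup\{\|B(\vec{x}, \vec{y})\| : \|\vec{x}\| = \|\vec{y}\| = 1\}.
\end{aligned}
\tag{5.2.9}
\]

A map is bilinear if and only if its coordinate functions are bilinear functions. Bilinear functions $b(\vec{x}, \vec{y})$ for $\vec{x} \in \mathbb{R}^m$ and $\vec{y} \in \mathbb{R}^n$ are equivalent to matrices $A = (a_{ij})$ via the formula

\[
b(\vec{x}, \vec{y}) = \sum_{i,j} a_{ij}x_iy_j = A\vec{x} \cdot \vec{y}. \tag{5.2.10}
\]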

The relation can also be written in terms of the linear transform as $b(\vec{x}, \vec{y}) = L(\vec{x}) \cdot \vec{y}$. The addition and scalar multiplication of bilinear functions correspond to the addition and scalar multiplication of matrices or linear transforms.

A bilinear function on $\mathbb{R}^n \times \mathbb{R}^n$ is called a bilinear form on $\mathbb{R}^n$. A bilinear form is symmetric if $b(\vec{x}, \vec{y}) = b(\vec{y}, \vec{x})$, and is skew-symmetric if $b(\vec{x}, \vec{y}) = -b(\vec{y}, \vec{x})$. We also call the corresponding matrices symmetric and skew-symmetric.

Exercise 5.2.13. Find the norms of the scalar product, dot product and matrix product with respect to the Euclidean norms on the Euclidean spaces.

Exercise 5.2.14. Prove that the norm of bilinear map satisfies the three conditions for norms.

Exercise 5.2.15. Find the norm of a bilinear function $b(\vec{x}, \vec{y}) = L(\vec{x}) \cdot \vec{y}$ with respect to the Euclidean norms.

Exercise 5.2.16. Prove that a bilinear form is skew-symmetric if and only if $b(\vec{x}, \vec{x}) = 0$ for any $\vec{x}$.

Exercise 5.2.17. Prove that any bilinear form is the unique sum of a symmetric form and a skew-symmetric form.

Exercise 5.2.18. Suppose $B : \mathbb{R}^m \times \mathbb{R}^n \to \mathbb{R}^k$ is a continuous map satisfying $B(\vec{x} + \vec{x}', \vec{y}) = B(\vec{x}, \vec{y}) + B(\vec{x}', \vec{y})$ and $B(\vec{x}, \vec{y} + \vec{y}') = B(\vec{x}, \vec{y}) + B(\vec{x}, \vec{y}')$. Prove that $B$ is a bilinear map.

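A skew-symmetric bilinear form on $\mathbb{R}^2$ is

\[
b(\vec{x}, \vec{y}) = a_{12}x_1y_2 + a_{21}x_2y_1 = a_{12}x_1y_2 - a_{12}x_2y_1 = a_{12}\det\begin{pmatrix} x_1 & y_1 \\ x_2 & y_2 \end{pmatrix}.
\]

A skew-symmetric bilinear form on $\mathbb{R}^3$ is

\[
\begin{aligned}
b(\vec{x}, \vec{y}) &= a_{12}x_1y_2 + a_{21}x_2y_1 + a_{13}x_1y_3 + a_{31}x_3y_1 + a_{23}x_2y_3 + a_{32}x_3y_2 \\
&= a_{12}(x_1y_2 - x_2y_1) + a_{31}(x_3y_1 - x_1y_3) + a_{23}(x_2y_3 - x_3y_2) \\
&= a_{23}\det\begin{pmatrix} x_2 & y_2 \\ x_3 & y_3 \end{pmatrix} + a_{31}\det\begin{pmatrix} x_3 & y_3 \\ x_1 & y_1 \end{pmatrix} + a_{12}\det\begin{pmatrix} x_1 & y_1 \\ x_2 & y_2 \end{pmatrix} \\
&= \vec{a} \cdot (\vec{x} \times \vec{y}),
\end{aligned}
\]

where $\vec{a} = (a_{23}, a_{31}, a_{12})$.

In general, a skew-symmetric bilinear form on $\mathbb{R}^n$ is

\[
b(\vec{x}, \vec{y}) = \sum_{i \ne j} a_{ij}x_iy_j = \sum_{i<j} a_{ij}(x_iy_j - x_jy_i) = \sum_{i<j} a_{ij}\det\begin{pmatrix} x_i & y_i \\ x_j & y_j \end{pmatrix}.
\]

Motivated by the formula on $\mathbb{R}^3$, we may define the $n$-dimensional cross product

\[
\vec{x} \times \vec{y} = \big((-1)^{i+j+1}(x_iy_j - x_jy_i)\big)_{i<j} : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^{\frac{n(n-1)}{2}}. \tag{5.2.11}
\]

Then we still have

\[
b(\vec{x}, \vec{y}) = \vec{a} \cdot (\vec{x} \times \vec{y}),
\]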

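where the $ij$-coordinate of $\vec{a} \in \mathbb{R}^{\frac{n(n-1)}{2}}$ is $a_{ij}$ when $i + j$ is odd and is $a_{ji}$ when $i + j$ is even. The formula for skew-symmetric bilinear forms may be compared with the formula (5.2.3) for linear functionals.

The general cross product $\vec{x} \times \vec{y}$ is still bilinear and satisfies $\vec{x} \times \vec{y} = -\vec{y} \times \vec{x}$. Therefore

\[
\begin{aligned}
(x_1\vec{a}_1 + x_2\vec{a}_2) \times (y_1\vec{a}_1 + y_2\vec{a}_2)
&= x_1y_1\,\vec{a}_1 \times \vec{a}_1 + x_1y_2\,\vec{a}_1 \times \vec{a}_2 + x_2y_1\,\vec{a}_2 \times \vec{a}_1 + x_2y_2\,\vec{a}_2 \times \vec{a}_2 \\
&= x_1y_2\,\vec{a}_1 \times \vec{a}_2 - x_2y_1\,\vec{a}_1 \times \vec{a}_2 = \det\begin{pmatrix} x_1 & y_1 \\ x_2 & y_2 \end{pmatrix}\vec{a}_1 \times \vec{a}_2.
\end{aligned}
\tag{5.2.12}
\]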

In $\mathbb{R}^3$, the cross product $\vec{x} \times \vec{y}$ is a vector orthogonal to the two vectors, with the direction determined by the right hand rule from $\vec{x}$ to $\vec{y}$. Moreover, the Euclidean norm of $\vec{x} \times \vec{y}$ is given by the area of the parallelogram spanned by the two vectors

\[
\|\vec{x} \times \vec{y}\|_2 = \|\vec{x}\|_2\|\vec{y}\|_2|\sin\theta|,
\]

where $\theta$ is the angle between the two vectors. In other dimensions, we cannot talk about the direction because $\vec{x} \times \vec{y}$ is in a different Euclidean space. However, the length of the cross product is still the area of the parallelogram spanned by the two vectors

\[
\begin{aligned}
\mathrm{Area}(\vec{x}, \vec{y}) &= \sqrt{(\vec{x} \cdot \vec{x})(\vec{y} \cdot \vec{y}) - (\vec{x} \cdot \vec{y})^2} = \sqrt{\sum_{i,j} x_i^2y_j^2 - \sum_{i,j} x_iy_ix_jy_j} \\
&= \sqrt{\sum_{i<j} (x_i^2y_j^2 + x_j^2y_i^2 - 2x_iy_ix_jy_j)} = \sqrt{\sum_{i<j} (x_iy_j - x_jy_i)^2} \\
&= \|\vec{x} \times \vec{y}\|_2.
\end{aligned}
\tag{5.2.13}
\]
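The general cross product (5.2.11) and the area identity (5.2.13) can be checked on concrete integer vectors. The following is a minimal sketch, assuming that the coordinates of $\mathbb{R}^{n(n-1)/2}$ are listed in lexicographic order of the pairs $i < j$ (the text does not fix an order). The function names are hypothetical.

```python
from itertools import combinations

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def cross(x, y):
    """n-dimensional cross product (5.2.11); pairs i < j in lexicographic order."""
    n = len(x)
    return [(-1) ** (i + j + 1) * (x[i - 1] * y[j - 1] - x[j - 1] * y[i - 1])
            for i, j in combinations(range(1, n + 1), 2)]

def area_squared(x, y):
    # Square of the first expression in (5.2.13).
    return dot(x, x) * dot(y, y) - dot(x, y) ** 2

x, y = (1, 2, 0, -1), (3, 0, 1, 2)
c = cross(x, y)
assert area_squared(x, y) == dot(c, c)   # both sides equal 83

# In R^3 the coordinates appear in the order (12, 13, 23), so reversing the
# list recovers the usual cross product.
u, v = (1, 2, 3), (-1, 0, 2)
assert cross(u, v)[::-1] == [4, -5, 2]
```

Integer inputs keep the comparison exact, so no floating point tolerance is needed.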

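In $\mathbb{R}^2$, the cross product $\vec{x} \times \vec{y} \in \mathbb{R}$ is the determinant of the matrix formed by the two vectors. The Euclidean norm of the cross product is $|\det(\vec{x}\ \vec{y})|$, which is the area of the parallelogram spanned by $\vec{x}$ and $\vec{y}$. A linear transform

\[
L(\vec{x}) = A\vec{x} = x_1\vec{a}_1 + x_2\vec{a}_2 : \mathbb{R}^2 \to \mathbb{R}^n
\]

takes the parallelogram spanned by $\vec{x}, \vec{y} \in \mathbb{R}^2$ to the parallelogram spanned by $L(\vec{x}), L(\vec{y}) \in \mathbb{R}^n$. By (5.2.12), the areas of the parallelograms are related by

\[
\mathrm{Area}(L(\vec{x}), L(\vec{y})) = |\det(\vec{x}\ \vec{y})|\,\|\vec{a}_1 \times \vec{a}_2\|_2 = \|\vec{a}_1 \times \vec{a}_2\|_2\,\mathrm{Area}(\vec{x}, \vec{y}).
\]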

Exercise 5.2.19. (Archimedes¹) Find the area of the region $A$ bounded by $x = 0$, $y = 4$, and $y = x^2$ in the following way.

1. Show that the area of the triangle with vertices $(a, a^2)$, $(a + h, (a + h)^2)$ and $(a + 2h, (a + 2h)^2)$ is $|h|^3$.

2. Let $P_n$ be the polygon bounded by the line connecting $(0, 0)$ to $(0, 4)$, the line connecting $(0, 4)$ to $(2, 4)$, and the line segments connecting points on the parabola $y = x^2$ with $x = 0, \dfrac{1}{2^{n-1}}, \dfrac{2}{2^{n-1}}, \dots, \dfrac{2^n}{2^{n-1}}$.

3. Prove that the area of the polygon $P_n - P_{n-1}$ is $\dfrac{1}{4^{n-1}}$.

4. Prove that the area of the region $A$ is $\displaystyle\sum_{n=0}^{\infty} \frac{1}{4^{n-1}} = \frac{16}{3}$.

¹Archimedes of Syracuse, born 287 BC and died 212 BC.

A quadratic form

\[
q(\vec{x}) = b(\vec{x}, \vec{x}) = \sum_{i,j} a_{ij}x_ix_j = A\vec{x} \cdot \vec{x} \tag{5.2.14}
\]

is obtained by taking two vectors in a bilinear form to be the same. Since skew-symmetric bilinear forms induce the zero quadratic form, usually only symmetric bilinear forms are used to induce quadratic forms. Therefore quadratic forms are in one-to-one correspondence with symmetric bilinear forms (and symmetric matrices, and self-adjoint linear transforms) by (note that $a_{ij} = a_{ji}$)

\[
\begin{aligned}
q(x_1, x_2, \dots, x_n) &= \sum_{1 \le i \le n} a_{ii}x_i^2 + 2\sum_{1 \le i < j \le n} a_{ij}x_ix_j \\
&= a_{11}x_1^2 + a_{22}x_2^2 + \cdots + a_{nn}x_n^2 + 2a_{12}x_1x_2 + 2a_{13}x_1x_3 + \cdots + 2a_{(n-1)n}x_{n-1}x_n.
\end{aligned}
\tag{5.2.15}
\]

Exercise 5.2.20. Prove that a symmetric bilinear form can be recovered from the corresponding quadratic form by

\[
b(\vec{x}, \vec{y}) = \frac{1}{4}\big(q(\vec{x} + \vec{y}) - q(\vec{x} - \vec{y})\big) = \frac{1}{2}\big(q(\vec{x} + \vec{y}) - q(\vec{x}) - q(\vec{y})\big). \tag{5.2.16}
\]

Exercise 5.2.21. Prove that a quadratic form is homogeneous of second order

\[
q(c\vec{x}) = c^2q(\vec{x}),
\]

and satisfies the parallelogram law

\[
q(\vec{x} + \vec{y}) + q(\vec{x} - \vec{y}) = 2q(\vec{x}) + 2q(\vec{y}). \tag{5.2.17}
\]

Exercise 5.2.22. Suppose a function $q$ satisfies the parallelogram law (5.2.17). Define a function $b(\vec{x}, \vec{y})$ by (5.2.16).

1. Prove that $q(\vec{0}) = 0$, $q(-\vec{x}) = q(\vec{x})$.

2. Prove that $b$ is symmetric.

3. Prove that $b$ satisfies $b(\vec{x} + \vec{y}, \vec{z}) + b(\vec{x} - \vec{y}, \vec{z}) = 2b(\vec{x}, \vec{z})$.

4. Suppose $f(\vec{x})$ is a function satisfying $f(\vec{0}) = 0$ and $f(\vec{x} + \vec{y}) + f(\vec{x} - \vec{y}) = 2f(\vec{x})$. Prove that $f$ is additive (see Exercise 5.2.9): $f(\vec{x} + \vec{y}) = f(\vec{x}) + f(\vec{y})$.

5. Prove that if $q$ is continuous, then $b$ is a bilinear form.

A quadratic form is positive definite if

\[
\vec{x} \ne \vec{0} \implies q(\vec{x}) > 0.
\]

It is negative definite if

\[
\vec{x} \ne \vec{0} \implies q(\vec{x}) < 0.
\]

It is indefinite if it takes both positive and negative values. Besides the three possibilities, a quadratic form may satisfy $q(\vec{x}) \ge 0$ for all $\vec{x}$ (called semi-positive definite) and $q(\vec{x}) = 0$ for some $\vec{x} \ne \vec{0}$. The fifth possibility is that $q(\vec{x}) \le 0$ for all $\vec{x}$ (called semi-negative definite) and $q(\vec{x}) = 0$ for some $\vec{x} \ne \vec{0}$.

The terms of the form $a_{ij}x_ix_j$, $i \ne j$, are cross terms. A quadratic form without cross terms is $q = a_{11}x_1^2 + a_{22}x_2^2 + \cdots + a_{nn}x_n^2$. It is positive definite if and only if all $a_{ii} > 0$. It is negative definite if and only if all $a_{ii} < 0$. It is indefinite if and only if some $a_{ii} > 0$ and some $a_{jj} < 0$. In general, the cross terms may be eliminated by the technique of completing the squares. Then the nature of the quadratic form can be determined.

Example 5.2.3. Consider the quadratic form $q(x, y, z) = x^2 + 13y^2 + 14z^2 + 6xy + 2xz + 18yz$. Putting together all the terms involving $x$ and completing a square, we get

\[
\begin{aligned}
q &= x^2 + 6xy + 2xz + 13y^2 + 14z^2 + 18yz \\
&= [x^2 + 2x(3y + z) + (3y + z)^2] + 13y^2 + 14z^2 + 18yz - (3y + z)^2 \\
&= (x + 3y + z)^2 + 4y^2 + 13z^2 + 12yz.
\end{aligned}
\]

The remaining terms involve only $y$ and $z$. Putting together all the terms involving $y$ and completing a square, we get

\[
4y^2 + 13z^2 + 12yz = (2y + 3z)^2 + 4z^2.
\]

Thus $q = (x + 3y + z)^2 + (2y + 3z)^2 + (2z)^2 = u^2 + v^2 + 4w^2$ is positive definite.

Example 5.2.4. The cross terms in the quadratic function $q = 4x_1^2 + 19x_2^2 - 4x_4^2 - 4x_1x_2 + 4x_1x_3 - 8x_1x_4 + 10x_2x_3 + 16x_2x_4 + 12x_3x_4$ can be eliminated as follows.

\[
\begin{aligned}
q &= 4[x_1^2 - x_1x_2 + x_1x_3 - 2x_1x_4] + 19x_2^2 - 4x_4^2 + 10x_2x_3 + 16x_2x_4 + 12x_3x_4 \\
&= 4\left[x_1^2 + 2x_1\left(-\tfrac{1}{2}x_2 + \tfrac{1}{2}x_3 - x_4\right) + \left(-\tfrac{1}{2}x_2 + \tfrac{1}{2}x_3 - x_4\right)^2\right] \\
&\quad + 19x_2^2 - 4x_4^2 + 10x_2x_3 + 16x_2x_4 + 12x_3x_4 - 4\left(-\tfrac{1}{2}x_2 + \tfrac{1}{2}x_3 - x_4\right)^2 \\
&= 4\left(x_1 - \tfrac{1}{2}x_2 + \tfrac{1}{2}x_3 - x_4\right)^2 + 18\left[x_2^2 + \tfrac{2}{3}x_2x_3 + \tfrac{2}{3}x_2x_4\right] - x_3^2 - 8x_4^2 + 16x_3x_4 \\
&= (2x_1 - x_2 + x_3 - 2x_4)^2 + 18\left[x_2^2 + 2x_2\left(\tfrac{1}{3}x_3 + \tfrac{1}{3}x_4\right) + \left(\tfrac{1}{3}x_3 + \tfrac{1}{3}x_4\right)^2\right] \\
&\quad - x_3^2 - 8x_4^2 + 16x_3x_4 - 18\left(\tfrac{1}{3}x_3 + \tfrac{1}{3}x_4\right)^2 \\
&= (2x_1 - x_2 + x_3 - 2x_4)^2 + 18\left(x_2 + \tfrac{1}{3}x_3 + \tfrac{1}{3}x_4\right)^2 - 3(x_3^2 - 4x_3x_4) - 10x_4^2 \\
&= (2x_1 - x_2 + x_3 - 2x_4)^2 + 2(3x_2 + x_3 + x_4)^2 - 3[x_3^2 + 2x_3(-2x_4) + (-2x_4)^2] - 10x_4^2 + 3(-2x_4)^2 \\
&= (2x_1 - x_2 + x_3 - 2x_4)^2 + 2(3x_2 + x_3 + x_4)^2 - 3(x_3 - 2x_4)^2 + 2x_4^2 \\
&= y_1^2 + 2y_2^2 - 3y_3^2 + 2y_4^2.
\end{aligned}
\]

The result tells us that $q$ is indefinite.
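A long completion of squares such as the one in Example 5.2.4 is easy to get wrong, so a machine check can be reassuring. The sketch below compares the original quadratic function with the final sum of squares on a grid of integer points, and exhibits one positive and one negative value. The function names `q` and `q_squares` are chosen only for this illustration.

```python
from itertools import product

def q(x1, x2, x3, x4):
    # The quadratic function of Example 5.2.4 as originally given.
    return (4 * x1**2 + 19 * x2**2 - 4 * x4**2 - 4 * x1 * x2 + 4 * x1 * x3
            - 8 * x1 * x4 + 10 * x2 * x3 + 16 * x2 * x4 + 12 * x3 * x4)

def q_squares(x1, x2, x3, x4):
    # The same function after all cross terms have been eliminated.
    y1 = 2 * x1 - x2 + x3 - 2 * x4
    y2 = 3 * x2 + x3 + x4
    y3 = x3 - 2 * x4
    y4 = x4
    return y1**2 + 2 * y2**2 - 3 * y3**2 + 2 * y4**2

# Exact integer comparison on a 7 x 7 x 7 x 7 grid.
assert all(q(*x) == q_squares(*x) for x in product(range(-3, 4), repeat=4))

# Indefinite: a positive value, and a negative value obtained by choosing
# x so that y1 = y2 = y4 = 0 and y3 = 3.
print(q(1, 0, 0, 0), q(-2, -1, 3, 0))  # prints: 4 -27
```

The second point is found by solving $y_4 = 0$, $y_3 = 3$, $y_2 = 0$, $y_1 = 0$ from the bottom up, which is possible because the change of variables is triangular.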

Example 5.2.5. The quadratic form $q = 4xy + y^2$ has no $x^2$ term. We may complete the square by using the $y^2$ term and get $q = (y + 2x)^2 - 4x^2 = u^2 - 4v^2$, which is indefinite.

The quadratic form $q = xy + yz$ has no square terms. We may eliminate the cross terms by introducing $x = x_1 + y_1$, $y = x_1 - y_1$, so that $q = x_1^2 - y_1^2 + x_1z - y_1z$. Then we complete the square and get

\[
q = \left(x_1 + \tfrac{1}{2}z\right)^2 - \left(y_1 + \tfrac{1}{2}z\right)^2 = \tfrac{1}{4}(x + y + z)^2 - \tfrac{1}{4}(x - y + z)^2.
\]

The quadratic form is also indefinite.

Exercise 5.2.23. Eliminate the cross terms in the following quadratic functions. Then determine whether the functions are positive definite, negative definite, or indefinite.

1. $x^2 + 4xy - 5y^2$.

2. $2x^2 + 4xy$.

3. $4x_1^2 + 4x_1x_2 + 5x_2^2$.

4. $x^2 + 2y^2 + z^2 + 2xy - 2xz$.

5. $-2u^2 - v^2 - 6w^2 - 4uw + 2vw$.

6. $x_1^2 + x_3^2 + 2x_1x_2 + 2x_1x_3 + 2x_1x_4 + 2x_3x_4$.

Exercise 5.2.24. Eliminate the cross terms in the quadratic function $x^2 + 2y^2 + z^2 + 2xy - 2xz$ by first completing a square for terms involving $z$, then completing for terms involving $y$.

Exercise 5.2.25. Prove that if a quadratic form is positive definite, then all the coefficients of the square terms must be positive.

Exercise 5.2.26. Prove that if a quadratic form $q(\vec{x})$ is positive definite, then there is $\lambda > 0$, such that $q(\vec{x}) \ge \lambda\|\vec{x}\|^2$ for any $\vec{x}$. Extend the result to homogeneous functions.

Now let us study the technique of completing the squares in general.

The $k$-th principal minor of a matrix $A$ is the determinant of the $k \times k$ matrix formed by the entries in the first $k$ rows and first $k$ columns of $A$. Suppose a quadratic form $q(\vec{x}) = A\vec{x} \cdot \vec{x}$, where $A$ is a symmetric matrix. Let

\[
d_1 = a_{11}, \quad d_2 = \det\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}, \quad \dots, \quad d_n = \det A
\]

be the principal minors of the coefficient matrix.

If $d_1 \ne 0$, then eliminating all the cross terms involving $x_1$ gives us

\[
\begin{aligned}
q(\vec{x}) &= a_{11}\left(x_1^2 + 2x_1\frac{1}{a_{11}}(a_{12}x_2 + \cdots + a_{1n}x_n) + \frac{1}{a_{11}^2}(a_{12}x_2 + \cdots + a_{1n}x_n)^2\right) \\
&\quad + a_{22}x_2^2 + \cdots + a_{nn}x_n^2 + 2a_{23}x_2x_3 + 2a_{24}x_2x_4 + \cdots + 2a_{(n-1)n}x_{n-1}x_n \\
&\quad - \frac{1}{a_{11}}(a_{12}x_2 + \cdots + a_{1n}x_n)^2 \\
&= d_1\left(x_1 + \frac{a_{12}}{d_1}x_2 + \cdots + \frac{a_{1n}}{d_1}x_n\right)^2 + q_2(\vec{x}_2),
\end{aligned}
\]

where $q_2$ is a quadratic form of the truncated vector $\vec{x}_2 = (x_2, x_3, \dots, x_n)$. The coefficient matrix $A_2$ for $q_2$ is obtained as follows. For each $2 \le i \le n$, adding the $-\dfrac{a_{1i}}{a_{11}}$ multiple of the first column of $A$ to the $i$-th column makes the $i$-th entry in the first row zero. Then we get a matrix

\[
\begin{pmatrix} d_1 & \vec{0} \\ * & A_2 \end{pmatrix}.
\]

Since the column operations do not change the determinant of the matrix (nor any of the principal minors), the principal minors $d^{(2)}_1, d^{(2)}_2, \dots, d^{(2)}_{n-1}$ of $A_2$ are related to the principal minors $d^{(1)}_1 = d_1, d^{(1)}_2 = d_2, \dots, d^{(1)}_n = d_n$ of $A_1 = A$ by $d^{(1)}_{k+1} = d_1d^{(2)}_k$.

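The discussion sets up an inductive argument. Assume $d_1, d_2, \dots, d_k$ are all nonzero. Then we may complete the squares in $k$ steps and obtain

\[
\begin{aligned}
q(\vec{x}) &= d^{(1)}_1(x_1 + b_{12}x_2 + \cdots + b_{1n}x_n)^2 + d^{(2)}_1(x_2 + b_{23}x_3 + \cdots + b_{2n}x_n)^2 \\
&\quad + \cdots + d^{(k)}_1(x_k + b_{k(k+1)}x_{k+1} + \cdots + b_{kn}x_n)^2 + q_{k+1}(\vec{x}_{k+1}),
\end{aligned}
\]

with

\[
d^{(i)}_1 = \frac{d^{(i-1)}_2}{d^{(i-1)}_1} = \frac{d^{(i-2)}_3}{d^{(i-2)}_2} = \cdots = \frac{d^{(1)}_i}{d^{(1)}_{i-1}} = \frac{d_i}{d_{i-1}},
\]

and the coefficient of $x_{k+1}^2$ in $q_{k+1}$ is $d^{(k+1)}_1 = \dfrac{d_{k+1}}{d_k}$.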

Thus we have shown that if $d_1, d_2, \dots, d_n$ are all nonzero, then there is an "upper triangular" change of variables

\[
\begin{aligned}
y_1 &= x_1 + b_{12}x_2 + b_{13}x_3 + \cdots + b_{1n}x_n, \\
y_2 &= x_2 + b_{23}x_3 + \cdots + b_{2n}x_n, \\
&\ \ \vdots \\
y_n &= x_n,
\end{aligned}
\]

such that $q = d_1y_1^2 + \dfrac{d_2}{d_1}y_2^2 + \cdots + \dfrac{d_n}{d_{n-1}}y_n^2$. Consequently, we conclude

Sylvester’s criterion

1. If d1 > 0, d2 > 0, . . . , dn > 0, then q(~x) is positive definite.

2. If −d1 > 0, d2 > 0, . . . , (−1)ndn > 0, then q(~x) is negative definite.

3. If d1 > 0, d2 > 0, . . . , dk > 0, dk+1 < 0, or −d1 > 0, d2 > 0, . . . ,(−1)kdk > 0, (−1)k+1dk+1 < 0, then q(~x) is indefinite.
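The criterion is mechanical enough to be sketched in code. The example below computes the principal minors of a symmetric coefficient matrix with exact fractions and applies the three cases in the order stated. It is an illustration under stated assumptions, not a general-purpose routine: the names `det`, `principal_minors` and `classify` are made up here, and the answer "not decided by the criterion" is returned whenever none of the three cases applies (for instance when some minor vanishes too early).

```python
from fractions import Fraction

def det(M):
    """Determinant by Gaussian elimination over exact fractions."""
    A = [[Fraction(v) for v in row] for row in M]
    n, d = len(A), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if A[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            d = -d  # a row swap flips the sign of the determinant
        d *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= f * A[c][k]
    return d

def principal_minors(A):
    # d_k is the determinant of the upper left k x k block.
    return [det([row[:k] for row in A[:k]]) for k in range(1, len(A) + 1)]

def classify(A):
    d = principal_minors(A)
    for s, name in ((1, "positive definite"), (-1, "negative definite")):
        signed = [s ** k * v for k, v in enumerate(d, start=1)]  # s^k d_k
        if all(v > 0 for v in signed):
            return name                      # cases 1 and 2
        first_bad = next(i for i, v in enumerate(signed) if v <= 0)
        if first_bad >= 1 and signed[first_bad] < 0:
            return "indefinite"              # case 3
    return "not decided by the criterion"

# Coefficient matrices of Examples 5.2.3 and 5.2.4 (a_ij is half the
# coefficient of x_i x_j for i != j).
A3 = [[1, 3, 1], [3, 13, 9], [1, 9, 14]]
A4 = [[4, -2, 2, -4], [-2, 19, 5, 8], [2, 5, 0, 6], [-4, 8, 6, -4]]
print(principal_minors(A3), classify(A3))
print(principal_minors(A4), classify(A4))
```

For Example 5.2.3 the minors are $1, 4, 16$, so the ratios $d_1, d_2/d_1, d_3/d_2$ are $1, 4, 4$. For Example 5.2.4 the minors are $4, 72, -216, -432$, with ratios $4, 18, -3, 2$. These agree with the coefficients found by completing the squares once each square is normalized to have leading coefficient $1$.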

5.2.3 Multilinear Map and Polynomial

A map $F(\vec{x}_1, \vec{x}_2, \dots, \vec{x}_k) : \mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \times \cdots \times \mathbb{R}^{n_k} \to \mathbb{R}^m$ is multilinear if it is linear in each of its $k$ variables. For example, if $B_1(\vec{x}, \vec{y})$ and $B_2(\vec{u}, \vec{v})$ are bilinear, then $B_1(\vec{x}, B_2(\vec{u}, \vec{v}))$ is a trilinear map in $\vec{x}$, $\vec{u}$, $\vec{v}$. The addition, scalar multiplication and composition of multilinear maps of matching types are still multilinear.

Similar to the discussion of linear and bilinear maps, a map is multilinear if and only if it is given by

\[
F = \sum_{i_1, i_2, \dots, i_k} x_{1i_1}x_{2i_2} \cdots x_{ki_k}\vec{a}_{i_1i_2 \cdots i_k}, \quad \vec{a}_{i_1i_2 \cdots i_k} = F(\vec{e}_{i_1}, \vec{e}_{i_2}, \dots, \vec{e}_{i_k}), \tag{5.2.18}
\]

in terms of the coordinates of the variables. Moreover, we have

\[
\|F(\vec{x}_1, \vec{x}_2, \dots, \vec{x}_k)\| \le \Big(\sum_{i_1, i_2, \dots, i_k} \|\vec{a}_{i_1i_2 \cdots i_k}\|\Big)\|\vec{x}_1\|_\infty\|\vec{x}_2\|_\infty \cdots \|\vec{x}_k\|_\infty.
\]

Thus the norm of multilinear maps can be defined similar to the linear and bilinear maps.

A map is multilinear if and only if its coordinate functions are multilinear (of the same type). Multilinear functions are equivalent to the collection $(a_{i_1i_2 \cdots i_k})$ of its coefficients, which can be imagined as a "$k$-dimensional matrix". Multilinear functions on $\mathbb{R}^n \times \mathbb{R}^n \times \cdots \times \mathbb{R}^n$ are called multilinear forms on $\mathbb{R}^n$.

A multilinear map $F$ on $\mathbb{R}^n \times \mathbb{R}^n \times \cdots \times \mathbb{R}^n$ is symmetric if switching any two variables does not change the value

\[
F(\vec{x}_1, \dots, \vec{x}_i, \dots, \vec{x}_j, \dots, \vec{x}_k) = F(\vec{x}_1, \dots, \vec{x}_j, \dots, \vec{x}_i, \dots, \vec{x}_k).
\]

This is equivalent to the coefficients $\vec{a}_{i_1i_2 \cdots i_k}$ being independent of the order of indices.

A $k$-th order form

\[
\phi(\vec{x}) = f(\vec{x}, \vec{x}, \dots, \vec{x}) = \sum_{i_1, i_2, \dots, i_k} a_{i_1i_2 \cdots i_k}x_{i_1}x_{i_2} \cdots x_{i_k} \tag{5.2.19}
\]

is obtained by taking all vectors in a multilinear form of $k$ vectors to be the same. Similar to the quadratic forms, usually only symmetric multilinear forms are used here, and this gives an equivalence between symmetric multilinear forms of $k$ vectors and $k$-th order forms. Write $a_{k_1k_2 \cdots k_n} = a_{i_1i_2 \cdots i_k}$ when the collection $\{i_1, i_2, \dots, i_k\}$ consists of $k_i$ copies of $i$ for any $1 \le i \le n$. For example,

\[
a_{244}x_2x_4x_4 = a_{424}x_4x_2x_4 = a_{442}x_4x_4x_2 = a_{0102}x_1^0x_2^1x_3^0x_4^2.
\]

Then we have

\[
\phi(x_1, x_2, \dots, x_n) = \sum_{\substack{k_1 + k_2 + \cdots + k_n = k \\ k_i \ge 0}} \frac{k!}{k_1!k_2! \cdots k_n!}a_{k_1k_2 \cdots k_n}x_1^{k_1}x_2^{k_2} \cdots x_n^{k_n}. \tag{5.2.20}
\]

The $k$-th order forms satisfy $\phi(c\vec{x}) = c^k\phi(\vec{x})$. They are homogeneous functions of order $k$ and are multivariable analogues of the monomial $x^k$ for single variable polynomials. Thus a $k$-th order multivariable polynomial on $\mathbb{R}^n$ is a linear combination of $j$-th order forms with $j \le k$

\[
\begin{aligned}
p(\vec{x}) &= \phi_0(\vec{x}) + \phi_1(\vec{x}) + \phi_2(\vec{x}) + \cdots + \phi_k(\vec{x}) \\
&= \sum_{\substack{k_1 + k_2 + \cdots + k_n \le k \\ k_i \ge 0}} b_{k_1k_2 \cdots k_n}x_1^{k_1}x_2^{k_2} \cdots x_n^{k_n}.
\end{aligned}
\tag{5.2.21}
\]

By defining a polynomial as the sum of reductions of multilinear forms, the concept extends to general vector spaces and to maps between vector spaces. In particular, a map $F : \mathbb{R}^n \to \mathbb{R}^m$ is a polynomial map if and only if all its coordinate functions are polynomials.

A multilinear map $F$ on $\mathbb{R}^n \times \mathbb{R}^n \times \cdots \times \mathbb{R}^n$ is alternating if switching any two variables changes the sign

\[
F(\vec{x}_1, \dots, \vec{x}_i, \dots, \vec{x}_j, \dots, \vec{x}_k) = -F(\vec{x}_1, \dots, \vec{x}_j, \dots, \vec{x}_i, \dots, \vec{x}_k).
\]

This is equivalent to the coefficients $\vec{a}_{i_1i_2 \cdots i_k}$ changing signs when two indices are exchanged. The alternating property is also equivalent to the value being zero when two vectors are equal

\[
F(\vec{x}_1, \dots, \vec{y}, \dots, \vec{y}, \dots, \vec{x}_k) = 0.
\]

In particular, if $k > n$, then at least two indices in $\vec{a}_{i_1i_2 \cdots i_k}$ must be the same. Therefore the coefficient vector is zero, and any alternating multilinear map is zero.

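Suppose $k = n$. Then we have

\[
\begin{aligned}
F(\vec{x}_1, \vec{x}_2, \dots, \vec{x}_n)
&= \sum_{i_1, i_2, \dots, i_n} x_{1i_1}x_{2i_2} \cdots x_{ni_n}F(\vec{e}_{i_1}, \vec{e}_{i_2}, \dots, \vec{e}_{i_n}) \\
&= \sum_{i_1, i_2, \dots, i_n} \mathrm{sign}(i_1, i_2, \dots, i_n)\,x_{1i_1}x_{2i_2} \cdots x_{ni_n}F(\vec{e}_1, \vec{e}_2, \dots, \vec{e}_n) \\
&= \det\begin{pmatrix} x_{11} & x_{21} & \cdots & x_{n1} \\ x_{12} & x_{22} & \cdots & x_{n2} \\ \vdots & \vdots & & \vdots \\ x_{1n} & x_{2n} & \cdots & x_{nn} \end{pmatrix}F(\vec{e}_1, \vec{e}_2, \dots, \vec{e}_n),
\end{aligned}
\tag{5.2.22}
\]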

where $(i_1, i_2, \dots, i_n)$ is a rearrangement of $(1, 2, \dots, n)$ (called a permutation), and $\mathrm{sign}(i_1, i_2, \dots, i_n)$ is $1$ if it takes an even number of steps to recover $(1, 2, \dots, n)$ from $(i_1, i_2, \dots, i_n)$ by exchanging pairs of numbers, and $\mathrm{sign}(i_1, i_2, \dots, i_n)$ is $-1$ if it takes an odd number of steps. Thus the determinant of $n \times n$ matrices is the unique multilinear alternating function of the $n$ column vectors, such that the value at the identity matrix (corresponding to the columns $\vec{e}_1, \vec{e}_2, \dots, \vec{e}_n$) is $1$.
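The permutation sum in (5.2.22) translates directly into code. The sketch below is only meant to make the formula concrete (it takes $n!$ steps, so it is not a practical algorithm). It computes the sign from the parity of the number of inversions, which is assumed here to agree with the parity of the number of pair exchanges described above; this is a standard fact not proved in this section. Indices are 0-based and the function names are hypothetical.

```python
from itertools import permutations
from math import prod

def sign(perm):
    """+1 or -1 according to the parity of the number of inversions."""
    inversions = sum(1 for a in range(len(perm))
                     for b in range(a + 1, len(perm)) if perm[a] > perm[b])
    return -1 if inversions % 2 else 1

def det_by_permutations(vectors):
    """Sum of sign(i_1..i_n) x_{1 i_1} ... x_{n i_n}, where vectors[j] is x_{j+1}."""
    n = len(vectors)
    return sum(sign(p) * prod(vectors[j][p[j]] for j in range(n))
               for p in permutations(range(n)))

e = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
x = [[2, 0, 1], [1, 3, 0], [0, 1, 4]]

assert det_by_permutations(e) == 1                     # value at the identity
assert det_by_permutations(x) == 25
assert det_by_permutations([x[1], x[0], x[2]]) == -25  # alternating
```

Swapping two of the vectors changes the sign, as the alternating property requires, and the value at the standard basis is $1$.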

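Suppose $k \le n$. Then we have

\[
F(\vec{e}_{i_1}, \vec{e}_{i_2}, \dots, \vec{e}_{i_k}) = \pm F(\vec{e}_{j_1}, \vec{e}_{j_2}, \dots, \vec{e}_{j_k}),
\]

where $j_1 < j_2 < \cdots < j_k$ is the rearrangement of $i_1, i_2, \dots, i_k$ in increasing order. Then a computation similar to (5.2.22) tells us

\[
\begin{aligned}
F(\vec{x}_1, \vec{x}_2, \dots, \vec{x}_k)
&= \sum_{i_1, i_2, \dots, i_k} x_{1i_1}x_{2i_2} \cdots x_{ki_k}F(\vec{e}_{i_1}, \vec{e}_{i_2}, \dots, \vec{e}_{i_k}) \\
&= \sum_{j_1 < j_2 < \cdots < j_k} \det\begin{pmatrix} x_{1j_1} & x_{2j_1} & \cdots & x_{kj_1} \\ x_{1j_2} & x_{2j_2} & \cdots & x_{kj_2} \\ \vdots & \vdots & & \vdots \\ x_{1j_k} & x_{2j_k} & \cdots & x_{kj_k} \end{pmatrix}F(\vec{e}_{j_1}, \vec{e}_{j_2}, \dots, \vec{e}_{j_k}).
\end{aligned}
\tag{5.2.23}
\]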

The skew-symmetric bilinear forms can be described by the generalized cross product $\vec{x} \times \vec{y} : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^{\frac{n(n-1)}{2}}$, where the coordinates of $\mathbb{R}^{\frac{n(n-1)}{2}}$ are indexed by $1 \le i < j \le n$. For alternating multilinear $k$-forms, we need to consider the cross product of $k$ vectors with value in a Euclidean space in which the coordinates are indexed by $1 \le i_1 < i_2 < \cdots < i_k \le n$. Therefore for any subset $I = \{i_1, i_2, \dots, i_k\} \subset [n] = \{1, 2, \dots, n\}$, arrange the indices in $I$ in increasing order and introduce the symbol

\[
\vec{e}_{\wedge I} = \vec{e}_{i_1} \wedge \vec{e}_{i_2} \wedge \cdots \wedge \vec{e}_{i_k}.
\]

Define the $k$-th exterior product $\Lambda^k\mathbb{R}^n \cong \mathbb{R}^{\frac{n!}{k!(n-k)!}}$ of $\mathbb{R}^n$ to be the vector space with the symbols $\vec{e}_{\wedge I}$ as basis (called the standard basis of $\Lambda^k\mathbb{R}^n$). Then define the $k$-th order wedge product to be the map

\[
\vec{x}_1 \wedge \vec{x}_2 \wedge \cdots \wedge \vec{x}_k : \mathbb{R}^n \times \mathbb{R}^n \times \cdots \times \mathbb{R}^n \to \Lambda^k\mathbb{R}^n
\]

given by

\[
\vec{x}_1 \wedge \vec{x}_2 \wedge \cdots \wedge \vec{x}_k = \sum_{i_1 < i_2 < \cdots < i_k} \det\begin{pmatrix} x_{1i_1} & x_{2i_1} & \cdots & x_{ki_1} \\ x_{1i_2} & x_{2i_2} & \cdots & x_{ki_2} \\ \vdots & \vdots & & \vdots \\ x_{1i_k} & x_{2i_k} & \cdots & x_{ki_k} \end{pmatrix}\vec{e}_{i_1} \wedge \vec{e}_{i_2} \wedge \cdots \wedge \vec{e}_{i_k}. \tag{5.2.24}
\]

The wedge product is multilinear and alternating. The notation is consistent in the sense that if $1 \le i_1 < i_2 < \cdots < i_k \le n$ and $\vec{x}_j = \vec{e}_{i_j}$, then the right side of (5.2.24) is indeed equal to $\vec{e}_{i_1} \wedge \vec{e}_{i_2} \wedge \cdots \wedge \vec{e}_{i_k}$. Moreover, if $I = \{i_1, i_2, \dots, i_k\}$ but $i_j$ is not necessarily increasing, then $\vec{e}_{i_1} \wedge \vec{e}_{i_2} \wedge \cdots \wedge \vec{e}_{i_k} = \pm\vec{e}_{\wedge I}$, where the sign is determined by the number of exchanges needed in order to rearrange the indices in increasing order.

Suppose

~y1 = a11~x1 + a12~x2 + · · ·+ a1k~xk,

~y2 = a21~x1 + a22~x2 + · · ·+ a2k~xk,

...

~yk = ak1~x1 + ak2~x2 + · · ·+ akk~xk,

and denote the coefficient matrix A = (aij). Then we have

~y1 ∧ ~y2 ∧ · · · ∧ ~yk = (detA)~x1 ∧ ~x2 ∧ · · · ∧ ~xk. (5.2.25)

The equality can be derived similar to (5.2.22). It can also be derived by the fact that both sides are alternating multilinear maps of the row vectors of $A$, and both sides have the same values when the rows of $A$ are standard basis vectors.

For the special case that $\vec{x}_i$ form the standard basis of the Euclidean space, the equality (5.2.25) becomes

\[
\vec{x}_1 \wedge \vec{x}_2 \wedge \cdots \wedge \vec{x}_n = \det(\vec{x}_1\ \vec{x}_2\ \cdots\ \vec{x}_n)\,\vec{e}_{\wedge[n]} \quad \text{for } \vec{x}_i \in \mathbb{R}^n. \tag{5.2.26}
\]

Taking the map $F$ in the equality (5.2.22) to be the wedge product will also lead to (5.2.26). The equality also generalizes (5.2.12) for the case $n = 2$.

If $\vec{x}_1, \vec{x}_2, \dots, \vec{x}_k$ are linearly independent, then by extending the vectors to a basis of $\mathbb{R}^n$, the equality (5.2.26) implies $\vec{x}_1 \wedge \vec{x}_2 \wedge \cdots \wedge \vec{x}_k \ne 0$. Conversely, if the vectors are linearly dependent, then one vector, say $\vec{x}_1$, can be written as a linear combination of the others

\[
\vec{x}_1 = c_2\vec{x}_2 + \cdots + c_k\vec{x}_k,
\]

and by the alternating property of the wedge product,

\[
\vec{x}_1 \wedge \vec{x}_2 \wedge \cdots \wedge \vec{x}_k = c_2\vec{x}_2 \wedge \vec{x}_2 \wedge \cdots \wedge \vec{x}_k + \cdots + c_k\vec{x}_k \wedge \vec{x}_2 \wedge \cdots \wedge \vec{x}_k = 0.
\]

Therefore we conclude that

\[
\vec{x}_1, \vec{x}_2, \dots, \vec{x}_k \text{ are linearly independent} \iff \vec{x}_1 \wedge \vec{x}_2 \wedge \cdots \wedge \vec{x}_k \ne 0. \tag{5.2.27}
\]
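The coordinates of a wedge product in (5.2.24) are the $k \times k$ minors of the matrix with columns $\vec{x}_1, \dots, \vec{x}_k$, so the criterion (5.2.27) can be tested directly. A minimal sketch follows, with 0-based indices and a dictionary keyed by the increasing index tuples $I$. The representation and the names `det`, `wedge` and `independent` are choices made for this example only.

```python
from itertools import combinations

def det(M):
    """Determinant by expansion along the first row (fine for small sizes)."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def wedge(vectors):
    """Coordinates of x_1 ^ ... ^ x_k with respect to the basis e_{^I}, as in (5.2.24)."""
    n, k = len(vectors[0]), len(vectors)
    # Row r of the minor uses coordinate I[r]; column c uses the vector x_{c+1}.
    return {I: det([[v[i] for v in vectors] for i in I])
            for I in combinations(range(n), k)}

def independent(vectors):
    # Criterion (5.2.27): independent iff the wedge product is nonzero.
    return any(c != 0 for c in wedge(vectors).values())

x1, x2 = [1, 0, 2, 0], [0, 1, 1, 0]
w = wedge([x1, x2])
assert independent([x1, x2])
assert not independent([x1, [2 * a for a in x1]])                # dependent pair
assert wedge([x2, x1]) == {I: -c for I, c in w.items()}          # alternating
print(w)
```

With integer entries all minors are computed exactly, so the test for a nonzero coordinate involves no rounding.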

A linear transform L : Rn → Rm induces a linear transform

ΛL = ΛkL : ΛkRn → ΛkRm

defined by the values on the basis vectors

ΛL(~ei1 ∧ ~ei2 ∧ · · · ∧ ~eik) = L(~ei1) ∧ L(~ei2) ∧ · · · ∧ L(~eik).

Then we have

ΛL(~x1 ∧ ~x2 ∧ · · · ∧ ~xk) = L(~x1) ∧ L(~x2) ∧ · · · ∧ L(~xk) (5.2.28)

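because both sides are multilinear maps from $(\mathbb{R}^n)^k$ to $\Lambda\mathbb{R}^m$ with the same values at the standard basis vectors. The equality (5.2.28) implies that

\[
\Lambda(L \circ K) = \Lambda L \circ \Lambda K.
\]

Therefore if $L$ is invertible, then $\Lambda L$ is also invertible. Since any basis $\vec{b}_1, \vec{b}_2, \dots, \vec{b}_n$ of $\mathbb{R}^n$ is the image of the standard basis $\vec{e}_1, \vec{e}_2, \dots, \vec{e}_n$ under an invertible linear map, we conclude from (5.2.28) that $\vec{b}_{\wedge I} = \vec{b}_{i_1} \wedge \vec{b}_{i_2} \wedge \cdots \wedge \vec{b}_{i_k}$ also form a basis of the exterior product $\Lambda^k\mathbb{R}^n$. This suggests that the exterior product $\Lambda V$ can be defined for any vector space $V$.

Because the dimension of $\Lambda^n\mathbb{R}^n$ is $1$, the linear transform $\Lambda^nL : \Lambda^n\mathbb{R}^n \to \Lambda^n\mathbb{R}^n$ induced by a linear transform $L : \mathbb{R}^n \to \mathbb{R}^n$ is multiplying a number

\[
L(\vec{x}_1) \wedge L(\vec{x}_2) \wedge \cdots \wedge L(\vec{x}_n) = d\,\vec{x}_1 \wedge \vec{x}_2 \wedge \cdots \wedge \vec{x}_n. \tag{5.2.29}
\]

By comparing the special case

\[
L(\vec{e}_1) \wedge L(\vec{e}_2) \wedge \cdots \wedge L(\vec{e}_n) = d\,\vec{e}_{\wedge[n]}
\]

with (5.2.26), we find the number $d$ to be the determinant of the matrix corresponding to $L$. Therefore the equation (5.2.29) gives a definition

\[
L(\vec{x}_1) \wedge L(\vec{x}_2) \wedge \cdots \wedge L(\vec{x}_n) = (\det L)\,\vec{x}_1 \wedge \vec{x}_2 \wedge \cdots \wedge \vec{x}_n \tag{5.2.30}
\]

of the determinant of a linear transform from $\mathbb{R}^n$ to itself without explicitly referring to the standard basis. Such a definition can be directly extended to general vector spaces. We also note that the equality $\Lambda^n(L \circ K) = \Lambda^nL \circ \Lambda^nK$ simply means

\[
\det(L \circ K) = \det L \det K. \tag{5.2.31}
\]

This gives a conceptual proof of the similar equality for matrices.

We have $\Lambda^0\mathbb{R}^n = \mathbb{R}$, $\Lambda^1\mathbb{R}^n = \mathbb{R}^n$, $\Lambda^n\mathbb{R}^n = \mathbb{R}\vec{e}_{\wedge[n]}$, and $\Lambda^k\mathbb{R}^n = 0$ for $k > n$. The total exterior product space

\[
\Lambda\mathbb{R}^n = \Lambda^0\mathbb{R}^n \oplus \Lambda^1\mathbb{R}^n \oplus \Lambda^2\mathbb{R}^n \oplus \cdots \oplus \Lambda^n\mathbb{R}^n
\]

has dimension $2^n$. Define the wedge product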

∧ : ΛkRn × ΛlRn → Λk+lRn, ΛRn × ΛRn → ΛRn

to be the bilinear map that extends the obvious values at the standard basis vectors. Then we have

\[
(\vec{x}_1 \wedge \vec{x}_2 \wedge \cdots \wedge \vec{x}_k) \wedge (\vec{y}_1 \wedge \vec{y}_2 \wedge \cdots \wedge \vec{y}_l) = \vec{x}_1 \wedge \vec{x}_2 \wedge \cdots \wedge \vec{x}_k \wedge \vec{y}_1 \wedge \vec{y}_2 \wedge \cdots \wedge \vec{y}_l \tag{5.2.32}
\]

because both sides are multilinear maps from $(\mathbb{R}^n)^{k+l}$ to $\Lambda^{k+l}\mathbb{R}^n$ with the same values at the standard basis vectors $\vec{e}_{\wedge I}$. The associativity

\[
(\vec{\lambda} \wedge \vec{\mu}) \wedge \vec{\nu} = \vec{\lambda} \wedge (\vec{\mu} \wedge \vec{\nu}) \tag{5.2.33}
\]

and the graded commutativity

\[
\vec{\lambda} \wedge \vec{\mu} = (-1)^{kl}\vec{\mu} \wedge \vec{\lambda} \quad \text{for } \vec{\lambda} \in \Lambda^k\mathbb{R}^n,\ \vec{\mu} \in \Lambda^l\mathbb{R}^n \tag{5.2.34}
\]

can also be derived by considering both sides as multilinear maps that coincide at the standard basis vectors. This makes $\Lambda\mathbb{R}^n$ into an algebra with unit $1 \in \Lambda^0\mathbb{R}^n$, called the exterior algebra. The exterior algebra can be generated from the unit $1$ and the vectors in $\mathbb{R}^n$ by the wedge product.

For a linear transform $L : \mathbb{R}^n \to \mathbb{R}^m$, the induced transform $\Lambda L$ on the exterior algebra is an algebra homomorphism

ΛL(1) = 1,

ΛL(a~λ+ b~µ) = aΛL(~λ) + bΛL(~µ),

ΛL(~λ ∧ ~µ) = ΛL(~λ) ∧ ΛL(~µ).

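By taking the standard basis $\vec{e}_{\wedge I}$ as an orthonormal basis, the exterior algebra $\Lambda\mathbb{R}^n$ becomes an inner product space. Then we have the equality

\[
(\vec{x}_1 \wedge \vec{x}_2 \wedge \cdots \wedge \vec{x}_k) \cdot (\vec{y}_1 \wedge \vec{y}_2 \wedge \cdots \wedge \vec{y}_k) = \det(\vec{x}_i \cdot \vec{y}_j)_{1 \le i,j \le k} = \det X^TY, \tag{5.2.35}
\]

where

\[
X = (\vec{x}_1\ \vec{x}_2\ \cdots\ \vec{x}_k), \quad Y = (\vec{y}_1\ \vec{y}_2\ \cdots\ \vec{y}_k).
\]

The equality is a consequence of the fact that both sides are multilinear functions on $(\mathbb{R}^n)^{2k}$ and are equal at the standard basis. As a result, if $U : \mathbb{R}^n \to \mathbb{R}^n$ is an orthogonal transform, then $\Lambda U : \Lambda\mathbb{R}^n \to \Lambda\mathbb{R}^n$ is also an orthogonal transform. In other words, if $\vec{b}_1, \vec{b}_2, \dots, \vec{b}_n$ is an orthogonal basis of $\mathbb{R}^n$, then $\vec{b}_{\wedge I}$ also form an orthogonal basis of $\Lambda\mathbb{R}^n$.

The equality (5.2.35) also tells us that if $\vec{\lambda} = \vec{x}_1 \wedge \vec{x}_2 \wedge \cdots \wedge \vec{x}_k$, $\vec{\mu} = \vec{y}_1 \wedge \vec{y}_2 \wedge \cdots \wedge \vec{y}_k$, $\vec{\xi} = \vec{z}_1 \wedge \vec{z}_2 \wedge \cdots \wedge \vec{z}_l$, $\vec{\eta} = \vec{w}_1 \wedge \vec{w}_2 \wedge \cdots \wedge \vec{w}_l$, such that $\vec{x}_i \cdot \vec{w}_j = 0$, then

\[
(\vec{\lambda} \wedge \vec{\xi}) \cdot (\vec{\mu} \wedge \vec{\eta}) = (\vec{\lambda} \cdot \vec{\mu})(\vec{\eta} \cdot \vec{\xi}). \tag{5.2.36}
\]

In general, for a subspace $V \subset \mathbb{R}^n$, we let $\Lambda V$ be the subspace of $\Lambda\mathbb{R}^n$ consisting of linear combinations of $\vec{x}_1 \wedge \vec{x}_2 \wedge \cdots \wedge \vec{x}_k$ with $\vec{x}_i \in V$. Now if $V$ and $W$ are subspaces orthogonal to each other, then the equality (5.2.36) holds for $\vec{\lambda} \in \Lambda V$ and $\vec{\eta} \in \Lambda W$.

As suggested by the cross product and the extension to higher dimension, the Euclidean norm

\[
\|\vec{x}_1 \wedge \vec{x}_2 \wedge \cdots \wedge \vec{x}_k\|_2 = \sqrt{\det(\vec{x}_i \cdot \vec{x}_j)_{1 \le i,j \le k}} = \sqrt{\det X^TX} \tag{5.2.37}
\]

with respect to the inner product in $\Lambda^k\mathbb{R}^n$ is the $k$-dimensional volume of the parallelepiped spanned by the $k$ vectors in $\mathbb{R}^n$. Strictly speaking, the concept of volume is yet to be defined, and the formula needs to be justified. The justification will be carried out in Section 7.1.5. Note that by (5.2.36), we have

\[
\vec{\lambda} \in \Lambda V,\ \vec{\mu} \in \Lambda W,\ V \perp W \implies \|\vec{\lambda} \wedge \vec{\mu}\|_2 = \|\vec{\lambda}\|_2\|\vec{\mu}\|_2. \tag{5.2.38}
\]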
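The norm formula (5.2.37) says that the sum of the squares of all $k \times k$ minors of $X$ equals $\det X^TX$. This can be confirmed on small integer examples. The sketch below assumes the same representation of wedge coordinates as minors over increasing index sets. The helper names `det`, `minors` and `gram_det` are hypothetical.

```python
from itertools import combinations

def det(M):
    """Determinant by expansion along the first row (fine for small sizes)."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def minors(vectors):
    """All k x k minors of the matrix X whose columns are the given vectors."""
    n, k = len(vectors[0]), len(vectors)
    return [det([[v[i] for v in vectors] for i in I])
            for I in combinations(range(n), k)]

def gram_det(vectors):
    """det(x_i . x_j), which is det(X^T X)."""
    return det([[sum(a * b for a, b in zip(u, v)) for v in vectors]
                for u in vectors])

for vectors in ([[1, 0, 2, 0], [0, 1, 1, 0]],
                [[1, 0, 2, 0], [0, 1, 1, 0], [3, -1, 0, 2]]):
    norm_squared = sum(m * m for m in minors(vectors))  # left side of (5.2.37), squared
    assert norm_squared == gram_det(vectors)
```

For the first pair both sides equal $6$, so the parallelogram spanned by the two vectors has area $\sqrt{6}$.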

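Define the duality map (or Hodge star operator) on the exterior algebra

\[
\vec{\lambda} \mapsto \vec{\lambda}^\star : \Lambda^k\mathbb{R}^n \to \Lambda^{n-k}\mathbb{R}^n,
\]

as a linear map with the value on the standard basis given by

\[
\vec{e}^{\,\star}_{\wedge I} = \pm\vec{e}_{\wedge([n]-I)},
\]

where the sign is chosen so that

\[
\vec{e}_{\wedge I} \wedge \vec{e}^{\,\star}_{\wedge I} = \vec{e}_{\wedge[n]} = \vec{e}_1 \wedge \vec{e}_2 \wedge \cdots \wedge \vec{e}_n. \tag{5.2.39}
\]

For example, we have

\[
\begin{aligned}
\vec{e}^{\,\star}_i &= (-1)^{i-1}\vec{e}_{\wedge([n]-i)}, \\
\vec{e}^{\,\star}_{i \wedge j} &= (-1)^{i+j-1}\vec{e}_{\wedge([n]-\{i,j\})}, \quad \text{for } i < j, \\
\vec{e}^{\,\star}_{\wedge([n]-i)} &= (-1)^{n-i}\vec{e}_i.
\end{aligned}
\]

Moreover, the dual in $\mathbb{R}^2$ is the counterclockwise rotation by 90 degrees

\[
(x_1, x_2)^\star = x_1\vec{e}^{\,\star}_1 + x_2\vec{e}^{\,\star}_2 = x_1\vec{e}_2 - x_2\vec{e}_1 = (-x_2, x_1), \tag{5.2.40}
\]

and the cross product in $\mathbb{R}^3$ is the dual of the wedge product

\[
\begin{aligned}
&((x_1, x_2, x_3) \wedge (y_1, y_2, y_3))^\star \\
&= \det\begin{pmatrix} x_1 & y_1 \\ x_2 & y_2 \end{pmatrix}(\vec{e}_1 \wedge \vec{e}_2)^\star + \det\begin{pmatrix} x_1 & y_1 \\ x_3 & y_3 \end{pmatrix}(\vec{e}_1 \wedge \vec{e}_3)^\star + \det\begin{pmatrix} x_2 & y_2 \\ x_3 & y_3 \end{pmatrix}(\vec{e}_2 \wedge \vec{e}_3)^\star \\
&= \det\begin{pmatrix} x_1 & y_1 \\ x_2 & y_2 \end{pmatrix}\vec{e}_3 - \det\begin{pmatrix} x_1 & y_1 \\ x_3 & y_3 \end{pmatrix}\vec{e}_2 + \det\begin{pmatrix} x_2 & y_2 \\ x_3 & y_3 \end{pmatrix}\vec{e}_1 \\
&= (x_1, x_2, x_3) \times (y_1, y_2, y_3).
\end{aligned}
\tag{5.2.41}
\]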

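Suppose $I$ consists of $k$ indices. Then by $\vec e_{\wedge I} \wedge \vec e_{\wedge I}^{\,\star} = \vec e_{\wedge[n]} = \vec e_{\wedge I}^{\,\star} \wedge \vec e_{\wedge I}^{\,\star\star} = (-1)^{k(n-k)}\vec e_{\wedge I}^{\,\star\star} \wedge \vec e_{\wedge I}^{\,\star}$, we have

\[ \vec\lambda^{\star\star} = (-1)^{k(n-k)}\vec\lambda \quad \text{for } \vec\lambda \in \Lambda^k\mathbb{R}^n. \tag{5.2.42} \]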
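The sign in the definition of the dual on basis vectors is the sign of the permutation that lists the indices in $I$ followed by the remaining indices. The sketch below is an illustration only (the names `permutation_sign` and `hodge_star_basis` are hypothetical, and indices run from 1 to $n$ as in the text); it computes that sign by counting inversions, so that the example formulas and the double dual sign $(-1)^{k(n-k)}$ can be checked for small $n$.

```python
def permutation_sign(seq):
    """Sign of the permutation that sorts seq, found by counting inversions."""
    inversions = sum(
        1
        for a in range(len(seq))
        for b in range(a + 1, len(seq))
        if seq[a] > seq[b]
    )
    return -1 if inversions % 2 else 1

def hodge_star_basis(I, n):
    """Return (sign, J) such that the dual of e_I is sign * e_J with J = [n] - I.

    I is an increasing tuple of indices from 1..n. The sign is chosen so that
    e_I wedge (sign * e_J) equals e_1 wedge ... wedge e_n, as in (5.2.39).
    """
    J = tuple(i for i in range(1, n + 1) if i not in I)
    return permutation_sign(I + J), J

# Dual of e_2 in R^3 and dual of e_1 wedge e_3 in R^4.
print(hodge_star_basis((2,), 3))
print(hodge_star_basis((1, 3), 4))
```

Applying `hodge_star_basis` twice returns the original index set, and the product of the two signs is the sign appearing in (5.2.42).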

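We also have

\[ \vec\lambda \wedge \vec\mu^\star = (\vec\lambda \cdot \vec\mu)\vec e_{\wedge[n]} \tag{5.2.43} \]

because both sides are bilinear maps from $(\Lambda^k\mathbb{R}^n)^2$ to $\Lambda^n\mathbb{R}^n$ that are equal at the standard basis vectors. By

\[ (\vec\lambda^\star \cdot \vec\mu^\star)\vec e_{\wedge[n]} = \vec\lambda^\star \wedge \vec\mu^{\star\star} = (-1)^{k(n-k)}\vec\lambda^\star \wedge \vec\mu = \vec\mu \wedge \vec\lambda^\star = (\vec\mu \cdot \vec\lambda)\vec e_{\wedge[n]} = (\vec\lambda \cdot \vec\mu)\vec e_{\wedge[n]}, \]

we find the duality map preserves the inner product

\[ \vec\lambda^\star \cdot \vec\mu^\star = \vec\lambda \cdot \vec\mu, \tag{5.2.44} \]

and therefore also preserves the Euclidean norm

\[ \|\vec\lambda^\star\|_2 = \|\vec\lambda\|_2. \]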


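The equalities (5.2.42) and (5.2.43) imply

\[ \vec\lambda \wedge \vec\mu = (\vec\mu \cdot \vec\lambda^\star)\vec e_{\wedge[n]} \quad \text{for } \vec\lambda \in \Lambda^k\mathbb{R}^n,\ \vec\mu \in \Lambda^{n-k}\mathbb{R}^n. \tag{5.2.45} \]

Since a vector $\vec\xi \in \Lambda^{n-k}\mathbb{R}^n$ is uniquely determined by the dot product $\vec\mu \cdot \vec\xi$ for all $\vec\mu \in \Lambda^{n-k}\mathbb{R}^n$, the dual $\vec\lambda^\star$ may be characterized as the unique exterior vector satisfying (5.2.45) for all $\vec\mu$.

Suppose $U : \mathbb{R}^n \to \mathbb{R}^n$ is an orthogonal transform. Then

\begin{align*}
\Lambda U(\vec\lambda) \wedge \Lambda U(\vec\mu) = \Lambda U(\vec\lambda \wedge \vec\mu)
&= (\vec\mu \cdot \vec\lambda^\star)\Lambda U(\vec e_{\wedge[n]}) \\
&= (\Lambda U(\vec\mu) \cdot \Lambda U(\vec\lambda^\star))(\det U)\vec e_{\wedge[n]}.
\end{align*}

Since $\Lambda U(\vec\mu)$ can be any vector in $\Lambda^{n-k}\mathbb{R}^n$, the equality tells us

\[ \Lambda U(\vec\lambda)^\star = (\det U)\Lambda U(\vec\lambda^\star) \quad \text{for } \vec\lambda \in \Lambda^k\mathbb{R}^n \text{ and orthogonal } U. \tag{5.2.46} \]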

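A basis $\vec b_1, \vec b_2, \dots, \vec b_n$ of $\mathbb{R}^n$ is positively oriented if

\[ \det\begin{pmatrix} \vec b_1 & \vec b_2 & \cdots & \vec b_n \end{pmatrix} > 0. \]

By (5.2.26), this means

\[ \vec b_1 \wedge \vec b_2 \wedge \cdots \wedge \vec b_n = d\,\vec e_1 \wedge \vec e_2 \wedge \cdots \wedge \vec e_n, \quad d > 0. \]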

Suppose $\vec x_1, \vec x_2, \dots, \vec x_k$ and $\vec x_{k+1}, \vec x_{k+2}, \dots, \vec x_n$ are two collections of linearly independent vectors, such that any vector in the first collection is orthogonal to any vector in the second. Then the union $\vec x_1, \vec x_2, \dots, \vec x_n$ of the two collections is a basis of $\mathbb{R}^n$. By modifying $\vec x_n$ by a sign if necessary, we assume the basis $\vec x_1, \vec x_2, \dots, \vec x_n$ is positively oriented. Then we claim that

\[ (\vec x_1 \wedge \vec x_2 \wedge \cdots \wedge \vec x_k)^\star = c\,\vec x_{k+1} \wedge \vec x_{k+2} \wedge \cdots \wedge \vec x_n \tag{5.2.47} \]

with

\[ c = \frac{\|\vec x_1 \wedge \vec x_2 \wedge \cdots \wedge \vec x_k\|_2}{\|\vec x_{k+1} \wedge \vec x_{k+2} \wedge \cdots \wedge \vec x_n\|_2} > 0. \]

The equality (5.2.47) suggests that the duality can be defined in the exterior algebra of any inner product space.

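To derive the equality (5.2.47), we choose an orthonormal basis $\vec b_1, \vec b_2, \dots, \vec b_k$ for the span of $\vec x_1, \vec x_2, \dots, \vec x_k$ and an orthonormal basis $\vec b_{k+1}, \vec b_{k+2}, \dots, \vec b_n$ for the span of $\vec x_{k+1}, \vec x_{k+2}, \dots, \vec x_n$, such that

\[ \vec x_1 \wedge \vec x_2 \wedge \cdots \wedge \vec x_k = \alpha\,\vec b_1 \wedge \vec b_2 \wedge \cdots \wedge \vec b_k, \quad \vec x_{k+1} \wedge \vec x_{k+2} \wedge \cdots \wedge \vec x_n = \beta\,\vec b_{k+1} \wedge \vec b_{k+2} \wedge \cdots \wedge \vec b_n, \]

for some $\alpha, \beta > 0$. Then $\vec b_1, \vec b_2, \dots, \vec b_n$ is also a positively oriented orthonormal basis of $\mathbb{R}^n$. Let $U$ be the orthogonal transform given by the matrix $\begin{pmatrix} \vec b_1 & \vec b_2 & \cdots & \vec b_n \end{pmatrix}$. Then $\det U > 0$ (so that $\det U = 1$) and by (5.2.46)

\begin{align*}
(\vec b_1 \wedge \vec b_2 \wedge \cdots \wedge \vec b_k)^\star
&= (\Lambda U(\vec e_1 \wedge \vec e_2 \wedge \cdots \wedge \vec e_k))^\star \\
&= \Lambda U(\vec e_{k+1} \wedge \vec e_{k+2} \wedge \cdots \wedge \vec e_n) \\
&= \vec b_{k+1} \wedge \vec b_{k+2} \wedge \cdots \wedge \vec b_n.
\end{align*}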


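Therefore we conclude $(\vec x_1 \wedge \vec x_2 \wedge \cdots \wedge \vec x_k)^\star = \dfrac{\alpha}{\beta}\,\vec x_{k+1} \wedge \vec x_{k+2} \wedge \cdots \wedge \vec x_n$.

In the special case $k = n - 1$, we find $\vec y = (\vec x_1 \wedge \vec x_2 \wedge \cdots \wedge \vec x_{n-1})^\star$ is a vector in $\mathbb{R}^n$ such that

\[ \vec x_i \cdot \vec y = 0, \quad \vec x_1 \wedge \vec x_2 \wedge \cdots \wedge \vec x_{n-1} \wedge \vec y = \|\vec x_1 \wedge \vec x_2 \wedge \cdots \wedge \vec x_{n-1}\|_2^2\,\vec e_{\wedge[n]}. \]

Therefore $\vec y$ is a vector orthogonal to $\vec x_1, \vec x_2, \dots, \vec x_{n-1}$ and has "compatible" direction. The geometrical meaning and the equality (5.2.41) show that the map

\[ (\vec x_1, \vec x_2, \dots, \vec x_{n-1}) \mapsto (\vec x_1 \wedge \vec x_2 \wedge \cdots \wedge \vec x_{n-1})^\star : (\mathbb{R}^n)^{n-1} \to \mathbb{R}^n \]

is the true generalization of the cross product in $\mathbb{R}^3$.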


Chapter 6

Multivariable Differentiation


6.1 Differentiation

The differentiation can be extended from single to multivariable by considering the linear approximation. The coefficients in the linear approximation are given by partial derivatives, and most of the properties of differentiation can be extended. However, the multivariable differentiability is no longer equivalent to the existence of partial derivatives. Moreover, we can consider derivatives in directions other than the coordinate directions (which give the partial derivatives).

6.1.1 Differentiability and Derivative

The differentiation of maps between Euclidean spaces can be defined by directly generalizing the definition for single variable functions. Denote $\Delta\vec x = \vec x - \vec x_0$.

Definition 6.1.1. A map $F(\vec x)$ defined on a ball around $\vec x_0$ is differentiable at $\vec x_0$ if there is a linear map $\vec a + L(\Delta\vec x)$, such that for any $\epsilon > 0$, there is $\delta > 0$, such that

\[ \|\Delta\vec x\| = \|\vec x - \vec x_0\| < \delta \implies \|F(\vec x) - \vec a - L(\Delta\vec x)\| \le \epsilon\|\Delta\vec x\|. \tag{6.1.1} \]

The linear transform $L$ is the derivative of $F$ at $\vec x_0$ and is denoted $F'(\vec x_0)$.

The condition (6.1.1) can be rephrased as

\[ \vec a = F(\vec x_0), \quad \lim_{\Delta\vec x \to \vec 0} \frac{\|F(\vec x_0 + \Delta\vec x) - F(\vec x_0) - L(\Delta\vec x)\|}{\|\Delta\vec x\|} = 0. \tag{6.1.2} \]

The differentiability at a point requires the map to be defined everywhere near the point. Thus for single variable functions, the differentiability is defined only for functions on open intervals or unions of open intervals. For multivariable maps, the differentiability is defined only for maps on open subsets. In the future, the differentiability will be extended to maps defined on "differentiable subsets" (called submanifolds).

Similar to the single variable case, we denote the differential $dF = L(d\vec x)$ of a map $F$. Again at the moment it is only a symbolic notation.

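Example 6.1.1. For the map $F(x, y) = (x^2 + y^2, xy) : \mathbb{R}^2 \to \mathbb{R}^2$, we have

\begin{align*}
F(x, y) - F(x_0, y_0) &= (2x_0\Delta x + \Delta x^2 + 2y_0\Delta y + \Delta y^2,\ y_0\Delta x + x_0\Delta y + \Delta x\Delta y) \\
&= (2x_0\Delta x + 2y_0\Delta y,\ y_0\Delta x + x_0\Delta y) + (\Delta x^2 + \Delta y^2,\ \Delta x\Delta y).
\end{align*}

Since $\|(\Delta x^2 + \Delta y^2, \Delta x\Delta y)\|_\infty \le 2\|(\Delta x, \Delta y)\|_\infty^2$, $F$ is differentiable at $(x_0, y_0)$ and the derivative $F'(x_0, y_0)$ is the linear transform $(u, v) \mapsto (2x_0u + 2y_0v, y_0u + x_0v)$. The differential is $d_{(x_0,y_0)}F = (2x_0dx + 2y_0dy, y_0dx + x_0dy)$.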
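The estimate in Example 6.1.1 can be observed numerically. The following sketch is only an illustration (the function names and the base point $(1, -2)$ are arbitrary choices): it computes the ratio between the $L^\infty$-norm of the remainder and the $L^\infty$-norm of the increment, which by the estimate is at most twice the norm of the increment and therefore tends to zero.

```python
def F(x, y):
    """The map of Example 6.1.1."""
    return (x * x + y * y, x * y)

def derivative(x0, y0, u, v):
    """The linear transform F'(x0, y0) applied to (u, v)."""
    return (2 * x0 * u + 2 * y0 * v, y0 * u + x0 * v)

def remainder_ratio(x0, y0, dx, dy):
    """||F(x0+dx, y0+dy) - F(x0, y0) - F'(x0, y0)(dx, dy)||_inf / ||(dx, dy)||_inf."""
    fx, fy = F(x0 + dx, y0 + dy)
    ax, ay = F(x0, y0)
    lx, ly = derivative(x0, y0, dx, dy)
    numerator = max(abs(fx - ax - lx), abs(fy - ay - ly))
    return numerator / max(abs(dx), abs(dy))

steps = (1e-1, 1e-2, 1e-3)
ratios = [remainder_ratio(1.0, -2.0, h, 0.5 * h) for h in steps]
for h, r in zip(steps, ratios):
    print(h, r)
```

The ratio shrinks in proportion to the step, as the condition (6.1.1) requires.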

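Example 6.1.2. For the function $f(\vec x) = \vec x \cdot \vec x = \|\vec x\|_2^2$, we have

\[ f(\vec x_0 + \Delta\vec x) - f(\vec x_0) = 2\vec x_0 \cdot \Delta\vec x + \|\Delta\vec x\|_2^2. \]

Since $2\vec x_0 \cdot \Delta\vec x$ is a linear functional of $\Delta\vec x$, and $\|\Delta\vec x\|_2^2 \le \epsilon\|\Delta\vec x\|_2$ when $\|\Delta\vec x\|_2 < \epsilon$, $f$ is differentiable, with the derivative $f'(\vec x_0)(\vec v) = 2\vec x_0 \cdot \vec v$ and the differential $d_{\vec x_0}f = 2\vec x_0 \cdot d\vec x$.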


Example 6.1.3. The space of $n \times n$ matrices forms the Euclidean space $\mathbb{R}^{n^2}$. For the map $F(X) = X^2 : \mathbb{R}^{n^2} \to \mathbb{R}^{n^2}$ of taking squares of matrices, we have

\[ F(A + H) - F(A) = (A^2 + AH + HA + H^2) - A^2 = AH + HA + H^2. \]

The map $H \mapsto AH + HA$ is a linear transform, and by Proposition 5.2.1, $\|H^2\| \le \|H\|^2 \le \epsilon\|H\|$ when $\|H\| < \epsilon$. Therefore the map is differentiable, with the derivative $F'(A)(H) = AH + HA$ and the differential $d_AF = A(dX) + (dX)A$.
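Because matrices do not commute, the derivative in Example 6.1.3 is $AH + HA$ rather than $2AH$. A minimal sketch with $2 \times 2$ matrices stored as nested lists makes this concrete; the helper names and the sample matrices `A` and `H` are assumptions made for the illustration only.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(a, b, sign=1):
    """Entrywise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def max_norm(a):
    """Largest absolute value of an entry (the L-infinity norm on R^(n^2))."""
    return max(abs(x) for row in a for x in row)

def derivative_of_square(a, h):
    """F'(A)(H) = AH + HA for F(X) = X^2."""
    return combine(matmul(a, h), matmul(h, a))

def remainder(a, h):
    """F(A + H) - F(A) - F'(A)(H), which should equal H^2."""
    ah = combine(a, h)
    increment = combine(matmul(ah, ah), matmul(a, a), -1)
    return combine(increment, derivative_of_square(a, h), -1)

A = [[1.0, 2.0], [-3.0, 0.5]]
H = [[0.3, -0.1], [0.2, 0.4]]
for t in (1.0, 0.1, 0.01):
    tH = [[t * x for x in row] for row in H]
    # The ratio ||remainder|| / ||tH|| shrinks in proportion to t.
    print(t, max_norm(remainder(A, tH)) / max_norm(tH))
```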

Exercise 6.1.1. Use the definition to show the differentiability of the function $f(x, y) = ax^2 + 2bxy + cy^2$ and find the derivative.

Exercise 6.1.2. Prove that if a map is differentiable at $\vec x_0$, then the map is continuous at $\vec x_0$. Then show that the Euclidean norm $\|\vec x\|_2$ is continuous but not differentiable at $\vec 0$.

Exercise 6.1.3. Suppose $F$ is differentiable at $\vec x_0$, with $F(\vec x_0) = \vec 0$. Suppose a function $\lambda(\vec x)$ is continuous at $\vec x_0$. Prove that $\lambda(\vec x)F(\vec x)$ is differentiable at $\vec x_0$.

Exercise 6.1.4. Find the condition for a homogeneous function to be differentiable at $\vec 0$. What about a multihomogeneous function?

Exercise 6.1.5. Suppose $A$ is an $n \times n$ matrix. Find the derivative of the function $A\vec x \cdot \vec x$.

Exercise 6.1.6. Let $X^T$ be the transpose of a matrix $X$. Prove that the derivative of $F(X) = X^TX : \mathbb{R}^{n^2} \to \mathbb{R}^{n^2}$ is $F'(A)(H) = A^TH + H^TA$.

Exercise 6.1.7. For any natural number $k$, find the derivative of the map of taking the $k$-th power of matrices.

Exercise 6.1.8. Use the equality $(I + H)^{-1} = I - H + H^2(I + H)^{-1}$ and Exercise 5.2.12 to find the derivative of the inverse matrix map at the identity matrix $I$.

6.1.2 Partial Derivative

By considering the $L^\infty$-norm, we see that a map is differentiable if and only if each coordinate function is differentiable. Moreover, the linear approximation is obtained by putting together the linear approximations of the coordinate functions.

A function $f(\vec x)$ on $\mathbb{R}^n$ is approximated at $\vec x_0$ by a linear function

\[ a + b_1\Delta x_1 + b_2\Delta x_2 + \cdots + b_n\Delta x_n \]

if for any $\epsilon > 0$, there is $\delta > 0$, such that

\begin{align*}
& |\Delta x_i| = |x_i - x_{i0}| < \delta \text{ for all } i \\
\implies\ & |f(x_1, x_2, \dots, x_n) - a - b_1\Delta x_1 - b_2\Delta x_2 - \cdots - b_n\Delta x_n| \le \epsilon\max\{|\Delta x_1|, |\Delta x_2|, \dots, |\Delta x_n|\}.
\end{align*}

If we fix $x_2 = x_{20}, \dots, x_n = x_{n0}$, and let only $x_1$ change, then the above says that $f(x_1, x_{20}, \dots, x_{n0})$ is approximated by the linear function $a + b_1\Delta x_1$ at $x_1 = x_{10}$. Thus $a = f(\vec x_0)$ and

\[ b_1 = \lim_{\Delta x_1 \to 0} \frac{f(x_{10} + \Delta x_1, x_{20}, \dots, x_{n0}) - f(x_{10}, x_{20}, \dots, x_{n0})}{\Delta x_1} \]


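is the derivative of $f(\vec x)$ in $x_1$ with all the other coordinates fixed. The coefficient is called the partial derivative in the first variable. The other coefficients are the similar partial derivatives and are denoted

\[ b_i = \frac{\partial f}{\partial x_i} = D_{x_i}f = f_{x_i}. \tag{6.1.3} \]

Using the notation, the derivative $f'(\vec x)$ is the linear transform

\[ (v_1, v_2, \dots, v_n) \mapsto \frac{\partial f}{\partial x_1}v_1 + \frac{\partial f}{\partial x_2}v_2 + \cdots + \frac{\partial f}{\partial x_n}v_n : \mathbb{R}^n \to \mathbb{R}, \]

and the differential of the function is

\[ df = \frac{\partial f}{\partial x_1}dx_1 + \frac{\partial f}{\partial x_2}dx_2 + \cdots + \frac{\partial f}{\partial x_n}dx_n. \tag{6.1.4} \]

Moreover, introduce the gradient

\[ \nabla f = \left(\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \dots, \frac{\partial f}{\partial x_n}\right). \tag{6.1.5} \]

Then the linear approximation is $f(\vec x_0) + \nabla f(\vec x_0) \cdot \Delta\vec x$, the derivative is $f'(\vec x_0)(\vec v) = \nabla f(\vec x_0) \cdot \vec v$, and the differential is $df = \nabla f \cdot d\vec x$.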

Example 6.1.4. The function $f(x, y) = 1 + 2x + xy^2$ has the following value and partial derivatives at $(0, 0)$

\[ f(0, 0) = 1, \quad f_x(0, 0) = (2 + y^2)|_{x=0,y=0} = 2, \quad f_y(0, 0) = 2xy|_{x=0,y=0} = 0. \]

Thus the candidate for the linear approximation is $1 + 2(x - 0) + 0(y - 0) = 1 + 2x$. Then we verify that (note that $\Delta x = x$, $\Delta y = y$)

\[ |f(x, y) - 1 - 2x| = |xy^2| \le \|(x, y)\|_\infty^3. \]

Therefore $f$ is differentiable at $(0, 0)$, with $d_{(0,0)}f = 2dx$.

Example 6.1.5. The function $f(x, y) = \sqrt{|xy|}$ satisfies $f(0, 0) = 0$, $f_x(0, 0) = 0$, $f_y(0, 0) = 0$. So the candidate for the linear approximation at $(0, 0)$ is the zero function. However, by Example 5.1.2, the limit

\[ \lim_{(x,y) \to (0,0)} \frac{|f(x, y)|}{\|(x, y)\|_2} = \lim_{x \to 0, y \to 0} \sqrt{\frac{|xy|}{x^2 + y^2}} \]

does not exist. Therefore $f$ is not differentiable at $(0, 0)$, despite the existence of the partial derivatives.
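The failure in Example 6.1.5 is easy to see numerically. In the sketch below (an illustration only, with hypothetical helper names), the difference quotients along the two axes vanish, while the ratio $|f(x, y)| / \|(x, y)\|_2$ stays at $1/\sqrt{2}$ along the diagonal $y = x$ and is zero along the axes, so it has no limit at the origin.

```python
import math

def f(x, y):
    """The function of Example 6.1.5."""
    return math.sqrt(abs(x * y))

def partial_quotients(h):
    """Difference quotients approximating f_x(0, 0) and f_y(0, 0)."""
    return ((f(h, 0.0) - f(0.0, 0.0)) / h,
            (f(0.0, h) - f(0.0, 0.0)) / h)

def remainder_ratio(x, y):
    """|f(x, y) - 0| / ||(x, y)||_2, which would tend to 0 if f were differentiable at (0, 0)."""
    return abs(f(x, y)) / math.hypot(x, y)

for h in (1e-1, 1e-4, 1e-8):
    print(h, partial_quotients(h), remainder_ratio(h, h), remainder_ratio(h, 0.0))
```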

Exercise 6.1.9. Compute the partial derivatives.

1. $4xy^3 + 5x^3y - 3y^2$.

2. $\arctan\dfrac{y}{x}$.

3. exyz sin(x+2y+3z).

4. $\log(x^2 + y^2)$.

5. xyz.

6. (xy)z.

Exercise 6.1.10. Discuss the continuity, the existence of partial derivatives and the differentiability at $(0, 0)$.


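1. $f(x, y) = \begin{cases} \dfrac{|x|^p y}{x^2 + y^2} & \text{if } (x, y) \ne (0, 0) \\ 0 & \text{if } (x, y) = (0, 0) \end{cases}$, $p > 0$.

2. $f(x, y) = \begin{cases} \dfrac{xy}{(x^2 + y^2)^p} & \text{if } (x, y) \ne (0, 0) \\ 0 & \text{if } (x, y) = (0, 0) \end{cases}$.

3. $f(x, y) = \begin{cases} (x^2 + y^2)^p \sin\dfrac{1}{x^2 + y^2} & \text{if } (x, y) \ne (0, 0) \\ 0 & \text{if } (x, y) = (0, 0) \end{cases}$.

4. $f(x, y) = \begin{cases} |x|^p|y|^q \sin\dfrac{1}{x^2 + y^2} & \text{if } (x, y) \ne (0, 0) \\ 0 & \text{if } (x, y) = (0, 0) \end{cases}$.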

Exercise 6.1.11. Prove the properties of the gradient.

1. ∇c = ~0.

2. ∇(f + g) = ∇f +∇g.

3. ∇(fg) = f∇g + g∇f .

4. ∇φ(f) = φ′(f)∇f .

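Putting the linear approximations of coordinate functions together, we get the linear approximation of a differentiable map. In particular, the derivative linear transform $F'(\vec x_0)$ of a map $F = (f_1, f_2, \dots, f_m) : \mathbb{R}^n \to \mathbb{R}^m$ is given by the Jacobian matrix

\[ \frac{\partial F}{\partial\vec x} = \frac{\partial(f_1, f_2, \dots, f_m)}{\partial(x_1, x_2, \dots, x_n)} = \begin{pmatrix}
\dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\
\dfrac{\partial f_2}{\partial x_1} & \dfrac{\partial f_2}{\partial x_2} & \cdots & \dfrac{\partial f_2}{\partial x_n} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial f_m}{\partial x_1} & \dfrac{\partial f_m}{\partial x_2} & \cdots & \dfrac{\partial f_m}{\partial x_n}
\end{pmatrix}. \tag{6.1.6} \]

For a differentiable multivariable function $f(\vec x) : \mathbb{R}^n \to \mathbb{R}$, the Jacobian matrix is the gradient $\nabla f$ considered as a $1 \times n$ matrix.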

It is rather tedious to first use the partial derivatives to find the candidate linear approximation and then verify the condition for differentiability. Fortunately, there is a simple sufficient (but not necessary) condition for the differentiability.

Proposition 6.1.2. Suppose a map is defined near $\vec x_0$. If all the partial derivatives exist near $\vec x_0$ and the partial derivatives are continuous at $\vec x_0$, then the map is differentiable at $\vec x_0$.

A differentiable map $F : \mathbb{R}^n \to \mathbb{R}^m$ is continuously differentiable if the map $\vec x \mapsto F'(\vec x) : \mathbb{R}^n \to \mathbb{R}^{mn}$ is continuous. This is equivalent to all the partial derivatives being continuous.

Proof. Assume the partial derivatives $f_x(x, y)$ and $f_y(x, y)$ exist near $(x_0, y_0)$. Applying the mean value theorem to $f(t, y)$ for $t \in [x_0, x]$ and to $f(x_0, s)$ for


$s \in [y_0, y]$, we get

\begin{align*}
f(x, y) - f(x_0, y_0) &= (f(x, y) - f(x_0, y)) + (f(x_0, y) - f(x_0, y_0)) \\
&= f_x(c, y)(x - x_0) + f_y(x_0, d)(y - y_0) \\
&= f_x(c, y)\Delta x + f_y(x_0, d)\Delta y
\end{align*}

for some $c \in [x_0, x]$ and $d \in [y_0, y]$. If $f_x$ and $f_y$ are continuous at $(x_0, y_0)$, then for any $\epsilon > 0$, there is $\delta > 0$, such that

\[ |\Delta x| < \delta,\ |\Delta y| < \delta \implies |f_x(x, y) - f_x(x_0, y_0)| < \epsilon,\ |f_y(x, y) - f_y(x_0, y_0)| < \epsilon. \]

By $|c - x_0| \le |\Delta x|$ and $|d - y_0| \le |\Delta y|$, we find that $|\Delta x| < \delta$ and $|\Delta y| < \delta$ imply $|f_x(c, y) - f_x(x_0, y_0)| < \epsilon$ and $|f_y(x_0, d) - f_y(x_0, y_0)| < \epsilon$, so that

\[ |f(x, y) - f(x_0, y_0) - f_x(x_0, y_0)\Delta x - f_y(x_0, y_0)\Delta y| \le \epsilon|\Delta x| + \epsilon|\Delta y|. \]

This shows that $f(x, y)$ is differentiable at $(x_0, y_0)$.

The proof for the general case is similar.

Example 6.1.6. The function $f(x, y) = 1 + 2x + xy^2$ in Example 6.1.4 has continuous partial derivatives $f_x = 2 + y^2$, $f_y = 2xy$. Therefore the function is differentiable with $df = (2 + y^2)dx + 2xydy$.

Example 6.1.7. Express the map in Example 6.1.1 as $u(x, y) = x^2 + y^2$, $v(x, y) = xy$. The partial derivatives $u_x = 2x$, $u_y = 2y$, $v_x = y$, $v_y = x$ are continuous. Therefore the map is differentiable, with the differential

\[ \begin{pmatrix} du \\ dv \end{pmatrix} = \begin{pmatrix} 2xdx + 2ydy \\ ydx + xdy \end{pmatrix} = \begin{pmatrix} 2x & 2y \\ y & x \end{pmatrix}\begin{pmatrix} dx \\ dy \end{pmatrix}. \]

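Example 6.1.8. On the plane, the polar coordinate $(r, \theta)$ and the cartesian coordinate $(x, y)$ are related by $x = r\cos\theta$, $y = r\sin\theta$. The relation is differentiable because the partial derivatives are continuous. The Jacobian matrix is

\[ \frac{\partial(x, y)}{\partial(r, \theta)} = \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix} \]

and the differential is

\[ \begin{pmatrix} dx \\ dy \end{pmatrix} = \begin{pmatrix} \cos\theta dr - r\sin\theta d\theta \\ \sin\theta dr + r\cos\theta d\theta \end{pmatrix} = \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix}\begin{pmatrix} dr \\ d\theta \end{pmatrix}. \]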

Example 6.1.9. By Proposition 2.1.3, the differentiability of a parametrized curve

\[ \phi(t) = (x_1(t), x_2(t), \dots, x_n(t)) : (a, b) \to \mathbb{R}^n \]

is equivalent to the existence of the derivatives $x_i'$. The Jacobian matrix is the vertical version of the tangent vector $\phi' = (x_1', x_2', \dots, x_n')$. As a linear map, the derivative is $u \mapsto u\phi'$.

The definition of parametrized curve allows the constant curve and allows the tangent vector to become zero. It also allows continuously differentiable curves to appear to have sharp corners, such as this example

\[ \phi(t) = \begin{cases} (t^2, 0) & \text{if } t \le 0 \\ (0, t^2) & \text{if } t > 0 \end{cases}. \]

To avoid such undesirable situations, we say a differentiable parametrized curve is regular if the tangent vector $\phi'$ is never zero. We will see that by the inverse function theorem (see Theorem 6.2.1), this is equivalent to the possibility of local reparametrization by one coordinate (i.e., $t = x_i$ for some $i$).


Example 6.1.10. The parametrized sphere (5.1.19) and the parametrized torus (5.1.20) are differentiable surfaces in $\mathbb{R}^3$. In general, if a parametrized surface $\sigma(u, v) : \mathbb{R}^2 \to \mathbb{R}^n$ is differentiable, then the tangent vectors $\sigma_u$ and $\sigma_v$ span the tangent plane $T_{(u_0,v_0)}S$. As a linear map, the derivative $\sigma'$ is $(s, t) \mapsto s\sigma_u + t\sigma_v$.

The undesirable situation that may happen for parametrized curves may also happen for parametrized surfaces. We say a differentiable parametrized surface is regular if the tangent vectors $\sigma_u$ and $\sigma_v$ are always linearly independent. We will see that by the inverse function theorem (see Theorem 6.2.1), this is equivalent to the possibility of local reparametrization by two coordinates (i.e., $u = x_i$ and $v = x_j$ for some $i$ and $j$).

Exercise 6.1.12. Compute the Jacobian matrices and the differentials.

1. $r = \sqrt{x^2 + y^2}$, $\theta = \arctan\dfrac{y}{x}$.

2. u1 = x1 + x2 + x3, u2 = x1x2 + x2x3 + x3x1, u3 = x1x2x3.

3. x = r sinφ cos θ, y = r sinφ sin θ, z = r cosφ.

Exercise 6.1.13. Suppose $F(\vec x)$ is a bounded map near $\vec 0$. Prove that $\|\vec x\|_2^2F(\vec x)$ is differentiable at $\vec 0$. Then use this to construct a function that is differentiable at $\vec 0$ but has no partial derivatives away from $\vec 0$.

Exercise 6.1.14. Construct a function that is differentiable everywhere but whose partial derivatives are not continuous at $\vec 0$.

Exercise 6.1.15. Prove that if $f_x(x_0, y_0)$ exists and $f_y(x, y)$ is continuous at $(x_0, y_0)$, then $f(x, y)$ is differentiable at $(x_0, y_0)$. Extend the result to three or more variables.

Exercise 6.1.16. Prove that a function $f(\vec x)$ is differentiable at $\vec x_0$ if and only if $f(\vec x) = f(\vec x_0) + J(\vec x) \cdot \Delta\vec x$, where $J : \mathbb{R}^n \to \mathbb{R}^n$ is continuous at $\vec x_0$. Extend the fact to differentiable maps.

6.1.3 Rules of Differentiation

The rules of computation of the derivatives can be extended from single to multivariable since the underlying principle still holds. If maps have linear approximations, then their addition, scalar multiplication and composition are still approximated by the similar combinations of the linear approximations.

Suppose $F, G : \mathbb{R}^n \to \mathbb{R}^m$ are differentiable at $\vec x_0$. Then the addition $F + G : \mathbb{R}^n \to \mathbb{R}^m$ is also differentiable at $\vec x_0$, with the derivative

\[ (F + G)' = F' + G'. \]

Therefore the Jacobian matrix for $F + G$ is the sum of the Jacobian matrices of $F$ and $G$. In terms of the individual entries of the Jacobian matrix, the equality means

\[ \frac{\partial(f + g)}{\partial x_i} = \frac{\partial f}{\partial x_i} + \frac{\partial g}{\partial x_i}. \]

Suppose a function $\lambda : \mathbb{R}^n \to \mathbb{R}$ and a map $F : \mathbb{R}^n \to \mathbb{R}^m$ are differentiable at $\vec x_0$. Then the scalar multiplication $\lambda F : \mathbb{R}^n \to \mathbb{R}^m$ is also differentiable at $\vec x_0$, with the derivative given by the Leibniz formula

\[ (\lambda F)' = \lambda'F + \lambda F'. \]


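Note that both sides are supposed to be linear transforms from $\mathbb{R}^n$ to $\mathbb{R}^m$. At $\vec x_0$, we have $\lambda(\vec x_0) \in \mathbb{R}$, $\lambda'(\vec x_0) : \mathbb{R}^n \to \mathbb{R}$, $F(\vec x_0) \in \mathbb{R}^m$ and $F'(\vec x_0) : \mathbb{R}^n \to \mathbb{R}^m$. Then we see that the right side takes a vector in $\mathbb{R}^n$ to a vector in $\mathbb{R}^m$. Of course, in terms of the individual entries of the Jacobian matrix, the equality means

\[ \frac{\partial(\lambda f)}{\partial x_i} = \frac{\partial\lambda}{\partial x_i}f + \lambda\frac{\partial f}{\partial x_i}. \]

The most general form of the Leibniz formula can be found in Exercise 6.1.17.

For the composition, we get the multivariable version of the chain rule. Suppose $F : \mathbb{R}^n \to \mathbb{R}^m$ is differentiable at $\vec x_0$ and $G : \mathbb{R}^m \to \mathbb{R}^k$ is differentiable at $\vec y_0 = F(\vec x_0)$. Then the composition $G \circ F : \mathbb{R}^n \to \mathbb{R}^k$ is also differentiable at $\vec x_0$, with the derivative given by the chain rule

\[ (G \circ F)' = G' \circ F'. \]

The right side is a composition of the linear transforms $F'(\vec x_0) : \mathbb{R}^n \to \mathbb{R}^m$ and $G'(\vec y_0) = G'(F(\vec x_0)) : \mathbb{R}^m \to \mathbb{R}^k$. Therefore the Jacobian matrix of the composition is the product of the Jacobian matrices of the individual maps. In terms of the individual entries of the Jacobian matrix, the chain rule means

\[ \frac{\partial(g \circ F)}{\partial x_i} = \frac{\partial g}{\partial f_1}\frac{\partial f_1}{\partial x_i} + \frac{\partial g}{\partial f_2}\frac{\partial f_2}{\partial x_i} + \cdots + \frac{\partial g}{\partial f_m}\frac{\partial f_m}{\partial x_i}. \]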

Example 6.1.11. Suppose $u = e^x\sin(y + 2z)$, $x = t^2$, $y = t^3 - 2t$, $z = t^2 + t$. The relations can be considered as a composition $t \mapsto (x, y, z) \mapsto u$. To compute the derivative of $u(t) = u(x(t), y(t), z(t))$, we differentiate the relations

\begin{align*}
du &= e^x\sin(y + 2z)dx + e^x\cos(y + 2z)dy + 2e^x\cos(y + 2z)dz, \\
dx &= 2tdt, \quad dy = (3t^2 - 2)dt, \quad dz = (2t + 1)dt.
\end{align*}

Substituting $dx$, $dy$, $dz$ into $du$, we get

\begin{align*}
du &= (e^x\sin(y + 2z))2tdt + (e^x\cos(y + 2z))(3t^2 - 2)dt + (2e^x\cos(y + 2z))(2t + 1)dt \\
&= e^{t^2}(2t\sin(t^3 + 2t^2) + (3t^2 - 2 + 4t + 2)\cos(t^3 + 2t^2))dt.
\end{align*}

Thus

\[ \frac{du}{dt} = e^{t^2}(2t\sin(t^3 + 2t^2) + (3t^2 + 4t)\cos(t^3 + 2t^2)). \]

We note that the coefficients in the relation between the differentials form the Jacobian matrix. The substitution of the differential relations basically computes the product of the Jacobian matrices.
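The computation in Example 6.1.11 can be reproduced as a product of a $1 \times 3$ Jacobian matrix (the gradient of $u$) and a $3 \times 1$ Jacobian matrix (the tangent vector of the curve). The following sketch is an illustration only; the function names are hypothetical, and the central difference with step `h` is just a numerical check of the closed formula.

```python
import math

def u(x, y, z):
    return math.exp(x) * math.sin(y + 2 * z)

def curve(t):
    """t -> (x, y, z) = (t^2, t^3 - 2t, t^2 + t)."""
    return (t * t, t ** 3 - 2 * t, t * t + t)

def grad_u(x, y, z):
    """The 1 x 3 Jacobian matrix of u."""
    e = math.exp(x)
    return (e * math.sin(y + 2 * z), e * math.cos(y + 2 * z), 2 * e * math.cos(y + 2 * z))

def velocity(t):
    """The 3 x 1 Jacobian matrix of the curve."""
    return (2 * t, 3 * t * t - 2, 2 * t + 1)

def du_dt_chain(t):
    """Chain rule: product of the two Jacobian matrices."""
    return sum(g * v for g, v in zip(grad_u(*curve(t)), velocity(t)))

def du_dt_closed(t):
    """The closed formula obtained in the example."""
    a = t ** 3 + 2 * t * t
    return math.exp(t * t) * (2 * t * math.sin(a) + (3 * t * t + 4 * t) * math.cos(a))

def du_dt_numeric(t, h=1e-6):
    """Central difference quotient of the composition."""
    return (u(*curve(t + h)) - u(*curve(t - h))) / (2 * h)

for t in (-0.7, 0.3, 1.1):
    print(t, du_dt_chain(t), du_dt_closed(t), du_dt_numeric(t))
```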

Example 6.1.12. Suppose $u = e^x\sin(y + 2z)$, $x = s^2t$, $y = s + 2t$, $z = s^2 - t$. Then

\begin{align*}
du &= e^x\sin(y + 2z)dx + e^x\cos(y + 2z)dy + 2e^x\cos(y + 2z)dz, \\
dx &= 2stds + s^2dt, \quad dy = ds + 2dt, \quad dz = 2sds - dt.
\end{align*}


Substituting $dx$, $dy$, $dz$ into $du$, we get

\begin{align*}
du &= e^x\sin(y + 2z)(2stds + s^2dt) + e^x\cos(y + 2z)(ds + 2dt) + 2e^x\cos(y + 2z)(2sds - dt) \\
&= e^{s^2t}(2st\sin(s + 2s^2) + (1 + 4s)\cos(s + 2s^2))ds + e^{s^2t}s^2\sin(s + 2s^2)dt.
\end{align*}

In other words,

\begin{align*}
u_s &= e^{s^2t}(2st\sin(s + 2s^2) + (1 + 4s)\cos(s + 2s^2)), \\
u_t &= e^{s^2t}s^2\sin(s + 2s^2).
\end{align*}

Exercise 6.1.17. Suppose $F : \mathbb{R}^p \to \mathbb{R}^m$ and $G : \mathbb{R}^p \to \mathbb{R}^n$ are differentiable at $\vec z_0 \in \mathbb{R}^p$, and $B : \mathbb{R}^m \times \mathbb{R}^n \to \mathbb{R}^k$ is a bilinear map. Prove that $B(F, G) : \mathbb{R}^p \to \mathbb{R}^k$ is also differentiable at $\vec z_0$, with the derivative given by the Leibniz formula

\[ B(F, G)' = B(F', G) + B(F, G'). \]

Extend the Leibniz formula to multilinear maps.

Exercise 6.1.18. Prove the chain rule for multivariable maps. Does the chain rule formula $\dfrac{dg(x(t), y(t))}{dt} = g_x(x(t), y(t))x'(t) + g_y(x(t), y(t))y'(t)$ still hold if $g$ is not assumed to be differentiable?

Exercise 6.1.19. Suppose $F, G : \mathbb{R}^n \to \mathbb{R}^n$ are maps that are inverse to each other. Suppose $F$ is differentiable at $\vec x_0$ and $G$ is differentiable at $\vec y_0 = F(\vec x_0)$. Prove that the linear transforms $F'(\vec x_0)$ and $G'(\vec y_0)$ are inverse to each other. In particular, the derivative of an invertible differentiable map is an invertible linear transform.

Exercise 6.1.20. Compute partial derivatives.

1. $z = uv + \sin t$, $u = e^t$, $v = \cos t$, find $z_t$.

2. $z = x^2\log y$, $x = \dfrac{u}{v}$, $y = u - v$, find $z_u$, $z_v$.

3. $u = f(x + y, x - y)$, find $u_x$, $u_y$.

4. $u = f(r\cos\theta, r\sin\theta)$, find $u_r$, $u_\theta$.

5. $u = f\left(\dfrac{x}{y}, \dfrac{y}{z}, \dfrac{z}{x}\right)$, find $u_x$, $u_y$, $u_z$.

Exercise 6.1.21. Compute the derivative of $x^x$ by considering $f = u^v$, $u = x$, $v = x$.

Exercise 6.1.22. Prove that a differentiable function $f(x, y)$ depends only on the angle in the polar coordinate if and only if $xf_x + yf_y = 0$. Can you make a similar claim for a differentiable function that depends only on the length?

Exercise 6.1.23. Suppose $f(x, y)$ is a differentiable function satisfying $f(x, x^2) = 1$, $f_x(x, x^2) = x$. Find $f_y(x, x^2)$.

Exercise 6.1.24. For a differentiable function $f$, show that $u = yf(x^2 - y^2)$ satisfies $y^2u_x + xyu_y = xu$.

Exercise 6.1.25. Suppose differentiable functions $f$ and $g$ satisfy $f(u, v, w) = g(x, y, z)$ for $u = \sqrt{yz}$, $v = \sqrt{zx}$, $w = \sqrt{xy}$. Prove that $uf_u + vf_v + wf_w = xg_x + yg_y + zg_z$. Exercise 6.3.47 is a vast generalization.


Exercise 6.1.26. Find the partial differential equation characterizing differentiable functions $f(x, y)$ of the form $f(x, y) = \phi(xy)$. What about functions of the form $f(x, y) = \phi\left(\dfrac{y}{x}\right)$?

Exercise 6.1.27. Discuss the differentiability of a function $f(\|\vec x\|_2)$ of the Euclidean norm.

Exercise 6.1.28. For any invertible matrix $A$, the inverse map is the composition of the following three maps:

\[ X \mapsto A^{-1}X, \quad X \mapsto X^{-1}, \quad X \mapsto XA^{-1}. \]

Use this and Exercise 6.1.8 to find the derivative of the inverse matrix map at the matrix $A$.

6.1.4 Directional Derivative

The restriction of a function $f$ on a parametrized curve $\phi(t) : (a, b) \to \mathbb{R}^n$ is the composition $f(\phi(t))$. If both $f$ and $\phi$ are differentiable, then the change of the function $f$ along the curve $\phi$ is measured by

\[ f(\phi(t))' = \frac{\partial f}{\partial x_1}\frac{dx_1}{dt} + \frac{\partial f}{\partial x_2}\frac{dx_2}{dt} + \cdots + \frac{\partial f}{\partial x_n}\frac{dx_n}{dt} = \nabla f \cdot \phi'. \tag{6.1.7} \]

The formula shows that the change depends on the tangent vector instead of the curve itself. Define the derivative of $f$ along any vector $\vec v$ to be

\[ D_{\vec v}f = \lim_{t \to 0} \frac{f(\vec x + t\vec v) - f(\vec x)}{t} = f'(\vec x)(\vec v) = \nabla f \cdot \vec v. \tag{6.1.8} \]

Then we get $f(\phi(t))' = D_{\phi'(t)}f$. Note that the derivative along $\vec v$ is the value of the derivative linear functional $f'(\vec x)$ at $\vec v$. Thus $D_{\vec v}f$ is a linear function of $\vec v$ and the partial derivatives are simply the values of the linear function at the standard basis vectors.

If $\vec v$ has unit Euclidean length (i.e., $\|\vec v\|_2 = 1$), then only the direction of the vector matters and $D_{\vec v}f$ is called the directional derivative. Let $\theta$ be the angle between $\nabla f$ and $\vec v$. Then $D_{\vec v}f = \|\nabla f\|_2\cos\theta$. This shows that $f$ increases the most (at the rate of $\|\nabla f\|_2$) in the direction of its gradient and decreases the most in the opposite direction. Moreover, $f$ "does not change" in directions orthogonal to its gradient.

Geometrically, for any constant $c$, the level $f(\vec x) = c$ of the function is typically an $(n - 1)$-dimensional hypersurface in $\mathbb{R}^n$. If a curve $\phi$ lies in the level, then the restriction of the function on the curve is constant, so that $f(\phi(t))' = 0$. Since $f(\phi(t))' = D_{\phi'(t)}f = \nabla f \cdot \phi'(t)$, we find the gradient is orthogonal to the tangent vectors of the level. On the other hand, the function changes only if one jumps from one level to a different level. The biggest change happens when one moves in the direction orthogonal to the levels, which is the direction of $\nabla f$. Moreover, the magnitude $\|\nabla f\|_2$ of the gradient is the derivative of $f$ in the direction of $\nabla f$, which measures how fast the jump is. Therefore $\|\nabla f\|_2$ measures how much the levels are "squeezed" together.


Figure 6.1: gradient and level

Example 6.1.13. The gradient of $f = x^2 + y^2 + z^2$ is $\nabla f = (2x, 2y, 2z)$. The derivative at $(1, 1, 1)$ in the direction $(2, 1, 2)$ is

\[ D_{\frac{1}{3}(2,1,2)}f(1, 1, 1) = \nabla f(1, 1, 1) \cdot \frac{1}{3}(2, 1, 2) = \frac{1}{3}(2, 2, 2) \cdot (2, 1, 2) = \frac{10}{3}. \]

Note the vector $(2, 1, 2)$ is divided by its length, so that only the direction counts.
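The value in Example 6.1.13 can be recomputed directly from the formula $D_{\vec v}f = \nabla f \cdot \vec v$ with $\vec v$ normalized. The sketch below is only an illustration (the helper names are hypothetical); it also compares the result with the difference quotient in (6.1.8) for a small $t$, and with the maximal rate $\|\nabla f\|_2$ attained in the direction of the gradient.

```python
import math

def f(p):
    """f = x^2 + y^2 + z^2."""
    return sum(c * c for c in p)

def gradient(p):
    """The gradient (2x, 2y, 2z)."""
    return tuple(2 * c for c in p)

def unit(v):
    """Divide a vector by its Euclidean length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def directional_derivative(p, v):
    """Gradient dotted with the unit vector in the direction of v."""
    return sum(g * c for g, c in zip(gradient(p), unit(v)))

def difference_quotient(p, v, t=1e-6):
    """(f(p + t v) - f(p)) / t with v normalized, as in (6.1.8)."""
    w = unit(v)
    moved = tuple(c + t * d for c, d in zip(p, w))
    return (f(moved) - f(p)) / t

point = (1.0, 1.0, 1.0)
print(directional_derivative(point, (2.0, 1.0, 2.0)))
print(difference_quotient(point, (2.0, 1.0, 2.0)))
print(directional_derivative(point, gradient(point)))
```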

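Example 6.1.14. Suppose the derivatives of $f$ in the directions $(1, 1)$ and $(1, -1)$ are respectively $2$ and $-3$. Then

\begin{align*}
D_{(1,1)}f &= \sqrt{2}D_{\frac{1}{\sqrt{2}}(1,1)}f = 2\sqrt{2}, \\
D_{(1,-1)}f &= \sqrt{2}D_{\frac{1}{\sqrt{2}}(1,-1)}f = -3\sqrt{2},
\end{align*}

and

\begin{align*}
f_x &= D_{(1,0)}f = \frac{1}{2}(D_{(1,1)}f + D_{(1,-1)}f) = \frac{-1}{\sqrt{2}}, \\
f_y &= D_{(0,1)}f = \frac{1}{2}(D_{(1,1)}f - D_{(1,-1)}f) = \frac{5}{\sqrt{2}}.
\end{align*}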
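The linearity used in Example 6.1.14 amounts to solving two linear equations for the two partial derivatives. A minimal sketch, assuming the two directions are not parallel, is given here; the function name `gradient_from_two_directions` is hypothetical and the system is solved by Cramer's rule.

```python
import math

def gradient_from_two_directions(v1, rate1, v2, rate2):
    """Solve grad . (v / ||v||_2) = rate for two given directions in the plane."""
    n1 = math.hypot(v1[0], v1[1])
    n2 = math.hypot(v2[0], v2[1])
    a, b = v1[0] / n1, v1[1] / n1   # first unit direction
    c, d = v2[0] / n2, v2[1] / n2   # second unit direction
    det = a * d - b * c
    if det == 0:
        raise ValueError("the two directions must not be parallel")
    fx = (rate1 * d - b * rate2) / det
    fy = (a * rate2 - c * rate1) / det
    return fx, fy

fx, fy = gradient_from_two_directions((1.0, 1.0), 2.0, (1.0, -1.0), -3.0)
print(fx, fy)
```

The result can be compared with the values $-1/\sqrt{2}$ and $5/\sqrt{2}$ found in the example.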

Example 6.1.15. We try to characterize continuously differentiable functions $f(x, y)$ satisfying $xf_x = yf_y$. Note that the condition is the same as $\nabla f \perp (x, -y)$, which is equivalent to $\nabla f$ being parallel to $(y, x) = \nabla(xy)$. This means that the levels of $f$ and the levels of $xy$ are tangential everywhere. Therefore we expect the levels of $f(x, y)$ and $xy$ to be the same, and we should have $f(x, y) = h(xy)$ for some continuously differentiable $h(t)$.

For the rigorous argument, let $f(x, y) = g(x, xy)$, or $g(x, z) = f\left(x, \dfrac{z}{x}\right)$ (we define such $g$ in case $x \ne 0$, and we define $f(x, y) = g(xy, y)$ in case $y \ne 0$). Then

\[ g_x = f_x - \frac{z}{x^2}f_y = \frac{1}{x}\left(xf_x - \frac{z}{x}f_y\right) = 0. \]

This shows that $g$ is independent of the first variable, and we have $f(x, y) = h(xy)$.

A vast extension of the discussion is the theory of functional dependence. See Exercises 6.2.21 through 6.2.26.

Exercise 6.1.29. Compute directional derivatives.

1. $4xy^3 + 5x^3y - 3y^2$ at $(0, 1)$, in direction $(1, 2)$.

2. $\arctan\dfrac{y}{x}$ at $(1, 1)$, in direction $(1, -1)$.


3. xy + yz + zx at (1, 2, 3), in direction (2, 2, 1).

4. a+ b1x1 + b2x2 + · · ·+ bnxn at (1, 1, . . . , 1), in direction (v1, v2, . . . , vn).

5. $x_1^2 + x_2^2 + \cdots + x_n^2$ at $(1, -1, \dots, (-1)^n)$, in direction $(1, 1, \dots, 1)$.

Exercise 6.1.30. Suppose Df = 1 in direction (1, 2, 2), Df =√

2 in direction(0, 1,−1), Df = 3 in direction (0, 0, 1). Find the gradient of f .

Exercise 6.1.31. The curves $\phi(t) = (t, t^2)$ and $\psi(t) = (t^2, t)$ intersect at $(1, 1)$. Suppose the derivatives of $f(x, y)$ along the two curves are respectively 2 and 3 at $(1, 1)$. What is the gradient of $f$ at $(1, 1)$?

Exercise 6.1.32. Suppose $\vec u_1, \vec u_2, \dots, \vec u_n$ is an orthonormal basis. Prove that

$$\nabla f = (D_{\vec u_1} f)\vec u_1 + (D_{\vec u_2} f)\vec u_2 + \cdots + (D_{\vec u_n} f)\vec u_n.$$

Exercise 6.1.33. Express the gradient of $f(x, y)$ in terms of the partial derivatives $f_r$, $f_\theta$ and the directions

$$\vec e_r = \frac{\vec x_r}{\|\vec x_r\|_2} = (\cos\theta, \sin\theta), \qquad \vec e_\theta = \frac{\vec x_\theta}{\|\vec x_\theta\|_2} = (-\sin\theta, \cos\theta)$$

in the polar coordinates. Exercise 6.2.17 is a vast generalization.

Exercise 6.1.34. Find continuously differentiable functions satisfying the equations.

1. $a f_x = b f_y$.

2. $y f_x = x f_y$.

3. $x f_x + y f_y = 0$.

4. $(x + y) f_x + (x - y) f_y = 0$.

5. $f_x = f_y = f_z$.

6. $x f_x = y f_y = z f_z$.

The idea underlying the discussion of the directional derivative can be used to derive the following partial extension of the mean value theorem.

Proposition 6.1.3. Suppose $f$ is a function differentiable along the straight line connecting $\vec a$ and $\vec b$. Then there is $\vec c$ on the straight line, such that

$$f(\vec b) - f(\vec a) = \nabla f(\vec c) \cdot (\vec b - \vec a).$$

Proof. The straight line connecting $\vec a$ to $\vec b$ is $\phi(t) = (1 - t)\vec a + t\vec b$ for $t \in [0, 1]$. The single variable function $f(\phi(t))$ is differentiable with

$$(f(\phi(t)))' = \nabla f(\phi(t)) \cdot \phi'(t) = \nabla f(\phi(t)) \cdot (\vec b - \vec a).$$

By applying the mean value theorem to $f(\phi(t))$, we find $0 < c < 1$, such that

$$f(\vec b) - f(\vec a) = f(\phi(1)) - f(\phi(0)) = (f(\phi(t)))'\big|_{t=c}\,(1 - 0) = \nabla f(\phi(c)) \cdot (\vec b - \vec a).$$

Let $\vec c = \phi(c)$. The proposition is proved.
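As an illustration of the proposition, the parameter $c$ can be located numerically for a concrete function. The sketch below assumes the sample function $f(x, y) = x^3 + y^2$ and the endpoints $\vec a = (0, 0)$, $\vec b = (1, 1)$, none of which come from the text, and finds $c$ by bisection on $g(t) = \nabla f(\phi(t)) \cdot (\vec b - \vec a) - (f(\vec b) - f(\vec a))$.

```python
def f(x, y):
    return x ** 3 + y ** 2

def grad_f(x, y):
    # Gradient of the sample function f(x, y) = x**3 + y**2.
    return (3 * x * x, 2 * y)

def mean_value_point(a, b, tol=1e-12):
    """Bisection for c in (0, 1) with grad f(phi(c)) . (b - a) = f(b) - f(a)."""
    d = (b[0] - a[0], b[1] - a[1])
    target = f(*b) - f(*a)

    def g(t):
        px, py = a[0] + t * d[0], a[1] + t * d[1]
        gx, gy = grad_f(px, py)
        return gx * d[0] + gy * d[1] - target

    lo, hi = 0.0, 1.0
    # Bisection assumes g changes sign on [0, 1], which holds for this sample.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

c = mean_value_point((0.0, 0.0), (1.0, 1.0))
print(c)
```

For this sample the equation reduces to $3c^2 + 2c = 2$, so the printed value should satisfy it up to the tolerance.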

Although the proposition may be extended to individual coordinates of a multivariable map, the choice of $\vec c$ may be different for different coordinates. The following is a more unified extension.

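Proposition 6.1.4. Suppose $F$ is a map differentiable along the straight line connecting $\vec a$ and $\vec b$. Then there is $\vec c$ on the straight line, such that

$$\|F(\vec b) - F(\vec a)\|_2 \le \|F'(\vec c)\| \|\vec b - \vec a\|_2,$$

where the norm $\|F'(\vec c)\|$ is taken with respect to the Euclidean norms.

Proof. Fix any vector $\vec v$ and consider the function $f(\vec x) = F(\vec x) \cdot \vec v$. Then $\nabla f(\vec x) \cdot \vec u = F'(\vec x)(\vec u) \cdot \vec v$. By Proposition 6.1.3, we have

$$|(F(\vec b) - F(\vec a)) \cdot \vec v| = |\nabla f(\vec c) \cdot (\vec b - \vec a)| = |F'(\vec c)(\vec b - \vec a) \cdot \vec v| \le \|F'(\vec c)(\vec b - \vec a)\|_2 \|\vec v\|_2 \le \|F'(\vec c)\| \|\vec b - \vec a\|_2 \|\vec v\|_2.$$

By taking $\vec v = F(\vec b) - F(\vec a)$ at the beginning and cancelling one factor of $\|F(\vec b) - F(\vec a)\|_2$ (the inequality is trivial if this factor is zero), we get $\|F(\vec b) - F(\vec a)\|_2 \le \|F'(\vec c)\| \|\vec b - \vec a\|_2$.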

As a consequence of the mean value theorem, Theorem 2.2.3 can be extended from single to multivariable.

Proposition 6.1.5. Suppose a differentiable map on a path connected open subset has zero derivative everywhere. Then the map is a constant map.

Proof. By considering coordinates, it suffices to prove the statement for functions. Assume $f$ is differentiable on a path connected open subset $U$, with zero derivative everywhere.

For any $\vec x \in U$, there is $\epsilon > 0$, such that $\|\vec y - \vec x\| < \epsilon$ implies $\vec y \in U$. Then the straight line connecting $\vec x$ and $\vec y$ lies in $U$. By Proposition 6.1.3 and $\nabla f(\vec c) = \vec 0$, we have $f(\vec y) = f(\vec x)$. Thus $f$ is constant near $\vec x$.

For any two points $\vec x, \vec y \in U$, there is a path $\phi(t)$, $t \in [a, b]$, connecting $\vec x$ to $\vec y$. For any $t_0 \in [a, b]$, there is $\epsilon > 0$, such that $\|\vec z - \phi(t_0)\| < \epsilon$ implies $\vec z \in U$. Since $\phi$ is continuous, there is $\delta > 0$, such that $|t - t_0| < \delta$ implies $\|\phi(t) - \phi(t_0)\| < \epsilon$. By the previous paragraph, $f$ is constant on the ball of radius $\epsilon$ around $\phi(t_0)$, so that $f(\phi(t))$ is constant on $(t_0 - \delta, t_0 + \delta)$. In particular, we have $(f(\phi(t)))' = 0$ for any $t \in [a, b]$. Thus the single variable function $f(\phi(t))$ is constant on $[a, b]$, and $f(\vec x) = f(\phi(a)) = f(\phi(b)) = f(\vec y)$. This completes the proof that $f$ is constant on $U$.

Exercise 6.1.35. What can you say about Proposition 6.1.4 if the norms are different from the Euclidean norm?

Exercise 6.1.36. What can you say about Proposition 6.1.5 if only some partial derivatives are constantly zero?

Exercise 6.1.37. Suppose $\vec a$ is a nonzero vector in $\mathbb{R}^n$. Suppose $\vec b_1, \vec b_2, \dots, \vec b_{n-1}$ are linearly independent vectors orthogonal to $\vec a$. Prove that a function $f$ on the whole $\mathbb{R}^n$ (or on any open convex subset) satisfies $D_{\vec a} f = 0$ if and only if $f(\vec x) = g(\vec b_1 \cdot \vec x, \vec b_2 \cdot \vec x, \dots, \vec b_{n-1} \cdot \vec x)$ for some function $g$ on $\mathbb{R}^{n-1}$. Extend the result to several vectors $\vec a$.

6.2 Inverse and Implicit Function

The inverse and implicit function theorems basically say that if the inverse and implicit problems can be solved for the linear approximation, then the same problems can be solved for the multivariable map itself. The results have a geometrical interpretation related to the definition of hypersurface.

6.2.1 Inverse Differentiation

Suppose a single variable function $f(x)$ has continuous derivative near $x_0$. If $f'(x_0) \ne 0$, then $f'(x)$ is either positive for all $x$ near $x_0$ or negative for all $x$ near $x_0$. Thus $f(x)$ is strictly monotone and is therefore invertible near $x_0$. The multivariable extension is the following.

Theorem 6.2.1 (Inverse Function Theorem). Suppose $F\colon \mathbb{R}^n \to \mathbb{R}^n$ is continuously differentiable near $\vec x_0$. Suppose the derivative $F'(\vec x_0)$ is an invertible linear map. Then there is an open subset $U$ containing $\vec x_0$, such that $F(U)$ is also open, $F\colon U \to F(U)$ is invertible, $F^{-1}\colon F(U) \to U$ is differentiable, and $(F^{-1})'(\vec y) = (F'(\vec x))^{-1}$ when $\vec y = F(\vec x)$.

The theorem basically says that if the linear approximation of a map is invertible, then the map is locally also invertible. Moreover, the differential $d\vec x = (F^{-1})' d\vec y$ is the solution of the equation $d\vec y = F' d\vec x$.

As shown by Exercise 2.1.24, the continuity assumption cannot be dropped from the theorem.

Proof. For any $\vec y$ near $\vec y_0 = F(\vec x_0)$, we wish to find $\vec x$ near $\vec x_0$ satisfying $F(\vec x) = \vec y$. To solve the problem, we approximate the map $F(\vec x)$ by the linear map $L_0(\vec x) = F(\vec x_0) + F'(\vec x_0)(\vec x - \vec x_0)$ and solve the similar equation $L_0(\vec x) = \vec y$. The solution $\vec x_1$ of the approximate linear equation satisfies

$$\vec y = F(\vec x_0) + F'(\vec x_0)(\vec x_1 - \vec x_0).$$

Although not exactly equal to the solution $\vec x$ that we are looking for, $\vec x_1$ is often closer to $\vec x$ than $\vec x_0$ (see Figure 6.2). So we repeat the process by using $L_1(\vec x) = F(\vec x_1) + F'(\vec x_0)(\vec x - \vec x_1)$ to approximate $F$ near $\vec x_1$ and solve the similar approximate linear equation $L_1(\vec x) = \vec y$. The process is an inductive definition of a sequence $\{\vec x_k\}$ by

$$\vec y = L_k(\vec x_{k+1}) = F(\vec x_k) + F'(\vec x_0)(\vec x_{k+1} - \vec x_k). \tag{6.2.1}$$

To prove the expectation that the sequence $\{\vec x_k\}$ converges to the solution $\vec x$, we use the approximation

$$F(\vec x) = F(\vec x_0) + F'(\vec x_0)(\vec x - \vec x_0) + R(\vec x) \tag{6.2.2}$$

in the inductive definition (6.2.1) to get

$$\vec y = F(\vec x_0) + F'(\vec x_0)(\vec x_k - \vec x_0) + R(\vec x_k) + F'(\vec x_0)(\vec x_{k+1} - \vec x_k) = F(\vec x_0) + F'(\vec x_0)(\vec x_{k+1} - \vec x_0) + R(\vec x_k). \tag{6.2.3}$$

[Figure 6.2: find $\vec x$ satisfying $F(\vec x) = \vec y$. Labels in the figure: the point $(\vec x_0, \vec y_0)$, the level $\vec y$, the graph of $F$, the lines $L_0$, $L_1$, $L_2$, and the points $\vec x_1$, $\vec x_2$, $\vec x_3$, $\vec x$.]

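Writing the formula (6.2.3) with $k - 1$ in place of $k$ and taking the difference with (6.2.3), we get

$$F'(\vec x_0)(\vec x_{k+1} - \vec x_k) + R(\vec x_k) - R(\vec x_{k-1}) = \vec 0.$$

Since $R'(\vec x) = F'(\vec x) - F'(\vec x_0)$, by the continuity of the derivative $F'$ at $\vec x_0$, for any $\epsilon > 0$, there is $\delta > 0$, such that $F$ is defined on the ball $B(\vec x_0, \delta)$ and

$$\|\vec x - \vec x_0\| < \delta \implies \|R'(\vec x)\| < \epsilon. \tag{6.2.4}$$

By Proposition 6.1.4, if we know $\vec x_k, \vec x_{k-1} \in B(\vec x_0, \delta)$, then (6.2.4) implies

$$\|\vec x_{k+1} - \vec x_k\| = \|F'(\vec x_0)^{-1}(R(\vec x_k) - R(\vec x_{k-1}))\| \le \|F'(\vec x_0)^{-1}\| \|R(\vec x_k) - R(\vec x_{k-1})\| \le \epsilon \|F'(\vec x_0)^{-1}\| \|\vec x_k - \vec x_{k-1}\|. \tag{6.2.5}$$

Now fix some $0 < \alpha < 1$. Then for $\epsilon < \dfrac{\alpha}{\|F'(\vec x_0)^{-1}\|}$, we get $\|\vec x_{k+1} - \vec x_k\| \le \alpha \|\vec x_k - \vec x_{k-1}\|$. This further implies

$$\|\vec x_{k+1} - \vec x_k\| \le \alpha^k \|\vec x_1 - \vec x_0\|,$$

and

$$\|\vec x_{k+1} - \vec x_l\| \le (\alpha^l + \alpha^{l+1} + \cdots + \alpha^k)\|\vec x_1 - \vec x_0\| < \frac{\alpha^l}{1 - \alpha}\|\vec x_1 - \vec x_0\|.$$

This implies $\{\vec x_k\}$ is a Cauchy sequence and therefore converges. Taking the limit of (6.2.1), we further see that the limit $\vec x = \lim_{k \to \infty} \vec x_k$ satisfies $\vec y = F(\vec x)$.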
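The inductive definition (6.2.1) is itself an algorithm: $\vec x_{k+1} = \vec x_k + F'(\vec x_0)^{-1}(\vec y - F(\vec x_k))$, with the derivative frozen at $\vec x_0$. The following is a minimal sketch under stated assumptions: the map `F`, the base point $\vec x_0 = (0, 0)$, the target $\vec y = (0.3, 0.2)$ and the fixed number of steps are all chosen for illustration and do not come from the text.

```python
def F(p):
    # Sample map with F(0, 0) = (0, 0) and F'(0, 0) = [[2, 1], [1, 3]].
    x, y = p
    return (2 * x + y + 0.1 * x * x, x + 3 * y + 0.1 * y * y)

# Inverse of F'(0, 0), computed by hand: (1/5) * [[3, -1], [-1, 2]].
A_INV = ((0.6, -0.2), (-0.2, 0.4))

def solve(target, start=(0.0, 0.0), steps=50):
    """Iterate x_{k+1} = x_k + F'(x_0)^{-1} (y - F(x_k)) as in (6.2.1)."""
    p = start
    for _ in range(steps):
        fx, fy = F(p)
        rx, ry = target[0] - fx, target[1] - fy  # residual y - F(x_k)
        p = (p[0] + A_INV[0][0] * rx + A_INV[0][1] * ry,
             p[1] + A_INV[1][0] * rx + A_INV[1][1] * ry)
    return p

solution = solve((0.3, 0.2))
print(solution, F(solution))
```

Because the derivative is never updated, each step costs only one evaluation of $F$ and one fixed matrix multiplication, and the contraction estimate (6.2.5) governs the speed of convergence.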

How do we know $\vec x_k, \vec x_{k-1} \in B(\vec x_0, \delta)$, so that the estimation (6.2.5) holds? This requires the following rigorous setup. Assume $1 > \alpha > 0$ is fixed, $\dfrac{\alpha}{\|F'(\vec x_0)^{-1}\|} > \epsilon > 0$ is given, and $\delta > 0$ is found so that (6.2.4) holds. Then we assume $\vec y$ satisfies

$$\|\vec y - F(\vec x_0)\| < \frac{(1 - \alpha)\delta}{\|F'(\vec x_0)^{-1}\|} \tag{6.2.6}$$

and inductively construct $\vec x_k$ by the formula (6.2.1). The argument above shows that after $\vec x_k$ is constructed, we know

$$\|\vec x_k - \vec x_0\| < \frac{1}{1 - \alpha}\|\vec x_1 - \vec x_0\| \le \frac{1}{1 - \alpha}\|F'(\vec x_0)^{-1}\| \|\vec y - F(\vec x_0)\| < \delta.$$

Therefore we know $\vec x_k, \vec x_{k-1} \in B(\vec x_0, \delta)$, and the argument can be continued.

The discussion above also tells us that, for any $\vec y$ satisfying (6.2.6), there is $\vec x$ satisfying $\|\vec x - \vec x_0\| < \delta$ and $F(\vec x) = \vec y$. Geometrically, this means

$$F(B(\vec x_0, \delta)) \supset B\left(F(\vec x_0), \frac{(1 - \alpha)\delta}{\|F'(\vec x_0)^{-1}\|}\right).$$

Note that so far we have only used the fact that $F'$ is continuous at $\vec x_0$ and $F'(\vec x_0)$ is invertible. Therefore if $F$ is continuously differentiable on an open subset $U$, such that $F'$ is invertible everywhere, then $F(U)$ is also open.

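Next we prove that $F$ is one-to-one on the ball $B(\vec x_0, \delta)$. By the approximation (6.2.2) and the estimation (6.2.4), for any $\vec x, \vec x' \in B(\vec x_0, \delta)$,

$$\|F(\vec x) - F(\vec x')\| = \|F'(\vec x_0)(\vec x - \vec x') + R(\vec x) - R(\vec x')\| \ge \|F'(\vec x_0)(\vec x - \vec x')\| - \epsilon\|\vec x - \vec x'\| \ge \left(\frac{1}{\|F'(\vec x_0)^{-1}\|} - \epsilon\right)\|\vec x - \vec x'\| \ge \frac{1 - \alpha}{\|F'(\vec x_0)^{-1}\|}\|\vec x - \vec x'\|. \tag{6.2.7}$$

Since $\alpha < 1$, $\vec x \ne \vec x'$ implies $F(\vec x) \ne F(\vec x')$.

Finally, it remains to prove the differentiability of the inverse map at $\vec y_0 = F(\vec x_0)$. Consider the inverse map $F^{-1}\colon F(B(\vec x_0, \delta)) \to B(\vec x_0, \delta)$. By (6.2.2), for $\vec y \in F(B(\vec x_0, \delta))$ and $\vec x = F^{-1}(\vec y)$, we have

$$\vec x = \vec x_0 + F'(\vec x_0)^{-1}(\vec y - \vec y_0) - F'(\vec x_0)^{-1}R(\vec x).$$

Then by $R(\vec x_0) = \vec 0$ and $\|R'(\vec x)\| < \epsilon$ along the line connecting $\vec x_0$ and $\vec x$, we get

$$\|F'(\vec x_0)^{-1}R(\vec x)\| \le \|F'(\vec x_0)^{-1}\| \|R(\vec x) - R(\vec x_0)\| \le \epsilon\|F'(\vec x_0)^{-1}\| \|\vec x - \vec x_0\| \le \frac{\epsilon\|F'(\vec x_0)^{-1}\|^2}{1 - \alpha}\|\vec y - \vec y_0\|,$$

where the last inequality makes use of (6.2.7). Since $\epsilon > 0$ is arbitrary, the estimation shows that $\vec x_0 + F'(\vec x_0)^{-1}(\vec y - \vec y_0)$ linearly approximates $F^{-1}(\vec y)$. Therefore the inverse map is differentiable at $\vec y_0$, with $(F^{-1})'(\vec y_0) = F'(\vec x_0)^{-1}$.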

Note that most of the proof only makes use of the assumption that the derivative $F'$ is continuous at $\vec x_0$. The assumption implies that the image of a ball around $\vec x_0$ contains a ball around $F(\vec x_0)$. From this fact, the continuity of the derivative everywhere is (only) used to conclude that the image of open subsets must be open.

[Figure 6.3: Newton's method. Labels in the figure: the point $(\vec x_0, \vec y_0)$, the level $\vec y$, the graph of $F$, the lines $L_0$, $L_1$, $L_2$, and the points $\vec x_1$, $\vec x_2$, $\vec x$.]

The proof actually includes a method for computing the inverse function. In fact, it appears to be more suitable to use

$$L_k(\vec x) = F(\vec x_k) + F'(\vec x_k)(\vec x - \vec x_k)$$

to approximate $F$ at $\vec x_k$ (with $F'$ at $\vec x_k$ instead of at $\vec x_0$). The method is quite effective in case the dimension is small (so that $F'(\vec x_k)^{-1}$ is easier to compute). In particular, for a single variable function $f(x)$, the solution to $f(x) = 0$ can be found by starting from $x_0$ and successively constructing

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}.$$
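The single variable recursion translates directly into code. This is a minimal sketch: the function name `newton`, the fixed number of steps, and the sample equation $x^2 - 2 = 0$ with starting point $x_0 = 1$ are assumptions made for illustration, and no safeguard against $f'(x_n) = 0$ is included.

```python
def newton(f, df, x0, steps=20):
    """Iterate x_{n+1} = x_n - f(x_n) / f'(x_n) a fixed number of times."""
    x = x0
    for _ in range(steps):
        x = x - f(x) / df(x)
    return x

# Sample use: solve x**2 - 2 = 0 starting from x0 = 1.
root = newton(lambda x: x * x - 2, lambda x: 2 * x, 1.0)
print(root)
```

Unlike the iteration (6.2.1), the derivative is re-evaluated at every step, which is the modification described in the paragraph above.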

Example 6.2.1. In Example 6.1.8, we computed the differentials of the cartesian coordinates in terms of the polar coordinates. The Jacobian matrix is invertible away from the origin, so that the polar coordinates can also be written locally in terms of the cartesian coordinates. In fact, the map $(r, \theta) \to (x, y)$ is invertible for $(r, \theta) \in (0, \infty) \times (a, a + 2\pi)$. By solving the system $dx = \cos\theta\, dr - r\sin\theta\, d\theta$, $dy = \sin\theta\, dr + r\cos\theta\, d\theta$, we get

$$dr = \cos\theta\, dx + \sin\theta\, dy, \qquad d\theta = -r^{-1}\sin\theta\, dx + r^{-1}\cos\theta\, dy.$$

By the inverse function theorem, the coefficients form the Jacobian matrix

$$\frac{\partial(r, \theta)}{\partial(x, y)} = \begin{pmatrix} \cos\theta & \sin\theta \\ -r^{-1}\sin\theta & r^{-1}\cos\theta \end{pmatrix}.$$
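As a sanity check, the matrix above should be the inverse of the Jacobian matrix of $(r, \theta) \to (x, y)$ read off from $dx$ and $dy$. The sketch below multiplies the two matrices at one sample point; the function names and the sample values $r = 2$, $\theta = 0.7$ are assumptions for illustration.

```python
import math

def jacobian_xy_by_rtheta(r, theta):
    # Coefficients of dx and dy in terms of dr and dtheta.
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]

def jacobian_rtheta_by_xy(r, theta):
    # Coefficients of dr and dtheta in terms of dx and dy (Example 6.2.1).
    return [[math.cos(theta), math.sin(theta)],
            [-math.sin(theta) / r, math.cos(theta) / r]]

def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

product = matmul2(jacobian_rtheta_by_xy(2.0, 0.7), jacobian_xy_by_rtheta(2.0, 0.7))
print(product)
```

The product should be the identity matrix up to rounding.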

Exercise 6.2.1. Use the differential in Exercise 6.1.21 to find the differential of the change from the cartesian coordinates $(x, y, z)$ to the spherical coordinates $(r, \phi, \theta)$.

Exercise 6.2.2. Suppose $x = e^u + u\cos v$, $y = e^u + u\sin v$. Find the places where $u$ and $v$ can be locally written as differentiable functions of $x$ and $y$ and compute $\dfrac{\partial(u, v)}{\partial(x, y)}$.

Exercise 6.2.3. Find the places where $z$ can be locally expressed as a function of $x$ and $y$ and then compute $z_x$ and $z_y$.

1. $x = s + t$, $y = s^2 + t^2$, $z = s^3 + t^3$.

2. $x = e^{s+t}$, $y = e^{s-t}$, $z = st$.

Exercise 6.2.4. Change the partial differential equation $(x + y)u_x - (x - y)u_y = 0$ to an equation with respect to the polar coordinates $(r, \theta)$.

6.2.2 Implicit Differentiation

Suppose a continuous function $f(x, y)$ satisfies $f(x_0, y_0) = 0$ and has continuous partial derivative $f_y$ near $(x_0, y_0)$. Assume $f_y(x_0, y_0) > 0$. Then $f_y(x, y) > 0$ for $(x, y)$ near $(x_0, y_0)$. Thus for any fixed $x$, $f(x, y)$ is strictly increasing in $y$. In particular, for some small $\epsilon > 0$, we have $f(x_0, y_0 + \epsilon) > f(x_0, y_0) = 0$ and $f(x_0, y_0 - \epsilon) < f(x_0, y_0) = 0$. By the continuity in $x$, there is $\delta > 0$, such that $f(x, y_0 + \epsilon) > 0$ and $f(x, y_0 - \epsilon) < 0$ for any $x \in (x_0 - \delta, x_0 + \delta)$. Now for any fixed $x \in (x_0 - \delta, x_0 + \delta)$, $f(x, y)$ is strictly increasing in $y$ and has different signs at $y_0 + \epsilon$ and $y_0 - \epsilon$. Therefore there is a unique $y \in (y_0 - \epsilon, y_0 + \epsilon)$ satisfying $f(x, y) = 0$.

We say $y$ is an implicit function of $x$ because the function is only implicitly given by the equation $f(x, y) = 0$.
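The existence argument is constructive: for fixed $x$, the sign change of $f(x, \cdot)$ between $y_0 - \epsilon$ and $y_0 + \epsilon$ can be narrowed down by bisection. The sketch below assumes the sample equation $x^2 + y^2 - 1 = 0$ near $(x_0, y_0) = (0, 1)$ with $\epsilon = 0.5$; the name `implicit_y` and these values are illustrative and not from the text.

```python
def implicit_y(f, x, y_low, y_high, tol=1e-12):
    """Bisection in y for f(x, y) = 0, assuming f(x, y_low) < 0 < f(x, y_high)
    and f(x, .) strictly increasing on [y_low, y_high]."""
    lo, hi = y_low, y_high
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(x, mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def circle(x, y):
    # Sample equation; f_y = 2y > 0 near (0, 1).
    return x * x + y * y - 1

y_at_03 = implicit_y(circle, 0.3, 0.5, 1.5)
print(y_at_03)
```

For this sample the implicit function is known explicitly, $y = \sqrt{1 - x^2}$, which gives a direct check of the computed value.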

[Figure 6.4: implicit function. Labels in the figure: the point $(x_0, y_0)$, the levels $y_0 - \epsilon$ and $y_0 + \epsilon$, the interval from $x_0 - \delta$ to $x_0 + \delta$, the curve $y = y(x)$, and the sign $+$.]

The argument shows that if $f(x_0, y_0) = 0$ and $f_y(x_0, y_0) \ne 0$, plus some continuity condition, then the equation $f(x, y) = 0$ can be solved to define a function $y = y(x)$ near $(x_0, y_0)$. If $f$ is differentiable at $(x_0, y_0)$, then the equation $f(x, y) = 0$ is approximated by the linear equation

$$f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0) = 0.$$

The assumption $f_y(x_0, y_0) \ne 0$ makes it possible to solve the linear equation and get $y = y_0 - \dfrac{f_x(x_0, y_0)}{f_y(x_0, y_0)}(x - x_0)$. So the conclusion is that if the linear approximation implicitly defines a function, then the original equation also implicitly defines a function.

In general, consider a differentiable map $F\colon \mathbb{R}^n \times \mathbb{R}^m = \mathbb{R}^{m+n} \to \mathbb{R}^m$. We have

$$F'(\vec u, \vec v) = F'(\vec u, \vec 0) + F'(\vec 0, \vec v) = F_{\vec x}(\vec u) + F_{\vec y}(\vec v),$$

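where $F_{\vec x}\colon \mathbb{R}^n \to \mathbb{R}^m$ and $F_{\vec y}\colon \mathbb{R}^m \to \mathbb{R}^m$ are linear maps that generalize the partial derivatives. In terms of the Jacobian matrix, $F'$ can be written as the block matrix $(F_{\vec x}\ \ F_{\vec y})$.

Theorem 6.2.2 (Implicit Function Theorem). Suppose $F\colon \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^m$ is continuously differentiable near $(\vec x_0, \vec y_0)$ and satisfies $F(\vec x_0, \vec y_0) = \vec 0$. Suppose the derivative $F_{\vec y}(\vec x_0, \vec y_0)\colon \mathbb{R}^m \to \mathbb{R}^m$ in $\vec y$ is an invertible linear map. Then there is an open subset $U$ containing $\vec x_0$ and a map $G\colon U \subset \mathbb{R}^n \to \mathbb{R}^m$, such that $G$ is continuously differentiable near $\vec x_0$, $G(\vec x_0) = \vec y_0$ and $F(\vec x, G(\vec x)) = \vec 0$.

The map $\vec y = G(\vec x)$ is the solution to the equation $F(\vec x, \vec y) = \vec 0$. Since the equation is approximated by the linear equation

$$F_{\vec x}(\vec x_0, \vec y_0)\Delta\vec x + F_{\vec y}(\vec x_0, \vec y_0)\Delta\vec y = \vec 0,$$

the solution

$$\Delta\vec y = -F_{\vec y}^{-1}F_{\vec x}\Delta\vec x$$

of the linear equation is expected to approximate $G$. Thus we expect to have

$$G' = -F_{\vec y}^{-1}F_{\vec x}.$$

This is the same as saying that the differential $d\vec y = G' d\vec x$ is the solution of the equation

$$dF = F_{\vec x}\, d\vec x + F_{\vec y}\, d\vec y = 0.$$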

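Proof. The map $H(\vec x, \vec y) = (\vec x, F(\vec x, \vec y))\colon \mathbb{R}^{m+n} \to \mathbb{R}^{m+n}$ has continuous derivative

$$H'(\vec u, \vec v) = (\vec u, F_{\vec x}(\vec u) + F_{\vec y}(\vec v))$$

near $(\vec x_0, \vec y_0)$. The invertibility of $F_{\vec y}(\vec x_0, \vec y_0)$ implies the invertibility of $H'(\vec x_0, \vec y_0)$. Then by the inverse function theorem, $H$ has inverse $H^{-1} = (S, T)$ that is continuously differentiable near $H(\vec x_0, \vec y_0) = (\vec x_0, \vec 0)$, where $S\colon \mathbb{R}^{m+n} \to \mathbb{R}^n$ and $T\colon \mathbb{R}^{m+n} \to \mathbb{R}^m$. Since $H \circ H^{-1} = (S, F(S, T))$ is the identity, we have $S(\vec x, \vec z) = \vec x$ and $F(\vec x, T(\vec x, \vec z)) = \vec z$. Then

$$F(\vec x, \vec y) = \vec 0 \iff H(\vec x, \vec y) = (\vec x, \vec 0) \iff (\vec x, \vec y) = (S(\vec x, \vec 0), T(\vec x, \vec 0)) = (\vec x, T(\vec x, \vec 0)).$$

Therefore $\vec y = G(\vec x) = T(\vec x, \vec 0)$ is exactly the solution of $F(\vec x, \vec y) = \vec 0$.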

Example 6.2.2. The unit sphere $S^2 \subset \mathbb{R}^3$ is given by the equation $f(x, y, z) = x^2 + y^2 + z^2 = 1$. By $f_z = 2z$ and the implicit function theorem, $z$ can be expressed as a function of $(x, y)$ near any place where $z \ne 0$. In fact, the expression is $z = \pm\sqrt{1 - x^2 - y^2}$, where the sign is the same as the sign of $z$. By solving the equation

$$df = 2x\, dx + 2y\, dy + 2z\, dz = 0,$$

we get $dz = -\dfrac{x}{z}\, dx - \dfrac{y}{z}\, dy$. Therefore $z_x = -\dfrac{x}{z}$ and $z_y = -\dfrac{y}{z}$.
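The formulas $z_x = -x/z$ and $z_y = -y/z$ can be compared with finite differences of the explicit upper branch $z = \sqrt{1 - x^2 - y^2}$. This is a small numerical sketch; the sample point $(0.3, 0.4)$, the step `h` and the function names are assumptions for illustration.

```python
import math

def z_upper(x, y):
    # Upper branch of the sphere, valid where x**2 + y**2 < 1.
    return math.sqrt(1 - x * x - y * y)

def partials(x, y, h=1e-6):
    """Central-difference estimates of z_x and z_y on the upper branch."""
    zx = (z_upper(x + h, y) - z_upper(x - h, y)) / (2 * h)
    zy = (z_upper(x, y + h) - z_upper(x, y - h)) / (2 * h)
    return zx, zy

x0, y0 = 0.3, 0.4
z0 = z_upper(x0, y0)
zx, zy = partials(x0, y0)
print(zx, -x0 / z0)
print(zy, -y0 / z0)
```

Each printed pair should agree to several decimal places, as predicted by solving $df = 0$ for $dz$.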

Exercise 6.2.5. Prove that the derivative of the implicit function is $G' = -F_{\vec y}^{-1}F_{\vec x}$.

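Exercise 6.2.6. Find the places where maps are implicitly defined and compute the derivatives of the maps.

1. $x^3 + y^3 - 3axy = 0$, find $\dfrac{dy}{dx}$.

2. $x^2 + y^2 + z^2 - 4x + 6y - 2z = 0$, find $\dfrac{\partial z}{\partial(x, y)}$ and $\dfrac{\partial x}{\partial(y, z)}$.

3. $x^2 + y^2 = z^2 + w^2$, $x + y + z + w = 1$, find $\dfrac{\partial(z, w)}{\partial(x, y)}$ and $\dfrac{\partial(x, w)}{\partial(y, z)}$.

4. $z = f(x + y + z, xyz)$, find $\dfrac{\partial z}{\partial(x, y)}$ and $\dfrac{\partial x}{\partial(y, z)}$.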

Exercise 6.2.7. Verify that implicitly defined functions satisfy the partial differential equations.

1. $f(x - az, y - bz) = 0$, $z = z(x, y)$ satisfies $a z_x + b z_y = 1$.

2. $x^2 + y^2 + z^2 = y f\left(\dfrac{z}{y}\right)$, $z = z(x, y)$ satisfies $(x^2 - y^2 - z^2) z_x + 2xy z_y = 2xz$.

Exercise 6.2.8. In solving the equation $f(x, y) = 0$ for two variable functions, we did not assume anything about the partial derivative in $x$. Extend the discussion to the general multivariable case and point out what conclusion in the implicit function theorem may not hold.

6.2.3 Hypersurface

For a continuously differentiable map $F\colon \mathbb{R}^n \to \mathbb{R}^m$, the graph $(\vec x, F(\vec x))\colon \mathbb{R}^n \to \mathbb{R}^{m+n}$ is an $n$-dimensional hypersurface in $\mathbb{R}^{m+n}$. However, we should not insist that it is always the last $m$ coordinates that can be written as a map of the first $n$ coordinates. For example, the unit circle $S^1 \subset \mathbb{R}^2$ is the graph of $y$ as a function of $x$ near $(x_0, y_0) \in S^1$ if $y_0 \ne 0$, and the circle is the graph of $x$ as a function of $y$ if $x_0 \ne 0$. Thus an $n$-dimensional hypersurface in $\mathbb{R}^{m+n}$ is a subset such that near any point, the subset is the graph of a continuously differentiable map of some choice of $m$ variables in terms of the other $n$ variables.

There are generally two ways of specifying a hypersurface. For $n \le m$, the image of a continuously differentiable map $F\colon \mathbb{R}^n \to \mathbb{R}^m$ may give an $n$-dimensional parametrized hypersurface in $\mathbb{R}^m$. For $n \ge m$, the preimage of a continuously differentiable map $F\colon \mathbb{R}^n \to \mathbb{R}^m$ may give an $(n - m)$-dimensional level hypersurface in $\mathbb{R}^n$.

A continuously differentiable map $F\colon \mathbb{R}^n \to \mathbb{R}^m$ is regular if the derivative $F'(\vec x)\colon \mathbb{R}^n \to \mathbb{R}^m$ is always an injective linear map (this necessarily implies $n \le m$). For example, a parametrized curve $\phi(t)$ is regular if its tangent vector $\phi'(t)$ is not zero, and a parametrized surface $\sigma(u, v)$ is regular if the tangent vectors $\sigma_u$ and $\sigma_v$ are not parallel. From linear algebra, the injective linear map $F'(\vec x)$ is invertible when projected to certain $n$ coordinates in $\mathbb{R}^m$. Assume these $n$ coordinates are the first $n$ coordinates. Then $F(\vec x) = (G(\vec x), H(\vec x))$, where $G\colon \mathbb{R}^n \to \mathbb{R}^n$ and $H\colon \mathbb{R}^n \to \mathbb{R}^{m-n}$ are continuously

differentiable and $G'(\vec x)$ is invertible. By the inverse function theorem, $G$ is locally invertible, and we can locally change the variable from $\vec x$ to $\vec y = G(\vec x)$ and get $F(G^{-1}(\vec y)) = (\vec y, H(G^{-1}(\vec y)))$. In other words, by a local change of variable in $\mathbb{R}^n$, $F$ becomes the graph of a continuously differentiable map $H \circ G^{-1}\colon \mathbb{R}^n \to \mathbb{R}^{m-n}$. Thus $F$ specifies an $n$-dimensional hypersurface $S$ in $\mathbb{R}^m$.

The tangent space $T_{F(\vec x)}S$ of the hypersurface is the collection of tangent vectors of the curves $F(\phi(t))$ in the surface. By $(F(\phi(t)))' = F'(\phi(t))(\phi'(t))$ (linear transform $F'(\phi(t))$ applied to vector $\phi'(t)$), the tangent space is the image of the linear transform $F'(\vec x)$:

$$T_{F(\vec x)}S = \{F'(\vec x)(\vec v) : \vec v \in \mathbb{R}^n\}. \tag{6.2.8}$$

Example 6.2.3. For the parametrized sphere (5.1.19), we have

$$\frac{\partial(x, y, z)}{\partial(\phi, \theta)} = \begin{pmatrix} \cos\phi\cos\theta & -\sin\phi\sin\theta \\ \cos\phi\sin\theta & \sin\phi\cos\theta \\ -\sin\phi & 0 \end{pmatrix}.$$

The matrix has rank 2 as long as $\sin\phi \ne 0$ (i.e., $z = \cos\phi \ne \pm 1$). Therefore the parametrization is a regular surface away from the north and south poles.

Moreover, if $\sin\phi \ne 0$ and $\cos\phi \ne 0$ (i.e., $z \ne 0, \pm 1$), then $\dfrac{\partial(x, y)}{\partial(\phi, \theta)}$ has full rank 2. This suggests that away from the two poles and the equator, the map $(\phi, \theta) \mapsto (x, y)$ is locally invertible and the surface can be reparametrized by $(x, y)$. As a matter of fact, on the whole north hemisphere (which means $0 < z < 1$), we have the explicit formula for the reparametrization

$$(\phi, \theta) = \left(\arcsin\sqrt{x^2 + y^2}, \arctan\frac{y}{x}\right), \qquad z = \cos\arcsin\sqrt{x^2 + y^2} = \sqrt{1 - x^2 - y^2}.$$

The south hemisphere can be similarly reparametrized.
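The rank statements can be checked through the three $2 \times 2$ minors of the Jacobian matrix displayed in the example: the matrix has rank 2 exactly when some minor is nonzero, and the minor built from the $x$ and $y$ rows is the determinant of $\partial(x, y)/\partial(\phi, \theta)$. The sketch below evaluates the minors on the equator and at a pole; the function names and sample angles are assumptions for illustration.

```python
import math

def sphere_jacobian(phi, theta):
    # Rows correspond to x, y, z; columns to phi, theta (matrix of Example 6.2.3).
    return [[math.cos(phi) * math.cos(theta), -math.sin(phi) * math.sin(theta)],
            [math.cos(phi) * math.sin(theta), math.sin(phi) * math.cos(theta)],
            [-math.sin(phi), 0.0]]

def minors(j):
    """The three 2x2 minors, keyed by the pair of rows (coordinates) kept."""
    def det(r, s):
        return j[r][0] * j[s][1] - j[r][1] * j[s][0]
    return {"xy": det(0, 1), "xz": det(0, 2), "yz": det(1, 2)}

equator = minors(sphere_jacobian(math.pi / 2, 0.3))
pole = minors(sphere_jacobian(0.0, 0.3))
print(equator)
print(pole)
```

On the equator the `xy` minor vanishes (up to rounding) while another minor does not, so the surface is still regular there but cannot be reparametrized by $(x, y)$; at the pole all three minors vanish.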

Example 6.2.4. In Example 6.1.15, we tried to characterize functions $f(x, y)$ satisfying $xf_x = yf_y$. The equation can also be interpreted as $f$ being constant along the curves tangential to $(x, -y)$ everywhere:

\[
(x', y') = \lambda(x, -y) \implies \frac{d}{dt} f(x(t), y(t)) = x'f_x + y'f_y = \lambda(xf_x - yf_y) = 0.
\]

By

\[ x' = x \implies (\log|x|)' = 1 \implies x = ae^t, \]

and similarly $y' = -y$ implying $y = be^{-t}$, we see that the curves are really the levels $xy = c$. This suggests that $f(x, y) = h(xy)$ for some $h$.
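As a quick sanity check of this conclusion, the sketch below takes one hypothetical choice of $h$ (any differentiable $h$ should behave the same way), approximates $f_x$ and $f_y$ by central differences, and evaluates $xf_x - yf_y$. It also samples $f$ along one curve $x = ae^t$, $y = be^{-t}$ to illustrate that $f$ stays constant there.

```python
import math

def h(s):
    # Hypothetical choice of the single variable function h.
    return math.sin(s) + s * s

def f(x, y):
    return h(x * y)

def partial(g, x, y, var, eps=1e-6):
    # Central difference approximation of g_x or g_y.
    if var == "x":
        return (g(x + eps, y) - g(x - eps, y)) / (2 * eps)
    return (g(x, y + eps) - g(x, y - eps)) / (2 * eps)

def residual(x, y):
    # x f_x - y f_y, which should vanish for f = h(xy).
    return x * partial(f, x, y, "x") - y * partial(f, x, y, "y")

def curve(a, b, t):
    # The curve x = a e^t, y = b e^{-t}, along which xy = ab.
    return a * math.exp(t), b * math.exp(-t)

values = [f(*curve(2.0, 0.5, t)) for t in (-1.0, 0.0, 1.0)]
print(residual(1.3, -0.7), values)
```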

Exercise 6.2.9. Study the reparametrization of the sphere (5.1.19) by two coordinates along the equator.

Exercise 6.2.10. For the parametrized torus (5.1.20), find the places where the surfaces are regular and then compute the partial derivatives of some coordinate in terms of the other two.

A vector $\vec{y}_0 \in \mathbb{R}^m$ is a regular value of a continuously differentiable map $F : \mathbb{R}^n \to \mathbb{R}^m$ if $F'(\vec{x}) : \mathbb{R}^n \to \mathbb{R}^m$ is a surjective linear map (this necessarily implies $n \ge m$) for all $\vec{x}$ satisfying $F(\vec{x}) = \vec{y}_0$ ($\vec{x}$ is a preimage of $\vec{y}_0$). From linear algebra, the surjective linear map $F'(\vec{x})$ is invertible when restricted to a certain choice of $m$ coordinates in $\mathbb{R}^n$ (by taking all the other $n - m$ coordinates to be 0). By taking these $m$ coordinates to be $\mathbb{R}^m$ and the remaining $n - m$ coordinates to be $\mathbb{R}^n$ in the implicit function theorem, we find the equation $F(\vec{x}) = \vec{y}_0$ defines the graph of a continuously differentiable map that expresses the $m$ coordinates in terms of the remaining $n - m$ coordinates. In particular, this shows that $F(\vec{x}) = \vec{y}_0$ is an $(n - m)$-dimensional hypersurface $S$ in $\mathbb{R}^n$.

The tangent space $T_{\vec{x}}S$ is given by the derivative of the map produced by the implicit function theorem. The derivative of the map is computed from the linear approximation of the defining equation $F(\vec{x}) = \vec{y}_0$. Thus the tangent space

\[ T_{\vec{x}}S = \{\vec{v} \in \mathbb{R}^n : F'(\vec{x})(\vec{v}) = \vec{0}\} \tag{6.2.9} \]

is the kernel of the linear transform $F'(\vec{x}) : \mathbb{R}^n \to \mathbb{R}^m$.

For a continuously differentiable function $f(\vec{x})$, a number $y_0$ is a regular value if $f(\vec{x}) = y_0$ implies the gradient $\nabla f \ne \vec{0}$ at $\vec{x}$ (i.e., at least one partial derivative is nonzero). The equation $f(\vec{x}) = y_0$ specifies an $(n - 1)$-dimensional hypersurface, with the tangent hyperplane given by $\nabla f \cdot \vec{v} = 0$. In other words, the gradient $\nabla f$ is the normal vector of the tangent hyperplane.

Back to a continuously differentiable map $F : \mathbb{R}^n \to \mathbb{R}^m$. Express the map $F = (f_1, f_2, \dots, f_m)$ in its individual coordinates. The derivative $F'(\vec{x}) : \mathbb{R}^n \to \mathbb{R}^m$ is surjective if and only if the gradients $\nabla f_1$, $\nabla f_2$, \dots, $\nabla f_m$ are linearly independent:

\[ c_1\nabla f_1 + c_2\nabla f_2 + \cdots + c_m\nabla f_m = \vec{0} \implies c_1 = c_2 = \cdots = c_m = 0. \]

For a regular value $\vec{y}_0$, the hypersurface $S$ defined by the equation $F(\vec{x}) = \vec{y}_0$ is the intersection of the $(n - 1)$-dimensional hypersurfaces $f_i(\vec{x}) = y_{i0}$. The tangent space $T_{\vec{x}}S$ is the intersection of the tangent spaces $\nabla f_i \cdot \vec{v} = 0$ for the coordinate functions. The normal vectors $\nabla f_1$, $\nabla f_2$, \dots, $\nabla f_m$ span the normal space of the hypersurface $S$ at $\vec{x}$.

Example 6.2.5. The unit sphere is $S^{n-1} = f^{-1}(1)$, where $f(\vec{x}) = \vec{x} \cdot \vec{x} = x_1^2 + x_2^2 + \cdots + x_n^2$. The number 1 is a regular value because $f(\vec{x}) = 1 \implies \vec{x} \ne \vec{0} \implies \nabla f = 2\vec{x} \ne \vec{0}$. As a linear map, the restriction of the derivative $\vec{v} \mapsto 2\vec{x} \cdot \vec{v}$ to the $i$-th coordinate is invertible if and only if $x_i \ne 0$. By the implicit function theorem, if $x_i \ne 0$, then $x_i$ can be expressed as a function of $(x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n)$ near $\vec{x}$. Indeed, the part of the sphere with $x_i \ne 0$ consists of the graphs of the functions

\[ x_i = \pm\sqrt{1 - x_1^2 - \cdots - x_{i-1}^2 - x_{i+1}^2 - \cdots - x_n^2} \]

(which graph $\vec{x}$ belongs to depends on the sign of its $i$-th coordinate). The whole sphere is then covered by these pieces of hypersurfaces, called coordinate charts.

Exercise 6.2.11. Find the regular values of the maps and determine the tangent spaces of the preimages of regular values.

1. $f(x, y, z) = x^2 + y^2 + z^2 + xy + yz + zx + x + y + z$.

2. $F(x, y, z) = (x + y + z, xy + yz + zx)$.


Exercise 6.2.12. Find a regular value $\lambda$ for $xyz$, so that the surface $xyz = \lambda$ is tangential to the ellipsoid $\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$ at some point.

Exercise 6.2.13. Prove that any sphere $x^2 + y^2 + z^2 = a^2$ and any cone $x^2 + y^2 = b^2z^2$ are orthogonal at their intersections. Can you extend this to higher dimension?

Exercise 6.2.14. The space of $n \times n$ symmetric matrices can be identified with $\mathbb{R}^{n(n+1)/2}$. By considering the map $F(X) = X^TX : \mathbb{R}^{n^2} \to \mathbb{R}^{n(n+1)/2}$ in Exercise 6.1.6, prove that the orthogonal matrices $O(n) = \{X : X^TX = I\}$ form an $\frac{n(n-1)}{2}$-dimensional hypersurface in $\mathbb{R}^{n^2}$.

6.2.4 Exercise

Orthogonal Change of Variable

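A change of variable $\vec{x} = F(\vec{y}) : \mathbb{R}^n \to \mathbb{R}^n$ is orthogonal if the vectors

\[
\vec{x}_{y_1} = \frac{\partial\vec{x}}{\partial y_1}, \quad
\vec{x}_{y_2} = \frac{\partial\vec{x}}{\partial y_2}, \quad \dots, \quad
\vec{x}_{y_n} = \frac{\partial\vec{x}}{\partial y_n}
\]

are orthogonal.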

Exercise 6.2.15. Prove that an orthogonal change of variable satisfies

\[ \frac{\partial y_i}{\partial x_j} = \frac{1}{\|\vec{x}_{y_i}\|_2^2} \frac{\partial x_j}{\partial y_i}. \]

Exercise 6.2.16. Is the inverse $\vec{y} = F^{-1}(\vec{x})$ of an orthogonal change of variable also an orthogonal change of variable?

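Exercise 6.2.17. Prove that under an orthogonal change of variable, the gradient in $\vec{x}$ can be written in terms of the new variable by

\[
\nabla f = \frac{\partial f}{\partial y_1} \frac{\vec{x}_{y_1}}{\|\vec{x}_{y_1}\|_2^2}
+ \frac{\partial f}{\partial y_2} \frac{\vec{x}_{y_2}}{\|\vec{x}_{y_2}\|_2^2}
+ \cdots
+ \frac{\partial f}{\partial y_n} \frac{\vec{x}_{y_n}}{\|\vec{x}_{y_n}\|_2^2}.
\]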

Elementary Symmetric Polynomial

The elementary symmetric polynomials for $n$ variables $x_1, x_2, \dots, x_n$ are

\[ \sigma_k = \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} x_{i_1}x_{i_2} \cdots x_{i_k}, \quad k = 1, 2, \dots, n. \]

Vieta's formulas say that they appear as the coefficients of the polynomial

\[
p(x) = (x - x_1)(x - x_2) \cdots (x - x_n)
= x^n - \sigma_1x^{n-1} + \sigma_2x^{n-2} - \cdots + (-1)^{n-1}\sigma_{n-1}x + (-1)^n\sigma_n. \tag{6.2.10}
\]

Therefore $\vec{x} \mapsto \vec{\sigma} : \mathbb{R}^n \to \mathbb{R}^n$ is the map that takes the roots of polynomials to polynomials.


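Exercise 6.2.18. Prove that the derivative $\frac{\partial\vec{\sigma}}{\partial\vec{x}}$ of the polynomial with respect to the roots satisfies

\[
\begin{pmatrix}
-x_1^{n-1} & x_1^{n-2} & \cdots & (-1)^n \\
-x_2^{n-1} & x_2^{n-2} & \cdots & (-1)^n \\
\vdots & \vdots & & \vdots \\
-x_n^{n-1} & x_n^{n-2} & \cdots & (-1)^n
\end{pmatrix}
\frac{\partial\vec{\sigma}}{\partial\vec{x}}
+
\begin{pmatrix}
p'(x_1) & 0 & \cdots & 0 \\
0 & p'(x_2) & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & p'(x_n)
\end{pmatrix}
= O,
\]

where the $k$-th column of the first matrix carries the sign $(-1)^k$ of the coefficient of $\sigma_k$ in (6.2.10). Then prove that when the roots are distinct, the roots can be locally written as continuously differentiable functions of the polynomial.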

Power Sum and Newton’s Identity

The power sums for $n$ variables $x_1, x_2, \dots, x_n$ are

\[ s_k = \sum_{1 \le i \le n} x_i^k = x_1^k + x_2^k + \cdots + x_n^k, \quad k = 1, 2, \dots, n. \]

For the polynomial $p(x)$ in (6.2.10), by adding $p(x_i) = 0$ together for $i = 1, 2, \dots, n$, we get

\[ s_n - \sigma_1s_{n-1} + \sigma_2s_{n-2} - \cdots + (-1)^{n-1}\sigma_{n-1}s_1 + (-1)^nn\sigma_n = 0. \tag{6.2.11} \]

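Exercise 6.2.19. For

\[
u_{l,k} = \sum_{i_1 < i_2 < \cdots < i_l,\ j \ne i_p} x_{i_1}x_{i_2} \cdots x_{i_l}x_j^k, \quad l \ge 0,\ k \ge 1,\ l + k \le n,
\]

prove that

\[
\begin{aligned}
s_k &= u_{0,k}, \\
\sigma_1s_{k-1} &= u_{0,k} + u_{1,k-1}, \\
\sigma_2s_{k-2} &= u_{1,k-1} + u_{2,k-2}, \\
&\ \ \vdots \\
\sigma_{k-2}s_2 &= u_{k-3,3} + u_{k-2,2}, \\
\sigma_{k-1}s_1 &= u_{k-2,2} + k\sigma_k,
\end{aligned}
\]

and derive Newton's identities

\[ s_k - \sigma_1s_{k-1} + \sigma_2s_{k-2} - \cdots + (-1)^{k-1}\sigma_{k-1}s_1 + (-1)^kk\sigma_k = 0, \quad k = 1, 2, \dots, n. \tag{6.2.12} \]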
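Newton's identities (6.2.12) are easy to test on concrete numbers before attempting the proof. The sketch below is only a numerical check, not a derivation; the helper names and the sample roots are hypothetical. It computes $\sigma_k$ and $s_k$ directly from their definitions and evaluates the left side of (6.2.12) for each $k$.

```python
from itertools import combinations
from math import prod

def elementary(xs, k):
    # sigma_k: sum of products over all k-element subsets of the variables.
    return sum(prod(c) for c in combinations(xs, k))

def power_sum(xs, k):
    # s_k: sum of the k-th powers of the variables.
    return sum(x ** k for x in xs)

def newton_residual(xs, k):
    # Left side of (6.2.12) for the given k.
    total = power_sum(xs, k)
    for j in range(1, k):
        total += (-1) ** j * elementary(xs, j) * power_sum(xs, k - j)
    total += (-1) ** k * k * elementary(xs, k)
    return total

roots = [2, -3, 5, 7]  # integer sample values, so the arithmetic is exact
residuals = [newton_residual(roots, k) for k in range(1, len(roots) + 1)]
print(residuals)
```

With integer inputs every residual should be exactly zero.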

Exercise 6.2.20. Prove that there is a polynomial invertible map that relates $\vec{s} = (s_1, s_2, \dots, s_n)$ and $\vec{\sigma} = (\sigma_1, \sigma_2, \dots, \sigma_n)$. Then discuss the local invertibility of the map $\vec{x} \mapsto \vec{\sigma} : \mathbb{R}^n \to \mathbb{R}^n$ when there are multiple roots (see Exercise 6.2.18).

Functional Dependence

A collection of functions is functionally dependent if some can be written as functions of the others. For example, $f = x + y$, $g = x^2 + y^2$, and $h = x^3 + y^3$ are functionally dependent because $h = \frac{3}{2}fg - \frac{1}{2}f^3$. In the following exercises, all functions are continuously differentiable.


Exercise 6.2.21. Prove that if $f_1(\vec{x})$, $f_2(\vec{x})$, \dots, $f_n(\vec{x})$ are functionally dependent, then the gradients $\nabla f_1$, $\nabla f_2$, \dots, $\nabla f_n$ are linearly dependent.

Exercise 6.2.22. Prove that $f_1(\vec{x})$, $f_2(\vec{x})$, \dots, $f_n(\vec{x})$ are functionally dependent near $\vec{x}_0$ if and only if there is a function $h(\vec{y})$ defined for $\vec{y}$ near $\vec{y}_0 = (f_1(\vec{x}_0), f_2(\vec{x}_0), \dots, f_n(\vec{x}_0))$, such that $\nabla h(\vec{y}_0) \ne \vec{0}$ and $h(f_1(\vec{x}), f_2(\vec{x}), \dots, f_n(\vec{x})) = 0$.

Exercise 6.2.23. Suppose the gradients $\nabla f$, $\nabla g$ are linearly dependent everywhere, and $\nabla g(\vec{x}_0) \ne \vec{0}$. Prove that there is a function $h(y)$ defined for $y$ near $g(\vec{x}_0)$, such that $f(\vec{x}) = h(g(\vec{x}))$ for $\vec{x}$ near $\vec{x}_0$.

Hint: If $\frac{\partial g}{\partial x_1} \ne 0$, then $(x_1, x_2, \dots, x_n) \mapsto (g, x_2, \dots, x_n)$ is invertible near $\vec{x}_0$. After changing the variables from $(x_1, x_2, \dots, x_n)$ to $(g, x_2, \dots, x_n)$, verify that $\frac{\partial f}{\partial x_2} = \cdots = \frac{\partial f}{\partial x_n} = 0$.

Exercise 6.2.24. Suppose the gradient vectors $\nabla f$, $\nabla g_1$, \dots, $\nabla g_k$ are linearly dependent near $\vec{x}_0$. Suppose $\nabla g_1$, \dots, $\nabla g_k$ are linearly independent at $\vec{x}_0$. Prove that there is a function $h(\vec{y})$ defined for $\vec{y}$ near $(g_1(\vec{x}_0), \dots, g_k(\vec{x}_0))$, such that $f(\vec{x}) = h(g_1(\vec{x}), \dots, g_k(\vec{x}))$ for $\vec{x}$ near $\vec{x}_0$.

Exercise 6.2.25. Suppose the rank of the gradient vectors $\nabla f_1$, $\nabla f_2$, \dots, $\nabla f_m$ is always $k$ near $\vec{x}_0$. Prove that there are $k$ functions from $f_1, f_2, \dots, f_m$, such that the other $m - k$ functions are functionally dependent on these $k$ functions.

Exercise 6.2.26. Determine functional dependence.

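1. $x + y + z$, $x^2 + y^2 + z^2$, $x^3 + y^3 + z^3$.

2. $x + y - z$, $x - y + z$, $x^2 + y^2 + z^2 - 2yz$.

3. $\frac{x}{x^2 + y^2 + z^2}$, $\frac{y}{x^2 + y^2 + z^2}$, $\frac{z}{x^2 + y^2 + z^2}$.

4. $\frac{x}{\sqrt{x^2 + y^2 + z^2}}$, $\frac{y}{\sqrt{x^2 + y^2 + z^2}}$, $\frac{z}{\sqrt{x^2 + y^2 + z^2}}$.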

6.3 High Order Differentiation

The high order differentiation and Taylor expansion can also be extended to multivariable functions. The concepts can be used to solve problems that require high order approximations, such as local maximum and minimum with or without constraint.

6.3.1 Quadratic Approximation

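A function $f(\vec{x})$ defined near $\vec{x}_0 \in \mathbb{R}^n$ is approximated by a quadratic function

\[
p(\vec{x}) = a + \sum_{1 \le i \le n} b_i(x_i - x_{i0}) + \sum_{1 \le i,j \le n} c_{ij}(x_i - x_{i0})(x_j - x_{j0})
= a + \vec{b} \cdot \Delta\vec{x} + C\Delta\vec{x} \cdot \Delta\vec{x} \tag{6.3.1}
\]

if for any $\epsilon > 0$, there is $\delta > 0$, such that

\[ \|\Delta\vec{x}\| < \delta \implies |f(\vec{x}) - p(\vec{x})| \le \epsilon\|\Delta\vec{x}\|^2. \tag{6.3.2} \]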


In this case, we say $f$ is second order differentiable at $\vec{x}_0$, with the quadratic form

\[ f''(\vec{x}_0)(\vec{v}) = 2\sum_{1 \le i,j \le n} c_{ij}v_iv_j = 2C\vec{v} \cdot \vec{v} \tag{6.3.3} \]

as the second order derivative. The second order differential is

\[ d^2f = 2\sum_{1 \le i,j \le n} c_{ij}\,dx_i\,dx_j = 2C\,d\vec{x} \cdot d\vec{x}. \tag{6.3.4} \]

Similar to the single variable case, the second order differentiability implies the first order differentiability, so that $a = f(\vec{x}_0)$ and $b_i = \frac{\partial f(\vec{x}_0)}{\partial x_i}$. We expect the coefficients $c_{ij}$ to be the second order partial derivatives

\[
c_{ij} = \frac{1}{2}f_{x_ix_j}(\vec{x}_0) = \frac{1}{2}\frac{\partial^2f(\vec{x}_0)}{\partial x_i\partial x_j}
= \frac{1}{2}\frac{\partial}{\partial x_i}\left(\frac{\partial f}{\partial x_j}\right)_{\vec{x} = \vec{x}_0}. \tag{6.3.5}
\]

However, a second order differentiable function can be very bad (discontinuous, for example) away from $\vec{x}_0$. Therefore the formula (6.3.5) may not make sense at all. On the other hand, Theorem 2.3.1 suggests that if the second order partial derivatives do exist and have good enough properties, then the quadratic function (6.3.1) with the coefficients given by suitable partial derivatives should approximate $f$.

To find a suitable extension of Theorem 2.3.1 to multivariable functions, we study the case $n = 2$ and $\vec{x}_0 = \vec{0}$. In other words, we consider a two variable function $f(x, y)$ defined near $(0, 0)$. Assume $f$ has first order partial derivatives near $(0, 0)$ and second order derivatives at $(0, 0)$. To show that

\[
p(x, y) = f(0, 0) + f_x(0, 0)x + f_y(0, 0)y
+ \frac{1}{2}\left(f_{xx}(0, 0)x^2 + f_{xy}(0, 0)xy + f_{yx}(0, 0)yx + f_{yy}(0, 0)y^2\right) \tag{6.3.6}
\]

approximates $f$ near $(0, 0)$, we restrict the remainder $R_2(x, y) = f(x, y) - p(x, y)$ to the straight lines passing through $(0, 0)$. Therefore for fixed $(x, y)$ close to $(0, 0)$, we introduce

\[ r_2(t) = R_2(tx, ty) = f(tx, ty) - p(tx, ty). \]

Assume $f$ is differentiable near $(0, 0)$. Then by the chain rule, $r_2$ is differentiable, with

\[
\begin{aligned}
r_2'(t) &= f_x(tx, ty)x + f_y(tx, ty)y - f_x(0, 0)x - f_y(0, 0)y \\
&\quad - \left(f_{xx}(0, 0)x^2 + f_{xy}(0, 0)xy + f_{yx}(0, 0)yx + f_{yy}(0, 0)y^2\right)t \\
&= x[f_x(tx, ty) - f_x(0, 0) - f_{xx}(0, 0)tx - f_{yx}(0, 0)ty] \\
&\quad + y[f_y(tx, ty) - f_y(0, 0) - f_{xy}(0, 0)tx - f_{yy}(0, 0)ty].
\end{aligned} \tag{6.3.7}
\]

Further assume that $f_x$ and $f_y$ are differentiable at $(0, 0)$. Then for any $\epsilon > 0$, there is $\delta > 0$, such that $\|(\xi, \eta)\| < \delta$ implies

\[
\begin{aligned}
|f_x(\xi, \eta) - f_x(0, 0) - f_{xx}(0, 0)\xi - f_{yx}(0, 0)\eta| &\le \epsilon\|(\xi, \eta)\|, \\
|f_y(\xi, \eta) - f_y(0, 0) - f_{xy}(0, 0)\xi - f_{yy}(0, 0)\eta| &\le \epsilon\|(\xi, \eta)\|.
\end{aligned}
\]


Taking $(\xi, \eta) = (tx, ty)$, by (6.3.7) we find that $\|(x, y)\| < \delta$ and $0 < t < 1$ imply

\[ |r_2'(t)| \le |x|\epsilon\|(tx, ty)\| + |y|\epsilon\|(tx, ty)\| = t\epsilon(|x| + |y|)\|(x, y)\|. \]

Inspired by the proof of Theorem 2.3.1, we apply Cauchy's mean value theorem to get

\[ R_2(x, y) = r_2(1) = \frac{r_2(1) - r_2(0)}{1^2 - 0^2} = \frac{r_2'(c)}{2c}, \quad 0 < c < 1. \]

Therefore

\[ \|(x, y)\| < \delta \implies |R_2(x, y)| \le \frac{1}{2}\epsilon(|x| + |y|)\|(x, y)\|. \]

By the equivalence of norms, the right side is comparable to $\epsilon\|(x, y)\|^2$. Therefore we conclude that $p(x, y)$ is a quadratic approximation of $f(x, y)$ near $(0, 0)$.

Theorem 6.3.1. Suppose a function $f$ is differentiable near $\vec{x}_0$ and the partial derivatives $\frac{\partial f}{\partial x_i}$ are differentiable at $\vec{x}_0$. Then $f$ is second order differentiable at $\vec{x}_0$, with the quadratic approximation

\[
T_2(\vec{x}) = f(\vec{x}_0) + \sum_{1 \le i \le n} \frac{\partial f(\vec{x}_0)}{\partial x_i}\Delta x_i
+ \frac{1}{2}\sum_{1 \le i,j \le n} \frac{\partial^2f(\vec{x}_0)}{\partial x_i\partial x_j}\Delta x_i\Delta x_j.
\]

Under a slightly stronger condition, the partial derivative coefficients are symmetric.

Theorem 6.3.2. Suppose $f(x, y)$ has partial derivatives $f_x$, $f_y$, $f_{xy}$ near $(x_0, y_0)$ and $f_{xy}$ is continuous at $(x_0, y_0)$. Then $f_{yx}(x_0, y_0)$ exists and $f_{xy}(x_0, y_0) = f_{yx}(x_0, y_0)$.

Thus if a function has all the first and second order partial derivatives near $\vec{x}_0$, and the second order partial derivatives are continuous at $\vec{x}_0$, then the condition of Theorem 6.3.1 is satisfied. Moreover, the second order partial derivatives are independent of the order of the variables.

Proof. For fixed $x_0$ and $x$, we apply the mean value theorem to the function $g(y) = f(x, y) - f(x_0, y)$. By the existence of $f_y$ near $(x_0, y_0)$, we get

\[
f(x, y) - f(x_0, y) - f(x, y_0) + f(x_0, y_0) = g(y) - g(y_0) = g'(d)(y - y_0)
= (f_y(x, d) - f_y(x_0, d))(y - y_0)
\]

for some $d$ between $y_0$ and $y$. Then we fix $d$ and apply the mean value theorem to the function $f_y(x, d)$ of $x$. By the existence of $f_{xy}$ near $(x_0, y_0)$, we get

\[ f(x, y) - f(x_0, y) - f(x, y_0) + f(x_0, y_0) = f_{xy}(c, d)(x - x_0)(y - y_0) \]


for some $c$ between $x_0$ and $x$. Then the continuity of $f_{xy}$ at $(x_0, y_0)$ tells us

\[
\lim_{x \to x_0,\, y \to y_0} \frac{f(x, y) - f(x_0, y) - f(x, y_0) + f(x_0, y_0)}{(x - x_0)(y - y_0)} = f_{xy}(x_0, y_0).
\]

On the other hand, for any fixed $y \ne y_0$, by the existence of $f_x$ near $(x_0, y_0)$, we have

\[
\lim_{x \to x_0} \frac{f(x, y) - f(x_0, y) - f(x, y_0) + f(x_0, y_0)}{(x - x_0)(y - y_0)} = \frac{f_x(x_0, y) - f_x(x_0, y_0)}{y - y_0}.
\]

Thus the double limit is equal to the repeated limit (see Exercise 5.1.56)

\[
\begin{aligned}
\lim_{y \to y_0} \frac{f_x(x_0, y) - f_x(x_0, y_0)}{y - y_0}
&= \lim_{y \to y_0}\lim_{x \to x_0} \frac{f(x, y) - f(x_0, y) - f(x, y_0) + f(x_0, y_0)}{(x - x_0)(y - y_0)} \\
&= \lim_{x \to x_0,\, y \to y_0} \frac{f(x, y) - f(x_0, y) - f(x, y_0) + f(x_0, y_0)}{(x - x_0)(y - y_0)} \\
&= f_{xy}(x_0, y_0).
\end{aligned}
\]

This means exactly that $f_{yx}(x_0, y_0)$ exists and $f_{xy}(x_0, y_0) = f_{yx}(x_0, y_0)$.
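The quotient at the heart of the proof can be watched converging numerically. The sketch below uses a hypothetical smooth test function whose mixed second order partial derivative is computed by hand; it illustrates the limit only and is not part of the argument.

```python
import math

def f(x, y):
    # Hypothetical smooth test function.
    return math.sin(x) * y ** 3 + x * x * y

def mixed_exact(x, y):
    # Mixed second order partial derivative of f above, computed by hand.
    return 3.0 * math.cos(x) * y * y + 2.0 * x

def double_quotient(x0, y0, dx, dy):
    # The quotient whose double limit appears in the proof.
    num = f(x0 + dx, y0 + dy) - f(x0, y0 + dy) - f(x0 + dx, y0) + f(x0, y0)
    return num / (dx * dy)

x0, y0 = 0.5, 1.2
errors = [abs(double_quotient(x0, y0, h, h) - mixed_exact(x0, y0))
          for h in (1e-1, 1e-2, 1e-3)]
print(errors)
```

The errors should shrink roughly in proportion to the step size.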

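Example 6.3.1. The function $f = xy^2z^3$ has continuous partial derivatives

\[
\begin{gathered}
f_x = y^2z^3, \quad f_y = 2xyz^3, \quad f_z = 3xy^2z^2, \\
f_{xx} = 0, \quad f_{yy} = 2xz^3, \quad f_{zz} = 6xy^2z, \quad f_{xy} = 2yz^3, \quad f_{xz} = 3y^2z^2, \quad f_{yz} = 6xyz^2.
\end{gathered}
\]

The values of the function and the partial derivatives at $(3, 2, 1)$ are

\[
\begin{gathered}
f = 12, \quad f_x = 4, \quad f_y = 12, \quad f_z = 36, \\
f_{xx} = 0, \quad f_{yy} = 6, \quad f_{zz} = 72, \quad f_{xy} = 4, \quad f_{xz} = 12, \quad f_{yz} = 36.
\end{gathered}
\]

The quadratic approximation of $f$ at $(3, 2, 1)$ is

\[
q = 12 + 4\Delta x + 12\Delta y + 36\Delta z
+ \frac{1}{2}\left(6\Delta y^2 + 72\Delta z^2 + 8\Delta x\Delta y + 24\Delta x\Delta z + 72\Delta y\Delta z\right),
\]

where $\Delta x = x - 3$, $\Delta y = y - 2$, $\Delta z = z - 1$. The quadratic differential is

\[ d^2_{(3,2,1)}f = 6\,dy^2 + 72\,dz^2 + 8\,dx\,dy + 24\,dx\,dz + 72\,dy\,dz. \]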
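The defining property (6.3.2) of a quadratic approximation can be observed numerically for this example. The following sketch is an illustration under stated assumptions: the gradient and the matrix of second order partial derivatives are estimated by central differences rather than entered by hand, and all helper names are hypothetical. It then reports the ratio of the error to $\|\Delta\vec{x}\|^2$ along one direction.

```python
def f(p):
    x, y, z = p
    return x * y ** 2 * z ** 3

def shifted(p, i, d):
    # Copy of the point p with the i-th coordinate moved by d.
    q = list(p)
    q[i] += d
    return q

def gradient(g, p, h=1e-4):
    # Central difference estimate of the first order partial derivatives.
    return [(g(shifted(p, i, h)) - g(shifted(p, i, -h))) / (2 * h)
            for i in range(len(p))]

def hessian(g, p, h=1e-3):
    # Central difference estimate of the second order partial derivatives.
    n = len(p)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            H[i][j] = (g(shifted(shifted(p, i, h), j, h))
                       - g(shifted(shifted(p, i, h), j, -h))
                       - g(shifted(shifted(p, i, -h), j, h))
                       + g(shifted(shifted(p, i, -h), j, -h))) / (4 * h * h)
    return H

def quadratic(g, p0, d):
    # Value of the quadratic approximation at p0 + d.
    grad, H = gradient(g, p0), hessian(g, p0)
    n = len(p0)
    linear = sum(grad[i] * d[i] for i in range(n))
    second = sum(H[i][j] * d[i] * d[j] for i in range(n) for j in range(n))
    return g(p0) + linear + 0.5 * second

p0 = [3.0, 2.0, 1.0]

def ratio(t):
    # |f - q| / ||d||^2 for the displacement d = (t, t, t).
    d = [t, t, t]
    exact = f([p0[i] + d[i] for i in range(3)])
    return abs(exact - quadratic(f, p0, d)) / (3 * t * t)

ratios = [ratio(t) for t in (0.1, 0.01, 0.001)]
print(ratios)
```

The ratios should decrease toward zero as the displacement shrinks, which is what (6.3.2) requires.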

Exercise 6.3.1. Compute the partial derivatives up to second order.

1. $4xy^3 + 5x^3y - 3y^2$.

2. $\arctan\frac{y}{x}$.

3. exyz sin(x+2y+3z).

4. $\log(x^2 + y^2)$.

5. xyz.

6. (xy)z.

Exercise 6.3.2. Construct a function that is second order differentiable at $(0, 0)$ but is not continuous anywhere away from $(0, 0)$.


Exercise 6.3.3. Show that the function

\[
f(x, y) =
\begin{cases}
\dfrac{xy(x^2 - y^2)}{x^2 + y^2} & \text{if } (x, y) \ne (0, 0), \\
0 & \text{if } (x, y) = (0, 0)
\end{cases}
\]

has all the second order partial derivatives near $(0, 0)$ but $f_{xy}(0, 0) \ne f_{yx}(0, 0)$.

Exercise 6.3.4. Consider the function $f(x, y)$ in Example 5.1.2. Show that $f(x, y)^2$ has all the second order derivatives but the function is still not continuous (so not differentiable) at $(0, 0)$. Can you find a function that has all the partial derivatives but is not continuous?

Exercise 6.3.5. Suppose $f$ has continuous second order partial derivatives. Prove that if $ff_{xy} = f_xf_y$, then $f(x, y) = g(x)h(y)$.

Exercise 6.3.6. Derive the chain rule for the second order derivative by composing the quadratic approximations.

6.3.2 High Order Partial Derivative

The discussion on the quadratic approximation suggests the role to be played by high order partial derivatives in the high order approximation. In general, the $k$-th order partial derivative

\[
\frac{\partial^kf}{\partial x_{i_1}\partial x_{i_2} \cdots \partial x_{i_k}} = D_{x_{i_1}x_{i_2} \cdots x_{i_k}}f = f_{x_{i_1}x_{i_2} \cdots x_{i_k}} \tag{6.3.8}
\]

is obtained by successively taking partial derivatives in $x_{i_k}$, $x_{i_{k-1}}$, \dots, $x_{i_2}$, $x_{i_1}$. The partial derivative may depend on the order of the variables $x_{i_k}$, $x_{i_{k-1}}$, \dots, $x_{i_2}$, $x_{i_1}$. However, by Theorem 6.3.2, if all the partial derivatives up to the $(k - 1)$-st order are continuous near $\vec{x}_0$, and the $k$-th order partial derivatives are continuous at $\vec{x}_0$, then the partial derivative is independent of the order. In this case, we can write

\[
\frac{\partial^kf}{\partial x_{i_1}\partial x_{i_2} \cdots \partial x_{i_k}} = \frac{\partial^kf}{\partial x_1^{k_1}\partial x_2^{k_2} \cdots \partial x_n^{k_n}},
\]

where $k_j$ is the number of $x_j$ in the collection $\{x_{i_1}, x_{i_2}, \dots, x_{i_k}\}$.

Many techniques for the computation of high order derivatives of single variable functions can be adapted to multivariable functions.


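Example 6.3.2. For a function $f(x, y)$ and $x = x(u, v)$, $y = y(u, v)$, we have

\[
\begin{aligned}
\frac{\partial^2f}{\partial u\partial v}
&= \frac{\partial}{\partial u}\left(\frac{\partial f}{\partial v}\right)
= \frac{\partial}{\partial u}\left(\frac{\partial f}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial v}\right) \\
&= \frac{\partial}{\partial u}\left(\frac{\partial f}{\partial x}\right)\frac{\partial x}{\partial v}
+ \frac{\partial f}{\partial x}\frac{\partial}{\partial u}\left(\frac{\partial x}{\partial v}\right)
+ \frac{\partial}{\partial u}\left(\frac{\partial f}{\partial y}\right)\frac{\partial y}{\partial v}
+ \frac{\partial f}{\partial y}\frac{\partial}{\partial u}\left(\frac{\partial y}{\partial v}\right) \\
&= \left(\frac{\partial^2f}{\partial x^2}\frac{\partial x}{\partial u} + \frac{\partial^2f}{\partial y\partial x}\frac{\partial y}{\partial u}\right)\frac{\partial x}{\partial v}
+ \frac{\partial f}{\partial x}\frac{\partial^2x}{\partial u\partial v} \\
&\quad + \left(\frac{\partial^2f}{\partial x\partial y}\frac{\partial x}{\partial u} + \frac{\partial^2f}{\partial y^2}\frac{\partial y}{\partial u}\right)\frac{\partial y}{\partial v}
+ \frac{\partial f}{\partial y}\frac{\partial^2y}{\partial u\partial v} \\
&= \frac{\partial x}{\partial u}\frac{\partial x}{\partial v}\frac{\partial^2f}{\partial x^2}
+ \frac{\partial y}{\partial u}\frac{\partial y}{\partial v}\frac{\partial^2f}{\partial y^2}
+ \left(\frac{\partial x}{\partial u}\frac{\partial y}{\partial v} + \frac{\partial x}{\partial v}\frac{\partial y}{\partial u}\right)\frac{\partial^2f}{\partial x\partial y} \\
&\quad + \frac{\partial^2x}{\partial u\partial v}\frac{\partial f}{\partial x}
+ \frac{\partial^2y}{\partial u\partial v}\frac{\partial f}{\partial y}.
\end{aligned}
\]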

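Example 6.3.3. By Example 6.2.1, the partial derivatives of the polar coordinates in terms of the cartesian coordinates are

\[ r_x = \cos\theta, \quad r_y = \sin\theta, \quad \theta_x = -r^{-1}\sin\theta, \quad \theta_y = r^{-1}\cos\theta. \]

Thus

\[
\begin{aligned}
r_{xx} &= -(\sin\theta)\theta_x = r^{-1}\sin^2\theta, \\
r_{yy} &= (\cos\theta)\theta_y = r^{-1}\cos^2\theta, \\
r_{xy} &= -(\sin\theta)\theta_y = -r^{-1}\sin\theta\cos\theta, \\
\theta_{xx} &= r^{-2}r_x\sin\theta - r^{-1}(\cos\theta)\theta_x = 2r^{-2}\sin\theta\cos\theta, \\
\theta_{yy} &= -r^{-2}r_y\cos\theta + r^{-1}(-\sin\theta)\theta_y = -2r^{-2}\sin\theta\cos\theta, \\
\theta_{xy} &= r^{-2}r_y\sin\theta - r^{-1}(\cos\theta)\theta_y = r^{-2}(\sin^2\theta - \cos^2\theta).
\end{aligned}
\]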
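These formulas can be cross-checked against finite differences. The sketch below is a numerical check only, under the assumption that the sample point lies in the open right half plane so that `math.atan2` gives the polar angle without branch issues; the helper `second_partial` is a hypothetical name.

```python
import math

def r(x, y):
    return math.hypot(x, y)

def theta(x, y):
    return math.atan2(y, x)

def second_partial(g, x, y, first, second, h=1e-4):
    # Central difference estimate of a second order partial derivative of g.
    ax, ay = (h, 0.0) if first == "x" else (0.0, h)
    bx, by = (h, 0.0) if second == "x" else (0.0, h)
    return (g(x + ax + bx, y + ay + by) - g(x + ax - bx, y + ay - by)
            - g(x - ax + bx, y - ay + by) + g(x - ax - bx, y - ay - by)) / (4 * h * h)

x, y = 1.2, 0.9
rr, s, c = r(x, y), y / r(x, y), x / r(x, y)  # r, sin(theta), cos(theta)

# Pairs of (finite difference estimate, closed formula from the example).
checks = {
    "r_xx": (second_partial(r, x, y, "x", "x"), s * s / rr),
    "r_yy": (second_partial(r, x, y, "y", "y"), c * c / rr),
    "r_xy": (second_partial(r, x, y, "x", "y"), -s * c / rr),
    "theta_xx": (second_partial(theta, x, y, "x", "x"), 2 * s * c / rr ** 2),
    "theta_yy": (second_partial(theta, x, y, "y", "y"), -2 * s * c / rr ** 2),
    "theta_xy": (second_partial(theta, x, y, "x", "y"), (s * s - c * c) / rr ** 2),
}
for name, (numeric, formula) in checks.items():
    print(name, numeric, formula)
```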

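Example 6.3.4. By Example 6.2.2, the partial derivatives of one coordinate in terms of the other coordinates on the unit sphere $x^2 + y^2 + z^2 = 1$ are (when $z \ne 0$)

\[ z_x = -\frac{x}{z}, \quad z_y = -\frac{y}{z}. \]

Thus

\[
\begin{aligned}
z_{xx} &= -\frac{z - xz_x}{z^2} = -\frac{x^2 + z^2}{z^3} = -\frac{1 - y^2}{z^3}, \\
z_{xy} &= \frac{xz_y}{z^2} = -\frac{xy}{z^3}, \\
z_{xxx} &= \frac{3(1 - y^2)z_x}{z^4} = -\frac{3(1 - y^2)x}{z^5}, \\
z_{xxy} &= \frac{2yz^3 + 3(1 - y^2)z^2z_y}{z^6} = \frac{(3y^2 + 2z^2 - 3)y}{z^5}.
\end{aligned}
\]

As a check of the sign, at the north pole $(0, 0, 1)$ the graph $z = \sqrt{1 - x^2 - y^2}$ bends downward, and indeed $z_{xx} = -1$.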

Exercise 6.3.7. Compute the third order partial derivatives.

1. $4xy^3 + 5x^3y - 3y^2$.

2. $\arctan\frac{y}{x}$.

3. exyz sin(x+2y+3z).

4. $\log(x^2 + y^2)$.

5. xyz.

6. (xy)z.


Exercise 6.3.8. Compute the partial derivatives.

1. z = uv + sin t, u = et, v = cos t, find ztt, zttt.

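2. $z = x^2\log y$, $x = \frac{u}{v}$, $y = u - v$, find $z_{uu}$, $z_{uv}$, $z_{vv}$.

3. $u = f(x + y, x - y)$, find $u_{xx}$, $u_{xy}$, $u_{xxx}$, $u_{xyy}$.

4. $u = f(r\cos\theta, r\sin\theta)$, find $u_{rr}$, $u_{\theta\theta}$, $u_{r\theta}$.

5. $u = f\left(\frac{x}{y}, \frac{y}{z}, \frac{z}{x}\right)$, find $u_{xx}$, $u_{xy}$, $u_{xyz}$.

6. $u = xyzf(x, y, z)$, find $\frac{\partial^{i+j+k}u}{\partial x^i\partial y^j\partial z^k}$.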

Exercise 6.3.9. Consider a function $f$ on $\mathbb{R}^n$ and a linear transform $L : \mathbb{R}^m \to \mathbb{R}^n$. How are the high order partial derivatives of $f$ and $f \circ L$ related?

Exercise 6.3.10. Verify the functions satisfy the partial differential equations.

1. $u = \frac{1}{2a\sqrt{\pi t}}e^{-\frac{(x-b)^2}{4a^2t}}$, heat equation $u_t = a^2u_{xx}$.

2. $u = \log((x - a)^2 + (y - b)^2)$, Laplace equation $u_{xx} + u_{yy} = 0$.

3. $u = A((x_1 - a_1)^2 + (x_2 - a_2)^2 + \cdots + (x_n - a_n)^2)^{\frac{2-n}{2}}$, Laplace equation $u_{x_1x_1} + u_{x_2x_2} + \cdots + u_{x_nx_n} = 0$.

4. u = φ(x+ at) + ψ(x− at), wave equation utt = a2uxx.

Exercise 6.3.11. Derive the partial differential equations under the new variables.

1. x = r cos θ, y = r sin θ, Laplace equation uxx + uyy = 0.

2. ξ = x+ at, η = x− at, wave equation utt = a2uxx.

3. $s = xy$, $t = \frac{x}{y}$, equation $x^2u_{xx} - y^2u_{yy} = 0$.

Exercise 6.3.12. Compute the partial derivatives.

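1. $x = s + t$, $y = s^2 + t^2$, $z = s^3 + t^3$, find $z_{xx}$, $z_{xy}$, $z_{yy}$.

2. $x = e^{s+t}$, $y = e^{s-t}$, $z = st$, find $z_{xx}$, $z_{xxx}$.

3. $x^3 + y^3 - 3axy = 0$, find $y_{xx}$ and $y_{xxx}$.

4. $x^2 + y^2 + z^2 - 4x + 6y - 2z = 0$, find $\frac{\partial^2z}{\partial(x, y)^2}$ and $\frac{\partial^2x}{\partial(y, z)^2}$.

5. $x^2 + y^2 = z^2 + w^2$, $x + y + z + w = 1$, find $\frac{\partial^2(z, w)}{\partial(x, y)^2}$, $\frac{\partial^2(x, w)}{\partial(y, z)^2}$.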

Exercise 6.3.13. Suppose $y = y(\vec{x})$ is given implicitly by $f(\vec{x}, y) = 0$. Compute the second order partial derivatives of $y$.


6.3.3 Taylor Expansion

A map $P : \mathbb{R}^n \to \mathbb{R}^m$ is a polynomial map of degree $k$ if all its coordinate functions are polynomials of degree $k$. A map $F(\vec{x})$ defined near $\vec{x}_0$ is $k$-th order differentiable at $\vec{x}_0$ if it is approximated by a polynomial map $P$ of degree $k$. In other words, for any $\epsilon > 0$, there is $\delta > 0$, such that

\[ \|\Delta\vec{x}\| < \delta \implies \|F(\vec{x}) - P(\vec{x})\| \le \epsilon\|\Delta\vec{x}\|^k. \tag{6.3.9} \]

Express the approximate polynomial as

\[
P(\vec{x}) = F(\vec{x}_0) + F'(\vec{x}_0)(\Delta\vec{x}) + \frac{1}{2}F''(\vec{x}_0)(\Delta\vec{x}) + \cdots + \frac{1}{k!}F^{(k)}(\vec{x}_0)(\Delta\vec{x}), \tag{6.3.10}
\]

where the coordinates of $F^{(i)}(\vec{x}_0)$ are $i$-th order forms. Then $F^{(i)}(\vec{x}_0)$ is called the $i$-th order derivative of $F$ at $\vec{x}_0$.

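It is easy to see that $F$ is approximated by $P$ if and only if each coordinate is approximated. A function $f(\vec{x})$ defined near $\vec{x}_0$ is $k$-th order differentiable at $\vec{x}_0$ if there is a polynomial

\[
p(\vec{x}) = \sum_{k_1 + k_2 + \cdots + k_n \le k,\ k_i \ge 0} b_{k_1k_2 \cdots k_n}\Delta x_1^{k_1}\Delta x_2^{k_2} \cdots \Delta x_n^{k_n}
\]

of degree $k$, such that for any $\epsilon > 0$, there is $\delta > 0$, such that

\[ \|\Delta\vec{x}\| < \delta \implies |f(\vec{x}) - p(\vec{x})| \le \epsilon\|\Delta\vec{x}\|^k. \]

The $k$-th order derivative

\[
f^{(k)}(\vec{x}_0)(\vec{v}) = \sum_{k_1 + k_2 + \cdots + k_n = k,\ k_i \ge 0} k!\,b_{k_1k_2 \cdots k_n}v_1^{k_1}v_2^{k_2} \cdots v_n^{k_n}
\]

is a $k$-th order form, and the $k$-th order differential is

\[
d^kf = \sum_{k_1 + k_2 + \cdots + k_n = k,\ k_i \ge 0} k!\,b_{k_1k_2 \cdots k_n}\,dx_1^{k_1}\,dx_2^{k_2} \cdots dx_n^{k_n}.
\]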

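Similar to single variable functions, a high order differentiable function may not be continuous away from the point. On the other hand, if $f$ has partial derivatives up to order $k$ at $\vec{x}_0$, then we may construct the $k$-th order Taylor expansion

\[
T_k(\vec{x}) = \sum_{1 \le i_1, i_2, \dots, i_m \le n,\ 0 \le m \le k} \frac{1}{m!}\frac{\partial^mf(\vec{x}_0)}{\partial x_{i_1}\partial x_{i_2} \cdots \partial x_{i_m}}\Delta x_{i_1}\Delta x_{i_2} \cdots \Delta x_{i_m}. \tag{6.3.11}
\]

If all the partial derivatives exist near $\vec{x}_0$ and are continuous at $\vec{x}_0$, then the partial derivatives are independent of the order of the variables, and we have

\[
T_k(\vec{x}) = \sum_{k_1 + k_2 + \cdots + k_n \le k,\ k_i \ge 0} \frac{1}{k_1!k_2! \cdots k_n!}\frac{\partial^{k_1 + k_2 + \cdots + k_n}f(\vec{x}_0)}{\partial x_1^{k_1}\partial x_2^{k_2} \cdots \partial x_n^{k_n}}\Delta x_1^{k_1}\Delta x_2^{k_2} \cdots \Delta x_n^{k_n}. \tag{6.3.12}
\]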


Theorem 6.3.3. Suppose $f(\vec{x})$ has continuous partial derivatives up to order $k - 1$ near $\vec{x}_0$ and the $(k - 1)$-st order partial derivatives are differentiable at $\vec{x}_0$. Then the $k$-th order Taylor expansion is the $k$-th order polynomial approximation at $\vec{x}_0$.

Proof. Restrict the remainder

\[ R_k(\vec{x}) = f(\vec{x}) - T_k(\vec{x}) \]

to the straight line connecting $\vec{x}_0$ to $\vec{x}$:

\[ r_k(t) = R_k((1 - t)\vec{x}_0 + t\vec{x}) = R_k(\vec{x}_0 + t\Delta\vec{x}). \]

Since $f(\vec{x})$ has continuous partial derivatives up to order $k - 1$ near $\vec{x}_0$, the function $f(\vec{x})$ and its partial derivatives up to order $k - 2$ are differentiable near $\vec{x}_0$. Then by the chain rule, $r_k$ has derivatives up to order $k - 1$ for $t \in (-1, 1)$. Moreover, we have

\[ r_k(0) = r_k'(0) = r_k''(0) = \cdots = r_k^{(k-1)}(0) = 0, \tag{6.3.13} \]

and

\[
r_k^{(k-1)}(t) = \sum_{1 \le i_1, i_2, \dots, i_{k-1} \le n} [\delta_{i_1, i_2, \dots, i_{k-1}}(t\Delta\vec{x}) - \lambda_{i_1, i_2, \dots, i_{k-1}}(t\Delta\vec{x})]\Delta x_{i_1}\Delta x_{i_2} \cdots \Delta x_{i_{k-1}},
\]

where

\[
\delta_{i_1, i_2, \dots, i_{k-1}}(\Delta\vec{x}) = \frac{\partial^{k-1}f(\vec{x}_0 + \Delta\vec{x})}{\partial x_{i_1}\partial x_{i_2} \cdots \partial x_{i_{k-1}}}
\]

is the $(k - 1)$-st order partial derivative, and

\[
\lambda_{i_1, i_2, \dots, i_{k-1}}(\Delta\vec{x}) = \frac{\partial^{k-1}f(\vec{x}_0)}{\partial x_{i_1}\partial x_{i_2} \cdots \partial x_{i_{k-1}}}
+ \sum_{1 \le i \le n} \frac{\partial^kf(\vec{x}_0)}{\partial x_i\partial x_{i_1}\partial x_{i_2} \cdots \partial x_{i_{k-1}}}\Delta x_i
\]

is the linear approximation of the partial derivative.

By (6.3.13) and Cauchy's mean value theorem, we have

\[
\begin{aligned}
R_k(\vec{x}) = r_k(1) &= \frac{r_k(1) - r_k(0)}{1^k - 0^k} = \frac{r_k'(c_1)}{kc_1^{k-1}}
= \frac{r_k'(c_1) - r_k'(0)}{k(c_1^{k-1} - 0^{k-1})} = \frac{r_k''(c_2)}{k(k - 1)c_2^{k-2}} \\
&= \cdots = \frac{r_k^{(k-1)}(c_{k-1})}{k(k - 1) \cdots 2c_{k-1}} = \frac{r_k^{(k-1)}(c_{k-1})}{k!\,c_{k-1}}
\end{aligned}
\]

for some $1 > c_1 > c_2 > \cdots > c_{k-1} > 0$. By the assumption that the $(k - 1)$-st order partial derivatives $\delta_{i_1, i_2, \dots, i_{k-1}}$ of $f$ are differentiable at $\vec{x}_0$, for any $\epsilon > 0$, there is $\delta > 0$, such that $\|\Delta\vec{x}\| < \delta$ implies

\[ |\delta_{i_1, i_2, \dots, i_{k-1}}(\Delta\vec{x}) - \lambda_{i_1, i_2, \dots, i_{k-1}}(\Delta\vec{x})| \le \epsilon\|\Delta\vec{x}\|. \]

Then for $\|\Delta\vec{x}\| < \delta$ and $|t| < 1$, we have

\[
|r_k^{(k-1)}(t)| \le \sum_{1 \le i_1, i_2, \dots, i_{k-1} \le n} \epsilon\|t\Delta\vec{x}\||\Delta x_{i_1}\Delta x_{i_2} \cdots \Delta x_{i_{k-1}}|
\le \epsilon n^{k-1}|t|\|\Delta\vec{x}\|\|\Delta\vec{x}\|_\infty^{k-1},
\]

and

\[
|R_k(\vec{x})| = \frac{|r_k^{(k-1)}(c_{k-1})|}{k!\,|c_{k-1}|} \le \frac{\epsilon n^{k-1}}{k!}\|\Delta\vec{x}\|\|\Delta\vec{x}\|_\infty^{k-1}.
\]

By the equivalence of norms, this implies that $\lim_{\Delta\vec{x} \to \vec{0}} \frac{R_k(\vec{x})}{\|\Delta\vec{x}\|^k} = 0$.

Exercise 6.3.14. Find Taylor expansions.

1. $x^y$ at $(1, 4)$, to third order.

2. $\sin(x^2 + y^2)$ at $(0, 0)$, to fourth order.

3. $x^yy^z$ at $(1, 1, 1)$, to third order.

4. $\int_0^1 (1 + x)^{t^2y}\,dt$ at $(0, 0)$, to third order.

Exercise 6.3.15. Find the second and third order derivatives of the map of taking the $k$-th power of matrices.

Exercise 6.3.16. Find the high order derivatives of the map of taking the inverse of matrices.

Exercise 6.3.17. Find the condition for a homogeneous function to be $k$-th order differentiable at $\vec{0}$. What about a multihomogeneous function?

Under a stronger differentiability condition, the remainder formula in Proposition 2.3.3 can also be extended.

Proposition 6.3.4. Suppose $f(\vec{x})$ has continuous partial derivatives up to order $k + 1$ near $\vec{x}_0$. Then for any $\vec{x}$ near $\vec{x}_0$, there is $\vec{c}$ on the straight line connecting $\vec{x}_0$ to $\vec{x}$, such that

\[
|f(\vec{x}) - T_k(\vec{x})| \le \sum_{k_1 + k_2 + \cdots + k_n = k + 1} \frac{1}{k_1!k_2! \cdots k_n!}\left|\frac{\partial^{k+1}f(\vec{c})}{\partial x_1^{k_1}\partial x_2^{k_2} \cdots \partial x_n^{k_n}}\right||\Delta x_1|^{k_1}|\Delta x_2|^{k_2} \cdots |\Delta x_n|^{k_n}. \tag{6.3.14}
\]

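Proof. Under the assumption of the proposition, we have

\[
R_k(\vec{x}) = \frac{r_k(1) - r_k(0)}{1^{k+1} - 0^{k+1}} = \cdots = \frac{r_k^{(k)}(c_k)}{(k + 1)k \cdots 2c_k} = \frac{r_k^{(k+1)}(c_{k+1})}{(k + 1)!}
\]

for some $1 > c_1 > c_2 > \cdots > c_{k+1} > 0$. Since $T_k(\vec{x})$ is a polynomial of degree $k$, we have

\[
r_k^{(k+1)}(t) = \frac{d^{k+1}f(\vec{x}_0 + t\Delta\vec{x})}{dt^{k+1}}
= \sum_{k_1 + k_2 + \cdots + k_n = k + 1} \frac{(k + 1)!}{k_1!k_2! \cdots k_n!}\frac{\partial^{k_1 + k_2 + \cdots + k_n}f(\vec{x}_0 + t\Delta\vec{x})}{\partial x_1^{k_1}\partial x_2^{k_2} \cdots \partial x_n^{k_n}}\Delta x_1^{k_1}\Delta x_2^{k_2} \cdots \Delta x_n^{k_n},
\]

and the estimation for the remainder $R_k(\vec{x})$ follows by taking $\vec{c} = \vec{x}_0 + c_{k+1}\Delta\vec{x}$.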


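Example 6.3.5. For the function $f = xy^2z^3$ in Example 6.3.1, the third order partial derivatives are

\[
\begin{gathered}
f_{xxx} = f_{xxy} = f_{xxz} = f_{yyy} = 0, \quad f_{xyy} = 2z^3, \quad f_{xyz} = 6yz^2, \\
f_{xzz} = 6y^2z, \quad f_{yyz} = 6xz^2, \quad f_{yzz} = 12xyz, \quad f_{zzz} = 6xy^2.
\end{gathered}
\]

At $(3, 2, 1)$, when $|\Delta x|, |\Delta y|, |\Delta z| < \delta$, the point $\vec{c}$ in (6.3.14) has coordinates bounded by $3 + \delta \le 3(1 + \delta)$, $2 + \delta \le 2(1 + \delta)$, $1 + \delta$. Including the weights $\frac{1}{k_1!k_2!k_3!}$, the error of the quadratic approximation is bounded by

\[
\left(\frac{1}{2} \cdot 2 + 6 \cdot 2 + \frac{1}{2} \cdot 6 \cdot 2^2 + \frac{1}{2} \cdot 6 \cdot 3 + \frac{1}{2} \cdot 12 \cdot 3 \cdot 2 + \frac{1}{6} \cdot 6 \cdot 3 \cdot 2^2\right)(1 + \delta)^3\delta^3 = 82(1 + \delta)^3\delta^3,
\]

where the six terms correspond to $f_{xyy}$, $f_{xyz}$, $f_{xzz}$, $f_{yyz}$, $f_{yzz}$, $f_{zzz}$.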

Exercise 6.3.18. Estimate the errors of the approximations in Exercise 6.3.14.

6.3.4 Maximum and Minimum

A function $f(\vec{x})$ has a local maximum at $\vec{x}_0$ if there is $\delta > 0$, such that

\[ \|\vec{x} - \vec{x}_0\| < \delta \implies f(\vec{x}) \le f(\vec{x}_0). \]

In other words, the value of $f$ at $\vec{x}_0$ is biggest among the values of $f$ at points near $\vec{x}_0$. Similarly, $f$ has a local minimum at $\vec{x}_0$ if there is $\delta > 0$, such that

\[ \|\vec{x} - \vec{x}_0\| < \delta \implies f(\vec{x}) \ge f(\vec{x}_0). \]

Suppose the function is defined near $\vec{x}_0$. Then $\vec{x}_0$ is a local extreme if and only if it is a local extreme in all directions. So a necessary condition is that if the derivative $D_{\vec{v}}f(\vec{x}_0)$ exists in the direction of $\vec{v}$, then $D_{\vec{v}}f(\vec{x}_0) = 0$. By taking $\vec{v}$ to be the coordinate directions, we get the following result.

Proposition 6.3.5. Suppose $f(\vec{x})$ is defined near $\vec{x}_0$ and has a local extreme at $\vec{x}_0$. If a partial derivative $\frac{\partial f}{\partial x_i}(\vec{x}_0)$ exists, then the partial derivative must be zero.

Thus if $f(\vec{x})$ is differentiable at a local extreme $\vec{x}_0$, then $\nabla f(\vec{x}_0) = \vec{0}$. The condition can also be expressed as $df = 0$.

Example 6.3.6. For $f(x, y) = x^2 + xy + y^2$, we have $f_x = 2x + y$ and $f_y = x + 2y$. The condition $f_x = f_y = 0$ tells us the only possible local extreme is $(0, 0)$. Since

\[ f(x, y) = \left(x + \frac{1}{2}y\right)^2 + \frac{3}{4}y^2 \ge 0 = f(0, 0), \]

$(0, 0)$ is a local minimum.

Similarly, by solving $\nabla g = \vec{0}$, the only possible local extreme for $g(x, y) = x^2 - y^2$ is again $(0, 0)$. Since $g(x, y) > 0 = g(0, 0)$ when $|x| > |y|$ and $g(x, y) < 0 = g(0, 0)$ when $|x| < |y|$, $(0, 0)$ is neither a local maximum nor a local minimum. Taking a clue from the graph of $g$, $(0, 0)$ is called a saddle point.

Example 6.3.7. The function $f(x, y) = |x| + y$ satisfies $f_y = 1 \ne 0$. Therefore there is no local extreme, despite the fact that $f_x$ may not exist.


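Example 6.3.8. The function $f(x, y) = \sqrt{|xy|}$ has the partial derivatives

\[
f_x =
\begin{cases}
\frac{1}{2}\sqrt{\left|\frac{y}{x}\right|} & \text{if } x > 0, \\
-\frac{1}{2}\sqrt{\left|\frac{y}{x}\right|} & \text{if } x < 0, \\
0 & \text{if } y = 0, \\
\text{does not exist} & \text{if } x = 0,\ y \ne 0,
\end{cases}
\qquad
f_y =
\begin{cases}
\frac{1}{2}\sqrt{\left|\frac{x}{y}\right|} & \text{if } y > 0, \\
-\frac{1}{2}\sqrt{\left|\frac{x}{y}\right|} & \text{if } y < 0, \\
0 & \text{if } x = 0, \\
\text{does not exist} & \text{if } x \ne 0,\ y = 0.
\end{cases}
\]

Therefore the possible local extrema are $(0, y_0)$ for any $y_0$ and $(x_0, 0)$ for any $x_0$. Since

\[ f(x, y) = \sqrt{|xy|} \ge 0 = f(0, y_0) = f(x_0, 0), \]

the points on the two axes are indeed local minima.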

Example 6.3.9. Consider the differentiable function $z = z(x, y)$ given implicitly by $x^2 - 2xy + 4yz + z^3 + 2y - z = 1$. To find the possible local extrema of $z(x, y)$, we take the differential and get

\[ (2x - 2y)dx + (-2x + 4z + 2)dy + (4y + 3z^2 - 1)dz = 0. \]

The possible local extrema are obtained by the condition $dz = z_xdx + z_ydy = 0$ and satisfy the implicit equation. Thus we try to solve

\[ 2x - 2y = 0, \quad -2x + 4z + 2 = 0, \quad x^2 - 2xy + 4yz + z^3 + 2y - z = 1. \]

From the first two equations, we have $x = y = 2z + 1$. Substituting into the third, we get $z^3 + 4z^2 + 3z + 1 = 1$, which has three solutions $z = 0, -1, -3$. Using $x = y = 2z + 1$, we find three possible local extrema $(1, 1, 0)$, $(-1, -1, -1)$, $(-5, -5, -3)$ for $z(x, y)$.
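The three candidates can be verified by direct substitution. The short sketch below (function names are hypothetical) evaluates the implicit equation and the coefficients of $dx$, $dy$, $dz$ at each candidate. The coefficient of $dz$ is included because the implicit function $z(x, y)$ is only guaranteed near points where it is nonzero.

```python
def constraint(x, y, z):
    # Left side minus right side of the implicit equation.
    return x * x - 2 * x * y + 4 * y * z + z ** 3 + 2 * y - z - 1

def dx_coeff(x, y, z):
    return 2 * x - 2 * y

def dy_coeff(x, y, z):
    return -2 * x + 4 * z + 2

def dz_coeff(x, y, z):
    return 4 * y + 3 * z * z - 1

# Candidates x = y = 2z + 1 for the three roots of z^3 + 4z^2 + 3z = 0.
candidates = [(2 * z + 1, 2 * z + 1, z) for z in (0, -1, -3)]
report = [(p, constraint(*p), dx_coeff(*p), dy_coeff(*p), dz_coeff(*p))
          for p in candidates]
for row in report:
    print(row)
```

Each row should show zeros for the constraint and for the $dx$, $dy$ coefficients, together with a nonzero $dz$ coefficient.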

Example 6.3.10. The continuous function f = x2 − 2xy reaches its maximum andminimum on the compact subset |x| + |y| ≤ 1. From fx = 2x − 2y = 0 andfy = −2x = 0, we find the point (0, 0) to be the only possible local extreme in theinterior |x| + |y| < 1. Then we look for the local extrema of f restricted to theboundary |x|+ |y| = 1, which may be devided into four (open) line segments andfour points.

On the segment $x + y = 1$, $0 < x < 1$, we have $f = x^2 - 2x(1 - x) = 3x^2 - 2x$. From $\frac{d}{dx}(3x^2 - 2x) = 6x - 2 = 0$ and $x + y = 1$, we find the possible local extreme $\left(\frac{1}{3}, \frac{2}{3}\right)$ for $f$ on the segment. Similarly, we find a possible local extreme $\left(-\frac{1}{3}, -\frac{2}{3}\right)$ on the segment $x + y = -1$, $-1 < x < 0$, and no possible local extremes on the segments $x - y = 1$, $0 < x < 1$ and $-x + y = 1$, $-1 < x < 0$.

We also need to consider the four points at the ends of the four segments, which are $(1, 0)$, $(-1, 0)$, $(0, 1)$, $(0, -1)$. Comparing the values of $f$ at all the possible local extrema

\[
f(0, 0) = 0, \quad f\left(\tfrac{1}{3}, \tfrac{2}{3}\right) = f\left(-\tfrac{1}{3}, -\tfrac{2}{3}\right) = -\tfrac{1}{3}, \quad
f(1, 0) = f(-1, 0) = 1, \quad f(0, 1) = f(0, -1) = 0,
\]

we find $\pm\left(\frac{1}{3}, \frac{2}{3}\right)$ are the absolute minima and $(\pm 1, 0)$ are the absolute maxima.
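As a cross-check of Example 6.3.10, one may sample the boundary $|x| + |y| = 1$ with exact rational arithmetic and compare the sampled extreme values of $f = x^2 - 2xy$ with those found above. This is only a sketch: the function `boundary_points` and the sample count 300 (chosen divisible by 3 so that the sample contains the points with coordinates $\pm\frac{1}{3}$, $\pm\frac{2}{3}$) are our assumptions.

```python
from fractions import Fraction

def f(x, y):
    return x**2 - 2*x*y

def boundary_points(n):
    """Sample the diamond |x| + |y| = 1 at 4n points, walking counterclockwise."""
    pts = []
    for k in range(n):
        t = Fraction(k, n)
        pts += [(1 - t, t), (-t, 1 - t), (t - 1, -t), (t, t - 1)]
    return pts

# Boundary samples plus the interior critical point (0, 0).
pts = boundary_points(300) + [(Fraction(0), Fraction(0))]
values = [f(x, y) for x, y in pts]
fmin, fmax = min(values), max(values)
argmin = sorted(p for p in pts if f(*p) == fmin)
argmax = sorted(p for p in pts if f(*p) == fmax)

print(fmin, argmin)
print(fmax, argmax)
```

Sampling does not prove anything by itself, but it agrees with the candidates obtained from the derivative conditions.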


Exercise 6.3.19. Find possible local extremes.

1. $x^2y^3(a - x - y)$.

2. $x + y + 4\sin x\sin y$.

3. $xyz\log(x^2 + y^2 + z^2)$.

4. $x_1x_2\cdots x_n + \dfrac{a_1}{x_1} + \dfrac{a_2}{x_2} + \cdots + \dfrac{a_n}{x_n}$.

5. $\dfrac{x_1x_2\cdots x_n}{(a + x_1)(x_1 + x_2)\cdots(x_n + b)}$.

Exercise 6.3.20. Find possible local extremes of implicitly defined function z.

1. $x^2 + y^2 + z^2 - 2x + 6z = 6$.

2. $(x^2 + y^2 + z^2)^2 = a^2(x^2 + y^2 - z^2)$.

3. $x^3 + y^3 + z^3 - axyz = 1$.

Exercise 6.3.21. Find the absolute extremes for the functions on the given domain.

1. $x + y + z$ on $\{(x, y, z) : x^2 + y^2 \le z \le 1\}$.

2. $\sin x + \sin y - \sin(x + y)$ on $\{(x, y) : x \ge 0,\ y \ge 0,\ x + y \le 2\pi\}$.

Exercise 6.3.22. What is the shortest distance between two straight lines?

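As in the single variable case, the linear approximation may be used to find potential local extremes, and the high order approximations may be used to determine whether the candidates are indeed local extremes.

If a function is continuously second order differentiable, then the second order derivative is a quadratic form, called the Hessian of the function

\[
\begin{aligned}
h_f(\vec{v}) &= \sum_{1 \le i \le n} \frac{\partial^2 f}{\partial x_i^2} v_i^2 + 2\sum_{1 \le i < j \le n} \frac{\partial^2 f}{\partial x_i \partial x_j} v_i v_j \\
&= \frac{\partial^2 f}{\partial x_1^2} v_1^2 + \frac{\partial^2 f}{\partial x_2^2} v_2^2 + \cdots + \frac{\partial^2 f}{\partial x_n^2} v_n^2 \\
&\quad + 2\frac{\partial^2 f}{\partial x_1 \partial x_2} v_1 v_2 + 2\frac{\partial^2 f}{\partial x_1 \partial x_3} v_1 v_3 + \cdots + 2\frac{\partial^2 f}{\partial x_{n-1} \partial x_n} v_{n-1} v_n.
\end{aligned}
\tag{6.3.15}
\]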

The following extends Proposition 2.3.4.

Proposition 6.3.6. Suppose $f(\vec{x})$ has second order partial derivatives near $\vec{x}_0$, such that the first order derivative is continuous near $\vec{x}_0$ and the second order derivative is continuous at $\vec{x}_0$. Suppose further that $\nabla f(\vec{x}_0) = \vec{0}$.

1. If the Hessian is positive definite, then ~x0 is a local minimum.

2. If the Hessian is negative definite, then ~x0 is a local maximum.

3. If the Hessian is indefinite, then ~x0 is not a local extreme.


Proof. The Hessian is a continuous homogeneous function of order 2. Thus it reaches its maximum and minimum on the compact subset $\{\vec{v} : \|\vec{v}\| = 1\}$. Suppose the Hessian is positive definite. Then the minimum on the subset is reached at some point, with value $c > 0$. The homogeneity and the fact that $h_f(\vec{v}) \ge c$ for $\vec{v}$ satisfying $\|\vec{v}\| = 1$ imply that $h_f(\vec{v}) \ge c\|\vec{v}\|^2$ for any $\vec{v}$.

By Theorem 6.3.3 and $\nabla f(\vec{x}_0) = \vec{0}$, there is $\delta > 0$, such that $\|\Delta\vec{x}\| < \delta$ implies

\[
\left| f(\vec{x}) - f(\vec{x}_0) - \frac{1}{2} h_f(\Delta\vec{x}) \right| \le \frac{c}{2}\|\Delta\vec{x}\|^2.
\]

By $h_f(\Delta\vec{x}) \ge c\|\Delta\vec{x}\|^2$, we get

\[
f(\vec{x}) - f(\vec{x}_0) \ge \frac{1}{2} h_f(\Delta\vec{x}) - \frac{c}{2}\|\Delta\vec{x}\|^2 \ge 0.
\]

The proof that negative definiteness implies a local maximum is similar.

The argument also shows that if $h_f(\vec{v}) > 0$ for some $\vec{v}$, then $f(\vec{x}_0 + t\vec{v}) > f(\vec{x}_0)$ for sufficiently small $t \ne 0$. Similarly, if $h_f(\vec{w}) < 0$ for some other $\vec{w}$, then $f(\vec{x}_0 + t\vec{w}) < f(\vec{x}_0)$ for sufficiently small $t \ne 0$. Thus if $h_f$ is indefinite, then $\vec{x}_0$ is not a local extreme.

Example 6.3.11. We try to find the local extrema of $f = x^3 + y^2 + z^2 + 12xy + 2z$. By solving

\[
f_x = 3x^2 + 12y = 0, \quad f_y = 2y + 12x = 0, \quad f_z = 2z + 2 = 0,
\]

we find two possible local extremes $\vec{a} = (0, 0, -1)$ and $\vec{b} = (24, -144, -1)$. The Hessians of $f$ at the two points are

\[
h_{f,\vec{a}}(u, v, w) = 2v^2 + 2w^2 + 24uv, \quad h_{f,\vec{b}}(u, v, w) = 144u^2 + 2v^2 + 2w^2 + 24uv.
\]

Since $h_{f,\vec{a}}(1, 1, 0) = 26 > 0$ and $h_{f,\vec{a}}(-1, 1, 0) = -22 < 0$, $\vec{a}$ is not a local extreme. Since $h_{f,\vec{b}} = (12u + v)^2 + v^2 + 2w^2 > 0$ for $(u, v, w) \ne \vec{0}$, $\vec{b}$ is a local minimum.
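The two conclusions of Example 6.3.11 can be checked mechanically. In the sketch below, positive definiteness at $\vec{b}$ is tested by Sylvester's criterion (all leading principal minors positive), which is a standard linear algebra fact used here in place of the completion of squares in the text. Indefiniteness at $\vec{a}$ is witnessed by the same two test vectors as in the example. All function names are ours.

```python
def grad(x, y, z):
    """Gradient of f = x^3 + y^2 + z^2 + 12xy + 2z."""
    return (3*x**2 + 12*y, 2*y + 12*x, 2*z + 2)

def hessian_matrix(x, y, z):
    """Symmetric matrix of second order partial derivatives of f."""
    return [[6*x, 12, 0],
            [12,  2,  0],
            [0,   0,  2]]

def quad(H, v):
    """Value of the quadratic form h(v) = sum of H[i][j] * v[i] * v[j]."""
    return sum(H[i][j] * v[i] * v[j] for i in range(3) for j in range(3))

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1)**j * M[0][j] * det([row[:j] + row[j+1:] for row in M[1:]])
               for j in range(len(M)))

def leading_minors(H):
    """Determinants of the upper left k x k blocks, k = 1, ..., n."""
    return [det([row[:k] for row in H[:k]]) for k in range(1, len(H) + 1)]

a, b = (0, 0, -1), (24, -144, -1)
print(grad(*a), grad(*b))                      # both gradients vanish
print(quad(hessian_matrix(*a), (1, 1, 0)),     # positive value at a
      quad(hessian_matrix(*a), (-1, 1, 0)))    # negative value at a
print(leading_minors(hessian_matrix(*b)))      # all positive at b
```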

Example 6.3.12. We study whether the three possibilities in Example 6.3.9 are indeed local extrema. By

\[
z_x = \frac{-2x + 2y}{4y + 3z^2 - 1}, \quad z_y = \frac{2x - 4z - 2}{4y + 3z^2 - 1}
\]

and the fact that $z_x = z_y = 0$ at the three points, we have

\[
z_{xx} = \frac{-2}{4y + 3z^2 - 1}, \quad
z_{xy} = \frac{2(4x + 3z^2 - 1)}{(4y + 3z^2 - 1)^2}, \quad
z_{yy} = \frac{-8(x - 2z - 1)}{(4y + 3z^2 - 1)^2},
\]

at the three points, and get the Hessians

\[
\begin{aligned}
h_{(1,1,0)}(u, v) &= -\tfrac{2}{3}u^2 + \tfrac{4}{3}uv, \\
h_{(-1,-1,-1)}(u, v) &= u^2 - 2uv, \\
h_{(-5,-5,-3)}(u, v) &= -\tfrac{1}{3}u^2 + \tfrac{2}{3}uv.
\end{aligned}
\]

By taking $(u, v) = (1, 1)$ and $(1, -1)$, we see the Hessians are indefinite and none are local extrema.
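The Hessians in Example 6.3.12 can be recomputed with exact fractions from the displayed formulas for $z_{xx}$, $z_{xy}$, $z_{yy}$, which are valid at points where $z_x = z_y = 0$. The following sketch (function names are ours) evaluates each Hessian at the two test vectors and shows the sign change.

```python
from fractions import Fraction

def hessian_coeffs(x, y, z):
    """(z_xx, z_xy, z_yy) at a point where z_x = z_y = 0."""
    d = Fraction(4*y + 3*z**2 - 1)
    z_xx = -2 / d
    z_xy = 2 * (4*x + 3*z**2 - 1) / d**2
    z_yy = -8 * (x - 2*z - 1) / d**2
    return z_xx, z_xy, z_yy

def h(point, u, v):
    """Hessian quadratic form z_xx u^2 + 2 z_xy uv + z_yy v^2 at the point."""
    z_xx, z_xy, z_yy = hessian_coeffs(*point)
    return z_xx*u*u + 2*z_xy*u*v + z_yy*v*v

for point in [(1, 1, 0), (-1, -1, -1), (-5, -5, -3)]:
    # Opposite signs at (1, 1) and (1, -1) mean the form is indefinite.
    print(point, h(point, 1, 1), h(point, 1, -1))
```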


Example 6.3.13. The only possible local extreme for the function $f(x, y) = x^3 + y^2$ is $(0, 0)$, where the Hessian is $h_f(u, v) = 2v^2$. Although the Hessian is non-negative, it is not positive definite. It is only semi-positive definite. In fact, $(0, 0)$ is not a local extreme.

The problem is that $h_f(1, 0) = 0$, which corresponds to the fact that the local extreme problem cannot be solved for $f(x, 0) = x^3$ by quadratic approximation alone. What we have is a local extreme problem for which quadratic approximation is needed in the $y$ direction and cubic approximation is needed in the $x$ direction.

Exercise 6.3.23. Try your best to determine whether the possible local extremes in Exercises 6.3.19 and 6.3.20 are indeed local maxima or local minima.

Exercise 6.3.24. Suppose a two variable function $f(x, y)$ has continuous derivatives up to second order at $(x_0, y_0)$. Suppose $f_x = f_y = 0$ at $(x_0, y_0)$.

1. Prove that if $f_{xx} > 0$ and $f_{xx}f_{yy} - f_{xy}^2 > 0$, then $(x_0, y_0)$ is a local minimum.

2. Prove that if $f_{xx} < 0$ and $f_{xx}f_{yy} - f_{xy}^2 > 0$, then $(x_0, y_0)$ is a local maximum.

3. Prove that if $f_{xx} \ne 0$ and $f_{xx}f_{yy} - f_{xy}^2 < 0$, then $(x_0, y_0)$ is not a local extreme.

Exercise 6.3.25. Show that the function

\[
f(x, y) = \begin{cases}
x^2 + y^2 + \dfrac{x^3y^3}{(x^2 + y^2)^3} & \text{if } (x, y) \ne (0, 0) \\
0 & \text{if } (x, y) = (0, 0)
\end{cases}
\]

has first and second order partial derivatives and satisfies $\nabla f(0, 0) = 0$ and $h_f(u, v) > 0$ for $(u, v) \ne (0, 0)$. However, the function does not have a local extreme at $(0, 0)$.

The counterexample is not continuous at $(0, 0)$. Can you find a counterexample that is differentiable at $(0, 0)$?

Exercise 6.3.26. Suppose a function $f(\vec{x})$ has continuous derivatives up to third order at $\vec{x}_0$, such that all the first and second order partial derivatives vanish at $\vec{x}_0$. Prove that if some third order partial derivative is nonzero, then $\vec{x}_0$ is not a local extreme.

It is possible for the Hessian to be in none of the three cases. In other words, the quadratic form may be non-negative (or non-positive) and may become zero in some direction. In this case, the quadratic approximation is not enough for concluding the local extreme. However, there are two problems in applying the higher order approximation in the multivariable case. The first problem is that we may need approximations of different orders in different directions, so we may not have a neat criterion like Proposition 2.3.4. The second problem is that even if we can use approximations of the same order in all directions, there is no general technique such as completing the squares to convert forms of higher order to some kind of standard form, from which we can see the positive or negative definiteness.

6.3.5 Constrained Extreme

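Suppose $G : \mathbb{R}^n \to \mathbb{R}^m$ is a map and $G(\vec{x}_0) = \vec{c}$. A function $f(\vec{x})$ has a local maximum at $\vec{x}_0$ under the constraint $G(\vec{x}) = \vec{c}$ if there is $\delta > 0$, such that

\[
G(\vec{x}) = \vec{c}, \ \|\vec{x} - \vec{x}_0\| < \delta \implies f(\vec{x}) \le f(\vec{x}_0).
\]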


In other words, the value of $f$ at $\vec{x}_0$ is the biggest among the values of $f$ at points near $\vec{x}_0$ that satisfy $G(\vec{x}) = \vec{c}$. Similarly, $f$ has a local minimum at $\vec{x}_0$ under the constraint $G(\vec{x}) = \vec{c}$ if there is $\delta > 0$, such that

\[
G(\vec{x}) = \vec{c}, \ \|\vec{x} - \vec{x}_0\| < \delta \implies f(\vec{x}) \ge f(\vec{x}_0).
\]

Suppose $G : \mathbb{R}^n \to \mathbb{R}^m$ is continuously differentiable and $\vec{c}$ is a regular value. Then $G(\vec{x}) = \vec{c}$ defines an $(n - m)$-dimensional hypersurface $S$ in $\mathbb{R}^n$. Suppose $f$ is a differentiable function defined on an open subset containing $S$. Then $f$ has a local maximum at $\vec{x}_0$ under the constraint $G(\vec{x}) = \vec{c}$ if and only if the restriction of $f$ to any curve in $S$ passing through $\vec{x}_0$ has a local maximum at $\vec{x}_0$. In particular, this implies that the derivative $D_{\vec{v}}f$ in the direction of any tangent vector $\vec{v} \in T_{\vec{x}_0}S$ must vanish.

The tangent vectors $\vec{v} \in T_{\vec{x}_0}S$ are characterized by the linear equation $G'(\vec{x}_0)(\vec{v}) = \vec{0}$. Suppose $G = (g_1, g_2, \ldots, g_m)$. Then

\[
G'(\vec{x}_0)(\vec{v}) = (\nabla g_1(\vec{x}_0) \cdot \vec{v},\ \nabla g_2(\vec{x}_0) \cdot \vec{v},\ \ldots,\ \nabla g_m(\vec{x}_0) \cdot \vec{v}).
\]

Thus the vanishing of $D_{\vec{v}}f = \nabla f(\vec{x}_0) \cdot \vec{v}$ for all tangent vectors $\vec{v}$ means

\[
\nabla g_1(\vec{x}_0) \cdot \vec{v} = \nabla g_2(\vec{x}_0) \cdot \vec{v} = \cdots = \nabla g_m(\vec{x}_0) \cdot \vec{v} = 0 \implies \nabla f(\vec{x}_0) \cdot \vec{v} = 0.
\]

It is a fact from linear algebra that this is equivalent to $\nabla f(\vec{x}_0)$ being a linear combination of $\nabla g_1(\vec{x}_0)$, $\nabla g_2(\vec{x}_0)$, \ldots, $\nabla g_m(\vec{x}_0)$.

Proposition 6.3.7. Suppose $G = (g_1, g_2, \ldots, g_m) : \mathbb{R}^n \to \mathbb{R}^m$ is a continuously differentiable map and $G(\vec{x}_0) = \vec{c}$ is a regular value of $G$. Suppose a function $f$ is defined near $\vec{x}_0$ and is differentiable at $\vec{x}_0$. If $f$ has a local extreme at $\vec{x}_0$ under the constraint $G(\vec{x}) = \vec{c}$, then $\nabla f(\vec{x}_0)$ is a linear combination of $\nabla g_1(\vec{x}_0)$, $\nabla g_2(\vec{x}_0)$, \ldots, $\nabla g_m(\vec{x}_0)$:

\[
\nabla f(\vec{x}_0) = \lambda_1 \nabla g_1(\vec{x}_0) + \lambda_2 \nabla g_2(\vec{x}_0) + \cdots + \lambda_m \nabla g_m(\vec{x}_0).
\]

The numbers $\lambda_1, \lambda_2, \ldots, \lambda_m$ are called the Lagrange multipliers. The condition can also be written as an equality of linear functionals

\[
f'(\vec{x}_0) = \vec{\lambda} \cdot G'(\vec{x}_0).
\]

Example 6.3.14. We try to find the possible local extrema of $f = xy^2$ on the circle $g = x^2 + y^2 = 3$. At local extrema, we have

\[
\nabla f = (y^2, 2xy) = \lambda \nabla g = \lambda(2x, 2y).
\]

Combined with the fact that the point lies on the circle, we get

\[
y^2 = 2\lambda x, \quad 2xy = 2\lambda y, \quad x^2 + y^2 = 3.
\]

If $\lambda = 0$, then the first two equations become $xy = y^2 = 0$, which is the same as $y = 0$. From the third equation, we get two possible local extrema $(\pm\sqrt{3}, 0)$.

If $\lambda \ne 0$, then from the first equation, $y = 0$ would imply $x = 0$, which contradicts the third equation. By $y \ne 0$ and the second equation, we get


$\lambda = x$. Substituting into the first and the third equations, we get $y^2 = 2x^2$ and $x^2 = 1$. This gives us four possible local extrema $(\pm 1, \pm\sqrt{2})$.

Note that since the circle is compact, the function must reach its maximum and minimum. By comparing the values of the function at the six possible local extrema, we find $f$ has absolute maximum $2$ at $(1, \pm\sqrt{2})$, and has absolute minimum $-2$ at $(-1, \pm\sqrt{2})$.
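The six candidates of Example 6.3.14 can be checked against the three equations $y^2 = 2\lambda x$, $2xy = 2\lambda y$, $x^2 + y^2 = 3$, and the absolute extreme values can be compared with a dense sample of the circle. The sketch below uses floating point arithmetic; the function `lagrange_residual`, the sample size and the tolerances are our assumptions.

```python
import math

def f(x, y):
    return x * y**2

r, s = math.sqrt(3), math.sqrt(2)

# (point, Lagrange multiplier) pairs found in the example.
candidates = [((r, 0.0), 0.0), ((-r, 0.0), 0.0),
              ((1.0, s), 1.0), ((1.0, -s), 1.0),
              ((-1.0, s), -1.0), ((-1.0, -s), -1.0)]

def lagrange_residual(x, y, lam):
    """Largest violation among the three equations of the example."""
    return max(abs(y**2 - 2*lam*x),
               abs(2*x*y - 2*lam*y),
               abs(x**2 + y**2 - 3))

residuals = [lagrange_residual(x, y, lam) for (x, y), lam in candidates]

# Sample f along the parametrized circle (sqrt(3) cos t, sqrt(3) sin t).
n = 100000
samples = [f(r*math.cos(2*math.pi*k/n), r*math.sin(2*math.pi*k/n))
           for k in range(n)]

print(max(residuals))               # close to 0
print(max(samples), min(samples))   # close to 2 and -2
```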

Example 6.3.15. The local extrema problem in Example 6.3.9 can also be considered as the local extrema of the function $f(x, y, z) = z$ under the constraint $g(x, y, z) = x^2 - 2xy + 4yz + z^3 + 2y - z = 1$. At local extrema, we have

\[
\nabla f = (0, 0, 1) = \lambda \nabla g = \lambda(2x - 2y,\ -2x + 4z + 2,\ 4y + 3z^2 - 1).
\]

This is the same as

\[
0 = \lambda(2x - 2y), \quad 0 = \lambda(-2x + 4z + 2), \quad 1 = \lambda(4y + 3z^2 - 1).
\]

The third equality tells us $\lambda \ne 0$. Thus the first two equalities become $2x - 2y = 0$ and $-2x + 4z + 2 = 0$. These are the conditions we found in Example 6.3.9.

Example 6.3.16. We try to find the possible local extrema of $f = xy + yz + zx$ on the sphere $x^2 + y^2 + z^2 = 1$. By $\nabla f = (y + z, z + x, x + y)$ and $\nabla(x^2 + y^2 + z^2) = (2x, 2y, 2z)$, we have

\[
y + z = 2\lambda x, \quad z + x = 2\lambda y, \quad x + y = 2\lambda z, \quad x^2 + y^2 + z^2 = 1
\]

at local extrema. Adding the first three equalities together, we get $(\lambda - 1)(x + y + z) = 0$.

If $\lambda = 1$, then $x = y = z$ from the first three equations. Substituting into the fourth equation, we get $3x^2 = 1$ and two possible extrema $\pm\left(\frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}\right)$.

If $\lambda \ne 1$, then $x + y + z = 0$ and the first three equations become $(2\lambda + 1)x = (2\lambda + 1)y = (2\lambda + 1)z = 0$. Since $(0, 0, 0)$ does not satisfy $x^2 + y^2 + z^2 = 1$, we must have $2\lambda + 1 = 0$, and the four equations are equivalent to $x + y + z = 0$ and $x^2 + y^2 + z^2 = 1$, which is a circle in $\mathbb{R}^3$.

Thus the possible local extrema are $\pm\left(\frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}\right)$ and the points on the circle $x + y + z = 0$, $x^2 + y^2 + z^2 = 1$.

Note that the sphere is compact and the continuous function $f$ reaches its maximum and minimum on the sphere. Since $f = 1$ at $\pm\left(\frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}\right)$ and

\[
f = \frac{1}{2}\left[(x + y + z)^2 - (x^2 + y^2 + z^2)\right] = -\frac{1}{2}
\]

along the circle $x + y + z = 0$, $x^2 + y^2 + z^2 = 1$, we find the maximum is $1$ and the minimum is $-\frac{1}{2}$.

We can also use $f_x = y + z = 0$, $f_y = z + x = 0$, $f_z = x + y = 0$ to find the possible local extreme $(0, 0, 0)$ of $f$ in the interior $x^2 + y^2 + z^2 < 1$ of the ball $x^2 + y^2 + z^2 \le 1$. By comparing $f(0, 0, 0) = 0$ with the maximum and minimum of $f$ on the sphere, we find $f$ reaches its extremes on the ball at boundary points.

Example 6.3.17. We try to find the possible local extrema of $f = xyz$ on the circle given by $g_1 = x + y + z = 0$ and $g_2 = x^2 + y^2 + z^2 = 6$. The necessary condition is

\[
\nabla f = (yz, zx, xy) = \lambda_1 \nabla g_1 + \lambda_2 \nabla g_2 = (\lambda_1 + 2\lambda_2 x,\ \lambda_1 + 2\lambda_2 y,\ \lambda_1 + 2\lambda_2 z).
\]


Canceling $\lambda_1$ from the three equations, we get

\[
(x - y)(z + 2\lambda_2) = 0, \quad (x - z)(y + 2\lambda_2) = 0.
\]

If $x \ne y$ and $x \ne z$, then $y = -2\lambda_2 = z$. Therefore at least two of $x, y, z$ are equal.

If $x = y$, then the two constraints $g_1 = 0$ and $g_2 = 6$ tell us $2x + z = 0$ and $2x^2 + z^2 = 6$. Canceling $z$, we get $6x^2 = 6$ and two possible local extrema $(1, 1, -2)$ and $(-1, -1, 2)$. By assuming $x = z$ or $y = z$, we get four more possible local extrema $(1, -2, 1)$, $(-1, 2, -1)$, $(-2, 1, 1)$, $(2, -1, -1)$.

By evaluating the function at the six possible local extrema, we find the absolute maximum $2$ is reached at three points and the absolute minimum $-2$ is reached at the other three points.
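The six points of Example 6.3.17 can be verified with exact arithmetic: each must satisfy both constraints, and the two multipliers $\lambda_1, \lambda_2$ must solve all three component equations of $\nabla f = \lambda_1 \nabla g_1 + \lambda_2 \nabla g_2$. The sketch below recovers the multipliers from two components and checks the third. The helper `multipliers` is ours.

```python
from fractions import Fraction
from itertools import permutations

def f(x, y, z):
    return x * y * z

def multipliers(p):
    """Solve grad f = lam1 * (1, 1, 1) + lam2 * (2x, 2y, 2z) at the point p."""
    x, y, z = p
    grad_f = (y*z, z*x, x*y)
    # Pick two coordinates that differ; subtracting their equations gives lam2.
    i, j = next((i, j) for i in range(3) for j in range(3) if p[i] != p[j])
    lam2 = Fraction(grad_f[i] - grad_f[j], 2 * (p[i] - p[j]))
    lam1 = grad_f[i] - 2 * lam2 * p[i]
    # All three component equations must hold, not only the two used above.
    assert all(grad_f[k] == lam1 + 2 * lam2 * p[k] for k in range(3))
    return lam1, lam2

points = sorted(set(permutations((1, 1, -2))) | set(permutations((-1, -1, 2))))
for p in points:
    assert sum(p) == 0                     # g1 = x + y + z = 0
    assert sum(c * c for c in p) == 6      # g2 = x^2 + y^2 + z^2 = 6
    print(p, f(*p), multipliers(p))
```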

Exercise 6.3.27. Find possible local extremes under the constraint.

1. $x^\alpha y^\beta z^\gamma$ under the constraint $x + y + z = a$, $x, y, z > 0$, where $\alpha, \beta, \gamma > 0$.

2. $\sin x \sin y \sin z$ under the constraint $x + y + z = \dfrac{\pi}{2}$.

3. $x_1 + x_2 + \cdots + x_n$ under the constraint $x_1x_2\cdots x_n = a$.

4. $x_1x_2\cdots x_n$ under the constraint $x_1 + x_2 + \cdots + x_n = a$.

5. $x_1x_2\cdots x_n$ under the constraint $x_1 + x_2 + \cdots + x_n = a$, $x_1^2 + x_2^2 + \cdots + x_n^2 = b$.

6. $x_1^p + x_2^p + \cdots + x_n^p$ under the constraint $a_1x_1 + a_2x_2 + \cdots + a_nx_n = b$.

Exercise 6.3.28. Prove inequalities.

1. $\sqrt[n]{x_1x_2\cdots x_n} \le \dfrac{x_1 + x_2 + \cdots + x_n}{n}$ for $x_i \ge 0$.

2. $\dfrac{x_1^p + x_2^p + \cdots + x_n^p}{n} \ge \left(\dfrac{x_1 + x_2 + \cdots + x_n}{n}\right)^p$ for $p \ge 1$, $x_i \ge 0$. What if $0 < p < 1$?

Exercise 6.3.29. Derive the Hölder inequality in Exercise 2.2.41 by considering the function $\sum b_i^q$ of $b_i$ under the constraint $\sum a_ib_i = 1$.

Exercise 6.3.30. Suppose $A$ is a symmetric matrix. Prove that if the quadratic form $q(\vec{x}) = A\vec{x} \cdot \vec{x}$ reaches maximum or minimum at $\vec{v}$ on the unit sphere $\{\vec{x} : \|\vec{x}\|_2 = 1\}$, then $\vec{v}$ is an eigenvector of $A$.

Exercise 6.3.31. Fix the base triangle and the height of a pyramid. When is the total area of the side faces smallest?

Exercise 6.3.32. The intersection of the plane $x + y - z = 0$ and the surface $x^2 + y^2 + z^2 - xy - yz - zx = 1$ is an ellipse centered at the origin. Find the lengths of the two axes of the ellipse.

After identifying the possible local extrema by using the linear approximation, we need to use the quadratic approximation to determine whether the candidates are indeed local extrema.

Assume we are in the situation as described in Proposition 6.3.7. Further assume that $f$ and $G = (g_1, g_2, \ldots, g_m)$ are continuously second order differentiable. Then we have quadratic approximations of $f$ and $g_i$ at the possible


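local extreme $\vec{x}_0$

\[
\begin{aligned}
p_f(\vec{x}) &= f(\vec{x}_0) + \nabla f(\vec{x}_0) \cdot \Delta\vec{x} + \frac{1}{2} h_f(\Delta\vec{x}), \\
p_{g_i}(\vec{x}) &= g_i(\vec{x}_0) + \nabla g_i(\vec{x}_0) \cdot \Delta\vec{x} + \frac{1}{2} h_{g_i}(\Delta\vec{x}).
\end{aligned}
\]

The original local maximum problem

\[
f(\vec{x}) < f(\vec{x}_0) \text{ for } \vec{x} \text{ near } \vec{x}_0,\ \vec{x} \ne \vec{x}_0, \text{ and } g_i(\vec{x}) = c_i,
\]

is approximated by the similar problem

\[
p_f(\vec{x}) < p_f(\vec{x}_0) = f(\vec{x}_0) \text{ for } \vec{x} \text{ near } \vec{x}_0,\ \vec{x} \ne \vec{x}_0, \text{ and } p_{g_i}(\vec{x}) = c_i.
\]

The local minimum problem can be similarly approximated, with $>$ in place of $<$.

The quadratic approximation problem is not easy to solve because of the mixture of linear and quadratic terms. To get rid of the mixture, we introduce the function

\[
\tilde{f}(\vec{x}) = f(\vec{x}) - \vec{\lambda} \cdot G(\vec{x}) = f(\vec{x}) - \lambda_1 g_1(\vec{x}) - \lambda_2 g_2(\vec{x}) - \cdots - \lambda_m g_m(\vec{x}),
\]

where $\lambda_i$ are the Lagrange multipliers obtained from the linear approximation. Since $\tilde{f}(\vec{x}) = f(\vec{x}) - \vec{\lambda} \cdot \vec{c}$ for those $\vec{x}$ satisfying the constraint $G(\vec{x}) = \vec{c}$, and $\vec{\lambda} \cdot \vec{c}$ is a constant, $f$ has a local maximum at $\vec{x}_0$ under the constraint $G(\vec{x}) = \vec{c}$ if and only if $\tilde{f}$ has a local maximum at $\vec{x}_0$ under the constraint $G(\vec{x}) = \vec{c}$. However, the latter problem is simpler because the quadratic approximation

\[
\begin{aligned}
p_{\tilde{f}} &= p_f(\vec{x}) - \lambda_1 p_{g_1}(\vec{x}) - \lambda_2 p_{g_2}(\vec{x}) - \cdots - \lambda_m p_{g_m}(\vec{x}) \\
&= \tilde{f}(\vec{x}_0) + \left(\nabla f(\vec{x}_0) - \lambda_1 \nabla g_1(\vec{x}_0) - \lambda_2 \nabla g_2(\vec{x}_0) - \cdots - \lambda_m \nabla g_m(\vec{x}_0)\right) \cdot \Delta\vec{x} \\
&\quad + \frac{1}{2}\left(h_f(\Delta\vec{x}) - \lambda_1 h_{g_1}(\Delta\vec{x}) - \lambda_2 h_{g_2}(\Delta\vec{x}) - \cdots - \lambda_m h_{g_m}(\Delta\vec{x})\right) \\
&= \tilde{f}(\vec{x}_0) + \frac{1}{2}\left(h_f(\Delta\vec{x}) - \lambda_1 h_{g_1}(\Delta\vec{x}) - \lambda_2 h_{g_2}(\Delta\vec{x}) - \cdots - \lambda_m h_{g_m}(\Delta\vec{x})\right)
\end{aligned}
\]

has no first order term.

Since the quadratic approximation of $\tilde{f}$ at $\vec{x}_0$ contains no first order term, the local maximum problem for $\tilde{f}$ at $\vec{x}_0$ is not changed if the constraint is modified by terms of second order or higher (this remains to be rigorously proved). Thus the local maximum problem becomes

\[
p_{\tilde{f}}(\vec{x}) < p_{\tilde{f}}(\vec{x}_0) = \tilde{f}(\vec{x}_0) \text{ for } \vec{x} \text{ near } \vec{x}_0,\ \Delta\vec{x} \ne \vec{0}, \text{ and } \nabla g_i(\vec{x}_0) \cdot \Delta\vec{x} = 0.
\]

Proposition 6.3.8. Suppose $G = (g_1, g_2, \ldots, g_m) : \mathbb{R}^n \to \mathbb{R}^m$ is a continuously second order differentiable map and $G(\vec{x}_0) = \vec{c}$ is a regular value of $G$. Suppose $f$ is defined near $\vec{x}_0$ and is continuously second order differentiable at $\vec{x}_0$. Suppose $f'(\vec{x}_0) = \vec{\lambda} \cdot G'(\vec{x}_0)$ for a vector $\vec{\lambda} \in \mathbb{R}^m$. Denote the quadratic form

\[
q = h_f - \vec{\lambda} \cdot h_G = h_f - \lambda_1 h_{g_1} - \lambda_2 h_{g_2} - \cdots - \lambda_m h_{g_m}.
\]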


1. If $q(\vec{v})$ is positive definite for vectors $\vec{v}$ satisfying $G'(\vec{x}_0)(\vec{v}) = \vec{0}$, then $\vec{x}_0$ is a local minimum of $f$ under the constraint $G(\vec{x}) = \vec{c}$.

2. If $q(\vec{v})$ is negative definite for vectors $\vec{v}$ satisfying $G'(\vec{x}_0)(\vec{v}) = \vec{0}$, then $\vec{x}_0$ is a local maximum of $f$ under the constraint $G(\vec{x}) = \vec{c}$.

3. If $q(\vec{v})$ is indefinite for vectors $\vec{v}$ satisfying $G'(\vec{x}_0)(\vec{v}) = \vec{0}$, then $\vec{x}_0$ is not a local extreme of $f$ under the constraint $G(\vec{x}) = \vec{c}$.

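Proof. Let $\tilde{f}(\vec{x}) = f(\vec{x}) - \vec{\lambda} \cdot G(\vec{x})$. Since $G(\vec{x}) = \vec{c}$ implies $\tilde{f}(\vec{x}) = f(\vec{x}) - \vec{\lambda} \cdot \vec{c}$, the extreme problem for $f(\vec{x})$ under the constraint $G(\vec{x}) = \vec{c}$ is the same as the extreme problem for $\tilde{f}(\vec{x})$ under the constraint $G(\vec{x}) = \vec{c}$. Note that $q$ is the Hessian of $\tilde{f}$.

Since $\vec{c}$ is a regular value, the constraint $G(\vec{x}) = \vec{c}$ means that some choice of $m$ coordinates $\vec{z}$ of $\vec{x}$ can be written as a continuously differentiable map $H(\vec{y})$ of the other $n - m$ coordinates $\vec{y}$. Thus $\vec{x} = (\vec{y}, H(\vec{y}))$ (after rearranging the coordinates if necessary) is exactly the solution of the constraint $G(\vec{x}) = \vec{c}$, and the problem is to determine whether $\vec{y}_0$ is an (unconstrained) local extreme of $\tilde{f}(\vec{y}, H(\vec{y}))$.

We have

\[
\vec{z} = H(\vec{y}) = \vec{z}_0 + H'(\vec{y}_0)(\Delta\vec{y}) + R_1(\vec{y}), \quad R_1(\vec{y}) = o(\|\Delta\vec{y}\|),
\]

and

\[
\Delta\vec{x} = (\Delta\vec{y}, \Delta\vec{z}) = (\Delta\vec{y}, H'(\vec{y}_0)(\Delta\vec{y}) + R_1(\vec{y})) = L(\Delta\vec{y}) + (\vec{0}, R_1(\vec{y})),
\]

where $H'(\vec{y}_0)$ is the linear transform such that $\vec{v} = L(\Delta\vec{y}) = (\Delta\vec{y}, H'(\vec{y}_0)(\Delta\vec{y}))$ are exactly the solutions of $G'(\vec{x}_0)(\vec{v}) = \vec{0}$. Let $b$ be the symmetric bilinear form that induces $q$. Then

\[
q(\Delta\vec{x}) = q(L(\Delta\vec{y})) + q(\vec{0}, R_1(\vec{y})) + 2b\left(L(\Delta\vec{y}), (\vec{0}, R_1(\vec{y}))\right) = q(L(\Delta\vec{y})) + R_2(\vec{y}).
\]

By

\[
|q(\vec{0}, R_1(\vec{y}))| \le c_1\|R_1(\vec{y})\|^2, \quad
\left|b\left(L(\Delta\vec{y}), (\vec{0}, R_1(\vec{y}))\right)\right| \le c_2\|L(\Delta\vec{y})\|\|R_1(\vec{y})\| \le c_3\|\Delta\vec{y}\|\|R_1(\vec{y})\|,
\]

and $R_1(\vec{y}) = o(\|\Delta\vec{y}\|)$, we get $R_2(\vec{y}) = o(\|\Delta\vec{y}\|^2)$.

Now for $\vec{x} = (\vec{y}, H(\vec{y}))$, we have

\[
\tilde{f}(\vec{x}) = \tilde{f}(\vec{x}_0) + \frac{1}{2}q(\Delta\vec{x}) + R_3(\vec{x}) = \tilde{f}(\vec{x}_0) + \frac{1}{2}q(L(\Delta\vec{y})) + \frac{1}{2}R_2(\vec{y}) + R_3(\vec{x}),
\]

where $R_3(\vec{x}) = o(\|\Delta\vec{x}\|^2)$. By $R_1(\vec{y}) = o(\|\Delta\vec{y}\|)$, we have $\|\Delta\vec{x}\| = O(\|\Delta\vec{y}\|)$ and $R_3(\vec{x}) = o(\|\Delta\vec{y}\|^2)$. Therefore

\[
\tilde{f}(\vec{y}, H(\vec{y})) = \tilde{f}(\vec{y}_0, H(\vec{y}_0)) + \frac{1}{2}q(L(\Delta\vec{y})) + R(\vec{y}), \quad R(\vec{y}) = o(\|\Delta\vec{y}\|^2).
\]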


Then we may apply the proof of Proposition 6.3.6 for the unconstrained extremes to conclude that the nature of the quadratic form $q(L(\vec{u}))$ determines the nature of the local extreme. Since vectors of the form $L(\vec{u})$ are exactly the solutions of the linear equation $G'(\vec{x}_0)(\vec{v}) = \vec{0}$, we get the three statements in the proposition.

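Example 6.3.18. In Example 6.3.14 we found six possible local extrema of $f = xy^2$ on the circle $g = x^2 + y^2 = 3$. To determine whether they are indeed local extrema, we compute the Hessians $h_f(u, v) = 4yuv + 2xv^2$ and $h_g(u, v) = 2u^2 + 2v^2$. Together with $\nabla g = (2x, 2y)$, we have the following table showing what happens at the six points.

\[
\begin{array}{ccccc}
(x_0, y_0) & \lambda & h_f - \lambda h_g & \nabla g \cdot (u, v) = 0 & h_f - \lambda h_g \text{ restricted} \\
\hline
(\sqrt{3}, 0) & 0 & 2\sqrt{3}v^2 & 2\sqrt{3}u = 0 & 2\sqrt{3}v^2 \\
(-\sqrt{3}, 0) & 0 & -2\sqrt{3}v^2 & -2\sqrt{3}u = 0 & -2\sqrt{3}v^2 \\
(1, \sqrt{2}) & 1 & -2u^2 + 4\sqrt{2}uv & 2u + 2\sqrt{2}v = 0 & -6u^2 \\
(1, -\sqrt{2}) & 1 & -2u^2 - 4\sqrt{2}uv & 2u - 2\sqrt{2}v = 0 & -6u^2 \\
(-1, \sqrt{2}) & -1 & 2u^2 + 4\sqrt{2}uv & -2u + 2\sqrt{2}v = 0 & 6u^2 \\
(-1, -\sqrt{2}) & -1 & 2u^2 - 4\sqrt{2}uv & -2u - 2\sqrt{2}v = 0 & 6u^2
\end{array}
\]

At $(\pm\sqrt{3}, 0)$ we have $\nabla g = (\pm 2\sqrt{3}, 0)$, so the tangent condition forces $u = 0$ and leaves $v$ free. Therefore $(1, \pm\sqrt{2})$ and $(-\sqrt{3}, 0)$ are local maxima, and $(-1, \pm\sqrt{2})$ and $(\sqrt{3}, 0)$ are local minima. This agrees with the direct observation that $f = xy^2$ has the same sign as $x$ away from the $x$-axis and vanishes at $(\pm\sqrt{3}, 0)$.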
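Independently of the quadratic forms, the six candidates of Example 6.3.18 can be examined numerically by parametrizing the circle as $(\sqrt{3}\cos\theta, \sqrt{3}\sin\theta)$ and comparing $f$ at the candidate with its values at nearby parameters. The sketch below is only a numerical indication, not a proof; the function names, the step size `h` and the number of steps are our assumptions.

```python
import math

R = math.sqrt(3)

def f_on_circle(theta):
    """f = x * y^2 evaluated at the circle point with angle theta."""
    x, y = R * math.cos(theta), R * math.sin(theta)
    return x * y**2

def classify(theta0, h=1e-3, steps=50):
    """Compare f at theta0 with f at nearby points of the circle."""
    base = f_on_circle(theta0)
    diffs = [f_on_circle(theta0 + k*h) - base
             for k in range(-steps, steps + 1) if k != 0]
    if all(d > 0 for d in diffs):
        return "local minimum (numerically)"
    if all(d < 0 for d in diffs):
        return "local maximum (numerically)"
    return "undetermined"

s = math.sqrt(2)
candidates = {
    "(sqrt3, 0)": 0.0,
    "(-sqrt3, 0)": math.pi,
    "(1, sqrt2)": math.atan2(s, 1),
    "(1, -sqrt2)": math.atan2(-s, 1),
    "(-1, sqrt2)": math.atan2(s, -1),
    "(-1, -sqrt2)": math.atan2(-s, -1),
}
for name, theta in candidates.items():
    print(name, classify(theta))
```

The output can be compared row by row with the table of restricted quadratic forms.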

Example 6.3.19. In Example 6.3.16, we found that $\pm\left(\frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}\right)$ are possible extrema of $f = xy + yz + zx$ on the sphere $x^2 + y^2 + z^2 = 1$. To determine whether they are indeed local extrema, we compute the Hessians $h_f(u, v, w) = 2uv + 2vw + 2wu$ and $h_{x^2+y^2+z^2}(u, v, w) = 2u^2 + 2v^2 + 2w^2$. At the two points, we have $\lambda = 1$ and

\[
h_f - \lambda h_{x^2+y^2+z^2} = -2u^2 - 2v^2 - 2w^2 + 2uv + 2vw + 2wu.
\]

By $\nabla(x^2 + y^2 + z^2) = (2x, 2y, 2z)$, we need to consider the sign of $h_f - \lambda h_{x^2+y^2+z^2}$ for those $(u, v, w)$ satisfying

\[
\pm\left(\frac{1}{\sqrt{3}}u + \frac{1}{\sqrt{3}}v + \frac{1}{\sqrt{3}}w\right) = 0.
\]

Substituting the solution $w = -u - v$, we get

\[
h_f - \lambda h_{x^2+y^2+z^2} = -6u^2 - 6v^2 - 6uv = -6\left(u + \frac{1}{2}v\right)^2 - \frac{9}{2}v^2,
\]

which is negative definite. Thus $\pm\left(\frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}\right)$ are local maxima.

Exercise 6.3.33. Determine whether the possible local extremes in Exercise 6.3.27 are indeed local maxima or local minima.


6.3.6 Exercise

Convex Set and Convex Function

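A subset $A \subset \mathbb{R}^n$ is convex if $\vec{x}, \vec{y} \in A$ implies the straight line segment connecting $\vec{x}$ and $\vec{y}$ still lies in $A$. In other words, $(1 - \lambda)\vec{x} + \lambda\vec{y} \in A$ for any $0 < \lambda < 1$.

A function $f(\vec{x})$ defined on a convex subset $A$ is convex if

\[
0 < \lambda < 1 \implies (1 - \lambda)f(\vec{x}) + \lambda f(\vec{y}) \ge f((1 - \lambda)\vec{x} + \lambda\vec{y}). \tag{6.3.16}
\]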

Exercise 6.3.34. For any $\lambda_1, \lambda_2, \ldots, \lambda_n$ satisfying $\lambda_1 + \lambda_2 + \cdots + \lambda_n = 1$ and $0 < \lambda_i < 1$, extend the Jensen inequality in Exercise 2.3.40 to multivariable convex functions

\[
f(\lambda_1\vec{x}_1 + \lambda_2\vec{x}_2 + \cdots + \lambda_n\vec{x}_n) \le \lambda_1 f(\vec{x}_1) + \lambda_2 f(\vec{x}_2) + \cdots + \lambda_n f(\vec{x}_n). \tag{6.3.17}
\]

Exercise 6.3.35. Prove that a function $f(\vec{x})$ is convex if and only if its restriction on any straight line is convex.

Exercise 6.3.36. Prove that a function $f(\vec{x})$ is convex if and only if for any linear function $L(\vec{x}) = a + b_1x_1 + b_2x_2 + \cdots + b_nx_n$, the set $\{\vec{x} : f(\vec{x}) \le L(\vec{x})\}$ is convex.

Exercise 6.3.37. Extend Proposition 2.3.5 to multivariable: A function $f(\vec{x})$ defined on an open convex subset is convex if and only if for any $\vec{z}$ in the subset, there is a linear function $K(\vec{x})$, such that $K(\vec{z}) = f(\vec{z})$ and $K(\vec{x}) \le f(\vec{x})$.

Exercise 6.3.38. Prove that any convex function on an open convex subset is continuous.

Exercise 6.3.39. Prove that a second order continuously differentiable function $f(\vec{x})$ on an open convex subset is convex if and only if the Hessian is semipositive definite: $h_f(\vec{v}) \ge 0$ for any $\vec{v}$.

Snell’s Law

Fermat’s principle says that light travels along the path of least traveling time. Because of the principle, the direction of light changes as it passes from air into water, as shown in the picture. The phenomenon is called refraction.

Figure 6.5: Snell’s Law (a light ray crossing from air, speed $u$, into water, speed $v$, with the angles $\alpha$ and $\beta$ marked)

Exercise 6.3.40. Suppose light travels at the respective speeds of $u$ and $v$ in the air and the water. Use Fermat’s principle to prove that the angle $\alpha$ of incidence and the angle $\beta$ of refraction are related by Snell’s law:

\[
\frac{u}{\sin\alpha} = \frac{v}{\sin\beta}.
\]


Laplacian

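The Laplacian of a function $f(\vec{x})$ is

\[
\Delta f = \frac{\partial^2 f}{\partial x_1^2} + \frac{\partial^2 f}{\partial x_2^2} + \cdots + \frac{\partial^2 f}{\partial x_n^2}. \tag{6.3.18}
\]

Functions satisfying the Laplace equation $\Delta f = 0$ are called harmonic functions.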
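The definition (6.3.18) can be explored numerically by replacing each second order partial derivative with a central second difference. The following sketch is an illustration under that assumption (the function `laplacian` and the step size `h` are ours); it is an approximation, so the printed values are only close to the exact Laplacians.

```python
import math

def laplacian(f, point, h=1e-3):
    """Approximate the Laplacian of f at point by central second differences."""
    total = 0.0
    for i in range(len(point)):
        plus = list(point)
        minus = list(point)
        plus[i] += h
        minus[i] -= h
        total += (f(plus) - 2*f(point) + f(minus)) / h**2
    return total

def saddle(p):
    """x^2 - y^2, a harmonic function on the plane."""
    return p[0]**2 - p[1]**2

def norm_squared(p):
    """x^2 + y^2 + z^2, whose Laplacian is the constant 6."""
    return sum(c*c for c in p)

def newton_potential(p):
    """1/r in three variables, harmonic away from the origin."""
    return 1.0 / math.sqrt(sum(c*c for c in p))

print(laplacian(saddle, [1.0, 2.0]))
print(laplacian(norm_squared, [1.0, 2.0, 3.0]))
print(laplacian(newton_potential, [1.0, 2.0, 3.0]))
```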

Exercise 6.3.41. Prove that $\Delta(f + g) = \Delta f + \Delta g$ and $\Delta(fg) = g\Delta f + f\Delta g + 2\nabla f \cdot \nabla g$.

Exercise 6.3.42. Suppose a function $f(\vec{x}) = u(r)$ depends only on the Euclidean norm $r = \|\vec{x}\|_2$ of the vector. Prove that $\Delta f = u''(r) + (n - 1)r^{-1}u'(r)$ and find the condition for the function to be harmonic.

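Exercise 6.3.43. Derive the Laplacian in $\mathbb{R}^2$

\[
\Delta f = \frac{\partial^2 f}{\partial r^2} + \frac{1}{r}\frac{\partial f}{\partial r} + \frac{1}{r^2}\frac{\partial^2 f}{\partial \theta^2} \tag{6.3.19}
\]

in the polar coordinates. Also derive the Laplacian in $\mathbb{R}^3$

\[
\Delta f = \frac{\partial^2 f}{\partial r^2} + \frac{2}{r}\frac{\partial f}{\partial r} + \frac{1}{r^2}\frac{\partial^2 f}{\partial \phi^2} + \frac{\cos\phi}{r^2\sin\phi}\frac{\partial f}{\partial \phi} + \frac{1}{r^2\sin^2\phi}\frac{\partial^2 f}{\partial \theta^2} \tag{6.3.20}
\]

in the spherical coordinates, where $\phi$ is the angle measured from the $z$-axis and $\theta$ is the angle in the $xy$-plane.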

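Exercise 6.3.44. Let $\vec{x} = F(\vec{y}) : \mathbb{R}^n \to \mathbb{R}^n$ be an orthogonal change of variable (see Exercise 6.2.16). Let $|J| = \|\vec{x}_{y_1}\|_2\|\vec{x}_{y_2}\|_2 \cdots \|\vec{x}_{y_n}\|_2$ be the absolute value of the determinant of the Jacobian matrix. Prove the Laplacian is

\[
\Delta f = \frac{1}{|J|}\left(
\frac{\partial}{\partial y_1}\left(\frac{|J|}{\|\vec{x}_{y_1}\|_2^2}\frac{\partial f}{\partial y_1}\right)
+ \frac{\partial}{\partial y_2}\left(\frac{|J|}{\|\vec{x}_{y_2}\|_2^2}\frac{\partial f}{\partial y_2}\right)
+ \cdots
+ \frac{\partial}{\partial y_n}\left(\frac{|J|}{\|\vec{x}_{y_n}\|_2^2}\frac{\partial f}{\partial y_n}\right)
\right). \tag{6.3.21}
\]

In particular, if the change of variable is orthonormal (i.e., $\|\vec{x}_{y_i}\|_2 = 1$), then the Laplacian is not changed.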

Euler Equation

Exercise 6.3.45. Suppose a function $f$ is differentiable away from $\vec{0}$. Prove that $f$ is homogeneous of degree $\alpha$ if and only if it satisfies the Euler equation

\[
x_1f_{x_1} + x_2f_{x_2} + \cdots + x_nf_{x_n} = \alpha f. \tag{6.3.22}
\]

What about a multihomogeneous function?

Exercise 6.3.46. Extend the characterization of homogeneous function (6.3.22) to high order derivatives

\[
\sum_{1 \le i_1, i_2, \ldots, i_k \le n} x_{i_1}x_{i_2}\cdots x_{i_k}\frac{\partial^k f}{\partial x_{i_1}\partial x_{i_2}\cdots\partial x_{i_k}} = \alpha(\alpha - 1)\cdots(\alpha - k + 1)f. \tag{6.3.23}
\]

Exercise 6.3.47. Prove that a change of variable $\vec{x} = H(\vec{z}) : \mathbb{R}^n \to \mathbb{R}^n$ preserves $\sum x_if_{x_i}$, in the sense that

\[
z_1(f \circ H)_{z_1} + z_2(f \circ H)_{z_2} + \cdots + z_n(f \circ H)_{z_n} = x_1f_{x_1} + x_2f_{x_2} + \cdots + x_nf_{x_n}
\]

for any differentiable function $f(\vec{x})$, if and only if it preserves the scaling

\[
H(c\vec{z}) = cH(\vec{z}) \text{ for any } c > 0.
\]


Chapter 7

Multivariable Integration


7.1 Riemann Integration

The integration of multivariable functions is closely tied to the concept of volume in two ways. First, we can only integrate functions defined on subsets with volumes. Second, the integral of an $n$-variable function is the volume of an $(n + 1)$-dimensional subset (the region enclosed by the graph of the function). The connection makes the theory of integration equivalent to the theory of volume.

The theory of volume can be developed from the obvious positivity, additivity, and translation invariance properties. The finite additivity leads to the Jordan volume theory and the Riemann integral. In the future, the countable additivity will lead to the Lebesgue measure theory and the Lebesgue integral.

7.1.1 Volume in Euclidean Space

It is easy to imagine how to extend the Riemann integration to multivariable functions on rectangles

\[
I = [a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_n, b_n] \subset \mathbb{R}^n. \tag{7.1.1}
\]

However, integration on rectangles alone is too restrictive. The subsets on which the Riemann integration can be defined should be subsets with volumes. After all, the integral of the constant function $1$ is supposed to be the volume of the underlying subset. Thus a theory of volume needs to be established.

Let $A \subset \mathbb{R}^n$ be a bounded subset. Then $A \subset [-M, M]^n$ for some big $M$. By taking a partition of $[-M, M]$ for each coordinate, we get a partition $P$ of $[-M, M]^n$ consisting of rectangles that intersect only along boundaries. The size of the partition can be measured by

\[
\|P\| = \max\{d(I) : I \in P\}, \quad d(I) = \max\{b_1 - a_1, b_2 - a_2, \ldots, b_n - a_n\}. \tag{7.1.2}
\]

For any partition $P$, the unions of rectangles

\[
A_P^+ = \cup\{I : I \in P,\ I \cap A \ne \emptyset\}, \quad A_P^- = \cup\{I : I \in P,\ I \subset A\} \tag{7.1.3}
\]

form the outer and the inner approximations of $A$, satisfying

\[
A_P^+ \supset A \supset A_P^-.
\]

Define the volume of the rectangle (7.1.1) by

\[
\mu([a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_n, b_n]) = (b_1 - a_1)(b_2 - a_2)\cdots(b_n - a_n). \tag{7.1.4}
\]

Then define

\[
\mu(A_P^+) = \sum_{I \in P,\ I \cap A \ne \emptyset} \mu(I), \quad \mu(A_P^-) = \sum_{I \in P,\ I \subset A} \mu(I). \tag{7.1.5}
\]

We have

\[
\mu(A_P^+) - \mu(A_P^-) = \sum_{I \in P,\ I \cap A \ne \emptyset,\ I - A \ne \emptyset} \mu(I), \tag{7.1.6}
\]

and the volume of $A$ is supposed to be between $\mu(A_P^+)$ and $\mu(A_P^-)$.


Definition 7.1.1. A subset $A \subset \mathbb{R}^n$ has volume (or is Jordan measurable) if for any $\epsilon > 0$, there is $\delta > 0$, such that

\[
\|P\| < \delta \implies \mu(A_P^+) - \mu(A_P^-) < \epsilon.
\]

The volume (or the Jordan measure) of $A$ is the unique number $\mu(A)$ satisfying $\mu(A_P^+) \ge \mu(A) \ge \mu(A_P^-)$ for all partitions $P$.
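To see the outer and inner approximations of Definition 7.1.1 at work, the sketch below takes $A$ to be the closed unit disk in $\mathbb{R}^2$ (our choice of example, with $M = 1$) and the partition of $[-1, 1]^2$ into squares of side $1/n$. A square meets $A$ when its point nearest to the origin lies in the disk, and is contained in $A$ when its farthest corner does. The function name `inner_outer_area` is ours.

```python
import math

def inner_outer_area(n):
    """Return (mu(A_P^-), mu(A_P^+)) for the unit disk and squares of side 1/n."""
    inner = outer = 0
    for i in range(-n, n):
        for j in range(-n, n):
            # The square is [i/n, (i+1)/n] x [j/n, (j+1)/n]; we work in units of 1/n.
            near_x = 0 if i <= 0 <= i + 1 else min(abs(i), abs(i + 1))
            near_y = 0 if j <= 0 <= j + 1 else min(abs(j), abs(j + 1))
            far_x = max(abs(i), abs(i + 1))
            far_y = max(abs(j), abs(j + 1))
            if near_x**2 + near_y**2 <= n*n:   # the square intersects A
                outer += 1
            if far_x**2 + far_y**2 <= n*n:     # the square is contained in A
                inner += 1
    return inner / n**2, outer / n**2

for n in (10, 40, 160):
    lo, hi = inner_outer_area(n)
    print(n, lo, hi, hi - lo)   # the two values squeeze the area pi of the disk
```

The difference between the two values is the total area of the squares that meet the boundary circle, as in (7.1.6), and it shrinks as the partition gets finer.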

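Example 7.1.1. Consider $A = (a, b) \subset \mathbb{R}$. For any $\epsilon > 0$, take $P$ to be $a < a + \epsilon < b - \epsilon < b$. Then $\mu(A_P^+) = \mu([a, b]) = b - a$ and $\mu(A_P^-) = \mu([a + \epsilon, b - \epsilon]) = b - a - 2\epsilon$. Thus $A$ has volume (or length) $b - a$.

Consider $B = \{n^{-1} : n \in \mathbb{N}\}$. For any $N$ and $\epsilon < \frac{1}{2N^2}$, take $P$ to be $0 < N^{-1} + \epsilon < (N-1)^{-1} - \epsilon < (N-1)^{-1} + \epsilon < \cdots < 2^{-1} - \epsilon < 2^{-1} + \epsilon < 1 - \epsilon < 1$. Then $\mu(B_P^+) = N^{-1} + 2(N-1)\epsilon < 2N^{-1}$ and $\mu(B_P^-) = 0$. Thus $B$ has volume 0.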
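The second part of the example can also be checked with a uniform partition instead of the tailored one. This is a sketch under that assumption; `outer_length` is a hypothetical helper that counts the intervals [i/k, (i+1)/k] of [0, 1] containing some point 1/n.

```python
def outer_length(k):
    """Total length of the intervals [i/k, (i+1)/k] of the uniform partition
    of [0, 1] that contain some point 1/n, n = 1, 2, 3, ..."""
    count = 1  # [0, 1/k] contains 1/n for every n >= k
    for i in range(1, k):
        # [i/k, (i+1)/k] contains 1/n  <=>  k/(i+1) <= n <= k/i for an integer n
        n_max = k // i               # floor(k / i)
        n_min = -(-k // (i + 1))     # ceil(k / (i + 1))
        if n_min <= n_max:
            count += 1
    return count / k

for k in (100, 10_000, 1_000_000):
    print(k, outer_length(k))
```

The inner approximation is empty because the set contains no interval, so the printed outer lengths are also the differences between the outer and inner approximations; they decrease towards 0 as k grows.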

Example 7.1.2. Let $A$ be the set of rational numbers in $[0, 1]$. Then any interval $[a, b]$ with $a < b$ contains points inside and outside $A$. Therefore for any partition $P$ of $[0, 1]$, we have $\mu(A_P^+) = 1$ and $\mu(A_P^-) = 0$. We conclude that $A$ has no volume.

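Example 7.1.3. Consider a non-negative bounded single variable function $f(x)$ on $[a, b]$. The subset

\[ G(f) = \{(x, y) : a \le x \le b,\ 0 \le y \le f(x)\} \subset \mathbb{R}^2 \]

is the subset under the graph of the function. For any partition $P : a = x_0 < x_1 < \cdots < x_k = b$ of $[a, b]$ and any partition $Q : 0 = y_0 < y_1 < \cdots < y_l$ of the $y$-axis, we have

\[ G(f)_{P \times Q}^+ = \cup\{[x_{i-1}, x_i] \times [0, \min y_j] : f(x) < y_j \text{ for } x_{i-1} \le x \le x_i\}, \]

\[ G(f)_{P \times Q}^- = \cup\{[x_{i-1}, x_i] \times [0, \max y_j] : f(x) \ge y_j \text{ for } x_{i-1} \le x \le x_i\}. \]

Therefore

\[ 0 \le \mu\left(G(f)_{P \times Q}^+\right) - \sum \sup_{[x_{i-1}, x_i]} f(x) \Delta x_i = \sum \left( \min y_j - \sup_{[x_{i-1}, x_i]} f(x) \right) \Delta x_i \le \|Q\|(b - a). \]

Similarly,

\[ 0 \ge \mu\left(G(f)_{P \times Q}^-\right) - \sum \inf_{[x_{i-1}, x_i]} f(x) \Delta x_i \ge -\|Q\|(b - a). \]

Subtracting the two estimates, we get

\[ \left| \mu\left(G(f)_{P \times Q}^+\right) - \mu\left(G(f)_{P \times Q}^-\right) - \sum \omega_{[x_{i-1}, x_i]}(f) \Delta x_i \right| \le 2\|Q\|(b - a). \]

This implies that $G(f)$ has volume (or area) if and only if $f$ is Riemann integrable. Moreover, the volume of $G(f)$ is $\int_a^b f(x)\,dx$.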

The following gives criteria for subsets to have volume. Recall that $\vec{x}$ is a boundary point of $A$ if and only if for any $\epsilon > 0$, there are $\vec{a} \in A$ and $\vec{b} \notin A$, such that $\|\vec{x} - \vec{a}\| < \epsilon$ and $\|\vec{x} - \vec{b}\| < \epsilon$.


Proposition 7.1.2. The following are equivalent for a bounded subset A.

1. A has volume.

2. For any ε > 0, there is a partition P , such that µ(A+P )− µ(A−P ) < ε.

3. For any ε > 0, there are rectangles I1, I2, . . . , Ik, such that

∂A ⊂ I1 ∪ I2 ∪ · · · ∪ Ik, µ(I1) + µ(I2) + · · ·+ µ(Ik) < ε. (7.1.7)

Proof. The first property clearly implies the second.

Now assume the second property holds. A boundary point $\vec{x}$ of $A$ lies either on the boundary of some $I \in P$, or inside the interior of some $I \in P$. If $\vec{x}$ is in the interior of $I$, then $I$ contains points inside as well as outside $A$. In other words, $I$ appears in the outer approximation $A_P^+$ but not in the inner approximation $A_P^-$. We conclude that

\[ \partial A \subset I_1 \cup I_2 \cup \cdots \cup I_k, \]

where each $I_i$ is either part of the boundary of some $I \in P$, or $I_i \in P$ appears in $A_P^+$ but not in $A_P^-$. Since the boundary parts of rectangles have volume 0, we have

\[ \mu(I_1) + \mu(I_2) + \cdots + \mu(I_k) \le \mu(A_P^+) - \mu(A_P^-) < \epsilon. \]

This proves the third property.

Finally we assume the third property. For $I$ in (7.1.1) and $\delta > 0$, denote

\[ I^\delta = \{\vec{y} : \|\vec{y} - \vec{x}\|_\infty \le \delta \text{ for some } \vec{x} \in I\} = \prod [a_i - \delta, b_i + \delta]. \]

Then for sufficiently small $\delta$, we still have

\[ \mu(I_1^\delta) + \mu(I_2^\delta) + \cdots + \mu(I_k^\delta) < \epsilon. \]

Suppose $P$ is a partition with $\|P\| < \delta$. For any rectangle $I \in P$ that appears in $A_P^+$ but not in $A_P^-$, we can find $\vec{a} \in I \cap A$ and $\vec{b} \in I - A$. Then there is a point $\vec{c} \in \partial A$ on the straight line connecting $\vec{a}$ and $\vec{b}$ (see Exercise 5.1.23). Since $\vec{c}$ still lies in $I$, for any $\vec{x} \in I$, we have $\|\vec{x} - \vec{c}\| \le \|P\| < \delta$. Then by $\vec{c} \in \partial A \subset I_1 \cup I_2 \cup \cdots \cup I_k$, we have $I \subset I_1^\delta \cup I_2^\delta \cup \cdots \cup I_k^\delta$. Therefore

\[ \mu(A_P^+) - \mu(A_P^-) = \sum_{I \cap A \ne \emptyset,\ I - A \ne \emptyset} \mu(I) \le \mu(I_1^\delta) + \mu(I_2^\delta) + \cdots + \mu(I_k^\delta) < \epsilon. \]

This proves the first property.

Define the outer volume and the inner volume

\[ \mu^+(A) = \inf_P \mu(A_P^+), \quad \mu^-(A) = \sup_P \mu(A_P^-). \tag{7.1.8} \]

Since $\mu(A_P^+) \ge \mu(A_Q^+) \ge \mu(A_Q^-) \ge \mu(A_P^-)$ when $Q$ refines $P$, and any two partitions have a common refinement, we can easily get $\mu^+(A) \ge \mu^-(A)$. By Proposition 7.1.2, $A$ has volume if and only if $\mu^+(A) = \mu^-(A)$.


The outer and inner volumes are clearly monotone

A ⊂ B =⇒ µ+(A) ≤ µ+(B), µ−(A) ≤ µ−(B). (7.1.9)

Moreover, the outer volume is clearly subadditive

µ+(A1 ∪ A2 ∪ · · · ∪ Ak) ≤ µ+(A1) + µ+(A2) + · · ·+ µ+(Ak). (7.1.10)

Consequently, the volume is monotone and subadditive.

Proposition 7.1.3. A subset has volume if and only if its boundary has volume 0. Moreover, if $A$ and $B$ have volumes, then $A \cup B$, $A \cap B$ and $A - B$ have volumes and satisfy the additive property

µ(A ∪B) = µ(A) + µ(B)− µ(A ∩B). (7.1.11)

The additivity (7.1.11) implies that the subadditivity property (7.1.10) becomes an equality when the subsets have volumes and are disjoint.

Proof. Suppose $A$ has volume. By the criterion (7.1.7) in Proposition 7.1.2, for any $\epsilon > 0$, we have

\[ \mu^+(\partial A) \le \mu^+(I_1 \cup I_2 \cup \cdots \cup I_k) \le \mu^+(I_1) + \mu^+(I_2) + \cdots + \mu^+(I_k) = \mu(I_1) + \mu(I_2) + \cdots + \mu(I_k) < \epsilon. \]

Therefore we get $\mu^+(\partial A) = 0$. Then by $\mu^-(\partial A) \le \mu^+(\partial A)$, we see that $\partial A$ has volume 0. Conversely, if $\partial A$ has volume 0, then we can derive the third property in Proposition 7.1.2 from the definition of $\mu^+(\partial A) = 0$. This completes the proof that $A$ has volume if and only if $\partial A$ has volume 0.

The boundaries of $A \cup B$, $A \cap B$ and $A - B$ are all contained in $\partial A \cup \partial B$. If $A$ and $B$ have volumes, then by $\mu^+(\partial A \cup \partial B) \le \mu^+(\partial A) + \mu^+(\partial B) = 0$, we see that all the boundaries have volume 0, so that the three subsets have volumes.

Suppose $A$ and $B$ are disjoint. Then for any partition $P$, we have $A_P^- \cup B_P^- \subset (A \cup B)_P^-$, and $A_P^-$, $B_P^-$ are disjoint. This leads to $\mu^-(A) + \mu^-(B) \le \mu^-(A \cup B)$. On the other hand, we always have $\mu^+(A) + \mu^+(B) \ge \mu^+(A \cup B)$ by the subadditivity. Therefore in case $A$ and $B$ have volumes, we get $\mu(A \cup B) = \mu(A) + \mu(B)$.

In general, suppose $A$ and $B$ have volumes. Then $A - B$, $B - A$, and $A \cap B$ have volumes and are disjoint. Then (7.1.11) follows from

µ(A ∪B) = µ(A−B) + µ(B − A) + µ(A ∩B),

µ(A) = µ(A−B) + µ(A ∩B),

µ(B) = µ(B − A) + µ(A ∩B).

Exercise 7.1.1. Find subsets $A$, $B$ of $\mathbb{R}$, such that $A$ and $B$ do not have volumes, but $A \cup B$, $A \cap B$ have volumes. Is it possible for $A - B$ and $B - A$ also not to have volumes?


Exercise 7.1.2. Suppose $A$ has volume and $A - \partial A \subset B \subset \bar{A}$. Prove that $B$ also has volume and $\mu(B) = \mu(A)$. In particular, the interior $A - \partial A$ and the closure $\bar{A}$ have the same volume as $A$. Conversely, construct a subset $A$ such that both $A - \partial A$ and $\bar{A}$ have volumes, but $\mu(A - \partial A) \ne \mu(\bar{A})$, so that $A$ does not have volume.

Exercise 7.1.3. Suppose A ⊂ Rm and B ⊂ Rn are bounded.

1. Prove that µ+(A×B) = µ+(A)µ+(B).

2. Prove that if A has volume 0, then A×B has volume 0.

3. Prove that if $\mu^+(A) \ne 0$ and $\mu^+(B) \ne 0$, then $A \times B$ has volume if and only if $A$ and $B$ have volumes.

4. Prove that if A and B have volumes, then µ(A×B) = µ(A)µ(B).

Exercise 7.1.4. Prove that $\mu^+(\partial A) = \mu^+(A) - \mu^-(A)$. This gives another proof that $A$ has volume if and only if $\partial A$ has volume 0.

Exercise 7.1.5. Determine whether the subsets have volumes and compute the volumes.

1. $\{m^{-1} + n^{-1} : m, n \in \mathbb{N}\}$.

2. irrational numbers in $[0, 1]$.

3. $\{(x, x^2) : 0 \le x \le 1\}$.

4. $\{(x, y) : 0 \le x \le 1,\ 0 \le y \le x^2\}$.

5. $\{(x, y) : 0 \le x \le y^2,\ 0 \le y \le 1\}$.

6. $\{(n^{-1}, y) : n \in \mathbb{N},\ 0 < y < 1 + n^{-1}\}$.

7. $\{(x, y) : |x| < 1,\ |y| < 1,\ x \text{ is rational or } y \text{ is rational}\}$.

8. $\{(x, y) : |x| < 1,\ |y| < 1,\ x \ne y\}$.

Exercise 7.1.6. Prove that any n-gon in R2 has volume.

So far the partitions are made of rectangles. To allow more flexibility, define a general partition of a subset $B$ to be a finite collection $P$ of subsets such that

1. Each $I \in P$ has volume.

2. $\mu(I \cap J) = 0$ for $I \ne J$ in $P$.

3. $B = \cup_{I \in P} I$.

The subset $B$ must have volume and $\mu(B) = \sum_{I \in P} \mu(I)$. Extend the size of the partition by

\[ \|P\| = \max\{d(I) : I \in P\}, \quad d(I) = \sup\{\|\vec{x} - \vec{y}\|_\infty : \vec{x}, \vec{y} \in I\}. \tag{7.1.12} \]

For a subset $A \subset B$, the outer and the inner approximations $A_P^+$, $A_P^-$ may also be defined by (7.1.3).


Proposition 7.1.4. Suppose $B$ has volume. The following are equivalent for $A \subset B$.

1. $A$ has volume.

2. For any $\epsilon > 0$, there is $\delta > 0$, such that $\|P\| < \delta$ for a general partition $P$ implies $\mu(A_P^+) - \mu(A_P^-) < \epsilon$.

3. For any $\epsilon > 0$, there is a general partition $P$ of $B$, such that $\mu(A_P^+) - \mu(A_P^-) < \epsilon$.

4. For any $\epsilon > 0$, there are subsets $A^+$ and $A^-$ with volumes, such that $A^+ \supset A \supset A^-$ and $\mu(A^+) - \mu(A^-) < \epsilon$.

Proof. Suppose $A$ has volume. Then we have the third property in Proposition 7.1.2. For any general partition $P$ of $B$, the last part of the proof of Proposition 7.1.2 still works, except that we no longer know $\vec{c} \in I$ (because pieces in $P$ may not be convex). However, we still have

\[ \|\vec{x} - \vec{c}\| \le \|\vec{x} - \vec{a}\| + \|\vec{a} - \vec{c}\| \le \|\vec{x} - \vec{a}\| + \|\vec{a} - \vec{b}\| \le 2\|P\|. \]

Therefore the whole argument still works when $\|P\| < \frac{\delta}{2}$, and we may conclude $\mu(A_P^+) - \mu(A_P^-) < \epsilon$.

The second property clearly implies the third, and the third implies the fourth by taking $A^+ = A_P^+$ and $A^- = A_P^-$.

Finally, assume the fourth property. Since $A^+$ and $A^-$ have volumes, there is $\delta > 0$, such that if a (rectangular) partition $P$ satisfies $\|P\| < \delta$, then

\[ \mu((A^+)_P^+) - \mu(A^+) \le \mu((A^+)_P^+) - \mu((A^+)_P^-) < \epsilon, \]

\[ \mu(A^-) - \mu((A^-)_P^-) \le \mu((A^-)_P^+) - \mu((A^-)_P^-) < \epsilon. \]

By $A^+ \supset A \supset A^-$, we have $(A^+)_P^+ \supset A_P^+ \supset A_P^- \supset (A^-)_P^-$ and

\[ \mu(A_P^+) - \mu(A_P^-) \le \mu((A^+)_P^+) - \mu((A^-)_P^-) \le \left(\mu((A^+)_P^+) - \mu(A^+)\right) + \left(\mu(A^+) - \mu(A^-)\right) + \left(\mu(A^-) - \mu((A^-)_P^-)\right) < 3\epsilon. \]

This shows that $A$ has volume.

Exercise 7.1.7. Prove that the disks in $\mathbb{R}^2$ have volumes by comparing the inscribed and circumscribed regular $n$-gons.

7.1.2 Riemann Sum

Let $f(\vec{x})$ be a function defined on a subset $A \subset \mathbb{R}^n$ with volume. For any general partition $P$ of $A$ and choices $\vec{x}_I^* \in I$, define the Riemann sum

\[ S(P, f) = \sum_{I \in P} f(\vec{x}_I^*) \mu(I). \tag{7.1.13} \]
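As an illustration of the Riemann sum, here is a minimal sketch that assumes a rectangle [a, b] x [c, d] with a uniform partition; the function name `riemann_sum` and the `sample` parameter (which positions the chosen point inside each piece) are hypothetical.

```python
def riemann_sum(f, a, b, c, d, n, sample=0.5):
    """Riemann sum S(P, f) on [a, b] x [c, d] for the uniform partition into
    n*n rectangles; `sample` in [0, 1] places the point x*_I inside each I."""
    hx, hy = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        x = a + (i + sample) * hx
        for j in range(n):
            y = c + (j + sample) * hy
            total += f(x, y) * hx * hy   # f(x*_I) * mu(I)
    return total

f = lambda x, y: x * y
for n in (4, 16, 64):
    # lower-left corners versus upper-right corners as sample points
    print(n, riemann_sum(f, 0, 1, 0, 1, n, 0.0), riemann_sum(f, 0, 1, 0, 1, n, 1.0))
```

For f(x, y) = xy on the unit square, both choices of sample points approach 1/4 as the partition gets finer, regardless of where the points sit inside the pieces.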


Definition 7.1.5. A function $f(\vec{x})$ is Riemann integrable on a subset $A$ with volume, with integral $J$, if for any $\epsilon > 0$, there is $\delta > 0$, such that

\[ \|P\| < \delta \implies |S(P, f) - J| < \epsilon. \]

We denote

\[ \int_A f(\vec{x})\,d\mu = J. \]

Sometimes we also use $dV$ ($dA$ for $n = 2$) or $dx_1 dx_2 \cdots dx_n$ in place of $d\mu$.

Riemann integrable functions might be unbounded if $\mu(A) = 0$ (or some "separated part" of $A$ has volume 0). To avoid complicated statements, we will always require Riemann integrable functions to be bounded. For a bounded function $f$, define the oscillation

\[ \omega_A(f) = \sup_{\vec{x} \in A} f(\vec{x}) - \inf_{\vec{x} \in A} f(\vec{x}) \tag{7.1.14} \]

as in the single variable case. Then we have

\[ \vec{c} \in A \implies |f(\vec{c})\mu(A) - S(P, f)| \le \omega_A(f)\mu(A). \tag{7.1.15} \]

For a function $f$ defined on $A \subset \mathbb{R}^n$, the subset

\[ G_A(f) = \{(\vec{x}, y) : \vec{x} \in A,\ y \in [0, f(\vec{x})]\} \subset \mathbb{R}^{n+1} \tag{7.1.16} \]

($y \in [f(\vec{x}), 0]$ if $f(\vec{x}) \le 0$) is the region between the graph of the function and 0.

Proposition 7.1.6. Suppose $f$ is a bounded function on a subset $A$ with volume. Then the following are equivalent.

1. $f$ is Riemann integrable on $A$.

2. For any $\epsilon > 0$, there is $\delta > 0$, such that $\|P\| < \delta$ for a general partition $P$ implies $\sum_{I \in P} \omega_I(f)\mu(I) < \epsilon$.

3. For any $\epsilon > 0$, there is a general partition $P$ of $A$, such that $\sum_{I \in P} \omega_I(f)\mu(I) < \epsilon$.

4. The subset $G_A(f)$ has volume in $\mathbb{R}^{n+1}$.
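The oscillation criterion can be watched in a concrete case. The sketch below assumes the test function f(x, y) = x + y^2 on the unit square with a uniform partition; the function name `oscillation_sum` is hypothetical. Since this f increases in both variables, its supremum and infimum on each square are attained at opposite corners, which keeps the computation exact up to rounding.

```python
def oscillation_sum(n):
    """Sum of omega_I(f) * mu(I) for f(x, y) = x + y**2 on [0, 1]^2 and the
    uniform partition into n*n squares. Because f increases in both x and y,
    its sup and inf on each square sit at opposite corners."""
    f = lambda x, y: x + y * y
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            x0, y0 = i * h, j * h
            omega = f(x0 + h, y0 + h) - f(x0, y0)   # oscillation on the square
            total += omega * h * h
    return total

for n in (10, 100):
    print(n, oscillation_sum(n))
```

A direct calculation gives the sum as 2/n for this function, so it can be made smaller than any given positive number by refining the partition.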

Proof. The proof that the first two properties are equivalent is the same as the single variable case. The second property clearly implies the third.

Let $P$ be a general partition of $A$. Let $|f| < M$ and $Q : -M = y_0 < y_1 < y_2 < \cdots < y_k = M$ be a partition of the $y$-axis. Then

\[ G_A(f)_{P \times Q}^+ = \cup_{I \in P}\, I \times [y_p, y_q], \quad G_A(f)_{P \times Q}^- = \cup_{I \in P}\, I \times [y_r, y_s], \]

where $[y_p, y_q]$ and $[y_r, y_s]$ are given below.


1. If $f \ge 0$ on $I$, then $y_p$ is the biggest $y_j < 0$, $y_r$ is the smallest $y_j > 0$, and $[y_s, y_q)$ is the smallest interval such that $y_s \le f < y_q$ on $I$. We have $-\|Q\| \le y_p \le y_r \le \|Q\|$, and

\[ |(y_q - y_p) - (y_s - y_r) - \omega_I(f)| \le |y_q - y_s - \omega_I(f)| + |y_p| + |y_r| \le 4\|Q\|. \]

Note that $y_s$ may not exist and $I \times [y_r, y_s] = \emptyset$. In this case, we will take $y_s = y_r$, for which the estimations still hold and the subsequent estimations are not affected.

2. If $f \le 0$ on $I$, then a similar argument can be carried out and the same estimation holds.

3. If $f$ can be positive and negative on $I$, then $y_p$ is the biggest $y_j$ satisfying $f > y_j$ on $I$, $y_q$ is the smallest $y_j$ satisfying $f < y_j$ on $I$, and $I \times [y_r, y_s] = \emptyset$. We will take $y_r = y_s = 0$ and have

\[ |(y_q - y_p) - (y_s - y_r) - \omega_I(f)| = |y_q - y_p - \omega_I(f)| \le 2\|Q\|. \]

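Now we have

\[ \left| \mu\left(G_A(f)_{P \times Q}^+\right) - \mu\left(G_A(f)_{P \times Q}^-\right) - \sum_{I \in P} \omega_I(f)\mu(I) \right| = \left| \sum_{I \in P} (y_q - y_p)\mu(I) - \sum_{I \in P} (y_s - y_r)\mu(I) - \sum_{I \in P} \omega_I(f)\mu(I) \right| \]

\[ \le \sum_{I \in P} |(y_q - y_p) - (y_s - y_r) - \omega_I(f)|\,\mu(I) \le 4\|Q\|\mu(A). \]

By taking $\|Q\|$ sufficiently small, and with the help of Proposition 7.1.4, we find the second, the third, and the fourth statements are equivalent.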

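Consider the three cases for $I$ again. Choose $\vec{x}_I^* \in I$. In the three cases, we have respectively

\[ |(y_q - y_p) - |f(\vec{x}_I^*)|| = |(y_q - y_p) - f(\vec{x}_I^*)| \le |y_q - f(\vec{x}_I^*)| + |y_p| \le \omega_I(f) + 2\|Q\|, \]

\[ |(y_q - y_p) - |f(\vec{x}_I^*)|| = |(y_q - y_p) + f(\vec{x}_I^*)| \le |y_q| + |y_p - f(\vec{x}_I^*)| \le \omega_I(f) + 2\|Q\|, \]

\[ |(y_q - y_p) - |f(\vec{x}_I^*)|| \le |y_q - y_p| + |f(\vec{x}_I^*)| \le 2\omega_I(f) + 2\|Q\|. \]

Therefore the Riemann sum $S(P, |f|) = \sum_{I \in P} |f(\vec{x}_I^*)|\mu(I)$ satisfies

\[ \left| \mu\left(G_A(f)_{P \times Q}^+\right) - S(P, |f|) \right| \le \sum_{I \in P} |(y_q - y_p) - |f(\vec{x}_I^*)||\,\mu(I) \le 2\sum_{I \in P} \omega_I(f)\mu(I) + 2\|Q\|\mu(A). \]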


In the situation described in Proposition 7.1.6, the right side can be arbitrarily small and we conclude that

\[ \mu(G_A(f)) = \int_A |f|\,dA. \tag{7.1.17} \]

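Example 7.1.4. Consider the characteristic function of a bounded subset $A$

\[ \chi_A(\vec{x}) = \begin{cases} 1 & \text{if } \vec{x} \in A \\ 0 & \text{if } \vec{x} \notin A \end{cases} \]

Let $A \subset (-M, M)^n$ and let $P$ be a partition of $[-M, M]^n$. Then the oscillation

\[ \omega_I(\chi_A) = \begin{cases} 1 & \text{if } I \cap A \ne \emptyset \text{ and } I - A \ne \emptyset \\ 0 & \text{if } I \subset A \text{ or } I \cap A = \emptyset \end{cases}, \]

and we get $\sum \omega_I(\chi_A)\mu(I) = \mu(A_P^+) - \mu(A_P^-)$. Therefore $\chi_A$ is Riemann integrable if and only if $A$ has volume.

Moreover, by taking $\vec{x}_I^* \in I \cap A$ for all $I$ satisfying $I \cap A \ne \emptyset$ and $I - A \ne \emptyset$, we get $S(P, \chi_A) = \mu(A_P^+)$. By taking $\vec{x}_I^* \in I - A$ instead, we get $S(P, \chi_A) = \mu(A_P^-)$. Either formula implies $\mu(A) = \int \chi_A\,d\mu$ in case $A$ has volume.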

Exercise 7.1.8. Prove that if $\mu(A) = 0$, then any bounded function is Riemann integrable on $A$, with $\int_A f\,d\mu = 0$.

Exercise 7.1.9. Define the two variable Thomae function (see Examples 1.4.2 and 3.1.5) $R(x, y) = \frac{1}{q_x} + \frac{1}{q_y}$ if $x$ and $y$ are rational numbers with irreducible denominators $q_x$ and $q_y$, and $R(x, y) = 0$ otherwise. Prove that $R(x, y)$ is Riemann integrable on any subset with volume, and the integral is 0.

Exercise 7.1.10. Suppose $f$ is a bounded function on a subset $A$ with volume. Prove that $f$ is integrable if and only if $G_A(f) = \{(\vec{x}, y) : \vec{x} \in A,\ y \in [0, f(\vec{x}))\}$ has volume. Moreover, the integrability of $f$ implies $H_A(f) = \{(\vec{x}, f(\vec{x})) : \vec{x} \in A\}$ has volume zero, but the converse is not true.

Exercise 7.1.11. Suppose $f$ is a bounded function on a subset $A$ with volume. Prove that $f$ is integrable if and only if $G_A(f) = \{(\vec{x}, y) : \vec{x} \in A,\ y \in [0, f(\vec{x}))\}$ has volume. Moreover, the integrability of $f$ implies $H_A(f) = \{(\vec{x}, f(\vec{x})) : \vec{x} \in A\}$ has volume zero, but the converse is not true.

Exercise 7.1.12. Suppose $f \le g \le h$ on a subset $A$ with volume. Suppose $f$ and $h$ are integrable, with $\int_A f\,d\mu = \int_A h\,d\mu$. Prove that $g$ is also integrable.

Exercise 7.1.13. Prove that if $f$ is integrable on $A$, and $B \subset A$ has volume, then $f$ is integrable on $B$.

For a map $F : A \subset \mathbb{R}^n \to \mathbb{R}^m$ on a subset $A$ with volume, we may also introduce the Riemann sum

\[ S(P, F) = \sum_{I \in P} \mu(I) F(\vec{x}_I^*). \]


The map is Riemann integrable on a subset $A$ with volume, with integral $\vec{J} \in \mathbb{R}^m$, if for any $\epsilon > 0$, there is $\delta > 0$, such that

\[ \|P\| < \delta \implies \|S(P, F) - \vec{J}\| < \epsilon. \]

Then we denote

\[ \int_A F(\vec{x})\,d\mu = \vec{J}. \]

It is easy to see that $F$ is integrable if and only if each coordinate is integrable.

Exercise 7.1.14. Define the oscillation of a map $F : \mathbb{R}^n \to \mathbb{R}^m$ to be

\[ \omega_A(F) = \sup_{\vec{x}, \vec{y} \in A} \|F(\vec{x}) - F(\vec{y})\|. \]

Prove that a map $F$ is Riemann integrable if and only if for any $\epsilon > 0$, there is $\delta > 0$, such that $\|P\| < \delta$ for a general partition $P$ of $A$ implies the Riemann sum of the oscillation $\sum_{I \in P} \omega_I(F)\mu(I) < \epsilon$. Moreover, the integrability is also equivalent to the existence of one partition $P$ satisfying $\sum_{I \in P} \omega_I(F)\mu(I) < \epsilon$.

Exercise 7.1.15. For a Riemann integrable map, prove

\[ \left\| \int_A F(\vec{x})\,d\mu \right\| \le \int_A \|F(\vec{x})\|\,d\mu \le \left( \sup_{\vec{x} \in A} \|F(\vec{x})\| \right) \mu(A). \]

7.1.3 Properties of Integration

The following property extends Proposition 3.1.4.

Proposition 7.1.7. Bounded and continuous maps are Riemann integrable on subsets with volumes.

Proof. Suppose $f$ is continuous on a subset $A$ with volume. For any $\epsilon > 0$, there is a partition $P$, such that $\mu(A - A_P^-) < \epsilon$. The subset $K = A_P^-$ is compact because it is a union of finitely many closed rectangles. Therefore $f$ is uniformly continuous on $K$. This means that there is $\delta > 0$, such that $\vec{x}, \vec{y} \in K$ and $\|\vec{x} - \vec{y}\| < \delta$ imply $|f(\vec{x}) - f(\vec{y})| < \epsilon$. By further refining the rectangles in $P$ so that $\|P\| < \delta$, we find that $I \in P$ and $I \subset A$ imply $\omega_I(f) < \epsilon$.

Now consider the general partition $Q = \{I \in P : I \subset A\} \cup \{A - K\}$ of $A$. If $|f| < b$ on $A$, then with respect to the partition, we have

\[ \sum_{J \in Q} \omega_J(f)\mu(J) = \omega_{A-K}(f)\mu(A - K) + \sum_{I \in P,\ I \subset A} \omega_I(f)\mu(I) \le 2b\epsilon + \epsilon \sum_{I \in P,\ I \subset A} \mu(I) \le 2b\epsilon + \mu(A)\epsilon. \]

This verifies the criterion for the integrability of $f$ on $A$.

Proposition 3.1.5 cannot be extended in general because there is no monotonicity concept for multivariable functions. With basically the same proof, Proposition 3.1.6 can be extended.


Proposition 7.1.8. Suppose $F : \mathbb{R}^n \to \mathbb{R}^m$ is a Riemann integrable map on a subset $A$ with volume. Suppose the values of $F$ lie in a compact subset $K$, and $\Phi$ is a continuous map on $K$. Then the composition $\Phi \circ F$ is Riemann integrable on $A$.

Propositions 3.1.7 and 3.1.8 can also be extended, by the same argument.

Proposition 7.1.9. Suppose $f$ and $g$ are Riemann integrable on a subset $A$ with volume.

1. The linear combination $af + bg$ and the product $fg$ are Riemann integrable on $A$, and $\int_A (af + bg)\,d\mu = a\int_A f\,d\mu + b\int_A g\,d\mu$.

2. If $f \le g$, then $\int_A f\,d\mu \le \int_A g\,d\mu$. Moreover, $\left| \int_A f\,d\mu \right| \le \int_A |f|\,d\mu$.

Denote the “upper half” and the “lower half” of Rn+1

H+ = {(~x, y) : y ≥ 0} = Rn × [0,+∞),

H− = {(~x, y) : y ≤ 0} = Rn × (−∞, 0].

For any subset $G \subset \mathbb{R}^{n+1}$ with volume, the intersections $G \cap H^+$ and $G \cap H^-$ have volumes. For the special case $G = G_A(f)$, we have

\[ G_A(f) \cap H^+ = G_A(\max\{f, 0\}), \quad G_A(f) \cap H^- = G_A(\min\{f, 0\}). \]

If $f$ is Riemann integrable on $A$, then both intersections have volumes, so that $\max\{f, 0\}$ and $\min\{f, 0\}$ are Riemann integrable on $A$. Moreover, by (7.1.17), we have

\[ \int_A \max\{f, 0\}\,d\mu = \mu(G_A(f) \cap H^+), \quad \int_A \min\{f, 0\}\,d\mu = -\mu(G_A(f) \cap H^-). \]

Adding the two equalities, we get

\[ \int_A f\,d\mu = \mu(G_A(f) \cap H^+) - \mu(G_A(f) \cap H^-). \tag{7.1.18} \]

The equality and Exercise 7.1.1 imply the following extension of Proposition 3.1.9.

Proposition 7.1.10. Suppose $f$ is Riemann integrable on subsets $A$ and $B$ with volume. Then $f$ is Riemann integrable on $A \cup B$ and $A \cap B$, with

\[ \int_{A \cup B} f\,d\mu + \int_{A \cap B} f\,d\mu = \int_A f\,d\mu + \int_B f\,d\mu. \]

The next result will be needed in the theory of improper multivariable integrals.


Proposition 7.1.11. Suppose $f$ is Riemann integrable on a subset $A$ with volume. Then

\[ \int_A \max\{f, 0\}\,d\mu = \sup_{C \subset A,\ C \text{ has volume}} \int_C f\,d\mu. \]

Intuitively, we wish $C$ to be the largest subset of $A$ on which $f \ge 0$. The natural choice of $C$ would be $A^+ = \{\vec{x} \in A : f(\vec{x}) \ge 0\}$. Unfortunately, the subset $A^+$ may not have volume. A counterexample is the negative of the Thomae function in Example 3.1.5.

Proof. For any $\epsilon > 0$, there is $\delta > 0$, such that $\|P\| < \delta$ for a general partition $P$ of $A$ implies $\sum_{I \in P} \omega_I(f)\mu(I) < \epsilon$. Then $f^+ = \max\{f, 0\}$ satisfies

\[ \left| S(P, f^+) - \int_A f^+\,d\mu \right| \le \sum_{I \in P} \omega_I(f^+)\mu(I) \le \sum_{I \in P} \omega_I(f)\mu(I) \le \epsilon. \]

Moreover,

\[ Q = \{I \in P : f(\vec{x}) \ge 0 \text{ for some } \vec{x} \in I\} \]

is a general partition of $C = \cup_{I \in Q} I$. For each $I \in P$, choose $\vec{x}_I^* \in I$. The choice can be made to satisfy $f(\vec{x}_I^*) \ge 0$ in case $I \in Q$. Then we have

\[ \left| S(P, f^+) - \int_C f\,d\mu \right| = \left| S(Q, f) - \int_C f\,d\mu \right| \le \sum_{I \in Q} \omega_I(f)\mu(I) \le \sum_{I \in P} \omega_I(f)\mu(I) \le \epsilon. \]

Combined with the earlier estimation, we get

\[ \left| \int_A f^+\,d\mu - \int_C f\,d\mu \right| \le 2\epsilon. \]

On the other hand, we always have

\[ \int_A f^+\,d\mu \ge \int_C f^+\,d\mu \ge \int_C f\,d\mu. \]

Then the equality in the proposition follows.
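The construction of C in the proof can be followed in one variable. The sketch below assumes f = cos on A = [0, π] with a uniform partition; the function name `integral_over_C` is hypothetical. Here the integral of max{f, 0} over A equals 1, and C is the union of the partition intervals on which f is non-negative somewhere.

```python
import math

def integral_over_C(n):
    """For f = cos on A = [0, pi] and the uniform partition P into n intervals,
    take C as the union of the intervals on which f >= 0 somewhere, and return
    the exact integral of f over C."""
    h = math.pi / n
    # cos is decreasing on [0, pi], so f >= 0 somewhere on [x0, x0 + h]
    # exactly when the left endpoint satisfies cos(x0) >= 0.
    cells = [i for i in range(n) if math.cos(i * h) >= 0]
    right_end = (max(cells) + 1) * h      # C = [0, right_end]
    return math.sin(right_end)            # integral of cos over [0, right_end]

for n in (7, 50, 1000):
    print(n, integral_over_C(n))
```

The printed values never exceed 1 and approach 1 as the partition gets finer, in line with the supremum in the proposition.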

Example 7.1.5. For any $\vec{a} \ne \vec{0}$, consider a codimension 1 hyperplane

\[ B = \{\vec{x} : \vec{a} \cdot \vec{x} = c\}. \]

If the last coordinate of $\vec{a}$ is nonzero, then we can solve the last coordinate from $\vec{a} \cdot \vec{x} = c$ and get

\[ B = \{(\vec{y}, z) : z = \vec{b} \cdot \vec{y}\}. \]

For any $M > 0$, $B \cap [-M, M]^{n-1} \times \mathbb{R}$ is contained in the graph of the linear function $\vec{b} \cdot \vec{y}$ for $\vec{y} \in [-M, M]^{n-1}$. Since linear functions are Riemann integrable, we conclude that $\mu(B \cap [-M, M]^{n-1} \times \mathbb{R}) = 0$. In general, we always have $\mu(B \cap [-M, M]^{n-1} \times \mathbb{R}) = 0$ up to a permutation of coordinates.


A convex polytope $A$ is the intersection of finitely many half spaces $\vec{a} \cdot \vec{x} \ge c$, $\vec{a} \ne \vec{0}$. The boundary of the convex polytope consists of those points such that some inequalities become equalities $\vec{a} \cdot \vec{x} = c$. Therefore the boundary is a finite union of subsets of the form $B$ above. In particular, if $A$ is bounded, then the boundary has volume zero. Therefore bounded convex polytopes have volume.

A bounded polyhedron is a finite union of bounded polytopes and therefore also has volume.

Exercise 7.1.16. Prove Proposition 7.1.8.

Exercise 7.1.17. Suppose $f$ is Riemann integrable on $[a, b]$. Suppose $A \subset \mathbb{R}^2$ is a subset with area, such that $(x, y) \in A$ implies $a \le x + y \le b$.

1. Prove that there is $b$, such that $\mu\{(x, y) \in A : s \le x + y \le t\} < b(t - s)$ for any $s < t$.

2. Prove that $f(x + y)$ is Riemann integrable on $A$.

Moreover, compute the area of $\{(x, y) : |x| \le M,\ |y| \le M,\ s \le xy \le t\}$ and study the integrability of $f(xy)$.

Exercise 7.1.18. Prove Proposition 7.1.9. Explain that the Riemann integral is not changed if the function is arbitrarily modified on any subset of volume 0.

Exercise 7.1.19. Prove Proposition 7.1.10 by directly using the Riemann sum.

Exercise 7.1.20. Suppose a function $f(x, y)$ defined on a rectangle $[a, b] \times [c, d]$ is increasing in $x$ for fixed $y$ and increasing in $y$ for fixed $x$. Prove that $f$ is Riemann integrable. What if $f$ is increasing in one variable and continuous in another?

Exercise 7.1.21. Suppose $f(\vec{x})$ is integrable on $A$. Prove that

\[ \mu(A) \inf_A f \le \int_A f\,d\mu \le \mu(A) \sup_A f, \tag{7.1.19} \]

\[ \left| S(P, f) - \int_A f\,d\mu \right| \le \sum_{I \in P} \omega_I(f)\mu(I). \tag{7.1.20} \]

Exercise 7.1.22. Suppose $A$ is a path connected and compact subset with volume. Suppose $f$ is a continuous function on $A$. Prove that there is $\vec{a} \in A$, such that $\int_A f\,d\mu = f(\vec{a})\mu(A)$. Extend the conclusion to $\int_A fg\,d\mu$ for a non-negative integrable function $g$ on $A$.

Exercise 7.1.23. Suppose $A$ is an open subset with volume. Suppose $f$ is a continuous function on $A$. Prove the following are equivalent.

1. $f = 0$.

2. $\int_A |f|\,d\mu = 0$.

3. $\int_B f\,d\mu = 0$ for any subset $B \subset A$ with volume.

4. $\int_I f\,d\mu = 0$ for any rectangle $I \subset A$.

5. $\int_A fg\,d\mu = 0$ for any continuous function $g$ on $A$.


Does the conclusion still hold if A is not open?

Exercise 7.1.24. Extend Proposition 4.2.6 to multivariable Riemann integration. Suppose $f_n(\vec{x})$ is integrable on a subset $A$ with volume and the sequence $\{f_n(\vec{x})\}$ uniformly converges on $A$. Then $\lim_{n \to \infty} f_n(\vec{x})$ is integrable on $A$, with

\[ \int_A \lim_{n \to \infty} f_n(\vec{x})\,d\mu = \lim_{n \to \infty} \int_A f_n(\vec{x})\,d\mu. \]

7.1.4 Fubini Theorem

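Theorem 7.1.12 (Fubini Theorem). Suppose $A \subset \mathbb{R}^m$ and $B \subset \mathbb{R}^n$ have volumes. Suppose $f(\vec{x}, \vec{y})$ is Riemann integrable on $A \times B$. Suppose for each fixed $\vec{y}$, $f(\vec{x}, \vec{y})$ is Riemann integrable on $A$. Then $\int_A f(\vec{x}, \vec{y})\,d\mu_{\vec{x}}$ is Riemann integrable on $B$, and

\[ \int_{A \times B} f(\vec{x}, \vec{y})\,d\mu_{\vec{x}, \vec{y}} = \int_B \left( \int_A f(\vec{x}, \vec{y})\,d\mu_{\vec{x}} \right) d\mu_{\vec{y}}. \tag{7.1.21} \]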

Proof. Let $P$ and $Q$ be partitions of $A$ and $B$. Let $\vec{x}_I^* \in I \in P$ and $\vec{y}_J^* \in J \in Q$. Let $g(\vec{y}) = \int_A f(\vec{x}, \vec{y})\,d\mu_{\vec{x}}$. Then we have the Riemann sums

\[ S(P \times Q, f) = \sum_{I \in P,\ J \in Q} f(\vec{x}_I^*, \vec{y}_J^*)\mu_{\vec{x}}(I)\mu_{\vec{y}}(J) = \sum_{J \in Q} S(P, f(\vec{x}, \vec{y}_J^*))\mu_{\vec{y}}(J), \]

\[ S(Q, g) = \sum_{J \in Q} g(\vec{y}_J^*)\mu_{\vec{y}}(J) = \sum_{J \in Q} \left( \int_A f(\vec{x}, \vec{y}_J^*)\,d\mu_{\vec{x}} \right) \mu_{\vec{y}}(J). \]

For any $\epsilon > 0$, by the integrability of $f$ on $A \times B$, there is $\delta > 0$, such that $\|P\| < \delta$, $\|Q\| < \delta$ implies

\[ \left| S(P \times Q, f) - \int_{A \times B} f(\vec{x}, \vec{y})\,d\mu_{\vec{x}, \vec{y}} \right| < \epsilon. \]

Fix one partition $Q$ satisfying $\|Q\| < \delta$ and fix one choice of $\vec{y}_J^*$. Then there is $\delta \ge \delta' > 0$, such that $\|P\| < \delta'$ implies

\[ \left| S(P, f(\vec{x}, \vec{y}_J^*)) - \int_A f(\vec{x}, \vec{y}_J^*)\,d\mu_{\vec{x}} \right| < \epsilon \]

for all (finitely many) $J \in Q$. For any such $P$, we have

\[ |S(P \times Q, f) - S(Q, g)| \le \sum_{J \in Q} \left| S(P, f(\vec{x}, \vec{y}_J^*)) - \int_A f(\vec{x}, \vec{y}_J^*)\,d\mu_{\vec{x}} \right| \mu_{\vec{y}}(J) \le \sum_{J \in Q} \epsilon\mu_{\vec{y}}(J) = \epsilon\mu_{\vec{y}}(B), \]

and we get

\[ \left| S(Q, g) - \int_{A \times B} f(\vec{x}, \vec{y})\,d\mu_{\vec{x}, \vec{y}} \right| \le |S(P \times Q, f) - S(Q, g)| + \left| S(P \times Q, f) - \int_{A \times B} f(\vec{x}, \vec{y})\,d\mu_{\vec{x}, \vec{y}} \right| < \epsilon\mu_{\vec{y}}(B) + \epsilon. \]

In particular, the third property in Proposition 7.1.6 is satisfied. Therefore $g$ is integrable on $B$, with $\int_B g\,d\mu_{\vec{y}} = \int_{A \times B} f(\vec{x}, \vec{y})\,d\mu_{\vec{x}, \vec{y}}$.
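The two Riemann sums compared in the proof can be computed side by side. The following is a sketch assuming the test function f(x, y) = x^2 y on [0, 1] x [0, 2], midpoint sample points and uniform partitions; the names `double_sum` and `iterated_sum` are hypothetical. The inner integral g(y) = y/3 is entered in closed form.

```python
def double_sum(f, n):
    """S(P x Q, f) on [0, 1] x [0, 2] with midpoints of uniform partitions."""
    hx, hy = 1.0 / n, 2.0 / n
    return sum(f((i + 0.5) * hx, (j + 0.5) * hy) * hx * hy
               for i in range(n) for j in range(n))

def iterated_sum(g, n):
    """S(Q, g) on [0, 2] with midpoints, where g(y) is the inner integral."""
    hy = 2.0 / n
    return sum(g((j + 0.5) * hy) * hy for j in range(n))

f = lambda x, y: x * x * y
g = lambda y: y / 3.0          # exact integral of x^2 * y over x in [0, 1]

for n in (10, 100):
    print(n, double_sum(f, n), iterated_sum(g, n))
```

Both columns approach 2/3, the common value of the double integral and the repeated integral for this function.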

[Figure 7.1: integration between two graphs. Labels in the figure: the subset A on the ~x-axis, the region B, the levels a and b on the y-axis, and the graphs y = k(~x) and y = h(~x).]

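Example 7.1.6. Suppose $h$ and $k$ are integrable functions on a subset $A$ with volume. Suppose $h(\vec{x}) \ge k(\vec{x})$. Then the region

\[ B = \{(\vec{x}, y) : \vec{x} \in A,\ h(\vec{x}) \ge y \ge k(\vec{x})\} \]

between the graphs of $h$ and $k$ has volume. For an integrable function $f(\vec{x}, y)$ on $B$, we may extend the function to $A \times [a, b]$ ($b > h(\vec{x}) \ge k(\vec{x}) > a$ for all $\vec{x} \in A$) by taking value 0 outside $B$. If for each fixed $\vec{x}$, $f(\vec{x}, y)$ is integrable for $y \in [k(\vec{x}), h(\vec{x})]$, then we have

\[ \int_B f(\vec{x}, y)\,d\mu_{\vec{x}, y} = \int_{A \times [a, b]} f(\vec{x}, y)\,d\mu_{\vec{x}, y} = \int_A \left( \int_a^b f(\vec{x}, y)\,dy \right) d\mu_{\vec{x}} = \int_A \left( \int_{k(\vec{x})}^{h(\vec{x})} f(\vec{x}, y)\,dy \right) d\mu_{\vec{x}}. \]

For a specific example, the integral of $(x + y)^2$ on the region bounded by $x = 0$, $y = 1$ and $x = y$ is

\[ \int_{0 \le x \le 1,\ x \le y \le 1} (x + y)^2\,dxdy = \int_0^1 \left( \int_x^1 (x + y)^2\,dy \right) dx = \int_0^1 \frac{(x + 1)^3 - (2x)^3}{3}\,dx = \frac{7}{12}. \]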
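The specific integral can be cross-checked numerically. This is a rough sketch that assumes midpoint Riemann sums for both the outer and the inner integral; the function name `triangle_integral` is hypothetical. Evaluating the last single integral by antiderivatives gives (1/3)[(x + 1)^4/4 - 2x^4] from 0 to 1, which is 7/12.

```python
def triangle_integral(n):
    """Midpoint approximation of the repeated integral of (x + y)^2 over
    0 <= x <= 1, x <= y <= 1: outer sum in x, inner sum in y over [x, 1]."""
    hx = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        hy = (1.0 - x) / n           # the inner interval [x, 1] depends on x
        inner = sum((x + (x + (j + 0.5) * hy)) ** 2 * hy for j in range(n))
        total += inner * hx
    return total

print(triangle_integral(200), 7 / 12)
```

The inner interval changes with x, which is exactly the situation of integrating between the two graphs y = x and y = 1.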


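Example 7.1.7. Suppose $A \subset \mathbb{R}^m \times \mathbb{R}^n$ is a subset with volume. Suppose for each $\vec{x} \in \mathbb{R}^m$, the section

\[ A_{\vec{x}} = \{\vec{y} \in \mathbb{R}^n : (\vec{x}, \vec{y}) \in A\} \]

also has volume. Then by applying Fubini theorem to the characteristic function (see Example 7.1.4), we find the volume of $A$ to be

\[ \mu_{m+n}(A) = \int_A \chi_A(\vec{x}, \vec{y})\,d\mu_{\vec{x}, \vec{y}} = \int_{\mathbb{R}^m} \left( \int_{\mathbb{R}^n} \chi_A(\vec{x}, \vec{y})\,d\mu_{\vec{y}} \right) d\mu_{\vec{x}} = \int_{\mathbb{R}^m} \mu_{\vec{y}}(A_{\vec{x}})\,d\mu_{\vec{x}}. \]

The integrals over the Euclidean spaces are actually integrals over some rectangles enclosing the subsets.

For a specific example, the area of the unit disk in $\mathbb{R}^2$ is

\[ \int_{-1}^1 \mu\left[-\sqrt{1 - x^2}, \sqrt{1 - x^2}\right] dx = \int_{-1}^1 2\sqrt{1 - x^2}\,dx = \int_0^\pi 2\sqrt{1 - \cos^2 t}\,\sin t\,dt = \pi. \]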
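The computation by sections is easy to reproduce numerically. The sketch below assumes a midpoint Riemann sum on a uniform partition of [-1, 1]; the function name `disk_area_by_sections` is hypothetical.

```python
import math

def disk_area_by_sections(n):
    """Integrate the section length 2*sqrt(1 - x^2) of the unit disk over
    [-1, 1] with a midpoint Riemann sum on n equal intervals."""
    h = 2.0 / n
    return sum(2.0 * math.sqrt(1.0 - (-1.0 + (i + 0.5) * h) ** 2) * h
               for i in range(n))

for n in (100, 10_000):
    print(n, disk_area_by_sections(n), math.pi)
```

Each term is the length of the vertical section of the disk at a sample point times the width of the interval, so the sum approximates the integral of the section lengths.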

Example 7.1.8. The two variable Thomae function in Exercise 7.1.9 is Riemann integrable. However, for each rational $y$, we have $R(x, y) \ge \frac{1}{q_y} > 0$ for rational $x$ and $R(x, y) = 0$ for irrational $x$. By a reason similar to the non-integrability of the Dirichlet function, $R(x, y)$ is not integrable in $x$. Thus the repeated integral $\int \left( \int R(x, y)\,dx \right) dy$ does not exist. The other repeated integral also does not exist for the same reason.

Note that since Riemann integrability is not changed if the function is modified on a subset of volume 0, Fubini theorem essentially holds even if those $\vec{y}$ for which $f(\vec{x}, \vec{y})$ is not integrable in $\vec{x}$ form a set of volume zero. Unfortunately, this is not the case for the two variable Thomae function, because the set of rational numbers does not have volume zero.

Example 7.1.9. If $f(\vec{x}, \vec{y})$ is integrable on $A \times B$, $f(\vec{x}, \vec{y})$ is integrable on $A$ for each fixed $\vec{y}$, and $f(\vec{x}, \vec{y})$ is integrable on $B$ for each fixed $\vec{x}$, then by Fubini theorem, the two repeated integrals are the same
\[
\int_A\left(\int_B f(\vec{x}, \vec{y})\,d\mu_{\vec{y}}\right)d\mu_{\vec{x}}
= \int_B\left(\int_A f(\vec{x}, \vec{y})\,d\mu_{\vec{x}}\right)d\mu_{\vec{y}}.
\]
However, if two repeated integrals exist and are equal, it does not necessarily follow that the function is integrable.

Consider
\[
S = \left\{\left(\frac{k}{p}, \frac{l}{p}\right) : 0 < k, l < p,\ p \text{ is a prime number}\right\}.
\]

The section $S_x = \{y : (x, y) \in S\}$ is empty for any irrational $x$ and is finite for any rational $x$. As a result, the section has volume (length) 0 for any $x$. The same holds for the sections $S_y$. On the other hand, for any 2 dimensional partition $P$ of $[0, 1]^2$ by rectangles, we have $S_P^+ = [0, 1]^2$ (because $S$ is dense in $[0, 1]^2$) and $S_P^- = \emptyset$ (because $S$ contains no rectangles). Therefore $\mu(S_P^+) = 1$, $\mu(S_P^-) = 0$, and the set $S$ has no volume.

By Example 7.1.4, a subset has volume if and only if its characteristic function is Riemann integrable. Therefore in terms of the characteristic function $\chi_S$, the two repeated integrals $\int\left(\int \chi_S(x, y)\,dy\right)dx$ and $\int\left(\int \chi_S(x, y)\,dx\right)dy$ exist and are equal to 0, but the double integral $\int \chi_S(x, y)\,dxdy$ does not exist.


Exercise 7.1.25. Compute the integrals.

1. $\int_{[0,\pi]\times[0,\pi]} |\cos(x+y)|\,dxdy$.

2. $\int_{[0,1]\times[0,1]} y^2\cos(\pi xy)\,dxdy$.

3. $\int_{[a_1,b_1]\times[a_2,b_2]\times\cdots\times[a_n,b_n]} (x_1 + x_2 + \cdots + x_n)^p\,dx_1dx_2\cdots dx_n$.

4. $\int_A (x+y)\,dxdy$, $A$ is the region bounded by $y = x^2$, $y = 4x^2$, $y = 1$ and with $x \ge 0$.

5. $\int_A \frac{\sin y}{y}\,dxdy$, $A$ is the region bounded by $x = 0$, $y = x$ and $y = \pi$.

6. $\int_A xy\,dxdy$, $A$ is the region bounded by $y = \sin x$ and $y = \cos x$ and between the intersection points $x = \frac{\pi}{4}$ and $x = \frac{5\pi}{4}$.

7. $\int_A \frac{dxdydz}{x^2+y^2}$, $A$ is the region bounded by $x = 1$, $x = 2$, $z = 0$, $y = x$, $z = y$.

8. $\int_0^1\left(\int_{-1}^1 \sqrt{|y - x^2|}\,dx\right)dy$.

9. $\int_0^1\left(\int_x^1 \sin\pi(1 - y^2)\,dy\right)dx$.

10. $\int_A (1 - x_1 - x_2 - \cdots - x_n)\,dx_1dx_2\cdots dx_n$, $A$ is the region bounded by $x_1 = 0$, $x_2 = 0$, ..., $x_n = 0$, $x_1 + x_2 + \cdots + x_n = 1$.

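Exercise 7.1.26. Prove that
\[
\int_{a\le x_1\le\cdots\le x_n\le b} f(x_1)\,dx_1dx_2\cdots dx_n = \frac{1}{(n-1)!}\int_a^b (b-t)^{n-1}f(t)\,dt,
\]
\[
\int_{a\le x_1\le\cdots\le x_n\le b} f(x_1)f(x_2)\cdots f(x_n)\,dx_1dx_2\cdots dx_n = \frac{1}{n!}\left(\int_a^b f(t)\,dt\right)^n.
\]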

Exercise 7.1.27. Compute the volumes.

1. The region bounded by $x = y$ and $x = y^2$.

2. The area between the graphs of $1 + 2x + 3x^3 + 4x^4$ and $2x + 3x^3 + 5x^4$, for $-1 \le x \le 1$.

3. The region bounded by the elliptic paraboloid $z = x^2 + y^2$ and the plane $z = 1$.

4. The region bounded by the elliptic paraboloid $z = 2x^2 + y^2$ and the cylinder $z = 2 - y^2$.

5. The region bounded by x1 = 0, x2 = 0, . . . , xn = 0, x1 + x2 + · · ·+ xn = 1.


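Exercise 7.1.28. Suppose $f(x)$ is continuous and the improper integral $\int_0^{+\infty} f(x)\,dx$ converges. For $0 < a < b$, show that the improper integral
\[
\int_0^{+\infty}\left(\int_a^b f(tx)\,dt\right)dx
\]
also converges and compute the improper integral.

Exercise 7.1.29. Study the existence and equalities for the double integrals and repeated integrals.

1. $f(x, y) = \begin{cases} 1 & \text{if } x \text{ is rational} \\ 2y & \text{if } x \text{ is irrational} \end{cases}$ on $[0, 1]\times[0, 1]$.

2. $f(x, y) = \begin{cases} 1 & \text{if } x \text{ is rational} \\ 2y & \text{if } x \text{ is irrational} \end{cases}$ on $[0, 1]\times[0, 2]$.

3. $f(x, y) = \begin{cases} 1 & \text{if } x, y \text{ are rational} \\ 0 & \text{otherwise} \end{cases}$.

4. $f(x, y) = \begin{cases} R(x) & \text{if } x, y \text{ are rational} \\ 0 & \text{otherwise} \end{cases}$, where $R(x)$ is the Thomae function in Example 1.4.2.

5. $f(x, y) = \begin{cases} 1 & \text{if } x = \frac{1}{n},\ n \in \mathbb{N} \text{ and } y \text{ is rational} \\ 0 & \text{otherwise} \end{cases}$.

6. $f(x, y) = \begin{cases} 1 & \text{if } x = \frac{1}{n},\ n \in \mathbb{N} \text{ and } y \text{ is rational} \\ 1 & \text{if } x \text{ is rational and } y = \frac{1}{n},\ n \in \mathbb{N} \\ 0 & \text{otherwise} \end{cases}$.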

Exercise 7.1.30. Suppose $f(x)$ and $g(x)$ are increasing functions on $[0, 1]$. By considering the double integral of $(f(x) - f(y))(g(x) - g(y))$, prove that
\[
\int_0^1 f(x)g(x)\,dx \ge \int_0^1 f(x)\,dx \int_0^1 g(x)\,dx.
\]
More generally, for $p(x) \ge 0$, prove the Chebychev inequality
\[
\int_a^b p(x)\,dx \int_a^b p(x)f(x)g(x)\,dx \ge \int_a^b p(x)f(x)\,dx \int_a^b p(x)g(x)\,dx.
\]

Exercise 7.1.31. Suppose $f(x)$ is a continuous function on $[a, b]$. By considering the double integral of $(f(x) - f(y))^2$, prove that
\[
\left(\int_a^b f(x)\,dx\right)^2 \le (b-a)\int_a^b f(x)^2\,dx.
\]

Exercise 7.1.32. Suppose $A$ and $B$ are subsets with volumes. Suppose $f(\vec{x})$ and $g(\vec{y})$ are bounded functions on $A$ and $B$.

1. Prove that if $f(\vec{x})$ and $g(\vec{y})$ are integrable, then $f(\vec{x})g(\vec{y})$ is integrable, and
\[
\int_{A\times B} f(\vec{x})g(\vec{y})\,d\mu_{\vec{x},\vec{y}} = \int_A f(\vec{x})\,d\mu_{\vec{x}} \int_B g(\vec{y})\,d\mu_{\vec{y}}.
\]

2. Prove that if $f(\vec{x})$ is integrable and $\int_A f\,d\mu \ne 0$, then $f(\vec{x})g(\vec{y})$ is integrable if and only if $g(\vec{y})$ is integrable.


7.1.5 Volume in Vector Space

A collection $\Sigma$ of subsets is a ring if it satisfies
\[
A, B \in \Sigma \implies A \cup B,\ A \cap B,\ A - B \in \Sigma.
\]
A volume theory is a function $\nu$ defined on a ring $\Sigma$, such that the following properties are satisfied.

1. Non-negativity: $\nu(A) \ge 0$.

2. Additivity: If $A_1, A_2, \dots, A_k$ are disjoint, then $\nu(A_1 \cup A_2 \cup \cdots \cup A_k) = \nu(A_1) + \nu(A_2) + \cdots + \nu(A_k)$.

The properties imply the monotonicity
\[
A \subset B \text{ and } A, B \in \Sigma \implies \nu(A) \le \nu(A) + \nu(B - A) = \nu(B). \tag{7.1.22}
\]
We also say a subset $A$ has $\nu$-volume (or is $\nu$-measurable) if $A \in \Sigma$.

The notation $\mu$ will be reserved for the specific volume theory developed in Section 7.1.1. By Propositions 7.1.9 and 7.1.10, if $f$ is a non-negative function on $\mathbb{R}^n$ that is Riemann integrable on any subset with $\mu$-volume, then $\nu(A) = \int_A f\,d\mu$ is a volume theory on the collection of subsets with $\mu$-volumes.
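To make the two axioms concrete, here is a toy Python sketch of a volume theory on a finite set: the ring is the collection of all subsets of a three-point set, and $\nu$ is a weighted counting function. The weights and all names (`WEIGHTS`, `nu`, `is_ring`, and so on) are hypothetical and only serve to check the ring property, additivity, and the monotonicity (7.1.22) by brute force.

```python
from itertools import combinations

# Hypothetical non-negative weights on a three-point set.
WEIGHTS = {"a": 0.5, "b": 2.0, "c": 1.25}


def nu(subset):
    """Weighted counting function: the sum of the weights of the points."""
    return sum(WEIGHTS[x] for x in subset)


def all_subsets(points):
    pts = list(points)
    return [frozenset(c) for k in range(len(pts) + 1)
            for c in combinations(pts, k)]


SIGMA = all_subsets(WEIGHTS)  # the ring: every subset of {a, b, c}


def is_ring(sigma):
    """Closed under union, intersection and difference."""
    s = set(sigma)
    return all((A | B) in s and (A & B) in s and (A - B) in s
               for A in s for B in s)


def is_additive(sigma):
    """nu(A u B) = nu(A) + nu(B) for disjoint A and B."""
    return all(abs(nu(A | B) - (nu(A) + nu(B))) < 1e-12
               for A in sigma for B in sigma if not (A & B))


def is_monotone(sigma):
    """A contained in B implies nu(A) <= nu(B)."""
    return all(nu(A) <= nu(B) for A in sigma for B in sigma if A <= B)


if __name__ == "__main__":
    print(is_ring(SIGMA), is_additive(SIGMA), is_monotone(SIGMA))
```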

Exercise 7.1.33. Given a volume theory $\nu$ on a ring $\Sigma$, define the outer and inner volumes of any $A \subset X$ by
\[
\nu^+(A) = \inf_{K\in\Sigma,\ K\supset A} \nu(K), \qquad \nu^-(A) = \sup_{L\in\Sigma,\ L\subset A} \nu(L).
\]
We say $A$ has $\nu$-volume if $\nu^+(A) = \nu^-(A)$, and denote $\nu(A) = \nu^+(A) = \nu^-(A)$. Prove that the collection $\Sigma$ of subsets with $\nu$-volume is still a ring. Moreover, prove that $\nu$ is indeed a volume theory on $\Sigma$.

Exercise 7.1.34. Prove that finite disjoint unions of all types of rectangles (i.e., products of all types of intervals, including single points) form a ring $\Sigma$ of subsets of $\mathbb{R}^n$. Moreover, define the volume $\mu(I)$ of rectangles $I$ in a usual way and define $\mu(A) = \sum \mu(I_j)$ for $A = \cup I_j \in \Sigma$, $I_j$ disjoint.

1. Prove that $\mu$ is well-defined. In other words, $\mu(A)$ is independent of the way $A$ is divided into disjoint rectangles.

2. Prove that $\mu$ is a volume theory on $\Sigma$.

The volume theory $\mu$ produced by the process in Exercise 7.1.33 is then the Jordan measure theory.

A volume theory $\nu$ on a vector space $V$ is translation invariant if $A \in \Sigma$ implies $A + \vec{v} \in \Sigma$, and
\[
\nu(A + \vec{v}) = \nu(A) \text{ for any } \vec{v} \in V. \tag{7.1.23}
\]
The following result shows that translation invariant volume theories are essentially unique.


Proposition 7.1.13. Suppose $\nu$ is a translation invariant volume theory on $\mathbb{R}^n$, such that all subsets with $\mu$-volumes also have $\nu$-volumes. Then $\nu(A) = c\mu(A)$ for some constant $c$.

Proof. For any $x$, $N$ and distinct $x_1, x_2, \dots, x_N \in [a_1, b_1]$, we have
\[
\begin{aligned}
N\nu(x \times [a_2, b_2] \times \cdots \times [a_n, b_n])
&= \sum_{i=1}^N \nu(x_i \times [a_2, b_2] \times \cdots \times [a_n, b_n]) && \text{(translation invariant)} \\
&= \nu(\{x_1, x_2, \dots, x_N\} \times [a_2, b_2] \times \cdots \times [a_n, b_n]) && \text{(additive)} \\
&\le \nu([a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_n, b_n]). && \text{(monotone)}
\end{aligned}
\]
Since $N$ is arbitrary, we must have
\[
\nu(x \times [a_2, b_2] \times [a_3, b_3] \times \cdots \times [a_n, b_n]) = 0.
\]
Similar equality holds for other "reduced rectangles", including ones with open or partly open intervals. In particular, the additivity holds if the intersections of $A_i$ are finite unions of reduced rectangles.

Let $N$ be a natural number and $\varepsilon = \frac{1}{N}$. The cube $[0, 1]^n$ is the union of translations of $N^n$ copies of the small cube $[0, \varepsilon]^n$, such that the intersections among the translations are reduced rectangles. Therefore
\[
\nu([0, \varepsilon]^n) = \varepsilon^n N^n \nu([0, \varepsilon]^n) = \varepsilon^n \nu([0, 1]^n) = c\varepsilon^n, \qquad c = \nu([0, 1]^n).
\]
Now for any rectangle $I = [a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_n, b_n]$, we find non-negative integers $k_i$ satisfying $k_i\varepsilon \le b_i - a_i \le (k_i + 1)\varepsilon$. Then
\[
I^+ = \prod [a_i, a_i + (k_i + 1)\varepsilon] \supset I \supset I^- = \prod [a_i, a_i + k_i\varepsilon].
\]
By filling $I^+$ and $I^-$ with translations of $[0, \varepsilon]^n$, we get
\[
\nu(I^+) = \left(\prod (k_i + 1)\right)c\varepsilon^n \ge \nu(I) \ge \nu(I^-) = \left(\prod k_i\right)c\varepsilon^n.
\]
By $\lim_{N\to\infty} k_i\varepsilon = \lim_{N\to\infty} (k_i + 1)\varepsilon = b_i - a_i$, we conclude that
\[
\nu(I) = c\prod (b_i - a_i) = c\mu(I).
\]
Let $A$ be any subset with $\mu$-volume. For a partition $P$, we have the approximations $A_P^+$ and $A_P^-$ of $A$. By the additivity and the fact that $\nu = c\mu$ on rectangles, we get $\nu(A_P^+) = c\mu(A_P^+) \ge \nu(A) \ge \nu(A_P^-) = c\mu(A_P^-)$. Since $\mu(A_P^+)$ and $\mu(A_P^-)$ can become arbitrarily close to $\mu(A)$ as $P$ gets more refined, we conclude that $\nu(A) = c\mu(A)$.

The discussion on translation invariant volume theory helps us to study the change of the volume under linear transforms.


Proposition 7.1.14. Suppose $L\colon \mathbb{R}^n \to \mathbb{R}^n$ is a linear transform. Suppose a subset $A$ has $\mu$-volume. Then $L(A)$ has $\mu$-volume, and
\[
\mu(L(A)) = |\det(L)|\,\mu(A). \tag{7.1.24}
\]

Proof. For any rectangle $I$, $L(I)$ is a parallelepiped, which has $\mu$-volume by Example 7.1.5. This implies that if $A$ is a finite union of rectangles, then $L(A)$ has $\mu$-volume.

Denote $c_L = \mu(L([0, 1]^n))$. By the translation invariance property of $A \mapsto \mu(L(A))$ and the proof of Proposition 7.1.13, we have $\mu(L(A)) = c_L\mu(A)$ for any finite union $A$ of rectangles. Repeating the last part of the proof, for any subset $A$ with $\mu$-volume, we get $\mu(L(A_P^+)) = c_L\mu(A_P^+) \ge \mu(L(A)) \ge \mu(L(A_P^-)) = c_L\mu(A_P^-)$. Then by the last criterion in Proposition 7.1.4, we conclude that $L(A)$ also has $\mu$-volume. Moreover, the estimation tells us $\mu(L(A)) = c_L\mu(A)$.

It remains to show $c_L = |\det(L)|$. For linear transforms $L, K\colon \mathbb{R}^n \to \mathbb{R}^n$, we have $\mu(L \circ K(A)) = \mu(L(K(A))) = c_L\mu(K(A)) = c_Lc_K\mu(A)$. This tells us $c_{L\circ K} = c_Lc_K$. On the other hand, any linear transform is a composition of the following types (called elementary linear transforms).

1. Exchange: $L(x_1, x_2, x_3, \dots, x_n) = (x_2, x_1, x_3, \dots, x_n)$.

2. Scaling: $L(x_1, x_2, \dots, x_n) = (cx_1, x_2, \dots, x_n)$.

3. Shifting: $L(x_1, x_2, x_3, \dots, x_n) = (x_1, x_2 + cx_1, x_3, \dots, x_n)$.

By $c_{L\circ K} = c_Lc_K$ and $|\det(L \circ K)| = |\det(L)||\det(K)|$, it suffices to verify the formula $c_L = \mu(L([0, 1]^n)) = |\det(L)|$ for the elementary linear transforms.

For the exchange, we have $L([0, 1]^n) = [0, 1]^n$, so that $c_L = \mu([0, 1]^n) = 1 = |-1| = |\det L|$. For the scaling, we have $L([0, 1]^n) = [0, c] \times [0, 1]^{n-1}$, so that $c_L = \mu([0, c] \times [0, 1]^{n-1}) = |c| = |\det L|$. For the shifting, we have $L([0, 1]^n) = B \times [0, 1]^{n-2}$, where $B \subset \mathbb{R}^2$ is the parallelogram with $(0, 0)$, $(1, c)$, $(1, 1 + c)$, $(0, 1)$ as vertices. It is easy to see that $\mu(B) = 1$, so that $c_L = \mu(B \times [0, 1]^{n-2}) = 1 = |\det L|$.
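The identity $\mu(L([0,1]^n)) = |\det L|$ can be checked by hand in dimension 2. The following Python sketch, with hypothetical helper names, maps the unit square by a $2\times 2$ matrix and measures the image parallelogram with the shoelace formula (a standard polygon area formula that is not part of the text), for one exchange, one scaling, one shifting, and one composite transform.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return a * d - b * c


def apply(m, p):
    """Apply the matrix m to the point p = (x, y)."""
    (a, b), (c, d) = m
    x, y = p
    return (a * x + b * y, c * x + d * y)


def shoelace_area(vertices):
    """Area of a simple polygon from its vertices listed in boundary order."""
    n = len(vertices)
    s = 0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2


UNIT_SQUARE = [(0, 0), (1, 0), (1, 1), (0, 1)]


def image_area(m):
    """Area of the image of the unit square under the linear map m."""
    return shoelace_area([apply(m, p) for p in UNIT_SQUARE])


exchange = ((0, 1), (1, 0))    # (x1, x2) -> (x2, x1)
scaling = ((3, 0), (0, 1))     # (x1, x2) -> (3 x1, x2)
shifting = ((1, 0), (2, 1))    # (x1, x2) -> (x1, x2 + 2 x1)
composite = ((2, 1), (1, 3))   # a general invertible matrix

if __name__ == "__main__":
    for m in (exchange, scaling, shifting, composite):
        print(m, image_area(m), abs(det2(m)))
```

For the shifting, the image has vertices $(0,0)$, $(1,2)$, $(1,3)$, $(0,1)$, which is the parallelogram $B$ of the proof with $c = 2$.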

Proposition 7.1.14 suggests that the concept of having $\mu$-volume is well-defined on any finite dimensional vector space $V$. Proposition 7.1.13 further tells us that translation invariant volumes on the collection of subsets of $V$ with $\mu$-volumes are unique up to multiplying a constant. In particular, such a volume theory is completely determined by the volume of a non-degenerate parallelepiped (rectangles are not well defined in vector spaces, parallelepipeds are well defined).

Exercise 7.1.35. Suppose $A \subset \mathbb{R}^n$ is a subset with volume. The cone on $A$ with cone point $\vec{c} \in \mathbb{R}^{n+1}$ is
\[
\vec{c}A = \{(1 - t)\vec{c} + t(\vec{a}, 0) : \vec{a} \in A,\ t \in [0, 1]\}.
\]
Prove that the volume of $\vec{c}A$ is $\frac{1}{n+1}|c_{n+1}|\mu(A)$.


Exercise 7.1.36. Suppose $\alpha_n$ is the volume of the unit ball $B^n \subset \mathbb{R}^n$.

1. Prove that $\alpha_n = \int_{-1}^{1} (1 - r^2)^{\frac{n-1}{2}} \alpha_{n-1}\,dr$.

2. Find the general formula for $\alpha_n$.

3. What is the volume of the ellipsoid $x^2 + 13y^2 + 14z^2 + 6xy + 2xz + 18yz \le 1$? The quadratic form appeared in Example 5.2.3.

Now we can justify the claim in the multilinear algebra that the volume of a parallelepiped spanned by $\vec{x}_1, \vec{x}_2, \dots, \vec{x}_k \in \mathbb{R}^n$ is the Euclidean norm $\|\vec{x}_1 \wedge \vec{x}_2 \wedge \cdots \wedge \vec{x}_k\|_2$ of the wedge of the spanning vectors.

Suppose $V \subset \mathbb{R}^n$ is the $k$-dimensional linear subspace spanned by the vectors $\vec{x}_1, \vec{x}_2, \dots, \vec{x}_k$. Then $V$ has an inner product induced from the dot product of $\mathbb{R}^n$. For any orthonormal basis of $V$, there is a unique translation invariant volume theory on $V$ such that the volume of the cube spanned by the orthonormal basis is 1. Since any two orthonormal bases are related by an orthogonal linear transform, which has determinant $\pm 1$, the volume theory is actually independent of the choice of the orthonormal basis. Denote the volume theory by $\mu_V$ (which only depends on the inner product on $V$). Equivalently, we can take any isometry $L\colon \mathbb{R}^k \to V$ (linear transform preserving the inner product) and define $\mu_V = \mu \circ L^{-1}$.

Suppose $K\colon \mathbb{R}^k \to V$ is the linear transform that takes the standard basis $\vec{e}_1, \vec{e}_2, \dots, \vec{e}_k$ to $\vec{x}_1, \vec{x}_2, \dots, \vec{x}_k$. Then the parallelepiped spanned by $\vec{x}_1, \vec{x}_2, \dots, \vec{x}_k$ is $A = K(I)$, where $I = [0, 1]^k$ is the cube spanned by $\vec{e}_1, \vec{e}_2, \dots, \vec{e}_k$. For $\vec{v}_i = L^{-1}(\vec{x}_i) = L^{-1} \circ K(\vec{e}_i)$, we have
\[
\begin{aligned}
\mu_V(A) = \mu_V(K(I)) &= (\mu_V \circ L)(L^{-1} \circ K(I)) \\
&= \mu(L^{-1} \circ K(I)) && \text{(definition of $\mu_V$)} \\
&= |\det(L^{-1} \circ K)|\,\mu(I) && \text{(Proposition 7.1.14)} \\
&= \|\vec{v}_1 \wedge \vec{v}_2 \wedge \cdots \wedge \vec{v}_k\|_2 && \text{(definition of determinant)} \\
&= \|\vec{x}_1 \wedge \vec{x}_2 \wedge \cdots \wedge \vec{x}_k\|_2. && \text{($L$ is an isometry)}
\end{aligned}
\]
The computation also tells us that the equality
\[
\mu_V(K(B)) = \|K(\vec{e}_1) \wedge K(\vec{e}_2) \wedge \cdots \wedge K(\vec{e}_k)\|_2\,\mu(B)
\]
holds for $B = I$. Since both sides are translation invariant volume theories on $\mathbb{R}^k$, the equality holds for any subset $B \subset \mathbb{R}^k$ with $\mu$-volume. After changing the notations, we make a record of the fact.

Proposition 7.1.15. Suppose $L\colon \mathbb{R}^k \to \mathbb{R}^n$ is an injective linear transform, with image subspace $V$. Then
\[
\mu_V(L(A)) = \|L(\vec{e}_1) \wedge L(\vec{e}_2) \wedge \cdots \wedge L(\vec{e}_k)\|_2\,\mu(A). \tag{7.1.25}
\]
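For $k = 2$ the wedge norm is easy to compute directly. The sketch below assumes that the norm on wedge products is the Euclidean norm of the coordinates with respect to the basis $\vec{e}_i \wedge \vec{e}_j$, $i < j$, so that $\|\vec{u} \wedge \vec{v}\|_2$ is the square root of the sum of the squared $2 \times 2$ minors. It compares this with the Gram determinant $\sqrt{(\vec{u}\cdot\vec{u})(\vec{v}\cdot\vec{v}) - (\vec{u}\cdot\vec{v})^2}$, which is the familiar area of the parallelogram spanned by $\vec{u}$ and $\vec{v}$. The function names are illustrative.

```python
import math
from itertools import combinations


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def wedge_norm(u, v):
    """Euclidean norm of u ^ v in the basis e_i ^ e_j (i < j):
    square root of the sum of the squared 2x2 minors."""
    minors = [u[i] * v[j] - u[j] * v[i]
              for i, j in combinations(range(len(u)), 2)]
    return math.sqrt(sum(m * m for m in minors))


def gram_area(u, v):
    """Area of the parallelogram spanned by u and v via the Gram determinant."""
    return math.sqrt(dot(u, u) * dot(v, v) - dot(u, v) ** 2)


if __name__ == "__main__":
    u, v = (1, 0, 2, -1), (0, 3, 1, 1)   # two vectors in R^4
    print(wedge_norm(u, v), gram_area(u, v))
```

For the two vectors shown, both quantities equal $\sqrt{65}$.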


7.1.6 Change of Variable

A change of variable on $\mathbb{R}^n$ is an invertible map $\Phi\colon U \subset \mathbb{R}^n \to \mathbb{R}^n$ defined on an open subset $U$, such that both $\Phi$ and $\Phi^{-1}$ are continuously differentiable. By the inverse function theorem, the condition is the same as requiring that $\Phi$ is invertible and continuously differentiable, and that the derivative $\Phi'$ is invertible everywhere. Moreover, we know $\Phi(U)$ is also open.

A change of variable on a subset $A$ is a change of variable on an open subset $U$ containing the closure $\bar{A}$ of $A$.

The formula for the Riemann integral under a change of variable was established for single variable functions in Theorems 3.2.4 and 3.2.5. For multivariable functions, we first study how the change of variable affects the volume.

Proposition 7.1.14 tells us how the volume changes under a linear change of variable. For a general change of variable on a subset $A$ with volume, we consider a partition $P$ of $A$ and choices $\vec{x}_I^* \in I$ for $I \in P$. Then $\Phi$ is approximated by the linear map
\[
L_I(\vec{x}) = \Phi(\vec{x}_I^*) + \Phi'(\vec{x}_I^*)(\vec{x} - \vec{x}_I^*)
\]
on $I$, and we expect the volume of $\Phi(I)$ to be approximated by the volume of $L_I(I)$, which is $|\det\Phi'(\vec{x}_I^*)|\mu(I)$. Thus the volume of $\Phi(A)$ is approximated by $\sum_{I\in P} |\det\Phi'(\vec{x}_I^*)|\mu(I)$, which leads to the expectation that
\[
\mu(\Phi(A)) = \int_A |\det\Phi'(\vec{x})|\,d\mu_{\vec{x}}. \tag{7.1.26}
\]
This further leads to the formula for the Riemann integral under a change of variable.

Theorem 7.1.16 (Change of Variable). Suppose $A \subset \mathbb{R}^n$ has volume. Suppose $\Phi\colon \mathbb{R}^n \to \mathbb{R}^n$ is a change of variable on $A$. Then $\Phi(A)$ has volume. Moreover, for any Riemann integrable function $f$ on $\Phi(A)$, $f \circ \Phi$ is also Riemann integrable on $A$, and
\[
\int_A f(\Phi(\vec{x}))|\det\Phi'(\vec{x})|\,d\mu_{\vec{x}} = \int_{\Phi(A)} f(\vec{y})\,d\mu_{\vec{y}}. \tag{7.1.27}
\]
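The change of variable formula can be tested numerically on a map whose image is easy to describe. The sketch below uses the hypothetical map $\Phi(u, v) = (e^u\cos v, e^u\sin v)$ on $A = [0, 1] \times [0, \frac{\pi}{2}]$, whose image is the quarter annulus $1 \le \rho \le e$ in the first quadrant. It approximates $\det\Phi'$ by central differences and the left side of (7.1.27) by a midpoint Riemann sum, then compares with values computed directly on the quarter annulus. All names and step sizes are illustrative assumptions.

```python
import math


def phi(u, v):
    """Hypothetical change of variable (u, v) -> (e^u cos v, e^u sin v)."""
    return (math.exp(u) * math.cos(v), math.exp(u) * math.sin(v))


def jacobian_det(phi, u, v, h=1e-5):
    """Determinant of the derivative of phi at (u, v), by central differences."""
    xu1, yu1 = phi(u + h, v)
    xu0, yu0 = phi(u - h, v)
    xv1, yv1 = phi(u, v + h)
    xv0, yv0 = phi(u, v - h)
    dxdu, dydu = (xu1 - xu0) / (2 * h), (yu1 - yu0) / (2 * h)
    dxdv, dydv = (xv1 - xv0) / (2 * h), (yv1 - yv0) / (2 * h)
    return dxdu * dydv - dxdv * dydu


def pulled_back_integral(f, phi, u_range, v_range, n=150):
    """Midpoint sum for the integral of f(phi(x)) |det phi'(x)| over a rectangle."""
    (u0, u1), (v0, v1) = u_range, v_range
    hu, hv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * hu
        for j in range(n):
            v = v0 + (j + 0.5) * hv
            x, y = phi(u, v)
            total += f(x, y) * abs(jacobian_det(phi, u, v)) * hu * hv
    return total


if __name__ == "__main__":
    A = ((0.0, 1.0), (0.0, math.pi / 2))
    # f = 1 gives the area of the quarter annulus: (pi/4)(e^2 - 1).
    print(pulled_back_integral(lambda x, y: 1.0, phi, *A),
          math.pi / 4 * (math.e ** 2 - 1))
    # f = x^2 + y^2 gives (pi/8)(e^4 - 1).
    print(pulled_back_integral(lambda x, y: x * x + y * y, phi, *A),
          math.pi / 8 * (math.e ** 4 - 1))
```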

The sketch that derives the formula contains some technical difficulties. First, we do not yet know that $\Phi(I)$ has volume, so we can only estimate $\mu^+(\Phi(I))$ or $\mu^-(\Phi(I))$ at the beginning. Second, the estimation on the volume of $\Phi(I)$ can only be made by finding a nice subset $I_\Phi$ containing or contained in $\Phi(I)$. We expect $I_\Phi$ to be a "slightly inflated" version of the parallelepiped $L_I(I)$ (for containing $\Phi(I)$) or a "slightly shrunken" version (for being contained in $\Phi(I)$). However, the only information we have is the fact that $L_I(\vec{x})$ is a linear approximation of $\Phi$, which is some inequality about norms. The subset containment that can be derived from the inequality must be the containment of balls. Thus we must consider partitions consisting of balls. This means that the partitions are made up of closed balls
\[
B_{L^\infty}(\vec{c}, r) = \{\vec{x} : \|\vec{x} - \vec{c}\|_\infty \le r\} = \vec{c} + [-r, r]^n
\]
in the $L^\infty$-norm. Moreover, the containment argument between $\Phi(I)$ and $I_\Phi$ must be done after using a linear transform to convert the parallelepiped $I_\Phi$ to an $L^\infty$-ball.

Proof. The change of variable $\Phi$ is defined on an open subset $U$ containing the closure $\bar{A}$. We need to consider partitions $P$ and take linear approximations of $\Phi$ on $I \in P$ that have nonempty intersections with $A$. Therefore we need to consider $\Phi$ on a compact subset slightly bigger than $A$ and still contained in $U$. By Proposition 5.1.9, there is $\delta_0 > 0$, such that
\[
K = \{\vec{y} : \|\vec{y} - \vec{x}\|_\infty \le \delta_0 \text{ for some } \vec{x} \in A\} \subset U.
\]
Since $A$ is bounded, it is easy to see that $K$ is a bounded and closed subset. By Proposition 5.1.3, $K$ is compact.

With the help of Theorem 5.1.14, we can set up the bounds and the linear approximations for $\Phi$ on $K$. Specifically, the uniform continuity of $\Phi'$ and Proposition 6.1.4 tell us that for any $\varepsilon > 0$, there is $\delta > 0$, such that if $\|\vec{y} - \vec{x}\|_\infty < \delta$ and the straight line connecting $\vec{x}$ to $\vec{y}$ is contained in $K$, then
\[
\|\Phi(\vec{y}) - \Phi(\vec{x}) - \Phi'(\vec{x})(\vec{y} - \vec{x})\|_\infty \le \varepsilon\|\vec{y} - \vec{x}\|_\infty. \tag{7.1.28}
\]
Moreover, the boundedness of $\Phi'$ and $(\Phi')^{-1}$ tells us that $|\det\Phi'| < M$ and $\|(\Phi')^{-1}\| < M$ on $K$ for some constant $M$, where the norm of the linear transform is with respect to the $L^\infty$-norm.

[Figure 7.2: estimate the volume of Φ(I). Labels recovered from the figure: I, ~c, Φ(~c), Φ, L, L−1, r.]

Now consider a closed $L^\infty$-ball $I = B_{L^\infty}(\vec{c}, r) \subset K$ with radius $r < \delta$. We would like to estimate the volume of $\Phi(I)$. Denote the linear approximation of $\Phi$ at $\vec{c}$ by
\[
L(\vec{x}) = \Phi(\vec{c}) + \Phi'(\vec{c})(\vec{x} - \vec{c}).
\]
Then for any $\vec{x} \in I$, by (7.1.28) we have
\[
\|\Phi(\vec{x}) - L(\vec{x})\|_\infty \le \varepsilon\|\vec{x} - \vec{c}\|_\infty \le \varepsilon r.
\]
This implies
\[
\|L^{-1}(\Phi(\vec{x})) - \vec{x}\|_\infty = \|\Phi'(\vec{c})^{-1}(\Phi(\vec{x}) - L(\vec{x}))\|_\infty \le M\|\Phi(\vec{x}) - L(\vec{x})\|_\infty \le \varepsilon Mr,
\]
and
\[
\|L^{-1}(\Phi(\vec{x})) - \vec{c}\|_\infty \le \|L^{-1}(\Phi(\vec{x})) - \vec{x}\|_\infty + \|\vec{x} - \vec{c}\|_\infty \le \varepsilon Mr + r = (1 + \varepsilon M)r.
\]
Therefore $L^{-1}(\Phi(I)) \subset B_{L^\infty}(\vec{c}, (1 + \varepsilon M)r)$, which is the same as $\Phi(I) \subset L(B_{L^\infty}(\vec{c}, (1 + \varepsilon M)r))$. Thus by Proposition 7.1.14,
\[
\begin{aligned}
\mu^+(\Phi(I)) &\le \mu(L(B_{L^\infty}(\vec{c}, (1 + \varepsilon M)r))) = |\det\Phi'(\vec{c})|\,\mu(B_{L^\infty}(\vec{c}, (1 + \varepsilon M)r)) \\
&= (1 + \varepsilon M)^n|\det\Phi'(\vec{c})|\,\mu(I).
\end{aligned} \tag{7.1.29}
\]

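Next we need to put the estimations together. For the given $\varepsilon > 0$, there is $\delta' > 0$, such that
\[
\|P\| < \delta' \implies \mu(A_P^+) - \mu(A_P^-) < \varepsilon.
\]
Fix any $r < \min\left\{\frac{\delta_0}{2}, \delta, \frac{\delta'}{2}\right\}$ and choose $P$ to consist of special rectangles of the form $I = B_{L^\infty}(\vec{c}, r)$. By $2r < \delta_0$, we have
\[
I \cap A \ne \emptyset \implies I \subset K.
\]
Therefore the estimation (7.1.29) can be applied to $\Phi(I)$ and gives us
\[
\begin{aligned}
\mu^+(\Phi(A)) \le \sum_{I\cap A\ne\emptyset} \mu^+(\Phi(I)) &\le (1 + \varepsilon M)^n \sum_{I\cap A\ne\emptyset} |\det\Phi'(\vec{c})|\,\mu(I) \\
&\le (1 + \varepsilon M)^n M\mu(A_P^+).
\end{aligned} \tag{7.1.30}
\]
Applying the inequality (7.1.30) to $\partial A$ in place of $A$, we get
\[
\mu^+(\Phi(\partial A)) \le (1 + \varepsilon M)^n M\mu((\partial A)_P^+).
\]
Since $A$ has volume, $\mu((\partial A)_P^+)$ can become arbitrarily small. Therefore $\mu^+(\Phi(\partial A)) = 0$. By the continuity of $\Phi$ and $\Phi^{-1}$, we have $\Phi(\partial A) = \partial\Phi(A)$. Then by the second criterion in Proposition 7.1.2, we conclude that $\Phi(A)$ has volume and $\mu^+(\Phi(A)) = \mu(\Phi(A))$.

The inequality (7.1.30) (the first and the third terms, more precisely) further tells us
\[
\begin{aligned}
\mu(\Phi(A)) &= \mu^+(\Phi(A)) \\
&\le (1 + \varepsilon M)^n \sum_{I\cap A\ne\emptyset} |\det\Phi'(\vec{c})|\,\mu(I) \\
&\le (1 + \varepsilon M)^n \left(\sum_{I\subset A} + \sum_{I\cap A\ne\emptyset,\ I\cap(\mathbb{R}^n - A)\ne\emptyset}\right) |\det\Phi'(\vec{c})|\,\mu(I) \\
&\le (1 + \varepsilon M)^n \left(\sum_{I\cap A,\ I\in P} |\det\Phi'(\vec{c})|\,\mu(I \cap A) + M\sum_{I\cap\partial A\ne\emptyset} \mu(I)\right) \\
&= (1 + \varepsilon M)^n \left(S(P \cap A, |\det\Phi'|) + M\mu((\partial A)_P^+)\right).
\end{aligned}
\]
As $\varepsilon \to 0$ and $\|P\| \to 0$, we get
\[
\mu(\Phi(A)) \le \int_A |\det\Phi'|\,d\mu.
\]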


We have proved one direction of the formula for the change of volume. We will show that this implies the same one-direction formula for the change of Riemann integrals, at least for non-negative functions. Then by applying the one-direction formula to the change of variable $\Phi^{-1}$, we get the formula in the opposite direction.

First we need to clarify the integrability. The graph subset $G_A(f \circ \Phi)$ is obtained by applying the change of variable $(\Phi^{-1}, \mathrm{id})$ to the graph subset $G_{\Phi(A)}(f)$. We just proved that a change of variable takes subsets with volume to subsets with volume. Thus if $G_{\Phi(A)}(f)$ has volume, then $G_A(f \circ \Phi)$ has volume. By Proposition 7.1.6, this means that the Riemann integrability of $f$ on $\Phi(A)$ implies the Riemann integrability of $f \circ \Phi$ on $A$.

Second we study the change of integral for an integrable function $f \ge 0$ on $\Phi(A)$. Suppose $P$ is a general partition of $A$. Then $\Phi(P)$ is also a general partition. By choosing $\vec{x}_I^* \in I$ and the corresponding $\Phi(\vec{x}_I^*) \in \Phi(I)$, we have
\[
S(f, \Phi(P)) = \sum_{I\in P} f(\Phi(\vec{x}_I^*))\,\mu(\Phi(I)) \le \sum_{I\in P} f(\Phi(\vec{x}_I^*)) \int_I |\det\Phi'(\vec{x})|\,d\mu_{\vec{x}}. \tag{7.1.31}
\]
On the other hand,
\[
\begin{aligned}
&\left|S((f \circ \Phi)|\det\Phi'|, P) - \sum_{I\in P} f(\Phi(\vec{x}_I^*)) \int_I |\det\Phi'(\vec{x})|\,d\mu_{\vec{x}}\right| \\
&\qquad\le \sum_{I\in P} f(\Phi(\vec{x}_I^*)) \left||\det\Phi'(\vec{x}_I^*)|\,\mu(I) - \int_I |\det\Phi'(\vec{x})|\,d\mu_{\vec{x}}\right| \\
&\qquad\le \sum_{I\in P} f(\Phi(\vec{x}_I^*))\,\omega_I(|\det\Phi'|)\,\mu(I).
\end{aligned} \tag{7.1.32}
\]
Since $f$ is bounded and $|\det\Phi'|$ is integrable (by continuity), for any $\varepsilon > 0$, there is $\delta > 0$, such that $\|P\| < \delta$ implies
\[
\sum_{I\in P} f(\Phi(\vec{x}_I^*))\,\omega_I(|\det\Phi'|)\,\mu(I) < \varepsilon. \tag{7.1.33}
\]
Combining (7.1.31), (7.1.32), (7.1.33) together, we see $\|P\| < \delta$ implies
\[
S(f, \Phi(P)) \le S((f \circ \Phi)|\det\Phi'|, P) + \varepsilon.
\]
The boundedness of $\Phi'$ implies that $\|\Phi(P)\|$ is also small when $\|P\|$ is small. Thus the inequality above implies
\[
\int_{\Phi(A)} f(\vec{y})\,d\mu_{\vec{y}} \le \int_A f(\Phi(\vec{x}))|\det\Phi'(\vec{x})|\,d\mu_{\vec{x}}. \tag{7.1.34}
\]
Applying the inequality (7.1.34) to $\Phi^{-1}$, $(f \circ \Phi)|\det\Phi'|$, $\Phi(A)$ in place of $\Phi$, $f$, $A$, we get the inequality in the opposite direction. Thus the formula for the change of variable is proved for non-negative functions. The general case can be proved by expressing any Riemann integrable function $f$ as $f = \max\{0, f\} - \max\{0, -f\}$, where both $\max\{0, f\}$ and $\max\{0, -f\}$ are non-negative and Riemann integrable.


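Example 7.1.10. Suppose a region $A$ in the cartesian coordinate corresponds to a region $B$ in the polar coordinate. Then by taking the determinant of the Jacobian matrix in Example 6.1.8, we have
\[
\int_A f(x, y)\,dxdy = \int_B f(r\cos\theta, r\sin\theta)\,r\,drd\theta.
\]
For example, the intersection of the ball $x^2 + y^2 + z^2 \le R^2$ and the cylinder $x^2 + y^2 \le Rx$ is the region between the graphs of $z = \sqrt{R^2 - x^2 - y^2}$ and $z = -\sqrt{R^2 - x^2 - y^2}$ over the disk $x^2 + y^2 \le Rx$. Since the boundary of the disk is given by $r = R\cos\theta$ for $-\frac{\pi}{2} \le \theta \le \frac{\pi}{2}$, the volume is
\[
\int_{x^2+y^2\le Rx} 2\sqrt{R^2 - x^2 - y^2}\,dxdy
= \int_{-\frac{\pi}{2}}^{\frac{\pi}{2}} \int_0^{R\cos\theta} 2\sqrt{R^2 - r^2}\,r\,drd\theta
= \frac{2R^3}{3}\int_{-\frac{\pi}{2}}^{\frac{\pi}{2}} (1 - |\sin\theta|^3)\,d\theta
= \frac{2}{3}\left(\pi - \frac{4}{3}\right)R^3.
\]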
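A numerical check of the polar coordinate computation for the ball and the cylinder can be sketched as follows. It assumes the disk $x^2 + y^2 \le Rx$ is described in polar coordinates by $0 \le r \le R\cos\theta$, $-\frac{\pi}{2} \le \theta \le \frac{\pi}{2}$, and that the height of the region over the point $(r, \theta)$ is $2\sqrt{R^2 - r^2}$. Under these assumptions the midpoint sum should approach the closed form $\frac{2}{3}(\pi - \frac{4}{3})R^3$. The function names are illustrative.

```python
import math


def ball_cylinder_volume(R, n=300):
    """Midpoint sum for the integral of 2*sqrt(R^2 - r^2) * r over
    0 <= r <= R cos(theta), -pi/2 <= theta <= pi/2."""
    h_theta = math.pi / n
    total = 0.0
    for i in range(n):
        theta = -math.pi / 2 + (i + 0.5) * h_theta
        r_max = R * math.cos(theta)      # boundary of the disk x^2 + y^2 <= R x
        h_r = r_max / n
        inner = 0.0
        for j in range(n):
            r = (j + 0.5) * h_r
            inner += 2 * math.sqrt(R * R - r * r) * r * h_r
        total += inner * h_theta
    return total


def closed_form(R):
    """Value (2/3)(pi - 4/3) R^3 of the polar integral."""
    return (2 / 3) * (math.pi - 4 / 3) * R ** 3


if __name__ == "__main__":
    print(ball_cylinder_volume(1.0), closed_form(1.0))
```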

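Example 7.1.11. Suppose
\[
F(\vec{\theta})\colon A \subset \mathbb{R}^{n-1} \to S^{n-1} = \{\vec{x} : \|\vec{x}\|_2 = 1\} \subset \mathbb{R}^n
\]
is a parametrization of the unit sphere (see Example 6.1.10 for an example in case $n = 3$). Then
\[
\Phi(r, \vec{\theta}) = rF(\vec{\theta})\colon (0, \infty) \times A \to \mathbb{R}^n
\]
is a spherical change of variable (away from $\vec{0}$). We have
\[
\Phi' = \begin{pmatrix} F & rF' \end{pmatrix}, \qquad \det\Phi' = r^{n-1}\det\begin{pmatrix} F & F' \end{pmatrix}.
\]
For a Riemann integrable function $f(t)$ on $[a, b]$, $b > a \ge 0$, we have
\[
\begin{aligned}
\int_{a\le\|\vec{x}\|_2\le b} f(\|\vec{x}\|_2)\,d\mu
&= \int_{a\le r\le b,\ \vec{\theta}\in A} f(r)\,r^{n-1}\left|\det\begin{pmatrix} F & F' \end{pmatrix}\right| dr\,d\mu_{\vec{\theta}} \\
&= \beta_{n-1}\int_a^b r^{n-1}f(r)\,dr.
\end{aligned}
\]
The quantity
\[
\beta_{n-1} = \int_A \left|\det\begin{pmatrix} F & F' \end{pmatrix}\right| d\mu_{\vec{\theta}}
\]
is independent of the choice of parametrization $F$ and is the $(n-1)$-dimensional volume of the unit sphere. For the special cases $n = 2$ and $n = 3$, we have $\beta_1 = 2\pi$ and
\[
\beta_2 = \int_{[0,\pi]\times[0,2\pi]} \left|\det\begin{pmatrix}
\sin\phi\cos\theta & \cos\phi\cos\theta & -\sin\phi\sin\theta \\
\sin\phi\sin\theta & \cos\phi\sin\theta & \sin\phi\cos\theta \\
\cos\phi & -\sin\phi & 0
\end{pmatrix}\right| d\phi\,d\theta = 4\pi.
\]
We also note that by taking $f = 1$, $a = 0$, $b = 1$, we get the volume
\[
\alpha_n = \beta_{n-1}\int_0^1 r^{n-1}\,dr = \frac{\beta_{n-1}}{n}
\]
of the unit ball $B^n$. Then $\beta_{n-1}$ can be computed by using Exercise 7.1.36.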
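The relation $\beta_{n-1} = n\alpha_n$ can be tabulated numerically. The sketch below assumes the standard closed form $\alpha_n = \pi^{n/2}/\Gamma(\frac{n}{2} + 1)$ for the volume of the unit ball, which is not derived in the text, and checks it against the recursion $\alpha_n = \alpha_{n-1}\int_{-1}^1 (1 - r^2)^{\frac{n-1}{2}}\,dr$ of Exercise 7.1.36 by a midpoint rule. The function names and the number of steps are illustrative.

```python
import math


def ball_volume(n):
    """alpha_n, assuming the closed form pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


def sphere_volume(n):
    """beta_{n-1} = n * alpha_n, the (n-1)-dimensional volume of the unit sphere."""
    return n * ball_volume(n)


def slice_integral(n, steps=20000):
    """Midpoint rule for the integral of (1 - r^2)^((n-1)/2) over [-1, 1]."""
    h = 2.0 / steps
    return h * sum((1 - (-1 + (i + 0.5) * h) ** 2) ** ((n - 1) / 2)
                   for i in range(steps))


if __name__ == "__main__":
    for n in range(2, 7):
        recursive = ball_volume(n - 1) * slice_integral(n)
        print(n, ball_volume(n), recursive, sphere_volume(n))
```

For $n = 2$ and $n = 3$ the function `sphere_volume` returns $2\pi$ and $4\pi$, in agreement with $\beta_1$ and $\beta_2$.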


Example 7.1.12. Suppose $A \subset \mathbb{R}^2$ has area and all $(x, y) \in A$ satisfy $y \ge 0$. The rotation of $A$ with respect to the $x$-axis is the subset
\[
B = \{(x, y, z) : (x, r) \in A,\ y^2 + z^2 = r^2\} \subset \mathbb{R}^3.
\]
By using the parametrization
\[
\Phi(x, r, \theta) = (x, r\cos\theta, r\sin\theta)\colon A \times [0, 2\pi) \to B
\]
and $\det\Phi' = r$, we find the volume
\[
\mu(B) = \int_{A\times[0,2\pi)} r\,dxdrd\theta = 2\pi\int_A y\,dxdy.
\]
Introduce the center of weight
\[
(x_A^*, y_A^*) = \frac{1}{\mu(A)}\int_A (x, y)\,dxdy
\]
for the subset $A$. Then we get
\[
\mu(B) = 2\pi y_A^*\mu(A).
\]
The formula is called the Pappus-Guldinus theorem. We note that $y_A^*$ is the average of the distance from points in $A$ to the $x$-axis.

Next we consider the more general case of the rotation along a straight line $L\colon ax + by = c$. Assume $A$ lies in the "positive side" of the straight line, which means that all $(x, y) \in A$ satisfy $ax + by \ge c$. Denote by $B$ the subset of $\mathbb{R}^3$ obtained by rotating $A$ around $L$.

To compute the volume of $B$, we introduce a new coordinate system on $\mathbb{R}^2$ by
\[
F(x, y) = (x_0 + x\cos\alpha - y\sin\alpha,\ y_0 + x\sin\alpha + y\cos\alpha)\colon \mathbb{R}^2 \to \mathbb{R}^2,
\]
where $(x_0, y_0)$ is a point on $L$ and $\alpha$ is the angle of inclination of $L$. The transform takes the $x$-axis to $L$, and $F^{-1}(A)$ lies in the upper half plane. A corresponding rotation in $\mathbb{R}^3$ takes the rotation of $F^{-1}(A)$ with respect to the $x$-axis to $B$. Since the rotation does not change distance and volume, we still have
\[
\mu(B) = 2\pi y_{F^{-1}(A)}^*\mu(F^{-1}(A)) = 2\pi d\mu(A),
\]
where $d$ is the average distance from points in $A$ to $L$. Since the distance from $(x, y)$ to $L$ is $\frac{(a, b)}{\sqrt{a^2 + b^2}} \cdot (x - x_0, y - y_0)$, we have
\[
d = \frac{(a, b)}{\mu(A)\sqrt{a^2 + b^2}} \cdot \int_A (x - x_0, y - y_0)\,d\mu
= \frac{a(x_A^* - x_0) + b(y_A^* - y_0)}{\sqrt{a^2 + b^2}}
= \frac{ax_A^* + by_A^* - c}{\sqrt{a^2 + b^2}}.
\]
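The Pappus-Guldinus formula can be compared with an independent computation for a torus. The sketch below takes $A$ to be the disk of radius $a$ centered at $(0, b)$ with $b > a$, an example chosen here for illustration, and rotates it around the $x$-axis. It computes the volume once by washers (integrating the area of the annular sections) and once by $2\pi d\mu(A)$, with $d$ given by the formula for the average distance to the line $ax + by = c$. The function names are hypothetical.

```python
import math


def washer_volume(a, b, n=20000):
    """Volume of the torus obtained by rotating the disk of radius a centered
    at (0, b) around the x-axis, via midpoint sums of annular section areas."""
    h = 2 * a / n
    total = 0.0
    for i in range(n):
        x = -a + (i + 0.5) * h
        s = math.sqrt(a * a - x * x)          # half-height of the disk at x
        total += math.pi * ((b + s) ** 2 - (b - s) ** 2) * h
    return total


def pappus_volume(area, centroid, line):
    """2*pi*d*area, where d is the distance from the centroid to the line
    a x + b y = c given as line = (a, b, c)."""
    a, b, c = line
    xs, ys = centroid
    d = (a * xs + b * ys - c) / math.hypot(a, b)
    return 2 * math.pi * d * area


if __name__ == "__main__":
    a, b = 1.0, 3.0
    print(washer_volume(a, b))
    print(pappus_volume(math.pi * a * a, (0.0, b), (0.0, 1.0, 0.0)))
```

Both values should be close to $2\pi^2a^2b$, the classical volume of the torus.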

Exercise 7.1.37. Find the formula for the integral under the changes of variables in Exercise 6.1.12.

Exercise 7.1.38. Use suitable changes of variables to compute the integral.

1. $\int_{x^2+y^2\le a^2} |xy|\,dxdy$.

2. $\int_A e^{\frac{x-y}{x+y}}\,dxdy$, $A$ is the region bounded by $x = 0$, $y = 0$, $x + y = 1$.

3. $\int_A (ax^2 + 2bxy + cy^2)\,dxdy$, $A$ is the part of the unit disk in the first quadrant.

4. $\int_A z^2\,dxdydz$, $A$ is the intersection of the ball $x^2 + y^2 + z^2 \le 1$ and the cylinder $x^2 + y^2 \le x$.

Exercise 7.1.39. Find the areas and volumes.

1. The region on the plane bounded by the curve $r = \sin n\theta$ in polar coordinate.

2. The region on the plane bounded by the curve $(x^2 + y^2)^2 = 2ax^3$.

3. The intersection of the balls $x^2 + y^2 + z^2 \le 1$ and $x^2 + y^2 + z^2 \le 2az$.

Exercise 7.1.40. Extend the Pappus-Guldinus Theorem in Example 7.1.12 to high dimension. Specifically, we may rotate a subset $A \subset \mathbb{R}^n$ with volume and on the positive side of the hyperplane $\vec{a} \cdot \vec{x} = b$ around the hyperplane. Moreover, the rotation of a single point can be a sphere instead of a circle.

Exercise 7.1.41. Suppose $f(t)$ is a Riemann integrable function. Prove that
\[
\int_{\|\vec{x}\|_2\le R} f(\|\vec{x} - \vec{a}\|_2)\,d\mu = \frac{1}{2}\beta_{n-2}\int_{x^2+y^2\le R^2} |y|^{n-2}f\left(\sqrt{(x - \|\vec{a}\|_2)^2 + y^2}\right)dxdy.
\]

7.1.7 Improper Integration

Suppose $A \subset \mathbb{R}^n$ is an unbounded subset such that $A \cap B$ has volume for any bounded subset $B$ with volume. Suppose $f$ is a function on $A$ that is Riemann integrable on $A \cap B$ for any bounded subset $B$ with volume. The condition is the same as requiring that the intersection $A \cap B^n_R$ with the ball of radius $R$ has volume for any $R$, and $f$ is Riemann integrable on $A \cap B^n_R$. The improper integral $\int_A f\,d\mu$ converges to $I$ if for any $\epsilon > 0$, there is $R$, such that

\[ \left| \int_{A \cap B} f\,d\mu - I \right| < \epsilon \]

for any subset $B$ with volume and containing $B^n_R$.

Proposition 7.1.17. Suppose $f$ is a function on an unbounded subset $A$, such that for any bounded subset $B$ with volume, $A \cap B$ has volume and $f$ is Riemann integrable on $A \cap B$. Then the following are equivalent.

1. The improper integral $\int_A f\,d\mu$ converges.

2. The improper integral $\int_A |f|\,d\mu$ converges.

3. For any $\epsilon > 0$, there is $R > 0$, such that if a bounded subset $C$ has volume and satisfies $C \cap B^n_R = \emptyset$, then $\left| \int_{A \cap C} f\,d\mu \right| < \epsilon$.


4. There is $M > 0$, such that $\int_{A \cap B} |f|\,d\mu < M$ for all bounded subsets $B$ with volume.

5. There is $M > 0$, such that $\left| \int_{A \cap B} f\,d\mu \right| < M$ for all bounded subsets $B$ with volume.

In the fourth statement, $B$ can be replaced by $B^n_R$, or by any sequence $B_n$ with the property that any $B^n_R$ is contained in some $B_n$.

Proof. We have the Cauchy criterion for the convergence: For any $\epsilon > 0$, there is $R > 0$, such that if $B_1$ and $B_2$ have volumes and contain $B^n_R$, then

\[ \left| \int_{A \cap B_1} f\,d\mu - \int_{A \cap B_2} f\,d\mu \right| < \epsilon. \]

By taking $B_1 = C \cup B^n_R$ and $B_2 = B^n_R$, the Cauchy criterion becomes the third statement. Conversely, suppose the third statement holds. Then for $B_1$ and $B_2$ in the Cauchy criterion, $B_1 - B_2$ and $B_2 - B_1$ satisfy the condition for $C$ in the third statement. Therefore

\[ \left| \int_{A \cap B_1} f\,d\mu - \int_{A \cap B_2} f\,d\mu \right| = \left| \int_{A \cap (B_1 - B_2)} f\,d\mu - \int_{A \cap (B_2 - B_1)} f\,d\mu \right| < 2\epsilon. \]

This proves that the first and the third statements are equivalent.

By an argument similar to the convergence of monotone sequences, the second and the fourth statements are equivalent. The first statement implies the fifth. By Proposition 7.1.11, the fifth statement implies the fourth. Then the fourth statement, being equivalent to the second statement, implies the first by way of the Cauchy criterion. This completes the proof that all statements are equivalent.

In contrast to the single variable case, an improper integral converges if and only if it absolutely converges. The reason is that more freedom is allowed for the choice of $B$ in the definition of the improper integral. If we restrict the choice of $B$ to balls only, then we get a concept similar to the single variable case, and there will be a difference between absolute and conditional convergence.

The third condition in Proposition 7.1.17 is the Cauchy criterion for the convergence. This allows us to establish the comparison tests for the convergence similar to the single variable case. Moreover, the usual properties of the Riemann integrals, including the Fubini theorem and the change of variable formula, can be extended to improper integrals.

Example 7.1.13. Suppose $f(r)$ is a function on $[a, +\infty)$. By the computation in Example 7.1.11, we have

\[ \int_{a \le \|\vec{x}\|_2 \le R} |f(\|\vec{x}\|_2)|\,d\mu = \beta_{n-1} \int_a^R |f(r)| r^{n-1}\,dr. \]


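Since the convergence of $\int_{\|\vec{x}\|_2 \ge a} f(\|\vec{x}\|_2)\,d\mu$ and $\int_a^{+\infty} |f(r)| r^{n-1}\,dr$ are equivalent to the boundedness of $\int_{a \le \|\vec{x}\|_2 \le R} |f(\|\vec{x}\|_2)|\,d\mu$ and $\int_a^R |f(r)| r^{n-1}\,dr$, we conclude that the convergence of the two improper integrals are equivalent. In particular, $\int_{\|\vec{x}\|_2 \ge a} \frac{d\mu}{\|\vec{x}\|_2^p}$ converges if and only if $p > n$, and we have

\[ \int_{\|\vec{x}\|_2 \ge a} \frac{d\mu}{\|\vec{x}\|_2^p} = \beta_{n-1} \int_a^{+\infty} \frac{1}{r^p} r^{n-1}\,dr = \frac{\beta_{n-1}}{(p - n)a^{p-n}}. \]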
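As a numerical sanity check of the closed form in Example 7.1.13, the following Python sketch compares truncated radial integrals with $\beta_{n-1}/((p-n)a^{p-n})$. It assumes that $\beta_{n-1}$ denotes the area $2\pi^{n/2}/\Gamma(n/2)$ of the unit sphere in $\mathbb{R}^n$ (which agrees with $\beta_1 = 2\pi$). The function names and the use of the midpoint rule are illustrative choices, not part of the text.

```python
import math

def sphere_area(n):
    # Assumption: beta_{n-1} is the area of the unit sphere in R^n.
    return 2 * math.pi ** (n / 2) / math.gamma(n / 2)

def truncated_integral(n, p, a, R, steps=200_000):
    # Midpoint rule for beta_{n-1} * integral_a^R r^(n-1-p) dr,
    # i.e. the integral of ||x||^(-p) over a <= ||x|| <= R.
    h = (R - a) / steps
    total = sum((a + (k + 0.5) * h) ** (n - 1 - p) for k in range(steps))
    return sphere_area(n) * total * h

def closed_form(n, p, a):
    # Limit as R -> +infinity; only meaningful for p > n.
    return sphere_area(n) / ((p - n) * a ** (p - n))

if __name__ == "__main__":
    for R in (10.0, 100.0, 1000.0):
        print(R, truncated_integral(3, 5.0, 1.0, R), closed_form(3, 5.0, 1.0))
```

For $n = 3$, $p = 5$, $a = 1$ the truncated values increase with $R$ and stay below the limit, which is the boundedness criterion of Proposition 7.1.17 in action.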

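Example 7.1.14. The improper integral $\int_{\mathbb{R}^2} e^{-(x^2+y^2)}\,dxdy$ converges because $e^{-(x^2+y^2)} \le \frac{1}{(x^2+y^2)^2}$ for sufficiently big $\|(x, y)\|_2$ and $\int_{x^2+y^2 \ge 1} \frac{dxdy}{(x^2+y^2)^2}$ converges. By the Fubini theorem, we have

\[ \int_{\mathbb{R}^2} e^{-(x^2+y^2)}\,dxdy = \left( \int_{-\infty}^{+\infty} e^{-x^2}\,dx \right) \left( \int_{-\infty}^{+\infty} e^{-y^2}\,dy \right) = \left( \int_{-\infty}^{+\infty} e^{-x^2}\,dx \right)^2. \]

On the other hand, we have

\[ \int_{\mathbb{R}^2} e^{-(x^2+y^2)}\,dxdy = \beta_1 \int_0^{+\infty} r e^{-r^2}\,dr = \pi. \]

Thus we conclude that

\[ \int_0^{+\infty} e^{-x^2}\,dx = \frac{1}{2} \int_{-\infty}^{+\infty} e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2}. \]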
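The two evaluations in Example 7.1.14 can be compared numerically. The sketch below is only an illustration: the helper `midpoint`, the truncation radius `R = 10`, and the step count are assumptions made here, chosen because $e^{-100}$ is negligible in double precision.

```python
import math

def midpoint(f, a, b, steps=100_000):
    # Composite midpoint rule on [a, b].
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))

R = 10.0  # truncation radius (an assumption of this sketch)

# One-dimensional Gaussian integral over [0, R].
half_line = midpoint(lambda x: math.exp(-x * x), 0.0, R)

# Polar form: beta_1 = 2*pi is the length of the unit circle.
polar = 2 * math.pi * midpoint(lambda r: r * math.exp(-r * r), 0.0, R)

print(half_line, math.sqrt(math.pi) / 2)
print((2 * half_line) ** 2, polar, math.pi)
```

Both routes to the double integral agree with $\pi$ up to the quadrature error, which is the identity the example uses to extract $\sqrt{\pi}/2$.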

Exercise 7.1.42. State and prove the comparison test for the convergence of improper integrals.

Exercise 7.1.43. Prove that if $f \ge 0$ and the improper integral $\int_A f\,d\mu$ converges, then

\[ \int_A f\,d\mu = \sup_B \int_{A \cap B} f\,d\mu = \sup_{R > 0} \int_{A \cap B^n_R} f\,d\mu, \]

where $B$ are all bounded subsets with volume.

Exercise 7.1.44. Suppose $A \subset \mathbb{R}^n$ is an unbounded subset, such that $A \cap B$ has volume for any bounded subset $B$ with volume. We say $A$ has volume if $\mu(A \cap B)$ is bounded for all $B$ and denote $\mu(A) = \sup_B \mu(A \cap B)$.

1. Prove that $A$ has volume if and only if the characteristic function $\chi_A$ is integrable on $\mathbb{R}^n$, and $\mu(A) = \int_{\mathbb{R}^n} \chi_A\,d\mu$.

2. Prove that the extended volume satisfies $\mu(A \cup B) + \mu(A \cap B) = \mu(A) + \mu(B)$.

3. Prove that for unbounded $A$ with volume, the improper integral $\int_A f\,d\mu$ converges if and only if $G_A(f)$ has volume.


Exercise 7.1.45. Suppose $A$ and $B$ have the property that $A \cap C$ and $B \cap C$ have volume for any bounded subset $C$ with volume. Prove that $\int_{A \cup B} f\,d\mu$ converges if and only if $\int_A f\,d\mu$ and $\int_B f\,d\mu$ converge. Moreover,

\[ \int_{A \cup B} f\,d\mu + \int_{A \cap B} f\,d\mu = \int_A f\,d\mu + \int_B f\,d\mu. \]

Exercise 7.1.46. Suppose $\int_0^{+\infty} |f(r^2)| r^{n-1}\,dr$ converges and $q(\vec{x}) = A\vec{x} \cdot \vec{x}$ is a quadratic form given by a positive definite matrix $A$. Prove that

\[ \int_{\mathbb{R}^n} f(q(\vec{x}))\,d\mu = \frac{\beta_{n-1}}{\sqrt{\det A}} \int_0^{+\infty} f(r^2) r^{n-1}\,dr. \]

Exercise 7.1.47. Suppose $\varphi(\vec{x})$ is a non-negative continuously differentiable homogeneous function of order $p > 0$ satisfying $\vec{x} \cdot \nabla\varphi(\vec{x}) \ne 0$. Prove that $\int_{\varphi(\vec{x}) \ge 1} f(\varphi(\vec{x}))\,d\mu$ converges if and only if $\int_1^{+\infty} |f(r^p)| r^{n-1}\,dr$ converges.

The result still holds even if restricted to a cone with the origin as the vertex. A consequence is that for any $1 \le p \le \infty$, the improper integral $\int_{\|\vec{x}\|_p \ge 1} f(\|\vec{x}\|_p)\,d\mu$ converges if and only if $\int_1^{+\infty} |f(r)| r^{n-1}\,dr$ converges.

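Exercise 7.1.48. Determine the convergence of the improper integrals and compute the convergent ones if possible.

1. $\int_{\mathbb{R}^2} \frac{e^{-|x-y|}\,dxdy}{1 + |x - y|}$.

2. $\int_{[1,+\infty)^2} \frac{dxdy}{(x + y)^p}$.

3. $\int_{\mathbb{R}^2} \frac{dxdy}{(1 + |x|)^p (1 + |y|)^q}$.

4. $\int_{[0,+\infty)^2} \frac{(x - y)\,dxdy}{(x + y + 1)^p}$.

5. $\int_{[1,+\infty)^2} \frac{\log(xy)\,dxdy}{x^2 + xy + y^2}$.

6. $\int_{\mathbb{R}^2} \frac{\sin x \sin y\,dxdy}{(x^2 + y^2)^p}$.

Exercise 7.1.49. Study the convergence of $\int_{x,y \ge 1} \frac{x^p y^q}{(x^m + y^n)^k}\,dxdy$, where all the parameters are positive. Extend the study to more variables.

Exercise 7.1.50. Study the convergence for any norm $\|\vec{x}\|$.

1. $\int_{\|\vec{x}\| \ge 1} \|\vec{x}\|^p\,d\mu$.

2. $\int_{\mathbb{R}^n} \|\vec{x}\|^{-\|\vec{x}\|}\,d\mu$.

3. $\int_{\mathbb{R}^n} \frac{\sin\|\vec{x}\|}{(1 + \|\vec{x}\|)^p}\,d\mu$.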

Suppose $f$ is an unbounded function on a bounded subset $A$ with volume, such that $f$ is Riemann integrable on $A - B(\vec{x}_0, \delta)$ for any $\delta > 0$. Then we say the improper integral $\int_A f\,d\mu$ converges to $I$ if for any $\epsilon > 0$, there is $\delta > 0$, such that

\[ \left| \int_{A - B} f\,d\mu - I \right| < \epsilon \]

for any subset $B$ with volume satisfying $B(\vec{x}_0, \delta') \subset B \subset B(\vec{x}_0, \delta)$ for some $\delta' < \delta$.

The improperness of a multivariable function may also appear along more sophisticated subsets. See Example 7.1.16.


With the necessary modification, Proposition 7.1.17 remains largely true. In particular, the convergence may be determined by the uniform boundedness.

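Example 7.1.15. Suppose $f(r)$ is a function on $(0, a]$. By the computation in Example 7.1.11, the improper integral $\int_{\|\vec{x}\|_2 \le a} f(\|\vec{x}\|_2)\,d\mu$ converges if and only if $\int_0^a r^{n-1} |f(r)|\,dr$ converges. In particular, $\int_{\|\vec{x}\|_2 \le a} \frac{d\mu}{\|\vec{x}\|_2^p}$ converges if and only if $p < n$, and we have

\[ \int_{\|\vec{x}\|_2 \le a} \frac{d\mu}{\|\vec{x}\|_2^p} = \beta_{n-1} \int_0^a r^{n-1} \frac{1}{r^p}\,dr = \frac{\beta_{n-1} a^{n-p}}{n - p}. \]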
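The definition of the improper integral at a point can be watched numerically by removing a small ball $B(\vec{0}, \delta)$ and letting $\delta$ shrink. The sketch below does this in $\mathbb{R}^2$ (so $\beta_1 = 2\pi$) for the integrand $\|\vec{x}\|_2^{-p}$ of Example 7.1.15. The function names, the restriction to $n = 2$, and the midpoint rule are assumptions of this illustration.

```python
import math

def punctured_disk_integral(p, a, delta, steps=100_000):
    # Integral of ||x||^(-p) over delta <= ||x|| <= a in R^2, computed in
    # polar coordinates as 2*pi * integral_delta^a r^(1-p) dr (midpoint rule).
    h = (a - delta) / steps
    total = sum((delta + (k + 0.5) * h) ** (1 - p) for k in range(steps))
    return 2 * math.pi * total * h

def limit_value(p, a):
    # beta_1 * a^(2-p) / (2-p); only meaningful for p < 2.
    return 2 * math.pi * a ** (2 - p) / (2 - p)

if __name__ == "__main__":
    for delta in (1e-1, 1e-2, 1e-3, 1e-6):
        print(delta, punctured_disk_integral(0.5, 1.0, delta), limit_value(0.5, 1.0))
```

With $p = 1/2$ and $a = 1$ the values increase towards $4\pi/3$ as $\delta \to 0$, consistent with convergence for $p < n$.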

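Example 7.1.16. The integral $\int_{(0,1)^2} \frac{dxdy}{\sqrt{(x + y)(1 - x)(1 - y)}}$ is improper along $1 \times [0, 1] \cup [0, 1] \times 1$ and $(0, 0)$. Near $1 \times [0, 1] \cup [0, 1] \times 1$, we have

\[ \frac{1}{\sqrt{2}\sqrt{(1 - x)(1 - y)}} \le \frac{1}{\sqrt{(x + y)(1 - x)(1 - y)}} \le \frac{1}{\sqrt{(1 - x)(1 - y)}}. \]

By the boundedness criterion and the convergence of $\int_{(0,1)^2} \frac{dxdy}{\sqrt{(1 - x)(1 - y)}} = \int_0^1 \frac{dx}{\sqrt{1 - x}} \int_0^1 \frac{dy}{\sqrt{1 - y}}$, we see that $\int_{(0,1)^2} \frac{dxdy}{\sqrt{(x + y)(1 - x)(1 - y)}}$ converges along $1 \times [0, 1] \cup [0, 1] \times 1$. Near $(0, 0)$, we have

\[ \frac{c_1}{\sqrt[4]{x^2 + y^2}} \le \frac{1}{\sqrt{(x + y)(1 - x)(1 - y)}} \le \frac{c_2}{\sqrt[4]{x^2 + y^2}} \]

for some $c_1, c_2 > 0$. By Example 7.1.15, the integral $\int_{x, y \ge 0,\, x^2+y^2 \le 1} \frac{dxdy}{\sqrt[4]{x^2 + y^2}}$ converges. Therefore $\int_{(0,1)^2} \frac{dxdy}{\sqrt{(x + y)(1 - x)(1 - y)}}$ converges at $(0, 0)$. We conclude that the whole integral converges.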

Exercise 7.1.51. Prove that the improper integral $\int_{x^2+y^2 \le 1} \frac{(x^2 - y^2)\,dxdy}{(x^2 + y^2)^2}$ diverges but the Cauchy principal value $\lim_{\epsilon \to 0^+} \int_{\epsilon^2 \le x^2+y^2 \le 1} \frac{(x^2 - y^2)\,dxdy}{(x^2 + y^2)^2}$ converges.

Exercise 7.1.52. Study the existence of the repeated integrals and the improper integrals.

1. $\int_{[0,\pi] \times (0,1]} \frac{\cos x}{y}\,dxdy$.

2. $\int_{(0,+\infty)^2} e^{-xy} \sin x\,dxdy$.

Exercise 7.1.53. Determine the convergence of the improper integrals and compute the convergent ones if possible.

1. $\int_{(0,1)^2} \frac{|x - y|^p}{(x + y)^q}\,dxdy$.

2. $\int_{\|\vec{x}\|_2 < 1} \frac{dxdy}{(1 - \|\vec{x}\|_2^2)^p}$.


3. $\int_{x+y+z \le 1,\, x > 0,\, y > 0,\, z > 0} \frac{z\,dxdydz}{(x + y + z)(y + z)^2}$.

4. $\int_{0 \le x < y < \pi} \log\sin(y - x)\,dxdy$.

7.1.8 Exercise

Integrability and Continuity

Exercise 7.1.54. Prove that an integrable function $f$ on $A$ must be continuous at some interior points of $A$. In fact, for any ball $B(\vec{a}, \epsilon) \subset A$, $f$ is continuous somewhere in the ball.

Exercise 7.1.55. Extend Exercise 3.1.35 to the multivariable Riemann integral. Suppose $f$ is integrable on a subset $A$ with volume. Prove that for any $\epsilon > 0$, there is a union $U$ of countably many rectangles, such that the sum of the volumes of the rectangles is $< \epsilon$, and all discontinuous points of $f$ are inside $U$.

Exercise 7.1.56. Suppose $f$ and $g$ are integrable on $A$. Prove that if $f(\vec{x}) > g(\vec{x})$ on the interior of $A$, then $\int_A f\,d\mu > \int_A g\,d\mu$.

Exercise 7.1.57. Suppose $f$ is integrable on $A$. Prove that the following are equivalent.

1. $\int_A |f|\,d\mu = 0$.

2. $\int_B f\,d\mu = 0$ for any subset $B \subset A$ with volume.

3. $\int_I f\,d\mu = 0$ for any rectangle $I \subset A$.

4. $\int_A fg\,d\mu = 0$ for any continuous function $g$ on $A$.

5. $\int_A fg\,d\mu = 0$ for any integrable function $g$ on $A$.

6. $f = 0$ at the points where $f$ is continuous.

Darboux Sum and Darboux Integral

The upper and the lower Darboux sums (3.1.18) and (3.1.19) and the upper and the lower Darboux integrals (3.1.20) may be extended to multivariable functions.

Exercise 7.1.58. For a single variable function on $[a, b]$, there are two versions of the Darboux integrals, by choosing interval partitions or by choosing partitions with subsets with volumes (lengths). Prove that the two versions are the same. The result can be extended to functions defined on rectangles in $\mathbb{R}^n$.

Exercise 7.1.59. Prove that $\overline{\int}_A f(\vec{x})\,d\mu_{\vec{x}} = \lim_{\|P\| \to 0} U(P, f)$ and $\underline{\int}_A f(\vec{x})\,d\mu_{\vec{x}} = \lim_{\|P\| \to 0} L(P, f)$.


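Exercise 7.1.60. Prove that if $f(\vec{x}) \ge 0$ on $A$, then $\overline{\int}_A f(\vec{x})\,d\mu_{\vec{x}} = \mu^+(G_A(f))$ and $\underline{\int}_A f(\vec{x})\,d\mu_{\vec{x}} = \mu^-(G_A(f))$.

Exercise 7.1.61. Prove that $\overline{\int}_A f(\vec{x})\,d\mu_{\vec{x}} \ge \underline{\int}_A f(\vec{x})\,d\mu_{\vec{x}}$. Moreover, the equality holds if and only if $f(\vec{x})$ is Riemann integrable on $A$, and $\int_A f(\vec{x})\,d\mu_{\vec{x}}$ is the common value.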

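Exercise 7.1.62. Prove that

\[ \underline{\int}_{A \times B} f(\vec{x}, \vec{y})\,d\mu_{\vec{x},\vec{y}} \le \underline{\int}_B \left( \underline{\int}_A f(\vec{x}, \vec{y})\,d\mu_{\vec{x}} \right) d\mu_{\vec{y}}, \qquad \overline{\int}_{A \times B} f(\vec{x}, \vec{y})\,d\mu_{\vec{x},\vec{y}} \ge \overline{\int}_B \left( \underline{\int}_A f(\vec{x}, \vec{y})\,d\mu_{\vec{x}} \right) d\mu_{\vec{y}}. \]

Also write down the similar inequalities with $\overline{\int}_A$ in place of $\underline{\int}_A$.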

Exercise 7.1.63. Prove the extension of the Fubini theorem: If $f$ is Riemann integrable on $A \times B$, then any function $g(\vec{y})$ satisfying $\underline{\int}_A f(\vec{x}, \vec{y})\,d\mu_{\vec{x}} \le g(\vec{y}) \le \overline{\int}_A f(\vec{x}, \vec{y})\,d\mu_{\vec{x}}$ is Riemann integrable on $B$, and $\int_{A \times B} f(\vec{x}, \vec{y})\,d\mu_{\vec{x},\vec{y}} = \int_B g(\vec{y})\,d\mu_{\vec{y}}$.

Exercise 7.1.64. Prove that if $f(\vec{x}, \vec{y})$ is integrable on $A \times B$, then the collection of $\vec{y}$ for which $f$ is not integrable in $\vec{x}$ is contained in a countable union of subsets of volume 0.

Exercise 7.1.65. Study the Darboux integrals and the repeated Darboux integrals for the functions in Exercise 7.1.29.

Integral Continuity

Exercise 7.1.66. Suppose $A$ has volume. Suppose $f$ is integrable on an open subset containing the closure $\bar{A}$. Prove that

\[ \lim_{\vec{t} \to \vec{0}} \int_A |f(\vec{x} + \vec{t}) - f(\vec{x})|\,d\mu = 0. \]

Kepler’s Second Law on Planet Motion

Kepler's second law says that the line from the sun to the planet will sweep out equal areas in equal intervals of time.

Exercise 7.1.67. Suppose $\phi(t) = (x(t), y(t)) \colon [a, b] \to \mathbb{R}^2$ is a curve, such that the movement from the vector $\phi$ to $\phi'$ is counterclockwise. Prove that the area swept by the line connecting the origin and $\phi(t)$ as $t$ moves from $a$ to $b$ is $\frac{1}{2} \int_a^b (xy' - yx')\,dt$.
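The swept-area formula of Exercise 7.1.67 can be tried on a curve whose area is known. The sketch below approximates $\frac{1}{2}\int_a^b (xy' - yx')\,dt$ by summing the signed areas of the thin triangles with vertices at the origin, $\phi(t_{i-1})$ and $\phi(t_i)$. The function names and the test ellipse with semi-axes 3 and 2 are assumptions made for illustration.

```python
import math

def swept_area(phi, a, b, steps=100_000):
    # Sum of signed triangle areas (origin, phi(t_{i-1}), phi(t_i)),
    # a Riemann-type sum for (1/2) * integral_a^b (x y' - y x') dt.
    total = 0.0
    x0, y0 = phi(a)
    for k in range(1, steps + 1):
        x1, y1 = phi(a + (b - a) * k / steps)
        total += 0.5 * (x0 * y1 - x1 * y0)
        x0, y0 = x1, y1
    return total

def ellipse(t, A=3.0, B=2.0):
    # Counterclockwise ellipse centered at the origin.
    return (A * math.cos(t), B * math.sin(t))

if __name__ == "__main__":
    print(swept_area(ellipse, 0.0, 2 * math.pi), math.pi * 3.0 * 2.0)
```

One full counterclockwise turn recovers the area $\pi \cdot 3 \cdot 2$ of the ellipse, and traversing the curve clockwise flips the sign, which is why the exercise assumes counterclockwise movement.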

Exercise 7.1.68. Derive Kepler's law from Exercise 7.1.67 and Newton's second law of motion: $\phi'' = c\frac{\phi}{\|\phi\|_2^3}$, where $c$ is a constant determined by the mass of the sun.


Dirichlet Transform

The Dirichlet transform is

\[ \begin{aligned} x_1 + x_2 + \cdots + x_n &= t_1, \\ x_2 + \cdots + x_n &= t_1 t_2, \\ &\;\;\vdots \\ x_{n-1} + x_n &= t_1 t_2 \cdots t_{n-1}, \\ x_n &= t_1 t_2 \cdots t_{n-1} t_n. \end{aligned} \]

Exercise 7.1.69. Use the Dirichlet transform to prove the relation between the Beta function and the Gamma function

\[ B(x, y) = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x + y)}. \]

Exercise 7.1.70. The standard simplex is

\[ \Delta^n = \{(x_1, x_2, \ldots, x_n) \colon x_i \ge 0,\ x_1 + x_2 + \cdots + x_n \le 1\}. \]

Use the Dirichlet transform to prove that for $p_1, p_2, \ldots, p_n > 0$, we have

\[ \int_{\Delta^n} x_1^{p_1 - 1} x_2^{p_2 - 1} \cdots x_n^{p_n - 1}\,dx_1 dx_2 \cdots dx_n = \frac{\Gamma(p_1)\Gamma(p_2) \cdots \Gamma(p_n)}{(p_1 + p_2 + \cdots + p_n)\Gamma(p_1 + p_2 + \cdots + p_n)}. \]
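Before attempting the proof, the formula of Exercise 7.1.70 can be spot-checked for $n = 2$. The sketch below integrates $x^{p_1-1}y^{p_2-1}$ over the triangle $\Delta^2$ by an iterated midpoint rule and compares with the Gamma function expression. It is only a sanity check under assumptions chosen here: $n = 2$, exponents $p_i \ge 1$ (so the integrand is bounded), and hypothetical function names.

```python
import math

def simplex_integral_2d(p1, p2, steps=400):
    # Iterated midpoint rule for the integral of x^(p1-1) * y^(p2-1)
    # over the standard simplex {x, y >= 0, x + y <= 1}.
    hx = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * hx
        hy = (1.0 - x) / steps  # inner variable y ranges over [0, 1 - x]
        inner = hy * sum(((j + 0.5) * hy) ** (p2 - 1) for j in range(steps))
        total += x ** (p1 - 1) * inner * hx
    return total

def dirichlet_formula(*ps):
    # Right-hand side of the formula in Exercise 7.1.70.
    s = sum(ps)
    return math.prod(math.gamma(p) for p in ps) / (s * math.gamma(s))

if __name__ == "__main__":
    print(simplex_integral_2d(2, 3), dirichlet_formula(2, 3))
```

For $p_1 = p_2 = 1$ both sides give the area $1/2$ of the triangle, and for $(p_1, p_2) = (2, 3)$ both give $1/60$ up to quadrature error.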

7.2 Integration on Hypersurface

The concept of volume can be defined for good geometric objects such as curves and surfaces in Euclidean spaces. Although the definition makes use of the parametrizations of the objects, the volume is independent of the parametrizations, thanks to the change of variable formula for the Riemann integral. Then the volume can be further used to define the Riemann integrals on such geometric objects.

When the Riemann integrals are used for physical quantities such as the work or the flux, we get the Riemann integral of differential forms on curves and surfaces, etc. The formalism of differential forms is compatible with orientable change of variables.

7.2.1 Rectifiable Curve

The straight line connecting $\vec{x}$ to $\vec{y}$ has length $\|\vec{x} - \vec{y}\|$. To define the length of a continuous curve $\phi \colon [a, b] \to \mathbb{R}^n$, we take a partition $P$ of $[a, b]$ and approximate the part of $\phi$ on $[t_{i-1}, t_i]$ by the straight line connecting the end points $\phi(t_{i-1})$ and $\phi(t_i)$. The length of the approximate curve is

\[ \mu_P(\phi) = \sum \|\phi(t_i) - \phi(t_{i-1})\|. \tag{7.2.1} \]

We say $\phi$ is rectifiable if $\mu_P(\phi)$ has an upper bound. The length of a rectifiable curve is

\[ \mu(\phi) = \sup_P \mu_P(\phi). \tag{7.2.2} \]


Proposition 7.2.1. A curve is rectifiable if and only if its coordinate functions have bounded variations.

Proof. Because all norms are equivalent, the rectifiability is independent of the choice of the norm. If we take the $L^1$-norm, then

\[ \|\phi(t_i) - \phi(t_{i-1})\|_1 = |x_1(t_i) - x_1(t_{i-1})| + |x_2(t_i) - x_2(t_{i-1})| + \cdots + |x_n(t_i) - x_n(t_{i-1})|, \]

and

\[ \mu_P(\phi) = V_P(x_1) + V_P(x_2) + \cdots + V_P(x_n). \]

The proposition then follows.

A change of variable (or reparametrization) for a parametrized curve is an invertible continuous map $u \colon [c, d] \to [a, b]$. The map induces a one-to-one correspondence between the partitions $Q$ of $[c, d]$ and partitions $P = u(Q)$ of $[a, b]$. Since $\mu_P(\phi) = \mu_Q(\phi \circ u)$, and $\|P\|$ is small if and only if $\|Q\|$ is small, the rectifiability and the length are independent of the choice of parametrization.

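Proposition 7.2.2. Suppose $\Psi \colon [a, b] \to \mathbb{R}^n$ is Riemann integrable. Then $\phi(t) = \vec{x}_0 + \int_a^t \Psi(\tau)\,d\tau$ is a rectifiable curve, and its length is $\int_a^b \|\Psi(t)\|\,dt$. If $\phi$ is continuously differentiable, then $\Psi = \phi'$ and

\[ \mu(\phi) = \int_a^b \|\phi'(t)\|\,dt. \tag{7.2.3} \]

Proof. For any partition $P$ of $[a, b]$ and choices $t_i^* \in [t_{i-1}, t_i]$, we have

\[ \begin{aligned} |\mu_P(\phi) - S(P, \|\Psi(t)\|)| &\le \sum \left| \left\| \int_{t_{i-1}}^{t_i} \Psi(t)\,dt \right\| - \|\Psi(t_i^*)\| \Delta t_i \right| \\ &\le \sum \left\| \int_{t_{i-1}}^{t_i} \Psi(t)\,dt - \Delta t_i \Psi(t_i^*) \right\| \\ &\le \sum \sup_{t \in [t_{i-1}, t_i]} \|\Psi(t) - \Psi(t_i^*)\| \Delta t_i. \end{aligned} \]

The integrability of $\Psi$ shows the right side is small when $\|P\|$ is small.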

Example 7.2.1. The graph of a function $f(x)$ on $[a, b]$ is parametrized by $\phi(x) = (x, f(x))$. If $f$ is continuously differentiable, then the Euclidean length of the graph is $\int_a^b \sqrt{1 + f'^2}\,dx$ and the $L^1$-length is $\int_a^b (1 + |f'|)\,dx$.


Example 7.2.2. Consider the following "circular curves".

\[ \begin{aligned} \phi_1(t) &= (\cos t, \sin t), & 0 \le t \le 2\pi, \\ \phi_2(t) &= (\cos t, \sin t), & 0 \le t \le 4\pi, \\ \phi_3(t) &= (\cos 2t, \sin 2t), & 0 \le t \le 2\pi, \\ \phi_4(t) &= (\cos t, -\sin t), & 0 \le t \le 2\pi, \\ \phi_5(t) &= \begin{cases} (\cos t, \sin t) & \text{if } 0 \le t \le \pi, \\ (\cos t, -\sin t) & \text{if } \pi < t \le 2\pi. \end{cases} \end{aligned} \]

Although the images of both $\phi_1$ and $\phi_2$ are the whole unit circle, the lengths of $\phi_1$ and $\phi_2$ are $2\pi$ and $4\pi$. As a matter of fact, $\phi_1$ wraps around the circle once, and $\phi_2$ wraps around twice, and we cannot reparametrize $\phi_1$ to get $\phi_2$. Therefore the length of a curve depends on the "track" instead of the image only.

The curve $\phi_3$ is a reparametrization of $\phi_2$ by $t \to 2t$. Therefore the length of $\phi_3$ is also $4\pi$. The curve $\phi_4$ is a reparametrization of $\phi_1$ by $t \to 2\pi - t$. Therefore the length of $\phi_4$ is also $2\pi$.

The image of the curve $\phi_5$ is the upper half of the unit circle. The curve first moves from $(1, 0)$ to $(-1, 0)$ along the circle and then moves back to $(1, 0)$ along the circle. The total distance $2\pi$ of the movement is the length of the curve.

Example 7.2.3. The astroid $x^{\frac{2}{3}} + y^{\frac{2}{3}} = a^{\frac{2}{3}}$ can be parametrized as $x = a\cos^3 t$, $y = a\sin^3 t$ for $0 \le t \le 2\pi$. The Euclidean length of the curve is

\[ \int_0^{2\pi} \sqrt{(-3a\cos^2 t \sin t)^2 + (3a\sin^2 t \cos t)^2}\,dt = 6a. \]

The $L^\infty$-length is

\[ \int_0^{2\pi} \max\{|-3a\cos^2 t \sin t|, |3a\sin^2 t \cos t|\}\,dt = 8\left(1 - \frac{1}{2\sqrt{2}}\right)a. \]

Note that the equality $x^{\frac{2}{3}} + y^{\frac{2}{3}} = a^{\frac{2}{3}}$ only specifies a subset of $\mathbb{R}^2$, which is the image of the astroid. Strictly speaking, the image alone is not sufficient to determine the length, as indicated by Example 7.2.2. On the other hand, we do implicitly understand that the question is to compute the length of one round of the astroid, which is indicated by the range $[0, 2\pi]$ for the parametrization.
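The definition (7.2.1) of $\mu_P(\phi)$ translates directly into a computation, which gives an independent check of the lengths in Examples 7.2.2 and 7.2.3. The sketch below uses uniform partitions only. The function names are hypothetical and $a = 1$ is assumed for the astroid.

```python
import math

def polygon_length(phi, a, b, n, norm):
    # mu_P(phi) for the uniform partition P of [a, b] into n intervals.
    pts = [phi(a + (b - a) * k / n) for k in range(n + 1)]
    return sum(norm((q[0] - p[0], q[1] - p[1])) for p, q in zip(pts, pts[1:]))

def euclid(v):
    return math.hypot(v[0], v[1])

def sup_norm(v):
    return max(abs(v[0]), abs(v[1]))

def circle(t):
    return (math.cos(t), math.sin(t))

def astroid(t, a=1.0):
    return (a * math.cos(t) ** 3, a * math.sin(t) ** 3)

if __name__ == "__main__":
    n = 20_000
    print(polygon_length(circle, 0, 2 * math.pi, n, euclid))     # phi_1
    print(polygon_length(circle, 0, 4 * math.pi, n, euclid))     # phi_2
    print(polygon_length(astroid, 0, 2 * math.pi, n, euclid))    # Euclidean length
    print(polygon_length(astroid, 0, 2 * math.pi, n, sup_norm))  # L-infinity length
```

The same image (the unit circle) yields values near $2\pi$ or $4\pi$ depending on the parameter range, and refining the partition can only increase $\mu_P$, in line with the supremum in (7.2.2).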

Exercise 7.2.1. Prove that the $L^1$-length of a rectifiable curve is the sum of the variations of the coordinates on the interval.

Exercise 7.2.2. Prove that if $\phi$ is a rectifiable curve on $[a, b]$, then $\mu(\phi) = \mu(\phi|_{[a,c]}) + \mu(\phi|_{[c,b]})$ for any $a < c < b$.

Exercise 7.2.3. Prove that the image of any rectifiable curve in $\mathbb{R}^n$, $n \ge 2$, has $n$-dimensional volume zero.

Exercise 7.2.4. Suppose $\phi$ is a rectifiable curve. Prove that for any $\epsilon > 0$, there is $\delta > 0$, such that $\|P\| < \delta$ implies $\mu_P(\phi) > \mu(\phi) - \epsilon$. This implies $\lim_{\|P\| \to 0} \mu_P(\phi) = \mu(\phi)$.

Exercise 7.2.5. Suppose $F \colon \mathbb{R}^n \to \mathbb{R}^n$ is a map satisfying $\|F(\vec{x}) - F(\vec{y})\| = \|\vec{x} - \vec{y}\|$. Prove that $\mu(F(C)) = \mu(C)$.

Exercise 7.2.6. Suppose $\phi(t)$ is a continuously differentiable curve on $[a, b]$. For any partition $P$ of $[a, b]$ and choice $t_i^*$, the curve is approximated by the straight lines $\phi(t_i^*) + \phi'(t_i^*)(t - t_i^*)$ on the intervals $[t_{i-1}, t_i]$. Prove that the sum of the lengths of the tangent line segments has the length of $\phi$ as the limit as $\|P\| \to 0$.


Given a parametrized curve $\phi(t) \colon [a, b] \to \mathbb{R}^n$, the arc length function

\[ s(t) = \mu(\phi|_{[a,t]}) \tag{7.2.4} \]

is increasing. By Exercise 3.3.33 and the relation between $\mu_P(\phi)$ and the variation, $s(t)$ is also continuous. If $\phi$ is not constant on any interval in $[a, b]$, then $s(t)$ is strictly increasing, and the curve can be reparametrized by the arc length. In general, we may identify the intervals on which $\phi$ is constant and modify the parametrization by reducing the intervals to single points. This does not change the arc length and the "track" of $\phi$, and therefore will not affect any of the later developments. Thus without loss of generality, we will assume that all curves can be reparametrized by the arc length.

Suppose $\phi$ is continuously differentiable. Then $s(t) = \int_a^t \|\phi'(\tau)\|\,d\tau$ by (7.2.3), or $ds = \|\phi'(t)\|\,dt$. In the special case $t = s$ is the arc length, we get

\[ \|\phi'(s)\| = 1. \]

In other words, $\phi'(s)$ is the tangent vector of unit length.

Geometrically, $s(t) = \mu(\phi|_{[a,t]})$ is the arc length counted from the beginning point $\phi(a)$ of the curve. It is also possible to count the length $\bar{s}(t) = \mu(\phi|_{[t,b]})$ from the end point $\phi(b)$ of the curve. The two arc lengths satisfy $s + \bar{s} = \mu(C)$ and represent different choices of the directions or orientations of the curve. The arc length $s$ is compatible with the orientation of the parameter $t$ and $\bar{s}$ is opposite to the orientation of $t$. The choice of direction will not affect the length, but will affect certain integrals in the future.

Example 7.2.4. For the circle $\phi(\theta) = (a\cos\theta, a\sin\theta)$, the Euclidean arc length (counted from $\theta = 0$) is

\[ s = \int_0^\theta \|\phi'(t)\|_2\,dt = \int_0^\theta \sqrt{a^2\sin^2 t + a^2\cos^2 t}\,dt = a\theta. \]

Therefore the circle is parametrized as $\phi(s) = \left( a\cos\frac{s}{a}, a\sin\frac{s}{a} \right)$ by the arc length. On the other hand, with respect to the $L^1$-norm, we have

\[ s = \int_0^\theta \|\phi'(t)\|_1\,dt = \int_0^\theta |a|(|\sin t| + |\cos t|)\,dt = |a|(1 - \cos\theta + \sin\theta), \quad \text{for } 0 \le \theta \le \frac{\pi}{2}. \]
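The arc length functions of Example 7.2.4 can be evaluated numerically, which also shows why the $L^1$ formula carries the restriction $0 \le \theta \le \frac{\pi}{2}$. In the sketch below the names `arc_length`, `l1`, `l2` and `l1_formula` are hypothetical, and the midpoint rule is an arbitrary choice of quadrature.

```python
import math

def arc_length(a, theta, norm, steps=100_000):
    # Midpoint rule for s(theta) = integral_0^theta ||phi'(t)|| dt, where
    # phi(t) = (a cos t, a sin t) and phi'(t) = (-a sin t, a cos t).
    h = theta / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += norm((-a * math.sin(t), a * math.cos(t)))
    return total * h

def l1(v):
    return abs(v[0]) + abs(v[1])

def l2(v):
    return math.hypot(v[0], v[1])

def l1_formula(a, theta):
    # Closed form from the example, stated there for 0 <= theta <= pi/2.
    return abs(a) * (1 - math.cos(theta) + math.sin(theta))

if __name__ == "__main__":
    for theta in (0.3, 1.0, math.pi / 2, math.pi):
        print(theta, arc_length(1.0, theta, l1), l1_formula(1.0, theta))
```

Up to $\theta = \frac{\pi}{2}$ the two columns agree. At $\theta = \pi$ the numerical $L^1$ arc length is $4$ while the formula returns $2$, because $|\cos t|$ no longer equals $\cos t$ on $[\frac{\pi}{2}, \pi]$.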

Exercise 7.2.7. Find the formula for the arc length of a curve in $\mathbb{R}^2$ from the parametrized polar coordinates $r = r(t)$, $\theta = \theta(t)$.

Exercise 7.2.8. Compute the Euclidean lengths of the curves.

1. Parabola $y = x^2$, $0 \le x \le 1$.

2. Spiral $r = a\theta$, $0 \le \theta \le \pi$.

3. Another spiral $r = e^{a\theta}$, $0 \le \theta \le \alpha$.

4. Cycloid $x = a(t - \sin t)$, $y = a(1 - \cos t)$, $0 \le t \le 2\pi$.


5. Cardioid $r = 2a(1 + \cos\theta)$.

6. Involute of the unit circle $x = \cos\theta + \theta\sin\theta$, $y = \sin\theta - \theta\cos\theta$, $0 \le \theta \le \alpha$.

7. Helix $x = a\cos\theta$, $y = a\sin\theta$, $z = b\theta$, $0 \le \theta \le \alpha$.

We will often use capital letters such as $C$ to denote curves. By this we mean that $C$ is presented by a parametrization but any reparametrization is equally good. Thus different parametrizations are considered to give the same curve $C$ if they are related by invertible continuous maps. Moreover, $C$ can be parametrized by arc length $s$, which means that only parametrizations that are not constant on any interval are allowed. Thus we have the length $\mu(C)$ without any ambiguity. Finally, sometimes we may wish to distinguish the two choices of the direction $C$ may take. This is achieved by restricting the reparametrizations to strictly increasing functions only. This is also equivalent to the choice of one of two possible arc lengths. We will indicate one choice by $C$ and the other choice by $-C$.

7.2.2 Integration of Function on Curve

Suppose $f(\vec{x})$ is a function defined along a rectifiable curve $C$. Let $\phi \colon [a, b] \to \mathbb{R}^n$ be a parametrization of $C$. For a partition $Q$ of $[a, b]$, we get a partition $P$ of $C$ by the segments $C_i$ between $\vec{x}_{i-1} = \phi(t_{i-1})$ and $\vec{x}_i = \phi(t_i)$. For choices $t_i^* \in [t_{i-1}, t_i]$, we get $\vec{x}_i^* = \phi(t_i^*) \in C_i$. Define the Riemann sum

\[ S(P, f) = \sum f(\phi(t_i^*))\mu(\phi|_{[t_{i-1},t_i]}) = \sum f(\vec{x}_i^*)\mu(C_i). \tag{7.2.5} \]

The function is Riemann integrable along the curve, with Riemann integral $I = \int_C f\,ds$, if for any $\epsilon > 0$, there is $\delta > 0$, such that

\[ \|P\| = \max \mu(C_i) < \delta \implies |S(P, f) - I| < \epsilon. \]

The definition is independent of the choice of parametrization. Moreover, the definition can be extended to $\int_C F\,ds$ for maps $F \colon \mathbb{R}^n \to \mathbb{R}^m$.

The integral has the usual properties of the Riemann integral of single variable functions.

1. $f$ is integrable if and only if for any $\epsilon > 0$, there is $\delta > 0$, such that

\[ \|P\| < \delta \implies \sum \omega_{C_i}(f)\mu(C_i) < \epsilon. \]

2. Continuous functions are integrable. Monotone functions (the concept is defined along curves) are integrable. If $f$ is integrable and has value in a compact subset $K$, and $g$ is continuous on $K$, then $g \circ f$ is integrable.

3. The sum and the product of integrable functions are integrable, and

\[ \int_C (f + g)\,ds = \int_C f\,ds + \int_C g\,ds, \qquad \int_C cf\,ds = c\int_C f\,ds. \]


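Moreover,

\[ f \le g \implies \int_C f\,ds \le \int_C g\,ds. \]

4. If $C$ is divided into two parts $C_1$ and $C_2$, then a function is integrable on $C$ if and only if it is integrable on $C_1$ and $C_2$. Moreover,

\[ \int_C f\,ds = \int_{C_1} f\,ds + \int_{C_2} f\,ds. \]

Let $\phi(t) = \vec{x}_0 + \int_a^t \Psi(\tau)\,d\tau$ as in Proposition 7.2.2. Then for a partition $Q$ of $[a, b]$ and $P = \phi(Q)$, we have

\[ \begin{aligned} |S(P, f) - S(Q, (f \circ \phi)\|\Psi\|)| &= \left| \sum \left( f(\vec{x}_i^*) \int_{t_{i-1}}^{t_i} \|\Psi(t)\|\,dt - f(\phi(t_i^*))\|\Psi(t_i^*)\|\Delta t_i \right) \right| \\ &\le \sum |f(\phi(t_i^*))| \left| \int_{t_{i-1}}^{t_i} \|\Psi(t)\|\,dt - \|\Psi(t_i^*)\|\Delta t_i \right| \\ &\le \sum |f(\phi(t_i^*))| \sup_{t \in [t_{i-1}, t_i]} \|\Psi(t) - \Psi(t_i^*)\|\Delta t_i \\ &\le \sum |f(\phi(t_i^*))|\,\omega_{[t_{i-1},t_i]}(\Psi)\Delta t_i. \end{aligned} \]

The integrability of $\Psi$ (see Exercise 7.1.14) and the boundedness of $f$ imply that the right side is very small when $\|Q\|$ is very small (which implies $\|P\|$ is very small). Therefore we conclude that

\[ \int_C f\,ds = \int_a^b f(\phi(t))\|\Psi(t)\|\,dt. \tag{7.2.6} \]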

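Example 7.2.5. We try to compute the integral of a linear function $l(\vec{x}) = \vec{a} \cdot \vec{x}$ along the straight line $C$ connecting $\vec{u}$ to $\vec{v}$. The straight line can be parametrized as $\phi(t) = \vec{u} + t(\vec{v} - \vec{u})$, with $ds = \|\vec{v} - \vec{u}\|\,dt$. Therefore

\[ \int_C \vec{a} \cdot \vec{x}\,ds = \int_0^1 \vec{a} \cdot (\vec{u} + t(\vec{v} - \vec{u}))\|\vec{v} - \vec{u}\|\,dt = \frac{1}{2}(\vec{a} \cdot (\vec{u} + \vec{v}))\|\vec{v} - \vec{u}\|. \]

The result applies to any norm.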
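Formula (7.2.6) and the closed form of Example 7.2.5 can be compared with a direct Riemann sum. The sketch below is an illustration only: the vectors in the usage line are arbitrary sample data, and all function names are hypothetical.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def segment_integral(a_vec, u, v, norm, steps=1_000):
    # Riemann sum for the integral of a.x ds along phi(t) = u + t (v - u),
    # 0 <= t <= 1, where ds = ||v - u|| dt for the chosen norm.
    d = [vi - ui for ui, vi in zip(u, v)]
    speed = norm(d)
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h  # midpoint sample t_i^*
        x = [ui + t * di for ui, di in zip(u, d)]
        total += dot(a_vec, x) * speed * h
    return total

def closed_form(a_vec, u, v, norm):
    # (1/2) (a . (u + v)) ||v - u||
    d = [vi - ui for ui, vi in zip(u, v)]
    return 0.5 * dot(a_vec, [ui + vi for ui, vi in zip(u, v)]) * norm(d)

def euclid(w):
    return math.sqrt(sum(c * c for c in w))

def l1(w):
    return sum(abs(c) for c in w)

if __name__ == "__main__":
    a_vec, u, v = (1.0, 2.0, -1.0), (0.0, 1.0, 2.0), (3.0, -1.0, 4.0)
    for norm in (euclid, l1):
        print(segment_integral(a_vec, u, v, norm), closed_form(a_vec, u, v, norm))
```

Only the factor $\|\vec{v} - \vec{u}\|$ changes with the norm, which is the sense in which the result applies to any norm.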

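Example 7.2.6. The integral of $|y|$ along the unit circle with respect to the Euclidean norm is

\[ \int_{x^2+y^2=1} |y|\,ds = \int_0^{2\pi} |\sin\theta|\,d\theta = 4\int_0^{\frac{\pi}{2}} |\sin\theta|\,d\theta = 4. \]

The integral with respect to the $L^1$-norm is

\[ \int_{x^2+y^2=1} |y|\,ds = \int_0^{2\pi} |\sin\theta|(|\sin\theta| + |\cos\theta|)\,d\theta = 4\int_0^{\frac{\pi}{2}} \sin\theta(\sin\theta + \cos\theta)\,d\theta = 4\left( \frac{\pi}{4} + \frac{1}{2} \right) = \pi + 2. \]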
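Both integrals of Example 7.2.6 are easy to evaluate numerically from (7.2.6). For the $L^1$-norm the integrand is $|\sin\theta|(|\sin\theta| + |\cos\theta|)$, and its integral over $[0, 2\pi]$ is $4\left(\frac{\pi}{4} + \frac{1}{2}\right) = \pi + 2$. The sketch below uses a midpoint rule, and the function names are assumptions of this illustration.

```python
import math

def abs_y_on_circle(norm, steps=200_000):
    # Midpoint rule for integral_0^{2 pi} |sin t| * ||phi'(t)|| dt,
    # with phi(t) = (cos t, sin t) and phi'(t) = (-sin t, cos t).
    h = 2 * math.pi / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += abs(math.sin(t)) * norm((-math.sin(t), math.cos(t)))
    return total * h

def euclid(v):
    return math.hypot(v[0], v[1])

def l1(v):
    return abs(v[0]) + abs(v[1])

if __name__ == "__main__":
    print(abs_y_on_circle(euclid), 4.0)       # Euclidean norm
    print(abs_y_on_circle(l1), math.pi + 2)   # L1-norm
```

The Euclidean value is $4$ and the $L^1$ value is $\pi + 2 \approx 5.14$, so the integral along a curve genuinely depends on the norm used to measure length.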


Example 7.2.7. The integrals of $|y|$ along the curves $C_1, C_2, C_3, C_4, C_5$ in Example 7.2.2 with respect to the Euclidean norm are

\[ \begin{aligned} \int_{C_1} |y|\,ds &= \int_{C_4} |y|\,ds = \int_0^{2\pi} |\sin t|\sqrt{\sin^2 t + \cos^2 t}\,dt = 4, \\ \int_{C_2} |y|\,ds &= \int_{C_3} |y|\,ds = \int_{\phi_2|_{[0,2\pi]}} |y|\,ds + \int_{\phi_2|_{[2\pi,4\pi]}} |y|\,ds = 2\int_{C_1} |y|\,ds = 8, \\ \int_{C_5} |y|\,ds &= \int_{\phi_5|_{[0,\pi]}} |y|\,ds + \int_{\phi_5|_{[\pi,2\pi]}} |y|\,ds = 2\int_{\phi_5|_{[0,\pi]}} |y|\,ds = 4. \end{aligned} \]

Note that $C_2$ may be split into two parts for $0 \le t \le 2\pi$ and $2\pi \le t \le 4\pi$. The first part is $C_1$ and the second part is also $C_1$ but reparametrized by $t \to t + 2\pi$. Therefore the integral on $C_2$ is the double of the integral on $C_1$.

The curve $C_5$ may be split into two parts for $0 \le t \le \pi$ and $\pi \le t \le 2\pi$. The second part is the first part reparametrized by $t \to 2\pi - t$. Thus the integral on $C_5$, which is the sum of the two parts, is the double of the first part.

Exercise 7.2.9. Compute the integrals along the curve with respect to the Euclidean norm.

1. $\int_C |y|\,ds$, $C$ is the unit circle $x^2 + y^2 = 1$.

2. $\int_C xy\,ds$, $C$ is the part of the ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ in the first quadrant.

3. $\int_C (x^{\frac{4}{3}} + y^{\frac{4}{3}})\,ds$, $C$ is the astroid $x^{\frac{2}{3}} + y^{\frac{2}{3}} = a^{\frac{2}{3}}$.

4. $\int_C (a_1 x + a_2 y + a_3 z)\,ds$, $C$ is the circle $x^2 + y^2 + z^2 = 1$, $b_1 x + b_2 y + b_3 z = 0$.

Exercise 7.2.10. Suppose $f$ is a continuous function along a curve $C$. Prove that there is $\vec{c} \in C$, such that $\int_C f\,ds = f(\vec{c})\mu(C)$.

Exercise 7.2.11. Suppose $F \colon \mathbb{R}^n \to \mathbb{R}^n$ is a map satisfying $\|F(\vec{x}) - F(\vec{y})\| = \|\vec{x} - \vec{y}\|$. Prove that $\int_{F(C)} f(\vec{x})\,ds = \int_C f(F(\vec{x}))\,ds$.

Exercise 7.2.12. Suppose a curve $C$ is parametrized by $\phi(t)$. Then the arc length $s(t)$ is increasing. Prove that the Riemann integral $\int_C f\,ds$ is the same as the Riemann-Stieltjes integral $\int_a^b f(\phi(t))\,ds(t)$.

Exercise 7.2.13. Suppose $f$ is a Riemann integrable function along a rectifiable curve $C$ connecting $\vec{a}$ to $\vec{b}$. For any $\vec{x} \in C$, denote by $C[\vec{a}, \vec{x}]$ the part of $C$ between $\vec{a}$ and $\vec{x}$. Then $F(\vec{x}) = \int_{C[\vec{a},\vec{x}]} f\,ds$ can be considered as the "antiderivative" of $f$ along $C$. By using the concept, state and prove the integration by parts formula for the integral along $C$.


7.2.3 Integration of 1-Form on Curve

In physics, the force can be represented as a vector. The work done by a constant force $F$ along the straight line from $\vec{a}$ to $\vec{b}$ is $F \cdot (\vec{b} - \vec{a})$. Now consider the work done by a changing force along a curve. Specifically, consider a vector field $F \colon C \to \mathbb{R}^n$ along a rectifiable curve $C$. For a partition $P$ of $C$ and choices $\vec{x}_i^* \in C_i$, define the Riemann sum

\[ S(P, F) = \sum F(\vec{x}_i^*) \cdot (\vec{x}_i - \vec{x}_{i-1}). \tag{7.2.7} \]

The integral $\int_C F \cdot d\vec{x}$ is defined as the limit of the Riemann sum as $\|P\| \to 0$.

It follows directly from the definition that the integral $\int_C F \cdot d\vec{x}$ is not changed by a reparametrization if the direction of the curve is preserved. Moreover, the integral changes sign if the direction is reversed

\[ \int_{-C} F \cdot d\vec{x} = -\int_C F \cdot d\vec{x}. \tag{7.2.8} \]

The equality can be compared with $\int_b^a f\,dx = -\int_a^b f\,dx$.

Suppose $F = (f_1, f_2, \ldots, f_n)$ and $C$ is parametrized by $\phi = (x_1, x_2, \ldots, x_n) \colon [a, b] \to \mathbb{R}^n$. For a partition $Q$ of $[a, b]$ and choices $t_i^*$, the corresponding partition $P = \phi(Q)$ and choices $\vec{x}_i^* = \phi(t_i^*)$, we have

\[ \begin{aligned} S(P, F) &= \sum F(\phi(t_i^*)) \cdot (\phi(t_i) - \phi(t_{i-1})) \\ &= \sum \left( f_1(\phi(t_i^*))\Delta x_{1i} + f_2(\phi(t_i^*))\Delta x_{2i} + \cdots + f_n(\phi(t_i^*))\Delta x_{ni} \right) \\ &= S(Q, f_1 \circ \phi, x_1) + S(Q, f_2 \circ \phi, x_2) + \cdots + S(Q, f_n \circ \phi, x_n), \end{aligned} \]

which is the sum of the Riemann-Stieltjes sums of $f_i \circ \phi$ with respect to the continuous functions $x_i(t)$ with bounded variation. Therefore

\[ \int_C F \cdot d\vec{x} = \int_a^b f_1(\phi(t))\,dx_1(t) + \int_a^b f_2(\phi(t))\,dx_2(t) + \cdots + \int_a^b f_n(\phi(t))\,dx_n(t). \]

Because of the connection to the Riemann-Stieltjes integral, we also denote

\[ \int_C F \cdot d\vec{x} = \int_C f_1\,dx_1 + f_2\,dx_2 + \cdots + f_n\,dx_n. \tag{7.2.9} \]

The expression

\[ F \cdot d\vec{x} = f_1\,dx_1 + f_2\,dx_2 + \cdots + f_n\,dx_n \tag{7.2.10} \]

is called a 1-form (a differential form of order 1).

The relation to the Riemann-Stieltjes integrals allows us to establish properties for the integral of a 1-form on a curve.


1. By Theorem 3.3.3, if $\phi(t) = \vec{x}_0 + \int_a^t \Psi(\tau)\,d\tau$ as in Proposition 7.2.2, then the 1-form $F \cdot d\vec{x}$ is integrable if and only if $F \cdot \Psi$ is Riemann integrable, and

\[ \int_C F \cdot d\vec{x} = \int_a^b F(\phi(t)) \cdot \Psi(t)\,dt. \tag{7.2.11} \]

2. If $F$ is continuous, then the 1-form $F \cdot d\vec{x}$ is integrable. If $F$ is integrable and has value in a compact subset $K$, and $G$ is continuous on $K$, then $G \circ F$ is integrable.

3. The sum and scalar multiplication of integrable 1-forms are integrable, and

\[ \int_C (F + G) \cdot d\vec{x} = \int_C F \cdot d\vec{x} + \int_C G \cdot d\vec{x}, \qquad \int_C cF \cdot d\vec{x} = c\int_C F \cdot d\vec{x}. \]

Moreover,

\[ \left| \int_C F \cdot d\vec{x} \right| \le \left( \sup_C \|F\|_2 \right) \mu(C), \]

where $\mu(C)$ is the Euclidean length.

4. If $C$ is divided into two parts $C_1$ and $C_2$, then a 1-form is integrable on $C$ if and only if it is integrable on $C_1$ and $C_2$. Moreover,

\[ \int_C F \cdot d\vec{x} = \int_{C_1} F \cdot d\vec{x} + \int_{C_2} F \cdot d\vec{x}. \]

Example 7.2.8. Consider three curves connecting $(0, 0)$ to $(1, 1)$. The curve $C_1$ is the straight line $\phi(t) = (t, t)$. The curve $C_2$ is the parabola $\phi(t) = (t, t^2)$. The curve $C_3$ is the straight line from $(0, 0)$ to $(1, 0)$ followed by the straight line from $(1, 0)$ to $(1, 1)$. Then

$$
\begin{aligned}
\int_{C_1} ydx + xdy &= \int_0^1 (tdt + tdt) = 1, \\
\int_{C_2} ydx + xdy &= \int_0^1 (t^2dt + t \cdot 2tdt) = 1, \\
\int_{C_3} ydx + xdy &= \int_0^1 0dx + \int_0^1 1dy = 1.
\end{aligned}
$$

We note that the result is independent of the choice of the curve. In fact, for any continuously differentiable curve $\phi(t) = (x(t), y(t))$, $t \in [a, b]$, connecting $(0, 0)$ to $(1, 1)$, we have

$$
\int_C ydx + xdy = \int_a^b (y(t)x'(t) + x(t)y'(t))dt = \int_a^b (x(t)y(t))'dt = x(b)y(b) - x(a)y(a) = 1 - 0 = 1.
$$


Example 7.2.9. Taking the three curves in Example 7.2.8 again, we have

$$
\begin{aligned}
\int_{C_1} xydx + (x + y)dy &= \int_0^1 (t^2dt + 2tdt) = \frac{4}{3}, \\
\int_{C_2} xydx + (x + y)dy &= \int_0^1 (t^3dt + (t + t^2)2tdt) = \frac{17}{12}, \\
\int_{C_3} xydx + (x + y)dy &= \int_0^1 0dx + \int_0^1 (1 + y)dy = \frac{3}{2}.
\end{aligned}
$$

In contrast to the integral of $ydx + xdy$, the integral of $xydx + (x + y)dy$ depends on the curve connecting the two points.
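The path independence in Example 7.2.8 and the path dependence in Example 7.2.9 can be checked numerically. The following Python sketch approximates $\int_C fdx + gdy$ by a Riemann-type sum over short chords of the curve. The names `line_integral`, `c1`, `c2`, `c3`, `exact_form`, `other_form` and the step count are illustrative assumptions, not notation from the text, and the curve $C_3$ is parametrized over $[0, 2]$ purely for convenience.

```python
def line_integral(form, curve, a, b, steps=20000):
    """Approximate the integral of f dx + g dy along curve(t), a <= t <= b.

    form(x, y) returns the coefficient pair (f, g); curve(t) returns (x, y).
    Each chord contributes f * dx + g * dy, with (f, g) sampled at the
    midpoint of the chord.
    """
    h = (b - a) / steps
    x0, y0 = curve(a)
    total = 0.0
    for i in range(1, steps + 1):
        x1, y1 = curve(a + i * h)
        f, g = form((x0 + x1) / 2, (y0 + y1) / 2)
        total += f * (x1 - x0) + g * (y1 - y0)
        x0, y0 = x1, y1
    return total


def c1(t):
    # Straight line from (0, 0) to (1, 1), 0 <= t <= 1.
    return (t, t)


def c2(t):
    # Parabola from (0, 0) to (1, 1), 0 <= t <= 1.
    return (t, t * t)


def c3(t):
    # (0, 0) -> (1, 0) for 0 <= t <= 1, then (1, 0) -> (1, 1) for 1 <= t <= 2.
    return (t, 0.0) if t <= 1 else (1.0, t - 1)


def exact_form(x, y):
    # The 1-form y dx + x dy of Example 7.2.8.
    return (y, x)


def other_form(x, y):
    # The 1-form xy dx + (x + y) dy of Example 7.2.9.
    return (x * y, x + y)


for name, curve, end in (("C1", c1, 1), ("C2", c2, 1), ("C3", c3, 2)):
    print(name,
          line_integral(exact_form, curve, 0, end),
          line_integral(other_form, curve, 0, end))
```

Up to discretization error, the first column should agree with the common value $1$, and the second column with $\frac{4}{3}$, $\frac{17}{12}$, $\frac{3}{2}$.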

Example 7.2.10. Consider the integral of $F = (-y, x)$ on the four parametrized curves in Example 7.2.2.

$$
\begin{aligned}
\int_{C_1} F \cdot d\vec{x} &= \int_0^{2\pi} (-\sin t, \cos t) \cdot (-\sin t, \cos t)dt = \int_0^{2\pi} dt = 2\pi, \\
\int_{C_2} F \cdot d\vec{x} &= \int_0^{4\pi} (-\sin t, \cos t) \cdot (-\sin t, \cos t)dt = \int_0^{4\pi} dt = 4\pi, \\
\int_{C_3} F \cdot d\vec{x} &= \int_0^{2\pi} (-\sin 2t, \cos 2t) \cdot (-2\sin 2t, 2\cos 2t)dt = \int_0^{2\pi} 2dt = 4\pi, \\
\int_{C_4} F \cdot d\vec{x} &= \int_0^{2\pi} (\sin t, \cos t) \cdot (-\sin t, -\cos t)dt = -\int_0^{2\pi} dt = -2\pi, \\
\int_{C_5} F \cdot d\vec{x} &= \int_0^{\pi} (-\sin t, \cos t) \cdot (-\sin t, \cos t)dt + \int_{\pi}^{2\pi} (\sin t, \cos t) \cdot (-\sin t, -\cos t)dt \\
&= \int_0^{\pi} dt - \int_{\pi}^{2\pi} dt = 0.
\end{aligned}
$$

The discussion in Example 7.2.7 on the relation between the integrals on the four curves still applies here. Note that the reparametrizations $t \to t + 2\pi$ in $C_2$ and $t \to \frac{t}{2}$ in $C_3$ preserve the direction, and the reparametrization $t \to 2\pi - t$ in $C_4$ and $C_5$ reverses the direction. In particular, the integral on $C_5$, which is the sum of the two parts, must be zero.

Exercise 7.2.14. Compute the integrals of the 1-forms on the three curves in Example 7.2.8. In case the three integrals are the same, can you provide a general reason?

1. $xdx + ydy$.

2. $ydx - xdy$.

3. $(2x + ay)ydx + (x + by)xdy$.

4. $e^x(ydx + ady)$.

Exercise 7.2.15. Compute the integrals of the 1-forms.

1. $\int_C \frac{ydx - xdy}{x^2 + y^2}$, $C$ is the upper half circle in the counterclockwise direction.

2. $\int_C (2a - y)dx + dy$, $C$ is the cycloid $x = at - b\sin t$, $y = a - b\cos t$, $0 \le t \le 2\pi$.

3. $\int_C xdy + ydz + zdx$, $C$ is the straight line connecting $(0, 0, 0)$ to $(1, 1, 1)$.


4. $\int_C (x + z)dx + (y + z)dy + (x - y)dz$, $C$ is the helix $x = \cos t$, $y = \sin t$, $z = t$, $0 \le t \le \alpha$.

5. $\int_C (y^2 - z^2)dx + (z^2 - x^2)dy + (x^2 - y^2)dz$, $C$ is the boundary of the intersection of the sphere $x^2 + y^2 + z^2 = 1$ with the first octant, and the direction is the $(x, y)$-plane part, followed by the $(y, z)$-plane part, followed by the $(z, x)$-plane part.

Exercise 7.2.16. Suppose $\vec{a}$ is a constant vector and $U$ is an orthogonal linear transform. Prove that $\int_{\vec{a} + U(C)} F(\vec{x}) \cdot d\vec{x} = \int_C F(\vec{a} + U(\vec{x})) \cdot d\vec{x}$.

Exercise 7.2.17. Suppose $c$ is a constant. Prove that $\int_{cC} F(\vec{x}) \cdot d\vec{x} = c\int_C F(c\vec{x}) \cdot d\vec{x}$.

Exercise 7.2.18. Suppose $A$ is a symmetric matrix. Compute the integral of $A\vec{x} \cdot d\vec{x}$ on any continuously differentiable parametrized curve.

Exercise 7.2.19. Define $\omega_C(F) = \sup_{\vec{x}, \vec{y} \in C} \|F(\vec{x}) - F(\vec{y})\|$. Prove that if for any $\epsilon > 0$, there is $\delta > 0$, such that $\|P\| < \delta$ implies $\sum \omega_{C_i}(F)\mu(C_i) < \epsilon$, where $C_i$ are the segments of $C$ cut by the partition $P$, then $F \cdot d\vec{x}$ is integrable on $C$. Show that the converse is not true.

7.2.4 Surface Area

Suppose $\sigma(\vec{u}) = \sigma(u, v) : A \subset \mathbb{R}^2 \to \mathbb{R}^n$ is a parametrized surface on a subset $A$ with area. For a partition $P$ of $\mathbb{R}^2$ by triangles, we consider the triangles $I \in P$ that are contained in $A$. Each triangle is determined by its three vertices $\vec{u}_0, \vec{u}_1, \vec{u}_2$. Let $I_\sigma \subset \mathbb{R}^n$ be the triangle with $\sigma(\vec{u}_0), \sigma(\vec{u}_1), \sigma(\vec{u}_2)$ as the vertices. Then it is natural to think of the union of such $I_\sigma$ as an approximation of the parametrized surface. By (5.2.13), the total area of the triangles $I_\sigma$ is

$$
\frac{1}{2} \sum_I \|(\sigma(\vec{u}_1) - \sigma(\vec{u}_0)) \times (\sigma(\vec{u}_2) - \sigma(\vec{u}_0))\|_2.
$$

We expect the sum of the areas to approach the area of the parametrized surface as $\|P\| \to 0$.

It turns out that the intuition works for curves but not for surfaces. For curves, the direction of the straight line connecting two nearby points $\phi(t_{i-1})$ and $\phi(t_i)$ is very close to the direction $\phi'(t_i^*)$. For surfaces, the direction of the plane passing through three nearby points $\sigma(\vec{u}_0), \sigma(\vec{u}_1), \sigma(\vec{u}_2)$ may be far away from the direction of the tangent plane $\sigma(\vec{u}_I^*) + \sigma'(\vec{u}_I^*)(\vec{u} - \vec{u}_I^*)$.

Example 7.2.11. Consider the cylinder $\sigma(\theta, z) = (\cos\theta, \sin\theta, z)$ for $0 \le z \le 1$. Let $\alpha = \frac{\pi}{m}$ and $d = \frac{1}{2n}$. We cut the cylinder at heights $0, d, 2d, \dots, 2nd$ and get $2n + 1$ unit circles. On the unit circles at heights $0, 2d, 4d, \dots, 2nd$, we plot $m$ points at angles $0, 2\alpha, 4\alpha, \dots, (2m - 2)\alpha$. On the unit circles at the heights $d, 3d, 5d, \dots, (2n - 1)d$, we plot $m$ points at angles $\alpha, 3\alpha, \dots, (2m - 1)\alpha$. By connecting nearby triple points, we get $4mn$ identical isosceles triangles ($2m$ in each of the $2n$ bands) with base $2\sin\alpha$ and height $\sqrt{(1 - \cos\alpha)^2 + d^2}$. The total area is

$$
4mn \cdot \frac{1}{2} \cdot 2\sin\alpha \sqrt{(1 - \cos\alpha)^2 + d^2} = 2\pi \frac{\sin\alpha}{\alpha} \sqrt{\left( \frac{1 - \cos\alpha}{d} \right)^2 + 1}.
$$


[Figure 7.3: a bad approximation of the cylinder. Labels: heights $0$, $(2j-1)d$, $2jd$, $(2j+1)d$, $1$; angles $(2i-3)\alpha$, $(2i-1)\alpha$, $2i\alpha$, $(2i+1)\alpha$, $(2i+3)\alpha$.]

This has no limit as α, d→ 0.
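To see concretely why there is no limit, one can evaluate the total triangle area of Example 7.2.11 for different ways of letting $m, n \to \infty$. The Python sketch below is only an illustration; the function name `lantern_area` and the particular choices $n = m$, $n = m^2$, $n = m^3$ are assumptions made here. Since $\frac{1 - \cos\alpha}{d} \approx \frac{\pi^2 n}{m^2}$ for large $m$, the three choices should behave very differently.

```python
import math


def lantern_area(m, n):
    """Total area of the 4mn inscribed triangles of Example 7.2.11."""
    alpha = math.pi / m          # half of the angular step on each circle
    d = 1.0 / (2 * n)            # vertical distance between adjacent circles
    base = 2 * math.sin(alpha)
    height = math.sqrt((1 - math.cos(alpha)) ** 2 + d ** 2)
    return 4 * m * n * 0.5 * base * height


for m in (10, 100, 1000):
    # Columns: n = m, n = m^2, n = m^3.
    print(m, lantern_area(m, m), lantern_area(m, m ** 2), lantern_area(m, m ** 3))
```

With $n = m$ the values approach the true area $2\pi$ of the cylinder, with $n = m^2$ they approach $2\pi\sqrt{\pi^4 + 1}$, and with $n = m^3$ they grow without bound.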

So it is necessary to make sure that the direction of the approximate planes is really close to the direction of the tangent plane. Assume the parametrization $\sigma$ is continuously differentiable. Take any general partition $P$ of $A$ and choose $\vec{u}_I^* \in I$. Then $\sigma$ is approximated on $I$ by the linear map

$$
L_I(\vec{u}) = \sigma(\vec{u}_I^*) + \sigma'(\vec{u}_I^*)(\vec{u} - \vec{u}_I^*).
$$

The area of $\sigma(I)$ is approximated by the area of $L_I(I)$, which by Proposition 7.1.15 is

$$
\|\sigma'(\vec{u}_I^*)(\vec{e}_1) \times \sigma'(\vec{u}_I^*)(\vec{e}_2)\|_2 \mu(I) = \|\sigma_u(\vec{u}_I^*) \times \sigma_v(\vec{u}_I^*)\|_2 \mu(I).
$$

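The area of the surface is approximated by the sum $\sum_{I \in P} \|\sigma_u(\vec{u}_I^*) \times \sigma_v(\vec{u}_I^*)\|_2 \mu(I)$, which is the Riemann sum for the integral of the function $\|\sigma_u \times \sigma_v\|_2$ on $A$. Therefore the area of a surface parametrized by a continuously differentiable map $\sigma(u, v) : A \subset \mathbb{R}^2 \to \mathbb{R}^n$ is

$$
\mu(S) = \int_A \|\sigma_u \times \sigma_v\|_2 dudv = \int_A \sqrt{\|\sigma_u\|_2^2 \|\sigma_v\|_2^2 - (\sigma_u \cdot \sigma_v)^2} dudv. \tag{7.2.12}
$$

We also denote

$$
dA = \|\sigma_u \times \sigma_v\|_2 dudv, \tag{7.2.13}
$$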

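where $A$ indicates the area.

The geometric intuition tells us that the area should be independent of the parametrization. To verify the intuition, we consider a change of variable $(u, v) = \Phi(s, t) : B \to A$. We have $\sigma_s = u_s\sigma_u + v_s\sigma_v$, $\sigma_t = u_t\sigma_u + v_t\sigma_v$, and by (5.2.12),

$$
\sigma_s \times \sigma_t = \det\frac{\partial(u, v)}{\partial(s, t)} \sigma_u \times \sigma_v. \tag{7.2.14}
$$

Then by the change of variable formula, we have

$$
\int_B \|\sigma_s \times \sigma_t\|_2 dsdt = \int_B \left| \det\frac{\partial(u, v)}{\partial(s, t)} \right| \|\sigma_u \times \sigma_v\|_2 dsdt = \int_A \|\sigma_u \times \sigma_v\|_2 dudv.
$$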


Example 7.2.12. The graph of a continuously differentiable function $f(x, y)$ defined for $(x, y) \in A$ is naturally parametrized as $\sigma(x, y) = (x, y, f(x, y))$. By

$$
\sigma_x \times \sigma_y = (1, 0, f_x) \times (0, 1, f_y) = (-f_x, -f_y, 1),
$$

the area of the surface is

$$
\int_A \|(1, 0, f_x) \times (0, 1, f_y)\|_2 dxdy = \int_A \sqrt{1 + f_x^2 + f_y^2} dxdy.
$$

In case $z = f(x, y)$ is implicitly defined by $g(x, y, z) = c$, the formula becomes

$$
\int_A \frac{\sqrt{g_x^2 + g_y^2 + g_z^2}}{|g_z|} dxdy = \int_A \frac{dxdy}{\cos\theta},
$$

where $\nabla g = (g_x, g_y, g_z)$ is normal to the tangent space, and $\theta$ is the angle between the tangent plane and the $(x, y)$-plane.

Example 7.2.13. The surface obtained by rotating a curve $x = x(t)$, $y = y(t)$, $t \in [a, b]$, around the $x$-axis is $\sigma(t, \theta) = (x(t), y(t)\cos\theta, y(t)\sin\theta)$. The area is

$$
\int_{[a,b] \times [0,2\pi]} \|(x', y'\cos\theta, y'\sin\theta) \times (0, -y\sin\theta, y\cos\theta)\|_2 dtd\theta = 2\pi \int_a^b |y(t)| \sqrt{x'(t)^2 + y'(t)^2} dt.
$$

In particular, the area of the unit sphere is (see Example 7.1.11)

$$
\beta_2 = 2\pi \int_0^{\pi} |\sin t| \sqrt{((\cos t)')^2 + ((\sin t)')^2} dt = 4\pi.
$$

The torus in Example 6.1.10 is obtained by rotating the circle $x = a + b\cos\phi$, $z = b\sin\phi$ around the $z$-axis. So the torus has area

$$
2\pi \int_0^{2\pi} (a + b\cos\phi) \sqrt{((a + b\cos\phi)')^2 + ((b\sin\phi)')^2} d\phi = 2\pi b \int_0^{2\pi} (a + b\cos\phi) d\phi = 4\pi^2 ab.
$$
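The torus area can be cross-checked directly from the area formula (7.2.12). For the parametrization $\sigma(\phi, \theta) = ((a + b\cos\phi)\cos\theta, (a + b\cos\phi)\sin\theta, b\sin\phi)$ one finds $\|\sigma_\phi \times \sigma_\theta\|_2 = b(a + b\cos\phi)$ when $a > b > 0$, so the integral over $[0, 2\pi]^2$ evaluates to $4\pi^2 ab$. The Python sketch below approximates the same integral by a midpoint rule; the names `cross` and `torus_area`, the grid size, and the test values $a = 2$, $b = 1$ are illustrative assumptions.

```python
import math


def cross(p, q):
    """Cross product of two vectors in R^3."""
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])


def torus_area(a, b, steps=400):
    """Midpoint-rule value of the integral of |sigma_phi x sigma_theta| over [0, 2pi]^2."""
    h = 2 * math.pi / steps
    total = 0.0
    for i in range(steps):
        phi = (i + 0.5) * h
        r = a + b * math.cos(phi)   # distance from the z-axis
        for j in range(steps):
            theta = (j + 0.5) * h
            # Partial derivatives of sigma(phi, theta), computed by hand.
            s_phi = (-b * math.sin(phi) * math.cos(theta),
                     -b * math.sin(phi) * math.sin(theta),
                     b * math.cos(phi))
            s_theta = (-r * math.sin(theta), r * math.cos(theta), 0.0)
            n = cross(s_phi, s_theta)
            total += math.sqrt(n[0] ** 2 + n[1] ** 2 + n[2] ** 2) * h * h
    return total


print(torus_area(2.0, 1.0), 4 * math.pi ** 2 * 2.0 * 1.0)
```

The two printed numbers should agree closely, since the integrand is a trigonometric polynomial and the midpoint rule handles such periodic integrands very accurately.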

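Example 7.2.14. The parametrized surface $\sigma(u, v) = (\cos(u + v), \sin(u + v), \cos u, \sin u)$ for $0 \le u, v \le \frac{\pi}{2}$ has area

$$
\int_{[0,\frac{\pi}{2}]^2} \sqrt{\|\sigma_u\|_2^2 \|\sigma_v\|_2^2 - (\sigma_u \cdot \sigma_v)^2} dudv = \int_{[0,\frac{\pi}{2}]^2} \sqrt{2 \cdot 1 - 1^2} dudv = \frac{\pi^2}{4}.
$$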
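For surfaces in $\mathbb{R}^n$ with $n > 3$ there is no cross product, so the second form of (7.2.12), with the Gram determinant $\|\sigma_u\|_2^2\|\sigma_v\|_2^2 - (\sigma_u \cdot \sigma_v)^2$, is the one to compute with. The following Python sketch applies it to the surface of Example 7.2.14. It is only a rough numerical check: the partial derivatives are replaced by central differences, and the names `surface_area`, `sigma`, the grid size and the step `eps` are assumptions made for illustration.

```python
import math


def surface_area(sigma, u_range, v_range, steps=200, eps=1e-6):
    """Midpoint-rule approximation of the area formula (7.2.12) in R^n.

    sigma(u, v) returns a tuple of n coordinates. The partial derivatives
    sigma_u and sigma_v are approximated by central differences.
    """
    (u0, u1), (v0, v1) = u_range, v_range
    hu, hv = (u1 - u0) / steps, (v1 - v0) / steps
    total = 0.0
    for i in range(steps):
        u = u0 + (i + 0.5) * hu
        for j in range(steps):
            v = v0 + (j + 0.5) * hv
            su = [(p - q) / (2 * eps)
                  for p, q in zip(sigma(u + eps, v), sigma(u - eps, v))]
            sv = [(p - q) / (2 * eps)
                  for p, q in zip(sigma(u, v + eps), sigma(u, v - eps))]
            uu = sum(x * x for x in su)
            vv = sum(x * x for x in sv)
            uv = sum(x * y for x, y in zip(su, sv))
            # Gram determinant; clamp tiny negative rounding errors to zero.
            total += math.sqrt(max(uu * vv - uv * uv, 0.0)) * hu * hv
    return total


def sigma(u, v):
    # The surface in R^4 from Example 7.2.14.
    return (math.cos(u + v), math.sin(u + v), math.cos(u), math.sin(u))


print(surface_area(sigma, (0, math.pi / 2), (0, math.pi / 2)), math.pi ** 2 / 4)
```

Because the integrand is constant for this surface, the two printed numbers should agree up to the finite-difference error.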

Exercise 7.2.20. Find the area of the graph of $F : \mathbb{R}^2 \to \mathbb{R}^{n-2}$. The graph is a surface in $\mathbb{R}^n$.

Exercise 7.2.21. Study the effect of transforms of $\mathbb{R}^n$ on the areas of surfaces.

1. For the shifting $\vec{x} \to \vec{a} + \vec{x}$, prove $\mu(\vec{a} + S) = \mu(S)$.

2. For the scaling $\vec{x} \to c\vec{x}$, prove $\mu(cS) = c^2\mu(S)$.

3. For an orthogonal linear transform $U$, prove $\mu(U(S)) = \mu(S)$.

Exercise 7.2.22. Establish and prove the Pappus-Guldinus theorem (see Example 7.1.12) for the area of surfaces obtained by rotation.


Exercise 7.2.23. Find areas of surfaces.

1. The boundary surface of the common part of the ball $x^2 + y^2 + z^2 \le R^2$ and the cylinder $x^2 + y^2 \le Rx$.

2. The surface $(x^2 + y^2)^{\frac{1}{3}} + z^{\frac{2}{3}} = 1$.

3. The parametrized surface $\sigma(u, v) = (u + v, u - v, u^2 + v^2, u^2 - v^2)$ for $0 \le u \le a$, $0 \le v \le b$.

7.2.5 Integration of Function on Surface

Suppose $f(\vec{x})$ is a bounded function on a surface $S$ parametrized by $\sigma(u, v) : A \subset \mathbb{R}^2 \to \mathbb{R}^n$. For a general partition $P$ of $A$ and choices $(u_I^*, v_I^*) \in I$ for $I \in P$, we have the Riemann sum

$$
S(P, f, \sigma) = \sum f(\sigma(u_I^*, v_I^*))\mu(\sigma(I)). \tag{7.2.15}
$$

The integral $\int_S fd\mu = \int_S fdA$ of $f$ on the surface is defined as the limit of the Riemann sum as $\|P\| \to 0$. The notation $dA$ indicates that the integration is against the area.

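The Riemann sum satisfies

$$
\begin{aligned}
&|S(P, f, \sigma) - S(P, (f \circ \sigma)\|\sigma_u \times \sigma_v\|_2)| \\
&\quad \le \sum |f(\sigma(u_I^*, v_I^*))| \left| \int_I \|\sigma_u \times \sigma_v\|_2 dudv - \|\sigma_u(u_I^*, v_I^*) \times \sigma_v(u_I^*, v_I^*)\|_2 \mu(I) \right| \\
&\quad \le \sum |f(\sigma(u_I^*, v_I^*))| \sup_{(u,v) \in I} \|\sigma_u(u, v) \times \sigma_v(u, v) - \sigma_u(u_I^*, v_I^*) \times \sigma_v(u_I^*, v_I^*)\|_2 \mu(I).
\end{aligned}
$$

By the continuity of $\sigma_u \times \sigma_v$ and the boundedness of $f$, the right side is small if $\|P\|$ is small. Therefore $f$ is Riemann integrable on the surface if and only if $(f \circ \sigma)\|\sigma_u \times \sigma_v\|_2$ is Riemann integrable on $A$. When the parametrization is regular, this is also equivalent to $f \circ \sigma$ being Riemann integrable on $A$. Moreover, we have

$$
\int_S fdA = \int_A f(\sigma(u, v))\|\sigma_u(u, v) \times \sigma_v(u, v)\|_2 dudv. \tag{7.2.16}
$$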

The integral has properties similar to the integral of functions on rectifiable curves in Section 7.2.2. Moreover, the integral is independent of the choice of the parametrization, either by an argument similar to the area of the surface, or by the fact that the Riemann sum (7.2.15) is essentially given by a partition of the surface and a choice of points on the surface.

Example 7.2.15. Let $S$ be the part of the cone $z = \sqrt{x^2 + y^2}$ cut by the cylinder $(x - 1)^2 + y^2 = 1$. Then $dA = \sqrt{1 + z_x^2 + z_y^2}dxdy = \sqrt{2}dxdy$, and

$$
\int_S (xy + yz + zx)dA = \int_{(x-1)^2 + y^2 \le 1} \left( xy + (x + y)\sqrt{x^2 + y^2} \right) \sqrt{2}dxdy.
$$


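The region $(x - 1)^2 + y^2 \le 1$ can be described as $0 \le r \le 2\cos\theta$, $-\frac{\pi}{2} \le \theta \le \frac{\pi}{2}$ in the polar coordinate. Therefore

$$
\begin{aligned}
\int_S (xy + yz + zx)dA &= \int_{-\frac{\pi}{2}}^{\frac{\pi}{2}} \left( \int_0^{2\cos\theta} (r^2\cos\theta\sin\theta + r(\cos\theta + \sin\theta)r) \sqrt{2}rdr \right) d\theta \\
&= 4\sqrt{2} \int_{-\frac{\pi}{2}}^{\frac{\pi}{2}} (\cos\theta\sin\theta + \cos\theta + \sin\theta)\cos^4\theta d\theta = \frac{64}{15}\sqrt{2}.
\end{aligned}
$$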
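The value $\frac{64}{15}\sqrt{2}$ in Example 7.2.15 can be confirmed by evaluating the double integral numerically in polar coordinates, using $dA = \sqrt{2}rdrd\theta$ on the cone. The Python sketch below uses a plain midpoint rule; the function name `cone_integral` and the grid size are assumptions chosen for illustration, and the accuracy is only a few digits.

```python
import math


def cone_integral(steps=400):
    """Midpoint rule for the integral of xy + yz + zx over the cone piece S.

    S is the graph of z = sqrt(x^2 + y^2) over the disc (x - 1)^2 + y^2 <= 1,
    described in polar coordinates by 0 <= r <= 2 cos(theta), |theta| <= pi/2.
    """
    dtheta = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = -math.pi / 2 + (i + 0.5) * dtheta
        r_max = 2 * math.cos(theta)
        dr = r_max / steps
        for j in range(steps):
            r = (j + 0.5) * dr
            x, y = r * math.cos(theta), r * math.sin(theta)
            z = math.sqrt(x * x + y * y)          # height of the cone, equal to r
            dA = math.sqrt(2) * r * dr * dtheta   # area element on the cone
            total += (x * y + y * z + z * x) * dA
    return total


print(cone_integral(), 64 * math.sqrt(2) / 15)
```

The terms containing $\sin\theta$ cancel by symmetry, which is why only $\int \cos^5\theta d\theta$ contributes to the exact value.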

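Example 7.2.16. For a fixed vector $\vec{a} \in \mathbb{R}^3$, consider the integral $\int_{S^2} f(\vec{a} \cdot \vec{x})dA$ on the unit sphere. There is a rotation that moves $\vec{a}$ to $(\|\vec{a}\|_2, 0, 0)$. Since the rotation preserves the areas of surfaces in $\mathbb{R}^3$, we have

$$
\int_{S^2} f(\vec{a} \cdot \vec{x})dA = \int_{S^2} f((\|\vec{a}\|_2, 0, 0) \cdot (x, y, z))dA = \int_{S^2} f(\|\vec{a}\|_2 x)dA.
$$

Parametrize $S^2$ by $\sigma(x, \theta) = (x, \sqrt{1 - x^2}\cos\theta, \sqrt{1 - x^2}\sin\theta)$. Then $\|\sigma_x \times \sigma_\theta\|_2 = 1$ and

$$
\int_{S^2} f(\|\vec{a}\|_2 x)dA = \int_{-1 \le x \le 1, 0 \le \theta \le 2\pi} f(\|\vec{a}\|_2 x)dxd\theta = 2\pi \int_{-1}^1 f(\|\vec{a}\|_2 x)dx.
$$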
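The reduction in Example 7.2.16 can be tested independently by integrating $f(\vec{a} \cdot \vec{x})$ over the unit sphere in the usual spherical coordinates, where $dA = \sin\phi d\phi d\theta$, and comparing with the one-variable integral. The Python sketch below does this for $f = \exp$ and $\vec{a} = (1, 2, 2)$, so that $\|\vec{a}\|_2 = 3$. These choices, the names `sphere_integral` and `interval_integral`, and the grid sizes are all assumptions made for illustration.

```python
import math


def sphere_integral(g, steps=400):
    """Midpoint rule for the integral of g(x, y, z) dA over the unit sphere."""
    dphi = math.pi / steps
    dtheta = 2 * math.pi / steps
    total = 0.0
    for i in range(steps):
        phi = (i + 0.5) * dphi
        s, c = math.sin(phi), math.cos(phi)
        for j in range(steps):
            theta = (j + 0.5) * dtheta
            # Point of the sphere and area element sin(phi) dphi dtheta.
            total += g(s * math.cos(theta), s * math.sin(theta), c) * s * dphi * dtheta
    return total


def interval_integral(f, lo, hi, steps=4000):
    """Midpoint rule for the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return sum(f(lo + (k + 0.5) * h) for k in range(steps)) * h


a = (1.0, 2.0, 2.0)
norm_a = math.sqrt(sum(t * t for t in a))

lhs = sphere_integral(lambda x, y, z: math.exp(a[0] * x + a[1] * y + a[2] * z))
rhs = 2 * math.pi * interval_integral(lambda x: math.exp(norm_a * x), -1.0, 1.0)
print(lhs, rhs)
```

Both numbers should be close to $2\pi\frac{e^3 - e^{-3}}{3}$, the exact value of the right side for this choice of $f$ and $\vec{a}$.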

Exercise 7.2.24. Use the change of variable formula to prove that the Riemann integral on a surface is independent of the parametrization.

Exercise 7.2.25. Study the effect of transforms of $\mathbb{R}^n$ on the integrals on surfaces.

1. For the shifting $\vec{x} \to \vec{a} + \vec{x}$, prove $\int_{\vec{a}+S} f(\vec{a} + \vec{x})dA = \int_S f(\vec{x})dA$.

2. For the scaling $\vec{x} \to c\vec{x}$, prove $\int_{cS} f(c\vec{x})dA = c^2 \int_S f(\vec{x})dA$.

3. For an orthogonal linear transform $U$, prove $\int_{U(S)} f(\vec{x})dA = \int_S f(U(\vec{x}))dA$.

Exercise 7.2.26. Compute the integrals.

1. $\int_{x^2+y^2+z^2=1, a \le z \le b} \frac{dA}{z}$, $1 \ge b > a > 0$.

2. $\int_S x^2dA$ and $\int_S x^2y^2dA$, $S$ is the sphere $x^2 + y^2 + z^2 = 1$.

3. $\int_S \frac{dA}{(1 + x + y)^2}$, $S$ is the boundary of the tetrahedron $x + y + z \le 1$, $x \ge 0$, $y \ge 0$, $z \ge 0$.

4. $\int_{T^2} (x + y + z)dA$, $T^2$ is the part of the torus in (5.1.20) in the quadrant $x, y \ge 0$.

5. $\int_S (x_1^2 + x_2^2)dA$, $S$ is the surface $\sigma(u, v) = (u + v, u - v, u^2 + v^2, u^2 - v^2)$ for $0 \le u \le a$, $0 \le v \le b$.

Exercise 7.2.27. Compute the attracting force $\int_{S^2} \frac{\vec{x} - \vec{a}}{\|\vec{x} - \vec{a}\|_2^3}dA$ of the unit sphere on a point $\vec{a}$.


7.2.6 Integration of 2-Form on Surface

In physics, a flow can be represented as a vector. The flux of a constant flow $F$ through a region $A$ on a plane $\vec{a} \cdot \vec{x} = b$ in $\mathbb{R}^3$ is $(F \cdot \vec{n})\mu(A)$, where $\vec{n} = \frac{\vec{a}}{\|\vec{a}\|_2}$ has unit Euclidean length and is called a normal vector of the plane. A plane in $\mathbb{R}^3$ has two normal vectors $\vec{n}$ and $-\vec{n}$ that indicate “two sides” of the plane. The sign of the flux indicates whether the flow is along or against the normal direction.

Suppose $S \subset \mathbb{R}^3$ is a surface parametrized by a regular continuously differentiable map $\sigma(u, v) : A \subset \mathbb{R}^2 \to \mathbb{R}^3$. Then the parametrization gives a normal vector

$$
\vec{n} = \frac{\sigma_u \times \sigma_v}{\|\sigma_u \times \sigma_v\|_2} \tag{7.2.17}
$$

of (the tangent plane of) the surface. Geometrically, the equality means that the order $\sigma_u, \sigma_v, \vec{n}$ follows the “right hand rule”. Sometimes instead of one differentiable map that parametrizes the whole surface, we may need several regular parametrizations that combine to cover the whole surface. In this case, we require the normal vectors from different parametrizations to coincide on their overlaps. Then the whole surface has a continuous choice of the normal vectors. Such a choice is called an orientation of the surface.

Example 7.2.17. The graph of a continuously differentiable function $f(x, y)$ is naturally parametrized as $\sigma(x, y) = (x, y, f(x, y))$. By the computation in Example 7.2.12, the normal vector induced by the parametrization is

$$
\vec{n} = \frac{(-f_x, -f_y, 1)}{\|(-f_x, -f_y, 1)\|_2} = \frac{(-f_x, -f_y, 1)}{\sqrt{1 + f_x^2 + f_y^2}}.
$$

Note that the vector points to the positive direction of $z$, which is upward in the usual way of depicting the $(x, y, z)$ coordinates. To induce the normal vector in the downward direction, we may use the parametrization $\sigma(y, x) = (x, y, f(x, y))$ (or, less confusingly, $\sigma(u, v) = (v, u, f(v, u))$).

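Example 7.2.18. The sphere $S_R^2 = \{(x, y, z) : x^2 + y^2 + z^2 = R^2\}$ of radius $R$ can be covered by the parametrizations

$$
\begin{aligned}
\sigma_1(x, y) &= (x, y, \sqrt{R^2 - x^2 - y^2}), & x^2 + y^2 &< R^2, \\
\sigma_2(y, x) &= (x, y, -\sqrt{R^2 - x^2 - y^2}), & x^2 + y^2 &< R^2, \\
\sigma_3(z, x) &= (x, \sqrt{R^2 - x^2 - z^2}, z), & x^2 + z^2 &< R^2, \\
\sigma_4(x, z) &= (x, -\sqrt{R^2 - x^2 - z^2}, z), & x^2 + z^2 &< R^2, \\
\sigma_5(y, z) &= (\sqrt{R^2 - y^2 - z^2}, y, z), & y^2 + z^2 &< R^2, \\
\sigma_6(z, y) &= (-\sqrt{R^2 - y^2 - z^2}, y, z), & y^2 + z^2 &< R^2.
\end{aligned}
$$

By

$$
(\sigma_1)_x \times (\sigma_1)_y = \left( 1, 0, -\frac{x}{z} \right) \times \left( 0, 1, -\frac{y}{z} \right) = \left( \frac{x}{z}, \frac{y}{z}, 1 \right),
$$

the normal vector from the first parametrization is

$$
\frac{(\sigma_1)_x \times (\sigma_1)_y}{\|(\sigma_1)_x \times (\sigma_1)_y\|_2} = \frac{(x, y, z)}{\|(x, y, z)\|_2} = \frac{\vec{x}}{R}.
$$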


Similar computations also show that $\frac{\vec{x}}{R}$ is the normal vector from the other parametrizations. Therefore the choice $\vec{n} = \frac{\vec{x}}{R}$ gives an orientation of the sphere.

Note that the order of the variables in $\sigma_2$ is deliberately arranged to have $y$ as the first and $x$ as the second. Thus the normal vector should be computed from $(\sigma_2)_y \times (\sigma_2)_x$ instead of the other way around.

Under a change of variable $u = u(s, t)$, $v = v(s, t)$, by (7.2.14), we find the normal vectors computed from the two parametrizations are the same if and only if $\det\frac{\partial(u, v)}{\partial(s, t)} > 0$. Now on the overlap of two parametrizations $\sigma_1(u, v)$, $\sigma_2(s, t)$ of a surface, the parameters $(u, v)$ and $(s, t)$ are related by a change of variables. We say the two parametrizations are compatibly oriented if the Jacobian matrix of the change of variable on the overlap has positive determinant. Then we conclude that an orientation of the surface is the same as a collection of compatibly oriented parametrizations.

It is possible for a surface to be nonorientable in the sense that there is no continuous choice of the normal vectors. This is equivalent to the nonexistence of a collection of compatibly oriented parametrizations. A typical example is the Möbius band.

[Figure 7.4: Möbius band]

Back to flux, suppose $S \subset \mathbb{R}^3$ is a surface with orientation $\vec{n}$. The flow is a vector field $F : S \to \mathbb{R}^3$ on the surface. For a regular continuously differentiable parametrization $\sigma : A \to S$ compatible with the orientation, a general partition $P$ of $A$ and choices $(u_I^*, v_I^*) \in I$ for $I \in P$, the total flux of the flow through the surface is approximated by the Riemann sum

$$
S(P, F, \vec{n}, \sigma) = \sum F(\sigma(u_I^*, v_I^*)) \cdot \vec{n}(\sigma(u_I^*, v_I^*))\mu(\sigma(I)). \tag{7.2.18}
$$

The integral of the flow along the surface is defined as the limit of the Riemann sum as $\|P\| \to 0$. Since the Riemann sum $S(P, F, \vec{n}, \sigma)$ is the same as the Riemann sum $S(P, F \cdot \vec{n}, \sigma)$ for the integral of the function $F \cdot \vec{n}$ on the surface, the integrability of $F$ along the surface is the same as the integrability of $F \cdot \vec{n}$, and

$$
\int_S F \cdot \vec{n}dA = \int_A F(\sigma(u, v)) \cdot (\sigma_u(u, v) \times \sigma_v(u, v))dudv. \tag{7.2.19}
$$

The integral has properties similar to the integral of 1-forms on rectifiable curves in Section 7.2.3. Moreover, the integral is independent of the choice of the orientation compatible parametrization, either by an argument similar to the area of the surface, or by the fact that the Riemann sum (7.2.18) is essentially given by a partition of the surface, a choice of points on the surface, and the orientation (indicated by the normal vector $\vec{n}$).

Example 7.2.19. By the computation of the normal vector in Example 7.2.18, the (outgoing) flux of the flow $F = \vec{x} = (x, y, z)$ through the sphere of radius $R$ is

$$
\int_{S_R^2} \vec{x} \cdot \frac{\vec{x}}{R}dA = \int_{S_R^2} RdA = R\mu(S_R^2) = R^3\beta_2 = 4\pi R^3.
$$

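Example 7.2.20. The outgoing flux of the flow $F = (x^2, y^2, z^2)$ through the sphere $\|\vec{x} - \vec{x}_0\|_2^2 = (x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2 = R^2$ is

$$
\int_{\|\vec{x} - \vec{x}_0\|_2 = R} (x^2, y^2, z^2) \cdot \frac{(x - x_0, y - y_0, z - z_0)}{R}dA.
$$

By the shifting $\vec{x} \to \vec{x}_0 + \vec{x}$, the integral is equal to

$$
\frac{1}{R} \int_{\|\vec{x}\|_2 = R} ((x + x_0)^2 x + (y + y_0)^2 y + (z + z_0)^2 z)dA.
$$

By suitable rotations that exchange the axes, we have $\int_{\|\vec{x}\|_2 = R} x^ndA = \int_{\|\vec{x}\|_2 = R} y^ndA = \int_{\|\vec{x}\|_2 = R} z^ndA$. By the transform $(x, y, z) \to (-x, y, z)$, we also have $\int_{\|\vec{x}\|_2 = R} x^ndA = 0$ for odd integers $n$. Thus the integral above becomes

$$
\begin{aligned}
\frac{1}{R} \int_{\|\vec{x}\|_2 = R} (2x_0x^2 + 2y_0y^2 + 2z_0z^2)dA &= \frac{2}{3R}(x_0 + y_0 + z_0) \int_{\|\vec{x}\|_2 = R} (x^2 + y^2 + z^2)dA \\
&= \frac{8}{3}\pi(x_0 + y_0 + z_0)R^3.
\end{aligned}
$$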
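The flux values in Examples 7.2.19 and 7.2.20 can be checked against formula (7.2.19). The Python sketch below assumes the spherical parametrization $\sigma(\phi, \theta) = \vec{x}_0 + R(\sin\phi\cos\theta, \sin\phi\sin\theta, \cos\phi)$, for which $\sigma_\phi \times \sigma_\theta = R^2\sin\phi \, \vec{n}$ with $\vec{n}$ the outward unit normal. The function name `flux_through_sphere`, the grid size, and the sample center $(1, 2, 3)$ with $R = 2$ are illustrative assumptions.

```python
import math


def flux_through_sphere(F, center, R, steps=400):
    """Midpoint rule for the flux integral (7.2.19) over a sphere.

    F(x, y, z) returns the flow vector as a 3-tuple. The sphere is
    parametrized by spherical coordinates (phi, theta) around `center`,
    which gives the outward orientation.
    """
    x0, y0, z0 = center
    dphi = math.pi / steps
    dtheta = 2 * math.pi / steps
    total = 0.0
    for i in range(steps):
        phi = (i + 0.5) * dphi
        s, c = math.sin(phi), math.cos(phi)
        for j in range(steps):
            theta = (j + 0.5) * dtheta
            # Outward unit normal at the sample point.
            nx, ny, nz = s * math.cos(theta), s * math.sin(theta), c
            fx, fy, fz = F(x0 + R * nx, y0 + R * ny, z0 + R * nz)
            # sigma_phi x sigma_theta = R^2 sin(phi) * (nx, ny, nz).
            total += (fx * nx + fy * ny + fz * nz) * R * R * s * dphi * dtheta
    return total


# Example 7.2.19: F = (x, y, z) through the sphere of radius R centered at 0.
print(flux_through_sphere(lambda x, y, z: (x, y, z), (0, 0, 0), 2.0),
      4 * math.pi * 2.0 ** 3)

# Example 7.2.20: F = (x^2, y^2, z^2) through a sphere centered at (1, 2, 3).
print(flux_through_sphere(lambda x, y, z: (x * x, y * y, z * z), (1, 2, 3), 2.0),
      8 / 3 * math.pi * (1 + 2 + 3) * 2.0 ** 3)
```

In each line the numerical flux should be close to the closed-form value printed next to it.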

Exercise 7.2.28. Use the change of variable formula to prove that the Riemann integral (7.2.19) on a surface is independent of the orientation compatible parametrization.

Exercise 7.2.29. Study the effect of transforms of $\mathbb{R}^n$ on the flux, similar to Exercise 7.2.25.

Exercise 7.2.30. Compute the flux.

1. $F = (x, y, z)$, $S$ is a triangle with vertices $\vec{a}$, $\vec{b}$, $\vec{c}$ and normal vector $\vec{n}$ (which is necessarily orthogonal to $\vec{b} - \vec{a}$ and $\vec{c} - \vec{a}$).

2. $F = (f(x), g(y), h(z))$, $S$ is the boundary of the rectangle $[a_1, a_2] \times [b_1, b_2] \times [c_1, c_2]$ with $\vec{n}$ pointing outward.

3. $F = (xy^2, yz^2, zx^2)$, $S$ is the sphere $(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2 = 1$ with $\vec{n}$ pointing outward.

Denote $F = (f, g, h)$ and $\sigma = (x, y, z)$. Then

$$
F \cdot (\sigma_u \times \sigma_v) = f(y_uz_v - y_vz_u) + g(z_ux_v - z_vx_u) + h(x_uy_v - x_vy_u).
$$


On the other hand, by formally extending the cross product to the differential 1-forms, we get

$$
dx \times dy = (x_udu + x_vdv) \times (y_udu + y_vdv) = (x_uy_v - x_vy_u)du \times dv.
$$

Therefore

$$
F \cdot (\sigma_u \times \sigma_v)du \times dv = fdy \times dz + gdz \times dx + hdx \times dy.
$$

Since the cross product is a special case of the exterior product, the formula suggests how to extend the integral of 2-forms to surfaces in other dimensions. Define the integral of a differential 2-form

$$
\omega = \sum_{i<j} f_{ij}(\vec{x})dx_i \wedge dx_j \tag{7.2.20}
$$

on a surface $S \subset \mathbb{R}^n$ parametrized by a regular continuously differentiable map $\sigma(u, v) : A \subset \mathbb{R}^2 \to \mathbb{R}^n$ to be

$$
\int_S \omega = \sum_{i<j} \int_A f_{ij}(\sigma(u, v)) \det\frac{\partial(x_i, x_j)}{\partial(u, v)}dudv. \tag{7.2.21}
$$
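Definition (7.2.21) is straightforward to evaluate numerically, and doing so also shows how the integral reacts to the order of the parameters. The Python sketch below integrates $\omega = dx_1 \wedge dx_2 + dx_3 \wedge dx_4$ over the surface $\sigma(u, v) = (u + v, u - v, u^2 + v^2, u^2 - v^2)$, $0 \le u, v \le 1$, and then over the same surface with the two parameters exchanged. The choice of $\omega$, the helper name `integrate_two_form`, the dictionary encoding of the coefficients $f_{ij}$ (keyed by 1-based index pairs), and the central-difference derivatives are all assumptions made for illustration.

```python
def integrate_two_form(coeffs, sigma, u_range, v_range, steps=100, eps=1e-6):
    """Midpoint-rule approximation of definition (7.2.21).

    coeffs maps an index pair (i, j), 1-based with i < j, to a function
    f_ij taking a point of R^n (a tuple). sigma(u, v) returns that point.
    """
    (u0, u1), (v0, v1) = u_range, v_range
    hu, hv = (u1 - u0) / steps, (v1 - v0) / steps
    total = 0.0
    for a in range(steps):
        u = u0 + (a + 0.5) * hu
        for b in range(steps):
            v = v0 + (b + 0.5) * hv
            p = sigma(u, v)
            # Central differences for the partial derivatives of sigma.
            su = [(x - y) / (2 * eps)
                  for x, y in zip(sigma(u + eps, v), sigma(u - eps, v))]
            sv = [(x - y) / (2 * eps)
                  for x, y in zip(sigma(u, v + eps), sigma(u, v - eps))]
            for (i, j), f in coeffs.items():
                # Determinant of the Jacobian matrix of (x_i, x_j) in (u, v).
                minor = su[i - 1] * sv[j - 1] - sv[i - 1] * su[j - 1]
                total += f(p) * minor * hu * hv
    return total


def sigma(u, v):
    return (u + v, u - v, u * u + v * v, u * u - v * v)


def sigma_swapped(s, t):
    # Same surface with the two parameters exchanged.
    return sigma(t, s)


omega = {(1, 2): lambda p: 1.0, (3, 4): lambda p: 1.0}

print(integrate_two_form(omega, sigma, (0, 1), (0, 1)))
print(integrate_two_form(omega, sigma_swapped, (0, 1), (0, 1)))
```

By hand, $\det\frac{\partial(x_1, x_2)}{\partial(u, v)} = -2$ and $\det\frac{\partial(x_3, x_4)}{\partial(u, v)} = -8uv$, so the first integral is $-2 - 2 = -4$. Exchanging the parameters has Jacobian determinant $-1$ and flips the sign of the result.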

Similar to the case $n = 3$, we are concerned about whether the integral depends on the choice of the parametrization. Under a change of variable $u = u(s, t)$, $v = v(s, t)$, we have

$$
\int_{A_{uv}} f(\sigma(u, v)) \det\frac{\partial(x_i, x_j)}{\partial(u, v)}dudv = \int_{A_{st}} f(\sigma(s, t)) \det\frac{\partial(x_i, x_j)}{\partial(u, v)} \left| \det\frac{\partial(u, v)}{\partial(s, t)} \right| dsdt.
$$

By $\frac{\partial(x_i, x_j)}{\partial(s, t)} = \frac{\partial(x_i, x_j)}{\partial(u, v)} \frac{\partial(u, v)}{\partial(s, t)}$, the right side fits into the definition (7.2.21) in terms of the new variable $(s, t)$ if and only if $\det\frac{\partial(u, v)}{\partial(s, t)} > 0$. Therefore the definition of the integral of a 2-form on a surface is not changed if the parametrizations are compatibly oriented.

In general, a surface $S \subset \mathbb{R}^n$ may be covered by finitely many parametrized pieces $\sigma_i : A_i \subset \mathbb{R}^2 \to S_i \subset \mathbb{R}^n$, $S = \cup S_i$. On the overlap $S_i \cap S_j$ of the pieces, we get the change of variable among different parametrizations (called the transition map)

$$
\varphi_{ij} = \sigma_j^{-1} \circ \sigma_i : A_{ij} = \sigma_i^{-1}(S_i \cap S_j) \to A_{ji} = \sigma_j^{-1}(S_i \cap S_j).
$$

The parametrizations are compatibly oriented if $\det\varphi_{ij}' > 0$ for any $i, j$. The surface is orientable if there is a compatibly oriented collection of parametrizations that covers the surface. Such a collection gives an orientation of the surface. Any other parametrization $\sigma_* : A \subset \mathbb{R}^2 \to S_* \subset S$ is compatible with the orientation if $\det(\sigma_i^{-1} \circ \sigma_*)' > 0$ on $\sigma_*^{-1}(S_* \cap S_i)$.


[Figure 7.5: transition between overlapping parametrizations. Labels: $S$, $S_i$, $S_j$, $A_{ij}$, $A_{ji}$, $\sigma_i$, $\sigma_j$, $\varphi_{ij}$.]

For surfaces in $\mathbb{R}^3$, the definition of orientation given above is equivalent to a continuous choice of normal vector throughout the surface. The new definition is more general because it does not make explicit use of $\mathbb{R}^3$.

Choose orientation compatible parametrizations $\sigma_i$ that cover the surface, such that the intersections $S_i \cap S_j$ have area zero. Then the integral of a 2-form $\omega$ on the surface may be defined as
$$\int_S \omega = \sum_i \int_{S_i} \omega, \tag{7.2.22}$$
where the integrals inside the sum on the right side are defined by (7.2.21). The definition is independent of the choice of parametrizations.

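Example 7.2.21. The flux in Example 7.2.19 is the integral of the 2-form $x\,dy\wedge dz + y\,dz\wedge dx + z\,dx\wedge dy$ on the sphere $S^2_R$ with orientation given by the compatible parametrizations in Example 7.2.18. The integral of $z\,dx\wedge dy$ may be computed by using $\sigma_1$ and $\sigma_2$:
$$\begin{aligned}
\int_{S^2_R} z\,dx\wedge dy
&= \int_{x^2+y^2\le R^2} \sqrt{R^2-x^2-y^2}\,\det\frac{\partial(x,y)}{\partial(x,y)}\,dxdy \\
&\quad + \int_{x^2+y^2\le R^2} -\sqrt{R^2-x^2-y^2}\,\det\frac{\partial(x,y)}{\partial(y,x)}\,dxdy \\
&= 2\int_0^{2\pi}\left(\int_0^R \sqrt{R^2-r^2}\,r\,dr\right)d\theta = \frac{4}{3}\pi R^3.
\end{aligned}$$
Similarly, we get
$$\int_{S^2_R} x\,dy\wedge dz = \int_{S^2_R} y\,dz\wedge dx = \frac{4}{3}\pi R^3.$$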
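The value $\frac{4}{3}\pi R^3$ in Example 7.2.21 can be checked numerically. The following is a minimal sketch, not part of the text: the function name `sphere_z_dxdy` and the midpoint rule in polar coordinates are illustrative choices, and the integrand is taken to be $\sqrt{R^2-r^2}$, the height of the upper hemisphere of radius $R$.

```python
import math

def sphere_z_dxdy(R, n=2000):
    """Approximate the integral of z dx^dy over the sphere of radius R.

    Both hemispheres contribute the same amount, so the value is
    2 * int_0^{2 pi} int_0^R sqrt(R^2 - r^2) r dr dtheta.
    """
    dr = R / n
    radial = 0.0
    for i in range(n):
        r = (i + 0.5) * dr  # midpoint of the i-th radial cell
        radial += math.sqrt(R * R - r * r) * r * dr
    # The integrand does not depend on theta, so the angular integral is 2*pi.
    return 2 * (2 * math.pi) * radial

if __name__ == "__main__":
    R = 2.0
    print(sphere_z_dxdy(R), 4 * math.pi * R**3 / 3)
```

The two printed numbers should agree to several digits; the agreement improves as `n` grows.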

Example 7.2.22. We compute the outgoing flux of $F = (x^2, y^2, z^2)$ in Example 7.2.20 through the ellipsoid $S$ given by $\dfrac{(x-x_0)^2}{a^2}+\dfrac{(y-y_0)^2}{b^2}+\dfrac{(z-z_0)^2}{c^2} = 1$. By the shifting $\vec x \to \vec x_0 + \vec x$, the flux is the integral
$$\int_{S+\vec x_0} (x+x_0)^2\,dy\wedge dz + (y+y_0)^2\,dz\wedge dx + (z+z_0)^2\,dx\wedge dy.$$

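The integral of $(z+z_0)^2\,dx\wedge dy$ may be computed by using parametrizations similar to $\sigma_1$ and $\sigma_2$ in Example 7.2.18:
$$\begin{aligned}
\int_{S+\vec x_0}(z+z_0)^2\,dx\wedge dy
&= \int_{\frac{x^2}{a^2}+\frac{y^2}{b^2}\le 1}\left(c\sqrt{1-\frac{x^2}{a^2}-\frac{y^2}{b^2}}+z_0\right)^2\det\frac{\partial(x,y)}{\partial(x,y)}\,dxdy\\
&\quad+\int_{\frac{x^2}{a^2}+\frac{y^2}{b^2}\le 1}\left(-c\sqrt{1-\frac{x^2}{a^2}-\frac{y^2}{b^2}}+z_0\right)^2\det\frac{\partial(x,y)}{\partial(y,x)}\,dxdy\\
&=4z_0c\int_{\frac{x^2}{a^2}+\frac{y^2}{b^2}\le 1}\sqrt{1-\frac{x^2}{a^2}-\frac{y^2}{b^2}}\,dxdy\\
&=4abcz_0\int_{u^2+v^2\le1}\sqrt{1-u^2-v^2}\,dudv=\frac{8}{3}\pi abcz_0,
\end{aligned}$$
where the last integral equals $\frac{2\pi}{3}$, half the volume of the unit ball.

The total flux is $\dfrac{8\pi}{3}abc(x_0+y_0+z_0)$.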
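As a sanity check on the total flux of $F=(x^2,y^2,z^2)$ through the ellipsoid, the flux integral can be approximated directly from a parametrization. The sketch below is an illustration only: the name `ellipsoid_flux` and the longitude/latitude parametrization are assumptions made here, not the parametrizations $\sigma_1$, $\sigma_2$ of Example 7.2.18. The value it approximates is $\frac{8\pi}{3}abc(x_0+y_0+z_0)$, which is also what integrating $\operatorname{div}F = 2(x+y+z)$ over the solid ellipsoid gives.

```python
import math

def ellipsoid_flux(a, b, c, x0, y0, z0, n=200):
    """Midpoint-rule approximation of the outward flux of (x^2, y^2, z^2).

    Parametrization (an illustrative choice):
      x = x0 + a cos(ph) cos(th), y = y0 + b cos(ph) sin(th), z = z0 + c sin(ph),
    with 0 <= th <= 2 pi and -pi/2 <= ph <= pi/2.
    """
    dth = 2 * math.pi / n
    dph = math.pi / n
    total = 0.0
    for i in range(n):
        th = (i + 0.5) * dth
        for j in range(n):
            ph = -math.pi / 2 + (j + 0.5) * dph
            x = x0 + a * math.cos(ph) * math.cos(th)
            y = y0 + b * math.cos(ph) * math.sin(th)
            z = z0 + c * math.sin(ph)
            # Components of sigma_th x sigma_ph, which points outward.
            nx = b * c * math.cos(ph) ** 2 * math.cos(th)
            ny = a * c * math.cos(ph) ** 2 * math.sin(th)
            nz = a * b * math.sin(ph) * math.cos(ph)
            total += (x * x * nx + y * y * ny + z * z * nz) * dth * dph
    return total

if __name__ == "__main__":
    approx = ellipsoid_flux(1.0, 2.0, 3.0, 0.5, -1.0, 2.0)
    exact = 8 * math.pi / 3 * (1.0 * 2.0 * 3.0) * (0.5 - 1.0 + 2.0)
    print(approx, exact)
```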

Exercise 7.2.31. Compute the integrals.

1. $\int_S xy^2\,dy\wedge dz + yz^2\,dz\wedge dx + zx^2\,dx\wedge dy$, $S$ is the ellipsoid $\dfrac{(x-x_0)^2}{a^2}+\dfrac{(y-y_0)^2}{b^2}+\dfrac{(z-z_0)^2}{c^2} = 1$ with orientation given by the inward normal vector.

2. $\int_S xyz\,dx\wedge dy$, $S$ is the part of the ellipsoid $\dfrac{x^2}{a^2}+\dfrac{y^2}{b^2}+\dfrac{z^2}{c^2}=1$, $x\ge 0$, $y\ge 0$, with orientation given by the outward normal vector.

3. $\int_S xy\,dy\wedge dz$ and $\int_S x^2y\,dy\wedge dz$, $S$ is the boundary of the solid enclosed by $z = x^2+y^2$ and $z=1$, with orientation given by the outward normal vector.

4. $\int_S dx_1\wedge dx_2 + dx_2\wedge dx_3+\cdots+dx_{n-1}\wedge dx_n$, $S$ is the surface $\sigma(u,v) = (u+v, u^2+v^2,\dots,u^n+v^n)$ for $0\le u\le a$, $0\le v\le b$.

Exercise 7.2.32. Prove that the area of a surface $S\subset\mathbb{R}^3$ given by an equation $g(x,y,z)=c$ is
$$\int_S\frac{g_x\,dy\wedge dz+g_y\,dz\wedge dx+g_z\,dx\wedge dy}{\sqrt{g_x^2+g_y^2+g_z^2}}.$$

7.2.7 Volume and Integration on Hypersurface

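Suppose $S \subset \mathbb{R}^n$ is a $k$-dimensional hypersurface parametrized by a continuously differentiable map $\sigma(\vec u) = \sigma(u_1,u_2,\dots,u_k)\colon \mathbb{R}^k\to\mathbb{R}^n$ on a subset $A\subset\mathbb{R}^k$ with volume. The ($k$-dimensional) volume of the hypersurface is
$$\begin{aligned}
\mu(S) &= \int_A \|\sigma_{u_1}\wedge\sigma_{u_2}\wedge\cdots\wedge\sigma_{u_k}\|_2\,du_1du_2\cdots du_k\\
&= \int_A\sqrt{\det(\sigma_{u_i}\cdot\sigma_{u_j})_{1\le i,j\le k}}\,du_1du_2\cdots du_k.
\end{aligned}\tag{7.2.23}$$
We also denote
$$dV = \|\sigma_{u_1}\wedge\sigma_{u_2}\wedge\cdots\wedge\sigma_{u_k}\|_2\,du_1du_2\cdots du_k, \tag{7.2.24}$$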

where $V$ indicates the volume. The definition is independent of the choice of the parametrization.
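For $k=2$ the Gram determinant in (7.2.23) is $(\sigma_u\cdot\sigma_u)(\sigma_v\cdot\sigma_v)-(\sigma_u\cdot\sigma_v)^2$, and the formula works in any ambient dimension. The following sketch illustrates this under stated assumptions: the helper name `area_by_gram`, the central finite differences for the partial derivatives, and the two test surfaces are all choices made here for illustration.

```python
import math

def area_by_gram(sigma, u_range, v_range, n=200, eps=1e-6):
    """Approximate the area of a parametrized 2-dimensional surface in R^m.

    sigma(u, v) returns a tuple of m coordinates. The area element is
    sqrt(det(Gram matrix)) = sqrt(E*G - F^2), as in (7.2.23) with k = 2.
    """
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * du
        for j in range(n):
            v = v0 + (j + 0.5) * dv
            # Central differences approximate the partial derivatives.
            su = [(p - q) / (2 * eps)
                  for p, q in zip(sigma(u + eps, v), sigma(u - eps, v))]
            sv = [(p - q) / (2 * eps)
                  for p, q in zip(sigma(u, v + eps), sigma(u, v - eps))]
            E = sum(s * s for s in su)
            G = sum(s * s for s in sv)
            F = sum(s * t for s, t in zip(su, sv))
            total += math.sqrt(max(E * G - F * F, 0.0)) * du * dv
    return total

def unit_sphere(u, v):
    """Unit sphere in R^3 by longitude u and latitude v."""
    return (math.cos(v) * math.cos(u), math.cos(v) * math.sin(u), math.sin(v))

def flat_torus(u, v):
    """A torus in R^4, where no cross product is available."""
    return (math.cos(u), math.sin(u), math.cos(v), math.sin(v))

if __name__ == "__main__":
    print(area_by_gram(unit_sphere, (0, 2 * math.pi), (-math.pi / 2, math.pi / 2)))
    print(area_by_gram(flat_torus, (0, 2 * math.pi), (0, 2 * math.pi)))
```

The first value should be close to $4\pi$ and the second close to $4\pi^2$, since the torus in $\mathbb{R}^4$ has area element $1$ on a square of side $2\pi$.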

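Example 7.2.23. In Example 7.1.11, for a parametrization $F(\vec\theta)\colon \mathbb{R}^{n-1}\to\mathbb{R}^n$ of the unit sphere $S^{n-1}$, we claimed that the integral $\beta_{n-1} = \int_A |\det(F\ F')|\,d\mu_{\vec\theta}$ is the area of the sphere. Now the claim can be justified.

Since $F\cdot F = \|F\|_2^2 = 1$, by taking partial derivatives, we get $F_{\theta_i}\cdot F = 0$. Therefore $F$ is a unit length vector orthogonal to $F_{\theta_i}$. This implies that
$$\|F_{\theta_1}\wedge F_{\theta_2}\wedge\cdots\wedge F_{\theta_{n-1}}\|_2 = \|F\wedge F_{\theta_1}\wedge F_{\theta_2}\wedge\cdots\wedge F_{\theta_{n-1}}\|_2 = |\det(F\ F')|,$$
where the second equality follows from the interpretation of the determinant as the wedge product of $n$ vectors in $\mathbb{R}^n$. Thus the volume of the sphere is
$$\int_A\|F_{\theta_1}\wedge F_{\theta_2}\wedge\cdots\wedge F_{\theta_{n-1}}\|_2\,d\mu_{\vec\theta} = \int_A|\det(F\ F')|\,d\mu_{\vec\theta}.$$

Continuing with the computation of the volume of the unit sphere, we note that the parametrization of $S^{n-1}$ induces a parametrization of $S^n$
$$G(\vec\theta,\phi) = (F(\vec\theta)\cos\phi,\ \sin\phi)\colon A\times\left[-\frac{\pi}{2},\frac{\pi}{2}\right]\to\mathbb{R}^{n+1}.$$
We have
$$\begin{aligned}
&\|G_{\theta_1}\wedge G_{\theta_2}\wedge\cdots\wedge G_{\theta_{n-1}}\wedge G_\phi\|_2\\
&=\|(F_{\theta_1}\cos\phi,0)\wedge(F_{\theta_2}\cos\phi,0)\wedge\cdots\wedge(F_{\theta_{n-1}}\cos\phi,0)\wedge(-F\sin\phi,\cos\phi)\|_2\\
&=\|(F_{\theta_1}\cos\phi,0)\wedge(F_{\theta_2}\cos\phi,0)\wedge\cdots\wedge(F_{\theta_{n-1}}\cos\phi,0)\|_2\,\|(-F\sin\phi,\cos\phi)\|_2\\
&=\|F_{\theta_1}\wedge F_{\theta_2}\wedge\cdots\wedge F_{\theta_{n-1}}\|_2\cos^{n-1}\phi,
\end{aligned}$$
where the second equality is due to the fact that $(-F\sin\phi,\cos\phi)$ is orthogonal to $(F_{\theta_i}\cos\phi,0)$, and we use $\|(-F\sin\phi,\cos\phi)\|_2 = \sqrt{\|F\|_2^2\sin^2\phi+\cos^2\phi} = \sqrt{\sin^2\phi+\cos^2\phi}=1$ in the last equality. Therefore
$$\beta_n = \int_{\vec\theta\in A,\ -\frac{\pi}{2}\le\phi\le\frac{\pi}{2}} \|F_{\theta_1}\wedge F_{\theta_2}\wedge\cdots\wedge F_{\theta_{n-1}}\|_2\cos^{n-1}\phi\,d\mu_{\vec\theta}\,d\phi = \beta_{n-1}\int_{-\frac{\pi}{2}}^{\frac{\pi}{2}}\cos^{n-1}\phi\,d\phi.$$
Using integration by parts, we get (see Exercise 3.2.19)
$$\int_{-\frac{\pi}{2}}^{\frac{\pi}{2}}\cos^n\phi\,d\phi = \frac{n-1}{n}\int_{-\frac{\pi}{2}}^{\frac{\pi}{2}}\cos^{n-2}\phi\,d\phi = \cdots =
\begin{cases}
\dfrac{(n-1)(n-3)\cdots 1}{n(n-2)\cdots 2}\,\pi & \text{if $n$ is even},\\[2ex]
\dfrac{(n-1)(n-3)\cdots 2}{n(n-2)\cdots 3}\,2 & \text{if $n$ is odd}.
\end{cases}$$
Therefore $\beta_n = \dfrac{2\pi}{n-1}\beta_{n-2}$. Combined with $\beta_1 = 2\pi$, $\beta_2=4\pi$, we get
$$\beta_{n-1} =
\begin{cases}
\dfrac{2\pi^k}{(k-1)!} & \text{if } n=2k,\\[2ex]
\dfrac{2^{k+1}\pi^k}{(2k-1)(2k-3)\cdots 1} & \text{if } n = 2k+1.
\end{cases}$$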

Finally, by Example 7.1.11, the volume of the unit ball is
$$\alpha_n = \frac{\beta_{n-1}}{n} =
\begin{cases}
\dfrac{\pi^k}{k!} & \text{if } n=2k,\\[2ex]
\dfrac{2^{k+1}\pi^k}{(2k+1)(2k-1)(2k-3)\cdots 1} & \text{if } n=2k+1.
\end{cases}$$
The result may be compared with Exercise 7.1.36.
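The recursion $\beta_n = \frac{2\pi}{n-1}\beta_{n-2}$ with $\beta_1=2\pi$, $\beta_2=4\pi$ is easy to evaluate. The sketch below is illustrative: the names `sphere_area` and `ball_volume` are introduced here, and the comparison against the Gamma-function expressions $2\pi^{(m+1)/2}/\Gamma(\frac{m+1}{2})$ and $\pi^{n/2}/\Gamma(\frac{n}{2}+1)$ uses standard closed forms that are not stated in the text.

```python
import math

def sphere_area(m):
    """beta_m, the m-dimensional volume of the unit sphere S^m (m >= 1)."""
    if m < 1:
        raise ValueError("m must be at least 1")
    if m == 1:
        return 2 * math.pi
    if m == 2:
        return 4 * math.pi
    # Recursion from Example 7.2.23: beta_m = 2*pi/(m-1) * beta_{m-2}.
    return 2 * math.pi / (m - 1) * sphere_area(m - 2)

def ball_volume(n):
    """alpha_n = beta_{n-1} / n, the volume of the unit ball in R^n (n >= 2)."""
    return sphere_area(n - 1) / n

if __name__ == "__main__":
    for n in range(2, 8):
        gamma_form = math.pi ** (n / 2) / math.gamma(n / 2 + 1)
        print(n, ball_volume(n), gamma_form)
```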

A function $f(\vec x)$ is Riemann integrable on the hypersurface $S$ if $f(\sigma(\vec u))$ is Riemann integrable on $A$. The integral is
$$\int_S f\,dV = \int_A f(\sigma(u_1,u_2,\dots,u_k))\,\|\sigma_{u_1}\wedge\sigma_{u_2}\wedge\cdots\wedge\sigma_{u_k}\|_2\,du_1du_2\cdots du_k. \tag{7.2.25}$$
The integral is independent of the choice of the regular parametrizations.

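Example 7.2.24. Consider a $k$-dimensional hypersurface $S$ parametrized by
$$\sigma(\vec u) = (\xi(\vec u), r(\vec u))\colon A\subset\mathbb{R}^k\to\mathbb{R}^n\times\mathbb{R}.$$
Assume $r(\vec u)\ge 0$, so that the hypersurface is inside the upper half Euclidean space. The $l$-dimensional rotation of $S$ around the axis $\mathbb{R}^n\times 0$ is a $(k+l)$-dimensional hypersurface
$$\rho_l(S) = \{(\vec x,\vec y)\in\mathbb{R}^n\times\mathbb{R}^{l+1} : (\vec x,\|\vec y\|_2)\in S\}.$$
Let $F(\vec\theta)\colon B\subset\mathbb{R}^l\to\mathbb{R}^{l+1}$ be a parametrization of the unit sphere $S^l$. Then the rotation hypersurface $\rho_l(S)$ is parametrized by
$$\rho(\vec u,\vec\theta) = (\xi(\vec u),\ r(\vec u)F(\vec\theta))\colon A\times B\to\mathbb{R}^n\times\mathbb{R}^{l+1}.$$
By $\rho_{u_i} = (\xi_{u_i}, r_{u_i}F)$, $\rho_{\theta_j} = (\vec 0, rF_{\theta_j})$, and $\rho_{u_i}\cdot\rho_{\theta_j} = r_{u_i}F\cdot rF_{\theta_j} = 0$, we get
$$\|\rho_{u_1}\wedge\cdots\wedge\rho_{u_k}\wedge\rho_{\theta_1}\wedge\cdots\wedge\rho_{\theta_l}\|_2 = \|\rho_{u_1}\wedge\cdots\wedge\rho_{u_k}\|_2\,\|\rho_{\theta_1}\wedge\cdots\wedge\rho_{\theta_l}\|_2,$$
and
$$\|\rho_{\theta_1}\wedge\rho_{\theta_2}\wedge\cdots\wedge\rho_{\theta_l}\|_2 = r^l\,\|F_{\theta_1}\wedge F_{\theta_2}\wedge\cdots\wedge F_{\theta_l}\|_2.$$
Moreover, there is an orthogonal transform $U_F$ on $\mathbb{R}^{l+1}$ such that $U_F(F) = (1,0,\dots,0)$. Then $(\mathrm{id},U_F)$ is an orthogonal transform on $\mathbb{R}^n\times\mathbb{R}^{l+1}$ such that
$$(\mathrm{id},U_F)\rho_{u_i} = (\xi_{u_i}, r_{u_i}U_F(F)) = (\xi_{u_i}, r_{u_i},0,\dots,0) = (\sigma_{u_i},0,\dots,0).$$
Applying the orthogonal transform to $\|\rho_{u_1}\wedge\rho_{u_2}\wedge\cdots\wedge\rho_{u_k}\|_2$, we get
$$\|\rho_{u_1}\wedge\rho_{u_2}\wedge\cdots\wedge\rho_{u_k}\|_2 = \|\sigma_{u_1}\wedge\sigma_{u_2}\wedge\cdots\wedge\sigma_{u_k}\|_2.$$
Thus the volume of the rotation hypersurface is
$$\begin{aligned}
\mu(\rho_l(S)) &= \int_{A\times B}\|\sigma_{u_1}\wedge\cdots\wedge\sigma_{u_k}\|_2\,r^l\,\|F_{\theta_1}\wedge\cdots\wedge F_{\theta_l}\|_2\,d\mu_{\vec u}\,d\mu_{\vec\theta}\\
&=\beta_l\int_A\|\sigma_{u_1}\wedge\cdots\wedge\sigma_{u_k}\|_2\,r^l\,d\mu_{\vec u} = \beta_l\int_S r^l\,dV.
\end{aligned}$$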
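The formula $\mu(\rho_l(S)) = \beta_l\int_S r^l\,dV$ can be tried on curves ($k=1$, $n=1$), where $dV$ is the arc length. The sketch below is a hedged illustration: `rotation_volume` is a name invented here, the arc length is approximated by chords, and only $l=1,2$ are supported through the values $\beta_1=2\pi$, $\beta_2=4\pi$.

```python
import math

def rotation_volume(xi, r, u0, u1, l=1, n=2000):
    """Approximate beta_l * (integral of r^l ds) for the curve u -> (xi(u), r(u)).

    This is the (1 + l)-dimensional volume of the l-dimensional rotation
    of the curve around the xi-axis, following Example 7.2.24.
    """
    beta = {1: 2 * math.pi, 2: 4 * math.pi}[l]  # beta_1 and beta_2
    h = (u1 - u0) / n
    total = 0.0
    for i in range(n):
        ua, ub = u0 + i * h, u0 + (i + 1) * h
        um = 0.5 * (ua + ub)
        ds = math.hypot(xi(ub) - xi(ua), r(ub) - r(ua))  # chord length
        total += r(um) ** l * ds
    return beta * total

if __name__ == "__main__":
    a, b = 1.0, 3.0
    # Circle of radius a centered at distance b from the axis: a torus, area 4 pi^2 a b.
    torus = rotation_volume(lambda u: a * math.sin(u),
                            lambda u: b + a * math.cos(u), 0.0, 2 * math.pi)
    print(torus, 4 * math.pi ** 2 * a * b)
    # Half circle rotated with l = 2 gives S^3, whose volume is beta_3 = 2 pi^2.
    s3 = rotation_volume(math.cos, math.sin, 0.0, math.pi, l=2)
    print(s3, 2 * math.pi ** 2)
```

The second check ties back to Example 7.2.23: rotating a half circle two-dimensionally recovers $\beta_3 = 2\pi^2$.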

Exercise 7.2.33. Use the rotation to prove that $\beta_{k+l+1} = \beta_k\beta_l\displaystyle\int_0^{\frac{\pi}{2}}\cos^k\theta\,\sin^l\theta\,d\theta$.

Exercise 7.2.34. Extend the result of Example 7.2.24 to the rotation around a hyperplane $\vec a\cdot\vec x = b$ (the surface is on the positive side of the hyperplane).

A differential $k$-form on $\mathbb{R}^n$ is an expression of the form
$$\omega = \sum_{i_1<i_2<\cdots<i_k} f_{i_1i_2\cdots i_k}(\vec x)\,dx_{i_1}\wedge dx_{i_2}\wedge\cdots\wedge dx_{i_k}. \tag{7.2.26}$$
The integral of the differential $k$-form on the hypersurface $S$ with respect to a regular parametrization $\sigma$ is
$$\int_S\omega = \int_A \sum_{i_1<i_2<\cdots<i_k} f_{i_1i_2\cdots i_k}(\sigma(\vec u))\,\det\frac{\partial(x_{i_1},x_{i_2},\dots,x_{i_k})}{\partial(u_1,u_2,\dots,u_k)}\,du_1du_2\cdots du_k. \tag{7.2.27}$$
Similar to the integral of differential 2-forms on surfaces, the integral is independent of the parametrizations as long as they are compatibly oriented. The compatibility means that the determinant of the Jacobian for the change of variables (the so called transition map) is positive. In general, the hypersurface may be divided into several compatibly oriented pieces and the integral of the differential form is the sum of the integrals on the pieces.

Now we consider the special case of the integral of an $(n-1)$-form on an oriented $(n-1)$-dimensional hypersurface $S\subset\mathbb{R}^n$. At any $\vec x\in S$, there are two unit length vectors orthogonal to the tangent hyperplane. Specifically, if $\sigma(\vec u)$ is a regular parametrization, then the tangent hyperplane is spanned by $\sigma_{u_1},\sigma_{u_2},\dots,\sigma_{u_{n-1}}$. Define the normal vector compatible with the orientation of the parametrization to be
$$\vec n = (-1)^{n-1}\,\frac{(\sigma_{u_1}\wedge\sigma_{u_2}\wedge\cdots\wedge\sigma_{u_{n-1}})^\star}{\|\sigma_{u_1}\wedge\sigma_{u_2}\wedge\cdots\wedge\sigma_{u_{n-1}}\|_2}. \tag{7.2.28}$$
The compatible normal vector is characterized by the property that $\vec n$ has length 1, is orthogonal to $\sigma_{u_i}$, and the basis $\vec n,\sigma_{u_1},\sigma_{u_2},\dots,\sigma_{u_{n-1}}$ has positive orientation (meaning $\det(\vec n\ \sigma_{u_1}\ \sigma_{u_2}\ \cdots\ \sigma_{u_{n-1}}) > 0$).

Let ($\widehat{x_i}$ means the term $x_i$ is missing from the list)
$$\begin{aligned}
\vec a = (\sigma_{u_1}\wedge\sigma_{u_2}\wedge\cdots\wedge\sigma_{u_{n-1}})^\star
&=\sum_i \det\frac{\partial(x_1,x_2,\dots,\widehat{x_i},\dots,x_n)}{\partial(u_1,u_2,\dots,u_{n-1})}\,\vec e^{\,\star}_{\wedge([n]-i)}\\
&=\sum_i (-1)^{n-i}\det\frac{\partial(x_1,x_2,\dots,\widehat{x_i},\dots,x_n)}{\partial(u_1,u_2,\dots,u_{n-1})}\,\vec e_i.
\end{aligned}$$

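Then define the flux of a vector field $F = (f_1,f_2,\dots,f_n)\colon S\to\mathbb{R}^n$ to be
$$\begin{aligned}
\int_S F\cdot\vec n\,dV &= (-1)^{n-1}\int_A F\cdot\vec a\,du_1du_2\cdots du_{n-1}\\
&=\int_A\sum_i(-1)^{i-1}f_i\det\frac{\partial(x_1,x_2,\dots,\widehat{x_i},\dots,x_n)}{\partial(u_1,u_2,\dots,u_{n-1})}\,du_1du_2\cdots du_{n-1}\\
&=\int_S\sum_i(-1)^{i-1}f_i\,dx_1\wedge dx_2\wedge\cdots\wedge\widehat{dx_i}\wedge\cdots\wedge dx_n.
\end{aligned}\tag{7.2.29}$$
This extends the relation between the flux in $\mathbb{R}^3$ and the integration of 2-forms in Section 7.2.6.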

For the special case $n = 2$, we have an oriented curve $C\subset\mathbb{R}^2$ parametrized by $\phi(t)$, and the compatible normal vector $\vec n$ is the unit length vector with direction obtained by rotating the tangent vector $\phi'$ in clockwise direction by 90 degrees. Moreover,
$$\int_C (f,g)\cdot\vec n\,ds = \int_C f\,dy - g\,dx = \int_C (-g,f)\cdot d\vec x. \tag{7.2.30}$$

Exercise 7.2.35. Denote
$$\begin{aligned}
\pi(v_0,v_1,\dots,v_n) &= (v_1,v_0,v_2,\dots,v_n),\\
\rho_i(v_0,v_1,\dots,v_n) &= (v_{i+1},v_{i+2},\dots,v_n,v_0,v_1,\dots,v_{i-1},v_i).
\end{aligned}$$
Prove that the unit sphere $S^n$ is covered by compatibly oriented regular parametrizations
$$\sigma_i^+(\vec u) = \rho_i\left(\sqrt{1-\|\vec u\|_2^2},\ \vec u\right),\qquad \sigma_i^-(\vec u) = \rho_i\pi\left(-\sqrt{1-\|\vec u\|_2^2},\ \vec u\right)$$
defined for $1\le i\le n$ and $\vec u\in\mathbb{R}^n$ satisfying $\|\vec u\|_2<1$.

Exercise 7.2.36. Use (7.2.29) to show that the length of a curve $C$ in $\mathbb{R}^2$ given by an equation $g(x,y)=c$ is $\displaystyle\int_C\frac{-g_y\,dx+g_x\,dy}{\sqrt{g_x^2+g_y^2}}$, similar to the formula in Exercise 7.2.32. Extend the formula to the volume of an $(n-1)$-dimensional hypersurface in $\mathbb{R}^n$ given by $g(x_1,x_2,\dots,x_n)=c$. What about the $(n-k)$-dimensional hypersurface defined by a map $G\colon\mathbb{R}^n\to\mathbb{R}^k$?

Exercise 7.2.37. Compute the integrals.

1. $\int_{S^{n-1}_R}(\vec x-\vec a)\cdot\vec n\,dV$, the normal vector $\vec n$ points to the outside of the sphere.

2. $\int_{S^{n-1}_R}(a_1x_1+a_2x_2+\cdots+a_nx_n)\,dx_2\wedge dx_3\wedge\cdots\wedge dx_n$, the orientation of the sphere is given by the outward normal vector.

3. $\int_S dx_1\wedge dx_2\wedge dx_3+dx_2\wedge dx_3\wedge dx_4+\cdots+dx_{n-2}\wedge dx_{n-1}\wedge dx_n$, $S$ is the surface $\sigma(u,v,w) = \rho(u)+\rho(v)+\rho(w)$, $\rho(u) = (u,u^2,\dots,u^n)$, $0\le u,v,w\le a$.

We end the section with the discussion of the direction of the normal vector for the hypersurfaces given by graphs of functions.

Consider the graph $\sigma(\vec u) = (\vec u, h(\vec u))$ of a continuously differentiable function $h$ on $\mathbb{R}^{n-1}$. By
$$\begin{aligned}
(\sigma_{u_1}\wedge\sigma_{u_2}\wedge\cdots\wedge\sigma_{u_{n-1}})^\star
&= ((\vec e_1+h_{u_1}\vec e_n)\wedge(\vec e_2+h_{u_2}\vec e_n)\wedge\cdots\wedge(\vec e_{n-1}+h_{u_{n-1}}\vec e_n))^\star\\
&=(\vec e_1\wedge\vec e_2\wedge\cdots\wedge\vec e_{n-1})^\star+\sum_i(-1)^{n-1-i}h_{u_i}(\vec e_1\wedge\vec e_2\wedge\cdots\wedge\widehat{\vec e_i}\wedge\cdots\wedge\vec e_n)^\star\\
&=\vec e_n-\sum_i h_{u_i}\vec e_i = (-h_{u_1},-h_{u_2},\dots,-h_{u_{n-1}},1),
\end{aligned}$$
the normal vector
$$\vec n = \frac{(-1)^{n-1}(-h_{u_1},-h_{u_2},\dots,-h_{u_{n-1}},1)}{\sqrt{1+h_{u_1}^2+h_{u_2}^2+\cdots+h_{u_{n-1}}^2}}$$
points in the direction of $x_n$ when $n$ is odd and opposite to the direction of $x_n$ when $n$ is even.

In general, consider the graph
$$\sigma_i(\vec u) = \sigma_i(u_1,u_2,\dots,u_{n-1}) = (u_1,u_2,\dots,u_{i-1},h(\vec u),u_i,\dots,u_{n-1})$$
of a continuously differentiable function $h$. We have $\sigma_i(\vec u) = U(\sigma(\vec u))$ for the orthogonal transform
$$U(x_1,x_2,\dots,x_n) = (x_1,x_2,\dots,x_{i-1},x_n,x_i,\dots,x_{n-1}).$$
By the characterization of the compatible normal vector, the normal vector compatible with the parametrization $\sigma_i$ is
$$(\det U)\,U(\vec n) = \frac{(-1)^{i-1}(-h_{u_1},-h_{u_2},\dots,-h_{u_{i-1}},1,-h_{u_i},\dots,-h_{u_{n-1}})}{\sqrt{1+h_{u_1}^2+h_{u_2}^2+\cdots+h_{u_{n-1}}^2}}.$$
The normal direction points in the direction of $x_i$ when $i$ is odd and opposite to the direction of $x_i$ when $i$ is even.

7.2.8 Exercise

Isometry between Normed Spaces

Let $\|\vec x\|$ and $|\!|\!|\vec x|\!|\!|$ be norms on $\mathbb{R}^n$ and $\mathbb{R}^m$. A map $F\colon\mathbb{R}^n\to\mathbb{R}^m$ is an isometry if it satisfies $|\!|\!|F(\vec x)-F(\vec y)|\!|\!| = \|\vec x-\vec y\|$. Exercise 7.2.5 says that isometries preserve the length of curves.

A map $F\colon\mathbb{R}^n\to\mathbb{R}^m$ is affine if the following equivalent conditions are satisfied.

1. $F((1-t)\vec x+t\vec y) = (1-t)F(\vec x)+tF(\vec y)$ for any $\vec x,\vec y\in\mathbb{R}^n$ and any $0\le t\le 1$.

2. $F((1-t)\vec x+t\vec y) = (1-t)F(\vec x)+tF(\vec y)$ for any $\vec x,\vec y\in\mathbb{R}^n$ and any $t\in\mathbb{R}$.

3. $F(\vec x) = \vec a+L(\vec x)$ for a fixed vector $\vec a$ (necessarily equal to $F(\vec 0)$) and a linear transform $L$.

Exercise 7.2.38. Suppose the norm $|\!|\!|\vec x|\!|\!|$ on $\mathbb{R}^m$ is strictly convex, in the sense that $|\!|\!|\vec x+\vec y|\!|\!| = |\!|\!|\vec x|\!|\!| + |\!|\!|\vec y|\!|\!|$ implies $\vec x$ and $\vec y$ are parallel. Prove that for any isometry $F$ and $\vec x$, $\vec y$, $\vec z = (1-t)\vec x+t\vec y\in\mathbb{R}^n$, $0\le t\le 1$, the vectors $F(\vec z)-F(\vec x)$ and $F(\vec y)-F(\vec z)$ must be parallel. Then prove that the isometry $F$ is affine.

Exercise 7.2.39. Show that the $L^p$-norm is strictly convex for $1<p<\infty$. Then show that an isometry between Euclidean spaces with the Euclidean norms must be of the form $\vec a+L(\vec x)$, where $L$ is a linear transform with its matrix $A$ satisfying $A^TA = I$.

Exercise 7.2.40. For any $\vec u\in\mathbb{R}^n$, the map $\phi(\vec x) = 2\vec u-\vec x$ is the reflection with respect to $\vec u$. A subset $K\subset\mathbb{R}^n$ is symmetric with respect to $\vec u$ if $\vec x\in K\implies\phi(\vec x)\in K$. The subset has radius $r(K) = \sup_{\vec x\in K}\|\vec x-\vec u\|$.

1. Prove that $\phi$ is an isometry from $(\mathbb{R}^n,\|\vec x\|)$ to itself, $\|\phi(\vec x)-\vec x\| = 2\|\vec x-\vec u\|$, and $\vec u$ is the only point fixed by $\phi$ (i.e., satisfying $\phi(\vec x) = \vec x$).

2. For a subset $K$ symmetric with respect to $\vec u$, prove that the subset has diameter $\sup_{\vec x,\vec y\in K}\|\vec x-\vec y\| = 2r(K)$. Then prove that the subset $K' = \{\vec x : K\subset B(\vec x,r(K))\}$ has radius $r(K')\le\frac{1}{2}r(K)$.

3. For any $\vec a,\vec b\in\mathbb{R}^n$, denote $\vec u = \dfrac{\vec a+\vec b}{2}$. Prove that
$$K_0 = \left\{\vec x : \|\vec x-\vec a\| = \|\vec x-\vec b\| = \tfrac{1}{2}\|\vec a-\vec b\|\right\}$$
is symmetric with respect to $\vec u$. Then prove that the sequence $K_n$ defined by $K_{n+1} = K_n'$ satisfies $\cap K_n = \{\vec u\}$.

The last part gives a characterization of the middle point $\vec u$ of two points $\vec a$ and $\vec b$ purely in terms of the norm.

Exercise 7.2.41. (Mazur-Ulam Theorem) Prove that an invertible isometry is necessarily affine. Specifically, suppose $F\colon(\mathbb{R}^n,\|\vec x\|)\to(\mathbb{R}^n,|\!|\!|\vec x|\!|\!|)$ is an invertible isometry. By using the characterization of the middle point in Exercise 7.2.40, prove that the map preserves the middle point
$$F\left(\frac{\vec a+\vec b}{2}\right) = \frac{F(\vec a)+F(\vec b)}{2}.$$
Then further prove that the property implies $F$ is affine.

Exercise 7.2.42. Let $\phi(t)$ be a real function and consider $F(t) = (t,\phi(t))\colon\mathbb{R}\to\mathbb{R}^2$. For the absolute value on $\mathbb{R}$ and the $L^\infty$-norm on $\mathbb{R}^2$, find a suitable condition on $\phi$ to make sure $F$ is an isometry. The exercise shows that an isometry is not necessarily affine in general.

7.3 Stokes Theorem

The fundamental theorem of calculus relates the integration of a function on an interval to the values of the antiderivative at the end points of the interval. The Stokes theorem extends the result to higher dimensional integrations.

7.3.1 Green Theorem

A curve $\phi\colon[a,b]\to\mathbb{R}^n$ is simple if it has no self intersection. It is closed if $\phi(a) = \phi(b)$. The concept of simple closed curve is independent of the choice of parametrization.

A simple closed curve in $\mathbb{R}^2$ divides the plane into two path connected pieces, one bounded and the other unbounded. Therefore we have two sides of simple closed curves in $\mathbb{R}^2$. Suppose $A\subset\mathbb{R}^2$ is a bounded subset with finitely many rectifiable simple closed curves $C_1,C_2,\dots,C_k$ as the boundary. Since rectifiable curves have area zero (see Exercise 7.2.3), the subset $A$ has area. We further assume that $A$ is contained in only one side of each $C_i$. Then $C_i$ has a compatible orientation such that $A$ is “on the left” of $C_i$. This means that $C_i$ has counterclockwise orientation if $A$ is in the bounded side of $C_i$, and has clockwise orientation if $A$ is in the unbounded side. This also means that if $C_i$ is oriented and differentiable, then rotating the tangent vector $\vec t$ by 90 degrees in clockwise direction will produce a normal vector $\vec n$ pointing away from $A$.

[Figure omitted. Labels in the figure: $A$, $C_1$, $C_2$, $C_3$, $\vec n$, $\vec t$.]

Figure 7.6: orientation of boundary

Theorem 7.3.1 (Green Theorem). Suppose $A\subset\mathbb{R}^2$ is a region with compatibly oriented rectifiable simple closed boundary curves $C$. Then for any continuously differentiable functions $f$ and $g$, we have
$$\int_C f\,dx+g\,dy = \int_A(-f_y+g_x)\,dxdy. \tag{7.3.1}$$

In Green theorem, $C$ is understood to be the union of simple closed curves $C_1,C_2,\dots,C_k$, each oriented in the compatible way. The integral in (7.3.1) along $C$ is understood to be
$$\int_C = \int_{C_1}+\int_{C_2}+\cdots+\int_{C_k}.$$
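A quick numerical experiment illustrates (7.3.1) on the unit disk with its counterclockwise boundary circle. This is a sketch under assumptions made here: the helper names, the midpoint rules, and the test functions $f=-y^3$, $g=x^3$ (for which $-f_y+g_x = 3x^2+3y^2$ and both sides equal $\frac{3\pi}{2}$) are illustrative choices, not taken from the text.

```python
import math

def line_integral_unit_circle(f, g, n=2000):
    """Approximate the integral of f dx + g dy along x = cos t, y = sin t."""
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        x, y = math.cos(t), math.sin(t)
        # dx = -sin t dt and dy = cos t dt along the counterclockwise circle.
        total += (f(x, y) * (-math.sin(t)) + g(x, y) * math.cos(t)) * h
    return total

def area_integral_unit_disk(F, n=400):
    """Approximate the integral of F(x, y) dxdy over the unit disk (polar midpoint rule)."""
    dr, dth = 1.0 / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            th = (j + 0.5) * dth
            total += F(r * math.cos(th), r * math.sin(th)) * r * dr * dth
    return total

if __name__ == "__main__":
    lhs = line_integral_unit_circle(lambda x, y: -y ** 3, lambda x, y: x ** 3)
    rhs = area_integral_unit_disk(lambda x, y: 3 * y ** 2 + 3 * x ** 2)
    print(lhs, rhs, 3 * math.pi / 2)
```

Reversing the orientation of the circle changes the sign of the line integral only, which is why the compatible orientation matters in the statement.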

Proof. We first consider the special case that
$$A = \{(x,y) : x\in[a,b],\ h(x)\ge y\ge k(x)\}$$
is a region between the graphs of functions $h(x)$ and $k(x)$ with bounded variations and $g = 0$. We have
$$\begin{aligned}
\int_A f_y\,dxdy &= \int_a^b\left(\int_{k(x)}^{h(x)}f_y(x,y)\,dy\right)dx && \text{(Fubini Theorem)}\\
&=\int_a^b\bigl(f(x,h(x))-f(x,k(x))\bigr)\,dx. && \text{(fundamental theorem)}
\end{aligned}$$

[Figure omitted. Labels in the figure: region $A$; axes $x$, $y$ with marks $a$, $b$; graphs $y = h(x)$ and $y = k(x)$; boundary segments $C_l$, $C_r$, $C_b$, $C_t$.]

Figure 7.7: Green theorem for a special case

On the other hand, the boundary $C$ consists of four segments with parametrizations compatible with the orientation
$$\begin{aligned}
C_b &: \phi(t) = (t,k(t)), & t&\in[a,b],\\
C_r &: \phi(t) = (b,t), & t&\in[k(b),h(b)],\\
C_t &: \phi(t) = (-t,h(-t)), & t&\in[-b,-a],\\
C_l &: \phi(t) = (a,-t), & t&\in[-h(a),-k(a)].
\end{aligned}$$
Then
$$\begin{aligned}
\int_C f\,dx &= \left(\int_{C_b}+\int_{C_r}+\int_{C_t}+\int_{C_l}\right)f\,dx\\
&=\int_a^b f(t,k(t))\,dt+\int_{k(b)}^{h(b)}f(b,t)\,db+\int_{-b}^{-a}f(-t,h(-t))\,d(-t)+\int_{-h(a)}^{-k(a)}f(a,-t)\,da\\
&=\int_a^b f(x,k(x))\,dx+0+\int_b^a f(x,h(x))\,dx+0.
\end{aligned}$$

Therefore we have
$$\int_A f_y\,dxdy = -\int_C f\,dx.$$

Many regions can be divided by vertical lines into several special regions $A_j$ studied above. Let $C_j$ be the compatibly oriented boundary of $A_j$. Then the sum of $\int_{C_j}f\,dx$ is $\int_C f\,dx$, because the integrations along the vertical lines are all zero. Moreover, the sum of $\int_{A_j}f_y\,dxdy$ is $\int_A f_y\,dxdy$. This proves the equality $\int_C f\,dx = -\int_A f_y\,dxdy$ for the more general regions.

[Figure omitted. Labels in the figure: special regions $A_1$, $A_2$, $A_3$, $A_4$, $A_5$, $A_6$, $A_7$, $A_8$, $A_9$ separated by vertical lines.]

Figure 7.8: divide a general case to special cases

Now consider the most general case that the boundary curves are assumed to be only rectifiable. To simplify the presentation, we will assume that $A$ has only one boundary curve $C$. Parametrize the curve $C$ by the Euclidean arc length $\phi(s)\colon[a,b]\to\mathbb{R}^2$. For any natural number $n$, divide $[a,b]$ evenly into $n$ intervals of length $\delta = \dfrac{b-a}{n}$. Denote by $C_j$ the segment of $C$ between $\phi(s_{j-1})$ and $\phi(s_j)$. Let $L_j$ be the straight line connecting $\phi(s_{j-1})$ and $\phi(s_j)$. Let $L$ be the curve constructed by taking the union of $L_j$. Let $A'$ be the region enclosed by the curve $L$. Then $A'$ can be divided by vertical lines into special regions. As explained above, we have
$$\int_L f\,dx = -\int_{A'}f_y\,dxdy.$$

Let $M$ be the upper bound of $\|\nabla f\|_2 = \sqrt{f_x^2+f_y^2}$. Since $s$ is the Euclidean arc length, we have $\|\phi(s)-\phi(t)\|_2\le|s-t|$. Therefore for $\vec x\in C_j$ and $\vec y\in L_j$, we have
$$|f(\vec x)-f(\vec y)|\le M\|\vec x-\vec y\|_2\le M\|\vec x-\phi(s_j)\|_2+M\|\phi(s_j)-\vec y\|_2\le 2M\delta.$$


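This implies
$$\left| S(P, f\,dx, \varphi) - \int_L f\,dx \right| = \left| \sum \left( f(\varphi(s_j^*))\Delta x_j - \int_{L_j} f\,dx \right) \right| \le \sum \left| \int_{L_j} (f(\varphi(s_j^*)) - f(\vec{x}))\,dx \right| \le \sum 2M\delta|\Delta x_j| \le 2M\delta \sum \Delta s_j = 2M(b-a)\delta.$$
The estimation shows that $\lim_{n\to\infty} \int_L f\,dx = \int_C f\,dx$.

On the other hand, let $B_j$ be the region bounded by $C_j$ and $L_j$. Then $B_j \subset B(\varphi(s_j), \delta)$. Therefore $\mu(B_j) \le \pi\delta^2$ and, since $|f_y| \le \|\nabla f\|_2 \le M$,
$$\left| \int_A f_y\,dxdy - \int_{A'} f_y\,dxdy \right| \le \sum \int_{B_j} |f_y|\,dxdy \le \sum M\mu(B_j) \le nM\pi\delta^2 = \frac{\pi M(b-a)^2}{n}.$$
This shows that $\lim_{n\to\infty} \int_{A'} f_y\,dxdy = \int_A f_y\,dxdy$.

This completes the proof of $\int_C f\,dx = -\int_A f_y\,dxdy$ for regions with rectifiable boundaries. The formula $\int_C g\,dy = \int_A g_x\,dxdy$ can be similarly proved.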
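The convergence of the integrals along the inscribed polygons $L$ to the integral along $C$ can be observed numerically. The following is a minimal sketch, not taken from the text, under the assumption that $C$ is the counterclockwise unit circle and $f(x, y) = -y$, so that $\int_C f\,dx$ is the area $\pi$ of the unit disk. The name `polygon_integral` is illustrative only.

```python
import math

def polygon_integral(n):
    """Integral of f dx, with f(x, y) = -y, along the inscribed n-gon L of the unit circle.

    On each straight edge L_j the integrand is linear, so evaluating f at the
    midpoint of the edge gives the exact integral over that edge.
    """
    total = 0.0
    for j in range(1, n + 1):
        s0 = 2 * math.pi * (j - 1) / n
        s1 = 2 * math.pi * j / n
        x0, y0 = math.cos(s0), math.sin(s0)
        x1, y1 = math.cos(s1), math.sin(s1)
        total += -(0.5 * (y0 + y1)) * (x1 - x0)
    return total

# The gap to pi (the integral along C) shrinks as the partition is refined.
for n in (8, 64, 512, 4096):
    value = polygon_integral(n)
    print(n, value, math.pi - value)
```

For this particular choice the polygon integral is the area of the regular $n$-gon, which is why the values increase towards $\pi$.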

Figure 7.9: a closed but not simple curve

Suppose a curve $C$ is closed but not simple. Then $C$ may enclose some parts of the region more than once, and the orientation of $C$ may not be compatible with some parts. For example, in Figure 7.9, the left curve intersects itself once, which divides the curve into the outside part $C_1$ and the inside part $C_2$. Then
$$\int_C f\,dx + g\,dy = \left( \int_{C_1} + \int_{C_2} \right) f\,dx + g\,dy = \left( \int_{A_1 \cup A_2} + \int_{A_1} \right) (-f_y + g_x)\,dxdy = \left( 2\int_{A_1} + \int_{A_2} \right) (-f_y + g_x)\,dxdy.$$


For the curve on the right, we have
$$\int_C f\,dx + g\,dy = \left( \int_{A_1} - \int_{A_2} \right) (-f_y + g_x)\,dxdy.$$

In general, we still have Green theorem for regions enclosed by (not necessarily simple) closed curves, as long as the regions are counted in the corresponding way. In fact, this remark is already used in the proof of Green theorem above, since the curve $L$ in the proof may not be simple.

Example 7.3.1. The area of the region $A$ enclosed by a simple closed curve $C$ is
$$\int_A dxdy = \int_C x\,dy = -\int_C y\,dx.$$
For example, suppose $C$ is the counterclockwise unit circle arc from $(1, 0)$ to $(0, 1)$. To compute $\int_C x\,dy$, we add $C_1 = [0, 1] \times 0$ in the rightward direction and $C_2 = 0 \times [0, 1]$ in the downward direction. Then the integral $\left( \int_C + \int_{C_1} + \int_{C_2} \right) x\,dy$ is the area $\frac{\pi}{4}$ of the quarter unit disk. Since $\int_{C_1} x\,dy = \int_0^1 x\,d0 = 0$ and $\int_{C_2} x\,dy = \int_1^0 0\,dy = 0$, we conclude that $\int_C x\,dy = \frac{\pi}{4}$.
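The value $\frac{\pi}{4}$ can be cross-checked by approximating $\int_C x\,dy$ directly along the quarter arc. The following is a minimal numerical sketch, not part of the original example; the helper `line_integral` is a hypothetical name, and it uses a simple midpoint rule on the chords of a fine partition.

```python
import math

def line_integral(f, g, path, t0, t1, n=20000):
    """Approximate the integral of f dx + g dy along path(t), t0 <= t <= t1.

    The curve is replaced by n chords and the integrand is evaluated at the
    midpoint of each chord.
    """
    total = 0.0
    h = (t1 - t0) / n
    x_prev, y_prev = path(t0)
    for k in range(1, n + 1):
        x, y = path(t0 + k * h)
        xm, ym = 0.5 * (x_prev + x), 0.5 * (y_prev + y)
        total += f(xm, ym) * (x - x_prev) + g(xm, ym) * (y - y_prev)
        x_prev, y_prev = x, y
    return total

# Quarter unit circle from (1, 0) to (0, 1), counterclockwise.
quarter_arc = lambda t: (math.cos(t), math.sin(t))

# The 1-form is x dy, so f = 0 and g = x.
value = line_integral(lambda x, y: 0.0, lambda x, y: x, quarter_arc, 0.0, math.pi / 2)
print(value, math.pi / 4)
```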

Exercise 7.3.1. Compute the integrals.

1. $\int_C (e^x + 2y - 3x)\,dx + (8x + \cos y^2)\,dy$, $C$ is the triangle with vertices $(0, 0)$, $(1, 2)$, $(2, 3)$, in the clockwise direction.

2. $\int_C (x^2 + y^2)\,dx - 2xy\,dy$, $C$ is the boundary of the half disk $x^2 + y^2 \le 4$, $y \ge 0$, in the counterclockwise direction.

Exercise 7.3.2. Compute the areas.

1. Region enclosed by the astroid $x^{\frac{2}{3}} + y^{\frac{2}{3}} = 1$.

2. Region enclosed by the cardioid $r = 2a(1 + \cos\theta)$.

3. Region enclosed by the parabola $(x + y)^2 = x$ and the $x$-axis.

Exercise 7.3.3. Do Exercise 7.1.67 again by using Green theorem.

Exercise 7.3.4. Extend integration by parts by finding a relation between $\int_A (uf_x + vg_y)\,dxdy$ and $\int_A (u_xf + v_yg)\,dxdy$.

Exercise 7.3.5. The divergence of a vector field $F = (f, g)$ on $\mathbb{R}^2$ is
$$\operatorname{div} F = f_x + g_y. \tag{7.3.2}$$
Prove that
$$\int_C F \cdot \vec{n}\,ds = \int_A \operatorname{div} F\,dA, \tag{7.3.3}$$
where $\vec{n}$ is the normal vector pointing away from $A$.


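Exercise 7.3.6. The Laplacian of a two variable function $f$ is $\Delta f = f_{xx} + f_{yy} = \operatorname{div} \nabla f$. Prove Green identities
$$\int_A f\Delta g\,dA = \int_C f\nabla g \cdot \vec{n}\,ds - \int_A \nabla f \cdot \nabla g\,dA \tag{7.3.4}$$
and
$$\int_A (f\Delta g - g\Delta f)\,dA = \int_C (f\nabla g - g\nabla f) \cdot \vec{n}\,ds. \tag{7.3.5}$$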

7.3.2 Independence of Integral on Path

If $f_y = g_x$ on a region $A \subset \mathbb{R}^2$, then Green theorem tells us $\int_C f\,dx + g\,dy = 0$ for the boundary $C$ of $A$. The conclusion can be interpreted and utilized in different ways.

Suppose $C_1$ and $C_2$ are two curves connecting $\vec{a}$ to $\vec{b}$ in $\mathbb{R}^2$. Then $C_1 \cup (-C_2)$ is an oriented closed curve. If $f_y = g_x$ on the region enclosed by $C_1 \cup (-C_2)$, then we have $\int_{C_1 \cup (-C_2)} f\,dx + g\,dy = 0$, which means
$$\int_{C_1} f\,dx + g\,dy = \int_{C_2} f\,dx + g\,dy.$$
In other words, if $f_y = g_x$ on the region between two curves with the same beginning and end points, then the integrals of the 1-form $f\,dx + g\,dy$ along the two curves are the same.

Figure 7.10: when $\int_{C_1} f\,dx + g\,dy = \int_{C_2} f\,dx + g\,dy$

Another explanation of the independence of the integral of a 1-form on the choice of paths is given in Examples 7.2.8. The general argument is the following. Suppose $f = \phi_x$ and $g = \phi_y$ for a differentiable function $\phi(x, y)$. Suppose a curve $C$ connecting $(x_0, y_0)$ to $(x_1, y_1)$ is parametrized by differentiable $\varphi(t) = (x(t), y(t))$. Then
$$\int_C f\,dx + g\,dy = \int_a^b (\phi_x x' + \phi_y y')\,dt = \int_a^b \frac{d\phi(\varphi(t))}{dt}\,dt = \phi(\varphi(b)) - \phi(\varphi(a)) = \phi(x_1, y_1) - \phi(x_0, y_0).$$

The condition $f = \phi_x$ and $g = \phi_y$ means the 1-form $f\,dx + g\,dy$ is the differential $d\phi$. It also means the vector field $(f, g)$ is the gradient $\nabla\phi$. The function $\phi$, called a potential for the 1-form $f\,dx + g\,dy$ or for the vector field $(f, g)$, is simply the antiderivative for the pair of two variable functions! Any continuous single variable function has antiderivatives, which are unique up to adding constants. However, if $\phi_x = f$, $\phi_y = g$, and $f$ and $g$ have continuous partial derivatives, then $f$ and $g$ must satisfy $f_y = \phi_{yx} = \phi_{xy} = g_x$. Therefore $f_y = g_x$ is a necessary condition for the vector field $(f, g)$ to have antiderivative (potential). Moreover, the antiderivative of $f(x)$ is given by the integral $\varphi(x) = \varphi(x_0) + \int_{x_0}^x f(t)\,dt$ along the line (interval) $[x_0, x]$ connecting $x_0$ to $x$. Similarly, we expect the potential for the two variable case to be
$$\phi(x, y) = \phi(x_0, y_0) + \int_{(x_0, y_0)}^{(x, y)} f\,dx + g\,dy = \phi(x_0, y_0) + \int_C f\,dx + g\,dy, \tag{7.3.6}$$
except there are many curves $C$ connecting $(x_0, y_0)$ to $(x, y)$. Therefore the two variable antiderivative problem is closely related to the independence of the integral on the choice of paths.

Theorem 7.3.2. Suppose $f$ and $g$ are continuous functions on an open subset $U \subset \mathbb{R}^2$. Then the following are equivalent.

1. The integral $\int_C f\,dx + g\,dy$ along an oriented rectifiable curve $C$ in $U$ depends only on the beginning and end points of $C$.

2. There is a differentiable function $\phi$ on $U$, such that $\phi_x = f$ and $\phi_y = g$.

Moreover, if $U$ is simply connected and $f$ and $g$ are continuously differentiable, then the above is also equivalent to
$$f_y = g_x.$$

The simply connected condition means that the subset has no holes. The rigorous definition is that any continuous map $S^1 \to U$ from the unit circle (i.e., a closed curve in $U$) extends to a continuous map $B^2 \to U$ from the unit disk. Any two curves in $U$ with the same beginning and end points form a closed curve in $U$. The extension to the unit disk means that the region between the two curves still lies in $U$, so that the integrals along the two curves are the same.

Proof. By the earlier discussion, we know that the second statement implies $f_y = g_x$. We also know that the second statement implies the equality
$$\int_C f\,dx + g\,dy = \phi(x_1, y_1) - \phi(x_0, y_0) \tag{7.3.7}$$
for the special case that $C$ is differentiably parametrized. Moreover, we also explained that the condition $f_y = g_x$ on a simply connected open subset implies the first statement. It remains to prove that the first statement implies the second, and that (7.3.7) for the differentiable case implies (7.3.7) for the general rectifiable case.


Suppose the integral depends only on the beginning and end points. Then we consider $\phi(x, y)$ given by the formula (7.3.6) ($(x_0, y_0)$ is a fixed point in $U$ and $\phi(x_0, y_0)$ is the arbitrary constant allowed in the antiderivative). For any $(x_1, y_1) \in U$, there is $\epsilon > 0$, such that $\|(x, y) - (x_1, y_1)\|_\infty < \epsilon$ implies $(x, y) \in U$. Then for such $(x, y)$, we have
$$\phi(x, y) = \phi(x_1, y_1) + \int_I f\,dx + g\,dy,$$
where $I$ is the straight line connecting $(x_1, y_1)$ to $(x, y)$ (the setup makes sure that $I \subset U$). In particular, $\phi(x, y_1) = \phi(x_1, y_1) + \int_{x_1}^x f(t, y_1)\,dt$. By the continuity of $f$ and the fundamental theorem of calculus, we get $\phi_x(x_1, y_1) = f(x_1, y_1)$. The equality $\phi_y(x_1, y_1) = g(x_1, y_1)$ can be similarly proved. The differentiability of $\phi$ follows from the continuity of $f$ and $g$.

Now assume (7.3.7) holds for differentiably parametrized curves. Consider the general case that $\varphi(t)$ is only rectifiable. Let $P$ be a partition of $[a, b]$. Let $L_i$ be the straight line connecting $\varphi(t_{i-1})$ to $\varphi(t_i)$. Let $L$ be obtained by connecting the $L_i$ together. Since $L_i$ is differentiably parametrized, we have
$$\int_L f\,dx + g\,dy = \sum \int_{L_i} f\,dx + g\,dy = \sum (\phi(\varphi(t_i)) - \phi(\varphi(t_{i-1}))) = \phi(x_1, y_1) - \phi(x_0, y_0).$$
On the other hand, in the proof of Theorem 7.3.1, we have shown that $\int_C f\,dx + g\,dy = \lim_{\|P\| \to 0} \int_L f\,dx + g\,dy$. Therefore we conclude that (7.3.7) also holds for $C$.

Example 7.3.2. The integral of $y\,dx + x\,dy$ in Example 7.2.8 is independent of the choice of the paths because the equality $\frac{\partial y}{\partial y} = \frac{\partial x}{\partial x}$ holds on the whole plane, which is simply connected. The potential of the 1-form is $xy$ (up to adding constant).

In Example 7.2.9, the integral of $xy\,dx + (x + y)\,dy$ along the three curves gives different results. Indeed, we have $(xy)_y = x \ne (x + y)_x = 1$, and the 1-form has no potential.
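The contrast between the two 1-forms can be seen numerically. The following is a minimal sketch under stated assumptions: the two paths from $(0, 0)$ to $(1, 1)$ (a straight line and a parabola) are chosen here for illustration and are not the curves of Examples 7.2.8 and 7.2.9, and `line_integral` is a hypothetical helper using a midpoint rule on chords.

```python
def line_integral(f, g, path, t0, t1, n=20000):
    """Approximate the integral of f dx + g dy along path(t), t0 <= t <= t1."""
    total = 0.0
    h = (t1 - t0) / n
    x_prev, y_prev = path(t0)
    for k in range(1, n + 1):
        x, y = path(t0 + k * h)
        xm, ym = 0.5 * (x_prev + x), 0.5 * (y_prev + y)
        total += f(xm, ym) * (x - x_prev) + g(xm, ym) * (y - y_prev)
        x_prev, y_prev = x, y
    return total

# Two illustrative paths from (0, 0) to (1, 1).
straight = lambda t: (t, t)
parabola = lambda t: (t, t * t)

# y dx + x dy has the potential xy, so both paths give the same value.
exact_straight = line_integral(lambda x, y: y, lambda x, y: x, straight, 0.0, 1.0)
exact_parabola = line_integral(lambda x, y: y, lambda x, y: x, parabola, 0.0, 1.0)

# xy dx + (x + y) dy violates f_y = g_x, and the two values differ.
other_straight = line_integral(lambda x, y: x * y, lambda x, y: x + y, straight, 0.0, 1.0)
other_parabola = line_integral(lambda x, y: x * y, lambda x, y: x + y, parabola, 0.0, 1.0)

print(exact_straight, exact_parabola)
print(other_straight, other_parabola)
```

A direct computation along these two paths gives $\frac{4}{3}$ and $\frac{17}{12}$ for the second 1-form, which the sketch reproduces approximately.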

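Example 7.3.3. The vector field $\frac{1}{y^2}(xy^2 + y, 2y - x)$ is defined on $y > 0$ and $y < 0$, both simply connected. It satisfies
$$\frac{\partial}{\partial y}\left( \frac{xy^2 + y}{y^2} \right) = -\frac{1}{y^2} = \frac{\partial}{\partial x}\left( \frac{2y - x}{y^2} \right).$$
Therefore it has a potential function $\phi$. By $\phi_x = \frac{y(xy + 1)}{y^2}$, we get
$$\phi = \int \frac{y(xy + 1)}{y^2}\,dx + \vartheta(y) = \frac{x^2}{2} + \frac{x}{y} + \vartheta(y).$$
Then by $\phi_y = -\frac{x}{y^2} + \vartheta'(y) = \frac{2y - x}{y^2}$, we get $\vartheta'(y) = \frac{2}{y}$, so that $\vartheta(y) = 2\log|y| + c$. The potential function is
$$\phi = \frac{x^2}{2} + \frac{x}{y} + 2\log|y| + c.$$
In particular, we have
$$\int_{(1,1)}^{(2,2)} \frac{(xy^2 + y)\,dx + (2y - x)\,dy}{y^2} = \left( \frac{x^2}{2} + \frac{x}{y} + 2\log|y| \right)_{(1,1)}^{(2,2)} = \frac{3}{2} + 2\log 2.$$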


The example shows that the formula (7.3.6) may not be the most practical way of computing the potential.
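The potential found in Example 7.3.3 can be checked against a direct numerical line integral. The following is a minimal sketch, not part of the original text: the two paths from $(1, 1)$ to $(2, 2)$ are illustrative choices inside the half plane $y > 0$, and `line_integral` is a hypothetical midpoint-rule helper.

```python
import math

def line_integral(f, g, path, t0, t1, n=20000):
    """Approximate the integral of f dx + g dy along path(t), t0 <= t <= t1."""
    total = 0.0
    h = (t1 - t0) / n
    x_prev, y_prev = path(t0)
    for k in range(1, n + 1):
        x, y = path(t0 + k * h)
        xm, ym = 0.5 * (x_prev + x), 0.5 * (y_prev + y)
        total += f(xm, ym) * (x - x_prev) + g(xm, ym) * (y - y_prev)
        x_prev, y_prev = x, y
    return total

# Components of the vector field (1 / y^2)(x y^2 + y, 2y - x).
f = lambda x, y: (x * y * y + y) / (y * y)
g = lambda x, y: (2 * y - x) / (y * y)

# Potential from the example, with the constant c set to 0.
phi = lambda x, y: x * x / 2 + x / y + 2 * math.log(abs(y))

# Two illustrative paths from (1, 1) to (2, 2), both staying in y > 0.
diagonal = lambda t: (1 + t, 1 + t)
curved = lambda t: (1 + t, 1 + t * t)

by_potential = phi(2, 2) - phi(1, 1)
along_diagonal = line_integral(f, g, diagonal, 0.0, 1.0)
along_curved = line_integral(f, g, curved, 0.0, 1.0)
print(by_potential, along_diagonal, along_curved)
```

Evaluating the potential at the end points needs no integration at all, which is the practical advantage over integrating along a path as in (7.3.6).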

Example 7.3.4. The 1-form $\frac{y\,dx - x\,dy}{x^2 + y^2}$ satisfies
$$f_y = \left( \frac{y}{x^2 + y^2} \right)_y = \frac{x^2 - y^2}{(x^2 + y^2)^2} = g_x = \left( \frac{-x}{x^2 + y^2} \right)_x$$
on $\mathbb{R}^2 - (0, 0)$, which is unfortunately not simply connected. Let $U$ be obtained by removing the non-positive $x$-axis $(-\infty, 0] \times 0$ from $\mathbb{R}^2$. Then $U$ is simply connected, and Theorem 7.3.2 can be applied. The potential for $\frac{y\,dx - x\,dy}{x^2 + y^2}$ is $\phi = -\theta$, where $-\pi < \theta < \pi$ is the angle in the polar coordinate. The potential can be used to compute
$$\int_{(1,0)}^{(0,1)} \frac{y\,dx - x\,dy}{x^2 + y^2} = -\theta(0, 1) + \theta(1, 0) = -\frac{\pi}{2} + 0 = -\frac{\pi}{2}$$
for any curve connecting $(1, 0)$ to $(0, 1)$ that does not intersect the non-positive $x$-axis. The unit circle arc $C_1 : \varphi_1(t) = (\cos t, \sin t)$, $0 \le t \le \frac{\pi}{2}$, and the straight line $C_2 : \varphi_2(t) = (1 - t, t)$, $0 \le t \le 1$, are such curves.

On the other hand, consider the unit circle arc $C_3 : \varphi_3(t) = (\cos t, -\sin t)$, $0 \le t \le \frac{3\pi}{2}$, connecting $(1, 0)$ to $(0, 1)$ in the clockwise direction. To compute the integral along the curve, let $V$ be obtained by removing the non-negative diagonal $\{(x, x) : x \ge 0\}$ from $\mathbb{R}^2$. The 1-form still has the potential $\phi = -\theta$ on $V$. The crucial difference is that the range for $\theta$ is changed to $\frac{\pi}{4} < \theta < \frac{9\pi}{4}$. Therefore
$$\int_{(1,0)}^{(0,1)} \frac{y\,dx - x\,dy}{x^2 + y^2} = -\theta(0, 1) + \theta(1, 0) = -\frac{\pi}{2} + 2\pi = \frac{3\pi}{2}$$
for any curve connecting $(1, 0)$ to $(0, 1)$ that does not intersect the non-negative diagonal.

In general, the integral $\int_{(1,0)}^{(0,1)} \frac{y\,dx - x\,dy}{x^2 + y^2}$ depends only on how the curve connecting $(1, 0)$ to $(0, 1)$ goes around the origin, the only place where $f_y = g_x$ fails.
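The three curves $C_1$, $C_2$, $C_3$ of Example 7.3.4 can be compared numerically. The following is a minimal sketch, not part of the original text; `line_integral` is a hypothetical midpoint-rule helper, and the parametrizations are the ones given in the example.

```python
import math

def line_integral(f, g, path, t0, t1, n=20000):
    """Approximate the integral of f dx + g dy along path(t), t0 <= t <= t1."""
    total = 0.0
    h = (t1 - t0) / n
    x_prev, y_prev = path(t0)
    for k in range(1, n + 1):
        x, y = path(t0 + k * h)
        xm, ym = 0.5 * (x_prev + x), 0.5 * (y_prev + y)
        total += f(xm, ym) * (x - x_prev) + g(xm, ym) * (y - y_prev)
        x_prev, y_prev = x, y
    return total

# The 1-form (y dx - x dy) / (x^2 + y^2).
f = lambda x, y: y / (x * x + y * y)
g = lambda x, y: -x / (x * x + y * y)

c1 = lambda t: (math.cos(t), math.sin(t))    # counterclockwise arc, 0 <= t <= pi/2
c2 = lambda t: (1 - t, t)                    # straight line, 0 <= t <= 1
c3 = lambda t: (math.cos(t), -math.sin(t))   # clockwise arc, 0 <= t <= 3pi/2

i1 = line_integral(f, g, c1, 0.0, math.pi / 2)
i2 = line_integral(f, g, c2, 0.0, 1.0)
i3 = line_integral(f, g, c3, 0.0, 3 * math.pi / 2)
print(i1, i2, i3)
```

The first two values agree because $C_1$ and $C_2$ both avoid the non-positive $x$-axis, while the third differs from them by $2\pi$ because $C_3$ passes around the other side of the origin.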

Exercise 7.3.7. Explain the computations in Exercise 7.2.14 by Green theorem.

Exercise 7.3.8. Use potential functions to compute the integrals.

1. $\int_C 2x\,dx + y\,dy$, $C$ is the straight line connecting $(2, 0)$ to $(0, 2)$.

2. $\int_C \frac{x\,dx + y\,dy}{x^2 + y^2}$, $C$ is the circular arc connecting $(2, 0)$ to $(0, 2)$.

3. $\int_C \frac{y\,dx - x\,dy}{ax^2 + by^2}$, $C$ is the unit circle in counterclockwise direction.

4. $\int_C \frac{(x - y)\,dx + (x + y)\,dy}{x^2 + y^2}$, $C$ is the elliptic arc $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ in the upper half plane connecting $(a, 0)$ to $(-a, 0)$.

5. $\int_C (e^x \sin 2y - y)\,dx + (2e^x \cos 2y - 1)\,dy$, $C$ is the circular arc connecting $(0, 1)$ to $(1, 0)$ in clockwise direction.

6. $\int_C (2xy^3 - 3y^2 \cos x)\,dx + (-6y \sin x + 3x^2y^2)\,dy$, $C$ is the curve $2x = \pi y^2$ connecting $(0, 0)$ to $\left( \frac{\pi}{2}, 1 \right)$.

Exercise 7.3.9. Find $\alpha$ so that the vector fields $\frac{(x, y)}{(x^2 + y^2)^\alpha}$ and $\frac{(-y, x)}{(x^2 + y^2)^\alpha}$ have potentials. Then find the potentials.

Exercise 7.3.10. Compute the integral $\int_C g(xy)(y\,dx + x\,dy)$, where $C$ is the straight line connecting $(2, 3)$ to $(1, 6)$.

Exercise 7.3.11. Why is the potential function unique up to adding constants?

Figure 7.11: closed curve in open subset with holes

Now we discuss another application of the Green theorem in case $f_y = g_x$. The open subset $U$ in Figure 7.11 has two holes. The two holes are enclosed by simple closed curves $C_1$ and $C_2$ in $U$ oriented in counterclockwise direction. The closed curve $C$ in $U$ may be divided into four parts, denoted $C_{[1]}$, $C_{[2]}$, $C_{[3]}$, $C_{[4]}$. The unions of oriented closed curves $C_{[1]} \cup (-C_1)$, $C_{[2]} \cup C_{[4]} \cup (-C_1) \cup (-C_2)$ and $C_{[3]}$ each enclose a subset contained in $U$. If $f_y = g_x$ on $U$, then by Green theorem,
$$\int_{C_{[1]} \cup (-C_1)} f\,dx + g\,dy = \int_{C_{[2]} \cup C_{[4]} \cup (-C_1) \cup (-C_2)} f\,dx + g\,dy = \int_{C_{[3]}} f\,dx + g\,dy = 0.$$
Thus
$$\int_C f\,dx + g\,dy = \left( \int_{C_1} + \int_{C_1 \cup C_2} \right) f\,dx + g\,dy = \left( 2\int_{C_1} + \int_{C_2} \right) f\,dx + g\,dy.$$
Note that the coefficients 2 for $C_1$ and 1 for $C_2$ mean that $C$ wraps around $C_1$ twice and $C_2$ once.

In general, suppose $U$ has finitely many holes. We enclose these holes with closed curves $C_1, C_2, \dots, C_k$, all in counterclockwise orientation. Then any closed curve $C$ in $U$ wraps around the $i$-th hole $n_i$ times. The sign of $n_i$ is positive when the wrapping is counterclockwise and is negative if the wrapping is clockwise. Then we say $C$ is homologous to $n_1C_1 + n_2C_2 + \cdots + n_kC_k$. If $f_y = g_x$ on $U$, then we have
$$\int_C f\,dx + g\,dy = \left( n_1\int_{C_1} + n_2\int_{C_2} + \cdots + n_k\int_{C_k} \right) f\,dx + g\,dy. \tag{7.3.8}$$

Finally, suppose $C$ and $D$ are two curves in $U$ with the same end points. Because $U$ is no longer simply connected, the equality $f_y = g_x$ no longer implies that the integrals $\int_C f\,dx + g\,dy$ and $\int_D f\,dx + g\,dy$ are equal. However, the difference between the two integrals can be computed by considering the homology of the closed curve $C \cup (-D)$.

Example 7.3.5. In Example 7.3.4, the 1-form $\frac{y\,dx - x\,dy}{x^2 + y^2}$ satisfies $f_y = g_x$ on $U = \mathbb{R}^2 - (0, 0)$. The unit circle $C$ in the counterclockwise direction encloses the only hole of $U$. We have
$$\int_C \frac{y\,dx - x\,dy}{x^2 + y^2} = \int_0^{2\pi} \frac{\sin t(-\sin t)\,dt - \cos t \cos t\,dt}{\cos^2 t + \sin^2 t} = -2\pi.$$
If $C_1$ is a curve in the first quadrant connecting $(1, 0)$ to $(0, 1)$ and $C_2$ is a curve in the second, third and fourth quadrants connecting the two points, then $C_1 \cup (-C_2)$ is homologous to $C$, and we have
$$\int_{C_1} \frac{y\,dx - x\,dy}{x^2 + y^2} = \int_{C_2} \frac{y\,dx - x\,dy}{x^2 + y^2} - 2\pi.$$
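The role of the wrapping number in (7.3.8) can be illustrated numerically for this 1-form, whose only hole is the origin. The following is a minimal sketch, not part of the original text; the three closed curves are illustrative choices, and `line_integral` is a hypothetical midpoint-rule helper.

```python
import math

def line_integral(f, g, path, t0, t1, n=20000):
    """Approximate the integral of f dx + g dy along path(t), t0 <= t <= t1."""
    total = 0.0
    h = (t1 - t0) / n
    x_prev, y_prev = path(t0)
    for k in range(1, n + 1):
        x, y = path(t0 + k * h)
        xm, ym = 0.5 * (x_prev + x), 0.5 * (y_prev + y)
        total += f(xm, ym) * (x - x_prev) + g(xm, ym) * (y - y_prev)
        x_prev, y_prev = x, y
    return total

# The 1-form (y dx - x dy) / (x^2 + y^2).
f = lambda x, y: y / (x * x + y * y)
g = lambda x, y: -x / (x * x + y * y)

unit_circle = lambda t: (math.cos(t), math.sin(t))
ellipse = lambda t: (3 * math.cos(t), 2 * math.sin(t))         # also wraps the origin once
shifted_circle = lambda t: (3 + math.cos(t), math.sin(t))      # does not enclose the origin

once = line_integral(f, g, unit_circle, 0.0, 2 * math.pi)
twice = line_integral(f, g, unit_circle, 0.0, 4 * math.pi)     # same circle traversed twice
around_ellipse = line_integral(f, g, ellipse, 0.0, 2 * math.pi)
no_wrap = line_integral(f, g, shifted_circle, 0.0, 2 * math.pi)
print(once, twice, around_ellipse, no_wrap)
```

The values are approximately $-2\pi$, $-4\pi$, $-2\pi$ and $0$, matching wrapping numbers $1$, $2$, $1$ and $0$ around the origin.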

Exercise 7.3.12. Study how the integral of the 1-form $\frac{y\,dx - x\,dy}{x^2 + xy + y^2}$ depends on the curves.

Exercise 7.3.13. Study how the integral of the 1-form
$$\omega = \frac{(x^2 - y^2 - 1)\,dx + 2xy\,dy}{((x - 1)^2 + y^2)((x + 1)^2 + y^2)}$$
depends on the curves. Note that if $C_\epsilon$ is the counterclockwise circle of radius $\epsilon$ around $(1, 0)$, then $\int_{C_{\epsilon_0}} \omega = \lim_{\epsilon \to 0} \int_{C_\epsilon} \omega$.

7.3.3 Stokes Theorem

Stokes theorem is the extension of Green theorem to oriented surfaces $S \subset \mathbb{R}^3$ with compatibly oriented boundary curves $C$. Suppose the surface is given an orientation compatible parametrization
$$\sigma(u, v) = (x(u, v), y(u, v), z(u, v)) : A \to S \subset \mathbb{R}^3,$$
and the boundary $C$ of $S$ corresponds to the boundary $D$ of $A$. Recall that $D$ is oriented in such a way that $A$ is “on the left” of $D$. Correspondingly, $C$ is oriented in such a way that $S$ is “on the left” of $C$. Also recall the description in case $D$ is differentiable: the clockwise rotation of the tangent vector of $D$ by 90 degrees will be the normal direction pointing away from $A$. Correspondingly, the clockwise rotation, in the tangent plane of $S$, of the tangent vector of $C$ by 90 degrees will be the normal direction pointing away from $S$.

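Suppose $F = (f, g, h)$ is a continuously differentiable vector field on $\mathbb{R}^3$. Then
$$\begin{aligned}
\int_C F \cdot d\vec{x} &= \int_C f\,dx + g\,dy + h\,dz \\
&= \int_D (fx_u + gy_u + hz_u)\,du + (fx_v + gy_v + hz_v)\,dv \\
&= \int_A ((fx_v + gy_v + hz_v)_u - (fx_u + gy_u + hz_u)_v)\,dudv \\
&= \int_A (f_ux_v - f_vx_u + g_uy_v - g_vy_u + h_uz_v - h_vz_u)\,dudv.
\end{aligned}$$
By
$$\begin{aligned}
f_ux_v - f_vx_u &= (f_xx_u + f_yy_u + f_zz_u)x_v - (f_xx_v + f_yy_v + f_zz_v)x_u \\
&= -f_y \det\frac{\partial(x, y)}{\partial(u, v)} + f_z \det\frac{\partial(z, x)}{\partial(u, v)},
\end{aligned}$$
we have
$$\begin{aligned}
\int_A (f_ux_v - f_vx_u)\,dudv &= \int_A -f_y \det\frac{\partial(x, y)}{\partial(u, v)}\,dudv + f_z \det\frac{\partial(z, x)}{\partial(u, v)}\,dudv \\
&= \int_S -f_y\,dx \wedge dy + f_z\,dz \wedge dx.
\end{aligned}$$
Combined with the similar computations for $g$ and $h$, we get the following result.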

Theorem 7.3.3 (Stokes Theorem). Suppose $S \subset \mathbb{R}^3$ is an oriented surface with compatibly oriented simple closed boundary curves $C$. Then for any continuously differentiable $F = (f, g, h)$, we have
$$\int_C f\,dx + g\,dy + h\,dz = \int_S (g_x - f_y)\,dx \wedge dy + (h_y - g_z)\,dy \wedge dz + (f_z - h_x)\,dz \wedge dx. \tag{7.3.9}$$

Although the argument was made only for one parametrization, the equality can be extended to oriented surfaces by adding the equalities on compatibly oriented parametrized pieces.

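Introduce the symbol
$$\nabla = \left( \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right) \tag{7.3.10}$$
on $\mathbb{R}^3$. For any function $f$, we formally have the gradient
$$\operatorname{grad} f = \nabla f = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right). \tag{7.3.11}$$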


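For any vector field $F = (f, g, h)$ on $\mathbb{R}^3$, we formally have the curl
$$\operatorname{curl} F = \nabla \times F = \left( \frac{\partial h}{\partial y} - \frac{\partial g}{\partial z}, \frac{\partial f}{\partial z} - \frac{\partial h}{\partial x}, \frac{\partial g}{\partial x} - \frac{\partial f}{\partial y} \right). \tag{7.3.12}$$
Then Stokes theorem can be written as
$$\int_C F \cdot d\vec{x} = \int_S \operatorname{curl} F \cdot \vec{n}\,dA. \tag{7.3.13}$$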

Example 7.3.6. Suppose $C$ is the circle given by $x^2 + y^2 + z^2 = 1$, $x + y + z = r$, with the counterclockwise orientation when viewed from the direction of the $x$-axis. We would like to compute the integral $\int_C (ax + by + cz + d)\,dx$. Let $S$ be the disk $x^2 + y^2 + z^2 \le 1$, $x + y + z = r$, with the normal direction given by $(1, 1, 1)$. Then by Stokes theorem and the fact that the radius of $S$ is $\sqrt{1 - \frac{r^2}{3}}$, we have
$$\int_C (ax + by + cz + d)\,dx = \int_S -b\,dx \wedge dy + c\,dz \wedge dx = \int_S \frac{1}{\sqrt{3}}(-b + c)\,dA = \frac{c - b}{\sqrt{3}}\pi\left( 1 - \frac{r^2}{3} \right).$$
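The surface integral in Example 7.3.6 can be compared with a direct numerical evaluation of the line integral. The following is a minimal sketch under stated assumptions: the circle is parametrized counterclockwise with respect to the normal $(1, 1, 1)$ using an orthonormal basis of the plane chosen here, the coefficient values are arbitrary test values, and the function name `circle_line_integral` is illustrative.

```python
import math

def circle_line_integral(a, b, c, d, r, n=20000):
    """Approximate the integral of (ax + by + cz + d) dx along the circle
    x^2 + y^2 + z^2 = 1, x + y + z = r, counterclockwise around (1, 1, 1)."""
    radius = math.sqrt(1 - r * r / 3)
    center = (r / 3, r / 3, r / 3)
    # Orthonormal basis of the plane with e1 x e2 = (1, 1, 1) / sqrt(3).
    e1 = (1 / math.sqrt(2), -1 / math.sqrt(2), 0.0)
    e2 = (1 / math.sqrt(6), 1 / math.sqrt(6), -2 / math.sqrt(6))

    def point(t):
        return tuple(center[i] + radius * (math.cos(t) * e1[i] + math.sin(t) * e2[i])
                     for i in range(3))

    total = 0.0
    prev = point(0.0)
    for k in range(1, n + 1):
        cur = point(2 * math.pi * k / n)
        xm, ym, zm = (0.5 * (prev[i] + cur[i]) for i in range(3))
        total += (a * xm + b * ym + c * zm + d) * (cur[0] - prev[0])
        prev = cur
    return total

# Arbitrary test values (assumptions made for this sketch).
a, b, c, d, r = 1.0, 2.0, 5.0, -3.0, 0.5
numeric = circle_line_integral(a, b, c, d, r)
# curl F . n equals (c - b) / sqrt(3), and the disk has area pi (1 - r^2 / 3).
by_stokes = (c - b) / math.sqrt(3) * math.pi * (1 - r * r / 3)
print(numeric, by_stokes)
```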

Example 7.3.7. Faraday observed that a changing magnetic field $B$ induces an electric field $E$. More precisely, Faraday’s induction law says that the rate of change of the flux of the magnetic field through a surface $S$ is the negative of the integral of the electric field along the boundary $C$ of the surface $S$. The law is summarized in the formula
$$-\int_C E \cdot d\vec{x} = \frac{d}{dt}\int_S B \cdot \vec{n}\,dA. \tag{7.3.14}$$
By Stokes theorem, the left side is $-\int_S \operatorname{curl} E \cdot \vec{n}\,dA$. Since the equality holds for any surface $S$, we conclude the differential version of Faraday’s law
$$-\operatorname{curl} E = \frac{\partial B}{\partial t}. \tag{7.3.15}$$
This is one of Maxwell’s equations for electromagnetic fields.

Exercise 7.3.14. Compute the integrals.

1. $\int_C y^2\,dx + (x + y)\,dy + yz\,dz$, $C$ is the ellipse $x^2 + y^2 = 2$, $x + y + z = 2$, with clockwise orientation as viewed from the origin.

Exercise 7.3.15. Suppose $C$ is any closed curve on the sphere $x^2 + y^2 + z^2 = R^2$. Prove that $\int_C (y^2 + z^2)\,dx + (z^2 + x^2)\,dy + (x^2 + y^2)\,dz = 0$. In general, what is the condition for $f$, $g$, $h$ so that $\int_C f\,dx + g\,dy + h\,dz = 0$ for any closed curve $C$ on any sphere centered at the origin?

Exercise 7.3.16. Find the formulae for the curl of $F + G$, $gF$.

Exercise 7.3.17. Prove that $\operatorname{curl}(\operatorname{grad} f) = \vec{0}$. Moreover, compute $\operatorname{curl}(\operatorname{curl} F)$.


Exercise 7.3.18. The electric field $E$ induced by a changing magnetic field is also changing and follows Ampere’s law
$$\int_C B \cdot d\vec{x} = \mu_0\int_S J \cdot \vec{n}\,dA + \epsilon_0\mu_0\frac{d}{dt}\int_S E \cdot \vec{n}\,dA, \tag{7.3.16}$$
where $J$ is the current density, and $\mu_0$, $\epsilon_0$ are some physical constants. Derive the differential version of Ampere’s law, which is the other Maxwell’s equation for electromagnetic fields.

Green theorem can be further extended to surfaces in Euclidean spaces of dimension higher than 3. Suppose $S$ is an oriented surface given by an orientation compatible regular parametrization $\sigma(u, v) : A \subset \mathbb{R}^2 \to \mathbb{R}^n$. The boundary $C$ of $S$ corresponds to the boundary $D$ of $A$. The compatible orientation of $C$ is described as follows. At any $\vec{x} = \sigma(u, v) \in C$, the tangent plane $T_{\vec{x}}S$ of the surface is spanned by the tangent vectors $\sigma_u$ and $\sigma_v$. On the tangent plane is the normal vector $\vec{n} \in T_{\vec{x}}S$ pointing away from the surface $S$. Then the tangent vector of $C$ is a unit length vector $\vec{t} \in T_{\vec{x}}S$ orthogonal to $\vec{n}$. Among the two choices of $\vec{t}$ (one is the negative of the other), the one compatible with the orientation of $S$ is such that the rotation from $\sigma_u$ to $\sigma_v$ is in the same direction as the rotation from $\vec{n}$ to $\vec{t}$. The condition can be rephrased as follows. Since both $\{\sigma_u, \sigma_v\}$ and $\{\vec{n}, \vec{t}\}$ are bases of the 2-dimensional subspace $T_{\vec{x}}S$, we have
$$\sigma_u \wedge \sigma_v = \lambda\,\vec{n} \wedge \vec{t},$$
for some number $\lambda \ne 0$. The tangent vector $\vec{t}$ is the only unit length vector orthogonal to $\vec{n}$, such that $\lambda > 0$ in the equality above.

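Now for a continuously differentiable $f$, we have
$$\begin{aligned}
\int_C f\,dx_j &= \int_D f\left( \frac{\partial x_j}{\partial u}\,du + \frac{\partial x_j}{\partial v}\,dv \right) = \int_A \left( \frac{\partial}{\partial u}\left( f\frac{\partial x_j}{\partial v} \right) - \frac{\partial}{\partial v}\left( f\frac{\partial x_j}{\partial u} \right) \right) dudv \\
&= \int_A \left( \frac{\partial f}{\partial u}\frac{\partial x_j}{\partial v} - \frac{\partial f}{\partial v}\frac{\partial x_j}{\partial u} \right) dudv \\
&= \int_A \sum_i \left( \frac{\partial f}{\partial x_i}\frac{\partial x_i}{\partial u}\frac{\partial x_j}{\partial v} - \frac{\partial f}{\partial x_i}\frac{\partial x_i}{\partial v}\frac{\partial x_j}{\partial u} \right) dudv \\
&= \int_A \sum_{i \ne j} \frac{\partial f}{\partial x_i}\det\frac{\partial(x_i, x_j)}{\partial(u, v)}\,dudv = \int_S \sum_{i \ne j} \frac{\partial f}{\partial x_i}\,dx_i \wedge dx_j.
\end{aligned}$$
Thus for a continuously differentiable vector field $F = (f_1, f_2, \dots, f_n)$ on $\mathbb{R}^n$, we have
$$\begin{aligned}
\int_C F \cdot d\vec{x} &= \int_C \sum_j f_j\,dx_j \\
&= \int_S \sum_{i \ne j} \frac{\partial f_j}{\partial x_i}\,dx_i \wedge dx_j = \int_S \sum_{i < j} \left( \frac{\partial f_j}{\partial x_i} - \frac{\partial f_i}{\partial x_j} \right) dx_i \wedge dx_j.
\end{aligned} \tag{7.3.17}$$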


This is the Stokes theorem for oriented surfaces in Rn.Similar to Green theorem, Stokes theorem has implication on how the

integral of a 1-form depends on the choice of the curve. Suppose a vectorfield F satisfies

∂fj∂xi

=∂fi∂xj

(7.3.18)

on an open subset $U \subset \mathbb{R}^n$. A collection $C$ of oriented closed curves is homologous to 0 in $U$ if it is the compatibly oriented boundary of an oriented surface $S$ in $U$ (in fact, a continuously differentiable map from $S$ to $U$ is sufficient). Stokes theorem tells us that if $F$ satisfies (7.3.18) on $U$ and $C$ is homologous to 0 in $U$, then $\int_C F \cdot d\vec{x} = 0$. Moreover, two oriented curves $C$ and $D$ that have the same beginning and end points are homologous in $U$ if the closed curve $C \cup (-D)$ is homologous to 0. Then the equalities (7.3.18) imply that the integral $\int_C F \cdot d\vec{x}$ depends only on the homology class of $C$.

A special case is when the subset $U$ is simply connected, which means that any continuous map $S^1 \to U$ extends to a continuous map $B^2 \to U$. Then $C \cup (-D)$ is the boundary of a disk in $U$. With the same proof, Theorem 7.3.2 may be extended.

Theorem 7.3.4. Suppose $F = (f_1, f_2, \dots, f_n)$ is a continuous vector field on an open subset $U \subset \mathbb{R}^n$. Then the following are equivalent.

1. The integral $\int_C F \cdot d\vec{x}$ along an oriented rectifiable curve $C$ in $U$ depends only on the beginning and end points of $C$.

2. There is a differentiable function $\varphi$ on $U$, such that $\nabla\varphi = F$.

Moreover, if $U$ is simply connected and $F$ is continuously differentiable, then the above is also equivalent to

$$\frac{\partial f_j}{\partial x_i} = \frac{\partial f_i}{\partial x_j}$$

for any $i, j$.

In $\mathbb{R}^3$, the theorem says that $F = \operatorname{grad}\varphi$ for some $\varphi$ on a simply connected region if and only if $\operatorname{curl}F = \vec{0}$.

The potential function $\varphi$ is given by

$$\varphi(\vec{x}) = \int_{\vec{x}_0}^{\vec{x}} F \cdot d\vec{x} \tag{7.3.19}$$

up to adding constants. It also satisfies $d\varphi = F \cdot d\vec{x}$.

Example 7.3.8. The vector field $(f, g, h) = (yz(2x + y + z), zx(x + 2y + z), xy(x + y + 2z))$ satisfies

$$f_y = g_x = (2x + 2y + z)z, \quad h_y = g_z = (x + 2y + 2z)x, \quad f_z = h_x = (2x + y + 2z)y.$$


Therefore the vector field has a potential. The potential function can be computed by integrating along successive straight lines connecting $(0, 0, 0)$, $(x, 0, 0)$, $(x, y, 0)$, $(x, y, z)$. The integral is zero on the first two segments, so that

$$\varphi(x, y, z) = \int_0^z xy(x + y + 2z)\,dz = xyz(x + y + z).$$
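The computation in Example 7.3.8 can be checked numerically. The following is a minimal sketch, not part of the original text: it approximates the line integral (7.3.19) along the three successive segments with a midpoint rule and compares it with the closed form $xyz(x + y + z)$. The function names (`field`, `segment_integral`, `potential_by_path`, `potential_closed_form`) and the step count are illustrative assumptions.

```python
# Sketch: numerical check of the potential in Example 7.3.8.
# All helper names are hypothetical; only the field and the path come from the text.

def field(x, y, z):
    """The vector field (f, g, h) of Example 7.3.8."""
    return (y * z * (2 * x + y + z),
            z * x * (x + 2 * y + z),
            x * y * (x + y + 2 * z))

def segment_integral(p, q, steps=2000):
    """Midpoint-rule approximation of the integral of F . dx
    along the straight segment from p to q."""
    d = [q[i] - p[i] for i in range(3)]
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) / steps
        point = [p[i] + t * d[i] for i in range(3)]
        F = field(*point)
        total += sum(F[i] * d[i] for i in range(3)) / steps
    return total

def potential_by_path(x, y, z):
    """Integrate along (0,0,0) -> (x,0,0) -> (x,y,0) -> (x,y,z)."""
    corners = [(0.0, 0.0, 0.0), (x, 0.0, 0.0), (x, y, 0.0), (x, y, z)]
    return sum(segment_integral(corners[i], corners[i + 1]) for i in range(3))

def potential_closed_form(x, y, z):
    """The closed form xyz(x + y + z) obtained in the example."""
    return x * y * z * (x + y + z)

# Compare the two values at one sample point.
print(potential_by_path(1.0, 2.0, 3.0), potential_closed_form(1.0, 2.0, 3.0))
```

Integrating along the single straight segment from the origin to the same point with `segment_integral` should give approximately the same value, which illustrates the path independence asserted by Theorem 7.3.4.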

Exercise 7.3.19. Determine whether the integrals of vector fields or the 1-forms are independent of the choice of the curves. For the ones that are independent, find the potential function.

1. $e^x(\cos yz, -z\sin yz, -y\sin yz)$.

2. $y^2z^3\,dx + 2xyz^3\,dy + 2xyz^2\,dz$.

3. $(y + z)\,dx + (z + x)\,dy + (x + y)\,dz$.

4. $(x_2, x_3, \dots, x_n, x_1)$.

5. $(x_1^2, x_2^2, \dots, x_n^2)$.

6. $x_1x_2 \cdots x_n\left(\dfrac{dx_1}{x_1} + \dfrac{dx_2}{x_2} + \cdots + \dfrac{dx_n}{x_n}\right)$.

Exercise 7.3.20. Suppose $\vec{a}$ is a nonzero vector. Find the condition on a function $f(\vec{x})$ so that the vector field $f(\vec{x})\vec{a}$ has a potential function.

Exercise 7.3.21. Find the condition on a function $f(\vec{x})$ so that the vector field $f(\vec{x})\vec{x}$ has a potential function.

Exercise 7.3.22. Study the potential function of

$$\frac{(y - z)\,dx + (z - x)\,dy + (x - y)\,dz}{(x - y)^2 + (y - z)^2}.$$

7.3.4 Gauss Theorem

Green theorem can be extended to regions of higher dimension. For example, let

$$A = \{(x, y, z) : (x, y) \in B,\ k(x, y) \le z \le h(x, y)\}$$

be the 3-dimensional region between the graphs of functions $h(x, y)$ and $k(x, y)$ on $B \subset \mathbb{R}^2$ (see Figure 7.12). Then we have

$$
\begin{aligned}
\int_A f_z\,dxdydz
&= \int_B \left(\int_{k(x,y)}^{h(x,y)} f_z(x, y, z)\,dz\right)dxdy \\
&= \int_B \big(f(x, y, h(x, y)) - f(x, y, k(x, y))\big)\,dxdy.
\end{aligned}
$$

On the other hand, the boundary $S$ of $A$ consists of three pieces. By taking the normal vector $\vec{n}$ of the boundary surface to point outward of $A$, the orientation compatible parametrizations of the three pieces are

$$
\begin{aligned}
S_b &: \sigma(y, x) = (x, y, k(x, y)), && (x, y) \in B, \\
S_t &: \sigma(x, y) = (x, y, h(x, y)), && (x, y) \in B, \\
S_v &: \sigma(t, z) = (x(t), y(t), z), && k(x, y) \le z \le h(x, y),
\end{aligned}
$$


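where $(x(t), y(t))$ is a parametrization of the boundary curve of $B$. We have

$$
\begin{aligned}
\int_S f\,dx \wedge dy
&= \left(\int_{S_b} + \int_{S_t} + \int_{S_v}\right) f\,dx \wedge dy \\
&= \int_B f(x, y, k(x, y)) \det\frac{\partial(x, y)}{\partial(y, x)}\,dxdy \\
&\quad + \int_B f(x, y, h(x, y)) \det\frac{\partial(x, y)}{\partial(x, y)}\,dxdy \\
&\quad + \int_{a \le t \le b,\ k(x,y) \le z \le h(x,y)} f(x(t), y(t), z) \det\frac{\partial(x, y)}{\partial(t, z)}\,dtdz \\
&= -\int_B f(x, y, k(x, y))\,dxdy + \int_B f(x, y, h(x, y))\,dxdy + 0.
\end{aligned}
$$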

[Figure 7.12: Gauss Theorem for a special case. The region $A$ lies over $B$ between the graphs $z = k(x, y)$ and $z = h(x, y)$, with boundary pieces $S_b$, $S_t$, $S_v$ and outward normal vectors $\vec{n}$.]

Thus we proved $\int_S f\,dx \wedge dy = \int_A f_z\,dxdydz$. The equality can be extended to regions that can be divided by vertical planes into regions between graphs of continuously differentiable functions. Similar equalities can be established by rotating $x$, $y$ and $z$.

Similar to the 2-dimensional case, we may consider a subset $A \subset \mathbb{R}^3$ with surfaces $S_1, S_2, \dots, S_k$ as the boundary. We further assume that $A$ is contained in only one side of each $S_i$. Let $\vec{n}$ be the normal vector of the surfaces pointing outward of $A$. This induces orientations on the surfaces. We denote by $S$ the union of the oriented boundary surfaces and say $A$ is a region with compatibly oriented boundary surfaces $S$.

Theorem 7.3.5 (Gauss Theorem). Suppose $A \subset \mathbb{R}^3$ is a region with boundary surfaces $S$ compatibly oriented with respect to the outward normal vector $\vec{n}$. Then for any continuously differentiable $F = (f, g, h)$, we have

$$\int_S f\,dy \wedge dz + g\,dz \wedge dx + h\,dx \wedge dy = \int_A (f_x + g_y + h_z)\,dxdydz. \tag{7.3.20}$$


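Define the divergence

$$\operatorname{div}F = \nabla \cdot F = \frac{\partial f}{\partial x} + \frac{\partial g}{\partial y} + \frac{\partial h}{\partial z}. \tag{7.3.21}$$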

The Gauss theorem means that the outward flux of a flow $F = (f, g, h)$ is equal to the integral of the divergence of the flow on the solid

$$\int_S F \cdot \vec{n}\,dA = \int_A \operatorname{div}F\,dV. \tag{7.3.22}$$
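As an illustration of (7.3.22), the two sides can be compared numerically on a simple solid. The sketch below is an assumption-laden example rather than anything from the text: it takes the unit cube $[0, 1]^3$ and the hypothetical polynomial field $F = (x^2y, yz, xz^2)$, and approximates both integrals with a midpoint rule. The names `F`, `div_F`, `outward_flux` and `divergence_integral` are illustrative.

```python
# Sketch: check the Gauss theorem (7.3.22) on the unit cube [0,1]^3
# for the sample field F = (x^2 y, y z, x z^2), chosen only for illustration.

def F(x, y, z):
    return (x * x * y, y * z, x * z * z)

def div_F(x, y, z):
    # d/dx (x^2 y) + d/dy (y z) + d/dz (x z^2)
    return 2 * x * y + z + 2 * x * z

def midpoints(n):
    """Midpoints of n equal subintervals of [0, 1]."""
    return [(k + 0.5) / n for k in range(n)]

def outward_flux(n=40):
    """Sum the flux over the six faces; the outward normal is +e_i on the
    face x_i = 1 and -e_i on the face x_i = 0."""
    pts = midpoints(n)
    cell = 1.0 / (n * n)
    total = 0.0
    for s in pts:
        for t in pts:
            total += (F(1.0, s, t)[0] - F(0.0, s, t)[0]) * cell
            total += (F(s, 1.0, t)[1] - F(s, 0.0, t)[1]) * cell
            total += (F(s, t, 1.0)[2] - F(s, t, 0.0)[2]) * cell
    return total

def divergence_integral(n=40):
    """Midpoint-rule approximation of the integral of div F over the cube."""
    pts = midpoints(n)
    cell = 1.0 / n ** 3
    return sum(div_F(x, y, z) for x in pts for y in pts for z in pts) * cell

print(outward_flux(), divergence_integral())
```

For this field both quantities can also be computed by hand: each of the three faces $x = 1$, $y = 1$, $z = 1$ contributes $\frac{1}{2}$ and the opposite faces contribute 0, while the integral of $2xy + z + 2xz$ over the cube is $\frac{1}{2} + \frac{1}{2} + \frac{1}{2}$.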

Example 7.3.9. The volume of the region enclosed by a surface $S \subset \mathbb{R}^3$ without boundary is

$$\int_S x\,dy \wedge dz = \int_S y\,dz \wedge dx = \int_S z\,dx \wedge dy = \frac{1}{3}\int_S (x, y, z) \cdot \vec{n}\,dA.$$
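To see the formula of Example 7.3.9 at work, one may evaluate $\int_S z\,dx \wedge dy$ numerically for an ellipsoid and compare with the known volume $\frac{4\pi}{3}abc$. The sketch below assumes the parametrization $\sigma(\theta, \phi) = (a\sin\theta\cos\phi, b\sin\theta\sin\phi, c\cos\theta)$, which is not taken from the text; the function name and the grid size are likewise illustrative.

```python
import math

# Sketch: volume of an ellipsoid from the surface integral of z dx^dy.
# The parametrization sigma(theta, phi) below is an assumption made for this
# example; it is compatible with the outward normal because the z-component of
# sigma_theta x sigma_phi is a*b*sin(theta)*cos(theta), positive on the upper half.

def ellipsoid_volume_via_surface(a, b, c, n=200):
    d_theta = math.pi / n
    d_phi = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta
        for j in range(n):
            phi = (j + 0.5) * d_phi
            # Partial derivatives of x and y with respect to (theta, phi).
            x_theta = a * math.cos(theta) * math.cos(phi)
            x_phi = -a * math.sin(theta) * math.sin(phi)
            y_theta = b * math.cos(theta) * math.sin(phi)
            y_phi = b * math.sin(theta) * math.cos(phi)
            # det d(x, y)/d(theta, phi), the factor that converts dx^dy.
            jacobian = x_theta * y_phi - x_phi * y_theta
            z = c * math.cos(theta)
            total += z * jacobian * d_theta * d_phi
    return total

print(ellipsoid_volume_via_surface(1.0, 2.0, 3.0), 4 * math.pi / 3 * 1.0 * 2.0 * 3.0)
```

The two printed values should agree up to the discretization error of the midpoint rule.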

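Example 7.3.10. In Example 7.2.22, the outgoing flux of the flow $F = (x^2, y^2, z^2)$ through the ellipsoid $\dfrac{(x - x_0)^2}{a^2} + \dfrac{(y - y_0)^2}{b^2} + \dfrac{(z - z_0)^2}{c^2} = 1$ is computed by surface integral. Alternatively, by Gauss theorem, the flux is

$$
\begin{aligned}
\int F \cdot \vec{n}\,dA
&= \int_{\frac{(x - x_0)^2}{a^2} + \frac{(y - y_0)^2}{b^2} + \frac{(z - z_0)^2}{c^2} \le 1} 2(x + y + z)\,dxdydz \\
&= \int_{\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} \le 1} 2(x + x_0 + y + y_0 + z + z_0)\,dxdydz.
\end{aligned}
$$

Since the transform $\vec{x} \to -\vec{x}$ preserves the region $\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} \le 1$ and the volume, we get

$$\int_{\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} \le 1} (x + y + z)\,dxdydz = \int_{\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} \le 1} (-x - y - z)\,dxdydz,$$

so that this integral vanishes. Therefore the flux is

$$\int_{\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} \le 1} 2(x_0 + y_0 + z_0)\,dxdydz = \frac{8\pi}{3}abc(x_0 + y_0 + z_0).$$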

Example 7.3.11. To compute the upward flux of $F = (xz, -yz, (x^2 + y^2)z)$ through the surface $S$ given by $0 \le z = 4 - x^2 - y^2$, we introduce the disk $D = \{(x, y, 0) : x^2 + y^2 \le 4\}$ on the $(x, y)$-plane. Taking the normal direction of $D$ to be $(0, 0, 1)$, the surface $S \cup (-D)$ is the boundary of the region $A$ given by $0 \le z \le 4 - x^2 - y^2$. By Gauss theorem, the flux through $S$ is

$$
\begin{aligned}
\int_S F \cdot \vec{n}\,dA
&= \int_{S \cup (-D)} F \cdot \vec{n}\,dA + \int_D F \cdot \vec{n}\,dA \\
&= \int_A (z - z + x^2 + y^2)\,dxdydz + \int_{x^2 + y^2 \le 4} F(x, y, 0) \cdot (0, 0, 1)\,dA \\
&= \int_A (x^2 + y^2)\,dxdydz = \int_{0 \le r \le 2,\ 0 \le \theta \le 2\pi} r^2(4 - r^2)\,r\,drd\theta = \frac{32\pi}{3}.
\end{aligned}
$$
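The last integral in Example 7.3.11 is easy to confirm numerically. The following sketch applies a midpoint rule to the radial integral and multiplies by $2\pi$ for the angular part, since the integrand does not depend on $\theta$. The function name `flux_example` and the number of subintervals are assumptions made for this illustration.

```python
import math

# Sketch: midpoint-rule check of the cylindrical-coordinate integral
# of r^2 (4 - r^2) r over 0 <= r <= 2, 0 <= theta <= 2*pi.

def flux_example(n=2000):
    dr = 2.0 / n
    radial = 0.0
    for k in range(n):
        r = (k + 0.5) * dr
        # r^2 is the integrand x^2 + y^2, (4 - r^2) is the height of the
        # region A above the point, and the extra r is the polar Jacobian.
        radial += r ** 2 * (4 - r ** 2) * r * dr
    return 2 * math.pi * radial

print(flux_example(), 32 * math.pi / 3)
```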

Example 7.3.12. The gravitational field created by a mass $M$ at point $\vec{x}_0 \in \mathbb{R}^3$ is

$$G = -\frac{M}{\|\vec{x} - \vec{x}_0\|_2^3}(\vec{x} - \vec{x}_0).$$


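A straightforward computation shows $\operatorname{div}G = 0$. Suppose $A$ is a region with compatibly oriented surfaces $S$ as the boundary and $\vec{x}_0 \notin S$. If $\vec{x}_0 \notin A$, then by Gauss theorem, the outward flux $\int_S G \cdot \vec{n}\,dA = 0$. If $\vec{x}_0 \in A$, then let $B_\varepsilon$ be the ball of radius $\varepsilon$ centered at $\vec{x}_0$. The boundary of the ball is the sphere $S_\varepsilon$, which we give an orientation compatible with the outward normal vector $\vec{n} = \dfrac{\vec{x} - \vec{x}_0}{\|\vec{x} - \vec{x}_0\|_2}$. For sufficiently small $\varepsilon$, the ball and the sphere are contained in $A$. Moreover, $A - B_\varepsilon$ is a region not containing $\vec{x}_0$ and has compatibly oriented surfaces $S \cup (-S_\varepsilon)$ as the boundary. Therefore

$$
\int_S G \cdot \vec{n}\,dA
= \int_{S_\varepsilon} G \cdot \vec{n}\,dA
= \int_{\|\vec{x} - \vec{x}_0\|_2 = \varepsilon} -\frac{M}{\varepsilon^3}(\vec{x} - \vec{x}_0) \cdot \frac{\vec{x} - \vec{x}_0}{\varepsilon}\,dA
= -\frac{M}{\varepsilon^2}\int_{S_\varepsilon} dA
= -4\pi M.
$$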

More generally, the gravitational field created by several masses at various locations is the sum of the individual gravitational fields. The outward flux of the field through $S$ is then $-4\pi$ multiplied by the total mass contained in $A$. In particular, the flux does not depend on the location of the mass, but only on whether the mass is contained in $A$ or not. This is called Gauss’s Law.
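The statement of Example 7.3.12 can be explored numerically. The sketch below is an illustration under stated assumptions: it takes $S$ to be the sphere of radius 1 centered at the origin, places a point mass either inside or outside, and approximates the outward flux by a midpoint rule in spherical coordinates. The names `gravity` and `flux_through_sphere`, the sample positions and the grid size are all hypothetical choices.

```python
import math

# Sketch: outward flux of G = -M (x - x0) / |x - x0|^3 through a sphere
# centered at the origin, for a mass inside or outside the sphere.

def gravity(p, source, mass):
    """Field at the point p created by a mass located at source."""
    d = [p[i] - source[i] for i in range(3)]
    r = math.sqrt(sum(c * c for c in d))
    return [-mass * c / r ** 3 for c in d]

def flux_through_sphere(source, mass, radius=1.0, n=200):
    d_theta = math.pi / n
    d_phi = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta
        for j in range(n):
            phi = (j + 0.5) * d_phi
            # Outward unit normal of the sphere at (theta, phi).
            normal = (math.sin(theta) * math.cos(phi),
                      math.sin(theta) * math.sin(phi),
                      math.cos(theta))
            point = [radius * c for c in normal]
            G = gravity(point, source, mass)
            # Area element of the sphere in spherical coordinates.
            dA = radius ** 2 * math.sin(theta) * d_theta * d_phi
            total += sum(G[k] * normal[k] for k in range(3)) * dA
    return total

inside = flux_through_sphere((0.3, -0.2, 0.1), mass=2.0)
outside = flux_through_sphere((2.0, 0.0, 0.0), mass=2.0)
print(inside, -4 * math.pi * 2.0, outside)
```

With the mass inside the sphere the computed flux should be close to $-4\pi M$ regardless of the exact position, and with the mass outside it should be close to 0.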

Exercise 7.3.23. Compute the flux.

1. Outward flux of $(x^3, x^2y, x^2z)$ through the boundary of the solid $x^2 + y^2 \le a^2$, $0 \le z \le b$.

2. Inward flux of $(xy^2, yz^2, zx^2)$ through the ellipsoid $\dfrac{(x - x_0)^2}{a^2} + \dfrac{(y - y_0)^2}{b^2} + \dfrac{(z - z_0)^2}{c^2} = 1$.

3. Upward flux of $(x^3, y^3, z^3)$ through the surface $z = x^2 + y^2 \le 1$.

4. Outward flux of $(x^2, y^2, -2(x + y)z)$ through the torus in Example 6.1.10.

Exercise 7.3.24. Suppose $A \subset \mathbb{R}^3$ is a convex region with boundary surface $S$. Suppose $\vec{a}$ is a vector in the interior of $A$, and $p$ is the distance from $\vec{a}$ to the tangent plane of $S$. Compute $\int_S p\,dA$. Moreover, for the special case that $S$ is the ellipsoid $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} + \dfrac{z^2}{c^2} = 1$ and $\vec{a} = (0, 0, 0)$, compute $\int_S \dfrac{1}{p}\,dA$.

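Exercise 7.3.25. Find the formulae for the divergence of $F + G$, $gF$, $F \times G$.

Exercise 7.3.26. Prove that $\operatorname{div}(\operatorname{curl}F) = 0$. Moreover, compute $\operatorname{div}(\operatorname{grad}f)$ and $\operatorname{grad}(\operatorname{div}F)$.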

Gauss theorem can be further extended to higher dimension. Suppose $A \subset \mathbb{R}^n$ is the $n$-dimensional region between the graphs $x_n = h(x_1, x_2, \dots, x_{n-1})$ and $x_n = k(x_1, x_2, \dots, x_{n-1})$ for $(x_1, x_2, \dots, x_{n-1}) \in B$. Then we have

$$\int_A f_{x_n}\,dx_1dx_2 \cdots dx_n = \int_B \big(f(\vec{u}, h(\vec{u})) - f(\vec{u}, k(\vec{u}))\big)\,du_1du_2 \cdots du_{n-1}.$$

Suppose the boundary $S$ of $A$ is oriented to be compatible with the normal vector pointing away from $A$. Then the normal vector for the top boundary $S_t$ points in the direction of $x_n$ and the normal vector for the bottom boundary $S_b$ points opposite to the direction of $x_n$. By the discussion at the end of Section 7.2.7, the parametrization $\sigma(\vec{u}) = (\vec{u}, h(\vec{u}))$ for $S_t$ is compatibly oriented if $n$ is odd and is oppositely oriented if $n$ is even. Consequently, we get

$$
\begin{aligned}
\int_B f(\vec{u}, h(\vec{u}))\,du_1du_2 \cdots du_{n-1}
&= \int_B f(\sigma(\vec{u})) \det\frac{\partial(x_1, x_2, \dots, x_{n-1})}{\partial(u_1, u_2, \dots, u_{n-1})}\,du_1du_2 \cdots du_{n-1} \\
&= \int_{S_t} (-1)^{n-1} f\,dx_1 \wedge dx_2 \wedge \cdots \wedge dx_{n-1}.
\end{aligned}
$$

A similar argument may be made for the bottom boundary $S_b$, for which we note that the parametrization $\sigma(\vec{u}) = (\vec{u}, k(\vec{u}))$ is compatibly oriented (the normal vector points opposite to the direction of $x_n$) if $n$ is even and is oppositely oriented if $n$ is odd. We then conclude that

$$\int_B \big(f(\vec{u}, h(\vec{u})) - f(\vec{u}, k(\vec{u}))\big)\,du_1du_2 \cdots du_{n-1} = \int_S (-1)^{n-1} f\,dx_1 \wedge dx_2 \wedge \cdots \wedge dx_{n-1}.$$

The argument can be similarly applied to the case $A$ is given by

$$h(x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n) \ge x_i \ge k(x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n).$$

In this case, we have

$$
\begin{aligned}
\int_A f_{x_i}\,dx_1dx_2 \cdots dx_n
&= \int_B f(u_1, \dots, u_{i-1}, h(\vec{u}), u_i, \dots, u_{n-1})\,du_1du_2 \cdots du_{n-1} \\
&\quad - \int_B f(u_1, \dots, u_{i-1}, k(\vec{u}), u_i, \dots, u_{n-1})\,du_1du_2 \cdots du_{n-1}.
\end{aligned}
$$

Moreover, $S_t$ may be parametrized by

$$\sigma_i(\vec{u}) = \sigma_i(u_1, u_2, \dots, u_{n-1}) = (u_1, u_2, \dots, u_{i-1}, h(\vec{u}), u_i, \dots, u_{n-1}).$$

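The normal vector for $S_t$ points in the direction of $x_i$. By the discussion at the end of Section 7.2.7, the orientation of the parametrization $\sigma_i$ is compatible with the normal vector if $i$ is odd and is not compatible if $i$ is even. Consequently, we get

$$
\begin{aligned}
\int_B f(\sigma_i(\vec{u}))\,du_1du_2 \cdots du_{n-1}
&= \int_B f(\sigma_i(\vec{u})) \det\frac{\partial(x_1, x_2, \dots, \widehat{x_i}, \dots, x_n)}{\partial(u_1, u_2, \dots, u_{n-1})}\,du_1du_2 \cdots du_{n-1} \\
&= \int_{S_t} (-1)^{i-1} f\,dx_1 \wedge dx_2 \wedge \cdots \wedge \widehat{dx_i} \wedge \cdots \wedge dx_n,
\end{aligned}
$$

where the hat indicates that the term is omitted.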

Therefore we get the following extension of Gauss theorem. Suppose $A \subset \mathbb{R}^n$ is a region with boundary hypersurface $S$ compatibly oriented with respect to the outward normal vector $\vec{n}$. Then for any continuously differentiable $F = (f_1, f_2, \dots, f_n)$, we have

$$
\begin{aligned}
&\int_S \sum (-1)^{i-1} f_i\,dx_1 \wedge dx_2 \wedge \cdots \wedge \widehat{dx_i} \wedge \cdots \wedge dx_n \\
&\quad = \int_A \left(\frac{\partial f_1}{\partial x_1} + \frac{\partial f_2}{\partial x_2} + \cdots + \frac{\partial f_n}{\partial x_n}\right)dx_1dx_2 \cdots dx_n.
\end{aligned} \tag{7.3.23}
$$

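Extend the definition of divergence to vector fields in $\mathbb{R}^n$

$$\operatorname{div}F = \frac{\partial f_1}{\partial x_1} + \frac{\partial f_2}{\partial x_2} + \cdots + \frac{\partial f_n}{\partial x_n}. \tag{7.3.24}$$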

The Gauss theorem can be rewritten as

$$\int_S F \cdot \vec{n}\,dV = \int_A \operatorname{div}F\,d\mu_{\vec{x}}. \tag{7.3.25}$$

7.3.5 Exercise

Potential and Functional Dependence

Exercise 7.3.27. Suppose a continuously second order differentiable function $g(\vec{x})$ satisfies $\nabla g(\vec{x}_0) \ne \vec{0}$. Suppose $f(\vec{x})$ is continuously differentiable near $\vec{x}_0$. Prove that the differential form

$$f\,dg = fg_{x_1}dx_1 + fg_{x_2}dx_2 + \cdots + fg_{x_n}dx_n$$

has a potential near $\vec{x}_0$ if and only if $f(\vec{x}) = h(g(\vec{x}))$ for a continuously differentiable $h(t)$.

Exercise 7.3.28. Find the condition on $f$ so that there are potentials.

1. $f(x, y)(y\,dx + x\,dy)$.

2. $f(x, y)(x\,dx + y\,dy)$.

3. $f(x, y)(y\,dx - x\,dy)$.

4. $f(x)(-xy, 1 + x^2)$.


Chapter 8

Calculus on Manifold


8.1 Differentiable Manifold

8.1.1 Manifold

definition
differentiable function on manifold
orientation
manifold with boundary
differentiable map between manifolds

8.1.2 Tangent Space

definition
basis
tangent field
differentiation of differentiable map

8.1.3 Differential Form

differentiation of function
differential form
differentiation of differential form

8.1.4 Topics

inverse function, implicit function?

8.2 Integration on Manifold

8.2.1 Partition of Unity

partition of unity

8.2.2 Integration

partition of unity

8.2.3 Stokes Theorem

partition of unity