honors mathematics ii functions of a single variable -...

Honors Mathematics IIFunctions of a Single Variable

Dr. Horst Hohberger

University of Michigan - Shanghai Jiaotong UniversityJoint Institute

Fall Term 2015

Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 1 / 624

Introduction

Office Hours, Email, TAs Please read the Course profile, which has been uploaded to the

Resources section on the SAKAI course site. My office is Room 218 in in the Law School Building. My email is [email protected] and I’ll try to answer email queries

within 24 hours. Office hours will be announced on SAKAI. Please also make use of the chat room on SAKAI for asking

questions, making comments or giving feedback on the course. The recitation class schedule and the TA office hours and contact

details will be announced on SAKAI. Of course, you can also contact the TA with questions outside of

office hours, but please observe basic rules of politeness. It is notappropriate to phone your TA at 10 pm in the evening and requesthelp for the exercise set due the next day. Your TA is there to helpyou, but is not a 24/7 help line.


Introduction

Coursework Policy

The following applies mostly to the paper-based homework at thebeginning of this term:

Hand in your coursework on time, by the date given on each set ofcourse work. Late work will not be accepted unless you come to mepersonally and I find your explanation for the lateness acceptable.

You can be deducted up to 10% of the awarded marks for anassignment if you fail to write neatly and legibly. Messiness will bepenalized!


Introduction

Use of Wikipedia and Other Sources; Honor Code Policy

When faced with a particularly difficult problem, you may want to refer toother textbooks or online sources such as Wikipedia. Here are a fewguidelines:

Outside sources may treat a similar sounding subject matter at amuch more advanced or a much simpler level than this course. Thismeans that explanations you find are much more complicated or fartoo simple to help you. For example, when looking up the “inductionaxiom” you may find many high-school level explanations that are notsufficient for our problems; on the other hand, wikipedia contains alot of information relating to formal logic that is far beyond what weare discussing here.

If you do use any outside sources to help you solve a homeworkproblem, you are not allowed to just copy the solution; this isconsidered a violation of the Honor Code.


Introduction

Use of Wikipedia and Other Sources; Honor Code Policy

The correct way of using outside sources is to understand thecontents of your source and then to write in your own words andwithout referring back to the source the solution of the problem. Yoursolution should differ in style significantly from the published solution.If you are not sure whether you are incorporating too much materialfrom your source in your solutions, then you must cite the source thatyou used.

You may cooperate with other students in finding solutions toassignments, but you must write your own answers. Do not simplycopy answers from other students. It is acceptable to discuss theproblems orally, but you may not look at each others’ written notes.Do not show your written solutions to any other student. This wouldbe considered a violation of the Honor Code.


Introduction

Use of Wikipedia and Other Sources; Honor Code PolicyIn this course, the following actions are examples of violations of theHonor Code:

Showing another student your written solution to a problem. Sending a screenshot of your solution via QQ, email or other means

to another student. Showing another student the written solution of a third student;

distributing some student’s solution to other students. Viewing another student’s written solution. Copying your solution in electronic form (LATEX source, PDF, JPG

image etc.) to the computer hardware (flash drive, hard disk etc.) ofanother student. Having another student’s solution in electronic formon your computer hardware.

If you have any questions regarding the application of the Honor Code,please contact me or any of the TAs.


Introduction

Grading Policy

It is JI policy to curve the grades for standard undergraduate coursesso that the median grade is a “B.” For this Honors course, the gradewill be curved to achieve a median grade of “B+.”

The grade will be composed of the course work and the exams asfollows:

First midterm exam: 20% Second midterm exam: 25% Final exam: 30% Course work: 25%


Introduction

What is Mathematics?

Mathematics is about certainty.

Mathematics is about modeling.

Mathematics is about thinking.


Introduction

Course Focus

Please forget everything you have learned in school; you haven’tlearned it.

Edmund Landau, Foundations of Analysis, 1929

The course will focus on functions of one variable, in particular Foundations: Set Theory, Logic, Numbers, Algebraic Structures. Functions, Limits and Continuity, Differentiation Integration

The main textbook for this course is Calculus, M. Spivak, 3rd Edition.However, additional material from other sources will be added to thelecture. Further textbook references will then be given.


Introduction

NotationThe notation used in these notes should be fairly well-known from school;some symbols will be defined explicitly to make sure that they areunderstood unambiguously (for example, the summation symbol

∑).

A particular convention needs some explaining: when an equality is adefinition (and not a statement that can be proven), the equal symbol willsometimes have a colon added on the side of the defined quantity. Forexample,

A ∪ B := x : x ∈ A or x ∈ B

indicates that that the union A ∪ B is being defined through this equality.Another example might involve

1 + 2 + · · ·+ n =n(n + 1)

2=: an,

indicating that the number an is defined to be n(n + 1)/2.


Introduction

Notation

On the other hand, it would be wrong to write

1 + 2 + · · ·+ n :=n(n + 1)

2

because the objects on the left and the right are already defined and theequality is a statement that can be proven; it does not define anything.


Basic Tools and Definitions

Part I




Basic Concepts in Logic

Basic Concepts in Set Theory

The Natural Numbers and Mathematical Induction

The Rational Numbers

Roots, Bounds & Real Numbers

Complex Numbers


Basic Tools and Definitions Basic Concepts in Logic






Complex Numbers



Propositional Logic, StatementsReference L. Loomis and S. Sternberg, Advanced Calculus, Sections 0.1-0.4.Freely available here (clickable link):

http://www.math.harvard.edu/∼shlomo/docs/Advanced_Calculus.pdf.

A statement (also called a proposition) is anything we can regard as beingeither true or false. We do not define here what the words “statement”,“true” or “false” mean. This is beyond the purview of mathematics andfalls into the realm of philosophy. Instead, we apply the principle that “weknow it when we see it.”We will generally not use examples from the “real world” as statements.The reason is that in general objects in the real world are much to looselydefine for the application of strict logic to make any sense. For example,the statement “It is raining.” may be considered true by some people(“Yes, raindrops are falling out of the sky.”) while at the same time falseby others (“No, it is merely drizzling.”) Furthermore, importantinformation is missing (Where is it raining? When is it raining?).


http://www.math.harvard.edu/~shlomo/docs/Advanced_Calculus.pdf


Statements1.1.1. Examples.

“3 > 2” is a true statement. “x3 > 0” is not a statement, because we can not decide whether it is

true or not. “the cube of any number is strictly positive (> 0)” is a false

statement.

The last example can be written using a statement variable x : “For any number x , x3 > 0”

The first part of the statement is a quantifier (“for any number x”), whilethe second part is called a statement frame or predicate (“x3 > 0”).A statement frame becomes a statement (which can then be either true orfalse) when the variable takes on a specific value; for example, 13 > 0 is atrue statement and 03 > 0 is a false statement.We will treat quantifiers and predicates in more detail later.



Working with Statements

We will denote statements by capital letters such as A,B,C , ... andstatement frames by symbols such as A(x) or B(x , y , z) etc.1.1.2. Examples.

A : 4 is an even number. B : 2 > 3. A(n) : 1 + 2 + 3 + ... + n = n(n + 1)/2.

We will now introduce logical operations on statements. The simplestpossible type of operation is a unary operation, i.e., it takes a statement Aand returns a statement B.1.1.3. Definition. Let A be a statement. Then we define the negation of A,written as ¬A, to be the statement that is true if A is false and false if Ais true.



Negation

1.1.4. Example. If A is the statement A : 2 > 3, then the negation of A is¬A : 2 > 3.We can describe the action of the unary operation ¬ through the followingtable:

A ¬AT FF T

If A is true (T), then ¬A is false (F) and vice-versa. Such a table is calleda truth table.We will use truth tables to define all our operations on statements.



ConjunctionThe next simplest type of operations on statements are binary operations.The have two statements as arguments and return a single statement,called a compound statement, whose truth or falsehood depends on thetruth or falsehood of the original two statements.1.1.5. Definition. Let A and B be two statements. Then we define theconjunction of A and B, written A ∧ B, by the following truth table:

A B A ∧ B

T T TT F FF T FF F F

The conjunction A ∧ B is spoken “A and B.” It is true only if both A andB are true, false otherwise.



Disjunction1.1.6. Definition. Let A and B be two statements. Then we define thedisjunction of A and B, written A ∨ B, by the following truth table:

A B A ∨ B

T T TT F TF T TF F F

The disjunction A∨B is spoken “A or B.” It is true only if either A or B istrue, false otherwise.1.1.7. Example.

Let A : 2 > 0 and B : 1+1 = 1. Then A∧B is false and A∨B is true. Let A be a statement. Then the compound statement “A ∨ (¬A)” is

always true, and “A ∧ (¬A)” is always false.



Proofs using Truth TablesHow do we prove that “A ∨ (¬A)” is an always true statement? We areclaiming that A ∨ (¬A) will be a true statement, regardless of whether thestatement A is true or not. To prove this, we go through all possibilitiesusing a truth table:

A ¬A A ∨ (¬A)T F TF T T

Since the column for A ∨ (¬A) only lists T for “true,” we see thatA ∨ (¬A) is always true. A compound statement that is always true iscalled a tautology.Correspondingly, the truth table for A ∧ (¬A) is

A ¬A A ∧ (¬A)T F FF T F

so A ∧ (¬A) is always false. A compound statement that is always false iscalled a contradiction.



Implication1.1.8. Definition. Let A and B be two statements. Then we define theimplication of B and A, written A⇒ B, by the following truth table:

A B A⇒ B

T T TT F FF T TF F T

We read “A⇒ B” as “A implies B,” “if A, then B” or “A only if B”.The fact that F⇒ T is a true statement may seem a little strange. Beforewe go into the reason for this, note that Definition 1.1.8 is a definition!That means, we don’t have to justify it in any way, there is nothing toprove, it is just a notation for a certain compound statement. Hence, wecould leave it at that and give no further explanation. However, it wouldbe useful and appropriate to motivate why we are defining the implicationin this way.



ImplicationTo illustrate why the implication is defined as it is, it is useful to look at aspecific example:

For all real numbers x , x > 0⇒ x3 ≥ 0 (1.1.1)

It is hoped that the reader will agree that this should be a true statement.In order to verify this, we would need to verify the truth of the statementobtained from the predicate “x > 0⇒ x3 ≥ 0” for all real numbers x .We shall see later that a real number can be either strictly positive, strictlynegative or zero (cf. the trichotomy law on Slide 66).

If x is a strictly positive number, both statements are true, so wehave T⇒ T.

If x is a strictly negative number, both statements are false, so wehave F⇒ F.

If x is zero, the first statement is false but the second statement istrue, so we have F⇒ T.



EquivalenceHence, if we want (1.1.1) to be a true statement, then we need to defineA⇒ B as being true in all of the three cases above. We next introduceyet another, important, compound statement.1.1.9. Definition. Let A and B be two statements. Then we define theequivalence of A and B, written A⇔ B, by the following truth table:

A B A⇔ B

T T TT F FF T FF F T

We read “A⇔ B” as “A is equivalent to B” or “A if and only if B”. Sometextbooks abbreviate “if and only if” by “iff.”If A and B are both true or both false, then they are equivalent.Otherwise, they are not equivalent. In propositional logic, “equivalence” isthe closest thing to the “equality” of arithmetic.



EquivalenceOn the one hand, logical equivalence is strange; two statements A and Bdo not need to have anything to do with each other to be equivalent. Forexample, the statements “2 > 0” and “100− 1 = 99” are both true, sothey are equivalent.1.1.10. Definition. Two compound statements A and B are called logicallyequivalent if A⇔ B is a tautology. We then write A ≡ B.

1.1.11. Example. The two de Morgan rules are the tautologies

¬(A ∨ B)⇔ (¬A) ∧ (¬B), ¬(A ∧ B)⇔ (¬A) ∨ (¬B).

To indicate that they are tautologies, we write

¬(A ∨ B) ≡ (¬A) ∧ (¬B), ¬(A ∧ B) ≡ (¬A) ∨ (¬B).

These logical equivalencies will be proved in the assignments.



Contraposition

An important tautology is the contrapositive of the compound statementA⇒ B.

(A⇒ B)⇔ (¬B ⇒ ¬A).

For example, for any number x , the statement “x > 0⇒ x3 > 0” isequivalent to “x3 > 0⇒ x > 0.” This principle is used in proofs bycontradiction.We prove the contrapositive using a truth table:

A B ¬A ¬B ¬B ⇒ ¬A A⇒ B (A⇒ B)⇔ (¬B ⇒ ¬A)T T F F T T TT F F T F F TF T T F T T TF F T T T T T



Rye Whiskey

The following song is an old Western-style song, called “Rye Whiskey” andperformed by Tex Ritter in the 1930’s and 1940’s.

If the ocean was whiskey and I was a duck,I’d swim to the bottom and never come up.But the ocean ain’t whiskey, and I ain’t no duck,So I’ll play jack-of-diamonds and trust to my luck.For it’s whiskey, rye whiskey, rye whiskey I cry.If I don’t get rye whiskey I surely will die.

The lyrics make sense (at least as much as song lyrics generally do).



Rye WhiskeyOne can use de Morgan’s rules and the contrapositive to re-write the songlyrics as follows

If I never reach bottom or sometimes come up,Then the ocean’s not whiskey, or I’m not a duck.But my luck can’t be trusted, or the cards I’ll not buck,So the ocean is whiskey or I am a duck.For it’s whiskey, rye whiskey, rye whiskey I cry.If my death is uncertain, then I get whiskey (rye).

These lyrics seem to be logically equivalent to the original song, but arejust humorous nonsense. This again illustrates clearly why it is futile toapply mathematical logic to everyday language.This example is due to (clickable link) W. P. Cooke, The AmericanMathematical Monthly, Vol. 76, No. 9 (Nov., 1969), p. 1051.


http://www.jstor.org/stable/pdf/2317141.pdf

http://www.jstor.org/stable/pdf/2317141.pdf


Logical QuantifiersIn the previous examples we have used predicates A(x) with the words “forall x .” This is an instance of a logical quantifier that indicates for which xa predicate A(x) is to be evaluated to a statement.In order to use quantifiers properly, we clearly need a universe of objects xwhich we can insert into A(x) (a domain for A(x)). This leads usimmediately to the definition of a set. We will discuss set theory in detaillater. For the moment it is sufficient for us to view a set as a “collection ofobjects” and assume that the following sets are known:

the set of natural numbers N (which includes the number 0), the set of integers Z, the set of real numbers R, the empty set ∅ (also written ∅ or ) that does not contain any

objects.If M is a set containing x , we write x ∈ M and call x an element of M.



Logical QuantifiersThere are two types of quantifiers:

the universal quantifier, denoted by the symbol ∀, read as “for all” and the existential quantifier, denoted by ∃, read as “there exists.”

1.1.12. Definition. Let M be a set and A(x) be a predicate. Then wedefine the quantifier ∀ by

∀x∈M

A(x) ⇔ A(x) is true for all x ∈ M

We define the quantifier ∃ by

∃x∈M

A(x) ⇔ A(x) is true for at least one x ∈ M

We may also write ∀x ∈ M : A(x) instead of ∀x∈M

A(x) and similarly for ∃.



Logical Quantifiers

We may also state the domain before making the statements, as in thefollowing example.1.1.13. Examples. Let x be a real number. Then

∀x : x > 0⇒ x3 > 0 is a true statement; ∀x : x > 0⇔ x2 > 0 is a false statement; ∃x : x > 0⇔ x2 > 0 is a true statement.

Sometimes mathematicians put a quantifier at the end of a statementframe; this is known as a hanging quantifier. Such a hanging quantifierwill be interpreted as being located just before the statement frame:

∃y : y + x2 > 0 ∀x

is equivalent to ∃y∀x : y + x2 > 0.



Contraposition and Negation of QuantifiersWe do not actually need the quantifier ∃ since

∃x∈M

A(x) ⇔ A(x) is true for at least one x ∈ M

⇔ A(x) is not false for all x ∈ M

⇔ ¬ ∀x∈M

(¬A(x)) (1.1.2)

The equivalence (1.1.2) is called contraposition of quanifiers. It impliesthat the negation of ∃x ∈ M : A(x) is equivalent to ∀x ∈ M : ¬A(x). Forexample,

¬(∃x ∈ R : x2 < 0

)⇔ ∀x ∈ R : x2 < 0.

Conversely,

¬ (∀x ∈ M : A(x)) ⇔ ∃x ∈ M : ¬A(x).



Vacuous Truth

If the domain of the universal quantifier ∀ is the empty set M = ∅, thenthe statement ∀x ∈ M : A(x) is defined to be true regardless of thepredicate A(x). It is then said that A(x) is vacuously true.1.1.14. Example. Let M be the set of real numbers x such that x = x + 1.Then the statement

∀x∈M

x > x

is true.This convention reflects the philosophy that a universal statement is trueunless there is a counterexample to prove it false. While this may seem astrange point of view, it proves useful in practice.This is similar to saying that “All pink elephants can fly.” is a truestatement, because it is impossible to find a pink elephant that can’t fly.



Nesting Quantifiers

We can also treat predicates with more than one variable as shown in thefollowing example.1.1.15. Examples. In the following examples, x , y are taken from the realnumbers.

∀x∀y : x2 + y2 − 2xy ≥ 0 is equivalent to ∀y∀x : x2 + y2 − 2xy ≥ 0.Therefore, one often writes ∀x , y : x2 + y2 − 2xy ≥ 0.

∃x∃y : x + y > 0 is equivalent to ∃y∃x : x + y > 0, often abbreviatedto ∃x , y : x + y > 0.

∀x∃y : x + y > 0 is a true statement. ∃x∀y : x + y > 0 is a false statement.

As is clear from these examples, the order of the quantifiers is important ifthey are different.


Basic Tools and Definitions Basic Concepts in Set Theory






Complex Numbers



Naive Set Theory: Sets via Predicates

We want to be able to talk about “collections of objects”; however, we willbe unable to strictly define what an “object” or a “collection” is (exceptthat we also want any collection to qualify as an “object”). The problemwith “naive” set theory is that any attempt to make a formal definitionwill lead to a contradiction - we will see an example of this later. However,for our practical purposes we can live with this, as we won’t generallyencounter these contradictions.We indicate that an object (called an element) x is part of a collection(called a set) X by writing x ∈ X . We characterize the elements of a setX by some predicate P :

x ∈ X ⇔ P(x). (1.2.1)

We write such a set X in the form X = x : P(x).



Notation for SetsWe define the empty set ∅ := x : x = x. The empty set has noelements, because the predicate x = x is never true.We may also use the notation X = x1, x2, ... , xn to denote a set. In thiscase, X is understood to be the set

X = x : (x = x1) ∨ (x = x2) ∨ · · · ∨ (x = xn).

We will frequently use the convention

x ∈ A : P(x) = x : x ∈ A ∧ P(x)

1.2.1. Example. The set of even positive integers is

n ∈ N : ∃k∈Z

n = 2k



Subsets and Equality of Sets

If every object x ∈ X is also an element of a set Y , we say that X is asubset of Y , writing X ⊂ Y ; in other words,

X ⊂ Y ⇔ ∀x ∈ X : x ∈ Y .

We say that X = Y if and only if X ⊂ Y and Y ⊂ X .We say that X is a proper subset of Y if X ⊂ Y but X = Y . In that casewe write X ⫋ Y .Some authors write ⊆ for ⊂ and ⊂ for ⫋. Pay attention to the conventionused when referring to literature.



Examples of Sets and Subsets1.2.2. Examples.

1. For any set X , ∅ ⊂ X . Since ∅ does not contain any elements, thedomain of the statement ∀x ∈ X : x ∈ Y is empty. Therefore, it isvacuously true and hence ∅ ⊂ X .

2. Consider the set A = a, b, c where a, b, c are arbitrary objects, forexample, numbers. The set

B = a, b, a, b, c , c

is equal to A, because it satisfies A ⊂ B and B ⊂ A as follows:

x ∈ A ⇔ (x = a) ∨ (x = b) ∨ (x = c) ⇔ x ∈ B.

Therefore, neither order nor repetition of the elements affects thecontents of a set.If C = a, b, then C ⊂ A and in fact C ⫋ A. Setting D = b, c wehave D ⫋ A but C ⊂ D and D ⊂ C .



Power Set and CardinalityIf a set X has a finite number of elements, we define the cardinality of Xto be this number, denoted by #X , |X | or cardX .We define the power set

P(M) := A : A ⊂ M.

Here the elements of the set P(M) are themselves sets; P(M) is the “setof all subsets of M.” Therefore, the statements

A ⊂ M and A ∈ P(M)

are equivalent.1.2.3. Example. The power set of a, b, c is

P(a, b, c) =∅, a, b, c, a, b, b, c, a, c, a, b, c

.

The cardinality of a, b, c is 3, the cardinality of the power set is|P(a, b, c)| = 8.



Operations on SetsIf A = x : P1(x), B = x : P2(x) we define the union, intersection anddifference of A and B by

A ∪ B := x : P1(x) ∨ P2(x), A ∩ B := x : P1(x) ∧ P2(x),A \ B := x : P1(x) ∧ (¬P2(x)).

Let A ⊂ M. We then define the complement of A by

Ac := M \ A.

If A ∩ B = ∅, we say that the sets A and B are disjoint.Occasionally, the notation A− B is used for A \ B and Ac is sometimesdenoted by A.1.2.4. Example. Let A = a, b, c and B = c , d. Then

A ∪ B = a, b, c , d, A ∩ B = c, A \ B = a, b.



Operations on SetsThe laws for logical equivalencies immediately lead to several rules for setoperations. For example, de Morgan’s laws (see Example 1.1.11) for ∧ and∨ imply

(A ∪ B)c = Ac ∩ Bc, (A ∩ B)c = Ac ∪ Bc,

Other such rules are, for example, A ∩ (B ∪ C ) = (A ∩ B) ∪ (A ∩ C )

A ∪ (B ∩ C ) = (A ∪ B) ∩ (A ∪ C )

(A ∪ B) \ C = (A \ C ) ∪ (B \ C )

(A ∩ B) \ C = (A \ C ) ∩ (B \ C )

A \ (B ∪ C ) = (A \ B) ∩ (A \ C )

A \ (B ∩ C ) = (A \ B) ∪ (A \ C )

Some of these will be proved in the recitation class and the assignments.



Operations on SetsOccasionally we will need the following notation for the union andintersection of a finite number n ∈ N of sets:

n∪k=0

Ak := A0 ∪ A1 ∪ A2 ∪ · · · ∪ An,

n∩k=0

Ak := A0 ∩ A1 ∩ A2 ∩ · · · ∩ An.

This notation even extends to n =∞, but needs to be properly defined:

x ∈∞∪k=0

Ak :⇔ ∃k∈N

x ∈ Ak ,

x ∈∞∩k=0

Ak :⇔ ∀k∈N

x ∈ Ak .



Operations on SetsIn particular,

∞∩k=0

Ak ⊂∞∪k=0

Ak .

1.2.5. Example. Let Ak = 0, 1, 2, ... , k for k ∈ N. Then∞∪k=0

Ak = N,∞∩k=0

Ak = 0.

To see the first statement, note that N ⊂∪∞

k=0 Ak since x ∈ N impliesx ∈ Ax implies x ∈

∪∞k=0 Ak . Furthermore,

∪∞k=0 Ak ⊂ N since

x ∈∪∞

k=0 Ak implies x ∈ Ak for some k ∈ N implies x ∈ N.For the second statement, note that

∩∞k=0 Ak ⊂ N. Now 0 ∈ Ak for all

k ∈ N. Thus 0 ⊂∩∞

k=0 Ak . On the other hand, for any x ∈ N \ 0 wehave x /∈ Ax−1 whence x /∈

∩∞k=0 Ak .



Ordered PairsReferences Loomis/Sternberg, Advanced Calculus, Chapter 0.6.See also Spivak, Calculus, Appendix to Chapter 3.

A set does not contain any information about the order of its elements,e.g.,

a, b = b, a.

Thus, there is no such a thing as the “first element of a set”. However,sometimes it is convenient or necessary to have such an ordering. This isachieved by defining an ordered pair, denoted by

(a, b)

and having the property that

(a, b) = (c , d) ⇔ (a = c) ∧ (b = d).



Cartesian Product of SetsThere are (at least) two ways of defining an ordered pair as a set:

(i) (a, b) := a, a, b or(ii) (a, b) := 1, a, 2, b.

The first definition does not need the natural numbers and uses only settheory, but both are of course equivalent.If A,B are sets and a ∈ A, b ∈ B, then we denote the set of all orderedpairs by

A× B := (a, b) : a ∈ A, b ∈ B.A× B is called the cartesian product of A and B.

In this manner we can also define an ordered triple (a, b, c) or, moregenerally, an ordered n-tuple (a1, ... , an) and the n-fold cartesian productA1 × · · · × An of sets Ak , k = 1, ... , n.If we take the cartesian product of a set with itself, we may abbreviate itusing exponents, e.g.,

N2 := N× N.



Problems in Naive Set TheoryIf one simply views sets as arbitrary “collections” and allows a set tocontain arbitrary objects, including other sets, then fundamental problemsarise. We first illustrate this by an analogy:Suppose a library contains not only books but also catalogs of books, i.e.,books listing other books. For example, there might be a catalog listing allmathematics books in the library, a catalog listing all history books, etc.Suppose that there are so many catalogs, that you are asked to createcatalogs of catalogs, i.e., catalogs listing other catalogs. In particular, youare asked to create the following:

(i) A catalog of all catalogs in the library. This catalog lists all catalogscontained in the library, so it must of course also list itself.

(ii) A catalog of all catalogs that list themselves. Does this catalog alsolist itself?

(iii) A catalog of all catalogs that do not list themselves. Does thiscatalog also list itself?



The Russel Antinomy

In the previous analogy, we can view “catalogs” as “sets” and being “listedin a catalog” as “being an element of a set”. Then we have

(i) The set of all sets must have itself as an element.(ii) The set of all sets that have themselves as elements may or may not

contain itself. (This may be decidable by adding some rule to settheory.)

(iii) It is not decidable whether the “set of all sets that do not havethemselves as elements” has itself as an element.



The Russel Antinomy

Formally, this paradox is known as the Russel antinomy:

1.2.6. Russel Antinomy. The predicate P(x) : x /∈ x does not define a setA = x : P(x).

Proof.If A = x : x /∈ x were a set, then we should be able to decide for any sety whether y ∈ A or y /∈ A. We show that for y = A this is not possiblebecause either assumption leads to a contradiction:

(i) Assume A ∈ A. Then P(A) by (1.2.1), i.e., A /∈ A. (ii) Assume A /∈ A. Then ¬P(A) by (1.2.1), therefore A ∈ A.

Since we cannot decide whether A ∈ A or A /∈ A, A can not be a set.



The Russel Antinomy

There are several examples in classical literature and philosophy of theRussel antimony and other“paradoxes”:

1. A person says: “This sentence is a lie.” Is he lying or telling thetruth? (An example of a sentence which is not a logical statement,since a truth value can not be assigned to it.)

2. In a mountain village, there is a barber. Some villagers shavethemselves (always) while the others never shave themselves. Thebarber shaves those and only those villagers that never shavethemselves. Who shaves the barber? (An illustration of the Russelantimony.)



Russel Antinomy

We will simply ignore the existence of such contradictions and build onnaive set theory. There are further paradoxes (antinomies) in naive settheory, such as Cantor’s paradox and the Burali-Forti paradox. All of theseare resolved if naive set theory is replaced by a modern axiomatic settheory such as Zermelo-Fraenkel set theory.

Further Information:Set Theory, Stanford Encyclopedia of Philosophy,http://plato.stanford.edu/entries/set-theory/.T. Jech, Set Theory: The Third Millennium Edition, Revised andExpanded, Reprinted by World Publishing Corporation Beijing, ￥78.00元


http://plato.stanford.edu/entries/set-theory/

Basic Tools and Definitions The Natural Numbers and Mathematical Induction






Complex Numbers



The Natural Numbers

We denote the set of natural numbers by N, i.e.,

N = 0, 1, 2, 3, 4, ....

The natural numbers can be constructed rigorously from set theory, butwe will not do so here.We assume that the familiar operations of addition and multiplication areknown for the natural numbers, i.e., for any two natural numbers a, b ∈ Nwe can define the natural number c = a+ b ∈ N called the sum of a andb. The addition has the following properties (here a, b, c ∈ N):

1. a+ (b + c) = (a+ b) + c (Associativity)2. a+ 0 = 0 + a = a (Existence of a neutral element)3. a+ b = b + a (Commutativity)



Multiplication

Similarly, we can define multiplication, where a · b ∈ N is called theproduct of a and b. We have the following properties (here a, b, c ∈ N):

1. a · (b · c) = (a · b) · c (Associativity)2. a · 1 = 1 · a = a (Existence of a neutral element)3. a · b = b · a (Commutativity)

We also have a property that essentially states that addition andmultiplication are consistent,

a · (b + c) = a · b + a · c . (Distributivity)

Note that we are not able to define subtraction or division for all naturalnumbers.



Notation for Addition and Multiplication

For any numbers a1, a2, ... , an we define the notation

a1 + a2 + · · ·+ an =:n∑

j=1

aj =:∑

1≤j≤n

aj

anda1 · a2 · · · an =:

n∏j=1

aj =:∏

1≤j≤n

aj .

(We will use this notation for rational and real numbers ak ; see thefollowing sections.) For n ∈ N we define

0! := 1 and n! := n · (n − 1)! for n > 0.

This is an example of a recursive definition.



Powers and OrderingWe are also able to define the exponential, “intuitively” written as

ab = a · a · ... · a︸︷︷︸b times

for a, b ∈ N

by setting a0 := 1 and an := a · an−1 (another recursive definition). Wenote that

ab+c = ab · ac and (ab)c = ab·c .

We can imbue the natural numbers with an ordering, denoted by <. Wesay that a < b if there exists some c ∈ N∗ := N \ 0 such that b = a+ c.We write a > b if b < a.We define the symbol ≤ to mean “< or =”, so

a ≤ b ⇔(a < b ∨ a = b

).

We write a ≥ b if b ≤ a.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 56 / 624


Mathematical InductionReference Spivak, Chapter 2.

Often one wants to show that some statement frame A(n) is true for alln ∈ N with n ≥ n0 for some n0 ∈ N. Mathematical induction works byestablishing two statements:

(I) A(n0) is true.(II) A(n + 1) is true whenever A(n) is true for n ≥ n0, i.e.,

∀n∈Nn≥n0

(A(n)⇒ A(n + 1)

)Note that (II) does not make a statement on the situation when A(n) isfalse; it is permitted for A(n + 1) to be true even if A(n) is false.The principle of mathematical induction now claims that A(n) is true forall n ≥ n0 if (I) and (II) are true. This follows from the fifth Peano axiom(the induction axiom).



Binomial FormulaIn practice, one first establishes A(n0), then shows that A(n + 1) is true,assuming that A(n) holds.

As an example, we will prove the binomial formula:

A(n) : (a+ b)n =n∑

k=0

(n

k

)akbn−k ,

where a, b ∈ R, n ∈ N and(n

k

):=

n!

k!(n − k)!, n! :=

n∏k=1

k = 1 · 2 · 3 · ... · n, 0! := 1.

We will need the relation(n + 1

k

)=

(n

k − 1

)+

(n

k

). (1.3.1)



Binomial Formula

We start by showing that A(0) is true. We see that

(a+ b)0 = 1, and0∑

k=0

(0

k

)akb0−k =

(0

0

)a0b0−0 = 1.

Hence A(0) is true, since both sides of the equation are equal to unity.

We now consider A(n + 1):

(a+ b)n+1 = (a+ b)(a+ b)n = (a+ b)n∑

k=0

(n

k

)akbn−k (1.3.2)

We have inserted the statement A(n) in (1.3.2), since we want to showthat A(n + 1) is true under the condition that A(n) is true.



Binomial Formula

Applying some algebraic manipulations, we obtain

(a+ b)n+1 = (a+ b)n∑

k=0

(n

k

)akbn−k

=n∑

k=0

(n

k

)ak+1bn−k +

n∑k=0

(n

k

)akbn+1−k

=n+1∑j=1

(n

j − 1

)ajbn+1−j +

n∑k=0

(n

k

)akbn+1−k

where we have set j = k + 1.



Binomial FormulaWe can now rename the index back to k and split off some terms of thesums:

(a+ b)n+1 = an+1 +n∑

k=1

(n

k − 1

)akbn+1−k + bn+1 +

n∑k=1

(n

k

)akbn+1−k

= an+1 + bn+1 +n∑

k=1

((n

k − 1

)+

(n

k

))akbn+1−k

Now we apply (1.3.1):

(a+ b)n+1 = an+1 + bn+1 +n∑

k=1

(n + 1

k

)akbn+1−k

=n+1∑k=0

(n + 1

k

)akbn+1−k ,

which is just A(n + 1). Hence the binomial formula holds for all n ∈ N.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 61 / 624

Basic Tools and Definitions The Rational Numbers






Complex Numbers



Basic Properties of Numbers - AdditionReference Spivak, Chapter 1.

We assume that the set of rational numbers (fractions) Q is known andhas been introduced from set theory. (This procedure will be discussed indetail in the course Ve203 Discrete Mathematics). We recapitulate thebasic properties of rational numbers, using the notation of Spivak’s book.For addition we have

associativity ∀a,b,c∈Q

a+ (b + c) = (a+ b) + c (P1)

neutral element ∃0∈Q

∀a∈Q

a+ 0 = 0 + a = a (P2)

inverse element ∀a∈Q

∃−a∈Q

(−a) + a = a+ (−a) = 0 (P3)

commutativity ∀a,b∈Q

a+ b = b + a. (P4)



Multiplication

For multiplication, we have a similar set of properties

associativity ∀a,b,c∈Q

a · (b · c) = (a · b) · c, (P5)

neutral element ∃1∈Q1 =0

∀a∈Q

a · 1 = 1 · a = a, (P6)

inverse element ∀a∈Qa =0

∃a−1∈Q

a · a−1 = a−1 · a = 1, (P7)

commutativity ∀a,b∈Q

a · b = b · a. (P8)

With these properties we can prove that if a = 0 and a · b = a · c , thenb = c . Furthermore,if a · b = 0, then either a = 0 or b = 0.



Combining Addition and Multiplication

These twice four properties for addition and multiplication are basicallyindependent of each other (the only connection was that 0 = 1). Thefollowing property ensures that the two operations “work well together”,

distributivity ∀a,b,c∈Q

a · (b + c) = a · b + a · c . (P9)

It is this relation that allows us to prove that a− b = b − a only if a = b,and it is also the law of distributivity that we employ to do elementarymultiplications.



The Positive Rational NumbersWe assume that we know what a strictly positive rational number is, i.e.,that there exists a set P with the property that for every number a oneand only one of the following holds:

1. a = 0,2. a ∈ P,3. −a ∈ P .

This is known as the trichotomy law (P10) for the rational numbers. Wealso assume that the set of positive numbers is closed under addition andmultiplication:

If a and b are in P , then a+ b is in P , (P11)If a and b are in P, then a · b is in P . (P12)

We then writeP = p ∈ Q : p > 0.

so that p > 0 is defined by p ∈ P.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 66 / 624


The Absolute ValueWe define the absolute value (sometimes called the modulus) |a| of arational number a by

|a| =

a if a ≥ 0,

−a if a < 0.An important basic result is the triangle equality:

1.4.1. Triangle Inequality. For all rational numbers a, b ∈ Q, we have

|a+ b| ≤ |a|+ |b|. (1.4.1)

Proof.We prove the triangle inequality by considering all possible cases for a andb according to the trichotomy law (P10):

a = 0, b ∈ Q. If a = 0 we have a+ b = b and |a| = a = 0, so|a|+ |b| = |b|. Both sides of (1.4.1) are then equal to |b|, so theinequality holds.



The Triangle InequalityProof.We prove the triangle inequality by considering all possible cases for a andb according to the trichotomy law:

a > 0, b > 0. In this case, a+ b > 0 by (P11) so |a+ b| = a+ b.Furthermore, |a| = a and |b| = b, so both sides of (1.4.1) are equal toa+ b.

a < 0, b < 0. In this case, a+ b < 0 by (P11) (why?) so|a+ b| = −a− b. Furthermore, |a| = −a and |b| = −b, so both sidesof (1.4.1) are equal to −a− b.

a < 0, b > 0, a+ b ≥ 0. In this case, |a|+ |b| = b − a and|a+ b| = b + a. Since a < 0 and −a > 0 we have

|a+ b| = b + a < b + 0 = b < b + (−a) = b − a = |a|+ |b|

and the inequality holds.The case a < 0, b > 0, a+ b < 0 is left to the reader!



The Triangle Inequality

As a corollary of the triangle inequality we obtain the reverse triangleinequality:

|a+ b| ≥∣∣|a| − |b|∣∣ for all a, b ∈ Q. (1.4.2)

For the proof, it is sufficient to apply the triangle inequality as follows:

|b| = |a+ b − a| ≤ |a+ b|+ |a|.

Rearranging gives|a+ b| ≥ |b| − |a|.

Interchanging a and b then yields (1.4.2).


Basic Tools and Definitions Roots, Bounds & Real Numbers






Complex Numbers



The Rational Numbers are Insufficient

While the rational numbers are closed with respect to addition andmultiplication, and furthermore are large enough to include “inverseelements” −a and a−1 for these operations, they are still “incomplete” inmore than one sense:

1. Algebraically: there is no inverse element for operation of squaring.Given b ∈ Q, it may not be possible to find a ∈ Q such that a2 = b.

2. Analytically: Given a sequence of rational numbers that come closerand closer to each other, there may not be a “limiting value” in Q.(We will discuss this later.)

3. Set theoretically: Given a bounded subset of Q, there may not be aleast upper or greatest lower bound for this set.

We will discuss i) and iii) in more detail now, and come back to ii) later,when we construct a suitable expansion of the rational numbers.



The Square Root Problem

Reference Spivak, end of Chapter 2.

1.5.1. Theorem. There exists no number a ∈ Q such that a2 = 2.

Proof.Assume that a = p/q ∈ Q has the property that (p/q)2 = 2, wherep, q ∈ N. We further assume that p and q have no common divisor greaterthan 1. Then

p2 = 2q2,

so p2 is even. This implies that p is even, so p = 2k for some k ∈ N. Butthen

2q2 = 4k2 ⇒ q2 = 2k2 is even ⇒ q is even.

Thus p and q are both divisible by 2.



Bounded Subsets of Q

Reference Spivak, beginning of Chapter 8.

1.5.2. Definition. We say that a set U ⊂ Q is bounded if there exists aconstant c ∈ Q such that

|x | ≤ c for all x ∈ U.

If this is not the case, we say that U is unbounded.Numbers c1 and c2 such that

c1 ≤ x ≤ c2 for all x ∈ U

are called lower and upper bounds for U, respectively.We say that a set U ∈ Q is bounded above if there exists an upper boundfor U, and bounded below if there exists a lower bound for U.



Bounded Subsets of Q

1.5.3. Examples. The set U = 1/n ∈ Q : n ∈ N∗ is bounded since |x | ≤ 1 for all

x ∈ U. Furthermore, the number −1 is a lower bound and 1 is anupper bound. Of course, there are many more upper and lowerbounds.

The set N ⊂ Q is bounded below (lower bound: 0), but not boundedabove. It is unbounded.



Maxima and Minima of Sets

1.5.4. Definition. Let U ⊂ Q be a subset of the rational numbers. We saythat a number x1 ∈ U is the minimum of U if

x1 ≤ x , for all x ∈ U

and we write x1 =: minU.Similarly, we say that x2 ∈ U is the maximum of U if

x2 ≥ x , for all x ∈ U

and write x2 =: maxU.A set that is not bounded below has no minimum, and a set that is notbounded above has no maximum. However, even a bounded set does notneed to have a maximum or a minimum.



Greatest Lower and Least Upper Bounds of Sets

1.5.5. Definition. We say that an upper bound c2 ∈ Q of a set U ⊂ Q isthe least upper bound of U if no c ∈ Q with c < c2 is an upper bound.We then write c2 =: supU.We say that a lower bound c1 ∈ Q is the greatest lower bound of U if noc ∈ Q with c > c1 is a lower bound. We then write c1 =: inf U.

1.5.6. Example. The set U = 1/n ∈ Q : n ∈ N∗ has

inf U = 0, supU = 1,

minU does not exist, maxU = 1.

In general, if maxU exists, supU = maxU, and if minU exists,inf U = minU.



Existence of Suprema and Infima

Not every bounded set in Q has an infimum or supremum: consider

U = x ∈ Q : 2 < x2 ≤ 4 ∧ x > 0

It is clear that supU = maxU = 2, but the minimum and even theinfimum do not exist (you will prove this in the assignments).On the other hand, if we add a postulate to our previous axioms, we canwiden the rational numbers to include solutions of x2 = 2.



The Real NumbersWe define the set of real numbers R as the smallest extension of therational numbers Q such that the following property holds:

(P13) If A ⊂ R, A = ∅ is bounded above, then there exists aleast upper bound for A in R.

We call all real numbers that are not rational irrational numbers.We will not immediately construct such a set of objects, but do so a littlelater in this lecture. For now, we stipulate that such a set exists, and wecall it the set of real numbers R. Furthermore, we can add and multiplyelements of R and in general all properties (P1)-(P12) remain valid forelements of R.Note that (P13) is sufficient to guarantee the existence of greatest lowerbounds; we need but consider −A = x ∈ R : − x ∈ A instead of A. Theleast upper bound of −A will yield the negative of the greatest lowerbound of A.



The Square Root ProblemWe can now solve the square root problem:

1.5.7. Theorem. For every real number x > 0, there exists a uniquepositive real number y solving the equation y2 = x .

Proof.The proof is in two parts: existence and uniqueness, i.e., we showseparately that i) there exists a solution and ii) the solution is unique.Often, it is easier to show uniqueness than existence. Also, the uniquenessproof may yield helpful results that can be used for the (generally moredifficult) existence proof. Therefore, one often starts with proving theuniqueness.Assume that there are two numbers y1 > y2 > 0 such that y21 = y22 = x .Then

0 = y21 − y22 = (y1 − y2)︸︷︷︸>0

(y1 + y2)︸︷︷︸>0

> 0



The Square Root ProblemProof (continued).We now show the existence of a solution to the equation y2 = x . LetM := t ∈ R : t > 0 ∧ t2 > x. Then M is non-empty and bounded frombelow, so by (P13) it has a greatest lower bound y = infM. We willestablish that y2 = x by showing that y2 > x and y2 < x lead tocontradictions. This strategy essentially uses the trichotomy law (P10).Assume that y2 > x , so y > x

y . Then

0 <

(y − x

y

)2

= y2 − 2yx

y+

(x

y

)2

,

so thaty2 +

(x

y

)2

> 2yx

y= 2x .



The Square Root ProblemProof (continued).Set s := 1

2(y + xy ). Then

s2 =1

4

(y2 + 2

x

yy +

(xy

)2)>

1

4(2x + 2x) = x

so s ∈ M. However,

s =1

2(y +

x

y) <

1

2(y + y) = y = infM

The proof that y2 < x leads to a contradiction is simpler and will bediscussed in the assignments.

Therefore, the real numbers include all square roots of positive numbers;we denote the positive solution to y2 = x by y =

√x . Similarly, we write

the positive solution to yn = x as y = n√x .



Subsets of the Real NumbersWe introduce some notation we will have occasion to use often later.1.5.8. Definition. Let a, b ∈ R with a < b. Then we define the followingspecial subsets of R, which we call intervals:

[a, b] := x ∈ R : (a ≤ x) ∧ (x ≤ b) = x ∈ R : a ≤ x ≤ b,[a, b) := x ∈ R : (a ≤ x) ∧ (x < b) = x ∈ R : a ≤ x < b,(a, b] := x ∈ R : (a < x) ∧ (x ≤ b) = x ∈ R : a < x ≤ b,(a, b) := x ∈ R : (a < x) ∧ (x < b) = x ∈ R : a < x < b.

Furthermore, for any a ∈ R we set

[a,∞) := x ∈ R : x ≥ a, (a,∞) := x ∈ R : x > a,(−∞, a] := x ∈ R : x ≤ a), (−∞, a) := x ∈ R : x < a

Finally, we set (−∞,∞) := R.



Subsets of the Real Numbers

1.5.9. Definition. We call an interval I open if I = (a, b), a ∈ R or a = −∞, b ∈ R or b =∞; closed if I = [a, b], a, b ∈ R; half-open if is neither open or closed.

We often denote intervals by capital letters I or J. If we say “Let I be anopen interval…” we mean “Let I ⊂ R be a set such that there existnumbers a, b ∈ R such that I = (a, b)…”



Subsets of the Real NumbersMore generally, we have the following classification of points with respectto a set:1.5.10. Definition. Let A ⊂ R.

(i) We call x ∈ R an interior point of A if there exists some ε > 0 suchthat the interval (x − ε, x + ε) ⊂ A. The set of interior points of A isdenoted by intA.

(ii) We call x ∈ R an exterior point of A if there exists some ε > 0 suchthat the interval (x − ε, x + ε) ∩ A = ∅.

(iii) We call x ∈ R a boundary point of A if for every ε > 0(x − ε, x + ε) ∩ A = ∅ and (x − ε, x + ε) ∩ Ac = ∅. The set ofboundary points of A is denoted by ∂A.

(iv) We call x ∈ R an accumulation point of A if for every ε > 0 theinterval (x − ε, x + ε) ∩ A \ x = ∅.




1.5.11. Example. For the interval A = [0, 1), intA = (0, 1), Any x ∈ R \ [0, 1] is an exterior point, ∂A = 0, 1, Any x ∈ [0, 1] is an accumulation point.

1.5.12. Example. For the set A =x ∈ R : x = 1

n , n ∈ N \ 0

, intA = ∅, Any x ∈ R \ (A ∪ 0) is an exterior point, ∂A = A ∪ 0, Only x = 0 is an accumulation point.




We can hence define general open and closed sets:1.5.13. Definition.

(i) A set A ⊂ R is called open if all points of A are interior points, i.e., if

A = intA.

(ii) A set A ⊂ R is called closed if R \ A is open.(iii) The set

A := A ∪ ∂A

is called the closure of A. It is the smallest closed set that contains A.


Basic Tools and Definitions Complex Numbers






Complex Numbers



The Complex NumbersReference Spivak, Chapter 24.

Although we have added the square roots (solutions to y2 − x = 0, x > 0)to the rational numbers, and also many other elements through theaddition of (P13), we have still not quite solved the problem of solutionsof quadratic equations: there is no solution y to y2 + 1 = 0 in the realnumbers. (This is expressed by saying that R is algebraically incomplete.)We again need to extend the real numbers to include solutions toequations such as y2 = −1. We follow the familiar procedure ofconsidering pairs of real numbers and define the complex numbers C as

C = (a, b) : a, b ∈ R = R2

From the point of view of sets, there is no difference between C and R2.We consider the real numbers R as a subset of C, writing (x , 0) ∈ C forx ∈ R.



The Complex Numbers

We define addition in C through

(a1, b1) + (a2, b2) := (a1 + a2, b1 + b2),

so in particular the sum of two real numbers is again a real number.We next define multiplication in an apparently strange way:

(a1, b1) · (a2, b2) := (a1a2 − b1b2, a1b2 + a2b1).

However, it turns out that this multiplication satisfies the axioms(P5)-(P8), and also (P9) together with addition. The neutral element is(1, 0) (as expected), but the inverse element is a bit more complicated.



The Complex NumbersFor z = (a, b) ∈ C \ 0 the multiplicative inverse is given by

z−1 =

(a

a2 + b2,−b

a2 + b2

),

i.e., zz−1 = (1, 0).With the previously defined multiplication we can find a solution to theequation z2 = −1:

(0, 1)2 = (0, 1) · (0, 1) = (0 · 0− 1 · 1, 0 · 1 + 0 · 1) = (−1, 0).

In fact, it is possible to prove that C is algebraically closed:

1.6.1. Theorem. Every polynomial equation of the form∑n

k=0 akxk = 0,

where a0, ... , an are fixed complex numbers, has precisely n complexsolutions x1, ... , xn ∈ C.



The Complex NumbersWe further define multiplication of complex numbers by real numbers inthe following way:

λ · (a, b) = (λa,λb), λ ∈ R, (a, b) ∈ C.

Note that is multiplication together with addition in C is distributive.We can then split any complex number into two components:

(a, b) = (a, 0) + (0, b) = a · (1, 0) + b · (0, 1).

The pair (1, 0) ∈ C corresponds to 1 ∈ R, while the pair (0, 1) ∈ C is oftendenoted by the letter i . Hence we usually write

C ∋ z = (a, b) = a · 1 + b · i = a+ bi , a, b ∈ R,

where a =: Re z is called the real part and b =: Im z the imaginary part ofa complex number z = a+ bi .



Visualizing Real Numbers

We visualize real numbers by drawing a single line, with an arrow pointingto the right, in the direction of increasing positive numbers:

-4 -3 -2 -1 0 1 2 3 4

We will identify the set of real numbers with this line so that a single realnumber corresponds to a “point” on the line. This is of courseproblematic; in how far this is justified, which assumptions enter into thisidentification and which precise axioms need to be stated falls into thepurview of analytic geometry. We will not discuss this further in thislecture, but simply assume a correspondence between points on a line andthe real numbers.



Visualizing Complex NumbersWe visualize complex numbers by drawing two perpendicular lines(“axes”), with arrows pointing to the right and upward. each representingthe real numbers. A complex number z = a+ bi = Re z + (Im z) · i isrepresented by a point in the plane whose orthogonal projection onto thehorizontal axis gives Re z , and whose orthogonal projection onto thevertical axis gives Im z .

-4 -3 -2 -1 1 2 3 4Re z

-2

-1

1

2

3

4

Im z

z = Re z + ä Im z



The Complex NumbersWe note that we have no ordering relation on C; it makes no sense to saythat one complex number is bigger or smaller than the other! Therefore,properties (P10)-(P12) do not hold for a complex number.

1.6.2. Notation. Whenever we write x > 0, we automatically assume thatx ∈ R.

1.6.3. Definition. We define the modulus or absolute value of a complexnumber z = a+ bi by

|z | =√

a2 + b2 =√z · z .

Here z = a− bi is called the complex conjugate of z .This definition of the absolute value is called an extension of the absolutevalue from the real numbers to the complex numbers.



The Complex NumbersThe complex modulus has a geometric interpretation as distance:

-4 -3 -2 -1 1 2 3 4Re z

-2

-1

1

2

3

4

Im z

z = Re z + ä Im z

ÈzÈ = HRe zL2 + HIm zL2

In fact, the geometric distance between two complex numbers z1 and z2can is

ϱ(z1, z2) = |z1 − z2|.



The Complex Numbers

Although we cannot define the ordering < on the complex numbers, wecan nevertheless define the concept of bounded sets in C.

1.6.4. Definition. Let z0 ∈ C. Then we define the open ball of radiusR > 0 centered at z0 by

BR(z0) := z ∈ C : |z − z0| < R.

We say that a set Ω ⊂ C is bounded if there exists some R > 0 such thatΩ ⊂ BR(0).Of course, it makes no sense to talk of upper or lower bounds of sets.



The Complex Numbers

The open ball of radius R about z0 ∈ C has a geometric interpretation asa neighbourhood:

-4 -3 -2 -1 1 2 3 4Re z

-2

-1

1

2

3

4

Im z

z0

R

BR Iz0M


Real Functions, Convergence and Continuity

Part II




Functions and Maps

Sequences

Real Functions

Limits of Functions

Continuous Functions


Real Functions, Convergence and Continuity Functions and Maps

Functions and Maps

Sequences

Real Functions

Limits of Functions




FunctionsReference Spivak, Calculus, Chapter 3.Our goal is to put the concept of a function on a sound mathematicalfooting. From school, we imagine a function to be an assignment ofvalues, e.g.,

x 7→ x2

where to every number x we associate the square of this number. Thisconcept of a function is very active: we take a number x (input) andreturn a number x2 (output). We are doing something with x . Of course,we have to specify which numbers x are acceptable as input (we call thisthe domain of the function) and in what set the output lies (this set issometimes called the codomain).We also need to make sure that every input value yields exactly one outputvalue. In summary, a function should be an object to which we associate

(i) a domain and co-domain(ii) a unique assignment



DefinitionNote that the above is mostly a “wishlist” - we still have no properdefinition of what a function actually is. It turns out that defining a“unique assignment” is very difficult in mathematics. The idea of “doingsomething” does not lend itself to contradiction-free definitions. Inmodern mathematics, active concepts are always replaced by passive orstatic concepts that do not involve “actions”. The definition of a functionis the first, basic example of this.In the following definition, we will replace the idea of an assignment withthe idea of a set. Essentially, the assignment

x 7→ x2

is replaced with the set of all pairs

(x , x2).



Definition2.1.1. Definition. Let X and Y be sets and let P be a predicate withdomain X × Y . Let

f := (x , y) ∈ X × Y : P(x , y)

and assume that P has the property that

∀(x1,y1)∈f

∀(x2,y2)∈f

x1 = x2 ⇒ y1 = y2. (2.1.1)

Letdom f :=

x ∈ X : ∃

y∈Y: (x , y) ∈ f

Then we say that f is a function from dom f to Y . The set dom f ⊂ X iscalled the domain of f and Y is called the codomain of f . We also definethe range of f by

ran f :=y ∈ Y : ∃

x∈X: (x , y) ∈ f

.



Examples2.1.2. Example. Let X ,Y = Z and P(x , y) : x2 = y . Then

f = (x , y) ∈ Z2 : y = x2 = (0, 0), (−1, 1), (1, 1), (−2, 4), (2, 4), ....

The function f has domain

dom f =x ∈ Z : ∃

y∈Z: y = x2

= Z

and the range satisfies

ran f =y ∈ Z : ∃

x∈Z: y = x2

⊂ N.

By contrast, the predicate Q(x , y) : y2 = x does not give rise to a functionfrom Z to Z because

g = (x , y) ∈ Z2 : y2 = x = (0, 0), (1,−1), (1, 1), (4, 2), (4,−2), ...

does not satisfy (2.1.1): (4, 2) and (4,−2) are both in g , but 2 = −2.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 104 / 624


Functions as AssignmentsThe previous example illustrates how our “wishlist” for the properties offunctions is fulfilled by Definition 2.1.1. Having a formal definition is goingto be useful whenever we need to check whether an object actually is afunction or not.In practice, however, we will still think of a function as an assignment(albeit implemented technically through a static set of pairs). In theexpression

f = (x , y) ∈ X × Y : P(x , y)

the predicate P is not arbitrary, but for every x ∈ dom f there must exist aunique value y for which P(x , y) is true. We denote this value by f (x)(here the letter f is used in a slightly different way; it does not denote aset), i.e., we write y = f (x). (Thus, P(x , f (x)) is a tautology.)We can think of a function as an assignment of f (x) ∈ Y to certain x ∈ Xsuch that P(x , f (x)) is a tautology, where we are given some sets X ,Yand a predicate with domain X × Y .



Notation

The idea of a function as an assignment, while mathematically difficult tohandle, is very intuitive, and in calculus we mostly think of functions inthis way. Therefore, we use the following notation to denote a function fwith domain Ω ⊂ X and range contained in Y :

f : Ω → Y , x 7→ f (x).

The above is the traditional notation in calculus for a function f . Itcontains all necessary information: the domain of f and the assignmentwhich makes P(x , f (x)) a tautology. (Note the different shapes of thearrows; they are also traditional.) An alternative notation is

f : Ω → Y , y = f (x).

Another word for function is also map or mapping.



Notation

2.1.3. Example. Instead of writing

f = (x , y) ∈ Z2 : y = x2

we write

f : Z→ Z, x 7→ x2

or

f : Z→ Z, f (x) = x2.



Compositions of Functions

If X ,Y ,Z are sets, Ω ⊂ X , Σ ⊂ Y , and two maps f : Ω → Y , g : Σ → Zare given such that ran f ⊂ Σ = dom g , then we define the composition

g f : Ω → Z , x 7→ g(f (x)).

2.1.4. Example. Consider the function f : Z→ N, f (x) = 1 + x2 andg : Z \ 0 → Q, g(x) = 1/x . Then ran f ⊂ N \ 0 ⊂ dom g , so we candefine the composition

g f : Z→ Q, x 7→ 1

1 + x2.


Real Functions, Convergence and Continuity Sequences

Functions and Maps

Sequences

Real Functions

Limits of Functions




Sequences - DefinitionsReference Spivak, Calculus, Chapter 22.

We will first discuss a relatively elementary type of map, the so-called(infinite) sequences. A sequence is simply a map from the natural to thereal or complex numbers. In fact, we define

2.2.1. Definition. Let Ω ⊂ N. A map Ω → R is called a real sequence, amap Ω → C a complex sequence.

2.2.2. Notation. We denote the values of a sequence by an, i.e., a sequencemaps n 7→ an, n ∈ Ω ⊂ N. Instead of (n, an) : n ∈ Ω we denote asequence by any of the following,

(an)n∈Ω = (an) = a0, a1, a2, ... .

The values an are called terms of the sequence.



Sequences - Examples

2.2.3. Examples. The first terms of the sequence of even numbers, (an) = (2n) are

(an) = 0, 2, 4, 6, 8, 10, ...

The definition an = 2n is called an explicit definition of the sequence. The Fibonacci sequence is defined by a0 = a1 = 1, an = an−1 + an−2.

This is called a recursive definition. A sequence that takes on only two values in an alternating manner is

called an alternatng sequence, e.g. an = (−1)n. The above are all integer sequences since their values are integers. A

non-integer sequence is, for example, (an)n∈N\0 = (1/n)n∈N\0.



Sequences - Convergence2.2.4. Definition. The real or complex sequence (an) : Ω → X , Ω ⊂ N,X = R or C, is said to converge with limit a ∈ X if

∀ε>0

∃N∈N

∀n>N

|an − a| < ε.

We then write “ limn→∞

an = a” or “an → a as n→∞.”

A sequence that does not converge (to any limit) is called divergent.

2.2.5. Remark. Definition 2.2.4 can alternatively be formulated as

limn→∞

an = a :⇔ ∀ε>0

∃N∈N

∀n>N

an ∈ Bε(a). (2.2.1)

where

Bε(a) = z ∈ X : |a− z | < ε, ε > 0, a ∈ X ,

for X = R or C is called an ε-neighbourhood of a. We hence say, “Forsufficiently large n, an is arbitrarily close to a.”




2.2.6. Theorem. The sequence (an), an = 1n , n ∈ N \ 0, converges

towards a = 0.

Proof.We need to show that for any ε > 0 we can find an N ∈ N such that forall n > N we have

|an − a| =∣∣∣∣1n − 0

∣∣∣∣ = 1

n< ε.

We note that 1n < ε⇔ n > 1

ε . Therefore, given ε > 0, we choose anyN ∈ N with N > 1/ε. Then for all n > N we have n > 1/ε and hence1/n < ε. Thus |an − 0| = an < ε for all n > N, so by the definition ofconvergence we have

limn→∞

1

n= 0.




2.2.7. Example. The sequence (an), an = 2n + 1, n ∈ N, does notconverge.

Proof.Let a be any real number. We will show that an → a. This is equivalent toshowing that

∃ε>0

∀N∈N

∃n>N

|an − a| ≥ ε. (2.2.2)

We will take ε = 1 and show that for all N ∈ N there exists an n > N suchthat |an − a| ≥ 1. Let N be fixed and choose n > maxN, |a|. Then

|an − a| ≥∣∣|an| − |a|∣∣ = ∣∣2n + 1− |a|

∣∣ ≥ ∣∣1 + |a|∣∣ ≥ ε,which is what we wanted to show.



Sequences - Divergence to Infinity

2.2.8. Definition. A real sequence (an) is called divergent to infinity if

∀C>0

∃N∈N

∀n>N

an > C .

It is called divergent to minus infinity if

∀C<0

∃N∈N

∀n>N

an < C .

2.2.9. Notation. We write “ limn→∞

an =∞” or “an →∞ as n→∞” in thefirst case, “ lim

n→∞an = −∞” or “an → −∞ as n→∞” in the second case.




2.2.10. Examples. The sequence (an), an = 2n + 1 diverges to infinity. Any sequence that diverges to infinity or minus infinity does not

converge. The sequence (an), an = (−1)n does not converge and does not

diverge to infinity or minus infinity.



Sequences - General Results

2.2.11. Definition. A sequence is bounded if its range is a bounded set.The range of a sequence is of course

ran(an) = an : n ∈ N.

Since

supa0, a1, ... ≤ supn∈N|an| and |infa0, a1, ...| ≤ sup

n∈N|an|,

it follows that a sequence is bounded if and only if supn∈N|an| <∞.




2.2.12. Lemma. A convergent sequence is bounded.

Proof.Let an → a. Then there exists some N such that |an − a| < 1 for alln > N. By the triangle inequality, this means that

|an| = |an − a+ a| ≤ |an − a|+ |a| < 1 + |a|

for n > N. It follows that

|an| ≤ max|a1|, ... , |aN |, |a|+ 1 =: C

for all n ∈ N, so sup|an| ≤ C <∞ and the range is bounded.



Sequences - General ResultsThe following result looks harmless, even trivial, but is of great importancefor applications, proofs and later generalizations.2.2.13. Lemma.

1. The sequence (an) converges to a if and only if the sequence(bn) := (an − a) converges to zero, i.e.,

an → a ⇔ an − a→ 0

2. The sequence (an) converges to 0 if and only if the sequence(bn) = (|an|) converges to zero, i.e.,

an → 0 ⇔ |an| → 0

Proof.1. |an − a| < ε ⇔ |(an − a)− 0| < ε.2.∣∣|an| − 0

∣∣ < ε ⇔ |an| < ε ⇔ |an − 0| < ε




2.2.14. Lemma. A convergent sequence has precisely one limit.

Obviously a convergent sequence has a limit; the statement of the lemmais that it can not have more than one limit.

Proof.Let a = b and assume that an → a and an → b. Then for every ε > 0there exists some N(ε) such that |an − a|, |an − b| < ε/2 for all n > N(ε).Let ε = |a− b|/2 and choose n > N(|a− b|/2). Then

|a− b| = |a− an + an − b| ≤ |a− an|+ |an − b| < ε

2+ε

2=|a− b|

2,

which is a contradiction.




2.2.15. Proposition. Let (an) and (bn) be convergent real or complexsequences with an → a and bn → b for some a, b ∈ C. Then

1. lim(an + bn) = a+ b ,2. lim(an · bn) = ab,3. lim an

bn= a/b if b = 0.

We will prove the first statement, the second statement will be part ofyour homework, while the last statement will be discussed in the recitationclass.




Proof of 1.Let an → a, bn → b. We will show that an + bn → a+ b. Fix any ε > 0.Then for some Na ∈ N for all n > Na we have

|an − a| < ε

2.

Similarly, for some Nb ∈ N we have |bn − b| < ε/2 for all n > Nb. DefineN := maxNa,Nb. Then for all n > N we have |an − a| < ε/2 and|bn − b| < ε/2. Furthermore,

|an + bn − (a+ b)| ≤ |an − a|+ |bn − b| < ε

2+ε

2= ε. (2.2.3)

Hence for any ε > 0 we have found some N such that (2.2.3) holds for alln > N.



Sequences - Some Applications

One of the most useful tools for showing the convergence of a sequence isthe so-called Squeze Theorem:

2.2.16. Theorem. Let (an), (bn) and (cn) be real sequences withan < cn < bn for sufficiently large n. Suppose that lim an = lim bn = a.Then (cn) converges and lim cn = a.

We use the phrase “(...) for sufficiently large n” here to say “there existssome N0 ∈ N such that for all n > N0 (...).” This is a common expressionin mathematics and will occur later in other contexts.




Proof of the Squeeze Theorem.Fix ε > 0. Then there exists a sufficiently large N ∈ N such that

a− ε < an and bn < a+ ε for all n > N.

Since an < cn < bn for sufficiently large n, this implies

a− ε < cn < a+ ε ⇔ |cn − a| < ε

for all n > N.




2.2.17. Corollary. Let (rn) be a real sequence with lim rn = 0 and let (zn)be a complex sequence with

|zn| < rn for sufficiently large n.

Then zn → 0.

This follows from the squeeze theorem with an = 0, bn = rn and cn = |zn|together with Lemma 2.2.13, part 2.




2.2.18. Proposition. Let q ∈ C, |q| < 1. Then limn→∞

qn = 0.

For the proof, we will need a consequence of Bernoulli’s Inequality, whichwill be discussed in the assignments:

2.2.19. Lemma. For x > −1 and n ∈ N,

(1 + x)n > nx .

The proof of Bernoulli’s inequality (by induction) is part of your exercises.



Sequences - Some Applications2.2.20. Proposition. Let q ∈ C, |q| < 1. Then lim

n→∞qn = 0.

Proof.The case q = 0 is trivial, so we now assume 0 < |q| < 1. Then

1

|q|=: 1 + τ for some τ > 0.

By Bernoulli’s inequality,(

1

|q|

)n

= (1 + τ)n > nτ . Therefore,

|qn| = |q|n < 1

nτ.

The sequence ( 1nτ ) converges to 0 as n→∞, so by Corollary 2.2.17 we

have qn → 0 as n→∞.



Complex Sequences2.2.21. Lemma. Let (zn) be a complex sequence, zn = xn + iyn and letz = x + iy ∈ C. Then

zn → z ⇐⇒ (xn → x and yn → y).

Proof.We use that

maxa, b ≤√a2 + b2 ≤ |a|+ |b|

(see this by squaring both sides) and

|zn − z | = |xn − x + i(yn − y)| =√

(xn − x)2 + (yn − y)2,

which gives

max|xn − x |, |yn − y |

≤ |zn − z | ≤ |xn − x |+ |yn − y |.



Complex Sequences

Proof (continued).It is then clear that

|zn − z | < ε ⇒ |xn − x | < ε ∧ |yn − y | < ε

and|xn − x | < ε

2∧ |yn − y | < ε

2⇒ |zn − z | < ε.

From this and the definition of convergence, the Lemma follows.



Real SequencesAs we have seen, the convergence of complex sequences can be reduced tothe convergence of real sequences. This is very important, because one ofthe most useful techniques for showing convergence use the order relation“<”, which is defined for real, but not complex numbers.

2.2.22. Definition. A real sequence (an) is called increasing if an ≤ an+1 for all n ∈ N. decreasing if an ≥ an+1 for all n ∈ N. strictly increasing if an < an+1 for all n ∈ N. strictly decreasing if an > an+1 for all n ∈ N. monotonic if it is either increasing or decreasing.

2.2.23. Notation. If xn is increasing (decreasing) and xn → x , we writexn x (xn x).



Real Sequences - Boundedness & Monotonicity2.2.24. Theorem. Every monotonic and bounded (real) sequence (an) isconvergent. More precisely,

an supan : n ∈ N if (an) is increasing,an infan : n ∈ N if (an) is decreasing,

We will show the first statement only.Proof.Let (an) be an increasing and bounded sequence. Then the range

A := ran(an) = an : n ∈ N

is non-empty and bounded. By property (P13) it follows that the leastupper bound

supA =: σ <∞

exists.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 131 / 624


Order relations for Sequences in R

Proof (continued).For every ε > 0 there exists some number N such that σ− ε < aN . (If thiswere untrue, then σ − ε would be an upper bound for X , contradicting thedefinition of σ as least upper bound.) Since the sequence is increasing, itfollows that

σ − ε < aN ≤ an ≤ σ

for all n ≥ N. But this means that σ − an < ε for all n > N, so an → σ.The case where (an) is decreasing is dealt with analogously.Note that the proof has essentially used not only the ordering “<” (whichis defined for rational and real numbers), but also property (P13), which isonly defined for real numbers. The statement of the theorem is false forrational numbers.



General Sequences - Subsequences

2.2.25. Definition. Let (an)n∈N be a sequence and let (nk)k∈N be a strictlyincreasing sequence of natural numbers. Then the composition

(ank )k∈N := (an)n∈N (nk)k∈N = (an1 , an2 , an3 , ...)

is called a subsequence of (an).

2.2.26. Example. If (an)n∈N = (a0, a1, a2, ...) is a sequence, then(a2k) = (a0, a2, a4, ...) is a subsequence of (an), since the map k 7→ 2k(the sequence (2k)k∈N) is strictly increasing.




2.2.27. Definition. Let (an)n∈N be a sequence and A ⊂ N any infinite setof natural numbers. We define the subsequence

(an)n∈A

to be the composition of (an) and the sequence (nk), where n0 = minAand nk+1 > nk , nk ∈ A, for all k ∈ N (nk is the kth-smallest element ofA).

2.2.28. Example. If (an)n∈N is a sequence and if we denote the set of evennatural numbers by E , then

(a2k)k∈N = (an)n∈E .




2.2.29. Lemma. Let (an) be a convergent sequence with limit a. Then anysubsequence of (an) is convergent with the same limit.

Proof.Let (ank ) be a subsequence of (an). Then we need to show that

∀ε>0

∃K∈N

∀k>K

|ank − a| < ε.

We know that∀

ε>0∃

N∈N∀

n>N|an − a| < ε.

Fix ε > 0 and choose a corresponding N. Since the sequence (nk) isstrictly increasing, there exists some K ∈ N such that nk > N for allk > K .



Monotonic Subsequences of Real Sequences

2.2.30. Lemma. Every real sequence has a monotonic subsequence.

Proof.Let the sequence (an) be given and define

A := n ∈ N : an ≥ ak for all k > n.

Now if A is not finite, the subsequence

(an)n∈A

is decreasing and we have found a monotonic subsequence.



Monotonic Subsequences of Real Sequences

Proof (continued).If

A = n ∈ N : an ≥ ak for all k > n.

is finite, then there exists some N such that no number n > N belongs toA. But that means that

∀n>N

∃l>n

al > an. (2.2.4)

Fix any number n0 > N. Next, in (2.2.4) set n = n0 and choose somel > n0 such that al > an0 . Denote this l by n1. Then, in (2.2.4) set n = n1and choose some l > n1 such that al > an1 . This procedure can berepeated to find a sequence of increasing numbers n0, n1, n2, ... such that(ank )k∈N is a strictly increasing subsequence of (an)n∈N.



Accumulation Points2.2.31. Definition. Let (an) be a sequence. Then a number a is called anaccumulation point of (an) if

∀ε>0

∀N∈N

∃n>N|an − a| < ε.

2.2.32. Remark. If a sequence converges, then the limit is the onlyaccumulation point. Prove this explicitly for yourself!

2.2.33. Examples.1. The sequence (−1)n − 1/n has accumulation points 1 and −1, but

does not converge.2. The sequence (an) given by

an =

0 n evenn n odd

has the accumulation point 0, but does not converge.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 138 / 624


Accumulation Points2.2.34. Lemma. A number a is an accumulation point of a sequence (an)if and only if there exists a subsequence of (an) that converges to a.

Proof.(⇐) This case follows immediately from the definitions - do it yourself!(⇒) We assume that a is an accumulation point of (an) = (a1, a2, a3, ...).

By the definition of an accumulation point, there exists some ak suchthat |a− ak | < ε := 1; let n1 = k. Again, for ε = 1/2 and N = n1there exists some l > n1 such that |a− al | < 1/2; let n2 = l . Bychoosing ε = 1, 1/2, 1/3, 1/4, ... we hence iteratively find naturalnumbers nk , k ∈ N, such that (nk) is a strictly increasing sequenceand

|ank − a| < 1

k, k = 1, 2, 3, ... .

It follows that (ank ) is a subsequence of (an) that converges to a.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 139 / 624


Theorem of Bolzano-Weierstraß

We can now prove the following fundamental result:

2.2.35. Theorem of Bolzano-Weierstraß. Every bounded real sequence hasan accumulation point.

Proof.Let (an) be a bounded sequence of real numbers. By Lemma 2.2.30, thissequence has a monotonic subsequence, which is of course also bounded.By Theorem 2.2.24, this subsequence converges, so by Lemma 2.2.34, (an)has an accumulation point.



Generalizing Convergence

As previously hinted, the concept of sequences and convergence onlyrequires two things:

A sequence of objects, realized by a map (an) : N→ M, where M issome arbitrary set and

The concept of “closeness,” so we can define convergence. We wantto express mathematically that a is a limit of (an) if an is arbitrarilyclose to a for sufficiently large n.

The first concept is already quite clear, and we don’t need to elaborate.We simply define a sequence in a set M as a map (an) : N→ M.




For the second concept there are various generalizations, where the mostgeneral ideas are part of the mathematical field of Topology, which we willnot go into here. We will instead discuss a generalization of theneighborhoods

Bε(x) = y ∈ M : |x − y | < ε

where we replace |x − y | by a general “distance function”, called a metric.A metric will associate to two points x , y ∈ M their distance, which shouldbe a positive real number. Hence it will be a function defined not on M,but on the set of pairs M ×M, and it should satisfy certain conditions(such as being positive).We also want the metric to be symmetric (the distance from x to y shouldequal the distance from y to x), we want the distance from x to y to be 0if and only if x = y , and we want a form of the triangle inequality to hold.



Generalizing ConvergenceWe hence allow any function that satisfies these conditions to be called adistance function, or metric. Formally, we state the following definition:

2.2.36. Definition. Let M be a set. A map ϱ : M ×M → R is called ametric if

(i) ϱ(x , y) ≥ 0 for all x , y ∈ M and ϱ(x , y) = 0 if and only if x = y .(ii) ϱ(x , y) = ϱ(y , x)

(iii) ϱ(x , z) ≤ ϱ(x , y) + ϱ(y , z).The pair (M, ϱ) is then called a metric space.

2.2.37. Examples.(i) M = (0, 1] ⊂ R, ϱ(x , y) = |x − y |,(ii) M = C, ϱ(z ,w) = max|Re z − Rew |, |Im z − Imw |,(iii) M = C, ϱ(z ,w) = |Re z − Rew |+ |Im z − Imw |



Examples of Metric Spaces2.2.38. Example. For M = C the New York City metric is given byϱ(x , y) = |x1 − y1|+ |x2 − y2|, where x = x1 + ix2 and y = y1 + iy2.

Extracted from Hammonds 1910 map, ”New York City and vicinity.” 1910. Wikipedia. Wikimedia Foundation. Web. 20 August 2013


http://en.wikipedia.org/wiki/File:1910_NYC_map.jpg



We can hence define convergence of sequences in metric spaces (M, ϱ),where convergence of a sequence (an) : N→ M is determined by (2.2.1),

limn→∞

an = a :⇔ ∀ε>0

∃N∈N

∀n>N

an ∈ Bε(a).

where

Bε(a) = y ∈ M : ϱ(y , a) < ε, ε > 0, a ∈ M.

All results that do not use the properties of the real or complex numbersthen also apply to sequences in (M, ϱ), we just need to replace |x − y |everywhere with ϱ(x , y).




In particular, we can define boundedness.

2.2.39. Definition. Let M be a set and (an) be any sequence in M. Fixx ∈ M. Then (an) is called bounded if

∃R>0

∀n∈N

an ∈ BR(x)

Note that this definition does not depend on the point x ; if (an) isbounded for some choice of x , then it is also bounded if some other y ∈ Mis used instead. We say that boundedness is well-defined.



Cauchy SequencesWe now want to study sequences in metric spaces whose elements “groupclose together”, without necessarily converging. In particular, we want tocharacterize sequences, where the distance between the elements becomessmaller and smaller. Such sequences are called Cauchy squences.

2.2.40. Definition. A sequence (an) in a metric space (M, ϱ) is called aCauchy sequence if

∀ε>0

∃N∈N

∀m,n>N

ϱ(am, an) < ε.

2.2.41. Remark. Every convergent sequence is a Cauchy sequence. Thisfollows from

ϱ(an, am) ≤ ϱ(an, a) + ϱ(am, a).

where an → a. However, not every Cauchy sequence converges.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 147 / 624


Cauchy Sequences

2.2.42. Lemma. Every Cauchy sequence in a metric space (M, ϱ) isbounded.

Proof.Let (an) be a Cauchy sequence. We show that (an) is bounded. For ε = 1we can find some N such that ϱ(aN , an) < 1 for all n > N. Let x ∈ M besome fixed point. By the triangle inequality,

ϱ(an, x) ≤ ϱ(an, aN) + ϱ(aN , x) < 1 + ϱ(aN , x) for n > N

and

ϱ(an, x) ≤ maxϱ(a1, x), ... , ϱ(aN , x), ϱ(aN , x) + 1 =: C ,

hence an ∈ BC+1(x) for all n ∈ N and (an) is bounded.



Cauchy Sequences2.2.43. Theorem. Every Cauchy sequence in R with the metric

ϱ : R× R→ R, ϱ(x , y) = |x − y |,

is convergent.

Proof.Let (an) be a Cauchy sequence, so in particular (an) is bounded. By thetheorem of Bolzano-Weierstraß, (an) has a convergent subsequence (ank )whose limit we denote by a, i.e., ank → a.We now prove that an → a. Let ε > 0 be fixed. Then we can fix some Nsuch that |an − am| < ε/2 for n,m > N. Due to the convergence of thesubsequence, we can also find some nk > N such that |ank − a| < ε/2.Then for any m > N we have

|am − a| ≤ |am − ank |+ |ank − a| < ε

2+ε

2= ε.



Complete Metric SpacesSimilarly, one can show that every complex Cauchy sequence converges.The property that every Cauchy sequence converges is a very nice propertyof the metric space (M, ϱ), i.e., it depends on the metric ϱ as well as onthe set M.In fact, this property is so useful, that we give it a special name,

2.2.44. Definition. A metric space (M, ϱ) is called complete if everyCauchy sequence converges in M.

2.2.45. Examples. Let ϱ(x , y) = |x − y |. Then1. The spaces (R, ϱ) and (C, ϱ) are complete. (See recitation class for

(C, ϱ).)2. The space ([0, 1), ϱ) is incomplete. (Take an = 1− 1/n.)3. The space (Q, ϱ) is incomplete. (See exercises.)



Complete Metric Spaces2.2.46. Example. Consider the metric ϱ : R× R→ R defined by

ϱ(x , y) =

∣∣∣∣ x

1 + |x |− y

1 + |y |

∣∣∣∣ .It can easily be checked that ϱ is actually a metric; the graph picturedbelow helps to see why ϱ(x , y) = 0 if and only if x = y , the symmetry andtriangle inequality are clear from the definition.

x

-1

1

y

y

x

x ¤ + 1



Complete Metric SpacesNow we claim that R with this metric is incomplete. In fact, take thesequence (an) given by an = n. We will first show that this is a Cauchysequence. Consider the sequence (bn) in R with the usual metric given by

bn =n

1 + n.

Since bn → 1 as n→∞ it is clear that (bn) is a Cauchy sequence in Rwith the usual metric, i.e., for any ε > 0 there exists an N > 0 such thatfor all n,m > N ∣∣∣∣ m

1 +m− n

1 + n

∣∣∣∣ < ε.

But the left-hand side is just ϱ(am, an), so we see that (an) is a Cauchysequence in (R, ϱ). However, it is easily seen that (an) does not convergeto any number a in (R, ϱ), so we have found a non-convergent Cauchysequence in (R, ϱ) and the metric space is hence incomplete.



The Real Numbers

Reference Spivak, Calculus, 4th ed., Chapter 28, Problem 1.

Our aim is to construct the completion of an incomplete metric space by“adding” all the limits of Cauchy sequences to the space. We will illustratehow to do this with the rational numbers; the general procedure isanalogous.Given Q, we may consider the set of all sequences in Q that converge to alimit. Denote this set by Conv(Q). Each sequence (an) ∈ Conv(Q) isassociated uniquely to a number a ∈ Q, namely its limit. We can now saythat two sequences are equivalent if they have the same limit, i.e.,

(an) ∼ (bn) :⇔ limn→∞

an = limn→∞

bn. (2.2.5)

(This is called an equivalence relation.)



Construction of the Real Numbers

We denote the set of all sequences with the same limit as a sequence (an)by [(an)]. Such a set is called a (equivalence) class and the set of allclasses is denoted Conv(Q)/ ∼.Since each rational number is represented by a class (why?) we see thatthe rational numbers may be identified with the set of all classes ofconvergent sequences:

Q ≃ Conv(Q)/ ∼ .

This is just a formal way of saying that the set of rational numberscorresponds to the set of all convergent sequences of rational numbers, ifany two sequences with the same limit are considered equivalent. Anyrational number corresponds to exactly one class of sequences andvice-versa.




We can now consider a larger class of sequences, that of Cauchy sequencesof rational numbers, denoted by Cauchy(Q). Since every convergentsequence is a Cauchy sequence, Conv(Q) ⊂ Cauchy(Q).Furthermore, we say that two Cauchy sequences are equivalent not if theyhave the same limit (because they might not converge) but rather if theirdifference converges to zero:

(an) ∼ (bn) :⇔ limn→∞

(an − bn) = 0. (2.2.6)

Of course, (2.2.6) is equivalent to (2.2.5) for convergent sequences. Wenow have the larger set

Cauchy(Q)/ ∼ ⊃ Conv(Q)/ ∼ ≃ Q




The set Cauchy(Q)/ ∼ incorporates the rational numbers and by itsconstruction every Cauchy sequence (an) in Cauchy(Q)/ ∼ has a limit,namely precisely the object represented by the class [(an)]. We write

R := Cauchy(Q)/ ∼

and call this set the real numbers.It can be shown that all the operations of the rational numbers can beextended to R and that properties (P1) – (P12) continue to hold.Furthermore, in the set R property (P13) holds. (This can be shown byrelating least upper bounds to Cauchy sequences; the details are left toyou!)



Construction of the Real Numbers2.2.47. Example. Every rational number has a finite decimal representation.We can think of a real number as having an “infinite decimalrepresentation.” (More details on this later.) For example, the sequence

1, 1.4, 1.41, ...

may converge to√2 if the following terms are chosen appropriately.

This “infinite decimal representation” is just the way that real numbers areintroduced in middle school. As another example, the sequences

(an) := (0.4, 0.49, 0.499, 0.4999, 0.49999, ...)

and(bn) := (0.5, 0.5, 0.5, 0.5, 0.5, ...)

are equivalent in the sense of (2.2.6), since|an − bn| = 10−(n+1) n→∞−−−→ 0.

Hence, 0.499999 ... and 0.5 are considered to represent the same number.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 157 / 624

Real Functions, Convergence and Continuity Real Functions

Functions and Maps

Sequences

Real Functions

Limits of Functions




Real Functions

Reference Spivak, Calculus, Chapters 3,4.

In the previous section, we have considered real and complex sequences,i.e., functions defined on a subset of the natural numbers. Now, we willinvestigate real functions, i.e., functions defined on a subset of R with realvalues, i.e., with range in the real numbers. Recall that we use the notation

f : Ω→ R, x 7→ f (x).

where Ω ⊂ R is the domain of f .



Real Functions

IMPORTANT: Remember that we denote a function by the symbol f , andthe values of the function at a some point x by f (x). It is important notto confuse these two notations; f is a function (set of pairs), while f (x) isa (real) number.Sometimes we will say (e.g.) “the function f (x) = x2” or even “thefunction x2”, but this is just short for the correct formulation “thefunction f with values given by f (x) = x2.”In this context, we consider the function f to have the maximal possibledomain, in other words,

dom f = x ∈ R : f (x) makes sense as a real number.



GraphsIn order to specify a unique function we always need to give

(i) the domain Ω and the space Y containing the range,(ii) the association (“mapping”) x 7→ f (x).

Although a function is actually defined through a set of pairs of elements,sometimes one prefers to think of a function as being given by (i) and (ii)above, and denotes by

Γ(f ) = (x , f (x)) ∈ X × Y : x ∈ dom f

the graph of the function f .If Ω ⊂ R and Y = R, we visualize the graph through the drawing of twoperpendicular axes, the horizontal of which represents the real numbersthat encompass the domain, the vertical representing the reals includingthe range. In this way, a point (x , y) ∈ R2 is represented by a point in the“coordinate system” of these graphs.



GraphsPlotting the graph of a function can be very useful for a quick overvew ofits properties. Compare the information conveyed by

f : R→ R, f (x) =1

1 + x + x2

and

-1

2

x

1

y

y =

1

1 + x + x2



Graphs

In order to arrive at such a plot, we need to analyze the properties of thefunction. We are interested in finding

the intercepts with the axes, any “jumps” or “breaks” in the graph, points where the graph approaches a maximum or minimum value, values which are approached by the graph as |x | becomes very large.

Much of the rest of this term’s lecture will focus on such an analysis. Forcomplicated functions, we will literally need “every trick in the book.”



Polynomial Functions

An important class of functions is constituted b the polynomial functions,which have the form

dom f = R, f (x) =n∑

k=0

ckxk , n ∈ N, c1, ... , cn ∈ R.

The natural number n is called the degree of the polynomial.Polynomial functions of degree 1 are straight lines, and can be uniquelydetermined by fixing two points through which they pass:Let (x1, y1) and (x2, y2) be two points in the plane R2. Then the straightline through these two points is given by

f (x) =x − x2x1 − x2

y1 +x − x1x2 − x1

y2 =y1 − y2x1 − x2

(x − x2) + y2



Power FunctionsPolynomial functions of the form f (x) = xn, n ∈ N, are also called powerfunctions and can best be visualized through their graphs:

x

1

y

y = 1

y = x

y = x2

y = x3

y = x4



Rational FunctionsQuotients of polynomial functions are called rational functions. They havethe form

f (x) =p(x)

q(x), p, q are polynomial functions.

and have domain dom f = x ∈ R : q(x) = 0.

x

y

y =

1

x + x2

- 1

y =

x2

7 + 7 x

y =

x

1 + x2

y =

1

x



Piecewise FunctionsWe can define functions in a piecewise way by writing

f (x) =

g1(x) P1(x)... ...gn(x) P2(x)

where P1, ... ,Pn are predicates g1, ... , gn are suitable functions withdom gk = x ∈ R : Pk(x). Generally we require that for any given x ∈ Rat most one of the predicates is true. The domain of f is then given bydom f = x ∈ R : ∃

1≤k≤nPk(x).

2.3.1. Example.

f (x) =

x , −1 ≤ x < 1,

2− x , 1 ≤ x < 3,

dom f = [−1, 3).

-1 1 2 3x

-1

1

y



Periodic FunctionsA function f : R→ R such that there exists some p ∈ R such that

f (x + p) = f (x) for all x ∈ R

is called periodic with period p.We can construct a periodic function as follows: First, we define the floorand ceiling of a number x ∈ R as

⌊x⌋ := maxn ∈ Z : n ≤ x, ⌈x⌉ := minn ∈ Z : n ≥ x,

respectively. We then define the “distance to the nearest integer” function,the function

· : R→ [0, 1/2], x := min|x − ⌈x⌉|, |x − ⌊x⌋|. (2.3.1)

(Here we have used a dot to indicate the position of the argument x whenevaluating · at x ∈ R.) Clearly, x + 1 = x for all x , so · isperiodic with period p = 1.



Periodic Functions

The graph of the function · is sketched below:

-4 -3 -2 -1 0 1 2 3

x

1

2

y



Manipulating FunctionsGiven a function f : R→ R, we can stretch and translate the graph of fhorizontally by multiplying its argument with a number or adding aconstant to the argument:

x

y

y = f HxLy = f H2 xLy = f Hx2L



Manipulating FunctionsGiven a function f : R→ R, we can stretch and translate the graph of fhorizontally by multiplying its argument with a number or adding aconstant to the argument:

x

y

y = f HxLy = f Hx+1Ly = f Hx-1L



Manipulating FunctionsGiven a function f : R→ R, we can stretch and translate the graph of fvertically by multiplying its argument with a number or adding a constantto the value of f :

x

y

y = f HxL

y = 2 f HxL

y = f HxL 2



Manipulating FunctionsGiven a function f : R→ R, we can stretch and translate the graph of fvertically by multiplying its argument with a number or adding a constantto the value of f :

x

y

y = f HxL

y = f HxL + 1

y = f HxL - 1



Periodic FunctionsGiven the function · of (2.3.1), we would like to define the function ψhaving the graph shown below:

-10-9 -8 -7 -6 -5 -4 -3 -2 -1 1 2 3 4 5 6 7 8 9 10x

-1

1

y

Find an expression for ψ(x) using · !Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 174 / 624


Weird Functions

Recall that we have defined the composition of maps. Now the functionφ : R \ 0 → R, φ(x) = 1/x has an interesting property: it maps theinterval (0, 1) to the interval (1,∞) bijectively, and the interval (1,∞) to(0, 1) bijectively. Furthermore, f (1) = 1. Therefore, the action of φ on thepositive real numbers can be considered as a “mirroring” at x = 1. Theaction on the negative real numbers is similar.If we compose a function g with φ, the graph of g φ shows thismirroring. We illustrate by taking the periodic function ψ that we definedon the previous slide.



Weird Functions

-10-9 -8 -7 -6 -5 -4 -3 -2 -1 1 2 3 4 5 6 7 8 9 10x

-1

1

y

The function ψ φ, domψ ϕ = R \ 0, graphed above, is not periodicany more. Something strange seems to go on near the origin.



Weird Functions

-1 1x

-1

1

y



Weird Functions

-0.1 -0.05 0.05x

-1

1

y

The infinitely many peaks in (1,∞) of ψ have been crammed into theinterval (0, 1) by the composition. It becomes impossible to give a preciserendering of the graph near x = 0.



Weird Functions

Strange functions such as the last one can be constructed quite easily. Forexample, it is impossible to give any precise idea of the graph of

f (x) =

1, x rational,0, x irrational.



Parity and Monotonicity

2.3.2. Definition. Let Ω ⊂ R and f : Ω → R. The f is said to be even if f (−x) = f (x) for all x ∈ Ω. odd if f (−x) = −f (x) for all x ∈ Ω. increasing if f (x) ≥ f (y) for all x > y , x , y ∈ Ω . strictly increasing if f (x) > f (y) for all x > y , x , y ∈ Ω. decreasing if f (x) ≤ f (y) for all x > y , x , y ∈ Ω. strictly decreasing if f (x) < f (y) for all x > y , x , y ∈ Ω . monotonic if f is either decreasing or increasing. strictly monotonic if f is either strictly decreasing or strictly

increasing.


Real Functions, Convergence and Continuity Limits of Functions

Functions and Maps

Sequences

Real Functions

Limits of Functions




Limit of a FunctionTaking our inspiration from sequences, we would like to formalize theconcept of the limit of a function. In other words, we would like to be ableto describe the behaviour of a function f near some point x0.In the case of sequences, the only interesting case was x0 =∞. While weare ultimately interested in limits of functions also for x0 ∈ R, this case canbe generalized to functions of a real variable in a straightforward manner:2.4.1. Definition. Let f be a real- or complex-valued function defined on asubset of R that includes some interval (a,∞), a ∈ R. Then f convergesto L ∈ C as x →∞, written

limx→∞

f (x) = L :⇔ ∀ε>0

∃C>0

∀x>C|f (x)− L| < ε. (2.4.1)

The limit of a function as x → −∞ is defined similarly.



Limit of a Function

2.4.2. Example. We have

limx→∞

1

x= 0, lim

x→−∞

1

x= 0,

The analogy of the previous definition to that for sequences is clear.However, if we want to extend the concept of a limit to x0 ∈ R, we needto look at the essence of this definition:

We preset how close we want f (x) to be to L (“ ∀ε>0

”). Then if xis close enough to x0 =∞ (“ ∃

C>0∀

x>C”) we can achieve

|f (x)− L| < ε.

If we want to extend this to x0 ∈ R, we merely have to find another way ofsaying “if x is close enough to x0 ∈ R”.



Limit of a FunctionThis is easily done; we express “if x is close enough to x0 ∈ R” by saying“if there exists some δ > 0 such that |x − x0| < δ.” We want this to holdfor all x , except possibly for x = x0. (Hence, “near x0” does not mean “atx0!”We also want f to be defined “near x0,” so we assume that x0 ∈ dom f .We can then finally write down our definition for the limit of a function atx0 ∈ R:2.4.3. Definition. Let f be a real- or complex-valued function defined on asubset Ω ⊂ R and let x0 ∈ Ω. Then the limit of f as x → x0 is equal toL ∈ C, written

limx→x0

f (x) = L :⇔ ∀ε>0∃

δ>0∀

x∈Ω\x0|x − x0| < δ ⇒ |f (x)− L| < ε.



Limit of a Function2.4.4. Examples.

1. limx→0

x+12x+1 = 1,

2. limx→1

√x = 1.

As for sequences, the main method for finding limits is not directly by thedefinition, but rather through the following theorem:2.4.5. Theorem. Let f and g be real- or complex-valued functions andx0 ∈ dom f ∩ dom g such that lim

x→x0f (x) and lim

x→x0g(x) exist. Then

1. limx→x0

(f (x) + g(x)) = limx→x0

f (x) + limx→x0

g(x) ,

2. limx→x0

(f (x) · g(x)) =(limx→x0

f (x))(

limx→x0

g(x)),

3. limx→x0

f (x)

g(x)=

limx→x0

f (x)

limx→x0

g(x)if limx→x0 g(x) = 0.

These statements remain true if x0 = ±∞.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 185 / 624


One-sided LimitsOccasionally, it may be useful to consider one-sided limits for some x0 ∈ R:2.4.6. Definition. Let f be a real- or complex-valued function defined on asubset Ω ⊂ R and let x0 ∈ Ω.Then the limit of f as x converges to x0 from above is equal to L ∈ C,

limxx0

f (x) = L :⇔ ∀ε>0∃

δ>0∀

x∈Ω\x00 < x − x0 < δ ⇒ |f (x)− L| < ε.

Analogously, the limit of f as x converges to x0 from below is equal toL ∈ C,

limxx0

f (x) = L :⇔ ∀ε>0∃

δ>0∀

x∈Ω\x00 < x0 − x < δ ⇒ |f (x)− L| < ε.

2.4.7. Remark. Clearly, f (x)→ L as x → x0 if and only if f (x)→ L asx x0 and f (x)→ L as x x0.



Limits of FunctionsIn a similar manner to our definition for sequences, we can also definedivergence to plus or minus infinity. The precise formulation is left as anexercise. We give some concrete examples2.4.8. Examples.

1. limx→0

1x does not exist; lim

x0

1x =∞, lim

x0

1x = −∞. Note that 1/x is not

defined at x = 0.2. Let f : R→ R be defined as

f (x) =

x

1+x2, −∞ < x ≤ 0,

1 + x , 0 < x <∞.

Then limx0

f (x) = 1, limx0

f (x) = 0 and limx→x0

f (x) exists for all x0 = 0.Note that f (0) = 0.



Limits of Functions using Sequences

There is a close relation between the limit of a function and the limits ofsequences:2.4.9. Theorem. Let f be a real- or complex-valued function defined on asubset Ω ⊂ R and let x0 ∈ Ω. Then

limx→x0

f (x) = L ⇔ ∀(an)

an∈Ω\x0

(an

n→∞−−−→ x0 ⇒ f (an)n→∞−−−→ L

)

A similar result holds for x0 = ±∞.This remarkable result means that the limits of functions can be describedentirely through sequences.




Proof.(⇒) Assume that limx→x0 f (x) = L. Let (an) be a sequence such that

an → x0. We will show that then f (an)→ L. Fix ε > 0. Then wechoose a δ > 0 such that

|x − x0| < δ ⇒ |f (x)− L| < ε.

Furthermore, we can choose an N ∈ N such that |an − x0| < δ forn > N. Then for n > N, |f (an)− L| < ε, so f (an)→ L.



Limits of Functions using SequencesProof (continued).

(⇐) Assume that f (an)→ L for all sequences (an) with an → x0. Weprove that then f (x)→ L as x → x0 by contraposition. Assume thatf (x) → L as x → x0. Then

∃ε>0∀

δ>0∃

x∈Ω|x − x0| < δ ⇒ |f (x)− L| < ε

⇔ ∃ε>0∀

δ>0∃

x∈Ω|x − x0| < δ ∧ |f (x)− L| ≥ ε

Fix ε > 0. Then for every n ∈ N∗ there exists a number xn such that

|xn − x0| <1

n∧ |f (xn)− L| ≥ ε.

Clearly the sequence (xn) converges to x0, but f (xn) cannot convergeto L.



Limits of Functions using Sequences2.4.10. Example. Let ψ denote the periodic function we studied earlier, asshown in the left graph below. We want to study lim

x→0ψ(1/x), shown

below right.

-10-9 -8 -7 -6 -5 -4 -3 -2 -1 1 2 3 4 5 6 7 8 9 10x

-1

1

y

-0.1 -0.05 0.05x

-1

1

y

We suspect that the limit does not exist. We consider the sequence (xn),xn = 1/(4n + 1). Then xn → 0 and ψ(1/xn) = ψ(4n + 1) = 1→ 1 asn→∞.For (yn) given by yn = 1/(4n + 3) we also have yn → 0 butψ(1/yn) = ψ(4n + 3) = −1→ −1. Therefore lim

x→0ψ(1/x) does not exist.




2.4.11. Example. Now consider the function f given by f (x) = xψ(1/x).We want to show that

limx→0

xψ(1/x) = 0.

Let (xn) be any sequence with xn → 0 as n→∞. Note that

|f (xn)| = |xn| · |ψ(1/xn)| ≤ |xn|n→∞−−−→ 0,

since |ψ(x)| ≤ 1 for all x ∈ R. It follows that

limn→∞|f (xn)| = 0,

so limx→0

f (x) = 0.




-1 1x

-1

1

y

The function f given by f (x) = xψ(1/x); note that it is constant for|x | > 1.




-0.1 -0.05 0.05x

y

Behavior near x = 0.



Behavior of Functions

Hence limits serve to characterize the behavior of functions near points.However, our current techniques do not give us very detailedinformation.For example, consider the functions f , g , h : R→ R,

f (x) =√

x2 + 1, g(x) = x , h(x) = x2.

We expect f and g to behave “similarly” as x →∞, but h to have “muchlarger” values. However, at this point we can only say that the individuallimits do not exist and the functions diverge to infinity.Similarly, x2 should be “much smaller” than x as x → 0, even though bothlimits vanish.The main tools for characterizing such behavior are known as Landausymbols. There are two different “kinds”: big-O and little-O. We firstintroduce big-Oh.



The Big-O Landau Symbol2.4.12. Definition. Let f ,ϕ be real- or complex-valued functions defined ona subset Ω ⊂ R and let x0 ∈ Ω. We say that

f (x) = O(ϕ(x)) as x → x0

if and only if

∃C>0

∃ε>0

∀x∈Ω

|x − x0| < ε ⇒ |f (x)| ≤ C |ϕ(x)| (2.4.2)

2.4.13. Example. We have x + x2 = O(x) as x → 0: taking ε = 1 andC = 2, for all x ∈ R,

|x − 0| < 1 ⇒ |x + x2| ≤ |x |+ |x |2 = (1 + |x |)︸︷︷︸<2

|x |.

2.4.14. Remark. “f (x) = O(ϕ(x)) as x → x0” can be interpreted as“|f (x)| is not significantly larger than |ϕ(x)| when x is near x0.”



The Big-O Landau SymbolWe can make an analogous definition for x →∞,2.4.15. Definition. Let f ,ϕ be real- or complex-valued functions defined ona subset Ω ⊂ R containing the interval (L,∞) for some L ∈ R. We saythat

f (x) = O(ϕ(x)) as x →∞

if and only if

∃C>0

∃M>L

x > M ⇒ |f (x)| ≤ C |ϕ(x)|

The statement that f (x) = O(ϕ(x)) as x → −∞ is defined similarly.2.4.16. Example. We have x + x2 = O(x2) as x →∞: taking M = 1 andC = 2, for all x ∈ R,

|x | > 1 ⇒ |x + x2| ≤ |x |+ |x |2 < 2|x |2.



The Little-O Landau Symbol2.4.17. Definition. Let f ,ϕ be real- or complex-valued functions defined ona subset Ω ⊂ R and let x0 ∈ Ω. We say that

f (x) = o(ϕ(x)) as x → x0

if and only if

∀C>0

∃ε>0

∀x∈Ω\x0

|x − x0| < ε ⇒ |f (x)| < C |ϕ(x)| (2.4.3)

2.4.18. Example. We have x2 = o(x) as x → 0: for any C > 0 we can takeε = C , so that for all x ∈ R,

|x − 0| < ε = C ⇒ |x2| = |x | · |x | < C |x |.

2.4.19. Remark. “f (x) = o(ϕ(x)) as x → x0” can be interpreted as “|f (x)|is much smaller than |ϕ(x)| when x is near x0.”



The Little-O Landau SymbolWe can make an analogous definition for x →∞,2.4.20. Definition. Let f ,ϕ be a real- or complex-valued functions definedon a subset Ω ⊂ R containing the interval (L,∞) for some L ∈ R. We saythat

f (x) = o(ϕ(x)) as x →∞

if and only if∀

C>0∃

M>Lx > M ⇒ |f (x)| < C |ϕ(x)|

The statement that f (x) = o(ϕ(x)) as x → −∞ is defined similarly.2.4.21. Example. We have 1/x2 = o(1/x) as x →∞: For any C > 0 takeM > 1/C . Then, for all x ∈ R,

x > M = 1/C ⇒ 1

x2≤ C

1

x



More Examples for Landau Symbol NotationSome more examples are given below:2.4.22. Examples.

(i) 1

x2= O

(1

x

)as x →∞,

(ii) 1

x= O

(1

x2

)as x → 0,

(iii) 1

x2 + x= O

(1

x

)as x → 0,

(iv) 1

x2 + x= O

(1

x

)as x →∞,

(v) 1

x2 + x= o

(1

x3

)as x → 0,

(vi) 1

x2 + x= o(1) as x →∞.



Landau Symbols using Limits

In many practical cases, the Landau symbols can be calculated using limits:2.4.23. Theorem. Let f ,ϕ be a real- or complex-valued functions definedon an interval I ⊂ R and let x0 ∈ I . If there exists some C ≥ 0 such that

limx→x0

|f (x)||ϕ(x)|

= C , (2.4.4)

then f (x) = O(ϕ(x)) as x → x0.

Proof.Let f ,ϕ, I and x0 be given and suppose that (2.4.4) holds for some C ≥ 0.Then

∀ε>0∃

δ>0∀

x∈I\x0|x − x0| < δ ⇒

∣∣∣∣ |f (x)||ϕ(x)|− C

∣∣∣∣ < ε.



Landau Symbols using LimitsProof (continued).Choose ε = 1. Then there exists some δ > 0 such that |x − x0| < δ implies∣∣|f (x)| − C |ϕ(x)|

∣∣ < |ϕ(x)| for all x ∈ I \ x0,

i.e.,|f (x)| < (C + 1)|ϕ(x)|.

This is just the definition of f (x) = O(ϕ(x)) as x → x0.

2.4.24. Theorem. Let f ,ϕ be a real- or complex-valued functions definedon an interval I ⊂ R and let x0 ∈ I . Then

limx→x0

|f (x)||ϕ(x)|

= 0 ⇔ f (x) = o(ϕ(x)) as x → x0. (2.4.5)

This result will be proven in the assignments.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 202 / 624


Usage of Landau SymbolsThe notation f (x) = O(ϕ(x)) as x → x0 is used to describe an absolutesize relation near x0 and at x0, to the effect that f is not significantlylarger than ϕ. For such size comparisons, it is useful to have the domain ofthe inequality (2.4.2) encompass the point x0 also.On the other hand, when using little-o, the convention is that only thebehaviour near x0 should enter into the statement that f is much smallerthan ϕ, so (2.4.3) does not encompass the point x0.This is what is responsible for the different statements of Theorems 2.4.23and 2.4.24, the first involving an implication, the second an equivalence.In fact, often the little-o symbol is defined in terms of the limit in (2.4.5)while the big-o symbol is defined in terms of (2.4.2); seehttp://mathworld.wolfram.com/LandauSymbols.html.Note also that the article on Landau symbols on Wikipedia is athttps://en.wikipedia.org/wiki/Big_O_notation and written exclusively from thepoint of view of computer science - little-o, so important in mathematics,is hardly mentioned at all.


http://mathworld.wolfram.com/LandauSymbols.html

https://en.wikipedia.org/wiki/Big_O_notation


Some Remarks on Landau SymbolsMore properly, “O(ϕ(x)) as x → x0” denotes a set: the set of all functionsf such that for some C ≥ 0 and some ε > 0, |f (x)| < C |ϕ(x)| for allx ∈ (x0 − ε, x0 + ε). A better notation for this set might be Ox0(ϕ) and wecould (perhaps more correctly) write

f ∈ Ox0(ϕ) instead of f (x) = O(ϕ(x)) as x → x0.

However, the second notation has become standard among physicists andmathematicians, and we write expressions like

(x + 1)4 = 1 + 4x + O(x2) meaning (x + 1)4 − (1 + 4x) = O(x2)

as x → 0. This is often more intuitive than writing

(x + 1)4 − (1 + 4x) ∈ O0(x2) or (x + 1)4 = 1 + 4x mod O0(x

2).



Some Remarks on Landau SymbolsWe do sacrifice the symmetry of the “=” sign, however. For example,

O(x3) = O(x2) as x → 0

means “any function f such that |f (x)| < C |x |3 for all |x | < ε for someε,C also satisfies |f (x)| < C ′|x |2 for all |x | < ε′ for some ε′,C ′.” Clearly,

O(x2)?= O(x3) as x → 0 is false.

Landau symbols can be combined with each other and with functions. Forexample, as x → 0,

c · O(xn) = O(xn), xnO(xm) = O(xn+m),

O(xn) + O(xm) = O(xmin(n,m)), O(xn)O(xm) = O(xn+m).

Further examples and applications will be given in the assignments.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 205 / 624

Real Functions, Convergence and Continuity Continuous Functions

Functions and Maps

Sequences

Real Functions

Limits of Functions




ContinuityWe have extensively discussed limits of functions, and also discovered thatthe values of a function at some point need not have anything to do withthe behavior of the function near that point.However, we would of course like to assume that “reasonable” functionsbehave in a consistent manner on their domain. We will therefore give aname to this reasonableness:2.5.1. Definition. Let Ω ⊂ R be any set and f : Ω → R be a functiondefined on Ω . Let x0 ∈ Ω. We say that f is continuous at x0 if

limx→x0

f (x) = f (x0).

If U ⊂ Ω, we say that f is continuous on U if f is continuous at everyx0 ∈ U.We say that f is continuous on its domain, or simply continuous, if f iscontinuous at every x0 ∈ Ω.



Continuity2.5.2. Remark. For a function f to be continuous at a point x0, threeconditions have to be fulfilled:

(i) f needs to have a limit at x0 ( limx→x0

f (x) must exist);(ii) f needs to be defined at x0 (f (x0) must exist);(iii) the value of f must coincide with its limit (f (x0) = lim

x→x0f (x)).

2.5.3. Examples.1. The zig-zag function f (x) = ψ(1/x) is not continuous at x = 0

because the limit limx→0

f (x) does not exist (see Example 2.4.10).2. The function f (x) = xψ(1/x) is not continuous at x = 0 because f is

not defined at x = 0.3. The function

f (x) =

xψ(1/x) x = 0

0 x = 0

is continuous at x = 0 (see Example 2.4.11).Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 208 / 624


Continuity

We can use the results that we have obtained previously concerning limitsof sequences and functions to express the continuity of f at x0 inequivalent ways:2.5.4. Theorem. Let Ω ⊂ R be any set and f : Ω → R be a functiondefined on Ω . Let x0 ∈ Ω. Then the following are equivalent:

1. f is continuous at x0;2. for any real sequence (an) with an → x0, lim

n→∞f (an) = f (x0);

3. ∀ε>0∃

δ>0∀

x∈dom f: |x − x0| < δ ⇒ |f (x)− f (x0)| < ε.

In fact, since we have discussed sequences in metric spaces, and we havean idea of distance in metric spaces, we can define continuity easily formaps between metric spaces. This is left to the exercises.



One-Sided ContinuityOccasionally it will be useful to consider one-sided continuity.2.5.5. Definition. Let Ω ⊂ R be any set and f : Ω → R be a functiondefined on Ω . Let x0 ∈ Ω. We say that f is continuous from the left orfrom below at x0 if

limxx0

f (x) = f (x0).

We say that f is continuous from the right or from above at x0 if

limxx0

f (x) = f (x0).

Of course, a function is continuous at a point if and only if it is continuousboth from above and from below. One-sided continuity can also becharacterized using monotonic sequences; the precise formulation is left toyou!



Continuous Extension2.5.6. Definition. Let Ω ⊂ R be any set and Ω ⊃ Ω . Suppose thatf : Ω → R and f : Ω → R are continuous functions. If f (x) = f (x) for allx ∈ Ω , we say that f is a continuous extension of f to Ω .

2.5.7. Examples.1. Let f : [0,∞)→ R, f (x) = √x and f : R→ R, f (x) =

√|x |. Then f

is a continuous extension of f .2. Let f : R \ 1 → R, f (x) = (x2 − 1)/(x − 1) and f : R→ R,

f (x) = x + 1. Then f is a continuous extension of f .

2.5.8. Remark. Suppose that Ω ⊂ R, x0 ∈ Ω and f : Ω \ x0 → R iscontinuous and has the property that lim

x→x0f (x) exists. Then f : Ω → R,

f (x) = limy→x

f (y), x ∈ Ω ,

defines the unique continuous extension of f to Ω .Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 211 / 624


Continuity

While continuous functions may still exhibit some curious behavior (we willstudy examples later), nevertheless continuity is sufficient as a hypothesisfor many theorems. In fact, calculus may be regarded as the branch ofmathematics studying continuous functions.For real functions f and g we can define their sum f + g and product f · gby

f + g : (dom f ) ∩ (dom g)→ R, (f + g)(x) := f (x) + g(x),

f · g : (dom f ) ∩ (dom g)→ R, (f · g)(x) := f (x) · g(x).

We thus define the sum/product of two functions by describing how f + gand f · g act on every x ∈ (dom f ) ∩ (dom g). This type of definition iscalled a pointwise definition.



Continuity

2.5.9. Theorem. Let f and g be two real functions andx ∈ (dom f ) ∩ (dom g). Assume that both f and g are continuous at x .Then

(i) f + g is continuous at x and(ii) f · g is continuous at x .

Furthermore, if g(x) = 0, the function h defined by h(x) = 1/g(x) iscontinuous at x .Moreover, if f and g are real functions with x ∈ dom g , g(x) ∈ dom f , gis continuous at x and f is continuous at g(x), then the composition f gis continuous at x .



ContinuityProof.The first three assertions follow immediately from the correspondingtheorem for limits. We will prove the last assertion concerning thecomposition of f and g .Let ε > 0. Since f is continuous at g(x), we know that there exists someδ > 0 such that for all y ∈ dom f

|g(x)− y | < δ ⇒ |f (g(x))− f (y)| < ε.

We fix some such δ > 0. Now since g is continuous at x , there exists someδ > 0 such that for all z ∈ dom g

|x − z | < δ ⇒ |g(x)− g(z)| < δ.

Recall z ∈ dom(f g) if z ∈ dom g and g(z) ∈ dom f . Hence for allz ∈ dom(f g), if |x − z | < δ, then |f (g(x))− f (g(z))| < ε. This provesthat f g is continuous.



Continuity

In fact, we have a stronger version of the last statement that does notpresume the continuity of f .2.5.10. Theorem. Let f , g be real functions such that lim

x→x0g(x) = L exists

and f is continuous at L ∈ dom f . Then

limx→x0

f (g(x)) = f (L).

The proof is similar to the previous proof and will be treated in yourrecitation classes next week.



Continuity

The following lemma is useful in many proofs and also gives a first idea ofhow powerful continuity is.2.5.11. Lemma. Let Ω ⊂ R be some set, f : Ω → R a function that iscontinuous at some point x0 ∈ Ω and assume that f (x0) > 0. Then thereexists a δ > 0 such that f (x) > 0 for all x ∈ (x0 − δ, x0 + δ) ∩ Ω.

Proof.Since f is continuous at x0, for ε = f (x0)/2 we can find a δ > 0 such thatfor all x ∈ Ω

|x − x0| < δ ⇒ |f (x)− f (x0)| < f (x0)/2

⇒ f (x) > f (x0)/2 > 0.



The Bolzano Intermediate Value TheoremThis is the first of three important theorems that enable us to obtain hardinformation on the behavior of continuous functions.2.5.12. Theorem. Let a < b and f : [a, b]→ R be a continuous functionwith f (a) < 0 < f (b). Then there exists some x ∈ [a, b] such thatf (x) = 0.

Proof.We consider

A := x ∈ [a, b] : f (y) < 0 for y ∈ [a, x ].

Clearly A is bounded (a ≤ x ≤ b for all x ∈ A) and non-empty (a ∈ A).Therefore, A has a least upper bound a ≤ α ≤ b.We claim that in fact α < b. Note that f (b) > 0. By Lemma 2.5.11, weknow that there exists some δ > 0 such that f (x) > 0 for allx ∈ (b− δ, b]. Therefore, b− δ is an upper bound for A and hence α < b.



The Bolzano Intermediate Value Theorem

Proof (continued).Similarly, there exists a δ′ such that f (x) < 0 for all x ∈ [a, a+ δ′), so α isat least as large as a+ δ′. It follows that a < α < b.We will now show that f (α) = 0, by proving that f (α) > 0 and f (α) < 0lead to contradictions.First, suppose f (α) < 0. Then by Lemma 2.5.11, there exists a δ > 0 suchthat f (x) < 0 for x ∈ (α− δ,α+ δ). Since α = supA, for this δ thereexists some x0 ∈ A such that α− δ < x0 < α. (Otherwise, α− δ would bean upper bound for A, giving a contradiction.) This means that f (x) < 0for all x ∈ [a, x0]. Since f (x) < 0 for x ∈ (α− δ,α+ δ) and x0 > α− δ, itfollows that f (x) < 0 for x ∈ [a,α+ δ). But this contradicts theassumption that α is an upper bound for A.The proof for f (α) > 0 is analogous and left to you!



The Bolzano Intermediate Value Theorem

An immediate corollary is the following useful result:2.5.13. Bolzano Intermediate Value Theorem. Let a < b and f : [a, b]→ Rbe a continuous function. Then for y ∈ [minf (a), f (b), maxf (a), f (b)]there exists some x ∈ [a, b] such that y = f (x).

Proof.If f (a) = f (b), the statement is trivially true. Suppose that f (a) ≤ f (b).Given y , we apply Theorem 2.5.12 to the function g : [a, b]→ R,g(x) = f (x)− y . If f (a) > f (b), we apply Theorem 2.5.12 to the functiongiven by g(x) = f (a+ b − x)− y .



A Fixed Point Theorem

There are many applications of the intermediate value theorem. Mostinvolve choosing the function f to which the theorem is applied in a smartway. We give one such result here:2.5.14. Theorem. Let f : [a, b]→ R be a continuous function withran f ⊂ [a, b]. Then f has a fixed point, i.e., there exists some x ∈ [a, b]such that f (x) = x .This is an example of a fixed point theorem. (In fact, it is a special case ofBrouwer’s fixed point theorem.) Such results are extremely useful forproving existence and/or uniqueness of very general objects inmathematics.One of the most basic fixed point theorems is called the contractionmapping principle or Banach’s fixed point theorem and is treated in theassignments.



A Fixed Point Theorem

Proof.We assume that f (a) > a and f (b) < b, for otherwise a and b would befixed points. Then f (a)− a > 0 and f (b)− b < 0. Consider the functiong(x) = f (x)− x .Since g(a) > 0 and g(b) < 0, it follows that for some x0 ∈ [a, b]

0 = g(x0) = f (x0)− x0 ⇔ f (x0) = x0,

so x0 is a fixed point for f .



Boundedness of Continuous FunctionsIt turns out that continuity of a function implies its boundedness on closedand bounded domains; we will study this more closely.2.5.15. Lemma. Let Ω ⊂ R be some set and f : Ω → R a function that iscontinuous at some point x0 ∈ Ω. Then there exists a δ > 0 such that f isbounded above on (x0 − δ, x0 + δ) ∩ Ω .Basically, this lemma says that supf (x) : x ∈ Bδ(x0) exists for someδ > 0.

Proof.Since f is continuous at x0, for ε = 1 we can find a δ > 0 such that for allx ∈ Ω

|x − x0| < δ ⇒ |f (x)− f (x0)| < 1

⇒ |f (x)| < |f (x0)|+ 1.



Boundedness of Continuous Functions

We can now prove a more general theorem.2.5.16. Proposition. Let a < b and f : [a, b]→ R be a continuousfunction. Then f is bounded above.By this result, supf (x) : x ∈ [a, b] exists. Proposition 2.5.16 also impliesthat f is bounded below and hence bounded on [a, b].

Proof.We define

A := x ∈ [a, b] : f is bounded above on [a, x ].

Clearly A is bounded (a ≤ x ≤ b for all x ∈ A) and non-empty (a ∈ A).Therefore, A has a least upper bound a ≤ α ≤ b.




Proof (continued).We claim that in fact α = b. Suppose that α < b. By Lemma 2.5.15, weknow that there exists some δ > 0 such that f is bounded above on(α− δ,α+ δ). Since α is the least upper bound of A, there exists somex0 ∈ A such that α− δ < x0 < α. We conclude that f is bounded aboveon [a,α+ δ) and hence α+ δ/2 ∈ A. We have shown that f is bounded above on [a, x ] for every x < b. Notethat this does not automatically imply that f is bounded above on [a, b]!But since there exists a δ > 0 such that f is bounded above on (b − δ, b],we can conclude that f is bounded above on [a, b].




The previous result can be significantly strengthened.2.5.17. Theorem. Let a < b and f : [a, b]→ R be a continuous function.Then there exists a y ∈ [a, b] such that f (x) ≤ f (y) for all x ∈ [a, b].Hence maxf (x) : x ∈ [a, b] exists. Colloquially, we say that “acontinuous functions attains its maximum”.Proof.We define

A := y ∈ R : y = f (x), x ∈ [a, b].

By Proposition 2.5.16, A is bounded and A is clearly non-empty(f (a) ∈ A). Therefore, A has a least upper bound α such that α ≥ f (x)for all x ∈ [a, b]. We will show that α = f (y) for some y ∈ [a, b].




Proof (continued).We argue by contradiction: assume α = f (y) for all y ∈ [a, b]. Then

g : [a, b]→ R, g(x) =1

α− f (x)

is continuous on [a, b] but not bounded above, as we shall see.Since α is the least upper bound of A, for every ε > 0 there exists anx ∈ [a, b] such that α− f (x) < ε. But then for this x , g(x) > 1/ε. Sincewe can make ε as small as desired, g becomes as large as desired, so g isnot bounded above. But this contradicts Proposition 2.5.16.



Inverse FunctionsWe want to say a few words about the inverse of a function. First, we takethe opportunity to introduce some new terms:2.5.18. Definition. Let Ω , Ω ⊂ R and f : Ω → Ω a function. We say that fis

injective if f (x1) = f (x2)⇒ x1 = x2 for all x1, x2 ∈ Ω; surjective if for every y ∈ Ω there exists an x ∈ Ω such that f (x) = y

(i.e., if ran f = Ω); bijective if for every y ∈ Ω there exists a unique x ∈ Ω such that

f (x) = y (i.e., f is injective and surjective).If f : Ω → Ω is bijective we define the inverse of f , denoted by f −1 as

f −1 : Ω → Ω, f (x) 7→ x .

We then say that f is invertible on Ω.



Inverse Functions

If we regard f as a relation

f := (x , f (x)) : x ∈ Ω

we know that (x , y1) = (x , y2) implies y1 = y2 because f is a function. If fis injective, we also know that (x1, y) = (x2, y) implies x1 = x2. Switchingthe entries, we also know that (y , x1) = (y , x2) implies x1 = x2.Therefore, it follows that

f −1 := (f (x), x) : x ∈ Ω

is a function. Due to the surjectivity of f , we can write this set of pairs as

f −1 := (f (x), x) : f (x) ∈ Ω = (y , f −1(y)) : y ∈ Ω



Inverse FunctionsA function is invertible if and only if it is bijective. But we would like tofind a criterion for bijectivity. If f is strictly monotonic (either increasingor decreasing) on I then, in particular, f (x) = f (y) for x = y . This shouldimply that f is injective and hence bijective in a suitable sense.

2.5.19. Theorem. Let a, b ∈ R ∪ −∞ ∪ ∞ such that a < b. Letf : (a, b)→ R be strictly increasing and continuous. Then

α := limxa

f (x) ≥ −∞, β := limxb

f (x) ≤ ∞,

exist and f : (a, b)→ (α,β) is bijective. Furthermore, the inverse functionf −1 is also continuous and strictly increasing and, furthermore,

limyα

f −1(y) = a, limyβ

f −1(y) = b. (2.5.1)



Inverse Functions

Below we will give a sketch of the proof. Here, “sketch” means that somedetails are left to you to work out.Sketch of Proof.Due to the intermediate value theorem, f takes on every value in (α,β) atleast once [why?], so f is surjective. Since f is strictly increasing it is alsoinjective. Hence the inverse function exists.Denote the inverse function f −1 by g . If y1, y2 ∈ (α,β) with y1 < y2, weclaim that g(y1) < g(y2). If this were not the case,

g(y1) ≥ g(y2) ⇒ f (g(y1) = y1 ≥ f (g(y2)) = y2,

contradicting the fact that f is strictly increasing. Hence g is strictlyincreasing.



Inverse Functions

Sketch of Proof (continued).We now prove that g is continuous from the left. Fix y0 ∈ (α,β) andchoose x0 ∈ (a, b) such that f (x0) = y0. Let (an) be an increasingsequence with an y0. Then the sequence (g(an)) is increasing [why?]and bounded by x0, since

an ≤ y0 ⇒ an ≤ f (x0) ⇒ f (g(an)) ≤ f (x0) ⇒ g(an) ≤ x0

where we have used that f is increasing. In fact, x0 is the least upperbound of g(an) [why?] and so, by Theorem 2.2.24, g(an)→ g(y0). Thisproves that g is continuous from the left. In a similar manner, we canprove that g is continuous from the right.It remains to show (2.5.1). This is left to you!



Inverse Functions

We have seen that a monotonic function on an interval is bijective; thefollowing result gives a converse to that statement.2.5.20. Theorem. Let I ⊂ R be an interval and Ω ⊂ R a set. If f : I → Ωis continuous and bijective, then f is strictly monotonic on I .

Proof.Let a0, b0 ∈ I with a0 < b0 be given. Then either

i) f (b0)− f (a0) > 0 or ii) f (b0)− f (a0) < 0

We will show that if i) is true, then i) holds for any other a1, b1 ∈ I witha1 < b1, i.e., that f is strictly increasing. The proof for the case ii) issimilar and left to the reader.



Inverse FunctionsProof (continued).For 0 ≤ t ≤ 1 define

xt := (1− t)a0 + ta1, yt := (1− t)b0 + tb1.

Then x0 = a0 and x1 = a1 and xt ∈ [a0, a1]. An analogous statement istrue for yt . Thus xt , yt ∈ dom f for all t ∈ [0, 1].Since a0 < b0 and a1 < b1, we have

xt < yt for t ∈ [0.1].

We define the function g : [0, 1]→ R, g(t) = f (yt)− f (xt). Since g is acomposition of continuous functions, it is continuous on [0, 1].Moreover, g(t) = 0 for all t ∈ [0, 1] since xt < yt and f is bijective.Therefore, either g > 0 or g < 0 on [0, 1]. But g(0) > 0 by i), so g(t) > 0for all t ∈ [0, 1]. This implies that i) also holds for a1 < b1.



Image and Pre-Image of SetsWe take this opportunity to define some notation for future use:2.5.21. Definition. Let Ω ⊂ R and f : Ω → R. Then for any A ⊂ Ω andB ⊂ ran f

f (A) :=y ∈ R : ∃

x∈Af (x) = y

is called the image of A and

f −1(B) :=x ∈ Ω : ∃

y∈Bf (x) = y

is called the pre-image of B. Note that the symbol f −1(B) makes sensewhether or not f is invertible.

2.5.22. Example. Let f : R→ R, f (x) = x2. Then

f ([−1, 2]) = [0, 4] and f −1([1, 4]) = [−2,−1] ∪ [1, 2].



Continuity in Intervals

Up to now we have mainly discussed continuity at single points. Recallthat a function f was said to be continuous on an interval I if it iscontinuous at all points of the interval, i.e., f is continuous on I ⊂ dom fif and only if

∀x∈I∀

ε>0∃

δ>0∀y∈I|x − y | < δ ⇒ |f (x)− f (y)| < ε.

Note that we can exchange the first two quantifiers without changing theexpression, so we obtain

∀ε>0∀x∈I∃

δ>0∀y∈I|x − y | < δ ⇒ |f (x)− f (y)| < ε.

Here the δ > 0 essentially depends on the point x ∈ I ; given ε > 0, wewould be able to choose a different δ > 0 depending on x .



Continuity in Intervals

If, however, a function f is “well-behaved” enough such that we can usethe same δ > 0 for all x ∈ I , i.e., if in fact

∀ε>0∃

δ>0∀x∈I∀y∈I|x − y | < δ ⇒ |f (x)− f (y)| < ε.

we would want to give the function a special attribute.2.5.23. Definition. Let I ⊂ R be an interval and f : Ω → R a function withI ⊂ Ω . Then f is called uniformly continuous on I if and only if

∀ε>0∃

δ>0∀

x ,y∈I|x − y | < δ ⇒ |f (x)− f (y)| < ε.

Clearly, if f is uniformly continuous on I , then f is also continuous on I .



Uniform Continuity on Closed Intervals2.5.24. Theorem. Let f : Ω → R a function with I = [a, b] ⊂ Ω. If f iscontinuous on [a, b] then f is also uniformly continuous on [a, b].

Proof.Suppose that f is continuous but not uniformly continuous on I . Then

∃ε>0∀

δ>0∃

x ,y∈I|x − y | < δ ∧ |f (x)− f (y)| ≥ ε.

Denote this ε by ε0. Then for each δ = 1/n there exist numbers xn, yn ∈ Isuch that

|xn − yn| <1

n∧ |f (xn)− f (yn)| ≥ ε0.

We obtain two sequences (xn) and (yn) that are bounded (why?). By thetheorem of Bolzano-Weierstraß (xn) has a convergent subsequences (xnk ),say with limit ξ. Consider now the sequence (ynk ). It is also bounded andhence has a convergent subsequence (ynkj ), say with limit η.



Uniform Continuity on Closed Intervals

Proof (continued).Since (xnk ) converges to ξ, so does the subsequence (xnkj ). It follows thatwe have subsequences (xnkj ) and (ynkj ), converging to ξ and η,respectively. We will now drop the subscript j , writing simply (xnk ) and(ynk ) for these sequences. Since I is closed, both ξ and η must lie in I .By construction, |xnk − ynk | < 1

nk, so we see that ξ = η. However, then

xnk → ξ ∧ ynk → ξ ∧ |f (xnk )− f (ynk )| ≥ ε0 → 0.

which contradicts the continuity of f at ξ.



OutlookContinuity is an important basic concept in calculus, and so it should notbe surprising that there are several generalizations and furtherdevelopments of continuity. As an example, consider the following result:2.5.25. Corollary. Let Ω ⊂ R and f : Ω → R continuous. Suppose thatI ⊂ Ω is a closed interval. Then the image

f (I ) =y ∈ R : ∃

x∈If (x) = y

is also a closed interval.

Proof.The continuity of f on I implies that there exist α := infx∈I f (x) andβ := supx∈I f (x) so that f (I ) ⊂ [α,β]. Furthermore, by Theorem 2.5.17,α,β ∈ f (I ). Then by the Intermediate Value Theorem 2.5.13,[α,β] ⊂ f (I ), so f (I ) = [α,β].



Outlook

Now a closed interval is a special case of the following:2.5.26. Definition. A closed and bounded set K ⊂ R is called compact.It is possible to prove the following:2.5.27. Theorem. Let K ⊂ R be a compact set and f : K → R becontinuous. Then f is uniformly continuous on K , f is bounded and has amaximum and minimum. Furthermore, the image f (K ) is also compact.This result will be proven in the next term, where it will be very useful tous in the proof of various generalizations of Theorems 2.5.17 and 2.5.24.It turns out that if we want to consider functions of several variables, thehigher-dimensional analogue of a closed interval is a compact set and thebasic results that we have proven here carry over to continuous functionson compact sets.



First Midterm Exam

The preceding material completes the first third of the course material. Itencompasses everything that will be the subject of the First MidtermExam.The exam date and time will be announced on SAKAI.No calculators or other aids will be permitted during the exam. A sampleexam with solutions has been uploaded to SAKAI. Please study it carefully,including the instructions on the cover page.


Differential Calculus in One Variable

Part III




Differentiation of Real Functions

Properties of Differentiable Real Functions

Vector Spaces

Sequences of Real Functions

Series

Real and Complex Power Series

The Exponential Function

The Trigonometric Functions


Differential Calculus in One Variable Differentiation of Real Functions



Vector Spaces


Series






Finer Analysis of FunctionsWe have seen that even functions whose definitions appear quitestraightforward can exhibit a lot of complexity in their behavior. When wefirst started introducing functions, our goal was to develop mathematicalmethods to describe and analyze this behavior. In this respect, thedefinition of limits and Landau symbols was a relatively modest step - wewere describing the values of the function.Of course, the behavior of a function is not just described by its numericalvalues. For example, the graphs of the functions f (x) = x2 andg(x) = 3x2 + x − 4 look quite different at x = 1, even thoughf (1) = g(1) = 1.The next step in analyzing a function is therefore to look beyond theactual values to the change of the values: Is the function monotonic in aneighborhood of a certain point? Do the values attain a maximum or aminimum? Or is the situation more complex than that?



Linear Approximations

In order to answer (some of) these questions, it is useful to consider thesimplest non-constant type of function, the linear map

L : R→ R, L(h) = α · h for α ∈ R.

Observing the graph, it is immediately obvious whether a linear map isincreasing, strictly increasing, constant or (strictly) decreasing.We hence aim to describe the behavior of an arbitrary function f at apoint x0 by comparing it with a linear map L. In fact, we want to be ableto use a linear map as a linear approximation to f near x0. It will turn outthat if such an approximation exists it is unique and we can hence speak ofa linearisation of f .We will consider a basic example before giving the general definition.



Linear Approximation of the Square RootProblem: Calculate

√9.15 approximately.

Approach:√9 = 3 is known, write 9.15 = 9 + 0.15, where 0.15 is

considered “small”. Generalize: x = 9 and h = 0.15.Goal: Find a linear approximation of the square root

√x + h near √x .

What form does such a linear approximation take? We aim for√x + h ≈

√x + Lx · h (3.1.1)

where Lx ∈ R is a suitable constant that in general depends on x .Equation (3.1.1) is called a linear approximation because we use Lx · h (alinear function of h) to approximate

√x + h.

Our goal is now to find Lx , which will in general depend on x (but isindependent of h).



Linear ApproximationsWe now keep x > 0 fixed throughout the discussion. Instead of theapproximate identity (3.1.1) we write

√x + h =

√x + Lx · h + r(x , h) (3.1.2)

where r(x , h) is some remainder function which should become small whenh approaches zero. Let us formally solve for Lx :

√x + h =

√x + Lx · h + r(x , h)

⇔√x + h −

√x − r(x , h) = Lx · h

⇔ Lx =

√x + h −

√x

h− r(x , h)

h. (3.1.3)



Finding L

Since L is independent of h, we can let h approach zero on the right-handside of (3.1.3). Let us require that

r(x , h)

h→ 0 as h→ 0. (3.1.4)

The requirement (3.1.4) means that the remainder r(x , h) = o(h) ash→ 0, so the remainder is “much smaller” than the linear approximation.Then

Lx = limh→0

√x + h −

√x

h.



Linear Approximation of the Square RootLet us find Lx explicitly. For this, we use the following argument:

√x + h −

√x

h=

(√x + h −

√x)(√x + h +

√x)

h(√x + h +

√x)

=x + h − x

h(√x + h +

√x)

=1√

x + h +√x

Since x > 0, we can now easily see that

Lx = limh→0

√x + h −

√x

h=

1

limh→0

(√x + h +

√x)

=1

2√x



Linear ApproximationsWe have shown that

√x + h =

√x +

h

2√x+ o(h)

To answer our original question:√9.15 ≈

√9 +

0.15

2√9= 3.025.

This is actually a good approximation. The exact (rounded) value is√9.15 = 3.024897.

We can generalize this example simply by substituting an arbitraryfunction f for the square root. However, it turns out that not all functionscan be approximated linearly at every point; this is a very nice property ofa function and hence deserves an explicit definition.



Differentiability

3.1.1. Definition. Let Ω ⊂ R be a set, x ∈ Ω an interior point of Ω andf : Ω → R a real function. Then we say that f is differentiable at x ifthere exists a linear map Lx such that for all sufficiently small h ∈ R

f (x + h) = f (x) + Lx(h) + o(h) as h→ 0. (3.1.5)

We say that f is differentiable on some open set U ⊂ Ω if f isdifferentiable at every point of U.We say that f is differentiable if its domain is an open set and f isdifferentiable at every point of the domain.

3.1.2. Lemma. For given f and x the linear map Lx in (3.1.5) is unique.



DifferentiabilityThe map Lx : R→ R is a linear map, i.e., Lx(h) = αh for some α ∈ R. Itquickly becomes tedious to use two different symbols to describe the samemap, so we commit some slight abuse of notation: we will use the symbolLx to denote

1. the function Lx : R→ R as well as2. the number α ∈ R such that Lx(h) = α · h.

This means that we write

Lx : R→ R, Lx(h) = Lx · h.

This “double meaning” of Lx as a function and as a number may beconfusing at first, but will later seem very natural.Since by Lemma 3.1.2 the map Lx is unique, we may define the derivativeof f at x , denoted by f ′(x), by

f ′(x) = Lx .



Examples of Differentiable Functions

3.1.3. Examples.(i) f : R→ R, f (x) = c has derivative f ′(x) = 0 for any x ∈ R.(ii) f : R→ R, f (x) = x has derivative f ′(x) = 1 for any x ∈ R.(iii) f : R→ R, f (x) = x2 has derivative f ′(x) = 2x for any x ∈ R.(iv) Let f : R→ R, f (x) = xn, n ∈ N. By the binomial formula,

(x + h)n =n∑

k=0

(n

k

)xn−khk = xn + nxn−1h + o(h) as h→ 0.

It follows thatf ′(x) = nxn−1.



Examples of Differentiable Functions3.1.4. Lemma. The derivative of the function f : R \ 0 → R,f (x) = 1/x , is given by

f ′(x) = − 1

x2.

Proof.We first show that

1

1 + x= 1− x + o(x) as x → 0.

Indeed,

limx→0

1

x

(1

1 + x+ x − 1

)= lim

x→0

1

x

1 + (x − 1)(x + 1)

1 + x= lim

x→0

x

1 + x= 0.



Examples of Differentiable Functions

Proof (continued).Now

1

x + h=

1

x

1

1 + h/x=

1

x

(1− h

x+ o(h)

)=

1

x− 1

x2h + o(h)

as h→ 0. This implies that (1

x

)′= − 1

x2



Geometric Interpretation of the DerivativeGeometrically, a linear approximation to a function can be visualised as atangent line to the graph of the function, where the slope of the tangent isequal to the derivative. In order to obtain the tangent to a graph at apoint geometrically, one considers secants through the fixed point andpoints on the graph progressively approaching the first.Let (x1, f (x1)) and (x2, f (x2)) be two points on the graph of a function f .Then the secant line through these two points is given by

S(x ; x1, x2) =f (x2)− f (x1)

x2 − x1(x − x1) + f (x1).

If we write x2 = x1 + h, the the secant line can be written as

S(x ; x1, x1 + h) =f (x1 + h)− f (x1)

h(x − x1) + f (x1).



Geometric Interpretation of the Derivative

We can use the slope, f (x1+h)−f (x1)h , to obtain an equivalent definition of

the derivative.3.1.5. Theorem. Let Ω be a set, x ∈ Ω an interior point and f : Ω → R afunction that is differentiable at x with derivative Lx = f ′(x). Then

f ′(x) = limh→0

f (x + h)− f (x)

h. (3.1.7)

Furthermore, if the limit in (3.1.7) exists for an interior point x ∈ Ω , thenf is differentiable at x and the derivative is given by (3.1.7).



Geometric Interpretation of the DerivativeProof.Assume first that f is differentiable at x . We will show that (3.1.7) holds.Since

f (x + h) = f (x) + f ′(x) · h + o(h) as h→ 0,

we have

limh→0

f (x + h)− f (x)

h= lim

h→0

f (x) + f ′(x) · h + o(h)− f (x)

h

= limh→0

f ′(x) · h + o(h)

h= f ′(x) + lim

h→0

o(h)

h

Since o(h) denotes a function g such that g(h)/h→ 0 as h→ 0, we seethat

limh→0

f (x + h)− f (x)

h= f ′(x).



Geometric Interpretation of the DerivativeProof (continued).Now let

limh→0

f (x + h)− f (x)

h= Lx . (3.1.8)

We will show that then f (x + h) = f (x) + Lx · h + o(h), proving that f isdifferentiable at x and that f ′(x) = Lx . From (3.1.8) it follows that

f (x + h)− f (x)

h= Lx + r(x , h),

where the remainder r(x , h)→ 0 as h→ 0. We can rearrange thisequation to yield

f (x + h) = f (x) + Lxh + hr(x , h),

where hr(x , h)/h→ 0 as h→ 0. But this means hr(x , h) = o(h).Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 261 / 624


Physical Interpretation of the DerivativeIn physics one is often interested in the dependence of one physicalquantity on another, for example the dependence of position on time, or ofelectrical conductivity on temperature. Let us consider the first case,where the position x of a point mass on a line depends on time t.Here we write x(t) for the position at time t. The derivative x ′(t) thentells us the rate of change of the position x at time t: if x ′(t) > 0, then xis increasing at t, if x ′(0) < 0, x is decreasing at t.The rate of change of the position is called the (instantaneous) velocity,which is again a function of t, denoted by v(t) = x ′(t). If we define theaverage velocity as

distance travelledtime taken =

x2 − x1t2 − t1

we arrive at a similar formula to that of the secant through the points(t1, x1) and (t2, x2).



Physical Interpretation of the Derivative

As before, taking the limit t2 → t1 we obtain the instantaneous velocity as

v(t) = x ′(t) = lim∆t→0

x(t +∆t)− x(t)

∆t.

Similarly, the rate of change of the velocity is called the acceleration,

a(t) = v ′(t),

and the rate of change of the acceleration is called the jerk,

j(t) = a′(t).

Hence the acceleration is the derivative of the derivative of the position.We are thus led naturally to introduce the concept of higher derivatives.



Higher DerivativesConsider a differentiable real function f . Then for every point x ∈ dom fwe can find precisely one real number f ′(x). Thus we can regard the map

f ′ : dom f → R, x 7→ f ′(x)

as a real function, which we call the derivative of f . The function f ′ mayitself be differentiable; in this case we use the notation

(f ′)′(x) = f ′′(x)

to denote the derivative of f ′ at x , and call f ′′(x) the second derivative off at x . If f ′′(x) exists, we say that f is twice differentiable at x , and if f ′is differentiable, we say that f is twice differentiable.This can of course be developed further, so that we say that f is n timesdifferentiable at f if

(... ((f ′)′)′ ... )′︸︷︷︸n times

(x)



Higher DerivativesWe would hence have

x ′(t) = v(t), x ′′(t) = a(t), x ′′′(t) = j(t)

for the velocity, acceleration and jerk, respectively.3.1.6. Notation. For the first three derivatives, one writes f ′, f ′′ and f ′′′,respectively. For higher derivatives, one writes

f (4), f (5), f (6), ...

In physics, for time-dependent quantities only, a slightly different notationgoing back to Newton is often used. There one denotes the first threederivatives by dots instead of primes, i.e.,

x(t) = v(t), x(t) = a(t),...x (t) = j(t)

These dots are only used for the first three derivatives, because inmechanics higher derivatives very rarely occur (for instance, Newton’sequations of motion often only contain first and second derivatives).



SmoothnessGeometrically, a function f is differentiable at x if we can find a tangent tothe graph of f at (x , f (x), i.e., if the limit of the slopes of secants exists.It is easy to find an example of a function where this is not the case at asingle point: the function

f (x) = |x |has derivative

f ′(x) =

1 x > 0

−1 x < 0

but the derivative is not defined for x = 0. In fact,

limh0

|0 + h| − |0|h

= limh0

h

h= 1

whilelimh0

|0 + h| − |0|h

= limh0

−hh

= −1

so f ′(0) = limh→0|0+h|−|0|

h does not exist.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 266 / 624


SmoothnessThe fact that f (x) = |x | is not differentiable at 0 is clearly visible by thesharp corner in the graph. As a first result we can note:3.1.7. Remark. A continuous function is not necessarily differentiable.However, if a function is differentiable, it is necessarily continuous:3.1.8. Lemma. Let f be a function that is differentiable at somex ∈ dom f . Then f is continuous at x .

Proof.Since f is differentiable at x , we can write

f (x + h) = f (x) + f ′(x)h + o(h).

In order to show that f is continuous at x we need to show thatlimy→x f (y) = f (x).



SmoothnessProof (continued).We can instead write

limy→x

f (y) = limh→0

f (x + h)

(why?) Now

limh→0

f (x + h) = limh→0

f (x) + limh→0

f ′(x)h + limh→0

o(h)

if all the limits on the right exist. Since limh→0o(h)h = 0 we have

limh→0

o(h) = limh→0

ho(h)

h= lim

h→0h limh→0

o(h)

h= 0.

Hence limh→0

f (x + h) = limh→0

f (x) so f is continuous.



Smoothness

This shows that the differentiable functions are a “special” or “nice”subset of the continuous functions. We often say that they are “smoother”than general continuous functions, alluding to the fact that they cannothave any sharp edges.Of course this can be generalized to higher derivatives; a function that istwice differentiable will be “smoother” than a merely once differentiablefunction. We make one obvious remark:3.1.9. Remark. If f is differentiable at x , f ′ is not necessarily continuous atx . A function whose derivative is also continuous (at x) is calledcontinuously differentiable (at x).We will later give an example of a function that is differentiable, but notcontinuously differentiable at x = 0.



SmoothnessWe have looked at an example of a function that is not differentiable at asingle point; there seems to be no reason why there should not be afunction that is continuous but not differentiable at any point. Anexample of such a function is the self-similar Van der Waerden function,pictured below.

0 1

8

1

4

3

8

1

2

5

8

3

4

7

81

0.0

0.1

0.2

0.3

0.4

0.5

0.6

We will later construct a modified version of this function (see Slides 432onwards).



Smoothness

The van der Waerden function is defined as an infinite sum of simplecontinuous functions. We will only later be able to study the definition ofthis function and prove its properties.We hence have a certain hierarchy of smoothness at a point x ∈ R:

arbitrary functions functions continuous at x

functions differentiable at x functions continuously differentiable at x

functions twice differentiable at x ...



Smoothness

For an open interval I ⊂ R we have finer gradations of smoothness: arbitrary functions functions continuous on I

functions uniformly continuous on I

functions differentiable on I

functions continuously differentiable on I

functions differentiable on I with uniformly continuous derivative functions twice differentiable at x ...



Differentiation

As usual, we will not always calculate derivatives by hand; rather we willcalculate some derivatives of basic functions and then obtain thederivatives of more complicated functions using general theorems.3.1.10. Lemma. Let f , g be functions on R, x ∈ dom f ∩ dom g andassume that f and g are both differentiable at x . Then

(f + g)′(x) = f ′(x) + g ′(x).

Furthermore,

(λf )′(x) = λf ′(x) for λ ∈ R.

This result is a simple consequence of the definitions.



Differentiation

Using Lemma 3.1.10 we can now calculate the derivative of anypolynomial. If p(x) =

∑nk=0 ckx

k , then

p′(x) =n∑

k=1

ckkxk−1.



The Product RuleWe may also find the derivative of the product of two functions:3.1.11. Lemma. Let f and g be real functions with x ∈ dom f ∩ dom gsuch that f and g are differeniable at x . Then

(f · g)′(x) = f ′(x)g(x) + g ′(x)f (x).

Proof.We just write out the definitions and use what we know about Landausymbols:

(f · g)(x + h) = f (x + h)g(x + h)

=(f (x) + f ′(x)h + o(h)

)(g(x) + g ′(x)h + o(h)

)= f (x)g(x) +

(f ′(x)g(x) + g ′(x)f (x)

)h + f ′(x)g ′(x)h2

+ o(h)[f ′(x)h + g ′(x)h + o(h)]

= f (x)g(x) +(f ′(x)g(x) + g ′(x)f (x)

)h + o(h)



The Chain Rule

In order to calculate the derivatives of more complicated functions, weneed to find a formula for the derivative of the composition of functions.This result is commonly called the chain rule.3.1.12. Theorem. Let f , g be real functions and x ∈ R such thatx ∈ dom g , g(x) ∈ dom f are interior points of the domains of g and f ,respectively. Assume further that g is differentiable at x and f isdifferentiable at g(x). Then

(f g)′(x) = f ′(g(x)) · g ′(x).



The Chain Rule

Proof.By our assumed differentiability of g at x we have

(f g)(x + h) = f (g(x + h)) = f (g(x) + g ′(x)h + o(h))

as h→ 0. Writing H = g ′(x)h + o(h) and using the differentiability of fat g(x), we have

(f g)(x + h) = f (g(x) + H) = f (g(x)) + f ′(g(x))H + o(H).

NowH = g ′(x)h + o(h) = O(h) + O(h) = O(h)

as h→ 0, so o(H) = o(O(h)) = o(h) as h→ 0 (as proven theassignments).



The Chain RuleProof (continued).We hence have

(f g)(x + h) = f (g(x)) + f ′(g(x))H + o(h)

= f (g(x)) + f ′(g(x))(g ′(x)h + o(h)) + o(h)

= f (g(x)) + f ′(g(x))g ′(x)h + o(h)

as h→ 0, so (f g)(x) = f ′(g(x))g ′(x).Using the chain rule, we can now calculate the derivative of many morefunctions, such as f (x) = x−n, n ∈ N or general quotients:3.1.13. Lemma. Let f and g be real functions with x ∈ dom f ∩ dom gsuch that f and g are differentiable at x and g(x) = 0. Then(

f

g

)′(x) =

f ′(x)g(x)− g ′(x)f (x)

g(x)2.



The Quotient RuleProof.Let h(x) = 1/x . Then f

g= f · (h g). Using the product rule,

(f

g

)′(x) = f ′(x) · (h g)(x) + f (x)(h g)′(x)

Now by Lemma 3.1.4 we have h′(x) = −1/x2, and by the chain rule weknow that

(h g)′(x) = h′(g(x))g ′(x),

so (f

g

)′(x) = f ′(x)

1

g(x)+ f (x)

(− 1

g(x)2

)g ′(x)

=f ′(x)g(x)− g ′(x)f (x)

g(x)2



The Inverse Function TheoremA simple consequence of the quotient rule is that (x−n)′ = (−n)x−n−1 forn ∈ N, n ≥ 1. In fact, we can now differentiate all rational functions(quotients of polynomials). However, there are still some functions we cannot yet differentiate, such as the square root function f (x) =

√x .

Since the square root function is defined as the inverse of g(x) = x2 forx > 0, it is perhaps more fruitful to consider derivatives of inversefunctions in general, rather than trying to obtain the derivative of everyfunction with a rational exponent separately.3.1.14. Theorem. Let I be an open interval and let f : I → R bedifferentiable and strictly monotonic. Then the inverse mapf −1 = g : f (I )→ I exists and is differentiable at all points y ∈ f (I ) forwhich f ′(g(y)) = 0. Furthermore,

g ′(y) =1

f ′(g(y)).



The Inverse Function TheoremProof.Let y = f (x) and y + H = f (x + h). Then

y + H = f (x + h) = f (x) + f ′(x)h + o(h)

= y + f ′(g(y))h + o(h)

= y + (f ′(g(y)) + o(1))h

as h→ 0. Assuming that f ′(g(y)) = 0, this gives

h =1

f ′(g(y)) + o(1)H as h→ 0.

Note that h = g(y + H)− g(y) and g is continuous, so h→ 0 if H → 0.By the continuity of g we then have

h =1

f ′(g(y)) + o(1)H = O(H) as H → 0.



The Inverse Function TheoremProof (continued).This allows us to use

y + H = f (x + h) = f (x) + f ′(x)h + o(h) = y + f ′(g(y))h + o(h)

as h→ 0 to deduce

h =1

f ′(g(y))H + o(h) as h→ 0,

=1

f ′(g(y))H + o(O(H)) as H → 0,

=1

f ′(g(y))H + o(H) as H → 0.

Then, as H → 0,

g(y + H) = x + h = g(y) +1

f ′(g(y))H + o(H)



The Inverse Function TheoremThe main difficulty in the proof of the previous theorem was showing thatf −1 was differentiable; if we know this, then the derivative is computedfrom the chain rule, because

y = f (g(y)),

so1 = y ′ = f ′(g(y))g ′(y)

and hence g ′(y) = 1/f ′(g(y)).We are now able to differentiate all algebraic functions and their inverses.For example,

(√x)′ =

1

2√x.

In general, we can show that (xp)′ = pxp−1 for all rational numbers p.



Leibniz Notation

Next to the notation f ′ for the derivative of f there is also a notationgoing back to Leibniz, which writes

f ′ =dy

dx.

This is of course inspired by taking the derivative as a “differentialquotient”

f ′(x0) = lim∆x→0

∆y

∆x

where ∆x = x − x0 and ∆y = f (x)− f (x0). Leibniz notation has provento be extremely versatile and flexible, and has become extremely popularin mathematics.



Leibniz NotationThere are some fine points one should be aware of:

1. Leibniz notation does not differentiate between f ′(x) and f ′; both canbe written as df

dx.

2. Leibniz himself regarded df and dx as independent infinitesimalelements, called differentials and df

dxas the differential quotient. For

the moment, we are not going to treat these individually, but only inthe form of a fraction.

There is a way around the first problem: we can use the notation |x0 , read“at x0,” to denote a function evaluated at a specific point. For example,

f (x0) = f |x0 = f (x)|x=x0

are all notations for the same number. Hence

f ′(x0) =df

dx

∣∣∣∣x0

can be used to indicate the derivative evaluated at a specific point.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 285 / 624


Leibniz NotationWhile it at first might look like this ambiguity of df /dx (does it indicate afunction or the value of a function?) is a disadvantage, it turns out thatprecisely this double meaning can be very useful to write out formulasefficiently. Furthermore, in advanced mathematics such a double meaningis often built into the statement, and a notation that refects this is veryuseful.In Leibniz notation, the chain rule for f g becomes

df

dx=

df

dg· dgdx

,

and the inverse function theorem can be writtendx

dy=

1dydx

.

While these formulae are useful mnemonics, they clearly need to beinterpreted correctly!



Leibniz NotationAnother advantage of Leibniz notation is that it becomes natural to write

df

dx=

d

dxf ,

where d

dxis a so-called differential operator that takes a function and

returns its derivative. For example,d

dx(4x3) = 12x2.

For higher derivatives, we write

f ′′ =d2f

dx2, f ′′′ =

d3f

dx3, f (n) =

dnf

dxn.

This allows us to writed

dx

(d

dxx5)

=d2

dx2x5 = 20x3.


Differential Calculus in One Variable Properties of Differentiable Real Functions



Vector Spaces


Series






Extrema of Real Functions

Now that we know how to differentiate and have established the basicnotation, we can at last treat our basic objective: the analysis of thebehavior of functions. We begin by looking at extrema, i.e., maxima andminima of functions.3.2.1. Definition. Let f be a function and Ω ⊂ dom f . Then x ∈ Ω iscalled a (global) maximum point for f on Ω if

f (x) ≥ f (y) for all y ∈ Ω.

The number f (x) is called the maximum value of f on Ω.A minimum point/minimum value is defined similarly.We will use derivatives to characterize maximum and minimum points.




3.2.2. Theorem. Let f be a function and (a, b) ⊂ dom f an open interval.If x ∈ (a, b) is a maximum (or minimum) point for f on (a, b) and if f isdifferentiable at x , then f ′(x) = 0.

Proof.We will consider the case of a maximum point only. Clearly, the geometricinterpretation of the derivative is well-suited to this theorem: we want toshow that the tangent at the maximum point is horizontal, i.e., has slopezero.Assume that f has a maximum at x . For sufficiently small h,x + h ∈ (a, b), so

f (x + h)− f (x) ≤ 0.




Proof (continued).If h > 0, then

f (x + h)− f (x)

h≤ 0

and hencelimh0

f (x + h)− f (x)

h≤ 0.

Similarly,limh0

f (x + h)− f (x)

h≥ 0.

But since f is differentiable at x , these limits must be equal, so they mustboth be zero.



Extrema of Real Functions3.2.3. Definition. Let f be a real function and Ω ⊂ dom f . Then x ∈ Ω iscalled a local maximum (minimum) point for f on Ω if there exists someε > 0 such that x is a maximum (minimum) point for f on Ω ∩ Bε(x).

3.2.4. Remark. If f is defined on an open interval (a, b) and has a localmaximum (minimum) at x ∈ (a, b), and if f is differentiable at x , thenf ′(x) = 0.

3.2.5. Definition. A function f is said to have a critical point at x ∈ dom fif f ′(x) = 0. The value f (x) is then called a critical value of f .Therefore, when finding the (local and global) maxima and minima of afunction on a set Ω ⊂ dom f , we need to check

1. the critical points of f ,2. the boundary points of Ω,3. interior points of Ω where f is not differentiable.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 292 / 624


Rolle’s Theorem

A basic result which will allow us to develop our analysis further is thefollowing, named after the French mathematician Rolle.3.2.6. Rolle’s Theorem. Let f be a real function and a < b ∈ R such that[a, b] ∈ dom f . Assume that f is continuous on [a, b] and differentiable on(a, b) and that f (a) = f (b). Then there exists a number x ∈ (a, b) suchthat f ′(x) = 0.

Proof.Since f is continuous, it will have a maximum and a minimum value on[a, b]. If either is in (a, b), then f ′ vanishes there by Theorem 3.2.2, so weare finished. Assume that both the maximum and the minimum are at a orb. Since f (a) = f (b), it follows that f is constant, so f ′(x) = 0 for any xin (a, b).



The Mean Value TheoremRolle’s theorem may not look spectacular, but it already contains the proofof the powerful mean value theorem:3.2.7. Mean Value Theorem. Let f be a real function and a < b ∈ R suchthat [a, b] ∈ dom f . Assume that f is continuous on [a, b] anddifferentiable on (a, b). Then there exists a number x ∈ (a, b) such that

f ′(x) =f (b)− f (a)

b − a.

Proof.Let

h(x) = f (x)− f (b)− f (a)

b − a(x − a).

Then h(a) = h(b) = f (a) and by Rolle’s theorem there exists somex ∈ (a, b) such that h′(x) = 0.

Since h′(x) = f ′(x)− f (b)− f (a)

b − a, this proves the theorem.



The Mean Value TheoremWe can now prove a seemingly “obvious” statement, which, however,requires the powerful mean value theorem.3.2.8. Corollary. Let f be a real function and I ⊂ dom f . Assume thatf ′ = 0 on I . Then f is constant on I .

Proof.Let a, b ∈ I , a < b. Then there exist some x ∈ (a, b) such that

f ′(x) =f (b)− f (a)

b − a.

But since f ′ = 0 on I , it follows that f (a) = f (b).

3.2.9. Corollary. Let f , g be real functions and I ⊂ dom f ∩dom g . Assumethat f ′ = g ′ on I . Then there exists some c ∈ R such that f = g + c .



The Mean Value Theorem

The following corollary is of great importance for analyzing the behavior offunctions:3.2.10. Corollary. Let f be a real function and I ⊂ dom f . Assume thatf ′ > 0 on I . Then f is strictly increasing on I . If f ′ < 0 on I , f is strictlydecreasing on I .

Proof.Consider the case where f ′ > 0. Let a, b ∈ I , a < b. Then there existsome x ∈ (a, b) such that

f ′(x) =f (b)− f (a)

b − a> 0.

Since b > a this implies f (b) > f (a).



Maxima and Minima3.2.11. Theorem. Let f be a real function and x ∈ dom f such thatf ′(x) = 0. If f ′′(x) > 0, then f has a local minimum at x ; if f ′′(x) < 0,then f has a local maximum at x .

Proof.We know that

f ′(x + h) = f ′(x) + f ′′(x)h + o(h) =(f ′′(x) + o(1)

)h as h→ 0

Assume that f ′′(x) > 0. Then for sufficiently small h,

f ′(x + h) > 0 if h > 0,f ′(x + h) < 0 if h < 0.

This means that f is strictly increasing to the right of x and strictlydecreasing to the left of x . Thus f has a minimum at x .



Maxima and Minima

3.2.12. Theorem. Let f be a real function and x ∈ dom f such that f ′′(x)exists. If f has a local minimum at x , then f ′′(x) ≥ 0; if f has a localmaximum at x , then f ′′(x) ≤ 0.

Proof.Suppose f has a minimum at x . If f ′′(x) < 0, then f also has a maximumat x . But this means that f is constant, therefore f ′′(x) = 0.



Convexity and ConcavityWe have seen how to find the extrema of real functions and also know howto determine intervals of monotonicity. However, an increasing functionmay exhibit one of two fundamentally different shapes:

1

x

y

y = x

y = x2

The graph of f (x) = x2 stays belowthe secant joining the origin withthe point (1, 1), while the graph ofg(x) =

√x stays above this line.

In fact, if we select any two pointson the graph of f and join themwith a line, the graph of f will re-main below these points. A corre-sponding statement is true for g .



Convexity and ConcavityThese observations motivate the following definition:3.2.13. Definition. Let Ω ⊂ R be any set and I ⊂ Ω an interval. A functionf : Ω → R is called strictly convex on I if for all a, x , b ∈ I with a < x < b,

f (x)− f (a)

x − a<

f (b)− f (a)

b − a.

We say that f is strictly concave on I if −f is strictly convex. If we replace“<” by “≤” above, f is simply called convex and −f concave.

3.2.14. Theorem. Let f : I → R be be strictly convex on I anddifferentiable at a, b ∈ I . Then

(i) For any h > 0 (h < 0) such that a+ h ∈ I , the graph of f over theinterval (a, a+ h) lies below the secant line through the points(a, f (a)), (a+ h, f (a+ h)).

(ii) The graph of f over all of I lies above the tangent line through thepoint (a, f (a)).

(iii) If a < b, then f ′(a) < f ′(b).Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 300 / 624


Convexity and ConcavityProof.

(i) Let a < x < a+ h, h > 0. The secant line through the points(a, f (a)) and (a+ h, f (a+ h)) is given by

ψa,a+h : R→ R, ψa,a+h(t) = f (a) +ma,a+h(t − a),

while the line through the points (a, f (a)) and (x , f (x)) is

ψa,x : R→ R, ψa,x(t) = f (a) +ma,x(t − a)

with

ma,a+h =f (a+ h)− f (a)

h, ma,x =

f (x)− f (a)

x − a.

Since f is strictly convex, ma,x < ma,a+h. This implies

f (x) = ψa,x(x) < ψa,a+h(x).



Convexity and ConcavityProof.(ii) We show that the graph of f to the right of a lies above the tangent

line through a. A similar argument can be used for the graph of f tothe left of a. Let 0 < h1 < h2. Then, by the convexity of f ,

f (a+ h1)− f (a)

h1<

f (a+ h2)− f (a)

h2

It follows that the function

g(h) =f (a+ h)− f (a)

h

is strictly decreasing as h 0. Since limh→0

g(h) = f ′(a) we see that

f ′(a) <f (a+ h)− f (a)

h=: ma,a+h for h > 0. (3.2.1)



Convexity and ConcavityProof (continued).

Consider the secant line through the points (a, f (a)) and(a+ h, f (a+ h)), h > 0, given by

ψa,a+h : R→ R, ψa,a+h(t) = f (a) +ma,a+h(t − a).

The tangent line through the point (a, f (a)) is given by

Ta : [a, a+ h]→ R, Ta(t) = f (a) + f ′(a)(t − a).

Since f ′(a) < ma,a+h by (3.2.1) it follows that Ta(x) < ψa,h(x) fora < x ≤ a+ h. In particular,

f (a+ h) = ψa,a+h(a+ h) > Ta(a+ h).

Since this is true for any h > 0, the graph of f to the right of a liesabove the tangent line through a.



Convexity and Concavity

Proof (continued).(iii) By (ii), we know that the graph of the function stays above the

tangent at any point. In particular, f (b) is above the tangent linethrough (a, f (a)), so

f ′(a) < ma,b =f (b)− f (a)

b − a.

Similarly, considering the tangent line through (b, f (b)) we have thatf (a) is above this line, so

f ′(b) > mb,a =f (a)− f (b)

a− b=

f (b)− f (a)

b − a.

This implies f ′(a) < f ′(b).



Convexity and Concavity3.2.15. Lemma. Let I be an open interval, f : I → R differentiable and f ′

strictly increasing. If a, b ∈ I , a < b, and f (a) = f (b), thenf (x) < f (a) = f (b) for all x ∈ (a, b).

Proof.We first show that f (x) ≤ f (a) for all x ∈ (a, b). Suppose that thereexists some x ∈ (a, b) with f (x) > f (a). Then there exists some pointx0 ∈ (a, b) such that f (x0) is a maximum of f for [a, b]. Consequently, byTheorem 3.2.2, f ′(x0) = 0. However, the Mean Value Theorem 3.2.7 forthe interval [a, x0] shows the existence of some x1 ∈ (a, x0) such that

f ′(x1) =f (x0)− f (a)

x0 − a> 0.

Since x1 < x0 but f ′(x1) > f ′(x0), we obtain a contradiction to theassumption that f ′ is strictly increasing.



Convexity and ConcavityProof (continued).Now suppose there exists some x0 ∈ (a, b) with f (x0) = f (a). Sincef (x) ≤ f (a) for all x ∈ [a, b], x0 is a local maximum point for f and sof ′(x0) = 0.Since f ′ is strictly increasing, f can not be constant on [a, x0]. Thus, thereexists an x1 ∈ (a, x0) with f (x1) < f (x0). Applying the Mean ValueTheorem 3.2.7 to the interval [x1, x0] we find an x2 ∈ (x1, x0) such that

f ′(x2) =f (x0)− f (x1)

x0 − x1> 0.

This again contradicts the assumption that f ′ is increasing on I .The most important result in applications is the following:3.2.16. Theorem. Let I be an open interval, f : I → R differentiable and f ′

strictly increasing. Then f is strictly convex.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 306 / 624


Convexity and ConcavityProof.Let a, b ∈ I with a < b. Set g : I → R,

g(x) = f (x)− f (b)− f (a)

b − a(x − a).

Then g ′ is strictly increasing and g(a) = g(b) = f (a). By Lemma 3.2.15,

g(x) < f (a) for x ∈ (a, b).

This impliesf (x)− f (b)− f (a)

b − a(x − a) < f (a)

orf (x)− f (a)

x − a<

f (b)− f (a)

b − a

for x ∈ (a, b). This shows that f is convex.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 307 / 624


Curve Sketching

We now have all the tools we need to sketch the graph of a given function.In orer to perform a sketch, we first need to perform a curve discussion,where we explore the behavior of the curve at all points of its domain. Inparticular, we are interested in

1. the domain and range;2. the continuity and behavior near points of discontinuity;3. the behavior as x → ±∞; in particular asymptotes (straight lines

such that the values of the function approach the points on the lineas x → ±∞).;

4. local and global extrema;5. intervals where the function is increasing, decreasing or constant;6. inflection points, where the second derivative changes sign;7. other remarkable features of the curve.



Curve SketchingThe object of a sketch of a curve is to visualize the curve for the reader,and to convey all relevant information regarding the curve. Knowing whatto include in a sketch is just as important as knowing what not to include.First, here are some general guidelines:

Decide what area of the coordinate system you will need (for thegraph of a function, consider its domain and range). In most graphs,you should include the origin (0, 0) of the coordinate system.

Make sure your sketch is large enough; generally, it should be as wideas the page on which you are writing, and at least 8 cm high!

Use a ruler to draw the axes. If the sketch includes the origin of thecoordinate system, the axes should intersect there.

Add an arrowhead pointing right to the right end of the horizontalaxis (called the abscissa), and an arrowhead pointing upwards to thetop end of the vertical axis (called the ordinate).



Curve Sketching Label the axes (generally, “x” for the abscissa and “y” for the

ordinate, but you can use any other suitable variables), writing thelabels under the horizontal arrowhead and on the left of the verticalarrowhead.

You do not need to label the origin (0, 0). It is understood that theaxes intersect at (0, 0). Only if in your sketch the axes do notintersect at the origin should you label the intersection.

Sketch the curve so that all the space provided by the axes is used;for a graph of a function, the maximum should be near the top of theordinate and the minimum should be near the bottom of the ordinate.The domain of the function should be clearly visible in the sketch. Ifthe graph continues for x → ±∞, then draw the curve until you arejust above the ends of the abscissa - do not draw a curve further thanthe horizontal or vertical axes, and do not stop drawing before theends are reached.



Curve Sketching

Label the curve, e.g., “y = f (x)”, “f ”. You may use little arrows toindicate the curve.

Mark characteristic points of the curve: intersections with the axes,extrema, inflection points, other interesting points. Generally, it issufficient to mark these points on the axes; only in rare cases shouldyou resort to labeling a point (x , y) directly on the graph.

If you do not have exact values for some characteristic points,indicate them through suitable symbols (e.g., x1, x2, ...) and explainthese symbols briefly at the bottom of the graph.



Curve Sketching

Indicate asymptotes through dashed lines, and label horizontal andvertical asymptotes on the axes of the graph. A slant asymptoteshould be labeled as “y = mx + b” (perhaps using an arrow) in thesame way the graph of the function is labeled. If a horizontalasymptote lies on the abscissa (y = 0), or a vertical asymptote lies onthe ordinate (x = 0), then you should not try to draw the asymptote,but rather indicate through words and an arrow that there is anasymptote on these lines.

Title your sketch (e.g., “Graph of the function f ”).



Curve Sketching

Things you should generally not do: Mark points on the axes that do not correspond to characteristic

points of the graph; there is really no need to give a “scale” on theaxes.

Draw lines inside the graph; only use dashed lines for asymptotes. Ifyou feel that you need to connect points on the axes with points onthe graph, then your sketch is not clear enough!

All of the above points are guidelines; sometimes it may be necessary tobreak them and to do things differently. However, if you do so, be sure youhave a very good reason!



Curve SketchingAs an example, let us sketch the function

I : [1,∞)→ R, I (x) =x − 1

(1 + x)2(3 + x).



The Cauchy Mean Value Theorem

3.2.17. Theorem. Let f , g be real functions and [a, b] ⊂ dom f ∩ dom g . Iff and g are continuous on [a, b] and differentiable on (a, b), then thereexists an x ∈ (a, b) such that(

f (b)− f (a))g ′(x) =

(g(b)− g(a)

)f ′(x).

Proof.We apply Rolle’s Theorem to

h(x) = f (x)(g(b)− g(a)

)− g(x)

(f (b)− f (a)

).



L’Hôpital’s Rule

3.2.18. Theorem. Let f and g be real functions and b ∈ dom f ∩ dom gwith

limxb

f (x) = 0 and limxb

g(x) = 0.

Suppose that there exists a δ > 0 such that f and g are defined anddifferentiable on the interval (b, b+ δ) and g ′(x) = 0 for all x ∈ (b, b+ δ).Suppose further that the limit lim

xbf ′(x)/g ′(x) =: L exists. Then

limxb

f (x)

g(x)= L.

For the following proof, we will change the definition of f and g to ensurethat f (b) = g(b) = 0. This ensures that f and g are continuous at b, butdoes not change any of the statements of the theorem.



L’Hôpital’s RuleProof.Let f and g be given as in the theorem with f (b) = g(b) = 0.Furthermore, choose δ > 0 small enough. Then for all x ∈ (b, b + δ) wehave g(x) = 0 by the mean value theorem. (Otherwise, g ′ = 0 somewherein (b, x), which is not allowed.)Fix ε > 0. Choose δ > 0 (perhaps decreasing the choice of δ above) suchthat ∣∣∣∣ f ′(ξ)g ′(ξ)

− L

∣∣∣∣ < ε for all ξ ∈ (b, b + δ).

Choose x ∈ (b, b + δ). Applying the Cauchy mean value theorem, there isa number αx in (b, x) such that

f (x)

g(x)=

f ′(αx)

g ′(αx).




Proof (continued).But then ∣∣∣∣ f (x)g(x)

− L

∣∣∣∣ < ε for all x ∈ (b, b + δ).

Thus f (x)/g(x)→ L as x b.



L’Hôpital’s RuleAn analogous version of l’Hopital’s rule applies when considering limits atinfinity:3.2.19. Theorem. Let f and g be real functions such that for some C > 0the interval (C ,∞) ⊂ dom f ∩ dom g and

limx→∞

f (x) = 0 and limx→∞

g(x) = 0.

Suppose further that f and g are defined and differentiable on the interval(C ,∞) and g ′(x) = 0 for all x ∈ (C ,∞). Suppose further that the limitlimx→∞

f ′(x)/g ′(x) =: L exists. Then

limx→∞

f (x)

g(x)= L.



L’Hôpital’s RuleProof.Fix ε > 0. By assumption, we can find (and then fix) a C > 0 such thatfor all y > C ∣∣∣∣ f ′(x)g ′(x)

− L

∣∣∣∣ < ε for x > y .

This implies, in particular, that g ′(x) = 0 for x > C . Choose any x > y .Then g(x) = g(y), for otherwise, by Rolle’s theorem, we would haveg ′(z) = 0 for some z ∈ (y , x). Furthermore, for any y > C and x > y wecan apply the Cauchy mean value theorem to the interval [y , x ] to findsome z ∈ (y , x) such that

f (y)− f (x)

g(y)− g(x)=

f ′(z)

g ′(z).




Proof (continued).We deduce that for all y > C∣∣∣∣ f (y)− f (x)

g(y)− g(x)− L

∣∣∣∣ < ε for x > y .

The right-hand side does not depend on x , so we can let x →∞ and seethat ∣∣∣∣ f (y)g(y)

− L

∣∣∣∣ ≤ ε for y > C .

This proves that limx→∞

f (x)g(x) = L.

Further versions of l’Hôspital’s theorem will be proved in the recitationclasses.


Differential Calculus in One Variable Vector Spaces



Vector Spaces


Series






Vector SpacesWe have now developed many tools to analyze the functions that weknow, which essentially comprise

1. algebraic functions (quotients of polynomials),2. their inverse functions (root functions) and3. piece-wisely defined functions (e.g., f (x) = |x |).

Of course there are many more functions out there! However, in order tofind and properly define more functions, we need to mathematicallyorganize the “set of all functions” in such a way that we can provetheorems on classes of functions and are able, for example, to considersequences of functions.We have previously added the concept of distance to a set by defining ametric space (M, ϱ) consisting of a set M and a map ϱ : M ×M → R.However, this does not give us enough structure for our purposes, as wealso need to be able to at least add elements of M. Thus, we willintroduce a more suitable setting for our functions.



Vector SpacesOur model for the concept of a vector space is based on R2. From thepoint of view of mathematics, R2 is simply the set of pairs (x1, x2) wherex1, x2 ∈ R. Every pair can also be interpreted as a point in the plane,which we also call R2. However, from a physical point of view we canidentify (x1, x2) with an arrow joining the origin to (x1, x2).

-4 -3 -2 -1 1 2 3 4x1

-2

-1

1

2

3

4

x2

Hx1,x2L

-4 -3 -2 -1 1 2 3 4

x1

-2

-1

1

2

3

4

x2

x1

x2



Vector Spaces

The classical (physicist’s) notation for a vector in R2 is

x =

(x1x2

).

We will keep the idea of writing(x1x2

)instead of (x1, x2) when it is useful to

do so, but we will drop the arrow in x and simply write x . It will always beclear from context whether x ∈ R or x ∈ R2.



Vector SpacesIn physics, vectors are used to indicate directional quantities such asvelocity or forces. In this context, position is also assumed to bedirectional. As is well-known (from school-level physics), vectors can beadded and one can consider multiples of vectors, as follows:

x + y =

(x1x2

)+

(y1y2

)=

(x1 + y1x2 + y2

), λ · x = λ

(x1x2

)=

(λx1λx2

),

where x , y ∈ R2 and λ ∈ R.However, there is no straightforward way to multiply two vectors in R2 andobtain a new vector in R2. And for physical as well as mathematicalapplications, that is often not necessary!Implementing the ideas of the previous discussion, we want to define avector space as a set of elements (“vectors”) that can be added togetheror multiplied by a (real or complex) number to obtain new elements of theset.



Vector Spaces3.3.1. Definition. A triple (V , +, ·) is called a real vector space (or reallinear space) if

1. V is any set;2. +: V × V → V is a map (called addition) with the following

properties: (u + v) + w = u + (v + w) for all u, v ,w ∈ V (associativity), u + v = v + u for all u, v ∈ V (commutativity), there exists an element e ∈ V such that v + e = v for all v ∈ V

(existence of a unit element), for every v ∈ V there exists an element −v ∈ V such that

v + (−v) = e;3. · : R× V → V is a map (called scalar multiplication) with the

following properties: λ · (u + v) = λ · u + λ · v for all λ ∈ R, u, v ∈ V , (λ+ µ) · u = λ · u + µ · u for all λ,µ ∈ R, u ∈ V , (λµ) · u = λ · (µ · u) for all λ,µ ∈ R, u ∈ V .

If we replace R with C, we say that (V , +, ·) is a complex vector (or linear)space.



Examples of Vector SpacesWe will often simply write V instead of (V , +, ·) if the operations · and +are understood.3.3.2. Examples.

1. The set of n-tuples (or n-dimensional vectors)Rn = x = (x1, ... , xn) : x1, ... , xn ∈ R is a real vector space withaddition defined by

x + y = (x1, ... , xn) + (y1, ... , yn)

:= (x1 + y1, ... , xn + yn), x , y ∈ Rn,

and scalar multiplication defined by

λx = λ · (x1, ... , xn) := (λx1, ... ,λxn), λ ∈ R, x ∈ Rn.

We will sometimes abbreviate (x1, ... xn) by writing (xk)nk=1.



Examples of Vector Spaces

2. The set Cn = z = (z1, ... , zn) : z1, ... , zn ∈ C is a complex vectorspace with addition defined by

w + z := (w1 + z1, ... ,wn + zn), w , z ∈ Cn, (3.3.1)

and scalar multiplication defined by

λz := (λz1, ... ,λzn), λ ∈ C, z ∈ Cn.

3. The set Cn is a real vector space if we define addition as in (3.3.1)and scalar multiplication

λz := (λz1, ... ,λzn), λ ∈ R, z ∈ Cn.



Examples of Vector Spaces4. The set of real polynomials of degree less than or equal to n,

Pn =p : R→ R : p(x) =

n∑k=0

akxk , a0, ... , an ∈ R

is a real vector space with pointwise addition and scalar multiplication:

(p + q)(x) := p(x) + q(x), (λp)(x) := λp(x).

for p, q ∈ Pn, λ ∈ R.5. More generally, for any set M, the set of maps f : M → R (C) is a

real (real or complex) vector space with pointwise addition and scalarmultiplication

(f + g)(x) := f (x) + g(x), (λf )(x) := λf (x),

for f , g : M → R, λ ∈ R (C).Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 330 / 624


Examples of Vector Spaces

6. Hence for M = N the set of all real (complex) sequences (an)n∈N is areal (real or complex) vector space.

7. The set of all functions f : R→ R is a real vector space.8. For any open set Ω ⊂ R the set of all continuous functions f : Ω → R

is a real vector space. We denote this set by C (Ω,R) or just C (Ω).9. For any open set Ω ⊂ R and k ∈ N the set of all k times continuously

differentiable functions f : Ω → R is a real vector space. We denotethis set by C k(Ω,R) or just C k(Ω).

10. For any open set Ω ⊂ R we define

C∞(Ω ,R) = C∞(Ω) = f : Ω → R : f ∈ C k(Ω ,R) for all k ∈ N,

which is also a real vector space.



Vector Spaces

3.3.3. Remarks. It is easy to prove that the unit element e ∈ V has theproperties

∀λ∈R (C)

λ · e = e, ∀x∈V

0 · x = e.

Instead of e ∈ V , we will often denote the unit element by 0 ∈ V , usingthe same symbol as for 0 ∈ R. This will simplify notation and generallynot cause any confusion.There can only be one unit element in V : if 0, 0′ ∈ V both have theproperty that v + 0 = v = v + 0′, then 0 = 0′.Let −v be the inverse of v ∈ V , i.e., v + (−v) = 0. Then −v = (−1) · v .If λ · v = 0 for some v ∈ V and λ ∈ R (C), then either v = 0 or λ = 0.



Subspaces

In the above examples, it seems clear that some vector spaces are in asense “contained” in others. For example, any polynomial of degree ≤ n isalso a polynomial of degree ≤ n + 1, so Pn ⊂ Pn+1 as a set. Furthermore,if we apply the addition and scalar multiplication defined in Pn+1 to theelements in Pn, then these elements remain in Pn and, furthermore, thisaddition and scalar multiplication acts in the same way as thecorresponding operations defined in Pn. This motivates the followingdefinition:3.3.4. Definition. Let (V , +, ·) be a real or complex vector space. If U ⊂ Vand (U, +, ·) is also a vector space, then we say that (U, +, ·) is asubspace of (V , +, ·).



Subspaces3.3.5. Example. The set L = (x1, x2) ∈ R2 : x1 + x2 = 0 gives a vectorspace (L, +, ·) with the usual component-wise addition and scalarmultiplication. We first need to check that + is actually a map L× L→ L,i.e., that x + y ∈ L if x , y ∈ L. Note that

x + y = (x1 + y1, x2 + y2)

so we need to check that (x1 + y1) + (x2 + y2) = 0. Since(x1 + y1) + (x2 + y2) = (x1 + x2) + (y1 + y2) = 0 + 0 = 0,

this is the case. Furthermore, the addition is associative and commutativefor x , y ∈ R2, so it will also have these properties for x , y ∈ L. The unitelement is e = (0, 0) ∈ L, and the inverse element to x is−x = (−x1,−x2) ∈ L (because −x1 + (−x2) = −(x1 + x2) = −0 = 0). Ina similar manner, we can check that scalar multiplication is a mapR× L→ L and satisfies all conditions to ensure that (L, +, ·) is a vectorspace.Since L ⊂ R2, (L, +, ·) is a subspace of (R2, +, ·).



SubspacesFortunately, we don’t always have to go to as much trouble as we did inthe previous example to verify that a vector space is a subspace of anothervector space.3.3.6. Lemma. Let (V , +, ·) be a real (complex) vector space and U ⊂ V .If u1 + u2 ∈ U for u1, u2 ∈ U and λu ∈ U for all u ∈ U and λ ∈ R (C),then (U, +, ·) is a subspace of (V , +, ·).

Proof.We consider the case of a real vector space. Assume that u1 + u2 ∈ U foru1, u2 ∈ U and λu ∈ U for all u ∈ U and λ ∈ R. Commutativity andassociativity of + are assured since they have these properties for allelements of V . Furthermore, e = 0 · v for any v ∈ V , so taking someu ∈ U, we have e = 0 · u ∈ U by our hypotheses. In the same way,(−1) · u = −u for any u ∈ U. Thus U contains the neutral element andinverse elements for every u ∈ U. All other properties of the maps + and ·are inherited from their properties in (V , +, ·).



SubspacesJust as we write V for (V , +, ·) when the operations are understood, wealso write U ⊂ V to indicate a subspace U of V .3.3.7. Examples.

1. C (R) ⊃ C 1(R) ⊃ · · · ⊃ C∞(R).2. Pn ⊂ C∞(R) for any n ∈ N.3. Denote the set of bounded sequences by

l∞ := (an)∞n=0 : supn∈N|an| <∞

and the set of sequences converging to 0 by

c0 := (an)∞n=0 : limn→∞

an = 0.

Both of these sets become vector spaces with component-wiseaddition and scalar multiplication, and c0 ⊂ l∞ as vector spaces.



Normed Vector SpacesNow that we have introduced the concept of a vector space, we want todefine on vector spaces something akin to the modulus |x | for x ∈ R or theeuclidean length of a vector in R2,

|x | = |(x1, x2)| =√

x21 + x22 .

3.3.8. Definition. Let V be a real (complex) vector space. Then a map∥ · ∥ : V → R is called a norm if for all u, v ∈ V and all λ ∈ R (C),

1. ∥v∥ ≥ 0 for all v ∈ V and ∥v∥ = 0 if and only if v = 0,2. ∥λ · v∥ = |λ| · ∥v∥,3. ∥u + v∥ ≤ ∥u∥+ ∥v∥.

The pair (V , ∥ · ∥) is called a normed vector space or a normed linearspace.Clearly, any normed vector space can also be considered as a metric space:it is easy to check that

ϱ(x , y) := ∥x − y∥satisfies all the requirements of a metric.



Normed Vector Spaces

3.3.9. Examples.

1. Rn with ∥x∥2 =( n∑j=1

x2i

)1/2,

2. Rn with ∥x∥p =( n∑j=1|xi |p

)1/pfor any p ∈ N \ 0,

3. Rn with ∥x∥∞ = max1≤k≤n

|xk |,

4. l∞ with ∥(an)∥∞ = supn∈N|an|,

5. c0 with ∥(an)∥∞ = supn∈N|an|,

6. C ([a, b]), [a, b] ⊂ R, with ∥f ∥∞ = supx∈[a,b]

|f (x)|



Sequences in Vector SpacesSince every normed vector space is also a metric space, we can considersequences in normed vector spaces without stating any new definitions.However, it may be useful to recall the definitions for the case of normedvector spaces.In the following, we assume that (V , ∥·∥) is a normed vector space.A sequence in a vector space V is a map (an) : N→ V . We say that (an)converges to a ∈ V if

∀ε>0

∃N∈N

∀n>N

: ∥an − a∥ < ε.

3.3.10. Example. Let V = R2 with ∥x∥ = max|x1|, |x2|. Then thesequence given by

an =

(n2

2 + n2,1

n

)n→∞−−−→ (1, 0).


Differential Calculus in One Variable Sequences of Real Functions



Vector Spaces


Series






Sequences of Functions

We will interest us in particular in sequences of functions. Thus we willconsider a sequence (fn), where for every n ∈ N, fn is a function.As anexample, take the sequence of continuous functions (fn) given by

fn : [−1, 1]→ R, fn(x) =

√1

(n + 1)2+ x2.

For every n ∈ N, fn ∈ C ([−1, 1]) (fn is continuous on R). We endow C (R)with the norm sup

x∈[−1,1]|f (x)|. Then fn → f as n→∞, where f (x) = |x |.

We shall see and prove this below.




-1 1

1



Sequences of FunctionsNow it seems clear that fn → f , f (x) = |x |, as stated. Mathematically,there are two distinct points of view:

1. pointwise convergence: For every x ∈ [−1, 1],

fn(x)n→∞−−−→ f (x) :⇔ |fn(x)− f (x)| n→∞−−−→ 0

2. uniform convergence: Each fn is an element of the normed vectorspace C ([−1, 1]), and so is f . Then

fnn→∞−−−→ f :⇔ ∥fn − f ∥∞

n→∞−−−→ 0.

The second approach is more natural: we are considering a sequence offunctions fn, not values of functions fn(x), so we should consider theproperties of fn over all of [−1, 1], not just separately those of fn(x) forevery x ∈ [−1, 1].




It is obvious that uniform convergence implies pointwise convergence forall x . In our example, we will prove that

√1/(n + 1)2 + x2 → |x |

uniformly for x ∈ [−1, 1], i.e,

∥fn − f ∥∞ = supx∈[−1,1]

∣∣√1/(n + 1)2 + x2 − |x |∣∣ n→∞−−−→ 0.

Note that we can take the supremum over [0, 1] instead of [−1, 1], sincefn(−x) = fn(x) and |−x | = |x |. Furthermore, it is sufficient to show that(

supx∈[−1,1]

∣∣√1/(n + 1)2 + x2 − |x |∣∣)2

= supx∈[−1,1]

∣∣√1/(n + 1)2 + x2 − |x |∣∣2 n→∞−−−→ 0.




Now∣∣√1/(n + 1)2 + x2 − |x |∣∣2 = 1

(n + 1)2+ 2x2 − 2x

√x2 + 1/(n + 1)2︸︷︷︸

< 0 if x > 0

.

It follows that the supremum is attained at x = 0 and

∥fn − f ∥2∞ = supx∈[−1,1]

∣∣√1/(n + 1)2 + x2 − |x |∣∣2 = 1

(n + 1)2n→∞−−−→ 0.

Thus fn → f uniformly (or as a sequence in (C ([−1, 1]), ∥ · ∥∞).



Sequences of FunctionsSummarizing, we can say that a sequence of continuous functions fn thatconverges uniformly in an interval [a, b] to some function f also convergespointwise to f :

supx∈[a,b]

|fn(x)− f (x)| n→∞−−−→ 0 ⇒ ∀x∈[a,b]

|fn(x)− f (x)| n→∞−−−→ 0

In general, we make the following definition:3.4.1. Definition. Let Ω ⊂ R and (fn) be a sequence of functionsfn : Ω → C. We say that the sequence (fn) converges pointwise to thefunction f : Ω → C if

∀x∈Ω|fn(x)− f (x)| n→∞−−−→ 0.

If f is the pointwise limit of (fn), we say that (fn) converges uniformly to fon Ω if

supx∈Ω|fn(x)− f (x)| n→∞−−−→ 0.




A sequence of continuous functions fn that converges pointwise to somefunction f does not need to converge uniformly to f .3.4.2. Example. The sequence (fn),

fn : [0, 1]→ R, fn(x) =

1− nx , 0 ≤ x ≤ 1/n,

0, otherwise,

converges to

f : [0, 1]→ R, f (x) =

1, x = 0,

0, otherwise,

pointwise, but not uniformly, as we now show.




Clearly, for every x ∈ (0, 1] there exists an N ∈ N such that fn(x) = 0 forall n > N, (Just take N = 1/x .) Furthermore, fn(0) = 1 for all n. Hencefn(x)→ f (x) for all x ∈ [0, 1].However,

supx∈[0,1]

|fn(x)− f (x)| ≥ |fn(1/(2n))− f (1/(2n))| = |1− n/(2n)− 0| = 1

2,

so supx∈[0,1]

|fn(x)− f (x)| → 0 as n→∞.

In general, when considering sequences of functions you should first findthe pointwise limit and then test whether the convergence is uniform.




In the previous example, the sequence of continuous functions fnconverged to the discontinuous function pointwise, but not uniformly. Thisno accident. In fact, a uniformly convergent sequence of continuousfunctions will always converge to a continuous function:3.4.3. Theorem. Let [a, b] ⊂ R be a closed interval. Let (fn) be asequence of continuous functions defined on [a, b] such that fn(x)converges to some f (x) ∈ R as n→∞ for every x ∈ [a, b]. If thesequence (fn) converges uniformly to the thereby defined functionf : [a, b]→ R, then f is continuous.




Proof.We need to show that f is continuous for all x ∈ [a, b]. We will here dealonly with x ∈ (a, b); the cases x = a and x = b are left to you.Let x ∈ (a, b). We will show that for any ε > 0 there exists a δ > 0 suchthat |h| < δ implies |f (x + h)− f (x)| < ε (for h so small thatx + h ∈ (a, b)). Fix ε > 0. Then there exists some N ∈ N such that

∥fn − f ∥∞ = supx∈[a,b]

|fn(x)− f (x)| < ε

3.

for all n > N. Since each fn is continuous on [a, b], there exists someδ > 0 such that |h| < δ implies

|fn(x)− fn(x + h)| < ε

3.



Completeness of C ([a, b])Proof.Let (fn) be a Cauchy sequence in C ([a, b]). We will show that lim

n→∞fn(x)

exists for every x ∈ [a, b] and that exists. First, by definition, for everyε > 0 we have

supx∈[a,b]

|fn(x)− fm(x)| < ε

for n,m sufficiently large. But then, for every fixed x ∈ [a, b], we have

|fn(x)− fm(x)| < ε

for n,m sufficiently large. This implies that for every x ∈ [a, b] thesequence or real numbers (fn(x)) is Cauchy. Since the real numbers arecomplete, (fn(x)) converges. Hence we can define the pointwise limit.

f (x) := limn→∞

fn(x) for every x ∈ [a, b].




We will be especially interested in a particular type of sequence offunctions, called power series. For example, we would like to find the limitof (fn) where

f0(x) = 1, f1(x) = 1 + x , f2(x) = 1 + x +x2

2!, ... , fn(x) =

n∑k=0

xk

k!.

Sequences which are formed by continually adding summands are calledseries, and we will have to start with studying their basic theory.


Differential Calculus in One Variable Series



Vector Spaces


Series






ConvergenceA series is “a sequence together with the wish to sum.” However, this maynot always be possible.3.5.1. Definition. Let (an) be a sequence in a normed vector space(V , ∥ · ∥). Then we say that (an) is summable with sum s ∈ V if

limn→∞

sn = s, sn :=n∑

k=0

ak .

We call sn the nth partial sum of (an). We use the notation∞∑k=0

ak (3.5.1)

to denote not only s, but also the “procedure of summing the sequence(an).” We call (3.5.1) an infinite series and we say that the seriesconverges if (an) is summable. If (sn) does not converge, we say that∑

ak diverges.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 357 / 624


The Geometric Series

3.5.2. Lemma. The geometric series∞∑k=0

zk , z ∈ C, converges to 1

1− zif

|z | < 1 and diverges if |z | > 1.

Proof.From

(1− z)n∑

k=0

zk =n∑

k=0

zk −n+1∑k=1

zk = 1− zn+1

we see thatn∑

k=0

zk =1− zn+1

1− z. (3.5.2)

As n→∞, |z |n+1 will diverge if |z | > 1 and converge to zero if|z | < 1.



The Koch Snowflake3.5.3. Example. The Koch snowflake is a simple example of a fractal curve.It is constructed as follows: one starts with an equilateral triangle:

One then successively adds an equilateral triangle to the middle third ofevery line segment.



The Koch Snowflake - First Iteration



The Koch Snowflake - Second Iteration



The Koch Snowflake - Third Iteration



The Koch Snowflake - Fourth Iteration



The Koch Snowflake - Fifth Iteration



The Koch Snowflake - Sixth Iteration



The Koch SnowflakeThe Koch snowflake is now defined as the limit as n→∞ of the nthiteration. The fractal nature of the curve becomes apparent when one triesto “zoom in” to a segment to analyze its structure:

This property is called self-similarity: any arbitrarily small segment of theKoch curve contains all the structural information of the entire curve.We are now interested in the total line length ln and the area An

circumscribed by the nth iteration of the Koch snowflake.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 366 / 624


Length of the Koch Snowflake

It is easy to see that in every iteration the length of the curve increases by1/3: effectively, one adds 1/3 of the length of each individual linesegment, so

ln+1 =4

3ln.

Hence, if the initial triangle has side length l0,

ln =

(4

3

)n

l0

When n→∞, the length of the nth iteration diverges to infinity, so theKoch curve has infinite length.



Area enclosed by the Koch SnowflakeNote that an equilateral triangle of side length l has area A = (

√3/4)l2.

To the area of the initial triangle we add that of three triangles of lengthl0/3, giving

A0 =

√3

4l20 , A1 =

√3

4l20 + 3 ·

√3

4

(l03

)2

.

The first iteration consists of 12 line segments and thereafter, the nthiteration will have 4 times as many as the previous iteration. There will bea total of 3 · 4n line segments in the nth iteration and the length of eachsegment is l0/3

n. This means that the nth iteration, n ≥ 1, will have anarea of

An =

√3

4l20 +

n∑k=1

3 · 4k−1

√3

4

(l03k

)2

=

√3

4l20

(1 +

3

4

n∑k=1

(4

9

)k)

=3

4A0

(1

3+

n∑k=0

(4

9

)k)

n→∞−−−→ 3

4A0

(1

3+

1

1− 4/9

)=

8

5A0.



Area enclosed by the Koch SnowflakeWe have

An =3

4A0

(1

3+

n∑k=0

(4

9

)k).

This is a geometric series, and when n→∞, we find that

limn→∞

An =3

4A0

(1

3+

1

1− 4/9

)=

8

5A0

Hence the Koch snowflake has 8/5 times the area of the original triangle.A finite area is hereby enclosed by a curve of infinite length.The theory of fractals owes much to the ideas of Benoit Mandelbrot, whopopularized the famous Mandelbrot fractal. Fractals play an importantrole in chaos theory and non-linear dynamics.



The Cauchy Criterion and its ConsequencesWe now return to the general theory of series. In particular, we want todevelop techniques to analyze the geometric series when |z | = 1.3.5.4. Cauchy Criterion. Let

∑ak be a series in a complete vector space

(V , ∥ · ∥). Then

∑ak converges ⇔ (sn)n∈N converges, sn =

n∑k=0

ak

⇔ (sn) is Cauchy⇔ ∀

ε>0∃

N∈N∀

m>n>N∥sm − sn∥ < ε

⇔ ∀ε>0

∃N∈N

∀m>n>N

∥∥∥ m∑k=n+1

ak

∥∥∥ < ε



The Cauchy Criterion and its Consequences

3.5.5. Corollary. If the series∑∞

k=0 ak converges, then the sequenceak → 0 as k →∞. (Take m = n + 1 in the Cauchy Criterion.)

3.5.6. Corollary. If the series∑∞

k=0 ak converges, then the sequence (An)given by

An :=∞∑k=n

ak

converges to 0 as n→∞. (Let m→∞ in the Cauchy Criterion.)

3.5.7. Example. The harmonic series∞∑k=1

1k is divergent. (Since

|s2n − sn| ≥ 1/2, the sequence of partial sums (sn) is not Cauchy.)



Absolute Convergence

3.5.8. Example. The geometric series∞∑k=0

zk , z ∈ C, diverges if |z | = 1. Tosee this, note that

limk→∞|zk | = lim

k→∞|z |k = 1,

so the sequence zk does not converge to zero. By Corollary 3.5.5, theseries can not converge.

3.5.9. Definition. A series∑

ak in a normed vector space (V , ∥ · ∥) iscalled absolutely convergent if

∑∥ak∥ converges.

A sequence (ak) in a normed vector space (V , ∥ · ∥) is called absolutelysummable if

∑ak converges absolutely.

3.5.10. Theorem. An absolutely convergent series∑

ak in a completevector space (V , ∥ · ∥) is convergent.




Proof.Let (ak) be a sequence with partial sum

sn =n∑

k=0

ak .

Then

∥sn − sm∥ =∥∥∥ n∑k=m+1

ak

∥∥∥ ≤ n∑k=m+1

∥ak∥. (3.5.3)

Since∑

ak converges absolutely,∑∥ak∥ is Cauchy, so for any ε > 0 there

exists N ∈ N such that∑n

k=m+1∥ak∥ < ε for all n,m > N. But by (3.5.3),this is also true for ∥sn − sm∥. Hence the sequence (sn) is Cauchy andconverges because (V , ∥ · ∥) is complete. Thus

∑ak converges.



Absolute ConvergenceThe concept of absolute convergence is useful in many contexts, such as

(i) Alternating series: A series of real numbers of the form∑∞k=0(−1)kak with ak > 0 for all k is called alternating. The series

will converge absolutely if∞∑k=0

|(−1)kak | =∞∑k=0

ak

converges.(ii) Complex series: The series

∑∞k=0 zk with zk = ak + ibk converges

absolutely if the series∞∑k=0

|zk | =∞∑k=0

√a2k + b2k

converges.




(iii) Series of functions: The series∑∞

k=0 fk of continuous functionsfk ∈ C (I ,C), I = [a, b] ⊂ R a closed interval, converges absolutely ifthe series

∞∑k=0

∥fk∥∞ =∞∑k=0

supx∈I|f (x)|

converges.Since the real numbers are complete, absolute convergence in (i) impliesconvergence of the alternating series to some real number. Similarly,absolute convergence of the complex sequence in (ii) implies convergenceto some complex number. The most interesting case from the point ofview of calculus is (iii): since

(C (I ,C), ∥ · ∥∞

)is complete by Theorem

3.4.4, absolute convergence implies convergence to some continuousfunction f ∈ C (I ,C).



Series of Functions3.5.11. Example. Let (fn) be the sequence of functions

fn :

[−1

2,1

2

]→ R, fn(x) = xn.

Clearly, fn ∈ C ([−12 ,

12 ]) for all n ∈ N. We want to find out if (fn) is

summable, i.e., if the series∞∑n=0

fn (3.5.4)

converges to a function f ∈ C ([−12 ,

12 ]). This would be the case if the

series (3.5.4) were to converge absolutely. We have∞∑n=0

∥fn∥∞ =∞∑n=0

sup−1/2≤x≤1/2

|xn| =∞∑n=0

1

2n= 2 <∞.

Hence (3.5.4) converges absolutely and therefore also converges to afunction f ∈ C ([−1/2, 1/2]).



Series of FunctionsIn this simple example we can actually calculate the pointwise limit of theseries:

f (x) =∞∑n=0

fn(x) =∞∑n=0

xn =1

1− x

for all x ∈ [−12 ,

12 ]. In many applications, however, we will not be able to

calculate the pointwise limit so easily. We will return to a discussion oflimits of series of functions below.

3.5.12. Lemma. Let (V , ∥ · ∥) be a complete normed vector space and∑ak an absolutely convergent series. Then∥∥∥ ∞∑

k=0

ak

∥∥∥ ≤ ∞∑k=0

∥ak∥. (3.5.5)



The “Infinite” Triangle Inequality

Proof.The triangle inequality yields, for any n ∈ N,∥∥∥ n∑

k=0

ak

∥∥∥ ≤ n∑k=0

∥ak∥ ≤∞∑k=0

∥ak∥. (3.5.6)

Since the series converges absolutely, it also converges and∥∥∥ ∞∑k=0

ak

∥∥∥ =∥∥∥ limn→∞

n∑k=0

ak

∥∥∥ = limn→∞

∥∥∥ n∑k=0

ak

∥∥∥since the norm is continuous (i.e., bn → b implies ∥bn∥ → ∥b∥. Why?).Since the limit exists, the estimate (3.5.6) implies (3.5.5).



The Comparison TestIn the rest of this section we will use the phrase “for sufficiently large k”to mean “for k > k0 for some k0 ∈ N”. For example, the statement“1/k < 0.001 for sufficiently large k” should be interpreted in this way.Many of the following criteria apply to series of positive real numbers.They are therefore suitable for establishing the absolute convergence of aseries.3.5.13. Comparison Test. Let (ak) and (bk) be real-valued sequences with0 ≤ ak ≤ bk for sufficiently large k. Then∑

bk converges ⇒∑

ak converges.

3.5.14. Remark. In applications one also often uses the contrapositive: if0 ≤ ak ≤ bk for sufficiently large k, then∑

ak diverges ⇒∑

bk diverges.



The Comparison TestProof.Let (ak) and (bk) be positive real sequences and denote by

Sn :=n∑

k=0

ak , sn :=n∑

k=0

bk , n ∈ N,

their partial sums. Since the series∑

bk converges, the sequence (sn)converges to

∑∞k=0 bk <∞ as n→∞.

Since ak ≤ bk for k > k0 form some k0 ∈ N, the partial sums of (ak) arebounded by

Sn =n∑

k=0

ak ≤ C +n∑

k=k0

bk ≤ C +∞∑k=0

bk

for some C > 0. Hence (Sn) is bounded above. Since (Sn) is obviouslyincreasing, the sequence converges. This means that

∑ak is

convergent.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 380 / 624


The Comparison Test

3.5.15. Example. The series∞∑k=1

(−1)k

k2k(3.5.7)

converges. To see this, we note that

ak :=

∣∣∣∣(−1)kk2k

∣∣∣∣ = 1

k

1

2k≤(1

2

)k

=: bk .

Since the geometric series∑∞

k=0 zk converges if z = 1/2, we know that∑∞

k=0 bk converges. By the comparison test,∑∞

k=0 ak converges, so(3.5.7) is absolutely convergent. Since the real numbers are complete, italso converges.



The Comparison Test

3.5.16. Example. The series∞∑k=1

k + 2

k(k + 1)(3.5.8)

diverges. To see this, we note that

bk :=k + 2

k(k + 1)≥ 1

k=: ak .

Since the harmonic series∑∞

k=11k diverges, so does

∑∞k=1

k+2k(k+1) . (See

Remark 3.5.14.)Many criteria for absolute convergence can be derived from thecomparison test. We will now study a few examples.



The Weierstraß M-test3.5.17. Weierstraß M-test. Let Ω ⊂ R and (fk) be a sequence of functionsdefined on Ω , fk : Ω → C, satisfying

supx∈Ω|fk(x)| ≤ Mk , k ∈ N (3.5.9)

for a sequence of real numbers (Mk). Suppose that∑

Mk converges.Then the limit

f (x) :=∞∑k=0

fk(x) exists for every x ∈ Ω.

Furthermore, the sequence (Fn) of partial sums

Fn(x) =n∑

k=0

fk(x)

converges uniformly to f .Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 383 / 624


The Weierstraß M-testProof.For every x ∈ Ω, |fk(x)| ≤ Mk , so

∑∞k=0|fk(x)| converges by the

comparison test. Thus, for every x ∈ Ω, the series of numbers∑∞

k=0 fk(x)converges absolutely. Since the complex numbers are complete, we deducethat

f (x) :=∞∑k=0

fk(x) exists for every x ∈ Ω.

For any x ∈ Ω,

|f (x)− Fn(x)| =∣∣∣f (x)− n∑

k=0

fk(x)∣∣∣ = ∣∣∣ ∞∑

k=n+1

fk(x)∣∣∣

≤∞∑

k=n+1

|fk(x)| ≤∞∑

k=n+1

Mk .



The Weierstraß M-test

Proof (continued).Hence,

supx∈Ω

∣∣∣f (x)− n∑k=0

fk(x)∣∣∣ ≤ ∞∑

k=n+1

Mk .

By Corollary 3.5.6, the right-hand side converges to zero as n→∞, so wehave

supx∈Ω

∣∣∣f (x)− n∑k=0

fk(x)∣∣∣ n→∞−−−→ 0.

This shows the uniform convergence.



Uniform Convergence of Series of Functions

3.5.18. Definition. Let Ω ⊂ R and (fk) be a sequence of functions definedon Ω, fk : Ω → C. We say that the series

∞∑k=0

fk

converges uniformly (to a function f : Ω → C) if the sequence of partialsums Fn =

∑nk=0 fk converges uniformly to f .



Uniform Convergence of Series of Functions3.5.19. Example. Consider the sequence of monomials (fn),

fn :

[−1

2,1

2

]→ R, fn(x) =

1

nxn, n ∈ N \ 0.

Sincesup

x∈[− 12, 12]

|fn(x)| =1

n2n<

1

2n=: Mn

and∑

Mn converges, by the Weierstraß M-test the series∑

fk convergesuniformly to

f :

[−1

2,1

2

]→ R, f (x) =

∞∑n=1

1

nxn.

At this time we are not able to find a more explicit expression for f .However, by Theorem 3.4.3 we know that f is a continuous function on[−1/2, 1/2].



The Root Test

A further consequence of the Comparison Test 3.5.13 is the following,quite fundamental criterion for convergence:3.5.20. Root Test. Let

∑ak be a series of positive real numbers ak ≥ 0.

(i) Suppose that there exists a q < 1 such that

k√ak ≤ q for all sufficiently large k.

Then∑

ak converges.(ii) Suppose that

k√ak > 1 for all sufficiently large k.

Then∑

ak diverges.



The Root Test

Proof.(i) Suppose that we have a q < 1 such that k

√ak ≤ q for all k ≥ k0.

Then

ak ≤ qk for k ≥ k0.


k=k0qk converges for q < 1, the series∑∞

k=k0ak converges by the Comparison Test 3.5.13.

(ii) Suppose that k√ak ≥ 1 for all k ≥ k0. Then ak ≥ 1, so ak → 0 as

k →∞. By Corollary 3.5.5, the series∑

ak can not converge.

3.5.21. Remark. Note that the existence of a q < 1 so that k√ak < q is

crucial; this is not the same as requiring k√ak < 1.



The Root Test3.5.22. Example. The series

∞∑k=0

kzk

converges absolutely if |z | < 1 and diverges if |z | > 1. This is easily seenfrom

k

√|kzk | = k

√k · |z |.

Since k√k → 1 as k →∞, we see that for sufficiently large k ,

k

√|kzk | > 1 if |z | > 1

and

k

√|kzk | < 1 if |z | < 1.



The Root Test3.5.23. Example. The root test does not always yield a result: Consider theseries

∞∑k=1

1

kand

∞∑k=1

1

k2.

In both cases, k√ak < 1 for all k, but there does not exist a constant q < 1

such that k√ak < q. We have seen that the harmonic series

∞∑k=1

1

k

diverges. We will see below that the Euler series∞∑k=1

1

k2

converges.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 391 / 624


The Root Test using Limits3.5.24. Root Test. Let ak be a sequence of positive real numbers ak ≥ 0.Then

limk→∞

k√ak < 1 ⇒

∞∑k=0

ak converges,

limk→∞

k√ak > 1 ⇒

∞∑k=0

ak diverges.

3.5.25. Remarks.(i) No statement is possible if lim

k→∞k√ak = 1.

(ii) If limk→∞

k√ak exists, it equals lim

k→∞k√ak . This will be the case in many

applications.



The Root Test using LimitsProof.Recall that for a bounded sequence (xk)

limk→∞

xk := limn→∞

supxn, xn+1, xn+2, ....

Suppose that limk→∞

k√ak =: p < 1. Then there exists an N ∈ N such that

∣∣sup n√an, n+1

√an+1, ... − p

∣∣ < 1− p

2for all n ≥ N.

This implies

sup n√an, n+1

√an+1, ... < p +

1− p

2=: q < 1

for all n ≥ N, so the series converges by the Root Test 3.5.20. This provesthe first statement. The proof of the second statement is analogous.



The Ratio TestThe Root Test is extremely powerful, but sometimes quite difficult tohandle. A more convenient criterion for convergence is provided by theratio test:3.5.26. Ratio Test. Let

∑ak be a series of strictly positive real numbers

ak > 0.(i) Suppose that there exists a q < 1 such that

ak+1

ak≤ q for all sufficiently large k.

Then∑

ak converges.(ii) Suppose that

ak+1

ak≥ 1 for all sufficiently large k.

Then∑

ak diverges.



The Ratio Test

Proof.(i) Suppose that we have a q < 1 such that ak+1/ak ≤ q for all k ≥ k0.

Then

ak ≤ qak−1 ≤ q2ak−2 ≤ ak0 · qk−k0 for k ≥ k0.


k=k0qk converges for q < 1, the series∑∞

k=k0ak converges by the Comparison Test 3.5.13.

(ii) Suppose that akak−1≥ 1 for all k ≥ k0. Then ak ≥ ak−1, so ak → 0 as

k →∞. By Corollary 3.5.5, the series∑

ak can not converge.



The Ratio Test3.5.27. Example. Consider the series

∞∑n=0

4n

n!.

Applying the root test would be quite tedious, because we don’t know hown√n! behaves. However,

4n+1

(n + 1)!

n!

4n=

4

n≤ 4

5

if n ≥ 5, so the series converges.The ratio test is easier to apply than the root test. However, it is formallynot as powerful: there are series whose convergence or divergence can notbe determined using the ratio test, but where the root test shows theconvergence. An example will be given in the assignments.



The Ratio Test using Limits

3.5.28. Ratio Test. Let (ak) be a sequence of strictly positive real numbersak > 0. Then

limk→∞

ak+1

ak< 1 ⇒

∞∑k=0

ak converges,

limk→∞

ak+1

ak> 1 ⇒

∞∑k=0

ak diverges.

The proof is left to you; note the inferior limit in the condition fordivergence!



The Ratio Comparison Test

We will now try to derive a finer version of the ratio test that will enableus to distinguish the divergence of the harmonic series from theconvergence of the Euler series. A first step is the following criterion:3.5.29. Ratio Comparison Test. Let (ak) and (bk) be sequences of strictlypositive real numbers ak , bk > 0. Suppose that

∑bk converges. If

ak+1

ak≤ bk+1

bkfor sufficiently large k,

then∑

ak converges.



The Ratio Comparison Test

Proof.The hypothesis ak+1

ak≤ bk+1

bkis equivalent to

ak+1

bk+1≤ ak

bk.

Hence the sequence (ak/bk) is decreasing and positive (since ak , bk > 0for all k).Thus (ak/bk) converges and is therefore bounded: there existssome M > 0 such that

akbk

< M ⇔ ak < M · bk .

Since∑

bk converges by assumption, so does∑

(Mbk). But then∑

akconverges by the Comparison Test 3.5.13.



The p-SeriesWe are interested in studying the so-called p-series,

∞∑k=1

1

kp, p ∈ R.

Clearly, for p ≤ 0 the series diverges, because it is not the sum of asequence converging to zero.The root and ratio tests do not yield information on the convergence, sowe will have to treat this series separately. First, we prove a preliminaryresult.3.5.30. Lemma. Denote by sn(p) the nth partial sum of the p-series. Forp > 0,

1 +1

2p−1sn(p)−

1

2p< s2n(p) < 1 +

1

2p−1sn(p)



The p-Series

Proof.By definition,

s2n(p) = 1 +1

2p+

1

3p+ · · ·+ 1

(2n)p

= 1 +( 1

2p+

1

4p+ · · ·+ 1

(2n)p

)+( 1

3p+

1

5p+ · · ·+ 1

(2n − 1)p

)> 1 +

1

2psn(p) +

( 1

4p+

1

6p+ · · ·+ 1

(2n)p

)where we have used that p > 0 in the last step. Thus

s2n(p) > 1 +1

2psn(p)−

1

2p+

1

2psn(p) = 1 +

1

2p−1sn(p)−

1

2p.



The p-Series

Proof (continued).Similarly,

s2n(p) = 1 +( 1

2p+

1

4p+ · · ·+ 1

(2n)p

)+( 1

3p+

1

5p+ · · ·+ 1

(2n − 1)p

)< 1 +

1

2psn(p) +

( 1

2p+

1

4p+ · · ·+ 1

(2n − 2)p

)+

1

(2n)p

< 1 +2

2psn(p) = 1 +

1

2p−1sn(p)

We can now state our main result:

3.5.31. Theorem. The p-series∞∑k=0

1kp converges when p > 1 and diverges

when p ≤ 1.



The p-SeriesProof.We only need to consider the case p > 0. Assume first that 0 < p ≤ 1 andthat the p-series converges, i.e., lim

n→∞sn(p) =: s(p) exists. Then the

estimate of Lemma 3.5.30 gives

1 +1

2p−1s(p)− 1

2p=

2p − 1

2p+

2

2ps(p) ≤ s(p).

and hence0 <

2p − 1

2p≤ 2p − 2

2ps(p) ≤ 0

which is a contradiction. Thus the series diverges for p ≤ 1. Now forp > 1 we have

sn(p) < s2n(p) < 1 + 21−psn(p),

so sn(p) < 1/(1− 21−p). Since (sn(p)) is increasing, the sequence ofpartial sums converges.



Raabe’s TestWe can use our previous result to obtain a finer version of the ratio test:3.5.32. Raabe’s Test. Let

∑ak be a series of positive real numbers

ak ≥ 0. Suppose that there exists a number p > 1 such thatak+1

ak≤ 1− p

kfor sufficiently large k.

Then the series∑

ak converges.

3.5.33. Example. The series∞∑n=1

(2n)!

4n(n + 1)!n!

is convergent by Raabe’s test.



Raabe’s Test

Proof.We first observe that if p > 1 and 0 ≤ x < 1, then

1− px ≤ (1− x)p. (3.5.10)

At x = 0 we have equality, while the derivative of the left-hand side is lessthan the derivative of the right-hand side for x ∈ (0, 1). We now apply(3.5.10) with x = 1/k to our hypothesis: for sufficiently large k,

ak+1

ak≤ 1− p

k≤ (1− 1/k)p =

(k − 1

k

)p

=bk+1

bk,

where bk := 1/(k − 1)p. Since∑

bk converges, it follows from the RatioComparison Test 3.5.29 that

∑ak converges.



Absolutely and Conditionally Convergent Series

All of the tests that we have discussed essentially test series for absoluteconvergence. In complete vector spaces, this implies ordinary convergence.But what is more, absolutely convergent series have properties that seriesthat converge without being absolutely convergent do not have.3.5.34. Definition. A series in a normed vector space (V , ∥ · ∥) is calledconditionally convergent if it is convergent, but not absolutely convergent.

3.5.35. Example. We will see later that∑

(−1)k 1k is conditionally

convergent.



Rearrangements of Terms in Series

For the proof of the following two theorems we refer to the text book(Spivak, Chapter 23, Theorems 7 & 8).3.5.36. Theorem. Assume that

∑ak is an absolutely convergent series in a

complete normed vector space. If the summands of the series arerearranged, the new series

∑bj , bj = ak(j), k : N→ N bijective, converges

absolutely with the same sum as∑

ak .In particular, if

∑ak and

∑bk are absolutely convergent, then∑

(ak + bk) is absolutely convergent and equal to∑

ak +∑

bk .Contrast this with the following result:3.5.37. Theorem. Let

∑ak be a conditionally convergent series of real

numbers. Then for any α ∈ R there exists a rearrangement bj = ak(j),k : N→ N bijective, of

∑ak such that

∑bj = α.



The Leibniz TheoremFor this reason, we are mostly interested in absolutely convergent series.However, we give one result concerning conditionally convergent series.3.5.38. Leibniz Theorem. Let

∑αk be a complex series whose partial

sums are bounded but need not converge. Let (ak) be a decreasingconvergent sequence with limit zero, ak 0. Then the series∑

αkak converges.

A typical application of the Leibniz Theorem 3.5.38 are alternating series,for which αk = (−1)k . In particular, the Leibniz series

∞∑k=1

(−1)k

k

converges by the Leibniz Theorem.



Summation by Parts3.5.39. Lemma. Let (ak) and (bk) denote complex sequences. Then form > n

m∑k=n+1

ak(bk − bk−1) = am+1bm − an+1bn −m∑

k=n+1

(ak+1 − ak)bk .

(3.5.11)

Proof.m∑

k=n+1

ak(bk − bk−1) =m∑

k=n+1

akbk −m∑

k=n+1

akbk−1

=m∑

k=n+1

akbk −m−1∑k=n

ak+1bk

=m∑

k=n+1

(ak − ak+1)bk + am+1bm − an+1bn



The Leibniz Theorem

Proof of Theorem 3.5.38.Denote by (sn) the sequence of partial sums of (αk),

sn =n∑

k=0

αk .

For k ≥ 1, αk = sk − sk−1. Hence, applying (3.5.11),n∑

k=0

αkak =n∑

k=1

(sk − sk−1)ak + α0a0

= α0a0 + snan+1 − a1α0 −n∑

k=1

(ak+1 − ak)sk

We will show that the limit as n→∞ of the right-hand side exists.



The Cauchy Product and Convolution of Sequences

3.5.40. Theorem. Let∑

ak and∑

bk be absolutely convergent series.Then the Cauchy product

∑ck given by

ck :=∑

i+j=k

aibj

converges absolutely and∑

ck =(∑

ak

)(∑bk

).

3.5.41. Remark. If a = (ak) and b = (bk) are two absolutely summablesequences, the sequence

a ∗ b := (ck), ck :=∑

i+j=k

aibj ,

is called the convolution of a and b.



The Cauchy Product and Convolution of SequencesProof.Let

An :=n∑

k=0

ak , Bn :=n∑

k=0

bk , Cn :=n∑

k=0

ck .

We will show that limn→∞|Cn − AnBn| = 0. Now

|Cn − AnBn| =∣∣∣ n∑k=0

∑i+j=k

aibj −n∑

i ,j=0

aibj

∣∣∣ ≤ ∑i+j>n0≤i ,j≤n

|aibj | =: Rn

If i + j > n, at least one of i and j will be greater than n/2. Thus

Rn ≤∑

n/2≤i≤n

|ai |n∑

j=0

|bj |+n∑

i=0

|ai |∑

n/2≤j≤n

|bj |



The Cauchy Product and Convolution of Sequences

Proof (continued).We have

Rn ≤∑n/2≤i

|ai |︸︷︷︸→0

·n∑

j=0

|bj |︸︷︷︸bounded

+n∑

i=0

|ai |︸︷︷︸bounded

·∑n/2≤j

|bj |︸︷︷︸→0

→ 0 as n→∞.

This implies that∑

ck =(∑

ak

)(∑bk

). Furthermore,

∑ck is

absolutely convergent, because∑|ck | ≤

(∑|ak |)(∑

|bk |)<∞.


Differential Calculus in One Variable Real and Complex Power Series



Vector Spaces


Series






Formal Power Series3.6.1. Definition. For any complex sequence (ak), the expression

∞∑k=0

ak zk

is called a (formal) complex power series.

3.6.2. Definition. Let∑

ak zk and

∑bk z

k be two complex power series.Then we define their sum

∞∑k=0

ak zk +

∞∑k=0

bk zk :=

∞∑k=0

(ak + bk) zk

and their (Cauchy) product(∑ak z

k)(∑

bk zk):=(∑

(a ∗ b)k zk)

where a = (ak), b = (bk) and ∗ denotes convolution of sequences.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 416 / 624


Convergent Power Series

3.6.3. Remark. We have defined the formal power series without referenceto what zn actually is, and we have introduced no notion of convergence.Hence a formal power series is little different from a sequence.Our goal is to eventually treat a power series as a polynomial of infinitedegree. We will analyze the convergence of complex power series for z ∈ C.

3.6.4. Definition. A formal power series∑

ak zk is said to be is (absolutely)

convergent at z0 ∈ C if the series∞∑k=0

akzk0 converges (absolutely).

3.6.5. Remark. For power series that are absolutely convergent at z0 theaddition and multiplication of Definition 3.6.2 coincide with the additionand multiplication of absolutely convergent series.



Convergent Power Series3.6.6. Definition and Theorem. Let

∑akz

k be a complex power series.Then there exists a unique number ϱ ∈ [0,∞] such that

(i)∑

akzk is absolutely convergent if |z | < ϱ,

(ii)∑

akzk diverges if |z | > ϱ.

This number is called the radius of convergence of the power series. It isgiven by Hadamard’s formula,

ϱ =

1

lim k√

|ak |, 0 < lim

k→∞k√|ak | <∞,

0, limk→∞

k√|ak | =∞,

∞, limk→∞

k√|ak | = 0.

(3.6.1)

3.6.7. Remark. No information is given about the convergence of∑

akzk

if |z | = ϱ. In these cases the series may converge or diverge, depending onthe point z .



Radius of Convergence

3.6.8. Example.∞∑k=0

(−1)k

k zk has ϱ = 1. The series converges for z = 1 anddiverges for z = −1.

Proof.It is sufficient to prove that ϱ defined by Hadamard’s formula (3.6.1) hasthe properties (i) and (ii) (Clearly, no other number can have theseproperties.)We first prove (i) Let ϱ > 0 and |z | < ϱ. We can assume that ϱ <∞ (thecase ϱ =∞ is left to the reader). Then

limk→∞

k

√|akzk | = |z | lim

k→∞k√|ak | =

|z |ϱ< 1,

so∑

akzk converges absolutely by the root test.




Proof (continued).Next we prove (ii) Let |z | > ϱ. We consider two cases:

ϱ = 0:Then limk→∞

k√|ak | =∞, so

limk→∞

k


k→∞k√|ak | =∞,

so∑

akzk diverges.

0 < ϱ <∞:Then

limk→∞

k


k→∞k√|ak | =

|z |ϱ> 1.

so∑

akzk diverges.




3.6.9. Lemma. Let∑

akzk be a complex power series with radius of

convergence ϱ. Then the series∑

kakzk−1 has the same radius of

convergence ϱ.

Proof.It is sufficient to show that the series∑

kakzk = z ·

∑kakz

k−1

has radius of convergence ϱ. We consider here only the case 0 < ϱ <∞.The cases ϱ = 0 and ϱ =∞ are left to you. In that case,

ϱ =1

limk→∞

k√|ak |

.



Radius of ConvergenceProof (continued).Suppose that |z | < ϱ. Then

1 >|z |ϱ

= limk→∞

k

√|ak ||z |k = lim

k→∞k

√|kakzk |/k.

This implies that there exists a q < 1 such that

k

√|kakzk |/k ≤ q and, hence, |kakzk | ≤ kqk

for sufficiently large k. By Example 3.5.22 the series∑

kqk converges, sothe Comparison Test 3.5.13 shows that

∑kakz

k converges absolutely.Similarly, if |z | > ϱ, then lim

k→∞k√|kakzk |/k > 1 and |kakzk | > kpk for

some p ≥ 1. This shows that∑

kakzk diverges. Therefore, the radius of

convergence of∑

kakzk must be ϱ.



Uniform Convergence of Power Series

Recall from Definition 3.5.18 that a series∑∞

k=0 fk of functions fk : Ω→ Cconverges uniformly to a function f if

supx∈Ω

∣∣∣f (x)− n∑k=0

fk(x)∣∣∣ n→∞−−−→ 0.

In that case, the function f is continuous by Theorem 3.4.3. We nowapply these general results to power series, where

fk(x) = akxk .



Uniform Convergence of Power Series3.6.10. Lemma. If

∑akz

k is a complex power series with radius ofconvergence ϱ, then for any R < ϱ the series converges uniformly onBR(0) = z : |z | < R.

Proof.Let 0 < R < ϱ be fixed. Then

supz∈BR(0)

|akzk | < |ak |Rk =: Mk

Now∞∑k=0

Mk =∞∑k=0

|ak |Rk

converges, because the point R < ϱ lies within the radius of convergenceof the series and the series convergences absolutely within its radius ofconvergence. By the Weierstraß M-test 3.5.17, the series

∑akz

k offunctions akz

k : BR(0)→ C converges uniformly on BR(0).Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 424 / 624


Continuity of Power Series3.6.11. Corollary. A power series

∑∞k=0 akx

k with radius of convergence ϱdefines a continuous function

f : Bϱ(0)→ C, f (z) =∞∑k=0

akzk .

Proof.Let z ∈ C have modulus |z | < ϱ. Define R := |z |+ (ϱ− |z |)/2 < ϱ. Thenthe power series converges uniformly on BR(0) and is hence continuous onBR(0), which includes the point z .

3.6.12. Example. The power series∑∞

k=0(−1)kxk converges uniformly on[−R,R] for any 0 < R < 1. However, it does not converge uniformly on(−1, 1). (Why?) However, the series defines a continuous function on theinterval (−1, 1).



Differentiability of Power Series3.6.13. Remark. The definition of the derivative can be extended tofunctions of a single complex variable; in particular, if Ω ⊂ C is an openset, we say that a function f : Ω→ C is differentiable at z ∈ Ω if thereexists a number f ′(z) ∈ C such that

f (z + h) = f (z) + f ′(z)h + o(h), h→ 0.

3.6.14. Theorem. The real or complex power series f (z) :=∑

akzk with

radius of convergence ϱ defines a differentiable function f : Bϱ(0)→ C.Furthermore,

f ′(z) =∑

kakzk−1 (3.6.2)

where the series has the same radius of convergence as f .



Differentiability of Power SeriesProof.The statement that the series (3.6.2) has the same radius of convergenceas f was established in Lemma 3.6.9. Define

g(z) :=∞∑k=1

kakzk−1

and set

f (z) = SN(z) + EN(z), SN(z) :=N∑

k=0

akzk , EN(z) :=

∞∑k=N+1

akzk .

Fix z0 ∈ Br (0) with r < ϱ. We want to show that for any ε > 0 thereexists a δ > 0 so that∣∣∣∣ f (z0 + h)− f (z0)

h− g(z0)

∣∣∣∣ < ε whenever |h| < δ.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 427 / 624


Differentiability of Power Series

Proof (continued).This is enough to show that f is differentiable at z0 and f ′(z0) = g(z0).Fix ε > 0 and choose some δ such that |z0 + h| < r if h ∈ Bδ(z0).

f (z0 + h)− f (z0)

h− g(z0) =

(SN(z0 + h)− SN(z0)

h− S ′

N(z0)

)+(S ′N(z0)− g(z0)

)+

EN(z0 + h)− EN(z0)

h

We will show that each of the three terms on the right can be made smallso that the modulus of the difference on the left is less than ε. We startwith ∣∣∣∣EN(z0 + h)− EN(z0)

h

∣∣∣∣ ≤ ∞∑k=N+1

|ak ||h|∣∣(z0 + h)k − zk0

∣∣.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 428 / 624



Proof (continued).Using that for k ∈ N \ 0 and a, b ∈ C,

ak − bk = (a− b)k−1∑i=0

aibk−1−i

and |z0|, |z0 + h| < r , we have∣∣∣∣EN(z0 + h)− EN(z0)

h

∣∣∣∣ ≤ ∞∑k=N+1

k|ak |rk−1. (3.6.3)

Since the series∑

k |ak |rk converges, we can choose N1 large enough sothat the right-hand side of (3.6.3) is less than ε/3 for N > N1.




Proof (continued).Since lim

N→∞S ′N(z0) = g(z0) we can also find N2 > 0 such that

|S ′N(z0)− g(z0)| <

ε

3

for N > N2. Let us now fix N > max(N1,N2). Finally, because SN isdifferentiable (a polynomial), we can find a δ > 0 small enough so that∣∣∣∣SN(z0 + h)− SN(z0)

h− S ′

N(z0)

∣∣∣∣ < ε

3

for |h| < ε.



Some Applications

An application of our theoretical developments on power series is theability to define a decimal expansion for any real number a ∈ R. We wantto write

a = a0.a1a2a3 ...

where a0 ∈ Z and a1, a2, ... ∈ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. We realize this bydefining

a = a0 +∞∑k=1

ak10−k .

How this can be used as a definition for the set of real numbers is sketchedout in Chapter 29, Exercise 2 of the textbook.



The Van der Waerden Function

We will now construct the continuous and nowhere-differentiable Van derWaerden function. We use the “distance to the nearest integer” functiondefined in (2.3.1),

· : R→ [0, 1/2], x := min|x − ⌈x⌉|, |x − ⌊x⌋|.

The function is sketched below:

-4 -3 -2 -1 0 1 2 3

x

1

2

y



The Van der Waerden FunctionWe now define a sequence of functions (fn)n∈N by

fn(x) := 10−n10nx.

The graphs of the first few functions are plotted below:

0 1

x

1

2

y

y = f0 HxL

0 1

x

1

20

y

y = f1 HxL




We then setW (x) :=

∞∑n=0

fn(x).

From the definition of the fn it is immediate that

supx∈R|fn(x)| =

1

2· 10−n

so by the Weierstraß M-Test 3.5.17 we see that W defines a continuousreal function on R.




An approximation to the function W is plotted below:



The Van der Waerden FunctionA closer look at the graph on the interval (0, 1/10) shows the fractalnature of the function:



The Van der Waerden Function3.6.15. Theorem. The Van der Waerden function

W : R→ R, W (x) =∞∑n=0

10−n10nx

is continuous but not differentiable at any point of its domain.

Proof.Due to the periodicity of W , it suffices to show the assertion for pointsa ∈ (0, 1]. We have already established the continuity of W , so it remainsto show that W is not differentiable anywhere. We will do this by fixinga ∈ (0, 1] and exhibiting a sequence (hm) with hm → 0 as m→∞ suchthat

limm→∞

W (a+ hm)−W (a)

hm(3.6.4)

does not exist.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 437 / 624



Proof (continued).Suppose that a has the decimal expansion

a = 0.a1a2a3 ...

where am ∈ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, m ∈ N \ 0. We then define

hm :=

−10−m if am = 4 or am = 9,

10−m otherwise.

Then

W (a+ hm)−W (a)

hm=

∞∑n=0

±10m−n(10n(a+ hm) − 10na

)(3.6.5)

where the “±” indicates a “+” or “−” sign, depending on m.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 438 / 624


The Van der Waerden FunctionProof (continued).The series in (3.6.5) is actually finite, because for n ≥ m the term10nhm ∈ N and then 10n(a+ hm) = 10na. When n < m we have

10na = A+ 0.an+1an+2 ... am−1amam+1 ... , (3.6.6)10n(a+ hm) = A+ 0.an+1an+2 ... am−1(am ± 1)am+1 ... (3.6.7)

where A ∈ N and the “±” again depends on m. Our choice of the minussign when am = 9 is essential to ensure that (3.6.5) holds.Now suppose that the non-integer part on the right of (3.6.6), 10na− A,is less than 1/2. Then also the non-integer part on the right of (3.6.7) willbe less than 1/2. (For this to be true even when m = n + 1, it is essentialthat we choose the minus sign for hm when am = 4.) It follows that

10n(a+ hm) − 10na = ±10n−m when n < m. (3.6.8)



The Van der Waerden FunctionProof (continued).It is not difficult to see that (3.6.8) also holds when 10na− A ≥ 1/2. Wehence have

10m−n(10n(a+ hm) − 10na

)= ±1 when n < m.

Then (3.6.5) becomes

W (a+ hm)−W (a)

hm=

m∑n=0

±1 =: Sm.

Now the sequence (Sm) satisfies

|Sm+1 − Sm| = 1,

for all m ∈ N. Thus, it is not a Cauchy sequence and can not converge. Itfollows that the limit (3.6.4) does not exist.


Differential Calculus in One Variable The Exponential Function



Vector Spaces


Series






The Exponential FunctionWe can now introduce one of the most fundamentally important functionsin calculus:3.7.1. Definition. We define the exponential function

exp: C→ C, exp z :=∞∑n=0

zn

n!. (3.7.1)

We similarly define its restriction to the real numbers,

exp: R→ R, exp x :=∞∑n=0

xn

n!. (3.7.2)

The series (3.7.1) converges for all z ∈ C (ϱ =∞), so it defines acontinuous function for all z ∈ C (x ∈ R).We will now investigate the various properties of the exponential function.



The Exponential FunctionThe functional equation for the exponential function is

(exp x)(exp y) = exp(x + y), x , y ∈ C,

which can be shown using the Cauchy product as follows.

(exp x)(exp y) =( ∞∑n=0

xn

n!

)( ∞∑m=0

ym

m!

)=

∞∑n=0

((xkk!

)∗(ykk!

))n

=∞∑n=0

∑l+m=n

1

l!m!x lym

=∞∑n=0

1

n!

∑l+m=n

n!

l!m!x lym

=∞∑n=0

1

n!(x + y)n

= exp(x + y)




Note that exp(0) = 1 from the series representation

exp(0) = 1 + 0 +02

2!+ · · · = 1.

We next show that exp(z) = 0 for all z ∈ C. Assume that exp(z0) = 0 forsome z0 ∈ C. Then

1 = exp(0) = exp(z0 − z0) = exp(z0) exp(−z0) = 0 · exp(−z0) = 0,

which is impossible. Since exp(0) = 1 and exp is continuous, we candeduce from the intermediate value theorem that

exp(x) > 0 for all x ∈ R.



The Exponential FunctionWe can also calculate the derivative of the exponential function:

exp(x+h) = exp(x) exp(h) = exp(x)(1+h+o(h)

)= exp(x)+exp(x)h+o(h),

soexp′(x) = exp(x).

In fact, it turns out that exp is the only function satisfying the initial valueproblem

y ′(x) = y(x), y(0) = 1. (3.7.3)

Here the equation on the left is called a differential equation; its solution isa function y such that the derivative of y is equal to y . The secondequation is called an initial condition, because it specifies the value of y atfor some x0 ∈ R.



The Exponential FunctionIf an initial value problem has a unique solution, it can of course be usedto define this solution. Thus we could use (3.7.3) to define exp, if we wereable to prove that there exists a unique solution to (3.7.3). Since we havealready defined exp, this is not really necessary. Nevertheless, it isinstructive to prove the uniqueness of the solution.Assume that two functions f , g both solve (3.7.3). Then(

f (x) exp(−x))′= (f ′(x)− f (x)) exp(−x) = 0 =

(g(x) exp(−x)

)′Now by Corollary 3.2.9 f (x) exp(−x) = g(x) exp(−x) + C for someC ∈ R. Setting x = 0 we see

C = f (0) exp(0)− g(0) exp(0) = 1 · 1− 1 · 1 = 0,

so f (x) exp(−x) = g(x) exp(−x) or (multiplying with exp(x)) f = g .Thus exp is the only function satisfying (3.7.3).




The function exp: R→ R satisfies

exp′′ = exp′ = exp > 0,

so the real exponential function is increasing and convex. There are nolocal extrema. Furthermore,

exp(x) > 1 + x for x > 0,

so limx→∞

exp(x) =∞. Since

1 = exp(0) = exp(x − x) = exp(x) exp(−x) ⇔ exp(−x) = 1/ exp(x),

we see that limx→−∞

exp(x) = limx→∞

exp(−x) = 0.



The Exponential FunctionWe thus obtain the following plot of the exponential function:

x

1

y

y = exp HxL



Compound InterestWe now analyze a problem that will turn out to have an intrinsicconnection to the exponential function and the Euler number. Considerthe question of compound interest. If money is left in a bank savingsaccount for one year, the bank will return (1 + z) times the originalamount after the year is over. (For example, at 5% interest, z = 0.05.)However, if the money is withdrawn after half a year, tthe bank will return(1 + z/2) times the investment. If this money is then left with the bankfor another half-year, the total amount returned is(

1 +z

2

)2> 1 + z

times the original investment. Thus it seems advantageous to withdrawmoney after the nth part of a year and replace it in the savings accountimmediately. Repeating this n times, one earns(

1 +z

n

)ntimes the original investment.



Compound InterestOne would perhaps like to choose n as large as possible. The questionarises: will the return approach a limit? The answer is given by thefollowing proposition:

3.7.2. Proposition. Let z ∈ C. Then limn→∞

(1 +

z

n

)n= exp z

In real life, banks have taken this into account, and publish the effectiveannual interest rate, also called the Annual Equivalent Rate (AER) whichtakes into account the compound interest.Proof.From the functional equation for the exponential function we have

exp(z)−(1 +

z

n

)n=(exp(zn

))n︸︷︷︸

=:an

−(1 +

z

n

)n︸︷︷︸

=:bn

.



Compound Interest

Proof (continued).Since

an − bn = (a− b)n−1∑k=0

akbn−1−k

︸︷︷︸=:ϱ

we analyze first a− b:

a− b = expz

n−(1 +

z

n

)=

z

n

∞∑k=1

1

(k + 1)!

(zn

)k=

z

n· o(1)

as n→∞.



Compound InterestProof (continued).Furthermore,

|ϱ| ≤n−1∑k=0

∣∣∣exp z

n

∣∣∣k ∣∣∣1 + z

n

∣∣∣n−1−k

≤n−1∑k=0

(exp|z |n

)k (1 +|z |n

)n−1−k

≤n−1∑k=0

(exp|z |n

)k (exp|z |n

)n−1−k

= n

(exp|z |n

)n−1

≤ n exp(|z |)



The Euler NumberProof (continued).Hence ∣∣∣exp(z)− (1 + z

n

)n∣∣∣ ≤ |a− b| · |ϱ| ≤ |z |nn exp(|z |) · o(1)→ 0

as n→∞.

3.7.3. Definition. The number

e := exp(1)

is called the Euler number.It can be proven that e is irrational and transcendental (not the solution ofan algebraic equation). An approximate value is

e ≈ 2.7182818284590



The Euler Number3.7.4. Lemma. The Euler number is the monotonic limit of the sequencein Proposition 3.7.2, i.e.,(

1 +1

n

)n

e as n→∞.

Proof.By the binomial theorem,(

1 +1

n

)n

=n∑

k=0

n!

k!(n − k)!nk=

n∑k=0

1

k!

n

n

n − 1

n...

n − k + 1

n

=n∑

k=0

1

k!

(1− 1

n

)...

(1− k − 1

n

)

<

n+1∑k=0

1

k!

(1− 1

n + 1

)...

(1− k − 1

n + 1

)=

(1 +

1

n + 1

)n+1

.



The Euler Number

3.7.5. Proposition. For any rational number q ∈ Q, eq = exp(q).

Proof.We first show that en = exp(n) for n ∈ N by induction. For n = 0 we havee0 = 1 = exp(0). Furthermore,

exp(n + 1) = exp(n) · exp(1) = en · e = en+1.

Moreover, for n ∈ N we have exp(n) exp(−n) = 1, so

exp(−n) = 1

exp(n)=

1

en=: e−n.

Thus the statement holds for all n ∈ Z.



The Euler Number

Proof (continued).Now let x = p/q with p ∈ Z and q ∈ N \ 0. Then(

expp

q

)q

= exp p = ep

and thusexp

p

q=

q√ep =: ep/q.



Real Exponents, LogarithmThe exponential function is continuous (even C∞) on R, and for rational xcoincides with ex . We do not yet have a definition for ex when x is real,so it is logical to define

ex := exp x for x ∈ R

We say that exp x is a continuous extension of ex to the real numbers. Itis automatically the only such extension.We have seen that the function exp: R→ R+ (R+ := x ∈ R : x > 0) isincreasing and hence bijective. Thus there exists an inverse function,which we call the (natural) logarithm and denote by ln : R+ → R.

3.7.6. Lemma. The logarithm satisfies the functional equation

ln(u · v) = ln(u) + ln(v).

for all u, v ∈ R+



Real Exponents, LogarithmProof.Let u = ex , v = ey . Then

ln(u · v) = ln(ex · ey ) = ln(ex+y ) = x + y = ln(u) + ln(v).

In a similar manner we can use the properties of the exponential functionto prove

ln1

u= − ln(u), ln(up/q) =

p

qln(u) (p ∈ Z, q ∈ N \ 0).

This motivates the following definition:3.7.7. Definition. For all a > 0 and x ∈ R we define

ax := ex ln(a).



Real Exponents, LogarithmWe immediately obtain

ln ax = x ln a, (ax)y = e ln[(ax )y ] = eyx ln a = axy , ax+y = axay

and so on for all a > 0 and x , y ∈ R.3.7.8. Remark. Since

ex = 1 + x +x2

2!+ ... ,

we see that ex > xn/n! for any n and x > 0. In particular,limx→∞

ex/xn =∞ for any n ∈ N:

ex

xn>

1

(n + 1)!

xn+1

xn=

x

(n + 1)!x→∞−−−→∞.



Real Exponents, Logarithm3.7.9. Theorem. For any α > 0 we have

limx→∞

ln x

xα= 0, lim

x0xα ln x = 0.

Proof.Letting y = ln x , we have

limx→∞

ln x

xα=

1

αlimy→∞

αy

eαy=

1

αlimt→∞

t

et= 0

Furthermore, with y = 1/x ,

limx0

xα ln x = limy→∞

1

yαln

1

y= − lim

y→∞

ln y

yα= 0.



Real Exponents, Logarithm

3.7.10. Corollary. Since the exponential function is continuous, we see thatfor n ∈ N

n√n = n1/n = e

1nln n n→∞−−−→ e0 = 1.

Finally, we note that by the inverse function theorem the logarithm isdifferentiable on R+ and

(ln y)′ =1

(ex)′|x=ln y=

1

ex |x=ln y=

1

y.

Hence ln ∈ C∞(R+).



The LogarithmWe thus obtain the following plot of the logarithmic function:

1x

y

y = ln HxL


Differential Calculus in One Variable The Trigonometric Functions



Vector Spaces


Series






The Trigonometric FunctionsWe first note that we define

ez := exp z for z ∈ C.

We then introduce the well-known trigonometric cosine and sine functionscos, sin : R→ R by

cos(x) := Re e ix =e ix + e−ix

2=

∞∑k=0

(−1)k x2k

(2k)!,

sin(x) := Im e ix =e ix − e−ix

2i=

∞∑k=0

(−1)k x2k+1

(2k + 1)!.

The equatione ix = cos(x) + i sin(x)

is sometimes called the Euler relation.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 464 / 624


The Trigonometric FunctionsWe can immediately prove some useful properties of the sine and cosine: itfollows from the definition that

cos(0) = 1, sin(0) = 0, sin2(x) + cos2(x) = |e ix |2 = e ixe−ix = 1.

From

e i(x+y) = e ixe iy = (cos(x) + i sin(x))(cos(y) + i sin(y))

=(cos(x) cos(y)− sin(x) sin(y)

)+ i(cos(x) sin(y) + sin(x) cos(y)

)= cos(x + y) + i sin(x + y)

we obtain the addition relations for the sine and cosine:

cos(x + y) = cos(x) cos(y)− sin(x) sin(y),

sin(x + y) = cos(x) sin(y) + sin(x) cos(y).



The Trigonometric FunctionsWe can verify immediately that the sine and cosine functions aredifferentiable, since

cos(x + h) = cos(x) cos(h)− sin(x) sin(h)

= cos(x)(1− h2/2! + o(h2))− sin(x)(h − h3/3! + o(h3))

= cos(x)− sin(x)h + o(h)

and

sin(x + h) = cos(x) sin(h) + sin(x) cos(h)

= cos(x)(h − h3/3! + o(h3)) + sin(x)(1− h2/2! + o(h2))

= sin(x) + cos(x)h + o(h)

so we haved

dxcos x = − sin x ,

d

dxsin x = cos x




It follows from these formulas that the sine and cosine functions bothsatisfy the differential equation

y ′′ + y = 0,

where the sine function also satisfies the initial condition y(0) = 0 and thecosine function satisfies y ′(0) = 0. In fact,

y(x) = a cos x + b sin x (3.8.1)

satisfies the general initial value problem

y ′′ + y = 0, y(0) = a, y ′(0) = b for any a, b ∈ R. (3.8.2)

We will show that (3.8.1) is in fact the only solution to (3.8.2).




3.8.1. Lemma. Suppose that y : R→ R is twice differentiable and that

y ′′ + y = 0, y(0) = 0, y ′(0) = 0. (3.8.3)

Then y(x) = 0 for all x ∈ R.

Proof.Multiplying (3.8.3) with y ′, we obtain

0 = y ′y ′′ + yy ′ =1

2

((y ′)2 + y2

)′,

so (y ′)2 + y2 is constant. Since y(0) = y ′(0) = 0, it follows that(y ′(x))2 + y(x)2 = 0 for all x , so y(x) = 0 for all x ∈ R.



The Trigonometric Functions3.8.2. Theorem. Suppose that y : R→ R is twice differentiable and that forsome a, b ∈ R

y ′′ + y = 0, y(0) = a, y ′(0) = b. (3.8.4)Then

y(x) = a cos x + b sin x .

Proof.Let f be any solution to (3.8.4) and g(x) = f (x)− a cos x − b sin x .Theng solves

g ′′ + g = 0, g(0) = 0, g ′(0) = 0.

By Lemma 3.8.1, g(x) = 0 for all x , so

f (x) = a cos x + b sin x .



The Trigonometric FunctionsWe can hence conclude:

If y ′′ + y = 0 and y(0) = 1, y ′(0) = 0, then y(x) = cos x , If y ′′ + y = 0 and y(0) = 0, y ′(0) = 1, then y(x) = sin x .

This property will be quite useful later.Now consider the cosine function. We know that cos(0) = 1 > 0 andcos2 x + sin2 x = 1. Therefore the cosine function has a local maximum atx = 0. Furthermore, at least for small ε > 0, cos x > 0 on [0, ε). Since(sin x)′ = cos x , we see that sin x is strictly increasing on [0, ε) andsin x > 0 on (0, ε). Since

cos′(x) = − sin(x),

the cosine is decreasing on [0, ε) and will continue to decrease untilsin x = 0. From cos2 x + sin2 x = 1 we see that if sin x = 0, then|cos x | = 1. Since the cosine is less than 1 and decreasing on (0, ε), it willcontinue to decrease until cos x = −1.



The Number π

Hence there exists at least one x0 > 0 such that cos x0 = 0. We define

π := 2 ·minx ∈ R+ : cos(x) = 0.

(Exercise: why is this a minimum and not just an infimum?) It will turnout that π is an irrational and transcendental number which is identical tothe diameter of the unit circle. Numerically,

π ≈ 3.141 592 653 589 793 238 462 643 383 279 502 189.



The Number πWe thus have cos(π/2) = 0 and |sin(π/2)| = 1. Since sin(0) = 0 and thesine function is increasing until cos x = 0, i.e., for 0 ≤ x ≤ π/2, we seethat sin(π/2) = 1. It follows thatcos(π) = cos(π/2 + π/2) = cos(π/2) cos(π/2)− sin(π/2) sin(π/2) = −1,sin(π) = cos(π/2) sin(π/2) + sin(π/2) cos(π/2) = 0,

cos(2π) = cos(π) cos(π)− sin(π) sin(π) = 1,

sin(2π) = cos(π) sin(π) + sin(π) cos(π) = 0.

This impliescos(x + 2π) = cos(x) cos(2π)− sin(x) sin(2π) = cos(x),

sin(x + 2π) = cos(x) sin(2π) + sin(x) cos(2π) = sin(x),

so the sine and cosine functions are periodic with period 2π. Furthermore,sin(x + π/2) = cos(x) sin(π/2) + sin(x) cos(π/2) = cos x ,

so the functions are simply translations of each other.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 472 / 624


The Graph of Sine and CosineLastly, we note that directly from the definition one sees that

sin(−x) = − sin(x), cos(−x) = cos(x)

for all x ∈ R.We are now able to graph the functions:



Trigonometric Interpretation of Sine and CosineOf course, the sine and cosine functions are already familiar fromelementary trigonometry, where they were defined in right-angled trianglesas

COS θ :=Adjacent sideHypotenuse , SIN θ :=

Opposite sideHypotenuse , 0 <θ < 90.

In order to prove that sin x = SIN θ and cos x = COS θ for suitablex = x(θ), one proceeds as follows:

1. Using geometrical methods, one proves that for fixed length of sidesof the triangle,

limθ→0

COS θ = 1, limθ→0

SIN θ = 0,

and thus extends the definition of the geometric sine and cosine tozero accordingly.



Trigonometric Interpretation of Sine and Cosine

2. Using geometrical methods, one proves that

COS(θ + ϕ) = COS θ COSϕ− SIN θ SINϕ,

SIN(θ + ϕ) = COS θ SINϕ+ SIN θ COSϕ.

whenever 0 ≤ θ,ϕ, θ + ϕ < 90.3. Using geometrical methods, one further shows (See Spivak, Ch. 15.

Ex. 27) that

limθ→0

1− COS θ

θ= 0, lim

θ→0

θ − SIN θ

θ= 0.

4. From the above, one deduces (COS θ)′ = −SIN θ and(SIN θ)′ = COS θ.



Trigonometric Interpretation of Sine and Cosine6. It follows that SIN θ and COS θ satisfy

(SIN θ)′′ + (SIN θ) = 0 and SIN(0) = 1, SIN′(0) = 0, (COS θ)′′ + (COS θ) = 0 and COS(0) = 0, COS′(0) = 1.

Since the solution to the corresponding initial value problems are unique,COS θ = cos θ, SIN θ = sin θ. Now θ is a number between 0 and 90, and

limθ→90

SIN θ = 1, limθ→90

COS θ = 0.

A degree θ is defined as the 360th part of the circumference of a circle;this is clearly an arbitrary definition, and we should find what therelationship between θ and x is, where x is scaled so that lim

x→π/2sin x = 1.

It is clear that θ = 360x/(2π). This is just a linear re-scaling of thecoordinate plane to accomodate the “artificial” idea of dividing a circleinto 360 parts. We will dispense with this and always assume that ourcoordinates are not rescaled in such a way.



More Trigonometric FunctionsWe further define trigonometric functions such as the tangent, cotangent,secant and cosecant by

tan x :=sin x

cos x, cot x :=

cos x

sin x, sec x :=

1

cos x, csc x :=

1

sin x,

Except for the tangent, we will not use these frequently.

-5 Π

2-2 Π -

3 Π

2-Π -

Π

2

Π

2Π

3 Π

22 Π

x

y

y = tan HxL



Inverse Trigonometric FunctionsThe inverses to the various trigonometric functions are called the arc sine,arc cosine arc tangent etc. and denoted

arcsin x , arccos x , arctan x ,

respectively. In each case, the ranges of the inverses have to be chosencarefully, as the original functions are not injective for all x . For example,one commonly defines

arcsin : [−1, 1]→ [−π/2,π/2],arccos : [−1, 1]→ [0,π],

arctan: R→ (−π/2,π/2),

but other definitions are of course possible and sometimes useful (inparticular, for the arc tangent).



Inverse Trigonometric FunctionsWe can use the inverse function theorem to calculate the deivatives of theinverse trigonometric functions. For this, we first note that

(cos x)′ = − sin x = −√

1− cos2 x ,

(sin x)′ = cos x =√

1− sin2 x ,

(tan x)′ =

(sin x

cos x

)′=

1

cos2 x= 1 + tan2 x .

Then

(arccos x)′ =1

(cos y)′|y=arccos x=

−1√1− cos2 y |y=arccos x

=−1√1− x2

,

(arcsin x)′ =1

(sin y)′|y=arcsin x=

1√1− sin2 y |y=arcsin x

=1√

1− x2,

(arctan x)′ =1

(tan y)′|y=arctan x=

1

1 + tan2 y |y=arctan x=

1

1 + x2.



Some Further Properties of the Sine FunctionThe sine (and cosine) functions have some interesting properties whichallow them to be used as study cases for continuity and differentiability.First, note that

limx→0

sin1

xdoes not exist. This can be proven similarly as for the periodic function ψstudied in the section on continuty.

x

-1

1

y

y = sin H1xL



Some Further Properties of the Sine FunctionNow it is easy to prove that the function

f (x) =

x sin 1

x x = 0

0 x = 0

is continuous at x = 0.

x

-1

1

y

y = x sin H1xL



Properties of the Sine Function

The function

f (x) =

x2 sin 1

x x = 0

0 x = 0

is even differentiable: For x = 0,

f ′(x) = 2x sin1

x− cos

1

x.

Furthermore, f (0 + h) = f (h) = f (0) + o(h), so f ′(0) = 0. However, f ′ isnot continuous at 0, since lim

x→0f ′(x) does not exist.

We thus finally have an example of a function that is differentiable but notcontinuously differentiable!



The Most Beautiful Theorem

Since e iθ = cos θ + i sin θ, we observe that e iπ = −1, or

e iπ + 1 = 0.

This wonderful formula, which relates the most important constants incalculus with each other has been voted the most beautiful theorem ofmathematics in a survey by the magazine The Mathematical Intelligencer.



Polar Coordinates in the Complex PlaneAn important observation is that every complex number z (except 0) inthe complex plane has a unique representation through the distance of zto the origin and the angle it makes with the positive real axis:

-4 -3 -2 -1 1 2 3 4Re z

-2

-1

1

2

3

4

Im z

z

r = ÈzÈ

Θ = arctan

Im z

Re z



Polar Coordinates in the Complex Plane

Thus every z ∈ C \ 0 uniquely determines a pair (r , θ) given by

r = |z | =√

(Re z)2 + (Im z)2, θ =

arctan Im z

Re z , Re z > 0,

π + arctan Im zRe z , Re z < 0.

Conversely, given (r , θ) ∈ R+ × [0, 2π) we obtain a unique number

z = r cos θ + ir sin θ = r · e iθ ∈ C \ 0.

We call this the polar representation of a complex number z . While thecartesian representation

z = a+ bi

for a, b ∈ R is useful for addition and subtraction, the polar representationis more suited to multiplication, division and exponentiation.



Polar Coordinates in the Complex Plane

The product and quotient of two complex numbers z1, z2 ∈ C \ 0 can beexpressed through their polar representation as

z1 · z2 = (r1eiθ1)(r2e

iθ2) = r1r2ei(θ1+θ2),

z1z2

=r1e

iθ1

r2e iθ2=

r1r2e i(θ1−θ2).

In particular, we see that

zn = rne inθ for n ∈ Z.

This allows us to obtain the nth roots of a complex number, somethingthat was not easily done for complex numbers in cartesian representation.



Polar Coordinates in the Complex PlaneFor 0 = z = re iθ we clearly have an nth root given by

z1/n = r1/ne iθ/n,

since (z1/n)n = z . We would, however, expect to be able to find n distinctroots. The key to finding them is to note that

e i(θ+2π) = cos(θ + 2π) + i sin(θ + 2π) = cos θ + i sin θ = e iθ.

Since z = re iθ = re i(θ+2π), another root is given by

z1/n = r1/ne i(θ/n+2π/n).

Similarly, since z = re iθ = re i(θ+k2π) for k ∈ Z, we have n nth roots

r1/ne iθ/n, r1/ne i(θ+2π)/n, r1/ne i(θ+2·2π)/n, ... , r1/ne i(θ+(n−1)·2π)/n.



The Hyperbolic Trigonometric FunctionsWe conclude this part by introducing a final family of functions, thehyperbolic trigonometric functions:We define the hyperbolic sine and hyperbolic cosine, sinh, cosh: C→ C, by

sinh(x) :=ex − e−x

2, cosh(x) :=

ex + e−x

2

A comparison with the definition of the sine and cosine functionsimmediately shows that

sinh(ix) = i sin(x), cosh(ix) = cos x .

From the definition, we see that

cosh(x) + sinh(x) = ex .



The Hyperbolic Trigonometric Functions

Similarly to their trigonometric counterparts, we have

cosh(−x) = cosh(x), sinh(−x) = − sinh(x),

the hyperbolic cosine and sine are even and odd functions, respectively.We also see that

cosh2(x)− sinh2(x) =

(ex + e−x

2

)2

−(ex − e−x

2

)2

=1

4

(e2x + 2 + e−2x − e2x + 2− e−2x

)= 1

This last relation explains the origin of the term hyperbolic: if x = cosh tand y = sinh t, then x2 − y2 = 1, which is the equation for a hyperbola.



Second Midterm Exam

The preceding material completes the second third of the course material.It encompasses everything that will be the subject of the Second MidtermExam.The exam date and time will be announced on SAKAI.No calculators or other aids will be permitted during the exam. A sampleexam with solutions has been uploaded to SAKAI. Please study it carefully,including the instructions on the cover page.


Integral Calculus in One Variable

Part IV




Notions of Integration

Practical Integration

Applications of Integration


Integral Calculus in One Variable Notions of Integration






Areas under CurvesWe now turn to a question that seems unrelated to our previousinvestigations: How can we determine the area between the graph of afunction and the abscissa?

1x

1

y

y x2

Area = ?

The sketch at left shows the graphof f : [0, 1] → R, f (x) = x2. Whileit is possible to determine the areabounded by the graph through ele-mentary geometrical means, we aimto find a general procedure so thatwe can discuss general functions, e.g.,f (x) = ex .

At the outset, we have not even defined what an “area” is supposed to be,so we must first lay some groundwork.



Areas under CurvesSuppose f : [a, b]→ R is a constant function, i.e., there exists some c ∈ Rsuch that f (x) = c for all x ∈ [a, b]. We would then naturally consider the“area bounded by the abscissa and the graph of f ” to be equal toc · (b − a):

a bx

c

y

y c

Area c Hb - aL



Areas under CurvesWe can generalize this to the case where f : [a, b]→ R is “piecewiseconstant”:

a a1 bx

c1

c2

y

y f HxL

Area c2 Hb - a1L + Ha1 - aL c1

For our intuitive concept of what the “area” is, the value of f (a1) isirrelevant: we could have f (a1) = c1 or f (a1) = c2 or some completelydifferent value. The “area” should remain the same.



Step FunctionsOur first step is to give a proper definition of a “piecewise constant”function and the area “under” its graph.4.1.1. Definition. A partition P of an interval [a, b] ⊂ R is a tuple ofnumbers P = (a0, ... , an) such that

a = a0 < a1 < ... < an = b.

4.1.2. Definition. A function φ : [a, b]→ R is called a step function withrespect to a partition P = (a0, ... , an) if there exist numbers yi ∈ R,i = 1, ... , n, such that

φ(t) = yi whenever ai−1 < t < ai (4.1.1)

for i = 1, ... , n. We denote the set of all step functions by Step([a, b])



Step Functions

4.1.3. Remarks.1. We don’t care how φ is defined at the points ai !2. We call φ simply a step function if there exists some partition P with

respect to which it is a step function.

4.1.4. Example. The function

φ : [0, 1]→ R, f (x) =

0 0 ≤ x < 1

2 ,12 x = 1

2 ,

1 12 < x ≤ 1,

is a step function. In particular, it is a step function with respect toP = (0, 12 , 1) or any other suitable partition, e.g., P = (0, 12 ,

34 , 1).



Sum of Step Functions4.1.5. Definition. Let P be a partition of an interval [a, b]. We call apartition R of [a, b] a sub-partition of P if R ⊃ P, where we naturallyregard P and R as sets instead of tuples.

4.1.6. Remark. If φ : [a, b]→ R is a step function with respect to apartition P of [a, b], then it as also a step function with respect to anysub-partition R of P.We can define the sum of two step functions as follows:Given a step function φ (w.r.t. a partition P) and a step function ψ (w.r.t.a partition Q), we first define the partition R := P ∪ Q of [a, b] whichcontains all points of P and Q, ordered appropriately from smallest tolargest.Then R ⊃ P and R ⊃ Q, i.e., R is a sub-partition of both P and Q.Hence φ and ψ are step functions with respect to R and we can defineφ+ ψ by adding φ(x) and ψ(x) on each interval induced by R = P ∪ Q.



Sum of Step Functions

Since multiplying step functions by a number does not present anyproblems, we have proven the following:4.1.7. Proposition. If φ,ψ are step functions on [a, b] and λ,µ ∈ R, thenλφ+ µψ is a step function on [a, b].

4.1.8. Remark. This also shows that Step([a, b]) is a vector space.



Integral of a Step Function4.1.9. Definition and Theorem. Let φ : [a, b]→ R be a step function ofthe form (4.1.1) with respect to some partition P. Then

IP(φ) := (a1 − a0)y1 + · · ·+ (an − an−1)yn (4.1.2)is independent of the choice of the partition P. We define the integral of fas ∫ b

aφ := IP(φ)

for any partition P with respect to which φ is a step function.

4.1.10. Remark. It is common to use the notation∫ b

aφ(x) dx :=

∫ b

aφ.

The motivation underlying this notation will remain mysterious for now.However, it is an intuitive notation in the same sense that writinglimx→x0 f (x) is more intuitively friendly that limx0 f .



Integral of a Step Function

Proof.Let φ be a step function w.r.t. a partition P and also w.r.t. some otherpartition Q. Let R = P ∪ Q be a common sub-partition of both P and Q.We will show that

IP(φ) = IR(φ) = IQ(φ).

Let c ∈ R \ P and suppose that ak < c < ak+1 for some k = 1, ... , n. Weadd c to P, denoting the resulting partition by

Pc = (a0, ... , ak , c, ak+1, ... , an).



Integral of a Step Function

Proof (continued).Then φ(t) = yk for ak < t < ak+1, so

IP(φ) = (a1 − a0)y1 + ... + (ak+1 − ak)yk + ... + (an − an−1)yn

= (a1 − a0)y1 + ... + (c − ak)yk + (ak+1 − c)yk

+ ... + (an − an−1)yn

= IPc (φ).

Repeating this procedure, we eventually have IP(φ) = IR(φ). In the sameway we show that IQ(φ) = IR(φ).



Properties of the Integral of a Step FunctionIt is easy to see that ∣∣∣∣∫ b

aφ

∣∣∣∣ ≤ (b − a) supx∈[a,b]

|φ(x)|

for a step function φ, since, given a partition P,

|IP(φ)| ≤n∑

i=1

(ai − ai−1) · |yi | ≤ max1≤i≤n

|yi |n∑

i=1

(ai − ai−1)

= (b − a) supx∈[a,b]

|φ(x)|. (4.1.3)

Furthermore, it is not difficult to show that∫ b

a(λφ+ µψ) = λ

∫ b

aφ+ µ

∫ b

aψ (4.1.4)

for step functions φ,ψ and λ,µ ∈ R.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 504 / 624


Bounded Functions4.1.11. Definition. Let I ⊂ R be an interval. We say that a functionf : I → R is bounded if

∥f ∥∞ := supx∈I|f (x)| <∞. (4.1.5)

The set of all bounded functions is denoted L∞(I ).

4.1.12. Remarks.1. It is easy to check that L∞(I ) (endowed with point-wise addition and

multiplication) is a vector space and that (4.1.5) defines a norm onL∞(I ).

2. If I = [a, b] is a closed interval, every continuous function is bounded,so C ([a, b]) ⊂ L∞([a, b]).

3. If I = [a, b] is a closed interval, any step function on [a, b] takes ononly a finite number of real values, so is automatically bounded.Hence, Step([a, b]) ⊂ L∞([a, b]).



Area under the Graph of Bounded Functions

We restrict ourselves (for now) to bounded functions on an interval [a, b].The area under the graph of a bounded function (if it exists) can then notbe larger than the product of (a− b) and the bound of the function, i.e., itmust be finite.However, not all bounded functions may have an intuitive “area” undertheir graph. As an example, consider the Dirichlet function

χ : [0, 1]→ R, χ(x) =

1 for x rational,0 for x irrational.

It is not clear, what the value of the “area” might be: 1? 0? undefined?Given such an example, we will have to approach the concept of area verycarefully.



Area under the Graph of Bounded FunctionsSuppose a function f ∈ L∞([a, b]) can be approximated uniformly by stepfunctions, i.e., it is the uniform limit of a sequence of step functions. Thenwe might hope that the area under the step functions approaches the areaunder f :

a bx

y



Regulated Functions

It turns out that this approach works for functions that can beapproximated uniformly by step functions. We give the class of thesefunctions a specific name:4.1.13. Definition. A function f ∈ L∞([a, b]) is said to be regulated if forany ε > 0 there exists a step function φ such that

supx∈[a,b]

|f (x)− φ(x)| < ε. (4.1.6)

Equivalently, a regulated function f is a function such that there exists asequence of step functions (φn) which converges uniformly to f . (To seethis, we simply define φn as the step function satisfying (4.1.6) withε = 1/n.)



Example of a Regulated Function4.1.14. Example. The function f : [0, 1]→ R, f (x) = x2 is regulated. Tosee this, choose the sequence of step functions (φn) on [0, 1] given by

φn(x) =k2

n2for k − 1

n< x ≤ k

n, k = 1, ... , n,

and φn(0) = 0. Then

∥φn − f ∥∞ = supx∈[0,1]

|φn(x)− f (x)| = maxk=1,...,n

supk−1n

<x≤ kn

|φn(x)− f (x)|

= maxk=1,...,n

supk−1n

<x≤ kn

∣∣∣∣k2n2 − x2∣∣∣∣

= maxk=1,...,n

∣∣∣∣k2n2 − (k − 1)2

n2

∣∣∣∣ = maxk=1,...,n

∣∣∣∣2k − 1

n2

∣∣∣∣ = 2n − 1

n2.

It follows that ∥φn − f ∥∞ → 0 as n→∞, so f ∈ Reg([a, b]).Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 509 / 624


Continuous Functions are Regulated4.1.15. Theorem. The continuous functions on the interval [a, b] areregulated, i.e., C ([a, b]) ⊂ Reg([a, b]).

Proof.We need to show that any continuous function on the interval [a, b] canbe regarded as the uniform limit of step functions. Let f ∈ C ([a, b]) begiven. By Theorem 2.5.24 f is also uniformly continuous, i.e.,

∀ε>0∃

δ>0∀

x ,y∈[a,b]|x − y | < δ ⇒ |f (x)− f (y)| < ε. (4.1.7)

Now fix ε > 0 and choose some δ > 0 according to (4.1.7). Define apartition P = (a0, ... , an) of [a, b] such that aj − aj−1 = (a− b)/n < δ,j = 1, ... , n, by choosing n large enough. Define φ : [a, b]→ R byφ(b) = f (b) and

φ(t) = f (ai−1) for ai−1 ≤ t < ai , i = 1, ... , n.



Piecewise Continuous Functions

4.1.16. Definition. Let I ⊂ R be an interval, a1, ... , an ⊂ I a finite set ofpoints and f : I → C a function such that

(i) f : I \ a1, ... , an → C is continuous and(ii) For all ai , i = 1, ... , n, both

limxai

f (x) and limxai

f (x)

exist.Then f is said to be piecewise continuous. We denote the set of piecewisecontinuous functions on I by PC(I ).

4.1.17. Theorem. The piecewise continuous functions on the interval [a, b]are regulated, i.e., PC([a, b]) ⊂ Reg([a, b]).The proof is left to the assignments.



Integral of a Regulated FunctionWe now define the integral of a regulated function as follows:4.1.18. Definition and Theorem. Let f ∈ Reg([a, b]) and (φn) a sequencein Step([a, b]) converging uniformly to f . Then the regulated integral of f ,defined by ∫ b

af := lim

n→∞

∫ b

aφn (4.1.8)

exists and does not depend on the choice of (φn).

Proof.We first prove that the limit (4.1.8) exists. Since (φn) converges to funiformly, (φn) is a Cauchy sequence in C ([a, b]) with the ∥ · ∥∞ norm.This implies that the sequence (

∫ ba φn) is Cauchy, since∣∣∣∣∫ b

aφn −

∫ b

aφm

∣∣∣∣ = ∣∣∣∣∫ b

a(φn − φm)

∣∣∣∣ ≤ (b − a)∥φn − φm∥∞.

by (4.1.4) and (4.1.3).Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 513 / 624


Integral of a Regulated FunctionProof.Since the real numbers are complete, this implies that the sequence(∫ ba φn) converges. Hence, the limit (4.1.8) exists.

Furthermore, let (φn) and (ψn) be two sequences of step functionsconverging uniformly to f . Then∣∣∣∣∫ b

aφn −

∫ b

aψn

∣∣∣∣ = ∣∣∣∣∫ b

a(φn − ψn)

∣∣∣∣≤ (b − a)∥φn − ψn∥∞≤ (b − a)∥φn − f ∥∞ + (b − a)∥ψn − f ∥∞n→∞−−−→ 0.

This shows that both sequences yield the same limit (4.1.8), i.e., theintegral of f does not depend on the choice of the sequence of stepfunctions.



Example of a Regulated Integral4.1.19. Example. We have seen in Example 4.1.14 that the functionf : [0, 1]→ R, f (x) = x2 is the uniform limit of the sequence of stepfunctions (φn) on [0, 1] given by

φn(x) =k2

n2for k − 1

n< x ≤ k

n, k = 1, ... , n,

and φn(0) = 0. Now it is easy to see that the integral of the stepfunctions, by Definition 4.1.9, is given by∫ 1

0φn =

n∑k=1

1

n· k

2

n2=

1

n3

n∑k=1

k2 =n(n + 1)(2n + 1)

6n3.

It follows that ∫ 1

0f = lim

n→∞

∫ 1

0φn =

1

3.



Properties of the Regulated IntegralWe can verify that properties (4.1.4) and (4.1.3) also hold for theregulated integral. Let f , g ∈ Reg([a, b]) and (φn), (ψn) be sequences inStep([a, b]) converging uniformly to f and g , respectively. Then∣∣∣∣∫ b

af

∣∣∣∣ = ∣∣∣∣ limn→∞

∫ b

aφn

∣∣∣∣ = limn→∞

∣∣∣∣∫ b

aφn

∣∣∣∣ ≤ (b − a) limn→∞

supx∈[a,b]

|φn(x)|

≤ (b − a) supx∈[a,b]

|f (x)|. (4.1.9)

In the same way,

λ

∫ b

af + µ

∫ b

ag = λ lim

n→∞

∫ b

aφn + µ lim

n→∞

∫ b

aψn = lim

n→∞

∫ b

a(λφn + µψn)

=

∫ b

a(λf + µg)

since the sequence (λφn + µψn) converges uniformly to λf + µg .Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 516 / 624


Properties of the Regulated IntegralWe also note that if a < c < b, then for f ∈ L∞([a, b]),∫ b

af =

∫ c

af +

∫ b

cf . (4.1.10)

This can easily be seen by including the point c in any partition of stepfunctions approximating f .The set of regulated functions is closed and the regulated integral iscontinuous on this set. More precisely, we have the following result:4.1.20. Theorem. Let (fn) be a sequence of regulated functions such thatfn → f uniformly for some f ∈ L∞, i.e., ∥f − fn∥∞ → 0. Then f isregulated and ∫ b

afn

n→∞−−−→∫ b

af . (4.1.11)

The proof is left to the assignments.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 517 / 624


Non-Regulated FunctionsHowever, not all bounded functions are regulated:4.1.21. Example. Consider the function

f : [0, 1]→ R, f (x) =

1 if x = 1/n for some n ∈ N \ 0,0 otherwise.

Now any step function φ must be constant (say, equal to c ∈ R) in theinterval (0, δ) for some δ > 0. This interval will always contain both pointswhere x = 1/n and points where x = 1/n, no matter how small δ ischosen. Then

supx∈[0,1]

|f (x)− φ(x)| ≥ supx∈(0,δ)

|f (x)− c | ≥ max|1− c |, |c |

≥ 1

2

so that (4.1.6) can not be satisfied. Hence, f is not regulated.For technical reasons, however, it is desirable to extend the notion ofintegral to cover even non-regulated functions such as the one above.



Extending the Regulated IntegralLet us attempt another approach at defining the area under a curve: theregulated integral was based on approximating the function f uniformlythrough step functions and then defining the area under f to be the limitof the areas of step functions.

In contrast, we will now try to ap-proximate the area under f , with-out attempting to approximate fat all.We will do this by consideringstep functions that are larger thanf (i.e., φ(x) ≥ f (x) for all x)and step functions that are smallerthan f (ψ(x) ≤ f (x) for all x), asillustrated at right.

a bx

y



Properties of the IntegralIf the area between “upper” and “’lower” step functions can be made assmall as desired, we will have successfully approximated the area of fwithout necessarily needing to approximate f . We will now formalize theseideas.We have proven (or can easily check) that the regulated integral has thefollowing properties:

(i) The integral is linear, i.e.,∫(λf + µg) = λ

∫f + µ

∫g for λ,µ ∈ R;

(ii) The integral is positive, i.e., if f > 0 on [a, b], then∫f > 0;

(iii) The integral of a constant function f = c , c ∈ R, on [a, b] is given by∫f = c · (b − a);

(iv) The integral does not depend on the values of a function on finitesets. If f (x) = 0 for all but a finite set of x ∈ [a, b], then

∫f = 0.

Suppose there is a class of bounded functions on [a, b] that includes theregulated functions and all those functions for which it is possible to definean integral with the properties (i) - (iv) above.



Upper and Lower Step FunctionsLet f ∈ L∞([a, b]) be a function for which an integral with properties (i) -(iv) exists. Since f is bounded, there exists a step function u such thatu ≥ f on [a, b]. Thus ∫ b

af ≤

∫ b

au

by properties (i) and (ii) (since u − f > 0 and∫ ba (u − f ) =

∫ ba u −

∫ ba f ).

Denote by Uf the set of all such “upper” step functions. Then∫ b

af ≤ inf

u∈Uf

∫ b

au.

This infimum exists because the step functions are bounded below by thelower bound for f , and properties (ii) and (iii) guarantee the existence of alower bound for the integral. We define the upper area of f as

I (f ) := infu∈Uf

∫ b

au.



Upper and Lower Step FunctionsSimilarly, if Lf denotes the set of all step functions v such that v ≤ f , then∫ b

af ≥ sup

v∈Lf

∫ b

av =: I (f ).

Thus, if it is possible to define an integral of the bounded function f , then

I (f ) ≤∫ b

af ≤ I (f ).

Even if it is not possible to define an integral of f , we may still define Ufand Lf and it remains true that

I (f ) ≤ I (f ).

Clearly, for an arbitrary bounded f the supremum and infimum may differ;but if they are equal, then their common value must also be the value of∫ ba f . We can use this to define

∫ ba f by this common value, if it exists.



The Darboux Integral4.1.22. Definition. Let [a, b] ⊂ R be a closed interval and f a bounded realfunction on [a, b]. Let Uf denote the set of all step functions u on [a, b]such that u ≥ f and Lf the set of all step functions v on [a, b] such thatv ≤ f . The function f is then said to be Darboux-integrable if

I (f ) = supv∈Lf

∫ b

av = inf

u∈Uf

∫ b

au = I (f ).

In this case, the Darboux integral of f ,∫ ba f , is defined to be this common

value.

4.1.23. Remark. A bounded function f ∈ L∞([a, b]) is Darboux-integrableif and only if for every ε > 0 there exist step functions uε and vε such thatvε ≤ f ≤ uε and ∫ b

auε −

∫ b

avε ≤ ε.



Example for the Darboux Integral of a Function4.1.24. Example. Consider the function f : [0, 1]→ R, f (x) = x2. For anyn ∈ N, the step function defined by

un(x) =k2

n2for k − 1

n< x ≤ k

n, k = 1, ... , n,

and un(0) = 0 is an upper step function for f , while the step function

vn(x) =(k − 1)2

n2for k − 1

n< x ≤ k

n, k = 1, ... , n,

and un(0) = 0 is a lower step function for f . Then∫ 1

0un =

n∑k=1

1

n· k

2

n2=

1

n3

n∑k=1

k2 =n(n + 1)(2n + 1)

6n3,

∫ 1

0vn =

n∑k=1

1

n· (k − 1)2

n2=

1

n3

n−1∑k=1

k2 =n(n − 1)(2n − 1)

6n3.



Example for the Darboux Integral of a FunctionNow

infu∈Uf

∫ 1

0u ≤ lim

n→∞

∫ 1

0un =

1

3,

supv∈Lf

∫ 1

0v ≥ lim

n→∞

∫ 1

0vn =

1

3.

Since infu∈Uf

∫ 10 u ≥ sup

v∈Lf

∫ 10 v , this implies that

infu∈Uf

∫ 1

0u = sup

v∈Lf

∫ 1

0v =

1

3.

By the definition of the Darboux integral,∫ 1

0f = 1/3.



The Darboux Integral of a Non-Regulated Function4.1.25. Example. The function

f : [0, 1]→ R, f (x) =


of Example 4.1.21 is not regulated, but it is Darboux-integrable. To seethis, note that v0(x) = 0 defines a lower step function v0 ∈ Lf , while forany n,

un(x) =

1 0 ≤ x ≤ 1/n,

f (x) otherwisedefines an upper step function un ∈ Uf . As in the previous example,

infu∈Uf

∫ 1

0u ≤ lim

n→∞

∫ 1

0un =

1

n= 0

and supv∈Lf

∫ 10 v ≥

∫ 10 v0 = 0, yielding the common value

∫ 10 f = 0.



A regulated function is Darboux-integrable4.1.26. Theorem. If f ∈ Reg([a, b]), then f is Darboux-integrable and theDarboux integral coincides with the regulated integral.

Proof.Let f ∈ Reg([a, b]). We will show that for every ε > 0 there exist upperand lower step functions uε and vε such that

∫ ba uε −

∫ ba vε ≤ ε.

Fix ε > 0. Since f is regulated, there exists a step function φ such thatsup

x∈[a,b]|f (x)− φ(x)| < ε/[2(b − a)]. Then

uε := φ+ε

2(b − a), vε := φ− ε

2(b − a).

are upper and lower step functions for f and∫ b

auε −

∫ b

avε ≤ (b − a) sup

x∈[a,b]|uε(x)− vε(x)| = ε.



A regulated function is Darboux-integrable

Proof (continued).This shows that f is Darboux integrable.To show that the Darboux integral is equal to the regulated integral, wewill show that the limit of the integrals of step functions converginguniformly to f is just the Darboux integral. Let (φn) be a sequence of stepfunctions such that

supx∈[a,b]

|f (x)− φn(x)| ≤1

2n.

Then

un := φn +1

n, vn := φn −

1

n.

define upper and lower step functions for f .



A regulated function is Darboux-integrableProof (continued).The Darboux integral satisfies∫ b

avn ≤

∫ b

af ≤

∫ b

aun.

Since

limn→∞

∫ b

avn = lim

n→∞

∫ b

aφn − lim

n→∞

b − a

n= lim

n→∞

∫ b

aφn,

limn→∞

∫ b

aun = lim

n→∞

∫ b

aφn + lim

n→∞

b − a

n= lim

n→∞

∫ b

aφn.

it follows that ∫ b

af = lim

n→∞

∫ b

aφn.



A regulated function is Darboux-integrable

Hence, the Darboux integral is an extension of the regulated integral forfunctions f : [a, b]→ R. We can easily verify that the Darboux integral isbounded and additive:4.1.27. Proposition. The Darboux integral of Riemann-integrable functionsf and g on [a, b] satisfies ∣∣∣∣∫ b

af

∣∣∣∣ ≤ (b − a) supx∈[a,b]

|f (x)|,

λ

∫ b

af + µ

∫ b

ag =

∫ b

a(λf + µg).

The proof is left as an exercise.



Tagged Partitions and Riemann SumsThere is another approach to the integral that is most often seen inundergraduate textbooks.A tagged partition (P , Ξ) on an interval [a, b] consists of a partitionP = x0, ... , xn together with numbers Ξ = ξ1, ... , ξn such that eachξk ∈ [xk−1, xk ]. The mesh size of P is defined as

m(P) := maxk=1,...,n

(xk − xk−1)

A step function φ ∈ Step([a, b]) for a function f ∈ L∞([a, b]) with respectto (P, Ξ) is given by

φ(x) = f (ξk) for xk−1 < x < xk , k = 1, ... , n,where as usual φ(xk) can be defined in an arbitrary manner. The sum∫ b

aφ :=

n∑k=1

f (ξk)(xk − xk−1) (4.1.12)

is then called a Riemann sum for f .Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 531 / 624


Riemann SumsThe Riemann sum will hopefully approach the area under the graph of f asthe partitions become finer. Below, we have taken partitions of equal sizeand tags lying in the midpoints of the subintervals, but the subintervallengths and tag locations should not influence the limit:

a bx

y

Given a function f , the choice offiner and finer partitions gives Rie-mann sums of more and more val-ues of f , weighted by the meshsize.In the limit, one may regard a se-quence of such Riemann sums asapproaching the “sum of all valuesof f in the interval [a, b].”



The Riemann Integral4.1.28. Definition. Let [a, b] ⊂ R be a closed interval and f a boundedreal function on [a, b]. Then f is Riemann-integrable with integral∫ b

af ∈ R

if for every ε > 0 there exists a δ > 0 such that for any tagged partition(P , Ξ) on [a, b] with mesh size m(P) < δ∣∣∣∣ n∑

k=1

f (ξk)(xk − xk−1)−∫ b

af

∣∣∣∣ < ε.

Definition 4.1.28 is quite difficult to work with in practice, so it isfortunate that the following result holds:4.1.29. Theorem. A bounded real function is Riemann-integrable on [a, b]if and only if it is Darboux-integrable. Moreover, the values of theintegrals coincide.



Equivalence of the Darboux and Riemann Integrals

Proof.Suppose that f is Riemann integrable. Fix ε > 0. Then there exists apartition P = (x0, ... , xn) such that∣∣∣∣ n∑

k=1

f (ξk)(xk − xk−1)−∫ b

af

∣∣∣∣ < ε

4.

for any choice of ξk ∈ [xk−1, xk ]. On this partition P we can define upperand lower step functions for f by

u(x) := supξ∈[xk−1,xk ]

f (ξ), v(x) := infξ∈[xk−1,xk ]

f (ξ)

for x ∈ (xk−1, xk). By choosing ξk appropriately, we can ensure that thecorresponding Riemann sum differs from the integral of u by less than ε/4.




Proof (continued).This ensures that∣∣∣∣∫ b

au −

∫ b

af

∣∣∣∣ ≤ ∣∣∣∣∫ b

au −

n∑k=1

f (ξk)(xk − xk−1)

∣∣∣∣+

∣∣∣∣ n∑k=1

f (ξk)(xk − xk−1)−∫ b

af

∣∣∣∣<ε

2.

Applying the same argument for the lower step function, we find∣∣∣∣∫ b

au −

∫ b

av

∣∣∣∣ < ∣∣∣∣∫ b

au −

∫ b

af

∣∣∣∣+ ∣∣∣∣∫ b

av −

∫ b

af

∣∣∣∣ < ε. (4.1.13)




Proof (continued).Since such step functions u and v can be found for any ε > 0, this showsthat f is Darboux-integrable (see Remark 4.1.23). Furthermore, since forany choice of partition∫ b

av ≤

n∑k=1

f (ξk)(xk − xk−1) ≤∫ b

au

we see that the Darboux integral equals the Riemann integral.The proof that a Darboux-integrable function is Riemann-integrable is leftas an exercise.

4.1.30. Remark. The Riemann integral emphasizes the interpretation of theintegral as a sum or series over a “continuous parameter range”.



Insufficiency of the Darboux/Riemann IntegralFrom the definition of a step function, we can immediately see that theregulated integral of a function f ∈ L∞([a, b]) such that f (x) = 0 at allexcept a finite number of points in [a, b] vanishes. For example, thefunction f (x) = 0 for x = 0, f (0) = 1 is integrable on [0, 1] and theregulated integral vanishes.On the other hand, the function

f : [0, 1]→ R, f (x) =


(4.1.14)

does not have a regulated integral. Since f vanishes “at almost everypoint of the interval”, we would like the integral of f to be defined and tobe zero, but the notion of the regulated integral is not sufficient for this.However, we were able to extend the concept of an integral to the Darbouxintegral and as we have seen, the Darboux integral of f is indeed zero.



Insufficiency of the Darboux/Riemann IntegralHowever, there are still functions that seem to vanish “always everywhere”but whose Darboux integral is not defined. The basic example is theDirichlet function, which we have already mentioned earlier.4.1.31. Example. The Dirichlet function

χ : [0, 1]→ R, χ(x) =

1 for x rational,0 for x irrational.

(4.1.15)

is not Darboux-integrable.The reason for this is simple: any lower step function v must satisfyv(x) ≤ 0 while an upper step function must satisfy u(x) ≥ 1 for all xexcept a finite number of exceptions (at the “jumps” between the steps).Hence,

infu∈Uχ

∫ 1

0u ≥ 1 > 0 ≥ sup

v∈Lχ

∫ 1

0v

and the two bounds can not have a common value.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 538 / 624


Extensions of the Darboux/Riemann IntegralThere exist extensions of the Darboux and Riemann integrals that allow usto define an integral even of functions such as χ. While we lack the timeto go into details on these, two deserve to be mentioned:

(i) The Lebesgue integral (dating from around 1917) is the mostcommonly seen extension. It may be regarded as a modification ofthe Darboux integral in which step functions are replaced withso-called simple functions that are basically step functions with an“infinite number” of steps.

(ii) The gauge integral, also called the Henstock-Kurzweil integral, is afairly recent formulation (developed in the 1980s) that modifies theRiemann integral to require the mesh size of a tagged partition todepend on the specific tag within each subinterval. Hence, inDefinition 4.1.28 m(P) < δ is replaced by the conditionxk+1 − xk < δ(ξk) for a gauge function δ : [a, b]→ (0,∞).It turns out that the Henstock-Kurzweil integral is more general thanthe Lebesgue integral.



Conclusion

In fact, there is no significant benefit in the introduction of the Darbouxand Riemann integrals since:

All functions we will encounter in practice are at least piecewisecontinuous and therefore regulated.

The Darboux integral is not general enough to treat a very wide classof functions that includes, e.g., the Dirichlet function. If it werenecessary for us to treat such functions, we would need to develop aneven more general concept.

The Darboux and Riemann integrals were mainly introduced to illustratesome basic ideas on how to implement the seemingly simple question ofdefining the “area under a function”. In practice, it is entirely sufficient toconsider all integrals as referring to the regulated integral.


Integral Calculus in One Variable Practical Integration






The Fundamental Theorem of CalculusThe derivation in the previous chapter can also be applied tocomplex-valued functions with no change in the arguments. We henceforthassume that the regulated integral has been established for functionsf : [a, b]→ C.4.2.1. Theorem. Let f : [a, b]→ C be continuous and set

F : [a, b]→ C, F (x) :=

∫ x

af .

Then F is differentiable on (a, b) andF ′(x) = f (x), x ∈ (a, b).

While this theorem can be proved for general regulated functions, we arecontent with the (relatively simple) proof for continuous functions. It maybe immediately applied to piecewise continuous functions by using(4.1.10). In practice, we will not often come across integrals of functionsthat are not at least piecewise continuous.



The Fundamental Theorem of CalculusProof.Let h be sufficiently small. Then for x , x + h ∈ (a, b) we have

F (x + h) =

∫ x+h

af =

∫ x

af (t) dt +

∫ x+h

xf (t) dt = F (x) +

∫ x+h

xf (t) dt

= F (x) + hf (x) +

∫ x+h

x

(f (t)− f (x)

)dt.

Now

limh→0

1

|h|

∣∣∣∫ x+h

x

(f (t)− f (x)

)dt∣∣∣ ≤ lim

h→0

|h||h|

supx≤t≤x+h

|f (t)− f (x)|

= limh→0

supx≤t≤x+h

|f (t)− f (x)| = 0

Hence the integral is o(h) as h→ 0 and F ′(x) = f (x).



The Fundamental Theorem of Calculus

4.2.2. Definition. Let Ω ⊂ R be an open set and f : Ω→ C. We say thatF : Ω→ C is a primitive, indefinite integral or anti-derivative of f ifF ′ = f . We write

F =

∫f =

∫f (t) dt.

4.2.3. Lemma. Let f ∈ C ([a, b],C) and F be a primitive of f . Then∫ b

af = F (b)− F (a).

This fundamental result establishes that integration can be regarded as asort of “inverse differentiation.”



The Fundamental Theorem of Calculus

Proof.We note that by Theorem 4.2.1

d

dx

(∫ x

af − F (x)

)= f (x)− f (x) = 0,

so there exists a c ∈ C such that∫ x

af − F (x) = c

Letting x → a, we see that c = −F (a). This implies∫ x

af = F (x)− F (a).

Letting x → b we obtain the result.



Practical IntegrationWe can hence immediately calculate various integrals by guessing theirprimitives. For example, for α = 1,∫ b

axα dx =

1

α+ 1xα+1

∣∣∣x=b− 1

α+ 1xα+1

∣∣∣x=a

=:1

α+ 1xα+1

∣∣∣ba.

For short, we often write the right-hand side as∫ b

axα dx =

1

α+ 1xα+1

∣∣∣ba=[ 1

α+ 1xα+1

]ba.

Similarly, we “see” that∫eαx dx =

1

αeαx , α = 0,∫

1

xdx =

∫dx

x= ln|x |, x = 0.



Substitution RuleHowever, not all integrals are obvious. For example,∫

ln x dx = x ln x − x ,

∫dx√1 + x2

= Arsinh x .

Since we integrate by “inverse differentiation,” we should expect some helpfrom the theorems that govern differentiation of products andcompositions of functions.The following theorem is a direct result of the chain rule of differentiation.4.2.4. Substitution Rule. Let f ∈ Reg([α,β]) and g : [a, b]→ [α,β]continuously differentiable. Then∫ b

a(f g)(x)g ′(x) dx =

∫ g(b)

g(a)f (y) dy .



Substitution RuleProof.Let F be a primitive of f . Then (F g)′ = (f g)g ′ and∫ b

a(f g)(x)g ′(x) dx =

∫ b

a(F g)′ dx

= (F g)(x)∣∣ba= F (x)

∣∣g(b)g(a)

=

∫ g(b)

g(a)f (y) dy .

4.2.5. Remark. Using the same proof, we see that the substitution rule canbe applied to indefinite integrals (primitives):∫

(f g)(x)g ′(x) dx =

∫f (y) dy

∣∣∣y=g(x)

.



Substitution Rule

4.2.6. Example. We want to calculate∫

dx

λ2 + x2for λ = 0. We write

∫dx

λ2 + x2=

∫1

λ2dx

1 + (x/λ)2.

We now set f (x) = 1/(1 + x2), g(x) = y = x/λ. Then g ′(x) = 1/λ, sowe have∫

dx

λ2 + x2=

1

λ

∫1

λ

dx

1 + (x/λ)2

=1

λ

∫g ′(x)f (g(x)) dx =

1

λ

∫f (y) dy

∣∣∣y=g(x)

=1

λ

∫dy

1 + y2

∣∣∣y=x/lambda

=1

λarctan y

∣∣∣y=x/λ

=1

λarctan

x

λ.



Substitution Rule

The following “short cut” using Leibniz notation works very well with thesubstitution rule: Set y = g(x), so g ′(x) = dy

dx . Then we substitute y forg(x) and dy for g ′(x) dx .

4.2.7. Example. We calculate∫

xe−x2 dx . We set y = −x2. Then

dy

dx= −2x ⇔ dy = −2x dx ⇔ −1

2dy = x dx .

Thus ∫xe−x2 dx = −1

2

∫ey dy

∣∣∣y=−x2

= −1

2ey |y=−x2 = −

1

2e−x2 .



Integration by PartsAnalogously to the previous result, we obtain a technique for integrationbased on the product rule of differentiation.4.2.8. Theorem. Let f , g ∈ C 1([a, b],C). Then∫ b

af ′(x)g(x) dx = f (x)g(x)

∣∣∣ba−∫ b

af (x)g ′(x) dx .

Proof.This follows from the fact that fg is a primitive of f ′g + g ′f .4.2.9. Example.∫

ln x dx =

∫1︸︷︷︸

f ′(x)

· ln x︸︷︷︸g(x)

dx = x ln x︸︷︷︸f (x)g(x)

−∫

x︸︷︷︸f (x)

· 1

x︸︷︷︸g ′(x)

dx

= x ln x − x



The Area of a DiscWe now calculate the area of a disc of radius R. In fact, we calculate thearea of a quarter-disc by integrating the function f given byf (x) =

√R2 − x2 from 0 to R:

Rx

R

y

y = f HxL = R2- x2

t

R

Substituting x = R sin t,dx = R cos t dt, we have∫ R

0

√R2 − x2 dx

=

∫ π/2

0

√R2 − R2 sin2 t R cos t dt

= R2

∫ π/2

0cos2 t dt



The Area of a DiscWe are now left with calculating

∫ π/2

0cos2 t dt.

Using integration by parts:∫ π/2

0cos2 t dt = sin t cos t

∣∣∣π/20

+

∫ π/2

0sin2 t dt

= 0 +

∫ π/2

0(1− cos2 t) dt = π/2−

∫ π/2

0cos2 t dt

⇒∫ π/2

0cos2 t dt =

π

4

or using trigonometry and substitution:∫ π/2

0cos2 t dt =

1

2

∫ π/2

0(cos(2t) + 1) dt =

π

4+

1

2

∫ π/2

0cos(2t) dt

=π

4+

1

4

∫ π

0cos(s) ds =

π

4+

1

4sin s

∣∣∣π0=π

4



The Area of a Disc

Hence the area of a disc of radius R is 4 · π4R2 = πR2, as we know from

elementary geometry.Integration is an art, not a skill like differentiation. There are many more“tricks” and “techniques” for calculating integrals, some of which you willencounter in the assignments and recitation classes.



Improper IntegralsAn integral ∫ b

af (t) dt

is called “improper” if the domain of integration is unbounded i.e., a = −∞ or b =∞

and/or the integrand f is unbounded on (a, b) or otherwise not regulated.

4.2.10. Definition. Assume that b ≤ ∞ and that f : [a, b)→ C is regulatedon any closed subinterval [a, x ], x < b. Then∫ b

af (t) dt

is called an improper integral and is said to converge or exist if

limxb

∫ x

af (t) dt =: I



Improper Integrals4.2.10. Definition. (continued). The number I ∈ C is then called the valueof the improper integral and we write

I =

∫ b

af (t) dt.

Similarly, we define ∫ b

af (t) dt = lim

xa

∫ b

xf (t) dt

(if the limit exists) in the case where b <∞. In general, we define theintegral∫ b

af (t) dt :=

∫ c

af (t) dt +

∫ b

cf (t) dt, −∞ ≤ a < b ≤ ∞

for any c ∈ R such that both limits on the right exist.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 556 / 624


Improper Integrals

4.2.11. Example. Consider the integral∫ ∞

1

dt

tα. Then for α = 1

∫ x

1

dt

tα=

1

1− α(x1−α − 1)

x→∞−−−→

1

α−1 if α > 1

∞ if α < 1

Thus the integral converges for α > 1 and diverges for α < 1. We also seethat in the case α = 1,∫ x

1

dt

t= ln x − ln 1

x→∞−−−→∞,

so the integral diverges for α = 1.

In the same way we can show that∫ 1

0

dt

tαconverges if α < 1 and diverges

if α ≥ 1.



Cauchy Property of FunctionsWe would like to find a criterion for the convergence of improper integrals.As a preliminary step, we establish the following convergence criterion.4.2.12. Theorem. Let I ⊂ R be an interval, a ∈ I ⊂ R ∪ −∞,∞ andF : I → C. Then the following statements are equivalent:

a) The limit limx→a

F (x) exists.b) The function F satisfies the Cauchy property,

∀ε>0∃

δ>0∀

x ,y∈I: |x − a| < δ ∧ |y − a| < δ ⇒ |F (x)− F (y)| < ε.

(4.2.1)

when a ∈ R, or

∀ε>0

∃R>0

∀x ,y∈I

: ± x ,±y > R ⇒ |F (x)− F (y)| < ε. (4.2.2)

when a = ±∞.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 558 / 624


Cauchy Property of Functions

Proof.We will only consider the case a ∈ R. The case a = ±∞ is left to you.

a)⇒ b) Fix ε > 0. If the limit exists (call it L) we can find δ > 0such that for all x , y ∈ I

|x − a| < δ ⇒ |F (x)− L| < ε/2.

Then |F (x)− F (y)| ≤ |F (x)− L|+ |F (y)− L| implies b). b)⇒ a) We suppose that (4.2.1) holds and show that lim

x→aF (x) exists

by considering sequences (xn) with xn → a.Let (xn) be given and suppose that xn → a. Then b) implies that(F (xn)) is a complex Cauchy sequence (why? prove this!) and thissequence converges because C is complete.



Cauchy Property of Functions

Proof (continued).

Next, suppose that (xn) and (yn) are two such sequences. Define

(zn) = (x1, y1, x2, y2, ...)

Then by the same argument as for (xn), the limit of F (zn) exists.Since the limit is the only accumulation point of (F (zn)),

L := limn→∞

F (zn) = limn→∞

F (xn) = limn→∞

F (yn)

so limx→a

F (x) exists.



Cauchy Criterion for FunctionsThe main application of the Cauchy property is in the study of improperintegrals. For example, we have the following corollary of Theorem 4.2.12.4.2.13. Cauchy Criterion. Let a ∈ R and f : [a,∞)→ R be integrable onevery interval [a, x ], x ∈ R. The improper integral∫ ∞

af (x) dx

converges if and only if

∀ε>0

∃R>0

∀x ,y>R

∣∣∣∣∫ y

xf (t) dt

∣∣∣∣ < ε.

Proof.Apply Theorem 4.2.12 to F : [a,∞)→ R, F (x) =

∫ xa f (t) dt, noting that

F (y)− F (x) =

∫ y

af (t) dt −

∫ x

af (t) dt =

∫ y

xf (t) dt.



The Dirichlet Integral4.2.14. Example. The Dirichlet integral∫ ∞

0

sin x

xdx (4.2.3)

is convergent.

Π 3 Π 5 Π 7 Π 9 Π

x

1

y

y

sinHxL

x



The Dirichlet IntegralTo prove that (4.2.3) converges, we consider

I (x , y) :=

∫ y

x

sin t

tdt, x < y , (4.2.4)

which we can rewrite as

I (x , y) =

∫ x+π

x

sin t

tdt −

∫ y+π

y

sin t

tdt +

∫ y+π

x+π

sin τ

τdτ .

Substituting t = τ − π in the last integral, we obtain

I (x , y) =

∫ x+π

x

sin t

tdt −

∫ y+π

y

sin t

tdt −

∫ y

x

sin t

t + πdt. (4.2.5)

Adding (4.2.4) to (4.2.5), we obtain

2I (x , y) =

∫ x+π

x

sin t

tdt −

∫ y+π

y

sin t

tdt + π

∫ y

x

sin t

t(t + π)dτ . (4.2.6)



The Dirichlet IntegralSince |sin(t)/t| ≤ 1/t and |sin(t)/(t(t + π))| ≤ 1/t2 we obtain from(4.2.6) that

2|I (x , y)| ≤ 2π

x+ π

∫ y

x

dt

t2<

2π

x+ π

∫ ∞

x

dt

t2=

2π

x+π

x.

Fix ε > 0. Then choose R > 3π/(2ε). Then for all x , y > R we have∣∣∣∣∫ y

x

sin t

tdt

∣∣∣∣ = |I (x , y)| < ε.

By the Cauchy Criterion 4.2.13, the integral (4.2.3) converges.4.2.15. Remark. The sine cardinal, sinc : R→ R,

sinc(x) :=

sin xx x = 0,

1 x = 0,

plays an important role in signal processing. We will later be able to provethat the value of the integral (4.2.3) is π/2.



Comparison Test for Improper IntegralsAnother important consequence of Theorem 4.2.12 is the following:4.2.16. Comparison Test. Let I ⊂ R and f : I → C, g : I → [0,∞).Suppose that |f (t)| ≤ g(t) for t ∈ I and

∫I g(t) dt converges. Then∫

I f (t) dt also converges.

Proof.We treat the case I = [a, b] only. Let F (x) =

∫ xa f , G (x) =

∫ xa g for

a < x < b. Then

|F (x)− F (y)| =∣∣∣∫ x

af −

∫ y

af∣∣∣ = ∣∣∣∫ x

yf∣∣∣

≤∫ x

y|f | ≤

∫ x

yg = G (x)− G (y). (4.2.7)

Since limxb

G (x) exists, G satisfies the Cauchy criterion (4.2.1). From theestimate (4.2.7), F also satisfies (4.2.1), so lim

xbF (x) exists.



Absolute and Conditional Convergence

4.2.17. Definition. Let I ⊂ R and f : I → C. We say that the improperintegral

∫I f (t) dt converges absolutely if the improper integral

∫I |f (t)| dt

converges.If∫I f (t) dt converges but is not absolutely convergent, we say that the

integral is conditionally convergent.

4.2.18. Remark. An immediate consequence of the Comparison Test 4.2.16is that every absolutely convergent integral is also convergent.

4.2.19. Example. The Dirichlet integral (4.2.3) is conditionally convergent,i.e., ∫ ∞

0

|sin x |x

dx

diverges.



Euler Gamma FunctionThe Euler gamma function is defined through an improper integral,

Γ : R+ → R, Γ (t) : =

∫ ∞

0z t−1e−z dz , t > 0. (4.2.8)

The gamma function plays an important role in many parts ofmathematics, physics and engineering. For example, standard probabilisticdistributions that describe system failure are based on the gamma function(this will be discussed in detail in the course Ve401 Probabilistic Methodsin Engineering).We will first show that the integral (4.2.8) converges absolutely for everyt > 0. We write∫ ∞

0z t−1e−z dz =

∫ 1

0z t−1e−z dz +

∫ ∞

1z t−1e−z dz .

For z > 0 we have |z t−1e−z | < z t−1 and for t > 0 we know thatt − 1 > −1 and

∫ 10 z t−1 dz converges. Hence the first integral converges

absolutely.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 568 / 624


Euler Gamma FunctionNext we consider ∫ ∞

1z t−1e−z dz . (4.2.9)

Recall that ex > xn/n! for x > 0 and any n ∈ N. Hence (setting x = z/2,n = ⌈t⌉ = mink ∈ Z : k > t)

z t−1e−z/2 =1

z1−tez/2<

2⌈t⌉⌈t⌉!z1−tz⌈t⌉

< C (t) for z > 1.

Hence z t−1e−z < C (t)e−z/2. Since∫ ∞

1e−z/2 dz = lim

x→∞

∫ x

1e−z/2 dz = lim

x→∞(2e−1/2 − 2e−x/2) =

2√e

converges, we see that (4.2.9) and hence Γ (t) converges absolutely.



Properties of the Gamma Function4.2.20. Lemma. For any t > 0 we have

Γ (t + 1) = tΓ (t).

Before we discuss the proof, we note that for improper integrals, theconcept of integration by parts generalizes to∫ b

af ′(x)g(x) dx = f (x)g(x)

∣∣ba−∫ b

af (x)g ′(x) dx . (4.2.10)

where all the integrals are improper and

f (x)g(x)∣∣ba:= lim

xbf (x)g(x)− lim

xaf (x)g(x).

All the above limits are of course assumed to exist for (4.2.10) to hold.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 570 / 624


Properties of the Gamma Function

Proof of Lemma 4.2.20.We have

Γ (t + 1) =

∫ ∞

0z te−z dz = −e−zz t

∣∣∞0

+ t

∫ ∞

0z t−1e−z dz = tΓ (t)

Lemma 4.2.20 implies

n! = Γ (n + 1) for n ∈ N.

The Gamma function is hence a continuous extension of the factorialfunction to the positive real numbers. It is also the only continuousextension such that ln Γ is a convex function on R+. (You will checkthat Γ is even differentiable in the assignments.)



Euler Gamma FunctionThe Gamma function is, by its definition, only defined for t > 0. However,it turns out that one can find an extension of Γ (t) to values t > −1,t = 0, as follows: Set

F1(t) :=Γ (t + 1)

t.

Clearly, F1 defines a function on (−1,∞) \ 0. If t > 0, we haveΓ (t + 1) = tΓ (t), so

F1(t) =Γ (t + 1)

t=

tΓ (t)

t= Γ (t)

for t > 0. Hence F1 is an extension of Γ to values t > −1, t = 0. In asimilar manner, an extension F2 of F1 can be found that is defined on(−2,∞) \ −1, 0. This process can be continued indefinitely to achievean extension of the Gamma function to x ∈ R : x = −n, n ∈ N. Thisextension is also denoted Γ (t).



Euler Gamma FunctionBy defining the Gamma function for complex t ∈ C it can be proven thatthe above extension defines a continuous function for all t ∈ C except thenegative real integers, thereby justifying the definitions of F1, F2 etc.

-3 -2 -1 1 2 3 4 5x

-10

-5

5

10

15

y

y = G HxL = à0

¥

zx - 1 e-z âz


Integral Calculus in One Variable Applications of Integration






Integrals and Series

From the preceding discussion of improper integrals, it seems that thereare obvious analogies between functions and sequences:

functions f ←→ sequences (an)

integral∫ ∞

1f ←→ series

∞∑n=1

an

Comparison Test 4.2.16 ←→ Comparison Test 3.5.13

and many more. But can we use these analogies in a practical manner tohelp solve problems?



Integral Test for Series

4.3.1. Integral Test. Let m ∈ N and f : [m,∞)→ [0,∞) be a decreasingfunction, i.e., f (x) ≥ f (y) for m ≤ x ≤ y , such that

∫ xm f (t) dt exists for

every x > m. Then∫ ∞

mf (x) dx <∞ ⇔

∞∑n=m

f (n) <∞. (4.3.1)

Proof.Define the functions

g : [m,∞)→ R, g(x) = f (n) for x ∈ [n − 1, n), n ∈ N, n ≥ m,

h : [m,∞)→ R, h(x) = f (n − 1) for x ∈ [n − 1, n), n ∈ N, n ≥ m.



Integral Test for SeriesProof (continued).For any N ∈ N, the functions g |[m,N] and h|[m,N] are just step functionsand we find∫ N

mg =

N∑n=m+1

f (n),

∫ N

mh =

N∑n=m+1

f (n − 1) =N−1∑n=m

f (n)

Since f is decreasing, we have g(x) ≤ f (x) ≤ h(x) for all x ∈ [m,∞) andso

N∑n=m+1

f (n) =

∫ N

mg ≤

∫ N

mf ≤

∫ N

mh =

N−1∑n=m

f (n)

Letting N →∞, we obtain∞∑

n=m+1

f (n) ≤∫ ∞

mf ≤

∞∑n=m

f (n).



Integral Test for Series4.3.2. Example. The series

∞∑n=1

ln(n)

nα

converges if and only if α > 1. Note that f : [1,∞)→ R, f (x) = x−α ln xis decreasing for xα > e:

f ′(x) = x−α−1(1− α ln x)

We apply the integral test for m > e1/α. Suppose that α = 1. Integratingby ports, ∫

ln x

xαdx =

ln x

(1− α)xα−1− 1

1− α

∫1

xαdx

= −1 + (α− 1) ln x

xα−1(α− 1)2

so it is easy to see that∫∞3

ln xxα dx converges if and only if α > 1. The

case α = 1 can be handled similarly.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 578 / 624


Sequences of FunctionsRecall from Theorem 4.1.20 that if (fn) is a sequence of functionsfn ∈ C ([a, b]) such that fn n→∞−−−→

∥ · ∥∞f then

∫ b

afn

n→∞−−−→∫ b

af .

We can apply this to give the following practical result:4.3.3. Theorem. Let [a, b] ⊂ R and (fn) be a sequence of functions wherefn ∈ C 1([a, b]). Suppose that

f ′nn→∞−−−→∥ · ∥∞

g for some g ∈ C ([a, b])

and that for some x0 ∈ [a, b] the sequence (fn(x0)) converges. Then (fn)converges uniformly to a function f ∈ C 1([a, b]) such that f ′ = g .




Proof.For n ∈ N we have

fn(x) =

∫ x

af ′n + cn, cn = fn(a).

Let x = x0 as in the theorem. Then

cn = fn(x0)−∫ x0

af ′n

n→∞−−−→ limn→∞

fn(x0)−∫ x0

ag =: c .

Hence for any x ∈ [a, b] we have

fn(x)n→∞−−−→

∫ x

ag + c =: f (x),

so we have pointwise convergence to a function f with f ′ = g .




Proof (continued).We will show that the convergence is actually uniform:

∥fn − f ∥∞ ≤ supx∈[a,b]

∣∣∣∫ x

af ′n −

∫ x

ag∣∣∣+ |cn − c|

≤ supx∈[a,b]

|x − a| supt∈[a,x]

|f ′n(t)− g(t)|+ |cn − c |

= |b − a| · ∥f ′n − g∥∞ + |cn − c |.

Since cn → c and f ′n → g uniformly, we are finished.



Series of FunctionsTheorem 4.3.3 can be expressed for series of functions in the following way:4.3.4. Corollary. Let [a, b] ⊂ R and (fn) be a sequence of functions,fn ∈ C 1([a, b]), n ∈ N, such that

∞∑n=0

fn =: f

converges pointwise on [a, b] and∞∑n=0

f ′n

converges uniformly on [a, b]. Then the converge to f is actually uniform,f is differentiable and

f ′ =∞∑n=0

f ′n.



Differentiation and Integration of Power SeriesLet

f : (−ϱ, ϱ)→ R, f (x) =∞∑n=0

anxn

be the function defined by a power series with radius of convergence ϱ.Recall from Lemma 3.6.10 that a power series converges uniformly withinany closed and bounded set within its radius of convergence. Then

(i) f is continuous on (−ϱ, ϱ) by Theorem 3.4.3.(ii) f is differentiable by Theorem 3.6.14 and

f ′(x) =d

dx

∞∑n=0

anxn =

∞∑n=0

d

dx

(anx

n)

=∞∑n=1

nanxn−1, x ∈ (−ϱ, ϱ).



Differentiation and Integration of Power Series(iii) f may be integrated componentwise by Corollary 4.3.4,∫ x

0f (y) dy =

∫ x

0

( ∞∑n=0

anyn)dy =

∞∑n=0

∫ x

0

(any

n)dy

=∞∑n=0

ann + 1

xn+1, x ∈ (−ϱ, ϱ).

In summary:Within the radius of convergence we can “interchange”differentiation and summation as well as integration andsummation of a power series.

The analogous statements hold for power series “centered at x0 ∈ R”,

f : (x0 − ϱ, x0 + ϱ)→ R, f (x) =∞∑n=0

an(x − x0)n. (4.3.2)



Coefficients of Power SeriesFor the power series

f : (x0 − ϱ, x0 + ϱ)→ R, f (x) =∞∑n=0

an(x − x0)n.

we havea0 = f (x0).

Moreover, f ′(x) = 1 · a1 + 2a2(x − x0)2−1 + ... , so

a1 =f ′(x0)

1

and, in general,

an =f (n)(x0)

n!, n ∈ N. (4.3.3)



Taylor’s TheoremHence, a power series

f : (x0 − ϱ, x0 + ϱ)→ R, f (x) =∞∑n=0

an(x − x0)n

has coefficients given by

an =f (n)(x0)

n!, n ∈ N.

We are now interested in the converse question: If I ⊂ R is an interval andf ∈ C∞(R) a smooth function, does the power series

∞∑n=0

an(x − x0)n, an =

f (n)(x0)

n!, n ∈ N.

exist for x0 ∈ I? Does it converge to f (x) for all x?Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 586 / 624


Taylor’s Theorem4.3.5. Taylor’s Theorem. Let I ⊂ R an open interval and f ∈ C k(I ). Letx ∈ I and y ∈ R such that x + y ∈ I . Then for all p ≤ k,

f (x + y) = f (x) +1

1!f ′(x)y + · · ·+ 1

(p − 1)!f (p−1)(x)yp−1 + Rp (4.3.4)

with the remainder term

Rp :=

∫ 1

0

(1− t)p−1

(p − 1)!f (p)(x + ty)yp dt.

Proof.We prove the theorem by induction in p. For p = 1, (4.3.4) becomes

f (x + y)− f (x) =

∫ 1

0f ′(x + ty) · y dt =

∫ x+y

xf ′(z) dz ,

which is just the fundamental theorem of calculus.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 587 / 624


Taylor’s TheoremProof (continued).In order to prove that (4.3.4) for p implies (4.3.4) for p + 1, we show

Rp =1

p!f (p)(x)yp + Rp+1.

Integrating by parts, we have

Rp =

∫ 1

0

(1− t)p−1

(p − 1)!f (p)(x + ty)yp dt

= − (1− t)p

p!f (p)(x + ty)yp

∣∣∣∣1t=0

+

∫ 1

0

(1− t)p

p!

d

dtf (p)(x + ty)yp dt

=1

p!f (p)(x)yp +

∫ 1

0

(1− t)p

p!f (p+1)(x + ty)yp+1 dt

=1

p!f (p)(x)yp + Rp+1.



Lagrange RemainderIf we set x = x0 and y = x − x0 in (4.3.4),

f (x) =

p−1∑n=0

f (n)(x0)

n!(x − x0)

n + Rp (4.3.5)

for f ∈ Cp(I ), with

Rp =

∫ 1

0

(1− t)p−1

(p − 1)!f (p)(tx + (1− t)x0) dt · (x − x0)

p

=

∫ x

x0

(x − τ)p−1

(p − 1)!f (p)(τ) dτ (4.3.6)

This is known as the Lagrange form of the remainder. The remainder termsatisfies the estimate

|Rp| ≤1

(p − 1)!sup

y∈[x0,x]|f (p)(y)| · |x − x0|p (4.3.7)



Analytic Functions and Taylor SeriesWe will call

Tf ;p,x0(x) := f (x)− Rp+1 =

p∑n=0

f (n)(x0)

n!(x − x0)

n (4.3.8)

the Taylor polynomial of degree p of f at x0. The series

limp→∞

Tf ;p,x0(x) =∞∑n=0

f (n)(x0)

n!(x − x0)

n (4.3.9)

is called the Taylor series of f at x0.4.3.6. Definition. Let I ⊂ R be an open interval and f ∈ C∞(I ). We saythat f is real-analytic or just analytic at x0 ∈ I if there exists aneighborhood Bε(x0) = (x0 − ε, x0 + ε) ⊂ I such that

f (x) =∞∑n=0

f (n)(x0)

n!(x − x0)

n

for all x ∈ Bε(x0).Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 590 / 624


Taylor Series4.3.7. Example. Let f : (−1,∞), f (x) = 1/(1 + x). Letting x0 = 0, wehave

f (x) =

p−1∑n=0

f (n)(0)

n!xn + Rp.

We can easily check thatf (n)(x) = (1 + x)−1−n · (−1)nn!,

so f (n)(0) = n!(−1)n and

f (x) =

p−1∑n=0

f (n)(0)

n!xn + Rp =

p−1∑n=0

(−1)nxn + Rp.

Thus the Taylor polynomial of degree p for f at x0 = 0 is given by

Tf ;p,0(x) =

p∑n=0

(−1)nxn



Taylor SeriesWe will show that f actually has a Taylor series at x0 = 0,

f (x) =∞∑n=0

(−1)nxn, for x ∈ (−1, 1). (4.3.10)

First, note that the power series in (4.3.10) has radius of convergenceϱ = 1, so the right-hand side is well-defined.We now prove that the Taylor series actually converges to f , i.e., for anyfixed x ∈ (−1, 1) Rp → 0 as p →∞. This is easy to show if x ≥ 0. Inthat case, we can use the estimate (4.3.7) to see that

|Rp| ≤1

(p − 1)!sup

y∈[x0,x]|f (p)(y)| · |x − x0|p

=1

(p − 1)!sup

y∈[0,x]|1 + y |−1−pp!|x |p = p|x |p p→∞−−−→ 0.



Taylor SeriesHowever, if −1 < x < 0 we have to use a more careful estimate. In thatcase, the Lagrange remainder (4.3.6) gives

Rp =

∫ x

0

(x − τ)p−1

(p − 1)!f (p)(τ) dτ = −p

∫ −|x |

0

(|x |+ τ)p−1

(1 + τ)p+1dτ

= p

∫ |x |

0

(|x | − τ)p−1

(1− τ)p+1dτ

Then

|Rp| ≤ p supτ∈[0,|x |]

∣∣∣∣ |x | − τ1− τ

∣∣∣∣p−1 ∫ |x |

0

1

(1− τ)2dτ

= p|x |p−1

∣∣∣∣1− 1

1− |x |

∣∣∣∣ p→∞−−−→ 0.

Thus we have proven that for every x ∈ (−1, 1) the formal Taylor series off actually converges to f . The convergence is uniform in [−R,R] for everyR < 1 (by the general theory of power series).



Taylor SeriesA 10-term approximation (green) to f (x) = 1/(1 + x) (blue):

-1 1

t

1

x



Taylor Series

4.3.8. Remark. The Taylor series for the previous example could have beenobtained in an easier way: using the geometric series expansion, we have

1

1 + x=

1

1− (−x)=

∞∑n=0

(−x)n

whenever |x | < 1. Since the Taylor series of a function f at x0 is unique(the coefficients are determined by (4.3.3)) it doesn’t matter how weobtain a series expansion - it is always the Taylor series.



Taylor Series

4.3.9. Remark. In the previous example, the Taylor series converged onlyfor |x | < 1. This was not surprising, since the function has a singularity atx = −1. However, the function

f : R→ R, f (x) =1

1 + x2

has a series expansion at x0 = 0 given by

f (x) =1

1 + x2=

1

1− (−x2)=

∞∑n=0

(−1)nx2n, |x | < 1.

The function evidently has no poles in R, but nevertheless the power seriesonly has radius of convergence ϱ = 1. In this case, the poles that limit theradius of convergence are hidden in the complex plane, at x = ±i .



Taylor SeriesWe can use the term-by-term integrability of power series to obtain seriesfor functions where this would otherwise be difficult:4.3.10. Example. Consider the function

f : R→ R, f (x) = arctan x .

To find the Taylor series (4.3.9) directly, we would need to calculatederivatives:

f ′(x) =1

1 + x2, f ′′(x) =

−2x(1 + x2)2

,

f ′′′(x) =−2

(1 + x2)2+

2 · (2x)2

(1 + x2)3

and it is obvious that obtaining higher-order derivatives becomes evermore complicated.



Taylor SeriesHowever, we can calculate

arctan x =

∫ x

0

dy

1 + y2=

∫ x

0

∞∑n=0

(−1)ny2n dy =∞∑n=0

(−1)n∫ x

0y2n dy

=∞∑n=0

(−1)n

2n + 1x2n+1.

The radius of convergence of the power series is ϱ = 1, and the aboveequalities hold by Corollary 4.3.4 for all x ∈ (−1, 1). So far, so good.However, the situation at x = 1 is now quite interesting: both arctan 1 and∑∞

n=0(−1)n

2n+1 converge (the latter by the Leibniz theorem). If they wereequal, we would have

arctan 1 =π

4= 1− 1

3+

1

5− 1

7+− ... ,

which is known as the Leibniz series.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 598 / 624


Abel’s Limit Theorem

The justification for the Leibniz series is in fact provided by the followingtheorem, which will be proved in the assignments:4.3.11. Abel’s Limit Theorem. Let f : [0, 1)→ R be defined by

f (x) =∞∑k=0

akxk ,

where∑

ak converges. Then∑

akxk converges uniformly on [0, 1], so

f ∈ C ([0, 1]) and∞∑k=0

ak = limx1

∞∑k=0

akxk .



Abel SummabilityAbel’s Limit Theorem gives rise to another consideration: if a series∑∞

k=0 ak converges, then the power series∞∑k=0

akxk

has radius of convergence ϱ ≥ 1 and certainly

limx1

∞∑k=0

akxk =

∞∑k=0

ak .

We can hence generalize the concept of summability of sequences, bydefining the sum to be the limit of the power series; we define the Abelsum

A –∞∑n=0

an := limx1

∞∑n=0

anxn

for any complex sequence (an) for which the right-hand side exists.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 600 / 624


Abel Summability4.3.12. Remark. If

∑∞k=0 ak converges,

∞∑k=0

ak = A –∞∑n=1

an.

But we also have sequences whose series does not converge in the classicalsense, such as the following:4.3.13. Example. (an) with an = (−1)n. From Example 4.3.7 we know that

A –∞∑n=0

(−1)n = limx1

∞∑n=0

(−1)nxn = limx1

1

1 + x=

1

2

so the sequence (1,−1, 1,−1, 1, ...) is Abel summable with sum 1/2. Wehence write

1− 1 + 1− 1 + 1−+ · · · = 1

2(4.3.11)

where the left-hand side is a classically divergent series.Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 601 / 624


Abel SummabilityIn the assignments you will show that

A –∞∑n=1

n(−1)n+1 = 1− 2 + 3− 4 + 5−+ · · · = 1

4.

These types of divergent series were historically analyzed by Euler andothers through naive methods, which we briefly sketch here.Let us revisit (4.3.11). If we write S = 1− 1 + 1− 1 + 1− 1 +− ...observe that

1− S = 1− (1− 1 + 1− 1 + 1− 1 +− ...)

= 1− 1 + 1− 1 + 1−+ ... = S

Hence we can deduce that S = 1− S or

S = 1− 1 + 1− 1 + 1− 1 +− ... = 1/2.



Divergent SeriesIn a similar way we can tackle a more interesting series: let

S = 1 + 2 + 3 + 4 + 5 + 6 + 7 + ...

Then−3S = (S − 2 · 2S) = 1 + 2 + 3 + 4 + ...− 2(2 + 4 + 6 + 8 + ...)

= 1− 2 + 3− 4 + 5− 6 +− ...

We can further write−3S = 1− 2 + 3− 4 + 5− 6 +− ...

= 1− (2− 3 + 4− 5 +− ...)

= 1− (1− 2 + 3− 4 +− ...)− (1− 1 + 1− 1 +− ...)

= 1 + 3S − 1/2

Together, this gives

S = 1 + 2 + 3 + 4 + 5 + 6 + 7 + ... = − 1

12. (4.3.12)



Divergent SeriesIt may seem strange that a divergent series yields a “sum” which isnegative. However, mathematicians in the 17th century did believe that“beyond positive infinity” lay the negative numbers.It is remarkable that this concept is even relevant in physics today: forexample, physical system of atoms can have a “negative (absolute)temperature” and then be “hotter” than any such system with a “positivetemperature”.Reference Max-Planck-Gesellschaft. “A temperature below absolute zero: Atomsat negative absolute temperature are the hottest systems in the world.”ScienceDaily, 4 Jan. 2013.http://www.sciencedaily.com/releases/2013/01/130104143516.htm.

The actual sum (4.3.12) occurs in quantum field theory and string theory,so that it is not “nonsense” as you may think. Let us try to justify the sumin a more rigorous fashion.


http://www.sciencedaily.com/releases/2013/01/130104143516.htm


Riemann Zeta FunctionIt turns out that the series 1 + 2 + 3 + 4 + 5 + 6 + ... is notAbel-summable: the limit

A –∞∑n=1

n = limx1

∞∑n=1

nxn = limx1

x

(1− x)2

does not exist! To justify (4.3.12), a more sophisticated approach isnecessary.The Riemann zeta function is defined as

ζ(s) =∞∑n=1

1

ns.

Note that the right-hand side is just the p-series, so we know that ζ(s)exists for s > 1. However, ζ can be extended to a function defined onR \ 1. This extension is analogous to, but more complicated than, theextension of the Euler Gamma function (see Slide 572).



Riemann Zeta FunctionWe denote the extension by ζ also; a sketch is shown below:

1 2s

1

Π2

6

y

y ΖHsL

For example, ζ(2) =∑ 1

n2= π2

6 (see the exercises).Dr. Hohberger (UM-SJTU JI) Functions of a Single Variable Fall 2015 606 / 624


Riemann Zeta Function

Now some divergent series might be considered as special cases of the zetafunction. For example,

ζ(0) =∞∑n=1

1

n0=

∞∑n=1

1 = 1 + 1 + 1 + 1 + ... ,

ζ(−1) =∞∑n=1

1

n−1=

∞∑n=1

n = 1 + 2 + 3 + 4 + ...

We see that the series in (4.3.12) is given by ζ(−1). Let us analyze thegraph of the zeta function more closely in this region.



Riemann Zeta Function

-8 -6 -4 -2 -1s

-1

12

-1

2

y

y ΖHsL

Observe that ζ(0) = −1/2 and ζ(−1) = −1/12! Thus, it (4.3.12) doesmake sense if we regard the summation by extending the zeta function.This summation procedure is called zeta function regularization.



Riemann Zeta FunctionIn order to calculate ζ(−1), we need to understand more about theextension of the zeta function. In particular, we would need to prove thatthe extension of the zeta function satisfies the functional equation

ζ(1− s) =2Γ (s)

(2π)scos(π2s)ζ(s) (4.3.13)

where Γ is the Euler Gamma function. If we set s = 2 in (4.3.13) weobtain

ζ(−1) = 2

4π2Γ (2) cos(π)ζ(2) = − 1

12.

The proof of (4.3.13) is today usually performed using methods ofcomplex analysis. An elementary (but complicated) proof, retracing Euler’soriginal calculations, is available on SAKAI.Reference B. Cais, Divergent series: why 1 + 2 + 3 + · · · = −1/12, retrieved fromhttp://www.math.mcgill.ca/bcais/papers.html


http://www.math.mcgill.ca/bcais/papers.html


A Deceptively Simple Power SeriesConsider the power series

∞∑n=1

nn

n!zn. (4.3.14)

The radius of convergence can be found from

limn→∞

(n + 1)n+1n!

nn(n + 1)!= lim

n→∞

(1 +

1

n

)n

= e

(see Proposition 3.7.2), soϱ =

1

e.

We are interested in the convergence at ±1/e. Let us discuss first theseries

∞∑n=1

nn

n!en. (4.3.15)



Elementary Estimate on the FactorialTo analyze the convergence of (4.3.15), we need an estimate on thegrowth of n!. We may obtain an elementary estimate as follows: Let

Sn = ln(n!) =n∑

k=1

ln k.

Then it is easy to see that∫ n+1

1ln x dx > Sn >

∫ n+1

2ln(x − 1) dx =

∫ n

1ln x dx . (4.3.16)

1 2 3 4 5 6 7x

lnH2L

lnH3L

lnH4LlnH5LlnH6LlnH7L

y

y lnHxL

y lnHx - 1L



Elementary Estimate on the Factorial

Integrating (4.3.16), we immediately obtain

lnnn

en−1< Sn < ln

(n + 1)n+1

en

or, since Sn = ln(n!),

nn

en−1< n! <

(n + 1)n+1

en. (4.3.17)

Noting that (1 + 1/n)n < e by Lemma 3.7.4, we have (n + 1)n < enn andobtain

e(ne

)n< n! < e(n + 1)

(ne

)n. (4.3.18)




4.3.14. Remark. From (4.3.18) and the squeeze theorem, we findn√n!

nn→∞−−−→ 1

e.

This relation is useful when applying the root test to series with factorialcoefficients.We can now analyze the convergence of (4.3.14) at x = ϱ = 1/e. Theestimate (4.3.18) gives

1

n!

(ne

)n>

1

n + 1

1

e

so (4.3.15) diverges by comparison with the harmonic series.




We now turn to the convergence of (4.3.14) at z = −1/e, i.e., theconvergence of

∞∑n=1

(−1)n nn

n!en. (4.3.19)

We would like to apply the Leibniz criterion to show the convergence of(4.3.19). However, while it is not difficult to show that the sequence (an),

an =nn

n!en

is decreasing, it does not follow that it converges to zero. The estimate(4.3.18) only yields an < e for all n ∈ N. Thus, we need a finer estimateon the behavior of n!.



Stirling’s Formula

We say that two functions f , g : R→ R are asymptotically equal asx →∞, writing f (x) ∼ g(x) as x →∞, if

limx→∞

f (x)

g(x)= 1.

This implies thatlimx→∞

g(x)− f (x)

g(x)= 0,

i.e., the relative difference between f (x) and g(x) converges to zero. Wenow prove a famous result on the asymptotic behavior of the factorialwhich has important applications in probability theory and statistics.

4.3.15. Stirling’s Formula. n! ∼√2πn

(ne

)nas n→∞.



Stirling’s FormulaProof.We will consider the sequence (an) given by

an :=n!√

2n(ne

)nand first show that it converges to some C ∈ R. Write

bn := ln an.

Then, after some elementary transformations,

bn − bn+1 =2n + 1

2ln

n + 1

n− 1. (4.3.20)

We want to show that the right-hand side is positive, so that (bn) isdecreasing.



Stirling’s FormulaProof (continued).We will work with a series expansion for ln n+1

n , n ∈ N \ 0. The mostimmediate way to obtain such an expansion would be to use the Taylorseries

ln(1 + x) = x − 1

2x2 +

1

3x3 − 1

4x4 +− ... , |x | < 1. (4.3.21)

However, this leads to an alternating series, which will not be useful inestimating the RHS of (4.3.20):

lnn + 1

n= ln

(1 +

1

n

)=

∞∑k=1

(−1)k+1

k

1

nk

Instead, we might use that

− ln(1− x) = x +1

2x2 +

1

3x3 +

1

4x4 + ... , |x | < 1. (4.3.22)



Stirling’s FormulaProof (continued).Then

lnn + 1

n= − ln

n

n + 1= − ln

(1− 1

n + 1

)=

∞∑k=1

1

k

1

(n + 1)k

=1

n + 1+

∞∑k=2

1

k

1

(n + 1)k.

However, without additional estimates, this is not sufficient to show thatthe RHS of (4.3.20) is positive. We need to use a more sneaky expansion.Note that (4.3.21) and (4.3.22) give

ln1 + x

1− x= ln(1 + x)− ln(1− x) = 2

∞∑k=0

x2k+1

2k + 1.




Proof (continued).We then have

lnn + 1

n= ln

1 + 12n+1

1− 12n+1

= 2∞∑k=0

1

2k + 1

1

(2n + 1)2k+1

=2

2n + 1+ 2

∞∑k=1

1

2k + 1

1

(2n + 1)2k+1

Inserting this into (4.3.20), we have

bn − bn+1 =2n + 1

2ln

n + 1

n− 1 =

∞∑k=1

1

2k + 1

1

(2n + 1)2k> 0.

It follows that the sequence (bn) is decreasing.



Stirling’s FormulaProof (continued).Observe that

bn − bn+1 <

∞∑k=1

1

(2n + 1)2k=

1

(2n + 1)21

1− 1(2n+1)2

=1

4

1

n(n + 1).

Hence,

b1 − bn = (b1 − b2) + (b2 − b3) + · · ·+ (bn−1 − bn)

<1

4

n−1∑k=1

1

k(k + 1)

<1

4

∞∑k=1

1

k(k + 1)=

1

4

∞∑k=1

(1

k− 1

k + 1

)=

1

4.




Proof (continued).so

bn > b1 −1

4=

e√2− 1

4

and (bn) is bounded below. Hence, (bn) converges. Since the exponentialfunction is continuous, we have

limn→∞

an = limn→∞

ebn = elim

n→∞bn,

so (an) converges, i.e.,

C := limn→∞

n!√2n(ne

)nexists.



Stirling’s FormulaProof (continued).We now use the Wallis product formula,

π

2= lim

n→∞

24n(n!)4

((2n)!)2(2n + 1)

(which is proven in the assignments). Then

π

2= lim

n→∞

24n

(2n + 1)

(√2n(ne

)n)4 ( n!√2n(ne

)n)4×(

1√4n(2ne

)2n)2(√4n (2ne )2n(2n)!

)2

= C 2 limn→∞

n2

n(2n + 1)=

C 2

2.




Proof (continued).It follows that C =

√π and hence

limn→∞

n!√2n(ne

)n =√π,

completing the proof.We can now see that there exists a C > 0 such that

an =nn

n!en<

C√n

for all n large enough, i.e., the sequence (an) converges to zeromonotonically. Thus, the power series (4.3.14) converges for z = −1/e.



Final Exam

The preceding material completes the final third of the course material. Itencompasses everything that will be the subject of the Final Exam.The exam date and time will be announced on SAKAI.No calculators or other aids will be permitted during the exam. A sampleexam with solutions has been uploaded to SAKAI. Please study it carefully,including the instructions on the cover page.


honors mathematics ii functions of a single variable -...

Documents