
THE CALCULUS

OF POLYNOMIALS

John A. Beachy

Northern Illinois University

1991, Revised 2003

Contents

1 Polynomials
    1.1 Roots of polynomials
    1.2 Rational roots
    1.3 Approximating real roots
    1.4 Complex roots
    1.5 Interpolating polynomials
    1.6 Mathematical Induction

2 Derivatives
    2.1 Tangent lines for polynomial functions
    2.2 Derivatives
    2.3 Newton’s method
    2.4 Higher derivatives
    2.5 Averages

3 Approximation Techniques
    3.1 Sequences
    3.2 Approximating averages
    3.3 Approximating areas
    3.4 Infinite series
    3.5 Taylor series
    3.6 Trigonometric functions
    3.7 The exponential form for complex numbers


Preface

These notes are intended to be used as a supplement to the material usually taught in the first semester of calculus. Certain techniques used to obtain numerical approximations provide the focal point of the notes. Newton’s method uses tangent lines to find successive approximations to solutions of equations. The idea of using a tangent line to approximate a function (locally) can be extended to use polynomials of higher degree. Polynomial approximations are also useful in finding the area beneath a curve.

This general theme of using polynomials to approximate functions presupposes some knowledge of polynomials. The first chapter of the notes is designed to review some of the necessary background as well as to provide some new information about polynomials. It also includes sections on mathematical induction and the complex numbers.

The second chapter discusses derivatives of polynomial functions, without using limits. (For those already in the know, the tangent line at a point is defined as the linear part of the Taylor expansion at the point.) The definition of the average value of a function uses as motivation the statement that is the Fundamental Theorem of Calculus (in a traditional development). This notion of an average value is then used to compute areas under curves and to define the integral of a function.

The third chapter introduces the concept of a limit, in connection with sequences of approximations. Then integrals are reconsidered, using limits of sequences. It ends with some results on infinite series, including a discussion of Taylor series.

To help set the tone of this course, it may be useful to make some general comments about mathematics. The field of mathematics involves the construction and study of abstract models of physical situations. The construction of a model requires the selection of explicitly stated and precisely formulated premises. These assumptions are called axioms, and the study of the model then involves drawing conclusions from these fundamental assumptions, using as high a degree of logical rigor as possible. The rigor of mathematics is not absolute, but is rather in the process of continual development. For example, Euclid’s axiomatization of geometry and his study of this model of our spatial surroundings was accepted as completely rigorous for over two thousand years, but a modern geometer could point out serious flaws in the logical development of the theory.

The choice of the basic assumptions to be taken as axioms usually involves an oversimplification of the facts. For example, models of the U.S. economy cannot hope to take into account every variable and still have a workable model. A mathematical model should only be viewed as the best statement of the known facts. In many cases a model should be viewed as merely the most efficient, incorporating only enough assumptions to give the desired degree of accuracy in prediction. For example, in an area small in comparison to the total surface of the earth, plane geometry gives a good approximation for questions involving relationships of figures. As soon as the problems involve large distances, spherical geometry must be used as the model. As another example, Newtonian physics is good enough for many problems in mechanics, and it is necessary to introduce the additional assumptions of quantum mechanics only if answers have to be found at the atomic level.

If any distinction at all is to be made between applied mathematics and theoretical mathematics, it is perhaps most evident when talking about the process of modeling. The applied mathematician is probably more involved in the construction of models, and must ask questions about the efficiency of the models and about how closely they approximate the real world. The theoretical mathematician is concerned with developing the model, by investigating the implications of the basic assumptions or axioms. This is done by proving theorems. Of course, if a theorem is proved that is obviously contrary to nature, then it is clear to everyone that the basic assumptions do not coincide with reality. The theoretical mathematician is also concerned with the internal consistency of the models.

In order to make logical deductions from the basic axioms, the language used must be extremely precise. This is done by making use of careful definitions, and symbols that are lifted out of the context of ordinary language in order to strip away any possible ambiguity. Much of the precision and clarity of mathematics is made possible by its use of formulas. The modern reader is usually unaware that this is an achievement only of the past few centuries.¹ For example, the signs $+$ and $-$ appeared in manuscripts for the first time in 1481; parentheses first appeared in 1544; brackets and braces appeared essentially for the first time in 1593 in the works of Vieta; the sign $=$ appeared in 1557; the modern way of writing powers was first used in 1637 by Descartes, but in 1801 Gauss still wrote $xx$ instead of $x^2$.

¹ The notes on symbols are from Differential and Integral Calculus, by Ostrowski.

The motivation of the pure mathematician certainly comes partly from the applications of the theorems that can be developed within a given model. But perhaps more than this, it comes from the joy of creating a theory of particular simplicity, elegance and broad scope. It is certainly difficult to describe the beauty of a mathematical theory, but if one understands the theory, it is not difficult to appreciate its beauty, if for no other reason than what it shows of the intellectual creativity of humankind.

In defense of the theoretical mathematician, it must be said that a theory should not be judged only on its applicability to presently known problems. The history of mathematics is filled with examples of particular theories that seemed at the time to be mere intellectual exercises devoid of any relationship to physical problems, but that later were discovered to have important applications.

One particularly impressive example is provided by non-Euclidean geometry, which arose from the efforts (extending over two thousand years) to prove that Euclid’s parallel axiom could be deduced from his other, more obvious, axioms. This seemed to be a matter of interest only to mathematicians. Even Lobacevskii, the founder of the new geometry, was careful to label it “imaginary”, since he could not see any meaning for it in the actual world. In spite of this, his ideas laid the foundation for a new development of geometry, namely the creation of theories of various non-Euclidean spaces. These ideas later became, in the hands of Einstein, the basis of the general theory of relativity, in which the mathematical model consists of a form of non-Euclidean geometry of four-dimensional space.

The generalizations and abstractions of mathematics often seem at first to be strange and difficult. But with the very general expansion of knowledge and technology that we are currently experiencing, it becomes necessary to identify and elucidate general underlying principles, in order to tie this information together. The language and concepts of mathematics help to fill this need.

Acknowledgments

These notes were written while teaching the honors calculus sequence during the fall of 1986 and the fall of 1991. Some of the material in the first chapter had been covered in previous semesters. In particular, I am indebted to Bill Blair, Henry Leonard, Don McAlister, Linda Sons and Bob Wheeler for their notes on various topics.

J.A.B.


Chapter 1

Polynomials

All of nature is in a state of constant motion and change. The branch of mathematics that provides methods for the quantitative investigation of various processes of change, motion, and dependence of one quantity on another is called mathematical analysis, or simply analysis. A first course in calculus establishes some of the basic methods of analysis, done in relatively simple cases.

The development of the methods of analysis was stimulated by problems in physics. During the 16th century the central problem of physics was the investigation of motion. The expansion of trade, and the accompanying explorations, made it necessary to improve the techniques of navigation, and these in turn depended to a large extent on developments in astronomy. In 1543 Copernicus published “On the revolution of the heavenly bodies”, and then the “New astronomy” of Kepler, containing his first and second laws for the motion of planets around the sun, appeared in 1609. The third law was published by Kepler in 1618 in his book “Harmony of the world”. Galileo, on the basis of his study of Archimedes and his own experiments, laid the foundations for the new mechanics, an indispensable science for the newly arising technology.

During the Renaissance the Europeans finally became acquainted with Greek mathematics by way of the Arabic translations, after a period of almost one thousand years of scientific stagnation. The books of Euclid, Ptolemy, and Al-Kharizmi were translated in the 12th century from Arabic into Latin, the common scientific language of Western Europe, and at the same time the earlier Greek and Roman system of calculation was gradually replaced by the vastly superior Indian method, which also reached the Europeans via the Arabs.

It was not until the 16th century that European mathematicians finally surpassed the achievements of their predecessors, with the solution by the Italians Tartaglia and Ferrari of the general cubic and fourth degree equations.

The concepts of variable magnitude and function arose gradually as a result of the interest of physics in laws of motion, as for example in the work of Kepler and Galileo. Galileo discovered the law of falling bodies by establishing that the distance fallen increases in direct proportion to the square of the time.

The appearance in 1637 of the new “geometry” of Descartes marked the first definite step toward a mathematics of variable magnitudes. This combined algebraic and geometric techniques, and is now known as analytic geometry. The main content of the new geometry was the theory of conic sections: the ellipse, hyperbola, and parabola. This theory had been developed extensively by the ancient Greeks in geometric form, and the combination of this knowledge with algebraic techniques and the general idea of a variable magnitude produced analytic geometry.

For the Greeks the conic sections were a subject of purely mathematical interest, but by the time of Descartes they were of practical importance for astronomy, mechanics, and technology. Kepler discovered that each planet travels around the sun in an elliptical orbit, and Galileo established that an object thrown in the air travels along a parabolic path (both of these are only first approximations). These discoveries made it necessary to calculate various magnitudes associated with the conic sections and it was the method of Descartes that solved this problem.

The next decisive step was taken by Newton and Leibnitz during the second half of the 17th century, and resulted in the founding of differential and integral calculus. The Greeks and later mathematicians had studied the geometric problems of drawing tangents to curves and finding areas and volumes of figures. The remarkable discovery of the relation of these problems to the problems of the new mechanics and the formulation of general methods for solving them was brought to completion in the work of Newton and Leibnitz. This relation was discovered because of the possibility, through the use of analytic geometry, of making a graphical representation of how one variable depends on another. In short, what is involved is the construction of a geometric model of relationships involving variable magnitudes.

The simplest relationships are those given by polynomials such as $x^3 - 2x + 3$. The most elementary ones are the linear polynomials, which have the general form $mx + b$, for constants $m$ and $b$. Complicated expressions like $e^{x^2} - \sin^3(x)$ are much more difficult to work with than polynomials, and so many times it is useful to approximate such complicated expressions by using polynomials. The simplest case would be to attempt to approximate a function by a linear function of the form $f(x) = mx + b$. At best this is only possible for a small interval of $x$ values, and so differential calculus focuses on the construction and use of tangent lines. By using higher derivatives, the idea of a tangent line can be extended to the idea of polynomials of higher degree which are “tangent” in some sense to a given curve. These ideas are introduced in Chapter 2, and provide the motivation for much of this chapter.

The first section of this chapter reviews some basic facts about roots of polynomials. One crucial fact is that a number $c$ is a root of the polynomial $p(x)$ if and only if $x - c$ is a factor of $p(x)$. Actually, we show more: $p(c)$ is the remainder when $p(x)$ is divided by $x - c$. (This is called the Remainder Theorem.) This elementary fact is used repeatedly throughout the study of calculus. Section 1.2 reviews synthetic division and other techniques for finding rational roots of polynomials with integer coefficients. Section 1.3 introduces some methods for approximating solutions that are not rational numbers.

The general problem of solving polynomial equations leads to a study of complex numbers, since they are necessary to solve $x^2 + 1 = 0$. Some basic facts about complex numbers are presented in Section 1.4. Quite often we need to use induction to establish general formulas, and so a short introduction is provided in Section 1.6.


1.1 Roots of polynomials

We begin with some notation. The set $\{0, \pm 1, \pm 2, \pm 3, \ldots\}$ is called the set of integers. We will use the symbol $\mathbf{Q}$ to denote the set of rational numbers. That is,

$$\mathbf{Q} = \left\{ \frac{m}{n} \;\middle|\; m, n \text{ are integers and } n \neq 0 \right\}$$

where we must agree that $m/n$ and $p/q$ represent the same ratio if $mq = np$. By identifying the integer $m$ with the fraction $m/1$, we can think of the set of integers as a subset of $\mathbf{Q}$, and so we can say, very roughly, that we have enlarged the set of integers to the set $\mathbf{Q}$ of rational numbers in order to have a set in which division is possible.

We will use the symbol $\mathbf{R}$ for the set of real numbers. We will simply view them as the set of all decimal numbers. They can be thought of as the coordinates of the points on a straight line. A precise development of the construction of the set of real numbers requires more sophisticated concepts and much more time than we have available to us at this point. The rational numbers can be viewed as a subset of $\mathbf{R}$, since fractions correspond to either terminating or repeating decimals.

1.1.1 DEFINITION. An expression of the form

$$a_m x^m + a_{m-1} x^{m-1} + \cdots + a_1 x + a_0$$

is called a polynomial in the indeterminate $x$. The exponents (and subscripts) $m, m-1, \ldots, 1, 0$ must be non-negative integers, and we will assume that the coefficients $a_m, a_{m-1}, \ldots, a_0$ are real numbers. We say that an expression of this form is a polynomial over $\mathbf{R}$.

If $n$ is the largest nonnegative integer such that $a_n \neq 0$, then we say that the polynomial has degree $n$, and $a_n$ is called the leading coefficient of the polynomial.

According to this definition, the zero polynomial has no degree, and a constant polynomial $a_0$ has degree 0 when $a_0 \neq 0$. It is important to note that two polynomials are equal precisely when they have the same degree and all corresponding coefficients are equal.

If $a(x) = a_m x^m + a_{m-1} x^{m-1} + \cdots + a_0$ and $b(x) = b_n x^n + b_{n-1} x^{n-1} + \cdots + b_0$ are polynomials, then $a(x)$ and $b(x)$ can be added by just adding corresponding coefficients. Their product $a(x)b(x)$ is

$$a(x)b(x) = a_m b_n x^{n+m} + \cdots + (a_0 b_2 + a_1 b_1 + a_2 b_0)x^2 + (a_0 b_1 + a_1 b_0)x + a_0 b_0.$$

In the above formula, two polynomials are multiplied by multiplying each term of the first by each term of the second, and then collecting similar terms. Table 1.1.1 shows an efficient way to do this, by arranging the similar terms in columns.

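Table 1.1.1: $(2x^4 - 5x^3 + 3x + 1)(x^2 + 2x - 4)$

$$
\begin{array}{rrrrrrr}
 & & 2x^4 & -5x^3 & & +3x & +1 \\
\times & & & & x^2 & +2x & -4 \\
\hline
 & & -8x^4 & +20x^3 & & -12x & -4 \\
 & 4x^5 & -10x^4 & & +6x^2 & +2x & \\
2x^6 & -5x^5 & & +3x^3 & +x^2 & & \\
\hline
2x^6 & -x^5 & -18x^4 & +23x^3 & +7x^2 & -10x & -4
\end{array}
$$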

We will sometimes need to simplify expressions such as $(x + c)^n$. For $n = 2$ and $n = 3$ we have the following identities.

$$
\begin{aligned}
(x + c)^2 &= (x + c)(x + c) \\
&= x(x + c) + c(x + c) \\
&= x^2 + xc + cx + c^2 \\
&= x^2 + 2cx + c^2
\end{aligned}
$$

$$
\begin{aligned}
(x + c)^3 &= (x + c)(x + c)^2 \\
&= x(x^2 + 2cx + c^2) + c(x^2 + 2cx + c^2) \\
&= x^3 + 2cx^2 + c^2 x + cx^2 + 2c^2 x + c^3 \\
&= x^3 + 3cx^2 + 3c^2 x + c^3
\end{aligned}
$$

The coefficients in the above identities can be found from Pascal’s triangle, which is given in Table 1.1.2. In this triangle each row begins and ends with 1, and the other terms are found by adding together the two numbers immediately above the term.

The last row gives the coefficients for $(x + c)^6$. In the section on mathematical induction we will give a proof of the Binomial Theorem, which computes the coefficients by using a different formula, and then we will show that we get the same answers either way.

Table 1.1.2: Pascal’s triangle (to $n = 6$)

$$
\begin{array}{c}
1 \\
1 \quad 1 \\
1 \quad 2 \quad 1 \\
1 \quad 3 \quad 3 \quad 1 \\
1 \quad 4 \quad 6 \quad 4 \quad 1 \\
1 \quad 5 \quad 10 \quad 10 \quad 5 \quad 1 \\
1 \quad 6 \quad 15 \quad 20 \quad 15 \quad 6 \quad 1
\end{array}
$$
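The rule for building the triangle can be sketched in a few lines of Python. This is only an illustration: the function name `pascal_rows` is our own choice and does not come from the notes.

```python
def pascal_rows(n):
    """Return rows 0..n of Pascal's triangle as lists of integers.

    Illustrative helper; the name is hypothetical, not from the notes.
    """
    rows = [[1]]
    for _ in range(n):
        prev = rows[-1]
        # Each interior entry is the sum of the two entries immediately above it.
        interior = [prev[i] + prev[i + 1] for i in range(len(prev) - 1)]
        # Every row begins and ends with 1.
        rows.append([1] + interior + [1])
    return rows


for row in pascal_rows(6):
    print(row)
```

The last list printed holds the coefficients of $(x + c)^6$, and asking for one more row gives the coefficients needed for a seventh power.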

Let $a(x) = a_m x^m + a_{m-1} x^{m-1} + \cdots + a_0$ and $b(x) = b_n x^n + b_{n-1} x^{n-1} + \cdots + b_0$ be polynomials. To write down the general formula for the product $a(x)b(x)$, it is useful to introduce a notation for sums. It is traditional to use a Greek letter, sigma, to denote a sum. To show the terms that are being added we typically use subscripts, and then part of the notation tells what the subscripts should be. For example, we can write

$$a_0 b_2 + a_1 b_1 + a_2 b_0 = \sum_{i=0}^{2} a_i b_{2-i}$$

to describe the coefficient of $x^2$ in $a(x)b(x)$. In general, the coefficient $c_k$ of $x^k$ in $a(x)b(x)$ is given by the formula

$$c_k = \sum_{i=0}^{k} a_i b_{k-i},$$

which can also be written as

$$c_k = \sum_{i+j=k} a_i b_j .$$

Here any coefficient whose subscript exceeds the degree of its polynomial is taken to be 0.
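As a rough check of the coefficient formula, the sum over $i + j = k$ can be carried out mechanically. The sketch below assumes that a polynomial is stored as a Python list with the coefficient of $x^i$ at index $i$ (lowest degree first); the name `poly_mul` is ours. It is applied to the two factors that the rows of Table 1.1.1 actually use, $2x^4 - 5x^3 + 3x + 1$ and $x^2 + 2x - 4$.

```python
def poly_mul(a, b):
    """Multiply two polynomials stored as coefficient lists, lowest degree first.

    a[i] is the coefficient of x**i, so c[k] is the sum of a[i] * b[j]
    over all pairs with i + j == k.  (Illustrative helper, not from the notes.)
    """
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c


a = [1, 3, 0, -5, 2]   # 2x^4 - 5x^3 + 3x + 1, constant term first
b = [-4, 2, 1]         # x^2 + 2x - 4, constant term first
print(poly_mul(a, b))  # coefficients of the product, constant term first
```

Read from right to left, the printed list should match the bottom row of the table.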

1.1.2 PROPOSITION. If $a(x)$ and $b(x)$ are nonzero polynomials over $\mathbf{R}$, then the product $a(x)b(x)$ is nonzero and

$$\deg(a(x)b(x)) = \deg(a(x)) + \deg(b(x)).$$

Proof. Suppose that $a(x) = a_m x^m + \cdots + a_0$ and $b(x) = b_n x^n + \cdots + b_0$, with the degree of $a(x)$ equal to $m$ and the degree of $b(x)$ equal to $n$, so that $a_m \neq 0$ and $b_n \neq 0$. From the general formula for multiplication of polynomials, the leading coefficient of $a(x)b(x)$ is $a_m b_n$, which must be nonzero since the product of nonzero real numbers is nonzero. Thus the degree of $a(x)b(x)$ is $m + n$ since $a_m b_n$ is the coefficient of $x^{m+n}$.

We are interested in solving polynomial equations, or equivalently, in finding roots of polynomials. The main result that we need is that $c$ is a root of a given polynomial $p(x)$ if and only if $x - c$ is a factor of $p(x)$. To check this we need to be able to divide $p(x)$ by $x - c$. The next example should help you recall how to divide polynomials.

Example 1.1.1 ($(2x^3 - 3x^2 + 5x + 1) \div (x + 1)$).

To divide the polynomial $2x^3 - 3x^2 + 5x + 1$ by $x + 1$ we can use the standard algorithm for division, as illustrated in Table 1.1.3.

Table 1.1.3: $(2x^3 - 3x^2 + 5x + 1) \div (x + 1)$

$$
\begin{array}{r|rrrr}
 & & 2x^2 & -5x & +10 \\
\hline
x + 1 & 2x^3 & -3x^2 & +5x & +1 \\
 & 2x^3 & +2x^2 & & \\
\hline
 & & -5x^2 & +5x & \\
 & & -5x^2 & -5x & \\
\hline
 & & & 10x & +1 \\
 & & & 10x & +10 \\
\hline
 & & & & -9
\end{array}
$$

Thus $2x^3 - 3x^2 + 5x + 1 = (x + 1)(2x^2 - 5x + 10) + (-9)$. □
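The division steps of Example 1.1.1 can be mirrored in code. The following is a minimal sketch, assuming coefficients are listed from the highest power down to the constant and that the divisor has a nonzero leading coefficient; the name `poly_divmod` is ours, and exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction


def poly_divmod(num, den):
    """Long division of polynomials, coefficients listed highest degree first.

    Returns (quotient, remainder) with num = den * quotient + remainder.
    Assumes den[0] != 0.  (Illustrative helper, not from the notes.)
    """
    rem = [Fraction(c) for c in num]
    quot = []
    # One pass per quotient term: cancel the current leading term of the remainder.
    for _ in range(len(num) - len(den) + 1):
        factor = rem[0] / den[0]
        quot.append(factor)
        for i, d in enumerate(den):
            rem[i] -= factor * d
        rem.pop(0)  # the leading coefficient is now zero, so drop it
    return quot, rem


q, r = poly_divmod([2, -3, 5, 1], [1, 1])   # (2x^3 - 3x^2 + 5x + 1) / (x + 1)
print(q, r)
```

The quotient list corresponds to $2x^2 - 5x + 10$ and the single remaining entry to the remainder $-9$.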

Theorem 1.1.4 will show that for any polynomial $p(x)$, the remainder when $p(x)$ is divided by $x - c$ is $p(c)$. That is, $p(x) = (x - c)q(x) + p(c)$. The remainder $p(c)$ and quotient $q(x)$ are unique. The usual proof of this theorem uses the algorithm that we followed in Example 1.1.1. The next lemma leads to a simpler proof.

1.1.3 LEMMA. For any real number $c$ and any positive integer $k$, the linear term $x - c$ is a factor of $x^k - c^k$.

Proof. Multiplying out the right hand side shows that we have the following factorization.

$$x^k - c^k = (x - c)(x^{k-1} + cx^{k-2} + \cdots + c^{k-2}x + c^{k-1}).$$

That is all that we need to complete the proof.
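For readers who would like to see the multiplication written out, the following display sketches it with the same symbols as the lemma. The grouping into two bracketed sums is our own arrangement, chosen only to make the cancellation visible.

```latex
\begin{aligned}
(x - c)\left(x^{k-1} + c x^{k-2} + \cdots + c^{k-2} x + c^{k-1}\right)
  &= \left(x^{k} + c x^{k-1} + \cdots + c^{k-2} x^{2} + c^{k-1} x\right) \\
  &\quad - \left(c x^{k-1} + c^{2} x^{k-2} + \cdots + c^{k-1} x + c^{k}\right) \\
  &= x^{k} - c^{k}
\end{aligned}
```

Every term of the first bracket except $x^k$ is cancelled by a term of the second bracket, which leaves only $x^k - c^k$.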

1.1.4 THEOREM (The Remainder Theorem). Let $p(x)$ be a nonzero polynomial over $\mathbf{R}$, and let $c$ be any real number. Then there exists a polynomial $q(x)$ with real coefficients such that

$$p(x) = (x - c)q(x) + p(c).$$

Moreover, if $p(x) = (x - c)q_1(x) + k$, where $q_1(x)$ is a polynomial over $\mathbf{R}$ and $k$ is a real number, then $q_1(x) = q(x)$ and $k = p(c)$.

Proof. If $p(x) = a_m x^m + \cdots + a_0$, then $p(x) - p(c) = a_m(x^m - c^m) + \cdots + a_1(x - c)$. By Lemma 1.1.3, $x - c$ is a factor of each term on the right hand side of the equation, and so it must be a factor of $p(x) - p(c)$. Thus $p(x) - p(c) = (x - c)q(x)$ for some polynomial $q(x)$ over $\mathbf{R}$, or equivalently, $p(x) = (x - c)q(x) + p(c)$.

If $p(x) = (x - c)q_1(x) + k$, then $(x - c)(q(x) - q_1(x)) = k - p(c)$. If $q(x) - q_1(x) \neq 0$, then by Proposition 1.1.2 the left hand side of the equation has degree $\geq 1$, which contradicts the fact that the right hand side of the equation is a constant. Thus $q(x) - q_1(x) = 0$, which also implies that $k - p(c) = 0$, and so the quotient and remainder are unique.

Given a polynomial $p(x)$ and a real number $c$, to find $p(c)$ we simply substitute $c$ in place of $x$. We can also refer to this process by saying that we evaluate $p(x)$ at $c$. When evaluating a polynomial, it is often best to write it with nested parentheses, as in the following example.

In the polynomial $2x^3 - 7x^2 + 5x + 20$ we first factor $x$ out of $2x^3 - 7x^2 + 5x$ to get

$$2x^3 - 7x^2 + 5x + 20 = (2x^2 - 7x + 5)x + 20 .$$

We then factor $x$ out of $2x^2 - 7x$, giving

$$2x^3 - 7x^2 + 5x + 20 = ((2x - 7)x + 5)x + 20 .$$

When entering a polynomial into a graphing calculator, using the nested parentheses form will sometimes speed up the graphing process, since no exponentiation is used, but just addition, subtraction, and multiplication.

Here is another example, in which we have inserted zeros for the appropriate coefficients.

$$
\begin{aligned}
x^5 + 2x^2 - 3 &= x^5 + 0x^4 + 0x^3 + 2x^2 + 0x - 3 \\
&= ((((x + 0)x + 0)x + 2)x + 0)x - 3
\end{aligned}
$$
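The nested form translates directly into a loop that uses only multiplication and addition. The sketch below assumes the coefficients are listed from the highest power down to the constant, with 0 inserted for any missing power; the function name `nested_eval` is our own.

```python
def nested_eval(coeffs, c):
    """Evaluate a polynomial at c using nested parentheses.

    coeffs lists the coefficients from the highest power down to the constant,
    with 0 inserted for any missing power.  (Illustrative helper.)
    """
    value = 0
    for a in coeffs:
        # Multiply the running value by c, then add the next coefficient.
        value = value * c + a
    return value


print(nested_eval([2, -7, 5, 20], 2))        # 2x^3 - 7x^2 + 5x + 20 at x = 2
print(nested_eval([1, 0, 0, 2, 0, -3], -1))  # x^5 + 2x^2 - 3 at x = -1
```

Each pass through the loop closes one pair of parentheses, working from the innermost pair outward.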

1.1.5 DEFINITION. Let $p(x) = a_m x^m + \cdots + a_0$ be a polynomial over $\mathbf{R}$. A real number $c$ is called a root of the polynomial $p(x)$ if $p(c) = 0$.

1.1.6 COROLLARY. Let $p(x)$ be a nonzero polynomial over $\mathbf{R}$, and let $c$ be any real number. Then $c$ is a root of $p(x)$ if and only if $x - c$ is a factor of $p(x)$.

Proof. Using the Remainder Theorem, we can write $p(x) = (x - c)q(x) + p(c)$, and then it follows that $p(c) = 0$ if and only if $p(x) = (x - c)q(x)$.

1.1.7 COROLLARY. A polynomial of degree $n$ has at most $n$ distinct roots.

Proof. Suppose that $p(x)$ is a polynomial of degree $n$. If $c$ is a root of $p(x)$, then by Corollary 1.1.6 we can write $p(x) = (x - c)q(x)$, for some polynomial $q(x)$. If $a$ is any root of $p(x)$, then substituting shows that $(a - c)q(a) = 0$, which implies that either $q(a) = 0$ or $a = c$. This shows that we can reduce the problem to a polynomial of lower degree, since $q(x)$ has degree $n - 1$. If we already know that $q(x)$ has at most $n - 1$ distinct roots, then $p(x)$ can have at most $n - 1$ distinct roots which are different from $c$.

One of the facts from geometry is that two distinct points determine a unique straight line. In the language of polynomials, this translates into the statement that if $p(x) = a_1 x + a_0$ is a polynomial of degree 1, then knowing $p(c_1)$ and $p(c_2)$ for any two points $c_1 \neq c_2$ completely determines the coefficients $a_1$ and $a_0$.

The next proposition will show that a similar result holds for quadratic polynomials. If $p(x) = a_2 x^2 + a_1 x + a_0$ and we are given $p(c_1)$, $p(c_2)$, and $p(c_3)$, for three distinct points $c_1$, $c_2$, and $c_3$, then the coefficients $a_2$, $a_1$, and $a_0$ are completely determined. In geometric terms, this implies that three points (not on a line) determine a unique parabola.

1.1.8 PROPOSITION. Let $a(x)$ and $b(x)$ be polynomials of degree $n$. If $a(x)$ and $b(x)$ agree at $n + 1$ distinct points, then they must be equal.

Proof. Let $p(x) = a(x) - b(x)$. If $a(c_i) = b(c_i)$ for $1 \leq i \leq n + 1$, then $p(x)$ has $n + 1$ distinct roots. If $p(x)$ is nonzero, then $\deg(p(x)) \leq n$ because both $a(x)$ and $b(x)$ have degree $n$, and so Corollary 1.1.7 shows that this cannot happen. The only possibility that is left is that $p(x)$ is the zero polynomial, which shows that $a(x) = b(x)$.

Example 1.1.2 ($(x - 1)^3 + 3(x - 1)^2 + 3(x - 1) + 1 = x^3$).

Let $a(x) = x^3$ and $b(x) = (x - 1)^3 + 3(x - 1)^2 + 3(x - 1) + 1$. To show that $a(x)$ and $b(x)$ are equal we only need to show that they agree at four distinct points. We will evaluate both polynomials at $x = -1$, $x = 0$, $x = 1$, and $x = 2$. We have $a(-1) = -1$, $a(0) = 0$, $a(1) = 1$, and $a(2) = 8$. On the other hand, we have $b(-1) = -8 + 12 - 6 + 1 = -1$, $b(0) = -1 + 3 - 3 + 1 = 0$, $b(1) = 1$, and $b(2) = 1 + 3 + 3 + 1 = 8$. Therefore $x^3 = (x - 1)^3 + 3(x - 1)^2 + 3(x - 1) + 1$. □
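The four evaluations in Example 1.1.2 are easy to repeat by machine. The short sketch below simply encodes the two polynomials of the example as Python functions named `a` and `b` and compares them at the same four points; by Proposition 1.1.8, agreement at four distinct points is enough for polynomials of degree 3.

```python
def a(x):
    return x**3


def b(x):
    return (x - 1)**3 + 3 * (x - 1)**2 + 3 * (x - 1) + 1


points = [-1, 0, 1, 2]   # four distinct points suffice for degree 3
agree = all(a(x) == b(x) for x in points)
print([(x, a(x), b(x)) for x in points], agree)
```

The same pattern, with different functions and a list of $n + 1$ points, applies to any pair of polynomials of degree $n$.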

PROBLEMS: 1.1

1. Write out the expansions of $(x + c)^4$ and $(x + c)^5$.

2. Add the next row to Pascal’s triangle, as given in the text, and use it to write out the expansion of $(x + 2)^7$.

3. Rewrite $x^5 - 7x^3 + 5x^2 - 3x + 4$ using nested parentheses.

4. Rewrite $x^4 + 5x^3 + 11x^2 - 2x - 8$ using nested parentheses.

5. Use Proposition 1.1.8 to show that the polynomials $a(x) = x^3 + x - 4$ and $b(x) = (x + 1)^3 - 3(x + 1)^2 + 4(x + 1) - 6$ are equal.

6. Use Proposition 1.1.8 to show that the polynomials $a(x) = x^3 + 1$ and $b(x) = \frac{3}{2}(x - 1)(x)(x + 1) - (x - 2)(x)(x + 1) + \frac{1}{2}(x - 2)(x - 1)(x + 1)$ are equal.

7. Find $q(x)$ with $x^6 - 1 = (x^3 - 1)q(x)$, showing that $x^3 - 1$ is a factor of $x^6 - 1$.

8. Find $q(x)$ with $x^9 - 1 = (x^3 - 1)q(x)$, showing that $x^3 - 1$ is a factor of $x^9 - 1$.

9. Suppose that $m$ and $n$ are positive integers for which $m$ is a factor of $n$. Explain why $x^m - 1$ must be a factor of $x^n - 1$.

Hint: Write $n = mk$ and apply Lemma 1.1.3.

10. Explain why rational numbers correspond to decimals that are either repeating or terminating.

Hint: If $q = m/n$, then when dividing $m$ by $n$ to put $q$ into decimal form there are at most $n$ different remainders. Conversely, if $d$ is a repeating decimal, then find $s, t$ such that $10^s d - 10^t d$ is an integer.


1.2 Rational roots

In this section we will work with polynomials that have integer coefficients. The easiest roots to find are those which are rational numbers, and the next proposition shows that there are only a finite number of possibilities that must be checked. Using a computer program or calculator to graph the polynomial function can help a great deal in deciding which possible roots actually work.

1.2.1 PROPOSITION (Rational roots). Let $p(x) = a_m x^m + a_{m-1} x^{m-1} + \cdots + a_1 x + a_0$ be a polynomial with integer coefficients. If $r/s$ is a rational root of $p(x)$ such that $r$ and $s$ have no factors in common, then $r$ must be a factor of $a_0$ and $s$ must be a factor of $a_m$.

Proof. If $p(\frac{r}{s}) = 0$, then

$$a_m \left(\frac{r}{s}\right)^m + a_{m-1} \left(\frac{r}{s}\right)^{m-1} + \cdots + a_1 \left(\frac{r}{s}\right) + a_0 = 0 ,$$

and multiplying by $s^m$ gives the equation

$$a_m r^m + a_{m-1} r^{m-1} s + \cdots + a_1 r s^{m-1} + a_0 s^m = 0 .$$

This leads to the equations

$$a_m r^m = s(-a_{m-1} r^{m-1} - \cdots - a_1 r s^{m-2} - a_0 s^{m-1})$$

and

$$a_0 s^m = r(-a_m r^{m-1} - a_{m-1} r^{m-2} s - \cdots - a_1 s^{m-1}) .$$

Thus $s$ is a factor of $a_m r^m$ and $r$ is a factor of $a_0 s^m$. Since $r$ and $s$ are assumed to have no factors in common, we see that $r$ must be a factor of $a_0$ and $s$ must be a factor of $a_m$.

Example 1.2.1.

Suppose that we need to find all integer roots of the polynomial

$$p(x) = x^3 - 5x^2 + 2x - 10.$$

Using Proposition 1.2.1, all rational roots of $p(x)$ can be found by testing only a finite number of values. By looking at the signs we can see that $p(x)$ cannot have any negative roots, so we only need to check the positive factors of 10. Substituting these factors into $p(x)$ we obtain $p(1) = -12$, $p(2) = -18$ and $p(5) = 0$. Thus 5 is a root of $p(x)$, and so we can use polynomial division to show that $x^3 - 5x^2 + 2x - 10 = (x - 5)(x^2 + 2)$. Because $x^2 + 2$ has no real roots, this completes the factorization, and it follows that 5 is the only integer root. □

Example 1.2.2.

In this example we will find the rational roots of

$$p(x) = 9x^4 - 6x^3 + 19x^2 - 12x + 2.$$

The possible numerators are the factors of 2, while the possible denominators are the factors of 9. The list of possible rational roots is

$$\pm 1, \ \pm 2, \ \pm\frac{1}{3}, \ \pm\frac{2}{3}, \ \pm\frac{1}{9}, \ \pm\frac{2}{9}.$$

Graphing the function shows that there is a root between .3 and .4, so this suggests that 1/3 might be a root. (Since the possible roots lie between 0 and 2, on your graphing calculator first use the window $0 \leq x \leq 2$; $-1 \leq y \leq 1$. To take a closer look, you might try the window $0 \leq x \leq .6$; $-.2 \leq y \leq .2$.) If 1/3 is a root, then we know that $x - 1/3$ must be a factor of $p(x)$. In order to keep to integer coefficients, it is better to check whether or not $3x - 1$ is a factor of $p(x)$. In fact, using polynomial division we can see that $p(x) = (3x - 1)(3x^3 - x^2 + 6x - 2)$. Dividing again by $3x - 1$ gives the complete factorization

$$p(x) = (3x - 1)^2 (x^2 + 2),$$

and so the only root of $p(x)$ is 1/3. □
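The search described by Proposition 1.2.1 can also be done by brute force instead of by graphing. The sketch below is one possible arrangement, not a method taken from the notes: it assumes integer coefficients listed from the highest power down, with nonzero leading and constant coefficients, and the names `divisors` and `rational_roots` are our own. Exact fractions avoid any rounding error.

```python
from fractions import Fraction


def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]


def rational_roots(coeffs):
    """Rational roots of an integer polynomial, coefficients highest degree first.

    Assumes the leading and constant coefficients are both nonzero.
    (Illustrative helper, not from the notes.)
    """
    candidates = set()
    for r in divisors(coeffs[-1]):        # numerators: factors of a_0
        for s in divisors(coeffs[0]):     # denominators: factors of a_m
            candidates.update({Fraction(r, s), Fraction(-r, s)})
    roots = []
    for x in sorted(candidates):
        value = 0
        for a in coeffs:                  # nested-parentheses evaluation
            value = value * x + a
        if value == 0:
            roots.append(x)
    return roots


print(rational_roots([9, -6, 19, -12, 2]))   # polynomial of Example 1.2.2
print(rational_roots([1, -5, 2, -10]))       # polynomial of Example 1.2.1
```

A repeated root such as 1/3 in Example 1.2.2 is reported only once, since the function collects distinct roots and does not track multiplicity.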

We need to find some easier methods for evaluating polynomials, and so we will take a closer look at the method using nested parentheses. The algorithm for evaluating a polynomial by using nested parentheses is usually called synthetic division. Suppose that $p(x) = a_m x^m + a_{m-1} x^{m-1} + \cdots + a_0$ and $q(x) = b_{m-1} x^{m-1} + b_{m-2} x^{m-2} + \cdots + b_0$ with $p(x) = (x - c)q(x) + p(c)$. Multiplying this out and equating coefficients shows that the following equations hold.

$$
\begin{aligned}
a_m &= b_{m-1} \\
a_{m-1} &= b_{m-2} - c b_{m-1} \\
&\ \ \vdots \\
a_1 &= b_0 - c b_1 \\
a_0 &= p(c) - c b_0
\end{aligned}
$$

Solving for the coefficients of $q(x)$ gives

$$
\begin{aligned}
b_{m-1} &= a_m \\
b_{m-2} &= b_{m-1} c + a_{m-1} = a_m c + a_{m-1} \\
b_{m-3} &= b_{m-2} c + a_{m-2} = (a_m c + a_{m-1})c + a_{m-2} \\
&\ \ \vdots
\end{aligned}
$$

This shows that the coefficients of $q(x)$ can be found from the partial answers in the “nested parentheses” procedure for evaluating $p(c)$.

Example 1.2.3.

We will use “nested parentheses” to divide $p(x) = 2x^3 - 7x^2 + 5x + 20$ by $x - 2$. As before, we can write

$$2x^3 - 7x^2 + 5x + 20 = ((2x - 7)x + 5)x + 20 ,$$

and then we have the following steps:

$$
\begin{aligned}
b_2 &= 2 \\
b_1 &= 2 \cdot 2 - 7 = -3 \\
b_0 &= (-3) \cdot 2 + 5 = -1 \\
p(2) &= (-1) \cdot 2 + 20 = 18 .
\end{aligned}
$$

Thus we have

$$2x^3 - 7x^2 + 5x + 20 = (x - 2)(2x^2 - 3x - 1) + 18 .$$

□


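1.2.2 PROPOSITION (Synthetic division). Let $p(x) = a_m x^m + a_{m-1} x^{m-1} + \cdots + a_0$ be a polynomial with real coefficients. If $p(x) = (x - c)q(x) + p(c)$, where $q(x) = b_{m-1} x^{m-1} + b_{m-2} x^{m-2} + \cdots + b_0$, then

$$
\begin{aligned}
b_{m-1} &= a_m \\
b_{m-2} &= b_{m-1} c + a_{m-1} \\
&\ \ \vdots \\
b_i &= b_{i+1} c + a_{i+1} \\
&\ \ \vdots \\
b_0 &= b_1 c + a_1
\end{aligned}
$$

and $p(c) = b_0 c + a_0$.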
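The recurrence in Proposition 1.2.2 can be sketched as a short function. This is an illustration under the assumption that the coefficients are listed from $a_m$ down to $a_0$, with 0 inserted for any missing power; the name `synthetic_division` is our own.

```python
def synthetic_division(coeffs, c):
    """Divide a polynomial by x - c using the recurrence of Proposition 1.2.2.

    coeffs = [a_m, ..., a_1, a_0], highest degree first, 0 for missing powers.
    Returns (quotient coefficients [b_{m-1}, ..., b_0], remainder p(c)).
    (Illustrative helper, not from the notes.)
    """
    row = [coeffs[0]]                 # b_{m-1} = a_m
    for a in coeffs[1:]:
        # b_i = b_{i+1} * c + a_{i+1}; the final entry computed is p(c).
        row.append(row[-1] * c + a)
    return row[:-1], row[-1]


print(synthetic_division([2, -7, 5, 20], 2))   # divide 2x^3 - 7x^2 + 5x + 20 by x - 2
```

The list `row` built inside the function is exactly the bottom row of the usual synthetic division table; to divide by $x + 1$, call the function with `c = -1`.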

Example 1.2.4 (Synthetic division).

In using synthetic division to divide $p(x) = a_m x^m + a_{m-1} x^{m-1} + \cdots + a_0$ by $x - c$, the work is usually arranged as follows.

$$
\begin{array}{r|rrrr}
c & a_m & a_{m-1} & a_{m-2} & \ldots \\
 & & b_{m-1}c & b_{m-2}c & \ldots \\
\hline
 & b_{m-1} & b_{m-2} & b_{m-3} & \ldots
\end{array}
$$

For example, dividing $2x^3 - 7x^2 + 5x + 20$ by $x - 3$ gives the following table.

$$
\begin{array}{r|rrrr}
3 & 2 & -7 & +5 & 20 \\
 & & +6 & -3 & 6 \\
\hline
 & 2 & -1 & +2 & 26
\end{array}
$$

We have $2x^3 - 7x^2 + 5x + 20 = (x - 3)(2x^2 - x + 2) + 26$. To construct the table, bring the 2 down to the bottom row, then multiply by 3 and insert it as the next term in the second row. As the next step, add it to the corresponding term of the top row, and then repeat these steps, using the answer. □


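In using the synthetic division algorithm, you must insert 0 for any missing power of $x$. For example, to divide $x^5 + 2x^2 - 3$ by $x + 1$ we would have the following steps.

\[
\begin{array}{r|rrrrrr}
-1 & 1 & 0 & 0 & 2 & 0 & -3 \\
   &   & -1 & 1 & -1 & -1 & 1 \\ \hline
   & 1 & -1 & 1 & 1 & -1 & -2
\end{array}
\]

Thus $x^5 + 2x^2 - 3 = (x + 1)(x^4 - x^3 + x^2 + x - 1) - 2$.

The most compact notation omits the terms in the middle row, as follows.

\[
\begin{array}{r|rrrrrr}
-1 & 1 & 0 & 0 & 2 & 0 & -3 \\ \hline
   & 1 & -1 & 1 & 1 & -1 & -2
\end{array}
\]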

This will be useful later when we need to do some repeated divisions.

Using Proposition 1.2.1, all rational roots of the polynomial $p(x)$ can be found by testing only a finite number of values. We now restrict ourselves even further, and study some ways to find integer roots of equations. The next proposition gives a method due to Newton which can speed up this process considerably.

1.2.3 PROPOSITION. Let $p(x)$ be any polynomial with integer coefficients. If $c$ is any integer root of $p(x)$ and $n$ is any integer, then $c - n$ must be a factor of $p(n)$.

Proof. If $c$ is an integer root of $p(x)$, then it follows from the Remainder Theorem (see Theorem 1.1.4) that $p(x) = (x - c)q(x)$, which we can rewrite as $p(x) = (c - x)(-q(x))$. The proof of the Remainder Theorem uses the factorization

\[ x^k - c^k = (x - c)(x^{k-1} + cx^{k-2} + \cdots + c^{k-2}x + c^{k-1}). \]

If $c$ is an integer, then the coefficients of

\[ x^{k-1} + cx^{k-2} + \cdots + c^{k-2}x + c^{k-1} \]

are certainly integers, and so in the general factorization $p(x) = (x - c)q(x)$, it follows that the coefficients of $q(x)$ are integers.

If we substitute $n$ into the equation $p(x) = (c - x)(-q(x))$, we get $p(n) = (c - n)(-q(n))$, which shows that $c - n$ is a factor of the integer $p(n)$.


Example 1.2.5.

The preceding result can be combined with Proposition 1.2.1 to find the rational roots of equations such as $x^3 + 15x^2 - 3x - 6 = 0$. By Proposition 1.2.1 the possible rational roots are $\pm 1, \pm 2, \pm 3, \pm 6$. Letting $p(x) = x^3 + 15x^2 - 3x - 6$ we find that $p(1) = 7$. Thus by Proposition 1.2.3, for any root $c$ we are guaranteed that $c - 1$ must be a factor of 7. This eliminates all of the possible values except $c = 2$ and $c = -6$.

We find that $p(2) = 56$, so 2 is not a root. This shows, in addition, that $c - 2$ must be a factor of 56 for any root $c$, but $-6$ still passes this test. Finally, $p(-6) = -216 + 540 + 18 - 6 = 336$, and so this eliminates $-6$ and $p(x)$ has no rational roots. □
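The elimination step based on Proposition 1.2.3 is easy to mechanize. The following Python sketch is one possible way to do it; the helper names `evaluate` and `surviving_candidates` are hypothetical, and the candidate list is assumed to have been produced already (for instance from Proposition 1.2.1).

```python
def evaluate(coeffs, x):
    """Evaluate a polynomial by nested parentheses; coeffs are a_m, ..., a_0."""
    value = 0
    for a in coeffs:
        value = value * x + a
    return value


def surviving_candidates(coeffs, candidates, n):
    """Keep the candidates c for which c - n divides p(n) (Proposition 1.2.3)."""
    p_n = evaluate(coeffs, n)
    kept = []
    for c in candidates:
        if c == n:
            # c - n = 0, so the divisor test does not apply;
            # n itself is a root only if p(n) = 0.
            if p_n == 0:
                kept.append(c)
        elif p_n % (c - n) == 0:
            kept.append(c)
    return kept


p = [1, 15, -3, -6]                        # x^3 + 15x^2 - 3x - 6
candidates = [1, -1, 2, -2, 3, -3, 6, -6]
after_first = surviving_candidates(p, candidates, 1)    # test against p(1)
after_second = surviving_candidates(p, after_first, 2)  # test against p(2)
print(after_first, after_second)
```

Any candidate that survives every test still has to be checked directly, for example with `evaluate`, since passing the divisor tests is necessary but not sufficient for being a root.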

If we evaluate a polynomial $p(x)$ at $x = c$ by using synthetic division, we cannot tell whether or not $p(c) = 0$ until the final step. Our final algorithm uses synthetic division in reverse. In testing for integer roots, it may be possible to eliminate a potential root after only a few steps of the synthetic division procedure. If the potential integer root passes a divisor test at each step, then it is a root and the answers at each step provide the coefficients of the quotient.

If $p(x) = a_m x^m + a_{m-1}x^{m-1} + \cdots + a_0$ is a polynomial with integer coefficients, and $c$ is an integer root of $p(x)$, then $p(x) = (x - c)q(x)$, for a polynomial $q(x) = b_{m-1}x^{m-1} + \cdots + b_1 x + b_0$ with integer coefficients. After multiplying this out and equating coefficients, we have $a_0 = -c\,b_0$, $a_1 = b_0 - c\,b_1$, and so on, so we can solve for the coefficients of $q(x)$ one at a time, beginning with $b_0 = -a_0 \div c$. We get the following equations, which we might call backwards synthetic division.

\[
\begin{aligned}
b_0 &= -a_0 \div c \\
b_1 &= (b_0 - a_1) \div c \\
b_2 &= (b_1 - a_2) \div c \\
&\;\,\vdots
\end{aligned}
\]

Since $b_0, b_1, b_2, \ldots, b_{m-1}$ are integers, it follows that $c$ must be a factor of each of the terms $a_0$, $(b_0 - a_1)$, etc. Furthermore, the very last term obtained must be equal to zero: since $b_{m-1} = a_m$, one more step gives $(b_{m-1} - a_m) \div c = 0$. This gives a series of checks to test whether or not $c$ is a root. Just as with synthetic division, any zero coefficients must be included in the algorithm. If $c$ passes each test, then not only do you know that $c$ is a root, but in addition you have found the coefficients of the quotient $q(x)$. This proves the following proposition.


1.2.4 PROPOSITION (Integer roots via backwards synthetic division). Let $p(x) = a_m x^m + a_{m-1}x^{m-1} + \cdots + a_0$ be a polynomial with integer coefficients. An integer $c$ is a root of $p(x)$ if and only if each of the following terms is an integer

\[
\begin{aligned}
b_0 &= -a_0 \div c \\
b_1 &= (b_0 - a_1) \div c \\
b_2 &= (b_1 - a_2) \div c \\
&\;\,\vdots \\
b_{m-1} &= (b_{m-2} - a_{m-1}) \div c
\end{aligned}
\]

and $b_{m-1} = a_m$.

Example 1.2.6.

In finding the integer roots of $p(x) = x^3 - 5x^2 + 2x - 10$, we will use backwards synthetic division to check the potential roots 2 and 5. (Compare what we did in Example 1.2.1.) We have

\[
\begin{aligned}
-(-10) \div 2 &= 5 \\
(5 - 2) \div 2 &= 3/2,
\end{aligned}
\]

which eliminates 2 as a root. In testing 5 we get

\[
\begin{aligned}
-(-10) \div 5 &= 2 \\
(2 - 2) \div 5 &= 0 \\
(0 - (-5)) \div 5 &= 1 \\
(1 - 1) \div 5 &= 0.
\end{aligned}
\]

At each step we found an integer value, and the last step gave 0. Therefore 5 is a root, and we can read off the coefficients of the quotient by reversing the order, so $x^3 - 5x^2 + 2x - 10 = (x - 5)(x^2 + 2)$. □
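The checks in Proposition 1.2.4 translate directly into a loop that stops as soon as a division fails to come out even. The Python sketch below is one way to write this; the function name `backwards_synthetic_division` is hypothetical, and the code assumes integer coefficients listed from the highest power down and a nonzero integer candidate c.

```python
def backwards_synthetic_division(coeffs, c):
    """Test whether the nonzero integer c is a root, working up from a_0.

    coeffs lists a_m, ..., a_0. Returns the quotient coefficients
    b_{m-1}, ..., b_0 if c is a root, and None otherwise.
    """
    ascending = coeffs[::-1]           # a_0, a_1, ..., a_m
    b_prev = 0                         # so the first step is b_0 = (0 - a_0) / c
    quotient = []
    for a in ascending[:-1]:           # a_0, ..., a_{m-1}
        numerator = b_prev - a
        if numerator % c != 0:
            return None                # c fails the divisor test at this step
        b_prev = numerator // c
        quotient.append(b_prev)
    if b_prev != ascending[-1]:        # final check: b_{m-1} must equal a_m
        return None
    return quotient[::-1]


print(backwards_synthetic_division([1, -5, 2, -10], 2))
print(backwards_synthetic_division([1, -5, 2, -10], 5))
```

For the polynomial of Example 1.2.6, the call with c = 2 stops at the second step and returns `None`, while the call with c = 5 returns the coefficients 1, 0, 2 of the quotient.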

PROBLEMS: 1.2

1. Use synthetic division to write $p(x) = q(x)(x - c) + p(c)$ for

(a) $p(x) = 2x^3 + x^2 - 4x + 3$; $c = 1$;

(b) $p(x) = x^3 - 5x^2 + 6x + 5$; $c = 2$;

(c) $p(x) = x^5 - 7x^2 + 2$; $c = 2$.

2. Use backwards synthetic division to show that $p(c) = 0$, and to write $p(x) = q(x)(x - c)$ for

(a) $p(x) = x^3 - 5x^2 + 3x + 9$; $c = 3$;

(b) $p(x) = 2x^5 + 2x^4 - 3x^2 + 2x + 5$; $c = -1$;

(c) $p(x) = x^5 + 32$; $c = -2$.

3. Use Proposition 1.2.1 to list all possible rational roots of $p(x) = 15x^4 + 8x^3 + 6x^2 - 9x - 2$. Then determine which of the possibilities are in fact roots. (You may eliminate possibilities graphically, using a computer or calculator if available, but then verify by hand calculations that you have found the rational roots.) Finally, use the information about roots to factor $p(x)$ completely as a product of polynomials with integer coefficients.

4. Use the method outlined in the previous problem to find all rational roots of $p(x) = 30x^5 - 7x^4 + 20x^3 - 4x^2 - 10x + 3$, and use this information to write out the factorization of $p(x)$.

5. Find all integer roots of the following equations (use any method). One suggestion is to use a calculator or computer to graph the polynomial function, then check the likely roots using synthetic division. Finally, show the complete factorization of each polynomial (into a product of polynomials with integer coefficients).

(a) $x^3 - x^2 - 4x - 6 = 0$

(b) $x^3 - 8x^2 - 3x + 91 = 0$

(c) $x^4 + 4x^3 - 104x^2 - 105x - 108 = 0$

(d) $x^3 - 6x^2 - 24x + 64 = 0$

(e) $3x^4 - 5x^3 - 7x^2 + 9x - 2 = 0$

6. Let $p(x) = a_m x^m + a_{m-1}x^{m-1} + \cdots + a_1 x + a_0$ be a polynomial with rational coefficients. Show that if $b$ is a nonzero root of $p(x)$, then $1/b$ is a root of $q(x) = a_0 x^m + a_1 x^{m-1} + \cdots + a_{m-1}x + a_m$.


1.3 Approximating real roots

In this section we will develop some techniques for approximating solutions to poly-nomial equations. We will introduce the “interval bisection” method, the “secant”method, and an algorithm for approximating square roots and cube roots. In search-ing for all roots of a given equation, it is helpful to have a rough bound on the sizeof the roots. The first proposition provides such a bound.

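1.3.1 PROPOSITION. Let $p(x) = a_m x^m + \cdots + a_1 x + a_0$. If $c$ is any root of the equation $p(x) = 0$, then

\[ |c| \le \frac{|a_m| + |a_{m-1}| + \cdots + |a_1| + |a_0|}{|a_m|}. \]

Proof. If $c$ is any root of the equation $p(x) = 0$, then

\[ a_m c^m + a_{m-1}c^{m-1} + \cdots + a_1 c + a_0 = 0, \]

and so

\[ a_m c^m = -a_{m-1}c^{m-1} - \cdots - a_1 c - a_0. \]

If we assume that $|c| \ge 1$, then $c \ne 0$, and dividing by $a_m c^{m-1}$ gives

\[ c = -\frac{a_{m-1}}{a_m} - \cdots - \frac{a_1}{a_m}\,\frac{1}{c^{m-2}} - \frac{a_0}{a_m}\,\frac{1}{c^{m-1}}. \]

Since $\frac{1}{|c|} \le 1$, we have

\[
\begin{aligned}
|c| &= \left| -\frac{a_{m-1}}{a_m} - \cdots - \frac{a_1}{a_m}\,\frac{1}{c^{m-2}} - \frac{a_0}{a_m}\,\frac{1}{c^{m-1}} \right| \\
&\le \left|\frac{a_{m-1}}{a_m}\right| + \cdots + \left|\frac{a_1}{a_m}\right| \left|\frac{1}{c^{m-2}}\right| + \left|\frac{a_0}{a_m}\right| \left|\frac{1}{c^{m-1}}\right| \\
&\le \frac{|a_{m-1}| + \cdots + |a_1| + |a_0|}{|a_m|}.
\end{aligned}
\]

Finally, to obtain a bound that also works when $|c| < 1$, we can simply add 1 to the bound we already have, giving

\[ |c| \le 1 + \frac{|a_{m-1}| + \cdots + |a_1| + |a_0|}{|a_m|} = \frac{|a_m| + |a_{m-1}| + \cdots + |a_1| + |a_0|}{|a_m|}. \]

This completes the proof.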
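The bound in Proposition 1.3.1 takes one line to compute. As a small sketch (the function name `root_bound` is an assumption of this example, with coefficients listed from the highest power down and a nonzero leading coefficient):

```python
def root_bound(coeffs):
    """Bound from Proposition 1.3.1; coeffs are a_m, ..., a_0 with a_m nonzero."""
    return sum(abs(a) for a in coeffs) / abs(coeffs[0])


print(root_bound([1, 0, -3]))       # x^2 - 3
print(root_bound([2, 0, 1, -2]))    # 2x^3 + x - 2
```

The bound is rough: for $x^2 - 3$ it gives 4, while the roots are $\pm\sqrt{3}$. Its use is to fix a finite interval in which to begin a search.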


The interval bisection algorithm is based on the following result. If we are solving the polynomial equation $p(x) = 0$ and find two numbers $x_1 < x_2$ for which $p(x_1)$ and $p(x_2)$ have opposite signs, then there must be a solution between $x_1$ and $x_2$. That is, there is a solution in the interval $[x_1, x_2]$.

The algorithm then proceeds as follows. To narrow down the interval in which there must be a solution, we evaluate $p(x)$ at the average $x_3 = (x_1 + x_2)/2$, which bisects the interval determined by $x_1$ and $x_2$. If $p(x_3)$ has the opposite sign from $p(x_1)$, then the root is between $x_1$ and $x_3$, and so we repeat the procedure for the interval $[x_1, x_3]$. On the other hand, if $p(x_3)$ has the opposite sign from $p(x_2)$, then the root is between $x_2$ and $x_3$, and so we repeat the procedure using the interval $[x_3, x_2]$.
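The procedure just described can be sketched in a few lines of Python. The function name `bisect_root` and the stopping tolerance `tol` are choices made for this illustration; the text itself does not specify when to stop.

```python
def bisect_root(p, x1, x2, tol=1e-6):
    """Plain interval bisection for p(x) = 0 on [x1, x2], with x1 < x2.

    p(x1) and p(x2) must have opposite signs.
    """
    if p(x1) * p(x2) > 0:
        raise ValueError("p(x1) and p(x2) must have opposite signs")
    while x2 - x1 > tol:
        x3 = (x1 + x2) / 2             # midpoint of the current interval
        if p(x3) == 0:
            return x3                  # landed exactly on a root
        if p(x1) * p(x3) < 0:
            x2 = x3                    # root lies in [x1, x3]
        else:
            x1 = x3                    # root lies in [x3, x2]
    return (x1 + x2) / 2


print(bisect_root(lambda x: x * x - 3, 1, 2))
```

Each pass halves the interval, so roughly ten passes are needed for every three additional decimal places.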

It is easy to write a program to carry out this algorithm. But using the interval bisection algorithm by hand on a calculator produces numbers that are difficult to work with, and so we will use a modification of the algorithm. Suppose that the initial interval is $[0, 1]$. For the first step we bisect the interval, and consider either $[0, .5]$ or $[.5, 1]$. Suppose that $p(.5)$ and $p(1)$ have opposite signs. For the next step, rather than bisecting the interval, we consider either $[.5, .7]$ or $[.7, 1]$. The first of these can be bisected in the third step, while the interval $[.7, 1]$ can be split into $[.7, .8]$ and $[.8, 1]$. This gives a search procedure that increases the accuracy by one decimal place each time that we make the necessary three or four calculations.

We will illustrate this method by solving $x^2 - 3 = 0$, correct to three decimal places. We know that the solution is $x = \sqrt{3}$, which lies between 1 and 2, and so this problem provides an easy example of the technique we want to introduce.

Let $p(x) = x^2 - 3$, so that we are solving the equation $p(x) = 0$. The first three computations in the next paragraph will find that $p(1.7) = -.11$ and $p(1.8) = +.24$, so that the answer must be between 1.7 and 1.8. It takes three more computations to show that the answer lies between 1.73 and 1.74, and an additional three computations to obtain the answer 1.732, correct to three decimal places.

After knowing that the answer is between 1 and 2, we do the computation for 1.5, half way between the two previous estimates. Because $p(1.5) = -.75$ is negative, the answer must be between 1.5 and 2.0. To simplify matters and deal only with one decimal place at a time, instead of using an estimate of 1.75, which would be half way between 1.5 and 2, we use 1.7, and then when $p(1.7) = -.11$ we go to $p(1.8) = +.24$.

The following table shows the remaining steps in the procedure.

\[
\begin{array}{l|rrrrrr}
a & 1.75 & 1.73 & 1.74 & 1.735 & 1.733 & 1.732 \\ \hline
a^2 - 3 & +.0625 & -.0071 & +.0276 & +.010225 & +.003289 & -.000176
\end{array}
\]

A graphing calculator or computer can be used in a similar way. By changing the domain of the viewing window to give successively smaller and smaller intervals, it is possible to obtain more and more accurate estimates of where the graph of the polynomial crosses the $x$-axis.

Example 1.3.1.

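We will solve $x^4 - 3 = 0$, correct to five decimal places. We let $p(x) = x^4 - 3$. (The values for $p(x)$ have been rounded off to eight decimal places.)

\[
\begin{array}{l|rrr}
a & 1.5 & 1.3 & 1.4 \\ \hline
p(a) & 2.0625 & -.1439 & +.8416
\end{array}
\]

\[
\begin{array}{l|rrrr}
a & 1.35 & 1.33 & 1.32 & 1.31 \\ \hline
p(a) & +.32150625 & +.12900721 & +.03595776 & -.05500079
\end{array}
\]

\[
\begin{array}{l|rrr}
a & 1.315 & 1.317 & 1.316 \\ \hline
p(a) & -.00978090 & +.00845209 & -.00067480
\end{array}
\]

\[
\begin{array}{l|rrrr}
a & 1.3165 & 1.3163 & 1.3162 & 1.3161 \\ \hline
p(a) & +.00388605 & +.00206109 & +.00114892 & +.00023696
\end{array}
\]

\[
\begin{array}{l|rrr}
a & 1.31605 & 1.31607 & 1.31608 \\ \hline
p(a) & -.00021894 & -.00003659 & +.0000546
\end{array}
\]

Since $p(a)$ changes sign between 1.31607 and 1.31608, this shows that the answer we are looking for is 1.31607. □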

Example 1.3.2.

In this example we will approximate a solution to the equation $2x^3 + x - 2 = 0$. Let $p(x) = 2x^3 + x - 2$. The possible rational solutions to the equation are $x = \pm 1, \pm 2, \pm\frac{1}{2}$, and substituting each of these values for $x$ shows that the solution(s) must be irrational. But $p(0) = -2$ and $p(1) = 1$, so there must be a solution to the equation between $x = 0$ and $x = 1$. In the table below we have not listed all of the steps in the search procedure. We have listed only those values for which the function changes sign, to show how we can narrow down the region in which the root lies.

\[
\begin{array}{l|rr}
x & .8 & .9 \\ \hline
p(x) & -.176 & .358
\end{array}
\qquad
\begin{array}{l|rr}
x & .83 & .84 \\ \hline
p(x) & -.026426 & .025408
\end{array}
\]

\[
\begin{array}{l|rr}
x & .835 & .836 \\ \hline
p(x) & -.0006342 & .0045541
\end{array}
\qquad
\begin{array}{l|rr}
x & .8351 & .8352 \\ \hline
p(x) & -.0001159 & .0004026
\end{array}
\]

\[
\begin{array}{l|rr}
x & .83512 & .83513 \\ \hline
p(x) & -.0000121 & .0000397
\end{array}
\qquad \Box
\]

The next algorithm we will introduce is the secant method for finding roots. As before, to solve $p(x) = 0$ we begin with two numbers $x_1 < x_2$ for which $p(x_1)$ and $p(x_2)$ have opposite signs. The equation of the line joining the two points $(x_1, p(x_1))$ and $(x_2, p(x_2))$ is

\[ y = m(x - x_1) + p(x_1), \quad \text{where } m = \frac{p(x_2) - p(x_1)}{x_2 - x_1}. \]

If we set $y$ equal to zero and solve for $x$, we get

\[ x = x_1 - \frac{p(x_1)}{m} \]

and then

\[ x = x_1 - \frac{p(x_1)(x_2 - x_1)}{p(x_2) - p(x_1)}. \]

Let the solution to this equation be $x_3$, so that

\[ x_3 = x_1 - \frac{p(x_1)(x_2 - x_1)}{p(x_2) - p(x_1)}. \]

We still have to find $p(x_3)$, in order to determine whether the root is in the interval $[x_1, x_3]$ or in the interval $[x_3, x_2]$. We then repeat the procedure on the new interval.


To implement this algorithm on a calculator capable of storing functions, you can store $p(x)$, $p(y)$, $p(z)$, and

\[ f(x, y) = x - (p(x))(y - x) \div ((p(y)) - (p(x))). \]

Store the endpoints of the interval as $x$ and $y$, and then store the new approximation as $z$. After evaluating $p(z)$ to determine its sign, you must decide whether to store your new approximation as $x$ or as $y$, making sure that the root is in the new interval $[x, y]$.
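The same bookkeeping can be sketched in Python. Because the procedure keeps the root bracketed between the two endpoints at every step, it corresponds to what is often called the method of false position. The function name `secant_steps` and the fixed number of steps are assumptions of this sketch.

```python
def secant_steps(p, x, y, steps):
    """Repeat the secant step while keeping the root inside [x, y].

    p(x) and p(y) must have opposite signs. Returns the list of
    successive approximations z.
    """
    approximations = []
    for _ in range(steps):
        z = x - p(x) * (y - x) / (p(y) - p(x))   # where the secant line crosses 0
        approximations.append(z)
        if p(z) == 0:
            break                                # z is exactly a root
        if p(x) * p(z) < 0:
            y = z                                # root lies in [x, z]
        else:
            x = z                                # root lies in [z, y]
    return approximations


for z in secant_steps(lambda t: 2 * t**3 + t - 2, 0, 1, 5):
    print(round(z, 8))
```

The decision about which endpoint to replace is made from the sign of `p(z)`, exactly as in the calculator description above.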

Example 1.3.3.

In this example we will compare the secant method with the interval bisection method that we used in Example 1.3.2 to find a root of $p(x) = 2x^3 + x - 2$ between 0 and 1. The secant method produces the following table.

\[
\begin{array}{llll}
x & y & p(x) & p(y) \\ \hline
0 & 1 & -2 & 1 \\
.66666667 & 1 & -.74074074 & 1 \\
.80851064 & 1 & -.13445961 & 1 \\
.83120654 & 1 & -.02022509 & 1 \\
.83455273 & 1 & -.00295162 & 1 \\
.83503963 & 1 & -.00042884 & 1
\end{array}
\]

The table in Example 1.3.2 contains only about one third of the computations that are actually necessary, so when we compare it with the above table, we see that the secant method is a major improvement.

Notice that the value of $y$ never changed, since the successive approximations all remained less than the actual root. We are free to choose any value of $y$ for which $p(y)$ is positive, and so it helps to use trial and error in finding the second endpoint. We now repeat the computations, making appropriate modifications in $y$.

\[
\begin{array}{llll}
x & y & p(x) & p(y) \\ \hline
0 & 1 & -2 & 1 \\
.66666667 & 1 & -.74074074 & 1 \\
.80851064 & .9 & -.13445961 & .358 \\
.83349060 & .84 & -.00844657 & .025408 \\
.83511467 & .836 & -.00003981 & .00455411 \\
.83512234 & .835123 & -.00000003 & .00000338
\end{array}
\]

The table shows that to six decimal places the root is $.835122$. □

Both the interval bisection method and the secant method are quite simple to use. However, for finding square roots there is another algorithm due to Newton that is almost as simple but much more efficient. If $a$ is an approximate value for $\sqrt{c}$, then the actual value must lie between $a$ and $c/a$, since

\[ a \cdot \frac{c}{a} = c. \]

(If $a = c/a$, then $a \cdot a = c$ and $a$ is actually equal to $\sqrt{c}$.) If $a \ne c/a$, then one of $a$ and $c/a$ is too large and one is too small, so the average $(a + c/a)/2$ of these two values will give a better approximation to $\sqrt{c}$. This procedure, called Newton’s method, can be continued to give more and more accurate approximations. If $a_n$ is an approximation for $\sqrt{c}$, then the next approximation is given by the formula

\[ a_{n+1} = \frac{1}{2}\left(a_n + \frac{c}{a_n}\right). \]

If you have a calculator capable of storing functions, you can store

\[ f(x) = (x + c \div x) \div 2. \]

If $a$ is an approximation for $\sqrt{c}$, then the next approximation in Newton’s method is $f(a)$. This makes it very easy to carry out the steps in Newton’s method.

Example 1.3.4.


We will use Newton’s method to approximate $\sqrt{3}$ to five decimal places. We first use trial and error to get an answer accurate to one decimal place, giving a first approximation of 1.7. The average of 1.7 and 3/1.7 is 1.7323529, and so we use this value in computing the terms in the second step. Note that the calculations have been done on an eight digit calculator.

\[
\begin{array}{l|lll}
 & \text{Step 1} & \text{Step 2} & \text{Step 3} \\ \hline
a & 1.7 & 1.7323529 & 1.7320508 \\
3/a & 1.7647058 & 1.7317487 & 1.7320508 \\
\text{avg} & 1.7323529 & 1.7320508 & 1.7320508
\end{array}
\]

This procedure arrives at an accurate answer much faster than the modified interval bisection algorithm that we used earlier. In fact, each step in the algorithm roughly doubles the number of decimal places to which the approximation is accurate. □

Example 1.3.5.

In Newton’s method, fractions can be used instead of decimal approximations. For example, squaring 7/4 gives 49/16, which is very close to 3. Using 7/4 as the first approximation to $\sqrt{3}$ gives the sequence of approximations 97/56 and 18817/10864.

To approximate $\sqrt{2}$ we can begin with 7/5. Newton’s method then gives 99/70 and 19601/13860. □
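The averaging step is simple enough to check by machine. The sketch below uses Python's standard `fractions.Fraction` type so that the arithmetic stays exact; the function name `newton_sqrt_steps` is a hypothetical choice for this illustration.

```python
from fractions import Fraction


def newton_sqrt_steps(c, a, steps):
    """Apply a -> (a + c/a)/2 repeatedly, starting from the approximation a."""
    results = []
    for _ in range(steps):
        a = (a + c / a) / 2            # average of a and c/a
        results.append(a)
    return results


print(newton_sqrt_steps(3, Fraction(7, 4), 2))   # fractions, as in Example 1.3.5
print(newton_sqrt_steps(3, 1.7, 3))              # decimals, as in Example 1.3.4
```

Starting from a `Fraction` keeps every approximation an exact fraction, while starting from a decimal such as 1.7 gives ordinary floating-point values.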

PROBLEMS: 1.3

1. Use the (modified) interval bisection method to approximate $\sqrt[3]{12}$ to four decimal places. (Work with the polynomial $p(x) = x^3 - 12$.)

2. Use the (modified) interval bisection method to approximate $\sqrt{17}$ to three decimal places.

3. Use Newton’s method to approximate $\sqrt{17}$ to six decimal places.

4. Use the (modified) interval bisection method to find the root of the equation $4x^3 - x - 2 = 0$ that lies between $x = 0$ and $x = 1$. Show that your answer is accurate to at least four decimal places.

5. Use the secant method to solve the equation $4x^3 - x - 2 = 0$. Obtain an answer accurate to six decimal places.

6. Use the secant method to solve the equation $x^5 - x^2 - 6 = 0$. Obtain an answer accurate to six decimal places.


1.4 Complex roots

The equation $x^2 + 1 = 0$ has no real root since for any real number $x$ we have $x^2 + 1 \ge 1$. The purpose of this section is to construct a set of numbers that extends the set of real numbers and includes a root of this equation. If we had a set of numbers that contained the set of real numbers, was closed under addition, subtraction, multiplication, and division, and contained a root $i$ of the equation $x^2 + 1 = 0$, then it would have to include all numbers of the form $a + bi$ where $a$ and $b$ are real numbers. The addition and multiplication would be given by

\[
\begin{aligned}
(a + bi) + (c + di) &= (a + c) + (b + d)i, \\
(a + bi)(c + di) &= ac + (bc + ad)i + bd\,i^2 = (ac - bd) + (ad + bc)i.
\end{aligned}
\]

Here we have used the fact that $i^2 = -1$ since $i^2 + 1 = 0$.

Our construction for the desired set of numbers is to invent a symbol $i$ for which $i^2 = -1$, and then to consider all pairs of real numbers $a$ and $b$, in the form $a + bi$. This construction can be done much more formally, but we simply ask the reader to accept the “invention” of the symbol $i$ at an informal, intuitive level.

1.4.1 DEFINITION. The set $\mathbf{C} = \{a + bi \mid a, b \in \mathbf{R} \text{ and } i^2 = -1\}$ is called the set of complex numbers. Addition and multiplication of complex numbers are defined as follows:

\[
\begin{aligned}
(a + bi) + (c + di) &= (a + c) + (b + d)i, \\
(a + bi)(c + di) &= (ac - bd) + (ad + bc)i.
\end{aligned}
\]

Note that $a + bi = c + di$ if and only if $a = c$ and $b = d$. If $c + di$ is nonzero, that is, if $c \ne 0$ or $d \ne 0$, then division by $c + di$ is possible:

\[ \frac{a + bi}{c + di} = \frac{(a + bi)(c - di)}{(c + di)(c - di)} = \frac{ac + bd}{c^2 + d^2} + \frac{bc - ad}{c^2 + d^2}\,i. \]

A useful model for the set of complex numbers is a geometric model in which the number $a + bi$ is viewed as the ordered pair $(a, b)$ in the plane. (See Figure 1.4.1.) Note that $i$ corresponds to the pair $(0, 1)$.

In polar coordinates, $a + bi$ is represented by $(r, \theta)$, where

\[ r = \sqrt{a^2 + b^2}, \qquad \cos\theta = a/r, \qquad \sin\theta = b/r. \]

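[Figure 1.4.1: the point $a + bi$ in the plane, with coordinates $a$ and $b$, radius $r$, and angle $\theta$; the points $1$, $-1$, $i$, and $-i$ are marked on the axes.]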

The value $r$ is called the absolute value or magnitude of $a + bi$, and we write $|a + bi| = \sqrt{a^2 + b^2}$. Note that this definition is consistent with the one for real numbers, since if $x \in \mathbf{R}$, then $|x|$ represents the distance from $x$ to 0.

The polar form for complex numbers is very useful. With this notation we have

\[ a + bi = r(\cos\theta + i\sin\theta). \]

In this form we can compute the product of two complex numbers as follows:

\[
\begin{aligned}
r(\cos\theta + i\sin\theta) \cdot t(\cos\phi + i\sin\phi)
&= rt\bigl((\cos\theta\cos\phi - \sin\theta\sin\phi) + i(\sin\theta\cos\phi + \cos\theta\sin\phi)\bigr) \\
&= rt\bigl(\cos(\theta + \phi) + i\sin(\theta + \phi)\bigr).
\end{aligned}
\]

This simplification of the product comes from the trigonometric formulas for the cosine and sine of the sum of two angles. Thus to multiply two complex numbers represented in polar form we multiply their absolute values and add their angles. A repeated application of this formula to $\cos\theta + i\sin\theta$ gives the following theorem.

1.4.2 THEOREM (De Moivre). For any positive integer $n$,

\[ (\cos\theta + i\sin\theta)^n = \cos(n\theta) + i\sin(n\theta). \]


1.4.3 COROLLARY. For any positive integer $n$, the equation $z^n = 1$ has $n$ distinct roots in the set of complex numbers.

Proof. For $k = 0, 1, \ldots, n - 1$, the values

\[ \cos\frac{2k\pi}{n} + i\sin\frac{2k\pi}{n} \]

are distinct and

\[ \left(\cos\frac{2k\pi}{n} + i\sin\frac{2k\pi}{n}\right)^n = \cos 2k\pi + i\sin 2k\pi = 1. \]

Since we know that $z^n = 1$ has no more than $n$ roots, we have found them all.

The complex roots of $z^n = 1$ are called the $n$th roots of unity. When plotted in the complex plane, they form the vertices of a regular polygon with $n$ sides inscribed in a circle of radius 1 with center at the origin.

Example 1.4.1.

The cube roots of unity are the roots of the equation $x^3 - 1 = 0$. We can factor to obtain $(x - 1)(x^2 + x + 1) = 0$, and then using the quadratic formula we obtain the three roots

\[ 1, \qquad \omega = -\frac{1}{2} + \frac{\sqrt{3}}{2}\,i, \qquad \text{and} \qquad \omega^2 = -\frac{1}{2} - \frac{\sqrt{3}}{2}\,i. \]

Equivalently, using Corollary 1.4.3 we obtain

\[ 1, \qquad \cos\frac{2\pi}{3} + i\sin\frac{2\pi}{3}, \qquad \text{and} \qquad \cos\frac{4\pi}{3} + i\sin\frac{4\pi}{3}. \]

Note that $\omega^2 + \omega + 1 = 0$, since $\omega$ is a root of $z^3 - 1 = (z - 1)(z^2 + z + 1) = 0$. (See Figure 1.4.2.)

Example 1.4.2.

The fourth roots of unity are the roots of the equation $x^4 - 1 = 0$, which we can factor to obtain $(x - 1)(x + 1)(x^2 + 1)$. The roots are $1$, $i$, $i^2 = -1$, and $i^3 = -i$. (See Figure 1.4.3.)


[Figure 1.4.2: Cube roots of unity, showing $1$, $\omega = -\frac{1}{2} + \frac{\sqrt{3}}{2}\,i$, and $\omega^2 = -\frac{1}{2} - \frac{\sqrt{3}}{2}\,i$.]

[Figure 1.4.3: Fourth roots of unity, showing $1$, $i$, $-1$, and $-i$.]


Example 1.4.3.

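If $z^n = u$, then $(z\omega)^n = u$, where $\omega$ is any $n$th root of unity. Thus if all $n$th roots of unity are already known, it is easy to find the $n$th roots of any complex number. In general, the $n$th roots of $r(\cos\theta + i\sin\theta)$ are

\[ r^{1/n}\left(\cos\frac{\theta + 2k\pi}{n} + i\sin\frac{\theta + 2k\pi}{n}\right), \quad \text{for } 1 \le k \le n. \]

To find the square root of a complex number it may be helpful to use the formulas

\[ \sin\frac{\theta}{2} = \pm\sqrt{\frac{1 - \cos\theta}{2}} \qquad \text{and} \qquad \cos\frac{\theta}{2} = \pm\sqrt{\frac{1 + \cos\theta}{2}}. \]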
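The general formula for $n$th roots can be tried out numerically with Python's standard `cmath` module. This is only a floating-point sketch; the function name `nth_roots` is hypothetical, and the angle used is the one returned by `cmath.polar`, which lies between $-\pi$ and $\pi$.

```python
import cmath
import math


def nth_roots(u, n):
    """Return the n complex nth roots of u, using the polar-form formula."""
    r, theta = cmath.polar(u)                     # u = r (cos theta + i sin theta)
    return [
        cmath.rect(r ** (1 / n), (theta + 2 * math.pi * k) / n)
        for k in range(1, n + 1)                  # k = 1, ..., n as in the text
    ]


for z in nth_roots(-8, 3):
    print(z)
```

Because the computation is done in floating point, the printed roots agree with exact values such as $-2$ only up to rounding error.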

Example 1.4.4.

In this example we will use three different methods to find the complex cube roots of $-8$.

We will first use the general formula from Example 1.4.3. To do so, we need to write $-8$ in the polar form $r(\cos\theta + i\sin\theta)$. Since $-8 = -8 + 0\,i$, the corresponding rectangular coordinates are $(-8, 0)$, and so $r = 8$, $\cos\theta = -1$, and $\sin\theta = 0$, which gives us $\theta = \pi$. Thus

\[ \sqrt[3]{-8} = 8^{1/3}\left(\cos\frac{\pi + 2k\pi}{3} + i\sin\frac{\pi + 2k\pi}{3}\right), \quad \text{for } 1 \le k \le 3, \]

which gives us the three solutions

\[ 2(\cos\pi + i\sin\pi), \qquad 2\left(\cos\frac{5\pi}{3} + i\sin\frac{5\pi}{3}\right), \qquad 2\left(\cos\frac{\pi}{3} + i\sin\frac{\pi}{3}\right). \]

Further simplification gives

\[ -2, \qquad 1 - \sqrt{3}\,i, \qquad 1 + \sqrt{3}\,i. \]

The second method, which is easier to use in this case, is to find the easy solution $\sqrt[3]{-8} = -2$. Then the three solutions can be found by multiplying this particular solution by the cube roots of unity

\[ 1, \qquad -\frac{1}{2} + \frac{\sqrt{3}}{2}\,i, \qquad \text{and} \qquad -\frac{1}{2} - \frac{\sqrt{3}}{2}\,i. \]

The final method is often impractical for more complicated problems, but in this relatively easy example it uses more familiar techniques. To find the cube roots of $-8$ we need to solve the equation $x^3 + 8 = 0$. We can factor, to obtain $x^3 + 8 = (x + 2)(x^2 - 2x + 4)$. Then we can use the quadratic formula to find the roots of $x^2 - 2x + 4$, giving us the same solutions

\[ -2, \qquad 1 - \sqrt{3}\,i, \qquad 1 + \sqrt{3}\,i. \]

We have noted that the powers of $i$ are $i$, $i^2 = -1$, $i^3 = -i$, $i^4 = 1$. Since $i^4 = 1$, the powers repeat. For example, $i^5 = i^4 i = i$, $i^6 = i^4 i^2 = -1$, and so on. For any integer $n$, the power $i^n$ depends on the remainder of $n$ when divided by 4, since if $n = 4q + r$, then $i^n = i^{4q + r} = (i^4)^q i^r = i^r$. In particular, $i^{-1} = i^3 = -i$, and a similar computation can be given for any negative exponent.

The next theorem is usually referred to as the “fundamental theorem of algebra.” It was discovered by D’Alembert in 1746, although he gave an incorrect proof. The first acceptable proof was given by Gauss in 1799. The proof is beyond the scope of this text.

1.4.4 THEOREM (Fundamental Theorem of Algebra). Every polynomial ofpositive degree with complex coefficients has a complex root.

1.4.5 COROLLARY. Every polynomial $f(z)$ of degree $n > 0$ with complex coefficients can be expressed as a product of linear factors, in the form

\[ f(z) = c(z - z_1)(z - z_2) \cdots (z - z_n). \]

Proof. To give a proof of the corollary, we only need to combine the Fundamental Theorem of Algebra with the fact that roots of $f(z)$ correspond to linear factors.

If $z = a + bi$ is a complex number, then its complex conjugate, denoted by $\overline{z}$, is $\overline{z} = a - bi$. Note that $z\overline{z} = a^2 + b^2$ and $z + \overline{z} = 2a$ are real numbers, whereas $z - \overline{z} = (2b)i$ is a purely imaginary number. Furthermore, $z = \overline{z}$ if and only if $z$ is a real number (i.e., $b = 0$). It can be checked that $\overline{z + w} = \overline{z} + \overline{w}$ and $\overline{zw} = \overline{z}\,\overline{w}$.

1.4.6 PROPOSITION. Let $f(x)$ be a polynomial with real coefficients. Then a complex number $z$ is a root of $f(x)$ if and only if $\overline{z}$ is a root of $f(x)$.

Proof. If $f(x) = a_n x^n + \cdots + a_0$, then $a_n z^n + \cdots + a_0 = 0$ for any root $z$ of $f(x)$. Taking the complex conjugate of both sides shows that

\[ \overline{a_n}(\overline{z})^n + \cdots + \overline{a_1}\,\overline{z} + \overline{a_0} = a_n(\overline{z})^n + \cdots + a_1\overline{z} + a_0 = 0 \]

and thus $\overline{z}$ is a root of $f(x)$. Conversely, if $\overline{z}$ is a root of $f(x)$, then so is $z = \overline{\overline{z}}$.

1.4.7 THEOREM. Any polynomial with real coefficients can be factored into a product of linear and quadratic terms with real coefficients.

Proof. Let $f(x)$ be a polynomial with real coefficients, of degree $n$. By Corollary 1.4.5 we can write $f(x) = c(x - z_1)(x - z_2) \cdots (x - z_n)$, where $c \in \mathbf{R}$. If $z_i$ is not a real root, then by Proposition 1.4.6, $\overline{z_i}$ is also a root, and so $x - \overline{z_i}$ occurs as one of the factors. But then

\[ (x - z_i)(x - \overline{z_i}) = x^2 - (z_i + \overline{z_i})x + z_i\overline{z_i} \]

has real coefficients. Thus if we pair each nonreal root with its conjugate, the remaining roots will be real, and so $f(x)$ can be written as a product of linear and quadratic polynomials each having real coefficients.

1.4.8 COROLLARY. Any polynomial of odd degree that has real coefficients must have a real root.

Proof. By the previous theorem, such a polynomial must have a linear factor with real coefficients, and this factor yields a real root.


PROBLEMS: 1.4

1. Compute each of the following:

(a) $(-1/\sqrt{2} + i/\sqrt{2})^4$

(b) $(-1 + i)^8$

(c) $(\cos 30^\circ + i\sin 30^\circ)^{15}$

2. Find $(a + bi)^{-1}$, if $a + bi$ lies on the unit circle.

3. (a) Find the 6th roots of unity.

(b) Find the 8th roots of unity.

4. (a) Find the cube roots of $-8i$.

(b) Find the fourth roots of $-1$.

5. Solve the equation $x^3 - x^2 - 4 = 0$.

6. Solve the equation $x^3 + 3x^2 - 6x + 20 = 0$.

7. Use De Moivre’s theorem to find formulas for $\cos 3\theta$ (in terms of $\cos\theta$) and $\sin 3\theta$ (in terms of $\sin\theta$).

8. Verify each of the following, for complex numbers $z$ and $w$:

(a) $z\overline{z} = |z|^2$

(b) $\overline{zw} = \overline{z}\,\overline{w}$

(c) $|zw| = |z||w|$

9. Let $a$ and $b$ be integers, each of which can be written as the sum of two perfect squares. Show that $ab$ has the same property.

Hint: Use part (c) of the previous problem.


1.5 Interpolating polynomials

Most calculus problems in textbooks assume that we either have a formula already or can find one easily. But in practice we often have only a few data points, and have to find a formula that fits the data. If we are able to predict the type of formula to use, then the problem is to find the formula of the given type that gives the closest fit to the data. In this section we will deal with a simpler problem. We assume that the formula we need is a polynomial, and that it fits the data exactly.

Two points in a plane determine a unique line that passes through them. We will show that given three points, which do not all lie on the same line, it is possible to find a parabola that passes through them. It is easiest for us to work with graphs of functions, and so we must assume that no two points lie on the same vertical line. This means that we can assume that the parabola determined by three points is actually represented by the graph of a quadratic polynomial.

For a larger number of points, what we will show is that given distinct numbers $x_0, x_1, \ldots, x_n$ together with numbers $y_0, y_1, \ldots, y_n$ (which need not be distinct), there is a polynomial $p(x)$ of degree at most $n$ such that $p(x_i) = y_i$ for each index $i$. There is at most one such polynomial of degree at most $n$. To show this, suppose that $q(x)$ is a polynomial of degree at most $n$ such that $q(x_i) = y_i$ for all $i$. Then the polynomial $f(x) = p(x) - q(x)$ has degree at most $n$ and has the $n + 1$ roots $x_0, x_1, \ldots, x_n$. This is impossible unless $f(x)$ is the zero polynomial, so this shows that $q(x) = p(x)$.

We begin with a familiar case. Given points $(x_0, y_0)$ and $(x_1, y_1)$, the two-point form of the equation of the line through the given points is

\[ \frac{y - y_0}{x - x_0} = \frac{y_1 - y_0}{x_1 - x_0}. \]

This can be simplified to give

\[ y = y_0 + \frac{y_1 - y_0}{x_1 - x_0}(x - x_0). \]

A formula that begins with these terms will provide a basis for one solution to our general problem. It is difficult to write down in general, and so we first give another solution whose formula is easier to remember. We can rewrite the equation as follows:

\[
\begin{aligned}
y &= y_0 + y_1\frac{(x - x_0)}{(x_1 - x_0)} - y_0\frac{(x - x_0)}{(x_1 - x_0)} \\
&= y_0\frac{(x_0 - x_1)}{(x_0 - x_1)} + y_0\frac{(x - x_0)}{(x_0 - x_1)} + y_1\frac{(x - x_0)}{(x_1 - x_0)} \\
&= y_0\frac{(x - x_1)}{(x_0 - x_1)} + y_1\frac{(x - x_0)}{(x_1 - x_0)}.
\end{aligned}
\]


It is important to note how the above function is constructed. As a function of $x$ it has the form

\[ f(x) = f_0(x) + f_1(x), \]

where $f_0(x_0) = y_0$, $f_0(x_1) = 0$, and $f_1(x_0) = 0$, $f_1(x_1) = y_1$.

To solve the problem of finding a polynomial function that passes through three given points $(x_0, y_0)$, $(x_1, y_1)$, $(x_2, y_2)$, we will define three polynomials $f_0(x)$, $f_1(x)$, and $f_2(x)$ such that

\[
\begin{array}{lll}
f_0(x_0) = y_0 & f_1(x_0) = 0 & f_2(x_0) = 0 \\
f_0(x_1) = 0 & f_1(x_1) = y_1 & f_2(x_1) = 0 \\
f_0(x_2) = 0 & f_1(x_2) = 0 & f_2(x_2) = y_2
\end{array}
\]

Then $f(x) = f_0(x) + f_1(x) + f_2(x)$ has the properties we want: $f(x_0) = y_0 + 0 + 0$, $f(x_1) = 0 + y_1 + 0$, and $f(x_2) = 0 + 0 + y_2$.

How do we construct $f_0(x)$, $f_1(x)$, and $f_2(x)$? Since $f_0(x_1) = 0$ and $f_0(x_2) = 0$, we can see that $f_0(x)$ must have $x - x_1$ and $x - x_2$ as factors, so we assume that $f_0(x)$ has the form

\[ f_0(x) = k(x - x_1)(x - x_2) \]

where $k$ is a constant. Since $f_0(x_0) = y_0$, we can substitute and solve for $k$ to get

\[ k = \frac{y_0}{(x_0 - x_1)(x_0 - x_2)}. \]

We can find $f_1(x)$ and $f_2(x)$ similarly. This gives a derivation of the formula in the following proposition.

1.5.1 PROPOSITION. Let $x_0, x_1, x_2$ be distinct real numbers, and let $y_0, y_1, y_2$ be any real numbers. Then the polynomial

\[ f(x) = \frac{y_0(x - x_1)(x - x_2)}{(x_0 - x_1)(x_0 - x_2)} + \frac{y_1(x - x_0)(x - x_2)}{(x_1 - x_0)(x_1 - x_2)} + \frac{y_2(x - x_0)(x - x_1)}{(x_2 - x_0)(x_2 - x_1)} \]

has the property that $f(x_0) = y_0$, $f(x_1) = y_1$, and $f(x_2) = y_2$.

Example 1.5.1.

To find the polynomial of degree 2 whose graph passes through the points $(-1, 1)$, $(0, 0)$, and $(1, 1)$, we can use the Lagrange interpolation formula. We have

\[ f(x) = \frac{1\,(x - 0)(x - 1)}{(-1 - 0)(-1 - 1)} + \frac{0\,(x + 1)(x - 1)}{(0 + 1)(0 - 1)} + \frac{1\,(x + 1)(x - 0)}{(1 + 1)(1 - 0)}, \]

and simplifying gives $f(x) = \frac{1}{2}x(x - 1) + \frac{1}{2}(x + 1)x$. Usually we are interested in substituting values of $x$ close to one of the given points, and so it makes sense to leave the formula as is, rather than rewriting it in terms of powers of $x$, as $f(x) = x^2$.

Given $n + 1$ points, the Lagrange interpolation formula can easily be extended to find the polynomial of degree $n$ (or less) whose graph passes through the points. In order to be able to state the general formula in a reasonable way we need to introduce some notation. We will use $\sum$ to denote sums and $\prod$ to denote products. The next theorem uses this notation to state (without proof) the general Lagrange interpolation formula.

1.5.2 THEOREM (The Lagrange interpolation formula). Let $x_0, x_1, \ldots, x_n$ be distinct real numbers, and let $y_0, y_1, \ldots, y_n$ be any real numbers. Then the polynomial

\[ p(x) = \sum_{i=0}^{n} y_i\,\frac{\prod_{j \ne i}(x - x_j)}{\prod_{j \ne i}(x_i - x_j)} \]

has the property that $p(x_i) = y_i$, for $i = 0, \ldots, n$.
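The Lagrange interpolation formula can be evaluated directly from its definition, one term of the sum at a time. The following Python sketch does this; the function name `lagrange_value` and the representation of the data as a list of (x, y) pairs are assumptions made for this illustration.

```python
def lagrange_value(points, x):
    """Evaluate the Lagrange interpolating polynomial at x.

    points is a list of (x_i, y_i) pairs with distinct x_i.
    """
    total = 0
    for i, (xi, yi) in enumerate(points):
        term = yi
        for j, (xj, _) in enumerate(points):
            if j != i:
                term *= (x - xj) / (xi - xj)   # factor (x - x_j)/(x_i - x_j)
        total += term
    return total


points = [(-1, 1), (0, 0), (1, 1)]             # data from Example 1.5.1
print(lagrange_value(points, 0.5))
```

Since the three points of Example 1.5.1 lie on the graph of $x^2$, the value printed for $x = 0.5$ should be $0.25$.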

One difficulty with the Lagrange interpolation formula is that if an additional point is given, then everything must be recomputed. To motivate another approach, using what are termed divided differences, we go back to the two point form of a straight line, as given below.

\[ f(x) = y_0 + \frac{y_1 - y_0}{x_1 - x_0}(x - x_0). \]

In this case our function is expressed as a sum of two functions $f_0(x)$ and $f_1(x)$ such that $f_0(x_0) = y_0$ and $f_1(x_0) = 0$, while $f_0(x_1) = y_0$ and $f_1(x_1) = y_1 - y_0$.

To add a third term $f_2(x)$ so that $f(x)$ would define a quadratic function passing through $(x_0, y_0)$, $(x_1, y_1)$, and $(x_2, y_2)$, we look for three polynomials $f_0(x)$, $f_1(x)$, and $f_2(x)$ such that

\[ f(x) = f_0(x) + f_1(x) + f_2(x), \]

where $f_0(x)$ has degree zero, $f_1(x)$ has degree one, and $f_2(x)$ has degree two such that

\[
\begin{array}{lll}
f_0(x_0) = y_0 & f_1(x_0) = 0 & f_2(x_0) = 0 \\
f_0(x_1) = y_0 & f_1(x_1) = y_1 - f_0(x_1) & f_2(x_1) = 0 \\
f_0(x_2) = y_0 & f_1(x_2) = f_1(x_2) & f_2(x_2) = y_2 - f_1(x_2) - f_0(x_2)
\end{array}
\]

Since $f_2(x_0) = 0$ and $f_2(x_1) = 0$, we look for a quadratic function of the form

\[ f_2(x) = k(x - x_0)(x - x_1). \]

Since we must have $f(x_2) = y_2$, we can determine the constant $k$ by just substituting $x = x_2$ into the above equation. We obtain

\[ k = \frac{\dfrac{y_2 - y_1}{x_2 - x_1} - \dfrac{y_1 - y_0}{x_1 - x_0}}{(x_2 - x_0)}. \]

This leads to the equation

\[ f(x) = y_0 + \frac{y_1 - y_0}{x_1 - x_0}(x - x_0) + k(x - x_0)(x - x_1). \]

Notice that the first term in the numerator of $k$ is the slope of the line segment joining $(x_1, y_1)$ and $(x_2, y_2)$, while the second term is the slope of the line segment joining $(x_0, y_0)$ and $(x_1, y_1)$, which we have already computed. This is called the divided differences interpolation formula.

Example 1.5.2.

We will use the divided differences method to determine a quadratic through the points $(1,1)$, $(2,4)$, and $(3,9)$, knowing that we should obtain the function $f(x) = x^2$. We let $x_0 = 1$, $x_1 = 2$, and $x_2 = 3$. Then the function we are looking for has the form

\[ f(x) = a + b(x - 1) + c(x - 1)(x - 2), \]

where

\[ a = 1, \qquad b = \frac{4 - 1}{2 - 1} = 3, \qquad \text{and} \qquad c = \frac{\dfrac{9 - 4}{3 - 2} - \dfrac{4 - 1}{2 - 1}}{3 - 1} = \frac{5 - 3}{3 - 1} = 1. \]

This gives us the polynomial $f(x) = 1 + 3(x - 1) + 1(x - 1)(x - 2)$. It can be used in this form, without combining terms. (You can check that it reduces to $f(x) = x^2$, if you like.) For example, $f(2.5) = 1 + 3(1.5) + (1.5)(.5) = 6.25$.

Note that the computation of $c$ uses the value obtained for $b$, together with another value computed similarly. To simplify the computations, it is convenient to arrange the necessary terms in a table in the following way. Here each column of divided differences is constructed from the previous one. Then the coefficients of the polynomial are found by reading from left to right, along the bottom of each column.

\[
\begin{array}{cccc}
x & y & & \\
3 & 9 & & \\
 & & \frac{9-4}{3-2} = 5 & \\
2 & 4 & & \frac{5-3}{3-1} = 1 \\
 & & \frac{4-1}{2-1} = 3 & \\
1 & 1 & &
\end{array}
\]

Example 1.5.3.

In this example we will find a polynomial of degree at most 5 whose graph passes through the points $(3,19)$, $(2.5,16.125)$, $(1,3)$, $(.5,-.375)$, $(0,-2)$, and $(-1,3)$. We begin the table of divided differences by listing the $x$-values in descending order, together with the corresponding $y$-values. Listing the $x$-values in descending order tends to simplify the computations.

\[
\begin{array}{rrrrrrr}
x & y & & & & & \\
3.0 & 19.000 & & & & & \\
 & & 5.75 & & & & \\
2.5 & 16.125 & & -1.5 & & & \\
 & & 8.75 & & -1 & & \\
1.0 & 3.000 & & 1.0 & & 0 & \\
 & & 6.75 & & -1 & & 0 \\
0.5 & -0.375 & & 3.5 & & 0 & \\
 & & 3.25 & & -1 & & \\
0.0 & -2.000 & & 5.5 & & & \\
 & & -5.00 & & & & \\
-1.0 & 3.000 & & & & &
\end{array}
\]

For example, the value 5.5 in the fourth column is computed by subtracting $-5.00$ from 3.25, the immediately preceding values in the table. Since this is in the second column of differences, the denominator is the difference of the $x$ values 0.5 and $-1.0$.

In the next column, the bottom term $-1$ is computed by subtracting 5.5 from 3.5 and then dividing by the difference of the $x$ values 1.0 and $-1.0$.

The zeros in the table show that only four coefficients are necessary, and so the six given points actually lie on a cubic. Reading off the coefficients from the bottoms of the columns, we have $3$, $-5$, $5.5$, $-1$. The first of these is the constant term. The next coefficient corresponds to the last $x$ value in the table, which is $-1$, and so the term is $x - (-1) = x + 1$. The next coefficient corresponds to the term given by the last two $x$ values, namely $(x + 1)(x - 0)$. Continuing in this way gives us the polynomial

\[ f(x) = 3 - 5(x + 1) + 5.5(x + 1)(x) - (x + 1)(x)(x - .5). \]
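The table in Example 1.5.3 can be reproduced mechanically. The sketch below is an illustration rather than the method of the notes themselves: the function names `divided_differences` and `newton_eval` are hypothetical, the data are entered from the bottom of the table upward (so that the coefficients come out in the order they are read off), and exact fractions stand in for the decimals.

```python
from fractions import Fraction

def divided_differences(xs, ys):
    """Return c[0..n] with p(x) = c[0] + c[1](x - xs[0]) + c[2](x - xs[0])(x - xs[1]) + ..."""
    coeffs = list(ys)
    n = len(xs)
    for level in range(1, n):
        # Go from the last entry backward so coeffs[i - 1] still holds the previous column.
        for i in range(n - 1, level - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - level])
    return coeffs

def newton_eval(xs, coeffs, x):
    """Evaluate the unsimplified (nested) form at x."""
    result = coeffs[-1]
    for k in range(len(coeffs) - 2, -1, -1):
        result = result * (x - xs[k]) + coeffs[k]
    return result

# Data of Example 1.5.3, listed from the bottom of the table upward.
xs = [Fraction(s) for s in ("-1", "0", "0.5", "1", "2.5", "3")]
ys = [Fraction(s) for s in ("3", "-2", "-0.375", "3", "16.125", "19")]

coeffs = divided_differences(xs, ys)
print([str(c) for c in coeffs])      # ['3', '-5', '11/2', '-1', '0', '0']
print(newton_eval(xs, coeffs, 2))    # 12
```

The two trailing zeros correspond to the zeros in the table, confirming that the six points lie on a cubic.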

Example 1.5.4.

We know that $1 + 2 + \cdots + n = n(n + 1)/2$. This suggests that the formula for the sum of squares might be a cubic polynomial in $n$. Let us write $p(n) = \sum_{i=0}^{n} i^2$. Then $p(0) = 0$, $p(1) = 0^2 + 1^2 = 1$, $p(2) = 0^2 + 1^2 + 2^2 = 5$, and $p(3) = 0^2 + 1^2 + 2^2 + 3^2 = 14$. To make an educated guess at the general formula, we can use the Lagrange interpolation formula to get

\[
\begin{aligned}
p(n) = {} & \frac{0\,(n-1)(n-2)(n-3)}{(0-1)(0-2)(0-3)} + \frac{1\,(n-0)(n-2)(n-3)}{(1-0)(1-2)(1-3)} \\
 & + \frac{5\,(n-0)(n-1)(n-3)}{(2-0)(2-1)(2-3)} + \frac{14\,(n-0)(n-1)(n-2)}{(3-0)(3-1)(3-2)}.
\end{aligned}
\]

After simplifying (this is left as an exercise) we get $p(n) = n(n + 1)(2n + 1)/6$.

If we use the divided differences technique, we have the following table of differences.

\[
\begin{array}{rrrrr}
n & p(n) & & & \\
3 & 14 & & & \\
 & & 9 & & \\
2 & 5 & & 5/2 & \\
 & & 4 & & 1/3 \\
1 & 1 & & 3/2 & \\
 & & 1 & & \\
0 & 0 & & &
\end{array}
\]

This gives the polynomial $p(n) = 0 + 1(n - 0) + \frac{3}{2}(n - 0)(n - 1) + \frac{1}{3}(n - 0)(n - 1)(n - 2)$, which again can be simplified to $p(n) = n(n + 1)(2n + 1)/6$.
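As a sanity check on Example 1.5.4, the unsimplified divided differences form can be compared with the closed form and with the sum itself. This is only a sketch with hypothetical function names; agreement for finitely many $n$ supports the guess but does not prove it for all $n$, which requires an argument such as mathematical induction.

```python
from fractions import Fraction

def newton_form(n):
    """The divided differences polynomial from the table, left unsimplified."""
    n = Fraction(n)
    return n + Fraction(3, 2) * n * (n - 1) + Fraction(1, 3) * n * (n - 1) * (n - 2)

def closed_form(n):
    """The simplified guess n(n + 1)(2n + 1)/6."""
    return Fraction(n * (n + 1) * (2 * n + 1), 6)

def sum_of_squares(n):
    """The sum 0^2 + 1^2 + ... + n^2 computed directly."""
    return sum(i * i for i in range(n + 1))

# Compare the three quantities for n = 0, 1, ..., 20.
agree = all(newton_form(n) == closed_form(n) == sum_of_squares(n) for n in range(21))
print(agree)   # True
```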


1. Use the Lagrange interpolation formula to find the polynomial of degree 2 whose graph passes through $(0,5)$, $(1,7)$, and $(-1,9)$.

2. Extend the Lagrange interpolation formula to the case of 4 points. Use your formula to find the cubic polynomial whose graph passes through the points $(0,-5)$, $(1,-3)$, $(-1,-11)$, and $(2,1)$.

3. Do the first problem using divided differences.

4. Do the second problem using divided differences.

5. The following values give $\sqrt{x}$ to 6 decimal place accuracy: $(55, 7.416198)$, $(54, 7.348469)$, $(53, 7.280110)$, $(52, 7.211103)$, $(51, 7.141428)$, $(50, 7.071068)$.

(a) Compute the entire table of divided differences.

(b) Find the interpolating polynomial of degree 5 for the given values (do not simplify) and use it to approximate $\sqrt{50.25}$.

6. Fill in the missing algebraic steps in simplifying the two polynomials in Example 1.5.4.

7. Use divided differences to guess a formula for the sum of $n$ cubes.


1.6 Mathematical Induction

The principle of mathematical induction applies to statements that involve an arbitrary positive integer $n$. Examples of such statements are:

1. $x - 1$ is a factor of $x^n - 1$;

2. $(\cos\theta + i \sin\theta)^n = \cos(n\theta) + i \sin(n\theta)$;

3. $1 + 2 + \cdots + n = n(n+1)/2$;

4. $1^2 + 2^2 + \cdots + n^2 = n(n+1)(2n+1)/6$;

5. $1^3 + 2^3 + \cdots + n^3 = n^2(n+1)^2/4$;

6. 3 is a factor of $10^n - 1$;

7. $n^2 - n + 41$ is a prime number.

Notice that each statement depends on the positive integer $n$ and becomes either true or false when some value is substituted for $n$. The above statements are true, except for the last one: $n^2 - n + 41$ is a prime number when $n = 1, 2, \ldots, 10$, but not when $n = 41$.
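The last statement shows why checking a few cases is not a proof. A short search, sketched below with hypothetical helper names and simple trial division, looks for the first positive integer $n$ at which $n^2 - n + 41$ fails to be prime.

```python
def is_prime(m):
    """Trial division; adequate for the small values tested here."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def first_failure(limit=100):
    """Smallest n in 1..limit for which n^2 - n + 41 is not prime, or None."""
    for n in range(1, limit + 1):
        if not is_prime(n * n - n + 41):
            return n
    return None

print(first_failure())   # 41
```

For $n = 41$ the value is $41^2 - 41 + 41 = 41^2$, which is visibly composite.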

Suppose that $P_n$ is a statement depending on the positive integer $n$. If $P_n$ is true for each choice of $n$, then the principle of mathematical induction frequently allows us to establish this fact. We will state the principle, and then apply it to some examples. Note that we could begin numbering with any integer, say $P_0, P_1, \ldots$ or even $P_{-297}, P_{-296}, \ldots$.

1.6.1 THEOREM (Principle of Mathematical Induction). Let $P_1, P_2, \ldots$ be a sequence of propositions. Suppose that

(i) $P_1$ is true, and

(ii) for all positive integers $k$, if $P_k$ is true, then $P_{k+1}$ is true.

Then $P_n$ is true for all positive integers $n$.

We will illustrate the use of induction by giving some examples and proving two theorems.


Example 1.6.1.

As our first example, we will prove by induction that $x - 1$ is a factor of $x^n - 1$, for all positive integers $n$. We should note that this result is a special case of Lemma 1.1.3, but at that point in the text we gave an informal proof that avoided using mathematical induction.

Let $P_n$ be the statement that $x - 1$ is a factor of $x^n - 1$. It is obvious that $P_1$ is true. Next we will assume that $P_k$ is true and prove that $P_{k+1}$ is true. We can write

\[ x^{k+1} - 1 = x^{k+1} - x^k + x^k - 1 . \]

Then $x^{k+1} - x^k = (x - 1)x^k$, and since we have assumed that $x - 1$ is a factor of $x^k - 1$, we have written $x^{k+1} - 1$ as a sum of two terms, each of which has $x - 1$ as a factor. This shows that $x - 1$ is a factor of $x^{k+1} - 1$, and therefore $P_{k+1}$ is true. This completes the induction proof.

This is a good point at which to emphasize that when we are using the principle of mathematical induction, we must establish the truth of $P_1$. However, when we establish the truth of $P_{k+1}$, we get to assume the truth of $P_k$ without having to prove anything about it.

Example 1.6.2.

To establish that $1 + 2 + \cdots + n = n(n+1)/2$, let $P_n$ be the statement $1 + 2 + \cdots + n = n(n+1)/2$. When $n = 1$ we have $1 = 1(1+1)/2$, so $P_1$ is true.

The next step is to show that $P_k$ implies $P_{k+1}$. Assume that $P_k$ is true, so that we have $1 + 2 + \cdots + k = k(k+1)/2$. Adding $k + 1$ to both sides of this equation, we get

\[
\begin{aligned}
1 + 2 + \cdots + k + (k+1) &= \frac{k(k+1)}{2} + (k+1) \\
 &= \frac{k(k+1) + 2(k+1)}{2} \\
 &= \frac{(k+1)(k+2)}{2} \\
 &= \frac{(k+1)[(k+1)+1]}{2}.
\end{aligned}
\]

Thus $P_{k+1}$ is true, and so by induction $P_n$ holds for all positive integers $n$.

Example 1.6.3.

To establish that $1^3 + 2^3 + \cdots + n^3 = n^2(n+1)^2/4$, let $P_n$ be the statement $1^3 + 2^3 + \cdots + n^3 = n^2(n+1)^2/4$. Then $1 = 1^2(1+1)^2/4$, so $P_1$ is true. The next step is to show that $P_k$ implies $P_{k+1}$. Assume that $P_k$ is true, so that we have $1^3 + 2^3 + \cdots + k^3 = k^2(k+1)^2/4$. Add $(k+1)^3$ to both sides of this equation to get

\[ 1^3 + 2^3 + \cdots + k^3 + (k+1)^3 = \frac{k^2(k+1)^2}{4} + (k+1)^3 . \]

We must show that the right hand side of the equation gives us the formula in $P_{k+1}$, which is

\[ \frac{(k+1)^2(k+2)^2}{4}. \]

That part of the proof is left to the reader, as Problem 3.

Example 1.6.4.

To prove that 3 is a factor of $10^n - 1$ for all positive integers $n$, let $P_n$ be the statement that 3 is a factor of $10^n - 1$. Now $P_1$ is true since 3 is a factor of 9. Assume that $P_k$ is true, that is, that 3 is a factor of $10^k - 1$. Then since

\[ 10^{k+1} - 1 = 10 \cdot 10^k - 1 = 10 \cdot 10^k - 10 + 10 - 1 = 10 \cdot (10^k - 1) + 9 \]

we see that 3 is a factor of $10^{k+1} - 1$ since it is a factor of both $10^k - 1$ and 9. Hence $P_{k+1}$ is true. By the principle of mathematical induction, $P_n$ holds for all positive integers $n$.

We next give a proof by induction of Theorem 1.4.2.

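1.6.2 THEOREM (De Moivre). For any positive integer $n$,

\[ (\cos\theta + i \sin\theta)^n = \cos(n\theta) + i \sin(n\theta) . \]

Proof. To give a proof by induction, we first observe that there is nothing to prove when $n = 1$. Now assume that the statement is true for $n = k$. Then

\[
\begin{aligned}
(\cos\theta + i \sin\theta)^{k+1} &= (\cos\theta + i \sin\theta)^k (\cos\theta + i \sin\theta) \\
 &= (\cos(k\theta) + i \sin(k\theta))(\cos\theta + i \sin\theta) \\
 &= (\cos(k\theta)\cos\theta - \sin(k\theta)\sin\theta) + i\,(\sin(k\theta)\cos\theta + \cos(k\theta)\sin\theta) \\
 &= \cos(k\theta + \theta) + i \sin(k\theta + \theta) \\
 &= \cos((k+1)\theta) + i \sin((k+1)\theta) .
\end{aligned}
\]

The crucial step uses two trig identities: $\cos(\phi + \theta) = \cos\phi \cos\theta - \sin\phi \sin\theta$ and $\sin(\phi + \theta) = \sin\phi \cos\theta + \cos\phi \sin\theta$.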
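De Moivre's formula is easy to spot-check numerically. The sketch below uses Python's built-in complex numbers; the function name `de_moivre_gap` and the sample angle are arbitrary choices for illustration, and the comparison is only up to floating-point rounding.

```python
import math

def de_moivre_gap(theta, n):
    """Distance between (cos t + i sin t)^n and cos(nt) + i sin(nt)."""
    lhs = complex(math.cos(theta), math.sin(theta)) ** n
    rhs = complex(math.cos(n * theta), math.sin(n * theta))
    return abs(lhs - rhs)

# The gap should be at the level of rounding error for each n tested.
worst = max(de_moivre_gap(0.7, n) for n in range(1, 20))
print(worst < 1e-12)   # True
```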

Before proving the Binomial Theorem, we need to introduce some notation and prove a Lemma. The symbol $\binom{n}{i}$ is called a binomial coefficient, and it is referred to as “$n$ choose $i$”. It gives the number of ways in which $i$ elements can be chosen out of a set with $n$ elements.

There are $n$ ways to choose the first element, $n - 1$ ways to choose the second element out of the remaining $n - 1$ elements, and so on until the last step, at which point there are $n - (i - 1)$ ways to choose the $i$th element out of the remaining $n - (i - 1)$ elements. To find the correct number of possibilities that are independent of the order in which the elements are chosen, we must divide by the number of ways in which the $i$ elements can be reordered. This gives us

\[
\begin{aligned}
\binom{n}{i} &= \frac{n(n-1) \cdots (n-i+1)}{i(i-1) \cdots 1} \\
 &= \frac{n(n-1) \cdots (n-i+1)(n-i) \cdots 1}{i(i-1) \cdots 1 \, (n-i) \cdots 1} \\
 &= \frac{n!}{i!\,(n-i)!},
\end{aligned}
\]

where $n! = n(n-1) \cdots 2 \cdot 1$ for $n \geq 1$ and $0! = 1$.

1.6.3 LEMMA. For any positive integer $k$ and any positive integer $i$ with $i \leq k$,

\[ \binom{k}{i} + \binom{k}{i-1} = \binom{k+1}{i}. \]

Proof. Writing out the definition of the binomial coefficients and finding a common denominator for the sum, we have

\[
\begin{aligned}
\binom{k}{i} + \binom{k}{i-1} &= \frac{k!}{i!\,(k-i)!} + \frac{k!}{(i-1)!\,(k-i+1)!} \\
 &= \frac{(k-i+1)\,k!}{i!\,(k-i)!\,(k-i+1)} + \frac{(i)\,k!}{(i)(i-1)!\,(k-i+1)!} \\
 &= \frac{(k-i+1)\,k!}{i!\,(k-i+1)!} + \frac{(i)\,k!}{i!\,(k-i+1)!} \\
 &= \frac{(k+1)\,k!}{i!\,(k+1-i)!} \\
 &= \binom{k+1}{i}.
\end{aligned}
\]


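1.6.4 THEOREM (The Binomial Theorem). Let $a$ and $b$ be real numbers. Then for any positive integer $n$,

\[ (a+b)^n = \sum_{i=0}^{n} \binom{n}{i} a^{n-i} b^i . \]

Proof. Let $P_n$ be the statement that $(a+b)^n = \sum_{i=0}^{n} \binom{n}{i} a^{n-i} b^i$. Then

\[ \sum_{i=0}^{1} \binom{1}{i} a^{1-i} b^i = \binom{1}{0} a^1 b^0 + \binom{1}{1} a^0 b^1 = a + b \]

and so $P_1$ is true. Next, assume that $P_k$ is true. Then we have

\[
\begin{aligned}
(a+b)^{k+1} &= (a+b)(a+b)^k \\
 &= (a+b) \left( \sum_{i=0}^{k} \binom{k}{i} a^{k-i} b^i \right) \\
 &= \sum_{i=0}^{k} \binom{k}{i} a^{k-i+1} b^i + \sum_{i=0}^{k} \binom{k}{i} a^{k-i} b^{i+1} .
\end{aligned}
\]

Next we break the two sums apart to get

\[ (a+b)^{k+1} = \binom{k}{0} a^{k+1} b^0 + \sum_{i=1}^{k} \binom{k}{i} a^{k+1-i} b^i + \sum_{i=0}^{k-1} \binom{k}{i} a^{k-i} b^{i+1} + \binom{k}{k} a^0 b^{k+1} . \]

Then we can change the indices in the summations, so that we can combine terms by using Lemma 1.6.3.

\[
\begin{aligned}
(a+b)^{k+1} &= \binom{k}{0} a^{k+1} b^0 + \sum_{i=1}^{k} \binom{k}{i} a^{k+1-i} b^i + \sum_{i=1}^{k} \binom{k}{i-1} a^{k-i+1} b^i + \binom{k}{k} a^0 b^{k+1} \\
 &= a^{k+1} b^0 + \sum_{i=1}^{k} \left[ \binom{k}{i} + \binom{k}{i-1} \right] a^{k+1-i} b^i + a^0 b^{k+1} \\
 &= \binom{k+1}{0} a^{k+1} b^0 + \sum_{i=1}^{k} \binom{k+1}{i} a^{k+1-i} b^i + \binom{k+1}{k+1} a^0 b^{k+1} \\
 &= \sum_{i=0}^{k+1} \binom{k+1}{i} a^{k+1-i} b^i .
\end{aligned}
\]

Thus $P_{k+1}$ is true, and the proof is complete.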
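The induction step in this proof is effectively an algorithm: each row of binomial coefficients is obtained from the previous one by the identity of Lemma 1.6.3. The sketch below (function names are hypothetical) builds the rows that way and compares the result with the standard library's `math.comb` and with a direct power.

```python
from math import comb

def pascal_row(n):
    """Binomial coefficients C(n, 0), ..., C(n, n), built using only Lemma 1.6.3."""
    row = [1]  # the row for n = 0
    for k in range(n):
        # C(k+1, i) = C(k, i) + C(k, i-1) for 1 <= i <= k; both end entries are 1.
        row = [1] + [row[i] + row[i - 1] for i in range(1, k + 1)] + [1]
    return row

def binomial_expand(a, b, n):
    """Evaluate the sum of C(n, i) a^(n-i) b^i over i = 0..n."""
    return sum(c * a ** (n - i) * b ** i for i, c in enumerate(pascal_row(n)))

print(pascal_row(5))                                   # [1, 5, 10, 10, 5, 1]
print(pascal_row(5) == [comb(5, i) for i in range(6)]) # True
print(binomial_expand(2, 3, 4) == (2 + 3) ** 4)        # True
```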

A second form of mathematical induction is more useful for some purposes.

1.6.5 THEOREM (Second Principle of Mathematical Induction). Let $P_1, P_2, \ldots$ be a sequence of propositions. Suppose that

(i) $P_1$ is true, and

(ii) for all $k \in \mathbb{Z}^+$, if $P_m$ is true for all $m \leq k$, then $P_{k+1}$ is true.

Then $P_n$ is true for all positive integers $n$.

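Example 1.6.5.

Define a sequence of natural numbers as follows: Let $F_1 = F_2 = 1$, and $F_n = F_{n-1} + F_{n-2}$ for $n \geq 3$. Thus $F_3 = 2$, $F_4 = 3$, $F_5 = 5$, $F_6 = 8$, etc. The sequence $F_1, F_2, \ldots$ is called the Fibonacci sequence. We will show that $F_n < (7/4)^n$ for all positive integers $n$.

Let $P_n$ be the statement $F_n < (7/4)^n$. Then $P_1$ says $F_1 = 1 < (7/4)^1$, which is true, and $P_2$ says $F_2 = 1 < (7/4)^2$, which is also true. (The case $n = 2$ must be checked separately, since the recursion $F_{k+1} = F_k + F_{k-1}$ is only available for $k \geq 2$.) Assuming $P_m$ for all $m \leq k$, where $k \geq 2$, we have that

\[
\begin{aligned}
F_{k+1} = F_k + F_{k-1} &< \left(\frac{7}{4}\right)^{k} + \left(\frac{7}{4}\right)^{k-1} = \left(\frac{7}{4}\right)^{k-1} \left(\frac{7}{4} + 1\right) = \left(\frac{7}{4}\right)^{k-1} \left(\frac{11}{4}\right) \\
 &< \left(\frac{7}{4}\right)^{k-1} \left(\frac{49}{16}\right) = \left(\frac{7}{4}\right)^{k+1},
\end{aligned}
\]

and so $P_{k+1}$ holds. Thus by the second principle of mathematical induction, $F_n < (7/4)^n$ for all positive integers $n$.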
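The bound in Example 1.6.5 can be checked for an initial range of $n$ with exact arithmetic. This is a sketch with a hypothetical function name; it verifies finitely many cases only, and the induction argument is what covers all $n$.

```python
from fractions import Fraction

def fib_bound_holds(limit):
    """Check F_n < (7/4)^n for n = 1, ..., limit using exact fractions."""
    a, b = 1, 1                 # a = F_1, b = F_2
    bound = Fraction(7, 4)
    for n in range(1, limit + 1):
        if not a < bound ** n:
            return False
        a, b = b, a + b         # advance so that a = F_(n+1)
    return True

print(fib_bound_holds(60))   # True
```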


Use the principle of mathematical induction to establish each of the following, where $n$ is any positive integer:

1. $n < 2^n$

2. $1^2 + 2^2 + \cdots + n^2 = n(n+1)(2n+1)/6$

3. $1^3 + 2^3 + \cdots + n^3 = n^2(n+1)^2/4$ (Complete the proof in Example 1.6.3).

4. $1 + 3 + 5 + \cdots + (2n-1) = n^2$

5. $2 + 2^2 + \cdots + 2^n = 2^{n+1} - 2$

6. $x - 1$ is a factor of $x^n - 1$

7. $x + 1$ is a factor of $x^{2n-1} + 1$

8. $1 + r + r^2 + \cdots + r^n = \dfrac{1 - r^{n+1}}{1 - r}$ (when $r \neq 1$)

9. $\displaystyle \sum_{i=1}^{n} \frac{1}{i \cdot (i+1)} = \frac{n}{n+1}$

10. $\displaystyle \sum_{i=0}^{n} \binom{n}{i} = 2^n$

Hint: Use the fact that $\binom{k+1}{i} = \binom{k}{i} + \binom{k}{i-1}$.

Chapter 2

Derivatives

The slope $m$ of the linear function $y = mx + b$ has an obvious interpretation in the graph of the function. It also measures the rate of growth of $y$ compared to $x$. For example, for the function $y = 2x + 1$ we can make a chart of some values.

\[
\begin{array}{c|cccccc}
x & 0 & 1 & 2 & 3 & 4 & 5 \\ \hline
y & 1 & 3 & 5 & 7 & 9 & 11
\end{array}
\]

This shows very clearly that as $x$ increases, $y$ increases twice as fast. For polynomials of higher degree the rate of growth changes for different values of $x$, but it still has very important implications for the function.

In this chapter the main question we will solve is the construction of tangent lines for polynomials. A tangent line to the curve $y = f(x)$ at $x = a$ can also be viewed as a “local linear approximation”. That is, the tangent line is the linear function that provides the best approximation to $y = f(x)$ by a straight line, at least for values close to $x = a$. One reason that the tangent line at $x = a$ is useful is that its slope provides a measure of the rate of growth of the function at $x = a$. That is why the geometric question of constructing tangent lines is related to the solution of problems of motion in physics.

We will now discuss the construction of a mathematical model for a particularly simple kind of motion. Consider the motion of a car along a straight road. The first problem is to give an analytic description of the relationship of position and time. The language of set theory provides a general way of describing relationships between quantities, and in this case we can use it in talking about a function from one set of real numbers to another.

In order to define a numerical relationship, we can select a reference point, letting one direction be positive and the other negative. This position only requires one number since the car is moving along a straight road, and similarly the time can be given by a single number, the elapsed time.

We can abstract the situation a little further, by representing the car as a point moving along a straight line, whose position is given by a number expressing the distance and direction from the point to a fixed reference point, usually called the origin. Specifying the position of the point at each instant in time is thus equivalent to defining a function from the set of all real numbers (representing time) to the set of all real numbers (representing position).

The following questions are some of those which arise in this situation. (1) If you know the function giving the position of the car at each instant, can you give the function which describes its velocity at each instant? (2) If you know only the velocity at each instant, can you tell the distance traveled during a particular time interval? (3) If you know only the function giving the velocity at each instant, can you reconstruct the function giving the position at each instant?

Answering the first question would be equivalent to giving a function listing the speedometer readings at each instant. Here we assume that we have a speedometer that gives both positive and negative readings, depending on the direction of travel. The function describing the rate of change (or velocity) at each instant is called the derived function or simply the derivative of the original function. Sometimes information about the motion of the car can be obtained more easily from the derived function than it can from the original function. For example, you could find out when the car is stationary by simply finding out when the derived function is zero. A positive value for the derivative indicates forward motion and a negative value indicates the reverse, so if you know that in a particular time interval the derivative is positive, then zero, and then negative, this tells you that the car was moving forward, then stopped and started moving backwards. The point of farthest advance during this interval can then be found by solving the equation obtained by setting the derived function equal to zero.

The second problem was the following: knowing only the velocity at each instant, find the distance traveled during a given time interval. If the velocity is constant, the problem can be solved rather easily, by multiplying the velocity by the amount of time. But in general situations, the velocity will be changing all of the time, so this method will not work. If we could find an average value for the velocity, then we could just multiply this average value by the amount of time. The problem lies in the fact that there are infinitely many readings of the speedometer, as given by the function describing the velocity, and familiar methods only deal with finding the average of a finite number of values.

In physical processes depending on time, there are normally only relatively small changes in the process during short intervals of time. Functions that have a similar property, that small changes in the independent variable produce only relatively small changes in the dependent variable, are called continuous functions. (To really explain this, we would need to make precise what is meant by “relatively”.) It can be shown that an average can be found for any continuous function, so that the methods we will develop will work in almost all physical situations.

Integral calculus deals with this second problem. If the rate at which a process is being carried out is known, and described analytically by a function, then the number that gives the total outcome of the process during the time interval is called the definite integral of this function, over the given interval of time.

The third problem, where we are given a function describing the velocity of the car and are then asked to find a function giving its position at each instant, is investigated in the branch of analysis known as differential equations. Of course, if we know the answer to this question, we can answer the second one quite easily. This is a difficult area of study, but it is very important, since many physical situations can be described by giving simple equations involving rates of change. An equation involving derivatives of a function is called a differential equation, and such equations often give the simplest statements of physical laws.

For example, by solving a differential equation expressing the assumption that the only force acting on a planet is the gravitational attraction of the sun, and that this is inversely proportional to the square of the distance between them, it is possible to show that the planet must follow an elliptical path. This was one of the early triumphs of the techniques of calculus and differential equations, when Newton derived Kepler’s laws of planetary motion from his own simpler law of gravitational attraction.

Tangent lines for general curves are defined using an infinite process that involves “limits”. This chapter develops tangent lines for polynomial functions. These can be defined without the use of limits; all that is needed is the Remainder Theorem. Hopefully this will help you to understand tangent lines and derivatives in a fairly simple setting that does not require the machinery of limits, which can be quite difficult to understand, especially when seeing it for the first time. Some of the basic formulas for tangent lines are very easy to obtain with this elementary approach.


2.1 Tangent lines for polynomial functions

From your high school course in geometry you know how to define what it means for a line to be tangent to a circle: the line must touch the circle in exactly one point. It is possible to define what it means for a line to be tangent to a parabola, but it has to be done more carefully. A line should be tangent to a curve at a given point of the curve if it passes through that point and in some sense matches nearby points of the curve as closely as possible. Another way of stating this is to say that the line should be the one that gives the best “local approximation” to the curve.

Example 2.1.1. The graph in Figure 2.1.1 shows $y = x^2 + 2x - 3$ together with the line $y = 2x - 3$. The straight line is just the “linear part” of the quadratic, and it seems to be a good candidate for a tangent line or “local linear approximation” to the curve at the point $(0,-3)$.

The reason that $y = 2x - 3$ is very close to $y = x^2 + 2x - 3$ for values of $x$ close to zero is that the error term $x^2$ (the difference between the two functions) is small in comparison to the values of the function. For example, if $x = .1$, then $x^2 = .01$; if $x = .001$, then $x^2 = .000001$, etc.

Example 2.1.2. For the curve $y = x^2 + 2x - 3$, if we want a tangent line at the point $(1,0)$ rather than the point $(0,-3)$, then the line $y = 2x - 3$ obviously will not work. If we could express $x^2 + 2x - 3$ in terms of powers of $x - 1$ instead of powers of $x$, then we would be able to give the same argument as before: if $x$ is close to $1$, then $(x-1)^2$ is much smaller than $x - 1$, and so the quadratic term doesn’t make much difference. With a little bit of work, we can express $y$ in the form

\[ y = x^2 + 2x - 3 = (x - 1)^2 + 4(x - 1) \]

so that the “linear” part of $x^2 + 2x - 3$ when expressed in powers of $x - 1$ is just $4(x - 1)$.

The argument that we have used works in general. If we have a polynomial that is expressed in terms of powers of $x - a$ instead of powers of $x$, then the linear function that best approximates the polynomial for values of $x$ close to $x = a$ is just the part of the polynomial involving $x - a$, together with the constant term. We will simply take this as the definition of the tangent line.


Figure 2.1.1: Tangent line to $y = x^2 + 2x - 3$ at $x = 0$

Figure 2.1.2: Tangent line to $y = x^2 + 2x - 3$ at $x = 1$


2.1.1 DEFINITION. For the polynomial function

\[ p(x) = c_n(x - a)^n + c_{n-1}(x - a)^{n-1} + \cdots + c_2(x - a)^2 + c_1(x - a) + c_0 \]

the tangent line to the curve $y = p(x)$ at the point $(a, c_0)$ is the line

\[ y = c_1(x - a) + c_0 . \]

The slope of the tangent line, given by the coefficient of the linear term, will be written $p'(a) = c_1$.

With this notation, the equation of the line tangent to the curve $y = p(x)$ at the point determined by $x = a$ is

\[ y = p'(a)(x - a) + p(a) . \]

Given a polynomial $p(x)$ and a number $a$, how can the polynomial be written in terms of powers of $x - a$? If we use the Remainder Theorem from Section 1.1 of Chapter 1, we can write

\[ p(x) = (x - a)q_1(x) + p(a) \]

for some polynomial $q_1(x)$. Next, we have to do the same thing for $q_1(x)$. If we write

\[ q_1(x) = (x - a)q_2(x) + c , \]

then substituting for $q_1(x)$ in the first equation gives

\[ p(x) = q_2(x)(x - a)^2 + c(x - a) + p(a) . \]

Continuing this procedure will eventually give us $p(x)$ expressed in terms of powers of $x - a$.

Example 2.1.3. As our first example we will show how to express $x^3 - x^2 - 5x + 6$ in terms of powers of $x - 1$. Our first step is to divide the given polynomial by $x - 1$.

\[
\begin{array}{rrrrr}
 & x^2 & & -5 & \\ \hline
x - 1 \,\big) & x^3 & -x^2 & -5x & +6 \\
 & x^3 & -x^2 & & \\ \hline
 & & & -5x & +6 \\
 & & & -5x & +5 \\ \hline
 & & & & 1
\end{array}
\]

This gives us the equation

\[ x^3 - x^2 - 5x + 6 = (x^2 - 5)(x - 1) + 1 . \]

We next divide the quotient $x^2 - 5$ by $x - 1$ to get $x^2 - 5 = (x+1)(x-1) - 4$. After substituting this value into the previous equation we have

\[ x^3 - x^2 - 5x + 6 = ((x+1)(x-1) - 4)(x-1) + 1 = (x+1)(x-1)^2 - 4(x-1) + 1 . \]

Finally, we must divide $x + 1$ by $x - 1$, giving us $x + 1 = (x - 1) + 2$. After substituting for $x + 1$ in the previous equation, we have

\[ x^3 - x^2 - 5x + 6 = (x - 1)^3 + 2(x - 1)^2 - 4(x - 1) + 1 . \]

The linear part of this function is $y = -4(x-1) + 1$, and this gives the equation of the tangent line at $x = 1$.

Figure 2.1.3: Tangent line to $y = x^3 - x^2 - 5x + 6$ at $x = 1$

To find $p'(a)$ we do not need to find the complete expansion of $p(x)$ in powers of $x - a$. If we use the Remainder Theorem to write $p(x) = (x-a)q(x) + p(a)$, and then express $q(x)$ in powers of $x - a$, then to find the linear part of $p(x)$ we need to find the constant term $q(a)$ of $q(x)$. Note that $(x-a)q(x) = p(x) - p(a)$, and so to find $p'(a)$ we only need to (i) find $\frac{p(x) - p(a)}{x - a}$; (ii) evaluate this expression at $x = a$. This proves the following proposition.

2.1.2 PROPOSITION. For any polynomial $p(x)$ and any number $a$, the value $p'(a)$ is found by substituting $x = a$ in the polynomial

\[ \frac{p(x) - p(a)}{x - a}. \]

Example 2.1.4. We will now use the above method to redo Example 2.1.3. For $p(x) = x^3 - x^2 - 5x + 6$ and $a = 1$ we have $p(x) - p(1) = x^3 - x^2 - 5x + 5$ since $p(1) = 1$. Dividing by $x - 1$ gives $x^2 - 5$, and then substituting $x = 1$ gives $-4$, the value we found in Example 2.1.3.

Example 2.1.5. We want to find the equation of the line tangent to the curve $y = x^3$ at $x = -2$. Letting $f(x) = x^3$, we have $f(-2) = -8$, and we divide $f(x) - f(-2) = x^3 + 8$ by $x + 2$ to get $x^2 - 2x + 4$. Substituting $x = -2$ gives 12, which is the slope of the tangent line. The equation of the tangent line at $x = -2$ is therefore $y = 12(x + 2) - 8$.
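Proposition 2.1.2 translates into a two-step computation: divide by $x - a$, then evaluate the quotient at $a$. The sketch below is one possible way to carry this out; the function names are hypothetical, and polynomials are represented (by assumption) as coefficient lists with the highest power first. Dividing $p(x)$ by $x - a$ gives the same quotient as dividing $p(x) - p(a)$, with $p(a)$ appearing as the remainder.

```python
def synthetic_division(coeffs, a):
    """Divide by (x - a). coeffs lists the highest power first.

    Returns (quotient coefficients, remainder); the remainder equals p(a).
    """
    row = []
    carry = 0
    for c in coeffs:
        carry = carry * a + c
        row.append(carry)
    return row[:-1], row[-1]

def evaluate(coeffs, x):
    """Evaluate a polynomial by nested multiplication."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value

def slope_at(coeffs, a):
    """Slope p'(a): evaluate the quotient (p(x) - p(a)) / (x - a) at x = a."""
    quotient, _ = synthetic_division(coeffs, a)
    return evaluate(quotient, a)

print(slope_at([1, -1, -5, 6], 1))   # -4, as in Example 2.1.4
print(slope_at([1, 0, 0, 0], -2))    # 12, as in Example 2.1.5
```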

The next three examples give some additional computational techniques for expressing a polynomial $p(x)$ in terms of powers of $x - a$. The method introduced in Example 2.1.3, using repeated division, was given to motivate the proof of Proposition 2.1.2. There are several other methods that you might prefer. In each example we will use the polynomial from Example 2.1.3.

Example 2.1.6. The repeated divisions shown in Example 2.1.3 can be done using synthetic division, in the following form. We want to express $x^3 - x^2 - 5x + 6$ in terms of powers of $x - 1$, and so we divide by $x - 1$, then again divide the answer by $x - 1$, etc.

\[
\begin{array}{r|rrrr}
1 & 1 & -1 & -5 & 6 \\
 & 1 & 0 & -5 & 1 \\
 & 1 & 1 & -4 & \\
 & 1 & 2 & & \\
 & 1 & & &
\end{array}
\]

This allows us to read off the coefficients $1, 2, -4, 1$ (the last entry of each row, from the bottom row up), giving

\[ x^3 - x^2 - 5x + 6 = (x - 1)^3 + 2(x - 1)^2 - 4(x - 1) + 1 . \]
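The repeated synthetic division of Example 2.1.6 can be written as a short loop. This is a sketch under the assumption that a polynomial is stored as a list of coefficients with the highest power first; the name `expand_about` is hypothetical. Each pass produces one row of the table, and the last entry of each row is the next coefficient.

```python
def expand_about(coeffs, a):
    """Coefficients of p in powers of (x - a), highest power first."""
    work = list(coeffs)
    remainders = []
    while work:
        carry = 0
        row = []
        for c in work:
            carry = carry * a + c
            row.append(carry)
        remainders.append(row[-1])   # last entry of the row is the remainder
        work = row[:-1]              # the rest is the quotient, divided again
    return remainders[::-1]

print(expand_about([1, -1, -5, 6], 1))   # [1, 2, -4, 1]
```

The second entry from the end of the result is the slope of the tangent line at $x = a$, and the last entry is $p(a)$.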

Example 2.1.7. If $p(x)$ is expressed in powers of $x - a$ as

\[ p(x) = c_n(x - a)^n + \cdots + c_1(x - a) + c_0 , \]

then substituting $x + a$ for $x$ gives us

\[ p(x + a) = c_n x^n + \cdots + c_1 x + c_0 . \]

This shows that to find the necessary coefficients to express $p(x)$ in terms of powers of $x - a$, we only need to substitute $x + a$ into the polynomial and collect terms. Thus substituting $x + 1$ into the polynomial $x^3 - x^2 - 5x + 6$ gives us

\[
\begin{aligned}
(x + 1)^3 - (x + 1)^2 - 5(x + 1) + 6 &= x^3 + 3x^2 + 3x + 1 - x^2 - 2x - 1 - 5x - 5 + 6 \\
 &= x^3 + 2x^2 - 4x + 1
\end{aligned}
\]

and so we get the same coefficients $1, 2, -4, 1$ as before.

Example 2.1.8. For the final technique, we note that when we express $x^3 - x^2 - 5x + 6$ in terms of powers of $x - 1$, the coefficient of $(x - 1)^3$ must be the same as the coefficient of $x^3$ in the original polynomial. Subtracting $(x - 1)^3$ gives us

\[ (x^3 - x^2 - 5x + 6) - (x - 1)^3 = 2x^2 - 8x + 7 . \]

Next we subtract $2(x - 1)^2$ from the answer, since that is the multiple of a power of $x - 1$ that we need to cancel $2x^2$. Thus

\[ (2x^2 - 8x + 7) - 2(x - 1)^2 = -4x + 5 . \]

Finally, we have $(-4x + 5) - (-4(x - 1)) = 1$, and substituting and collecting together the powers of $x - 1$ gives us

\[ x^3 - x^2 - 5x + 6 = (x - 1)^3 + 2(x - 1)^2 - 4(x - 1) + 1 . \]



1. Write $2x^2 - 3x + 1$ in terms of powers of $x - 1$ and find the equation of the line tangent to $y = 2x^2 - 3x + 1$ at $x = 1$.

2. Write $(x - 3)^2$ in terms of powers of $x - 1$ and find the equation of the line tangent to $y = (x - 3)^2$ at $x = 1$.

3. Write $y = x^2 - 3x$ in terms of powers of $x + 1$ and find the equation of the line tangent to $y = x^2 - 3x$ at $x = -1$.

4. Using each of the techniques in Examples 2.1.6, 2.1.7, and 2.1.8, write $x^3 - x^2 - 5x + 6$ in terms of powers of $x - 3$.

5. Write $x^3$ in terms of powers of $x + 1$ and also in terms of powers of $x - 1$. Find the equations of the lines tangent to $y = x^3$ at $x = -1$ and at $x = 1$.

6. Write $x^4 - 8x^3 + 21x^2 - 17x + 2$ in terms of powers of $x - 2$ and find the equation of the line tangent to $y = x^4 - 8x^3 + 21x^2 - 17x + 2$ at $x = 2$.


2.2 Derivatives

We are finally at a place in our development at which we can prove a theorem that will allow us to find the derivative $p'(a)$ more easily. We can obtain formulas in some simple cases and then take any polynomial and break it up into a sum of terms that we can deal with. Note that the derivative at $a$ of a linear function $p(x) = mx + b$ is equal to $m$, for all values of $a$.

2.2.1 THEOREM. Let $p(x)$ and $q(x)$ be polynomials.

(a) The tangent line to the curve $y = p(x) + q(x)$ at $x = a$ has slope

\[ p'(a) + q'(a) . \]

(b) The tangent line to the curve $y = p(x)q(x)$ at $x = a$ has slope

\[ p'(a)q(a) + p(a)q'(a) . \]

Proof. Given $p(x)$ and $q(x)$, we first write them in terms of powers of $x - a$. Suppose that we have

\[ p(x) = b_n(x - a)^n + \cdots + p'(a)(x - a) + p(a) \]

and

\[ q(x) = c_k(x - a)^k + \cdots + q'(a)(x - a) + q(a) . \]

It is obvious from simply adding $p(x)$ and $q(x)$ that the coefficient of $(x - a)$ in the sum is $p'(a) + q'(a)$.

To prove part (b), we need to write out the linear part of the polynomial $p(x)q(x)$. Every product of terms other than those coming from $p'(a)(x - a) + p(a)$ and $q'(a)(x - a) + q(a)$ involves $(x - a)^2$ or a higher power, so the constant and linear terms of the product are

\[ [p'(a)q(a) + q'(a)p(a)](x - a) + p(a)q(a) , \]

and so the coefficient of $(x - a)$ is precisely what the theorem states it should be.

Given a polynomial $p(x)$, it is often convenient to write out the formula for the slope of the tangent line in terms of $x$ instead of just at the one point $x = a$. For example, if $p(x) = 5x^3 - 8x^2 + 3x - 5$, then we can write $p'(x) = 15x^2 - 16x + 3$. Then substituting $x = a$ into the formula for $p'(x)$ gives the slope of the line tangent to the curve $y = 5x^3 - 8x^2 + 3x - 5$ at the point whose $x$-coordinate is $a$. It is also useful to have a notation for the formula for the slopes that does not depend on having a name for the function. The next definition includes a second notation $D_x(p(x))$ that makes this possible.

2.2.2 DEFINITION. For a polynomial function $p(x)$, the function $p'(x)$ is called the derivative of $p(x)$. We will also use the notation $D_x(p(x)) = p'(x)$.

2.2.3 THEOREM. If $p(x)$ is a polynomial function and $n$ is any positive integer, then $D_x(p(x))^n = n(p(x))^{n-1} p'(x)$.

Proof. Using Theorem 2.2.1, we have

\[ D_x(p(x))^2 = D_x(p(x)p(x)) = p'(x)p(x) + p(x)p'(x) = 2p(x)p'(x) . \]

We can use this calculation to obtain the following formula:

\[ D_x(p(x))^3 = D_x(p(x)^2 p(x)) = [2p(x)p'(x)]p(x) + p(x)^2 p'(x) = 3p(x)^2 p'(x) . \]

Now it is clear how to give a proof by induction. We assume that we have already proved the result for $D_x(p(x)^n)$. Then

\[ D_x(p(x))^{n+1} = D_x(p(x)^n p(x)) = [n p(x)^{n-1} p'(x)]p(x) + p(x)^n p'(x) = (n+1)p(x)^n p'(x) . \]

The principle of mathematical induction implies that the formula holds for all positive integers $n$.

2.2.4 COROLLARY. For the polynomial function

\[ p(x) = b_n x^n + b_{n-1} x^{n-1} + \cdots + b_1 x + b_0 , \]

the line tangent to the curve $y = p(x)$ at $x = a$ has slope

\[ p'(a) = b_n n a^{n-1} + b_{n-1}(n-1)a^{n-2} + \cdots + b_2 2a + b_1 . \]

Proof. We can break $p(x)$ up into the sum of the terms $b_n x^n$, $b_{n-1} x^{n-1}$, etc., and so by part (a) of Theorem 2.2.1, we only need to find the slope of the tangent line for each individual term. Since $bx$ has slope $b$, we can use part (b) of Theorem 2.2.1 to find the slope of $bx^2 = (bx)(x)$. We get a slope of $(b)(a) + (ba)(1) = b(2a)$. In general, by applying Theorem 2.2.1 and Theorem 2.2.3 in the special case $y = b_i x^i$, we get a slope of $b_i i a^{i-1}$, and we can apply this formula to the individual terms $b_n x^n$, $b_{n-1} x^{n-1}$, etc.


Example 2.2.1. If $p(x) = x^2 + 2x - 3$, then $p'(x) = 2x + 2$. To find the slope of the tangent line when $x = 1$, substituting gives $p'(1) = 2 \cdot 1 + 2 = 4$.

Example 2.2.2. If $p(x) = x^3 - x^2 - 5x + 6$, then we have $p'(x) = 3x^2 - 2x - 5$, and substituting $x = 1$ gives $p'(1) = 3 - 2 - 5 = -4$ as the slope of the tangent line at $x = 1$.

Example 2.2.3. Using Theorem 2.2.3 we have

$$D_x\big((5x^4 - 3x - 1)^8\big) = 8(5x^4 - 3x - 1)^7(20x^3 - 3) .$$

The slope of the tangent line to $y = (5x^4 - 3x - 1)^8$ at $x = 1$ is $8(5(1)^4 - 3(1) - 1)^7(20(1)^3 - 3) = 8 \cdot 17 = 136$, and so the equation of the tangent line at $x = 1$ is $y = 136(x - 1) + 1$.
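The three examples above can be checked mechanically. The following is a minimal sketch, not part of the original notes: it assumes a polynomial is stored as a list of coefficients `[b_0, b_1, ..., b_n]`, and the helper names `poly_eval`, `poly_derivative`, and `power_slope` are hypothetical names chosen for illustration.

```python
def poly_eval(coeffs, x):
    """Evaluate b_n x^n + ... + b_1 x + b_0, where coeffs = [b_0, b_1, ..., b_n]."""
    result = 0
    for b in reversed(coeffs):
        result = result * x + b
    return result


def poly_derivative(coeffs):
    """Differentiate term by term as in Corollary 2.2.4: b_i x^i becomes i*b_i x^(i-1)."""
    return [i * b for i, b in enumerate(coeffs)][1:]


def power_slope(coeffs, n, a):
    """Slope of y = p(x)^n at x = a, using the formula of Theorem 2.2.3."""
    p_a = poly_eval(coeffs, a)
    dp_a = poly_eval(poly_derivative(coeffs), a)
    return n * p_a ** (n - 1) * dp_a


# Example 2.2.1: p(x) = x^2 + 2x - 3 has derivative 2x + 2, stored as [2, 2].
print(poly_derivative([-3, 2, 1]))
# Example 2.2.2: slope of x^3 - x^2 - 5x + 6 at x = 1.
print(poly_eval(poly_derivative([6, -5, -1, 1]), 1))
# Example 2.2.3: slope of (5x^4 - 3x - 1)^8 at x = 1.
print(power_slope([-1, -3, 0, 0, 5], 8, 1))
```

Storing the constant term first makes the index of each coefficient equal to its exponent, which keeps the term-by-term rule to a single line.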


1. Use Corollary 2.2.4 to find the derivative of each of the following polynomials.

(a) $p(x) = x^2 - 3x$

(b) $q(x) = 4x^3 - 12x^2 + 17$

(c) $f(x) = x^4 - 8x^3 + 21x^2 - 17x + 2$

(d) $g(x) = 7x^{43} + 4x^{19} - 211x + 17$

2. Find the derivative of each of the following functions.

(a) $(x - 2)^8$

(b) $(x^3 - 2x^2)^{10}$

(c) $(x^7 + 4x^3 - x^2 + 5x + 3)^4$

(d) $(x^7 + 4x^3 - x^2 + 5x + 3)^4 (x^3 - 2x^2)^{10}$

3. For each of the following polynomials, find the equation of the tangent line at the given points.

(a) $p(x) = x^2 - 3x$ at $x = 0$ and $x = 1$

(b) $q(x) = 4x^3 - 12x^2 + 17$ at $x = 1$ and $x = 2$

(c) $f(x) = x^4 - 8x^3 + 21x^2 - 17x + 2$ at $x = 1$ and $x = -1$

(d) $g(x) = x^4 - 8x^3 + 22x^2 - 24x + 6$ at $x = 1$, $x = 2$, and $x = 3$


2.3 Newton’s method

In an earlier section we discussed the secant method for solving polynomial equations. We can now use techniques of calculus, and it turns out that using tangent lines instead of secant lines gives an algorithm that finds a solution more quickly.

We are now ready to discuss Newton’s method for finding solutions to polynomial equations. We will illustrate this algorithm with an example. The equation

$$3x^4 - 5x^3 + 2x - 6 = 0$$

has a solution between $x = 1$ and $x = 2$. Using trial and error this can be narrowed to between $x = 1.8$ and $x = 1.9$, since $f(1.8) = -.0672$ and $f(1.9) = 2.6013$, for $f(x) = 3x^4 - 5x^3 + 2x - 6$.

With an approximate solution of $x = 1.8$, instead of continuing to work with the function $f(x)$, we could work with its tangent line at $x = 1.8$, which should be close to $f(x)$. If we set the tangent line equal to zero, we may be able to find a better approximation to the solution of the equation $f(x) = 0$. Since $f'(x) = 12x^3 - 15x^2 + 2$ and $f'(1.8) = 23.384$, the equation of the tangent line at $x = 1.8$ is

$$y = 23.384(x - 1.8) - .0672 .$$

Setting $y = 0$ and solving for $x$, we obtain

$$x = 1.8 + \frac{.0672}{23.384} = 1.8028738 .$$

This is a better approximation, since $f(1.8028738) = .00026000$.

Our next step is to find the tangent line at $x = 1.8028738$, and then set $y = 0$ and solve for $x$ in the equation of the tangent line. But before continuing this process, it is easiest to derive a general formula. If $x = a$ is an approximate solution to the equation $f(x) = 0$, then the equation of the tangent line to $y = f(x)$ at $x = a$ is given by $y = f'(a)(x - a) + f(a)$. Setting $y = 0$ and solving for $x$ gives

$$x = a - \frac{f(a)}{f'(a)} .$$

You should note that if $x = a$ is an exact solution, then $f(a) = 0$ and so the next approximation is just $x = a$. In general, as we get closer and closer to a solution, we would expect $f'(a)$ to change very little, while $f(a)$ is approaching zero. Returning to the problem we were working on, as the next approximation we obtain

$$x = 1.8028738 - \frac{.00026}{23.564426} = 1.8028628 .$$

Substituting this value gives $f(1.8028628) = .00000079$.

We need to note several things. If the initial approximation is too far from the actual root, Newton’s algorithm may not give better and better approximations to it. The error in each term is proportional to the square of the error in the previous term. Thus once you have accuracy to a certain number of decimal places, the next approximation should be accurate to roughly double that number of decimal places. If two successive terms agree up to a certain number of decimal places, then the actual accuracy of the latest term should be roughly double the number of decimal places to which the two approximations agree.

2.3.1 THEOREM (Newton’s method). Let $c$ be a root of the polynomial $f(x)$. If $a_n$ is an approximate value for the root $c$, then the next approximation is given by the formula

$$a_{n+1} = a_n - \frac{f(a_n)}{f'(a_n)} .$$

If the error at the $n$th step is $\varepsilon_n$, then the error at the next step is approximately

$$\varepsilon_{n+1} = -\frac{f''(c)}{2f'(c)}\,\varepsilon_n^2 .$$

To implement Newton’s method on a calculator that is capable of storing functions, store the three formulas $f(x)$, $f'(x)$, and

$$g(x) = x - (f(x)) \div (f'(x)) .$$

If $a_1$ is an approximate solution to $f(x) = 0$, then the next approximation is given by $a_2 = g(a_1)$. It is easy to repeat this process until $g(a_n) = a_n$ for some $n$. Then you have reached the best approximation possible on your calculator. A word of caution is also necessary. There are functions for which Newton’s method does not converge, so it is important to use some common sense and keep track of what is happening to the approximations.
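The calculator procedure translates directly into a short program. The sketch below is an illustration only: the function name `newton` and the cap of 50 steps are assumptions that do not come from the notes, and the cap is there because floating-point iterates can occasionally alternate between two neighbouring values instead of settling on one.

```python
def newton(f, df, a, max_steps=50):
    """Repeat a -> a - f(a)/f'(a) until the approximation stops changing."""
    for _ in range(max_steps):
        slope = df(a)
        if slope == 0:
            # A horizontal tangent line never crosses the x-axis.
            raise ZeroDivisionError("f'(a) = 0; choose a different starting value")
        next_a = a - f(a) / slope
        if next_a == a:
            break
        a = next_a
    return a


def f(x):
    return 3 * x**4 - 5 * x**3 + 2 * x - 6


def df(x):
    return 12 * x**3 - 15 * x**2 + 2


root = newton(f, df, 1.8)
print(root, f(root))
```

Printing each intermediate value of `a` inside the loop is a simple way to follow the advice about keeping track of what is happening to the approximations.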

Newton’s method can be used to find $n$th roots. We first give an intuitive approach, extending the justification we gave in an earlier section for finding square roots.

Example 2.3.1. To approximate $\sqrt[5]{2}$ we use the following procedure. If we have an approximate value $a$, then

$$a \cdot a \cdot a \cdot a \cdot \frac{2}{a^4} = 2$$

and so we can average the five numbers $a, a, a, a, 2/a^4$.

| | Step 1 | Step 2 | Step 3 | Step 4 |
|---|---|---|---|---|
| $a$ | 1.2 | 1.1529012 | 1.1487289 | 1.1486984 |
| $2/a^4$ | 0.96450617 | 1.1320396 | 1.1485762 | 1.1486982 |
| $\frac{1}{5}(4a + \frac{2}{a^4})$ | 1.1529012 | 1.1487289 | 1.1486984 | 1.1486983 |

We can now show the connection between Newton’s method for solving polynomial equations and what we called Newton’s method for finding square roots. We can give a formal justification for the intuitive approach we took in the preceding example.

Finding $\sqrt[n]{c}$ is the same as solving the equation $x^n = c$, or $x^n - c = 0$. If we let $f(x) = x^n - c$, then $f'(x) = nx^{n-1}$. If $a$ is an approximate value for $\sqrt[n]{c}$, then the next approximation given by Newton’s method is

$$a - \frac{a^n - c}{na^{n-1}} = \frac{na^n - a^n + c}{na^{n-1}} = \frac{(n-1)a^n + c}{na^{n-1}} = \frac{1}{n}\left((n-1)a + \frac{c}{a^{n-1}}\right) .$$

This is precisely the formula that we found earlier, by an intuitive argument: we average the $n$ values $\{a, a, a, \ldots, a, c/a^{n-1}\}$.
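The averaging formula is easy to iterate by machine. This is a small sketch under stated assumptions: the name `nth_root` and the default of six steps are illustrative choices, and the starting value must be a nonzero approximation to the root.

```python
def nth_root(c, n, a, steps=6):
    """Approximate the n-th root of c by averaging a, ..., a (n - 1 copies) and c/a^(n-1)."""
    for _ in range(steps):
        a = ((n - 1) * a + c / a ** (n - 1)) / n
    return a


# One step from a = 1.2 reproduces the Step 1 average for the fifth root of 2.
print(nth_root(2, 5, 1.2, steps=1))
# A few more steps settle on the fifth root of 2.
print(nth_root(2, 5, 1.2))
```

Because each step roughly doubles the number of correct decimal places, a handful of steps is normally enough once the starting value is reasonably close.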


1. Use Newton’s method to approximate $\sqrt[3]{12}$ to six decimal places.

2. Use Newton’s method to approximate $\sqrt[9]{217.4}$ to six decimal places.

3. Use Newton’s method to find the root of the equation $4x^3 - x - 2 = 0$ that lies between $x = 0$ and $x = 1$. Show that your answer is accurate to at least four decimal places.

4. The equation $x^5 - x^2 - 6 = 0$ has a solution between $x = 1$ and $x = 2$. Use Newton’s method to find the solution, with accuracy to six decimal places.

5. The equation $x^8 - x^4 + 2x + 1 = 0$ has a solution between $x = -2$ and $x = 0$. Use Newton’s method to find the solution, with accuracy to four decimal places.


2.4 Higher derivatives

If $f(x)$ is a polynomial function, then we can construct the tangent line at the point $(a, f(a))$ by using the formula

$$y = f(a) + f'(a)(x - a) .$$

To do this we only need two pieces of information about the function $f(x)$, the value $f(a)$ and the value of the derivative $f'(a)$. If we regard the tangent line as a “local linear approximation”, good for approximating $f(x)$ for values of $x$ close to $a$, then we see that as soon as we know $f(a)$ and $f'(a)$, we are able to obtain information about nearby values of $f(x)$.

In this section we will extend the ideas behind the construction of a tangent line to include “local” approximations by polynomials of higher degrees. To approximate $f(x)$ close to $x = a$ we will need only information about the value of $f(x)$ and its derivatives, all evaluated at $x = a$.

2.4.1 DEFINITION. Let $f(x)$ be a polynomial function. We define the higher derivatives of $f(x)$ as follows: the second derivative of $f(x)$ is $f''(x) = D_x(f'(x))$; the third derivative is $f'''(x) = D_x(f''(x))$; the $n$th derivative is $f^{(n)}(x) = D_x(f^{(n-1)}(x))$.

You should note that if $f(x)$ has degree $n$, then the $n$th derivative $f^{(n)}(x)$ is just a constant (namely $n!$ times the leading coefficient of $f(x)$), and so all derivatives of order higher than $n$ must be zero.

Example 2.4.1. In this example we will find the quadratic polynomial $p(x) = ax^2 + bx + c$ which has the property that $p(1) = 0$, $p'(1) = 4$, and $p''(1) = 2$.

Substituting $x = 1$ gives the equation $a + b + c = 0$, and then differentiating and substituting $x = 1$ gives $2a + b = 4$. Finally, differentiating again gives the equation $2a = 2$. This allows us to solve the system, getting $a = 1$, $b = 2$, and $c = -3$. Thus $p(x) = x^2 + 2x - 3$, the polynomial in Example 2.1.1.

In Example 2.1.2 we rewrote $p(x)$ in terms of powers of $x - 1$ as $p(x) = (x - 1)^2 + 4(x - 1)$ in order to determine the equation $y = 4(x - 1)$ of the tangent line at the point $(1, 0)$. Given the information we had above at $x = 1$, it would have been easier for us to solve for the coefficients $a, b, c$ in the polynomial $p(x) = a(x - 1)^2 + b(x - 1) + c$. In this form, you can check that we just have $p(1) = c$, $p'(1) = b$, and $p''(1) = 2a$.


We have already seen that it is often advantageous to write polynomials in terms of powers of $x - a$. This is particularly the case when we are using information obtained at the point $x = a$. The next proposition shows how to find a polynomial of degree $n$ which approximates a polynomial function $f(x)$ close to $x = a$. It uses the value of the function and its derivatives at $x = a$, and so it is easiest to express the polynomial in terms of powers of $x - a$.

2.4.2 PROPOSITION. Let $f(x)$ be any polynomial. Then there is a unique polynomial

$$p(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2}(x - a)^2 + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n$$

of degree $n$ such that $p(a) = f(a)$ and $p^{(k)}(a) = f^{(k)}(a)$ for $1 \le k \le n$.

Proof. Let $p(x) = c_0 + c_1(x - a) + \cdots + c_n(x - a)^n$ be a polynomial of degree $n$, written in terms of powers of $x - a$. Substituting $x = a$ gives $p(a) = c_0$. We have

$$p'(x) = c_1 + 2c_2(x - a) + \cdots + nc_n(x - a)^{n-1} ,$$

and so $p'(a) = c_1$. Next we have

$$p''(x) = 2c_2 + 6c_3(x - a) + \cdots + n(n-1)c_n(x - a)^{n-2} ,$$

and so substituting $x = a$ shows that $p''(a) = 2c_2$. In general, differentiating repeatedly and substituting $x = a$ gives us the constant term of the derivative since any terms with a factor of $(x - a)^i$ will be zero. The formula for the $k$th term is $p^{(k)}(a) = k!\,c_k$. This shows that the only way to have $p^{(k)}(a) = f^{(k)}(a)$ is to have

$$c_i = \frac{f^{(i)}(a)}{i!} .$$

2.4.3 DEFINITION. The polynomial in the previous proposition is called the $n$th degree Taylor polynomial of $f(x)$ at $x = a$.
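The coefficient formula $c_i = f^{(i)}(a)/i!$ can be turned into a short routine that rewrites a polynomial in powers of $x - a$. The following is a sketch only: it assumes integer (or rational) coefficients stored as `[b_0, b_1, ..., b_n]`, and the names `evaluate`, `derivative`, and `taylor_coefficients` are hypothetical.

```python
from fractions import Fraction
from math import factorial


def evaluate(coeffs, x):
    """Evaluate the polynomial with coefficient list [b_0, ..., b_n] at x."""
    result = 0
    for b in reversed(coeffs):
        result = result * x + b
    return result


def derivative(coeffs):
    """Term-by-term derivative of the coefficient list."""
    return [i * b for i, b in enumerate(coeffs)][1:]


def taylor_coefficients(coeffs, a):
    """Return [c_0, ..., c_n] with c_k = f^(k)(a) / k!, the coefficients of (x - a)^k."""
    result = []
    current = list(coeffs)
    k = 0
    while current:
        result.append(Fraction(evaluate(current, a), factorial(k)))
        current = derivative(current)
        k += 1
    return result


# x^2 + 2x - 3 about x = 1 gives 0 + 4(x - 1) + 1(x - 1)^2, as in Example 2.4.1.
print(taylor_coefficients([-3, 2, 1], 1))
```

Exact fractions are used so that the division by $k!$ introduces no rounding; for a polynomial of degree $n$ the loop stops after $n + 1$ coefficients because the next derivative is zero.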


Use the techniques of this section to work the following problems from Section 2.2.1.

1. Write $2x^2 - 3x + 1$ in terms of powers of $x - 1$.

2. Write $(x - 3)^2$ in terms of powers of $x - 1$.

3. Write $y = x^2 - 3x$ in terms of powers of $x + 1$.

4. Write $x^3 - x^2 - 5x + 6$ in terms of powers of $x - 3$.

5. Write $x^3$ in terms of powers of $x + 1$ and also in terms of powers of $x - 1$.

6. Write $x^4 - 8x^3 + 21x^2 - 17x + 2$ in terms of powers of $x - 2$ and find the equation of the line tangent to $y = x^4 - 8x^3 + 21x^2 - 17x + 2$ at $x = 2$.


2.5 Averages

If an object travels in a straight line with a constant velocity, then the distance it travels during a given interval of time can be found by just multiplying the velocity by the length of time. But how could we find the distance if the velocity is not constant, but instead is specified by a formula giving velocity as a function of time?

If the rate of growth of a population is constant, then the total growth during a given interval of time can be found by just multiplying the rate of growth by the length of time. But what if the rate of growth of the population changes with time? This is certainly the case when a population grows exponentially.

The area of a rectangle is just its height multiplied by its width. But what about areas of regions in which the height is not constant? Can a method be developed for finding such areas?

In the first problem we want to find the total distance when the velocity is not a constant. If we could find an average velocity, then we could just multiply the average velocity by the length of time. The second problem, involving a variable rate of growth, can also be solved if we can find an average rate of growth.

Let us consider the third problem. Suppose that we have a region in the plane that is almost a rectangle, except that the top may be curved, and the curve is described by a function $f(x)$. If we could find an average height, then the area of the region would be given by multiplying the width of the region by its average height. This provides another motivation for developing a method to find the average of a function over a particular interval.

For a straight line such as $f(x) = 2x + 1$, the average height from $x = a$ to $x = b$ is just the average of $f(a)$ and $f(b)$, given by $(f(a) + f(b))/2$. The average from $x = a$ to $x = b$ of a quadratic function such as $f(x) = x^2 - 3x + 5$ is much more difficult to find.

There is one case in which averages should be easy to describe. Imagine driving from Chicago to Florida, and having some way of taking an average of all of the speedometer readings (at each instant). This averaging process surely should just give you your average speed. Otherwise it would not seem to be a reasonable way to find averages.

To relate this to our mathematical discussion, suppose that the function $f(x)$ happens to be the derivative of the function $F(x)$. Then $f(x)$ represents the instantaneous rate of growth of $F(x)$ at the point $x$. Averaging the instantaneous rates of growth over the interval from $x = a$ to $x = b$ should simply give us the average rate of growth of $F(x)$. This average rate of growth is simply the change in $F(x)$ divided by the change in $x$, or

$$\frac{F(b) - F(a)}{b - a} .$$

Now we need to introduce a notation for the average, and we will simply use the above discussion to motivate our definition of the average of a function.

2.5.1 DEFINITION. Let $f(x)$ be any polynomial. If $F(x)$ is any polynomial such that $F'(x) = f(x)$, then we say that $F(x)$ is an antiderivative of $f(x)$.

2.5.2 THEOREM. Let $f(x) = c_n x^n + c_{n-1}x^{n-1} + \cdots + c_1 x + c_0$. Then the general antiderivative of $f(x)$ is given by

$$F(x) = c_n \frac{x^{n+1}}{n+1} + c_{n-1}\frac{x^n}{n} + \cdots + c_1 \frac{x^2}{2} + c_0 x + C .$$

Proof. It is clear from our formulas for derivatives that $F'(x) = f(x)$. It is worth noting that since the derivative of any constant is zero, the constant term $C$ can be any real number, and so antiderivatives of polynomials are not unique, but can differ by a constant.

2.5.3 DEFINITION. Let $f(x)$ be any polynomial, and let $F(x)$ be any antiderivative of $f(x)$. Then the average of $f(x)$ from $x = a$ to $x = b$ is defined to be

$$A_a^b[f(x)] = \frac{F(b) - F(a)}{b - a} .$$

We have already noted that antiderivatives of polynomials are not unique, but can differ by a constant. We should also note that the definition of the average is independent of the choice of an antiderivative, because if $F_1(x)$ and $F_2(x)$ are any two antiderivatives of $f(x)$, then $F_1(b) - F_1(a) = F_2(b) - F_2(a)$ since any constant terms in either $F_1(x)$ or $F_2(x)$ simply cancel out in the computation. In finding an average of a polynomial, it is easiest to just use the constant $C = 0$ in the formula for the antiderivative.


Example 2.5.1. For the function $x^n$, we have the antiderivative $x^{n+1}/(n+1)$, which gives us the following interesting computation.

$$A_0^1[x^n] = \frac{\dfrac{1^{n+1}}{n+1} - \dfrac{0^{n+1}}{n+1}}{1 - 0} = \frac{1}{n+1} .$$

Example 2.5.2. We should check that the definition we have given for an average gives us what we expect in the case of a linear function. If $f(x) = c_1 x + c_0$, then we can use the antiderivative $F(x) = c_1\frac{x^2}{2} + c_0 x$. Thus we have

$$A_a^b[c_1 x + c_0] = \frac{\dfrac{c_1 b^2}{2} + c_0 b - \dfrac{c_1 a^2}{2} - c_0 a}{b - a} = \frac{c_1 b^2 - c_1 a^2}{2(b - a)} + \frac{c_0 b - c_0 a}{b - a} = \frac{c_1(b + a)}{2} + c_0 = \frac{1}{2}\big((c_1 b + c_0) + (c_1 a + c_0)\big) ,$$

which shows that

$$A_a^b[f(x)] = \frac{f(a) + f(b)}{2} ,$$

and so this method does give us the value we should get.
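Definition 2.5.3 and the two examples can also be checked with a few lines of code. The sketch below makes several assumptions that are not in the notes: polynomials are stored as coefficient lists `[c_0, c_1, ..., c_n]` with integer entries, the constant $C = 0$ is used, and the names `antiderivative`, `evaluate`, and `average` are hypothetical.

```python
from fractions import Fraction


def antiderivative(coeffs):
    """Antiderivative with C = 0: the term c_k x^k becomes c_k x^(k+1) / (k+1)."""
    return [Fraction(0)] + [Fraction(c, k + 1) for k, c in enumerate(coeffs)]


def evaluate(coeffs, x):
    """Evaluate the polynomial with coefficient list [c_0, ..., c_n] at x."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def average(coeffs, a, b):
    """Average of f(x) from x = a to x = b, computed as (F(b) - F(a)) / (b - a)."""
    F = antiderivative(coeffs)
    return (evaluate(F, b) - evaluate(F, a)) / (b - a)


# Example 2.5.1 with n = 3: the average of x^3 from 0 to 1 is 1/4.
print(average([0, 0, 0, 1], 0, 1))
# Example 2.5.2 with f(x) = 2x + 1 from 1 to 3: (f(1) + f(3)) / 2 = 5.
print(average([1, 2], 1, 3))
```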


This computation in a relatively simple case points up the need for some theorems to make the computations easier. The average of a sum is the sum of the averages, and the average of a constant times a function is just the constant times the average of the function. These results really come from similar ones for derivatives.

2.5.4 THEOREM. Let $f(x)$ and $g(x)$ be polynomials, and let $c$ be any real number.

(a) $A_a^b[c f(x)] = c \cdot A_a^b[f(x)]$

(b) $A_a^b[f(x) + g(x)] = A_a^b[f(x)] + A_a^b[g(x)]$

Proof. Let $F(x)$ and $G(x)$ be antiderivatives of $f(x)$ and $g(x)$, respectively. Then $cF(x)$ is an antiderivative of $c f(x)$, and $F(x) + G(x)$ is an antiderivative of $f(x) + g(x)$, by our theorems on derivatives. Using these antiderivatives to compute the averages in question shows that the formulas in the theorem are correct.



1. (a) $A_1^3[x^2 - 2x] =$

(b) $A_{-1}^5[x^4 - 3x^2 - 2x] =$

2. (a) Use the answer in the previous exercise to find the area bounded by the curve $y = x^2 - 2x$, the $x$-axis, and the lines $x = 1$ and $x = 2$.

(b) Do the same to find the area bounded by the curve $y = x^4 - 3x^2 - 2x$, the $x$-axis, and the lines $x = -1$ and $x = 5$.

3. (a) $A_{-2}^2[x^5 - 2x] =$

(b) $A_{-a}^a[x^{17} - 5x^{11} + 3x^5 - 2x] =$

4. (a) $A_a^b[x^2] =$

(b) $A_a^b[x^3] =$

5. In physics, the work done by a constant force acting in a straight line is defined to be the product of the force and the distance through which it acts. When a spring is compressed, the force it exerts is proportional to the amount of compression. Now suppose that a spring has a constant of 16, so that a formula for the force is given by $f(x) = 16x$. Find the work done when the spring is compressed from 1 ft to .75 ft.

Hint: Use the average of $f(x)$ from $x = 0$ to $x = .25$.

Chapter 3

Approximation Techniques

In everyday life we are constantly faced with the problem of approximating certain numbers by means of others. For example, our measurements of length, area, temperature, and so on, lead us to numbers that are only approximations. In practice we use only rational numbers, but irrational numbers also exist, and though they are not used in measuring, our theoretical arguments often lead to them. The calculation of the circumference of a circle in terms of its radius involves the irrational number π, and the calculation of the diagonal of a square involves the irrational number $\sqrt{2}$. We use 22/7, 3.14, 3.14159, etc., depending on the accuracy that we need in the value for π.

The important fact is that no matter what degree of accuracy we need, we can calculate a rational number that approximates π with better than the desired accuracy. We can express this fact by saying that we can construct a sequence of rational numbers that converges to π. The formal definition of convergence will involve the idea of being able to obtain better and better approximations, to any desired degree of accuracy.

The same situation also occurs for functions. The quantitative laws of nature are expressed mathematically by functions, and sometimes these expressions are very difficult to use in calculations. It is often necessary to approximate the functions by other functions whose values for specific numbers can be computed much more easily. Sometimes no attempt is made to approximate the function over all of its domain, but just close to one particular number, and this can be done by making use of a single function. Other times an approximation over larger regions is desired, and the function must be approximated by an infinite sequence of functions.

The functions typically used to approximate complicated functions are linear functions, polynomial functions, and trigonometric functions. Polynomial functions are particularly important, since computing the values of a polynomial function involves only a finite number of multiplications and additions. This simplicity of computation is necessary when making use of computers, since even though they work very rapidly, they can only perform relatively simple operations. A machine cannot compute the log of a number exactly, but we can approximate the log function by a polynomial, with any required degree of accuracy, and then the machine can compute the values of the polynomial.

Continuous functions play an important role in both types of approximation.We will say that a function is continuous if it carries numbers that are close to eachother into numbers that are again close to each other. This gives them the propertyof preserving approximations of numbers by sequences of numbers, in the sensethat if a sequence converges to a number, then applying the function to each numberin the sequence gives a new sequence that converges to the value of the function atthe given number.

Continuous functions can be very general, but they can be approximated to any desired degree of accuracy by polynomial functions. This is of practical importance, since many physical situations can be described by continuous functions, and then the values of these functions can be computed by computers, via the approximating polynomials.


3.1 Sequences

In Section 2.3 of Chapter II, we saw how to approximate roots of polynomials by using Newton’s method. In this section we will look in more detail at what it means to have a process (or algorithm) to approximate a number.

To begin with, if we want to approximate $A$, then we must have some way of constructing successive approximations $a_1, a_2, a_3, \ldots$ to the number $A$. When using Newton’s method, each approximation depends in a fairly simple way on the previous one. (Recall that $a_{n+1} = a_n - f(a_n)/f'(a_n)$.) Any similar approximation technique should yield an approximate value $a_n$ close to $A$, for each positive integer $n$.

To be effective, an algorithm needs some way of estimating the error in each approximation. For Taylor polynomials we were able to do this by estimating the maximum value of the next higher derivative over a certain interval. When working with Newton’s method we took the naive approach of simply computing approximations until successive values agreed to the necessary number of decimal places.

3.1.1 DEFINITION. We say that the sequence $\{a_n\}$ has the limit $A$, written

$$\lim_{n\to\infty} a_n = A ,$$

if for each $\varepsilon > 0$ there exists an integer $N$ such that $|a_n - A| < \varepsilon$ for all $n \ge N$. We also say that $\{a_n\}$ converges to the real number $A$.

In this definition you can think of the sequence $\{a_n\}$ as a method for approximating the number $A$. The approximation method actually works precisely when the sequence converges to $A$. In approximating a number we usually first specify the allowable error, and in the definition, this is represented by $\varepsilon$. If the sequence does in fact converge to $A$, then no matter how small the specified error, it is always possible to guarantee that from some point on the approximate values are closer to $A$ than the specified error.

Example 3.1.1. To illustrate the use of the definition, we will check that $\lim_{n\to\infty} \frac{1}{n} = 0$.

We must show that for any $\varepsilon > 0$ there exists a value of $N$ such that $|\frac{1}{n} - 0| < \varepsilon$ for all $n \ge N$. This is equivalent to showing that $1 < N\varepsilon$ for some $N$, and to do so we only need to observe that for any positive real number, no matter how small, adding it to itself enough times will give a result greater than 1. (Formally, this follows from the Archimedean property of the real number system.)

Example 3.1.2. In some cases a sequence may be defined by a formula, even though the limit of the sequence is initially unknown. For example, we might have a sequence defined by the general formula

$$a_n = \frac{n^2 - n}{3n^2 + 1} .$$

The first five terms are $a_1 = 0$, $a_2 = 2/13$, $a_3 = 3/14$, $a_4 = 12/49$, and $a_5 = 5/19$. To get a better idea of the limit of the terms, we can rewrite the formula by dividing numerator and denominator by the highest power of $n$ in the denominator, giving

$$a_n = \frac{1 - 1/n}{3 + 1/n^2} .$$

Now it is intuitively obvious that as the value of $n$ increases, the terms in the sequence will get closer and closer to $1/3$. Thus we have $\lim_{n\to\infty} a_n = \frac{1}{3}$.
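The terms of this sequence can be generated exactly, which makes the drift toward $1/3$ visible. This is only a numerical illustration, not a proof of convergence; the function name `a` mirrors the notation $a_n$ and is otherwise an arbitrary choice.

```python
from fractions import Fraction


def a(n):
    """The n-th term (n^2 - n) / (3n^2 + 1), kept as an exact fraction."""
    return Fraction(n * n - n, 3 * n * n + 1)


first_five = [a(n) for n in range(1, 6)]
print(first_five)

# The distance from 1/3 shrinks as n grows.
for n in (10, 100, 1000):
    print(n, float(Fraction(1, 3) - a(n)))
```

Using `Fraction` reduces each term to lowest terms automatically, so the printed values can be compared directly with the five terms listed in the example.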

In the previous example, after rewriting the formula for the general term of the sequence we intuitively used several facts about limits of sequences. We needed to know that the limit of a sum (or quotient) is the sum (or quotient) of the limits. We also needed to know that $\lim_{n\to\infty} \frac{1}{n} = 0$ and that $\lim_{n\to\infty} \frac{1}{n^2} = 0$. We will now prove the results we need.

3.1.2 THEOREM. Let $\{a_n\}$ and $\{b_n\}$ be sequences with $\lim_{n\to\infty} a_n = A$ and $\lim_{n\to\infty} b_n = B$. Then the following results are true.

(a) $\lim_{n\to\infty} c\,a_n = cA$ for any real number $c$;

(b) $\lim_{n\to\infty} (a_n + b_n) = A + B$;

(c) $\lim_{n\to\infty} a_n b_n = AB$;

(d) $\lim_{n\to\infty} \frac{a_n}{b_n} = \frac{A}{B}$ provided $B \ne 0$ and $b_n \ne 0$ for all $n$.

Proof. (a) The result obviously holds if $c = 0$, and so it does not hurt to assume that $c \ne 0$. Given $\varepsilon > 0$, since $\lim_{n\to\infty} a_n = A$, there exists $N$ such that $|a_n - A| < \frac{\varepsilon}{|c|}$ for all $n \ge N$. Multiplying both sides of the inequality by $|c|$ shows that $|c\,a_n - cA| < \varepsilon$ for all $n \ge N$.

(b) Given $\varepsilon > 0$, we must find $N$ such that $|(a_n + b_n) - (A + B)| < \varepsilon$ for all $n \ge N$. We can make use of the inequality

$$|(a_n + b_n) - (A + B)| = |(a_n - A) + (b_n - B)| \le |a_n - A| + |b_n - B| .$$

There exists $N_1$ such that $|a_n - A| < \varepsilon/2$ for all $n \ge N_1$, and similarly there exists $N_2$ such that $|b_n - B| < \varepsilon/2$ for all $n \ge N_2$. If we let $N$ be the larger of the two values $N_1$ and $N_2$, then for all $n \ge N$ we have

$$|(a_n + b_n) - (A + B)| \le |a_n - A| + |b_n - B| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon .$$

(c) We must show that for each $\varepsilon > 0$ there exists $N$ such that $|a_n b_n - AB| < \varepsilon$ for all $n \ge N$. We have to relate the term $|a_n b_n - AB|$ to the quantities $|a_n - A|$ and $|b_n - B|$, over which we have some control, since $\lim_{n\to\infty} a_n = A$ and $\lim_{n\to\infty} b_n = B$. We can write

$$|a_n b_n - AB| = |(a_n b_n - A b_n) + (A b_n - AB)| \le |b_n||a_n - A| + |A||b_n - B| .$$

Using the basic ideas in parts (a) and (b), we can first find an integer $N_1$ such that $|b_n - B| < \frac{\varepsilon}{2|A|}$, since the sequence $\{b_n\}$ converges to $B$. (If $A = 0$ we can simply ignore the second term above.) The first part is not quite so simple, since the value $|b_n|$ that we have factored out changes as $n$ increases, and so we need an estimate that remains constant. Since $\{b_n\}$ converges to $B$, we can find an integer $N_2$ for which $|b_n| < |B| + 1$, since for large enough values of $n$ we must have $|b_n - B| < 1$. Finally, since $\{a_n\}$ converges to $A$ there exists $N_3$ such that $|a_n - A| < \frac{\varepsilon}{2(|B| + 1)}$. If we let $N$ be the largest of $N_1$, $N_2$, and $N_3$, then for $n \ge N$ we have

$$|a_n b_n - AB| \le |b_n||a_n - A| + |A||b_n - B| < (|B| + 1)\frac{\varepsilon}{2|B| + 2} + |A|\frac{\varepsilon}{2|A|} ,$$

which proves that $|a_n b_n - AB| < \varepsilon$.

(d) If we can show that $\{\frac{1}{b_n}\}$ converges to $\frac{1}{B}$, then the conclusion we want follows from part (c). To do so we can use the function $\frac{1}{x}$ in the next theorem, whose proof is left as an exercise.

3.1.3 DEFINITION. We say that the function $f(x)$ is continuous at $x = A$ if for each sequence $\{a_n\}$ that converges to $A$, the sequence $\{f(a_n)\}$ converges to $f(A)$.

The most important remark to make is that all polynomial functions are continuous for all values $A$. As another example, it is also true that the function $f(x) = \sqrt[n]{x}$ is continuous. That makes it possible to find the $n$th root of an irrational number by approximating it by rational numbers, and then taking a sequence of approximations to the $n$th roots of each of the rational numbers.


Example 3.1.3. If $|r| < 1$ then $\lim_{n\to\infty} r^n = 0$. To show this, suppose that $\varepsilon > 0$ is given. It is sufficient to prove the result for $0 < r < 1$, and then since $r < 1$ we must have $1/r > 1$, say $1/r = 1 + a$ for some positive real number $a$. Using the binomial theorem we have $(1 + a)^n = 1 + na + \cdots \ge 1 + na$. Since $a > 0$, by adding $a$ to itself we can construct a real number as large as we want. Thus there exists $N$ such that $1 + Na > 1/\varepsilon$. But then for all $n \ge N$ we have

$$r^n = \frac{1}{(1 + a)^n} \le \frac{1}{1 + na} \le \frac{1}{1 + Na} < \varepsilon .$$


1. Find $\lim_{n\to\infty} a_n$ if $\{a_n\}$ is defined by the formula $a_n = \dfrac{1 - n^2}{2 + 3n^2}$.


3.2 Approximating averages

[This section will use the summation formulas developed in the section on induction to find the average of a function by “sampling” at a finite number of values and then averaging these. It will show that we get the same answer for the average of a quadratic function as we did using antiderivatives. So we essentially have the Fundamental Theorem of Calculus for quadratic functions.]



3.3 Approximating areas

In this section we will use Simpson’s method for numerical integration. This algorithm will approximate a function section by section by polynomials of degree 2. Then the integral of each of the approximating polynomials can be found, and adding together these integrals gives an approximation to the integral we really want. We will see that if the approximating polynomials are chosen carefully, then there are very simple formulas for their integrals.

To approximate $\int_a^b f(x)\,dx$ we begin by choosing an even number of subdivisions of the interval $[a, b]$. We do this because three points determine a unique parabola, and we choose the approximating parabola on the subdivisions $[x_{i-1}, x_i]$ and $[x_i, x_{i+1}]$ so that it passes through the points $(x_{i-1}, f(x_{i-1}))$, $(x_i, f(x_i))$, and $(x_{i+1}, f(x_{i+1}))$.

Now we will show how to find the polynomials that we need, and how to find their integrals. Assume that we are given a continuous function $f(x)$ and distinct points $x_0, x_1, x_2$. (This simplifies the subscripts, and still gives a completely general formula.) We need a formula for the parabola that passes through the points $(x_0, f(x_0))$, $(x_1, f(x_1))$, and $(x_2, f(x_2))$. We will let $f(x_0) = y_0$, $f(x_1) = y_1$, and $f(x_2) = y_2$. To simplify the computations we assume that $x_1 - x_0 = h$ and $x_2 - x_1 = h$, so we can work with the points $x_1 - h$, $x_1$, and $x_1 + h$. We will use the divided differences method.

3.3.1 LEMMA. Let $x_0$, $x_1$, and $x_2$ be real numbers, with $x_0 = x_1 - h$ and $x_2 = x_1 + h$. Let $f(x)$ be the polynomial of degree 2 with $f(x_0) = y_0$, $f(x_1) = y_1$, and $f(x_2) = y_2$. Then the formula for $f(x)$ is

$$f(x) = y_1 + \frac{y_2 - y_0}{2h}(x - x_1) + \frac{y_0 - 2y_1 + y_2}{2h^2}(x - x_1)^2 .$$

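Proof. Using the divided differences method for finding interpolating polynomials, we obtain the following table.

| $x$ | $y$ | first differences | second difference |
|---|---|---|---|
| $x_1 + h$ | $y_2$ | | |
| | | $\dfrac{y_2 - y_0}{2h}$ | |
| $x_1 - h$ | $y_0$ | | $\dfrac{y_0 - 2y_1 + y_2}{2h^2}$ |
| | | $\dfrac{y_0 - y_1}{-h}$ | |
| $x_1$ | $y_1$ | | |

For the computation of the last term we have

$$\left(\frac{y_2 - y_0}{2h} - \frac{y_0 - y_1}{-h}\right)\left(\frac{1}{h}\right) = \frac{y_2 - y_0}{2h^2} + \frac{2y_0 - 2y_1}{2h^2} = \frac{y_0 - 2y_1 + y_2}{2h^2} .$$

The polynomial that we are looking for is

$$\begin{aligned}
f(x) &= y_1 + \frac{y_1 - y_0}{h}(x - x_1) + \frac{y_0 - 2y_1 + y_2}{2h^2}(x - x_1)(x - x_1 + h) \\
&= y_1 + \frac{y_1 - y_0}{h}(x - x_1) + \frac{y_0 - 2y_1 + y_2}{2h^2}(x - x_1)(h) + \frac{y_0 - 2y_1 + y_2}{2h^2}(x - x_1)^2 \\
&= y_1 + \frac{2y_1 - 2y_0}{2h}(x - x_1) + \frac{y_0 - 2y_1 + y_2}{2h}(x - x_1) + \frac{y_0 - 2y_1 + y_2}{2h^2}(x - x_1)^2 \\
&= y_1 + \frac{y_2 - y_0}{2h}(x - x_1) + \frac{y_0 - 2y_1 + y_2}{2h^2}(x - x_1)^2 .
\end{aligned}$$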

We now check to make sure that $f(x)$ is the polynomial that we want. It is clear that $f(x_1) = y_1$. Since $x_0 - x_1 = -h$, we have

$$f(x_0) = y_1 + \frac{y_2 - y_0}{2h}(-h) + \frac{y_0 - 2y_1 + y_2}{2h^2}(-h)^2 = y_1 - \frac{y_2 - y_0}{2} + \frac{y_0 - 2y_1 + y_2}{2} = y_0 .$$

Since $x_2 - x_1 = h$, we have

$$f(x_2) = y_1 + \frac{y_2 - y_0}{2h}(h) + \frac{y_0 - 2y_1 + y_2}{2h^2}(h)^2 = y_1 + \frac{y_2 - y_0}{2} + \frac{y_0 - 2y_1 + y_2}{2} = y_2 .$$


The next step is to find the integral of $f(x)$ from $x_0$ to $x_2$. In the proof we will use the fact that if $g(u)$ is an even function, then $\int_{-h}^{h} g(u)\,du = 2\int_0^h g(u)\,du$, and if $g(u)$ is an odd function, then $\int_{-h}^{h} g(u)\,du = 0$.


3.3.2 PROPOSITION. Let $x_0$, $x_1$, and $x_2$ be real numbers, with $x_0 = x_1 - h$ and $x_2 = x_1 + h$. Let $f(x)$ be the polynomial of degree 2 with $f(x_0) = y_0$, $f(x_1) = y_1$, and $f(x_2) = y_2$. Then

$$\int_{x_0}^{x_2} f(x)\,dx = \frac{h}{3}(y_0 + 4y_1 + y_2) .$$

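Proof. In the following integration we make the substitution $u = x - x_1$, so the new limits are $x_0 - x_1 = -h$ and $x_2 - x_1 = h$.

$$\begin{aligned}
\int_{x_0}^{x_2} f(x)\,dx &= \int_{x_0}^{x_2} \left(y_1 + \frac{y_2 - y_0}{2h}(x - x_1) + \frac{y_0 - 2y_1 + y_2}{2h^2}(x - x_1)^2\right) dx \\
&= \int_{-h}^{h} \left(y_1 + \frac{y_2 - y_0}{2h}u + \frac{y_0 - 2y_1 + y_2}{2h^2}u^2\right) du \\
&= \int_{-h}^{h} y_1\,du + \int_{-h}^{h} \frac{y_2 - y_0}{2h}u\,du + \int_{-h}^{h} \frac{y_0 - 2y_1 + y_2}{2h^2}u^2\,du \\
&= 2\int_0^h y_1\,du + 2\int_0^h \frac{y_0 - 2y_1 + y_2}{2h^2}u^2\,du \\
&= 2y_1 h + \frac{2(y_0 - 2y_1 + y_2)}{2h^2}\cdot\frac{h^3}{3} \\
&= h\left(2y_1 + \frac{y_0 - 2y_1 + y_2}{3}\right) = \frac{h}{3}(y_0 + 4y_1 + y_2) .
\end{aligned}$$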


The formula

$$f(x) = y_1 + \frac{y_2 - y_0}{2h}(x - x_1) + \frac{y_0 - 2y_1 + y_2}{2h^2}(x - x_1)^2$$

gives the expansion of $f(x)$ in powers of $(x - x_1)$, with the coefficients expressed in terms of $y_0$, $y_1$, and $y_2$. It is interesting to note that we could express any polynomial of degree 3 that passes through the three points $(x_0, f(x_0))$, $(x_1, f(x_1))$, and $(x_2, f(x_2))$ by simply adding one more term of the form $k(x - x_1)\big((x - x_1)^2 - h^2\big)$, which is zero at $x_0$, $x_1$, and $x_2$. In the integration in the preceding proposition, this would give an odd function of $u = x - x_1$, with an integral that would be zero. Thus the formula we developed for the integral of a quadratic function would also work for any cubic function.


3.3.3 COROLLARY. Let $x_0$, $x_1$, and $x_2$ be real numbers, with $x_0 = x_1 - h$ and $x_2 = x_1 + h$. Let $f(x)$ be the polynomial of degree 2 with $f(x_0) = y_0$, $f(x_1) = y_1$, and $f(x_2) = y_2$. Then the average of $f(x)$ on the interval $[x_0, x_2]$ is

$$\frac{y_0 + 4y_1 + y_2}{6} .$$

Proof. To find the average of $f(x)$ on $[x_0, x_2]$ we only need to divide the integral of $f(x)$ by the length of the interval, which is $2h$.

3.3.4 THEOREM (Simpson’s Rule). Let $f(x)$ be a continuous function defined on the interval $[a,b]$, and let $n$ be an even integer. Then $\int_a^b f(x)\,dx$ is approximated by the sum

$$\frac{b-a}{3n}\bigl( f(x_0) + 4f(x_1) + 2f(x_2) + 4f(x_3) + \cdots + 2f(x_{n-2}) + 4f(x_{n-1}) + f(x_n) \bigr) .$$

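Proof. Let $h = (b-a)/n$ be the distance between successive points $x_i$. By Proposition 3.3.2, $\int_{x_0}^{x_2} f(x)\,dx$ is approximated by $\frac{h}{3}\bigl(f(x_0) + 4f(x_1) + f(x_2)\bigr)$, while $\int_{x_2}^{x_4} f(x)\,dx$ is approximated by $\frac{h}{3}\bigl(f(x_2) + 4f(x_3) + f(x_4)\bigr)$. Adding these sums gives $\frac{h}{3}\bigl(f(x_0) + 4f(x_1) + 2f(x_2) + 4f(x_3) + f(x_4)\bigr)$. Continuing in this way gives the approximation on the entire interval $[a,b]$, since $h/3 = (b-a)/(3n)$.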
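The sum in Theorem 3.3.4 translates directly into a short program. The sketch below assumes equally spaced points $x_i = a + i(b-a)/n$, which the theorem uses implicitly; the function name `simpson` and the two sample integrands are illustrative choices, not part of the notes.

```python
import math


def simpson(f, a, b, n):
    """Approximate the integral of f over [a, b] with n (even) subdivisions."""
    if n <= 0 or n % 2 != 0:
        raise ValueError("n must be a positive even integer")
    h = (b - a) / n                      # assumed spacing: x_i = a + i*h
    total = f(a) + f(b)                  # endpoints carry weight 1
    for i in range(1, n):
        weight = 4 if i % 2 == 1 else 2  # odd-indexed points 4, even-indexed 2
        total += weight * f(a + i * h)
    return (b - a) / (3 * n) * total


# A cubic is integrated exactly (up to rounding): the true value is 4.
print(simpson(lambda x: x ** 3, 0.0, 2.0, 10))
# A non-polynomial integrand: the true value is ln 2.
print(simpson(lambda x: 1 / (1 + x), 0.0, 1.0, 10), math.log(2))
```

The weights alternate 4, 2, 4, ..., 4 between the endpoints because each interior even-indexed point is shared by two adjacent applications of Proposition 3.3.2.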


Approximate each of the following integrals, using Simpson’s method with 10 subdivisions.

1. $\int_0^1 x^4\,dx$

2. $\int_0^1 \sqrt{x}\,dx$

3. $\int_1^3 x^3\,dx$

4. $\int_1^2 \frac{1}{x}\,dx$

5. $\int_0^1 \sin^2 x\,dx$

6. $\int_0^1 \sqrt{4 - x^2}\,dx$

7. $\int_0^1 \frac{4}{1 + x^2}\,dx$


3.4 Infinite series

One type of sequence that we have studied already involved adding on a small amount to each term to construct the next. The simplest case is one in which $a_{n+1} = a_n + ar^n$, where $a$ and $r$ are real numbers. A sequence of this form is called a geometric series with ratio $r$.

Example 3.4.1. It is intuitively clear that the geometric series $a_1 = \frac12$, $a_2 = \frac12 + \frac14$, $a_3 = \frac12 + \frac14 + \frac18$, etc., has a limit of $1$. In this case the ratio is $\frac12$. We will see that the limit depends on the values of both $a$ and $r$ in the definition of a geometric series.

3.4.1 THEOREM. Let $r$ be a real number such that $|r| < 1$, and let $\{a_n\}$ be the sequence defined by

$$a_n = a + ar + ar^2 + \cdots + ar^n .$$

Then $\{a_n\}$ converges, and

$$\lim_{n\to\infty} a_n = \frac{a}{1 - r} .$$

Proof. In Section 1.1 of Chapter I, we saw that $x - c$ is a factor of $x^k - c^k$, for any integer $k$ and any real number $c$. Letting $c = 1$ and $k = n + 1$ we have

$$x^{n+1} - 1 = (x - 1)(x^n + x^{n-1} + \cdots + x^2 + x + 1) .$$

We can then substitute $x = r$ and rewrite the equation to obtain

$$1 + r + r^2 + \cdots + r^{n-1} + r^n = \frac{1 - r^{n+1}}{1 - r} .$$

Thus in the given sequence we have

$$a_n = a(1 + r + r^2 + \cdots + r^{n-1} + r^n) = a \cdot \frac{1 - r^{n+1}}{1 - r} .$$

If we apply the results in Theorem 3.1.2 and Example 3.1.3, we see that $\lim_{n\to\infty} a_n = \frac{a}{1-r}$.

If we look again at the sequence in Example 3.4.1, we see that $a = 1/2$ and $r = 1/2$. Thus Theorem 3.4.1 gives the limit of the sequence as $\frac12 \cdot \frac{1}{1 - 1/2} = 1$.


Example 3.4.2. Consider the sequence given by $a_1 = 3$, $a_2 = 3 - 3/4$, $a_3 = 3 - 3/4 + 3/16$, $a_4 = 3 - 3/4 + 3/16 - 3/64$, etc. To apply Theorem 3.4.1 we need to determine the values of $a$ and $r$, and after we obtain $a = 3$ and $r = -1/4$, we can compute the limit of the sequence as

$$\lim_{n\to\infty} a_n = 3 \cdot \frac{1}{1 + 1/4} = \frac{12}{5} .$$
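The two examples can be checked numerically by comparing partial sums with the closed form from the proof of Theorem 3.4.1 and with the limit $a/(1-r)$. This is a small sketch using exact fractions; the function names are illustrative assumptions, not notation from the notes.

```python
from fractions import Fraction


def partial_sum(a, r, n):
    """Return a + a*r + a*r**2 + ... + a*r**n by direct addition."""
    return sum(a * r ** k for k in range(n + 1))


def closed_form(a, r, n):
    """Return a * (1 - r**(n+1)) / (1 - r), the formula from the proof."""
    return a * (1 - r ** (n + 1)) / (1 - r)


def limit(a, r):
    """Return a / (1 - r); only valid when |r| < 1."""
    if abs(r) >= 1:
        raise ValueError("the sequence converges only when |r| < 1")
    return a / (1 - r)


# (a, r) pairs taken from Example 3.4.1 and Example 3.4.2.
for a, r in [(Fraction(1, 2), Fraction(1, 2)), (Fraction(3), Fraction(-1, 4))]:
    for n in (1, 5, 20):
        s = partial_sum(a, r, n)
        assert s == closed_form(a, r, n)   # the two expressions agree exactly
        print(f"a={a}, r={r}, n={n}: "
              f"a_n={float(s):.10f}, limit={float(limit(a, r)):.10f}")
```

Using `Fraction` keeps every partial sum exact, so the comparison with the closed form is an equality rather than an approximate check.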


1. Let $\{a_n\}$ be the sequence defined by the formula $a_n = 1 + \frac13 + \frac19 + \cdots + \left(\frac13\right)^n$. Find $\lim_{n\to\infty} a_n$.

2. Find $\lim_{n\to\infty} a_n$ if $\{a_n\}$ is defined by $a_1 = 1/3$, $a_2 = 1/3 + 2/9$, $a_3 = 1/3 + 2/9 + 4/27$, $a_4 = 1/3 + 2/9 + 4/27 + 8/81$, etc.


3.5 Taylor series

Example 3.5.1. If we differentiate a polynomial $p(x)$, the degree of $p'(x)$ is one less, and so there is no nonzero polynomial for which $p'(x) = p(x)$. However, we can find a quadratic polynomial that would approximate (for values of $x$ close to $x = 0$) a function $f(x)$ with $f'(x) = f(x)$. All such functions differ by a constant multiple, so we assume that $f(0) = 1$. This allows us to calculate all higher derivatives at $x = 0$, since they must also be equal to $1$. The Taylor polynomial of degree 2 of $f(x)$ at $x = 0$ is

$$1 + x + \frac{x^2}{2} .$$

The Taylor polynomial of $f(x)$ at $x = a$ provides an approximation for values of the function at nearby values of $x$, but thus far we have no idea of the accuracy of the approximation. The simplest case, approximating $f(x)$ by the 0th degree Taylor polynomial, gives $f(x) \sim f(a)$. Using the Mean Value Theorem, we have $f(x) - f(a) = f'(c)(x - a)$ for some value $c$ between $a$ and $x$. Thus we can estimate the error if we can estimate the maximum value of $|f'(x)|$ between $a$ and $x$. It is possible to use the Mean Value Theorem to prove the following theorem, but to cover the case of higher degree Taylor polynomials it is necessary to apply the Mean Value Theorem to a new function that is fairly complicated. We have omitted the proof of the theorem.

3.5.1 THEOREM (Taylor’s formula). Assume that $f(x)$ is a function whose $(n+1)$st derivative exists on an interval containing the points $a$ and $x$. Then

$$f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2}(x - a)^2 + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n + \frac{f^{(n+1)}(c)}{(n+1)!}(x - a)^{n+1}$$

for some number $c$ between $a$ and $x$.

The formula in Theorem 3.5.1 is called Taylor’s formula with remainder, since the last term can be viewed as a remainder or error term. The remainder term is just the next term in the Taylor polynomial of degree $n+1$, except that the last derivative is evaluated at the unknown value $c$ instead of at $a$. The best that can be done is to estimate the error term by finding a number $M$ such that $|f^{(n+1)}(c)| \le M$ for every $c$ between $a$ and $x$. Then for the Taylor polynomial $P_n(x)$ of degree $n$, we have

$$|f(x) - P_n(x)| \le \frac{M\,|x - a|^{n+1}}{(n+1)!} .$$


We should note that the error formula for Newton’s method really comes from this error estimate for Taylor’s formula.

Example 3.5.2. In this example we will compute $\sqrt{50.25}$ with an error of less than $.000001$. Let $f(x) = \sqrt{x}$. To use Taylor’s formula we need a value of $a$ for which we can easily evaluate $f(x)$ and its derivatives, so it seems reasonable to choose $a = 49$. For $f(x) = x^{1/2}$ we have $f'(x) = \frac12 x^{-1/2}$, $f''(x) = -\frac14 x^{-3/2}$, and $f'''(x) = \frac38 x^{-5/2}$. For the Taylor polynomial of degree 2 the error term is $\frac{3c^{-5/2}}{8\cdot 6}(1.25)^3$, for some number $c$ between $49$ and $50.25$. The term $c^{-5/2} = \frac{1}{\sqrt{c^5}}$ is largest when $c$ is as small as possible, so its maximum occurs for $c = 49$. A computation of this bound for the error term gives $.00000726$, so it is necessary to use at least the Taylor polynomial of degree 3. Since $f^{(4)}(x) = -\frac{15}{16}x^{-7/2}$, the next error term is less in absolute value than $\frac{15(49)^{-7/2}}{16\cdot 4!}(1.25)^4 = .00000012$, and so we can indeed use the approximation

$$\sqrt{x} \sim 7 + \frac{(x - 49)}{14} - \frac{(x - 49)^2}{2744} + \frac{(x - 49)^3}{268912} .$$

Substituting $x = 50.25$ gives

$$\sqrt{50.25} \sim 7 + .08928571 - .00056942 + .00000726 = 7.0887236 .$$

This can be compared with the value $7.0887234$ obtained on a calculator.
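The computation in Example 3.5.2 can be reproduced with a few lines of code. The sketch below builds the Taylor coefficients of $\sqrt{x}$ at $a$ recursively, using the fact that each derivative of $x^{1/2}$ lowers the exponent by one. For $x > a$ the bound $M$ is attained at $c = a$, so the error bound for degree $n$ is the absolute value of the degree $n+1$ term. The function names are assumptions made for this illustration.

```python
import math


def sqrt_taylor_terms(a, x, degree):
    """Return the terms f^(k)(a)/k! * (x - a)**k, k = 0..degree, for f = sqrt."""
    terms = []
    coeff = math.sqrt(a)                       # f(a) / 0!
    for k in range(degree + 1):
        terms.append(coeff * (x - a) ** k)
        # f^(k+1)(a)/(k+1)! equals f^(k)(a)/k! times (1/2 - k) / ((k + 1) * a)
        coeff *= (0.5 - k) / ((k + 1) * a)
    return terms


def error_bound(a, x, degree):
    """Bound on |sqrt(x) - P_degree(x)| for x > a > 0, taking c = a."""
    return abs(sqrt_taylor_terms(a, x, degree + 1)[-1])


a, x = 49.0, 50.25
for degree in (2, 3):
    approx = sum(sqrt_taylor_terms(a, x, degree))
    print(degree, approx, error_bound(a, x, degree), abs(approx - math.sqrt(x)))
```

Each printed row shows the degree, the approximation, the bound from Taylor’s formula, and the actual error measured against `math.sqrt`, so the bound can be compared with what really happens.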


3.6 Trigonometric functions

[This section will use limits of sequences to find the derivatives of the sine and cosine functions. It will then derive the Taylor series approximations for the two functions.]

Example 3.6.1. To find a cubic polynomial that approximates $f(x) = \sin(x)$ close to $x = 0$, we need the first three derivatives of $f(x)$. We obtain $f'(x) = \cos(x)$, $f''(x) = -\sin(x)$, and $f'''(x) = -\cos(x)$. Substituting $x = 0$ gives $f(0) = 0$, $f'(0) = 1$, $f''(0) = 0$, and $f'''(0) = -1$. If we approximate $f(x)$ by a polynomial of the form

$$p(x) = c_0 + c_1 x + c_2 x^2 + c_3 x^3 ,$$

then we get

$$\begin{aligned}
p'(x) &= c_1 + 2c_2 x + 3c_3 x^2 \\
p''(x) &= 2c_2 + 6c_3 x \\
p'''(x) &= 6c_3 .
\end{aligned}$$

Thus we must have $c_0 = 0$, $c_1 = 1$, $c_2 = 0$, and $c_3 = -1/6$, giving the cubic polynomial

$$p(x) = x - \frac{1}{6}x^3 .$$

On your calculator, graph $f(x) = \sin(x)$, together with $p(x)$. You will see that the polynomial agrees quite well with $\sin(x)$ for values of $x$ close to $x = 0$.
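If a graphing calculator is not at hand, tabulating both functions shows the same behavior. This is a minimal sketch; the sample points are arbitrary choices made for illustration.

```python
import math


def p(x):
    """Cubic polynomial from Example 3.6.1: x - x**3 / 6."""
    return x - x ** 3 / 6


for x in (0.1, 0.5, 1.0, 2.0, 3.0):
    print(f"x={x:3.1f}  sin(x)={math.sin(x): .6f}  p(x)={p(x): .6f}  "
          f"difference={abs(math.sin(x) - p(x)):.6f}")
```

The difference is tiny near $x = 0$ and grows quickly as $x$ moves away, which is what the graph shows.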

Example 3.6.2. In Example 3.6.1 we found a cubic polynomial which approximates $\sin(x)$ close to $x = 0$. Using Proposition 2.1 we can easily extend this to find Taylor polynomials of higher degree at $x = 0$. We have $f(x) = \sin(x)$, $f'(x) = \cos(x)$, $f''(x) = -\sin(x)$, and $f'''(x) = -\cos(x)$. Since $f^{iv}(x) = \sin(x)$, this pattern will repeat as higher and higher derivatives are calculated. After substituting $x = 0$ it is easy to see that the even powers of $x$ have coefficients that are all zero, while the odd ones alternate in sign. This gives us, for values of $x$ close to $0$,

$$\sin(x) \sim x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \frac{x^9}{9!} - \cdots \pm \frac{x^{2n+1}}{(2n+1)!} .$$

For the function $f(x) = \cos(x)$, we have $f'(x) = -\sin(x)$, $f''(x) = -\cos(x)$, $f'''(x) = \sin(x)$, and $f^{iv}(x) = \cos(x)$. From this point on the higher derivatives repeat this pattern. Substituting $x = 0$ and applying Proposition 2.1 gives us the Taylor polynomial of degree $2n$ of $\cos(x)$ at $x = 0$.

$$\cos(x) \sim 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \frac{x^8}{8!} - \frac{x^{10}}{10!} + \cdots \pm \frac{x^{2n}}{(2n)!} .$$

Example 3.6.3. Suppose that we wish to use Taylor polynomials to approximate all values of $\sin(x)$ and $\cos(x)$ to within an accuracy of $.00000001$. If we have formulas for both $\sin(x)$ and $\cos(x)$, then using various trigonometric identities it is sufficient to do the computations for values of $x$ between $0$ and $\pi/4$, and so we can use the Taylor polynomials given in Example 3.6.2. We have

$$\sin(x) \sim x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \frac{x^9}{9!} - \cdots \pm \frac{x^{2n+1}}{(2n+1)!} ,$$

with error term $\pm\frac{\sin(c)}{(2n+2)!}x^{2n+2}$. We certainly have $|\sin(c)| \le 1$, and so we can estimate that the error is less than $\frac{x^{2n+2}}{(2n+2)!}$, and the largest value we need to consider is that for $x = \pi/4$. For the Taylor polynomial of degree 3 the error is less than $\frac{(\pi/4)^4}{4!} = .01585434$. It is easiest to compute the error terms recursively. For the polynomial of degree 5 we need to multiply our previous answer by $(\pi/4)^2$ and divide by $6 \cdot 5$, giving the value $.00032599$. To obtain the error term for the polynomial of degree 7, we again multiply by $(\pi/4)^2$ and then divide by $8 \cdot 7$, giving $.00000359$. The next error term is slightly greater than $.00000002$, and so it is necessary to use the Taylor polynomial of degree 11 to achieve our goal.

A similar analysis shows that using the Taylor polynomial of degree 10 for $\cos(x)$ will give us the desired accuracy, so we have

$$\cos(x) \sim 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \frac{x^8}{8!} - \frac{x^{10}}{10!}$$

over the interval from $0$ to $\pi/4$, with error less than $.00000001$.
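The recursive computation of the error bounds in Example 3.6.3 is easy to automate. The sketch below lists the bound $x^{d+1}/(d+1)!$ for odd degrees $d = 3, 5, \ldots$ at $x = \pi/4$ until it drops below the target accuracy, and then evaluates the resulting Taylor polynomial. The function names are assumptions introduced for the illustration.

```python
import math


def sin_error_bounds(x_max, target):
    """List (degree, bound) for odd degrees 3, 5, ... until bound < target.

    bound = x_max**(degree + 1) / (degree + 1)!, using |sin(c)| <= 1.
    """
    degree = 3
    bound = x_max ** 4 / math.factorial(4)
    bounds = [(degree, bound)]
    while bound >= target:
        degree += 2
        # Recursive step: multiply by x_max**2, divide by (degree + 1) * degree.
        bound *= x_max ** 2 / ((degree + 1) * degree)
        bounds.append((degree, bound))
    return bounds


def sin_taylor(x, degree):
    """Taylor polynomial of sin(x) at 0, including terms up to x**degree."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range((degree + 1) // 2))


bounds = sin_error_bounds(math.pi / 4, 1e-8)
for degree, bound in bounds:
    print(degree, f"{bound:.8f}")
x = math.pi / 4
print(sin_taylor(x, bounds[-1][0]), math.sin(x))
```

The last degree in the list is the first one whose bound is below the target, which matches the degree chosen in the example.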


1. Find the first five nonzero terms of the Taylor polynomials for the following functions.

(a) $f(x) = \sqrt{x}$ at $a = 1$ and $a = 4$.


(b) $f(x) = x^{3/2}$ at $a = 1$.

(c) $f(x) = \frac{1}{x}$ at $a = 1$ and $a = 2$.

2. Using the Taylor polynomial in part (b) of the previous question, estimate $(1.1)^{3/2}$. Also estimate the error in your approximate value.

3. Given that $f(x)$ is a function with $f(0) = 1$ and $f'(x) = f(x)$ for all $x$, find the first five nonzero terms of the Taylor polynomial of $f(x)$ at $a = 0$. Assuming that $f(x) \le 10$ on the interval from $0$ to $1$, how many terms of the Taylor polynomial for $f(x)$ would be necessary to calculate $f(1)$ with an error of at most $.0000001$? Finally, do the calculation of the approximate value of $f(1)$.

4. Find the first five nonzero terms of the Taylor polynomials for the following functions.

(a) $f(x) = \sin(3x)$ at $a = 0$.

(b) $f(x) = \cos^2(x)$ at $a = 0$.

5. Using $a = 0$, approximate $\sin(10^\circ)$ with an error at most $.0001$. Note that you must first convert degrees to radians.

6. Using $a = \pi/3$, approximate $\sin(62^\circ)$ with an error at most $.000001$. Again, be sure to first convert degrees to radians.


3.7 The exponential form for complex numbers

[This section will use the Taylor series representations for the sine and cosine functions to show that $e^{i\theta} = \cos\theta + i\sin\theta$. It will end the book with the formula $e^{\pi i} = -1$.]

