
  • MATH 234
    THIRD SEMESTER CALCULUS

    Fall 2015

    [Cover figure: the graph of z = f(x, y), drawn over the x, y, and z axes.]

  • Math 234 – 3rd Semester Calculus
    Lecture notes version 0.9.1 (Fall 2015)

    This is a self contained set of lecture notes for Math 234. The notes were written by Sigurd Angenent; some problems were taken from Guichard’s open calculus text, which is available at http://www.whitman.edu/mathematics/multivariable/src/

    The LaTeX files, as well as the Python and Inkscape-svg files that were used to produce the notes before you, can be obtained from the following web site:

    http://www.math.wisc.edu/~angenent/Free-Lecture-Notes

    They are meant to be freely available for non-commercial use, in the sense that “free software” is free. More precisely:

    Copyright (c) 2009 Sigurd B. Angenent. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled “GNU Free Documentation License”.


  • Contents

    Chapter 1. Vector Geometry in Three dimensional space 5

    1. Three dimensional space 5

    2. Geometric description of vectors 5

    3. Arithmetic of vectors 6

    4. Vector algebra 7

    5. Component representation of vectors 8

    6. The dot product 9

    7. The cross product 10

    8. The triple product 12

    9. Determinants 13

    10. Determinants, the triple product, and the cross product 13

    11. Defining equations for lines and planes 14

    12. Problems 16

    Chapter 2. Parametric curves and vector functions 19

    1. Vector functions 19

    2. Using vector functions to describe motion 19

    3. Lines 20

    4. Circular motion 20

    5. The cycloid 21

    6. The helix 21

    7. The derivative of a vector function 22

    8. The derivative as velocity vector 23

    9. Acceleration 24

    10. The differentiation rules 25

    11. Vector functions of constant length 26

    12. Two examples 27

    13. Arc length 28

    14. Arc length derivative 29

    15. Unit Tangent and Curvature 30

    16. Osculating plane 31

    17. Problems 31

    Chapter 3. Functions of more than one variable 35

    1. Functions of two variables and their graphs 35

    2. Linear functions 38

    3. Quadratic forms 39

    4. Functions in polar coordinates r, θ 42

    5. Methods of visualizing the graph of a function 44

    Problems 46

    Chapter 4. Derivatives 49

    1. Interior points and continuous functions 49

    2. Partial Derivatives 50

    3. Problems 51

    4. The linear approximation to a function 52

    5. The tangent plane to a graph 55


    6. The Two Variable Chain Rule 58

    7. Problems 61

    8. Gradients 62

    9. The chain rule and the gradient of a function of three variables 66

    10. Implicit Functions 69

    Problems 72

    11. The Chain Rule with more Independent Variables; Coordinate Transformations 73

    12. Problems 75

    13. Higher Partials and Clairaut’s Theorem 78

    14. Finding a function from its derivatives 79

    15. Problems 81

    Chapter 5. Maxima and Minima 83

    1. Local and Global extrema 83

    2. Continuous functions on closed and bounded sets 84

    3. Problems 85

    4. Critical points 86

    5. When there are more than two variables 89

    6. Problems 91

    7. A Minimization Problem: Linear Regression 92

    8. Problems 93

    9. The Second Derivative Test 94

    10. Problems 99

    11. Second derivative test for more than two variables 100

    12. Optimization with constraints and the method of Lagrange multipliers 101

    13. Problems 104

    Chapter 6. Integrals 107

    1. Ways of Integrating 107

    2. Double Integrals 108

    3. Problems 120

    4. Triple integrals 121

    5. Why compute a Triple Integral? 124

    6. Integration in special coordinate systems 129

    7. Problems 132

    Chapter 7. Vector Calculus 137

    1. Vector Fields 137

    2. Examples of vector fields 137

    3. Line integrals 140

    4. Problems 142

    5. Line integrals of vector fields 142

    6. Another Fundamental Theorem of Calculus 148

    7. Conservative vector fields 150

    8. Problems 151

    9. Flux integrals 151

    10. Green’s Theorem 155

    11. Conservative vector fields and Clairaut’s theorem 157

    12. Problems 159

    13. Surfaces and Surface integrals 160

    14. Examples 165

    15. The divergence theorem and Stokes’ theorem 167

    16. $\vec\nabla$ – differentiating vector fields 168

    17. Problems 171

  • CHAPTER 1

    Vector Geometry in Three dimensional space

    1. Three dimensional space

    The world according to our first and second semester calculus courses is flat: except for a brief digression about surfaces of revolution, everything that we discussed in Math 221 and 222 took place in the (x, y)-plane. All curves were curves in the plane and all functions had graphs that were curves in the plane. This semester we leave two dimensions behind and enter the three dimensional world. In order to understand the objects we will be dealing with, such as curves that are free to loop around in space, or functions whose graphs are themselves two dimensional curved surfaces, we will first review some three dimensional geometry. In particular, we will review the use of vectors in three dimensional geometry.

    2. Geometric description of vectors

    2.1. Points and their coordinates. We are used to describing the location of any point in the plane by choosing two perpendicular “coordinate axes” (the x and y axes), and specifying the corresponding (x, y)-coordinates of any given point. In the same way we can describe where points are in three dimensional space by choosing three mutually perpendicular axes, which we call the x, y, and z axes. To say where some given point P is, we travel from the origin to P, first along the x axis, then parallel to the y-axis, and finally parallel to the z-axis. The distances we had to go in the x, y, and z directions are the x, y, and z coordinates of our point P.

    Figure 1. To determine the location of points in three dimensional space (such as the center of the blue sphere in this drawing), we should choose three coordinate axes, and specify three numbers: the x, y, and z coordinates of the point.

  • 6 1. VECTOR GEOMETRY IN THREE DIMENSIONAL SPACE

    2.2. Vectors. While points and their coordinates are used to describe locations in space, vectors are used to describe displacements, i.e. how to go from one point to another. Such a displacement has a size (how far we have to go), and a direction (which way do we go). Vectors also get used in non-geometric situations to describe objects that have size and direction, e.g. velocities and forces in physics are typical examples of vector-like objects.

    Informal definition of “vectors”. We will think of a vector as an arrow connecting two points. If the points are A and B then we call the vector $\overrightarrow{AB}$. If we translate a vector $\overrightarrow{AB}$ without turning it then we say that the resulting vector $\overrightarrow{CD}$ is the same vector as the original vector $\overrightarrow{AB}$. A more precise way of saying that we should be able to move $\overrightarrow{AB}$ “without turning,” is to insist that the line segments AB and CD should be parallel, and have the same length and orientation.

    Figure 2. This figure contains four points (A, B, C, D), two line segments (AB and CD), but only one vector since $\overrightarrow{AB}$ and $\overrightarrow{CD}$ represent the same vector: $\overrightarrow{AB} = \overrightarrow{CD}$.

    We say that the arrows $\overrightarrow{AB}$ and $\overrightarrow{PQ}$ both represent the same vector. Since both $\overrightarrow{AB}$ and $\overrightarrow{PQ}$ are the same vector we will often want to use a notation for vectors that does not emphasize any particular choice of initial- and endpoint. The notation we will use in this course is

    $$\vec{a} = \overrightarrow{AB} = \overrightarrow{PQ},$$

    i.e., a single letter with an arrow on top will always stand for a vector in this course.

    Figure 3. Adding vectors: to add two vectors $\vec{a} = \overrightarrow{AB}$ and $\vec{b} = \overrightarrow{PQ}$, move one vector until its initial point is the end point of the other, and combine them to get $\vec{a} + \vec{b} = \overrightarrow{AC}$.

    3. Arithmetic of vectors

    To add two vectors $\overrightarrow{AB}$ and $\overrightarrow{PQ}$ we first translate the vector $\overrightarrow{PQ}$ so that its initial point becomes B; let the result of this translation be the vector $\overrightarrow{BC}$. Then, by definition, the sum of $\overrightarrow{AB}$ and $\overrightarrow{PQ}$ is $\overrightarrow{AC}$: in a formula,

    $$\overrightarrow{AB} + \overrightarrow{PQ} = \overrightarrow{AB} + \overrightarrow{BC} = \overrightarrow{AC}.$$

    An equivalent way of adding two vectors $\overrightarrow{AB}$ and $\overrightarrow{PQ}$ is to move the vectors around until they have the same initial point. Two vectors with a common initial point form two sides of a parallelogram (see Figure 4) and the sum of the two vectors is the diagonal of that parallelogram.

    Figure 4. Using a parallelogram to add vectors. To find $\overrightarrow{AB} + \overrightarrow{AD}$ we move the vector $\overrightarrow{AD}$ so that its initial point is at B, i.e. the endpoint of $\overrightarrow{AB}$. This gives us a parallelogram ABCD, where $\overrightarrow{AD} = \overrightarrow{BC}$. Therefore $\overrightarrow{AB} + \overrightarrow{AD} = \overrightarrow{AB} + \overrightarrow{BC} = \overrightarrow{AC}$.

    One can also multiply vectors with numbers. To multiply a vector $\vec{a}$ with a positive real number $t > 0$, we multiply the length of the vector by a factor $t$, without changing the direction of the vector.

    Figure 5. Multiplying and subtracting vectors. [The drawing shows $\vec{a}$, $2\vec{a}$, $-\vec{a}$, $\vec{b}$, $-\vec{b}$, and the differences $\vec{a} - \vec{b}$ and $\vec{b} - \vec{a}$.]

    4. Vector algebra

    The addition and multiplication of vectors and numbers satisfy a number of algebraic properties that should look familiar, as they are very similar to the usual algebraic properties for adding and multiplying numbers. Here they are:

    $$\vec{a} + \vec{b} = \vec{b} + \vec{a} \qquad \text{commutative law}$$

    $$(\vec{a} + \vec{b}) + \vec{c} = \vec{a} + (\vec{b} + \vec{c}), \qquad t\cdot(s\cdot\vec{a}) = (ts)\cdot\vec{a} \qquad \text{associative laws}$$

    $$t\cdot(\vec{a} + \vec{b}) = t\vec{a} + t\vec{b}, \qquad (t+s)\vec{a} = t\vec{a} + s\vec{a} \qquad \text{distributive laws}$$
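    These laws can be checked numerically on examples. The following Python sketch is only an illustration: it anticipates the component representation of § 5 by storing vectors as tuples of numbers, and the helper names add and scale are our own choices, not notation from these notes.

```python
def add(a, b):
    """Sum of two vectors given as tuples of components."""
    return tuple(x + y for x, y in zip(a, b))


def scale(t, a):
    """Multiply the vector a by the number t."""
    return tuple(t * x for x in a)


# Arbitrary sample vectors and numbers (integers, so the checks are exact).
a, b, c = (1, 2, 3), (4, 5, 6), (7, 8, 10)
t, s = 2, 5

assert add(a, b) == add(b, a)                                # commutative law
assert add(add(a, b), c) == add(a, add(b, c))                # associative law
assert scale(t, scale(s, a)) == scale(t * s, a)              # associative law
assert scale(t, add(a, b)) == add(scale(t, a), scale(t, b))  # distributive law
assert scale(t + s, a) == add(scale(t, a), scale(s, a))      # distributive law
```

    A check on sample vectors is of course not a proof; it only confirms that the laws hold in this one instance.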


    5. Component representation of vectors

    5.1. Components of a vector in two dimensional space. There is a way to represent a vector by specifying a list of numbers instead of by giving a geometric description of the vector. To do this for vectors in the plane, we must choose two perpendicular coordinate axes (the “x” and “y” axes). We define

    $\vec{e}_1$ = vector with length 1, in the direction of the x axis

    $\vec{e}_2$ = vector with length 1, in the direction of the y axis

    Then any other vector can be written as the sum of a multiple of $\vec{e}_1$ and another multiple of $\vec{e}_2$:

    $$\vec{a} = a_1\vec{e}_1 + a_2\vec{e}_2. \tag{1}$$

    See Figure 6. The numbers $a_1$ and $a_2$ are called the components of the vector $\vec{a}$. If we know the components $a_1$ and $a_2$ of a vector, and if we know the two vectors $\vec{e}_1$ and $\vec{e}_2$, then we can reconstruct the vector $\vec{a}$ by using the formula (1).

    Figure 6. Describing a vector in terms of its components.

    Instead of using the notation (1), one very often writes

    $$\vec{a} = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix}, \quad\text{or}\quad \vec{a} = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}, \quad\text{or}\quad \vec{a} = \langle a_1, a_2 \rangle. \tag{2}$$

    This notation says that $\vec{a}$ is the vector whose components are $a_1$ and $a_2$. Since the two vectors $\vec{e}_1$ and $\vec{e}_2$ depend on our choice of coordinate axes, we can only use the component notation if it is clear to everyone how we chose the coordinate axes.

    The first way of writing the vector, in which the components $a_1$ and $a_2$ are listed in a column enclosed in either parentheses or square brackets, is the standard way of writing “column vectors,” and is used in linear algebra courses (Math 320, 340, 341, etc.), as well as by most computational software (Matlab™, Octave, etc.). The other way of writing the components, i.e. as $\langle a_1, a_2 \rangle$, also gets used, especially when one has to type the equations rather than write them by hand.

    5.2. Components of a vector in three dimensional space. The preceding also applies to vectors in three dimensional space: instead of choosing two coordinate axes we choose three axes, and call them the x, y, and z axes (or, the $x_1$, $x_2$, and $x_3$ axes). Then we define $\vec{\imath}$, $\vec{\jmath}$, and $\vec{k}$ (or $\vec{e}_1$, $\vec{e}_2$, and $\vec{e}_3$) to be vectors of length one in the direction of the three coordinate axes. A vector $\vec{a}$ in space can then be written as a combination of the three vectors $\vec{\imath}$, $\vec{\jmath}$, and $\vec{k}$, namely,

    $$\vec{a} = a_1\vec{\imath} + a_2\vec{\jmath} + a_3\vec{k}, \quad\text{or}\quad \vec{a} = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}.$$

    The $\vec{e}_1, \vec{e}_2, \vec{e}_3$ notation is more systematic, but the $\vec{\imath}, \vec{\jmath}, \vec{k}$ notation, which was introduced into vector geometry and vector calculus by J. W. Gibbs, is also very common.

    Figure 7. Components of a vector in three dimensional space. The vector with components $a_1, a_2, a_3$ is $\vec{a} = a_1\vec{e}_1 + a_2\vec{e}_2 + a_3\vec{e}_3$.

    [Margin note: Josiah Willard Gibbs, 1839–1903. https://en.wikipedia.org/wiki/Josiah_Willard_Gibbs]

    5.3. Length of a vector whose components are given. We will write

    $$\|\vec{a}\|$$

    for the length of a vector $\vec{a}$. If the vector is given in components,

    $$\vec{a} = a_1\vec{e}_1 + a_2\vec{e}_2, \quad\text{or}\quad \vec{a} = a_1\vec{e}_1 + a_2\vec{e}_2 + a_3\vec{e}_3,$$

    then the length of the vector is determined by Pythagoras’ law (see Figures 6 and 7):

    $$\|\vec{a}\| = \sqrt{a_1^2 + a_2^2}, \quad\text{or}\quad \|\vec{a}\| = \sqrt{a_1^2 + a_2^2 + a_3^2}. \tag{3}$$
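    As a small illustration of formula (3), here is a Python sketch that computes the length of a vector stored as a tuple of components. The function name norm is our own choice; the same code handles two and three components.

```python
import math


def norm(a):
    """Length of the vector a, computed from its components as in (3)."""
    return math.sqrt(sum(x * x for x in a))


print(norm((3, 4)))      # 5.0, since 3^2 + 4^2 = 25
print(norm((1, -2, 2)))  # 3.0, since 1 + 4 + 4 = 9
```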

    6. The dot product

    There are two different descriptions of the dot product of two vectors: one geometric, and the other in terms of the components of the vectors.

    6.1. Geometric description of the dot product. If $\vec{a}$ and $\vec{b}$ are two given vectors, then, by definition,

    $$\vec{a}\cdot\vec{b} = \|\vec{a}\|\,\|\vec{b}\|\cos\theta, \tag{4}$$

    where $\theta$ is the angle between the two vectors $\vec{a}$ and $\vec{b}$.

    [Margin figure: the dot product between two vectors $\vec{a}$ and $\vec{b}$, with the angle $\theta$ between them.]

    6.2. The dot product in terms of vector components. If we choose an orthonormal set of vectors $\vec{e}_1$, $\vec{e}_2$, $\vec{e}_3$, and write

    $$\vec{a} = a_1\vec{e}_1 + a_2\vec{e}_2 + a_3\vec{e}_3 = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}, \qquad \vec{b} = b_1\vec{e}_1 + b_2\vec{e}_2 + b_3\vec{e}_3 = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix},$$

    then

    $$\vec{a}\cdot\vec{b} = a_1b_1 + a_2b_2 + a_3b_3. \tag{5}$$

    The fact that (4) and (5) always give the same result is not obvious (the formulas look very different), and requires a proof. A very common proof relies on the law of cosines (it was given in Math 222; see also Problem 12.17).
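    Because (4) and (5) describe the same number, we can compute the dot product from components with (5) and then solve (4) for the angle. The Python sketch below does this for one pair of vectors; the function names are our own, and the vectors are assumed to be nonzero.

```python
import math


def dot(a, b):
    """Dot product from components, formula (5)."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Length of a vector, formula (3)."""
    return math.sqrt(dot(a, a))


def angle(a, b):
    """Angle between two nonzero vectors, found by solving (4) for theta."""
    cos_theta = dot(a, b) / (norm(a) * norm(b))
    # Rounding errors can push the quotient slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_theta)))


a, b = (1, 0, 0), (1, 1, 0)
theta = angle(a, b)
print(dot(a, b))            # 1
print(math.degrees(theta))  # approximately 45

# The geometric description (4) reproduces the component value (5).
assert math.isclose(dot(a, b), norm(a) * norm(b) * math.cos(theta))
```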

    6.3. Algebraic properties of the dot product. The dot product has the following algebraic properties, which we will use very often throughout this course:

    $$\vec{a}\cdot\vec{b} = \vec{b}\cdot\vec{a} \qquad \text{commutative}$$

    $$s(\vec{a}\cdot\vec{b}) = (s\vec{a})\cdot\vec{b} \qquad \text{associative}$$

    $$(\vec{a} + \vec{b})\cdot\vec{c} = \vec{a}\cdot\vec{c} + \vec{b}\cdot\vec{c} \qquad \text{distributive}$$

    We will not prove these properties here. Proofs can be given if one starts either from the algebraic description of the dot-product (5), or from the geometric description (4) (although the distributive property is more difficult to prove from the geometric description than from the algebraic description).

    The sign of the dot product tells us if the angle between two vectors is acute, obtuse, or if the vectors are perpendicular:

    $$\vec{a}\perp\vec{b} \iff \vec{a}\cdot\vec{b} = 0 \tag{6a}$$

    $$\vec{a}\cdot\vec{b} > 0 \iff \theta < \tfrac{\pi}{2} \tag{6b}$$

    $$\vec{a}\cdot\vec{b} < 0 \iff \theta > \tfrac{\pi}{2}. \tag{6c}$$
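    The rules (6a), (6b), (6c) translate directly into a test on the sign of the dot product. The Python sketch below is one possible way to write this; the function name classify_angle is hypothetical, the vectors are assumed to be nonzero, and the exact comparison with 0 is only appropriate for integer (or otherwise exact) components.

```python
def dot(a, b):
    """Dot product from components, formula (5)."""
    return sum(x * y for x, y in zip(a, b))


def classify_angle(a, b):
    """Classify the angle between two nonzero vectors using (6a)-(6c)."""
    d = dot(a, b)
    if d == 0:
        return "perpendicular"  # (6a)
    return "acute" if d > 0 else "obtuse"  # (6b) and (6c)


print(classify_angle((1, 0, 0), (0, 1, 0)))   # perpendicular
print(classify_angle((1, 2, 3), (1, 1, 1)))   # acute
print(classify_angle((1, 0, 0), (-1, 1, 0)))  # obtuse
```

    With floating point components one would replace the test d == 0 by a comparison with a small tolerance.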

    7. The cross product

    As with the dot product, the cross product of two vectors also has a geometric description, and a description in terms of components.

    7.1. Geometric description of the cross product. Let $\vec{a}$ and $\vec{b}$ be two vectors in three dimensional space, then their cross product is the vector $\vec{a}\times\vec{b}$ that satisfies

    • $\vec{a}\times\vec{b}$ is perpendicular to $\vec{a}$, and also to $\vec{b}$,

    • the length of $\vec{a}\times\vec{b}$ is given by $\|\vec{a}\times\vec{b}\| = \|\vec{a}\|\,\|\vec{b}\|\sin\theta$, where $\theta$ is the angle between the vectors $\vec{a}$ and $\vec{b}$,

    • the three vectors $\vec{a}$, $\vec{b}$, $\vec{a}\times\vec{b}$ satisfy the right hand rule: if on your right hand $\vec{a}$ is the index finger and $\vec{b}$ is the middle finger, then your thumb points in the direction of $\vec{a}\times\vec{b}$. See Figure 8.

    Figure 8. The cross product: $\vec{a}\times\vec{b}$ is perpendicular to both $\vec{a}$ and $\vec{b}$; its direction follows from the right-hand rule.

    The length of the cross product of two vectors has a geometric interpretation. Namely, the quantity $\|\vec{a}\|\,\|\vec{b}\|\sin\theta$ is exactly the area of the parallelogram spanned by the vectors $\vec{a}$ and $\vec{b}$.

    [Figure: the parallelogram spanned by $\vec{a}$ and $\vec{b}$, with base $= \|\vec{b}\|$ and height $= \|\vec{a}\|\sin\theta$, so that Area = height × base.]

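    7.2. Algebraic description of the cross product. If $\vec{a}$ and $\vec{b}$ are given in components, i.e. by

    $$\vec{a} = a_1\vec{e}_1 + a_2\vec{e}_2 + a_3\vec{e}_3 = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}, \qquad \vec{b} = b_1\vec{e}_1 + b_2\vec{e}_2 + b_3\vec{e}_3 = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix},$$

    then

    $$\vec{a}\times\vec{b} = \begin{pmatrix} a_2b_3 - a_3b_2 \\ a_3b_1 - a_1b_3 \\ a_1b_2 - a_2b_1 \end{pmatrix}.$$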
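    The component formula is easy to get wrong by a sign, so it helps to test it against the geometric description. The Python sketch below (function names are our own) computes a cross product and checks that the result is perpendicular to both factors. It also checks the length: combining $\|\vec{a}\times\vec{b}\| = \|\vec{a}\|\,\|\vec{b}\|\sin\theta$ with (4) and $\sin^2\theta + \cos^2\theta = 1$ gives $\|\vec{a}\times\vec{b}\|^2 = \|\vec{a}\|^2\|\vec{b}\|^2 - (\vec{a}\cdot\vec{b})^2$.

```python
def dot(a, b):
    """Dot product from components, formula (5)."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Cross product of two vectors in space, from their components."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2,
            a3 * b1 - a1 * b3,
            a1 * b2 - a2 * b1)


a, b = (1, 2, 3), (4, 5, 6)
n = cross(a, b)
print(n)  # (-3, 6, -3)

# The cross product is perpendicular to both factors.
assert dot(n, a) == 0 and dot(n, b) == 0
# Squared length = squared area of the parallelogram spanned by a and b.
assert dot(n, n) == dot(a, a) * dot(b, b) - dot(a, b) ** 2
```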

    7.3. Algebraic properties of the cross product. The cross product has the distributive property, namely,

    $$(\vec{a} + \vec{b})\times\vec{c} = \vec{a}\times\vec{c} + \vec{b}\times\vec{c}, \tag{7}$$

    holds true for any three vectors $\vec{a}$, $\vec{b}$, $\vec{c}$.

    The cross product is not commutative: $\vec{a}\times\vec{b}$ and $\vec{b}\times\vec{a}$ are not the same thing. Instead, we have:

    $$\vec{a}\times\vec{b} = -\vec{b}\times\vec{a}. \tag{8}$$

    Because of this property the cross product is said to be “anti-commutative.”

    The associative property fails completely for the cross product: for most vectors $\vec{a}$, $\vec{b}$, $\vec{c}$ one has

    $$(\vec{a}\times\vec{b})\times\vec{c} \neq \vec{a}\times(\vec{b}\times\vec{c}). \tag{9}$$

    If you need a vector that is perpendicular to two given vectors, take their cross product.

    The length of the cross product $\vec{a}\times\vec{b}$ is the area of the parallelogram spanned by those vectors.
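    The properties (7), (8), and (9) can be observed on concrete vectors. The Python sketch below uses the component formula from § 7.2; the helper names and the sample vectors are our own choices, picked only for illustration.

```python
def cross(a, b):
    """Cross product of two vectors in space, from their components (§ 7.2)."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2,
            a3 * b1 - a1 * b3,
            a1 * b2 - a2 * b1)


def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def neg(a):
    return tuple(-x for x in a)


a, b, c = (1, 0, 0), (1, 1, 0), (0, 1, 0)

# (7) distributive property
assert cross(add(a, b), c) == add(cross(a, c), cross(b, c))
# (8) anti-commutativity
assert cross(a, b) == neg(cross(b, a))
# (9) the two ways of placing the parentheses give different vectors here
print(cross(cross(a, b), c))  # (-1, 0, 0)
print(cross(a, cross(b, c)))  # (0, -1, 0)
```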

    8. The triple product

    Just as two vectors in the plane form a parallelogram, three vectors in space will form a shape called a parallelepiped. By definition, a parallelepiped is a solid body each of whose faces is a parallelogram.

    Figure 9. A parallelepiped spanned by three vectors $\vec{a}$, $\vec{b}$, $\vec{c}$. Since the base of the parallelepiped is a parallelogram with edges $\vec{b}$ and $\vec{c}$, we have

    $$\text{Area of base} = \|\vec{b}\times\vec{c}\|.$$

    The height of the parallelepiped is $\|\vec{a}\|\cos\theta$, where $\theta$ is the angle between $\vec{a}$ and $\vec{b}\times\vec{c}$, and therefore the volume is given by

    $$\text{Volume} = \text{height}\cdot\text{area of base} = \|\vec{a}\|\,\|\vec{b}\times\vec{c}\|\cos\theta = \vec{a}\cdot(\vec{b}\times\vec{c}).$$

    This derivation applies to the situation on the left, where the vector $\vec{a}$ and the cross product $\vec{b}\times\vec{c}$ make an acute angle. If these vectors form an obtuse angle, as is the case on the right, then $\cos\theta < 0$, and the height is $-\|\vec{a}\|\cos\theta$. In that case one has

    $$\text{Volume} = \text{height}\cdot\text{area of base} = -\|\vec{a}\|\,\|\vec{b}\times\vec{c}\|\cos\theta = -\vec{a}\cdot(\vec{b}\times\vec{c}).$$

    If we are given three vectors $\vec{a}$, $\vec{b}$, and $\vec{c}$, then the volume of the parallelepiped they determine is given by the formula

    “Volume equals Area of base times height”

    In terms of the three vectors this is

    $$V = \left|\vec{a}\cdot(\vec{b}\times\vec{c})\right|. \tag{10}$$

    A derivation is sketched in Figure 9. The quantity $\vec{a}\cdot(\vec{b}\times\vec{c})$ (without the absolute values) is called the triple product of the three vectors $\vec{a}$, $\vec{b}$, and $\vec{c}$. Apart from its use in computing the volume of a parallelepiped, the triple product appears in many other contexts. At first sight the expression $\vec{a}\cdot(\vec{b}\times\vec{c})$ suggests that the position of each vector in the expression is important, but cyclically permuting the three vectors turns out not to change the value. One has

    $$\vec{a}\cdot(\vec{b}\times\vec{c}) = \vec{b}\cdot(\vec{c}\times\vec{a}) = \vec{c}\cdot(\vec{a}\times\vec{b})$$

    for any $\vec{a}$, $\vec{b}$, $\vec{c}$.
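    To see formula (10) and the cyclic identity at work, the following Python sketch computes the triple product of three sample vectors. The function names and the vectors are our own choices. The last check shows what happens when two of the vectors are swapped (which is not a cyclic permutation): by the anti-commutativity (8) the sign changes.

```python
def dot(a, b):
    """Dot product from components, formula (5)."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Cross product from components (§ 7.2)."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2,
            a3 * b1 - a1 * b3,
            a1 * b2 - a2 * b1)


def triple(a, b, c):
    """Triple product a . (b x c)."""
    return dot(a, cross(b, c))


a, b, c = (1, 0, 0), (1, 2, 0), (1, 1, 3)

print(abs(triple(a, b, c)))  # 6, the volume of the parallelepiped by (10)

# Cyclic permutations of the three vectors give the same value ...
assert triple(a, b, c) == triple(b, c, a) == triple(c, a, b)
# ... but swapping two of the vectors changes the sign.
assert triple(b, a, c) == -triple(a, b, c)
```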

    9. Determinants

    For any four numbers a, b, c, d, one defines the 2 × 2 determinant to be

    $$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc. \tag{11}$$

    One can also define 3 × 3 determinants. Namely, for any nine numbers $a_1, \ldots, c_3$ one defines

    $$\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = a_1b_2c_3 - a_1b_3c_2 - a_2b_1c_3 + a_2b_3c_1 + a_3b_1c_2 - a_3b_2c_1. \tag{12}$$

    This can be written as

    $$\begin{aligned} \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} &= a_1(b_2c_3 - b_3c_2) - a_2(b_1c_3 - b_3c_1) + a_3(b_1c_2 - b_2c_1) \\ &= a_1\begin{vmatrix} b_2 & c_2 \\ b_3 & c_3 \end{vmatrix} - a_2\begin{vmatrix} b_1 & c_1 \\ b_3 & c_3 \end{vmatrix} + a_3\begin{vmatrix} b_1 & c_1 \\ b_2 & c_2 \end{vmatrix} \end{aligned} \tag{13}$$

    where each coefficient in the first column is multiplied with the 2 × 2 determinant that remains after one deletes the row and column containing the coefficient.

    Instead of expanding along the first column one can also expand along the first row:

    $$\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = a_1\begin{vmatrix} b_2 & c_2 \\ b_3 & c_3 \end{vmatrix} - b_1\begin{vmatrix} a_2 & c_2 \\ a_3 & c_3 \end{vmatrix} + c_1\begin{vmatrix} a_2 & b_2 \\ a_3 & b_3 \end{vmatrix} \tag{14}$$

    Many other mnemonic devices exist to remember how to compute a 3 × 3 determinant. A popular trick is “Sarrus’ rule” (see Figure 10).

    One can also define larger determinants, i.e. 4 × 4, 5 × 5, etc., and generally n × n determinants. The theory, which is beyond the scope of this course, is treated in linear algebra courses such as Math 320, 340, or 341.
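    The three ways of computing a 3 × 3 determinant can be compared on an example. In the Python sketch below, the functions are named after the equations they transcribe: det3_eq12 is the six-term formula (12), det3_eq13 is the expansion with coefficients $a_1, a_2, a_3$ from (13), and det3_eq14 is the expansion with coefficients $a_1, b_1, c_1$ from (14). The names and the sample matrix are our own.

```python
def det2(a, b, c, d):
    """2 x 2 determinant with rows (a, b) and (c, d), formula (11)."""
    return a * d - b * c


def det3_eq12(m):
    """Six-term formula (12); m is given as three rows (a_i, b_i, c_i)."""
    (a1, b1, c1), (a2, b2, c2), (a3, b3, c3) = m
    return (a1 * b2 * c3 - a1 * b3 * c2 - a2 * b1 * c3
            + a2 * b3 * c1 + a3 * b1 * c2 - a3 * b2 * c1)


def det3_eq13(m):
    """Expansion (13), with the coefficients a1, a2, a3."""
    (a1, b1, c1), (a2, b2, c2), (a3, b3, c3) = m
    return (a1 * det2(b2, c2, b3, c3)
            - a2 * det2(b1, c1, b3, c3)
            + a3 * det2(b1, c1, b2, c2))


def det3_eq14(m):
    """Expansion (14), with the coefficients a1, b1, c1."""
    (a1, b1, c1), (a2, b2, c2), (a3, b3, c3) = m
    return (a1 * det2(b2, c2, b3, c3)
            - b1 * det2(a2, c2, a3, c3)
            + c1 * det2(a2, b2, a3, b3))


m = ((2, 0, 1),
     (1, 3, -1),
     (0, 5, 4))
print(det3_eq12(m))  # 39
assert det3_eq12(m) == det3_eq13(m) == det3_eq14(m)
```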

    10. Determinants, the triple product, and the cross product

    If the numbers $a_1, \ldots, c_3$ in a determinant happen to be the components of three vectors $\vec{a}$, $\vec{b}$, $\vec{c}$, i.e. if

    $$\vec{a} = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}, \qquad \vec{b} = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}, \qquad \vec{c} = \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix},$$

    then the corresponding determinant is exactly the triple product:

    $$\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = \vec{a}\cdot(\vec{b}\times\vec{c}). \tag{15}$$

    Figure 10. Computing 3 × 3 determinants. There are several shortcuts to remember how to compute a 3 × 3 determinant. Pictured here is “Sarrus’ rule,” which tells us to copy the first two columns of the determinant to the right of the determinant, and read off the six terms in the determinant by following the diagonals. [In the drawing the terms $a_1b_2c_3$, $a_2b_3c_1$, $a_3b_1c_2$ carry a plus sign and the terms $a_3b_2c_1$, $a_1b_3c_2$, $a_2b_1c_3$ carry a minus sign.]

    Related to this is the following practical trick for computing the cross product of two column vectors. Given two column vectors $\vec{b}$ and $\vec{c}$ one can write their cross product as

    $$\begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} \times \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = \begin{vmatrix} \vec{e}_1 & b_1 & c_1 \\ \vec{e}_2 & b_2 & c_2 \\ \vec{e}_3 & b_3 & c_3 \end{vmatrix} = \begin{vmatrix} b_2 & c_2 \\ b_3 & c_3 \end{vmatrix}\vec{e}_1 - \begin{vmatrix} b_1 & c_1 \\ b_3 & c_3 \end{vmatrix}\vec{e}_2 + \begin{vmatrix} b_1 & c_1 \\ b_2 & c_2 \end{vmatrix}\vec{e}_3.$$

    The 3 × 3 determinant in this equation is unusual in that some of its entries are vectors instead of numbers. The intention of this notation is that one expand the determinant along the first column, as in (13), and then interpret the result as a vector.
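    Both statements of this section can be tested on numbers. The Python sketch below (with names of our own choosing) computes a cross product from the three 2 × 2 determinants in the formula above, and then checks (15) by comparing the six-term determinant (12) with the triple product.

```python
def det2(a, b, c, d):
    """2 x 2 determinant with rows (a, b) and (c, d), formula (11)."""
    return a * d - b * c


def cross_by_minors(b, c):
    """Cross product b x c, read off from the three 2 x 2 determinants."""
    b1, b2, b3 = b
    c1, c2, c3 = c
    return (det2(b2, c2, b3, c3),
            -det2(b1, c1, b3, c3),
            det2(b1, c1, b2, c2))


def det3(a, b, c):
    """Six-term formula (12) for the determinant whose columns are a, b, c."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    c1, c2, c3 = c
    return (a1 * b2 * c3 - a1 * b3 * c2 - a2 * b1 * c3
            + a2 * b3 * c1 + a3 * b1 * c2 - a3 * b2 * c1)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


a, b, c = (1, 0, 0), (1, 2, 0), (1, 1, 3)
print(cross_by_minors(b, c))  # (6, -3, -1)

# (15): the determinant equals the triple product a . (b x c).
assert det3(a, b, c) == dot(a, cross_by_minors(b, c))
```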

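    11. Defining equations for lines and planes

    11.1. Lines. Let ℓ be a line in the plane, and suppose we know one point A on the line, and that we also have a vector $\vec{n}$ that is perpendicular to the line (and we exclude $\vec{n} = \vec{0}$). Such a vector is called a normal vector to the line. Given any other point X in the plane we can form the vector $\overrightarrow{AX}$ and consider its dot-product with the normal. We have

    $$\vec{n}\cdot\overrightarrow{AX} = \|\vec{n}\|\,\|\overrightarrow{AX}\|\cos\theta,$$

    where $\theta$ is the angle between the normal vector $\vec{n}$ and $\overrightarrow{AX}$.

    The combination $\|\overrightarrow{AX}\|\cos\theta$ is, up to its sign, the distance from the line ℓ to the point X: If X lies on the side of ℓ at which the normal vector points then $\vec{n}\cdot\overrightarrow{AX} > 0$; if X lies on the other side then $\vec{n}\cdot\overrightarrow{AX} < 0$. We therefore have the following formula for the distance between a point X and the line ℓ, counted as positive on the side of ℓ at which $\vec{n}$ points and as negative on the other side (so the actual distance is $|d|$):

    $$d = \frac{\vec{n}\cdot\overrightarrow{AX}}{\|\vec{n}\|}. \tag{16}$$

    When we use this equation to compute the distance from X to ℓ, it is good to recall that if $\vec{x} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ and $\vec{a} = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix}$ are the position vectors of the points X and A, then

    $$\overrightarrow{AX} = \vec{x} - \vec{a} = \begin{pmatrix} x_1 - a_1 \\ x_2 - a_2 \end{pmatrix}.$$

    [Figure: the line ℓ with a point A on it, the normal $\vec{n}$, a point X, the angle $\theta$, and the distance d. When $\vec{n}\cdot\overrightarrow{AX} > 0$ the distance is $\|\overrightarrow{AX}\|\cos\theta$; when $\vec{n}\cdot\overrightarrow{AX} < 0$ it is $\|\overrightarrow{AX}\|\cos(\pi - \theta) = -\|\overrightarrow{AX}\|\cos\theta$.]

    Moreover, the length of the normal vector is $\|\vec{n}\| = \sqrt{n_1^2 + n_2^2}$, so we can rewrite (16) as

    $$d = \frac{n_1(x_1 - a_1) + n_2(x_2 - a_2)}{\sqrt{n_1^2 + n_2^2}}.$$

    This last formula is more impressive than (16), but it is better to remember (16).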
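    Formula (16) is short enough to turn into a few lines of code. The Python sketch below evaluates it for a line in the plane; the function name signed_distance and the sample line are our own choices. The sign of the result tells us on which side of the line the point lies.

```python
import math


def signed_distance(n, a, x):
    """Formula (16): d = n . AX / ||n||, for a line in the plane.

    n is a nonzero normal vector, a is a point on the line, x is any point.
    """
    ax = (x[0] - a[0], x[1] - a[1])       # the vector AX = x - a
    n_dot_ax = n[0] * ax[0] + n[1] * ax[1]
    return n_dot_ax / math.hypot(n[0], n[1])


n, a = (3, 4), (0, 1)  # the line through A(0, 1) with normal (3, 4)

print(signed_distance(n, a, (3, 5)))    # 5.0  (the side n points to)
print(signed_distance(n, a, (-3, -3)))  # -5.0 (the other side)
print(signed_distance(n, a, (4, -2)))   # 0.0  (a point on the line)
```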

    The equation for the distance from any point X to a given line ℓ is also important because it gives us the defining equation for the line ℓ. The defining equation is an equation that tells us for any given point X in the plane if that point is on the line or not. Since X is on ℓ exactly when the distance from ℓ to X vanishes, it follows from (16) that X is on ℓ if and only if

    $$\vec{n}\cdot\overrightarrow{AX} = 0. \tag{17}$$

    We can again rewrite this equation in a few different ways. If we want to write it in terms of the position vectors of A and X, then we get

    $$\vec{n}\cdot(\vec{x} - \vec{a}) = 0, \quad\text{i.e.:}\quad \vec{n}\cdot\vec{x} = \vec{n}\cdot\vec{a}.$$

    Written without vectors, but in terms of the coordinates of the points A, X, and the components of the normal vector $\vec{n}$, we can write this last version of our equation as

    $$n_1x_1 + n_2x_2 = n_1a_1 + n_2a_2.$$
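    As an illustration, the Python sketch below produces the defining equation of a line from a normal vector and a point on the line, and uses it to test whether given points lie on the line. The function names are hypothetical, and the exact comparison assumes integer coordinates.

```python
def line_equation(n, a):
    """Return (n1, n2, rhs) such that the line is n1*x1 + n2*x2 = rhs."""
    n1, n2 = n
    a1, a2 = a
    return n1, n2, n1 * a1 + n2 * a2


def on_line(n, a, x):
    """True when the point x satisfies n . x = n . a, i.e. equation (17)."""
    n1, n2, rhs = line_equation(n, a)
    return n1 * x[0] + n2 * x[1] == rhs


n, a = (3, 4), (0, 1)  # the line through A(0, 1) with normal (3, 4)

print(line_equation(n, a))     # (3, 4, 4), i.e. 3*x1 + 4*x2 = 4
print(on_line(n, a, (4, -2)))  # True:  3*4 + 4*(-2) = 4
print(on_line(n, a, (3, 5)))   # False: 3*3 + 4*5 = 29
```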

    11.2. Planes. We can repeat the derivation of the distance from a point to a line in the plane and derive a formula for the distance from a point in three dimensional space to a given plane. The drawings are harder to make (at first only, practice makes perfect!), but the resulting formulas are the same.

    The distance from a point X to a plane P is given by equation (16), where $\vec{n}$ is a normal vector to the plane (a vector that is perpendicular to the plane), and A is some point on the plane that we happen to know.

    [Figure: a plane through the point A with normal vector $\vec{n}$, a point X, the angle $\theta$ between $\vec{n}$ and $\overrightarrow{AX}$, and the distance $d = \|\overrightarrow{AX}\|\cos\theta$, where $\vec{n}\cdot\overrightarrow{AX} = \|\vec{n}\|\,\|\overrightarrow{AX}\|\cos\theta$.]
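    Since the formula is the same for lines and planes, one function can serve both cases if it is written for vectors with any number of components. The Python sketch below is one way to do this; the function name and the sample plane are our own choices.

```python
import math


def signed_distance(n, a, x):
    """Formula (16) for a line (2 components) or a plane (3 components).

    n is a nonzero normal vector, a is a known point on the line or plane,
    and x is the point whose distance we want.
    """
    ax = [xi - ai for xi, ai in zip(x, a)]         # the vector AX = x - a
    n_dot_ax = sum(ni * di for ni, di in zip(n, ax))
    return n_dot_ax / math.sqrt(sum(ni * ni for ni in n))


n, a = (1, 2, 2), (1, 0, 0)  # the plane through A(1, 0, 0) with normal (1, 2, 2)

print(signed_distance(n, a, (2, 2, 2)))   # 3.0, since n . AX = 9 and ||n|| = 3
print(signed_distance(n, a, (1, 1, -1)))  # 0.0, so this point lies in the plane
```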

    12. Problems

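    1. (a) Simplify the following

    $$\vec{a} = \begin{pmatrix} 1 \\ -2 \\ 3 \end{pmatrix} + 3\begin{pmatrix} 0 \\ 1 \\ 3 \end{pmatrix}, \qquad \vec{b} = 12\begin{pmatrix} 1 \\ 1/3 \end{pmatrix} - 3\begin{pmatrix} 4 \\ 1 \end{pmatrix},$$

    $$\vec{c} = (1 + t)\begin{pmatrix} 1 \\ 1 - t \end{pmatrix} - t\begin{pmatrix} 1 \\ -t \end{pmatrix}, \qquad \vec{d} = t\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + t^2\begin{pmatrix} 0 \\ -1 \\ 2 \end{pmatrix} \;[\ldots]\; \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}.$$

    (b) Write the vectors from part (a) using Gibbs’ notation, i.e. write them in terms of $\vec{\imath}$, $\vec{\jmath}$, $\vec{k}$. (See § 5).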

    2. If $\vec{a}$, $\vec{b}$, $\vec{c}$ are as in the previous problem, then which of the following expressions mean anything? Compute those expressions that are well defined.

    (a) $\vec{a} + \vec{b}$ (b) $\vec{b} + \vec{c}$ (c) $\pi\vec{a}$ (d) $\vec{b}^2$

    (e) $\vec{b}/\vec{c}$ (f) $\|\vec{a}\| + \|\vec{b}\|$ (g) $\|\vec{b}\|^2$ (h) $\vec{b}/\|\vec{c}\|$

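    3. Let $\vec{a} = \begin{pmatrix} 1 \\ -2 \\ 2 \end{pmatrix}$ and $\vec{b} = \begin{pmatrix} 2 \\ -1 \\ 1 \end{pmatrix}$.

    Compute:

    (a) $\|\vec{a}\|$ (b) $2\vec{a}$ (c) $\|2\vec{a}\|^2$

    (d) $\vec{a} + \vec{b}$ (e) $3\vec{a} - \vec{b}$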

    4. Given: points A(2, 1) and B(−1, 4). Compute the vector $\overrightarrow{AB}$. Is $\overrightarrow{AB}$ a position vector?

    5. Given: points A(2, 1), B(3, 2), C(4, 4) and D(5, 2). Question: Is ABCD a parallelogram?

    6. Given: points A(0, 2, 1), B(0, 3, 2), C(4, 1, 4) and D.

    (a) If ABCD is a parallelogram, then what are the coordinates of the point D?

    (b) If ABDC is a parallelogram, then what are the coordinates of the point D?

    7. You are given three points in the plane: A has coordinates (2, 3), B has coordinates (−1, 2) and C has coordinates (4, −1).

    (a) Compute the vectors $\overrightarrow{AB}$, $\overrightarrow{BA}$, $\overrightarrow{AC}$, $\overrightarrow{CA}$, $\overrightarrow{BC}$ and $\overrightarrow{CB}$.

    (b) Find the points P, Q, R and S whose position vectors are $\overrightarrow{AB}$, $\overrightarrow{BA}$, $\overrightarrow{AC}$, and $\overrightarrow{BC}$, respectively. Make a precise drawing.

    8. Explain how you can use the dot product to find the angle between the vectors $\vec{a} = 2\vec{\imath} - 3\vec{\jmath}$, and $\vec{b} = \vec{\jmath} + \vec{k}$.

    Figure 11. Figure for problem 12.10 [a cube with vertices labeled A, B, C, D, E, F, G, H].

    9. For which value(s) of the number s are the vectors $\vec{a} = \begin{pmatrix} s \\ 1 - s \end{pmatrix}$ and $\vec{b} = \begin{pmatrix} 2 \\ 3 \end{pmatrix}$ perpendicular? For which values of s do they make an acute angle?

    10. Figure 11 shows a cube whose sides have length 1.

    Choose A to be the origin, and let the x, y, and z axes be along the sides AB, AD, and AE, respectively.

    (a) Draw the vectors $\vec{e}_1$, $\vec{e}_2$, and $\vec{e}_3$ in the figure.

    (b) Find a normal vector to the plane through the points B, D, and E.

    (c) Draw the plane through ACH (or at least the portion of that plane that lies inside the cube). Find a normal to the plane ACH.

    (d) Find the angle between the two planes BDE and ACH. (The angle between two planes is the same as the angle between their normal vectors, i.e. to find the angle between two planes find a normal vector for each of the planes and compute the angle between these two vectors.)

    (e) Find the angle between the two planes BDE and HFC.

    11. (a) Draw two vectors $\vec{a}$ and $\vec{b}$ for which $\vec{a}$ has length 3, $\vec{b}$ has length 5, and for which $\vec{a}\cdot\vec{b} = -12$. How many solutions are there?

    (b) Can there be two vectors $\vec{a}$ and $\vec{b}$ whose lengths are $\|\vec{a}\| = 3$ and $\|\vec{b}\| = 5$, and whose inner product is $\vec{a}\cdot\vec{b} = 25$?

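    12. Compute

    $$\vec{a} = (\vec{\imath}\times\vec{\jmath})\times\vec{\jmath} \quad\text{and}\quad \vec{b} = \vec{\imath}\times(\vec{\jmath}\times\vec{\jmath}).$$

    What does your answer say about the associative property for the cross product? (See § 7.3.)

    What about

    $$\vec{c} = (\vec{\imath}\times\vec{\jmath})\times\vec{k} \quad\text{and}\quad \vec{d} = \vec{\imath}\times(\vec{\jmath}\times\vec{k})?$$

    13. Which of the following vector equations are true for any pair of vectors $\vec{a}$ and $\vec{b}$? Either give a proof (using the algebraic properties or the algebraic or geometric descriptions) or give a counterexample.

    (a) $(\vec{a} + \vec{b})\cdot(\vec{a} - \vec{b}) = \|\vec{a}\|^2 - \|\vec{b}\|^2$?

    (b) If $\vec{a}\perp\vec{b}$ then $\|\vec{a} + \vec{b}\|^2 = \|\vec{a}\|^2 + \|\vec{b}\|^2$?

    (c) If $\vec{a}\perp\vec{b}$ then $\|\vec{a} - \vec{b}\|^2 = \|\vec{a}\|^2 - \|\vec{b}\|^2$?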

  • 18 1. VECTOR GEOMETRY IN THREE DIMENSIONAL SPACE

    14. True or False:

    (a) If #‰a ⊥ #‰b and also #‰b ⊥ #‰c then #‰a ⊥ #‰c?

    (b) If #‰a ⊥ #‰b and also #‰a ⊥ #‰c then #‰a ⊥(

    #‰

    b + #‰c ) ?

    (c) If #‰a ⊥ #‰b and also #‰b ⊥ #‰c then #‰b ⊥( #‰a − #‰c ) ?(d) If #‰a ⊥ #‰b + #‰c and also #‰a ⊥ #‰b − #‰c then#‰a ⊥ #‰b ?

    15. Simplify the following expressions:

    (a) $(\vec{a}+\vec{b})\times(\vec{a}+\vec{b})$ •

    (b) $(\vec{a}+\vec{b}+\vec{c})\times(\vec{a}+\vec{b}+\vec{c})$ •

    (c) $(\vec{a}-\vec{b})\times(\vec{a}+\vec{b})$ •

    (d) $(\vec{a}+\vec{b}-\vec{c})\times(\vec{a}-\vec{b}+\vec{c})$

    (e) $(\vec{a}+\vec{b}-\vec{c})\cdot(\vec{a}-\vec{b}+\vec{c})$

    16. This problem is about “cross division,” i.e. can you solve $\vec{a}\times\vec{b} = \vec{c}$ for $\vec{b}$ if you know $\vec{a}$ and $\vec{c}$?

    (a) Let

    \[ \vec{a} = \vec{e}_1 - \vec{e}_3, \qquad \vec{c} = \vec{e}_1 + 3\vec{e}_2 + 2\vec{e}_3. \]

    Find a vector $\vec{b}$ for which $\vec{a}\times\vec{b} = \vec{c}$, if there is such a thing. (Hint: if $\vec{c} = \vec{a}\times\vec{b}$, then what do you know about $\vec{a}\cdot\vec{c}$?) •

    (b) Let $\vec{a} = 2\vec{e}_1 - \vec{e}_3$, and $\vec{c} = \vec{e}_1 + 3\vec{e}_2 + 2\vec{e}_3$. Find a vector $\vec{b}$ for which $\vec{a}\times\vec{b} = \vec{c}$, if such a thing exists. •

    17. The law of cosines says that in a triangle $\triangle ABC$ for which you know the sides $AB$ and $AC$, as well as the angle $\angle A$, the length of the opposing side $BC$ is given by

    \[ (BC)^2 = (AB)^2 + (AC)^2 - 2(AB)(AC)\cos\angle A. \]

    Show how you can use the dot product to (re)prove this law.

    Hint: consider the vector equation $\overrightarrow{BC} = \overrightarrow{AC} - \overrightarrow{AB}$. You will need both the geometric description (4) of the dot product, and the algebraic properties from § 6.3.

  • CHAPTER 2

    Parametric curves and vector functions

    1. Vector functions

    So far in calculus we have only considered functions $y = f(x)$ where both the independent variable $x$ and the dependent variable $y$ are real numbers.

    A vector function is a function of one variable whose values are vectors instead of numbers. One way to specify a vector function is to say what its components are:

    \[ \vec{x}(t) = \begin{pmatrix} x(t) \\ y(t) \\ z(t) \end{pmatrix} = x(t)\,\vec{e}_1 + y(t)\,\vec{e}_2 + z(t)\,\vec{e}_3. \]

    2. Using vector functions to describe motion

    One way to visualize a vector function $\vec{x}(t)$ is to think of the vector $\vec{x}(t)$ for any given value of $t$ as the position vector of some point in space (or the plane, if $\vec{x}(t)$ is a two-dimensional vector). In other words, we represent the vector $\vec{x}(t)$ as an arrow starting at the origin, and ending at some point $X(t)$ whose coordinates are $(x(t), y(t), z(t))$:

    \[ \vec{x}(t) = \overrightarrow{OX}(t). \]

    As $t$ varies, the point $X(t)$ moves around and traces out a curve. Such a curve is called a parametrized curve, or a parametric curve. The quantity $t$ is called the parameter.

    We will now take a look at some examples of parametric curves.

    Figure 1. A parametric curve: as the parameter $t$ changes, the vector $\vec{x}(t)$ will also move. Keeping the initial point of the vector $\vec{x}(t)$ at the origin $O$, the endpoint $X(t)$ traces out a space curve.


    3. Lines

    Consider the parametric curve given by

    \[ \vec{x}(t) = \vec{a} + t\,\vec{v} \tag{18} \]

    where $\vec{a}$ and $\vec{v}$ are given constant vectors. As before we let $X(t)$ be the point with $\vec{x}(t) = \overrightarrow{OX}(t)$, i.e. $\vec{x}(t)$ is the position vector of the point $X(t)$, and as $t$ changes, $X(t)$ traces out the parametric curve.

    To see what the parametric curve looks like, we let $A$ be the point with $\overrightarrow{OA} = \vec{a}$. Then, since

    \[ \overrightarrow{OX}(t) = \overrightarrow{OA} + \overrightarrow{AX}(t), \]

    it follows from (18) that $\overrightarrow{AX}(t) = t\,\vec{v}$. Now consider going from the origin $O$ to the point $X(t)$ in two steps: first move from $O$ to the point $A$, then go from $A$ to $X(t)$. The displacement in the second step is $\overrightarrow{AX}(t) = t\,\vec{v}$. Changing $t$ will then make the point $X(t)$ slide along the line through the point $A$ in the direction of $\vec{v}$.

    Figure 2. Vector form of linear motion given by $\vec{x}(t) = \vec{a} + t\,\vec{v}$.

    We say that $\vec{x}(t)$ given by (18) describes motion with constant velocity, whose velocity vector is $\vec{v}$.

    4. Circular motion

    For given constants $R > 0$ and $\omega$ we consider the vector function

    \[ \vec{x}(t) = R\cos\omega t\,\vec{e}_1 + R\sin\omega t\,\vec{e}_2 = \begin{pmatrix} R\cos\omega t \\ R\sin\omega t \end{pmatrix}. \tag{19} \]

    The corresponding point is $X(t) = (R\cos\omega t, R\sin\omega t)$. It lies on the circle of radius $R$ with center at the origin, and the angle between $OX(t)$ and the positive $x$-axis is exactly $\omega t$.

    If $\omega > 0$ then as $t$ increases, the angle $\omega t$ increases and the point $X(t)$ goes around the circle in the counter-clockwise direction. If $\omega < 0$ then $X(t)$ goes around in the clockwise direction.

    The number $\omega$ is the rate of increase of the angle $\omega t$, and is called the angular velocity of the motion.

    Figure 3. Circular motion with angular velocity $\omega$.

    5. The cycloid

    The cycloid is the curve we get if we put a (bicycle) wheel on the ground, mark the point on the tire that touches the ground, and follow this point as we roll the wheel forward. If we call the point $X$, then it depends on the angle $\theta$ that the wheel has turned since $X$ was on the ground. Figure 4 provides a derivation of the vector function $\vec{x}(\theta) = \overrightarrow{OX}(\theta)$ that describes the cycloid. The result is

    \[ \vec{x}(\theta) = \begin{pmatrix} R\theta - R\sin\theta \\ R - R\cos\theta \end{pmatrix}. \tag{20} \]

    Figure 4. The cycloid. A wheel of radius $R$ rolls over the $x$-axis. Initially the wheel touches the $x$-axis at the origin $O$. The cycloid is the curve traced out by a point $X$ on the wheel.

    Derivation of the cycloid motion. The arc $AX$ and the line segment $OA$ have the same length. Since $AX$ has length $R\theta$, the $x$ coordinates of the points $A$, $B$, and $C$ are $R\theta$. The right triangle $CXB$ has hypotenuse $R$, so the lengths of $XB$ and $CB$ are $R\sin\theta$ and $R\cos\theta$, respectively. Therefore the coordinates of the point $X$ are $x = R\theta - R\sin\theta$ and $y = R - R\cos\theta$.
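The derivation can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names `cycloid` and `wheel_center` and the radius 2 are chosen here only for illustration. It evaluates formula (20) and confirms that the point X always lies at distance R from the center C = (Rθ, R) of the wheel.

```python
import math

def cycloid(theta, R=1.0):
    """Formula (20): position of the marked point X after the wheel has turned by theta."""
    return (R * theta - R * math.sin(theta), R - R * math.cos(theta))

def wheel_center(theta, R=1.0):
    """Center C of the wheel, directly above the contact point A = (R*theta, 0)."""
    return (R * theta, R)

R = 2.0  # illustrative radius
for theta in (0.0, math.pi / 2, math.pi, 2 * math.pi):
    x, y = cycloid(theta, R)
    cx, cy = wheel_center(theta, R)
    dist = math.hypot(x - cx, y - cy)  # distance from C to X, should equal R
    print(f"theta={theta:.3f}  X=({x:.3f}, {y:.3f})  |CX|={dist:.3f}")
```

At θ = 0 the point sits at the origin, and at θ = π it reaches the top of the wheel, at height 2R.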

    6. The helix

    When we walk up a spiral staircase we are tracing out a helix: we are going around in circles, and moving upward at the same time. The parametric curve that does this (and that has the $z$-axis as its central axis) is given by

    \[ \vec{x}(\theta) = \begin{pmatrix} R\cos\theta \\ R\sin\theta \\ a\theta \end{pmatrix}, \quad\text{or:}\quad \vec{x}(\theta) = R\cos\theta\,\vec{e}_1 + R\sin\theta\,\vec{e}_2 + a\theta\,\vec{e}_3. \tag{21} \]

    Here $R > 0$ is the radius of the helix, i.e. the radius of the circle on the ground above which the helix lies; the number $a$ represents the rate at which the helix goes up.

    Figure 5. The Helix. The point $X$ traces out a helix: it sits at a height $a\theta$ above the point $Y$, while $Y$ runs around on a circle of radius $R$; here $\theta = \angle AOY$.

    7. The derivative of a vector function

    For a function $y = f(x)$ of one variable we had two ways of describing the derivative: on one hand we had a geometric description of $f'(x)$ as “the slope of the tangent to the graph,” and on the other we could describe $f'(x)$ in terms of a difference quotient, i.e.

    \[ f'(x) = \lim_{\Delta x\to 0} \frac{f(x+\Delta x) - f(x)}{\Delta x}. \]

    For vector functions we can imitate both descriptions. We begin with the formal definition in terms of limits and then proceed to the geometric description, in which we interpret the derivative as the “instantaneous velocity vector.”

    Definition. If $\vec{x}(t)$ is a vector function, then we set

    \[ \vec{x}\,'(t) \stackrel{\text{def}}{=} \lim_{\Delta t\to 0} \frac{\vec{x}(t+\Delta t) - \vec{x}(t)}{\Delta t}. \tag{22} \]

    For (22) to make sense we would have to define what the limit of a vector function is. This can be done, but we will not go into the precise definitions in this course. More important for our use is that if the components of a vector function $\vec{x}(t)$ are given, then the derivative can be computed by just differentiating those components:

    \[ \vec{x}\,'(t) = \begin{pmatrix} x'(t) \\ y'(t) \\ z'(t) \end{pmatrix}, \quad\text{or}\quad \vec{x}\,'(t) = x'(t)\,\vec{e}_1 + y'(t)\,\vec{e}_2 + z'(t)\,\vec{e}_3. \tag{23} \]

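    As with ordinary functions of one variable we will use Leibniz' notation for the derivative whenever it seems convenient. Thus the following are equivalent ways of expressing the same derivative:

    \[ \vec{a}\,'(t) = \frac{d\vec{a}(t)}{dt} = \frac{d}{dt}\,\vec{a}(t). \]

    Example. For instance,

    \[ \vec{x}(\theta) = \begin{pmatrix} \cos\theta \\ 0 \\ \theta \end{pmatrix} = \cos\theta\,\vec{e}_1 + \theta\,\vec{e}_3 \]

    defines a vector function. Here we have called the independent variable $\theta$ instead of $t$. The derivative of this vector function is

    \[ \frac{d\vec{x}}{d\theta} = \frac{d}{d\theta}\begin{pmatrix} \cos\theta \\ 0 \\ \theta \end{pmatrix} = \begin{pmatrix} -\sin\theta \\ 0 \\ 1 \end{pmatrix} = -\sin\theta\,\vec{e}_1 + \vec{e}_3. \]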
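To see that the componentwise derivative agrees with the limit definition (22), one can compare it with a difference quotient for a small Δt. This is only an illustrative sketch: the helper `difference_quotient`, the step size, and the sample value θ = 0.7 are choices made here and do not come from the notes.

```python
import math

def x(theta):
    """The vector function from the example: (cos(theta), 0, theta)."""
    return (math.cos(theta), 0.0, theta)

def x_prime(theta):
    """Its derivative, obtained by differentiating each component as in (23)."""
    return (-math.sin(theta), 0.0, 1.0)

def difference_quotient(f, t, dt=1e-6):
    """Componentwise (f(t + dt) - f(t)) / dt, the quotient appearing in (22)."""
    start, end = f(t), f(t + dt)
    return tuple((e - s) / dt for s, e in zip(start, end))

theta = 0.7  # arbitrary sample point
approx = difference_quotient(x, theta)
exact = x_prime(theta)
print("difference quotient:", approx)
print("componentwise rule :", exact)
```

The two printed vectors should agree to several decimal places, and the agreement improves as `dt` shrinks (until rounding errors take over).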

    8. The derivative as velocity vector

    Suppose the motion of some point $X(t)$ in space is described by its position vector function $\vec{x}(t)$. Let us try to define the instantaneous velocity of the point. This velocity should have magnitude (“how fast the point is moving”) and also direction (“which way is the point going?”). The velocity should therefore be a vector. To see which vector, we go back to the notion that “velocity” is always “displacement divided by time.”

    Figure 6. The vector function $\vec{x}(t)$ traces out a curve in space. The vector $\vec{x}(t)$ is the position vector of a point $X(t)$ on this curve. As we increase time from $t$ to $t+\Delta t$, the point $X(t)$ moves. The displacement of the point $X(t)$ is given by $\Delta\vec{x} = \vec{x}(t+\Delta t) - \vec{x}(t)$. The average velocity vector during this displacement is “displacement/time”, i.e. $\Delta\vec{x}/\Delta t$. If we let $\Delta t \to 0$, then the average velocity becomes the instantaneous velocity at time $t$: $\vec{v} = \lim_{\Delta t\to 0} \Delta\vec{x}/\Delta t = \vec{x}\,'(t)$. This vector is tangent to the curve traced out by the vector function $\vec{x}(t)$. We call it a tangent vector.

    We consider two instants in time, say, time $t$ and time $t+\Delta t$. Then the position vectors of the point $X$ at these two different times are $\vec{x}(t)$ and $\vec{x}(t+\Delta t)$. The displacement of the point $X$ between these two times is then

    \[ \Delta\vec{x} = \vec{x}(t+\Delta t) - \vec{x}(t) \]

    (see Figure 6). We say that the average velocity over the time interval from $t$ to $t+\Delta t$ is “the displacement divided by $\Delta t$,” i.e.

    \[ \vec{v}_{\text{average}} = \frac{\vec{x}(t+\Delta t) - \vec{x}(t)}{\Delta t}. \]

    Note that the average velocity is a vector. If we write it out in components, we get a much larger formula:

    \[ \vec{v}_{\text{average}} = \begin{pmatrix} \dfrac{x(t+\Delta t) - x(t)}{\Delta t} \\[2ex] \dfrac{y(t+\Delta t) - y(t)}{\Delta t} \\[2ex] \dfrac{z(t+\Delta t) - z(t)}{\Delta t} \end{pmatrix}. \]

    One big advantage of using vector notation is that many formulas simplify considerably when written in terms of vectors.

    To get the instantaneous velocity, we do the same thing as in one variable calculus: we take the limit as $\Delta t \to 0$ of the average velocity over the time interval from $t$ to $t+\Delta t$. Thus we get

    \[ \vec{v}(t) = \lim_{\Delta t\to 0} \frac{\vec{x}(t+\Delta t) - \vec{x}(t)}{\Delta t} \stackrel{\text{def}}{=} \frac{d\vec{x}}{dt}. \tag{24} \]

    In terms of components this derivative is

    \[ \vec{x}\,'(t) = \frac{d\vec{x}}{dt} = \begin{pmatrix} x'(t) \\ y'(t) \\ z'(t) \end{pmatrix}. \]

    Thus the velocity vector of any given vector function $\vec{x}(t)$ is the same as the derivative of this vector function.

    9. Acceleration

    Having found the velocity vector of a point $X(t)$ whose position vector is a given vector function $\overrightarrow{OX}(t) = \vec{x}(t)$, we can also define the acceleration vector of the moving point. By definition, the acceleration vector is the derivative of the velocity vector, i.e.

    \[ \vec{a}(t) = \frac{d\vec{v}}{dt} = \frac{d^2\vec{x}}{dt^2} = \begin{pmatrix} x''(t) \\ y''(t) \\ z''(t) \end{pmatrix}. \tag{25} \]

    This definition is entirely analogous to the definition of acceleration (“$a = \frac{dv}{dt}$”) from first semester calculus. The only difference is that, here, the position, velocity, and acceleration all have directions in addition to magnitudes: they are vectors.

    Newton's famous law relating forces and acceleration continues to hold. If a point $X(t)$ moves according to some vector function $\vec{x}(t)$, then some force must be acting on this point. This force is a vector (it has magnitude and direction), and, according to Newton, it is given by

    \[ \vec{F} = m\,\vec{a} = m\,\frac{d\vec{v}}{dt} = m\,\frac{d^2\vec{x}}{dt^2}, \tag{26} \]

    where $m$ is the mass of the object at the point $X(t)$ whose motion we are considering. It is always assumed to be a positive number.

    Note that according to this law, the absence of forces, i.e. $\vec{F} = \vec{0}$, is the same as $\frac{d\vec{v}}{dt} = \vec{0}$, i.e. no force acts on the point if and only if its velocity vector is constant. Here “constant” means constant magnitude and constant direction.

    10. The differentiation rules

    Just as with ordinary derivatives, the derivatives of vector functions satisfy certain rules, such as the product rule. The purpose of these rules is not the same as in one variable calculus. There we used sum, product, quotient and chain rules to compute derivatives of given functions without having to fall back on the definition of a derivative all the time. For vector functions we do not need such rules, because we can differentiate them by simply differentiating each of their components (see the above example). Instead, the differentiation rules for vector functions are mostly used to gain insight and establish general facts about vector functions, a number of which we will see shortly.

    10.1. The sum rule. The analog of the sum rule (“derivative of the sum is the sum of the derivatives”) looks exactly like the ordinary sum rule. It says that for any two vector functions $\vec{a}(t)$ and $\vec{b}(t)$ one has

    \[ \frac{d}{dt}\bigl(\vec{a}(t) \pm \vec{b}(t)\bigr) = \frac{d\vec{a}(t)}{dt} \pm \frac{d\vec{b}(t)}{dt}. \]

    10.2. The many product rules. There is no quotient rule for vector functions, simply because we have no way of dividing vectors. On the other hand we have two ways of multiplying vectors, and we can also multiply vectors and numbers, so there are three different product rules. Fortunately they all look like the product rule from first semester calculus.

    If $\vec{a}(t)$ and $\vec{b}(t)$ are vector functions, and if $f(t)$ is a function, then

    \[ \frac{d\,\vec{a}(t)\cdot\vec{b}(t)}{dt} = \frac{d\vec{a}(t)}{dt}\cdot\vec{b}(t) + \vec{a}(t)\cdot\frac{d\vec{b}(t)}{dt} \]

    \[ \frac{d\,\vec{a}(t)\times\vec{b}(t)}{dt} = \frac{d\vec{a}(t)}{dt}\times\vec{b}(t) + \vec{a}(t)\times\frac{d\vec{b}(t)}{dt} \]

    \[ \frac{d\,f(t)\,\vec{a}(t)}{dt} = \frac{df(t)}{dt}\,\vec{a}(t) + f(t)\,\frac{d\vec{a}(t)}{dt} \]

    In spite of the fact that these rules “look right,” they could still be wrong, so to be sure we would have to prove them. The proofs are very straightforward. Here is a short proof for the product rule involving the dot product, written out for vectors with two components. To shorten the formulas we omit the “$(t)$” from all functions:

    \begin{align*}
    \frac{d\,\vec{a}\cdot\vec{b}}{dt}
      &= \frac{d}{dt}\bigl(a_1b_1 + a_2b_2\bigr) \\
      &= \frac{d\,a_1b_1}{dt} + \frac{d\,a_2b_2}{dt} \\
      &= \frac{da_1}{dt}\,b_1 + a_1\frac{db_1}{dt} + \frac{da_2}{dt}\,b_2 + a_2\frac{db_2}{dt} && \text{(ordinary product rule)} \\
      &= \frac{da_1}{dt}\,b_1 + \frac{da_2}{dt}\,b_2 + a_1\frac{db_1}{dt} + a_2\frac{db_2}{dt} && \text{(switch terms around)} \\
      &= \frac{d\vec{a}}{dt}\cdot\vec{b} + \vec{a}\cdot\frac{d\vec{b}}{dt}. && \text{(recognize the dot products)}
    \end{align*}
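The proof above covers only the dot product. As a plausibility check (not a proof) of the cross product rule, the sketch below compares both sides numerically. The two vector functions `a` and `b`, their hand-computed derivatives, and the step size are hypothetical choices made purely for illustration.

```python
import math

def cross(u, v):
    """Cross product of two 3-component vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def add(u, v):
    return tuple(ui + vi for ui, vi in zip(u, v))

# Two sample vector functions and their componentwise derivatives.
def a(t):
    return (math.cos(t), math.sin(t), t)

def a_prime(t):
    return (-math.sin(t), math.cos(t), 1.0)

def b(t):
    return (t, t ** 2, 1.0)

def b_prime(t):
    return (1.0, 2 * t, 0.0)

def product_rule_rhs(t):
    """a'(t) x b(t) + a(t) x b'(t), the right hand side of the rule."""
    return add(cross(a_prime(t), b(t)), cross(a(t), b_prime(t)))

def central_difference(t, h=1e-5):
    """Numerical derivative of a(t) x b(t), the left hand side of the rule."""
    plus = cross(a(t + h), b(t + h))
    minus = cross(a(t - h), b(t - h))
    return tuple((p - m) / (2 * h) for p, m in zip(plus, minus))

t = 0.3
lhs = central_difference(t)
rhs = product_rule_rhs(t)
print("numerical derivative:", lhs)
print("product rule        :", rhs)
```

Note that the order of the factors matters in each cross product, since swapping them changes the sign.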

    11. Vector functions of constant length

    As an immediate application of the product rule for the dot product we prove the following fact about vector functions whose length does not change, i.e. vector functions $\vec{a}(t)$ that change their direction, but not their length.

    [Margin figure: the vectors $\vec{a}(t)$ and $\vec{a}(t+\Delta t)$ and their difference $\Delta\vec{a}$.] If a vector function $\vec{a}(t)$ has constant length, then, when the parameter $t$ undergoes a small change $\Delta t$, the corresponding small change $\Delta\vec{a}$ in the vector function will be almost perpendicular to $\vec{a}(t)$ itself.

    Theorem. Let $\vec{a}(t)$ be a vector function. Then a necessary and sufficient condition for the length $\|\vec{a}(t)\|$ to be constant is that $\vec{a}(t)$ and $\vec{a}\,'(t)$ be perpendicular for all $t$.

    Proof. Differentiating both sides of the equation

    \[ \|\vec{a}(t)\|^2 = \vec{a}(t)\cdot\vec{a}(t) \]

    we get

    \[ \frac{d}{dt}\|\vec{a}(t)\|^2 = \vec{a}\,'(t)\cdot\vec{a}(t) + \vec{a}(t)\cdot\vec{a}\,'(t) = 2\,\vec{a}(t)\cdot\vec{a}\,'(t). \tag{27} \]

    If $\vec{a}(t)$ has constant length, then $\|\vec{a}(t)\|^2$ is also constant, and thus $\frac{d}{dt}\|\vec{a}(t)\|^2 = 0$. Therefore, for a vector function $\vec{a}(t)$ whose length is constant, $\vec{a}(t)\cdot\vec{a}\,'(t) = 0$, i.e. $\vec{a}(t) \perp \vec{a}\,'(t)$.

    Conversely, if $\vec{a}(t)$ is a vector function for which $\vec{a}(t) \perp \vec{a}\,'(t)$ holds for all $t$, then $\vec{a}(t)\cdot\vec{a}\,'(t) = 0$, and (27) implies that $\frac{d}{dt}\|\vec{a}(t)\|^2 = 0$, i.e. that $\|\vec{a}(t)\|^2$ and hence $\|\vec{a}(t)\|$ are constant.


    12. Two examples

    12.1. Motion on a straight line. We return to the motion given by (18), i.e.

    \[ \vec{x}(t) = \vec{a} + t\,\vec{v}. \tag{28} \]

    The velocity and acceleration are easy to compute:

    \[ \frac{d\vec{x}(t)}{dt} = \vec{v}, \qquad \frac{d^2\vec{x}(t)}{dt^2} = \frac{d\vec{v}}{dt} = \vec{0}, \]

    since $\vec{v}$ is a constant vector in this case.

    We see that if a point $X(t)$ moves according to the parametrization (18), then its velocity is constant, and its acceleration is zero. According to Newton's law, no force is exerted on an object undergoing this motion.

    12.2. Circular motion. For the point $X(t)$ moving on a circle of radius $R$ with angular velocity $\omega$ we have (19), i.e.

    \[ \vec{x}(t) = R\cos\omega t\,\vec{e}_1 + R\sin\omega t\,\vec{e}_2 \]

    so that the velocity and acceleration are easy to compute:

    \begin{align*}
    \vec{v}(t) &= \vec{x}\,'(t) = -\omega R\sin\omega t\,\vec{e}_1 + \omega R\cos\omega t\,\vec{e}_2, \\
    \vec{a}(t) &= \vec{v}\,'(t) = -\omega^2 R\cos\omega t\,\vec{e}_1 - \omega^2 R\sin\omega t\,\vec{e}_2.
    \end{align*}

    Note that the velocity vector $\vec{v}(t)$ is perpendicular to the position vector $\vec{x}(t)$, as predicted in § 11. Our expression for the velocity vector $\vec{v}(t)$ contains the familiar relation between angular velocity and velocity: the velocity $v = \|\vec{v}(t)\|$ with which the point $X(t)$ is moving is

    \begin{align*}
    v(t) &= \|-\omega R\sin\omega t\,\vec{e}_1 + \omega R\cos\omega t\,\vec{e}_2\| \tag{29} \\
         &= \sqrt{\omega^2R^2\sin^2\omega t + \omega^2R^2\cos^2\omega t} \\
         &= |\omega| R.
    \end{align*}

    Hence, for $\omega > 0$, the angular velocity of an object undergoing circular motion is

    \[ \omega = \frac{v}{R}. \tag{30} \]

    Figure 7. If an object moves along a circle with constant angular velocity, then the force $\vec{F}$ required to make the object follow that motion is $\vec{F} = -m\omega^2\vec{x}$. In particular it is parallel to the position vector $\vec{x}$ but in the opposite direction.

    We also note that the acceleration is a multiple of the position vector:

    \[ \vec{a}(t) = -\omega^2\,\vec{x}(t). \]

    According to Newton the force acting on the object at $X(t)$ is $\vec{F} = m\,\vec{a} = -m\omega^2\,\vec{x}$, and its magnitude is

    \[ F = \|\vec{F}\| = \|m\omega^2\,\vec{x}(t)\| = m\omega^2 R, \tag{31} \]

    because $\|\vec{x}(t)\| = R$ at all times.

    Using (30) we can replace the angular velocity $\omega$ by the actual velocity, which leads to the classical formula for the centripetal force

    \[ F = \frac{mv^2}{R}. \tag{32} \]
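The relations of this section can be spot-checked numerically. The sketch below uses hypothetical values for the radius, angular velocity, and mass (they do not come from the notes) and verifies at one instant that the speed is |ω|R, that the velocity is perpendicular to the position, and that the force magnitudes in (31) and (32) agree.

```python
import math

R, omega, m = 2.0, 3.0, 0.5  # illustrative radius, angular velocity, and mass

def position(t):
    return (R * math.cos(omega * t), R * math.sin(omega * t))

def velocity(t):
    return (-omega * R * math.sin(omega * t), omega * R * math.cos(omega * t))

def acceleration(t):
    return (-omega ** 2 * R * math.cos(omega * t), -omega ** 2 * R * math.sin(omega * t))

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

t = 1.234  # arbitrary instant
speed = norm(velocity(t))
force = m * norm(acceleration(t))

print("speed, |omega| R   :", speed, abs(omega) * R)
print("x . v (should be 0):", dot(position(t), velocity(t)))
print("force, m v^2 / R   :", force, m * speed ** 2 / R)
```

Changing `omega` to a negative value leaves the speed and the force magnitude unchanged, which is why the absolute value appears in the comparison.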

    13. Arc length

    For any given vector function there is a simple formula for the length of the curve it traces out. The formula is essentially the same as the formula for the length of a parametric curve (or, to a lesser extent, of the graph of a function) that was described in Math 221. Here we repeat the intuitive derivation of the formula, written in terms of vectors this time.

    Let $\vec{x}(t)$ ($a \le t \le b$) be a vector function. To determine the length of the arc traced out by $X(t)$ as $t$ varies from $t = a$ to $b$, we divide the interval $a \le t \le b$ into many very short subintervals. The corresponding points $X(t)$ on the curve split the curve into many short segments, each of which will be “close to a line segment.” We approximate the length of the curve by adding the lengths of all these short segments. Finally we take the limit in which the number of partition points becomes infinite and our sum of lengths of short segments becomes an integral. To see which integral we get, we need to find an expression for the length of a short segment between two adjacent partition points on the curve.

    Suppose we have two points on the curve, with parameter values $t$ and $t + \Delta t$, respectively. The points are $X(t)$ and $X(t + \Delta t)$, and the distance between them is the length of the vector $\Delta\vec{x}$ from one point to the next. This vector is

    \[ \Delta\vec{x} = \vec{x}(t+\Delta t) - \vec{x}(t) = \frac{\vec{x}(t+\Delta t) - \vec{x}(t)}{\Delta t}\,\Delta t \approx \vec{x}\,'(t)\,\Delta t, \]

    so that its length is $\approx \|\vec{x}\,'(t)\|\,\Delta t$. Adding the lengths of the short segments together, we find that the length is approximately $\sum \|\vec{x}\,'(t)\|\,\Delta t$ (where the summation is over all short pieces of the curve). Taking the limit we arrive at this formula for the length of the curve traced out by $\vec{x}(t)$, $a \le t \le b$:

    \[ \text{Length} = \int_{t=a}^{b} \|\vec{x}\,'(t)\|\,dt. \tag{33} \]

    [Figure: a curve from its start ($t = a$) to its end ($t = b$), with one partition piece between $X(t)$ and $X(t+\Delta t)$ and the vector $\Delta\vec{x}$ joining them.]

    This integral looks simple, but that appearance turns out to be deceptive, as we find out when we write it in terms of the components of the vector function $\vec{x}(t)$. Suppose $\vec{x}(t) = x(t)\,\vec{e}_1 + y(t)\,\vec{e}_2 + z(t)\,\vec{e}_3$. Then

    \[ \vec{x}\,'(t) = x'(t)\,\vec{e}_1 + y'(t)\,\vec{e}_2 + z'(t)\,\vec{e}_3, \]

    so that

    \[ \|\vec{x}\,'(t)\| = \sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}. \]

    Therefore the length formula (33) of the curve is equivalent to

    \[ \text{Length} = \int_{t=a}^{b} \sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}\;dt. \tag{34} \]

    The square root makes this formula a reliable source of very difficult integrals. In fact the list of curves whose length one can actually compute by doing the integral is rather short (see Problem …).
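When the integral in (34) cannot be done by hand, it can still be approximated by the same sum of short segment lengths used in the derivation. The sketch below does this for one arch of the cycloid (20), using a midpoint rule. The helper names, the radius, and the number of subintervals are assumptions made for this illustration. The cycloid is one of the few curves where the exact answer is known: one arch has length 8R.

```python
import math

def cycloid_speed(theta, R):
    """||x'(theta)|| for the cycloid (20), written as in formula (34)."""
    dx = R - R * math.cos(theta)  # derivative of R*theta - R*sin(theta)
    dy = R * math.sin(theta)      # derivative of R - R*cos(theta)
    return math.sqrt(dx * dx + dy * dy)

def arc_length(speed, a, b, n=10000):
    """Midpoint-rule approximation of the integral of speed(t) from a to b."""
    dt = (b - a) / n
    return sum(speed(a + (k + 0.5) * dt) for k in range(n)) * dt

R = 2.0  # illustrative radius
length = arc_length(lambda th: cycloid_speed(th, R), 0.0, 2 * math.pi)
print("numerical length:", length)
print("exact value 8R  :", 8 * R)
```

Increasing `n` makes the partition finer, which mirrors the limit taken in the derivation of (33).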

    14. Arc length derivative

    Let $\vec{x}(t)$ be some vector function that describes the motion through space of some point $X(t)$, and let $f(t)$ be some other function. In what follows it will help to think of the parameter $t$ as “time.” Typical examples of functions $f$ that we might want to consider are $f(t) = \|\vec{x}(t)\|$ (the distance to the origin of the point $X(t)$) or $f(t) = \|\vec{x}\,'(t)\|$ (the speed at which the point is moving).

    To describe the rate with which $f(t)$ is changing we could compute its derivative,

    \[ \frac{df}{dt}, \]

    which tells us what the ratio between the change $\Delta f$ of $f$ and the change $\Delta t$ in the parameter $t$ is (at least approximately, if $\Delta t$ is small). If we interpret $t$ as “time” then this derivative tells us how fast $f(t)$ changes per second. But sometimes it is more useful to know how much $f$ changes after we have travelled a small distance along the curve, rather than after a short amount of time has passed. In other words, for two nearby points $X(t)$ and $X(t+\Delta t)$ on the curve we would like to know the ratio

    \[ \frac{\text{change in } f}{\text{distance travelled}} = \frac{f(t+\Delta t) - f(t)}{\text{distance from } X(t) \text{ to } X(t+\Delta t)}. \tag{35} \]

    We can work this out by observing that the distance from $X(t)$ to $X(t+\Delta t)$ is the length of the vector from $X(t)$ to $X(t+\Delta t)$, i.e.

    \[ \text{distance from } X(t) \text{ to } X(t+\Delta t) = \|\vec{x}(t+\Delta t) - \vec{x}(t)\|. \]

    Assuming $\Delta t > 0$ is small, we have

    \[ \|\vec{x}(t+\Delta t) - \vec{x}(t)\| = \left\|\frac{\vec{x}(t+\Delta t) - \vec{x}(t)}{\Delta t}\right\|\,\Delta t \approx \|\vec{x}\,'(t)\|\,\Delta t. \]

    We substitute this in (35), and get

    \[ \frac{\text{change in } f}{\text{distance travelled}} \approx \frac{f(t+\Delta t) - f(t)}{\|\vec{x}\,'(t)\|\,\Delta t}. \]

    Now let $\Delta t \to 0$: the quantity on the left becomes what is called the arc length derivative of the function $f$ along the curve $\vec{x}(t)$, which is commonly denoted by $\frac{df}{ds}$. In the quantity on the right we recognize the derivative of $f$ with respect to $t$ (time), which leads to

    \[ \frac{df}{ds} = \frac{1}{\|\vec{x}\,'(t)\|}\,\frac{df}{dt}. \tag{36} \]

    Here $\frac{df}{dt} = f'(t)$ is the usual derivative of $f$ with respect to $t$.

    If we want to emphasize the distinction between these two derivatives, then we can call $\frac{df}{dt}$ the “time derivative of $f$.”
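A short worked example, which is not in the original notes and uses a vector $\vec{v}$ and a parametrization chosen only for illustration, shows formula (36) at work. A point moves away from the origin along a straight line, but at a non-constant speed, and $f$ is its distance to the origin.

```latex
\[
  \vec{x}(t) = t^2\,\vec{v}, \qquad \vec{v} = 3\,\vec{e}_1 + 4\,\vec{e}_2, \qquad t > 0,
  \qquad f(t) = \|\vec{x}(t)\| = 5t^2 .
\]
\[
  \frac{df}{dt} = 10\,t, \qquad
  \|\vec{x}\,'(t)\| = \|2t\,\vec{v}\| = 10\,t, \qquad
  \frac{df}{ds} = \frac{1}{\|\vec{x}\,'(t)\|}\,\frac{df}{dt} = \frac{10\,t}{10\,t} = 1 .
\]
```

The time derivative $10t$ depends on how fast the point happens to be moving, while the arc length derivative is 1 everywhere: the distance to the origin grows by one unit for every unit of distance travelled along the line, as one would expect.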


    15. Unit Tangent and Curvature

    15.1. Unit tangent. We have seen that we can find a tangent vector to the curve traced out by some vector function $\vec{x}(t)$, simply by differentiating the vector function: $\vec{x}\,'(t)$ always provides a tangent vector (if $\vec{x}\,'(t) \ne \vec{0}$). In fact any multiple $\lambda\,\vec{x}\,'(t)$ of this vector will also be a tangent vector (provided $\lambda \ne 0$). We can single out one special tangent vector, by choosing $\lambda > 0$ so that $\lambda\,\vec{x}\,'(t)$ has length 1. (A vector with length 1 is called a unit vector.) Since for $\lambda > 0$ we have $\|\lambda\,\vec{x}\,'(t)\| = \lambda\|\vec{x}\,'(t)\|$, the value of $\lambda$ that will make $\lambda\,\vec{x}\,'(t)$ a unit vector is $\lambda = 1/\|\vec{x}\,'(t)\|$.

    For this reason the vector

    \[ \vec{T}(t) = \frac{d\vec{x}}{ds} = \frac{\vec{x}\,'(t)}{\|\vec{x}\,'(t)\|} \tag{37} \]

    is called the unit tangent vector to the curve corresponding to the vector function $\vec{x}(t)$.

    15.2. Example. For our constant velocity parametrization (18) of a straight line from § 3 we have

    \[ \vec{x}(t) = \vec{a} + t\,\vec{v}, \]

    so that $\vec{x}\,'(t) = \vec{v}$ and hence

    \[ \vec{T} = \frac{\vec{v}}{\|\vec{v}\|}. \]

    We see that the unit tangent vector is constant.

    15.3. Curvature and normal. If the curve described by a vector function $\vec{x}(t)$ is not a straight line, then the tangent to the curve will turn as one moves along the curve. The curvature vector $\vec{\kappa}$ measures how much the curve is curved. It is defined to be the rate of change of the unit tangent, but with respect to arc length instead of with respect to the given parameter $t$. Thus

    \[ \vec{\kappa} \stackrel{\text{def}}{=} \frac{d\vec{T}}{ds}. \tag{38} \]

    According to our definition of “derivative with respect to arc length” the right hand side stands for

    \[ \frac{d\vec{T}}{ds} = \frac{1}{\|\vec{x}\,'(t)\|}\,\frac{d\vec{T}}{dt}. \tag{39} \]

    To write this completely in terms of the original vector function $\vec{x}(t)$ we use (37):

    \[ \vec{\kappa} = \frac{1}{\|\vec{x}\,'(t)\|}\,\frac{d}{dt}\left\{ \frac{1}{\|\vec{x}\,'(t)\|}\,\frac{d\vec{x}}{dt} \right\}. \tag{40} \]

    This formula is not as short as the original definition (38), but it does show that the curvature vector comes about by differentiating the vector function $\vec{x}(t)$ twice (and dividing by $\|\vec{x}\,'(t)\|$ at the right moments).

    Theorem. The curvature vector $\vec{\kappa}$ is perpendicular to the tangent, i.e. $\vec{\kappa} \perp \vec{T}$.

    Proof. We have to show that $\vec{\kappa}\cdot\vec{T} = 0$. From the second form (39) of the definition of $\vec{\kappa}$ we see

    \[ \vec{\kappa}\cdot\vec{T} = \left( \frac{1}{\|\vec{x}\,'(t)\|}\,\frac{d\vec{T}}{dt} \right)\cdot\vec{T} = \frac{1}{\|\vec{x}\,'(t)\|}\,\frac{d\vec{T}}{dt}\cdot\vec{T}. \]

    Remember that $\vec{T}(t)$ is always a unit vector, i.e. $\vec{T}(t)$ has constant length: by § 11 this implies that $\frac{d\vec{T}}{dt} \perp \vec{T}(t)$ and thus $\frac{d\vec{T}}{dt}\cdot\vec{T} = 0$, so we are done. □

    There are two concepts that are derived from the curvature vector: the curvature $\kappa$ is by definition the length of the curvature vector $\vec{\kappa}$,

    \[ \kappa = \|\vec{\kappa}\| = \left\| \frac{d\vec{T}}{ds} \right\|, \tag{41} \]

    and the normal vector to the curve is

    \[ \vec{N} = \frac{\vec{\kappa}}{\|\vec{\kappa}\|} = \frac{\;\dfrac{d\vec{T}}{ds}\;}{\left\| \dfrac{d\vec{T}}{ds} \right\|}. \tag{42} \]

    The normal vector is undefined when $\vec{\kappa} = \vec{0}$, because it would require division by zero.

    Since $\vec{\kappa}$ is perpendicular to $\vec{T}$, the normal vector $\vec{N}$ is also perpendicular to $\vec{T}$ (hence its name). Combining (38), (41), and (42) we get

    \[ \frac{d\vec{T}}{ds} = \kappa\,\vec{N}. \tag{43} \]
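Formula (39) also gives a practical way to estimate curvature numerically. The sketch below is an illustration under assumptions made here: the helper names are hypothetical, and dT/dt is approximated by a central difference instead of being computed exactly. It is applied to the circular motion (19), for which the curvature is 1/R. Running it with two different angular velocities suggests that the curvature depends only on the curve and not on how fast it is traversed.

```python
import math

def norm(u):
    return math.sqrt(sum(c * c for c in u))

def unit_tangent(x_prime, t):
    """T(t) = x'(t) / ||x'(t)||, as in (37)."""
    v = x_prime(t)
    length = norm(v)
    return tuple(c / length for c in v)

def curvature(x_prime, t, h=1e-5):
    """kappa = ||dT/dt|| / ||x'(t)||, with dT/dt estimated by a central difference."""
    t_plus = unit_tangent(x_prime, t + h)
    t_minus = unit_tangent(x_prime, t - h)
    dT_dt = tuple((p - q) / (2 * h) for p, q in zip(t_plus, t_minus))
    return norm(dT_dt) / norm(x_prime(t))

def circle_velocity(R, omega):
    """Velocity x'(t) of the circular motion (19) with radius R and angular velocity omega."""
    def x_prime(t):
        return (-omega * R * math.sin(omega * t), omega * R * math.cos(omega * t))
    return x_prime

kappa_slow = curvature(circle_velocity(2.0, 1.0), 0.4)
kappa_fast = curvature(circle_velocity(2.0, 3.0), 0.4)
print("curvature at omega = 1:", kappa_slow)
print("curvature at omega = 3:", kappa_fast)
print("1 / R                 :", 1 / 2.0)
```

The function `curvature` only needs the velocity `x_prime`, so it can be reused for other curves by supplying their componentwise derivative.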

    16. Osculating plane

    At any point $X(t)$ on a space curve given by $\vec{x}(t)$ one defines the osculating plane to be the plane that contains the point $X(t)$ and that is parallel to both the tangent $\vec{T}(t)$ and normal $\vec{N}(t)$ of the curve.

    If we want to write a defining equation for the osculating plane as in § 11.2 then we need a vector perpendicular to the osculating plane. Since this plane is defined to be parallel to both $\vec{T}$ and $\vec{N}$, we can find a normal vector to the osculating plane by taking the cross product of $\vec{T}$ and $\vec{N}$. This vector is called the binormal to the curve. In a formula, it is defined to be

    \[ \vec{B} = \vec{T}\times\vec{N}. \tag{44} \]

    17. Problems

    1. Let $\ell$ be the line given by

    \[ \vec{x}(t) = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} + t \begin{pmatrix} -1 \\ 2 \\ 1 \end{pmatrix}. \]

    (a) Find the unit tangent vector, the curvature, and the tangent line to the line $\ell$ at the point where $t = 2$.

    (b) Find the unit tangent vector, the curvature, and the tangent line to the line $\ell$ at any point on the line.

    2. What sign does $\omega$ have in Figure 7? How would the figure change if we change the sign of $\omega$? Does the force $\vec{F}$ on the object change if we change the sign of $\omega$?

    3. Suppose a point $P$ is rotating around a line $\ell$, keeping its distance to the line fixed at $r$, and moving in a plane perpendicular to the line. Suppose the point has angular velocity $\omega$: this means that during a time interval of length $t$ the angle swept out by the line segment connecting $P$ to $\ell$ is exactly $\omega t$.

    In a previous math or physics class it was shown that the velocity of the point $P$ is $\omega r$, where $r$ is the distance from $P$ to the line $\ell$.

    The angular velocity vector is defined to be the vector $\vec{\omega}$ whose length is $\omega$, and that is parallel to the line $\ell$. There are two such vectors ($\pm\vec{\omega}$). By definition $\vec{\omega}$ points in the direction in which a screw would move if it were turning in the same direction as the point $P$.

    (a) Assuming the line $\ell$ passes through the origin, show from the drawing that the velocity vector $\vec{v}$ of the point $P$ is given by $\vec{\omega}\times\vec{x}$. You can do this in two steps, namely:

    (i) show that $\vec{\omega}\times\vec{x}$ has the same direction as $\vec{v}$,
    (ii) show that $\vec{\omega}\times\vec{x}$ has the same length as $\vec{v}$.

    (b) Show that the acceleration vector is given by $\vec{a} = \vec{\omega}\times(\vec{\omega}\times\vec{x})$. (Hint: don't use the drawing, but combine the definitions of $\vec{v}$ and $\vec{a}$ in (24) and (25) with the product rule; finally, keep in mind that you have just found that $\vec{v} = \vec{\omega}\times\vec{x}$.)

    (c) If someone told you they had computed the acceleration vector and found

    \[ \vec{a} = (\vec{\omega}\times\vec{\omega})\times\vec{x}, \]

    could they be right? Explain! What if they told you they got $\vec{a} = \vec{\omega}\times\vec{\omega}\times\vec{x}$?

    (d) True or False (explain your answers):

    (i) $\vec{v} \perp \vec{x}$? (ii) $\vec{a} \perp \vec{v}$? (iii) $\vec{a}$ and $\vec{x}$ are parallel?

    (e) Include the acceleration vector $\vec{a}$ in the above drawing.

    4. Consider the “twisted cubic,” i.e. the curve given by $\vec{x}(t) = t\,\vec{e}_1 + t^2\,\vec{e}_2 + t^3\,\vec{e}_3$.

    (a) Find a parametrization for the tangent line to the curve at the point where $t = 1$. Where does this tangent line intersect the $xy$-plane?

    (b) For any given $t$ find the tangent line to the curve at the point $X(t)$, and find where this line intersects the $xy$-plane.

    (c) If you call that intersection point $P(t)$, then which curve is traced out by the point $P(t)$ as $t$ varies?

    5. Compute the length of one full turn of the helix by taking the parametrization given in (21) and computing the length of the segment with $0 \le \theta \le 2\pi$.

    After computing the length, consider this: let $P$ be the perimeter of the circle underneath the helix, and let $H$ be the height achieved by one full turn of the helix. Show that the length $L$ of the helix satisfies $L^2 = P^2 + H^2$.

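    6. There is a multistory parking ramp where the way out is a path in the shape of a helix that is wound around the outside of the building. As a car drives down this path at night its headlights shine a spot on the ground. Which curve is traced out by this light spot as the car drives all the way down?

    Make a good drawing. Assume for simplicity that the center of the parking ramp is the $z$-axis.

    [Figure for Problem 3: a point $P$ with position vector $\vec{x}$ rotates at distance $r$ around the line $\ell$ through the origin; $\vec{\omega}$ points along $\ell$, the arc travelled in time $\Delta t$ is $\Delta s = r\,\Delta\theta = r\omega\,\Delta t$, and $\vec{v} = \vec{\omega}\times\vec{x}$.]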

    7. Compute the tangent, curvature, normal and binormal for the following curves.

    (a) The parabola: $\vec{x}(t) = \begin{pmatrix} t^2 \\ t \end{pmatrix}$. At which point on the curve is the curvature the largest?

    (b) Neile's parabola: $\vec{x}(t) = \begin{pmatrix} t^2 \\ t^3 \end{pmatrix}$. At which point on the curve is the curvature the largest?

    (c) The helix: $\vec{x}(\theta) = \begin{pmatrix} R\cos\theta \\ R\sin\theta \\ a\theta \end{pmatrix}$ (see § 6 for an explanation of the constants $R$ and $a$). At which point on the curve is the curvature the largest?

    (d) The graph of $y = e^x$ by using the parametrization $\vec{x}(t) = \begin{pmatrix} t \\ e^t \end{pmatrix}$. Where on the graph is the curvature the largest? •

  • CHAPTER 3

    Functions of more than one variable

    1. Functions of two variables and their graphs

    1.1. Definition. A function of two variables has two ingredients: a domain and a rule. The domain of the function is a collection of points in the $xy$-plane. For each point $(x, y)$ from the domain of the function, the rule should tell us how to find the function value $f(x, y)$.

    Just as with functions of one variable, the “rule” that gives us the function value is often specified by some formula, e.g. $f(x, y) = x + y$. The domain of a function is the set of points at which we define the function. This can in principle be any set of points in the plane. Typically the domain will be a rectangle, or a disc, or it could be the entire $xy$-plane, possibly with some points and lines removed.

    Figure 1. The graph of some function, and its domain (a rectangle in this example). The height of the graph above the point $(x, y)$ is $z = f(x, y)$.

    1.2. Graphs. By definition, the graph of a function $z = f(x, y)$ is the collection of all points $(x, y, z)$ in three dimensional space that satisfy the equation $z = f(x, y)$.

    The graph is usually a surface that floats above (or below) the domain of the function (see Figure 2).

  • 36 3. FUNCTIONS OF MORE THAN ONE VARIABLE

    1.3. Level sets. The graph of a function of two variables is a surface sitting in three dimensional space, which can be difficult to draw or visualize. Instead of looking at the graph we can also consider its level sets. If c is any real number, then, by definition, the level set at level c of the function is the set of all points (x, y) in the plane that satisfy f(x, y) = c.

    Figure 2. The graph of some function (top), and a construction of one of its level sets (bottom). Note that by definition the level set (“at level c”) is the curve in the xy-plane under the graph: it is obtained by intersecting the graph of the function with a horizontal plane at height c, and then projecting this curve of intersection onto the xy-plane.

    Since the level set is the set of all solutions to the equation f(x, y) = c, one often uses the notation $f^{-1}(c)$ (“f-inverse of c”) for the level set. We can summarize the definition in an equation:

    \[ f^{-1}(c) = \bigl\{ (x, y) : f(x, y) = c \bigr\}. \]

    Note that the definition says that $f^{-1}(c)$ is not a number, but a set of points!


    Level sets tend to be curves in the xy-plane, although in general level sets can have any shape (see Problem 5.13 for an example). They are usually easier to draw than the graphs of the corresponding functions.
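To make the definition of a level set concrete, here is a rough Python sketch that approximates a level set by collecting the grid points where f(x, y) is within a small tolerance of c. The sample function f(x, y) = x² + y², the grid, the tolerance, and the helper name `level_set_samples` are all assumptions chosen for illustration; this is a crude approximation, not a contour-tracing algorithm.

```python
import math


def f(x, y):
    return x * x + y * y


def level_set_samples(func, c, n=200, lo=-2.0, hi=2.0, tol=0.02):
    """Grid points (x, y) in [lo, hi]^2 where func(x, y) is within tol of c."""
    step = (hi - lo) / n
    points = []
    for i in range(n + 1):
        for j in range(n + 1):
            x, y = lo + i * step, lo + j * step
            if abs(func(x, y) - c) <= tol:
                points.append((x, y))
    return points


circle = level_set_samples(f, c=1.0)
radii = [math.hypot(x, y) for x, y in circle]
print(len(circle), min(radii), max(radii))   # all radii are close to 1
print(level_set_samples(f, c=-1.0))          # no point satisfies f(x, y) = -1
```

The result is a list of points rather than a number, in line with the remark that a level set is a set. For this f the level set at level 1 is (approximately) the unit circle, while the level set at level −1 is empty.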

    1.4. An example from the “real” world. Here is a function of local interest. The domain of the function is the water surface of Lake Mendota (let’s pretend this is a plane domain), and the function, which we will call d instead of f, is given by d(x, y) = the depth of the lake at location (x, y). There is no formula for this function, but the Wisconsin Department of Natural Resources has measured the depth and presented the results in terms of the level sets of the function d.

    Figure 3. The level curves of a function z = d(x, y). The domain of this function is the lake surface, and d(x, y) is the depth in meters of Lake Mendota at (x, y). To see the graph of the function we could try to drain the lake.

    See http://limnology.wisc.edu/lake_information/mendota/mendota.html

    1.5. A comment about language and set-theoretic notation. We will often say “consider a function z = f(x, y) ...”, but there is a sense in which this is incorrect. It is convenient to say “consider a function z = f(x, y) ...” since it not only names the function, but it also gives the independent variables x, y, and the dependent variable z a name. Nevertheless, the symbol in the equation z = f(x, y) that actually represents the function is “f”. The correct way of introducing the function¹ would be to say “consider a function f.”

    In fact, in the notation that is used in modern mathematics one would write “Consider the function $f : D \to \mathbb{R}$ ...” Here f is the name of the function we are introducing, D is the domain of that function (so D is a set of points in the plane), and $\mathbb{R}$ stands for the set of real numbers, indicating that computing f always results in a real number.

    ¹ Saying “consider the function z = f(x, y) ...” to introduce the function f is like saying “Please meet my brother Joe, Bill, and Sue” when you want to introduce your brother Joe, who happens to be standing next to Bill and Sue. To introduce your brother, you would of course say “Please meet my brother Joe.” and to introduce the function you should really say “Consider the function f.”

    1.6. Vector notation. If $\vec{x}$ is the position vector of the point (x, y) in the plane, i.e. if $\vec{x} = \begin{pmatrix} x \\ y \end{pmatrix}$, then one sometimes writes

    \[ f(x, y) = f(\vec{x}). \]

    Physicists have a preference for $\vec{r}$ instead of $\vec{x}$ (because they call the position vector the “radius vector”), and will write $f(x, y) = f(\vec{r})$.

    2. Linear functions

    The simplest functions of one variable are those of the form f(x) = ax + b. Their graphs are lines, and we called them linear functions.

    A linear function of two variables is a function f of the form

    \[ z = f(x, y) = ax + by + c, \tag{45} \]

    where a, b, c are constants.

    Figure 4. The graph of a linear function z = ax + by + c.

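    The graph of a linear function is always a plane. Indeed, the graph consists of all points (x, y, z) that satisfy the equation

    \[ -ax - by + z = c, \]

    which we can write as

    \[ \vec{n} \cdot \vec{x} = \vec{n} \cdot \vec{p}, \]

    where

    \[ \vec{n} = \begin{pmatrix} -a \\ -b \\ 1 \end{pmatrix}, \quad\text{and}\quad \vec{p} = \begin{pmatrix} 0 \\ 0 \\ c \end{pmatrix}. \]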
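The claim that every point of the graph satisfies n · x = n · p can be checked numerically. The following Python sketch does this for a few sample points; the coefficients a, b, c and the helper `dot` are arbitrary choices made for illustration.

```python
def dot(u, v):
    """Dot product of two vectors given as tuples of equal length."""
    return sum(ui * vi for ui, vi in zip(u, v))


a, b, c = 2.0, -3.0, 5.0      # arbitrary sample coefficients
n = (-a, -b, 1.0)             # the normal vector from the text
p = (0.0, 0.0, c)             # the point where the graph meets the z-axis


def f(x, y):
    return a * x + b * y + c


for x, y in [(0.0, 0.0), (1.0, 2.0), (-4.0, 0.5)]:
    point = (x, y, f(x, y))   # a point on the graph of f
    print(point, dot(n, point), dot(n, p))
```

For each sample point the last two printed numbers agree (both equal c), as the plane equation predicts.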


    3. Quadratic forms

    After learning about linear functions in pre-calculus one usually goes on to quadratic functions. We will do the same for functions of two variables and study Quadratic Forms. Just as in the one variable case where quadratic functions can have a maximum or minimum, quadratic forms provide examples of functions of two variables that can have a maximum or a minimum, or, it turns out, a third kind of “min-max” or “saddle shape.” They provide the basic profile of what we will run into when we look for local minima and maxima of functions of two variables. In particular, the technique of classifying quadratic forms by completing the square, which we will see in this section, is the key to the second derivative test for functions of more than one variable.

    3.1. Definition. The general quadratic form in two variables is

    \[ f(x, y) = Ax^2 + Bxy + Cy^2, \tag{46} \]

    where A, B, and C are constants. Depending on the values of these constants the graphs of the functions can have a number of different shapes.

    In addition to these quadratic forms one can also consider the more general class of quadratic functions,

    \[ f(x, y) = Ax^2 + Bxy + Cy^2 + Dx + Ey + F, \]

    which also have terms of degree 1 and 0. We will restrict ourselves to quadratic forms (for now).

    The prototypical examples. There are several important special cases that are representative of what the graphs of quadratic forms can look like. These special cases are

    \begin{align}
    f(x, y) &= x^2 + y^2, & \text{and}\quad g(x, y) &= -x^2 - y^2, \tag{47a} \\
    h(x, y) &= x^2, & \text{and}\quad \tilde{h}(x, y) &= -x^2, \tag{47b} \\
    k(x, y) &= xy. \tag{47c}
    \end{align}

    Their graphs are discussed in Figure 5.

    3.2. Classifying quadratic forms – the general procedure. All quadratic forms have graphs that look like one of the examples shown above – but how can we tell which it is? In other words, if Q(x, y) is a given quadratic form how can we tell if it is definite, indefinite, or semidefinite? How do we know for which (x, y) the form Q(x, y) is positive or negative? It turns out that we can always find out by using the trick of “completing the square.”

    The general procedure for a given quadratic form $Q(x, y) = Ax^2 + Bxy + Cy^2$ is as follows:

    (1) If A = 0, then we really have $Q = Bxy + Cy^2$ and we can factor Q as

    \[ Q(x, y) = (Bx + Cy)y. \]


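    (2) Assume $A \neq 0$. We factor out A, and complete the square for the first two terms:

    \[
    \begin{aligned}
    Q(x, y) &= A\Bigl\{ x^2 + \frac{B}{A}xy + \frac{C}{A}y^2 \Bigr\} \\
            &= A\Bigl\{ \Bigl(x + \frac{B}{2A}y\Bigr)^2 - \Bigl(\frac{B}{2A}y\Bigr)^2 + \frac{C}{A}y^2 \Bigr\} \\
            &= A\Bigl\{ \underbrace{\Bigl(x + \frac{B}{2A}y\Bigr)^2}_{u^2} + \underbrace{\frac{4AC - B^2}{4A^2}\,y^2}_{\pm v^2} \Bigr\}.
    \end{aligned}
    \]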

    (3) If $4AC - B^2 > 0$, then the expression in braces is positive (except at the origin, where it vanishes), and we can write

    \[ Q(x, y) = A(u^2 + v^2), \quad\text{where } u = x + \frac{B}{2A}y, \text{ and } v = \frac{\sqrt{4AC - B^2}}{2A}\,y. \]

    Depending on the sign of A our function is always positive or always negative away from the origin, and we say the form is positive definite or negative definite.

    The two forms f and g from (47a) are called definite, since they cannot change sign: $f(x, y) = x^2 + y^2$ is the sum of two squares, and therefore is always positive, unless both x and y vanish. Similarly, g(x, y) = −f(x, y) is always negative, except at (x, y) = (0, 0).

    The form $h(x, y) = x^2$ is called semidefinite because it too cannot change its sign. Clearly, $h(x, y) = x^2$ is never negative, but for h(x, y) to be positive, we need $x \neq 0$. So, the function h(x, y) is positive, except on the line x = 0 (the y axis). The graph of the function $\tilde{h}(x, y) = -x^2$ is similar, but upside down.

    The form k(x, y) = xy is called indefinite, because it can be both positive and negative: if x and y have the same sign, then xy > 0, but if they have opposite signs, then xy < 0. Thus the graph of z = xy lies above the xy-plane in the first and third quadrants, and below the xy-plane in the second and fourth quadrants.

    Figure 5. Graphs of some representative quadratic forms.


    (4) If $4AC - B^2 < 0$, then we have

    \[ Q(x, y) = A(u^2 - v^2), \quad\text{where } u = x + \frac{B}{2A}y, \text{ and } v = \frac{\sqrt{B^2 - 4AC}}{2A}\,y. \]

    When this happens we can factor the quadratic form, i.e. we have

    \[ Q(x, y) = A(u + v)(u - v). \]

    The form is indefinite.

    (5) In the only remaining case we have $4AC - B^2 = 0$, so that

    \[ Q(x, y) = A\Bigl(x + \frac{B}{2A}y\Bigr)^2. \]

    In this case the form is a perfect square (times A). The form is semi-definite.
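The outcome of the procedure depends only on the signs of 4AC − B² and A, so it can be summarized in a few lines of Python. This is a sketch and not part of the notes: the function name `classify_quadratic_form` is made up, and the case A = 0 is folded into the same test (when A = 0 we have 4AC − B² = −B², which is negative for B ≠ 0, where Q = (Bx + Cy)y changes sign, and zero for B = 0, where Q = Cy²).

```python
def classify_quadratic_form(A, B, C):
    """Classify Q(x, y) = A x^2 + B x y + C y^2 by the sign of 4AC - B^2."""
    disc = 4 * A * C - B * B
    if disc > 0:
        # A cannot vanish here: A = 0 would give disc = -B^2 <= 0.
        return "positive definite" if A > 0 else "negative definite"
    if disc < 0:
        return "indefinite"
    # disc == 0: a perfect square times a constant (or the zero form).
    return "semidefinite"


# The prototypical examples (47a)-(47c), given as coefficient triples (A, B, C).
print(classify_quadratic_form(1, 0, 1))     # x^2 + y^2
print(classify_quadratic_form(-1, 0, -1))   # -x^2 - y^2
print(classify_quadratic_form(1, 0, 0))     # x^2
print(classify_quadratic_form(0, 1, 0))     # xy
```

With integer coefficients the comparison with zero is exact. With floating-point coefficients the borderline case 4AC − B² = 0 would need a tolerance.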

    To understand this procedure it is perhaps best to look at how it works in some examples.

    3.3. Classifying quadratic forms – two examples.

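    3.3.1. An indefinite quadratic form. Consider the form $Q(x, y) = -3x^2 + 6xy + 9y^2$. We rewrite this as follows:

    \[
    \begin{aligned}
    Q &= -3x^2 + 6xy + 9y^2 \\
      &= -3\bigl(x^2 - 2xy - 3y^2\bigr) \\
      &= -3\bigl[\underbrace{x^2 - 2xy + y^2}_{\text{complete the square}} - 4y^2\bigr] \\
      &= -3\bigl[(x - y)^2 - 4y^2\bigr] && \text{a difference of two squares, so use } a^2 - b^2 = (a - b)(a + b) \\
      &= -3(x - y - 2y)(x - y + 2y) \\
      &= -3(x - 3y)(x + y).
    \end{aligned}
    \]

    This shows that Q(x, y) > 0 exactly when the factors x − 3y and x + y have opposite signs, i.e. when the point (x, y) lies above both of the lines $y = \frac{1}{3}x$ and $y = -x$, or below both of them. Similarly, Q(x, y) < 0 when (x, y) lies strictly between the two lines; in the right half plane x > 0 this is the region $-x < y < \frac{1}{3}x$.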
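A quick numerical check of this example, taking Q(x, y) = −3x² + 6xy + 9y², the form that the displayed computation starts from. The Python sketch below compares Q with its factored form on a small integer grid and confirms that Q is positive exactly where the two linear factors have opposite signs; the grid and the variable names are arbitrary choices.

```python
def Q(x, y):
    return -3 * x**2 + 6 * x * y + 9 * y**2


def Q_factored(x, y):
    return -3 * (x - 3 * y) * (x + y)


samples = [(x, y) for x in range(-5, 6) for y in range(-5, 6)]

# Points where the expanded and the factored form disagree (there should be none).
mismatches = [(x, y) for x, y in samples if Q(x, y) != Q_factored(x, y)]

# Points where "Q > 0" and "the factors have opposite signs" disagree.
sign_errors = [
    (x, y) for x, y in samples
    if (Q(x, y) > 0) != ((x - 3 * y) * (x + y) < 0)
]

print(mismatches, sign_errors)
print(Q(0, 1), Q(3, 0), Q(-3, 0), Q(3, 1))
```

The point (−3, 0) lies between the two lines in the left half plane, and Q is negative there; the point (3, 1) lies on the line x = 3y, where Q vanishes.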


    3.3.2. A positive definite quadratic form. To see a different example, consider the quadratic form $Q(x, y) = 2x^2 - 4xy + 6y^2$. By completing the square we can write it as

    \[
    \begin{aligned}
    Q(x, y) &= 2\bigl\{x^2 - 2xy + 3y^2\bigr\} \\
            &= 2\bigl\{x^2 - 2xy + y^2 + 2y^2\bigr\} && \text{the square is complete} \\
            &= 2\bigl\{(x - y)^2 + 2y^2\bigr\} \\
            &= 2(x - y)^2 + 4y^2.
    \end{aligned}
    \]

    We see that this particular quadratic form is positive definite.

    4. Functions in polar coordinates r, θ

    Recall that instead of using Cartesian coordinates (x, y) to specify the location of points in the plane, we can also use polar coordinates. In many cases it is much easier to describe a function using polar coordinates than in Cartesian coordinates.

    To go back and forth between Cartesian and Polar Coordinates we can use the following relations

    \begin{align}
    x &= r\cos\theta \tag{48a} \\
    y &= r\sin\theta \tag{48b} \\
    r &= \sqrt{x^2 + y^2} \tag{48c} \\
    \theta &= \arctan\frac{y}{x} \tag{48d}
    \end{align}

    The equation for θ is only valid for x > 0, where $-\frac{\pi}{2} < \theta < \frac{\pi}{2}$. In other regions of the plane there are other expressions relating θ to (x, y). See problem 5.8.
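In code, the restriction x > 0 in (48d) is usually avoided by using the two-argument arctangent. The Python sketch below uses `math.atan2` from the standard library, which returns an angle in the interval from −π to π for every point other than the origin; the helper names `to_polar` and `to_cartesian` are made up for this illustration.

```python
import math


def to_polar(x, y):
    """Return (r, theta), with theta computed by atan2 (between -pi and pi)."""
    return math.hypot(x, y), math.atan2(y, x)


def to_cartesian(r, theta):
    """Relations (48a) and (48b)."""
    return r * math.cos(theta), r * math.sin(theta)


r, theta = to_polar(-1.0, 1.0)     # a point in the second quadrant
print(r, theta)                    # theta is 3*pi/4
print(math.atan(1.0 / -1.0))       # arctan(y/x) gives -pi/4, which is wrong here
print(to_cartesian(r, theta))      # round trip: approximately (-1, 1)
print(to_polar(2.0, 1.0)[1], math.atan(1.0 / 2.0))   # for x > 0 both agree
```

For the point (−1, 1) the formula θ = arctan(y/x) returns the angle of the opposite point (1, −1), which illustrates why (48d) is only valid in the right half plane.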

    Figure 7. Polar coordinates are defined in the picture on the right (see also equations (48)). On the left: the set of points at which θ has one given value $\theta_0$ forms a half line emanating from the origin that makes an angle $\theta_0$ with the positive x-axis. The set of points at which r has a given value $r_0$ forms a circle centered at the origin, with radius $r_0$.

    The simplest kinds of functions one can consider in polar coordinates are those that only depend on one of those coordinates, i.e. functions that only depend on the radius r, and functions that only depend on the polar angle θ. Let’s look at some examples of such functions.

    Figure 8. Radially symmetric functions. The graph of z = r. [Labels in the figure: the surface $z = r = \sqrt{x^2 + y^2}$ over the xy-plane, and its profile $z = \Phi(r) = r$ in the rz-plane.]

    4.1. Radially symmetric functions. The functions

    \[ f(x, y) = x^2 + y^2, \quad g(x, y) = \sqrt{x^2 + y^2}, \quad h(x, y) = \ln\bigl(x^2 + y^2\bigr), \]

    all can be expressed in terms of the radius r only. Namely, using $r^2 = x^2 + y^2$, we have

    \[ f(x, y) = r^2, \quad g(x, y) = r, \quad h(x, y) = \ln r^2 \;(= 2\ln r). \]

    In general, a function z = f(x, y) that can be written in terms of the radius r only, i.e. a function for which there is some function Φ of one variable with

    \[ f(x, y) = \Phi(r), \quad\text{i.e.}\quad f(x, y) = \Phi\bigl(\sqrt{x^2 + y^2}\bigr), \]

    is called a radially symmetric function.

    Since a radially symmetric function only depends on the radius r, its level sets consist of circles centered at the origin (one exception: the origin, r = 0, can also be a level set, and this is obviously not a circle but a point.)

    As an example, we consider the function $g(x, y) = \sqrt{x^2 + y^2} = r$ in more detail. The function Φ of one variable here is Φ(r) = r. We can try to visualize the graph of g by first looking at the positive x-axis only. There we have $g(x, 0) = \sqrt{x^2} = x$. We get the graph of g by revolving the graph of z = x (for x ≥ 0) around the z-axis. See Figure 8.
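Radial symmetry means that rotating a point about the origin does not change the function value. The Python sketch below checks this for g(x, y) = √(x² + y²) at one sample point; the point (3, 4), the rotation angles, and the helper `rotate` are arbitrary choices made for illustration.

```python
import math


def g(x, y):
    return math.sqrt(x * x + y * y)


def rotate(x, y, angle):
    """Rotate the point (x, y) about the origin by the given angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return c * x - s * y, s * x + c * y


x0, y0 = 3.0, 4.0
values = [g(*rotate(x0, y0, k * math.pi / 6)) for k in range(12)]
print(values)   # every entry is g(3, 4) = 5, up to rounding
```

All twelve rotated points lie on the same level set of g, namely the circle of radius 5 centered at the origin.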

    4.2. Functions of θ only. Here are two functions that happen to depend on the polar angle θ only:

    \[ f(x, y) = \sin\theta, \quad h(x, y) = \theta. \]

    We can rewrite these functions in terms of x and y by using the relations between Cartesian and Polar coordinates (48). We get

    \[ f(x, y) = \sin\theta = \frac{y}{r} = \frac{y}{\sqrt{x^2 + y^2}} \]

    for f, and

    \[ h(x, y) = \theta = \arctan\frac{y}{x} \]

    for h, at least in the right half plane where x > 0.

    A function that only depends on θ is constant on rays emanating from the origin because the polar angle θ is constant on such rays. The level sets of such a function therefore consist of half-lines (“rays”) starting at the origin. Its graph consists of “spokes” attached to the z-axis. Each spoke lies above a ray in the xy-plane with some polar angle θ, and is attached to the z-axis at a height given by the function value. As we vary θ, the spoke rotates around the vertical axis and moves up or down, as dictated by the function. Figure 9 shows what happens for f(x, y) = sin θ.

    Figure 9. Two panels: the graph of a function z = f(θ) of θ only, which consists of horizontal spokes attached to the z-axis, each spoke lying above a ray; and the graph of z = sin θ (the x-axis is coming right at us).

    The function z = θ has a simpler formula in polar coordinates but actually has a more complicated graph. Let us try to visualize its graph: the spokes that make up the graph are horizontal, attached to the z-axis, and are at height θ. If we increase the angle θ the spokes go up at a steady rate in a way that should remind us of a helix (see § 6 and Figure 5). Based on this description its graph should look like the surface drawn in Figure 10. The surface is called the helicoid, and it is not the graph of a function (it fails the “vertical line test.”) We could have known this from the beginning, because when we described our function as h(x, y) = θ, we should have immediately asked which θ? The polar angle θ of any given point is only determined up to a multiple of 2π. The “graph” that we have drawn of the “function” z = θ reflects this. To make h(x, y) = θ into an honest function we have to say which of the many possible angles θ we choose when we are given a point. One possible choice is to always require the polar angle θ to lie between 0 and 2π (radians). More precisely, we can insist on

    \[ 0 \le \theta < 2\pi. \]

    If we do this then there is a unique angle θ for each point (x, y) in the plane other than the origin. The graph of this function is shown on the right in Figure 10.
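Choosing the branch 0 ≤ θ < 2π can be sketched in Python by reducing the output of `math.atan2` modulo 2π. The function name `polar_angle` is made up for this illustration, and the origin is rejected because no polar angle is defined there.

```python
import math


def polar_angle(x, y):
    """Polar angle of (x, y) != (0, 0), reduced to the interval [0, 2*pi).

    Floating-point rounding can return 2*pi itself for points that lie
    extremely close to, and just below, the positive x-axis.
    """
    if x == 0 and y == 0:
        raise ValueError("the polar angle is undefined at the origin")
    return math.atan2(y, x) % (2 * math.pi)


print(polar_angle(1.0, 0.0))      # on the positive x-axis: 0.0
print(polar_angle(0.0, -1.0))     # on the negative y-axis: 3*pi/2
print(polar_angle(1.0, -1e-9))    # just below the positive x-axis: almost 2*pi
```

The last two lines of output show the price of making θ single-valued: the function jumps by nearly 2π when the point crosses the positive x-axis. Other branches move this jump to a different ray.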

    5. Methods of visualizing the graph of a function

    5.1. Freezing a variable. If a function is not familiar, then a good strategy for drawing its graph is to “freeze a variable.” In other words, to analyze a function z = f(x, y) we pretend y is a constant: then x is the only independent variable, and we can try to draw the graph of the function z = f(x, y), now thinking of this as a function of only one variable. This graph is a curve in the xz plane. We get one such curve for each choice of y. Piecing these graphs together then gives us the graph of the two-variable function z = f(x, y).

    We could apply the same procedure with the roles of x and y switched: i.e. for each fixed x we try to graph z = f(x, y) as a function of the variable y only, after which we try to fit all the graphs we get for different values of x together.
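Freezing a variable has a direct counterpart in code: fixing one argument of a two-variable function leaves a one-variable function. The Python sketch below does this for the sample function f(x, y) = x² + y²; the helper name `frozen_slice` and the sample values are assumptions made for illustration.

```python
def f(x, y):
    return x * x + y * y


def frozen_slice(func, y0):
    """Return the one-variable function x -> func(x, y0), with y frozen at y0."""
    return lambda x: func(x, y0)


xs = [-2, -1, 0, 1, 2]
for y0 in [0, 1, 2]:
    slice_y0 = frozen_slice(f, y0)
    print(f"y = {y0}:", [slice_y0(x) for x in xs])
```

Each printed row samples one curve in the xz plane: the parabola z = x² shifted up by y0². Stacking these parabolas for all values of y0 gives the bowl-shaped graph of f.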

    Figure 10. The graph of z = θ is the helicoid. It is not the graph of a function, but one can extract a function by choosing a “branch” of the function. One possible choice, drawn here on the right, is to restrict the polar angle θ to the interval 0 ≤ θ < 2π. There are many other possible choices.

    5.2. Moving graphs. There is another way of visualizing a function z = f(x, y) of two variables in which we think of one of the independent variables (e.g. y) as “time.” The final picture is not one static image of a three dimensional surface, but rather a movie of a graph that is moving around in the xz plane.

    If we have a function z = f(x, y), then let us think of y as time, and let us relabel it as t, so that we are looking at the function z = f(x, t). Now at each moment in time t we can think of z = f(x, t) as a function of one variable x whose graph we can try to draw, regarding it as a still-image. Then, as we let time t vary, putting the still images in a sequence, we get a movie of a graph of a changing function of one variable.

    For instance, if the function is (once again) the saddle surface function z = xy, then we would be considering the function z = xt. At each moment t the graph of z = xt is a line with slope t. Putting these graphs together gives a movie which begins with a line of rather negative slope; during the movie the slope increases, and in the middle of the movie our line has achieved horizontality; finally, the closing shot presents us with a line with a very positive slope. Figure 11 shows some stills from the movie.

    Figure 11. The saddle movie. It’s about a line segment whose slope changes, even though it is otherwise stuck to the origin. The stills show the graph of z = xt in the xz plane at t = −1, −1/2, 0, 1/2, 1.

    This interpretation is not very different from the procedure of “freezing the y variable.” The only real difference lies in what we do with all the separate graphs we get after we freeze a variable. In one case we try to piece them together to make a bigger drawing of a three-dimensional object, in the other we put them together to make a motion picture.
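The saddle movie can be sketched in Python by producing one “still” per time value. In the snippet below, the helper name `frame` is made up, and the five time values are the ones shown in Figure 11; each still is the one-variable function x ↦ xt, whose slope is measured from two sample points.

```python
def frame(t):
    """Return the still image at time t: the one-variable function x -> x * t."""
    return lambda x: x * t


times = [-1.0, -0.5, 0.0, 0.5, 1.0]       # the stills shown in Figure 11
slopes = []
for t in times:
    z = frame(t)
    slope = (z(1.0) - z(0.0)) / (1.0 - 0.0)   # rise over run between x = 0 and x = 1
    slopes.append(slope)
    print(f"t = {t:+.1f}: slope = {slope:+.1f}")
```

Every still passes through the origin and has slope t, so the printed slopes increase from −1 to +1 as the movie plays. Drawing the same five lines side by side along a y-axis instead would be the “freezing a variable” picture of z = xy.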

    Problems

    In the problems in this s