6 Orthogonality and Least Squares
INTRODUCTORY EXAMPLE

The North American Datum and GPS Navigation

Imagine starting a massive project that you estimate will take ten years and require the efforts of scores of people to construct and solve a 1,800,000 by 900,000 system of linear equations. That is exactly what the National Geodetic Survey did in 1974, when it set out to update the North American Datum (NAD)—a network of 268,000 precisely located reference points that span the entire North American continent, together with Greenland, Hawaii, the Virgin Islands, Puerto Rico, and other Caribbean islands.

The recorded latitudes and longitudes in the NAD must be determined to within a few centimeters because they form the basis for all surveys, maps, legal property boundaries, and layouts of civil engineering projects such as highways and public utility lines. However, more than 200,000 new points had been added to the datum since the last adjustment in 1927, and errors had gradually accumulated over the years, due to imprecise measurements and shifts in the earth's crust. Data gathering for the NAD readjustment was completed in 1983.

The system of equations for the NAD had no solution in the ordinary sense, but rather had a least-squares solution, which assigned latitudes and longitudes to the reference points in a way that corresponded best to the 1.8 million observations. The least-squares solution was found in 1986 by solving a related system of so-called normal equations, which involved 928,735 equations in 928,735 variables.¹

More recently, knowledge of reference points on the ground has become crucial for accurately determining the locations of satellites in the satellite-based Global Positioning System (GPS). A GPS satellite calculates its position relative to the earth by measuring the time it takes for signals to arrive from three ground transmitters. To do this, the satellites use precise atomic clocks that have been synchronized with ground stations (whose locations are known accurately because of the NAD).

The Global Positioning System is used both for determining the locations of new reference points on the ground and for finding a user's position on the ground relative to established maps. When a car driver (or a mountain climber) turns on a GPS receiver, the receiver measures the relative arrival times of signals from at least three satellites. This information, together with the transmitted data about the satellites' locations and message times, is used to adjust the GPS receiver's time and to determine its approximate location on the earth. Given information from a fourth satellite, the GPS receiver can even establish its approximate altitude.

¹ A mathematical discussion of the solution strategy (along with details of the entire NAD project) appears in North American Datum of 1983, Charles R. Schwarz (ed.), National Geodetic Survey, National Oceanic and Atmospheric Administration (NOAA) Professional Paper NOS 2, 1989.
Both the NAD and GPS problems are solved by finding a vector that "approximately satisfies" an inconsistent system of equations. A careful explanation of this apparent contradiction will require ideas developed in the first five sections of this chapter.

In order to find an approximate solution to an inconsistent system of equations that has no actual solution, a well-defined notion of nearness is needed. Section 6.1 introduces the concepts of distance and orthogonality in a vector space. Sections 6.2 and 6.3 show how orthogonality can be used to identify the point within a subspace W that is nearest to a point y lying outside of W. By taking W to be the column space of a matrix, Section 6.5 develops a method for producing approximate ("least-squares") solutions for inconsistent linear systems, such as the system solved for the NAD report.

Section 6.4 provides another opportunity to see orthogonal projections at work, creating a matrix factorization widely used in numerical linear algebra. The remaining sections examine some of the many least-squares problems that arise in applications, including those in vector spaces more general than ℝⁿ.
6.1 INNER PRODUCT, LENGTH, AND ORTHOGONALITY
Geometric concepts of length, distance, and perpendicularity, which are well known for ℝ² and ℝ³, are defined here for ℝⁿ. These concepts provide powerful geometric tools for solving many applied problems, including the least-squares problems mentioned above. All three notions are defined in terms of the inner product of two vectors.

The Inner Product

If u and v are vectors in ℝⁿ, then we regard u and v as n × 1 matrices. The transpose uᵀ is a 1 × n matrix, and the matrix product uᵀv is a 1 × 1 matrix, which we write as a single real number (a scalar) without brackets. The number uᵀv is called the inner product of u and v, and often it is written as u·v. This inner product, mentioned in the exercises for Section 2.1, is also referred to as a dot product. If
    u = [u₁, u₂, …, uₙ]ᵀ   and   v = [v₁, v₂, …, vₙ]ᵀ

then the inner product of u and v is

    [u₁  u₂  ⋯  uₙ] [v₁, v₂, …, vₙ]ᵀ = u₁v₁ + u₂v₂ + ⋯ + uₙvₙ
EXAMPLE 1  Compute u·v and v·u for u = [2, −5, −1]ᵀ and v = [3, 2, −3]ᵀ.

SOLUTION

    u·v = uᵀv = (2)(3) + (−5)(2) + (−1)(−3) = −1
    v·u = vᵀu = (3)(2) + (2)(−5) + (−3)(−1) = −1
It is clear from the calculations in Example 1 why u·v = v·u. This commutativity of the inner product holds in general. The following properties of the inner product are easily deduced from properties of the transpose operation in Section 2.1. (See Exercises 21 and 22 at the end of this section.)
THEOREM 1  Let u, v, and w be vectors in ℝⁿ, and let c be a scalar. Then
a. u·v = v·u
b. (u + v)·w = u·w + v·w
c. (cu)·v = c(u·v) = u·(cv)
d. u·u ≥ 0, and u·u = 0 if and only if u = 0

Properties (b) and (c) can be combined several times to produce the following useful rule:

    (c₁u₁ + ⋯ + c_p u_p)·w = c₁(u₁·w) + ⋯ + c_p(u_p·w)
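The inner product is easy to compute directly from its definition. The following sketch (the helper name `dot` is mine, not the text's) spot-checks Example 1 and the commutativity and linearity properties of Theorem 1:

```python
# Inner product on R^n as a plain Python function, used to spot-check
# Example 1 and the properties in Theorem 1.

def dot(u, v):
    """Inner product u.v = u1*v1 + ... + un*vn."""
    return sum(ui * vi for ui, vi in zip(u, v))

u = [2, -5, -1]
v = [3, 2, -3]

print(dot(u, v))   # u.v from Example 1: -1
print(dot(v, u))   # commutativity, Theorem 1(a): also -1

# Theorem 1(b): (u + v).w = u.w + v.w, for an arbitrary w
w = [1, 4, 2]
lhs = dot([ui + vi for ui, vi in zip(u, v)], w)
rhs = dot(u, w) + dot(v, w)
print(lhs == rhs)  # True

# Theorem 1(c): (cu).v = c(u.v)
c = 7
print(dot([c * ui for ui in u], v) == c * dot(u, v))  # True
```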
The Length of a Vector

If v is in ℝⁿ, with entries v₁, …, vₙ, then the square root of v·v is defined because v·v is nonnegative.

DEFINITION  The length (or norm) of v is the nonnegative scalar ‖v‖ defined by

    ‖v‖ = √(v·v) = √(v₁² + v₂² + ⋯ + vₙ²),   and   ‖v‖² = v·v
Suppose v is in ℝ², say, v = [a, b]ᵀ. If we identify v with a geometric point in the plane, as usual, then ‖v‖ coincides with the standard notion of the length of the line segment from the origin to v. This follows from the Pythagorean Theorem applied to a triangle such as the one in Fig. 1.

[FIGURE 1  Interpretation of ‖v‖ as length: the point (a, b) lies at distance √(a² + b²) from the origin.]

A similar calculation with the diagonal of a rectangular box shows that the definition of length of a vector v in ℝ³ coincides with the usual notion of length. For any scalar c, the length of cv is |c| times the length of v. That is,

    ‖cv‖ = |c| ‖v‖

(To see this, compute ‖cv‖² = (cv)·(cv) = c²(v·v) = c²‖v‖² and take square roots.)
A vector whose length is 1 is called a unit vector. If we divide a nonzero vector v by its length—that is, multiply by 1/‖v‖—we obtain a unit vector u, because the length of u is (1/‖v‖)‖v‖ = 1. The process of creating u from v is sometimes called normalizing v, and we say that u is in the same direction as v.

Several examples that follow use the space-saving notation for (column) vectors.
EXAMPLE 2  Let v = (1, −2, 2, 0). Find a unit vector u in the same direction as v.

SOLUTION  First, compute the length of v:

    ‖v‖² = v·v = (1)² + (−2)² + (2)² + (0)² = 9,   so ‖v‖ = √9 = 3

Then, multiply v by 1/‖v‖ to obtain

    u = (1/‖v‖)v = (1/3)v = (1/3)(1, −2, 2, 0) = (1/3, −2/3, 2/3, 0)

To check that ‖u‖ = 1, it suffices to show that ‖u‖² = 1:

    ‖u‖² = u·u = (1/3)² + (−2/3)² + (2/3)² + (0)² = 1/9 + 4/9 + 4/9 + 0 = 1
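Normalization as in Example 2 is a two-step computation: find the length, then divide each entry by it. A minimal sketch (helper names `norm` and `normalize` are mine):

```python
# Normalizing a vector: divide a nonzero v by its length ||v|| = sqrt(v.v)
# to get a unit vector in the same direction, as in Example 2.
import math

def norm(v):
    """Length ||v|| = sqrt(v1^2 + ... + vn^2)."""
    return math.sqrt(sum(vi * vi for vi in v))

def normalize(v):
    """Unit vector in the same direction as a nonzero v."""
    n = norm(v)
    return [vi / n for vi in v]

v = [1, -2, 2, 0]
print(norm(v))      # 3.0, since ||v||^2 = 1 + 4 + 4 + 0 = 9
u = normalize(v)
print(u)            # [1/3, -2/3, 2/3, 0] as floats
print(norm(u))      # 1, up to floating-point roundoff
```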
EXAMPLE 3  Let W be the subspace of ℝ² spanned by x = (2/3, 1). Find a unit vector z that is a basis for W.

SOLUTION  W consists of all multiples of x, as in Fig. 2(a). Any nonzero vector in W is a basis for W. To simplify the calculation, "scale" x to eliminate fractions. That is, multiply x by 3 to get

    y = [2, 3]ᵀ

Now compute ‖y‖² = 2² + 3² = 13, ‖y‖ = √13, and normalize y to get

    z = (1/√13)[2, 3]ᵀ = [2/√13, 3/√13]ᵀ

See Fig. 2(b). Another unit vector is (−2/√13, −3/√13).

[FIGURE 2  Normalizing a vector to produce a unit vector: (a) the line W spanned by x; (b) the unit vector z on W.]
Distance in ℝⁿ

We are ready now to describe how close one vector is to another. Recall that if a and b are real numbers, the distance on the number line between a and b is the number |a − b|. Two examples are shown in Fig. 3. This definition of distance in ℝ has a direct analogue in ℝⁿ.

[FIGURE 3  Distances in ℝ: |2 − 8| = |−6| = 6 or |8 − 2| = |6| = 6, so 2 and 8 are 6 units apart; |(−3) − 4| = |−7| = 7 or |4 − (−3)| = |7| = 7, so −3 and 4 are 7 units apart.]
DEFINITION  For u and v in ℝⁿ, the distance between u and v, written as dist(u, v), is the length of the vector u − v. That is,

    dist(u, v) = ‖u − v‖

In ℝ² and ℝ³, this definition of distance coincides with the usual formulas for the Euclidean distance between two points, as the next two examples show.
EXAMPLE 4  Compute the distance between the vectors u = (7, 1) and v = (3, 2).

SOLUTION  Calculate

    u − v = [7, 1]ᵀ − [3, 2]ᵀ = [4, −1]ᵀ
    ‖u − v‖ = √(4² + (−1)²) = √17

The vectors u, v, and u − v are shown in Fig. 4. When the vector u − v is added to v, the result is u. Notice that the parallelogram in Fig. 4 shows that the distance from u to v is the same as the distance from u − v to 0.

[FIGURE 4  The distance between u and v is the length of u − v.]
EXAMPLE 5  If u = (u₁, u₂, u₃) and v = (v₁, v₂, v₃), then

    dist(u, v) = ‖u − v‖ = √((u − v)·(u − v))
               = √((u₁ − v₁)² + (u₂ − v₂)² + (u₃ − v₃)²)
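The distance formula translates directly into code. A small sketch (the helper name `dist` is mine), checked against Example 4:

```python
# dist(u, v) = ||u - v||: subtract entrywise, then take the length.
import math

def dist(u, v):
    diff = [ui - vi for ui, vi in zip(u, v)]
    return math.sqrt(sum(d * d for d in diff))

# Example 4: u = (7, 1), v = (3, 2) gives u - v = (4, -1) and distance sqrt(17)
print(dist([7, 1], [3, 2]))   # about 4.123, i.e. sqrt(17)
```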
Orthogonal Vectors

The rest of this chapter depends on the fact that the concept of perpendicular lines in ordinary Euclidean geometry has an analogue in ℝⁿ.

Consider ℝ² or ℝ³ and two lines through the origin determined by vectors u and v. The two lines shown in Fig. 5 are geometrically perpendicular if and only if the distance from u to v is the same as the distance from u to −v. This is the same as requiring the squares of the distances to be the same. Now

    [dist(u, −v)]² = ‖u − (−v)‖² = ‖u + v‖²
                   = (u + v)·(u + v)
                   = u·(u + v) + v·(u + v)        Theorem 1(b)
                   = u·u + u·v + v·u + v·v        Theorem 1(a), (b)
                   = ‖u‖² + ‖v‖² + 2u·v           Theorem 1(a)        (1)

[FIGURE 5  The segments from u to v (length ‖u − v‖) and from u to −v (length ‖u − (−v)‖).]
The same calculations with v and −v interchanged show that

    [dist(u, v)]² = ‖u‖² + ‖−v‖² + 2u·(−v)
                  = ‖u‖² + ‖v‖² − 2u·v

The two squared distances are equal if and only if 2u·v = −2u·v, which happens if and only if u·v = 0.

This calculation shows that when vectors u and v are identified with geometric points, the corresponding lines through the points and the origin are perpendicular if and only if u·v = 0. The following definition generalizes to ℝⁿ this notion of perpendicularity (or orthogonality, as it is commonly called in linear algebra).
DEFINITION  Two vectors u and v in ℝⁿ are orthogonal (to each other) if u·v = 0.

Observe that the zero vector is orthogonal to every vector in ℝⁿ because 0ᵀv = 0 for all v.

The next theorem provides a useful fact about orthogonal vectors. The proof follows immediately from the calculation in (1) above and the definition of orthogonality. The right triangle shown in Fig. 6 provides a visualization of the lengths that appear in the theorem.

THEOREM 2  The Pythagorean Theorem
Two vectors u and v are orthogonal if and only if ‖u + v‖² = ‖u‖² + ‖v‖².
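Theorem 2 is easy to confirm numerically for a specific orthogonal pair. A sketch (vectors and helper chosen by me):

```python
# Checking Theorem 2 on one orthogonal pair:
# u.v = 0 exactly when ||u + v||^2 = ||u||^2 + ||v||^2.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

u = [3, 1, 1]
v = [-1, 2, 1]                     # u.v = -3 + 2 + 1 = 0
s = [a + b for a, b in zip(u, v)]  # u + v

print(dot(u, v))                   # 0, so u and v are orthogonal
print(dot(s, s))                   # ||u + v||^2 = 17
print(dot(u, u) + dot(v, v))       # ||u||^2 + ||v||^2 = 11 + 6 = 17
```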
Orthogonal Complements

To provide practice using inner products, we introduce a concept here that will be of use in Section 6.3 and elsewhere in the chapter. If a vector z is orthogonal to every vector in a subspace W of ℝⁿ, then z is said to be orthogonal to W. The set of all vectors z that are orthogonal to W is called the orthogonal complement of W and is denoted by W⊥ (and read as "W perpendicular" or simply "W perp").

[FIGURE 6  The right triangle with legs of lengths ‖u‖ and ‖v‖ and hypotenuse of length ‖u + v‖.]
EXAMPLE 6  Let W be a plane through the origin in ℝ³, and let L be the line through the origin and perpendicular to W. If z and w are nonzero, z is on L, and w is in W, then the line segment from 0 to z is perpendicular to the line segment from 0 to w; that is, z·w = 0. See Fig. 7. So each vector on L is orthogonal to every w in W. In fact, L consists of all vectors that are orthogonal to the w's in W, and W consists of all vectors orthogonal to the z's in L. That is,

    L = W⊥   and   W = L⊥

[FIGURE 7  A plane and line through 0 as orthogonal complements.]

The following two facts about W⊥, with W a subspace of ℝⁿ, are needed later in the chapter. Proofs are suggested in Exercises 29 and 30. Exercises 27–31 provide excellent practice using properties of the inner product.

1. A vector x is in W⊥ if and only if x is orthogonal to every vector in a set that spans W.
2. W⊥ is a subspace of ℝⁿ.
The next theorem and Exercise 31 verify the claims made in Section 4.6 concerning the subspaces shown in Fig. 8. (Also see Exercise 28 in Section 4.6.)

[FIGURE 8  The fundamental subspaces determined by an m × n matrix A: Row A, Nul A, Col A, and Nul Aᵀ.]

THEOREM 3  Let A be an m × n matrix. The orthogonal complement of the row space of A is the null space of A, and the orthogonal complement of the column space of A is the null space of Aᵀ:

    (Row A)⊥ = Nul A   and   (Col A)⊥ = Nul Aᵀ

PROOF  The row–column rule for computing Ax shows that if x is in Nul A, then x is orthogonal to each row of A (with the rows treated as vectors in ℝⁿ). Since the rows of A span the row space, x is orthogonal to Row A. Conversely, if x is orthogonal to Row A, then x is certainly orthogonal to each row of A, and hence Ax = 0. This proves the first statement of the theorem. Since this statement is true for any matrix, it is true for Aᵀ. That is, the orthogonal complement of the row space of Aᵀ is the null space of Aᵀ. This proves the second statement, because Row Aᵀ = Col A.
Angles in ℝ² and ℝ³ (Optional)

If u and v are nonzero vectors in either ℝ² or ℝ³, then there is a nice connection between their inner product and the angle θ between the two line segments from the origin to the points identified with u and v. The formula is

    u·v = ‖u‖ ‖v‖ cos θ        (2)

To verify this formula for vectors in ℝ², consider the triangle shown in Fig. 9, with sides of lengths ‖u‖, ‖v‖, and ‖u − v‖. By the law of cosines,

    ‖u − v‖² = ‖u‖² + ‖v‖² − 2‖u‖ ‖v‖ cos θ

[FIGURE 9  The angle θ between two vectors, with vertices at (u₁, u₂) and (v₁, v₂).]

which can be rearranged to produce

    ‖u‖ ‖v‖ cos θ = (1/2)[‖u‖² + ‖v‖² − ‖u − v‖²]
                  = (1/2)[u₁² + u₂² + v₁² + v₂² − (u₁ − v₁)² − (u₂ − v₂)²]
                  = u₁v₁ + u₂v₂
                  = u·v

The verification for ℝ³ is similar. When n > 3, formula (2) may be used to define the angle between two vectors in ℝⁿ. In statistics, for instance, the value of cos θ defined by (2) for suitable vectors u and v is what statisticians call a correlation coefficient.
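Formula (2) can be inverted to recover the angle itself: cos θ = (u·v)/(‖u‖ ‖v‖), then apply arccos. A sketch (helper names mine, test vectors mine):

```python
# Recovering the angle between u and v from formula (2):
# cos(theta) = (u.v) / (||u|| ||v||), for nonzero u and v.
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def angle_degrees(u, v):
    cos_t = dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))
    return math.degrees(math.acos(cos_t))

print(angle_degrees([1, 0], [0, 1]))   # 90.0: orthogonal vectors
print(angle_degrees([1, 0], [1, 1]))   # 45, up to roundoff
```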
PRACTICE PROBLEMS

Let a = [−2, 1]ᵀ, b = [−3, 1]ᵀ, c = [4/3, −1, 2/3]ᵀ, and d = [5, 6, −1]ᵀ.

1. Compute (a·b)/(a·a) and ((a·b)/(a·a))a.
2. Find a unit vector u in the direction of c.
3. Show that d is orthogonal to c.
4. Use the results of Practice Problems 2 and 3 to explain why d must be orthogonal to the unit vector u.
6.1 EXERCISES

Compute the quantities in Exercises 1–8 using the vectors

    u = [−1, 2]ᵀ,  v = [4, 6]ᵀ,  w = [3, −1, −5]ᵀ,  x = [6, −2, 3]ᵀ

1. u·u, v·u, and (v·u)/(u·u)
2. w·w, x·w, and (x·w)/(w·w)
3. (1/(w·w))w
4. (1/(u·u))u
5. ((u·v)/(v·v))v
6. ((x·w)/(x·x))x
7. ‖w‖
8. ‖x‖

In Exercises 9–12, find a unit vector in the direction of the given vector.

9. [−30, 40]ᵀ
10. [−6, 4, −3]ᵀ
11. [7/4, 1/2, 1]ᵀ
12. [8/3, 2]ᵀ

13. Find the distance between x = [10, −3]ᵀ and y = [−1, −5]ᵀ.
14. Find the distance between u = [0, −5, 2]ᵀ and z = [−4, −1, 8]ᵀ.
Determine which pairs of vectors in Exercises 15–18 are orthogonal.

15. a = [8, −5]ᵀ, b = [−2, −3]ᵀ
16. u = [12, 3, −5]ᵀ, v = [2, −3, 3]ᵀ
17. u = [3, 2, −5, 0]ᵀ, v = [−4, 1, −2, 6]ᵀ
18. y = [−3, 7, 4, 0]ᵀ, z = [1, −8, 15, −7]ᵀ
In Exercises 19 and 20, all vectors are in ℝⁿ. Mark each statement True or False. Justify each answer.

19. a. v·v = ‖v‖².
    b. For any scalar c, u·(cv) = c(u·v).
    c. If the distance from u to v equals the distance from u to −v, then u and v are orthogonal.
    d. For a square matrix A, vectors in Col A are orthogonal to vectors in Nul A.
    e. If vectors v₁, …, v_p span a subspace W and if x is orthogonal to each v_j for j = 1, …, p, then x is in W⊥.

20. a. u·v − v·u = 0.
    b. For any scalar c, ‖cv‖ = c‖v‖.
    c. If x is orthogonal to every vector in a subspace W, then x is in W⊥.
    d. If ‖u‖² + ‖v‖² = ‖u + v‖², then u and v are orthogonal.
    e. For an m × n matrix A, vectors in the null space of A are orthogonal to vectors in the row space of A.

21. Use the transpose definition of the inner product to verify parts (b) and (c) of Theorem 1. Mention the appropriate facts from Chapter 2.

22. Let u = (u₁, u₂, u₃). Explain why u·u ≥ 0. When is u·u = 0?
23. Let u = [2, −5, −1]ᵀ and v = [−7, −4, 6]ᵀ. Compute and compare u·v, ‖u‖², ‖v‖², and ‖u + v‖². Do not use the Pythagorean Theorem.

24. Verify the parallelogram law for vectors u and v in ℝⁿ:

    ‖u + v‖² + ‖u − v‖² = 2‖u‖² + 2‖v‖²

25. Let v = [a, b]ᵀ. Describe the set H of vectors [x, y]ᵀ that are orthogonal to v. [Hint: Consider v = 0 and v ≠ 0.]
26. Let u = [5, −6, 7]ᵀ, and let W be the set of all x in ℝ³ such that u·x = 0. What theorem in Chapter 4 can be used to show that W is a subspace of ℝ³? Describe W in geometric language.

27. Suppose a vector y is orthogonal to vectors u and v. Show that y is orthogonal to the vector u + v.

28. Suppose y is orthogonal to u and v. Show that y is orthogonal to every w in Span{u, v}. [Hint: An arbitrary w in Span{u, v} has the form w = c₁u + c₂v. Show that y is orthogonal to such a vector w.]

[Figure: a vector y orthogonal to u, v, and an arbitrary w in Span{u, v}.]

29. Let W = Span{v₁, …, v_p}. Show that if x is orthogonal to each v_j, for 1 ≤ j ≤ p, then x is orthogonal to every vector in W.

30. Let W be a subspace of ℝⁿ, and let W⊥ be the set of all vectors orthogonal to W. Show that W⊥ is a subspace of ℝⁿ using the following steps.
    a. Take z in W⊥, and let u represent any element of W. Then z·u = 0. Take any scalar c and show that cz is orthogonal to u. (Since u was an arbitrary element of W, this will show that cz is in W⊥.)
    b. Take z₁ and z₂ in W⊥, and let u be any element of W. Show that z₁ + z₂ is orthogonal to u. What can you conclude about z₁ + z₂? Why?
    c. Finish the proof that W⊥ is a subspace of ℝⁿ.

31. Show that if x is in both W and W⊥, then x = 0.
32. [M] Construct a pair u, v of random vectors in ℝ⁴, and let

    A = ⎡ .5   .5   .5   .5 ⎤
        ⎢ .5   .5  −.5  −.5 ⎥
        ⎢ .5  −.5   .5  −.5 ⎥
        ⎣ .5  −.5  −.5   .5 ⎦

    a. Denote the columns of A by a₁, …, a₄. Compute the length of each column, and compute a₁·a₂, a₁·a₃, a₁·a₄, a₂·a₃, a₂·a₄, and a₃·a₄.
    b. Compute and compare the lengths of u, Au, v, and Av.
    c. Use equation (2) in this section to compute the cosine of the angle between u and v. Compare this with the cosine of the angle between Au and Av.
    d. Repeat parts (b) and (c) for two other pairs of random vectors. What do you conjecture about the effect of A on vectors?
33. [M] Generate random vectors x, y, and v in ℝ⁴ with integer entries (and v ≠ 0), and compute the quantities

    ((x·v)/(v·v))v,  ((y·v)/(v·v))v,  (((x + y)·v)/(v·v))v,  (((10x)·v)/(v·v))v

Repeat the computations with new random vectors x and y. What do you conjecture about the mapping x ↦ T(x) = ((x·v)/(v·v))v (for v ≠ 0)? Verify your conjecture algebraically.

34. [M] Let

    A = ⎡ −6    3  −27  −33  −13 ⎤
        ⎢  6   −5   25   28   14 ⎥
        ⎢  8   −6   34   38   18 ⎥
        ⎢ 12  −10   50   41   23 ⎥
        ⎣ 14  −21   49   29   33 ⎦

Construct a matrix N whose columns form a basis for Nul A, and construct a matrix R whose rows form a basis for Row A (see Section 4.6 for details). Perform a matrix computation with N and R that illustrates a fact from Theorem 3.
SOLUTIONS TO PRACTICE PROBLEMS

1. a·b = 7, a·a = 5. Hence (a·b)/(a·a) = 7/5, and ((a·b)/(a·a))a = (7/5)a = [−14/5, 7/5]ᵀ.

2. Scale c, multiplying by 3 to get y = [4, −3, 2]ᵀ. Compute ‖y‖² = 29 and ‖y‖ = √29. The unit vector in the direction of both c and y is

    u = (1/‖y‖)y = [4/√29, −3/√29, 2/√29]ᵀ

3. d is orthogonal to c, because

    d·c = [5, 6, −1]ᵀ · [4/3, −1, 2/3]ᵀ = 20/3 − 6 − 2/3 = 0

4. d is orthogonal to u because u has the form kc for some k, and

    d·u = d·(kc) = k(d·c) = k(0) = 0
6.2 ORTHOGONAL SETS

A set of vectors {u₁, …, u_p} in ℝⁿ is said to be an orthogonal set if each pair of distinct vectors from the set is orthogonal, that is, if u_i·u_j = 0 whenever i ≠ j.
EXAMPLE 1  Show that {u₁, u₂, u₃} is an orthogonal set, where

    u₁ = [3, 1, 1]ᵀ,  u₂ = [−1, 2, 1]ᵀ,  u₃ = [−1/2, −2, 7/2]ᵀ

SOLUTION  Consider the three possible pairs of distinct vectors, namely, {u₁, u₂}, {u₁, u₃}, and {u₂, u₃}.

    u₁·u₂ = 3(−1) + 1(2) + 1(1) = 0
    u₁·u₃ = 3(−1/2) + 1(−2) + 1(7/2) = 0
    u₂·u₃ = −1(−1/2) + 2(−2) + 1(7/2) = 0

Each pair of distinct vectors is orthogonal, and so {u₁, u₂, u₃} is an orthogonal set. See Fig. 1; the three line segments there are mutually perpendicular.

[FIGURE 1  The three mutually perpendicular vectors u₁, u₂, u₃ in ℝ³.]
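Checking every pair of distinct vectors, as in Example 1, is mechanical. A minimal sketch (the helper `is_orthogonal_set` is my name for it):

```python
# An orthogonal set: every pair of distinct vectors has inner product 0.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def is_orthogonal_set(vectors):
    n = len(vectors)
    return all(dot(vectors[i], vectors[j]) == 0
               for i in range(n) for j in range(i + 1, n))

u1 = [3, 1, 1]
u2 = [-1, 2, 1]
u3 = [-0.5, -2, 3.5]                         # Example 1's set

print(is_orthogonal_set([u1, u2, u3]))       # True
print(is_orthogonal_set([u1, u2, [1, 0, 0]]))  # False: u1.(1,0,0) = 3
```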
THEOREM 4  If S = {u₁, …, u_p} is an orthogonal set of nonzero vectors in ℝⁿ, then S is linearly independent and hence is a basis for the subspace spanned by S.

PROOF  If 0 = c₁u₁ + ⋯ + c_p u_p for some scalars c₁, …, c_p, then

    0 = 0·u₁ = (c₁u₁ + c₂u₂ + ⋯ + c_p u_p)·u₁
      = (c₁u₁)·u₁ + (c₂u₂)·u₁ + ⋯ + (c_p u_p)·u₁
      = c₁(u₁·u₁) + c₂(u₂·u₁) + ⋯ + c_p(u_p·u₁)
      = c₁(u₁·u₁)

because u₁ is orthogonal to u₂, …, u_p. Since u₁ is nonzero, u₁·u₁ is not zero and so c₁ = 0. Similarly, c₂, …, c_p must be zero. Thus S is linearly independent.
DEFINITION  An orthogonal basis for a subspace W of ℝⁿ is a basis for W that is also an orthogonal set.

The next theorem suggests why an orthogonal basis is much nicer than other bases: the weights in a linear combination can be computed easily.

THEOREM 5  Let {u₁, …, u_p} be an orthogonal basis for a subspace W of ℝⁿ. For each y in W, the weights in the linear combination

    y = c₁u₁ + ⋯ + c_p u_p

are given by

    c_j = (y·u_j)/(u_j·u_j)   (j = 1, …, p)

PROOF  As in the preceding proof, the orthogonality of {u₁, …, u_p} shows that

    y·u₁ = (c₁u₁ + c₂u₂ + ⋯ + c_p u_p)·u₁ = c₁(u₁·u₁)

Since u₁·u₁ is not zero, the equation above can be solved for c₁. To find c_j for j = 2, …, p, compute y·u_j and solve for c_j.
EXAMPLE 2  The set S = {u₁, u₂, u₃} in Example 1 is an orthogonal basis for ℝ³. Express the vector y = [6, 1, −8]ᵀ as a linear combination of the vectors in S.

SOLUTION  Compute

    y·u₁ = 11,   y·u₂ = −12,   y·u₃ = −33
    u₁·u₁ = 11,  u₂·u₂ = 6,    u₃·u₃ = 33/2

By Theorem 5,

    y = (y·u₁)/(u₁·u₁) u₁ + (y·u₂)/(u₂·u₂) u₂ + (y·u₃)/(u₃·u₃) u₃
      = (11/11)u₁ + (−12/6)u₂ + (−33/(33/2))u₃
      = u₁ − 2u₂ − 2u₃

Notice how easy it is to compute the weights needed to build y from an orthogonal basis. If the basis were not orthogonal, it would be necessary to solve a system of linear equations in order to find the weights, as in Chapter 1.
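Theorem 5 reduces each weight to a quotient of two inner products, with no linear system to solve. A sketch reproducing Example 2 (exact fractions keep the arithmetic clean; the code is mine, not the text's):

```python
# Theorem 5: with an orthogonal basis, c_j = (y.u_j) / (u_j.u_j).
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

basis = [[3, 1, 1],
         [-1, 2, 1],
         [Fraction(-1, 2), -2, Fraction(7, 2)]]   # Example 1's orthogonal basis
y = [6, 1, -8]

weights = [Fraction(dot(y, u)) / dot(u, u) for u in basis]
print([int(c) for c in weights])   # [1, -2, -2], matching Example 2
```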
We turn next to a construction that will become a key step in many calculations involving orthogonality, and it will lead to a geometric interpretation of Theorem 5.

An Orthogonal Projection

Given a nonzero vector u in ℝⁿ, consider the problem of decomposing a vector y in ℝⁿ into the sum of two vectors, one a multiple of u and the other orthogonal to u. We wish to write

    y = ŷ + z        (1)

where ŷ = αu for some scalar α and z is some vector orthogonal to u. See Fig. 2. Given any scalar α, let z = y − αu, so that (1) is satisfied. Then y − ŷ is orthogonal to u if and only if

    0 = (y − αu)·u = y·u − (αu)·u = y·u − α(u·u)

That is, (1) is satisfied with z orthogonal to u if and only if α = (y·u)/(u·u) and ŷ = ((y·u)/(u·u))u. The vector ŷ is called the orthogonal projection of y onto u, and the vector z is called the component of y orthogonal to u.

[FIGURE 2  Finding α to make y − ŷ orthogonal to u.]

If c is any nonzero scalar and if u is replaced by cu in the definition of ŷ, then the orthogonal projection of y onto cu is exactly the same as the orthogonal projection of y onto u (Exercise 31). Hence this projection is determined by the subspace L spanned by u (the line through u and 0). Sometimes ŷ is denoted by proj_L y and is called the orthogonal projection of y onto L. That is,

    ŷ = proj_L y = ((y·u)/(u·u))u        (2)
EXAMPLE 3  Let y = [7, 6]ᵀ and u = [4, 2]ᵀ. Find the orthogonal projection of y onto u. Then write y as the sum of two orthogonal vectors, one in Span{u} and one orthogonal to u.

SOLUTION  Compute

    y·u = [7, 6]ᵀ · [4, 2]ᵀ = 40
    u·u = [4, 2]ᵀ · [4, 2]ᵀ = 20

The orthogonal projection of y onto u is

    ŷ = ((y·u)/(u·u))u = (40/20)u = 2[4, 2]ᵀ = [8, 4]ᵀ

and the component of y orthogonal to u is

    y − ŷ = [7, 6]ᵀ − [8, 4]ᵀ = [−1, 2]ᵀ

The sum of these two vectors is y. That is,

    [7, 6]ᵀ = [8, 4]ᵀ + [−1, 2]ᵀ
      (y)      (ŷ)      (y − ŷ)

This decomposition of y is illustrated in Fig. 3. Note: If the calculations above are correct, then {ŷ, y − ŷ} will be an orthogonal set. As a check, compute

    ŷ·(y − ŷ) = [8, 4]ᵀ · [−1, 2]ᵀ = −8 + 8 = 0

Since the line segment in Fig. 3 between y and ŷ is perpendicular to L, by construction of ŷ, the point identified with ŷ is the closest point of L to y. (This can be proved from geometry. We will assume this for ℝ² now and prove it for ℝⁿ in Section 6.3.)

[FIGURE 3  The orthogonal projection of y onto a line L through the origin.]

EXAMPLE 4  Find the distance in Fig. 3 from y to L.

SOLUTION  The distance from y to L is the length of the perpendicular line segment from y to the orthogonal projection ŷ. This length equals the length of y − ŷ. Thus the distance is

    ‖y − ŷ‖ = √((−1)² + 2²) = √5
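Formula (2) and the distance computation of Example 4 fit in a few lines. A sketch (helper names `dot`, `project` are mine) reproducing Examples 3 and 4:

```python
# Orthogonal projection onto the line spanned by a nonzero u, formula (2):
# y_hat = ((y.u)/(u.u)) u, and z = y - y_hat is orthogonal to u.
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project(y, u):
    alpha = dot(y, u) / dot(u, u)
    return [alpha * ui for ui in u]

y = [7, 6]
u = [4, 2]
y_hat = project(y, u)
z = [yi - pi for yi, pi in zip(y, y_hat)]

print(y_hat)                  # [8.0, 4.0], as in Example 3
print(z)                      # [-1.0, 2.0]
print(dot(y_hat, z))          # 0.0: the two pieces are orthogonal
print(math.sqrt(dot(z, z)))   # sqrt(5): the distance from y to L, Example 4
```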
A Geometric Interpretation of Theorem 5

The formula for the orthogonal projection ŷ in (2) has the same appearance as each of the terms in Theorem 5. Thus Theorem 5 decomposes a vector y into a sum of orthogonal projections onto one-dimensional subspaces.

It is easy to visualize the case in which W = ℝ² = Span{u₁, u₂}, with u₁ and u₂ orthogonal. Any y in ℝ² can be written in the form

    y = ((y·u₁)/(u₁·u₁))u₁ + ((y·u₂)/(u₂·u₂))u₂        (3)

The first term in (3) is the projection of y onto the subspace spanned by u₁ (the line through u₁ and the origin), and the second term is the projection of y onto the subspace spanned by u₂. Thus (3) expresses y as the sum of its projections onto the (orthogonal) axes determined by u₁ and u₂. See Fig. 4.

[FIGURE 4  A vector decomposed into the sum of two projections.]

Theorem 5 decomposes each y in Span{u₁, …, u_p} into the sum of p projections onto one-dimensional subspaces that are mutually orthogonal.
Decomposing a Force into Component Forces

The decomposition in Fig. 4 can occur in physics when some sort of force is applied to an object. Choosing an appropriate coordinate system allows the force to be represented by a vector y in ℝ² or ℝ³. Often the problem involves some particular direction of interest, which is represented by another vector u. For instance, if the object is moving in a straight line when the force is applied, the vector u might point in the direction of movement, as in Fig. 5. A key step in the problem is to decompose the force into a component in the direction of u and a component orthogonal to u. The calculations would be analogous to those made in Example 3 above.

[FIGURE 5  A force y decomposed relative to a direction u.]
Orthonormal Sets

A set {u₁, …, u_p} is an orthonormal set if it is an orthogonal set of unit vectors. If W is the subspace spanned by such a set, then {u₁, …, u_p} is an orthonormal basis for W, since the set is automatically linearly independent, by Theorem 4.

The simplest example of an orthonormal set is the standard basis {e₁, …, eₙ} for ℝⁿ. Any nonempty subset of {e₁, …, eₙ} is orthonormal, too. Here is a more complicated example.
EXAMPLE 5  Show that {v₁, v₂, v₃} is an orthonormal basis of ℝ³, where

    v₁ = [3/√11, 1/√11, 1/√11]ᵀ,  v₂ = [−1/√6, 2/√6, 1/√6]ᵀ,  v₃ = [−1/√66, −4/√66, 7/√66]ᵀ

SOLUTION  Compute

    v₁·v₂ = −3/√66 + 2/√66 + 1/√66 = 0
    v₁·v₃ = −3/√726 − 4/√726 + 7/√726 = 0
    v₂·v₃ = 1/√396 − 8/√396 + 7/√396 = 0

Thus {v₁, v₂, v₃} is an orthogonal set. Also,

    v₁·v₁ = 9/11 + 1/11 + 1/11 = 1
    v₂·v₂ = 1/6 + 4/6 + 1/6 = 1
    v₃·v₃ = 1/66 + 16/66 + 49/66 = 1

which shows that v₁, v₂, and v₃ are unit vectors. Thus {v₁, v₂, v₃} is an orthonormal set. Since the set is linearly independent, its three vectors form a basis for ℝ³. See Fig. 6.

[FIGURE 6  The orthonormal basis vectors v₁, v₂, v₃ in ℝ³.]

When the vectors in an orthogonal set of nonzero vectors are normalized to have unit length, the new vectors will still be orthogonal, and hence the new set will be an orthonormal set. See Exercise 32. It is easy to check that the vectors in Fig. 6 (Example 5) are simply the unit vectors in the directions of the vectors in Fig. 1 (Example 1).
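That last remark can be checked directly: normalizing the orthogonal set of Example 1 yields the orthonormal set of Example 5. A sketch (code mine):

```python
# Normalizing an orthogonal set of nonzero vectors produces an orthonormal set.
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [vi / n for vi in v]

orthogonal = [[3, 1, 1], [-1, 2, 1], [-0.5, -2, 3.5]]   # Example 1
orthonormal = [normalize(v) for v in orthogonal]

for v in orthonormal:
    print(round(dot(v, v), 12))   # each is 1, up to roundoff
# Distinct pairs are still orthogonal, up to roundoff:
print(round(dot(orthonormal[0], orthonormal[1]), 12))
```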
Matrices whose columns form an orthonormal set are important in applications and in computer algorithms for matrix computations. Their main properties are given in Theorems 6 and 7.

THEOREM 6  An m × n matrix U has orthonormal columns if and only if UᵀU = I.

PROOF  To simplify notation, we suppose that U has only three columns, each a vector in ℝᵐ. The proof of the general case is essentially the same. Let U = [u₁ u₂ u₃] and compute

    UᵀU = ⎡ u₁ᵀ ⎤ [u₁  u₂  u₃] = ⎡ u₁ᵀu₁  u₁ᵀu₂  u₁ᵀu₃ ⎤
          ⎢ u₂ᵀ ⎥                ⎢ u₂ᵀu₁  u₂ᵀu₂  u₂ᵀu₃ ⎥        (4)
          ⎣ u₃ᵀ ⎦                ⎣ u₃ᵀu₁  u₃ᵀu₂  u₃ᵀu₃ ⎦

The entries in the matrix at the right are inner products, using transpose notation. The columns of U are orthogonal if and only if

    u₁ᵀu₂ = u₂ᵀu₁ = 0,  u₁ᵀu₃ = u₃ᵀu₁ = 0,  u₂ᵀu₃ = u₃ᵀu₂ = 0        (5)

The columns of U all have unit length if and only if

    u₁ᵀu₁ = 1,  u₂ᵀu₂ = 1,  u₃ᵀu₃ = 1        (6)

The theorem follows immediately from (4)–(6).
THEOREM 7  Let U be an m × n matrix with orthonormal columns, and let x and y be in ℝⁿ. Then
a. ‖Ux‖ = ‖x‖
b. (Ux)·(Uy) = x·y
c. (Ux)·(Uy) = 0 if and only if x·y = 0

Properties (a) and (c) say that the linear mapping x ↦ Ux preserves lengths and orthogonality. These properties are crucial for many computer algorithms. See Exercise 25 for the proof of Theorem 7.
EXAMPLE 6  Let

    U = ⎡ 1/√2    2/3 ⎤        x = [√2, 3]ᵀ
        ⎢ 1/√2   −2/3 ⎥
        ⎣  0      1/3 ⎦

Notice that U has orthonormal columns and

    UᵀU = ⎡ 1/√2   1/√2    0  ⎤ ⎡ 1/√2    2/3 ⎤ = ⎡ 1  0 ⎤
          ⎣ 2/3   −2/3    1/3 ⎦ ⎢ 1/√2   −2/3 ⎥   ⎣ 0  1 ⎦
                                ⎣  0      1/3 ⎦

Verify that ‖Ux‖ = ‖x‖.

SOLUTION

    Ux = ⎡ 1/√2    2/3 ⎤ [√2, 3]ᵀ = ⎡  3 ⎤
         ⎢ 1/√2   −2/3 ⎥            ⎢ −1 ⎥
         ⎣  0      1/3 ⎦            ⎣  1 ⎦

    ‖Ux‖ = √(9 + 1 + 1) = √11
    ‖x‖ = √(2 + 9) = √11
Theorems 6 and 7 are particularly useful when applied to square matrices. An orthogonal matrix is a square invertible matrix U such that U⁻¹ = Uᵀ. By Theorem 6, such a matrix has orthonormal columns.¹ It is easy to see that any square matrix with orthonormal columns is an orthogonal matrix. Surprisingly, such a matrix must have orthonormal rows, too. See Exercises 27 and 28. Orthogonal matrices will appear frequently in Chapter 7.

EXAMPLE 7  The matrix

    U = ⎡ 3/√11   −1/√6   −1/√66 ⎤
        ⎢ 1/√11    2/√6   −4/√66 ⎥
        ⎣ 1/√11    1/√6    7/√66 ⎦

is an orthogonal matrix because it is square and because its columns are orthonormal, by Example 5. Verify that the rows are orthonormal, too!
PRACTICE PROBLEMS

1. Let u₁ = [−1/√5, 2/√5]ᵀ and u₂ = [2/√5, 1/√5]ᵀ. Show that {u₁, u₂} is an orthonormal basis for ℝ².

2. Let y and L be as in Example 3 and Fig. 3. Compute the orthogonal projection ŷ of y onto L using u = [2, 1]ᵀ instead of the u in Example 3.

3. Let U and x be as in Example 6, and let y = [−3√2, 6]ᵀ. Verify that (Ux)·(Uy) = x·y.
6.2 EXERCISES

In Exercises 1–6, determine which sets of vectors are orthogonal.

1. [−1, 4, −3]ᵀ, [5, 2, 1]ᵀ, [3, −4, −7]ᵀ
2. [1, −2, 1]ᵀ, [0, 1, 2]ᵀ, [−5, −2, 1]ᵀ
3. [2, −7, −1]ᵀ, [−6, −3, 9]ᵀ, [3, 1, −1]ᵀ
4. [2, −5, −3]ᵀ, [0, 0, 0]ᵀ, [4, −2, 6]ᵀ
5. [3, −2, 1, 3]ᵀ, [−1, 3, −3, 4]ᵀ, [3, 8, 7, 0]ᵀ
6. [5, −4, 0, 3]ᵀ, [−4, 1, −3, 8]ᵀ, [3, 3, 5, −1]ᵀ
In Exercises 7–10, show that {u₁, u₂} or {u₁, u₂, u₃} is an orthogonal basis for ℝ² or ℝ³, respectively. Then express x as a linear combination of the u's.

7. u₁ = [2, −3]ᵀ, u₂ = [6, 4]ᵀ, and x = [9, −7]ᵀ
8. u₁ = [3, 1]ᵀ, u₂ = [−2, 6]ᵀ, and x = [−6, 3]ᵀ
9. u₁ = [1, 0, 1]ᵀ, u₂ = [−1, 4, 1]ᵀ, u₃ = [2, 1, −2]ᵀ, and x = [8, −4, −3]ᵀ
10. u₁ = [3, −3, 0]ᵀ, u₂ = [2, 2, −1]ᵀ, u₃ = [1, 1, 4]ᵀ, and x = [5, −3, 1]ᵀ

¹ A better name might be orthonormal matrix, and this term is found in some statistics texts. However, orthogonal matrix is the standard term in linear algebra.
11. Compute the orthogonal projection of [1, 7]ᵀ onto the line through [−4, 2]ᵀ and the origin.

12. Compute the orthogonal projection of [1, −1]ᵀ onto the line through [−1, 3]ᵀ and the origin.

13. Let y = [2, 3]ᵀ and u = [4, −7]ᵀ. Write y as the sum of two orthogonal vectors, one in Span{u} and one orthogonal to u.

14. Let y = [2, 6]ᵀ and u = [7, 1]ᵀ. Write y as the sum of a vector in Span{u} and a vector orthogonal to u.

15. Let y = [3, 1]ᵀ and u = [8, 6]ᵀ. Compute the distance from y to the line through u and the origin.

16. Let y = [−3, 9]ᵀ and u = [1, 2]ᵀ. Compute the distance from y to the line through u and the origin.
In Exercises 17–22, determine which sets of vectors are orthonormal. If a set is only orthogonal, normalize the vectors to produce an orthonormal set.

17. [1/3, 1/3, 1/3]ᵀ, [−1/2, 0, 1/2]ᵀ
18. [0, 1, 0]ᵀ, [0, −1, 0]ᵀ
19. [−.6, .8]ᵀ, [.8, .6]ᵀ
20. [−2/3, 1/3, 2/3]ᵀ, [1/3, 2/3, 0]ᵀ
21. [1/√10, 3/√20, 3/√20]ᵀ, [3/√10, −1/√20, −1/√20]ᵀ, [0, −1/√2, 1/√2]ᵀ
22. [1/√18, 4/√18, 1/√18]ᵀ, [1/√2, 0, −1/√2]ᵀ, [−2/3, 1/3, −2/3]ᵀ
In Exercises 23 and 24, all vectors are in ℝⁿ. Mark each statement True or False. Justify each answer.

23. a. Not every linearly independent set in ℝⁿ is an orthogonal set.
    b. If y is a linear combination of nonzero vectors from an orthogonal set, then the weights in the linear combination can be computed without row operations on a matrix.
    c. If the vectors in an orthogonal set of nonzero vectors are normalized, then some of the new vectors may not be orthogonal.
    d. A matrix with orthonormal columns is an orthogonal matrix.
    e. If L is a line through 0 and if ŷ is the orthogonal projection of y onto L, then ‖ŷ‖ gives the distance from y to L.

24. a. Not every orthogonal set in ℝⁿ is linearly independent.
    b. If a set S = {u₁, …, u_p} has the property that u_i·u_j = 0 whenever i ≠ j, then S is an orthonormal set.
    c. If the columns of an m × n matrix A are orthonormal, then the linear mapping x ↦ Ax preserves lengths.
    d. The orthogonal projection of y onto v is the same as the orthogonal projection of y onto cv whenever c ≠ 0.
    e. An orthogonal matrix is invertible.
25. Prove Theorem 7. [Hint: For (a), compute ‖Ux‖^2, or prove (b) first.]
26. Suppose W is a subspace of R^n spanned by n nonzero orthogonal vectors. Explain why W = R^n.
27. Let U be a square matrix with orthonormal columns. Explain why U is invertible. (Mention the theorems you use.)
28. Let U be an n × n orthogonal matrix. Show that the rows of U form an orthonormal basis of R^n.
29. Let U and V be n × n orthogonal matrices. Explain why UV is an orthogonal matrix. [That is, explain why UV is invertible and its inverse is (UV)^T.]
30. Let U be an orthogonal matrix, and construct V by interchanging some of the columns of U. Explain why V is an orthogonal matrix.
31. Show that the orthogonal projection of a vector y onto a line L through the origin in R^2 does not depend on the choice of the nonzero u in L used in the formula for ŷ. To do this, suppose y and u are given and ŷ has been computed by formula (2) in this section. Replace u in that formula by cu, where c is an unspecified nonzero scalar. Show that the new formula gives the same ŷ.
32. Let {v1, v2} be an orthogonal set of nonzero vectors, and let c1, c2 be any nonzero scalars. Show that {c1v1, c2v2} is also an orthogonal set. Since orthogonality of a set is defined in terms of pairs of vectors, this shows that if the vectors in an orthogonal set are normalized, the new set will still be orthogonal.
33. Given u ≠ 0 in R^n, let L = Span{u}. Show that the mapping x ↦ proj_L x is a linear transformation.
34. Given u ≠ 0 in R^n, let L = Span{u}. For y in R^n, the reflection of y in L is the point refl_L y defined by
refl_L y = 2 proj_L y − y
See the figure, which shows that refl_L y is the sum of ŷ = proj_L y and ŷ − y. Show that the mapping y ↦ refl_L y is a linear transformation.
[Figure: The reflection of y in a line through the origin, showing y, ŷ, ŷ − y, and refl_L y on L = Span{u}.]
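The reflection formula of Exercise 34 is easy to check numerically. Below is a minimal plain-Python sketch; the helper names `dot`, `proj_line`, and `reflect` are ours, not the text's.

```python
# Exercise 34's reflection, refl_L(y) = 2*proj_L(y) - y, checked in code.
# Plain-Python sketch; the helper names are ours.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def proj_line(y, u):
    """Orthogonal projection of y onto L = Span{u}, formula (2) of Section 6.2."""
    c = dot(y, u) / dot(u, u)
    return [c * ui for ui in u]

def reflect(y, u):
    p = proj_line(y, u)
    return [2 * pi - yi for pi, yi in zip(p, y)]

# Reflecting twice gives back the original vector:
y, u = [7.0, 6.0], [2.0, 1.0]
assert reflect(reflect(y, u), u) == y
```

Linearity of `reflect` follows from linearity of proj_L (Exercise 33); the code mirrors that, since both maps are built from dot products with a fixed u.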
35. [M] Show that the columns of the matrix A are orthogonal by making an appropriate matrix calculation. State the calculation you use.
A = [−6 −3 6 1; −1 2 1 −6; 3 6 3 −2; 6 −3 6 −1; 2 −1 2 3; −3 6 3 2; −2 −1 2 −3; 1 2 1 6]
36. [M] In parts (a)–(d), let U be the matrix formed by normalizing each column of the matrix A in Exercise 35.
a. Compute U^T U and U U^T. How do they differ?
b. Generate a random vector y in R^8, and compute p = U U^T y and z = y − p. Explain why p is in Col A. Verify that z is orthogonal to p.
c. Verify that z is orthogonal to each column of U.
d. Notice that y = p + z, with p in Col A. Explain why z is in (Col A)⊥. (The significance of this decomposition of y will be explained in the next section.)
SOLUTIONS TO PRACTICE PROBLEMS
1. The vectors are orthogonal because
u1·u2 = −2/5 + 2/5 = 0
They are unit vectors because
‖u1‖^2 = (−1/√5)^2 + (2/√5)^2 = 1/5 + 4/5 = 1
‖u2‖^2 = (2/√5)^2 + (1/√5)^2 = 4/5 + 1/5 = 1
In particular, the set {u1, u2} is linearly independent, and hence is a basis for R^2 since there are two vectors in the set.
2. When y = (7, 6) and u = (2, 1),
ŷ = ((y·u)/(u·u)) u = (20/5)(2, 1) = 4(2, 1) = (8, 4)
This is the same ŷ found in Example 3. The orthogonal projection does not seem to depend on the u chosen on the line. See Exercise 31.
3. Uy = [1/√2 2/3; 1/√2 −2/3; 0 1/3] (−3√2, 6) = (1, −7, 2)
Also, from Example 6, x = (√2, 3) and Ux = (3, −1, 1). Hence
Ux·Uy = 3 + 7 + 2 = 12, and x·y = −6 + 18 = 12
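Practice Problem 3 can be replayed in code: a matrix U with orthonormal columns preserves inner products, so (Ux)·(Uy) = x·y (Theorem 7). A plain-Python sketch using the data above; the helper names are ours.

```python
# Practice Problem 3, replayed numerically: a matrix U with orthonormal
# columns preserves inner products, (Ux)·(Uy) = x·y (Theorem 7).
# Plain-Python sketch; U, x, y are the data from the problem above.
from math import isclose, sqrt

U = [[1/sqrt(2),  2/3],
     [1/sqrt(2), -2/3],
     [0.0,        1/3]]
x = [sqrt(2), 3.0]
y = [-3*sqrt(2), 6.0]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def matvec(M, v):
    return [dot(row, v) for row in M]

Ux, Uy = matvec(U, x), matvec(U, y)
assert isclose(dot(Ux, Uy), dot(x, y))   # both equal 12
```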
6.3 ORTHOGONAL PROJECTIONS
The orthogonal projection of a point in R^2 onto a line through the origin has an important analogue in R^n. Given a vector y and a subspace W in R^n, there is a vector ŷ in W such that (1) ŷ is the unique vector in W for which y − ŷ is orthogonal to W, and (2) ŷ is the unique vector in W closest to y. See Fig. 1. These two properties of ŷ provide the key to finding least-squares solutions of linear systems, mentioned in the introductory example for this chapter. The full story will be told in Section 6.5.

[Figure 1]

To prepare for the first theorem, observe that whenever a vector y is written as a linear combination of vectors u1, …, un in R^n, the terms in the sum for y can be grouped into two parts so that y can be written as
y = z1 + z2
where z1 is a linear combination of some of the ui and z2 is a linear combination of the rest of the ui. This idea is particularly useful when {u1, …, un} is an orthogonal basis. Recall from Section 6.1 that W⊥ denotes the set of all vectors orthogonal to a subspace W.
EXAMPLE 1 Let {u1, …, u5} be an orthogonal basis for R^5 and let
y = c1u1 + ⋯ + c5u5
Consider the subspace W = Span{u1, u2}, and write y as the sum of a vector z1 in W and a vector z2 in W⊥.

SOLUTION Write
y = (c1u1 + c2u2) + (c3u3 + c4u4 + c5u5)
where z1 = c1u1 + c2u2 is in Span{u1, u2}
and z2 = c3u3 + c4u4 + c5u5 is in Span{u3, u4, u5}.
To show that z2 is in W⊥, it suffices to show that z2 is orthogonal to the vectors in the basis {u1, u2} for W. (See Section 6.1.) Using properties of the inner product, compute
z2·u1 = (c3u3 + c4u4 + c5u5)·u1 = c3 u3·u1 + c4 u4·u1 + c5 u5·u1 = 0
because u1 is orthogonal to u3, u4, and u5. A similar calculation shows that z2·u2 = 0. Thus z2 is in W⊥.
The next theorem shows that the decomposition y = z1 + z2 in Example 1 can be computed without having an orthogonal basis for R^n. It is enough to have an orthogonal basis only for W.
THEOREM 8 The Orthogonal Decomposition Theorem
Let W be a subspace of R^n. Then each y in R^n can be written uniquely in the form
y = ŷ + z   (1)
where ŷ is in W and z is in W⊥. In fact, if {u1, …, up} is any orthogonal basis of W, then
ŷ = ((y·u1)/(u1·u1)) u1 + ⋯ + ((y·up)/(up·up)) up   (2)
and z = y − ŷ.
The vector ŷ in (1) is called the orthogonal projection of y onto W and often is written as proj_W y. See Fig. 2. When W is a one-dimensional subspace, the formula for ŷ matches the formula given in Section 6.2.

[Figure 2: The orthogonal projection of y onto W, showing ŷ = proj_W y and z = y − ŷ.]
PROOF Let {u1, …, up} be any orthogonal basis for W, and define ŷ by (2).¹ Then ŷ is in W because ŷ is a linear combination of the basis u1, …, up. Let z = y − ŷ. Since u1 is orthogonal to u2, …, up, it follows from (2) that
z·u1 = (y − ŷ)·u1 = y·u1 − ((y·u1)/(u1·u1))(u1·u1) − 0 − ⋯ − 0 = y·u1 − y·u1 = 0
Thus z is orthogonal to u1. Similarly, z is orthogonal to each uj in the basis for W. Hence z is orthogonal to every vector in W. That is, z is in W⊥.
To show that the decomposition in (1) is unique, suppose y can also be written as y = ŷ1 + z1, with ŷ1 in W and z1 in W⊥. Then ŷ + z = ŷ1 + z1 (since both sides equal y), and so
ŷ − ŷ1 = z1 − z
This equality shows that the vector v = ŷ − ŷ1 is in W and in W⊥ (because z1 and z are both in W⊥, and W⊥ is a subspace). Hence v·v = 0, which shows that v = 0. This proves that ŷ = ŷ1 and also z1 = z.
The uniqueness of the decomposition (1) shows that the orthogonal projection ŷ depends only on W and not on the particular basis used in (2).
¹We may assume that W is not the zero subspace, for otherwise W⊥ = R^n and (1) is simply y = 0 + y. The next section will show that any nonzero subspace of R^n has an orthogonal basis.
EXAMPLE 2 Let u1 = (2, 5, −1), u2 = (−2, 1, 1), and y = (1, 2, 3). Observe that {u1, u2} is an orthogonal basis for W = Span{u1, u2}. Write y as the sum of a vector in W and a vector orthogonal to W.
SOLUTION The orthogonal projection of y onto W is
ŷ = ((y·u1)/(u1·u1)) u1 + ((y·u2)/(u2·u2)) u2
  = (9/30)(2, 5, −1) + (3/6)(−2, 1, 1)
  = (9/30)(2, 5, −1) + (15/30)(−2, 1, 1) = (−2/5, 2, 1/5)
Also
y − ŷ = (1, 2, 3) − (−2/5, 2, 1/5) = (7/5, 0, 14/5)
Theorem 8 ensures that y − ŷ is in W⊥. To check the calculations, however, it is a good idea to verify that y − ŷ is orthogonal to both u1 and u2 and hence to all of W. The desired decomposition of y is
y = (1, 2, 3) = (−2/5, 2, 1/5) + (7/5, 0, 14/5)
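The decomposition in Example 2 can be verified numerically with formula (2) of Theorem 8. A minimal plain-Python sketch; the helper name `proj_onto_basis` is ours.

```python
# Example 2, verified numerically: y = yhat + z with yhat in W and z in
# W-perp, via formula (2) of Theorem 8.  Plain-Python sketch; the helper
# name proj_onto_basis is ours.
from math import isclose

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def proj_onto_basis(y, basis):
    """Projection onto Span(basis); the basis must be orthogonal."""
    yhat = [0.0] * len(y)
    for u in basis:
        c = dot(y, u) / dot(u, u)
        yhat = [yh + c * ui for yh, ui in zip(yhat, u)]
    return yhat

u1, u2 = [2.0, 5.0, -1.0], [-2.0, 1.0, 1.0]
y = [1.0, 2.0, 3.0]
yhat = proj_onto_basis(y, [u1, u2])          # (-2/5, 2, 1/5)
z = [yi - yh for yi, yh in zip(y, yhat)]     # (7/5, 0, 14/5)

# z is orthogonal to both basis vectors, hence to all of W:
assert isclose(dot(z, u1), 0.0, abs_tol=1e-12)
assert isclose(dot(z, u2), 0.0, abs_tol=1e-12)
```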
A Geometric Interpretation of the Orthogonal Projection
When W is a one-dimensional subspace, the formula (2) for proj_W y contains just one term. Thus, when dim W > 1, each term in (2) is itself an orthogonal projection of y onto a one-dimensional subspace spanned by one of the u's in the basis for W. Figure 3 illustrates this when W is a subspace of R^3 spanned by u1 and u2. Here ŷ1 and ŷ2 denote the projections of y onto the lines spanned by u1 and u2, respectively. The orthogonal projection ŷ of y onto W is the sum of the projections of y onto one-dimensional subspaces that are orthogonal to each other. The vector ŷ in Fig. 3 corresponds to the vector y in Fig. 4 of Section 6.2, because now it is ŷ that is in W.

[Figure 3: The orthogonal projection of y is the sum of its projections onto one-dimensional subspaces that are mutually orthogonal: ŷ = ((y·u1)/(u1·u1)) u1 + ((y·u2)/(u2·u2)) u2 = ŷ1 + ŷ2.]
Properties of Orthogonal Projections
If {u1, …, up} is an orthogonal basis for W and if y happens to be in W, then the formula for proj_W y is exactly the same as the representation of y given in Theorem 5 in Section 6.2. In this case, proj_W y = y.

If y is in W = Span{u1, …, up}, then proj_W y = y.

This fact also follows from the next theorem.
THEOREM 9 The Best Approximation Theorem
Let W be a subspace of R^n, let y be any vector in R^n, and let ŷ be the orthogonal projection of y onto W. Then ŷ is the closest point in W to y, in the sense that
‖y − ŷ‖ < ‖y − v‖   (3)
for all v in W distinct from ŷ.
The vector ŷ in Theorem 9 is called the best approximation to y by elements of W. Later sections in the text will examine problems where a given y must be replaced, or approximated, by a vector v in some fixed subspace W. The distance from y to v, given by ‖y − v‖, can be regarded as the "error" of using v in place of y. Theorem 9 says that this error is minimized when v = ŷ.
Inequality (3) leads to a new proof that ŷ does not depend on the particular orthogonal basis used to compute it. If a different orthogonal basis for W were used to construct an orthogonal projection of y, then this projection would also be the closest point in W to y, namely, ŷ.
PROOF Take v in W distinct from ŷ. See Fig. 4. Then ŷ − v is in W. By the Orthogonal Decomposition Theorem, y − ŷ is orthogonal to W. In particular, y − ŷ is orthogonal to ŷ − v (which is in W). Since
y − v = (y − ŷ) + (ŷ − v)
the Pythagorean Theorem gives
‖y − v‖^2 = ‖y − ŷ‖^2 + ‖ŷ − v‖^2
(See the colored right triangle in Fig. 4. The length of each side is labeled.) Now ‖ŷ − v‖^2 > 0 because ŷ − v ≠ 0, and so inequality (3) follows immediately.
[Figure 4: The orthogonal projection of y onto W is the closest point in W to y.]
EXAMPLE 3 If u1 = (2, 5, −1), u2 = (−2, 1, 1), y = (1, 2, 3), and W = Span{u1, u2}, as in Example 2, then the closest point in W to y is
ŷ = ((y·u1)/(u1·u1)) u1 + ((y·u2)/(u2·u2)) u2 = (−2/5, 2, 1/5)
EXAMPLE 4 The distance from a point y in R^n to a subspace W is defined as the distance from y to the nearest point in W. Find the distance from y to W = Span{u1, u2}, where
y = (−1, −5, 10), u1 = (5, −2, 1), u2 = (1, 2, −1)
SOLUTION By the Best Approximation Theorem, the distance from y to W is ‖y − ŷ‖, where ŷ = proj_W y. Since {u1, u2} is an orthogonal basis for W,
ŷ = (15/30) u1 + (−21/6) u2 = (1/2)(5, −2, 1) − (7/2)(1, 2, −1) = (−1, −8, 4)
y − ŷ = (−1, −5, 10) − (−1, −8, 4) = (0, 3, 6)
‖y − ŷ‖^2 = 3^2 + 6^2 = 45
The distance from y to W is √45 = 3√5.
The final theorem in this section shows how formula (2) for proj_W y is simplified when the basis for W is an orthonormal set.
THEOREM 10
If {u1, …, up} is an orthonormal basis for a subspace W of R^n, then
proj_W y = (y·u1)u1 + (y·u2)u2 + ⋯ + (y·up)up   (4)
If U = [u1 u2 ⋯ up], then
proj_W y = U U^T y for all y in R^n   (5)

PROOF Formula (4) follows immediately from (2) in Theorem 8. Also, (4) shows that proj_W y is a linear combination of the columns of U using the weights y·u1, y·u2, …, y·up. The weights can be written as u1^T y, u2^T y, …, up^T y, showing that they are the entries in U^T y and justifying (5).
Suppose U is an n × p matrix with orthonormal columns, and let W be the column space of U. Then
U^T U x = I_p x = x for all x in R^p   (Theorem 6)
U U^T y = proj_W y for all y in R^n   (Theorem 10)
If U is an n × n (square) matrix with orthonormal columns, then U is an orthogonal matrix, the column space W is all of R^n, and U U^T y = I y = y for all y in R^n.
Although formula (4) is important for theoretical purposes, in practice it usually involves calculations with square roots of numbers (in the entries of the ui). Formula (2) is recommended for hand calculations.
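Theorem 10's identity proj_W y = U(U^T y) says the weights in formula (4) are just the dot products y·ui. A plain-Python sketch, using the orthonormal pair that appears in Exercise 17 of this section (the code layout is ours):

```python
# Theorem 10 in code: with orthonormal columns u1,...,up, the projection
# is proj_W(y) = U(U^T y) -- the weights are just the dot products y·ui.
# Plain-Python sketch using the orthonormal pair from Exercise 17.
from math import isclose

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

u1 = [2/3, 1/3, 2/3]
u2 = [-2/3, 2/3, 1/3]
y = [4.0, 8.0, 1.0]

weights = [dot(y, u) for u in (u1, u2)]       # entries of U^T y: [6, 3]
proj = [weights[0] * a + weights[1] * b for a, b in zip(u1, u2)]

# Formula (4) gives the same thing; here it works out to (2, 4, 5):
assert all(isclose(p, q) for p, q in zip(proj, [2.0, 4.0, 5.0]))
```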
PRACTICE PROBLEM
Let u1 = (−7, 1, 4), u2 = (−1, 1, −2), y = (−9, 1, 6), and W = Span{u1, u2}. Use the fact that u1 and u2 are orthogonal to compute proj_W y.
6.3 EXERCISES
In Exercises 1 and 2, you may assume that {u1, …, u4} is an orthogonal basis for R^4.
1. u1 = (0, 1, −4, −1), u2 = (3, 5, 1, 1), u3 = (1, 0, 1, −4), u4 = (5, −3, −1, 1), x = (10, −8, 2, 0). Write x as the sum of two vectors, one in Span{u1, u2, u3} and the other in Span{u4}.
2. u1 = (1, 2, 1, 1), u2 = (−2, 1, −1, 1), u3 = (1, 1, −2, −1), u4 = (−1, 1, 1, −2), v = (4, 5, −3, 3). Write v as the sum of two vectors, one in Span{u1} and the other in Span{u2, u3, u4}.
In Exercises 3–6, verify that {u1, u2} is an orthogonal set, and then find the orthogonal projection of y onto Span{u1, u2}.
3. y = (−1, 4, 3), u1 = (1, 1, 0), u2 = (−1, 1, 0)
4. y = (6, 3, −2), u1 = (3, 4, 0), u2 = (−4, 3, 0)
5. y = (−1, 2, 6), u1 = (3, −1, 2), u2 = (1, −1, −2)
6. y = (6, 4, 1), u1 = (−4, −1, 1), u2 = (0, 1, 1)
In Exercises 7–10, let W be the subspace spanned by the u's, and write y as the sum of a vector in W and a vector orthogonal to W.
7. y = (1, 3, 5), u1 = (1, 3, −2), u2 = (5, 1, 4)
8. y = (−1, 4, 3), u1 = (1, 1, 1), u2 = (−1, 3, −2)
9. y = (4, 3, 3, −1), u1 = (1, 1, 0, 1), u2 = (−1, 3, 1, −2), u3 = (−1, 0, 1, 1)
10. y = (3, 4, 5, 6), u1 = (1, 1, 0, −1), u2 = (1, 0, 1, 1), u3 = (0, −1, 1, −1)
In Exercises 11 and 12, find the closest point to y in the subspace W spanned by v1 and v2.
11. y = (3, 1, 5, 1), v1 = (3, 1, −1, 1), v2 = (1, −1, 1, −1)
12. y = (3, −1, 1, 13), v1 = (1, −2, −1, 2), v2 = (−4, 1, 0, 3)
In Exercises 13 and 14, find the best approximation to z by vectors of the form c1v1 + c2v2.
13. z = (3, −7, 2, 3), v1 = (2, −1, −3, 1), v2 = (1, 1, 0, −1)
14. z = (2, 4, 0, −1), v1 = (2, 0, −1, −3), v2 = (5, −2, 4, 2)
15. Let y = (5, −9, 5), u1 = (−3, −5, 1), u2 = (−3, 2, 1). Find the distance from y to the plane in R^3 spanned by u1 and u2.
16. Let y, v1, and v2 be as in Exercise 12. Find the distance from y to the subspace of R^4 spanned by v1 and v2.
17. Let y = (4, 8, 1), u1 = (2/3, 1/3, 2/3), u2 = (−2/3, 2/3, 1/3), and W = Span{u1, u2}.
a. Let U = [u1 u2]. Compute U^T U and U U^T.
b. Compute proj_W y and (U U^T)y.
18. Let y = (7, 9), u1 = (1/√10, −3/√10), and W = Span{u1}.
a. Let U be the 2 × 1 matrix whose only column is u1. Compute U^T U and U U^T.
b. Compute proj_W y and (U U^T)y.
19. Let u1 = (1, 1, −2), u2 = (5, −1, 2), and u3 = (0, 0, 1). Note that u1 and u2 are orthogonal but that u3 is not orthogonal to u1 or u2. It can be shown that u3 is not in the subspace W spanned by u1 and u2. Use this fact to construct a nonzero vector v in R^3 that is orthogonal to u1 and u2.
20. Let u1 and u2 be as in Exercise 19, and let u4 = (0, 1, 0). It can be shown that u4 is not in the subspace W spanned by u1 and u2. Use this fact to construct a nonzero vector v in R^3 that is orthogonal to u1 and u2.
In Exercises 21 and 22, all vectors and subspaces are in R^n. Mark each statement True or False. Justify each answer.
21. a. If z is orthogonal to u1 and to u2 and if W = Span{u1, u2}, then z must be in W⊥.
b. For each y and each subspace W, the vector y − proj_W y is orthogonal to W.
c. The orthogonal projection ŷ of y onto a subspace W can sometimes depend on the orthogonal basis for W used to compute ŷ.
d. If y is in a subspace W, then the orthogonal projection of y onto W is y itself.
e. If the columns of an n × p matrix U are orthonormal, then U U^T y is the orthogonal projection of y onto the column space of U.
22. a. If W is a subspace of R^n and if v is in both W and W⊥, then v must be the zero vector.
b. In the Orthogonal Decomposition Theorem, each term in formula (2) for ŷ is itself an orthogonal projection of y onto a subspace of W.
c. If y = z1 + z2, where z1 is in a subspace W and z2 is in W⊥, then z1 must be the orthogonal projection of y onto W.
d. The best approximation to y by elements of a subspace W is given by the vector y − proj_W y.
e. If an n × p matrix U has orthonormal columns, then U U^T x = x for all x in R^n.
23. Let A be an m × n matrix. Prove that every vector x in R^n can be written in the form x = p + u, where p is in Row A and u is in Nul A. Also, show that if the equation Ax = b is consistent, then there is a unique p in Row A such that Ap = b.
24. Let W be a subspace of R^n with an orthogonal basis {w1, …, wp}, and let {v1, …, vq} be an orthogonal basis for W⊥.
a. Explain why {w1, …, wp, v1, …, vq} is an orthogonal set.
b. Explain why the set in part (a) spans R^n.
c. Show that dim W + dim W⊥ = n.
25. [M] Let U be the 8 × 4 matrix in Exercise 36 in Section 6.2. Find the closest point to y = (1, 1, 1, 1, 1, 1, 1, 1) in Col U. Write the keystrokes or commands you use to solve this problem.
26. [M] Let U be the matrix in Exercise 25. Find the distance from b = (1, 1, 1, 1, −1, −1, −1, −1) to Col U.
SOLUTION TO PRACTICE PROBLEM
Compute
proj_W y = ((y·u1)/(u1·u1)) u1 + ((y·u2)/(u2·u2)) u2 = (88/66) u1 + (−2/6) u2
= (4/3)(−7, 1, 4) − (1/3)(−1, 1, −2) = (−9, 1, 6) = y
In this case, y happens to be a linear combination of u1 and u2, so y is in W. The closest point in W to y is y itself.
6.4 THE GRAM–SCHMIDT PROCESS
The Gram–Schmidt process is a simple algorithm for producing an orthogonal or orthonormal basis for any nonzero subspace of R^n. The first two examples of the process are aimed at hand calculation.
EXAMPLE 1 Let W = Span{x1, x2}, where x1 = (3, 6, 0) and x2 = (1, 2, 2). Construct an orthogonal basis {v1, v2} for W.

[Figure 1: Construction of an orthogonal basis {v1, v2}.]
SOLUTION The subspace W is shown in Fig. 1, along with x1, x2, and the projection p of x2 onto x1. The component of x2 orthogonal to x1 is x2 − p, which is in W because it is formed from x2 and a multiple of x1. Let v1 = x1 and
v2 = x2 − p = x2 − ((x2·x1)/(x1·x1)) x1 = (1, 2, 2) − (15/45)(3, 6, 0) = (0, 0, 2)
Then {v1, v2} is an orthogonal set of nonzero vectors in W. Since dim W = 2, the set {v1, v2} is a basis for W.
The next example fully illustrates the Gram–Schmidt process. Study it carefully.
EXAMPLE 2 Let x1 = (1, 1, 1, 1), x2 = (0, 1, 1, 1), and x3 = (0, 0, 1, 1). Then {x1, x2, x3} is clearly linearly independent and thus is a basis for a subspace W of R^4. Construct an orthogonal basis for W.
SOLUTION
Step 1. Let v1 = x1 and W1 = Span{x1} = Span{v1}.
Step 2. Let v2 be the vector produced by subtracting from x2 its projection onto the subspace W1. That is, let
v2 = x2 − proj_W1 x2 = x2 − ((x2·v1)/(v1·v1)) v1   (since v1 = x1)
   = (0, 1, 1, 1) − (3/4)(1, 1, 1, 1) = (−3/4, 1/4, 1/4, 1/4)
As in Example 1, v2 is the component of x2 orthogonal to x1, and {v1, v2} is an orthogonal basis for the subspace W2 spanned by x1 and x2.
Step 2′ (optional). If appropriate, scale v2 to simplify later computations. Since v2 has fractional entries, it is convenient to scale it by a factor of 4 and replace {v1, v2} by the orthogonal basis
v1 = (1, 1, 1, 1), v2′ = (−3, 1, 1, 1)
Step 3. Let v3 be the vector produced by subtracting from x3 its projection onto the subspace W2. Use the orthogonal basis {v1, v2′} to compute this projection onto W2:
proj_W2 x3 = ((x3·v1)/(v1·v1)) v1 + ((x3·v2′)/(v2′·v2′)) v2′
           = (2/4)(1, 1, 1, 1) + (2/12)(−3, 1, 1, 1) = (0, 2/3, 2/3, 2/3)
Then v3 is the component of x3 orthogonal to W2, namely,
v3 = x3 − proj_W2 x3 = (0, 0, 1, 1) − (0, 2/3, 2/3, 2/3) = (0, −2/3, 1/3, 1/3)
See Fig. 2 for a diagram of this construction. Observe that v3 is in W, because x3 and proj_W2 x3 are both in W. Thus {v1, v2′, v3} is an orthogonal set of nonzero vectors and hence a linearly independent set in W. Note that W is three-dimensional since it was defined by a basis of three vectors. Hence, by the Basis Theorem in Section 4.5, {v1, v2′, v3} is an orthogonal basis for W.
[Figure 2: The construction of v3 from x3 and W2 = Span{v1, v2′}.]
The proof of the next theorem shows that this strategy really works. Scaling of vectors is not mentioned because that is used only to simplify hand calculations.
THEOREM 11 The Gram–Schmidt Process
Given a basis {x1, …, xp} for a nonzero subspace W of R^n, define
v1 = x1
v2 = x2 − ((x2·v1)/(v1·v1)) v1
v3 = x3 − ((x3·v1)/(v1·v1)) v1 − ((x3·v2)/(v2·v2)) v2
⋮
vp = xp − ((xp·v1)/(v1·v1)) v1 − ((xp·v2)/(v2·v2)) v2 − ⋯ − ((xp·v_{p−1})/(v_{p−1}·v_{p−1})) v_{p−1}
Then {v1, …, vp} is an orthogonal basis for W. In addition
Span{v1, …, vk} = Span{x1, …, xk} for 1 ≤ k ≤ p   (1)
PROOF For 1 ≤ k ≤ p, let Wk = Span{x1, …, xk}. Set v1 = x1, so that Span{v1} = Span{x1}. Suppose, for some k < p, we have constructed v1, …, vk so that {v1, …, vk} is an orthogonal basis for Wk. Define
v_{k+1} = x_{k+1} − proj_Wk x_{k+1}   (2)
By the Orthogonal Decomposition Theorem, v_{k+1} is orthogonal to Wk. Note that proj_Wk x_{k+1} is in Wk and hence also in W_{k+1}. Since x_{k+1} is in W_{k+1}, so is v_{k+1} (because W_{k+1} is a subspace and is closed under subtraction). Furthermore, v_{k+1} ≠ 0 because x_{k+1} is not in Wk = Span{x1, …, xk}. Hence {v1, …, v_{k+1}} is an orthogonal set of nonzero vectors in the (k+1)-dimensional space W_{k+1}. By the Basis Theorem in Section 4.5, this set is an orthogonal basis for W_{k+1}. Hence W_{k+1} = Span{v1, …, v_{k+1}}. When k + 1 = p, the process stops.
Theorem 11 shows that any nonzero subspace W of R^n has an orthogonal basis, because an ordinary basis {x1, …, xp} is always available (by Theorem 11 in Section 4.5), and the Gram–Schmidt process depends only on the existence of orthogonal projections onto subspaces of W that already have orthogonal bases.
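The recursion in Theorem 11 translates directly into code. A minimal plain-Python sketch (the function name and structure are ours); run on the data of Example 2, it reproduces v2 and v3 from the text.

```python
# The Gram-Schmidt recursion of Theorem 11, written out directly.
# Minimal plain-Python sketch; the input vectors are assumed linearly
# independent, and the function name is ours.
from math import isclose

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def gram_schmidt(xs):
    """Return an orthogonal basis v1,...,vp with Span{v1..vk} = Span{x1..xk}."""
    vs = []
    for x in xs:
        v = list(map(float, x))
        for u in vs:                 # subtract the projection onto each earlier v
            c = dot(x, u) / dot(u, u)
            v = [vi - c * ui for vi, ui in zip(v, u)]
        vs.append(v)
    return vs

# The data of Example 2:
v1, v2, v3 = gram_schmidt([[1, 1, 1, 1], [0, 1, 1, 1], [0, 0, 1, 1]])
assert all(isclose(a, b, abs_tol=1e-12) for a, b in zip(v2, [-3/4, 1/4, 1/4, 1/4]))
assert all(isclose(a, b, abs_tol=1e-12) for a, b in zip(v3, [0, -2/3, 1/3, 1/3]))
```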
Orthonormal Bases
An orthonormal basis is constructed easily from an orthogonal basis {v1, …, vp}: simply normalize (i.e., "scale") all the vk. When working problems by hand, this is easier than normalizing each vk as soon as it is found (because it avoids unnecessary writing of square roots).
EXAMPLE 3 Example 1 constructed the orthogonal basis
v1 = (3, 6, 0), v2 = (0, 0, 2)
An orthonormal basis is
u1 = (1/‖v1‖) v1 = (1/√45)(3, 6, 0) = (1/√5, 2/√5, 0)
u2 = (1/‖v2‖) v2 = (0, 0, 1)
QR Factorization of Matrices
If an m × n matrix A has linearly independent columns x1, …, xn, then applying the Gram–Schmidt process (with normalizations) to x1, …, xn amounts to factoring A, as described in the next theorem. This factorization is widely used in computer algorithms for various computations, such as solving equations (discussed in Section 6.5) and finding eigenvalues (mentioned in the exercises for Section 5.2).
THEOREM 12 The QR Factorization
If A is an m × n matrix with linearly independent columns, then A can be factored as A = QR, where Q is an m × n matrix whose columns form an orthonormal basis for Col A and R is an n × n upper triangular invertible matrix with positive entries on its diagonal.
PROOF The columns of A form a basis {x1, …, xn} for Col A. Construct an orthonormal basis {u1, …, un} for W = Col A with property (1) in Theorem 11. This basis may be constructed by the Gram–Schmidt process or some other means. Let
Q = [u1 u2 ⋯ un]
For k = 1, …, n, xk is in Span{x1, …, xk} = Span{u1, …, uk}. So there are constants r1k, …, rkk such that
xk = r1k u1 + ⋯ + rkk uk + 0·u_{k+1} + ⋯ + 0·un
We may assume that rkk ≥ 0. (If rkk < 0, multiply both rkk and uk by −1.) This shows that xk is a linear combination of the columns of Q using as weights the entries in the vector
rk = (r1k, …, rkk, 0, …, 0)
That is, xk = Q rk for k = 1, …, n. Let R = [r1 ⋯ rn]. Then
A = [x1 ⋯ xn] = [Q r1 ⋯ Q rn] = QR
The fact that R is invertible follows easily from the fact that the columns of A are linearly independent (Exercise 19). Since R is clearly upper triangular, its nonnegative diagonal entries must be positive.
EXAMPLE 4 Find a QR factorization of A = [1 0 0; 1 1 0; 1 1 1; 1 1 1].
SOLUTION The columns of A are the vectors x1, x2, and x3 in Example 2. An orthogonal basis for Col A = Span{x1, x2, x3} was found in that example:
v1 = (1, 1, 1, 1), v2′ = (−3, 1, 1, 1), v3 = (0, −2/3, 1/3, 1/3)
To simplify the arithmetic that follows, scale v3 by letting v3′ = 3v3. Then normalize the three vectors to obtain u1, u2, and u3, and use these vectors as the columns of Q:
Q = [1/2 −3/√12 0; 1/2 1/√12 −2/√6; 1/2 1/√12 1/√6; 1/2 1/√12 1/√6]
By construction, the first k columns of Q are an orthonormal basis of Span{x1, …, xk}. From the proof of Theorem 12, A = QR for some R. To find R, observe that Q^T Q = I, because the columns of Q are orthonormal. Hence
Q^T A = Q^T (QR) = IR = R
and
R = [1/2 1/2 1/2 1/2; −3/√12 1/√12 1/√12 1/√12; 0 −2/√6 1/√6 1/√6] [1 0 0; 1 1 0; 1 1 1; 1 1 1]
  = [2 3/2 1; 0 3/√12 2/√12; 0 0 2/√6]
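Example 4's observation R = Q^T A gives a compact way to compute a QR factorization once Q is known. A plain-Python sketch on the same matrix A; the helper names are ours.

```python
# QR via Gram-Schmidt, as in Example 4: normalize an orthogonal basis of
# the columns of A to get Q, then R = Q^T A because Q^T Q = I.
# Plain-Python sketch; columns are plain lists and the names are ours.
from math import isclose, sqrt

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def gram_schmidt(xs):
    vs = []
    for x in xs:
        v = list(map(float, x))
        for u in vs:
            c = dot(x, u) / dot(u, u)
            v = [vi - c * ui for vi, ui in zip(v, u)]
        vs.append(v)
    return vs

cols = [[1, 1, 1, 1], [0, 1, 1, 1], [0, 0, 1, 1]]     # columns of A
qs = [[vi / sqrt(dot(v, v)) for vi in v] for v in gram_schmidt(cols)]

R = [[dot(q, a) for a in cols] for q in qs]           # R[i][j] = q_i · a_j

# Matches the R computed in the text, e.g. first row (2, 3/2, 1):
assert all(isclose(x, y) for x, y in zip(R[0], [2.0, 1.5, 1.0]))
assert isclose(R[2][2], 2 / sqrt(6))
# Entries below the diagonal vanish (up to roundoff):
assert abs(R[1][0]) < 1e-12 and abs(R[2][0]) < 1e-12 and abs(R[2][1]) < 1e-12
```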
NUMERICAL NOTES
1. When the Gram–Schmidt process is run on a computer, roundoff error can build up as the vectors uk are calculated, one by one. For j and k large but unequal, the inner products uj^T uk may not be sufficiently close to zero. This loss of orthogonality can be reduced substantially by rearranging the order of the calculations.¹ However, a different computer-based QR factorization is usually preferred to this modified Gram–Schmidt method because it yields a more accurate orthonormal basis, even though the factorization requires about twice as much arithmetic.
2. To produce a QR factorization of a matrix A, a computer program usually left-multiplies A by a sequence of orthogonal matrices until A is transformed into an upper triangular matrix. This construction is analogous to the left-multiplication by elementary matrices that produces an LU factorization of A.
PRACTICE PROBLEM
Let W = Span{x1, x2}, where x1 = (1, 1, 1) and x2 = (1/3, 1/3, −2/3). Construct an orthonormal basis for W.
6.4 EXERCISES
In Exercises 1–6, the given set is a basis for a subspace W. Use the Gram–Schmidt process to produce an orthogonal basis for W.
1. (3, 0, −1), (8, 5, −6)
2. (0, 4, 2), (5, 6, −7)
3. (2, −5, 1), (4, −1, 2)
4. (3, −4, 5), (−3, 14, −7)
5. (1, −4, 0, 1), (7, −7, −4, 1)
6. (3, −1, 2, −1), (−5, 9, −9, 3)
¹See Fundamentals of Matrix Computations, by David S. Watkins (New York: John Wiley & Sons, 1991), pp. 167–180.
7. Find an orthonormal basis of the subspace spanned by the vectors in Exercise 3.
8. Find an orthonormal basis of the subspace spanned by the vectors in Exercise 4.
Find an orthogonal basis for the column space of each matrix in Exercises 9–12.
9. [3 −5 1; 1 1 1; −1 5 −2; 3 −7 8]
10. [−1 6 6; 3 −8 3; 1 −2 6; 1 −4 −3]
11. [1 2 5; −1 1 −4; −1 4 −3; 1 −4 7; 1 2 1]
12. [1 3 5; −1 −3 1; 0 2 3; 1 5 2; 1 5 8]
In Exercises 13 and 14, the columns of Q were obtained by applying the Gram–Schmidt process to the columns of A. Find an upper triangular matrix R such that A = QR. Check your work.
13. A = [5 9; 1 7; −3 −5; 1 5], Q = [5/6 −1/6; 1/6 5/6; −3/6 1/6; 1/6 3/6]
14. A = [−2 3; 5 7; 2 −2; 4 6], Q = [−2/7 5/7; 5/7 2/7; 2/7 −4/7; 4/7 2/7]
15. Find a QR factorization of the matrix in Exercise 11.
16. Find a QR factorization of the matrix in Exercise 12.
In Exercises 17 and 18, all vectors and subspaces are in R^n. Mark each statement True or False. Justify each answer.
17. a. If {v1, v2, v3} is an orthogonal basis for W, then multiplying v3 by a scalar c gives a new orthogonal basis {v1, v2, cv3}.
b. The Gram–Schmidt process produces from a linearly independent set {x1, …, xp} an orthogonal set {v1, …, vp} with the property that for each k, the vectors v1, …, vk span the same subspace as that spanned by x1, …, xk.
c. If A = QR, where Q has orthonormal columns, then R = Q^T A.
18. a. If W = Span{x1, x2, x3} with {x1, x2, x3} linearly independent, and if {v1, v2, v3} is an orthogonal set in W, then {v1, v2, v3} is a basis for W.
b. If x is not in a subspace W, then x − proj_W x is not zero.
c. In a QR factorization, say A = QR (when A has linearly independent columns), the columns of Q form an orthonormal basis for the column space of A.
19. Suppose A = QR, where Q is m × n and R is n × n. Show that if the columns of A are linearly independent, then R must be invertible. [Hint: Study the equation Rx = 0 and use the fact that A = QR.]
20. Suppose A = QR, where R is an invertible matrix. Show that A and Q have the same column space. [Hint: Given y in Col A, show that y = Qx for some x. Also, given y in Col Q, show that y = Ax for some x.]
21. Given A = QR as in Theorem 12, describe how to find an orthogonal m × m (square) matrix Q1 and an invertible n × n upper triangular matrix R such that
A = Q1 [R; 0]
The MATLAB qr command supplies this "full" QR factorization when rank A = n.
22. Let u1, …, up be an orthogonal basis for a subspace W of R^n, and let T : R^n → R^n be defined by T(x) = proj_W x. Show that T is a linear transformation.
23. Suppose A = QR is a QR factorization of an m × n matrix A (with linearly independent columns). Partition A as [A1 A2], where A1 has p columns. Show how to obtain a QR factorization of A1, and explain why your factorization has the appropriate properties.
24. [M] Use the Gram–Schmidt process as in Example 2 to produce an orthogonal basis for the column space of
A = [−10 13 7 −11; 2 1 −5 3; −6 3 13 −3; 16 −16 −2 5; 2 1 −5 −7]
25. [M] Use the method in this section to produce a QR factorization of the matrix in Exercise 24.
26. [M] For a matrix program, the Gram–Schmidt process works better with orthonormal vectors. Starting with x1, …, xp as in Theorem 11, let A = [x1 ⋯ xp]. Suppose Q is an n × k matrix whose columns form an orthonormal basis for the subspace Wk spanned by the first k columns of A. Then for x in R^n, QQ^T x is the orthogonal projection of x onto Wk (Theorem 10 in Section 6.3). If x_{k+1} is the next column of A, then equation (2) in the proof of Theorem 11 becomes
v_{k+1} = x_{k+1} − Q(Q^T x_{k+1})
(The parentheses above reduce the number of arithmetic operations.) Let u_{k+1} = v_{k+1}/‖v_{k+1}‖. The new Q for the next step is [Q u_{k+1}]. Use this procedure to compute the QR factorization of the matrix in Exercise 24. Write the keystrokes or commands you use.
SOLUTION TO PRACTICE PROBLEM
Let v1 = x1 = (1, 1, 1) and v2 = x2 − ((x2·v1)/(v1·v1)) v1 = x2 − 0v1 = x2. So {x1, x2} is already orthogonal. All that is needed is to normalize the vectors. Let
u1 = (1/‖v1‖) v1 = (1/√3)(1, 1, 1) = (1/√3, 1/√3, 1/√3)
Instead of normalizing v2 directly, normalize v2′ = 3v2 = (1, 1, −2) instead:
u2 = (1/‖v2′‖) v2′ = (1/√(1^2 + 1^2 + (−2)^2)) (1, 1, −2) = (1/√6, 1/√6, −2/√6)
Then {u1, u2} is an orthonormal basis for W.
6.5 LEAST-SQUARES PROBLEMS
The chapter's introductory example described a massive problem Ax = b that had no solution. Inconsistent systems arise often in applications, though usually not with such an enormous coefficient matrix. When a solution is demanded and none exists, the best one can do is to find an x that makes Ax as close as possible to b.
Think of Ax as an approximation to b. The smaller the distance between b and Ax, given by ‖b − Ax‖, the better the approximation. The general least-squares problem is to find an x that makes ‖b − Ax‖ as small as possible. The adjective "least-squares" arises from the fact that ‖b − Ax‖ is the square root of a sum of squares.
DEFINITION If A is m × n and b is in R^m, a least-squares solution of Ax = b is an x̂ in R^n such that
‖b − Ax̂‖ ≤ ‖b − Ax‖
for all x in R^n.
The most important aspect of the least-squares problem is that no matter what x we select, the vector Ax will necessarily be in the column space, Col A. So we seek an x that makes Ax the closest point in Col A to b. See Fig. 1. (Of course, if b happens to be in Col A, then b is Ax for some x, and such an x is a "least-squares solution.")
[Figure 1: The vector b is closer to Ax̂ than to Ax for other x.]
Solution of the General Least-Squares Problem
Given A and b as above, apply the Best Approximation Theorem in Section 6.3 to the subspace Col A. Let
b̂ = proj_ColA b
Because b̂ is in the column space of A, the equation Ax = b̂ is consistent, and there is an x̂ in R^n such that
Ax̂ = b̂   (1)
Since b̂ is the closest point in Col A to b, a vector x̂ is a least-squares solution of Ax = b if and only if x̂ satisfies (1). Such an x̂ in R^n is a list of weights that will build b̂ out of the columns of A. See Fig. 2. [There are many solutions of (1) if the equation has free variables.]
[Figure 2: The least-squares solution x̂ is in R^n.]
Suppose $\hat{x}$ satisfies $A\hat{x} = \hat{b}$. By the Orthogonal Decomposition Theorem in Section 6.3, the projection $\hat{b}$ has the property that $b - \hat{b}$ is orthogonal to $\operatorname{Col} A$, so $b - A\hat{x}$ is orthogonal to each column of $A$. If $a_j$ is any column of $A$, then $a_j \cdot (b - A\hat{x}) = 0$, and $a_j^T(b - A\hat{x}) = 0$. Since each $a_j^T$ is a row of $A^T$,
$$A^T(b - A\hat{x}) = 0 \tag{2}$$
(This equation also follows from Theorem 3 in Section 6.1.) Thus
$$A^Tb - A^TA\hat{x} = 0$$
$$A^TA\hat{x} = A^Tb$$
These calculations show that each least-squares solution of $Ax = b$ satisfies the equation
$$A^TAx = A^Tb \tag{3}$$
The matrix equation (3) represents a system of equations called the normal equations for $Ax = b$. A solution of (3) is often denoted by $\hat{x}$.
THEOREM 13  The set of least-squares solutions of $Ax = b$ coincides with the nonempty set of solutions of the normal equations $A^TAx = A^Tb$.
PROOF  As shown above, the set of least-squares solutions is nonempty and each least-squares solution $\hat{x}$ satisfies the normal equations. Conversely, suppose $\hat{x}$ satisfies $A^TA\hat{x} = A^Tb$. Then $\hat{x}$ satisfies (2) above, which shows that $b - A\hat{x}$ is orthogonal to the rows of $A^T$ and hence is orthogonal to the columns of $A$. Since the columns of $A$ span $\operatorname{Col} A$, the vector $b - A\hat{x}$ is orthogonal to all of $\operatorname{Col} A$. Hence the equation
$$b = A\hat{x} + (b - A\hat{x})$$
is a decomposition of $b$ into the sum of a vector in $\operatorname{Col} A$ and a vector orthogonal to $\operatorname{Col} A$. By the uniqueness of the orthogonal decomposition, $A\hat{x}$ must be the orthogonal projection of $b$ onto $\operatorname{Col} A$. That is, $A\hat{x} = \hat{b}$, and $\hat{x}$ is a least-squares solution.
EXAMPLE 1  Find a least-squares solution of the inconsistent system $Ax = b$ for
$$A = \begin{bmatrix}4 & 0\\0 & 2\\1 & 1\end{bmatrix}, \qquad b = \begin{bmatrix}2\\0\\11\end{bmatrix}$$
SOLUTION  To use the normal equations (3), compute:
$$A^TA = \begin{bmatrix}4 & 0 & 1\\0 & 2 & 1\end{bmatrix}\begin{bmatrix}4 & 0\\0 & 2\\1 & 1\end{bmatrix} = \begin{bmatrix}17 & 1\\1 & 5\end{bmatrix}$$
$$A^Tb = \begin{bmatrix}4 & 0 & 1\\0 & 2 & 1\end{bmatrix}\begin{bmatrix}2\\0\\11\end{bmatrix} = \begin{bmatrix}19\\11\end{bmatrix}$$
Then the equation $A^TAx = A^Tb$ becomes
$$\begin{bmatrix}17 & 1\\1 & 5\end{bmatrix}\begin{bmatrix}x_1\\x_2\end{bmatrix} = \begin{bmatrix}19\\11\end{bmatrix}$$
Row operations can be used to solve this system, but since $A^TA$ is invertible and $2 \times 2$, it is probably faster to compute
$$(A^TA)^{-1} = \frac{1}{84}\begin{bmatrix}5 & -1\\-1 & 17\end{bmatrix}$$
and then to solve $A^TAx = A^Tb$ as
$$\hat{x} = (A^TA)^{-1}A^Tb = \frac{1}{84}\begin{bmatrix}5 & -1\\-1 & 17\end{bmatrix}\begin{bmatrix}19\\11\end{bmatrix} = \frac{1}{84}\begin{bmatrix}84\\168\end{bmatrix} = \begin{bmatrix}1\\2\end{bmatrix}$$
In many calculations, $A^TA$ is invertible, but this is not always the case. The next example involves a matrix of the sort that appears in what are called analysis of variance problems in statistics.
EXAMPLE 2  Find a least-squares solution of $Ax = b$ for
$$A = \begin{bmatrix}1 & 1 & 0 & 0\\1 & 1 & 0 & 0\\1 & 0 & 1 & 0\\1 & 0 & 1 & 0\\1 & 0 & 0 & 1\\1 & 0 & 0 & 1\end{bmatrix}, \qquad b = \begin{bmatrix}-3\\-1\\0\\2\\5\\1\end{bmatrix}$$
SOLUTION  Compute
$$A^TA = \begin{bmatrix}1 & 1 & 1 & 1 & 1 & 1\\1 & 1 & 0 & 0 & 0 & 0\\0 & 0 & 1 & 1 & 0 & 0\\0 & 0 & 0 & 0 & 1 & 1\end{bmatrix}\begin{bmatrix}1 & 1 & 0 & 0\\1 & 1 & 0 & 0\\1 & 0 & 1 & 0\\1 & 0 & 1 & 0\\1 & 0 & 0 & 1\\1 & 0 & 0 & 1\end{bmatrix} = \begin{bmatrix}6 & 2 & 2 & 2\\2 & 2 & 0 & 0\\2 & 0 & 2 & 0\\2 & 0 & 0 & 2\end{bmatrix}$$
$$A^Tb = \begin{bmatrix}1 & 1 & 1 & 1 & 1 & 1\\1 & 1 & 0 & 0 & 0 & 0\\0 & 0 & 1 & 1 & 0 & 0\\0 & 0 & 0 & 0 & 1 & 1\end{bmatrix}\begin{bmatrix}-3\\-1\\0\\2\\5\\1\end{bmatrix} = \begin{bmatrix}4\\-4\\2\\6\end{bmatrix}$$
The augmented matrix for $A^TAx = A^Tb$ is
$$\begin{bmatrix}6 & 2 & 2 & 2 & 4\\2 & 2 & 0 & 0 & -4\\2 & 0 & 2 & 0 & 2\\2 & 0 & 0 & 2 & 6\end{bmatrix} \sim \begin{bmatrix}1 & 0 & 0 & 1 & 3\\0 & 1 & 0 & -1 & -5\\0 & 0 & 1 & -1 & -2\\0 & 0 & 0 & 0 & 0\end{bmatrix}$$
The general solution is $x_1 = 3 - x_4$, $x_2 = -5 + x_4$, $x_3 = -2 + x_4$, and $x_4$ is free. So the general least-squares solution of $Ax = b$ has the form
$$\hat{x} = \begin{bmatrix}3\\-5\\-2\\0\end{bmatrix} + x_4\begin{bmatrix}-1\\1\\1\\1\end{bmatrix}$$
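When $A^TA$ is singular, as here, `numpy.linalg.lstsq` still returns one least-squares solution (the one of minimum norm). The sketch below (an illustration, not part of the text) checks that this solution satisfies the normal equations and lies on the solution line just described:

```python
import numpy as np

A = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [1, 0, 1, 0],
              [1, 0, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1]], dtype=float)
b = np.array([-3.0, -1.0, 0.0, 2.0, 5.0, 1.0])

# lstsq handles the rank-deficient case; rcond=None silences a warning
x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)

# Any least-squares solution satisfies the normal equations A^T A x = A^T b
assert np.allclose(A.T @ A @ x_hat, A.T @ b)

# It differs from the particular solution (3, -5, -2, 0) only by a
# multiple of (-1, 1, 1, 1), as the general solution predicts
diff = x_hat - np.array([3.0, -5.0, -2.0, 0.0])
assert np.allclose(diff, diff[3] * np.array([-1.0, 1.0, 1.0, 1.0]))
```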
The next theorem gives useful criteria for determining when there is only one least-squares solution of $Ax = b$. (Of course, the orthogonal projection $\hat{b}$ is always unique.)
THEOREM 14  Let $A$ be an $m \times n$ matrix. The following statements are logically equivalent:
a. The equation $Ax = b$ has a unique least-squares solution for each $b$ in $\mathbb{R}^m$.
b. The columns of $A$ are linearly independent.
c. The matrix $A^TA$ is invertible.
When these statements are true, the least-squares solution $\hat{x}$ is given by
$$\hat{x} = (A^TA)^{-1}A^Tb \tag{4}$$
The main elements of a proof of Theorem 14 are outlined in Exercises 19–21, which also review concepts from Chapter 4. Formula (4) for $\hat{x}$ is useful mainly for theoretical purposes and for hand calculations when $A^TA$ is a $2 \times 2$ invertible matrix.
When a least-squares solution $\hat{x}$ is used to produce $A\hat{x}$ as an approximation to $b$, the distance from $b$ to $A\hat{x}$ is called the least-squares error of this approximation.

EXAMPLE 3  Given $A$ and $b$ as in Example 1, determine the least-squares error in the least-squares solution of $Ax = b$.
SOLUTION  From Example 1,
$$b = \begin{bmatrix}2\\0\\11\end{bmatrix} \quad\text{and}\quad A\hat{x} = \begin{bmatrix}4 & 0\\0 & 2\\1 & 1\end{bmatrix}\begin{bmatrix}1\\2\end{bmatrix} = \begin{bmatrix}4\\4\\3\end{bmatrix}$$
[FIGURE 3: the columns $(4, 0, 1)$ and $(0, 2, 1)$ of $A$ span $\operatorname{Col} A$ in $\mathbb{R}^3$; $b = (2, 0, 11)$ lies at distance $\sqrt{84}$ from $\operatorname{Col} A$.]
Hence
$$b - A\hat{x} = \begin{bmatrix}2\\0\\11\end{bmatrix} - \begin{bmatrix}4\\4\\3\end{bmatrix} = \begin{bmatrix}-2\\-4\\8\end{bmatrix}$$
and
$$\|b - A\hat{x}\| = \sqrt{(-2)^2 + (-4)^2 + 8^2} = \sqrt{84}$$
The least-squares error is $\sqrt{84}$. For any $x$ in $\mathbb{R}^2$, the distance between $b$ and the vector $Ax$ is at least $\sqrt{84}$. See Fig. 3. Note that the least-squares solution $\hat{x}$ itself does not appear in the figure.
Alternative Calculations of Least-Squares Solutions

The next example shows how to find a least-squares solution of $Ax = b$ when the columns of $A$ are orthogonal. Such matrices often appear in linear regression problems, discussed in the next section.
EXAMPLE 4  Find a least-squares solution of $Ax = b$ for
$$A = \begin{bmatrix}1 & -6\\1 & -2\\1 & 1\\1 & 7\end{bmatrix}, \qquad b = \begin{bmatrix}-1\\2\\1\\6\end{bmatrix}$$
SOLUTION  Because the columns $a_1$ and $a_2$ of $A$ are orthogonal, the orthogonal projection of $b$ onto $\operatorname{Col} A$ is given by
$$\hat{b} = \frac{b \cdot a_1}{a_1 \cdot a_1}a_1 + \frac{b \cdot a_2}{a_2 \cdot a_2}a_2 = \frac{8}{4}a_1 + \frac{45}{90}a_2 \tag{5}$$
$$= \begin{bmatrix}2\\2\\2\\2\end{bmatrix} + \begin{bmatrix}-3\\-1\\1/2\\7/2\end{bmatrix} = \begin{bmatrix}-1\\1\\5/2\\11/2\end{bmatrix}$$
Now that $\hat{b}$ is known, we can solve $A\hat{x} = \hat{b}$. But this is trivial, since we already know what weights to place on the columns of $A$ to produce $\hat{b}$. It is clear from (5) that
$$\hat{x} = \begin{bmatrix}8/4\\45/90\end{bmatrix} = \begin{bmatrix}2\\1/2\end{bmatrix}$$
In some cases, the normal equations for a least-squares problem can be ill-conditioned; that is, small errors in the calculations of the entries of $A^TA$ can sometimes cause relatively large errors in the solution $\hat{x}$. If the columns of $A$ are linearly independent, the least-squares solution can often be computed more reliably through a QR factorization of $A$ (described in Section 6.4).¹

¹The QR method is compared with the standard normal equation method in G. Golub and C. Van Loan, Matrix Computations, 3rd ed. (Baltimore: Johns Hopkins Press, 1996), pp. 230–231.
THEOREM 15  Given an $m \times n$ matrix $A$ with linearly independent columns, let $A = QR$ be a QR factorization of $A$ as in Theorem 12. Then, for each $b$ in $\mathbb{R}^m$, the equation $Ax = b$ has a unique least-squares solution, given by
$$\hat{x} = R^{-1}Q^Tb \tag{6}$$
PROOF  Let $\hat{x} = R^{-1}Q^Tb$. Then
$$A\hat{x} = QR\hat{x} = QRR^{-1}Q^Tb = QQ^Tb$$
By Theorem 12, the columns of $Q$ form an orthonormal basis for $\operatorname{Col} A$. Hence, by Theorem 10, $QQ^Tb$ is the orthogonal projection $\hat{b}$ of $b$ onto $\operatorname{Col} A$. Then $A\hat{x} = \hat{b}$, which shows that $\hat{x}$ is a least-squares solution of $Ax = b$. The uniqueness of $\hat{x}$ follows from Theorem 14.
NUMERICAL NOTE
Since $R$ in Theorem 15 is upper triangular, $\hat{x}$ should be calculated as the exact solution of the equation
$$Rx = Q^Tb \tag{7}$$
It is much faster to solve (7) by back-substitution or row operations than to compute $R^{-1}$ and use (6).
EXAMPLE 5  Find the least-squares solution of $Ax = b$ for
$$A = \begin{bmatrix}1 & 3 & 5\\1 & 1 & 0\\1 & 1 & 2\\1 & 3 & 3\end{bmatrix}, \qquad b = \begin{bmatrix}3\\5\\7\\-3\end{bmatrix}$$
SOLUTION  The QR factorization of $A$ can be obtained as in Section 6.4:
$$A = QR = \begin{bmatrix}1/2 & 1/2 & 1/2\\1/2 & -1/2 & -1/2\\1/2 & -1/2 & 1/2\\1/2 & 1/2 & -1/2\end{bmatrix}\begin{bmatrix}2 & 4 & 5\\0 & 2 & 3\\0 & 0 & 2\end{bmatrix}$$
Then
$$Q^Tb = \begin{bmatrix}1/2 & 1/2 & 1/2 & 1/2\\1/2 & -1/2 & -1/2 & 1/2\\1/2 & -1/2 & 1/2 & -1/2\end{bmatrix}\begin{bmatrix}3\\5\\7\\-3\end{bmatrix} = \begin{bmatrix}6\\-6\\4\end{bmatrix}$$
The least-squares solution $\hat{x}$ satisfies $Rx = Q^Tb$; that is,
$$\begin{bmatrix}2 & 4 & 5\\0 & 2 & 3\\0 & 0 & 2\end{bmatrix}\begin{bmatrix}x_1\\x_2\\x_3\end{bmatrix} = \begin{bmatrix}6\\-6\\4\end{bmatrix}$$
This equation is solved easily by back-substitution and yields $\hat{x} = \begin{bmatrix}10\\-6\\2\end{bmatrix}$.
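The QR route of Theorem 15 is what numerical libraries actually use. A sketch of Example 5 with `numpy.linalg.qr` (the factors may differ from the hand computation by signs, but the unique $\hat{x}$ is the same; this check is not part of the text):

```python
import numpy as np

A = np.array([[1.0, 3.0, 5.0],
              [1.0, 1.0, 0.0],
              [1.0, 1.0, 2.0],
              [1.0, 3.0, 3.0]])
b = np.array([3.0, 5.0, 7.0, -3.0])

Q, R = np.linalg.qr(A)       # reduced QR: Q is 4x3, R is 3x3 upper triangular
# Solve R x = Q^T b (here with a general solve; production code
# would use back-substitution on the triangular R)
x_hat = np.linalg.solve(R, Q.T @ b)

print(x_hat)                 # [10. -6.  2.], as in Example 5
```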
PRACTICE PROBLEMS
1. Let $A = \begin{bmatrix}1 & -3 & -3\\1 & 5 & 1\\1 & 7 & 2\end{bmatrix}$ and $b = \begin{bmatrix}5\\-3\\-5\end{bmatrix}$. Find a least-squares solution of $Ax = b$, and compute the associated least-squares error.
2. What can you say about the least-squares solution of $Ax = b$ when $b$ is orthogonal to the columns of $A$?
6.5 EXERCISES
In Exercises 1–4, find a least-squares solution of $Ax = b$ by (a) constructing the normal equations for $\hat{x}$ and (b) solving for $\hat{x}$.
1. $A = \begin{bmatrix}-1 & 2\\2 & -3\\-1 & 3\end{bmatrix}$, $b = \begin{bmatrix}4\\1\\2\end{bmatrix}$
2. $A = \begin{bmatrix}2 & 1\\-2 & 0\\2 & 3\end{bmatrix}$, $b = \begin{bmatrix}-5\\8\\1\end{bmatrix}$
3. $A = \begin{bmatrix}1 & -2\\-1 & 2\\0 & 3\\2 & 5\end{bmatrix}$, $b = \begin{bmatrix}3\\1\\-4\\2\end{bmatrix}$
4. $A = \begin{bmatrix}1 & 3\\1 & -1\\1 & 1\end{bmatrix}$, $b = \begin{bmatrix}5\\1\\0\end{bmatrix}$
In Exercises 5 and 6, describe all least-squares solutions of the equation $Ax = b$.
5. $A = \begin{bmatrix}1 & 1 & 0\\1 & 1 & 0\\1 & 0 & 1\\1 & 0 & 1\end{bmatrix}$, $b = \begin{bmatrix}1\\3\\8\\2\end{bmatrix}$
6. $A = \begin{bmatrix}1 & 1 & 0\\1 & 1 & 0\\1 & 1 & 0\\1 & 0 & 1\\1 & 0 & 1\\1 & 0 & 1\end{bmatrix}$, $b = \begin{bmatrix}7\\2\\3\\6\\5\\4\end{bmatrix}$
7. Compute the least-squares error associated with the least-squares solution found in Exercise 3.
8. Compute the least-squares error associated with the least-squares solution found in Exercise 4.
In Exercises 9–12, find (a) the orthogonal projection of $b$ onto $\operatorname{Col} A$ and (b) a least-squares solution of $Ax = b$.
9. $A = \begin{bmatrix}1 & 5\\3 & 1\\-2 & 4\end{bmatrix}$, $b = \begin{bmatrix}4\\-2\\-3\end{bmatrix}$
10. $A = \begin{bmatrix}1 & 2\\-1 & 4\\1 & 2\end{bmatrix}$, $b = \begin{bmatrix}3\\-1\\5\end{bmatrix}$
11. $A = \begin{bmatrix}4 & 0 & 1\\1 & -5 & 1\\6 & 1 & 0\\1 & -1 & -5\end{bmatrix}$, $b = \begin{bmatrix}9\\0\\0\\0\end{bmatrix}$
12. $A = \begin{bmatrix}1 & 1 & 0\\1 & 0 & -1\\0 & 1 & 1\\-1 & 1 & -1\end{bmatrix}$, $b = \begin{bmatrix}2\\5\\6\\6\end{bmatrix}$
13. Let $A = \begin{bmatrix}3 & 4\\-2 & 1\\3 & 4\end{bmatrix}$, $b = \begin{bmatrix}11\\-9\\5\end{bmatrix}$, $u = \begin{bmatrix}5\\-1\end{bmatrix}$, and $v = \begin{bmatrix}5\\-2\end{bmatrix}$. Compute $Au$ and $Av$, and compare them with $b$. Could $u$ possibly be a least-squares solution of $Ax = b$? (Answer this without computing a least-squares solution.)
14. Let $A = \begin{bmatrix}2 & 1\\-3 & -4\\3 & 2\end{bmatrix}$, $b = \begin{bmatrix}5\\4\\4\end{bmatrix}$, $u = \begin{bmatrix}4\\-5\end{bmatrix}$, and $v = \begin{bmatrix}6\\-5\end{bmatrix}$. Compute $Au$ and $Av$, and compare them with $b$. Is it possible that at least one of $u$ or $v$ could be a least-squares solution of $Ax = b$? (Answer this without computing a least-squares solution.)
In Exercises 15 and 16, use the factorization $A = QR$ to find the least-squares solution of $Ax = b$.
15. $A = \begin{bmatrix}2 & 3\\2 & 4\\1 & 1\end{bmatrix} = \begin{bmatrix}2/3 & -1/3\\2/3 & 2/3\\1/3 & -2/3\end{bmatrix}\begin{bmatrix}3 & 5\\0 & 1\end{bmatrix}$, $b = \begin{bmatrix}7\\3\\1\end{bmatrix}$
16. $A = \begin{bmatrix}1 & -1\\1 & 4\\1 & -1\\1 & 4\end{bmatrix} = \begin{bmatrix}1/2 & -1/2\\1/2 & 1/2\\1/2 & -1/2\\1/2 & 1/2\end{bmatrix}\begin{bmatrix}2 & 3\\0 & 5\end{bmatrix}$, $b = \begin{bmatrix}-1\\6\\5\\7\end{bmatrix}$
In Exercises 17 and 18, $A$ is an $m \times n$ matrix and $b$ is in $\mathbb{R}^m$. Mark each statement True or False. Justify each answer.
17. a. The general least-squares problem is to find an $x$ that makes $Ax$ as close as possible to $b$.
b. A least-squares solution of $Ax = b$ is a vector $\hat{x}$ that satisfies $A\hat{x} = \hat{b}$, where $\hat{b}$ is the orthogonal projection of $b$ onto $\operatorname{Col} A$.
c. A least-squares solution of $Ax = b$ is a vector $\hat{x}$ such that $\|b - Ax\| \le \|b - A\hat{x}\|$ for all $x$ in $\mathbb{R}^n$.
d. Any solution of $A^TAx = A^Tb$ is a least-squares solution of $Ax = b$.
e. If the columns of $A$ are linearly independent, then the equation $Ax = b$ has exactly one least-squares solution.
18. a. If $b$ is in the column space of $A$, then every solution of $Ax = b$ is a least-squares solution.
b. The least-squares solution of $Ax = b$ is the point in the column space of $A$ closest to $b$.
c. A least-squares solution of $Ax = b$ is a list of weights that, when applied to the columns of $A$, produces the orthogonal projection of $b$ onto $\operatorname{Col} A$.
d. If $\hat{x}$ is a least-squares solution of $Ax = b$, then $\hat{x} = (A^TA)^{-1}A^Tb$.
e. The normal equations always provide a reliable method for computing least-squares solutions.
f. If $A$ has a QR factorization, say $A = QR$, then the best way to find the least-squares solution of $Ax = b$ is to compute $\hat{x} = R^{-1}Q^Tb$.
19. Let $A$ be an $m \times n$ matrix. Use the steps below to show that a vector $x$ in $\mathbb{R}^n$ satisfies $Ax = 0$ if and only if $A^TAx = 0$. This will show that $\operatorname{Nul} A = \operatorname{Nul} A^TA$.
a. Show that if $Ax = 0$, then $A^TAx = 0$.
b. Suppose $A^TAx = 0$. Explain why $x^TA^TAx = 0$, and use this to show that $Ax = 0$.
20. Let $A$ be an $m \times n$ matrix such that $A^TA$ is invertible. Show that the columns of $A$ are linearly independent. [Careful: You may not assume that $A$ is invertible; it may not even be square.]
21. Let $A$ be an $m \times n$ matrix whose columns are linearly independent. [Careful: $A$ need not be square.]
a. Use Exercise 19 to show that $A^TA$ is an invertible matrix.
b. Explain why $A$ must have at least as many rows as columns.
c. Determine the rank of $A$.
22. Use Exercise 19 to show that $\operatorname{rank} A^TA = \operatorname{rank} A$. [Hint: How many columns does $A^TA$ have? How is this connected with the rank of $A^TA$?]
23. Suppose $A$ is $m \times n$ with linearly independent columns and $b$ is in $\mathbb{R}^m$. Use the normal equations to produce a formula for $\hat{b}$, the projection of $b$ onto $\operatorname{Col} A$. [Hint: Find $\hat{x}$ first. The formula does not require an orthogonal basis for $\operatorname{Col} A$.]
24. Find a formula for the least-squares solution of $Ax = b$ when the columns of $A$ are orthonormal.
25. Describe all least-squares solutions of the system
$$x + y = 2$$
$$x + y = 4$$
26. [M] Example 3 in Section 4.8 displayed a low-pass linear filter that changed a signal $\{y_k\}$ into $\{y_{k+1}\}$ and changed a higher-frequency signal $\{w_k\}$ into the zero signal, where $y_k = \cos(\pi k/4)$ and $w_k = \cos(3\pi k/4)$. The following calculations will design a filter with approximately those properties. The filter equation is
$$a_0 y_{k+2} + a_1 y_{k+1} + a_2 y_k = z_k \quad\text{for all } k \tag{8}$$
Because the signals are periodic, with period 8, it suffices to study equation (8) for $k = 0, \dots, 7$. The action on the two signals described above translates into two sets of eight equations. For $k = 0, \dots, 7$, with columns $y_{k+2}$, $y_{k+1}$, $y_k$:
$$\begin{bmatrix}0 & .7 & 1\\-.7 & 0 & .7\\-1 & -.7 & 0\\-.7 & -1 & -.7\\0 & -.7 & -1\\.7 & 0 & -.7\\1 & .7 & 0\\.7 & 1 & .7\end{bmatrix}\begin{bmatrix}a_0\\a_1\\a_2\end{bmatrix} = \begin{bmatrix}.7\\0\\-.7\\-1\\-.7\\0\\.7\\1\end{bmatrix}$$
and, with columns $w_{k+2}$, $w_{k+1}$, $w_k$:
$$\begin{bmatrix}0 & -.7 & 1\\.7 & 0 & -.7\\-1 & .7 & 0\\.7 & -1 & .7\\0 & .7 & -1\\-.7 & 0 & .7\\1 & -.7 & 0\\-.7 & 1 & -.7\end{bmatrix}\begin{bmatrix}a_0\\a_1\\a_2\end{bmatrix} = \begin{bmatrix}0\\0\\0\\0\\0\\0\\0\\0\end{bmatrix}$$
Write an equation $Ax = b$, where $A$ is a $16 \times 3$ matrix formed from the two coefficient matrices above and where $b$ in $\mathbb{R}^{16}$ is formed from the two right sides of the equations. Find $a_0$, $a_1$, and $a_2$ given by the least-squares solution of $Ax = b$. (The .7 in the data above was used as an approximation for $\sqrt{2}/2$, to illustrate how a typical computation in an applied problem might proceed. If .707 were used instead, the resulting filter coefficients would agree to at least seven decimal places with $\sqrt{2}/4$, $1/2$, and $\sqrt{2}/4$, the values produced by exact arithmetic calculations.)
SOLUTIONS TO PRACTICE PROBLEMS
1. First, compute
$$A^TA = \begin{bmatrix}1 & 1 & 1\\-3 & 5 & 7\\-3 & 1 & 2\end{bmatrix}\begin{bmatrix}1 & -3 & -3\\1 & 5 & 1\\1 & 7 & 2\end{bmatrix} = \begin{bmatrix}3 & 9 & 0\\9 & 83 & 28\\0 & 28 & 14\end{bmatrix}$$
$$A^Tb = \begin{bmatrix}1 & 1 & 1\\-3 & 5 & 7\\-3 & 1 & 2\end{bmatrix}\begin{bmatrix}5\\-3\\-5\end{bmatrix} = \begin{bmatrix}-3\\-65\\-28\end{bmatrix}$$
Next, row reduce the augmented matrix for the normal equations, $A^TAx = A^Tb$:
$$\begin{bmatrix}3 & 9 & 0 & -3\\9 & 83 & 28 & -65\\0 & 28 & 14 & -28\end{bmatrix} \sim \begin{bmatrix}1 & 3 & 0 & -1\\0 & 56 & 28 & -56\\0 & 28 & 14 & -28\end{bmatrix} \sim \cdots \sim \begin{bmatrix}1 & 0 & -3/2 & 2\\0 & 1 & 1/2 & -1\\0 & 0 & 0 & 0\end{bmatrix}$$
The general least-squares solution is $x_1 = 2 + \frac{3}{2}x_3$, $x_2 = -1 - \frac{1}{2}x_3$, with $x_3$ free. For one specific solution, take $x_3 = 0$ (for example), and get
$$\hat{x} = \begin{bmatrix}2\\-1\\0\end{bmatrix}$$
To find the least-squares error, compute
$$\hat{b} = A\hat{x} = \begin{bmatrix}1 & -3 & -3\\1 & 5 & 1\\1 & 7 & 2\end{bmatrix}\begin{bmatrix}2\\-1\\0\end{bmatrix} = \begin{bmatrix}5\\-3\\-5\end{bmatrix}$$
It turns out that $\hat{b} = b$, so $\|b - \hat{b}\| = 0$. The least-squares error is zero because $b$ happens to be in $\operatorname{Col} A$.
2. If $b$ is orthogonal to the columns of $A$, then the projection of $b$ onto the column space of $A$ is $0$. In this case, a least-squares solution $\hat{x}$ of $Ax = b$ satisfies $A\hat{x} = 0$.
6.6 APPLICATIONS TO LINEAR MODELS
A common task in science and engineering is to analyze and understand relationships among several quantities that vary. This section describes a variety of situations in which data are used to build or verify a formula that predicts the value of one variable as a function of other variables. In each case, the problem will amount to solving a least-squares problem.
For easy application of the discussion to real problems that you may encounter later in your career, we choose notation that is commonly used in the statistical analysis of scientific and engineering data. Instead of $Ax = b$, we write $X\beta = y$ and refer to $X$ as the design matrix, $\beta$ as the parameter vector, and $y$ as the observation vector.
Least-Squares Lines

The simplest relation between two variables $x$ and $y$ is the linear equation $y = \beta_0 + \beta_1 x$.¹ Experimental data often produce points $(x_1, y_1), \dots, (x_n, y_n)$ that,

¹This notation is commonly used for least-squares lines instead of $y = mx + b$.
when graphed, seem to lie close to a line. We want to determine the parameters $\beta_0$ and $\beta_1$ that make the line as "close" to the points as possible.
Suppose $\beta_0$ and $\beta_1$ are fixed, and consider the line $y = \beta_0 + \beta_1 x$ in Fig. 1. Corresponding to each data point $(x_j, y_j)$ there is a point $(x_j, \beta_0 + \beta_1 x_j)$ on the line with the same $x$-coordinate. We call $y_j$ the observed value of $y$ and $\beta_0 + \beta_1 x_j$ the predicted $y$-value (determined by the line). The difference between an observed $y$-value and a predicted $y$-value is called a residual.
[Figure: data points $(x_j, y_j)$ and points $(x_j, \beta_0 + \beta_1 x_j)$ on the line $y = \beta_0 + \beta_1 x$, with residuals shown as vertical gaps.]
FIGURE 1  Fitting a line to experimental data.
There are several ways to measure how "close" the line is to the data. The usual choice (primarily because the mathematical calculations are simple) is to add the squares of the residuals. The least-squares line is the line $y = \beta_0 + \beta_1 x$ that minimizes the sum of the squares of the residuals. This line is also called a line of regression of $y$ on $x$, because any errors in the data are assumed to be only in the $y$-coordinates. The coefficients $\beta_0$, $\beta_1$ of the line are called (linear) regression coefficients.²
If the data points were on the line, the parameters $\beta_0$ and $\beta_1$ would satisfy the equations (predicted $y$-value on the left, observed $y$-value on the right):
$$\begin{aligned}\beta_0 + \beta_1 x_1 &= y_1\\\beta_0 + \beta_1 x_2 &= y_2\\&\;\vdots\\\beta_0 + \beta_1 x_n &= y_n\end{aligned}$$
We can write this system as
$$X\beta = y, \quad\text{where } X = \begin{bmatrix}1 & x_1\\1 & x_2\\\vdots & \vdots\\1 & x_n\end{bmatrix}, \quad \beta = \begin{bmatrix}\beta_0\\\beta_1\end{bmatrix}, \quad y = \begin{bmatrix}y_1\\y_2\\\vdots\\y_n\end{bmatrix} \tag{1}$$
Of course, if the data points don't lie on a line, then there are no parameters $\beta_0$, $\beta_1$ for which the predicted $y$-values in $X\beta$ equal the observed $y$-values in $y$, and $X\beta = y$ has no solution. This is a least-squares problem, $Ax = b$, with different notation!
The square of the distance between the vectors $X\beta$ and $y$ is precisely the sum of the squares of the residuals. The $\beta$ that minimizes this sum also minimizes the distance between $X\beta$ and $y$. Computing the least-squares solution of $X\beta = y$ is equivalent to finding the $\beta$ that determines the least-squares line in Fig. 1.

²If the measurement errors are in $x$ instead of $y$, simply interchange the coordinates of the data $(x_j, y_j)$ before plotting the points and computing the regression line. If both coordinates are subject to possible error, then you might choose the line that minimizes the sum of the squares of the orthogonal (perpendicular) distances from the points to the line. See the Practice Problems for Section 7.5.
EXAMPLE 1  Find the equation $y = \beta_0 + \beta_1 x$ of the least-squares line that best fits the data points $(2, 1)$, $(5, 2)$, $(7, 3)$, and $(8, 3)$.
SOLUTION  Use the $x$-coordinates of the data to build the design matrix $X$ in (1) and the $y$-coordinates to build the observation vector $y$:
$$X = \begin{bmatrix}1 & 2\\1 & 5\\1 & 7\\1 & 8\end{bmatrix}, \qquad y = \begin{bmatrix}1\\2\\3\\3\end{bmatrix}$$
For the least-squares solution of $X\beta = y$, obtain the normal equations (with the new notation):
$$X^TX\beta = X^Ty$$
That is, compute
$$X^TX = \begin{bmatrix}1 & 1 & 1 & 1\\2 & 5 & 7 & 8\end{bmatrix}\begin{bmatrix}1 & 2\\1 & 5\\1 & 7\\1 & 8\end{bmatrix} = \begin{bmatrix}4 & 22\\22 & 142\end{bmatrix}$$
$$X^Ty = \begin{bmatrix}1 & 1 & 1 & 1\\2 & 5 & 7 & 8\end{bmatrix}\begin{bmatrix}1\\2\\3\\3\end{bmatrix} = \begin{bmatrix}9\\57\end{bmatrix}$$
The normal equations are
$$\begin{bmatrix}4 & 22\\22 & 142\end{bmatrix}\begin{bmatrix}\beta_0\\\beta_1\end{bmatrix} = \begin{bmatrix}9\\57\end{bmatrix}$$
Hence
$$\begin{bmatrix}\beta_0\\\beta_1\end{bmatrix} = \begin{bmatrix}4 & 22\\22 & 142\end{bmatrix}^{-1}\begin{bmatrix}9\\57\end{bmatrix} = \frac{1}{84}\begin{bmatrix}142 & -22\\-22 & 4\end{bmatrix}\begin{bmatrix}9\\57\end{bmatrix} = \frac{1}{84}\begin{bmatrix}24\\30\end{bmatrix} = \begin{bmatrix}2/7\\5/14\end{bmatrix}$$
Thus the least-squares line has the equation
$$y = \frac{2}{7} + \frac{5}{14}x$$
See Fig. 2.
FIGURE 2  The least-squares line $y = \frac{2}{7} + \frac{5}{14}x$.
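Example 1's line can be reproduced in a few lines of numpy (a sketch, not part of the text); the normal equations give $\beta = (2/7, 5/14)$ exactly:

```python
import numpy as np

# Data points (2,1), (5,2), (7,3), (8,3)
x = np.array([2.0, 5.0, 7.0, 8.0])
y = np.array([1.0, 2.0, 3.0, 3.0])

# Design matrix: a column of ones and the x-coordinates, as in (1)
X = np.column_stack([np.ones_like(x), x])

# Solve the normal equations X^T X beta = X^T y
beta = np.linalg.solve(X.T @ X, X.T @ y)

print(beta)       # [0.2857... 0.3571...], i.e. [2/7, 5/14]
```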
A common practice before computing a least-squares line is to compute the average $\bar{x}$ of the original $x$-values and form a new variable $x^* = x - \bar{x}$. The new $x$-data are said to be in mean-deviation form. In this case, the two columns of the design matrix will be orthogonal. Solution of the normal equations is simplified, just as in Example 4 in Section 6.5. See Exercises 17 and 18.
The General Linear Model

In some applications, it is necessary to fit data points with something other than a straight line. In the examples that follow, the matrix equation is still $X\beta = y$, but the specific form of $X$ changes from one problem to the next. Statisticians usually introduce a residual vector $\epsilon$, defined by $\epsilon = y - X\beta$, and write
$$y = X\beta + \epsilon$$
Any equation of this form is referred to as a linear model. Once $X$ and $y$ are determined, the goal is to minimize the length of $\epsilon$, which amounts to finding a least-squares solution of $X\beta = y$. In each case, the least-squares solution $\hat{\beta}$ is a solution of the normal equations
$$X^TX\beta = X^Ty$$
Least-Squares Fitting of Other Curves

When data points $(x_1, y_1), \dots, (x_n, y_n)$ on a scatter plot do not lie close to any line, it may be appropriate to postulate some other functional relationship between $x$ and $y$.
The next two examples show how to fit data by curves that have the general form
$$y = \beta_0 f_0(x) + \beta_1 f_1(x) + \cdots + \beta_k f_k(x) \tag{2}$$
where $f_0, \dots, f_k$ are known functions and $\beta_0, \dots, \beta_k$ are parameters that must be determined. As we will see, equation (2) describes a linear model because it is linear in the unknown parameters.
For a particular value of $x$, (2) gives a predicted, or "fitted," value of $y$. The difference between the observed value and the predicted value is the residual. The parameters $\beta_0, \dots, \beta_k$ must be determined so as to minimize the sum of the squares of the residuals.
EXAMPLE 2  Suppose data points $(x_1, y_1), \dots, (x_n, y_n)$ appear to lie along some sort of parabola instead of a straight line. For instance, if the $x$-coordinate denotes the production level for a company, and $y$ denotes the average cost per unit of operating at a level of $x$ units per day, then a typical average cost curve looks like a parabola that opens upward (Fig. 3).
[FIGURE 3  Average cost curve: average cost per unit versus units produced.]
In ecology, a parabolic curve that opens downward is used to model the net primary production of nutrients in a plant, as a function of the surface area of the foliage (Fig. 4). Suppose we wish to approximate the data by an equation of the form
$$y = \beta_0 + \beta_1 x + \beta_2 x^2 \tag{3}$$
Describe the linear model that produces a "least-squares fit" of the data by equation (3).
SOLUTION  Equation (3) describes the ideal relationship. Suppose the actual values of
[FIGURE 4  Production of nutrients: net primary production versus surface area of foliage.]
the parameters are $\beta_0$, $\beta_1$, $\beta_2$. Then the coordinates of the first data point $(x_1, y_1)$ satisfy an equation of the form
$$y_1 = \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + \epsilon_1$$
where $\epsilon_1$ is the residual error between the observed value $y_1$ and the predicted $y$-value $\beta_0 + \beta_1 x_1 + \beta_2 x_1^2$. Each data point determines a similar equation:
$$\begin{aligned}y_1 &= \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + \epsilon_1\\y_2 &= \beta_0 + \beta_1 x_2 + \beta_2 x_2^2 + \epsilon_2\\&\;\vdots\\y_n &= \beta_0 + \beta_1 x_n + \beta_2 x_n^2 + \epsilon_n\end{aligned}$$
It is a simple matter to write this system of equations in the form $y = X\beta + \epsilon$. To find $X$, inspect the first few rows of the system and look for the pattern:
$$\begin{bmatrix}y_1\\y_2\\\vdots\\y_n\end{bmatrix} = \begin{bmatrix}1 & x_1 & x_1^2\\1 & x_2 & x_2^2\\\vdots & \vdots & \vdots\\1 & x_n & x_n^2\end{bmatrix}\begin{bmatrix}\beta_0\\\beta_1\\\beta_2\end{bmatrix} + \begin{bmatrix}\epsilon_1\\\epsilon_2\\\vdots\\\epsilon_n\end{bmatrix}$$
$$y = X\beta + \epsilon$$
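Building the quadratic design matrix is the only new step; after that, the fit is the same least-squares solve. A sketch with made-up data lying exactly on a parabola (the data are hypothetical, for illustration only):

```python
import numpy as np

# Hypothetical data generated from y = 1 + 2x + 3x^2 (so the fit is exact)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.0 + 2.0 * x + 3.0 * x**2

# Design matrix with columns 1, x, x^2, as in the display above
X = np.column_stack([np.ones_like(x), x, x**2])

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)        # [1. 2. 3.], the parameters are recovered
```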
EXAMPLE 3  If data points tend to follow a pattern such as in Fig. 5 (data points along a cubic curve), then an appropriate model might be an equation of the form
$$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3$$
Such data, for instance, could come from a company's total costs, as a function of the level of production. Describe the linear model that gives a least-squares fit of this type to data $(x_1, y_1), \dots, (x_n, y_n)$.
SOLUTION  By an analysis similar to that in Example 2, we obtain the observation vector, design matrix, parameter vector, and residual vector:
$$y = \begin{bmatrix}y_1\\y_2\\\vdots\\y_n\end{bmatrix}, \quad X = \begin{bmatrix}1 & x_1 & x_1^2 & x_1^3\\1 & x_2 & x_2^2 & x_2^3\\\vdots & \vdots & \vdots & \vdots\\1 & x_n & x_n^2 & x_n^3\end{bmatrix}, \quad \beta = \begin{bmatrix}\beta_0\\\beta_1\\\beta_2\\\beta_3\end{bmatrix}, \quad \epsilon = \begin{bmatrix}\epsilon_1\\\epsilon_2\\\vdots\\\epsilon_n\end{bmatrix}$$
Multiple Regression

Suppose an experiment involves two independent variables, say $u$ and $v$, and one dependent variable, $y$. A simple equation for predicting $y$ from $u$ and $v$ has the form
$$y = \beta_0 + \beta_1 u + \beta_2 v \tag{4}$$
A more general prediction equation might have the form
$$y = \beta_0 + \beta_1 u + \beta_2 v + \beta_3 u^2 + \beta_4 uv + \beta_5 v^2 \tag{5}$$
This equation is used in geology, for instance, to model erosion surfaces, glacial cirques, soil pH, and other quantities. In such cases, the least-squares fit is called a trend surface.
Equations (4) and (5) both lead to a linear model because they are linear in the unknown parameters (even though $u$ and $v$ are multiplied). In general, a linear model will arise whenever $y$ is to be predicted by an equation of the form
$$y = \beta_0 f_0(u, v) + \beta_1 f_1(u, v) + \cdots + \beta_k f_k(u, v)$$
with $f_0, \dots, f_k$ any sort of known functions and $\beta_0, \dots, \beta_k$ unknown weights.
EXAMPLE 4  In geography, local models of terrain are constructed from data $(u_1, v_1, y_1), \dots, (u_n, v_n, y_n)$, where $u_j$, $v_j$, and $y_j$ are latitude, longitude, and altitude, respectively. Describe the linear model based on (4) that gives a least-squares fit to such data. The solution is called the least-squares plane. See Fig. 6.
[FIGURE 6  A least-squares plane.]
SOLUTION  We expect the data to satisfy the following equations:
$$\begin{aligned}y_1 &= \beta_0 + \beta_1 u_1 + \beta_2 v_1 + \epsilon_1\\y_2 &= \beta_0 + \beta_1 u_2 + \beta_2 v_2 + \epsilon_2\\&\;\vdots\\y_n &= \beta_0 + \beta_1 u_n + \beta_2 v_n + \epsilon_n\end{aligned}$$
This system has the matrix form $y = X\beta + \epsilon$, where $y$ is the observation vector, $X$ the design matrix, $\beta$ the parameter vector, and $\epsilon$ the residual vector:
$$y = \begin{bmatrix}y_1\\y_2\\\vdots\\y_n\end{bmatrix}, \quad X = \begin{bmatrix}1 & u_1 & v_1\\1 & u_2 & v_2\\\vdots & \vdots & \vdots\\1 & u_n & v_n\end{bmatrix}, \quad \beta = \begin{bmatrix}\beta_0\\\beta_1\\\beta_2\end{bmatrix}, \quad \epsilon = \begin{bmatrix}\epsilon_1\\\epsilon_2\\\vdots\\\epsilon_n\end{bmatrix}$$
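Fitting the least-squares plane is the same computation with a three-column design matrix. A sketch with hypothetical terrain data generated from a known plane (assumption: exact, noise-free data, so the fit recovers the plane):

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.uniform(0, 10, size=20)     # hypothetical "latitude" values
v = rng.uniform(0, 10, size=20)     # hypothetical "longitude" values
y = 3.0 + 2.0 * u - 1.0 * v         # altitudes on the plane y = 3 + 2u - v

X = np.column_stack([np.ones_like(u), u, v])   # design matrix from (4)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta)        # [ 3.  2. -1.], the plane is recovered
```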
Example 4 shows that the linear model for multiple regression has the same abstract form as the model for the simple regression in the earlier examples. Linear algebra gives us the power to understand the general principle behind all the linear models. Once $X$ is defined properly, the normal equations for $\beta$ have the same matrix form, no matter how many variables are involved. Thus, for any linear model where $X^TX$ is invertible, the least-squares $\hat{\beta}$ is given by $(X^TX)^{-1}X^Ty$.

Further Reading
Ferguson, J., Introduction to Linear Algebra in Geology (New York: Chapman & Hall, 1994).
Krumbein, W. C., and F. A. Graybill, An Introduction to Statistical Models in Geology (New York: McGraw-Hill, 1965).
Legendre, P., and L. Legendre, Numerical Ecology (Amsterdam: Elsevier, 1998).
Unwin, David J., An Introduction to Trend Surface Analysis, Concepts and Techniques in Modern Geography, No. 5 (Norwich, England: GeoBooks, 1975).
PRACTICE PROBLEM
When the monthly sales of a product are subject to seasonal fluctuations, a curve that approximates the sales data might have the form
$$y = \beta_0 + \beta_1 x + \beta_2 \sin(2\pi x/12)$$
where $x$ is the time in months. The term $\beta_0 + \beta_1 x$ gives the basic sales trend, and the sine term reflects the seasonal changes in sales. Give the design matrix and the parameter vector for the linear model that leads to a least-squares fit of the equation above. Assume the data are $(x_1, y_1), \dots, (x_n, y_n)$.
6.6 EXERCISES
In Exercises 1–4, find the equation $y = \beta_0 + \beta_1 x$ of the least-squares line that best fits the given data points.
1. $(0, 1)$, $(1, 1)$, $(2, 2)$, $(3, 2)$
2. $(1, 0)$, $(2, 1)$, $(4, 2)$, $(5, 3)$
3. $(-1, 0)$, $(0, 1)$, $(1, 2)$, $(2, 4)$
4. $(2, 3)$, $(3, 2)$, $(5, 1)$, $(6, 0)$
5. Let $X$ be the design matrix used to find the least-squares line to fit data $(x_1, y_1), \dots, (x_n, y_n)$. Use a theorem in Section 6.5 to show that the normal equations have a unique solution if and only if the data include at least two data points with different $x$-coordinates.
6. Let $X$ be the design matrix in Example 2 corresponding to a least-squares fit of a parabola to data $(x_1, y_1), \dots, (x_n, y_n)$. Suppose $x_1$, $x_2$, and $x_3$ are distinct. Explain why there is only one parabola that fits the data best, in a least-squares sense. (See Exercise 5.)
7. A certain experiment produces the data $(1, 1.8)$, $(2, 2.7)$, $(3, 3.4)$, $(4, 3.8)$, $(5, 3.9)$. Describe the model that produces a least-squares fit of these points by a function of the form
$$y = \beta_1 x + \beta_2 x^2$$
Such a function might arise, for example, as the revenue from the sale of $x$ units of a product, when the amount offered for sale affects the price to be set for the product.
a. Give the design matrix, the observation vector, and the unknown parameter vector.
b. [M] Find the associated least-squares curve for the data.
8. A simple curve that often makes a good model for the variable costs of a company, as a function of the sales level $x$, has the form $y = \beta_1 x + \beta_2 x^2 + \beta_3 x^3$. There is no constant term because fixed costs are not included.
a. Give the design matrix and the parameter vector for the linear model that leads to a least-squares fit of the equation above, with data $(x_1, y_1), \dots, (x_n, y_n)$.
b. [M] Find the least-squares curve of the form above to fit the data $(4, 1.58)$, $(6, 2.08)$, $(8, 2.5)$, $(10, 2.8)$, $(12, 3.1)$, $(14, 3.4)$, $(16, 3.8)$, and $(18, 4.32)$, with values in thousands. If possible, produce a graph that shows the data points and the graph of the cubic approximation.
9. A certain experiment produces the data $(1, 7.9)$, $(2, 5.4)$, and $(3, -.9)$. Describe the model that produces a least-squares fit of these points by a function of the form
$$y = A\cos x + B\sin x$$
10. Suppose radioactive substances A and B have decay constants of .02 and .07, respectively. If a mixture of these two substances at time $t = 0$ contains $M_A$ grams of A and $M_B$ grams of B, then a model for the total amount $y$ of the mixture present at time $t$ is
$$y = M_A e^{-.02t} + M_B e^{-.07t} \tag{6}$$
Suppose the initial amounts $M_A$ and $M_B$ are unknown, but a scientist is able to measure the total amounts present at several times and records the following points $(t_i, y_i)$: $(10, 21.34)$, $(11, 20.68)$, $(12, 20.05)$, $(14, 18.87)$, and $(15, 18.30)$.
a. Describe a linear model that can be used to estimate $M_A$ and $M_B$.
b. [M] Find the least-squares curve based on (6).
[Photo caption: Halley's Comet last appeared in 1986 and will reappear in 2061.]
11. [M] According to Kepler's first law, a comet should have an elliptic, parabolic, or hyperbolic orbit (with gravitational attractions from the planets ignored). In suitable polar coordinates, the position $(r, \vartheta)$ of a comet satisfies an equation of the form
$$r = \beta + e(r \cos\vartheta)$$
where $\beta$ is a constant and $e$ is the eccentricity of the orbit, with $0 \le e < 1$ for an ellipse, $e = 1$ for a parabola, and $e > 1$ for a hyperbola. Suppose observations of a newly discovered comet provide the data below. Determine the type of orbit, and predict where the comet will be when $\vartheta = 4.6$ (radians).³

$\vartheta$:  .88   1.10   1.42   1.77   2.14
$r$:  3.00   2.30   1.65   1.25   1.01

12. [M] A healthy child's systolic blood pressure $p$ (in millimeters of mercury) and weight $w$ (in pounds) are approximately related by the equation
$$\beta_0 + \beta_1 \ln w = p$$
Use the following experimental data to estimate the systolic blood pressure of a healthy child weighing 100 pounds.

$w$:  44   61   81   113   131
$\ln w$:  3.78   4.11   4.39   4.73   4.88
$p$:  91   98   103   110   112

³The basic idea of least-squares fitting of data is due to K. F. Gauss (and, independently, to A. Legendre), whose initial rise to fame occurred in 1801 when he used the method to determine the path of the asteroid Ceres. Forty days after the asteroid was discovered, it disappeared behind the sun. Gauss predicted it would appear ten months later and gave its location. The accuracy of the prediction astonished the European scientific community.
13. [M] To measure the takeoff performance of an airplane, the horizontal position of the plane was measured every second, from $t = 0$ to $t = 12$. The positions (in feet) were: 0, 8.8, 29.9, 62.0, 104.7, 159.1, 222.0, 294.5, 380.4, 471.1, 571.7, 686.8, and 809.2.
a. Find the least-squares cubic curve $y = \beta_0 + \beta_1 t + \beta_2 t^2 + \beta_3 t^3$ for these data.
b. Use the result of part (a) to estimate the velocity of the plane when $t = 4.5$ seconds.
14. Let $\bar{x} = \frac{1}{n}(x_1 + \cdots + x_n)$ and $\bar{y} = \frac{1}{n}(y_1 + \cdots + y_n)$. Show that the least-squares line for the data $(x_1, y_1), \dots, (x_n, y_n)$ must pass through $(\bar{x}, \bar{y})$. That is, show that $\bar{x}$ and $\bar{y}$ satisfy the linear equation $\bar{y} = \hat{\beta}_0 + \hat{\beta}_1\bar{x}$. [Hint: Derive this equation from the vector equation $y = X\hat{\beta} + \epsilon$. Denote the first column of $X$ by $\mathbf{1}$. Use the fact that the residual vector $\epsilon$ is orthogonal to the column space of $X$ and hence is orthogonal to $\mathbf{1}$.]
Given data for a least-squares problem, $(x_1, y_1), \dots, (x_n, y_n)$, the following abbreviations are helpful:
$$\textstyle\sum x = \sum_{i=1}^n x_i, \quad \sum x^2 = \sum_{i=1}^n x_i^2, \quad \sum y = \sum_{i=1}^n y_i, \quad \sum xy = \sum_{i=1}^n x_i y_i$$
The normal equations for a least-squares line $y = \hat{\beta}_0 + \hat{\beta}_1 x$ may be written in the form
$$n\hat{\beta}_0 + \hat{\beta}_1 \textstyle\sum x = \sum y$$
$$\hat{\beta}_0 \textstyle\sum x + \hat{\beta}_1 \sum x^2 = \sum xy \tag{7}$$
15. Derive the normal equations (7) from the matrix form given in this section.
16. Use a matrix inverse to solve the system of equations in (7) and thereby obtain formulas for $\hat{\beta}_0$ and $\hat{\beta}_1$ that appear in many statistics texts.
17. a. Rewrite the data in Example 1 with new $x$-coordinates in mean-deviation form. Let $X$ be the associated design matrix. Why are the columns of $X$ orthogonal?
b. Write the normal equations for the data in part (a), and solve them to find the least-squares line, $y = \beta_0 + \beta_1 x^*$, where $x^* = x - 5.5$.
18. Suppose the $x$-coordinates of the data $(x_1, y_1), \dots, (x_n, y_n)$ are in mean-deviation form, so that $\sum x_i = 0$. Show that if $X$ is the design matrix for the least-squares line in this case, then $X^TX$ is a diagonal matrix.
Exercises 19 and 20 involve a design matrix $X$ with two or more columns and a least-squares solution $\hat{\beta}$ of $y = X\beta$. Consider the following numbers.
(i) $\|X\hat{\beta}\|^2$, the sum of the squares of the "regression term." Denote this number by SS(R).
(ii) $\|y - X\hat{\beta}\|^2$, the sum of the squares for the error term. Denote this number by SS(E).
(iii) $\|y\|^2$, the "total" sum of the squares of the $y$-values. Denote this number by SS(T).
Every statistics text that discusses regression and the linear model $y = X\beta + \epsilon$ introduces these numbers, though terminology and notation vary somewhat. To simplify matters, assume that the mean of the $y$-values is zero. In this case, SS(T) is proportional to what is called the variance of the set of $y$-values.
19. Justify the equation SS(T) = SS(R) + SS(E). [Hint: Use a theorem, and explain why the hypotheses of the theorem are satisfied.] This equation is extremely important in statistics, both in regression theory and in the analysis of variance.
20. Show that $\|X\hat{\beta}\|^2 = \hat{\beta}^TX^Ty$. [Hint: Rewrite the left side and use the fact that $\hat{\beta}$ satisfies the normal equations.] This formula for SS(R) is used in statistics. From this and from Exercise 19, obtain the standard formula for SS(E):
$$\text{SS(E)} = y^Ty - \hat{\beta}^TX^Ty$$
SOLUTION TO PRACTICE PROBLEM
Construct $X$ and $\beta$ so that the $k$th row of $X\beta$ is the predicted $y$-value that corresponds to the data point $(x_k, y_k)$, namely,
$$\beta_0 + \beta_1 x_k + \beta_2 \sin(2\pi x_k/12)$$
It should be clear that
$$X = \begin{bmatrix}1 & x_1 & \sin(2\pi x_1/12)\\\vdots & \vdots & \vdots\\1 & x_n & \sin(2\pi x_n/12)\end{bmatrix}, \qquad \beta = \begin{bmatrix}\beta_0\\\beta_1\\\beta_2\end{bmatrix}$$
[Figure: sales trend with seasonal fluctuations.]
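The seasonal model behaves the same way numerically. A sketch with hypothetical monthly data generated from known parameters (illustration only, with noise-free data so the parameters are recovered exactly):

```python
import numpy as np

x = np.arange(1.0, 25.0)                      # 24 hypothetical months
# Hypothetical data from beta0=10, beta1=0.5, beta2=3 (no noise, for clarity)
y = 10.0 + 0.5 * x + 3.0 * np.sin(2 * np.pi * x / 12)

# Design matrix with columns 1, x, sin(2*pi*x/12), as in the solution above
X = np.column_stack([np.ones_like(x), x, np.sin(2 * np.pi * x / 12)])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta)        # [10.   0.5  3. ]
```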
6.7 INNER PRODUCT SPACES
Notions of length, distance, and orthogonality are often important in applicationsinvolvingavectorspace. For R
n, theseconceptswerebasedonthepropertiesoftheinnerproductlistedinTheorem 1ofSection 6.1. Forotherspaces, weneedanaloguesoftheinnerproductwiththesameproperties. TheconclusionsofTheorem 1nowbecomeaxioms inthefollowingdefinition.
DEFINITION  An inner product on a vector space V is a function that, to each pair of vectors u and v in V, associates a real number ⟨u, v⟩ and satisfies the following axioms, for all u, v, w in V and all scalars c:
1. ⟨u, v⟩ = ⟨v, u⟩
2. ⟨u + v, w⟩ = ⟨u, w⟩ + ⟨v, w⟩
3. ⟨cu, v⟩ = c⟨u, v⟩
4. ⟨u, u⟩ ≥ 0, and ⟨u, u⟩ = 0 if and only if u = 0

A vector space with an inner product is called an inner product space.
The vector space ℝⁿ with the standard inner product is an inner product space, and nearly everything discussed in this chapter for ℝⁿ carries over to inner product spaces. The examples in this section and the next lay the foundation for a variety of applications treated in courses in engineering, physics, mathematics, and statistics.
EXAMPLE 1  Fix any two positive numbers, say 4 and 5, and for vectors u = (u₁, u₂) and v = (v₁, v₂) in ℝ², set

⟨u, v⟩ = 4u₁v₁ + 5u₂v₂    (1)

Show that equation (1) defines an inner product.
SOLUTION  Certainly Axiom 1 is satisfied, because ⟨u, v⟩ = 4u₁v₁ + 5u₂v₂ = 4v₁u₁ + 5v₂u₂ = ⟨v, u⟩. If w = (w₁, w₂), then

⟨u + v, w⟩ = 4(u₁ + v₁)w₁ + 5(u₂ + v₂)w₂
           = 4u₁w₁ + 5u₂w₂ + 4v₁w₁ + 5v₂w₂
           = ⟨u, w⟩ + ⟨v, w⟩

This verifies Axiom 2. For Axiom 3, compute

⟨cu, v⟩ = 4(cu₁)v₁ + 5(cu₂)v₂ = c(4u₁v₁ + 5u₂v₂) = c⟨u, v⟩

For Axiom 4, note that ⟨u, u⟩ = 4u₁² + 5u₂² ≥ 0, and 4u₁² + 5u₂² = 0 only if u₁ = u₂ = 0, that is, if u = 0. Also, ⟨0, 0⟩ = 0. So (1) defines an inner product on ℝ².
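As a quick numerical sketch (our own code, not from the text), the weighted inner product of Example 1 can be written as a small Python function and the axioms spot-checked on sample vectors; the helper names `ip`, `add`, and `scale` are our own.

```python
# Weighted inner product of Example 1: <u, v> = 4*u1*v1 + 5*u2*v2,
# with the axioms spot-checked on a few sample vectors in R^2.

def ip(u, v):
    """Weighted inner product on R^2 with weights 4 and 5."""
    return 4 * u[0] * v[0] + 5 * u[1] * v[1]

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def scale(c, u):
    return (c * u[0], c * u[1])

u, v, w, c = (1.0, 2.0), (3.0, -1.0), (0.5, 4.0), 2.5

assert ip(u, v) == ip(v, u)                        # Axiom 1: symmetry
assert ip(add(u, v), w) == ip(u, w) + ip(v, w)     # Axiom 2: additivity
assert ip(scale(c, u), v) == c * ip(u, v)          # Axiom 3: homogeneity
assert ip(u, u) > 0 and ip((0, 0), (0, 0)) == 0    # Axiom 4 (sampled)
```

Of course, a finite spot check is not a proof; the algebraic verification in the solution above is what establishes the axioms for all vectors.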
Inner products similar to (1) can be defined on ℝⁿ. They arise naturally in connection with "weighted least-squares" problems, in which weights are assigned to the various entries in the sum for the inner product in such a way that more importance is given to the more reliable measurements.

From now on, when an inner product space involves polynomials or other functions, we will write the functions in the familiar way, rather than use the boldface type for vectors. Nevertheless, it is important to remember that each function is a vector when it is treated as an element of a vector space.
EXAMPLE 2  Let t₀, …, tₙ be distinct real numbers. For p and q in ℙₙ, define

⟨p, q⟩ = p(t₀)q(t₀) + p(t₁)q(t₁) + ⋯ + p(tₙ)q(tₙ)    (2)

Inner product Axioms 1–3 are readily checked. For Axiom 4, note that

⟨p, p⟩ = [p(t₀)]² + [p(t₁)]² + ⋯ + [p(tₙ)]² ≥ 0

Also, ⟨0, 0⟩ = 0. (The boldface zero here denotes the zero polynomial, the zero vector in ℙₙ.) If ⟨p, p⟩ = 0, then p must vanish at n + 1 points: t₀, …, tₙ. This is possible only if p is the zero polynomial, because the degree of p is less than n + 1. Thus (2) defines an inner product on ℙₙ.
EXAMPLE 3  Let V be ℙ₂, with the inner product from Example 2, where t₀ = 0, t₁ = 1/2, and t₂ = 1. Let p(t) = 12t² and q(t) = 2t − 1. Compute ⟨p, q⟩ and ⟨q, q⟩.

SOLUTION

⟨p, q⟩ = p(0)q(0) + p(1/2)q(1/2) + p(1)q(1)
       = (0)(−1) + (3)(0) + (12)(1) = 12

⟨q, q⟩ = [q(0)]² + [q(1/2)]² + [q(1)]²
       = (−1)² + (0)² + (1)² = 2
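The evaluation inner product of Example 2 is easy to implement directly; the following sketch (our code, with our own helper name `eval_ip`) reproduces the two computations of Example 3.

```python
# Evaluation inner product <p, q> = p(t0)q(t0) + ... + p(tn)q(tn),
# applied to Example 3: points 0, 1/2, 1, with p(t) = 12t^2, q(t) = 2t - 1.

def eval_ip(p, q, points):
    """Inner product on P_n given by evaluation at the listed points."""
    return sum(p(t) * q(t) for t in points)

p = lambda t: 12 * t**2
q = lambda t: 2 * t - 1
pts = [0.0, 0.5, 1.0]

print(eval_ip(p, q, pts))   # prints 12.0, as computed in Example 3
print(eval_ip(q, q, pts))   # prints 2.0
```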
Lengths, Distances, and Orthogonality
Let V be an inner product space, with the inner product denoted by ⟨u, v⟩. Just as in ℝⁿ, we define the length, or norm, of a vector v to be the scalar

‖v‖ = √⟨v, v⟩

Equivalently, ‖v‖² = ⟨v, v⟩. (This definition makes sense because ⟨v, v⟩ ≥ 0, but the definition does not say that ⟨v, v⟩ is a "sum of squares," because v need not be an element of ℝⁿ.)
A unit vector is one whose length is 1. The distance between u and v is ‖u − v‖. Vectors u and v are orthogonal if ⟨u, v⟩ = 0.
EXAMPLE 4  Let ℙ₂ have the inner product (2) of Example 3. Compute the lengths of the vectors p(t) = 12t² and q(t) = 2t − 1.

SOLUTION

‖p‖² = ⟨p, p⟩ = [p(0)]² + [p(1/2)]² + [p(1)]²
     = 0 + [3]² + [12]² = 153

‖p‖ = √153

From Example 3, ⟨q, q⟩ = 2. Hence ‖q‖ = √2.
The Gram–Schmidt Process
The existence of orthogonal bases for finite-dimensional subspaces of an inner product space can be established by the Gram–Schmidt process, just as in ℝⁿ. Certain orthogonal bases that arise frequently in applications can be constructed by this process.
The orthogonal projection of a vector onto a subspace W with an orthogonal basis can be constructed as usual. The projection does not depend on the choice of orthogonal basis, and it has the properties described in the Orthogonal Decomposition Theorem and the Best Approximation Theorem.
EXAMPLE 5  Let V be ℙ₄ with the inner product in Example 2, involving evaluation of polynomials at −2, −1, 0, 1, and 2, and view ℙ₂ as a subspace of V. Produce an orthogonal basis for ℙ₂ by applying the Gram–Schmidt process to the polynomials 1, t, and t².

SOLUTION  The inner product depends only on the values of a polynomial at −2, …, 2, so we list the values of each polynomial as a vector in ℝ⁵, underneath the name of the polynomial:¹

Polynomial:          1                 t                 t²
Vector of values:  (1, 1, 1, 1, 1)   (−2, −1, 0, 1, 2)   (4, 1, 0, 1, 4)

The inner product of two polynomials in V equals the (standard) inner product of their corresponding vectors in ℝ⁵. Observe that t is orthogonal to the constant function 1. So take p₀(t) = 1 and p₁(t) = t. For p₂, use the vectors in ℝ⁵ to compute the projection of t² onto Span{p₀, p₁}:

⟨t², p₀⟩ = ⟨t², 1⟩ = 4 + 1 + 0 + 1 + 4 = 10
⟨p₀, p₀⟩ = 5
⟨t², p₁⟩ = ⟨t², t⟩ = −8 + (−1) + 0 + 1 + 8 = 0

The orthogonal projection of t² onto Span{1, t} is (10/5)p₀ + 0p₁. Thus

p₂(t) = t² − 2p₀(t) = t² − 2
An orthogonal basis for the subspace ℙ₂ of V is:

Polynomial:          p₀                p₁                p₂
Vector of values:  (1, 1, 1, 1, 1)   (−2, −1, 0, 1, 2)   (2, −1, −2, −1, 2)    (3)
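The single Gram–Schmidt step of Example 5 can be checked on the value vectors themselves; the sketch below (our code) subtracts the projection of the t² vector onto the first two vectors and recovers the values of p₂.

```python
# Gram-Schmidt on the value vectors of Example 5 (evaluation at -2..2):
# the inner product of two polynomials equals the dot product of their
# vectors of values.

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

p0 = [1, 1, 1, 1, 1]          # values of 1
p1 = [-2, -1, 0, 1, 2]        # values of t
t2 = [4, 1, 0, 1, 4]          # values of t^2

# Subtract from t^2 its projection onto Span{p0, p1}.
c0 = dot(t2, p0) / dot(p0, p0)     # 10/5 = 2
c1 = dot(t2, p1) / dot(p1, p1)     # 0/10 = 0
p2 = [t2[i] - c0 * p0[i] - c1 * p1[i] for i in range(5)]

print(p2)                          # [2.0, -1.0, -2.0, -1.0, 2.0]
assert dot(p2, p0) == 0 and dot(p2, p1) == 0
```

These are exactly the values of p₂(t) = t² − 2 listed in (3).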
Best Approximation in Inner Product Spaces
A common problem in applied mathematics involves a vector space V whose elements are functions. The problem is to approximate a function f in V by a function g from a specified subspace W of V. The "closeness" of the approximation of f depends on the way ‖f − g‖ is defined. We will consider only the case in which the distance between f and g is determined by an inner product. In this case, the best approximation to f by functions in W is the orthogonal projection of f onto the subspace W.

EXAMPLE 6  Let V be ℙ₄ with the inner product in Example 5, and let p₀, p₁, and p₂ be the orthogonal basis found in Example 5 for the subspace ℙ₂. Find the best approximation to p(t) = 5 − (1/2)t⁴ by polynomials in ℙ₂.
¹Each polynomial in ℙ₄ is uniquely determined by its values at the five numbers −2, …, 2. In fact, the correspondence between p and its vector of values is an isomorphism, that is, a one-to-one mapping onto ℝ⁵ that preserves linear combinations.
SOLUTION  The values of p₀, p₁, and p₂ at the numbers −2, −1, 0, 1, and 2 are listed in the ℝ⁵ vectors in (3) above. The corresponding values for p are −3, 9/2, 5, 9/2, and −3. Compute

⟨p, p₀⟩ = 8,  ⟨p, p₁⟩ = 0,  ⟨p, p₂⟩ = −31
⟨p₀, p₀⟩ = 5,  ⟨p₂, p₂⟩ = 14

Then the best approximation in V to p by polynomials in ℙ₂ is

p̂ = proj_{ℙ₂} p = (⟨p, p₀⟩/⟨p₀, p₀⟩)p₀ + (⟨p, p₁⟩/⟨p₁, p₁⟩)p₁ + (⟨p, p₂⟩/⟨p₂, p₂⟩)p₂
  = (8/5)p₀ − (31/14)p₂ = 8/5 − (31/14)(t² − 2)

This polynomial is the closest to p of all polynomials in ℙ₂, when the distance between polynomials is measured only at −2, −1, 0, 1, and 2. See Fig. 1.
[Figure 1: Graphs of p(t) and p̂(t).]
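The projection coefficients of Example 6 can be computed exactly with rational arithmetic; the sketch below (our code) works on the value vectors from (3) and recovers 8/5, 0, and −31/14.

```python
# Best approximation of Example 6 on value vectors: project the values of
# p(t) = 5 - t^4/2 at -2, -1, 0, 1, 2 onto Span{p0, p1, p2} using
# c_i = <p, p_i> / <p_i, p_i>, with exact fractions.

from fractions import Fraction as F

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

pts = [-2, -1, 0, 1, 2]
p0 = [F(1)] * 5
p1 = [F(t) for t in pts]
p2 = [F(2), F(-1), F(-2), F(-1), F(2)]
p  = [F(5) - F(t**4, 2) for t in pts]      # values -3, 9/2, 5, 9/2, -3

coeffs = [dot(p, b) / dot(b, b) for b in (p0, p1, p2)]
print(coeffs)   # [Fraction(8, 5), Fraction(0, 1), Fraction(-31, 14)]
```

These match the coefficients in p̂ = (8/5)p₀ − (31/14)p₂ found above.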
The polynomials p₀, p₁, and p₂ in Examples 5 and 6 belong to a class of polynomials that are referred to in statistics as orthogonal polynomials.² The orthogonality refers to the type of inner product described in Example 2.
Two Inequalities
Given a vector v in an inner product space V and given a finite-dimensional subspace W, we may apply the Pythagorean Theorem to the orthogonal decomposition of v with respect to W and obtain

‖v‖² = ‖proj_W v‖² + ‖v − proj_W v‖²

See Fig. 2. In particular, this shows that the norm of the projection of v onto W does not exceed the norm of v itself. This simple observation leads to the following important inequality.

[Figure 2: The hypotenuse is the longest side.]
THEOREM 16  The Cauchy–Schwarz Inequality
For all u, v in V,

|⟨u, v⟩| ≤ ‖u‖ ‖v‖    (4)
²See Statistics and Experimental Design in Engineering and the Physical Sciences, 2nd ed., by Norman L. Johnson and Fred C. Leone (New York: John Wiley & Sons, 1977). Tables there list "Orthogonal Polynomials," which are simply the values of the polynomials at numbers such as −2, −1, 0, 1, and 2.
PROOF  If u = 0, then both sides of (4) are zero, and hence the inequality is true in this case. (See Practice Problem 1.) If u ≠ 0, let W be the subspace spanned by u. Recall that ‖cu‖ = |c| ‖u‖ for any scalar c. Thus

‖proj_W v‖ = ‖(⟨v, u⟩/⟨u, u⟩)u‖ = (|⟨v, u⟩|/|⟨u, u⟩|)‖u‖ = (|⟨v, u⟩|/‖u‖²)‖u‖ = |⟨u, v⟩|/‖u‖

Since ‖proj_W v‖ ≤ ‖v‖, we have |⟨u, v⟩|/‖u‖ ≤ ‖v‖, which gives (4).
The Cauchy–Schwarz inequality is useful in many branches of mathematics. A few simple applications are presented in the exercises. Our main need for this inequality here is to prove another fundamental inequality involving norms of vectors. See Fig. 3.
THEOREM 17  The Triangle Inequality
For all u, v in V,

‖u + v‖ ≤ ‖u‖ + ‖v‖

PROOF

‖u + v‖² = ⟨u + v, u + v⟩ = ⟨u, u⟩ + 2⟨u, v⟩ + ⟨v, v⟩
         ≤ ‖u‖² + 2|⟨u, v⟩| + ‖v‖²
         ≤ ‖u‖² + 2‖u‖ ‖v‖ + ‖v‖²        (Cauchy–Schwarz)
         = (‖u‖ + ‖v‖)²

The triangle inequality follows immediately by taking square roots of both sides.

[Figure 3: The lengths of the sides of a triangle.]
An Inner Product for C[a, b] (Calculus required)
Probably the most widely used inner product space for applications is the vector space C[a, b] of all continuous functions on an interval a ≤ t ≤ b, with an inner product that we will describe.
We begin by considering a polynomial p and any integer n larger than or equal to the degree of p. Then p is in ℙₙ, and we may compute a "length" for p using the inner product of Example 2 involving evaluation at n + 1 points in [a, b]. However, this length of p captures the behavior at only those n + 1 points. Since p is in ℙₙ for all large n, we could use a much larger n, with many more points for the "evaluation" inner product. See Fig. 4.
[Figure 4: Using different numbers of evaluation points in [a, b] to compute ‖p‖².]
Let us partition [a, b] into n + 1 subintervals of length Δt = (b − a)/(n + 1), and let t₀, …, tₙ be arbitrary points in these subintervals.
If n is large, the inner product on ℙₙ determined by t₀, …, tₙ will tend to give a large value to ⟨p, p⟩, so we scale it down and divide by n + 1. Observe that 1/(n + 1) = Δt/(b − a), and define

⟨p, q⟩ = (1/(n + 1)) Σⱼ₌₀ⁿ p(tⱼ)q(tⱼ) = (1/(b − a)) [ Σⱼ₌₀ⁿ p(tⱼ)q(tⱼ) Δt ]

Now, let n increase without bound. Since polynomials p and q are continuous functions, the expression in brackets is a Riemann sum that approaches a definite integral, and we are led to consider the average value of p(t)q(t) on the interval [a, b]:

(1/(b − a)) ∫ₐᵇ p(t)q(t) dt

This quantity is defined for polynomials of any degree (in fact, for all continuous functions), and it has all the properties of an inner product, as the next example shows. The scale factor 1/(b − a) is inessential and is often omitted for simplicity.
EXAMPLE 7  For f, g in C[a, b], set

⟨f, g⟩ = ∫ₐᵇ f(t)g(t) dt    (5)

Show that (5) defines an inner product on C[a, b].

SOLUTION  Inner product Axioms 1–3 follow from elementary properties of definite integrals. For Axiom 4, observe that

⟨f, f⟩ = ∫ₐᵇ [f(t)]² dt ≥ 0

The function [f(t)]² is continuous and nonnegative on [a, b]. If the definite integral of [f(t)]² is zero, then [f(t)]² must be identically zero on [a, b], by a theorem in advanced calculus, in which case f is the zero function. Thus ⟨f, f⟩ = 0 implies that f is the zero function on [a, b]. So (5) defines an inner product on C[a, b].
EXAMPLE 8  Let V be the space C[0, 1] with the inner product of Example 7, and let W be the subspace spanned by the polynomials p₁(t) = 1, p₂(t) = 2t − 1, and p₃(t) = 12t². Use the Gram–Schmidt process to find an orthogonal basis for W.

SOLUTION  Let q₁ = p₁, and compute

⟨p₂, q₁⟩ = ∫₀¹ (2t − 1)(1) dt = (t² − t) |₀¹ = 0

So p₂ is already orthogonal to q₁, and we can take q₂ = p₂. For the projection of p₃ onto W₂ = Span{q₁, q₂}, compute

⟨p₃, q₁⟩ = ∫₀¹ 12t² · 1 dt = 4t³ |₀¹ = 4
⟨q₁, q₁⟩ = ∫₀¹ 1 · 1 dt = t |₀¹ = 1
⟨p₃, q₂⟩ = ∫₀¹ 12t²(2t − 1) dt = ∫₀¹ (24t³ − 12t²) dt = 2
⟨q₂, q₂⟩ = ∫₀¹ (2t − 1)² dt = (1/6)(2t − 1)³ |₀¹ = 1/3

Then

proj_{W₂} p₃ = (⟨p₃, q₁⟩/⟨q₁, q₁⟩)q₁ + (⟨p₃, q₂⟩/⟨q₂, q₂⟩)q₂ = (4/1)q₁ + (2/(1/3))q₂ = 4q₁ + 6q₂

and

q₃ = p₃ − proj_{W₂} p₃ = p₃ − 4q₁ − 6q₂

As a function, q₃(t) = 12t² − 4 − 6(2t − 1) = 12t² − 12t + 2. The orthogonal basis for the subspace W is {q₁, q₂, q₃}.
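For polynomials, the integral inner product of Example 7 can be evaluated exactly from coefficient lists, since ∫₀¹ tᵐ dt = 1/(m + 1). The sketch below (our code, not from the text) redoes Example 8 with exact rational arithmetic.

```python
# Example 8 with exact arithmetic.  A polynomial is a coefficient list
# [c0, c1, ...] meaning c0 + c1*t + ..., and <p, q> is the exact integral
# of p(t)q(t) over [0, 1].

from fractions import Fraction as F

def ip(p, q):
    """<p, q> = integral_0^1 p(t) q(t) dt, computed exactly."""
    total = F(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            total += F(a) * F(b) / (i + j + 1)   # integral of t^(i+j)
    return total

p1 = [1]          # 1
p2 = [-1, 2]      # 2t - 1
p3 = [0, 0, 12]   # 12t^2

q1, q2 = p1, p2                       # p2 is already orthogonal to p1
proj = (ip(p3, q1) / ip(q1, q1), ip(p3, q2) / ip(q2, q2))
print(proj)       # (Fraction(4, 1), Fraction(6, 1)): proj = 4*q1 + 6*q2

q3 = [2, -12, 12]                     # q3(t) = 12t^2 - 12t + 2, as above
assert ip(q3, q1) == 0 and ip(q3, q2) == 0
```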
PRACTICE PROBLEMS
Use the inner product axioms to verify the following statements.
1. ⟨v, 0⟩ = ⟨0, v⟩ = 0.
2. ⟨u, v + w⟩ = ⟨u, v⟩ + ⟨u, w⟩.
6.7 EXERCISES
1. Let ℝ² have the inner product of Example 1, and let x = (1, 1) and y = (5, −1).
   a. Find ‖x‖, ‖y‖, and |⟨x, y⟩|².
   b. Describe all vectors (z₁, z₂) that are orthogonal to y.
2. Let ℝ² have the inner product of Example 1. Show that the Cauchy–Schwarz inequality holds for x = (3, −2) and y = (−2, 1). [Suggestion: Study |⟨x, y⟩|².]
Exercises 3–8 refer to ℙ₂ with the inner product given by evaluation at −1, 0, and 1. (See Example 2.)
3. Compute ⟨p, q⟩, where p(t) = 4 + t, q(t) = 5 − 4t².
4. Compute ⟨p, q⟩, where p(t) = 3t − t², q(t) = 3 + 2t².
5. Compute ‖p‖ and ‖q‖, for p and q in Exercise 3.
6. Compute ‖p‖ and ‖q‖, for p and q in Exercise 4.
7. Compute the orthogonal projection of q onto the subspace spanned by p, for p and q in Exercise 3.
8. Compute the orthogonal projection of q onto the subspace spanned by p, for p and q in Exercise 4.
9. Let ℙ₃ have the inner product given by evaluation at −3, −1, 1, and 3. Let p₀(t) = 1, p₁(t) = t, and p₂(t) = t².
   a. Compute the orthogonal projection of p₂ onto the subspace spanned by p₀ and p₁.
   b. Find a polynomial q that is orthogonal to p₀ and p₁, such that {p₀, p₁, q} is an orthogonal basis for Span{p₀, p₁, p₂}. Scale the polynomial q so that its vector of values at (−3, −1, 1, 3) is (1, −1, −1, 1).
10. Let ℙ₃ have the inner product as in Exercise 9, with p₀, p₁, and q the polynomials described there. Find the best approximation to p(t) = t³ by polynomials in Span{p₀, p₁, q}.
11. Let p₀, p₁, and p₂ be the orthogonal polynomials described in Example 5, where the inner product on ℙ₄ is given by evaluation at −2, −1, 0, 1, and 2. Find the orthogonal projection of t³ onto Span{p₀, p₁, p₂}.
12. Find a polynomial p₃ such that {p₀, p₁, p₂, p₃} (see Exercise 11) is an orthogonal basis for the subspace ℙ₃ of ℙ₄. Scale the polynomial p₃ so that its vector of values is (−1, 2, 0, −2, 1).
13. Let A be any invertible n × n matrix. Show that for u, v in ℝⁿ, the formula ⟨u, v⟩ = (Au)·(Av) = (Au)ᵀ(Av) defines an inner product on ℝⁿ.
14. Let T be a one-to-one linear transformation from a vector space V into ℝⁿ. Show that for u, v in V, the formula ⟨u, v⟩ = T(u)·T(v) defines an inner product on V.
Use the inner product axioms and other results of this section to verify the statements in Exercises 15–18.
15. ⟨u, cv⟩ = c⟨u, v⟩ for all scalars c.
16. If {u, v} is an orthonormal set in V, then ‖u − v‖ = √2.
17. ⟨u, v⟩ = (1/4)‖u + v‖² − (1/4)‖u − v‖².
18. ‖u + v‖² + ‖u − v‖² = 2‖u‖² + 2‖v‖².
19. Given a ≥ 0 and b ≥ 0, let u = (√a, √b) and v = (√b, √a). Use the Cauchy–Schwarz inequality to compare the geometric mean √(ab) with the arithmetic mean (a + b)/2.
20. Let u = (a, b) and v = (1, 1). Use the Cauchy–Schwarz inequality to show that
    ((a + b)/2)² ≤ (a² + b²)/2
Exercises 21–24 refer to V = C[0, 1], with the inner product given by an integral, as in Example 7.
21. Compute ⟨f, g⟩, where f(t) = 1 − 3t² and g(t) = t − t³.
22. Compute ⟨f, g⟩, where f(t) = 5t − 3 and g(t) = t³ − t².
23. Compute ‖f‖ for f in Exercise 21.
24. Compute ‖g‖ for g in Exercise 22.
25. Let V be the space C[−1, 1] with the inner product of Example 7. Find an orthogonal basis for the subspace spanned by the polynomials 1, t, and t². The polynomials in this basis are called Legendre polynomials.
26. Let V be the space C[−2, 2] with the inner product of Example 7. Find an orthogonal basis for the subspace spanned by the polynomials 1, t, and t².
27. [M] Let ℙ₄ have the inner product as in Example 5, and let p₀, p₁, p₂ be the orthogonal polynomials from that example. Using your matrix program, apply the Gram–Schmidt process to the set {p₀, p₁, p₂, t³, t⁴} to create an orthogonal basis for ℙ₄.
28. [M] Let V be the space C[0, 2π] with the inner product of Example 7. Use the Gram–Schmidt process to create an orthogonal basis for the subspace spanned by {1, cos t, cos²t, cos³t}. Use a matrix program or computational program to compute the appropriate definite integrals.
SOLUTIONS TO PRACTICE PROBLEMS
1. By Axiom 1, ⟨v, 0⟩ = ⟨0, v⟩. Then ⟨0, v⟩ = ⟨0v, v⟩ = 0⟨v, v⟩, by Axiom 3, so ⟨0, v⟩ = 0.
2. By Axioms 1, 2, and then 1 again, ⟨u, v + w⟩ = ⟨v + w, u⟩ = ⟨v, u⟩ + ⟨w, u⟩ = ⟨u, v⟩ + ⟨u, w⟩.
6.8 APPLICATIONS OF INNER PRODUCT SPACES
The examples in this section suggest how the inner product spaces defined in Section 6.7 arise in practical problems. The first example is connected with the massive least-squares problem of updating the North American Datum, described in the chapter's introductory example.
Weighted Least-Squares
Let y be a vector of n observations, y₁, …, yₙ, and suppose we wish to approximate y by a vector ŷ that belongs to some specified subspace of ℝⁿ. (In Section 6.5, ŷ was written as Ax so that ŷ was in the column space of A.) Denote the entries in ŷ by ŷ₁, …, ŷₙ. Then the sum of the squares for error, or SS(E), in approximating y by ŷ is

SS(E) = (y₁ − ŷ₁)² + ⋯ + (yₙ − ŷₙ)²    (1)

This is simply ‖y − ŷ‖², using the standard length in ℝⁿ.
Now suppose the measurements that produced the entries in y are not equally reliable. (This was the case for the North American Datum, since measurements were made over a period of 140 years.) As another example, the entries in y might be computed from various samples of measurements, with unequal sample sizes. Then it becomes appropriate to weight the squared errors in (1) in such a way that more importance is assigned to the more reliable measurements.¹ If the weights are denoted by w₁², …, wₙ², then the weighted sum of the squares for error is

Weighted SS(E) = w₁²(y₁ − ŷ₁)² + ⋯ + wₙ²(yₙ − ŷₙ)²    (2)

This is the square of the length of y − ŷ, where the length is derived from an inner product analogous to that in Example 1 in Section 6.7, namely,

⟨x, y⟩ = w₁²x₁y₁ + ⋯ + wₙ²xₙyₙ
It is sometimes convenient to transform a weighted least-squares problem into an equivalent ordinary least-squares problem. Let W be the diagonal matrix with (positive) w₁, …, wₙ on its diagonal, so that

W y = [ w₁  0  ⋯  0  ] [ y₁ ]   [ w₁y₁ ]
      [ 0   w₂     ⋮ ] [ y₂ ] = [ w₂y₂ ]
      [ ⋮       ⋱    ] [ ⋮  ]   [  ⋮   ]
      [ 0   ⋯     wₙ ] [ yₙ ]   [ wₙyₙ ]

with a similar expression for W ŷ. Observe that the jth term in (2) can be written as

wⱼ²(yⱼ − ŷⱼ)² = (wⱼyⱼ − wⱼŷⱼ)²

It follows that the weighted SS(E) in (2) is the square of the ordinary length in ℝⁿ of Wy − Wŷ, which we write as ‖Wy − Wŷ‖².
Now suppose the approximating vector ŷ is to be constructed from the columns of a matrix A. Then we seek an x̂ that makes Ax̂ = ŷ as close to y as possible. However, the measure of closeness is the weighted error,

‖Wy − Wŷ‖² = ‖Wy − WAx̂‖²

Thus x̂ is the (ordinary) least-squares solution of the equation

WAx = Wy

The normal equation for the least-squares solution is

(WA)ᵀWAx = (WA)ᵀWy
EXAMPLE 1  Find the least-squares line y = β₀ + β₁x that best fits the data (−2, 3), (−1, 5), (0, 5), (1, 4), and (2, 3). Suppose the errors in measuring the y-values of the last two data points are greater than for the other points. Weight these data half as much as the rest of the data.
¹Note for readers with a background in statistics: Suppose the errors in measuring the yᵢ are independent random variables with means equal to zero and variances of σ₁², …, σₙ². Then the appropriate weights in (2) are wᵢ² = 1/σᵢ². The larger the variance of the error, the smaller the weight.
SOLUTION  As in Section 6.6, write X for the matrix A and β for the vector x, and obtain

    [ 1 −2 ]                   [ 3 ]
    [ 1 −1 ]                   [ 5 ]
X = [ 1  0 ] ,  β = [ β₀ ] , y = [ 5 ]
    [ 1  1 ]        [ β₁ ]     [ 4 ]
    [ 1  2 ]                   [ 3 ]

For a weighting matrix, choose W with diagonal entries 2, 2, 2, 1, and 1. Left-multiplication by W scales the rows of X and y:

     [ 2 −4 ]         [ 6  ]
     [ 2 −2 ]         [ 10 ]
WX = [ 2  0 ] , Wy =  [ 10 ]
     [ 1  1 ]         [ 4  ]
     [ 1  2 ]         [ 3  ]

For the normal equation, compute

(WX)ᵀWX = [ 14 −9 ]   and   (WX)ᵀWy = [  59 ]
          [ −9 25 ]                   [ −34 ]

and solve

[ 14 −9 ] [ β₀ ] = [  59 ]
[ −9 25 ] [ β₁ ]   [ −34 ]

The solution of the normal equation is (to two significant digits) β₀ = 4.3 and β₁ = .20. The desired line is

y = 4.3 + .20x

In contrast, the ordinary least-squares line for these data is

y = 4.0 − .10x

Both lines are displayed in Fig. 1.
[Figure 1: Weighted (y = 4.3 + .2x) and ordinary (y = 4 − .1x) least-squares lines.]
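The weighted normal equations of Example 1 reduce to a 2 × 2 linear system, which the following sketch (our code, not from the text) assembles entrywise and solves by Cramer's rule.

```python
# Weighted least-squares line of Example 1: solve (WX)^T WX b = (WX)^T Wy
# for b = (b0, b1).  Squared weights w^2 appear because W multiplies both
# X and y.

xs = [-2, -1, 0, 1, 2]
ys = [3, 5, 5, 4, 3]
ws = [2, 2, 2, 1, 1]        # last two points weighted half as much

# Entries of (WX)^T WX and (WX)^T Wy for the line y = b0 + b1*x.
a11 = sum(w * w for w in ws)                            # 14
a12 = sum(w * w * x for w, x in zip(ws, xs))            # -9
a22 = sum(w * w * x * x for w, x in zip(ws, xs))        # 25
r1 = sum(w * w * y for w, y in zip(ws, ys))             # 59
r2 = sum(w * w * x * y for w, x, y in zip(ws, xs, ys))  # -34

det = a11 * a22 - a12 * a12
b0 = (r1 * a22 - a12 * r2) / det
b1 = (a11 * r2 - a12 * r1) / det
print(round(b0, 2), round(b1, 2))   # 4.35 0.2 (the text rounds to 4.3, .20)
```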
Trend Analysis of Data
Let f represent an unknown function whose values are known (perhaps only approximately) at t₀, …, tₙ. If there is a "linear trend" in the data f(t₀), …, f(tₙ), then we might expect to approximate the values of f by a function of the form β₀ + β₁t. If there is a "quadratic trend" to the data, then we would try a function of the form β₀ + β₁t + β₂t². This was discussed in Section 6.6, from a different point of view.
In some statistical problems, it is important to be able to separate the linear trend from the quadratic trend (and possibly cubic or higher-order trends). For instance, suppose engineers are analyzing the performance of a new car, and f(t) represents the distance between the car at time t and some reference point. If the car is traveling at constant velocity, then the graph of f(t) should be a straight line whose slope is the car's velocity. If the gas pedal is suddenly pressed to the floor, the graph of f(t) will change to include a quadratic term and possibly a cubic term (due to the acceleration). To analyze the ability of the car to pass another car, for example, engineers may want to separate the quadratic and cubic components from the linear term.
If the function is approximated by a curve of the form y = β₀ + β₁t + β₂t², the coefficient β₂ may not give the desired information about the quadratic trend in the data, because it may not be "independent" in a statistical sense from the other βᵢ. To make
what is known as a trend analysis of the data, we introduce an inner product on the space ℙₙ analogous to that given in Example 2 in Section 6.7. For p, q in ℙₙ, define

⟨p, q⟩ = p(t₀)q(t₀) + ⋯ + p(tₙ)q(tₙ)

In practice, statisticians seldom need to consider trends in data of degree higher than cubic or quartic. So let p₀, p₁, p₂, p₃ denote an orthogonal basis of the subspace ℙ₃ of ℙₙ, obtained by applying the Gram–Schmidt process to the polynomials 1, t, t², and t³. By Supplementary Exercise 11 in Chapter 2, there is a polynomial g in ℙₙ whose values at t₀, …, tₙ coincide with those of the unknown function f. Let ĝ be the orthogonal projection (with respect to the given inner product) of g onto ℙ₃, say,

ĝ = c₀p₀ + c₁p₁ + c₂p₂ + c₃p₃

Then ĝ is called a cubic trend function, and c₀, …, c₃ are the trend coefficients of the data. The coefficient c₁ measures the linear trend, c₂ the quadratic trend, and c₃ the cubic trend. It turns out that if the data have certain properties, these coefficients are statistically independent.
Since p₀, …, p₃ are orthogonal, the trend coefficients may be computed one at a time, independently of one another. (Recall that cᵢ = ⟨g, pᵢ⟩/⟨pᵢ, pᵢ⟩.) We can ignore p₃ and c₃ if we want only the quadratic trend. And if, for example, we needed to determine the quartic trend, we would have to find (via Gram–Schmidt) only a polynomial p₄ in ℙ₄ that is orthogonal to ℙ₃ and compute ⟨g, p₄⟩/⟨p₄, p₄⟩.
EXAMPLE 2  The simplest and most common use of trend analysis occurs when the points t₀, …, tₙ can be adjusted so that they are evenly spaced and sum to zero. Fit a quadratic trend function to the data (−2, 3), (−1, 5), (0, 5), (1, 4), and (2, 3).
SOLUTION  The t-coordinates are suitably scaled to use the orthogonal polynomials found in Example 5 of Section 6.7:

Polynomial:          p₀                p₁                p₂                Data: g
Vector of values:  (1, 1, 1, 1, 1)   (−2, −1, 0, 1, 2)   (2, −1, −2, −1, 2)   (3, 5, 5, 4, 3)

The calculations involve only these vectors, not the specific formulas for the orthogonal polynomials. The best approximation to the data by polynomials in ℙ₂ is the orthogonal projection given by

p̂ = (⟨g, p₀⟩/⟨p₀, p₀⟩)p₀ + (⟨g, p₁⟩/⟨p₁, p₁⟩)p₁ + (⟨g, p₂⟩/⟨p₂, p₂⟩)p₂
  = (20/5)p₀ − (1/10)p₁ − (7/14)p₂

and

p̂(t) = 4 − .1t − .5(t² − 2)    (3)
Since the coefficient of p₂ is not extremely small, it would be reasonable to conclude that the trend is at least quadratic. This is confirmed by the graph in Fig. 2.
[Figure 2: Approximation by a quadratic trend function, y = p̂(t).]
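Because p₀, p₁, p₂ are orthogonal, each trend coefficient in Example 2 is an independent one-dimensional projection; the sketch below (our code) computes all three from the value vectors.

```python
# Quadratic trend fit of Example 2: with orthogonal value vectors,
# each trend coefficient is c_i = <g, p_i> / <p_i, p_i>.

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

p0 = [1, 1, 1, 1, 1]
p1 = [-2, -1, 0, 1, 2]
p2 = [2, -1, -2, -1, 2]
g  = [3, 5, 5, 4, 3]          # the data

c = [dot(g, p) / dot(p, p) for p in (p0, p1, p2)]
print(c)    # [4.0, -0.1, -0.5]  ->  phat(t) = 4 - .1 t - .5 (t^2 - 2)
```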
Fourier Series (Calculus required)
Continuous functions are often approximated by linear combinations of sine and cosine functions. For instance, a continuous function might represent a sound wave, an electric signal of some type, or the movement of a vibrating mechanical system.
For simplicity, we consider functions on 0 ≤ t ≤ 2π. It turns out that any function in C[0, 2π] can be approximated as closely as desired by a function of the form

a₀/2 + a₁ cos t + ⋯ + aₙ cos nt + b₁ sin t + ⋯ + bₙ sin nt    (4)

for a sufficiently large value of n. The function (4) is called a trigonometric polynomial. If aₙ and bₙ are not both zero, the polynomial is said to be of order n. The connection between trigonometric polynomials and other functions in C[0, 2π] depends on the fact that for any n ≥ 1, the set

{1, cos t, cos 2t, …, cos nt, sin t, sin 2t, …, sin nt}    (5)

is orthogonal with respect to the inner product

⟨f, g⟩ = ∫₀^{2π} f(t)g(t) dt    (6)

This orthogonality is verified as in the following example and in Exercises 5 and 6.
EXAMPLE 3  Let C[0, 2π] have the inner product (6), and let m and n be unequal positive integers. Show that cos mt and cos nt are orthogonal.

SOLUTION  Use a trigonometric identity. When m ≠ n,

⟨cos mt, cos nt⟩ = ∫₀^{2π} cos mt cos nt dt
  = (1/2) ∫₀^{2π} [cos(mt + nt) + cos(mt − nt)] dt
  = (1/2) [ sin(mt + nt)/(m + n) + sin(mt − nt)/(m − n) ] |₀^{2π} = 0
Let W be the subspace of C[0, 2π] spanned by the functions in (5). Given f in C[0, 2π], the best approximation to f by functions in W is called the nth-order Fourier approximation to f on [0, 2π]. Since the functions in (5) are orthogonal, the best approximation is given by the orthogonal projection onto W. In this case, the coefficients aₖ and bₖ in (4) are called the Fourier coefficients of f. The standard formula for an orthogonal projection shows that

aₖ = ⟨f, cos kt⟩/⟨cos kt, cos kt⟩,   bₖ = ⟨f, sin kt⟩/⟨sin kt, sin kt⟩,   k ≥ 1

Exercise 7 asks you to show that ⟨cos kt, cos kt⟩ = π and ⟨sin kt, sin kt⟩ = π. Thus

aₖ = (1/π) ∫₀^{2π} f(t) cos kt dt,   bₖ = (1/π) ∫₀^{2π} f(t) sin kt dt    (7)
The coefficient of the (constant) function 1 in the orthogonal projection is

⟨f, 1⟩/⟨1, 1⟩ = (1/(2π)) ∫₀^{2π} f(t)·1 dt = (1/2) [ (1/π) ∫₀^{2π} f(t) cos(0·t) dt ] = a₀/2

where a₀ is defined by (7) for k = 0. This explains why the constant term in (4) is written as a₀/2.
EXAMPLE 4  Find the nth-order Fourier approximation to the function f(t) = t on the interval [0, 2π].

SOLUTION  Compute

a₀/2 = (1/2)·(1/π) ∫₀^{2π} t dt = (1/(2π)) [ (1/2)t² |₀^{2π} ] = π

and for k > 0, using integration by parts,

aₖ = (1/π) ∫₀^{2π} t cos kt dt = (1/π) [ (1/k²) cos kt + (t/k) sin kt ] |₀^{2π} = 0

bₖ = (1/π) ∫₀^{2π} t sin kt dt = (1/π) [ (1/k²) sin kt − (t/k) cos kt ] |₀^{2π} = −2/k

Thus the nth-order Fourier approximation of f(t) = t is

π − 2 sin t − sin 2t − (2/3) sin 3t − ⋯ − (2/n) sin nt
Figure 3 shows the third- and fourth-order Fourier approximations of f.

[Figure 3: Fourier approximations of the function f(t) = t. (a) Third order; (b) fourth order.]
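The integrals in (7) can also be approximated numerically; the sketch below (our code, not from the text) uses a midpoint Riemann sum and confirms that for f(t) = t the coefficients come out close to the exact values aₖ = 0 and bₖ = −2/k of Example 4.

```python
# Numerical Fourier coefficients of f(t) = t on [0, 2*pi] via a midpoint
# Riemann sum, compared with the exact values a_k = 0, b_k = -2/k.

import math

def fourier_coeffs(f, k, n=20_000):
    """Approximate a_k and b_k from formula (7) with n midpoint samples."""
    h = 2 * math.pi / n
    ts = [(j + 0.5) * h for j in range(n)]
    a = sum(f(t) * math.cos(k * t) for t in ts) * h / math.pi
    b = sum(f(t) * math.sin(k * t) for t in ts) * h / math.pi
    return a, b

for k in (1, 2, 3):
    a, b = fourier_coeffs(lambda t: t, k)
    assert abs(a) < 1e-3 and abs(b - (-2 / k)) < 1e-3
```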
The norm of the difference between f and a Fourier approximation is called the mean square error in the approximation. (The term mean refers to the fact that the norm is determined by an integral.) It can be shown that the mean square error approaches zero as the order of the Fourier approximation increases. For this reason, it is common to write

f(t) = a₀/2 + Σ_{m=1}^{∞} (aₘ cos mt + bₘ sin mt)

This expression for f(t) is called the Fourier series for f on [0, 2π]. The term aₘ cos mt, for example, is the projection of f onto the one-dimensional subspace spanned by cos mt.
PRACTICE PROBLEMS
1. Let q₁(t) = 1, q₂(t) = t, and q₃(t) = 3t² − 4. Verify that {q₁, q₂, q₃} is an orthogonal set in C[−2, 2] with the inner product of Example 7 in Section 6.7 (integration from −2 to 2).
2. Find the first-order and third-order Fourier approximations to
   f(t) = 3 − 2 sin t + 5 sin 2t − 6 cos 2t
6.8 EXERCISES
1. Find the least-squares line y = β₀ + β₁x that best fits the data (−2, 0), (−1, 0), (0, 2), (1, 4), and (2, 4), assuming that the first and last data points are less reliable. Weight them half as much as the three interior points.
2. Suppose 5 out of 25 data points in a weighted least-squares problem have a y-measurement that is less reliable than the others, and they are to be weighted half as much as the other 20 points. One method is to weight the 20 points by a factor of 1 and the other 5 by a factor of 1/2. A second method is to weight the 20 points by a factor of 2 and the other 5 by a factor of 1. Do the two methods produce different results? Explain.
3. Fit a cubic trend function to the data in Example 2. The orthogonal cubic polynomial is p₃(t) = (5/6)t³ − (17/6)t.
4. To make a trend analysis of six evenly spaced data points, one can use orthogonal polynomials with respect to evaluation at the points t = −5, −3, −1, 1, 3, and 5.
   a. Show that the first three orthogonal polynomials are
      p₀(t) = 1,  p₁(t) = t,  and  p₂(t) = (3/8)t² − 35/8
      (The polynomial p₂ has been scaled so that its values at the evaluation points are small integers.)
   b. Fit a quadratic trend function to the data
      (−5, 1), (−3, 1), (−1, 4), (1, 4), (3, 6), (5, 8)
In Exercises 5–14, the space is C[0, 2π] with the inner product (6).
5. Show that sin mt and sin nt are orthogonal when m ≠ n.
6. Show that sin mt and cos nt are orthogonal for all positive integers m and n.
7. Show that ‖cos kt‖² = π and ‖sin kt‖² = π for k > 0.
8. Find the third-order Fourier approximation to f(t) = t − 1.
9. Find the third-order Fourier approximation to f(t) = 2π − t.
10. Find the third-order Fourier approximation to the square wave function, f(t) = 1 for 0 ≤ t < π and f(t) = −1 for π ≤ t < 2π.
11. Find the third-order Fourier approximation to sin²t, without performing any integration calculations.
12. Find the third-order Fourier approximation to cos³t, without performing any integration calculations.
13. Explain why a Fourier coefficient of the sum of two functions is the sum of the corresponding Fourier coefficients of the two functions.
14. Suppose the first few Fourier coefficients of some function f in C[0, 2π] are a₀, a₁, a₂, and b₁, b₂, b₃. Which of the following trigonometric polynomials is closer to f? Defend your answer.
    g(t) = a₀/2 + a₁ cos t + a₂ cos 2t + b₁ sin t
    h(t) = a₀/2 + a₁ cos t + a₂ cos 2t + b₁ sin t + b₂ sin 2t
15. [M] Refer to the data in Exercise 13 in Section 6.6, concerning the takeoff performance of an airplane. Suppose the possible measurement errors become greater as the speed of the airplane increases, and let W be the diagonal weighting matrix whose diagonal entries are 1, 1, 1, .9, .9, .8, .7, .6, .5, .4, .3, .2, and .1. Find the cubic curve that fits the data with minimum weighted least-squares error, and use it to estimate the velocity of the plane when t = 4.5 seconds.
16. [M] Let f₄ and f₅ be the fourth-order and fifth-order Fourier approximations in C[0, 2π] to the square wave function in Exercise 10. Produce separate graphs of f₄ and f₅ on the interval [0, 2π], and produce a graph of f₅ on [−2π, 2π].
SG The Linearity of an Orthogonal Projection 6–25
SOLUTIONS TO PRACTICE PROBLEMS
1. Compute

⟨q₁, q₂⟩ = ∫₋₂² 1·t dt = (1/2)t² |₋₂² = 0

⟨q₁, q₃⟩ = ∫₋₂² 1·(3t² − 4) dt = (t³ − 4t) |₋₂² = 0

⟨q₂, q₃⟩ = ∫₋₂² t·(3t² − 4) dt = [ (3/4)t⁴ − 2t² ] |₋₂² = 0
2. The third-order Fourier approximation to f is the best approximation in C[0, 2π] to f by functions (vectors) in the subspace spanned by 1, cos t, cos 2t, cos 3t, sin t, sin 2t, and sin 3t. But f is obviously in this subspace, so f is its own best approximation:

f(t) = 3 − 2 sin t + 5 sin 2t − 6 cos 2t

For the first-order approximation, the closest function to f in the subspace W = Span{1, cos t, sin t} is 3 − 2 sin t. The other two terms in the formula for f(t) are orthogonal to the functions in W, so they contribute nothing to the integrals that give the Fourier coefficients for a first-order approximation.
[Figure: First- (y = 3 − 2 sin t) and third-order (y = f(t)) approximations to f(t).]
CHAPTER 6 SUPPLEMENTARY EXERCISES1. Thefollowingstatementsrefertovectorsin R
n (or Rm/ with
thestandardinnerproduct. MarkeachstatementTrueorFalse. Justifyeachanswer.a. Thelengthofeveryvectorisapositivenumber.b. A vector v anditsnegative �v haveequallengths.c. Thedistancebetween u and v is ku � vk.d. If r isanyscalar, then krvk D rkvk.e. Iftwovectorsareorthogonal, theyarelinearlyindepen-
dent.f. If x is orthogonal to both u and v, then x must be
orthogonalto u � v.g. If ku C vk2 D kuk2 C kvk2, then u and v areorthogonal.h. If ku � vk2 D kuk2 C kvk2, then u and v areorthogonal.i. Theorthogonalprojectionof y onto u isascalarmultiple
of y.j. Ifavector y coincideswithitsorthogonalprojectiononto
asubspace W , then y isin W .k. ThesetofallvectorsinR
n orthogonaltoonefixedvectorisasubspaceof R
n.l. If W isasubspaceof R
n, then W and W ? havenovectorsincommon.
m. If fv1; v2; v3g isanorthogonalsetandif c1, c2, and c3 arescalars, then fc1v1; c2v2; c3v3g isanorthogonalset.
n. Ifamatrix U hasorthonormalcolumns, then U U T D I .o. A squarematrixwithorthogonalcolumnsisanorthogo-
nalmatrix.p. Ifasquarematrixhasorthonormalcolumns, thenitalso
hasorthonormalrows.q. IfW isasubspace, then k projW vk2 C kv � projW vk2 D
kvk2.
r. A least-squaressolutionof Ax D b isthevector AOx inColA closestto b, sothat kb � AOx k � kb � Axk forall x.
s. The normal equations for a least-squares solution ofAx D b aregivenby Ox D .ATA/�1AT b.
2. Let {v₁, …, vₚ} be an orthonormal set. Verify the following equality by induction, beginning with p = 2. If x = c₁v₁ + ⋯ + cₚvₚ, then

   ‖x‖² = |c₁|² + ⋯ + |cₚ|²
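Before attempting the induction, it can help to see the identity numerically. The sketch below builds an orthonormal set from a QR factorization (an illustrative choice) and checks the equality:

```python
import numpy as np

rng = np.random.default_rng(0)
# orthonormal set {v1, v2, v3} in R^5, taken from a QR factorization
Q, _ = np.linalg.qr(rng.standard_normal((5, 3)))
c = np.array([2.0, -1.0, 0.5])
x = Q @ c                                 # x = c1*v1 + c2*v2 + c3*v3

assert np.isclose(x @ x, np.sum(c**2))    # ||x||^2 = |c1|^2 + |c2|^2 + |c3|^2
```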
3. Let {v₁, …, vₚ} be an orthonormal set in Rⁿ. Verify the following inequality, called Bessel's inequality, which is true for each x in Rⁿ:

   ‖x‖² ≥ |x·v₁|² + |x·v₂|² + ⋯ + |x·vₚ|²
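A quick numerical sanity check of Bessel's inequality, using three orthonormal vectors in R⁶ obtained from a QR factorization (the dimensions are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((6, 3)))   # 3 orthonormal vectors in R^6
x = rng.standard_normal(6)

lhs = x @ x                                        # ||x||^2
rhs = sum((x @ Q[:, j]) ** 2 for j in range(3))    # sum of |x . vj|^2
assert rhs <= lhs + 1e-12                          # Bessel's inequality
```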
4. Let U be an n × n orthogonal matrix. Show that if {v₁, …, vₙ} is an orthonormal basis for Rⁿ, then so is {Uv₁, …, Uvₙ}.
5. Show that if an n × n matrix U satisfies (Ux)·(Uy) = x·y for all x and y in Rⁿ, then U is an orthogonal matrix.
6. Show that if U is an orthogonal matrix, then any real eigenvalue of U must be ±1.
7. A Householder matrix, or an elementary reflector, has the form Q = I − 2uuᵀ where u is a unit vector. (See Exercise 13 in the Supplementary Exercises for Chapter 2.) Show that Q is an orthogonal matrix. (Elementary reflectors are often used in computer programs to produce a QR factorization of a matrix A. If A has linearly independent columns, then left-multiplication by a sequence of elementary reflectors can produce an upper triangular matrix.)
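The claim in Exercise 7 can be confirmed numerically before proving it. The sketch below forms Q = I − 2uuᵀ for a random unit vector u and checks that QᵀQ = I (Q is also symmetric, so it is its own inverse):

```python
import numpy as np

rng = np.random.default_rng(2)
u = rng.standard_normal(4)
u /= np.linalg.norm(u)                   # unit vector
Q = np.eye(4) - 2 * np.outer(u, u)       # Householder reflector

assert np.allclose(Q.T @ Q, np.eye(4))   # orthonormal columns: Q is orthogonal
assert np.allclose(Q, Q.T)               # symmetric, hence Q @ Q = I
```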
8. Let T : Rⁿ → Rⁿ be a linear transformation that preserves lengths; that is, ‖T(x)‖ = ‖x‖ for all x in Rⁿ.
   a. Show that T also preserves orthogonality; that is, T(x)·T(y) = 0 whenever x·y = 0.
   b. Show that the standard matrix of T is an orthogonal matrix.
9. Let u and v be linearly independent vectors in Rⁿ that are not orthogonal. Describe how to find the best approximation to z in Rⁿ by vectors of the form x₁u + x₂v without first constructing an orthogonal basis for Span{u, v}.
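One approach to Exercise 9 is to solve the normal equations for A = [u v] directly; no orthogonal basis is needed. A sketch with hypothetical vectors u, v, z (any independent, non-orthogonal pair works):

```python
import numpy as np

u = np.array([1.0, 2.0, 0.0])
v = np.array([1.0, 0.0, 1.0])    # independent and not orthogonal (u . v = 1)
z = np.array([3.0, 1.0, 2.0])

A = np.column_stack([u, v])
x_hat = np.linalg.solve(A.T @ A, A.T @ z)   # normal equations A^T A x = A^T z
z_hat = A @ x_hat                           # best approximation in Span{u, v}

# the residual is orthogonal to both u and v
assert np.allclose(A.T @ (z - z_hat), 0)
```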
10. Suppose the columns of A are linearly independent. Determine what happens to the least-squares solution x̂ of Ax = b when b is replaced by cb for some nonzero scalar c.
11. If a, b, and c are distinct numbers, then the following system is inconsistent because the graphs of the equations are parallel planes. Show that the set of all least-squares solutions of the system is precisely the plane whose equation is x − 2y + 5z = (a + b + c)/3.

    x − 2y + 5z = a
    x − 2y + 5z = b
    x − 2y + 5z = c
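A numerical spot check of the claim in Exercise 11, for one arbitrary choice of distinct a, b, c: the particular least-squares solution returned by `lstsq` does satisfy x − 2y + 5z = (a + b + c)/3.

```python
import numpy as np

a, b2, c = 1.0, 2.0, 6.0               # arbitrary distinct right-hand sides
A = np.array([[1.0, -2.0, 5.0]] * 3)   # three copies of the same normal vector
rhs = np.array([a, b2, c])

x_hat, *_ = np.linalg.lstsq(A, rhs, rcond=None)
# every least-squares solution satisfies x - 2y + 5z = (a + b + c)/3
assert np.isclose(A[0] @ x_hat, (a + b2 + c) / 3)
```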
12. Consider the problem of finding an eigenvalue of an n × n matrix A when an approximate eigenvector v is known. Since v is not exactly correct, the equation

    Av = λv    (1)

    will probably not have a solution. However, λ can be estimated by a least-squares solution when (1) is viewed properly. Think of v as an n × 1 matrix V, think of λ as a vector in R¹, and denote the vector Av by the symbol b. Then (1) becomes b = λV, which may also be written as Vλ = b. Find the least-squares solution of this system of n equations in the one unknown λ, and write this solution using the original symbols. The resulting estimate for λ is called a Rayleigh quotient. See Exercises 11 and 12 in Section 5.8.
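Carrying out the computation asked for in Exercise 12: the normal equation VᵀVλ = Vᵀb gives λ = (vᵀAv)/(vᵀv), the Rayleigh quotient. A small numerical illustration (the matrix and approximate eigenvector are arbitrary choices):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])        # eigenvalues 1 and 3
v = np.array([1.0, 0.9])          # rough guess at the eigenvector (1, 1)

# least-squares solution of V*lam = Av:  lam = (v . Av)/(v . v)
lam = (v @ A @ v) / (v @ v)

assert abs(lam - 3) < 0.01        # close to the true eigenvalue 3
```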
13. Use the steps below to prove the following relations among the four fundamental subspaces determined by an m × n matrix A.

    Row A = (Nul A)⊥,   Col A = (Nul Aᵀ)⊥

    a. Show that Row A is contained in (Nul A)⊥. (Show that if x is in Row A, then x is orthogonal to every u in Nul A.)
    b. Suppose rank A = r. Find dim Nul A and dim(Nul A)⊥, and then deduce from part (a) that Row A = (Nul A)⊥. [Hint: Study the exercises for Section 6.3.]
    c. Explain why Col A = (Nul Aᵀ)⊥.

14. Explain why an equation Ax = b has a solution if and only if b is orthogonal to all solutions of the equation Aᵀx = 0.
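The relation Row A = (Nul A)⊥ in Exercise 13 can be illustrated numerically with the SVD, whose right singular vectors split into an orthonormal basis for Row A and one for Nul A (the example matrix is an arbitrary rank-2 choice):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])       # rank 2, so dim Nul A = 1

_, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10))            # rank A
row_basis, null_basis = Vt[:r], Vt[r:]

assert np.allclose(row_basis @ null_basis.T, 0)   # Row A is orthogonal to Nul A
assert r + null_basis.shape[0] == A.shape[1]      # dimensions add up to n
# each row of A lies in the span of row_basis
assert np.allclose(A, (A @ row_basis.T) @ row_basis)
```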
Exercises 15 and 16 concern the (real) Schur factorization of an n × n matrix A in the form A = URUᵀ, where U is an orthogonal matrix and R is an n × n upper triangular matrix.¹
15. Show that if A admits a (real) Schur factorization, A = URUᵀ, then A has n real eigenvalues, counting multiplicities.
16. Let A be an n × n matrix with n real eigenvalues, counting multiplicities, denoted by λ₁, …, λₙ. It can be shown that A admits a (real) Schur factorization. Parts (a) and (b) show the key ideas in the proof. The rest of the proof amounts to repeating (a) and (b) for successively smaller matrices, and then piecing together the results.
    a. Let u₁ be a unit eigenvector corresponding to λ₁, let u₂, …, uₙ be any other vectors such that {u₁, …, uₙ} is an orthonormal basis for Rⁿ, and then let U = [u₁ u₂ ⋯ uₙ]. Show that the first column of UᵀAU is λ₁e₁, where e₁ is the first column of the n × n identity matrix.
    b. Part (a) implies that UᵀAU has the form shown below. Explain why the eigenvalues of A₁ are λ₂, …, λₙ. [Hint: See the Supplementary Exercises for Chapter 5.]

    UᵀAU = [ λ₁  ∗  ⋯  ∗ ]
           [  0           ]
           [  ⋮     A₁    ]
           [  0           ]
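Part (a) can be checked on a small example: take a unit eigenvector u₁, complete it to an orthonormal basis with a QR factorization (any completion works), and inspect the first column of UᵀAU. The 2 × 2 matrix below is an arbitrary choice with eigenvalues 5 and 2:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])               # eigenvalues 5 and 2
lam1 = 5.0
u1 = np.array([1.0, 1.0]) / np.sqrt(2)   # unit eigenvector: A u1 = 5 u1

# complete u1 to an orthonormal basis (QR of [u1 | e1] suffices here)
U, _ = np.linalg.qr(np.column_stack([u1, [1.0, 0.0]]))
T = U.T @ A @ U

assert np.allclose(U.T @ U, np.eye(2))                    # U is orthogonal
assert np.allclose(T[:, 0], lam1 * np.array([1.0, 0.0]))  # first column is lam1*e1
```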
[M] When the right side of an equation Ax = b is changed slightly, say, to Ax = b + Δb for some vector Δb, the solution changes from x to x + Δx, where Δx satisfies A(Δx) = Δb. The quotient ‖Δb‖/‖b‖ is called the relative change in b (or the relative error in b when Δb represents possible error in the entries of b). The relative change in the solution is ‖Δx‖/‖x‖. When A is invertible, the condition number of A, written as cond(A), produces a bound on how large the relative change in x can be:

‖Δx‖/‖x‖ ≤ cond(A) · ‖Δb‖/‖b‖    (2)
In Exercises 17–20, solve Ax = b and A(Δx) = Δb, and show that the inequality (2) holds in each case. (See the discussion of ill-conditioned matrices in Exercises 41–43 in Section 2.3.)
17. A = [ 4.5  3.1 ]     b = [ 19.249 ]     Δb = [  .001 ]
        [ 1.6  1.1 ],        [  6.843 ],         [ −.003 ]
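For Exercise 17 the computation can be sketched as follows; the exact solution is x = (3.94, 0.49), and inequality (2) holds with the 2-norm condition number:

```python
import numpy as np

A = np.array([[4.5, 3.1],
              [1.6, 1.1]])
b = np.array([19.249, 6.843])
db = np.array([0.001, -0.003])

x = np.linalg.solve(A, b)         # exact solution: (3.94, 0.49)
dx = np.linalg.solve(A, db)       # change in the solution

rel_x = np.linalg.norm(dx) / np.linalg.norm(x)
rel_b = np.linalg.norm(db) / np.linalg.norm(b)
cond = np.linalg.cond(A)          # 2-norm condition number

assert rel_x <= cond * rel_b      # inequality (2)
```

A tiny relative change in b (about 0.015%) produces a relative change in x of roughly 46%, which is exactly the ill-conditioning that cond(A) predicts.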
18. A = [ 4.5  3.1 ]     b = [   .500 ]     Δb = [  .001 ]
        [ 1.6  1.1 ],        [ −1.407 ],         [ −.003 ]
¹ If complex numbers are allowed, every n × n matrix A admits a (complex) Schur factorization, A = URU⁻¹, where R is upper triangular and U⁻¹ is the conjugate transpose of U. This very useful fact is discussed in Matrix Analysis, by Roger A. Horn and Charles R. Johnson (Cambridge: Cambridge University Press, 1985), pp. 79–100.
19. A = [  7  −6  −4   1 ]     b = [   .100 ]     Δb = 10⁻⁴ [   .49 ]
        [ −5   1   0  −2 ]         [  2.888 ]               [ −1.28 ]
        [ 10  11   7  −3 ]         [ −1.404 ]               [  5.78 ]
        [ 19   9   7   1 ],        [  1.462 ],              [  8.04 ]
20. A = [  7  −6  −4   1 ]     b = [   4.230 ]     Δb = 10⁻⁴ [   .27 ]
        [ −5   1   0  −2 ]         [ −11.043 ]               [  7.76 ]
        [ 10  11   7  −3 ]         [  49.991 ]               [ −3.77 ]
        [ 19   9   7   1 ],        [  69.536 ],              [  3.93 ]