Schaum's Outline of Theory and Problems of Linear Algebra, Second Edition - Seymour Lipschutz


SCHAUM'S OUTLINE OF

THEORY AND PROBLEMS

OF

LINEAR

ALGEBRA

Second Edition

SEYMOUR LIPSCHUTZ, Ph.D.
Professor of Mathematics

Temple University

SCHAUM'S OUTLINE SERIES

McGRAW-HILL

New York San Francisco Washington, D.C. Auckland Bogota Caracas Lisbon

London Madrid Mexico City Milan Montreal New Delhi

San Juan Singapore Sydney Tokyo Toronto

SEYMOUR LIPSCHUTZ, who is presently on the mathematics faculty of Temple University, formerly taught at the Polytechnic Institute of Brooklyn and was a visiting professor in the Computer Science Department of Brooklyn College. He received his Ph.D. in 1960 at the Courant Institute of Mathematical Sciences of New York University. Some of his other books in the Schaum's Outline Series are Programming with Fortran (with Arthur Poe), Discrete Mathematics, and Probability.

Schaum's Outline of Theory and Problems of

LINEAR ALGEBRA

Copyright (c) 1991, 1968 by The McGraw-Hill Companies, Inc. All rights reserved. Printed in the United States of America. Except as permitted under the Copyright Act of 1976, no part of this publication may be reproduced or distributed in any form or by any means, or stored in a data base or retrieval system, without the prior written permission of the publisher.

11 12 13 14 15 16 17 18 19 20 BAW BAW 9 9

ISBN 0-07-038007-4

Sponsoring Editor: David Beckwith
Production Supervisor: Annette Mayeski
Editing Supervisors: Meg Tobin, Maureen Walker

Library of Congress Cataloging-in-Publication Data

Lipschutz, Seymour.
Schaum's outline of theory and problems of linear algebra / Seymour Lipschutz. -- 2nd ed.
p. cm. -- (Schaum's outline series)
Includes index.
ISBN 0-07-038007-4
1. Algebras, Linear -- Outlines, syllabi, etc. I. Title.
QA188.L57 1991
512'.5 -- dc20    90-42357 CIP

McGraw-Hill

A Division of The McGraw-Hill Companies

Preface

Linear algebra has in recent years become an essential part of the mathematical background required of mathematicians, engineers, physicists and other scientists. This requirement reflects the importance and wide applications of the subject matter.

This book is designed for use as a textbook for a formal course in linear algebra or as a supplement to all current standard texts. It aims to present an introduction to linear algebra which will be found helpful to all readers regardless of their fields of specialization. More material has been included than can be covered in most first courses. This has been done to make the book more flexible, to provide a useful book of reference, and to stimulate further interest in the subject.

Each chapter begins with clear statements of pertinent definitions, principles and theorems together with illustrative and other descriptive material. This is followed by graded sets of solved and supplementary problems. The solved problems serve to illustrate and amplify the theory, bring into sharp focus those fine points without which the student continually feels himself on unsafe ground, and provide the repetition of basic principles so vital to effective learning. Numerous proofs of theorems are included among the solved problems. The supplementary problems serve as a complete review of the material of each chapter.

The first chapter treats systems of linear equations. This provides the motivation and basic computational tools for the subsequent material. After vectors and matrices are introduced, there are chapters on vector spaces and subspaces and on inner products. This is followed by chapters covering determinants, eigenvalues and eigenvectors, and diagonalizing matrices (under similarity) and quadratic forms (under congruence). The later chapters cover abstract linear maps and their canonical forms, specifically the triangular, Jordan and rational canonical forms. The last chapter treats abstract linear maps on inner product spaces.

The main changes in the second edition have been for pedagogical reasons (form) rather than in content. Here, the notion of a matrix mapping is introduced early in the text, and inner products are introduced right after the chapter on vector spaces and subspaces. Also, algorithms for row reduction, matrix inversion, computing determinants, and diagonalizing matrices and quadratic forms are presented using algorithmic notation. Furthermore, such topics as elementary matrices, LU factorization, Fourier coefficients, and various norms in R^n are introduced directly in the text, rather than in the problem sections. Lastly, by treating the more advanced abstract topics in the latter part of the text, we make this edition more suitable for an elementary course or for a two-semester course in linear algebra.

I wish to thank the staff of the McGraw-Hill Schaum Series, especially John Aliano, David Beckwith and Margaret Tobin, for invaluable suggestions and for their very helpful cooperation. Lastly, I want to express my gratitude to Wilhelm Magnus, my teacher, advisor and friend, who introduced me to the beauty of mathematics.

Seymour Lipschutz
Temple University
January, 1991


Contents

Chapter 1  SYSTEMS OF LINEAR EQUATIONS
1.1 Introduction. 1.2 Linear Equations, Solutions. 1.3 Linear Equations in Two Unknowns. 1.4 Systems of Linear Equations, Equivalent Systems, Elementary Operations. 1.5 Systems in Triangular and Echelon Form. 1.6 Reduction Algorithm. 1.7 Matrices. 1.8 Row Equivalence and Elementary Row Operations. 1.9 Systems of Linear Equations and Matrices. 1.10 Homogeneous Systems of Linear Equations.

Chapter 2  VECTORS IN R^n AND C^n, SPATIAL VECTORS
2.1 Introduction. 2.2 Vectors in R^n. 2.3 Vector Addition and Scalar Multiplication. 2.4 Vectors and Linear Equations. 2.5 Dot (Scalar) Product. 2.6 Norm of a Vector. 2.7 Located Vectors, Hyperplanes, and Lines in R^n. 2.8 Spatial Vectors, ijk Notation in R^3. 2.9 Complex Numbers. 2.10 Vectors in C^n.

Chapter 3  MATRICES
3.1 Introduction. 3.2 Matrices. 3.3 Matrix Addition and Scalar Multiplication. 3.4 Matrix Multiplication. 3.5 Transpose of a Matrix. 3.6 Matrices and Systems of Linear Equations. 3.7 Block Matrices.

Chapter 4  SQUARE MATRICES, ELEMENTARY MATRICES
4.1 Introduction. 4.2 Square Matrices. 4.3 Diagonal and Trace, Identity Matrix. 4.4 Powers of Matrices, Polynomials in Matrices. 4.5 Invertible (Nonsingular) Matrices. 4.6 Special Types of Square Matrices. 4.7 Complex Matrices. 4.8 Square Block Matrices. 4.9 Elementary Matrices and Applications. 4.10 Elementary Column Operations, Matrix Equivalence. 4.11 Congruent Symmetric Matrices, Law of Inertia. 4.12 Quadratic Forms. 4.13 Similarity. 4.14 LU Factorization.

Chapter 5  VECTOR SPACES
5.1 Introduction. 5.2 Vector Spaces. 5.3 Examples of Vector Spaces. 5.4 Subspaces. 5.5 Linear Combinations, Linear Spans. 5.6 Linear Dependence and Independence. 5.7 Basis and Dimension. 5.8 Linear Equations and Vector Spaces. 5.9 Sums and Direct Sums. 5.10 Coordinates. 5.11 Change of Basis.

Chapter 6  INNER PRODUCT SPACES, ORTHOGONALITY
6.1 Introduction. 6.2 Inner Product Spaces. 6.3 Cauchy-Schwarz Inequality, Applications. 6.4 Orthogonality. 6.5 Orthogonal Sets and Bases, Projections. 6.6 Gram-Schmidt Orthogonalization Process. 6.7 Inner Products and Matrices. 6.8 Complex Inner Product Spaces. 6.9 Normed Vector Spaces.

Chapter 7  DETERMINANTS
7.1 Introduction. 7.2 Determinants of Orders One and Two. 7.3 Determinants of Order Three. 7.4 Permutations. 7.5 Determinants of Arbitrary Order. 7.6 Properties of Determinants. 7.7 Minors and Cofactors. 7.8 Classical Adjoint. 7.9 Applications to Linear Equations, Cramer's Rule. 7.10 Submatrices, General Minors, Principal Minors. 7.11 Block Matrices and Determinants. 7.12 Determinants and Volume. 7.13 Multilinearity and Determinants.

Chapter 8  EIGENVALUES AND EIGENVECTORS, DIAGONALIZATION
8.1 Introduction. 8.2 Polynomials in Matrices. 8.3 Characteristic Polynomial, Cayley-Hamilton Theorem. 8.4 Eigenvalues and Eigenvectors. 8.5 Computing Eigenvalues and Eigenvectors, Diagonalizing Matrices. 8.6 Diagonalizing Real Symmetric Matrices. 8.7 Minimum Polynomial.

Chapter 9  LINEAR MAPPINGS
9.1 Introduction. 9.2 Mappings. 9.3 Linear Mappings. 9.4 Kernel and Image of a Linear Mapping. 9.5 Singular and Nonsingular Linear Mappings, Isomorphisms. 9.6 Operations with Linear Mappings. 9.7 Algebra A(V) of Linear Operators. 9.8 Invertible Operators.

Chapter 10  MATRICES AND LINEAR MAPPINGS
10.1 Introduction. 10.2 Matrix Representation of a Linear Operator. 10.3 Change of Basis and Linear Operators. 10.4 Diagonalization of Linear Operators. 10.5 Matrices and General Linear Mappings.

Chapter 11  CANONICAL FORMS
11.1 Introduction. 11.2 Triangular Form. 11.3 Invariance. 11.4 Invariant Direct-Sum Decompositions. 11.5 Primary Decomposition. 11.6 Nilpotent Operators. 11.7 Jordan Canonical Form. 11.8 Cyclic Subspaces. 11.9 Rational Canonical Form. 11.10 Quotient Spaces.

Chapter 12  LINEAR FUNCTIONALS AND THE DUAL SPACE
12.1 Introduction. 12.2 Linear Functionals and the Dual Space. 12.3 Dual Basis. 12.4 Second Dual Space. 12.5 Annihilators. 12.6 Transpose of a Linear Mapping.

Chapter 13  BILINEAR, QUADRATIC, AND HERMITIAN FORMS
13.1 Introduction. 13.2 Bilinear Forms. 13.3 Bilinear Forms and Matrices. 13.4 Alternating Bilinear Forms. 13.5 Symmetric Bilinear Forms, Quadratic Forms. 13.6 Real Symmetric Bilinear Forms, Law of Inertia. 13.7 Hermitian Forms.

Chapter 14  LINEAR OPERATORS ON INNER PRODUCT SPACES
14.1 Introduction. 14.2 Adjoint Operators. 14.3 Analogy Between A(V) and C, Special Operators. 14.4 Self-Adjoint Operators. 14.5 Orthogonal and Unitary Operators. 14.6 Orthogonal and Unitary Matrices. 14.7 Change of Orthonormal Basis. 14.8 Positive Operators. 14.9 Diagonalization and Canonical Forms in Euclidean Spaces. 14.10 Diagonalization and Canonical Forms in Unitary Spaces. 14.11 Spectral Theorem.

Appendix  POLYNOMIALS OVER A FIELD
A.1 Introduction. A.2 Divisibility; Greatest Common Divisor. A.3 Factorization.

INDEX

Chapter 1

Systems of Linear Equations

1.1 INTRODUCTION

The theory of linear equations plays an important and motivating role in the subject of linear algebra. In fact, many problems in linear algebra are equivalent to studying a system of linear equations, e.g., finding the kernel of a linear mapping and characterizing the subspace spanned by a set of vectors. Thus the techniques introduced in this chapter will be applicable to the more abstract treatment given later. On the other hand, some of the results of the abstract treatment will give us new insights into the structure of "concrete" systems of linear equations.

This chapter investigates systems of linear equations and describes in detail the Gaussian elimination algorithm which is used to find their solution. Although matrices will be studied in detail in Chapter 3, matrices, together with certain operations on them, are also introduced here, since they are closely related to systems of linear equations and their solution.

All our equations will involve specific numbers called constants or scalars. For simplicity, we assume in this chapter that all our scalars belong to the real field R. The solutions of our equations will also involve n-tuples u = (k_1, k_2, ..., k_n) of real numbers called vectors. The set of all such n-tuples is denoted by R^n.

We note that the results in this chapter also hold for equations over the complex field C or over any arbitrary field K.

1.2 LINEAR EQUATIONS, SOLUTIONS

By a linear equation in unknowns x_1, x_2, ..., x_n, we mean an equation that can be put in the standard form

    a_1 x_1 + a_2 x_2 + ... + a_n x_n = b                                (1.1)

where a_1, a_2, ..., a_n, b are constants. The constant a_k is called the coefficient of x_k, and b is called the constant of the equation.

A solution of the above linear equation is a set of values for the unknowns, say x_1 = k_1, x_2 = k_2, ..., x_n = k_n, or simply an n-tuple u = (k_1, k_2, ..., k_n) of constants, with the property that the following statement (obtained by substituting each k_i for x_i in the equation) is true:

    a_1 k_1 + a_2 k_2 + ... + a_n k_n = b

This set of values is then said to satisfy the equation.

The set of all such solutions is called the solution set or the general solution or, simply, the solution of the equation.

Remark: The above notions implicitly assume there is an ordering of the unknowns. In order to avoid subscripts, we will usually use the variables x, y, z, as ordered, to denote three unknowns; x, y, z, t, as ordered, to denote four unknowns; and x, y, z, s, t, as ordered, to denote five unknowns.


Example 1.1

(a) The equation 2x - 5y + 3xz = 4 is not linear, since the product xz of two unknowns is of second degree.

(b) The equation x + 2y - 4z + t = 3 is linear in the four unknowns x, y, z, t.

The 4-tuple u = (3, 2, 1, 0) is a solution of the equation since

    3 + 2(2) - 4(1) + 0 = 3    or    3 = 3

is a true statement. However, the 4-tuple v = (1, 2, 4, 5) is not a solution of the equation since

    1 + 2(2) - 4(4) + 5 = 3    or    -6 = 3

is not a true statement.
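Substitution checks like the ones in Example 1.1(b) are easy to automate. The following short Python sketch (ours, not part of the original text) evaluates the left side of a linear equation on a candidate n-tuple and compares it with the constant b:

    # A minimal sketch: checking whether an n-tuple satisfies a linear
    # equation by direct substitution, as in Example 1.1(b).
    def satisfies(coeffs, b, candidate):
        """Return True if a_1*k_1 + ... + a_n*k_n equals b."""
        return sum(a * k for a, k in zip(coeffs, candidate)) == b

    # x + 2y - 4z + t = 3, written as coefficients (1, 2, -4, 1) and constant 3
    coeffs = (1, 2, -4, 1)
    print(satisfies(coeffs, 3, (3, 2, 1, 0)))   # True:  3 + 4 - 4 + 0 = 3
    print(satisfies(coeffs, 3, (1, 2, 4, 5)))   # False: 1 + 4 - 16 + 5 = -6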

Linear Equations in One Unknown

The following basic result is proved in Problem 1.5.

Theorem 1.1: Consider the linear equation ax = b.

(i) If a ≠ 0, then x = b/a is the unique solution of ax = b.
(ii) If a = 0 but b ≠ 0, then ax = b has no solution.
(iii) If a = 0 and b = 0, then every scalar k is a solution of ax = b.

Example 1.2

(a) Solve 4x - 1 = x + 6.

Transpose to obtain the equation in standard form: 4x - x = 6 + 1, or 3x = 7. Multiply by 1/3 to obtain the unique solution x = 7/3 [Theorem 1.1(i)].

(b) Solve 2x - 5 - x = x + 3.

Rewrite the equation in standard form: x - 5 = x + 3, or x - x = 3 + 5, or 0x = 8. The equation has no solution [Theorem 1.1(ii)].

(c) Solve 4 + x - 3 = 2x + 1 - x.

Rewrite the equation in standard form: x + 1 = x + 1, or x - x = 1 - 1, or 0x = 0. Every scalar k is a solution [Theorem 1.1(iii)].

Degenerate Linear Equations

A linear equation is said to be degenerate if it has the form

    0x_1 + 0x_2 + ... + 0x_n = b

that is, if every coefficient is equal to zero. The solution of such an equation is as follows:

Theorem 1.2: Consider the degenerate linear equation 0x_1 + 0x_2 + ... + 0x_n = b.

(i) If the constant b ≠ 0, then the equation has no solution.
(ii) If the constant b = 0, then every vector u = (k_1, k_2, ..., k_n) is a solution.


Proof. (i) Let u = (k_1, k_2, ..., k_n) be any vector. Suppose b ≠ 0. Substituting u in the equation we obtain

    0k_1 + 0k_2 + ... + 0k_n = b    or    0 + 0 + ... + 0 = b    or    0 = b

This is not a true statement since b ≠ 0. Hence no vector u is a solution.

(ii) Suppose b = 0. Substituting u in the equation we obtain

    0k_1 + 0k_2 + ... + 0k_n = 0    or    0 + 0 + ... + 0 = 0    or    0 = 0

which is a true statement. Thus every vector u in R^n is a solution, as claimed.

Example 1.3. Describe the solution of 4y - x - 3y + 3 = 2 + x - 2x + y + 1.

Rewrite in standard form by collecting terms and transposing:

    y - x + 3 = y - x + 3    or    y - x - y + x = 3 - 3    or    0x + 0y = 0

The equation is degenerate with a zero constant; thus every vector u = (a, b) in R^2 is a solution.

Nondegenerate Linear Equations, Leading Unknown

This subsection covers the solution of a single nondegenerate linear equation in one or more unknowns, say

    a_1 x_1 + a_2 x_2 + ... + a_n x_n = b

By the leading unknown in such an equation, we mean the first unknown with a nonzero coefficient. Its position p in the equation is therefore the smallest integral value of j for which a_j ≠ 0. In other words, x_p is the leading unknown if a_j = 0 for j < p, but a_p ≠ 0.

Example 1.4. Consider the linear equation 5y - 2z = 3. Here y is the leading unknown. If the unknowns are x, y, and z, then p = 2 is its position; but if y and z are the only unknowns, then p = 1.

The following theorem, proved in Problem 1.9, applies.

Theorem 1.3: Consider a nondegenerate linear equation a_1 x_1 + a_2 x_2 + ... + a_n x_n = b with leading unknown x_p.

(i) Any set of values for the unknowns x_j with j ≠ p will yield a unique solution of the equation. (The unknowns x_j are called free variables since one can assign any values to them.)
(ii) Every solution of the equation is obtained in (i). (The set of all solutions is called the general solution of the equation.)

Example 1.5

(a) Find three particular solutions to the equation 2x - 4y + z = 8.

Here x is the leading unknown. Accordingly, assign any values to the free variables y and z, and then solve for x to obtain a solution. For example:

(1) Set y = 1 and z = 1. Substitution in the equation yields

    2x - 4(1) + 1 = 8    or    2x - 4 + 1 = 8    or    2x = 11    or    x = 11/2

Thus u_1 = (11/2, 1, 1) is a solution.

(2) Set y = 1, z = 0. Substitution yields x = 6. Hence u_2 = (6, 1, 0) is a solution.

(3) Set y = 0, z = 1. Substitution yields x = 7/2. Thus u_3 = (7/2, 0, 1) is a solution.

(b) The general solution of the above equation 2x - 4y + z = 8 is obtained as follows.

First, assign arbitrary values (called parameters) to the free variables, say, y = a and z = b. Then substitute in the equation to obtain

    2x - 4a + b = 8    or    2x = 8 + 4a - b    or    x = 4 + 2a - (1/2)b

Thus

    x = 4 + 2a - (1/2)b,  y = a,  z = b    or    u = (4 + 2a - (1/2)b, a, b)

is the general solution.
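The parametric description can be checked mechanically. The sketch below (ours, not from the text) plugs parameter values a and b into the general solution of Example 1.5(b) and verifies that each resulting triple satisfies 2x - 4y + z = 8; the choices (1, 1), (1, 0), and (0, 1) reproduce u_1, u_2, u_3 of part (a):

    from fractions import Fraction

    # General solution of 2x - 4y + z = 8 with parameters y = a, z = b.
    def solution(a, b):
        x = 4 + 2 * Fraction(a) - Fraction(b) / 2
        return (x, Fraction(a), Fraction(b))

    for a, b in [(1, 1), (1, 0), (0, 1)]:
        x, y, z = solution(a, b)
        assert 2 * x - 4 * y + z == 8   # every parameter choice satisfies the equation
        print(x, y, z)                  # 11/2 1 1, then 6 1 0, then 7/2 0 1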

1.3 LINEAR EQUATIONS IN TWO UNKNOWNS

This section considers the special case of linear equations in two unknowns, x and y, that is, equations that can be put in the standard form

    ax + by = c

where a, b, c are real numbers. (We also assume that the equation is nondegenerate, i.e., that a and b are not both zero.) Each solution of the equation is a pair of real numbers, u = (k_1, k_2), which can be found by assigning an arbitrary value to x and solving for y, or vice versa.

Every solution u = (k_1, k_2) of the above equation determines a point in the cartesian plane R^2. Since a and b are not both zero, all such solutions correspond precisely to the points on a straight line (whence the name "linear equation"). This line is called the graph of the equation.

Example 1.6. Consider the linear equation 2x + y = 4. We find three solutions of the equation as follows. First choose any value for either unknown, say x = -2. Substitute x = -2 into the equation to obtain

    2(-2) + y = 4    or    -4 + y = 4    or    y = 8

Thus x = -2, y = 8, or the point (-2, 8) in R^2, is a solution. Now find the y-intercept, that is, substitute x = 0 in the equation to get y = 4; hence (0, 4) on the y axis is a solution. Next find the x-intercept, that is, substitute y = 0 in the equation to get x = 2; hence (2, 0) on the x axis is a solution.

To plot the graph of the equation, first plot the three solutions, (-2, 8), (0, 4), and (2, 0), in the plane R^2 as pictured in Fig. 1-1. Then draw the line L determined by two of the solutions and note that the third solution also lies on L. (Indeed, L is the set of all solutions of the equation.) The line L is the graph of the equation.

System of Two Equations in Two Unknowns

This subsection considers a system of two (nondegenerate) linear equations in the two unknowns x and y:

    a_1 x + b_1 y = c_1
    a_2 x + b_2 y = c_2                                                  (1.2)

(Thus a_1 and b_1 are not both zero, and a_2 and b_2 are not both zero.) This simple system is treated separately since it has a geometrical interpretation, and its properties motivate the general case.

Fig. 1-1. Graph of 2x + y = 4.

A pair u = (k_1, k_2) of real numbers which satisfies both equations is called a simultaneous solution of the given equations, or a solution of the system of equations. There are three cases, which can be described geometrically.

(1) The system has exactly one solution. Here the graphs of the linear equations intersect in one point, as in Fig. 1-2(a).
(2) The system has no solution. Here the graphs of the linear equations are parallel, as in Fig. 1-2(b).
(3) The system has an infinite number of solutions. Here the graphs of the linear equations coincide, as in Fig. 1-2(c).

Fig. 1-2

The special cases (2) and (3) can only occur when the coefficients of x and y in the two linear equations are proportional; that is,

    a_1/a_2 = b_1/b_2    or    a_1 b_2 - a_2 b_1 = 0

Specifically, case (2) or case (3) occurs if

    a_1/a_2 = b_1/b_2 ≠ c_1/c_2    or    a_1/a_2 = b_1/b_2 = c_1/c_2

respectively. Unless otherwise stated or implied, we assume we are dealing with the general case (1).


Remark: The expression

    | a_1  b_1 |
    | a_2  b_2 |

which has the value a_1 b_2 - a_2 b_1, is called a determinant of order two. Determinants will be studied in Chapter 7. Thus the system has a unique solution when the determinant of the coefficients is not zero.

Elimination Algorithm

The solution to system (1.2) can be obtained by the process known as elimination, whereby we reduce the system to a single equation in only one unknown. Assuming the system has a unique solution, this elimination algorithm consists of the following two steps:

Step 1. Add a multiple of one equation to the other equation (or to a nonzero multiple of the other equation) so that one of the unknowns is eliminated in the new equation.

Step 2. Solve the new equation for the given unknown, and substitute its value in one of the original equations to obtain the value of the other unknown.

Example 1.7

(a) Consider the system

    L1: 2x + 5y = 8
    L2: 3x - 2y = -7

We eliminate x from the equations by forming the new equation L = 3L1 - 2L2; that is, by multiplying L1 by 3 and multiplying L2 by -2 and adding the resultant equations:

    3L1:       6x + 15y = 24
    -2L2:      -6x + 4y = 14
    Addition:       19y = 38

Solving the new equation for y yields y = 2. Substituting y = 2 into one of the original equations, say L1, yields

    2x + 5(2) = 8    or    2x + 10 = 8    or    2x = -2    or    x = -1

Thus x = -1 and y = 2, or the pair (-1, 2), is the unique solution of the system.

(b) Consider the system

    L1: x - 3y = 4
    L2: -2x + 6y = 5

Eliminate x from the equations by multiplying L1 by 2 and adding it to L2; that is, by forming the equation L = 2L1 + L2. This yields the new equation 0x + 0y = 13. This is a degenerate equation with a nonzero constant; therefore, the system has no solution. (Geometrically speaking, the lines are parallel.)

(c) Consider the system

    L1: x - 3y = 4
    L2: -2x + 6y = -8

Eliminate x by multiplying L1 by 2 and adding it to L2. This yields the new equation 0x + 0y = 0, which is a degenerate equation where the constant term is also zero. Hence the system has an infinite number of solutions, which correspond to the solutions of either equation. (Geometrically speaking, the lines coincide.) To find the general solution, let y = a and substitute in L1 to obtain x - 3a = 4, or x = 3a + 4. Accordingly, the general solution of the system is

    (3a + 4, a)

where a is any real number.


1.4 SYSTEMS OF LINEAR EQUATIONS, EQUIVALENT SYSTEMS, ELEMENTARY OPERATIONS

This section considers a system of m linear equations, say L1, L2, ..., Lm, in n unknowns x_1, x_2, ..., x_n, which can be put in the standard form

    a_11 x_1 + a_12 x_2 + ... + a_1n x_n = b_1
    a_21 x_1 + a_22 x_2 + ... + a_2n x_n = b_2
    ..........................................
    a_m1 x_1 + a_m2 x_2 + ... + a_mn x_n = b_m                           (1.3)

where the a_ij, b_i are constants.

A solution (or a particular solution) of the above system is a set of values for the unknowns, say x_1 = k_1, x_2 = k_2, ..., x_n = k_n, or an n-tuple u = (k_1, k_2, ..., k_n) of constants, which is a solution of each of the equations of the system. The set of all such solutions is called the solution set or the general solution of the system.

Example 1.8. Consider the system

    x_1 + 2x_2 - 5x_3 + 4x_4 = 3
    2x_1 + 3x_2 + x_3 - 2x_4 = 1

Determine whether x_1 = -8, x_2 = 4, x_3 = 1, x_4 = 2 is a solution of the system.

Substitute in each equation to obtain

    (1)  -8 + 2(4) - 5(1) + 4(2) = 3    or    -8 + 8 - 5 + 8 = 3    or    3 = 3
    (2)  2(-8) + 3(4) + 1 - 2(2) = 1    or    -16 + 12 + 1 - 4 = 1    or    -7 = 1

No, it is not a solution, since it is not a solution of the second equation.

Equivalent Systems, Elementary Operations

Systems of linear equations in the same unknowns are said to be equivalent if the systems have the same solution set. One way of producing a system which is equivalent to a given system, with linear equations L1, L2, ..., Lm, is by applying a sequence of the following operations, called elementary operations:

[E1] Interchange the ith equation and the jth equation: Li <-> Lj.
[E2] Multiply the ith equation by a nonzero scalar k: kLi -> Li, k ≠ 0.
[E3] Replace the ith equation by k times the jth equation plus the ith equation: (kLj + Li) -> Li.

In actual practice, we apply [E2] and then [E3] in one step, that is, the operation

[E] Replace the ith equation by k' times the jth equation plus k (nonzero) times the ith equation: (k'Lj + kLi) -> Li, k ≠ 0.

The above is formally stated in the following theorem, proved in Problem 1.46.

Theorem 1.4: Suppose a system (#) of linear equations is obtained from a system (*) of linear equations by a finite sequence of elementary operations. Then (#) and (*) have the same solution set.

Our method for solving the system (1.3) of linear equations consists of two steps:

Step 1. Use the above elementary operations to reduce the system to an equivalent simpler system (in triangular or echelon form).

Step 2. Use back-substitution to find the solution of the simpler system.

The two steps are illustrated in Example 1.9. However, for pedagogical reasons, we first discuss Step 2 in detail in Section 1.5, and then we discuss Step 1 in detail in Section 1.6.

Example 1.9. The solution of the system

    x + 2y - 4z = -4
    5x + 11y - 21z = -22
    3x - 2y + 3z = 11

is obtained as follows:

Step 1. First we eliminate x from the second equation by the elementary operation (-5L1 + L2) -> L2, that is, by multiplying L1 by -5 and adding it to L2; and then we eliminate x from the third equation by applying the elementary operation (-3L1 + L3) -> L3, i.e., by multiplying L1 by -3 and adding it to L3:

    -5L1:  -5x - 10y + 20z = 20        -3L1:  -3x - 6y + 12z = 12
    L2:     5x + 11y - 21z = -22       L3:     3x - 2y + 3z = 11
    new L2:        y - z = -2          new L3:    -8y + 15z = 23

Thus the original system is equivalent to the system

    x + 2y - 4z = -4
         y - z = -2
     -8y + 15z = 23

Next we eliminate y from the third equation by applying (8L2 + L3) -> L3, that is, by multiplying L2 by 8 and adding it to L3:

    8L2:    8y - 8z = -16
    L3:   -8y + 15z = 23
    new L3:      7z = 7

Thus we obtain the following equivalent triangular system:

    x + 2y - 4z = -4
         y - z = -2
            7z = 7

Step 2. Now we solve the simpler triangular system by back-substitution. The third equation gives z = 1. Substitute z = 1 into the second equation to obtain

    y - 1 = -2    or    y = -1

Now substitute z = 1 and y = -1 into the first equation to obtain

    x + 2(-1) - 4(1) = -4    or    x - 2 - 4 = -4    or    x - 6 = -4    or    x = 2

Thus x = 2, y = -1, z = 1, or, in other words, the ordered triple (2, -1, 1), is the unique solution of the given system.

The above two-step algorithm for solving a system of linear equations is called Gaussian elimination. The following theorem will be used in Step 1 of the algorithm.

Theorem 1.5: Suppose a system of linear equations contains the degenerate equation

    L: 0x_1 + 0x_2 + ... + 0x_n = b

(a) If b = 0, then L may be deleted from the system without changing the solution set.
(b) If b ≠ 0, then the system has no solution.

CHAP. 1] SYSTEMS OF LINEAR EQUATIONS

Proof. The proof follows directly from Theorem 1.2; that is, part (a) follows from the fact that

every vector in R\" is a solution to L, and part (b) follows from the fact that L has no solution and hencethe system has no solution.

1.5 SYSTEMS IN TRIANGULAR AND ECHELON FORM

This section considers two simple types of systems of linear equations: systems in triangular form and the more general systems in echelon form.

Triangular Form

A system of linear equations is in triangular form if the number of equations is equal to the number of unknowns and if x_k is the leading unknown of the kth equation. Thus a triangular system of linear equations has the following form:

    a_11 x_1 + a_12 x_2 + ... + a_{1,n-1} x_{n-1} + a_1n x_n = b_1
              a_22 x_2 + ... + a_{2,n-1} x_{n-1} + a_2n x_n = b_2
              ...................................................
                        a_{n-1,n-1} x_{n-1} + a_{n-1,n} x_n = b_{n-1}
                                                  a_nn x_n = b_n        (1.4)

where a_11 ≠ 0, a_22 ≠ 0, ..., a_nn ≠ 0.

The above triangular system of linear equations has a unique solution, which may be obtained by the following process, known as back-substitution. First, we solve the last equation for the last unknown, x_n:

    x_n = b_n / a_nn

Second, we substitute this value for x_n in the next-to-last equation and solve it for the next-to-last unknown, x_{n-1}:

    x_{n-1} = (b_{n-1} - a_{n-1,n} x_n) / a_{n-1,n-1}

Third, we substitute these values for x_n and x_{n-1} in the third-from-last equation and solve it for the third-from-last unknown, x_{n-2}:

    x_{n-2} = (b_{n-2} - a_{n-2,n-1} x_{n-1} - a_{n-2,n} x_n) / a_{n-2,n-2}

In general, we determine x_k by substituting the previously obtained values of x_n, x_{n-1}, ..., x_{k+1} in the kth equation:

    x_k = (b_k - a_{k,k+1} x_{k+1} - ... - a_kn x_n) / a_kk

The process ceases when we have determined the first unknown, x_1. The solution is unique since, at each step of the algorithm, the value of x_k is, by Theorem 1.1(i), uniquely determined.
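This process translates directly into a short program. The Python sketch below (ours, not from the text) takes the coefficient matrix A of a triangular system (nonzero diagonal) and the constants b, and computes x_n, x_{n-1}, ..., x_1 exactly as above:

    from fractions import Fraction

    def back_substitute(A, b):
        n = len(b)
        x = [Fraction(0)] * n
        for k in range(n - 1, -1, -1):             # k = n-1, ..., 0
            s = sum(A[k][j] * x[j] for j in range(k + 1, n))
            x[k] = (Fraction(b[k]) - s) / A[k][k]  # unique by Theorem 1.1(i)
        return x

    # The triangular system of Example 1.10 below:
    # 2x + 4y - z = 11,  5y + z = 2,  3z = -9
    A = [[2, 4, -1], [0, 5, 1], [0, 0, 3]]
    print(back_substitute(A, [11, 2, -9]))         # x = 2, y = 1, z = -3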

Example 1.10. Consider the system

    2x + 4y - z = 11
         5y + z = 2
             3z = -9

Since the system is in triangular form, it may be solved by back-substitution.

(i) The last equation yields z = -3.
(ii) Substitute in the second equation to obtain 5y - 3 = 2, or 5y = 5, or y = 1.
(iii) Substitute z = -3 and y = 1 in the first equation to obtain

    2x + 4(1) - (-3) = 11    or    2x + 4 + 3 = 11    or    2x = 4    or    x = 2

Thus the vector u = (2, 1, -3) is the unique solution of the system.

Echelon Form, Free Variables

A system of linear equations is in echelon form if no equation is degenerate and if the leading unknown in each equation is to the right of the leading unknown of the preceding equation. The paradigm is:

    a_11 x_1 + a_12 x_2 + a_13 x_3 + a_14 x_4 + ... + a_1n x_n = b_1
         a_{2 j_2} x_{j_2} + a_{2, j_2 + 1} x_{j_2 + 1} + ... + a_2n x_n = b_2
         ..........................................................
         a_{r j_r} x_{j_r} + a_{r, j_r + 1} x_{j_r + 1} + ... + a_rn x_n = b_r     (1.5)

where 1 < j_2 < ... < j_r and where a_11 ≠ 0, a_{2 j_2} ≠ 0, ..., a_{r j_r} ≠ 0. Note that r ≤ n.

An unknown x_k in the above echelon system (1.5) is called a free variable if x_k is not the leading unknown in any equation, that is, if x_k ≠ x_1, x_k ≠ x_{j_2}, ..., x_k ≠ x_{j_r}.

The following theorem, proved in Problem 1.13, describes the solution set of an echelon system.

Theorem 1.6: Consider the system (1.5) of linear equations in echelon form. There are two cases.

(i) r = n. That is, there are as many equations as unknowns. Then the system has a unique solution.
(ii) r < n. That is, there are fewer equations than unknowns. Then we can arbitrarily assign values to the n - r free variables and obtain a solution of the system.

Suppose the echelon system (1.5) does contain more unknowns than equations. Then the system has an infinite number of solutions, since each of the n - r free variables may be assigned any real number. The general solution of the system is obtained as follows. Arbitrary values, called parameters, say t_1, t_2, ..., t_{n-r}, are assigned to the free variables, and then back-substitution is used to obtain values of the nonfree variables in terms of the parameters. Alternatively, one may use back-substitution to solve for the nonfree variables x_1, x_{j_2}, ..., x_{j_r} directly in terms of the free variables.

Example 1.11. Consider the system

    x + 4y - 3z + 2t = 5
             z - 4t = 2

The system is in echelon form. The leading unknowns are x and z; hence the free variables are the other unknowns, y and t.

To find the general solution of the system, we assign arbitrary values to the free variables y and t, say y = a and t = b, and then use back-substitution to solve for the nonfree variables x and z. Substituting in the last equation yields z - 4b = 2, or z = 2 + 4b. Substitute in the first equation to get

    x + 4a - 3(2 + 4b) + 2b = 5    or    x + 4a - 6 - 12b + 2b = 5    or    x = 11 - 4a + 10b

Thus

    x = 11 - 4a + 10b,  y = a,  z = 2 + 4b,  t = b    or    (11 - 4a + 10b, a, 2 + 4b, b)

is the general solution in parametric form. Alternatively, we can use back-substitution to solve for the nonfree variables x and z directly in terms of the free variables y and t. The last equation gives z = 2 + 4t. Substitute in the first equation to obtain

    x + 4y - 3(2 + 4t) + 2t = 5    or    x + 4y - 6 - 12t + 2t = 5    or    x = 11 - 4y + 10t

Accordingly,

    x = 11 - 4y + 10t
    z = 2 + 4t

is another form for the general solution of the system.
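When an echelon system has free variables, back-substitution produces the nonfree variables as functions of the free ones, and this too is easy to encode. The sketch below (ours, not from the text) expresses the general solution of Example 1.11 as a function of the free variables y and t:

    # Echelon system of Example 1.11:
    #   x + 4y - 3z + 2t = 5
    #            z - 4t = 2
    def general_solution(y, t):
        z = 2 + 4 * t                  # solve the last equation for z
        x = 5 - 4 * y + 3 * z - 2 * t  # substitute into the first equation
        return (x, y, z, t)

    # Agrees with the closed form x = 11 - 4y + 10t, z = 2 + 4t:
    assert general_solution(2, 3) == (11 - 4 * 2 + 10 * 3, 2, 2 + 4 * 3, 3)
    print(general_solution(0, 0))      # (11, 0, 2, 0), a particular solution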

1.6 REDUCTION ALGORITHM

The following algorithm (sometimes called row reduction) reduces the system (1.3) of m linear equations in n unknowns to echelon (possibly triangular) form, or determines that the system has no solution.

Reduction algorithm

Step 1. Interchange equations so that the first unknown, x_1, appears with a nonzero coefficient in the first equation; i.e., arrange that a_11 ≠ 0.

Step 2. Use a_11 as a pivot to eliminate x_1 from all the equations except the first equation. That is, for each i > 1, apply the elementary operation (Section 1.4)

    [E3]: (-a_i1/a_11) L1 + Li -> Li    or    [E]: -a_i1 L1 + a_11 Li -> Li

Step 3. Examine each new equation L:

(a) If L has the form 0x_1 + 0x_2 + ... + 0x_n = 0, or if L is a multiple of another equation, then delete L from the system.
(b) If L has the form 0x_1 + 0x_2 + ... + 0x_n = b with b ≠ 0, then exit from the algorithm. The system has no solution.

Step 4. Repeat Steps 1, 2, and 3 with the subsystem formed by all the equations, excluding the first equation.

Step 5. Continue the above process until the system is in echelon form or a degenerate equation is obtained in Step 3(b).

The justification of Step 3 is Theorem 1.5 and the fact that if L = kL' for some other equation L' in the system, then the operation -kL' + L -> L replaces L by 0x_1 + 0x_2 + ... + 0x_n = 0, which again may be deleted by Theorem 1.5.

Example 1.12

(a) The system

    2x + y - 2z = 10
    3x + 2y + 2z = 1
    5x + 4y + 3z = 4

is solved by first reducing it to echelon form. To eliminate x from the second and third equations, apply the operations -3L1 + 2L2 -> L2 and -5L1 + 2L3 -> L3:

    -3L1:  -6x - 3y + 6z = -30         -5L1:  -10x - 5y + 10z = -50
    2L2:    6x + 4y + 4z = 2           2L3:    10x + 8y + 6z = 8
    -3L1 + 2L2:  y + 10z = -28         -5L1 + 2L3:  3y + 16z = -42

This yields the following system, from which y is eliminated from the third equation by the operation -3L2 + L3 -> L3:

    2x + y - 2z = 10                   2x + y - 2z = 10
        y + 10z = -28        ->            y + 10z = -28
       3y + 16z = -42                         -14z = 42

The system is now in triangular form. Therefore, we can use back-substitution to obtain the unique solution u = (1, 2, -3).

(b) The system

    x + 2y - 3z = 1
    2x + 5y - 8z = 4
    3x + 8y - 13z = 7

is solved by first reducing it to echelon form. To eliminate x from the second and third equations, apply -2L1 + L2 -> L2 and -3L1 + L3 -> L3 to obtain

    x + 2y - 3z = 1                    x + 2y - 3z = 1
        y - 2z = 2           or            y - 2z = 2
       2y - 4z = 4

(The third equation is deleted since it is a multiple of the second equation.) The system is now in echelon form, with free variable z.

To obtain the general solution, let z = a and solve by back-substitution. Substitute z = a into the second equation to obtain y = 2 + 2a. Then substitute z = a and y = 2 + 2a into the first equation to obtain x + 2(2 + 2a) - 3a = 1, or x = -3 - a. Thus the general solution is

    x = -3 - a,  y = 2 + 2a,  z = a    or    (-3 - a, 2 + 2a, a)

where a is the parameter.

(c) The system

    x + 2y - 3z = -1
    3x - y + 2z = 7
    5x + 3y - 4z = 2

is solved by first reducing it to echelon form. To eliminate x from the second and third equations, apply the operations -3L1 + L2 -> L2 and -5L1 + L3 -> L3 to obtain the equivalent system

    x + 2y - 3z = -1
      -7y + 11z = 10
      -7y + 11z = 7

The operation -L2 + L3 -> L3 yields the degenerate equation

    0x + 0y + 0z = -3

Thus the system has no solution.

The following basic result was indicated previously.

Theorem 1.7: Any system of linear equations has either (i) a unique solution, (ii) no solution, or (iii) an infinite number of solutions.

Proof. Applying the above algorithm to the system, we can either reduce it to echelon form or determine that it has no solution. If the echelon form has free variables, then the system has an infinite number of solutions; otherwise the echelon form is triangular and the system has a unique solution.

Remark: A system is said to be consistent if it has one or more solutions [Case (i) or (iii) in Theorem 1.7], and is said to be inconsistent if it has no solution [Case (ii) in Theorem 1.7]. Figure 1-3 illustrates this situation.

Fig. 1-3. A system of linear equations is either consistent (a unique solution or an infinite number of solutions) or inconsistent (no solution).

1.7 MATRICES

Let A be a rectangular array of numbers as follows:

    A = [ a_11  a_12  ...  a_1n ]
        [ a_21  a_22  ...  a_2n ]
        [ ..................... ]
        [ a_m1  a_m2  ...  a_mn ]

The array A is called a matrix. Such a matrix may be denoted by writing A = (a_ij), i = 1, ..., m, j = 1, ..., n, or simply A = (a_ij). The m horizontal n-tuples

    (a_11, a_12, ..., a_1n), (a_21, a_22, ..., a_2n), ..., (a_m1, a_m2, ..., a_mn)

are the rows of the matrix, and the n vertical m-tuples

    (a_11, a_21, ..., a_m1), (a_12, a_22, ..., a_m2), ..., (a_1n, a_2n, ..., a_mn)

are its columns. Note that the element a_ij, called the ij-entry or ij-component, appears in the ith row and the jth column. A matrix with m rows and n columns is called an m by n matrix, or m x n matrix; the pair of numbers (m, n) is called its size.

Example 1.13. Let

    A = [ 1 -3  4 ]
        [ 0  5 -2 ]

Then A is a 2 x 3 matrix. Its rows are (1, -3, 4) and (0, 5, -2); its columns are (1, 0), (-3, 5), and (4, -2).

The first nonzero entry in a row R of a matrix A is called the leading nonzero entry of R. If R has no leading nonzero entry, i.e., if every entry in R is 0, then R is called a zero row. If all the rows of A are zero rows, i.e., if every entry of A is 0, then A is called the zero matrix, denoted by 0.


Echelon Matrices

A matrix A is called an echelon matrix, or is said to be in echelon form, if the following two conditions hold:

(i) All zero rows, if any, are on the bottom of the matrix.
(ii) Each leading nonzero entry is to the right of the leading nonzero entry in the preceding row.

That is, A = (a_ij) is an echelon matrix if there exist nonzero entries

    a_{1 j_1}, a_{2 j_2}, ..., a_{r j_r}    where    j_1 < j_2 < ... < j_r

with the property that

    a_ij = 0    for (i) i <= r, j < j_i,  and (ii) i > r

In this case, a_{1 j_1}, ..., a_{r j_r} are the leading nonzero entries of A.

Example 1.14. The following are echelon matrices whose leading nonzero entries have been circled (shown here in parentheses):

    [ (2) 3  2  0  4  5 -6 ]     [ (1) 2  3 ]     [ 0 (1) 3  0  0  4 ]
    [  0  0 (1) 1 -3  2  0 ]     [  0  0 (1)]     [ 0  0  0 (1) 0 -3 ]
    [  0  0  0  0  0 (1) 2 ]     [  0  0  0 ]     [ 0  0  0  0 (1) 2 ]
    [  0  0  0  0  0  0  0 ]

An echelon matrix A is said to be in row canonical form if it has the following two additional properties:

(iii) Each leading nonzero entry is 1.
(iv) Each leading nonzero entry is the only nonzero entry in its column.

The third matrix above is an example of a matrix in row canonical form. The second matrix is not in row canonical form, since the leading nonzero entry in the second row is not the only nonzero entry in its column; there is a 3 above it. The first matrix is not in row canonical form, since some leading nonzero entries are not 1.

The zero matrix 0, for any number of rows or columns, is also an example of a matrix in row canonical form.

1.8 ROW EQUIVALENCE AND ELEMENTARY ROW OPERATIONS

A matrix A is said to be row equivalent to a matrix B, written A ~ B, if B can be obtained from A by a finite sequence of the following elementary row operations:

[E1] Interchange the ith row and the jth row: Ri <-> Rj.
[E2] Multiply the ith row by a nonzero scalar k: kRi -> Ri, k ≠ 0.
[E3] Replace the ith row by k times the jth row plus the ith row: kRj + Ri -> Ri.

In actual practice, we apply [E2] and then [E3] in one step, i.e., the operation

[E] Replace the ith row by k' times the jth row plus k (nonzero) times the ith row: k'Rj + kRi -> Ri, k ≠ 0.

The reader no doubt recognizes the similarity of the above operations and those used in solving systems of linear equations.

The following algorithm row reduces a matrix A into echelon form. (The term "row reduce," or simply "reduce," shall mean to transform a matrix by row operations.)


Algorithm 1.8A

Here A = (a_ij) is an arbitrary matrix.

Step 1. Find the first column with a nonzero entry. Suppose it is the j_1 column.

Step 2. Interchange the rows so that a nonzero entry appears in the first row of the j_1 column, that is, so that a_{1 j_1} ≠ 0.

Step 3. Use a_{1 j_1} as a pivot to obtain 0s below a_{1 j_1}; that is, for each i > 1, apply the row operation

    -a_{i j_1} R_1 + a_{1 j_1} R_i -> R_i    or    (-a_{i j_1}/a_{1 j_1}) R_1 + R_i -> R_i

Step 4. Repeat Steps 1, 2, and 3 with the submatrix formed by all the rows, excluding the first row.

Step 5. Continue the above process until the matrix is in echelon form.

Example 1.15. The matrix

    A = [ 1 2 -3 0 ]
        [ 2 4 -2 2 ]
        [ 3 6 -4 3 ]

is reduced to echelon form by Algorithm 1.8A as follows:

Use a_11 = 1 as a pivot to obtain 0s below a_11, that is, apply the row operations -2R1 + R2 -> R2 and -3R1 + R3 -> R3 to obtain the matrix

    [ 1 2 -3 0 ]
    [ 0 0  4 2 ]
    [ 0 0  5 3 ]

Now use a_23 = 4 as a pivot to obtain a 0 below a_23, that is, apply the row operation -5R2 + 4R3 -> R3 to obtain the matrix

    [ 1 2 -3 0 ]
    [ 0 0  4 2 ]
    [ 0 0  0 2 ]

The matrix is now in echelon form.
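A direct transcription of Algorithm 1.8A (ours, not from the text), using the fraction-free form of the pivoting operation, reproduces the reduction of Example 1.15:

    def row_echelon(A):
        A = [list(r) for r in A]
        m, n = len(A), len(A[0])
        top = 0                                  # first row of the current submatrix
        for j in range(n):                       # Step 1: first column with a nonzero entry
            i = next((i for i in range(top, m) if A[i][j] != 0), None)
            if i is None:
                continue
            A[top], A[i] = A[i], A[top]          # Step 2: bring a nonzero entry to the top
            for r in range(top + 1, m):          # Step 3: 0s below the pivot
                A[r] = [A[top][j] * A[r][k] - A[r][j] * A[top][k] for k in range(n)]
            top += 1                             # Steps 4-5: repeat on the submatrix
            if top == m:
                break
        return A

    A = [[1, 2, -3, 0], [2, 4, -2, 2], [3, 6, -4, 3]]
    print(row_echelon(A))   # [[1, 2, -3, 0], [0, 0, 4, 2], [0, 0, 0, 2]]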

The following algorithm row reduces an echelon matrix into its row canonical form.

Algorithm 1.8B

Here A = (a_ij) is in echelon form, say with leading nonzero entries

    a_{1 j_1}, a_{2 j_2}, ..., a_{r j_r}

Step 1. Multiply the last nonzero row R_r by 1/a_{r j_r}, so that the leading nonzero entry is 1.

Step 2. Use a_{r j_r} = 1 as a pivot to obtain 0s above the pivot; that is, for i = r - 1, r - 2, ..., 1, apply the operation

    -a_{i j_r} R_r + R_i -> R_i

Step 3. Repeat Steps 1 and 2 for rows R_{r-1}, R_{r-2}, ..., R_2.

Step 4. Multiply R_1 by 1/a_{1 j_1}.

Example 1.16. Using Algorithm 1.8B, the echelon matrix

    A = [ 2 3 4 5 6 ]
        [ 0 0 3 2 5 ]
        [ 0 0 0 0 4 ]

is reduced to row canonical form as follows:

Multiply R3 by 1/4 so that the leading nonzero entry equals 1; and then use a_35 = 1 as a pivot to obtain 0s above it by applying the operations -5R3 + R2 -> R2 and -6R3 + R1 -> R1:

    [ 2 3 4 5 6 ]      [ 2 3 4 5 0 ]
    [ 0 0 3 2 5 ]  ~   [ 0 0 3 2 0 ]
    [ 0 0 0 0 4 ]      [ 0 0 0 0 1 ]

Multiply R2 by 1/3 so that the leading nonzero entry equals 1; and then use a_23 = 1 as a pivot to obtain a 0 above it with the operation -4R2 + R1 -> R1:

    [ 2 3 4 5 0 ]      [ 2 3 0 7/3 0 ]
    [ 0 0 3 2 0 ]  ~   [ 0 0 1 2/3 0 ]
    [ 0 0 0 0 1 ]      [ 0 0 0  0  1 ]

Finally, multiply R1 by 1/2 to obtain

    [ 1 3/2 0 7/6 0 ]
    [ 0  0  1 2/3 0 ]
    [ 0  0  0  0  1 ]

This matrix is the row canonical form of A.
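Algorithm 1.8B can be transcribed just as directly. The sketch below (ours, not from the text) normalizes each pivot row and clears the entries above each pivot, working from the last nonzero row upward; exact fractions reproduce the 7/6 and 2/3 of Example 1.16:

    from fractions import Fraction

    def row_canonical(A):
        A = [[Fraction(x) for x in row] for row in A]
        pivots = [(i, row.index(next(x for x in row if x)))   # (row, pivot column)
                  for i, row in enumerate(A) if any(row)]
        for i, j in reversed(pivots):
            A[i] = [x / A[i][j] for x in A[i]]                # Steps 1 and 4: pivot -> 1
            for r in range(i):                                # Step 2: 0s above the pivot
                A[r] = [A[r][k] - A[r][j] * A[i][k] for k in range(len(A[r]))]
        return A

    A = [[2, 3, 4, 5, 6], [0, 0, 3, 2, 5], [0, 0, 0, 0, 4]]
    for row in row_canonical(A):
        print([str(x) for x in row])   # 1 3/2 0 7/6 0 / 0 0 1 2/3 0 / 0 0 0 0 1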

Algorithms 1.8A and 1.8B show that any matrix is row equivalent to at least one matrix in row canonical form. In Chapter 5 we prove that such a matrix is unique; that is,

Theorem 1.8: Any matrix A is row equivalent to a unique matrix in row canonical form (called the row canonical form of A).

Remark: If a matrix A is in echelon form, then its leading nonzero entries will be called pivot entries. The term comes from the above algorithm, which row reduces a matrix to echelon form.

1.9 SYSTEMS OF LINEAR EQUATIONS AND MATRICES

The augmented matrix M of the system (1.3) of m linear equations in n unknowns is as follows:

    M = [ a_11  a_12  ...  a_1n  b_1 ]
        [ a_21  a_22  ...  a_2n  b_2 ]
        [ .......................... ]
        [ a_m1  a_m2  ...  a_mn  b_m ]

Observe that each row of M corresponds to an equation of the system, and each column of M corresponds to the coefficients of an unknown, except the last column, which corresponds to the constants of the system.

The coefficient matrix A of the system (1.3) is

    A = [ a_11  a_12  ...  a_1n ]
        [ a_21  a_22  ...  a_2n ]
        [ ..................... ]
        [ a_m1  a_m2  ...  a_mn ]

Note that the coefficient matrix A may be obtained from the augmented matrix M by omitting the last column of M.

One way to solve a system of linear equations is by working with its augmented matrix M, specifically, by reducing the augmented matrix to echelon form (which tells whether the system is consistent) and then reducing it to its row canonical form (which essentially gives the solution). The justification of this process comes from the following facts:

(1) Any elementary row operation on the augmented matrix M of the system is equivalent to applying the corresponding operation on the system itself.
(2) The system has a solution if and only if the echelon form of the augmented matrix M does not have a row of the form (0, 0, ..., 0, b) with b ≠ 0.
(3) In the row canonical form of the augmented matrix M (excluding zero rows), the coefficient of each nonfree variable is a leading nonzero entry which is equal to one and is the only nonzero entry in its respective column; hence the free-variable form of the solution is obtained by simply transferring the free-variable terms to the other side.

This process is illustrated in the following example.

Example 1.17

(a) The system

    x + y - 2z + 4t = 5
    2x + 2y - 3z + t = 3
    3x + 3y - 4z - 2t = 1

is solved by reducing its augmented matrix M to echelon form and then to row canonical form as follows:

    M = [ 1 1 -2  4 5 ]      [ 1 1 -2   4   5 ]      [ 1 1 0 -10 -9 ]
        [ 2 2 -3  1 3 ]  ~   [ 0 0  1  -7  -7 ]  ~   [ 0 0 1  -7 -7 ]
        [ 3 3 -4 -2 1 ]      [ 0 0  2 -14 -14 ]

[The third row (in the second matrix) is deleted, since it is a multiple of the second row and would result in a zero row.] Thus the free-variable form of the general solution of the system is as follows:

    x + y - 10t = -9        x = -9 - y + 10t
    z - 7t = -7             z = -7 + 7t

Here the free variables are y and t, and the nonfree variables are x and z.

(b) The system

    x_1 + x_2 - 2x_3 + 3x_4 = 4
    2x_1 + 3x_2 + 3x_3 - x_4 = 3
    5x_1 + 7x_2 + 4x_3 + x_4 = 5

is solved as follows. First we reduce its augmented matrix to echelon form:

    [ 1 1 -2  3 4 ]      [ 1 1 -2   3   4 ]      [ 1 1 -2  3  4 ]
    [ 2 3  3 -1 3 ]  ~   [ 0 1  7  -7  -5 ]  ~   [ 0 1  7 -7 -5 ]
    [ 5 7  4  1 5 ]      [ 0 2 14 -14 -15 ]      [ 0 0  0  0 -5 ]

There is no need to continue to find the row canonical form of the matrix, since the echelon matrix already tells us that the system has no solution. Specifically, the third row of the echelon matrix corresponds to the degenerate equation

    0x_1 + 0x_2 + 0x_3 + 0x_4 = -5

which has no solution.

(c) The system

    x + 2y + z = 3
    2x + 5y - z = -4
    3x - 2y - z = 5

is solved by reducing its augmented matrix M to echelon form and then to row canonical form as follows:

    M = [ 1  2  1  3 ]      [ 1  2  1   3 ]      [ 1 2  1   3 ]      [ 1 0 0  2 ]
        [ 2  5 -1 -4 ]  ~   [ 0  1 -3 -10 ]  ~   [ 0 1 -3 -10 ]  ~   [ 0 1 0 -1 ]
        [ 3 -2 -1  5 ]      [ 0 -8 -4  -4 ]      [ 0 0  1   3 ]      [ 0 0 1  3 ]

Thus the system has the unique solution x = 2, y = -1, z = 3, or u = (2, -1, 3). (Note that the echelon form of M already indicated that the solution was unique, since it corresponded to a triangular system.)
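For a square system with a unique solution, such as Example 1.17(c), the same answer can also be obtained from a standard library routine; the following sketch (ours, not from the text) uses numpy:

    import numpy as np

    A = np.array([[1, 2, 1], [2, 5, -1], [3, -2, -1]], dtype=float)  # coefficient matrix
    b = np.array([3, -4, 5], dtype=float)                            # constants
    print(np.linalg.solve(A, b))   # [ 2. -1.  3.], i.e., u = (2, -1, 3)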

1.10 HOMOGENEOUS SYSTEMS OF LINEAR EQUATIONS

The system (1.3) of linear equations is said to be homogeneous if all the constants are equal to zero, that is, if the system has the form

    a_11 x_1 + a_12 x_2 + ... + a_1n x_n = 0
    a_21 x_1 + a_22 x_2 + ... + a_2n x_n = 0
    ........................................
    a_m1 x_1 + a_m2 x_2 + ... + a_mn x_n = 0                             (1.6)

In fact, the system (1.6) is called the homogeneous system associated with the system (1.3).

The homogeneous system (1.6) always has a solution, namely the zero n-tuple 0 = (0, 0, ..., 0), called the zero or trivial solution. (Any other solution, if it exists, is called a nonzero or nontrivial solution.) Thus it can always be reduced to an equivalent homogeneous system in echelon form:

    a_11 x_1 + a_12 x_2 + ... + a_1n x_n = 0
         a_{2 j_2} x_{j_2} + a_{2, j_2 + 1} x_{j_2 + 1} + ... + a_2n x_n = 0
         ...........................................................
         a_{r j_r} x_{j_r} + a_{r, j_r + 1} x_{j_r + 1} + ... + a_rn x_n = 0     (1.7)

There are two possibilities:

(i) r = n. Then the system has only the zero solution.
(ii) r < n. Then the system has a nonzero solution.

Accordingly, if we begin with fewer equations than unknowns, then, in echelon form, r < n, and hence the system has a nonzero solution. This proves the following important theorem.

Theorem 1.9: A homogeneous system of linear equations with more unknowns than equations has a nonzero solution.

Example 1.18

(a) The homogeneous system

    x + 2y - 3z + w = 0
    x - 3y + z - 2w = 0
    2x + y - 3z + 5w = 0

has a nonzero solution since there are four unknowns but only three equations.

(b) We reduce the following system to echelon form:

    x + y - z = 0         x + y - z = 0         x + y - z = 0
    2x - 3y + z = 0   ->    -5y + 3z = 0    ->    -5y + 3z = 0
    x - 4y + 2z = 0         -5y + 3z = 0

The system has a nonzero solution, since we obtained only two equations in the three unknowns in echelon form. For example, let z = 5; then y = 3 and x = 2. In other words, the 3-tuple (2, 3, 5) is a particular nonzero solution.

(c) We reduce the following system to echelon form:

    x + y - z = 0         x + y - z = 0         x + y - z = 0
    2x + 4y - z = 0   ->      2y + z = 0    ->      2y + z = 0
    3x + 2y + 2z = 0         -y + 5z = 0              11z = 0

Since in echelon form there are three equations in three unknowns, the given system has only the zero solution (0, 0, 0).

Basis for the General Solution of a Homogeneous System

Let W denote the general solution of a homogeneous system. Nonzero solution vectors u_1, u_2, ..., u_s are said to form a basis of W if every solution vector w in W can be expressed uniquely as a linear combination of u_1, u_2, ..., u_s. The number s of such basis vectors is called the dimension of W, written dim W = s. (If W = {0}, we define dim W = 0.)

The following theorem, proved in Chapter 5, tells us how to find such a basis.

Theorem 1.10: Let W be the general solution of a homogeneous system, and suppose an echelon form of the system has s free variables. Let u_1, u_2, ..., u_s be the solutions obtained by setting one of the free variables equal to one (or any nonzero constant) and the remaining free variables equal to zero. Then dim W = s, and u_1, u_2, ..., u_s form a basis of W.

Remark: The above term linear combination refers to multiplying vectors by scalars and adding, where such operations are defined by

    k(a_1, a_2, ..., a_n) = (ka_1, ka_2, ..., ka_n)
    (a_1, a_2, ..., a_n) + (b_1, b_2, ..., b_n) = (a_1 + b_1, a_2 + b_2, ..., a_n + b_n)

These operations are studied in detail in Chapter 2.


Example 1.19. Suppose we want to find the dimension and a basis for the general solution W of the homogeneous system

    x + 2y - 3z + 2s - 4t = 0
    2x + 4y - 5z + s - 6t = 0
    5x + 10y - 13z + 4s - 16t = 0

First we reduce the system to echelon form. Applying the operations -2L1 + L2 -> L2 and -5L1 + L3 -> L3, and then -2L2 + L3 -> L3, yields:

    x + 2y - 3z + 2s - 4t = 0        x + 2y - 3z + 2s - 4t = 0
             z - 3s + 2t = 0    and           z - 3s + 2t = 0
            2z - 6s + 4t = 0

In echelon form, the system has three free variables, y, s, and t; hence dim W = 3. Three solution vectors which form a basis for W are obtained as follows:

(1) Set y = 1, s = 0, t = 0. Back-substitution yields the solution u_1 = (-2, 1, 0, 0, 0).
(2) Set y = 0, s = 1, t = 0. Back-substitution yields the solution u_2 = (7, 0, 3, 1, 0).
(3) Set y = 0, s = 0, t = 1. Back-substitution yields the solution u_3 = (-2, 0, -2, 0, 1).

The set {u_1, u_2, u_3} is a basis for W.

Now any solution of the system can be written in the form

    a u_1 + b u_2 + c u_3 = a(-2, 1, 0, 0, 0) + b(7, 0, 3, 1, 0) + c(-2, 0, -2, 0, 1)
                          = (-2a + 7b - 2c, a, 3b - 2c, b, c)

where a, b, c are arbitrary constants. Observe that this is nothing other than the parametric form of the general solution under the choice of parameters y = a, s = b, t = c.
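The basis construction of Theorem 1.10 is mechanical once the echelon form is known: set one free variable to 1, the others to 0, and back-substitute. The sketch below (ours, not from the text) does this for the echelon system of Example 1.19:

    # Echelon form of Example 1.19:
    #   x + 2y - 3z + 2s - 4t = 0
    #            z - 3s + 2t = 0
    def solve(y, s, t):
        z = 3 * s - 2 * t                    # back-substitute in the second equation
        x = -2 * y + 3 * z - 2 * s + 4 * t   # then in the first equation
        return (x, y, z, s, t)

    u1 = solve(1, 0, 0)   # (-2, 1, 0, 0, 0)
    u2 = solve(0, 1, 0)   # (7, 0, 3, 1, 0)
    u3 = solve(0, 0, 1)   # (-2, 0, -2, 0, 1)
    print(u1, u2, u3)     # the basis {u1, u2, u3} of W; dim W = 3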

Nonhomogeneous and Associated Homogeneous Systems

The relationship between the nonhomogeneous system (1.3) and its associated homogeneous system (1.6) is contained in the following theorem, whose proof is postponed until Chapter 3 (Theorem 3.5).

Theorem 1.11: Let v_0 be a particular solution and let U be the general solution of a nonhomogeneous system of linear equations. Then

    U = v_0 + W = {v_0 + w : w in W}

where W is the general solution of the associated homogeneous system.

That is, U = v_0 + W may be obtained by adding v_0 to each element of W.

The above theorem has a geometrical interpretation in the space R^3. Specifically, if W is a line through the origin, then, as pictured in Fig. 1-4, U = v_0 + W is the line parallel to W which can be obtained by adding v_0 to each element of W. Similarly, whenever W is a plane through the origin, then U = v_0 + W is a plane parallel to W.


Fig. 1-4

Solved Problems

LINEAR EQUATIONS, SOLUTIONS

1.1. Determine whether each equation is linear:

    (a) 5x + 7y - 8yz = 16    (b) x + πy + ez = log 5    (c) 3x + ky - 8z = 16

(a) No, since the product yz of two unknowns is of second degree.
(b) Yes, since π, e, and log 5 are constants.
(c) As it stands, there are four unknowns: x, y, z, k. Because of the term ky, it is not a linear equation. However, assuming k is a constant, the equation is linear in the unknowns x, y, z.

1.2. Consider the linear equation x + 2y - 3z = 4. Determine whether u = (8, 1, 2) is a solution.

Since x, y, z is the ordering of the unknowns, u = (8, 1, 2) is short for x = 8, y = 1, z = 2. Substitute in the equation to obtain

    8 + 2(1) - 3(2) = 4    or    8 + 2 - 6 = 4    or    4 = 4

Yes, it is a solution.

1.3. Determine whether (a) u = (3, 2, 1, 0) and (b) v = (1, 2, 4, 5) are solutions of the equation x_1 + 2x_2 - 4x_3 + x_4 = 3.

(a) Substitute to obtain 3 + 2(2) - 4(1) + 0 = 3, or 3 = 3; yes, it is a solution.
(b) Substitute to obtain 1 + 2(2) - 4(4) + 5 = 3, or -6 = 3; not a solution.

1.4. Is u = (6, 4, -2) a solution of the equation 3x_2 + x_3 - x_1 = 4?

By convention, the components of u are ordered according to the subscripts on the unknowns. That is, u = (6, 4, -2) is short for x_1 = 6, x_2 = 4, x_3 = -2. Substitute in the equation to obtain 3(4) - 2 - 6 = 4, or 4 = 4. Yes, it is a solution.

1.5. Prove Theorem 1.1.

Suppose a ≠ 0. Then the scalar b/a exists. Substituting b/a in ax = b yields a(b/a) = b, or b = b; hence b/a is a solution. On the other hand, suppose x0 is a solution to ax = b, so that ax0 = b. Multiplying both sides by 1/a yields x0 = b/a. Hence b/a is the unique solution of ax = b. Thus (i) is proved.

On the other hand, suppose a = 0. Then, for any scalar k, we have ak = 0k = 0. If b ≠ 0, then ak ≠ b. Accordingly, k is not a solution of ax = b, and so (ii) is proved. If b = 0, then ak = b. That is, any scalar k is a solution of ax = b, and so (iii) is proved.


1.6. Solve each equation:

    (a) ex = log 5        (c) 3x - 4 - x = 2x + 3
    (b) cx = 0            (d) 7 + 2x - 4 = 3x + 3 - x

(a) Since e ≠ 0, multiply by 1/e to obtain x = (log 5)/e.

(b) If c ≠ 0, then x = 0/c = 0 is the unique solution. If c = 0, then every scalar k is a solution [Theorem 1.1(iii)].

(c) Rewrite in standard form, 2x - 4 = 2x + 3, or 0x = 7. The equation has no solution [Theorem 1.1(ii)].

(d) Rewrite in standard form, 3 + 2x = 2x + 3, or 0x = 0. Every scalar k is a solution [Theorem 1.1(iii)].

1.7. Describe the solutions of the equation 2x + y + x - 5 = 2y + 3x - y + 4.

Rewrite in standard form by collecting terms and transposing:

    3x + y - 5 = 3x + y + 4    or    0x + 0y = 9

The equation is degenerate with a nonzero constant; thus the equation has no solution.

1.8. Describe the solutions of the equation 2y + 3x - y + 4 = x + 3 + y + 1 + 2x.

Rewrite in standard form by collecting terms and transposing:

    y + 3x + 4 = 3x + 4 + y    or    0x + 0y = 0

The equation is degenerate with a zero constant; thus every vector u = (a, b) in R² is a solution.

1.9. Prove Theorem 1.3.

First we prove (i). Set xj = kj for j ≠ p. Because aj = 0 for j < p, substitution in the equation yields

    ap xp + ap+1 kp+1 + ··· + an kn = b    or    ap xp = b - ap+1 kp+1 - ··· - an kn

with ap ≠ 0. By Theorem 1.1(i), xp is uniquely determined as

    xp = (1/ap)(b - ap+1 kp+1 - ··· - an kn)

Thus (i) is proved.

Now we prove (ii). Suppose u = (k1, k2, ..., kn) is a solution. Then

    ap kp + ap+1 kp+1 + ··· + an kn = b    or    kp = (1/ap)(b - ap+1 kp+1 - ··· - an kn)

This, however, is precisely the solution obtained in (i). Thus (ii) is proved.

1.10. Consider the linear equation x - 2y + 3z = 4. Find (a) three particular solutions and (b) the general solution.

(a) Here x is the leading unknown. Accordingly, assign any values to the free variables y and z, and then solve for x to obtain a solution. For example:

(1) Set y = 1 and z = 1. Substitution in the equation yields

    x - 2(1) + 3(1) = 4    or    x - 2 + 3 = 4    or    x = 3

Thus u1 = (3, 1, 1) is a solution.

(2) Set y = 1, z = 0. Substitution yields x = 6; hence u2 = (6, 1, 0) is a solution.

(3) Set y = 0, z = 1. Substitution yields x = 1; hence u3 = (1, 0, 1) is a solution.

(b) To find the general solution, assign arbitrary values to the free variables, say y = a and z = b. (We call a and b parameters of the solution.) Then substitute in the equation to obtain

    x - 2a + 3b = 4    or    x = 4 + 2a - 3b

Thus u = (4 + 2a - 3b, a, b) is the general solution.

SYSTEMS IN TRIANGULAR AND ECHELON FORM

1.11. Solve the system

    2x - 3y + 5z - 2t = 9
         5y -  z + 3t = 1
              7z -  t = 3
                   2t = 8

The system is in triangular form; hence we solve by back-substitution.

(i) The last equation gives t = 4.

(ii) Substituting in the third equation gives 7z - 4 = 3, or 7z = 7, or z = 1.

(iii) Substituting z = 1 and t = 4 in the second equation gives

    5y - 1 + 3(4) = 1    or    5y - 1 + 12 = 1    or    5y = -10    or    y = -2

(iv) Substituting y = -2, z = 1, t = 4 in the first equation gives

    2x - 3(-2) + 5(1) - 2(4) = 9    or    2x + 6 + 5 - 8 = 9    or    2x = 6    or    x = 3

Thus x = 3, y = -2, z = 1, t = 4 is the unique solution of the system.
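Remark (computational): Back-substitution is easy to mechanize. The following is a minimal Python sketch (the helper name is ours, for illustration; floating-point arithmetic assumed), applied to the triangular system of this problem.

    # Solve an upper-triangular system Ax = b from the bottom row up.
    def back_substitute(A, b):
        n = len(b)
        x = [0.0] * n
        for i in range(n - 1, -1, -1):
            s = sum(A[i][j] * x[j] for j in range(i + 1, n))
            x[i] = (b[i] - s) / A[i][i]    # A[i][i] must be nonzero
        return x

    A = [[2, -3,  5, -2],
         [0,  5, -1,  3],
         [0,  0,  7, -1],
         [0,  0,  0,  2]]
    b = [9, 1, 3, 8]
    print(back_substitute(A, b))   # [3.0, -2.0, 1.0, 4.0]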

1.12. Determine the free variables in each system:

    (a) 3x + 2y - 5z - 6s + 2t = 4      (b) 5x - 3y + 7z = 1      (c)  x + 2y - 3z = 2
                 z + 8s - 3t = 6                 4y + 5z = 6          2x - 3y + 2z = 1
                      s - 5t = 5                      4z = 9          5x - 4y -  z = 4

(a) In the echelon form, any unknown that is not a leading unknown is termed a free variable. Here, y and t are the free variables.

(b) The leading unknowns are x, y, z. Hence there are no free variables (as in any triangular system).

(c) The notion of free variable applies only to a system in echelon form.

1.13. Prove Theorem 1.6.

There are two cases:

(i) r = n. That is, there are as many equations as unknowns. Then the system has a unique solution.

(ii) r < n. That is, there are fewer equations than unknowns. Then we can arbitrarily assign values to the n - r free variables and obtain a solution of the system.

The proof is by induction on the number r of equations in the system. If r = 1, then we have a single, nondegenerate, linear equation, to which Theorem 1.3 applies when n > r = 1 and Theorem 1.1 applies when n = r = 1. Thus the theorem holds for r = 1.

Now assume that r > 1 and that the theorem is true for a system of r - 1 equations. We view the r - 1 equations

    a2j2 xj2 + a2,j2+1 xj2+1 + ··· + a2n xn = b2
    ........................................
    arjr xjr + ··· + arn xn = br

as a system in the unknowns xj2, ..., xn. Note that the system is in echelon form. By the induction hypothesis, we can arbitrarily assign values to the (n - j2 + 1) - (r - 1) free variables in the reduced system to obtain a solution (say, xj2 = kj2, ..., xn = kn). As in case r = 1, these values together with arbitrary values for the additional j2 - 2 free variables (say, x2 = k2, ..., xj2-1 = kj2-1) yield a solution of the first equation with

    x1 = (1/a11)(b1 - a12 k2 - ··· - a1n kn)

[Note that there are (n - j2 + 1) - (r - 1) + (j2 - 2) = n - r free variables.] Furthermore, these values for x1, ..., xn also satisfy the other equations since, in these equations, the coefficients of x1, ..., xj2-1 are zero.

Now if r = n, then j2 = 2. Thus by induction we obtain a unique solution of the subsystem and then a unique solution of the entire system. Accordingly, the theorem is proved.

1.14. Find the general solution of the echelon system

    x - 2y - 3z + 5s - 2t = 4
             2z - 6s + 3t = 2
                       5t = 10

Since the equations begin with the unknowns x, z, and t, respectively, the other unknowns y and s are the free variables. To find the general solution, assign parameters to the free variables, say y = a and s = b, and use back-substitution to solve for the nonfree variables x, z, and t.

(i) The last equation yields t = 2.

(ii) Substitute t = 2, s = b in the second equation to obtain

    2z - 6b + 3(2) = 2    or    2z - 6b + 6 = 2    or    2z = 6b - 4    or    z = 3b - 2

(iii) Substitute t = 2, s = b, z = 3b - 2, y = a in the first equation to obtain

    x - 2a - 3(3b - 2) + 5b - 2(2) = 4    or    x - 2a - 9b + 6 + 5b - 4 = 4    or    x = 2a + 4b + 2

Thus

    x = 2a + 4b + 2    y = a    z = 3b - 2    s = b    t = 2

or, equivalently,

    u = (2a + 4b + 2, a, 3b - 2, b, 2)

is the parametric form of the general solution.

Alternatively, solving for x, z, and t in terms of the free variables y and s yields the following free-variable form of the general solution:

    x = 2y + 4s + 2    z = 3s - 2    t = 2

SYSTEMS OF LINEAR EQUATIONS, GAUSSIAN ELIMINATION

1.15. Solve the system

     x - 2y +  z =  7
    2x -  y + 4z = 17
    3x - 2y + 2z = 14

Reduce to echelon form. Apply -2L1 + L2 → L2 and -3L1 + L3 → L3 to eliminate x from the second and third equations, and then apply -4L2 + 3L3 → L3 to eliminate y from the third equation. These operations yield

    x - 2y +  z = 7           x - 2y + z = 7
        3y + 2z = 3    and        3y + 2z = 3
        4y -  z = -7                 -11z = -33

The system is in triangular form, and hence, after back-substitution, has the unique solution u = (2, -1, 3).
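Remark (computational): Any exact linear solver reproduces this result. A minimal sketch in Python, assuming the sympy library is available:

    from sympy import symbols, linsolve

    x, y, z = symbols('x y z')
    eqs = [x - 2*y + z - 7,          # x - 2y + z = 7
           2*x - y + 4*z - 17,       # 2x - y + 4z = 17
           3*x - 2*y + 2*z - 14]     # 3x - 2y + 2z = 14
    print(linsolve(eqs, x, y, z))    # {(2, -1, 3)}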

1.16. Solve the system

    2x -  5y + 3z - 4s + 2t =  4
    3x -  7y + 2z - 5s + 4t =  9
    5x - 10y - 5z - 4s + 7t = 22

Reduce the system to echelon form. Apply the operations -3L1 + 2L2 → L2 and -5L1 + 2L3 → L3, and then -5L2 + L3 → L3, to obtain

    2x - 5y +  3z -  4s + 2t =  4           2x - 5y + 3z - 4s + 2t = 4
          y -  5z +  2s + 2t =  6    and         y - 5z + 2s + 2t = 6
         5y - 25z + 12s + 4t = 24                        2s - 6t = -6

The system is now in echelon form. Solving for the leading unknowns, x, y, and s, in terms of the free variables, z and t, we obtain the free-variable form of the general solution:

    x = 26 + 11z - 15t    y = 12 + 5z - 8t    s = -3 + 3t

From this follows at once the parametric form of the general solution (where z = a, t = b):

    x = 26 + 11a - 15b    y = 12 + 5a - 8b    z = a    s = -3 + 3b    t = b

1.17. Solve the system

     x +  2y - 3z + 4t = 2
    2x +  5y - 2z +  t = 1
    5x + 12y - 7z + 6t = 7

Reduce the system to echelon form. Eliminate x from the second and third equations by the operations -2L1 + L2 → L2 and -5L1 + L3 → L3; this yields the system

    x + 2y - 3z +  4t =  2
         y + 4z -  7t = -3
        2y + 8z - 14t = -3

The operation -2L2 + L3 → L3 yields the degenerate equation 0 = 3. Thus the system has no solution (even though the system has more unknowns than equations).

1.18. Determine the values of k so that the following system in unknowns x, y, z has: (i) a unique solution, (ii) no solution, (iii) an infinite number of solutions.

     x +  y -  z = 1
    2x + 3y + kz = 3
     x + ky + 3z = 2

Reduce the system to echelon form. Eliminate x from the second and third equations by the operations -2L1 + L2 → L2 and -L1 + L3 → L3 to obtain

    x + y -        z = 1
        y + (k + 2)z = 1
    (k - 1)y +    4z = 1

To eliminate y from the third equation, apply the operation -(k - 1)L2 + L3 → L3 to obtain

    x + y -            z = 1
        y +     (k + 2)z = 1
         (3 + k)(2 - k)z = 2 - k

The system has a unique solution if the coefficient of z in the third equation is not zero; that is, if k ≠ 2 and k ≠ -3. In case k = 2, the third equation reduces to 0 = 0 and the system has an infinite number of solutions (one for each value of z). In case k = -3, the third equation reduces to 0 = 5 and the system has no solution. Summarizing: (i) k ≠ 2 and k ≠ -3, (ii) k = -3, (iii) k = 2.

1.19. What condition must be placed on a, b, and c so that the following system in unknowns x, y, and z has a solution?

     x + 2y -  3z = a
    2x + 6y - 11z = b
     x - 2y +  7z = c

Reduce to echelon form. Eliminating x from the second and third equations by the operations -2L1 + L2 → L2 and -L1 + L3 → L3, we obtain the equivalent system

    x + 2y -  3z = a
        2y -  5z = b - 2a
       -4y + 10z = c - a

Eliminating y from the third equation by the operation 2L2 + L3 → L3, we finally obtain the equivalent system

    x + 2y - 3z = a
        2y - 5z = b - 2a
              0 = c + 2b - 5a

The system will have no solution if c + 2b - 5a ≠ 0. Thus the system will have at least one solution if c + 2b - 5a = 0, or 5a = 2b + c. Note, in this case, that the system will have infinitely many solutions. In other words, the system cannot have a unique solution.

MATRICES, ECHELON MATRICES, ROW REDUCTION

1.20. Interchange the rows in each of the following matrices to obtain an echelon matrix:

    (a)  0  1  -3   4  6      (b)  0  0  0  0  0      (c)  0  2  2  2  2
         4  0   2   5  3           1  2  3  4  5           0  3  1  0  0
         0  0   7  -2  8           0  0  5  4  7           0  0  0  0  0

(a) Interchange the first and second rows, i.e., apply the elementary row operation R1 ↔ R2.

(b) Bring the zero row to the bottom of the matrix, i.e., apply R1 ↔ R2 and then R2 ↔ R3.

(c) No amount of row interchanges can produce an echelon matrix.

1.21. Row reduce the following matrix to echelon form:

    A =  1  2  -3
         2  4  -2
         3  6  -4

Use a11 = 1 as a pivot to obtain 0s below a11; that is, apply the row operations -2R1 + R2 → R2 and -3R1 + R3 → R3 to obtain the matrix

    1  2  -3
    0  0   4
    0  0   5

Now use a23 = 4 as a pivot to obtain a 0 below a23; that is, apply the row operation -5R2 + 4R3 → R3 to obtain the matrix

    1  2  -3
    0  0   4
    0  0   0

which is in echelon form.

1.22. Row reduce the following matrix to echelon form:

    B =  -4  1  -6
          1  2  -5
          6  3  -4

Hand calculations are usually simpler if the pivot element equals 1. Therefore, first interchange R1 and R2; then apply 4R1 + R2 → R2 and -6R1 + R3 → R3; and then apply R2 + R3 → R3:

    B ~   1  2   -5    ~    1  2   -5    ~    1  2   -5
         -4  1   -6         0  9  -26         0  9  -26
          6  3   -4         0 -9   26         0  0    0

The matrix is now in echelon form.

1.23. Describe the pivoting row reduction algorithm. Also, describe the advantages, if any, of using this pivoting algorithm.

The row reduction algorithm becomes a pivoting algorithm if the entry in column j1 of greatest absolute value is chosen as the pivot a1j1 and if one uses the row operation

    (-aij1/a1j1)R1 + Ri → Ri

The main advantage of the pivoting algorithm is that the above row operation involves division by the (current) pivot a1j1 and, on the computer, roundoff errors may be substantially reduced when one divides by a number as large in absolute value as possible.
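Remark (computational): The pivoting strategy just described is sketched below in Python; it is an illustrative implementation (the function name is ours; floating-point arithmetic assumed), not a prescription from the text.

    # Gaussian elimination with partial pivoting: in each column, the
    # entry of greatest absolute value among the remaining rows is
    # swapped up and used as the pivot.
    def echelon_with_pivoting(M):
        M = [row[:] for row in M]              # work on a copy
        rows, cols = len(M), len(M[0])
        r = 0
        for c in range(cols):
            if r == rows:
                break
            p = max(range(r, rows), key=lambda i: abs(M[i][c]))
            if M[p][c] == 0:
                continue                       # no pivot in this column
            M[r], M[p] = M[p], M[r]            # bring the pivot row up
            for i in range(r + 1, rows):
                f = M[i][c] / M[r][c]          # divide by the large pivot
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
            r += 1
        return M

    for row in echelon_with_pivoting([[1.0, 2, -3], [2, 4, -2], [3, 6, -4]]):
        print(row)     # an echelon form of the matrix of Problem 1.21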

1.24. Use the pivoting algorithm to reduce the following matrix A to echelon form:

    A =   2  -2   2   1
         -3   6   0  -1
          1  -7  10   2

First interchange R1 and R2 so that -3 can be used as the pivot, and then apply (2/3)R1 + R2 → R2 and (1/3)R1 + R3 → R3:

    A ~  -3   6   0   -1    ~    -3   6   0   -1
          2  -2   2    1          0   2   2  1/3
          1  -7  10    2          0  -5  10  5/3

Now interchange R2 and R3 so that -5 may be used as the pivot, and apply (2/5)R2 + R3 → R3:

    A ~  -3   6   0   -1    ~    -3   6   0   -1
          0  -5  10  5/3          0  -5  10  5/3
          0   2   2  1/3          0   0   6    1

The matrix has been brought to echelon form.

ROW CANONICAL FORM

1.25. Which of the following echelon matrices are in row canonical form?

    1  2  -3  0   1        0  1  7  -5  0        1  0  5  0  2
    0  0   5  2  -4        0  0  0   0  1        0  1  2  0  4
    0  0   0  7   3        0  0  0   0  0        0  0  0  1  7

The first matrix is not in row canonical form since, for example, two leading nonzero entries are 5 and 7, not 1. Also, there are nonzero entries above the leading nonzero entries 5 and 7. The second and third matrices are in row canonical form.

1.26. Reduce the following matrix to row canonical form:

    B =  2  2  -1   6   4
         4  4   1  10  13
         6  6   0  20  19

First, reduce B to an echelon form by applying -2R1 + R2 → R2 and -3R1 + R3 → R3, and then -R2 + R3 → R3:

    B ~  2  2  -1   6  4    ~    2  2  -1   6  4
         0  0   3  -2  5         0  0   3  -2  5
         0  0   3   2  7         0  0   0   4  2

Next reduce the echelon matrix to row canonical form. Specifically, first multiply R3 by 1/4, so that the pivot b34 = 1, and then apply 2R3 + R2 → R2 and -6R3 + R1 → R1:

    B ~  2  2  -1   6    4    ~    2  2  -1  0    1
         0  0   3  -2    5         0  0   3  0    6
         0  0   0   1  1/2         0  0   0  1  1/2

Now multiply R2 by 1/3, making the pivot b23 = 1, and apply R2 + R1 → R1:

    B ~  2  2  -1  0    1    ~    2  2  0  0    3
         0  0   1  0    2         0  0  1  0    2
         0  0   0  1  1/2         0  0  0  1  1/2

Finally, multiply R1 by 1/2 to obtain the row canonical form

    1  1  0  0  3/2
    0  0  1  0    2
    0  0  0  1  1/2

1.27. Reduce the following matrix to row canonical form:

    A =  1  -2  3   1  2
         1   1  4  -1  3
         2   5  9  -2  8

First reduce A to echelon form by applying -R1 + R2 → R2 and -2R1 + R3 → R3, and then applying -3R2 + R3 → R3:

    A ~  1  -2  3   1  2    ~    1  -2  3   1  2
         0   3  1  -2  1         0   3  1  -2  1
         0   9  3  -4  4         0   0  0   2  1

Now use back-substitution. Multiply R3 by 1/2 to obtain the pivot a34 = 1, and then apply 2R3 + R2 → R2 and -R3 + R1 → R1:

    A ~  1  -2  3   1    2    ~    1  -2  3  0  3/2
         0   3  1  -2    1         0   3  1  0    2
         0   0  0   1  1/2         0   0  0  1  1/2

Now multiply R2 by 1/3 to obtain the pivot a22 = 1, and then apply 2R2 + R1 → R1:

    A ~  1  -2    3  0  3/2    ~    1  0  11/3  0  17/6
         0   1  1/3  0  2/3         0  1   1/3  0   2/3
         0   0    0  1  1/2         0  0     0  1   1/2

Since a11 = 1, the last matrix is the desired row canonical form.

1.28. Describe the Gauss-Jordan elimination algorithm, which reduces an arbitrary matrix A to its row canonical form.

The Gauss-Jordan algorithm is similar to the Gaussian elimination algorithm, except that here the algorithm first normalizes a row to obtain a unit pivot and then uses the pivot to place 0s both below and above the pivot before obtaining the next pivot.
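Remark (computational): The row canonical form can also be obtained by machine; sympy's rref is one Gauss-Jordan-style implementation. A minimal sketch, assuming sympy is available and using the matrix of Problem 1.27 as reconstructed above:

    from sympy import Matrix

    A = Matrix([[1, -2, 3,  1, 2],
                [1,  1, 4, -1, 3],
                [2,  5, 9, -2, 8]])
    R, pivot_cols = A.rref()
    print(R)             # the row canonical form, with unit pivots
    print(pivot_cols)    # (0, 1, 3)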

1.29. Use Gauss-Jordan elimination to obtain the row canonical form of the matrix of Problem 1.27.

Use the leading nonzero entry a11 = 1 as pivot to put 0s below it by applying -R1 + R2 → R2 and -2R1 + R3 → R3; this yields

    1  -2  3   1  2
    0   3  1  -2  1
    0   9  3  -4  4

Multiply R2 by 1/3 to get the pivot a22 = 1, and produce 0s below and above a22 by applying -9R2 + R3 → R3 and 2R2 + R1 → R1:

    1  0  11/3  -1/3  8/3
    0  1   1/3  -2/3  1/3
    0  0     0     2    1

Last, multiply R3 by 1/2 to get the pivot a34 = 1, and produce 0s above a34 by applying (2/3)R3 + R2 → R2 and (1/3)R3 + R1 → R1:

    1  0  11/3  0  17/6
    0  1   1/3  0   2/3
    0  0     0  1   1/2

1.30. One speaks of "an" echelon form of a matrix A, but of "the" row canonical form of A. Why?

An arbitrary matrix A may be row equivalent to many echelon matrices. On the other hand, regardless of the algorithm that is used, a matrix A is row equivalent to a unique matrix in row canonical form. (The term "canonical" usually connotes uniqueness.) For example, the row canonical forms in Problems 1.27 and 1.29 are identical.

1.31. Given an n × n echelon matrix in triangular form,

    A =  a11  a12  a13  ...  a1,n-1  a1n
          0   a22  a23  ...  a2,n-1  a2n
          0    0   a33  ...  a3,n-1  a3n
         ...............................
          0    0    0  ...     0     ann

with all aii ≠ 0, find the row canonical form of A.

Multiplying Rn by 1/ann and using the new ann = 1 as pivot, we obtain the matrix

    a11  a12  a13  ...  a1,n-1  0
     0   a22  a23  ...  a2,n-1  0
    ..........................
     0    0    0  ...     0     1

Observe that the last column of A has been converted into a unit vector. Each succeeding back-substitution yields a new unit column vector, and the end result is the identity matrix; i.e., A has the n × n identity matrix I as its row canonical form.

1.32. Reduce the following triangular matrix with nonzero diagonal elements to row canonical form:

    C =  5  -9  6
         0   2  3
         0   0  1

By Problem 1.31, C is row equivalent to the identity matrix. Alternatively, by back-substitution,

    C ~  5  -9  0    ~    5  -9  0    ~    5  0  0    ~    1  0  0
         0   2  0         0   1  0         0  1  0         0  1  0
         0   0  1         0   0  1         0  0  1         0  0  1

SYSTEMS OF LINEAR EQUATIONS IN MATRIX FORM

1.33. Find the augmented matrix M and the coefficient matrix A of the following system:

    x + 2y - 3z = 4
    3y - 4z + 7x = 5
    6z + 8x - 9y = 1

First align the unknowns in the system to obtain

     x + 2y - 3z = 4
    7x + 3y - 4z = 5
    8x - 9y + 6z = 1

Then

    M =  1   2  -3  4        and        A =  1   2  -3
         7   3  -4  5                        7   3  -4
         8  -9   6  1                        8  -9   6

1.34. Solve, using the augmented matrix,

     x - 2y + 4z = 2
    2x - 3y + 5z = 3
    3x - 4y + 6z = 7

Reduce the augmented matrix to echelon form:

    1  -2  4  2    ~    1  -2   4   2    ~    1  -2   4   2
    2  -3  5  3         0   1  -3  -1         0   1  -3  -1
    3  -4  6  7         0   2  -6   1         0   0   0   3

The third row of the echelon matrix corresponds to the degenerate equation 0 = 3; hence the system has no solution.

1.35. Solve, using the augmented matrix,

     x + 2y - 3z - 2s + 4t = 1
    2x + 5y - 8z -  s + 6t = 4
     x + 4y - 7z + 5s + 2t = 8

Reduce the augmented matrix to echelon form and then to row canonical form:

    1  2  -3  -2  4  1    ~    1  2  -3  -2   4  1    ~    1  0   1  0  24  21
    2  5  -8  -1  6  4         0  1  -2   3  -2  2         0  1  -2  0  -8  -7
    1  4  -7   5  2  8         0  0   0   1   2  3         0  0   0  1   2   3

Thus the free-variable form of the solution is

    x +  z + 24t = 21         x = 21 -  z - 24t
    y - 2z -  8t = -7    or   y = -7 + 2z +  8t
    s +       2t =  3         s =  3 - 2t

where z and t are the free variables.

1.36. Solve, using the augmented matrix,

     x + 2y -  z =  3
     x + 3y +  z =  5
    3x + 8y + 4z = 17

Reduce the augmented matrix to echelon form:

    1  2  -1   3    ~    1  2  -1  3    ~    1  2  -1  3
    1  3   1   5         0  1   2  2         0  1   2  2
    3  8   4  17         0  2   7  8         0  0   3  4

By back-substitution, the system has the unique solution x = 17/3, y = -2/3, z = 4/3, or u = (17/3, -2/3, 4/3).

HOMOGENEOUS SYSTEMS

1.37. Determine whether each system has a nonzero solution.

    (a)  x - 2y + 3z - 2w = 0      (b)  x + 2y - 3z = 0      (c)  x + 2y -  z = 0
        3x - 7y - 2z + 4w = 0          2x + 5y + 2z = 0          2x + 5y + 2z = 0
        4x + 3y + 5z + 2w = 0          3x -  y - 4z = 0           x + 4y + 7z = 0
                                                                  x + 3y + 3z = 0

(a) The system must have a nonzero solution since there are more unknowns than equations.

(b) Reduce to echelon form:

     x + 2y - 3z = 0          x + 2y - 3z = 0          x + 2y - 3z = 0
    2x + 5y + 2z = 0    to        y + 8z = 0    to         y + 8z = 0
    3x -  y - 4z = 0            -7y + 5z = 0                  61z = 0

In echelon form there are exactly three equations in the three unknowns; hence the system has a unique solution, the zero solution.

(c) Reduce to echelon form:

     x + 2y -  z = 0           x + 2y - z = 0
    2x + 5y + 2z = 0               y + 4z = 0          x + 2y - z = 0
     x + 4y + 7z = 0    to        2y + 8z = 0    to        y + 4z = 0
     x + 3y + 3z = 0               y + 4z = 0

In echelon form there are only two equations in the three unknowns; hence the system has a nonzero solution.

1.38. Find the dimension and a basis for the general solution W of the homogeneous system

     x +  3y - 2z +  5s - 3t = 0
    2x +  7y - 3z +  7s - 5t = 0
    3x + 11y - 4z + 10s - 9t = 0

Show how the basis gives the parametric form of the general solution of the system.

Reduce the system to echelon form. Apply the operations -2L1 + L2 → L2 and -3L1 + L3 → L3, and then -2L2 + L3 → L3, to obtain

    x + 3y - 2z + 5s - 3t = 0           x + 3y - 2z + 5s - 3t = 0
         y +  z - 3s +  t = 0    and         y +  z - 3s +  t = 0
        2y + 2z - 5s      = 0                          s - 2t = 0

In echelon form, the system has two free variables, z and t; hence dim W = 2. A basis {u1, u2} for W may be obtained as follows:

(1) Set z = 1, t = 0. Back-substitution yields s = 0, then y = -1, and then x = 5. Therefore, u1 = (5, -1, 1, 0, 0).

(2) Set z = 0, t = 1. Back-substitution yields s = 2, then y = 5, and then x = -22. Therefore, u2 = (-22, 5, 0, 2, 1).

Multiplying the basis vectors by the parameters a and b, respectively, yields

    au1 + bu2 = a(5, -1, 1, 0, 0) + b(-22, 5, 0, 2, 1) = (5a - 22b, -a + 5b, a, 2b, b)

This is the parametric form of the general solution.

1.39. Find the dimension and a basis for the general solution W of the homogeneous system

     x + 2y - 3z = 0
    2x + 5y + 2z = 0
    3x -  y - 4z = 0

Reduce the system to echelon form. From Problem 1.37(b) we have

    x + 2y - 3z = 0
         y + 8z = 0
            61z = 0

There are no free variables (the system is in triangular form). Hence dim W = 0, and W has no basis. Specifically, W consists only of the zero solution, W = {0}.

1.40. Find the dimension and a basis for the general solution W of the homogeneous system

    2x +  4y -  5z + 3t = 0
    3x +  6y -  7z + 4t = 0
    5x + 10y - 11z + 6t = 0

Reduce the system to echelon form. Apply -3L1 + 2L2 → L2 and -5L1 + 2L3 → L3, and then -3L2 + L3 → L3, to obtain

    2x + 4y - 5z + 3t = 0           2x + 4y - 5z + 3t = 0
                z -  t = 0    and               z -  t = 0
               3z - 3t = 0

In echelon form, the system has two free variables, y and t; hence dim W = 2. A basis {u1, u2} for W may be obtained as follows:

(1) Set y = 1, t = 0. Back-substitution yields the solution u1 = (-2, 1, 0, 0).

(2) Set y = 0, t = 1. Back-substitution yields the solution u2 = (1, 0, 1, 1).

1.41. Consider the system

     x - 3y - 2z + 4t =  5
    3x - 8y - 3z + 8t = 18
    2x - 3y + 5z - 4t = 19

(a) Find the parametric form of the general solution of the system.

(b) Show that the result of (a) may be rewritten in the form given by Theorem 1.11.

(a) Reduce the system to echelon form. Apply -3L1 + L2 → L2 and -2L1 + L3 → L3, and then -3L2 + L3 → L3, to obtain

    x - 3y - 2z +  4t = 5           x - 3y - 2z + 4t = 5
         y + 3z -  4t = 3    and         y + 3z - 4t = 3
        3y + 9z - 12t = 9

In echelon form, the free variables are z and t. Set z = a and t = b, where a and b are parameters. Back-substitution yields y = 3 - 3a + 4b, and then x = 14 - 7a + 8b. Thus the parametric form of the solution is

    x = 14 - 7a + 8b    y = 3 - 3a + 4b    z = a    t = b    (*)

(b) Let v0 = (14, 3, 0, 0) be the vector of constant terms in (*), let u1 = (-7, -3, 1, 0) be the vector of coefficients of a in (*), and let u2 = (8, 4, 0, 1) be the vector of coefficients of b in (*). Then the general solution (*) may be rewritten in vector form as

    (x, y, z, t) = v0 + au1 + bu2    (**)

We next show that (**) is the general solution per Theorem 1.11. First note that v0 is the solution of the nonhomogeneous system obtained by setting a = 0 and b = 0. Consider the associated homogeneous system, in echelon form:

    x - 3y - 2z + 4t = 0
         y + 3z - 4t = 0

The free variables are z and t. Set z = 1 and t = 0 to obtain the solution u1 = (-7, -3, 1, 0). Set z = 0 and t = 1 to obtain the solution u2 = (8, 4, 0, 1). By Theorem 1.10, {u1, u2} is a basis for the solution space of the associated homogeneous system. Thus (**) has the desired form.
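Remark (computational): Theorem 1.11 can be spot-checked numerically here: every vector v0 + au1 + bu2 should satisfy all three equations. A minimal Python sketch (the helper name is ours, for illustration):

    # Residuals of the three equations of this problem at a point v
    def residuals(v):
        x, y, z, t = v
        return (x - 3*y - 2*z + 4*t - 5,
                3*x - 8*y - 3*z + 8*t - 18,
                2*x - 3*y + 5*z - 4*t - 19)

    v0 = (14, 3, 0, 0)
    u1 = (-7, -3, 1, 0)
    u2 = (8, 4, 0, 1)
    for a, b in [(0, 0), (1, 2), (-3, 5)]:
        v = tuple(v0[i] + a*u1[i] + b*u2[i] for i in range(4))
        print(residuals(v))     # (0, 0, 0) for every choice of a, b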

MISCELLANEOUS PROBLEMS

1.42. Show that each of the elementary operations [E1], [E2], [E3] has an inverse operation of the same type.

    [E1] Interchange the ith equation and the jth equation: Li ↔ Lj.
    [E2] Multiply the ith equation by a nonzero scalar k: kLi → Li, k ≠ 0.
    [E3] Replace the ith equation by k times the jth equation plus the ith equation: kLj + Li → Li.

(a) Interchanging the same two equations twice, we obtain the original system; that is, Li ↔ Lj is its own inverse.

(b) Multiplying the ith equation by k and then by 1/k, or by 1/k and then k, we obtain the original system. In other words, the operations kLi → Li and (1/k)Li → Li are inverses.

(c) Applying the operation kLj + Li → Li and then the operation -kLj + Li → Li, or vice versa, we obtain the original system. In other words, the operations kLj + Li → Li and -kLj + Li → Li are inverses.

1.43. Show that the effect of applying the following operation [E] can be obtained by applying [E2] and then [E3].

    [E] Replace the ith equation by k' times the jth equation plus k (nonzero) times the ith equation: k'Lj + kLi → Li, k ≠ 0.

Applying kLi → Li and then applying k'Lj + Li → Li has the same result as applying the operation k'Lj + kLi → Li.

1.44. Suppose that each equation Li in the system (1.3) is multiplied by a constant ci, and that the resulting equations are added to yield

    (c1 a11 + ··· + cm am1)x1 + ··· + (c1 a1n + ··· + cm amn)xn = c1 b1 + ··· + cm bm    (1)

Such an equation is termed a linear combination of the equations Li. Show that any solution of the system (1.3) is also a solution of the linear combination (1).

Suppose u = (k1, k2, ..., kn) is a solution of (1.3):

    ai1 k1 + ai2 k2 + ··· + ain kn = bi    (i = 1, ..., m)    (2)

To show that u is a solution of (1), we must verify the equation

    (c1 a11 + ··· + cm am1)k1 + ··· + (c1 a1n + ··· + cm amn)kn = c1 b1 + ··· + cm bm

But this can be rearranged into

    c1(a11 k1 + ··· + a1n kn) + ··· + cm(am1 k1 + ··· + amn kn) = c1 b1 + ··· + cm bm

or, by (2),

    c1 b1 + ··· + cm bm = c1 b1 + ··· + cm bm

which is clearly a true statement.

1.45. Suppose that a system (#) of linear equations is obtained from a system (*) of linear equations by applying a single elementary operation, [E1], [E2], or [E3]. Show that (#) and (*) have all solutions in common (the two systems are equivalent).

Each equation in (#) is a linear combination of the equations in (*). Therefore, by Problem 1.44, any solution of (*) will be a solution of all the equations in (#). In other words, the solution set of (*) is contained in the solution set of (#). On the other hand, since the operations [E1], [E2], and [E3] have inverse elementary operations, the system (*) can be obtained from (#) by a single elementary operation. Accordingly, the solution set of (#) is contained in the solution set of (*). Thus (#) and (*) have the same solutions.

1.46. Prove Theorem 1.4.

By Problem 1.45, no single step changes the solution set. Hence the original system (*) and the final system (#) (and any system in between) have the same solution set.

1.47. Prove that the following three statements about a system of linear equations are equivalent: (i) The system is consistent (has a solution). (ii) No linear combination of the equations is the equation

    0x1 + 0x2 + ··· + 0xn = b,    b ≠ 0    (*)

(iii) The system is reducible to echelon form.

Suppose the system is reducible to echelon form. The echelon form has a solution, and hence the original system has a solution. Thus (iii) implies (i).

Suppose the system has a solution. By Problem 1.44, any linear combination of the equations also has a solution. But (*) has no solution; hence (*) is not a linear combination of the equations. Thus (i) implies (ii).

Finally, suppose the system is not reducible to echelon form. Then, in the Gaussian algorithm, it must yield an equation of the form (*). Hence (*) is a linear combination of the equations. Thus not-(iii) implies not-(ii), or, equivalently, (ii) implies (iii).


Supplementary Problems

SOLUTION OF LINEAR EQUATIONS

1.48. Solve:

    (a) 2x + 3y = 1        (b) 2x + 4y = 10        (c)  4x - 2y = 5
        5x + 7y = 3            3x + 6y = 15            -6x + 3y = 1

1.49. Solve:

    (a) 2x +  y - 3z =  5      (b) 2x + 3y - 2z = 5      (c)  x + 2y +  3z = 3
        3x - 2y + 2z =  5           x - 2y + 3z = 2          2x + 3y +  8z = 4
        5x - 3y -  z = 16          4x -  y + 4z = 1          3x + 2y + 17z = 1

1.50. Solve:

    (a) 2x + 3y = 3        (b)  x + 2y - 3z + 2t = 2      (c)  x + 2y -  z + 3t =  3
         x - 2y = 5            2x + 5y - 8z + 6t = 5          2x + 4y + 4z + 3t =  9
        3x + 2y = 7            3x + 4y - 5z + 2t = 4          3x + 6y -  z + 8t = 10

1.51. Solve:

    (a)  x + 2y + 2z =  2        (b)  x + 5y + 4z - 13t = 3
        2x - 5y + 3z = -4            3x -  y + 2z +  5t = 2
         x + 4y + 6z =  0            2x + 2y + 3z -  4t = 1

HOMOGENEOUS SYSTEMS

1.52. Determine whether each system has a nonzero solution:

    (a)  x + 3y - 2z = 0      (b)  x + 3y - 2z = 0      (c)  x + 2y - 5z + 4t = 0
         x - 8y + 8z = 0          2x - 3y +  z = 0          2x - 3y + 2z + 3t = 0
        3x - 2y + 4z = 0          3x - 2y + 2z = 0          4x - 7y +  z - 6t = 0

1.53. Find the dimension and a basis of the general solution W of each homogeneous system.

    (a)  x +  3y +  2z - s -  t = 0        (b) 2x -  4y + 3z -  s + 2t = 0
        2x +  6y +  5z + s -  t = 0            3x -  6y + 5z - 2s + 4t = 0
        5x + 15y + 12z + s - 3t = 0            5x - 10y + 7z - 3s +  t = 0

ECHELON MATRICES AND ELEMENTARY ROW OPERATIONS

1.54. Reduce A to echelon form and then to its row canonical form, where

    (a), (b): [matrices illegible in this scan]


1.55. Reduce A to echelon form and then to its row canonical form, where

    (a) A = [matrix illegible in this scan]        (b) A = [matrix illegible in this scan]

1.56. Describe all the possible 2 × 2 matrices which are in row reduced echelon form.

1.57. Suppose A is a square row reduced echelon matrix. Show that if A ≠ I, the identity matrix, then A has a zero row.

1.58. Show that each of the following elementary row operations has an inverse operation of the same type.

    [E1] Interchange the ith row and the jth row: Ri ↔ Rj.
    [E2] Multiply the ith row by a nonzero scalar k: kRi → Ri, k ≠ 0.
    [E3] Replace the ith row by k times the jth row plus the ith row: kRj + Ri → Ri.

1.59. Show that row equivalence is an equivalence relation:

    (i) A is row equivalent to A;
    (ii) A row equivalent to B implies B row equivalent to A;
    (iii) A row equivalent to B and B row equivalent to C implies A row equivalent to C.

MISCELLANEOUS PROBLEMS

1.60. Consider two general linear equations in two unknowns x and y over the real field R:

    ax + by = e
    cx + dy = f

Show that:

    (i) If a/c ≠ b/d, i.e., if ad - bc ≠ 0, then the system has the unique solution
        x = (de - bf)/(ad - bc), y = (af - ce)/(ad - bc);

    (ii) If a/c = b/d ≠ e/f, then the system has no solution;

    (iii) If a/c = b/d = e/f, then the system has more than one solution.

1.61. Consider the system

    ax + by = 1
    cx + dy = 0

Show that if ad - bc ≠ 0, then the system has the unique solution x = d/(ad - bc), y = -c/(ad - bc). Also show that if ad - bc = 0, with c ≠ 0 or d ≠ 0, then the system has no solution.

1.62. Show that an equation of the form 0x1 + 0x2 + ··· + 0xn = 0 may be added to or deleted from a system without affecting the solution set.


1.63. Consider a system of linear equations with the same number of equations as unknowns:

    a11 x1 + a12 x2 + ··· + a1n xn = b1
    a21 x1 + a22 x2 + ··· + a2n xn = b2
    ...................................
    an1 x1 + an2 x2 + ··· + ann xn = bn    (1)

    (i) Suppose the associated homogeneous system has only the zero solution. Show that (1) has a unique solution for every choice of constants bi.

    (ii) Suppose the associated homogeneous system has a nonzero solution. Show that there are constants bi for which (1) does not have a solution. Also show that if (1) has a solution, then it has more than one.

1.64. Suppose that in a homogeneous system of linear equations the coefficients of one of the unknowns are all zero. Show that the system has a nonzero solution.

Answers to Supplementary Problems

1.48. (a) x = 2, y = -1;    (b) x = 5 - 2a, y = a;    (c) no solution

1.49. (a) (1, -3, -2);    (b) no solution;    (c) (-1 - 7a, 2 + 2a, a), or x = -1 - 7z, y = 2 + 2z

1.50. (a) x = 3, y = -1;
      (b) (-a + 2b, 1 + 2a - 2b, a, b), or x = -z + 2t, y = 1 + 2z - 2t;
      (c) (7/2 - 2a - 5b/2, a, 1/2 + b/2, b), or x = 7/2 - 2y - 5t/2, z = 1/2 + t/2

1.51. (a) (2, 1, -1);    (b) no solution

1.52. (a) yes;    (b) no;    (c) yes, by Theorem 1.8

1.53. (a) Dim W = 3; u1 = (-3, 1, 0, 0, 0), u2 = (7, 0, -3, 1, 0), u3 = (3, 0, -1, 0, 1);
      (b) Dim W = 2; u1 = (2, 1, 0, 0, 0), u2 = (5, 0, -5, -3, 1)

1.54. (a) Echelon form and row canonical form:

          1  2  -1   2  1               1  2  0  0   4/3
          0  0   3  -6  1      and      0  0  1  0     0
          0  0   0  -6  1               0  0  0  1  -1/6

      (b) [illegible in this scan]

1.55. [illegible in this scan]

Chapter 2

Vectors in Rⁿ and Cⁿ, Spatial Vectors

2.1 INTRODUCTION

In various physical applications there appear certain quantities, such as temperature and speed, which possess only "magnitude." These can be represented by real numbers and are called scalars. On the other hand, there are also quantities, such as force and velocity, which possess both "magnitude" and "direction." These quantities can be represented by arrows (having appropriate lengths and directions and emanating from some given reference point O) and are called vectors.

We begin by considering the following operations on vectors.

(i) Addition: The resultant u + v of two vectors u and v is obtained by the so-called parallelogram law, i.e., u + v is the diagonal of the parallelogram formed by u and v, as shown in Fig. 2-1(a).

(ii) Scalar multiplication: The product ku of a vector u by a real number k is obtained by multiplying the magnitude of u by k and retaining the same direction if k > 0 or the opposite direction if k < 0, as shown in Fig. 2-1(b).

Now we assume the reader is familiar with the representation of the points in the plane by ordered pairs of real numbers. If the origin of the axes is chosen at the reference point O above, then every vector is uniquely determined by the coordinates of its endpoint. The relationship between the above operations and endpoints follows.

(i) Addition: If (a, b) and (c, d) are the endpoints of the vectors u and v, then (a + c, b + d) will be the endpoint of u + v, as shown in Fig. 2-2(a).

(ii) Scalar multiplication: If (a, b) is the endpoint of the vector u, then (ka, kb) will be the endpoint of the vector ku, as shown in Fig. 2-2(b).

Mathematically, we identify the vector u with its endpoint (a, b) and write u = (a, b). In addition, we call the ordered pair (a, b) of real numbers a point or vector depending upon its interpretation. We generalize this notion and call an n-tuple (a1, a2, ..., an) of real numbers a vector. However, special notation may be used for spatial vectors in R³ (Section 2.8).

Fig. 2-2

We assume the reader is familiar with the elementary properties of the real number field which we

denote by R.

2.2 VECTORS IN Rⁿ

The set of all n-tuples of real numbers, denoted by Rⁿ, is called n-space. A particular n-tuple in Rⁿ, say

    u = (u1, u2, ..., un)

is called a point or vector; the real numbers ui are called the components (or coordinates) of the vector u. Moreover, when discussing the space Rⁿ we use the term scalar for the elements of R.

Two vectors u and v are equal, written u = v, if they have the same number of components, i.e., belong to the same space, and if corresponding components are equal. The vectors (1, 2, 3) and (2, 3, 1) are not equal, since corresponding components are not equal.

Example 2.1

(a) Consider the following vectors:

    (0, 1)    (1, -3)    (1, 2, √3, 4)    (-5, 1/2, 0, π)

The first two vectors have two components and so are points in R²; the last two vectors have four components and so are points in R⁴.

(b) Suppose (x - y, x + y, z - 1) = (4, 2, 3). Then, by definition of equality of vectors,

    x - y = 4
    x + y = 2
    z - 1 = 3

Solving the above system of equations gives x = 3, y = -1, and z = 4.

Sometimes vectors in n-space are written vertically as columns rather than horizontally as rows, as above. Such vectors are called column vectors.


2.3 VECTOR ADDITION AND SCALAR MULTIPLICATION

Let u and v be vectors in Rⁿ:

    u = (u1, u2, ..., un)    and    v = (v1, v2, ..., vn)

The sum of u and v, written u + v, is the vector obtained by adding corresponding components:

    u + v = (u1 + v1, u2 + v2, ..., un + vn)

The product of the vector u by a real number k, written ku, is the vector obtained by multiplying each component of u by k:

    ku = (ku1, ku2, ..., kun)

Observe that u + v and ku are also vectors in Rⁿ. We also define

    -u = -1u    and    u - v = u + (-v)

The sum of vectors with different numbers of components is not defined.

Basic properties of the vectors in Rⁿ under vector addition and scalar multiplication are described in the following theorem (proved in Problem 2.4). In the theorem, 0 = (0, 0, ..., 0), the zero vector of Rⁿ.

Theorem 2.1: For any vectors u, v, w ∈ Rⁿ and any scalars k, k' ∈ R,

    (i) (u + v) + w = u + (v + w)        (v) k(u + v) = ku + kv
    (ii) u + 0 = u                       (vi) (k + k')u = ku + k'u
    (iii) u + (-u) = 0                   (vii) (kk')u = k(k'u)
    (iv) u + v = v + u                   (viii) 1u = u

Suppose u and v are vectors in Rⁿ for which u = kv for some nonzero scalar k ∈ R. Then u is called a multiple of v; and u is said to be in the same direction as v if k > 0, and in the opposite direction if k < 0.
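Remark (computational): The componentwise definitions translate directly into code. A minimal Python sketch (the helper names are ours, for illustration), with a spot-check of Theorem 2.1(v):

    def add(u, v):
        # sum of two vectors with the same number of components
        return tuple(a + b for a, b in zip(u, v))

    def scale(k, u):
        # product of the vector u by the scalar k
        return tuple(k * a for a in u)

    u, v = (1, -2, 3), (4, 0, -1)
    print(add(u, v))      # (5, -2, 2)
    print(scale(3, u))    # (3, -6, 9)
    print(scale(3, add(u, v)) == add(scale(3, u), scale(3, v)))   # True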

2.4 VECTORS AND LINEAR EQUATIONS

Two important concepts, linear combinations and linear dependence, are closely related to systems of linear equations, as follows.

Linear Combinations

Consider a nonhomogeneous system of m equations in n unknowns:

    a11 x1 + a12 x2 + ··· + a1n xn = b1
    a21 x1 + a22 x2 + ··· + a2n xn = b2
    ...................................
    am1 x1 + am2 x2 + ··· + amn xn = bm

This system is equivalent to the following vector equation:

    x1 (a11, a21, ..., am1) + x2 (a12, a22, ..., am2) + ··· + xn (a1n, a2n, ..., amn) = (b1, b2, ..., bm)

(here the n-tuples denote the columns of coefficients and the column of constants); that is, the vector equation

    x1 u1 + x2 u2 + ··· + xn un = v

where u1, u2, ..., un, v are the above column vectors, respectively.

Now if the above system has a solution, then v is said to be a linear combination of the vectors ui. We state this important concept formally.

Definition: A vector v is a linear combination of vectors u1, u2, ..., un if there exist scalars k1, k2, ..., kn such that

    v = k1 u1 + k2 u2 + ··· + kn un

that is, if the vector equation

    v = x1 u1 + x2 u2 + ··· + xn un

has a solution where the xi are unknown scalars.

The above definition applies to both column vectors and row vectors, although our illustration was in terms of column vectors.

Example 2.2. Suppose

    u1 = (1, 1, 1)    u2 = (1, 1, 0)    u3 = (1, 0, 0)    and    v = (2, 3, -4)

(written as column vectors). Then v is a linear combination of u1, u2, u3, since the vector equation (or system)

    x u1 + y u2 + z u3 = v    or    x + y + z =  2
                                    x + y     =  3
                                    x         = -4

has a solution x = -4, y = 7, z = -1. In other words,

    v = -4u1 + 7u2 - u3
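Remark (computational): The coefficients in Example 2.2 are found by solving the system. A minimal sketch in Python, assuming the sympy library is available:

    from sympy import symbols, linsolve

    x, y, z = symbols('x y z')
    eqs = [x + y + z - 2,    # x + y + z = 2
           x + y - 3,        # x + y     = 3
           x + 4]            # x         = -4
    print(linsolve(eqs, x, y, z))   # {(-4, 7, -1)}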

Linear Dependence

Consider a homogeneous system of m equations in n unknowns:

    a11 x1 + a12 x2 + ··· + a1n xn = 0
    a21 x1 + a22 x2 + ··· + a2n xn = 0
    ...................................
    am1 x1 + am2 x2 + ··· + amn xn = 0

This system is equivalent to the following vector equation:

    x1 u1 + x2 u2 + ··· + xn un = 0

where u1, u2, ..., un are the above column vectors of coefficients, respectively.

Now, if the above homogeneous system has a nonzero solution, the vectors u1, u2, ..., un are said to be linearly dependent; on the other hand, if the equation has only the zero solution, then the vectors are said to be linearly independent. We state this important concept formally.

Definition: Vectors u1, u2, ..., un in Rⁿ are linearly dependent if there exist scalars k1, k2, ..., kn, not all zero, such that

    k1 u1 + k2 u2 + ··· + kn un = 0

that is, if the vector equation

    x1 u1 + x2 u2 + ··· + xn un = 0

has a nonzero solution where the xi are unknown scalars. Otherwise, the vectors are said to be linearly independent.

The above definition applies to both column vectors and row vectors, although our illustration was in terms of column vectors.

Example 2.3

(a) The only solution to

    x(1, 1, 1) + y(1, 1, 0) + z(1, 0, 0) = (0, 0, 0)    or    x + y + z = 0
                                                              x + y     = 0
                                                              x         = 0

is the zero solution x = 0, y = 0, z = 0. Hence the three vectors are linearly independent.

(b) The vector equation (or system of linear equations)

    x(1, 1, 1) + y(2, -1, 3) + z(1, -5, 3) = (0, 0, 0)    or    x + 2y +  z = 0
                                                                x -  y - 5z = 0
                                                                x + 3y + 3z = 0

has a nonzero solution (3, -2, 1), i.e., x = 3, y = -2, z = 1. Thus the three vectors are linearly dependent.

2.5 DOT (SCALAR) PRODUCT

Let u and v be vectors in Rⁿ:

    u = (u1, u2, ..., un)    and    v = (v1, v2, ..., vn)

The dot, scalar, or inner product of u and v, denoted by u · v, is the scalar obtained by multiplying corresponding components and adding the resulting products:

    u · v = u1 v1 + u2 v2 + ··· + un vn

The vectors u and v are said to be orthogonal (or perpendicular) if their dot product is zero, that is, if u · v = 0.

Example 2.4. Let u = (1, -2, 3, -4), v = (6, 7, 1, -2), and w = (5, -4, 5, 7). Then

    u · v = 1·6 + (-2)·7 + 3·1 + (-4)·(-2) = 6 - 14 + 3 + 8 = 3
    u · w = 1·5 + (-2)·(-4) + 3·5 + (-4)·7 = 5 + 8 + 15 - 28 = 0

Thus u and w are orthogonal.
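Remark (computational): A one-line dot product suffices to test orthogonality. A minimal Python sketch with the data of Example 2.4:

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    u = (1, -2, 3, -4)
    v = (6, 7, 1, -2)
    w = (5, -4, 5, 7)
    print(dot(u, v))    # 3
    print(dot(u, w))    # 0, so u and w are orthogonal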

Basic properties of the dot product in Rⁿ (proved in Problem 2.17) follow.

Theorem 2.2: For any vectors u, v, w ∈ Rⁿ and any scalar k ∈ R,

    (i) (u + v) · w = u · w + v · w        (iii) u · v = v · u
    (ii) (ku) · v = k(u · v)               (iv) u · u ≥ 0, and u · u = 0 iff u = 0

Remark: The space Rⁿ with the above operations of vector addition, scalar multiplication, and dot product is usually called Euclidean n-space.

2.6 NORM OF A VECTOR

Let u = (u1, u2, ..., un) be a vector in Rⁿ. The norm (or length) of the vector u, written ||u||, is defined to be the nonnegative square root of u · u:

    ||u|| = √(u · u) = √(u1² + u2² + ··· + un²)

Since u · u ≥ 0, the square root exists. Also, if u ≠ 0, then ||u|| > 0; and ||0|| = 0.

The above definition of the norm of a vector conforms to that of the length of a vector (arrow) in (Euclidean) geometry. Specifically, suppose u is a vector (arrow) in the plane R² with endpoint P(a, b), as shown in Fig. 2-3. Then |a| and |b| are the lengths of the sides of the right triangle formed by u and the horizontal and vertical directions. By the Pythagorean theorem, the length |u| of u is

    |u| = √(a² + b²)

This value is the same as the norm of u defined above.

Fig. 2-3

Example 2.5. Suppose u = (3, -12, -4). To find ||u||, we first find ||u||² = u · u by squaring the components of u and adding:

    ||u||² = 9 + 144 + 16 = 169

Then ||u|| = √169 = 13.

A vector u is a unit vector if ||u|| = 1 or, equivalently, if u · u = 1. Now if v is any nonzero vector, then

    v/||v||

is a unit vector in the same direction as v. (The process of finding v/||v|| is called normalizing v.) For example,

    v/||v|| = (2/√102, -3/√102, 8/√102, -5/√102)

is the unit vector in the direction of the vector v = (2, -3, 8, -5).

We now state a fundamental relationship (proved in Problem 2.22) known as the Cauchy-Schwarz inequality.

Theorem 2.3 (Cauchy-Schwarz): For any vectors u, v in Rⁿ, |u · v| ≤ ||u|| ||v||.

Using the above inequality, we prove (Problem 2.33) the following result, known as the triangle inequality or as Minkowski's inequality.

Theorem 2.4 (Minkowski): For any vectors u, v in Rⁿ, ||u + v|| ≤ ||u|| + ||v||.

Distance, Angles, Projections

Let u = (u1, u2, ..., un) and v = (v1, v2, ..., vn) be vectors in Rⁿ. The distance between u and v, denoted by d(u, v), is defined as

    d(u, v) = ||u - v|| = √((u1 - v1)² + (u2 - v2)² + ··· + (un - vn)²)

We show that this definition corresponds to the usual notion of Euclidean distance in the plane R². Consider u = (a, b) and v = (c, d) in R². As shown in Fig. 2-4, the distance between the points P(a, b) and Q(c, d) is

    d = √((a - c)² + (b - d)²)

On the other hand, by the above definition,

    d(u, v) = ||u - v|| = ||(a - c, b - d)|| = √((a - c)² + (b - d)²)

Both give the same value.

Fig. 2-4

Using the Cauchy-Schwarz inequality, we can define the angle θ between any two nonzero vectors u, v in Rⁿ by

    cos θ = (u · v)/(||u|| ||v||)

Note that if u · v = 0, then θ = 90° (or θ = π/2). This then agrees with our previous definition of orthogonality.

Example 2.6. Suppose u = (1, -2, 3) and v = (3, -5, -7). Then

    d(u, v) = √((1 - 3)² + (-2 + 5)² + (3 + 7)²) = √(4 + 9 + 100) = √113

To find cos θ, where θ is the angle between u and v, we first find

    u · v = 3 + 10 - 21 = -8        ||u||² = 1 + 4 + 9 = 14        ||v||² = 9 + 25 + 49 = 83

Then

    cos θ = (u · v)/(||u|| ||v||) = -8/(√14 √83)

Let u and v ≠ 0 be vectors in Rⁿ. The (vector) projection of u onto v is the vector

    proj(u, v) = ((u · v)/||v||²) v

We show how this definition conforms to the notion of vector projection in physics. Consider the vectors u and v in Fig. 2-5. The (perpendicular) projection of u onto v is the vector u*, of magnitude

    |u*| = |u| cos θ = |u| (u · v)/(|u| |v|) = (u · v)/|v|

To obtain u*, we multiply its magnitude by the unit vector in the direction of v:

    u* = |u*| (v/|v|) = ((u · v)/|v|²) v

This agrees with the above definition of proj(u, v).

Fig. 2-5

Example 2.7. Suppose u = (1, -2, 3) and v = (2, 5, 4). To find proj(u, v), we first find

    u · v = 2 - 10 + 12 = 4    and    ||v||² = 4 + 25 + 16 = 45

Then

    proj(u, v) = ((u · v)/||v||²) v = (4/45)(2, 5, 4) = (8/45, 4/9, 16/45)
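Remark (computational): Norm, angle, and projection all reduce to the dot product. A minimal Python sketch (helper names ours) with the data of Examples 2.6 and 2.7:

    import math

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    def norm(u):
        return math.sqrt(dot(u, u))

    def proj(u, v):
        c = dot(u, v) / dot(v, v)       # (u . v)/||v||^2
        return tuple(c * b for b in v)

    u, v = (1, -2, 3), (2, 5, 4)
    print(proj(u, v))                   # (8/45, 4/9, 16/45), as floats
    w = (3, -5, -7)
    print(dot(u, w) / (norm(u) * norm(w)))   # cos θ = -8/(√14 √83)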

2.7 LOCATED VECTORS, HYPERPLANES, AND LINES IN Rⁿ

This section distinguishes between an n-tuple P(a1, a2, ..., an) = P(ai), viewed as a point in Rⁿ, and an n-tuple v = [c1, c2, ..., cn], viewed as a vector (arrow) from the origin O to the point C(c1, c2, ..., cn).

Any pair of points P = (ai) and Q = (bi) in Rⁿ defines the located vector or directed line segment from P to Q, written PQ. We identify PQ with the vector

    v = Q - P = [b1 - a1, b2 - a2, ..., bn - an]

since PQ and v have the same magnitude and direction, as shown in Fig. 2-6.

Fig. 2-6

A hyperplane H in Rⁿ is the set of points (x1, x2, ..., xn) that satisfy a nondegenerate linear equation

    a1 x1 + a2 x2 + ··· + an xn = b

In particular, a hyperplane H in R² is a line, and a hyperplane H in R³ is a plane. The vector u = [a1, a2, ..., an] ≠ 0 is called a normal to H. This terminology is justified by the fact (Problem 2.33) that any directed line segment PQ, where P, Q belong to H, is orthogonal to the normal vector u. This fact in R³ is shown in Fig. 2-7.

Fig. 2-7

The line L in Rⁿ passing through the point P(a1, a2, ..., an) and in the direction of the nonzero vector u = [u1, u2, ..., un] consists of the points X(x1, x2, ..., xn) which satisfy

    X = P + tu    or    x1 = a1 + u1 t,  x2 = a2 + u2 t,  ...,  xn = an + un t

where the parameter t takes on all real numbers. (See Fig. 2-8.)

Fig. 2-8

Example 2.8

(a) Consider the hyperplane H in R⁴ which passes through the point P(1, 3, -4, 2) and is normal to the vector u = [4, -2, 5, 6]. Its equation must be of the form

    4x - 2y + 5z + 6t = k

Substituting P into this equation, we obtain

    4(1) - 2(3) + 5(-4) + 6(2) = k    or    4 - 6 - 20 + 12 = k    or    k = -10

Thus 4x - 2y + 5z + 6t = -10 is the equation of H.

(b) Consider the line L in R⁴ passing through the point P(1, 2, 3, -4) in the direction of u = [5, 6, -7, 8]. A parametric representation of L is as follows:

    x1 =  1 + 5t
    x2 =  2 + 6t        or        L(t) = (1 + 5t, 2 + 6t, 3 - 7t, -4 + 8t)
    x3 =  3 - 7t
    x4 = -4 + 8t

Note that t = 0 yields the point P on L.

Curves in Rⁿ

Let D be an interval (finite or infinite) on the real line R. A continuous function F: D → Rⁿ is a curve in Rⁿ. Thus to each t ∈ D there is assigned the following point (vector) in Rⁿ:

    F(t) = [F1(t), F2(t), ..., Fn(t)]

Moreover, the derivative (if it exists) of F(t) yields the vector

    V(t) = dF(t)/dt = [dF1(t)/dt, dF2(t)/dt, ..., dFn(t)/dt]

which is tangent to the curve. Normalizing V(t) yields

    T(t) = V(t)/||V(t)||

which is the unit tangent vector to the curve. [Unit vectors with geometrical significance are often notated in bold type; compare Section 2.8.]

Example 2.9. Consider the following curve C in R³:

    F(t) = [sin t, cos t, t]

Taking the derivative of F(t) [or of each component of F(t)] yields

    V(t) = [cos t, -sin t, 1]

which is a vector tangent to the curve. We normalize V(t). First we obtain

    ||V(t)||² = cos² t + sin² t + 1 = 1 + 1 = 2

Then

    T(t) = V(t)/||V(t)|| = [cos t/√2, -sin t/√2, 1/√2]

which is the unit tangent vector to the curve.


2.8 SPATIAL VECTORS, ijk NOTATION IN R³

Vectors in R³, called spatial vectors, appear in many applications, especially in physics. In fact, a special notation is frequently used for such vectors, as follows:

    i = (1, 0, 0) denotes the unit vector in the x direction,
    j = (0, 1, 0) denotes the unit vector in the y direction,
    k = (0, 0, 1) denotes the unit vector in the z direction.

Then any vector u = (a, b, c) in R³ can be expressed uniquely in the form

    u = (a, b, c) = ai + bj + ck

Since i, j, k are unit vectors and are mutually orthogonal, we have

    i·i = 1,  j·j = 1,  k·k = 1    and    i·j = 0,  i·k = 0,  j·k = 0

The various vector operations discussed previously may be expressed in the above notation as follows. Suppose u = a1i + a2j + a3k and v = b1i + b2j + b3k. Then

    u + v = (a1 + b1)i + (a2 + b2)j + (a3 + b3)k
    cu = ca1i + ca2j + ca3k
    u · v = a1b1 + a2b2 + a3b3
    ||u|| = √(u · u) = √(a1² + a2² + a3²)

where c is a scalar.

Example 2.10. Suppose u = 3i + 5j - 2k and v = 4i - 3j + 7k.

(a) To find u + v, add corresponding components, yielding u + v = 7i + 2j + 5k.

(b) To find 3u - 2v, first multiply the vectors by the scalars, and then add:

    3u - 2v = (9i + 15j - 6k) + (-8i + 6j - 14k) = i + 21j - 20k

(c) To find u · v, multiply corresponding components and then add:

    u · v = 12 - 15 - 14 = -17

(d) To find ||u||, square each component and then add to get ||u||²; that is,

    ||u||² = 9 + 25 + 4 = 38    and hence    ||u|| = √38

Cross Product

There is a special operation for vectors u, v in R³, called the cross product and denoted by u × v. Specifically, suppose

    u = a1i + a2j + a3k    and    v = b1i + b2j + b3k

Then

    u × v = (a2b3 - a3b2)i + (a3b1 - a1b3)j + (a1b2 - a2b1)k

Note that u × v is a vector; hence u × v is also called the vector product or outer product of u and v.

Using determinant notation (Chapter 7), where |a b; c d| = ad - bc, the cross product may also be expressed as follows:

    u × v = |a2  a3| i  -  |a1  a3| j  +  |a1  a2| k
            |b2  b3|       |b1  b3|       |b1  b2|

or, equivalently, as the expansion along the first row of the formal determinant

    u × v = | i   j   k  |
            | a1  a2  a3 |
            | b1  b2  b3 |

Two important properties of the cross product follow (see Problem 2.56).

Theorem 2.5: Let u, v, w be vectors in R³.

(i) The vector u × v is orthogonal to both u and v.

(ii) The absolute value of the "triple product" u · v × w represents the volume of the parallelepiped formed by the vectors u, v, and w (as shown in Fig. 2-9).

Fig. 2-9

Example 2.11

(a) Suppose u = 4i + 3j + 6k and v = 2i + 5j - 3k. Then

    u × v =  | 3   6 | i  -  | 4   6 | j  +  | 4  3 | k  =  -39i + 24j + 14k
             | 5  -3 |       | 2  -3 |       | 2  5 |

(b) (2, -1, 5) × (3, 7, 6) = (-6 - 35, 15 - 12, 14 + 3) = (-41, 3, 17)

(Here we find the cross product without using the ijk notation.)

(c) The cross products of the vectors i, j, k are as follows:

    i × j = k,  j × k = i,  k × i = j    and    j × i = -k,  k × j = -i,  i × k = -j

In other words, if we view the triple (i, j, k) as a cyclic permutation, i.e., as arranged around a circle in the counterclockwise direction as in Fig. 2-10, then the product of two of them in the given direction is the third one, but the product of two of them in the opposite direction is the negative of the third one.

Fig. 2-10
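The componentwise formula and Theorem 2.5(i) are easy to test numerically. The following Python sketch (ours; cross and dot are our own helper names) computes u × v for Example 2.11(a) and confirms orthogonality:

    def cross(u, v):
        """Cross product of 3-vectors via the componentwise formula."""
        a1, a2, a3 = u
        b1, b2, b3 = v
        return (a2*b3 - a3*b2, a3*b1 - a1*b3, a1*b2 - a2*b1)

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    u = (4, 3, 6)
    v = (2, 5, -3)
    w = cross(u, v)
    print(w)                     # (-39, 24, 14), as in Example 2.11(a)
    print(dot(u, w), dot(v, w))  # 0 0 -- u x v is orthogonal to both u and v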


2.9 COMPLEX NUMBERS

The set of complex numbers is denoted by C. Formally, a complex number is an ordered pair (a, b) of real numbers; equality, addition, and multiplication of complex numbers are defined as follows:

(a, b) = (c, d)  iff  a = c and b = d

(a, b) + (c, d) = (a + c, b + d)

(a, b)(c, d) = (ac - bd, ad + bc)

We identify the real number a with the complex number (a, 0):

a ↔ (a, 0)

This is possible since the operations of addition and multiplication of real numbers are preserved under the correspondence:

(a, 0) + (b, 0) = (a + b, 0)    and    (a, 0)(b, 0) = (ab, 0)

Thus we view R as a subset of C and replace (a, 0) by a whenever convenient and possible.

The complex number (0, 1), denoted by i, has the important property that

i² = ii = (0, 1)(0, 1) = (-1, 0) = -1    or    i = √-1

Furthermore, using the fact

(a, b) = (a, 0) + (0, b)    and    (0, b) = (b, 0)(0, 1)

we have

(a, b) = (a, 0) + (b, 0)(0, 1) = a + bi

The notation z = a + bi, where a = Re z and b = Im z are called, respectively, the real and imaginary parts of the complex number z, is more convenient than (a, b). For example, the sum and product of two complex numbers z = a + bi and w = c + di can be obtained by simply using the commutative and distributive laws and i² = -1:

z + w = (a + bi) + (c + di) = a + c + bi + di = (a + c) + (b + d)i

zw = (a + bi)(c + di) = ac + bci + adi + bdi² = (ac - bd) + (bc + ad)i

Warning: The above use of the letter i for √-1 has no relationship whatsoever to the vector notation i = (1, 0, 0) introduced in Section 2.8.

The conjugate of the complex number z = (a, b) = a + bi is denoted and defined by

z̄ = a - bi

Then zz̄ = (a + bi)(a - bi) = a² - b²i² = a² + b². If, in addition, z ≠ 0, then the inverse z⁻¹ of z and division of w by z are given, respectively, by

z⁻¹ = z̄/(zz̄) = a/(a² + b²) - [b/(a² + b²)]i    and    w/z = wz̄/(zz̄)

where w ∈ C. We also define

-z = -1z    and    w - z = w + (-z)

Just as the real numbers can be represented by the points on a line, the complex numbers can be represented by the points in the plane. Specifically, we let the point (a, b) in the plane represent the complex number z = a + bi, i.e., the number whose real part is a and whose imaginary part is b. (See Fig. 2-11.) The absolute value of z, written |z|, is defined as the distance from z to the origin:

|z| = √(a² + b²)

Note that |z| is equal to the norm of the vector (a, b). Also, |z| = √(zz̄).

z = a + bi

Fig. 2-11

Example 2.12. Suppose z = 2 + 3i and w = 5 - 2i. Then

z + w = (2 + 3i) + (5 - 2i) = 2 + 5 + 3i - 2i = 7 + i

zw = (2 + 3i)(5 - 2i) = 10 + 15i - 4i - 6i² = 16 + 11i

z̄ = 2 - 3i    and    w̄ = 5 + 2i

w/z = (5 - 2i)/(2 + 3i) = (5 - 2i)(2 - 3i)/((2 + 3i)(2 - 3i)) = (4 - 19i)/13 = 4/13 - (19/13)i

|z| = √(4 + 9) = √13    and    |w| = √(25 + 4) = √29

Remark: We emphasize that the set C of complex numbers with the above operations of addition and multiplication is a field, like R.
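Python's built-in complex type follows exactly these rules, with i written as 1j. A small sketch (ours) reproducing the computations of Example 2.12:

    z = 2 + 3j
    w = 5 - 2j

    print(z + w)            # (7+1j)
    print(z * w)            # (16+11j)
    print(z.conjugate())    # (2-3j)
    print(w / z)            # (4-19j)/13 = (0.3077...-1.4615...j)
    print(abs(z), abs(w))   # sqrt(13), sqrt(29)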

2.10 VECTORS IN Cⁿ

The set of all n-tuples of complex numbers, denoted by Cⁿ, is called complex n-space. Just as in the real case, the elements of Cⁿ are called points or vectors, the elements of C are called scalars, and vector addition in Cⁿ and scalar multiplication on Cⁿ are given by

(z1, z2, ..., zn) + (w1, w2, ..., wn) = (z1 + w1, z2 + w2, ..., zn + wn)

z(z1, z2, ..., zn) = (zz1, zz2, ..., zzn)

where zi, wi, z ∈ C.

Example 2.13

(a) (2 + 3i, 4 - i, 3) + (3 - 2i, 5i, 4 - 6i) = (5 + i, 4 + 4i, 7 - 6i)

(b) 2i(2 + 3i, 4 - i, 3) = (-6 + 4i, 2 + 8i, 6i)

Now let u and v be arbitrary vectors in Cⁿ:

u = (z1, z2, ..., zn)    v = (w1, w2, ..., wn)    zi, wi ∈ C

The dot, or inner, product of u and v is defined as follows:

u·v = z1w̄1 + z2w̄2 + ... + znw̄n


Note that this definition reduces to the previous one in the real case, since w̄i = wi when wi is real. The norm of u is defined by

||u|| = √(u·u) = √(z1z̄1 + z2z̄2 + ... + znz̄n) = √(|z1|² + |z2|² + ... + |zn|²)

Observe that u·u, and so ||u||, are real and positive when u ≠ 0, and 0 when u = 0.

Example 2.14. Let u = (2 + 3i, 4 - i, 2i) and v = (3 - 2i, 5, 4 - 6i). Then [conjugating the components of v]

u·v = (2 + 3i)(3 + 2i) + (4 - i)(5) + (2i)(4 + 6i) = 13i + 20 - 5i - 12 + 8i = 8 + 16i

u·u = (2 + 3i)(2 - 3i) + (4 - i)(4 + i) + (2i)(-2i) = 13 + 17 + 4 = 34

The space Cⁿ with the above operations of vector addition, scalar multiplication, and dot product is called complex Euclidean n-space.

Remark: If u·v were defined by u·v = z1w1 + ... + znwn, then it would be possible for u·u = 0 even though u ≠ 0, e.g., if u = (1, i, 0). In fact, u·u might not even be real.
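The conjugation in the definition is exactly what makes u·u real and nonnegative. A minimal Python sketch (ours; the helper names cdot and cnorm are our own), checked against Example 2.14:

    import math

    def cdot(u, v):
        """Dot product on C^n: conjugate the components of the SECOND vector."""
        return sum(z * w.conjugate() for z, w in zip(u, v))

    def cnorm(u):
        return math.sqrt(cdot(u, u).real)   # cdot(u, u) is real and >= 0

    u = (2 + 3j, 4 - 1j, 2j)
    v = (3 - 2j, 5 + 0j, 4 - 6j)
    print(cdot(u, v))   # (8+16j), as in Example 2.14
    print(cdot(u, u))   # (34+0j), so ||u|| = sqrt(34)
    print(cnorm(u))     # 5.8309...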

Solved Problems

VECTORS IN Rⁿ

2.1. Let u = (2, -7, 1), v = (-3, 0, 4), w = (0, 5, -8). Find (a) 3u - 4v, (b) 2u + 3v - 5w.

First perform the scalar multiplication and then the vector addition.

(a) 3u - 4v = 3(2, -7, 1) - 4(-3, 0, 4) = (6, -21, 3) + (12, 0, -16) = (18, -21, -13)

(b) 2u + 3v - 5w = 2(2, -7, 1) + 3(-3, 0, 4) - 5(0, 5, -8) = (4, -14, 2) + (-9, 0, 12) + (0, -25, 40)
    = (4 - 9 + 0, -14 + 0 - 25, 2 + 12 + 40) = (-5, -39, 54)

2.2. Compute:

First perform the scalar multiplication and then the vector addition.


2.3. Find x and y if (a) (x, 3) = (2, x + y); (b) (4, y) = x(2, 3).

(a) Since the two vectors are equal, the corresponding components are equal to each other:

x = 2        3 = x + y

Substitute x = 2 into the second equation to obtain y = 1. Thus x = 2 and y = 1.

(b) Multiply by the scalar x to obtain (4, y) = x(2, 3) = (2x, 3x). Set the corresponding components equal to each other:

4 = 2x        y = 3x

Solve the linear equations for x and y: x = 2 and y = 6.

2.4. Prove Theorem 2.1.

Let ui, vi, and wi be the ith components of u, v, and w, respectively.

(i) By definition, ui + vi is the ith component of u + v and so (ui + vi) + wi is the ith component of (u + v) + w. On the other hand, vi + wi is the ith component of v + w and so ui + (vi + wi) is the ith component of u + (v + w). But ui, vi, and wi are real numbers for which the associative law holds, that is,

(ui + vi) + wi = ui + (vi + wi)    for i = 1, ..., n

Accordingly, (u + v) + w = u + (v + w) since their corresponding components are equal.

(ii) Here, 0 = (0, 0, ..., 0); hence

u + 0 = (u1, u2, ..., un) + (0, 0, ..., 0) = (u1 + 0, u2 + 0, ..., un + 0) = (u1, u2, ..., un) = u

(iii) Since -u = -1(u1, u2, ..., un) = (-u1, -u2, ..., -un),

u + (-u) = (u1, u2, ..., un) + (-u1, -u2, ..., -un) = (u1 - u1, u2 - u2, ..., un - un) = (0, 0, ..., 0) = 0

(iv) By definition, ui + vi is the ith component of u + v, and vi + ui is the ith component of v + u. But ui and vi are real numbers for which the commutative law holds, that is,

ui + vi = vi + ui    i = 1, ..., n

Hence u + v = v + u since their corresponding components are equal.

(v) Since ui + vi is the ith component of u + v, k(ui + vi) is the ith component of k(u + v). Since kui and kvi are the ith components of ku and kv, respectively, kui + kvi is the ith component of ku + kv. But k, ui, and vi are real numbers; hence

k(ui + vi) = kui + kvi    i = 1, ..., n

Thus k(u + v) = ku + kv, as corresponding components are equal.

(vi) Observe that the first plus sign refers to the addition of the two scalars k and k′ whereas the second plus sign refers to the vector addition of the two vectors ku and k′u.

By definition, (k + k′)ui is the ith component of the vector (k + k′)u. Since kui and k′ui are the ith components of ku and k′u, respectively, kui + k′ui is the ith component of ku + k′u. But k, k′, and ui are real numbers; hence

(k + k′)ui = kui + k′ui    i = 1, ..., n

Thus (k + k′)u = ku + k′u, as corresponding components are equal.

(vii) Since k′ui is the ith component of k′u, k(k′ui) is the ith component of k(k′u). But (kk′)ui is the ith component of (kk′)u and, since k, k′, and ui are real numbers,

(kk′)ui = k(k′ui)    i = 1, ..., n

Hence (kk′)u = k(k′u), as corresponding components are equal.

(viii) 1·u = 1(u1, u2, ..., un) = (1u1, 1u2, ..., 1un) = (u1, u2, ..., un) = u.


VECTORS AND LINEAR EQUATIONS

2.5. Convert the following vector equation to an equivalent system of linear equations and solve:

      [1]     [2]     [3]     [ 1]
    x [2] + y [5] + z [2]  =  [-6]
      [3]     [8]     [3]     [ 5]

Multiply the vectors by the unknown scalars and then add:

    [ x + 2y + 3z ]   [ 1]
    [ 2x + 5y + 2z] = [-6]
    [ 3x + 8y + 3z]   [ 5]

Set corresponding components of the vectors equal to each other, and reduce the system to echelon form:

x + 2y + 3z = 1        x + 2y + 3z = 1        x + 2y + 3z = 1
2x + 5y + 2z = -6          y - 4z = -8            y - 4z = -8
3x + 8y + 3z = 5          2y - 6z = 2                2z = 18

The system is triangular, and back-substitution yields the unique solution x = -82, y = 28, z = 9.

2.6. Write the vector v = (1, -2, 5) as a linear combination of the vectors u1 = (1, 1, 1), u2 = (1, 2, 3), and u3 = (2, -1, 1).

We want to express v in the form v = xu1 + yu2 + zu3 with x, y, and z as yet unknown. Thus we have

    [ 1]     [1]     [1]     [ 2]
    [-2] = x [1] + y [2] + z [-1]
    [ 5]     [1]     [3]     [ 1]

(It is more convenient to write the vectors as columns than as rows when forming linear combinations.) Setting corresponding components equal to each other we obtain:

x + y + 2z = 1          x + y + 2z = 1          x + y + 2z = 1
x + 2y - z = -2    or       y - 3z = -3    or       y - 3z = -3
x + 3y + z = 5             2y - z = 4                  5z = 10

The unique solution of the triangular form is x = -6, y = 3, z = 2; thus v = -6u1 + 3u2 + 2u3.

2.7. Write the vector v = (2, 3, -5) as a linear combination of u1 = (1, 2, -3), u2 = (2, -1, -4), and u3 = (1, 7, -5).

Find the equivalent system of equations and then solve. First:

    [ 2]     [ 1]     [ 2]     [ 1]     [  x + 2y +  z ]
    [ 3] = x [ 2] + y [-1] + z [ 7]  =  [ 2x -  y + 7z ]
    [-5]     [-3]     [-4]     [-5]     [-3x - 4y - 5z ]

Setting corresponding components equal to each other we obtain

x + 2y + z = 2            x + 2y + z = 2           x + 2y + z = 2
2x - y + 7z = 3     or      -5y + 5z = -1    or      -5y + 5z = -1
-3x - 4y - 5z = -5           2y - 2z = 1                    0 = 3

The third equation, 0 = 3, indicates that the system has no solution. Thus v cannot be written as a linear combination of the vectors u1, u2, and u3.


2.8. Determine whether the vectors u1 = (1, 1, 1), u2 = (2, -1, 3), and u3 = (1, -5, 3) are linearly dependent or linearly independent.

Recall that u1, u2, u3 are linearly dependent or linearly independent according as the vector equation xu1 + yu2 + zu3 = 0 has a nonzero solution or only the zero solution. Thus first set a linear combination of the vectors equal to the zero vector:

    [0]     [1]     [ 2]     [ 1]     [ x + 2y +  z ]
    [0] = x [1] + y [-1] + z [-5]  =  [ x -  y - 5z ]
    [0]     [1]     [ 3]     [ 3]     [ x + 3y + 3z ]

Set corresponding components equal to each other, and reduce the system to echelon form:

x + 2y + z = 0          x + 2y + z = 0          x + 2y + z = 0
x - y - 5z = 0    or      -3y - 6z = 0    or        y + 2z = 0
x + 3y + 3z = 0             y + 2z = 0

The system in echelon form has a free variable; hence the system has a nonzero solution. Thus the original vectors are linearly dependent. (We do not need to solve the system to determine linear dependence or independence; we only need to know if a nonzero solution exists.)

2.9. Determine whether or not the vectors (1, -2, -3), (2, 3, -1), and (3, 2, 1) are linearly dependent.

Set a linear combination (with coefficients x, y, z) of the vectors equal to the zero vector:

x(1, -2, -3) + y(2, 3, -1) + z(3, 2, 1) = (0, 0, 0)

Set corresponding components equal to each other, and reduce the system to echelon form:

x + 2y + 3z = 0         x + 2y + 3z = 0         x + 2y + 3z = 0         x + 2y + 3z = 0
-2x + 3y + 2z = 0   or      7y + 8z = 0    or       y + 2z = 0    or        y + 2z = 0
-3x - y + z = 0            5y + 10z = 0            7y + 8z = 0                 -6z = 0

The homogeneous system is in triangular form, with no free variables; hence it has only the zero solution. Thus the original vectors are linearly independent.
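The echelon-form test of Problems 2.8 and 2.9 can be mechanized: row-reduce the vectors and count the pivots; fewer pivots than vectors means a free variable and hence dependence. A rough Python sketch (ours; rank and dependent are our own names), using naive floating-point Gaussian elimination:

    def rank(rows):
        """Count pivots after naive Gaussian elimination over floats."""
        m = [list(map(float, r)) for r in rows]
        ncols = len(m[0]) if m else 0
        pivots, col = 0, 0
        while pivots < len(m) and col < ncols:
            # find a row with a nonzero entry in this column
            p = next((r for r in range(pivots, len(m)) if abs(m[r][col]) > 1e-12), None)
            if p is None:
                col += 1
                continue
            m[pivots], m[p] = m[p], m[pivots]
            for r in range(pivots + 1, len(m)):
                f = m[r][col] / m[pivots][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[pivots])]
            pivots += 1
            col += 1
        return pivots

    def dependent(vectors):
        return rank(vectors) < len(vectors)

    print(dependent([(1, 1, 1), (2, -1, 3), (1, -5, 3)]))   # True  (Problem 2.8)
    print(dependent([(1, -2, -3), (2, 3, -1), (3, 2, 1)]))  # False (Problem 2.9)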

2.10. Prove: Any n + 1 or more vectors in Rⁿ are linearly dependent.

Suppose u1, u2, ..., uq are vectors in Rⁿ and q > n. The vector equation

x1u1 + x2u2 + ... + xquq = 0

is equivalent to a homogeneous system of n equations in q > n unknowns. By Theorem 1.9, this system has a nonzero solution. Therefore u1, u2, ..., uq are linearly dependent.

2.11. Show that any set of q vectors that includes the zero vector is linearly dependent.

Denoting the vectors as 0, u2, u3, ..., uq, we have 1(0) + 0u2 + 0u3 + ... + 0uq = 0.

DOT (INNER) PRODUCT, ORTHOGONALITY

2.12. Compute u·v where u = (1, -2, 3, -4) and v = (6, 7, 1, -2).

Multiply corresponding components and add: u·v = (1)(6) + (-2)(7) + (3)(1) + (-4)(-2) = 6 - 14 + 3 + 8 = 3.

2.13. Suppose u = (3, 2, 1), v = (5, -3, 4), w = (1, 6, -7). Find: (a) (u + v)·w, (b) u·w + v·w.

(a) First calculate u + v by adding corresponding components: u + v = (3 + 5, 2 - 3, 1 + 4) = (8, -1, 5). Then compute

(u + v)·w = (8)(1) + (-1)(6) + (5)(-7) = 8 - 6 - 35 = -33

(b) First find u·w = 3 + 12 - 7 = 8 and v·w = 5 - 18 - 28 = -41. Then

u·w + v·w = 8 - 41 = -33

[Note: As expected from Theorem 2.2(i), the two values are equal.]

2.14. Let u = (1, 2, 3, -4), v = (5, -6, 7, 8), and k = 3. Find: (a) k(u·v), (b) (ku)·v, (c) u·(kv).

(a) First find u·v = 5 - 12 + 21 - 32 = -18. Then k(u·v) = 3(-18) = -54.

(b) First find ku = (3(1), 3(2), 3(3), 3(-4)) = (3, 6, 9, -12). Then

(ku)·v = (3)(5) + (6)(-6) + (9)(7) + (-12)(8) = 15 - 36 + 63 - 96 = -54

(c) First find kv = (15, -18, 21, 24). Then

u·(kv) = (1)(15) + (2)(-18) + (3)(21) + (-4)(24) = 15 - 36 + 63 - 96 = -54

2.15. Let u = (5, 4, 1), v = (3, -4, 1), and w = (1, -2, 3). Which pairs of these vectors, if any, are perpendicular?

Find the dot product of each pair of vectors:

u·v = 15 - 16 + 1 = 0        v·w = 3 + 8 + 3 = 14        u·w = 5 - 8 + 3 = 0

Hence vectors u and v and vectors u and w are orthogonal, but vectors v and w are not.

2.16. Determine k so that the vectors u and v are orthogonal where u = (1, k, -3) and v = (2, -5, 4).

Compute u·v, set it equal to 0, and solve for k: u·v = (1)(2) + (k)(-5) + (-3)(4) = 2 - 5k - 12 = 0, or -5k - 10 = 0. Solving, k = -2.

2.17. Prove Theorem 2.2.

Let u = (u1, u2, ..., un), v = (v1, v2, ..., vn), w = (w1, w2, ..., wn).

(i) Since u + v = (u1 + v1, u2 + v2, ..., un + vn),

(u + v)·w = (u1 + v1)w1 + (u2 + v2)w2 + ... + (un + vn)wn
          = (u1w1 + u2w2 + ... + unwn) + (v1w1 + v2w2 + ... + vnwn)
          = u·w + v·w

(ii) Since ku = (ku1, ku2, ..., kun),

(ku)·v = ku1v1 + ku2v2 + ... + kunvn = k(u1v1 + u2v2 + ... + unvn) = k(u·v)

(iii) u·v = u1v1 + u2v2 + ... + unvn = v1u1 + v2u2 + ... + vnun = v·u

(iv) Since ui² is nonnegative for each i, and since the sum of nonnegative real numbers is nonnegative,

u·u = u1² + u2² + ... + un² ≥ 0

Furthermore, u·u = 0 iff ui = 0 for each i, that is, iff u = 0.

NORM (LENGTH) IN Rⁿ

2.18. Find ||w|| if w = (-3, 1, -2, 4, -5).

||w||² = (-3)² + 1² + (-2)² + 4² + (-5)² = 9 + 1 + 4 + 16 + 25 = 55; hence ||w|| = √55.


2.19. Determine k such that ||u|| = √39 where u = (1, k, -2, 5).

||u||² = 1² + k² + (-2)² + 5² = k² + 30. Now solve k² + 30 = 39 and obtain k = 3, -3.

2.20. Normalize w = (4, -2, -3, 8).

First find ||w||² = w·w = 4² + (-2)² + (-3)² + 8² = 16 + 4 + 9 + 64 = 93. Divide each component of w by ||w|| = √93 to obtain

w/||w|| = (4/√93, -2/√93, -3/√93, 8/√93)

2.21. Normalize v = (1/2, 2/3, -1/4).

Note that v and any positive multiple of v will have the same normalized form. Hence, first multiply v by 12 to "clear" fractions: 12v = (6, 8, -3). Then

||12v||² = 36 + 64 + 9 = 109    and    v/||v|| = 12v/||12v|| = (6/√109, 8/√109, -3/√109)

2.22. Prove Theorem 2.3 (Cauchy-Schwarz): |u·v| ≤ ||u|| ||v||.

We shall prove the following stronger statement: |u·v| ≤ Σi |ui vi| ≤ ||u|| ||v||. First, if u = 0 or v = 0, then the inequality reduces to 0 ≤ 0 ≤ 0 and is therefore true. Hence we need only consider the case in which u ≠ 0 and v ≠ 0, i.e., where ||u|| ≠ 0 and ||v|| ≠ 0. Furthermore, because

|u·v| = |Σi ui vi| ≤ Σi |ui vi|

we need only prove the second inequality.

Now, for any real numbers x, y ∈ R, 0 ≤ (x - y)² = x² - 2xy + y² or, equivalently,

2xy ≤ x² + y²     (1)

Set x = |ui|/||u|| and y = |vi|/||v|| in (1) to obtain, for any i,

2(|ui|/||u||)(|vi|/||v||) ≤ |ui|²/||u||² + |vi|²/||v||²     (2)

But, by definition of the norm of a vector, ||u||² = Σi ui² = Σi |ui|² and ||v||² = Σi vi² = Σi |vi|². Thus summing (2) with respect to i and using |ui vi| = |ui| |vi|, we have

2 Σi |ui vi| / (||u|| ||v||) ≤ ||u||²/||u||² + ||v||²/||v||² = 2

that is,

Σi |ui vi| / (||u|| ||v||) ≤ 1

Multiplying both sides by ||u|| ||v||, we obtain the required inequality.

2.23. Prove Theorem 2.4 (Minkowski): ||u + v|| ≤ ||u|| + ||v||.

By the Cauchy-Schwarz inequality (Problem 2.22) and the other properties of the inner product,

||u + v||² = (u + v)·(u + v) = u·u + 2(u·v) + v·v ≤ ||u||² + 2||u|| ||v|| + ||v||² = (||u|| + ||v||)²

Taking the square roots of both sides yields the desired inequality.
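Both inequalities are easy to sanity-check numerically. A small Python sketch (ours) testing them on random vectors:

    import math, random

    def dot(u, v):  return sum(a * b for a, b in zip(u, v))
    def norm(u):    return math.sqrt(dot(u, u))

    random.seed(0)
    for _ in range(1000):
        u = [random.uniform(-5, 5) for _ in range(4)]
        v = [random.uniform(-5, 5) for _ in range(4)]
        # Cauchy-Schwarz: |u.v| <= ||u|| ||v||
        assert abs(dot(u, v)) <= norm(u) * norm(v) + 1e-9
        # Minkowski: ||u + v|| <= ||u|| + ||v||
        assert norm([a + b for a, b in zip(u, v)]) <= norm(u) + norm(v) + 1e-9
    print("both inequalities held on 1000 random pairs")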


2.24. Prove that the norm in Rⁿ satisfies the following laws:

(a) [N1] For any vector u, ||u|| ≥ 0; and ||u|| = 0 iff u = 0.
(b) [N2] For any vector u and any scalar k, ||ku|| = |k| ||u||.
(c) [N3] For any vectors u and v, ||u + v|| ≤ ||u|| + ||v||.

(a) By Theorem 2.2, u·u ≥ 0, and u·u = 0 iff u = 0. Since ||u|| = √(u·u), [N1] follows.

(b) Suppose u = (u1, u2, ..., un) and so ku = (ku1, ku2, ..., kun). Then

||ku||² = (ku1)² + (ku2)² + ... + (kun)² = k²(u1² + u2² + ... + un²) = k²||u||²

Taking square roots gives [N2].

(c) [N3] was proved in Problem 2.23.

2.25. Let u = (1, 2, -2), v = (3, -12, 4), and k = -3. (a) Find ||u||, ||v||, and ||ku||. (b) Verify that ||ku|| = |k| ||u|| and ||u + v|| ≤ ||u|| + ||v||.

(a) ||u|| = √(1 + 4 + 4) = √9 = 3, ||v|| = √(9 + 144 + 16) = √169 = 13, ku = (-3, -6, 6), and ||ku|| = √(9 + 36 + 36) = √81 = 9.

(b) Since |k| = |-3| = 3, we have |k| ||u|| = 3 · 3 = 9 = ||ku||. Also, u + v = (4, -10, 2). Thus

||u + v|| = √(16 + 100 + 4) = √120 ≤ 16 = 3 + 13 = ||u|| + ||v||

DISTANCE, ANGLES, PROJECTIONS

2.26. Find the distance d(u, v) between the vectors u and v where

(a) u = (1, 7), v = (6, -5);
(b) u = (3, -5, 4), v = (6, 2, -1);
(c) u = (5, 3, -2, -4, -1), v = (2, -1, 0, -7, 2).

In each case use the formula d(u, v) = ||u - v|| = √((u1 - v1)² + ... + (un - vn)²).

(a) d(u, v) = √((1 - 6)² + (7 + 5)²) = √(25 + 144) = √169 = 13

(b) d(u, v) = √((3 - 6)² + (-5 - 2)² + (4 + 1)²) = √(9 + 49 + 25) = √83

(c) d(u, v) = √((5 - 2)² + (3 + 1)² + (-2 + 0)² + (-4 + 7)² + (-1 - 2)²) = √47

2.27. Find k such that d(u, v) = 6 where u = (2, k, 1, -4) and v = (3, -1, 6, -3).

First find

[d(u, v)]² = (2 - 3)² + (k + 1)² + (1 - 6)² + (-4 + 3)² = k² + 2k + 28

Now solve k² + 2k + 28 = 6² to obtain k = 2, -4.

2.28. From Problem 2.24, prove that the distance function d(u, v) satisfies the following:

[M1] d(u, v) ≥ 0; and d(u, v) = 0 iff u = v.
[M2] d(u, v) = d(v, u).
[M3] d(u, v) ≤ d(u, w) + d(w, v) (triangle inequality).

[M1] follows directly from [N1]. By [N2],

d(u, v) = ||u - v|| = ||(-1)(v - u)|| = |-1| ||v - u|| = ||v - u|| = d(v, u)

which is [M2]. By [N3],

d(u, v) = ||u - v|| = ||(u - w) + (w - v)|| ≤ ||u - w|| + ||w - v|| = d(u, w) + d(w, v)

which is [M3].

2.29. Find cos θ, where θ is the angle between u = (1, 2, -5) and v = (2, 4, 3).

First find

u·v = 2 + 8 - 15 = -5        ||u||² = 1 + 4 + 25 = 30        ||v||² = 4 + 16 + 9 = 29

Then

cos θ = u·v/(||u|| ||v||) = -5/(√30 √29)

2.30. Find proj(u, v) where u = (1, -3, 4) and v = (3, 4, 7).

First find u·v = 3 - 12 + 28 = 19 and ||v||² = 9 + 16 + 49 = 74. Then

proj(u, v) = (u·v/||v||²)v = (19/74)(3, 4, 7) = (57/74, 76/74, 133/74) = (57/74, 38/37, 133/74)

POINTS, LINES, AND HYPERPLANES

This section distinguishes between an n-tuple P(a1, a2, ..., an) ≡ P(ai) viewed as a point in Rⁿ and an n-tuple v = [c1, c2, ..., cn] viewed as a vector (arrow) from the origin O to the point C(c1, c2, ..., cn).

2.31. Find the vector v that is identified with the directed line segment PQ for the points (a) P(2, 5) and Q(-3, 4) in R², (b) P(1, -2, 4) and Q(6, 0, -3) in R³.

(a) v = Q - P = [-3 - 2, 4 - 5] = [-5, -1]

(b) v = Q - P = [6 - 1, 0 + 2, -3 - 4] = [5, 2, -7]

2.32. Consider the points P(3, k, -2) and Q(5, 3, 4) in R³. Find k so that PQ is orthogonal to the vector u = [4, -3, 2].

First find v = Q - P = [5 - 3, 3 - k, 4 + 2] = [2, 3 - k, 6]. Next compute

u·v = 4(2) - 3(3 - k) + 2(6) = 8 - 9 + 3k + 12 = 3k + 11

Lastly, set u·v = 0 or 3k + 11 = 0, from which k = -11/3.

2.33. Consider the hyperplane H in Rⁿ which is the solution set of the linear equation

a1x1 + a2x2 + ... + anxn = b     (1)

where u = [a1, a2, ..., an] ≠ 0. Show that the directed line segment PQ of any pair of points P, Q ∈ H is orthogonal to the coefficient vector u; the vector u is said to be normal to the hyperplane H.

Let w1 = OP and w2 = OQ; hence v = w2 - w1 = PQ. By (1), u·w1 = b and u·w2 = b. But then

u·v = u·(w2 - w1) = u·w2 - u·w1 = b - b = 0

Thus v = PQ is orthogonal to the normal vector u.


2.34. Find an equation of the hyperplane H in R⁴ which passes through P(3, -2, 1, -4) and is normal to u = [2, 5, -6, -2].

An equation of H is of the form 2x + 5y - 6z - 2w = k since it is normal to u. Substitute P into this equation to obtain k = -2. Thus an equation of H is 2x + 5y - 6z - 2w = -2.

2.35. Find an equation of the plane H in R³ which contains P(1, -5, 2) and is parallel to the plane H′ determined by 3x - 7y + 4z = 5.

H and H′ are parallel if and only if their normals are parallel or antiparallel. Hence an equation of H is of the form 3x - 7y + 4z = k. Substitute P(1, -5, 2) into this equation to obtain k = 46. Thus the required equation is 3x - 7y + 4z = 46.

2.36. Find a parametric representation of the line in R⁴ passing through P(4, -2, 3, 1) in the direction u = [2, 5, -7, 11].

The line L in Rⁿ passing through the point P(ai) and in the direction of the nonzero vector u = [ui] consists of the points X = (xi) which satisfy the equation

X = P + tu    or    xi = ai + ui t  (for i = 1, 2, ..., n)     (1)

where the parameter t takes on all real values. Thus we obtain

x1 = 4 + 2t,  x2 = -2 + 5t,  x3 = 3 - 7t,  x4 = 1 + 11t    or    (4 + 2t, -2 + 5t, 3 - 7t, 1 + 11t)

2.37. Find a parametric equation of the line in R³ passing through the points P(5, 4, -3) and Q(1, -3, 2).

First compute u = PQ = [1 - 5, -3 - 4, 2 - (-3)] = [-4, -7, 5]. Then use Problem 2.36 to obtain

x = 5 - 4t        y = 4 - 7t        z = -3 + 5t

2.38. Give a nonparametric representation for the line of Problem 2.37.

Solve each coordinate equation for t and equate the results:

(x - 5)/(-4) = (y - 4)/(-7) = (z + 3)/5

or the pair of linear equations 7x - 4y = 19 and 5x + 4z = 13.

2.39. Find a parametric equation of the line in R³ perpendicular to the plane 2x - 3y + 7z = 4 and intersecting the plane at the point P(6, 5, 1).

Since the line is perpendicular to the plane, the line must be in the direction of the normal vector u = [2, -3, 7] of the plane. Hence

x = 6 + 2t        y = 5 - 3t        z = 1 + 7t

2.40. Consider the following curve C in R⁴ where 0 ≤ t ≤ 4:

F(t) = (t², 3t - 2, t³, t² + 5)

Find the unit tangent vector T when t = 2.

Take the derivative of (each component of) F(t) to obtain a vector V which is tangent to the curve:

V(t) = dF(t)/dt = (2t, 3, 3t², 2t)

Now find V when t = 2: V = (4, 3, 12, 4). Normalize V to get the unit tangent vector T to the curve when t = 2. We have

||V||² = 16 + 9 + 144 + 16 = 185    or    ||V|| = √185

Thus

T = (4/√185, 3/√185, 12/√185, 4/√185)

2.41. Let T(t) be the unit tangent vector to a curve C in Rⁿ. Show that dT(t)/dt is orthogonal to T(t).

We have T(t)·T(t) = 1. Using the rule for differentiating a dot product, along with d(1)/dt = 0, we have

d[T(t)·T(t)]/dt = T(t)·dT(t)/dt + dT(t)/dt·T(t) = 2T(t)·dT(t)/dt = 0

Thus dT(t)/dt is orthogonal to T(t).

SPATIAL VECTORS (VECTORS IN R³), PLANES, LINES, CURVES, AND SURFACES IN R³

The following formulas will be used in Problems 2.42-2.53.

The equation of a plane through the point P0(x0, y0, z0) with normal direction N = ai + bj + ck is

a(x - x0) + b(y - y0) + c(z - z0) = 0     (2.1)

The parametric equation of a line L through a point P0(x0, y0, z0) in the direction of the vector v = ai + bj + ck is

x = at + x0        y = bt + y0        z = ct + z0

or, equivalently,

f(t) = (at + x0)i + (bt + y0)j + (ct + z0)k     (2.2)

The equation of a normal vector N to a surface F(x, y, z) = 0 is

N = Fx i + Fy j + Fz k     (2.3)
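Equations (2.1) and (2.2) translate directly into code. A minimal Python sketch (ours; plane_through and line_point are our own names), checked against Problem 2.42 below:

    def plane_through(p0, n):
        """Return (a, b, c, d) with ax + by + cz = d, from point p0 and normal n (eq. 2.1)."""
        a, b, c = n
        x0, y0, z0 = p0
        return a, b, c, a * x0 + b * y0 + c * z0

    def line_point(p0, v, t):
        """Point on the line through p0 with direction v at parameter t (eq. 2.2)."""
        return tuple(x0 + t * a for x0, a in zip(p0, v))

    print(plane_through((3, 4, -2), (5, -6, 7)))  # (5, -6, 7, -23): 5x - 6y + 7z = -23
    print(line_point((3, 4, -2), (5, -1, 3), 2))  # (13, 2, 4)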

2.42. Find the equation of the plane with normal direction N = 5i - 6j + 7k and containing the point P(3, 4, -2).

Substitute P and N in equation (2.1) to get

5(x - 3) - 6(y - 4) + 7(z + 2) = 0    or    5x - 6y + 7z = -23

2.43. Find a normal vector N to the plane 4x + 7y - 12z = 3.

The coefficients of x, y, z give a normal direction; hence N = 4i + 7j - 12k. (Any multiple of N also is normal to the plane.)

2.44. Find the plane H parallel to 4x + 7y - 12z = 3 and containing the point P(2, 3, -1).

H and the given plane have the same normal direction; that is, N = 4i + 7j - 12k is normal to H. Substitute P and N in equation (2.1) to get

4(x - 2) + 7(y - 3) - 12(z + 1) = 0    or    4x + 7y - 12z = 41

2.45. Let H and K be, respectively, the planes x + 2y - 4z = 5 and 2x - y + 3z = 7. Find cos θ where θ is the angle between the planes H and K.

The angle θ between H and K is the same as the angle between a normal N to H and a normal N′ to K. We have

N = i + 2j - 4k    and    N′ = 2i - j + 3k

Then

N·N′ = 2 - 2 - 12 = -12        ||N||² = 1 + 4 + 16 = 21        ||N′||² = 4 + 1 + 9 = 14

Thus

cos θ = N·N′/(||N|| ||N′||) = -12/(√21 √14)

2.46. Derive equation (2.1).

Let P(x, y, z) be an arbitrary point in the plane. The vector v from P0 to P is

v = P - P0 = (x - x0)i + (y - y0)j + (z - z0)k

Since v is orthogonal to N = ai + bj + ck (Fig. 2-12), we get the required formula

a(x - x0) + b(y - y0) + c(z - z0) = 0

Fig. 2-12

2.47. Derive equation (2.2).

Let P(x, y, z) be an arbitrary point on the line L. The vector w from P0 to P is

w = P - P0 = (x - x0)i + (y - y0)j + (z - z0)k     (1)

Since w and v have the same direction (Fig. 2-13),

w = tv = t(ai + bj + ck) = ati + btj + ctk     (2)

Equations (1) and (2) give us our result.

Fig. 2-13

2.48. Find the (parametric) equation of the line L through:

(a) the point P(3, 4, -2) and in the direction of v = 5i - j + 3k;
(b) the points P(1, 3, 2) and Q(2, 5, -6).

(a) Substitute in equation (2.2) to get

f(t) = (5t + 3)i + (-t + 4)j + (3t - 2)k

(b) First find the vector v from P to Q: v = Q - P = i + 2j - 8k. Then use (2.2) with v and one of the given points, say P, to get

f(t) = (t + 1)i + (2t + 3)j + (-8t + 2)k

2.49. Let H be the plane 3x + 5y + 7z = 15. Find the equation of the line L perpendicular to H and containing the point P(1, -2, 4).

Since L is perpendicular to the plane H, L must be in the same direction as the normal N = 3i + 5j + 7k to H. Thus use (2.2) with N and P to get

f(t) = (3t + 1)i + (5t - 2)j + (7t + 4)k

2.50. Consider a moving body B whose position at time t is given by R(t) = t³i + 2t²j + 3tk. [Then V(t) = dR(t)/dt denotes the velocity of B and A(t) = dV(t)/dt denotes the acceleration of B.]

(a) Find the position of B when t = 1.        (c) Find the speed s of B when t = 1.
(b) Find the velocity v of B when t = 1.      (d) Find the acceleration a of B when t = 1.

(a) Substitute t = 1 into R(t) to get R(1) = i + 2j + 3k.

(b) Take the derivative of R(t) to get

V(t) = dR(t)/dt = 3t²i + 4tj + 3k

Substitute t = 1 in V(t) to get v = V(1) = 3i + 4j + 3k.

(c) The speed s is the magnitude of the velocity v. Thus

||v||² = 9 + 16 + 9 = 34    and hence    s = √34

(d) Take the second derivative of R(t) or, in other words, the derivative of V(t) to get

A(t) = dV(t)/dt = 6ti + 4j

Substitute t = 1 in A(t) to get a = A(1) = 6i + 4j.

2.51. Consider the surface xy² + 2yz = 16 in R³. Find (a) a normal vector N(x, y, z) to the surface and (b) the tangent plane H to the surface at the point P(1, 2, 3).

(a) Find the partial derivatives Fx, Fy, Fz where F(x, y, z) = xy² + 2yz - 16. We have

Fx = y²        Fy = 2xy + 2z        Fz = 2y

Thus, from equation (2.3), N(x, y, z) = y²i + (2xy + 2z)j + 2yk.

(b) A normal to the surface at the point P is

N(P) = N(1, 2, 3) = 4i + 10j + 4k

Thus N = 2i + 5j + 2k is also a normal vector at P. Substitute P and N into equation (2.1) to get

2(x - 1) + 5(y - 2) + 2(z - 3) = 0    or    2x + 5y + 2z = 18

2.52. Consider the ellipsoid x² + 2y² + 3z² = 15. Find the tangent plane H at the point P(2, 2, 1).

First find a normal vector [from equation (2.3)]:

N(x, y, z) = Fx i + Fy j + Fz k = 2xi + 4yj + 6zk

Evaluate the normal vector N(x, y, z) at P to get

N(P) = N(2, 2, 1) = 4i + 8j + 6k

Thus N = 2i + 4j + 3k is normal to the ellipsoid at P. Substitute P and N into (2.1) to obtain H:

2(x - 2) + 4(y - 2) + 3(z - 1) = 0    or    2x + 4y + 3z = 15

2.53. Consider the equation F(x, y, z) = x² + y² - z = 0, whose solution set z = x² + y² represents a surface S in R³. Find (a) a normal vector N to the surface S when x = 2, y = 3; and (b) the tangent plane H to the surface S when x = 2, y = 3.

(a) Use the fact that, when F(x, y, z) = f(x, y) - z, we have Fx = fx, Fy = fy, and Fz = -1. Then

N = fx i + fy j - k = 2xi + 2yj - k = 4i + 6j - k

(b) If x = 2, y = 3, then z = 4 + 9 = 13; hence P(2, 3, 13) is the point on the surface S. Substitute P and N = 4i + 6j - k into equation (2.1) to obtain H:

4(x - 2) + 6(y - 3) - (z - 13) = 0    or    4x + 6y - z = 13
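Problems 2.51-2.53 all follow one recipe: evaluate the partial derivatives of F at P to get N, then apply (2.1). A small Python sketch (ours; tangent_plane is our own name), with the partials of Problem 2.51 supplied as functions:

    def tangent_plane(partials, p):
        """partials = (Fx, Fy, Fz) as functions; returns (a, b, c, d) of the tangent plane at p."""
        x, y, z = p
        a, b, c = (f(x, y, z) for f in partials)
        return a, b, c, a * x + b * y + c * z

    # Surface xy^2 + 2yz = 16 at P(1, 2, 3), as in Problem 2.51
    Fx = lambda x, y, z: y * y
    Fy = lambda x, y, z: 2 * x * y + 2 * z
    Fz = lambda x, y, z: 2 * y
    print(tangent_plane((Fx, Fy, Fz), (1, 2, 3)))   # (4, 10, 4, 36), i.e., 2x + 5y + 2z = 18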

CROSS PRODUCT

The cross product is defined only for vectors in R³.

2.54. Find u × v where (a) u = (1, 2, 3) and v = (4, 5, 6), (b) u = (7, 3, 1) and v = (1, 1, 1), (c) u = (-4, 12, 2) and v = (6, -18, 3).

The cross product of u = (a1, a2, a3) and v = (b1, b2, b3) may be obtained as follows. Put the vector v = (b1, b2, b3) under the vector u = (a1, a2, a3) to form the array

    ( a1  a2  a3 )
    ( b1  b2  b3 )

Then: cover the first column of the array and take the determinant to obtain the first component of u × v; cover the second column and take the determinant backwards [compute a3b1 - a1b3 rather than -(a1b3 - a3b1)] to obtain the second component; and cover the third column and take the determinant to obtain the third component.

(a) u × v = (2·6 - 3·5, 3·4 - 1·6, 1·5 - 2·4) = (12 - 15, 12 - 6, 5 - 8) = (-3, 6, -3)

(b) u × v = (3·1 - 1·1, 1·1 - 7·1, 7·1 - 3·1) = (2, -6, 4)

(c) u × v = (12·3 - 2·(-18), 2·6 - (-4)·3, (-4)(-18) - 12·6) = (36 + 36, 12 + 12, 72 - 72) = (72, 24, 0)

2.55. Consider the vectors u = 2i - 3j + 4k, v = 3i + j - 2k, w = i + 5j + 3k. Find: (a) u × v, (b) u × w, (c) v × w.

Use the formal determinant

    u × v =  | i   j   k  |  =  | a2  a3 | i  -  | a1  a3 | j  +  | a1  a2 | k
             | a1  a2  a3 |     | b2  b3 |       | b1  b3 |       | b1  b2 |
             | b1  b2  b3 |

where u = a1i + a2j + a3k and v = b1i + b2j + b3k.

(a) u × v = (6 - 4)i + (12 + 4)j + (2 + 9)k = 2i + 16j + 11k

[Remark: Observe that the j component is obtained by taking the determinant backwards.]

(b) u × w = (-9 - 20)i + (4 - 6)j + (10 + 3)k = -29i - 2j + 13k

(c) v × w = (3 + 10)i + (-2 - 9)j + (15 - 1)k = 13i - 11j + 14k

2.56. Prove Theorem 2.5(i): The vector u × v is orthogonal to both u and v.

Suppose u = (a1, a2, a3) and v = (b1, b2, b3). Then

u·(u × v) = a1(a2b3 - a3b2) + a2(a3b1 - a1b3) + a3(a1b2 - a2b1)
          = a1a2b3 - a1a3b2 + a2a3b1 - a1a2b3 + a1a3b2 - a2a3b1 = 0

Thus u × v is orthogonal to u. Similarly, u × v is orthogonal to v.

2.57. Find a unit vector u orthogonal to v = (1, 3, 4) and w = (2, -6, -5).

First find v × w, which is orthogonal to v and w. The array

    ( 1   3   4 )
    ( 2  -6  -5 )

gives

v × w = (-15 + 24, 8 + 5, -6 - 6) = (9, 13, -12)

Now normalize v × w to get u = (9/√394, 13/√394, -12/√394).


2.58. Prove Lagrange's identity: ||u × v||² = (u·u)(v·v) - (u·v)².

If u = (a1, a2, a3) and v = (b1, b2, b3), then

||u × v||² = (a2b3 - a3b2)² + (a3b1 - a1b3)² + (a1b2 - a2b1)²     (1)

(u·u)(v·v) - (u·v)² = (a1² + a2² + a3²)(b1² + b2² + b3²) - (a1b1 + a2b2 + a3b3)²     (2)

Expansion of the right-hand sides of (1) and (2) establishes the identity.

COMPLEX NUMBERS

2.59. Suppose z = 5 + 3i and w = 2 - 4i. Find: (a) z + w, (b) z - w, (c) zw.

Use the ordinary rules of algebra together with i² = -1 to obtain a result in the standard form a + bi.

(a) z + w = (5 + 3i) + (2 - 4i) = 7 - i

(b) z - w = (5 + 3i) - (2 - 4i) = 5 + 3i - 2 + 4i = 3 + 7i

(c) zw = (5 + 3i)(2 - 4i) = 10 - 20i + 6i - 12i² = 22 - 14i

2.60. Simplify: (a) (5 + 3i)(2 - 7i), (b) (4 - 3i)², (c) (1 + 2i)³.

(a) (5 + 3i)(2 - 7i) = 10 + 6i - 35i - 21i² = 31 - 29i

(b) (4 - 3i)² = 16 - 24i + 9i² = 7 - 24i

(c) (1 + 2i)³ = 1 + 6i + 12i² + 8i³ = 1 + 6i - 12 - 8i = -11 - 2i

2.61. Simplify: (a) i⁰, i³, i⁴; (b) i⁵, i⁶, i⁷, i⁸; (c) i³⁹, i¹⁷⁴, i²⁵², i³¹⁷.

(a) i⁰ = 1, i³ = i²(i) = (-1)(i) = -i, i⁴ = (i²)(i²) = (-1)(-1) = 1.

(b) i⁵ = (i⁴)(i) = (1)(i) = i, i⁶ = (i⁴)(i²) = (1)(i²) = i² = -1, i⁷ = i³ = -i, i⁸ = i⁴ = 1.

(c) Using i⁴ = 1 and iⁿ = i^(4q+r) = (i⁴)^q i^r = 1^q i^r = i^r, divide the exponent n by 4 to obtain the remainder r:

i³⁹ = i^(4·9+3) = i³ = -i        i¹⁷⁴ = i² = -1        i²⁵² = i⁰ = 1        i³¹⁷ = i¹ = i

2.62. Find the complex conjugate of each of the following:

(a) 6 + 4i, 7 - 5i, 4 + i, -3 - i;    (b) 6, -3, 4i, -9i.

(a) The conjugates are, respectively, 6 - 4i, 7 + 5i, 4 - i, and -3 + i.

(b) The conjugates are, respectively, 6, -3, -4i, and 9i.

(Note that the conjugate of a real number is the original number, but the conjugate of a pure imaginary number is the negative of the original number.)

2.63. Find zz̄ and |z| when z = 3 + 4i.

For z = a + bi, use zz̄ = a² + b² and |z| = √(zz̄) = √(a² + b²):

zz̄ = 9 + 16 = 25    and    |z| = √25 = 5

2.64. Simplify (2 - 7i)/(5 + 3i).

To simplify a fraction z/w of complex numbers, multiply both numerator and denominator by w̄, the conjugate of the denominator:

(2 - 7i)/(5 + 3i) = (2 - 7i)(5 - 3i)/((5 + 3i)(5 - 3i)) = (-11 - 41i)/34 = -11/34 - (41/34)i


2.65. Prove: For any complex numbers z, w ∈ C, (i) conj(z + w) = z̄ + w̄, (ii) conj(zw) = z̄w̄, (iii) conj(z̄) = z.

Suppose z = a + bi and w = c + di where a, b, c, d ∈ R.

(i) conj(z + w) = conj((a + c) + (b + d)i) = (a + c) - (b + d)i = a + c - bi - di
              = (a - bi) + (c - di) = z̄ + w̄

(ii) conj(zw) = conj((ac - bd) + (ad + bc)i) = (ac - bd) - (ad + bc)i = (a - bi)(c - di) = z̄w̄

(iii) conj(z̄) = conj(a - bi) = a - (-b)i = a + bi = z

2.66. Prove: For any complex numbers z, w ∈ C, |zw| = |z| |w|.

By (ii) of Problem 2.65,

|zw|² = (zw)conj(zw) = (zw)(z̄w̄) = (zz̄)(ww̄) = |z|² |w|²

The square root of both sides gives us the desired result.

2.67. Prove: For any complex numbers z, w ∈ C, |z + w| ≤ |z| + |w|.

Suppose z = a + bi and w = c + di where a, b, c, d ∈ R. Consider the vectors u = (a, b) and v = (c, d) in R². Note that

|z| = √(a² + b²) = ||u||,    |w| = √(c² + d²) = ||v||

and

|z + w| = |(a + c) + (b + d)i| = √((a + c)² + (b + d)²) = ||(a + c, b + d)|| = ||u + v||

By Minkowski's inequality (Problem 2.23), ||u + v|| ≤ ||u|| + ||v||, and so

|z + w| = ||u + v|| ≤ ||u|| + ||v|| = |z| + |w|

2.68. Find the dot products u·v and v·u where (a) u = (1 - 2i, 3 + i), v = (4 + 2i, 5 - 6i), and (b) u = (3 - 2i, 4i, 1 + 6i), v = (5 + i, 2 - 3i, 7 + 2i).

Recall that the conjugates of the components of the second vector appear in the dot product:

(z1, ..., zn)·(w1, ..., wn) = z1w̄1 + ... + znw̄n

(a) u·v = (1 - 2i)(4 - 2i) + (3 + i)(5 + 6i) = -10i + 9 + 23i = 9 + 13i
    v·u = (4 + 2i)(1 + 2i) + (5 - 6i)(3 - i) = 10i + 9 - 23i = 9 - 13i

(b) u·v = (3 - 2i)(5 - i) + (4i)(2 + 3i) + (1 + 6i)(7 - 2i) = 20 + 35i
    v·u = (5 + i)(3 + 2i) + (2 - 3i)(-4i) + (7 + 2i)(1 - 6i) = 20 - 35i

In both cases, v·u = conj(u·v). This holds true in general, as seen in Problem 2.70.


2.69. Let u = (7 - 2i, 2 + 5i) and v = (1 + i, -3 - 6i). Find:

(a) u + v;  (b) 2iu;  (c) (3 - i)v;  (d) u·v;  (e) ||u|| and ||v||.

(a) u + v = (7 - 2i + 1 + i, 2 + 5i - 3 - 6i) = (8 - i, -1 - i)

(b) 2iu = (14i - 4i², 4i + 10i²) = (4 + 14i, -10 + 4i)

(c) (3 - i)v = (3 + 3i - i - i², -9 - 18i + 3i + 6i²) = (4 + 2i, -15 - 15i)

(d) u·v = (7 - 2i)(1 - i) + (2 + 5i)(-3 + 6i) = (5 - 9i) + (-36 - 3i) = -31 - 12i

(e) ||u|| = √(|7 - 2i|² + |2 + 5i|²) = √(53 + 29) = √82    and    ||v|| = √(|1 + i|² + |-3 - 6i|²) = √(2 + 45) = √47

2.70. Prove: For any vectors u, v ∈ Cⁿ and any scalar z ∈ C, (i) u·v = conj(v·u), (ii) (zu)·v = z(u·v), (iii) u·(zv) = z̄(u·v).

Suppose u = (z1, z2, ..., zn) and v = (w1, w2, ..., wn).

(i) Using the properties of the conjugate,

conj(v·u) = conj(w1z̄1 + w2z̄2 + ... + wnz̄n) = w̄1z1 + w̄2z2 + ... + w̄nzn
          = z1w̄1 + z2w̄2 + ... + znw̄n = u·v

(ii) Since zu = (zz1, zz2, ..., zzn),

(zu)·v = zz1w̄1 + zz2w̄2 + ... + zznw̄n = z(z1w̄1 + z2w̄2 + ... + znw̄n) = z(u·v)

(Compare with Theorem 2.2 on vectors in Rⁿ.)

(iii) Method 1. Since zv = (zw1, zw2, ..., zwn),

u·(zv) = z1 conj(zw1) + z2 conj(zw2) + ... + zn conj(zwn)
       = z1z̄w̄1 + z2z̄w̄2 + ... + znz̄w̄n = z̄(z1w̄1 + z2w̄2 + ... + znw̄n) = z̄(u·v)

Method 2. Using (i) and (ii),

u·(zv) = conj((zv)·u) = conj(z(v·u)) = z̄ conj(v·u) = z̄(u·v)

Supplementary Problems

VECTORS IN Rⁿ

2.71. Let u = (2, -1, 0, -3), v = (1, -1, -1, 3), w = (1, 3, -2, 2). Find: (a) 2u - 3v; (b) 5u - 3v - 4w; (c) -u + 2v - 2w; (d) u·v, u·w, and v·w; (e) ||u||, ||v||, and ||w||.

2.72. Determine x and y if: (a) x(3, 2) = 2(y, -1); (b) x(2, y) = y(1, -2).

2.73. Find d(u, v) and proj(u, v) when (a) u = (1, -3), v = (4, 1), and (b) u = (2, -1, 0, 1), v = (1, -1, 1, 2).

LINEAR COMBINATIONS, LINEAR INDEPENDENCE

2.74. Express v as a linear combination of u1, u2, u3.

2.75. Determine whether the following vectors u, v, w are linearly independent and, if not, express one of them as a linear combination of the others.

(a) u = (1, 0, 1), v = (1, 2, 3), w = (3, 2, 5);
(b) u = (1, 0, 1), v = (1, 1, 1), w = (0, 1, 1);
(c) u = (1, 2), v = (1, -1), w = (2, 5);
(d) u = (1, 0, 0, 1), v = (0, 1, 2, 1), w = (1, 2, 3, 4);
(e) u = (1, 0, 0, 1), v = (0, 1, 2, 1), w = (1, 2, 4, 3).

2.76. Prove that Theorem 2.3 holds with equality iff u and v are linearly dependent.

LOCATED VECTORS, HYPERPLANES, LINES, CURVES

2.77. Find the (located) vector v from (a) P(2, 3, -7) to Q(1, -6, -5); (b) P(1, -8, -4, 6) to Q(3, -5, 2, -4).

2.78. Find an equation of the hyperplane in R³ which:

(a) passes through (2, -7, 1) and is normal to (3, 1, -11);
(b) contains (1, -2, 2), (0, 1, 3), and (0, 2, -1);
(c) contains (1, -5, 2) and is parallel to 3x - 7y + 4z = 5.

2.79. Find a parametric representation of the line which:

(a) passes through (7, -1, 8) in the direction of (1, 3, -5);
(b) passes through (1, 9, -4, 5) and (2, -3, 0, 4);
(c) passes through (4, -1, 9) and is perpendicular to the plane 3x - 2y + z = 18.

SPATIAL VECTORS (VECTORS IN R³); PLANES, LINES, CURVES, AND SURFACES IN R³

2.80. Find the equation of the plane H:

(a) with normal N = 3i - 4j + 5k and containing the point P(1, 2, -3);
(b) parallel to 4x + 3y - 2z = 11 and containing the point P(1, 2, -3).

2.81. Find a unit vector u which is normal to the plane:

(a) 3x - 4y - 12z = 11;  (b) 2x - y - 2z = 7.

2.82. Find cos θ where θ is the angle between the planes:

(a) 3x - 2y - 4z = 5 and x + 2y - 6z = 4;
(b) 2x + 5y - 4z = 1 and 4x + 3y + 2z = 1.

2.83. Find the (parametric) equation of the line L:

(a) through the point P(2, 5, -3) and in the direction of v = 4i - 5j + 7k;
(b) through the points P(1, 2, -4) and Q(3, -7, 2);
(c) perpendicular to the plane 2x - 3y + 7z = 4 and containing P(1, -5, 7).


2.84. Consider the following curve where 0 ≤ t ≤ 5:

F(t) = t³i - t²j + (2t - 3)k

(a) Find F(t) when t = 2.
(b) Find the endpoints of the curve.
(c) Find the unit tangent vector T to the curve when t = 2.

2.85. Consider the curve F(t) = (cos t)i + (sin t)j + tk.

(a) Find the unit tangent vector T(t) to the curve.
(b) Find the unit normal vector N(t) to the curve by normalizing U(t) = dT(t)/dt.
(c) Find the unit binormal vector B(t) to the curve using B = T × N.

2.86. Consider a moving body B whose position at time t is given by R(t) = t²i + t³j + 2tk. [Then V(t) = dR(t)/dt denotes the velocity of B and A(t) = dV(t)/dt denotes the acceleration of B.]

(a) Find the position of B when t = 1.
(b) Find the velocity v of B when t = 1.
(c) Find the speed s of B when t = 1.
(d) Find the acceleration a of B when t = 1.

2.87. Find a normal vector N and the tangent plane H to the surface at the given point:

(a) surface x²y + 3yz = 20 and point P(1, 3, 2);
(b) surface x² + 3y² - 5z² = 16 and point P(3, -2, 1).

2.88. Given the surface z = f(x, y) = x² + 2xy, find a normal vector N and the tangent plane H when x = 3, y = 1.

CROSS PRODUCT

The cross product is defined only for vectors in R³.

2.89. Given u = 3i - 4j + 2k, v = 2i + 5j - 3k, w = 4i + 7j + 2k, find: (a) u × v, (b) u × w, (c) v × w, (d) v × u.

2.90. Find a unit vector w orthogonal to (a) u = (1, 2, 3) and v = (1, -1, 2); (b) u = 3i - j + 2k and v = 4i - 2j - k.

2.91. Prove the following properties of the cross product:

(a) u × v = -(v × u)
(b) u × u = 0 for any vector u
(c) (ku) × v = k(u × v) = u × (kv)
(d) u × (v + w) = (u × v) + (u × w)
(e) (v + w) × u = (v × u) + (w × u)
(f) (u × v) × w = (u·w)v - (v·w)u

COMPLEX NUMBERS

2.92. …

2.93. Simplify: (a) …; (b) …; (c) i¹⁵, i²⁵, i³⁴; (d) ….


2.94. Let z = 2 - 5i and w = 7 + 3i. Find: (a) z + w; (b) zw; (c) z/w; (d) z̄, w̄; (e) |z|, |w|.

2.95. Let z = 2 + i and w = 6 - 5i. Find: (a) z/w; (b) z̄, w̄; (c) |z|, |w|.

2.96. Show that (a) Re z = (z + z̄)/2; (b) Im z = (z - z̄)/2i.

2.97. Show that zw = 0 implies z = 0 or w = 0.

VECTORS IN Cⁿ

2.98. Prove: For any vectors u, v, w ∈ Cⁿ:

(i) (u + v)·w = u·w + v·w;  (ii) w·(u + v) = w·u + w·v.

2.99. Prove that the norm in Cⁿ satisfies the following laws:

[N1] For any vector u, ||u|| ≥ 0; and ||u|| = 0 iff u = 0.
[N2] For any vector u and any complex number z, ||zu|| = |z| ||u||.
[N3] For any vectors u and v, ||u + v|| ≤ ||u|| + ||v||.

Answers to Supplementary Problems

2.71. (a) 2u - 3v = (1, 1, 3, -15); (b) 5u - 3v - 4w = (3, -14, 11, -32); (c) -u + 2v - 2w = (-2, -7, 2, 5); (d) u·v = -6, u·w = -7, v·w = 6; (e) ||u|| = √14, ||v|| = 2√3, ||w|| = 3√2

2.72. (a) x = -1, y = -3/2; (b) x = 0, y = 0 or x = -2, y = -4

2.73. (a) d = 5, proj(u, v) = (4/17, 1/17); (b) d = √3, proj(u, v) = (5/7, -5/7, 5/7, 10/7)

2.74. (a) …; (b) v = u2 + 2u3; (c) v = [(a - 2b + c)/2]u1 + (a + b - c)u2 + [(c - a)/2]u3

2.75. (a) dependent; (b) independent; (c) dependent; (d) independent; (e) dependent.

2.77. (a) v = (-1, -9, 2); (b) v = (2, 3, 6, -10)

2.78. (a) 3x + y - 11z = -12; (b) 13x + 4y + z = 7; (c) 3x - 7y + 4z = 46

2.79. (a) x = 7 + t, y = -1 + 3t, z = 8 - 5t
(b) x1 = 1 + t, x2 = 9 - 12t, x3 = -4 + 4t, x4 = 5 - t
(c) x = 4 + 3t, y = -1 - 2t, z = 9 + t

2.80. (a) 3x - 4y + 5z = -20; (b) 4x + 3y - 2z = 16


2.81. (a) u = (3i - 4j - 12k)/13; (b) u = (2i - j - 2k)/3

2.82. (a) 23/(√29 √41); (b) 15/(√45 √29)

2.83. (a) x = 2 + 4t, y = 5 - 5t, z = -3 + 7t
(b) x = 1 + 2t, y = 2 - 9t, z = -4 + 6t
(c) x = 1 + 2t, y = -5 - 3t, z = 7 + 7t

2.84. (a) 8i - 4j + k; (b) -3k and 125i - 25j + 7k; (c) T = (6i - 2j + k)/√41

2.85. (a) T(t) = ((-sin t)i + (cos t)j + k)/√2
(b) N(t) = (-cos t)i - (sin t)j
(c) B(t) = ((sin t)i - (cos t)j + k)/√2

2.86. (a) i + j + 2k; (b) 2i + 3j + 2k; (c) √17; (d) 2i + 6j

2.87. (a) N = 6i + 7j + 9k, 6x + 7y + 9z = 45
(b) N = 6i - 12j - 10k, 3x - 6y - 5z = 16

2.88. N = 8i + 6j - k, 8x + 6y - z = 15

2.89. (a) 2i + 13j + 23k; (b) -22i + 2j + 37k; (c) 31i - 16j - 6k; (d) -2i - 13j - 23k

2.90. (a) (7, 1, -3)/√59; (b) (5i + 11j - 2k)/√150

2.92. (a) 50 - 55i; (b) -16 - 30i; (c) (4 + 7i)/65; (d) (1 + 3i)/2; (e) -2 - 2i

2.93. (a) -(1/2)i; (b) (5 + 27i)/58; (c) -i, i, -1; (d) (4 + 3i)/50

2.94. (a) z + w = 9 - 2i; (b) zw = 29 - 29i; (c) z/w = (-1 - 41i)/58; (d) z̄ = 2 + 5i, w̄ = 7 - 3i; (e) |z| = √29, |w| = √58

2.95. (a) z/w = (7 + 16i)/61; (b) z̄ = 2 - i, w̄ = 6 + 5i; (c) |z| = √5, |w| = √61

2.97. If zw = 0, then |zw| = |z| |w| = |0| = 0. Hence |z| = 0 or |w| = 0; and so z = 0 or w = 0.

Chapter 3

Matrices

3.1 INTRODUCTION

Matrices were first introduced in Chapter 1 and their elements were related to the coefficients of systems of linear equations. Here we will reintroduce these matrices and we will study certain algebraic operations defined on them. The material here is mainly computational. However, as with linear equations, the abstract treatment presented later on will give us new insight into the structure of matrices.

The entries in our matrices shall come from some arbitrary, but fixed, field K. The elements of K are called scalars. Nothing essential is lost if the reader assumes that K is the real field R or the complex field C.

Last, we remark that the elements of Rⁿ or Cⁿ are conveniently represented by "row vectors" or "column vectors," which are special cases of matrices.

3.2 MATRICES

A matrix over a field K (or simply a matrix if K is implicit) is a rectangular array of scalars a_ij of the form

    ( a11  a12  ...  a1n )
    ( a21  a22  ...  a2n )
    ( ...................)
    ( am1  am2  ...  amn )

The above matrix is also denoted by (a_ij), i = 1, ..., m, j = 1, ..., n, or simply by (a_ij). The m horizontal n-tuples

    (a11, a12, ..., a1n), (a21, a22, ..., a2n), ..., (am1, am2, ..., amn)

are the rows of the matrix, and the n vertical m-tuples

    ( a11 )   ( a12 )         ( a1n )
    ( a21 )   ( a22 )   ...   ( a2n )
    ( ... )   ( ... )         ( ... )
    ( am1 )   ( am2 )         ( amn )

are its columns. Note that the element a_ij, called the ij-entry or ij-component, appears in the ith row and the jth column. A matrix with m rows and n columns is called an m by n matrix, or m × n matrix; the pair of numbers (m, n) is called its size or shape.

Matrices will usually be denoted by capital letters A, B, ..., and the elements of the field K by lowercase letters a, b, .... Two matrices A and B are equal, written A = B, if they have the same shape and if corresponding elements are equal. Thus the equality of two m × n matrices is equivalent to a system of mn equalities, one for each pair of elements.

Example 3.1

(a) The following is a 2 × 3 matrix:

    ( 1  -3   4 )
    ( 0   5  -2 )

Its rows are (1, -3, 4) and (0, 5, -2); its columns are

    ( 1 )    ( -3 )    (  4 )
    ( 0 ),   (  5 ),   ( -2 )

(b) The matrix equation

    ( x + y   2z + w )     ( 3  5 )
    ( x - y   z - w  )  =  ( 1  4 )

is equivalent to the following system of equations:

    x + y = 3
    x - y = 1
    2z + w = 5
    z - w = 4

The solution of the system is x = 2, y = 1, z = 3, w = -1.

Remark: A matrix with one row is also referred to as a row vector, and with one column as a column vector. In particular, an element in the field K can be viewed as a 1 × 1 matrix.

3.3 MATRIX ADDITION AND SCALAR MULTIPLICATION

Let A and B be two matrices with the same size, i.e., the same number of rows and of columns, say, m × n matrices:

    A = ( a11  a12  ...  a1n )        B = ( b11  b12  ...  b1n )
        ( ...................)            ( ...................)
        ( am1  am2  ...  amn )            ( bm1  bm2  ...  bmn )

The sum of A and B, written A + B, is the matrix obtained by adding corresponding entries:

    A + B = ( a11 + b11   a12 + b12   ...   a1n + b1n )
            ( .......................................)
            ( am1 + bm1   am2 + bm2   ...   amn + bmn )

The product of a matrix A by the scalar k, written k·A or simply kA, is the matrix obtained by multiplying each entry of A by k:

    kA = ( ka11  ka12  ...  ka1n )
         ( .......................)
         ( kam1  kam2  ...  kamn )

Observe that A + B and kA are also m × n matrices. We also define

-A = -1·A    and    A - B = A + (-B)

The sum of matrices with different sizes is not defined.

Example 3.2. Let

    A = ( 1  -2   3 )        B = (  3  0  2 )
        ( 4   5  -6 )            ( -7  1  8 )

Then

    A + B = ( 1 + 3   -2 + 0   3 + 2 )  =  (  4  -2  5 )
            ( 4 - 7    5 + 1  -6 + 8 )     ( -3   6  2 )

    3A = ( 3·1   3·(-2)   3·3    )  =  (  3  -6    9 )
         ( 3·4   3·5      3·(-6) )     ( 12  15  -18 )

    2A - 3B = ( 2  -4    6 )   ( -9   0   -6 )     ( -7  -4    0 )
              ( 8  10  -12 ) + ( 21  -3  -24 )  =  ( 29   7  -36 )
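Representing a matrix as a list of rows, both operations are one-liners. A minimal Python sketch (ours; madd and smul are our own names) reproducing Example 3.2:

    def madd(A, B):
        # entrywise sum of two matrices of the same shape
        return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

    def smul(k, A):
        # multiply each entry of A by the scalar k
        return [[k * a for a in row] for row in A]

    A = [[1, -2, 3], [4, 5, -6]]
    B = [[3, 0, 2], [-7, 1, 8]]
    print(madd(A, B))                     # [[4, -2, 5], [-3, 6, 2]]
    print(smul(3, A))                     # [[3, -6, 9], [12, 15, -18]]
    print(madd(smul(2, A), smul(-3, B)))  # [[-7, -4, 0], [29, 7, -36]]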


The m × n matrix whose entries are all zero is called the zero matrix; for example, the 2 × 3 zero matrix is

    ( 0  0  0 )
    ( 0  0  0 )

The zero matrix is similar to the scalar 0, and the same symbol will be used for both. For any matrix A,

A + 0 = 0 + A = A

Basic properties of matrices under the operations of matrix addition and scalar multiplication follow.

Theorem 3.1: Let V be the set of all m × n matrices over a field K. Then for any matrices A, B, C ∈ V and any scalars k1, k2 ∈ K,

(i) (A + B) + C = A + (B + C)        (v) k1(A + B) = k1A + k1B
(ii) A + 0 = A                       (vi) (k1 + k2)A = k1A + k2A
(iii) A + (-A) = 0                   (vii) (k1k2)A = k1(k2A)
(iv) A + B = B + A                   (viii) 1·A = A

Using (vi) and (viii) above, we also have that A + A = 2A, A + A + A = 3A, ....

Remark: Suppose vectors in Rⁿ are represented by row vectors (or by column vectors); say,

u = (a1, a2, ..., an)    and    v = (b1, b2, ..., bn)

Then viewed as matrices, the sum u + v and the scalar product ku are as follows:

u + v = (a1 + b1, a2 + b2, ..., an + bn)    and    ku = (ka1, ka2, ..., kan)

But this corresponds precisely to the sum and scalar product as defined in Chapter 2. In other words, the above operations on matrices may be viewed as a generalization of the corresponding operations defined in Chapter 2.

3.4 MATRIX MULTIPLICATION

The product of matrices A and B, written AB, is somewhat complicated. For this reason, we first begin with a special case.

The product A·B of a row matrix A = (ai) and a column matrix B = (bi) with the same number of elements is defined as follows:

    (a1, a2, ..., an) ( b1 )
                      ( b2 )  =  a1b1 + a2b2 + ... + anbn
                      ( .. )
                      ( bn )

Note that A·B is a scalar (or a 1 × 1 matrix). The product A·B is not defined when A and B have different numbers of elements.

Example 3.3

    (8, -4, 5) (  3 )
               (  2 )  =  8·3 + (-4)·2 + 5·(-1) = 24 - 8 - 5 = 11
               ( -1 )

Using the above definition, we now define matrix multiplication in general.


Definition: Suppose A = (a_ik) and B = (b_kj) are matrices such that the number of columns of A is equal to the number of rows of B; say, A is an m × p matrix and B is a p × n matrix. Then the product AB is the m × n matrix whose ij-entry is obtained by multiplying the ith row Ai of A by the jth column B^j of B:

    AB = ( A1·B¹  A1·B²  ...  A1·Bⁿ )
         ( A2·B¹  A2·B²  ...  A2·Bⁿ )
         ( .........................)
         ( Am·B¹  Am·B²  ...  Am·Bⁿ )

That is, if A = (a_ik) and B = (b_kj), then AB = (c_ij), where

    c_ij = a_i1 b_1j + a_i2 b_2j + ... + a_ip b_pj = Σ (k = 1 to p) a_ik b_kj

We emphasize that the product AB is not defined if A is an m × p matrix and B is a q × n matrix, where p ≠ q.

Example 3.4

(a) ( r  s ) ( a1  a2  a3 )     ( ra1 + sb1   ra2 + sb2   ra3 + sb3 )
    ( t  u ) ( b1  b2  b3 )  =  ( ta1 + ub1   ta2 + ub2   ta3 + ub3 )

(b) ( 1  2 ) ( 1  1 )     ( 1·1 + 2·0   1·1 + 2·2 )     ( 1   5 )
    ( 3  4 ) ( 0  2 )  =  ( 3·1 + 4·0   3·1 + 4·2 )  =  ( 3  11 )

(c) ( 1  1 ) ( 1  2 )     ( 1·1 + 1·3   1·2 + 1·4 )     ( 4  6 )
    ( 0  2 ) ( 3  4 )  =  ( 0·1 + 2·3   0·2 + 2·4 )  =  ( 6  8 )

The above example shows that matrix multiplication is not commutative, i.e., the products AB and BA of matrices need not be equal.

Matrix multiplication does, however, satisfy the following properties:

Theorem 3.2: (i) (AB)C = A(BC) (associative law)
(ii) A(B + C) = AB + AC (left distributive law)
(iii) (B + C)A = BA + CA (right distributive law)
(iv) k(AB) = (kA)B = A(kB), where k is a scalar

We assume that the sums and products in the above theorem are defined. We remark that 0A = 0 and B0 = 0 where 0 is the zero matrix.
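The defining formula c_ij = Σ_k a_ik b_kj is a triple loop. A short Python sketch (ours; matmul is our own name), checked against Example 3.4(b) and (c); the output also exhibits AB ≠ BA:

    def matmul(A, B):
        """Product of an m x p and a p x n matrix (lists of rows)."""
        p, n = len(B), len(B[0])
        assert all(len(row) == p for row in A), "inner dimensions must agree"
        return [[sum(A[i][k] * B[k][j] for k in range(p)) for j in range(n)]
                for i in range(len(A))]

    A = [[1, 2], [3, 4]]
    B = [[1, 1], [0, 2]]
    print(matmul(A, B))   # [[1, 5], [3, 11]]
    print(matmul(B, A))   # [[4, 6], [6, 8]] -- not equal, so AB != BA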

3.5 TRANSPOSE OF A MATRIX

The transpose of a matrix A, denoted A^T, is the matrix obtained by writing the rows of A, in order, as columns:

                          [ a11  a21  ...  am1 ]
   (aij)_{m x n}^T   =    [ a12  a22  ...  am2 ]
                          [ .................. ]
                          [ a1n  a2n  ...  amn ]

In other words, if A = (aij) is an m x n matrix, then A^T = (a'ij) is the n x m matrix where a'ij = aji for all i and j. Note that the transpose of a row vector is a column vector, and vice versa. The transpose operation on matrices satisfies the following properties:

Theorem 3.3: (i) (A + B)^T = A^T + B^T
(ii) (A^T)^T = A
(iii) (kA)^T = kA^T (k a scalar)
(iv) (AB)^T = B^T A^T

Observe in (iv) that the transpose of a product is the product of transposes, but in the reverse order.
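Property (iv) in particular is worth testing numerically, since the reversal of order is a common source of errors. A quick sketch (Python with NumPy, the same assumption as before):

    import numpy as np

    A = np.array([[1, 2, 0], [3, -1, 4]])
    B = np.array([[2, 0], [1, -2], [5, 3]])

    assert (A.T.T == A).all()               # (ii)  (A^T)^T = A
    assert ((A @ B).T == B.T @ A.T).all()   # (iv)  (AB)^T = B^T A^T, reverse order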

3.6 MATRICES AND SYSTEMS OF LINEAR EQUATIONS

Consider again a system of m linear equations in n unknowns:

   a11 x1 + a12 x2 + ... + a1n xn = b1
   a21 x1 + a22 x2 + ... + a2n xn = b2
   ....................................        (3.1)
   am1 x1 + am2 x2 + ... + amn xn = bm

The above system is equivalent to the following matrix equation:

   [ a11  a12  ...  a1n ] [ x1 ]   [ b1 ]
   [ a21  a22  ...  a2n ] [ x2 ]   [ b2 ]
   [ .................. ] [ .. ] = [ .. ]        or simply   AX = B
   [ am1  am2  ...  amn ] [ xn ]   [ bm ]

where A = (aij) is the matrix of coefficients, called the coefficient matrix, X = (xj) is the column of unknowns, and B = (bi) is the column of constants. The statement that they are equivalent means that every solution of the system (3.1) is a solution of the matrix equation, and vice versa. The augmented matrix of the system (3.1) is the following matrix:

   [ a11  a12  ...  a1n  b1 ]
   [ a21  a22  ...  a2n  b2 ]
   [ ....................... ]
   [ am1  am2  ...  amn  bm ]

That is, the augmented matrix of the system AX = B is the matrix which consists of the matrix A of coefficients followed by the column B of constants. Observe that the system (3.1) is completely determined by its augmented matrix.

Example 3.5. The following are, respectively, a system of linear equations and its equivalent matrix equation:

   2x + 3y - 4z = 7            [ 2   3  -4 ] [ x ]   [ 7 ]
    x - 2y - 5z = 3            [ 1  -2  -5 ] [ y ] = [ 3 ]
                                             [ z ]

(Note that the size of the column of unknowns is not equal to the size of the column of constants.) The augmented matrix of the system is

   [ 2   3  -4   7 ]
   [ 1  -2  -5   3 ]

In studying linear equations it is usually simpler to use the language and theory of matrices, as indicated by the following theorems.


Theorem 3.4: Suppose u1, u2, ..., un are solutions of a homogeneous system of linear equations AX = 0. Then every linear combination of the ui of the form k1 u1 + k2 u2 + ... + kn un, where the ki are scalars, is also a solution of AX = 0. Thus, in particular, every multiple ku of any solution u of AX = 0 is also a solution of AX = 0.

Proof. We are given that Au1 = 0, Au2 = 0, ..., Aun = 0. Hence

   A(k1 u1 + k2 u2 + ... + kn un) = k1 Au1 + k2 Au2 + ... + kn Aun
                                  = k1 0 + k2 0 + ... + kn 0 = 0

Accordingly, k1 u1 + ... + kn un is a solution of the homogeneous system AX = 0.

Theorem 3.5: The general solution of a nonhomogeneous system AX = B may be obtained by adding the solution space W of the homogeneous system AX = 0 to a particular solution v0 of the nonhomogeneous system AX = B. (That is, v0 + W is the general solution of AX = B.)

Proof. Let w be any solution of AX = 0. Then

   A(v0 + w) = A(v0) + A(w) = B + 0 = B

That is, the sum v0 + w is a solution of AX = B.

On the other hand, suppose v is any solution of AX = B (which may be distinct from v0). Then

   A(v - v0) = Av - Av0 = B - B = 0

That is, the difference v - v0 is a solution of the homogeneous system AX = 0. But

   v = v0 + (v - v0)

Thus any solution of AX = B can be obtained by adding a solution of AX = 0 to the particular solution v0 of AX = B.

Theorem 3.6: Suppose the field K is infinite (e.g., K is the real field R or the complex field C). Then the system AX = B has no solution, a unique solution, or an infinite number of solutions.

Proof. It suffices to show that if AX = B has more than one solution, then it has infinitely many. Suppose u and v are distinct solutions of AX = B; that is, Au = B and Av = B. Then, for any k in K,

   A[u + k(u - v)] = Au + k(Au - Av) = B + k(B - B) = B

In other words, for each k in K, u + k(u - v) is a solution of AX = B. Since all such solutions are distinct (Problem 3.21), AX = B has an infinite number of solutions, as claimed.
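Theorems 3.4-3.6 describe the structure of the solution set, and that structure can be observed numerically. In the sketch below (Python with NumPy; the homogeneous solution w = (23, -6, 7)^T was found by hand for the system of Example 3.5), v0 + kw solves AX = B for every scalar k, illustrating the infinite family of Theorem 3.6:

    import numpy as np

    A = np.array([[2, 3, -4], [1, -2, -5]], dtype=float)
    b = np.array([7, 3], dtype=float)

    v0, *_ = np.linalg.lstsq(A, b, rcond=None)   # one particular solution of AX = b
    w = np.array([23, -6, 7], dtype=float)       # a solution of the homogeneous system
    assert np.allclose(A @ w, 0)
    for k in range(5):                           # v0 + k w is a solution for each k
        assert np.allclose(A @ (v0 + k * w), b)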

3.7 BLOCK MATRICES

Using a system of horizontal and vertical (dashed) lines, we can partition a matrix A into smaller matrices called blocks (or cells) of A. The matrix A is then called a block matrix. Clearly, a given matrix may be divided into blocks in different ways; for example,

   [ 1  -2 | 0  1 |  3 ]          [ 1  -2 | 0  1   3 ]
   [ 2   3 | 5  7 | -2 ]          [ 2   3 | 5  7  -2 ]
   [ ------+------+--- ]          [ ------+--------- ]
   [ 3   1 | 4  5 |  9 ]          [ 3   1 | 4  5   9 ]


The convenience of the partition into blocks is that the result of operations on block matrices can be obtained by carrying out the computation with the blocks, just as if they were the actual elements of the matrices. This is illustrated below.

Suppose A is partitioned into blocks; say

        [ A11  A12  ...  A1n ]
   A =  [ A21  A22  ...  A2n ]
        [ .................. ]
        [ Am1  Am2  ...  Amn ]

Multiplying each block by a scalar k multiplies each element of A by k; thus

        [ kA11  kA12  ...  kA1n ]
   kA = [ kA21  kA22  ...  kA2n ]
        [ ...................... ]
        [ kAm1  kAm2  ...  kAmn ]

Now suppose a matrix B is partitioned into the same number of blocks as A; say

        [ B11  B12  ...  B1n ]
   B =  [ B21  B22  ...  B2n ]
        [ .................. ]
        [ Bm1  Bm2  ...  Bmn ]

Furthermore, suppose the corresponding blocks of A and B have the same size. Adding these corresponding blocks adds the corresponding elements of A and B. Accordingly,

           [ A11 + B11   A12 + B12   ...   A1n + B1n ]
   A + B = [ A21 + B21   A22 + B22   ...   A2n + B2n ]
           [ ......................................... ]
           [ Am1 + Bm1   Am2 + Bm2   ...   Amn + Bmn ]

The case of matrix multiplication is less obvious but still true. That is, suppose matrices U and V are partitioned into blocks as follows,

        [ U11  U12  ...  U1p ]               [ V11  V12  ...  V1n ]
   U =  [ U21  U22  ...  U2p ]     and   V = [ V21  V22  ...  V2n ]
        [ .................. ]               [ .................. ]
        [ Um1  Um2  ...  Ump ]               [ Vp1  Vp2  ...  Vpn ]

such that the number of columns of each block Uik is equal to the number of rows of each block Vkj. Then

         [ W11  W12  ...  W1n ]
   UV =  [ W21  W22  ...  W2n ]        where   Wij = Ui1 V1j + Ui2 V2j + ... + Uip Vpj
         [ .................. ]
         [ Wm1  Wm2  ...  Wmn ]

The proof of the above formula for UV is straightforward, but detailed and lengthy. It is left for Problem 3.31.
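The formula for UV can also be confirmed numerically. The following sketch (Python with NumPy; the partition sizes are arbitrary choices for the illustration) multiplies two matrices blockwise and compares the result with the ordinary product:

    import numpy as np

    U = np.arange(16.0).reshape(4, 4)
    V = np.arange(12.0).reshape(4, 3)

    U11, U12 = U[:2, :2], U[:2, 2:]     # partition U into four 2 x 2 blocks
    U21, U22 = U[2:, :2], U[2:, 2:]
    V1, V2 = V[:2, :], V[2:, :]         # partition V into matching row blocks

    W1 = U11 @ V1 + U12 @ V2            # first block row of UV
    W2 = U21 @ V1 + U22 @ V2            # second block row of UV
    assert np.allclose(np.vstack([W1, W2]), U @ V)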


Solved Problems

MATRIX ADDITION AND SCALAR MULTIPLICATION

3.1. Compute: (a) A + B for two given 2 x 3 matrices, (b) 3A and -5A, where

   A = [ 1  -2   3 ]
       [ 4   5  -6 ]

(a) Add corresponding elements; here the second rows give

   (4 + 0,  5 + 3,  6 + (-5)) = (4, 8, 1)

(b) Multiply each entry by the given scalar:

   3A = [ 3(1)  3(-2)  3(3)  ]   [  3   -6    9 ]
        [ 3(4)  3(5)   3(-6) ] = [ 12   15  -18 ]

   -5A = [ -5(1)  -5(-2)  -5(3)  ]   [  -5   10  -15 ]
         [ -5(4)  -5(5)   -5(-6) ] = [ -20  -25   30 ]

3.2. Find 2A - 3B, where A = [ 1  -2   3 ]   and   B = [  3  0  2 ]
                             [ 4   5  -6 ]             [ -7  1  8 ]

First perform the scalar multiplications, and then a matrix addition:

   2A - 3B = [ 2  -4    6 ] + [ -9   0   -6 ]   [ -7  -4    0 ]
             [ 8  10  -12 ]   [ 21  -3  -24 ] = [ 29   7  -36 ]

(Note that we multiply B by -3 and then add, rather than multiplying B by 3 and subtracting. This usually avoids errors.)

3.3. Find x, y, z, and w if

   3 [ x  y ]   [  x   6 ]   [   4     x + y ]
     [ z  w ] = [ -1  2w ] + [ z + w     3   ]

First write each side as a single matrix:

   [ 3x  3y ]   [   x + 4      x + y + 6 ]
   [ 3z  3w ] = [ z + w - 1     2w + 3   ]

Set corresponding entries equal to each other to obtain the system of four equations

   3x = x + 4,   3y = x + y + 6,   3z = z + w - 1,   3w = 2w + 3

that is,

   2x = 4,   2y = 6 + x,   2z = w - 1,   w = 3

The solution is: x = 2, y = 4, z = 1, w = 3.

3.4. Prove Theorem 3.1(v): Let A and B be m x n matrices and k a scalar. Then k(A + B) = kA + kB.

Suppose A = (aij) and B = (bij). Then aij + bij is the ij-entry of A + B, and so k(aij + bij) is the ij-entry of k(A + B). On the other hand, k aij and k bij are the ij-entries of kA and kB, respectively, and so k aij + k bij is the ij-entry of kA + kB. But k, aij, and bij are scalars in a field; hence

   k(aij + bij) = k aij + k bij     for every i, j

Thus k(A + B) = kA + kB, as corresponding entries are equal.

MATRIX MULTIPLICATION

3.5. Calculate: (a) the product of the row matrix (3, 8, -2, 4) and a column matrix with three elements, (b) the product (1, 8, 3, 4)(6, 1, -3, 5) of two row matrices.

(a) The product is not defined when the row matrix and the column matrix have different numbers of elements.

(b) The product of a row matrix and a row matrix is not defined.

3.6. Let A = [ 1   3 ]   and   B = [ 2   0  -4 ].  Find (a) AB, (b) BA.
             [ 2  -1 ]             [ 3  -2   6 ]

(a) Since A is 2 x 2 and B is 2 x 3, the product AB is defined and is a 2 x 3 matrix. To obtain the entries in the first row of AB, multiply the first row (1, 3) of A by the columns of B, respectively:

   ( 1·2 + 3·3,  1·0 + 3·(-2),  1·(-4) + 3·6 ) = ( 2 + 9,  0 - 6,  -4 + 18 ) = ( 11, -6, 14 )

To obtain the entries in the second row of AB, multiply the second row (2, -1) of A by the columns of B, respectively:

   ( 4 - 3,  0 + 2,  -8 - 6 ) = ( 1, 2, -14 )

Thus

   AB = [ 11  -6   14 ]
        [  1   2  -14 ]

(b) Note that B is 2 x 3 and A is 2 x 2. Since the inner numbers 3 and 2 are not equal, the product BA is not defined.

3.7. Given A = [  2  -1 ]   and   B = [ 1  -2  -5 ],  find (a) AB, (b) BA.
               [  1   0 ]             [ 3   4   0 ]
               [ -3   4 ]

(a) Since A is 3 x 2 and B is 2 x 3, the product AB is defined and is a 3 x 3 matrix. To obtain the first row of AB, multiply the first row of A by each column of B, respectively:

   ( 2 - 3,  -4 - 4,  -10 + 0 ) = ( -1, -8, -10 )

To obtain the second row of AB, multiply the second row of A by each column of B, respectively:

   ( 1 + 0,  -2 + 0,  -5 + 0 ) = ( 1, -2, -5 )

To obtain the third row of AB, multiply the third row of A by each column of B, respectively:

   ( -3 + 12,  6 + 16,  15 + 0 ) = ( 9, 22, 15 )

Thus

   AB = [ -1  -8  -10 ]
        [  1  -2   -5 ]
        [  9  22   15 ]

(b) Since B is 2 x 3 and A is 3 x 2, the product BA is defined and is a 2 x 2 matrix. To obtain the first row of BA, multiply the first row of B by each column of A, respectively:

   ( 2 - 2 + 15,  -1 + 0 - 20 ) = ( 15, -21 )

To obtain the second row of BA, multiply the second row of B by each column of A, respectively:

   ( 6 + 4 + 0,  -3 + 0 + 0 ) = ( 10, -3 )

Thus

   BA = [ 15  -21 ]
        [ 10   -3 ]

Remark: Observe that in this problem both AB and BA are defined, but they are not equal; in fact they do not even have the same shape.

3.8. Find AB, where

   A = [ 2   3  -1 ]      and      B = [ 1   0  6 ]
       [ 4  -2   5 ]                   [ 3  -5  1 ]
                                       [ 1  -2  2 ]

Since A is 2 x 3 and B is 3 x 3, the product AB is defined as a 2 x 3 matrix. Multiply the rows of A by the columns of B to obtain

   AB = [ 2 + 9 - 1    0 - 15 + 2     12 + 3 - 2  ]   [ 10  -13  13 ]
        [ 4 - 6 + 5    0 + 10 - 10    24 - 2 + 10 ] = [  3    0  32 ]

3.9. Refer to Problem 3.8. Suppose that only the third column of the product AB were of interest. How could it be computed independently?

By the rule for matrix multiplication, the jth column of a product is equal to the first factor times the jth column vector of the second. Thus

   [ 2   3  -1 ] [ 6 ]   [ 12 + 3 - 2  ]   [ 13 ]
   [ 4  -2   5 ] [ 1 ] = [ 24 - 2 + 10 ] = [ 32 ]
                 [ 2 ]

Similarly, the ith row of a product is equal to the ith row vector of the first factor times the second factor.

3.10. Let A be an m x n matrix, with m > 1 and n > 1. Assuming u and v are vectors, discuss the conditions under which (a) Au, (b) vA is defined.

(a) The product Au is defined only when u is a column vector with n components, i.e., an n x 1 matrix. In such case, Au is a column vector with m components.

(b) The product vA is defined only when v is a row vector with m components, i.e., a 1 x m matrix. In such case, vA is a row vector with n components.

3.11. Compute: (a) (2, 3, -1)^T (6, -4, 5) and (b) (6, -4, 5)(2, 3, -1)^T.

(a) The first factor is 3 x 1 and the second factor is 1 x 3, so the product is defined as a 3 x 3 matrix:

   [  2 ]               [ (2)(6)    (2)(-4)    (2)(5)  ]   [  12   -8   10 ]
   [  3 ] (6, -4, 5)  = [ (3)(6)    (3)(-4)    (3)(5)  ] = [  18  -12   15 ]
   [ -1 ]               [ (-1)(6)  (-1)(-4)   (-1)(5)  ]   [  -6    4   -5 ]

(b) The first factor is 1 x 3 and the second factor is 3 x 1, so the product is defined as a 1 x 1 matrix, which we frequently write as a scalar:

   (6, -4, 5)(2, 3, -1)^T = (12 - 12 - 5) = (-5)

3.12. Prove Theorem 3.2(i): (AB)C = A(BC).

Let A = (aij), B = (bjk), and C = (ckl). Furthermore, let AB = S = (sik) and BC = T = (tjl). Then

   sik = ai1 b1k + ai2 b2k + ... + aim bmk = Σj aij bjk
   tjl = bj1 c1l + bj2 c2l + ... + bjn cnl = Σk bjk ckl

Now multiplying S by C, i.e., (AB) by C, the element in the ith row and lth column of the matrix (AB)C is

   si1 c1l + si2 c2l + ... + sin cnl = Σk sik ckl = Σk Σj (aij bjk) ckl

On the other hand, multiplying A by T, i.e., A by BC, the element in the ith row and lth column of the matrix A(BC) is

   ai1 t1l + ai2 t2l + ... + aim tml = Σj aij tjl = Σj Σk aij (bjk ckl)

Since the above sums are equal, the theorem is proven.

3.13. Prove Theorem 3.2(ii): A(B + C) = AB + AC.

Let A = (aij), B = (bjk), and C = (cjk). Furthermore, let D = B + C = (djk), E = AB = (eik), and F = AC = (fik). Then

   djk = bjk + cjk
   eik = ai1 b1k + ai2 b2k + ... + aim bmk = Σj aij bjk
   fik = ai1 c1k + ai2 c2k + ... + aim cmk = Σj aij cjk

Hence the element in the ith row and kth column of the matrix AB + AC is

   eik + fik = Σj aij bjk + Σj aij cjk = Σj aij (bjk + cjk)

On the other hand, the element in the ith row and kth column of the matrix AD = A(B + C) is

   ai1 d1k + ai2 d2k + ... + aim dmk = Σj aij djk = Σj aij (bjk + cjk)

Thus A(B + C) = AB + AC, since the corresponding elements are equal.

TRANSPOSE

3.14. Given a 2 x 3 matrix A, find A^T and (A^T)^T.

Rewrite the rows of A as columns to obtain A^T, and then rewrite the rows of A^T as columns to obtain (A^T)^T; for instance, if the second row of A is (6, -7, -8), then the second column of A^T is (6, -7, -8)^T. [As expected from Theorem 3.3(ii), (A^T)^T = A.]

3.15. Show that the matrices AA^T and A^T A are defined for any matrix A.

If A is an m x n matrix, then A^T is an n x m matrix. Hence AA^T is defined as an m x m matrix, and A^T A is defined as an n x n matrix.

3.16. Find AA^T and A^T A, where A = [ 1   2  0 ]
                                     [ 3  -1  4 ]

Obtain A^T by rewriting the rows of A as columns:

   A^T = [ 1   3 ]      whence     AA^T = [ 1 + 4 + 0    3 - 2 + 0  ]   [ 5   1 ]
         [ 2  -1 ]                        [ 3 - 2 + 0    9 + 1 + 16 ] = [ 1  26 ]
         [ 0   4 ]

   A^T A = [ 1 + 9     2 - 3    0 + 12 ]   [ 10  -1  12 ]
           [ 2 - 3     4 + 1    0 - 4  ] = [ -1   5  -4 ]
           [ 0 + 12    0 - 4    0 + 16 ]   [ 12  -4  16 ]

3.17. Prove Theorem 3.3(iv): (AB)^T = B^T A^T.

If A = (aij) and B = (bij), the ij-entry of AB is

   ai1 b1j + ai2 b2j + ... + aim bmj     (1)

Thus (1) is the ji-entry (reverse order) of (AB)^T.

On the other hand, column j of B becomes row j of B^T, and row i of A becomes column i of A^T. Consequently, the ji-entry of B^T A^T is

   b1j ai1 + b2j ai2 + ... + bmj aim

Thus (AB)^T = B^T A^T, since corresponding entries are equal.

BLOCK MATRICES

3.18. Compute AB using block multiplication, where

   A = [ E  F ]        and        B = [ R  S ]
       [ 0  G ]                       [ 0  T ]

Here E = [ 1  2 ],  F = [ 1 ],  G = [ 2 ],  R = [ 1  2  3 ],  and S, T are the remaining given
         [ 3  4 ]       [ 0 ]                   [ 4  5  6 ]
blocks; the zero blocks have sizes 1 x 2 and 1 x 3. Since the column partition of A matches the row partition of B, the blocks multiply just like matrix entries:

   AB = [ ER    ES + FT ]
        [ 0       GT    ]

and each block product ER, ES + FT, and GT is computed as an ordinary matrix product.

3.19. Compute CD by block multiplication, where

   C = [ 1  3 | 0  0   0 ]        and        D = [ 3  -2 | 0   0 ]
       [ 2  4 | 0  0   0 ]                       [ 2   4 | 0   0 ]
       [ -----+--------- ]                       [ -------+----- ]
       [ 0  0 | 5  1  -2 ]                       [ 0   0 | 1   2 ]
       [ 0  0 | 3  4  -1 ]                       [ 0   0 | 2  -3 ]
                                                 [ 0   0 | 4  -1 ]

Since C and D are block diagonal matrices whose corresponding blocks have matching sizes, CD is the block diagonal matrix whose diagonal blocks are the products of the corresponding diagonal blocks of C and D. For the second pair of blocks,

   [ 5  1  -2 ] [ 1   2 ]   [ 5 + 2 - 8    10 - 3 + 2 ]   [ -1   9 ]
   [ 3  4  -1 ] [ 2  -3 ] = [ 3 + 8 - 4    6 - 12 + 1 ] = [  7  -5 ]
                [ 4  -1 ]

and computing the first pair similarly gives

   CD = [  9  10 |  0   0 ]
        [ 14  12 |  0   0 ]
        [ --------+------ ]
        [  0   0 | -1   9 ]
        [  0   0 |  7  -5 ]

MISCELLANEOUS PROBLEMS

3.20. Show: (a) If A has a zero row, then AB has a zero row. (b) If B has a zero column, then AB has a zero column.

(a) Let Ri be the zero row of A, and B^1, ..., B^n the columns of B. Then the ith row of AB is

   (Ri · B^1, Ri · B^2, ..., Ri · B^n) = (0, 0, ..., 0)

(b) Let Cj be the zero column of B, and A1, ..., Am the rows of A. Then the jth column of AB is

   (A1 · Cj, A2 · Cj, ..., Am · Cj)^T = (0, 0, ..., 0)^T

3.21. Let u and v be distinct vectors. Show that, for distinct scalars k in K, the vectors u + k(u - v) are distinct.

It suffices to show that if

   u + k1(u - v) = u + k2(u - v)     (1)

then k1 = k2. Suppose (1) holds. Then

   k1(u - v) = k2(u - v)     or     (k1 - k2)(u - v) = 0

Since u and v are distinct, u - v ≠ 0. Hence k1 - k2 = 0 and k1 = k2.

Supplementary Problems

MATRIX OPERATIONS

Problems 3.22-3.25 refer to the following matrices:

   A = [ 1   2 ]      B = [  5  0 ]      C = [ 1  -3   4 ]
       [ 3  -4 ]          [ -6  7 ]          [ 2   6  -5 ]

3.22. Find 5A - 2B and 2A + 3B.

3.23. Find: (a) AB and (AB)C, (b) BC and A(BC). [Note that (AB)C = A(BC).]

3.24. Find A^T, B^T, and A^T B^T. [Note that A^T B^T ≠ (AB)^T.]

3.25. Find AA = A^2 and AC.

3.26. Let e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1), and let

   A = [ a1  a2  a3  a4 ]
       [ b1  b2  b3  b4 ]      Find e1 A, e2 A, and e3 A.
       [ c1  c2  c3  c4 ]

3.27. Let ei = (0, ..., 0, 1, 0, ..., 0), where the 1 is the ith component. Show the following:

(a) ei A = Ai, the ith row of a matrix A.
(b) B ej^T = B^j, the jth column of B.
(c) If ei A = ei B for each i, then A = B.
(d) If A ej^T = B ej^T for each j, then A = B.

3.28. Let A = [ 1  2 ]. Find a 2 x 3 matrix B with distinct nonzero entries such that AB = 0.
              [ 3  6 ]

3.29. Prove Theorem 3.2(iii): (B + C)A = BA + CA; and (iv): k(AB) = (kA)B = A(kB), where k is a scalar. [Parts (i) and (ii) were proven in Problems 3.12 and 3.13, respectively.]

3.30. Prove Theorem 3.3: (i) (A + B)^T = A^T + B^T; (ii) (A^T)^T = A; (iii) (kA)^T = kA^T, for k a scalar. [Part (iv) was proven in Problem 3.17.]

3.31. Suppose A = (Aik) and B = (Bkj) are block matrices for which AB is defined, and suppose the number of columns of each block Aik is equal to the number of rows of each block Bkj. Show that AB = (Cij), where Cij = Σk Aik Bkj.


Answers to Supplementary Problems

3.22. 5A - 2B = [ -5   10 ]        2A + 3B = [  17   4 ]
                [ 27  -34 ]                  [ -12  13 ]

3.23. (a) AB = [ -7  14 ]      (AB)C = [  21   105  -98 ]
               [ 39 -28 ]              [ -17  -285  296 ]

      (b) BC = [ 5  -15   20 ]      A(BC) = [  21   105  -98 ]
               [ 8   60  -59 ]              [ -17  -285  296 ]

3.24. A^T = [ 1   3 ]      B^T = [ 5  -6 ]      A^T B^T = [  5   15 ]
            [ 2  -4 ]            [ 0   7 ]                [ 10  -40 ]

3.25. A^2 = [  7  -6 ]      AC = [  5    9   -6 ]
            [ -9  22 ]           [ -5  -33   32 ]

3.26. (a1, a2, a3, a4), (b1, b2, b3, b4), (c1, c2, c3, c4), the rows of A.

3.28. B = [  2   4   6 ]
          [ -1  -2  -3 ]

Chapter 4

Square Matrices, Elementary Matrices

4.1 INTRODUCTION

Matrices with the same number of rows as columns are called square matrices. These matrices play a major role in linear algebra and will be used throughout the text. This chapter introduces us to these matrices and certain of their elementary properties.

This chapter also introduces us to the elementary matrices, which are closely related to the elementary row operations in Chapter 1. We use these matrices to justify two algorithms: one which finds the inverse of a matrix, and the other which diagonalizes a quadratic form.

The scalars in this chapter are real numbers unless otherwise stated or implied. However, we will discuss the special case of complex matrices and some of their properties.

4.2 SQUARE MATRICES

A square matrix is a matrix with the same number of rows as columns. An n x n square matrix is said to be of order n and is called an n-square matrix.

Recall that not every two matrices can be added or multiplied. However, if we only consider square matrices of some given order n, then this inconvenience disappears. Specifically, the operations of addition, multiplication, scalar multiplication, and transpose can be performed on any n x n matrices, and the result is again an n x n matrix.

Example 4.1. Let A = [  1   2   3 ]      and      B = [ 2  -5   1 ]
                     [ -4  -4  -4 ]                   [ 0   3  -2 ]
                     [  5   6   7 ]                   [ 1   2  -4 ]

Then A and B are square matrices of order 3. Also,

   A + B = [  3  -3   4 ]      2A = [  2   4   6 ]      A^T = [ 1  -4  5 ]
           [ -4  -1  -6 ]           [ -8  -8  -8 ]            [ 2  -4  6 ]
           [  6   8   3 ]           [ 10  12  14 ]            [ 3  -4  7 ]

and

   AB = [  2 + 0 + 3     -5 + 6 + 6       1 - 4 - 12 ]   [   5   7  -15 ]
        [ -8 + 0 - 4     20 - 12 - 8     -4 + 8 + 16 ] = [ -12   0   20 ]
        [ 10 + 0 + 7    -25 + 18 + 14     5 - 12 - 28 ]  [  17   7  -35 ]

are matrices of order 3.

Remark: A nonempty collection A of matrices is called an algebra (of matrices) if A is closed under the operations of matrix addition, scalar multiplication of a matrix, and matrix multiplication. Thus the collection M_n of all n-square matrices forms an algebra of matrices.

Square Matrices as Functions

Let A be any n-square matrix. Then A may be viewed as a function A : R^n → R^n in two different ways:


(1) A(u) = Au, where u is a column vector;
(2) A(u) = uA, where u is a row vector.

This book adopts the first meaning of A(u); that is, the function defined by the matrix A will be A(u) = Au. Accordingly, unless otherwise stated or implied, vectors u in R^n are assumed to be column vectors (not row vectors). However, for typographical convenience, such column vectors u will often be presented as transposed row vectors.

Example 4.2. Let A = [ 1  -2   3 ]. If u = (1, -3, 7)^T, then
                     [ 4   5  -6 ]
                     [ 2   0  -1 ]

   A(u) = Au = [ 1  -2   3 ] [  1 ]   [ 1 + 6 + 21  ]   [  28 ]
               [ 4   5  -6 ] [ -3 ] = [ 4 - 15 - 42 ] = [ -53 ]
               [ 2   0  -1 ] [  7 ]   [ 2 + 0 - 7   ]   [  -5 ]

If w = (2, -1, 4)^T, then

   A(w) = Aw = [ 1  -2   3 ] [  2 ]   [ 2 + 2 + 12 ]   [  16 ]
               [ 4   5  -6 ] [ -1 ] = [ 8 - 5 - 24 ] = [ -21 ]
               [ 2   0  -1 ] [  4 ]   [ 4 + 0 - 4  ]   [   0 ]

Commuting Matrices

Matrices A and B are said to commute if AB = BA, a condition that applies only for square matrices of the same order. For example, suppose

   A = [ 1  2 ]      and      B = [ 5   4 ]
       [ 3  4 ]                   [ 6  11 ]

Then

   AB = [  5 + 12     4 + 22 ]   [ 17  26 ]        BA = [ 5 + 12    10 + 16 ]   [ 17  26 ]
        [ 15 + 24    12 + 44 ] = [ 39  56 ]             [ 6 + 33    12 + 44 ] = [ 39  56 ]

Since AB = BA, the matrices commute.

4.3 DIAGONAL AND TRACE, IDENTITY MATRIX

Let A = (aij) be an n-square matrix. The diagonal (or main diagonal) of A consists of the elements a11, a22, ..., ann. The trace of A, written tr A, is the sum of the diagonal elements; that is,

   tr A = a11 + a22 + ... + ann = Σ (i = 1 to n) aii


The n-square matrix with 1's on the diagonal and 0's elsewhere, denoted by I_n or simply I, is called the identity (or unit) matrix. The matrix I is similar to the scalar 1 in that, for any matrix A (of the same order),

   AI = IA = A

More generally, if B is an m x n matrix, then B I_n = B and I_m B = B (Problem 4.9).

For any scalar k in K, the matrix kI, which contains k's on the diagonal and 0's elsewhere, is called the scalar matrix corresponding to the scalar k.

Example 4.3.

(a) The Kronecker delta δij is defined by

   δij = 0 if i ≠ j     and     δij = 1 if i = j

Thus the identity matrix may be defined by I = (δij).

(b) The scalar matrices of orders 2, 3, and 4 corresponding to the scalar k = 5 are, respectively,

   [ 5  0 ]      [ 5  0  0 ]      [ 5          ]
   [ 0  5 ]      [ 0  5  0 ]      [    5       ]
                 [ 0  0  5 ]      [       5    ]
                                  [          5 ]

(It is common practice to omit blocks or patterns of 0's, as in the third matrix.)

The following theorem is proved in Problem 4.10.

Theorem 4.1: Suppose A = (aij) and B = (bij) are n-square matrices and k is a scalar. Then

   (i) tr (A + B) = tr A + tr B,   (ii) tr kA = k · tr A,   (iii) tr AB = tr BA
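A quick numerical check of Theorem 4.1 (Python with NumPy, assumed as before); part (iii) is the interesting one, since it holds even when AB and BA themselves are different:

    import numpy as np

    A = np.array([[1, 2], [3, -4]])
    B = np.array([[5, 0], [-6, 7]])

    assert np.trace(A + B) == np.trace(A) + np.trace(B)   # (i)
    assert np.trace(5 * A) == 5 * np.trace(A)             # (ii)
    assert np.trace(A @ B) == np.trace(B @ A)             # (iii), although AB != BA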

4.4 POWERS OF MATRICES, POLYNOMIALS IN MATRICES

Let A be an n-square matrix over a field K. Powers of A are defined as follows:

   A^2 = AA,   A^3 = A^2 A,   ...,   A^(n+1) = A^n A,   ...,   and   A^0 = I

Polynomials in the matrix A are also defined. Specifically, for any polynomial

   f(x) = a0 + a1 x + a2 x^2 + ... + an x^n

where the ai are scalars, f(A) is defined to be the matrix

   f(A) = a0 I + a1 A + a2 A^2 + ... + an A^n

[Note that f(A) is obtained from f(x) by substituting the matrix A for the variable x and substituting the scalar matrix a0 I for the scalar a0.] In the case that f(A) is the zero matrix, the matrix A is called a zero or root of the polynomial f(x).

Example 4.4. Let A = [ 1   2 ]. Then
                     [ 3  -4 ]

   A^2 = [ 1   2 ] [ 1   2 ]   [  7   -6 ]        A^3 = A^2 A = [  7   -6 ] [ 1   2 ]   [ -11    38 ]
         [ 3  -4 ] [ 3  -4 ] = [ -9   22 ]                      [ -9   22 ] [ 3  -4 ] = [  57  -106 ]

If f(x) = 2x^2 - 3x + 5, then

   f(A) = 2 [  7   -6 ] - 3 [ 1   2 ] + 5 [ 1  0 ]   [  16  -18 ]
            [ -9   22 ]     [ 3  -4 ]     [ 0  1 ] = [ -27   61 ]

If g(x) = x^2 + 3x - 10, then

   g(A) = [  7   -6 ] + 3 [ 1   2 ] - 10 [ 1  0 ]   [ 0  0 ]
          [ -9   22 ]     [ 3  -4 ]      [ 0  1 ] = [ 0  0 ]

Thus A is a zero of the polynomial g(x).
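The evaluation of a matrix polynomial is mechanical, and the computation of Example 4.4 can be scripted directly. The sketch below (Python with NumPy, assumed only for the illustration) recomputes g(A) and confirms that it is the zero matrix:

    import numpy as np

    A = np.array([[1, 2], [3, -4]])
    I = np.eye(2, dtype=int)

    def g(M):
        # g(x) = x^2 + 3x - 10, with the constant term taken as -10 I
        return M @ M + 3 * M - 10 * I

    print(g(A))     # [[0 0]
                    #  [0 0]]  -- A is a root of g(x)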

The above map from the ring K[x] of polynomials over K into the algebra M_n of n-square matrices, defined by

   f(x) → f(A)

is called the evaluation map at A. The following theorem (proved in Problem 4.11) applies.

Theorem 4.2: Let f(x) and g(x) be polynomials and let A be an n-square matrix (all over K). Then

(i) (f + g)(A) = f(A) + g(A),
(ii) (fg)(A) = f(A)g(A),
(iii) f(A)g(A) = g(A)f(A).

In other words, (i) and (ii) state that if we first add (multiply) the polynomials f(x) and g(x) and then evaluate the sum (product) at the matrix A, we get the same result as if we first evaluated f(x) and g(x) at A and then added (multiplied) the matrices f(A) and g(A). Part (iii) states that any two polynomials in A commute.

4.5 INVERTIBLE (NONSINGULAR) MATRICES

A square matrix A is said to be invertible (or nonsingular) if there exists a matrix B with the property that

   AB = BA = I

where I is the identity matrix. Such a matrix B is unique; for

   AB1 = B1 A = I and AB2 = B2 A = I   imply   B1 = B1 I = B1(AB2) = (B1 A)B2 = I B2 = B2

We call such a matrix B the inverse of A and denote it by A^(-1). Observe that the above relation is symmetric; that is, if B is the inverse of A, then A is the inverse of B.

Example 4.5

(a) Suppose A = [ 2  5 ]   and   B = [  3  -5 ]. Then
                [ 1  3 ]             [ -1   2 ]

   AB = [ 6 - 5    -10 + 10 ]   [ 1  0 ]        BA = [  6 - 5    15 - 15 ]   [ 1  0 ]
        [ 3 - 3     -5 + 6  ] = [ 0  1 ]             [ -2 + 2    -5 + 6  ] = [ 0  1 ]

Thus A and B are invertible and are inverses of each other.

(b) Suppose A = [ 1   0  2 ]      and      B = [ -11   2   2 ]. Then
                [ 2  -1  3 ]                   [  -4   0   1 ]
                [ 4   1  8 ]                   [   6  -1  -1 ]

   AB = [ -11 + 0 + 12    2 + 0 - 2    2 + 0 - 2 ]   [ 1  0  0 ]
        [ -22 + 4 + 18    4 + 0 - 3    4 - 1 - 3 ] = [ 0  1  0 ]
        [ -44 - 4 + 48    8 + 0 - 8    8 + 1 - 8 ]   [ 0  0  1 ]

By Problem 4.21, AB = I if and only if BA = I; hence we do not need to test whether BA = I. Thus A and B are inverses of each other.

Consider now a general 2 x 2 matrix

   A = [ a  b ]
       [ c  d ]

We are able to determine when A is invertible and, in such a case, to give a formula for its inverse. First of all, we seek scalars x, y, z, t such that

   [ a  b ] [ x  y ]   [ 1  0 ]        or        [ ax + bz    ay + bt ]   [ 1  0 ]
   [ c  d ] [ z  t ] = [ 0  1 ]                  [ cx + dz    cy + dt ] = [ 0  1 ]

which reduces to solving the following two systems,

   ax + bz = 1        ay + bt = 0
   cx + dz = 0        cy + dt = 1

where the original matrix A is the coefficient matrix of each system. Set |A| = ad - bc (the determinant of A). By Problems 1.60 and 1.61, the two systems are solvable, and A is invertible, if and only if |A| ≠ 0. In that case, the first system has the unique solution x = d/|A|, z = -c/|A|, and the second system has the unique solution y = -b/|A|, t = a/|A|. Accordingly,

   A^(-1) = [  d/|A|   -b/|A| ]  =  (1/|A|) [  d  -b ]
            [ -c/|A|    a/|A| ]             [ -c   a ]

In words: When |A| ≠ 0, the inverse of a 2 x 2 matrix A is obtained by (i) interchanging the elements on the main diagonal, (ii) taking the negatives of the other elements, and (iii) multiplying the matrix by 1/|A|.
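The three-step rule translates into a few lines of code. The following small Python function (a sketch; the name inv2 is our own) implements the formula and reproduces the inverse found in Example 4.5(a):

    def inv2(a, b, c, d):
        """Inverse of the 2 x 2 matrix [[a, b], [c, d]] via (1/|A|)[[d, -b], [-c, a]]."""
        det = a * d - b * c              # |A| = ad - bc
        if det == 0:
            raise ValueError("matrix is not invertible")
        return [[d / det, -b / det], [-c / det, a / det]]

    print(inv2(2, 5, 1, 3))   # [[3.0, -5.0], [-1.0, 2.0]], as in Example 4.5(a)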

Remark 1: The above property, that A is invertible if and only if its determinant |A| ≠ 0, is true for square matrices of any order. (See Chapter 7.)

Remark 2: Suppose A and B are invertible. Then AB is invertible and (AB)^(-1) = B^(-1) A^(-1). More generally, if A1, A2, ..., Ak are invertible, then their product is invertible and

   (A1 A2 ··· Ak)^(-1) = Ak^(-1) ··· A2^(-1) A1^(-1)

the product of the inverses in the reverse order.

4.6 SPECIAL TYPES OF SQUARE MATRICES

This section describes a number of special kinds of square matrices which play an important role in linear algebra.

Diagonal Matrices

A square matrix D = (dij) is diagonal if its nondiagonal entries are all zero. Such a matrix is frequently notated as D = diag(d11, d22, ..., dnn), where some or all of the dii may be zero. For example,

   [ 3   0  0 ]      [ 4   0 ]      [ 6            ]
   [ 0  -7  0 ]      [ 0  -5 ]      [    0         ]
   [ 0   0  2 ]                     [       -9     ]
                                    [            1 ]

are diagonal matrices which may be represented, respectively, by

   diag(3, -7, 2)      diag(4, -5)      and      diag(6, 0, -9, 1)

(Observe that patterns of 0's in the third matrix have been omitted.)

Clearly, the sum, scalar product, and product of diagonal matrices are again diagonal. Thus all the n x n diagonal matrices form an algebra of matrices. In fact, the diagonal matrices form a commutative algebra, since any two n x n diagonal matrices commute.

Triangular Matrices

A square matrix A = (aij) is an upper triangular matrix, or simply a triangular matrix, if all entries below the main diagonal are equal to zero, that is, if aij = 0 for i > j. Generic upper triangular matrices of orders 2, 3, and 4 are, respectively,

   [ a11  a12 ]      [ b11  b12  b13 ]      [ c11  c12  c13  c14 ]
   [  0   a22 ]      [      b22  b23 ]      [      c22  c23  c24 ]
                     [           b33 ]      [           c33  c34 ]
                                            [                c44 ]

(As in diagonal matrices, it is common practice to omit patterns of 0's.)

The upper triangular matrices also form an algebra of matrices. In fact,

Theorem 4.3: Suppose A = (aij) and B = (bij) are n x n upper triangular matrices. Then

(i) A + B is upper triangular, with diagonal (a11 + b11, a22 + b22, ..., ann + bnn).
(ii) kA is upper triangular, with diagonal (ka11, ka22, ..., kann).
(iii) AB is upper triangular, with diagonal (a11 b11, a22 b22, ..., ann bnn).
(iv) For any polynomial f(x), the matrix f(A) is upper triangular with diagonal (f(a11), f(a22), ..., f(ann)).
(v) A is invertible if and only if each diagonal element aii ≠ 0.

Analogously, a lower triangular matrix is a square matrix whose entries above the diagonal are all zero, and a theorem analogous to Theorem 4.3 holds for such matrices.

Symmetric Matrices

A real matrix A is symmetric if A^T = A. Equivalently, A = (aij) is symmetric if symmetric elements (mirror images in the diagonal) are equal, i.e., if each aij = aji. (Note that A must be square in order for A^T = A.)

A real matrix A is skew-symmetric if A^T = -A. Equivalently, A = (aij) is skew-symmetric if each aij = -aji. Clearly, the diagonal elements of a skew-symmetric matrix must be zero, since aii = -aii implies aii = 0.

Example 4.6. Consider the matrices

   A = [  2  -3   5 ]      and      B = [  0   3  -4 ]
       [ -3   6   7 ]                   [ -3   0   5 ]
       [  5   7  -8 ]                   [  4  -5   0 ]

and let C be any matrix that is not square.

(a) By inspection, the symmetric elements in A are equal, or A^T = A. Thus A is symmetric.

(b) By inspection, the diagonal elements of B are 0 and symmetric elements are negatives of each other. Thus B is skew-symmetric.

(c) Since C is not square, C is neither symmetric nor skew-symmetric.

If A and B are symmetric matrices, then A + B and kA are symmetric. However, AB need not be symmetric. For example,

   A = [ 1  2 ]   and   B = [ 4  5 ]   are symmetric, but   AB = [ 14  17 ]   is not symmetric.
       [ 2  3 ]             [ 5  6 ]                             [ 23  28 ]

Thus the symmetric matrices do not form an algebra of matrices.

The following theorem is proved in Problem 4.29.

Theorem 4.4: If A is a square matrix, then (i) A + A^T is symmetric; (ii) A - A^T is skew-symmetric; (iii) A = B + C, for some symmetric matrix B and some skew-symmetric matrix C.

Orthogonal Matrices

A real matrix A is said to be orthogonal if AA^T = A^T A = I. Observe that an orthogonal matrix A is necessarily square and invertible, with inverse A^(-1) = A^T.

Example 4.7. Let A = (1/9) [ 1   8   4 ]. Then
                           [ 4  -4   7 ]
                           [ 8   1  -4 ]

   AA^T = (1/81) [ 1 + 64 + 16    4 - 32 + 28     8 + 8 - 16  ]   [ 1  0  0 ]
                 [ 4 - 32 + 28    16 + 16 + 49    32 - 4 - 28 ] = [ 0  1  0 ]
                 [ 8 + 8 - 16     32 - 4 - 28     64 + 1 + 16 ]   [ 0  0  1 ]

This means A^T = A^(-1), and so A^T A = I as well. Thus A is orthogonal.

Consider now an arbitrary 3 x 3 matrix

   A = [ a1  a2  a3 ]
       [ b1  b2  b3 ]
       [ c1  c2  c3 ]

If A is orthogonal, then

   AA^T = [ a1  a2  a3 ] [ a1  b1  c1 ]   [ 1  0  0 ]
          [ b1  b2  b3 ] [ a2  b2  c2 ] = [ 0  1  0 ] = I
          [ c1  c2  c3 ] [ a3  b3  c3 ]   [ 0  0  1 ]

This yields

   a1^2 + a2^2 + a3^2 = 1        a1 b1 + a2 b2 + a3 b3 = 0     a1 c1 + a2 c2 + a3 c3 = 0
   b1 a1 + b2 a2 + b3 a3 = 0     b1^2 + b2^2 + b3^2 = 1        b1 c1 + b2 c2 + b3 c3 = 0
   c1 a1 + c2 a2 + c3 a3 = 0     c1 b1 + c2 b2 + c3 b3 = 0     c1^2 + c2^2 + c3^2 = 1

or, in other words,

   u1 · u1 = 1     u1 · u2 = 0     u1 · u3 = 0
   u2 · u1 = 0     u2 · u2 = 1     u2 · u3 = 0
   u3 · u1 = 0     u3 · u2 = 0     u3 · u3 = 1

where u1 = (a1, a2, a3), u2 = (b1, b2, b3), u3 = (c1, c2, c3) are the rows of A. Thus the rows u1, u2, and u3 are orthogonal to each other and have unit lengths; or, in other words, u1, u2, u3 form an orthonormal set of vectors. The condition A^T A = I similarly shows that the columns of A form an orthonormal set of vectors. Furthermore, since each step is reversible, the converse is true.

The above result for 3 x 3 matrices is true in general. That is,

Theorem 4.5: Let A be a real matrix. Then the following are equivalent: (a) A is orthogonal; (b) the rows of A form an orthonormal set; (c) the columns of A form an orthonormal set.

For n = 2, we have the following result, proved in Problem 4.33.

Theorem 4.6: Every 2 x 2 orthogonal matrix has the form

   [  cos θ   sin θ ]        or        [ cos θ    sin θ ]
   [ -sin θ   cos θ ]                  [ sin θ   -cos θ ]

for some real number θ.

Remark: The condition that vectors u1, u2, ..., um form an orthonormal set may be described simply by ui · uj = δij, where δij is the Kronecker delta [Example 4.3(a)].
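Theorem 4.5 suggests the natural numerical test for orthogonality: form AA^T and compare it with I. A sketch (Python with NumPy, assumed for the illustration) using the matrix of Example 4.7:

    import numpy as np

    A = np.array([[1, 8, 4], [4, -4, 7], [8, 1, -4]]) / 9.0

    assert np.allclose(A @ A.T, np.eye(3))      # rows form an orthonormal set
    assert np.allclose(A.T @ A, np.eye(3))      # columns form an orthonormal set
    assert np.allclose(np.linalg.inv(A), A.T)   # A^(-1) = A^T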

Normal Matrices

A real matrix A is normal if A commutes with its transpose, that is, if AA^T = A^T A. Clearly, if A is symmetric, orthogonal, or skew-symmetric, then A is normal. These, however, are not the only normal matrices.

Example 4.8. Let A = [ 6  -3 ]. Then
                     [ 3   6 ]

   AA^T = [ 6  -3 ] [  6  3 ]   [ 45   0 ]        A^T A = [  6  3 ] [ 6  -3 ]   [ 45   0 ]
          [ 3   6 ] [ -3  6 ] = [  0  45 ]                [ -3  6 ] [ 3   6 ] = [  0  45 ]

Since AA^T = A^T A, the matrix A is normal.

The following theorem, proved in Problem 4.35, completely characterizes real 2 x 2 normal matrices.

Theorem 4.7: Let A be a real 2 x 2 normal matrix. Then A is either symmetric or the sum of a scalar matrix and a skew-symmetric matrix.

4.7 COMPLEX MATRICES

Let A be a complex matrix, i.e., a matrix whose entries are complex numbers. Recall (Section 2.9) that if z = a + bi is a complex number, then its conjugate is z̄ = a - bi. The conjugate of the complex matrix A, written Ā, is the matrix obtained from A by taking the conjugate of each entry in A; that is, if A = (aij), then Ā = (bij), where bij = āij.

The two operations of transpose and conjugation commute for any complex matrix A; that is, the conjugate of A^T equals the transpose of Ā. In fact, the special notation A^H is used for this conjugate transpose of A. (Note that if A is real, then A^H = A^T.)


Example 4.9. Let A = [ 2 + 8i    5 - 3i    4 - 7i ]. Then
                     [   6i      1 - 4i    3 + 2i ]

   A^H = [ 2 - 8i     -6i    ]
         [ 5 + 3i    1 + 4i  ]
         [ 4 + 7i    3 - 2i  ]

Hermitian, Unitary, and Normal Complex Matrices

A square complex matrix A is said to be Hermitian or skew-Hermitian according as

   A^H = A     or     A^H = -A

If A = (aij) is Hermitian, then aij = āji, and hence each diagonal element aii must be real. Similarly, if A is skew-Hermitian, then aij = -āji, and hence each diagonal element aii must be purely imaginary (its real part is 0).

A square complex matrix A is said to be unitary if

   A^H A = A A^H = I,     that is, if     A^H = A^(-1)

A complex matrix A is unitary if and only if its rows (columns) form an orthonormal set of vectors relative to the inner product of complex vectors. (See Problem 4.39.)

Note that when A is real, Hermitian is the same as symmetric, and unitary is the same as orthogonal.

A square complex matrix A is said to be normal if

   AA^H = A^H A

This definition reduces to the one for real matrices when A is real.

Example 4.10. Consider the following matrices:

   A = (1/2) [   1       -i      -1 + i ]     B = [   3      1 - 2i    4 + 7i ]     C = [ 2 + 3i      1    ]
             [   i        1       1 + i ]         [ 1 + 2i     -4       -2i   ]         [    i      1 + 2i ]
             [ 1 + i   -1 + i       0   ]         [ 4 - 7i     2i        2    ]

(a) A is unitary if A^H = A^(-1), or if AA^H = A^H A = I. As noted previously, we need only show that AA^H = I:

   AA^H = (1/4) [ 1 + 1 + 2              -i - i + 2i             1 - i + i - 1 + 0 ]   [ 1  0  0 ]
                [ i + i - 2i              1 + 1 + 2              i + 1 - 1 - i + 0 ] = [ 0  1  0 ]
                [ 1 + i - i - 1 + 0      -i + 1 - 1 + i + 0      2 + 2 + 0         ]   [ 0  0  1 ]

Accordingly, A is unitary.

(b) B is Hermitian, since its diagonal elements 3, -4, and 2 are real, and the symmetric elements, 1 - 2i and 1 + 2i, 4 + 7i and 4 - 7i, and -2i and 2i, are conjugates.

(c) To show that C is normal, evaluate CC^H and C^H C:

   CC^H = [ 2 + 3i      1    ] [ 2 - 3i     -i    ]   [   14      4 - 4i ]
          [    i      1 + 2i ] [    1      1 - 2i ] = [ 4 + 4i      6    ]

and similarly C^H C = [ 14, 4 - 4i; 4 + 4i, 6 ]. Since CC^H = C^H C, the complex matrix C is normal.


4.8 SQUARE BLOCK MATRICES

A block matrix A is called a square block matrix if (i) A is a square matrix, (ii) the blocks form a square matrix, and (iii) the diagonal blocks are also square matrices. The latter two conditions will occur if and only if there are the same number of horizontal and vertical lines and they are placed symmetrically. Consider the following two block matrices:

   A = [ 1  2 | 3 | 4  5 ]          B = [ 1  2 | 3  4 | 5 ]
       [ 1  1 | 1 | 1  1 ]              [ 1  1 | 1  1 | 1 ]
       [ -----+---+----- ]              [ -----+------+-- ]
       [ 9  8 | 7 | 6  5 ]              [ 9  8 | 7  6 | 5 ]
       [ 4  4 | 4 | 4  4 ]              [ 4  4 | 4  4 | 4 ]
       [ -----+---+----- ]              [ -----+------+-- ]
       [ 3  5 | 3 | 5  3 ]              [ 3  5 | 3  5 | 3 ]

Block matrix A is not a square block matrix, since the second and third diagonal blocks are not square matrices. On the other hand, block matrix B is a square block matrix.

A block diagonal matrix M is a square block matrix where the nondiagonal blocks are all zero matrices. The importance of block diagonal matrices is that the algebra of the block matrix is frequently reduced to the algebra of the individual blocks. Specifically, suppose M is a block diagonal matrix and f(x) is any polynomial. Then M and f(M) have the following form:

   M = [ A11               ]          f(M) = [ f(A11)                  ]
       [      A22          ]                 [         f(A22)          ]
       [           ...     ]                 [                 ...     ]
       [               Arr ]                 [                 f(Arr)  ]

(As usual, we use blank spaces for patterns of zeros or zero blocks.)

Analogously, a square block matrix is called a block upper triangular matrix if the blocks below the diagonal are zero matrices, and a block lower triangular matrix if the blocks above the diagonal are zero matrices.

4.9 ELEMENTARY MATRICES AND APPLICATIONS

First recall (Section 1.8) the following operations on a matrix A, called elementary row operations:

[E1] (Row-interchange) Interchange the ith row and the jth row:  Ri ↔ Rj

[E2] (Row-scaling) Multiply the ith row by a nonzero scalar k:  kRi → Ri  (k ≠ 0)

[E3] (Row-addition) Replace the ith row by k times the jth row plus the ith row:  kRj + Ri → Ri

Each of the above operations has an inverse operation of the same type. Specifically (Problem 4.19):

(1) Ri ↔ Rj is its own inverse.
(2) kRi → Ri and k^(-1) Ri → Ri are inverses.
(3) kRj + Ri → Ri and -kRj + Ri → Ri are inverses.

Also recall (Section 1.8) that a matrix B is said to be row equivalent to a matrix A, written A ~ B, if B can be obtained from A by a finite sequence of elementary row operations. Since the elementary row operations are reversible, row equivalence is an equivalence relation; that is, (a) A ~ A; (b) if A ~ B, then B ~ A; (c) if A ~ B and B ~ C, then A ~ C. We also restate the following basic result on row equivalence:

Theorem 4.8: Every matrix A is row equivalent to a unique matrix in row canonical form.

Elementary Matrices

Let e denote an elementary row operation and let e(A) denote the result of applying the operation e to a matrix A. The matrix E obtained by applying e to the identity matrix,

   E = e(I)

is called the elementary matrix corresponding to the elementary row operation e.

Example 4.11. The 3-square elementary matrices corresponding to the elementary row operations R2 ↔ R3, -6R2 → R2, and -4R1 + R3 → R3 are, respectively,

   E1 = [ 1  0  0 ]      E2 = [ 1   0  0 ]      E3 = [  1  0  0 ]
        [ 0  0  1 ]           [ 0  -6  0 ]           [  0  1  0 ]
        [ 0  1  0 ]           [ 0   0  1 ]           [ -4  0  1 ]

The following theorem, proved in Problem 4.18, shows the fundamental relationship between the elementary row operations and their corresponding elementary matrices.

Theorem 4.9: Let e be an elementary row operation and E the corresponding m-square elementary matrix, i.e., E = e(I_m). Then, for any m x n matrix A, e(A) = EA.

That is, the result of applying an elementary row operation e to a matrix A can be obtained by premultiplying A by the corresponding elementary matrix E.

Now suppose e' is the inverse of an elementary row operation e. Let E and E' be the corresponding matrices. We prove in Problem 4.19 that E is invertible and E' is its inverse. This means, in particular, that any product of elementary matrices is nonsingular.

Using Theorem 4.9, we are also able to prove (Problem 4.20) the following fundamental result on invertible matrices.

Theorem 4.10: Let A be a square matrix. Then the following are equivalent:

(i) A is invertible (nonsingular);
(ii) A is row equivalent to the identity matrix I;
(iii) A is a product of elementary matrices.

We also use Theorem 4.9 to prove the following theorems.

Theorem 4.11: If AB = I, then BA = I and hence B = A^(-1).

Theorem 4.12: B is row equivalent to A if and only if there exists a nonsingular matrix P such that B = PA.
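Theorem 4.9 is easy to test numerically. In the sketch below (Python with NumPy; row_add is our own illustrative helper, not a library routine), applying -2R1 + R2 → R2 to a matrix A agrees with premultiplying A by the elementary matrix E = e(I):

    import numpy as np

    def row_add(M, k, j, i):
        """Apply the elementary row operation k R_j + R_i -> R_i."""
        M = np.array(M, dtype=float)
        M[i] += k * M[j]
        return M

    A = np.array([[1, 0, 2], [2, -1, 3], [4, 1, 8]])
    E = row_add(np.eye(3), -2, 0, 1)                   # E = e(I)
    assert np.allclose(row_add(A, -2, 0, 1), E @ A)    # Theorem 4.9: e(A) = EA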


Application to Finding Inverses

Suppose a matrix A is invertible and, say, it is row reducible to the identity matrix I by the sequence of elementary operations e1, e2, ..., eq. Let Ei be the elementary matrix corresponding to the operation ei. Then, by Theorem 4.9,

   Eq ··· E2 E1 A = I     or     (Eq ··· E2 E1 I)A = I     so     A^(-1) = Eq ··· E2 E1 I

In other words, A^(-1) can be obtained by applying the elementary row operations e1, e2, ..., eq to the identity matrix I.

The above discussion leads us to the following (Gaussian elimination) algorithm, which either finds the inverse of an n-square matrix A or determines that A is not invertible.

Algorithm 4.9: Inverse of a matrix A

Step 1. Form the n x 2n [block] matrix M = (A | I); that is, A is in the left half of M and the identity matrix I is in the right half of M.

Step 2. Row reduce M to echelon form. If the process generates a zero row in the A-half of M, STOP (A is not invertible). Otherwise, the A-half will assume triangular form.

Step 3. Further row reduce M to the row canonical form (I | B), where I has replaced A in the left half of the matrix.

Step 4. Set A^(-1) = B.

Example 4.12. Suppose we want to find the inverse of A = [ 1   0  2 ]. First we form the block matrix
                                                         [ 2  -1  3 ]
                                                         [ 4   1  8 ]

M = (A | I) and reduce M to echelon form:

   [ 1   0  2 | 1  0  0 ]     [ 1   0   2 |  1  0  0 ]     [ 1   0   2 |  1  0  0 ]
   [ 2  -1  3 | 0  1  0 ]  ~  [ 0  -1  -1 | -2  1  0 ]  ~  [ 0  -1  -1 | -2  1  0 ]
   [ 4   1  8 | 0  0  1 ]     [ 0   1   0 | -4  0  1 ]     [ 0   0  -1 | -6  1  1 ]

In echelon form, the left half of M is in triangular form; hence A is invertible. Next we further reduce M to its row canonical form:

   [ 1   0  0 | -11   2   2 ]     [ 1  0  0 | -11   2   2 ]
   [ 0  -1  0 |   4   0  -1 ]  ~  [ 0  1  0 |  -4   0   1 ]
   [ 0   0 -1 |  -6   1   1 ]     [ 0  0  1 |   6  -1  -1 ]

The identity matrix is in the left half of the final matrix; hence the right half is A^(-1). In other words,

   A^(-1) = [ -11   2   2 ]
            [  -4   0   1 ]
            [   6  -1  -1 ]
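Algorithm 4.9 translates almost line for line into code. The sketch below (Python with NumPy; the row interchange used to secure a nonzero pivot is an added safeguard beyond the algorithm as stated) recovers the inverse computed in Example 4.12:

    import numpy as np

    def inverse(A):
        """Row reduce the block matrix (A | I) to (I | A^(-1))."""
        n = len(A)
        M = np.hstack([np.array(A, dtype=float), np.eye(n)])
        for i in range(n):
            p = i + np.argmax(np.abs(M[i:, i]))    # find a usable pivot row
            if np.isclose(M[p, i], 0.0):
                raise ValueError("matrix is not invertible")
            M[[i, p]] = M[[p, i]]                  # interchange rows if needed
            M[i] /= M[i, i]                        # scale the pivot row
            for r in range(n):                     # clear the pivot column
                if r != i:
                    M[r] -= M[r, i] * M[i]
        return M[:, n:]

    print(inverse([[1, 0, 2], [2, -1, 3], [4, 1, 8]]))
    # [[-11.   2.   2.]
    #  [ -4.   0.   1.]
    #  [  6.  -1.  -1.]]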

4.10 ELEMENTARY COLUMN OPERATIONS, MATRIX EQUIVALENCE

This section repeats some of the discussion of the preceding section using the columns of a matrix instead of the rows. (The choice of first using rows comes from the fact that the row operations are closely related to the operations with linear equations.) We also show the relationship between the row and column operations and their elementary matrices.

The elementary column operations which are analogous to the elementary row operations are as follows:


[F1] (Column-interchange) Interchange the ith column and the jth column:  Ci ↔ Cj

[F2] (Column-scaling) Multiply the ith column by a nonzero scalar k:  kCi → Ci  (k ≠ 0)

[F3] (Column-addition) Replace the ith column by k times the jth column plus the ith column:  kCj + Ci → Ci

Each of the above operations has an inverse operation of the same type, just like the corresponding row operations.

Let f denote an elementary column operation. The matrix F, obtained by applying f to the identity matrix, that is,

   F = f(I)

is called the elementary matrix corresponding to the elementary column operation f.

Example 4.13. The 3-square elementary matrices corresponding to the elementary column operations C3 ↔ C1, -2C3 → C3, and -5C2 + C3 → C3 are, respectively,

   F1 = [ 0  0  1 ]      F2 = [ 1  0   0 ]      F3 = [ 1  0   0 ]
        [ 0  1  0 ]           [ 0  1   0 ]           [ 0  1  -5 ]
        [ 1  0  0 ]           [ 0  0  -2 ]           [ 0  0   1 ]

Throughout the discussion below, e and f will denote, respectively, corresponding elementary row and column operations, and E and F will denote the corresponding elementary matrices.

Lemma 4.13: Suppose A is any matrix. Then

   f(A) = [e(A^T)]^T

that is, applying the column operation f to a matrix A gives the same result as applying the corresponding row operation e to A^T and then taking the transpose.

The proof of the lemma follows directly from the fact that the columns of A are the rows of A^T, and vice versa. The lemma shows that

   F = f(I) = [e(I^T)]^T = [e(I)]^T = E^T

In other words,

Corollary 4.14: F is the transpose of E.

(Thus F is invertible, since E is invertible.) Also, by the above lemma,

   f(A) = [e(A^T)]^T = [EA^T]^T = (A^T)^T E^T = AF

This proves the following theorem (which is analogous to Theorem 4.9 for the elementary row operations):

Theorem 4.15: For any matrix A, f(A) = AF.

That is, the result of applying an elementary column operation f on a matrix A can be obtained by postmultiplying A by the corresponding elementary matrix F.


A matrix B is said to be column equivalent to a matrix A if B can be obtained from A by a sequence of elementary column operations. Using an argument analogous to that for Theorem 4.12 yields:

Theorem 4.16: B is column equivalent to A if and only if there exists a nonsingular matrix Q such that B = AQ.

Matrix Equivalence

A matrix B is said to be equivalent to a matrix A if B can be obtained from A by a finite sequence of elementary row and column operations. Alternatively (Problem 4.23), B is equivalent to A if there exist nonsingular matrices P and Q such that B = PAQ. Just like row equivalence and column equivalence, equivalence of matrices is an equivalence relation.

The main result of this subsection, proved in Problem 4.25, is as follows:

Theorem 4.17: Every m x n matrix A is equivalent to a unique block matrix of the form

   [ I_r  0 ]
   [  0   0 ]

where I_r is the r x r identity matrix. (The nonnegative integer r is called the rank of A.)

4.11 CONGRUENT SYMMETRIC MATRICES, LAW OF INERTIA

A matrix B is said to be congruent to a matrix A if there exists a nonsingular (invertible) matrix P such that

   B = P^T A P

By Problem 4.123, congruence is an equivalence relation. Suppose A is symmetric, i.e., A^T = A. Then

   B^T = (P^T A P)^T = P^T A^T (P^T)^T = P^T A P = B

and so B is symmetric. Since diagonal matrices are special symmetric matrices, it follows that only symmetric matrices are congruent to diagonal matrices.

The next theorem plays an important role in linear algebra.

Theorem 4.18 (Law of Inertia): Let A be a real symmetric matrix. Then there exists a nonsingular matrix P such that B = P^T A P is diagonal. Moreover, every such diagonal matrix B has the same number p of positive entries and the same number n of negative entries.

The rank and signature of the above real symmetric matrix A are denoted and defined, respectively, by

   rank A = p + n     and     sig A = p - n

These are uniquely defined by Theorem 4.18. [The notion of rank is actually defined for any matrix (Section 5.7), and the above definition agrees with the general definition.]

Diagonalization Algorithm

The following is an algorithm which diagonalizes (under congruence) a real symmetric matrix A = (aij).


Algorithm 4.11: Congruence diagonalization of a symmetric matrix

Step 1. Form the n x 2n [block] matrix M = (A | I); that is, A is the left half of M and the identity matrix I is the right half of M.

Step 2. Examine the entry a11.

Case I: a11 ≠ 0. Apply the row operations -a_i1 R1 + a11 Ri → Ri, i = 2, ..., n, and then apply the corresponding column operations -a_i1 C1 + a11 Ci → Ci (to the left half of M) to reduce the left half of M to the form

   [ a11  0 ]
   [  0   B ]

Case II: a11 = 0 but a_ii ≠ 0, for some i > 1. Apply the row operation R1 ↔ Ri and then the corresponding column operation C1 ↔ Ci to bring a_ii into the first diagonal position. This reduces the matrix to Case I.

Case III: All diagonal entries a_ii = 0. Choose i, j such that a_ij ≠ 0, and apply the row operation Rj + Ri → Ri and the corresponding column operation Cj + Ci → Ci to bring 2a_ij ≠ 0 into the ith diagonal position. This reduces the matrix to Case II.

In each of the cases, we finally reduce M to a form whose left half contains a11 followed by a symmetric matrix B of order less than that of A.

Remark: The row operations will change both halves of M, but the column operations will only change the left half of M.

Step 3. Repeat Step 2 with each new matrix (neglecting the first row and column of the preceding matrix) until A is diagonalized, that is, until M is transformed into the form M' = (D | Q), where D is diagonal.

Step 4. Set P = Q^T. Then D = P^T A P.

The justification of the above algorithm is as follows. Let e1, e2, ..., ek be all the elementary row operations in the algorithm and let f1, f2, ..., fk be the corresponding elementary column operations. Suppose Ei and Fi are the corresponding elementary matrices. By Corollary 4.14,

   Fi = Ei^T

By the above algorithm,

   Q = Ek ··· E2 E1 I

since the right half I of M is only changed by the row operations. On the other hand, the left half A of M is changed by both the row and column operations; therefore,

   D = Ek ··· E2 E1 A F1 F2 ··· Fk = (Ek ··· E2 E1) A (Ek ··· E2 E1)^T = Q A Q^T = P^T A P

where P = Q^T.

Example 4.14. Suppose A = [  1   2  -3 ], a symmetric matrix. To find a nonsingular matrix P such that
                          [  2   5  -4 ]
                          [ -3  -4   8 ]

B = P^T A P is diagonal, first form the block matrix (A | I):

   (A | I) = [  1   2  -3 | 1  0  0 ]
             [  2   5  -4 | 0  1  0 ]
             [ -3  -4   8 | 0  0  1 ]

Apply the operations -2R1 + R2 → R2 and 3R1 + R3 → R3 to (A | I), and then the corresponding operations -2C1 + C2 → C2 and 3C1 + C3 → C3 to A, to obtain

   [ 1   2  -3 |  1  0  0 ]                  [ 1  0   0 |  1  0  0 ]
   [ 0   1   2 | -2  1  0 ]    and then      [ 0  1   2 | -2  1  0 ]
   [ 0   2  -1 |  3  0  1 ]                  [ 0  2  -1 |  3  0  1 ]

Next apply the operation -2R2 + R3 → R3 and then the corresponding operation -2C2 + C3 → C3 to obtain

   [ 1  0   0 |  1   0  0 ]                  [ 1  0   0 |  1   0  0 ]
   [ 0  1   2 | -2   1  0 ]    and then      [ 0  1   0 | -2   1  0 ]
   [ 0  0  -5 |  7  -2  1 ]                  [ 0  0  -5 |  7  -2  1 ]

Now A has been diagonalized. Set

   P = [ 1  -2   7 ]       and then       B = P^T A P = [ 1  0   0 ]
       [ 0   1  -2 ]                                    [ 0  1   0 ]
       [ 0   0   1 ]                                    [ 0  0  -5 ]

Note that B has p = 2 positive entries and n = 1 negative entry.
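The result is immediate to verify numerically; the sketch below (Python with NumPy, assumed for the illustration) evaluates P^T A P for the matrix P just found:

    import numpy as np

    A = np.array([[1, 2, -3], [2, 5, -4], [-3, -4, 8]], dtype=float)
    P = np.array([[1, -2, 7], [0, 1, -2], [0, 0, 1]], dtype=float)

    print(P.T @ A @ P)    # diag(1, 1, -5): p = 2 positive and n = 1 negative entry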

4.12 QUADRATIC FORMS

A quadratic form q in variables x1, x2, ..., xn is a polynomial

   q(x1, x2, ..., xn) = Σ (i ≤ j) cij xi xj        (4.1)

(where each term has degree two). The quadratic form q is said to be diagonalized if

   q(x1, x2, ..., xn) = c11 x1^2 + c22 x2^2 + ... + cnn xn^2

that is, if q has no cross product terms xi xj (where i ≠ j).

The quadratic form (4.1) may be expressed uniquely in the matrix form

   q(X) = X^T A X        (4.2)

where X = (x1, x2, ..., xn)^T and A = (aij) is a symmetric matrix. The entries of A can be obtained from (4.1) by setting

   aii = cii     and     aij = aji = cij/2     (for i ≠ j)

that is, A has diagonal entry aii equal to the coefficient of xi^2 and has entries aij and aji each equal to half the coefficient of xi xj. Thus

   q(X) = (x1, ..., xn) [ a11  a12  ...  a1n ] [ x1 ]
                        [ a21  a22  ...  a2n ] [ x2 ]
                        [ .................. ] [ .. ]
                        [ an1  an2  ...  ann ] [ xn ]

        = Σ (i, j) aij xi xj = a11 x1^2 + a22 x2^2 + ... + ann xn^2 + 2 Σ (i < j) aij xi xj

The above symmetric matrix A is called the matrix representation of the quadratic form q. Although many matrices A in (4.2) will yield the same quadratic form q, only one such matrix is symmetric.


Conversely, any symmetric matrix A defines a quadratic form q by (4.2). Thus there is a one-to-one correspondence between quadratic forms q and symmetric matrices A. Furthermore, a quadratic form q is diagonalized if and only if the corresponding symmetric matrix A is diagonal.

Example 4.15

(a) The quadratic form

   q(x, y, z) = x^2 - 6xy + 8y^2 - 4xz + 5yz + 7z^2

may be expressed in the matrix form

   q(x, y, z) = (x, y, z) [  1   -3   -2  ] [ x ]
                          [ -3    8   5/2 ] [ y ]
                          [ -2   5/2   7  ] [ z ]

where the defining matrix is symmetric. The quadratic form may also be expressed in the matrix form

   q(x, y, z) = (x, y, z) [ 1  -6  -4 ] [ x ]
                          [ 0   8   5 ] [ y ]
                          [ 0   0   7 ] [ z ]

where the defining matrix is upper triangular.

(b) The symmetric matrix [ 2  3 ] determines the quadratic form
                         [ 3  5 ]

   q(x, y) = (x, y) [ 2  3 ] [ x ]  =  2x^2 + 6xy + 5y^2
                    [ 3  5 ] [ y ]

Remark: For theoretical reasons, we will always assume a quadratic form q is represented by a symmetric matrix A. Since A is obtained from q by division by 2, we must also assume 1 + 1 ≠ 0 in our field K. This is always true when K is the real field R or the complex field C.

Change-of-Variable Matrix

Consider a change of variables, say from x1, x2, ..., xn to y1, y2, ..., yn, by means of an invertible linear substitution of the form

   xi = pi1 y1 + pi2 y2 + ... + pin yn     (i = 1, 2, ..., n)

(Here invertible means that one can solve for each of the y's uniquely in terms of the x's.) Such a linear substitution can be expressed in the matrix form

   X = PY        (4.3)

where

   X = (x1, x2, ..., xn)^T,     Y = (y1, y2, ..., yn)^T,     and     P = (pij)

The matrix P is called the change-of-variable matrix; it is nonsingular, since the linear substitution is invertible.

Conversely, any nonsingular matrix P defines an invertible linear substitution of variables, X = PY. Furthermore,

   Y = P^(-1) X

yields the formula for the y's in terms of the x's.


There is a geometrical interpretation of the change-of-variable matrix P, which is illustrated in the next example.

Example 4.16. Consider the cartesian plane R^2 with the usual x and y axes, and consider the 2 x 2 nonsingular matrix

   P = [ 2  -1 ]
       [ 1   1 ]

The columns u1 = (2, 1)^T and u2 = (-1, 1)^T of P determine a new coordinate system of the plane, say with s and t axes, as shown in Fig. 4-1. That is,

(1) The s axis is in the direction of u1 and its unit length is the length of u1.
(2) The t axis is in the direction of u2 and its unit length is the length of u2.

Any point Q in the plane will have coordinates relative to each coordinate system, say Q(a, b) relative to the x and y axes and Q(a', b') relative to the s and t axes. These coordinate vectors are related by the matrix P. Specifically,

   [ a ]   [ 2  -1 ] [ a' ]
   [ b ] = [ 1   1 ] [ b' ]        or        X = PY

where X = (a, b)^T and Y = (a', b')^T.

Fig. 4-1

Diagonalizing a Quadratic Form

Consider a quadratic form q in variables x1, x2, ..., xn, say q(X) = X^T A X (where A is a symmetric matrix). Suppose a change of variables is made in q using the linear substitution (4.3). Setting X = PY in q yields the quadratic form q(Y) = (PY)^T A (PY) = Y^T (P^T A P) Y. Thus B = P^T A P is the matrix representation of the quadratic form in the new variables y1, y2, ..., yn. Observe that the new matrix B is congruent to the original matrix A representing q.

The above linear substitution X = PY is said to diagonalize the quadratic form q(X) if q(Y) is diagonal, i.e., if B = P^T A P is a diagonal matrix. Since B is congruent to A and A is a symmetric matrix, Theorem 4.18 may be restated as follows.

Theorem 4.19 (Law of Inertia): Let q(X) = X^T A X be a real quadratic form. Then there exists an invertible linear substitution X = PY which diagonalizes q. Moreover, every such diagonal representation of q has the same number p of positive entries and the same number n of negative entries.


The rank and signature of a real quadratic form q are denoted and defined by

   rank q = p + n     and     sig q = p - n

These are uniquely defined by Theorem 4.19.

Since diagonalizing a quadratic form is the same as diagonalizing under congruence a symmetric matrix, Algorithm 4.11 may be used here.

Example 4.17. Consider the quadratic form

    q(x, y, z) = x² + 4xy + 5y² - 6xz - 8yz + 8z²        (1)

The (symmetric) matrix A which represents q is as follows:

    A = [  1   2  -3 ]
        [  2   5  -4 ]
        [ -3  -4   8 ]

From Example 4.14, the following nonsingular matrix P diagonalizes the matrix A under congruence:

    P = [ 1  -2   7 ]                              [ 1  0   0 ]
        [ 0   1  -2 ]        and    B = PᵀAP =     [ 0  1   0 ]
        [ 0   0   1 ]                              [ 0  0  -5 ]

Accordingly, q may be diagonalized by the following linear substitution:

    x = r - 2s + 7t
    y = s - 2t
    z = t

Specifically, substituting for x, y, and z in (1) yields the quadratic form

    q(r, s, t) = r² + s² - 5t²        (2)

Here p = 2 and n = 1; hence

    rank q = 3        and        sig q = 1
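Example 4.17 can be verified numerically; the following NumPy snippet (our addition) checks that PᵀAP is the stated diagonal matrix:

    import numpy as np

    A = np.array([[ 1,  2, -3],
                  [ 2,  5, -4],
                  [-3, -4,  8]])
    P = np.array([[1, -2,  7],
                  [0,  1, -2],
                  [0,  0,  1]])
    B = P.T @ A @ P
    print(B)        # diag(1, 1, -5), so rank q = 3 and sig q = 1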

Remark: There is a geometrical interpretation of the Law of Inertia (Theorem 4.19) which we give here using the quadratic form q in Example 4.17. Consider the following surface S in R³:

    q(x, y, z) = x² + 4xy + 5y² - 6xz - 8yz + 8z² = 25

Under the change of variables

    x = r - 2s + 7t,    y = s - 2t,    z = t

or, equivalently, relative to a new coordinate system with r, s, and t axes, the equation of S becomes

    q(r, s, t) = r² + s² - 5t² = 25

Accordingly, S is a hyperboloid of one sheet, since there are two positive entries and one negative entry on the diagonal. Furthermore, S will always be a hyperboloid of one sheet regardless of the coordinate system. Thus any diagonal representation of the quadratic form q(x, y, z) will contain two positive entries and one negative entry on the diagonal.


Positive Definite Symmetric Matrices and Quadratic Forms

A real symmetric matrix A is said to be positive definite if

    XᵀAX > 0

for every nonzero (column) vector X in Rⁿ. Analogously, a quadratic form q is said to be positive definite if q(v) > 0 for every nonzero vector v in Rⁿ.

Alternatively, a real symmetric matrix A (or its quadratic form q) is positive definite if every diagonal representation has only positive diagonal entries. Such matrices and quadratic forms play a very important role in linear algebra. They are considered in Problems 4.54-4.60.
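One standard numerical test, added here for illustration and not the method used in the problems below, relies on the fact that a real symmetric matrix is positive definite exactly when all its eigenvalues are positive (equivalently, when every diagonal representation has positive entries):

    import numpy as np

    # A sketch assuming a real symmetric input; eigvalsh is NumPy's
    # eigenvalue routine for symmetric/Hermitian matrices.
    def is_positive_definite(A, tol=1e-12):
        return bool(np.all(np.linalg.eigvalsh(A) > tol))

    A = np.array([[ 2.0, -1.0],
                  [-1.0,  2.0]])       # q = 2x² - 2xy + 2y²
    print(is_positive_definite(A))     # True (eigenvalues 1 and 3)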

4.13 SIMILARITY

A function f: Rⁿ → Rⁿ may be viewed geometrically as "sending" or "mapping" each point Q into a point f(Q) in the space Rⁿ. Suppose the function f can be represented in the form

    f(Q) = AQ

where A is an n × n matrix and the coordinates of Q are written as a column vector. Furthermore, suppose P is a nonsingular matrix which may be viewed as introducing a new coordinate system in the space Rⁿ. (See Example 4.16.) Relative to this new coordinate system, we prove that f is represented by the matrix

    B = P⁻¹AP

that is,

    f(Q′) = BQ′

where Q′ denotes the column vector of the coordinates of Q relative to the new coordinate system.

Example 4.18. Consider the function f: R² → R² defined by

    f(x, y) = (3x - 4y, 5x + 2y)

or, equivalently, f(X) = AX, where A = [[3, -4],[5, 2]] and X = (x, y)ᵀ. Suppose a new coordinate system with s and t axes is introduced in R² by means of the nonsingular matrix

    P = [ 2  -1 ]        with        P⁻¹ = (1/3) [  1  1 ]
        [ 1   1 ]                                [ -1  2 ]

(See Fig. 4-1.) Relative to this new coordinate system of R², the function f may be represented as f(Y) = BY, where Y = (s, t)ᵀ and

    B = P⁻¹AP = (1/3) [  1  1 ] [ 3  -4 ] [ 2  -1 ]  =  (1/3) [ 14  -10 ]
                      [ -1  2 ] [ 5   2 ] [ 1   1 ]           [ 22    1 ]

In other words,

    f(s, t) = ((14s - 10t)/3, (22s + t)/3)
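The computation of B = P⁻¹AP in Example 4.18 can be checked numerically (our sketch; solving with P instead of forming P⁻¹ explicitly is a standard numerical precaution):

    import numpy as np

    A = np.array([[3.0, -4.0],
                  [5.0,  2.0]])
    P = np.array([[2.0, -1.0],
                  [1.0,  1.0]])
    B = np.linalg.solve(P, A @ P)      # P^{-1}(AP)
    print(B)                           # [[14/3, -10/3], [22/3, 1/3]]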

The above discussion leads us to the following:

Definition: A matrix B is similar to a matrix A if there exists a nonsingular matrix P such that

    B = P⁻¹AP

Similarity, like congruence, is an equivalence relation (Problem 4.125); hence we say that A and B are similar matrices when B = P⁻¹AP.

A matrix A is said to be diagonalizable if there exists a nonsingular matrix P such that B = P⁻¹AP is a diagonal matrix. The question of whether or not a given matrix A is diagonalizable, and of finding the matrix P when A is diagonalizable, plays an important role in linear algebra. These questions will be addressed in Chapter 8.

4.14 LU FACTORIZATION

Suppose A is a nonsingular matrix which can be brought into (upper) triangular form U without using any row-interchange operations, that is, suppose A can be triangularized by the following algorithm, which we write using computer algorithmic notation.

Algorithm 4.14: Triangularizing matrix A = (a_ij)

    Step 1. Repeat for i = 1, 2, ..., n - 1:
    Step 2.     Repeat for j = i + 1, ..., n:
                    (a) Set m_ji := -a_ji/a_ii
                    (b) Set R_j := m_ji R_i + R_j
                [End of Step 2 inner loop.]
            [End of Step 1 outer loop.]
    Step 3. Exit.

The numbers m_ji are called multipliers. Sometimes we keep track of these multipliers by means of the following lower triangular matrix L:

    L = [   1       0       0     ...     0          0 ]
        [ -m_21     1       0     ...     0          0 ]
        [ -m_31   -m_32     1     ...     0          0 ]
        [  ...................................        ]
        [ -m_n1   -m_n2   -m_n3   ...  -m_n,n-1      1 ]

That is, L has 1s on the diagonal, 0s above the diagonal, and the negative of the multiplier m_ji as its (j, i) entry below the diagonal.

The above lower triangular matrix L may be alternatively described as follows. Let e1, e2, ..., ek denote the sequence of elementary row operations in the above algorithm. The inverses of these operations are as follows: for i = 1, 2, ..., n - 1, we have

    -m_ji R_i + R_j → R_j        (j = i + 1, ..., n)

Applying these inverse operations in reverse order to the identity matrix I yields the matrix L. Thus

    L = E1⁻¹E2⁻¹ ··· Ek⁻¹ I

where E1, ..., Ek are the elementary matrices corresponding to the elementary operations e1, ..., ek. On the other hand, the elementary operations e1, ..., ek transform the original matrix A into the upper triangular matrix U. Thus Ek ··· E2E1A = U. Accordingly,

    A = (E1⁻¹E2⁻¹ ··· Ek⁻¹)U = (E1⁻¹E2⁻¹ ··· Ek⁻¹I)U = LU


This gives us the classical LU factorization of such a matrix A. We formally state this result as a theorem.

Theorem 4.20: Let A be a matrix as above. Then A = LU, where L is a lower triangular matrix with 1s on the diagonal and U is an upper triangular matrix with no 0s on the diagonal.

Remark: We emphasize that the above theorem only applies to nonsingular matrices A which can be brought into triangular form without any row interchanges. Such matrices are said to be LU-factorable or to have an LU factorization.

Example 4.19. Let A = [[1, 2, -3],[-3, -4, 13],[2, 1, -5]]. Then A may be reduced to triangular form by the operations 3R1 + R2 → R2 and -2R1 + R3 → R3, and then (3/2)R2 + R3 → R3:

    A  ~  [ 1   2  -3 ]  ~  [ 1  2  -3 ]
          [ 0   2   4 ]     [ 0  2   4 ]
          [ 0  -3   1 ]     [ 0  0   7 ]

This gives us the factorization A = LU, where

    L = [  1     0    0 ]        and        U = [ 1  2  -3 ]
        [ -3     1    0 ]                       [ 0  2   4 ]
        [  2   -3/2   1 ]                       [ 0  0   7 ]

Note that the entries -3, 2, and -3/2 in L come from the above elementary row operations, and that U is the triangular form of A.
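Algorithm 4.14 translates directly into code. The sketch below (ours) assumes, as the text does, that no zero pivot is encountered; it reproduces the L and U of Example 4.19:

    import numpy as np

    def lu_factor(A):
        U = A.astype(float)            # work on a copy of A
        n = U.shape[0]
        L = np.eye(n)
        for i in range(n - 1):
            for j in range(i + 1, n):
                m = -U[j, i] / U[i, i] # multiplier m_ji
                U[j, :] += m * U[i, :] # R_j := m_ji R_i + R_j
                L[j, i] = -m           # L records the negative of m_ji
        return L, U

    A = np.array([[ 1,  2, -3],
                  [-3, -4, 13],
                  [ 2,  1, -5]])
    L, U = lu_factor(A)
    print(L)                           # [[1,0,0],[-3,1,0],[2,-1.5,1]]
    print(U)                           # [[1,2,-3],[0,2,4],[0,0,7]]
    print(np.allclose(L @ U, A))       # True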

Applications to Linear Equations

Consider a computer algorithm M. Let C(n) denote the running time of the algorithm as a function of the size n of the input data. [The function C(n) is sometimes called the time complexity, or simply the complexity, of the algorithm M.] Frequently, C(n) simply counts the number of multiplications and divisions executed by M, but does not count the number of additions and subtractions, since they take much less time to execute.

Now consider a square system of linear equations

    AX = B

where A = (a_ij) has an LU factorization and

    X = (x1, ..., xn)ᵀ        and        B = (b1, ..., bn)ᵀ

Then the system may be brought into triangular form (in order to apply back-substitution) by applying the above algorithm to the augmented matrix M = (A, B) of the system. The time complexities of the above algorithm and of back-substitution are, respectively,

    C(n) ≈ n³/2        and        C(n) ≈ n²/2

where n is the number of equations.

On the other hand, suppose we already have the factorization A = LU. Then to triangularize the system we need only apply the row operations in the algorithm (retained by the matrix L) to the column vector B. In this case, the time complexity is

    C(n) ≈ n²/2


Of course, to obtain the factorization A = LU requires the original algorithm, where C(n) ≈ n³/2. Thus nothing may be gained by first finding the LU factorization when a single system is involved. However, there are situations, illustrated below, where the LU factorization is useful.

Suppose that for a given matrix A we need to solve the system

    AX = B

repeatedly for a sequence of different constant vectors, say B1, B2, ..., Bk. Also, suppose some of the B_i depend upon the solution of the system obtained while using preceding vectors B_j. In such a case, it is more efficient to first find the LU factorization of A, and then to use this factorization to solve the system for each new B.

Example 4.20. Consider the system

    x - 2y - z = k1
    2x - 5y - z = k2        or        AX = B        (1)
    -3x + 10y - 3z = k3

where

    A = [  1   -2  -1 ]                     ( k1 )
        [  2   -5  -1 ]        and    B =   ( k2 )
        [ -3   10  -3 ]                     ( k3 )

Suppose we want to solve the system for B1, B2, B3, B4, where B1 = (1, 2, 3)ᵀ and

    B_{j+1} = B_j + X_j        (for j ≥ 1)

where X_j is the solution of (1) obtained using B_j. Here it is more efficient to first obtain the LU factorization of A, and then to use the LU factorization to solve the system for each of the B's. (See Problem 4.73.)
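A sketch of this reuse pattern (our illustration): A is factored once, and each new right-hand side then costs only two triangular solves. The L and U below are the factors of the matrix A of Example 4.20, computed as in Section 4.14:

    import numpy as np
    from scipy.linalg import solve_triangular

    L = np.array([[ 1.0,  0.0, 0.0],
                  [ 2.0,  1.0, 0.0],
                  [-3.0, -4.0, 1.0]])
    U = np.array([[1.0, -2.0, -1.0],
                  [0.0, -1.0,  1.0],
                  [0.0,  0.0, -2.0]])
    B = np.array([1.0, 2.0, 3.0])              # B_1
    for _ in range(4):                         # solve for B_1, ..., B_4
        Y = solve_triangular(L, B, lower=True)     # forward substitution
        X = solve_triangular(U, Y, lower=False)    # back-substitution
        print(X)                               # first pass: X_1 = (-8, -3, -3)
        B = B + X                              # B_{j+1} = B_j + X_j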

Solved Problems

ALGEBRA OF SQUARE MATRICES

4.1. Let A = [[1, 3, 6],[2, -5, 8],[4, -2, 7]]. Find: (a) the diagonal and trace of A; (b) A(u), where u = (2, -3, 5)ᵀ; (c) A(v), where v = (1, 7, -2).

(a) The diagonal consists of the elements from the upper left corner to the lower right corner of the matrix, that is, the elements a11, a22, a33. Thus the diagonal of A consists of the scalars 1, -5, and 7. The trace of A is the sum of the diagonal elements; hence tr A = 1 - 5 + 7 = 3.

(b) A(u) = [ 1   3  6 ] (  2 )     ( 2 - 9 + 30 )     ( 23 )
           [ 2  -5  8 ] ( -3 )  =  ( 4 + 15 + 40 )  = ( 59 )
           [ 4  -2  7 ] (  5 )     ( 8 + 6 + 35 )     ( 49 )

(c) By our convention, A(v) is not defined for a row vector v.

4.2. Let A = [[1, 2],[4, -3]]. (a) Find A² and A³. (b) Find f(A), where f(x) = 2x³ - 4x + 5.

(a) A² = AA = [[1, 2],[4, -3]][[1, 2],[4, -3]] = [[1 + 8, 2 - 6],[4 - 12, 8 + 9]] = [[9, -4],[-8, 17]]

    A³ = AA² = [[1, 2],[4, -3]][[9, -4],[-8, 17]] = [[9 - 16, -4 + 34],[36 + 24, -16 - 51]] = [[-7, 30],[60, -67]]

(b) To find f(A), first substitute A for x and 5I for the constant 5 in the given polynomial f(x) = 2x³ - 4x + 5:

    f(A) = 2A³ - 4A + 5I = 2[[-7, 30],[60, -67]] - 4[[1, 2],[4, -3]] + 5[[1, 0],[0, 1]]

Then multiply each matrix by its respective scalar:

    f(A) = [[-14, 60],[120, -134]] + [[-4, -8],[-16, 12]] + [[5, 0],[0, 5]]

Lastly, add the corresponding elements in the matrices:

    f(A) = [[-14 - 4 + 5, 60 - 8 + 0],[120 - 16 + 0, -134 + 12 + 5]] = [[-13, 52],[104, -117]]
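Matrix polynomials such as f(A) are easy to check numerically; this short NumPy computation (ours) reproduces the result of part (b):

    import numpy as np

    A = np.array([[1,  2],
                  [4, -3]])
    fA = 2 * np.linalg.matrix_power(A, 3) - 4 * A + 5 * np.eye(2)
    print(fA)                          # [[-13. 52.], [104. -117.]]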

4.3. Let A = [[2, 2],[3, -1]]. Find g(A), where g(x) = x² - x - 8.

    A² = [[2, 2],[3, -1]][[2, 2],[3, -1]] = [[4 + 6, 4 - 2],[6 - 3, 6 + 1]] = [[10, 2],[3, 7]]

    g(A) = A² - A - 8I = [[10, 2],[3, 7]] - [[2, 2],[3, -1]] - [[8, 0],[0, 8]] = [[0, 0],[0, 0]]

Thus A is a zero of g(x).

4.4. Given A = [[1, 3],[4, -3]]. Find a nonzero column vector u = (x, y)ᵀ such that A(u) = 3u.

First set up the matrix equation Au = 3u:

    [[1, 3],[4, -3]](x, y)ᵀ = 3(x, y)ᵀ

Write each side as a single matrix (column vector):

    (x + 3y, 4x - 3y)ᵀ = (3x, 3y)ᵀ

Set corresponding elements equal to each other to obtain a system of equations, and reduce it to echelon form:

    x + 3y = 3x        2x - 3y = 0
    4x - 3y = 3y   →   4x - 6y = 0   →   2x - 3y = 0

The system reduces to one homogeneous equation in two unknowns, and so has an infinite number of solutions. To obtain a nonzero solution, let, say, y = 2; then x = 3. That is, u = (3, 2)ᵀ has the desired property.

4.5. Let A = [[1, 2, -3],[2, 5, -1],[5, 12, -5]]. Find all vectors u = (x, y, z)ᵀ such that A(u) = 0.

Set up the equation Au = 0 and then write each side as a single matrix:

    (x + 2y - 3z, 2x + 5y - z, 5x + 12y - 5z)ᵀ = (0, 0, 0)ᵀ

Set corresponding elements equal to each other to obtain a homogeneous system, and reduce the system to echelon form:

    x + 2y - 3z = 0            x + 2y - 3z = 0
    2x + 5y - z = 0     →      y + 5z = 0          →      x + 2y - 3z = 0
    5x + 12y - 5z = 0          2y + 10z = 0                y + 5z = 0

In the echelon form, z is the free variable. To obtain the general solution, set z = a, where a is a parameter. Back-substitution yields y = -5a, and then x = 13a. Thus u = (13a, -5a, a)ᵀ represents all vectors such that Au = 0.

4.6. Show that the collection M of all 2 × 2 matrices of the form [[s, t],[t, s]] is a commutative algebra of matrices.

Clearly, M is nonempty. If A = [[a, b],[b, a]] and B = [[c, d],[d, c]] belong to M, then

    A + B = [[a + c, b + d],[b + d, a + c]],    kA = [[ka, kb],[kb, ka]],    AB = [[ac + bd, ad + bc],[bc + ad, bd + ac]]

also belong to M. Thus M is an algebra of matrices. Furthermore,

    BA = [[ca + db, cb + da],[da + cb, db + ca]]

Thus BA = AB, and so M is a commutative algebra of matrices.

4.7. Find all matrices M = [[x, y],[z, t]] that commute with A = [[1, 1],[0, 1]].

First find

    AM = [[x + z, y + t],[z, t]]        and        MA = [[x, x + y],[z, z + t]]

Then set AM = MA to obtain the four equations

    x + z = x,    y + t = x + y,    z = z,    t = z + t

From the first or last equation, z = 0; from the second equation, x = t. Thus M is any matrix of the form [[x, y],[0, x]].

4.8. Let e_i = (0, ..., 1, ..., 0)ᵀ, where i = 1, ..., n, be the (column) vector in Rⁿ with 1 in the ith position and 0 elsewhere, and let A and B be m × n matrices.

(a) Show that Ae_i is the ith column of A.
(b) Suppose Ae_i = Be_i for each i. Show that A = B.
(c) Suppose Au = Bu for every vector u in Rⁿ. Show that A = B.

(a) Let A = (a_ij) and let Ae_i = (b1, ..., bm)ᵀ. Then

    b_k = R_k e_i = (a_k1, ..., a_kn)(0, ..., 1, ..., 0)ᵀ = a_ki

where R_k is the kth row of A. Thus

    Ae_i = (a_1i, a_2i, ..., a_mi)ᵀ

the ith column of A.

(b) Ae_i = Be_i means A and B have the same ith column for each i. Thus A = B.

(c) If Au = Bu for every vector u in Rⁿ, then Ae_i = Be_i for each i. Thus A = B.

4.9. Suppose A is an m × n matrix. Show that: (a) I_m A = A, (b) AI_n = A. (Thus AI = IA = A when A is a square matrix.)

We use the fact that I = (δ_ij), where δ_ij is the Kronecker delta (Example 4.3).

(a) Suppose I_m A = (f_ij). Then

    f_ij = Σ_{k=1}^{m} δ_ik a_kj = δ_ii a_ij = a_ij

Thus I_m A = A, since corresponding entries are equal.

(b) Suppose AI_n = (g_ij). Then

    g_ij = Σ_{k=1}^{n} a_ik δ_kj = a_ij δ_jj = a_ij

Thus AI_n = A, since corresponding entries are equal.

4.10. Prove Theorem 4.1: (i) tr(A + B) = tr(A) + tr(B), (ii) tr(kA) = k tr(A), (iii) tr(AB) = tr(BA).

(i) Let A + B = (c_ij). Then c_ij = a_ij + b_ij, so that

    tr(A + B) = Σ_{i=1}^{n} c_ii = Σ_{i=1}^{n} (a_ii + b_ii) = Σ_{i=1}^{n} a_ii + Σ_{i=1}^{n} b_ii = tr A + tr B

(ii) Let kA = (c_ij). Then c_ij = k a_ij, and

    tr kA = Σ_{i=1}^{n} k a_ii = k Σ_{i=1}^{n} a_ii = k tr A

(iii) Let AB = (c_ij) and BA = (d_ij). Then c_ij = Σ_k a_ik b_kj and d_ij = Σ_k b_ik a_kj, whence

    tr AB = Σ_{i=1}^{n} c_ii = Σ_{i=1}^{n} Σ_{k=1}^{n} a_ik b_ki = Σ_{k=1}^{n} Σ_{i=1}^{n} b_ki a_ik = Σ_{k=1}^{n} d_kk = tr BA

4.11. Prove Theorem 4.2: (i) (f + g)(A) = f(A) + g(A), (ii) (fg)(A) = f(A)g(A), (iii) f(A)g(A) = g(A)f(A).

Suppose f(x) = Σ_{i=1}^{r} a_i x^i and g(x) = Σ_{j=1}^{s} b_j x^j.

(i) We can assume r = s = n by adding powers of x with 0 as their coefficients. Then

    (f + g)(A) = Σ_{i=1}^{n} (a_i + b_i)A^i = Σ_{i=1}^{n} a_i A^i + Σ_{i=1}^{n} b_i A^i = f(A) + g(A)

(ii) We have f(x)g(x) = Σ_{i,j} a_i b_j x^{i+j}. Then

    f(A)g(A) = (Σ_i a_i A^i)(Σ_j b_j A^j) = Σ_{i,j} a_i b_j A^{i+j} = (fg)(A)

(iii) Using f(x)g(x) = g(x)f(x), we have

    f(A)g(A) = (fg)(A) = (gf)(A) = g(A)f(A)

4.12. Let D_k = kI, the scalar matrix belonging to the scalar k. Show that (a) D_k A = kA, (b) BD_k = kB, (c) D_k + D_k′ = D_{k+k′}, and (d) D_k D_k′ = D_{kk′}.

(a) D_k A = (kI)A = k(IA) = kA

(b) BD_k = B(kI) = k(BI) = kB

(c) D_k + D_k′ = kI + k′I = (k + k′)I = D_{k+k′}

(d) D_k D_k′ = (kI)(k′I) = kk′(II) = kk′I = D_{kk′}

INVERTIBLE MATRICES, INVERSES

4.13. Find the inverse of [[3, 5],[2, 3]].

Method 1. We seek scalars x, y, z, and w for which

    [[3, 5],[2, 3]][[x, y],[z, w]] = [[1, 0],[0, 1]]        or        [[3x + 5z, 3y + 5w],[2x + 3z, 2y + 3w]] = [[1, 0],[0, 1]]

or which satisfy

    3x + 5z = 1        3y + 5w = 0
    2x + 3z = 0        2y + 3w = 1

The solution of the first system is x = -3, z = 2, and of the second system is y = 5, w = -3. Thus the inverse of the given matrix is [[-3, 5],[2, -3]].

Method 2. The general formula for the inverse of the 2 × 2 matrix A = [[a, b],[c, d]] is

    A⁻¹ = (1/|A|) [[d, -b],[-c, a]]        where |A| = ad - bc

Thus if A = [[3, 5],[2, 3]], first find |A| = (3)(3) - (5)(2) = -1 ≠ 0. Next interchange the diagonal elements, take the negatives of the other elements, and multiply by 1/|A|:

    A⁻¹ = (1/(-1)) [[3, -5],[-2, 3]] = [[-3, 5],[2, -3]]

4.14. Find the inverse of (a) A = [[1, 2, -4],[-1, -1, 5],[2, 7, -3]] and (b) B = [[1, 3, -4],[1, 5, -1],[3, 13, -6]].

(a) Form the block matrix M = (A | I) and row reduce M to echelon form:

    M = [  1  2  -4 | 1  0  0 ]     [ 1  2  -4 |  1  0  0 ]     [ 1  2  -4 |  1   0  0 ]
        [ -1 -1   5 | 0  1  0 ] ~   [ 0  1   1 |  1  1  0 ] ~   [ 0  1   1 |  1   1  0 ]
        [  2  7  -3 | 0  0  1 ]     [ 0  3   5 | -2  0  1 ]     [ 0  0   2 | -5  -3  1 ]

The left half of M is now in triangular form; hence A has an inverse. Further row reduce M to row canonical form:

    M ~ [ 1  2  0 |  -9    -6     2  ]  ~  [ 1  0  0 | -16   -11    3  ]
        [ 0  1  0 |  7/2   5/2  -1/2 ]     [ 0  1  0 | 7/2   5/2  -1/2 ]
        [ 0  0  1 | -5/2  -3/2   1/2 ]     [ 0  0  1 | -5/2  -3/2  1/2 ]

The right half of the last matrix is the inverse of A:

    A⁻¹ = [ -16   -11    3  ]
          [ 7/2   5/2  -1/2 ]
          [ -5/2  -3/2  1/2 ]

(b) Form the block matrix M = (B | I) and row reduce to echelon form:

    M = [ 1   3  -4 | 1  0  0 ]     [ 1  3  -4 |  1  0  0 ]     [ 1  3  -4 |  1   0  0 ]
        [ 1   5  -1 | 0  1  0 ] ~   [ 0  2   3 | -1  1  0 ] ~   [ 0  2   3 | -1   1  0 ]
        [ 3  13  -6 | 0  0  1 ]     [ 0  4   6 | -3  0  1 ]     [ 0  0   0 | -1  -2  1 ]

In echelon form, M has a zero row in its left half; that is, B is not row reducible to triangular form. Accordingly, B is not invertible.

4.15. Prove the following:

(a) If A and B are invertible, then AB is invertible and (AB)⁻¹ = B⁻¹A⁻¹.
(b) If A1, A2, ..., An are invertible, then (A1A2···An)⁻¹ = An⁻¹···A2⁻¹A1⁻¹.
(c) A is invertible if and only if Aᵀ is invertible.
(d) The operations of inversion and transposing commute: (Aᵀ)⁻¹ = (A⁻¹)ᵀ.

(a) We have

    (AB)(B⁻¹A⁻¹) = A(BB⁻¹)A⁻¹ = AIA⁻¹ = AA⁻¹ = I
    (B⁻¹A⁻¹)(AB) = B⁻¹(A⁻¹A)B = B⁻¹IB = B⁻¹B = I

Thus B⁻¹A⁻¹ is the inverse of AB.

(b) By induction on n and using part (a), we have

    (A1A2···An)⁻¹ = (A2···An)⁻¹A1⁻¹ = An⁻¹···A2⁻¹A1⁻¹

(c) If A is invertible, then there exists a matrix B such that AB = BA = I. Then

    (AB)ᵀ = (BA)ᵀ = Iᵀ        and so        BᵀAᵀ = AᵀBᵀ = I

Hence Aᵀ is invertible, with inverse Bᵀ. The converse follows from the fact that (Aᵀ)ᵀ = A.

(d) By part (c), Bᵀ is the inverse of Aᵀ; that is, Bᵀ = (Aᵀ)⁻¹. But B = A⁻¹; hence (A⁻¹)ᵀ = (Aᵀ)⁻¹.

4.16. Show that, if A has a zero row or a zero column, then A is not invertible.

By Problem 3.20, if A has a zero row, then AB would have a zero row. Thus if A were invertible, then AA⁻¹ = I would imply that I has a zero row. Therefore, A is not invertible. On the other hand, if A has a zero column, then Aᵀ would have a zero row; and so Aᵀ would not be invertible. Thus, again, A is not invertible.

ELEMENTARY MATRICES

4.17. Find the 3-square elementary matrices E1, E2, E3 which correspond, respectively, to the row operations R1 ↔ R2, -7R3 → R3, and -3R1 + R2 → R2.

Apply the operations to the identity matrix I3 to obtain

    E1 = [ 0  1  0 ]      E2 = [ 1  0   0 ]      E3 = [  1  0  0 ]
         [ 1  0  0 ]           [ 0  1   0 ]           [ -3  1  0 ]
         [ 0  0  1 ]           [ 0  0  -7 ]           [  0  0  1 ]
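These matrices can also be produced programmatically, by applying each row operation to the identity matrix; the snippet below (ours) also illustrates Theorem 4.9, since each product EA equals the operation applied to A:

    import numpy as np

    A = np.array([[1, 2],
                  [3, 4],
                  [5, 6]])
    E1 = np.eye(3)[[1, 0, 2]]          # R1 <-> R2 applied to I
    E2 = np.diag([1.0, 1.0, -7.0])     # -7R3 -> R3 applied to I
    E3 = np.eye(3); E3[1, 0] = -3.0    # -3R1 + R2 -> R2 applied to I
    print(E1 @ A)                      # rows 1 and 2 of A interchanged
    print(E2 @ A)                      # third row of A multiplied by -7
    print(E3 @ A)                      # second row replaced by -3R1 + R2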

4.18. Prove Theorem 4.9.

Let R_i be the ith row of A; we denote this by writing A = (R1, ..., Rm). If B is a matrix for which AB is defined, then it follows directly from the definition of matrix multiplication that AB = (R1B, ..., RmB). We also let

    e_i = (0, ..., 0, 1, 0, ..., 0),    with 1 as the ith component

By Problem 4.8, e_i A = R_i. We also remark that I = (e1, ..., em) is the identity matrix.

(i) Let e be the elementary row operation R_i ↔ R_j. Then

    E = e(I) = (e1, ..., e_j, ..., e_i, ..., em)

with e_j in the ith position and e_i in the jth position. Thus

    EA = (e1A, ..., e_jA, ..., e_iA, ..., emA) = (R1, ..., R_j, ..., R_i, ..., Rm) = e(A)

(ii) Now let e be the elementary row operation kR_i → R_i, k ≠ 0. Then

    E = e(I) = (e1, ..., ke_i, ..., em)        and        e(A) = (R1, ..., kR_i, ..., Rm)

Thus

    EA = (e1A, ..., ke_iA, ..., emA) = (R1, ..., kR_i, ..., Rm) = e(A)

(iii) Last, let e be the elementary row operation kR_j + R_i → R_i. Then

    E = e(I) = (e1, ..., ke_j + e_i, ..., em)        and        e(A) = (R1, ..., kR_j + R_i, ..., Rm)

Using (ke_j + e_i)A = k(e_jA) + e_iA = kR_j + R_i, we have

    EA = (e1A, ..., (ke_j + e_i)A, ..., emA) = (R1, ..., kR_j + R_i, ..., Rm) = e(A)

Thus we have proven the theorem.

4.19. Prove each of the following:

(a) Each of the following elementary row operations has an inverse operation of the same type.

    [E1] Interchange the ith row and the jth row: R_i ↔ R_j.
    [E2] Multiply the ith row by a nonzero scalar k: kR_i → R_i, k ≠ 0.
    [E3] Replace the ith row by k times the jth row plus the ith row: kR_j + R_i → R_i.

(b) Every elementary matrix E is invertible, and its inverse is an elementary matrix.

(a) Each operation is treated separately.

(1) Interchanging the same two rows twice, we obtain the original matrix; that is, this operation is its own inverse.

(2) Multiplying the ith row by k and then by k⁻¹, or by k⁻¹ and then by k, we obtain the original matrix. In other words, the operations kR_i → R_i and k⁻¹R_i → R_i are inverses.

(3) Applying the operation kR_j + R_i → R_i and then the operation -kR_j + R_i → R_i, or applying these operations in the reverse order, we obtain the original matrix. In other words, the operations kR_j + R_i → R_i and -kR_j + R_i → R_i are inverses.

(b) Let E be the elementary matrix corresponding to the elementary row operation e: e(I) = E. Let e′ be the inverse operation of e and let E′ be its corresponding elementary matrix. Then

    I = e′(e(I)) = e′(E) = E′E        and        I = e(e′(I)) = e(E′) = EE′

Therefore E′ is the inverse of E.

4.20. Prove Theorem 4.10.

Suppose A is invertible and suppose A is row equivalent to a matrix B in row canonical form. Then there exist elementary matrices E1, E2, ..., Es such that Es···E2E1A = B. Since A is invertible and each elementary matrix E_i is invertible, B is invertible. But if B ≠ I, then B has a zero row; hence B is not invertible. Thus B = I, and (a) implies (b).

If (b) holds, then there exist elementary matrices E1, E2, ..., Es such that Es···E2E1A = I, and so

    A = (Es···E2E1)⁻¹ = E1⁻¹E2⁻¹···Es⁻¹

But the E_i⁻¹ are also elementary matrices. Thus (b) implies (c).

If (c) holds, then A = E1E2···Es. The E_i are invertible matrices; hence their product, A, is also invertible. Thus (c) implies (a). Accordingly, the theorem is proved.

4.21. Prove Theorem 4.11: If AB = I, then BA = I and hence B = A⁻¹.

Suppose A is not invertible. Then A is not row equivalent to the identity matrix I, and so A is row equivalent to a matrix with a zero row. In other words, there exist elementary matrices E1, ..., Es such that Es···E2E1A has a zero row. Hence Es···E2E1AB = Es···E2E1, an invertible matrix, also has a zero row. But invertible matrices cannot have zero rows; hence A is invertible, with inverse A⁻¹. Then also,

    B = IB = (A⁻¹A)B = A⁻¹(AB) = A⁻¹I = A⁻¹

4.22. Prove Theorem 4.12: B ~ A if and only if there exists a nonsingular matrix P such that B = PA.

If B ~ A, then B = e_s(...(e2(e1(A)))...) = Es···E2E1A = PA, where P = Es···E2E1 is nonsingular. Conversely, suppose B = PA where P is nonsingular. By Theorem 4.10, P is a product of elementary matrices, and hence B can be obtained from A by a sequence of elementary row operations, i.e., B ~ A. Thus the theorem is proved.

4.23. Show that B is equivalent to A if and only if there exist invertible matrices P and Q such that B = PAQ.

If B is equivalent to A, then B = Es···E2E1AF1F2···Ft = PAQ, where P = Es···E2E1 and Q = F1F2···Ft are invertible. The converse follows from the fact that each step is reversible.

4.24. Show that equivalence of matrices, written ≈, is an equivalence relation: (a) A ≈ A; (b) if A ≈ B, then B ≈ A; (c) if A ≈ B and B ≈ C, then A ≈ C.

(a) A = IAI, where I is nonsingular; hence A ≈ A.

(b) If A ≈ B, then A = PBQ where P and Q are nonsingular. Then B = P⁻¹AQ⁻¹, where P⁻¹ and Q⁻¹ are nonsingular. Hence B ≈ A.

(c) If A ≈ B and B ≈ C, then A = PBQ and B = P′CQ′, where P, Q, P′, Q′ are nonsingular. Then

    A = P(P′CQ′)Q = (PP′)C(Q′Q)

where PP′ and Q′Q are nonsingular. Hence A ≈ C.

4.25. Prove Theorem 4.17.

The proof is constructive, in the form of an algorithm.

Step 1. Row reduce A to row canonical form, with leading nonzero entries a_1j1, a_2j2, ..., a_rjr.

Step 2. Interchange C1 and C_j1, interchange C2 and C_j2, ..., and interchange C_r and C_jr. This gives a matrix in the form

    [ I_r | B ]
    [  0  | 0 ]

with leading nonzero entries a11, a22, ..., a_rr.

Step 3. Use column operations, with the a_ii as pivots, to replace each entry in B with a zero; i.e., for i = 1, 2, ..., r and j = r + 1, r + 2, ..., n, apply the operation -b_ij C_i + C_j → C_j.

The final matrix has the desired form

    [ I_r | 0 ]
    [  0  | 0 ]

SPECIAL TYPES OF MATRICES

4.26. Find an upper triangular matrix A such that A³ = [[8, -57],[0, 27]].

Set A = [[x, y],[0, z]]. Then A³ has the form [[x³, *],[0, z³]]. Thus x³ = 8, so x = 2; and z³ = 27, so z = 3. Next calculate A³ using x = 2 and z = 3:

    A² = [[2, y],[0, 3]][[2, y],[0, 3]] = [[4, 5y],[0, 9]]
    A³ = [[2, y],[0, 3]][[4, 5y],[0, 9]] = [[8, 19y],[0, 27]]

Thus 19y = -57, or y = -3. Accordingly, A = [[2, -3],[0, 3]].

4.27. Prove Theorem 4.3(iii).

Let AB = (c_ij). Then

    c_ij = Σ_{k=1}^{n} a_ik b_kj

Suppose i > j. Then, for any k, either i > k or k > j, so that either a_ik = 0 or b_kj = 0. Thus c_ij = 0, and AB is upper triangular. Suppose i = j. Then, for k < i, a_ik = 0; and, for k > i, b_ki = 0. Hence c_ii = a_ii b_ii, as claimed.

4.28. What kinds of matrices are both upper triangular and lower triangular?

If A is both upper and lower triangular, then every entry off the main diagonal must be zero. Hence A is diagonal.


4.29. Prove Theorem 4.4.

(i) (A + Aᵀ)ᵀ = Aᵀ + (Aᵀ)ᵀ = Aᵀ + A = A + Aᵀ

(ii) (A - Aᵀ)ᵀ = Aᵀ - (Aᵀ)ᵀ = Aᵀ - A = -(A - Aᵀ)

(iii) Choose B = ½(A + Aᵀ) and C = ½(A - Aᵀ), and appeal to (i) and (ii). Observe that no other choice is possible.

4.30. Write A = [[2, 3],[7, 8]] as the sum of a symmetric matrix B and a skew-symmetric matrix C.

Calculate

    A + Aᵀ = [[4, 10],[10, 16]]        and        A - Aᵀ = [[0, -4],[4, 0]]

Then

    B = ½(A + Aᵀ) = [[2, 5],[5, 8]]        and        C = ½(A - Aᵀ) = [[0, -2],[2, 0]]

4.31. Find x, y, z, s, t if

    A = [  x   1/3  2/3 ]
        [ 2/3  2/3   y  ]
        [  s    t    z  ]

is orthogonal.

Let R1, R2, R3 denote the rows of A, and let C1, C2, C3 denote the columns of A. Since R1 is a unit vector, x² + 1/9 + 4/9 = 1, or x = ±2/3. Since R2 is a unit vector, 4/9 + 4/9 + y² = 1, or y = ±1/3. Since R1 · R2 = 0, we get 2x/3 + 2/9 + 2y/3 = 0, or 3x + 3y = -1. The only possibility is that x = -2/3 and y = 1/3. Thus

    A = [ -2/3  1/3  2/3 ]
        [  2/3  2/3  1/3 ]
        [   s    t    z  ]

Since the columns are unit vectors,

    4/9 + 4/9 + s² = 1        1/9 + 4/9 + t² = 1        4/9 + 1/9 + z² = 1

Thus s = ±1/3, t = ±2/3, and z = ±2/3.

Case (i): z = 2/3. Since C1 and C3 are orthogonal, s = 1/3; since C2 and C3 are orthogonal, t = -2/3.

Case (ii): z = -2/3. Since C1 and C3 are orthogonal, s = -1/3; since C2 and C3 are orthogonal, t = 2/3.

Hence there are exactly two possible solutions:

    A = [ -2/3   1/3   2/3 ]                  A = [ -2/3   1/3   2/3 ]
        [  2/3   2/3   1/3 ]        and           [  2/3   2/3   1/3 ]
        [  1/3  -2/3   2/3 ]                      [ -1/3   2/3  -2/3 ]

4.32. Suppose A = [[a, b],[c, d]] is orthogonal. Show that a² + b² = 1 and

    A = [[a, b],[b, -a]]        or        A = [[a, b],[-b, a]]

Since A is orthogonal, the rows of A form an orthonormal set. Hence

    a² + b² = 1        c² + d² = 1        ac + bd = 0

Similarly, the columns form an orthonormal set, so

    a² + c² = 1        b² + d² = 1        ab + cd = 0

Therefore, c² = 1 - a² = b², whence c = ±b.

Case (i): c = +b. Then b(a + d) = 0, or d = -a; the corresponding matrix is [[a, b],[b, -a]].

Case (ii): c = -b. Then b(d - a) = 0, or d = a; the corresponding matrix is [[a, b],[-b, a]].

4.33. Prove Theorem 4.6.

Let a and b be any real numbers such that a² + b² = 1. Then there exists a real number θ such that a = cos θ and b = sin θ. The result now follows from Problem 4.32.

4.34. Find a 3 × 3 orthogonal matrix P whose first row is a multiple of u1 = (1, 1, 1) and whose second row is a multiple of u2 = (0, -1, 1).

First find a vector u3 orthogonal to u1 and u2, say (using the cross product) u3 = u1 × u2 = (2, -1, -1). Let A be the matrix whose rows are u1, u2, u3, and let P be the matrix obtained from A by normalizing the rows of A. Thus

    A = [ 1   1   1 ]                  P = [ 1/√3   1/√3   1/√3 ]
        [ 0  -1   1 ]        and           [  0    -1/√2   1/√2 ]
        [ 2  -1  -1 ]                      [ 2/√6  -1/√6  -1/√6 ]
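The construction in Problem 4.34 (cross product, then row normalization) is mechanical enough to script; the sketch below (ours) also verifies orthogonality via PPᵀ = I:

    import numpy as np

    u1 = np.array([1.0, 1.0, 1.0])
    u2 = np.array([0.0, -1.0, 1.0])
    u3 = np.cross(u1, u2)                      # (2, -1, -1)
    P = np.array([u / np.linalg.norm(u) for u in (u1, u2, u3)])
    print(P)
    print(np.allclose(P @ P.T, np.eye(3)))     # True: rows are orthonormal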

4.35. Prove Theorem 4.7.

Suppose A = [[a, b],[c, d]]. Then

    AAᵀ = [[a² + b², ac + bd],[ac + bd, c² + d²]]        and        AᵀA = [[a² + c², ab + cd],[ab + cd, b² + d²]]

Since AAᵀ = AᵀA, we get

    a² + b² = a² + c²        c² + d² = b² + d²        ac + bd = ab + cd

The first equation yields b² = c²; hence b = c or b = -c.

Case (i): b = c (which includes the case b = -c = 0). Then we obtain the symmetric matrix A = [[a, b],[b, d]].

Case (ii): b = -c ≠ 0. Then ac + bd = b(d - a) and ab + cd = b(a - d). Thus b(d - a) = b(a - d), and so 2b(d - a) = 0. Since b ≠ 0, we get a = d. Thus A has the form

    A = [[a, b],[-b, a]] = [[a, 0],[0, a]] + [[0, b],[-b, 0]]

which is the sum of a scalar matrix and a skew-symmetric matrix.

COMPLEX MATRICES

4.36. Find the conjugate of the matrix

    A = [ 2+i   3-5i   4+8i ]
        [ 6-i   2-9i   5+6i ]

Take the conjugate of each element of the matrix (where the conjugate of a + bi is a - bi):

    Ā = [ 2-i   3+5i   4-8i ]
        [ 6+i   2+9i   5-6i ]

4.37. Find Aᴴ where

    A = [ 2-3i   5+8i ]
        [  -4    3-7i ]
        [ -6-i    5i  ]

Recall that Aᴴ = Āᵀ, the conjugate transpose of A. Hence

    Aᴴ = [ 2+3i    -4    -6+i ]
         [ 5-8i   3+7i   -5i  ]

4.38. Write a square matrix A in the form A = B + C, where B is Hermitian and C is skew-Hermitian.

First find Aᴴ. Then the required matrices are

    B = ½(A + Aᴴ)        and        C = ½(A - Aᴴ)

since A + Aᴴ is always Hermitian and A - Aᴴ is always skew-Hermitian (Problem 4.108).

4.39. Define an orthonormal set of vectors in Cⁿ and prove the following complex analogue of Theorem 4.5:

Theorem: Let A be a complex matrix. Then the following are equivalent: (a) A is unitary; (b) the rows of A form an orthonormal set; (c) the columns of A form an orthonormal set.

The vectors u1, u2, ..., ur in Cⁿ form an orthonormal set if u_i · u_j = δ_ij, where the dot product in Cⁿ is defined by

    (a1, a2, ..., an) · (b1, b2, ..., bn) = a1 b̄1 + a2 b̄2 + ··· + an b̄n

and δ_ij is the Kronecker delta [see Example 4.3(a)].

Let R1, ..., Rn denote the rows of A; then the conjugate transposes R̄1ᵀ, ..., R̄nᵀ are the columns of Aᴴ. Let AAᴴ = (c_ij). By matrix multiplication, c_ij = R_i R̄_jᵀ = R_i · R_j. Then AAᴴ = I iff R_i · R_j = δ_ij iff R1, R2, ..., Rn form an orthonormal set. Thus (a) and (b) are equivalent. Similarly, A is unitary iff Aᴴ is unitary iff the rows of Aᴴ are orthonormal iff the conjugates of the columns of A are orthonormal iff the columns of A are orthonormal. Thus (a) and (c) are equivalent, and the theorem is proved.

4.40. Show that

    A = [ 1/3 - (2/3)i       (2/3)i      ]
        [    -(2/3)i     -1/3 - (2/3)i   ]

is unitary.

The rows form an orthonormal set:

    R1 · R1 = (1/3 - (2/3)i)(1/3 + (2/3)i) + ((2/3)i)(-(2/3)i) = (1/9 + 4/9) + 4/9 = 1
    R1 · R2 = (1/3 - (2/3)i)((2/3)i) + ((2/3)i)(-1/3 + (2/3)i) = ((2/9)i + 4/9) + (-(2/9)i - 4/9) = 0
    R2 · R2 = (-(2/3)i)((2/3)i) + (-1/3 - (2/3)i)(-1/3 + (2/3)i) = 4/9 + (1/9 + 4/9) = 1

Thus A is unitary.


SQUARE BLOCK MATRICES

4.41. Determine which matrix is a square block matrix:

    A = [ 1  2 | 3 | 4  5 ]              B = [ 1  2 | 3  4 | 5 ]
        [ 1  1 | 1 | 1  1 ]                  [ 1  1 | 1  1 | 1 ]
        [------+---+------]                  [------+------+---]
        [ 9  8 | 7 | 6  5 ]                  [ 9  8 | 7  6 | 5 ]
        [ 3  3 | 3 | 3  3 ]                  [ 3  3 | 3  3 | 3 ]
        [------+---+------]                  [------+------+---]
        [ 1  3 | 5 | 7  9 ]                  [ 1  3 | 5  7 | 9 ]

Although A is a 5 × 5 square matrix and is a 3 × 3 block matrix, the second and third diagonal blocks of A are not square matrices. Thus A is not a square block matrix. In B, the horizontal and vertical partition lines are placed symmetrically, so every diagonal block is square; thus B is a square block matrix.

4.42. Complete the partitioning of

    C = [ 1  2  3  4  5 ]
        [ 1  1  1  1  1 ]
        [---------------]
        [ 9  8  7  6  5 ]
        [ 3  3  3  3  3 ]
        [---------------]
        [ 1  3  5  7  9 ]

into a square block matrix.

One horizontal line is between the second and third rows; hence add a vertical line between the second and third columns. The other horizontal line is between the fourth and fifth rows; hence add a vertical line between the fourth and fifth columns. [The horizontal lines and the vertical lines must be symmetrically placed to obtain a square block matrix.] This yields the square block matrix

    C = [ 1  2 | 3  4 | 5 ]
        [ 1  1 | 1  1 | 1 ]
        [------+------+---]
        [ 9  8 | 7  6 | 5 ]
        [ 3  3 | 3  3 | 3 ]
        [------+------+---]
        [ 1  3 | 5  7 | 9 ]

4.43. Determine which of the following square block matrices are lower triangular, upper triangular, or diagonal:

    A = [ 1 | 2  5 ]     B = [ 1 | 0  0 ]     C = [ 1  2 | 0 ]     D = [ 1 | 0  3 ]
        [---+------]         [---+------]         [ 3  4 | 0 ]         [---+------]
        [ 0 | 3  7 ]         [ 2 | 3  0 ]         [------+---]         [ 0 | 2  0 ]
        [ 0 | 0  4 ]         [ 6 | 0  9 ]         [ 0  0 | 9 ]         [ 4 | 0  9 ]

A is upper triangular, since the block below the diagonal is a zero block.

B is lower triangular, since all blocks above the diagonal are zero blocks.

C is diagonal, since the blocks above and below the diagonal are zero blocks.

D is neither upper triangular nor lower triangular. Furthermore, no other partitioning of D will make it into either a block upper triangular matrix or a block lower triangular matrix.

4.44. Consider the following block diagonal matrices, of which corresponding diagonal blocks have the same size:

    M = diag(A1, A2, ..., Ar)        and        N = diag(B1, B2, ..., Br)

Find: (a) M + N, (b) kM, (c) MN, (d) f(M) for a given polynomial f(x).

(a) Simply add the diagonal blocks: M + N = diag(A1 + B1, A2 + B2, ..., Ar + Br).

(b) Simply multiply the diagonal blocks by k: kM = diag(kA1, kA2, ..., kAr).

(c) Simply multiply corresponding diagonal blocks: MN = diag(A1B1, A2B2, ..., ArBr).

(d) Find f(Ai) for each diagonal block Ai. Then f(M) = diag(f(A1), f(A2), ..., f(Ar)).

4.45. Find M² where M is the block diagonal matrix

    M = diag( [ 1  2 ],  (5),  [ 1  3 ] )
              [ 3  4 ]         [ 5  7 ]

Since M is block diagonal, square each block:

    [ 1  2 ]²  =  [ 7  10 ]        (5)(5) = (25)        [ 1  3 ]²  =  [ 16  24 ]
    [ 3  4 ]      [ 15 22 ]                             [ 5  7 ]      [ 40  64 ]

Then

    M² = diag( [ 7  10 ],  (25),  [ 16  24 ] )
               [ 15 22 ]          [ 40  64 ]
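The block-by-block rule of Problems 4.44 and 4.45 can be checked with SciPy's block_diag helper (our illustration, assuming SciPy is available):

    import numpy as np
    from scipy.linalg import block_diag

    A1 = np.array([[1, 2],
                   [3, 4]])
    A2 = np.array([[5]])
    A3 = np.array([[1, 3],
                   [5, 7]])
    M = block_diag(A1, A2, A3)
    # M^2 equals the block diagonal matrix of the squared blocks:
    print(np.allclose(M @ M, block_diag(A1 @ A1, A2 @ A2, A3 @ A3)))   # True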

CONGRUENT SYMMETRIC MATRICES, QUADRATIC FORMS

4.46. Let A = [[1, -3, 2],[-3, 7, -5],[2, -5, 8]], a symmetric matrix. Find (a) a nonsingular matrix P such that PᵀAP is diagonal, i.e., the diagonal matrix B = PᵀAP, and (b) the signature of A.

(a) First form the block matrix (A | I):

    (A | I) = [  1  -3   2 | 1  0  0 ]
              [ -3   7  -5 | 0  1  0 ]
              [  2  -5   8 | 0  0  1 ]

Apply the row operations 3R1 + R2 → R2 and -2R1 + R3 → R3 to (A | I), and then the corresponding column operations 3C1 + C2 → C2 and -2C1 + C3 → C3 to A, to obtain

    [ 1  -3   2 |  1  0  0 ]                    [ 1   0   0 |  1  0  0 ]
    [ 0  -2   1 |  3  1  0 ]      and then      [ 0  -2   1 |  3  1  0 ]
    [ 0   1   4 | -2  0  1 ]                    [ 0   1   4 | -2  0  1 ]

Next apply the row operation R2 + 2R3 → R3 and then the corresponding column operation C2 + 2C3 → C3 to obtain

    [ 1   0   0 |  1  0  0 ]                    [ 1   0    0 |  1  0  0 ]
    [ 0  -2   1 |  3  1  0 ]      and then      [ 0  -2    0 |  3  1  0 ]
    [ 0   0   9 | -1  1  2 ]                    [ 0   0   18 | -1  1  2 ]

Now A has been diagonalized. Set

    P = [ 1  3  -1 ]                                 [ 1   0    0 ]
        [ 0  1   1 ]    ;    then    B = PᵀAP =      [ 0  -2    0 ]
        [ 0  0   2 ]                                 [ 0   0   18 ]

(b) B has p = 2 positive and n = 1 negative diagonal elements. Hence sig A = 2 - 1 = 1.
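The diagonalization-under-congruence procedure of Problem 4.46 can be coded directly: apply a row operation, apply the same column operation, and accumulate the column operations in P. The sketch below (ours) assumes a nonzero pivot is always available, so it may reach a different, but equally valid, diagonal matrix B than the hand computation above (here diag(1, -2, 4.5) instead of diag(1, -2, 18); the signature agrees):

    import numpy as np

    def congruence_diagonalize(A):
        A = A.astype(float)            # work on a copy
        n = A.shape[0]
        P = np.eye(n)
        for i in range(n - 1):
            for j in range(i + 1, n):
                m = -A[j, i] / A[i, i]
                A[j, :] += m * A[i, :] # row operation
                A[:, j] += m * A[:, i] # corresponding column operation
                P[:, j] += m * P[:, i] # accumulate column operations in P
        return P, A

    A = np.array([[ 1, -3,  2],
                  [-3,  7, -5],
                  [ 2, -5,  8]])
    P, B = congruence_diagonalize(A)
    print(B)                               # a diagonal matrix congruent to A
    print(np.allclose(P.T @ A @ P, B))     # True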


QUADRATIC FORMS

4.47. Find the quadratic form q(x, y) corresponding to the symmetric matrix A = [[5, -3],[-3, 8]].

    q(x, y) = (x, y) [[5, -3],[-3, 8]] (x, y)ᵀ = 5x² - 3xy - 3xy + 8y² = 5x² - 6xy + 8y²

4.48. Find the symmetric matrix A which corresponds to the quadratic form

    q(x, y, z) = 3x² + 4xy - y² + 8xz - 6yz + z²

The symmetric matrix A = (a_ij) representing q(x1, ..., xn) has the diagonal entry a_ii equal to the coefficient of x_i², and has the entries a_ij and a_ji each equal to half the coefficient of x_i x_j. Thus

    A = [ 3   2   4 ]
        [ 2  -1  -3 ]
        [ 4  -3   1 ]

4.49. Find the symmetric matrix B which corresponds to the quadratic form:

(a) q(x, y) = 4x² + 5xy - 7y²        (b) q(x, y, z) = 4xy + 5y²

(a) Here B = [[4, 5/2],[5/2, -7]]. (Division by 2 may introduce fractions even though the coefficients in q are integers.)

(b) Even though only x and y appear in the polynomial, the expression q(x, y, z) indicates that there are three variables. In other words,

    q(x, y, z) = 0x² + 4xy + 5y² + 0xz + 0yz + 0z²

Thus

    B = [ 0  2  0 ]
        [ 2  5  0 ]
        [ 0  0  0 ]

4.50. Consider the quadratic form q(x, y) = 3x² + 2xy - y² and the linear substitution x = s - 3t, y = 2s + t.

(a) Rewrite q(x, y) in matrix notation, and find the matrix A representing the quadratic form.
(b) Rewrite the linear substitution using matrix notation, and find the matrix P corresponding to the substitution.
(c) Find q(s, t) using direct substitution.
(d) Find q(s, t) using matrix notation.

(a) Here q(x, y) = (x, y)[[3, 1],[1, -1]](x, y)ᵀ. Hence A = [[3, 1],[1, -1]] and q(X) = XᵀAX, where X = (x, y)ᵀ.

(b) We have (x, y)ᵀ = [[1, -3],[2, 1]](s, t)ᵀ. Thus P = [[1, -3],[2, 1]] and X = PY, where X = (x, y)ᵀ and Y = (s, t)ᵀ.

(c) Substitute for x and y in q to obtain

    q(s, t) = 3(s - 3t)² + 2(s - 3t)(2s + t) - (2s + t)²
            = 3(s² - 6st + 9t²) + 2(2s² - 5st - 3t²) - (4s² + 4st + t²)
            = 3s² - 32st + 20t²

(d) Here q(X) = XᵀAX and X = PY. Thus Xᵀ = YᵀPᵀ. Therefore,

    q(Y) = Yᵀ(PᵀAP)Y = (s, t)[[3, -16],[-16, 20]](s, t)ᵀ = 3s² - 32st + 20t²

[As expected, the results in (c) and (d) are equal.]

4.51. Let L be a linear substitution X = PY, as in Problem 4.50. (a) When is L nonsingular? orthogonal? (b) Describe one main advantage of an orthogonal substitution over a nonsingular substitution. (c) Is the linear substitution in Problem 4.50 nonsingular? orthogonal?

(a) L is said to be nonsingular or orthogonal according as the matrix P representing the substitution is nonsingular or orthogonal.

(b) Recall that the columns of the matrix P representing the linear substitution introduce a new coordinate system. If P is orthogonal, then the new axes are perpendicular and have the same unit lengths as the original axes.

(c) The matrix P = [[1, -3],[2, 1]] is nonsingular, but not orthogonal; hence the linear substitution is nonsingular, but not orthogonal.

4.52. Let q(x, y, z) = x² + 4xy + 3y² - 8xz - 12yz + 9z². Find a nonsingular linear substitution expressing the variables x, y, z in terms of the variables r, s, t so that q(r, s, t) is diagonal. Also find the signature of q.

Form the block matrix (A | I), where A is the matrix which corresponds to the quadratic form:

    (A | I) = [  1   2  -4 | 1  0  0 ]
              [  2   3  -6 | 0  1  0 ]
              [ -4  -6   9 | 0  0  1 ]

Apply -2R1 + R2 → R2 and 4R1 + R3 → R3, and the corresponding column operations, and then 2R2 + R3 → R3 and the corresponding column operation, to obtain

    [ 1   0   0 |  1  0  0 ]                    [ 1   0   0 |  1  0  0 ]
    [ 0  -1   2 | -2  1  0 ]      and then      [ 0  -1   0 | -2  1  0 ]
    [ 0   2  -7 |  4  0  1 ]                    [ 0   0  -3 |  0  2  1 ]

Thus the linear substitution x = r - 2s, y = s + 2t, z = t will yield the quadratic form

    q(r, s, t) = r² - s² - 3t²

By inspection, sig q = 1 - 2 = -1.

4.53. Diagonalize the following quadratic form q by the method known as "completing the square":

    q(x, y) = 2x² - 12xy + 5y²

First factor out the coefficient of x² from the x² term and the xy term to get

    q(x, y) = 2(x² - 6xy) + 5y²

Next complete the square inside the parentheses by adding an appropriate multiple of y², and then subtract the corresponding amount outside the parentheses, to get

    q(x, y) = 2(x² - 6xy + 9y²) + 5y² - 18y² = 2(x - 3y)² - 13y²

(The -18 comes from the fact that the 9y² inside the parentheses is multiplied by 2.) Let s = x - 3y, t = y. Then x = s + 3t, y = t. This linear substitution yields the quadratic form q(s, t) = 2s² - 13t².

POSITIVE DEFINITE QUADRATIC FORMS

4.54. Let q(x, y, z) = x² + 2y² - 4xz - 4yz + 7z². Is q positive definite?

Diagonalize (under congruence) the symmetric matrix A corresponding to q, by applying 2R1 + R3 → R3 and 2C1 + C3 → C3, and then R2 + R3 → R3 and C2 + C3 → C3:

    A = [  1   0  -2 ]       [ 1   0   0 ]       [ 1  0  0 ]
        [  0   2  -2 ]   ~   [ 0   2  -2 ]   ~   [ 0  2  0 ]
        [ -2  -2   7 ]       [ 0  -2   3 ]       [ 0  0  1 ]

The diagonal representation of q contains only positive entries, 1, 2, and 1, on the diagonal; hence q is positive definite.

4.55. Let q(x, y, z) = x² + y² + 2xz + 4yz + 3z². Is q positive definite?

Diagonalize (under congruence) the symmetric matrix A corresponding to q:

    A = [ 1  0  1 ]       [ 1  0  0 ]       [ 1  0   0 ]
        [ 0  1  2 ]   ~   [ 0  1  2 ]   ~   [ 0  1   0 ]
        [ 1  2  3 ]       [ 0  2  2 ]       [ 0  0  -2 ]

There is a negative entry, -2, in the diagonal representation of q; hence q is not positive definite.

4.56. Show that q(x, y) = ax² + bxy + cy² is positive definite if and only if a > 0 and the discriminant D = b² - 4ac < 0.

Suppose v = (x, y) ≠ 0, say y ≠ 0. Let t = x/y. Then

    q(v) = y²[a(x/y)² + b(x/y) + c] = y²(at² + bt + c)

However, the parabola s = at² + bt + c lies above the t axis, i.e., is positive for every value of t, if and only if a > 0 and D = b² - 4ac < 0. Thus q is positive definite if and only if a > 0 and D < 0.

4.57. Determine which quadratic form q is positive definite:

(a) q(x, y) = x² - 4xy + 5y²        (b) q(x, y) = x² + 6xy + 3y²

(a) Method 1. Diagonalize by completing the square:

    q(x, y) = x² - 4xy + 4y² + 5y² - 4y² = (x - 2y)² + y² = s² + t²

where s = x - 2y, t = y. Thus q is positive definite.

Method 2. Compute the discriminant D = b² - 4ac = 16 - 20 = -4. Since D < 0, q is positive definite.

(b) Method 1. Diagonalize by completing the square:

    q(x, y) = x² + 6xy + 9y² + 3y² - 9y² = (x + 3y)² - 6y² = s² - 6t²

where s = x + 3y, t = y. Since -6 is negative, q is not positive definite.

Method 2. Compute D = b² - 4ac = 36 - 12 = 24. Since D > 0, q is not positive definite.


4.58. Let B be any nonsingular matrix, and let M = BᵀB. Show that (a) M is symmetric, and (b) M is positive definite.

(a) Mᵀ = (BᵀB)ᵀ = Bᵀ(Bᵀ)ᵀ = BᵀB = M; hence M is symmetric.

(b) Since B is nonsingular, BX ≠ 0 for any nonzero X in Rⁿ. Hence the dot product of BX with itself, BX · BX = (BX)ᵀ(BX), is positive. Thus

    q(X) = XᵀMX = Xᵀ(BᵀB)X = (XᵀBᵀ)(BX) = (BX)ᵀ(BX) > 0

Thus M is positive definite.

4.59. Show that q(X) = ||X||², the square of the norm of a vector X, is a positive definite quadratic form.

For X = (x1, x2, ..., xn), we have q(X) = x1² + x2² + ··· + xn². Now q is a polynomial with each term of degree two, and q is in diagonal form where all diagonal entries are positive. Thus q is a positive definite quadratic form.

4.60. Prove that the following two definitions of a positive definite quadratic form are equivalent:

(a) The diagonal entries are all positive in any diagonal representation of q.
(b) q(Y) > 0 for any nonzero vector Y in Rⁿ.

Suppose q(Y) = a1y1² + a2y2² + ··· + anyn². If all the coefficients a_i are positive, then clearly q(Y) > 0 for any nonzero vector Y. Thus (a) implies (b). Conversely, suppose a_k ≤ 0. Let e_k = (0, ..., 1, ..., 0) be the vector whose entries are all 0 except for a 1 in the kth position. Then q(e_k) = a_k ≤ 0 although e_k ≠ 0. Thus not-(a) implies not-(b). Accordingly, (a) and (b) are equivalent.

SIMILARITY OF MATRICES

4.61. Consider the cartesian plane R² with the usual x and y axes. The 2 × 2 nonsingular matrix

    P = [  1  3 ]
        [ -1  2 ]

determines a new coordinate system of the plane, say with s and t axes. (See Example 4.16.)

(a) Plot the new s and t axes in the plane R².
(b) Find the coordinates of Q(1, 5) in the new system.

(a) Plot the s axis in the direction of the first column u1 = (1, -1)ᵀ of P, with unit length equal to the length of u1. Similarly, plot the t axis in the direction of the second column u2 = (3, 2)ᵀ of P, with unit length equal to the length of u2. See Fig. 4-2.

(b) Find P⁻¹ = (1/5)[[2, -3],[1, 1]], say by using the formula for the inverse of a 2 × 2 matrix. Then multiply the coordinate (column) vector of Q by P⁻¹:

    P⁻¹(1, 5)ᵀ = (1/5)[[2, -3],[1, 1]](1, 5)ᵀ = (-13/5, 6/5)ᵀ

Thus Q′(-13/5, 6/5) represents Q in the new system.

4.62. Let f: R² → R² be defined by f(x, y) = (2x - 5y, 3x + 4y).

(a) Using X = (x, y)ᵀ, write f in matrix notation, i.e., find the matrix A such that f(X) = AX.
(b) Referring to the new coordinate s and t axes of R² introduced in Problem 4.61, and using Y = (s, t)ᵀ, find f(s, t) by first finding the matrix B such that f(Y) = BY.

(a) The coefficients of x and y give A = [[2, -5],[3, 4]], so that f(X) = AX.

(b) Find

    B = P⁻¹AP = (1/5)[[2, -3],[1, 1]][[2, -5],[3, 4]][[1, 3],[-1, 2]] = (1/5)[[17, -59],[6, 13]]

Thus f(s, t) = ((17s - 59t)/5, (6s + 13t)/5).

Fig. 4-2

4.63. Consider the space R³ with the usual x, y, z axes. The 3 × 3 nonsingular matrix

    P = [  1   3  -2 ]
        [ -2  -5   2 ]
        [  1   2   1 ]

determines a new coordinate system for R³, say with r, s, t axes. [Alternatively, P defines the linear substitution X = PY, where X = (x, y, z)ᵀ and Y = (r, s, t)ᵀ.] Find the coordinates of the point Q(1, 2, 3) in the new system.

First find P⁻¹. Form the block matrix M = (P | I) and reduce M to row canonical form:

    M = [  1   3  -2 | 1  0  0 ]     [ 1   3  -2 |  1  0  0 ]     [ 1  0  0 | -9  -7  -4 ]
        [ -2  -5   2 | 0  1  0 ] ~   [ 0   1  -2 |  2  1  0 ] ~   [ 0  1  0 |  4   3   2 ]
        [  1   2   1 | 0  0  1 ]     [ 0  -1   3 | -1  0  1 ]     [ 0  0  1 |  1   1   1 ]

Accordingly,

    P⁻¹ = [ -9  -7  -4 ]
          [  4   3   2 ]
          [  1   1   1 ]

and

    Q′ = P⁻¹Q = [ -9  -7  -4 ] ( 1 )     ( -35 )
                [  4   3   2 ] ( 2 )  =  (  16 )
                [  1   1   1 ] ( 3 )     (   6 )

Thus Q′(-35, 16, 6) represents Q in the new system.

4.64. Let f: R³ → R³ be defined by

    f(x, y, z) = (x + 2y - 3z, 2x + z, x - 3y + z)

and let P be the nonsingular change-of-variable matrix in Problem 4.63. [Thus X = PY, where X = (x, y, z)ᵀ and Y = (r, s, t)ᵀ.] Find: (a) the matrix A such that f(X) = AX, (b) the matrix B such that f(Y) = BY, and (c) f(r, s, t).

(a) The coefficients of x, y, and z give the matrix A:

    A = [ 1   2  -3 ]
        [ 2   0   1 ]
        [ 1  -3   1 ]

(b) Here B is similar to A with respect to P; that is,

    B = P⁻¹AP = [ -9  -7  -4 ] [ 1   2  -3 ] [  1   3  -2 ]     [ 1  -19   58 ]
                [  4   3   2 ] [ 2   0   1 ] [ -2  -5   2 ]  =  [ 1   12  -27 ]
                [  1   1   1 ] [ 1  -3   1 ] [  1   2   1 ]     [ 5   15  -11 ]

(c) Use the matrix B to obtain

    f(r, s, t) = (r - 19s + 58t, r + 12s - 27t, 5r + 15s - 11t)

4.65. Suppose B is similar to A. Prove tr B = tr A.

Since B is similar to A, there exists a nonsingular matrix P such that B = P⁻¹AP. Then, using Theorem 4.1,

    tr B = tr(P⁻¹(AP)) = tr((AP)P⁻¹) = tr(APP⁻¹) = tr A

LU FACTORIZATION

4.66. Find the LU factorization of A = ( ).

Reduce A to triangular form by the operations -2R1 + R2 → R2 and 3R1 + R3 → R3, and then 7R2 + R3 → R3. Use the negatives of the multipliers -2, 3, and 7 in the above row operations to form the matrix L, and use the resulting triangular form of A as the matrix U; that is,

    L = [  1   0  0 ]
        [  2   1  0 ]        and U = the triangular form obtained above
        [ -3  -7  1 ]

(As a check, multiply L and U to verify that A = LU.)

4.67. Find the LDU factorization of the matrix A in Problem 4.66.

The A = LDU factorization refers to the situation where L is a lower triangular matrix with 1s on the diagonal (as in the LU factorization of A), D is a diagonal matrix, and U is an upper triangular matrix with 1s on the diagonal. Thus simply factor out the diagonal entries in the matrix U in the above LU factorization of A to obtain the matrices D and U.

4.68. Find the LU factorization of B = [[1, 4, -3],[2, 8, 1],[-5, -9, 7]].

Reduce B to triangular form by first applying the operations -2R1 + R2 → R2 and 5R1 + R3 → R3:

    B ~ [ 1   4  -3 ]
        [ 0   0   7 ]
        [ 0  11  -8 ]

Observe that the second diagonal entry is 0. Thus B cannot be brought into triangular form without row interchange operations. In other words, B is not LU-factorable.

4.69. Find the LU factorization of

    A = [ 1  2  -3   4 ]
        [ 2  3  -8   5 ]
        [ 1  3   1   6 ]
        [ 3  8  -1  19 ]

by a direct method.

First form the following matrices L and U with unknown entries:

    L = [ 1    0    0    0 ]        U = [ u11  u12  u13  u14 ]
        [ l21  1    0    0 ]            [ 0    u22  u23  u24 ]
        [ l31  l32  1    0 ]            [ 0    0    u33  u34 ]
        [ l41  l42  l43  1 ]            [ 0    0    0    u44 ]

That part of the product LU which determines the first row of A yields the four equations

    u11 = 1,    u12 = 2,    u13 = -3,    u14 = 4

and that part of the product LU which determines the first column of A yields the equations

    l21 u11 = 2,    l31 u11 = 1,    l41 u11 = 3        or        l21 = 2,    l31 = 1,    l41 = 3

That part of the product LU which determines the remaining entries in the second row of A yields the equations

    4 + u22 = 3,    -6 + u23 = -8,    8 + u24 = 5        or        u22 = -1,    u23 = -2,    u24 = -3

and that part of the product LU which determines the remaining entries in the second column of A yields the equations

    2 + l32 u22 = 3,    6 + l42 u22 = 8        or        l32 = -1,    l42 = -2

Continuing, using the third row, third column, and fourth row of A, we get

    u33 = 2,    u34 = -1,        then    l43 = 2,        and lastly    u44 = 3

Thus

    L = [ 1   0   0  0 ]        and        U = [ 1   2  -3   4 ]
        [ 2   1   0  0 ]                       [ 0  -1  -2  -3 ]
        [ 1  -1   1  0 ]                       [ 0   0   2  -1 ]
        [ 3  -2   2  1 ]                       [ 0   0   0   3 ]

4.70. Find the LDU factorization of the matrix A in Problem 4.69.

Here U should have 1s on the diagonal and D is a diagonal matrix. Thus, using the above LU factorization of A, factor out the diagonal entries in that U to obtain

    D = [ 1   0  0  0 ]        and        U = [ 1  2  -3    4  ]
        [ 0  -1  0  0 ]                       [ 0  1   2    3  ]
        [ 0   0  2  0 ]                       [ 0  0   1  -1/2 ]
        [ 0   0  0  3 ]                       [ 0  0   0    1  ]

The matrix L is the same as in Problem 4.69.

4.71. Given the factorization A = LU, where L = (l_ij) and U = (u_ij). Consider the system AX = B. Determine (a) the algorithm that finds L⁻¹B, and (b) the algorithm that solves UX = B by back-substitution.

(a) The entry l_ij in the matrix L corresponds to the elementary row operation -l_ij R_j + R_i → R_i. Thus the algorithm which transforms B into B′ = L⁻¹B is as follows:

Algorithm P4.71A: Evaluating L⁻¹B

    Step 1. Repeat for j = 1 to n - 1:
    Step 2.     Repeat for i = j + 1 to n:
                    b_i := -l_ij b_j + b_i
                [End of Step 2 inner loop.]
            [End of Step 1 outer loop.]
    Step 3. Exit.

[The complexity of this algorithm is C(n) ≈ n²/2.]

(b) The back-substitution algorithm follows:

Algorithm P4.71B: Back-substitution for the system UX = B

    Step 1. x_n := b_n/u_nn
    Step 2. Repeat for j = n - 1, n - 2, ..., 1:
                x_j := (b_j - u_{j,j+1} x_{j+1} - ··· - u_{jn} x_n)/u_jj
    Step 3. Exit.

[The complexity here is also C(n) ≈ n²/2.]
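Algorithms P4.71A and P4.71B render into Python almost line for line; the helpers below are our own sketch of them:

    import numpy as np

    def apply_L_inverse(L, b):         # Algorithm P4.71A
        b = b.astype(float)            # work on a copy
        n = len(b)
        for j in range(n - 1):
            for i in range(j + 1, n):
                b[i] -= L[i, j] * b[j] # b_i := -l_ij b_j + b_i
        return b

    def back_substitute(U, b):         # Algorithm P4.71B
        n = len(b)
        x = np.zeros(n)
        x[n - 1] = b[n - 1] / U[n - 1, n - 1]
        for j in range(n - 2, -1, -1):
            x[j] = (b[j] - U[j, j+1:] @ x[j+1:]) / U[j, j]
        return x

    # Usage on a small hand-made factorization (LU = [[1,3],[2,4]]):
    L = np.array([[1.0, 0.0], [2.0, 1.0]])
    U = np.array([[1.0, 3.0], [0.0, -2.0]])
    B = np.array([5.0, 4.0])
    print(back_substitute(U, apply_L_inverse(L, B)))   # [-4.  3.]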

4.72. Find the LU factorization of the matrix A = [[1, 2, 1],[2, 3, 3],[-3, -10, 2]].

Reduce A to triangular form by the operations

    (1) -2R1 + R2 → R2,    (2) 3R1 + R3 → R3,    (3) -4R2 + R3 → R3

Thus

    A = LU = [  1  0  0 ] [ 1   2  1 ]
             [  2  1  0 ] [ 0  -1  1 ]
             [ -3  4  1 ] [ 0   0  1 ]

The entries 2, -3, and 4 in L are the negatives of the multipliers in the above row operations.

4.73. Solve the system AX = B for B1, B2, B3, where A is the matrix in Problem 4.72 and where B1 = (1, 1, 1)ᵀ, B2 = B1 + X1, B3 = B2 + X2 (here X_j is the solution when B = B_j).

(a) Find L⁻¹B1 or, equivalently, apply the row operations (1), (2), and (3) to B1 to yield

    B = (1, -1, 8)ᵀ

Solve UX = B for B = (1, -1, 8)ᵀ by back-substitution to obtain X1 = (-25, 9, 8)ᵀ.

(b) Find B2 = B1 + X1 = (1, 1, 1)ᵀ + (-25, 9, 8)ᵀ = (-24, 10, 9)ᵀ. Apply the operations (1) and (2) to B2 to obtain (-24, 58, -63)ᵀ, and then (3) to obtain B = (-24, 58, -295)ᵀ. Solve UX = B by back-substitution to obtain X2 = (977, -353, -295)ᵀ.

(c) Find B3 = B2 + X2 = (-24, 10, 9)ᵀ + (977, -353, -295)ᵀ = (953, -343, -286)ᵀ. Apply the operations (1) and (2) to B3 to obtain (953, -2249, 2573)ᵀ, and then (3) to obtain B = (953, -2249, 11569)ᵀ. Solve UX = B by back-substitution to obtain X3 = (-38252, 13818, 11569)ᵀ.

Supplementary Problems

ALGEBRA OF MATRICES

4.74. Let A = ( ). Find Aⁿ.

4.75. Suppose the 2 × 2 matrix B commutes with every 2 × 2 matrix A. Show that B = [[k, 0],[0, k]] for some scalar k, i.e., B is a scalar matrix.

4.76. Let A = ( ). Find all numbers k for which A is a root of the polynomial:

    (a) f(x) = x² - 7x + 10,    (b) g(x) = x² - 25,    (c) h(x) = x² - 4

4.77. Let B = ( ). Find a matrix A such that A³ = B.

4.78. Let A = ( ) and B = ( ). Find: (a) Aⁿ and (b) Bⁿ, for all positive integers n.

4.79. Find conditions on matrices A and B so that A² - B² = (A + B)(A - B).

INVERTIBLE MATRICES, INVERSES, ELEMENTARY MATRICES

4.80. Find the inverse of each matrix: (a) ( ), (b) ( ), (c) ( ).

4.81. Find the inverse of each matrix: (a) ( ), (b) ( ).

4.82. Express each matrix as a product of elementary matrices: (a) ( ), (b) ( ).

4.83. Express A = [[1, 2, 0],[0, 1, 3],[3, 8, 7]] as a product of elementary matrices.

4.84. Suppose A is invertible. Show that if AB = AC, then B = C. Give an example of a nonzero matrix A such that AB = AC but B ≠ C.

4.85. If A is invertible, show that kA is invertible when k ≠ 0, with inverse k⁻¹A⁻¹.

4.86. Suppose A and B are invertible and A + B ≠ 0. Show, by an example, that A + B need not be invertible.

SPECIAL TYPES OF SQUARE MATRICES

4.87. Using only the elements 0 and 1, find all 3 × 3 nonsingular upper triangular matrices.

4.88. Using only the elements 0 and 1, find the number of: (a) 4 × 4 diagonal matrices, (b) 4 × 4 upper triangular matrices, (c) 4 × 4 nonsingular upper triangular matrices. Generalize to n × n matrices.

4.89. Find all real matrices A such that A² = B, where (a) B = ( ), (b) B = ( ).

4.90. Let B = [[1, 8, 5],[0, 9, 5],[0, 0, 4]]. Find a matrix A with positive diagonal entries such that A² = B.

4.91. Suppose AB = C, where A and C are upper triangular.

(a) Show, by an example, that B need not be upper triangular even when A and C are nonzero matrices.
(b) Show that B is upper triangular when A is invertible.

4.92. Show that AB need not be symmetric, even though A and B are symmetric.

4.93. Let A and B be symmetric matrices. Show that AB is symmetric if and only if A and B commute.

4.94. Suppose A is a symmetric matrix. Show that: (a) A² and, in general, Aⁿ are symmetric; (b) f(A) is symmetric for any polynomial f(x); (c) PᵀAP is symmetric.

4.95. Find a 2 × 2 orthogonal matrix P whose first row is (a) (2/√29, 5/√29), (b) a multiple of (3, 4).

4.96. Find a 3 × 3 orthogonal matrix P whose first two rows are multiples of (a) (1, 2, 3) and (0, -2, 3), respectively; (b) (1, 3, 1) and (1, 0, -1), respectively.

4.97. Suppose A and B are orthogonal. Show that Aᵀ, A⁻¹, and AB are also orthogonal.

4.98. Which of the following matrices are normal? A = ( ), B = ( ), C = ( ).

4.99. Suppose A is a normal matrix. Show that the following are also normal: (a) Aᵀ; (b) A² and, in general, Aⁿ; (c) B = kI + A.

4.100. A matrix E is idempotent if E² = E. Show that E = [[2, -2, -4],[-1, 3, 4],[1, -2, -3]] is idempotent.

4.101. Show that if AB = A and BA = B, then A and B are idempotent.

4.102. A matrix A is nilpotent of class p if A^p = 0 but A^{p-1} ≠ 0. Show that A = [[1, 1, 3],[5, 2, 6],[-2, -1, -3]] is nilpotent of class 3.

4.103. Suppose A is nilpotent of class p. Show that A^q = 0 for q > p but A^q ≠ 0 for q < p.

4.104. A square matrix is tridiagonal if the nonzero entries occur only on the main diagonal, directly above the main diagonal (on the superdiagonal), or directly below the main diagonal (on the subdiagonal). Display the generic tridiagonal matrices of orders 4 and 5.

4.105. Show that the product of tridiagonal matrices need not be tridiagonal.

COMPLEX MATRICES

4.106. Find real numbers x, y, and z so that A is Hermitian, where

= (x + yi 3\\

\\3+zi 0/

3 x + 2i yi

<\302\253>A=[\", '. I- '\320\233 \342\204\226)=|3-2i 0 l+7i]

yi 1 \342\200\224xi \342\200\2241

4107. Suppose A is any complex matrix. Show that AA\" and A\"A are both Hermitian.

4108. Suppose A is any complex square matrix. Show that A + AH is Hermitian and A \342\200\224A\" is skew-Hermitian.

136 SQUARE MATRICES, ELEMENTARY MATRICES [CHAP. 4

4.109. Which of the following matrices are unitary?

/ H2 -\302\246A =

U/2-

4.110.Suppose A and \320\222are unitary matrices. Show that: (a) A\" is unitary, (b) A~' is unitary, (c) /IB is unitary.

4.111. Determine which of the following matrices are normal: A = I 1, \320\222= I I.

V i 2 + 3i/ \\l-i ij

4.112.Suppose A is a normal matrix and U is a unitary matrix. Show that \320\222= U\"AU is also normal.

4.113. Recall the following elementary row operations:

[?,] R^Rj, [?2] fcR,.-\302\273Ki, fc*0, [?3] JcR, + R,\342\200\224R,

For complex matrices, the respectivecorrespondingHermitian column operations are as follows:

[C,] C^C,, [C2] kCi-Ct, \320\272\320\244\320\236.[\320\2413] kCj + Q-Q

Show that the elementary matrix corresponding to [CJ is the conjugate transpose of the elementary matrix

corresponding to [EJ.

SQUARE BLOCK MATRICES

4.114.Using vertical lines, complete the partitioning of each matrix so that it is a square block matrix:

A =

/1 2 3 4 5\\

1 I 1 I 1

9 8 7 6 5

2 2 2 2 2\\3 3 3 3 3/

B =

/I 2 3 4 5\\

1 I 1 1 1

9 8 7\" 6 5

2 2 2 2 2\\3 3 3 3 3/

4.115. Partition each of the following matricesso that it becomes a block diagonal matrix with as many diagonal

blocks as possible:

/l 2 0 0 0\\

3 0 0 0 0\320\222= 0 0 4 0 0

0 0 5 0 0

\\o \320\276\320\276\320\276\320\261/

4.116. Find M2 and M3 for each matrix M:

4.117. Let M = diag (/!,,..., Ak) and iV = diag (B,,..., Bk)be block diagonal matrices where each pair of blocksAj, \320\222,-have the same size. Prove M/V is block diagonal and

MN = diag(/l1B1,A2B2, ..., AkBk)

CHAP. 4] SQUARE MATRICES, ELEMENTARY MATRICES 137

REAL SYMMETRIC MATRICES AND QUADRATIC FORMS

. Find a nonsingular matrix P such that \320\222= PTAP is diagonal. Also, find

4.119. For each quadratic form q(x, y, z), find a nonsingular linear substitution expressing the variables x, y, z in

terms of variables r, s, t such that q(r, s, t) is diagonal.

{a) q{x, y, z) = x2 + bxy + Sy2 \342\200\2244xz + 2yz \342\200\2249z2

(b) q{x, y, z) = 2xz -3y2 + Sxz + I2yz + 25z2

4.120.Find those values of \320\272so that the given quadratic form is positive definite:

{a) q{x, y) = 2x2 -5xy + ky2

(b) q[x, y) =-\302\2463x2 - kxy + \\2yz

(c) q(x, y, z) = x2 + 2xy + 2y2 + 2xz + byz 4- kz2

4.121. Give an example of a quadratic form q{x, y) such that q(u) = 0 and q(v) = 0 but qiu + v) ?t 0.

4.122. Show that any real symmetric matrix A is congruent to a diagonal matrix with only Is, \342\200\224Is, and 0s on the

diagonal.

4.123. Show that congruence of matricesis an equivalencerelation.

SIMILARITY OF MATRICES

/1 -2 -2\\

4.124. Consider the space R3 with the usual x, y, z axes. The nonsingular matrix P = I 2 \342\200\2243 \342\200\2246 I determines a

\\l I -7/new coordinate system for R3, say with r, s, i axes. Find:

(a) The coordinates of the point Q(l, 1, 1) in the new system,

(b) f(r, s, t) when/(x, y, z) = {x + \321\203,\321\203+ 2z, x - z),(c) g{r, s, t) when #(x, y,z) = {x+ \321\203

- z.x - 3z.2x + y).

4.125. Show that similarity of matrices is an equivalence relation.

LU FACTORIZATION

4.126. Find the LU and LDU factorization of each matrix:

'l 3 -l\\ /2 36\\

5 1 I, (fc) \320\222= I 4 7 9

4 2/ \\3 5 ^

4.127. Let A = 3 -4 -2

(a) Find the LI/ factorization of A.

138 SQUARE MATRICES, ELEMENTARY MATRICES [CHAP 4

(fc) Let Xk denote the solution of AX = Bk. Find X^ X2, X3, Xt when B, = A, 1, IO and

Bk*i= Bk + Xktork > 0.

Answers to Supplementary Problems

4.74.

4.76. (\320\260)\320\272= 2, (fc) fc=-5, (c) None

4.78. (a) A1 =

4.79. AB = BA

4.80. (a)

4.81. (a)

0000

0000

0000

1\\

0

0/

0/

Ak = 0 for \320\272>3 \320\244)

jB\" =

1

/.

r0

I

\\o

n n

I

0

in

n

1

\320\260)

G X \"OG -\302\260)orG o)G -X4.82. 0

'1 0 0\\/l 0 0\\/l 0 0\\/l 2 0\\

4.83. [0 1 Oil 0 1 OJJO 13JJ0

1 0|

\\3 0 l/\\0 2 lAO 0 l/\\o 0

(b) No product: matrix has no inverse.

\321\207; ?>\320\247\320\267 \320\276)

4\320\2337. All diagonal entries must be 1 to be nonsingular. There are eight possible choices for the entries above thediagonal:

\320\241 \320\266 \320\266 :>\321\201 :>(: :\320\274: :>\321\201 :)\302\246(::)

CHAP. 4] SQUARE MATRICES, ELEMENTARY MATRICES 139

4\320\2338. (a) 24[2\"], (fc)

4.89. (a)

4.90.

(c)

4.96.

4.98. \320\220,\320\241

4.104.

1/^2 0 -l/v/2

^ -2\320\224/22 3/^/22/

b43 fc44

\320\22311 0\\/l 1 0\\ /2 2 1)4.105. ( I 1 ill 1 11=12 3 2!

[o 1 l/\\0 11/ \\l 2 2^

4.106.

4.109.

4.111.

(o)

/1

x =

\320\222,\320\241

: a (parameter), \320\243=

0,\320\263= 0; (fc) x = 3, \321\203

= \320\236,\320\263= 3

4.114. /1 =

/119

\320\243

\\\320\267

2

1

8

2

3

3

j

7

2\320\235

3

4

1

6\321\203_ _

2

3

5\\1

5

\022

\320\267/

,\320\262=

/11

9

2

\\\320\267

2

1

\"8\"

2

3

3

1

\320\2237

2

3

; 4

Ii

\"

2

\320\267

5\\1

~5

2

\320\267/

4.115. /1=1 \320\236

\\

\320\2360\\

\320\236\"\022 ,\320\222=

\320\2363/

I 2; \320\276\320\276\320\276

\320\267\320\276!\320\276\320\276\320\276

\320\236\320\236\320\2234 \320\236\320\236

/\320\276. \320\276

,\321\201= \320\276\320\276\320\276

\\0 2 \320\236/\320\2360^5

\320\236\320\236

\\\320\276\320\276'\320\276\320\276\320\261/

(\320\241,itself, is a block diagonal matrix; no further partitioning of \320\241is possible.)

140 SQUARE MATRICES, ELEMENTARY MATRICES [CHAP. 4

4.116. (a) M2 =

(fc) M2 =

4.118. P =

\320\267_

M3=I

25

22

\320\230

30

44

25

15

41

/

57156

78

213

-7

469

, sig /1 = 2

4.119.(a) x = r - 3s - 191,\321\203= s + lt,z = t,

q{r, s, t) = r2 \342\200\224s2 + 36f2, rank q = 3.sigq = 1(b) x = r - It, \321\203

= s + 2J, z = t,

q(r, s, t) = 2r2 - 3s2 + 29/2, rank q = 3,sigq = I

(c) x = r - 2s+ 18J, \321\203= s - It, z = t,

q(r, s, t) = r2 + s2 - 62f2, rank q = 3, sig q = 1

(d) x = r \342\200\224s \342\200\224t, y = s \342\200\224t, z = t,

q(x, y, z) = r2 + 2s2, rank q = 2, sig q = 2

4.120.

4.121.

4.122.

(a) 12; (c) k>5

Suppose A has been diagonalized to PTAP = diag (a,).Let g =diag (b,) be defined by

=( 1/n/I \302\253iI \302\253f\302\253,?t 0

Then \320\262=\320\265\320\263\321\200\320\223/1\321\200\320\265

= {pQ^AiPQ) has the required form.( 1 if a,-

= 0

4.124. (a) 017, 5, 3), (fc) /(r, 5, f)= A7r - 61s+ 134t, 4r - 41s + 46r,3r - 25s + 25f),

(c) g{r, s, t) = F1r + s -33O\302\253,16r + 3s - 91r,9r - 4s - 4/)

4.126.

4.127.

\320\270

\321\205-'.!\342\200\242*\302\246\302\2460-*\342\200\242\302\246&'\342\200\242-Q-'-S-\"-3

Chapter5

Vector Spaces

5.1 INTRODUCTION

This chapter introduces the underlying algebraic structure of linear algebra \342\200\224thatof a finite-

dimensional vector space.The definition of a vector space involves, an arbitrary field whose elementsarecalledscalars.The following notation will be used (unless otherwisestated or implied):

\320\232 the field of scalars

a, b, \321\201or \320\272 the elements of \320\232

V the given vector spaceu, v, w the elements of V

Nothing essential is lost if the reader assumes that \320\232is the real field R or the complex field \320\241

Length and orthogonality are not covered in this chapter since they are not considered as partof the fundamental structure of a vector space. They will be included as an additional structure in

Chapter 6.

5.2 VECTOR SPACES

The following defines the notion of a vectorspaceor linear space.

Definition: Let \320\232be a given field and let \320\232be a nonempty set with rules of addition and scalarmultiplication which assigns to any u, v e V a sum \320\270+ v e V and to any \320\270e V, \320\272e \320\232\320\260

product ku e V. Then V is called a vector space over \320\232(and the elements of V are called

vectors) if the following axioms hold (see Problem 53).

[/1,] Forany vectors u,v,we V, (u + v) + w = \320\270+ {v + w).

[/12] There is a vector in V, denoted by 0 and called the zero vector, for which \320\270+ 0 = \320\270

for any vector \320\270\320\265V.

[/43] For each vector \320\270e V there is a vector in V, denoted by \342\200\224u, for which

u + (-u) = 0.

[i44] For any vectors \320\270,v e V, \320\270+ v = v + u.

[M,] For any scalar \320\272\320\265\320\232and any vectors u, v e V, k{u + v) = ku + kv.

[M2] For any scalars a,beK and any vector \320\270e V, {a + b)u = \320\260\320\270+ bu.

[M3] For any scalars a, b e \320\232and any vector \320\270e V, (ab)u = a(bu).

[M4] For the unit scalar 1 e K, lu = \320\270for any vector \320\270\320\265V.

The above axioms naturally split into two sets. The first four are only concerned with the additive

structure of V and can be summarized by saying that \320\232is a commutative group under addition. It

follows that any sum of vectorsof the form

u, + v2 +\302\246\302\246\302\246+ vm

141

142 VECTOR SPACES [CHAP. 5

requires no parentheses and does not depend upon the order of the summands, the zero vector 0 is

unique, the negative \342\200\224\320\270of \320\270is unique, and the cancellation law holds; that is, for any vectors u, v,

we V.\320\270+ w = v + w implies u \342\200\224v

Also, subtraction is defined by

\320\270\342\200\224v = \320\270+ {\342\200\224

v)

On the other hand, the remaining four axioms are concernedwith the \"action\" of the field \320\232on V.

Observe that the labelling of the axioms reflects this splitting. Using these additional axioms we prove

(Problem 5.1) the following simplepropertiesof a vector space.

Theorem 5.1: Let \320\232be a vector space over a field K.

(i) For any scalar ke \320\232and 0 e V, /cO = 0.

(ii) For 0 e \320\232and any vector \320\270e V, Ou = 0.

(iii) If ku = 0, where \320\272\320\265\320\232and \320\270e V, then \320\272= 0 or \320\270= 0.

(iv) For any \320\272e \320\232and any \320\270e V, (\342\200\224k)u =fc(

\342\200\224\320\270)

= \342\200\224ku.

5.3 EXAMPLES OF VECTOR SPACES

This section lists a number of important examples of vector spaces which will be used throughoutthe text.

SpaceK\"

Let \320\232be an arbitrary field. The notation K\" is frequently used to denote the set of all n-tuples of

elements in K. Here K\" is viewed as a vector space over \320\232where vector addition and scalar multiplica-multiplicationare defined by

(au \320\2602,...,\320\260\342\200\236)+ (blt b2,..., bn) = (a, + \320\254\342\200\236a2 + b2,..., \320\260\342\200\236+ bn)

and

k{au a2,...,aj = (kau ka2,..., kan)

The zero vector in K\" is the n-tuple of zeros,

0 =@, 0, ...,0)

and the negative of a vector is defined by

-(a,, a2,...,an) = (-a,, -a2,..., -an)The proof that K\" is a vector space is identical to the proof of Theorem 2.1, which we now regard as

stating that R\" with the operations defined there is a vector space over R.

Matrix Space \320\234\321\217,1\342\200\236

The notation Mnn, or simply M, will be used to denote the set of all m x \320\270matrices over an

arbitrary field A^. Then Mm> \342\200\236is a vector space over \320\232with respect to the usual operations of matrix

addition and scalar multiplication. (See Theorem 3.1.)

CHAP. 5] VECTOR SPACES 143

Polynomial Space P(f)

Let P(t) denote the set of all polynomials

ao + att + a2t2 + \302\246\302\246\302\246+ asts (s = 0,1,2, ...)

with coefficients a, in some field K. Then P(t) is vector space over \320\232with respect to the usual operationsof addition of polynomials and multiplication of a polynomial by a constant.

Function Space F(X)

Let X be any nonempty set and let \320\232be an arbitrary field. Consider the set F(X)of all functions

from X into K. [Note that F{X) is nonempty since X is nonempty.] The sum of two functions /g e F{X)is the function/+ g e F(X) definedby

(f+g)(x)=f{x) + g(x) VxeX

and the product of a scalar \320\272e \320\232and a function/ e F(X) is the function kfe F(X) defined by

= kf(x) Vx e X

(The symbol V means \"for every.\") Then F(X) with the above operations is a vector space over \320\232

(Problem 5.5).

The zero vector in F(X) is the zero function 0 which maps each x e X into OeX, that is,

0(x) = 0 Vx e X

Also, for any function/e F{X), the function \342\200\224/defined by

(-/X*) = -fix) Vx e X

is the negative of the function/

Fields and Subfields

Suppose ? isa field which contains a subfield K. Then E may be viewed as a vector space over \320\232as

follows. Let the usual addition in E be the vector addition, and let the scalar product kv of \320\272\320\265\320\232and

v e E be the product of \320\272and v as elements of the field E. Then ? is a vector space over K, that is, the

above eight axioms of a vector space are satisfied by E and K.

5.4 SUBSPACES

Let W be a subset of a vector space V over a field K. W is called a subspace of V if W is itself avector spaceover \320\232with respect to the operations of vector addition and scalar multiplication on V.

Simple criteria for identifying subspaces follow (seeProblem 5.4for proof).

Theorem 5.2: Suppose W is a subsetof a vector space V. Then W is a subspace of V if and only if the

following hold:

(i) 0 e W

(ii) W is closed under vector addition, that is:For every u, v e W, the sum \320\270+ v e W.

(iii) W is closed under scalar multiplication, that is:

For every \320\270e W, \320\272e K, the multiple kit e W.

144 VECTOR SPACES [CHAP.5

Conditions (ii) and (iii) may be combinedinto one condition as in (ii) below (see Problem 5.5 for

proof).

Corollary 5.3: W is a subspace of V if and only if:

(i) 0 e W

(ii) \320\260\320\270+ bv e W for every u, v e W and a,b e K.

Example 5.1

(a) Let V be any vector space. Then the set {0} consisting of the zero vector alone, and also the entire space V are

subspacesof V.

(b) Let W be the xy plane in R3 consisting of those vectors whose third component is 0; or, in other words

W = {{a,b, 0): a, b e R}. Note 0 = @, 0, \320\241)\320\265W since the third component of 0 is 0. Further, for any vectors\320\270\342\200\224(a, b, 0) and v = (c, d, 0) in W and any scalar \320\272\320\265R, we have

\320\270+ v = {a + c, b + d, 0) and ku = {ka, kb, 0)

belong to W. Thus W is a subspaceof V.

(c) Let V =\320\234\321\217\342\200\236the space of \320\270\321\205\320\270matrices. Then the subset Wx of (upper) triangular matrices and the subset W2

of symmetric matrices are subspaces of V since they are nonempty and closed under matrix addition and

scalar multiplication.

(d) Recall that P(f) denotes the vector space of polynomials. Let Pn(t) denote the subset of P(f) that consists of all

polynomials of degree < n, for a fixed n. Then Pn(t) is a subspace of P(t).This vectorspace\320\240\342\200\236(()will occur veryoften in our examples.

Example 5.2. Let U and W be subspaces of a vector space V. We show that tl itersection U r\\ W is also asubspace of V. Clearly 0 e U and Oe W since V and W are subspaces; whe 0 e U n W. Now suppose

ityVeUn W. Then u,veU and u, v e W and, since U and W are subspaces,

\320\270+ v, ku 6 U and \320\270+ v, ku e W

for any scalar k. Thus \320\270+ v,ku e V r\\ W and hence V r\\ W is a subspaceof V.

The result in the preceding examplegeneralizesas follows.

Theorem 5.4: The intersection of any number of subspaces of a vector spaceV is a subspace of V.

Recall that any solution \320\270of a system AX = \320\222of linear equations in n unknowns may be viewed as

a point in K\"; and thus the solution set of such a system is a subset of K\". Suppose the system is

homogeneous, i.e., suppose the system has the form AX = 0. Let W denote its solution set. Since.40= 0, the zero vector 0 e W. Moreover, if \320\270and v belong to W, i.e., if \320\270and v are solutions ofAX = 0, then \320\220\320\270= 0 and Av = 0. Therefore,for any scalars a and b in K, we have

A(au + bv)= aAu + bAv = \320\2600+ \320\2540= 0 + 0 = 0

Thus \320\260\320\270+ bv is also a solution of AX = 0 or, in other words, \320\260\320\270+ bv e W. Accordingly, by the above

Corollary 5.3, we have proved:

Theorem 5.5: The solution set W of a homogenoussystem AX = 0 in n unknowns is a subspaceof K\".

We emphasize that the solution set of a nonhomogenous system AX = \320\222is not a subspace of K\". In

fact, the zero vector 0 doesnot belong to its solution set.

CHAP.5] VECTOR SPACES 145

5.5 LINEAR COMBINATIONS, LINEAR SPANS

Let \320\232be a vector space over a field \320\232and \\etvuv2,...,vme V. Any vector in V of the form

+ + \342\200\242\302\246\302\246+ amvm

where the a,- e K, is called a linear combination of vt, i>2,. -., vm. The set of all such linear combinations,denotedby

span(t),, v2,..., vj

is called the linearspan of t>,, v2,..., vm.

Generally, for any subset S of V, span S = {0} when S is empty and span S consistsof all the linear

combinations of vectors in S.

The following theorem is proved in Problem 5.16.

Theorem 5.6: Let S bea subset of a vector space V.

(i) Then spanS is a subspace of V which contains S.

(ii) If W is a subspace of V containing S, then span S ^ W.

On the other hand, given a vector space V, the vectorsul,u2,...,ur are said to span or generateorto form a spanning set of V if

K = span(ulf \320\2742,...,ur)

In other words, u,, u2,. -., uT span V if, for every v e V, there exist scalars at,a2 aT such that

v = alul + a2u2 +\302\246\302\246\302\246

+ ar ur

that is, if v is a linear combination of uu u2,..., ur.

Example 5.3

(a) Consider the vector spaceR3. The linear span of any nonzero vector \320\270\320\265R3 consists of all scalar multiples of

\320\270;geometrically, span \320\270is the line through the origin and the endpoint of \320\270as shown in Fig. 5- l(a). Also, for

any two vectors u, v e R3 which are not multiples of each other, span (u, v) is the plane through the origin and

the endpoints of \320\270and d as shown in Fig. 5-1F).{b) The vectors e, =

A, 0, 0), e2 =@, 1, 0) and e3 =

@, 0, 1) span the vector spaceR3. Specifically, for any vector\320\270= {a, b, c) in R3, we have

\320\270= {a, b, c) = a(l, 0, 0) + b@, 1, 0) + \321\204,0, 1) = ael + be2+ ce3

That is, \320\272is a linear combination of ex, e2, \320\265\321\215.

ia)

Fig. 5-1

146 VECTOR SPACES [CHAP.5

(c) The polynomials 1, t, t2, t3,... span the vectorspaceP(f) of all polynomials, that is,

P@ =span (l,i, t1, i3,...)

In other words, any polynomial is a linear combination of 1 and powers of t. Similarly, the polynomials

1, t, t2,.... t\" span the vector space \320\240\342\200\236(\320\223)of all polynomials of degree <n.

Row Space of a Matrix

Let A be an arbitrary m x n matrix over a field \320\232:

The rows of A,

Rl = fall. \302\25312 \320\236. \342\200\242\342\200\242.Rm=

(<*mU ^ml . \342\200\242\342\200\242\342\200\242'\302\253\320\273\321\217)

may be viewed as vectors in K\" and hence they span a subspaceof K\" called the row space of \320\233and

denoted by rowsp A. That is,

rowspA = span (Rl5 R2, \302\246\302\246-,Rm)

Analogously, the columns of A may be viewed as vectors in Km and hence span a subspace of Km called

the column space of A and denoted by colsp A. Alternatively, colsp A = rowsp AT.

Now suppose we apply an elementary row operation on A,

(i) R{*-+Rjt (ii) IcRi-tR^k^O, or (iii) kRJ + R,-*Ri

and obtain a matrix B. Then each row of \320\222is clearly a row of A or a linear combination of rows of A.

Hence the row space of \320\222is contained in the row space of A. On the other hand, we can apply the

inverse elementary row operation on \320\222and obtain A; hence the row space of A is contained in the rowspace of B. Accordingly, A and \320\222have the same row space. This leadsus to the following theorem.

Theorem 5.7: Row equivalent matrices have the same row space.

In particular, we prove the following fundamental results about row equivalent matrices (proved in

Problems 5.51 and 5.52,respectively).

Theorem 5.8: Row canonical matrices have the same row space if and only if they have the samenonzero rows.

Theorem 5.9: Every matrix is row equivalent to a unique matrix in row canonical form.

We apply the above results in the next example.

Example 5.4. Show that the subspace U of R4 spanned by the vectors

u, =A,2, -1,3) u2 = B, 4. 1. -2) and u3 = C, 6, 3, -7)

and the subspace W of R4 spanned by the vectors

vj= A, 2, -4, 11) and v2

= B, 4, -5, 14)

are equal; that is, U = W.

CHAP. 5] VECTOR SPACES 147

Method 1. Show that each u,- is a linear combination of d, and i2, and show that each r, is alinear combination of u,, u2, and u3. Observe that we have to show that six systems of linear

equations are consistent.

Method 2. Form the matrix A whose rows are the ut, and row reduce A to row canonicalform:

/l 2 -I \320\267\\ /l 2-1 3\\ /l 2 0

\320\233=12 4 1 -2 ~IO 0 3 \342\200\2248I~ I 0 0 1 -f

\\3 6 3 -7/ \\00 6 -16/ \\0 0 0

Now form the matrix \320\222whose rows are d, and v2, and row reduce \320\222to row canonical form:

/1 2 -4 11\\ /1 2 -4 11\\ /1 2 0 |\\\\2 4 -5 14/ \\0 0 3 -8/ \\0 0 1 -\302\247/

Since the nonzero rows of the reduced matrices are identical, the row spaces of A and \320\222are

equal and so U = W.

5.6 LINEAR DEPENDENCE AND INDEPENDENCE

The following defines the notion of linear dependenceand independence. This concept plays anessentialrole in the theory of linear algebra and in mathematics in general.

Definition: Let \320\232be a vector space over a field K. The vectors t>,, ..., vm e V are said to be linearly

dependent over K, or simply dependent,if there exist scalars a,,..., am e K, not all of them0, such that

Otherwise, the vectors are said to be linearly independent over K, or simply independent.

Observethat the relation (\302\246)will always hold if the o's are all 0. If this relation holds only in this

case, that is,

a,i>, + a1v1+\302\246\302\246+ amvm = 0 implies a, = 0,.... am

= 0

then the vectors are linearly independent. On the other hand, if the relation (\302\246)also holds when one ofthe o's is not 0, then the vectors are linearly dependent.

A set {d,, d2 vm] of vectors is said to be linearly dependent or independent according as thevectors D|, v2, ..., vm are linearly dependent or independent. An infinite set S of vectors is linearly

dependent if there exist vectors \320\274,,..., uk in S which are linearly dependent; otherwise S is linearlyindependent.

The following remarks follow from the above definitions.

Remark 1: If 0 is one of the vectors t>,,.... vm, say vl = 0, then the vectors must

be linearly dependent; for

It), + 0t>2 +\342\200\242- \342\200\242

+ 0t;m = 1 \342\200\2420 + 0 + \302\246\302\246\342\200\242+ 0 = 0

and the coefficientoft), is not 0.

Remark 2: Any nonzero vector v is, by itself, linearly independent; for

kv = 0, v ^0 implies \320\272= 0

148 VECTOR SPACES [CHAP.5

Remark 3: If two of the vectors \321\214\\,\321\2142,...,\321\214\321\202are equal or one is a scalar multiple

of the other, say vt= kv2, then the vectors are linearly dependent. For

i>,\342\200\224

kv2

and the coefficient of r, is not 0.

0v + 0c, = 0

Remark 4: Two vectors I*! and v2 are linearly dependent if and only if one ofthem is a multiple of the other.

Remark 5: If the set {t>,,...,vm} is linearly independent, then any rearrangement

of the vectors {vh, pf,,..., vim}is also linearly independent.

Remark 6: If a set S of vectors is linearly independent, then any subset of S islinearly independent. Alternatively, if 5 contains a linearly dependent subset, then S islinearly dependent.

Remark 7: In the real space R3, linear dependence of vectors can be describedgeometrically as follows: (a) Any two vectors \320\270and v are linearly dependent if and only

if they lie on the same line through the origin as shown in Fig. 5-2(a). (b) Any three

vectors u. v, and w are linearly dependent if and only if they lie on the same planethrough the origin as shown in Fig. 5-2(b).

(\320\260)\320\270and v are linearly dependent (b) \320\270,v, and w are linearly dependent

Fig. 5-2

Other examples of linearly dependent and independent vectors follow.

Example 5.5

(a) The vectors u = A, \342\200\2241, 0), r = (I, 3, \342\200\2241), and w = E, 3, \342\200\2242)are linearly dependent since

3A, -1.0L-2A,3, \342\200\2241)

\342\200\224E, 3, -2) = @,0,0)

That is, 3u + 2i> - w = 0.

(b) We show that the vectors u = F, 2, 3, 4). \320\263= @, 5, \342\200\2243,1), and w =@, 0, 7, \342\200\2242)are linearly independent. For

suppose xu + yv + zw = 0 where x, v and \320\263are unknown scalars. Then

@, 0, 0, 0) = xF. 2, 3, 4) + j<0, 5, -3, 1) + z@,0, 7, -2)=

<6x, 2.x + 5>'. 3x - 3v + 7z, 4x + \321\203- 2z)

and so, by the equality of the corresponding components,

6x =02x + 5>- =0

3x -3y + Iz = 0

4x+ v-2z = 0

CHAP. 5] VECTOR SPACES 149

The first equation yields x = 0; the second equation with x = 0 yields \321\203= 0; and the third equation with

x = 0, \321\203= 0 yields z = 0.Thus

xu + yv + zw = 0 implies jc = 0, \321\203= 0, z = 0

Accordingly u, i>, and \320\270>are linearly independent.

Linear Combinations and LinearDependence

The notions of linear combinations and linear dependenceare closelyrelated.Specifically, for more

than one vector, we show that the vectors vt,v2 vm are linearly dependent if and only if one of themis a linear combination of the others.

For suppose,say, vs is a linear combination of the others:

1^= 0,1), + \302\246\302\246\302\246

+ a,_,!),-_, +ai+lv,+ i +\302\246\302\246\302\246+amvm

Then by adding \342\200\2241>,to both sides, we obtain

0,1', + \342\200\242\342\200\242\302\246+ \320\260^\320\274,- t); + al +lvl+l +\342\200\242\302\246\302\246+ amvm = 0

where the coefficientof d, is not 0; hence the vectors are linearly dependent. Conversely, suppose thevectors are linearly dependent, say,

^i + \302\246\302\246\302\246+ bjVj+

\302\246\302\246\302\246+ bm vm = 0 where \320\236

Then

and soVj

is a linear combination of the other vectors.We now formally state a slightly stronger statement than that above (see Problem 5.36 for the

proof); this result has many important consequences.

Lemma 5.10: Suppose two or more nonzero vectors t>,, v2, \302\246\302\246\302\246,vm are linearly dependent. Then one of

the vectors is a linear combination of the preceding vectors, that is, there exists\320\260\320\272> 1

such that

Example 5.6. Considerthe following matrix in echelon form:

A =

/00

0

0\\o

0

0

0

0

34000

4-4

000

5

4

7

0

0

6-4

8

6

0

\320\2734

9

-6

\320\276/

Observe that rows R2, R3, and R4 have 0s in the second column (below the pivot element in R,) and hence any

linear combination of R2, R3, and RA must have a 0 as its second component. Thus \320\233,cannot be a linearcombination of the nonzero rows below it. Similarly, rows R3 and R4 have 0s in the third column below the pivotelement in R2; hence R2 cannot be a linear combination of the nonzero rows below it. Finally, R3 cannot be a

multiple of R4 since R4 has a 0 in the fifth column below the pivot in R3. Viewing the nonzero rows from the

bottom up, R4, R3, R2, R,, no row is a linear combination of the previous rows. Thus the rows are linearly

independent by Lemma 5.10.

150 VECTOR SPACES [CHAP.5

The argument in the above examplecan beused for the nonzero rows of any echelon matrix. Thuswe have the following very useful result (proved in Problem 5.37).

Theorem 5.11: The nonzero rows of a matrix in echelon form are linearly independent.

5.7 BASIS AND DIMENSION

First westate two equivalent ways (Problem 5.30) to definea basisof a vector space V.

Definition A: A set S = {u,, u2,..., un} of vectors is a basis of V if the following two conditions hold:

A) u,, u2,..., \321\213\342\200\236are linearly independent,

B) \302\253\342\200\236u2,...,unspan V.

Definition B: A set S ={u\302\261,u2,..., un} of vectors is a basisof V if every vector v e V can be written

uniquely as a linear combination of the basis vectors.

A vector space V is said to be offinite dimension n or to be n-dimensional, written

dim V = n

if V has such a basis with n elements. This definition of dimension is well denned in view of the

following theorem (proved in Problem 5.40).

Theorem 5.12: Let \320\232be a finite-dimensional vector space. Then every basis of V has the same numberof elements.

The vector space {0} is defined to have dimension0. When a vector space is not of finite dimension,

it is said to be of infinite dimension.

Example 5.7

(a) Consider the vector space M2 3 of all 2 x 3 matrices over a field K. Then the following six matrices form a

basis of M2. \320\267:

/l 0 0\\ /0 1 0\\ /0 0 1\\ /0 0 0\\ /0 0 0\\ /0 0 0\\

\\0 0 OJ \\0 0 OJ \\0 0 0/ \\l 0 OJ \\0 1 o) \\0 0 \\)

More generally, in the vector spaceMr s of r x s matrices let Ei} be the matrix with i/-entry 1 and 0 elsewhere.Then all such matrices Eu form a basis of Mr s, called the usual basis of Mr 5. Then dim Mrs = rs. In

particular, e, = A,0, ...,0), e2= @, 1, 0,..., 0),.... \320\265\342\200\236= @, 0, ...,0, 1)form the usual basis for K\".

(b) Consider the vector space Pn(t) of polynomials of degree <n. The polynomials 1, t, t2, ..., t\" form a basis ofPn(r), and so dim \320\240\342\200\236(\320\263)

= n + 1.

Theorem 5.12, the fundamental theorem on dimension, is a consequenceof the following

\"replacement lemma\" (proved in Problem 5.39):

Lemma 5.13: Suppose {t>,, v2,..., vn} spans V, and suppose {wu vv2,..., wm} is linearly independent.Then m < n, and V is spanned by a set of the form

{w,,..., wm,vh, ...,tvJ

Thus, in particular, any n + 1 or more vectors in V are linearly dependent.

Observe in the above lemma that we have replaced m of the vectors in the spanning set by the m

independent vectors and still retained a spanning set.

CHAP. 5] VECTOR SPACES 151

The following theorems (proved in Problems 5.41, 5.42, and 5.43, respectively)will be frequently

used.

Theorem 5.14: Let \320\232be a vector space of finite dimension n.

(i) Any n + 1 or more vectors in V are linearly dependent.

(ii) Any linearly independent set S = {u,,u2, \302\246.\302\246,un] with n elements is a basis of V.

(iii) Any spanning set T ={t>,, v2, \302\246.\302\246,vn} of V with \320\270elements is a basis of V.

Theorem 5.15: Suppose S spans a vector space V.

(i) Any maximum number of linearly independent vectors in S form a basisof V.

(ii) Suppose one deletes from S each vector which is a linear combination of preced-

precedingvectors in S. Then the remaining vectors form a basis of V.

Theorem 5.16: Let V be a vector space of finite dimension and let S = {uu u2, \302\246\302\246\302\246,ur) be a set oflinearly independent vectors in V. Then S is part of a basis of V, that is, S may beextendedto a basis of V.

Example 5.8

(a) Considerthe following four vectors in R4:

A,1,1,1) @,1,1,1) @,0,1,1) @,0,0,1)

Note that the vector will form a matrix in echelon form; hence the vectors are linearly independent. Further-

Furthermore,since dim R4 = 4, the vectors form a basis of R4.

(b) Considerthe following n \302\246+-1 polynomials in Pn(i):

\\,t- l,(t-lJ,...,(r-l)\"

The degree of (i\342\200\224II is k; hence no polynomial can be a linear combination of preceding polynomials. Thus

the polynomials are linearly independent. Furthermore, they form a basis of Pn(t) since dim Pn(t) = n + 1.

Dimension and Subspaces

The following theorem (proved in Problem 5.44) gives the basic relationship between the dimension

of a vector space and the dimension of a subspace.

Theorem 5.17: Let W be a subspace of an \302\253-dimensional vector space V. Then dim W < n. In particu-particularif dim W = n, then W = V.

Example 5.9. Let W be a subspace of the real space R3. Now dim R3 = 3; hence by Theorem 5.17 the dimensionof W can only be 0, 1, 2, or 3.The following cases apply:

(i) dim W = 0, then W = {0}, a point;

(ii) dim W = 1,then W is a line through the origin;

(iii) dim W = 2, then W is a plane through the origin;

(iv) dim W = 3, then W is the entire space R3.

Rank of a Matrix

Let A be an arbitrary m x n matrix over a field K. Recall that the row space of A is the subspace ofK\" spanned by its rows, and the column spaceof A is the subspace of Km spanned by its columns.

152 VECTORSPACES [CHAP. 5

The row rank of the matrix A is equal to the maximum number of linearly independent rows or,

equivalently, to the dimension of the row space of A. Analogously, the column rank of A is equal to the

maximum number of linearly independent columns or, equivalently, to the dimension of the column

space of A.

Although rowsp A is a subspace of K\", and colsp A is a subspace of Km, where n may not equal m, we

have the following important result (proved in Problem 5.53).

Theorem 5.18: The row rank and the column rank of any matrix A are equal.

Definition: The rank of the matrix A, written rank A, is the common value of its row rank and columnrank.

The rank of a matrix may be easily found by using row reduction as illustrated in the next example.

Example 5.10. Supposewe want to find a basis and the dimension of the row space of

(I

2 0 -\\\\

2 6-3-33 \320\256 -6 -'.

We reduce A to echelon form using the elementary row operations:

(I

2 0 -l\\ /l 2 0 -\\\\

0 2-3 \342\200\224I I \342\200\224I 0 2 -3 -I0 4-6 -2/ \\0

0 0 0/

Recall that row equivalent matrices have the same row space. Thus the nonzero rows of the echelon matrix, which

are independent by Theorem 5.11, form a basis of the row space of A. Thus dim rowsp 4=2 and so rank 4 = 2.

5.8 LINEAR EQUATIONS AND VECTOR SPACES

Considera system of m linear equations in n unknowns x,,..., \321\205\342\200\236over a field K:

a,,x, +al2x2 + \302\246\302\246\302\246+alnxn

= bl

+?\0222*2+ \342\200\242\342\200\242\302\246+ a2nxn = b2E.1)

or the equivalent matrix equation

AX = \320\222

where A = (ay) is the coefficientmatrix, and X \342\200\224(xt) and \320\222= (b{) are the column vectors consistingof

the unknowns and of the constants, respectively.Recall that the augmented matrix of the system isdefinedto be the matrix

(\302\25311

fl12 -\342\200\242\342\200\242aln V

\302\26021..fl22..:::..\302\2602\"..b2.

\"mi am2 ... amn bj

Remark 1: The linear equationsE.1)are said to be dependent or independentaccording as the corresponding vectors, Le., the rows of the augmented matrix, are

dependent or independent.

CHAP. 5] VECTOR SPACES 153

Remark 2: Two systems of linear equations are equivalent if and only if the cor-corresponding augmented matrices are row equivalent, i.e.,have the same row space.

Remark 3: We can always replace a system of equationsby a system of indepen-independentequations, such as a system in echelon form. The number of independent equa-equations will always be equal to the rank of the augmented matrix.

Observe that the system {5.1)is alsoequivalent to the vector equation

The above comment gives us the following basic existencetheorem.

Theorem 5.19: The following three statements are equivalent.

(a) The system of linear equations AX = \320\222has a solution.

(b) \320\222is a linear combination of the columns of A.

(c) The coefficient matrix A and the augmented matrix (A, B) have the same rank.

Recall that an m x n matrix A may be viewedas a function A : K\" -\302\273Km. Thus a vector \320\222belongs

to the image of A if and only if the equation AX = \320\222has a solution. This means that the image (range)of the function A, written Im A, is precisely the column spaceofA. Accordingly,

dim (Im A) = dim (colsp A)\342\200\224

rank A

We use this fact to prove (Problem 5.59) the following basic result on homogeneous systems of linearequations.

Theorem 5.20: The dimension of the solution space W of the homogeneous system of linear equationsAX

\342\200\2240 is n \342\200\224r where n is the number of unknowns and r is the rank of the coefficient

matrix A.

In case the system AX = 0 is in echelon form, then it has precisely n \342\200\224r free variables, say, xfl, xi2,

..., xjn_r. Let Vjbe the solution obtained by settingxt> \342\200\2241 (or any nonzero constant) and the remaining

free variables equal to 0. Then the solutions u,, ..., vn_r are linearly independent (Problem 5.58) andhence they form a basis for the solution space.

Example 5.11. Suppose we want to find the dimension and a basis of the solution space W of the following

system:

x + 2y + 2z - s + 3f = 0

x + 2y + 3z + s + t \342\200\2240

\320\227\321\205+ 6y + 8z + s + 5t = 0

First reduce the system to echelon form:

x + 2y+2z-s + 3t = 0x + 2y + 2z \342\200\224s + 3i = 0

z + 2s-2i=0 orz + 2s \342\200\2242t =0

2z + 4s - 4t = 0

The system in echelon form has 2 (nonzero) equations in 5 unknowns; and hence the system has 5 \342\200\2242 = 3 freevariables which are y, s and t. Thus dim W = 3. To obtain a basis for W, set:

(i) \321\203= 1,.? = 0, t = 0 to obtain the solution vt = (-2,1, 0, 0,0),

154 VECTOR SPACES [CHAP. 5

(ii) \321\203= 0, s = 1, t = 0 to obtain the solution v2 = E,0, -

2, 1, 0),

(iii) \321\203= 0, s = 0, t = 1 to obtain the solution d3

= (- 7, 0, 2,0, 1).The set {d,, v2, v3} is a basisof the solution space W.

Two Basis-FindingAlgorithms

Suppose we are given vectors ux, u2,.. -, ur in K\". Let

W/ = span(M1, \320\2742,..., uP)

the subspace of X\" spanned by the given vectors. Each algorithm below finds a basis (and hence thedimension)of W.

Algorithm 5.8A (Row space algorithm)

Step 1. Form the matrix A whose rows are the given vectors.

Step2. Row reduce A to an echelon form.

Step 3. Output the nonzero rows of the echelon matrix.

The above algorithm essentially already appeared in Example 5.10. The next algorithm is illustratedin Example 5.12 and uses the above result on nonhomogeneoussystems of linear equations.

Algorithm 5.8B(Casting-Outalgorithm)

Step 1. Form the matrix M whosecolumns are the given vectors.

Step 2. Row reduce M to echelon form.

Step 3. Foreach column Ck in the echelon matrix without a pivot, delete (cast-out) the vector vk from

the given vectors.

Step 4. Output the remaining vectors (which correspond to columns with pivots).

Example 5.12. Let W be the subspace of R5 spanned by the following vectors:

\302\273,=A,2, 1,-2.3) \302\2732=<2, 5, -1, 3, -2) i>3=(l, 3, -2,5, -5)

i>4= C, 1, 2, -4. 1) ^=E,6,1.-1,-1)

We use Algorithm 5.8B to find the dimension and a basis of W.

First form the matrix M whosecolumns are the given vectors, and reduce the matrix \321\216echelon form:

M =

\\o

CHAP. 5] VECTOR SPACES 155

Observethat the pivots in the echelonmatrix appear in columns CUC2, and C4. Thefact that column C3 does not

have a pivot means that the system xi\\ + yv2\342\200\224

v3 has a solution and hence v3 is a linear combination o(v1 and v2.Similarly, the fact that column C5doesnot have a pivot means that v5 is a linear combination of preceding vectors.

Accordingly, the vectors vu v2 and i>4, which correspond to the columns in the echelon matrix with pivots, form abasisof W and dim W = 3.

5.9 SUMS AND DIRECT SUMS

Let U and W be subsets of a vector space V. The sum of U and W, written U + W, consists of all

sums \320\270+ w where \320\270e V and w e W. That is,

V + W = {u + w: \320\270e U, w e W}

Now supposeV and W are subspaces of a vector space V. Note that 0 = 0 + 0et/+W, since

0 e U, 0 e W. Furthermore, suppose \320\270+ w and \320\270'+ w' belong to V + W, with u,u'eU and w, w e W.Then

(u + w) + (\320\270'+ W) = (u + u') + (w + W) e U + W

and, for any scalar k,

k(u kw e U + W

Thus we have proven the following theorem.

Theorem 5.21: The sum V + W of the subspaces U and W of V is also a subspace of V.

Recall that the intersection V n W is also a subspaceof \320\232The following theorem, proved in

Problem 5.69, relates the dimensions of thesesubspaces.

Theorem 5.22: Let V and W be finite-dimensional subspaces of a vector space V. Then U + W hasfinite dimension and

dim (U + W)= dim U + dim W - dim (U r, W)

Example 5.13. Suppose U and W are the xy and yz planes, respectively, in R3. That is,

U = {(a, b, 0)} and W ={@, b, c)}

Note R3 = U + W; hence dim (V + W)= 3. Also dim U = 2 and dim W = 2.By Theorem 5.22,

3 = 2 + 2- dim {V \320\263\320\273W) or dim (V n, W) = I

This agrees with the facts that U n W is the \321\203axis (Fig. 5-3) and the \321\203axis has dimension 1.

Fig, 5-3

156 VECTOR SPACES [CHAP. 5

Direct Sums

The vector spaceV is said to be the direct sum of its subspaces U and W, denoted by

V = U \302\256W

if every vector v e V can be written in one and only one way as v = \320\270+ w where \320\270e U and w e W.

The following theorem, proved in Problem 5.70, characterizes such a decomposition.

Theorem 5.23: The vector space V is the direct sum of its subspaces U and W if and only if

(i) V = U + W and (ii) V n W = {0}.

Example 5.14

(a) In the vectorspaceR3, let U be the xy plane and let W be the yz plane:

U = {(a,b, 0): a, b e R} and W = {@, b, c): b, \321\201\320\265R}

Then R3 = U + W since every vector in R3 is the sum of a vector in U and a vector in W. However, R3 is not

the direct sum of V and W since such sums are not unique; for example,

C, 5, 7)= C, 1,0) + @, 4, 7) and also C, 5, 7)= C, -4, 0) + @, 9, 7)

(b) In R3, let V be the xv plane and let W be the z axis:

U = {(a, b, 0): a, b e R} and W ={@, 0, \321\201):\321\201\320\265R}

Now any vector (a, b,c)e R3 can be written as the sum of a vector in U and a vector in V in one and only one

way:

{a, b, c) = (a, b, 0) + @, 0, c)

Accordingly, R3 is the direct sum of V and W, that is, R3 = V \302\256W. Alternatively, R3 = U \302\256W since

R3 = v + W and I/ n W = {0}.

GeneralDirectSums

The notion of a direct sum is extended to more than one factor in the obvious way. That is, V is the

direct sum of subspacesWl,W2,...,Wr, written

v = w, e w2 e \342\200\242\302\246\302\246e wr

if every vector v e V can be written in one and only one way as

V = W1 + W2 \302\246+\302\246\302\246\342\200\242+ Wr

where w, e Wlt w2 e W2,..., wr e Wr.

The following theorems apply.

Theorem 5.24: Suppose V =Wj \302\251W2 \302\251

- - -\302\256Wr. Also, for each i, suppose S, is a linearly indepen-

independentsubset of Wt. Then

(a) The union S = \\Jt S, is linearly independent in V.

(b) If S, is a basis of Wt, then S =\\J, S, is a basis of V.

(c) dim V = dim W, + dim W2 + \342\200\242\302\246\302\246+ dim Wr

CHAP. 5] VECTOR SPACES 157

Theorem 5.25: Suppose V \342\200\224W, + W2 + \302\246\302\246\302\246

+ Wr (where V has finite dimension) and suppose

dim V = dim W, + dim W2 +\302\246\302\246\302\246

+ dim Wr

Then V =Wt\302\256W2\302\256---@Wr.

5.10 COORDINATES

Let V be an n-dimensional vector space over a field K, and suppose

is a basis of V. Then any vector v e V can be expressed uniquely as a linear combination of the basisvectors in S, say

v = atu, +a2u2 + \302\246\342\200\242\342\200\242+anun

These \320\270scalars alt a2, ..., \320\260\342\200\236are called the coordinates of v relative to the basis S; and they form the

n-tuple [a,, a2, ..., an] in K\", called the coordinate vector of i; relative to S. We denote this vector by

[v]s, or simply [u], when S is understood. Thus

Observe that brackets [...], not parentheses (...), are usedto denote the coordinate vector.

Example 5.15

{a) Consider the vector space P2(t)of polynomials of degree < 2. The polynomials

Pl= 1 P2 = t - 1 p3 = (t - IJ = t2 - It + 1

form a basis S of P2(t).Let v = 2t2 \342\200\2245t + 6. The coordinate vector of v relative to the basis S is obtained asfollows.

Set v = xp, + yp2 + zp3 using unknown scalarsx, y, z and simplify:

2t2-5t +6= x(l)+ y[t- 1) + z(tz - 2t + 1)

= x + vt \342\200\224v + zt2 - 2zt + z

= zt2 + (>--

2z)f + (x - v + z)

Then set the coefficients of the same powers of t equal to each other:

X - V + Z = 6

y-2z= -5z= 2

The solution of the above system is x = 3, \321\203= \342\200\224

1, z = 2. Thus

v = 3p, \342\200\224p2 + 2p3 and so [v] = [3, \342\200\224

1, 2]

(b) Consider real space R3. The vectors

\321\211=A,-1,0) u2= A,1,0) u3= @,1,1)

form a basis S ofR3. Let d = E, 3, 4).Thecoordinates of v relative to the basis S are obtained as follows.

Set v = \321\205\320\270,+ yu2 +- zu3, that is, set \320\270as a linear combination of the basis vectors using unknown scalarsx, y, z:

E, 3, 4) = x(l, - 1,0) + y(l, 1, 0) + z@, 1, 1)= (x, -x, 0)+(y, >,0)+ @, z, z)

= (x + >, -x + y + z, z)

Then set the corresponding components equal to each other to obtain the equivalent system of linear equa-

equations

x + \321\203= 5 \342\200\224x + >> + z = 3 z = 4

158 VECTOR SPACES [CHAP. 5

The solution of the system is x = 3, \321\203= 2, z = 4. Thus

v = 3u, + 2\320\2702+ 4u3 and so Ms = [3, 2, 4]

Remark: There is a geometrical interpretation of the coordinates of a vector v

relative to a basis S for real spaceR\". We illustrate this by using the following basis ofR3 appearing in Example 5.15.

S ={\320\270,

= A,- 1, 0), u2

= A, 1, 0), \302\253\320\267= @, 1, 1)}

First consider the space R3 with the usual x, y, and z axes. Then the basis vectorsdetermine a new coordinate system of R3, say with x', /, and z' axes, as shown in

Fig. 5-4. That is:

A) The x' axis is in the direction of \320\270,.

B) The y' axis is in the direction of u2 \302\246

C) The z' axis is in the direction of u3.

Furthermore, the unit length in each of the axes will be equal, respectively, to the

length of the corresponding basis vector. Then each vector v = (a, b, c) or, equivalently,

the point P{a, b, c) in R3 will have new coordinates with respect to the new x', y', z'axes.Thesenew coordinates are precisely the coordinates of v with respect to the basisS.

v = E, 3, 4) = [3, 2, 4]

Fig. 5-4

Isomorphism of \320\243with /C

Consider a basis S ={\320\270,,\320\2702,...,\302\253\342\200\236}of a vector space V over a field K. We have shown above that

to each vector v e V there corresponds a unique n-tuple Ms in K\". On the other hand, for any n-tuple

CHAP. 5] VECTOR SPACES 159

[\321\201\342\200\236c2, ..., cje K\", there corresponds the vector c,u, + c2u2 + \302\246\302\246\302\246+ \321\201\342\200\236unin V. Thus the basis Sinducesa one-to-onecorrespondence between the vectors in V and the n-tuples in K\". Furthermore,

suppose

v = axux + a2u2 + \302\246\342\200\242\342\200\242+ anun and w = blu1 + b2u2 + \342\200\242\342\200\242\342\200\242+ fcnun

Then

i; + vv = {ax + b,)u, + (\320\2642+ \320\2542)\320\2702+\302\246\302\246\302\246

+(\320\260\342\200\236+ \320\254>\342\200\236

fcu = (fca,)w, + (fca2)u2 + \342\200\242\302\246\302\246+ (kan)un

where fc is a scalar. Accordingly,

lv + vv]s= [a, +&!,..., ^ + *J =

&\302\273i,\342\200\242\302\246-,aJ + \342\204\2261.\342\200\242-\342\200\242>bj= Hs + r>]s

and

[kv\\s= [feo,, fea2, ..., fcoj =

\320\272[\320\260\342\200\236a2,..., fcaj = k[v]s

Thus the above one-to-one correspondence between V and K\" preserves the vector space operationsof

vector addition and scalar multiplication; we then say that V and K\" are isomorphic, written V as K\".

We state this result formally.

Theorem 5.26: Let V be an n-dimensional vector space over a field K. Then V and K\" are isomorphic.

The next example givesa practical application of the above result.

11\\

9j

Example 5.16. Suppose we want to determine whether or not the following matrices are linearly independent:

1 2 -3\\ _/l 3 -4\\\320\263-(\320\252

8 ~]

0 1/ ~~\\6 5 4/ ~\\16 10

The coordinatevectors of the above matrices relative to the usual basis [Example5.7(a)]of M2 3 are as follows:

[\320\233]= A, 2, -3, 4, 0, 1) [B]=

A, 3, -4, 6, 5, 4) [C] = C, 8,-11, 16,10,9)

Form the matrix M whose rows are the above coordinate vectors:

2-3403-4658 -11 16 10

Row reduce M to echelon form:

A

2-3 4 0 l\\ /l 2-3 4 00 1-12 5

\320\267)~0 1-12 5

0 2-2 4 10 6/ \\0 0 0 0 0

Since the echelon matrix has only two nonzero rows, the coordinate vectors [Ai\\, [B], and [C] span a subspaceof

dimension 2 and so are linearly dependent. Accordingly, the original matricesA, B, and \320\241are linearly dependent.

5.11 CHANGE OF BASIS

Section5.10showed that we can represent each vector in a vector space V by means of an n-tupleonce we have selected a basis S of V. We ask the following natural question: How does our representa-representationchange if we select another basis? For the answer, we must first redefine some terms. Specifically,

suppose \320\260\321\203,\320\2602,...,\320\260\342\200\236are the coordinates of a vector v relative to a basis S of V. Then we will represent

160 VECTORSPACES [CHAP.5

v by its coordinate column vector,denotedand defined by

a2 =(\320\260\342\200\236\320\2602,...,\320\260\342\200\236O

We emphasize that in this section [v]s is an n x 1 matrix, not simply an element of K\". (The meaning of[v\\s will always be clear by its context.)

Suppose S ={\320\270\342\200\236\320\2702,..., \320\270\342\200\236}is a basis of a vector space V and suppose S' =

{\302\253;,,r2, ..., \320\270\321\217}is

another basis. Since S is a basis,eachvector in S' can be written uniquely as a linear combination of the

elements in S. Say

v, =\321\201,,\320\270,+c12\302\2532 \321\2011\320\277\320\270\342\200\236

+c22u2 + \302\246\342\200\242\342\200\242+\321\2012\320\277\320\270\342\200\236

cnnun

Let P denote the transpose of the above matrix of coefficients:

P =

/c,, c21 ..cl2 c22 ..

\302\246/

That is, P = (p^)wherep(J

=\321\201\321\217.

Then P is called the change-of-basismatrix (or transition matrix) fromthe \"old basis\" 5 to the \"new basis\" S'.

Remark: Since the vectors \302\253;,,v2, \302\246\302\246-,vnin S' are linearly independent, the matrix

P is invertible (Problem 5.84). In fact (Problem5.80),its inverse P~J is the change-of-basis matrix from the basis S' back to the basis S.

Example 5.17. Consider the following two bases of R2:

S ={\320\262,

= (I, 2), u2 = C, 5)} E =\\ex

= A, 0), e2 = @, 1)}

From ut = et + 2e2,u2= 3e, -I- 5e2 it follows that

\321\201,=

-5\320\270, +2\320\2702

e2=

\320\227\320\274,\342\200\224

u2

Writing the coefficients of u, and u2 as columns gives us the change-of-basis matrix P from the basis S to the usual

basis ?:

Furthermore, sinceE is the usual basis,

=A -0

u, =A, 2)= e, + 2e2u2

= C, 5) = 3e, + 5e2

Writing the coefficients of e, and c2 as columns gives us the change-of-basis matrix Q from the basis E back to the

basis S:

-

CHAP. 5] VECTOR SPACES 161

Observe that P and Q are inverses:

3\\/l 3\\ (\\ 0\\

-JU \320\267)-(\320\276 l)\021

The next theorem (proved in Problem 5.19) tells us how the coordinate (column) vectorsare affected

by a change of basis.

Theorem5.27: Let P be the change-of-basis matrix from a basis S to a basis S' in a vector space V.

Then, for any vector v e V, we have

POL- = Ms and hence P\" 'Ms = Ms-

Remark: Although P is called the change-of-basis matrix from the old basis S tothe new basis S', it is P~' which transforms the coordinates of v relative to the originalbasis S into the coordinates of v relative to the new basis S'.

We illustrate the above theorem in the case dim V = 3. Suppose P is the change-of-basismatrix

from the basis S ={\302\253\342\200\236u2, u3} to the basis S' =

{vu v2, v3}; say

a2u2 + a3u3v2

=bi\302\253i+ \320\2542\022+ ^\320\267\"\320\267

v3 =\321\201,\320\270,+ \321\2012\320\2702

Hence

Now suppose v e V and, say \320\270= fc^ + k2v2 + k3v3. Then, substituting for t),, u2, u3 from above, we

obtain

v = fc,(a,u, + a2 u2 + a3 u3) + /\321\2012(\320\254,\320\270,+ b2u2 + b3 u3) + fc3(c,\302\253, + c2u2 + c3 u3)= (a,fc, + bxk2 + c,fe3)M, + (a2 kx + b2 k2 + c2 fc3)u2 + (a3 fe, + b3 k2 + c3 k3)u3

Thus

(k\\

I alk1 + btk2

fe2 I and [i;]s = I a2 fe, + b2 k2 + c2 k3

kj \\a3 ^ +b3k2 + c3k3

Accordingly,

/a, b, c,

c2 II fc2 I = I a2fct + b2fe2 + c2fe3| = Ms\\a3 b3 c3f\\k3} \\a3fe,

Also, multiplying the above equation by P~', we have

P-1Ms = P'PMs- = /Ms- = Ms-

Remark: SupposeS={\320\270,,\320\2702,..., \320\270\342\200\236}is a basis of a vector spaceV over a field

K, and suppose P =(p,y)

is any nonsingular matrix over K. Then the \320\270vectors

\"i=

Pii\302\253i +P2l\302\2532 +\342\200\242\342\200\242\342\200\242

+ \320\240\320\270,\320\270\321\217'= 1, 2, ...,n

162 VECTOR SPACES [CHAP. 5

are linearly independent (Problem 5.84)and hence form another basis S' of V. More-

Moreover, P will be the change-of-basis matrix from S to the new basis S'.

Solved Problems

VECTOR SPACES

5.1. Prove Theorem 5.1.

(i) By axiom [/42], with u = 0, we have 0 + 0 = 0.Hence by axiom [M,],

fcO = k@ + 0) = fcO + kO

Adding \342\200\224k0 to both sides gives the desired result.

(ii) By a property of K, 0 + 0 = 0. Hence by axiom [M2], Ou = @ + 0)u = Ou + Ou. Adding -Ou to both

sides yields the required result.

(iii) Suppose ku = 0 and \320\272\321\2040. Then there exists a scalark~' such that k~lk = 1; hence

u=lu ={k lk)u = k~1(ku)=kl0 = 0

(iv) Using u + ( \342\200\224u) = 0, we obtain 0 = fcO = k(u + ( \342\200\224u))= fcu + fc(

\342\200\224u). Adding \342\200\224ku to both sides gives-fcu = fc(-u).

Using \320\272+ {\342\200\224

\320\272)- 0, we obtain 0 = 0u = (k +{~k))u = ku + (-fc)u. Adding -ku to both sides

yields \342\200\224ku = (\342\200\224fc)u. Thus (\342\200\224k)u = k( \342\200\224u) = \342\200\224fcu.

5.2. Show that for any scalar \320\272and any vectors \320\270and t), fe(w\342\200\224

v) = ku \342\200\224fef.

Use the definition of subtraction, u \342\200\224v = u + ( \342\200\224v), and the result k( \342\200\224

v) = \342\200\224kvto obtain:

k(u \342\200\224v) = fc(u + (-u)) = ku + fc(-u) = fcu + ( \342\200\224

kv) = ku - kv

53. Let \320\230be the set of all functions from a nonempty set X into a field K. For any functions/, g e V

and any scalar \320\272e K, let/+ g and fe/be the functions in V defined as follows:

\\x)=f(x) + g(x) and (#) = k/(x) Vx e X

Prove that \320\230is a vector space over K.

Since X is nonempty, V is also nonempty. We now need to show that all the axioms of a vector spacehold.

Let f, g, he V. To show that (/+ g) + h =/+ (g + h), it is necessary to show that the function

(/+ g) + h and the function/+ (g + h) both assign the same value to each x e X. Now,

((/ + G) + \320\250= (/ + \320\267\320\245\321\205)+ \320\230\321\205)

=(/(\321\205)+ flW) + Mx) Vx e X

V+i9 + Mx) = /(x) + (g + W*) = /W + (flW + *M) v* e AT

But/(x), fii(x), and ft(x) are scalars in the field \320\232where addition of scalars is associative;hence

(f(x) + g{x)) + h(x) =/(x) + (\320\267(\321\205)+ /i(x))

Accordingly, (/+ g) + h =/+ (g + h).

[/42] Let 0 denote the zero function: \320\251\321\205)= 0, Vx e X. Then for any function/e V,

(f+ \320\251\321\205)=/(x) + \320\251\321\205)=/(x) + 0 =/(x) Vx e X

Thus/ + 0 =/ and 0 is the zero vector in V.

CHAP. 5] VECTOR SPACES 163

[/43] For any function/e V, let \342\200\224/bethe function defined by (\342\200\224f)(x)=

\342\200\224f(x).Then,

(/ + (-f)\342\204\226= f(x) + (-\320\226\321\205)= /(x) - /(x) = 0 = 0(x) Vx e X

Hence/+(-/) = O.

[yl4] Let/, .\321\203\320\2651\320\233Then

(/+ gKx) =/(x) + fii(x)= g(x) +/(x) = (g +\320\233\320\234 Vx e X

Hence/ + g = g +f. [Note that/(x) + g(x) = g(x) +/(x) follows from the fact that/(x) and g(x) are

scalars in the field \320\232where addition is commutative.]

[A/,] Let/i/e K.Then

W + g)Kx)= k((f+ \320\264\320\253)= M/M + fl(x)) = k/(x) + kg(x)

= (k/Xx) + kgKx)=(kf + kgXx) Vx e X

Hence fc(/ + 9) =kf + kg. (Note that k(/(x) + g(x))

= fc/(x) + kg{x) follows from the fact that fc,/(x)and <j(x) are scalars in the field \320\232where multiplication is distributive over addition.)

[M2] Let/e \320\243and a.beK. Then

((a + b)/Kx)= (a + b)f(x) =

af{x) + bf(x) = (a/X*) + bf(x)

[M3] Let/e F and a, b e K. Then,

((cb)/X-v)= (flb)/(x)= a(b/(x)) = c(b/Xx) =

(c(b/)Xx) Vx e X

Hence {ab)f=a(bf).

[M4] Let/e V. Then, for the unit 1 e K, A/Xx) = l/(x)) =/(x), Vx e X. Hence 1/=/

Sinceall the axioms are satisfied, V is a vector space over K.

SUBSPACES

5.4. Prove Theorem 5.2.

For W to be a subspace, conditions (i), (ii), and (iii) are clearly necessary; we now show them to besufficient. By (i), W is nonempty; and by (ii) and (iii), the operations of vector addition and scalar multipli-

multiplicationare well defined for W. Moreover, the axioms [/4,], [/44], [M,], [M2], [M3] and [M4] hold in Wsince the vectors in W belong to V. Hence we need only show that [/42] and [/4j] also hold in W. Now,

[\320\2332]obviously holds, because the zero vector of V is also the \320\263\320\265\320\263\320\276vector of W. Finally, if v e W, then

(- l)i>= - \320\270e W and v + (- v) = 0; i.e., [/13] holds.

5.5. Prove Corollary 5.3.

Suppose W satisfies (i) and (ii). Then, by (i), W is nonempty. Furthermore, if v, w e W then, by (ii),v + w = Iv + Iwe W; and if v e W and \320\272\320\265\320\232then, by (ii), kv = kv + Ov e W. Thus by Theorem 5.2, W isa subspaceof V.

Conversely, if W is a subspaceof V then clearly (i) and (ii) hold in W.

5.6. Show that W is a subspace of R3 where W = {(a, b, c): a + b + \321\201= 0}, i.e., W consists of thosevectors each with the property that the sum of its components is zero.

0 =@, 0, 0) e W since 0 + 0 + 0 = 0. Suppose v = (a, b, c), w = (a', b', c') belong to W, i.e.,

a + b + \321\201= 0 and a' + b' + d \342\200\2240. Then for any scalars \320\272and k',

kv + k'w = k(a, b, c) + k'(a, b', c') ={ka, kb, kc) + (k'a, k'b', k'c')= {ka + k'a\\ kb + k'b', kc + k'c')

164 VECTOR SPACES [CHAP. 5

and furthermore

{ka -4- k'a') + {kb + k'b') + (kc + k'c') = k(a + b + c) + k'(a' + b' + \321\201')= \320\250+ fc'O = 0

Thus kv + k'w e W, and so W is a subspaceof R3.

5.7. Let V be the vector space of all square \320\270\321\205\320\270matrices over a field K. Show that W is a subspaceof V where:

(a) W consists of the symmetric matrices,i.e.,all matrices A = (aiS)for which aj{=

atj;

(b) W consists of all matrices which commute with a given matrix T; that is

W = {AeV: AT =\320\242\320\220).

(a) Oelf since all entries of 0 are 0 and henceequal.Now suppose A \342\200\224(\321\217\321\206)and \320\222=

(b^) belong to W,i.e.,ajt

=ay and bjt

= bi}. For any scalars a, b e K, aA + bB is the matrix whose y-entry is\320\260\320\260\320\270+ bby.

But \320\260\320\260{!+ bbjt=

\320\260\320\260\321\206+ bbti. Thus aA + bB is alsosymmetric, and so W is a subspaceof V.

(b) OeW since \320\236\320\242= 0 = TO. Now suppose A,BeW; that is, AT = \320\242\320\220and \320\222\320\242= \320\242\320\222.For any scalars

a, be K,(aA + bB)T = (aA)T + (bB)T=

a(AT) + b(BT) = a(TA) + b(TB)=

T(aA) + T{bB) = T{aA + bB)

Thus aA + bB commuteswith T, i.e., belongs to W; henceW is a subspace of V.

5.8. Let V be the vector space of all 2 x 2 matrices over the real field R. Show that W is not a

subspace of V where:

(a) W consists of all matriceswith zero determinant;

(b) W consists of all matrices A for which A2 = A.

(a) Recall that det I I = ad - be. The matrices A =( ) and \320\222= I I belong to W since

/1 0\\det (\320\233)

= 0 and det (B)= 0. But A + \320\222= I I does not belong to W since det (A + B) = 1. Hence

\320\230^is not a subspace of V.

(b) The unit matrix / = I I belongs to W since

\"-G X \320\232 0-'

/2 0\\: 2/ = I I does not belong to W since

-4o to \302\260H :b

HenceW is not a subspace of V.

5.9. Let \320\230be the vector space of all functions from the real field R into R. Show that W is a subspaceof V where W consists of the odd functions, i.e.,thosefunctions/forwhich/(\342\200\224

x)=

\342\200\224f(x).

Let 0 denote the zero function: 0(x) = 0, for every x e R. 0 e W since 0(-x) = 0= -0= -0(x).Suppose/,ge W, i.e.,/( \342\200\224x)=

\342\200\224/(x)and g{ \342\200\224x) = \342\200\224g{x).Then for any real numbers a and b,

(af+ bgK-x) = af(-x) + bg(-x) = -af(x)- bg(x)= -(aftx) + bg{x))= -(\302\253/+

Hence af+bge W, and so W is a subspace of V.

CHAP. 5] VECTOR SPACES 165

5.10. Let V be the vector space of polynomialsa0 + a^ + a2t2 + \302\246\302\246\302\246+asts (s = 0,1,2,...), with coef-

coefficients a,- e R. Determine whether or not W is a subspace of V where:

(a) W consists of all polynomials with integral coefficients;

(b) W consistsof all polynomials with degree < 3;

(c) W consists of all polynomials with only even powers of/.

(d) W consists of all polynomials having 4/ \342\200\2241 as a root.

(a) No, sincescalar multiples of vectors in W do not always belong to W. For example,

v = 3 + 5i + It2 6 W but {v = { + ft + It2 if: W

(Observe that W is \"closed\" under vector addition, i.e., sums of elementsin W belong to W.)(b),(c), and (d). Yes. For, in each case, W is nonempty, the sum of elements in W belong to W, and the

scalar multiples of any element in W belong to W.

5.11 Prove Theorem 5.4.

Let {Wj: i e /} be a collectionof subspacesof V and let W = f] (W-,: i e /). Since each Wt is a sub-

space, Oe\342\204\226jfor every i e /. HenceOeW. Supposeu, v e W. Then u, v e W, for every i 6 /. Since each Wt is

a subspace, \320\260\320\270+ bv e \320\251,for each i e /. Henceau + bv e W. Thus W is a subspace of V.

LINEAR COMBINATIONS, LINEAR SPANS

5.12. Express v\342\200\224

A, \342\200\2242,5) in R3 as a linear combination of the vectors \320\270,,\320\2702,u3 where

u{ = A, -3, 2),u2= B, -4, - 1),u3

= A, - 5, 7).

First set

A, -2, 5) = x(l, -3, 2) + y{2, -4, -1) + z(l, -5, 7) = (x + 2y + z, -3x -4y

- 5z, 2x -\321\203+ Iz)

Form the equivalent system of equations and reduce to echelonform:

x + 2y + z= 1 x + 2y+z=l x + 2y+z=l

-3x-4y-5z=-2 or 2y-2z=l or 2y- 2z = 1

2x- y + lz= 5 -5y+5z = 3 0 =11

The system does not have a solution. Thus v is not a linear combination of \320\270,,\320\2702,u3.

5.13. Express the polynomial v = t2 + 4r \342\200\2243 over R as a linear combination of the polynomials

Pi = t2 - It + 5, p2 = 2t2 - 3t, p3 = t + 3.

Set \320\270as a linear combination of p,, p2,\320\240\320\267using unknowns x, y, z:

f* + 4f _ \320\267= x(t2 _ 2i + 5)+ 3<2f2- 3r) + z(t + 3)

= xi2 - 2xt + 5x + 2yt2- 3yi + zt + 3z

= (x + 2y)f2 + (-2x -3y + z)f + Ex + 3z)

Setcoefficients of the same powers of t equal to each other, and reduce the system to echelon form:

x + 2y =1 x + 2y =1 x + 2y =1

\342\200\2242x \342\200\2243y + z = 4 or \321\203+ z = 6 or \321\203+ z = 6

5x +3z=-3 -10y + 3z=-8 13z = 52

The system is in triangular form and has a solution. Solving by back-substitution yields x = \342\200\2243, \321\203= 2,

z = 4. Thus v = \342\200\2243p, + 2p2 + 4p3.

166 VECTOR SPACES [CHAP.5

5.14. Write the matrix E = I I as a linear combination of the matrices

/1 \320\233 /0 0\\ /0 2\\

H o) H .)and

H -.)Set ? as a linear combination of \320\220,\320\222,\320\241using the unknowns x, y, z: E = xA + \321\203\320\222+ zC.

\320\241 -0\302\253G JK \320\232 -\320\255

_/.vx\\ /0 0\\ /0 2z\\/ x x + 2z\\

~Vx O/+

Vy J+

VO -z)~\\x + y y-z)

Form the equivalent system of equations by setting corresponding entries equal to each other:

x = 3 x + \321\203= 1 x + 2z=l \321\203

- z= \342\200\2241

Substitute x = 3 in the second and third equations to obtain \321\203= \342\200\2242 and z = \342\200\2241. Since these values also

satisfy the last equation, they form a solution of the system. Hence E = \320\252\320\220\342\200\224IB \342\200\224\320\241

5.15. Find a condition on a, b, \321\201so that w = (a, b, c) is a linear combination of \320\270= A, \342\200\2243,2) and

v = B, \342\200\2241, 1), that is, so that w belongs to span (u, i>).

Set w = xu + yv using unknowns x and y:

[a,b,c)= x(l,-3, 2) + yB, -1, 1)=(x + 2y, -3x -

y, 2x + y)

Form the equivalent system and reduce to echelonform:

x + 2y= a x + 2y

= a x + 2y= a

\342\200\2243x \342\200\224\321\203

= b or 5y = \320\252\320\260+ b or 5y -= 3a + b

2x + y = c -3y=-2a + c 0= -a+ ib+ 5c

The system is consistent if and only if a \342\200\2243fc \342\200\2245c = 0 and hencew is a linear combination of \320\270and v whena - 3fr - 5c = 0.

5.16. Prove Theorem 5.6.

Suppose S is empty. By definition span S = {0}.Hencespan S = {0} is a subspaceand S \321\201span S.

Suppose S is not empty, and v e S. Then li; = v e span S; hence S is a subset of span S. Thus span S is not

empty since S is not empty. Now suppose v, w e span S, say

v = alvl + \302\246\302\246\302\246+ amvm and w = fr,w, + \342\200\242\302\246\342\200\242+ bKwn

where vit WjG S and a;, b} are scalars. Then

and, for any scalar k,

kv

belong to span S sinceeach is a linear combination of vectorsin S. Thus span S is a subspaceof V.

Now suppose W is a subspaceof V containing S and suppose u,, ..., vm e S ^ W. Then all multiples

alvly..., am vm 6 W, where at e /l, and hence the sum alvl + \302\246\342\200\242\302\246+ am um 6 W. That is, W contains all linear

combinations of elements of S. Consequently, span S is a subspaceof W, as claimed.

LINEAR DEPENDENCE

5.17. Determinewhether \320\270= 1 \342\200\2243t + 2t2 - 3t3 and v = \342\200\2243 + 9t - 6t2 + 9t3 are linearly dependent.

Twovectors are linearly dependent if and only if one is a multiple of the other. In this case, v = \342\200\2243u.

CHAP. 5] VECTOR SPACES 167

5.18. Determine whether or not the following vectors in R3 are linearly dependent:

\321\213= A, -2, 1), v = B, 1, - 1),w = G, -4, 1).

Method 1. Set a linear combination of the vectors equal to the zero vector using unknown scalars x, y,andz:

x(l, -2, 1)+ M2, 1, - 1) + zG, -4, 1)= @, 0, 0)

Then

(x, -2x, x) + By, \321\203,-\321\203)+ Gz, -4z, z) =@, 0, 0)

or

(x + 2y + 7z, -2x + \321\203- 4z, x -

\321\203+ z) = @, 0, 0)

Set corresponding components equal to each other to obtain the equivalent homogeneous system, and

reduce to echelon form:

x + 2y + 7z = 0 x + 2y + Iz = 0

-2x + \321\203- 4z = 0 or by + lOz = 0 or

* + 2y + 7z = 0

x- y+ z=0 -3y- 6z=0 y + 2z = 0

The system, in echelon form, has only two nonzero equations in the three unknowns; hence the system has

a nonzero solution. Thus the original vectors are linearly dependent.

Method 2. Form the matrix whose rows are the given vectors, and reduce to echelon form using the

elementary row operations:

l\\ -2 l\\ /l -2 12 1 \342\200\2241

j\342\200\22410 5-3

-4 1/ \\010 -6

Since the echelonmatrix has a zero row, the vectors are linearly dependent. (The three given vectors span a

spaceof dimension 2.)

5.19. Consider the vector space P(t) of polynomials over R. Determine whether the polynomialsu, u, and w are linearly dependent where \320\270= t3 + 4r2 - 2t + 3, v = t3 + 6r2 \342\200\224t + 4,

w = 3r3 + 8t2 - 8t + 7.

Set a linear combination of the polynomials u, v and w equal to the zero polynomial using unknownscalarsx, y, and z; that is, set xu + yv + zw = 0. Thus

x(t3 + 4t2 - It + 3)+ y[t3 + 6t2 - t + 4) + zCi3 + 8f2 - 8i + 7)= 0

or xt3 + 4xt2 - 2xf + 3x + yt3 + 6yt2 -yt + 4y + 3zfJ + 8zi2 \342\200\2248zi + 7z = 0

or {\320\245+ \320\243+ 3z)t3 + D.v + by + 8z)f2 + (-2x -\321\203

- 8z)t + Cx + Ay + 7z) = 0

Set the coefficients of the powers of t each equal to 0 and reduce the system to echelon form:

x+y+3z = 0 x+y+3z = 04x + by + 8z = 0 2y-4z = 0 c \342\200\236x + \321\203+ 3z = 0

or or finally-2x- y-8z = 0 y-2z = 0 y-2z =0

3x + 4y + 7z = 0 \321\203- 2z = 0

The system in echelon form has a free variable and hence a nonzero solution. We have shown that

xu + yv + zw = 0 does not imply that x = 0, y= 0, z = 0; hencethe polynomials are linearly dependent.

5.20. Let V be the vector space of functions from R into R. Showthat/, g,heV are linearly indepen-

independent,where/(t)= sin t, g(t) = cos t, h(t)

= t.

168 VECTOR SPACES [CHAP. 5

Set a linear combination of the functions equal to the zero function 0 using unknown scalars x, y, and

z: x/ + yg + zh = 0; and then show that x = 0, \321\203= 0, z = 0. We emphasize that x/+ yg + zh = 0 means

that, for every value oft, x/(t) + yg{t) + zh{t) - 0.Thus, in the equation x sin t + \321\203cos / + zt = 0, substitute

t = 0

t = n/2

t \342\200\22471

to

to

to

obtain

obtain

obtain

X

X

X

\302\246\320\236\320\275

\302\2461 \320\275

\342\200\242Oh

\320\275\321\203

i-y

h \320\230

-1

\342\200\2420

+

+

I)-

z -0 = 0zjt/2

= 0

+\302\246z \342\200\242jt = O

or

oror

X 4-

-y

y = 0

nz/2 = 0+ nz = 0

The last three equations have only the zero solution:x = 0, \321\203= 0, z = 0. Hence/, g and /i are linearly

independent.

5.21. Let V be the vector space of 2 x 2 matrices over R. Determine whether the matrices A, B,C e V

are linearly dependent, where:

-GO -\320\237Set a linear combination of the matrices A, B, and \320\241equal to the zero matrix using unknown scalars x,

y, and z; that is, set xA + \321\203\320\222+ zC = 0. Thus

1

or

0\\ /1 A /0 0\\

+ z x + z\\ /0 0\\

x + yJ=

Vo Oj

x + y

X

Set corresponding entries equal to each other to obtain the following equivalent system of linear equations:

x+y+z=0 v + z = 0 x=0 x + \321\203= 0

Solving the above system we obtain only the zero solution, x = 0, \321\203= 0, z = 0. We have shown that

xA + yB + zC \342\200\2240 implies x = 0, \321\203= 0, z = 0; hencethe matrices A, B, and \320\241are linearly independent.

5.22. Suppose u, i>, and w are linearly independent vectors. Showthat \320\270+ v, \320\270\342\200\224v, and \320\270- 2v + w are

also linearly independent.

Supposex(\302\253+ v) + y(u\342\200\224

v) + z(u \342\200\2242v + w) = 0 where x, y, and z are scalars.Then

xu + xv + yu\342\200\224

yv + zu \342\200\2242zv + zw = 0

or (x + \321\203+ z)u + (x \342\200\224\321\203

\342\200\2242z)i> + zw = 0

But \320\270,\320\270,and w are linearly independent; hence the coefficients in the above relation are each 0:

x + \321\203+\302\246z = 0

x -\321\203

- 2z = 0z = 0

The only solution to the above system is x = 0, \321\203= 0, z = 0. Hence\320\270+ v, \320\270\342\200\224v, and u \342\200\2242v + w are

linearly independent.

5.23. Show that the vectors v = A + i, 2i) and w = A, 1 + i) in C2 are linearly dependent over thecomplex field \320\241but are linearly independent over the real field R.

CHAP. 5] VECTOR SPACES 169

Recall that two vectors are linearly dependent (over a field K) if and only if one of them is a multiple of

the other (by an element in K). Since

A + i)w= A + 0A, 1 + 0 =

A + i, 2i) = v

v and w are linearly dependent over C. On the other hand, v and w are linearly independent over R since no

real multiple of w can equal v. Specifically, when \320\272is real, the first component of kw = (fc, \320\272+ ki) is real and

it can never equal the first component I + i of i> which is complex.

BASIS AND DIMENSION

5.24. Determine whether A, 1, 1), A, 2, 3),and B, \342\200\2241, 1) form a basis for the vector spaceR3.

The three vectors form a basis if and only if they are linearly independent. Thus form the matrix A

whose rows are the given vectors, and row reduce to echelonform:

I l\\ /1 1

1

0

The echelon matrix has no zero rows; hence the three vectors are linearly independent and so they form a

basis for R3.

5.25. Determinewhether A, 1, 1, 1), (I, 2, 3,2),B, 5,6, 4), B, 6, 8, 5) form a basisof R4.

Form the matrix whose rows are the given vectors, and row reduce to echelon form:

B =

The echelon matrix has a zero row; hencethe four vectors are linearly dependent and do not form a basis

ofR4.

5.26. Consider the vector space \320\240\342\200\236(\320\263)of polynomials in t of degree < n. Determine whether or not

I + t, t + t2,t2 + t3,... ,t\"~l+t\" form a basis of \320\240\342\200\236(\320\263).

The polynomials are linearly independent since each one is of degree higher than the preceding ones.

However, there are only n polynomials and dim Pn(r)= n + 1. Thus the polynomials do not form a basisof

1

1

2

2

1

2

5

6

1

3

6

8

1

2

4

5

1

0

0

0

1

1

3

4

1

2

4

6

1\\

I

2\"

3/

/1\302\260

1\302\260

1

1

0

0

1

2

-2

-2

1\\

I

-1/

/1

00

1

1

0

0

1

2

2

0

1

1

1

0

5.27. Let V be the vector space of real 2x2 matrices.Determine whether

\302\273-n

form a basis for V.

The coordinate vectors (see Section5.10)of the matrices relative to the usual basis are, respectively,

M=(l, 1,0, 0) [B] = @, 1, 1, 0) [C]=@, 0, 1,1) [D]= @, 0, 0, 1)

The coordinate vectors form a matrix in echelon form and hence they are linearly independent. Thus thefour corresponding matricesare linearly independent. Moreover, since dim V = 4, they form a basisfor V.

5,28. Let \320\243be the vector space of 2 x 2 symmetric matrices over K. Show that dim V = 3. [Recallthat A =

(\320\2601\321\203)is symmetric iff A = A

ror, equivalently, ai}

= a}i.]

170 VECTOR SPACES [CHAP. 5

-ftAn arbitrary 2x2 symmetric matrix is of the form A = ( L ) where a,b,ce K. (Note that there are

three \"variables.\") Setting

(i) a=l,fc = O, c = 0, (ii) a = 0, fr=l,c = 0, (iii) a = 0, b = 0, \321\201= 1

we obtain the respectivematrices

'\\ 0\\

oj

We show that {?,, ?2, ?3} is a basisof V, that is, that (a) it spans V, and (b) it is linearly independent.

(a) For the above arbitrary matrix A in V, we have

=(::)

-\320\241 \320\255-+ \320\254\320\225-,+ c?.

oj

Thus {?,, ?2, ?3}spans V.

(b) Suppose x?, + yE2 + zE3 = 0, where x, y, z are unknown scalars. That is, suppose

0\\ /0 1\\ /00\\

/0 0\\ /x A /0

Setting corresponding entries equal to each other, we obtain x = 0, \321\203\342\200\2240, z = 0. In other words,

x?, + yE2 + z?3 = 0 implies x = 0, \321\203= 0, z = 0

Accordingly, {?,, ?2, ?3} is linearly independent.

Thus {?,, ?2, ?3} is a basisof V and so the dimension of V is 3.

5.29. Consider the complex field \320\241which contains the real field R which contains the rational field Q.(Thus \320\241is a vector space over R, and R is a vector space over Q.)

(a) Showthat \320\241is a vector space of dimension 2 over R.

(b) Show that R is a vector space of infinite dimension over Q.

(a) We claim that {1,1} is a basis of \320\241over R. For if i> 6 C, then v = a + hi = a-l + bni where a, b e R;that is, {1, i} spans \320\241over R. Furthermore, if x \342\200\2421 + \321\203

\342\200\242i = 0 or x + yi= 0, where x, \321\2036 R, then

x = 0 and \321\203= 0; that is, {1,1}is linearly independent over R. Thus {1,i} is a basisof \320\241over R, and so

\320\241is of dimension 2 over R.

(h) We claim that, for any n, the set jl, n, n2,...,n\"} is linearly independent over Q. For suppose

a0 1 + \320\260,\320\273+ a2 n1 + \302\246\302\246\302\246+ \320\260\342\200\236\320\277\"= 0, where the a, e Q, and not all the at are 0. Then n is a root of a

nonzero polynomial over Q: a0 + a,x + a2x2 + \302\246\342\200\242\302\246+ anx\". But it can be shown that n is a transcen-transcendentalnumber, i.e., that n is not a root of any nonzero polynomial over Q. Accordingly, the n + 1 real

numbers 1, n, n2, ..., if are linearly independent over Q. Thus, for any finite n, R cannot be of dimen-

dimensionn over Q; R is of infinite dimension over Q.

5.30. Let S = {u,,u2, ..., un} be a subset of a vector space K. Show that the following two conditionsare equivalent: (a) S is linearly independent and spans V, and (b) every vector v e V can bewritten uniquely as a linear combination of the vectors in S.

Suppose (a)holds.Since S spans F, the vector \320\270is a linear combination of the u:; say,

1) =\320\260,\320\270,+ a2u2 + \302\246\302\246\302\246+ \320\260\320\277\320\270\342\200\236

Suppose we also have

b2u2 + \302\246\302\246\302\246+bnun

CHAP. 5] VECTOR SPACES 171

Subtracting, we get0 = v - v = (a, -

b,)\302\253,+ (a2 -b2)u2 + \302\246\342\200\242\302\246+ (\320\260\342\200\236

- bjun

But the ut are linearly independent; hence the coefficients in the above relation are each 0:

fli\342\200\224

^i= 0, u2 \342\200\224

b2= 0,..., \320\260\342\200\236

\342\200\224bn

= 0

Therefore, al = bx,a2=b2, ...,an

\342\200\224bn; hence the representation of v as a linear combination of the u, is

unique. Thus {a)implies (b).

Suppose (b) holds. Then S spans V. Suppose

0 = c,u, + c2u2+ \302\246\302\246\302\246+\321\201\342\200\236\320\270\342\200\236

However, we do have0 = 0u, + 0u2 + - \302\246\342\200\242+ 0\320\274\342\200\236

By hypothesis, the representation of 0 as a linear combination of the u, is unique. Hence each \321\201,= \320\236and the

\321\211are linearly independent. Thus (b)implies (a).

DIMENSION AND SUBSPACES

5.31. Find a basisand the dimension of the subspace W of R3 where:

(a) W={(a,b,c):a + b + c = 0}, (b) W = {{a, b, \321\201):a = b = c},

(c) W = (a, b, \321\201):\321\201= 3a}

(a) Note W \320\244R3 since, e.g., A, 2, 3)$ W. Thus dim ff < 3. Note u,= A, 0, - 1)and u2

= @, 1, -1) aretwo independent vectors in W. Thus dim W = 2 and so u, and u2 form a basisof W.

(b) The vector u = A, 1, 1) e W. Any vector w 6 W has the form w =(\320\272,\320\272,\320\272).Hence w = ku. Thus u spans

W and dim W = 1.(c) \320\230^# R3 since, e.g., A, 1, 1) ? \320\230'.Thus dim W < 3.The vectors \302\253,

= A, 0, 3) and u2 = @, 1, 0) belongto W and are linearly independent. Thus dim W = 2 and \302\253\342\200\236u2 form a basis of W.

5.32. Find a basis and the dimensionof the subspace W of R4 spanned by

\320\270\321\205=A,-4,-2,1), u2 = (l, -3,-1,2). u3=C, -8, -2,7).Apply the Row Space Algorithm 5.?A Form a matrix where the rows are the given vectors, and row

reduce it to echelon form:

Thenonzero rows in the echelon matrix form a basis of W and so dim W = 2. In particular, this means thatthe original three vectors are linearly dependent.

5.33. Let W be the subspaceof R* spanned by the vectors

Ul=(l, -2, 5, -3), \320\2702= B, 3, 1, -4) M3 =C,8,-3, -5).

(a) Find a basisand the dimension of W. (b) Extend the basis of W to a basis of the whole spaceR4.

(a) Form the matrix A whose rows are the given vectors, and row reduce it to echelon form:

A

-2 5 -3\\ /l -2 5 -32 3 1 -4 ~l0 7-9 23 8-3 -5/ \\0

14 -18 4,

The nonzero rows A, \342\200\2242,5, \342\200\2243)and @, 7, \342\200\2249,2) of the echelon matrix form a basis of the row spaceof A and henceof W. Thus, in particular, dim W \342\200\2242.

172 VECTOR SPACES [CHAP. 5

5.34.

(b) We seek four linearly independent vectors which include the above two vectors. The four vectors

A, -2, 5, -3), @, 7, -9, 2), @, 0, 1, 0), and @, 0, 0, 1) are linearly independent (since they form an

echelon matrix), and so they form a basis of R* which is an extension of the basis of W.

Let W be the subspace of R5 spanned by the vectorsu, =A, 2, -1, 3, 4), u2

= B, 4, -2, 6, 8),u3

= A, 3, 2, 2, 6), \320\2704= A, 4, 5, 1, 8), and \320\2705

= B, 7, 3, 3, 9). Find a subsetof the vectors which

form a basis of W.

Method 1. Here we use the Casting-Out Algorithm 5.8B. Form the matrix whose columns are the given

vectors and reduce it to echelon form:

1

2

-13

\\ 4

/10

00

\\o

2

4

-2

6

8

20000

13

2

2

6

1

1

0

00

145

1

8

1

2

0

0

0

2\\7

3

3 1

0

00

9/ \\0

2\\

3

-4

0

o/

20000

113

-1

2

1

2

6

-2

4

2\\3

5

-3

/1000

i/ \\o

2

0

0

0

0

1

I

0

0

0

1

2000

2\\

3

-4

0

The pivot positions are in columns \320\241\342\200\236C3, C5. Hence the corresponding vectors \320\270\342\200\236\320\2703,us form a basis ofW and dim W = 3.

Method 2. Here we use a slight modification of the Row Reduction Algorithm 5?A. Form the matrix

whose rows are the given vectors and reduce it to an \"echelon\" form but without interchanging any zero

rows:

/l2

1

1

\\2

2 -14 -23 2

4 5

7 3

3 4\\

6 8

2 6

1 8

3 9/

/l 2-1 3 4\\ /l 2-1 3 4\\

00000 000000 1 3-1 2-0 1 3-1 2

026-2 4 00000

\\0 3 5-3 1/ \\0 0-4 0 -5/

The nonzero rows are the first, third, and fifth rows; hence ut,u3, u5 form a basis of W. Thus, in particular,

dim W = 3.

535. Let V be the vector space of real 2x2 matrices.Find the dimension and a basis of the subspace

W of V spanned by

\320\247-: -\321\201 j) -\321\201 '0 :)The coordinate vectors (see Section 5.10)of the given matrices relative to the usual basis of V are as

follows:

[/J] = [1,2, -1, 3] [B]= [2,5, 1, -1] [C] = [5, 12,1,1] [D]Form a matrix whose rows are the coordinate vectors, and reduce it to echelon form:

/l

00

[3,4, -2,5]

CHAP. 5] VECTOR SPACES 173

The nonzero rows are linearly independent, hence the corresponding matrices I LI L and

0 0\\1 form a basis of W and dim W = 3. (Note also that the matrices A, B, and D form a basis of W.)1 \342\200\224

10/

THEOREMS ON LINEAR DEPENDENCE, BASIS, AND DIMENSION

5.36. Prove Lemma 5.10.

Since the v, are linearly dependent, there exist scalars a,, ..., \320\260\342\200\236,not all 0, such that

alvl + \302\246\302\246\302\246+ amvm = 0. Let \320\272be the largest integer such that ak \320\2440. Then

atvi + \302\246\302\246\302\246+ akvk +0vk+i + \302\246\302\246\302\246+0vm= 0 or \320\260,\320\270,+ \342\200\242-\342\200\242+ akvt = 0

Suppose \320\272= 1; then alvl = 0, \320\260,\320\2440 and so u, = 0. But the v, are nonzero vectors;hence\320\272> 1 and

vk= -a,\"'0,11, -\320\260\320\223\320\247-iUi-i

That is, vk is a linear combination of the preceding vectors.

5.37. Prove Theorem 5.11.

SupposeRn, Rm-t, ..., Ri are linearly dependent. Then one of the rows, say Rm, is a linear com-

combination of the preceding rows:

*m=<Wi*m+i + am+1Rm+2 + \302\246\302\246\302\246+ a,Rm (\302\246)

Now suppose the fcth component of Rm is its first nonzero entry. Then, since the matrix is in echelon form,

the fcth components of Rm+ \342\200\236...,/?\342\200\236are all 0, and so the fcth component of (\342\200\242)is

fl\302\253+i-0 + flM+2-0+--- + oII'0 = 0

But this contradicts the assumption that the fcth component of Rm is not 0. Thus J?,, ..., Rn are linearly

independent.

5.38. Suppose {vu...,vm} spansa vector space V. Prove:

(a) If w e V, then {w,vlt..., vm] is linearly dependent and spans V.

(b) If vt is a linear combination of vectors (vt, v2, .-., \"j-i), then {vu ..., ty-i, vi+l, ..., vm}

spans V.

(a) The vector w is a linear combination of the v, since {ii(} spans V. Accordingly, {w,vlt..., vm] is linearly

dependent. Clearly, w with the vt span V since the vt by themselves span V. That is, {w, i>,, ..., vK]

spans V.

(b) Suppose ii,= fc,u, + \302\246\342\200\242\342\200\242+ k,_ ,ti|_ i- Let \320\270e V. Since {vt} spans K, u is a linear combination of the vt,

say, \302\253= a,i!, + \342\200\242\342\200\242\342\200\242+ amvm. Substituting for vt, we obtain

\302\246\302\246\302\246+ k,.^,..,) + fl,+ lu,+ , + \342\200\242\302\246\342\200\242+ aMt)nl

= (a, +fl,k,)\302\253, + \302\246\342\200\242\342\200\242+(fli-, +ajfc,_,)i;i_1 +a1+1tI+l + \342\200\242\342\200\242\342\200\242+ amvm

Thus {\302\253\342\200\236..., u,_,, t),-+l,..., t)m} spans V. In other words, we can delete v{ from the spanning set and

still retain a spanning set.

5.39. Prove Lemma 5.13.

It suffices to prove the theorem in the case that the v{ are all not 0. (Prove!)Since {u,-} spans V, we haveby Problem 5.38 that

174 VECTOR SPACES [CHAP.5

is linearly dependent and also spansV. By Lemma 5.10, one of the vectors in (/) is a linear combination of

the preceding vectors.This vectorcannot be w,, so it must be one of the i>'s, say v-r Thus by Problem 5.38

we can delete vs from the spanning set U) and obtain the spanning set

K:i>, \302\253'j\342\200\236vj+1 \302\273\342\200\236} B)

Now we repeat the argument with the vector w2. That is, since B) spans V, the set

{w,, w2,vu...,vj.1, vj+l, ..., \302\273\342\200\236} (i)

is linearly dependent and also spans V. Again by Lemma 5.10. one of the vectors in C) is a linear com-

combination of the preceding vectors. We emphasize that this vector cannot be w, or w2 since {w,, ..., wm} is

independent; hence it must be one of the vs, say vk. Thus by the preceding problem we can deletevk from

the spanning set C) and obtain the spanning set

{w,, w2, i;,, ..., \302\273;_\342\200\236Vj+ \342\200\236.... Ui_,, |>\321\206.\342\200\236.... i'J

We repeat the argument with w3 and so forth. At each step we are able to add oneof the w's and deleteone of the v's in the spanning set. If m < n, then we finally obtain a spanning set of the required form:

\320\232 \302\253..^ vJ

Last, we show that m > n is not possible. Otherwise, after n of the above steps, we obtain the spanningset {w, *'\342\200\236}.This implies that wn +, is a linear combination of w,,..., wn which contradicts the hypothe-

hypothesisthat j w,} is linearly independent.

5.40. Prove Theorem 5.12.

Suppose {u,, u2,.. -, \302\253\342\200\236}is a basis of V, and suppose {u,, u2,...} is another basis of V. Since {uj spans\320\230,the basis {u,, u2, ...} must contain n or less vectors,or elseit is linearly dependent by Problem 5.39

(Lemma 5.13).On the other hand, if the basis {vt, v2,...} contains less than \302\253elements, then (u,, t<2, ...,un}is linearly dependent by Problem 5.39. Thus the basis {vt, v2, ...} contains exactly n vectors, and so the

theorem is true.

5.41. Prove Theorem 5.14.

Suppose \320\222= {w,, w2, ...,wn} is a basisof V.

(i) Since \320\222spans V, any n + 1 or more vectors are linearly dependent by Lemma 5.13.

(ii) By Lemma 5.13, elements from \320\222can be adjoined to S to form a spanning set of \320\243with n elements.Since S already has n elements, S itself is a spanning set of V. Thus S is a basisof V.

(iii) Suppose T is linearly dependent. Then some vt is a linear combination of the preceding vectors. ByProblem 5.38, V is spanned by the vectors in T without vt and there are n \342\200\2241 of them. By Lemma5.13,the independent set \320\222cannot have more than n \342\200\2241 elements. This contradicts the fact that \320\222

has n elements. Thus T is linearly independent and hence T is a basis of V.

5.42. Prove Theorem 5.15.

(i) Suppose{i>,, ..., vm\\ is a maximal linearly independent subset of S, and suppose w e S. Accordingly

{(),, ..., vm, w\\ is linearly dependent. No vk can be a linear combination of preceding vectors; hence w

is a linear combination of the i>;. Thus w 6 span i;, and hence S ^ span vt. This leads to

V = span S e span vt ^ \320\243

Thus {vj} spans V and, since it is linearly independent, it is a basisof V.

(ii) The remaining vectors form a maximal linearly independent subset of S and hence by part (i) it is a

basis of V.

5.43. Prove Theorem 5.16.

Suppose \320\222= {w,, w2,..., wn) is a basis of V. Then \320\222spans V and hence V is spanned by

S \320\270\320\222={\302\253,,u2, ..., ur, w,, vv2, .... wn}

CHAP. 5] VECTOR SPACES 175

By Theorem 5.15, we can delete from S <u \320\222each vector which is a linear combination of preceding vectorsto obtain a basis B' for V. Since S is linearly independent, no uk is a linear combination of precedingvectors. Thus B' contains every vector in S. Thus S is part of the basis B' for V.

5.44. Prove Theorem 5.17.

SinceV is of dimension n, any n+1 or more vectors are linearly dependent. Furthermore, since abasis of W consists of linearly independent vectors, it cannot contain more than n elements. Accordingly,dim W < n.

In particular, if {w,, ..., \320\275'\342\200\236}is a basis of W, then, since it is an independent set with n elements, it is

also a basis of V. Thus W = V when dim W = n.

ROW SPACE AND RANK OF A MATRIX

5.45. Determine whether the following matrices have the same row space:

\320\267 \320\270) \\\321\212 -2 -\321\212)\321\203\320\267_, 3/

Matrices have the same row space if and only if their row canonical forms have the same nonzerorows; hence row reduce each matrix to row canonical form:

5W\302\260

2)\320\267) \\o i \320\267)

>\\

\320\267

o/

to

/1\302\260

\\o

0

1

0

2

3

0\320\241= | 4 \342\200\2243\342\200\2241

J~ I 0 1

p(2 6/ \\0 0

Since the nonzero rows of the reduced form of A and of the reduced form of \320\241are the same, A and \320\241

have the same row space.On the other hand, the nonzero rows of the reduced form of \320\222are not the sameas the others, and so \320\222has a different row space.

3 5\\ / 1 2 3\\

5.46. Show that A = | 1 4 3 I and \320\222= I \342\200\2242 -3 -41 have the same column space.

19/ \\ 7 12 17/Observe that A and \320\222have the same column space if and only if the transposes AT and BT have the

same row space. Thus reduce ATand BTto row canonical form:

h l i\\/i

i

4 1 ~ 0 1 -2|~|0 1 -2 | to

\\5 3 9/ \\0-2

/I -2 7\\ /1 -2

Br= 2 -3 12-0 1 \342\200\22421~ ( 0 1 -2 | to

\\3-4 17/ \\0

2

Since /lr and BThave the same row space, A and \320\222have the same column space.

100

1

0

0

0

1

0

01

0

3

-2

0

3-2

0

176 VECTOR SPACES [CHAP. 5

5.47. Consider the subspace V = span (u,, u2, u3) and W = span (w,, w2, w3) of R3 where:

m,= A, 1, - 1), \320\2742

= B, 3, -1), \320\2743= C, 1, -

5)

\321\211=A, -1, -3), w2=C, -2, -8), w3= B, 1, -3)

Show that U = W.

Form the matrix A whose rows are the uf, and row reduce A to row canonical form:

I\321\205

'

~x\\ I\321\205

l

~x\\ il\302\260

~2\\/1=2 3 -1]~ 0 1 1 ~ 0 1 1

\\3 1 -5/ \\0 -2 -2/ \\0 0 0/

Next form the matrix \320\222whose rows are the w, and row reduce \320\222to row canonical form:

-2 -8-0 1 lUo 1 I

\\2 1 -3/ \\0 3 3/ \\00 0/

Since A and \320\222have the same row canonical form, the row spaces of A and \320\222are equal and so V = W.

5.48. Find the rank of the matrix A where:

1 2 -3\\

(a) A =\\

\\ _;\320\246

\320\244) A-

1 4-2/

(a) Sincerow rank equals column rank, it is easier to form the transpose of A and then row reduce toechelon form:

A

2 -2 -l\\ /l 2 -2 -l\\ /1 2 -2 -1>

2 1 -14|~[o

-3 36l~jo

-3 3 6

-3 0 3 -2/ \\06 -3 -5/ \\0 0 3 l)

Thus rank /1 = 3.

(b) The two columns are linearly independent sinceoneis not a multiple of the other. Thus rank /1 = 2.

5.49. Consider an arbitrary matrix A = (ay). Suppose u = (bl, ..., bn) is a linear combination of therows/?,,..., Rmof /l;say \320\270= fc,R, 4 + kmRm. Show that

bi = klaii + k2a2i+ \302\246\302\246\342\200\242+ kmani (i = 1,2,..., \320\270)

where au,..., ami are the entries of the ith column of A.

We aregiven u = fc,/?j + \342\200\242-\342\200\242+ km Rm ; hence

(bi, ..-A) =*,(\302\253!\342\200\236...,a,,) + \342\200\242\342\200\242\302\246

+*>\302\273i.-.O

=(*,\302\253,,+

-\342\200\242\342\200\242+*:\342\200\236\302\253\342\200\236\342\200\236...,k,aml + \342\200\242\302\246\302\246+ kmann)

Setting corresponding componentsequal to eachother, we obtain the desired result.

5.50. Suppose A =(alV)

and \320\222=(bi}) are echelon matrices with pivot entries:

CHAP. 5] VECTOR SPACES 177

aljl******

a2jj * * * *

an.

\\

B =

I

b,k, * * * *

\\ I

Suppose A and \320\222have the same row space. Prove that the pivot entries of A and \320\222are in the

same positions, that is, prove that /, = kl,j2 = k2,...,jr =/cr, and r = s.

Clearly A = 0 if and only if \320\222= 0, and so we need only prove the theorem when \320\263> 1 and s > 1.We

first show that j, = k,. Suppose^,< kt. Then thejtth column of \320\222is zero. Since the first row of A is in the

row space of B,we have by the preceding problem,

\302\260Ui= cfi + c20 + \342\200\242\342\200\242+ cm0 = 0

for scalars ct. But this contradicts the fact that the pivot elementaljl \320\2440. Hence j, > k,, and similarly

k, >jx. Thusj, = k,.Now let A' be the submatrix of A obtained by deleting the first row of A, and let B' be the submatrix of

\320\222obtained by deleting the firsC row of B. We prove that A' and B' have the same row space. The theoremwill then follow by induction since A' and B' are also echelonmatrices.

Let R = (a,, a2, ..., \321\217\342\200\236)be any row of A' and let R,, ..., Rm be the rows of B. SinceR is in the rowspace of B, there exist scalars dx, \302\246\302\246-,dm such that R =

<J,R, + d2 R2 + \302\246\302\246\302\246+ dmRm. Since A is in echelon

form and R is not the first row of A, thej,th entry of R is zero: a, = 0 for i = j, = fc,. Furthermore, since \320\222

is in echelon form, all the entries in the k,th column of \320\222are 0 except the first: \320\2541\320\272\321\210\320\2440, but 62ti = 0, ...,b^x

= 0. Thus

0 =akl

=J,bUi + d2 0 + \302\246- \302\246+ dm0 = dxblkl

Now blkl \320\2440 and so rft= 0. Thus R is a linear combination of R2,..., Rm and so is in the row space of B'.

SinceR was any row of A', the row space of A' is contained in the row space of B'. Similarly, the row space

of B' is contained in the row space of A'. Thus A' and B' have the same row space, and so the theorem is

proved.

5.51. Prove Theorem 5.8.

Obviously, if A and \320\222have the same nonzero rows then Chey have the same row space. Thus we onlyhave to prove the converse.

Suppose A and \320\222have the same row space, and supposeR \321\2040 is the ith row of A. Then there exist

scalars c,,..., c, such that

R = c2R2

where the R, are the nonzero rows of B. The theorem is proved if we show that R =Rt, or ct = 1 but ck

= 0

for \320\272\320\244i.

Let aijt be the pivot entry in R, i.e., the first nonzero entry of R. By (/) and Problem 5.49,

\302\253\320\276,= \321\201A* \321\201,b,Sl B)

But by the preceding problem bljtis a pivot entry of \320\222and, since \320\222is row reduced, it is the only nonzero

entry in thejjth column of B. Thus from B) we obtain aijt=

\321\201{\320\254\320\236\320\263However, aib = 1 and bih= 1 since A and

\320\222are row reduced; hence ct\342\200\2241.

Now suppose \320\272\320\244i, and bkJkis the pivot entry in Rk. By (/) and Problem5.49,

= C \320\220\320\273 ^2\320\273+ \302\246\302\246\342\200\242+ \321\201\320\233\320\273 (\302\246*)

Since \320\222is row reduced, bkjhis the only nonzero entry in the jkth column of B; henceby (J), aijk

=ckbkjk.

Furthermore, by the preceding problem akjkis a pivot entry of A and, since A is row reduced, aiJk

= 0. Thus

ck bkjk= 0 and, since

bkjk= 1, ct = 0. Accordingly R =

R,- and the Cheorem is proved.

178 VECTOR SPACES [CHAP. 5

5.52. Prove Theorem 5.9.

SupposeA is row equivalent to matricesAy and A2 where /4, and A2 are in row canonical form. Then

rowsp A = rowsp At and rowsp A = rowsp A2. Hence rowsp Ax = rowsp A2. SinceAl and \320\220\320\263are in row

canonical form, At= A2 by Theorem 5.8. Thus the theorem is proved.

5.53. Prove Theorem 5.18.

Let A be an arbitrary m x n matrix:

A =

/\302\25311 O\\2

\302\25321\302\25322

\\\302\253ml\302\253m2\302\246\302\246\342\200\242\302\253\321\202\320\273/

Let Ru R2,..., Rm denote its rows:

\320\257,=

(\320\260,\342\200\236\320\26012,...,\320\2601\321\217),...,Rm= (aml,am2,...,aJ)

Suppose the row rank is r and that the following r vectors form a basis for the row space:

S, =(*,,. ft,2 bln), S2 = (b2l, b22,..., b2n),...,Sr= (brl, br2,..., bj

Then each of the row vectors is a linear combination of the S,-:

R, =k,,S, + fc,2S2 +\302\246-\302\246+ klrSr

R2 = fc21S,+k22S2 +\302\246\302\246\302\246+ k2rSr

where the ktj are scalars. Setting the ith components of each of the above vector equations equal to eachother, we obtain the following system of equations, each valid for i = 1 n:

\302\2532.= k22b2i +

\302\246kmlb2i+-

+kubri

+ k2rbri

Thus, for i = 1 n.

Ik

= b,

In other words, each of the columns of A is a linear combination of the r vectors

Thus the column space of the matrix A has dimension at most r, i.e., column rank <r. Hence, column

rank < row rank.

Similarly (or considering the transpose matrix AT) we obtain row rank < column rank. Thus the row

rank and column rank are equal.

5.54. Suppose R is a row vector and A and \320\222are matrices such that RB and \320\220\320\222are defined. Prove:

(a) RB is a linear combination of the rows of B.

(b) Row space of \320\220\320\222is contained in the row space of B.

CHAP. 5] VECTOR SPACES 179

(c) Column spaceof AB is contained in the column space of A.

{d) rank AB < rank \320\222and rank AB < rank A.

(a) SupposeR = (a,, a2 aj and \320\222=(hit). Let B, Bm denote the rows of \320\222and B\\ ..., B\" its

columns. Then

RB = (R-Bl, R-B2, ...,R- B\=

(a,/>,, + a2621 + \302\246-\302\246+ ambml,a,bl2 + a2b22 + \302\246\302\246\302\246+ ambm2, ..., a,fcln + o2fc2n + \302\246\302\246-+ amfcM)

=\302\253|(\320\254\321\206,bl2,..., bj + a2F21, b12, .... fc2lI) + \342\200\242\342\200\242\302\246+ ajfcml) fcm2, ..., fcmA)

= a,B,+a2B2+- +amBm

Thus RB is a linear combination of the rows of B,as claimed.

(fc) The rows of AB are R,B where R, is the ith row of A. Thus, by part (a), each row of AB is in the rowspaceof B.Thus rowsp AB ^ rowsp B,as claimed.

(c) Using part (fc),we have:

colsp AB = rowsp (AB)T = rowsp BrAT \320\257rowsp AT = colsp A

{d) The row space of AB is contained in the row space of B; hence rank AB < rank B. Furthermore, the

column space of AB is contained in the column spaceof A; hence rank AB < rank A.

5.55. Let A be an \320\270-square matrix. Show that A is invertible if and only if rank A = n.

Note that the rows of the n-square identity matrix /\342\200\236are linearly independent since /\342\200\236is in echelon

form; hence rank /\342\200\236= n. Now if A is invertible then A is row equivalent to /\342\200\236; hence rank A = n. But if A is

not invertible then A is row equivalent to a matrix with a zero row; hence rank A < n. That is, A is

invertible if and only if rank A = n.

APPLICATIONS TO LINEAR EQUATIONS5.56. Find the dimension and a basis of the solution space W of the system

x + 2y + z - \320\227\320\263= \320\236

2-\321\205+ 4> + 4z - t = 0.

3x + 6> + 7z + t = 0

Reducethe system to echelon form:

x + 2y+z\342\200\224 3t = 0 x + 2y+z-3t = 02z + 5r = 0 or 2z + 5t = O

Az + lOf = 0

The free variables are >\302\246and t, and dim W = 2. Set:

(i) \321\203= 1, z = 0 to obtain the solution u, = (-2, 1,0, 0)

(ii) \321\203= 0, t = 2 to obtain the solution u2 = A1,0, \342\200\2245,2)

Then {u,, u2) is a basisof W. [The choice \321\203= 0, f = 1 in (ii), would introduce fractions in the solution.]

5.57. Find a homogeneoussystem whose solution set W is spanned by

{A, -2, 0, 3), A, -1,-1, 4), A, 0, -2, 5)}

Let v = (x, y, z, t). Form the matrix M whose first rows are the given vectors and whose last row is p;

and then row reduce to echelonform:

M

1

1

1

X

\342\200\2242

-1

0

\320\243

0

-1

-2z

345

l

1

0

0

0

_21

2

2x + y

0

-1

-2

z

31

2

-3x + t

180 VECTOR SPACES [CHAR 5

The original first three rows show that W has dimension 2. Thus v e W if and only if the additional row

does not increase the dimension of the row space. Hence we set the last two entries in the third row on the

right equal to 0 to obtain the required homogeneous system

2x + y + z =0

5x + \321\203 -t = 0

5.58. Let xh, xh, ..., xik be the free variables of a homogeneoussystem of linear equations with \320\270

unknowns. Let vs be the solution for which \321\205{.\342\200\224

1, and all other free variables= 0. Show that

the solutions vt,v2,---,vk are linearly independent.Let A be the matrix whose rows are the i\302\273,-,respectively. We interchange column I and column i,, then

column 2 and column i2,..., and then column \320\272and column ik ; and obtain the \320\272\321\205\320\277matrix

... 0 0 c,.\302\273+, ... i

... 0 0 c2k+l ... i

0 0 0 ... 0 1The above matrix \320\222is in echelon form and so its rows are independent; hence rank \320\222= \320\272.Since A and \320\222

are column equivalent, they have the same rank, i.e., rank A = k. But A has \320\272rows; hence these rows, i.e.,the t'j, are linearly independent, as claimed.

5.59. Prove Theorem 5.20.

Suppose u,, u2, ..., ur form a basis for the column space of A (There are r such vectors sincerank

A \342\200\224r.) By Theorem 5.19, each system AX = u, has a solution, say vt. Hence

Avt= ut, Av2 = u2,...,Avr =

ur (/)

Suppose dim W = s and w,, w2,..., *v, form a basis of W. Let

\320\222\342\200\224{l>,, D2, ..., Dr, *,, VV2, ..., W,}

We claim that ? is a basis of K\". Thus we need to prove that \320\222spans K\" and that \320\222is linearly independent.

(a) Proof that \320\222spans K\". Suppose v e K\" and Av = u. Then \320\270= Av belongs to the column space of A andhenceAv is a linear combination of the \321\211.Say

Av = k,u, + k2u2 + \302\246\302\246\302\246+ krur B)

Let v' = v \342\200\224k,D,

\342\200\224k2 v2

\342\200\224\342\200\242\302\246\302\246\342\200\224k, vr. Then, using G) and B),

A(v')= A(v -

ktvt-

k2 v2 k,vr)

= Av \342\200\224k1Avl

\342\200\224k2 Av2

\342\200\224\302\246\302\246\302\246\342\200\224krAvr

= Av \342\200\224k,u,

\342\200\224k2u2

\342\200\224- \302\246\302\246\342\200\224krur = Av \342\200\224Av = 0

Thus if belongs to the solution W and hence v' is a linear combination of the w,. Sayv' =

c,*v, + c2w2 + -\342\200\242\342\200\242+ csws. Then

r r sV = V'+ Xk.t)i= ZM-+ ZCiWJ(=1 i=l j=l

Thus i! is a linear combination of the elements in B, and hence \320\222spans K\".

(b) Proof that \320\222is linearly independent. Suppose

o,t), + a2v2 + \302\246\302\246\302\246+ arvr + d,w, + b2 w2 + \302\246\302\246- + b,ws = 0 (i)

SinceWje W, each A\\Vj

= 0. Using this fact and (/) and (i), we obtain

0 = A@) = a( ? oiVi + ? bjW) ^ta.Av.+ t bjAWj\\i=i j=i / i=i j=i

7=1

CHAP.5] VECTOR SPACES 181

Since u,,..., \321\206.are linearly independent, each a( = 0.Substituting this in C) yields

fc,w, 4- b2w2 + \302\246\302\246\302\246+ bsws = 0

However, \320\270>\342\200\236...,ws are linearly independent. Thus eachbt

= 0. Therefore, \320\222is linearly independent.Accordingly \320\222is a basis of K\". Since \320\222has r + s elements,we have r + s = n. Consequently,

dim W = s = n \342\200\224r, as claimed.

SUMS, DIRECT SUMS, INTERSECTIONS

5.60. Let U and W be subspaces of a vector space V. Show that: (a) U and VP are each contained in

V; (b) U + W is the smallest subspace of V containing U and W, that is, V + W = span (U, W),

the linear span of U and VF; (c) VF + W = \320\226.

(a) Let u e V. By hypothesis W is a subspaceof V and so 0 6 W. Hence u = u + 0eU + W. Accordingly,V is contained in V + W. Similarly, W is contained in U + W.

(b) Since U + W is a subspace of V containing both V and W, it must also contain the linear span of LJ

and W, that is, span (U,W)^U + W.

On the other hand, if v e V + W then v \342\200\224u + w=lu+l\302\273v where u 6 V and \302\273ve W; hence \320\270is

a linear combination of elements in V \320\270W and so belongs to span (I/, W). ConsequentlyV + W \321\201span (U, W).

(c) Since W is a subspace of V, we have that W is closed under vector addition; hence W + W ? W. Bypart (a), W ? W + W. Hence W + W = W.

5.61. Give an example of a subset S of R2 such that: (a) S + S <= S (properly contained); (b) S <= S + S

(properly contained); (c)S + S= Sbut S is not a subspace of R2.

(a) Let S = {@,5),@, 6), @, 7),...}. Then S + S<=S.(b) Let S = {@,0),@, 1)}. Then S \321\201S + S.

(c) Let S ={@, 0), @, 1), @, 2), @, 3),...}. Then S + S= S.

5.62. Suppose U and W are distinct 4-dimensional subspaces of a vector space V where dim V = 6.

Find the possible dimensionsof U n W.

Since U and W are distinct, U + W properly contains U and W; consequently dim (V + W) > 4.But dim (I/ + W) cannot be greater than 6, since dim V == 6. Hence we have two possibilities:

(i) dim (V + W)= 5, or (ii) dim (V + W) = 6. By Theorem 5.22,

dim (V n W) = dim V + dim W - dim (V + W) = 8 - dim (U + W)

Thus (i)dim (V nW) = 3, or (ii)dim (U n W)= 2.

5.63. Consider the following subspaces of R4:

U = span {A, 1, 0, -1), A, 2, 3, 0),B, 3, 3, -1)}W = span {A, 2, 2, -2), B, 3,2, -3), A, 3, 4, -3)}

Find (a) dim (U + W) and {b) dim (C/ n W).

(a) U + W is the space spanned by all six vectors. Henceform the matrix whose rows are the given six

vectors, and then row reduce to echelon form:

1

1

-1

-2/

1

1

2

1

IN

1

1

3

2

3

3

033224

\342\200\2241\\

0

-1

-2

-3

-\320\267/

/1

0

0

0

0

\\o

1

1

1

1

1

0

3

3

2

4

/10000

\\o

1

1

1

0

0

0

0

3

000

11-1

00o/

182 VECTOR SPACES [CHAP. 5

/1

'\320\276

0

0

0

\\\320\276

1

1

0

0

0

0

03

-1000

\342\200\2241\\

1'

-2

0

0\320\276

Since the echelon matrix has three nonzero rows, dim (V + W)= 3.

(b) First find dim V and dim W. Form the two matrices whose rows are the generators of U and W,respectively, and then row reduce each to echelonform:

1

2

3and

2

3

3

Sinceeachof the echelon matrices has two nonzero rows, dim V = 2 and dim W = 2. Using Theorem

5.22, i.e., dim (V + W) = dim V + dim W - dim (V n W), we have

3 = 2 + 2 - dim (V \320\276W) or dim (U n W) = 1

5.64. Let C/ and W be the following subspaces of R4:

U = {(a,b,c,d):b + c + d = 0}, W ={(a, b, c, d): a + b = 0, \321\201=

\320\251

Find a basis and the dimension of: (a) U,{b) W, (c) U n W, (d) U + W.

{a) We seek a basis of the set of solutions (a, h, c, d) of the equation

b + \321\201+ d = 0 or 0- a + b + \321\201+ d = 0

The free variables are a, c, and d. Set:

A) a = 1, \321\201= 0, d = 0 to obtain the solution t), = A, 0,0, 0)B) e = Q,c=l,J = 0toobtain the solution u2

= @, - 1, 1,0)

C) a = 0, \321\201= 0, d = 1 to obtain the solution v3= @, -1,0, I)

The set {vu v2, v3} is a basisof V, and dim U = 3.

(b) We seek a basis of the set of solutions (a, b, c, d) of the system

ora + 6= 0\321\201- Id = 0

Thefree variables are b and d. Set

A) b = 1, d = 0 to obtain the solution u, = (- 1,1,0, 0)

B) b = 0, d = 1to obtain the solution v2 = @,0, 2, 1)Theset {\302\253\342\200\236u2} is a basis of W, and dim W = 2.

(c) U n W consists of those vectors (a, b, c, d) which satisfy the conditions defining V and the conditionsdefining W, i.e., the three equations

b + c \342\200\2240

= 0

= 2d

ora + b

b- Vc

\321\201

+ d

-2d

= 0= 0= 0

CHAP. 5] VECTOR SPACES 183

The free variable is d. Set d = 1 to obtain the solution v = C, -3, 2, 1).Thus {v} is a basis oft/nW,and dim (U n W)

= 1.

(d) By Theorem 5.22,

dim (U + W) = dim U + dim W - dim (I/ o, W) = 3 + 2 - 1 = 4

Thus U + W = R*. Accordingly any basis of R4, say the usual basis, will be a basis of V + W.

5.65. Consider the following subspaces of R5:

V = span {A, 3, -2, 2, 3),A, 4, -3, 4, 2), B, 3, -1, -2, 9)}

W = span {A, 3, 0, 2, 1),A, 5, -6, 6, 3), B, 5, 3, 2, 1)}

Find a basis and dimension of (\320\260)\320\241/+ \320\230^,(b) I/ n VT.

(a) I/ + W is the space spanned by all six vectors. Hence form the matrix whose rows are the given six

vectors, and then row reduce to echelon form:

/1121

1

\\2

3

4

3

3

5

5

-2-3-1

0-6

3

2

4

-2

2

62

3\\

2

9

1

3 j

/10000

i/ \\o

/10000

\\o

3

1

-3

0

2-1

3

1

0

0

0

0

-2-1

3

2

-4

7

-2

-12000

22

-604

-2

2

2

0

0

00

3\\

3

-2

0

-5/

\342\200\242A

-1

_2

0

0

<v

0

0

0

0

\\o

3 -2

t 1

0 0

0 2

0 -20 6

2 3^

2 -1

0 0

0 -20

-\320\261/

The set of nonzero rows of the echelon matrix,

{A, 3, -2, 2, 3),@, 1, - 1, 2, - 1),@, 0, 2, 0, -2)}

is a basisof V + W\\ thus dim (V + W)= 3.

(b) First find homogeneous systems whose solution sets are V and W, respectively. Form the matrix

whose first three rows span V and whose last row is (x, y, z, s, t) and then row reduce to an echelonform:

-2 2 3\\

-I 2 -1

3 -6 32x + z - 2x + s - 3x + tj

3-2 2 3

1-1 2 -10 \342\200\224x + \321\203+ z 4x \342\200\224

2y + s \342\200\2246x + \321\203

0 0 0 0

Set the entries of the third row equal to 0 to obtain the homogeneous system whose solution space

isU:

-x + \321\203+ z = 0 4x \342\200\2242y + s = 0 -6x + \321\203+ t =0

184 VECTOR SPACES [CHAP. 5

Now form the matrix whose first rows span W and whose last row is (x, y, z, s, t) and then row

reduce to an echelon form:

1

I

2

X

3

5

5

\320\243

0

-6

3

z

262s

1\\3

1

4</

(I00

\\\302\260

1

0

0

0

\342\200\224

-3x

3

1

0

0

3

2

1

+ \320\243

-9x

0

-63z

0-3+ \320\252\321\203+

0

2

4

-2-2x + s

z 4x-

-\342\200\224X

2

2

-2y-40

1

2

1

+ \320\263

-s 2x

1

1

-y + t

0

Set the entries of the third row equal to 0 to obtain the homogeneous system whose solution space

is W:

-9x + \320\252\321\203+ z = 0 4x -ly + s = 0 2x -

\321\203+ t = 0

Combine both of the above systems to obtain a homogeneous system whose solution space is

V n W, and then solve:

or

-X+ \321\203+2 =0

4x - ly + s =0-6x + \321\203 + t = 0- 9x + \320\252\321\203+ z =0

4x - ly + s =0Ix-

\321\203 + t = 0

-x + \321\203+ z =0

ly + 4z + s =08z + 5s + It = 04z + 3s =0

\320\273-- 2r = 0

or

or

\342\200\224x + \321\203+ z =0

ly + 4z + s =0-5y-6z +r=0-6>>

- 8z =0

ly + 4z+ s =0\321\203+ 2z + \320\223= 0

\342\200\224x+ \321\203+ z =0

ly + 4z + s =08z + 5s + It = 0

s - It = 0

There is one free variable, which is t; hence dim (U n W)= 1. Setting ? = 2, we obtain the solution

x = 1, \321\203= 4, z = -

3, s = 4, l = 2.Thus {A, 4, -3, 4, 2)} is a basisof V n W.

5.66. Let U and W be the subspaces of R3 defined by

U =--{(a, b,c):a = b = c} and W = {@, b, c)}

(Note that W is the yz plane.) Show that R3 = U \321\204\320\230\320\233

Note first that U n W = {0},for v = (a,b,c) e U n W implies that

a \342\200\224b = \321\201 and a = 0 which implies a = 0, fc = 0, \321\201= 0

We also claim that R3 = V + W. For if v = (a. b, \321\201)\320\265R3, then

v = (a, o, a) + @, b -\320\260,\321\201- a) where (a, a, \320\260)6 U and @, b -

\320\260,\321\201\342\200\224a) 6 W

Both conditions, U n W = {0}and R3 = V + W, imply R3 = U \302\251W.

5.67. Let \320\232be the vector space of n-squarematrices over a field K.

{a) Show that V = U \302\251W where (\320\243and W are the subspaces of symmetric and antisymmetric

matrices, respectively. (Recall M is symmetric iff M = MT, and M is antisymmetric iff

MT= -M.)

(b) Show that V \321\204V \302\256W where U and W are the subspaces of upper and lower triangularmatrices, respectively.(Note V = U + W.)

CHAP. 5] VECTOR SPACES 185

(\321\217)We first show that V \342\200\224U + W. Let A be any arbitrary \320\273-square matrix. Note that

A = {(A + AT) + {(A -AT)

We claim that \\(A + AT) e U and that \\(A- AT) e W. For

(iH + AT))T= iH + AT)T = \\(AT + ATT)= \320\246\320\220+ AT)

that is, $(A + AT)is symmetric. Furthermore,

(\320\246\320\220-

AT))T =^A~ AT)T =\302\261(AT-A)= -\320\246\320\220

~ AT)

that is, |(/4 \342\200\224AT)is antisymmetric.

We next show that U n W = {0}.SupposeM e U n W. Then M = MT and MT = -M, which

implies M = -M and hence M = 0. Thus ?\320\243n W = {0}.Accordingly, V = U \302\251W.

(b) U n W \321\204{0} since U n W consists of all the diagonal matrices.Thus the sum cannot be direct.

5.68. Suppose U and W are subspaces of a vector space V, and suppose that S = {u,}spans U and

S' = {wj} spans \320\230\320\233Show that S \320\270S' spans I/ + \320\230\320\233(Accordingly, by induction, if S, spans Wt fori = 1,2, ..., n, then S,u-uS, spans W, + \342\200\242\342\200\242\342\200\242

+ Wn.)

Let \320\2706 U + W. Then v = \320\270+ w where \320\270e U and w e W. Since S spans I/, u is a linear combination ofthe u,'s, and since S' spans W. w is a linear combination of the wfs:

\320\270-fljUj, + a2 ui2 + \302\246\302\246- + \320\260\342\200\236\320\270^ \320\260,\320\265\320\232

w = blwh + b2wJl + ---+bmwJm bjeK

Thus

v = \320\270+ w =a{uit + \320\260\320\263uh + \302\246- \302\246+ \320\260\342\200\236u,m+ b{wjx + b2 wh + \302\246\302\246\302\246+ bm wJm

Accordingly, SuS' = {ui,i);} spans U + W.

5.69. Prove Theorem 5.22.

Observe that U n W is a subspaceof both V and W. Suppose dim V = m, dim W = n, and dim

(U r\\ W) = r. Suppose {vt,..., vr} is a basis of V r\\ W. By Theorem 5.16,we can extend {t>,} to a basis ofU and to a basis of W\\ say,

{vl,...,vr,ul,...,um^,} and {vl,...,vr,wl *>\342\200\236_,}

are bases of V and W, respectively. Let

\320\222= {t>,,..., ur, u,,..., um_r, w,, .... w,.r}

Note that \320\222has exactly m + n \342\200\224r elements. Thus the theorem is proved if we can show that \320\222is a basis ofU + W. Since {v(, Uj} spans U and {ff, wk] spans W, the union \320\222=

{\320\270,,u,, wt} spans U + W. Thus it

suffices to show that \320\222is independent.

Suppose

+ \342\200\242\302\246\302\246+ arvr + b,u, + \342\200\242\302\246\302\246+ bm_rum_r + c,w, + \302\246- \302\246+ cn_rwB_r = 0 (\320\243)

where a,,bj,ck are scalars. Let

v = atvt + --- + arvr + blUl + \302\246\302\246\302\246+ bm_rum_r B)

By (/), we also have that

t>=-c,w, cn.rw^, C)

Since {vn it;} ^ V,veU by B); and since [wk] \321\217W,veW by C). Accordingly, veVnW. Now {vt} is a

basis of U r^ W and so there exist scalars d,, .... dr for which \320\270==d,t>, + \302\246\302\246\342\200\2424- dt vr. Thus by C) we have

<V, + \302\246\302\246\342\200\242+ drvr + c,wi + \302\246\342\200\242\302\246+ <._rwn_, = 0

186 VECTORSPACES [CHAP. 5

But {Vi, wk} is a basis of W and so is independent. Hence the above equation forces c, =0, ..., \321\201\342\200\236_\320\263= \320\236.

Substituting this into (/), we obtain

axvx + \302\246\302\246\302\246+ arvr + btu, + \342\200\242\302\246\342\200\242+ bm_rum_r = 0

But {v,, uj} is a basis of U and so is independent. Hence the above equation forces \321\217,= 0, ..., a, = 0,

b, =0,...,*>\342\200\236,_,

= 0.

Since the equation (/) implies that the \321\217,,b} and ck are all 0, \320\222\342\200\224{t>,, uJt wk} is independent and the

theorem is proved.

5.70. Prove Theorem 5.23.

Suppose V = U \320\244W. Then any v e V can be uniquely written in the form v = \320\270+ w, where \320\270\320\265U and

weW. Thus, in particular, V = U + W. Now suppose v e U r\\ W. Then:

A) v = v + 0 where \320\270e [/, 0 e ^; and B) t> = 0 + t> where 0eU,veW

Since such a sum for v must be unique, i> = 0. Accordingly, l/n W= {0}.On the other hand, suppose V =U + W and U \321\212W = {0}. Let v e V. Since F = ?\320\243+ W, there exist

uel/ and w e W such that v = u + w. We need to show that such a sum is unique. Suppose also thatv = u' +\302\246w' where u' e U and w' ? W. Then

\320\270+ w \342\200\224v! + w' and so \320\270\342\200\224\320\270= w' \342\200\224w

But \320\270- u' e U and w' - w e W; hence, by(/nW= {0},

\320\270\342\200\224\320\270'= 0, w' \342\200\224w = 0 and so u == u', w = w'

Thus such a sum for v e V is unique and V = U \302\251W.

5.71. Prove Theorem 5.24 (for two factors): Suppose V - U \302\256W. Suppose S = {uu ..., um} and

\302\246S\"= (wn \342\200\242-->wn} are linearly independent subsets of I/ and W, respectively. Then:

(a) S \320\270S' is linearly independent in V; (b) if S is a basis of U and S' is a basis of \320\230^,then S \320\270S'

is a basis of K; and (c)dim V = dim I/ + dim W.

(a) Suppose \320\264,\320\270,+ \302\246\302\246\342\200\242+ amum + b,w, +\302\246\302\246\342\200\242\302\246+ \320\254\342\200\236wn

= 0, where e,, bt are scalars. Then

0 =(\302\253,U,+

- - - + amuj + (b,w, + \342\200\242\302\246\342\200\242+ bnwn) = 0 + 0

where 0, \320\271,\320\270,+ \342\200\242\302\246- + amum e V and 0, blwl + \302\246\302\246\302\246+ bnwn e W. Sincesuch a sum for 0 is unique, this

leads to

alu1+-+amum = 0 b,w, +\342\200\242\342\200\242

\342\200\242+.bn \320\270-\342\200\236= 0

Since S is linearly independent, each a, = 0, and sinceS' is linearly independent, each b} = 0. ThusS \320\270S' is linearly independent.

(b) By part (\321\217),S \320\270S' is linearly independent, and, by Problem 5.68, S \320\270S' span F. Thus S \320\270S' is a basisof F.

(c) Follows directly from part (b).

COORDINATE VECTORS

5.72. Let S be the basis of R2 consisting of u,= B, 1) and u2

= A,\342\200\224

1). Find the coordinate vector [t>]

of t> relative to S where t> = (a, b).

Set u = xu, + yu2 to obtain (a, b) = Bjc + y, x -y). Solve 2x + \321\203

= \321\217and jc \342\200\224\321\203

= b to obtainx =

(\321\217+ b)/3, >>=

(\321\217- 2b)/3. Thus [t>]=

[(\321\217+ b)/3, (\321\217- 2b)/3].

5.73. Consider the vector spaceP3(t) of real polynomials in t of degree < 3.(a) Show that S = {1, 1 - t, A

- tJ, A -tK} is a basis of P3(t).

(b) Find the coordinate vector [u] of \320\270= 2 - 3t + t2 + 2f3 relative to S.

CHAP.5] VECTOR SPACES 187

(a) The degree of A\342\200\224

t)k is k; hence no polynomial in S is a linear combination of preceding polynomials.Thus the polynomials are linearly independent and, since dim P3(f)= 4, they form a basis of P3(f).

(b) Set \320\270as a linear combination of the basis vectors using unknowns x, y, z, s:

\320\270= 2 - 3t + t2 + 2t3 = x(l) + y(l- t) + 2A

- tI + s(l -tK

= x(l) + ><1 -t) + z(l - 2t + t2) + s(l - 3t + 3t2 -

t3)

= X + >>- yi + 2 - 22t + 2t2 + S - 3\302\253t+ 3st2 - St3

= (x + \321\203+ z + s) + ( \342\200\224y- 2z -3s)t + B + 3s]tz + (-s)t3

Then set the coefficients of the same powers of t equal to each other:

x + y + z + s = 2 -y-2z-3s=-3 2 + 3s=l -s = 2

Solving, x = 2, \321\203= -5, 2 = 7, s = -2. Thus [u] = [2, -5, 7, -2].

/2 3\\

5.74. Consider the matrix A = I I in the vector space V of 2 x 2 real matrices.Find the coordi-

coordinatevector [/C] of the matrix A relative to <(J,

I I, I I, I H, the usual basis

of V.

We have

f2

Thus x = 2, \321\203= 3, z = 4, t = \342\200\2247.Hence [/4] = [2, 3, 4, \342\200\224

7], whose components are the elements of Awritten row by row.

Remark: The above result is true in general, that is, if A is any m x \320\270matrix in

the vector space V of m x \320\270matrices over a field K, then the coordinate vector [/C] of\320\233relative to the usual basis of V is the mn coordinate vector in Kmn whose com-

components are the elements of A written row by row.

CHANGE OF BASIS

This section will represent a vector of v e V relative to a basis S of V by its coordinate columnvector,

=[\320\260\342\200\236a2,...,aJT

(which is an \320\270x 1 matrix).

5.75. Consider the following bases of R2:

S, ={\320\270,

= A, -2), u2 = C, -4)} and S2= \\Vl = A, 3), v2

= C, 8)}

(a) Find the coordinates of an arbitrary vector v \342\200\224(a, b) in R2 relative to the basis

St ={\320\274\342\200\236\320\2742}.

(b) Find the change-of-basis matrix P from S, to S2.

(c) Find the coordinates of an arbitrary vector v = (a, b) in R2 relative to the basisS2= {vt,v2}.

(d) Find the change-of-basis matrix Q from S2 back to S,.

188 VECTORSPACES [CHAP.5

(e) Verify that Q = P~l.

if) Show that PO]S2= [d]Si for any vector v = (a, b). (SeeTheorem 5.27.)

(g) Show that P~ l[v~\\Sl= [d]S2 for any vector v = (a, /?).(SeeTheorem 5.27.)

(a) Let i; = xu, +\302\246yu2 for unknowns x and y:

x + \320\252\321\203= a x + \320\252\321\203

= \320\260>r or

\342\200\2242x \342\200\2244y = b 2y = 2\321\217

Solve for x and \321\203in terms of a and b to get x = \342\200\2242a \342\200\224\\b, \321\203

= \321\217+ \\b. Thus

(\321\217,b) = (\342\200\2242\321\217\342\200\224

jb)ut + (\320\271+ ^b)uz or [(a, b)]Sl= [ \342\200\2242\320\271\342\200\224\\b,a + \302\246

GM4M4)

(b) Use (a) to write each of the basis vectors i>, and v2 of S2 as a linear combination of the basis vectors\320\270,

and u2 of S|T

v, = A, 3) = (-2 - fK + A + l)u2 = (

t,2= C, 8) = (-6- 12)u, +C + 4)u2= -18u, + 7u2

Then \320\257is the matrix whose columns are the coordinates of v, and v2 relative to the basis S,, that is,

(c) Let v = xU[ + >u2 for unknown scalars x and y:

x + 3v = \321\217 x +\302\2463 v = \321\217or or

3x + 8y= b -y = b-3a

Solvefor x and \321\203to get x = \342\200\2248\321\217+ 3b, \321\203= \320\227\321\217\342\200\224b. Thus

(a, b)=

(-8\320\2714- 3b)i;, + (\320\227\320\271-

b)i;2 or [(\321\217,b)]Sz=

[-8\320\271+ 3b, \320\227\321\217- b]T

(<f) Use (c) to express each of the basis vectors \320\270\320\263and u2 of S, as a linear combination of the basis vectorsv, and d2 of S2 :

u, =A,

\342\200\2242) = ( \342\200\2248 \342\200\224

6)\320\270,+ C + 2)i;2=\342\200\224141\302\273,+\302\2465i>2

u2 = C, -4) = (-24 -12)u, + (9 + 4)u2 = -36i;, + 13u2

Write the coordinates of u, and u2 relative to S2 as columns to obtain 6 = 1).

(e) QP

(f) Use (\320\271),(b), and (c) to obtain

(g) Use(\320\271),(\321\201),and {d) to obtain

P 'Ms' = CMs'=l ^ \"\320\237 \321\217+ ib /V \320\227\320\271-\320\254

5.76. Suppose the following vectors form a basis S of K\":

Show that the change-of-basis matrix from the usual basis ? = {e,} of K\" to the basis S is the

matrix P whose columns are the vectorsVi,v2,...,vK,respectively.

CHAP. 5] VECTOR SPACES 189

Since\302\253\342\200\236\302\253;\302\253,form the usual basis E of K\", we have

Vl =(a1,a2,...,an) = a1e1+ a2e2

\302\273\320\263=

(*>i. \320\2542,.... bK)= b,e, + b2e2

\".= (ci> C2. \342\200\242\342\200\242\342\200\242.cK)=cle1 + c2e2

Writing the coordinates as columns we get

\320\273,b, ... c,\\

P = ... c2

\320\260\342\200\236\320\232\302\246\302\246\302\246Cnl

as claimed.

\321\201\342\200\236\320\265\320\270

5.77. Consider the basis S = {u,= A, 2,0), u2 = A, 3, 2),u3= @,1, 3)} of R3. Find:

(a) The change-of-basismatrix P from the usual basis ? = {et,e2,e3} of R3 to the basis S,

(b) The change-of-basismatrix Q from the above basis S backto the usual basis ? of R3.

(a) Since E is the usual basis, simply write the basis vectorsof Sas columns:

\342\200\242i A

3 1

\\02 3/

Method 1. Expresseachbasisvector of ? as a linear combination of the basis vectorsof S by first

finding the coordinates of an arbitrary vector v = (a, b, c) relative to the basis S. We have

or

x + \321\203

2x + 3y + z

\320\260

b

Solve for x, y, 2 to get x = la \342\200\2243b + \321\201,\321\203= \342\200\2246a + 3b \342\200\224

\321\201,\320\263= 4a \342\200\2242b + \321\201Thus

i; = (a, b, c) =G\320\271

- 3b + c)ut + (-6a + 3b - c)u2 + Da - 2b + c)u3

or Ms = [(a,b, c)]s =[7\320\271

- 3b + \321\201,-6a + 3b - c, 4a - 2b + c]T

Using the above formula for [i]s and then writing the coordinates of the e, as columns yields

et= A,0,0)= 7u,-6u2 + 4u3 /7-3

=@, 1, 0) = - 3u, + 3u2

- 2u3= @,0, 1)= ut- u2+ u3

and \320\241= I -6 3-1

4 -2 1/

Method 2. Find P\"l by row reducing M = (P j /) to the form (/ I P\"l):

190 VECTOR SPACES [CHAP. 5

5.78. Supposethe x and \321\203axes in the plane R2 are rotated counterclockwise45\302\260so that the new xaxis is along the line \321\203

= x, and the new y' axis is along the line \321\203= \342\200\224x. Find (a) the change-of-

basis matrix P and (b) the new coordinates of the point /4E, 6) under the given rotation.

(\321\217)The unit vectors in the direction of the new x' and y' axes are, respectively,

\320\270,=

\321\204/2, \321\204.\320\277)and u2 = (- \321\203\320\224/2.^/5/2)

(The unit vectors in the direction of the original x and \321\203axes are, respectively, the usual basis vectors

for R2.) Thus write the coordinates of \320\270,and u2 as columns to obtain

(b) Multiply the coordinatesof the point by P ':

/ ^/2/2 \320\2432/2\\/5\\=/\320\23772/2\320\247

4-^/2/2 \320\243\320\263/\320\263\320\233\320\261\321\203/\\ ,/2/2/

[Since P is orthogonal, P\" ' is simply the transpose of P.]

5.79. Considerthe bases S = {1, i} and S' = {1+ i, 1+ 2i}of the complex field \320\241over the real field R.Find (a) the change-of-basis matrix P from the S-basis to the S'-basis,and (b) find the change-of-

basis matrix Q from the S'-basis back to the S-basis.

(a) We have

1 + i = 1A) + 1@and so P -

1+ 2i = 1A) +- 2(i) -\320\241 \320\255

/ 2 -1\\(b) Usethe formula for the inverse of a 2 x 2 matrix to obtain Q = P ' = I

J.\\\342\200\2241 1/

5.80. Suppose P is the change-of-basis matrix from a basis {uj to a basis{w,},and suppose Q is the

change-of-basis matrix from the basis {w,} back to the basis {u,}. Prove that P is invertible and

Suppose,for i = 1, 2,..., n,\320\273

W;=

\320\271(,\320\270,+ anu2 + \302\246\302\246\302\246+ alnun = ? \320\262\321\206\320\274\321\203 (/)j=i

and, for \321\203= 1, 2,..., n,

\320\273

\"j=

\320\254\320\264Wl + bj2 W2 + \302\246\342\200\242\342\200\242+ \320\254>\302\273VB=

Z \320\254\320\224W* B>\302\273'1

Let A =(\321\2170)

and \320\222=(\320\254\320\264).

Then P = AT and Q = BT. Substituting B) into (/) yields

w> = z \302\253.Ii vw*)=

Sincethe {\302\273vf}is a basis ? \320\260\321\206\320\254^=

\320\271\320\271where eik is the Kronecker delta, that is, <5\320\231= 1 if i = fc but Sik

= 0 if

i ^ fe. Suppose AB \342\200\224{cik). Then cik = eik. Accordingly, AB = 1, and so

QP= BTAT =\320\230\320\222)\320\223

= IT = /

Thusg= P1.

CHAP. 5] VECTOR SPACES 191

5.81. Prove Theorem 5.27.

Suppose S = {u,,..., uB} and S' = {w,,..., w,,}, and suppose, for i = 1,.... \320\270,

\320\273

w,- = e,,u, + \320\260\320\273\320\2702+ \302\246\302\246\302\246+ \320\264\342\200\236,\320\270\342\200\236=

Y. auuj

Then P is the n-square matrix whose yth row is

ft

Also suppose v = fc,w, + k2 w2 + \342\200\242\342\200\242\302\246+ \320\232w* = X ^ wi \302\246Theni=l

Substituting for w; in the equation for v, we obtain

n \320\270/\320\270 \\ \320\262/ \320\262 \\

i=i i=i \\j=i / j=i \\i=i /

i +aijk2 + ---+\302\253\342\200\236\320\233\320\246-

Accordingly, [y~\\s is the column vector whose jth entry is

alJkl+a2Jk2 + -- + \320\260\321\217]\320\272\320\277 (J)

On the other hand, the ^th entry of P[i/)s. is obtained by multiplying the jth row of P by [t/)s., that is,(/) by B). However, the product of (\320\243)and B) is C); hence P\\v\\. and [>]s have the same entries. Thus

P[p]s.= [d]s. , as claimed.

Furthermore, multiplying the above by P~' givesP~l[v]s= P~lP[v]s-= [t>]s..

MISCELLANEOUS PROBLEMS

5.82. Consider a finite sequence of vectors S= {vt, v2, ..., t;n}. Let T be the sequence of vectors

obtained from S by one of the following \"elementary operations\":(i) interchange two vectors,

(ii) multiply a vector by a nonzero scalar, (iii) add a multiple of one vector to another. Show that

S and T span the same space W. Also show that T is independent if and only if S is independent.

Observethat, for each operation, the vectors in T are linear combinations of vectors in S. On the otherhand, each operation has an inverse of the same type (Prove!); hence the vectors in S are linear

combinations of vectors in T. Thus S and T span the same space W. Also, T is independent if and only if

dim W = n, and this is true if and only if S is also independent.

5.83. Let A =(\320\260\320\272))

and \320\222=(\320\254\321\206)

be row equivalent m x n matrices over a field K, and let y,, ...,vn be

any vectors in a vector space V over K. Let

u, =allVl +al2v2+-+alnvn w, =*>,,\302\273, + b12v2 + -\302\246\342\200\242+ blnvn

u2=a2lvl +a22v2 +\302\246\342\200\242\342\200\242+a2r,t\\, vv2=b21t>, + b22v2 + --+b2nvn

um= amlvl +am2v2+-- +amnvn wm

= bmlvl + bm2v2 + \302\246\302\246\302\246+ bmnvn

Show that {uj and {wj span the same space.

Applying an \"elementary operation\" of Problem S.82to {u,} is equivalent to applying an elementaryrow operation to the matrix A. Since A and \320\222are row equivalent, \320\222can be obtained from A by a sequenceof elementary row operations; hence {w,} can be obtained from {\302\253,}by the corresponding sequence of

operations.Accordingly, {u,} and {wj span the same space.

192 VECTOR SPACES [CHAP.5

5.84. Let vt, ...,vn belong to a vector space V over a field K. Let

w, =0,,\302\273, +al2v2 + -- + alnvn

a22v2

wB = an,f1 + an2v2 +\302\246\302\246

+ annvn

where atj e K. Let P be the n-square matrix of coefficients, i.e., let P =(ay).

(a) Suppose P is invertible. Show that {w,} and {i>,} span the same space; hence{w,}is indepen-

independentif and only if {v(} is independent.

(b) Suppose P is not invertible. Show that {w,} is dependent.

(c) Suppose{w,} is independent. Show that P is invertible.

(\321\217)Since P is invertible, it is row equivalent to the identity matrix /. Hence by the preceding problem {wjand {ti;} span the same space. Thus one is independent if and only if the other is.

(b) SinceP is not invertible, it is row equivalent to a matrix with a zero row. This means that {w,} spans a

space which has a spanning set of less than n elements. Thus {w,} is dependent.(c) This is the contrapositive of the statement of (b) and so it follows from (b).

5.85. Supposethat /t,, A2,... are linearly independent sets of vectors, and that A, ? A2 ? \302\246- \302\246.Show

that the union A = At \320\270A2 v \342\200\242\342\200\242\342\200\242is also linearly independent.

Suppose A is linearly dependent. Then there exist vectors t>,,..., vn e A and scalars au ..., \320\260\342\200\236\320\265\320\232,not

all of them 0, such that

atvt +a2v2 + \302\246\302\246+ anvn = 0 (\320\243)

Since A = \320\270A,and the v, e A, there exist sets Air ..., \320\2201\321\217such that

t>, e Ah, v2 eAi2,..., vn e A^

Let \320\272be the maximum index of the sets A t): fe = max(i, in). It follows then, since A, \320\257\320\2202\320\257---,that

each At. is contained in Ak. Hence vt, v2, ..., vne Ak and so, by (\320\243),/4t is linearly dependent, which

contradicts our hypothesis. Thus A is linearly independent.

5.86. Let \320\232be a subfield of a field L and L a subfield of a field E: that is, \320\232? L ? ?. (Hence\320\232is a

subfield of E.) Suppose that E is of dimension \320\270over L and L is of dimension m over K. Show

that ? is of dimension mn over K.

Suppose {t>, 1>\342\200\236}is a basis of E over L and {e,, ..., ej is a basis of L over K. We claim that

{flj t>j: i = 1,..., m,j = 1,..., n} is a basis of E over K. Note that {a,- Vj} contains mn elements.

Let w be any arbitrary element in E. Since {v,,..., vn\\ spans E over L, w is a linear combination of the

ti- with coefficients in L:

w = btvt + b2v2 + \302\246\302\246\302\246+ bnvn b(eL (\320\243)

Since {a,,..., am} spans L over K, each />, 6 L is a linear combination of the\320\260}

with coefficients in \320\232:

bi =knfli + fci2\302\2532+\"\302\246\342\200\242+ fcim\302\253m

b2 =fc21fl, + fc22a2 + \302\246\302\246\342\200\242+ k2mam

\320\232= fc^i + \320\272\342\200\2362\320\2602+ \302\246\302\246\302\246+ knmam

where\320\272\320\270

\320\265\320\232.Substituting in (\320\243),we obtain

w = (fcne, + \342\200\242\302\246\342\200\242+ klmam)vt + (fc2lfl, + \342\200\242\342\200\242\342\200\242+ k2majv2 + \302\246\302\246\302\246+(fcnlfl, + \342\200\242\302\246\302\246+ knmajv= kllalvl +\302\246\302\246\302\246+ klmamvl +fe21a,D2+-- + k2mamv2 + \302\246\302\246\302\246+ \320\272\342\200\236,\320\260^\342\200\236+ \302\246\302\246\302\246+ \320\272

i.j

CHAP. 5] VECTOR SPACES 193

where k^e K. Thus w is a linear combination of the asVj with coefficients in K; hence{\302\253,t>j}spans E

over K.The proof is complete if we show that {a,Vj} is linearly independent over K. Suppose, for scalars

Xj, ? K, ? XjActjVj)= 0; that is,

+ \321\20512\320\2732v, + \302\246\302\246\342\200\242+ \321\2051\321\217\320\262\321\217\320\270,)+ \342\200\242\302\246\342\200\242+ (jcMle,w,, + \321\205\342\200\2362\321\2172\320\270\342\200\236+ \302\246\302\246\302\246+ xnmamvn) = 0

--+\321\2051\320\266\320\260^,+ \302\246\342\200\242\302\246

+(\321\205\342\200\236,\320\271,+ \321\205\320\2772\320\2642+ \302\246\342\200\242\302\246+ x^eju. = 0

Since{t>1?...,\320\270\342\200\236}is linearly independent over L and sincethe above coefficients of the t>, belong to L, eachcoefficient must be 0:

-+\321\2051\321\202\320\264\321\202= 0, ....\321\205.^, + \321\205\342\200\2362\321\2172+ -\302\246\302\246+ \321\205\342\200\236\321\202\320\264\321\202

=

But {\320\273,,..., \321\217\321\202}is linearly independent over \320\232;hence, since the x}i e K,

\320\245\321\206=0, x12 = 0, ...,xlm = 0,...,x111=0, xn2= 0, ...,\321\205\342\200\236\321\202

= 0

Accordingly, {\320\262;1>\321\203}is linearly independent over \320\232and the theorem is proved.

Supplementary Problems

VECTOR SPACES

5.87. Let V be the set of ordered pairs (a, b) of real numbers with addition in V and scalar multiplication on V

defined by

(a, b) + (c,d)= (a + c,b + d) and k(a, b) = (ka, 0)

Show that V satisfies all of the axioms of a vector space except [M4]: \\u = u. Hence [M4] is not a

consequence of the other axioms.

5.88. Show that the following axiom [\320\233\320\233,can be derived from the other axioms of a vector space.[/44]For any vectors u, v e V, \320\270+ v = v +\302\246u.

5.89. Let V be the set of infinite sequences (\320\273,,a2,...) in a field \320\232with addition in V and scalar multiplication onV defined by

(\320\260,,\321\2172,...) + (b,, b2, ...) =(\320\271,+\320\254\342\200\236\321\2172+ b2,...)

<\321\201(\321\217\342\200\236\320\2602,...)=

(\320\233\320\260\342\200\236\320\272\320\2602,...)

where a^b^ke K. Show that \320\232is a vector space over K.

SUBSPACES

5.90. Determine whether or not W is a subspaceof R3 where W consists of those vectors (a, b, \321\201)\320\265R3 for which

(\321\217)a = 2b; (b) a < b < c;(c) ab = 0; (d) a = b = \321\201;(\320\265)\320\260= b1.

5.91. Let V be the vector space of n-square matricesovera field K. Show that W is a subspaceof V if W consistsof all matrices which are (a) antisymmetric (AT -

\342\200\224A),{b) (upper) triangular, (c) diagonal, (d) scalar.

5.92. Let AX = \320\222be a nonhomogeneous system of linear equations in n unknowns over a field K. Show that thesolution set of the system is not a subspace of K\".

194 VECTOR SPACES [CHAP. 5

5.93. Discusswhether or not R2 is a subspaceof R3.

5.94. Suppose U and W are subspacesof V for which U \320\270W is also a subspace. Show that either 1/eWor

5.95. Let V be the vector space of all functions from the real field R into R. Show that W is a subspace of V in

each of the following cases.

(a) W consists of all bounded functions. [Here /: R -\302\273Ris bounded if there exists MeR such that

(b) W consists of all even functions. [Here/: R -\302\273R is even if/(\342\200\224jc)=/(x), Vx 6 R.]

(c) W consists of all continuous functions.

(d) W consists of all differentiable functions.

(e) W consistsof all integrable functions in, say, the interval 0 <, x < 1.

(Thelast three casesrequire some knowledge of analysis.)

5.96. Let V be the vector space (Problem 5.106)of infinite sequences (\320\264\342\200\236a2, ...) in a field K. Show that W is asubspaceof V where (a) W consists of all sequenceswith 0 as the first component, and (b) W consists of all

sequences with only a finite number of nonzero components.

LINEAR COMBINATIONS,LINEAR SPANS

5.97. Show that the complexnumbers w = 2 + 3i and z = 1 - 2i span the complex field \320\241as a vector space overthe real field R.

5.98. Show that the polynomials A -tK, A

- tJ, 1 -t, and 1 span the space P3(t)of polynomials of degree < 3.

5.99. Find one vector in R3 which spans the intersection of U and W where U is the xy plane: U = {(a,b, 0)},and W is the space spanned by the vectors A, 2, 3)and A,

\342\200\2241, 1).

5.100. Prove that span S is the intersection of all the subspaces of V containing S.

5.101. Show that span S = span (S \320\270{0}). That is, by joining or deleting the zero vector from a set, we do not

change the spacespanned by the set.

5.102. Show that if S ? T, then span S s span T.

5.103. Show that span (span S) = span S.

5.104.Let Wt, W2, ... be subspaces of a vector space V for which W, ? W2 ? \302\246\342\200\242\302\246.Let W = W, \320\270W2 \320\270\302\246- \302\246.

(a) Show that W is a subspaceof V. (b) Suppose S, spans W, for i = 1, 2, Show that S = S, \320\270S2 \320\270\342\200\242\302\246\342\200\242

spans W.

LINEAR DEPENDENCE AND INDEPENDENCE

5.105.Determine whether the following vectors in R* are linearly dependent or independent:

(a) A, 3, -1, 4), C, 8, -5, 7), B, 9, 4, 23) (b) A, -2, 4, 1),B, 1,0, -3),C, -6, 1,4)

5.106. Let V be the vector space of polynomials of degree < 3 over R. Determine whether u, v, w e V are linearly

dependent or independent where:

(a) \320\270= t3 - 4t2 + 2t + 3, v - t3 + 2t2 + 4t \342\200\224I, w = 2t3 \342\200\224t2 \342\200\2243t + 5

(b) \320\270= t3 - 5t2 - It + 3, i; = t3 - 4t2 - 3t + 4, w = 2t3 - 7t2 - 7t + 9

CHAP.5] VECTOR SPACES 195

5.107. Show that (a) the vectors A \342\200\224i, i) and B, \342\200\2241 4- i) in C2 are linearly dependent over the complex field \320\241but

are linearly independent over the real field R; (b) the vectors C + \321\204.,1 4- ,/2) and G, 1 4-2^2) in R2 are

linearly dependent over the real field R but are linearly independent over the rational field Q.

5.108. Suppose {u,,..., ur, w,, ws} is a linearly independent subset of V. Prove that span \321\211n span Wy= {0}.

(Recall that span u,- is the subspace of V spanned by the u,.)

5.109. Supposevitv2,...,vnare linearly independent vectors. Prove the following:

(a) {flit1!, a2v2,...,an vj is linearly independent where each a; \320\2440.

(b) {\321\200\342\200\236..., !?,-_,, w, p,+ l, ..., fn} is linearly independent where w = 6,1), + \342\200\242\342\200\242\342\200\2424- b,i>, 4- \342\200\242\342\200\242\342\200\242+ \320\254\342\200\236\320\263\321\217and

5.110. Suppose (flu, ..., \320\224,\342\200\236),--., (flm|, \342\200\242-\342\200\242,\320\224\321\202\342\200\236)are linearly independent vectors in K\", and suppose t>,, ..., vn are

linearly independent vectors in a vector space V over K. Show that the vectors

are also linearly independent.

5.111. Suppose A is any \320\270-square matrix and suppose uIt u2, .... \320\272,are \320\270x 1 column vectors. Show that, if Ault

Au2, --., Aur are linearly independent (column) vectors, then u,, u2,..., ur are linearly independent.

BASIS AND DIMENSION

5.112. Find a subset of u,, u2, u3, u4 which gives a basis for W = span (ti,, u2,u3, u4)of R5 where:

(a) \302\253,= A, 1, 1, 2, 3),u2

= A, 2, -1,-2, 1),\302\253\320\267= C, 5, - 1, -2, 5), u4 = A, 2, 1, -1, 4)

{b) u, = A, -2, 1,3,- 1), u2 = (-2,4, -2, -6, 2), u3 = A, -3, 1,2, 1),u4

= C, -7, 3, 8, -1)(c) u, =

A, 0, 1,0, 1), u2 = A. 1,2, 1,0), u3 = A. 2. 3,1, 1),u4= A. 2. 1, 1,1)

(d) ti,= A, 0, 1,1, 1),u2

= B,1, 2,0, 1), \302\253\320\267= A, 1, 2, 3,4), \302\2534

= D, 2, 5, 4, 6)

5.113. Let U and W be the following subspaces of R4:

V = {(a,b,c,d):b-2c + d = 0} W = {(a,b, c, d): a = d, b = 2c]Find a basis and the dimension of (a) V, {b)W, (c) V \320\263\320\273W.

5.114. Find a basis and the dimension of the solution space W of each homogeneous system:

x + 2y-2z + 2s- 1= 0 x + 2y~ z + 3s-4t = 0x + 2y

- z + 3s - 2t = 0 2x + 4y- 2z - s + 5f = 0

2x + 4y - 7z + s + t = 0 2x + 4y- 2z + As - 2t = 0

(a) \302\2530

5.115. Find a homogeneous system whose solution space is spanned by the three vectors

A,-2,0,3,-1) B,-3,2,5,-3) A,-2,1,2,-2)

5.116.Let V be the vector spaceof polynomials in t of degree < n. Determine whether or not each of the following

is a basis of V:

(a) {1,14 U 1 + t + t2, I + t + t2 + t3,..., 1 + t + t2 H 1- t\"~l + t\"}

{b) {I +t,t + r2, t2 + r\\ ...,t\"-2 + f-\\f-1 + t\"}

5.117. Find a basis and the dimension of the subspace W of P(f)spanned by the polynomials

(a) \320\270= t* + 2t2 - 2f + 1, v = t3 + 3f2 - t + 4, and w = 2f3 4- r2 - 7f - 7

(b) u = f3 + t2 - It + 2, v = 2r3 4- t2 + t - 4, and w = 4t3 + 3t2 - 5t + 2

196 VECTOR SPACES [CHAP. 5

5.118. Let V be the space of 2 x 2 matrices over R. Find a basis and the dimension of the subspace W of V

spanned by the matrices

U 1) (_: 9 U D - U \020

ROW SPACE AND RANK OF A MATRIX

5.119. Consider the following subspaces of R3:

If, = span [A, 1, -1), B, 3, -1), C, 1, -5)]V2

= span [A, -1, -3), C, -2, -8), B, 1, -3)]

l/3 =span [A, 1, 1), A, -1, 3),C, -1, 7)]

Determine which of the subspaces are identical.

5.120. Find the rank of each matrix:

1

1

1

2

3

4

47

-2

1

2

-3

5

3

46

45

3

13

1

1

3

2

23

8

1

-3-2_7

<j

-2

0

-2-10

-3-4

-11-3

14

5

-1

1

5

8

-2

(c)

2

5

1

2

(fl) (b)

5.121. Show that if any row is deleted from a matrix in echelon (row canonical)form then the resulting matrix isstill in echelon (row canonical) form.

5.122. LetA and \320\222be arbitrary m x n matrices. Show that rank (A + B) < rank A 4- rank B.

5.123. Give examplesof 2 x 2 matrices A and \320\222such that:

(a) rank (A + B)< rank A, rank \320\222 (\321\201)rank {A + B) > rank A, rank \320\222

{b) rank (A + B)= rank A = rank \320\222

SUMS, DIRECT SUMS, INTERSECTIONS

5.124. SupposeV and W are 2-dimensional subspacesofR3. Show that U n* W \320\244{0}.

5.125. Suppose V and W are subspaces of \320\232and that dim V = 4, dim \320\230^= 5, and dim \320\232= 7. Find the possibledimensions of U n W.

5.126. Let V and W be subspaces of R3 for which dim U = 1,dim W = 2, and I/ ? W. Show that R3 = U \302\251W.

5.127. Let I/ be the subspace of R5 spanned by

A,3,-3,-1,-4) A,4,-1,-2,-2) B,9,0,-5,-2)

and let W be the subspace spanned by

A,6,2,-2,3) B,8,-1,-6,-5) A,3,-1,-5,-6)

Find (a) dim (V + W), (b)dim (U r\\ W).

CHAP. 5] VECTOR SPACES 197

5.128.Let V be the vector spaceof polynomials over R. Find (a) dim (I/ + W), {b) dim {V n W), where

V = span (f3 4- 4f2 - t + 3, r3 + 5r2 + 5, 3t' + 10r2 - 5f + 5)

W = span (r3 + 4f2 + 6, f3 4- 2t2 - t + 5, 2f3 4- 2f2 - 3r + 9)

5.129. Let V be the subspace of R5 spanned by

A.-1,-1,-2,0) A,-2,-2,0,-3) and A,-1,-2,-2,1)

and let W be the subspace spanned by

A, -2, -3, 0, -2) A, -1, -3, 2, -4) and A, -1, -2, 2, -5)

(a) Find two homogeneous systems whose solution spaces are V and W, respectively.

(b) Find a basis and the dimension of V c\\ W.

5.130. Let [/,, l/2, and U3be the following subspaces of R3:

U,={{a,b,c):a + b + c = 0} U2= {(a, b, c): a = c} l/3 =

{@, 0, \321\201):\321\201\320\265R}

Show that: (a) R3 = L/, 4- l/2, (b) R3 = l/2 + U3,(c)R3 = I/, + l/3. When is the sum direct?

5.131. SupposeV, V, and 1\320\243are subspaces of a vector space. Prove that

(V n F) + (t/ n W)c[/n(V + W)

Find subspaces of R2 for which equality does not hold.

5.132. The sum of arbitrary nonempty subsets (not necessarily subspaces) S and \320\223of a vector space V is defined

by S + T = {s+ t: s e S, t e T}. Show that this operation satisfies:

(a) Commutative law: S + T = T + S (c) S + {0} = {0}+ S= S(b) Associative law: (S, + S2)+ S3= S, + (S2 + S3) (d) S+V=V + S=V

5.133.Suppose Wx, W2,..., Wr are subspaces of a vector space V. Show that:

(a) spanfW,, \320\230-2,..., Wr)=

Wx + W2 + \342\200\242\342\200\242- + Wr

{b) If S, spans W<for i= 1,..., r, then S, u S2 u \302\246\302\246\302\246u Sr spans Wx + W2 H + Wr.

5.134. Prove Theorem5.24.

5.135.Prove Theorem 5.25.

5.136. Let V and W be vector spacesover a field K. Let F be the set of ordered pairs (u, w) where u belongs to V

and w to W: V = {(u, w): \320\270e V, w e W]. Show that V is a vector spaceover X with addition in V and

scalar multiplication on V denned by

(u, w) + (u\\ w') = (u + u', w + w') and k(u, w)= (ku, kw)

where \320\270,\320\270e V, w, w' e 1\320\243,and \320\272e K. (This space F is calledthe external direct sum of V and \320\230\320\233)

5.137. Let F be the external direct sum of the vector spaces U and W over a field K. (See Problem 5.136.)Let

U ={(\320\270,0): u 6 \320\241\320\243}and W = {@,w): w 6 \320\230^}.Show that

(\320\260)\320\241and W' are subspaces of V and that F = 0 \302\251#;

(fc) I/ is isomorphic to \320\241under the correspondence h\302\253-\302\273(u,0), and that W is isomorphic to W under the

correspondence \302\273v-\302\253->@,w);

(c) dim V = dim I/ + dim \320\230\320\233

5.138. Suppose F = V \321\204W. Let F be the external direct product of U and W. Show that V is isomorphic to \320\232

under the correspondence i> = \320\270+ w *-* {u, w).

198 VECTOR SPACES [CHAP.5

COORDINATE VECTORS

5.139. Consider the basis S = {u, =A, \342\200\2242),u2 = D, \342\200\2247)}of R2. Find the coordinate vector [t>] of v relative to S

where (a) v = C, 5),{b) v = A. 1),and (c)v = (a, b).

5.140. Consider the vector space P3(f) of polynomials of degree <3 and the basis S = {I, t + 1, t2 + t, t3 + t2} of

P3(r). Find the coordinate vector of v relative to S where (a) v = 2 - 3r + t2 + 2t3; {b) v = 3 \342\200\224It - t2; and

(c) v = a + bt + ct2 + dt3.

5.141. LetS be the following basis of the vector space W of 2 x 2 real symmetric matrices:

Find the coordinate vector of the matrix A e W relative to the above basis S where [a) A = I I

and(bM

CHANGE OF BASIS

5.142. Find the change-of-basis matrix \320\257from the usual basis E = {{I,0), @, 1)} of R2 to the basis S, the change-of-basismatrix Q from S back to E, and the coordinate vector of v = (a, b) relative to S where

{a) S= {A,2),C,5)} (c) S = {B, 5),C, 7)}

(b) S = {A, -3), C, -8)} {d) S= {B, 3), D, 5)}

5.143. Consider the following bases of R2: S = {u,= A, 2), u2 = B, 3)} and S' = {vt = A, 3), v2 = A, 4)}.Find:[a)the change-of-basis matrix P from S to S',and {b) the change-of-basis matrix Q from S' back to S.

5.144. Supposethat the x and \321\203axes in the plane R2 are rotated counterclockwise 30\302\260to yield new x and \321\203axes

for the plane. Find: (a) the unit vectors in the direction of the new x' and / axes, (b) the change-of-basis

matrix P for the new coordinate system, and (c) the new coordinates of each of the following points underthe new coordinate system: A(l, 3), BB, -5), C(a, b).

5.145. Find the change-of-basismatrix P from the usual basis E of R3 to the basis S, the change-of-basis matrix Qfrom S back to E, and the coordinate vector of v = [a, b, c) relative to S where S consists of the vectors:

(a) \320\270,=A. 1,0), u2= @,1,2), \302\2533= @,1,1) (c) u,=(l,2, 1), u2

= A, 3, 4), u3= B, 5, 6)

(b) \321\211= A, 0, 1), u2

= (I, 1, 2), u3= A, 2, 4)

5.146. Suppose S,, S2,and S3 are bases of a vector spaceV, and suppose P is the change-of-basis matrix from S,to S2and Q is the change-of-basis matrix from S2 to S3. Provethat the product PQ is the change-of-basis

matrix from S, to S3 .

MISCELLANEOUS PROBLEMS

5.147.Determine the dimension of the vector space W of \320\270-square: (a) symmetric matrices over a field K,

{b) antisymmetric matrices over K.

5.148.Let \320\232be a vector space of dimension n over a field K, and let \320\232be a vector space of dimension m over asubfield F. (HenceV may also be viewed as a vector space over the subfield F.) Provethat the dimension of

V over F is mn.

CHAP. 5] VECTOR SPACES 199

5.149. Let t,, t2,..., tn be symbols, and let \320\232be any field. Let V be the set of expressions

a,f, 4- a2f2 + \302\246\302\246+ aKtn where af e \320\232

Define addition in V by

(a,f, + a2J2 + --+eltfB) + (b1f1 +M2+\342\200\242-\342\200\242+ \320\234\342\200\236)

=(\302\253i+ *i)ti + (\302\2532+ h)t2 + \302\246\302\246\302\246+ (\320\260\342\200\236+ \320\254\342\200\236I\342\200\236

Define scalar multiplication on V by

fc(a,f, +a2t2 + \302\246\302\246\302\246+ antn) = fca,i, + fca2r2 + \302\246\342\200\242\342\200\242+ kantn

Show that \320\232is a vector space over \320\232with the above operations. Also show that {f\342\200\236..., tn] is a basis of V

where, for i = 1,.... n,

tj= Of , + \342\200\242\342\200\242\302\246+0tf_, + ltf + 0r,+ , + \342\200\242\302\246\342\200\242+ 0fn

Answers to Supplementary Problems

5.90. (a) Yes. (d) Yes.

(b) No; e.g.,A, 2, 3) e W but -2A, 2, 3) \321\204W. (e) No; e.g., (9, 3,0)\320\265\320\230'but 2(9, 3, 0) \321\204W.

(c) No; e.g., A,0, 0),@, 1, 0) e W, but not their sum.

5.92. X = 0 is not a solution of AX = B.

5.93. No. Although one may \"identify\" the vector (a, b) e R2 with, say, (a, b, 0) in the xy plane in R3, they aredistinct elementsbelonging to distinct, disjoint sets.

5.95. (a) Let/, g e W with Mf and Mg bounds for/and g, respectively. Then for any scalars a, b e R,

\\{cf+ bg)(x)\\ = |a/(x) + 69(x)| < \\af[x)\\ +\\bg(x)\\ = |fl||/(x)l + IHlflMI

That is, |a | \320\233/\320\263+ | b | Mg is a bound for the function af+ bg.

(b) (a/+ b\302\273K- x) =

fl/(- x) + b?(- x)=

a/(x) + bfl(x) = (a/ + \320\254\320\264\320\250

5.99. B,-5,0).

5.105. (fl) Dependent, (b) Independent.

5.106. (a) Independent, (b) Dependent.

5.107. (a) B, - H- i)= A + 0A - i,0; (b) G,1 + 2^/2) = C -

^\320\245\320\227+ Jl 1 +

5.112. (\320\260)\320\274\342\200\236u2,u4; (b) \302\253,,u3; (\321\201)\320\270\342\200\236\320\2702,\320\2703,\320\2744;(d) \320\270\342\200\236u2,\302\2533.

5.113. (a) Basis, {A, 0, 0,0),@, 2, 1, 0), @, - 1,0, 1)};dim G = 3.

(b) Basis,{A, 0.0.1), @, 2, 1.0)}; dim W = 2.(c) Basis, {@, 2,1, 0)}; dim (U n W) = 1. Hint. I/ n W must satisfy all three conditions on a, b, \321\201and d.

5.114. (a) Basis, {B, -1,0,0, 0),D,0,1, -1,0),C,0,1,0,1)};dim tt'= 3.

(b) Basis,{B,- 1,0,0,0), A, 0, 1, 0, 0)};dim tt' = 2.

5.115.f 5x + \321\203

- ;

Ix+y-.

200 VECTOR SPACES [CHAP.5

5.116.(a) Yes, (b) No. For dim \320\243= n + 1, but the set contains only n elements.

5.117. (a) dim W = 2, (b) dim W = 3

5.118. dim W = 2

5.119. l/,andl/2.

5.120. (a) 3, (b) 2, (c) 3, (d) 2

\302\2534 \302\273)\342\200\242-(: .

5.125. dim (V nW) = 2,3or4.

5.127.(a) dim {V + W) = 3, (b) dim (U nW) = 2.

5.128. (a) dim {U + W) = 3, dim (U n \320\2300= 1.

(b) {A, -2,-5, 0, 0), @, 0, 1, 0, - 1)}.dim (V n W) = 2.

5.130. The sum is direct in (b) and (c).

5.131. In R2, let V, V, and W be, respectively, the line \321\203= x, the x axis, and the \321\203axis.

5.139. (a) [-41,11], (b) [-11,3], (c) [-7o -4b,2a + b]

5.140. (a) [4,-2,-1,2], (b) [4,-1,-1,0], (c) la-b + c-d,b-c + d,c-d,d]

5.141.(a) [2,-1,1], (b) [3,1,-2]

* '-

., p-(J J), ,\302\273,

e-(J J)

5.144. (\320\260)(\320\243\320\267/2,|),(-|,\320\243\320\267/2)

A)

CHAP. 5] VECTORSPACES 201

(c) M =[(\320\243\320\227

- 3)/2, A +

[\320\222]=

HZfi + 5I2, B -5\320\243\320\267)/2],

5.145. Since E is the usual basis, simply let \320\257be the matrix whose columns are \320\270,,u2, u3. Then Q = P~x and

5.147. (\320\260)\320\277(\320\270+ l)/2, (b) n(n- l)/2

5.148. Hint: The proof is almost identical to that given in Problem 5.86 for the special case when V is an extensionfield of K.

Chapter 6

Inner Product Spaces, Orthogonality

6.1 INTRODUCTION

The definition of a vector space V involves an arbitrary field K. In this chapter we restrict \320\232to be

either the real field R or the complex field C. Specifically, we first assume, unless otherwise stated orimplied, that \320\232= R, in which case V is called a real vector space,and in the last sections we extend ourresults to the case that \320\232= \320\241,in which case V is called a complexvector space.

Recall that the concepts of \"length\" and \"orthogonality\" did not appear in the investigation of

arbitrary vector spaces (although they did appear in Chapter 2 on the spaces R\" and C). In this chapter

we place an additional structure on a vector space V to obtain an inner product space,and in this

context these concepts are defined.As in Chapter 5, we adopt the following notation (unlessotherwise stated or implied):

V the given vector spaceu, v, w vectors in V

\320\232 the field of scalars

a, b, c, or \320\272 scalars in \320\232

We emphasize that V shall denote a vector space of finite dimension unless otherwise stated orimplied. In fact, many of the theorems in this chapter are not valid for spaces of infinite dimension. This

is illustrated by some of the examples and problems.

6.2 INNER PRODUCT SPACES

We begin with a definition.

Definition: Let V be a real vector space. Suppose to each pair of vectors u, v e V there is assigned areal number, denotedby <\302\253,i>>. This function is called a (real) inner product on V if itsatisfies the following axioms:

[7J (Linear Property) <a\302\253!+ bu2, v~>= a(uu v} + b(u2, v>

U2] (Symmetric Property) <\302\253,v}= <i\\ \302\253>

[73] (Positive Definite Property) <w, \302\253>> 0; and <\302\253,u> = 0 if and only if \320\270= 0.

The vector space \320\243with an inner product is calleda (real) inner product space.

Axiom [7j3 is equivalent to the following two conditions:

(a) <\302\253!+ u2, v} =<\302\253l5v> + <\321\2132,t;> and (b)

Using [/t] and the symmetry axiom [72], we obtain

<u, cvt + dv2y=

<\302\253;!+ dv2, \302\253>= c<t;l5 u> + d(v2, \302\253>

= c(u, v

202

CHAP. 6] INNER PRODUCT SPACES, ORTHOGONALITY 203

or, equivalently, the two conditions

(a) <u, r, + v2> = <u, r,> + <u, v2} and {b) <u, b> = k(u, v}

That is, the inner product function is also linear in its second position (variable). By induction, we

obtain

<a,u, + \342\200\242\342\200\242\342\200\242+a,ur, v} =a,<u,, r> + a2<u2, i>> +

\342\200\242\342\200\242\302\246+ ar(ur, v}

and

Combining these two properties yields the following general formula:

The following remarks are in order.

Remark 1: Axiom [/,] by itself implies that

<0, 0> = <0t;, 0> =0<t;, 0> = 0

Accordingly, [/,], [/2], and [/3] are equivalent to [/,], [/2], and the following axiom:

\320\223_/'\320\267]If \320\270\321\2040, then <\302\253,\321\213>> \320\236

That is, a function satisfying [/,], [/2], and [7'3] is an inner product.

Remark 2: By [/3], <u, u) is nonnegativeand hence its positive real square root

exists. We use the notation

This nonnegative real number || \320\270\\\\ is called the norm or length of u. This function doessatisfy the axioms of a norm for a vector space. (See Theorem 6.25 and Section 6.9.)The relation || u ||2 =

<u, u> will be frequently used.

Example 6.1. Consider the vector space R\". The dot product (or scalar product) in R\" is defined by

\320\270\302\246v = albl + a2b2 + \302\246\302\246\302\246+ \320\260\342\200\236\320\254\342\200\236

where \320\270= (a,) and v = (bt). This function defines an inner product on R\". The norm || \320\270|| of the vector \320\270\342\200\224(a,) in

this space follows :

On the other hand, by the Pythagorean Theorem, the distance from the origin \320\236in R3 to the point P{a, b, c),

shown in Fig. 6-1, is given by ^/a2 + b2 + c2. This is precisely the same as the above defined norm of the vectorv = (a, b, c) in R3. Since the Pythagorean Theorem is a consequenceof the axioms of Euclidean geometry, the

vector space R\" with the above inner product and norm is called Euclidean n-space. Although there are many ways

to define an inner product on R\", we shall assume this as the inner product on R\", unless otherwise stated or

implied; it is called the usual inner product on R\".

204 INNER PRODUCT SPACES, ORTHOGONALITY [CHAP.6

P{a. b. c)

*\302\246v

Fig. 6-1

Remark: Frequently, the vectors in R\" are represented by n x 1 column matrices.In such a case,the usual inner product on R\" (Example 6.1) may be defined by

<\302\253,v\")= uTv

Example 6.2

(a) Let V be the vector space of real continuous functions on the interval a < t < b. Then the following is an inner

product on V:

</,<?>= \\ fU)g(t)dt

where/(t) and g(t) are now any continuous functions on [a, b].

(b) Let V be again the vector space of continuous functions on the interval a <t < b. Let w{t) be a given contin-continuousfunction which is positive on the interval a < t < b. Then the following is also an inner product on V:

</, 9> = 4t)f(t)g(t) dt

In this case, w(t) is calleda weight function for the inner product.

Example 6.3

(a) Let V denote the vectorspaceof m x n matrices over R. The following is an inner product in V:

(A, B> = tr {BTA)

where tr stands for trace, the sum of the diagonal elements. If A = (a,7)and \320\222=(\320\254\321\206),

then

<\320\233,\320\222>= tr (BTA) =tt \320\260\320\270\321\214\321\207

the sum of the products of corresponding entries. In particular,

I2 =

the sum of the squares of all the elementsof A.

CHAP. 6] INNER PRODUCT SPACES, ORTHOGONALITY 205

(b) Let V be the vectorspaceof infinite sequences of real numbers (\320\2601\320\263\320\2602,..\302\246)satisfying

? a] =a\\ + a\\ + \302\246\302\246\302\246< oo

i.e., the sum converges.Addition and scalar multiplication are defined componentwise:

(al,a1,...) + (bl,b2,...) = {al + fc,, a2 + b1,...)

k(al,a2,...) = (kai,ka2,...)An inner product is defined in V by

The above sum convergesabsolutely for any pair of points in V (Problem 6.12); hence the inner product is well

defined. This inner product spaceis calledl2space(or Hilbert space).

6.3 CAUCHY-SCHWARZ INEQUALITY, APPLICATIONS

The following formula (proved in Problem 6.10) is called the Cauchy-Schwarz inequality; it is usedin many branches of mathematics.

Theorem 6.1 (Cauchy-Schwarz):Forany vectors u,v e V,

<\321\213,vy2 < <u, u><d, u> or, equivalently, | <u, i/> | < || \320\270|| || v ||

Next we examine this inequality in specific cases.

Example 6.4

(a) Consider any real a,,..., an, fc,,..., bn. Then by the Cauchy-Schwarz inequality,

(a,fc, + a2b2 + \302\246\302\246\302\246+ \320\260\342\200\236\320\254\342\200\236J< (a2 + \302\246\302\246\302\246+ \320\260.\320\245\320\254?+ - \342\200\242\342\200\242+ fc2)

that is, (u\342\200\242

vJ < || \320\270\\\\\320\263|| v ||2 where \320\270= (a,)and v = (bt).

(b) Let/and g be any real continuous functions defined on the unit interval 0 < t < 1. Then by the Cauchy-Schwarz inequality,

(</\302\273\302\2732= /@0@^1 < f\\t)dt g\\t)dt=\\\\f\\\\1\\\\g\\\\2\\Jo / Jo Jo

HereV is the inner product spaceof Example 6.2(a).

The next theorem (proved in Problem 6.11) gives basic properties of a norm; the proof of the third

property requires the Cauchy-Schwarz inequality.

Theorem 6.2: Let \320\232be an inner product space. Then the norm in V satisfies the following properties:

[Ni] II v || > 0; and || v || = 0 if and only if v = 0.

DV,1 II kv || = Ifcl || v \\\\.

The above properties [\320\233\320\223,],[N2], and [N3] are those that have been chosen as the axioms of anabstract norm in a vector space (see Section6.9).Thus the above theorem says that the norm denned byan inner product is an actual norm. The property [N3] is frequently called the triangle inequality

because if we view u + uas the side of the triangle formed with u and v (as shown in Fig. 6-2), then [N3]states that the length of one side of a triangle is less than or equal to the sum of the lengths of the othertwo sides.

206 INNER PRODUCT SPACES, ORTHOGONALITY [CHAP. 6

\320\270-tv

The following remarks are in order.

Remark 1: If || \320\270|| = 1, or, equivalently, if <u, u> = 1, then \320\270is called a unit vectorand is said to be normalized. Every nonzero vector v e V can be multiplied by the

reciprocal of its length to obtain the unit vector

v = v\\\\v\\\\

which is a positive multiple of v. This process is called normalizing v.

Remark 2: The nonnegative real number d{u, v) = || \320\270\342\200\224v || is called the distancebetween \320\270and v; this function does satisfy the axioms of a metric space (seeTheorem

6.19).

Remark 3: For any nonzero vectors u, v e V, the angle between \320\270and v is

defined to be the angle \320\262such that 0 < \320\262< \320\277and

\320\270 v

By the Cauchy-Schwarz inequality, \342\200\2241 < cos \320\262< 1 and so the angle \320\262always exists

and is unique.

6.4 ORTHOGONALITY

Let V be an inner product space. The vectors u, v e V are said to be orthogonal and u is said to be

orthogonal to v if

<\302\253.\302\273>= 0

The relation is clearly symmetric; that is, if \321\213is orthogonal to v, then <d, \320\270>= \320\236and so v is orthogonal

to u. We note that 0 e V is orthogonal to every v e \320\232for

<0, D> = <0l>,D> =0<l>, l>> = 0

Conversely, if \320\270is orthogonal to every v e V, then <u, u> = 0 and hence u = 0 by [/3]. Observe that \320\270

and u are orthogonal if and only if cos 6 = 0 where \320\262is the angle between u and v, and this is true if and

only if \320\270and v are \"perpendicular,\" i.e., \320\262=\320\273/2(or \320\262= 90\.

Example 6.5

(a) Consideran arbitrary vector \320\270\342\200\224

(a,, a2,..., \320\260\342\200\236)in R\". Then a vector \320\270=(\321\205\342\200\236\321\205\320\263,\321\205\342\200\236)is orthogonal to \320\270if

<u, t>>= a,x, + o2x2 + \302\246\302\246\302\246+ anxn = 0

CHAP. 6] INNER PRODUCT SPACES, ORTHOGONALITY 207

In other words, v is orthogonal to \320\270if v satisfies a homogeneousequation whose coefficients are the elementsof u.

(b) Suppose we want a nonzero vector which is orthogonal to t>,= A, 3, 5) and v2

= @, 1, 4) in R3. Let

w = (x, y, z). We want

0 =<ti,, \320\270>>= x + \320\252\321\203+ 5z and 0 =

<t>2, \320\270>>=\321\203+ 4z

Thus we obtain the homogeneous system

x + 3y + 5z = 0 \321\203+ 4r = 0

Set z = 1 to obtain \321\203= \342\200\2244 and x = 7; then w = G, \342\200\2244,1) is orthogonal to vl and v2- Normalizing w, we

obtain

which is a unit vector orthogonal to r, and i>2 \302\246

Orthogonal Complements

Let S be a subsetof an inner product space V. The orthogonal complement of S, denoted by SL

(read \"S perp\") consists of thosevectors in V which are orthogonal to every vector \320\270\320\265S:

Sx = {v e V : <d, u> = 0 for every \321\2136 S]

In particular, for a given vector u in V, we have

u1={v e \320\232: <d, u> = 0}

That is,uL consists of all vectors in V which are orthogonal to the given vector u.

We show that Sx is a subspace of V. Clearly OeS1 since 0 is orthogonal to every vector in V. Now

suppose v, w e S1. Then, for any scalars a and b and any vector ueS.we have

<od + bw, u> = a<D,\320\270>+ b<w, \321\213>=\320\260-0+ \320\254*0= 0

Thus av + bw e SL and therefore SL is a subspaceof 1\320\233

We state this result formally.

Proposition6.3: Let S be a subset of an inner product space V. Then S1 is a subspaceof 1\320\233

Remark 1: Suppose \320\270is a nonzero vector in R3. Then there is a geometricaldescriptionof u1. Specifically, ux is the plane in R3 through the origin \320\236and perpen-perpendicular to the vector u, as shown in Fig. 6-3.

208 INNER PRODUCT SPACES,ORTHOGONALITY [CHAP. 6

Remark 2: Consider a homogeneoussystem of linear equations over R:

\302\25311*1+\302\25312*2 + -- + \302\253ln*n= O

\302\25321*1+ \302\25322*2 + \342\200\242\342\200\242\342\200\242+ \302\2532\342\200\236*\320\273= \320\236

Recall that the solution space W may be viewed as the solution of the equivalent

matrix equation AX = 0 where A =(a^) and X =

(xf). This gives another interpreta-interpretationof W using the notion of orthogonality. Specifically, each solution vector

v = (x,, x2,..., xn) is orthogonal to each row of A; and consequently W is the orthog-orthogonalcomplement of the row space of A.

Example 6.6. Suppose we want to find a basis for the subspace uL in R3 where \320\270= A, 3, \342\200\2244).Note uL consists ofall vectors (x, y, z) such that

<(x, y, z), A, 3. -4)> =0 or x + \320\252\321\203- 4z = 0

Thefree variables are \321\203and z. Set

A) \321\203= - 1. z = 0 to obtain the solution w, = C, - 1,0)

B) \321\203= 0, z = 1 to obtain the solution w2 = D.0, 1)

Thevectors \320\270>,and w2 form a basis for the solution space of the equation and hence a basis for u1.

Suppose W is a subspace of V. Then both W and WL are subspaces of V. The next theorem,whoseproof (Problem6.35)requires results of later sections, is a basic result in linear algebra.

Theorem 6.4: Let W be a subspace of V. Then V is the direct sum of W and W1, that is,V =W@W\\

Example 6.7. Let W be the z axis in R3, i.e., W = {@, 0, \321\201):\321\2016 R}. Then WL is the xy plane, or, in other words,W1 = {(a,b.0):a, be R} as shown in Fig. 6-4. As noted previously, R3 = W \302\251W1.

Fig. 6-4

6.5 ORTHOGONAL SETS AND BASES,PROJECTIONSA set S of vectors in V is called orthogonal if each pair of vectors in S are orthogonal, and S is called

orthonormal if S is orthogonal and each vector in S has unit length. In other words,S ={\321\211,\320\2702,..., u,}

CHAP. 6] INNER PRODUCT SPACES, ORTHOGONALITY 209

is orthogonal if

<u,, Uj>= 0 for i \320\244)

and S is orthonormal if

\" J IJ11 for i! = j

Normalizing an orthogonal set S refers to the process of multiplying each vector in S by the recipro-

reciprocalof its length in order to transform S into an orthonormal set of vectors.A basis S of a vector space V is called an orthogonal basis or an orthonormal basis according as S is

an orthogonal set or an orthonormal set of vectors.The following theorems, proved in Problems 6.20 and 6.21,respectively, apply.

Theorem 6.5: Suppose S is an orthogonal set of nonzero vectors. Then S is linearly independent.

Theorem 6.6 (Pythagoras): Suppose {ux,u2,..., ur} is an orthogonal set of vectors.Then

Here we prove the above Pythagorean Theorem in the special and familiar case for two vectors.

Specifically, suppose <u, i/> = 0. Then

|| \320\270+ v ||2 =<\320\270+ v, \320\270+ u> =

<\321\213,\321\213>+ 2<\321\213,\320\270>+ <i>, i>>=

<\320\270,\321\213>+ <\320\270,\320\270>= || \320\270||2 + || v ||2

which gives our result.

Example 6.8

(a) Consider the usual basis ? of Euclidean 3-space R3:

E = {e, =A, 0, 0), e2 = @, 1,0),e3=

@, 0, 1)}

It is clear that

<\302\253?\342\200\236\302\253?2>= <e,,e3> =

<\302\253?2,\302\253?3>= 0 and <\302\253?\342\200\236\302\253?,)

=<\302\253?2,\302\253?2>

=<\302\253?3,\302\253?3>

= 1

Thus E is an orthonormal basis of R3. More generally, the usual basis of R\" is orthonormal for every \320\270.

(b) Let V be the vector space of real continuous functions on the interval \342\200\224n < t < n with inner product defined

by </, 0> =J\" \342\200\236f(t)g{t) dt. The following is a classicalexample of an orthogonal subset of V:

{1, cos t, cos 2r,..., sin t, sin 2\320\263,...}

The above orthogonal set plays a fundamental role in the theory of Fourier series.

(c) Considerthe following set S of vectors in R4:

S={u = A, 2, -3, 4), t> = C, 4, 1. -2), \320\272= C, -2, 1, 1}Note that

<u, t>>= 3+8-3 + 8 = 0 <u, w> = 3-4-3+4 = 0 <t>, w> = 9-8+1-2 = 0

Thus S is orthogonal. We normalize S to obtain an orthonormal set by first finding

||u||2= 1 +4+9 + 16=30 ||t,||2= 9 + 16+ 1 +4 = 30 II w ||2 =9 +4+ 1 + 1 = 15

Then the following form the desiredorthonormal set of vectors:

\320\231=A/v^Ol 2\320\224/\320\227\320\236,-\320\227\320\233/\320\227\320\236,4/\320\243\320\267\320\236)

v =(\320\227\320\224/\320\227\320\236,4\320\224/30, 1\320\224/\320\227\320\236,-2\320\224/\320\227\320\236)

w =(\320\227\320\264/15,-2\320\224/\320\227\320\236,1\320\224/\320\2425,1/\320\243\320\2425)

We also have \320\270+ v + w = G, 4, - 1,3) and || \320\270+ v + \320\270-1|2= 49 + 16 + 1+ 9 = 75.Thus

II \320\270||2 + || v ||2 + || w ||2 = 30 + 30 + 15= 75 = || \320\270+ v + w ||2

which verifies the Pythagorean Theorem for the orthogonal set S.

210 INNER PRODUCT SPACES, ORTHOGONALITY [CHAP. 6

Example 6.9. Consider the vector \320\270= A, 1, 1, 1) in R4. Suppose we want to find an orthogonal basis of ir, the

orthogonal complement of u. Note that u1 is the solution spaceof the linear equation

x+y+r+r=0 (/)

Find a nonzero solution t>, of (/), say t>,= @, 0, 1, \342\200\224

1). We want our secondbasisvector v2 to be a solution to (/)and alsoorthogonal to t>,, i.e., to be a solution of the system

x + \321\203+ z + t = Q r-f = 0 B)Find a nonzero solution v2 of B), say v2

= @, 2, \342\200\2241, \342\200\2241). We want our third basis vector to be a solution of (/) and

also orthogonal to t>, and v2, i.e., to be a solution of the system

x + \321\203+ z + t = 0 2y- z-t = 0 \320\263- 1 = 0 (i)

Find a nonzero solution of C), say t>3= (

\342\200\2243, 1, 1, 1).Then {t>,, v2, v3} is an orthogonal basis of ux. (Observe that

we chose the intermediate solutions t>, and v2 in such a way that each new system is already in echelon form. Thismakes the calculations simpler.) We can find an orthonormal basis for u' by normalizing the above orthogonalbasis for u1. We have

|| y, II2 =0 + 0+ 1 + 1 = 2 ||t>2||2 =0+4+ 1 + 1=6 ||i>3||2= 9+1 + 1 + 1 = 12

Thus the following is an orthonormal basis for uL.

t-, =(\320\276,\320\276,1\320\264\320\224

-i/\320\243\320\263) t-2

= (\320\276,2/^6,-

W6, -1/^6) t-\320\267= (- 3/\320\243\320\2422,1/\320\243\320\2422,1/\320\243\320\2422,1/\320\243\320\2422)

Example 6.10. Let S consist of the following three vectors in R3:

\321\211= A, 2, 1) u2 = B, 1, -4) u3

= C, -2, 1)

Then S is orthogonal since u,, u2, and u3 are orthogonal to each other:

<iils u2> = 2 + 2-4 = 0 <\320\274\342\200\236u3> = 3-4+I =0 <u2lu3>=6-2-4 = 0Thus S is linearly independent and, sinceS has 3 elements, S is an orthogonal basis for R3.

Suppose we want to write v = D, 1, 18)as a linear combination of u,, u2, u3. First set v as a linear com-

combination of u,, u2, u3 using unknowns x, >-, z as follows:

D, 1, 18)= x(l,1,1) + M2, 1, -4) + zC, -2, 1) (/)

Method 1. Expand (/) to obtain

x + 2y + 3r = 4 2x + \321\203- 2z = 1 x -

4y + \320\263= 18

from which x = 4, \321\203= \342\200\2243, z = 2. Thus v = 4u,

\342\200\2243u2 + 2u3.

Method 2. (This method uses the fact that the basis vectors are orthogonal, and the arith-

arithmetic is much simpler.) Take the inner product of (/) with u, to get

D, 1, 18)-A, 2, l) = x(l, 2, l)-(l. 2, 1) or 24 = 6x or x = 4

(The two last terms drop out since u, is orthogonal to u2 and to u3.) Take the inner product of

(/) with u2 to get

D, 1, 18)-B, 1, -4) =><2, 1, -4)-B, 1, -4) or -63 = 2ly or y=-3

Finally, take the inner product of (/) with u3 to get

D. 1, 18)-C, -2, l)=zC, -2, l)-C, -2, 1) or 28 = 14z or z = 2

Thus i> = 4u, \342\200\224\320\252\320\2702+ 2u3.

The procedure in Method 2 in Example 6.10 is true in general; that is,

Theorem 6.7: Suppose {\320\270,,u2,..., un} is an orthogonal basis for V. Then, for any v e V,

v = -, T \"i + 7 \320\263\321\2132+ \"'\" +

<\302\2532.\302\2532> <\302\253\342\200\236,\302\253\342\200\236>

(See Problem 6.5 for the proof.)

CHAP.6] INNER PRODUCT SPACES, ORTHOGONALITY 211

Remark: The above scalar,

'\"<\302\253!,\302\253.> II\". I2

is called the Fourier coefficientof \320\270with respect to uf since it is analogous to a coeffi-

coefficient in the Fourier series of a function. This scalar also has a geometric interpretation

which is discussed below.

Projections

Consider a nonzero vector w in an inner product space V. For any v e V, we show (Problem 6.24)

D, VV> <t>, VV>

that

\321\201=

, w> || w ,11 2

is the unique scalar such that v' = v \342\200\224cw is orthogonal to w. The projection of v along w, as indicatedby Fig. 6-5, is denoted and definedby

\302\246, > (v, w> <i>, w>pro] (i>, w)

= cw = w =-\342\200\224-j

w

<w,w> \\\\w\\\\2

The scalar \321\201is also called the Fourier coefficientof v with respect to w or the component of v along w.

Fig. 6-5

Example 6.11

(a) We find the component \321\201and the projection cw of v = A, 2, 3, 4) along w = A, \342\200\2243,4, \342\200\2242)in R4. First we

compute

<i>, \320\270>>=1 -6+ 12-8= -1 and || w ||2 = 1 +9+ 16 + 4 = 30

Then \321\201= -^\320\267and proj {v, w) = cw = (- 3^,jb, -

\320\224,jq).

(b) Let ^ be the vector space of polynomials with inner product </, g> = fJ /(OffU) \320\233-We find the component(Fourier coefficient) \321\201and the projection eg of/(f) = It \342\200\2241 along g(r)

= f2. First we compute

. i<\302\253.\342\200\236>.

Then \321\201= | and proj (/, g) = eg = 5r2/6.

The above notion may be generalized as follows.

Theorem 6.8:  Suppose w1, w2, ..., wr form an orthogonal set of nonzero vectors in V. Let v be any vector in V. Define

    v' = v - c1w1 - c2w2 - ··· - crwr        where        ci = <v, wi>/<wi, wi>

Then v' is orthogonal to w1, w2, ..., wr.


Note that the ci in the above theorem are, respectively, the components (Fourier coefficients) of v along the wi. Furthermore, the following theorem (proved in Problem 6.31) shows that c1w1 + ··· + crwr is the closest approximation to v as a linear combination of w1, ..., wr.

Theorem 6.9:  Suppose w1, w2, ..., wr form an orthogonal set of nonzero vectors in V. Let v be any vector in V and let ci be the component of v along wi. Then, for any scalars a1, ..., ar,

    ||v - ∑ ckwk|| ≤ ||v - ∑ akwk||

The next theorem (proved in Problem 6.32) is known as the Bessel inequality.

Theorem 6.10:  Suppose {e1, e2, ..., er} is an orthonormal set of vectors in V. Let v be any vector in V and let ci be the Fourier coefficient of v with respect to ei. Then

    ∑ ci² ≤ ||v||²

Remark: The notion of projection includes that of a vector along a subspace, as follows. Suppose W is a subspace of V and v ∈ V. By Theorem 6.4, V = W ⊕ W⊥; hence v can be expressed uniquely in the form

    v = w + w'        where w ∈ W, w' ∈ W⊥

We call w the projection of v along W and denote it by proj (v, W). (See Fig. 6-6.) In particular, if W = span (w1, ..., wr), where the wi form an orthogonal set, then

    proj (v, W) = c1w1 + c2w2 + ··· + crwr

where ci is the component of v along wi, as above.

Fig. 6-6

6.6 GRAM-SCHMIDT ORTHOGONALIZATION PROCESS

Suppose {v1, v2, ..., vn} is a basis for an inner product space V. One obtains an orthogonal basis {w1, w2, ..., wn} for V as follows. Set

    w1 = v1
    w2 = v2 - (<v2, w1>/||w1||²)w1
    ···
    wn = vn - (<vn, w1>/||w1||²)w1 - (<vn, w2>/||w2||²)w2 - ··· - (<vn, wn-1>/||wn-1||²)wn-1

In other words, for k = 2, 3, ..., n, we define

    wk = vk - ck1w1 - ck2w2 - ··· - ck,k-1wk-1

where cki = <vk, wi>/||wi||² is the component of vk along wi. By Theorem 6.8, each wk is orthogonal to the preceding w's. Thus w1, w2, ..., wn form an orthogonal basis for V, as claimed. Normalizing each wk will then yield an orthonormal basis for V.

The above construction is known as the Gram-Schmidt orthogonalization process. The following remarks are in order.

Remark 1: Each vector wk is a linear combination of vk and the preceding w's; hence, by induction, each wk is a linear combination of v1, v2, ..., vk.

Remark 2: Suppose w1, w2, ..., wr are linearly independent. Then they form a basis for U = span (w1, ..., wr). Applying the Gram-Schmidt orthogonalization process to the wi yields an orthogonal basis for U.

Remark 3: In hand calculations, it may be simpler to clear fractions in any new wk by multiplying wk by an appropriate scalar, as this does not affect the orthogonality (Problem 6.76).
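For concreteness, here is a minimal sketch of the process for Rⁿ with the usual inner product, assuming NumPy (the function name `gram_schmidt` is ours). It mirrors the formulas above, without the hand-calculation trick of clearing fractions; the data is that of Example 6.12 below:

    import numpy as np

    def gram_schmidt(vectors):
        """Return an orthogonal basis spanning the same space as `vectors`."""
        ws = []
        for v in vectors:
            w = np.array(v, dtype=float)
            for prev in ws:
                # Subtract the component of v along each previous w (Theorem 6.8).
                w -= (np.dot(w, prev) / np.dot(prev, prev)) * prev
            ws.append(w)
        return ws

    basis = gram_schmidt([(1, 1, 1, 1), (1, 2, 4, 5), (1, -3, -4, -2)])
    for w in basis:
        print(w)   # (1,1,1,1), (-2,-1,1,2), and a multiple of (16,-17,-13,14)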

The following theorems, proved in Problems 6.33 and 6.34, respectively, use the above algorithm and remarks.

Theorem 6.11:  Let {v1, v2, ..., vn} be any basis of an inner product space V. Then there exists an orthonormal basis {u1, u2, ..., un} of V such that the change-of-basis matrix from {vi} to {ui} is triangular; that is, for k = 1, ..., n,

    uk = ak1v1 + ak2v2 + ··· + akkvk

Theorem 6.12:  Suppose S = {w1, w2, ..., wr} is an orthogonal basis for a subspace W of V. Then one may extend S to an orthogonal basis for V; that is, one may find vectors wr+1, ..., wn such that {w1, w2, ..., wn} is an orthogonal basis for V.

Example 6.12. Consider the subspace U of R⁴ spanned by

    v1 = (1, 1, 1, 1)        v2 = (1, 2, 4, 5)        v3 = (1, -3, -4, -2)

We find an orthonormal basis for U by first finding an orthogonal basis of U using the Gram-Schmidt algorithm. First set w1 = v1 = (1, 1, 1, 1). Next find

    v2 - (<v2, w1>/||w1||²)w1 = (1, 2, 4, 5) - (12/4)(1, 1, 1, 1) = (-2, -1, 1, 2)

Set w2 = (-2, -1, 1, 2). Then find

    v3 - (<v3, w1>/||w1||²)w1 - (<v3, w2>/||w2||²)w2 = (1, -3, -4, -2) - (-8/4)(1, 1, 1, 1) - (-7/10)(-2, -1, 1, 2)
                                                    = (8/5, -17/10, -13/10, 7/5)

Clear fractions to obtain w3 = (16, -17, -13, 14). Last, normalize the orthogonal basis

    w1 = (1, 1, 1, 1)        w2 = (-2, -1, 1, 2)        w3 = (16, -17, -13, 14)

Since ||w1||² = 4, ||w2||² = 10, and ||w3||² = 910, the following vectors form an orthonormal basis of U:

    u1 = (1/2)(1, 1, 1, 1)        u2 = (1/√10)(-2, -1, 1, 2)        u3 = (1/√910)(16, -17, -13, 14)

Example 6.13. Let V be the vector space of polynomials f(t) with inner product <f, g> = ∫₋₁¹ f(t)g(t) dt. We apply the Gram-Schmidt algorithm to the set {1, t, t², t³} to obtain an orthogonal basis {f0, f1, f2, f3} with integer coefficients for the subspace of polynomials of degree ≤ 3. Here we use the fact that if r + s = n, then

    <tʳ, tˢ> = ∫₋₁¹ tⁿ dt = 2/(n + 1) if n is even,        and        0 if n is odd

First set f0 = 1. Then find

    t - (<t, 1>/<1, 1>)·1 = t - 0 = t

Set f1 = t. Then find

    t² - (<t², 1>/<1, 1>)·1 - (<t², t>/<t, t>)·t = t² - (2/3)/2 = t² - 1/3

Multiply by 3 to obtain f2 = 3t² - 1. Next find

    t³ - (<t³, 1>/<1, 1>)·1 - (<t³, t>/<t, t>)·t - (<t³, f2>/<f2, f2>)·f2 = t³ - ((2/5)/(2/3))t = t³ - (3/5)t

Multiply by 5 to obtain f3 = 5t³ - 3t. That is, {1, t, 3t² - 1, 5t³ - 3t} is the required orthogonal basis for V.
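Such hand computations can be double-checked by running Gram-Schmidt on polynomial objects. A rough sketch, assuming NumPy's `numpy.polynomial.Polynomial` class (exact fractions become floating point):

    from numpy.polynomial import Polynomial as P

    def inner(f, g):
        """<f, g> = integral of f(t)g(t) over [-1, 1]."""
        h = (f * g).integ()
        return h(1.0) - h(-1.0)

    fs = []
    for v in [P([1]), P([0, 1]), P([0, 0, 1]), P([0, 0, 0, 1])]:  # 1, t, t^2, t^3
        w = v
        for f in fs:
            w = w - (inner(v, f) / inner(f, f)) * f
        fs.append(w)

    for f in fs:
        print(f.coef)   # 1; t; t^2 - 1/3; t^3 - (3/5)t, multiples of the basis above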

Remark: Normalizing the polynomials in Example 6.13 so that p(1) = 1 for each polynomial p(t) yields the polynomials

    1,        t,        ½(3t² - 1),        ½(5t³ - 3t)

These are the first four Legendre polynomials (which are important in the study of differential equations).

6.7 INNER PRODUCTS AND MATRICES

This section investigates two types of matrices which play a special role in the theory of real inner product spaces V: positive definite matrices and orthogonal matrices. In this context, vectors in Rⁿ will be represented by column vectors. (Thus <u, v> = uᵀv denotes the usual inner product in Rⁿ.)

Positive Definite Matrices

Let A be a real symmetric matrix. Recall (Section 4.11) that A is congruent to a diagonal matrix B, i.e., that there exists a nonsingular matrix P such that B = PᵀAP is diagonal, and that the number of positive entries in B is an invariant of A (Theorem 4.18, Law of Inertia). The matrix A is said to be positive definite if all the diagonal entries of B are positive. Alternatively, A is said to be positive definite if XᵀAX > 0 for every nonzero vector X in Rⁿ.

Example 6.14. Let

    A = [ 1   0  -1 ]
        [ 0   1  -2 ]
        [-1  -2   8 ]

We reduce A to a (congruent) diagonal matrix (see Section 4.11) by applying R1 + R3 → R3 and the corresponding column operation C1 + C3 → C3, and then applying 2R2 + R3 → R3 and 2C2 + C3 → C3:

    A  ~  [1   0   0]  ~  [1  0  0]
          [0   1  -2]     [0  1  0]
          [0  -2   7]     [0  0  3]

Since the diagonal matrix has only positive diagonal entries, A is a positive definite matrix.
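In practice such a test is easy to run numerically: a real symmetric matrix is positive definite exactly when all its eigenvalues are positive, or equivalently when a Cholesky factorization exists. A minimal sketch, assuming NumPy:

    import numpy as np

    A = np.array([[ 1,  0, -1],
                  [ 0,  1, -2],
                  [-1, -2,  8]], dtype=float)

    print(np.linalg.eigvalsh(A))    # all positive, so A is positive definite

    try:
        np.linalg.cholesky(A)       # succeeds iff A is positive definite
        print("positive definite")
    except np.linalg.LinAlgError:
        print("not positive definite")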

Remark: A 2 × 2 real symmetric matrix

    A = [a  b]
        [b  d]

is positive definite if and only if the diagonal entries a and d are positive and the determinant det(A) = ad - bc = ad - b² is positive.

The following theorem, proved in Problem 6.40, applies.

Theorem 6.13:  Let A be a real positive definite matrix. Then the function <u, v> = uᵀAv is an inner product on Rⁿ.

Theorem 6.13 says that every positive definite matrix A determines an inner product. The following discussion and Theorem 6.15 may be viewed as the converse of this result.

Let V be any inner product space and let S = {u1, u2, ..., un} be any basis for V. The following matrix A is called the matrix representation of the inner product on V relative to the basis S:

    A = [<u1, u1>  <u1, u2>  ···  <u1, un>]
        [<u2, u1>  <u2, u2>  ···  <u2, un>]
        [   ···       ···    ···     ···  ]
        [<un, u1>  <un, u2>  ···  <un, un>]

That is, A = (aij) where aij = <ui, uj>.

Observe that A is symmetric, since the inner product is symmetric, that is, <ui, uj> = <uj, ui>. Also, A depends on both the inner product on V and the basis S for V. Moreover, if S is an orthogonal basis then A is diagonal, and if S is an orthonormal basis then A is the identity matrix.

Example 6.15. The following three vectors form a basis S for Euclidean space R³:

    u1 = (1, 1, 0)        u2 = (1, 2, 3)        u3 = (1, 3, 5)

Computing each <ui, uj> = <uj, ui> yields:

    <u1, u1> = 1 + 1 + 0 = 2      <u1, u2> = 1 + 2 + 0 = 3       <u1, u3> = 1 + 3 + 0 = 4
    <u2, u2> = 1 + 4 + 9 = 14     <u2, u3> = 1 + 6 + 15 = 22     <u3, u3> = 1 + 9 + 25 = 35

Thus

    A = [2   3   4]
        [3  14  22]
        [4  22  35]

is the matrix representation of the usual inner product on R³ relative to the basis S.

The following theorems, proved in Problems 6.41 and 6.42, respectively, apply.

Theorem 6.14:  Let A be the matrix representation of an inner product relative to a basis S for V. Then, for any vectors u, v ∈ V, we have

    <u, v> = [u]ᵀA[v]

where [u] and [v] denote the (column) coordinate vectors relative to the basis S.


Theorem 6.15:  Let A be the matrix representation of any inner product on V. Then A is a positive definite matrix.

Orthogonal Matrices

Recall (Section 4.6) that a real matrix P is orthogonal if P is nonsingular and P⁻¹ = Pᵀ, that is, if PPᵀ = PᵀP = I. This subsection further investigates these matrices. First we recall (Theorem 4.5) an important characterization of such matrices.

Theorem 6.16:  Let P be a real matrix. Then the following three properties are equivalent:

    (i)   P is orthogonal, that is, Pᵀ = P⁻¹.
    (ii)  The rows of P form an orthonormal set of vectors.
    (iii) The columns of P form an orthonormal set of vectors.

(The above theorem is true only when the usual inner product on Rⁿ is used. It is not true if Rⁿ is given any other inner product.)

Remark: Every 2 × 2 orthogonal matrix has the form

    [ cos θ  sin θ]        or        [cos θ   sin θ]
    [-sin θ  cos θ]                  [sin θ  -cos θ]

for some real number θ (Theorem 4.6).

Example 6.16. Let

    P = [1/√3   1/√3   1/√3 ]
        [0      1/√2  -1/√2 ]
        [2/√6  -1/√6  -1/√6 ]

The rows are orthogonal to each other and are unit vectors; that is, the rows form an orthonormal set of vectors. Thus P is orthogonal.
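Such claims are easy to verify mechanically: P is orthogonal exactly when PPᵀ = I. A minimal sketch, assuming NumPy:

    import numpy as np

    s3, s2, s6 = np.sqrt(3), np.sqrt(2), np.sqrt(6)
    P = np.array([[1/s3,  1/s3,  1/s3],
                  [0,     1/s2, -1/s2],
                  [2/s6, -1/s6, -1/s6]])

    # P is orthogonal iff P @ P.T is the identity (Theorem 6.16).
    print(np.allclose(P @ P.T, np.eye(3)))   # True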

The following two theorems, proved in Problems 6.48 and 6.49, respectively, show important relationships between orthogonal matrices and orthonormal bases of an inner product space V.

Theorem 6.17:  Suppose E = {ei} and E' = {e'i} are orthonormal bases of V. Let P be the change-of-basis matrix from the basis E to the basis E'. Then P is orthogonal.

Theorem 6.18:  Let {e1, ..., en} be an orthonormal basis of an inner product space V. Let P = (aij) be an orthogonal matrix. Then the following n vectors form an orthonormal basis for V:

    e'i = a1ie1 + a2ie2 + ··· + anien        (i = 1, 2, ..., n)

6.8 COMPLEX INNER PRODUCT SPACES

This section considers vector spaces V over the complex field C. First we recall some properties of complex numbers (Section 2.9). Suppose z ∈ C, say, z = a + bi where a, b ∈ R. Then

    z̄ = a - bi        zz̄ = a² + b²        and        |z| = √(a² + b²)

Also, for any z, z1, z2 ∈ C, the conjugate of z1 + z2 is z̄1 + z̄2, the conjugate of z1z2 is z̄1z̄2, the conjugate of z̄ is z, and z is real if and only if z̄ = z.


Definition:  Let V be a vector space over C. Suppose to each pair of vectors u, v ∈ V there is assigned a complex number, denoted by <u, v>. This function is called a (complex) inner product on V if it satisfies the following axioms:

    [I1*]  (Linear Property)  <au1 + bu2, v> = a<u1, v> + b<u2, v>
    [I2*]  (Conjugate Symmetric Property)  <u, v> = conj(<v, u>)
    [I3*]  (Positive Definite Property)  <u, u> ≥ 0; and <u, u> = 0 if and only if u = 0.

The vector space V over C with an inner product is called a (complex) inner product space.

Observe that a complex inner product differs only slightly from a real inner product (only [I2*] differs from [I2]). In fact, many of the definitions and properties of a complex inner product space are the same as those of a real inner product space. However, some of the proofs must be adapted to the complex case.

Axiom [I1*] is equivalent to the following two conditions:

    (a) <u1 + u2, v> = <u1, v> + <u2, v>        and        (b) <ku, v> = k<u, v>

On the other hand,

    <u, kv> = conj(<kv, u>) = conj(k<v, u>) = k̄·conj(<v, u>) = k̄<u, v>

(In other words, we must take the conjugate of a complex scalar when it is taken out of the second position of the inner product.) In fact, we show (Problem 6.50) that the inner product is conjugate linear in the second position, that is,

    <u, av1 + bv2> = ā<u, v1> + b̄<u, v2>

One can analogously prove (Problem 6.96)

    <a1u1 + a2u2, b1v1 + b2v2> = a1b̄1<u1, v1> + a1b̄2<u1, v2> + a2b̄1<u2, v1> + a2b̄2<u2, v2>

and, by induction,

    <∑ᵢ aiui, ∑ⱼ bjvj> = ∑ᵢ,ⱼ aib̄j<ui, vj>

The following similar remarks are in order:

Remark 1: Axiom [I1*] by itself implies that <0, 0> = <0v, 0> = 0<v, 0> = 0. Accordingly, [I1*], [I2*], and [I3*] are equivalent to [I1*], [I2*], and the following axiom:

    [I3*']  If u ≠ 0, then <u, u> > 0

That is, a function satisfying [I1*], [I2*], and [I3*'] is a (complex) inner product on V.

Remark 2: By [I2*], <u, u> = conj(<u, u>). Thus <u, u> must be real. By [I3*], <u, u> must be nonnegative, and hence its positive real square root exists. As with real inner product spaces, we define ||u|| = √<u, u> to be the norm or length of u.

Remark 3: Besides the norm, we define the notions of orthogonality, orthogonal complement, and orthogonal and orthonormal sets as before. In fact, the definitions of distance, Fourier coefficient, and projection are the same as in the real case.


Example 6.17. Let u = (zi) and v = (wi) be vectors in Cⁿ. Then

    <u, v> = ∑ᵢ ziw̄i = z1w̄1 + z2w̄2 + ··· + znw̄n

is an inner product on Cⁿ, called the usual or standard inner product on Cⁿ. (We assume this inner product on Cⁿ unless otherwise stated or implied.) In the case that u and v are real, we have w̄i = wi and

    <u, v> = z1w1 + z2w2 + ··· + znwn

In other words, this inner product reduces to the analogous one on Rⁿ when the entries are real.

Remark: Assuming u and v are column vectors, the above inner product may be defined by <u, v> = uᵀv̄.
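One practical caution when computing this numerically: NumPy's `vdot` conjugates its first argument, i.e., it computes ∑ z̄ᵢwᵢ, so the book's convention <u, v> = ∑ zᵢw̄ᵢ corresponds to `np.vdot(v, u)`. A small sketch, assuming NumPy:

    import numpy as np

    u = np.array([1 + 1j, 2j])       # (z1, z2)
    v = np.array([3 - 1j, 1 + 1j])   # (w1, w2)

    # Usual inner product <u, v> = z1*conj(w1) + z2*conj(w2).
    ip = np.sum(u * np.conj(v))
    print(ip)
    print(np.vdot(v, u))             # same value: vdot conjugates its first argument

    # Conjugate symmetry [I2*]: <u, v> equals the conjugate of <v, u>.
    print(np.conj(np.sum(v * np.conj(u))))   # again the same value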

Example 6.18

(a) Let V be the vector space of complex continuous functions on the (real) interval a ≤ t ≤ b. Then the following is the usual inner product on V:

        <f, g> = ∫ₐᵇ f(t)ḡ(t) dt

(b) Let U be the vector space of m × n matrices over C. Suppose A = (zij) and B = (wij) are elements of U. Then the following is the usual inner product on U:

        <A, B> = tr(BᴴA) = ∑ᵢ∑ⱼ w̄ijzij

    As usual, Bᴴ = conj(B)ᵀ; that is, Bᴴ is the conjugate transpose of B.

The following is a list of theorems for complex inner product spaces which are analogous to the ones for the real case. (Theorem 6.19 is proved in Problem 6.53.)

Theorem 6.19 (Cauchy-Schwarz):  Let V be a complex inner product space. Then

    |<u, v>| ≤ ||u|| ||v||

Theorem 6.20:  Let W be a subspace of a complex inner product space V. Then V = W ⊕ W⊥.

Theorem 6.21:  Suppose {u1, u2, ..., un} is an orthogonal basis for a complex inner product space V. Then, for any v ∈ V,

    v = (<v, u1>/<u1, u1>)u1 + (<v, u2>/<u2, u2>)u2 + ··· + (<v, un>/<un, un>)un

Theorem 6.22:  Suppose {u1, u2, ..., un} is a basis for a complex inner product space V. Let A = (aij) be the complex matrix defined by aij = <ui, uj>. Then, for any u, v ∈ V,

    <u, v> = [u]ᵀA·conj([v])

where [u] and [v] are the coordinate column vectors in the given basis {ui}. (Remark: This matrix A is said to represent the inner product on V.)

Theorem 6.23:  Let A be a Hermitian matrix (i.e., Aᴴ = conj(A)ᵀ = A) such that XᵀA·conj(X) is real and positive for every nonzero vector X ∈ Cⁿ. Then <u, v> = uᵀA·conj(v) is an inner product on Cⁿ.

Theorem 6.24:  Let A be the matrix which represents an inner product on V. Then A is Hermitian, and XᵀA·conj(X) is real and positive for any nonzero vector X in Cⁿ.


6.9 NORMED VECTOR SPACES

We begin with a definition.

Definition:  Let V be a real or complex vector space. Suppose to each v ∈ V there is assigned a real number, denoted by ||v||. This function || · || is called a norm on V if it satisfies the following axioms:

    [N1]  ||v|| ≥ 0; and ||v|| = 0 if and only if v = 0.
    [N2]  ||kv|| = |k| ||v||.
    [N3]  ||u + v|| ≤ ||u|| + ||v||.

The vector space V with a norm is called a normed vector space.

The following remarks are in order.

Remark 1: Axiom [N2] by itself implies that ||0|| = ||0v|| = 0||v|| = 0. Accordingly, [N1], [N2], and [N3] are equivalent to [N2], [N3], and the following axiom:

    [N1']  If v ≠ 0, then ||v|| > 0

That is, a function || · || satisfying [N1'], [N2], and [N3] is a norm on a vector space V.

Remark 2: Suppose V is an inner product space. The norm on V defined by ||v|| = √<v, v> does satisfy [N1], [N2], and [N3]. Thus every inner product space V is a normed vector space. On the other hand, there may be norms on a vector space V which do not come from an inner product on V.

Remark 3: Let V be a normed vector space. The distance between two vectors u, v ∈ V is denoted and defined by d(u, v) = ||u - v||.

The following theorem is the main reason why d(u, v) is called the distance between u and v.

Theorem 6.25:  Let V be a normed vector space. Then the function d(u, v) = ||u - v|| satisfies the following three axioms of a metric space:

    [M1]  d(u, v) ≥ 0; and d(u, v) = 0 if and only if u = v.
    [M2]  d(u, v) = d(v, u).
    [M3]  d(u, v) ≤ d(u, w) + d(w, v).

Norms on Rⁿ and Cⁿ

The following define three important norms on Rⁿ and Cⁿ, where v = (a1, a2, ..., an):

    ||v||∞ = max(|ai|)
    ||v||1 = |a1| + |a2| + ··· + |an|
    ||v||2 = √(|a1|² + |a2|² + ··· + |an|²)

(Note that subscripts are used to distinguish between the three norms.) The norms || · ||∞, || · ||1, and || · ||2 are called the infinity-norm, one-norm, and two-norm, respectively. Observe that || · ||2 is the norm on Rⁿ (Cⁿ) induced by the usual inner product on Rⁿ (Cⁿ). (We will let d∞, d1, and d2 denote the corresponding distance functions.)


Example 6.19. Consider the vectors u = (1, -5, 3) and v = (4, 2, -3) in R³.

(a) The infinity-norm chooses the maximum of the absolute values of the components. Hence

        ||u||∞ = 5        and        ||v||∞ = 4

(b) The one-norm adds the absolute values of the components. Thus

        ||u||1 = 1 + 5 + 3 = 9        and        ||v||1 = 4 + 2 + 3 = 9

(c) The two-norm is equal to the square root of the sum of the squares of the components (i.e., the norm induced by the usual inner product on R³). Thus

        ||u||2 = √(1 + 25 + 9) = √35        and        ||v||2 = √(16 + 4 + 9) = √29

(d) Since u - v = (1 - 4, -5 - 2, 3 + 3) = (-3, -7, 6), we have

        d∞(u, v) = 7        d1(u, v) = 3 + 7 + 6 = 16        d2(u, v) = √(9 + 49 + 36) = √94
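These three norms are available in NumPy through `numpy.linalg.norm` with `ord` set to `np.inf`, 1, or 2. A quick sketch reproducing the numbers above (NumPy assumed):

    import numpy as np

    u = np.array([1, -5, 3], dtype=float)
    v = np.array([4, 2, -3], dtype=float)

    for p in (np.inf, 1, 2):
        print(p, np.linalg.norm(u, p), np.linalg.norm(v, p))
        # inf: 5, 4    1: 9, 9    2: sqrt(35), sqrt(29)

    print(np.linalg.norm(u - v, np.inf))  # d_inf(u, v) = 7
    print(np.linalg.norm(u - v, 1))       # d_1(u, v)  = 16
    print(np.linalg.norm(u - v, 2))       # d_2(u, v)  = sqrt(94)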

Example 6.20. Consider the cartesian plane R² shown in Fig. 6-7.

(a) Let D1 be the set of points u = (x, y) in R² such that ||u||2 = 1. Then D1 consists of the points (x, y) such that x² + y² = 1. Thus D1 is the unit circle, as shown in Fig. 6-7.

(b) Let D2 be the set of points u = (x, y) in R² such that ||u||1 = 1. Then D2 consists of the points (x, y) such that ||u||1 = |x| + |y| = 1. Thus D2 is the diamond inside the unit circle, as shown in Fig. 6-7.

(c) Let D3 be the set of points u = (x, y) in R² such that ||u||∞ = 1. Then D3 consists of the points (x, y) such that ||u||∞ = max(|x|, |y|) = 1. Thus D3 is the square circumscribing the unit circle, as shown in Fig. 6-7.

Fig. 6-7

Norms on C[a, b]

Consider the vector space V = C[a, b] of real continuous functions on the interval a ≤ t ≤ b. Recall that the following defines an inner product on V:

    <f, g> = ∫ₐᵇ f(t)g(t) dt

Accordingly, the above inner product defines the following norm on V = C[a, b] (which is analogous to the || · ||2 norm on Rⁿ):

    ||f||2 = √(∫ₐᵇ [f(t)]² dt)

The following example defines two other norms on V = C[a, b].

Example 6.21

(a) Let ||f||1 = ∫ₐᵇ |f(t)| dt. (This norm is analogous to the || · ||1 norm on Rⁿ.) There is a geometrical description of ||f||1 and the distance d1(f, g). As shown in Fig. 6-8, ||f||1 is the area between the function |f| and the t axis, and d1(f, g) is the area between the functions f and g.

    Fig. 6-8: (a) ||f||1 is shaded; (b) d1(f, g) is shaded

(b) Let ||f||∞ = max(|f(t)|). (This norm is analogous to the || · ||∞ norm on Rⁿ.) There is a geometrical description of ||f||∞ and the distance function d∞(f, g). As shown in Fig. 6-9, ||f||∞ is the maximum distance between f and the t axis, and d∞(f, g) is the maximum distance between f and g.

    Fig. 6-9: (a) ||f||∞; (b) d∞(f, g)
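These function norms can be approximated numerically by sampling. A rough sketch, assuming NumPy (the grid size is an arbitrary choice of ours), using f(t) = t² - 4t on [0, 3] as in Problem 6.58 below:

    import numpy as np

    a, b = 0.0, 3.0
    t = np.linspace(a, b, 100_000, endpoint=False)  # sample grid on [a, b)
    dt = (b - a) / t.size
    f = t**2 - 4*t

    one_norm = np.sum(np.abs(f)) * dt       # Riemann-sum estimate of the integral of |f|
    two_norm = np.sqrt(np.sum(f**2) * dt)   # estimate of the two-norm
    inf_norm = np.max(np.abs(f))            # estimate of max |f(t)|

    print(one_norm)   # ~9.0
    print(two_norm)   # ~sqrt(153/5) = 5.53...
    print(inf_norm)   # ~4.0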

Solved Problems

INNER PRODUCTS

6.1. Expand <5u1 + 8u2, 6v1 - 7v2>.

    Use the linearity in both positions to get

        <5u1 + 8u2, 6v1 - 7v2> = <5u1, 6v1> + <5u1, -7v2> + <8u2, 6v1> + <8u2, -7v2>
                               = 30<u1, v1> - 35<u1, v2> + 48<u2, v1> - 56<u2, v2>

    [Remark: Observe the similarity between the above expansion and the expansion of (5a + 8b)(6c - 7d) in ordinary algebra.]

6.2. Consider the following vectors in R³: u = (1, 2, 4), v = (2, -3, 5), w = (4, 2, -3). Find: (a) u·v, (b) u·w, (c) v·w, (d) (u + v)·w, (e) ||u||, (f) ||v||, (g) ||u + v||.

    (a) Multiply corresponding components and add to get u·v = 2 - 6 + 20 = 16.

    (b) u·w = 4 + 4 - 12 = -4.

    (c) v·w = 8 - 6 - 15 = -13.

    (d) First find u + v = (3, -1, 9). Then (u + v)·w = 12 - 2 - 27 = -17. Alternatively, using [I1], (u + v)·w = u·w + v·w = -4 - 13 = -17.

    (e) First find ||u||² by squaring the components of u and adding:

            ||u||² = 1² + 2² + 4² = 1 + 4 + 16 = 21        and so        ||u|| = √21

    (f) ||v||² = 4 + 9 + 25 = 38, and so ||v|| = √38.

    (g) From (d), u + v = (3, -1, 9). Hence ||u + v||² = 9 + 1 + 81 = 91 and ||u + v|| = √91.

6.3. Verify that the following is an inner product in R²:

        <u, v> = x1y1 - x1y2 - x2y1 + 3x2y2        where u = (x1, x2), v = (y1, y2)

    Method 1. We verify the three axioms of an inner product. Letting w = (z1, z2), we find

        au + bw = a(x1, x2) + b(z1, z2) = (ax1 + bz1, ax2 + bz2)

    Thus

        <au + bw, v> = <(ax1 + bz1, ax2 + bz2), (y1, y2)>
                     = (ax1 + bz1)y1 - (ax1 + bz1)y2 - (ax2 + bz2)y1 + 3(ax2 + bz2)y2
                     = a(x1y1 - x1y2 - x2y1 + 3x2y2) + b(z1y1 - z1y2 - z2y1 + 3z2y2)
                     = a<u, v> + b<w, v>

    and so axiom [I1] is satisfied. Also,

        <v, u> = y1x1 - y1x2 - y2x1 + 3y2x2 = x1y1 - x1y2 - x2y1 + 3x2y2 = <u, v>

    and axiom [I2] is satisfied. Finally,

        <u, u> = x1² - 2x1x2 + 3x2² = x1² - 2x1x2 + x2² + 2x2² = (x1 - x2)² + 2x2² ≥ 0

    Also, <u, u> = 0 if and only if x1 = 0, x2 = 0, i.e., u = 0. Hence the last axiom [I3] is satisfied.

    Method 2. We argue via matrices. That is, we can write <u, v> in matrix notation:

        <u, v> = uᵀAv = (x1, x2) A (y1, y2)ᵀ        where        A = [ 1  -1]
                                                                     [-1   3]

    Since A is real and symmetric, we need only show that A is positive definite. Applying the elementary row operation R1 + R2 → R2 and then the corresponding elementary column operation C1 + C2 → C2, we transform A into the diagonal form [1  0; 0  2]. Thus A is positive definite. Accordingly, <u, v> is an inner product.

6.4. Consider the vectors u = (1, 5) and v = (3, 4) in R². Find:

    (a) <u, v> with respect to the usual inner product in R²,
    (b) <u, v> with respect to the inner product in R² in Problem 6.3,
    (c) ||v|| using the usual inner product in R²,
    (d) ||v|| using the inner product in R² in Problem 6.3.

    (a) <u, v> = 3 + 20 = 23.

    (b) <u, v> = 1·3 - 1·4 - 5·3 + 3·5·4 = 3 - 4 - 15 + 60 = 44.

    (c) ||v||² = <v, v> = <(3, 4), (3, 4)> = 9 + 16 = 25; hence ||v|| = 5.

    (d) ||v||² = <v, v> = <(3, 4), (3, 4)> = 9 - 12 - 12 + 48 = 33; hence ||v|| = √33.

6.5. Consider the vector space V of polynomials with inner product <f, g> = ∫₀¹ f(t)g(t) dt and the polynomials f(t) = t + 2, g(t) = 3t - 2, and h(t) = t² - 2t - 3. Find: (a) <f, g> and <f, h>; (b) ||f|| and ||g||; (c) Normalize f and g.

    (a) Integrate as follows:

            <f, g> = ∫₀¹ (t + 2)(3t - 2) dt = ∫₀¹ (3t² + 4t - 4) dt = [t³ + 2t² - 4t]₀¹ = -1

            <f, h> = ∫₀¹ (t + 2)(t² - 2t - 3) dt = ∫₀¹ (t³ - 7t - 6) dt = [t⁴/4 - 7t²/2 - 6t]₀¹ = -37/4

    (b) ||f||² = <f, f> = ∫₀¹ (t + 2)(t + 2) dt = [(t + 2)³/3]₀¹ = 19/3, and so ||f|| = √(19/3) = √57/3.

        ||g||² = <g, g> = ∫₀¹ (3t - 2)(3t - 2) dt = 1; hence ||g|| = √1 = 1.

    (c) Since ||f|| = √57/3, f̂ = f/||f|| = (3/√57)(t + 2). Note g is already a unit vector, since ||g|| = 1; hence ĝ = g = 3t - 2.

6.6. Let V be the vector space of 2 × 3 real matrices with inner product <A, B> = tr(BᵀA), and consider the matrices

        A = [9  8  7]        B = [1  2  3]        C = [3  -5   2]
            [6  5  4]            [4  5  6]            [1   0  -4]

    Find: (a) <A, B>, <A, C>, and <B, C>; (b) <2A + 3B, 4C>; (c) ||A|| and ||B||; (d) Normalize A and B.

    (a) Use <A, B> = tr(BᵀA) = ∑ᵢ∑ⱼ aijbij, the sum of the products of corresponding entries.

            <A, B> = 9 + 16 + 21 + 24 + 25 + 24 = 119
            <A, C> = 27 - 40 + 14 + 6 + 0 - 16 = -9
            <B, C> = 3 - 10 + 6 + 4 + 0 - 24 = -21

    (b) First compute

            2A + 3B = [21  22  23]        and        4C = [12  -20    8]
                      [24  25  26]                        [ 4    0  -16]

        Then

            <2A + 3B, 4C> = 252 - 440 + 184 + 96 + 0 - 416 = -324

        Alternatively, using the linear property of inner products,

            <2A + 3B, 4C> = 8<A, C> + 12<B, C> = 8(-9) + 12(-21) = -324

    (c) Use ||A||² = <A, A> = ∑ᵢ∑ⱼ aij², the sum of the squares of all the elements of A.

            ||A||² = 9² + 8² + 7² + 6² + 5² + 4² = 271        and so        ||A|| = √271
            ||B||² = 1² + 2² + 3² + 4² + 5² + 6² = 91         and so        ||B|| = √91

    (d) Â = A/||A|| = (1/√271)A = [9/√271  8/√271  7/√271]
                                  [6/√271  5/√271  4/√271]

        B̂ = B/||B|| = (1/√91)B = [1/√91  2/√91  3/√91]
                                 [4/√91  5/√91  6/√91]

6.7. Find the distance d(u, v) between the vectors:

    (a) u = (1, 3, 5, 7) and v = (4, -2, 8, 1) in R⁴;
    (b) u = t + 2 and v = 3t - 2, where <u, v> = ∫₀¹ u(t)v(t) dt.

    Use d(u, v) = ||u - v||.

    (a) u - v = (-3, 5, -3, 6). Thus

            ||u - v||² = 9 + 25 + 9 + 36 = 79        and so        d(u, v) = ||u - v|| = √79

    (b) u - v = -2t + 4. Thus

            ||u - v||² = <u - v, u - v> = ∫₀¹ (-2t + 4)(-2t + 4) dt
                       = ∫₀¹ (4t² - 16t + 16) dt = [4t³/3 - 8t² + 16t]₀¹ = 28/3

        Hence d(u, v) = √(28/3) = 2√21/3.

6.8. Find cos θ where θ is the angle between:

    (a) u = (1, -3, 2) and v = (2, 1, 5) in R³;
    (b) u = (1, 3, -5, 4) and v = (2, -3, 4, 1) in R⁴;
    (c) f(t) = 2t - 1 and g(t) = t², where <f, g> = ∫₀¹ f(t)g(t) dt;
    (d) A = [2  1; 3  -1] and B = [0  -1; 2  3], where <A, B> = tr(BᵀA).

    Use cos θ = <u, v>/(||u|| ||v||).

    (a) Compute <u, v> = 2 - 3 + 10 = 9, ||u||² = 1 + 9 + 4 = 14, ||v||² = 4 + 1 + 25 = 30. Thus

            cos θ = 9/(√14 √30) = 9/(2√105)

    (b) Here <u, v> = 2 - 9 - 20 + 4 = -23, ||u||² = 1 + 9 + 25 + 16 = 51, ||v||² = 4 + 9 + 16 + 1 = 30. Thus

            cos θ = -23/(√51 √30) = -23/(3√170)

    (c) Compute

            <f, g> = ∫₀¹ (2t - 1)t² dt = 1/6
            ||f||² = <f, f> = ∫₀¹ (4t² - 4t + 1) dt = 1/3        and        ||g||² = <g, g> = ∫₀¹ t⁴ dt = 1/5

        Thus cos θ = (1/6)/((1/√3)(1/√5)) = √15/6.

    (d) Compute <A, B> = 0 - 1 + 6 - 3 = 2, ||A||² = 4 + 1 + 9 + 1 = 15, ||B||² = 0 + 1 + 4 + 9 = 14. Thus

            cos θ = 2/(√15 √14) = 2/√210

6.9. Verify each of the following:

    (a) Parallelogram Law (Fig. 6-10): ||u + v||² + ||u - v||² = 2||u||² + 2||v||².
    (b) Polar form for <u, v> (which shows that the inner product can be obtained from the norm function): <u, v> = ¼(||u + v||² - ||u - v||²).

    Expand each of the following to obtain:

        ||u + v||² = <u + v, u + v> = ||u||² + 2<u, v> + ||v||²        (1)
        ||u - v||² = <u - v, u - v> = ||u||² - 2<u, v> + ||v||²        (2)

    Add (1) and (2) to get the Parallelogram Law (a). Subtract (2) from (1) to obtain

        ||u + v||² - ||u - v||² = 4<u, v>

    Divide by 4 to obtain the (real) polar form (b). (The polar form for the complex case is different.)

    Fig. 6-10

6.10. Prove Theorem 6.1 (Cauchy-Schwarz): |<u, v>| ≤ ||u|| ||v||.

    For any real number t,

        <tu + v, tu + v> = t²<u, u> + 2t<u, v> + <v, v> = t²||u||² + 2t<u, v> + ||v||²

    Let a = ||u||², b = 2<u, v>, and c = ||v||². Since ||tu + v||² ≥ 0, we have

        at² + bt + c ≥ 0

    for every value of t. This means that the quadratic polynomial cannot have two distinct real roots, which implies that b² - 4ac ≤ 0 or b² ≤ 4ac. Thus

        4<u, v>² ≤ 4||u||²||v||²

    Dividing by 4 and taking square roots gives us our result. (Remark: The Cauchy-Schwarz inequality for complex inner product spaces appears in Problem 6.53.)

6.11. Prove Theorem 6.2.

    If v ≠ 0, then <v, v> > 0 and hence ||v|| = √<v, v> > 0. If v = 0, then <0, 0> = 0; consequently ||0|| = √0 = 0. Thus [N1] is true.

    We have ||kv||² = <kv, kv> = k²<v, v> = k²||v||². Taking the square root of both sides gives [N2].

    Using the Cauchy-Schwarz inequality, we obtain

        ||u + v||² = <u + v, u + v> = <u, u> + <u, v> + <v, u> + <v, v>
                   ≤ ||u||² + 2||u|| ||v|| + ||v||² = (||u|| + ||v||)²

    Taking the square root of both sides yields [N3].


6.12. Let (a1, a2, ...) and (b1, b2, ...) be any pair of points in the l²-space of Example 6.3(b). Show that the inner product is well defined, i.e., show that the sum

        ∑ᵢ aibi = a1b1 + a2b2 + ···

    converges absolutely.

    By Example 6.4(a) (Cauchy-Schwarz inequality),

        |a1b1| + ··· + |anbn| ≤ √(∑ ai²) √(∑ bi²)

    which holds for every n. Thus the (monotonic) sequence of sums Sn = |a1b1| + ··· + |anbn| is bounded, and therefore converges. Hence the infinite sum converges absolutely.

ORTHOGONALITY, ORTHOGONAL COMPLEMENTS, ORTHOGONAL SETS

6.13. Find k so that the following pairs are orthogonal:

    (a) u = (1, 2, k, 3) and v = (3, k, 7, -5) in R⁴;
    (b) f(t) = t + k and g(t) = t², where <f, g> = ∫₀¹ f(t)g(t) dt.

    (a) First find <u, v> = (1, 2, k, 3)·(3, k, 7, -5) = 3 + 2k + 7k - 15 = 9k - 12. Then set <u, v> = 9k - 12 = 0 to find k = 4/3.

    (b) First find

            <f, g> = ∫₀¹ (t + k)t² dt = ∫₀¹ (t³ + kt²) dt = 1/4 + k/3

        Set <f, g> = 1/4 + k/3 = 0 to obtain k = -3/4.

6.14. Consider u = (0, 1, -2, 5) in R⁴. Find a basis for the orthogonal complement u⊥ of u.

    We seek all vectors (x, y, z, t) in R⁴ such that

        <(x, y, z, t), (0, 1, -2, 5)> = 0        or        0x + y - 2z + 5t = 0

    The free variables are x, z, and t. Accordingly,

    (1) Set x = 1, z = 0, t = 0 to obtain the solution w1 = (1, 0, 0, 0).
    (2) Set x = 0, z = 1, t = 0 to obtain the solution w2 = (0, 2, 1, 0).
    (3) Set x = 0, z = 0, t = 1 to obtain the solution w3 = (0, -5, 0, 1).

    The vectors w1, w2, w3 form a basis of the solution space of the equation, and hence a basis for u⊥.

6.15. Let W be the subspace of R⁵ spanned by u = (1, 2, 3, -1, 2) and v = (2, 4, 7, 2, -1). Find a basis of the orthogonal complement W⊥ of W.

    We seek all vectors w = (x, y, z, s, t) such that

        <w, u> = x + 2y + 3z - s + 2t = 0
        <w, v> = 2x + 4y + 7z + 2s - t = 0

    Eliminating x from the second equation, we find the equivalent system

        x + 2y + 3z - s + 2t = 0
                 z + 4s - 5t = 0

    The free variables are y, s, and t. Therefore,

    (1) Set y = -1, s = 0, t = 0 to obtain the solution w1 = (2, -1, 0, 0, 0).
    (2) Set y = 0, s = 1, t = 0 to find the solution w2 = (13, 0, -4, 1, 0).
    (3) Set y = 0, s = 0, t = 1 to obtain the solution w3 = (-17, 0, 5, 0, 1).

    The set {w1, w2, w3} is a basis of W⊥.

6.16. Let w = (1, 2, 3, 1) be a vector in R⁴. Find an orthogonal basis for w⊥.

    Find a nonzero solution of x + 2y + 3z + t = 0, say v1 = (0, 0, 1, -3). Now find a nonzero solution of the system

        x + 2y + 3z + t = 0        z - 3t = 0

    say v2 = (0, -5, 3, 1). Lastly, find a nonzero solution of the system

        x + 2y + 3z + t = 0        -5y + 3z + t = 0        z - 3t = 0

    say v3 = (-14, 2, 3, 1). Thus v1, v2, v3 form an orthogonal basis for w⊥. (Compare with Problem 6.14, where the basis need not be orthogonal.)

6.17. Let S consist of the following vectors in R³:

        u1 = (1, 1, 1)        u2 = (1, 2, -3)        u3 = (5, -4, -1)

    (a) Show that S is orthogonal and S is a basis for R³.
    (b) Write v = (1, 5, -7) as a linear combination of u1, u2, u3.

    (a) Compute

            <u1, u2> = 1 + 2 - 3 = 0        <u1, u3> = 5 - 4 - 1 = 0        <u2, u3> = 5 - 8 + 3 = 0

        Since each inner product equals 0, S is orthogonal, and hence S is linearly independent. Thus S is a basis for R³ since any three linearly independent vectors form a basis for R³.

    (b) Let v = xu1 + yu2 + zu3 for unknown scalars x, y, z, i.e.,

            (1, 5, -7) = x(1, 1, 1) + y(1, 2, -3) + z(5, -4, -1)        (1)

        Method 1. Expand (1), obtaining the system

            x + y + 5z = 1        x + 2y - 4z = 5        x - 3y - z = -7

        Solve the system to obtain x = -1/3, y = 16/7, z = -4/21.

        Method 2. (This method uses the fact that the basis vectors are orthogonal, and the arithmetic is simpler.) Take the inner product of (1) with u1 to get

            (1, 5, -7)·(1, 1, 1) = x(1, 1, 1)·(1, 1, 1)        or        -1 = 3x        or        x = -1/3

        (The two last terms drop out since u1 is orthogonal to u2 and to u3.) Take the inner product of (1) with u2 to get

            (1, 5, -7)·(1, 2, -3) = y(1, 2, -3)·(1, 2, -3)        or        32 = 14y        or        y = 16/7

        Take the inner product of (1) with u3 to get

            (1, 5, -7)·(5, -4, -1) = z(5, -4, -1)·(5, -4, -1)        or        -8 = 42z        or        z = -4/21

        In either case, we get v = -(1/3)u1 + (16/7)u2 - (4/21)u3.

6.18. Let S consist of the following vectors in R⁴:

        u1 = (1, 1, 0, -1)    u2 = (1, 2, 1, 3)    u3 = (1, 1, -9, 2)    u4 = (16, -13, 1, 3)

    (a) Show that S is orthogonal and a basis of R⁴.
    (b) Find the coordinates of an arbitrary vector v = (a, b, c, d) in R⁴ relative to the basis S.

    (a) Compute

            u1·u2 = 1 + 2 + 0 - 3 = 0      u1·u3 = 1 + 1 + 0 - 2 = 0       u1·u4 = 16 - 13 + 0 - 3 = 0
            u2·u3 = 1 + 2 - 9 + 6 = 0      u2·u4 = 16 - 26 + 1 + 9 = 0     u3·u4 = 16 - 13 - 9 + 6 = 0

        Thus S is orthogonal, and hence S is linearly independent. Accordingly, S is a basis for R⁴ since any four linearly independent vectors form a basis of R⁴.

    (b) Since S is orthogonal, we need only find the Fourier coefficients of v with respect to the basis vectors, as in Theorem 6.7. Thus

            k1 = <v, u1>/<u1, u1> = (a + b - d)/3            k3 = <v, u3>/<u3, u3> = (a + b - 9c + 2d)/87
            k2 = <v, u2>/<u2, u2> = (a + 2b + c + 3d)/15     k4 = <v, u4>/<u4, u4> = (16a - 13b + c + 3d)/435

        are the coordinates of v with respect to the basis S.

6.19. Suppose S, S1, and S2 are subsets of V. Prove the following:

    (a) S ⊆ S⊥⊥;    (b) If S1 ⊆ S2, then S2⊥ ⊆ S1⊥;    (c) S⊥ = (span S)⊥.

    (a) Let w ∈ S. Then <w, v> = 0 for every v ∈ S⊥; hence w ∈ S⊥⊥. Accordingly, S ⊆ S⊥⊥.

    (b) Let w ∈ S2⊥. Then <w, v> = 0 for every v ∈ S2. Since S1 ⊆ S2, <w, v> = 0 for every v ∈ S1. Thus w ∈ S1⊥, and hence S2⊥ ⊆ S1⊥.

    (c) Since S ⊆ span S, we have (span S)⊥ ⊆ S⊥. Suppose u ∈ S⊥ and suppose v ∈ span S. Then there exist w1, w2, ..., wk in S such that

            v = a1w1 + a2w2 + ··· + akwk

        Then, using u ∈ S⊥, we have

            <u, v> = <u, a1w1 + a2w2 + ··· + akwk> = a1<u, w1> + a2<u, w2> + ··· + ak<u, wk>
                   = a1·0 + a2·0 + ··· + ak·0 = 0

        Thus u ∈ (span S)⊥. Accordingly, S⊥ ⊆ (span S)⊥. Both inclusions give S⊥ = (span S)⊥.

6.20. Prove Theorem 6.5.

    Suppose S = {u1, u2, ..., ur} and suppose

        a1u1 + a2u2 + ··· + arur = 0        (1)

    Taking the inner product of (1) with u1, we get

        0 = <0, u1> = <a1u1 + a2u2 + ··· + arur, u1>
          = a1<u1, u1> + a2<u2, u1> + ··· + ar<ur, u1>
          = a1<u1, u1> + a2·0 + ··· + ar·0 = a1<u1, u1>

    Since u1 ≠ 0, we have <u1, u1> ≠ 0. Thus a1 = 0. Similarly, for i = 2, ..., r, taking the inner product of (1) with ui,

        0 = <0, ui> = <a1u1 + ··· + arur, ui>
          = a1<u1, ui> + ··· + ai<ui, ui> + ··· + ar<ur, ui> = ai<ui, ui>

    But <ui, ui> ≠ 0, and hence ai = 0. Thus S is linearly independent.


6.21. Prove Theorem 6.6 (Pythagoras): ||u1 + ··· + ur||² = ||u1||² + ··· + ||ur||².

    Expanding the inner product, we have

        ||u1 + u2 + ··· + ur||² = <u1 + u2 + ··· + ur, u1 + u2 + ··· + ur>
                                = <u1, u1> + <u2, u2> + ··· + <ur, ur> + ∑ᵢ≠ⱼ <ui, uj>

    The theorem follows from the fact that <ui, ui> = ||ui||² and <ui, uj> = 0 for i ≠ j.

6.22. Prove Theorem 6.7.

    Suppose v = k1u1 + k2u2 + ··· + knun. Taking the inner product of both sides with u1 yields

        <v, u1> = <k1u1 + k2u2 + ··· + knun, u1>
                = k1<u1, u1> + k2·0 + ··· + kn·0 = k1<u1, u1>

    Thus k1 = <v, u1>/<u1, u1>. Similarly, for i = 2, ..., n,

        <v, ui> = <k1u1 + ··· + knun, ui> = ki<ui, ui>

    Thus ki = <v, ui>/<ui, ui>. Substituting for ki in the equation v = k1u1 + ··· + knun, we obtain the desired result.

6.23. Suppose E = {e1, e2, ..., en} is an orthonormal basis of V. Prove:

    (a) For any u ∈ V, we have u = <u, e1>e1 + <u, e2>e2 + ··· + <u, en>en.
    (b) <a1e1 + ··· + anen, b1e1 + ··· + bnen> = a1b1 + a2b2 + ··· + anbn.
    (c) For any u, v ∈ V, we have <u, v> = <u, e1><v, e1> + ··· + <u, en><v, en>.

    (a) Suppose u = k1e1 + k2e2 + ··· + knen. Taking the inner product of u with e1,

            <u, e1> = <k1e1 + k2e2 + ··· + knen, e1>
                    = k1<e1, e1> + k2<e2, e1> + ··· + kn<en, e1>
                    = k1·1 + k2·0 + ··· + kn·0 = k1

        Similarly, for i = 2, ..., n,

            <u, ei> = <k1e1 + ··· + kiei + ··· + knen, ei>
                    = k1<e1, ei> + ··· + ki<ei, ei> + ··· + kn<en, ei>
                    = k1·0 + ··· + ki·1 + ··· + kn·0 = ki

        Substituting <u, ei> for ki in the equation u = k1e1 + ··· + knen, we obtain the desired result.

    (b) We have

            <∑ᵢ aiei, ∑ⱼ bjej> = ∑ᵢ,ⱼ aibj<ei, ej>

        But <ei, ej> = 0 for i ≠ j, and <ei, ej> = 1 for i = j; hence, as required,

            <∑ᵢ aiei, ∑ⱼ bjej> = ∑ᵢ aibi = a1b1 + a2b2 + ··· + anbn

    (c) By (a), we have

            u = <u, e1>e1 + ··· + <u, en>en        and        v = <v, e1>e1 + ··· + <v, en>en

        Thus, by (b),

            <u, v> = <u, e1><v, e1> + <u, e2><v, e2> + ··· + <u, en><v, en>


PROJECTIONS, GRAM-SCHMIDT ALGORITHM, APPLICATIONS

6.24. Suppose w ≠ 0. Let v be any vector in V. Show that

        c = <v, w>/<w, w> = <v, w>/||w||²

    is the unique scalar such that v' = v - cw is orthogonal to w.

    In order for v' to be orthogonal to w, we must have

        <v - cw, w> = 0        or        <v, w> - c<w, w> = 0        or        <v, w> = c<w, w>

    Thus c = <v, w>/<w, w>. Conversely, suppose c = <v, w>/<w, w>. Then

        <v - cw, w> = <v, w> - c<w, w> = <v, w> - (<v, w>/<w, w>)<w, w> = 0

6.25. Find the Fourier coefficient c and the projection of v = (1, -2, 3, -4) along w = (1, 2, 1, 2) in R⁴.

    Compute <v, w> = 1 - 4 + 3 - 8 = -8 and ||w||² = 1 + 4 + 1 + 4 = 10. Then c = -8/10 = -4/5 and

        proj (v, w) = cw = (-4/5)(1, 2, 1, 2) = (-4/5, -8/5, -4/5, -8/5)

6.26. Find an orthonormal basis for the subspace U of R⁴ spanned by the vectors

        v1 = (1, 1, 1, 1)        v2 = (1, 1, 2, 4)        v3 = (1, 2, -4, -3)

    First find an orthogonal basis of U using the Gram-Schmidt algorithm. Begin the algorithm by setting w1 = v1 = (1, 1, 1, 1). Next find

        v2 - (<v2, w1>/||w1||²)w1 = (1, 1, 2, 4) - (8/4)(1, 1, 1, 1) = (-1, -1, 0, 2)

    Set w2 = (-1, -1, 0, 2). Then find

        v3 - (<v3, w1>/||w1||²)w1 - (<v3, w2>/||w2||²)w2 = (1, 2, -4, -3) - (-4/4)(1, 1, 1, 1) - (-9/6)(-1, -1, 0, 2)
                                                        = (1/2, 3/2, -3, 1)

    Clear fractions to obtain w3 = (1, 3, -6, 2). Last, normalize the orthogonal basis consisting of w1, w2, w3. Since ||w1||² = 4, ||w2||² = 6, and ||w3||² = 50, the following vectors form an orthonormal basis of U:

        u1 = (1/2)(1, 1, 1, 1)        u2 = (1/√6)(-1, -1, 0, 2)        u3 = (1/(5√2))(1, 3, -6, 2)

6.27. Let V be the vector space of polynomials f(t) with inner product <f, g> = ∫₀¹ f(t)g(t) dt. Apply the Gram-Schmidt algorithm to the set {1, t, t²} to obtain an orthogonal set {f0, f1, f2} with integer coefficients.

    First set f0 = 1. Then find

        t - (<t, 1>/<1, 1>)·1 = t - (1/2)/1 = t - 1/2

    Clear fractions to obtain f1 = 2t - 1. Then find

        t² - (<t², 1>/<1, 1>)·1 - (<t², 2t - 1>/<2t - 1, 2t - 1>)·(2t - 1) = t² - 1/3 - ((1/6)/(1/3))(2t - 1)
                                                                          = t² - t + 1/6

    Clear fractions to obtain f2 = 6t² - 6t + 1. Thus {1, 2t - 1, 6t² - 6t + 1} is the required orthogonal set.


6.28. Suppose v = (1, 3, 5, 7). Find the projection of v onto W (or, find w ∈ W which minimizes ||v - w||) where W is the subspace of R⁴ spanned by:

    (a) u1 = (1, 1, 1, 1) and u2 = (1, -3, 4, -2);
    (b) v1 = (1, 1, 1, 1) and v2 = (1, 2, 3, 2).

    (a) Since u1 and u2 are orthogonal, we need only compute the Fourier coefficients:

            c1 = <v, u1>/||u1||² = (1 + 3 + 5 + 7)/(1 + 1 + 1 + 1) = 16/4 = 4
            c2 = <v, u2>/||u2||² = (1 - 9 + 20 - 14)/(1 + 9 + 16 + 4) = -2/30 = -1/15

        Then

            w = proj (v, W) = c1u1 + c2u2 = 4(1, 1, 1, 1) - (1/15)(1, -3, 4, -2) = (59/15, 63/15, 56/15, 62/15)

    (b) Since v1 and v2 are not orthogonal, first apply the Gram-Schmidt algorithm to find an orthogonal basis for W. Set w1 = v1 = (1, 1, 1, 1). Then find

            v2 - (<v2, w1>/||w1||²)w1 = (1, 2, 3, 2) - (8/4)(1, 1, 1, 1) = (-1, 0, 1, 0)

        Set w2 = (-1, 0, 1, 0). Now compute

            c1 = <v, w1>/||w1||² = (1 + 3 + 5 + 7)/(1 + 1 + 1 + 1) = 16/4 = 4
            c2 = <v, w2>/||w2||² = (-1 + 0 + 5 + 0)/(1 + 0 + 1 + 0) = 4/2 = 2

        Then w = proj (v, W) = c1w1 + c2w2 = 4(1, 1, 1, 1) + 2(-1, 0, 1, 0) = (2, 4, 6, 4).

6.29. Suppose w1 and w2 are nonzero orthogonal vectors. Let v be any vector in V. Find c1 and c2 so that v' is orthogonal to w1 and w2, where v' = v - c1w1 - c2w2.

    If v' is orthogonal to w1, then

        0 = <v - c1w1 - c2w2, w1> = <v, w1> - c1<w1, w1> - c2<w2, w1>
          = <v, w1> - c1<w1, w1> - c2·0 = <v, w1> - c1<w1, w1>

    Thus c1 = <v, w1>/<w1, w1>. (That is, c1 is the component of v along w1.) Similarly, if v' is orthogonal to w2, then

        0 = <v - c1w1 - c2w2, w2> = <v, w2> - c2<w2, w2>

    Thus c2 = <v, w2>/<w2, w2>. (That is, c2 is the component of v along w2.)

6.30. Prove Theorem 6.8.

    For i = 1, 2, ..., r, and using <wi, wj> = 0 for i ≠ j, we have

        <v - c1w1 - c2w2 - ··· - crwr, wi> = <v, wi> - c1<w1, wi> - ··· - ci<wi, wi> - ··· - cr<wr, wi>
                                           = <v, wi> - c1·0 - ··· - ci<wi, wi> - ··· - cr·0
                                           = <v, wi> - (<v, wi>/<wi, wi>)<wi, wi> = 0

    Thus the theorem is proved.


6.31. Prove Theorem 6.9.

    By Theorem 6.8, v - ∑ ckwk is orthogonal to every wi, and hence orthogonal to any linear combination of w1, w2, ..., wr. Therefore, using the Pythagorean theorem (all sums run from k = 1 to r),

        ||v - ∑ akwk||² = ||(v - ∑ ckwk) + ∑ (ck - ak)wk||² = ||v - ∑ ckwk||² + ||∑ (ck - ak)wk||²
                        ≥ ||v - ∑ ckwk||²

    The square root of both sides gives us our theorem.

6.32. Prove Theorem 6.10 (Bessel inequality).

    Note that ci = <v, ei>, since ||ei|| = 1. Then, using <ei, ej> = 0 for i ≠ j and summing from k = 1 to r, we get

        0 ≤ <v - ∑ ckek, v - ∑ ckek> = <v, v> - 2<v, ∑ ckek> + ∑ ck²
                                     = <v, v> - ∑ 2ck<v, ek> + ∑ ck²
                                     = <v, v> - 2∑ ck² + ∑ ck² = <v, v> - ∑ ck²

    This gives us our inequality.

6.33. Prove Theorem 6.11.

    The proof uses the Gram-Schmidt algorithm and Remarks 1 and 2 of Section 6.6. That is, apply the algorithm to {vi} to obtain an orthogonal basis {w1, ..., wn}, and then normalize {wi} to obtain an orthonormal basis {ui} of V. The algorithm guarantees that each wk is a linear combination of v1, ..., vk, and hence each uk is a linear combination of v1, ..., vk.

6.34. Prove Theorem 6.12.

    Extend S to a basis S' = {w1, ..., wr, vr+1, ..., vn} for V. Applying the Gram-Schmidt algorithm to S', we first obtain w1, w2, ..., wr, since S is orthogonal, and then we obtain vectors wr+1, ..., wn, where {w1, w2, ..., wn} is an orthogonal basis for V. Thus the theorem is proved.

6.35. Prove Theorem 6.4.

    By Theorem 6.11, there exists an orthogonal basis {u1, ..., ur} of W; and, by Theorem 6.12, we can extend it to an orthogonal basis {u1, u2, ..., un} of V. Hence ur+1, ..., un ∈ W⊥. If v ∈ V, then

        v = a1u1 + ··· + anun        where a1u1 + ··· + arur ∈ W and ar+1ur+1 + ··· + anun ∈ W⊥

    Accordingly, V = W + W⊥.

    On the other hand, if w ∈ W ∩ W⊥, then <w, w> = 0. This yields w = 0; hence W ∩ W⊥ = {0}.

    The two conditions, V = W + W⊥ and W ∩ W⊥ = {0}, give the desired result V = W ⊕ W⊥.

    Note that we have proved the theorem only for the case that V has finite dimension; we remark that the theorem also holds for spaces of arbitrary dimension.

6.36. Suppose W is a subspace of a finite-dimensional space V. Show that W = W⊥⊥.

    By Theorem 6.4, V = W ⊕ W⊥ and, also, V = W⊥ ⊕ W⊥⊥. Hence

        dim W = dim V - dim W⊥        and        dim W⊥⊥ = dim V - dim W⊥

    This yields dim W = dim W⊥⊥. But W ⊆ W⊥⊥ [Problem 6.19(a)]; hence W = W⊥⊥, as required.


INNER PRODUCTS AND POSITIVE DEFINITE MATRICES

6.37. Determine whether or not A is positive definite, where

        A = [1  0  1]
            [0  1  2]
            [1  2  3]

    Use the Diagonalization Algorithm of Section 4.11 to transform A into a (congruent) diagonal matrix. Apply -R1 + R3 → R3 and -C1 + C3 → C3, and then -2R2 + R3 → R3 and -2C2 + C3 → C3, to obtain

        A  ~  [1  0  0]  ~  [1  0   0]
              [0  1  2]     [0  1   0]
              [0  2  2]     [0  0  -2]

    There is a negative entry, -2, in the diagonal matrix; hence A is not positive definite.

6.38. Find the matrix A which represents the usual inner product on R² relative to each of the following bases of R²:

    (a) {v1 = (1, 3), v2 = (2, 5)};        (b) {w1 = (1, 2), w2 = (4, -2)}.

    (a) Compute <v1, v1> = 1 + 9 = 10, <v1, v2> = 2 + 15 = 17, <v2, v2> = 4 + 25 = 29. Thus

            A = [10  17]
                [17  29]

    (b) Compute <w1, w1> = 1 + 4 = 5, <w1, w2> = 4 - 4 = 0, <w2, w2> = 16 + 4 = 20. Thus

            A = [5   0]
                [0  20]

        (Since the basis vectors are orthogonal, the matrix A is diagonal.)

6.39. Consider the vector space V of polynomials f(t) of degree ≤ 2 with inner product <f, g> = ∫₋₁¹ f(t)g(t) dt.

    (a) Find <f, g> where f(t) = t + 2 and g(t) = t² - 3t + 4.
    (b) Find the matrix A of the inner product with respect to the basis {1, t, t²} of V.
    (c) Verify Theorem 6.14 by showing that <f, g> = [f]ᵀA[g] with respect to the basis {1, t, t²}.

    (a) <f, g> = ∫₋₁¹ (t + 2)(t² - 3t + 4) dt = ∫₋₁¹ (t³ - t² - 2t + 8) dt = [t⁴/4 - t³/3 - t² + 8t]₋₁¹ = 46/3

    (b) Here we use the fact that, if r + s = n,

            <tʳ, tˢ> = ∫₋₁¹ tⁿ dt = 2/(n + 1) if n is even,        and        0 if n is odd

        Then <1, 1> = 2, <1, t> = 0, <1, t²> = 2/3, <t, t> = 2/3, <t, t²> = 0, <t², t²> = 2/5. Thus

            A = [2    0    2/3]
                [0    2/3  0  ]
                [2/3  0    2/5]

    (c) We have [f]ᵀ = (2, 1, 0) and [g]ᵀ = (4, -3, 1) relative to the given basis. Then

            [f]ᵀA[g] = (2, 1, 0)A(4, -3, 1)ᵀ = (4, 2/3, 4/3)(4, -3, 1)ᵀ = 16 - 2 + 4/3 = 46/3 = <f, g>


6.40. Prove Theorem 6.13.

    For any vectors u1, u2, and v,

        <u1 + u2, v> = (u1 + u2)ᵀAv = (u1ᵀ + u2ᵀ)Av = u1ᵀAv + u2ᵀAv = <u1, v> + <u2, v>

    and, for any scalar k and vectors u, v,

        <ku, v> = (ku)ᵀAv = kuᵀAv = k<u, v>

    Thus [I1] is satisfied.

    Since uᵀAv is a scalar, (uᵀAv)ᵀ = uᵀAv. Also, Aᵀ = A since A is symmetric. Therefore,

        <u, v> = uᵀAv = (uᵀAv)ᵀ = vᵀAᵀu = vᵀAu = <v, u>

    Thus [I2] is satisfied.

    Lastly, since A is positive definite, XᵀAX > 0 for any nonzero X ∈ Rⁿ. Thus, for any nonzero vector v, <v, v> = vᵀAv > 0. Also, <0, 0> = 0ᵀA0 = 0. Thus [I3] is satisfied. Accordingly, the function <u, v> = uᵀAv is an inner product.

6.41. Prove Theorem 6.14.

    Suppose S = {w1, w2, ..., wn} and A = (kij), where kij = <wi, wj>. Suppose

        u = a1w1 + a2w2 + ··· + anwn        and        v = b1w1 + b2w2 + ··· + bnwn

    Then

        <u, v> = ∑ᵢ,ⱼ aibj<wi, wj>        (1)

    On the other hand,

        [u]ᵀA[v] = (a1, a2, ..., an)A(b1, b2, ..., bn)ᵀ = ∑ᵢ,ⱼ aikijbj        (2)

    Since kij = <wi, wj>, the final sums in (1) and (2) are equal. Thus <u, v> = [u]ᵀA[v].

6.42. Prove Theorem 6.15.

    Since <wi, wj> = <wj, wi> for any basis vectors wi and wj, the matrix A is symmetric. Let X be any nonzero vector in Rⁿ. Then [u] = X for some nonzero vector u ∈ V. Using Theorem 6.14, we have XᵀAX = [u]ᵀA[u] = <u, u> > 0. Thus A is positive definite.

INNER PRODUCTS AND ORTHOGONAL MATRICES

6.43. Find an orthogonal matrix P whose first row is u1 = (1/3, 2/3, 2/3).

    First find a nonzero vector w2 = (x, y, z) which is orthogonal to u1, i.e., for which

        0 = <u1, w2> = x/3 + 2y/3 + 2z/3        or        x + 2y + 2z = 0

    One such solution is w2 = (0, 1, -1). Normalize w2 to obtain the second row of P, i.e., u2 = (0, 1/√2, -1/√2).

    Next find a nonzero vector w3 = (x, y, z) which is orthogonal to both u1 and u2, i.e., for which

        0 = <u1, w3> = x/3 + 2y/3 + 2z/3        or        x + 2y + 2z = 0
        0 = <u2, w3> = y/√2 - z/√2              or        y - z = 0

    Set z = -1 and find the solution w3 = (4, -1, -1). Normalize w3 to obtain the third row of P, i.e., u3 = (4/√18, -1/√18, -1/√18). Thus

        P = [1/3       2/3       2/3     ]
            [0         1/√2     -1/√2    ]
            [4/(3√2)  -1/(3√2)  -1/(3√2) ]

    We emphasize that the above matrix P is not unique.
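The same construction can be carried out numerically by running Gram-Schmidt on the given row together with fill-in vectors. A minimal sketch, assuming NumPy (the choice of the standard basis vectors as fill-ins is ours; it generally produces a different, equally valid P):

    import numpy as np

    u1 = np.array([1/3, 2/3, 2/3])
    rows = [u1]
    for e in np.eye(3):                     # candidate fill-in vectors
        w = e.copy()
        for r in rows:
            w -= np.dot(w, r) / np.dot(r, r) * r   # remove components along earlier rows
        if np.linalg.norm(w) > 1e-10:       # skip candidates already in the span
            rows.append(w / np.linalg.norm(w))

    P = np.array(rows[:3])
    print(np.allclose(P @ P.T, np.eye(3)))  # True: P is orthogonal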

6.44. Let

        A = [1   1  -1]
            [1   3   4]
            [7  -5   2]

    Determine whether or not: (a) the rows of A are orthogonal; (b) A is an orthogonal matrix; (c) the columns of A are orthogonal.

    (a) Yes, since

            (1, 1, -1)·(1, 3, 4) = 1 + 3 - 4 = 0        (1, 1, -1)·(7, -5, 2) = 7 - 5 - 2 = 0
            (1, 3, 4)·(7, -5, 2) = 7 - 15 + 8 = 0

    (b) No, since the rows of A are not unit vectors; e.g., ||(1, 1, -1)||² = 1 + 1 + 1 = 3.

    (c) No; e.g., (1, 1, 7)·(1, 3, -5) = 1 + 3 - 35 = -31 ≠ 0.

6.45. Let B be the matrix obtained by normalizing each row of A in Problem 6.44. (a) Find B. (b) Is B an orthogonal matrix? (c) Are the columns of B orthogonal?

    (a) We have

            ||(1, 1, -1)||² = 1 + 1 + 1 = 3        ||(1, 3, 4)||² = 1 + 9 + 16 = 26
            ||(7, -5, 2)||² = 49 + 25 + 4 = 78

        Thus

            B = [1/√3    1/√3   -1/√3 ]
                [1/√26   3/√26   4/√26]
                [7/√78  -5/√78   2/√78]

    (b) Yes, since the rows of B are still orthogonal and are now unit vectors.

    (c) Yes, since the rows of B form an orthonormal set of vectors; then, by Theorem 6.16, the columns of B must automatically form an orthonormal set.

6.46. Prove each of the following:

    (a) P is orthogonal if and only if Pᵀ is orthogonal.
    (b) If P is orthogonal, then P⁻¹ is orthogonal.
    (c) If P and Q are orthogonal, then PQ is orthogonal.

    (a) We have (Pᵀ)ᵀ = P. Thus P is orthogonal iff PPᵀ = I iff (Pᵀ)ᵀPᵀ = I iff Pᵀ is orthogonal.

    (b) We have Pᵀ = P⁻¹ since P is orthogonal. Thus, by (a), P⁻¹ is orthogonal.

    (c) We have Pᵀ = P⁻¹ and Qᵀ = Q⁻¹. Thus

            (PQ)(PQ)ᵀ = PQQᵀPᵀ = PQQ⁻¹P⁻¹ = I

        Thus (PQ)ᵀ = (PQ)⁻¹, and so PQ is orthogonal.

6.47. Suppose P is an orthogonal matrix. Show that:

    (a) <Pu, Pv> = <u, v> for any u, v ∈ V;        (b) ||Pu|| = ||u|| for every u ∈ V.

    (a) Using PᵀP = I, we have

            <Pu, Pv> = (Pu)ᵀ(Pv) = uᵀPᵀPv = uᵀv = <u, v>

    (b) Using PᵀP = I, we have

            ||Pu||² = <Pu, Pu> = uᵀPᵀPu = uᵀu = <u, u> = ||u||²

        Taking the square root of both sides gives us our result.

6.48. Prove Theorem 6.17.

    Suppose

        e'i = bi1e1 + bi2e2 + ··· + binen,        i = 1, ..., n        (1)

    Using Problem 6.23(b) and the fact that E' is orthonormal, we get

        δij = <e'i, e'j> = bi1bj1 + bi2bj2 + ··· + binbjn        (2)

    Let B = (bij) be the matrix of coefficients in (1). (Then P = Bᵀ.) Suppose BBᵀ = (cij). Then

        cij = bi1bj1 + bi2bj2 + ··· + binbjn        (3)

    By (2) and (3), we have cij = δij. Thus BBᵀ = I. Accordingly, B is orthogonal, and hence P = Bᵀ is orthogonal.

6.49. Prove Theorem 6.18.

    Since {ei} is orthonormal, we get, by Problem 6.23(b),

        <e'i, e'j> = a1ia1j + a2ia2j + ··· + anianj = <Ci, Cj>

    where Ci denotes the ith column of the orthogonal matrix P = (aij). Since P is orthogonal, its columns form an orthonormal set. This implies <e'i, e'j> = <Ci, Cj> = δij. Thus {e'i} is an orthonormal basis.

COMPLEX INNER PRODUCT SPACES

6.50. Let V be a complex inner product space. Verify the relation

        <u, av1 + bv2> = ā<u, v1> + b̄<u, v2>

    Using [I2*], [I1*], and then [I2*] again, we find

        <u, av1 + bv2> = conj(<av1 + bv2, u>) = conj(a<v1, u> + b<v2, u>)
                       = ā·conj(<v1, u>) + b̄·conj(<v2, u>) = ā<u, v1> + b̄<u, v2>

6.51. Suppose <u, v> = 3 + 2i in a complex inner product space V. Find:

    (a) <(2 - 4i)u, v>;        (b) <u, (4 + 3i)v>;        (c) <(3 - 6i)u, (5 - 2i)v>.

    (a) <(2 - 4i)u, v> = (2 - 4i)<u, v> = (2 - 4i)(3 + 2i) = 14 - 8i

    (b) <u, (4 + 3i)v> = conj(4 + 3i)<u, v> = (4 - 3i)(3 + 2i) = 18 - i

    (c) <(3 - 6i)u, (5 - 2i)v> = (3 - 6i)·conj(5 - 2i)<u, v> = (3 - 6i)(5 + 2i)(3 + 2i) = 129 - 18i


6.52. Find the Fourier coefficient (component) c and the projection cw of v = (3 + 4i, 2 - 3i) along w = (5 + i, 2i) in C².

    Recall c = <v, w>/<w, w>. Compute

        <v, w> = (3 + 4i)·conj(5 + i) + (2 - 3i)·conj(2i) = (3 + 4i)(5 - i) + (2 - 3i)(-2i)
               = (19 + 17i) + (-6 - 4i) = 13 + 13i

        <w, w> = 25 + 1 + 4 = 30

    Thus c = (13 + 13i)/30 = 13/30 + 13i/30. Accordingly,

        proj (v, w) = cw = ((26 + 39i)/15, (-13 + 13i)/15)

6.53. Prove Theorem 6.19 (Cauchy-Schwarz): |<u, v>| ≤ ||u|| ||v||.

    If v = 0, the inequality reduces to 0 ≤ 0 and hence is valid. Now suppose v ≠ 0. Using zz̄ = |z|² (for any complex number z) and <v, u> = conj(<u, v>), we expand ||u - <u, v>tv||² ≥ 0, where t is any real value:

        0 ≤ ||u - <u, v>tv||² = <u - <u, v>tv, u - <u, v>tv>
                              = <u, u> - conj(<u, v>)t<u, v> - <u, v>t<v, u> + <u, v>·conj(<u, v>)t²<v, v>
                              = ||u||² - 2t|<u, v>|² + t²|<u, v>|²||v||²

    Set t = 1/||v||² to find

        0 ≤ ||u||² - |<u, v>|²/||v||²

    from which |<u, v>|² ≤ ||u||²||v||². Taking the square root of both sides, we obtain the required inequality.

6.54. Find an orthogonal basis for u⊥ in C³, where u = (1, i, 1 + i).

    Here u⊥ consists of all vectors w = (x, y, z) such that

        <w, u> = x - iy + (1 - i)z = 0

    Find one such solution, say w1 = (0, 1 - i, i). Then find a solution of the system

        x - iy + (1 - i)z = 0        (1 + i)y - iz = 0

    Here z is a free variable. Set z = 1 to obtain y = i/(1 + i) = (1 + i)/2 and x = (3i - 3)/2. Multiplying by 2 yields the solution w2 = (3i - 3, 1 + i, 2). The vectors w1 and w2 form an orthogonal basis for u⊥.

6.55. Find an orthonormal basis of the subspace W of C³ spanned by

        v1 = (1, i, 0)        and        v2 = (1, 2, 1 - i)

    Apply the Gram-Schmidt algorithm. Set w1 = v1 = (1, i, 0). Compute

        v2 - (<v2, w1>/||w1||²)w1 = (1, 2, 1 - i) - ((1 - 2i)/2)(1, i, 0) = ((1 + 2i)/2, (2 - i)/2, 1 - i)

    Multiply by 2 to clear fractions, obtaining w2 = (1 + 2i, 2 - i, 2 - 2i). Next find ||w1|| = √2 and then ||w2|| = √18. Normalizing {w1, w2}, we obtain the following orthonormal basis of W:

        u1 = (1/√2)(1, i, 0)        u2 = (1/√18)(1 + 2i, 2 - i, 2 - 2i)

6.56. Find the matrix P which represents the usual inner product on C³ relative to the basis {1, i, 1 - i}.

    Compute

        <1, 1> = 1·1̄ = 1            <1, i> = 1·conj(i) = -i           <1, 1 - i> = conj(1 - i) = 1 + i
        <i, i> = i·conj(i) = 1       <i, 1 - i> = i(1 + i) = -1 + i    <1 - i, 1 - i> = (1 - i)(1 + i) = 2

    Then, using <u, v> = conj(<v, u>), we have

        P = [1       -i       1 + i ]
            [i        1      -1 + i ]
            [1 - i   -1 - i   2     ]

    (As expected, P is Hermitian; that is, Pᴴ = P.)

NORMED VECTOR SPACES

6.57. Consider the vectors u = (1, 3, -6, 4) and v = (3, -5, 1, -2) in R⁴. Find:

    (a) ||u||∞ and ||v||∞        (c) ||u||2 and ||v||2
    (b) ||u||1 and ||v||1        (d) d∞(u, v), d1(u, v), and d2(u, v)

    (a) The infinity-norm chooses the maximum of the absolute values of the components. Hence

            ||u||∞ = 6        and        ||v||∞ = 5

    (b) The one-norm adds the absolute values of the components. Thus

            ||u||1 = 1 + 3 + 6 + 4 = 14        ||v||1 = 3 + 5 + 1 + 2 = 11

    (c) The two-norm is equal to the square root of the sum of the squares of the components (i.e., the norm induced by the usual inner product on R⁴). Thus

            ||u||2 = √(1 + 9 + 36 + 16) = √62        and        ||v||2 = √(9 + 25 + 1 + 4) = √39

    (d) First find u - v = (-2, 8, -7, 6). Then

            d∞(u, v) = ||u - v||∞ = 8
            d1(u, v) = ||u - v||1 = 2 + 8 + 7 + 6 = 23
            d2(u, v) = ||u - v||2 = √(4 + 64 + 49 + 36) = √153

6.58. Consider the function f{t) = t2 - 4f in C[0, 3]. (a) Find ||/ IL. (b) Plot /(t) in the plane R2.(c)Find ||/1| t. (rf) Find ||/1|2.

(a) We seek II / IL = max (|/(f)D- Since/(r)is differentiable on [0, 3], |/(i)| has a maximum at a critical

point of/(J), i.e.,when the derivative fit) = 0, or at an endpoint of [0, 3]. Since/'(\320\223)= 2f \342\200\224

4, we set2t \342\200\2244 = 0 and obtain t = 2 as a critical point. Compute

/B) = 4-8=-4 /@)= 0-0 = 0 /C) =9-12=-3Thus ||/IL= l/B)| = |-4|=4.

(b) Compute/(r) for various values of t in [0, 3], e.g.,

0 12 3

0 -3 -4 -3fit)

Plot the points in R2 and then draw a continuous curve through the points as shown in Fig. 6-11.

(c) We seek ||/||, =J3, |/(r)| dt. As indicated by Fig. 6-11, f(t) is negative in [0, 3]; hence

= -it2 -4i)

= 4t - t2. Thus

,= Di -

t2) dt =\\2t2

- '-\\ =18-9 = 9Jo L 3Jo

f3 f3 \320\223\320\2635 16i3~l3 15311/11= U(t)Vdt= (f*- 8r3 + 16r2) dr =

\\--2t* + \342\200\224\\ =-^

Jo Jo \\_-> J Jo \321\215

II/

\320\2233

CHAP. 6] INNER PRODUCT SPACES, ORTHOGONALITY 239

Fig.6-11

6.59. Prove Theorem 6.25.

If \320\270\320\244v, then \320\270- v \320\2440, and hence d(u, v) = || \320\270- v || > 0. Also, d(u, \320\270)= || \320\270- \320\270|

[\320\233/,]is satisfied. We also have= 0. Thus

d(u, v) = || \320\270- v || = || -1(\342\200\236-\321\213)||=|_1|||\342\200\236-\321\213||= || v \342\200\224u || =d{v,u)

and

d{u, v) = || \320\270- v || = || (u- w) + (w

- v) || < || \320\270- w || + ||w - v || =d(u, w)

Thus [M2] and QM3]are satisfied.

Supplementary Problems

INNER PRODUCTS

6.60. Verify that the following is an inner product on R2 where \320\270= (xt, x2) and v = (y,, y2):

/(u, v) = x,y, -2x,y2

- 2x2yL + 5x2y2

6.61. Find the values of \320\272so that the following is an inner product on R2 where \320\270= (x,, x2) and v = (y1? y2):

f(u, v) = x, y,- 3x, y2

- 3x2y, + br2y2

6.62. Consider the vectors \321\213= A, - 3)and v = B, 5) in R2. Find:

(a) <u, u> with respect to the usual inner product in R2.

(b) <u, t)> with respect to the inner product in R2 in Problem 6.60.

(c) || v || using the usual inner product in R2.

(d) || v || using the inner product in R2 in Problem 6.60.

6.63. Show that each of the following is not an inner product on R3 where \320\270= (x,, x2, x3) and v = (y,, y2, y3):

(a) <u, t)>=x,y, +x2y2 and (b) <u, t)> = x,y2x3 +

6.64. Let V be the vector space ofmxn matrices over R. Show that </l, B> = tr [BT A) defines an inner product

in V.

240 [NNER PRODUCT SPACES, ORTHOGONALITY [CHAP. 6

6.65. Let V be the vector space of polynomials over R. Show that </, g> = JJ f{t)g(t) dt defines an inner product

in V.

6.66. Suppose |<\320\274,i>>| = || \320\270|| || v ||. (That is, the Cauchy-Schwarz inequality reduces to an equality.) Show that \320\270

and i' are linearly dependent.

6.67. Suppose/(u, v) and g(u, v) are inner products on a vectorspaceV over R. Prove:

(a) The sum/+ g is an inner product on V where (/+ g) {u. v) =/(u, v) + g(u. v).

(b) Thescalar product kf, for \320\272> 0, is an inner product on V where (kf){u, v) = kf(u, v).

ORTHOGONALITY, ORTHOGONAL COMPLEMENTS, ORTHOGONAL SETS

6.68. Let V be the vector space of polynomials over R of degree < 2 with inner product </, g} = j^ f(t)g{t) dt.

Find a basis of the subspace W orthogonal to h(t)= 2t + 1.

6.69. Find a basis of the subspace W of R* orthogonal to u, = A, -2, 3, 4) and u2 = C, -

5, 7, 8).

6.70. Find a basis for the subspace W of R5 orthogonal to the vectors u,= A,1, 3,4,1) and u2

= A, 2, 1, 2,1).

6.71. Let w = A, \342\200\2242,\342\200\2241, 3) be a vector in R*. Find (a) an orthogonal and (b) an orthonormal basis for w1.

6.72. Let W be the subspace of R4 orthogonal to u,= A, 1, 2, 2) and u2 = @, 1, 2, -1). Find (a) an orthogonal

and (b)an orthonormal basis for W. (Compare with Problem 6.69.)

6.73. Let S consistof the following vectors in R*:

\302\253,=A,1,1,1) \321\2132= A, 1, -1, -1) \321\213\321\215

= A, -I, 1, -1) \321\2134= A, -1, -1, 1)

(a) Show that S is orthogonal and a basis of R4.

(b) Write v = A, 3, \342\200\2245, 6) as a linear combination of \320\270L, u2, u3, u4.

(c) Find the coordinates of an arbitrary vector v = (a, b, c, d) in R* relative to the basis S.

(d) Normalize S to obtain an orthonormal basis of R4.

6.74. Let V be the vector space of 2 x 2 matrices over R with inner product (A, B> = tr (BTA). Show that the

following is an orthonormal basis of V:

\302\260\\(\302\260l\\(\302\260 \302\260\\(\302\260o \320\276\320\224\320\276o)'\\p

6.75. Let V be the vector space of 2 x 2 matrices over R with inner product </l, B> = tr (B1A). Find an orthog-orthogonalbasis for the orthogonal complementof (a) the diagonal and (b) the symmetric matrices.

6.76. Suppose {uL, u2, ..., ur} is an orthogonal set of vectors. Show that {fciU,, k2u2,..., krur} is orthogonal for

any scalars kl,k2, \302\246\302\246-,kr.

6.77. Let V and W be subspaces of a finite-dimensional inner product space V. Show that: (a) (V + WI =1\320\257-n WL\\ and (b) (U n \320\23001

= V\302\261+ W\302\261.

PROJECTIONS, GRAM-SCHMIDT ALGORITHM, APPLICATIONS

6.78. Find an orthogonal and an orthonormal basis for the subspace U of R* spanned by the vectors

p, = A,1,1, l),t>2= (l, -1,2,2),^ = A,2, -3, -4).

CHAP. 6] INNER PRODUCT SPACES, ORTHOGONALITY 241

6.79. Let V be the vector space of polynomials/(r) with inner product </ g> = Jj| f(t)g(t) dt. Apply the Gram-Schmidt algorithm to the set {1,t, f2} to obtain an orthogonal set {/0,/i,/2} with integer coefficients.

6.80. Supposev = (I, 2, 3,4,6). Find the projection of v onto W (or find w e W which minimizes || v \342\200\224w ||) whereW is the subspace of R5 spanned by:

(\320\260)\321\213,= A, 2, I, 2, I)andu2 = (l, -I, 2, -I, 1); (b) p,

= A, 2, 1, 2, 1)and t>2= A, 0, 1, 5, - 1)

6.81. Let V -C[\342\200\2241, 1] with inner product </, g) = J'_, f(t)g(t) dt. Let W be the subspace of V of polynomials

of degree <3. Find the projection of/(J) = t5 onto W. [Hint: Use the (Legendre) polynomials 1, t, 3t2 \342\200\2241,

5f3-3iin Example 6.13.]

6.82. Let V = C[0, I] with inner product </, #> =ft f(t)g{t) dt. Let W be the subspace of V of polynomials of

degree <2. Find the projection of/(r) = t3 onto H7. (Hint: Use the polynomials 1, 2f \342\200\2241, 6f2 \342\200\2246t + 1 inProblem 6.27.)

6.83. Let V be the subspace of R* spanned by the vectors

i>,=(l, 1, 1,1) v2= A, -1, 2, 2) d3=A, 2, -3, -4)

Find the projection of v = A, 2, - 3,4) onto I/.{Hint:UseProblem 6.78.)

INNER PRODUCTS AND POSITIVE DEFINITE MATRICES,ORTHOGONAL MATRICES

6.84. Find the matrix A which represents the usual inner product on R2 relative to each of the following bases ofR2:(a) {\302\273,

= A,4), v2 = B, -3)} and (b) {w,= A, - 3),w2

= F, 2)}.

6.85. Consider the following inner product on R2:

f(u, v)= x,y, - 2x,y2 -

2x2 y, + 5x2 y2 where \320\270= (x,, x2) and v =(y\342\200\236y2)

Find the matrix \320\222which represents this inner product on R2 relative to each basisin Problem 6.84.

6.86. Find the matrix \320\241which represents the usual basis on R3 relative to the basis S of R3 consisting of thevectors: u, =

A, 1, 1), u2 = A, 2, 1),u3= A, - 1,3).

6.87. Consider the vector space V of polynomials f(t) of degree <2 with inner product </, #> =^of{t)g(t) dt.

(a) Find </ #> where/(t) = t + 2 and 0(i) = t2 - 3t + 4.

(b) Find the matrix A of the inner product with respect to the basis {1, t, t2} of V.

(c) Verify Theorem 6.14 that </, #> = [/]r/l[g] with respect to the basis {1,t, t2}.

6.88. Determine which of the following matrices are positive definite:

4 5> \302\253G ')\342\200\242 \302\253G D- \302\253(-5 \"\320\255

6\320\2339. Determine whether or not /1 is positive definite where

2 -2 1\\ /1 1 2\\

(a) /l=|-2 3 -21 (b) /1=11 2 6 1

1 -2 2/ \\26 9/

6.90. Suppose ^ and \320\222are positive definite matrices. Show that: (a) A + \320\222is positive definite; (b) <;/! is positivedefinite for \320\272> \320\236.

6.91. Suppose \320\222is a real nonsingular matrix. Show that: (a) BTB is symmetric, and (fc) BTB is positive definite.

242 INNER PRODUCT SPACES, ORTHOGONALITY [CHAP. 6

6.92. Find the number and exhibit all 2 x 2 orthogonal matrices of the form IT

\\\320\243

6.93. Find a 3 x 3 orthogonal matrix P whose first two rows are multiples of \320\270= A, 1, 1) and v = A, \342\200\2242,3),

respectively.

6.94. Find a symmetric orthogonal matrix P whose first row is ($, f, \302\247).(Compare with Problem 6.43.)

6.95. Real matrices A and \320\222are said to be orthogonally equivalent if there exists an orthogonal matrix P suchthat \320\222= PTAP. Show that this relation is an equivalence relation.

COMPLEX INNER PRODUCT SPACES

6.96. Verify that

a2u2, fcjf, + b2tJ> =fl1b,<u1, i^) + flib2<Mi. \022> +a2bi<u2> fi> + O2b2<u2, t>2>

More generally, prove that ( ? Oju,-, ? bjVj) = ? \320\260,-\320\254,-<\320\270,-,t>,>.

\\i=l J=l / i.j

6.97. Consider u = A + i, 3,4 - i) and i; = C- 4i, 1 + i, 2i) in C3. Find:

(a) <u, u>, (b) <i;,m>, (c) ||u||, (d) ||t) ||, (e) rf(u, p).

6.98. Find the Fourier coefficient \321\201and the projection cw of

(a) \320\270= C + i, 5 -2i\") along w = E + i, 1 + i) in C2;

(b) \320\270= A - i, 3i, 1 + i) along w = A, 2 -i, 3 + 2i) in C3.

6.99. Letu =(\320\263,,z2) and v = (w,, w2) belong to C2. Verify that the following is an inner product on C2:

f(u, v) = z^i + A + i)ztw2 + A -i)z2w, + 3z2iv2

6.100. Find an orthogonal basis and an orthonormal basis for the subspace W of C3 spanned by u,= A, i, 1) and

u2=(l +1,0, 2).

6.101. Let \320\270= (z,, z2) and v = (w,, w2) belong to C2. For what values of a, b, c, d e \320\241is the following an inner

product on C2?

/(u, v)= az,w, 4 bzlw2 +\302\246cz2i\302\273,-I- dz2iv2

6.102. Prove the following polar form for an inner product in a complex space V:

<u, \302\253>= i|| \320\270+ i; ||2 - ill \320\275- p ||2 + ill \320\270+ iv ||2 - ill \320\270- iv ||2

[Compare with Problem 6.9(b).]

6.103. Let V be a real inner product space. Show that:

(i) || \320\270|| = || v II if and only if <u + \320\270,\320\270-t>>

= 0;

(ii) ||u + t;||2= ||u||2 + |i v ||2 if and only if <u, i;> = 0.

Show by counterexamples that the above statements are not true for, say, C2.

6.104. Find the matrix P which represents the usual inner product on C3 relative to the basis {1,1 + i, 1 \342\200\2242i}.

CHAP. 6] INNER PRODUCT SPACES,ORTHOGONALITY 243

6.105. A complex matrix A is unitary if it is invertible and A' = A\". Alternatively, A is unitary if its rows

(columns) form an orthonormal set of vectors (relative to the usual inner product of C\.") Find a unitary

matrix whose first row is: (a)a multiple of A, I \342\200\224i); (b) (|, |i, j \342\200\224

%i).

NORMED VECTOR SPACES

6.106. Considervectors \320\270= A, -3, 4, 1, -2) and \320\270= C, 1, -2,-3, 1)in R5. Find:

(a) ML and || t^ (c) ||u||2and ||t>||2

(b) || \320\270||, and || v ||, (d) da(u, v), d^u, v) and d2(u. u)

6.107. Consider vectors \320\270= A + i, 2 -4i) and v = A -

i, 2 + 3i) in C2. Find:

(a) HulLand | d|L (c) ||n||2and||D||2

(b) || u II, and || v i|, (<f) d,(u, v), dju, v) and d2(u, v)

6.108. Consider the functions f(t) = 5f - i2 and g(J) = 3/ - t2 in C[0, 4]. Find: (a) d^f, g), (b) d,(/, g), and

6.109. Prove that: (a) || \302\246|| L is a norm on R\"; (b) \\\\

\342\200\242\\\\\342\200\236.is a norm on R\".

6.110. Prove that: (a) || \302\246|| L is a norm on C[a, b];{b)\\\\

\342\200\242II\342\200\236is a norm on C[a, b].

Answers to Supplementary Problems

6.61.k> 9

6.62. (a) -13, [b) -71, (c) y/29, (d) ,/89

6.63. Let \320\270= @, 0, 1).Then <u, u> = 0 in both cases.

6.68. {It2 -5t, 12t2 - 5}

6.69. {A,2,1,0),D,4,0,1)}

6.70. (-1,0,0,0,1), (-6. 2,0, 1,0), (-5, 2, 1,0,0)

6.71. (a) @, 0, 3, 1),@, 3, -3, 1), B, 10, -9, 3); (b) @, 0, 3, 1)/Jl6. @, 3, -3, 1)/\320\24319, B, 10, -9, 3)/v/194

6.72. (a) @, 2,-1, 0), (- 15,1,2, 5); (b) @, 2, - 1,0)\320\264\320\224(- 15, I, 2, 5)/s/255

6.73. (b) \320\263= Eu, + 3u2 -13u3 + 9u4)/4

(c) [t>] = [a + b + \321\201+ d, a + b - \321\201- d, a - b + \321\201\342\200\224d, a - b - \321\201+ d]/4

- -\321\201 i>c \321\215

-\302\253\321\201 -;)

6.78. w, = A, 1, 1, 1), w2 = @, -2, 1,1),w3= A2, -4, - 1, -7).

6.79. /0=l,/1 = i-l,/3 = 3\302\2732-6i + 2

244 INNER PRODUCT SPACES,ORTHOGONALITY [CHAP. 6

6.80. (a) proj {v, W) = B1, 27, 26,27,21)/8(b) First find orthogonal basis for W: w, = A, 2,1,2,1), w2

= @,2,0, -3, 2).Thenproj {v, W) = C4, 76, 34,56,42)/17

6.81. proj (/, W) = 10i3/9 -5J/21

6.82. proj (/, W) = 3J2/2-3i/5 + &

6.83. proj (i>, V) = (-14, 158,47, 89)/70

/10 0\\

V 0 40)

/58

V23

6.86.

6.87. (\320\260)\320\246, (b)

6.88. (a) No, (b) yes, (c) no, (J) yes

6.89. (a) Yes,

6.93.

-4

6-94. (| -f i]U 4 -*\320\243

6.97. (a) -4i, (b) 4i, (c) ^28, (d) ,/\320\267\320\242,(\320\265)

6.98. (a) \321\201= A9-50/28, (b) \321\201= C + 60/19

6.100. {\302\273,= A, i, 1)/\320\243\320\267,v2

= Bi, 1 -3i, 3 - 0/724}

6.101. a and d real and positive, c = b and ad - be positive.

6.103. u = (U2),v = (i,2i)

A

l-i\" I + 2\320\233

1+i 2 -2 + 3i]1 \342\200\2242i -2 - 3i 5

CHAP. 6] INNER PRODUCT SPACES, ORTHOGONALITY 245

\302\253Is

6.106. (a) 4and3, (b) 11and 13, (c) .Jbi and ^/24, (d) 6, 19, and 9

6.107. (\320\260)\321\204\320\276and \321\203\320\223\320\267,(\320\254)̂2 + ^/20 and ^/2 + \320\243\320\237, (\321\201)̂/22 and ^/\320\237, (d)

6.108. (a) 8, (fc) 16, (\321\201)̂

Chapter 7

Determinants

7.1 INTRODUCTION

Each n-square matrix A = (ay) is assigned a specialscalarcalled the determinant of A, denoted by

det (A) or | A | or

'12 \"In

\302\2532n

\"nl \"\342\200\2362\342\200\242\342\200\242\302\246\"n

We emphasize that an n x n array of scalars enclosed by straight lines, called a determinant of order n, is

not a matrix but denotes the determinant of the enclosed array of scalars, i.e., the enclosed matrix.The determinant function was first discovered in the investigation of systems of linear equations.

We shall see that the determinant is an indispensabletool in investigating and obtaining properties ofsquare matrices.

The definition of the determinant and most of its propertiesalsoapply in the case where the entriesof a matrix come from a ring.

We begin with the special case of determinants of ordersone,two, and three. Then we define adeterminant of arbitrary order. This general definition is preceded with a discussion of permutations,which is necessary for our general definition of the determinant.

7.2 DETERMINANTS OF ORDERSONE AND TWO

The determinants of orders one and two are defined as follows:

l\302\253lll= \302\253I

\302\25311 \302\25312

\302\25321 \302\25322

Thus the determinant of a 1 x 1 matrix A = (au) is the scalar a,, itself, that is, det {A) = |a,The determinant of order two may easily be rememberedby using the following diagram:

= a,

That is, the determinant is equal to the product of the elements along the plus-labeled arrow minus the

product of the elements along the minus-labeled arrow. (There is an analogous diagram for determi-

determinantsof order three but not for higher-order determinants.)

Example 7.1

(a) Since the determinant of order one is the scalar itself, we have det B4) = 24, det (\342\200\2246) = \342\200\2246, and

det (t + 2)= t 4- 2.

246

CHAP. 7] DETERMINANTS 247

(b)

5 4

2 3=

EX3)- DX2) = 15- 8 = 7 and

2 1-4 6

= BX6)-AX-4)=12

Consider two linear equations in two unknowns:

Recall (Problem1.60)that the system has a unique solution if and only if D =\320\260,\320\2542

\342\200\224a2

that solution is; and

x = \320\2542\321\201\321\203-\320\254\320\263\321\2012

atb2\342\200\224

The solution may be expressed completelyin terms of determinants:

_Nx_b2cl -b^j.D Qlb2

\342\200\224

a2

D alb2 \342\200\224a2bt

\302\2532

Here D, the determinant of the matrix of coefficients, appears in the denominator of both quotients.Thenumerators Nx and Ny of the quotients for x and y, respectively, can be obtained by substituting the

column of constant terms in place of the column of coefficientsof the given unknown in the matrix ofcoefficients.

5yExample 7.2. Solve by determinants :

The determinant D of the matrix of coefficients is

2 -3D = = BX5)-CX-3)=10 + 9=

N =* ' X

7

1

-3

5

SinceD^O, the system has a unique solution. To obtain the numerator Nx replace, in the matrix of coefficients, the

coefficients of x by the constant terms:

= GX5)-(lX-3) =

To obtain the numerator Ny replace, in the matrix of coefficients, the coefficients of \321\203by the constant terms:

= BX1)-CX7)= 2-21 = -

N -19

N =\320\243

2

3

7

1

Thus the unique solution r\302\273f\302\273he\302\253v\302\253t\302\253r>i\302\253

D 19

Remark: The result in Example 7.2 actually holds for any system of n linear

equations in n unknowns, and this general result will be discussed in Section 7.5. We

emphasize that this result is important for theoretical reasons;in practice, Gaussian

elimination, not determinants, is usually used to find the solution of a system of linear

equations.

248 DETERMINANTS [CHAP. 7

7.3 DETERMINANTS OF ORDER THREE

Consider an arbitrary 3x3 matrix A = (ai}).The determinant of A is defined as follows:

del (A)=

\302\253n

\302\25321

\302\25331

\302\25312

\302\25322

'13

'23

a 32

= \302\253lla22\302\25333 \302\25312\302\25323\302\25331+ \302\25313\302\25321\302\25332

-\302\25312\302\25321\302\25333

~\302\253U\302\25323\302\25332

Observe that there are six products, each product consisting of three elements of the original matrix.Three of the products are plus-labeled (keep their sign) and three of the products are minus-labeled(change their sign).

The diagrams in Fig. 7-1 may help to remember the above six products in det {A). That is, thedeterminant is equal to the sum of the products of the elements along the three plus-labeled arrows in

Fig. 7-1 plus the sum of the negatives of the products of the elements along the three minus-labeled

arrows. We emphasize that there are no such diagrammatic devicesto remember determinants of higher

order.

Fig. 7-1

Example 7.3

1 1

5 -2

1 -3

= BK5K4)+ AK-2KD+ AK-3KO)-

A\320\2455\320\245\320\247- (-3X-2X2) -

DX1K0)

= 40-2+0-5-12-0 =

The determinant of the 3 x 3 matrix A =(ay) may be rewritten as:

\302\2531\320\267(\302\25321\302\25332~\302\25322\302\2533l)

= a u\302\25322

\302\25332

\302\25323

\302\25333-\302\25312

a21

\302\25331

\302\25323

\302\25333

+ \302\25313\302\25321

\302\25331

\302\25322

\302\25332

which is a linear combination of three determinants of order two whose coefficients(with alternating

signs) form the first row of the given matrix. This linear combination may be indicated in the form

\302\25311

\302\25321

\302\25331

\302\25312

\302\25322

\302\25332

\302\25313

\302\25323

\302\25333

\342\200\224a 12

\302\24641\302\25312

\302\25322

\0232

\302\25313

\302\25323

\302\25333

+ \302\25313

\302\253I \302\253I1 \0212 \0213

\302\25321 \302\25322 \302\25323

31 a 32 a 33

Note that each 2x2 matrix can be obtained by deleting, in the original matrix, the row and column

containing its coefficient.

CHAP. 7] DETERMINANTS 249

Example 7.4

= !

= 1

1 2 3

4-2 3

0 5-1

-2 35 -!

-2 4 -2

5 -!

-240

3

\342\200\2241+ 3

4

0

-25

2 3

-2 3

5 -1

= !B-15)

- 2(-4 + 0) + 3B0+ 0) = - 13 + 8 + 60 = 55

7.4 PERMUTATIONS

A permutation a of the set {1,2,..., n} is a one-to-one mapping of the set onto itself or, equiva-

lently, a rearrangement of the numbers 1, 2,..., n. Such a permutation a is denoted by

or<r=JiJ2---J\302\253

1 2 ... \320\2701

VJi J2 \302\246\342\200\242\302\246;\342\200\236\321\203

The set of all such permutations is denoted by Sn, and the number of such permutations is n\\. If a e Sn,then the inverse mapping a ' e Sn;and if a, x e Sn, then the composition mappingoneS,. Also, the

identity mapping e = a -> a~l eSn. (In fact, ? = 12... n.)

Example 7.5

(a) There are 2! = 2 \342\200\2421 = 2 permutations in 52: the permutations 12 and 21.

\\h) There are 3! = 3- 2 \342\200\2421 = 6 permutations in S3:the permutations 123, 132,213.231.312,321.

Consideran arbitrary permutation a in Sn; say a =jl j2 ...jn.We say a is an even or odd permu-

permutation according to whether there is an even or odd number of inversions in a. By an inversion in a,

we mean a pair of integers(i,k)such that i > \320\272but i precedes \320\272in a. We then define the sign or parity of

a, written sgn a, by

sgn{1

if rr is

\342\200\2241 if a is

evenodd

Example 7.6

{a) Consider the permutation a = 35142 in S5. For each element, count the number of elements smaller than it

and to the right of it. Thus

3 produces the inversions C, I) and C, 2);5 producesthe inversions E, I), E, 4), E, 2):4 produces the inversion D, 2);

(Note that 1 and 2 produce no inversions.) Sincethere are, in all, six inversions, a is even and sgn a = !.

(b) The identity permutation e = 123 ... \320\270is even because there are no inversions in e.

(r) In S2, the permutation 12 is even and 21 is odd.In S3, the permutations 123, 231 and 312areeven, and the permutations 132, 213and 32! are odd.

{d) Let \321\202be the permutation which interchanges two numbers i and j and leavesthe other numbers fixed, that is,

tJi\") =\321\203 tO) = i \321\202[\320\272)= \320\272 k*i,j

We call \321\202a transposition. If i < j, then

i= 12...(/- 11/04-!)...(;\"- 1\320\250+ !)...\302\253

250 DETERMINANTS [CHAP. 7

There are 2(j \342\200\224i \342\200\2241) + 1 inversions in \321\202as follows:

U, 0. (h x), (x, 0 where x = i + 1, ..., j - !Thus the transposition \321\202is odd.

7.5 DETERMINANTS OF ARBITRARY ORDER

Let A = (ay) be an n-squarematrix over a field \320\232:

a ll \302\2531 \302\246\342\200\242\302\253ln\\

\302\25321 \302\25322

\"nl \"n2

\302\2532n

\342\200\242\342\200\242\342\200\242\302\253\320\277\342\200\236/

Consider a product of n elements of A such that one and only one element comes from each row and

one and only one element comes from each column. Such a product can be written in the form

that is, where the factors come from successive rows and so the first subscripts are in the natural order1,2,..., n. Now since the factors come from different columns, the sequence of second subscripts form apermutation a =

_/, j2 \302\246..)\342\200\236in Sn. Conversely, each permutation in Sn determines a product of the aboveform. Thus the matrix A contains n! such products.

Definition: The determinant of A = (ay), denoted by det (A) or \\A\\, is the sum of all the above \320\270!

products where each such product is multiplied by sgn a. That is,

I \320\233I=

X (sgn a)alha2j2\302\246\302\246\302\246

anjna

\320\236\320\223 \\A | = X (Sgn ff)\302\253lrtl)\302\2532ffB)\302\246\342\200\242\342\200\242

\302\253mr(n)\342\200\242eS,

The determinant of the n-square matrix A is said to be of ordern.

The next example shows that the above definition agrees with the previous definition of determi-determinantsof orders one, two, and three.

Example 7.7

(a) Let A = (an) be a ! x 1 matrix. Since S, has only one permutation which is even,det (A)= a, [, the number

itself.

(b) Let A \342\200\224

(c) Let A =(ait)

be a 3 x 3 matrix. In S3, the permutations 123,23! and 312 are even, and the permutations 321,

213 and 132areodd.Hence

a 2 x 2 matrix. In S2, the permutation 12is even and the permutation 21 is odd. Hence

det (A) =

det (A) =

'31

-al3a22a3l

- al2a2la33 -

As n increases, the number of terms in the determinant becomes astronomical. Accordingly, we useindirect methods to evaluate determinants rather than its definition. In fact, we prove a number of

properties about determinants which will permit us to shorten the computation considerably.In partic-

particular, we show that a determinant of order n is equal to a linear combination of determinants of order

n \342\200\2241 as in case n = 3 above.

CHAP. 7] DETERMINANTS 251

7.6 PROPERTIESOF DETERMINANTS

We now list basic properties of the determinant.

Theorem 7.1: The determinants of a matrix A and its transpose AT are equal; that is, | A |= | AT\\.

By this theorem, proved in Problem 7.21, any theorem about the determinant of a matrix A whichconcerns the rows of A will have an analogous theorem concerning the columns of A.

The next theorem, proved in Problem 7.23, gives certain cases for which the determinant can beobtained immediately.

Theorem 7.2: Let A be a square matrix.

(a) If A has a row (column) ofzeros,then | A |= 0.

(b) If A has two identical rows (columns), then | A | = 0.

(c) If A is triangular, i.e., A has zeros above or below the diagonal, then | A | = productof diagonal elements. Thus in particular, | /1 = 1where / is the identity matrix.

The next theorem, proved in Problem 7.22, shows how the determinant of a matrix is affectedbythe elementary row and column operations.

Theorem7.3: Suppose\320\222is obtained from A by an elementary row (column)operation.

(a) If two rows (columns) of A were interchanged, then | \320\222|= \342\200\224

| A \\.

(b) If a row (column) of A was multiplied by a scalar k, then | \320\222|= \320\272| A |.

(c) If a multiple of a row (column) was added to another row (column), then | \320\222\\= | A \\.

We now state two of the most important and useful theorems on determinants.

Theorem7.4: Let A be any n-square matrix. Then the following are equivalent:

(i) A is invertible; i.e.,A has an inverse A~l.

(ii) AX = 0 has only the zero solution,

(iii) The determinant of A is not zero: | A \\ \320\2440.

Remark: Depending on the author and text, a nonsingular matrix A is denned tobe an invertible matrix A, or a matrix A for which | A \\ \321\2040, or a matrix A for which

AX = 0 has only the zero solution. The above theorem showsthat all such definitionsare equivalent.

Theorem7.5: The determinant is a multiplicative function. That is, the determinant of a product oftwo matrices A and \320\222is equal to the product of their determinants: | AB\\ =

\\A\\\\B\\.

We shall prove the above two theorems (Problems7.27and 7.28, respectively) using the theory ofelementarymatricesand the following lemma (proved in Problem 7.25).

Lemma7.6: Let ? be an elementary matrix. Then, for any matrix A, \\ EA |= | E | | A \\.

We comment that one can also prove the preceding two theorems directly without resorting to the

theory of elementarymatrices.

252 DETERMINANTS [CHAP 7

Recall that matrices A and \320\222are similar if there exists a nonsingular matrix P such that\320\222\342\200\224P~lAP. Using the multiplicative property of the determinant (Theorem 7.5), we are able to prove

(Problem 7.30):

Theorem 7.7: Suppose A and \320\222are similar matrices. Then | A |= | B\\.

7.7 MINORS AND COFACTORS

Consideran n-square matrix A =(\321\217\321\206).

Let M{j denote the (n \342\200\224l)-square submatrix of A obtained

by deleting its ith row and jth column.The determinant | Mtj\\ is called the minor of the element ai} of A,and we define the cofactor of ait, denoted by Atj,

to be the \"signed\" minor:

Note that the \"signs\" (\342\200\224l)'+J accompanying the minors form a chessboard pattern with +'s on themain diagonal:

We emphasize that MtJ denotes a matrix whereas Atj denotes a scalar.

Remark: The above sign (\342\200\224l)'+J of the cofactorALj

is frequently obtained usingthe checkerboard pattern. Specifically, beginning with

\"\302\246+

\"and alternating signs, i.e.,

+ ,\342\200\224.+.\342\200\224 count from the main diagonal to the appropriate square.

/2 3 4\\

Example 7.8. Consider the matrix /1=15 6 7 1.

\\8 9 1/

2

5

8

3

6

9

47

1

\342\200\2242

8

3

9\\M13\\= 5 6 7= = 18-24= -

The following theorem, proved in Problem 7.31, applies.

Theorem 7.8: The determinant of the matrix A =(atJ) is equal to the sum of the products obtained by

multiplying the elements of any row (column) by their respective cofactors:\320\273

\\A\\= \320\260\321\206\320\220\320\277+ ai2 Aa + \302\246\302\246\302\246+ ainAin = ? atj Ai}

\320\237

and \\A\\=aljAlj + a2jA2j+\302\246\302\246\302\246+ anj Anj = ? ai} Ai}

i-1

The above formulas for \\A\\ are called the Laplace expansions of the determinant of A by the ith

row and the jth column, respectively.Together with the elementary row (column) operations, they offer

a method of simplifying the computation of | A |, as describedbelow.

Evaluation of Determinants

The following algorithm reduces the evaluation of a determinant of order n to the evaluation of adeterminant of order n - I.

CHAP. 7] DETERMINANTS 253

Algorithm 7.7 (Reduction of the order of a determinant)

Here A =(alV) is a nonzero n-squarematrix with n > 1.

Step 1. Choosean element atj= 1 or, if lacking, aXi \321\2040.

Step 2. Using ay as a pivot, apply elementary row [column] operations to put 0s in all the otherpositions in the column [row] containing au.

Step 3. Expand the determinant by the column [row] containing atj.

The following remarks are in order.

Remark 1: Algorithm 7.7 is usually used for determinants of order four or more.With determinants of order less than four, one uses the specific formulas for thedeterminant.

Remark 2: Gaussian elimination or, equivalently, repeated use of Algorithm 7.7

together with row interchanges, can be used to transform a matrix A into an uppertriangular matrix whose determinant is the product of its diagonal entries. However,one must keep track of the number of row interchanges since each row interchangechanges the sign of the determinant. (See Problem7.11.)

Example 7.9. Compute the determinant of A =

5

2

5

1

4

3

-7-2

2

!

-3

-1

1

-2

9

4

by Algorithm 7.7.

Use a23 = ! as a pivot to put 0s in the other positions of the third column, that is, apply the row operations\342\200\224

2R2 + K, -\302\273Rj, 3R2 + R3 -\302\273R3, and

change by these operations; that is,\320\2574->\320\2334.By Theorem 7.3(c), the value of the determinant does not

\\A\\=

5

2

-5

1

4

3

-7

-2-3-1

-29

4

-2

3

2

5

-232

Now if we expand by the third column, we may neglect all terms which contain 0. Thus

1-2053 1 -2

0

0

! -2! 2

3 1

= _D _ 18 + 5 - 30 - 3 + 4) = -(-38) = 38

7.8 CLASSICAL ADJOINT

Consider an n-square matrix A = (a0) over a field K. The classicaladjoint (traditionally, just

\"adjoint\") of A, denoted adj A, is the transpose of the matrix of cofactors of A:

adj A =\\

'11 ^21

>\302\27312A12

'In

We say \"classical adjoint\" instead of simply \"adjoint\" because the term adjoint is currently used for an

entirely different concept.

254 DETERMINANTS [CHAP. 7

B 3 -4\\

Example 7.10. Let A = | 0 \342\200\2244 2 |. The cofactors of the nine elements of A are

-1

-4-1

-13

-4

2

5

5

-4

2

= -18

-1!

= -10

\320\23312=

/422 =

^32 =

-

+

-

0

i

2

5

-45

-42

/4.,=

= 14

= -4 /433= 4-

0

!

2

1

2

0

-4-1

3

-1

3

-4

= 4

\320\241

= -8

The transpose of the above matrix of cofactors yields the classical adjoint of A:

(-18

-11 -10\\2 14-4 1

4 5 -i

The following theorem, proved in Problem 7.33, applies.

Theorem 7.9: For any square matrix A,

A \342\200\242(adj A) = (adj A) \342\200\242A =

\\ A \\ I

where / is the identity matrix. Thus, if | A \\ \320\2440,

A'1 =\342\200\224-(adj/4)

Theorem 7.9 gives us another method of obtaining the inverse of a nonsingular matrix.

Example 7.11. Consider the matrix A of Example 7.\320\256for which \\A\\= \342\200\22446.We have

B

3 -4\\/-l8 -I! -10\\ /-46 0 0\\

0-42j

2 !4 -4l= 0-46 0l=-^! -I 5/\\ 4 5 -8/ \\ 0 0 -46/

Furthermore, by Theorem 7.9,

I

0

0

0

!

0

001

f-18/(-46) -11\320\224-46) -10/(-46)\\ / \320\231 is A

2\320\224-46) 14\320\224-46) -4\320\224-46) = -A -^ A4\320\224-46) 5\320\224-46) -8\320\224-46)/ \\-\320\224 -^ A/

7.9 APPLICATIONS TO LINEAR EQUATIONS, CRAMER'S RULE

Consider a system of n linear equations in n unknowns:

a21x, +a22x2

\320\260\320\277,\320\245|+\320\260\320\2772\321\2052+\302\246\302\246

+\320\260\320\277\320\277\321\205\320\273=\321\200\320\277

The above system can be written in the form AX = \320\222where A = (ay) is the (square) matrix of coeffi-

coefficients and \320\222=(\320\224.)is the column vector of constants. Let A{ be the matrix obtained from A by replacing

CHAP. 7] DETERMINANTS 255

the ith column of A by the column vector B. Let D = det (A) and let N(= det (\320\220\320\246for i = 1, 2,..., n.

The fundamental relationship between determinants and the solution of the above system follows.

Theorem 7.10: The above system has a unique solution if and only if D \320\2440. In this case the uniquesolution is given by

xt = NJD, x2 = N2/D, \320\232= NJD

The above theorem, proved in Problem 7.34, is known as \"Cramer's rule\" for solving systems oflinear equations. We emphasize that the theorem only refers to a system with the same number of

equations as unknowns, and that it only gives the solution when D \321\2040. In fact, if D = 0 the theoremdoes not tell whether or not the system has a solution.However, in the case of a homogeneoussystemwe have the following useful result (to be proved in Problem 7.65).

Theorem 7.11: The homogeneoussystemAx = 0 has a nonzero solution if and only if D \342\200\224| A | = 0.

!2x

+ \321\203- r = 3

x + \321\203+ r = 1.x - ly - 3r = 4

First compute the determinant D of the matrix of coefficients:

2 1 -11 1 11 -2 -3

=-6+I+2+1+4+3=5

Since D \320\2440, the system has a unique solution. To compute Nx, Ny,and \320\233/\320\263,replace the coefficients of x, y, and z in

the matrix of coefficients by the constant terms:

1 -1

-2 -3

3 -11 14 -3

1 3

1 1

-2 4

= -9 + 4 + 2 + 4 + 6 + 3= 10

=-6+3-4+I-8+9=-5

=8+1-6-3+4-4=0

Thus the unique solution is x = NJD = 2, \321\203= NJD

= -1, z =Nr/D

= 0.

7.10 SUBMATRICES, GENERAL MINORS, PRINCIPALMINORS

Let A =(\320\260\320\270)

be an n-square matrix. Each ordered set \302\273\342\200\236i2, \302\246\302\246\302\246,\302\253,of r row indices, and each orderedsetjl,j2, \302\246\302\246\302\246,jr of r column indices, definesthe following submatrix of A of order r:

la, , a, ,

W.ii air.h\302\246\302\246\302\246

The determinant | M\\'. ill'.'.'.'. tr\\IS called a minor of A of order r and

+J'\\A{]\302\261::::i:\\

256 DETERMINANTS [CHAP. 7

is the corresponding signedminor. (Note that a minor of order n \342\200\2241 is a minor in the sense of Section7.7 and the corresponding signed minor of order n \342\200\2241 is a cofactor.) Furthermore, if i'k and fk denote,

respectively, the remaining row and column indices, then

14J: ::::?:; I

is the complementary minor.

Example 7.13. SupposeA =(ey) is a 5-square matrix. The row subscripts 3 and 5 and the column subscripts 1

and 4 define the submatrix

\320\220\321\212\\=

and the minor and signed minor are, respectively,

,\342\200\236,.4, \302\25331\302\25334

\302\25351\302\25354=\302\25331\302\25354

-\302\25334\302\25351and (-D3

The remaining row subscripts are 1, 2, and 4 and the remaining column subscripts are 2, 3, and 5; hencethe

complementary minor of | /4\320\267-*

| is

Principal Minors

A minor is principal if the row and column indices are the same or, in other words, if the diagonal

elements of the minor come from the diagonal of the matrix.

Example 7.14. Consider the following minors of a 5-squarematrix A =(\320\260\321\206):

M, =

\302\25322\302\25324\302\25325

\302\25352\302\25354\302\25355

M,= \302\25321\302\25323\302\25325

\302\25351\302\25353\302\25355

\302\25322\302\25325

Here M, and M3 are principal minors since all their diagonal elementsbelong to the diagonal of A. On the other

hand, M2 is not principal since e23 belongs to the diagonal of M2 but not to that of A.

The following remarks are in order.

Remark 1: The sign of a principal minor is always +1 since the sum of the row

and identical column subscripts is even.

Remark 2: A minor is principal if and only if its complementary minor is alsoprincipal.

Remark 3: A real symmetric matrix A is positive definite if and only if all the

principal minors of A are positive.

7.11 BLOCK MATRICES AND DETERMINANTS

The following is the main result of this section.

Theorem 7.12: Suppose M is an upper (lower) triangular block matrix with the diagonal blocks

/4,, A2, ..., An. Then

det (M) = det {AJ det (A2)\342\200\242- - det (An)

The proof appears in Problem 7.35.

CHAP. 7] DETERMINANTS 257

\\

2

-1

0

0

0

5000

43

2

3

5

7

2

1

-1

2

8\\

1

5

4

6/

Example 7.15. Find | M | where M =

Note that M is an upper triangular block matrix. Evaluate the detenninant of each diagonal block:

2 3

-1 5

= 10+3= 13

2 1 5

3-1 45 2 6

= - 12+ 20 + 30 + 25 - 16- 18 = 29

Then |M| =A3)B9)

= 377.

Remark: Suppose M = I I where \320\220,\320\222,\320\241,D are square matrices. Then it is

not generally true that |\320\234|=

|\320\233||\320\236|-|\320\222||\320\241|.(See Problem 7.77.)

7.12 DETERMINANTS AND VOLUME

Determinants are related to the notions of area and volume as follows. Let u,, u2,..., \320\270\342\200\236be vectors

in R\". Let S be the (solid) parallelepipeddetermined by the vectors; that is,

S ={ajM, + o2 u2 \320\275 + an uK : 0 < at < 1 for i = 1,..., n}

(When n = 2, S is a parallelogram.)Let V{S) denote the volume of S (or area of S when n = 2). Then

V(S)= absolute value of det (A)

where A is the matrix with rows \320\270\342\200\236u2,..., un. In general, V(S)= 0 if and only if the vectors u,,..., un

do not form a coordinate system for R\", i.e., if and only if the vectors are linearly dependent.

7.13 MULTILINEARITYAND DETERMINANTS

Let \320\230be a vector space over a field K. Let si = V, that is, .e/ consists of all the n-tuples

A \342\200\224(A|, A2, \302\246\342\200\242\342\200\242,An)

where the Ak are vectors in V. The following definitionsapply:

Definition: A function D: \320\273/-\302\273\320\232is said to be multilinear if it is linear in each of the components; that

is:

(i) If/1, = B+C, then

D(A) = D(...,B+C,...) = ?>(..., B,...) + D(...,C,...)

(ii) If At = kB where \320\272\320\265\320\232,then

D(A) = D{...,kB,...) = /cD(..., B,...)

We also say n-linear for multilinear if there are n components.

258 DETERMINANTS [CHAP.7

Definition: A function D: s4 -* \320\232is said to be alternating if D(A) = 0 whenever A has two identical

elements; that is,

D(A \342\200\236A2, \302\246..,\320\220\342\200\236)= 0 whenever At = As, 1\320\244j

Now let M denote the set of all n-square matrices A over a field K. We may view A as an n-tuple

consisting of its row vectors Au A2,..., \320\220\342\200\236;that is, we may view A in the form A =(At, A2,.. -, \320\220\342\200\236).

The following basic result (proved in Problem 7.36) applies (where / denotes the identity matrix):

Theorem 7.13: There existsa unique function D: M -\302\273\320\232such that

(i) D is multilinear, (ii) D is alternating, (iii) ?>(/)= 1

This function D is none other than the determinant function; that is, for any matrix

A e M, D(A) =\\A\\.

Solved Problems

COMPUTATION OF DETERMINANTS OF ORDERS TWO AND THREE

7.1. Evaluate the determinant of each of the following matrix:

B 3> D 5/ (-1 -2)6 5

2 3= FX3)-EX2) =18-10 = 8,

3 -2

4 5= 15+ 8 = 23,

4 \342\200\2245

-I -25= -13

7.2. Find the determinant of IJ.

r-5 7-1 f + 3

= (t -5Xt + 3) + 7 = t2 - 2f - 15 + 7 = t2 - It -

7.3. Find those values of /c for which\320\272\320\272

4 2k= 0.

Set = 2k2 - 4k = 0, or 2fc(fc- 2) = 0. Hence \320\272= 0; and fc = 2. That is, if \320\272= 0 or fc = 2, the

determinant is zero.

CHAP.7] DETERMINANTS 259

7.4. Evaluate the determinants of the following matrices:

bx

(a) (b) a2 b2 c2

\\a3 b3 c3j

Use the diagrams in Fig. 7-1.

(a)

(b)

1 -2

2

1

4 -15 -2

= AK4K-2) + (-2X- W) + PX5X2)- AX4X3) - (SX- IX\302\273)

- (-2X-2X2)= -8+ 2+30-12 + 5-8 = 9

\320\260,\320\254,\321\201

a2 b2 C

B 3 4\\

7.5. Compute the determinant of\\

5 6 7 1.

9 l/

First simplify the entries by subtracting twice the first row from the second row, that is, by applying

-2R, + R2->R2:

5 68 9

0 -1 = 0 - 24 i- 36 - 0 + 18- 3 = 27

7.6. Find the determinant of A where:

(t + 3 -1 I

(b) A = | 5 t - 3 1

6 -6 t + 4/

(a) First multiply the first row by 6 and the second row by 4. Then

= 241/41=3 -6 -2

3 2-41 -4

:6 + 24 + 24 + 4 - 48 + 18 = 28

Hence | A |=

\302\247\302\247= J. (Observe that the original multiplications eliminated the fractions, so the arith-

arithmetic is simpler.)

(b) Add the second column to the first column, and then add the third column to the second column to

produce Os: that is, apply C2 + Cl-> C, and C3 + C2 -\302\273C2:

t + 2 0 1t+2 t-2 1

0 t-2 t+4Now factor t + 2 from the first column and t \342\200\2242 from the second column to get

\\A |= (t + 2\320\245\320\223

- 2)

0

1

260 DETERMINANTS [CHAP. 7

Finally subtract the first column from the third column to obtain

0 0

0

0

r + 4

= (t + 2Xr-

2X\302\253+ 4)

COMPUTATION OF DETERMINANTS OF ARBITRARY ORDERS

\320\263 5-3 -A

7.7. Compute the determinant of A =\\

R3

Use a31 = 1 as a pivot and apply the row operations -2R3 + Rt -* R,, 2R3 + R2 -\302\273R2, and

\\A\\ =

2

-21

-1

5

-3

3

-6

-32

-24

-2

-5

2

3

0010

-13

3

-3

= 10+ 3 - 36 + 36 - 2 - 15 = -4

1-2-2

2

-6

-1

2

5

-13

_3

1

-2

2

-6

7.8. Evaluate the determinant of A \342\200\224

Use a2l = I as a pivot, and apply 2C, + C3

12 4 3

10 0 0l>l|==

3 -1 7 -24-362

2-1-3

7 -2

8 2

= -B8 + 24 - 24 + 63 + 32 + 8) = - 131

7.9. Find the determinant of \320\241=

/ 6

213

2110

1122

1 -3

0-2-2

3

4

5\\

1

3

-1

2/First reduce | \320\241| to a determinant of order four, and then to a determinant of order three. Use c22= 1

asa pivot and apply \342\200\2242R2 + R, -\302\273R,,

\342\200\224/?2 + R3 -\302\273Rj, and R2 + Rs -\302\273Rs:

2

2

1

3

1

1

5

1

0

1

0

00

43

2

\342\200\2241

1

1

2

-2

5

5

7

4

-203

2

= 2

2 -II I\342\200\2241 1

3 2

I -2

4

0

3

2

3

2

-13

1

0

5

_ j

1

1

2

-2

4

032

50

-57

= 21 + 20 + 50 + 15+ 10- 140 = -24

CHAP. 7] DETERMINANTS 261

7.10. Find the determinant of each of the following matrices:

/51

4

7

5

9

7

\302\253\320\276\320\273=

(c) C =

(a) Since/1has a row of zeros, det (A) = 0.(b) Since the second and fourth columns of \320\222are equal, det (B) = 0.

(c) Since \320\241is triangular, det (C) is equal to the product of the diagonal entries. Hencedet (C)= -120.

7.11. Describe the Gaussian elimination algorithm for calculating the determinant of an n-square

matrix A =(\320\260\320\263]).

The algorithm uses Gaussian elimination to transform A into an upper triangular matrix (whose deter-determinant is the product of its diagonal entries). Since the algorithm involves exchanging rows, which changesthe sign of the determinant, one must keep track of such changes using some variable, say SIGN. Thealgorithm will also use \"pivoting\"; that is, the element with the greatest absolute value will be used as thepivot. The algorithm follows.

Step 1. SetSIGN= 0.[This initializes the variable SIGN.]

Step 2. Find the entry a,, in the first column with greatest absolute value.

(a) If a,,= 0, then set det (A) = 0 and exit.

(b) If i \321\2041, then interchange the first and ith rows and set SIGN = SIGN+ 1.

Step 3. Use a,, asa pivot and elementary row operations of the form kRq + Rp -\302\273Rp to put 0s below e,,.

Step4. Repeat Steps 2 and 3 with the submatrix obtained by omitting the first row and the first column.

Step 5. Continue the above process until A is an upper triangular matrix.

Step 6. Set det (A) = (- l)SIGNa, ,a22\302\246\302\246\302\246

am, and exit.

Note that the elementary row operation of the form kRp-\302\273

Rp (which multiplies a row by a scalar), which is

permitted in the Gaussian algorithm for a system of linear equations, is barred here, as it changes the valueof the determinant.

7.12. Use the Gaussian elimination algorithm in Problem 7.11 to find the determinant of

3 8 \320\261\\

A=\\ -1 -3 1

5 10 15/First row reduce the matrix to an upper triangular form keeping track of the number of row inter-

interchanges:

10

2

0

262 DETERMINANTS [CHAP.7

A is now in triangular form and SIGN = 2 since there were two interchanges of rows Hence

A = (-I)sion E)B)(^) = 85.

7.13. Evaluate | fl| =

0.921 0.185 0.476 0.6140.782 0.157 0.527 0.1380.872 0.484 0.637 0.799

0.312 0.555 0.841 0.448

Multiply the row containing the pivot \320\260\320\270by l/ay so that the pivot is equal to 1:

B\\= 0.921

= 0.921

1

0.782

0.872

0.312

0

0.3090.492

= 0.921(-0.384)

0.201 0.5170.157 0.527

0.484 0.637

0.555 0.841

0.123 -0.3840.196 0.2170.680 0.240

0.667

0.138

0.799

0.448

= 0.921

= 0.921(-0.384)

0 0 1

0.309 0.265 0.217

0.492 0.757 0.240

1 0.201 0.571

0 0 0.1230 0.309 0.196

0 0.492 0.680

0 -0.320 1

0.309 0.196 0.217

0.492 0.680 0.240

= 0.921(-0.384)0.309 0.265

0.492 0.757

0.667

-0.3840.2170.240

= 0.921(-0.384K0.104)= -0.037

COFACTORS, CLASSICAL

7.14. Considerthe matrix A

We have

\"\302\273\302\246'-\"'\342\200\242\342\200\242

2 1

5 -4

4 0

3 -2

ADJOINTS

/2= 5

4

\\3

-3

7

65

1

-4

0

-2

4-2

2

=

-3

7

6

5

.?\302\246

4

3

4^-2-3

11

0

-2

. Find the cofactor of the 7 in A, that is, A 23-

= -@-9- 32-0- 12-8)= -(-61) = 61

The exponent 2 + 3 comes from the fact that 7 appears in the second row, third column.

1

7.15. Consider the matrix \320\222=| 2 3 4 I. Find: (a) | fi|, (b) adj B, and (c) B~' using adj B.

V5 8 9

(a) | B\\ = 27 + 20 + 16- 15-32- 18=-2

CHAP. 7] DETERMINANTS 263

(ft) Take the transpose of the matrix of cofactors:

adj \320\222= -

(\321\201)Since | \320\222| \320\2440,

_

3\320\276

1

8

1

3

4

91

9

14

_

25

1

5

1

2

4

9

1

9

1

4

_

2

51

5

1

2

3

8

1

8

1

3

\\

/

51

1

2

4

2

\320\274\320\263-3 -

l/

/-52

\\ 1

-1

4

-3

121

I -2

7.16. Consider an arbitrary 2-square matrix A = I I. (a) Find adj A. (b) Show that

adj (adj A) = A.

ft a) \\-c a)

DETERMINANTS AND LINEAR EQUATIONS,CRAMER'S RULE

7.17. Solve the following systems by using determinants.

f ax - 2by = \321\201

[\320\252\320\260\321\205\342\200\224

5by= 2c

where ab \320\2440

First find D =

Next find

a -2ft

N =

3a -5b

\321\201-2b

= \342\200\2245ab + bob = ab. Since D = ab \320\2440, the system has a unique solution.

2c -5b= - 5bc + 4ftc = -ftc and Nv =

3a 2c= 2ac\342\200\224\320\252\320\260\321\201= \342\200\224\320\260\321\201

Then x = NJD = \342\200\224bc/ab= -c/a and \321\203

=N,,/D

=\342\200\224ac/ab= \342\200\224

c/fc.

!3y

+ 2x = z + 13x + 2z = 8 \342\200\224

5>>

3z - 1 = x -2y

First arrange the equations in standard form:

2x + \320\252\321\203- z = \\

3x + 5y + 2z = 8x \342\200\224

2y\342\200\2243z = \342\200\2241

Compute the determinant D of the matrix of coefficients:

2 3-1D =

-2 -3

= -30+6 + 6 + 5 + 8 +27 = 22

264 DETERMINANTS [CHAP. 7

Since D \321\2040, the system has a unique solution. To compute Nx, Nyand N2, replace the corresponding

coefficients of the unknown in the matrix of coefficients by the constant terms:

Hence

N =

1 3 -1

8 5 2

-1 -2 -3= -15-6+ 16-5+4 + 72 = 66

1

8

-1

3

5

-2

-12

-3

I

= -48 + 2 + 3 + 8 + 4 + 9= -22

= -10 + 24-6-5 + 32 + 9 = 44

-22

D 22= -1

N. 44

7.19. Solve the following system by using Cramer's rule:

2x, + x2 + 5x3 + x4 = 5

X j + X2 -JX \320\2674X4^ 1

3x, + 6x2 \342\200\2242x3 + x4 = 8

2x, + 2x2 + 2x3\342\200\224

3x4 = 2

Compute

D =

5 1

-3 -4

6 -22 -3

= -120

5

-1

8

2

2

1

3

2

5

-1

8

2

5

-3-2

2

1

-4

1

-3

= -24

2 15 5

1 1-3-1

3 6-282 2 2 2

1

1

6

2

2 1

1 1

3 6

2 2

= -96

Then x, = = 2, jc2= N2/D = |, x3

= N3/D = 0, x4 =/V4/D

= f.

5 I

-3 -4

-2 1

2 -3

5 1

-I -4

8 1

2 -3

= -240

= 0

7.20. Usedeterminants to find those values of \320\272for which the following system has a unique solution:

kx + \321\203+ z = 1

x + ky + z = 1

x + >> + fcz = 1

The system has a unique solution when D -\321\204\320\236,where D is the determinant of the matrix of coefficients.Compute

D = =(\320\272-lJ(k + 2)

CHAP. 7] DETERMINANTS 265

Thus the system has a unique solution when (k \342\200\224\\J{k + 2) \321\2040, that is, when \320\272\321\2041 and \320\272\320\244\342\200\2242. (Gaussian

elimination indicates that the system has no solution when \320\272\342\200\224\342\200\2242, and the system has an infinite number

of solutions when \320\272= 1.)

PROOF OF THEOREMS

7.21. Prove Theorem 7.1. |AT| = |A|.

If A =(a,;), then AT = F0),with btj

= ajt. Hence

M TI= L (sgn a)b, \342\200\236B)b2aB)

\302\246\302\246\302\246*>\342\200\236\342\200\236,\342\200\236,

= ? (s\302\253n<*\302\273\302\253<i\302\273.i \302\253.B\302\273.2\" \302\246\342\200\242

\302\253\342\200\236<\302\273>.,<J\342\202\254S, ecSn

Let \321\202= a'1. By Problem 7.43,sgn \321\202= sgn a, and \320\260\342\200\236A(-,e^li2\302\246\302\246-

\320\276\342\200\236,\342\200\236,,\342\200\236= o,r,,,o2r,2,

\302\246\302\246\302\246\320\260\342\200\236,\342\200\236(.Hence

'rZ t)altA)a2tB(

\342\200\242\342\200\242\342\200\242amM

\302\253tS,

However, as <r runs through all the elementsof Sn, \321\202= \321\201\321\202\021also runs through all the elements ofSn. Thus\\Ar\\

= \\A\\.

7.22. Prove Theorem 13(a). If two rows (columns)of A are interchanged, then | \320\222| = - | A \\.

We prove the theorem for the case that two columns are interchanged. Let \321\202be the transpositionwhich interchanges the two numbers corresponding to the two columns of A that are interchanged. If

A = (a,;) and \320\222=(bit), then bu = anlj). Hence,for any permutation a.

Thus

\\B\\= l

Since the transposition \321\202is an odd permutation, sgn \321\202\320\260= sgn r \342\200\242sgn a = \342\200\224

sgn <r. Thus sgn a = \342\200\224sgn \321\202<\321\202.

and so

IB\\ = - X (sgn T<T)alr<rA(o2t<rB(\342\200\242\302\246-

ameMiteS,

But as <r runs through all the elements of Sr, \321\202\320\276also runs through all the elements of SB; hence

|B| \342\200\224mi.

7.23. Prove Theorem 7.2.

(a) Each term in | A \\ contains a factor from every row and so from the row of zeros. Thus each term of| A | is zero and so | A \\

= 0.

{b) Suppose 1 + 1 / 0 in K. If we interchange the two identical rows of A, we still obtain the matrix A.

Hence, by Problem 7.22, | A | = -\\A | and so \\A | = 0.

Now suppose 1 + 1 = 0 in K. Then sgn a = 1 for every a e Sn. SinceA has two identical rows, we

can arrange the terms of A into pairs of equal terms. Sinceeachpair is 0. the determinant of A is zero.

(c) Suppose A =(\320\2600)

is lower triangular, that is, the entries above the diagonal are all zero: a{j= 0 when-

wheneveri < j. Consider a term t of the determinant of A:

t = (sgn a)aUiaVl\302\246\302\246\302\246

flnl> where \320\276= i,i2\302\246\302\246

in

Suppose i, / 1.Then 1 < it and so aUl = 0; hencet = 0. That is, each term for which i, \320\2441 is zero.

Now suppose ij = 1 but i2 \321\2042. Then 2 < i2 and so ali2= 0; hence ( = 0. Thus each term for

which i, / 1or i2 \320\2442 is zero.

Similarly we obtain that each term for which i, \321\204\\ or i2 / 2 or \342\200\242\302\246\342\200\242or in / n is zero. Accordingly,

| A | = a, ,a22\342\200\242\342\200\242\342\200\242\320\260\342\200\236\342\200\236

= product of diagonal elements.

266 DETERMINANTS [CHAP. 7

7.24. Prove Theorem 7.3.

(a) Proved in Problem 7.22.

(b) If the yth row of A is multiplied by k, then every term in | A | is multiplied by \320\272and so | \320\222| = \320\272| \320\220\\.

That is,

I \320\222|= ? (sgn a)aUialh \302\246\302\246\302\246

(kay,)\302\246\302\246\302\246

a^= fc ? (sgn a)altla2n

\342\200\242\342\200\242\302\246am.

= \320\272\\\320\220\\\320\262 \320\260

(c) Suppose \321\201times the fcth row is added to the jth row of A. Using the symbol \320\233to denote the \320\246\320\252

position in a determinant term, we have

I\320\222I= ? (sgn a)aUialh \302\246\302\246\302\246

(cakit + ajt)\302\246\302\246\302\246

anirt\320\262

= \321\201? (sgn o)aUlalh\302\246\302\246\302\246

a^t\302\246\302\246\302\246

aniii + ? (sgn <\321\202)\321\217\320\2371\320\2622B\342\200\242\302\246\302\246

a^\342\200\242\302\246\302\246

s^\320\260 \320\262

The first sum is the determinant of a matrix whose fcth and jth rows are identical; hence, by

Theorem 7.2(fc), the sum is zero. The second sum is the determinant of A. Thus

7.25. Prove Lemma 7.6. If E is an elementary matrix, then | EA | = | E | | A \\.

Consider the following elementary row operations: (i) multiply a row by a constant \320\272\320\2440; (ii) inter-

interchange two rows; (iii) add a multiple of one row to another. Let ?,, E2, and E3 be the correspondingelementary matrices.That is, Ex, E2, and ?3 are obtained by applying the above operations, respectively, to

the identity matrix /. By Problem 7.24,

|E,|=fc|/| = fc |?2|= -|/|=-1 |E3| = |/| = 1

Recall (Theorem 4.12) that ?, A is identical to the matrix obtained by applying the corresponding operationto A. Thus, by Theorem 7.3,

=\\El\\\\A\\ \\E2A\\= -\\A\\ =

\\E2\\\\A\\ \\E3A \\= \\A \\

= 1 \\A | = | ?3 \\\\A\\

and the lemma is proved.

7.26. Suppose\320\222is row equivalent to a square matrix A. Show that | \320\222| = 0 if and only if | A | = 0.

By Theorem 7.3, the effect of an elementary row operation is to change the sign of the determinant orto multiply the determinant by a nonzero scalar.Therefore\\B\\

= 0 if and only if | A | = 0.

7.27. Prove Theorem 7.4.

The proof is by the Gaussian algorithm. If A is invertible, it is row equivalent to /. But | /1 ^0; hence,

by Problem 7.26, \\A\\ \320\2440.If A is not invertible, it is row equivalent to a matrix with a zero row; hence,det A = 0. Thus (i) and (iii) are equivalent.

If AX = 0 has only the solution X = 0, then A is row equivalent to / and A is invertible. Conversely, if

A is invertible with inverse A~x, then

X = IX = {A~ lA)X = A~ \\AX)= /\320\223'\320\236= 0

is the only solution of AX = 0. Thus (i) and (ii) are equivalent.

7.28. Prove Theorem 7.5. | AB \\= | A \\ | \320\222|.

If A is singular, then AB is also singular and so |\320\233\320\222|=0=|.4||\320\262|. On the other hand, if A is

nonsingular, then A =?\342\200\236

\302\246\342\200\242\342\200\242?2 ?,, a product of elementary matrices. Then, using Lemma 7.6 and induc-

induction,we obtain

CHAP. 7] DETERMINANTS 267

7.29. SupposeP is invertible. Show that | />~' | =I ^ \320\223'\342\200\242

\320\240*\320\240= 1. Hence 1 = |/1 = |P lP\\= | P~' 11 P\\, and so | P \"'

| = |P\\~'.

7.30. Prove Theorem7.7.Suppose A and \320\222are similar matrices, then | A |= | \320\222|.

Since A and \320\222are similar, there exists an invertible matrix P such that \320\222= \320\240'^\320\220\320\240.Then by the

preceding problem, \\\320\222\\=

\\\320\2401\320\220\320\240\\=

\\\320\240~\320\245\\\\\320\220\\\\\320\240\\=

\\\320\220\\\\\320\240~1\\\\\320\240\\=

\\\320\220\\.

We remark that although the matrices P~l and A may not commute, their determinants \\P~l\\ and

| A | do commute since they are scalars in the field K.

7.31. If A =(ai}), prove that \\A\\=anAtl + ai2 Ai2 + \302\246\302\246\302\246

+ ain Ain, whereAi}

is the cofactor of atJ.

Each term in | A | contains one and only one entry of the ith row (\302\253,,,ai2, .... ain) of A. Hence we canwrite | A | in the form

\\A |=

\302\253,-,/\302\253?,+ ai2 \320\233&+ \342\200\242\342\200\242\342\200\242+ ain AZ,

(Note A* is a sum of terms involving no entry of the ith row of A.) Thus the theorem is proved if we can

show that

where Mtj is the matrix obtained by deleting the row and column containing the entry atJ. (Historically, the

expression A* was defined as the cofactor of\320\260\320\270,

and so the theorem reduces to showing that the twodefinitions of the cofactor are equivalent.)

First we consider the case that i = n,j = n. Then the sum of terms in | A | containing am is

=ann ?(sgn <r)a1<rA)a2<rB)

\342\200\242\302\246\342\200\242\320\260\342\200\236_,,

a

where we sum over all permutations \320\276\320\265Sn for which a(n) = n. However, this is equivalent (prove!) tosumming over all permutations of {1,..., n \342\200\224

1}. Thus \320\220*\342\200\236= \\Mm\\ = (\342\200\224l)\"+n| \320\234\342\200\236\342\200\236|.

Now we consider any i and j. We interchange the ith row with each succeeding row until it is last, andwe interchange the /th column with each succeeding column until it is last. Note that the determinant | Mtj\\is not afTected since the relative positions of the other rows and columns are not afTected by these inter-interchanges. However, the \"sign\" of | A \\ and of A* is changed n \342\200\224i and then n \342\200\224jtimes. Accordingly,

732. Let A =(ah) and let \320\222be the matrix obtained from A by replacing the ith row of A by the row

vector (ft,-,,..., biB). Show that

\\B\\ = buAtl + bi2 Ai2 + \302\246\302\246\342\200\242+ binAin

Furthermore, show that, for \321\203\321\204i,

anAn + an Ai2 +\342\200\242\342\200\242\342\200\242+ ajn Ain = 0 and al}Au + a2j A2i + \342\200\242\342\200\242\342\200\242

+ anjAni= 0

Let \320\222=(btj). By Theorem 7.8,

\\B\\= bilBn + bi2Bi2 + - \302\246\302\246

+binBin

Since Bij does not depend upon the ith row of B, Btj =Au forj= 1,.... n. Hence

| \320\222|= btl An + bi2 Ai2 + \302\246\302\246\302\246+ bin Ain

Now let A' be obtained from A by replacing the ith row of A by the jth row of A. Since A' has twoidentical rows, \\A'\\ =0. Thus, by the above result,

\\A'\\=

\320\260}1\320\220\320\277+ aJ2 Al2 + \302\246\302\246\302\246+ aJn \320\220\321\213= \320\236

Using | A'| = | A |, we also obtain that au A\342\200\236+ a2jA2i + \302\246\342\200\242\302\246+ anj Ani = 0.

268 DETERMINANTS [CHAP.7

7.33. Prove Theorem 7.9. A \342\200\242(adj A) = (adj A) \342\200\242A = | A \\ I.

Let A = (ah) and let A \342\200\242(adj A) = {bh).The ith row of A is

(fl,i, o,2,...,oj (/)

Since adj /1 is the transpose of the matrix of cofactors, the jth column of adj /1 is the transpose of thecofactors of the/th row of A: That is,

j2y...,Ajy B)

Now btj, the (/-entry in A \302\246(adj A), is obtained by multiplying (/) and B):

airAjn

By Theorem 7.8 and Problem 7.32,

= \\\\A\\\\fi=j

iJ\\0 \\\320\230\320\244]

Accordingly, A \342\200\242(adj A) is the diagonal matrix with each diagonal element \\A\\. In other words,

A \342\200\242(adj A) =

\\A | /. Similarly, (adj A) - A = \\A | /.

7.34. Prove Theorem7.10.AX = \320\222has unique solution iff D \320\2440 in which case \320\273,= /V,/D.

By previous results, AX = \320\222has a unique solution if and only if A is invertible, and A is invertible if

and only if D =\\A \\ \321\2040.

Now suppose D \320\2440. By Theorem 7.9, /1\"' =A/DXadj A). Multiplying AX = \320\222by A ''

we obtain

X = A'

lAX = A/DXadj A)B A)

Note that the ith row of A/DXadj A) is (\\jD\\Au, A2i,..., AJ. U \320\222= {bltb2,... bn)T then, by (/),

Xi= A/DXMn + b2 A2i + - - \302\246+ *>\342\200\236\320\233\320\2331.)

However, as in Problem 7.32,bxAu\302\246+ b2A2i + \302\246- - + bnAM =/Vf, the determinant of the matrix obtained

by replacing the ith column of A by the column vector B.Thus x(= (l/D)Nf, as required.

735. Prove Theorem 7.12.

We need only prove the theorem for n = 2, that is, when M is a square block matrix of the form

(A C\\M =

( I. The proof of the general theorem follows easily by induction.\\0 B/

Suppose A =(au) is r-square, \320\222=

{\320\254\320\270)is s-square, and M =

(mtj)is n-square, where n = r + s. By

definition,

det M = ? (sgn o)mlaiumlam\302\246\302\246-

\321\202\320\262\342\200\236!\320\277)

If i > \320\263and \321\203< r, then my = 0. Thus we need only consider those permutations a such that

o{r+ 1, r + 2, ...,r + s} ={r+ 1,r + 2, ...,r + s} and <t{1, 2,..., r} = {1,2, ..., r}Let a^fc) = \321\204)for fc < r, and let o2{k)= a{r+ k)

- r (or \320\272< s. Then

(sgn <r)m1(JA(m2(,|2)\302\246- \302\246

mn(J(n)= (sgn tf,)**,,,.,!,^,,,,^

\342\200\242\342\200\242\302\246ora2(r)(sgn o2)blailltblaillt

\302\246\302\246\342\200\242fcSO2(s)

which implies det M = (det A)(det B).

7.36. Prove Theorem 7.13.

Let Dbe the determinant function: D(A) =\\A\\. We must show that D satisfies (i), (ii), and (iii), and that

D is the only function satisfying (i), (ii), and (iii).

CHAP. 7] DETERMINANTS 269

By Theorem 7.2, D satisfies (ii) and (iii); hencewe needshow that it is multilinear. Suppose the ith row

of A =(afJ)

has the form (bfl + cn, bi2 + ci2,..., bln + cjn). Then

= L (sgn cr)\302\253Mi>\342\200\242\342\200\242\342\200\242

\302\253,-, \342\200\236\342\200\236._,y(bwo + Ci,m)\302\246\302\246\302\246

\320\260\342\200\236\342\200\236(\320\277(s.

= L (sgn <7\320\232,<\342\200\236\342\200\242\342\200\242\342\200\242

\320\254,\342\200\236\321\202\302\246\302\246\302\246

\320\260\342\200\236\320\260{\320\277)+ ? (sgn \321\201\321\202)\321\2171<7A,- \302\246\302\246

\321\201,\342\200\236A.(- \342\200\242\302\246

\302\253\342\200\236\342\200\236,\342\200\236,s. s\302\273

= D(A,,..., B,,.... 4J + D^,,.... \320\241,,..., -4.)

Also, by Theorem 7.3(b),

DM,,.... \320\234\342\200\236..., An) =\320\251\320\233\342\200\236.... \320\220.\342\200\236..., ^\342\200\236)

Thus D is multilinear, i.e.,D satisfies (i).

We next must prove the uniqueness of D. Suppose D satisfies (i),(ii), and (iii). If {e,,..., \320\265\342\200\236}is the usual

basis of K\", then by (iii), D(e,, e2, ..., \320\265\342\200\236)= ?>{/) = 1. Using (ii) we also have that

D(efl, eh, ...,ein) =sgn a where a = i, i2

\302\246\302\246in (/)

Now suppose A =(oy). Observe that the fcth row Ak of A is

Ak= (ou, aw, ..., akll)

= aue, + \320\260\320\275\320\2652+ \302\246\342\200\242\302\246+ \320\260\320\272\342\200\236\320\265\320\277

Thus

D(/l) = Z)(flne1 + \342\200\242\342\200\242\302\246+ alnen,a21e1 + \302\246\302\246\302\246+ \320\2602\321\217\320\265\342\200\236,..., \320\260\320\2771\320\265,+--- + \320\260\320\2341\320\26211)

Using the multilinearity of D, we can write D(A) as a sum of terms of the form

D(A) = X tKaUleh, aIheh,.... ani,ej= I (aUla2il

\302\246\302\246\302\246an(JD(elV eh,..., ein) B)

where the sum is summedover all sequences i^\302\246\302\246\302\246

in where ik e {1,.... n}. If two of the indices are equal,say ij

= ik but \321\203\321\204\320\272,then by (ii),

DC*,,.\302\253,,.-. \302\253J= 0

Accordingly, the sum in B) need only be summedoverall permutations \320\276= i^i2\302\246\302\246\302\246

in. Using (/), we finally

have that

\320\251\320\220)= ? iaUla2il

\302\246- \302\246anJD(efl, ei2, ..., ej

a=

X (sgn a)aufi2ii\302\246\302\246\342\200\242

aniii where <r = i^\302\246\302\246\302\246

ina

Hence D is the determinant function and so the theorem is proved.

PERMUTATIONS

737. Determine the parity of \321\201= 542163.

Method 1. Weneedto obtain the number of pairs (i,j) for which i >j and i precedes j in a. There are:

3 numbers E, 4 and 2) greater than and preceding I,

2 numbers (S and 4) greater than and preceding 2,

3 numbers E, 4 and 6) greater than and preceding 3,I number E) greater than and preceding 4,0 numbers greater than and preceding 5,0 numbers greater than and preceding 6.

Since3 + 2 + 3+1+0 + 0 = 9 is odd, a is an odd permutation and so sgn a \342\200\224\342\200\2241.

270 DETERMINANTS [CHAP. 7

Method 2. Bring 1 to the first position as follows:

4 2 (\320\223)63 to 154263

Bring 2 to the second position:

1*5 4f2l6 3 to 12 5 4 6 3

Bring 3 to the third position:

1 2r5 4 6 C) to 12 3 5 4 6

Bring 4 to the fourth position:

1 2 3 *5D) 6 to 12 3 4 5 6

Note that 5 and 6 are in the \"correct\" positions. Count the number of numbers \"jumped\":3 + 2 + 3+1=9. Since 9 is odd, a is an odd permutation. {Remark: This method is essentially the same as

the preceding method.)

Method 3. An interchange of two numbers in a permutation is equivalent to multiplying the permu-permutation by a transposition. Hence transform a to the identity permutation using transpositions; such as,

5.4 2,1 6 3

\320\2234 2 5 6 3

X1 2 4 5 6,3

1 2 \320\267\025\320\261\"*4

1 2 3 4*6\320\2475

12 3 4 5 6

Sincean odd number, 5, of transpositions was used (and since odd x odd = odd), a is an odd permutation.

7.38. Let a = 24513 and \321\202= 41352 be permutations in S5. Find:

(a) the composition permutation \321\202\302\260a, (b) a \302\260\321\202,(\321\201)<\321\202'.

Recall that <r = 24513 and \321\202= 41352 are short ways of writing

'1 2 3 4 5\\ _/l2 3 4 5\\

4 5 13/\320\260\320\237

T~\\4 ' 3 5 Vwhich means

<r(l) = 2 <\321\202B)= 4 <tC) = 5 <tD)= 1 and <rE) = 3

and

tA) = 4 \321\202B)= 1 rC) = 3 rD) = 5 and rE) = 2

(a) Theeffect of a and then \321\202on 1, 2, ..., 5 is as follows:

12 3 4 5

\320\2761 i 1 1 I2 4 5 13

\321\202I 1 1 I 115 2 4 3

Thus t\"<t = ( ],or \321\202\320\276ff = 15243.\\1 5 2 4 3/

CHAP. 7] DETERMINANTS 271

(b) The effect of \321\202and then a on 1,2,..., 5 is as follows:

12 3 4 5t 1 I 1 1 1

4 13 5 2\320\276I I 1 1 1

12 5 3 4

Thus \320\260\320\276\321\202= 12534.

(c) By definition, a ~l(j) = \320\272if and only if tr(fc) =

_/; hence

2 4

2

5 1 3\\ /1 2 3 4 5\\

3 4 5) \"D 1 5 2 3)

7.39. Consider any permutation a = j1j2 \342\200\242\342\200\242\342\200\242jn. Show that, for each inversion(i, k) in \302\253r,there is a pair

(i*, /c*) such that

i* < /c* and <r(i*) > <7(fc*) G)

and vice versa.Thus a is even or odd according as to whether there is an even or odd number ofpairs satisfying (/).

Choose i* and k* so that a{i*) = i and o{k*)= k. Then i > \320\272if and only if ct(i*) > a{k*), and i precedeskin \320\276if and only if i* < k*.

7.40. Consider the polynomial g = g(xt,..., \321\205\342\200\236)=

J~[ (x, - jc;).Write out explicitly the polynomialg = g(xu x2)x3,x4). '\302\246*\302\246\302\273

The symbol [~[ is used for a product of terms in the same way that the symbol ? is used for a sum of

terms. That is, Jl (xt\342\200\224

x,) means the product of all terms (x,\342\200\224

Xj) for which 1 < j. Hence\342\200\242<j

g = g(xly..., x4)= (X[

7.41. Let a be an arbitrary permutation.- For the above polynomial g in Problem 7.40, define=

\320\237(*<\302\253o-

xHi))- Show that

_ | g if a is\"j-0 if a is

even

Accordingly, <r(g)= (sgn a)g.

Since a is one-to-oneand onto,

o(g) =\320\237Kw

-x\302\253u))

=\320\237

Thus \321\201\321\202(^)= g or ct(<7)

= \342\200\224g according as to whether there is an even or an odd number of terms of the form

(\321\205(\342\200\224

Xj) where i >j. Note that for each pair (i,j) for which

i<j and u(i) > a(j) (I)there is a term (\321\205\342\200\236@

-x,,^) in a[g) for which <7(i) > a{j). Since <r is even if and only if there is an even

number of pairs satisfying (/), we have aig) =\320\262if and only if a is even; hence a(g)= \342\200\224

g if and only if a is

odd.

7.42. Let a, t e Sn. Show that sgn (\321\202\302\260<r) = (sgn xXsgn \321\201\321\202).Thus the product of two even or two odd

permutations is even, and the product of an odd and an even permutation is odd.

272 DETERMINANTS [CHAP. 7

Using Problem 7.41,we have

sgn (t \302\253a)g =(\321\202

\320\276a)(g) =\321\202{\320\260{\320\264))

= t((sgn o)g) = (sgn t)(sgn c)g

Accordingly, sgn (x \302\253a) = (sgn tKsgn a).

7.43. Considerthe permutation a =jlj2 ...jn. Show that sgn a' = sgn <r and, for scalars atj, show

that

aJ,laJ22\302\246\342\200\242\342\200\242

\302\253J,n=

\302\2531*,\302\2532*2\342\200\242\342\200\242'

\302\253\302\253\320\272,

where <rx = k,fe2 \302\246- - \320\232\302\246

We have <r~' \302\260<r = e, the identity permutation. Since e is even, <r\"' and <r are both even or both odd.

Hence sgn o~l =sgn a.

Since a =jij2 \302\246\302\246L is a permutation, 0^,1^2\302\246\302\246\302\246

\320\260^\342\200\236=

au,a2*2'\"

\302\260\302\253*\342\200\236-Tnen fci> ^\320\263.\342\200\242--.\320\232have the

property that

= 2, ...,Let \321\202=

\320\272\321\205\320\272\320\263... kn. Then for 1 = 1, .... \320\270,

Thus <\321\202\320\276\321\202= e, the identity permutation; hence \321\202= \320\276~'.

MISCELLANEOUS PROBLEMS

7.44. Without expanding the determinant, show that

1 a b + \321\201

1 \320\252\321\201+ \320\260

1 <\342\200\242a + b

= 0.

Add the second column to the third column, and remove the common factor from the third column;

this yields

1

1

1

a

b

\321\201

b-\\

\321\201-

a -

- \321\201

\\- a

vb

=

1 a a + b + \321\201

= 1 b a+b+\321\201

1 \321\201\320\260+ b + \321\201

= (a + b + c)

111

ab\321\201

1

1

1

= (a + b + cXO) = 0

(We use the fact that a determinant with two identical columns is zero.)

7.45. Showthat the difference product g(xt, ..., xB) of Problem 7.40 can be represented by means of

the Vandermonde determinant of Xj, x2,..., \321\205\342\200\236_j, x denned by:

l

xi*i

i

*22

X2

...

... X

... X

X

1

n~2

\302\253:

i

i

ii

l

X

x2

X\321\217\021

This is a polynomial in x of degree n \342\200\2241, of which the roots are xl, x2,--.,xn_1; moreover, the

leading coefficient (the cofactorof x\"\"

') is equal to Vn _ 2(\321\205\342\200\236_ ,). Thus, from algebra,

\320\232-Ax)= (x - *iX* -

x2)\302\246\302\246\302\246

(x-

\321\205,,_1)\320\230\320\270_\320\260(\321\20511_1)

so that, by recursion,

=[(\321\205-\321\205,)

-(\321\205-\321\205\342\200\236,)][(\321\205\342\200\236, -\321\205,)

-(\321\205\342\200\236_,

-

CHAP. 7] DETERMINANTS 273

It follows that

Thus g(x vj = (- 1\320\223\"'\302\253\320\230\320\273\342\200\236,<*\342\200\236).

=\320\237 \320\237

7.46. Find the volume V(S) of the parallelepiped S in R3 determined by the vectors u, = A, 2, 4),u2

= B, 1, -3), and u3= E, 7,9).

Evaluate the determinant 1 -3

5 7 9

other words, \320\270,,u2 and u3 lie in a plane.

= 9 - 30 + 56 - 20 + 21- 36 = 0. Thus V(S) = 0 or, in

7.47. Find the volume VE) of the parallelepiped S in R4 determined by the vectors

\302\253,= B, -1, 4, -3), u2

= ( - 1, 1,0, 2), \321\211= C, 2, 3, - 1),\320\2744

= A,-2, 2, 3)

Evaluate the following determinant, using u21 as a pivot and applying C2 + C, -+ C, and

-2C2 + Q-\302\273Q:

2

-1

3

1

-1

1

2

-2

4

0

3

2

-32

_ j

3

-

1

0

5

-1

-112

-2

4

0

3

2

-1

0

-57

-

1

5

_ j

4

3

2

-1-5

7

= 21+ 20 - 10- 3 + 10 - 140= - 102

Hence V(S) = 102.

7.48. Find det (M)where M =

/3

2

4 0 0 0\\

5 0 0 0

0 9 204

0 5

\\0 0

0 0

6 73 4/

Partition M into a (lower) triangular block matrix as follows:

/32

00

\\o

4

\320\241

9

5

0

0

0204

00063

0

0

7

4/

Evaluate the determinant of each diagonal block:

3 42 5

= 15-8 = 7 = 2 = 24 - 21 = 3

Hence|M| = 7-2-3=42.

7.49. Find the minor, signed minor, and complementaryminor of A\\; 3 where

/10 4 -1\\

3 2-250 1-37

\\6 -4 -5 -1/

A =

274 DETERMINANTS [CHAP. 7

The row subscripts are 1 and 3 and the column subscripts are 2 and 3; hence the minor is

0 4M?:fl =

1 -3=0-4=-4

and the signed minor is

The missing row subscripts are 2 and 4 and the missing column subscripts are 1 and 4. Hence the comple-complementary minor is

\302\25321\302\25324

6 -1= -3-30= -33

7.50. Let A = (a,,) be a 3-square matrix. Describe the sum Sk of the principal minors of orders

(\320\260)\320\272= 1, (b) \320\272= 2, (\321\201)\320\272= 3.

(a) The principal minors of order one are the diagonal elements. Thus Sl = an + a12 + a33 = tr A, the

trace of A.

(b) The principal minors of order two are the cofactors of the diagonal elements. Thus S2 =\320\233,,+ A22

+ Ai3 where Ai{ is the cofactor of au.

(c) Thereis only one principal minor of order three, the determinant of A. Thus S3= det (\320\233).

7.51. Find the number Nk and the sum Sk of all principal minors of order (\320\260)\320\272= 1, (b) \320\272= 2, (\321\201)\320\272= 3,and (d) \320\272= 4 of the matrix

A =

1

-4

1

3

2

03 -2

Each (nonempty) subset of the diagonal (or, equivalently, each nonempty subset of {1,2, 3, 4})deter-

determines a principal minor of A, and Nk= | . ) = \321\202\321\202.\320\263\321\202of them are of order k.

(\321\217)

(b)

= 4 and

S,=|l| + |2| + |3|+|4| =1+2 + 3 + 4= 10

\321\201_\320\2712

\342\200\2241

\342\200\2244

3

2+

1

1

0

3

1

3

-14

2

0

5

3

2

-214

3

1

-2

4

(c)

= 14+ 3 + 7 + 6 + 10+ 14 = 54

: 4 and

S,= -41

320

05

3

+

1

-4

3

3

2

-2

-114

+

1

1

3

0

3

1

-1-2

4+

2

0

-2

5

3

1

1

-2

4

= 57 + 65 + 22 + 54 = 198

(d) N4 = 1 and S4 = det (A)= 378.

CHAP. 7] DETERMINANTS 275

7.52. Let V be the vector space of 2 by 2 matrices M = ( I over R. Determine whether or not\\c d)

D : V -\302\273R is 2-linear (with respect to the rows) where: (a) D{M) = a + d, (b) D{M) = ad.

(a) No.For example, suppose A = (I, 1)and \320\222= C, 3). Then

\320\251\320\220,\320\222)= \320\271\320\223

'j= 4 and DBA, B) = d(

J= 5 * 2D(,4, B)

(/>) Yes. Let /4 = (a,, a2).B =(bi> \320\253and c = (c^ c2);then

and \320\224\320\222,C)

Hence for any scalars s, t e R.

tB,C) = Dl ' ' 2 2 = (sa. + tfc,)\\ ci C2 )

= s(o,c2)+ t{b,c2)= sD(/l, C) + tD(B,C)

That is, D is linear with respect to the first row.Furthermore,

D(C, A) = D[Cl C2)= c,a2 and \320\251\320\241,\320\222)= \320\241\320\2231^2 )= Clb2

\\a, a2/ \\bl bj

Hence for any scalars s, t e R,

/1 + tB)

=s\\cva2) + t(Clb2) = sD{C, A) + tD{C, B)

=d(

C' \320\241\320\263

)= Cl(sa2 + tb2)

\\sat + tfc, sa2 + tb2)

That is, D is linear with respect to the second row.Both linearity conditions imply that D is 2-linear.

7.53. Let D be a 2-linear,alternating function. Show that D(A, B) = \342\200\224D(B, A). More generally, show

that if D is multilinear and alternating, then

that is, the sign is changed whenever two components are interchanged.

SinceDis alternating, D{A + B, A + B)= 0.Furthermore, since D is multilinear,

0 =D(A + B, A + B)=

D[A, A + B) + D(B, A + B)= D(A, A) + D(A, \320\222)+ \320\251\320\222,A) + D[B, B)

But \320\251\320\220,A) = 0 and D(B,B) = 0. Hence

0 = D(A, B) + D(B, A) or D(A, B) = -D(B, A)

Similarly,

0 = D(..., A + B,...,A + B,...)= D(...,A,...,A,...)+ D{...,A,...,B,...) + D{...,B,...,A,...)+D{...,B B,= D(..., A, ..., B, ....) + \320\251..., B, ..., A, ...)

and thus \320\251..., A,..., B,...) = -D(..., \320\222 A,...)-

276 DETERMINANTS [CHAP. 7

Supplementary Problems

COMPUTATIONOF DETERMINANTS

7.54. Compute the determinant of each matrix:

{a)

7.55. Evaluate the determinant of each matrix:

(c) (d)

(fc) -3 t + 5 -3 ,

7.56. For each matrix in Problem 7.54,determine those values of t for which the determinant is zero.

7.57. Evaluate the determinant of each matrix: (a)

7.58. Evaluate each determinant:

1

I

3

4

2

0-1-3

2-2

10

3

0

_2

2

2

31

2

10

-12

3

1

4

-1

2-2

3

1

1

2

3

5

2

2-1

113

_ j

1

0

2

-1

3-2

2-3

1

13

-1

4

-2

. (b)

1

2

0

0

0

34000

5

2

1

5

2

7

4

263

9

2

3

2

1

. W

1

5

0

0

0

24000

3

3

6

0

0

425

7

2

5

1

1

4

3

COFACTORS, CLASSICAL ADJOINTS, INVERSES

2 2\\

759. Let /1 = | 3 1 0 j. Find (a) adj \320\233,and (b) A~l

1

7.60. Find the classical adjoint of each matrix in Problem 7.57.

7.61. Determinethe general 2 by 2 matrix A for which A = adj A.

7.62. Suppose A is diagonal and \320\222is triangular; say,

A =

(ax 0 ... 0\\

0 a2 ... 0and \320\222=

\\0 0 ...

(a) Show that adj A is diagonal and adj \320\222is triangular.

0 fc,

0 ... bj

CHAP. 7] DETERMINANTS 277

(b) Show that \320\222is invertible iff all bt \320\2440; hence A is invertible iff all as \320\2440.

(c) Show that the inverses of A and \320\222(if either exists) are of the form

\321\217,\021\320\236

\320\236 \320\260:1

0\\

\320\236

\320\276 \320\276

B'0

0

dl2 ...fc2- ...

0 ...

dln\\

In

That is, the diagonal elements of A 'and \320\222

' are the reciprocals of the corresponding diagonalelements of A and B.

DETERMINANTS AND LINEAR EQUATIONS

7.63. Solve by determinants: (a) < (b) <

{Ax -2y=l {.

{2x

-5j/ + 2z = 7

x + 2y\342\200\2244z = 3,

3x \342\200\2244y

\342\200\2246z = 5

7.65. Prove Theorem7.11.

2x -\320\252\321\203

= - I

4x + ly = - 1'

PERMUTATIONS

7.66. Determine the parity of these permutations in S5: (a) a = 32154,(b) \321\202= 13524, (\321\201)\320\277= 42531.

7.67. For the permutations \320\260,\320\263and \320\273in Problem 7.66, find (\320\260)\320\263\302\260a, (fc) \320\273\302\260\320\260,(\321\201)a\"', (<J) iT '.

7.68. Let \321\202? Sn. Show that r \302\260a runs through Sn as \321\201\321\202runs through Sn; that is, Sn= (t\302\273(r:o6Sn].

7.69. Let a ? Sn have the property that a(ri) = n. Let a* e Sn_ y be defined by \321\201\321\202*(\321\205)= <\321\202(\321\205).(a) Show that

sgn a* =sgn \302\253\320\263.(ft) Show that as \320\276runs through Sn, where o-(n) = n, a* runs through Sn_,; that is,

Sn_, = {a*:ae Sn,v{n)= n}.

7.70. Consider a permutation a = j, j2 ... jn. Let {et} be the usual basis of K\", and let A be the matrix whose ith

row is eJt, i.e.,A =(eVl, eh,..., ej. Show that | A \\

= sgn a.

MISCELLANEOUS PROBLEMS

7.71. Find the volume V(S) of the parallelepiped S in R3 determined by the vectors:

=A,2, -3),u2=C. 4, -I),u3=B, -1, 5),(b)u,= A, 1, 3),u2=(I, -2, -4), u3

= D, 1,2)

7.72. Find the volume V(S) of the parallelepiped S in R* determined by the vectors:

u,=(l, -2,5, -1) \302\2532=B, 1, -2, 1) u3=C, 0, 1, -2) u4= A.-1, 4, -1)

7.73. Find the minor M\342\200\236the signed minor M2, and the complementary minor M3 of A J; J where:

2 3 2\\ /l 3 -1 5\\

(b) A =2 -30 -5.3 0

1 4

2 1

5 -2,

278 DETERMINANTS [CHAP.7

7.74. For \320\272= 1, 2, 3, find the sum Sk of all principal minors of order \320\272for:

A

3 2\\ /l 5 -4\\ /l -42-4 3 1, (/>) B= 2 6 I , (c) C= 2 I

5 -2 1/ \\3 -2 0/ \\4-7

7.75. For \320\272= 1, 2, 3, 4, find the sum Sk of all principal minors of order \320\272for

/l 2 3 -1\\1-2 0 5j0 1-2 2]

\\4 0 -1 -3/

7.76. LetA be an n-square matrix. Prove \\kA\\= \320\272\"| A \\.

7.77. Let \320\220,\320\222,\320\241and D be commuting n-square matrices. Considerthe 2n-square block matrix M = I I.

Prove that | M | = | ^411D |\342\200\224|B||C|. Show that the result may not be true if the matrices do not commute.

7.78. SupposeA is orthogonal, that is, ATA = I. Show that det (A)= +1.

Vc d)

7.79. Let V be the space of 2 x 2 matrices M = over R. Determinewhether or not D : V -> R is 2-linearVc dj

(with respect to the rows) where: (a) D(M) = ac -bd, {b) D(M) = a/> - cd, (c) D(M)= 0, and (rf) D(M) = 1.

7.80. Let V be the space of w-square matrices viewed as w-tuples of row vectors. SupposeD : V -\302\273\320\232is m-linear

and alternating. Show that if/4,, A2, ..., Am are linearly dependent, then D{At,..., Am)= 0.

7.81. Let \320\232be the space of w-square matrices (as above), and suppose D: V -> K. Show that the following

weaker statement is equivalent to D being alternating:

D(Alt A2,..., An)\342\200\2240 whenever As \342\200\224Ai+l for some i

7.82. Let V be the space of n-square matrices over K. Suppose\320\222e V is invertible and so det (\320\222)\320\2440. Define

D : V -> \320\232by D(.4) = det (,4B)/det(B)where /4 e K. Hence

D{A\342\200\236\320\2332,.., \320\233\342\200\236)= det {A ,B, /42B, ..., An B)/det (B)

where \321\2034;is the ith row of A and so AtB is the tth row of AB. Show that D is multilinear and alternating,

and that D(I) = I. (This method is used by some texts to prove | AB | = | A11\320\222|.)

7.83. Let /1 be an n-square matrix. The determinantal rank of A is the order of the largest square submatrix of A

(obtained by deleting rows and columns of A) whose determinant is not zero. Show that the determinantal

rank of A is equal to its rank, i.e., the maximum number of linearly independent rows (or columns).

Answers to Supplementary Problems

7.54. (a) 21, (b) -11, (c) 100, (rf) 0

7.55. {a) (f + 2X/ - M -4), (/>) [t + 2)\\t - 4), (c) (t + 2J(f - 4)

7.56. (a) 3,4,-2; (/>) 4,-2; (c) 4,-2

CHAP. 7] DETERMINANTS 279

7.57. (a) -131, (fc) -55

7.58. (a) -12, (/>) -42, (c) -468

1 0 -2

7.59. adj/l=| -3 -1 62 I -5

7.60. (\302\253)

-16

-30

-8

-13

-\342\200\242

-CD

7.63. (a) x = ^,^ = if; (b) *=-\302\246&>=

\320\233

7.64. (a) x = 5,y = \\,z = 1. (b) Since D = 0, the system cannot be solved by determinants.

7.66. sgn a = I, sgn \321\202= \342\200\2241, sgn \321\217= \342\200\2241

7.67. (a) T\302\260CT= 53142, (fc) n \320\276a = 52413, (c) G\"' = 32154, W) t~' = 14253

7.71. (o) 30, (fc) 0

7.72. 17

7.73. (a) -3,-3,-1; (/>) -23,-23,-10

7.74. (a) -2,-17,73; (/>) 7,10,105; (c) 13,54,0

7.75. S, = -6, S2= 13, S3 = 62,S4= -219

7.79. (a) Yes, (/>) No, (c) Yes, (d) No

Chapter 8

Eigenvalues and Eigenvectors, Diagonalization

8.1 INTRODUCTION

Consider an \320\270-square matrix A over a field K. Recall (Section4.13)that A induces a function

/: K\" -* K\" defined by

f(X) = AX

where X is any point (column vector) in K\". (We then view A as the matrix which represents thefunction/relative to the usual basis E for K\".)

Suppose a new basis is chosen for K\", say

(Geometrically, S determines a new coordinate system for K\".) Let P be the matrix whose columns arethe vectors u,, u2, --,\302\253\342\200\236\302\246Then (Section 5.11) F is the change-of-basis matrix from the usual basis E toS. Also, by Theorem 5.27,

gives the coordinates of X in the new basis S. Furthermore, the matrix

?= PlAP

represents the function / in the new system S; that is, /(A\") = BX'.The following two questions are addressedin this chapter:

A) Given a matrix A, can we find a nonsingular matrix P (which represents a new coordinate systemS),so that

\320\222= P lAP

is a diagonal matrix? If the answer is yes, then we say that A is diagonalizable.

B) Given a real matrix A, can we find an orthogonal matrix P (which represents a new orthonormal

system S) so that

\320\222= P lAP

is a diagonal matrix? If the answer is yes, then we say that A is orthogonally diagonalizable.

Recall that matrices A and \320\222are said to be similar (orthogonally similar) if there exists a non-singular (orthogonal) matrix P such that \320\222= P '' AP. What is in question, then, is whether or not agiven matrix A is similar (orthogonally similar) to a diagonal matrix.

The answers are closely related to the roots of certain polynomials associated with A. The particu-particularunderlying field \320\232also plays an important part in this theory since the existence of roots of the

polynomials depends on K. In this connection, see the Appendix (page 446).

8.2 POLYNOMIALS IN MATRICES

Consider a polynomial/(t) over a field K; say

280

CHAP. 8] EIGENVALUES AND EIGENVECTORS, DIAGONALIZATION 281

Recall that if A is a square matrix over K, then we define

where / is the identity matrix. In particular, we say that A is a root or zero of the polynomial /(?) if

\320\224\320\220)= \320\236.

Example 8.1. Let A = ( \\ and let/(f) = 2t2 - 3t + 7, g{t) = t2 - 5t - 2.Then

9-\320\241 \320\276)

Thus A is a zero of g(t).

The following theorem, proved in Problem 8.26, applies.

Theorem 8.1: Let/and g be polynomials over K, and let A be an \320\270-square matrix over K. Then

@

(ii)

(iii) (fc/X/1)= kf(A) for all fc e \320\232

(iv) /(

By (iv), any two polynomials in the matrix A commute.

8.3 CHARACTERISTIC POLYNOMIAL,CAYLEY-HAMILTON THEOREM

Consider an \302\253-square matrix A over a field \320\232:

/a,, aI2 ... a,

A = fa2i\302\2602

The matrix tln \342\200\224A, where /\342\200\236is the \302\253-square identity matrix and t is an indeterminate, is called the

characteristic matrix of A:

tl.-A =

It-\320\236\321\206 \342\200\224\320\236\\\320\263\302\246\302\246\302\246~a

-a21 t-a22 ... -\320\2602\320\272

Its determinant

\320\224\320\2330= det [tln -

A)

which is a polynomial in t, is called the characteristic polynomial of A. We also call

AA(t)= det (tln

-A)

= 0

the characteristic equation of /4.

282 EIGENVALUES AND EIGENVECTORS, DIAGONALIZATION [CHAP. 8

Now each term in the determinant contains one and only one entry from each row and from each

column; hence the above characteristic polynomial is of the form

AA(t)= (t- anX\302\273

- a22)\342\200\242\302\246\302\246

(t- aj

+ terms with at most n \342\200\2242 factors of the form t \342\200\224au

Accordingly,

\320\273\320\273\320\234= f\" \342\200\224

(fli i + fl22 H + \320\260\320\277^1\"~'+ terms of lower degree

Recall that the trace of A is the sum of its diagonal elements.Thus the characteristic polynomial

AA(t) = det (?/\342\200\236\342\200\224

A) of A is a monic polynomial of degree n, and the coefficient of t\"~l is the negative ofthe trace of A. (A polynomial is monic if its leading coefficient is 1.)

Furthermore, if we set t \342\200\2240 in AA(t), we obtain

But A^@) is the constant term of the polynomial A^r). Thus the constant term of the characteristic

polynomial of the matrix A is (\342\200\2241)\" | A \\ where n is the order of AWe now state one of the most important theorems in linear algebra (proved in Problem 8.27):

Cayley-HamiltonTheorem 8.2: Every matrix is a zero of its characteristic polynomial.

Example 8.2. Let \320\222= I I. Its characteristic polynomial is

'~3' ~_22=(t-l)(t-2)-6 = t2-3t-4

As expected from the Cayley-Hamilton Theorem, \320\222is a zero of A{t):

: -\320\232 \320\276)

Now suppose A and \320\222are similar matrices, say \320\222= P~ lAP where P is invertible.We show that A

and \320\222have the same characteristic polynomial. Using tl = P~ltlP,\\tl -B\\ = \\tl - P lAP\\

= \\P \320\2471\320\240-PlAP\\

= \\P~\\tl -A)P\\

= \\P l\\\\tl~A\\\\P\\

Since determinants are scalarsand commute, and since |F\"'||F| = 1,we finally obtain

|t/-B| = |t/-yl|

Thus we have proved

Theorem 8.3: Similar matrices have the same characteristic polynomial.

Characteristic Polynomials of DegreeTwo and Three

Let A be a matrix of order two or three. Then there is an easy formula for its characteristic poly-polynomial A(t). Specifically:

A) Suppose A = . Then

=\320\223-(\320\26011+a22)t

\302\26022

= t2 - (tr A)t + det {A)

(Here tr A denotes the trace of A, that is, the sum of the diagonal elementsof .4.)

CHAP. 8] EIGENVALUES AND EIGENVECTORS, DIAGONALIZATION 283

'\302\25311 \302\25312 \302\25313^

B) Suppose A = ( \320\2602, \320\26022 \320\260,,). Then

\\\302\25331 \302\25332 \302\25333

= t3-(a,,(I:

22

32\302\25323

\302\25333

\302\25311

\302\25331

\302\25313

\302\25333 \302\25322

\0211 \0212 \0213

\302\25321 \302\25322 \302\25323

\302\25331 \302\25332 \302\26033

= t3 - (tr A)t2 +{A11+A22 + A33)t - det (A)

(Here A,,, A22, A33 denote, respectively,the cofactors of the diagonal elements a, x, a22 ,a33.)

Consider again a 3-squarematrix A =(a^). As noted above,

S, = tr A S2 =\320\220\342\200\236+ A22 + A33 S3=dct(A)

are the coefficients of its characteristic polynomial with alternating signs. On the other hand, each Sk isthe sum of all the principal minors of A of order k. The next theorem,whoseproof liesbeyond the scope

of this Outline, tells us that this result is true in general.

Theorem 8.4: Let A be an \320\270-square matrix. Then its characteristic polynomial is

A(t)= \320\223- S,f-' + S21\"\022 + (- l)\"Sn

where Sk is the sum of the principal minors of order k.

CharacteristicPolynomial and Block Triangular Matrices

Suppose M is a blocktriangular matrix, say M = ('

I

Then the characteristic matrix of M,

where Al and A2 are square matrices.

\320\236 t/ -\320\2332,

is also a block triangular matrix with diagonal blocks tl \342\200\224Ax and r/ \342\200\224

/42 \342\200\242Thus, by Theorem 7.12,

tl -\320\220\320\263-\320\222

\\tl-M\\ =0 tl - A,

That is, the characteristic polynomial of M is the product of the characteristic polynomials of the

diagonal blocks /4, and A2.By induction, we obtain the following useful result.

Theorem 8.5: Suppose M is a blocktriangular matrix with diagonal blocks /4,, A2, ..., Ar. Then the

characteristic polynomial of M is the product of the characteristic polynomials of the

diagonal blocks A(, that is,

Example 8.3. Consider the matrix

/9

8

0

V>

-1

3

0

0

\321\201

3

-1

7

-4

6

284 EIGENVALUES AND EIGENVECTORS, DIAGONALIZATION [CHAP. 8

Then M is a block triangular matrix with diagonal blocks \302\2464=1 I and \320\222= I 1. Here

tr A = 9 + 3 = 12 det (A) = 27 + 8 = 35 and so &.A[t)= t1 - 12f + 35= (I

-5Xf

- 7)

tr \320\222= 3 + 8 = 11 det (\320\222)= 24 + 6 = 30 and so \320\224\320\262(\320\263)= i1 - 1 If + 30 = (t - 5){t - 6)

Accordingly, the characteristic polynomial of M is the product

\320\224\320\274(\320\263)=

\320\224\320\2330\320\224\320\262\302\253= (i - 5J(r -

6Xt- 7)

8.4 EIGENVALUES AND EIGENVECTORS

Let A be an \320\270-square matrix over a field K. A scalar A e \320\232is called an eigenvalw of /4 if there exists

a nonzero (column) vector v e K\" for which

/4l> = AV

Every vector satisfying this relation is then called an eigenvector of A belonging to the eigenvalue A.

Note that each scalar multiple kv is such an eigenvector since

A(kv)= k(Av) =

\320\251\320\270)= Mkv)

The set ?\320\264of all eigenvectors belonging to \320\273is a subspace of K\" (Problem 8.16), called the eigenspace ofA. (If dim ?\320\273

= 1, then ?A is called an eigenline and \320\273is called a scaling factor.)The terms characteristic value and characteristic vector (or proper value and proper vector) are some-

sometimes used instead of eigenvalue and eigenvector.

Example 8.4. Let \320\233=|

1 and let d, = B, 3O and p2= A, - IO.Then

and

Thus \321\201|and i'2 are eigenvectors of A belonging, respectively, to the eigenvalues Al= 4 and \320\2572

= \342\200\2241 of A.

The following theorem, proved in Problem 8.28, is the main tool for computing eigenvalues and

eigenvectors (Section 8.5).

Theorem 8.6: Let A be an \320\270-square matrix over a field K. Then the following are equivalent:

(i) A scalar A e \320\232is an eigenvalue of A.

(ii) The matrix M = XI \342\200\224A is singular,

(iii) The scalar A is a root of the characteristic polynomialA(t) of A.

The eigenspace ?\320\264of A is the solution space of the homogeneous system MX =(A/

\342\200\224A)X = 0.

Sometimes it is more convenient to solve the homogeneous system (A\342\200\224

U)X = 0; both systems, ofcourse,yield the same solution space.

Some matrices may have no eigenvalues and hence no eigenvectors.However, using the Fundamen-

FundamentalTheorem of Algebra (every polynomial over \320\241has a root) and Theorem 8.6,we obtain the followingresult.

CHAP. 8] EIGENVALUES AND EIGENVECTORS. DIAGONALIZATION 285

Theorem 8.7: Let A be an \302\253-square matrix over the complex field \320\241Then A has at least one eigen-eigenvalue.

Now suppose \320\257is an eigenvalue of a matrix A. The algebraic multiplicity of \320\257is denned to be the

multiplicity of A as a root of the characteristic polynomial of A. The geometric multiplicity of A is denned

to be the dimension of its eigenspace.

The following theorem, proved in Problem 10.27, applies.

Theorem 8.8: Let \320\257be an eigenvalue of a matrix A. Then the geometric multiplicity of \320\272does not

exceed its algebraic multiplicity.

Diagonalizable Matrices

A matrix A is said to be diagonalizable (under similarity) if there exists a nonsingular matrix P suchthat D = P~lAP is a diagonal matrix, i.e., if A is similar to a diagonal matrix D. The following theorem,proved in Problem 8.29, characterizes such matrices.

Theorem8.9: An \302\253-square matrix A is similar to a diagonal matrix D if and only if A has \302\253linearly

independent eigenvectors. In this case, the diagonal elements of D are the corresponding

eigenvalues and D = P lAP where P is the matrix whose columns are the eigenvectors.

Suppose a matrix A can be diagonalized as above, say P~xAP = D where D is diagonal. Then A

has the extremely useful diagonal factorization

Using this factorization, the algebra of A reduces to the algebra of the diagonal matrix D which can beeasilycalculated.Specifically, suppose D = diag (kuk2,.-., \320\272\342\200\236).Then

Am = (PDP Y = \320\240\320\223\320\223\320\240'1= P diag (kf,..., fc\P~'

and, more generally, for any polynomial/(t),

\320\224\320\220)=f(PDPl) = Pf(D)P1 = P diag (\320\257\320\234,-\302\246-,/(*\342\200\236))P1

Furthermore, if the diagonal entries of D are nonnegative, then the following matrix \320\222is a \"squareroot\" of A:

that is, B2 = A.

Example 8.5. Consider the matrix A = I 1. By Example 8.4, A has two linearly independent eigenvectors

( I and 1 I. Set P = I I, and so P~' = I \\ \\\\ Then A is similar to the diagonal matrix

\342\200\242-\342\200\224-a -ixi' z -o-c -\302\260)

As expected, the diagonal elements4 and \342\200\2241 of the diagonal matrix \320\222are the eigenvalues corresponding to the

given eigenvectors. In particular, A has the factorization

286 EIGENVALUES AND EIGENVECTORS, DIAGONALIZATION [CHAP. 8

Accordingly,

'V256A -PDF -

Furthermore, if/(t) = t3 - 7t2 + 9t - 2, then

\320\273\320\262-\321\202\320\270\321\200-(, _,)( 0

Remark: Throughout this chapter, we use the fact that the inverse of the matrix

Ja\320\254)

is the matrix P \302\246J <\"\"\\c d) \\-cl\\P\\ l\\P\\ a/\\P\\

That is, P\021 is obtained by interchanging the diagonal elements a and d of P, taking

the negatives of the nondiagonal elementsb and c, and dividing each element by the

determinant \\P\\.

The following two theorems, proved in Problems 8.30 and 8.31, respectively, will be subsequently

used.

Theorem 8.10: Let vu ..., vn be nonzero eigenvectors of a matrix A belonging to distinct eigenvalues/,,..., \320\233\342\200\236.Then vt,...,vn are linearly independent.

Theorem8.11: Supposethe characteristic polynomial A(f) of an n-square matrix A is a product of n

distinct factors, say, A(t) =(t

\342\200\224at){t

- a2) \302\246\302\246\302\246(t

\342\200\224an). Then A is similar to a diagonal

matrix whose diagonal elements are the \321\217,.

8.5 COMPUTING EIGENVALUES AND EIGENVECTORS, DIAGONALIZING MATRICES

This section computes the eigenvalues and eigenvectorsfor a given square matrix A and determines

whether or not a nonsingular matrix P exists such that P~lAP is diagonal. Specifically, the following

algorithm will be applied to the matrix A.

Diagonalization Algorithm 8.5:

The input is an n-square matrix A.

Step I. Find the characteristic polynomialA(t) of A.

Step 2. Find the roots of A(t) to obtain the eigenvalues of A.

Step 3. Repeat (a) and (b) for each eigenvalue \320\257of A:

(a) Form M = A \342\200\224XI by subtracting \320\257down the diagonal of A, or form M' = /.I \342\200\224A by

substituting t = \320\257in tl \342\200\224A.

(b) Find a basis for the solution space of the homogeneous system MX = 0. (Thesebasis

vectors are linearly independent eigenvectors of A belonging to A.)

Step 4. Consider the collectionS = {vuv2,...,vm} of all eigenvectors obtained in Step 3:

{a) If m \321\204n, then A is not diagonalizable.

CHAP.8] EIGENVALUES AND EIGENVECTORS, DIAGONALIZATION 287

(b) If m \342\200\224n, let P be the matrix whosecolumns are the eigenvectors vltv2,...,vn. Then

\320\232

where k, is the eigenvalue corresponding to the eigenvector vt.

-G -DExample 8.6. The Diagonalization Algorithm is applied to A

1. The characteristic polynomial A(t) of A is the determinant

t -4 - 1\320\224A)= | tl - A | =

\342\200\236= t2 - 3t - 10 = (t -

5X\302\253+ 2)- 3 t+ 1

Alternatively, tr /1 = 4 - 1= 3 and \\A\\= -4 \342\200\2246 = \342\200\22410; so A(t) = f2 - 3f - 10.

2. Set\320\224A)= (t -

5Xt + 2) = 0. The rootsA,= 5 and A2

= -2 are the eigenvalues of A.

3. (i) We find an eigenvector t), of A belonging to the eigenvalue A,= 5.

Subtract A, = 5 down the diagonal of A to obtain the matrix M = I I. The eigenvectors

belonging to A,= 5 form the solution of the homogeneous system MX = 0, that is,

1 2\\/x\\ /ON, , ., i i \342\200\236i \320\236\320\223^ \342\200\236, _ \320\236\320\223-}

The system has only one independent solution; for example,x = 2, \321\203= 1. Thus t>,

= B, 1) is an eigen-eigenvector which spans the eigenspace of A, =5.

(ii) We find an eigenvector v2 of A belonging to the eigenvalue A2 = \342\200\2242.

/6 2\\Subtract \342\200\2242 (or add 2) down the diagonal of A to obtain M = I I which yields the homoge-

homogeneoussystem

f 6x + 2y = 0or \320\227\321\205+ \321\203

= 03x+ y

= 0

The system has only one independent solution; for example, x = \342\200\2241, \321\203

= 3. Thus v2 =(\342\200\2241, 3) is an

eigenvectorwhich spans the eigenspace of A2= \342\200\2242.

4. Let P be the matrix whose columns are the above eigenvectors: P = I I. Then P~' = Ij 2 )

and

D = \320\240'\320\263\320\220\320\240is the diagonal matrix whose diagonal entries are the respective eigenvalues:

d-p-*ap=( * ^Y4 2Y2 ~lW5

Accordingly, /1 has the \"diagonal factorization\"

If/(t) = t4 - 4f3 - 3f2 + 5, then we can calculate/E) =55.\320\224-2)

= 41; thus

288 EIGENVALUES AND EIGENVECTORS, DIAGONALIZATION [CHAP. 8

Example 8.7. Consider the matrix \320\222= ( I. Here tr fi = 5 + 1 = 6 and |fi| = 5+4 = 9. Hence

V-4 \\)\320\250)= t2 - 6/ + 9 = (t

- 3J is the characteristic polynomial of B. Accordingly, A = 3 is the only eigenvalue of B.

Subtract a = 3 down the diagonal of \320\222to obtain the matrix M = I I which corresponds to the

homogeneoussystem

f 2x + \321\203= \320\236

< or 2x + \321\203= \320\236

l-4x-2y = 0 \320\243

The system has only one independent solution; for example, x = 1, \321\203= \342\200\2242.Thus v = A, \342\200\2242)is the only indepen-

independenteigenvector of the matrix B. Accordingly, \320\222is not diagonalizable since there does not exist a basis consisting

of eigenvectors of B.

-\321\201 \320\267

Example 8.8. Consider the matrix /1 =( t ^ ]. Here tr/i = 2-2 = O and \\A\\ = -4 + 5 = 1. Thus

\320\224(\320\263)= t2 + 1 is the characteristic polynomial of A. We consider two cases:

(a) A is a matrix over the real field R. Then \320\224A)has no (real) roots. Thus A has no eigenvalues and no eigen-

eigenvectors, and so A is not diagonizable.

(b) A is a matrix over the complex field C. Then \320\224(/)= (t \342\200\224i)(f + 0 has two roots, i and \342\200\224i. Thus A has two

distinct eigenvalues i and \342\200\224i, and hence \320\233has two independent eigenvectors. Accordingly, there exists a

nonsingular matrix P over the complex field \320\241for which

\302\246\302\246-\321\201-,

Therefore, A is diagonalizable (overC).

8.6 D1AGONALIZING REAL SYMMETRIC MATRICES

There are many real matrices A which are not diagonalizable. In fact, some such matrices may nothave any (real) eigenvalues. However, if A is a real symmetric matrix, then these problems do not exist.

Namely:

Theorem 8.12: Let A be a real symmetric matrix. Then each root \320\273of its characteristic polynomial isreal.

Theorem8.13: Let A be a real symmetric matrix. Suppose \320\270and v are nonzero eigenvectors of A

belonging to distinct eigenvalues A, and \320\2572.Then \320\270and v are orthogonal, i.e.,<\320\274,v)

= 0.

The above two theorems gives us the following fundamental result:

Theorem8.14: Let A be a real symmetric matrix. Then there existsan orthogonal matrix P such that

D = \320\240'\320\263\320\220\320\240is diagonal.

We can choose the columns of the above matrix P to be normalized orthogonal eigenvectors of A;then the diagonal entries of D are the corresponding eigenvalues.

Example 8.9. Let 4=1 1. We find an orthogonal matrix P such that P ' AP is diagonal. Here

tr/1 = 2 + 5 = 7 and | A \\= 10 - 4 = 6. Hence A(f) = t2 - It + 6 =

(f-

6)(f- 1) is the characteristic polynomial

of A. The eigenvalues of A are 6 and 1. Subtract \320\257= 6 down the diagonal of A to obtain the corresponding

homogeneous system of linear equations

-4x -2y

= 0 -2x-y = 0

CHAP.8] EIGENVALUES AND EIGENVECTORS, DIAGONALIZATION 289

A nonzero solution is u, = A, \342\200\2242).Next subtract \320\273= 1 down the diagonal of A to find the corresponding homo-homogeneous system

+ x-2y = 0 -2x+4y = 0A nonzero solution is B, 1).As expected from Theorem 8.13, u, and v2 are orthogonal. Normalize t>, and v2 toobtain the orthonormal vectors

\302\253,= (l/v/5, -2\320\224/5) u2

=B\320\224\320\224

Finally let P be the matrix whose columns are \320\270,and u2, respectively. Then

\320\247-2\320\233/5 1\320\224/5/ V0 1/

As expected, the diagonal entries of P\"

'/IP are the eigenvalues corresponding to the columns of P.

Application to Quadratic Forms

Recall (Section 4.12) that a real quadratic form q(x,, x2,..., xn) can be expressedin the matrix form

q{X) = XTAX

where X =(\321\205\342\200\236..., \321\205\320\277)\320\263and A is a real symmetric matrix, and recall that under a change of variables

X = PY, where \320\243= (yu ..., \321\203\342\200\236)and P is a nonsingular matrix, the quadratic form has the form

= YTBY

where \320\222= PTAP. (Thus \320\222is congruent to A)Now if P is an orthogonal matrix, then PT = P~'. In such a case,\320\222= \320\240\320\223\320\233\320\240= P~ XAP and so \320\222is

orthogonally similar to A. Accordingly, the above method for diagonalizing a real symmetric matrix A

can be used to diagonalize a quadratic form q under an orthogonal change of coordinates,as follows.

Orthogonal Diagonalization Algorithm 8.6:

The input is a quadratic form q(X).

Step 1. Find the symmetric matrix A which represents q and find its characteristic polynomial A(f).

Step 2. Find the eigenvalues of >4, which are the roots of A(r).

Step 3. For each eigenvalue \320\257of A in Step 2, find an orthogonal basis of its eigenspace.

Step 4. Normalize all eigenvectors in Step 3 which then forms an orthonormal basis of R\".

Step 5. Let P be the matrix whose columns are the normalized eigenvectorsin Step 4.

Then X = PY is the required orthogonal change of coordinates, and the diagonal entries of PTAPwill be the eigenvalues A,,..., \320\233\320\277which correspond to the columns of P.

8.7 MINIMUM POLYNOMIAL

Let A be an \320\277-square matrix over a field \320\232and let J(A) denote the collection of all polynomials/(f)

for which f(A) = 0. [Note J(A) is not empty since the characteristic polynomial AA(t) of A belongs to\320\224\320\233).]Let m(t) be the monic polynomial of minimal degree in J{A). Then m{t) is called the minimum

polynomial of A. [Such a polynomial m(t) exists and is unique (Problem 8.25).]

Theorem 8.15: The minimum polynomial m(t) of A divides every polynomialwhich has A as a zero. Inparticular, m{t) divides the characteristic polynomial A(i) of A.

(The proof is given in Problem 8.32.) There is an even stronger relationship between m[t) and A(t).

290 EIGENVALUES AND EIGENVECTORS, DIAGONALIZATION [CHAP.8

Theorem 8.16: The characteristic and minimum polynomials of a matrix A have the same irreduciblefactors.

This theorem, proved in Problem 8.33(b), does not say that m(t)= A(t); only that any irreducible

factor of one must divide the other. In particular, since a linear factor is irreducible, m(t) and A(f) have

the same linear factors; hence they have the same roots. Thus we have:

Theorem 8.17: A scalar \320\257is an eigenvalue of the matrix A if and only if \320\257is a root of the minimumpolynomial of A.

Example 8.10. Find the minimum polynomial m(t) of A = I 3

First find the characteristicpolynomial \320\224@of A:

t-2 -2 5

A(t)= \\tl-A\\= -3 t-1 15

-1 -2 t + 4

= t3 - 5r2 + It - 3 =(t

- \\J(t - 3)

Alternatively, A(t) = t3 -(tr A)t2 + (A,, + A22 + A33)t -

\\A | = r3 - 5t2 + It - 3 = (t- lJ(t - 3) (where Ati is

the cofactor of au in A).The minimum polynomial m(t) must divide \320\224(\320\263).Also, each irreducible factor of \320\224(\320\263),that is, t \342\200\2241 and t \342\200\2243,

must also be a factor of m(t). Thus m(t) is exactly only of the following:

/(t) = (r- 3Xf - 1) or g(t)

= {t- 3Xt - IJ

We know, by the Cayley-Hamilton Theorem, that g(A) = A{A) = 0; hencewe need only test/(t). We have

/1 2 -5\\/-l/\320\230)

=\320\230-/\320\245\320\233-3/)= 3 6 -15 3

\\l 2 -5/

Thus/(t) =m(t)

= (t \342\200\224lXt

- 3) = t2 - 4t + 3 is the minimum polynomial of A.

Example 8.11. Consider the following n-square matrix where \320\260\321\2040:

/A a 0 ... 0 0\\

2

4

2

-5\\-15

-7

i /

\\0

0

0

0

000

0 \320\257\320\260 0 0

\320\274= \\

0 0 0 ... \320\220\320\260

\\\320\2760 0 ... 0 \320\273/

Note that M has A's on the diagonal, a's on the superdiagonal, and Os elsewhere. This matrix, especially when

a = 1,is important in linear algebra. One can show that

f{t) = (t-W

is both the characteristic and minimum polynomial of M.

Example 8.12. Consider an arbitrary monic polynomial f(t)= t\" + a,,.,!11\021 +\302\246\302\246\302\246+a,! + a0. Let A be the

n-square matrix with Is on the subdiagonal, the negatives of the coefficients in the last column and Oselsewhereasfollows:

A =

1\302\260

1

0

0

0

1

... 0

... 0

... 0

-<h \\

-a2

\\0 0 ... 1 -an.J

CHAP. 8] EIGENVALUES AND EIGENVECTORS, DIAGONALIZATION 291

Then A is called the companion matrix of the polynomial f{t). Moreover, the minimum polynomial m(t) and the

characteristic polynomial \320\224(<)of the above companion matrix A are both equal to/(r).

Minimum Polynomial and Block Diagonal Matrices

The following theorem, proved in Problem 8.34, applies.

Theorem 8.18: Suppose M is a block diagonal matrix with diagonal blocks Au A2, \302\246..,Ar. Then the

minimum polynomial of M is equal to the least common multiple (LCM) of theminimum polynomials of the diagonal blocks At.

Remark: We emphasize that this theorem applies to block diagonal matrices,

whereas the analogous Theorem 8.5 on characteristic polynomials applies to block

triangular matrices.

Example 8.13. Find the characteristic polynomial \320\224(\320\263)and the minimum polynomial m(t) of the matrix

/2 5 0 00\\

0 2 0 0 0A = 0 0 4 2 0

0 0 3 5 0

\\0 0 0 0 7/Note A is a block diagonal matrix with diagonal blocks

-GOA \342\200\224

Then A(r) is the product of the characteristic polynomials A,(/), A2@. and A3(r) of At, A2, and A3, respectively.

Since Ax and \320\220\321\212are triangular, A,(() = (f - 2J and \320\2243@= (i - 7).Also.

A2(r) = t2 -(tr A2)t + \\\320\220\320\263\\

= t2 - 9t + 14 = (t -2Xt

- 7)

Thus A(() = (t- 2K(/ - 7J. [As expected, deg A(<)

= 5.]The minimum polynomials m,(f),m2U), and m3(r) of the diagonal blocks At, A2, and A3, respectively, are equal

to the characteristic polynomials; that is,

m^t) = (i- 2J m2(t)=

(t-

2)\302\253- 7) m,(t) = 1-7

But m(i) is equal to the least common multiple of m,(f),m2(f), m3(f). Thus m(f)= (f - 2J(< \342\200\224

7).

Solved Problems

POLYNOMIALS IN MATRICES, CHARACTERISTICPOLYNOMIAL

/1 -2\\8.1. Let A = I 1. Find/(\320\233) where: (a)/(i) = f2 - 3r + 7, and (b)f(t) = t2 - 6t + 13.

t)_/-7 -12\\ / -3 6\\ /7

oN|_f-3 ~6\\

~\\24 17/+

V-12 -15/+

V0 7j~\\ 12 9/

[Thus /1 is a root of/(t).]

292 EIGENVALUES AND EIGENVECTORS, DIAGONALIZATION [CHAP.8

8.2. /2 -3\\Find the characteristic polynomial A(t) of the matrix A = I I.

Form the characteristic matrix tl \342\200\224A:

ft 0\\ (-2 3\\ ft-2 3 \\

\"-H .)+{-> -i)-(-> .-\302\246)

The characteristic polynomial A(t)of A is its determinant:

f-2 3-5 t- 1

Alternatively, tr A = 2 + 1= 3 and \\A | = 2 + 15= 17;hence A(t) = t2 - 3f + 17.

= (t -2)(f

- 1) + 15 = t2 - 3t + 17

1 6 -2\\8.3. Find the characteristic polynomial A(t) of the matrix A =

\\\342\200\2243 2 0 I.

0 3 -4/

t- 1 -6

\320\223-2 0

-3 t+4

=(f

- IK' -\320\250+ 4) - 18+ 18(t + 4) = f3 + t1 - 8f + 62

Alternatively, tr/1 = 1+2-4=-!, 4,,=2 0

3 -4= -8,

1 -2

0 -4= -4, \320\233\342\200\236

=1 6

-3 22+ 18= 20, -4,, + A21 + A33= -8-4 + 20 = 8, and \320\230I

= -8 + 18-72= -62. Thus

\320\224@= /3 -

(tr A)t2 + (\320\220\321\206+ A22 + A33)t- \\A\\= f3 + t2 - 8t + 62

8.4. Find the characteristic polynomialsof the following matrices:

2 3 40 2 8-6

(a) R =0 0 3-5' {b) S =

2 5 7 -9\\1 4-6 4'0 0 6 -5

0 0 2 3/

(a) Since R is triangular, A(t)= (t - 1 )(t

-2Xf

- 3)(t - 4).

(b) Note S is block triangular with diagonal blocks A, = I I and A2 = I I. Thus

\320\224(\320\263)=AAft)AAl(t)

= (f2 - 6t + 3Xf2 \342\200\2249t + 28)

EIGENVALUES AND EIGENVECTORS

\320\233 4\\8.5. Let /4=1 1. Find: (a) all eigenvalues of A and the corresponding eigenspaces,(b) an invert-

ible matrix P such that D = P lAP is diagonal, and (c) A5 and f{A) where

/(t) = t* - 3t3 - It2 + 6t - 15.

(a) Form the characteristic matrix 11 \342\200\224A of A:

-2 t- (I)

CHAP. 8] EIGENVALUES AND EIGENVECTORS, DIAGONALIZATION 293

The characteristic polynomial &{t) of A is its determinant:

( - 1 -4Mt) = \\4-A\\= =t2-At-5 = (t -5Kf + 1)

Alternatively, tr A = 1 + 3 = 4 and |\320\233|= 3 \342\200\224

8=\342\200\2245,so A(f) = t2 - 4f - 5. The roots \320\273,= 5 and

a2 = \342\200\2241 of the characteristic polynomial A(<) are the eigenvalues of A.

We obtain the eigenvectors of A belonging to the eigenvalue \320\273,= 5. Substitute t = 5 in the char-

/ 4 -4\\acteristicmatrix (/) to obtain the matrix M = I I. The eigenvectors belonging to a, =5 form

the solution of the homogeneous system MX = 0, that is,

(\320\233 D0-Q

4x -4y

= 0or < or x -

\321\203= 0

\\-2x + 2y= 0

The system has only one independent solution; for example,x = 1,>>= 1. Thus u, =

A, 1) is an eigen-eigenvector which spans the eigenspace of \320\257,=5.

We obtain the eigenvectorsof A belonging to the eigenvalue /2 = \342\200\2241. Substitute t = \342\200\2241 into

tl \342\200\224A to obtain M =(

I which yields the homogeneous system

i\342\200\2242x \342\200\224

4y\342\200\2240

or x + 2y = 0- 2x -4>>

= 0

The system has only one independent solution; for example,x = 2, \321\203= \342\200\2241. Thus v2 = B, \342\200\224

1) is an

eigenvector which spans the eigenspace of \320\2732= \342\200\2241.

igen values:

\342\200\224-a -og x -\320\272 -t)

(fc) Let P be the matrix whose columns are the above eigenvectors:P = I |. Then D = P *AP is

the diagonal matrix whose diagonal entries are the respective eigenvalues:

[Remark: HereP is the change-of-basic matrix from the usual basis E of R2 to the basis S = [u,, v2}.HenceD is the matrix representation of,the function determined by A in this new basis.]

(c) Use the diagonal factorization of A,

and 5s = 3125and (-1M = - 1 to obtain:

A \"

\".\".\"

\320\233.'. I.' \\1042 2083

Also, since/E) = 90 and/(- 1) = -

24,

\320\241 X -\320\250 -0

_PDSp-._/'1 2Y3125 \302\260Y4 fW

?-;Xi \321\205-

8.6. Find all eigenvalues and a maximal set S of linearly independent eigenvectors for the followingmatrices:

Which of the matrices can be diagonalized? If so, find the required nonsingular matrix P.

(a) Find the characteristic polynomial A(<) = t2 \342\200\224it \342\200\22428 = (t \342\200\2247X< + 4). Thus the eigenvalues of A are

\320\273,= 7 and A2

= \342\200\2244.

294 EIGENVALUES AND EIGENVECTORS, DIAGONALIZATION [CHAP.8

/-2 6\\(i) Subtract A,

= 7 down the diagonal of A to obtain M = I I which corresponds to the

system

f-2x + 6y= 0

{\320\243

or x -\320\252\321\203

= 0

Here t), = C, I) isa nonzero solution (spanning the solution space)and so u, is the eigenvectorof;., = 7.

/9 6\\(ii) Subtract A2

= \342\200\2244 (or add 4) down the diagonal of A to obtain M = I I which corresponds to

the system 3x + 2y\342\200\2240. Here v2 = B, \342\200\2243)is a solution and hencean eigenvector of \320\2732

= \342\200\2244.

Then S ={vt

= C, 1), v2= B, \342\200\2243)}is a maximal set of linearly independent eigenvectors of A. Since S

is a basisfor R2, A is diagonalizable. Let \320\257be the matrix whose columns are u, and u2- Then

-C4)

(fc) Find \320\224(\320\263)= t1 \342\200\2248t +- 16 = (t \342\200\2244J. Thus \320\257= 4 is the only eigenvalue. Subtract \320\272\342\200\2244 down the diago-

diagonalof \320\241to obtain M = [ I which corresponds to the homogeneous system x + \321\203= 0. Here

v = (I, 1) is a nonzero solution of the system and hence v is an eigenvector of \320\241belonging to A = 4.Since there are no other eigenvalues, the singleton set S = {v = (I, 1)}is a maximal set of linearly

independent eigenvectors.Furthermore, \320\241is not diagonalizable since the number of linearly indepen-independenteigenvectors is not equal to the dimension of the vector space R2. In particular, no such non-

singular matrix P exists.

8.7. Let A = I 1. Find: (a) all eigenvalues of A and the corresponding eigenvectors; (b) an invert-

\\1 J/

ible matrix P such that D = P~lAP is diagonal; (c) A6; and (d) a \"positive square root\" of A,

i.e., a matrix B, having nonnegative eigenvalues, such that B2 = A.

(a) Here \320\224A)= t2 - tr A + \\A | = t2 \342\200\2245r + 4 = {t

- lXt - 4).Hence A,= 1 and A2

= 4 are eigenvalues of

A. We find corresponding eigenvectors:

(i) Subtract A, = 1 down the diagonal of A to obtain M = IJ

which corresponds to the homo-

homogeneous system x + 2y = 0. Here t>,= B, \342\200\224

1) is a nonzero solution of the system and so an

eigenvector of \320\233belonging to A, = 1.

= I I(ii) Subtract A2= 4 down the diagonal of A to obtain M = I I which corresponds to the

homogeneoussystem x \342\200\224\321\203

= 0. Here t>2= A, I) is a nonzero solution and so an eigenvector of A

belonging to \320\2732= 4.

(fc) Let P be the matrix whose columns are t>, and t>2. Then

A = PDP '=(

(c) Use the diagonal factorization of A,

2

1 '\320\224\320\2364\320\2244

to obtain

4 ~l\\ /1366 2730^

J365 2731,x \320\233!

CHAP. 8] EIGENVALUES AND EIGENVECTORS, DIAGONALIZATION 295

(d) Here! n' ~^ ) are square roots of D.Henceti oy0 \302\2612)'

U X \320\226 D-G D

is the positive square root of A.

8.8. Suppose A = [4  1  −1;  2  5  −2;  1  1  2]. Find: (a) the characteristic polynomial Δ(t) of A, (b) the eigenvalues of A, and (c) a maximal set of linearly independent eigenvectors of A. (d) Is A diagonalizable? If yes, find P such that P⁻¹AP is diagonal.

(a) We have

        Δ(t) = |tI − A| = det[t−4  −1  1;  −2  t−5  2;  −1  −1  t−2] = t³ − 11t² + 39t − 45

    Alternatively, Δ(t) = t³ − (tr A)t² + (A₁₁ + A₂₂ + A₃₃)t − |A| = t³ − 11t² + 39t − 45. (Here Aᵢᵢ is the cofactor of aᵢᵢ in the matrix A.)

(b) Assuming Δ(t) has a rational root, it must be among ±1, ±3, ±5, ±9, ±15, ±45. Testing t = 3 by synthetic division, we get

        3 | 1  −11  +39  −45
          |     +3  −24  +45
          | 1   −8  +15    0

    Thus t = 3 is a root of Δ(t) and t − 3 is a factor, giving

        Δ(t) = (t − 3)(t² − 8t + 15) = (t − 3)(t − 3)(t − 5) = (t − 3)²(t − 5)

    Accordingly, λ₁ = 3 and λ₂ = 5 are the eigenvalues of A.

(c) Find independent eigenvectors for each eigenvalue of A.

    (i)  Subtract λ₁ = 3 down the diagonal of A to obtain the matrix M = [1  1  −1;  2  2  −2;  1  1  −1], which corresponds to the homogeneous system x + y − z = 0. Here u = (1, −1, 0) and v = (1, 0, 1) are two independent solutions.

    (ii) Subtract λ₂ = 5 down the diagonal of A to obtain M = [−1  1  −1;  2  0  −2;  1  1  −3], which corresponds to the homogeneous system

             x − y + z = 0
                 y − 2z = 0

         Only z is a free variable. Here w = (1, 2, 1) is a solution.

    Thus {u = (1, −1, 0), v = (1, 0, 1), w = (1, 2, 1)} is a maximal set of linearly independent eigenvectors of A.

    Remark: The vectors u and v were chosen so that they were independent solutions of the homogeneous system x + y − z = 0. On the other hand, w is automatically independent of u and v since w belongs to a different eigenvalue of A. Thus the three vectors are linearly independent.


(d) A is diagonalizable since it has three linearly independent eigenvectors. Let P be the matrix with columns u, v, w. Then

        P = [1  1  1;  −1  0  2;  0  1  1]    and    P⁻¹AP = diag(3, 3, 5)

8.9. Suppose B = [3  −1  1;  7  −5  1;  6  −6  2]. Find: (a) the characteristic polynomial Δ(t) and the eigenvalues of B; and (b) a maximal set S of linearly independent eigenvectors of B. (c) Is B diagonalizable? If yes, find P such that P⁻¹BP is diagonal.

(a) We have:

        tr B = 3 − 5 + 2 = 0
        B₁₁ = −10 + 6 = −4,    B₂₂ = 6 − 6 = 0,    B₃₃ = −15 + 7 = −8
        |B| = −30 − 6 − 42 + 30 + 18 + 14 = −16

    Therefore, Δ(t) = t³ − (tr B)t² + (B₁₁ + B₂₂ + B₃₃)t − |B| = t³ − 12t + 16 = (t − 2)²(t + 4), and so λ = 2 and λ = −4 are the eigenvalues of B.

(b) Find a basis for the eigenspace of each eigenvalue.

    (i)  Subtract λ = 2 down the diagonal of B to obtain the homogeneous system

             x − y + z = 0,    7x − 7y + z = 0,    6x − 6y = 0

         The system has only one independent solution, e.g., x = 1, y = 1, z = 0. Thus u = (1, 1, 0) forms a basis for the eigenspace of λ = 2.

    (ii) Subtract λ = −4 (or add 4) down the diagonal of B to obtain the homogeneous system

             7x − y + z = 0,    7x − y + z = 0,    6x − 6y + 6z = 0

         The system has only one independent solution, e.g., x = 0, y = 1, z = 1. Thus v = (0, 1, 1) forms a basis for the eigenspace of λ = −4.

    Thus S = {u, v} is a maximal set of linearly independent eigenvectors of B.

(c) Since B has at most two independent eigenvectors, B is not similar to a diagonal matrix; i.e., B is not diagonalizable.

8.10. Find the algebraic and geometric multiplicities of the eigenvalue λ = 2 for the matrix B in Problem 8.9.

The algebraic multiplicity of λ = 2 is two, since t − 2 appears with exponent 2 in Δ(t). However, the geometric multiplicity of λ = 2 is one, since dim E₂ = 1.
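The two multiplicities can also be read off numerically. A short sketch (assuming Python with NumPy): the geometric multiplicity of λ is dim E_λ = n − rank(B − λI).

    import numpy as np

    B = np.array([[3.0, -1.0, 1.0],
                  [7.0, -5.0, 1.0],
                  [6.0, -6.0, 2.0]])
    lam, n = 2.0, 3
    geom = n - np.linalg.matrix_rank(B - lam * np.eye(n))
    print(geom)     # 1, even though (t - 2) appears with exponent 2 in Delta(t)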

8.11. Let A = [1  −1;  2  −1]. Find all eigenvalues and corresponding eigenvectors of A, assuming A is a real matrix. Is A diagonalizable? If yes, find P such that P⁻¹AP is diagonal.

The characteristic polynomial of A is Δ(t) = t² + 1, which has no root in R. Thus A, viewed as a real matrix, has no eigenvalues and no eigenvectors, and hence A is not diagonalizable over R.


8.12. Repeat Problem 8.11, assuming now that A is a matrix over the complex field C.

The characteristic polynomial of A is still Δ(t) = t² + 1. (It does not depend on the field K.) Over C, Δ(t) does factor; specifically, Δ(t) = t² + 1 = (t − i)(t + i). Thus λ₁ = i and λ₂ = −i are eigenvalues of A.

(i)  Substitute t = i in tI − A to obtain the homogeneous system

         (i − 1)x + y = 0
         −2x + (i + 1)y = 0        or        (i − 1)x + y = 0

     The system has only one independent solution, e.g., x = 1, y = 1 − i. Thus v₁ = (1, 1 − i) is an eigenvector which spans the eigenspace of λ₁ = i.

(ii) Substitute t = −i into tI − A to obtain the homogeneous system

         (−i − 1)x + y = 0
         −2x + (−i + 1)y = 0        or        (−i − 1)x + y = 0

     The system has only one independent solution, e.g., x = 1, y = 1 + i. Thus v₂ = (1, 1 + i) is an eigenvector of A which spans the eigenspace of λ₂ = −i.

As a complex matrix, A is diagonalizable. Let P be the matrix whose columns are v₁ and v₂. Then

        P = [1  1;  1−i  1+i]    and    P⁻¹AP = [i  0;  0  −i]
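Working over C rather than R is the default in numerical libraries; a hedged check of this problem (assuming Python with NumPy):

    import numpy as np

    A = np.array([[1.0, -1.0],
                  [2.0, -1.0]])
    evals, P = np.linalg.eig(A)        # complex eigenvalues are returned over C
    print(evals)                       # approximately [ i, -i]
    print(np.allclose(np.linalg.inv(P) @ A @ P, np.diag(evals)))   # True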

8.13. Let B = [2  4;  3  1]. Find: (a) all eigenvalues of B and the corresponding eigenvectors; (b) an invertible matrix P such that D = P⁻¹BP is diagonal; and (c) B⁶.

(a) Here Δ(t) = t² − (tr B)t + |B| = t² − 3t − 10 = (t − 5)(t + 2). Thus λ₁ = 5 and λ₂ = −2 are the eigenvalues of B.

    (i)  Subtract λ₁ = 5 down the diagonal of B to obtain M = [−3  4;  3  −4], which corresponds to the homogeneous system 3x − 4y = 0. Here v₁ = (4, 3) is a nonzero solution.

    (ii) Subtract λ₂ = −2 (or add 2) down the diagonal of B to obtain M = [4  4;  3  3], which corresponds to the system x + y = 0, which has a nonzero solution v₂ = (1, −1).

    (Since B has two independent eigenvectors, B is diagonalizable.)

(b) Let P be the matrix whose columns are v₁ and v₂. Then

        P = [4  1;  3  −1]    and    D = P⁻¹BP = [5  0;  0  −2]

(c) Use the diagonal factorization of B, B = PDP⁻¹, together with 5⁶ = 15625 and (−2)⁶ = 64, to obtain

        B⁶ = PD⁶P⁻¹ = [4  1;  3  −1][15625  0;  0  64]·(1/7)[1  1;  3  −4] = [8956  8892;  6669  6733]

8.14. Determine whether or not A is diagonalizable, where A is the upper triangular matrix A = [1  ∗  ∗;  0  2  3;  0  0  3] (the starred entries above the diagonal play no role in the argument).

Since A is triangular, the eigenvalues of A are the diagonal elements 1, 2, and 3. Since they are distinct, A has three independent eigenvectors, and thus A is similar to a diagonal matrix (Theorem 8.11). (We emphasize that here we do not need to compute eigenvectors to tell that A is diagonalizable. We would have to compute eigenvectors if we wanted to find P such that P⁻¹AP is diagonal.)

8.15. Suppose A and B are n-square matrices.

(a) Show that 0 is an eigenvalue of A if and only if A is singular.
(b) Show that AB and BA have the same eigenvalues.
(c) Suppose A is nonsingular (invertible) and λ is an eigenvalue of A. Show that λ⁻¹ is an eigenvalue of A⁻¹.
(d) Show that A and its transpose Aᵀ have the same characteristic polynomial.

(a) We have that 0 is an eigenvalue of A if and only if there exists a nonzero vector v such that Av = 0v = 0, i.e., if and only if A is singular.

(b) By part (a) and the fact that the product of nonsingular matrices is nonsingular, the following statements are equivalent: (i) 0 is an eigenvalue of AB, (ii) AB is singular, (iii) A or B is singular, (iv) BA is singular, (v) 0 is an eigenvalue of BA.

    Now suppose λ is a nonzero eigenvalue of AB. Then there exists a nonzero vector v such that ABv = λv. Set w = Bv. Since λ ≠ 0 and v ≠ 0,

        Aw = ABv = λv ≠ 0    and so    w ≠ 0

    But w is an eigenvector of BA belonging to the eigenvalue λ, since

        BAw = B(ABv) = B(λv) = λBv = λw

    Hence λ is an eigenvalue of BA. Similarly, any nonzero eigenvalue of BA is also an eigenvalue of AB. Thus AB and BA have the same eigenvalues.

(c) By part (a), λ ≠ 0. By definition of an eigenvalue, there exists a nonzero vector v for which Av = λv. Applying A⁻¹ to both sides, we obtain v = A⁻¹(λv) = λA⁻¹v. Hence A⁻¹v = λ⁻¹v; that is, λ⁻¹ is an eigenvalue of A⁻¹.

(d) Since a matrix and its transpose have the same determinant, |tI − A| = |(tI − A)ᵀ| = |tI − Aᵀ|. Thus A and Aᵀ have the same characteristic polynomial.

8.16. Let λ be an eigenvalue of an n-square matrix A over K. Let E_λ be the eigenspace of λ, i.e., the set of all eigenvectors of A belonging to λ. Show that E_λ is a subspace of Kⁿ; that is, show that: (a) if v ∈ E_λ, then kv ∈ E_λ for any scalar k ∈ K; and (b) if u, v ∈ E_λ, then u + v ∈ E_λ.

(a) Since v ∈ E_λ, we have A(v) = λv. Then

        A(kv) = kA(v) = k(λv) = λ(kv)

    Thus kv ∈ E_λ. [We must allow the zero vector of Kⁿ to serve as the "eigenvector" corresponding to k = 0, to make E_λ a subspace.]

(b) Since u, v ∈ E_λ, we have A(u) = λu and A(v) = λv. Then

        A(u + v) = A(u) + A(v) = λu + λv = λ(u + v)

    Thus u + v ∈ E_λ.

DIAGONALIZING REAL SYMMETRIC MATRICES AND REAL QUADRATIC FORMS

8.17. Let A = [3  2;  2  3]. Find a (real) orthogonal matrix P for which PᵀAP is diagonal.

The characteristic polynomial Δ(t) of A is

        Δ(t) = det[t−3  −2;  −2  t−3] = t² − 6t + 5 = (t − 5)(t − 1)

and thus the eigenvalues of A are 5 and 1.

Subtract λ = 5 down the diagonal of A to obtain the corresponding homogeneous system of linear equations

        −2x + 2y = 0,    2x − 2y = 0

A nonzero solution is v₁ = (1, 1). Normalize v₁ to find the unit solution u₁ = (1/√2, 1/√2).

Next subtract λ = 1 down the diagonal of A to obtain the corresponding homogeneous system of linear equations

        2x + 2y = 0,    2x + 2y = 0

A nonzero solution is v₂ = (1, −1). Normalize v₂ to find the unit solution u₂ = (1/√2, −1/√2).

Finally, let P be the matrix whose columns are u₁ and u₂, respectively; then

        P = [1/√2  1/√2;  1/√2  −1/√2]    and    PᵀAP = [5  0;  0  1]

As expected, the diagonal entries of PᵀAP are the eigenvalues of A.

8.18. Suppose C = [11  −8  4;  −8  −1  −2;  4  −2  −4]. Find: (a) the characteristic polynomial Δ(t) of C; (b) the eigenvalues of C or, in other words, the roots of Δ(t); (c) a maximal set S of nonzero orthogonal eigenvectors of C; and (d) an orthogonal matrix P such that P⁻¹CP is diagonal.

(a) We have

        Δ(t) = t³ − (tr C)t² + (C₁₁ + C₂₂ + C₃₃)t − |C| = t³ − 6t² − 135t − 400

    [Here Cᵢᵢ is the cofactor of cᵢᵢ in C = (cᵢⱼ).]

(b) If Δ(t) has a rational root, it must divide 400. Testing t = −5, we get

        −5 | 1   −6  −135  −400
           |     −5   +55  +400
           | 1  −11   −80     0

    Thus t + 5 is a factor of Δ(t), and

        Δ(t) = (t + 5)(t² − 11t − 80) = (t + 5)²(t − 16)

    Accordingly, the eigenvalues of C are λ = −5 (with multiplicity two) and λ = 16 (with multiplicity one).

(c) Find an orthogonal basis for each eigenspace.

    Subtract λ = −5 down the diagonal of C to obtain the homogeneous system

        16x − 8y + 4z = 0,    −8x + 4y − 2z = 0,    4x − 2y + z = 0

    That is, 4x − 2y + z = 0. The system has two independent solutions. One solution is v₁ = (0, 1, 2). We seek a second solution v₂ = (a, b, c) which is orthogonal to v₁, i.e., such that

        4a − 2b + c = 0    and also    b + 2c = 0

    One such solution is v₂ = (−5, −8, 4).

    Subtract λ = 16 down the diagonal of C to obtain the homogeneous system

        −5x − 8y + 4z = 0,    −8x − 17y − 2z = 0,    4x − 2y − 20z = 0

    This system yields a nonzero solution v₃ = (4, −2, 1). (As expected from Theorem 8.13, the eigenvector v₃ is orthogonal to v₁ and v₂.)

    Then v₁, v₂, v₃ form a maximal set of nonzero orthogonal eigenvectors of C.


(d) Normalize v₁, v₂, v₃ to obtain the orthonormal basis

        u₁ = (0, 1/√5, 2/√5)    u₂ = (−5/√105, −8/√105, 4/√105)    u₃ = (4/√21, −2/√21, 1/√21)

    Then P is the matrix whose columns are u₁, u₂, u₃. Thus

        P = [0  −5/√105  4/√21;  1/√5  −8/√105  −2/√21;  2/√5  4/√105  1/√21]    and    PᵀCP = diag(−5, −5, 16)

8.19. Let q(x, y) = 3x² − 6xy + 11y². Find an orthogonal change of coordinates which diagonalizes q.

Find the symmetric matrix A representing q and its characteristic polynomial Δ(t):

        A = [3  −3;  −3  11]    and    Δ(t) = det[t−3  3;  3  t−11] = t² − 14t + 24 = (t − 2)(t − 12)

The eigenvalues are 2 and 12; hence a diagonal form of q is

        q(x′, y′) = 2x′² + 12y′²

The corresponding change of coordinates is obtained by finding a corresponding set of eigenvectors of A.

Subtract λ = 2 down the diagonal of A to obtain the homogeneous system

        x − 3y = 0,    −3x + 9y = 0

A nonzero solution is v₁ = (3, 1). Next subtract λ = 12 down the diagonal of A to obtain the homogeneous system

        −9x − 3y = 0,    −3x − y = 0

A nonzero solution is v₂ = (−1, 3). Normalize v₁ and v₂ to obtain the orthonormal basis

        u₁ = (3/√10, 1/√10),    u₂ = (−1/√10, 3/√10)

The change-of-basis matrix P and the required change of coordinates follow:

        P = [3/√10  −1/√10;  1/√10  3/√10]    and    x = (3x′ − y′)/√10,  y = (x′ + 3y′)/√10

One can also express x′ and y′ in terms of x and y by using P⁻¹ = Pᵀ; that is,

        x′ = (3x + y)/√10,    y′ = (−x + 3y)/√10

8.20. Consider the quadratic form q(x, y, z) = 3x² + 2xy + 3y² + 2xz + 2yz + 3z². Find:

(a) The symmetric matrix A which represents q and its characteristic polynomial Δ(t).
(b) The eigenvalues of A or, in other words, the roots of Δ(t).
(c) A maximal set S of nonzero orthogonal eigenvectors of A.
(d) An orthogonal change of coordinates which diagonalizes q.

(a) Recall that A = (aᵢⱼ) is the symmetric matrix where aᵢᵢ is the coefficient of xᵢ² and aᵢⱼ = aⱼᵢ is one-half the coefficient of xᵢxⱼ. Thus

        A = [3  1  1;  1  3  1;  1  1  3]    and    Δ(t) = det[t−3  −1  −1;  −1  t−3  −1;  −1  −1  t−3] = t³ − 9t² + 24t − 20

(b) If Δ(t) has a rational root, it must divide the constant 20, or, in other words, it must be among ±1, ±2, ±4, ±5, ±10, ±20. Testing t = 2, we get

        2 | 1  −9  +24  −20
          |    +2  −14  +20
          | 1  −7  +10    0

    Thus t − 2 is a factor of Δ(t), and we find

        Δ(t) = (t − 2)(t² − 7t + 10) = (t − 2)²(t − 5)

    Hence the eigenvalues of A are 2 (with multiplicity two) and 5 (with multiplicity one).

(c) Find an orthogonal basis for each eigenspace.

    Subtract λ = 2 down the diagonal of A to obtain the corresponding homogeneous system

        x + y + z = 0,    x + y + z = 0,    x + y + z = 0

    That is, x + y + z = 0. The system has two independent solutions. One such solution is v₁ = (0, 1, −1). We seek a second solution v₂ = (a, b, c) which is orthogonal to v₁; that is, such that

        a + b + c = 0    and also    b − c = 0

    For example, v₂ = (2, −1, −1). Thus v₁ = (0, 1, −1) and v₂ = (2, −1, −1) form an orthogonal basis for the eigenspace of λ = 2.

    Subtract λ = 5 down the diagonal of A to obtain the corresponding homogeneous system

        −2x + y + z = 0,    x − 2y + z = 0,    x + y − 2z = 0

    This system yields a nonzero solution v₃ = (1, 1, 1). (As expected from Theorem 8.13, the eigenvector v₃ is orthogonal to v₁ and v₂.) Then v₁, v₂, v₃ form a maximal set of nonzero orthogonal eigenvectors of A.

(d) Normalize v₁, v₂, v₃ to obtain the orthonormal basis

        u₁ = (0, 1/√2, −1/√2)    u₂ = (2/√6, −1/√6, −1/√6)    u₃ = (1/√3, 1/√3, 1/√3)

    Let P be the matrix whose columns are u₁, u₂, u₃. Then

        P = [0  2/√6  1/√3;  1/√2  −1/√6  1/√3;  −1/√2  −1/√6  1/√3]    and    PᵀAP = diag(2, 2, 5)

    Thus the required orthogonal change of coordinates is

        x = 2y′/√6 + z′/√3,    y = x′/√2 − y′/√6 + z′/√3,    z = −x′/√2 − y′/√6 + z′/√3

    Under this change of coordinates, q is transformed into the diagonal form

        q(x′, y′, z′) = 2x′² + 2y′² + 5z′²

MINIMUM POLYNOMIAL

8.21. Find the minimum polynomial m(t) of the matrix A = [4  −2  2;  6  −3  4;  3  −2  3].

First find the characteristic polynomial Δ(t) of A:

        Δ(t) = |tI − A| = det[t−4  2  −2;  −6  t+3  −4;  −3  2  t−3] = t³ − 4t² + 5t − 2 = (t − 2)(t − 1)²

Alternatively, Δ(t) = t³ − (tr A)t² + (A₁₁ + A₂₂ + A₃₃)t − |A| = t³ − 4t² + 5t − 2 = (t − 2)(t − 1)². (Here Aᵢᵢ is the cofactor of aᵢᵢ in A.)

The minimum polynomial m(t) must divide Δ(t). Also, each irreducible factor of Δ(t), that is, t − 2 and t − 1, must also be a factor of m(t). Thus m(t) is exactly one of the following:

        f(t) = (t − 2)(t − 1)    or    g(t) = (t − 2)(t − 1)²

We know, by the Cayley-Hamilton Theorem, that g(A) = Δ(A) = 0; hence we need only test f(t). We have

        f(A) = (A − 2I)(A − I) = [2  −2  2;  6  −5  4;  3  −2  1][3  −2  2;  6  −4  4;  3  −2  2] = 0

Thus f(t) = m(t) = (t − 2)(t − 1) = t² − 3t + 2 is the minimum polynomial of A.
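The divisibility test above is a finite computation, so it is easy to mechanize. A sketch (assuming Python with NumPy):

    import numpy as np

    A = np.array([[4.0, -2.0, 2.0],
                  [6.0, -3.0, 4.0],
                  [3.0, -2.0, 3.0]])
    I = np.eye(3)
    f_of_A = (A - 2*I) @ (A - I)      # test the candidate f(t) = (t-2)(t-1)
    print(np.allclose(f_of_A, 0))     # True, so m(t) = t^2 - 3t + 2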

8.22. Find the minimum polynomial m(t) of B = [λ  a  0;  0  λ  a;  0  0  λ], where a ≠ 0.

The characteristic polynomial of B is Δ(t) = (t − λ)³. [Note m(t) is exactly one of t − λ, (t − λ)², or (t − λ)³.] We find (B − λI)² ≠ 0; thus m(t) = Δ(t) = (t − λ)³.

(Remark: This matrix is a special case of Example 8.11 and Problem 8.61.)

8.23. Find the minimum polynomial m(t) of the matrix

        M′ = [4  1  0  0  0;  0  4  1  0  0;  0  0  4  0  0;  0  0  0  4  1;  0  0  0  0  4]

Here M′ is block diagonal with diagonal blocks

        A′ = [4  1  0;  0  4  1;  0  0  4]    and    B′ = [4  1;  0  4]

The characteristic and minimum polynomial of A′ is f(t) = (t − 4)³, and the characteristic and minimum polynomial of B′ is g(t) = (t − 4)². Thus Δ(t) = f(t)g(t) = (t − 4)⁵ is the characteristic polynomial of M′, but m(t) = LCM[f(t), g(t)] = (t − 4)³ (whose degree is the size of the largest block) is the minimum polynomial of M′.
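One can confirm the exponent 3 directly: writing N = M′ − 4I, the powers of N vanish exactly at the size of the largest block. A sketch (assuming Python with NumPy):

    import numpy as np

    M = np.diag([4.0]*5) + np.diag([1.0, 1.0, 0.0, 1.0], k=1)   # the matrix M'
    N = M - 4*np.eye(5)
    print(np.allclose(np.linalg.matrix_power(N, 2), 0))   # False: (t-4)^2 is too small
    print(np.allclose(np.linalg.matrix_power(N, 3), 0))   # True:  m(t) = (t-4)^3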

8.24. Find a matrix A whose minimum polynomial is:

(a) f(t) = t³ − 8t² + 5t + 1,    (b) f(t) = t⁴ − 3t³ − 4t² + 5t + 6

Let A be the companion matrix (see Example 8.12) of f(t). Then

        (a) A = [0  0  −1;  1  0  −5;  0  1  8],    (b) A = [0  0  0  −6;  1  0  0  −5;  0  1  0  4;  0  0  1  3]

(Remark: The polynomial f(t) is also the characteristic polynomial of A.)
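A small helper makes the companion-matrix construction reusable. This is a hedged sketch (assuming Python with NumPy; the function name companion is ours, and np.poly recovers the characteristic polynomial as a check):

    import numpy as np

    def companion(a):
        """Companion matrix of the monic polynomial t^n + a[n-1]t^(n-1) + ... + a[0]."""
        n = len(a)
        C = np.zeros((n, n))
        C[1:, :-1] = np.eye(n - 1)        # 1s on the subdiagonal
        C[:, -1] = -np.asarray(a)         # last column: -a_0, ..., -a_{n-1}
        return C

    A = companion([6.0, 5.0, -4.0, -3.0])     # (b): t^4 - 3t^3 - 4t^2 + 5t + 6
    print(np.poly(A))                         # [1. -3. -4. 5. 6.] up to rounding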


8.25. Show that the minimum polynomial of a matrix A exists and is unique.

By the Cayley-Hamilton Theorem, A is a zero of some nonzero polynomial (see also Problem 8.37). Let n be the lowest degree for which a polynomial f(t) exists such that f(A) = 0. Dividing f(t) by its leading coefficient, we obtain a monic polynomial m(t) of degree n which has A as a zero. Suppose m′(t) is another monic polynomial of degree n for which m′(A) = 0. Then the difference m(t) − m′(t) is a nonzero polynomial of degree less than n which has A as a zero. This contradicts the original assumption on n; hence m(t) is the unique minimum polynomial.

PROOFS OF THEOREMS

8.26. Prove Theorem 8.1: (i) (f + g)(A) = f(A) + g(A), (ii) (fg)(A) = f(A)g(A), (iii) (kf)(A) = kf(A), (iv) f(A)g(A) = g(A)f(A).

Suppose f = aₙtⁿ + ··· + a₁t + a₀ and g = bₘtᵐ + ··· + b₁t + b₀. Then, by definition,

        f(A) = aₙAⁿ + ··· + a₁A + a₀I    and    g(A) = bₘAᵐ + ··· + b₁A + b₀I

(i)  Suppose m ≤ n and let bᵢ = 0 if i > m. Then

        f + g = (aₙ + bₙ)tⁿ + ··· + (a₁ + b₁)t + (a₀ + b₀)

     Hence

        (f + g)(A) = (aₙ + bₙ)Aⁿ + ··· + (a₁ + b₁)A + (a₀ + b₀)I = f(A) + g(A)

(ii) By definition, fg = cₙ₊ₘtⁿ⁺ᵐ + ··· + c₁t + c₀ = Σₖ cₖtᵏ, where cₖ = a₀bₖ + a₁bₖ₋₁ + ··· + aₖb₀ = Σᵢ aᵢbₖ₋ᵢ. Hence (fg)(A) = Σₖ cₖAᵏ and

        f(A)g(A) = (Σᵢ aᵢAⁱ)(Σⱼ bⱼAʲ) = Σᵢ Σⱼ aᵢbⱼAⁱ⁺ʲ = Σₖ cₖAᵏ = (fg)(A)

(iii) By definition, kf = kaₙtⁿ + ··· + ka₁t + ka₀, and so

        (kf)(A) = kaₙAⁿ + ··· + ka₁A + ka₀I = k(aₙAⁿ + ··· + a₁A + a₀I) = kf(A)

(iv) By (ii), g(A)f(A) = (gf)(A) = (fg)(A) = f(A)g(A).

8.27. Prove the Cayley-Hamilton Theorem 8.2: Every matrix is a root of its characteristic polynomial.

Let A be an arbitrary n-square matrix and let Δ(t) be its characteristic polynomial; say,

        Δ(t) = |tI − A| = tⁿ + aₙ₋₁tⁿ⁻¹ + ··· + a₁t + a₀

Now let B(t) denote the classical adjoint of the matrix tI − A. The elements of B(t) are cofactors of the matrix tI − A and hence are polynomials in t of degree not exceeding n − 1. Thus

        B(t) = Bₙ₋₁tⁿ⁻¹ + ··· + B₁t + B₀

where the Bᵢ are n-square matrices over K which are independent of t. By the fundamental property of the classical adjoint (Theorem 7.9), (tI − A)B(t) = |tI − A|I, or

        (tI − A)(Bₙ₋₁tⁿ⁻¹ + ··· + B₁t + B₀) = (tⁿ + aₙ₋₁tⁿ⁻¹ + ··· + a₁t + a₀)I

Removing parentheses and equating the coefficients of corresponding powers of t,

        Bₙ₋₁ = I
        Bₙ₋₂ − ABₙ₋₁ = aₙ₋₁I
        ············
        B₀ − AB₁ = a₁I
        −AB₀ = a₀I

Multiplying the above matrix equations by Aⁿ, Aⁿ⁻¹, ..., A, I, respectively,

        AⁿBₙ₋₁ = Aⁿ
        Aⁿ⁻¹Bₙ₋₂ − AⁿBₙ₋₁ = aₙ₋₁Aⁿ⁻¹
        ············
        AB₀ − A²B₁ = a₁A
        −AB₀ = a₀I

Adding the above matrix equations, the left-hand side telescopes to 0, and hence

        0 = Aⁿ + aₙ₋₁Aⁿ⁻¹ + ··· + a₁A + a₀I

or Δ(A) = 0, which is the Cayley-Hamilton Theorem.
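The theorem is also easy to witness numerically. A hedged illustration (assuming Python with NumPy), using the matrix of Problem 8.8:

    import numpy as np

    A = np.array([[4.0, 1.0, -1.0],
                  [2.0, 5.0, -2.0],
                  [1.0, 1.0,  2.0]])
    c = np.poly(A)                        # [1, -11, 39, -45]: coefficients of Delta(t)
    n = len(c) - 1
    DeltaA = sum(c[i] * np.linalg.matrix_power(A, n - i) for i in range(n + 1))
    print(np.allclose(DeltaA, 0))         # True: Delta(A) = 0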

8.28. Prove Theorem 8.6.

The scalar λ is an eigenvalue of A if and only if there exists a nonzero vector v such that

        Av = λv    or    (λI)v − Av = 0    or    (λI − A)v = 0

i.e., if and only if M = λI − A is singular. In such a case λ is a root of Δ(t) = |tI − A|. Also, v is in the eigenspace E_λ of λ if and only if the above relations hold; hence v is a solution of (λI − A)X = 0.

8.29. Prove Theorem 8.9.

Suppose A has n linearly independent eigenvectors v₁, v₂, ..., vₙ with corresponding eigenvalues λ₁, λ₂, ..., λₙ. Let P be the matrix whose columns are v₁, ..., vₙ. Then P is nonsingular. Also, the columns of AP are Av₁, ..., Avₙ. But Avₖ = λₖvₖ. Hence the columns of AP are λ₁v₁, ..., λₙvₙ. On the other hand, let D = diag(λ₁, λ₂, ..., λₙ), that is, the diagonal matrix with diagonal entries λₖ. Then PD is also a matrix with columns λₖvₖ. Accordingly,

        AP = PD    and hence    D = P⁻¹AP

as required.

Conversely, suppose there exists a nonsingular matrix P for which

        P⁻¹AP = diag(λ₁, λ₂, ..., λₙ) = D    and so    AP = PD

Let v₁, v₂, ..., vₙ be the column vectors of P. Then the columns of AP are Avₖ and the columns of PD are λₖvₖ. Accordingly, since AP = PD, we have

        Av₁ = λ₁v₁,  Av₂ = λ₂v₂,  ...,  Avₙ = λₙvₙ

Furthermore, since P is nonsingular, v₁, v₂, ..., vₙ are nonzero and hence they are eigenvectors of A belonging to the eigenvalues that are the diagonal elements of D. Moreover, they are linearly independent. Thus the theorem is proved.


8.30. Prove Theorem 8.10.

The proof is by induction on n. If n = 1, then v₁ is linearly independent since v₁ ≠ 0. Assume n > 1. Suppose

        a₁v₁ + a₂v₂ + ··· + aₙvₙ = 0                                (1)

where the aᵢ are scalars. Multiply (1) by A and obtain

        a₁Av₁ + a₂Av₂ + ··· + aₙAvₙ = A0 = 0

By hypothesis, Avᵢ = λᵢvᵢ. Thus on substitution we obtain

        a₁λ₁v₁ + a₂λ₂v₂ + ··· + aₙλₙvₙ = 0                          (2)

On the other hand, multiplying (1) by λₙ, we get

        a₁λₙv₁ + a₂λₙv₂ + ··· + aₙλₙvₙ = 0                          (3)

Subtracting (3) from (2) yields

        a₁(λ₁ − λₙ)v₁ + a₂(λ₂ − λₙ)v₂ + ··· + aₙ₋₁(λₙ₋₁ − λₙ)vₙ₋₁ = 0

By induction, v₁, v₂, ..., vₙ₋₁ are linearly independent; hence each of the above coefficients is 0. Since the λᵢ are distinct, λᵢ − λₙ ≠ 0 for i ≠ n. Hence a₁ = ··· = aₙ₋₁ = 0. Substituting this into (1), we get aₙvₙ = 0, and hence aₙ = 0. Thus the vᵢ are linearly independent.

8.31. Prove Theorem 8.11.

By Theorem 8.6, the λᵢ are eigenvalues of A. Let vᵢ be corresponding eigenvectors. By Theorem 8.10, the vᵢ are linearly independent and hence form a basis of Kⁿ. Thus A is diagonalizable by Theorem 8.9.

8.32. Prove Theorem 8.15: The minimum polynomial m(t) of A divides f(t) whenever f(A) = 0.

Suppose f(t) is a polynomial for which f(A) = 0. By the division algorithm, there exist polynomials q(t) and r(t) for which f(t) = m(t)q(t) + r(t), where r(t) = 0 or deg r(t) < deg m(t). Substituting t = A in this equation, and using that f(A) = 0 and m(A) = 0, we obtain r(A) = 0. If r(t) ≠ 0, then r(t) is a polynomial of degree less than that of m(t) which has A as a zero; this contradicts the definition of the minimum polynomial. Thus r(t) = 0, and so f(t) = m(t)q(t), i.e., m(t) divides f(t).

8.33. Let m(t) be the minimum polynomial of an n-square matrix A.

(a) Show that the characteristic polynomial of A divides (m(t))ⁿ.
(b) Prove Theorem 8.16: m(t) and Δ(t) have the same irreducible factors.

(a) Suppose m(t) = tʳ + c₁tʳ⁻¹ + ··· + cᵣ₋₁t + cᵣ. Consider the following matrices:

        B₀ = I
        B₁ = A + c₁I
        B₂ = A² + c₁A + c₂I
        ············
        Bᵣ₋₁ = Aʳ⁻¹ + c₁Aʳ⁻² + ··· + cᵣ₋₁I

    Then

        B₁ − AB₀ = c₁I,    B₂ − AB₁ = c₂I,    ...,    Bᵣ₋₁ − ABᵣ₋₂ = cᵣ₋₁I

    Also,

        −ABᵣ₋₁ = cᵣI − (Aʳ + c₁Aʳ⁻¹ + ··· + cᵣ₋₁A + cᵣI) = cᵣI − m(A) = cᵣI

    Set B(t) = tʳ⁻¹B₀ + tʳ⁻²B₁ + ··· + tBᵣ₋₂ + Bᵣ₋₁. Then

        (tI − A)B(t) = (tʳB₀ + tʳ⁻¹B₁ + ··· + tBᵣ₋₁) − (tʳ⁻¹AB₀ + tʳ⁻²AB₁ + ··· + ABᵣ₋₁)
                     = tʳB₀ + tʳ⁻¹(B₁ − AB₀) + tʳ⁻²(B₂ − AB₁) + ··· + t(Bᵣ₋₁ − ABᵣ₋₂) − ABᵣ₋₁
                     = tʳI + c₁tʳ⁻¹I + c₂tʳ⁻²I + ··· + cᵣ₋₁tI + cᵣI
                     = m(t)I

    Taking the determinant of both sides gives |tI − A||B(t)| = |m(t)I| = (m(t))ⁿ. Since |B(t)| is a polynomial, |tI − A| divides (m(t))ⁿ; that is, the characteristic polynomial of A divides (m(t))ⁿ.

(b) Suppose f(t) is an irreducible polynomial. If f(t) divides m(t) then, since m(t) divides Δ(t), f(t) divides Δ(t). On the other hand, if f(t) divides Δ(t) then, by part (a), f(t) divides (m(t))ⁿ. But f(t) is irreducible; hence f(t) also divides m(t). Thus m(t) and Δ(t) have the same irreducible factors.

8.34. Prove Theorem 8.18.

We prove the theorem for the case r = 2. The general theorem follows easily by induction. Suppose M = [A  0;  0  B], where A and B are square matrices. We need to show that the minimum polynomial m(t) of M is the least common multiple of the minimum polynomials g(t) and h(t) of A and B, respectively.

Since m(t) is the minimum polynomial of M, m(M) = [m(A)  0;  0  m(B)] = 0, and hence m(A) = 0 and m(B) = 0. Since g(t) is the minimum polynomial of A, g(t) divides m(t). Similarly, h(t) divides m(t). Thus m(t) is a multiple of g(t) and h(t).

Now let f(t) be another multiple of g(t) and h(t); then f(M) = [f(A)  0;  0  f(B)] = [0  0;  0  0] = 0. But m(t) is the minimum polynomial of M; hence m(t) divides f(t). Thus m(t) is the least common multiple of g(t) and h(t).

8.35. Suppose A is a real symmetric matrix viewed as a matrix over C.

(a) Prove that ⟨Au, v⟩ = ⟨u, Av⟩ for the inner product on Cⁿ.
(b) Prove Theorems 8.12 and 8.13 for the matrix A.

(a) We use the fact that the inner product on Cⁿ is defined by ⟨u, v⟩ = uᵀv̄. Since A is real symmetric, A = Aᵀ = Ā. Thus

        ⟨Au, v⟩ = (Au)ᵀv̄ = uᵀAᵀv̄ = uᵀĀv̄ = uᵀ(Av)‾ = ⟨u, Av⟩

(b) We use the fact that in Cⁿ, ⟨ku, v⟩ = k⟨u, v⟩ but ⟨u, kv⟩ = k̄⟨u, v⟩.

    (1) There exists v ≠ 0 such that Av = λv. Then

            λ⟨v, v⟩ = ⟨λv, v⟩ = ⟨Av, v⟩ = ⟨v, Av⟩ = ⟨v, λv⟩ = λ̄⟨v, v⟩

        But ⟨v, v⟩ ≠ 0 since v ≠ 0. Thus λ = λ̄, and so λ is real.

    (2) Here Au = λ₁u and Av = λ₂v and, by (1), λ₂ is real. Then

            λ₁⟨u, v⟩ = ⟨λ₁u, v⟩ = ⟨Au, v⟩ = ⟨u, Av⟩ = ⟨u, λ₂v⟩ = λ̄₂⟨u, v⟩ = λ₂⟨u, v⟩

        Since λ₁ ≠ λ₂, we have ⟨u, v⟩ = 0.

MISCELLANEOUS PROBLEMS

8.36. Suppose A is a 2 × 2 symmetric matrix with eigenvalues 1 and 9, and suppose u = (1, 3)ᵀ is an eigenvector belonging to the eigenvalue 1. Find: (a) an eigenvector v belonging to the eigenvalue 9, (b) the matrix A, and (c) a square root of A, i.e., a matrix B such that B² = A.

(a) Since A is symmetric, v must be orthogonal to u. Set v = (−3, 1)ᵀ.

(b) Let P be the matrix whose columns are the eigenvectors u and v. Then, by the diagonal factorization of A, we have

        A = P diag(1, 9) P⁻¹ = [1  −3;  3  1][1  0;  0  9]·(1/10)[1  3;  −3  1] = (1/5)[41  −12;  −12  9]

    (Alternatively, A is the matrix for which Au = u and Av = 9v.)

(c) Use the diagonal factorization of A to obtain

        B = P diag(1, 3) P⁻¹ = (1/5)[14  −3;  −3  6]

8.37. Let A be an n-square matrix. Without using the Cayley-Hamilton theorem, show that A is a root of a nonzero polynomial.

Let N = n². Consider the following N + 1 matrices:

        I, A, A², ..., Aᴺ

Recall that the vector space V of n × n matrices has dimension N = n². Thus the above N + 1 matrices are linearly dependent. Hence there exist scalars a₀, a₁, a₂, ..., a_N, not all zero, for which

        a_N Aᴺ + ··· + a₁A + a₀I = 0

Thus A is a root of the nonzero polynomial f(t) = a_N tᴺ + ··· + a₁t + a₀.

8.38. Suppose A is an n-square matrix. Prove the following:

(a) A is nonsingular if and only if the constant term of the minimum polynomial of A is not zero.
(b) If A is nonsingular, then A⁻¹ is equal to a polynomial in A of degree not exceeding n.

(a) Suppose m(t) = tʳ + aᵣ₋₁tʳ⁻¹ + ··· + a₁t + a₀ is the minimum polynomial of A. Since m(t) and the characteristic polynomial have the same irreducible factors (Theorem 8.16), a₀ = 0 if and only if t divides m(t), if and only if 0 is an eigenvalue of A. Hence the following are equivalent: (i) A is nonsingular, (ii) 0 is not a root of m(t), and (iii) the constant term a₀ is not zero. Thus the statement is true.

(b) Let m(t) be the minimum polynomial of A. Then m(t) = tʳ + aᵣ₋₁tʳ⁻¹ + ··· + a₁t + a₀, where r ≤ n. Since A is nonsingular, a₀ ≠ 0 by part (a). We have

        m(A) = Aʳ + aᵣ₋₁Aʳ⁻¹ + ··· + a₁A + a₀I = 0

    Thus

        −(1/a₀)(Aʳ⁻¹ + aᵣ₋₁Aʳ⁻² + ··· + a₁I)A = I

    Accordingly,

        A⁻¹ = −(1/a₀)(Aʳ⁻¹ + aᵣ₋₁Aʳ⁻² + ··· + a₁I)

    which is a polynomial in A of degree r − 1, not exceeding n.

8.39. Let F be an extension of a field K. Let A be an n-square matrix over K. Note that A may also be viewed as a matrix Â over F. Clearly |tI − A| = |tI − Â|; that is, A and Â have the same characteristic polynomial. Show that A and Â also have the same minimum polynomial.

Let m(t) and m′(t) be the minimum polynomials of A and Â, respectively. Now m′(t) divides every polynomial over F which has Â as a zero. Since m(t) has Â as a zero and since m(t) may be viewed as a polynomial over F, m′(t) divides m(t). We show now that m(t) divides m′(t).

Since m′(t) is a polynomial over F, which is an extension of K, we may write

        m′(t) = f₁(t)b₁ + f₂(t)b₂ + ··· + fₙ(t)bₙ

where the fᵢ(t) are polynomials over K, and b₁, ..., bₙ belong to F and are linearly independent over K. We have

        m′(A) = f₁(A)b₁ + f₂(A)b₂ + ··· + fₙ(A)bₙ = 0                (1)

Let aᵢⱼ⁽ᵏ⁾ denote the ij-entry of fₖ(A). The above matrix equation implies that, for each pair (i, j),

        aᵢⱼ⁽¹⁾b₁ + aᵢⱼ⁽²⁾b₂ + ··· + aᵢⱼ⁽ⁿ⁾bₙ = 0

Since the bᵢ are linearly independent over K and since the aᵢⱼ⁽ᵏ⁾ ∈ K, every aᵢⱼ⁽ᵏ⁾ = 0. Then

        f₁(A) = 0,  f₂(A) = 0,  ...,  fₙ(A) = 0

Since the fᵢ(t) are polynomials over K which have A as a zero and since m(t) is the minimum polynomial of A as a matrix over K, m(t) divides each of the fᵢ(t). Accordingly, by (1), m(t) must also divide m′(t). But monic polynomials which divide each other are necessarily equal. That is, m(t) = m′(t), as required.

Supplementary Problems

POLYNOMIALS IN MATRICES

8.40. Let f(t) = 2t² − 5t + 6 and g(t) = t³ − 2t² + t + 3. Find f(A), g(A), f(B), and g(B) for given 2 × 2 matrices A and B.

8.41. For a given 2 × 2 matrix A, find A², A³, and Aⁿ.

8.42. Let B = [8  12  0;  0  8  12;  0  0  8]. Find a real matrix A such that B = A³.

8.43. Show that, for any square matrix A, (P⁻¹AP)ⁿ = P⁻¹AⁿP, where P is invertible. More generally, show that f(P⁻¹AP) = P⁻¹f(A)P for any polynomial f(t).

8.44. Let f(t) be any polynomial. Show that (a) f(Aᵀ) = (f(A))ᵀ, and (b) if A is symmetric, then f(A) is symmetric.

EIGENVALUES AND EIGENVECTORS

8.45. Let A = [5  6;  −2  −2]. Find: (a) all eigenvalues and linearly independent eigenvectors; (b) P such that D = P⁻¹AP is diagonal; (c) A¹⁰ and f(A), where f(t) = t⁴ − 5t³ + 7t² − 2t + 5; and (d) B such that B² = A.

8.46. For each of the following matrices, find all eigenvalues and a basis for each eigenspace:

        (a) A = [3  1  1;  2  4  2;  1  1  3]    (b) B = [1  2  2;  1  2  −1;  −1  1  4]    (c) C = [1  1  0;  0  1  0;  0  0  1]

      When possible, find invertible matrices P₁, P₂, and P₃ such that P₁⁻¹AP₁, P₂⁻¹BP₂, and P₃⁻¹CP₃ are diagonal.

8.47. Consider two given 2 × 2 matrices A and B. Find all eigenvalues and linearly independent eigenvectors, assuming (a) A and B are matrices over the real field R, and (b) A and B are matrices over the complex field C.

8.48. Suppose v is a nonzero eigenvector of matrices A and B. Show that v is also an eigenvector of the matrix kA + k′B, where k and k′ are any scalars.

8.49. Suppose v is a nonzero eigenvector of a matrix A belonging to the eigenvalue λ. Show that for n > 0, v is also an eigenvector of Aⁿ belonging to λⁿ.

8.50. Suppose λ is an eigenvalue of a matrix A. Show that f(λ) is an eigenvalue of f(A) for any polynomial f(t).

8.51. Show that similar matrices have the same eigenvalues.

8.52. Show that matrices A and Aᵀ have the same eigenvalues. Give an example where A and Aᵀ have different eigenvectors.

CHARACTERISTIC AND MINIMUM POLYNOMIALS

8.53. Find the characteristic and minimum polynomials of each of the following matrices:

        A = [2  5  0  0  0;  0  2  0  0  0;  0  0  4  2  0;  0  0  3  5  0;  0  0  0  0  7]
        B = [3  1  0  0  0;  0  3  0  0  0;  0  0  3  1  0;  0  0  0  3  1;  0  0  0  0  3]
        C = [λ  0  0  0  0;  0  λ  0  0  0;  0  0  λ  0  0;  0  0  0  λ  0;  0  0  0  0  λ]

8.54. Let A = [1  1  0;  0  2  0;  0  0  1] and B = [2  0  0;  0  2  2;  0  0  1]. Show that A and B have different characteristic polynomials (and so are not similar), but have the same minimum polynomial. Thus nonsimilar matrices may have the same minimum polynomial.

8.55. Consider a square block matrix M = [A  B;  0  C]. Show that tI − M = [tI−A  −B;  0  tI−C] is the characteristic matrix of M.

8.56. Let A be an n-square matrix for which Aᵏ = 0 for some k > n. Show that Aⁿ = 0.

8.57. Show that a matrix A and its transpose Aᵀ have the same minimum polynomial.

8.58. Suppose f(t) is an irreducible monic polynomial for which f(A) = 0 for a matrix A. Show that f(t) is the minimum polynomial of A.

8.59. Show that A is a scalar matrix kI if and only if the minimum polynomial of A is m(t) = t − k.

8.60. Find a matrix A whose minimum polynomial is (a) t³ − 5t² + 6t + 8, (b) t⁴ − 5t³ − 2t² + 7t + 4.

8.61. Consider the following n-square matrices (where a ≠ 0):

        N = [0  1  0  ···  0  0;  0  0  1  ···  0  0;  ···;  0  0  0  ···  0  1;  0  0  0  ···  0  0]
        M = [λ  a  0  ···  0  0;  0  λ  a  ···  0  0;  ···;  0  0  0  ···  λ  a;  0  0  0  ···  0  λ]

      Here N has 1s on the first diagonal above the main diagonal and 0s elsewhere, and M has λ's on the main diagonal, a's on the first diagonal above the main diagonal, and 0s elsewhere.

      (a) Show that, for k < n, Nᵏ has 1s on the kth diagonal above the main diagonal and 0s elsewhere, and show that Nⁿ = 0.
      (b) Show that the characteristic polynomial and the minimum polynomial of N are both f(t) = tⁿ.
      (c) Show that the characteristic and minimum polynomial of M is g(t) = (t − λ)ⁿ. [Hint: Note that M = λI + aN.]

DIAGONALIZATION

8.62. Let A = [a  b;  c  d] be a matrix over the real field R. Find necessary and sufficient conditions on a, b, c, and d so that A is diagonalizable, i.e., has two linearly independent eigenvectors.

8.63. Repeat Problem 8.62 for the case that A is a matrix over the complex field C.

8.64. Show that a matrix A is diagonalizable if and only if its minimum polynomial is a product of distinct linear factors.

8.65. Suppose E is a matrix such that E² = E.

      (a) Find the minimum polynomial m(t) of E.
      (b) Show that E is diagonalizable and, moreover, E is similar to the diagonal matrix A = [I_r  0;  0  0], where r is the rank of E.

DIAGONALIZATION OF REAL SYMMETRIC MATRICES AND QUADRATIC FORMS

8.66. For each of the given symmetric matrices A, find an orthogonal matrix P for which PᵀAP is diagonal.

8.67. Find an orthogonal transformation of coordinates which diagonalizes each quadratic form:

        (a) q(x, y) = 2x² − 6xy + 10y²,    (b) q(x, y) = x² + 8xy − 5y²

8.68. Find an orthogonal transformation of coordinates which diagonalizes the quadratic form q(x, y, z) = 2xy + 2xz + 2yz.

8.69. Let A be a 2 × 2 real symmetric matrix with eigenvalues 2 and 3, and let u = (1, 2) be an eigenvector belonging to 2. Find an eigenvector v belonging to 3, and find A.

Answers to Supplementary Problems

8.42. Hint: Let A = [2  a  b;  0  2  c;  0  0  2]. Set B = A³ and then obtain conditions on a, b, and c.

8.45. (a) λ₁ = 1, u = (3, −2); λ₂ = 2, v = (2, −1)
      (b) P = [3  2;  −2  −1]
      (c) A¹⁰ = [4093  6138;  −2046  −3068],  f(A) = [2  −6;  2  9]
      (d) B = [−3 + 4√2  −6 + 6√2;  2 − 2√2  4 − 3√2]

8.46. (a) λ₁ = 2, u = (1, −1, 0), v = (1, 0, −1); λ₂ = 6, w = (1, 2, 1)
      (b) λ₁ = 3, u = (1, 1, 0), v = (1, 0, 1); λ₂ = 1, w = (2, −1, 1)
      (c) λ = 1, u = (1, 0, 0), v = (0, 0, 1)

      Let P₁ = [1  1  1;  −1  0  2;  0  −1  1] and P₂ = [1  1  2;  1  0  −1;  0  1  1]. P₃ does not exist, since C has at most two linearly independent eigenvectors and so cannot be diagonalized.

8.47. (a) For A, λ = 3, u = (1, −1); B has no eigenvalues (in R).
      (b) For A, λ = 3, u = (1, −1); for B, λ₁ = 2i, u = (1, 3 − 2i); λ₂ = −2i, v = (1, 3 + 2i).

8.52. Let A = [1  1;  0  1]. Then λ = 1 is the only eigenvalue, and v = (1, 0) spans the eigenspace of λ = 1. On the other hand, for Aᵀ = [1  0;  1  1], λ = 1 is still the only eigenvalue, but w = (0, 1) spans the eigenspace of λ = 1.

8.53. (a) Δ(t) = (t − 2)³(t − 7)², m(t) = (t − 2)²(t − 7)
      (b) Δ(t) = (t − 3)⁵, m(t) = (t − 3)³
      (c) Δ(t) = (t − λ)⁵, m(t) = t − λ

8.60. (a) A = [0  0  −8;  1  0  −6;  0  1  5],    (b) A = [0  0  0  −4;  1  0  0  −7;  0  1  0  2;  0  0  1  5]

8.65. (a) If E = I, then m(t) = t − 1; if E = 0, then m(t) = t; otherwise m(t) = t(t − 1).
      (b) Hint: Use (a).

8.67. (a) x = (3x′ − y′)/√10,  y = (x′ + 3y′)/√10;    (b) x = (2x′ − y′)/√5,  y = (x′ + 2y′)/√5

8.68. x = x′/√3 + y′/√2 + z′/√6,  y = x′/√3 − y′/√2 + z′/√6,  z = x′/√3 − 2z′/√6

8.69. v = (2, −1), A = (1/5)[14  −2;  −2  11]

Chapter 9

Linear Mappings

9.1 INTRODUCTION

The main subject matter of linear algebra is finite-dimensional vector spaces and the linear mappings between such spaces. Vector spaces were introduced in Chapter 5. This chapter introduces us to linear mappings. First, however, we begin with a discussion of mappings in general.

9.2 MAPPINGS

Let A and B be arbitrary nonempty sets. Suppose to each element of A there is assigned a unique element of B; the collection of such assignments is called a mapping (or map) from A into B. The set A is called the domain of the mapping and B is called the codomain. A mapping f from A into B is denoted by

        f : A → B

We write f(a), read "f of a", for the element of B that f assigns to a ∈ A; it is called the value of f at a or the image of a under f.

Remark: The term function is used synonymously with the word mapping, although some texts reserve the word function for a real-valued or complex-valued mapping, i.e., one which maps a set into R or C.

Consider a mapping f : A → B. If A′ is any subset of A, then f(A′) denotes the set of images of elements of A′; and if B′ is any subset of B, then f⁻¹(B′) denotes the set of elements of A each of whose image lies in B′:

        f(A′) = {f(a) : a ∈ A′}    and    f⁻¹(B′) = {a ∈ A : f(a) ∈ B′}

We call f(A′) the image of A′ and f⁻¹(B′) the inverse image or preimage of B′. In particular, the set of all images, i.e., f(A), is called the image (or range) of f.

To each mapping f : A → B there corresponds the subset of A × B given by {(a, f(a)) : a ∈ A}. We call this set the graph of f. Two mappings f : A → B and g : A → B are defined to be equal, written f = g, if f(a) = g(a) for every a ∈ A, that is, if they have the same graph. Thus we do not distinguish between a function and its graph. The negation of f = g is written f ≠ g and is the statement:

        There exists an a ∈ A for which f(a) ≠ g(a).

Sometimes the "barred" arrow ↦ is used to denote the image of an arbitrary element x ∈ A under a mapping f : A → B by writing

        x ↦ f(x)

This is illustrated in the following example.

Example 9.1

(a) Let f : R → R be the mapping which assigns to each real number x its square x²:

        x ↦ x²    or    f(x) = x²

    Here the image of −3 is 9, so we may write f(−3) = 9.

(b) Consider a 2 × 3 matrix A. If we write the vectors in R³ and R² as column vectors, then A determines the mapping F : R³ → R² defined by

        v ↦ Av,    that is,    F(v) = Av,  v ∈ R³

(c) Let V be the vector space of polynomials in the variable t over the real field R. Then the derivative defines a mapping D : V → V where, for any polynomial f ∈ V, we let D(f) = df/dt. For example, D(3t² − 5t + 2) = 6t − 5.

(d) Let V be the vector space of polynomials in t over R [as in (c)]. Then the integral from, say, 0 to 1 defines a mapping J : V → R where, for any polynomial f ∈ V, we let J(f) = ∫₀¹ f(t) dt. For example,

        J(3t² − 5t + 2) = ∫₀¹ (3t² − 5t + 2) dt = 1/2

    Note that this map is from the vector space V into the scalar field R, whereas the map in (c) is from V into itself.

Remark: Every m × n matrix A over a field K determines the mapping F : Kⁿ → Kᵐ defined by

        v ↦ Av

where the vectors in Kⁿ and Kᵐ are written as column vectors. For convenience, we shall usually denote the above mapping by A, the same symbol used for the matrix.

Composition of Mappings

Consider two mappings f : A → B and g : B → C:

        A --f--> B --g--> C

Let a ∈ A; then f(a) ∈ B, the domain of g. Hence we can obtain the image of f(a) under the mapping g, that is, g(f(a)). This map

        a ↦ g(f(a))

from A into C is called the composition or product of f and g, and is denoted by g ∘ f. In other words, (g ∘ f) : A → C is the mapping defined by

        (g ∘ f)(a) = g(f(a))

Our first theorem tells us that composition of mappings satisfies the associative law.

Theorem 9.1: Let f : A → B, g : B → C, and h : C → D. Then h ∘ (g ∘ f) = (h ∘ g) ∘ f.

We prove this theorem now. If a ∈ A, then

        (h ∘ (g ∘ f))(a) = h((g ∘ f)(a)) = h(g(f(a)))    and    ((h ∘ g) ∘ f)(a) = (h ∘ g)(f(a)) = h(g(f(a)))

Thus (h ∘ (g ∘ f))(a) = ((h ∘ g) ∘ f)(a) for every a ∈ A, and so h ∘ (g ∘ f) = (h ∘ g) ∘ f.


Remark: Let F : A → B. Some texts write aF instead of F(a) for the image of a ∈ A under F. With this notation, the composition of functions F : A → B and G : B → C is denoted by F ∘ G and not by G ∘ F as used in this text.

Injective (One-to-One) and Surjective (Onto) Mappings

We formally introduce some special types of mappings.

Definition: A mapping f : A → B is said to be one-to-one (or one-one or 1-1) or injective if different elements of A have distinct images; that is,

        if a ≠ a′ implies f(a) ≠ f(a′)

or, equivalently,

        if f(a) = f(a′) implies a = a′

Definition: A mapping f : A → B is said to be onto (or f maps A onto B) or surjective if every b ∈ B is the image of at least one a ∈ A.

A mapping f : A → B which is both one-to-one and onto is said to be a one-to-one correspondence between A and B, or bijective.

Example 9.2

(a) Let f : R → R, g : R → R, and h : R → R be defined by f(x) = 2ˣ, g(x) = x³ − x, and h(x) = x². The graphs of these mappings are shown in Fig. 9-1. The mapping f is one-to-one; geometrically, this means that each horizontal line does not contain more than one point of f. The mapping g is onto; geometrically, this means that each horizontal line contains at least one point of g. The mapping h is neither one-to-one nor onto; for example, 2 and −2 have the same image 4, and −16 is not the image of any element of R.

    [Fig. 9-1: the graphs of f(x) = 2ˣ, g(x) = x³ − x, and h(x) = x².]

(b) Let A be any set. The mapping f : A → A defined by f(a) = a, i.e., which assigns to each element in A itself, is called the identity mapping on A and is denoted by 1_A or 1 or I.

(c) Let f : A → B. We call g : B → A the inverse of f, written f⁻¹, if

        f ∘ g = 1_B    and    g ∘ f = 1_A

    We emphasize that f has an inverse if and only if f is both one-to-one and onto (Problem 9.11). Also, if b ∈ B, then f⁻¹(b) = a, where a is the unique element of A for which f(a) = b.

9.3 LINEAR MAPPINGS

Let V and U be vector spaces over the same field K. A mapping F : V → U is called a linear mapping (or linear transformation or vector space homomorphism) if it satisfies the following two conditions:

(1) For any v, w ∈ V, F(v + w) = F(v) + F(w).
(2) For any k ∈ K and any v ∈ V, F(kv) = kF(v).

In other words, F : V → U is linear if it "preserves" the two basic operations of a vector space, that of vector addition and that of scalar multiplication.

Substituting k = 0 into (2), we obtain F(0) = 0. That is, every linear mapping takes the zero vector into the zero vector.

Now for any scalars a, b ∈ K and any vectors v, w ∈ V we obtain, by applying both conditions of linearity,

        F(av + bw) = F(av) + F(bw) = aF(v) + bF(w)

More generally, for any scalars aᵢ ∈ K and any vectors vᵢ ∈ V, we obtain the basic property of linear mappings:

        F(a₁v₁ + a₂v₂ + ··· + aₙvₙ) = a₁F(v₁) + a₂F(v₂) + ··· + aₙF(vₙ)

Remark: The condition F(av + bw) = aF(v) + bF(w) completely characterizes linear mappings and is sometimes used as their definition.

Example 9.3

(a) Let A be any m × n matrix over a field K. As noted previously, A determines a mapping F : Kⁿ → Kᵐ by the assignment v ↦ Av. (Here the vectors in Kⁿ and Kᵐ are written as columns.) We claim that F is linear. For, by properties of matrices,

        F(v + w) = A(v + w) = Av + Aw = F(v) + F(w)
        F(kv) = A(kv) = kAv = kF(v)

    where v, w ∈ Kⁿ and k ∈ K.

(b) Let F : R³ → R³ be the "projection" mapping into the xy plane: F(x, y, z) = (x, y, 0). We show that F is linear. Let v = (a, b, c) and w = (a′, b′, c′). Then

        F(v + w) = F(a + a′, b + b′, c + c′) = (a + a′, b + b′, 0) = (a, b, 0) + (a′, b′, 0) = F(v) + F(w)

    and, for any k ∈ R,

        F(kv) = F(ka, kb, kc) = (ka, kb, 0) = k(a, b, 0) = kF(v)

    That is, F is linear.

(c) Let F : R² → R² be the "translation" mapping defined by F(x, y) = (x + 1, y + 2). Note that F(0) = F(0, 0) = (1, 2) ≠ 0. That is, the zero vector is not mapped onto the zero vector. Hence F is not linear.

(d) Let F : V → U be the mapping which assigns 0 ∈ U to every v ∈ V. Then, for any v, w ∈ V and any k ∈ K, we have

        F(v + w) = 0 = 0 + 0 = F(v) + F(w)    and    F(kv) = 0 = k0 = kF(v)

    Thus F is linear. We call F the zero mapping and shall usually denote it by 0.

(e) Consider the identity mapping I : V → V which maps each v ∈ V into itself. Then, for any v, w ∈ V and any a, b ∈ K, we have

        I(av + bw) = av + bw = aI(v) + bI(w)

    Thus I is linear.

(f) Let V be the vector space of polynomials in the variable t over the real field R. Then the derivative mapping D : V → V and the integral mapping J : V → R, defined in Example 9.1(c) and (d), are linear. For it is proven in calculus that for any u, v ∈ V and k ∈ R,

        d(u + v)/dt = du/dt + dv/dt    and    d(ku)/dt = k du/dt

    that is, D(u + v) = D(u) + D(v) and D(ku) = kD(u); and also

        ∫₀¹ [u(t) + v(t)] dt = ∫₀¹ u(t) dt + ∫₀¹ v(t) dt    and    ∫₀¹ ku(t) dt = k ∫₀¹ u(t) dt

    that is, J(u + v) = J(u) + J(v) and J(ku) = kJ(u).

(g) Let F : V → U be a linear mapping which is both one-to-one and onto. Then an inverse mapping F⁻¹ : U → V exists. We will show (Problem 9.15) that this inverse mapping is also linear.

Our next theorem (proved in Problem 9.12) gives us an abundance of examples of linear mappings; in particular, it tells us that a linear mapping is completely determined by its values on the elements of a basis.

Theorem 9.2: Let V and U be vector spaces over a field K. Let {v₁, v₂, ..., vₙ} be a basis of V and let u₁, u₂, ..., uₙ be any vectors in U. Then there exists a unique linear mapping F : V → U such that F(v₁) = u₁, F(v₂) = u₂, ..., F(vₙ) = uₙ.

We emphasize that the vectors u₁, ..., uₙ in Theorem 9.2 are completely arbitrary; they may be linearly dependent or they may even be equal to each other.

Vector Space Isomorphism

The notion of two vector spaces being isomorphic was defined in Chapter 5 when we investigated the coordinates of a vector relative to a basis. We now redefine this concept.

Definition: Two vector spaces V and U over K are said to be isomorphic if there exists a bijective linear mapping F : V → U. The mapping F is then called an isomorphism between V and U.

Example 9.4. Let V be a vector space over K of dimension n and let S be a basis of V. Then, as noted previously, the mapping v ↦ [v]_S, which maps each v ∈ V into its coordinate vector relative to the basis S, is an isomorphism between V and Kⁿ.

9.4 KERNEL AND IMAGE OF A LINEAR MAPPING

We begin by defining two concepts.

Definition: Let F : V → U be a linear mapping. The image of F, written Im F, is the set of image points in U:

        Im F = {u ∈ U : F(v) = u for some v ∈ V}

The kernel of F, written Ker F, is the set of elements in V which map into 0 ∈ U:

        Ker F = {v ∈ V : F(v) = 0}

The following theorem is easily proven (Problem 9.22):


Theorem 9.3: Let F : V → U be a linear mapping. Then the image of F is a subspace of U and the kernel of F is a subspace of V.

Now suppose that the vectors v₁, ..., vₙ span V and that F : V → U is linear. We show that the vectors F(v₁), ..., F(vₙ) ∈ U span Im F. For suppose u ∈ Im F; then F(v) = u for some vector v ∈ V. Since the vᵢ span V and since v ∈ V, there exist scalars a₁, ..., aₙ for which

        v = a₁v₁ + a₂v₂ + ··· + aₙvₙ

Accordingly,

        u = F(v) = F(a₁v₁ + a₂v₂ + ··· + aₙvₙ) = a₁F(v₁) + a₂F(v₂) + ··· + aₙF(vₙ)

and hence the vectors F(v₁), ..., F(vₙ) span Im F.

We formally state the above useful result.

Proposition 9.4: Suppose v₁, v₂, ..., vₙ span a vector space V and F : V → U is linear. Then F(v₁), F(v₂), ..., F(vₙ) span Im F.

Example 9.5

(a) Let F : R³ → R³ be the projection mapping into the xy plane. That is,

        F(x, y, z) = (x, y, 0)

    (See Fig. 9-2.) Clearly the image of F is the entire xy plane. That is,

        Im F = {(a, b, 0) : a, b ∈ R}

    Note that the kernel of F is the z axis. That is,

        Ker F = {(0, 0, c) : c ∈ R}

    since these points and only these points map into the zero vector 0 = (0, 0, 0).

    [Fig. 9-2: the projection of v = (a, b, c) onto the xy plane.]

(b) Let V be the vector space of polynomials over R and let T : V → V be the third-derivative operator, that is, T(f) = d³f/dt³. [Sometimes the notation T = D³, where D is the derivative mapping in Example 9.1(c), is used.] Then

        Ker T = {polynomials of degree ≤ 2}

    [since T(at² + bt + c) = 0 but T(tⁿ) ≠ 0 for n > 2]. On the other hand, Im T = V, since every polynomial f(t) in V is the third derivative of some polynomial.


(c) Consider an arbitrary 4 × 3 matrix A over a field K:

        A = [a₁  a₂  a₃;  b₁  b₂  b₃;  c₁  c₂  c₃;  d₁  d₂  d₃]

    which we view as a linear mapping A : K³ → K⁴. Now the usual basis vectors e₁, e₂, e₃ of K³ span K³, and so their values Ae₁, Ae₂, Ae₃ under A span the image of A. But the vectors Ae₁, Ae₂, and Ae₃ are the columns of A:

        Ae₁ = (a₁, b₁, c₁, d₁)ᵀ,    Ae₂ = (a₂, b₂, c₂, d₂)ᵀ,    Ae₃ = (a₃, b₃, c₃, d₃)ᵀ

    Thus the image of A is precisely the column space of A.

    On the other hand, the kernel of A consists of all vectors v for which Av = 0. This means that the kernel of A is the solution space of the homogeneous system AX = 0.

Remark: The above result is true in general. That is, if A is any m × n matrix viewed as a linear mapping A : Kⁿ → Kᵐ and E = {eᵢ} is the usual basis of Kⁿ, then Ae₁, ..., Aeₙ are the columns of A and

        Ker A = nullsp A    and    Im A = colsp A

Here colsp A means the column space of A (Section 5.5), and nullsp A means the null space of A, i.e., the solution space of the homogeneous system AX = 0.

Rank and Nullity of a Linear Mapping

So far we have not related the notion of dimension to that of a linear mapping F : V → U. In the case that V is of finite dimension, we have the following fundamental relationship.

Theorem 9.5: Let V be of finite dimension and let F : V → U be a linear mapping. Then

        dim V = dim (Ker F) + dim (Im F)                             (9.1)

That is, the sum of the dimensions of the image and kernel of a linear mapping is equal to the dimension of its domain.

Equation (9.1) is easily seen to hold for the projection mapping F in Example 9.5(a). There the image (xy plane) and the kernel (z axis) of F have dimensions 2 and 1, respectively, whereas the domain R³ of F has dimension 3.

Remark: Let F : V → U be a linear mapping. Then the rank of F is defined to be the dimension of its image, and the nullity of F is defined to be the dimension of its kernel; that is,

        rank F = dim (Im F)    and    nullity F = dim (Ker F)

Thus Theorem 9.5 yields the following formula for F when V has finite dimension:

        rank F + nullity F = dim V

Recall that the rank of a matrix A was originally defined to be the dimension of its column space and of its row space. Observe that if we now view A as a linear mapping, then both definitions correspond, since the image of A is precisely its column space.

Example 9.6. Let F : R* -> R3 be the linear mapping defined by

F(x, y, s, t) = (x ->> + s + r, x + 2s \342\200\224t, x + \321\203+ is \342\200\224

3f)

{a) Find a basis and the dimension of the image of F.Find the image of the usual basis vectorsof R4:

F(\\, 0, 0, 0) = A, 1, 1) F@, 0, 1, 0) = A, 2, 3)F@, 1, 0, 0) = (-1, 0, 1) F@, 0, 0, 1) = A, -1, -3)

By Proposition 9.4, the image vectors span Im F; hence form the matrix whose rows are these image vectorsand row reduce to echelon form:

\\ / 1 1

1 2

0 0

0 0

Thus A, 1, I) and @, 1, 2) form a basis of Im F; hence dim (Im F) = 2 or, in other words, rank F = 2.

(b) Find a basis and the dimension of the kernel of the map F.

Set F(v) = 0 where v = (x, y, s, t):

    F(x, y, s, t) = (x - y + s + t, x + 2s - t, x + y + 3s - 3t) = (0, 0, 0)

Set corresponding components equal to each other to form the following homogeneous system whose solution space is Ker F:

    x - y +  s +  t = 0        x - y + s +  t = 0        x - y + s +  t = 0
    x     + 2s -  t = 0   or       y + s - 2t = 0   or       y + s - 2t = 0
    x + y + 3s - 3t = 0           2y + 2s - 4t = 0

The free variables are s and t; hence dim (Ker F) = 2 or nullity F = 2. Set:

(i) s = -1, t = 0, to obtain the solution (2, 1, -1, 0);
(ii) s = 0, t = 1, to obtain the solution (1, 2, 0, 1).

Thus (2, 1, -1, 0) and (1, 2, 0, 1) form a basis for Ker F. (Observe that rank F + nullity F = 2 + 2 = 4, which is the dimension of the domain R^4 of F.)
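As a numerical check (an added Python sketch, assuming the numpy library; not part of the original solution), one may verify rank F + nullity F = 4 directly from the matrix whose columns are the images of the usual basis vectors:

    import numpy as np

    # Columns are F(e1), ..., F(e4) for the map F above.
    F = np.array([[1, -1, 1,  1],
                  [1,  0, 2, -1],
                  [1,  1, 3, -3]])

    rank = np.linalg.matrix_rank(F)       # 2
    nullity = F.shape[1] - rank           # 2
    print(rank, nullity, rank + nullity)  # 2 2 4 = dimension of the domain R^4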

Application to Systems of Linear Equations

Consider a system of m linear equations in n unknowns over a field K:

    a11 x1 + a12 x2 + ... + a1n xn = b1
    a21 x1 + a22 x2 + ... + a2n xn = b2
    ......................................
    am1 x1 + am2 x2 + ... + amn xn = bm

which is equivalent to the matrix equation

    Ax = b


where A = (aij) is the coefficient matrix, and x = (xi) and b = (bi) are the column vectors of the unknowns and of the constants, respectively. Now the matrix A may also be viewed as the linear mapping

    A : K^n -> K^m

Thus the solution of the equation Ax = b may be viewed as the preimage of b ∈ K^m under the linear mapping A : K^n -> K^m. Furthermore, the solution of the associated homogeneous equation Ax = 0 may be viewed as the kernel of the linear mapping A : K^n -> K^m.

Theorem 9.5 on linear mappings gives us the following relation:

    dim (Ker A) = dim K^n - dim (Im A) = n - rank A

But n is exactly the number of unknowns in the homogeneous system Ax = 0. Thus we have recovered Theorem 5.20.

Theorem 9.6: The dimension of the solution space W of the homogeneous system of linear equations AX = 0 is n - r, where n is the number of unknowns and r is the rank of the coefficient matrix A.

9.5 SINGULAR AND NONSINGULAR LINEAR MAPPINGS, ISOMORPHISMS

A linear mapping F : V -> U is said to be singular if the image of some nonzero vector under F is 0, i.e., if there exists v ∈ V for which v ≠ 0 but F(v) = 0. Thus F : V -> U is nonsingular if only 0 ∈ V maps into 0 ∈ U or, equivalently, if its kernel consists only of the zero vector: Ker F = {0}. One fundamental property of nonsingular mappings follows (see proof in Problem 9.29).

Theorem 9.7: Suppose a linear mapping F : V -> U is nonsingular. Then the image of any linearly independent set is linearly independent.

Isomorphisms

Suppose a linear mapping F : V -> U is one-to-one. Then only 0 ∈ V can map into 0 ∈ U, and so F is nonsingular. The converse is also true. For suppose F is nonsingular and F(v) = F(w); then

    F(v - w) = F(v) - F(w) = 0

and hence v - w = 0 or v = w. Thus F(v) = F(w) implies v = w; that is, F is one-to-one. Thus we have proven:

Proposition 9.8: A linear mapping F : V -> U is one-to-one if and only if it is nonsingular.

Recall that a mapping F : V -> U is called an isomorphism if F is linear and if F is bijective, i.e., if F is one-to-one and onto. Also, recall that a vector space V is said to be isomorphic to a vector space U, written V ~ U, if there is an isomorphism F : V -> U.

The following theorem, proved in Problem 9.30, applies.

Theorem 9.9: Suppose V has finite dimension and dim V = dim U. Suppose F : V -> U is linear. Then F is an isomorphism if and only if F is nonsingular.

9.6 OPERATIONS WITH LINEAR MAPPINGS

We are able to combine linear mappings in various ways to obtain new linear mappings. These operations are very important and shall be used throughout the text.

CHAP. 9] LINEAR MAPPINGS 321

Suppose F : V -> U and G : V -> U are linear mappings of vector spaces over a field K. We define the sum F + G to be the mapping from V into U which assigns F(v) + G(v) to v ∈ V:

    (F + G)(v) = F(v) + G(v)

Furthermore, for any scalar k ∈ K, we define the product kF to be the mapping from V into U which assigns kF(v) to v ∈ V:

    (kF)(v) = kF(v)

We now show that if F and G are linear, then F + G and kF are also linear. We have, for any vectors v, w ∈ V and any scalars a, b ∈ K,

    (F + G)(av + bw) = F(av + bw) + G(av + bw)
                     = aF(v) + bF(w) + aG(v) + bG(w)
                     = a(F(v) + G(v)) + b(F(w) + G(w))
                     = a(F + G)(v) + b(F + G)(w)

and

    (kF)(av + bw) = kF(av + bw) = k(aF(v) + bF(w))
                  = akF(v) + bkF(w) = a(kF)(v) + b(kF)(w)

Thus F + G and kF are linear. The following theorem applies.

Theorem 9.10: Let V and U be vector spaces over a field K. Then the collection of all linear mappings from V into U with the above operations of addition and scalar multiplication forms a vector space over K.

The space in Theorem 9.10 is usually denoted by

    Hom(V, U)

Here Hom comes from the word homomorphism. In the case that V and U are of finite dimension, we have the following theorem, proved in Problem 9.36.

Theorem 9.11: Suppose dim V = m and dim U = n. Then dim Hom(V, U) = mn.

Composition of Linear Mappings

Now suppose that V, U, and W are vector spaces over the same field K, and that F : V -> U and G : U -> W are linear mappings.

Recall that the composition function G ∘ F is the mapping from V into W defined by (G ∘ F)(v) = G(F(v)). We show that G ∘ F is linear whenever F and G are linear. We have, for any vectors v, w ∈ V and any scalars a, b ∈ K,

    (G ∘ F)(av + bw) = G(F(av + bw))
                     = G(aF(v) + bF(w)) = aG(F(v)) + bG(F(w))
                     = a(G ∘ F)(v) + b(G ∘ F)(w)

That is, G ∘ F is linear.

The composition of linear mappings and the operations of addition and scalar multiplication are related as follows (see proof in Problem 9.37):

Theorem 9.12: Let V, U, and W be vector spaces over K. Let F, F' be linear mappings from V into U and G, G' linear mappings from U into W, and let k ∈ K. Then:

    (i)   G ∘ (F + F') = G ∘ F + G ∘ F'
    (ii)  (G + G') ∘ F = G ∘ F + G' ∘ F
    (iii) k(G ∘ F) = (kG) ∘ F = G ∘ (kF)

9.7 ALGEBRA A(V) OF LINEAR OPERATORS

Let V be a vector space over a field K. We now consider the special case of linear mappings F : V -> V, i.e., from V into itself. They are also called linear operators or linear transformations on V. We will write A(V), instead of Hom(V, V), for the space of all such mappings.

By Theorem 9.10, A(V) is a vector space over K; it is of dimension n^2 if V is of dimension n. Now if F, G ∈ A(V), then the composition G ∘ F exists and is also a linear mapping from V into itself, i.e., G ∘ F ∈ A(V). Thus we have a "multiplication" defined in A(V). [We shall write GF for G ∘ F in the space A(V).]

Remark: An algebra A over a field K is a vector space over K in which an operation of multiplication is defined satisfying, for every F, G, H ∈ A and every k ∈ K:

    (i)   F(G + H) = FG + FH
    (ii)  (G + H)F = GF + HF
    (iii) k(GF) = (kG)F = G(kF)

If the associative law also holds for the multiplication, i.e., if for every F, G, H ∈ A,

    (iv)  (FG)H = F(GH)

then the algebra A is said to be associative.

The above definition of an algebra and Theorems 9.10, 9.11, and 9.12 give us the following basic result.

Theorem 9.13: Let V be a vector space over K. Then A(V) is an associative algebra over K with respect to composition of mappings. If dim V = n, then dim A(V) = n^2.

In view of the above theorem, A(V) is frequently called the algebra of linear operators on V.

Polynomials and Linear Operators

Observe that the identity mapping I : V -> V belongs to A(V). Also, for any T ∈ A(V), we have TI = IT = T. We note that we can also form "powers" of T; we use the notation T^2 = T ∘ T, T^3 = T ∘ T ∘ T, .... Furthermore, for any polynomial

    p(x) = a0 + a1 x + a2 x^2 + ... + an x^n,    ai ∈ K

we can form the operator p(T) defined by

    p(T) = a0 I + a1 T + a2 T^2 + ... + an T^n

(For a scalar k ∈ K, the operator kI is frequently denoted simply as k.) In particular, if p(T) = 0, the zero mapping, then T is said to be a zero of the polynomial p(x).

Example 9.7. Let T : R^3 -> R^3 be defined by T(x, y, z) = (0, x, y). Now if (a, b, c) is any element of R^3, then

    (T + I)(a, b, c) = (0, a, b) + (a, b, c) = (a, a + b, b + c)

and

    T^3(a, b, c) = T^2(0, a, b) = T(0, 0, a) = (0, 0, 0)

Thus we see that T^3 = 0, the zero mapping from V into itself. In other words, T is a zero of the polynomial p(x) = x^3.
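In matrix form (an added Python sketch, assuming the numpy library), T corresponds to a nilpotent matrix, and the relation T^3 = 0 can be checked directly:

    import numpy as np

    # Matrix of T(x, y, z) = (0, x, y) relative to the usual basis of R^3.
    T = np.array([[0, 0, 0],
                  [1, 0, 0],
                  [0, 1, 0]])

    print(np.linalg.matrix_power(T, 3))  # the zero matrix: T is a zero of p(x) = x^3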

9.8 INVERTIBLE OPERATORS

A linear operator T : V -> V is said to be invertible if it has an inverse, i.e., if there exists T^{-1} ∈ A(V) such that TT^{-1} = T^{-1}T = I.

Now T is invertible if and only if it is one-to-one and onto. Thus in particular, if T is invertible, then only 0 ∈ V can map into 0, i.e., T is nonsingular. The converse is not true in general, as seen by the following example.

Example 9.8. Let V be the vector space of polynomials over K, and let T be the operator on V defined by

    T(a0 + a1 t + ... + as t^s) = a0 t + a1 t^2 + ... + as t^{s+1}    (s = 0, 1, 2, ...)

i.e., T increases the exponent of t in each term by 1. Now T is a linear mapping and is nonsingular. However, T is not onto and so is not invertible.

The space V of Example 9.8 is infinite-dimensional. The situation changes significantly when V has

finite dimension. Specifically, the following theorem applies.

Theorem 9.14: Suppose T is a linear operator on a finite-dimensional vector space V. Then the following four conditions are equivalent:

    (i)   T is nonsingular, i.e., Ker T = {0}.
    (ii)  T is injective, i.e., one-to-one.
    (iii) T is surjective, i.e., onto.
    (iv)  T is invertible, i.e., one-to-one and onto.

Proposition 9.8 tells us that (i) and (ii) are equivalent. Thus, to prove the theorem, we need only show that (i) and (iii) are equivalent. [It will then follow that (iv) is equivalent to the others.] By Theorem 9.5,

    dim V = dim (Im T) + dim (Ker T)

If T is nonsingular, then dim (Ker T) = 0 and so dim V = dim (Im T). This means that V = Im T, or T is surjective. Thus (i) implies (iii). Conversely, suppose T is surjective. Then V = Im T and so dim V = dim (Im T). This means that dim (Ker T) = 0 and hence T is nonsingular. Thus (iii) implies (i). Accordingly, the theorem is proved. (The proof of Theorem 9.9 is identical to this proof.)

Example 9.9. Let T be the operator on R^2 defined by T(x, y) = (y, 2x - y). The kernel of T is {(0, 0)}; hence T is nonsingular and, by Theorem 9.14, invertible. We now find a formula for T^{-1}. Suppose (s, t) is the image of (x, y) under T; then (x, y) is the image of (s, t) under T^{-1}: that is, T(x, y) = (s, t) and T^{-1}(s, t) = (x, y). We have

    T(x, y) = (y, 2x - y) = (s, t)    and so    y = s,  2x - y = t

Solving for x and y in terms of s and t, we obtain x = s/2 + t/2, y = s. Thus T^{-1} is given by the formula

    T^{-1}(s, t) = (s/2 + t/2, s)
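The formula for T^{-1} can also be checked numerically (an added Python sketch, assuming the numpy library):

    import numpy as np

    # Matrix of T(x, y) = (y, 2x - y) relative to the usual basis of R^2.
    T = np.array([[0,  1],
                  [2, -1]])

    print(np.linalg.inv(T))  # [[0.5, 0.5], [1.0, 0.0]], i.e., T^{-1}(s, t) = (s/2 + t/2, s)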

Applications to Systems of Linear Equations

Consider a system of linear equations over K and suppose the system has the same number of equations as unknowns, say n. We can represent this system by the matrix equation

    Ax = b    (*)

where A is an n-square matrix over K which we view as a linear operator on K^n. Suppose the matrix A is nonsingular, i.e., the matrix equation Ax = 0 has only the zero solution. Then the linear mapping A is one-to-one and onto. This means that the system (*) has a unique solution for any b ∈ K^n. On the other hand, suppose the matrix A is singular, i.e., the matrix equation Ax = 0 has a nonzero solution. Then the linear mapping A is not onto. This means that there exist b ∈ K^n for which (*) does not have a solution. Furthermore, if a solution exists, it is not unique. [Review Problem 1.63(ii).] Thus we have proven the following fundamental result:

Theorem 9.15: Consider the following system of linear equations with the same number of equations as unknowns:

    a11 x1 + a12 x2 + ... + a1n xn = b1
    a21 x1 + a22 x2 + ... + a2n xn = b2
    ......................................
    an1 x1 + an2 x2 + ... + ann xn = bn

(a) If the corresponding homogeneous system has only the zero solution, then the above system has a unique solution for any values of the bi.

(b) If the corresponding homogeneous system has a nonzero solution, then: (i) there are values for the bi for which the above system does not have a solution; (ii) whenever a solution of the above system exists, it is not unique.
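Theorem 9.15(b) can be illustrated with a singular coefficient matrix (an added Python sketch, assuming the numpy library; the same system reappears in Problem 9.42): consistency depends on b, and a rank test detects it.

    import numpy as np

    A = np.array([[2.0, 4.0],
                  [3.0, 6.0]])          # singular: Ax = 0 has nonzero solutions, e.g., (2, -1)

    b1 = np.array([8.0, 12.0])
    b2 = np.array([1.0, 2.0])

    print(np.linalg.matrix_rank(A))                         # 1
    print(np.linalg.matrix_rank(np.column_stack((A, b1))))  # 1: Ax = b1 is consistent (not unique)
    print(np.linalg.matrix_rank(np.column_stack((A, b2))))  # 2: Ax = b2 has no solution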

Solved Problems

MAPPINGS

9.1. State whether or not each diagram in Fig. 9-3 defines a mapping from A = {a, b, c} into B = {x, y, z}.

(a) No. There is nothing assigned to the element b ∈ A.
(b) No. Two elements, x and z, are assigned to c ∈ A.
(c) Yes.

9.2. Let the mappings f : A -> B and g : B -> C be defined by the diagram in Fig. 9-4. Find: (a) the composition mapping (g ∘ f) : A -> C, and (b) the image of the mappings f, g, and g ∘ f.

Fig. 9-4


(a) We use the definition of the composition mapping to compute:

    (g ∘ f)(a) = g(f(a)) = g(y) = t
    (g ∘ f)(b) = g(f(b)) = g(x) = s
    (g ∘ f)(c) = g(f(c)) = g(y) = t

Observe that we arrive at the same answer if we "follow the arrows" in Fig. 9-4:

    a -> y -> t,    b -> x -> s,    c -> y -> t

(b) By Fig. 9-4, the image values under the mapping f are x and y, and the image values under g are r, s, and t; hence

    Im f = {x, y}    and    Im g = {r, s, t}

Also, by (a), the image values under the composition mapping g ∘ f are t and s; accordingly, Im (g ∘ f) = {s, t}. Note that the images of g and g ∘ f are different.

9.3. Consider the mapping F : R^3 -> R^2 defined by F(x, y, z) = (yz, x^2). Find:

(a) F(2, 3, 4); (b) F(5, -2, 7); (c) F^{-1}(0, 0), i.e., all vectors v ∈ R^3 such that F(v) = 0.

(a) Substitute in the formula for F to get F(2, 3, 4) = (3 · 4, 2^2) = (12, 4).

(b) F(5, -2, 7) = (-2 · 7, 5^2) = (-14, 25).

(c) Set F(v) = 0 where v = (x, y, z), and then solve for x, y, z:

    F(x, y, z) = (yz, x^2) = (0, 0)    or    yz = 0 and x^2 = 0

Thus x = 0 and either y = 0 or z = 0. In other words, x = 0, y = 0 or x = 0, z = 0. Accordingly, v lies on the z axis or the y axis.

9.4. Consider the mapping G : R^3 -> R^2 defined by G(x, y, z) = (x + 2y - 4z, 2x + 3y + z). Find G^{-1}(3, 4).

Set G(x, y, z) = (3, 4) to get the system

    x + 2y - 4z = 3        x + 2y - 4z = 3        x + 2y - 4z = 3
    2x + 3y + z = 4   or      -y + 9z = -2   or        y - 9z = 2

Here z is a free variable. Set z = a, a ∈ R, to obtain the general solution

    x = -14a - 1,    y = 9a + 2,    z = a

In other words, G^{-1}(3, 4) = {(-14a - 1, 9a + 2, a)}.

9.5. Consider the mapping F : R^2 -> R^2 defined by F(x, y) = (3y, 2x). Let S be the unit circle in R^2, that is, the solution set of x^2 + y^2 = 1. (a) Describe F(S). (b) Find F^{-1}(S).

(a) Let (a, b) be an element of F(S). Then there exists (x, y) ∈ S such that F(x, y) = (a, b). Hence:

    (3y, 2x) = (a, b)    or    3y = a, 2x = b    or    y = a/3, x = b/2

Since (x, y) ∈ S, that is, x^2 + y^2 = 1, we have

    (b/2)^2 + (a/3)^2 = 1    or    a^2/9 + b^2/4 = 1

Thus F(S) is an ellipse.

(b) Let F(x, y) = (a, b) where (a, b) ∈ S. Then (3y, 2x) = (a, b) or 3y = a, 2x = b. Since (a, b) ∈ S, we have a^2 + b^2 = 1. Thus (3y)^2 + (2x)^2 = 1. Accordingly, F^{-1}(S) is the ellipse 4x^2 + 9y^2 = 1.

9.6. Let the mappings f and g be defined by f(x) = 2x + 1 and g(x) = x^2 - 2. Compute formulas for the mappings (a) g ∘ f, (b) f ∘ g, and (c) g ∘ g (sometimes denoted by g^2).

(a) Compute the formula for g ∘ f as follows:

    (g ∘ f)(x) = g(f(x)) = g(2x + 1) = (2x + 1)^2 - 2 = 4x^2 + 4x - 1

Observe that the same answer can be found by writing

    y = f(x) = 2x + 1    and    z = g(y) = y^2 - 2

and then eliminating y as follows: z = y^2 - 2 = (2x + 1)^2 - 2 = 4x^2 + 4x - 1.

(b) (f ∘ g)(x) = f(g(x)) = f(x^2 - 2) = 2(x^2 - 2) + 1 = 2x^2 - 3

(c) (g ∘ g)(x) = g(g(x)) = g(x^2 - 2) = (x^2 - 2)^2 - 2 = x^4 - 4x^2 + 2

9.7. Suppose f : A -> B and g : B -> C; hence the composition (g ∘ f) : A -> C exists. Prove:

(a) If f and g are one-to-one, then g ∘ f is one-to-one.
(b) If f and g are onto mappings, then g ∘ f is an onto mapping.
(c) If g ∘ f is one-to-one, then f is one-to-one.
(d) If g ∘ f is an onto mapping, then g is an onto mapping.

(a) Suppose (g ∘ f)(x) = (g ∘ f)(y). Then g(f(x)) = g(f(y)). Since g is one-to-one, f(x) = f(y). Since f is one-to-one, x = y. We have proven that (g ∘ f)(x) = (g ∘ f)(y) implies x = y; hence g ∘ f is one-to-one.

(b) Suppose c ∈ C. Since g is onto, there exists b ∈ B for which g(b) = c. Since f is onto, there exists a ∈ A for which f(a) = b. Thus (g ∘ f)(a) = g(f(a)) = g(b) = c; hence g ∘ f is onto.

(c) Suppose f is not one-to-one. Then there exist distinct elements x, y ∈ A for which f(x) = f(y). Thus (g ∘ f)(x) = g(f(x)) = g(f(y)) = (g ∘ f)(y); hence g ∘ f is not one-to-one. Therefore, if g ∘ f is one-to-one, then f must be one-to-one.

(d) If a ∈ A, then (g ∘ f)(a) = g(f(a)) ∈ g(B); hence (g ∘ f)(A) ⊆ g(B). Suppose g is not onto. Then g(B) is properly contained in C and so (g ∘ f)(A) is properly contained in C; thus g ∘ f is not onto. Accordingly, if g ∘ f is onto, then g must be onto.

9.8. Prove that a mapping f : A -> B has an inverse if and only if it is one-to-one and onto.

Suppose f has an inverse function f^{-1} : B -> A; hence f^{-1} ∘ f = 1_A and f ∘ f^{-1} = 1_B. Since 1_A is one-to-one, f is one-to-one by Problem 9.7(c), and since 1_B is onto, f is onto by Problem 9.7(d). That is, f is both one-to-one and onto.

Now suppose f is both one-to-one and onto. Then each b ∈ B is the image of a unique element in A, say b'. Thus if f(a) = b, then a = b'; hence f(b') = b. Now let g denote the mapping from B to A defined by b ↦ b'. We have:

(i) (g ∘ f)(a) = g(f(a)) = g(b) = b' = a for every a ∈ A; hence g ∘ f = 1_A.
(ii) (f ∘ g)(b) = f(g(b)) = f(b') = b for every b ∈ B; hence f ∘ g = 1_B.

Accordingly, f has an inverse, namely the mapping g.

LINEAR MAPPINGS

9.9. Show that the following mapping is linear: F : R^3 -> R defined by F(x, y, z) = 2x - 3y + 4z.

Let v = (a, b, c) and w = (a', b', c'); hence

    v + w = (a + a', b + b', c + c')    and    kv = (ka, kb, kc),  k ∈ R

We have F(v) = 2a - 3b + 4c and F(w) = 2a' - 3b' + 4c'. Thus

    F(v + w) = F(a + a', b + b', c + c') = 2(a + a') - 3(b + b') + 4(c + c')
             = (2a - 3b + 4c) + (2a' - 3b' + 4c') = F(v) + F(w)

and

    F(kv) = F(ka, kb, kc) = 2ka - 3kb + 4kc = k(2a - 3b + 4c) = kF(v)

Accordingly, F is linear.


9.10. A mapping F : R^2 -> R^3 is defined by F(x, y) = (x + 1, 2y, x + y). Is F linear?

Since F(0, 0) = (1, 0, 0) ≠ (0, 0, 0), F cannot be linear.

9.11. Let V be the vector space of n-square matrices over K. Let M be an arbitrary but fixed matrix in V. Let T : V -> V be defined by T(A) = AM + MA, where A ∈ V. Show that T is linear.

For any A, B ∈ V and any k ∈ K, we have

    T(A + B) = (A + B)M + M(A + B) = AM + BM + MA + MB
             = (AM + MA) + (BM + MB) = T(A) + T(B)

and

    T(kA) = (kA)M + M(kA) = k(AM) + k(MA) = k(AM + MA) = kT(A)

Accordingly, T is linear.

9.12. Prove Theorem 9.2.

There are three steps to the proof of the theorem: (1) Define the mapping F : V -> U such that F(vi) = ui, i = 1, ..., n. (2) Show that F is linear. (3) Show that F is unique.

Step 1. Let v ∈ V. Since {v1, ..., vn} is a basis of V, there exist unique scalars a1, ..., an ∈ K for which v = a1 v1 + a2 v2 + ... + an vn. We define F : V -> U by

    F(v) = a1 u1 + a2 u2 + ... + an un

(Since the ai are unique, the mapping F is well-defined.) Now, for i = 1, ..., n,

    vi = 0v1 + ... + 1vi + ... + 0vn

Hence

    F(vi) = 0u1 + ... + 1ui + ... + 0un = ui

Thus the first step of the proof is complete.

Step 2. Suppose v = a1 v1 + a2 v2 + ... + an vn and w = b1 v1 + b2 v2 + ... + bn vn. Then

    v + w = (a1 + b1)v1 + (a2 + b2)v2 + ... + (an + bn)vn

and, for any k ∈ K, kv = ka1 v1 + ka2 v2 + ... + kan vn. By definition of the mapping F,

    F(v) = a1 u1 + a2 u2 + ... + an un    and    F(w) = b1 u1 + b2 u2 + ... + bn un

Hence

    F(v + w) = (a1 + b1)u1 + (a2 + b2)u2 + ... + (an + bn)un
             = (a1 u1 + a2 u2 + ... + an un) + (b1 u1 + b2 u2 + ... + bn un) = F(v) + F(w)

and

    F(kv) = ka1 u1 + ka2 u2 + ... + kan un = k(a1 u1 + a2 u2 + ... + an un) = kF(v)

Thus F is linear.

Step 3. Suppose G : V -> U is linear and G(vi) = ui, i = 1, ..., n. If

    v = a1 v1 + a2 v2 + ... + an vn

then

    G(v) = G(a1 v1 + a2 v2 + ... + an vn) = a1 G(v1) + a2 G(v2) + ... + an G(vn)
         = a1 u1 + a2 u2 + ... + an un = F(v)

Since G(v) = F(v) for every v ∈ V, G = F. Thus F is unique and the theorem is proved.

9.13. Let F : R^2 -> R^2 be the linear mapping for which

    F(1, 2) = (2, 3)    and    F(0, 1) = (1, 4)

[Since (1, 2) and (0, 1) form a basis of R^2, such a linear map F exists and is unique by Theorem 9.2.] Find a formula for F, that is, find F(a, b).

Write (a, b) as a linear combination of (1, 2) and (0, 1) using unknowns x and y:

    (a, b) = x(1, 2) + y(0, 1) = (x, 2x + y)    so    a = x,  b = 2x + y

Solve for x and y in terms of a and b to get x = a, y = -2a + b. Then

    F(a, b) = xF(1, 2) + yF(0, 1) = a(2, 3) + (-2a + b)(1, 4) = (b, -5a + 4b)

9.14. Let T : V -> U be linear, and suppose v1, ..., vn ∈ V have the property that their images T(v1), ..., T(vn) are linearly independent. Show that the vectors v1, ..., vn are also linearly independent.

Suppose that, for scalars a1, ..., an, a1 v1 + a2 v2 + ... + an vn = 0. Then

    0 = T(0) = T(a1 v1 + a2 v2 + ... + an vn) = a1 T(v1) + a2 T(v2) + ... + an T(vn)

Since the T(vi) are linearly independent, all the ai = 0. Thus the vectors v1, ..., vn are linearly independent.

9.15. Suppose the linear mapping F : V -> U is one-to-one and onto. Show that the inverse mapping F^{-1} : U -> V is also linear.

Suppose u, u' ∈ U. Since F is one-to-one and onto, there exist unique vectors v, v' ∈ V for which F(v) = u and F(v') = u'. Since F is linear, we also have

    F(v + v') = F(v) + F(v') = u + u'    and    F(kv) = kF(v) = ku

By definition of the inverse mapping, F^{-1}(u) = v, F^{-1}(u') = v', F^{-1}(u + u') = v + v', and F^{-1}(ku) = kv. Then

    F^{-1}(u + u') = v + v' = F^{-1}(u) + F^{-1}(u')    and    F^{-1}(ku) = kv = kF^{-1}(u)

and thus F^{-1} is linear.

IMAGE AND KERNEL OF LINEAR MAPPINGS

9.16. Let F : R^5 -> R^3 be the linear mapping defined by

    F(x, y, z, s, t) = (x + 2y + z - 3s + 4t, 2x + 5y + 4z - 5s + 5t, x + 4y + 5z - s - 2t)

Find a basis and the dimension of the image of F.

Find the images of the usual basis vectors of R^5:

    F(1, 0, 0, 0, 0) = (1, 2, 1)        F(0, 1, 0, 0, 0) = (2, 5, 4)        F(0, 0, 1, 0, 0) = (1, 4, 5)
    F(0, 0, 0, 1, 0) = (-3, -5, -1)     F(0, 0, 0, 0, 1) = (4, 5, -2)

By Proposition 9.4, the image vectors span Im F; hence form the matrix whose rows are these image vectors, and row reduce to echelon form:

    [  1  2  1 ]        [ 1  2  1 ]        [ 1  2  1 ]
    [  2  5  4 ]        [ 0  1  2 ]        [ 0  1  2 ]
    [  1  4  5 ]   ~    [ 0  2  4 ]   ~    [ 0  0  0 ]
    [ -3 -5 -1 ]        [ 0  1  2 ]        [ 0  0  0 ]
    [  4  5 -2 ]        [ 0 -3 -6 ]        [ 0  0  0 ]

Thus (1, 2, 1) and (0, 1, 2) form a basis of Im F, and so dim (Im F) = 2.
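The row reduction above can be reproduced mechanically (an added Python sketch, assuming the sympy library; rref returns the fully reduced form, whose nonzero rows give an equivalent basis of Im F):

    from sympy import Matrix

    # Rows are the images of the usual basis vectors of R^5 under F.
    M = Matrix([[ 1,  2,  1],
                [ 2,  5,  4],
                [ 1,  4,  5],
                [-3, -5, -1],
                [ 4,  5, -2]])

    print(M.rank())     # 2 = dim (Im F)
    print(M.rref()[0])  # nonzero rows (1, 0, -3) and (0, 1, 2) also span Im F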

9.17. Let G : R^3 -> R^3 be the linear mapping defined by G(x, y, z) = (x + 2y - z, y + z, x + y - 2z). Find a basis and the dimension of the kernel of G.

Set G(v) = 0 where v = (x, y, z):

    G(x, y, z) = (x + 2y - z, y + z, x + y - 2z) = (0, 0, 0)

Set corresponding components equal to each other to form the homogeneous system whose solution space is the kernel W of G:

    x + 2y -  z = 0        x + 2y - z = 0        x + 2y - z = 0
         y +  z = 0   or       y + z = 0   or        y + z = 0
    x +  y - 2z = 0           -y - z = 0

The only free variable is z; hence dim W = 1. Let z = 1; then y = -1 and x = 3. Thus (3, -1, 1) forms a basis for Ker G.

9.18. Consider the matrix mapping A : R^4 -> R^3 where

    A = [ 1  2   3   1 ]
        [ 1  3   5  -2 ]
        [ 3  8  13  -3 ]

Find a basis and the dimension of (a) the image of A, and (b) the kernel of A.

(a) The column space of A is equal to Im A. Thus reduce A^T to echelon form:

    [ 1  1   3 ]        [ 1  1  3 ]        [ 1  1  3 ]
    [ 2  3   8 ]   ~    [ 0  1  2 ]   ~    [ 0  1  2 ]
    [ 3  5  13 ]        [ 0  2  4 ]        [ 0  0  0 ]
    [ 1 -2  -3 ]        [ 0 -3 -6 ]        [ 0  0  0 ]

Thus {(1, 1, 3), (0, 1, 2)} is a basis of Im A and dim (Im A) = 2.

(b) Here Ker A is the solution space of the homogeneous system AX = 0 where X = (x, y, z, t)^T. Thus reduce the matrix A of coefficients to echelon form:

    [ 1  2   3   1 ]        [ 1  2  3   1 ]
    [ 1  3   5  -2 ]   ~    [ 0  1  2  -3 ]
    [ 3  8  13  -3 ]        [ 0  0  0   0 ]

The free variables are z and t. Thus dim (Ker A) = 2. Set:

(i) z = 1, t = 0, to get the solution (1, -2, 1, 0);
(ii) z = 0, t = 1, to get the solution (-7, 3, 0, 1).

Thus (1, -2, 1, 0) and (-7, 3, 0, 1) form a basis for Ker A.
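As a check (an added Python sketch, assuming the sympy library), the same bases may be obtained directly:

    from sympy import Matrix

    A = Matrix([[1, 2,  3,  1],
                [1, 3,  5, -2],
                [3, 8, 13, -3]])

    print(A.columnspace())  # spans Im A;  compare with (1, 1, 3), (0, 1, 2)
    print(A.nullspace())    # spans Ker A; compare with (1, -2, 1, 0), (-7, 3, 0, 1)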

9.19. Consider the matrix map B : R^3 -> R^3 where

    B = [  1   2   5 ]
        [  3   5  13 ]
        [ -2  -1  -4 ]

Find the dimension and a basis for (a) the kernel of B, and (b) the image of B.

(a) Reduce B to echelon form to get the homogeneous system corresponding to Ker B:

    [  1   2   5 ]        [ 1  2  5 ]
    [  3   5  13 ]   ~    [ 0  1  2 ]        or        x + 2y + 5z = 0
    [ -2  -1  -4 ]        [ 0  0  0 ]                       y + 2z = 0

There is one free variable z, so dim (Ker B) = 1. Set z = 1 to get the solution (-1, -2, 1), which forms a basis of Ker B.

(b) Reduce B^T to echelon form:

    [ 1   3  -2 ]        [ 1  3  -2 ]
    [ 2   5  -1 ]   ~    [ 0  1  -3 ]
    [ 5  13  -4 ]        [ 0  0   0 ]

Thus (1, 3, -2) and (0, 1, -3) form a basis of Im B.

9.20. Find a linear map F : R^3 -> R^4 whose image is spanned by (1, 2, 0, -4) and (2, 0, -1, -3).

Method 1. Consider the usual basis of R^3: e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1). Set

    F(e1) = (1, 2, 0, -4),    F(e2) = (2, 0, -1, -3),    F(e3) = (0, 0, 0, 0)

By Theorem 9.2, such a linear map F exists and is unique. Furthermore, the image of F is spanned by the F(ei); hence F has the required property. We find a general formula for F(x, y, z):

    F(x, y, z) = F(xe1 + ye2 + ze3) = xF(e1) + yF(e2) + zF(e3)
               = x(1, 2, 0, -4) + y(2, 0, -1, -3) + z(0, 0, 0, 0)
               = (x + 2y, 2x, -y, -4x - 3y)

Method 2. Form a 4 x 3 matrix A whose columns consist only of the given vectors; say,

    A = [  1   2   2 ]
        [  2   0   0 ]
        [  0  -1  -1 ]
        [ -4  -3  -3 ]

Recall that A determines a linear map A : R^3 -> R^4 whose image is spanned by the columns of A. Thus A satisfies the required condition.

9.21. Let V be the vector space of 2 by 2 matrices over R and let M = [ 1  2 ]
                                                                      [ 0  3 ].
Let F : V -> V be the linear map defined by F(A) = AM - MA. Find a basis and the dimension of the kernel W of F.

We seek all matrices [ x  y ]
                     [ s  t ]  whose image under F is the zero matrix. Compute:

    F([ x  y ]) = [ x  y ][ 1  2 ] - [ 1  2 ][ x  y ]
      ([ s  t ])   [ s  t ][ 0  3 ]   [ 0  3 ][ s  t ]

                = [ x  2x + 3y ] - [ x + 2s  y + 2t ] = [ -2s  2x + 2y - 2t ]
                  [ s  2s + 3t ]   [   3s      3t   ]   [ -2s      2s       ]

Thus

    2x + 2y - 2t = 0        x + y - t = 0
              2s = 0   or           s = 0

The free variables are y and t; hence dim W = 2. To obtain a basis of W set:

(a) y = -1, t = 0, to obtain the solution x = 1, y = -1, s = 0, t = 0;
(b) y = 0, t = 1, to obtain the solution x = 1, y = 0, s = 0, t = 1.

Thus the matrices

    [ 1  -1 ]        and        [ 1  0 ]
    [ 0   0 ]                   [ 0  1 ]

form a basis of W.


9.22. Prove Theorem 9.3. If F : V -> U is linear, then Im F and Ker F are subspaces.

(a) Since F(0) = 0, we have 0 ∈ Im F. Now suppose u, u' ∈ Im F and a, b ∈ K. Since u and u' belong to the image of F, there exist vectors v, v' ∈ V such that F(v) = u and F(v') = u'. Then

    F(av + bv') = aF(v) + bF(v') = au + bu' ∈ Im F

Thus the image of F is a subspace of U.

(b) Since F(0) = 0, we have 0 ∈ Ker F. Now suppose v, w ∈ Ker F and a, b ∈ K. Since v and w belong to the kernel of F, F(v) = 0 and F(w) = 0. Thus

    F(av + bw) = aF(v) + bF(w) = a0 + b0 = 0        and so        av + bw ∈ Ker F

Thus the kernel of F is a subspace of V.

9.23. Prove Theorem 9.5. dim V = dim (Ker F) + dim (Im F).

Suppose dim (Ker F) = r and {w1, ..., wr} is a basis of Ker F, and suppose dim (Im F) = s and {u1, ..., us} is a basis of Im F. (By Proposition 9.4, Im F has finite dimension.) Since uj ∈ Im F, there exist vectors v1, ..., vs in V such that F(v1) = u1, ..., F(vs) = us. We claim that the set

    B = {w1, ..., wr, v1, ..., vs}

is a basis of V, that is, (i) B spans V, and (ii) B is linearly independent. Once we prove (i) and (ii), then dim V = r + s = dim (Ker F) + dim (Im F).

(i) B spans V. Let v ∈ V. Then F(v) ∈ Im F. Since the uj span Im F, there exist scalars a1, ..., as such that F(v) = a1 u1 + ... + as us. Set v' = a1 v1 + ... + as vs - v. Then

    F(v') = F(a1 v1 + ... + as vs - v) = a1 F(v1) + ... + as F(vs) - F(v)
          = a1 u1 + ... + as us - F(v) = 0

Thus v' ∈ Ker F. Since the wi span Ker F, there exist scalars b1, ..., br such that

    v' = b1 w1 + ... + br wr = a1 v1 + ... + as vs - v

Accordingly,

    v = a1 v1 + ... + as vs - b1 w1 - ... - br wr

Thus B spans V.

(ii) B is linearly independent. Suppose

    x1 w1 + ... + xr wr + y1 v1 + ... + ys vs = 0    (1)

where xi, yj ∈ K. Then

    0 = F(0) = F(x1 w1 + ... + xr wr + y1 v1 + ... + ys vs)
             = x1 F(w1) + ... + xr F(wr) + y1 F(v1) + ... + ys F(vs)    (2)

But F(wi) = 0 since wi ∈ Ker F, and F(vj) = uj. Substitution in (2) gives y1 u1 + ... + ys us = 0. Since the uj are linearly independent, each yj = 0. Substitution in (1) gives x1 w1 + ... + xr wr = 0. Since the wi are linearly independent, each xi = 0. Thus B is linearly independent.

9.24. Suppose F : V -> U and G : U -> W are linear. Prove:

(a) rank (G ∘ F) ≤ rank G;  (b) rank (G ∘ F) ≤ rank F.

(a) Since F(V) ⊆ U, we also have G(F(V)) ⊆ G(U) and so dim (G(F(V))) ≤ dim (G(U)). Then

    rank (G ∘ F) = dim ((G ∘ F)(V)) = dim (G(F(V))) ≤ dim (G(U)) = rank G

(b) We have dim (G(F(V))) ≤ dim (F(V)). Hence

    rank (G ∘ F) = dim ((G ∘ F)(V)) = dim (G(F(V))) ≤ dim (F(V)) = rank F


9.25. Suppose f : V -> U is linear with kernel W, and that f(v) = u. Show that the "coset" v + W = {v + w : w ∈ W} is the preimage of u; that is, f^{-1}(u) = v + W.

We must prove that (i) f^{-1}(u) ⊆ v + W and (ii) v + W ⊆ f^{-1}(u).

We first prove (i). Suppose v' ∈ f^{-1}(u). Then f(v') = u and so

    f(v' - v) = f(v') - f(v) = u - u = 0

that is, v' - v ∈ W. Thus v' = v + (v' - v) ∈ v + W and hence f^{-1}(u) ⊆ v + W.

Now we prove (ii). Suppose v' ∈ v + W. Then v' = v + w where w ∈ W. Since W is the kernel of f, f(w) = 0. Accordingly,

    f(v') = f(v + w) = f(v) + f(w) = f(v) + 0 = f(v) = u

Thus v' ∈ f^{-1}(u) and so v + W ⊆ f^{-1}(u).

SINGULAR AND NONSINGULAR LINEAR MAPPINGS, ISOMORPHISMS

9.26. Determine whether or not each linear map is nonsingular. If not, find a nonzero vector v whose image is 0.

(a) F : R^2 -> R^2 defined by F(x, y) = (x - y, x - 2y).
(b) G : R^2 -> R^2 defined by G(x, y) = (2x - 4y, 3x - 6y).

(a) Find Ker F by setting F(v) = 0 where v = (x, y):

    (x - y, x - 2y) = (0, 0)    or    x -  y = 0        or    x - y = 0
                                      x - 2y = 0                  y = 0

The only solution is x = 0, y = 0; hence F is nonsingular.

(b) Set G(x, y) = (0, 0) to find Ker G:

    (2x - 4y, 3x - 6y) = (0, 0)    or    2x - 4y = 0        or    x - 2y = 0
                                         3x - 6y = 0

The system has nonzero solutions since y is a free variable; hence G is singular. Let y = 1 to obtain the solution v = (2, 1), which is a nonzero vector such that G(v) = 0.

9.27. Let H : R^3 -> R^3 be defined by H(x, y, z) = (x + y - 2z, x + 2y + z, 2x + 2y - 3z). (a) Show that H is nonsingular. (b) Find a formula for H^{-1}.

(a) Set H(x, y, z) = (0, 0, 0); that is, set

    (x + y - 2z, x + 2y + z, 2x + 2y - 3z) = (0, 0, 0)

This yields the homogeneous system

    x +  y - 2z = 0        x + y - 2z = 0
    x + 2y +  z = 0   or       y + 3z = 0
    2x + 2y - 3z = 0               z = 0

The echelon system is in triangular form, so the only solution is x = 0, y = 0, z = 0. Thus H is nonsingular.

(b) Set H(x, y, z) = (a, b, c) and then solve for x, y, z in terms of a, b, c:

    x +  y - 2z = a        x + y - 2z = a
    x + 2y +  z = b   or       y + 3z = b - a
    2x + 2y - 3z = c               z = c - 2a

Solving for x, y, z yields x = -8a - b + 5c, y = 5a + b - 3c, z = -2a + c. Thus

    H^{-1}(a, b, c) = (-8a - b + 5c, 5a + b - 3c, -2a + c)

or, replacing a, b, c by x, y, z, respectively,

    H^{-1}(x, y, z) = (-8x - y + 5z, 5x + y - 3z, -2x + z)

9.28. Suppose F : V -> U is linear and that V is of finite dimension. Show that V and the image of F have the same dimension if and only if F is nonsingular. Determine all nonsingular linear mappings T : R^4 -> R^3.

By Theorem 9.5, dim V = dim (Im F) + dim (Ker F). Hence V and Im F have the same dimension if and only if dim (Ker F) = 0 or Ker F = {0}, i.e., if and only if F is nonsingular.

Since dim R^3 is less than dim R^4, dim (Im T) is less than the dimension of the domain R^4 of T. Accordingly, no linear mapping T : R^4 -> R^3 can be nonsingular.

9.29. Prove Theorem 9.7.

Suppose v1, v2, ..., vn are linearly independent vectors in V. We claim that F(v1), F(v2), ..., F(vn) are also linearly independent. Suppose a1 F(v1) + a2 F(v2) + ... + an F(vn) = 0, where ai ∈ K. Since F is linear, F(a1 v1 + a2 v2 + ... + an vn) = 0; hence

    a1 v1 + a2 v2 + ... + an vn ∈ Ker F

But F is nonsingular, i.e., Ker F = {0}; hence a1 v1 + a2 v2 + ... + an vn = 0. Since the vi are linearly independent, all the ai are 0. Accordingly, the F(vi) are linearly independent. Thus the theorem is proved.

9.30. Prove Theorem 9.9.

If F is an isomorphism, then only 0 maps to 0, so F is nonsingular. Conversely, suppose F is nonsingular. Then dim (Ker F) = 0. By Theorem 9.5, dim V = dim (Ker F) + dim (Im F). Thus dim U = dim V = dim (Im F). Since U has finite dimension, Im F = U and so F is surjective. Thus F is both one-to-one and onto, i.e., F is an isomorphism.

OPERATIONS WITH LINEAR MAPPINGS

9.31. Let F : R^3 -> R^2 and G : R^3 -> R^2 be defined by F(x, y, z) = (2x, y + z) and G(x, y, z) = (x - z, y), respectively. Find formulas defining the maps (a) F + G, (b) 3F, and (c) 2F - 5G.

(a) (F + G)(x, y, z) = F(x, y, z) + G(x, y, z) = (2x, y + z) + (x - z, y) = (3x - z, 2y + z)

(b) (3F)(x, y, z) = 3F(x, y, z) = 3(2x, y + z) = (6x, 3y + 3z)

(c) (2F - 5G)(x, y, z) = 2F(x, y, z) - 5G(x, y, z) = 2(2x, y + z) - 5(x - z, y)
                       = (4x, 2y + 2z) + (-5x + 5z, -5y) = (-x + 5z, -3y + 2z)

9.32. Let F : R^3 -> R^2 and G : R^2 -> R^2 be defined by F(x, y, z) = (2x, y + z) and G(x, y) = (y, x), respectively. Derive formulas defining the mappings (a) G ∘ F, (b) F ∘ G.

(a) (G ∘ F)(x, y, z) = G(F(x, y, z)) = G(2x, y + z) = (y + z, 2x)

(b) The mapping F ∘ G is not defined, since the image of G is not contained in the domain of F.

9.33. Prove: (a) The zero mapping 0, defined by 0(v) = 0 ∈ U for every v ∈ V, is the zero element of Hom(V, U). (b) The negative of F ∈ Hom(V, U) is the mapping (-1)F, i.e., -F = (-1)F.

(a) Let F ∈ Hom(V, U). Then, for every v ∈ V,

    (F + 0)(v) = F(v) + 0(v) = F(v) + 0 = F(v)

Since (F + 0)(v) = F(v) for every v ∈ V, F + 0 = F.

(b) For every v ∈ V,

    (F + (-1)F)(v) = F(v) + (-1)F(v) = F(v) - F(v) = 0 = 0(v)

Since (F + (-1)F)(v) = 0(v) for every v ∈ V, F + (-1)F = 0. Thus (-1)F is the negative of F.

9.34. Suppose F1, F2, ..., Fn are linear maps from V into U. Show that, for any scalars a1, a2, ..., an, and for any v ∈ V,

    (a1 F1 + a2 F2 + ... + an Fn)(v) = a1 F1(v) + a2 F2(v) + ... + an Fn(v)

By definition of the mapping a1 F1, (a1 F1)(v) = a1 F1(v); hence the theorem holds for n = 1. Thus by induction,

    (a1 F1 + a2 F2 + ... + an Fn)(v) = (a1 F1)(v) + (a2 F2 + ... + an Fn)(v)
                                     = a1 F1(v) + a2 F2(v) + ... + an Fn(v)

9.35. Consider linear mappings F : R^3 -> R^2, G : R^3 -> R^2, H : R^3 -> R^2 defined by

    F(x, y, z) = (x + y + z, x + y)    G(x, y, z) = (2x + z, x + y)    H(x, y, z) = (2y, x)

Show that F, G, H are linearly independent [as elements of Hom(R^3, R^2)].

Suppose, for scalars a, b, c ∈ K,

    aF + bG + cH = 0    (1)

(Here 0 is the zero mapping.) For e1 = (1, 0, 0) ∈ R^3, we have

    (aF + bG + cH)(e1) = aF(1, 0, 0) + bG(1, 0, 0) + cH(1, 0, 0)
                       = a(1, 1) + b(2, 1) + c(0, 1) = (a + 2b, a + b + c)

and 0(e1) = (0, 0). Thus by (1), (a + 2b, a + b + c) = (0, 0) and so

    a + 2b = 0    and    a + b + c = 0    (2)

Similarly, for e2 = (0, 1, 0) ∈ R^3, we have

    (aF + bG + cH)(e2) = aF(0, 1, 0) + bG(0, 1, 0) + cH(0, 1, 0)
                       = a(1, 1) + b(0, 1) + c(2, 0) = (a + 2c, a + b) = 0(e2) = (0, 0)

Thus

    a + 2c = 0    and    a + b = 0    (3)

Using (2) and (3) we obtain

    a = 0,    b = 0,    c = 0    (4)

Since (1) implies (4), the mappings F, G, and H are linearly independent.

9.36. Prove Theorem 9.11. If dim V = m and dim U = n, then dim Hom(V, U) = mn.

Suppose {v1, ..., vm} is a basis of V and {u1, ..., un} is a basis of U. By Theorem 9.2, a linear mapping in Hom(V, U) is uniquely determined by arbitrarily assigning elements of U to the basis elements vi of V. We define

    Fij ∈ Hom(V, U),    i = 1, ..., m;  j = 1, ..., n

to be the linear mapping for which Fij(vi) = uj, and Fij(vk) = 0 for k ≠ i. That is, Fij maps vi into uj and the other v's into 0. Observe that {Fij} contains exactly mn elements; hence the theorem is proved if we show that it is a basis of Hom(V, U).

Proof that {Fij} generates Hom(V, U). Consider an arbitrary function F ∈ Hom(V, U). Suppose F(v1) = w1, F(v2) = w2, ..., F(vm) = wm. Since wk ∈ U, it is a linear combination of the u's; say,

    wk = ak1 u1 + ak2 u2 + ... + akn un,    k = 1, ..., m,  akj ∈ K    (1)

Consider the linear mapping G = Σi Σj aij Fij (the sum runs over i = 1, ..., m and j = 1, ..., n). Since G is a linear combination of the Fij, the proof that {Fij} generates Hom(V, U) is complete if we show that F = G.

We now compute G(vk), k = 1, ..., m. Since Fij(vk) = 0 for k ≠ i and Fkj(vk) = uj,

    G(vk) = Σi Σj aij Fij(vk) = Σj akj Fkj(vk) = Σj akj uj = ak1 u1 + ak2 u2 + ... + akn un

Thus by (1), G(vk) = wk for each k. But F(vk) = wk for each k. Accordingly, by Theorem 9.2, F = G; hence {Fij} generates Hom(V, U).

Proof that {Fij} is linearly independent. Suppose, for scalars cij ∈ K,

    Σi Σj cij Fij = 0

For vk, k = 1, ..., m,

    0 = 0(vk) = Σi Σj cij Fij(vk) = Σj ckj Fkj(vk) = ck1 u1 + ck2 u2 + ... + ckn un

But the uj are linearly independent; hence for k = 1, ..., m we have ck1 = 0, ck2 = 0, ..., ckn = 0. In other words, all the cij = 0 and so {Fij} is linearly independent.

9.37. Prove Theorem 9.12.

(i) For every v ∈ V,

    (G ∘ (F + F'))(v) = G((F + F')(v)) = G(F(v) + F'(v))
                      = G(F(v)) + G(F'(v)) = (G ∘ F)(v) + (G ∘ F')(v) = (G ∘ F + G ∘ F')(v)

Thus G ∘ (F + F') = G ∘ F + G ∘ F'.

(ii) For every v ∈ V,

    ((G + G') ∘ F)(v) = (G + G')(F(v)) = G(F(v)) + G'(F(v))
                      = (G ∘ F)(v) + (G' ∘ F)(v) = (G ∘ F + G' ∘ F)(v)

Thus (G + G') ∘ F = G ∘ F + G' ∘ F.

(iii) For every v ∈ V,

    (k(G ∘ F))(v) = k(G ∘ F)(v) = k(G(F(v))) = (kG)(F(v)) = ((kG) ∘ F)(v)

and

    (k(G ∘ F))(v) = k(G ∘ F)(v) = k(G(F(v))) = G(kF(v)) = G((kF)(v)) = (G ∘ (kF))(v)

Accordingly, k(G ∘ F) = (kG) ∘ F = G ∘ (kF). (We emphasize that two mappings are shown to be equal by showing that they assign the same image to each point in the domain.)

ALGEBRA OF LINEAR OPERATORS

9.38. Let F and G be the linear operators on R^2 defined by F(x, y) = (y, x) and G(x, y) = (0, x). Find formulas defining the operators (a) F + G, (b) 2F - 3G, (c) FG, (d) GF, (e) F^2, (f) G^2.

(a) (F + G)(x, y) = F(x, y) + G(x, y) = (y, x) + (0, x) = (y, 2x)

(b) (2F - 3G)(x, y) = 2F(x, y) - 3G(x, y) = 2(y, x) - 3(0, x) = (2y, -x)

(c) (FG)(x, y) = F(G(x, y)) = F(0, x) = (x, 0)

(d) (GF)(x, y) = G(F(x, y)) = G(y, x) = (0, y)

(e) F^2(x, y) = F(F(x, y)) = F(y, x) = (x, y). Note F^2 = I, the identity mapping.

(f) G^2(x, y) = G(G(x, y)) = G(0, x) = (0, 0). Note G^2 = 0, the zero mapping.


9.39. Consider the linear operator T on R^3 defined by T(x, y, z) = (2x, 4x - y, 2x + 3y - z). (a) Show that T is invertible. Find formulas for: (b) T^{-1}, (c) T^2, and (d) T^{-2}.

(a) Let W = Ker T. We need only show that T is nonsingular, i.e., that W = {0}. Set T(x, y, z) = (0, 0, 0), which yields

    T(x, y, z) = (2x, 4x - y, 2x + 3y - z) = (0, 0, 0)

Thus W is the solution space of the homogeneous system

    2x = 0,    4x - y = 0,    2x + 3y - z = 0

which has only the trivial solution (0, 0, 0). Thus W = {0}; hence T is nonsingular and so T is invertible.

(b) Set T(x, y, z) = (r, s, t) [and so T^{-1}(r, s, t) = (x, y, z)]. We have

    (2x, 4x - y, 2x + 3y - z) = (r, s, t)    or    2x = r,  4x - y = s,  2x + 3y - z = t

Solve for x, y, z in terms of r, s, t to get x = r/2, y = 2r - s, z = 7r - 3s - t. Thus

    T^{-1}(r, s, t) = (r/2, 2r - s, 7r - 3s - t)    or    T^{-1}(x, y, z) = (x/2, 2x - y, 7x - 3y - z)

(c) Apply T twice to get

    T^2(x, y, z) = T(2x, 4x - y, 2x + 3y - z)
                 = (4x, 4(2x) - (4x - y), 2(2x) + 3(4x - y) - (2x + 3y - z))
                 = (4x, 4x + y, 14x - 6y + z)

(d) Apply T^{-1} twice to get

    T^{-2}(x, y, z) = T^{-1}(x/2, 2x - y, 7x - 3y - z)
                    = (x/4, 2(x/2) - (2x - y), 7(x/2) - 3(2x - y) - (7x - 3y - z))
                    = (x/4, -x + y, -19x/2 + 6y + z)
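These formulas are easy to confirm with matrices (an added Python sketch, assuming the numpy library):

    import numpy as np

    # Matrix of T(x, y, z) = (2x, 4x - y, 2x + 3y - z) in the usual basis.
    T = np.array([[2,  0,  0],
                  [4, -1,  0],
                  [2,  3, -1]])

    print(np.linalg.inv(T))  # rows (0.5, 0, 0), (2, -1, 0), (7, -3, -1): the matrix of T^{-1}
    print(T @ T)             # rows (4, 0, 0), (4, 1, 0), (14, -6, 1): the matrix of T^2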

9.40. Let V be of finite dimension and let T be a linear operator on V for which TR = I, for some operator R on V. (We call R a right inverse of T.) (a) Show that T is invertible. (b) Show that R = T^{-1}. (c) Give an example showing that the above need not hold if V is of infinite dimension.

(a) Let dim V = n. By Theorem 9.14, T is invertible if and only if T is onto; hence T is invertible if and only if rank T = n. We have n = rank I = rank TR ≤ rank T ≤ n. Hence rank T = n and T is invertible.

(b) TT^{-1} = T^{-1}T = I. Then R = IR = (T^{-1}T)R = T^{-1}(TR) = T^{-1}I = T^{-1}.

(c) Let V be the space of polynomials in t over K; say, p(t) = a0 + a1 t + a2 t^2 + ... + as t^s. Let T and R be the operators on V defined by

    T(p(t)) = a1 + a2 t + ... + as t^{s-1}    and    R(p(t)) = a0 t + a1 t^2 + ... + as t^{s+1}

We have

    (TR)(p(t)) = T(R(p(t))) = T(a0 t + a1 t^2 + ... + as t^{s+1}) = a0 + a1 t + ... + as t^s = p(t)

and so TR = I, the identity mapping. On the other hand, if k ∈ K and k ≠ 0, then

    (RT)(k) = R(T(k)) = R(0) = 0 ≠ k

Accordingly, RT ≠ I.
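Part (c) can be modeled on coefficient lists (a small added Python sketch; a list [a0, a1, ...] stands for a0 + a1 t + ...):

    def T(p):            # drop the constant term and lower each exponent by 1
        return p[1:]

    def R(p):            # multiply by t: raise each exponent by 1
        return [0] + p

    p = [5, 2, 7]        # 5 + 2t + 7t^2
    print(T(R(p)))       # [5, 2, 7]: TR = I
    print(R(T([4])))     # [0], not [4]: RT != I, since the constant term is lost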

9.41. Let F and G be the linear operators on R^2 defined by F(x, y) = (0, x) and G(x, y) = (x, 0). Show that GF = 0 but FG ≠ 0. Also show that G^2 = G.

(GF)(x, y) = G(F(x, y)) = G(0, x) = (0, 0). Since GF assigns 0 = (0, 0) to every (x, y) ∈ R^2, it is the zero mapping: GF = 0.

(FG)(x, y) = F(G(x, y)) = F(x, 0) = (0, x). For example, (FG)(4, 2) = (0, 4). Thus FG ≠ 0, since it does not assign 0 = (0, 0) to every element of R^2.

For any (x, y) ∈ R^2, G^2(x, y) = G(G(x, y)) = G(x, 0) = (x, 0) = G(x, y). Hence G^2 = G.

9.42. Consider the linear operator T on R^2 defined by T(x, y) = (2x + 4y, 3x + 6y). Find: (a) a formula for T^{-1}, (b) T^{-1}(8, 12), and (c) T^{-1}(1, 2). (d) Is T an onto mapping?

(a) T is singular, e.g., T(2, -1) = (0, 0); hence the linear operator T^{-1} : R^2 -> R^2 does not exist.

(b) T^{-1}(8, 12) means the preimage of (8, 12) under T. Set T(x, y) = (8, 12) to get the system

    2x + 4y = 8
    3x + 6y = 12        or        x + 2y = 4

Here y is a free variable. Set y = a, where a is a parameter, to get the solution x = -2a + 4, y = a. Thus T^{-1}(8, 12) = {(-2a + 4, a) : a ∈ R}.

(c) Set T(x, y) = (1, 2) to get the system

    2x + 4y = 1
    3x + 6y = 2

The system has no solution. Thus T^{-1}(1, 2) = ∅, the empty set.

(d) No, since, e.g., (1, 2) has no preimage.

9.43. Let S = {v1, v2, v3} be a basis of V and let S' = {u1, u2} be a basis of U. Let T : V -> U be linear. Also, suppose

    T(v1) = a1 u1 + a2 u2
    T(v2) = b1 u1 + b2 u2        and        A = [ a1  b1  c1 ]
    T(v3) = c1 u1 + c2 u2                       [ a2  b2  c2 ]

Show that, for any v ∈ V, A[v]_S = [T(v)]_{S'} (where the vectors in K^2 and K^3 are column vectors).

Suppose v = k1 v1 + k2 v2 + k3 v3; then [v]_S = (k1, k2, k3)^T. Also,

    T(v) = k1 T(v1) + k2 T(v2) + k3 T(v3)
         = k1(a1 u1 + a2 u2) + k2(b1 u1 + b2 u2) + k3(c1 u1 + c2 u2)
         = (a1 k1 + b1 k2 + c1 k3)u1 + (a2 k1 + b2 k2 + c2 k3)u2

Accordingly,

    [T(v)]_{S'} = (a1 k1 + b1 k2 + c1 k3, a2 k1 + b2 k2 + c2 k3)^T

Computing, we obtain

    A[v]_S = [ a1  b1  c1 ] (k1, k2, k3)^T = (a1 k1 + b1 k2 + c1 k3, a2 k1 + b2 k2 + c2 k3)^T = [T(v)]_{S'}
             [ a2  b2  c2 ]

9.44. Let k be a nonzero scalar. Show that a linear map T is singular if and only if kT is singular. Hence T is singular if and only if -T is singular.

Suppose T is singular. Then T(v) = 0 for some vector v ≠ 0. Hence

    (kT)(v) = kT(v) = k0 = 0

and so kT is singular.

Now suppose kT is singular. Then (kT)(w) = 0 for some vector w ≠ 0; hence

    T(kw) = kT(w) = (kT)(w) = 0

But k ≠ 0 and w ≠ 0 implies kw ≠ 0; thus T is also singular.

9.45. Let E be a linear operator on V for which E^2 = E. (Such an operator is termed a projection.) Let U be the image of E and W the kernel. Show that: (a) if u ∈ U, then E(u) = u, i.e., E is the identity map on U; (b) if E ≠ I, then E is singular, i.e., E(v) = 0 for some v ≠ 0; (c) V = U ⊕ W.

(a) If u ∈ U, the image of E, then E(v) = u for some v ∈ V. Hence, using E^2 = E, we have

    u = E(v) = E^2(v) = E(E(v)) = E(u)

(b) If E ≠ I then, for some v ∈ V, E(v) = u where v ≠ u. By (a), E(u) = u. Thus

    E(v - u) = E(v) - E(u) = u - u = 0        where v - u ≠ 0

(c) We first show that V = U + W. Let v ∈ V. Set u = E(v) and w = v - E(v). Then

    v = E(v) + v - E(v) = u + w

By definition, u = E(v) ∈ U, the image of E. We now show that w ∈ W, the kernel of E:

    E(w) = E(v - E(v)) = E(v) - E^2(v) = E(v) - E(v) = 0

and thus w ∈ W. Hence V = U + W.

We next show that U ∩ W = {0}. Let v ∈ U ∩ W. Since v ∈ U, E(v) = v by (a). Since v ∈ W, E(v) = 0. Thus v = E(v) = 0 and so U ∩ W = {0}. The above two properties imply that V = U ⊕ W.

9.46. Find the dimension d of: (a) Hom(R^3, R^2), (b) Hom(C^3, R^2), (c) Hom(V, R^2) where V = C^3 viewed as a vector space over R, (d) A(R^3), (e) A(C^3), (f) A(V) where V = C^3 viewed as a vector space over R.

(a) Since dim R^3 = 3 and dim R^2 = 2, we have (Theorem 9.11) d = 3 · 2 = 6.

(b) C^3 is a vector space over C and R^2 is a vector space over R; hence Hom(C^3, R^2) does not exist.

(c) As a vector space over R, V = C^3 has dimension 6. Hence (Theorem 9.11) d = 6 · 2 = 12.

(d) A(R^3) = Hom(R^3, R^3) and dim R^3 = 3; hence d = 3^2 = 9.

(e) A(C^3) = Hom(C^3, C^3) and dim C^3 = 3; hence d = 3^2 = 9.

(f) Since dim V = 6, d = dim A(V) = 6^2 = 36.

Supplementary Problems

MAPPINGS

9.47. Determine the number of different mappings from {a, b} into {1, 2, 3}.

9.48. Let the mapping g assign to each name in the set {Betty, Martin, David, Alan, Rebecca} the number of different letters needed to spell the name. Find (a) the graph of g, and (b) the image of g.

9.49. Figure 9-5 is a diagram of mappings f : A -> B, g : B -> A, h : C -> B, F : B -> C, and G : A -> C. Determine whether each of the following defines a composition mapping and, if it does, find its domain and codomain: (a) g ∘ f, (b) h ∘ f, (c) F ∘ f, (d) G ∘ f, (e) g ∘ h, (f) h ∘ G ∘ g.


Fig. 9-5

9.50. Let f : R -> R and g : R -> R be defined by f(x) = x^2 + 3x + 1 and g(x) = 2x - 3. Find formulas defining the composition mappings (a) f ∘ g, (b) g ∘ f, (c) g ∘ g, (d) f ∘ f.

9.51. For each of the following mappings f : R -> R find a formula for the inverse mapping: (a) f(x) = 3x - 7, and (b) f(x) = x^3 + 2.

9.52. For any mapping f : A -> B, show that 1_B ∘ f = f = f ∘ 1_A.

LINEAR MAPPINGS

9.53. Verify that the operators T and R of Problem 9.40(c) are linear.

9.54. Let V be the vector space of n x n matrices over K, and let M be a fixed nonzero matrix in V. Show that the first two mappings T : V -> V are linear, but the third is not linear: (i) T(A) = MA, (ii) T(A) = MA - AM, (iii) T(A) = M + A.

9.55. Find T(a, b) where T : R^2 -> R^3 is defined by T(1, 2) = (3, -1, 5) and T(0, 1) = (2, 1, -1).

9.56. Give an example of a nonlinear map F : V -> V such that F^{-1}(0) = {0} but F is not one-to-one.

9.57. Show that if F : V -> U is linear and maps independent sets into independent sets, then F is nonsingular.

9.58. Find a 2 x 2 matrix A which maps u1 and u2 into v1 and v2, respectively, where: (a) u1 = (1, 3)^T, u2 = (1, 4)^T and v1 = (-2, 5)^T, v2 = (3, -1)^T; (b) u1 = (2, -4)^T, u2 = (-1, 2)^T and v1 = (1, 1)^T, v2 = (1, 3)^T.

9.59. Find a 2 x 2 singular matrix B which maps (1, 1)^T into (1, 3)^T.

9.60. Find a 2 x 2 matrix C that has an eigenvalue λ = 3 and maps (1, 1)^T into (1, 3)^T.

9.61. Let T : C -> C be the conjugate mapping on the complex field C. That is, T(z) = z̄ where z ∈ C, or T(a + bi) = a - bi where a, b ∈ R. (a) Show that T is not linear if C is viewed as a vector space over itself. (b) Show that T is linear if C is viewed as a vector space over the real field R.

9.62. Let F : R^2 -> R^2 be defined by F(x, y) = (3x + 5y, 2x + 3y), and let S be the unit circle in R^2. (S consists of all points satisfying x^2 + y^2 = 1.) Find: (a) the image F(S), and (b) the preimage F^{-1}(S).

9.63. Consider the linear map G : R^3 -> R^3 defined by G(x, y, z) = (x + y + z, y - 2z, y - 3z) and the unit sphere S^2 in R^3, which consists of the points satisfying x^2 + y^2 + z^2 = 1. Find: (a) G(S^2), and (b) G^{-1}(S^2).

9.64. Let H be the plane x + 2y - 3z = 4 in R^3 and let G be the linear map in Problem 9.63. Find: (a) G(H), and (b) G^{-1}(H).


KERNEL AND IMAGEOF LINEAR MAPPINGS

9.65. For the following linear map G, find a basis and the dimension of (i) the image of G, and (ii) the kernel of G: G : R^3 -> R^2 defined by G(x, y, z) = (x + y, y + z).

9.66. Find a linear mapping F : R^3 -> R^3 whose image is spanned by (1, 2, 3) and (4, 5, 6).

9.67. Find a linear mapping F : R^4 -> R^3 whose kernel is spanned by (1, 2, 3, 4) and (0, 1, 1, 1).

9.68. Let F : V -> U be linear. Show that (a) the image of any subspace of V is a subspace of U, and (b) the preimage of any subspace of U is a subspace of V.

9.69. Each of the following matrices determines a linear map from R^4 into R^3:

    (a) A = [ 1   2  0   1 ]        (b) B = [  1  0   2  -1 ]
            [ 2  -1  2  -1 ]                [  2  3  -1   1 ]
            [ 1  -3  2  -2 ]                [ -2  0  -5   3 ]

Find a basis and the dimension of the image U and the kernel W of each map.

9.70. Consider the vector space V of real polynomials f(t) of degree 10 or less and the linear map D^4 : V -> V defined by D^4(f) = d^4 f/dt^4, i.e., the fourth derivative. Find a basis and the dimension of (a) the image of D^4, and (b) the kernel of D^4.

OPERATIONS WITH LINEAR MAPPINGS

9.71. Let F : R^3 -> R^2 and G : R^3 -> R^2 be defined by F(x, y, z) = (y, x + z) and G(x, y, z) = (2z, x - y). Find formulas defining the mappings F + G and 3F - 2G.

9.72. Let H : R^2 -> R^2 be defined by H(x, y) = (y, 2x). Using the mappings F and G in Problem 9.71, find formulas defining the mappings: (a) H ∘ F and H ∘ G, (b) F ∘ H and G ∘ H, (c) H ∘ (F + G) and H ∘ F + H ∘ G.

9.73. Show that the following mappings F, G, and H are linearly independent:

(a) F, G, H ∈ Hom(R^2, R^2) defined by F(x, y) = (x, 2y), G(x, y) = (y, x + y), H(x, y) = (0, x).
(b) F, G, H ∈ Hom(R^3, R) defined by F(x, y, z) = x + y + z, G(x, y, z) = y + z, H(x, y, z) = x - z.

9.74. For F, G ∈ Hom(V, U), show that rank (F + G) ≤ rank F + rank G. (Here V has finite dimension.)

9.75. Let F : V -> U and G : U -> W be linear. Show that if F and G are nonsingular, then G ∘ F is nonsingular. Give an example where G ∘ F is nonsingular but G is not.

9.76. Prove that Hom(V, U) does satisfy all the required axioms of a vector space. That is, prove Theorem 9.10.

ALGEBRA OF LINEAR OPERATORS

9.77. Suppose F and G are linear operators on V and that F is nonsingular. Assume V has finite dimension. Show that rank (FG) = rank (GF) = rank G.

9.78. Suppose V = U ⊕ W. Let E1 and E2 be the linear operators on V defined by E1(v) = u, E2(v) = w, where v = u + w, u ∈ U, w ∈ W. Show that: (a) E1^2 = E1 and E2^2 = E2, i.e., that E1 and E2 are projections; (b) E1 + E2 = I, the identity mapping; (c) E1 E2 = 0 and E2 E1 = 0.

9.79. Let E1 and E2 be linear operators on V satisfying (a), (b), (c) of Problem 9.78. Prove: V = Im E1 ⊕ Im E2.


9.80. Show that if the linear operators F and G are invertible, then FG is invertible and (FG)^{-1} = G^{-1} F^{-1}.

9.81. Let V have finite dimension, and let T be a linear operator on V such that rank T^2 = rank T. Show that Ker T ∩ Im T = {0}.

9.82. Which of the following integers can be the dimension of an algebra A(V) of linear maps: 5, 9, 18, 25, 31, 36, 44, 64, 88, 100?

9.83. An algebra A is said to have an identity element 1 if 1 · a = a · 1 = a for every a ∈ A. Show that A(V) has an identity element.

9.84. Find the dimension of A(V) where: (a) V = R^4, (b) V = C^4, (c) V = C^4 viewed as a vector space over R, (d) V = the space of polynomials of degree ≤ 10.

MISCELLANEOUS PROBLEMS

9.85. Suppose T : K^n -> K^m is a linear mapping. Let {e1, ..., en} be the usual basis of K^n and let A be the m x n matrix whose columns are the vectors T(e1), ..., T(en), respectively. Show that, for every vector v ∈ K^n, T(v) = Av, where v is written as a column vector.

9.86. Suppose F : V -> U is linear and k is a nonzero scalar. Show that the maps F and kF have the same kernel and the same image.

9.87. Show that if F : V -> U is onto, then dim U ≤ dim V. Determine all linear maps T : R^3 -> R^4 which are onto.

9.88. Let T : V -> U be linear and let W be a subspace of V. The restriction of T to W is the map T_W : W -> U defined by T_W(w) = T(w) for every w ∈ W. Prove the following: (a) T_W is linear; (b) Ker T_W = (Ker T) ∩ W; (c) Im T_W = T(W).

9.89. Two operators F, G ∈ A(V) are said to be similar if there exists an invertible operator P ∈ A(V) for which F = P^{-1}GP. Prove the following: (a) Similarity of operators is an equivalence relation. (b) Similar operators have the same rank (when V has finite dimension).

9.90. Let v and w be elements of a real vector space V. The line segment L from v to v + w is defined to be the set of vectors v + tw for 0 ≤ t ≤ 1. (See Fig. 9-6.)

(a) Show that the line segment L between vectors v and u consists of the points:

    (i) (1 - t)v + tu for 0 ≤ t ≤ 1;  (ii) t1 v + t2 u for t1 + t2 = 1, t1 ≥ 0, t2 ≥ 0.

(b) Let F : V -> U be linear. Show that the image F(L) of a line segment L in V is a line segment in U.

Fig. 9-6


9.91. A subset X of a vector space V is said to be convex if the line segment L between any two points (vectors) P, Q ∈ X is contained in X.

(a) Show that the intersection of convex sets is convex.
(b) Suppose F : V -> U is linear and X is convex. Show that F(X) is convex.

Answers to Supplementary Problems

9.47. Nine.

9.48. (a) {(Betty, 4), (Martin, 6), (David, 4), (Alan, 3), (Rebecca,5)}. (b) Image of g = {3,4, 5, 6}.

9.49. (a) {ycf):A->A, {b) No, (c) (F\302\260/): A -* C, (d) No, (e) {g\302\260h):C^A,

9.50. (a) (f\302\260g\\x)= 4x2-bx+\\, (\321\201)[\320\264

\321\201gX*)= 4x -

9,

2x2 + 6x- 1, (d) (/o/)(x)= x4+ 6x3+I4x2+ 15x+5.

9.51. (a) f-\\x) = (x + l)/3, (b) f \\x) =

9^5. T(a, b)= {-a + 2b, -3a + b, 7a - b).

9.56. ChooseV = V = R2 and F(x, y)= (x1, y2).

( \"i

(b) No such matrix exists, as u{ and u2 are dependent but vt and v2 are not.

9..59. \320\223 \302\260\\{Hint: Send @, l)r into @, 0)r.]

9.60'. (~2 \320\233.{Hint: Send @, 1)T into @, 3)T.]

9.62. (a) 13x2- 42xy + 34y2 = 1, (b) I3x2+ 42xy + 24y2 = 1.

9.63. (a) x2 -Sxy + 26y2 + bxz -

3%yz + \\4z2 = 1, (b) x2 + 2xy + 3y2 + 2xz -8yz + \\4z2 = 1.

9.64. (a) x -\321\203+ 2z = 4, (b) x - \\2z = 4.

9.65. (i) A, 0),@, 1), rank G = 2; (ii) A,- 1, I), nullity G = I.

9.66. F(x, y, z)\342\200\224

(x + 4y, 2x + 5y, 3x + 6y).

9j67. F(x, y,z,t) = (x+ y-z,3x+y-t, 0).

9.69. {a) {A, 2, 1),@, 1,1)} basis of Im A; dim (Im A) = 2.

{D, -2, -5,0),(l, -3, 0,5)}basis of Ker A; dim (Ker A)= 2.

(b) IaiB = R3;{(-l, j, 1, 1)}basisof Ker \320\222;dim (Ker B) = 1.

CHAP.9] LINEAR MAPPINGS 343

9.70. (o) 1,t t6; rank D4 = 7; (b) 1, f, t2, t3; nullity D* = 4.

9.71. (F + GXx, y, z) = (j/ + 2z,2x -\321\203+ z), CF -

2GXx, y, z) = Cy -4z, x + 2y + 3z).

9.72. (\320\262)(\320\235\320\276FX*, y, z) = (x + y, 2y), (H <. GXx, y, z) = (x -

y, 4z). (b) Not defined,

(c) |H.(F+ G)Xx, y, z) = (H \320\276F + \320\230\320\276GXx, \321\203,\320\263)= Bx -\321\203+ z, 2y + 4z).

9.82. Squares:9,25, 36, 64, 100.

9.84. (a) 16, (b) 16, (c) 64, \342\204\226121.

Chapter10

Matrices and Linear Mappings

10.1INTRODUCTION

Suppose S = {ut, u2,..., \320\274\320\277}is a basis of a vector spaceV over a field \320\232and, for v e V, suppose

v = alu1 + a2u2 + \302\246\302\246\302\246+\320\260\320\277\320\270\342\200\236

Then the coordinate vector of v relative to the basis S, which we write as a column vector unlessotherwisespecifiedor implied,is denotedand defined by

Recall [Example 9.4] that the mapping i>h->[i>]s, determined by the basis S, is an isomorphism betweenV and the space K\\

This chapter shows that there is also an isomorphism, determined by the basis S, between the

algebra A(V) of linear operators on V and the algebra M of n-squarematrices over K. Thus every linear

operator T : V -* V will correspond to an n-square matrix [T]s determined by the basis S.The question of whether or not a linear operator T can be represented by a diagonal matrix will

also be addressed in this chapter.

10.2 MATRIX REPRESENTATION OF A LINEAR OPERATOR

Let T bea linear operator on a vector space V over a field \320\232and suppose S = {uu u2 ,...,\302\253\342\200\236}is a

basis of V. Now T(ux), ..., \320\242(\320\270\342\200\236)are vectors in V and so each is a linear combination of the vectors in

the basis S; say,

T(U2)=

\302\25321\302\2531+ a22 U2 + ' \" ' + a2n \"n

= a.,u. + a_, u, + \342\200\242\342\200\242\342\200\242+ a..u.ft 1 1 \320\237\320\245X \320\237\320\237\320\237

The following definition applies.

Definition: The transposeof the above matrix of coefficients, denoted by m^T) or [T]s, is called thematrix representation of T relative to the basis S or simply the matrix of T in the basis S;that is,

(a,!

a2l ... a

a\"..a22..:::..fl\022

\302\253in a2n \342\200\242\342\200\242\342\200\242\320\241

(The subscript S may be omitted if the basis S is understood.)

344

CHAP. 10] MATRICES AND LINEAR MAPPINGS 345

Remark: Using the coordinate (column) vector notation, the matrix representa-

representationof T may also be written in the form

= [T] = ([T(Ul)],[T(uJ)],..., [T(Un)])

that is, the columns of m{T)are the coordinate vectors [T(\302\253t)],..., [T(uJ].

Example 10.1

(a) Let V be the vector space of polynomials in t over R of degree < 3, and let D : V -\302\273V be the differential

operator defined by D(p(t)) = d(p(t))/dt. We compute the matrix of D in the basis {I, t, t2, t3}.

= 0 = 0 + 0t + 0t2 + \320\236?3

D(t) =1 = 1 + Of + Ot2 + Ot3

D(t3) = 3t2 = 0 + Ot + 3t2 + Ot3

[Note that the coordinate vectors D(l), D(f),D(t2), and D(f3) are the columns, not the rows, of [D].]

(b) Consider the linear operator F : R2 -\302\273R2 defined by F(x, y)= Dx \342\200\224

2y, 2x + y) and the following bases of R2:

S={m,=A. 1),\320\274\320\263= (-1,0)} and E = {e,= A, 0), e2 = @, 1)}

We have

-c

F(u.) = F(l, 1)= B, 3)= 3A, 1) + (-1, 0) =

3\302\253,+ \320\2742

F(u2)= F(-l,0) = (-4, -2)= -2A, l) + 2(-l, 0)= -2u, + 2u2

(\320\252-2\\Therefore, [F]s = ( I is the matrix representation of F in the basis S. We also have

F(e,) = F(l, 0) = D, 2) = 4et + 2e2

F(e2)= F@, 1) = (-2, 1)= -2e, + e2

/4 -2\\Accordingly, [F]? = I I is the matrix representation of F relative to the usual basis E.

(c) Considerany n-square matrix A over \320\232(which defines a matrix map A : K\" -* K\") and the usual basis ? = {e,}of K\". The Ae,, Ae2, ..., Aen are precisely the columns of A (Remark on Example 9.5) and their coordinates

relative to the usual basis ? are the vectors themselves. Accordingly,

[\320\233]?= \320\220

that is, relative to the usual basis ?, the matrix representation of a matrix map A is the matrix A itself.

The following algorithm will be used to compute matrix representations.

Algorithm 10.2

Given a linear operator T on V and a basis S =\\ux, ..., \320\270\342\200\236]of V, this algorithm finds the matrix

[T]s which represents T relative to basis S.

Step 1. Repeat for each basis vector uk in S:

(a) Find T(uk).

(b) Write T(uk) as a linear combination of the basis vectors ult ..., \320\270\342\200\236to obtain the coordi-coordinates of T(u,) in the basis S.

346 MATRICES AND LINEAR MAPPINGS [CHAP. 10

Step 2. Form the matrix [T]s whose columns are the coordinate vectors [T(uk)]s obtained in

Step I(b).

Step 3. Exit.

Remark: Observe that Step l(b) is repeatedfor each basis vector uk. Accordingly,it may be useful to first apply:

Step 0. Find a formula for the coordinates of an arbitrary vector v relative to the

basis S.

Our first theorem, proved in Problem 10.10, tells us that the \"action\" of an operator T on a vector

v is preserved by its matrix representation:

Theorem 10.1: Let S = {u,, u2, \342\200\242--,uB} be a basis for V and let T be any linear operator on V. Then,

for any vector v e V, [T]s[>]s = [T(t;)]s.

That is, if we multiply the coordinate vector of v by the matrix representation of T, then we obtain

the coordinate vector of T(v).

Example 10.2. Consider the differential operator D : V -\302\273V in Example 10.1(a).Let

p(t)= a + bt + a2 + dt3 and so D(p(t)) = b + let + 3dt2

Hence, relative to the basis {1, f, f2, f3},

\320\2501= [\". \320\254,\321\201,dY and [D{p@)] = [6, 2c,3d, 0]r

We show that Theorem 10.1doeshold here:

0

0

0

0

1

0

0

0

0

2

00

\302\260\\0

\320\267

o/

la'

b

\321\201

V

b

2c

3d

\\o

Now we have associated a matrix [\320\223]to each T in A(V), the algebra of linear operators on V. Byour first theorem, the action of an individual operator T is preserved by this representation. The nexttwo theorems, (proved in Problems 10.11 and 10.12, respectively), tell us that the three basic operationswith these operators,

(i) Addition,

(ii) Scalar multiplication,

(iii) Composition

are also preserved.

Theorem 10.2: Let S ={\302\253,,u2, ..., \302\253\342\200\236}bea basis for a vector space V over K, and let M be the

algebra of n-square matrices over K. Then the mapping m: A(V)-*M defined by\342\204\242(T)

= [7\"]s is a vector space isomorphism.That is, for any F, G e A(V) and any\320\272\320\265\320\232,we have

(i) m(F + G) =m(F) + m(G) or [F + G] = [F] + [G],

(ii) m(kF) = kniF) or [kF] = k[F],(iii) m is one-to-one and onto.

CHAP. 10] MATRICES AND LINEAR MAPPINGS 347

Theorem 103: For any linear operators G, F e A(v),

niG\302\253 F) = niG)m(F) or [

(HereG \302\260F denotes the composition of the maps G and F.)

We illustrate the above theorems in the case dim V = 2. Suppose {u,, u2] is a basisfor V, and F

and G are linear operators on V for which

F(ut)= alul + a2 u2 G(\",)

= clul + c2u2F(u2) =

fciM, + b2 u2 G(u2) =^iMi

Then

We have

=(fl'

4 and [G]= (^

<

\\a2 bj \\c2 d

(F + GX\"i)= F(u,) + G(u,) =

\320\260,\320\270,+ a2u2 + c^u^ + c2u2= (fl!+ c,)u, + (a2 + c2)u2

(F + GX\022)= f(\022) + G(u2) =

\320\254,\320\270,+ b2u2 + ^i^! + d2 u2

= (b, + \320\233,)\320\274,+ (\320\2542+ <*2)u2

Thus ZF + G} =D ,/\320\220

= ( A + ( j\\ fc ^/ \\a2 bj \\c2 d

Also, for \320\272\320\261/\320\241,we have

u!) = fcF(u,) = /c(a,u,+ a2u2)= ka,u,+ ka2u2

= k(blUl + b2 u2) =kfcjU,

Thus

a2 b2

Finally, we have

(G \320\276FXui)

= G(F(ui)) =G(ayUy + a2 u2) =

\320\260^\320\270^+ a2 G{u2)= ul(c,ul + c2u2) + a^dyUi + d2 u2)= (alcl + a2dl)u1+ (alc2+ a2d2)u2

(G\320\276FXm2))

= G(F(u2) = Gib,!*, + b2 u2)= b,G(M,) + b2 G(u2)

= b^Ciiti+ c2u2)+ b2(^|U|+ d2u2)

= (blcl + b2dl)ul + (fc,c2+ b2d2)u2

Accordingly,

+ a2dl

103 CHANGE OF BASISAND LINEAR OPERATORS

The above discussion shows that we can represent a linear operator by a matrix once we havechosen a basis.We ask the following natural question: How doesour representation change if we select

another basis? In orderto answer this question, we first recall a definition and some facts.

348 MATRICES AND LINEAR MAPPINGS [CHAP. 10

Definition: Let S= {u,,w2, ..., \320\270\320\273}be a basis of V and let S' = {vt, v2, \302\246-,vn} be another basis.

Suppose, for i = 1.2,..., n,

Vi=anu, + ai2u2 + \302\246\302\246\302\246+\320\260\321\213\320\270\342\200\236

The transpose P of the above matrix of coefficients is termed the change-of-basis (ortransition) matrix from the \"old\" basis S to the \"new\" basis S'.

Fact 1. The above change-of-basismatrix P is invertible and its inverse P'x is the change-of-basis

matrix from S' back to S.

Fact2. Let P be the change-of-basis matrix from the usual basis ? of K\" to another basis S. Then P isthe matrix whose columns are precisely the elementsof S.

Fact 3. Let P be the change-of-basis matrix from a basis S to a basis S' in V. Then (Theorem 5.27),for

any vector v e V,

W = Ms and P 'Hs = WS-

(Thus P 'transforms the coordinates of i> in the \"old\" basis S to the \"new\" basis S'.)

The following theorem, proved in Problem 10.19, answers the above question, that is, shows how

the matrix representation of a linear operator is affected by a changeof basis.

Theorem 10.4: Let P be the change-of-basis matrix from a basis S to a basisS' in a vector space V.

Then, for any linear operator T on V,

In other words, if A is the matrix representing T in a basis S, then \320\222= P lAP is the matrix which

represents \320\223in a new basis S' whereP is the change-of-basis matrix from S to S'.

Example 10.3. Consider the following bases for R2:

E={\302\253?,=(l,0),<?2=((U)} and S ={\302\253,=A, -2), \302\2532

= B. -5)}

Since ? is the usual basis of R2, we write the basis vector in S as columns to obtain the change-of-basis matrix Pfrom E lo S:

-U 4)Consider the linear operator F onR1 defined by F(x, y) = Bx -

3>\\ 4\321\205+ >\302\246).We have

F(e,) = F(l, 0) = B, 4) = 2c, + 4e2 B -3

is the matrix representation of F relative to the usual basis E. By Theorem 10.4,

2 -5J V-18 -<=p>ap=(5

2Y2 -3YV-2 -1\320\2244 lA-

is the matrix representation of F relative to the basis S.

Remark: Suppose P =(\320\276\321\203)

is any n-square invertible matrix over a field K, and

suppose S = {m,,m2 ,..., un\\ is a basis for a vector space V over K. Then the n vectors,

i),=

\302\253\342\200\236\302\246\321\213,+ a2i u2 + \302\246\302\246\302\246+ aniun i = 1, 2, ..., \321\217

are linearly independent and hence they form another basis S' for V. Furthermore, P isthe change-of-basis matrix from the basis S to the basis S'. Accordingly, if A is anymatrix representation of a linear operator T on V, then the matrix \320\222= P\" lAP is alsoa matrix representation of T.

CHAP. 10] MATRICES AND LINEAR MAPPINGS 349

Similarity and Linear Operators

SupposeA and \320\222are square matrices for which there exists an invertible matrix P such that\320\222= P~lAP; then (Section 4.13) \320\222is said to be similar to A or is said to be obtained from A by a

similarity transformation. By Theorem 10.4and the above remark, we have the following basic result.

Theorem 103: Two matrices A and \320\222represent the same linear operator T if and only if they aresimilar to each other.

That is, all the matrix representations of the linear operator T form an equivalence class of similar

matrices.

Now suppose /is a function on square matrices which assigns the same value to similar matrices;that \\s,f(A) =f(B) whenever A is similar to B. Then/induces a function, also denoted by/, on linearoperators T in the following natural way:/(T) =/([T]s) where S is any basis. The function is well-definedby Theorem 10.5. Three important examples of such functions are:

A) determinant, B) trace, and C) characteristic polynomial

Thus the determinant, trace, and characteristicpolynomial of a linear operator T are well-defined.

Example 10.4. Let F be the linear operator on R2 defined by F(x, y)= Bx -

3y, 4x + y). By Example 10.34,the

matrix representation of T relative to the usual basis for R2 is

-eB -':

Accordingly:

(i) det (X) = det {A)= 2 + 12 = 14is the determinant of T.

(ii) tr T = tr A = 2 + 1 = 3 is the trace of T.

(iii) A,Xf) = AA(t) = t3 \342\200\2243t + 14 is the characteristic polynomial of T.

By Example10.3,another matrix representation of T is the matrix

44

Using this matrix, we obtain:

(i) det (T) = det (A)= - 1804 + 1818= 14 is the determinant of T.

(ii) tr T = tr A = 44 - 41 = 3 is the trace of T.(iii) &i{t)

=\320\224\342\200\236@

= f2 - 3t + 14 is the characteristic polynomial of T.

As expected, both matrices yield the same results.

10.4 DIAGONALIZATION OF LINEAR OPERATORS

A linear operator T on a vector space V is said to be diagonalizable if T can be represented by a

diagonal matrix D. Thus T is diagonalizable if and only if there exists a basisS = {u,,\302\2532,...,\302\253\342\200\236}of V

for which

T(ut) = /c,u,T(u2)

= k2 u2

\320\242(\320\270\320\273)= \320\272\342\200\236\320\270\342\200\236

In such a case, T is representedby the diagonal matrix

D = diag(fe1,/c2,...,kB)relative to the basis S.

350 MATRICES AND LINEAR MAPPINGS [CHAP. 10

The above observation leads us to the following definitions and theorems which are analogous to thedefinitions and theorems for matrices discussed in Chapter 8.

A scalar \320\257\320\265\320\232is called an eigenvalue of T if there exists a nonzero vector v e V for which

T(v) = kv

Every vector satisfying this relation is called an eigenvector of T belonging to the eigenvalue k. The setEx of all such vectors is a subspace of V called the eigenspace of \320\257.(Alternatively, \320\257is an eigenvalue of Tif kl \342\200\224T is singular and, in this case, Ex is the kernel of kl \342\200\224

T.)

The following theorems apply.

Theorem 10.6: T can be represented by a diagonal matrix D (or T is diagonalizable) if and only if

there exists a basis S of V consisting of eigenvectors of T. In this case, the diagonalelements of D are the corresponding eigenvalues.

Theorem 10.7: Nonzero eigenvectorsu,, u2, ..., ur of T, belonging, respectively, to distinct eigen-eigenvalues \320\257,,\320\2572,\342\200\242\342\200\242\342\200\242,\320\257\320\263,are linearly independent. (See Problem 10.26for the proof.)

Theorem 10.8: T is a root of its characteristic polynomial A(t).

Theorem 10.9: The scalar \320\257is an eigenvalue of T if and only if \320\257is a root of the characteristic

polynomial A(/) of T.

Theorem 10.10: The geometricmultiplicity of an eigenvalue \320\257of \320\223does not exceed its algebraic multi-

multiplicity. (See Problem 10.27 for the proof.)

Theorem 10.11: SupposeA is a matrix representation of T. Then T is diagonalizable if and only if A is

diagonalizable.

Remark: Theorem 10.11reduces the investigation of the diagonalization of alinear operator T to the diagonalization of a matrix A which was discussed in detail in

Chapter 8.

Example 10.5

(a) Let V be the vector spaceof real functions for which S ={sin 0, cos 0} is a basis,and let D be the differential

operator on V. Then

D(sin 0)= cos 0 = 0(sin 0) + I (cos 0)

D(cos 0) = -sin 0 = -l(sin 0) + 0(cos 0)

Hence \320\233= I I is the matrix representation of D in the basis S.Therefore,

\320\224@= t2 -

(tr Ajt + \\A | = t2 + I

is the characteristic polynomial of both A and D. Thus A and D have no (real) eigenvalues and. in particular,D is not diagonalizable.

(b) Consider the functions *\"\", e\022',..., e\302\260'xwhere a^ az, ..., ar are distinct real numbers. Let D be the differential

operator; hence D(e\302\260\")=

a\302\273<\302\246\"*'\302\246Accordingly, the functions ew are eigenvectorsof D belonging to distinct

eigenvalues. Thus, by Theorem \320\256.7,the functions are linearly independent.

(c) Let T : R2 -\302\273R2 be the linear operator which rotates each vector r e R2 by an angle 0 = 90\302\260(as shown in Fig.lO-l). Note that no nonzero vector is a multiple of itself. Hence T has no eigenvalues and so no eigenvectors.

CHAP. 10] MATRICES AND LINEAR MAPPINGS 351

T{v)

Fig. 10-1

Now the minimum polynomial m(t) of a linear operator T is defined independently of the theory ofmatrices,as the monic polynomial of lowest degree which has T as a zero. However,for any polynomial

/@,

/(T) = 0 if and only if f(A) = \320\236

where A is any matrix representation of T. Accordingly, T and A have the same minimum polynomial.

Thus all theorems in Chapter 8 on the minimum polynomial of a matrix also hold for the minimumpolynomial of a linear operator T.

10.5 MATRICES AND GENERAL LINEAR MAPPINGS

Lastly, we consider the general case of linear mappings from one space into another. Let V and V

be vector spaces over the same field \320\232and, say, dim V = m and dim U = n. Furthermore, let

e = {vy, v2, \302\246\302\246\302\246,vm} and f = {ul,u2,..., \320\270\342\200\236}be arbitrary but fixed bases of V and U, respectively.

Suppose F : V -\302\273U is a linear mapping. Then the vectors F(r,),..., F(vm) belong to U and soeachis

a linear combination of the uk; say

+\302\253i2 \022

+a22u2 \320\2602\320\277\320\270\342\200\236

= amlun + am2 u2 \320\260\321\202\342\200\236\320\270\342\200\236

The transpose of the above matrix of coefficients, denoted by [F]{, is called the matrix representation of

F relative to the basese and f:

(We will use the simple notation [F] when the bases are understood.)The following theoremsapply.

Theorem 10.12: For any vector v e V, [F]{Me = IF(v)]f.

That is, multiplying the coordinate vector of v in the basis e by the matrix [F]f, we obtain thecoordinate vector of F(v) in the basis f.

Theorem 10.13: The mapping F*->[F] is an isomorphism from Horn (V, U) onto the vector space of

n x m matrices over K. That is, the mapping is one-to-one and onto and, for any

F, G e Horn (V, U) and any \320\272\320\265\320\232,

IF + G] = [F] + [G] and [fcF] = fc[F]

352 MATRICES AND LINEAR MAPPINGS [CHAP. 10

Remark: Recall that any n x m matrix A over \320\232may be identified with thelinear mapping from Km into K\" given by vt-> Av. Now suppose V and U are vector

spaces over \320\232of dimensions m and \320\270,respectively, and suppose e is a basis of V and f

is a basis of U. Then in view of the preceding theorem, we shall also identify A with the

linear mapping F : V -* U given by [F(v)~\\f==\302\246

A\\v\\e. We comment that if other bases

of V and V are given, then A is identified with another linear mapping from V into U.

Theorem 10.14: Let e, f, and g be bases of V, U, and W, respectively. Let F : V -> U and \320\241:1/->(\320\243\320\254\320\265

linear mappings. Then

That is, relative to the appropriate bases, the matrix representation of the composition of two linear

mappings is equal to the product of the matrix representations of the individual mappings.

Our next theorem shows how the matrix representation of a linear mapping F: V -> U is affectedwhen new bases are selected.

Theorem 10.15: Let P be the change-of-basis matrix from a basis e to a basise' in V, and let Q be thechange-of-basismatrix from a basis f to a basis f' in U. Then, for any linear mappingF:V->U,

In otherwords, if A represents the linear mapping F relative to the bases e and f, then

B = Q'lAP

represents F relativeto the new bases e' and f.Our last theorem, proved in Problem 10.34, shows that any linear mapping from one space into

another can be represented by a very simple matrix.

Theorem 10.16: Let F: V -\302\273V be linear and, say, rank F \342\200\224r. Then there exist bases of V and of Usuch that the matrix representation of F has the form

where lr is the r-square identity matrix.

The above matrix A is called the normal or canonicalform of the linear map F.

Solved Problems

MATRIX REPRESENTATIONS OF LINEAR OPERATORS

10.1. Suppose F : R2 -> R2 is defined by F(x, y) \342\200\224By, 3x \342\200\224y). Find the matrix representation of F

relative to the usual basis ? = {e,= A, 0), e2 = @, 1)}:Note first that if (a, b) e R2, then (a, b) = aex + be2.

F(e()= F(l,0) = @, 3) = Oe, + 3e2

F(e2)=F@, l) = B, -l) = 2e,- e2

a\"

-C -0It is seen thai the rows of [F]? are given directly by the coefficients in the components of F(x, >\302\246).This

generalizes to any space K\".

CHAP. 10] MATRICES AND LINEAR MAPPINGS 353

10.2. Find the matrix representation of the linear operator F in Problem 10.1 relative to the basis

First find the coordinatesof an arbitrary vector (a, b) e R2 with respect to the basis S. We have

x + 2y= a

\320\236\320\2233x + 5y = b

Solvefor x and \321\203in terms of a and b to get x = 2b \342\200\2245a and \321\203= \320\227\320\260\342\200\224b. Thus

(a, b) = (-5a + 2b)ut + Ca -b)u2

We have F(x, y)= By, 3x \342\200\224

y). Hence

F(h,) =F{\\, 3) = F, 0) = -30u, + 18u2

F(u2) = FB, 5) = A0, 1) = -48h, + 29u2

-m,=r\302\273 -? /

(Remark: We emphasize that the coefficients of u, and u2 are written as columns, notrows, in each matrix representation.)

10.3. Let G be the linear operator on R3 defined by G(x,y, z)= By + z, x \342\200\224

Ay, 3x).

(a) Find the matrix representation of G relative to the basis

S =\320\232

= A, 1, 1), w2= A, 1, 0), w3

= A, 0, 0)}

(b) Verify that [G][r] = [G(v)] for any vector t; 6 R3.

First find the coordinates of an arbitrary vector (a, b, c)e R3 with respect to the basis S.Write (a, fc, c)

as a linear combination of wl,w2, w3 using unknown scalars x, y, and z:

(a, b, c) = x(\\, 1, 1)+ y(l, 1,0) + z(l, 0, 0) = (x + \321\203+ z, x + y, x)

Set corresponding components equal to each other to obtain the system of equations

x + \321\203+ z = a x + y= b x = c

Solvethe system for x, y, and z in terms of a, b, and \321\201to find x = \321\201,\321\203= b \342\200\224c, z = a \342\200\224b. Thus

(a, b, c) = cw, + (b \342\200\224c)w2 + (a \342\200\224

b)w3 or equivalently [(a, b, c)] = [c, t> \342\200\224\321\201,\320\260\342\200\224

6]r

(a) Since G(x, y, z) = By + z,x- Ay, 3x),

GiwJ = G(l, 1, 1)= C, -3, 3) = 3w, -6w2 + bw3

G(w2) = G(l, 1,0) = B, -3, 3) -3wt

- 6w2 + 5w3

G{w3) = G(l, 0, 0) = @, 1, 3) = 3w, -2w2

- w3

Write the coordinates of Giwt), Giw2\\ G(W3) as columns to get

/ 3 3 3\\

[G]= -6 -6 -2

\\ 6 5-1/

(fc) Write Gif) as a linear combination of wl,w2, w3 where v \342\200\224(a, b, c) is an arbitrary vector in R3:

G{v) = G(a, b, c) = Bb + c, a -4b, \320\227\320\260)

= 3aw, 4- (-2a -4b)w2 +(~a + 6b + c)w3

or, equivalently,

[G(vJ] =[\320\227\320\260,-2a-Ab, -a + 6b + c]r

354 MATRICES AND LINEAR MAPPINGS [CHAP. 10

Accordingly,

3 3 3\\/ \321\201

[G]M=|-6 -6 -2 II fe - \321\2011 =

6 5 -\\\\a-b

10.4. Let /1 = 1. I and let T be the linear operator on R2 defined by T(v) = Av (where v is written\\3 \320\233/

as a column vector). Find the matrix of T in each of the following bases:

(a) E = {et= A, 0), e2 = @, 1)},i.e.,the usual basis; and (b) S = {uj =A, 3), u2 = B, 5)}.

(a)and thus [7\"];

-G

Observe that the matrix of T in the usual basis is precisely the original matrix A which defined T.

SeeExample 10.1(c).

F) From Problem 10.2,(a, b)= (-5a + 2b)ut + Ca -

b)u2. Hence

and thus [7\"]s=

10.5. Each of the sets (a) {1, t, e\\ te'} and (b) {e31, te3', t2e3'} is a basisof a vector space V of functions

/: R -> R. Let D be the differential operator on F, that is, D(/) = d//^t. Find the matrix of D in

each given basis.

and thus

(a) D(I) = O =0A) + 0@ + 0(e')

D@ = 1 =\320\2461)+ 0@ + 0(e')

D(e')= e' =0A) + 0@ + 1\320\230+ Otfe')

D(te') = e' + te'= 0A) + 0@ + HO + Hie1)

(b) D(e31) = 3e31 = 3(e3')+ 0(ie31) + 0{t V) /3 1 o\\

D(te3') = e3r + 3te3' =l(e3t) + 3(fe3') + 0(tV) and thus [D]=l0 3 2

D(iV) = 2te3' + 3\302\2532e3'= 0(e3') + 2(tc3') + 3(tV) \\0 0

10.6. Let V be the vector space of 2 x 2 matrices with the usual basis

fc-GSb- Gib- \320\237'\320\247\320\276?

Let M = ( 1 and T be the linear operator on V defined by T(A) = MA. Find the matrix

representation of T relative to the above usual basis of V.

CHAP. 10] MATRICES AND LINEAR MAPPINGS 355

We have

Hence

(Sincedim \320\232= 4, any matrix representation of a linear operator on V must be a 4-square matrix.)

10.7. Consider the basis S = {A, 0), A, lj} of R2. Let L : R2 -> R2 be defined by L(l, 0) = F, 4) andL(l, 1)= A, 5). (Recall that a linear map is completely defined by its action on a basis.)Find

the matrix representation of L with respect to the basis S.

Write F, 4) and then A,5) eachasa linear combination of the basis vectorsto get

\320\246\320\245,0) = F, 4) =2A, 0) + 4A, 1) B -4

\320\246\320\245,1) = A, 5) = -4A, 0) + 5A, 1and thus

10.8. Consider the usual basis E = {et, e2, ..., en} of K\". Let L:K\"->K\" be defined by Lfo) = t;;.Show that the matrix A representing L relative to the usual basis E is obtained by writing the

image vectors vt,v2, \302\246\302\246-,vnascolumns.

Suppose Vj= (a,,, ai2 ain). Then \320\246\320\265,)

=\320\270,-

= an e, + ai2 e2 + \302\246\302\246\302\246+ a,nen. Thus

\302\26012a22

10.9. For each of the following linear operatorsL on R2, find the matrix A which represents L (relativeto the usual basis of R2):

(a) L is defined by \320\251,0) = B, 4) and L@,1)= E, 8).

(b) L is the rotation in R2 counterclockwise by 90\302\260.

(c) L is the reflection in R2 about the line \321\203\342\200\224\342\200\224x.

(a) Since A,0) and @, 1) do form the usual basis of R2, write their images under L as columns (Problem10.8)to get

A =

(h) Under the rotation L, we have \320\251,0) = @, 1) and \320\2460,1) = (- 1,0).Thus A =

(c) Under the reflection L, we have \320\246\320\245,0) = @, - 1)and L@, !) = (-!, 0).Thus A =

356 MATRICES AND LINEAR MAPPINGS [CHAP. 10

10.10. Prove Theorem 10.1. [T]s[v]s = \\T(v)]s.

Suppose,for i = 1,..., n,

T(Ui) =\320\260\320\277\320\270\321\205+ al2u2 + \302\246\342\200\242\302\246+ alnur = ? a,j\"j

Then [T]s is the n-square matrix whose jth row is

(\302\260lj.\302\2602j<\302\246\302\246>\302\260nj> W

Now suppose\320\273

i= 1

Writing a column vector as the transpose of a row vector.

Furthermore, using the linearity of T,

T(v)= T[ f k(uA = f k:T(u,)

= \320\243kl f \320\260\320\277\320\270\320\233\\i=l / i=l i=l \\>= 1 /

I. / n \\ n

Thus [T(^)]s is the column vector whoseyth entry is

auk1 + a2j.fc2 + --+a)yfc,, (i)

On the other hand, the^th entry of [T]s[i;]s is obtained by multiplying the^th row of [T]s by [d]s, i.e., (/)

by B). But the product of (/) and B) is (i); hence [T]s[t>]s and [T(d)]s have the same entries. Thus

10.11. Prove Theorem 10.2. (i) [F + G] = IF]+ [G], (ii) [*F] = k\\F].

Suppose,for i = 1,..., \320\270,

n n

Consider the matrices A = (a{J)and \320\222=(by). Then [F] = \320\233\320\263and [G] = Br. We have, for i = 1,..., n,

ft

(F

Since j4 + \320\222is the matrix (ay + fc.j), we have

[F + C] =(A + B)T = AT+BT= [F] + [G]

Also, for i = 1,..., n,

XUi)= kF{Ui) = k?aijUj= ? (fcfljJ)u.

Since fc>l is the matrix (/cay), we have

[fcF] = (kA)T = \320\2537\"=\320\272]_\320\223\\

Lastly, m is one-to-one since a linear mapping is completely determined by its values on a basis, and m is

onto since each matrix A =(oy) in M is the image of the linear operator

Thus the theorem is proved.

CHAP. 10] MATRICES AND LINEAR MAPPINGS 357

10.12. Prove Theorem 10.3. [G <=F) = \\G][F].

Using the notation in Problem 10.11,we have

FX\302\253,)= G(F(u,)) =

G( ? \320\257\321\206\320\230\320\233==

? flijGK\\j i / jii

(G

\320\273

Recall that AB is the matrix AB = {c^) where cik =]\320\223\320\260\320\270\320\254]\320\272.Accordingly,

[G - F] = (AB)T= BTAT = IGRF]

Thus the theorem is proved.

10.13. Let A be a matrix representation of an operatorT.Show that f (A) is the matrix representation of

f(T), for any polynomial/(<). [Thus/(T) = 0 if and only if \320\224\320\220)=

\320\236.]

Let \321\204be the mapping T\\-*A, i.e., which sends the operator T into its matrix representation A. Weneedto prove that \321\204{/{\320\242))=/(/4). Suppose/(f) =

aat\" + \302\246\302\246\302\246+ att + a0. The proofis by induction on n, the

degree of/(f).Supposen = 0. Recall that \321\204{\320\223)

= / where /' is the identity mapping and / is the identity matrix. Thus

=\321\204(\320\260\320\2761')

=\320\260\320\276\321\204(\320\223)

= aol =f(A)

and so the theorem holds for n = 0.Now assume the theorem holds for polynomials of degree less than n. Then, since \321\204is an algebra

isomorphism,

\321\204[/{\320\242))=

\321\204{\320\260\342\200\236\320\242\"+ \320\260\342\200\236_1\320\242\302\253\302\246

+-+\320\260,\320\242 + \320\2600\320\223)

=\320\260\320\277\321\204(\320\242\320\250\320\242\320\2771) + 0(\320\260\342\200\236_,\320\242\"

'+\342\200\242\302\246\302\246\302\246+ \320\260,\320\242+ \320\260\320\276\320\223)

=\320\260\342\200\236\320\220\320\220\"1+{\320\260\321\217.1\320\220\321\217

'+ \302\246\302\246\302\246

+ \320\260,\320\220+ aol) =f(A)

and the theorem is proved.

10.14. Let V be the vector space of functions which has {sin 0, cos 0\\ as a basis, and let D be the

differential operator on V. Show that D is a zero off(t) = f2 + 1.

Apply/(D) to each basisvector:

/(DXsin (!) = (D2 + /Xsin 0) = D2(sin 0) + /(sin 0) = -sin 0 + sin 0 = 0

/(DKcos 0)= (D2 + /)(cos 0) = D2(cos 0)+ /(cos0) = -cos \302\253+ cos 0 = 0

Sinceeachbasisvector is mapped into 0, every vector i? e V is also mapped into 0 by.AD). Thus/(D) = 0.[Thisresult is expected since, by Example 10.5(a)./(() is the characteristic polynomial of D.]

CHANGE OF BASIS, SIMILAR MATRICES

10.15.Let F : R2 -\302\273R2 be defined by F(x, y) = Dx -y, 2x + y) and consider the following bases for R2:

? = {*!= A, 0), e2 = @, 1)} and S ={\320\270,

= A, 3), u2 = B, 5)}

{a) Find the change-of-basis matrix P from E to S, the change-of-basis matrix Q from S back to

?, and verify that Q = P '.

358 MATRICES AND LINEAR MAPPINGS [CHAP. 10

(b) Find the matrix A that represents F in the basis E, the matrix \320\222that represents F in thebasis S, and verify that B = P~~lAP.

(c) Find the trace tr F, the determinant det (F), and the characteristic polynomial A(t) of F.

(a) Since? is the usual basis, write the elements of S as columns to obtain the change-of-basis matrix

P-G 9

Solving the system u, = e, + 3c2,u2= 2?, + 5e2, for e, and e2 yields

e, = \342\200\2245u, 4 3u2 and thus Q =

e2 = 2u, -u2

We have

-K ^0

(b) Write the coefficients of x and \321\203as rows (Problem 10.3)to obtain

-G 1)

SinceF(x, y) = Dx -j>, 2x + y) and (a, b) = (- 5a + 2b)u, +\342\200\242Ca

- b)u2,

1 '\342\200\236

'\342\200\236_

' 2and thus \320\222=

We have

(c) Use A (or B) to obtain

tr F = tr A = 4 + 1 = 5 det (F) = det (\320\233)= 4 + 2 = 6

A(f) = t2 - (tr \320\233)*+ \320\264\320\265\320\246\320\220)= f2 - 5f + 6

10.16. Let G be the linear operator on R3 defined by G(x,y, z)= By + z, x \342\200\224

Ay, 3x) and consider theusual basis E of R3 and the following basis S of R3:

S={w,

= A, 1, 1), w2= (I, 1, 0), w3

= (I, 0, 0)}

(a) Find the change-of-basis matrix P from E to S, the change-of-basis matrix Q from S back to?, and verify that Q = P '.

(fc) Verify that [G]s = P^G^P.

(c) Find the trace, determinant, and characteristic polynomial of G.

(a) Since ? is the usual basis, write the elements of Sas columns to obtain the change-of-basis matrix

/. , \320\273

\320\257= I 1 1 Ol

\\l 0 0/

By the familiar inversion process (seeProblem 10.3)we obtain

et =Ovv, 4- 0w2 + lw3 /0 0

e2=Ow, + lw2 \342\200\224

\\w3 and thus fi=

|0 1 \342\200\2241

e3 = lw, -lw2 + 0w3 \\ 1 -1 0/

CHAP. 10] MATRICES AND LINEAR MAPPINGS 359

We have

/i i 1W0 \320\276 \320\233 /i

PQ = I 1 1 0 0 1 -1 = I 0

\\l 0 0/\\l -1 0/ \\0

/0 2 l\\ / 3 3 3\\

(b) From Problems 10.1 and 10.3, [G]? =1 -4 0 and [G]s =I -6 -6 -2. Thus

\\3 0 0/ \\6

5\342\200\2241/

/0 0 l\\/0 2 l\\/l 1 l\\ / 3 3 3\\

P\"'[G]?P= 0 I -11 -4 0 1 1 0 = -6 -6 -2 =[G]S\\1

-1 0/\\3 0 o/\\l 0 0/ \\6 5 -l/

(c) Use[G]?(the simpler matrix) to obtain

tr G = 0 - 4 + 0 = -4 det (G) = 12 and A(f)= t3 + 4t2 - 5t - 12

10.17. Find the trace and determinant of the following operator on R3:

T(x, y, z) = (a,.x + a2y + a3z, btx + b2y + b3z, ctx + c2y + c3z)First find a matrix representation A of \320\223.Choosing the usual basis ?,

Then

tr T = tr /I = a, + b2 + c3

and

det (T) = det (A) = a,b2c3 + a2b3cl + a3bic2\342\200\224

a3bbc3\342\200\224

a2b,c3\342\200\224

atb3c2

(\\ 2\\10.18. Let V be the space of 2 x 2 matrices over R, and let M = I I. Let T be the linear operator

on V defined by T(A) = MA. Find the trace and determinant of T.

We must first find a matrix representation of T. Choosethe usual basis of V to obtain (Problem 10.6)the following matrix representation:

A

0 2 0\\

0 3 0 4/

Then tr T = 1 + 1 + 4 + 4 = 10and det (T) = 4.

10.19. Prove Theorem 10.4. [Ts,] = P~l[T]P.

Let v be any vector in V. Then, by Theorem 5.27,P[u]s.= [t>]s. Therefore,

p- '[T

But [T]s.[^]s. = [T(t;)]s.;hence

360 MATRICES AND LINEAR MAPPINGS [CHAP. 10

Since the mapping i>h-\302\273[>]s. is onto K\", we have P '[T~\\SPX = [T]s.X for every X e K\". Thus

= [T]s., as claimed.

DIAGONALIZATION OF LINEAR OPERATORS, EIGENVALUES AND EIGENVECTORS

10.20. Find the eigenvaluesand linearly independent eigenvectors of the following linear operator onR2, and, if it is diagonalizable, find a diagonal representation D: F(x, y)

= Fx \342\200\224y, 3x + 2y).

First find the matrix A which represents F in the usual basis of R2 by writing down the coefficients of x

and \321\203as rows:

The characteristic polynomial A@ of F is then

\320\224@= t2 -(tr A)t + Ml = f2 - 8f + 15 = (t -

3X\302\253- 5)

Thus A, = 3 and A2= 5 are eigenvalues of F. We find the corresponding eigenvectors as follows:

(i) Subtract A, = 3 down the diagonal of A to obtain the matrix M=G -! which corresponds to the

homogeneous system 3x \342\200\224v = 0. Here \320\270,= A, 3) is a nonzero solution and hence an eigenvector of F

belonging to A, =3.

-G :(ii) Subtract A2 = 5 down the diagonal of A to obtain M =( \\ \\\\ which corresponds to the system

x \342\200\224\321\203

= 0. Here v2 =A, 1) is a nonzero solution and hence an eigenvector of F belonging to A2 = 5.

Then S ={\302\253\342\200\236v2} is a basis of R2 consisting of eigenvectors of F. Thus F is diagonalizable, with the matrix

/3 0\\

representation D -- I I.

10.21. Let L be the linear operatoron R2 which reflects points across the line \321\203= kx (where \320\272\321\2040). See

Fig. 10-2.

(a) Show that t>,= (k, I) and v2

= A,\342\200\224

k) are eigenvectors of L.

(b) Show that L is diagonalizable, and find such a diagonal representation D.

*- V

Fig. 10-2

(a) The vector \321\203,=

{\320\272,1) lies on the line \321\203= kx and hence is left fixed by L, that is, L(t>[)= i>,. Thus t>, is

an eigenvector of L belonging to the eigenvalue A, = 1.Thevector v2 = A, \342\200\224fc) is perpendicular to the

line \321\203= kx and hence L reflectsv2 into its negative, that is, L(v2)= \342\200\224

d2. Thus v2 is an eigenvector ofL belonging to the eigenvalue k2 = \342\200\2241.

CHAP. 10] MATRICES AND LINEAR MAPPINGS 361

(b) Here S ={vlt v2} is a basis of R2 consisting of eigenvectors of L. Thus L is diagonalizable with the

(\\ 0\\

diagonal representation (relative to S)D = I I.

10.22. Find all eigenvalues and a basis of each eigenspace of the operator T : R3 -> R3 defined by

T(x, y, z) = Bx + \321\203,\321\203\342\200\224

\320\263,2\321\203+ 4\320\263).Is T diagonalizable? If so, find such a representation D.

First find the matrix A which represents T in the usual basis of R3 by writing down the coefficients of

x, v, \320\263as rows, and then find the characteristic polynomial \320\224(\320\263)of 7~. We have

i-2 -1 01 - 1 | and so \320\224(\320\263)= |f/ - A' =

2 4,

0 t-\\ 1

0 -2 t-4

= (f - 2JO-3)

Thus A = 2 and A = 3 are the eigenvalues of T.

We find a basis of the eigenspace ?2 of A = 2. Subtract A = 2 down the diagonal of A to obtain the

homogeneous system

v = 0

-\321\203

- \320\263= 0 or

2, + 2\320\263- 0v + \320\263= 0

The system has only one independent solution, e.g., x = 1,\321\203\342\200\2240, z = 0. Thus \320\270= A, 0, 0) forms a basis of

the eigenspace ?2.We find a basis of the eigenspaceE3 of A = 3. Subtract A = 3 down the diagonal of A to obtain the

homogeneous system

- \320\273-+ v = 0

-2v-z=0 or

\320\276 n2v + z = \302\260

2 v + \320\263= 0

The system has only one independent solution, e.g., x = 1,\321\203= 1, z = \342\200\2242.Thus \320\270= A, 1, \342\200\2242)forms a basis

of the eigenspace E3.Observe that T is not diagonalizable, since T has only two linearly independent eigenvectors.

10.23. Show that 0 is an eigenvalue of T if and only if T is singular.

We have that 0 is an eigenvalue of T if and only if there exists a nonzero vector v such that

T{v) = Ov = 0, i.e., if and only if T is singular.

10.24. Suppose X is an eigenvalue of an invertible operator \320\223.Show that \320\257i is an eigenvalue of T~l.

Since T is invertible, it is also nonsingular; hence, by Problem 10.23 \320\257\321\2040.

By definition of an eigenvalue, there exists a nonzero vector v for which T{v) = \320\257\320\270.Applying T\021 to

both sides, we obtain v = T~ '(\320\257\320\270)= XT \\v\\ Hence T\"'(f) = \320\257

\"'\321\203;that is, \320\257\"'is an eigenvalue of T~'.

10.25. Supposedim V = n. Let T :V -*V be an invertible operator. Show that T ~l is equal to a

polynomial in T of degree not exceeding n.

Let m(t) be the minimum polynomial of T. Then m{t)= tr + ar-if'~' +-

\302\246\302\246\302\246+ a,t \302\246+\302\246a0, where r < n.

Since T is invertible, a0 ^ 0.We have

362 MATRICES AND LINEAR MAPPINGS [CHAP. 10

Hence

alI)T = I and = (Trl + \320\260,a

10.26. Prove Theorem 10.7.

The proof is by induction on n. If n = 1, then u, is linearly independent since u, \321\2040. Assume n > 1.Suppose

a,u, -1- a2u2 \302\246+\302\246\342\200\242\342\200\242\342\200\242+ anuB = 0

where the af are scaiars. Applying T to the above relation, we obtain by linearity

aj(ut) + a2T(u2) + \302\246\302\246\302\246+ \320\260\342\200\236\320\242(\320\270\320\271)= \320\242@)

= 0

But by hypothesis T(u,) =\320\257,-u,-; hence

(/)

On the other hand, multiplying (/) by \320\257\342\200\236.

Now subtracting C) from B),

+\320\260\342\200\236\320\257\342\200\236\320\270\342\200\236=

\320\260\342\200\236_,(\320\257\320\257_,-

\320\257\342\200\236)\320\270\342\200\236_,= 0

B)

(i)

By induction, \320\270,,\320\2702,..., \320\270\342\200\236_,are linearly independent; hence each of the above coefficients is 0. Since the \320\273,-

are distinct, kt \342\200\224\320\257\342\200\236\320\2440 for i \320\244n. Hence a, = \302\246\302\246\342\200\242=

\320\260\320\270_,= 0. Substituting this into (/) we get \320\276\342\200\236un= 0, and

hence \320\260\342\200\236= 0. Thus the u-, are linearly independent.

10.27. Prove Theorem 10.10.

Suppose the geometric multiplicity of A is r. Then the eigenspace ?\320\273contains r linearly independenteigenvectorsvlt..., vr. Extend the set {u,}to a basisof Vsay: {vt, vr. wt,.... ws}. We have

T(Vi)= /D,

T(v2)= h>2

T(vr)=

, +\302\246\302\246\302\246

T(*vs)= a5lD, + \302\246\302\246\302\246+ \320\260\342\200\236\320\270,+ bslwj + \302\246\302\246\302\246+ bss ws

The matrix of \320\223in the above basis is

M =

/\320\257

0

0

0

0

\\o

0

\321\217

0

0

0

0

... 0

... 0

... \321\217

... 0

... 0

... 0

\"\320\270

Ol2

\302\246\320\254,2

\321\214..

\0221

\320\26022

\320\2602\320\263

\320\25422

b2s

\302\246-\302\246\320\260\342\200\2361

\302\246\302\246\302\246\320\260*2

... \320\260\320\265\321\202

... \320\254\320\274

... \320\254\320\2632

- \321\214-

1. \320\262

where /4 =(\320\260\320\276)\321\202

and \320\222=(\320\254^)\321\202.

Since M is a block triangular matrix, the characteristic polynomial of Ur, which is (t \342\200\224\320\257)\320\263,must divide

the characteristic polynomial of M and hence that of T. Thus the algebraicmultiplicity of \320\257for the operatorT is at least r, as required.

CHAP. 10] MATRICES AND LINEAR MAPPINGS 363

10.28. Let {vt ujbea basis of V. Let T : V -\302\273V be an operator for which T(Uj)= 0, T(v2) = a21vt,

=a3lvt +a32v2,..., \320\242(\320\263\320\277)

=\320\260\320\27711,+ \302\246\342\200\242\342\200\242+ e,.,-,\302\273,_i. Show that T\" = 0.

(\342\200\242)

It suffices to show that

TJ(Vj)= 0

for \321\203= 1,..., \320\270.For then it follows that

T\"(Vj)= T\"

J{T\\v))= T\" \302\246'@)

= 0, for \321\203= 1,.... \320\270

and, since {vt, ..., i>n} is a basis, \320\223\"= 0.

We prove (*) by induction on j. The case j = 1 is true by hypothesis. The inductive step follows

(for \321\203= 2, ...,n) from

= a,lT'-1(I>t)+ ---+ flAJ-t7\"J-I0>J-,)

= an0 + \342\200\242\302\246\302\246+ a,, \321\203,0 = 0

Remark: Observe that the matrix representation of T in the above basis is triangular with

diagonal elements 0:

/\320\276\022 1

\320\236\320\236 \320\260\320\2732

0 0 0

0 0 0 \"\320\276'/

MATRIX REPRESENTATIONS OF LINEAR MAPPINGS

10.29.Let F : R3 -\302\273R2 be the linear mapping definedby F(x, y, z) = Cx + 2y - 4z,x -\320\252\321\203+ 3z).

(a) Find the matrix of F in the following bases of R3 and R2:

S ={Wl

= A, 1, I), w2= (I, 1, 0), w3

= A, 0, 0)} S' = {u, = A, 3), u2 = B, 5)}

(fc) Verify that the action of F is preservedby its matrix representation; that is, for any i; e R3,

inllvh = [F(t,)]s.

(a) From Problem 10.2, (a, b)= (-5a + 2b)u{ + Ca -

b)u2. Thus

F(w,)=F(l, 1, l) = (l, -1)= -7u, +4u2

F{w2)= F(l, 1, 0) = E, -4) = -33^!+ 19u2

F(w3) = F(l, 0, 0) = C, I)= -13u, + 8u2

Write the coordinatesof F(w,), F(w2), F(w3) as columns to get

-7 -33 -_/-7 -33 -13\\\\ 4 19 SJ

(b) If v = (x, y, z) then, by Problem 10.3, v = zwt + (y\342\200\224

z)w2 + (x \342\200\224y)w3. Also,

F(t>)= Cx + 2y - 4z,x -

5y + 3z) = (- 13x- 20> + 26z)u, + (8x + 1\\y- I5z)u2

Hence

Thus

-

-7 -33 -4 ,9

13\\/ Z\\ /-13x-20> +

8JU-Z 4 8*+ll,-

364 MATRICES AND LINEAR MAPPINGS [CHAP. 10

10.30.Let F : Kn -\302\273Km be the linear mapping defined by

F(xu x2,...,xn) = (auxl + \302\246\302\246\342\200\242+alnxn,a2ixl +\302\246\302\246\302\246+ a2nxn,..., amlx, +\302\246\302\246\302\246+ amnxn)

Show that the matrix representation of F relative to the usual bases of K\" and of Km is given by

... a-,

\\flml flm2 \302\246\302\246\302\246

That is, the rows of [F] are obtained from the coefficients of the x, in the components of

F(x,, ...,\321\205\342\200\236).

We have

F(l,0,...,0) = (an,a21,...,aml)F@, 1,...,0) =

(\320\26012,\320\26022,...,\320\260\342\200\2362)and thus [F] =

\302\26011\302\26012

\302\26021\0222 \"in

10.31. Find the matrix representation of each of the following linear mappings relative to the usual

bases of R\":

F : R2 - R3 defined by F(x, y) = Cx -y, 2x + Ay, 5x -

6y)

F : R4 - R2 defined by F(x, y, s, 0 = Cx -Ay + 2s - 5t, Sx + ly - s-

It)

F : R3 -> R4 defined by F(x, y, z) = Bx + 3y- 8z, x + \321\203+ z, Ax -

5z, 6y)

By Problem 10.30, we need only look at the coefficients of the unknowns in F(x, y,...). Thus

/2 3 -i

[F]3-4 2 -

5 7 -l -\320\263CF]

1

4

\\o

1

0

6

1

-5

0

10.32. Let T : R2 -\302\246R2 be defined by T(x, y) = Bx -3y, x + Ay). Find the matrix of T relative, respec-

respectively, to the following bases of R2:

? = {*,=A, 0), e2 = @, 1)} and S = {u,= A, 3), u2 = B, 5)}

(We can view \320\223as a linear mapping from one spaceinto another, each having its own basis.)

From Problem 10.2, (a, b) = (-5a + 2b)u, + Ca -b)u2. Hence

T(e,) =\320\223A,0) = B, 1) = -8u, + 5u2

T(e2)= T@, I) = (-3, 4)

= 23m,- 13u2

and thus \320\241\320\233!-n

10.33. Let A = I I. Recall that A determines a linear mapping F: R3 -> R2 defined by

F(v) = Av where t; is written as a column vector. Find the matrix representation of F relative tothe following bases of R3 and R2.

S ={\320\270/,

= A, 1, 1), w2= A, 1, 0), w3

= A, 0, 0)} S = {h,= (I, 3), \302\2532= B, 5)}

CHAP. 10] MATRICES AND LINEAR MAPPINGS 365

From Problem 10.2,{a, b) = (- 5a + 2b)u, + Ca - b)uz.Thus

F(Wl)=

(l -4 ~5I|=U=-12m' + 8h*

-4

Writing the coefficients of Ffw,), F(w2), F(w3)as columns yields

/-12 -41 -84s

V 8 24 5)

10.34. Prove Theorem 10.16.ForF: V-> (/, there exist bases for which [F] lo oJ_Suppose dim V = m and dim I/ = n. Let W be the kernel of F and U' the image of F. We are given

that rank F = r; hence the dimension of the kernel of F is m \342\200\224r. Let {w,,.... wm_,} be a basisof the kernel

of F and extend this to a basis of V:

(\302\253i \302\253,-\302\273 Vr}

Set u, = FfrJ, u2= F{v2), ...,u, =

F(vr)

We note that {u,,..., ur} is a basis of I/', the image of F. Extend this to a basis

{\302\253i \",.4+1 4.}

of U. Observe that

F(vt)= u, = lu, + 0u2 + \342\200\242\342\200\242\342\200\242+ 0ur + 0\",+1 + \342\200\242\342\200\242\342\200\242+ 0\320\270\321\217

F(v2)= u2 = 0u, + lu2 + \302\246\302\246- + 0ur + 0ur+ ,+\342\200\242\342\200\242+0un

0u2 + \342\200\242\342\200\242+ lur

=0 = 0u, + 0u2 + \342\200\242\342\200\242\342\200\242+ Ou, + 0ur+, + \342\200\242\302\246\302\246+ 0\320\270\342\200\236

^(wm_r)= 0 = 0u, + 0u2 + \342\200\242\342\200\242\342\200\242+ 0ur

Thus the matrix of F in the above baseshas the required form.

Supplementary Problems

MATRIX REPRESENTATIONSOF LINEAR OPERATORS

1035. Find the matrix representation of each of the following linear operators T on R3 relative to the usual basis:

(a) T(x,y, z) = (x, y, 0)

(b) T(x, y, z) = Bx -7y

- 42, \320\227\321\205+ \321\203+ 4z, 6x -8y + z)

(c) T(x, y, z) = (z, \321\203+ z, x + \321\203+ z)

366 MATRICES AND LINEAR MAPPINGS [CHAP. 10

10.36. Find the matrix of each operator T in Problem 10.35 with respect to the basis

S ={\302\253.

= 0. I- 0), u2 = (I, 2, 3), u3= (I, 3, 5)}

10.37. LetD be the differential operator, i.e., D(/) =dfjdt. Each of the following sets is a basis of a vector space V

of functions/: R -\302\273R. Find the matrix of D in each basis:

(a) [e', e2', te2'}, (b) {sin f, cos f}, (c) {<\302\246*',ft-51,/V}, (d) {1, f, sin 3t, cos 3f}

10.38. Consider the complex field \320\241as a vector space over the real field R. Let T be the conjugation operator on

C, that is, T(z) = z. Find the matrix of T in each basis: (a) {1, /}, and (b) {1 + /, 1 + 2i}.

10.39. Let V be the vector space of 2 x 2 matrices over R and let M = I I. Find the matrix of each of the

Vc d)

following linear operators T on V in the usual basis (seeProblem 10.18):

(a) T(A) = MA, {b) T(A) = AM, (<\342\200\242)T(A) = MA- AM

10.40. Let 1^and 0y denote the identity and zero operators, respectively, on a vector spaceV. Show that, for anybasis S of V, {a) [lv]s = /, the identity matrix, {b) [0^]s

= 0, the zero matrix.

CHANGE OF BASIS, SIMILAR MATRICES

10.41. Consider the following bases of R2: ? = {c,= A, 0), e2 = @, 1)}and S = {ut =A, 2), u2 = B, 3)}.

(a) Find the change-of-basis matrices P and Q from ? to S and from S to ?, respectively. Verify Q = P~l.

{b) Show that \320\236]?= P[d]s for any vector i; e R2.

(c) Verify that [T]s = P\" '\320\223\320\233\320\265Pfor the linear operator T(x, y)= Bx -

\320\227\321\203,\320\273+ \321\203).

10.42. Find the trace and determinant of each linear map on R3:

(a) F(x, y, z) = (x + 3y, 3x - 2z. \320\273-4>-

- 3z), {b) G(x,y, z) = (x + \321\203- z. x + 3y, 4y + 3z)

10.43. Suppose S = {u,,u2} is a basis of V and T: V -\302\273P7 is a linear operator for which T(u,) = 3u, \342\200\2242u2 and

T(u2) =u, + 4u2. Suppose S' = {w,, vv2} is a basis of F for which w, = u, + u2 and w2 = 2u, + 3u2.Find

the matrix of T in the basis S'.

10.44. Consider the bases {I, i} and {I + i, 1 + 2i\\ of the vector space \320\241over the real field R. (a) Find the

change-of-basismatrices P and Q from S to S' and from S' to S, respectively. Verify that Q = P''. (b) Show

that [T]s. =\320\240~1[\320\242\320\252\320\240for the conjugate operator \320\223in Problem 10.38.

DIAGONALIZATION OF LINEAR OPERATORS,EIGENVALUES AND EIGENVECTORS

10.45. Suppose v is an eigenvector of operators T, and T2. Show that v is also an eigenvector of the operator

eTj + bT2 where a and fc are any scalars.

10.46. Suppose v is an eigenvector of an operator T belonging to the eigenvalue \320\257.Show that for n > 0, v is also

an eigenvector of T\" belonging to X\".

10.47. Suppose \320\257is an eigenvalue of an operator T and f(t) is a polynomial. Show that /(/.) is an eigenvalue of

mi.

10.48. Show that a linear operator T is diagonalizable if and only if its minimum polynomial is a product of

distinct linear factors.

CHAP. 10] MATRICES AND LINEAR MAPPINGS 367

10.49. Let L and T be linear operators such that LT = TL. Let A be an eigenvalue of T and let W be its

eigenspace.Show that W is invariant under L, i.e., U.W) \320\257W.

MATRIX REPRESENTATIONS OF LINEAR MAPPINGS

10.50.Find the matrix representation relative to the usual bases for R\" of the linear mapping F: R3 -> R2

defined by F{x, y, z) = Bx -Ay + 9z, 5x + 3y

- 2r).

10.51. Let F : R3 -> R2 be the linear mapping defined by F(x, y, z) = Bx + \321\203- z, 3x -

2y + 4z). Find the matrix

of F in the following bases of R3 and R2:

S={Wl= A, 1, 1), w2 = A, I, 0), w3 = A, 0, 0)}

Verify that, for any vector v e R3, [F]f[i;]s = [F(t')]s..

S'= {\320\270,= A, 3), v2 =

A, 4)}

10.52. Let S and S' be bases of V, and let 1,, be the identity mapping on V. Show that the matrix of 1^ relative to

the bases S and S' is the inverse of the change-of-basis matrix P from S to S'; that is, [li/]f = P~'.

10.53. ProveTheorem 10.12.

10.54. Prove Theorem 10.13.

10l55. Prove Theorem 10.15.

Answers to Supplementary Problems

1\\

10.35. (\320\276)I 0

\\o

10.39. (o)

0

3\320\233

10.36. (a) |0 -5 -10,3 6/

>1 0 0\\

1037. (a) 10 2 1 ,

\\0 0 2/

- \302\246'-

a

0

\321\201

0

1

2

0

a

0\321\201

b

0

0

2\\

0

(b) -

(\320\254)

51 104

191 -351 ,116 208

@

(c)

0

\321\201

0

\342\200\224\321\201

a-d

0

\321\201

b

0

-b

0

b

\342\200\224c

0

368 MATRICES AND LINEAR MAPPINGS [CHAP. 10

10.42. (a) -2. 13,r3+ 2r2-20r-13; (b) 7, 2, f3 - It2 + 14r - 2

(\320\233 -0

G \"J -')

-\302\246'.: .\" .;

Chapter 11

Canonical Forms

11.1 INTRODUCTION

Let \320\223be a linear operator on a vector space of finite dimension. As seen in Chapter 10, T may nothave a diagonal matrix representation. However, it is still possible to \"simplify\" the matrix representa-representationof T in a number of ways. This is the main topic of this chapter. In particular, we obtain the

primary decomposition theorem, and the triangular, Jordan and rational canonical forms.

We comment that the triangular and Jordan canonical forms exist for T if and only if the character-characteristicpolynomial \320\224(\320\263)of T has all its roots in the base field K. This is always true if \320\232is the complexfield \320\241but may not be true if \320\232is the real field R.

We also introduce the idea of a quotient space. This is a very powerful tool and will be used in the

proof of the existence of the triangular and rational canonical forms.

11.2 TRIANGULAR FORM

Let T be a linear operator on an n-dimensional vector space V. Suppose T can be represented by

the triangular matrix

i a12 ...aln\\

\302\25322 \302\246\342\200\242\342\200\242\302\2532-

Then the characteristic polynomial of T,

-a22) \342\200\242\342\200\242(\302\253-a

is a product of linear factors. The converse is also true and is an important theorem, namely (seeProblem 11.28for the proof),

Theorem 11.1: Let T : \320\232-\302\273\320\232be a linear operator whose characteristic polynomial factors into linear

polynomials. Then there existsa basis of V in which T is represented by a triangular

matrix.

Theorem 11.1 (Alternate Form): Let A be a square matrix whose characteristic polynomial factors

into linear polynomials. Then A is similar to a triangular matrix, i.e.,

there exists an invertible matrix P such that P~lAP is triangular.

We say that an operator T can be brought into triangular form if it can be represented by a

triangular matrix. Note that in this case the eigenvalues of T are preciselythoseentries appearing on

the main diagonal. We give an application of this remark.

369

370 CANONICAL FORMS [CHAP. 11

Example 11.1. Let A be a square matrix over the complex field C. Suppose \320\272is an eigenvalue of A1. Show that

sj). or \342\200\224y/X is an eigenvalue of A.

We know by Theorem 11.1 that A is similar to a triangular matrix

(f*i

\320\262=

Hence A2 is similar to the matrix

A

Since similar matrices have the same eigenvalues, \320\257= fif for some i. Hence \320\264-=

-^/A or ft = \342\200\224yjX is an eigenvalue

of A.

11.3 IN VARIANCE

Let T : V \342\200\224>V be linear. A subspace W of V is said to be invariant under T or T-invariant if T mapsW into itself, i.e., if v e W implies T(v) e W. In this case T restricted to W defines a linear operator onW; that is. T induces a linear operatorT: W -> W defined by T[w) = T(w) for every w e W.

Example 11.2

(a) Let T : R3 -> R3 be the linear operator which rotates each vector about the z axis by an angle 0 (Fig. 11-1):

T(x, v, z) = (x cos 0 \342\200\224\321\203sin \320\262,x sin 0 + \321\203cos 0, z)

Observe that each vector w = {a, b, 0) in the xy plane W remains in W under the mapping \320\223,hence, W is

7\"-invariant. Observe also that the z axis U is invariant under 7\". Furthermore, the restriction of T to W

rotates each vector about the origin O, and the restriction of T to U is the identity mapping of V.

V

T(v)

Fig. I1-I

(b) Nonzero eigenvectors of a linear operator T :V -\302\273V may be characterized as generators of T-invariant

1-dimensional subspaces. For suppose T(v)= Xi\\ v \321\2040. Then W ={kv, k e K), the 1-dimensionalsubspace

generated by v, is invariant under T because

T(kv)= kT(v) =

k{Xv)= fcAu e W

Conversely, suppose dim V = I and u/0 spans I/, and U is invariant under T. Then \320\223(\320\270)\320\261f7 and so

T{u) is a multiple of u, i.e., T(u) = \321\206\320\270.Hence u is an eigenvector of T.

The next theorem, proved in Problem 11.3,givesus an important class of invariant subspaces.

CHAP. 11] CANONICAL FORMS 371

Theorem 11.2: Let T : V -\302\273V be any linear operator, and Iet/(t) be any polynomial. Then the kernel

of/(T) is invariant under T.

The notion of invarianceis related to matrix representations (Problem 11.5)as follows.

Theorem 11.3: Suppose W is an invariant subspace of T : V \342\200\224>V. Then T has a block matrix repre-

representation I I where \320\233is a matrix representation of the restriction t of T to W.

11.4 INVARIANT DIRECT-SUM DECOMPOSITIONS

A vector space V is termed the direct sum of subspaces Wu ..., Wr, written

V = W, \302\251W2 \321\204\342\200\242\342\200\242\342\200\242

\302\251Wr

if every vector v e V can be written uniquely in the form

v = w, + w2 + \302\246\342\200\242\342\200\242+ wr with W; e Wt

The following theorem, proved in Problem 11.7, applies.

Theorem 11.4: Suppose Wx,..., Wr are subspaces of V, and suppose

{w,,, ..., vvlni}, ..., {wri,...,wnr}

are bases of Wl7 Wr, respectively. Then V is the direct sum of the Wt if and only if

the union \320\222= {wllr..., wlni,..., wrl,..., wrnr}is a basis of V.

Now suppose T: V \342\200\224*V is linear and V is the direct sum of (nonzero) \320\223-invariant subspaces

Wlt...,Wr:

V =Wl@---\302\256Wr and TiWJcWt i=l,...,rLet Tt denote the restriction of T to Wt. Then T is said to be decomposableinto the operators Tt or T issaid to be the direct sum of the Tt, written T = T, \302\251

\342\200\242- \342\200\242\302\251TT. Also, the subspaces Wu ..., Wr are said to

reduceT or to form a T-invariant direct-sum decomposition of V.

Consider the special case where two subspaces U and W reduce an operator T: V -*V; say,

dim U \342\200\2242 and dim W = 3 and suppose{\320\270,,u2} and {w,, w2, w3} are bases of U and W, respectively.

If T, and T2 denote the restrictionsof T to U and IV, respectively, then

\320\2421(\320\2701)=

\320\26011\321\211+al2u2

\320\242,(\320\270\320\263)=

\320\260\320\263,\320\270,4-

\342\200\224I 0|2 u22 ^32

\320\25413 fe23 fe33

Hence /l = (au fl21) and\\al2 fl22/

are matrix representations of T, and T2, respectively. By Theorem 11.4, {\320\270,,\320\2702,w1( w2, w3} is a basisof V. Since T(\302\253()

=\320\242,(\320\270()and r(Wj)

= T2(w,), the matrix of T in this basis is the block diagonal

matnxU \320\262}

A generalization of the above argument gives us the following theorem.

Theorem 11.5: SupposeT: V -\302\273V is linear and V is the direct sum of T-invariant subspaces,say

Wl, ..., Wr. If A-, is a matrix representation of the restriction of \320\223to \320\230;,then T can be

372 CANONICALFORMS [CHAP. 11

represented by the block diagonal matrix

(Ax

0 ... 00 A2 ... 0

0 0 ... Ar

The block diagonal matrix M with diagonal entries A 1(..., Ar is sometimes called the direct sum ofthe matrices Ax,..., Ar and denoted by M = /Ij \302\251

\342\200\242\342\200\242\342\200\242\302\251Ar.

11.5 PRIMARY DECOMPOSITION

The following theorem showsthat any operator T : V -\302\273V is decomposable into operators whoseminimum polynomials are powers of irreducible polynomials.This is the first step in obtaining a canon-canonicalform for T.

Primary Decomposition Theorem 11.6: Let T : V -> V be a linear operator with minimum polynomial

where the f,{t) are distinct monic irreducible polynomials.Then V is the direct sum of T-invariant subspacesWu ..., Wr

where Wk is the kernel of \320\224\320\242)\321\202.Moreover, //tI\" is theminimum polynomial of the restriction of T to W{.

Since the polynomials ffof' are relatively prime, the above fundamental result follows (Problem11.11)from the next two theorems.

Theorem 11.7: SupposeT : V -\302\273V is linear, and suppose /(t) =g{t)h(t) are polynomials such that

f{T) = 0 and g{t) and h(t) are relatively prime. Then V is the direct sum of the

T-invariant subspaces U and W, where U = Kerg(T)and W = Ker h{T).

Theorem 11.8: In Theorem 11.7,if/(t) is the minimum polynomial of T [and g(t) and h(t) are monic],then g(t) and h(t) are the minimum polynomials of the restrictions of T to U and W,

respectively.

We will also use the primary decomposition theorem to prove the following useful characterization

of diagonalizable operators (see Problem 11.12for the proof).

Theorem 11.9: A linear operator \320\223:V -> V is diagonalizable if and only if its minimum polynomialm(t) is a product of distinct linear polynomials.

Theorem 11.9 (Alternate Form): A matrix A is similar to a diagonal matrix if and only if its minimum

polynomial is a product of distinct linear polynomials.

Example 11.3. SupposeA ^ I is a square matrix for which A3 = /. Determine whether or not A is similar to a

diagonal matrix if A is a matrix over (i) the real field R, (ii) the complex field \320\241

Since A3 = I, A is a zero of the polynomial fit) = t3 \342\200\2241 = (t \342\200\224IX'2 + t + 1).The minimum polynomial nit)

of A cannot be t \342\200\2241, since A ^ I. Hence

m{t)= t2 + t + 1 or m(f) = r3 - I

Since neither polynomial is a product of linear polynomials over R, A is not diagonalizable over R. On the other

hand, each of the polynomials is a product of distinct linear polynomials over C. HenceA is diagonalizable over C.

CHAP. 11] CANONICAL FORMS 373

11.6 NILPOTENT OPERATORS

A linear operator T: V -> V is termed nilpotent if T\" = 0 for some positive integer n; we call \320\272the

index of nilpotency of T if Tk = 0 but Tk ' ^ 0. Analogously, a square matrix /1 is termed nilpotent if

A\" = 0 for some positive integer n, and of index \320\272if /1* = 0 but \320\233*\"'^0. Clearly the minimum poly-polynomial of a nilpotent operator (matrix) of index \320\272is m(t) = tk; hence 0 is its only eigenvalue.

The fundamental result on nilpotent operators follows.

Theorem 11.10:

/o0

0

\\o

1

0

0

0

01

00

... 0

... 0

... 0

... 0

\302\260\\0

1

0/

Let T : V -\302\273V be a nilpotent operator of index k. Then T has a block diagonal matrix

representation whose diagonal entries are of the form

N =

(i.e., all entries of N are 0 exceptthose just above the main diagonal, which are I).

There is at least one /V of order \320\272and all other N are of orders < k.The number of N

of each possible order is uniquely determined by T. Moreover, the total number of Nof all orders is equal to the nullity of T.

In the proof of the above theorem (Problem 11.16), we shall show that the number of N of order i is

equal to 2nt, \342\200\224mi +,

\342\200\224m, ,, where m, is the nullity of T'.

We remark that the above matrix N is itself nilpotent and that its index of nilpotency is equal to itsorder(Problem11.13).Note that the matrix N of order 1 isjust the 1 \320\2721 zero matrix @).

11.7 JORDAN CANONICALFORM

An operator T can be put into Jordan canonical form if its characteristic and minimum poly-polynomials factor into linear polynomials. This is always true if \320\232is the complex field C. In any case, wecan always extend the base field \320\232to a field in which the characteristic and minimum polynomials do

factor into linear factors; thus in a broad sense every operator has a Jordan canonicalform. Analo-

Analogously, every matrix is similar to a matrix in Jordan canonical form.

Theorem 11.11: Let T : K->V be a linear operator whose characteristic and minimum polynomials

are, respectively,

\320\224(\320\263)= (,-/,)-\302\246\342\200\242(,-/,)\"' and m(t) =

(t-/,)\342\200\242\"\342\200\242\302\246\302\246\302\246(t-

;.\320\223\320\223

where the X{ are distinct scalars. Then T has a blockdiagonal matrix representation J

whose diagonal entries are of the form

/A,- 1 0 ... 0 0\\

0 A, 1 ... 0 0

0 00

00 0

1

For each Af the corresponding blocks Ji} have the following properties:

(i) There is at least one J{iof order \321\202{;all other Ji} are of order < m{.(ii) The sum of the orders of the Ju is nt.

(iii) The number of Ji}equals the geometric multiplicity of A,.

(iv) The number of Jtj of each possible order is uniquely determined by T.

374 CANONICAL FORMS [CHAP. 11

The matrix J appearing in the above theorem is calledthe Jordan canonical form of the operator T.A diagonal block Ju is called a Jordan blockbelonging to the eigenvalue A,. Observe that

/A,- 1 0 ... 00\\ /A,-

0 ... 0 0\\/0

10 ... 00\\

0 1 0 0

0

\\o

\320\276

\320\276

1

hi

0 L 0 0

0 00

\320\276

0

\320\276

0 0 1 \320\276\320\276

0 0

\\o \320\276

0 1

0 0/

That is,

where N is the nilpotent block appearing in Theorem 11.10. In fact, we will prove Theorem 11.11

(Problem 11.18) by showing that T can be decomposed into operators, each the sum of a scalar oper-operator and a nilpotent operator.

Example 11.4. Supposethe characteristic and minimum polynomials of an operator T are, respectively,

\320\224@= (r - 2L(t - 3K and m(f) = (f -

2J(f- 3J

Then the Jordan canonical form of T is oneof the following matrices:

/20

12

2 \320\223

0 2\342\200\224\342\200\224

3

0

1

3

\\

!\320\267/

/2

\321\202\321\203

-\320\223\320\267'\320\242;

i.\302\260._U.

\320\223\320\267/

The first matrix occurs if T has two independent eigenvectorsbelonging to the eigenvalue 2; and the second matrix

occurs if T has three independent eigenvectorsbelonging to 2.

11.8 CYCLIC SUBSPACES

Let T be a linear operator on a vector space V of finite dimension over K. Suppose v e V and

v \321\2040. The set of all vectors of the form/(TXt>), where/(r) ranges over all polynomials over K, is aT-invariant subspace of V called the T-cyclic subspaceof V generated by v; we denote it by Z(v, T) anddenote the restriction of T to Z(v, T) by Tv. By Problem 11.56, we could equivalently define Z{v, T) asthe intersection of all T-invariant subspaces of V containing v.

Now consider the sequence

v, T(v),T2(v), T\\v),...

of powers of T acting on v. Let \320\272be the least integer such that Tk(v) is a linear combination of thosevectors which precede it in the sequence; say,

TV) = - \342\200\242\302\246\302\246-aiT(v)

- aovThen +

is the unique monic polynomial of lowestdegreefor which mv(T)(v) = 0. We call mv(t) the T-annihilator

ofv and Z(v, T).The following theorem (proved in Problem 11.29)applies.

Theorem 11.12: Let Z(v, T), \320\242\342\200\236,and mj(t) be defined as above.Then:

(i) The set {v, T(v), ...,Tk~ V)} is a basisof Z{v, T); hence dim Z(v, T) = k.

CHAP. 11] CANONICAL FORMS 375

(ii) The minimum polynomial of Tv is mj(t).

(iii) The matrix representation of Tv in the above basis is just the companion matrix

(Example 8.12) of mv{t):

/0 0 0 ... 0 -a0 \\

1 0 0 ... 0 -a,0 1 0 ... 0 -a2

0 0 0

\\o\320\276\320\276

0 -at_

1 -**-

11.9 RATIONAL CANONICAL FORM

In this section we present the rational canonical form for a linear operator T : V -\302\273V. We empha-emphasizethat this form exists even when the minimum polynomial cannot be factored into linear poly-polynomials. (Recall that this is not the case for the Jordan canonical form.)

Lemma 11.13: Let T : V -* V be a linear operator whoseminimum polynomial is/(t)\" where/(t) is amonic irreduciblepolynomial. Then V is the direct sum

V = Z(vlt T)@---@Z(vr,T)

of T-cyclic subspacesZ(vt, T) with corresponding T-annihilators

Any other decomposition of V into T-cyclic subspaces has the same number of com-components and the same set of \320\223-annihilators.

We emphasize that the above lemma, proved in Problem 11.31, does not say that the vectors t>,-

or the T-cyclic subspaces Z{v,, T) are uniquely determined by T; but it does say that the set of\"\320\223-annihilators is uniquely determined by T. Thus T has a unique matrix representation

\\ \321\201,

where the Ct are companion matrices. In fact, the C, are the companion matrices of the polynomials

Using the primary decomposition theorem and Lemma 11.13, we obtain the following fundamental

result.

Theorem 11.14: Let T : V -\302\273V be a linear operator with minimum polynomial

376 CANONICAL FORMS [CHAP.11

where the \320\243\320\245\320\236are distinct monic irreducible polynomials. Then T has a unique block

diagonal matrix representation

c,,

\\ I

where the Ci} are companion matrices.In particular, the Cy are the companionmatrices of the polynomials /j(t)\"iJ where

ml = l2 ns2

The above matrix representation of T is calledits rational canonical form. The polynomials/,(t)\"iJarecalledthe elementary divisors of T.

Example 11.5. Let V be a vector space of dimension 6 over R, and let T be a linear operator whose minimum

polynomial is m(f)= (t2 \342\200\224t + 3)(f

\342\200\2242J. Then the rational canonical form of T is one of the following direct sums

of companionmatrices:

(i) C(t2- t + 3) \302\251C(t2

- t + 3) \302\251C((t- 2J)

(ii) C(t2-i + 3)\302\256C((t- 2J) \302\251C((i

- 2J)

(iii) C(r2- t + 3) 0 C((r- 2J)\302\251C(t- 2) \302\251C(t

- 2)

where C(/(r)) is the companion matrix of/(r); that is,

/\302\260

\\

-3

1

10

1 1

0

1

\\

-4

4/

M

\\

-31

0

11

-4

4. 1

0

1

\\1

\\

-3

1

01

\320\273

4

2 ;

\\

J

(i) (iii)

11.10 QUOTIENT SPACES

Let \320\232be a vector space over a field \320\232and let Wbea subspace of V. If \320\270is any vector in V, wewrite v + W for the set of sumsv + w with w e W:

v + w ={v + w : w e W}

These setsare called the cosets of W in V. We show (Problem 11.22) that these cosets partition V into

mutually disjoint subsets.

Example 11.6. Let W be the subspace of R2 defined by

W = {(a,b):a =b}

That is, W is the line given by the equation x \342\200\224\321\203

= 0. We can view v + W as a translation of the line, obtained by

adding the vector v to each point in W. As shown in Fig. 11-2, v + W is also a line and is parallel to W. Thus the

cosets of W in R2 are precisely all the lines parallel to W.

CHAP. 11] CANONICAL FORMS 377

Fig. 11-2

In the following theorem we use the cosets of a subspace W of a vector space V to define a newvector space;it is called the quotient space of V by W and is denoted by

Theorem 11.15: Let ffbea subspace of a vector space over a field K. Then the cosets of W in V form

a vector space over \320\232with the following operations of addition and scalarmultiplica-

multiplication:

(i) (u + W) + (v+ W)= (u + v) + W

(ii) k{u + W) = ku + W, where \320\272\320\265\320\232.

We note that, in the proof of Theorem 11.15 (Problem 11.24), it is first necessary to show that the

operations are well defined; that is, whenever \320\270+ W = u' + W and v + W = v' + W, then

(i) (u + v) + W = (u1 + v') + W and (ii) ku + W = ku' + W, for any A: e \320\232

In the case of an invariant subspace, we have the following useful result, proved in Problem 11.27.

Theorem 11.16: Suppose W is a subspace invariant under a linear operator T: V -* V. Then Tinduces a linear operator f on V/W defined by T(v + W) = T(v) + W. Moreover, if

T is a zero of any polynomial, then so is T. Thus the minimum polynomial of Tdivides the minimum polynomial of T.

Solved Problems

INVARIANT SUBSPACES

11.1. Suppose T : V -\302\273V is linear. Show that each of the following is invariant under T: (a) {0}, (b) V,

(c) kernel of T, and (d) image of T.

(a) We have T@)= 0 e {0};hence {0} is invariant under T.

{b) For every v e V, T(v) e V; hence V is invariant under T.

(c) Let \320\270e Ker T. Then T(u)= 0 e Ker T since the kernel of T is a subspace of V. Thus Ker T is invari-

invariantunder T.

(d) Since T{v)elm \320\223for every t> e V, it is certainly true when v e Im T. Hence the image of T is invari-

invariantunder T.

378 CANONICAL FORMS [CHAP. 11

11.2. Suppose {Wi} is a collection of T-invariant subspacesof a vector space V. Show that the inter-

intersection W = f]f Wj is also T-invariant.

Suppose v e W; then v e Wt for every i. Since Wt is T-invariant, T(v) e Wt for every i. Thus T(v) e W

and so W is T-invariant.

11.3. Prove Theorem 11.2. Ker/(T) is invariant under T.

Suppose v e Ker/(T), i.e.,/(T)(f) = 0. We need to show that T(v) also belongs to the kernel of/(T),

i.e.,/(T)(T(iO) = (f(T) \320\276TW = 0. Since/(r)f = tf(t), we have/(T) =T = T -f{T).Thus

(/(T) \320\276\320\223\320\245\342\200\236,=

(\320\223\302\253/(\320\233\320\245\302\273)= T(/(TKd)) = \320\242@)= 0

as required.

\302\246\320\241:0-11.4. Find all invariant subspaces of A = I I viewed as an operator on R .

By Problem 11.1, R2 and {0}are invariant under A. Now if A has any other invariant subspace, it mustbe 1-dimensional.However, the characteristic polynomial of A is

f-2 5

-I t + 2

HenceA has no eigenvalues (in R) and so A has no eigenvectors.But the 1-dimensional invariant subspacescorrespond to the eigenvectors; thus R2 and {0}are the only subspaces invariant under A.

11.5. Prove Theorem 11.3.

Wechoosea basis{tv,,..., tvr} of W and extend it to a basis {tv,,..., tvr, u,, ..., vs} of V. We have

T(v/i)= T(tv1)= a11tv1 4 + alrw,

T(w2) =T(w2)

= a2lwl + \302\246\302\246\302\246+ alrwr

T(tvr) = T(tvr) =arltv, +\302\246\302\246+ arrwr

+ \" \302\246\"+ b ir tvr

i +\342\200\242\302\246\342\200\242+c2sv.

T(v,) =bslw, + \302\246\302\246\302\246+ bsr wr + cslvl + \302\246\302\246\302\246+ css vs

But the matrix of T in this basis is the transpose of the matrix of coefficients in the above system of

equations. (See page 344.) Therefore it has the form I 1 where A is the transpose of the matrix of

coefficients for the obvious subsystem. By the same argument, A is the matrix of t relative to the basis {tv,-}

of W.

11.6. Let t denote the restriction of an operator T to an invariant subspace W, i.e., t(w) =T(w) for

every w e W. Prove:

(i) For any polynomial/(t),/(TXw) =/(T)(w).

(ii) The minimum polynomial of T divides the minimum polynomial of T.

(i) If/@ = 0 or if/(') is a constant, i.e., of degree 1, then the result clearly holds.

Assume deg/= n > 1 and that the result holds for polynomials of degree lessthan n. Suppose that

CHAP. 11] CANONICAL FORMS 379

Then f(f)(w) =(\320\260\342\200\236f\" + \320\260\342\200\236_, f\"~ '+\342\200\242\302\246\342\200\242+a0 tyw)

= (e. f\"\" ')(f(w)) + (\321\206,_,f\"\"' + \302\246\342\200\242\302\246+ e0/K=

(\320\262.\320\223\"-'Kr(w)) + K^T\"\021 + - \342\200\242\342\200\242+ \321\217\320\2760\320\230

(ii) Let m(t) denote the minimum polynomial of T. Then by (i), tn(T)(w) = m(T)(w) =0(\320\270>)

= 0 for everyw e W; that is, \320\223is a zero of the polynomial m(r). Hence the minimum polynomial of t divides m(r).

INVARIANT DIRECT-SUM DECOMPOSITIONS

11.7. Prove Theorem 11.4.

Suppose \320\222is a basis of V. Then, for any v e V,

v = auw,i +\302\246\302\246\302\246+ almwlni + \302\246\302\246\302\246+ arlwrl + \302\246\302\246\302\246+ \320\260,\342\200\236\320\263wrKt

= tv, + tv2 +\302\246\302\246\342\200\242

+ wr

where tv,= atlwn + \302\246\302\246\302\246+ a^w^ e Wt. We next show that such a sum is unique. Suppose

v =w\\ + w'2 + \342\200\242\342\200\242\342\200\242-t- w'r where w\\ e Wt

Since {v/,i,..., wini] is a basis of \320\230^-,w| = bnwn + \302\246\342\200\242\342\200\242+\342\200\242biniwini and so

v = bllw11 +\302\246\342\200\242+*>,\342\200\236,wlBi+ \302\246\302\246\342\200\242

+brlwrl + \342\200\242\302\246\342\200\242+b,nrwrBr

Since \320\222is a basis of V, ai}\342\200\224

\320\254\320\270,for each i and each j. Hencew,

=wj- and so the sum for v is unique.

Accordingly, V is the direct sum of the Wt.

Conversely, suppose V is the direct sum of the \320\251.Then for any v e V, v = w, + \302\246\342\200\242\302\246+ wr where

w, e \320\230^.Since {wiJt} is a basis of Wt, each tv, is a linear combination of thew(J.. and so v is a linear

combination of the elements of B.Thus \320\222spans V. We now show that \320\222is linearly independent. Suppose

\302\253u^ii +\342\200\242\302\246\302\246+fli,,\302\253'i\302\273,+ \342\200\242\302\246\302\246+ fl,iVf,i + \342\200\242\302\246\342\200\242+arn,W\342\204\242r

= O

Note that \302\253\342\200\236w,,+ \302\246\302\246+ aintwini eHS.We also have that 0 = 0 + 0 + ---+0 where 0 e Wt. Since such a

sum for 0 is unique,

+ \342\200\242\342\200\242'+ ainiwin. = 0 for i = 1, .. , r

The independenceof the bases {wiM} imply that all the a's are0.Thus \320\222is linearly independent and hence isa basisof V.

11.8. Suppose T : \320\243-* \320\243is linear and suppose T = TtQT2 with respect to a \320\223-invariant direct-sum

decomposition V = V \302\256W. Show that:

(a) m(t) is the least common multiple of m,(t) and m2(i) where nit), m,(t), and m2(t) are the

minimum polynomials of T, Tj, and T2, respectively.

(b) A(f) = A,(r) A2(r), where A(f), A,(t) and A2(t) are the characteristic polynomials of T, Tt and

T2, respectively.

(a) By Problem 11.6, each of m,(t) and m2(t) divides nit). Now suppose/(t) is a multiple of both m^i) and

m2(f); then /(\320\242,\320\2451/)= 0 and/(T2)(^) = 0. Let teK; then v = \320\270+ w with ueU and weW. Now

f(T)v =/(T)u +/(T)tv =/G> +/(T2)w= 0+0 = 0

That is, \320\223is a zero of/(r). Hencem{t) divides/(f), and so m(t) is the least common multiple of m,(r) and

m2(f).

(b) By Theorem 11.5,T has a matrix representation M = 1 1 where A and \320\222are matrix representa-\\\302\260BJ

tions of T, and \320\2232,respectively. Then,

tl - A 0A(r) = I tl \342\200\224MI = = | tl \302\246

0 tl \342\200\224\320\222

as required.

380 CANONICAL FORMS [CHAP. 11

11.9. Prove Theorem 11.7.

Note first that U and W are \320\223-invariant by Theorem 11.2. Now since g(t) and h(t) are relatively prime,there exist polynomials r[t) and s{t) such that

s{t)h(t) = 1

Hence, for the operator T, r[TMT) + s{T)h(T)= I (*)

Let v 6 K; then by (*), v = riT)g(TYv) + \320\251\320\242\320\250\320\234

But the first term in this sum belongs to W = Ker h(T). In fact, since the compositionof polynomials in T is

commutative,

h(T)r(TMT)(v)= r{T)g(T)h(T)(v)= r[T\\f(TM =

\320\263[\320\242\320\251= \320\236

Similarly, the second term belongs to U. HenceV is the sum of U and IV.

To prove that V = U \302\251W, we must show that a sum v = \320\270+ w with \320\270e U, w e W, is uniquely

determined by v. Applying the operator r{T)g(T)to v = \320\270+ w and using g{T)(u)= 0,we obtain

= r(T)g(T)(u) + r{T)g(T)(W)=

Also, applying (*) to w alone and using h(T)(w) = 0,we obtain

w =r{T)g(T\342\204\226 + s(r)MTKw) = r(T)s,(TXw)

The last two equations together give tv = r{T)g(T)(v), and so w is uniquely determined by v. Similarly, \320\270is

uniquely determined by v. Hence V = V \302\251W, as required.

11.10. Prove Theorem 11.8:In Theorem11.7(Problem 11.9), if f(t) is the minimum polynomial of T

[and g(t) and h(t) are monic], then g(t) is the minimum polynomial of the restriction T, of T toG and \320\233(\320\263)is the minimum polynomial of the restriction t2 of T to W.

Let m,(r) and m2(r) be the minimum polynomials of T, and T2, respectively. Note that gifj) = 0 and

Ht2) = 0 because U = Ker g(T) and W = Ker \320\233(\320\223).Thus

m,(r) divides ^@ and m2(r) divides h(t) (I)

By Problem 11.9, /(r) is the least common multiple of mi(r) and m2(r). But rtii(t) and m2(r) are relatively

prime since g(t) and h(t) are relatively prime. Accordingly, f(t) = m,(r)m2@- We also have that/(r) = g(t)h(t).These two equations together with {1) and the fact that all the polynomials are monic, imply that

g(t) = m,(r) and h(t) =m2(r), as required.

11.11. Prove the Primary DecompositionTheorem 11.6.

The proof is by induction on r. The case r = 1 is trivial. Suppose that the theorem has been proved for

r \342\200\2241. By Theorem 11.7 we can write \320\243as the direct sum of \320\223-invariant subspaces Wt and K, where Wy is

the kernel of/^T)\021and where Vx is the kernel of/2(T)\022\342\200\242\342\200\242\302\246/\320\224\320\242)\"'.By Theorem 11.8, the minimum poly-

polynomials of the restrictions of T to W, and Vt are, respectively, /,(()\"' and /2(t)\022\342\200\242\342\200\242\302\246

/,(')\"'\302\246

Denote the restriction of T to K, by %. By the inductive hypothesis, K, is the direct sum of subspacesW2, ..., Wt such that \320\251is the kernel of^T!)\021and such that ftt)\"' is the minimum polynomial for therestriction of 7\\ to \320\251.But the kernel of/{\320\223)\"', for i = 2 , r is necessarily contained in Vy since fffl1

divides/2(t)\"z -/\320\224\320\236\"'Thus the kernel of/<T)\"' is the same as the kernel of/^T,)\"'which is Wt. Also, therestriction of T to Wt is the same as the restriction of tt to Wt (for i = 2,..., r); hencef&tf1 is also theminimum polynomial for the restriction of T to \320\230^.Thus \320\232= W, \302\251W2 \302\251

\302\246\342\200\242\302\246\302\251Wr is the desired decompo-

decompositionof T.

11.12. Prove Theorem 11.9.T is diagonalizable iff m(t) is a product of distinct linear factors.

Suppose tn{t) is a product of distinct linear polynomials; say,

CHAP. 11] CANONICAL FORMS 381

where the Af are distinct scalars. By Theorem 11.6, \320\243is the direct sum of subspacesWu..., Wr where

\320\251= Ker {T -

A;/). Thus if v e W-, then (\320\223- ktl)(v) = 0 or T{v)

= A,u. In other words, every vector in Wt

is an eigenvector belonging to the eigenvalue A,-. By Theorem 11.4, the union of bases for Wl3..., Wr is a

basis of K. This basisconsistsof eigenvectors and so T is diagonalizable.Conversely,supposeT is diagonalizable, i.e., V has a basis consisting of eigenvectors of T. Let

A,,..., A, be the distinct eigenvalues of \320\223.Then the operator

f(T) =(\320\223

-\320\273./\320\245\320\223

- A2/) \342\200\242\342\200\242\342\200\242IT - As/)

maps each basis vector into 0. Thus /(T) = 0 and hence the minimum polynomial nit) of T divides the

polynomial

/(() = (r -A.X*

-A2) \342\200\242\342\200\242\302\246(/-/,)

Accordingly, m(f) is a product of distinct linear polynomials.

NILPOTENT OPERATORS, JORDAN CANONICAL FORM

11.13. Let T : V -* V be linear. Suppose, for v e \320\243,Tk(v)= 0 but 7* \\v) / 0. Prove:

(a) The set S \342\200\224{v, T(v), ..., Tk l(v)} is linearly independent.

(b) The subspace W generated by S is \"T-invariant.

(c) The restriction t of T to W is nilpotent of indexk.

(d) Relative to the basis {Tk~

\\v) , T(v), v} of W, the matrix of T is of the form

/0 1 0 ... 0 0\\

0 0 1 ... 0 0

\320\2360 0 ... \320\2361

\\\320\276\320\276\320\276... \320\276\320\276/

Hence the above fc-square matrix is nilpotent of index k.

(a) Suppose

av + atT{v) + a2 T\\v) + \342\200\242\302\246\342\200\242+ a,,.,^\021^) = 0 (*)

Applying \320\223*' to (*) and using Tk{v)

= 0, we obtain aTk~\\v)= 0; since Tk~ l(v) ^0, a = 0. Now

applying T* 2 to (*) and using T*(v) = 0 and a = 0, we find a,^\" '(v) = 0; hence flj= 0. Next apply-

applyingT*~3 to (*) and using Tk{v)= 0 and a = a, = 0,we obtain a2 Tk~'(\") = 0; hencea2

= 0. Contin-

Continuingthis process, we find that all the as are 0; hence S is independent.

{b) Let v e W. Then

v = bv + blT(v) + b2T2lv)+ 1- bk_tTk~l(v)

Using T\\v) = 0, we have that

T(v) = bT(v) + b,T2(v)+ \302\246\302\246\302\246+ bk_2 Tk~ l{v) e W

Thus W is \320\223-invariant.

(t1) By hypothesis, 1*(v) = 0. Hence, for i = 0, ..., \320\272- 1,

That is. applying 74 to each generator of W, we obtain 0: hence 7* = 0 and so T is nilpotent of index

at most k. On the other hand, T* '(r) = T* '(i') \320\2440; hence T is nilpotent of index exactly k.

382 CANONICALFORMS [CHAP. 11

(d) For the basis {Tk'l(v), 7*-2(t>) 7\302\273,v} of W,

f(Tk l(v))= r\"(f)= O

\320\246\320\242*2(p)) = 1*-l{v)

t(Tk-3(v)) = 7* 2(t>)

T(T(v)) = T*(v)

7\302\273= T(v)

Hence the matrix of T in this basis is

/0 1 0 ... 00\\

0 0 1 ... 0 0

0 0 0 ... 0 1\\o \320\276\320\276... \320\276o/

11.14. Let T: \320\230->V be linear. Let U = Ker T and W = Ker Ti+l. Show that (a) U <= W, and

(b) T(W) <= U.

(a) Suppose uel/ = Kerf. Then T'(u) = 0 and so Ti+l(u)= T{T\\u)) = T{0)= 0. Thus

u g Ker Ti+' = W. But this is true for every \320\270\320\265U; hence [/glf.

(ft) Similarly, if weW=KerTi+\\ then \320\223+1(\320\270>)= 0. Thus f+1(w)= \320\223(\320\222\320\224

=\320\2370)

= 0 and so

T(W) \321\201l/.

11.15. Let \320\223: V - \320\232be linear. Let X = Ker T 2, \320\243= Ker \320\223l, and Z = Ker Tj. By Problem 11.14,

X \321\201\321\203\321\201z. Suppose

{\"l \"r! {\"l. \342\200\242\342\200\242\342\200\242.\"r. ^1 Vs} {\"l \"r. l'l> \302\246\302\246-.\"S VV, W,}

are bases of X, Y, and Z, respectively. Show that

S= {\320\274\342\200\236...,\320\275\320\263,r(w,),..., T(w()}

is contained in \320\243and is linearly independent.

By Problem 11.14, T(Z) e \320\243and hence SeV. Now supposeS is linearly dependent. Then there existsa relation

a,ul + \302\246\302\246\302\246+ arur + b,T{wi) H + b,T{wt) = 0

where at least one coefficient is not zero. Furthermore, since {u,} is independent, at least one of the bk must

be nonzero. Transposing, we find

biT(wt) +\302\246\302\246\302\246+b, T(w,) = -alul arur e X = Ker \320\2232

Hence r-2(ft,T(tv,) + \302\246\342\200\242\302\246+ b, T(w,)) = 0

Thus 7\"'~ Hbiw{ H + b, w,) = 0 and so ft,tv, + h b, w, e Y = Ker \320\2231\"'

Since {u(, ^-} generates \320\243,we obtain a relation among the \320\270,,Vjand wk where one of the coefficients, i.e.,

one of the bk, is not zero. This contradicts the fact that {\321\211,Vj. wk] is independent. Hence S must also be

independent.

11.16. Prove Theorem 11.10.

Suppose dimK = n. Let Wx= Ker T, W2

= Ker T2 Wk= Ker 7*. Let us set m^dimH^,

for i = 1, ..., \320\272.Since T is of index k, Wk= V and W, , / \320\232and so mR_ , < mk

= n. By Problem 11.14,

CHAP. 11] CANONICAL FORMS 383

Thus, by induction, we can choose a basis{ult..., u,}of V such that {ut, ..., um] is a basis of Wt.

We now choose a new basis for V with respect to which T has the desired form. It will be convenient

to label the members of this new basis by pairs of indices. We begin by setting

t{\\, k) =umk_, +,, v{2, k) =

umk^ + 2,..., v(mk- mt_,, k) =

umk

and setting

c(l, k- 1)= Tu(l, fc), uB, k- 1)= Tv{2, k), ...,v{mk-mk_uk- 1)= Tv(mk-

\321\202\342\200\236_\342\200\236\320\272)

By Problem 11.15, S, ={\302\253\342\200\236.... umi_2, c(l, \320\272- 1),.... \321\204\320\270\302\273

-m\302\273_,,fc

-1)}

is a linearly independent subset of Wk_v We extend S, to a basisof \320\230^_,by adjoining new elements (ifnecessary)which we denote by

v{mk- mk_l + 1, \320\272- 1), v(mk

- mk_l + 2, \320\272- 1),..., v{mk ,- mk_2, \320\272\342\200\224

1)

Next we set

w(l, \320\272-2)= Tv(\\, k- \\)M2,k-2)=Tii2,k- I), ..., v{mk_l- mk_2, \320\272- 2) = Tvim,,.., -

mk_2, \320\272- 1)

Again by Problem 11.15,

S2 = {u,, ...,umk_3,v{l,\320\272-2),..., v{mk_, -mk_2,k-2)}

is a linearly independent subset of Wk_2 which we can extend to a basis of Wk_2 by adjoining elements

t^mk_, \342\200\224mt_2 + 1, \320\272\342\200\224

2), tf(mt_, \342\200\224mk_2 + 2,k \342\200\224

2),..., v{mk_2 \342\200\224mk_3, \320\272\342\200\224

2)

Continuing in this manner we get a new basis for F which for convenient reference we arrange as follows:

v(l, k), ...,v{mk-mk-,,k)

v{\\, \320\272\342\200\2241),..., \321\204\320\277\320\272

\342\200\224mk_i, \320\272\342\200\224

1), ..., f(mt_i \342\200\224mk_2, \320\272\342\200\224

1)

v{\\,2), ..., v{mk-

\321\202\321\206_\342\200\2362), ..., \320\270(\321\202\320\272_,- mk_2, 2),.... u(m2

- mu 2)

c(l, 1), ..., v{mk-

\321\202\320\272_\342\200\2361), ..., K\302\253*-i-

\302\273\"k_2.1),..., v(m2 - m,, 1),.... c(m,,1)

The bottom row forms a basis of Wx, the bottom two rows form a basis of W2, etc. But what is importantfor us is that T maps each vector into the vector immediately below it in the table or into 0 if the vector isin the bottom row. That is,

-{

- 1) (orj>\\0 (orj=\\

Now it is clear [see Problem 11.13(rf)]that T will have the desired form if the v{i,j) are ordered lexico-lexicographically: beginning with v(l, 1) and moving up the first column to t^l, k), then jumping to v{2, 1) and

moving up the second column as far as possible, etc.

Moreover, there will be exactly

mk\342\200\224

mk _ i diagonal entries of order \320\272

(mk_,\342\200\224

mt_2)\342\200\224

[mk\342\200\224

mk_,) = 2mk , \342\200\224mk

\342\200\224mk_2 diagonal entries of order \320\272\342\200\2241

2m2\342\200\224

m,\342\200\224

m3 diagonal entries of order 22m,

\342\200\224m2 diagonal entries of order 1

as can be read off directly from the table. In particular, since the numbers m,,..., mk are uniquely deter-determined by T, the number of diagonal entries of each order is uniquely determined by T. Finally, the identity

m, =(mk

-mk_ i) + Bmt_ , -

mk\342\200\224

mk_2) + \302\246-\342\200\242+\342\200\242Bm2- m, -

m3) + Bml - m2)

384 CANONICALFORMS [CHAP. 11

shows that the nullity m, is the total number of diagonal entries of T.

/0 1 1 01\\

/0 0 1 1 1\\

/\320\236 0 0 0 0

11.17. Let A =

0 0 1110 0 0 0 00 0 0 0 0

\\o \320\276\320\276\320\276o/

Then A2 = 0 0 0 0 00 0 0 0 0

\\o \320\276\320\276\320\276o/

and A3 = 0; hence A is nilpotent of

index 3. Find the nilpotent matrix M in canonical form which is similar to A.

SinceA is nilpotent of index 3, M contains a diagonal block of order 3 and none greater than 3. Notethat rank A = 2; hence nullity -4 = 5-2 = 3. Thus M contains 3 diagonal blocks. Accordingly M must

contain one diagonal block of order 3 and two of order 1; that is,

M

r 000

l000

01

0

0

0

0

00

\302\260\\0

0

!orr/

\\o \320\276\320\276\320\276\320\223\320\276/

11.18. Prove Theorem 11.11.

By the primary decomposition theorem, T is decomposableinto operators Tlt...,Tr, that is,T =

\320\223,\302\251\342\200\242\302\246\342\200\242

\321\2047^, where (f\342\200\224

\320\257,I\021is the minimum polynomial of Tt. Thus in particular,

=0,...,(\320\223\320\263-\320\273\320\263/\320\223=

Set N, =Tt

-A, /. Then, for i = 1, ..., r,

T, = N, + A(/ where = 0

That is, T( is the sum of the scalar operator A,/ and a nilpotent operator N,-, which is of index m,- since

(t \342\200\224k,f* is the minimum polynomial of Tt.

Now by Theorem 11.10 on nilpotent operators, we can choose a basis so that Nt is in canonical form.In this basis, 7] =

Nt + A, / is represented by a block diagonal matrix M, whose diagonal entries are thematrices J^. The direct sum J of the matrices M; is in Jordan canonical form and, by Theorem 11.5, is amatrix representation of T.

Lastly we must show that the blocks Ji} satisfy the required properties. Property (i) follows from the

fact that N, is of index m,. Property (ii) is true since T and J have the same characteristic polynomial.

Property (iii) is true since the nullity of N( = Tt \342\200\224X\\ I is equal to the geometric multiplicity of the eigenvalue

A,-. Property (iv) follows from the fact that the Tt and hence the N,- are uniquely determined by T.

11.19. Determine all possible Jordan canonical forms for a linear operator T : V -* V whose character-characteristicpolynomial is A(t) = (t \342\200\224

2K(t- 5J.

Since t \342\200\2242 has exponent 3 in \320\224(\320\263),2 must appear three times on the main diagonal. Similarly 5 must

appear twice. Thus the possibleJordan canonical forms are

/2

\\

1

2 1

25

\\

1

5/

/2

\\

1

2 - - ,2i

1I11

5

\\

1

5/

/2

\\

;

~1 2

*\342\200\224\342\200\224

r5

\\

V

5/

(i) (ii) (iii)

CHAP. 11] CANONICAL FORMS 385

/2 I

2 1

2

\\ VI

(iv)

/2

Ts/

(v)

Ts/

11.20. Determine all possible Jordan canonical forms J for a matrix of order 5 whose minimal poly-polynomial is m(t) = (t \342\200\224

2J.

J must have one Jordan block of order 2 and the others must be of order 2 or 1. Thus there are only

two possibilities:

/2 1 \\ /2 1

2 1

2

\\

or J =

\\

2\\

\\2\\

\\

Note that all the diagonal entries must be 2 since2 is the only eigenvalue.

QUOTIENT SPACE AND TRIANGULAR FORM

11.21.Let W be a subspace of a vector space V. Show that the following are equivalent: (i) \320\270e v + W,(ii) u-veW, (iii) v e \320\270+ W.

Suppose \320\270e v + W. Then there exists w0 e W such that \320\270= v + w0. Hence \320\270\342\200\224v = w0 e W. Con-Conversely, suppose \320\270\342\200\224v e W. Then \320\270\342\200\224v = w0 where w0 e W. Hence \320\270= v + w0 e W. Thus (i) and (ii) are

equivalent.We alsohave: \320\270\342\200\224v e W iff \342\200\224(u \342\200\224

v) \342\200\224v\342\200\224usWiffveu+ W. Thus (ii) and (iii) are also equivalent.

11.22. Prove: The cosetsof W in V partition V into mutually disjoint sets. That is:

(i) any two cosets \320\270+ W and v + W are either identical or disjoint;and

(ii) each v e V belongs to a coset;in fact, v e v + W.

Furthermore, \320\270+ W = v + W if and only if \320\270\342\200\224v e W, and so (v + w) + W = v + W for anywe W.

Let v 6 V. Since 0 6 W, we have v = v + 0ev + W which proves (ii).

Now suppose the cosets \320\270+ W and v + W are not disjoint; say, the vector x belongs to both \320\270+ W

and v + W. Then \320\270\342\200\224x e W and x \342\200\224v e W. The proof of (i) is complete if we show that \320\270+ W = v + W.

Let \320\270+ w0 be any element in the coset \320\270+ W. Since \320\270\342\200\224x, x \342\200\224v, and vv0 belong to W,

(u + w0)\342\200\224v = (u - x) + (x -

v) + w0 6 W

Thus \320\270+ woe v + W and hence the coset \320\270+ W is contained in the coset v+W. Similarly v + W iscontained in \320\270+ W and so \320\270+ W = v + W.

The last statement follows from the fact that \320\270+ W = v + W if and only if \320\270e v + W, and, by

Problem 11.21, this is equivalent to \320\270\342\200\224v e W.

11.23. Let W be the solution space of the homogeneous equation 2x + 3y + 4z = 0.Describethe cosets

of W in R3.

386 CANONICAL FORMS [CHAP. 11

W is a plane through the origin \320\236= @, 0, 0), and the cosets of W are the planes parallel to W asshown in Fig. 11-3. Equivalently, the cosetsof W are the solution sets of the family of equations

2x + 3y + 4z = \320\272 \320\272e R

In particular the coset v + W, where v = (a, b, c), is the solution set of the linear equation

2x + 3y + 4z = 2a + 3b + 4c or 2{x -a) + \320\252(\321\203

-b) +- 4{z -

\321\201)= \320\236

Fig. 11-3

11.24. Suppose W is a subspaceof a vector space V. Show that the operations in Theorem 11.15 are

well defined; namely, show that if \320\270+ W = \320\270'+ W and r> + W = v' + W, then:

(a) (u + v)+W= (\302\253'+ i/) + and (b) ku+W = ku' + W, for any \320\272\320\265\320\232.

(d) Since \320\270+ W = \302\253'+ W and \302\273-I- W = v + W, both \320\270\342\200\224\320\270'and v \342\200\224tf belong to W. But then

(u + v) -(\302\253'+ \302\253')

=(\320\274

-\302\253')+ (i! - v') e W. Hence (u + v)+ W ~(u' + v') + W.

(fo) Also, since u \342\200\224u?W implies k(u \342\200\224u')eW, then ku \342\200\224ku'=k(u

\342\200\224\320\270)\320\265W; accordingly

ku + W = ku' + W.

11.25. Let V be a vector space and W a subspace of V. Show that the natural map r\\: V

defined by ij(t?)= \320\263+ W, is linear.

For any u, v 6 V and any \320\272\320\265\320\232,we have

and

Accordingly, \321\206is linear.

t](kv) = =\320\272\321\204)

11.26. Let W be a subspace of a vector space V. Suppose {wu ..., vvr} is a basis of W and the set of

cosets [vu ..., vs}, whereVj

=Vj + W, is a basis of the quotient space. Show that \320\221is a basis of

V where \320\222={\321\201,,..., t?s, wtJ ..., wr}. Thus dim F = dim PV + dim

Suppose \320\270e V. Since{\302\253,}

is a basis of F/W,

\320\274= \320\270+ W = a,tJ, + a2r2+ \342\200\242\342\200\242\342\200\242+ ast5s

Hence u = alvl + \342\200\242\302\246\302\246+ asvs + w where w e W. Since(wj isa basisof W,

\320\270= ajDj + \342\200\242\342\200\242\342\200\242+ as vs + b1wl + \302\246\302\246\302\246+ br wr

Accordingly, \320\222spans V.

CHAP. 11] CANONICAL FORMS 387

We now show that \320\222is linearly independent. Suppose

\321\201,!?,+ \302\246\342\200\242\342\200\242+ csvs + dtw, +\302\246\302\246\302\246+drwr= 0 (/)

Then cji + \302\246-\302\246+c,v, = O=W

Since {vj} is independent, the c's are all 0. Substituting into (/), we find dlwl + \302\246\302\246\302\246+ dr wr = 0. Since {wj isindependent, the d's are all 0. Thus \320\222is linearly independent and therefore a basis of V.

11.27. Prove Theorem 11.16.

We first show that f is well defined, i.e., if \320\270+ W = v + W then T(u + W)= T{v + W). If

\320\270+ W = v + W then u- veW and, since W is T-invariant, T(u -v) = T(u) - T(v)e W. Accordingly,

T (u + W) = T(u) + W = T(v) + W = T(v + W)

as required.We next show that t is linear. We have

\320\251\320\270+ W) + {v+ W))= f (u + v + W) =T(u + v) + W =

T(u) + T(v) + W= T(u) + W + T{v)+ W = T(u +W)+ f(v + W)

and

f(k(\302\253+ W)) = T(ku + W)= T(ku) +W = kT(u) + W =\320\246\320\242(\320\270)+ W) = kt(u + W)

Thus T is linear.Now, for any coset \320\270+ W in V/W,

T*{u + W)= T2{u) + W = T(T(u)) + W = t(T(u) +W)= t(t(u + W)) = T2(u + W)

Hence T2= f2.Similarly V = t\" for any n. Thus for any polynomial

+ W)=/(\320\223\320\232^+

W = X \320\260,\320\223'(\302\253)+ W = X a,.G>) + \320\2300

= X \320\276,\320\223(\302\253+ W) = X a. t\\u + W) = <?a, \320\240\320\246\321\206+ W) =/(f Ku + W)

and so 7(T) =/(f). Accordingly, if \320\223is a root of/(t) then /(\320\233= 6 = W =/(f), i.e.f is also a root of/(r).

Thus the theorem is proved.

11.28. Prove Theorem 11.1.The proof is by induction on the dimension of V. If dim V = 1, then every matrix representation of T

is a 1 x I matrix, which is triangular.

Now suppose dim V = n > 1 and that the theorem holds for spacesof dimension less than n. Since thecharacteristic polynomial of T factors into linear polynomials, T has at least one eigenvalue and so at leastone nonzero eigenvector v, say T{v) = altv. Let W be the 1-dimensional subspace spanned by v. Set

V = V/W. Then (Problem 11.26)dim V = dim V \342\200\224dim W = n - 1. Note also that W is invariant underT. By Theorem 11.16, T induces a linear operator \320\223on V whose minimum polynomial divides the

minimum polynomial of T. Since the characteristic polynomial of T is a product of linear polynomials, so isits minimum polynomial; hence so are the minimum and characteristic polynomials of T. Thus V and T

satisfy the hypothesis of the theorem. Hence, by induction, there exists a basis {v2, .. -, \302\253\342\200\236}of V such that

f(v2) = a22v2

T{v3)=

\320\260\320\2522v2 + a33 v3

T(vn) =an2 v2 + ani D3 + \302\246\302\246\302\246+ aKa vn

Now let v2,..., vn be elements of V which belong to the cosets v2, ..., vn, respectively. Then {v, v2, vn}

is a basis of V (Problem 11.26). Since T{v2)= a22v2, we have

T(v2) \342\200\224a22 v2 = 0 and so T{v2) \342\200\224

a22 v2 e W

388 CANONICALFORMS [CHAP. 11

But W is spanned by v; hence T{v2)\342\200\224

a22 v2 is a multiple of v, say

T(v2) \342\200\224a22 v2 = a2lv and so T(v2)

= a21v + a22 v2

Similarly, for i = 3,..., n,

T(v,)\342\200\224

ai2v2\342\200\224

ai3v3\342\200\224\302\246\302\246\302\246\342\200\224

auvt e W and so T(v,)= aitv + ai2v2 + \302\246-\302\246+ au

Thus T(v) =altvT(v2)

= a2lv + a22v2

T{vr)= aniv + an2 v2 + \302\246\302\246\302\246+ \320\260\342\200\236vn

and hence the matrix of T in this basis is triangular.

CYCLIC SUBSPACES, RATIONAL CANONICAL FORM

11.29. Prove Theorem 11.12.

(i) By definition of mv(t), Tk(v) is the first vector in the sequence v, T{v), T2(v),... which is a linearcombination of those vectorswhich precede it in the sequence;hence the set \320\222= {v, T(v),..., 7*~'(\302\273)}

is linearly independent. We now only have to show that Z(v, T) = \320\246\320\222),the linear span of B. By the

above, T\\v) e \320\246\320\222).We prove by induction that T\"(v) e \320\246\320\222)for every n. Suppose n > \320\272and

Tn~V)e \320\246\320\251,i-e- V~l{v) is a linear combination of v, ..., Tk~\\v). Then T\"(v)= T(T\" l(v)) is a

linear combination of T(v),..., Tk{v).But T\\v) e \320\246\320\222);hence T\"(v) e \320\246\320\222)for every n. Consequently

f{T)(v) e \320\246\320\222)for any polynomial/(t). Thus Z(v, T) \342\200\224\320\246\320\222)and so \320\222is a basis as claimed.

(ii) Suppose nit) = t5 + fos_ ,(s\"' + \342\200\242\302\246\302\246+ b0 is the minimal polynomial of \320\223\342\200\236.Then, since v e Z(v, T),

0 =m(TM

= niTKv) = T*(v)+ bs_lT*~\\v) + --+bov

Thus T'{v) is a linear combination of v, T{v), ..., T*'1^), and therefore \320\272< s. However, mv{T) = 0and so mv{Tv) = 0. Then m{t) divides mJLi) and so s < fc. Accordingly fc = s and hence \321\202\342\200\236@

= m(t).

(iii) \320\223\342\200\236(\320\263)= T(v)

Tv(T(v))=

\320\2232(\320\263)

By definition, the matrix of Tv in this basis is the transpose of the matrix of coefficients of the above

system of equations; hence it is C, as required.

11.30. Let T : V -* V be linear. Let W be a T-invariant subspace of V and T the induced operator onV/W. Prove: (a) The T-annihilator of v e V divides the minimum polynomial of T. (b) The

T-annihilator of v e V/W divides the minimum polynomial of T.

(a) The T-annihilator of v e V is the minimum polynomial of the restriction of T to Z(v, T) and therefore,

by Problem 11.6, it divides the minimum polynomial of T.

(fo) The \320\223-annihilator of v e VjW divides the minimum polynomial of t, which divides the minimum

polynomial of T by Theorem 11.16.

Remark. In case the minimum polynomial of T is/(f)\" where/(t) is a monic irreducible polynomial,then the T-annihilator ofveV and the \320\223-annihilator of v e VJW are of the form/(f)m where m < n.

11.31. Prove Lemma 11.13.The proof is by induction on the dimension of V. If dim V \342\200\224

1, then V is itself \320\223-cyclic and the lemmaholds. Now suppose dim V > 1 and that the lemma holds for those vector spaces of dimension less than

that of V.

CHAP. 11] CANONICAL FORMS 389

Sincethe minimum polynomial of \320\223is/(f)n, there exists vt e V such that/(ry '(\302\253,)\320\2440; hence theT-annihilator of j?! is/(*)\"\342\200\242Let Z, = Z(vt, T) and recallthat Z, is \320\223-invariant. Let V = V/Zt and let T be

the linear operator on V induced by \320\223.By Theorem 11.16, the minimum polynomial of T divides/(t)\";hence the hypothesis holds for V and T. Consequently, by induction, V is the direct sum of T-cyclicsubspaces; say,

V = Z(v2, T)\302\256-\302\256Z(vr,T)

where the corresponding T-annihilators are/(f)\"\\ ...,/(?)\"', n > n2 > \302\246\302\246\302\246> nr.We claim that there is a vector v2 in the coset v2 whose \320\223-annihilator is/(f)\022, the T-annihilator of v2.

Letw be any vector in v2 \302\246Then/(r)\022(w) e Z,. Hence there exists a polynomial g(t) for which

Since/(t)\" is the minimum polynomial of \320\223,we have by (/),

But/(T)\" is the \320\223-annihilator of v,; hence/(f)\" divides f(tf~nig{i) and so g(t) =f(tf1h{t) for some poly-polynomial h(t). We set

Since w \342\200\224v2

= /j(TXi>i) e Z,, i>2 also belongs to the coset v2. Thus the \320\223-annihilator of v2 is a multiple of

the T-annihilator of v2.On the other hand, by (/)

f(Tr(v2) =f(Tr(w -fcGKr,))

=/(\320\223\320\235\320\270>)

- etrXr.) = 0

Consequently the \320\223-annihilator of v2 is/(f)\022 as claimed.

Similarly, there exist vectorsv3,..., vr e V such that vt e v] and that the \320\223-annihilator of vt is/@\"', theT-annihilator of ~vt. We set

Z2 = Z(v2, T),..., Zr= Z(vr, T)

Let d denote the degree of/(t) so that/(t)\"' has degree dnt. Then since /(\320\263I\"is both the \320\223-annihilator of vtand the T-annihilator of ?it we know that

..7*'-1(i>l)} and {S

are bases for Z(vt, T) and Zfc, T), respectively, for i = 2,..., r. But P = Z(i^, T) \302\251\302\246\342\200\242\342\200\242

\302\256Z(tv, T); hence

is a basisfor P. Therefore by Problem 11.26and the relation T'(S) =T\\v) (see Problem 11.27),

is a basisfor V. Thus by Theorem 11.4,\320\243= Z(r,, \320\223)\321\204\342\200\242- -

\302\256Z(t)r, \320\223),as required.It remains to show that the exponents n,,..., nr are uniquely determined by \320\223.Since d denotes the

degreeofflt),

dim V = d(nl + \302\246\342\200\242\302\246+ nr) and dim Z{= dn, i = 1,..., r

Also, if s is any positive integer then (Problem 1l.59)/(r)s(Z,) is a cyclic subspace generated by/(r)s(D,) andit has dimension d{n-,

\342\200\224s) if nt > s and dimension 0 if n{ < s.

Now any vector v e V can be written uniquely in the form v = w, + \342\200\242- - + wr where w, e Z,-. Henceany vector \321\202/G\320\243(\320\232)can be written uniquely in the form

where /(\320\223)\321\217(\320\270>,)6/(rM(Zf). Let t be the integer, dependent on s, for which

n, > s, -.., n, > s, nI+1 > s

Then f(TY(V) =/(rLZ!) e \342\200\242\342\200\242\342\200\242e/(r)'(zr)

and so dim

390 CANONICALFORMS [CHAP. 11

The numbers on the left of (*) are uniquely determined by T. Set s = n\342\200\224I and (*) determines the number

of n,- equal to n. Next set s = n \342\200\2242 and (\302\246)determines the number of n; (if any) equal too- 1. We repeat

the process until we set s = 0 and determine the number of n,- equal to 1. Thus the n,- are uniquely deter-determined by T and F, and the lemma is proved.

11.32. Let V be a vector space of dimension 7 over R, and let T: V -* V be a linear operator with

minimum polynomial m{i) = (r2 + 2)(r + 3K. Find all the possible rational canonical forms for T.

The sum of the degrees of the companion matrices must add up to 7. Also, one companion matrix

must be t1 + 2 and one must be (t + 3K. Thus the rational canonical form of T is exactly one of the

following direct sums of companion matrices:

(i) C(t2 + 2)\302\251C(t2 + 2) \302\251C((r + 3K)

(ii) C(t2 + 2)\302\251C((t + 3K) \302\251C((t + 3J)

(iii) C{t2 + 2) \302\251C((r + 3K) \302\251C(? + 3) \302\251C(t + 3)

That is,

/0 -2I 0

01

-20

01

0

0

0

1

-27

-27i

-9/

/0-2

1 0010

001

-27-27-9

J

0 -9

1 -6

(i) (ii)

q 2

1 0

0

1

0

0

01

-27-27_9

L---W(iii)

PROJECTIONS

11.33. Suppose V =Wj\302\256--- \302\256Wr. The projection of V into its subspace \320\230^is the mapping E : V -\302\273F

defined by ?(i?) =w* where d = w1 + \342\200\242\342\200\242\302\246+ vvr, w, e Wt. Show that (a) E is linear, and (b)E2 = E.

(a) Since the sum d =w, + \342\200\242\342\200\242\342\200\242+ wr, wt e W is uniquely determined by r, the mapping E is well defined.

Suppose,for \320\270e V, \320\270=w\\ + \302\246\302\246\302\246+ w'r, w\\ e Wj. Then

v + \320\270= (iv, + w',) + \342\200\242\302\246\342\200\242+ [wr + w'r) and kv =few, + \302\246\302\246\302\246+ kw,

are the unique sums corresponding to v + \320\270and fcy. Hence

E(v + u) =wk + w'k

= ?(d) + ?(\321\213) and ?(fet)) = fc

and therefore ? is linear.

(b) We have that wt = 0 + ---+0+w,t + 0+---+0

is the unique sum corresponding to wk e Wk\\ hence E(wk) = wk. Then for any v e V,

E2(v)=E(E(v))

= EiwJ =wk

= E(v)

Thus E2 = E,as required.

kwt, wt + \320\270>;\320\261Wt

kE{v)

CHAP. 11] CANONICAL FORMS 391

11.34.Suppose E: V -* V is linear and E1 = E. Showthat: (a) E{u)= \320\270for any \320\270elm E, i.e., the

restriction of E to its image is the identity mapping; (b) V is the direct sum of the image and

kernel of E: that is, V = Im E \302\251Ker E;(c) E is the projection of V into Im E, its image. Thus,by Problem 11.33, a linear mapping T : V -\302\273V is a projection if and only if T2 = T; compareProblem9.45.

(a) If u 6 Im ?, then there exists v e V for which E(t>)= u; hence

?(u) =E{E{v))

= ?>) = E(r) = u

as required,

(fo) Let v e V. We can write v in the form v = E(v) + v \342\200\224E(v). Now ?(i>) 6 Im ? and, since

E{v-

?(\302\273))= ?(\302\253)- ?2(t>) = E(v) -

\320\236\320\224= \320\236

\302\273- E(v) e Ker E. Accordingly, V = Im ? + Ker?.Now suppose w 6 Im ? n Ker ?. By (i), ?(w) = w because w e Im ?. On the other hand, ?(w) = 0

becausew e Ker ?. Thus w = 0 and so Im E n Ker ? = {0}. These two conditions imply that V is the

direct sum of the image and kernel of ?.

(c) Let v e V and suppose v = \320\270+ w where \320\270elm E and w e Ker ?. Note that E{u) = \320\270by (i), and

?(w) = 0 because w e Ker ?. Hence

E(v)= E(u + w) =

?(\321\213)+ ?(w) = u + 0 = \320\270

That is, ? is the projection of V into its image.

11.35. Suppose V = V \302\256W and suppose T : V -* V is linear. Show that \320\241and W are both T-invariantif and only if \320\242\320\225= ET where E is the projection of V into U.

Observe that ?(\302\253)6 V for every v e V, and that (i)E{v)= t: iff v e V, (ii) ?(t>)= 0 iff v 6 W.

Suppose ET = \320\242\320\225.Let \320\270ell. Since ?(\321\213)= \320\270,

T(u) = T(E{u)) =(\320\223\320\225\320\232\320\270)

=(\320\2257*\320\270)

=?(\320\223(\302\253))6 V

Hence I/ is \320\223-invariant. Now let we W. Since?(w) = 0,

?(r(w))= (fiTXw)= (rfiKw) = r(?H) =

\320\223@)= 0 and so T(w) e W

Hence W is also T-invariant.

Conversely, suppose V and W are both T-invariant. Let v e V and suppose v = \320\270+ w where \320\270\320\265\320\242

and v/eW. Then T(u) e V and T(w) e W; hence E{T(u))= T(u) and ?(T(w)) = 0.Thus

(?TXr) =(?TX\302\253+ w) = (?\320\223\320\245\320\274)+ (?TXw) = ?(T(\302\253))

and (T?X\302\253)=

(T?X\302\253+ w) = \320\242(?(\321\213+ w)) = T(u)

That is, (?TXf) =(T?X^) for every v e V; therefore ?\320\223= \320\223?as required.

Supplementary Problems

INVARIANT SUBSPACES

11.36.Suppose W is invariant under T : \320\243-\302\273V. Show that W is invariant underf{T) for any polynomial/(t).

11.37. Show that every subspace of V is invariant under 1 and 0, the identity and zero operators.

392 CANONICAL FORMS [CHAP. 11

11.38. Suppose W is invariant under \320\223,: V -* V and T2 : V -* V. Show that W is also invariant under \320\223,+ T2

and TtT2.

1139. Let T: V -\302\273V be linear and let W be the eigenspace belonging to an eigenvalue A of T. Show that W isT-invariant.

11.40. Let I' be a vector spaceof odd dimension (greater than 1) over the real field R. Show that any linear

operator on V has an invariant subspace other than V or {0}.

-G11.41. Determine the invariant subspaces of A = I I viewed as a linear operator on (i) R2, (ii) C2.

11.42. Supposedim V = n. Show that T : \320\243-\302\273V has a triangular matrix representation if and only if there exist

\320\223-invariant subspaces Wxa W2 ez \302\246\302\246\302\246\321\201Wn= \320\243for which dim Wk=

\320\272,\320\272= 1, ..., \320\277.

INVARIANT DIRECT SUMS

11.43. The subspacesWx,...,Wr are said to be independent if w, + \302\246\342\200\242\342\200\242+ w, = 0, wt e Wt, implies that each w,-= 0.

Show that span (\320\251= W, \302\251

\302\246\302\246\302\246\302\251Wr if and only if the Wt are independent. [Here span (W<) denotes the

linear span of the Wt.]

11.44. Show that V = Wt \302\251\342\200\242\342\200\242\342\200\242

\302\256W, if and only if (i) V = span (\320\251and, for \320\272= 1, 2,..., r, (ii) Wk n span(W,,..., \320\23041,Wk+1,...,Wr)={0}.

11.45. Show that span (W<)=

\320\230^\302\251\342\200\242\342\200\242\342\200\242

\302\251Wr if and only if dim span (\320\230<)= dim Wt + \302\246\302\246+ dim Wr.

11.46. Suppose the characteristic polynomial of \320\223: \320\243-\302\273V is A(?) =/1(t)\"'/2(f)\022\342\200\242\302\246\302\246

/r@\"' where the/|(t) are distinct

monic irreducible polynomials. Let \320\243= W, \321\204\302\246\342\200\242\342\200\242

\302\251Wr be the primary decomposition of \320\243into T-

invariant subspaces. Show that/XO\021 is the characteristic polynomial of the restriction of \320\223to Wt.

NILPOTENT OPERATORS

11.47. Suppose\320\223,and T2 are nilpotent operators which commute, i.e., TtT2 =T2TX. Show that \320\223,+ T2 and TyT2

are alsonilpotent.

11.48. Suppose A is a supertriangular matrix, i.e., all entries on and below the main diagonal are 0. Show that A is

nilpotent.

11.49. Let V be the vector space of polynomials of degree < n. Show that the derivative operator on V is nilpotentof index n + 1.

11.50. Show that the following nilpotent matricesof order n are similar:

/0 1 0 ... 0\\ /0 0 ... 00\\

0 0 1 ... 0

0 0 0 ... 1\\o \320\276\320\276... o/

and

1 0 ... 0 \320\236'

0 1 ... 0 0

\\0 0 ... 1 0/

11.51. Show that two nilpotent matrices of order 3 are similar if and only if they have the same index of nilpo-

tency. Show by example that the statement is not true for nilpotent matrices of order 4.

CHAP. 11] CANONICAL FORMS 393

JORDAN CANONICAL FORM

11.52.Find all possible Jordan canonical forms for those matrices whose characteristic polynomial A(t) and

minimum polynomial m(?)are as follows:

(a) A(t) = (f - 2f(t - 3J,\320\277\320\246)= (t - 2J(f - 3J

(b) A(t) =(f

- 7M, nit) = (t- If(c) A(t) =

(\320\223-

2)\\ \320\277\320\246)= (t - 2K

(d) A(t)= (t - 3L(f- 5L, m(r) = (t - 3J(t - 5J

11.53.Show that every complex matrix is similar to its transpose. {Hint: Use its Jordan canonical form andProblem 11.50.)

11.54.Show that all n x n complex matrices A for which A\" = 1 but Ak =fc 1 for \320\272< \320\277are similar.

11.55. Suppose A is a complex matrix with only real eigenvalues. Show that A is similar to a matrix with only realentries.

CYCLIC SUBSPACES

11.56.Suppose T: \320\243-+ \320\243is linear. Prove that Z{v, T) is the intersection of all \320\223-invariant subspaces containing v.

11.57. Let/(<) and g(t) be the T-annihilators of \320\270and v, respectively. Show that if/(t) and g(t) are relatively prime,

then f{t)g(t) is the \320\223-annihilator of \320\270+ v.

11.58. Prove that Z(u, T) = Z{vy T) if and only if g{T)(u) = v where g(t) is relatively prime to the \320\223-annihilator

ofu.

11.59. Let W =Z(v, T), and suppose the \320\223-annihilator of v is/(t)\" where/(t) is a monic irreducible polynomial of

degree d. Show that/(T)'(W) is a cyclic subspacegenerated by f{T)s{v) and it has dimension d(n\342\200\224

s) if n > sand dimension 0 if n < s.

RATIONAL CANONICALFORM

11.60. Find all possible rational canonical forms for:

(a) 6x6 matrices with minimum polynomial m(t) = (t2 + 3X< + IJ

(fo) 6x6 matrices with minimum polynomial m(r) \342\200\224{t + IK

(c) 8x8 matrices with minimum polynomial nit) = {t2 + 2J{t + 3J

11.61. LetA be a 4 x 4 matrix with minimum polynomial m{t) = (t2 + IK'2 \342\200\2243). Find the rational canonical form

for A if A is a matrix over (a) the rational field Q, (b) the real field R, (c) the complex field C.

(\320\257

1 0 0\\0 A 1 0

0 0 \320\2731

0 0 0A/

11.63. Prove that the characteristic polynomial of an operator \320\223: V -* \320\243is a product of its elementary divisors.

11.64. Prove that two 3x3 matrices with the same minimum and characteristic polynomials are similar.

394 CANONICAL FORMS [CHAP. 11

11.65. Let C(f(t)) denote the companion matrix of an arbitrary polynomial f(t). Show that f(t) is the characteristic polynomial of C(f(t)).

PROJECTIONS

11.66. Suppose V = W₁ ⊕ ⋯ ⊕ W_r. Let Eᵢ denote the projection of V into Wᵢ. Prove: (i) EᵢEⱼ = 0 for i ≠ j, and (ii) I = E₁ + ⋯ + E_r.

11.67. Let E₁, ..., E_r be linear operators on V such that: (i) Eᵢ² = Eᵢ, i.e., the Eᵢ are projections; (ii) EᵢEⱼ = 0 for i ≠ j; (iii) I = E₁ + ⋯ + E_r. Prove that V = Im E₁ ⊕ ⋯ ⊕ Im E_r.

11.68. Suppose E: V → V is a projection, i.e., E² = E. Prove that E has a matrix representation of the form

    [I_r 0]
    [ 0  0]

where r is the rank of E and I_r is the r-square identity matrix.

11.69. Prove that any two projections of the same rank are similar. (Hint: Use the result of Problem 11.68.)

11.70. Suppose E: V → V is a projection. Prove: (i) I − E is a projection and V = Im E ⊕ Im (I − E); and (ii) I + E is invertible (if 1 + 1 ≠ 0).

QUOTIENT SPACES

11.71. Let W be a subspace of V. Suppose the set of cosets {v₁ + W, v₂ + W, ..., vₙ + W} in V/W is linearly independent. Show that the set of vectors {v₁, v₂, ..., vₙ} in V is also linearly independent.

11.72. Let W be a subspace of V. Suppose the set of vectors {u₁, u₂, ..., uₙ} in V is linearly independent, and that span(uᵢ) ∩ W = {0}. Show that the set of cosets {u₁ + W, ..., uₙ + W} in V/W is also linearly independent.

11.73. Suppose V = U ⊕ W and that {u₁, ..., uₙ} is a basis of U. Show that {u₁ + W, ..., uₙ + W} is a basis of the quotient space V/W. (Observe that no condition is placed on the dimensionality of V or W.)

11.74. Let W be the solution space of the linear equation

    a₁x₁ + a₂x₂ + ⋯ + aₙxₙ = 0,    aᵢ ∈ K

and let v = (b₁, b₂, ..., bₙ) ∈ Kⁿ. Prove that the coset v + W of W in Kⁿ is the solution set of the linear equation

    a₁x₁ + a₂x₂ + ⋯ + aₙxₙ = b,    where b = a₁b₁ + ⋯ + aₙbₙ

11.75. Let V be the vector space of polynomials over R and let W be the subspace of polynomials divisible by t⁴, that is, of the form a₀t⁴ + a₁t⁵ + ⋯ + aₙ₋₄tⁿ. Show that the quotient space V/W is of dimension 4.

11.76. Let U and W be subspaces of V such that W ⊆ U ⊆ V. Note that any coset u + W of W in U may also be viewed as a coset of W in V, since u ∈ U implies u ∈ V; hence U/W is a subset of V/W. Prove that (i) U/W is a subspace of V/W, and (ii) dim (V/W) − dim (U/W) = dim (V/U).

11.77. Let U and W be subspaces of V. Show that the cosets of U ∩ W in V can be obtained by intersecting each of the cosets of U in V with each of the cosets of W in V:

    V/(U ∩ W) = {(v + U) ∩ (v′ + W) : v, v′ ∈ V}

11.78. Let T: V → V be linear with kernel W and image U. Show that the quotient space V/W is isomorphic to U under the mapping θ: V/W → U defined by θ(v + W) = T(v). Furthermore, show that T = i ∘ θ ∘ η, where η: V → V/W is the natural mapping of V into V/W, i.e., η(v) = v + W, and i: U → V is the inclusion mapping, i.e., i(u) = u. (See Fig. 11-4.)

Fig. 11-4

Answers to Supplementary Problems

11.41. (a) W₁ = R², W₂ = {0};  (b) W₁ = C², W₂ = {0}; or W₁ = span{(2, 1 − 2i)}, W₂ = span{(1, 1 + 2i)}.

11.52. Write J_k(λ) for the k × k Jordan block with eigenvalue λ; each answer is a direct sum of Jordan blocks:

(a) J₂(2) ⊕ J₂(2) ⊕ J₂(3)  and  J₂(2) ⊕ J₁(2) ⊕ J₁(2) ⊕ J₂(3)

(b) J₂(7) ⊕ J₂(7) ⊕ J₁(7)  and  J₂(7) ⊕ J₁(7) ⊕ J₁(7) ⊕ J₁(7)

(c) J₃(2) ⊕ J₃(2) ⊕ J₁(2);  J₃(2) ⊕ J₂(2) ⊕ J₂(2);  J₃(2) ⊕ J₂(2) ⊕ J₁(2) ⊕ J₁(2);  J₃(2) ⊕ J₁(2) ⊕ J₁(2) ⊕ J₁(2) ⊕ J₁(2)

(d) S ⊕ S′, where S is J₂(3) ⊕ J₂(3) or J₂(3) ⊕ J₁(3) ⊕ J₁(3), and S′ is J₂(5) ⊕ J₂(5) or J₂(5) ⊕ J₁(5) ⊕ J₁(5) (four combinations)


11.60. Write C(f) for the companion matrix of a polynomial f(t); each answer is a direct sum of companion matrices of the possible elementary divisors:

(a) C(t² + 3) ⊕ C(t² + 3) ⊕ C((t + 1)²);  C(t² + 3) ⊕ C((t + 1)²) ⊕ C((t + 1)²);  C(t² + 3) ⊕ C((t + 1)²) ⊕ C(t + 1) ⊕ C(t + 1)

(b) C((t + 1)³) ⊕ C((t + 1)³);  C((t + 1)³) ⊕ C((t + 1)²) ⊕ C(t + 1);  C((t + 1)³) ⊕ C(t + 1) ⊕ C(t + 1) ⊕ C(t + 1)

(c) C((t² + 2)²) ⊕ C(t² + 2) ⊕ C((t + 3)²);  C((t² + 2)²) ⊕ C((t + 3)²) ⊕ C((t + 3)²);  C((t² + 2)²) ⊕ C((t + 3)²) ⊕ C(t + 3) ⊕ C(t + 3)

11.61. (a) C(t² + 1) ⊕ C(t² − 3);  (b) C(t² + 1) ⊕ diag(√3, −√3);  (c) diag(i, −i, √3, −√3)

11.62. The companion matrix of (t − λ)⁴ = t⁴ − 4λt³ + 6λ²t² − 4λ³t + λ⁴:

    [0 0 0  −λ⁴]
    [1 0 0  4λ³]
    [0 1 0 −6λ²]
    [0 0 1   4λ]

Chapter 12

Linear Functionals and the Dual Space

12.1 INTRODUCTION

In this chapter we study linear mappings from a vector space V into its field K of scalars. (Unless otherwise stated or implied, we view K as a vector space over itself.) Naturally all the theorems and results for arbitrary linear mappings on V hold for this special case. However, we treat these mappings separately because of their fundamental importance and because the special relationship of V to K gives rise to new notions and results which do not apply in the general case.

12.2 LINEAR FUNCTIONALS AND THE DUAL SPACE

Let V be a vector space over a field K. A mapping φ: V → K is termed a linear functional (or linear form) if, for every u, v ∈ V and every a, b ∈ K,

    φ(au + bv) = aφ(u) + bφ(v)

In other words, a linear functional on V is a linear mapping from V into K.

Example 12.1

(a) Let πᵢ: Kⁿ → K be the ith projection mapping, i.e., πᵢ(a₁, a₂, ..., aₙ) = aᵢ. Then πᵢ is linear and so it is a linear functional on Kⁿ.

(b) Let V be the vector space of polynomials in t over R. Let J: V → R be the integral operator defined by J(p(t)) = ∫₀¹ p(t) dt. Recall that J is linear; hence it is a linear functional on V.

(c) Let V be the vector space of n-square matrices over K. Let T: V → K be the trace mapping

    T(A) = a₁₁ + a₂₂ + ⋯ + aₙₙ,    where A = (aᵢⱼ)

That is, T assigns to a matrix A the sum of its diagonal elements. This map is linear (Problem 12.24) and so it is a linear functional on V.

By Theorem 9.10, the set of linear functionals on a vector space V over a field K is also a vector space over K, with addition and scalar multiplication defined by

    (φ + σ)(v) = φ(v) + σ(v)    and    (kφ)(v) = kφ(v)

where φ and σ are linear functionals on V and k ∈ K. This space is called the dual space of V and is denoted by V*.

Example 12.2. Let V = Kⁿ, the vector space of n-tuples which we write as column vectors. Then the dual space V* can be identified with the space of row vectors. In particular, any linear functional φ = (a₁, ..., aₙ) in V* has the representation

    φ(x₁, ..., xₙ) = (a₁, a₂, ..., aₙ)(x₁, x₂, ..., xₙ)ᵀ

or simply

    φ(x₁, ..., xₙ) = a₁x₁ + a₂x₂ + ⋯ + aₙxₙ

Historically, the above formal expression was termed a linear form.

12.3 DUAL BASIS

Suppose V is a vector space of dimension n over K. By Theorem 9.11 the dimension of the dual space V* is also n (since K is of dimension 1 over itself). In fact, each basis of V determines a basis of V* as follows (see Problem 12.3 for the proof):

Theorem 12.1: Suppose {v₁, ..., vₙ} is a basis of V over K. Let φ₁, ..., φₙ ∈ V* be the linear functionals defined by

    φᵢ(vⱼ) = δᵢⱼ = 1 if i = j, and 0 if i ≠ j

Then {φ₁, ..., φₙ} is a basis of V*.

The above basis {φᵢ} is termed the basis dual to {vᵢ}, or the dual basis. The above formula, which uses the Kronecker delta δᵢⱼ, is a short way of writing

    φ₁(v₁) = 1, φ₁(v₂) = 0, φ₁(v₃) = 0, ..., φ₁(vₙ) = 0
    φ₂(v₁) = 0, φ₂(v₂) = 1, φ₂(v₃) = 0, ..., φ₂(vₙ) = 0
    .......................................................
    φₙ(v₁) = 0, φₙ(v₂) = 0, ..., φₙ(vₙ₋₁) = 0, φₙ(vₙ) = 1

By Theorem 9.2 these linear mappings φᵢ are unique and well-defined.

Example 12.3. Consider the following basis of R²: {v₁ = (2, 1), v₂ = (3, 1)}. Find the dual basis {φ₁, φ₂}.

We seek linear functionals φ₁(x, y) = ax + by and φ₂(x, y) = cx + dy such that

    φ₁(v₁) = φ₁(2, 1) = 2a + b = 1
    φ₁(v₂) = φ₁(3, 1) = 3a + b = 0        or    a = −1, b = 3

    φ₂(v₁) = φ₂(2, 1) = 2c + d = 0
    φ₂(v₂) = φ₂(3, 1) = 3c + d = 1        or    c = 1, d = −2

Hence the dual basis is {φ₁(x, y) = −x + 3y, φ₂(x, y) = x − 2y}.
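For readers who want to experiment, the dual-basis conditions φᵢ(vⱼ) = δᵢⱼ say that the coefficient rows of the φᵢ form the inverse of the matrix whose columns are the basis vectors. A small numerical sketch of Example 12.3, assuming Python with numpy (which is outside the text):

    import numpy as np

    # Basis vectors of R^2 as the columns of P (Example 12.3).
    P = np.array([[2.0, 3.0],
                  [1.0, 1.0]])

    # If C holds the coefficient rows of phi_1, phi_2, the conditions
    # phi_i(v_j) = delta_ij read C @ P = I, so C = P^{-1}.
    C = np.linalg.inv(P)
    print(C)    # rows [-1, 3] and [1, -2]: phi_1 = -x + 3y, phi_2 = x - 2y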

The next theorems give relationships between bases and their duals.

Theorem 12.2: Let {v₁, ..., vₙ} be a basis of V and let {φ₁, ..., φₙ} be the dual basis of V*. Then, for any vector u ∈ V,

    u = φ₁(u)v₁ + φ₂(u)v₂ + ⋯ + φₙ(u)vₙ        (12.1)

and, for any linear functional σ ∈ V*,

    σ = σ(v₁)φ₁ + σ(v₂)φ₂ + ⋯ + σ(vₙ)φₙ        (12.2)

Theorem 12.3: Let {v₁, ..., vₙ} and {w₁, ..., wₙ} be bases of V, and let {φ₁, ..., φₙ} and {σ₁, ..., σₙ} be the bases of V* dual to {vᵢ} and {wᵢ}, respectively. Suppose P is the change-of-basis matrix from {vᵢ} to {wᵢ}. Then (P⁻¹)ᵀ is the change-of-basis matrix from {φᵢ} to {σᵢ}.
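A quick numerical illustration of Theorem 12.3, assuming Python with numpy; the two bases below are ours, chosen only for the example:

    import numpy as np

    Bv = np.array([[1.0, 1.0], [1.0, 0.0]])   # columns: v1, v2
    Bw = np.array([[2.0, 3.0], [1.0, 1.0]])   # columns: w1, w2

    Phi = np.linalg.inv(Bv)     # rows: dual basis of {v_i}
    Sig = np.linalg.inv(Bw)     # rows: dual basis of {w_i}

    P = np.linalg.inv(Bv) @ Bw  # change of basis {v_i} -> {w_i}
    # Columns of Q are the coordinates of sigma_j in {phi_i}; from
    # Sig = Q^T Phi we get Q = (Sig @ inv(Phi))^T.
    Q = (Sig @ np.linalg.inv(Phi)).T

    assert np.allclose(Q, np.linalg.inv(P).T)  # Q = (P^{-1})^T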


12.4 SECOND DUAL SPACE

We repeat: every vector space V has a dual space V*, which consists of all the linear functionals on V. Thus V* itself has a dual space V**, called the second dual of V, which consists of all the linear functionals on V*.

We now show that each v ∈ V determines a specific element v̂ ∈ V**. First of all, for any φ ∈ V* we define

    v̂(φ) = φ(v)

It remains to be shown that this map v̂: V* → K is linear. For any scalars a, b ∈ K and any linear functionals φ, σ ∈ V*, we have

    v̂(aφ + bσ) = (aφ + bσ)(v) = aφ(v) + bσ(v) = av̂(φ) + bv̂(σ)

That is, v̂ is linear and so v̂ ∈ V**. The following theorem, proved in Problem 12.7, applies.

Theorem 12.4: If V has finite dimension, then the mapping v ↦ v̂ is an isomorphism of V onto V**.

The above mapping v ↦ v̂ is called the natural mapping of V into V**. We emphasize that this mapping is never onto V** if V is not finite-dimensional. However, it is always linear and, moreover, it is always one-to-one.

Now suppose V does have finite dimension. By Theorem 12.4, the natural mapping determines an isomorphism between V and V**. Unless otherwise stated, we shall identify V with V** by this mapping. Accordingly, we shall view V as the space of linear functionals on V* and shall write V = V**. We remark that if {φᵢ} is the basis of V* dual to a basis {vᵢ} of V, then {vᵢ} is the basis of V** = V which is dual to {φᵢ}.

12.5 ANNIHILATORS

Let W be a subset (not necessarily a subspace) of a vector space V. A linear functional φ ∈ V* is called an annihilator of W if φ(w) = 0 for every w ∈ W, i.e., if φ(W) = {0}. We show that the set of all such mappings, denoted by W⁰ and called the annihilator of W, is a subspace of V*. Clearly 0 ∈ W⁰. Now suppose φ, σ ∈ W⁰. Then, for any scalars a, b ∈ K and for any w ∈ W,

    (aφ + bσ)(w) = aφ(w) + bσ(w) = a·0 + b·0 = 0

Thus aφ + bσ ∈ W⁰, and so W⁰ is a subspace of V*.

In the case that W is a subspace of V, we have the following relationship between W and its annihilator W⁰. (See proof in Problem 12.11.)

Theorem 12.5: Suppose V has finite dimension and W is a subspace of V. Then:

    (i) dim W + dim W⁰ = dim V        and        (ii) W⁰⁰ = W

Here W⁰⁰ = {v ∈ V : φ(v) = 0 for every φ ∈ W⁰} or, equivalently, W⁰⁰ = (W⁰)⁰, where W⁰⁰ is viewed as a subspace of V under the identification of V and V**.

The concept of an annihilator enables us to give another interpretation of a homogeneous system of linear equations

    a₁₁x₁ + a₁₂x₂ + ⋯ + a₁ₙxₙ = 0
    a₂₁x₁ + a₂₂x₂ + ⋯ + a₂ₙxₙ = 0
    ..............................        (*)
    aₘ₁x₁ + aₘ₂x₂ + ⋯ + aₘₙxₙ = 0

Here each row (aᵢ₁, aᵢ₂, ..., aᵢₙ) of the coefficient matrix A = (aᵢⱼ) is viewed as an element of Kⁿ, and each solution vector (x₁, x₂, ..., xₙ) is viewed as an element of the dual space. In this context, the


solution space S of (*) is the annihilator of the rows of A and hence of the row space of A. Consequently, using Theorem 12.5, we again obtain the following fundamental result on the dimension of the solution space of a homogeneous system of linear equations:

    dim S = dim Kⁿ − dim (rowsp A) = n − rank A
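A numerical illustration, assuming Python with numpy; the matrix is ours:

    import numpy as np

    # A 3 x 4 homogeneous system Ax = 0 whose third row is the sum of
    # the first two, so rank A = 2.
    A = np.array([[1.0, 2.0, -3.0, 4.0],
                  [0.0, 1.0,  4.0, -1.0],
                  [1.0, 3.0,  1.0,  3.0]])

    rank = np.linalg.matrix_rank(A)
    # dim S = number of negligible singular values = n - rank A.
    s = np.linalg.svd(A, compute_uv=False)
    dim_S = A.shape[1] - int(np.sum(s > 1e-10))

    assert dim_S == A.shape[1] - rank      # dim S = n - rank A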

12.6 TRANSPOSE OF A LINEAR MAPPING

Let T: V → U be an arbitrary linear mapping from a vector space V into a vector space U. Now, for any linear functional φ ∈ U*, the composition φ ∘ T is a linear mapping from V into K:

    V --T--> U --φ--> K

That is, φ ∘ T ∈ V*. Thus the correspondence

    φ ↦ φ ∘ T

is a mapping from U* into V*; we denote it by Tᵗ and call it the transpose of T. In other words, Tᵗ: U* → V* is defined by

    Tᵗ(φ) = φ ∘ T

Thus (Tᵗ(φ))(v) = φ(T(v)) for every v ∈ V.

Theorem 12.6: The transpose mapping Tᵗ defined above is linear.

Proof. For any scalars a, b ∈ K and any linear functionals φ, σ ∈ U*,

    Tᵗ(aφ + bσ) = (aφ + bσ) ∘ T = a(φ ∘ T) + b(σ ∘ T) = aTᵗ(φ) + bTᵗ(σ)

That is, Tᵗ is linear, as claimed.

We emphasize that if T is a linear mapping from V into U, then Tᵗ is a linear mapping in the opposite direction, from U* into V*:

    T: V → U        Tᵗ: V* ← U*

The name "transpose" for the mapping Tᵗ no doubt derives from the following theorem, proved in Problem 12.16.

Theorem 12.7: Let T: V → U be linear, and let A be the matrix representation of T relative to bases {vᵢ} of V and {uᵢ} of U. Then the transpose matrix Aᵀ is the matrix representation of Tᵗ: U* → V* relative to the bases dual to {uᵢ} and {vᵢ}.
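In coordinates the theorem reduces to the identity (Aᵀφ)ᵀv = φᵀ(Av); a quick check assuming Python with numpy, with A, φ, and v chosen at random:

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.integers(-3, 4, size=(2, 3)).astype(float)   # T: R^3 -> R^2
    phi = rng.integers(-3, 4, size=2).astype(float)      # functional on R^2
    v = rng.integers(-3, 4, size=3).astype(float)

    # (T^t(phi))(v) = phi(T(v)); in coordinates T^t(phi) = A^T @ phi.
    assert np.isclose((A.T @ phi) @ v, phi @ (A @ v))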

Solved Problems

DUAL SPACES AND BASES

12.1. Consider the following basis of R³: {v₁ = (1, −1, 3), v₂ = (0, 1, −1), v₃ = (0, 3, −2)}. Find the dual basis {φ₁, φ₂, φ₃}.


We seek linear functionals

    φ₁(x, y, z) = a₁x + a₂y + a₃z,    φ₂(x, y, z) = b₁x + b₂y + b₃z,    φ₃(x, y, z) = c₁x + c₂y + c₃z

such that

    φ₁(v₁) = 1    φ₁(v₂) = 0    φ₁(v₃) = 0
    φ₂(v₁) = 0    φ₂(v₂) = 1    φ₂(v₃) = 0
    φ₃(v₁) = 0    φ₃(v₂) = 0    φ₃(v₃) = 1

We find φ₁ as follows:

    φ₁(v₁) = φ₁(1, −1, 3) = a₁ − a₂ + 3a₃ = 1
    φ₁(v₂) = φ₁(0, 1, −1) = a₂ − a₃ = 0
    φ₁(v₃) = φ₁(0, 3, −2) = 3a₂ − 2a₃ = 0

Solving the system of equations, we obtain a₁ = 1, a₂ = 0, a₃ = 0. Thus φ₁(x, y, z) = x.

We next find φ₂:

    φ₂(v₁) = φ₂(1, −1, 3) = b₁ − b₂ + 3b₃ = 0
    φ₂(v₂) = φ₂(0, 1, −1) = b₂ − b₃ = 1
    φ₂(v₃) = φ₂(0, 3, −2) = 3b₂ − 2b₃ = 0

Solving the system, we obtain b₁ = 7, b₂ = −2, b₃ = −3. Hence φ₂(x, y, z) = 7x − 2y − 3z.

Finally, we find φ₃:

    φ₃(v₁) = φ₃(1, −1, 3) = c₁ − c₂ + 3c₃ = 0
    φ₃(v₂) = φ₃(0, 1, −1) = c₂ − c₃ = 0
    φ₃(v₃) = φ₃(0, 3, −2) = 3c₂ − 2c₃ = 1

Solving the system, we obtain c₁ = −2, c₂ = 1, c₃ = 1. Thus φ₃(x, y, z) = −2x + y + z.

12.2. Let V be the vector space of polynomials over R of degree ≤ 1, i.e., V = {a + bt : a, b ∈ R}. Let φ₁: V → R and φ₂: V → R be defined by

    φ₁(f(t)) = ∫₀¹ f(t) dt    and    φ₂(f(t)) = ∫₀² f(t) dt

(We remark that φ₁ and φ₂ are linear and so belong to the dual space V*.) Find the basis {v₁, v₂} of V which is dual to {φ₁, φ₂}.

Let v₁ = a + bt and v₂ = c + dt. By definition of the dual basis,

    φ₁(v₁) = 1, φ₂(v₁) = 0    and    φ₁(v₂) = 0, φ₂(v₂) = 1

Thus

    φ₁(v₁) = ∫₀¹ (a + bt) dt = a + b/2 = 1
    φ₂(v₁) = ∫₀² (a + bt) dt = 2a + 2b = 0        or    a = 2, b = −2

    φ₁(v₂) = ∫₀¹ (c + dt) dt = c + d/2 = 0
    φ₂(v₂) = ∫₀² (c + dt) dt = 2c + 2d = 1        or    c = −1/2, d = 1

In other words, {2 − 2t, −1/2 + t} is the basis of V which is dual to {φ₁, φ₂}.
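The same computation in a few lines of sympy (an illustration, not part of the text):

    import sympy as sp

    t, a, b, c, d = sp.symbols('t a b c d')
    v1, v2 = a + b*t, c + d*t

    phi1 = lambda f: sp.integrate(f, (t, 0, 1))
    phi2 = lambda f: sp.integrate(f, (t, 0, 2))

    sol = sp.solve([phi1(v1) - 1, phi2(v1),       # phi1(v1)=1, phi2(v1)=0
                    phi1(v2),     phi2(v2) - 1],  # phi1(v2)=0, phi2(v2)=1
                   [a, b, c, d])
    print(sol)    # {a: 2, b: -2, c: -1/2, d: 1}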


12.3. Prove Theorem 12.1.

We first show that {φ₁, ..., φₙ} spans V*. Let φ be an arbitrary element of V*, and suppose

    φ(v₁) = k₁, φ(v₂) = k₂, ..., φ(vₙ) = kₙ

Set σ = k₁φ₁ + ⋯ + kₙφₙ. Then

    σ(v₁) = (k₁φ₁ + ⋯ + kₙφₙ)(v₁) = k₁φ₁(v₁) + k₂φ₂(v₁) + ⋯ + kₙφₙ(v₁) = k₁

Similarly, for i = 2, ..., n,

    σ(vᵢ) = (k₁φ₁ + ⋯ + kₙφₙ)(vᵢ) = k₁φ₁(vᵢ) + ⋯ + kᵢφᵢ(vᵢ) + ⋯ + kₙφₙ(vᵢ) = kᵢ

Thus φ(vᵢ) = σ(vᵢ) for i = 1, ..., n. Since φ and σ agree on the basis vectors, φ = σ = k₁φ₁ + ⋯ + kₙφₙ. Accordingly, {φ₁, ..., φₙ} spans V*.

It remains to be shown that {φ₁, ..., φₙ} is linearly independent. Suppose

    a₁φ₁ + a₂φ₂ + ⋯ + aₙφₙ = 0

Applying both sides to v₁, we obtain

    0 = 0(v₁) = (a₁φ₁ + ⋯ + aₙφₙ)(v₁) = a₁φ₁(v₁) + ⋯ + aₙφₙ(v₁) = a₁

Similarly, for i = 2, ..., n,

    0 = 0(vᵢ) = (a₁φ₁ + ⋯ + aₙφₙ)(vᵢ) = a₁φ₁(vᵢ) + ⋯ + aᵢφᵢ(vᵢ) + ⋯ + aₙφₙ(vᵢ) = aᵢ

That is, a₁ = 0, ..., aₙ = 0. Hence {φ₁, ..., φₙ} is linearly independent, and so it is a basis of V*.

12.4. Prove Theorem 12.2.

Suppose

    u = a₁v₁ + a₂v₂ + ⋯ + aₙvₙ        (1)

Then

    φ₁(u) = a₁φ₁(v₁) + a₂φ₁(v₂) + ⋯ + aₙφ₁(vₙ) = a₁·1 + a₂·0 + ⋯ + aₙ·0 = a₁

Similarly, for i = 2, ..., n,

    φᵢ(u) = a₁φᵢ(v₁) + ⋯ + aᵢφᵢ(vᵢ) + ⋯ + aₙφᵢ(vₙ) = aᵢ

That is, φ₁(u) = a₁, φ₂(u) = a₂, ..., φₙ(u) = aₙ. Substituting these results into (1), we obtain (12.1).

Next we prove (12.2). Applying the linear functional σ to both sides of (12.1),

    σ(u) = φ₁(u)σ(v₁) + φ₂(u)σ(v₂) + ⋯ + φₙ(u)σ(vₙ) = (σ(v₁)φ₁ + σ(v₂)φ₂ + ⋯ + σ(vₙ)φₙ)(u)

Since the above holds for every u ∈ V, σ = σ(v₁)φ₁ + σ(v₂)φ₂ + ⋯ + σ(vₙ)φₙ, as claimed.

12.5. Prove Theorem 12.3.

Suppose

    w₁ = a₁₁v₁ + a₁₂v₂ + ⋯ + a₁ₙvₙ        σ₁ = b₁₁φ₁ + b₁₂φ₂ + ⋯ + b₁ₙφₙ
    w₂ = a₂₁v₁ + a₂₂v₂ + ⋯ + a₂ₙvₙ        σ₂ = b₂₁φ₁ + b₂₂φ₂ + ⋯ + b₂ₙφₙ
    ..............................        ..............................
    wₙ = aₙ₁v₁ + aₙ₂v₂ + ⋯ + aₙₙvₙ        σₙ = bₙ₁φ₁ + bₙ₂φ₂ + ⋯ + bₙₙφₙ

where P = (aᵢⱼ) and Q = (bᵢⱼ). We seek to prove that Q = (P⁻¹)ᵀ.


Let Rᵢ denote the ith row of Q and let Cⱼ denote the jth column of Pᵀ. Then

    Rᵢ = (bᵢ₁, bᵢ₂, ..., bᵢₙ)    and    Cⱼ = (aⱼ₁, aⱼ₂, ..., aⱼₙ)ᵀ

By definition of the dual basis,

    σᵢ(wⱼ) = (bᵢ₁φ₁ + bᵢ₂φ₂ + ⋯ + bᵢₙφₙ)(aⱼ₁v₁ + aⱼ₂v₂ + ⋯ + aⱼₙvₙ) = bᵢ₁aⱼ₁ + bᵢ₂aⱼ₂ + ⋯ + bᵢₙaⱼₙ = RᵢCⱼ = δᵢⱼ

where δᵢⱼ is the Kronecker delta. Thus

    QPᵀ = (RᵢCⱼ) = (δᵢⱼ) = I

and hence Q = (Pᵀ)⁻¹ = (P⁻¹)ᵀ, as claimed.

12.6. Suppose V has finite dimension. Show that if v ∈ V, v ≠ 0, then there exists φ ∈ V* such that φ(v) ≠ 0.

We extend {v} to a basis {v, v₂, ..., vₙ} of V. By Theorem 9.2, there exists a unique linear mapping φ: V → K such that φ(v) = 1 and φ(vᵢ) = 0, i = 2, ..., n. Hence φ has the desired property.

12.7. Prove Theorem 12.4. If V has finite dimension, then V is isomorphic to V**.

We first prove that the map v ↦ v̂ is linear, i.e., for any vectors v, w ∈ V and any scalars a, b ∈ K, (av + bw)^ = av̂ + bŵ. For any linear functional φ ∈ V*,

    (av + bw)^(φ) = φ(av + bw) = aφ(v) + bφ(w) = av̂(φ) + bŵ(φ) = (av̂ + bŵ)(φ)

Since (av + bw)^(φ) = (av̂ + bŵ)(φ) for every φ ∈ V*, we have (av + bw)^ = av̂ + bŵ. Thus the map v ↦ v̂ is linear.

Now suppose v ∈ V, v ≠ 0. Then, by Problem 12.6, there exists φ ∈ V* for which φ(v) ≠ 0. Hence v̂(φ) = φ(v) ≠ 0, and thus v̂ ≠ 0. Since v ≠ 0 implies v̂ ≠ 0, the map v ↦ v̂ is nonsingular and hence an isomorphism (Theorem 9.9).

Now dim V = dim V* = dim V** because V has finite dimension. Accordingly, the mapping v ↦ v̂ is an isomorphism of V onto V**.

ANNIHILATORS

12.8. Show that if φ ∈ V* annihilates a subset S of V, then φ annihilates the linear span L(S) of S. Hence S⁰ = (span(S))⁰.

Suppose v ∈ span(S). Then there exist w₁, ..., wᵣ ∈ S for which v = a₁w₁ + a₂w₂ + ⋯ + aᵣwᵣ. Then

    φ(v) = a₁φ(w₁) + a₂φ(w₂) + ⋯ + aᵣφ(wᵣ) = a₁·0 + a₂·0 + ⋯ + aᵣ·0 = 0

Since v was an arbitrary element of span(S), φ annihilates span(S), as claimed.

12.9. Let W be the subspace of R⁴ spanned by v₁ = (1, 2, −3, 4) and v₂ = (0, 1, 4, −1). Find a basis of the annihilator of W.

By Problem 12.8, it suffices to find a basis of the set of linear functionals of the form φ(x, y, z, w) = ax + by + cz + dw for which φ(v₁) = 0 and φ(v₂) = 0:

    φ(1, 2, −3, 4) = a + 2b − 3c + 4d = 0
    φ(0, 1, 4, −1) = b + 4c − d = 0

The system of equations in the unknowns a, b, c, d is in echelon form with free variables c and d.

Set c = 1, d = 0 to obtain the solution a = 11, b = −4, c = 1, d = 0, and hence the linear functional φ₁(x, y, z, w) = 11x − 4y + z.

Set c = 0, d = −1 to obtain the solution a = 6, b = −1, c = 0, d = −1, and hence the linear functional φ₂(x, y, z, w) = 6x − y − w.

The set of linear functionals {φ₁, φ₂} is a basis of W⁰, the annihilator of W.
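Equivalently, W⁰ is the null space of the matrix whose rows are v₁ and v₂; a sketch assuming Python with scipy:

    import numpy as np
    from scipy.linalg import null_space

    # Rows span W (Problem 12.9); W^0 consists of the coefficient
    # vectors (a, b, c, d) with M @ (a, b, c, d) = 0.
    M = np.array([[1.0, 2.0, -3.0, 4.0],
                  [0.0, 1.0,  4.0, -1.0]])

    B = null_space(M)       # columns: an (orthonormal) basis of W^0
    print(B.shape)          # (4, 2), so dim W^0 = 4 - dim W = 2

    # The basis found above also works:
    for phi in ([11.0, -4.0, 1.0, 0.0], [6.0, -1.0, 0.0, -1.0]):
        assert np.allclose(M @ np.array(phi), 0.0)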

12.10. Show that: (a) for any subset S of V, S ⊆ S⁰⁰; and (b) if S₁ ⊆ S₂, then S₂⁰ ⊆ S₁⁰.

(a) Let v ∈ S. Then for every linear functional φ ∈ S⁰, v̂(φ) = φ(v) = 0. Hence v̂ ∈ (S⁰)⁰. Therefore, under the identification of V and V**, v ∈ S⁰⁰. Accordingly, S ⊆ S⁰⁰.

(b) Let φ ∈ S₂⁰. Then φ(v) = 0 for every v ∈ S₂. But S₁ ⊆ S₂; hence φ annihilates every element of S₁, i.e., φ ∈ S₁⁰. Therefore S₂⁰ ⊆ S₁⁰.

12.11. Prove Theorem 12.5. (i) dim W + dim W⁰ = dim V, (ii) W⁰⁰ = W.

(i) Suppose dim V = n and dim W = r. We want to show that dim W⁰ = n − r. We choose a basis {w₁, ..., wᵣ} of W and extend it to the following basis of V, say {w₁, ..., wᵣ, v₁, ..., vₙ₋ᵣ}. Consider the dual basis

    {φ₁, ..., φᵣ, σ₁, ..., σₙ₋ᵣ}

By definition of the dual basis, each of the above σ's annihilates each wᵢ; hence σ₁, ..., σₙ₋ᵣ ∈ W⁰. We claim that {σⱼ} is a basis of W⁰. Now {σⱼ} is part of a basis of V*, and so it is linearly independent. We next show that {σⱼ} spans W⁰. Let σ ∈ W⁰. By Theorem 12.2,

    σ = σ(w₁)φ₁ + ⋯ + σ(wᵣ)φᵣ + σ(v₁)σ₁ + ⋯ + σ(vₙ₋ᵣ)σₙ₋ᵣ = 0·φ₁ + ⋯ + 0·φᵣ + σ(v₁)σ₁ + ⋯ + σ(vₙ₋ᵣ)σₙ₋ᵣ = σ(v₁)σ₁ + ⋯ + σ(vₙ₋ᵣ)σₙ₋ᵣ

Consequently {σ₁, ..., σₙ₋ᵣ} spans W⁰, and so it is a basis of W⁰. Accordingly, as required,

    dim W⁰ = n − r = dim V − dim W

(ii) Suppose dim V = n and dim W = r. Then dim V* = n and, by (i), dim W⁰ = n − r. Thus, by (i), dim W⁰⁰ = n − (n − r) = r; therefore dim W = dim W⁰⁰. By Problem 12.10, W ⊆ W⁰⁰. Accordingly, W = W⁰⁰.

12.12. Let U and W be subspaces of V. Prove: (U + W)⁰ = U⁰ ∩ W⁰.

Let φ ∈ (U + W)⁰. Then φ annihilates U + W and so, in particular, φ annihilates U and W. That is, φ ∈ U⁰ and φ ∈ W⁰; hence φ ∈ U⁰ ∩ W⁰. Thus (U + W)⁰ ⊆ U⁰ ∩ W⁰.

On the other hand, suppose σ ∈ U⁰ ∩ W⁰. Then σ annihilates U and also W. If v ∈ U + W, then v = u + w where u ∈ U and w ∈ W. Hence σ(v) = σ(u) + σ(w) = 0 + 0 = 0. Thus σ annihilates U + W, i.e., σ ∈ (U + W)⁰. Accordingly, U⁰ ∩ W⁰ ⊆ (U + W)⁰.

Both inclusion relations give us the desired equality.

Remark: Observe that no dimension argument is employed in the proof; hence the result holds for spaces of finite or infinite dimension.

TRANSPOSE OF A LINEAR MAPPING

12.13. Let φ be the linear functional on R² defined by φ(x, y) = x − 2y. For each of the following linear operators T on R², find (Tᵗ(φ))(x, y):

(a) T(x, y) = (x, 0),    (b) T(x, y) = (y, x + y),    (c) T(x, y) = (2x − 3y, 5x + 2y)

By definition of the transpose mapping, Tᵗ(φ) = φ ∘ T, that is, (Tᵗ(φ))(v) = φ(T(v)) for every vector v. Hence:

(a) (Tᵗ(φ))(x, y) = φ(T(x, y)) = φ(x, 0) = x

(b) (Tᵗ(φ))(x, y) = φ(T(x, y)) = φ(y, x + y) = y − 2(x + y) = −2x − y

(c) (Tᵗ(φ))(x, y) = φ(T(x, y)) = φ(2x − 3y, 5x + 2y) = (2x − 3y) − 2(5x + 2y) = −8x − 7y
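Part (c) again via Theorem 12.7, assuming Python with numpy:

    import numpy as np

    A = np.array([[2.0, -3.0],
                  [5.0,  2.0]])    # matrix of T(x, y) = (2x - 3y, 5x + 2y)
    phi = np.array([1.0, -2.0])    # coefficients of phi(x, y) = x - 2y

    print(A.T @ phi)               # [-8. -7.], i.e. T^t(phi) = -8x - 7y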

12.14. Let T: V → U be linear and let Tᵗ: U* → V* be its transpose. Show that the kernel of Tᵗ is the annihilator of the image of T, i.e., Ker Tᵗ = (Im T)⁰.

Suppose φ ∈ Ker Tᵗ; that is, Tᵗ(φ) = φ ∘ T = 0. If u ∈ Im T, then u = T(v) for some v ∈ V; hence

    φ(u) = φ(T(v)) = (φ ∘ T)(v) = 0(v) = 0

We have that φ(u) = 0 for every u ∈ Im T; hence φ ∈ (Im T)⁰. Thus Ker Tᵗ ⊆ (Im T)⁰.

On the other hand, suppose σ ∈ (Im T)⁰; that is, σ(Im T) = {0}. Then, for every v ∈ V,

    (Tᵗ(σ))(v) = (σ ∘ T)(v) = σ(T(v)) = 0 = 0(v)

We have (Tᵗ(σ))(v) = 0(v) for every v ∈ V; hence Tᵗ(σ) = 0. Thus σ ∈ Ker Tᵗ, and so (Im T)⁰ ⊆ Ker Tᵗ.

Both inclusion relations give us the required equality.

12.15. Suppose V and U have finite dimension and suppose T: V → U is linear. Prove: rank T = rank Tᵗ.

Suppose dim V = n and dim U = m. Also suppose rank T = r. Then, by Theorem 12.5,

    dim (Im T)⁰ = dim U − dim (Im T) = m − rank T = m − r

By Problem 12.14, Ker Tᵗ = (Im T)⁰. Hence nullity Tᵗ = m − r. It then follows that, as claimed,

    rank Tᵗ = dim U* − nullity Tᵗ = m − (m − r) = r = rank T

12.16. Prove Theorem 12.7. If A represents T, then Aᵀ represents Tᵗ.

Suppose

    T(v₁) = a₁₁u₁ + a₁₂u₂ + ⋯ + a₁ₙuₙ
    T(v₂) = a₂₁u₁ + a₂₂u₂ + ⋯ + a₂ₙuₙ
    ................................        (1)
    T(vₘ) = aₘ₁u₁ + aₘ₂u₂ + ⋯ + aₘₙuₙ

We want to prove that

    Tᵗ(σ₁) = a₁₁φ₁ + a₂₁φ₂ + ⋯ + aₘ₁φₘ
    Tᵗ(σ₂) = a₁₂φ₁ + a₂₂φ₂ + ⋯ + aₘ₂φₘ
    ................................        (2)
    Tᵗ(σₙ) = a₁ₙφ₁ + a₂ₙφ₂ + ⋯ + aₘₙφₘ

where {σᵢ} and {φⱼ} are the bases dual to {uᵢ} and {vⱼ}, respectively.

Let v ∈ V and suppose v = k₁v₁ + k₂v₂ + ⋯ + kₘvₘ. Then, by (1),

    T(v) = k₁T(v₁) + k₂T(v₂) + ⋯ + kₘT(vₘ) = Σᵢ kᵢ(aᵢ₁u₁ + ⋯ + aᵢₙuₙ) = Σⱼ (k₁a₁ⱼ + k₂a₂ⱼ + ⋯ + kₘaₘⱼ)uⱼ

Hence, for j = 1, ..., n,

    (Tᵗ(σⱼ))(v) = σⱼ(T(v)) = k₁a₁ⱼ + k₂a₂ⱼ + ⋯ + kₘaₘⱼ        (3)

On the other hand, for j = 1, ..., n,

    (a₁ⱼφ₁ + a₂ⱼφ₂ + ⋯ + aₘⱼφₘ)(v) = (a₁ⱼφ₁ + a₂ⱼφ₂ + ⋯ + aₘⱼφₘ)(k₁v₁ + k₂v₂ + ⋯ + kₘvₘ) = k₁a₁ⱼ + k₂a₂ⱼ + ⋯ + kₘaₘⱼ        (4)

Since v ∈ V was arbitrary, (3) and (4) imply that

    Tᵗ(σⱼ) = a₁ⱼφ₁ + a₂ⱼφ₂ + ⋯ + aₘⱼφₘ,    j = 1, ..., n

which is (2). Thus the theorem is proved.

12.17. Let A be an arbitrary m × n matrix over a field K. Prove that the row rank and the column rank of A are equal.

Let T: Kⁿ → Kᵐ be the linear map defined by T(v) = Av, where the elements of Kⁿ and Kᵐ are written as column vectors. Then A is the matrix representation of T relative to the usual bases of Kⁿ and Kᵐ, and the image of T is the column space of A. Hence

    rank T = column rank of A

By Theorem 12.7, Aᵀ is the matrix representation of Tᵗ relative to the dual bases. Hence

    rank Tᵗ = column rank of Aᵀ = row rank of A

But by Problem 12.15, rank T = rank Tᵗ; hence the row rank and the column rank of A are equal. (This result was stated earlier as Theorem 5.18, and was proved in a direct way in Problem 5.53.)
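A spot-check assuming Python with numpy; any matrix will do:

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.integers(-5, 6, size=(4, 7)).astype(float)

    # Row rank = column rank: the ranks of A and A^T agree.
    assert np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A.T)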

Supplementary Problems

DUAL SPACES AND DUAL BASES

12.18. Define linear functionals φ: R³ → R and σ: R³ → R by φ(x, y, z) = 2x − 3y + z and σ(x, y, z) = 4x − 2y + 3z. Find: (a) φ + σ, (b) 3φ, (c) 2φ − 5σ.

12.19. Let V be the vector space of polynomials over R of degree ≤ 2. Let φ₁, φ₂, and φ₃ be the linear functionals on V defined by

    φ₁(f(t)) = ∫₀¹ f(t) dt,    φ₂(f(t)) = f′(1),    φ₃(f(t)) = f(0)

Here f(t) = a + bt + ct² ∈ V and f′(t) denotes the derivative of f(t). Find the basis {f₁(t), f₂(t), f₃(t)} of V which is dual to {φ₁, φ₂, φ₃}.

12.20. Suppose u, v ∈ V and that φ(u) = 0 implies φ(v) = 0 for all φ ∈ V*. Show that v = ku for some scalar k.

12.21. Suppose φ, σ ∈ V* and that φ(v) = 0 implies σ(v) = 0 for all v ∈ V. Show that σ = kφ for some scalar k.

12.22. Let V be the vector space of polynomials over K. For a ∈ K, define φ_a: V → K by φ_a(f(t)) = f(a). Show that: (a) φ_a is linear; (b) if a ≠ b, then φ_a ≠ φ_b.

12.23. Let V be the vector space of polynomials of degree ≤ 2. Let a, b, c ∈ K be distinct scalars. Let φ_a, φ_b, and φ_c be the linear functionals defined by φ_a(f(t)) = f(a), φ_b(f(t)) = f(b), φ_c(f(t)) = f(c). Show that {φ_a, φ_b, φ_c} is linearly independent, and find the basis {f₁(t), f₂(t), f₃(t)} of V which is its dual.

12.24. Let V be the vector space of square matrices of order n. Let T: V → K be the trace mapping: that is, T(A) = a₁₁ + a₂₂ + ⋯ + aₙₙ, where A = (aᵢⱼ). Show that T is linear.


12.25. Let W be a subspace of V. For any linear functional φ on W, show that there is a linear functional σ on V such that σ(w) = φ(w) for any w ∈ W, i.e., φ is the restriction of σ to W.

12.26. Let {e₁, ..., eₙ} be the usual basis of Kⁿ. Show that the dual basis is {π₁, ..., πₙ}, where πᵢ is the ith projection mapping: that is, πᵢ(a₁, ..., aₙ) = aᵢ.

12.27. Let V be a vector space over R. Let φ₁, φ₂ ∈ V* and suppose σ: V → R, defined by σ(v) = φ₁(v)φ₂(v), also belongs to V*. Show that either φ₁ = 0 or φ₂ = 0.

ANNIHILATORS

12.28. Let W be the subspace of R⁴ spanned by (1, 2, −3, 4), (1, 3, −2, 6), and (1, 4, −1, 8). Find a basis of the annihilator of W.

12.29. Let W be the subspace of R³ spanned by (1, 1, 0) and (0, 1, 1). Find a basis of the annihilator of W.

12.30. Show that, for any subset S of V, span(S) = S⁰⁰, where span(S) is the linear span of S.

12.31. Let U and W be subspaces of a vector space V of finite dimension. Prove: (U ∩ W)⁰ = U⁰ + W⁰.

12.32. Suppose V = U ⊕ W. Prove that V* = U⁰ ⊕ W⁰.

TRANSPOSE OF A LINEAR MAPPING

12.33. Let φ be the linear functional on R² defined by φ(x, y) = 3x − 2y. For each of the following linear mappings T: R³ → R², find (Tᵗ(φ))(x, y, z):

(a) T(x, y, z) = (x + y, y + z);    (b) T(x, y, z) = (x + y + z, 2x − y)

12.34. Suppose T₁: U → V and T₂: V → W are linear. Prove that (T₂ ∘ T₁)ᵗ = T₁ᵗ ∘ T₂ᵗ.

12.35. Suppose T: V → U is linear and V has finite dimension. Prove that Im Tᵗ = (Ker T)⁰.

12.36. Suppose T: V → U is linear and u ∈ U. Prove that u ∈ Im T or there exists φ ∈ U* such that Tᵗ(φ) = 0 and φ(u) = 1.

12.37. Let V be of finite dimension. Show that the mapping T ↦ Tᵗ is an isomorphism from Hom(V, V) onto Hom(V*, V*). (Here T is any linear operator on V.)

MISCELLANEOUS PROBLEMS

12.38. Let V be a vector space over R and let φ ∈ V*. Define:

    W⁺ = {v ∈ V : φ(v) > 0}    W = {v ∈ V : φ(v) = 0}    W⁻ = {v ∈ V : φ(v) < 0}

Prove that W⁺, W, and W⁻ are convex (see Problems 9.90 and 9.91).

12.39. Let V be a vector space of finite dimension. A hyperplane H of V may be defined as the kernel of a nonzero linear functional φ on V. [Under this definition a hyperplane necessarily "passes through the origin," i.e., contains the zero vector.] Show that every subspace of V is the intersection of a finite number of hyperplanes.


Answers to Supplementary Problems

12.18. (a) 6x − 5y + 4z,    (b) 6x − 9y + 3z,    (c) −16x + 4y − 13z

12.22. (b) Let f(t) = t. Then φ_a(f(t)) = a ≠ b = φ_b(f(t)), and therefore φ_a ≠ φ_b.

12.23. f₁(t) = (t² − (b + c)t + bc)/((a − b)(a − c)),    f₂(t) = (t² − (a + c)t + ac)/((b − a)(b − c)),    f₃(t) = (t² − (a + b)t + ab)/((c − a)(c − b))

12.28. {φ₁(x, y, z, t) = 5x − y + z,  φ₂(x, y, z, t) = 2y − t}

12.29. {φ(x, y, z) = x − y + z}

12.33. (a) (Tᵗ(φ))(x, y, z) = 3x + y − 2z,    (b) (Tᵗ(φ))(x, y, z) = −x + 5y + 3z

Chapter 13

Bilinear, Quadratic, and Hermitian Forms

13.1 INTRODUCTION

This chapter generalizes the notions of linear mappings and linear functionals. Specifically, we introduce the notion of a bilinear form. (Actually, the notion of a general multilinear mapping did appear in Section 7.13.) These bilinear maps also give rise to quadratic and Hermitian forms. Although quadratic forms occurred previously in the context of matrices, this chapter is treated independently of the previous results. (Thus there may be some overlap in the discussion and some examples and problems.)

13.2 BILINEAR FORMS

Let V be a vector space of finite dimension over a field K. A bilinear form on V is a mapping f: V × V → K which satisfies:

    (i) f(au₁ + bu₂, v) = af(u₁, v) + bf(u₂, v)
    (ii) f(u, av₁ + bv₂) = af(u, v₁) + bf(u, v₂)

for all a, b ∈ K and all uᵢ, vᵢ ∈ V. We express condition (i) by saying f is linear in the first variable, and condition (ii) by saying f is linear in the second variable.

Example 13.1

(a) Let φ and σ be arbitrary linear functionals on V. Let f: V × V → K be defined by f(u, v) = φ(u)σ(v). Then f is bilinear because φ and σ are each linear. (Such a bilinear form f turns out to be the "tensor product" of φ and σ, and so is sometimes written f = φ ⊗ σ.)

(b) Let f be the dot product on Rⁿ; that is,

    f(u, v) = u · v = a₁b₁ + a₂b₂ + ⋯ + aₙbₙ

where u = (aᵢ) and v = (bᵢ). Then f is a bilinear form on Rⁿ.

(c) Let A = (aᵢⱼ) be any n × n matrix over K. Then A may be identified with a bilinear form f on Kⁿ, where

    f(X, Y) = XᵀAY = Σᵢ,ⱼ aᵢⱼxᵢyⱼ = a₁₁x₁y₁ + a₁₂x₁y₂ + ⋯ + aₙₙxₙyₙ

The above formal expression in the variables xᵢ, yⱼ is termed the bilinear polynomial corresponding to the matrix A. Formula (13.1) below shows that, in a certain sense, every bilinear form is of this type.
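A short evaluation sketch, assuming Python with numpy; the matrix is the one from Problem 13.1 below:

    import numpy as np

    A = np.array([[3.0, -2.0,  0.0],
                  [5.0,  7.0, -8.0],
                  [0.0,  4.0, -1.0]])

    def f(X, Y):
        return X @ A @ Y        # X^T A Y for 1-D arrays

    X = np.array([1.0, 0.0, 2.0])
    Y = np.array([0.0, 1.0, 1.0])
    print(f(X, Y))              # a scalar: the bilinear polynomial at (X, Y)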

We will let B(V) denote the set of bilinear forms on V. A vector space structure is placed on B(V) by defining f + g and kf as follows:

    (f + g)(u, v) = f(u, v) + g(u, v)
    (kf)(u, v) = kf(u, v)

for any f, g ∈ B(V) and any k ∈ K. In fact,


Theorem 13.1: Let V be a vector space of dimension n over K. Let {φ₁, ..., φₙ} be any basis of the dual space V*. Then {fᵢⱼ : i, j = 1, ..., n} is a basis of B(V), where fᵢⱼ is defined by fᵢⱼ(u, v) = φᵢ(u)φⱼ(v). Thus, in particular, dim B(V) = n².

(See Problem 13.4 for the proof.)

13.3 BILINEAR FORMS AND MATRICES

Let f be a bilinear form on V, and let S = {u₁, u₂, ..., uₙ} be a basis of V. Suppose u, v ∈ V and suppose

    u = a₁u₁ + ⋯ + aₙuₙ    and    v = b₁u₁ + ⋯ + bₙuₙ

Then

    f(u, v) = f(a₁u₁ + ⋯ + aₙuₙ, b₁u₁ + ⋯ + bₙuₙ) = a₁b₁f(u₁, u₁) + a₁b₂f(u₁, u₂) + ⋯ + aₙbₙf(uₙ, uₙ) = Σᵢ,ⱼ aᵢbⱼ f(uᵢ, uⱼ)

Thus f is completely determined by the n² values f(uᵢ, uⱼ).

The matrix A = (aᵢⱼ) where aᵢⱼ = f(uᵢ, uⱼ) is called the matrix representation of f relative to the basis S or, simply, the matrix of f in S. It "represents" f in the sense that

    f(u, v) = Σᵢ,ⱼ aᵢbⱼ f(uᵢ, uⱼ) = [u]_Sᵀ A [v]_S        (13.1)

for all u, v ∈ V. [As usual, [u]_S denotes the coordinate (column) vector of u ∈ V in the basis S.]

We next ask, how does a matrix representing a bilinear form transform when a new basis is selected? The answer is given in the following theorem, proved in Problem 13.6. (Recall, by Theorem 10.4, that the change-of-basis matrix P from one basis S to another basis S′ has the property that [u]_S = P[u]_S′ for every u ∈ V.)

Theorem 13.2: Let P be the change-of-basis matrix from one basis S to another basis S′. If A is the matrix of f in the original basis S, then

    B = PᵀAP

is the matrix of f in the new basis S′.

The above theorem motivates the following definition.

Definition: A matrix B is said to be congruent to a matrix A if there exists an invertible (or: nonsingular) matrix P such that B = PᵀAP.

Thus by Theorem 13.2, matrices representing the same bilinear form are congruent. We remark that congruent matrices have the same rank, because P and Pᵀ are nonsingular; hence the following definition is well framed.


Definition: The rank of a bilinear form f on V, written rank f, is the rank of any matrix representation of f. We say that f is degenerate or nondegenerate according as rank f < dim V or rank f = dim V.

13.4 ALTERNATING BILINEAR FORMS

A bilinear form f on V is said to be alternating if

    (i) f(v, v) = 0

for every v ∈ V. If f is alternating, then

    0 = f(u + v, u + v) = f(u, u) + f(u, v) + f(v, u) + f(v, v)

and so

    (ii) f(u, v) = −f(v, u)

for every u, v ∈ V. A bilinear form which satisfies condition (ii) is said to be skew-symmetric (or antisymmetric). If 1 + 1 ≠ 0 in K, then condition (ii) implies f(v, v) = −f(v, v), which implies condition (i). In other words, alternating and skew-symmetric are equivalent when 1 + 1 ≠ 0.

The main structure theorem on alternating bilinear forms, proved in Problem 13.19, follows.

Theorem 13.3: Let f be an alternating bilinear form on V. Then there exists a basis of V in which f is represented by a block diagonal matrix of the form

    diag( [0 1; −1 0], [0 1; −1 0], ..., [0 1; −1 0], 0, ..., 0 )

Moreover, the number of blocks [0 1; −1 0] is uniquely determined by f (because it is equal to ½ rank f).

In particular, the above theorem shows that an alternating bilinear form must have even rank.

13.5 SYMMETRIC BILINEAR FORMS, QUADRATIC FORMS

A bilinear form f on V is said to be symmetric if

    f(u, v) = f(v, u)

for every u, v ∈ V. If A is a matrix representation of f, we can write

    f(X, Y) = XᵀAY = (XᵀAY)ᵀ = YᵀAᵀX

(We use the fact that XᵀAY is a scalar and therefore equals its transpose.) Thus if f is symmetric,

    YᵀAᵀX = f(X, Y) = f(Y, X) = YᵀAX

and since this is true for all vectors X, Y, it follows that A = Aᵀ, i.e., A is symmetric. Conversely, if A is symmetric, then f is symmetric.

The main result for symmetric bilinear forms, proved in Problem 13.11, is given in

Theorem 13.4: Let f be a symmetric bilinear form on V over K (in which 1 + 1 ≠ 0). Then V has a basis {v₁, ..., vₙ} in which f is represented by a diagonal matrix, i.e., f(vᵢ, vⱼ) = 0 for i ≠ j.

Theorem 13.4 (Alternate Form): Let A be a symmetric matrix over K (in which 1 + 1 ≠ 0). Then there exists an invertible (or nonsingular) matrix P such that PᵀAP is diagonal. That is, A is congruent to a diagonal matrix.

Since an invertible matrix P is a product of elementary matrices (Theorem 4.10), one way of obtaining the diagonal form PᵀAP is by a sequence of elementary row operations and the same sequence of elementary column operations. These same elementary row operations on I will yield Pᵀ.

Definition: A mapping q: V → K is called a quadratic form if q(v) = f(v, v) for some symmetric bilinear form f on V.

Now if f is represented by a symmetric matrix A = (aᵢⱼ), then q is represented in the form

    q(X) = f(X, X) = XᵀAX = Σᵢ,ⱼ aᵢⱼxᵢxⱼ = a₁₁x₁² + a₂₂x₂² + ⋯ + aₙₙxₙ² + 2 Σᵢ<ⱼ aᵢⱼxᵢxⱼ

The above formal expression in the variables xᵢ is termed the quadratic polynomial corresponding to the symmetric matrix A. Observe that if the matrix A is diagonal, then q has the diagonal representation

    q(X) = XᵀAX = a₁₁x₁² + a₂₂x₂² + ⋯ + aₙₙxₙ²

that is, the quadratic polynomial representing q will contain no "cross product" terms. Moreover, by Theorem 13.4, every quadratic form has such a representation (when 1 + 1 ≠ 0).

If 1 + 1 ≠ 0 in K, then the above definition can be reversed to yield

    f(u, v) = ½[q(u + v) − q(u) − q(v)]

which is the so-called polar form of f.
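A numerical sketch of the polar form, assuming Python with numpy; the symmetric matrix is the one found in Problem 13.7(a) below:

    import numpy as np

    A = np.array([[3.0,  2.0,  4.0],
                  [2.0, -1.0, -3.0],
                  [4.0, -3.0,  1.0]])

    def q(X):
        return X @ A @ X                 # quadratic form q(X) = X^T A X

    def f(u, v):                         # polar form recovers f from q
        return 0.5 * (q(u + v) - q(u) - q(v))

    u = np.array([1.0, 0.0, 2.0])
    v = np.array([0.0, 1.0, -1.0])
    assert np.isclose(f(u, v), u @ A @ v)    # f(u, v) = u^T A v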

13.6 REAL SYMMETRIC BILINEAR FORMS, LAW OF INERTIA

In this section we treat symmetric bilinear forms and quadratic forms on vector spaces over the real field R. These forms appear in many branches of mathematics and physics. The special nature of R permits an independent theory. The main result, proved in Problem 13.13, follows.

Theorem 13.5: Let f be a symmetric bilinear form on V over R. Then there is a basis of V in which f is represented by a diagonal matrix; every other diagonal representation has the same number p of positive entries and the same number n of negative entries. The difference s = p − n is called the signature of f.


A real symmetric bilinear form f is said to be nonnegative semidefinite if

    q(v) = f(v, v) ≥ 0

for every vector v, and is said to be positive definite if

    q(v) = f(v, v) > 0

for every vector v ≠ 0. By Theorem 13.5,

    (i) f is nonnegative semidefinite if and only if s = rank f,
    (ii) f is positive definite if and only if s = dim V,

where s is the signature of f.

Example 13.2. Let f be the dot product on Rⁿ; that is,

    f(u, v) = u · v = a₁b₁ + a₂b₂ + ⋯ + aₙbₙ

where u = (aᵢ) and v = (bᵢ). Note that f is symmetric, since

    f(u, v) = u · v = v · u = f(v, u)

Furthermore, f is positive definite, because

    f(u, u) = a₁² + a₂² + ⋯ + aₙ² > 0

when u ≠ 0.

In Chapter 14 we will see how a real quadratic form q transforms when the transition matrix P is orthogonal. If P is merely nonsingular, then q can be represented in diagonal form with only 1s and −1s as nonzero coefficients. Specifically,

Corollary 13.6: Any real quadratic form q has a unique representation in the form

    q(x₁, ..., xₙ) = x₁² + ⋯ + x_p² − x_(p+1)² − ⋯ − x_r²

where r = p + n is the rank of the form.

The above result is sometimes referred to as the Law of Inertia or Sylvester's Theorem.
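By the law of inertia, the counts p and n can be read off from eigenvalue signs (an orthogonal diagonalization is in particular a congruence). A quick check assuming Python with numpy; the matrix is the one diagonalized in Problem 13.8 below:

    import numpy as np

    A = np.array([[ 1.0, -3.0,  2.0],
                  [-3.0,  7.0, -5.0],
                  [ 2.0, -5.0,  8.0]])

    eig = np.linalg.eigvalsh(A)
    p = int(np.sum(eig > 1e-10))     # positive entries in any diagonalization
    n = int(np.sum(eig < -1e-10))    # negative entries
    print(p - n)                     # signature s = p - n = 1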

13.7 HERMITIAN FORMS

Let V be a vector space of finite dimension over the complex field C. Let f: V × V → C be such that

    (i) f(au₁ + bu₂, v) = af(u₁, v) + bf(u₂, v)
    (ii) f(u, v) = f(v, u)‾

where a, b ∈ C and uᵢ, v ∈ V. Then f is called a Hermitian form on V. (As usual, k̄ denotes the complex conjugate of k ∈ C; a trailing ‾ denotes the conjugate of the preceding expression.) By (i) and (ii),

    f(u, av₁ + bv₂) = f(av₁ + bv₂, u)‾ = (af(v₁, u) + bf(v₂, u))‾ = āf(v₁, u)‾ + b̄f(v₂, u)‾ = āf(u, v₁) + b̄f(u, v₂)

That is,

    (iii) f(u, av₁ + bv₂) = āf(u, v₁) + b̄f(u, v₂)

As before, we express condition (i) by saying f is linear in the first variable. On the other hand, we express condition (iii) by saying f is conjugate linear in the second variable. Note that, by (ii), we have f(v, v) = f(v, v)‾, and so f(v, v) is real for every v ∈ V.


The results of Sections 13.5 and 13.6 for symmetric forms have their analogues for Hermitian forms. Thus, the mapping q: V → R defined by q(v) = f(v, v) is called the Hermitian quadratic form or complex quadratic form associated with the Hermitian form f. We can obtain f from q by the polar form

    f(u, v) = ¼[q(u + v) − q(u − v)] + (i/4)[q(u + iv) − q(u − iv)]

Now suppose S = {u₁, ..., uₙ} is a basis of V. The matrix H = (hᵢⱼ) where hᵢⱼ = f(uᵢ, uⱼ) is called the matrix representation of f in the basis S. By (ii), f(uᵢ, uⱼ) = f(uⱼ, uᵢ)‾; hence H is Hermitian and, in particular, the diagonal entries of H are real. Thus any diagonal representation of f contains only real entries.

The next theorem, to be proved in Problem 13.33, is the complex analog of Theorem 13.5 on real symmetric bilinear forms.

symmetric bilinear forms.

Theorem 13.7: Let /be a Hermitian form on V. Then there exists a basisS={u ly..., \302\253\342\200\236}of V in

which/is represented by a diagonal matrix, i.e.,/(ui7 u,)= 0 for i \320\244).Moreover, every

diagonal representation of/ has the same number p of positive entries, and the same

number n of negative entries.The difference s = p\342\200\224n is called the signature of/

Analogously, a Hermitian form f is said to be nonnegative semidefinite if

    q(v) = f(v, v) ≥ 0

for every v ∈ V, and is said to be positive definite if

    q(v) = f(v, v) > 0

for every v ≠ 0.

Example 13.3. Let f be the dot product on Cⁿ; that is,

    f(u, v) = u · v = z₁w̄₁ + z₂w̄₂ + ⋯ + zₙw̄ₙ

where u = (zᵢ) and v = (wᵢ). Then f is a Hermitian form on Cⁿ. Moreover, f is positive definite since, for any u ≠ 0,

    f(u, u) = z₁z̄₁ + z₂z̄₂ + ⋯ + zₙz̄ₙ = |z₁|² + |z₂|² + ⋯ + |zₙ|² > 0

Solved Problems

BILINEARFORMS

13.1. Let u = (x₁, x₂, x₃) and v = (y₁, y₂, y₃), and let

    f(u, v) = 3x₁y₁ − 2x₁y₂ + 5x₂y₁ + 7x₂y₂ − 8x₂y₃ + 4x₃y₂ − x₃y₃

Express f in matrix notation.

Let A be the 3 × 3 matrix whose ij-entry is the coefficient of xᵢyⱼ. Then

                           [3 −2  0] [y₁]
    f(u, v) = (x₁, x₂, x₃) [5  7 −8] [y₂]
                           [0  4 −1] [y₃]

13.2. Let A be an n × n matrix over K. Show that the mapping f defined by

    f(X, Y) = XᵀAY

is a bilinear form on Kⁿ.

For any a, b ∈ K and any Xᵢ, Yᵢ ∈ Kⁿ,

    f(aX₁ + bX₂, Y) = (aX₁ + bX₂)ᵀAY = (aX₁ᵀ + bX₂ᵀ)AY = aX₁ᵀAY + bX₂ᵀAY = af(X₁, Y) + bf(X₂, Y)

Hence f is linear in the first variable. Also,

    f(X, aY₁ + bY₂) = XᵀA(aY₁ + bY₂) = aXᵀAY₁ + bXᵀAY₂ = af(X, Y₁) + bf(X, Y₂)

Hence f is linear in the second variable, and so f is a bilinear form on Kⁿ.

13.3. Let f be the bilinear form on R² defined by

    f((x₁, x₂), (y₁, y₂)) = 2x₁y₁ − 3x₁y₂ + x₂y₂

(a) Find the matrix A of f in the basis {u₁ = (1, 0), u₂ = (1, 1)}.
(b) Find the matrix B of f in the basis {v₁ = (2, 1), v₂ = (1, −1)}.
(c) Find the change-of-basis matrix P from the basis {uᵢ} to the basis {vᵢ}, and verify that B = PᵀAP.

(a) Set A = (aᵢⱼ) where aᵢⱼ = f(uᵢ, uⱼ):

    a₁₁ = f(u₁, u₁) = f((1, 0), (1, 0)) = 2 − 0 + 0 = 2
    a₁₂ = f(u₁, u₂) = f((1, 0), (1, 1)) = 2 − 3 + 0 = −1
    a₂₁ = f(u₂, u₁) = f((1, 1), (1, 0)) = 2 − 0 + 0 = 2
    a₂₂ = f(u₂, u₂) = f((1, 1), (1, 1)) = 2 − 3 + 1 = 0

Thus A = [2 −1; 2 0] is the matrix of f in the basis {u₁, u₂}.

(b) Set B = (bᵢⱼ) where bᵢⱼ = f(vᵢ, vⱼ):

    b₁₁ = f(v₁, v₁) = f((2, 1), (2, 1)) = 8 − 6 + 1 = 3
    b₁₂ = f(v₁, v₂) = f((2, 1), (1, −1)) = 4 + 6 − 1 = 9
    b₂₁ = f(v₂, v₁) = f((1, −1), (2, 1)) = 4 − 3 − 1 = 0
    b₂₂ = f(v₂, v₂) = f((1, −1), (1, −1)) = 2 + 3 + 1 = 6

Thus B = [3 9; 0 6] is the matrix of f in the basis {v₁, v₂}.

(c) We must write v₁ and v₂ in terms of the uᵢ:

    v₁ = (2, 1) = (1, 0) + (1, 1) = u₁ + u₂
    v₂ = (1, −1) = 2(1, 0) − (1, 1) = 2u₁ − u₂

Then P = [1 2; 1 −1], and so Pᵀ = [1 1; 2 −1]. Thus

    PᵀAP = [1 1; 2 −1][2 −1; 2 0][1 2; 1 −1] = [3 9; 0 6] = B
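The whole computation can be verified in a few lines, assuming Python with numpy:

    import numpy as np

    # f((x1, x2), (y1, y2)) = 2x1y1 - 3x1y2 + x2y2 has standard-basis
    # matrix [[2, -3], [0, 1]].
    F = np.array([[2.0, -3.0],
                  [0.0,  1.0]])

    def f(u, v):
        return u @ F @ v

    u_basis = [np.array([1.0, 0.0]), np.array([1.0, 1.0])]
    v_basis = [np.array([2.0, 1.0]), np.array([1.0, -1.0])]

    A = np.array([[f(a, b) for b in u_basis] for a in u_basis])
    B = np.array([[f(a, b) for b in v_basis] for a in v_basis])
    P = np.array([[1.0, 2.0],
                  [1.0, -1.0]])        # columns: v_j in {u_i} coordinates

    assert np.allclose(P.T @ A @ P, B)   # Theorem 13.2: B = P^T A P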

13.4. Prove Theorem 13.1.

Let {u₁, ..., uₙ} be the basis of V dual to {φᵢ}. We first show that {fᵢⱼ} spans B(V). Let f ∈ B(V) and suppose f(uᵢ, uⱼ) = aᵢⱼ. We claim that f = Σᵢ,ⱼ aᵢⱼfᵢⱼ. It suffices to show that

    f(uₛ, uₜ) = (Σᵢ,ⱼ aᵢⱼfᵢⱼ)(uₛ, uₜ)        for s, t = 1, ..., n

We have

    (Σᵢ,ⱼ aᵢⱼfᵢⱼ)(uₛ, uₜ) = Σᵢ,ⱼ aᵢⱼφᵢ(uₛ)φⱼ(uₜ) = Σᵢ,ⱼ aᵢⱼδᵢₛδⱼₜ = aₛₜ = f(uₛ, uₜ)

as required. Hence {fᵢⱼ} spans B(V).

Next, suppose Σᵢ,ⱼ aᵢⱼfᵢⱼ = 0. Then for s, t = 1, ..., n,

    0 = 0(uₛ, uₜ) = (Σᵢ,ⱼ aᵢⱼfᵢⱼ)(uₛ, uₜ) = aₛₜ

The last step follows as above. Thus {fᵢⱼ} is independent and hence is a basis of B(V).

13.5. Let [f] denote the matrix representation of a bilinear form f on V relative to a basis {u₁, ..., uₙ} of V. Show that the mapping f ↦ [f] is an isomorphism of B(V) onto the vector space of n-square matrices.

Since f is completely determined by the scalars f(uᵢ, uⱼ), the mapping f ↦ [f] is one-to-one and onto. It suffices to show that the mapping f ↦ [f] is a homomorphism; that is, that

    [af + bg] = a[f] + b[g]        (*)

However, for i, j = 1, ..., n,

    (af + bg)(uᵢ, uⱼ) = af(uᵢ, uⱼ) + bg(uᵢ, uⱼ)

which is a restatement of (*). Thus the result is proved.

13.6. Prove Theorem 13.2: B = PᵀAP represents f relative to the new basis.

Let u, v ∈ V. Since P is the change-of-basis matrix from S to S′, we have P[u]_S′ = [u]_S and also P[v]_S′ = [v]_S; hence [u]_Sᵀ = [u]_S′ᵀPᵀ. Thus

    f(u, v) = [u]_Sᵀ A [v]_S = [u]_S′ᵀ PᵀAP [v]_S′

Since u and v are arbitrary elements of V, PᵀAP is the matrix of f in the basis S′.

SYMMETRIC BILINEAR FORMS, QUADRATIC FORMS

13.7. Find the symmetric matrix which corresponds to each of the following quadratic polynomials:

    (a) q(x, y, z) = 3x² + 4xy − y² + 8xz − 6yz + z²        (b) q(x, y, z) = x² − 2yz + xz

The symmetric matrix A = (aᵢⱼ) representing q(x₁, ..., xₙ) has the diagonal entry aᵢᵢ equal to the coefficient of xᵢ², and has the entries aᵢⱼ and aⱼᵢ each equal to half the coefficient of xᵢxⱼ. Thus

        [3  2  4]                [ 1   0   ½]
    (a) [2 −1 −3]            (b) [ 0   0  −1]
        [4 −3  1]                [ ½  −1   0]

13.8. For the following real symmetric matrix A, find a nonsingular matrix P such that PᵀAP is diagonal, and also find its signature:

        [ 1 −3  2]
    A = [−3  7 −5]
        [ 2 −5  8]

First form the block matrix (A, I):

             [ 1 −3  2 | 1 0 0]
    (A, I) = [−3  7 −5 | 0 1 0]
             [ 2 −5  8 | 0 0 1]


Apply the row operations R₂ → 3R₁ + R₂ and R₃ → −2R₁ + R₃ to (A, I), and then the corresponding column operations C₂ → 3C₁ + C₂ and C₃ → −2C₁ + C₃ to A, to obtain

    [1 −3  2 |  1 0 0]                [1  0 0 |  1 0 0]
    [0 −2  1 |  3 1 0]   and then     [0 −2 1 |  3 1 0]
    [0  1  4 | −2 0 1]                [0  1 4 | −2 0 1]

Next apply the row operation R₃ → R₂ + 2R₃ and then the corresponding column operation C₃ → C₂ + 2C₃ to obtain

    [1  0 0 |  1 0 0]                 [1  0  0 |  1 0 0]
    [0 −2 1 |  3 1 0]   and then      [0 −2  0 |  3 1 0]
    [0  0 9 | −1 1 2]                 [0  0 18 | −1 1 2]

Now A has been diagonalized. Set

        [1 3 −1]                       [1  0  0]
    P = [0 1  1]   ;   then PᵀAP =     [0 −2  0]
        [0 0  2]                       [0  0 18]

The signature of A is s = 2 − 1 = 1.

13.9. Suppose 1 + 1 ≠ 0 in K. Give a formal algorithm to diagonalize (under congruence) a symmetric matrix A = (aᵢⱼ) over K.

Case (i): a₁₁ ≠ 0. Apply the row operations Rᵢ → −aᵢ₁R₁ + a₁₁Rᵢ, i = 2, ..., n, and then the corresponding column operations Cᵢ → −aᵢ₁C₁ + a₁₁Cᵢ, to reduce A to the form [a₁₁ 0; 0 B].

Case (ii): a₁₁ = 0 but aᵢᵢ ≠ 0, for some i > 1. Apply the row operation R₁ ↔ Rᵢ and then the corresponding column operation C₁ ↔ Cᵢ to bring aᵢᵢ into the first diagonal position. This reduces the matrix to Case (i).

Case (iii): All diagonal entries aᵢᵢ = 0. Choose i, j such that aᵢⱼ ≠ 0, and apply the row operation Rᵢ → Rⱼ + Rᵢ and the corresponding column operation Cᵢ → Cⱼ + Cᵢ to bring 2aᵢⱼ ≠ 0 into the ith diagonal position. This reduces the matrix to Case (ii).

In each of the cases, we can finally reduce A to the form [a₁₁ 0; 0 B], where B is a symmetric matrix of order less than that of A. By induction we can finally bring A into diagonal form.

Remark: The hypothesis that 1 + 1 ≠ 0 in K is used in Case (iii), where we state that 2aᵢⱼ ≠ 0.
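A compact numeric rendering of the algorithm, assuming Python with numpy; it uses exact zero tests and paired row/column operations, so it is meant for small exact examples rather than floating-point robustness:

    import numpy as np

    def congruence_diagonalize(A):
        # Return (P, D) with D = P^T A P diagonal, for real symmetric A.
        A = A.astype(float).copy()
        n = A.shape[0]
        P = np.eye(n)
        for i in range(n):
            if A[i, i] == 0:
                # Case (ii): swap in a nonzero diagonal entry, if any.
                for j in range(i + 1, n):
                    if A[j, j] != 0:
                        A[[i, j]] = A[[j, i]]; A[:, [i, j]] = A[:, [j, i]]
                        P[:, [i, j]] = P[:, [j, i]]
                        break
                else:
                    # Case (iii): all remaining diagonal entries are 0.
                    for j in range(i + 1, n):
                        if A[i, j] != 0:
                            A[i] += A[j]; A[:, i] += A[:, j]
                            P[:, i] += P[:, j]
                            break
            if A[i, i] == 0:
                continue                  # row/column i is already zero
            # Case (i): clear row i and column i past the pivot.
            for j in range(i + 1, n):
                c = A[j, i] / A[i, i]
                A[j] -= c * A[i]; A[:, j] -= c * A[:, i]
                P[:, j] -= c * P[:, i]
        return P, A

    A = np.array([[1.0, -3.0, 2.0], [-3.0, 7.0, -5.0], [2.0, -5.0, 8.0]])
    P, D = congruence_diagonalize(A)      # matrix of Problem 13.8
    assert np.allclose(P.T @ A @ P, D)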

13.10. Let q be the quadratic form associated with the symmetric bilinear form f. Verify the polar identity f(u, v) = ½[q(u + v) − q(u) − q(v)]. (Assume that 1 + 1 ≠ 0.)

We have

    q(u + v) − q(u) − q(v) = f(u + v, u + v) − f(u, u) − f(v, v)
                           = f(u, u) + f(u, v) + f(v, u) + f(v, v) − f(u, u) − f(v, v)
                           = 2f(u, v)

If 1 + 1 ≠ 0, we can divide by 2 to obtain the required identity.

13.11. Prove Theorem 13.4. A symmetric bilinear form f is diagonalizable.

Method 1. If f = 0 or if dim V = 1, then the theorem clearly holds. Hence we can suppose f ≠ 0 and dim V = n > 1. If q(v) = f(v, v) = 0 for every v ∈ V, then the polar form of f (see Problem 13.10) implies that f = 0. Hence we can assume there is a vector v₁ ∈ V such that f(v₁, v₁) ≠ 0. Let U be the subspace spanned by v₁, and let W consist of those vectors v ∈ V for which f(v₁, v) = 0. We claim that V = U ⊕ W.

(i) Proof that U ∩ W = {0}: Suppose u ∈ U ∩ W. Since u ∈ U, u = kv₁ for some scalar k ∈ K. Since u ∈ W, f(v₁, u) = 0, and so f(u, u) = kf(v₁, u) = 0. But f(u, u) = f(kv₁, kv₁) = k²f(v₁, v₁), and f(v₁, v₁) ≠ 0; hence k = 0, and therefore u = kv₁ = 0. Thus U ∩ W = {0}.

(ii) Proof that V = U + W: Let v ∈ V. Set

    w = v − (f(v₁, v)/f(v₁, v₁)) v₁        (1)

Then

    f(v₁, w) = f(v₁, v) − (f(v₁, v)/f(v₁, v₁)) f(v₁, v₁) = 0

Thus w ∈ W. By (1), v is the sum of an element of U and an element of W. Thus V = U + W.

By (i) and (ii), V = U ⊕ W.

Now f restricted to W is a symmetric bilinear form on W. But dim W = n − 1; hence by induction there is a basis {v₂, ..., vₙ} of W such that f(vᵢ, vⱼ) = 0 for i ≠ j and 2 ≤ i, j ≤ n. But by the very definition of W, f(v₁, vⱼ) = 0 for j = 2, ..., n. Therefore the basis {v₁, ..., vₙ} of V has the required property that f(vᵢ, vⱼ) = 0 for i ≠ j.

Method 2. The algorithm in Problem 13.9 shows that every symmetric matrix over K is congruent to a diagonal matrix. This is equivalent to the statement that f has a diagonal representation.

Method 2. The algorithm in Problem 13.9 shows that every symmetric matrix over \320\232is congruent to adiagonal matrix. This is equivalent to the statement that/has a diagonal representation.

\\

13.12. Let A = , a diagonal matrix over K. Show that:

(a) For any nonzero scalars ku ..., \320\272\342\200\236\320\265\320\232,A is congruent to a diagonal matrix with diagonal

entries \302\253,-fc?;

(b) If \320\232is the complex field C, then A is congruent to a diagonal matrix with only Is and 0s asdiagonal entries;

(c) If \320\232is the real field R, then /1 is congruent to a diagonal matrix with only Is, \342\200\224Is, and 0s

as diagonal entries.

(o) Let P be the diagonal matrix with diagonal entries k,. Then

\\

PTAP =

\\

'*. \\

kj

(olk\\ \\

o2k\\

{b) Let P be the diagonal matrix with diagonal entries bf=

required form.

(c) Let P be the diagonal matrix with diagonal entries ht\302\246\302\246

required form.

=\\

\"' ' \302\260'. Then PTAP has the

1 if \320\236;= \320\236

af| ifflj/O

if \320\276,= \320\236

. Then PT>1P has the

Remark: We emphasize that (fc) is no longer true if congruence is replaced by Hermitian

congruence (see Problems 13.32and 13.33).

13.13. Prove Theorem 13.5.

By Theorem 13.4, there is a basis{ut,..., \320\270\342\200\236}of V in which/is representedby a diagonal matrix with,

say, p positive and n negative entries. Now suppose {w, \320\270>\342\200\236}is another basis of V, in which/is rep-

CHAP. 13] BILINEAR, QUADRATIC, AND HERMITIAN FORMS 419

resentedby a diagonal matrix with p' positive and n' negative entries. We can assume without loss in

generality that the positive entries in each matrix appear first. Since rank / = p + n = p' + n', it suffices to

prove that p = p'.Let U be the linear span of ult..., up

and let W be the linear span ofwp. + l, ..., wn. Then/(i>, v) > 0

for every nonzero v e U, and /(i>, i>) < 0 for every nonzero v e W. Hence U r\\ W = {0}. Note that

dim U = p and dim W = n \342\200\224p'. Thus

dim (I/ + W)= dim U + dim W - dim (U n W) = p + (n

- p') - 0 = p - p' + n

But dim (U + W) < dim V = n; hence p -p' + n < n or p < p'. Similarly, p' < p and therefore p = p', as

required.

Remark: The above theorem and proof dependonly on the concept of positivity. Thus thetheorem is true for any subfield \320\232of the real field R such as the rational field Q.

13.14. An n x n real symmetric matrix A is called positive definite if XTAX > 0 for every nonzero(column) vector X e R\", i.e., if A is positive definite viewedas a bilinear form. Let \320\222be any real

nonsingular matrix. Show that (a) BTB is symmetric and (b) BTBis positive definite.

(a) (BTB)T = BTBTT= BTB,henceBTB is symmetric.

(b) Since \320\222is nonsingular, BX \320\2440 for any nonzero X e R\". Hence the dot product of BX with itself,BX - BX = {BX)T{BX), is positive. Thus XT{BTB)X= (XTBT^BX) = (BX)T(BX) > 0 as required.

HERMITIAN FORMS

13.15.Determine which of the following matrices are Hermitian:

2 2 + 3i 4 - 5i\\ / 3 2-J 4 + i\\

2 \342\200\2243i 5 6 + 2*1 I 2-i 6 iV4 + 5i 6-2j -7 / \\4 + i i 3

(a) (b) (c)A matrix A = (ay) is Hermitian iff A \342\200\224A*, i.e., iffo^

= oj|.

(a) Thematrix is Hermitian, since it is equal to its conjugate transpose.

(b) The matrix is not Hermitian, even though it is symmetric.

(c) Thematrix is Hermitian. In fact, a real matrix is Hermitian if and only if it is symmetric.

13.16. Let A be a Hermitian matrix. Show that / is a Hermitian form on \320\241where / is defined by

f(X, Y) = XTAY.

For all a, b e \320\241and all Xlt X2, Y e C\",

/(oX, + bX2, Y)=

(\320\260\320\245,+ bX2)TAY =(aX\\ + bXj)AY

=aX\\AY + bX\\AY =

^(\320\224-\342\200\236\320\243)+ bf(X2, Y)

Hence/is linear in the first variable. Also,

f(X, Y) = XTAY = {XrAY)T = \"F/FX = YTA*X = \320\223\320\223/1\320\245=/(\320\243, \320\245)

Hence/is a Hermitian form on C\". (Remark: We use the fact that \320\233\320\223\320\242\320\233\320\243is a scalar and so it is equal to its

transpose.)

13.17. Let/be a Hermitian form on V. Let H be the matrix of/in a basis S = {u,,..., \320\270\320\270}of V. Show

that:

(\320\260)/(\320\270,i7)=

[\320\270^\320\235\320\250for all u, v e V,

420 BILINEAR, QUADRATIC, AND HERMITIAN FORMS [CHAP. 13

(b) If P is the change-of-basis matrix from S to a new basis S' of V, then \320\222= PTHP (or\320\222= Q*HQ where Q = P) is the matrix of/in the new basis S'.

Note that (b) is the complex analog of Theorem 13.2.

(a) Let u, v e V and suppose \320\270= 0,1*, + \320\260\320\263\320\270\320\263+ \302\246\302\246\302\246+ \320\260\342\200\236\320\270\342\200\236and t) = i^u, + \320\254\320\263\320\270\320\263+ \302\246\302\246\302\246+ \320\254\342\200\236\320\270\342\200\236.Then

/6,

i.J

as required.

(b) SinceP is the change-of-basis matrix from S to S\\ we have P[u~\\s.= [u]s and P[d]s. = [u]s; hence

MI = Mf PT and PL =\320\240\320\2317.Thus by (a),

But u and v are arbitrary elements of V; hence PTHP is the matrix of/in the basis S'.

1 1 + i 2i13.18.Let \320\257= | 1 \342\200\224i 4 2 \342\200\2243i J, a Hermitian matrix. Find a nonsingular matrix P such that

-2i 2 + 3i 7

is diagonal.

First form the block matrix {H, I):

1 1 + i

1 -i 2 \342\200\2243i

2 + 3i

1 0 0\\

0 1 0

0 0 1/

Apply the row operations (\342\200\2241+ i)K, + \320\2312-\302\273\320\2572and 2iU, + R3 -\302\273\320\2333to (\320\233,/), and then the conjugate

column operations (\342\200\2241 \342\200\224i)Ct + C2-*C2 and \342\200\2242iC,+ C3 -\302\273C3 to i4. to obtain

and then

Next apply the row operation R3-\302\273\342\200\224

5i/?2 + 2R3 and the conjugate column operation C3 -\302\2735iC2 + 2C3 toobtain

0

2

5i

0

-5i3

1

-1+i

2i

0

1

0

0

0

1

1

Now H has beendiagonalized.Set

100

020

00

-38_1

5

1

+ i

+ 9i

01

\342\200\2245i

0

0

2

and then

't 0 0\\

and then PTHP = 10 2 0\\o 0 -38/

Note that the signature ofHiss=2\342\200\224 1 = 1.

MISCELLANEOUS PROBLEMS

13.19. Prove Theorem 13.3.

If/= 0, then the theorem is obviously true. Also, if dim V = 1, then/(fc,u, k2u)= klk1f(u, u) = 0 and

so/= 0. Accordingly we can assume that dim V > I and/# 0.

CHAP. 13] BILINEAR, QUADRATIC, AND HERMITIAN FORMS 421

Since /# 0, there exist (nonzero) u,, u2 e V such that/(uj, u2) \321\2040. In fact, multiplying ut by an

appropriate factor, we can assume that/(u,, u2)\342\200\2241 and so/(u2, u,) = \342\200\224I. Now ut and u2 are linearly

independent; because if, say, u2 = feu,, then /(u,, u2) =/(u,, feu,) =fe/(u,, u,) = 0. Let U = span (u,, u2);

then

(i) The matrix representation of the restriction of/to U in the basis {u,, u2] is IJ;

(ii) If u e U, say u =aut + bu2, then

/(u, u,)=/(ou, + bu2, u,) = -fc

/(\302\253.\022) =/(o\"i + b\302\2532.\302\2532)= \320\276

Let W consist of those vectorsw e V such that/(w, Uj)= 0 and/(w, u2) = 0.Equivalently,

W = {w e K:/(w, u)= 0 for every ueU}

We claim that \320\232= U \320\244\320\230\320\233It is clear that U r\\ W = {0}, and so it remains to show that V = U + W. LetveV. Set

u =/(\302\253,u2)u, \342\200\224/(\320\270,u,)u2 and w = d \342\200\224u (/)

Since u is a linear combination of u, and u2, \320\270e I/. We show that w e \320\230\320\233By (/) and (ii),/(u, u,) =/(t>, u,);hence

/(w, u,) =/(u -u, u,) =/(y, u,) -f{u, u,) = 0

Similarly,/(u, u2) =f(v, u2) and so

/(w, u2) +/(u- u, u2) =f{v, u2) -/(u, u2) = 0

Then w e W and so, by (/), v = \320\270+ w where ue W. This shows that V \342\200\224U + W; and therefore

\320\232= U \320\244W.

Now the restriction of/ to W is an alternating bilinear form on W. By induction, there exists a basis

u3,..., un of W in which the matrix representing / restricted to W has the desired form. Accordingly,u,, u2, u3, ..., un is a basis of F in which the matrix representing/has the desired form.

Supplementary Problems

BILINEAR FORMS/1 2\\

13.20. Let V be the vector space of 2 \320\2662 matrices over R. Let M = I 1, and Iet/(>1, B) = tr {ATMB), where

\320\220,\320\222e V and \"tr\" denotes trace, (a) Show that/is a bilinear form on V. (b) Find the matrix of/in the basis

1\320\241 \320\226 ;>\321\201 sx::)}

13.21. Let B(F)be the set of bilinear forms on V over K. Prove:

(o) If/ 9 e B(F), then/+ 9 and kf, for fe e K, also belong to B(V), and so \320\222(\320\230)is a subspace of the vector

space of functions from V x V into \320\232;

(b) If $ and \320\276are linear functional on V, then/(u, i>)=

<?(u)<7(i>) belongs to B(V).

13.22. Let/ bea bilinear form on V. For any subset S of V, we write

S\021-= {v e V :/(u, v) = 0 for every ueSj ST ={u e V :f(v, u) = 0 for every u e S}

422 BILINEAR, QUADRATIC, AND HERMITIAN FORMS [CHAP. 13

Show that: (a) Sx and ST are subspacesof V; {b) S, gS2 implies Si ? Si and Sj^Sj; and (c)

{0}x = {0}T= V.

13.23. Prove: If / is a bilinear form on V, then rank /= dim V \342\200\224dim VL = dim V \342\200\224dim FT and hencedim VL = dim FT.

13.24. Let/ bea bilinear form on V. For each \320\270e F, let u : V -\302\273\320\232and u : F -\302\273X be defined by fi(x) = f(x, u) and

\"M =/(\302\253.x). Prove:

(\320\276)\320\277and \320\270are each linear, i.e. it, \320\270e F*;

(b) \320\270\320\274\320\270and ui\342\200\224>\320\270are each linear mappings from V into F*;

(c) rank / = rank (u i\342\200\224>u) = rank (u i\342\200\224>ii).

13.25. Show that congruence of matrices is an equivalence relation, i.e., (i) A is congruent to A; (ii) if A is con-congruent to B, then \320\222is congruent to A; (iii) if A is congruent to \320\222and \320\262is congruent to C, then \320\233is

congruent to \320\241

SYMMETRIC BILINEAR FORMS, QUADRATIC FORMS

13.26. Find the symmetric matrices belonging to the following quadratic polynomials:

(a) q(x, y, z) = 2x2 -Sxy + y2 \342\200\224\\bxz + \\Ayz + 5z2 (c) q{x, y, z) = xy + y1 4- 4xz + z2

(b) q(x, y, z) = x2 \342\200\224xz + y2 (^ <?(x,y, z) =xy + yz

13.27. For eachof the following matrices A, find a nonsingular matrix P such that PTAP is diagonal:

F) \342\200\236

In each case find the rank and signature.

13.28. Let S(V) be the set of symmetric bilinear forms on F. Show that:

(i) S( F) is a subspaceof B{ F); (ii) if dim V = n, then dim S(V)= }n(n + 1).

13.29. SupposeA is a real symmetric positive definite matrix. Show that there exists a nonsingular matrix P suchthat A = PTP.

n

13.30. Consider a real quadratic polynomial q{xt,..., xj = ? \320\262\321\203\321\205,-\321\205,,where o^

= aJ(.\342\200\242.j = i

(i) If o,, # 0, show that the substitution

Oil

yields the equation <\320\263(\321\205\321\214..., \321\205\342\200\236)=

\320\276,,\321\2032+ q'(y2,..., yn), where <j' is also a quadratic polynomial,

(ii) If a,, = 0 but, say, o12 \320\2440, show that the substitution

*i=

>\"i + \320\2432,*2 =\320\2431

-\320\2432> *\321\215

=\320\243\320\267.\342\200\242- \342\200\242> *n

=\320\243*

yields the equation q(x,,..., \321\205\342\200\236)= J] b^y^j, where b,, # 0, i.e., reduces this case to case (i). This

method of diagonalizing q is known as \"completing the square.\"

CHAP. 13] BILINEAR, QUADRATIC, AND HERMITIAN FORMS 423

HERMITIAN FORMS

13.31.Let A be any complex nonsingular matrix. Show that \320\230= A*A is Hermitian and positive definite.

1332. We say that \320\222is Hermitian congruent to A if there exists a nonsingular matrix Q such that \320\222= Q*AQ.Show that Hermitian congruence is an equivalence relation.

1333. Prove Theorem 13.7. [Note that the second part of the theorem does not hold for complex symmetricbilinear forms, as is shown by Problem 13.12(ii). However, the proof of Theorem 13.5 in Problem 13.13 does

carry over to the Hermitian case.]

MISCELLANEOUS PROBLEMS

1334. Let V and W be vector spacesover K.A mapping/: V x W -\302\273\320\232is called a bilinear form on V and W if:

(i) f(av, + bv2, w)= af(vt, w) + bf(v2, w)

(ii) f(v, \320\271\320\272-,+ \320\254\320\270>2)= af(v, w,) + bf(v, w2)

for every a, b e K, v( e V, w} e W. Prove the following:

(a) The set B(V, W) of bilinear forms on V and W is a subspace of the vector space of functions fromV x W into K.

(b) 1({\321\2041,..., \321\204\321\202]is a basis of V* and {\321\201\320\263\342\200\236..., on} is a basisof W*, then {ftj: i= l,...,m,j= 1,..., n} is

a basis of B(F, WO where/v is defined by/iy(t>, w) = ^w)or/w). Thus dim B(F, \320\2300= dim V - dim \320\230\320\233

Remark: Observe that if V = W, then we obtain the space B[V) investigated in this

chapter.

1335. Let \320\230be a vector space over K. A mapping/: VxVx---xV->Kis called a multilinear (or m-linear)

form on F if/is linear in each variable, i.e., for i = 1, ..., m,

/(..., ST+^\", ...) = fl/(...,ii,...)+ *>/(..., v,...)

where\" denotes the ith component, and other components are held fixed. An m-linear form / is said to bealternating if

f(v,, ..., i>m)= 0 whenever vt

= vk, i \320\244\320\272

Prove:

{a) The set Bm(V) of m-linear forms on \320\243is a subspace of the vector space of functions fromV x V x \302\246\302\246\302\246x V into K.

(b) ThesetAm(V) of alternating m-linear forms on V is a subspace of Bm(F).

Remark 1: If m = 2, then we obtain the space B(V)investigated in this chapter.

Remark 2: If V = Km, then the determinant function is a particular alternating m-linear

form on V.

Answers to Supplementary Problems

'10 2 0

3 0 5

424 BILINEAR, QUADRATIC, AND HERMITIAN FORMS [CHAP. 13

13.26.(\320\276)-

.3.27.

(c) P = r = 4, s = 2

\320\236 0 469/

Chapter 14

Linear Operators on Inner Product Spaces

14.1 INTRODUCTION

This chapter investigates the space A{V) of linear operators T on an inner product space V. (See

Chapter 6.) Thus the base field \320\232is either the real field R or the complex field \320\241In fact, different

terminologies will be used for the real case and for the complex case. We also use the fact that the inner

product on EuclideanspaceR\" may be defined by

<\320\274,t;>= uTv

and that the inner product on complex Euclideanspace\320\241may be defined by

<\320\274,t;> = uTv

where \320\270and v are column vectors.The readershould review the material in Chapter 6. In particular, the reader should be very familiar

with the notions of norm (length), orthogonality,and orthonormal bases.

Lastly, we mention that Chapter 6 mainly dealt with real inner product spaces. On the other hand,

here we assume that \320\230is a complex inner product space unlessotherwise stated or implied.

14.2 ADJOINT OPERATORS

We begin with the following basic definition.

Definition: A linear operator T on an inner product space V is said to have an adjoint operator T* onV if <T(u), v> =

<\320\274,T*(v)y for every \320\270,v e V.

The following example shows that the adjoint operator has a simpledescriptionwithin the context

of matrix mappings.

Example 14.1

(a) Let A be a real \320\270-square matrix viewed as a linear operator on R\". Then, for every u,ve R\",

{\320\220\320\270,u> = (Aufv = uTATv =<\302\253,ATv}

Thus the transposed matrix AT is the adjoint of A.

(b) Let \320\222be a complex \320\270-square matrix viewed as a linear operator on C\". Then, for every u.v e C\",

<Bu, v} = IBufv = uTBTv = uTWv = \302\253TB*D= <\302\253,J3*t>>

Thus the conjugate transposed matrix B* = BT is the adjoint of B.

Remark: The notation B*, employed here for the adjoint of \320\222was previously

used to denote the conjugate transposeof B. By Example 14.1(fe) the ambiguity is ahappy one.

The following theorem, proved in Problem 14.4, is the main result in this section.

425

426 LINEAR OPERATORS ON INNER PRODUCT SPACES [CHAP. 14

Theorem 14.1: Let \320\223be a linear operator on a finite-dimensional inner product space V over \320\232

Then:

(i) There exists a unique linear operator T* on V such that <\320\242(\320\275),v} = <u, T*(v)}for every u, v e V. (That is, T has an adjoint T*.)

(ii) If A is the matrix representation of T with respect to any orthonormal basis

S = {m,}of V, then the matrix representation of T* in the basis S is the conjugatetranspose A* of A (or the transpose AT of A when \320\232is real).

We emphasize that no such simple relationship exists between the matrices representing T and T*if the basis is not orthonormal. Thus we see one useful property of orthonormal bases.We also empha-

emphasizethat this theorem is not valid if V has infinite dimension (Problem 14.31).

Example 14.2. Let 7\" be the linear operator on C3 defined by

T(x, y, z) = Bx + iy, \321\203- 5iz, x + (I -

i)y + 3z)

We find a similar formula for the adjoint 7\"* of T. Note (Problem lO.l) that the matrix of 7\" in the usual basis of C3is

Recallthat the usual basis is orthonormal. Thus by Theorem 14.1, the matrix of T* in this basis is the conjugate

transpose of [T]:

/ 2 0 1 \\

[7*]= -i 11+i]

\\ 0 Si 3 /

Accordingly,

T*{x, y,z)={2x + z, -ix + \321\203+ A + i)z, 5iy + 3z)

The following theorem, proved in Problem 14.5, summarizes some of the properties of the adjoint.

Theorem 14.2: Let \320\223,\320\242\342\200\236T2 be linear operators on V and let \320\272\320\265\320\232.Then:

@ (T, + T2)*= T? + 71 (Hi) {\320\242,\320\2422)*=\320\242^\320\242^

(ii) (kT)* = k'T* (iv) (\320\223*)*= T

Observe the similarity between the above theorem and Theorem 3.3 on properties of the transpose

operation on matrices.

Linear Functionals and Inner ProductSpaces

Recall (Chapter 12) that a linear functional \321\204on a vector space \320\232is a linear mapping from V into

the base field K. This subsectioncontains an important result (Theorem 14.3)which is used in the proofof the above basic Theorem 14.1.

Let V be an inner product space. Each \320\270e V determines a mapping \320\270: V -\302\273\320\232defined by

=\320\236,\320\274>

Now for any a, b e \320\232and any r>,, v2 e V,

u{avi + bv2) =<at>i + bv2, u) = a<i>,, \320\274>+ b<u2, u> = au(vt) + bii(v2)

CHAP. 14] LINEAR OPERATORS ON INNER PRODUCT SPACES 427

That is, \320\274is a linear functional on V. The converse is also true for spaces of finite dimension and is animportant theorem (proved in Problem 14.3). Namely,

Theorem 14.3: Let \321\204be a linear functional on a finite-dimensional inner product space V. Then thereexistsa unique vector \320\270e V such that <j>(v)

= <i>, \320\275)for every v e V.

We remark that the above theorem is not valid for spaces of infinite dimension (Problem 14.24),

although some general results in this direction are known. (One such famous result is the Riesz Repre-Representation Theorem.)

14.3 ANALOGY BETWEEN A{V) AND C, SPECIALOPERATORS

Let A(V) denote the algebra of all linear operators on a finite-dimensional inner product space V.

The adjoint mapping T\\-*T* on A(V) is quite analogous to the conjugation mapping zhz on the

complex field C. To illustrate this analogy we identify in Table 14-1 certain classesof operators

T e A(V) whose behavior under the adjoint map imitates the behavior under conjugation of familiar

classes of complex numbers.

Table 14-1

Class of

complex numbers

Unit circle (|z|= 1)

Real axis

Imaginary axis

Positive real axis

@,oo)

Behavior under

conjugation

z=l/z

z \342\200\224z

z = \342\200\224z

z = ww, w \321\2040

Class of

operators in A(V)

Orthogonal operators (real case)Unitary operators (complex case)

Self-adjoint operatorsAlso called:

symmetric (real case)Hermitian (complex case)

Skew-adjoint operatorsAlso called:

skew-symmetric (real case)skew-Hermitian (complex case)

Positive definite operators

Behavior under

the adjoint map

\320\223*= 7'\021

r* = \321\202

T*= -T

T = S*Swith S nonsingular

The analogy between theseclassesofoperators\320\223and complex numbers z is reflectedin the follow-

followingtheorem.

Theorem 14.4: Let \320\257be an eigenvalue of a linear operatorT on V.

(i) If T* = T ' (i.e.,T is orthogonal or unitary), then | A | = 1.

(ii) If T* = T (i.e.,T is self-adjoint), then \320\257is real,

(iii) If T* = \342\200\224T (i.e., T is skew-adjoint), then \320\257is pure imaginary.

(iv) If T = S*S with S nonsingular (i.e., \320\223is positive definite), then \320\257is real and

positive.

Proof. In each case let t be a nonzero eigenvector of T belonging to \320\257,that is, T(v) = /.v with

vjtO; hence iv, v} is positive.

428 LINEAR OPERATORS ON INNER PRODUCT SPACES [CHAP. 14

Proof of (\\): We show that kl(v, i>>= <v, u>:

U(v, v> =<A\302\273,At;) = (T(v), T(v)> =

<\302\273,T*T(v)>=

<\302\273,/(\302\273}>=

<\302\273,\302\273>

But <i>, i;> # 0; hence kl = 1 and so | A | = 1.

Proof ofin): We show that A<u, i;> =A<t;, i;>:

But <i;, i;> / 0; henceA = A and so A is real.

Proof of (iii): We show that A<u, t;> = \342\200\224A<u, t;>:

A<\302\273,\302\273>=

<A\302\273,r> = <T(t;), \302\273>=

<\302\273,T*(v)>=

<\302\273,-7X\302\273)>=

<\302\273,-A\302\273>=

-A<\302\273, \302\273>

But <t;, i>> / 0; hence A = \342\200\224A or A = \342\200\224A,and so A is pure imaginary.

Proof of (\\\\): Note first that S(t;) ^ 0 becauseS is nonsingular; hence <S(u), S(i>)>is positive. We

show that a(v, f> =<S(i>), S(i;)>:

\320\220<\320\243,\321\216>=

<A\302\273,v> = <T(t;), \302\273>= <S*S(v), \302\273>

= <S(t;), S(\302\273)>

But <u, t;> and <S(t;),S(t;)> are positive; hence A is positive.

Remark: Each of the above operators T commutes with its adjoint, that is,TT* = T*T.Such operators are called normal operators.

14.4 SELF-ADJOINTOPERATORS

Let T be a self-adjoint operator on an inner product space V, that is, suppose

T* = T

(In case, T is defined by a matrix A, then A is symmetric or Hermitian according as A is real or

complex.) By Theorem 14.4,the eigenvalues of T are real. Another important property of T follows.

Theorem 14.5: Let \320\223be a self-adjoint operator on V. Suppose \320\270and t; are eigenvectors of T belongingto distinct eigenvalues. Then \320\270and t; are orthogonal, i.e., <\320\274,t;> = 0.

Proof. Suppose T(u) =X{u and T(v) = X2v where A, # A2. We show that X^u. u> = A2<u, t;>:

A,<\302\253,\302\273>=

<A,\302\253,\302\273>=

<T(\302\253), \302\273>=

<M, T*(v)> =<\302\253,T(\302\273)>

= <M, A2 !;> =A2<M, D>

= A2<H, D>

(The fourth equality uses the fact that \320\223*= T and the last equality uses the fact that the eigenvalue A2

is real.) Since A, / A2, we get <\320\274,t;> = 0. Thus the theorem is proved.

14.5 ORTHOGONAL AND UNITARY OPERATORS

Let U be a linear operator on a finite-dimensional inner product space V. Recall that if

U* = ?/-' or equivalent^ I/1/*= U*U = I

then I/ is said to be orthogonal or unitary according as the underlying field is real or complex. The next

theorem, proved in Problem 14.10,givesalternative characterizations of these operators.

CHAP. 14] LINEAR OPERATORS ON INNER PRODUCT SPACES 429

Theorem 14.6: The following conditions on an operator U are equivalent:

(i) U* = V ', that is, UU* = U*U = /.

(ii) U preserves inner products, i.e., for every v,w e V,

(iii) U preserves lengths, i.e., for every v s V, \\\\U(v)\\\\=

\\\\v\\\\.

Example 14.3

(a) Let T : R3 -\302\273R3 be the linear operator which rotates each vector about the z axis by a fixed angle \320\262as shown

in Fig. 14-1; 7\" is defined by

7(x, y, z) = (x cos0 \342\200\224\321\203sin 0, x sin \320\262+ \321\203cos \320\262,z)

Note that lengths (distancesfrom the origin) are preserved under T. Thus T is an orthogonal operator.

T(v)

Fig. 14-1

(b) Let V be the /2-spaceof Example 6.3(fc). Let T : V -\302\273V be the linear operator defined by

T(fl1,fl2,...) = @,fl1,fl2,...).

Clearly, T preserves inner products and lengths. However, T is not surjective since, for example, A, 0, 0, ...)does not belong to the imageof T; hence T is not invertible. Thus we see that Theorem 14.6is not valid for

spaces of infinite dimension.

An isomorphism from one inner product spaceinto another is a bijective mapping which preserves

the three basic operations of an inner product space: vector addition, scalar multiplication, and inner

products. Thus the above mappings (orthogonal and unitary) may also be characterized as the iso-

isomorphisms of V into itself. Note that such a mapping V also preserves distances, since

\\\\U(v)-

\320\222\320\224\320\246= \\\\U(v

- w)|| = ||p-HI

Hence U is also called an isometry.

14.6 ORTHOGONAL AND UNITARY MATRICES

Let U bea linear operator on an inner product space V. By Theorem 14.1, we obtain the followingresult when the base field \320\232is complex.

Theorem 14.7A: A matrix A with complex entries represents a unitary operator U (relative to anorthonormal basis) if and only if A* = A '.

On the other hand, if the base field \320\232is real then A* =AT; hence we have the following corre-

corresponding theorem for real inner product spaces.

Theorem 14.7B: A matrix A with real entries representsan orthogonal operator U (relative to anorthonormal basis) if and only if AT = A~l.

The above theorems motivate the following definitions.

430 LINEAR OPERATORS ON INNER PRODUCT SPACES [CHAP. 14

Definition: A complex matrix A for which A* = A ', or, equivalently, AA* = A*A = /, is called aunitary matrix.

Definition: A real matrix A for which AT = A ', or, equivalently, AAT = A1A = /, is called an orthog-orthogonalmatrix.

Observe that a unitary matrix with real entries is orthogonal.

(\320\260. \320\260\320\233

Example 14.4. Suppose A \342\200\224I I is a unitary matrix. Then A A* = / and hence\\\302\260ibiJ

\\b\\ b2j\\a2 b2) \\5)fci + a2b2 |fc,| + ]b2] J \\0 \\J

Thus

l\302\253il2+ |a2|2= 1 |ft,|2 + |f>2l2= 1 and a,/*, + a2 fc2

= 0

Accordingly, the rows of A form an orthonormal set. Similarly, A*A = I forces the columns of A to form an

orthonormal set.

The result in the above example holds true in general; namely,

Theorem 14.8: The following conditions for a matrix A are equivalent:

(i) A is unitary (orthogonal),

(ii) The rows of A form an orthonormal set.

(iii) The columns of A form an orthonormal set.

14.7 CHANGE OF ORTHONORMAL BASIS

In view of the special role of orthonormal bases in the theory of inner product spaces,we are

naturally interested in the properties of the change-of-basis matrix from one such basis into another.

The following theorem, proved in Problem 14.12, applies.

Theorem 14.9: Let {\320\270\342\200\236..., \320\274\342\200\236}be an orthonormal basis of an inner product space V. Then the

change-of-basis matrix from {\320\274,} into another orthonormal basis is unitary

(orthogonal). Conversely, if P = (ay) is a unitary (orthogonal) matrix, then the follow-followingis an orthonormal basis:

{\320\270'(=

\320\260\320\270\321\211+a2iu2+\302\246\302\246\302\246

+aniun:i= 1,.... n)

Recall that matrices A and \320\222representing the same linear operator T are similar, i.e., \320\222= P~lAP

where P is the (nonsingular) change-of-basismatrix. On the other hand, if V is an inner product space,we are usually interested in the case when P is unitary (or orthogonal) as suggestedby Theorem 14.9.

(Recall that P is unitary if P* = P \\ and P is orthogonal if PT = P \320\233)This leads to the followingdefinition.

Definition: Complexmatrices A and \320\222are unitarily equivalent if there is a unitary matrix P for which

\320\222= P*AP. Analogously, real matrices A and \320\222are orthogonally equivalent if there is an

orthogonal matrix P for which \320\222= PTAP.

Note that orthogonally equivalentmatricesare necessarilycongruent.

CHAP. 14] LINEAR OPERATORS ON INNER PRODUCT SPACES 431

14.8 POSITIVE OPERATORS

Let P be a linear operator on an inner product space V. P is said to be positive (or semidefinite) if

P = S*S for some operator S

and is said to be positive definite if S is also nonsingular. The next theorems give alternative character-

characterizations of these operators. (Theorem 14.10Ais proved in Problem 14.21.)

Theorem 14.10A: The following conditions on an operator P are equivalent:

(i) P = T2 for some self-adjointoperatorT.

(ii) P is positive,

(iii) P is self-adjointand <\320\240(\320\274),\320\275>> \320\236for every \320\270e V.

The corresponding theorem for positivedefinite operators is

Theorem 14.10B: The following conditions on an operator P are equivalent:

(i) P = T2 for some nonsingular self-adjoint operator T.

(ii) P is positive definite,

(iii) P is self-adjoint and <P(\302\253),\320\274>> 0 for every \320\270\321\2040 in V.

14.9 DIAGONALIZATION AND CANONICAL FORMS IN EUCLIDEAN SPACES

Let \320\223be a linear operator on a finite-dimensional inner product space V over K. Representing T

by a diagonal matrix depends upon the eigenvectorsand eigenvalues of T, and hence upon the roots of

the characteristic polynomial A(f) of T. Now A(t) always factors into linear polynomials over the

complex field C, but may not have any linear polynomials over the real field R. Thus the situation for

Euclidean spaces (where \320\232= R) is inherently different than that for unitary spaces (where \320\232=\320\241);

hence we treat them separately. We investigate Euclidean spaces below (see Problem 14.14for proof of

Theorem 14.11), and unitary spaces in the next section.

Theorem 14.11: Let T be a symmetric (self-adjoint) operator on a real finite-dimensional inner pro-product space V. Then there exists an orthonormal basis of V consisting of eigenvectorsof \320\223;that is, T can be represented by a diagonal matrix relative to an orthonormalbasis.

We give the corresponding statement for matrices.

Theorem 14.11(Alternate Form): Let A be a real symmetric matrix. Then there exists an orthogonalmatrix P such that \320\222= P~lAP = PTAP is diagonal.

We can choose the columns of the above matrix P to be normalized orthogonal eigenvectorsof A;

then the diagonal entries of \320\222are the corresponding eigenvalues.

Canonical Form for OrthogonalOperators

An orthogonal operator T need not be symmetric, and so it may not be represented by a diagonalmatrix relative to an orthonormal basis. However, such an operator T does have a simple canonical

representation, as described in the following theorem, proved in Problem 14.16.

432 LINEAR OPERATORS ON INNER PRODUCT SPACES [CHAP.14

Theorem 14.12: Let T be an orthogonal operator on a real inner product space V. Then there is anorthonormal basiswith respect to which T has the following form:

1

-1

-1cos 0, \342\200\224sin 0,

, sin 0, sin 0,

\\

cos 0r \342\200\224sin0r i

sin 6r cos 6r ;

The reader may recognize that each of the above 2x2 diagonal blocksrepresents a rotation in the

corresponding two-dimensionalsubspaceand that each diagonal entry \342\200\2241 represents a reflection in the

corresponding one-dimensional subspace.

14.10 DIAGONALIZATION AND CANONICAL FORMS IN UNITARY SPACES

We now present the fundamental diagonalization theorem for complex inner product spaces, i.e., forunitary spaces. Recall that an operator T is said to be normal if it commutes with its adjoint, i.e., if

TT* = T*T. Analogously, a complex matrix A is said to be normal if it commutes with its conjugatetranspose, i.e.,if AA* = A*A.

Example 14.5. Let A

\\i 3 + 2iJThen

4X'

Y1\\i \320\227+ 2/Al 3 -2

3 + 3i 14

l i- -(2

3\023'1\\3 + 3i 14 )

AA* =

A*A =

Thus A is a normal matrix.

The following theorem applies.

Theorem 14.13: Let T be a normal operator on a complexfinite-dimensional inner product space V.

Then there exists an orthonormal basis of V consisting of eigenvectors of T; that is, T

can be represented by a diagonal matrix relative to an orthonormal basis.

We give the corresponding statement for matrices.

Theorem 14.13(Alternate Form): Let A be a normal matrix. Then there exists a unitary matrix P such

that B=P lAP= P*AP is diagonal.

The following theorem shows that even nonnormal operators on unitary spaces have a relativelysimple form.

CHAP. 14] LINEAR OPERATORS ON INNER PRODUCT SPACES 433

Theorem 14.14: Let T be an arbitrary operator on a complexfinite-dimensional inner product spaceV. Then T can be represented by a triangular matrix relative to an orthonormal basis

of V.

Theorem 14.14 (Alternate Form): Let A be an arbitrary complex matrix. Then there exists a unitarymatrix P such that \320\222= P~ XAP = P*AP is triangular.

14.11 SPECTRAL THEOREM

The Spectral Theorem is a reformulation of the diagonalization Theorems 14.11and 14.13.

Theorem 14.15 (Spectral Theorem): Let \320\223be a normal (symmetric) operator on a complex(real)finite-

dimensional inner product space V. Then there exist linear oper-operators ?,,..., ?r on V and scalars A,,..., \320\257,such that

(i) T = /,?, +A2?2+ \302\246\302\246\302\246+ /,?, (iii) ?? =

?\342\200\236...,?r2= ?r

(ii) ?, +?2 + --\302\246+ ?, = / (iv) EiEj = OfoTi^j.

The above linear operators ?,,..., ?p are projections in the sense that ?? = ?,. Moreover, they are

said to be orthogonalprojections since they have the additional property that ?, Ej= 0 for i \320\244j.

The following example shows the relationship between a diagonal matrix representation and thecorresponding orthogonal projections.

/2

3

3Example 14.6. Consider a diagonal matrix, say A = Let

/I

\\

\\

/o

?,=

/0

?3 =

The reader can verify that:

(i) A = 2EX + 3?2 + 5?3 (ii) ?, + ?2 + ?3 = /, (iii) Ef = ?,, and (iv) , = Ofor i

Solved Problems

ADJOINTS

C

- 7i 18 4 + i

- Ii 6 - i 2 - 3i I.

8 + i 7 + 9i 6 + 3i/

C

+ 7i 7i 8 - i \\

18 6 + i 7 \342\200\2249i I.

4 - i 2 + 3i 6 - 3i/

14.2. Find the adjoint of each linear operator:

(a) F : R3 -> R3 denned by F(x, y, 2)= Cx + Ay

- 5z, 2x -6y + Iz, 5x-9y + z).

434 LINEAR OPERATORS ON INNER PRODUCT SPACES [CHAP. 14

(b) G : C3 -* C3 defined by

G(x, y, z) = Bx + A- i)y, C + 2i> -

4iz, 2ix + D -3i)y

- 3z)

{a) First find the matrix A representing T in the usual basis of R3. (Recallthe rows of A are the coefficients

of x, y, z.) Thus

/3 4 -S\\

A=\\2 -6 7

\\5 -9 l;

Since the base field is R, the adjoint F* is represented by the transpose AT of A. Thus form

/ 3 2 A

AT = \\ 4 -6 -9

\\-5 7 l/Then F*(x, y, z) = Cx + 2y + 5z, 4x -

6y- 9z, \342\200\2245x + 7y + z).

(fc) F'irst find the matrix \320\222representing 7\" in the usual basis of C3:

B

1 - i 0

3 + 2i 0 -4i2i 4-3/ -3;

Form the conjugate transpose B* of B:

B

3 \342\200\2242. ~2i

l+i 0 4 + 3/

0 4i -3

Thus G*(x, y, z) = Bx + C -2i)y

- 2iz, A + Ox + D + 3i)z, 4iy- 3z).

14.3. Prove Theorem 14.3.

Let {wlt..., wn} be an orthonormal basis of V. Set

\320\270= <Kwt)wj + <f>(w2)w2 + \342\200\242\342\200\242\302\246+ <jiwn)wn

Let \320\271be the linear functional on V defined by u(v) = <i\\ u>, for every v e V. Then for i = 1,..., \320\270,

Since u and \321\204agree on each basis vector, it =\321\204.

Now suppose \320\270is another vector in V for which $(u) =<u, u'> for every v e V. Then <f, u>

= <d, u'>or (v, \320\270\342\200\224\320\270\321\203

= 0. In particular this is true for v = \320\270\342\200\224\320\270and so <u \342\200\224\320\270',\320\270\342\200\224

u'> = 0. This yields\320\270\342\200\224\320\270'= 0 and \321\213= i/. Thus such a vector u is unique as claimed.

14.4. Prove Theorem 14.1.

Proof of (i):

We first define the mapping 7\"*. Let t> be an arbitrary but fixed element of V. The map ut\342\200\224\302\273<T(u),t>>

is a linear functional on F. Henceby Theorem 14.3, there exists a unique element v' e V such that

<7\"(u), t>> = <u, c'> for every \320\270e K. We define TT - \320\232by T*{v) = t>'. Then <7'(u), u> = <u, T*(d)>for

every u, v e V.We next show that T* is linear. For any u, vt e V, and any a,beK,

<u, r*(av{ + bv2)y= <T(u), \320\264\320\260,+ fct>2>

= a(T(u\\ d,> + b<T(u),v2}

=\320\260<\320\274,r*(Vl)} + biu, T*(d2)>= <u, flT*(B,) + bT*{v2)}

But this is true for every us V; hence T*{avx + hv2) = a7\"(D,) + fc7\"(i;2). Thus T* is linear.

CHAP. 14] LINEAR OPERATORS ON INNER PRODUCT SPACES 435

Proofo/(ii):

The matrices A =(\320\260,7)

and \320\222=(\320\233\320\276)representing T and T*, respectively, in the basis S =

{u,} are given

by a,j=

<\320\223(\320\2707),\320\260(>and ^ =<T*(uy), w,-> (see Problem 6.23). Hence

Thus \320\222= /4*, as claimed.

14.5. Prove Theorem 14.2.

(i) For any u,ve V,

<(T, + \320\223\302\273.v> = <7i(ii) + T\302\273,i\302\273>=

<\320\2421(\320\274),i\302\273>+ <7\302\273, p>

= <\320\270,7\302\273> + <\320\274,\320\223}(\302\273)>=

<\302\253,7\302\273 + T%v= <u, (\320\263*-i- rsx\302\273)>

The uniqueness of the adjoint implies (T, + T2)*=

\320\223^+ 7\"J.

(ii) For any u,v e V,

\302\253kT\342\204\226v> =<fcT(u), i\302\273>

= k(T(u), vy =fc<u, \320\223*A))>

=<\321\206,k~T*(v)> = <

The uniqueness of the adjoint implies (kT)* = fcT*.

(iii) For any u. v e V,

\302\253\320\223.\320\242\320\233\320\270),u> = <T1(T2Ku), t;> =<7\302\273, \320\223\302\273>

= <\320\270,\320\223*G\302\273)>= <u, (TJTtXp\302\273

The uniqueness of the adjoint implies (T,T2)*=Tf Tf.

(iv) For any u, v e K,

> = <c,T*(u)>= <T(i>), u> = <\302\253,7X\302\273)>

The uniqueness of the adjoint implies (T*)*= T.

14.6. Show that: (a) J* = J, and (b)0* = 0.(a) For every u, v e V, </(u), u>

= <u, u> =<u, /(!;)>; hence /* = J.

(b) For every u, v e K, <0(u), u>= <0, u> = 0 =

<u, 0> = <u, 0(c)>; hence0* = 0.

14.7. SupposeT is invertible. Show that (T~ ')* = (T*)~'./ = /* =(\320\223\320\223-')\302\273=(\320\242')\302\273T*;hence (T ')* = \320\223*1.

14.8. Let T be a linear operatoron V, and let W be a \"\320\223-invariant subspace of V. Show that Wx is

invariant under T*.

Let \320\270e jyx. If w e W, then T(w) 6 W and so <w, T*(u)> = <T(w), u> = 0. Thus T*(u) e \320\230^\321\205since it is

orthogonal to every w e W. Hence WL is invariant under T*.

14.9. Let T bea linear operator on V. Show that each of the following conditions impliesT = 0:(i) <T(u), i>> = 0 for every u, v e V;

(ii) \320\232is a complex space, and <T(u),\302\253>= 0 for every ue V;

(iii) \320\223is self-adjoint and (T{u), u> = 0 for every ue V.

Give an example of an operator \320\223on a real space V for which < T{u), u> = 0 for every \320\270\320\265V but

\320\223^0. [Thus (ii) need not hold for a real space V.']

436 LINEAR OPERATORS ON INNER PRODUCT SPACES [CHAP. 14

(i) Setv = T(u). Then <T(u), T(u)>= 0 and hence T(u) = 0. for every ueV. Accordingly, T = 0.

(ii) By hypothesis, (T(v + w), v + w> = 0 for any v, w e V. Expanding and setting (T(v), u>= 0 and

<T(w), w>= 0, we find

<7M w> + <T(w), u> = 0 (/)

Note w is arbitrary in (/). Substituting iw for w, and using <7*(u), \302\273v>= i(T(v), w> = \342\200\224

i(T(v), w> and

<T(iw), t>>= </T(w), \320\270)

= K^w), \320\262>,we find

-i<T(t)), w> + i<T(w),u>= 0

Dividing through by i and adding to (/), we obtain <T(w), u> = 0 for any u, w e V. By (i), T = 0.

(iii) By (ii), the resuli holds for the complex case; hence we need only consider the real case. Expanding

(T(v + w),v + w}\342\200\2240, we again obtain (/). SinceT is self-adjoint and since it is a real space, we have

<T(w), d> =<w, T(v\\) = <T(d), w>. Substituting this into (/), we obtain <T(i>),w> = 0 for any

n, weV.By (i), T = 0.

For our example, consider the linear operator T on R2 defined by T(x, y) =(>\302\246,-x). Then

<J(u), u> = 0 for every \320\270e K, but T + 0.

ORTHOGONALAND UNITARY OPERATORS AND MATRICES

14.10. Prove Theorem 14.6.

Suppose (i) holds. Then, for every v, w e V,

Thus (i) implies (ii). Now if (ii) holds, then

\\\\V(v)\\\\=

^\320\223\320\241\320\251^\320\233\320\250=

Hence (ii) implies (iii). It remains to show that (iii) implies (i).Suppose(iii) holds. Then for every v e V,

(V*V(v), v> = <[/(!>),U(v)>=

Hence <.(U*V -/)(i>), i>> = 0 for every veV. But V*V - I is self-adjoint (Prove!); then by Problem 14.9

we have U*V - 1 = 0 and so V*U = /. Thus V* = U ' as claimed.

14.11.Let U be a unitary (orthogonal) operator on V, and let W be a subspace invariant under U.

Show that Wx is also invariant under U.

SinceI/ is nonsingular, U(W) = \320\230^;that is, for any w e W there exists w' e VF such that V(w') = w.

Now let v 6 W1. Then for any w e W,

V(W)} =<t), W'> = 0

Thus V(v) belongs to W1. Therefore W1 is invariant under U.

14.12. Prove Theorem 14.9.

Suppose{vj} is another orthonormal basis and suppose

\"i=

bii\302\253i+ bi2u2 + \302\246\342\200\242\342\200\242+ \320\254\321\213\320\270\342\200\236i=l, ...,\320\270 (/)

Since {tij} is orthonormal,

^o =<f.- \342\200\242fj>

= b.ib^ + bl2 b^ + ' \302\246\302\246+ bm b~ B)

Let \320\222=\321\204\320\270)be the matrix of coefficients in (/). (Then BT is the change-of-basis matrix from {u,} to {v,}.)

Then BB* =(\321\201\320\270)

where ci;=

bnbjx + bi2b~2 + \302\246\342\200\242\342\200\242+ fcjflfc~. By B), cj;=

\320\271,7and therefore BB* = I. Accord-

Accordingly,B, and hence BT,is unitary.

CHAP. 14] LINEAR OPERATORS ON INNER PRODUCT SPACES 437

It remains to prove that {\321\213|}is orthonormal. By Problem 6.23,

(u\\, u}>= atlau + a2

where C,- denotes the ith column of the unitary (orthogonal) matrix P =(\321\217\321\203).

Since P is unitary

(orthogonal), its columns are orthonormal; hence <u|, u'j}= <Cj; C,->

=&u. Thus {\302\253J}is an orthonormal

basis.

SYMMETRIC OPERATORS AND CANONICALFORMS IN EUCLIDEAN SPACES

14.13. Let \320\223be a symmetric operator. Show that: (a) the characteristic polynomial A(t) of \320\223is a

product of linear polynomials(over R); (b) T has a nonzero eigenvector.

(a) Let A be a matrix representing T relative to an orthonormal basis of V; then A = AT. Let A(t) be thecharacteristic polynomial of A. Viewing A as a complex self-adjoint operator, A has only real eigen-eigenvalues by Theorem 14.4. Thus

A(t)= (t -

A,Xt-

\320\2332)\342\200\242(t-A.,)

where the \320\233,are all real. In other words, A(t) is a product of linear polynomials over R.

(b) By (a), T has at least one (real) eigenvalue. Hence T has a nonzeroeigenvector.

14.14. Prove Theorem 14.11.

The proof is by induction on the dimension of V. If dim V = 1, the theorem trivially holds. Nowsuppose dim V \342\200\224n > 1. By Problem 14.13, there exists a nonzero eigenvectoru, of T. Let W be the space

spanned by u,, and let u, be a unit vector in W, e.g., let u, = v,/Hd, ||.Since u, is an eigenvector of T, the subspace PF of V is invariant under T. By Problem 14.8, PF1 is

invariant under T* = T. Thus the restriction f of T to Wx is a symmetric operator. By Theorem 6.4,V = W \302\251\320\230^1.Hence dim WL \342\200\224n \342\200\2241 since dim \320\230^= 1. By induction, there exists an orthonormal basis{u2,..., \320\270\342\200\236}of Wx consisting of eigenvectorsof t and hence of T. But <\320\272\342\200\236u,-> = 0 for i = 2,..., n because\321\211\320\265\320\230^1.Accordingly {\320\270\342\200\236\320\2702,.... un} is an orthonormal set and consists of eigenvectorsof T. Thus the

theorem is proved.

14.15. Find an orthogonal change of coordinates which diagonalizes the real quadratic form defined by

q(x, y) = 2x2 + 2xy + 2y2.

First find the symmetric matrix A representing q and then its characteristic polynomial A(t):

-G3and A(t) = | tl - A | =

The eigenvalues of A are 1 and 3; hence the diagonal form of q is

q(x\\ y1) = x'2 + 3y'2

We find the corresponding transformation of coordinates by obtaining a correspondingorthonormal set of

eigenvectors of A.

Set t = 1 in the matrix tl \342\200\224A to obtain the correspondinghomogeneoussystem

\342\200\224x\342\200\224\321\203= 0 \342\200\224x\342\200\224

\321\203= \320\236

A nonzero solution is u,= A, -1). Now set t = 3 in the matrix tl - A to find the corresponding homoge-

homogeneoussystem

x -\321\203

= 0 -x + \321\203= \320\236

A nonzero solution is v2= A, 1). As expected from Theorem 14.5, vt and v2 are orthogonal. Normalize d,

and v2 to obtain the orthonormal basis

{\320\270,=

(l/v/2,- 1/^2), u2

=

438 LINEAR OPERATORS ON INNER PRODUCT SPACES [CHAP. 14

The transition matrix P and the required transformation of coordinates follow:

pJ 1\320\233/2 ./\321\207/2\\and

(\\-\\ly/l My/2/ \\y/ \\\320\243

Note that the columns of P are Ui and \320\270\320\263.We can also express x' and / in terms of x and \321\203by usingP ' = PT;thatis,

x' = (x- \321\203\320\243\321\203\320\224/ = (x + \321\203)\321\204

14.16. Prove Theorem 14.12.

Let S = T + T'' = T + T*. Then S* = (T + T*)*= T* + T = S. Thus S is a symmetric operator onV. By Theorem 14.11, there exists an orthonormal basis of V consisting of eigenvectorsof S. If/,, ..., km

denote the distinct eigenvalues of S. then V can be decomposed into the direct sum V =^@^\302\256'\"\320\244\320\247

where the V; consists of the eigenvectors of S belonging to /,. We claim that each Vi is invariant under T.For supposev e (/\342\200\242;then S(v) =

Xt v and

S(T(v)) = (T + Tl)T(v)= T(T + T- \302\273= TS(v) =

\320\242(\320\273,.v) = I, T(v)

That is, T(v) e J/-. Hence Vt is invariant under T. Since the \320\246are orthogonal to each other, we can restrictour investigation to the way that T acts on eachindividual \320\246.

On a given \320\246,(T + T ')u = S(v)-

/.tv. Multiplying by T,

(T2-A,T + /Kf) = 0

We consider the cases \320\257,= \302\2612and \320\273(\321\204\302\2612separately. If \320\224,

= \302\2612,then (T \302\261/J(d) = 0 which leads to

(\320\223\302\261/Xf) = 0 or T{v)= \302\261v.Thus T restricted to this \320\246is either / or -/.If X{ \320\244\302\2612,then T has no eigenvectorsin \320\246since by Theorem 14.4 the only eigenvalues of T are 1 or

\342\200\2241. Accordingly, for v \320\2440 the vectors v and T(v) are linearly independent. Let W be the subspace spanned

by v and T(v). Then W is invariant under \320\223,since

T(T(v)) = T2(i>)= \320\233,7\302\273- v

By Theorem 6.4, \320\246= W@ WL. Furthermore, by Problem 14.8, W1 is also invariant under T. Thus we can

decompose Vt into the direct sum of two-dimensional subspaces Wj where theW}

are orthogonal to eachother and each Wt is invariant under T. Thus we can now restrict our investigation to the way T acts oneach individual W}.

Since T2 -\320\233,\320\242+ / = 0, the characteristic polynomial A(t) of T acting on Ws is A(i) = f2 - X{t + 1.

Thus the determinant of T is 1, the constant term in A(t). By Theorem 4.6, the matrix A representing T

acting onWj

relative to any orthonormal basisofWt

must be of the form

/cos \320\262\342\200\224sin6\\

\\sin 0 cos \320\262)

The union of the bases of the Wj gives an orthonormal basis of Vt, and the union of the bases of the V{

gives an orthonormal basis of V in which the matrix representing T is of the desired form.

NORMAL OPERATORS AND CANONICALFORMS IN UNITARY SPACES

14.17. Determine which matrix is normal: (a) A\342\200\224

I I, and (b) \320\222= I 1.

\321\201 ;\321\205-!-\321\215-u ;> xooSince /1/4* ^ i4*i4, the matrix A is not normal.

\\-i 2-/\320\2241 2 + i^ \\2-2(\" 6 )Since BB* = B*B, the matrix \320\222is normal.

CHAP. 14] LINEAR OPERATORS ON INNER PRODUCT SPACES 439

14.18.Let T be a normal operator. Prove:

(a) T(v)= 0 if and only if T*(v) = 0.

(b) \320\242- XI is normal.

(c) If T(v)= Xv, then T*(v) \342\200\224

Xv; hence any eigenvector of T is alsoan eigenvector of T*.

(d) If T{v)= Xtv and T(w) = k2w where \320\220,\320\244A2, then <t>, w> = 0; that is, eigenvectors of T

belonging to distinct eigenvalues are orthogonal.

(a) We show that <\320\223(\320\270),\320\223(\320\270)>= <\320\223*(\320\270),\320\242*{v)}:

<7\320\234 T(v)> =<\320\262,\320\223\302\273\320\223(\320\263)>

=<\321\206\320\223\320\237\321\200))=

<r\302\273(iO, \320\223\302\273>

Hence by [/3], T{v)= 0 if and only if T*(v) = 0.

(b) We show that T \342\200\224XI commutes with its adjoint:

(\320\223- kl\\T - U)* = (T -

\320\233\320\232\320\223\302\273- XI) = TT* - XT* - XT + XXI

= T*T - XT - XT* + XXI = (T* -XI\\T

- XI)

=(\320\223

- XI)*(T -XI)

Thus T \342\200\224XI is normal.

(c) If T{v)= \320\273\320\270,then (\320\223-

\320\251\320\263)= 0. Now T - XI is normal by (b); therefore, by (a), (T - //)*(i>)= 0.That is, (T* -

\320\233/Xd)= 0; hence T*[v) = Xv.

(d) We show that \320\257,<\320\270,w> = ;.2<d, w>:

;.,<!;,w>= <A,t), w> = <T(u),w> = <d, r*(w)> =

<d, a2w> =\320\2732<\320\262,w>

But \320\257,yt \320\2732; hence <d, h\">= 0.

14.19. Prove Theorem 14.13.

The proof is by induction on the dimension of V. If dim V = 1, then the theorem trivially holds. Now

suppose dim V = n > 1. Since V is a complex vector space, T has at least one eigenvalue and hence a

nonzero eigenvector d. Let W be the subspace of V spanned by v and let u, be a unit vector in W.Since v is an eigenvector of 7\", the subspace \320\230'is invariant under T. However, i; is also an eigenvector

of T* by Problem 14.18; hence W is also invariant under T*. By Problem 14.8, W1 is invariant under

T** = T. The remainder of the proof is identical with the latter part of the proof of Theorem 14.11(Problem 14.14).

14.20.Prove Theorem 14.14.

The proof is by induction on the dimension of V. If dim V = 1, then the theorem trivially holds. Now

suppose dim V = n > 1. Since V is a complexvector space,T has at least one eigenvalue and hence at leastone nonzeroeigenvectorv. Let W be the subspaceof V spanned by v and let \320\272,be a unit vector in W. Then

u, is an eigenvector of T and, say, T(ut) = e,,^.By Theorem 6.4, V = W\302\256W1. Let E denote the orthogonal projection of V into W1. Clearly W1 is

invariant under the operator ET. By induction, there exists an orthonormal basis {u2, ..., \320\270\342\200\236\\of W1 such

that, for i = 2,..., n,

ET(Ui)= ai2 u2 + ai3 u3 + \342\200\242\302\246\342\200\242+ \320\260\342\200\236ut

(Note that {\320\270,,u2,.... un} is an orthonormal basis of V.) But E is the orthogonal projection of V onto WL\\

hence we must have

ai2u2 + - \342\200\242\302\246+

for i = 2,..., n. This with T(u,) = allu1 gives us the desired result.

440 LINEAR OPERATORS ON INNER PRODUCT SPACES [CHAP. 14

MISCELLANEOUSPROBLEMS

14.21. Prove Theorem 14.10A.

Suppose (i) holds, that is, P = T2 where T = T*. Then P = TT = T*T and so (i) implies (ii). Now

suppose (ii) holds. Then P* = {S*S)*= S*S** = S*S = P and so P is self-adjoint. Furthermore,

<P(u), u> = <S*S(u), u} =<S(u), S(n)> > 0

Thus (ii) implies (iii), and so it remains to prove that (iii) implies (i).

Now suppose (iii) holds. Since P is self-adjoint, there exists an orthonormal basis {\320\270,,..., un} of V

consisting of eigenvectorsof P; say, \320\240(\321\206)=

\320\233;\320\246-.By Theorem 14.4, the \320\273,-are real. Using (iii), we show that

the A,- are nonnegative. We have, for each i,

0 < <P(u,), \321\206>=

<\320\273,-\321\206,\320\274,.>=

*,.<\302\253,.Ui>

Thus <ui( u,-> > 0 forces h, > 0, as claimed. Accordingly, 4/a^ is a real number. Let \320\223be the linear operatordefined by

T(u) = y/IiU; for i = 1, .... \320\270

Since \320\223is represented by a real diagonal matrix relative to the orthonormal basis {\321\206},\320\242is self-adjoint.

Moreover, for each i.

;Ui=\320\273,,u, =

Since \320\2232and P agree on a basisof V, P \342\200\224T2. Thus the theorem is proved.

Remark: The above operator T is the unique positive operator such that P = T2; it is

called the positive square root of P.

14.22. Show that any operator T is the sum of a self-adjoint operator and skew-adjoint operator.

Set S =$(T + T*) and V = \\{J

- T*).Then T = S+V where

S* = (i(T + T*))* = i(T* + T**)=$(T* + T) = S

and V* = (i(T -T*))*

= \\(T* -T)= -i(T-T*)= -V

i.e.,S is self-adjoint and V is skew-adjoint.

14.23. Prove: Let T be an arbitrary linear operator on a finite-dimensionalinner product space V.

Then \320\223is a product of a unitary (orthogonal) operator U and a unique positiveoperatorP, that

is, T = UP. Furthermore, if T is invertible, then U is also uniquely determined.

By Theorem 14.10, T*T is a positive operator and hence there exists a (unique) positive operator Psuch that P2 = T*T (Problem 14.43).Observethat

\320\277\320\260\320\264\320\2702= <P(d \320\260\320\264>= <p>), v> = <t*T(v), v>

= <tm \321\210>= iitwii2 (/)

We now consider separately the cases when T is invertible and non-invertible.

If T is invertible, then we set V = PT '. We show that 0 is unitary:

(\320\242*I\320\240 and 0*U =(T*)lPPT' =(T*)lT*TT ' =/

Thus I/ is unitary. We next set V = U '. Then I/ is alsounitary and T = UP as required.To prove uniqueness, we assume T = V0P0 where Uo is unitary and Po is positive. Then

T*T = P* V* Uo Po = Po /Po = P2

But the positive square root of T*T is unique (Problem 14.43) hence Po = P. (Note that the invertibility ofT is not used to prove the uniqueness of P.) Now if T is invertible, then P is also by (/). Multiplying

U0P = UP on the right by P 'yields Un = U. Thus U is also unique when T is invertible.

CHAP. 14] LINEAR OPERATORS ON INNER PRODUCT SPACES 441

Now suppose T is not invertible. Let W be the image of P, i.e., W = Im P. We define I/, : W -\302\273V by

L/,(w) = T(d) where P<d) = w B)

We must show that V, is well defined, that is, that P(v)= P{v') implies T(v) = 7V). This follows from the

fact that P(v -v')

= 0 is equivalent to \\\\P(v-

v')\\\\ = 0 which forces \\\\T[v-

v')\\\\ = 0 by (/). Thus I/, is well

defined. We next define V2 '\302\246W -* V. Note by (/) that P and T have the same kernels. Hence the images of

P and T have the same dimension, i.e., dim (Im P) = dim W = dim (Im T). Consequently, Wl and (Im TIalso have the same dimension. We let V7 be any isomorphism between WL and (Im TI.

We next set V = I/, \302\256U2- [Here I/ is denned as follows: if v e V and d = w> 4- w' where w> 6 W,w' e W\\ then 1/(\320\270)= t/,(w) + f72(w').] Now I/ is linear (Problem 14.69)and, if v e V and P(i;) = w, then

by B)

\320\223(\320\270)= l\\(w) = l/(w) =

VP(v)

Thus T = l/P as required.It remains to show that V is unitary. Now every vectorx e V can be written in the form x = P(d)+ W

where w' 6 WL. Then U(x) = t/P(i>) + U2(w') = T(d)+ t/2(w') where <\320\223(\320\270),t/^w')) = 0 by definition of V2.Also, <\320\223(\320\270),\320\223(\320\270)>

= <P(u), P(t))> by (/). Thus

<t/(x), l/(x)>= <T(v) + U2(w'), nv) + t/2(w')>

= <T(d), T(t))> + <l/2(W), t/2(w')>(d)>+ <w', w'> =

<P(t>) + w', P(d) + W)

= <x, x>

[We also used the fact that <P(f), w'> = 0.] Thus U is unitary and the theorem is proved.

14.24. Let V be the vector space of polynomialsover R with inner product defined by

</ 0> =I f\342\204\226t)dt

Jo

Give an example of a linear functional \321\204on F for which Theorem 14.3does not hold, i.e., for

which there is no polynomial h(t) such that <f>(f)= </ h} for every/e \320\232

Let \321\204: V -\302\273Rbe defined by <?(/) =/@), that is, \321\204evaluates /(t) at 0 and hence maps f(t) into its

constant term. Suppose a polynomial \320\251)exists for which

<md =/@) =I nmt) ^ </)Jo

for every polynomial/(/). Observe that \321\204maps the polynomial tf(t) into 0; hence by (/),

f\320\250)\320\233

= 0 B)Jo

for every polynomial/(/). In particular, B) must hold for/(r) =tft(t), that is,

t2h\\t) dt = 0\320\223

This integral forces /i(i) to be the zero polynomial; hence \321\204{/)= </, /i> = </, 0> = 0 for every polynomial

f(t). This contradicts the fact that \321\204is not the zero functional: hence the polynomial h(t) does not exist.

Supplementary Problems

ADJOINT OPERATORS

14.25. Find the adjoint of each of the following matrices:

442 LINEAR OPERATORS ON INNER PRODUCT SPACES [CHAP. 14

14.26. Let T :R3 - R3 be defined by T(x, y, z) = (x + 2y, 3x - 4z,y). Find T*(x, \321\203,z).

14.27. Let T: C3- C3 be defined by

T(x, >; z) =(ix + B + 3i)>; 3x + C -

i)z, B- 5i> + <z)

Find T*(x, \321\203,z).

14.28. For each of the following linear functions \321\204on V find a vector ueC such that \321\204(\320\263)= <t\\ u> for every

veV:

(a) \321\204: R3 -> R defined by \321\204(\321\205,\321\203,z) = x + 2y- 3z.

(b) \321\204:\320\2413->\320\241defined by \321\204(\321\205,\321\203,z) = ix + B + 3i)>' + A - 2i)z.

(c) \321\204: V -> R defined by <?(/) =/A) where V is the vector space of Problem 14.24.

14.29.Suppose V has finite dimension. Prove that the image of \320\223*is the orthogonal complement of the kernel of

T, i.e., Im T* =(Ker T)\\ Hence rank T = rank T*.

1430. Show that T*T = 0 implies T = 0.

14.31. Let \320\232be the vector space of polynomials over R with inner product defined by </, g} = Jo f(t)g(t) dt. Let

D be the derivative operator on V, i.e., D(/) =df/dt. Show that there is no operator D* on V such that

<D(/), \320\264\320\243= </, D*@\302\273for every/, jeK That is, D has no adjoint.

UNITARY AND ORTHOGONAL OPERATORS AND MATRICES

14.32. Find a unitary (orthogonal) matrix whose first row is:

(a) (2/√13, 3/√13), (b) a multiple of (1, 1 − i), (c) (½, ½i, ½ − ½i)

14.33. Prove: The product and inverses of orthogonal matrices are orthogonal. (Thus the orthogonal matrices form a group under multiplication called the orthogonal group.)

14.34. Prove: The product and inverses of unitary matrices are unitary. (Thus the unitary matrices form a group

under multiplication called the unitary group.)

14.35. Show that if an orthogonal (unitary) matrix is triangular, then it is diagonal.

14.36. Recall that the complex matrices A and B are unitarily equivalent if there exists a unitary matrix P such that B = P*AP. Show that this relation is an equivalence relation.

14.37. Recall that the real matrices A and B are orthogonally equivalent if there exists an orthogonal matrix P such that B = PᵀAP. Show that this relation is an equivalence relation.

14.38. Let W be a subspace of V. For any v ∈ V, let v = w + w′ where w ∈ W, w′ ∈ W⊥. (Such a sum is unique because V = W ⊕ W⊥.) Let T : V → V be defined by T(v) = w − w′. Show that T is a self-adjoint unitary operator on V.

14.39. Let V be an inner product space, and suppose U : V → V (not assumed linear) is surjective (onto) and preserves inner products, i.e., ⟨U(v), U(w)⟩ = ⟨v, w⟩ for every v, w ∈ V. Prove that U is linear and hence unitary.


POSITIVE AND POSITIVE DEFINITE OPERATORS

14.40. Show that the sum of two positive (positive definite) operators is positive (positive definite).

14.41. Let T be a linear operator on V and let f : V × V → K be defined by f(u, v) = ⟨T(u), v⟩. Show that f is itself an inner product on V if and only if T is positive definite.

14.42. Suppose E is an orthogonal projection onto some subspace W of V. Prove that kI + E is positive (positive definite) if k ≥ 0 (k > 0).

14.43. Consider the operator T defined by T(uᵢ) = √λᵢ uᵢ, i = 1, ..., n, in the proof of Theorem 14.10A. Show that T is positive and that it is the only positive operator for which T² = P.

14.44. Suppose P is both positive and unitary. Prove that P = I.

14.45. Determine which of the following matrices are positive (positive definite):

(i)-(vi) [the six matrices are illegible in this reproduction]

14.46. Prove that a 2 × 2 complex matrix A = [[a, b], [c, d]] is positive if and only if (i) A = A*, and (ii) a, d, and ad − bc are nonnegative real numbers.

14.47. Prove that a diagonal matrix A is positive (positive definite) if and only if every diagonal entry is a nonnegative (positive) real number.

SELF-ADJOINT AND SYMMETRIC OPERATORS

14.48. For any operator T, show that T + T* is self-adjoint and T − T* is skew-adjoint.

14.49. Suppose T is self-adjoint. Show that T²(v) = 0 implies T(v) = 0. Use this to prove that Tⁿ(v) = 0 also implies T(v) = 0 for n > 0.

14.50. Let V be a complex inner product space. Suppose ⟨T(v), v⟩ is real for every v ∈ V. Show that T is self-adjoint.

14.51. Suppose T₁ and T₂ are self-adjoint. Show that T₁T₂ is self-adjoint if and only if T₁ and T₂ commute, i.e., T₁T₂ = T₂T₁.

14.52. For each of the following symmetric matrices A, find an orthogonal matrix P for which PᵀAP is diagonal. [The matrices are illegible in this reproduction.]

14.53. Find an orthogonal transformation of coordinates which diagonalizes each of the following quadratic forms: (a) q(x, y) = 2x² − 6xy + 10y², (b) q(x, y) = x² + 8xy − 5y²

14.54. Find an orthogonal transformation of coordinates which diagonalizes the quadratic form

    q(x, y, z) = 2xy + 2xz + 2yz


NORMAL OPERATORS AND MATRICES

14.55. Verify that the matrix A [illegible in this reproduction] is normal. Find a unitary matrix P such that P*AP is diagonal, and find P*AP.

14.56. Show that a triangular matrix is normal if and only if it is diagonal.

14.57. Prove that if T is normal on V, then ‖T(v)‖ = ‖T*(v)‖ for every v ∈ V. Prove that the converse holds in complex inner product spaces.

14.58. Show that self-adjoint, skew-adjoint and unitary (orthogonal) operators are normal.

14.59. Suppose T is normal. Prove that:

(a) T is self-adjoint if and only if its eigenvalues are real.
(b) T is unitary if and only if its eigenvalues have absolute value 1.
(c) T is positive if and only if its eigenvalues are nonnegative real numbers.

14.60. Show that if T is normal, then T and T* have the same kernel and the same image.

14.61. Suppose T₁ and T₂ are normal and commute. Show that T₁ + T₂ and T₁T₂ are also normal.

14.62. Suppose T₁ is normal and commutes with T₂. Show that T₁ also commutes with T₂*.

14.63. Prove: Let T₁ and T₂ be normal operators on a complex finite-dimensional vector space V. Then there exists an orthonormal basis of V consisting of eigenvectors of both T₁ and T₂. (That is, T₁ and T₂ can be simultaneously diagonalized.)

INNER PRODUCTSPACE ISOMORPHISM PROBLEMS

14.64. Let S = {u₁, ..., uₙ} be an orthonormal basis of an inner product space V over K. Show that the map v ↦ [v]_S is an (inner product space) isomorphism between V and Kⁿ. (Here [v]_S denotes the coordinate vector of v in the basis S.)

14.65. Show that inner product spaces V and W over K are isomorphic if and only if V and W have the same dimension.

14.66. Suppose {u₁, ..., uₙ} and {u′₁, ..., u′ₙ} are orthonormal bases of V and W, respectively. Let T : V → W be the linear map defined by T(uᵢ) = u′ᵢ for each i. Show that T is an isomorphism.

14.67. Let V be an inner product space. Recall (page 426) that each u ∈ V determines a linear functional û in the dual space V* by the definition û(v) = ⟨v, u⟩ for every v ∈ V. Show that the map u ↦ û is linear and nonsingular, and hence an isomorphism from V onto V*.

MISCELLANEOUS PROBLEMS

14.68. Show that there exists an orthonormal basis {u₁, ..., uₙ} of V consisting of eigenvectors of T if and only if there exist orthogonal projections E₁, ..., Eᵣ and scalars λ₁, ..., λᵣ such that:

(i) T = λ₁E₁ + ··· + λᵣEᵣ; (ii) E₁ + ··· + Eᵣ = I; (iii) EᵢEⱼ = 0 for i ≠ j.

14.69. Suppose V = U ⊕ W and suppose T₁ : U → V and T₂ : W → V are linear. Show that T = T₁ ⊕ T₂ is also linear. [Here T is defined as follows: if v ∈ V and v = u + w where u ∈ U, w ∈ W, then T(v) = T₁(u) + T₂(w).]


14.70. Suppose U is an orthogonal operator on R³ with positive determinant. Show that U is a rotation about some axis (possibly the identity); a reflection through a plane has determinant −1 and is thereby excluded.

Answers to Supplementary Problems

14.26. T*(x, y, z) = (x + 3y, 2x + z, −4y)

14.27. T*(x, y, z) = (−ix + 3y, (2 − 3i)x + (2 + 5i)z, (3 + i)y − iz)

14.28. (a) u = (1, 2, −3), (b) u = (−i, 2 − 3i, 1 + 2i), (c) [illegible in this reproduction]

14.45. Only (i) and (v) are positive. Moreover, (v) is positive definite.

14.52. [The answer matrices are illegible in this reproduction.]

14.53. (a) x = (3x′ − y′)/√10, y = (x′ + 3y′)/√10, (b) x = (2x′ − y′)/√5, y = (x′ + 2y′)/√5

14.54. [The answer is illegible in this reproduction.]

Appendix

Polynomials Over a Field

A.1 INTRODUCTION

The ring K[t] of polynomials over a field K has many properties analogous to properties of the integers. These play an important role in obtaining canonical forms for a linear operator T on a vector space V over K.

Any polynomial in K[t] may be written in the form

    f = aₛtˢ + ··· + a₁t + a₀

The entry aₖ is called the kth coefficient of f. If n is the largest integer for which aₙ ≠ 0, then we say that the degree of f is n, written

    deg f = n

We also call aₙ the leading coefficient of f, and if aₙ = 1 we call f a monic polynomial. On the other hand, if every coefficient of f is 0, then f is called the zero polynomial, written f = 0. The degree of the zero polynomial is not defined.
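For computational experiments with the notions in this appendix, a polynomial can be modeled by its list of coefficients. The following minimal sketch (an illustration, not part of the outline; it fixes K = Q and uses Python's exact Fraction arithmetic) encodes the degree convention just stated:

    from fractions import Fraction

    # A polynomial a_0 + a_1 t + ... + a_s t^s over Q is stored as the
    # coefficient list [a_0, a_1, ..., a_s].
    def degree(f):
        """Largest n with a_n != 0, or None for the zero polynomial
        (whose degree is not defined)."""
        for n in range(len(f) - 1, -1, -1):
            if f[n] != 0:
                return n
        return None

    f = [Fraction(3), Fraction(0), Fraction(1)]   # f(t) = t^2 + 3
    print(degree(f), f[degree(f)])                # degree 2, leading coefficient 1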

A.2 DIVISIBILITY; GREATEST COMMON DIVISOR

The following theorem formalizes the process known as "long division."

Theorem A.1 (Euclidean Algorithm): Let f and g be polynomials over a field K with g ≠ 0. Then there exist polynomials q and r such that

    f = qg + r

where either r = 0 or deg r < deg g.

Proof: If f = 0 or if deg f < deg g, then we have the required representation

    f = 0g + f

Now suppose deg f ≥ deg g, say

    f = aₙtⁿ + ··· + a₁t + a₀  and  g = bₘtᵐ + ··· + b₁t + b₀

where aₙ, bₘ ≠ 0 and n ≥ m. We form the polynomial

    f₁ = f − (aₙ/bₘ)tⁿ⁻ᵐg    (1)

Then deg f₁ < deg f. By induction, there exist polynomials q₁ and r such that

    f₁ = q₁g + r

where either r = 0 or deg r < deg g. Substituting this into (1) and solving for f,

    f = (q₁ + (aₙ/bₘ)tⁿ⁻ᵐ)g + r

which is the desired representation.
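The proof is constructive: it repeatedly strips the leading term of the dividend, exactly as in long division. A minimal sketch of this loop (reusing the coefficient-list representation and the degree helper from the sketch in Section A.1):

    def poly_divmod(f, g):
        """Return (q, r) with f = q*g + r and r = 0 or deg r < deg g."""
        r = f[:]
        q = [Fraction(0)] * max(len(f) - len(g) + 1, 1)
        m = degree(g)                        # g != 0 is assumed
        while degree(r) is not None and degree(r) >= m:
            n = degree(r)
            c = r[n] / g[m]                  # the factor a_n / b_m
            q[n - m] += c                    # contributes c*t^(n-m) to q
            for i in range(m + 1):           # r <- r - c * t^(n-m) * g
                r[n - m + i] -= c * g[i]
        return q, r

    # (t^3 - 1) = (t^2 + t + 1)(t - 1) + 0
    q, r = poly_divmod([Fraction(-1), 0, 0, Fraction(1)],
                       [Fraction(-1), Fraction(1)])
    print(q, r)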



Theorem A.2: The ring K[t] of polynomials over a field K is a principal ideal ring. If I is an ideal in K[t], then there exists a unique monic polynomial d which generates I, that is, such that d divides every polynomial f ∈ I.

Proof: Let d be a polynomial of lowest degree in I. Since we can multiply d by a nonzero scalar and still remain in I, we can assume without loss of generality that d is a monic polynomial. Now suppose f ∈ I. By Theorem A.1 there exist polynomials q and r such that

    f = qd + r    where either r = 0 or deg r < deg d

Now f ∈ I implies qd ∈ I and hence r = f − qd ∈ I. But d is a polynomial of lowest degree in I. Accordingly, r = 0 and f = qd, that is, d divides f. It remains to show that d is unique. If d′ is another monic polynomial which generates I, then d divides d′ and d′ divides d. This implies that d = d′, because d and d′ are monic. Thus the theorem is proved.

Theorem A.3: Let f and g be nonzero polynomials in K[t]. Then there exists a unique monic polynomial d such that: (i) d divides f and g; and (ii) if d′ divides f and g, then d′ divides d.

Definition: The above polynomial d is called the greatest common divisor of f and g. If d = 1, then f and g are said to be relatively prime.

Proof of Theorem A.3: The set I = {mf + ng : m, n ∈ K[t]} is an ideal. Let d be the monic polynomial which generates I. Note f, g ∈ I; hence d divides f and g. Now suppose d′ divides f and g. Let J be the ideal generated by d′. Then f, g ∈ J and hence I ⊆ J. Accordingly, d ∈ J and so d′ divides d as claimed. It remains to show that d is unique. If d₁ is another (monic) greatest common divisor of f and g, then d divides d₁ and d₁ divides d. This implies that d = d₁, because d and d₁ are monic. Thus the theorem is proved.

Corollary A.4: Let d be the greatest common divisor of the polynomials f and g. Then there exist polynomials m and n such that d = mf + ng. In particular, if f and g are relatively prime, then there exist polynomials m and n such that mf + ng = 1.

The corollary follows directly from the fact that d generates the ideal

    I = {mf + ng : m, n ∈ K[t]}
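As with the integers, d together with the multipliers m and n can be computed by the extended Euclidean algorithm, which repeatedly applies Theorem A.1. A minimal sketch (building on poly_divmod and degree from the earlier sketches):

    def is_zero(f):
        return all(c == 0 for c in f)

    def poly_mul(f, g):
        out = [Fraction(0)] * (len(f) + len(g) - 1)
        for i, a in enumerate(f):
            for j, b in enumerate(g):
                out[i + j] += a * b
        return out

    def poly_sub(f, g):
        n = max(len(f), len(g))
        f = list(f) + [Fraction(0)] * (n - len(f))
        g = list(g) + [Fraction(0)] * (n - len(g))
        return [a - b for a, b in zip(f, g)]

    def poly_gcd(f, g):
        """Extended Euclid: return (d, m, n) with d monic and d = m*f + n*g."""
        r0, m0, n0 = f, [Fraction(1)], [Fraction(0)]
        r1, m1, n1 = g, [Fraction(0)], [Fraction(1)]
        while not is_zero(r1):               # invariant: ri = mi*f + ni*g
            q, r = poly_divmod(r0, r1)
            r0, r1 = r1, r
            m0, m1 = m1, poly_sub(m0, poly_mul(q, m1))
            n0, n1 = n1, poly_sub(n0, poly_mul(q, n1))
        lc = r0[degree(r0)]                  # normalize d to be monic
        return ([c / lc for c in r0], [c / lc for c in m0], [c / lc for c in n0])

    # gcd(t^2 - 1, t^2 - 3t + 2) = t - 1
    d, m, n = poly_gcd([Fraction(-1), 0, Fraction(1)],
                       [Fraction(2), Fraction(-3), Fraction(1)])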

A.3 FACTORIZATION

A polynomial p ∈ K[t] of positive degree is said to be irreducible if p = fg implies f or g is a scalar.

Lemma A.5: Suppose p ∈ K[t] is irreducible. If p divides the product fg of polynomials f, g ∈ K[t], then p divides f or p divides g. More generally, if p divides the product of n polynomials f₁f₂ ··· fₙ, then p divides one of them.

Proof: Suppose p divides fg but not f. Since p is irreducible, the polynomials f and p must then be relatively prime. Thus there exist polynomials m, n ∈ K[t] such that mf + np = 1. Multiplying this equation by g, we obtain mfg + npg = g. But p divides fg and so mfg, and p divides npg; hence p divides the sum g = mfg + npg.

Now suppose p divides f₁f₂ ··· fₙ. If p divides f₁, then we are through. If not, then by the above result p divides the product f₂ ··· fₙ. By induction on n, p divides one of the polynomials f₂, ..., fₙ. Thus the lemma is proved.


Theorem A.6 (Unique Factorization Theorem): Let f be a nonzero polynomial in K[t]. Then f can be written uniquely (except for order) as a product

    f = kp₁p₂ ··· pₙ

where k ∈ K and the pᵢ are monic irreducible polynomials in K[t].

Proof: We prove the existence of such a product first. If f is irreducible or if f ∈ K, then such a product clearly exists. On the other hand, suppose f = gh where g and h are nonscalars. Then g and h have degrees less than that of f. By induction, we can assume

    g = k₁g₁g₂ ··· gᵣ  and  h = k₂h₁h₂ ··· hₛ

where k₁, k₂ ∈ K and the gᵢ and hⱼ are monic irreducible polynomials. Accordingly,

    f = (k₁k₂)g₁g₂ ··· gᵣh₁h₂ ··· hₛ

is our desired representation.

We next prove uniqueness (except for order) of such a product for f. Suppose

    f = kp₁p₂ ··· pₙ = k′q₁q₂ ··· qₘ

where k, k′ ∈ K and the p₁, ..., pₙ, q₁, ..., qₘ are monic irreducible polynomials. Now p₁ divides k′q₁ ··· qₘ. Since p₁ is irreducible, it must divide one of the qᵢ by Lemma A.5. Say p₁ divides q₁. Since p₁ and q₁ are both irreducible and monic, p₁ = q₁. Accordingly,

    kp₂ ··· pₙ = k′q₂ ··· qₘ

By induction, we have that n = m and p₂ = q₂, ..., pₙ = qₘ for some rearrangement of the qᵢ. We also have that k = k′. Thus the theorem is proved.

If the field K is the complex field C, then we have the following result.

Theorem A.7 (Fundamental Theorem of Algebra): Let f(t) be a nonzero polynomial over the complex field C. Then f(t) can be written uniquely (except for order) as a product

    f(t) = k(t − r₁)(t − r₂) ··· (t − rₙ)

where k, rᵢ ∈ C, i.e., as a product of linear polynomials.

In the case of the real field R we have the following result.

Theorem A.8: Let f(t) be a nonzero polynomial over the real field R. Then f(t) can be written uniquely (except for order) as a product

    f(t) = kp₁(t)p₂(t) ··· pₘ(t)

where k ∈ R and the pᵢ(t) are monic irreducible polynomials of degree one or two.
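A numerical illustration of Theorems A.7 and A.8 (a sketch assuming NumPy): the roots of f(t) give its linear factors over C, and each conjugate pair of non-real roots r, r̄ combines into the real irreducible quadratic t² − 2 Re(r)t + |r|²:

    import numpy as np

    # f(t) = t^4 - 1; np.roots takes coefficients from highest degree down.
    roots = np.roots([1, 0, 0, 0, -1])
    print(roots)                        # 1, -1, i, -i: the linear factors over C

    # Over R, the pair i, -i yields one monic irreducible quadratic factor:
    r = roots[np.argmax(roots.imag)]    # pick the root with Im r > 0
    print([1.0, -2 * r.real, abs(r) ** 2])   # [1, 0, 1], i.e. t^2 + 1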

Index

A(V), 322
Absolute value, 52
Adjoint, classical, 253-254
Adjoint operator, 425-426, 433-436
  matrix representation, 426
Algebra of linear maps, 322
Algebraic multiplicity, 285
Algorithm:
  basis-finding, 154
  determinant, 253
  diagonalization under:
    congruence, 102-104
    similarity, 102, 286
  elimination (Gaussian), 8, 24-25, 100, 261
  Euclidean, 446
  Gauss-Jordan, 29
  inverse matrix, 100
  orthogonal diagonalization, 289
Alternating bilinear form, 411
Angle, 45, 206
Annihilators, 399-400, 403-404
Arbitrary permutations, 249
Augmented matrix, 78
B(V), 409
Back-substitution, 9
Basis:
  dual, 398, 400-403
  general solution, 19-20
  orthogonal, 208
  orthonormal, 209
  second dual, 399
  usual, 150
  vector space, 150, 169-171
Basis-finding algorithm, 154
Bessel inequality, 212
Bijective mapping, 314
Bilinear form, 409, 414-416
  alternating, 411
  matrix representation of, 410
  polar form of, 412
  rank of, 411
  real symmetric, 412-413
  symmetric, 411-412
Block matrix, 79-80, 86
  determinant of, 256-257
  diagonal, 98, 291
  Jordan, 374
  square, 98, 123-124
  triangular, 98

Cⁿ, 52-53, 72
Cancellation law, 142
Canonical form, 14, 352
  Jordan, 373-374, 381-385
  rational, 375-376, 388-390
  row, 14, 16, 28-30
Cauchy-Schwarz inequality, 45, 58, 205-206, 218
Cayley-Hamilton theorem, 282, 302-304, 307
Cells, 79
Change of basis, 159-162, 187-191
Change-of-basis matrix, 160, 348, 357-359
Change-of-variable matrix, 105-106
Characteristic polynomial, 281-284, 349
Classical adjoint, 253-254
Codomain, 312
Coefficient, 1
  Fourier, 211
Coefficient matrix, 17, 78
Cofactor, 252
Column, 13, 74
  operations, 100-102
  rank, 152
  space, 146
  vector, 40, 75
Commuting matrices, 90
Companion matrix, 291, 375
Complement, orthogonal, 207-208, 226
Complex:
  conjugate, 51
  inner product, 216-218
  matrix, 96, 121-122
  n-space, 52
  numbers, 51
Component, 40, 211
Composition of mappings, 313
Congruent:
  diagonalization, 102-103
  matrices, 102, 410
  symmetric matrices, 102-103, 124-125
Conjugate, complex, 51
Conjugate matrix, 96
Consistent systems, 13
Convex set, 342, 407
Coordinate vector, 157, 186-187
Coordinates, 40, 157-159
Cosets, 376
Cramer's rule, 255, 263-264, 268
Cross product, 49-50, 65-67
Curves, 48
Cyclic subspaces, 374-375, 388-389
Degenerate linear equations, 2, 3
Degree of polynomial, 446


Dependence, linear, 147, 166-169
Determinant, 246
  block matrix, 256-257
  computation of, 253, 260-262
  linear equations, 254-255
  linear operators, 349
  multilinearity, 257-258
  order n, 250
  order 3, 248
  properties, 251-252
  volume, 257
Diagonal (of a matrix), 90
Diagonal matrix, 93-94
  block, 98
Diagonalization, 349-350, 360-363
  algorithm, 102-104, 286-288
Diagonalizable matrices, 280, 288-289, 298-301
Dimension of solution spaces, 19
Dimension of vector spaces, 150-151
  subspaces, 151, 171-172
Direct sum, 156, 181-186
  decomposition, 371, 379-380
Directed line segment, 46
Distance, 45
Domain, 312
Dot product, 43-46, 52, 56-57
Dual:
  basis, 398
  space, 397

Echelon:
  form, 10-11, 14, 23-24
  matrices, 14
Eigenspace, 284, 350
Eigenvalue, 284-285, 292-298, 350
  computing, 286-287
Eigenvector, 284-286, 292-298, 350
  computing, 286-287
Elementary operations, 7-8
  column, 100-102
  row, 14, 98-100
Elementary divisors, 376
Elementary matrix, 99, 101, 117-118
Elimination algorithm, 6
Elimination, Gaussian, 8, 24-26, 100, 261
Equal:
  matrices, 74
  vectors, 40
Equations (see Linear equations)
Equivalence:
  matrix, 100, 102
  row, 14, 16
Equivalent systems, 7-8
Euclidean:
  algorithm, 446
  space, 203
Evaluation map, 92
F(X), 143
Factorization:
  polynomial, 448
  LU, 109-111, 130-133
Field, 74
Finite dimension, 150-151
Fourier coefficient, 211
Free variable, 3, 10
Function:
  square matrix, 89-90
  spaces, 143
Functional, linear, 397
Gauss-Jordan algorithm, 29
Gaussian elimination, 8, 24-26, 100, 261
General solution, 1, 7
Geometric multiplicity, 285
Gram-Schmidt orthogonalization, 213
Graph, 4
Greatest common divisor, 447

Hermitian:
  form, 413-414
  matrix, 97
  quadratic form, 414
Hilbert space, 205
Homogeneous systems, 18-20, 32-34, 399-400
Hyperplane, 46, 407
ijk notation, 49
Im f, 316
Im z, 51
Identity:
  mapping, 314
  matrix, 91
Image of linear mapping, 316-318, 328-331
Imaginary part, 51
Inconsistent systems, 13
Independence, linear, 147-150
Infinite dimension, 150
Infinity-norm, 219
Injective mapping, 314
Inner product, 202, 221-226
  complex, 216-218
  usual, 203, 214-216, 218


Inner product spaces, 202-205
  linear operators on, 425
Invariance, 370-371
Invariant subspaces, 370-371, 377-378
Inverse matrix, 92
  computing algorithm, 100
Invertible matrices, 92-93, 115-117
  linear substitutions, 105
  operators, 323
Isomorphic vector spaces, 316
Isomorphism, 158-159, 316, 320

Jordan canonical form, 373-374
Jordan block, 374
Kⁿ, 142
Ker f, 316
Kernel of a linear map, 316-320, 328-332
Kronecker delta, 91
Laplace expansion, 252
Law of Inertia, 102, 413
Leading nonzero entry, 13
Leading unknown, 3
Length, 44, 203
Line, 47
Linear:
  combination, 19, 41-42, 145-147, 149, 165-166
  dependence, 42-43, 147-149, 166-169
  functional, 397, 426-427
  independence, 43, 147-149
  span, 145
Linear equation, 1
  degenerate, 2-3
  one unknown, 2
Linear equations (system), 1, 21, 152-153, 319-320, 323-324, 399-400
  consistent, 13
  echelon form, 10, 23-24
  two unknowns, 4-6
  triangular form, 9-10, 23
Linear mapping, 314-316, 319, 326-328
  image, 316-319, 328-331
  kernel, 316-320, 328-332
  matrix representation, 351
  nullity, 318-319
  rank, 318-319
  transpose, 400
Linear operator, 322
  adjoint, 425
  characteristic polynomial, 349
  determinant, 349
  inner product spaces, 425
  invertible, 323
  matrix representation, 344
  nilpotent, 373
Located vectors, 46
LU factorization, 109-111, 130-133

Mappings (maps), 312-314, 324-326
  composition of, 313
  linear, 314-316, 326-328
Matrices, 13-14, 74-75
  augmented, 78
  block, 79-80, 86
  change-of-basis, 160, 348, 430
  change-of-variable, 105-106
  coefficient, 17, 78
  companion, 375
  complex, 96-97
  diagonal, 93-94
  diagonalizable, 280, 298
  echelon, 14
  equivalent, 102
  invertible, 92-93
  nonsingular, 92-93
  normal, 97
  similar, 357-359
  square, 89-90
  triangular, 94
Matrix representation:
  adjoint operator, 426
  bilinear form, 410-411
  linear maps, 344, 351, 363-365
  quadratic form, 104
Matrix space, 142
Minkowski's inequality, 45
Minimum polynomial, 289-291, 301-303, 351
Minor, 252, 255-256
  principal, 256
Monic polynomial, 282, 446
Multiplication of matrices, 76-77
Multilinearity, 257-258
Multiplicity, 285
n-space, 40
Nilpotent, 373
Nondegenerate linear equations, 3
Nonnegative semidefinite, 413
Nonsingular:
  linear maps, 320
  matrices, 92-93


Norm, 44, 57-59, 203, 219-220, 238
Normal:
  matrix, 97
  operator, 428, 432
Normal vector, 46
Normalizing, 44
Normed vector spaces, 219-221
Nullity, 318-319
One-norm, 219-220
One-to-one:
  correspondence, 314
  mapping, 314
Onto mapping, 314
Operations on linear maps, 320-321
Operators (see Linear operators)
Orthogonal:
  basis, 209
  complement, 207-208, 226
  matrix, 95, 216, 429-430
  operator, 427-429, 431-432
  projections, 433
  sets, 208-209, 212
Orthogonality, 43, 226-229, 280
Orthogonalization (Gram-Schmidt), 213
Orthonormal basis, 209
Outer product, 49
P(t), 143
Parallelogram law, 39
Parameters, 4, 10, 47
Perpendicular, 43
Permutations, 249, 269-272
Pivot entries, 16
Pivoting (row reduction), 27
Planes, 61-65
Point, 40, 52
Polar form, 412
Polynomial, 446
  characteristic, 281-284
  degree, 446-447
  Legendre, 214
  minimum, 289-291, 301-303, 351
  monic, 446
Positive definite:
  matrices, 108, 127-128, 214-216
  operators, 427
Positive operators, 427
Powers of matrices, 91
Primary decomposition, 372
Principal minor, 256
Product:
  cross, 49-50
  dot, 43-44, 52
  inner, 202-205, 221-226
Projections, 45-46, 211-212, 315, 338, 390-391, 397, 433
  orthogonal, 433
Pythagorean theorem, 209

Quadratic forms, 104-108, 124-127, 289, 298, 300-301, 412
  diagonalizing, 104
  Hermitian, 414
  matrix representation, 104
Quotient spaces, 376-377, 385-387
Rⁿ, 40-41, 53-54
Re z, 51
Rank, 102, 107, 152
Rational canonical form, 375-376
Real part, 51
Real symmetric bilinear form, 412-413
Reduction algorithm:
  linear equations, 11-12
  determinants, 252-253
Ring of polynomials, 446
Row, 13, 74
  canonical form, 14, 16, 28-30
  equivalence, 14, 98
  operations, 14, 98
  rank, 152
  reducing, 14, 26-27
  space, 146-147
  vector, 75
Scalar, 40, 141
  matrix, 91
  product, 43
Schwarz (Cauchy-Schwarz) inequality, 45, 58
Second dual space, 399
Self-adjoint operator, 427
Signature, 102, 107
Similarity:
  matrices, 109, 128-130, 357-359
  operators, 349
Singular, 320, 332
Size of a matrix, 13, 74
Skew-adjoint operator, 427
Skew-Hermitian, 97
Skew-symmetric, 94, 411


Solutions, 1, 21-23
  general, 1, 7
  simultaneous, 5
Span, 145
Spanning sets, 145
Spatial vectors, 39, 49, 62-65
Spectral theorem, 433
Square matrix, 89-90, 111
  block, 98, 123-124
Square root of a matrix, 285
Standard form, 7
Sum of vector spaces, 155-156
Surjective map, 314
Sylvester's theorem, 413
Symmetric bilinear form, 411-412, 416-418
Symmetric matrices, 94-95
  congruent, 102, 124
Systems of linear equations (see Linear equations)
Tangent vector, 48
Trace, 90, 349
Transition matrix, 160, 348
Transpose:
  matrix, 77, 85
  linear mapping, 400
Triangle inequality, 205
Triangular form:
  linear equations, 9, 23
  linear operators, 369, 385, 387-388
Triangular matrix, 94
  block, 98
Two-norm, 219
Unit vector, 44, 206
Unitary:
  matrix, 97, 430
  operator, 427
Unknown, leading, 3
Usual:
  basis, 150
  inner product, 203, 218
V*, 397
V**, 399
Variable, free, 3
Vector, 39, 40
  addition, 41, 52
  column, 75
  coordinate, 157-159, 186-187
  located, 46
  normal, 46
  row, 75
  scalar multiplication, 41, 52
  spatial, 39, 49, 62-65
  tangent, 48
  unit, 44, 206
  zero, 41
Vector space, 141-142, 162-163
  basis, 150, 169-170
  dimension, 150-151, 169-170
  normed, 219
  sums, 155-156
Zero:
  mapping, 315
  matrix, 76
  polynomial, 441, 446
  vector, 41