Chapter 2: Linear Transformations

Upload: jason-tan

Post on 06-Apr-2018


  • 8/2/2019 Chapter 2 Linear Transformations

    1/24

    LECTURE 8

    Linear maps

In this lecture, we introduce the idea of linear maps, also known as linear mappings or linear transformations, with examples. The aim of our study of linear maps is two-fold: to understand linear maps in R, R^2 and R^3, and to bring this understanding to bear on more complex examples.

    1. Defining linear maps

    Linear maps are (mathematical abstractions of) very common types of function.

Exercise 8.1. Consider the function R : R^2 → R^2 that rotates a vector v anti-clockwise around the origin through an angle θ to give a vector Rv. Let v, w ∈ R^2 and λ ∈ R. Find R(v + w) in terms of Rv and Rw, and find R(λv) in terms of Rv.

    Answer. When we rotate a line, we get a line, and when we rotate parallel lines, we

    get parallel lines. So when we rotate a parallelogram, we get a parallelogram.

Consider the parallelogram with vertices 0, v, w and v + w. When we rotate this parallelogram about the origin through the angle θ, we obtain a new parallelogram, with vertices R0, Rv, Rw and R(v + w). Note that R0 = 0.

    By the parallelogram law for adding vectors, the fourth vertex of the parallelogram

    with vertices 0, Rv and Rw is Rv + Rw. Hence

    R(v + w) = Rv + Rw.

[Figure 8.1. Rotating sums of vectors]


Next, if λ > 0, then multiplication by λ is a dilation about the origin. We get the same result when we dilate first, then rotate, as when we rotate, then dilate. This means that

R(λv) = λ(Rv).

If we consider reflections in the origin as well, which correspond to multiplication by −1, then we get this formula for all scalars λ.

Exercise 8.2. The simple harmonic oscillator is the quantum mechanical version of a mass oscillating in simple harmonic motion on a spring. The operator H is a function from functions to functions. Given a function f : R → R, we define the new function Hf : R → R by the rule

(Hf)(x) = −(d^2/dx^2) f(x) + x^2 f(x)   for all x ∈ R.

For instance, if f(x) = e^(−x^2/2), then Hf(x) = e^(−x^2/2). Find H(f + g) in terms of Hf and Hg, and find H(λf) in terms of Hf.

Answer. It is easy to see that

H(f + g)(x) = −(d^2/dx^2)(f + g)(x) + x^2 (f + g)(x)
= (−(d^2/dx^2) f(x) + x^2 f(x)) + (−(d^2/dx^2) g(x) + x^2 g(x))
= Hf(x) + Hg(x),

that is, H(f + g) = Hf + Hg.

Similarly, H(λf) = λHf.
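Both identities can be checked symbolically. Here is a small sketch (not part of the original notes) using sympy, with H defined as above; the second function g is an arbitrary choice:

```python
import sympy as sp

x, lam = sp.symbols('x lam')

def H(f):
    # (Hf)(x) = -f''(x) + x^2 f(x): the oscillator operator defined above
    return -sp.diff(f, x, 2) + x**2 * f

f = sp.exp(-x**2 / 2)   # the function from the exercise
g = sp.sin(x)           # an arbitrary second function

additive = sp.simplify(H(f + g) - (H(f) + H(g))) == 0
homogeneous = sp.simplify(H(lam * f) - lam * H(f)) == 0
eigen = sp.simplify(H(f) - f) == 0   # Hf = f for this particular f
```

The last line also confirms the example in the exercise: e^(−x^2/2) is sent to itself by H.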

The functions R and H in these two examples enjoy the same algebraic properties: they respect the basic vector operations of addition and scalar multiplication. This motivates the following definition.

Definition 8.3. Let V and W be vector spaces. A linear transformation (or mapping or map) from V to W is a function T : V → W such that

T(v + w) = Tv + Tw
T(λv) = λT(v)

for all vectors v and w and scalars λ.


    Let us investigate the linear maps from R to R.

    Exercise 8.4. Which of the following functions are linear?

f(x) = 3x + 1    g(x) = −x
h(x) = e^x + 2    k(x) = sin(x).

Answer. First, observe that f(x + y) = 3(x + y) + 1 = 3x + 3y + 1, while f(x) + f(y) = (3x + 1) + (3y + 1) = 3x + 3y + 2. These expressions are different, so f(x + y) ≠ f(x) + f(y), and f is not a linear map.

Next, g(x + y) = −(x + y) = −x − y, while g(x) + g(y) = −x − y. Hence g(x + y) = g(x) + g(y). Further, g(λx) = −λx = λ(−x) = λg(x). Then g is a linear map.

Third, h(λx) = e^(λx) + 2, while λh(x) = λ(e^x + 2) = λe^x + 2λ. There is no reason why these should be the same for all values of x and λ in R. Indeed, when λ = −1, then they are equal only when e^(−x) + 2 = −(e^x + 2), that is, if e^(−x) + e^x + 4 = 0, that is, never. Then h is not a linear map.

We omit the proof that k is not a linear map.

In general, the linear functions from R to R are the functions of the form l(x) = cx, for some fixed c ∈ R (including 0).

    2. Linear Maps and Matrices

Suppose that A is an n × m matrix. Define T_A : R^m → R^n by the formula

T_A x = Ax   for all x ∈ R^m.

Then T_A is a linear map. Indeed,

T_A(x + y) = A(x + y) = Ax + Ay = T_A(x) + T_A(y)

for all x, y ∈ R^m, and similarly T_A(λx) = λT_A(x) for all x ∈ R^m and λ ∈ R.

We show now that multiplications by matrices are essentially the only examples of linear maps from R^m to R^n. Recall that e_j denotes the vector (0, . . . , 0, 1, 0, . . . , 0)^T, with only one nonzero entry, namely 1, in the jth place.


Theorem 8.5. Suppose that T : R^m → R^n is a linear map. Let a_j denote the vector Te_j, and A denote the n × m matrix whose jth column is a_j, where 1 ≤ j ≤ m. Then

Tx = Ax

for all x in R^m.

Proof. Take x in R^m. We write x as (x1, x2, . . . , xm)^T, and then

x = x1 e1 + x2 e2 + ⋯ + xm em.

Since T is linear,

Tx = T(x1 e1 + x2 e2 + ⋯ + xm em)
= x1 Te1 + x2 Te2 + ⋯ + xm Tem
= x1 a1 + x2 a2 + ⋯ + xm am = Ax,

as required. □
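The theorem gives a recipe: apply the map to the standard basis vectors and use the results as the columns of the matrix. A short numpy sketch (not in the original notes; the map T below is a sample chosen for illustration):

```python
import numpy as np

def T(v):
    # a sample linear map on R^3, chosen only for illustration
    x, y, z = v
    return np.array([x + 2*y, 3*z, y - z])

# the jth column of A is T(e_j), as in Theorem 8.5
A = np.column_stack([T(e) for e in np.eye(3)])

x = np.array([1.0, -2.0, 5.0])
agrees = np.allclose(T(x), A @ x)   # Tx = Ax for any x
```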

3. Examples in R^2 and R^3

Exercise 8.6. Let R : R^2 → R^2 be the rotation (anti-clockwise) through the angle θ. Represent R as multiplication by a matrix.

Answer. Observe that

R(1, 0)^T = (cos θ, sin θ)^T   and   R(0, 1)^T = (−sin θ, cos θ)^T.

Then

R(x, y)^T = x R(1, 0)^T + y R(0, 1)^T =
[ cos θ  −sin θ ] [ x ]
[ sin θ   cos θ ] [ y ].
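One can check numerically that this matrix behaves as Exercise 8.1 predicts; a small numpy sketch (the angle, vectors and scalar are arbitrary choices):

```python
import numpy as np

theta = 0.7   # an arbitrary angle
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

v = np.array([1.0, 2.0])
w = np.array([-3.0, 0.5])
lam = 4.2

sum_rule = np.allclose(R @ (v + w), R @ v + R @ w)       # R(v + w) = Rv + Rw
scalar_rule = np.allclose(R @ (lam * v), lam * (R @ v))  # R(lam v) = lam Rv
length_preserved = np.isclose(np.linalg.norm(R @ v), np.linalg.norm(v))
```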

Exercise 8.7. Let D_s : R^3 → R^3 be dilation by the factor s in R^+, that is, D_s x = sx for all x in R^3. Show that D_s is linear and represent D_s as multiplication by a matrix.

Answer. We omit the proof that D_s is linear.


It is easy to see that

D_s x =
[ s 0 0 ]
[ 0 s 0 ] x   for all x ∈ R^3.
[ 0 0 s ]

Exercise 8.8. What is the geometric effect of multiplying vectors in R^3 by the matrix

[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 10^(−3) ] ?

Answer. If we think of the vectors (1, 0, 0)^T and (0, 1, 0)^T as being horizontal, and (0, 0, 1)^T as being vertical, then, on the one hand, we do not change the horizontal components of a vector, while on the other hand, we reduce the vertical component by a factor of 10^3. This corresponds to squashing down.

Exercise 8.9. What is the geometrical effect of multiplying vectors in R^2 by the matrix

[ 1 1 ]
[ 0 1 ] ?

Answer. The effect of this transformation is to slide vectors horizontally: to the left when y < 0 and to the right when y > 0. This is known as a shear transformation.

Exercise 8.10. Let a be a unit vector in R^3. Define the map L : R^3 → R^3 by the formula Lx = (x · a)a for all x ∈ R^3.

(a) Show that L is linear.
(b) Express L as multiplication by a matrix.
(c) Describe L geometrically.

Answer. First, observe that

L(x + y) = ((x + y) · a)a = (x · a + y · a)a = (x · a)a + (y · a)a = Lx + Ly,

for all x, y ∈ R^3. Moreover,

L(λx) = ((λx) · a)a = (λ(x · a))a = λ((x · a)a) = λLx,


for all x ∈ R^3 and λ ∈ R. Hence L is a linear map.

It is easy to check that Le_j = a_j a, and hence Lv = Av, where

    [ a1a1  a2a1  a3a1 ]
A = [ a1a2  a2a2  a3a2 ]
    [ a1a3  a2a3  a3a3 ].

Geometrically, Lv is the projection of v onto a.
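In matrix form, A is the outer product of a with itself, and a projection applied twice changes nothing. A brief numpy check (not in the original notes; the vectors are arbitrary choices):

```python
import numpy as np

a = np.array([2.0, -1.0, 2.0])
a = a / np.linalg.norm(a)   # L is defined for a unit vector a

A = np.outer(a, a)          # the matrix with entries a_i a_j

x = np.array([1.0, 4.0, -2.0])
matches_formula = np.allclose(A @ x, np.dot(x, a) * a)  # Ax = (x.a)a
idempotent = np.allclose(A @ (A @ x), A @ x)            # projecting twice = once
```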

Exercise 8.11. Let a be a unit vector in R^3. Define the map X : R^3 → R^3 by the formula

Xv = a × v

for all v ∈ R^3.

(a) Show that X is linear.
(b) Express X as multiplication by a matrix.
(c) Describe X geometrically.

Answer. First, observe that

X(x + y) = a × (x + y) = a × x + a × y = Xx + Xy,

for all x, y ∈ R^3. Moreover,

X(λx) = a × (λx) = λ(a × x) = λXx,

for all x ∈ R^3 and λ ∈ R. Hence X is a linear map.

It is easy to check that Xv = Av, where

    [  0   −a3   a2 ]
A = [  a3   0   −a1 ]
    [ −a2   a1   0  ].

Geometrically, if v is parallel to a, then Xv = 0, while if v is perpendicular to a, then Xv is obtained by rotating v through π/2 around the a axis.

So X corresponds to projection onto the plane {v ∈ R^3 : v · a = 0} followed by a rotation through π/2 around the a axis.
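A quick numpy check that this matrix reproduces the cross product; here a is taken along the z-axis purely for illustration:

```python
import numpy as np

a = np.array([0.0, 0.0, 1.0])   # a unit vector: the z-axis, for illustration

# the matrix of Xv = a x v
A = np.array([[    0, -a[2],  a[1]],
              [ a[2],     0, -a[0]],
              [-a[1],  a[0],     0]])

v = np.array([3.0, -1.0, 2.0])
matches_cross = np.allclose(A @ v, np.cross(a, v))
kills_parallel = np.allclose(A @ a, np.zeros(3))   # Xa = a x a = 0
```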


    LECTURE 9

    More on linear maps

    We have defined linear maps, and seen some examples. In this lecture, we see more

    examples and properties of linear maps.

    1. Examples of linear maps

Exercise 9.1. Define the function S : R^2 → R^2 by

S(x, y)^T = (x^2 − y^2, 2xy)^T

for all (x, y)^T ∈ R^2. Is S linear?

Answer. Observe that

S(λ(x, y)^T) = S(λx, λy)^T = (λ^2 x^2 − λ^2 y^2, 2λ^2 xy)^T = λ^2 (x^2 − y^2, 2xy)^T = λ^2 S(x, y)^T.

Since λ ≠ λ^2 (except when λ = 0 or λ = 1),

S(λ(x, y)^T) ≠ λS(x, y)^T,

and S is not linear.

Exercise 9.2. Define the function I : C[R] → C[R] by

If(x) = ∫_0^x f(t) dt

for all x ∈ R. Is I linear?

Answer. Take continuous functions f and g on R and λ ∈ R. First,

I(f + g)(x) = ∫_0^x (f(t) + g(t)) dt = ∫_0^x f(t) dt + ∫_0^x g(t) dt = If(x) + Ig(x)


for all x ∈ R, that is, I(f + g) = If + Ig. Next,

I(λf)(x) = ∫_0^x λf(t) dt = λ ∫_0^x f(t) dt = λIf(x)

for all x ∈ R, that is, I(λf) = λ(If).

Hence I is linear.
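The same computation can be carried out symbolically; a sketch with sympy (not part of the notes; the functions f and g are arbitrary choices):

```python
import sympy as sp

x, t, lam = sp.symbols('x t lam')

def I(f):
    # (If)(x) is the integral of f from 0 to x
    return sp.integrate(f.subs(x, t), (t, 0, x))

f = sp.cos(x)   # an arbitrary continuous function
g = x**2        # another arbitrary choice

additive = sp.simplify(I(f + g) - (I(f) + I(g))) == 0
homogeneous = sp.simplify(I(lam * f) - lam * I(f)) == 0
starts_at_zero = I(f).subs(x, 0) == 0   # If(0) = 0 for every f
```

The last line records the fact used later in the notes: every function in the image of I vanishes at 0.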

    2. Algebra of linear maps

    We gather together a number of useful facts about linear maps.

Lemma 9.3. Suppose that V and W are vector spaces and T : V → W is a linear map. Then

T(0) = 0
T(−u) = −T(u)

for all u ∈ V.

Proof. For all vectors u in V and all scalars λ, we know that T(λu) = λT(u). Take λ equal to 0 to prove the first identity, and λ equal to −1 to prove the second. □

This means that if T : V → W is a function between vector spaces, and T(0) ≠ 0 or T(−v) ≠ −T(v) for just one vector v, then T is not linear.

    We have already used the following results implicitly.

Lemma 9.4. Suppose that V and W are vector spaces and T : V → W is a function. Then T is a linear transformation if and only if

T(λu + μv) = λT(u) + μT(v)   (9.1)

for all u, v ∈ V and all scalars λ, μ.

Proof. If T is linear, then

T(λu + μv) = T(λu) + T(μv) = λT(u) + μT(v).

Conversely, if (9.1) holds, then taking λ = μ = 1 shows that

T(u + v) = T(u) + T(v)


and taking μ = 0 shows that

T(λu) = λT(u),

so T is linear. □

Corollary 9.5. Suppose that V and W are vector spaces and T : V → W is a linear map. Then for all finite linear combinations Σ_{j=1}^n λj vj in V,

T( Σ_{j=1}^n λj vj ) = Σ_{j=1}^n λj T(vj).

Proof. This is proved by induction on n: the lemma above shows that the result holds when n = 2, and if k ≥ 2 and

T( Σ_{j=1}^k λj vj ) = Σ_{j=1}^k λj T(vj),

then

T( Σ_{j=1}^{k+1} λj vj ) = T( Σ_{j=1}^k λj vj + λ_{k+1} v_{k+1} )
= T( Σ_{j=1}^k λj vj ) + T(λ_{k+1} v_{k+1})
= Σ_{j=1}^k λj T(vj) + λ_{k+1} T(v_{k+1})
= Σ_{j=1}^{k+1} λj T(vj).

The first equality holds by definition, the second because T is linear, the third by the inductive hypothesis and the linearity of T, and the fourth by definition. It follows that the result holds for all integers n ≥ 2 by induction. □


    3. More on the geometry of linear maps

Exercise 9.6. Let D denote the diagonal n × n matrix

[ d1  0  . . .  0  ]
[ 0   d2 . . .  0  ]
[ ⋮   ⋮   ⋱    ⋮  ]
[ 0   0  . . .  dn ].

Show that multiplication by D is linear. What is the geometric effect of multiplying by the matrix D?

Answer. We omit the proof that multiplication by D is linear.

Multiplication changes the kth component of a vector by a factor of dk. If |dk| < 1, this gives a compression; if |dk| > 1, this gives an expansion. If dk < 0, there is also a change of orientation.

    4. Images and kernels

Consider the linear system

a11 x1 + a12 x2 + ⋯ + a1n xn = b1
. . .
am1 x1 + am2 x2 + ⋯ + amn xn = bm,

or equivalently, in matrix form,

Ax = b.

What b can we solve this for? Is the solution unique?

We have seen that we can solve the equation if and only if b is a linear combination of the columns of A; we write b ∈ col(A) for short. We also know that, if x_part is a particular solution of Ax = b, then every solution is of the form x_part + x_hom, where x_hom is a solution of the homogeneous equation

Ax = 0.

If Ax = b has any solutions, then it has as many solutions as the homogeneous equation.

    Next, consider the integral equation

    Iu = b,


    where I is the integration operator introduced earlier, b is a known function and u is an

    unknown continuous function. What b can we solve this for? Is the solution unique?

    It is actually quite hard to answer the first question. But if u is a continuous function,

    then Iu(0) = 0, so a necessary condition to be able to solve this problem is that b(0) = 0,

    and certainly we cannot solve this for all functions b. It is convenient to have a notation

for the functions for which we can solve this equation. The set of vectors {If : f ∈ C[R]}

    is called the image of I, and written image(I).

For the second question, we can show that, if u_part is a particular solution of Iu = b, then every solution of the equation is of the form u_part + u_hom, where u_hom is a solution of

    the homogeneous equation

    Iu = 0.

    If Iu = b has any solutions, then it has as many solutions as the homogeneous equation.

    We unify these (and other examples) in the following definitions.

Definition 9.7. Suppose that V and W are vector spaces and T : V → W is a linear map. The set of vectors {Tv : v ∈ V} is called the image or range of T, and written image(T) or range(T) or T(V). In symbols,

image(T) = {Tv : v ∈ V}.

    An equivalent form of the definition is that image(T) is the collection of vectors w in

    W for which the equation Tx = w can be solved.

Let x_part be a particular solution of the equation Tx = w, and let x_hom be any solution of the homogeneous equation Tx_hom = 0. Then

T(x_part + x_hom) = T(x_part) + T(x_hom) = w + 0 = w,

so that x_part + x_hom is also a solution of Tx = w. Further, every solution of Tx = w is of this form. Indeed, if Tx = w and Tx_part = w, then

T(x − x_part) = T(x) − T(x_part) = w − w = 0,

so x − x_part is a solution of the homogeneous equation, and x = x_part + (x − x_part).

    The solutions to the homogeneous equation Tx = 0 are important in the discussion

    above, and we give them a name.

Definition 9.8. Suppose that V and W are vector spaces and T : V → W is a linear map. The kernel of T, written ker(T), also known as the null space of T, is the subset of


V of all vectors x such that Tx = 0. In symbols,

ker(T) = {x ∈ V : Tx = 0}.

Theorem 9.9. Suppose that V and W are vector spaces and T : V → W is a linear map. Then ker(T) and image(T) are subspaces.

Proof. First we show that ker(T) is a subspace. We have just seen that T0 = 0, so ker(T) is not empty. Next, if v, w ∈ ker(T), then

T(v + w) = Tv + Tw = 0 + 0 = 0,

so v + w ∈ ker(T). Thus ker(T) is closed under vector addition. Further, if λ is a scalar and v ∈ ker(T), then

T(λv) = λ(Tv) = λ0 = 0,

so λv ∈ ker(T). Thus ker(T) is closed under scalar multiplication. By the Subspace Theorem, ker(T) is a subspace.

Now we show that image(T) is a subspace. We have just seen that T0 = 0, so image(T) is not empty. Next, if x, y ∈ image(T), then there are vectors u and v in V such that x = Tu and y = Tv. Then

x + y = Tu + Tv = T(u + v),

and x + y ∈ image(T) since u + v ∈ V. Thus image(T) is closed under vector addition. Further, if λ is a scalar and x ∈ image(T), then there is a vector u in V such that x = Tu, whence

λx = λ(Tu) = T(λu),

and λx ∈ image(T) since λu ∈ V. Thus image(T) is closed under scalar multiplication.

By the Subspace Theorem, image(T) is a subspace. □

    The spaces ker(T) and image(T) are vector spaces, and have dimensions. These tell us

    something about T, and this is what we will investigate next.


    LECTURE 10

    Rank and nullity

    1. Definitions and properties

Definition 10.1. Suppose that V and W are vector spaces and T : V → W is a linear map. The nullity of T, written nullity(T), is the dimension of ker(T). The rank of T, written rank(T), is the dimension of image(T). The co-rank of T, written co-rank(T), is the number dim(W) − dim(image(T)).

    Observe that the nullity of T is equal to the number of parameters in the general

    solution of Tx = w. Observe also that the rank determines the co-rank, and vice versa.

    To any matrix A, we have associated a linear map TA, namely, multiplication by A. It

    is convenient to use the expressions ker(A), nullity(A), image(A), rank(A) and co-rank(A)

    to mean ker(TA), nullity(TA), image(TA), rank(TA) and co-rank(TA).

However, this also applies to other kinds of vectors. For example, the general solution of

(d^2/dx^2) u(x) + u(x) = x

is

u(x) = x + A sin x + B cos x.

This has two parameters, and the nullity of the linear differential operator T, given by Tf(x) = f″(x) + f(x), is 2.

Proposition 10.2. Suppose that T : V → W is a linear map. Then

(i) T is one-to-one if and only if nullity(T) = 0.
(ii) T is onto if and only if rank(T) = dim(W), that is, if and only if co-rank(T) = 0.

Proof. First, we consider when T is one-to-one. On the one hand, suppose that nullity(T) = 0, so ker(T) = {0}. By linearity, if x, y ∈ V and T(x) = T(y), then T(x − y) = 0, and so x − y ∈ ker T = {0}. Thus x − y = 0, that is, x = y. Thus T is one-to-one.



On the other hand, suppose that nullity(T) ≠ 0, so that ker(T) ≠ {0}. Take z ∈ ker(T) such that z ≠ 0. Then T(z) = 0 = T(0), and T is not one-to-one.

Now we consider when T is onto. On the one hand, suppose that rank(T) = dim(W) = n, say. Now image(T) ⊆ W, and image(T) has a basis, {w1, . . . , wn}. This is a linearly independent set in W, of maximal size, since dim(W) = n, and hence is a basis of W. Thus span{w1, . . . , wn} = image(T) and span{w1, . . . , wn} = W, so W = image(T). This means that for any w ∈ W, we can find v ∈ V such that Tv = w, and T is onto.

On the other hand, suppose that rank(T) < dim(W). Then a basis for image(T) is smaller than a basis for W. In particular, a basis {w1, . . . , wn} for image(T) can be enlarged to form a basis {w1, . . . , wn, wn+1, . . . } for W. A basis is linearly independent, so wn+1 is not in the span of {w1, . . . , wn}. Thus wn+1 is not in image(T), and T is not onto. □

Theorem 10.3. If U is a subspace of W, then the smallest number of linear equations needed to describe U is dim(W) − dim(U).

We omit the proof of this result. It implies that the co-rank of a linear transformation T : R^m → R^n is the number of linear equations needed to describe the image of T.

Exercise 10.4. Consider the line in R^3 with parametric equation x = λd, where d = (2, 0, 1)^T. This line is a 1-dimensional subspace. It may also be described by the equations y = 0 and x = 2z. It is of codimension 2.

Challenge Problem 10.5. Find equations which define span{(1, 2, 0, 1)^T, (1, 0, 2, 0)^T} in R^4. What is the minimal number of linear equations needed to do this?

    2. Examples

Exercise 10.6. Consider D : Pn(R) → Pn−1(R), given by

D(an t^n + ⋯ + a0) = n an t^(n−1) + ⋯ + 2a2 t + a1

(that is, D corresponds to differentiation). Find nullity(D) and rank(D).

Answer. First of all, take p(t) ∈ Pn(R) given by p(t) = an t^n + ⋯ + a0. If D(p)(t) = 0 (for all t), then n an t^(n−1) + ⋯ + 2a2 t + a1 = 0, and so an, an−1, . . . , a1 are all 0. However a0 can be arbitrary. Thus ker(D) is the set of constant polynomials, which is of dimension 1. Hence nullity(D) = 1.


It is easy to see that {1, t, t^2, . . . , t^(n−1)} is a basis for Pn−1(R), whence dim(Pn−1(R)) = n.

Suppose that q(t) ∈ Pn−1(R) is given by q(t) = bn−1 t^(n−1) + ⋯ + b1 t + b0. We choose the coefficients an, an−1, . . . , a1 as follows:

an = bn−1 / n,   an−1 = bn−2 / (n − 1),   . . . ,   a2 = b1 / 2,   a1 = b0,

and define p(t) = an t^n + ⋯ + a0. It is easy to check that Dp(t) = q(t) for all real t. Since q is an arbitrary element of Pn−1(R), it follows that D is onto, image(D) = Pn−1(R), and rank(D) = dim(Pn−1(R)) = n.

Observe that rank(D) + nullity(D) = n + 1 = dim(Pn(R)).
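The rank and nullity of D can also be read off from its matrix in the monomial bases; a sympy sketch (not in the notes), where n = 4 is an arbitrary choice:

```python
import sympy as sp

n = 4   # work in P_4(R); the value of n is an arbitrary choice

# matrix of differentiation D : P_n -> P_{n-1} in the monomial bases:
# D(t^j) = j t^(j-1), so column j has a single entry j in row j - 1
A = sp.zeros(n, n + 1)
for j in range(1, n + 1):
    A[j - 1, j] = j

rank = A.rank()
nullity = len(A.nullspace())   # dimension of ker(D)
```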

Exercise 10.7. Find the nullity of the matrix

[ 1   2  3  4 ]
[ 1   0  4  2 ]
[ 1  −1  0  0 ].

Answer. Reduce to row-echelon form. The reduced matrix is of the form

[ 1   2    3    4  ]
[ 0  −2    1   −2  ]
[ 0   0  −9/2  −1  ].

Then the rank of the matrix is 3; the nullity is 1.

Note that we do not have to find ker(A) explicitly to show that it is 1-dimensional.

Exercise 10.8. Find a basis for the kernel of the matrix

[ 1   2  3  4 ]
[ 1   0  4  2 ]
[ 1  −1  0  0 ].

Answer. We need to find x1, . . . , x4 such that

[ 1   2  3  4 ]
[ 1   0  4  2 ] (x1, x2, x3, x4)^T = 0,
[ 1  −1  0  0 ]


that is, to find the solutions of the system represented by the augmented matrix

[ 1   2  3  4 | 0 ]
[ 1   0  4  2 | 0 ]
[ 1  −1  0  0 | 0 ].

Row-reduced, this is of the form

[ 1   2    3    4  | 0 ]
[ 0  −2    1   −2  | 0 ]
[ 0   0  −9/2  −1  | 0 ].

The solution space has the parametric description

x = t (10, 10, 2, −9)^T,

where t ∈ R, and is 1-dimensional. Then {(10, 10, 2, −9)^T} is a basis for the kernel of the matrix.
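A numpy check of this kernel computation; the signs in the matrix and the basis vector follow the editorial reading of the row reduction above:

```python
import numpy as np

A = np.array([[1,  2, 3, 4],
              [1,  0, 4, 2],
              [1, -1, 0, 0]])   # signs as reconstructed in the exercise

v = np.array([10, 10, 2, -9])   # the claimed kernel basis vector

in_kernel = np.allclose(A @ v, np.zeros(3))
rank = np.linalg.matrix_rank(A)   # 4 columns minus the rank gives the nullity
```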

Exercise 10.9. Find a basis for the image of the matrix

[ 1   2  3  4 ]
[ 1   0  4  2 ]
[ 1  −1  0  0 ].

Answer. We row-reduce this matrix, and get

[ 1   2    3    4  ]
[ 0  −2    1   −2  ]
[ 0   0  −9/2  −1  ].

Thus the first three columns are linearly independent, and the fourth column depends linearly on these. It follows that the vectors (1, 1, 1)^T, (2, 0, −1)^T and (3, 4, 0)^T are linearly independent; since R^3 is 3-dimensional, they must form a basis.

Of course, other sets of three linearly independent vectors, such as {(1, 0, 0)^T, (0, 1, 0)^T, (0, 0, 1)^T}, are also bases for R^3.


Exercise 10.10. Define T : P3(R) → R^4 by

T(a3 x^3 + a2 x^2 + a1 x + a0) = (a0, a1, a2, a3)^T.

Show that T is a linear mapping, and find its rank and nullity.

Answer. Suppose that

p(x) = a3 x^3 + a2 x^2 + a1 x + a0
q(x) = b3 x^3 + b2 x^2 + b1 x + b0.

Then

T(p(x) + q(x)) = T((a3 + b3)x^3 + (a2 + b2)x^2 + (a1 + b1)x + (a0 + b0))
= (a0 + b0, a1 + b1, a2 + b2, a3 + b3)^T
= (a0, a1, a2, a3)^T + (b0, b1, b2, b3)^T
= T(p(x)) + T(q(x)),

and further

T(λp(x)) = T(λa3 x^3 + λa2 x^2 + λa1 x + λa0)
= (λa0, λa1, λa2, λa3)^T
= λ(a0, a1, a2, a3)^T
= λT(p(x)),

so T is linear.

Alternatively, it suffices to write

T(λp(x) + μq(x)) = T((λa3 + μb3)x^3 + (λa2 + μb2)x^2 + (λa1 + μb1)x + (λa0 + μb0))
= (λa0 + μb0, λa1 + μb1, λa2 + μb2, λa3 + μb3)^T
= λ(a0, a1, a2, a3)^T + μ(b0, b1, b2, b3)^T
= λT(p(x)) + μT(q(x)),

by Lemma 9.4.

Finally, T(p(x)) = 0 forces a0 = a1 = a2 = a3 = 0, so ker(T) = {0} and nullity(T) = 0; and every vector (a0, a1, a2, a3)^T in R^4 is attained, so T is onto and rank(T) = 4.

Exercise 10.11. Define T : Pn(R) → Pn(R) by

T(p(x)) = x^2 (d^2 p(x)/dx^2) + 4x (dp(x)/dx) − 4p(x).


    Find rank(T) and nullity(T).

Answer. Suppose that p(x) = x^k, where 0 ≤ k ≤ n. Then

T(p(x)) = x^2 (d^2 x^k/dx^2) + 4x (dx^k/dx) − 4x^k = x^2 k(k − 1)x^(k−2) + 4x k x^(k−1) − 4x^k
= [k(k − 1) + 4k − 4] x^k = [k^2 + 3k − 4] x^k
= t_k x^k,

say. Note that y^2 + 3y − 4 = 0 if and only if y = −4 or 1, so that t_k = 0 when k = 1, but not for other nonnegative integers k.

Now suppose that

p(x) = an x^n + an−1 x^(n−1) + ⋯ + a1 x + a0.

Then

T(p(x)) = an t_n x^n + an−1 t_{n−1} x^(n−1) + ⋯ + a1 t_1 x + a0 t_0.

If T(p(x)) = 0, then an t_n = an−1 t_{n−1} = ⋯ = a1 t_1 = a0 t_0 = 0, and so an = an−1 = ⋯ = a2 = a0 = 0, while a1 can be arbitrary. Thus ker(T) = {cx : c ∈ R}, and nullity(T) = 1.

Given any q ∈ Pn(R), say

q(x) = bn x^n + bn−1 x^(n−1) + ⋯ + b1 x + b0,

then we can try to solve T(p(x)) = q(x) by taking

an = bn / t_n,   an−1 = bn−1 / t_{n−1},   . . . ,   a0 = b0 / t_0.

Of course, this is a problem when k = 1, since t_1 = 0, but fine otherwise. Hence

image(T) = {bn x^n + bn−1 x^(n−1) + ⋯ + b1 x + b0 : b1 = 0},

and rank(T) = n.


    LECTURE 11

    The rank-nullity theorem

    In this lecture, we prove the rank-nullity theorem for matrices and general linear maps.

    And we show that the set of linear maps from a vector space V to a vector space W is

    itself a vector space, and that the set of invertible linear maps on a vector space V is a

    group.

    1. Representing general linear maps by matrices

    It is easier calculating with matrices than with general linear maps. Now we show how

    to represent a general linear map as a matrix.

Theorem 11.1. Suppose that V and W are vector spaces with bases {v1, . . . , vm} = A and {w1, . . . , wn} = B respectively, and suppose that T : V → W is a linear map. Let t_j denote the vector Tv_j and [t_j]_B denote its coordinates relative to the basis B, and A denote the matrix whose jth column is [t_j]_B, where j = 1, . . . , m. Then for all x in V,

[Tx]_B = A[x]_A.

Proof. We start by clarifying the definitions: for j = 1, . . . , m, we may write t_j as a linear combination of the vectors w1, . . . , wn, that is,

t_j = a1j w1 + ⋯ + anj wn = Σ_{k=1}^n akj wk,

where the akj are scalars. By definition, the scalars a1j, . . . , anj are the coordinates of t_j relative to the ordered basis B, that is, [t_j]_B = (a1j, . . . , anj)^T. Hence the matrix A has entries akj.

For x in V, we may write

x = x1 v1 + ⋯ + xm vm = Σ_{j=1}^m xj vj,


and then, by definition, (x1, . . . , xm)^T = [x]_A. It follows that

Tx = T( Σ_{j=1}^m xj vj ) = Σ_{j=1}^m xj T(vj) = Σ_{j=1}^m xj t_j
= Σ_{j=1}^m xj Σ_{k=1}^n akj wk = Σ_{k=1}^n ( Σ_{j=1}^m akj xj ) wk = Σ_{k=1}^n (A[x]_A)_k wk,

so

[Tx]_B = A(x1, x2, . . . , xm)^T,

as required. □

Exercise 11.2. Consider the differential operator D : Pn(R) → Pn(R) given by

Dp(x) = x (d^2/dx^2) p(x) + 3p(x).

Represent D as a matrix, using the basis {1, x, . . . , x^n} for Pn(R).

Answer. Observe that

Dx^k = x k(k − 1)x^(k−2) + 3x^k = k(k − 1)x^(k−1) + 3x^k.

It follows that, if p(x) = a0 x^0 + a1 x^1 + ⋯ + an x^n, then Dp(x) = b0 x^0 + b1 x^1 + ⋯ + bn x^n, where b = Aa, and A is the matrix

[ 3 0 0 0 0  . . . ]
[ 0 3 2 0 0  . . . ]
[ 0 0 3 6 0  . . . ]
[ 0 0 0 3 12 . . . ]
[ 0 0 0 0 3  . . . ]
[ ⋮ ⋮ ⋮ ⋮ ⋮   ⋱  ].

A number of problems about linear equations can be reduced to problems about matrices in this way.
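For instance, the matrix of the operator D of Exercise 11.2 can be generated symbolically; a sympy sketch (not in the notes), taking n = 4 as an arbitrary choice:

```python
import sympy as sp

x = sp.Symbol('x')
n = 4   # an arbitrary choice of degree

def D(p):
    # Dp(x) = x p''(x) + 3 p(x), as in Exercise 11.2
    return sp.expand(x * sp.diff(p, x, 2) + 3 * p)

# column k of A holds the coefficients of D(x^k) in the basis {1, x, ..., x^n}
A = sp.zeros(n + 1, n + 1)
for k in range(n + 1):
    coeffs = sp.Poly(D(x**k), x).all_coeffs()[::-1]   # constant term first
    for i, c in enumerate(coeffs):
        A[i, k] = c
```

The entries 3 down the diagonal and 2, 6, 12 on the superdiagonal match the matrix displayed above.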

    2. The rank-nullity theorem

Theorem 11.3 (The rank-nullity theorem for matrices). Suppose that A is an n × m matrix. Then

rank(A) + nullity(A) = m.


Proof. Suppose that A is row-reduced to row-echelon form. Then the columns of A that correspond to the leading columns of the reduced matrix form a basis for range(A), hence rank(A) is equal to the number of leading columns. The nonleading columns of the reduced matrix correspond to the parameters of the solution, that is, nullity(A) is equal to the number of nonleading columns. These numbers add to give the total number of columns, that is, m. □
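The theorem is easy to test on a concrete matrix; a sympy sketch (the matrix is an arbitrary example, with a dependent row so that the kernel is nontrivial):

```python
import sympy as sp

A = sp.Matrix([[1, 2, 3, 4],
               [2, 4, 6, 8],    # twice row 1, so the kernel is nontrivial
               [0, 1, 1, 0]])

rank = A.rank()
nullity = len(A.nullspace())    # dimension of ker(A)
rn_holds = (rank + nullity == A.cols)
```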

Corollary 11.4. Suppose that T : V → W is a linear map between (finite-dimensional) vector spaces. Then rank(T) + nullity(T) = dim(V).

    Proof. We have just seen that we may represent T by a matrix, and we may apply

    the rank-nullity theorem for matrices to this matrix.

Alternatively, here is a more descriptive proof. Take a basis {v1, . . . , vk} for the kernel of T and enlarge it to a basis {v1, . . . , vm} for V. We now claim that the m − k vectors T(vk+1), . . . , T(vm) are linearly independent and span image(T), so form a basis for this space. It then follows that dim(image(T)) = m − k, which is essentially the desired result.

To see that the vectors T(vk+1), . . . , T(vm) are linearly independent is quite easy: if

λk+1 T(vk+1) + ⋯ + λm T(vm) = 0,

then

T(λk+1 vk+1 + ⋯ + λm vm) = 0,

that is,

λk+1 vk+1 + ⋯ + λm vm ∈ ker(T).

Since v1, . . . , vk span ker(T), there exist λ1, . . . , λk such that

−(λk+1 vk+1 + ⋯ + λm vm) = λ1 v1 + ⋯ + λk vk,

and hence

λ1 v1 + ⋯ + λk vk + λk+1 vk+1 + ⋯ + λm vm = 0;

since the vectors v1, . . . , vm are linearly independent, the λj are all 0 when 1 ≤ j ≤ m.

To prove that the vectors T(vk+1), . . . , T(vm) span image(T) is shorter and easier: if y ∈ image(T), then there exists x ∈ V such that y = T(x); we write x as a linear


combination Σ_{j=1}^m xj vj, and then y is the corresponding linear combination:

y = T(x) = T( Σ_{j=1}^m xj vj ) = Σ_{j=1}^m xj T(vj) = Σ_{j=k+1}^m xj T(vj),

since T(vj) = 0 when 1 ≤ j ≤ k. Hence y ∈ span{T(vk+1), . . . , T(vm)}. □

Corollary 11.5. If A is a square matrix, then nullity(A) = co-rank(A). Hence the linear maps associated to square matrices are one-to-one if and only if they are onto.

Proof. We saw that a linear map is one-to-one if and only if its nullity is 0 and that it is onto if and only if its co-rank is 0. The rank-nullity theorem implies that the nullity and co-rank are equal. □

    3. The algebra of linear maps

    We discuss briefly how to get new linear maps from old.

Suppose that S and T are linear maps from V to W and λ and μ are scalars. We may define a map λS + μT from V to W by

(λS + μT)(v) = λS(v) + μT(v)   for all v ∈ V.

Lemma 11.6. Suppose that S and T are linear maps from V to W and λ and μ are scalars. Then λS + μT is a linear map.

Proof. By definition, if u, v ∈ V and α and β are scalars, then

(λS + μT)(αu + βv) = λS(αu + βv) + μT(αu + βv)
= λ(αS(u) + βS(v)) + μ(αT(u) + βT(v))
= αλS(u) + βλS(v) + αμT(u) + βμT(v)
= α(λS(u) + μT(u)) + β(λS(v) + μT(v))
= α(λS + μT)(u) + β(λS + μT)(v),

and so λS + μT is linear, by Lemma 9.4. □

We may also compose linear maps: if S : U → V and T : V → W are linear maps, then we define T∘S : U → W by T∘S(u) = T(S(u)) for all u ∈ U.


Lemma 11.7. Suppose that S : U → V and T : V → W are linear maps. Then T∘S : U → W is a linear map.

Proof. We leave this proof as an exercise. □

Exercise 11.8. Suppose that S : U → V and T : V → W are linear maps. How big, in terms of the quantities nullity(S) and nullity(T), can nullity(T∘S) be?

Answer. First of all, ker(S) ⊆ ker(T∘S) ⊆ U. We take a basis for ker(S), and enlarge it, first to a basis A of ker(T∘S), and then to a basis B of U. There are nullity(S) vectors in the basis for ker(S). Any basis vector v in B that is part of the basis for ker(T∘S) but not of the basis for ker(S) has the property that S(v) ≠ 0, but T(S(v)) = 0, so S(v) ∈ ker(T). Such vectors S(v) are linearly independent in ker(T), so there are at most nullity(T) of them.

In total, there are at least nullity(S) vectors in A, and at most nullity(S) + nullity(T) vectors.

We conclude that

nullity(S) ≤ nullity(T∘S) ≤ nullity(S) + nullity(T).
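These bounds can be checked on concrete matrices; a sympy sketch with two sample maps on R^3 (both chosen purely for illustration):

```python
import sympy as sp

# S projects R^3 onto the xy-plane (nullity 1);
# T kills the x-coordinate (nullity 1); both are sample choices
S = sp.Matrix([[1, 0, 0],
               [0, 1, 0],
               [0, 0, 0]])
T = sp.Matrix([[0, 0, 0],
               [0, 1, 0],
               [0, 0, 1]])

def nullity(M):
    return len(M.nullspace())

composed = T * S   # the matrix of the composition T o S
lower = nullity(S) <= nullity(composed)
upper = nullity(composed) <= nullity(S) + nullity(T)
```

In this example the upper bound is attained: the composition has nullity 2.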

If T : V → V is a linear map on a vector space V, then T has an inverse T^(−1), which is a function on V, precisely when T is one-to-one and onto. This is when nullity(T) = 0 and rank(T) = dim(V). By the rank-nullity theorem, it suffices to suppose that nullity(T) = 0 or that rank(T) = dim(V).

Lemma 11.9. If T : V → V is an invertible linear map, then T^(−1) is a linear map.

Proof. If y1, y2 ∈ V, then there exist x1, x2 ∈ V such that T(x1) = y1 and T(x2) = y2. Then x1 = T^(−1)(y1) and x2 = T^(−1)(y2).

Further,

T(λ1 x1 + λ2 x2) = λ1 T(x1) + λ2 T(x2) = λ1 y1 + λ2 y2.

Hence

T^(−1)(λ1 y1 + λ2 y2) = T^(−1)T(λ1 x1 + λ2 x2) = λ1 x1 + λ2 x2 = λ1 T^(−1)(y1) + λ2 T^(−1)(y2),

which shows that T^(−1) is linear. □


Of course, T∘T^(−1) = I, the identity map, and I is linear.

    Invertible linear maps are important in both pure and applied mathematics. Pure

    mathematicians study the group GL(V) of invertible linear maps of a vector space, and

    find that the properties of this group answer basic questions in number theory and in the

    construction of efficient networks.

Exercise 11.10. Show that GL(V) has the following properties:

(a) GL(V) is closed under composition, that is, if S, T ∈ GL(V), then S∘T ∈ GL(V).
(b) GL(V) has an identity I, such that I∘T = T∘I = T for all T ∈ GL(V).
(c) GL(V) has inverses: for all T ∈ GL(V), there exists T^(−1) ∈ GL(V) such that T∘T^(−1) = T^(−1)∘T = I.
(d) Composition in GL(V) is associative, that is,

(R∘S)∘T = R∘(S∘T)

for all R, S, T ∈ GL(V).

Engineers describe states of robotic systems by vectors, and are interested in invertible linear maps of these systems. For instance, a television camera is mounted on a base which allows it to swivel up and down in a vertical plane, while the base can rotate in a circle in a horizontal plane. To get the camera pointing in a given direction, one has to both swivel the camera and rotate the base. How much of each basic motion is needed to effect a three-dimensional motion? This is quite an easy question, but it does not take long to get to rather difficult questions of this kind in robotics.