Last lecture summary
• independent vectors $x_1, x_2, \dots, x_n$: no combination $c_1x_1 + c_2x_2 + \dots + c_nx_n$ gives the zero vector, except the one with all $c_i = 0$
• rank – the number of independent columns/rows in a matrix
$\begin{bmatrix} 1 & 2 & 3 \\ 3 & 7 & 8 \\ 2 & 5 & 5 \end{bmatrix}$
• Rank of this matrix is 2!
• Thus, this matrix is noninvertible (singular).
• The column space and the row space have the same dimension (the rank).
• And row2 = row1 + row3, so only two rows are independent, thus rank is 2.
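A rank claim like this one can be checked by counting pivots during elimination. The sketch below is illustrative only: it assumes the matrix on this slide has rows (1 2 3), (3 7 8), (2 5 5), and the helper name `rank` is made up for this example.

```python
from fractions import Fraction

def rank(matrix):
    """Count the pivots found by Gaussian elimination (exact arithmetic)."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    pivot_row = 0
    for col in range(n_cols):
        # Look for a row at or below pivot_row with a nonzero entry in this column.
        pivot = next((r for r in range(pivot_row, n_rows) if rows[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[pivot_row], rows[pivot] = rows[pivot], rows[pivot_row]
        # Eliminate the entries below the pivot.
        for r in range(pivot_row + 1, n_rows):
            factor = rows[r][col] / rows[pivot_row][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return pivot_row

# Assumed reading of the matrix on the slide: row2 = row1 + row3.
M = [[1, 2, 3],
     [3, 7, 8],
     [2, 5, 5]]
print(rank(M))  # 2
```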
• Column space – space given by columns of the matrix and all their combinations.
• Columns of a matrix span the column space.
• We’re highly interested in a set of vectors that spans a space and is independent. Such a set of vectors is called a basis for a vector space.
• Basis is not unique.
• Every basis has the same number of vectors – dimension.
• Rank is dimension of the column space.
• dim $C(A)$ = r, dim $N(A)$ = n - r (A is m x n)
• row space – $C(A^T)$, dim $C(A^T)$ = r
• left null space – $N(A^T)$, dim $N(A^T)$ = m – r
• $C(A)$ ┴ $N(A^T)$
• $C(A^T)$ ┴ $N(A)$, row space and null space are orthogonal complements
G. Strang, Introduction to linear algebra
• orthogonal = perpendicular, dot product $a^Tb = a_1b_1 + a_2b_2 + \dots = 0$
• length of the vector $|a| = \sqrt{|a|^2} = \sqrt{a^Ta}$
• If subspace S is orthogonal to subspace T then every vector in S is orthogonal to every vector in T.
Four possibilities for Ax = b
A: m × n, rank r
rank condition    | shape of A           | solutions
r = m & r = n     | square & invertible  | 1 solution
r = m & r < n     | short & wide         | ∞ solutions
r < m & r = n     | tall & thin          | 0 or 1 solution
r < m & r < n     | not full rank        | 0 or ∞ solutions
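The four cases can be written down as a small lookup. This is only a sketch: the function name `solution_count` and the wording of its return values are assumptions made for this illustration, not part of the lecture.

```python
def solution_count(m, n, r):
    """Classify Ax = b for an m x n matrix A of rank r (the four cases above)."""
    if not 0 <= r <= min(m, n):
        raise ValueError("rank must satisfy 0 <= r <= min(m, n)")
    if r == m and r == n:
        return "1 solution"
    if r == m and r < n:
        return "infinitely many solutions"
    if r < m and r == n:
        return "0 or 1 solution"
    return "0 or infinitely many solutions"

print(solution_count(3, 3, 3))  # square & invertible: 1 solution
print(solution_count(3, 2, 2))  # tall & thin: 0 or 1 solution
```

The tall & thin case is the one least squares deals with: most right-hand sides give no solution.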
Least squares problem introduction
based on excellent video lectures by Gilbert Strang, MIT, http://ocw.mit.edu/OcwWeb/Mathematics/18-06Spring-2005/VideoLectures/detail/lecture15.htm, Lecture 15
I want to solve Ax = b when there is no solution.
WHAT??
• So b is not in the column space.
• This problem is not rare, it’s actually quite typical.
• It appears when the number of equations is bigger than the number of unknowns (i.e. m > n for m x n matrix A)
– so what can you tell me about rank, what can the rank be?
• it can’t be m, it can be n or even less
– so there will be a lot of RHS with no solution !!
• Example
– You measure the position of a satellite buzzing around
– There are six parameters giving the position
– You measure the position 1000 times
– And you want to solve Ax = b, where A is 1000 x 6
• In many problems we’ve got too many equations with noisy RHSs (b).
• So I can't expect to solve Ax = b exactly right, because there's a measurement mistake in b. But there's information too. There's a lot of information about x in there.
• So I’d like to separate the noise from the information.
• One way to solve the problem is to throw away some measurements until we get a nice square, non-singular matrix.
• That’s not satisfactory, there's no reason in these measurements to say these measurements are perfect and these measurements are useless.
• We want to use all the measurements to get the best information.
• But how?
• Now I want you to jump ahead to the matrix that will play a key role. It is the matrix $A^TA$.
• What can you tell me about the matrix?
– shape?
• square
– dimension?
• n x n
– symmetric or not?
• symmetric
• Now we can ask more about the matrix. The answers will come later in the lecture
– Is it invertible?
– If not, what’s its null space?
• Now let me tell you in advance what equation to solve when you can’t solve Ax = b:
– multiply both sides by $A^T$ from the left, and you get $A^TAx = A^Tb$, but this x is not the same as the x in Ax = b, so let’s call it $\hat{x}$, because I am hoping this one will have a solution: $A^TA\hat{x} = A^Tb$
– And I will say it’s my best solution. This is going to be my plan.
• So you see why I am so interested in the matrix $A^TA$, and its invertibility.
• Now let’s ask ourselves when $A^TA$ is invertible. And do it by example.
$A = \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 5 \end{bmatrix}$
• 3 x 2 matrix, i.e. 3 equations in 2 unknowns
• rank = 2
• Does Ax equal b? When can we solve it?
• Only if b is in the column space of A.
• It is a combination of columns of A.
• The combinations just fill up a plane, but most vectors b will not be on that plane.
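• So I am saying I will work with matrix $A^TA$.
• Help me, what is $A^TA$ for this A?
$A = \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 5 \end{bmatrix}, \quad A^TA = \begin{bmatrix} 3 & 8 \\ 8 & 30 \end{bmatrix}$
• Is this $A^TA$ invertible?
• Yes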
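• However, $A^TA$ is not always invertible !
• Propose an A such that $A^TA$ is not invertible.
$A = \begin{bmatrix} 1 & 3 \\ 1 & 3 \\ 1 & 3 \end{bmatrix}, \quad A^TA = \begin{bmatrix} 3 & 9 \\ 9 & 27 \end{bmatrix}$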
Generally, if I have two matrices, each with rank r, their product can’t have rank higher than r.
And in our case rank(A) = 1, so rank($A^TA$) can’t be more than 1.
• This happens always, rank($A^TA$) = rank(A).
• If rank($A^TA$) = rank(A), then $N(A^TA) = N(A)$.
• So $A^TA$ is invertible exactly if $N(A)$ contains only the zero vector, which means the columns of A are independent.
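Both examples can be checked numerically. A minimal sketch, assuming the two matrices read off the slides above (columns (1,1,1), (1,2,5) and columns (1,1,1), (3,3,3)); `gram` and `det2` are helper names invented for this illustration.

```python
def gram(A):
    """Return A^T A for a matrix given as a list of rows."""
    cols = list(zip(*A))
    return [[sum(x * y for x, y in zip(ci, cj)) for cj in cols] for ci in cols]

def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

A_independent = [[1, 1], [1, 2], [1, 5]]  # independent columns, rank 2
A_dependent = [[1, 3], [1, 3], [1, 3]]    # column 2 = 3 * column 1, rank 1

G1 = gram(A_independent)
G2 = gram(A_dependent)
print(G1, det2(G1))  # [[3, 8], [8, 30]] 26  -> invertible
print(G2, det2(G2))  # [[3, 9], [9, 27]] 0   -> singular
```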
Projections
based on excellent video lectures by Gilbert Strang, MIT, http://ocw.mit.edu/OcwWeb/Mathematics/18-06Spring-2005/VideoLectures/detail/lecture15.htm, Lecture 15
• I want to find a point on line a that is closest to b.
• My space is what?
– 2D plane
• Is line a a subspace?
– Yes, it is, one dimensional.
• So where is such a point?
• So we say we projected vector b on line a, we projected b into a subspace. And how did we get it?
• Orthogonality
(Figure: vector b, the line through a, the projection p of b onto that line, and the error e = b - p.)
e is the error, i.e. how much I am wrong by, and it is perpendicular to a.
And we know that the projection p is some multiple of a, p = xa. And we want to find the number x.
• Key point is that a is perpendicular to e.
• So I have $a^Te = a^T(b - p) = a^T(b - xa) = 0$
• So after some simple math we get
$x\,a^Ta = a^Tb \;\rightarrow\; x = \frac{a^Tb}{a^Ta}, \quad p = xa$
• I may look at the problem from another point of view.
• The projection from b to p is carried out by some matrix called projection matrix P.
• p = Pb
• What is the P for our case?
$p = a\,\frac{a^Tb}{a^Ta} \;\rightarrow\; P = \frac{aa^T}{a^Ta}$
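The line projection can be tried on concrete numbers. This is a sketch under assumptions: the vectors a = (1, 2, 2) and b = (1, 1, 1) are illustrative values chosen here, not taken from the slides, and `project_onto_line` is a made-up helper name.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product u^T v."""
    return sum(x * y for x, y in zip(u, v))

def project_onto_line(a, b):
    """Return (x, p, e): p = x*a is the projection of b onto the line through a."""
    x = Fraction(dot(a, b), dot(a, a))      # x = a^T b / a^T a
    p = [x * ai for ai in a]                # p = x a
    e = [bi - pi for bi, pi in zip(b, p)]   # e = b - p
    return x, p, e

a = [1, 2, 2]  # illustrative direction of the line
b = [1, 1, 1]  # illustrative vector to project
x, p, e = project_onto_line(a, b)
print(x)          # 5/9
print(dot(a, e))  # 0, the error is perpendicular to a
```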
Projection matrix
• What’s its column space?
– What does the column space of a matrix A do?
• If you multiply the matrix A by anything you always land in the column space. That’s what column space is.
– So where am I if I do Pb?
• I am on the line a. The column space of P is the line through a.
• What is the rank of P?
– one
• Column times row is a rank one matrix: every column of the product is a multiple of the column vector, so the column vector is a basis for its column space.
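$a = \begin{bmatrix} 3 \\ 4 \\ 5 \end{bmatrix}, \quad b^T = \begin{bmatrix} 2 & 3 \end{bmatrix}, \quad ab^T = \begin{bmatrix} 6 & 9 \\ 8 & 12 \\ 10 & 15 \end{bmatrix}$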
• P is symmetric. Show me why?
• What happens if I do the projection twice? i.e. I multiply by P and then by P again (P × P = $P^2$).
$P = \frac{aa^T}{a^Ta}$
$P^2 = \frac{aa^T}{a^Ta}\,\frac{aa^T}{a^Ta} = \frac{a\,(a^Ta)\,a^T}{(a^Ta)(a^Ta)} = \frac{aa^T}{a^Ta} = P$
• So if I project b, and then do the projection again, I what?
– stay put
– So $P^2 = P$ … Projection matrix is idempotent.
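The two properties of the line projection matrix, symmetry and idempotence, can be checked directly. A minimal sketch, assuming the column vector a = (3, 4, 5) purely as an example; the helper names are invented for this illustration.

```python
from fractions import Fraction

def line_projection_matrix(a):
    """P = a a^T / (a^T a), the projection onto the line through a."""
    aTa = sum(ai * ai for ai in a)
    return [[Fraction(ai * aj, aTa) for aj in a] for ai in a]

def matmul(X, Y):
    """Matrix product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

a = [3, 4, 5]  # example vector
P = line_projection_matrix(a)
print(P == transpose(P))   # True: P is symmetric
print(matmul(P, P) == P)   # True: projecting twice changes nothing
```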
(Figure: vector b, its projection p = xa = Pb onto the line through a, and the error e = b - p.)
• Summary: if I want to project on a line, there are three formulas to remember:
$x = \frac{a^Tb}{a^Ta}, \quad p = xa, \quad P = \frac{aa^T}{a^Ta}$
• And properties of P:
– $P = P^T$, $P = P^2$
More dimensions
• Three formulas again, but different, we won’t have a single line, but a plane, 3D or nD subspace.
• You may be asking why I actually project?
– Because Ax = b may have no solution
– I am given a problem with more equations than unknowns, I can’t solve it.
– The problem is that Ax is in the column space, but b does not have to be.
– So I change vector b into the closest vector in the column space of A.
– So I solve $A\hat{x} = p$ instead !!
– p is the projection of b onto the column space
– I should indicate somehow that I am not looking for x from Ax = b (x, which actually does not exist), but for $\hat{x}$ that’s the best possible.
• I must figure out what’s the good projection here. What's the good RHS that is in the column space and that's as close as possible to b.
• Let’s move into 3D space, where I have a vector b I want to project into a plane (i.e. subspace of 3D space)
(Figure: vector b above the plane spanned by a1 and a2, its projection p in the plane, and the error e = b - p.)
This is the plane of a1 and a2. This plane is the column space of matrix $A = [\,a_1 \mid a_2\,]$.
e = b - p, and e is perpendicular to the plane.
Apparently, projection p is some combination of the basis vectors.
$p = \hat{x}_1a_1 + \hat{x}_2a_2 = A\hat{x}$, and I am looking for $\hat{x}$.
So now I've got hold of the problem. The problem is to find the right combination of the columns so that the error vector $(b - A\hat{x})$ is perpendicular to the plane.
• I write again the main point
– Projection is $p = A\hat{x}$
– Problem is to find $\hat{x}$
– Key is that $e = b - A\hat{x}$ is perpendicular to the plane
• So I am looking for two equations, because I have $\hat{x}_1$ and $\hat{x}_2$.
• And e is perpendicular to the plane, so it means it must be perpendicular to each vector in the plane. It must be perpendicular to a1 and a2 !!
• So which two eqs. do I have? Help me.
$a_1^T(b - A\hat{x}) = 0, \quad a_2^T(b - A\hat{x}) = 0$
$A^T(b - A\hat{x}) = 0$
(Figure: vector b, the plane of a1 and a2, the projection p and the error e = b - p.)
A word about subspaces.
• In what subspace lies $(b - A\hat{x})$?
– Well, this is actually vector e, so I have $A^Te = 0$. Thus in which space is e?
– In $N(A^T)$!
• And from the last lecture, what do we know about $N(A^T)$?
– It is perpendicular to $C(A)$.
e is in $N(A^T)$, e is ┴ to $C(A)$
It perfectly holds. We all are happy, aren’t we?
• OK, we’ve got the equation, let’s solve it.
$A^T(b - A\hat{x}) = 0 \;\rightarrow\; A^TA\hat{x} = A^Tb$ (normal equations)
• $A^TA$ is an n by n matrix.
• As in the line case, we must get answers to three questions:
1. What is $\hat{x}$?
2. What is projection p?
3. What is projection matrix P?
$A^TA\hat{x} = A^Tb$
• $\hat{x}$ is what? Help me.
$\hat{x} = (A^TA)^{-1}A^Tb$
• What is the projection $p = A\hat{x}$?
$p = A(A^TA)^{-1}A^Tb$
• What’s the projection matrix in p = Pb?
$P = A(A^TA)^{-1}A^T$ (projection matrix P)
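The three answers can be computed for a small case. This is a sketch under stated assumptions: A is the 3 x 2 example with columns (1,1,1) and (1,2,5) used earlier in the lecture, the right-hand side b = (1, 2, 3) is a hypothetical choice made here, and `solve` is an illustrative Gauss-Jordan helper, not a routine from the lecture.

```python
from fractions import Fraction

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def solve(M, rhs):
    """Solve M x = rhs by Gauss-Jordan elimination; M must be square and invertible."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]  # scale pivot to 1
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

A = [[1, 1], [1, 2], [1, 5]]
b = [1, 2, 3]  # hypothetical right-hand side, not from the lecture

At = transpose(A)
AtA = matmul(At, A)
Atb = [sum(x * y for x, y in zip(row, b)) for row in At]

x_hat = solve(AtA, Atb)                                       # normal equations
p = [sum(aij * xj for aij, xj in zip(row, x_hat)) for row in A]  # p = A x_hat
e = [bi - pi for bi, pi in zip(b, p)]                         # e = b - p
At_e = [sum(x * y for x, y in zip(row, e)) for row in At]     # should be zero

print(x_hat)
print(At_e)
```

Solving the normal equations directly avoids forming the inverse of $A^TA$ explicitly; the result is the same $\hat{x}$.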
$P = A(A^TA)^{-1}A^T = AA^{-1}(A^T)^{-1}A^T = I$ … what?
• can I do this?
• Apparently not, but why not? What did I do wrong?
• A is not a square matrix, it does not have an inverse.
• Of course, this formula works well also if A was a square invertible n x n matrix.
– Then its column space is the whole what?
• $R^n$
– Then b is already in the whole $R^n$ space, I am projecting b there, so P = I.
• Also $P = P^T$, and $P = P^2$ holds. Prove $P^2 = P$!
• So we have all the formulas
$\hat{x} = (A^TA)^{-1}A^Tb, \quad p = A(A^TA)^{-1}A^Tb, \quad P = A(A^TA)^{-1}A^T$
• And when will I use these equations? If I have more equations (measurements) than unknowns.
• Least squares, fitting by a line.
Moore-Penrose Pseudoinverse
$x = A^{-1}b$
$\hat{x} = (A^TA)^{-1}A^Tb$
$A^+ = (A^TA)^{-1}A^T$ (for A with independent columns)
Least Squares Calculation
based on excellent video lectures by Gilbert Strang, MIT, http://ocw.mit.edu/OcwWeb/Mathematics/18-06Spring-2005/VideoLectures/detail/lecture16.htm, Lecture 16
Projection matrix recap
• Projection matrix $P = A(A^TA)^{-1}A^T$ projects vector b to the nearest point in the column space (i.e. Pb).
• Let’s have a look at two extreme cases:
1. If b is in the column space, then Pb = b. Why?
– What does it mean that b is in the column space of A?
– b is a linear combination of columns of A, i.e. b is in the form Ax.
– so $Pb = PAx = A(A^TA)^{-1}A^TAx = Ax = b$
2. If b is ┴ to the column space of A then Pb = 0. Why?
– What vectors are perpendicular to the column space?
– Vectors in $N(A^T)$
– $Pb = A(A^TA)^{-1}A^Tb = 0$, because $A^Tb = 0$
(Figure: vector b split into p in $C(A)$ and e in $N(A^T)$.)
p + e = b
p = Pb → b – e = Pb → e = (I - P)b
That’s the projection too. Projection onto the ┴ space.
When P projects onto one subspace, I – P projects onto the perpendicular subspace
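The split b = p + e with the two projections P and I - P can be checked on a small case. A sketch only: it assumes A with columns (1,1,1) and (1,2,3) and b = (1,2,2), the fitting example of this lecture, and the helper names (`inverse_2x2`, `Q` for I - P) are made up for the illustration.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def inverse_2x2(M):
    """Exact inverse of a 2 x 2 integer matrix."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    if det == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(M[1][1], det), Fraction(-M[0][1], det)],
            [Fraction(-M[1][0], det), Fraction(M[0][0], det)]]

A = [[1, 1], [1, 2], [1, 3]]
At = transpose(A)
P = matmul(matmul(A, inverse_2x2(matmul(At, A))), At)   # P = A (A^T A)^-1 A^T
I = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]
Q = [[I[i][j] - P[i][j] for j in range(3)] for i in range(3)]  # Q = I - P

b = [[1], [2], [2]]   # column vector
p = matmul(P, b)      # part of b in C(A)
e = matmul(Q, b)      # part of b in N(A^T)

print(p)
print(e)
print(matmul(Q, Q) == Q)   # True: I - P is a projection as well
```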
points (1,1) (2,2) (3,2)
(Points in the picture are shifted for better readability.)
OK, I want to find a matrix A, once we have A, we can do all we need.
I am looking for the best line (smallest overall error) y = a + bx, meaning I am looking for a, b.
Equations: a + b = 1, a + 2b = 2, a + 3b = 2
$\begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}$, i.e. Ax = b … this eq. can’t be solved
$A^TA\hat{x} = A^Tb$ … but this can
(Figure: the three points in the x–y plane.)
• In other words, the best solution is the line with the smallest errors over all points.
• So I want to minimize the length |Ax – b|, which is the error |e|. Actually I want to minimize the never-negative quantity $|Ax - b|^2$.
(Figure: the points b1, b2, b3, the corresponding points p1, p2, p3 on the line, and the vertical errors e1, e2, e3; axes x and y.)
so the overall error is the sum of squares $|e_1|^2 + |e_2|^2 + |e_3|^2$
What are those p1, p2, p3? If I put them in the equations
a + b = p1
a + 2b = p2
a + 3b = p3
I can solve them. Vector [p1, p2, p3] is in the column space.
Least squares – traditional way
• least squares problem (method of least squares) … the sum of the squares of the errors is minimized
points (x,y) : (1,1) (2,2) (3,2)
I am looking for a line: a + bx = y
Equations: a + b = 1, a + 2b = 2, a + 3b = 2
(Figure: the points b1, b2, b3, the fitted line with the points p1, p2, p3 on it, and the vertical errors e1, e2, e3; axes x and y.)
- So if there is a solution, each point lies on that line: a + b = 1, a + 2b = 2, a + 3b = 2
- However, there is apparently no solution, no line on which all three points lie.
- The optimal line a + bx will go somewhere between the points. Thus for each point, there will be some error (i.e. the y value of the line at that point will differ from the required y value)
- Therefore, the errors are: e1 = 1 - (a + b), e2 = 2 - (a + 2b), e3 = 2 - (a + 3b)
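One way to carry the traditional approach through is to set the partial derivatives of the sum of squared errors to zero. The derivation below is a sketch; the name S(a, b) for that sum is an assumption introduced here. The sign convention of the errors does not matter, because each error is squared.

```latex
\begin{aligned}
S(a,b) &= (a + b - 1)^2 + (a + 2b - 2)^2 + (a + 3b - 2)^2 \\
\frac{\partial S}{\partial a} &= 2\,\bigl[(a + b - 1) + (a + 2b - 2) + (a + 3b - 2)\bigr] = 2\,(3a + 6b - 5) = 0 \\
\frac{\partial S}{\partial b} &= 2\,\bigl[(a + b - 1) + 2(a + 2b - 2) + 3(a + 3b - 2)\bigr] = 2\,(6a + 14b - 11) = 0
\end{aligned}
```

This gives the system 3a + 6b = 5, 6a + 14b = 11, the same two equations that the linear algebra way produces.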
Least squares – linear algebra way
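(Figure: vector b split into p in $C(A)$ and e in $N(A^T)$.)
$A = \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix}, \quad b = \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}, \quad p = \begin{bmatrix} p_1 \\ p_2 \\ p_3 \end{bmatrix}$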
• And now computation
• Task: find p and $\hat{x}$ = [a b]
• Let’s solve that equation for $\hat{x}$: $A^TA\hat{x} = A^Tb$
• Help me, what is $A^TA$?
$A^TA = \begin{bmatrix} 3 & 6 \\ 6 & 14 \end{bmatrix}$
• And what is $A^Tb$?
$A^Tb = \begin{bmatrix} 5 \\ 11 \end{bmatrix}$
• So I have to solve (Gauss elimination) a system of linear equations 3a + 6b = 5, 6a + 14b = 11
a = 2/3, b = 1/2
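• best line: y = 2/3 + 1/2 x
• What is p1?
– The value at x = 1 … 7/6
• And e1?
– 1 - p1 = -1/6
• p2 = 5/3, e2 = +2/6, p3 = 13/6, e3 = -1/6
• So we have projection vector p, and error vector e
points (1,1) (2,2) (3,2)
$b = p + e: \quad \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix} = \begin{bmatrix} 7/6 \\ 5/3 \\ 13/6 \end{bmatrix} + \begin{bmatrix} -1/6 \\ 2/6 \\ -1/6 \end{bmatrix}$
Yes, that’s right!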
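The whole computation for the three points can be replayed in a few lines. A minimal sketch using exact fractions; the variable names `intercept` and `slope` (for a and b of the line y = a + bx) are chosen here to avoid a clash with the right-hand side vector, and Cramer's rule stands in for the Gauss elimination mentioned on the slide.

```python
from fractions import Fraction

xs = [1, 2, 3]
ys = [1, 2, 2]                  # right-hand side vector
A = [[1, x] for x in xs]        # columns (1,1,1) and (1,2,3)

# Normal equations: (A^T A) x_hat = A^T b
AtA = [[sum(r[i] * r[j] for r in A) for j in range(2)] for i in range(2)]
Atb = [sum(r[i] * y for r, y in zip(A, ys)) for i in range(2)]

# Cramer's rule for the 2 x 2 system (swapped in for elimination).
det = AtA[0][0] * AtA[1][1] - AtA[0][1] * AtA[1][0]
intercept = Fraction(Atb[0] * AtA[1][1] - AtA[0][1] * Atb[1], det)
slope = Fraction(AtA[0][0] * Atb[1] - AtA[1][0] * Atb[0], det)

p = [intercept + slope * x for x in xs]   # points on the best line
e = [y - pi for y, pi in zip(ys, p)]      # errors e = b - p

print(AtA, Atb)          # [[3, 6], [6, 14]] [5, 11]
print(intercept, slope)  # 2/3 1/2
print(p)
print(e)
```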
• p and e should be perpendicular. Verify that.
• However, e is perpendicular not only to p. Give me another vector that e is perpendicular to.
– Well, e is perpendicular to the column space, so?
– It must be perpendicular to the columns of matrix A, i.e. to [1 1 1] and [1 2 3]
• Just again, fitting by a straight line means solving the key equation
$A^TA\hat{x} = A^Tb, \quad p = A\hat{x}$
But A must have independent columns, then $A^TA$ is invertible.
If not, oops, sorry, I am out of luck
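The perpendicularity checks asked for above take only a few lines. A sketch that assumes the values p = (7/6, 5/3, 13/6) and e = (-1/6, 2/6, -1/6) from the worked example; `dot` is an illustrative helper.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product u^T v."""
    return sum(x * y for x, y in zip(u, v))

p = [Fraction(7, 6), Fraction(5, 3), Fraction(13, 6)]
e = [Fraction(-1, 6), Fraction(2, 6), Fraction(-1, 6)]

print(dot(p, e))          # 0: e is perpendicular to p
print(dot([1, 1, 1], e))  # 0: e is perpendicular to the first column of A
print(dot([1, 2, 3], e))  # 0: e is perpendicular to the second column of A
```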