TRANSCRIPT
Lesson 3
CSPP58001
Matrices
• Reminder: matrices operate on vectors and transform them into new vectors: Av ⇒ v′
• This is similar to how a regular scalar function f(x) operates on a scalar value
• The inverse A⁻¹ is defined such that A⁻¹A = I
• A is singular if no inverse exists; then det(A) = 0, and there exists an infinite family of non-zero vectors that A transforms into the null space: Av ⇒ 0
Geometric interpretation
• We can visualize a linear transformation via a simple 2-D example.
• Consider the vector x = [1; 1]. This can be visualized as a point in a 2-D plane with respect to the basis vectors [1; 0] and [0; 1].

    Ax = [ 1  1 ] [ 1 ]  =  [ 2 ]
         [ 1 −1 ] [ 1 ]     [ 0 ]

  i.e. the point (1,1) is mapped to (2,0).
• Note that general transformations will not preserve orthogonality of axes.

(Figure: the basis vectors (1,0) and (0,1), the input point (1,1), the row vectors (1,1) and (1,−1), and the image point (2,0); labeled "Rotation matrix".)
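The example above can be checked numerically; a minimal sketch using plain Python lists:

```python
def matvec(A, x):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[1, 1],
     [1, -1]]
x = [1, 1]
print(matvec(A, x))  # the point (1,1) is transformed to (2,0)
```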
Interpretation as projection
• Think of the rows of A as the new axes onto which x is projected. If the rows are normalized this is literally the case:

    Ax = [ 1/√2  1/√2 ] [ 1/√2 ]  =  [ 1 ]
         [ 1/√2 −1/√2 ] [ 1/√2 ]     [ 0 ]

• Norm of a vector x: norm(x) = √(xᵀx)
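The normalized version and the norm formula can be sketched the same way (standard library only):

```python
import math

def norm(x):
    # norm(x) = sqrt(x^T x)
    return math.sqrt(sum(xi * xi for xi in x))

s = 1 / math.sqrt(2)
A = [[s, s],
     [s, -s]]   # rows are orthonormal axes
x = [s, s]      # the unit vector along (1,1)
Ax = [sum(a * xi for a, xi in zip(row, x)) for row in A]
print(Ax)       # ≈ [1.0, 0.0]: x projected onto the new axes
print(norm(x))  # ≈ 1.0
```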
Linear Systems
Reminder:

    a11x1 + a12x2 + a13x3 = b1
    a21x1 + a22x2 + a23x3 = b2
    a31x1 + a32x2 + a33x3 = b3

or, in matrix form:

    [ a11 a12 a13 ] [ x1 ]   [ b1 ]
    [ a21 a22 a23 ] [ x2 ] = [ b2 ]
    [ a31 a32 a33 ] [ x3 ]   [ b3 ]
Linear Systems
• Solve Ax = b, where A is an n×n matrix and b is an n×1 column vector
• Can also talk about non-square systems where A is m×n, b is m×1, and x is n×1
  – Overdetermined if m > n: "more equations than unknowns"
  – Underdetermined if n > m: "more unknowns than equations"
  – Can look for the best solution using least squares
Singular Systems
• A is singular if some row is a linear combination of the other rows
• Singular systems can be underdetermined:

    2x1 + 3x2 = 5
    4x1 + 6x2 = 10

  or inconsistent:

    2x1 + 3x2 = 5
    4x1 + 6x2 = 11

• Recall that, though A is not invertible, elimination can still give us the family of solutions in the underdetermined case
• In the inconsistent case, elimination will reveal the inconsistency
Inverting a Matrix
• Usually not a good idea to compute x = A⁻¹b
  – Inefficient
  – Prone to roundoff error
• In fact, compute the inverse using a linear solver
  – Solve Axᵢ = bᵢ, where the bᵢ are the columns of the identity and the xᵢ are the columns of the inverse
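The column-by-column idea can be sketched with NumPy (assuming NumPy is available; `np.linalg.solve` plays the role of the linear solver and accepts all right-hand sides at once):

```python
import numpy as np

A = np.array([[2.0, 3.0],
              [4.0, 5.0]])
n = A.shape[0]
I = np.eye(n)
# Solve A x_i = b_i where the b_i are the columns of the identity;
# the columns of the result are the columns of the inverse.
A_inv = np.linalg.solve(A, I)
print(np.allclose(A @ A_inv, I))  # True
```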
Gauss-Jordan Elimination
• Fundamental operations:
  1. Replace one equation with a linear combination of other equations
  2. Interchange two equations
  3. Re-label two variables
• Combine to reduce to a trivial system
• The simplest variant only uses #1 operations, but better stability is obtained by adding #2 (partial pivoting) or #2 and #3 (full pivoting)
Gauss-Jordan Elimination
• Solve:

    2x1 + 3x2 = 7
    4x1 + 5x2 = 13

• Only care about numbers – form "tableau" or "augmented matrix":

    [ 2 3 |  7 ]
    [ 4 5 | 13 ]
Gauss-Jordan Elimination
• Given:

    [ 2 3 |  7 ]
    [ 4 5 | 13 ]

• Goal: reduce this to a trivial system

    [ 1 0 | ? ]
    [ 0 1 | ? ]

  and read off the answer from the right column
Gauss-Jordan Elimination
• Basic operation 1: replace any row by a linear combination with any other row
• Here, replace row1 with 1/2 · row1 + 0 · row2:

    [ 2 3 |  7 ]      [ 1 3/2 | 7/2 ]
    [ 4 5 | 13 ]  →   [ 4  5  | 13  ]
Gauss-Jordan Elimination
• Replace row2 with row2 − 4 · row1:

    [ 1 3/2 | 7/2 ]
    [ 0 −1  | −1  ]

• Negate row2:

    [ 1 3/2 | 7/2 ]
    [ 0  1  |  1  ]
Gauss-Jordan Elimination
• Replace row1 with row1 − 3/2 · row2:

    [ 1 0 | 2 ]
    [ 0 1 | 1 ]

• Read off the solution: x1 = 2, x2 = 1
Gauss-Jordan Elimination
• For each row i:
  – Multiply row i by 1/aii
  – For each other row j:
    • Add −aji times row i to row j
• At the end, the left part of the matrix is the identity and the answer is in the right part
• Can solve any number of R.H.S. simultaneously
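The loop structure above can be sketched directly on the augmented matrix (a minimal implementation without pivoting, so it assumes the diagonal stays non-zero):

```python
def gauss_jordan(aug):
    """Reduce an augmented matrix [A | b] (list of rows) and return the solution.
    No pivoting: assumes every diagonal element encountered is non-zero."""
    n = len(aug)
    for i in range(n):
        # Multiply row i by 1/a_ii
        pivot = aug[i][i]
        aug[i] = [v / pivot for v in aug[i]]
        # For each other row j, add -a_ji times row i to row j
        for j in range(n):
            if j != i:
                factor = aug[j][i]
                aug[j] = [vj - factor * vi for vj, vi in zip(aug[j], aug[i])]
    return [row[n] for row in aug]

# The worked example from the slides: 2x1 + 3x2 = 7, 4x1 + 5x2 = 13
print(gauss_jordan([[2.0, 3.0, 7.0],
                    [4.0, 5.0, 13.0]]))  # [2.0, 1.0]
```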
Pivoting
• Consider this system:

    [ 0 1 | 2 ]
    [ 2 3 | 8 ]

• Immediately run into a problem: the algorithm wants us to divide by zero!
• More subtle version:

    [ 0.001 1 | 2 ]
    [ 2     3 | 8 ]
Pivoting
• Conclusion: small diagonal elements are bad
• Remedy: swap in a larger element from somewhere else
Partial Pivoting
• Swap rows 1 and 2:

    [ 2 3 | 8 ]
    [ 0 1 | 2 ]

• Now continue:

    [ 1 3/2 | 4 ]      [ 1 0 | 1 ]
    [ 0  1  | 2 ]  →   [ 0 1 | 2 ]
Full Pivoting
• Swap the largest element onto the diagonal by swapping rows 1 and 2 and columns 1 and 2:

    [ 0 1 | 2 ]      [ 3 2 | 8 ]
    [ 2 3 | 8 ]  →   [ 1 0 | 2 ]

• Critical: when swapping columns, must remember to swap results!
Full Pivoting
• Full pivoting is more stable, but only slightly

    [ 3 2 | 8 ]      [ 1  2/3 |  8/3 ]      [ 1 0 | 2 ]
    [ 1 0 | 2 ]  →   [ 0 −2/3 | −2/3 ]  →   [ 0 1 | 1 ]

• Swap results 1 and 2: x1 = 1, x2 = 2
Operation Count
• For one R.H.S., how many operations?
• For each of n rows:
  – Do n times:
    • For each of n+1 columns: one add, one multiply
• Total = n³ + n² multiplies, same number of adds
• Asymptotic behavior: when n is large, dominated by n³
Faster Algorithms
• Our goal is an algorithm that does this in (1/3)n³ operations and does not require all R.H.S. to be known at the beginning
• Before we see that, let's look at a few special cases that are even faster
Tridiagonal Systems
• Common special case:

    [ a11 a12  0   0  ] [ x1 ]   [ b1 ]
    [ a21 a22 a23  0  ] [ x2 ]   [ b2 ]
    [  0  a32 a33 a34 ] [ x3 ] = [ b3 ]
    [  0   0  a43 a44 ] [ x4 ]   [ b4 ]

• Only the main diagonal + 1 above and 1 below are non-zero
Solving Tridiagonal Systems
• When solving using Gauss-Jordan:
  – Constant number of multiplies/adds in each row
  – Each row only affects the 2 neighboring rows

    [ a11 a12  0   0  ] [ x1 ]   [ b1 ]
    [ a21 a22 a23  0  ] [ x2 ]   [ b2 ]
    [  0  a32 a33 a34 ] [ x3 ] = [ b3 ]
    [  0   0  a43 a44 ] [ x4 ]   [ b4 ]
Running Time
• 2n loops, 4 multiply/adds per loop (assuming correct bookkeeping)
• This running time has a fundamentally different dependence on n: linear instead of cubic
  – Can say that the tridiagonal algorithm is O(n) while Gauss-Jordan is O(n³)
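The O(n) tridiagonal solver can be sketched as the classic Thomas algorithm (a forward sweep followed by back substitution); `lower`, `diag`, and `upper` hold the three bands:

```python
def solve_tridiagonal(lower, diag, upper, b):
    """Solve a tridiagonal system in O(n) (the Thomas algorithm).
    lower: sub-diagonal (length n-1), diag: main diagonal (length n),
    upper: super-diagonal (length n-1), b: right-hand side (length n)."""
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = b[0] / diag[0]
    # Forward sweep: eliminate the sub-diagonal, constant work per row.
    for i in range(1, n):
        m = diag[i] - lower[i - 1] * c[i - 1]
        c[i] = upper[i] / m if i < n - 1 else 0.0
        d[i] = (b[i] - lower[i - 1] * d[i - 1]) / m
    # Back substitution.
    x = d[:]
    for i in range(n - 2, -1, -1):
        x[i] -= c[i] * x[i + 1]
    return x

# [2 1 0; 1 2 1; 0 1 2] x = [3; 4; 3] has solution x = (1, 1, 1)
print(solve_tridiagonal([1.0, 1.0], [2.0, 2.0, 2.0], [1.0, 1.0], [3.0, 4.0, 3.0]))
```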
Big-O Notation
• Informally, O(n³) means that the dominant term for large n is cubic
• More precisely, there exist a c and n₀ such that running time ≤ c·n³ if n > n₀
• This type of asymptotic analysis is often used to characterize different algorithms
Triangular Systems
• Another special case: A is lower triangular

    [ a11  0   0   0  ] [ x1 ]   [ b1 ]
    [ a21 a22  0   0  ] [ x2 ]   [ b2 ]
    [ a31 a32 a33  0  ] [ x3 ] = [ b3 ]
    [ a41 a42 a43 a44 ] [ x4 ]   [ b4 ]
Triangular Systems
• Solve by forward substitution:

    [ a11  0   0   0  ] [ x1 ]   [ b1 ]
    [ a21 a22  0   0  ] [ x2 ]   [ b2 ]
    [ a31 a32 a33  0  ] [ x3 ] = [ b3 ]
    [ a41 a42 a43 a44 ] [ x4 ]   [ b4 ]

    x1 = b1 / a11
Triangular Systems
• Solve by forward substitution:

    x2 = (b2 − a21x1) / a22
Triangular Systems
• Solve by forward substitution:

    x3 = (b3 − a31x1 − a32x2) / a33
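The pattern above generalizes directly; a minimal sketch:

```python
def forward_substitute(L, b):
    """Solve L x = b for lower-triangular L by forward substitution, O(n^2)."""
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        # x_i = (b_i - sum of a_ik * x_k for k < i) / a_ii
        s = sum(L[i][k] * x[k] for k in range(i))
        x[i] = (b[i] - s) / L[i][i]
    return x

print(forward_substitute([[2.0, 0.0],
                          [1.0, 3.0]], [4.0, 8.0]))  # [2.0, 2.0]
```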
Triangular Systems
• If A is upper triangular, solve by back substitution:

    [ a11 a12 a13 a14 a15 ] [ x1 ]   [ b1 ]
    [  0  a22 a23 a24 a25 ] [ x2 ]   [ b2 ]
    [  0   0  a33 a34 a35 ] [ x3 ] = [ b3 ]
    [  0   0   0  a44 a45 ] [ x4 ]   [ b4 ]
    [  0   0   0   0  a55 ] [ x5 ]   [ b5 ]

    x5 = b5 / a55
Triangular Systems
• If A is upper triangular, solve by back substitution:

    x4 = (b4 − a45x5) / a44
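The mirror image of forward substitution; a minimal sketch:

```python
def back_substitute(U, b):
    """Solve U x = b for upper-triangular U by back substitution, O(n^2)."""
    n = len(b)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        # x_i = (b_i - sum of a_ik * x_k for k > i) / a_ii
        s = sum(U[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (b[i] - s) / U[i][i]
    return x

print(back_substitute([[2.0, 1.0],
                       [0.0, 3.0]], [7.0, 6.0]))  # [2.5, 2.0]
```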
Triangular Systems
• Both of these special cases can be solved in O(n²) time
• This motivates a factorization approach to solving arbitrary systems:
  – Find a way of writing A as LU, where L and U are both triangular
  – Ax = b ⇒ LUx = b ⇒ Ly = b, Ux = y
  – Time for factoring the matrix dominates the computation
Determinant
• If A is singular, it has no inverse
• If A is singular, its determinant det(A) = 0

    det[ a11 a12 ] ≡ a11a22 − a12a21
       [ a21 a22 ]

    det[ a11 a12 a13 ]
       [ a21 a22 a23 ] ≡ a11a22a33 + a12a23a31 + a13a21a32 − a11a23a32 − a12a21a33 − a13a22a31
       [ a31 a32 a33 ]

• For arbitrary-size matrices one can use the Leibniz or Laplace formula; these are rarely used literally – many better techniques exist
Eigenvalues/vectors
• Definition
  – Let A be an n×n matrix. The number λ is an eigenvalue of A if there exists a non-zero vector v such that Av = λv
  – The vector(s) v are referred to as eigenvectors of A
  – Eigens have a number of incredibly important properties; needing to compute them is very common
Computing eigens
• Rewrite Av = λv as (A − λI)v = 0
• The above can only be true for a nonzero vector v if (A − λI) is singular (not invertible). Thus its determinant must vanish: det(A − λI) = 0
Example
• Find the eigenvalues/eigenvectors of the linear transformation

    A = [  2 −4 ]
        [ −1 −1 ]

• Step 1: solve the characteristic polynomial formed by det(A − λI) = 0
• Step 2: plug each eigenvalue λ into (A − λI)v = 0 and solve the linear system for v
• Check in Matlab: [v, l] = eig(A);
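The Matlab check translates to NumPy (assuming NumPy is available; `np.linalg.eig` is the analogue of Matlab's `eig`):

```python
import numpy as np

A = np.array([[2.0, -4.0],
              [-1.0, -1.0]])
# det(A - lambda I) = lambda^2 - lambda - 6 = (lambda - 3)(lambda + 2)
vals, vecs = np.linalg.eig(A)
print(sorted(vals))  # eigenvalues ≈ [-2.0, 3.0]
# Each column of vecs satisfies A v = lambda v:
for lam, v in zip(vals, vecs.T):
    assert np.allclose(A @ v, lam * v)
```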
Many important properties
• Sum of eigenvalues = tr(A), where tr(A) ≡ Σi=1..n Aii
• Product of eigenvalues = det(A)
• Eigenvalues of a triangular matrix are on the diagonal
Cholesky Decomposition
• For symmetric matrices, choose U = Lᵀ
• Perform the decomposition:

    [ a11 a12 a13 ]   [ l11  0   0  ] [ l11 l21 l31 ]
    [ a12 a22 a23 ] = [ l21 l22  0  ] [  0  l22 l32 ]
    [ a13 a23 a33 ]   [ l31 l32 l33 ] [  0   0  l33 ]

• Ax = b ⇒ LLᵀx = b ⇒ Ly = b, Lᵀx = y
Cholesky Decomposition
Multiplying out the factorization element by element gives the entries of L:

    l11 = √a11
    l21 = a21 / l11        l31 = a31 / l11
    l22 = √(a22 − l21²)
    l32 = (a32 − l31l21) / l22
    l33 = √(a33 − l31² − l32²)
Cholesky Decomposition
In general:

    lii = √(aii − Σk=1..i−1 lik²)
    lji = (aji − Σk=1..i−1 ljk·lik) / lii     (for j > i)
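The two formulas above translate directly into code (a sketch; no pivoting is needed for symmetric positive definite matrices):

```python
import math

def cholesky(A):
    """Return lower-triangular L with A = L L^T (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # l_ii = sqrt(a_ii - sum_{k<i} l_ik^2)
        L[i][i] = math.sqrt(A[i][i] - sum(L[i][k] ** 2 for k in range(i)))
        for j in range(i + 1, n):
            # l_ji = (a_ji - sum_{k<i} l_jk * l_ik) / l_ii
            L[j][i] = (A[j][i] - sum(L[j][k] * L[i][k] for k in range(i))) / L[i][i]
    return L

print(cholesky([[4.0, 2.0],
                [2.0, 5.0]]))  # [[2.0, 0.0], [1.0, 2.0]]
```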
Cholesky Decomposition
• This fails if it requires taking the square root of a negative number
• Need another condition on A: positive definite – for any v ≠ 0, vᵀAv > 0
  (equivalently, all eigenvalues positive)
Cholesky Decomposition
• Running time turns out to be (1/6)n³
  – Still cubic, but a much lower constant
• Result: this is the preferred method for solving symmetric positive definite systems
LU Decomposition
• Again, factor A into LU, where L is lower triangular and U is upper triangular
• Ax = b ⇒ LUx = b ⇒ Ly = b, Ux = y
• The last 2 steps take O(n²) time, so the total time is dominated by the decomposition
Crout’s Method
• Writing A = LU:

    [ a11 a12 a13 ]   [ l11  0   0  ] [ u11 u12 u13 ]
    [ a21 a22 a23 ] = [ l21 l22  0  ] [  0  u22 u23 ]
    [ a31 a32 a33 ]   [ l31 l32 l33 ] [  0   0  u33 ]

  More unknowns than equations!
• Let all lii = 1
Crout’s Method
    [ a11 a12 a13 ]   [  1   0  0 ] [ u11 u12 u13 ]
    [ a21 a22 a23 ] = [ l21  1  0 ] [  0  u22 u23 ]
    [ a31 a32 a33 ]   [ l31 l32 1 ] [  0   0  u33 ]

Multiplying out gives the unknowns, one at a time:

    u11 = a11              u12 = a12
    l21 = a21 / u11        l31 = a31 / u11
    u22 = a22 − l21u12
    l32 = (a32 − l31u12) / u22
Crout’s Method
• For i = 1..n:
  – For j = 1..i:

      uji = aji − Σk=1..j−1 ljk·uki

  – For j = i+1..n:

      lji = (aji − Σk=1..i−1 ljk·uki) / uii
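The two update rules above can be sketched as follows (without pivoting, which a production implementation would add):

```python
def crout(A):
    """Factor A = L U with unit diagonal on L (as in the slides; no pivoting)."""
    n = len(A)
    L = [[float(i == j) for j in range(n)] for i in range(n)]  # 1s on diagonal
    U = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # u_ji = a_ji - sum_{k<j} l_jk * u_ki
            U[j][i] = A[j][i] - sum(L[j][k] * U[k][i] for k in range(j))
        for j in range(i + 1, n):
            # l_ji = (a_ji - sum_{k<i} l_jk * u_ki) / u_ii
            L[j][i] = (A[j][i] - sum(L[j][k] * U[k][i] for k in range(i))) / U[i][i]
    return L, U

L, U = crout([[2.0, 3.0],
              [4.0, 5.0]])
print(L)  # [[1.0, 0.0], [2.0, 1.0]]
print(U)  # [[2.0, 3.0], [0.0, -1.0]]
```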
Crout’s Method
• Interesting note: # of outputs = # of inputs, and the algorithm only refers to elements not yet output
  – Can do this in-place!
  – Algorithm replaces A with a matrix of l and u values; the 1s on the diagonal of L are implied:

      [ u11 u12 u13 ]
      [ l21 u22 u23 ]
      [ l31 l32 u33 ]

  – The resulting matrix must be interpreted in a special way: it is not a regular matrix
  – Can rewrite the forward/backsubstitution routines to use this "packed" l-u matrix
LU Decomposition
• Running time is (1/3)n³
  – Only a factor of 2 slower than the symmetric (Cholesky) case
  – This is the preferred general method for solving linear equations
• Pivoting is very important
  – Partial pivoting is sufficient, and widely implemented