Linear least squares problems
Guangning (Gary) Tan
CAS708/CSE700
McMaster University
October 1, 2015
Review Normal equations Orthogonal transformations
Contents
Review
Normal equations
Orthogonal transformations
This PDF can be accessed at
tgn3000.com ⇒ Teaching ⇒ CAS708/CSE700 Week 4b
2015/10/1 tgn3000.com 2
Review
$$\min_x \|Ax - b\|_2^2$$
- $x \in \mathbb{R}^n$, $A \in \mathbb{R}^{m \times n}$ with $m > n$, hence $A$ is tall
- $A$ has full column rank, $\operatorname{rank}(A) = n$
- $Ax = b$ is an overdetermined system, in general with no exact solution
- Arises in data fitting problems, e.g.,
  - $y = ax^3 + bx^2 + cx + d$
  - $y = ae^{bx}$
  - $y = a + b \log x$
Unconstrained optimization problem
Objective function
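$$f(x) = \|Ax - b\|_2^2 = \sum_{i=1}^{m} \Bigl( b_i - \sum_{j=1}^{n} a_{ij} x_j \Bigr)^2$$
- Convex problem, optimal value (minimum) exists
- Necessary condition
$$\nabla f = \Bigl( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n} \Bigr) = 0^T$$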
Normal equations
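- For $j = 1 : n$,
$$\frac{\partial f}{\partial x_j} = 2 \sum_{i=1}^{m} \Bigl[ \Bigl( b_i - \sum_{k=1}^{n} a_{ik} x_k \Bigr) (-a_{ij}) \Bigr] = 0$$
- Rearranging terms gives the normal equations
$$A^T A x = A^T b$$
- Since $A$ has full column rank, $A^T A$ is SPD; can do Cholesky
- Solution is $x = (A^T A)^{-1} A^T b$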
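The normal-equations route can be sketched in a few lines of plain Python. This is a minimal illustration, not the course's code: the function name `normal_equations_cholesky` and the small test problem are assumptions made here, and the matrix is stored as a list of rows.

```python
import math

def normal_equations_cholesky(A, b):
    """Solve min ||Ax - b||_2 via A^T A x = A^T b with a Cholesky factorization."""
    m, n = len(A), len(A[0])
    # Form G = A^T A (SPD when A has full column rank) and c = A^T b.
    G = [[sum(A[i][p] * A[i][q] for i in range(m)) for q in range(n)] for p in range(n)]
    c = [sum(A[i][p] * b[i] for i in range(m)) for p in range(n)]
    # Cholesky factorization G = L L^T, L lower triangular.
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = G[j][j] - sum(L[j][k] ** 2 for k in range(j))
        L[j][j] = math.sqrt(d)  # d > 0 when G is SPD
        for i in range(j + 1, n):
            L[i][j] = (G[i][j] - sum(L[i][k] * L[j][k] for k in range(j))) / L[j][j]
    # Forward substitution: L y = c.
    y = [0.0] * n
    for i in range(n):
        y[i] = (c[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    # Back substitution: L^T x = y.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

# Hypothetical data lying exactly on the line b = 1 + 2 t, t = 0, 1, 2, 3.
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
b = [1.0, 3.0, 5.0, 7.0]
x = normal_equations_cholesky(A, b)
print(x)  # approximately [1.0, 2.0]
```

Because the data are consistent, the least squares solution reproduces the line's coefficients up to rounding.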
Normal equations (cont.)
$$A^T A x = A^T b, \qquad x = (A^T A)^{-1} A^T b$$
- $A^\dagger = (A^T A)^{-1} A^T$ is the Moore-Penrose inverse of $A$
- Assume $A$ has singular values $\sigma_1 \ge \ldots \ge \sigma_n > 0$
- Condition number $\kappa(A) = \sigma_1 / \sigma_n$
- $\kappa(A^T A) = \kappa(A)^2$ can be large
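The squaring of the condition number can be checked with a short derivation. It assumes a singular value decomposition $A = U \Sigma V^T$ with orthogonal $U$, $V$ and the 2-norm condition number.

```latex
\begin{aligned}
A^T A &= V \Sigma^T U^T U \Sigma V^T = V (\Sigma^T \Sigma) V^T, \\
\Sigma^T \Sigma &= \operatorname{diag}(\sigma_1^2, \ldots, \sigma_n^2), \\
\kappa(A^T A) &= \frac{\sigma_1^2}{\sigma_n^2}
              = \Bigl( \frac{\sigma_1}{\sigma_n} \Bigr)^2 = \kappa(A)^2 .
\end{aligned}
```

For instance, $\kappa(A) = 10^4$ gives $\kappa(A^T A) = 10^8$, which in double precision leaves roughly half of the significant digits.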
Polynomial data fitting
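$$b \approx x_0 + x_1 v + x_2 v^2 + \cdots + x_{n-1} v^{n-1}$$
- Sample points $(v_i, b_i)$, $i = 1 : m$, $m > n$
- Find undetermined coefficients $x_0, \ldots, x_{n-1}$ to fit data
$$\begin{pmatrix} 1 & v_1 & v_1^2 & \cdots & v_1^{n-1} \\ 1 & v_2 & v_2^2 & \cdots & v_2^{n-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & v_m & v_m^2 & \cdots & v_m^{n-1} \end{pmatrix} \begin{pmatrix} x_0 \\ x_1 \\ \vdots \\ x_{n-1} \end{pmatrix} \approx \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}$$
written $Vx \approx b$
- Wish to find $\min_x \|Vx - b\|_2$
- $V$ is a Vandermonde matrix, with nodes or knots $v_1, \ldots, v_m$, assumed real and distinct
- Will see Vandermonde again in polynomial interpolation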
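As an illustration of the polynomial fitting setup, the sketch below builds the Vandermonde matrix and solves the normal equations $V^T V x = V^T b$ by Gaussian elimination. The helper names (`vandermonde`, `solve_linear`, `polyfit_normal_equations`) and the sample data are assumptions made for this example. A QR-based solver would be preferable for ill-conditioned nodes.

```python
def vandermonde(nodes, n):
    """Return the m x n matrix with rows (1, v, v^2, ..., v^(n-1))."""
    return [[v ** j for j in range(n)] for v in nodes]

def solve_linear(G, c):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(G)
    M = [row[:] + [ci] for row, ci in zip(G, c)]  # augmented matrix [G | c]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))  # pivot row
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for q in range(k, n + 1):
                M[r][q] -= f * M[k][q]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][q] * x[q] for q in range(i + 1, n))) / M[i][i]
    return x

def polyfit_normal_equations(nodes, values, n):
    """Least squares coefficients x_0, ..., x_{n-1} from V^T V x = V^T b."""
    V = vandermonde(nodes, n)
    m = len(nodes)
    G = [[sum(V[i][p] * V[i][q] for i in range(m)) for q in range(n)] for p in range(n)]
    c = [sum(V[i][p] * values[i] for i in range(m)) for p in range(n)]
    return solve_linear(G, c)

# Hypothetical samples taken exactly from b = 1 + 2 v + 3 v^2 (m = 5, n = 3).
nodes = [-2.0, -1.0, 0.0, 1.0, 2.0]
values = [1.0 + 2.0 * v + 3.0 * v * v for v in nodes]
coeffs = polyfit_normal_equations(nodes, values, 3)
print(coeffs)  # approximately [1.0, 2.0, 3.0]
```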
Orthogonal transformations
- Singular value decomposition (SVD)
  - [U,S,V] = svd(A)
  - Used when A is (nearly) rank deficient
- QR
  - [Q, R] = qr(A)
  - “Standard” approach
  - More expensive than normal equations when $m \gg n$, but more robust
  - Gram-Schmidt
  - Householder (elementary reflectors)
  - Givens (plane rotations)
Singular value decomposition (SVD)
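$$A = U \Sigma V^T$$
- $A \in \mathbb{R}^{m \times n}$
- $U \in \mathbb{R}^{m \times m}$ and $V \in \mathbb{R}^{n \times n}$ are orthogonal matrices
- $\Sigma \in \mathbb{R}^{m \times n}$ is a diagonal matrix, with singular values $\sigma_i$ on the diagonal, $i = 1 : \min\{m, n\}$
- Generalized inverse, or Moore-Penrose pseudoinverse, of $A$ is $A^\dagger = V \Sigma^\dagger U^T$
- $\Sigma^\dagger$ is the pseudoinverse of $\Sigma$, formed by replacing every nonzero diagonal entry by its reciprocal and transposing the resulting matrix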
SVD (cont.)
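$$A_{m \times n} = U \begin{pmatrix} \operatorname{diag}(\sigma_1, \sigma_2, \ldots, \sigma_k, 0, \ldots, 0) \\ 0_{(m-n) \times n} \end{pmatrix} V^T \quad \text{where } \sigma_1 \ge \ldots \ge \sigma_k > 0$$
$$A^\dagger_{n \times m} = V \begin{pmatrix} \operatorname{diag}(\sigma_1^{-1}, \sigma_2^{-1}, \ldots, \sigma_k^{-1}, 0, \ldots, 0) & 0_{n \times (m-n)} \end{pmatrix} U^T$$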
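Written out column by column, the pseudoinverse gives the least squares solution as a sum over the nonzero singular values. The notation $u_i$, $v_i$ for the columns of $U$ and $V$ is introduced here and does not appear on the slide.

```latex
x = A^\dagger b = V \Sigma^\dagger U^T b
  = \sum_{i=1}^{k} \frac{u_i^T b}{\sigma_i} \, v_i
```

A small $\sigma_i$ amplifies the corresponding component $u_i^T b$, which is why nearly rank-deficient problems are sensitive to perturbations in $b$.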
Gram-Schmidt orthogonalization
Basic idea
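- $a^T b = |a| |b| \cos\theta$
- $(b - c) \perp a$
[Figure: vectors $a$ and $b$ separated by angle $\theta$; $c$ is the component of $b$ along $a$, and $b - c$ is perpendicular to $a$]
- Let $q_1 = a/|a|$
- $c = |b| \cos\theta \cdot q_1$ is the projection of $b$ onto $q_1$
- $b - c = b - |b| \cdot \frac{q_1^T b}{|q_1| |b|} \cdot q_1 = b - (q_1^T b) \cdot q_1$
- An orthonormal basis of $\operatorname{span}\{a, b\}$ consists of
$$q_1 = a/|a| \quad \text{and} \quad q_2 = (b - c)/|b - c|$$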
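A two-vector numerical check of this construction, as a minimal sketch: the vectors `a` and `b` below are arbitrary values chosen for illustration.

```python
import math

def dot(x, y):
    """Euclidean inner product of two vectors given as lists."""
    return sum(xi * yi for xi, yi in zip(x, y))

a = [3.0, 4.0]
b = [2.0, 1.0]

norm_a = math.sqrt(dot(a, a))
q1 = [ai / norm_a for ai in a]            # q1 = a / |a|
r = dot(q1, b)                            # q1^T b = |b| cos(theta)
c = [r * qi for qi in q1]                 # projection of b onto q1
d = [bi - ci for bi, ci in zip(b, c)]     # b - c, perpendicular to a
norm_d = math.sqrt(dot(d, d))
q2 = [di / norm_d for di in d]            # q2 = (b - c) / |b - c|

print(q1, q2, dot(q1, q2))
```

Here $q_1 = (0.6, 0.8)$ and $b - c = (0.8, -0.6)$, so the inner product of $q_1$ and $q_2$ vanishes up to rounding.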
Gram-Schmidt (cont.)
Algorithm (Gram-Schmidt).
    for j = 1 : n
        q_j ← a_j
        for i = 1 : j − 1
            r_ij ← q_i^T a_j
            q_j ← q_j − r_ij q_i
        end
        r_jj ← ||q_j||_2
        q_j ← q_j / r_jj    % normalization
    end
- Need separate storage for A, Q
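The classical algorithm above translates almost line by line into Python. This is a sketch under the assumption that $A$ is given as a list of rows with full column rank; the function name and the small test matrix are chosen here for illustration.

```python
import math

def classical_gram_schmidt(A):
    """Thin QR of an m x n matrix by classical Gram-Schmidt.

    Returns Q as a list of n orthonormal columns and R as an n x n
    upper triangular matrix (list of rows).
    """
    m, n = len(A), len(A[0])
    cols = [[float(A[i][j]) for i in range(m)] for j in range(n)]  # columns a_j
    Q = []
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        q = cols[j][:]                       # q_j <- a_j
        for i in range(j):
            # r_ij uses the ORIGINAL column a_j, which is what makes this "classical".
            R[i][j] = sum(Q[i][k] * cols[j][k] for k in range(m))
            q = [q[k] - R[i][j] * Q[i][k] for k in range(m)]
        R[j][j] = math.sqrt(sum(t * t for t in q))
        Q.append([t / R[j][j] for t in q])   # normalization
    return Q, R

A = [[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
Q, R = classical_gram_schmidt(A)
# Rebuild A from the factors: A[i][j] = sum_k Q[k][i] * R[k][j].
A_rebuilt = [[sum(Q[k][i] * R[k][j] for k in range(2)) for j in range(2)] for i in range(3)]
print(R)
```

Keeping `cols` and `Q` as separate lists mirrors the remark that $A$ and $Q$ need separate storage.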
Gram-Schmidt (cont.)
Algorithm (Modified Gram-Schmidt).
    for j = 1 : n
        r_jj ← ||a_j||_2
        q_j ← a_j / r_jj    % normalization
        for k = j + 1 : n
            r_jk ← q_j^T a_k
            a_k ← a_k − r_jk q_j
        end
    end
- Can overwrite A with Q column by column as j advances
Example
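$$A_0 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & -1 & 2 \\ -1 & 2 & -1 \\ -2 & 0 & 1 \\ 1 & 2 & -1 \end{pmatrix} = (a_1, a_2, a_3)$$
$$r_{11} = \|a_1\| = 2\sqrt{2}$$
$$q_1 = a_1 / r_{11} = (1, 1, -1, -2, 1)^T / (2\sqrt{2})$$
$$r_{12} = q_1^T a_2 = 1/(2\sqrt{2})$$
$$a_2 \leftarrow a_2 - r_{12} q_1 = (15, -9, 17, 2, 15)^T / 8$$
$$r_{13} = q_1^T a_3 = 3/(2\sqrt{2}), \quad a_3 \leftarrow a_3 - r_{13} q_1 = \ldots$$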
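$$A_1 = \begin{pmatrix} 1/(2\sqrt{2}) & 15/8 & 21/8 \\ 1/(2\sqrt{2}) & -9/8 & 13/8 \\ -1/(2\sqrt{2}) & 17/8 & -5/8 \\ -2/(2\sqrt{2}) & 2/8 & 14/8 \\ 1/(2\sqrt{2}) & 15/8 & -11/8 \end{pmatrix} = (q_1, a_2, a_3)$$
with $a_2$, $a_3$ the updated columns, $a_k \leftarrow a_k - r_{1k} q_1$
$$r_{22} = \|a_2\| = \sqrt{206}/4$$
$$q_2 = a_2 / r_{22} = (15, -9, 17, 2, 15)^T / (2\sqrt{206})$$
$$r_{23} = q_2^T a_3 = \ldots, \quad a_3 \leftarrow a_3 - r_{23} q_2 = \ldots$$
$$r_{33} = \|a_3\|, \quad q_3 = a_3 / r_{33}$$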
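$$Q = (q_1, q_2, q_3), \qquad R = \begin{pmatrix} r_{11} & r_{12} & r_{13} \\ & r_{22} & r_{23} \\ & & r_{33} \end{pmatrix}$$
$$A = QR$$
- Here $Q$ is $5 \times 3$ and $R$ is $3 \times 3$ (thin QR). Appending the zero block $0_{2 \times 3}$ below $R$ gives the $5 \times 3$ factor of the full QR, in which $Q$ is extended to a $5 \times 5$ orthogonal matrix.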
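The worked example can be reproduced with a short modified Gram-Schmidt routine. This is a sketch, not the lecture's code: the function name is an assumption, and the columns are held in a working copy that is overwritten by $Q$.

```python
import math

def modified_gram_schmidt(A):
    """Thin QR by modified Gram-Schmidt; returns (columns of Q, R)."""
    m, n = len(A), len(A[0])
    cols = [[float(A[i][j]) for i in range(m)] for j in range(n)]  # working copy of A
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        R[j][j] = math.sqrt(sum(t * t for t in cols[j]))
        cols[j] = [t / R[j][j] for t in cols[j]]          # a_j overwritten by q_j
        for k in range(j + 1, n):
            # Project q_j out of every remaining (already updated) column a_k.
            R[j][k] = sum(cols[j][i] * cols[k][i] for i in range(m))
            cols[k] = [cols[k][i] - R[j][k] * cols[j][i] for i in range(m)]
    return cols, R

A0 = [[1, 2, 3], [1, -1, 2], [-1, 2, -1], [-2, 0, 1], [1, 2, -1]]
Q, R = modified_gram_schmidt(A0)
for row in R:
    print(["%.4f" % t for t in row])
```

The first row of `R` holds $r_{11} = 2\sqrt{2}$, $r_{12} = 1/(2\sqrt{2})$ and $r_{13} = 3/(2\sqrt{2})$, and `R[1][1]` equals $\sqrt{206}/4$, matching the hand computation above.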
Householder orthogonalization
$$H = I - 2uu^T$$
- Householder transformation of a unit vector $u$
- Orthogonal and symmetric
- Given vector $a$, find $u$ s.t. $\alpha e_1 = Ha$
$$u = (a - \alpha e_1)/(2u^T a)$$
- Can pick $v = a - \alpha e_1$ and then normalize
- To avoid cancellation, choose $\alpha = -\operatorname{sgn}(a(1)) \|a\|_2$
$$v = a + \operatorname{sgn}(a(1)) \|a\|_2 e_1, \quad u = v/\|v\|_2$$
Example
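$$A_0 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & -1 & 2 \\ -1 & 2 & -1 \\ -2 & 0 & 1 \\ 1 & 2 & -1 \end{pmatrix}$$
$$v = a + \operatorname{sgn}(a(1)) \|a\|_2 e_1 = \begin{pmatrix} 1 \\ 1 \\ -1 \\ -2 \\ 1 \end{pmatrix} + 2\sqrt{2} \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 + 2\sqrt{2} \\ 1 \\ -1 \\ -2 \\ 1 \end{pmatrix}$$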
$$u = v/\|v\|_2, \quad H_1 = I - 2uu^T, \quad A_1 = H_1 A_0$$
    >> v = A(:,1); v(1) = v(1)+norm(A(:,1))*sign(v(1));
    >> u = v/norm(v)
    u =
        0.8227
        0.2149
       -0.2149
       -0.4298
        0.2149
    >> H1 = eye(5)-2*u*u'; A = H1*A
    A =
       -2.8284   -0.3536   -1.0607
        0.0000   -1.6148    0.9393
       -0.0000    2.6148    0.0607
       -0.0000    1.2295    3.1213
        0.0000    1.3852   -2.0607
    >> v = A(2:end,2);
    >> v(1) = v(1)+norm(A(2:end,2))*sign(v(1));
    >> u = v/norm(v)
    u =
       -0.8515
        0.4279
        0.2012
        0.2267
    >> H2 = blkdiag(eye(1),eye(4)-2*u*u');
    >> A = H2*A
    A =
       -2.8284   -0.3536   -1.0607
       -0.0000    3.5882   -0.1045
        0.0000    0.0000    0.5853
       -0.0000   -0.0000    3.3680
        0.0000         0   -1.7827
    >> v = A(3:end,3);
    >> v(1) = v(1)+norm(A(3:end,3))*sign(v(1));
    >> u = v/norm(v)
    u =
        0.7589
        0.5756
       -0.3047
    >> H3 = blkdiag(eye(2),eye(3)-2*u*u');
    >> A = H3*A
    A =
       -2.8284   -0.3536   -1.0607
       -0.0000    3.5882   -0.1045
        0.0000    0.0000   -3.8554
       -0.0000   -0.0000    0.0000
        0.0000    0.0000   -0.0000
Householder orthogonalization (cont.)
- Recall that $H_1, H_2, H_3$ are orthogonal matrices
$$R = H_3 H_2 H_1 A = Q^{-1} A, \qquad QR = A$$
where
$$Q = H_1^{-1} H_2^{-1} H_3^{-1} = H_1^T H_2^T H_3^T$$
Householder orthogonalization (cont.)
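Algorithm (Householder).
    [m,n] = size(A); p = zeros(1,n);
    for j = 1:n
        a = A(j:m,j);
        e1 = [1; zeros(m-j,1)];
        s = sign(a(1)); if s == 0, s = 1; end    % treat sgn(0) as +1
        u = a + s*norm(a)*e1;
        u = u/norm(u);                            % normalization
        A(j:m,j:n) = A(j:m,j:n) - 2*u*(u'*A(j:m,j:n));    % A_j <- H_j A_{j-1}
        % Store u: first entry in p, remaining entries below the diagonal of A
        p(j) = u(1);
        A(j+1:m,j) = u(2:m-j+1);
    end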
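For comparison with the MATLAB session, the same reduction can be sketched in plain Python. The function name `householder_r` is an assumption made here. Unlike the algorithm above, this version returns only $R$ and does not store the vectors $u$.

```python
import math

def householder_r(A):
    """Reduce an m x n matrix (m >= n) to upper triangular form with Householder reflections."""
    m, n = len(A), len(A[0])
    R = [[float(t) for t in row] for row in A]
    for j in range(n):
        a = [R[i][j] for i in range(j, m)]          # column j from the diagonal down
        norm_a = math.sqrt(sum(t * t for t in a))
        if norm_a == 0.0:
            continue                                 # nothing to eliminate
        v = a[:]
        v[0] += math.copysign(norm_a, a[0])          # v = a + sgn(a(1)) ||a|| e1
        norm_v = math.sqrt(sum(t * t for t in v))
        u = [t / norm_v for t in v]
        for col in range(j, n):
            # Apply H = I - 2 u u^T to the trailing block, one column at a time.
            s = 2.0 * sum(u[i] * R[j + i][col] for i in range(m - j))
            for i in range(m - j):
                R[j + i][col] -= s * u[i]
    return R

A0 = [[1, 2, 3], [1, -1, 2], [-1, 2, -1], [-2, 0, 1], [1, 2, -1]]
R = householder_r(A0)
for row in R:
    print(["%.4f" % t for t in row])
```

Up to rounding and the sign of printed zeros, the result agrees with the final matrix of the MATLAB session, with diagonal entries $-2.8284$, $3.5882$ and $-3.8554$.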
Givens rotation
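$$G A_0 = \begin{pmatrix} 1 & & & & \\ & 1 & & & \\ & & 1 & & \\ & & & \cos\theta & \sin\theta \\ & & & -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} 1 & \times & \times \\ 1 & \times & \times \\ -1 & \times & \times \\ -2 & \times & \times \\ 1 & \times & \times \end{pmatrix} = \begin{pmatrix} 1 & \times & \times \\ 1 & \times & \times \\ -1 & \times & \times \\ a & \times & \times \\ 0 & \times & \times \end{pmatrix}$$
- Choose $\theta$ s.t. $2 \sin\theta + \cos\theta = 0$
- $\sin\theta = 1/\sqrt{5}$, $\cos\theta = -2/\sqrt{5}$
- $a = \|(-2, 1)^T\|_2 = \sqrt{5}$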
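The rotation on this slide can be checked numerically. The sketch below computes $c = \cos\theta$ and $s = \sin\theta$ from the two entries to be combined and applies the rotation to rows 4 and 5 of $A_0$ (indices 3 and 4 in Python). The helper names `givens` and `apply_givens` are assumptions made for this example.

```python
import math

def givens(x, y):
    """Return (c, s, r) with c*x + s*y = r and -s*x + c*y = 0."""
    r = math.hypot(x, y)
    return x / r, y / r, r

def apply_givens(A, i, k, c, s):
    """Overwrite rows i and k of A with the rotated rows."""
    for col in range(len(A[0])):
        top, bottom = A[i][col], A[k][col]
        A[i][col] = c * top + s * bottom
        A[k][col] = -s * top + c * bottom

A = [[1.0, 2.0, 3.0],
     [1.0, -1.0, 2.0],
     [-1.0, 2.0, -1.0],
     [-2.0, 0.0, 1.0],
     [1.0, 2.0, -1.0]]

c, s, r = givens(A[3][0], A[4][0])   # entries -2 and 1 of the first column
apply_givens(A, 3, 4, c, s)
print(c, s, A[3][0], A[4][0])
```

This gives $c = -2/\sqrt{5}$ and $s = 1/\sqrt{5}$ as on the slide; the entry in row 4 becomes $\sqrt{5}$ and the entry in row 5 becomes zero up to rounding. Rows 1 to 3 are untouched.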