Linear least squares problems

Guangning (Gary) Tan

CAS708/CSE700

McMaster University

October 1, 2015

Review Normal equations Orthogonal transformations

Contents

Review

Normal equations

Orthogonal transformations

This PDF can be accessed at

tgn3000.com ⇒ Teaching ⇒ CAS708/CSE700 Week 4b

2015/10/1 tgn3000.com 2

Review

$$\min_x \|Ax - b\|_2^2$$

- $x \in \mathbb{R}^n$, $A \in \mathbb{R}^{m \times n}$ with $m > n$, hence $A$ is tall
- $A$ has full column rank, $\operatorname{rank}(A) = n$
- $Ax = b$ is an overdetermined system, in general with no exact solution
- Arises in data fitting problems, e.g.,
  - $y = ax^3 + bx^2 + cx + d$
  - $y = ae^{bx}$
  - $y = a + b \log x$
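As written, the exponential model is not linear in its parameters $a$ and $b$. One common way to bring it into linear least squares form, sketched here under the assumptions $a > 0$ and $y > 0$ (the slide itself does not say how the fit is set up), is to take logarithms:

```latex
\ln y = \ln a + b\,x
```

The unknowns are then $\ln a$ and $b$. Note that this minimizes residuals in $\ln y$ rather than in $y$. The cubic and logarithmic models are already linear in their coefficients.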

Unconstrained optimization problem

Objective function

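$$f(x) = \|Ax - b\|_2^2 = \sum_{i=1}^{m} \Bigl( b_i - \sum_{j=1}^{n} a_{ij} x_j \Bigr)^2$$

- Convex problem, optimal value (minimum) exists
- Necessary condition

$$\nabla f = \Bigl( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \dots, \frac{\partial f}{\partial x_n} \Bigr) = 0^T$$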

Normal equations

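- For $j = 1 : n$,

$$\frac{\partial f}{\partial x_j} = 2 \sum_{i=1}^{m} \Bigl[ \Bigl( b_i - \sum_{k=1}^{n} a_{ik} x_k \Bigr) (-a_{ij}) \Bigr] = 0$$

- Rearranging terms gives the normal equations

$$A^T A x = A^T b$$

- Since $A$ has full column rank, $A^T A$ is symmetric positive definite (SPD), so a Cholesky factorization can be used
- Solution is $x = (A^T A)^{-1} A^T b$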
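The steps above (form $A^T A$ and $A^T b$, factor with Cholesky, then two triangular solves) can be sketched in plain Python. This is an illustrative sketch only: the function name `normal_equations_solve` and the small line-fitting data set are assumptions made for the example, not part of the slides.

```python
import math

def normal_equations_solve(A, b):
    """Solve min ||Ax - b||_2 via A^T A x = A^T b and a Cholesky factorization.

    A is a list of m rows of length n (assumed full column rank), b has length m.
    """
    m, n = len(A), len(A[0])
    # Form G = A^T A (n x n, SPD for full column rank) and c = A^T b.
    G = [[sum(A[i][p] * A[i][q] for i in range(m)) for q in range(n)]
         for p in range(n)]
    c = [sum(A[i][p] * b[i] for i in range(m)) for p in range(n)]
    # Cholesky factorization G = L L^T with L lower triangular.
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = G[j][j] - sum(L[j][k] ** 2 for k in range(j))
        L[j][j] = math.sqrt(d)  # raises ValueError if G is not positive definite
        for i in range(j + 1, n):
            L[i][j] = (G[i][j] - sum(L[i][k] * L[j][k] for k in range(j))) / L[j][j]
    # Forward substitution: L y = c.
    y = [0.0] * n
    for i in range(n):
        y[i] = (c[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    # Back substitution: L^T x = y.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

# Made-up data: four points lying exactly on y = 1 + 2t, model y = x0 + x1*t.
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
b = [1.0, 3.0, 5.0, 7.0]
x = normal_equations_solve(A, b)
```

Because the data are consistent, the computed `x` should be close to `[1, 2]`. For ill-conditioned $A$ this approach can lose accuracy, since the condition number of $A^T A$ is the square of that of $A$.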

Normal equations (cont.)

$$A^T A x = A^T b, \qquad x = (A^T A)^{-1} A^T b$$

- $A^\dagger = (A^T A)^{-1} A^T$ is the Moore-Penrose inverse of $A$
- Assume $A$ has singular values $\sigma_1 \ge \dots \ge \sigma_n > 0$
- Condition number $\kappa(A) = \sigma_1 / \sigma_n$
- $\kappa(A^T A) = \kappa(A)^2$ can be large
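A short justification of the last bullet, using the singular value decomposition $A = U \Sigma V^T$ (a standard argument that the slide does not spell out):

```latex
A^T A = V \Sigma^T U^T U \Sigma V^T
      = V \,\mathrm{diag}(\sigma_1^2, \dots, \sigma_n^2)\, V^T
\quad\Longrightarrow\quad
\kappa(A^T A) = \frac{\sigma_1^2}{\sigma_n^2} = \kappa(A)^2 .
```

For instance, $\kappa(A) = 10^4$ gives $\kappa(A^T A) = 10^8$.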

Polynomial data fitting

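$$b \approx x_0 + x_1 v + x_2 v^2 + \dots + x_{n-1} v^{n-1}$$

- Sample points $(v_i, b_i)$, $i = 1 : m$, $m > n$
- Find undetermined coefficients $x_0, \dots, x_{n-1}$ to fit the data

$$\begin{pmatrix}
1 & v_1 & v_1^2 & \cdots & v_1^{n-1} \\
1 & v_2 & v_2^2 & \cdots & v_2^{n-1} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & v_m & v_m^2 & \cdots & v_m^{n-1}
\end{pmatrix}
\begin{pmatrix} x_0 \\ x_1 \\ \vdots \\ x_{n-1} \end{pmatrix}
\approx
\begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}$$

written $Vx \approx b$

- Wish to find $\min_x \|Vx - b\|_2$
- $V$ is a Vandermonde matrix, with nodes or knots $v_1, \dots, v_m$, assumed real and distinct
- Will see Vandermonde again in polynomial interpolation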
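To make the structure of $V$ concrete, here is a small sketch in plain Python that builds the matrix and evaluates $\|Vx - b\|_2^2$ for a candidate coefficient vector. The helper names `vandermonde` and `residual_norm_sq`, and the sample data, are made up for illustration.

```python
def vandermonde(nodes, n):
    """Return the m x n matrix whose i-th row is (1, v_i, v_i^2, ..., v_i^(n-1))."""
    return [[v ** k for k in range(n)] for v in nodes]

def residual_norm_sq(V, x, b):
    """Return ||Vx - b||_2^2 for coefficient vector x."""
    return sum((sum(vij * xj for vij, xj in zip(row, x)) - bi) ** 2
               for row, bi in zip(V, b))

nodes = [0.0, 1.0, 2.0, 3.0, 4.0]      # m = 5 distinct real nodes
b = [1.0, 2.0, 5.0, 10.0, 17.0]        # made-up samples of 1 + v^2
V = vandermonde(nodes, 3)              # n = 3: quadratic model

exact = residual_norm_sq(V, [1.0, 0.0, 1.0], b)   # coefficients of 1 + v^2
other = residual_norm_sq(V, [0.0, 0.0, 1.0], b)   # coefficients of v^2 only
```

Here `exact` is zero because the data were generated from the model, while `other` is positive; a least squares solver searches for the `x` that makes this quantity smallest.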

Orthogonal transformations

- Singular value decomposition (SVD)
  - [U,S,V] = svd(A)
  - Used when A is (nearly) rank deficient
- QR
  - [Q, R] = qr(A)
  - "Standard" approach
  - More expensive than normal equations when m ≫ n, but more robust
  - Gram-Schmidt
  - Householder (elementary reflectors)
  - Givens (plane rotations)

Singular value decomposition (SVD)

$$A = U \Sigma V^T$$

- $A \in \mathbb{R}^{m \times n}$
- $U, V$ are orthogonal matrices, of size $m$ and $n$, respectively
- $\Sigma$ is a diagonal matrix, with singular values $\sigma_i$ on the diagonal, $i = 1 : \min\{m, n\}$
- Generalized inverse, or Moore-Penrose pseudoinverse, of $A$ is $A^\dagger = V \Sigma^\dagger U^T$
- $\Sigma^\dagger$ is the pseudoinverse of $\Sigma$, formed by replacing every nonzero diagonal entry by its reciprocal and transposing the result

SVD (cont.)

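$$A_{m \times n} = U
\begin{pmatrix}
\operatorname{diag}(\sigma_1, \sigma_2, \dots, \sigma_k, 0, \dots, 0) \\
0_{(m-n) \times n}
\end{pmatrix}
V^T
\quad \text{where } \sigma_1 \ge \dots \ge \sigma_k > 0$$

$$A^\dagger_{n \times m} = V
\begin{pmatrix}
\operatorname{diag}(\sigma_1^{-1}, \sigma_2^{-1}, \dots, \sigma_k^{-1}, 0, \dots, 0) & 0_{n \times (m-n)}
\end{pmatrix}
U^T$$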
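Written out column by column, the pseudoinverse gives the least squares solution as a sum over the nonzero singular values. The symbols $u_i$ and $v_i$ for the $i$-th columns of $U$ and $V$ are introduced here for illustration and do not appear on the slide:

```latex
x = A^\dagger b = V \Sigma^\dagger U^T b
  = \sum_{i=1}^{k} \frac{u_i^T b}{\sigma_i}\, v_i .
```

When $k = n$ (full column rank) this agrees with $x = (A^T A)^{-1} A^T b$. Small $\sigma_i$ amplify the corresponding components of $b$, which is why nearly rank-deficient problems are sensitive.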

Gram-Schmidt orthogonalization

Basic idea

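- $a^T b = |a|\,|b| \cos\theta$
- $(b - c) \perp a$

[Figure: vectors $a$ and $b$ separated by angle $\theta$; $c = |b| \cos\theta \cdot q_1$ lies along $a$, and $b - c$ is perpendicular to $a$.]

- Let $q_1 = a / |a|$
- $c$ is the projection of $b$ onto $q_1$
- $b - c = b - |b| \cdot \dfrac{q_1^T b}{|q_1|\,|b|} \cdot q_1 = b - (q_1^T b) \cdot q_1$
- An orthonormal basis consists of

$$q_1 = a / |a| \quad \text{and} \quad q_2 = (b - c) / |b - c|$$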

Gram-Schmidt (cont.)

Algorithm (Gram-Schmidt).

for j = 1 : n
    q_j ← a_j
    for i = 1 : j - 1
        r_ij ← q_i^T a_j
        q_j ← q_j - r_ij q_i
    end
    r_jj ← ||q_j||_2
    q_j ← q_j / r_jj    % normalization
end

- Need separate storage for A, Q
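A minimal sketch of the classical algorithm above in plain Python, with the matrix stored as a list of columns. The function name `classical_gram_schmidt` and the column-list layout are choices made for this illustration; the test matrix is the 5 x 3 example used later in these slides.

```python
import math

def classical_gram_schmidt(cols):
    """Factor A = QR, where A is given as a list of n columns of length m.

    Returns (Q, R): Q is a list of n orthonormal columns, R is an n x n
    upper triangular matrix stored as a list of rows.
    """
    n = len(cols)
    Q = []
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        q = [float(v) for v in cols[j]]            # q_j <- a_j
        for i in range(j):
            # r_ij is computed from the ORIGINAL column a_j (classical variant).
            R[i][j] = sum(qi * aj for qi, aj in zip(Q[i], cols[j]))
            q = [qk - R[i][j] * qik for qk, qik in zip(q, Q[i])]
        R[j][j] = math.sqrt(sum(qk * qk for qk in q))
        Q.append([qk / R[j][j] for qk in q])       # normalization
    return Q, R

# Columns a1, a2, a3 of the example matrix A0.
A_cols = [[1, 1, -1, -2, 1], [2, -1, 2, 0, 2], [3, 2, -1, 1, -1]]
Q, R = classical_gram_schmidt(A_cols)
```

The input columns are left untouched and `Q` is built separately, which mirrors the remark that A and Q need separate storage.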

Gram-Schmidt (cont.)

Algorithm (Modified Gram-Schmidt).

for j = 1 : n
    r_jj ← ||a_j||_2
    q_j ← a_j / r_jj    % normalization
    for i = j + 1 : n
        r_ji ← q_j^T a_i
        a_i ← a_i - r_ji q_j
    end
end

- Can overwrite A with Q while iterating over j
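The modified variant can be sketched the same way in plain Python. Each new $q_j$ is removed from all remaining columns immediately, so the working copy of A is gradually overwritten by Q. The function name `modified_gram_schmidt` and the column-list storage are assumptions of this sketch.

```python
import math

def modified_gram_schmidt(cols):
    """Factor A = QR, where A is given as a list of n columns of length m.

    Works on a copy of the columns that is overwritten by Q.
    Returns (Q, R) with R an n x n upper triangular list of rows.
    """
    a = [[float(v) for v in c] for c in cols]      # working copy, becomes Q
    n = len(a)
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        R[j][j] = math.sqrt(sum(v * v for v in a[j]))
        a[j] = [v / R[j][j] for v in a[j]]         # a_j now holds q_j
        for i in range(j + 1, n):
            # Project q_j out of every remaining (already updated) column.
            R[j][i] = sum(qv * av for qv, av in zip(a[j], a[i]))
            a[i] = [av - R[j][i] * qv for av, qv in zip(a[i], a[j])]
    return a, R

# Columns a1, a2, a3 of the example matrix A0.
A_cols = [[1, 1, -1, -2, 1], [2, -1, 2, 0, 2], [3, 2, -1, 1, -1]]
Q, R = modified_gram_schmidt(A_cols)

# Rebuild A column by column from Q and R as a consistency check.
A_rebuilt = [[sum(Q[i][k] * R[i][j] for i in range(3)) for k in range(5)]
             for j in range(3)]
```

In exact arithmetic both variants give the same factors; the modified ordering is generally preferred in floating point because it loses orthogonality more slowly.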

Example

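$$A_0 = \begin{pmatrix}
1 & 2 & 3 \\
1 & -1 & 2 \\
-1 & 2 & -1 \\
-2 & 0 & 1 \\
1 & 2 & -1
\end{pmatrix} = (a_1, a_2, a_3)$$

$$r_{11} = \|a_1\| = 2\sqrt{2}$$

$$q_1 = a_1 / r_{11} = (1, 1, -1, -2, 1)^T / (2\sqrt{2})$$

$$r_{12} = q_1^T a_2 = 1 / (2\sqrt{2})$$

$$a_2 \leftarrow a_2 - r_{12} q_1 = (15, -9, 17, 2, 15)^T / 8$$

$$r_{13} = q_1^T a_3 = 3 / (2\sqrt{2}), \qquad a_3 \leftarrow a_3 - r_{13} q_1 = \dots$$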

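$$A_1 = \begin{pmatrix}
1/(2\sqrt{2}) & 15/8 & 21/8 \\
1/(2\sqrt{2}) & -9/8 & 13/8 \\
-1/(2\sqrt{2}) & 17/8 & -5/8 \\
-2/(2\sqrt{2}) & 2/8 & 14/8 \\
1/(2\sqrt{2}) & 15/8 & -11/8
\end{pmatrix} = (q_1, a_2, a_3)$$

Here the third column is the updated $a_3 - r_{13} q_1 = (21, 13, -5, 14, -11)^T / 8$.

$$r_{22} = \|a_2\| = \sqrt{206} / 4$$

$$q_2 = a_2 / r_{22} = (15, -9, 17, 2, 15)^T / (2\sqrt{206})$$

$$r_{23} = q_2^T a_3 = \dots, \qquad a_3 \leftarrow a_3 - r_{23} q_2 = \dots$$

$$r_{33} = \|a_3\|, \qquad q_3 = a_3 / r_{33}$$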

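$$Q = (q_1, q_2, q_3)$$

$$R = \begin{pmatrix}
r_{11} & r_{12} & r_{13} \\
 & r_{22} & r_{23} \\
 & & r_{33}
\end{pmatrix}$$

$$A = QR$$

With $Q$ of size $5 \times 3$, $R$ is the $3 \times 3$ upper triangular factor. Appending the zero block $0_{2 \times 3}$ below $R$ corresponds to the full factorization, in which $Q$ is extended to a $5 \times 5$ orthogonal matrix.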

Householder orthogonalization

$$H = I - 2uu^T$$

- Householder transformation defined by a unit vector $u$
- Orthogonal and symmetric
- Given vector $a$, find $u$ s.t. $\alpha e_1 = Ha$

$$u = (a - \alpha e_1) / (2u^T a)$$

- Can pick $v = a - \alpha e_1$ and then normalize
- To avoid cancellation, choose $\alpha = -\operatorname{sgn}(a(1)) \|a\|_2$

$$v = a + \operatorname{sgn}(a(1)) \|a\|_2 e_1, \qquad u = v / \|v\|_2$$
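The formula for $u$ follows in two lines from the definition of $H$; the derivation below is a standard one and fills in a step the slide leaves implicit:

```latex
Ha = a - 2u\,(u^T a) = \alpha e_1
\quad\Longrightarrow\quad
2\,(u^T a)\,u = a - \alpha e_1
\quad\Longrightarrow\quad
u = \frac{a - \alpha e_1}{2\,u^T a} .
```

Since $2u^T a$ is a scalar, $u$ is simply $a - \alpha e_1$ scaled to unit length. Because $H$ is orthogonal, $\|Ha\|_2 = \|a\|_2$, so $|\alpha| = \|a\|_2$ and only the sign of $\alpha$ is free to choose.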

Example

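$$A_0 = \begin{pmatrix}
1 & 2 & 3 \\
1 & -1 & 2 \\
-1 & 2 & -1 \\
-2 & 0 & 1 \\
1 & 2 & -1
\end{pmatrix}$$

$$v = a + \operatorname{sgn}(a(1)) \|a\|_2 e_1
= \begin{pmatrix} 1 \\ 1 \\ -1 \\ -2 \\ 1 \end{pmatrix}
+ 2\sqrt{2} \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}
= \begin{pmatrix} 1 + 2\sqrt{2} \\ 1 \\ -1 \\ -2 \\ 1 \end{pmatrix}$$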

$u = v / \|v\|_2$, $H_1 = I - 2uu^T$, $A_1 = H_1 A_0$

>> v = A(:,1); v(1) = v(1)+norm(A(:,1))*sign(v(1));
>> u = v/norm(v)
u =
    0.8227
    0.2149
   -0.2149
   -0.4298
    0.2149

>> H1 = eye(5)-2*u*u'; A = H1*A
A =
   -2.8284   -0.3536   -1.0607
    0.0000   -1.6148    0.9393
   -0.0000    2.6148    0.0607
   -0.0000    1.2295    3.1213
    0.0000    1.3852   -2.0607

>> v = A(2:end,2);
>> v(1) = v(1)+norm(A(2:end,2))*sign(v(1));
>> u = v/norm(v)
u =
   -0.8515
    0.4279
    0.2012
    0.2267

>> H2 = blkdiag(eye(1),eye(4)-2*u*u');
>> A = H2*A
A =
   -2.8284   -0.3536   -1.0607
   -0.0000    3.5882   -0.1045
    0.0000    0.0000    0.5853
   -0.0000   -0.0000    3.3680
    0.0000         0   -1.7827

>> v = A(3:end,3);
>> v(1) = v(1)+norm(A(3:end,3))*sign(v(1));
>> u = v/norm(v)
u =
    0.7589
    0.5756
   -0.3047

>> H3 = blkdiag(eye(2),eye(3)-2*u*u');
>> A = H3*A
A =
   -2.8284   -0.3536   -1.0607
   -0.0000    3.5882   -0.1045
    0.0000    0.0000   -3.8554
   -0.0000   -0.0000    0.0000
    0.0000    0.0000   -0.0000

Householder orthogonalization (cont.)

- Recall that $H_1, H_2, H_3$ are orthogonal matrices

$$R = H_3 H_2 H_1 A = Q^{-1} A, \qquad QR = A$$

where

$$Q = H_1^{-1} H_2^{-1} H_3^{-1} = H_1^T H_2^T H_3^T$$

Householder orthogonalization (cont.)

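Algorithm (Householder).

[m,n] = size(A); p = zeros(1,n);
for j = 1:n
    a = A(j:m,j);
    e1 = [1; zeros(m-j,1)];
    u = a + sign(a(1))*norm(a)*e1;    % assumes a(1) ~= 0, since sign(0) = 0
    u = u/norm(u);                    % normalization
    A(j:m,j:n) = A(j:m,j:n) - 2*u*(u'*A(j:m,j:n));    % A_j <- H_j A_{j-1}
    % Store u: first entry in p, the rest below the diagonal of A
    p(j) = u(1);
    A(j+1:m,j) = u(2:m-j+1);
end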
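For readers without MATLAB, the same loop can be sketched in plain Python. This version is an illustration under stated assumptions rather than a port of the slide's code: it returns only $R$, does not store the vectors $u$, and treats $\operatorname{sgn}(0)$ as $+1$. The function name `householder_r` is made up for the example.

```python
import math

def householder_r(A):
    """Return the triangularized copy of A (list of m rows) after applying
    n Householder reflections; entries below the diagonal end up near zero."""
    R = [[float(v) for v in row] for row in A]
    m, n = len(R), len(R[0])
    for j in range(n):
        a = [R[i][j] for i in range(j, m)]
        norm_a = math.sqrt(sum(v * v for v in a))
        if norm_a == 0.0:
            continue                               # column already zero
        # v = a + sgn(a(1)) ||a||_2 e1, with sgn(0) taken as +1 (assumption).
        v = a[:]
        v[0] += math.copysign(norm_a, a[0])
        norm_v = math.sqrt(sum(t * t for t in v))
        u = [t / norm_v for t in v]
        # Apply H = I - 2 u u^T to the trailing block R[j:m, j:n].
        for k in range(j, n):
            s = 2.0 * sum(u[i] * R[j + i][k] for i in range(m - j))
            for i in range(m - j):
                R[j + i][k] -= s * u[i]
    return R

A0 = [[1, 2, 3], [1, -1, 2], [-1, 2, -1], [-2, 0, 1], [1, 2, -1]]
R = householder_r(A0)
```

Applied to the example matrix, the leading 3 x 3 block of `R` should match the final MATLAB output in the worked example to the four decimals shown there. Note that $H$ is never formed explicitly; only the product $u(u^T A)$ is computed, as in the algorithm above.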

Givens rotation

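$$G A_0 =
\begin{pmatrix}
1 & & & & \\
 & 1 & & & \\
 & & 1 & & \\
 & & & \cos\theta & \sin\theta \\
 & & & -\sin\theta & \cos\theta
\end{pmatrix}
\begin{pmatrix}
1 & \times & \times \\
1 & \times & \times \\
-1 & \times & \times \\
-2 & \times & \times \\
1 & \times & \times
\end{pmatrix}
=
\begin{pmatrix}
1 & \times & \times \\
1 & \times & \times \\
-1 & \times & \times \\
a & \times & \times \\
0 & \times & \times
\end{pmatrix}$$

- Choose $\theta$ s.t. $2 \sin\theta + \cos\theta = 0$
- $\sin\theta = 1/\sqrt{5}$, $\cos\theta = -2/\sqrt{5}$
- $a = \|(-2, 1)^T\|_2 = \sqrt{5}$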
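The choice of $\theta$ above can be checked numerically with a few lines of plain Python. The helper name `givens` is hypothetical; it returns $\cos\theta$ and $\sin\theta$ for a rotation that zeroes the second of two entries.

```python
import math

def givens(x, y):
    """Return (c, s) = (cos(theta), sin(theta)) such that
        [ c  s] [x]   [r]
        [-s  c] [y] = [0],   with r = sqrt(x^2 + y^2)."""
    r = math.hypot(x, y)
    if r == 0.0:
        return 1.0, 0.0                 # both entries zero: use the identity
    return x / r, y / r

# Entries (4,1) and (5,1) of the example matrix A0.
x, y = -2.0, 1.0
c, s = givens(x, y)
a = c * x + s * y                       # new entry in row 4
zero = -s * x + c * y                   # new entry in row 5, should vanish
```

With these inputs `c` and `s` reproduce the values $-2/\sqrt{5}$ and $1/\sqrt{5}$ from the slide, and `a` equals $\sqrt{5}$ up to rounding.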
