conjugate gradient algorithm - aalto sci msconjugate gradient method for least squares (cgls) †...
TRANSCRIPT
![Page 1: Conjugate Gradient algorithm - Aalto SCI MSConjugate Gradient method for Least Squares (CGLS) † Need: A can be rectangular (non-square); † Cost: 2 matrix-vector products (one with](https://reader030.vdocuments.net/reader030/viewer/2022040304/5e95c22f9289875be229686d/html5/thumbnails/1.jpg)
E. Somersalo
Conjugate Gradient algorithm
• Need: A symmetric positive definite;
• Cost: 1 matrix-vector product per step;
• Storage: fixed, independent of number of steps.
The CG method minimizes the A norm of the error,
xk = arg minx∈Kk(A,b)
‖x− x∗‖2A.
x∗ = true solution, ‖z‖2A = zTAz.
Computational Methods in Inverse Problems, Mat–1.3626 0-0
![Page 2: Conjugate Gradient algorithm - Aalto SCI MSConjugate Gradient method for Least Squares (CGLS) † Need: A can be rectangular (non-square); † Cost: 2 matrix-vector products (one with](https://reader030.vdocuments.net/reader030/viewer/2022040304/5e95c22f9289875be229686d/html5/thumbnails/2.jpg)
E. Somersalo
Krylov subspaces
The kth Krylov subspace associated with the matrix A and the vector b is
Kk(A, b) = span{b, Ab, . . . , Ak−1b}.
Iterative methods which seek the solution in a Krylov subspace are calledKrylov subspace iterative methods.
Computational Methods in Inverse Problems, Mat–1.3626 0-1
![Page 3: Conjugate Gradient algorithm - Aalto SCI MSConjugate Gradient method for Least Squares (CGLS) † Need: A can be rectangular (non-square); † Cost: 2 matrix-vector products (one with](https://reader030.vdocuments.net/reader030/viewer/2022040304/5e95c22f9289875be229686d/html5/thumbnails/3.jpg)
E. Somersalo
At each step, minimize
α 7→ ‖xk−1 + αpk−1 − x∗‖2ASolution:
αk =‖rk−1‖2
pTk−1Apk−1
New updatexk = xk−1 + αk−1pk−1.
Search directions:p0 = r0 = b−Ax0,
Computational Methods in Inverse Problems, Mat–1.3626 0-2
![Page 4: Conjugate Gradient algorithm - Aalto SCI MSConjugate Gradient method for Least Squares (CGLS) † Need: A can be rectangular (non-square); † Cost: 2 matrix-vector products (one with](https://reader030.vdocuments.net/reader030/viewer/2022040304/5e95c22f9289875be229686d/html5/thumbnails/4.jpg)
E. Somersalo
Iteratively, A-conjugate to the previous ones:
pTk Apj = 0, 0 ≤ j ≤ k − 1.
Found by writing
pk = rk + βkpk−1, rk = b−Axk,
βk =‖rk‖2‖rk−1‖2 ,
Computational Methods in Inverse Problems, Mat–1.3626 0-3
![Page 5: Conjugate Gradient algorithm - Aalto SCI MSConjugate Gradient method for Least Squares (CGLS) † Need: A can be rectangular (non-square); † Cost: 2 matrix-vector products (one with](https://reader030.vdocuments.net/reader030/viewer/2022040304/5e95c22f9289875be229686d/html5/thumbnails/5.jpg)
E. Somersalo
Algorithm (CG)
Initialize: x0 = 0; r0 = b−Ax0; p0 = r0;
for k = 1, 2, . . . until stopping criterion is satisfied
αk =‖rk−1‖2
pTk−1Apk−1
;
xk = xk−1 + αkpk−1;
rk = rk−1 − αkApk−1;
βk =‖rk‖2‖rk−1‖2 ;
pk = rk + βkpk−1;
end
Computational Methods in Inverse Problems, Mat–1.3626 0-4
![Page 6: Conjugate Gradient algorithm - Aalto SCI MSConjugate Gradient method for Least Squares (CGLS) † Need: A can be rectangular (non-square); † Cost: 2 matrix-vector products (one with](https://reader030.vdocuments.net/reader030/viewer/2022040304/5e95c22f9289875be229686d/html5/thumbnails/6.jpg)
E. Somersalo
CGLS Method
Conjugate Gradient method for Least Squares (CGLS)
• Need: A can be rectangular (non-square);
• Cost: 2 matrix-vector products (one with A, one with AT) per step;
• Storage: fixed, independent of number of steps.
Mathematically equivalent to applying CG to normal equations
ATAx = ATb
without actually forming them.
Computational Methods in Inverse Problems, Mat–1.3626 0-5
![Page 7: Conjugate Gradient algorithm - Aalto SCI MSConjugate Gradient method for Least Squares (CGLS) † Need: A can be rectangular (non-square); † Cost: 2 matrix-vector products (one with](https://reader030.vdocuments.net/reader030/viewer/2022040304/5e95c22f9289875be229686d/html5/thumbnails/7.jpg)
E. Somersalo
CGLS minimization problem
The kth iterate solves the minimization problem
xk = arg minx∈Kk(ATA,ATb)
‖b−Ax‖.
The kth iterate xk of CGLS method (x0 = 0) is characterized by
Φ(xk) = minx∈Kk(ATA,ATb)
Φ(x)
whereΦ(x) :=
12xTATAx− xTATb.
Computational Methods in Inverse Problems, Mat–1.3626 0-6
![Page 8: Conjugate Gradient algorithm - Aalto SCI MSConjugate Gradient method for Least Squares (CGLS) † Need: A can be rectangular (non-square); † Cost: 2 matrix-vector products (one with](https://reader030.vdocuments.net/reader030/viewer/2022040304/5e95c22f9289875be229686d/html5/thumbnails/8.jpg)
E. Somersalo
Determination of the minimizer
Perform sequential linear searches along ATA-conjugate directions
p0, p1, . . . , pk−1
that span Kk(ATA,ATb).
Determine xk from xk−1 and pk−1 according to
xk := xk−1 + αk−1pk−1
where αk−1 ∈ R solves
minα∈R
Φ(xk−1 + αpk−1).
Computational Methods in Inverse Problems, Mat–1.3626 0-7
![Page 9: Conjugate Gradient algorithm - Aalto SCI MSConjugate Gradient method for Least Squares (CGLS) † Need: A can be rectangular (non-square); † Cost: 2 matrix-vector products (one with](https://reader030.vdocuments.net/reader030/viewer/2022040304/5e95c22f9289875be229686d/html5/thumbnails/9.jpg)
E. Somersalo
Residual Error
Introduce the residual error associated with xk:
rk := ATb−ATAxk.
Thenpk := rk + βk−1pk−1
Choose βk−1 so that pk is ATA-conjugate to the previous search directions:
pTk ATApj = 0, 1 ≤ j ≤ k − 1.
Computational Methods in Inverse Problems, Mat–1.3626 0-8
![Page 10: Conjugate Gradient algorithm - Aalto SCI MSConjugate Gradient method for Least Squares (CGLS) † Need: A can be rectangular (non-square); † Cost: 2 matrix-vector products (one with](https://reader030.vdocuments.net/reader030/viewer/2022040304/5e95c22f9289875be229686d/html5/thumbnails/10.jpg)
E. Somersalo
Discrepancy
The discrepancy associated with x is
dk = b−Axk.
Thenrk = ATdk
It was shown by Hestenes and Stiefel that
‖dk+1‖ ≤ ‖dk‖; ‖xk+1‖ ≥ ‖xk‖.
Computational Methods in Inverse Problems, Mat–1.3626 0-9
![Page 11: Conjugate Gradient algorithm - Aalto SCI MSConjugate Gradient method for Least Squares (CGLS) † Need: A can be rectangular (non-square); † Cost: 2 matrix-vector products (one with](https://reader030.vdocuments.net/reader030/viewer/2022040304/5e95c22f9289875be229686d/html5/thumbnails/11.jpg)
E. Somersalo
Algorithm (CGLS)
x0 := 0; d0 = b; r0 = ATb;p0 = r0; t0 = Ap0;
for k = 1, 2, . . . until stopping criterion is satisfied
αk = ‖rk−1‖2/‖tk−1‖2
xk = xk−1 + αkpk−1;
dk = dk−1 − αktk−1;
rk = ATdk;
βk = ‖rk‖2/‖rk−1‖2;pk = rk + βkpk−1;
tk = Apk;
end
Computational Methods in Inverse Problems, Mat–1.3626 0-10
![Page 12: Conjugate Gradient algorithm - Aalto SCI MSConjugate Gradient method for Least Squares (CGLS) † Need: A can be rectangular (non-square); † Cost: 2 matrix-vector products (one with](https://reader030.vdocuments.net/reader030/viewer/2022040304/5e95c22f9289875be229686d/html5/thumbnails/12.jpg)
E. Somersalo
Example: a Toy Problem
An invertible 2× 2-matrix A,
yj = Ax∗ + εj , j = 1, 2.
Preimages,xj = A−1yj , j = 1, 2,
Computational Methods in Inverse Problems, Mat–1.3626 0-11
![Page 13: Conjugate Gradient algorithm - Aalto SCI MSConjugate Gradient method for Least Squares (CGLS) † Need: A can be rectangular (non-square); † Cost: 2 matrix-vector products (one with](https://reader030.vdocuments.net/reader030/viewer/2022040304/5e95c22f9289875be229686d/html5/thumbnails/13.jpg)
E. Somersalo
x*
x1
x2
A
y*
y1
y2
A−1
Computational Methods in Inverse Problems, Mat–1.3626 0-12
![Page 14: Conjugate Gradient algorithm - Aalto SCI MSConjugate Gradient method for Least Squares (CGLS) † Need: A can be rectangular (non-square); † Cost: 2 matrix-vector products (one with](https://reader030.vdocuments.net/reader030/viewer/2022040304/5e95c22f9289875be229686d/html5/thumbnails/14.jpg)
E. Somersalo
solution by iterative methods: semiconvergence.
WriteB = Ax∗ + E, E ∼ N (0, σ2I),
and generate a sample of data vectors, b1, b2, . . . , bn
Solve with CG
Computational Methods in Inverse Problems, Mat–1.3626 0-13
![Page 15: Conjugate Gradient algorithm - Aalto SCI MSConjugate Gradient method for Least Squares (CGLS) † Need: A can be rectangular (non-square); † Cost: 2 matrix-vector products (one with](https://reader030.vdocuments.net/reader030/viewer/2022040304/5e95c22f9289875be229686d/html5/thumbnails/15.jpg)
E. Somersalo
Computational Methods in Inverse Problems, Mat–1.3626 0-14
![Page 16: Conjugate Gradient algorithm - Aalto SCI MSConjugate Gradient method for Least Squares (CGLS) † Need: A can be rectangular (non-square); † Cost: 2 matrix-vector products (one with](https://reader030.vdocuments.net/reader030/viewer/2022040304/5e95c22f9289875be229686d/html5/thumbnails/16.jpg)
E. Somersalo
When should one stop iterating?
LetAx = b∗ + ε = Ax∗ + ε = b,
Approximate information‖ε‖ ≈ η,
where η > 0 is known. Write
‖A(x− x∗)‖ = ‖ε‖ ≈ η
Any solution satisfying‖Ax− b‖ ≤ τη
is reasonable.
Morozov discrepancy principle
Computational Methods in Inverse Problems, Mat–1.3626 0-15
![Page 17: Conjugate Gradient algorithm - Aalto SCI MSConjugate Gradient method for Least Squares (CGLS) † Need: A can be rectangular (non-square); † Cost: 2 matrix-vector products (one with](https://reader030.vdocuments.net/reader030/viewer/2022040304/5e95c22f9289875be229686d/html5/thumbnails/17.jpg)
E. Somersalo
Example: numerical differentiation
Let f : [0, 1] → R be a differentiable function, f(0) = 0.
Data = f(tj)+ noise, tj =j
n, j = 1, 2, . . . n.
Problem: Estimate f ′(tj).
Direct numerical differentiation by, e.g., finite difference formula does notwork: the noise takes over.
Where is the inverse problem?
Computational Methods in Inverse Problems, Mat–1.3626 0-16
![Page 18: Conjugate Gradient algorithm - Aalto SCI MSConjugate Gradient method for Least Squares (CGLS) † Need: A can be rectangular (non-square); † Cost: 2 matrix-vector products (one with](https://reader030.vdocuments.net/reader030/viewer/2022040304/5e95c22f9289875be229686d/html5/thumbnails/18.jpg)
E. Somersalo
Data
−3 −2 −1 0 1 2 3−0.5
0
0.5
1
1.5
2
2.5
Noise level 5%
Computational Methods in Inverse Problems, Mat–1.3626 0-17
![Page 19: Conjugate Gradient algorithm - Aalto SCI MSConjugate Gradient method for Least Squares (CGLS) † Need: A can be rectangular (non-square); † Cost: 2 matrix-vector products (one with](https://reader030.vdocuments.net/reader030/viewer/2022040304/5e95c22f9289875be229686d/html5/thumbnails/19.jpg)
E. Somersalo
Solution by finite difference
−3 −2 −1 0 1 2 3−1.5
−1
−0.5
0
0.5
1
1.5
2
2.5
Computational Methods in Inverse Problems, Mat–1.3626 0-18
![Page 20: Conjugate Gradient algorithm - Aalto SCI MSConjugate Gradient method for Least Squares (CGLS) † Need: A can be rectangular (non-square); † Cost: 2 matrix-vector products (one with](https://reader030.vdocuments.net/reader030/viewer/2022040304/5e95c22f9289875be229686d/html5/thumbnails/20.jpg)
E. Somersalo
Formulation as an inverse problem
Denote g(t) = f ′(t). Then,
f(t) =∫ t
0
g(τ)dτ.
Linear model:
Data = yj = f(tj) + ej =∫ tj
0
g(τ)dτ + ej ,
where ej is the noise.
Computational Methods in Inverse Problems, Mat–1.3626 0-19
![Page 21: Conjugate Gradient algorithm - Aalto SCI MSConjugate Gradient method for Least Squares (CGLS) † Need: A can be rectangular (non-square); † Cost: 2 matrix-vector products (one with](https://reader030.vdocuments.net/reader030/viewer/2022040304/5e95c22f9289875be229686d/html5/thumbnails/21.jpg)
E. Somersalo
Discretization
Write ∫ tj
0
g(τ)dτ ≈ 1n
j∑
k=1
g(tk).
By denoting g(tk) = xk,y = Ax + e,
where
A =1n
11 1...
. . .1 1
.
Computational Methods in Inverse Problems, Mat–1.3626 0-20