TRANSCRIPT
A new algorithm
for computing quadrature-based bounds in conjugate gradients
Petr Tichý
Czech Academy of Sciences
Charles University in Prague
joint work with
Gérard Meurant and Zdeněk Strakoš
Householder Symposium XIX, June 8–13, 2014,
Spa, Belgium
Problem formulation
Consider a system
Ax = b
where A ∈ Rn×n is symmetric, positive definite.
Without loss of generality, ‖b‖ = 1, x_0 = 0.
The conjugate gradient method
input A, b
r_0 = b, p_0 = r_0
for k = 1, 2, . . . do
    γ_{k−1} = (r_{k−1}^T r_{k−1}) / (p_{k−1}^T A p_{k−1})
    x_k = x_{k−1} + γ_{k−1} p_{k−1}
    r_k = r_{k−1} − γ_{k−1} A p_{k−1}
    δ_k = (r_k^T r_k) / (r_{k−1}^T r_{k−1})
    p_k = r_k + δ_k p_{k−1}
    test quality of x_k
end for

The CG coefficients define the diagonal matrix D_k = diag(γ_0^{−1}, . . . , γ_{k−1}^{−1}) and the unit lower bidiagonal matrix L_k with subdiagonal entries √δ_1, . . . , √δ_{k−1}.
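The recurrences above translate directly into code. A minimal NumPy sketch follows; the SPD test matrix, its spectrum, and the iteration count are illustrative assumptions, not from the talk:

```python
import numpy as np

def cg(A, b, iters):
    """Conjugate gradients following the slide's recurrences.
    Returns the final iterate and the coefficients gamma_j, delta_j."""
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    gammas, deltas = [], []
    for _ in range(iters):
        rr = r @ r
        gamma = rr / (p @ (A @ p))   # gamma_{k-1}
        x = x + gamma * p
        r = r - gamma * (A @ p)
        delta = (r @ r) / rr         # delta_k
        p = r + delta * p
        gammas.append(gamma)
        deltas.append(delta)
    return x, gammas, deltas

# hypothetical SPD test problem
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((20, 20)))
A = Q @ np.diag(np.linspace(1, 100, 20)) @ Q.T
b = rng.standard_normal(20)
b /= np.linalg.norm(b)
x, gammas, deltas = cg(A, b, 20)
# residual should be tiny after n steps (exact-arithmetic finite termination)
print(np.linalg.norm(A @ x - b))
```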
How to measure quality of an approximation in CG?
A practically relevant question

Using residual information:
– normwise backward error,
– relative residual norm.
"Using of the residual vector r_k as a measure of the "goodness" of the estimate x_k is not reliable" [Hestenes & Stiefel 1952]

Using error estimates:
– the A-norm of the error,
– the Euclidean norm of the error.
"The function (x − x_k, A(x − x_k)) can be used as a measure of the "goodness" of x_k as an estimate of x." [Hestenes & Stiefel 1952]

The A-norm of the error plays an important role in stopping criteria [Deuflhard 1994], [Arioli 2004], [Jiránek, Strakoš, Vohralík 2006].
The Lanczos algorithm
Let A be symmetric; compute an orthonormal basis of K_k(A, b)

input A, b
v_1 = b/‖b‖, β_0 = 0, v_0 = 0
for k = 1, 2, . . . do
    α_k = v_k^T A v_k
    w = A v_k − α_k v_k − β_{k−1} v_{k−1}
    β_k = ‖w‖
    v_{k+1} = w/β_k
end for

The recurrence coefficients form the k×k tridiagonal matrix T_k with diagonal α_1, . . . , α_k and off-diagonal β_1, . . . , β_{k−1}.
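The three-term recurrence can be sketched in NumPy as follows; the test matrix and the choice k = 8 are illustrative assumptions:

```python
import numpy as np

def lanczos(A, b, k):
    """Lanczos three-term recurrence: orthonormal basis V_k of K_k(A, b)
    and the k-by-k tridiagonal matrix T_k of recurrence coefficients."""
    n = len(b)
    V = np.zeros((n, k + 1))
    V[:, 0] = b / np.linalg.norm(b)
    beta_prev, v_prev = 0.0, np.zeros(n)
    alphas, betas = [], []
    for j in range(k):
        w = A @ V[:, j] - beta_prev * v_prev
        alpha = V[:, j] @ w          # alpha_j = v_j^T A v_j (v_j orthogonal to v_{j-1})
        w = w - alpha * V[:, j]
        beta_prev = np.linalg.norm(w)
        V[:, j + 1] = w / beta_prev
        v_prev = V[:, j]
        alphas.append(alpha)
        betas.append(beta_prev)
    T = np.diag(alphas) + np.diag(betas[:-1], 1) + np.diag(betas[:-1], -1)
    return V[:, :k], T

# hypothetical SPD test problem
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((30, 30)))
A = Q @ np.diag(np.linspace(0.1, 10, 30)) @ Q.T
b = rng.standard_normal(30)
V, T = lanczos(A, b, 8)
```

For small k the computed basis is orthonormal to machine precision and V^T A V reproduces T; in longer runs orthogonality is lost, which is exactly the finite precision issue discussed later in the talk.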
CG versus Lanczos
Let A be symmetric, positive definite

Lanczos: T_k  ←→  CG: D_k, L_k

Both algorithms generate an orthogonal basis of K_k(A, b).
Lanczos uses a three-term recurrence → T_k.
CG uses a coupled two-term recurrence → D_k, L_k.

T_k = L_k D_k L_k^T.
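The relation T_k = L_k D_k L_k^T can be checked numerically: run CG to collect γ_j and δ_j, build L_k and D_k, and compare with the Lanczos tridiagonal for the same A and b. The test matrix below is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 30, 8
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(np.linspace(0.1, 10, n)) @ Q.T   # SPD test matrix (assumption)
b = rng.standard_normal(n)
b /= np.linalg.norm(b)

# --- CG: collect gamma_j and delta_j ---
x = np.zeros(n); r = b.copy(); p = r.copy()
gam, dlt = [], []
for _ in range(k):
    rr = r @ r
    g = rr / (p @ (A @ p)); gam.append(g)
    x = x + g * p
    r = r - g * (A @ p)
    d = (r @ r) / rr; dlt.append(d)
    p = r + d * p

# --- build T_k = L_k D_k L_k^T from the CG coefficients ---
D = np.diag(1.0 / np.array(gam))
L = np.eye(k) + np.diag(np.sqrt(dlt[:k - 1]), -1)
T_cg = L @ D @ L.T

# --- Lanczos T_k directly ---
V = np.zeros((n, k + 1)); V[:, 0] = b
alphas, betas = [], []
beta_prev, v_prev = 0.0, np.zeros(n)
for j in range(k):
    w = A @ V[:, j] - beta_prev * v_prev
    a = V[:, j] @ w; w = w - a * V[:, j]
    beta_prev = np.linalg.norm(w)
    V[:, j + 1] = w / beta_prev; v_prev = V[:, j]
    alphas.append(a); betas.append(beta_prev)
T_lan = np.diag(alphas) + np.diag(betas[:k - 1], 1) + np.diag(betas[:k - 1], -1)

print(np.max(np.abs(T_cg - T_lan)))
```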
CG, Lanczos and Gauss quadrature

CG, Lanczos  ←→  Gauss quadrature, orthogonal polynomials, Jacobi matrices.

At any iteration step k, CG (implicitly) determines the weights and nodes of the k-point Gauss quadrature

∫_ζ^ξ f(λ) dω(λ) = Σ_{i=1}^k ω_i f(θ_i) + R_k[f].
Gauss quadrature for f(λ) ≡ λ^{−1}

Gauss quadrature:

∫_ζ^ξ λ^{−1} dω(λ) = Σ_{i=1}^k ω_i/θ_i + R_k[λ^{−1}].

Lanczos:

(T_n^{−1})_{1,1} = (T_k^{−1})_{1,1} + R_k[λ^{−1}].

CG:

‖x‖_A^2 = Σ_{j=0}^{k−1} γ_j ‖r_j‖^2 + ‖x − x_k‖_A^2,  where τ_k ≡ Σ_{j=0}^{k−1} γ_j ‖r_j‖^2.

Important: R_k[λ^{−1}] > 0.
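The CG identity above can be verified numerically: accumulate τ_k = Σ γ_j‖r_j‖^2 during the iteration and compare τ_k + ‖x − x_k‖_A^2 with ‖x‖_A^2. The test problem is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 25
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(np.linspace(0.5, 50, n)) @ Q.T   # SPD test matrix (assumption)
b = rng.standard_normal(n)
b /= np.linalg.norm(b)
x_true = np.linalg.solve(A, b)

x = np.zeros(n); r = b.copy(); p = r.copy()
tau = 0.0
for k in range(10):
    rr = r @ r
    g = rr / (p @ (A @ p))
    tau += g * rr                    # tau_{k+1} = sum_j gamma_j ||r_j||^2
    x = x + g * p
    r = r - g * (A @ p)
    d = (r @ r) / rr
    p = r + d * p

err = x_true - x
lhs = x_true @ (A @ x_true)          # ||x||_A^2
rhs = tau + err @ (A @ err)          # tau_k + ||x - x_k||_A^2
print(abs(lhs - rhs))
```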
Gauss-Radau quadrature for f(λ) = λ^{−1}

The node µ is prescribed:

∫_ζ^ξ f(λ) dω(λ) = Σ_{i=1}^k ω̃_i f(θ̃_i) + ω̃_{k+1} f(µ) + R_k[f],

where the quadrature part equals (T̃_{k+1}^{−1})_{1,1} ≡ τ̃_{k+1}, T̃_{k+1} is the (k+1)×(k+1) tridiagonal matrix with diagonal α_1, . . . , α_k, α̃_{k+1} and off-diagonal β_1, . . . , β_k, and µ is an eigenvalue of T̃_{k+1}.

Important: if 0 < µ ≤ λ_min, then R_k[λ^{−1}] < 0.
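One standard way to obtain the modified entry α̃_{k+1} is to solve a shifted tridiagonal system so that µ becomes an eigenvalue of the extended matrix: for T̃_{k+1} as above, det(T̃_{k+1} − µI) = 0 exactly when α̃_{k+1} = µ + β_k^2 [(T_k − µI)^{−1}]_{k,k}. A sketch; the small tridiagonal matrix, β_k, and µ are made-up example values:

```python
import numpy as np

def radau_modified_entry(T, beta_k, mu):
    """Entry alpha_tilde_{k+1} that makes mu an eigenvalue of the
    extended tridiagonal T_tilde_{k+1} (Gauss-Radau modification)."""
    k = T.shape[0]
    e_k = np.zeros(k); e_k[-1] = 1.0
    z = np.linalg.solve(T - mu * np.eye(k), beta_k**2 * e_k)
    return mu + z[-1]      # z[-1] = beta_k^2 * [(T - mu I)^{-1}]_{k,k}

# made-up example: a small SPD tridiagonal and a prescribed node mu
T = np.diag([4.0, 5.0, 6.0]) + np.diag([1.0, 2.0], 1) + np.diag([1.0, 2.0], -1)
beta_k, mu = 0.5, 1.0
at = radau_modified_entry(T, beta_k, mu)

T_tilde = np.zeros((4, 4))
T_tilde[:3, :3] = T
T_tilde[3, 2] = T_tilde[2, 3] = beta_k
T_tilde[3, 3] = at
print(min(abs(np.linalg.eigvalsh(T_tilde) - mu)))
```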
Idea of estimating the A-norm of the error
[Golub & Strakoš 1994], [Golub & Meurant 1994, 1997]

Consider two quadrature rules at steps k and k + d, d > 0,

‖x‖_A^2 = τ_k + ‖x − x_k‖_A^2,
‖x‖_A^2 = τ̂_{k+d} + R̂_{k+d}.

Then

‖x − x_k‖_A^2 = τ̂_{k+d} − τ_k + R̂_{k+d}.

Gauss quadrature: R̂_{k+d} > 0 → lower bound.
Radau quadrature: R̂_{k+d} < 0 → upper bound.

How to compute τ̂_{k+d} − τ_k efficiently?
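For the Gauss rule the difference τ_{k+d} − τ_k = Σ_{j=k}^{k+d−1} γ_j‖r_j‖^2 is available directly from the CG coefficients, giving the classical lower bound on ‖x − x_k‖_A^2. A sketch with an assumed test problem and assumed delay d:

```python
import numpy as np

rng = np.random.default_rng(3)
n, k, d = 40, 10, 4
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(np.linspace(0.2, 80, n)) @ Q.T   # SPD test matrix (assumption)
b = rng.standard_normal(n)
b /= np.linalg.norm(b)
x_true = np.linalg.solve(A, b)

x = np.zeros(n); r = b.copy(); p = r.copy()
Delta, errs = [], []
for _ in range(k + d):
    rr = r @ r
    g = rr / (p @ (A @ p))
    Delta.append(g * rr)             # Delta_j = gamma_j ||r_j||^2
    x = x + g * p
    r = r - g * (A @ p)
    dl = (r @ r) / rr
    p = r + dl * p
    e = x_true - x
    errs.append(e @ (A @ e))         # ||x - x_{j+1}||_A^2

lower = sum(Delta[k:k + d])          # tau_{k+d} - tau_k
print(lower <= errs[k - 1])          # Gauss rule gives a lower bound
```

Here `errs[k-1]` is ‖x − x_k‖_A^2 and the slack of the bound is exactly ‖x − x_{k+d}‖_A^2, so larger d tightens the estimate.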
How to compute efficiently τ̂_{k+d} − τ_k?

‖x − x_k‖_A^2 = τ̂_{k+d} − τ_k + R̂_{k+d}.

For numerical reasons it is not convenient to compute τ_k and τ̂_{k+d} explicitly. Instead,

τ̂_{k+d} − τ_k = Σ_{j=k}^{k+d−2} (τ_{j+1} − τ_j) + (τ̂_{k+d} − τ_{k+d−1})
            ≡ Σ_{j=k}^{k+d−2} Δ_j + Δ̂_{k+d−1},

and update the Δ_j's without subtractions. Recall τ_j = (T_j^{−1})_{1,1}.
Golub and Meurant approach

[Golub & Meurant 1994, 1997]: Use the tridiagonal matrices

CG → T_k → T_k − µI → T̃_k

and compute the Δ's using updating strategies; there is no need to store the tridiagonal matrices.

Use the formulas

‖x − x_k‖_A^2 = Σ_{j=k}^{k+d−1} Δ_j + ‖x − x_{k+d}‖_A^2,
‖x − x_k‖_A^2 = Σ_{j=k}^{k+d−2} Δ_j + Δ_{k+d−1}^{(µ)} + R_{k+d}^{(R)}.
CGQL (Conjugate Gradients and Quadrature via Lanczos)

input A, b, x_0, µ
r_0 = b − A x_0, p_0 = r_0
δ_0 = 0, γ_{−1} = 1, c_1 = 1, β_0 = 0, d_0 = 1, α̃_1^{(µ)} = µ
for k = 1, . . . , until convergence do
    CG-iteration(k)
    α_k = 1/γ_{k−1} + δ_{k−1}/γ_{k−2},  β_k^2 = δ_k/γ_{k−1}^2
    d_k = α_k − β_{k−1}^2/d_{k−1},  Δ_{k−1} = ‖r_0‖^2 c_k^2/d_k
    α̃_{k+1}^{(µ)} = µ + β_k^2/(α_k − α̃_k^{(µ)})
    Δ_k^{(µ)} = ‖r_0‖^2 β_k^2 c_k^2 / (d_k (α̃_{k+1}^{(µ)} d_k − β_k^2)),  c_{k+1}^2 = β_k^2 c_k^2 / d_k^2
    Estimates(k, d)
end for
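The CGQL updates can be run alongside plain CG. As a consistency check, the sketch below also evaluates Δ_{k−1} via the direct formula γ_{k−1}‖r_{k−1}‖^2 and Δ_k^{(µ)} via the equivalent closed form appearing later in the talk; the test matrix and µ are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 30
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(np.linspace(0.1, 10, n)) @ Q.T   # SPD test matrix (assumption)
b = rng.standard_normal(n)
b /= np.linalg.norm(b)
mu = 0.05                                # 0 < mu <= lambda_min (assumption)

x = np.zeros(n); r = b.copy(); p = r.copy()
r0sq = r @ r
# CGQL state: d_0 = 1, c_1^2 = 1, beta_0^2 = 0, alpha_tilde_1 = mu
d_prev, c2, beta2_prev, at = 1.0, 1.0, 0.0, mu
g_prev, dl_prev = 1.0, 0.0               # gamma_{-1} = 1, delta_0 = 0
Dmu = r0sq / mu                          # Delta_0^{(mu)} for the direct formula
max_diff = 0.0
for k in range(1, 11):
    rr = r @ r
    g = rr / (p @ (A @ p))
    x = x + g * p
    r = r - g * (A @ p)
    dl = (r @ r) / rr
    p = r + dl * p
    # CGQL updates (tridiagonal entries handled implicitly)
    alpha = 1.0 / g + dl_prev / g_prev
    beta2 = dl / g**2
    d = alpha - beta2_prev / d_prev
    Delta_ql = r0sq * c2 / d
    at_next = mu + beta2 / (alpha - at)
    Dmu_ql = r0sq * beta2 * c2 / (d * (at_next * d - beta2))
    c2 = beta2 * c2 / d**2
    d_prev, beta2_prev, at = d, beta2, at_next
    g_prev, dl_prev = g, dl
    # direct formulas for the same quantities
    Delta_q = g * rr
    s = Dmu - Delta_q
    Dmu = (r @ r) * s / (mu * s + r @ r)
    max_diff = max(max_diff, abs(Delta_ql - Delta_q) / Delta_q,
                   abs(Dmu_ql - Dmu) / Dmu)
print(max_diff)
```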
Our approach

[Meurant & T. 2013]: Update the LDL^T factorizations of T_k and T̃_k,

CG → L_k D_k L_k^T → L̃_k D̃_k L̃_k^T.

We use the tridiagonal matrices only implicitly.
We get very simple formulas for updating Δ_{k−1} and Δ_k^{(µ)}.
In [Meurant & T. 2013], this idea is also used for other types of quadratures (Gauss-Lobatto, anti-Gauss).
CGQ (Conjugate Gradients and Quadrature)
[Meurant & T. 2013]

input A, b, x_0, µ
r_0 = b − A x_0, p_0 = r_0
Δ_0^{(µ)} = ‖r_0‖^2/µ
for k = 1, . . . , until convergence do
    CG-iteration(k)
    Δ_{k−1} = γ_{k−1} ‖r_{k−1}‖^2
    Δ_k^{(µ)} = ‖r_k‖^2 (Δ_{k−1}^{(µ)} − Δ_{k−1}) / ( µ (Δ_{k−1}^{(µ)} − Δ_{k−1}) + ‖r_k‖^2 )
    Estimates(k, d)
end for
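The CGQ update costs a handful of scalar operations per iteration. The sketch below runs it with d = 1 and checks that, with 0 < µ ≤ λ_min, Δ_k^{(µ)} upper-bounds ‖x − x_k‖_A^2; the test matrix and µ are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 30
lam = np.linspace(0.1, 10.0, n)          # assumed spectrum, lambda_min = 0.1
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(lam) @ Q.T
b = rng.standard_normal(n)
b /= np.linalg.norm(b)
x_true = np.linalg.solve(A, b)
mu = 0.05                                # 0 < mu <= lambda_min

x = np.zeros(n); r = b.copy(); p = r.copy()
Dmu = (r @ r) / mu                       # Delta_0^{(mu)}
ok = True
for k in range(1, 13):
    rr_old = r @ r
    g = rr_old / (p @ (A @ p))
    Delta = g * rr_old                   # Delta_{k-1}
    x = x + g * p
    r = r - g * (A @ p)
    p = r + ((r @ r) / rr_old) * p
    s = Dmu - Delta
    Dmu = (r @ r) * s / (mu * s + r @ r) # Delta_k^{(mu)}
    e = x_true - x
    ok = ok and (e @ (A @ e)) <= Dmu * (1 + 1e-8)
print(ok)
```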
Conclusions (theoretical part)
Simple formulas for computing bounds on ‖x− xk‖A.
Almost for free.
Work well also with preconditioning.
Behaviour in finite precision arithmetic?
CG in finite precision arithmetic
Orthogonality is lost, convergence is delayed!
[Figure: ‖x − x_j‖_A in exact (E) and finite precision (FP) arithmetic, together with the loss of orthogonality ‖I − V_j^T V_j‖_F, on a logarithmic scale (10^{−16} to 10^0) over about 120 iterations.]

Identities need not hold in finite precision arithmetic!
Bounds in finite precision arithmetic

Observation: CGQL and CGQ give the same results (up to a small inaccuracy).

Do the bounds correspond to ‖x − x_k‖_A?

Gauss quadrature lower bound → yes [Strakoš & T. 2002].

What about the Gauss-Radau upper bound?

‖x − x_k‖_A^2 = Δ_k^{(µ)} + R_{k+1}^{(R)},
‖x − x_k‖_A ≤ √(Δ_k^{(µ)}).
Gauss-Radau upper bound, exact arithmetic
Strakoš matrix, n = 48, λ_1 = 0.1, λ_n = 1000, ρ = 0.9, d = 1

[Figure: ‖x − x_k‖_A and the upper bounds for µ = 0.5 λ_1 and µ = λ_1, on a logarithmic scale (10^{−3} to 10^1) over about 50 iterations.]
Gauss-Radau upper bound, finite precision arithmetic
Strakoš matrix, n = 48, λ_1 = 0.1, λ_n = 1000, ρ = 0.9, d = 1

[Figure: ‖x − x_k‖_A and the upper bounds for µ = 0.5 λ_1 and µ = λ_1, on a logarithmic scale (10^{−16} to 10^2) over about 120 iterations.]
Gauss-Radau upper bound, finite precision arithmetic
Strakoš matrix, n = 48, λ_1 = 0.1, λ_n = 1000, ρ = 0.9, d = 1
µ = α λ_1, α ∈ (0, 1], α ≈ 1 (red), α ≈ 0 (magenta)

[Figure: √(Δ_k^{(µ)}) for various values of α, iterations 80–120, on a logarithmic scale (10^{−16} to 10^{−2}).]
Gauss-Radau upper bound, finite precision arithmetic
Strakoš matrix, n = 48, λ_1 = 0.1, λ_n = 1000, ρ = 0.9, d = 1
µ = α λ_1, α ∈ (0, 1], α ≈ 1 (red), α ≈ 0 (magenta)

[Figure: √(α Δ_k^{(µ)}) for various values of α, iterations 80–120, on a logarithmic scale (10^{−16} to 10^{−2}).]
Conclusions (numerical observations)
Gauss-Radau upper bound

It seems that √ε is a limiting level for the accuracy of the Gauss-Radau upper bound.

We cannot avoid subtractions in computing this bound. If µ ≈ λ_1, then T_k − µI may be ill conditioned.

Simple formulas → investigation of numerical behaviour.

Understanding can help
– in suggesting another approach,
– in improving the Gauss quadrature lower bound (adaptive choice of d).
Related papers

G. Meurant and P. Tichý, On computing quadrature-based bounds for the A-norm of the error in CG, Numer. Algorithms, 62 (2013), pp. 163–191.
G. H. Golub and G. Meurant, Matrices, Moments and Quadrature with Applications, Princeton University Press, 2010.
Z. Strakoš and P. Tichý, On error estimation in CG and why it works in finite precision computations, ETNA, 13 (2002), pp. 56–80.
G. H. Golub and G. Meurant, Matrices, moments and quadrature II, BIT, 37 (1997), pp. 687–705.
G. H. Golub and Z. Strakoš, Estimates in quadratic formulas, Numer. Algorithms, 8 (1994), pp. 241–268.
Thank you for your attention!