Chapter 7Bounds and estimates for elements of f (A)
Gerard MEURANT
January-February, 2012
1 Introduction
2 Analytic bounds
3 Examples
4 Numerical experiments
5 Estimates of traces and determinants
Introduction
We would like to compute bounds or estimates of
[f (A)]i ,j = (e i )T f (A)e j
I Analytic bounds
I Numerical experiments for f (λ) = 1/λ, exp(λ),√
λ
I Estimates of trace(A−1) and det(A)
Analytic boundsPerforming analytically one or two Lanczos iterations, we are ableto obtain bounds for the entries of A−1
TheoremLet A be a symmetric positive definite matrix. Let
s2i =
∑j 6=i
a2ji , i = 1, . . . , n
Using the Gauss, Gauss–Radau and Gauss–Lobatto rules∑k 6=i
∑l 6=i ak,iak,lal ,i
ai ,i∑
k 6=i
∑l 6=i ak,iak,lal ,i −
(∑k 6=i a
2k,i
)2≤ (A−1)i ,i
ai ,i − b +s2ib
a2i ,i − ai ,ib + s2
i
≤ (A−1)i ,i ≤ai ,i − a +
s2ia
a2i ,i − ai ,ia + s2
i
(A−1)i ,i ≤a + b − aii
ab
Compute analytically α1, η1, α2, the inverse of
J2 =
(α1 η1
η1 α2
)is
J−12 =
1
α1α2 − η21
(α2 −η1
−η1 α1
)For Gauss–Radau we have to modify the (2, 2) element of J2
Using the nonsymmetric Lanczos algorithm
TheoremLet A be a symmetric positive definite matrix and
ti =∑k 6=i
ak,i (ak,i + ak,j)− ai ,j(ai ,j + ai ,i )
For (A−1)i ,j + (A−1)i ,i we have the two following estimates
ai ,i + ai ,j − a + tia
(ai ,i + ai ,j)2 − a(ai ,i + ai ,j) + ti,
ai ,i + ai ,j − b + tib
(ai ,i + ai ,j)2 − b(ai ,i + ai ,j) + ti
If ti ≥ 0, the first expression with a gives an upper bound and thesecond one with b a lower bound
Other functions
We have to compute f (J) for
J =
(α ηη ξ
)
Proposition
Let δ = (α− ξ)2 + 4η2
γ = exp
(1
2(α + ξ −
√δ)
), ω = exp
(1
2(α + ξ +
√δ)
)The (1, 1) element of the exponential of J is
1
2
[γ + ω +
ω − γ√δ
(α− ξ)
]
TheoremLet
λ+ =1
2(α + ξ +
√δ), λ− =
1
2(α + ξ −
√δ)
The (1, 1) element of f (J) is
1
2√
δ
[(α− ξ)(f (λ+)− f (λ−)) +
√δ(f (λ+) + f (λ−))
]
We can obtain analytic bounds for the (i , i) element of f (A) forany function for which we can compute f (λ+) and f (λ−)
Examples
Example F1This is an example of dimension 10
A =1
11
10 9 8 7 6 5 4 3 2 19 18 16 14 12 10 8 6 4 28 16 24 21 18 15 12 9 6 37 14 21 28 24 20 16 12 8 46 12 18 24 30 25 20 15 10 55 10 15 20 25 30 24 18 12 64 8 12 16 20 24 28 21 14 73 6 9 12 15 18 21 24 16 82 4 6 8 10 12 14 16 18 91 2 3 4 5 6 7 8 9 10
The inverse of A is a tridiagonal matrix
A−1 =
2 −1−1 2 −1
. . .. . .
. . .
−1 2 −1−1 2
Example F3This is an example proposed by Z. Strakos. Let Λ be a diagonalmatrix
λi = λ1 +
(i − 1
n − 1
)(λn − λ1)ρ
n−i , i = 1, . . . , n
Let Q be the orthogonal matrix of the eigenvectors of thetridiagonal matrix (−1, 2, −1). Then the matrix is
A = QTΛQ
We will use λ1 = 0.1, λn = 100 and ρ = 0.9
Example F4The matrix is arising from the 5–point finite differenceapproximation of the Poisson equation in a unit square with anm ×m meshThis gives a linear system Ax = c of order m2
A =
T −I−I T −I
. . .. . .
. . .
−I T −I−I T
Each block is of order m and
T =
4 −1−1 4 −1
. . .. . .
. . .
−1 4 −1−1 4
Diagonal elements
Example F1, GL, A−15,5 = 2
rule Nit=1 2 3 4 5 6 7
G 0.3667 1.3896 1.7875 1.9404 1.9929 1.9993 2
G–R bL 1.3430 1.7627 1.9376 1.9926 1.9993 2.0000 2
G–R bU 3.0330 2.2931 2.1264 2.0171 2.0020 2.0001 2
G–L 3.1341 2.3211 2.1356 2.0178 2.0021 2.0001 2
Example F3, GL, n = 100, A−150,50 = 4.2717
Nit G G–R bL G–R bU G–L
10 2.7850 3.0008 5.1427 5.1664
20 4.0464 4.0505 4.4262 4.4643
30 4.2545 4.2553 4.2883 4.2897
40 4.2704 4.2704 4.2728 4.2733
50 4.2716 4.2716 4.2718 4.2718
60 4.2717 4.2717 4.2717 4.2717
Example F4, GL, n = 900, A−1150,150 = 0.3602
Nit G G–R bL G–R bU G–L
10 0.3578 0.3581 0.3777 0.3822
20 0.3599 0.3599 0.3608 0.3609
30 0.3601 0.3601 0.3602 0.3602
40 0.3602 0.3602 0.3602 0.3602
Non–diagonal elements with the nonsymmetric Lanczosalgorithm
Example F1, GNS, A−12,2 + A−1
2,1 = 1
rule Nit=1 2 4 5 6 7
G 0.4074 0.6494 0.9512 0.9998 1.0004 1
G–R bL 0.6181 0.8268 0.9998 1.0004 1.0001 1
G–R bU 2.6483 1.4324 1.0035 1.0012 0.9994 1
G–L 3.2207 1.4932 1.0036 1.0012 0.9993 0.9994
Example F3, GNS, n = 100, A−150,50 + A−1
50,49 = 1.4394
Nit G G–R bL G–R bU G–L
10 0.8795 0.9429 2.2057 2.2327
20 1.3344 1.3362 1.5535 1.5839
30 1.4301 1.4308 1.4510 1.4516
40 1.4386 1.4387 1.4404 1.4404
50 1.4394 1.4394 1.4395 1.4395
60 1.4394 1.4394 1.4394 1.4394
Example F4, GNS, n = 900, A−1150,150 + A−1
150,50 = 0.3665
Nit G G–R bL G–R bU G–L
10 0.3611 0.3615 0.3917 0.3979
20 0.3656 0.3657 0.3678 0.3680
30 0.3663 0.3664 0.3666 0.3666
40 0.3665 0.3665 0.3665 0.3665
Non–diagonal elements with the block Lanczos algorithm
Let (J−1k )1,1 the 2× 2 (1, 1) block of the inverse of Jk with
Jk =
Ω1 ΓT
1
Γ1 Ω2 ΓT2
. . .. . .
. . .
Γk−2 Ωk−1 ΓTk−1
Γk−1 Ωk
∆1 = Ω1, ∆i = Ωi − Γi−1Ω
−1i−1Γ
Ti−1, i = 2, . . . , k
Ck = ∆−11 ΓT
1 ∆−12 ΓT
2 · · ·∆−1k−1Γ
Tk−1∆
−1k ΓT
k
(J−1k+1)1,1 = (J−1
k )1,1 + Ck∆−1k+1C
Tk
Going from step k to step k + 1 we compute Ck+1 incrementallyNote that we can reuse Ck∆−1
k+1 to compute Ck+1
Example F3, GB, n = 100, A−12,1 = −3.2002
Nit G G–R bL G–R bU G–L
2 -3.0808 -3.0948 -3.9996 -4.1691
3 -3.1274 -3.1431 -3.5655 -3.6910
4 -3.2204 -3.2187 -3.2637 -3.5216
5 -3.2015 -3.2001 -3.1974 -3.2473
6 -3.1969 -3.1966 -3.1964 -3.1969
7 -3.1970 -3.1972 -3.1995 -3.1994
8 -3.1993 -3.1995 -3.2008 -3.1999
9 -3.2001 -3.2001 -3.2005 -3.2008
10 -3.2002 -3.2002 -3.2002 -3.2004
We see that we obtain good approximations but not always boundsAs a bonus we also obtain estimates of A−1
1,1 and A−12,2
Example F4, GB, n = 900, A−1400,100 = 0.0597
Nit G G–R bL G–R bU G–L
10 0.0172 0.0207 0.0632 0.0588
20 0.0527 0.0532 0.0616 0.0621
30 0.0590 0.0591 0.0597 0.0597
40 0.0597 0.0597 0.0597 0.0597
Note that for this problem the Gauss rule gives a lower bound,Gauss–Radau a lower and an upper bound
Dependence on the eigenvalue estimates
We take Example F4 with m = 6We look at the number of Lanczos iterations needed to obtain anupper bound for the element (18, 18) with four exact digits
Example F4, GL, n = 36
a=10−4 10−2 0.1 0.3 0.4 1 6
15 13 11 11 8 8 9
With the exact eigenvalue a = 0.3961 we need 9 Lanczos iterationsNote that it works even when a > λmin
Bounds for the elements of the exponential
Example F3, GL, n = 100, exp(A)50,50 = 5.3217 1041. Results×10−41
Nit G G–R bL G–R bL G–L
2 0.0000 0.0000 7.0288 8.8014
3 0.0075 0.2008 5.6649 6.0776
4 1.0322 2.5894 5.3731 5.4565
5 3.9335 4.7779 5.3270 5.3385
6 5.1340 5.2680 5.3235 5.3232
7 5.3070 5.3178 5.3218 5.3219
8 5.3203 5.3209 5.3218 5.3218
9 5.3212 5.3213 5.3217 5.3217
10 5.3215 5.3217 5.3217 5.3217
11 5.3217 5.3217 5.3217 5.3217
Convergence is faster than with A−1
Example F4, GNS, n = 900, exp(A)50,50 + exp(A)50,49 = 83.8391
rule Nit=2 3 4 5 6 7
G 63.4045 81.4124 83.6607 83.8318 83.8389 83.8391
G–R bL 108.0918 86.3239 83.8796 83.8420 83.8392 83.8391
G–R bU 76.1266 83.7668 83.7781 83.8383 83.8391 83.8391
G–L 163.8043 90.9304 84.1878 83.8530 83.8395 83.8391
Convergence is quite fast
Bounds for the elements of the square root
Example F4, GL, n = 900, (√
A)50,50 = 1.9189
Nit G G–R bL G–R bU G–L
2 1.9319 1.8945 1.9255 1.8697
3 1.9220 1.9112 1.9209 1.9038
4 1.9201 1.9160 1.9197 1.9140
5 1.9195 1.9176 1.9193 1.9169
6 1.9192 1.9183 1.9191 1.9180
7 1.9191 1.9186 1.9190 1.9185
8 1.9190 1.9187 1.9190 1.9187
9 1.9190 1.9188 1.9190 1.9188
10 1.9190 1.9189 1.9190 1.9189
11 1.9190 1.9189 1.9190 1.9189
12 1.9190 1.9189 1.9189 1.9189
13 1.9189 1.9189 1.9189 1.9189
Estimates of traces and determinants
Bai and Golub gave analytic bounds based on the first threemomentsLet
µr = tr(Ar ) =n∑
i=1
λri =
∫ b
aλr dα
be the moments related to α, the measure (that we do not knowexplicitly) with steps at the eigenvalues of AThe first three moments are easily computed
µ0 = n, µ1 = tr(A) =n∑
i=1
ai ,i , µ2 = tr(A2) =n∑
i ,j=1
a2i ,j = ‖A‖2F
However, these bounds are sometimes far from being sharp
One can compute numerically more moments and use theChebyshev algorithmmoments → Jacobi matrix → quadrature rule → µ−1
Example F4, n = 900, Chebyshev, trace(A−1) = 512.6442
k est.
1 225.0000
2 296.7033
3 344.6869
4 375.8398
5 400.0648
6 418.2138
7 433.1216
8 444.9913
9 455.0122
10 463.2337
The moment matrices are ill–conditioned and after k = 10 they arenot positive definite anymore
To solve this problem we use the modified Chebyshev algorithmwith shifted Chebyshev polynomials as auxiliary polynomials
C0(λ) ≡ 1,
(λmax − λmin
2
)C1(λ) = λ−
(λmax + λmin
2
)
(λmax − λmin
4
)Ck+1(λ) =
(λ− λmax + λmin
2
)Ck(λ)
−(
λmax − λmin
4
)Ck−1(λ)
We compute the trace of the matrices Ci (A), i = 0, . . . , k whichare the modified moments
Example F4, n = 900, modified moments, trace(A−1) = 512.6442
k est.
5 400.0648
10 463.2560
15 489.5383
20 502.0008
25 508.0799
30 510.9301
35 512.1385
40 512.5469
Estimates of the determinant
It is based on the following result
Proposition
Let A be a symmetric positive definite matrix
ln(det(A)) = tr(ln(A))
Then
tr(ln(A)) =n∑
i=1
lnλi =
∫ b
alnλ dα
Example F4, n = 400, modified moments, det(A) = 7.7187 10206
k est.
2 1.8705 10217
4 1.4990 10209
6 4.9892 10207
8 1.7268 10207
10 1.1338 10207
12 9.3701 10206
14 8.5330 10206
16 8.1315 10206
18 7.9273 10206
20 7.8210 10206
Z. Bai and G.H. Golub, Bounds for the trace of theinverse and the determinant of symmetric positive definitematrices, Annals Numer. Math., v 4 (1997), pp. 29–38
G.H. Golub and G. Meurant, Matrices, moments andquadrature, in Numerical Analysis 1993, D.F. Griffiths andG.A. Watson eds., Pitman Research Notes in Mathematics,v 303 (1994), pp. 105–156
G. Meurant, Estimates of the trace of the inverse of asymmetric matrix using the modified Chebyshev algorithm,Numerical Algorithms, v 51 n 3 (2009), pp. 309–318