
On two variants of incremental condition estimation

Jurjen Duintjer Tebbens

joint work with

Miroslav Tuma

Institute of Computer Science

Academy of Sciences of the Czech Republic

SNA11, Rožnov pod Radhoštěm, January 27, 2011.

J. Duintjer Tebbens, M. Tuma 2

1. Motivation: BIF

● This work is motivated by the recently introduced Balanced Incomplete Factorization (BIF) method for Cholesky (or LDU) decomposition [Bru, Marín, Mas, Tuma - 2008]

● The method is remarkable, among other things, in that it computes the inverse triangular factors simultaneously during the factorization process.

● In BIF the presence of the inverse could be used to obtain robust dropping criteria

● In this talk we are interested in complete factorization (BIF without dropping any entries)


1. Motivation: BIF

● The main question is: How can the presence of the inverse factors be exploited in complete factorization?

● Perhaps the first thing that comes to mind is to use the inverse triangular factors for improved condition estimation. Recall that

κ(L) = σ_max(L) / σ_min(L) = ‖L‖ ‖L^{-1}‖.

● In this talk we concentrate on estimation of the 2-norm condition number.

● We will see that exploiting the inverse factors for better condition estimation is possible, but not as straightforward as it may seem.


2. Condition estimation

Traditionally, 2-norm condition estimators assume a triangular decomposition and compute estimates for the factors. E.g., if A is symmetric positive definite with Cholesky decomposition

A = L L^T,

then the condition number of A satisfies

κ(A) = κ(L)^2 = κ(L^T)^2.
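As a quick sanity check (our own illustration, not part of the talk), the identity can be verified numerically in Python/NumPy: the singular values of A = L L^T are exactly the squares of the singular values of L.

```python
import numpy as np

# kappa(A) = kappa(L)^2: the singular values of A = L L^T are the
# squares of the singular values of L.
rng = np.random.default_rng(0)
B = rng.standard_normal((30, 30))
A = B @ B.T                      # symmetric positive definite
L = np.linalg.cholesky(A)        # A = L L^T
print(np.linalg.cond(A), np.linalg.cond(L) ** 2)  # agree to rounding error
```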

κ(L^T) can be cheaply estimated with a technique called incremental condition number estimation. Main idea: successive estimation of leading submatrices:

[Diagram: the estimate for the leading k × k submatrix of L^T is updated to the (k+1) × (k+1) submatrix] ⇒ all columns are accessed only once.


2. Condition estimation

We will call the original incremental technique, introduced by Bischof [Bischof - 1990], simply incremental condition estimation (ICE):

Let R be upper triangular with a given approximate maximal singular value σ_max^ICE(R) and corresponding approximate singular vector y, ‖y‖ = 1,

σ_max^ICE(R) = ‖y^T R‖ ≈ σ_max(R) = max_{‖x‖=1} ‖x^T R‖.

ICE approximates the maximal singular value of the extended matrix

R′ = [R, v; 0, γ]

by maximizing ‖(s y^T, c) R′‖ over all s, c satisfying c^2 + s^2 = 1.


2. Condition estimation

We have

max_{c^2+s^2=1} ‖(s y^T, c) [R, v; 0, γ]‖^2
= max_{c^2+s^2=1} (s y^T, c) [R, v; 0, γ] [R^T, 0; v^T, γ] [s y; c]
= max_{c^2+s^2=1} (s, c) [σ_max^ICE(R)^2 + (y^T v)^2, γ(v^T y); γ(v^T y), γ^2] [s; c].

The maximum is attained at the normalized eigenvector corresponding to the maximal eigenvalue λ_max(B_ICE) of

B_ICE ≡ [σ_max^ICE(R)^2 + (y^T v)^2, γ(v^T y); γ(v^T y), γ^2].

We denote the normalized eigenvector by (s; c) and put y ≡ [s y; c].


2. Condition estimation

Then we define the approximate maximal singular value of the extended matrix as

σ_max^ICE(R′) ≡ ‖y^T R′‖ ≈ σ_max(R′).

Similarly, if for some y with unit norm

σ_min^ICE(R) = ‖y^T R‖ ≈ σ_min(R) = min_{‖x‖=1} ‖x^T R‖,

then ICE uses the minimal eigenvalue λ_min(B_ICE) of

B_ICE = [σ_min^ICE(R)^2 + (y^T v)^2, γ(v^T y); γ(v^T y), γ^2].

The corresponding eigenvector of B_ICE yields the new vector y and

σ_min^ICE(R′) ≡ ‖y^T R′‖ ≈ σ_min(R′).


2. Condition estimation

Experiment:

● We generate 50 random matrices B of dimension 100 with the Matlab command B = randn(100, 100).

● We compute the Cholesky decompositions L L^T of the 50 symmetric positive definite matrices A = B B^T.

● We compute the estimates σ_max^ICE(L^T) and σ_min^ICE(L^T).

● In the following graph we display the quality of the estimates through the number

(σ_max^ICE(L^T) / σ_min^ICE(L^T))^2 / κ(A),

where κ(A) is the true condition number. Note that we always have

(σ_max^ICE(L^T) / σ_min^ICE(L^T))^2 ≤ κ(A).
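The ICE recurrence above and a scaled-down version of this experiment can be sketched in Python with NumPy (a hedged analogue of the MATLAB setup; the function name ice_estimates is ours, not code from the talk):

```python
import numpy as np

def ice_estimates(R):
    """Incremental condition estimation (ICE, Bischof 1990) for an upper
    triangular R: process the leading submatrices column by column and
    track approximate extreme singular values with left singular vectors."""
    smax = smin = abs(R[0, 0])          # exact for the 1x1 leading block
    ymax = ymin = np.array([1.0])
    for k in range(1, R.shape[0]):
        v, gamma = R[:k, k], R[k, k]    # new column of the extended matrix
        updated = []
        for sig, y, idx in ((smax, ymax, 1), (smin, ymin, 0)):
            a = y @ v
            B = np.array([[sig**2 + a**2, gamma * a],
                          [gamma * a,     gamma**2]])
            lam, W = np.linalg.eigh(B)  # eigenvalues in ascending order
            s, c = W[:, idx]            # idx=1: maximize, idx=0: minimize
            updated.append((np.sqrt(max(lam[idx], 0.0)),
                            np.append(s * y, c)))   # unit norm: s^2+c^2=1
        (smax, ymax), (smin, ymin) = updated
    return smax, smin

# Mini version of the experiment: the quality ratio never exceeds 1.
rng = np.random.default_rng(0)
B = rng.standard_normal((100, 100))
A = B @ B.T
R = np.linalg.cholesky(A).T             # R = L^T, upper triangular
smax, smin = ice_estimates(R)
quality = (smax / smin) ** 2 / np.linalg.cond(A)
print(quality)
```

By construction σ_max^ICE ≤ σ_max and σ_min^ICE ≥ σ_min (each estimate is ‖y^T R‖ for some unit vector y), which is why the quality ratio is at most 1.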


2. Condition estimation

[Figure] Quality of the estimator ICE for 50 random s.p.d. matrices of dimension 100.


2. Condition estimation

With the decomposition method from [Bru, Marín, Mas, Tuma - 2008] we have at our disposal not only the Cholesky decomposition of A,

A = L L^T,

but also the inverse Cholesky factors, as is the case in balanced factorization, i.e. we have

A^{-1} = L^{-T} L^{-1}.

Then we can run ICE on L^{-T} and use the additional estimates

1/σ_max^ICE(L^{-T}) ≈ σ_min(L^T),   1/σ_min^ICE(L^{-T}) ≈ σ_max(L^T).

In the following graph we take the best of both estimates; we display

(max(σ_max^ICE(L^T), σ_min^ICE(L^{-T})^{-1}) / min(σ_min^ICE(L^T), σ_max^ICE(L^{-T})^{-1}))^2 / κ(A).
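These reciprocal relations rest on the fact that the singular values of L^{-T} are exactly the reciprocals of those of L^T, so an estimate of σ_max(L^{-T}) is implicitly an estimate of 1/σ_min(L^T). A small Python check of this fact (our own illustration):

```python
import numpy as np

# Singular values of L^{-T} are the reciprocals of those of L^T.
rng = np.random.default_rng(1)
B = rng.standard_normal((30, 30))
A = B @ B.T
Rt = np.linalg.cholesky(A).T                 # L^T, upper triangular
Rinv = np.linalg.inv(Rt)                     # L^{-T}, again triangular
s = np.linalg.svd(Rt, compute_uv=False)      # descending order
si = np.linalg.svd(Rinv, compute_uv=False)
print(np.max(np.abs(s - 1.0 / si[::-1]) / s))  # small relative gap
```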


2. Condition estimation

[Figure] Quality of ICE for 50 random s.p.d. matrices of dimension 100.


2. Condition estimation

[Figure] Quality of the estimator ICE without (black) and with exploiting (green) the inverse for 50 random s.p.d. matrices of dimension 100.


2. Condition estimation

Plugging in the inverse gives no improvement!

By comparing

σ_max^ICE(R′),  R′ = [R, v; 0, γ]

with

1/σ_min^ICE((R′)^{-1}),  (R′)^{-1} = [R^{-1}, −R^{-1} v / γ; 0, 1/γ],

it can be proven that

σ_max^ICE(A) = 1/σ_min^ICE(A^{-1}),   σ_min^ICE(A) = 1/σ_max^ICE(A^{-1}),

which we denote as ICE(A) = ICE(A^{-1}).
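The block formula for (R′)^{-1} used in this comparison is easy to verify numerically (a Python sketch of our own, with an arbitrary well-conditioned example):

```python
import numpy as np

# Verify inv([R, v; 0, gamma]) = [R^{-1}, -R^{-1} v / gamma; 0, 1/gamma].
rng = np.random.default_rng(2)
R = np.triu(rng.standard_normal((4, 4))) + 4 * np.eye(4)  # well conditioned
v = rng.standard_normal((4, 1))
gamma = 2.0
Rp = np.block([[R, v], [np.zeros((1, 4)), np.array([[gamma]])]])
Rinv = np.linalg.inv(R)
Rp_inv = np.block([[Rinv, -Rinv @ v / gamma],
                   [np.zeros((1, 4)), np.array([[1.0 / gamma]])]])
print(np.max(np.abs(Rp @ Rp_inv - np.eye(5))))  # close to machine zero
```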


2. Condition estimation

An alternative technique called incremental norm estimation (INE) was proposed in [Duff, Vömel - 2002]:

Let R be upper triangular with a given approximate maximal singular value σ_max^INE(R) and corresponding approximate right singular vector z, ‖z‖ = 1,

σ_max^INE(R) = ‖R z‖ ≈ σ_max(R) = max_{‖x‖=1} ‖R x‖.

INE approximates the maximal singular value of the extended matrix

R′ = [R, v; 0, γ]

by maximizing ‖R′ [s z; c]‖ over all s, c satisfying c^2 + s^2 = 1.


2. Condition estimation

We have

max_{c^2+s^2=1} ‖[R, v; 0, γ] [s z; c]‖^2
= max_{c^2+s^2=1} (s z^T, c) [R^T, 0; v^T, γ] [R, v; 0, γ] [s z; c]
= max_{c^2+s^2=1} (s, c) [z^T R^T R z, z^T R^T v; z^T R^T v, v^T v + γ^2] [s; c].

The maximum is attained at the normalized eigenvector corresponding to the maximal eigenvalue λ_max(B_INE) of

B_INE ≡ [z^T R^T R z, z^T R^T v; z^T R^T v, v^T v + γ^2].

We denote the normalized eigenvector by (s; c) and put z ≡ [s z; c].


2. Condition estimation

Then we define the approximate maximal singular value of the extended matrix as

σ_max^INE(R′) ≡ ‖R′ z‖ ≈ σ_max(R′).

Similarly, if for a unit vector z

σ_min^INE(R) = ‖R z‖ ≈ σ_min(R) = min_{‖x‖=1} ‖R x‖,

then INE uses the minimal eigenvalue λ_min(B_INE) of

B_INE = [z^T R^T R z, z^T R^T v; z^T R^T v, v^T v + γ^2].

The corresponding eigenvector of B_INE yields the new vector z and

σ_min^INE(R′) ≡ ‖R′ z‖ ≈ σ_min(R′).


2. Condition estimation

By comparing

σ_max^INE(R′),  R′ = [R, v; 0, γ]

with

1/σ_min^INE((R′)^{-1}),  (R′)^{-1} = [R^{-1}, −R^{-1} v / γ; 0, 1/γ],

it can be checked that for INE we have in general

INE(A) ≠ INE(A^{-1}) !

Let us demonstrate this in the same experiment as before.


2. Condition estimation

[Figure] Quality of the estimator INE(A).


2. Condition estimation

[Figure] Quality of the estimator INE(A) (black) and of INE(A^{-1}) (green).


2. Condition estimation

[Figure] Quality of the estimators INE(A) (black), INE(A^{-1}) (green) and best of both INE(A) and INE(A^{-1}) (red).


2. Condition estimation

[Figure] Quality of the estimators INE(A) (black), INE(A^{-1}) (green), best of both INE(A) and INE(A^{-1}) (red) and ICE(A) (blue).


2. Condition estimation

Why this improvement?

● In general, both ICE and INE give a satisfactory approximation of σ_max(R), though INE tends to be better.

● The problem is to approximate σ_min(R), for ICE as well as for INE.

● The trick is to translate this into the problem of finding the maximal singular value σ_max(R^{-1}) of R^{-1}, which is in general done better with INE than with ICE.

● This has an important impact on the estimate because σ_min(R) is typically small and appears in the denominator of σ_max(R)/σ_min(R).

We see that the main reason for the improvement is that INE tends to give a better estimate of maximal singular values. And why is that?


2. Condition estimation

Note: INE does not always give a better estimate of the maximal singular value. But we have the following rather technical result.

Theorem [DT, Tuma - ?]. Consider condition estimation of the matrix

R′ = [R, v; 0, γ],

where both ICE and INE start with the same approximation of σ_max(R), denoted by δ. Let y, ‖y‖ = 1, be the approximate singular vector for ICE,

δ = ‖y^T R‖ ≈ σ_max(R),

and let z, ‖z‖ = 1, be the approximate singular vector for INE,

δ = ‖R z‖ ≈ σ_max(R).


2. Condition estimation

Theorem (continued). Then we have superiority of INE,

σ_max^INE(R′) ≥ σ_max^ICE(R′),

if

(v^T R z)^2 ≥ δ^2 (v^T y)^2 + (1/2) (v^T v − (v^T y)^2) (α − √(α^2 + 4δ^2 (v^T y)^2)),

where α = δ^2 − γ^2 − (v^T y)^2.

Let us use the notation

ρ ≡ δ^2 (v^T y)^2 + (1/2) (v^T v − (v^T y)^2) (α − √(α^2 + 4δ^2 (v^T y)^2)).

If ρ ≤ 0, then INE is unconditionally superior to ICE (i.e. regardless of the approximate singular vector z).
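The bound ρ is cheap to evaluate; a small Python sketch (function name ours) makes the criterion concrete:

```python
import numpy as np

def rho_criterion(delta, gamma, v, y):
    """rho from the theorem: INE is superior whenever (v^T R z)^2 >= rho,
    and unconditionally superior when rho <= 0. Here delta is the common
    starting estimate of sigma_max(R) and y the ICE vector, ||y|| = 1."""
    a2 = (v @ y) ** 2
    alpha = delta**2 - gamma**2 - a2
    disc = np.sqrt(alpha**2 + 4 * delta**2 * a2)
    return delta**2 * a2 + 0.5 * (v @ v - a2) * (alpha - disc)
```

Note that v^T v − (v^T y)^2 ≥ 0 by the Cauchy-Schwarz inequality (‖y‖ = 1) and α − √(α^2 + 4δ^2(v^T y)^2) ≤ 0, so the second term is never positive; when it outweighs δ^2(v^T y)^2 we get ρ ≤ 0 and unconditional superiority of INE.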


2. Condition estimation

To conclude, we demonstrate the previous theorem.

● Assume that at some stage of an incremental condition estimation process we have σ_max^ICE(R) = σ_max^INE(R) = 1.

● Consider possible vectors v in the new columns of R′ that have unit norm, i.e. v^T v = 1.

● Then (v^T y)^2 ≤ 1. The x-axes of the following figures represent the possible values of (v^T y)^2 ≤ 1.

● The y-axes represent values of γ^2, i.e. the square of the new diagonal entry.

● The superiority criterion for INE, expressed by the value of ρ, is given by the z-axes.


2. Condition estimation

[Figure] Value of ρ in dependence on (v^T y)^2 (x-axis) and γ^2 (y-axis) with ‖v‖_2 = 1.


2. Condition estimation

[Figure] Value of ρ in dependence on (v^T y)^2 (x-axis) and γ^2 (y-axis) with ‖v‖_2 = 10.


2. Condition estimation

● Exploiting the presence of inverse factors combined with INE gives a significant improvement of incremental condition estimation.

● This may be an important advantage of methods where inverse triangular factors are just a by-product of the factorization.

● We did not consider sparse Cholesky factors, which ask for a modified ICE [Bischof, Pierce, Lewis - 1990].

● Matlab does not offer a 2-norm condition number estimator, only a 1-norm condition number estimator (previous versions offered the 2-norm condition number estimator ICE).


Thank you for your attention!

Supported by project number IAA100300802 of the Grant Agency of the Academy of Sciences of the Czech Republic.
