efficient analysis of high-dimensional data in tensor formats

Efficient Analysis of High Dimensional Data in Tensor Formats

M. Espig, W. Hackbusch, A. Litvinenko, H. G. Matthies and E. [email protected], 13 January 2012


Outline

Motivation

Discretisation

Analysis of high dimensional data
    Computing the maximum norm
    Computation of the characteristic

Computation of level sets and frequency

Numerical Experiments



An example of SPDE

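\[
\frac{\partial}{\partial t} u(x,t) = \nabla \cdot \big(\kappa(x,\omega)\,\nabla u(x,t)\big) + f(x,t), \qquad x \in G \subset \mathbb{R}^d, \quad t \in [0,T],
\]
where $\omega \in \Omega$ and $U = L_2(G)$.

For each $\omega \in \Omega$, seek $u(x,t) \in L_2([0,T]) \otimes U$.

Let $S := L_2([0,T]) \otimes L_2(\Omega)$; one is looking for $u(t,x,\omega) \in U \otimes S$.

The further decomposition
\[
L_2(\Omega) = L_2\big(\times_j \Omega_j\big) \cong \bigotimes_j L_2(\Omega_j) \cong \bigotimes_j L_2(\mathbb{R}, \Gamma_j)
\]
results in
\[
u(t,x,\omega) \in U \otimes S = L_2(G) \otimes \Big( L_2([0,T]) \otimes \bigotimes_j L_2(\mathbb{R}, \Gamma_j) \Big).
\]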




Discretisation and Tensorial quantities

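\[
\operatorname{span}\{w_n\}_{n=1}^{N} = U_N \subset U, \qquad \dim U_N = N,
\]
\[
\operatorname{span}\{\tau_k\}_{k=1}^{K} = T_K \subset L_2([0,T]) = S_I, \qquad \dim T_K = K,
\]
\[
\operatorname{span}\{X_{j_m}\}_{j_m=1}^{J_m} = S_{II,J_m} \subset L_2(\mathbb{R}, \Gamma_m) = S_{II}, \qquad \dim S_{II,J_m} = J_m, \qquad 1 \le m \le M.
\]

Let $P := [0,T] \times \Omega$; an approximation to $u : P \to U$ is thus given by
\[
u(x,t,\omega_1,\dots,\omega_M) \approx \sum_{n=1}^{N} \sum_{k=1}^{K} \sum_{j_1=1}^{J_1} \cdots \sum_{j_M=1}^{J_M} u_{n,k}^{j_1,\dots,j_M}\; w_n(x) \otimes \tau_k(t) \otimes \Big( \bigotimes_{m=1}^{M} X_{j_m}(\omega_m) \Big).
\]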




Tensorial quantities

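For all $n = 1,\dots,N$, $k = 1,\dots,K$, $m = 1,\dots,M$, $j_m = 1,\dots,J_m$ we can write
\[
u_{n,k}^{j_1,\dots,j_m,\dots,j_M} = u\big(x_n, t_k, \omega_1^{j_1}, \dots, \omega_M^{j_M}\big),
\]
which has $R'' = N \times K \times \prod_{m=1}^{M} J_m$ terms.

Look for
\[
\big(u_{n,k}^{j_1,\dots,j_m,\dots,j_M}\big) \approx \sum_{\rho=1}^{R'} u_\rho\; w_\rho \otimes \tau_\rho \otimes \Big( \bigotimes_{m=1}^{M} X_{\rho m} \Big),
\]
where $R' \le R''$, $w_\rho \in \mathbb{R}^{N}$, $\tau_\rho \in \mathbb{R}^{K}$, $X_{\rho m} \in \mathbb{R}^{J_m}$.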
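The low-rank representation above can be illustrated with a minimal NumPy sketch. It stores the weights $u_\rho$ and one factor matrix per direction, evaluates a single entry without forming the full tensor, and compares storage counts. The sizes, the random data, and the helper names `cp_entry` and `cp_to_full` are assumptions made for this illustration only.

```python
import numpy as np

def cp_entry(weights, factors, index):
    """Entry of sum_rho weights[rho] * (factors[0][:, rho] x ... x factors[-1][:, rho])."""
    prod = np.array(weights, dtype=float)
    for F, i in zip(factors, index):
        prod = prod * F[i, :]          # multiply in the i-th row of each factor matrix
    return prod.sum()

def cp_to_full(weights, factors):
    """Assemble the full tensor (only feasible for tiny sizes)."""
    full = np.zeros(tuple(F.shape[0] for F in factors))
    for rho, w in enumerate(weights):
        term = np.array(w, dtype=float)
        for F in factors:
            term = np.multiply.outer(term, F[:, rho])
        full += term
    return full

# Illustrative sizes: N spatial, K temporal, two stochastic directions, rank R'.
rng = np.random.default_rng(0)
N, K, J1, J2, R = 4, 3, 2, 5, 3
weights = rng.standard_normal(R)
factors = [rng.standard_normal((n, R)) for n in (N, K, J1, J2)]

full = cp_to_full(weights, factors)
full_size = full.size                      # R'' = N * K * J1 * J2
cp_size = R * (1 + N + K + J1 + J2)        # weights plus factor vectors
```

The entry evaluation costs a number of operations proportional to $R'$ times the number of directions, independent of $R''$.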




Values of interest

With the last tensor representation one wants to perform different tasks:

evaluation for specific parameters (t ,ω1, . . . ,ωM),

finding maxima and minima,

finding ‘level sets’ and quantiles.




Spatial discretisation and PCE

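$U_N := \{\varphi_n(x)\}_{n=1}^{N} \subset U$:
\[
u(x,\omega) = \sum_{n=1}^{N} u_n(\omega)\,\varphi_n(x), \qquad u_n(\theta) = \sum_{\alpha \in \mathcal{J}} u_n^{\alpha}\, H_\alpha(\theta(\omega)),
\]
where $\mathcal{J}$ is taken as a finite subset of $\mathbb{N}_0^{(\mathbb{N})}$ and $R := |\mathcal{J}|$.
\[
\mathbf{u}(\theta) = \sum_{\alpha \in \mathcal{J}} \mathbf{u}^{\alpha}\, H_\alpha(\theta(\omega)),
\]
where $\mathbf{u}^{\alpha} := [u_1^{\alpha}, \dots, u_N^{\alpha}]^T$.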
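As a hedged illustration of the PCE above, the following sketch enumerates a multi-index set and evaluates $\mathbf{u}(\theta) = \sum_\alpha \mathbf{u}^\alpha H_\alpha(\theta)$. It assumes that $\mathcal{J}$ is the total-degree set $\{\alpha : |\alpha| \le p\}$ and that $H_\alpha$ is a product of unnormalised probabilists' Hermite polynomials; the slides fix neither choice, and the helper names are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval

def multi_indices(M, p):
    """All alpha in N_0^M with alpha_1 + ... + alpha_M <= p (assumed choice of J)."""
    if M == 0:
        return [()]
    return [(a,) + rest
            for a in range(p + 1)
            for rest in multi_indices(M - 1, p - a)]

def hermite_alpha(alpha, theta):
    """H_alpha(theta) = prod_k He_{alpha_k}(theta_k), unnormalised (assumption)."""
    value = 1.0
    for a, t in zip(alpha, theta):
        c = np.zeros(a + 1)
        c[a] = 1.0                     # coefficient vector selecting He_a
        value *= hermeval(t, c)
    return value

def eval_pce(coeffs, J, theta):
    """coeffs has shape (N, |J|); returns the vector u(theta) of length N."""
    h = np.array([hermite_alpha(alpha, theta) for alpha in J])
    return coeffs @ h

J = multi_indices(20, 2)               # 20 random variables, degree at most 2
J2 = multi_indices(2, 2)               # small set for a hand-checkable evaluation
coeffs = np.zeros((1, len(J2)))
coeffs[0, J2.index((2, 0))] = 1.0      # u(theta) = He_2(theta_1)
value = eval_pce(coeffs, J2, (3.0, 1.0))
```

With 20 variables and degree at most 2 the total-degree set has $\binom{22}{2} = 231$ elements.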




Discretized equation in tensor form

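KLE:
\[
\kappa(x,\omega) = \kappa_0(x) + \sum_{j=1}^{\infty} \kappa_j\, g_j(x)\, \xi_j(\theta), \qquad \text{where} \quad \xi_j(\theta) = \frac{1}{\kappa_j} \int_G \big(\kappa(x,\omega) - \kappa_0(x)\big)\, g_j(x)\, dx.
\]

Knowing the PCE $\kappa(x,\omega) = \sum_{\alpha} \kappa^{(\alpha)} H_\alpha(\theta)$, compute
\[
\xi_j(\theta) = \sum_{\alpha \in \mathcal{J}} \xi_j^{(\alpha)} H_\alpha(\theta), \qquad \text{where} \quad \xi_j^{(\alpha)} = \frac{1}{\kappa_j} \int_G \kappa^{(\alpha)}(x)\, g_j(x)\, dx.
\]

Further compute
\[
\xi_j^{(\alpha)} \approx \sum_{l=1}^{s} (\xi_l)_j \prod_{k=1}^{\infty} (\xi_{l,k})_{\alpha_k}.
\]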
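The projection formula for $\xi_j$ can be checked on a discrete analogue. The sketch below is a 1D toy problem, not the 2D setting of the talk: the grid, the rectangle-rule weight `h`, the covariance $\exp(-|x-y|^2/\ell^2)$, and the number of retained modes are all assumptions chosen for illustration.

```python
import numpy as np

n, ell, m = 50, 0.3, 5                 # grid size, correlation length, kept modes
x = np.linspace(0.0, 1.0, n)
h = 1.0 / n                            # uniform quadrature weight (rectangle rule)
C = np.exp(-((x[:, None] - x[None, :]) ** 2) / ell**2)   # assumed Gaussian-type covariance

# Discrete eigenproblem; eigh returns ascending eigenvalues, keep the m largest.
lam, V = np.linalg.eigh(h * C)
lam, V = lam[::-1][:m], V[:, ::-1][:, :m]
kappa_j = np.sqrt(lam)                 # KLE scaling factors kappa_j
g = V / np.sqrt(h)                     # modes normalised so that h * g.T @ g = I

# Synthesise a realisation from known coefficients, then recover them by projection.
xi_true = np.array([0.3, -1.2, 0.7, 0.1, -0.5])
kappa0 = np.ones(n)
kappa = kappa0 + g @ (kappa_j * xi_true)

# xi_j = (1 / kappa_j) * integral of (kappa - kappa0) * g_j, with the integral as h * sum.
xi = (h * g.T @ (kappa - kappa0)) / kappa_j
```

The recovered `xi` agrees with `xi_true` up to rounding because the discrete modes are orthonormal with respect to the weight `h`.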




Discretized equation in tensor form

[Matthies, Keese 04, 05, 07]

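\[
\mathbf{A}\mathbf{u} := \Big( \sum_{j=0}^{\infty} A_j \otimes \Delta_j \Big) \Big( \sum_{\alpha \in \mathcal{J}} u_\alpha \otimes e_\alpha \Big) = \Big( \sum_{\alpha \in \mathcal{J}} f_\alpha \otimes e_\alpha \Big) =: \mathbf{f},
\]
where $e_\alpha$ denotes the canonical basis in $\bigotimes_{\mu=1}^{M} \mathbb{R}^{R_\mu}$.

\[
\mathbf{f} = \sum_{\alpha \in \mathcal{J}} \sum_{i=0}^{\infty} \sqrt{\lambda_i}\, f_\alpha^{i}\, f_i \otimes e_\alpha = \sum_{i=0}^{\infty} \sqrt{\lambda_i}\, f_i \otimes g_i, \qquad \text{where} \quad g_i := \sum_{\alpha \in \mathcal{J}} f_\alpha^{i}\, e_\alpha.
\]

Splitting $g_i$ further, obtain
\[
\mathbf{f} \approx \sum_{k=1}^{R} f_k \otimes \bigotimes_{\mu=1}^{M} g_{k\mu}.
\]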




Final discretized stochastic PDE

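$\mathbf{A}\mathbf{u} = \mathbf{f}$, where
\[
\mathbf{A} := \sum_{l=1}^{s} A_l \otimes \bigotimes_{\mu=1}^{M} \Delta_{l\mu}, \qquad A_l \in \mathbb{R}^{N \times N}, \quad \Delta_{l\mu} \in \mathbb{R}^{R_\mu \times R_\mu},
\]
\[
\mathbf{u} := \sum_{j=1}^{r} u_j \otimes \bigotimes_{\mu=1}^{M} u_{j\mu}, \qquad u_j \in \mathbb{R}^{N}, \quad u_{j\mu} \in \mathbb{R}^{R_\mu},
\]
\[
\mathbf{f} := \sum_{k=1}^{R} f_k \otimes \bigotimes_{\mu=1}^{M} g_{k\mu}, \qquad f_k \in \mathbb{R}^{N}, \quad g_{k\mu} \in \mathbb{R}^{R_\mu}.
\]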

[Wähnert, Espig, Hackbusch, Litvinenko, Matthies 02.2012]
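One consequence of this format is that $\mathbf{A}\mathbf{u}$ can be formed factor by factor, since $(A \otimes \Delta)(u \otimes v) = Au \otimes \Delta v$; the result has at most $s \cdot r$ terms. The sketch below checks this against dense Kronecker products on tiny random data. The list-of-factors layout and the function names are assumptions of this example, not the interface of any library mentioned in the talk.

```python
import numpy as np
from functools import reduce

def apply_kron_sum(A_terms, u_terms):
    """Apply sum_l A_l x D_l1 x ... to sum_j u_j x u_j1 x ... term by term.

    A_terms: list of s lists [A_l, D_l1, ..., D_lM] of square matrices.
    u_terms: list of r lists [u_j, u_j1, ..., u_jM] of vectors.
    Returns s * r terms in the same layout as u_terms.
    """
    return [[A @ v for A, v in zip(At, ut)] for At in A_terms for ut in u_terms]

def full_operator(A_terms):
    """Dense matrix of the Kronecker sum (tiny sizes only)."""
    return sum(reduce(np.kron, At) for At in A_terms)

def full_vector(u_terms):
    """Dense vector of the canonical tensor (tiny sizes only)."""
    return sum(reduce(np.kron, ut) for ut in u_terms)

rng = np.random.default_rng(1)
dims = (3, 2, 2)                       # N = 3 and two stochastic directions
A_terms = [[rng.standard_normal((n, n)) for n in dims] for _ in range(2)]   # s = 2
u_terms = [[rng.standard_normal(n) for n in dims] for _ in range(3)]        # r = 3

Au_terms = apply_kron_sum(A_terms, u_terms)
```

The rank growth from $r$ to $s \cdot r$ is the reason iterative solvers in this format need a truncation step after each operator application.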


Notation

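Let
\[
\mathcal{T} := \bigotimes_{\mu=1}^{d} \mathbb{R}^{n_\mu},
\]
\[
\mathcal{R}_r(\mathcal{T}) := \mathcal{R}_r := \Big\{ \sum_{i=1}^{r} \bigotimes_{\mu=1}^{d} v_{i\mu} \in \mathcal{T} : v_{i\mu} \in \mathbb{R}^{n_\mu} \Big\},
\]
\[
\mathcal{I} := \times_{\mu=1}^{d} \mathcal{I}_\mu, \qquad \text{where} \quad \mathcal{I}_\mu := \{ i \in \mathbb{N} : 1 \le i \le n_\mu \}.
\]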




Maximum norm and corresponding index

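Let $u = \sum_{j=1}^{r} \bigotimes_{\mu=1}^{d} u_{j\mu} \in \mathcal{R}_r$, compute
\[
\|u\|_\infty := \max_{i := (i_1,\dots,i_d) \in \mathcal{I}} |u_i| = \max_{i := (i_1,\dots,i_d) \in \mathcal{I}} \Big| \sum_{j=1}^{r} \prod_{\mu=1}^{d} (u_{j\mu})_{i_\mu} \Big|. \qquad (1)
\]

Computing $\|u\|_\infty$ is equivalent to the following eigenvalue problem.

Let $i^* := (i_1^*, \dots, i_d^*) \in \mathcal{I}$, $\#\mathcal{I} = \prod_{\mu=1}^{d} n_\mu$.
\[
\|u\|_\infty = |u_{i^*}| = \Big| \sum_{j=1}^{r} \prod_{\mu=1}^{d} (u_{j\mu})_{i_\mu^*} \Big| \qquad \text{and} \qquad e^{(i^*)} := \bigotimes_{\mu=1}^{d} e_{i_\mu^*},
\]
where $e_{i_\mu^*} \in \mathbb{R}^{n_\mu}$ is the $i_\mu^*$-th canonical vector in $\mathbb{R}^{n_\mu}$ ($\mu \in \mathbb{N}_{\le d}$).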




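Then, with $\odot$ denoting the Hadamard (entrywise) product,
\[
\begin{aligned}
u \odot e^{(i^*)} &= \Big[ \sum_{j=1}^{r} \bigotimes_{\mu=1}^{d} u_{j\mu} \Big] \odot \Big[ \bigotimes_{\mu=1}^{d} e_{i_\mu^*} \Big]
= \sum_{j=1}^{r} \bigotimes_{\mu=1}^{d} \big( u_{j\mu} \odot e_{i_\mu^*} \big) \\
&= \sum_{j=1}^{r} \bigotimes_{\mu=1}^{d} \big[ (u_{j\mu})_{i_\mu^*}\, e_{i_\mu^*} \big]
= \underbrace{\Big[ \sum_{j=1}^{r} \prod_{\mu=1}^{d} (u_{j\mu})_{i_\mu^*} \Big]}_{u_{i^*}} \bigotimes_{\mu=1}^{d} e_{i_\mu^*},
\end{aligned}
\]
from which follows
\[
u \odot e^{(i^*)} = u_{i^*}\, e^{(i^*)}.
\]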




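Let
\[
D(u) := \sum_{j=1}^{r} \bigotimes_{\mu=1}^{d} \operatorname{diag}\big( (u_{j\mu})_{l_\mu} \big)_{l_\mu \in \mathbb{N}_{\le n_\mu}},
\]
obtain
\[
D(u)\, v = u \odot v \qquad \text{for all } v \in \mathcal{T}.
\]

Corollary

The elements of $u$ are the eigenvalues of $D(u)$, and all eigenvectors $e^{(i)}$ are of the following form:
\[
e^{(i)} = \bigotimes_{\mu=1}^{d} e_{i_\mu},
\]
where $i := (i_1, \dots, i_d) \in \mathcal{I}$ is the index of $u_i$. Therefore $\|u\|_\infty$ is the largest eigenvalue of $D(u)$ in modulus, with corresponding eigenvector $e^{(i^*)}$.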
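The identity $D(u)\,e^{(i)} = u \odot e^{(i)} = u_i\, e^{(i)}$ can be verified numerically on a small example. The sketch below also shows that the Hadamard product of two canonical tensors is formed factor by factor, giving the product of the two ranks as the number of terms. Data, sizes, and helper names are assumptions made for this check.

```python
import numpy as np
from functools import reduce

def cp_full(terms):
    """Flattened dense tensor of sum_j kron(u_j1, ..., u_jd) (tiny sizes only)."""
    return sum(reduce(np.kron, t) for t in terms)

def cp_hadamard(u_terms, v_terms):
    """Entrywise product in canonical format: one term per pair (j, k)."""
    return [[a * b for a, b in zip(ut, vt)] for ut in u_terms for vt in v_terms]

def unit_tensor(index, dims):
    """Rank-one tensor e^(i) built from canonical unit vectors."""
    return [[np.eye(n)[i] for i, n in zip(index, dims)]]

rng = np.random.default_rng(2)
dims, r = (3, 4, 2), 2
u_terms = [[rng.standard_normal(n) for n in dims] for _ in range(r)]

i = (2, 1, 0)
e_i = unit_tensor(i, dims)
lhs = cp_full(cp_hadamard(u_terms, e_i))          # u ⊙ e^(i)
u_i = cp_full(u_terms).reshape(dims)[i]           # the entry u_i
rhs = u_i * cp_full(e_i)                          # u_i * e^(i)
D_u = np.diag(cp_full(u_terms))                   # dense D(u), for comparison only
```

Forming `D_u` densely is only possible here because the example has 24 entries; the point of the format is that `cp_hadamard` never needs it.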




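Computing $\|u\|_\infty$, $u \in \mathcal{R}_r$, by vector iteration

Choose $y_0 := \bigotimes_{\mu=1}^{d} \frac{1}{n_\mu} \mathbf{1}$, where $\mathbf{1} := (1, \dots, 1)^T \in \mathbb{R}^{n_\mu}$, $k_{\max} \in \mathbb{N}$, and take $\varepsilon$ := 10e-7
for $k = 1, 2, \dots, k_{\max}$ do
    $q_k = u \odot y_{k-1}$, $\quad \lambda_k = \langle y_{k-1}, q_k \rangle$, $\quad z_k = q_k / \sqrt{\langle q_k, q_k \rangle}$,
    $y_k = \mathrm{App}_\varepsilon(z_k)$.
end for

The step $y_k = \mathrm{App}_\varepsilon(z_k)$ is the truncation of the approximate iteration [Khoromskij, Hackbusch, Tyrtyshnikov 05]; algorithms in [Espig, Hackbusch 2010].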
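A dense analogue of this vector iteration is sketched below. It works on a full array, so the truncation $\mathrm{App}_\varepsilon$ is replaced by the identity; that simplification, the test tensor, and the iteration count are assumptions of this example. Convergence relies on the entry of largest modulus being unique.

```python
import numpy as np

def max_norm_power_iteration(u, k_max=60):
    """Estimate ||u||_inf and its index by power iteration on D(u) = diag(u)."""
    y = np.full(u.shape, 1.0 / u.size)     # dense form of y0 = kron_mu (1/n_mu) * ones
    lam = 0.0
    for _ in range(k_max):
        q = u * y                          # q_k = u ⊙ y_{k-1}
        lam = np.vdot(y, q)                # lambda_k = <y_{k-1}, q_k>
        y = q / np.linalg.norm(q)          # z_k; no truncation in this dense sketch
    index = np.unravel_index(np.argmax(np.abs(y)), u.shape)
    return abs(lam), index

# Rank-2 example whose largest entry is 3*1*2 - 0.1 = 5.9 at index (2, 0, 0).
u = (np.einsum('i,j,k->ijk', [1.0, 2.0, 3.0], [1.0, 0.5], [2.0, 1.0])
     + np.einsum('i,j,k->ijk', [0.1, 0.1, 0.1], [-1.0, 1.0], [1.0, 1.0]))

norm_est, idx = max_norm_power_iteration(u)
```

The iterate $y_k$ concentrates on the index $i^*$, so the position of the maximum is obtained together with its value.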




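Definition (Characteristic, Sign)

The characteristic $\chi_I(u) \in \mathcal{T}$ of $u \in \mathcal{T}$ in $I \subset \mathbb{R}$ is for every multi-index $i \in \mathcal{I}$ pointwise defined as
\[
(\chi_I(u))_i := \begin{cases} 1, & u_i \in I; \\ 0, & u_i \notin I. \end{cases} \qquad (2)
\]

Furthermore, $\operatorname{sign}(u) \in \mathcal{T}$ is for all $i \in \mathcal{I}$ pointwise defined by
\[
(\operatorname{sign}(u))_i := \begin{cases} 1, & u_i > 0; \\ -1, & u_i < 0; \\ 0, & u_i = 0. \end{cases} \qquad (3)
\]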




Lemma

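Let $u \in \mathcal{T}$, $a, b \in \mathbb{R}$, and $\mathbf{1} = \bigotimes_{\mu=1}^{d} \mathbf{1}_\mu$, where $\mathbf{1}_\mu := (1, \dots, 1)^T \in \mathbb{R}^{n_\mu}$.

(i) If $I = \mathbb{R}_{<b}$, then we have $\chi_I(u) = \frac{1}{2}\big(\mathbf{1} + \operatorname{sign}(b\mathbf{1} - u)\big)$.

(ii) If $I = \mathbb{R}_{>a}$, then we have $\chi_I(u) = \frac{1}{2}\big(\mathbf{1} - \operatorname{sign}(a\mathbf{1} - u)\big)$.

(iii) If $I = (a, b)$, then we have $\chi_I(u) = \frac{1}{2}\big(\operatorname{sign}(b\mathbf{1} - u) - \operatorname{sign}(a\mathbf{1} - u)\big)$.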
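The three formulas of the lemma can be checked entrywise on a small dense array. The sketch below uses `numpy.sign` in place of the tensor sign function; the sample values are arbitrary and deliberately avoid the boundary points $a$ and $b$, where the formulas would return $\frac{1}{2}$.

```python
import numpy as np

def chi_less(u, b):
    """Characteristic of I = (-inf, b): (1 + sign(b*1 - u)) / 2."""
    return 0.5 * (1.0 + np.sign(b - u))

def chi_greater(u, a):
    """Characteristic of I = (a, inf): (1 - sign(a*1 - u)) / 2."""
    return 0.5 * (1.0 - np.sign(a - u))

def chi_open(u, a, b):
    """Characteristic of I = (a, b): (sign(b*1 - u) - sign(a*1 - u)) / 2."""
    return 0.5 * (np.sign(b - u) - np.sign(a - u))

u = np.array([[-1.5, 0.2, 0.9],
              [2.4, -0.3, 1.1]])
a, b = 0.0, 1.0
```

Each function returns an array of zeros and ones that matches the direct comparison, for example `(u > a) & (u < b)` for the open interval.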

Computing $\operatorname{sign}(u)$, $u \in \mathcal{R}_r$, via hybrid Newton-Schulz iteration:




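Computing $\operatorname{sign}(u)$, $u \in \mathcal{R}_r$

Choose $u_0 := u$ and $\varepsilon \in \mathbb{R}_+$.
while $\|\mathbf{1} - u_{k-1} \odot u_{k-1}\| > \varepsilon \|u\|$ do
    if $\|\mathbf{1} - u_{k-1} \odot u_{k-1}\| < \|u\|$ then
        $z_k := \frac{1}{2}\, u_{k-1} \odot (3 \cdot \mathbf{1} - u_{k-1} \odot u_{k-1})$
    else
        $z_k := \frac{1}{2}\, (u_{k-1} + u_{k-1}^{-1})$
    end if
    $u_k := \mathrm{App}_{\varepsilon_k}(z_k)$
end while

Here $u_{k-1}^{-1}$ denotes the entrywise inverse, and the loop runs as long as the defect $\|\mathbf{1} - u_{k-1} \odot u_{k-1}\|$ exceeds the tolerance.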
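The hybrid iteration can be tried on a dense array, where all operations are entrywise and no truncation is needed. The following is a sketch under stated assumptions rather than the algorithm of the talk: it requires all entries to be non-zero, it omits $\mathrm{App}_{\varepsilon_k}$, and it switches to the Schulz update only when $\max_i |1 - x_i^2| < 1$, a conservative entrywise test chosen here instead of the norm test on the slide.

```python
import numpy as np

def sign_newton_schulz(u, eps=1e-12, max_iter=200):
    """Entrywise sign of a dense array without zero entries."""
    x = np.asarray(u, dtype=float).copy()
    norm_u = np.linalg.norm(x)
    for _ in range(max_iter):
        defect = 1.0 - x * x                   # 1 - u_k ⊙ u_k
        if np.linalg.norm(defect) <= eps * norm_u:
            break                              # converged: every entry is close to +1 or -1
        if np.max(np.abs(defect)) < 1.0:
            x = 0.5 * x * (3.0 - x * x)        # Schulz update, inverse-free
        else:
            x = 0.5 * (x + 1.0 / x)            # Newton update, needs the entrywise inverse
    return x

u = np.array([[-3.0, 0.2],
              [5.0, -0.01]])
s = sign_newton_schulz(u)
```

The Schulz update is attractive in tensor formats because it uses only Hadamard products and sums, whereas the Newton update requires an approximate entrywise inverse.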




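Definition (Level Set, Frequency)

Let $I \subset \mathbb{R}$ and $u \in \mathcal{T}$. The level set $L_I(u) \in \mathcal{T}$ of $u$ with respect to $I$ is pointwise defined by
\[
(L_I(u))_i := \begin{cases} u_i, & u_i \in I; \\ 0, & u_i \notin I, \end{cases} \qquad (4)
\]
for all $i \in \mathcal{I}$.

The frequency $F_I(u) \in \mathbb{N}$ of $u$ with respect to $I$ is defined as
\[
F_I(u) := \# \operatorname{supp} \chi_I(u). \qquad (5)
\]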




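Proposition

Let $I \subset \mathbb{R}$, $u \in \mathcal{T}$, and $\chi_I(u)$ its characteristic. We have
\[
L_I(u) = \chi_I(u) \odot u
\]
and $\operatorname{rank}(L_I(u)) \le \operatorname{rank}(\chi_I(u)) \operatorname{rank}(u)$.

The frequency $F_I(u) \in \mathbb{N}$ of $u$ with respect to $I$ is
\[
F_I(u) = \langle \chi_I(u), \mathbf{1} \rangle,
\]
where $\mathbf{1} = \bigotimes_{\mu=1}^{d} \mathbf{1}_\mu$, $\mathbf{1}_\mu := (1, \dots, 1)^T \in \mathbb{R}^{n_\mu}$.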
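Both statements of the proposition have a direct dense counterpart, sketched below for an open interval $I = (a, b)$. The characteristic is computed by direct comparison, and the sample array and function names are assumptions of this example.

```python
import numpy as np

def characteristic(u, a, b):
    """chi_I(u) for the open interval I = (a, b), as an array of zeros and ones."""
    return ((u > a) & (u < b)).astype(float)

def level_set(u, a, b):
    """L_I(u) = chi_I(u) ⊙ u: keep entries inside I, zero out the rest."""
    return characteristic(u, a, b) * u

def frequency(u, a, b):
    """F_I(u) = <chi_I(u), 1>: number of entries of u inside I."""
    return int(round(np.vdot(characteristic(u, a, b), np.ones_like(u))))

u = np.array([[-1.5, 0.2, 0.9],
              [2.4, -0.3, 1.1]])
L = level_set(u, 0.0, 1.0)
F = frequency(u, 0.0, 1.0)
```

In the canonical format the same two operations are a Hadamard product and a scalar product, both of which can be evaluated factor by factor.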




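2D L-shape domain, $N = 557$.

KLE terms for $q(x,\omega) = e^{\kappa(x,\omega)}$: $l_k = 10$, stoch. dim. $m_k = 10$ and $p_k = 2$, shifted lognormal distrib. for $\kappa(x,\omega)$, $\operatorname{cov}_\kappa(x,y)$ is of the Gaussian type, $\ell_x = \ell_y = 0.3$.

RHS: $l_f = 10$, $m_f = 10$, $p_f = 2$ and Beta distrib. $\{4, 2\}$ for RVs. $\operatorname{cov}_f(x,y)$ is of the Gaussian type, $\ell_x = \ell_y = 0.6$.

Total stoch. dim. $m_u = m_k + m_f = 20$, $|\mathcal{J}| = 231$.
\[
u = \sum_{j=1}^{231} \bigotimes_{\mu=1}^{21} u_{j\mu} \in \mathbb{R}^{557} \otimes \bigotimes_{\mu=1}^{20} \mathbb{R}^{3}.
\]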




Figure: Shifted lognormal distribution with parameters $\mu = 0.5$, $\sigma^2 = 1.0$ (on the left) and Beta distribution with parameters 4, 2 (on the right).




Figure: Mean (on the left) and standard deviation (on the right) of $\kappa(x,\omega)$ (lognormal random field with parameters $\mu = 0.5$ and $\sigma = 1$).




Figure: Mean (on the left) and standard deviation (on the right) of $f(x,\omega)$ (Beta distribution with parameters $\alpha = 4$, $\beta = 2$ and Gaussian cov. function).




Figure: Mean (on the left) and standard deviation (on the right) of the solution $u$.




Results

Computed $\|u\|_\infty$ after 20 iterations.

The maximal rank of the intermediate iterates $(u_k)_{k=1}^{20}$ was 143.

$(u_k)_{k=1}^{20} \subset \mathcal{R}_{143}$ is the sequence of generated tensors.

The approximation error is $\varepsilon_k = 10^{-6}$.




Level sets

Now we compute level sets
\[
\operatorname{sign}\big(b\,\|u\|_\infty \mathbf{1} - u\big)
\]
for $b \in \{0.2, 0.4, 0.6, 0.8\}$.




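The computing time for each row of the table is around 10 minutes.

Tensor $u$ has $3^{20} \cdot 557 = 1{,}942{,}138{,}911{,}357$ entries.

$R_1 := \operatorname{rank}\big(\operatorname{sign}(b\,\|u\|_\infty \mathbf{1} - u)\big)$, $\quad R_2 := \max_{1 \le k \le k_{\max}} \operatorname{rank}(u_k)$,

$\text{Error} = \dfrac{\|\mathbf{1} - u_{k_{\max}} \odot u_{k_{\max}}\|}{\|b\,\|u\|_\infty \mathbf{1} - u\|}$.

b      R1    R2    kmax    Error
0.2    12    24    12      2.9×10^-8
0.4    12    20    20      1.9×10^-7
0.6    8     16    12      1.6×10^-7
0.8    8     15    8       1.2×10^-7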
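The entry count quoted above can be reproduced with a few lines of arithmetic and set against the storage of a canonical representation. The sketch takes $N = 557$, 20 stochastic directions of size 3, and 231 terms from the problem description; counting one vector per direction and term is an assumption about how the factors are stored.

```python
# Sizes taken from the numerical experiment: R^557 ⊗ (R^3)^{⊗20}, 231 terms.
N, M, n_mu, r = 557, 20, 3, 231

full_entries = N * n_mu**M            # entries of the full tensor
cp_numbers = r * (N + M * n_mu)       # numbers stored in the canonical format

ratio = full_entries / cp_numbers     # compression factor of the representation
```

The full tensor has about $1.9 \times 10^{12}$ entries, while the canonical representation with 231 terms needs 142,527 numbers.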




Literature

1. P. Wähnert, W. Hackbusch, M. Espig, A. Litvinenko, H. Matthies: Efficient approximation of the stoch. Galerkin matrix in the canonical tensor format, (in preparation) MPI Leipzig, 2011.

2. Dissertation of Mike Espig, Leipzig 2008.

3. Mike Espig, W. Hackbusch: A regularized Newton method for the efficient approx. of tensors represented in the canonical tensor format, MPI Leipzig, 2010.

4. H. G. Matthies: Uncertainty Quantification with Stochastic Finite Elements, Encyclopedia of Computational Mechanics, Wiley, 2007.




Acknowledgement

Project MUNA, German Luftfahrtforschungsprogramm funded by the Ministry of Economics (BMWA).

Elmar Zander: A Matlab/Octave toolbox for stochastic Galerkin methods (KLE, PCE, sparse grids, tensors, many examples etc.)
Stoch. Galerkin lib.: http://ezander.github.com/sglib/

M. Espig, M. Schuster, A. Killaitis, N. Waldren, P. Wähnert, S. Handschuh, H. Auer
Tensor Calculus lib.: http://gitorious.org/tensorcalculus
