Efficient Analysis of High Dimensional Data in Tensor Formats

M. Espig, W. Hackbusch, A. Litvinenko, H. G. Matthies and E. Zander, 13 January 2012


Motivation Discretisation Analysis of high dimensional data Computation of level sets and frequency Numerical Experiments

Outline

Motivation

Discretisation

Analysis of high dimensional data

Computing the maximum norm

Computation of the characteristic

Computation of level sets and frequency

Numerical Experiments


An example of SPDE

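$$\frac{\partial}{\partial t} u(x,t) = \nabla \cdot \bigl(\kappa(x,\omega)\, \nabla u(x,t)\bigr) + f(x,t), \quad x \in G \subset \mathbb{R}^d,\ t \in [0,T],$$

where $\omega \in \Omega$ and $U = L_2(G)$.

For each $\omega \in \Omega$, seek $u(x,t) \in L_2([0,T]) \otimes U$.

Let $S := L_2([0,T]) \otimes L_2(\Omega)$; one is then looking for $u(t,x,\omega) \in U \otimes S$.

The further decomposition $L_2(\Omega) = L_2(\times_j \Omega_j) \cong \bigotimes_j L_2(\Omega_j) \cong \bigotimes_j L_2(\mathbb{R}, \Gamma_j)$ results in

$$u(t,x,\omega) \in U \otimes S = L_2(G) \otimes \Bigl( L_2([0,T]) \otimes \bigotimes_j L_2(\mathbb{R}, \Gamma_j) \Bigr).$$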


Discretisation and Tensorial quantities

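$$\operatorname{span}\{w_n\}_{n=1}^{N} = U_N \subset U, \quad \dim U_N = N,$$

$$\operatorname{span}\{\tau_k\}_{k=1}^{K} = T_K \subset L_2([0,T]) = S_I, \quad \dim T_K = K,$$

$$\operatorname{span}\{X_{j_m}\}_{j_m=1}^{J_m} = S_{II,J_m} \subset L_2(\mathbb{R}, \Gamma_m) = S_{II}, \quad \dim S_{II,J_m} = J_m, \quad 1 \le m \le M.$$

Let $P := [0,T] \times \Omega$; an approximation to $u : P \to U$ is thus given by

$$u(x,t,\omega_1,\dots,\omega_M) \approx \sum_{n=1}^{N} \sum_{k=1}^{K} \sum_{j_1=1}^{J_1} \cdots \sum_{j_M=1}^{J_M} u_{n,k}^{j_1,\dots,j_M}\, w_n(x) \otimes \tau_k(t) \otimes \Bigl( \bigotimes_{m=1}^{M} X_{j_m}(\omega_m) \Bigr).$$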


Tensorial quantities

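For all $n = 1,\dots,N$, $k = 1,\dots,K$, $m = 1,\dots,M$, $j_m = 1,\dots,J_m$ we can write

$$u_{n,k}^{j_1,\dots,j_m,\dots,j_M} = u(x_n, t_k, \omega_1^{j_1}, \dots, \omega_M^{j_M}),$$

a tensor with $R'' = N \times K \times \prod_{m=1}^{M} J_m$ terms.

Look for

$$\bigl(u_{n,k}^{j_1,\dots,j_m,\dots,j_M}\bigr) \approx \sum_{\rho=1}^{R'} u_\rho\, w_\rho \otimes \tau_\rho \otimes \Bigl( \bigotimes_{m=1}^{M} X_{\rho m} \Bigr),$$

where $R' \le R''$, $w_\rho \in \mathbb{R}^N$, $\tau_\rho \in \mathbb{R}^K$, $X_{\rho m} \in \mathbb{R}^{J_m}$.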
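The sum-of-rank-one form above can be illustrated with a small sketch. It assumes NumPy and a hypothetical storage layout (one weight vector plus one factor matrix per direction, with one row per term); the names `cp_entry` and `cp_to_full` are not from the slides. A single entry is evaluated from the factors alone, without ever forming the full tensor.

```python
import numpy as np

def cp_entry(weights, factors, index):
    """Evaluate one entry of sum_rho w_rho * prod_d factors[d][rho, index[d]].

    weights: shape (R,); factors[d]: shape (R, n_d). Layout is an assumption.
    """
    value = weights.copy()
    for F, i in zip(factors, index):
        value = value * F[:, i]          # pick component i in this direction
    return value.sum()                   # sum over the R rank-one terms

def cp_to_full(weights, factors):
    """Assemble the full tensor (only feasible for tiny examples)."""
    full = np.zeros([F.shape[1] for F in factors])
    for rho, w in enumerate(weights):
        term = w
        for F in factors:
            term = np.multiply.outer(term, F[rho])
        full += term
    return full

rng = np.random.default_rng(0)
weights = rng.standard_normal(3)                              # R' = 3 terms
factors = [rng.standard_normal((3, n)) for n in (4, 5, 2)]    # three directions
full = cp_to_full(weights, factors)
print(np.isclose(cp_entry(weights, factors, (1, 2, 0)), full[1, 2, 0]))
```

The cost of `cp_entry` grows with the number of terms times the number of directions, whereas the full tensor grows with the product of all mode sizes.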


Values of interest

With the last tensor representation one wants to perform different tasks:

evaluation for specific parameters (t ,ω1, . . . ,ωM),

finding maxima and minima,

finding ‘level sets’ and quantiles.


Spatial discretisation and PCE

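$U_N := \{\varphi_n(x)\}_{n=1}^{N} \subset U$:

$$u(x,\omega) = \sum_{n=1}^{N} u_n(\omega)\, \varphi_n(x),$$

$$u_n(\theta) = \sum_{\alpha \in J} u_n^{\alpha}\, H_\alpha(\theta(\omega)),$$

where $J$ is taken as a finite subset of $\mathbb{N}_0^{(\mathbb{N})}$, $R := |J|$.

$$u(\theta) = \sum_{\alpha \in J} u^{\alpha}\, H_\alpha(\theta(\omega)),$$

where $u^{\alpha} := [u_1^{\alpha}, \dots, u_N^{\alpha}]^T$.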
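As a rough illustration of the expansion above, the following sketch evaluates a polynomial chaos expansion at one sample $\theta$. It makes several assumptions not stated on the slide: $H_\alpha$ is taken as a product of unnormalised probabilists' Hermite polynomials, $J$ is the total-degree index set, and the helper names `multi_indices` and `evaluate_pce` are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval

def multi_indices(M, p):
    """All alpha in N_0^M with alpha_1 + ... + alpha_M <= p (assumed choice of J)."""
    if M == 0:
        return [()]
    return [(k,) + rest for k in range(p + 1) for rest in multi_indices(M - 1, p - k)]

def hermite_multi(alpha, theta):
    """H_alpha(theta) = prod_m He_{alpha_m}(theta_m), unnormalised (assumption)."""
    value = 1.0
    for a, t in zip(alpha, theta):
        value *= hermeval(t, [0.0] * a + [1.0])   # coefficient vector selects He_a
    return value

def evaluate_pce(coeffs, alphas, theta):
    """u(theta) = sum_alpha u^alpha H_alpha(theta); coeffs has shape (|J|, N)."""
    basis = np.array([hermite_multi(alpha, theta) for alpha in alphas])
    return basis @ coeffs

alphas = multi_indices(2, 2)                 # 6 multi-indices for M = 2, p = 2
coeffs = np.ones((len(alphas), 1))           # toy coefficients, N = 1
print(evaluate_pce(coeffs, alphas, (2.0, 3.0)))
```

With $M = 20$ and $p = 2$ the same index set has 231 elements, which agrees with the $|J| = 231$ quoted in the numerical experiment.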


Discretized equation in tensor form

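KLE: $\kappa(x,\omega) = \kappa_0(x) + \sum_{j=1}^{\infty} \kappa_j\, g_j(x)\, \xi_j(\theta)$, where

$$\xi_j(\theta) = \frac{1}{\kappa_j} \int_G \bigl(\kappa(x,\omega) - \kappa_0(x)\bigr)\, g_j(x)\, dx.$$

Knowing the PCE $\kappa(x,\omega) = \sum_{\alpha} \kappa^{(\alpha)} H_\alpha(\theta)$, compute

$$\xi_j(\theta) = \sum_{\alpha \in J} \xi_j^{(\alpha)} H_\alpha(\theta), \quad \text{where} \quad \xi_j^{(\alpha)} = \frac{1}{\kappa_j} \int_G \kappa^{(\alpha)}(x)\, g_j(x)\, dx.$$

Further compute

$$\xi_j^{(\alpha)} \approx \sum_{l=1}^{s} (\xi_l)_j \prod_{k=1}^{\infty} (\xi_{l,k})_{\alpha_k}.$$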


Discretized equation in tensor form

[Matthies, Keese 04, 05, 07]

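$$A u := \Bigl( \sum_{j=0}^{\infty} A_j \otimes \Delta_j \Bigr) \Bigl( \sum_{\alpha \in J} u_\alpha \otimes e_\alpha \Bigr) = \Bigl( \sum_{\alpha \in J} f_\alpha \otimes e_\alpha \Bigr) =: f,$$

where $e_\alpha$ denotes the canonical basis in $\bigotimes_{\mu=1}^{M} \mathbb{R}^{R_\mu}$.

$$f = \sum_{\alpha \in J} \sum_{i=0}^{\infty} \sqrt{\lambda_i}\, f_\alpha^{i}\, f_i \otimes e_\alpha = \sum_{i=0}^{\infty} \sqrt{\lambda_i}\, f_i \otimes g_i, \quad \text{where} \quad g_i := \sum_{\alpha \in J} f_\alpha^{i}\, e_\alpha.$$

Splitting $g_i$ further, obtain

$$f \approx \sum_{k=1}^{R} f_k \otimes \bigotimes_{\mu=1}^{M} g_{k\mu}.$$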


Final discretized stochastic PDE

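$A u = f$, where

$$A := \sum_{l=1}^{s} A_l \otimes \bigotimes_{\mu=1}^{M} \Delta_{l\mu}, \quad A_l \in \mathbb{R}^{N \times N},\ \Delta_{l\mu} \in \mathbb{R}^{R_\mu \times R_\mu},$$

$$u := \sum_{j=1}^{r} u_j \otimes \bigotimes_{\mu=1}^{M} u_{j\mu}, \quad u_j \in \mathbb{R}^{N},\ u_{j\mu} \in \mathbb{R}^{R_\mu},$$

$$f := \sum_{k=1}^{R} f_k \otimes \bigotimes_{\mu=1}^{M} g_{k\mu}, \quad f_k \in \mathbb{R}^{N} \text{ and } g_{k\mu} \in \mathbb{R}^{R_\mu}.$$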

[Wähnert, Espig, Hackbusch, Litvinenko, Matthies 02.2012]
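One consequence of this structure is that the operator can be applied factor by factor. The sketch below is an illustration under assumptions (NumPy, a hypothetical list-of-terms layout, and the names `apply_operator` and `to_full`); it is not the authors' implementation. Each pair of an operator term and a solution term produces one new rank-one term, so the result has $s \cdot r$ terms before any truncation.

```python
import numpy as np
from functools import reduce

def apply_operator(A_terms, u_terms):
    """A_terms: list of (A_l, [Delta_l1, ..., Delta_lM]);
    u_terms: list of (u_j, [u_j1, ..., u_jM]). Returns the s*r terms of A u."""
    result = []
    for A_l, deltas in A_terms:
        for u_j, factors in u_terms:
            # (A_l x Delta_l1 x ...)(u_j x u_j1 x ...) = (A_l u_j) x (Delta_l1 u_j1) x ...
            result.append((A_l @ u_j, [D @ v for D, v in zip(deltas, factors)]))
    return result

def to_full(terms):
    """Dense tensor of a list of rank-one terms (tiny examples only)."""
    return sum(reduce(np.multiply.outer, [head] + list(factors))
               for head, factors in terms)

rng = np.random.default_rng(1)
A_terms = [(rng.standard_normal((3, 3)),
            [rng.standard_normal((2, 2)), rng.standard_normal((2, 2))])
           for _ in range(2)]                                  # s = 2, M = 2
u_terms = [(rng.standard_normal(3),
            [rng.standard_normal(2), rng.standard_normal(2)])
           for _ in range(3)]                                  # r = 3
Au_terms = apply_operator(A_terms, u_terms)

# Cross-check against the dense Kronecker-product matrix.
A_dense = sum(np.kron(np.kron(A, D[0]), D[1]) for A, D in A_terms)
print(len(Au_terms),
      np.allclose(A_dense @ to_full(u_terms).ravel(), to_full(Au_terms).ravel()))
```

The dense matrix is built here only to check the result; in the tensor setting it would never be assembled.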


Notation

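Let

$$T := \bigotimes_{\mu=1}^{d} \mathbb{R}^{n_\mu},$$

$$\mathcal{R}_r(T) := \mathcal{R}_r := \Bigl\{ \sum_{i=1}^{r} \bigotimes_{\mu=1}^{d} v_{i\mu} \in T : v_{i\mu} \in \mathbb{R}^{n_\mu} \Bigr\},$$

$$\mathcal{I} := \times_{\mu=1}^{d} I_\mu, \quad \text{where} \quad I_\mu := \{ i \in \mathbb{N} : 1 \le i \le n_\mu \}.$$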


Maximum norm and corresponding index

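Let $u = \sum_{j=1}^{r} \bigotimes_{\mu=1}^{d} u_{j\mu} \in \mathcal{R}_r$, compute

$$\|u\|_\infty := \max_{i := (i_1,\dots,i_d) \in \mathcal{I}} |u_i| = \max_{i := (i_1,\dots,i_d) \in \mathcal{I}} \Bigl| \sum_{j=1}^{r} \prod_{\mu=1}^{d} (u_{j\mu})_{i_\mu} \Bigr|. \tag{1}$$

Computing $\|u\|_\infty$ is equivalent to the following eigenvalue problem.

Let $i^* := (i_1^*, \dots, i_d^*) \in \mathcal{I}$, $\#\mathcal{I} = \prod_{\mu=1}^{d} n_\mu$.

$$\|u\|_\infty = |u_{i^*}| = \Bigl| \sum_{j=1}^{r} \prod_{\mu=1}^{d} (u_{j\mu})_{i_\mu^*} \Bigr| \quad \text{and} \quad e^{(i^*)} := \bigotimes_{\mu=1}^{d} e_{i_\mu^*},$$

where $e_{i_\mu^*} \in \mathbb{R}^{n_\mu}$ is the $i_\mu^*$-th canonical vector in $\mathbb{R}^{n_\mu}$ ($\mu \in \mathbb{N}_{\le d}$).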


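Then, with $\odot$ denoting the Hadamard (entrywise) product,

$$u \odot e^{(i^*)} = \Bigl[ \sum_{j=1}^{r} \bigotimes_{\mu=1}^{d} u_{j\mu} \Bigr] \odot \Bigl[ \bigotimes_{\mu=1}^{d} e_{i_\mu^*} \Bigr] = \sum_{j=1}^{r} \bigotimes_{\mu=1}^{d} u_{j\mu} \odot e_{i_\mu^*} = \sum_{j=1}^{r} \bigotimes_{\mu=1}^{d} \bigl[ (u_{j\mu})_{i_\mu^*}\, e_{i_\mu^*} \bigr] = \underbrace{\Bigl[ \sum_{j=1}^{r} \prod_{\mu=1}^{d} (u_{j\mu})_{i_\mu^*} \Bigr]}_{u_{i^*}} \bigotimes_{\mu=1}^{d} e_{i_\mu^*},$$

from which follows

$$u \odot e^{(i^*)} = u_{i^*}\, e^{(i^*)}.$$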


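Let $D(u) := \sum_{j=1}^{r} \bigotimes_{\mu=1}^{d} \operatorname{diag}\bigl( (u_{j\mu})_{l_\mu} \bigr)_{l_\mu \in \mathbb{N}_{\le n_\mu}}$; we obtain

$$D(u)\, v = u \odot v \quad \text{for all } v \in T.$$

Corollary

The elements of $u$ are the eigenvalues of $D(u)$, and all eigenvectors $e^{(i)}$ are of the following form:

$$e^{(i)} = \bigotimes_{\mu=1}^{d} e_{i_\mu},$$

where $i := (i_1, \dots, i_d) \in \mathcal{I}$ is the index of $u_i$. Therefore $\|u\|_\infty$ is the largest eigenvalue of $D(u)$ in modulus, with corresponding eigenvector $e^{(i^*)}$.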
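The corollary can be checked numerically on a tiny example. The sketch below assumes NumPy and a hypothetical list-of-terms representation (names `cp_full` and `D_matrix` are illustrative). It builds $D(u)$ explicitly as a sum of Kronecker products of diagonal matrices, which is only feasible at toy sizes, and confirms that it is diagonal with the entries of $u$ on its diagonal.

```python
import numpy as np
from functools import reduce

def cp_full(terms):
    """Dense tensor of sum_j u_j1 x ... x u_jd; terms[j] is a list of vectors."""
    return sum(reduce(np.multiply.outer, term) for term in terms)

def D_matrix(terms):
    """D(u) = sum_j diag(u_j1) kron ... kron diag(u_jd), formed explicitly."""
    return sum(reduce(np.kron, [np.diag(v) for v in term]) for term in terms)

rng = np.random.default_rng(2)
terms = [[rng.standard_normal(n) for n in (2, 3, 2)] for _ in range(2)]  # r = 2, d = 3
u = cp_full(terms)
D = D_matrix(terms)
v = rng.standard_normal(u.shape)

print(np.allclose(D, np.diag(u.ravel())))            # D(u) is diagonal with entries u_i
print(np.allclose(D @ v.ravel(), (u * v).ravel()))   # D(u) v = u (Hadamard) v
print(np.isclose(np.max(np.abs(np.diag(D))), np.max(np.abs(u))))
```

Because $D(u)$ is diagonal, its eigenvalues can be read off directly and its eigenvectors are the canonical unit tensors, which is what makes vector iteration applicable.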


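Computing $\|u\|_\infty$, $u \in \mathcal{R}_r$, by vector iteration

1: Choose $y_0 := \bigotimes_{\mu=1}^{d} \frac{1}{n_\mu} \mathbf{1}$, where $\mathbf{1} := (1, \dots, 1)^T \in \mathbb{R}^{n_\mu}$, $k_{\max} \in \mathbb{N}$, and take $\varepsilon :=$ 10e-7.

2: for $k = 1, 2, \dots, k_{\max}$ do

3: $q_k = u \odot y_{k-1}$, $\lambda_k = \langle y_{k-1}, q_k \rangle$, $z_k = q_k / \sqrt{\langle q_k, q_k \rangle}$, $y_k = \mathrm{App}_\varepsilon(z_k)$.

4: end for

The step $y_k = \mathrm{App}_\varepsilon(z_k)$ is an approximate iteration [Khoromskij, Hackbusch, Tyrtyshnikov 05]; algorithms in [Espig, Hackbusch 2010].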
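The iteration can be tried out on a small dense array. The sketch below is a simplification under stated assumptions: it works on the full tensor with NumPy and omits the approximation step $\mathrm{App}_\varepsilon$ entirely (the iterate is kept exactly), so it shows only the vector iteration itself. The function name `max_norm_vector_iteration` is hypothetical, and the entry of largest modulus is assumed to be unique and nonzero.

```python
import numpy as np

def max_norm_vector_iteration(u, k_max=50):
    """Vector iteration for D(u) on a dense array; App_eps is omitted (assumption)."""
    y = np.full(u.shape, 1.0 / u.size)        # y_0: tensor product of (1/n_mu) * ones
    lam = 0.0
    for _ in range(k_max):
        q = u * y                             # q_k = u (Hadamard) y_{k-1}
        lam = np.vdot(y, q)                   # lambda_k = <y_{k-1}, q_k>
        y = q / np.sqrt(np.vdot(q, q))        # z_k; here y_k = z_k without truncation
    i_star = np.unravel_index(np.argmax(np.abs(y)), u.shape)
    return abs(lam), i_star

rng = np.random.default_rng(3)
u = rng.uniform(-1.0, 1.0, size=(4, 3, 5))
u[2, 1, 3] = -5.0                             # unique entry of largest modulus
norm, index = max_norm_vector_iteration(u)
print(norm, index)
```

The iterate concentrates on the position $i^*$ at a rate governed by the ratio of the second largest to the largest modulus, so well-separated maxima converge in few steps.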


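Definition (Characteristic, Sign)

The characteristic $\chi_I(u) \in T$ of $u \in T$ in $I \subset \mathbb{R}$ is for every multi-index $i \in \mathcal{I}$ pointwise defined as

$$(\chi_I(u))_i := \begin{cases} 1, & u_i \in I; \\ 0, & u_i \notin I. \end{cases} \tag{2}$$

Furthermore, $\operatorname{sign}(u) \in T$ is for all $i \in \mathcal{I}$ pointwise defined by

$$(\operatorname{sign}(u))_i := \begin{cases} 1, & u_i > 0; \\ -1, & u_i < 0; \\ 0, & u_i = 0. \end{cases} \tag{3}$$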


Lemma

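Let $u \in T$, $a, b \in \mathbb{R}$, and $\mathbf{1} = \bigotimes_{\mu=1}^{d} \mathbf{1}_\mu$, where $\mathbf{1}_\mu := (1, \dots, 1)^T \in \mathbb{R}^{n_\mu}$.

(i) If $I = \mathbb{R}_{<b}$, then we have $\chi_I(u) = \frac{1}{2}\bigl(\mathbf{1} + \operatorname{sign}(b\mathbf{1} - u)\bigr)$.

(ii) If $I = \mathbb{R}_{>a}$, then we have $\chi_I(u) = \frac{1}{2}\bigl(\mathbf{1} - \operatorname{sign}(a\mathbf{1} - u)\bigr)$.

(iii) If $I = (a, b)$, then we have $\chi_I(u) = \frac{1}{2}\bigl(\operatorname{sign}(b\mathbf{1} - u) - \operatorname{sign}(a\mathbf{1} - u)\bigr)$.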
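The three identities of the lemma are easy to verify on a dense array. The following sketch assumes NumPy and uses `np.sign` in place of a tensor-format sign computation; the function names are illustrative only. Entries lying exactly on an interval boundary are avoided, since the formulas return 1/2 there.

```python
import numpy as np

def chi_less_than(u, b):
    """(i): characteristic of I = (-inf, b)."""
    return 0.5 * (1.0 + np.sign(b - u))

def chi_greater_than(u, a):
    """(ii): characteristic of I = (a, +inf)."""
    return 0.5 * (1.0 - np.sign(a - u))

def chi_open_interval(u, a, b):
    """(iii): characteristic of I = (a, b)."""
    return 0.5 * (np.sign(b - u) - np.sign(a - u))

u = np.array([-1.0, 0.5, 2.0, 3.5])
print(chi_less_than(u, 3.0))            # entries below 3
print(chi_greater_than(u, 0.0))         # entries above 0
print(chi_open_interval(u, 0.0, 3.0))   # entries strictly between 0 and 3
```

In each case the characteristic is built from one or two sign evaluations of shifted copies of $u$, so a sign routine in tensor format is all that is needed.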

Computing $\operatorname{sign}(u)$, $u \in \mathcal{R}_r$, via hybrid Newton-Schulz iteration:


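Computing $\operatorname{sign}(u)$, $u \in \mathcal{R}_r$

1: Choose $u_0 := u$ and $\varepsilon \in \mathbb{R}_+$.

2: while $\|\mathbf{1} - u_{k-1} \odot u_{k-1}\| > \varepsilon \|u\|$ do

3: if $\|\mathbf{1} - u_{k-1} \odot u_{k-1}\| < \|u\|$ then

4: $z_k := \frac{1}{2} u_{k-1} \odot (3 \cdot \mathbf{1} - u_{k-1} \odot u_{k-1})$

5: else

6: $z_k := \frac{1}{2} (u_{k-1} + u_{k-1}^{-1})$, with $u_{k-1}^{-1}$ the entrywise inverse

7: end if

8: $u_k := \mathrm{App}_{\varepsilon_k}(z_k)$

9: end while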
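A dense, entrywise version of the hybrid iteration can be sketched as follows. This is an illustration under assumptions rather than the slide's algorithm verbatim: it uses NumPy on the full array, omits the approximation step $\mathrm{App}_{\varepsilon_k}$, requires all entries to be nonzero, and swaps in a different switching test (maximum norm of the residual below 1), which is a sufficient condition for the Schulz step to converge entrywise. The slide's criterion compares the residual norm with $\|u\|$ instead.

```python
import numpy as np

def sign_newton_schulz(u, eps=1e-12, k_max=100):
    """Entrywise sign via Newton steps followed by Schulz steps (dense sketch)."""
    x = np.array(u, dtype=float)
    for _ in range(k_max):
        residual = np.max(np.abs(1.0 - x * x))     # max-norm of 1 - x (Hadamard) x
        if residual <= eps:
            break
        if residual < 1.0:
            x = 0.5 * x * (3.0 - x * x)            # Schulz step, inverse-free
        else:
            x = 0.5 * (x + 1.0 / x)                # Newton step, entrywise inverse
    return x

u = np.array([[-3.0, -0.2], [0.5, 7.0]])
print(sign_newton_schulz(u))
```

The Schulz step needs only Hadamard products and sums, which is why it is attractive once the iterate is close enough to the sign tensor.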


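Definition (Level Set, Frequency)

Let $I \subset \mathbb{R}$ and $u \in T$. The level set $L_I(u) \in T$ of $u$ with respect to $I$ is pointwise defined by

$$(L_I(u))_i := \begin{cases} u_i, & u_i \in I; \\ 0, & u_i \notin I, \end{cases} \tag{4}$$

for all $i \in \mathcal{I}$.

The frequency $F_I(u) \in \mathbb{N}$ of $u$ with respect to $I$ is defined as

$$F_I(u) := \# \operatorname{supp} \chi_I(u). \tag{5}$$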


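Proposition

Let $I \subset \mathbb{R}$, $u \in T$, and $\chi_I(u)$ its characteristic. We have

$$L_I(u) = \chi_I(u) \odot u$$

and $\operatorname{rank}(L_I(u)) \le \operatorname{rank}(\chi_I(u)) \operatorname{rank}(u)$.

The frequency $F_I(u) \in \mathbb{N}$ of $u$ with respect to $I$ is

$$F_I(u) = \langle \chi_I(u), \mathbf{1} \rangle,$$

where $\mathbf{1} = \bigotimes_{\mu=1}^{d} \mathbf{1}_\mu$, $\mathbf{1}_\mu := (1, \dots, 1)^T \in \mathbb{R}^{n_\mu}$.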
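Both statements of the proposition have a direct counterpart on the level of factors. The sketch below assumes NumPy and a hypothetical list-of-terms representation (names `hadamard_cp`, `inner_with_ones` and `cp_full` are illustrative). The Hadamard product of two sums of rank-one terms is formed term by term, which gives the rank bound, and the inner product with $\mathbf{1}$ reduces to sums over the individual factors.

```python
import numpy as np
from functools import reduce

def hadamard_cp(a_terms, b_terms):
    """Hadamard product in factored form: one term per pair, so r_a * r_b terms."""
    return [[x * y for x, y in zip(a, b)] for a in a_terms for b in b_terms]

def inner_with_ones(terms):
    """<t, 1> = sum_j prod_mu <t_jmu, 1_mu>, computed without the full tensor."""
    return sum(np.prod([factor.sum() for factor in term]) for term in terms)

def cp_full(terms):
    """Dense tensor, used here only to cross-check the factored results."""
    return sum(reduce(np.multiply.outer, term) for term in terms)

rng = np.random.default_rng(4)
dims = (3, 4, 2)
a_terms = [[rng.standard_normal(n) for n in dims] for _ in range(2)]   # rank <= 2
b_terms = [[rng.standard_normal(n) for n in dims] for _ in range(3)]   # rank <= 3
prod_terms = hadamard_cp(a_terms, b_terms)

print(len(prod_terms))
print(np.allclose(cp_full(prod_terms), cp_full(a_terms) * cp_full(b_terms)))
print(np.isclose(inner_with_ones(a_terms), cp_full(a_terms).sum()))
```

With `a_terms` playing the role of $\chi_I(u)$ and `b_terms` that of $u$, the first function yields a representation of $L_I(u)$ and the second one the frequency $F_I(u)$.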


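2D L-shape domain, $N = 557$.

KLE terms for $q(x,\omega) = e^{\kappa(x,\omega)}$: $l_k = 10$, stoch. dim. $m_k = 10$ and $p_k = 2$, shifted lognormal distribution for $\kappa(x,\omega)$, $\operatorname{cov}_\kappa(x,y)$ is of the Gaussian type, $\ell_x = \ell_y = 0.3$.

RHS: $l_f = 10$, $m_f = 10$, $p_f = 2$ and Beta distribution (4, 2) for the RVs. $\operatorname{cov}_f(x,y)$ is of the Gaussian type, $\ell_x = \ell_y = 0.6$.

Total stoch. dim. $m_u = m_k + m_f = 20$, $|J| = 231$.

$$u = \sum_{j=1}^{231} \bigotimes_{\mu=1}^{21} u_{j\mu} \in \mathbb{R}^{557} \otimes \bigotimes_{\mu=1}^{20} \mathbb{R}^{3}.$$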


[Figure: Shifted lognormal distribution with parameters µ = 0.5, σ² = 1.0 (on the left) and Beta distribution with parameters 4, 2 (on the right).]


[Figure: Mean (on the left) and standard deviation (on the right) of κ(x,ω) (lognormal random field with parameters µ = 0.5 and σ = 1).]


[Figure: Mean (on the left) and standard deviation (on the right) of f(x,ω) (beta distribution with parameters α = 4, β = 2 and Gaussian cov. function).]


[Figure: Mean (on the left) and standard deviation (on the right) of the solution u.]


Results

Computed $\|u\|_\infty$ after 20 iterations.

The maximal rank of the intermediate iterates $(u_k)_{k=1}^{20}$ was 143.

$(u_k)_{k=1}^{20} \subset \mathcal{R}_{143}$ is the sequence of generated tensors.

The approximation error was $\varepsilon_k = 10^{-6}$.


Level sets

Now we compute level sets

$$\operatorname{sign}(b \|u\|_\infty \mathbf{1} - u)$$

for $b \in \{0.2, 0.4, 0.6, 0.8\}$.


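The computing time to obtain any row of the table is around 10 minutes.

Tensor $u$ has $3^{20} \cdot 557 = 1{,}942{,}138{,}911{,}357$ entries.

$R_1 := \operatorname{rank}(\operatorname{sign}(b \|u\|_\infty \mathbf{1} - u))$, $R_2 := \max_{1 \le k \le k_{\max}} \operatorname{rank}(u_k)$,

$$\text{Error} = \frac{\|\mathbf{1} - u_{k_{\max}} \odot u_{k_{\max}}\|}{\|b \|u\|_\infty \mathbf{1} - u\|}.$$

b      R1    R2    kmax   Error
0.2    12    24    12     2.9×10^-8
0.4    12    20    20     1.9×10^-7
0.6    8     16    12     1.6×10^-7
0.8    8     15    8      1.2×10^-7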
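The sizes quoted for this experiment can be reproduced with a few lines of arithmetic. The sketch below takes $N = 557$, 20 stochastic directions of size 3 and 231 terms from the slides; the storage count for the factored representation and the formula $\binom{M+p}{p}$ for a total-degree index set with $p = 2$ are derived here as assumptions, not quoted results.

```python
from math import comb

N = 557          # spatial degrees of freedom
M = 20           # stochastic dimension m_u
n_stoch = 3      # size of each stochastic direction
rank = 231       # number of terms in the representation of u

full_entries = n_stoch**M * N                 # entries of the full tensor
factored_numbers = rank * (N + M * n_stoch)   # numbers stored in the factored form
index_set_size = comb(M + 2, 2)               # total-degree set with p = 2 (assumption)

print(full_entries, factored_numbers, index_set_size)
```

The full tensor has about $1.9 \times 10^{12}$ entries, while the factored form under these assumptions needs on the order of $10^5$ numbers.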


Literature

1. P. Wähnert, W. Hackbusch, M. Espig, A. Litvinenko, H. Matthies: Efficient approximation of the stoch. Galerkin matrix in the canonical tensor format, (in preparation) MPI Leipzig, 2011.

2. Dissertation of Mike Espig, Leipzig 2008.

3. Mike Espig, W. Hackbusch: A regularized Newton method for the efficient approx. of tensor represented in the c.t. format, MPI Leipzig 2010.

4. H. G. Matthies: Uncertainty Quantification with Stochastic Finite Elements, Encyclopedia of Computational Mechanics, Wiley, 2007.


Acknowledgement

Project MUNA, German Luftfahrtforschungsprogramm, funded by the Ministry of Economics (BMWA).

Elmar Zander: A Matlab/Octave toolbox for stochastic Galerkin methods (KLE, PCE, sparse grids, tensors, many examples etc.). Stoch. Galerkin lib.: http://ezander.github.com/sglib/

M. Espig, M. Schuster, A. Killaitis, N. Waldren, P. Wähnert, S. Handschuh, H. Auer: Tensor Calculus lib.: http://gitorious.org/tensorcalculus
