Efficient Analysis of High Dimensional Data in Tensor Formats
M. Espig, W. Hackbusch, A. Litvinenko, H. G. Matthies and E. [email protected], 13 January 2012
Scientific Computing
Motivation Discretisation Analysis of high dimensional data Computation of level sets and frequency Numerical Experiments
Outline
Motivation
Discretisation
Analysis of high dimensional data
  Computing the maximum norm
  Computation of the characteristic
Computation of level sets and frequency
Numerical Experiments
An example of SPDE
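\[ \frac{\partial}{\partial t} u(x,t) = \nabla \cdot \big( \kappa(x,\omega)\, \nabla u(x,t) \big) + f(x,t), \quad x \in G \subset \mathbb{R}^d,\ t \in [0,T], \]
where $\omega \in \Omega$ and $U = L_2(G)$.
For each $\omega \in \Omega$, seek $u(x,t) \in L_2([0,T]) \otimes U$.
Let $S := L_2([0,T]) \otimes L_2(\Omega)$; one is then looking for $u(t,x,\omega) \in U \otimes S$.
The further decomposition
\[ L_2(\Omega) = L_2\big( \times_j \Omega_j \big) \cong \bigotimes_j L_2(\Omega_j) \cong \bigotimes_j L_2(\mathbb{R}, \Gamma_j) \]
results in
\[ u(t,x,\omega) \in U \otimes S = L_2(G) \otimes \Big( L_2([0,T]) \otimes \bigotimes_j L_2(\mathbb{R}, \Gamma_j) \Big). \]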
Discretisation and Tensorial quantities
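\[ \operatorname{span}\{w_n\}_{n=1}^{N} = U_N \subset U, \quad \dim U_N = N, \]
\[ \operatorname{span}\{\tau_k\}_{k=1}^{K} = T_K \subset L_2([0,T]) = S_I, \quad \dim T_K = K, \]
\[ \operatorname{span}\{X_{j_m}\}_{j_m=1}^{J_m} = S_{II,J_m} \subset L_2(\mathbb{R}, \Gamma_m) = S_{II}, \quad \dim S_{II,J_m} = J_m, \quad 1 \le m \le M. \]
Let $P := [0,T] \times \Omega$; an approximation to $u : P \to U$ is thus given by
\[ u(x,t,\omega_1,\dots,\omega_M) \approx \sum_{n=1}^{N} \sum_{k=1}^{K} \sum_{j_1=1}^{J_1} \cdots \sum_{j_M=1}^{J_M} u_{n,k}^{j_1,\dots,j_M}\, w_n(x) \otimes \tau_k(t) \otimes \Big( \bigotimes_{m=1}^{M} X_{j_m}(\omega_m) \Big). \]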
Tensorial quantities
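For all $n = 1,\dots,N$, $k = 1,\dots,K$, $m = 1,\dots,M$, $j_m = 1,\dots,J_m$ we can write
\[ u_{n,k}^{j_1,\dots,j_m,\dots,j_M} = u(x_n, t_k, \omega_1^{j_1}, \dots, \omega_M^{j_M}), \]
which has $R'' = N \times K \times \prod_{m=1}^{M} J_m$ terms.
Look for
\[ \big( u_{n,k}^{j_1,\dots,j_m,\dots,j_M} \big) \approx \sum_{\rho=1}^{R'} u_\rho\, w_\rho \otimes \tau_\rho \otimes \Big( \bigotimes_{m=1}^{M} X_{\rho m} \Big), \]
where $R' \le R''$, $w_\rho \in \mathbb{R}^N$, $\tau_\rho \in \mathbb{R}^K$, $X_{\rho m} \in \mathbb{R}^{J_m}$.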
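As a rough illustration of why the low-rank format is attractive, the following Python sketch (NumPy assumed; the helper names `cp_entry` and `cp_full` and the toy sizes are hypothetical and not taken from the talk) evaluates a single entry of a rank-$R'$ tensor directly from its factors and compares the storage cost with that of the full array.

```python
import numpy as np

def cp_entry(weights, factors, index):
    """Entry of sum_rho weights[rho] * (factors[0][:, rho] x ... x factors[-1][:, rho])."""
    prod = weights.copy()
    for factor, i in zip(factors, index):
        prod = prod * factor[i, :]          # row i of each factor matrix, all ranks at once
    return prod.sum()

def cp_full(weights, factors):
    """Assemble the full tensor; only feasible for tiny sizes."""
    full = np.zeros([f.shape[0] for f in factors])
    for rho, w in enumerate(weights):
        term = np.array(w)
        for f in factors:
            term = np.multiply.outer(term, f[:, rho])
        full += term
    return full

rng = np.random.default_rng(0)
sizes, rank = (4, 3, 5, 2), 3               # toy values for N, K, J_1, J_2 and R'
weights = rng.standard_normal(rank)
factors = [rng.standard_normal((n, rank)) for n in sizes]

full_storage = int(np.prod(sizes))          # R'' = N * K * J_1 * J_2
cp_storage = rank * (1 + sum(sizes))        # R' * (1 + N + K + J_1 + J_2)
```

In this toy setting the full array needs 120 numbers and the factored form 45; the gap grows quickly with the number of directions, since the first count is a product of the mode sizes and the second a sum.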
Values of interest
With the last tensor representation one wants to perform different tasks:
evaluation for specific parameters (t ,ω1, . . . ,ωM),
finding maxima and minima,
finding ‘level sets’ and quantiles.
Spatial discretisation and PCE
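$U_N := \{\varphi_n(x)\}_{n=1}^{N} \subset U$:
\[ u(x,\omega) = \sum_{n=1}^{N} u_n(\omega)\, \varphi_n(x), \]
\[ u_n(\theta) = \sum_{\alpha \in J} u_n^{\alpha}\, H_\alpha(\theta(\omega)), \]
where $J$ is taken as a finite subset of $\mathbb{N}_0^{(\mathbb{N})}$, $R := |J|$. In vector form,
\[ u(\theta) = \sum_{\alpha \in J} u^{\alpha}\, H_\alpha(\theta(\omega)), \]
where $u^{\alpha} := [u_1^{\alpha}, \dots, u_N^{\alpha}]^T$.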
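To make the PCE formula concrete, here is a minimal Python sketch that evaluates $u(\theta) = \sum_\alpha u^\alpha H_\alpha(\theta)$ for a handful of multi-indices. It assumes that $H_\alpha$ is the product of unnormalised probabilists' Hermite polynomials (the slides do not state the normalisation); the function names `hermite_multi` and `pce_eval` and the toy coefficients are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval

def hermite_multi(alpha, theta):
    """H_alpha(theta) = prod_k He_{alpha_k}(theta_k), unnormalised probabilists' Hermite."""
    value = 1.0
    for degree, t in zip(alpha, theta):
        coeffs = np.zeros(degree + 1)
        coeffs[degree] = 1.0                # select the single polynomial He_degree
        value *= hermeval(t, coeffs)
    return value

def pce_eval(coeff_by_alpha, theta):
    """u(theta) = sum_alpha u^alpha * H_alpha(theta); each u^alpha is a vector in R^N."""
    return sum(u_alpha * hermite_multi(alpha, theta)
               for alpha, u_alpha in coeff_by_alpha.items())

# Toy PCE with N = 2 spatial unknowns and three multi-indices in two variables.
coeffs = {
    (0, 0): np.array([1.0, 2.0]),
    (1, 0): np.array([0.5, 0.0]),
    (0, 2): np.array([0.0, -1.0]),
}
u_at_theta = pce_eval(coeffs, theta=(2.0, 3.0))
```

With $He_1(2) = 2$ and $He_2(3) = 3^2 - 1 = 8$, the result is $[1 + 0.5 \cdot 2,\ 2 - 8] = [2, -6]$.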
Discretized equation in tensor form
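KLE:
\[ \kappa(x,\omega) = \kappa_0(x) + \sum_{j=1}^{\infty} \kappa_j\, g_j(x)\, \xi_j(\theta), \quad \text{where} \quad \xi_j(\theta) = \frac{1}{\kappa_j} \int_G \big( \kappa(x,\omega) - \kappa_0(x) \big)\, g_j(x)\, dx. \]
Knowing the PCE $\kappa(x,\omega) = \sum_\alpha \kappa^{(\alpha)} H_\alpha(\theta)$, compute
\[ \xi_j(\theta) = \sum_{\alpha \in J} \xi_j^{(\alpha)} H_\alpha(\theta), \quad \text{where} \quad \xi_j^{(\alpha)} = \frac{1}{\kappa_j} \int_G \kappa^{(\alpha)}(x)\, g_j(x)\, dx. \]
Further compute
\[ \xi_j^{(\alpha)} \approx \sum_{l=1}^{s} (\xi_l)_j \prod_{k=1}^{\infty} (\xi_{l,k})_{\alpha_k}. \]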
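The KLE step can be sketched on a grid as follows. This is only an illustration under several assumptions that are not in the slides: a 1D domain $[0,1]$ with uniform quadrature weights, a covariance of the form $\exp(-(|x-y|/\ell)^2)$ with $\ell = 0.3$, and $\kappa_j = \sqrt{\lambda_j}$ with $\lambda_j$ the eigenvalues of the covariance operator. The function names `discrete_kle` and `kle_coordinates` are hypothetical.

```python
import numpy as np

def discrete_kle(cov, weights, n_terms):
    """Leading KLE pairs (kappa_j, g_j) of a covariance sampled on quadrature nodes."""
    sqrt_w = np.sqrt(weights)
    # Symmetrised weighted eigenproblem W^{1/2} C W^{1/2} v = lambda v
    sym = sqrt_w[:, None] * cov * sqrt_w[None, :]
    lam, vecs = np.linalg.eigh(sym)
    order = np.argsort(lam)[::-1][:n_terms]         # largest eigenvalues first
    modes = vecs[:, order] / sqrt_w[:, None]         # g_j, orthonormal in the weighted L2 sense
    return np.sqrt(lam[order]), modes

def kle_coordinates(field, mean, kappa, modes, weights):
    """xi_j = (1 / kappa_j) * integral of (kappa - kappa_0) * g_j, by quadrature."""
    return (modes.T @ (weights * (field - mean))) / kappa

x = np.linspace(0.0, 1.0, 50)
weights = np.full(x.size, 1.0 / x.size)
cov = np.exp(-((x[:, None] - x[None, :]) / 0.3) ** 2)
kappa, modes = discrete_kle(cov, weights, n_terms=5)

# Build a field from known coordinates and recover them by projection.
xi_true = np.array([0.3, -1.2, 0.7, 0.0, 2.0])
mean = np.ones_like(x)
field = mean + modes @ (kappa * xi_true)
xi = kle_coordinates(field, mean, kappa, modes, weights)
```

The projection recovers `xi_true` because the modes are orthonormal with respect to the quadrature weights, which mirrors the integral formula for $\xi_j(\theta)$.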
Discretized equation in tensor form
[Matthies, Keese 04, 05, 07]
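\[ \mathbf{A}\mathbf{u} := \Big( \sum_{j=0}^{\infty} A_j \otimes \Delta_j \Big) \Big( \sum_{\alpha \in J} u_\alpha \otimes e_\alpha \Big) = \Big( \sum_{\alpha \in J} f_\alpha \otimes e_\alpha \Big) =: \mathbf{f}, \]
where $e_\alpha$ denotes the canonical basis in $\bigotimes_{\mu=1}^{M} \mathbb{R}^{R_\mu}$.
\[ \mathbf{f} = \sum_{\alpha \in J} \sum_{i=0}^{\infty} \sqrt{\lambda_i}\, f_\alpha^{i}\, f_i \otimes e_\alpha = \sum_{i=0}^{\infty} \sqrt{\lambda_i}\, f_i \otimes g_i, \quad \text{where} \quad g_i := \sum_{\alpha \in J} f_\alpha^{i}\, e_\alpha. \]
Splitting $g_i$ further, obtain
\[ \mathbf{f} \approx \sum_{k=1}^{R} f_k \otimes \bigotimes_{\mu=1}^{M} g_{k\mu}. \]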
Final discretized stochastic PDE
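$\mathbf{A}\mathbf{u} = \mathbf{f}$, where
\[ \mathbf{A} := \sum_{l=1}^{s} A_l \otimes \bigotimes_{\mu=1}^{M} \Delta_{l\mu}, \quad A_l \in \mathbb{R}^{N \times N},\ \Delta_{l\mu} \in \mathbb{R}^{R_\mu \times R_\mu}, \]
\[ \mathbf{u} := \sum_{j=1}^{r} u_j \otimes \bigotimes_{\mu=1}^{M} u_{j\mu}, \quad u_j \in \mathbb{R}^{N},\ u_{j\mu} \in \mathbb{R}^{R_\mu}, \]
\[ \mathbf{f} := \sum_{k=1}^{R} f_k \otimes \bigotimes_{\mu=1}^{M} g_{k\mu}, \quad f_k \in \mathbb{R}^{N},\ g_{k\mu} \in \mathbb{R}^{R_\mu}. \]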
[Wähnert, Espig, Hackbusch, Litvinenko, Matthies 02.2012]
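The practical benefit of this structure is that the operator can be applied factor by factor, since $(A \otimes \Delta)(u \otimes v) = (Au) \otimes (\Delta v)$. The Python sketch below checks this on toy sizes against the assembled Kronecker matrix; the function names and dimensions are hypothetical, and no rank truncation is performed, so the result has $s \cdot r$ terms.

```python
import numpy as np
from functools import reduce

def apply_kron_operator(op_terms, cp_terms):
    """Apply sum_l kron(A_l, D_l1, ..., D_lM) to sum_j kron(u_j, u_j1, ..., u_jM).

    Each term is a list of factors; the result is returned in the same factored form.
    """
    return [[mat @ vec for mat, vec in zip(op, term)]
            for op in op_terms for term in cp_terms]

def to_full_vector(cp_terms):
    """Assemble the full vector; only feasible for tiny sizes."""
    return sum(reduce(np.kron, term) for term in cp_terms)

def to_full_matrix(op_terms):
    """Assemble the full Kronecker matrix; only feasible for tiny sizes."""
    return sum(reduce(np.kron, op) for op in op_terms)

rng = np.random.default_rng(1)
dims = (4, 3, 3)                                     # toy values for N, R_1, R_2
op_terms = [[rng.standard_normal((n, n)) for n in dims] for _ in range(2)]   # s = 2
cp_terms = [[rng.standard_normal(n) for n in dims] for _ in range(3)]        # r = 3

result = apply_kron_operator(op_terms, cp_terms)     # s * r = 6 factored terms
```

The factored application only touches the small matrices $A_l$ and $\Delta_{l\mu}$, whereas the assembled matrix has $(N \prod_\mu R_\mu)^2$ entries.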
Notation
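Let
\[ \mathcal{T} := \bigotimes_{\mu=1}^{d} \mathbb{R}^{n_\mu}, \]
\[ \mathcal{R}_r(\mathcal{T}) := \mathcal{R}_r := \Big\{ \sum_{i=1}^{r} \bigotimes_{\mu=1}^{d} v_{i\mu} \in \mathcal{T} : v_{i\mu} \in \mathbb{R}^{n_\mu} \Big\}, \]
\[ \mathcal{I} := \times_{\mu=1}^{d} \mathcal{I}_\mu, \quad \text{where} \quad \mathcal{I}_\mu := \{ i \in \mathbb{N} : 1 \le i \le n_\mu \}. \]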
Maximum norm and corresponding index
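Let $u = \sum_{j=1}^{r} \bigotimes_{\mu=1}^{d} u_{j\mu} \in \mathcal{R}_r$, compute
\[ \|u\|_\infty := \max_{i := (i_1,\dots,i_d) \in \mathcal{I}} |u_i| = \max_{i := (i_1,\dots,i_d) \in \mathcal{I}} \Big| \sum_{j=1}^{r} \prod_{\mu=1}^{d} (u_{j\mu})_{i_\mu} \Big|. \tag{1} \]
Computing $\|u\|_\infty$ is equivalent to the following eigenvalue problem.
Let $i^* := (i^*_1, \dots, i^*_d) \in \mathcal{I}$, $\#\mathcal{I} = \prod_{\mu=1}^{d} n_\mu$, with
\[ \|u\|_\infty = |u_{i^*}| = \Big| \sum_{j=1}^{r} \prod_{\mu=1}^{d} (u_{j\mu})_{i^*_\mu} \Big| \quad \text{and} \quad e^{(i^*)} := \bigotimes_{\mu=1}^{d} e_{i^*_\mu}, \]
where $e_{i^*_\mu} \in \mathbb{R}^{n_\mu}$ is the $i^*_\mu$-th canonical vector in $\mathbb{R}^{n_\mu}$ ($\mu \in \mathbb{N}_{\le d}$).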
Then
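\[
u \odot e^{(i^*)}
= \Big( \sum_{j=1}^{r} \bigotimes_{\mu=1}^{d} u_{j\mu} \Big) \odot \Big( \bigotimes_{\mu=1}^{d} e_{i^*_\mu} \Big)
= \sum_{j=1}^{r} \bigotimes_{\mu=1}^{d} \big( u_{j\mu} \odot e_{i^*_\mu} \big)
= \sum_{j=1}^{r} \bigotimes_{\mu=1}^{d} \big[ (u_{j\mu})_{i^*_\mu}\, e_{i^*_\mu} \big]
= \underbrace{\Big( \sum_{j=1}^{r} \prod_{\mu=1}^{d} (u_{j\mu})_{i^*_\mu} \Big)}_{=\, u_{i^*}} \bigotimes_{\mu=1}^{d} e_{i^*_\mu},
\]
where $\odot$ denotes the Hadamard (entrywise) product, from which follows
\[ u \odot e^{(i^*)} = u_{i^*}\, e^{(i^*)}. \]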
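Let $D(u) := \sum_{j=1}^{r} \bigotimes_{\mu=1}^{d} \operatorname{diag}\big( (u_{j\mu})_{l_\mu} \big)_{l_\mu \in \mathbb{N}_{\le n_\mu}}$; then
\[ D(u)\, v = u \odot v \quad \text{for all } v \in \mathcal{T}. \]
Corollary
The entries of $u$ are the eigenvalues of $D(u)$, and all eigenvectors $e^{(i)}$ are of the following form:
\[ e^{(i)} = \bigotimes_{\mu=1}^{d} e_{i_\mu}, \]
where $i := (i_1, \dots, i_d) \in \mathcal{I}$ is the index of $u_i$. Therefore $\|u\|_\infty$ is the largest eigenvalue of $D(u)$ in absolute value, with corresponding eigenvector $e^{(i^*)}$.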
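The corollary can be checked numerically on a tiny example. The Python sketch below assembles $D(u)$ densely from the factors, which is only possible for toy sizes and is not how one would work in the tensor format; the helper names `cp_to_vector` and `d_operator` are hypothetical.

```python
import numpy as np
from functools import reduce

def cp_to_vector(terms):
    """Full (flattened) tensor of sum_j kron(u_j1, ..., u_jd)."""
    return sum(reduce(np.kron, term) for term in terms)

def d_operator(terms):
    """D(u) = sum_j kron(diag(u_j1), ..., diag(u_jd)) as a dense matrix."""
    return sum(reduce(np.kron, [np.diag(v) for v in term]) for term in terms)

rng = np.random.default_rng(2)
dims = (3, 2, 4)
terms = [[rng.standard_normal(n) for n in dims] for _ in range(2)]   # rank r = 2

u = cp_to_vector(terms)
D = d_operator(terms)
v = rng.standard_normal(u.size)

i_star = int(np.argmax(np.abs(u)))               # flat position of the multi-index i*
e_star = np.zeros(u.size)
e_star[i_star] = 1.0                             # e^(i*) as a flattened canonical vector
multi_index = np.unravel_index(i_star, dims)     # (i*_1, ..., i*_d), zero-based
```

Here `D @ v` agrees with `u * v`, `D` is diagonal with the entries of `u` on its diagonal, and `D @ e_star` equals `u[i_star] * e_star`, which is the eigenvalue relation of the corollary.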
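Computing $\|u\|_\infty$, $u \in \mathcal{R}_r$, by vector iteration
Choose $y_0 := \bigotimes_{\mu=1}^{d} \frac{1}{n_\mu} \mathbf{1}$, where $\mathbf{1} := (1, \dots, 1)^T \in \mathbb{R}^{n_\mu}$, $k_{\max} \in \mathbb{N}$, and take $\varepsilon$ := 10e-7.
for $k = 1, 2, \dots, k_{\max}$ do
    $q_k = u \odot y_{k-1}$, $\lambda_k = \langle y_{k-1}, q_k \rangle$, $z_k = q_k / \sqrt{\langle q_k, q_k \rangle}$,
    $y_k = \mathrm{App}_\varepsilon(z_k)$.
end for
The step $y_k = \mathrm{App}_\varepsilon(z_k)$ is the approximation step of the approximate iteration [Approximate iteration, Khoromskij, Hackbusch, Tyrtyshnikov 05]; algorithms in [Espig, Hackbusch 2010].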
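The iteration can be tried out on a small dense array, where no rank truncation is needed. The Python sketch below is such a simplification: $\mathrm{App}_\varepsilon$ is omitted, the start vector is additionally normalised so that $\lambda_k$ is a proper Rayleigh quotient (an assumption of this sketch), and the function name `max_norm_power_iteration` is hypothetical. It also assumes that the entry of largest modulus is unique and that $u$ has no pattern that makes $q_k$ vanish.

```python
import numpy as np

def max_norm_power_iteration(u, k_max=200):
    """Vector iteration for the Hadamard eigenproblem u * e = lambda * e (dense, no truncation)."""
    y = np.full(u.shape, 1.0 / u.size)       # y_0: outer product of the vectors (1/n_mu) * 1
    y /= np.linalg.norm(y)                   # normalise so that lambda is a Rayleigh quotient
    lam = 0.0
    for _ in range(k_max):
        q = u * y                            # q_k = u Hadamard y_{k-1}
        lam = np.vdot(y, q)                  # lambda_k = <y_{k-1}, q_k>
        y = q / np.linalg.norm(q)            # z_k; the truncation App_eps is skipped here
    index = np.unravel_index(np.argmax(np.abs(y)), u.shape)
    return abs(lam), index

u = np.array([[1.0, -2.0, 0.5, 1.5],
              [0.3, 2.0, -5.0, 1.0],
              [-1.0, 0.2, 1.8, -0.7]])
norm_estimate, arg_index = max_norm_power_iteration(u)
```

The convergence speed is governed by the ratio of the second largest to the largest modulus among the entries (here $2/5$), as usual for power iteration.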
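Definition (Characteristic, Sign)
The characteristic $\chi_I(u) \in \mathcal{T}$ of $u \in \mathcal{T}$ in $I \subset \mathbb{R}$ is defined pointwise for every multi-index $i \in \mathcal{I}$ as
\[ (\chi_I(u))_i := \begin{cases} 1, & u_i \in I; \\ 0, & u_i \notin I. \end{cases} \tag{2} \]
Furthermore, $\operatorname{sign}(u) \in \mathcal{T}$ is defined pointwise for all $i \in \mathcal{I}$ by
\[ (\operatorname{sign}(u))_i := \begin{cases} 1, & u_i > 0; \\ -1, & u_i < 0; \\ 0, & u_i = 0. \end{cases} \tag{3} \]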
Lemma
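Let $u \in \mathcal{T}$, $a, b \in \mathbb{R}$, and $\mathbf{1} = \bigotimes_{\mu=1}^{d} \mathbf{1}_\mu$, where $\mathbf{1}_\mu := (1, \dots, 1)^T \in \mathbb{R}^{n_\mu}$.
(i) If $I = \mathbb{R}_{<b}$, then we have $\chi_I(u) = \frac{1}{2} \big( \mathbf{1} + \operatorname{sign}(b\mathbf{1} - u) \big)$.
(ii) If $I = \mathbb{R}_{>a}$, then we have $\chi_I(u) = \frac{1}{2} \big( \mathbf{1} - \operatorname{sign}(a\mathbf{1} - u) \big)$.
(iii) If $I = (a, b)$, then we have $\chi_I(u) = \frac{1}{2} \big( \operatorname{sign}(b\mathbf{1} - u) - \operatorname{sign}(a\mathbf{1} - u) \big)$.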
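The three identities of the lemma are easy to verify entrywise on a small dense array. The following Python sketch does this with NumPy's `np.sign`; the function names and the sample values are hypothetical, and the sample deliberately contains no entry equal to $a$ or $b$, where the sign formulas return $\frac{1}{2}$ instead of 0 or 1.

```python
import numpy as np

def chi_below(u, b):
    """Characteristic of I = (-inf, b): 1/2 * (1 + sign(b*1 - u))."""
    return 0.5 * (1.0 + np.sign(b - u))

def chi_above(u, a):
    """Characteristic of I = (a, inf): 1/2 * (1 - sign(a*1 - u))."""
    return 0.5 * (1.0 - np.sign(a - u))

def chi_open_interval(u, a, b):
    """Characteristic of I = (a, b): 1/2 * (sign(b*1 - u) - sign(a*1 - u))."""
    return 0.5 * (np.sign(b - u) - np.sign(a - u))

u = np.array([[-1.5, 0.2, 0.9],
              [2.4, 0.5, -0.3]])
a, b = 0.0, 1.0
```

Each function returns an array of zeros and ones that matches the direct comparison, for example `chi_open_interval(u, a, b)` equals `((u > a) & (u < b))` cast to floats.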
Computing $\operatorname{sign}(u)$, $u \in \mathcal{R}_r$, via hybrid Newton-Schulz iteration:
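Computing $\operatorname{sign}(u)$, $u \in \mathcal{R}_r$
Choose $u_0 := u$ and $\varepsilon \in \mathbb{R}_+$.
while $\| \mathbf{1} - u_{k-1} \odot u_{k-1} \| > \varepsilon \|u\|$ do
    if $\| \mathbf{1} - u_{k-1} \odot u_{k-1} \| < \|u\|$ then
        $z_k := \frac{1}{2}\, u_{k-1} \odot (3 \cdot \mathbf{1} - u_{k-1} \odot u_{k-1})$
    else
        $z_k := \frac{1}{2}\, (u_{k-1} + u_{k-1}^{-1})$
    end if
    $u_k := \mathrm{App}_{\varepsilon_k}(z_k)$
end while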
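A dense, entrywise version of this hybrid iteration can be sketched in Python as follows. It is not the tensor-format algorithm: the truncation $\mathrm{App}_{\varepsilon_k}$ is omitted, the inverse is the plain entrywise reciprocal, and the switching test used here (largest entrywise residual below 1, which keeps the Newton-Schulz step inside its convergence region) is a conservative assumption of this sketch and differs from the norm-based test above. The function name `sign_hybrid` is hypothetical, and the input must not contain zeros.

```python
import numpy as np

def sign_hybrid(u, eps=1e-12, max_iter=100):
    """Entrywise sign of a dense array without zero entries via Newton / Newton-Schulz steps."""
    one = np.ones_like(u)
    norm_u = np.linalg.norm(u)
    v = u.copy()
    for _ in range(max_iter):
        residual = one - v * v                     # 1 - u_{k-1} Hadamard u_{k-1}
        if np.linalg.norm(residual) <= eps * norm_u:
            break                                  # converged: v * v is close to 1
        if np.max(np.abs(residual)) < 1.0:
            v = 0.5 * v * (3.0 * one - v * v)      # Newton-Schulz step, inverse-free
        else:
            v = 0.5 * (v + 1.0 / v)                # Newton step with the Hadamard inverse
    return v

u = np.array([[3.0, -0.2],
              [0.5, -7.0]])
s = sign_hybrid(u)
```

For this input the first few steps are Newton steps, and once every entry satisfies $|1 - v_i^2| < 1$ the inverse-free Newton-Schulz step takes over and converges quadratically to the entries $\pm 1$.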
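Definition (Level Set, Frequency)
Let $I \subset \mathbb{R}$ and $u \in \mathcal{T}$. The level set $L_I(u) \in \mathcal{T}$ of $u$ with respect to $I$ is defined pointwise by
\[ (L_I(u))_i := \begin{cases} u_i, & u_i \in I; \\ 0, & u_i \notin I, \end{cases} \tag{4} \]
for all $i \in \mathcal{I}$.
The frequency $F_I(u) \in \mathbb{N}$ of $u$ with respect to $I$ is defined as
\[ F_I(u) := \# \operatorname{supp} \chi_I(u). \tag{5} \]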
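Proposition
Let $I \subset \mathbb{R}$, $u \in \mathcal{T}$, and $\chi_I(u)$ its characteristic. We have
\[ L_I(u) = \chi_I(u) \odot u \]
and $\operatorname{rank}(L_I(u)) \le \operatorname{rank}(\chi_I(u)) \operatorname{rank}(u)$.
The frequency $F_I(u) \in \mathbb{N}$ of $u$ with respect to $I$ is
\[ F_I(u) = \langle \chi_I(u), \mathbf{1} \rangle, \]
where $\mathbf{1} = \bigotimes_{\mu=1}^{d} \mathbf{1}_\mu$, $\mathbf{1}_\mu := (1, \dots, 1)^T \in \mathbb{R}^{n_\mu}$.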
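On a dense array both formulas of the proposition reduce to one line each. The Python sketch below uses an open interval $I = (a, b)$ and the sign-based characteristic; the function names and sample values are hypothetical, and entries exactly equal to $a$ or $b$ are assumed not to occur.

```python
import numpy as np

def characteristic(u, a, b):
    """chi_I(u) for the open interval I = (a, b), via the sign identity."""
    return 0.5 * (np.sign(b - u) - np.sign(a - u))

def level_set(u, a, b):
    """L_I(u) = chi_I(u) Hadamard u."""
    return characteristic(u, a, b) * u

def frequency(u, a, b):
    """F_I(u) = <chi_I(u), 1>, the number of entries of u that lie in I."""
    return int(round(np.vdot(characteristic(u, a, b), np.ones_like(u))))

u = np.array([[0.1, 0.7, 1.3],
              [0.45, -0.2, 0.95]])
L = level_set(u, 0.4, 1.0)
F = frequency(u, 0.4, 1.0)
```

Three entries (0.7, 0.45 and 0.95) lie in $(0.4, 1.0)$, so the frequency is 3 and the level set keeps exactly those entries while zeroing the rest.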
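2D L-shape domain, $N = 557$.
KLE terms for $q(x,\omega) = e^{\kappa(x,\omega)}$: $l_k = 10$, stoch. dim. $m_k = 10$ and $p_k = 2$, shifted lognormal distrib. for $\kappa(x,\omega)$, $\mathrm{cov}_\kappa(x,y)$ is of the Gaussian type, $\ell_x = \ell_y = 0.3$.
RHS: $l_f = 10$, $m_f = 10$, $p_f = 2$ and Beta distrib. $\{4, 2\}$ for RVs; $\mathrm{cov}_f(x,y)$ is of the Gaussian type, $\ell_x = \ell_y = 0.6$.
Total stoch. dim. $m_u = m_k + m_f = 20$, $|J| = 231$,
\[ u = \sum_{j=1}^{231} \bigotimes_{\mu=1}^{21} u_{j\mu} \in \mathbb{R}^{557} \otimes \bigotimes_{\mu=1}^{20} \mathbb{R}^{3}. \]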
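The value $|J| = 231$ is consistent with $J$ being the set of all multi-indices of total degree at most $p = 2$ in $m_u = 20$ variables, since $\binom{20+2}{2} = 231$. The slides do not state this choice of $J$ explicitly, so the short Python check below treats it as an assumption; the function names are hypothetical.

```python
from itertools import product
from math import comb

def num_multi_indices(m, p):
    """Number of alpha in N_0^m with alpha_1 + ... + alpha_m <= p."""
    return comb(m + p, p)

def enumerate_multi_indices(m, p):
    """Explicit list of those multi-indices; only sensible for small m."""
    return [alpha for alpha in product(range(p + 1), repeat=m) if sum(alpha) <= p]

size_of_J = num_multi_indices(20, 2)          # total degree <= 2 in 20 variables
small_case = enumerate_multi_indices(4, 2)    # brute-force cross-check for m = 4
```

Under the same assumption, each stochastic direction carries the univariate degrees 0, 1 and 2, which matches the factor $\mathbb{R}^{3}$ per direction.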
[Figure: two probability density plots.] Shifted lognormal distribution with parameters $\mu = 0.5$, $\sigma^2 = 1.0$ (on the left) and Beta distribution with parameters 4, 2 (on the right).
Figure: Mean (on the left) and standard deviation (on the right) of $\kappa(x,\omega)$ (lognormal random field with parameters $\mu = 0.5$ and $\sigma = 1$).
Figure: Mean (on the left) and standard deviation (on the right) of $f(x,\omega)$ (Beta distribution with parameters $\alpha = 4$, $\beta = 2$ and Gaussian cov. function).
Figure: Mean (on the left) and standard deviation (on the right) of the solution $u$.
Results
Computed $\|u\|_\infty$ after 20 iterations.
The maximal rank of the intermediate iterates $(u_k)_{k=1}^{20}$ was 143.
$(u_k)_{k=1}^{20} \subset \mathcal{R}_{143}$ is the sequence of generated tensors.
The approximation error was $\varepsilon_k = 10^{-6}$.
Level sets
Now we compute level sets
\[ \operatorname{sign}(b \|u\|_\infty \mathbf{1} - u) \]
for $b \in \{0.2, 0.4, 0.6, 0.8\}$.
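Computing any row of the table below takes around 10 minutes.
The tensor $u$ has $3^{20} \cdot 557 = 1{,}942{,}138{,}911{,}357$ entries.
$R_1 := \operatorname{rank}(\operatorname{sign}(b \|u\|_\infty \mathbf{1} - u))$, $R_2 := \max_{1 \le k \le k_{\max}} \operatorname{rank}(u_k)$,
Error $= \| \mathbf{1} - u_{k_{\max}} \odot u_{k_{\max}} \| \,/\, \| b \|u\|_\infty \mathbf{1} - u \|$.

b     R1   R2   kmax   Error
0.2   12   24   12     2.9×10^-8
0.4   12   20   20     1.9×10^-7
0.6    8   16   12     1.6×10^-7
0.8    8   15    8     1.2×10^-7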
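The entry count can be reproduced with integer arithmetic, and it is instructive to set it against the storage of a factored representation. The Python lines below do this for a rank of 231 with one factor in $\mathbb{R}^{557}$ and 20 factors in $\mathbb{R}^{3}$, the sizes quoted for the solution tensor in the experiment setup; the variable names are hypothetical.

```python
# Full tensor: one spatial mode of size 557 and 20 stochastic modes of size 3.
full_entries = 557 * 3 ** 20

# Factored form: each of the 231 terms stores one vector in R^557 and 20 vectors in R^3.
cp_rank = 231
cp_entries = cp_rank * (557 + 20 * 3)

ratio = full_entries / cp_entries
```

The full tensor has 1,942,138,911,357 entries, while the factored form needs 142,527 numbers, a reduction by roughly seven orders of magnitude.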
Literature
1. P. Wähnert, W. Hackbusch, M. Espig, A. Litvinenko, H. Matthies: Efficient approximation of the stoch. Galerkin matrix in the canonical tensor format, (in preparation) MPI Leipzig, 2011.
2. Dissertation of Mike Espig, Leipzig 2008.
3. Mike Espig, W. Hackbusch: A regularized Newton method for the efficient approx. of tensors represented in the c.t. format, MPI Leipzig 2010.
4. H. G. Matthies: Uncertainty Quantification with Stochastic Finite Elements, Encyclopedia of Computational Mechanics, Wiley, 2007.
Acknowledgement
Project MUNA, German Luftfahrtforschungsprogramm funded by the Ministry of Economics (BMWA).
Elmar Zander: A Matlab/Octave toolbox for stochastic Galerkin methods (KLE, PCE, sparse grids, tensors, many examples etc.). Stoch. Galerkin lib.: http://ezander.github.com/sglib/
M. Espig, M. Schuster, A. Killaitis, N. Waldren, P. Wähnert, S. Handschuh, H. Auer: Tensor Calculus lib.: http://gitorious.org/tensorcalculus