Introduction to Tensor: Intelligent Computing for Computational Intelligence in the Post-Moore's-Law Era

Xiao-Yang Liu, www.tensorlet.com

April 19, 2019



Page 2

Agenda

- Background
- Tensor Decompositions (CP, Tucker, and Tensor-Train/Tensor-Ring)
- Transform-based Tensor Model and Applications
- Tensor Computations (cuTensor, TenDeC++)

Page 3

Background

In sensing systems, the number and resolution of sensors have grown to the point that multidimensional data of exceedingly large volume, variety, and structural richness have become ubiquitous across disciplines in engineering and data science.

Page 4

Background

Many problems in computational neuroscience, neuroinformatics, pattern/image recognition, signal processing, and machine learning generate massive amounts of multidimensional data with multiple aspects and high dimensionality. These data have the four "V" characteristics listed below.

Tensors provide a natural and compact representation for such massive multidimensional data via suitable low-rank approximations. Dynamic analysis of these approximations allows us to discover meaningful hidden structures in complex data and to perform generalizations by capturing multi-linear and multi-aspect relationships.

- Volume: scale of data
- Variety: different forms of data
- Veracity: uncertainty of data
- Velocity: analysis of streaming data

Page 5

What Is a Tensor?

Page 6

What Is a Tensor?

Page 7

Tensor Fibers

Page 8

Tensor Slices

Page 9

Tensor Unfolding

A(i) denotes the mode-i unfolding of the tensor A.

Page 10

Tensor Notations

- Scalars are denoted by lowercase letters, e.g., a.
- Vectors (tensors of order one) are denoted by boldface lowercase letters, e.g., a.
- Matrices (tensors of order two) are denoted by boldface capital letters, e.g., A.
- Higher-order tensors (order three or higher) are denoted by boldface Euler script letters, e.g., X.
- The n-th element in a sequence is denoted by a superscript in parentheses, e.g., A^(n) denotes the n-th matrix in a sequence.
- "◦" represents the vector outer product.
- The n-mode product of a tensor X ∈ R^(I1×I2×···×Id) with a matrix U ∈ R^(J×In) is denoted by X ×n U and is of size I1 × ··· × I(n−1) × J × I(n+1) × ··· × Id, with entries

  (X ×n U)_(i1···i(n−1) j i(n+1)···id) = Σ_(in=1..In) x_(i1 i2 ··· id) u_(j in).
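As a small illustration (not from the slides), the n-mode product can be computed in NumPy by moving mode n to the front, contracting, and moving it back; the helper name `mode_n_product` is ours:

```python
import numpy as np

def mode_n_product(X, U, n):
    # (X x_n U): contract mode n of X with the columns of U.
    Xn = np.moveaxis(X, n, 0)                  # bring mode n to the front
    Yn = np.tensordot(U, Xn, axes=([1], [0]))  # sum over i_n
    return np.moveaxis(Yn, 0, n)               # restore the mode ordering

X = np.arange(24.0).reshape(2, 3, 4)           # I1 x I2 x I3
U = np.ones((5, 3))                            # J x I2
Y = mode_n_product(X, U, 1)
assert Y.shape == (2, 5, 4)                    # I1 x J x I3
```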

Page 11

Tensor Notations

- The Kronecker product of matrices A ∈ R^(I×J) and B ∈ R^(K×L) is denoted by A ⊗ B. The result is a matrix of size (IK) × (JL), defined by

  A ⊗ B = [ a11 B   a12 B   ···   a1J B
            a21 B   a22 B   ···   a2J B
             ⋮        ⋮      ⋱      ⋮
            aI1 B   aI2 B   ···   aIJ B ]

        = [ a1 ⊗ b1   a1 ⊗ b2   a1 ⊗ b3   ···   aJ ⊗ b(L−1)   aJ ⊗ bL ].

- The Khatri–Rao product is the "matching columnwise" Kronecker product. Given matrices A ∈ R^(I×K) and B ∈ R^(J×K), their Khatri–Rao product is denoted by A ⊙ B. The result is a matrix of size (IJ) × K, defined by

  A ⊙ B = [ a1 ⊗ b1   a2 ⊗ b2   ···   aK ⊗ bK ].
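A quick NumPy sketch of both products (the `khatri_rao` helper is ours; NumPy ships only `np.kron`):

```python
import numpy as np

# Kronecker product: A (I x J), B (K x L) -> (IK) x (JL)
A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [5.0, 6.0]])
K = np.kron(A, B)
assert K.shape == (4, 4)
assert K[0, 1] == A[0, 0] * B[0, 1]            # top-left block is a11 * B

def khatri_rao(A, B):
    # Columnwise Kronecker product: A (I x R), B (J x R) -> (IJ) x R
    return np.einsum('ir,jr->ijr', A, B).reshape(-1, A.shape[1])

KR = khatri_rao(A, B)
assert KR.shape == (4, 2)
assert np.allclose(KR[:, 0], np.kron(A[:, 0], B[:, 0]))
```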

Page 12

Agenda

- Background
- Tensor Decompositions (CP, Tucker, and Tensor-Train/Tensor-Ring)
- Transform-based Tensor Model and Applications
- Tensor Computations (cuTensor, TenDeC++)

Page 13

Rank-One Tensor

A d-way tensor X ∈ R^(I1×I2×···×Id) is rank one if it can be written as the outer product of d vectors:

X = a^(1) ◦ a^(2) ◦ ··· ◦ a^(d).

Example. For a rank-one third-order tensor X = a ◦ b ◦ c, the (i, j, k) element of X is given by x_ijk = a_i b_j c_k.
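The rank-one example above can be checked directly in NumPy:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0])
c = np.array([6.0, 7.0, 8.0, 9.0])

# Rank-one third-order tensor X = a o b o c via the vector outer product
X = np.einsum('i,j,k->ijk', a, b, c)
assert X.shape == (3, 2, 4)
# Entrywise: x_ijk = a_i * b_j * c_k
assert X[2, 1, 3] == a[2] * b[1] * c[3]
```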

Page 14

CP Decomposition

X ≈ Σ_(r=1..R) λr (ar ◦ br ◦ cr) = Λ ×1 A ×2 B ×3 C = [[Λ; A, B, C]]

X(1) = A Λ (C ⊙ B)^T + E(1)
X(2) = B Λ (C ⊙ A)^T + E(2)
X(3) = C Λ (B ⊙ A)^T + E(3)
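A hedged NumPy check of the mode-1 unfolding identity X(1) = A Λ (C ⊙ B)^T, using a noiseless X built from random factors (sizes, seed, and the `khatri_rao` helper are ours):

```python
import numpy as np

def khatri_rao(A, B):
    # Columnwise Kronecker product, (rows(A)*rows(B)) x R
    return np.einsum('ir,jr->ijr', A, B).reshape(-1, A.shape[1])

rng = np.random.default_rng(0)
I, J, K, R = 3, 4, 5, 2
A = rng.standard_normal((I, R))
B = rng.standard_normal((J, R))
C = rng.standard_normal((K, R))
lam = rng.standard_normal(R)

# X = sum_r lambda_r (a_r o b_r o c_r), with E = 0
X = np.einsum('r,ir,jr,kr->ijk', lam, A, B, C)

# Mode-1 unfolding: mode-2 index varies fastest, then mode-3
X1 = X.transpose(0, 2, 1).reshape(I, J * K)
assert np.allclose(X1, A @ np.diag(lam) @ khatri_rao(C, B).T)
```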

Page 15

CP Decomposition

Name                                          Proposed by
Polyadic form of a tensor                     Hitchcock, 1927
PARAFAC (parallel factors)                    Harshman, 1970
CANDECOMP or CAND (canonical decomposition)   Carroll and Chang, 1970
Topographic components model                  Möcks, 1988
CP (CANDECOMP/PARAFAC)                        Kiers, 2000

Table: Some of the many names for the CP decomposition.

Page 16

Tucker Decomposition

Y = G ×1 A ×2 B ×3 C + E = [[G; A, B, C]] + E

X(1) ≈ A G(1) (C ⊗ B)^T
X(2) ≈ B G(2) (C ⊗ A)^T
X(3) ≈ C G(3) (B ⊗ A)^T
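A minimal HOSVD sketch (one standard way to compute a Tucker decomposition; the helper names and the full-rank test case are ours):

```python
import numpy as np

def unfold(X, n):
    # Mode-n unfolding (column order is irrelevant for the SVD column space)
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1)

def hosvd(X, ranks):
    # Factors: leading left singular vectors of each mode-n unfolding;
    # core: n-mode products of X with the factor transposes.
    U = [np.linalg.svd(unfold(X, n), full_matrices=False)[0][:, :r]
         for n, r in enumerate(ranks)]
    G = X
    for n, Un in enumerate(U):
        G = np.moveaxis(np.tensordot(Un.T, np.moveaxis(G, n, 0), axes=1), 0, n)
    return G, U

rng = np.random.default_rng(1)
X = rng.standard_normal((4, 5, 6))
G, (A, B, C) = hosvd(X, (4, 5, 6))
# With full ranks the Tucker form reconstructs X exactly.
Xhat = np.einsum('pqr,ip,jq,kr->ijk', G, A, B, C)
assert np.allclose(Xhat, X)
```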

Page 17

Tucker Decomposition

Name                                        Proposed by
Three-mode factor analysis (3MFA/Tucker3)   Tucker, 1966
Three-mode PCA (3MPCA)                      Kroonenberg and De Leeuw, 1980
N-mode PCA                                  Kapteyn et al., 1986
Higher-order SVD (HOSVD)                    De Lathauwer et al., 2000
N-mode SVD                                  Vasilescu and Terzopoulos, 2002

Table: Names for the Tucker decomposition (some specific to three-way and some for N-way).

Page 18

Tensor Train Decomposition

TT Form

A_(i1 i2 ··· id) = G1(i1) G2(i2) ··· Gd(id),

where G1(i1) is of size 1 × R, each intermediate core Gk(ik) is of size R × R, and Gd(id) is of size R × 1.

A graphical representation of the tensor train decomposition
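A small NumPy sketch of the TT form for a third-order tensor (core sizes and helper names are ours):

```python
import numpy as np

rng = np.random.default_rng(2)
I1, I2, I3, R = 3, 4, 5, 2
# TT cores, stored as (rank_in, mode size, rank_out)
G1 = rng.standard_normal((1, I1, R))
G2 = rng.standard_normal((R, I2, R))
G3 = rng.standard_normal((R, I3, 1))

def tt_entry(cores, idx):
    # A[i1,...,id] = G1(i1) G2(i2) ... Gd(id): a product of small matrices
    M = np.eye(1)
    for G, i in zip(cores, idx):
        M = M @ G[:, i, :]
    return M.item()

# Contract all cores to recover the full tensor, then spot-check an entry
A = np.einsum('aib,bjc,ckd->ijk', G1, G2, G3)
assert A.shape == (I1, I2, I3)
assert np.isclose(A[1, 2, 3], tt_entry([G1, G2, G3], (1, 2, 3)))
```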

Page 19

Tensor Ring Decomposition

Example.

Page 20

Tensor Ring Decomposition

TR Form

A_(i1 i2 ··· id) = Tr{ Z1(i1) Z2(i2) ··· Zd(id) } = Tr{ Π_(k=1..d) Zk(ik) }

A graphical representation of the tensor ring decomposition
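The trace form above can be sketched in NumPy (equal ring ranks and the helper name are our simplifying assumptions; in general the ranks may differ per core):

```python
import numpy as np

rng = np.random.default_rng(3)
shape, R = (3, 4, 5), 2
# Tensor-ring cores: each Z_k(i_k) is an R x R matrix
cores = [rng.standard_normal((R, n, R)) for n in shape]

def tr_entry(cores, idx):
    # A[i1,...,id] = trace( Z1(i1) Z2(i2) ... Zd(id) )
    M = np.eye(cores[0].shape[0])
    for Z, i in zip(cores, idx):
        M = M @ Z[:, i, :]
    return np.trace(M)

# The trace closes the ring: contract the boundary ranks cyclically
A = np.einsum('aib,bjc,cka->ijk', *cores)
assert np.isclose(A[2, 1, 0], tr_entry(cores, (2, 1, 0)))
```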

Page 21

Agenda

- Background
- Tensor Decompositions (CP, Tucker, and Tensor-Train/Tensor-Ring)
- Transform-based Tensor Model and Applications
- Tensor Computations (cuTensor, TenDeC++)

Page 22

Transform-based Model

Basic Operators

The operator matview(·) takes a tensor A ∈ C^(n1×n2×n3×n4) and returns an n1n3n4 × n2n3n4 block diagonal matrix, with each block being an n1 × n2 matrix, defined as

matview(A) = diag(A1, ···, Ap, ···, AP), p ∈ [P],

and

Ap(i, j) = A(i, j, k, l), p = (l − 1)n3 + k, i ∈ [n1], j ∈ [n2], k ∈ [n3], l ∈ [n4],

where P = n3n4 and [n] denotes the index set {1, 2, ···, n}. The operator tenview(·) folds matview(A) back into the tensor A, i.e.,

tenview(matview(A)) = A.
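A 0-indexed NumPy sketch of these two operators (so p = l·n3 + k here corresponds to the slide's p = (l − 1)n3 + k):

```python
import numpy as np

def matview(A):
    # A: (n1, n2, n3, n4) -> block-diagonal (n1*n3*n4, n2*n3*n4);
    # block p = l*n3 + k is the frontal matrix A[:, :, k, l].
    n1, n2, n3, n4 = A.shape
    P = n3 * n4
    M = np.zeros((n1 * P, n2 * P), dtype=A.dtype)
    for l in range(n4):
        for k in range(n3):
            p = l * n3 + k
            M[p*n1:(p+1)*n1, p*n2:(p+1)*n2] = A[:, :, k, l]
    return M

def tenview(M, shape):
    # Inverse of matview: gather the diagonal blocks back into a tensor.
    n1, n2, n3, n4 = shape
    A = np.empty(shape, dtype=M.dtype)
    for l in range(n4):
        for k in range(n3):
            p = l * n3 + k
            A[:, :, k, l] = M[p*n1:(p+1)*n1, p*n2:(p+1)*n2]
    return A

A = np.random.default_rng(4).standard_normal((2, 3, 4, 5))
assert matview(A).shape == (2 * 20, 3 * 20)
assert np.allclose(tenview(matview(A), A.shape), A)
```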

Page 23

Transform-based Model

Basic Operators

Given two fourth-order tensors A ∈ C^(n1×n′×n3×n4) and B ∈ C^(n′×n2×n3×n4), the corresponding p-th matrices are Ap ∈ C^(n1×n′) and Bp ∈ C^(n′×n2), and their multiplication is well-defined as Cp = Ap Bp ∈ C^(n1×n2). Later, in the transform domain, we will need the following multiplication of two block diagonal matrices:

matview(C) = matview(A) · matview(B),

where · denotes conventional matrix multiplication.

Page 24

Transform-based Model

Tensor-scalar multiplication

Given an invertible 2D discrete transform L : C^(1×1×n3×n4) → C^(1×1×n3×n4), the element-wise multiplication ◦, and α, β ∈ C^(1×1×n3×n4), we define the tensor-scalar multiplication

α • β := L^(−1)(L(α) ◦ L(β)),

where L^(−1) : C^(1×1×n3×n4) → C^(1×1×n3×n4) is the inverse transform.

Tensor-linear combinations

Given tensor scalars cj ∈ C^(1×1×n3×n4), j ∈ [n2], a tensor-linear combination of the tensor-columns Aj ∈ C^(n1×1×n3×n4), j ∈ [n2], is defined as

A1 • c1 + ··· + A(n2) • c(n2) = A • c,

where A = [A1, ···, A(n2)] and c = [c1, ···, c(n2)]^T.
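A sketch of tensor-scalar multiplication with L chosen as the 2D DFT along the last two modes (one concrete choice of invertible transform; the function name is ours):

```python
import numpy as np

def tsmul(alpha, beta):
    # alpha • beta = L^{-1}( L(alpha) o L(beta) ), with L = 2D DFT
    Fa = np.fft.fft2(alpha, axes=(2, 3))
    Fb = np.fft.fft2(beta, axes=(2, 3))
    return np.real(np.fft.ifft2(Fa * Fb, axes=(2, 3)))

rng = np.random.default_rng(5)
a = rng.standard_normal((1, 1, 3, 4))
b = rng.standard_normal((1, 1, 3, 4))

# The transform diagonalizes the product, so • is commutative here
assert np.allclose(tsmul(a, b), tsmul(b, a))

# The "delta" scalar acts as the multiplicative identity
e = np.zeros((1, 1, 3, 4)); e[0, 0, 0, 0] = 1.0
assert np.allclose(tsmul(e, b), b)
```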

Page 25

Transform-based Model

L-product

The L-product C = A • B ∈ C^(n1×n2×n3×n4) of A ∈ C^(n1×n′×n3×n4) and B ∈ C^(n′×n2×n3×n4) is defined as

C(i, j) = Σ_(k∈[n′]) A(i, k) • B(k, j), i ∈ [n1], j ∈ [n2].

Lemma. The L-product C = A • B can be calculated in the following way. First, compute

matview(C̄) = matview(Ā) · matview(B̄).

Then stack matview(C̄) back into the tensor C̄ = tenview(matview(C̄)) and perform the inverse transform to get C, i.e., C = L^(−1)(C̄). The notation Ā denotes the transform-domain representation of A ∈ C^(n1×n2×n3×n4), such that Ā = L(A) and A = L^(−1)(Ā).

Page 26

Transform-based Model

So the L-product can be considered as

L(C(i, j)) = Σ_(k∈[n′]) L(A(i, k)) ◦ L(B(k, j)),

which can be represented as Cp = Ap Bp, p ∈ [P].
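The slice-wise products Cp = Ap Bp can be sketched in NumPy with L again taken as the 2D DFT (a concrete choice of transform; the function name is ours):

```python
import numpy as np

def l_product(A, B):
    # C = A • B: transform, multiply the frontal n1 x n2 slices for
    # every (k, l) pair at once, then inverse transform.
    Fa = np.fft.fft2(A, axes=(2, 3))
    Fb = np.fft.fft2(B, axes=(2, 3))
    Fc = np.einsum('iakl,ajkl->ijkl', Fa, Fb)   # Cp = Ap Bp for all p
    return np.real(np.fft.ifft2(Fc, axes=(2, 3)))

rng = np.random.default_rng(6)
A = rng.standard_normal((2, 3, 4, 5))
B = rng.standard_normal((3, 2, 4, 5))
C = l_product(A, B)
assert C.shape == (2, 2, 4, 5)
```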

Page 27

Low-tubal-rank Model

Notations

~Ai ≡ A(:, i, :),   A^(j) ≡ A(:, :, j),   Ā := fft(A, [], 3)

bcirc(A) = [ A^(1)    A^(n)     A^(n−1)  ···   A^(2)
             A^(2)    A^(1)     A^(n)    ···   A^(3)
              ⋮        ⋮          ⋱             ⋮
             A^(n)    A^(n−1)   ···      A^(2) A^(1) ]

unfold(A) = [ A^(1)
              A^(2)
               ⋮
              A^(n) ],   fold(unfold(A)) = A
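A 0-indexed NumPy sketch of these three operators (helper names are ours):

```python
import numpy as np

def bcirc(A):
    # Block-circulant matrix of A in R^{n1 x n2 x n}: (n1*n) x (n2*n);
    # row-block i, column-block j holds the frontal slice A[:, :, (i-j) mod n]
    n1, n2, n = A.shape
    return np.block([[A[:, :, (i - j) % n] for j in range(n)]
                     for i in range(n)])

def unfold(A):
    # Stack the frontal slices vertically
    n1, n2, n = A.shape
    return np.concatenate([A[:, :, j] for j in range(n)], axis=0)

def fold(M, shape):
    # Inverse of unfold
    n1, n2, n = shape
    return np.stack([M[j*n1:(j+1)*n1, :] for j in range(n)], axis=2)

A = np.random.default_rng(7).standard_normal((3, 2, 4))
assert bcirc(A).shape == (12, 8)
assert np.allclose(fold(unfold(A), A.shape), A)
```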

Page 28

Low-tubal-rank Model

t-Product

A ∗ B = fold(bcirc(A) · unfold(B))

Example. Let A ∈ R^(3×2×2) with frontal slices

A^(1) = [  1  0        A^(2) = [ −2  1
           0  2                  −2  7
          −1  3 ],                0 −1 ],

and let ~B ∈ R^(2×1×2) with frontal slices

B^(1) = [  3           B^(2) = [ −2
          −1 ],                  −3 ].

Page 29

Low-tubal-rank Model

A ∗ ~B = fold( [  1  0 −2  1       [  3
                  0  2 −2  7         −1
                 −1  3  0 −1    ·    −2
                 −2  1  1  0         −3 ] )
                 −2  7  0  2
                  0 −1 −1  3 ]

       = fold( [  4
                −19
                 −3
                 −9
                −19
                 −6 ] ) ∈ R^(3×1×2).

In other words, ~C := A ∗ ~B is a 3 × 1 × 2 tensor: a 3 × 2 matrix oriented as a lateral slice of a third-order tensor.
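The worked example can be verified in NumPy; here the t-product is computed equivalently in the Fourier domain (slice-wise products after an FFT along the third dimension), which matches the bcirc · unfold definition:

```python
import numpy as np

def t_product(A, B):
    # t-product via FFT along the third dimension
    Fa, Fb = np.fft.fft(A, axis=2), np.fft.fft(B, axis=2)
    Fc = np.einsum('iak,ajk->ijk', Fa, Fb)     # slice-wise matrix products
    return np.real(np.fft.ifft(Fc, axis=2))

A = np.zeros((3, 2, 2))
A[:, :, 0] = [[1, 0], [0, 2], [-1, 3]]
A[:, :, 1] = [[-2, 1], [-2, 7], [0, -1]]
B = np.zeros((2, 1, 2))
B[:, :, 0] = [[3], [-1]]
B[:, :, 1] = [[-2], [-3]]

C = t_product(A, B)
assert np.allclose(C[:, 0, 0], [4, -19, -3])   # first frontal slice
assert np.allclose(C[:, 0, 1], [-9, -19, -6])  # second frontal slice
```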

Page 30

Low-tubal-rank Model

t-Linear combinations

Given k tubal scalars ~cj ∈ R^(1×1×n), j = 1, 2, ···, k, a t-linear combination of ~Xj ∈ R^(m×1×n), j = 1, 2, ···, k, is defined as

~X1 ∗ ~c1 + ~X2 ∗ ~c2 + ··· + ~Xk ∗ ~ck ≡ X ∗ ~C,

where

X := [~X1, ~X2, ···, ~Xk],   ~C := [~c1; ~c2; ···; ~ck].

Example. Using A ∈ R3×2×2 and ~B ∈ R2×1×2 from the previous example, we see that

Page 31

Low-tubal-rank Model

A ∗ ~B = ~A1 ∗ ~b11 + ~A2 ∗ ~b21

       = fold( [  7          + fold( [ −3
                  4                   −23
                 −3                     0
                 −8                    −1
                 −6                   −13
                  2 ] )                −8 ] )

       = fold( [  4
                −19
                 −3
                 −9
                −19
                 −6 ] )

Thus, ~C := A ∗ ~B is a t-linear combination of the lateral slices of A.

Page 32

Low-tubal-rank Model

Observation

Given ~a, ~b ∈ R^(1×1×n), ~a ∗ ~b can be computed as

~a ∗ ~b := ifft( fft(~a, [], 3) ⊙ fft(~b, [], 3), [], 3 ),

where ⊙ denotes pointwise multiplication of two tubal scalars.

Factorizations of A are created (implicitly) by applying the appropriate matrix factorization to each frontal slice A^(i) in the Fourier domain:

A = Q ∗ R ⟺ Ā^(i) = Q̄^(i) R̄^(i).

Page 33

Low-tubal-rank Model

t-SVD

A = U ∗ S ∗ V^T = Σ_(i=1..min(l,m)) ~Ui ∗ ~si ∗ ~Vi^T,   ~si := S(i, i, :)

The t-SVD of an l × m × n tensor
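A minimal t-SVD sketch in the style of Kilmer and Martin: FFT along the third dimension, an ordinary SVD per frontal slice, inverse FFT back (helper names are ours; the factors are kept complex and the tiny imaginary residue is checked at the end):

```python
import numpy as np

def t_svd(A):
    l, m, n = A.shape
    Fa = np.fft.fft(A, axis=2)
    Uf = np.zeros((l, l, n), dtype=complex)
    Sf = np.zeros((l, m, n), dtype=complex)
    Vf = np.zeros((m, m, n), dtype=complex)
    for j in range(n):                          # SVD of each Fourier slice
        u, s, vh = np.linalg.svd(Fa[:, :, j])
        Uf[:, :, j], Vf[:, :, j] = u, vh.conj().T
        for i in range(min(l, m)):
            Sf[i, i, j] = s[i]
    back = lambda T: np.fft.ifft(T, axis=2)     # return to the time domain
    return back(Uf), back(Sf), back(Vf)

def t_product(A, B):
    Fa, Fb = np.fft.fft(A, axis=2), np.fft.fft(B, axis=2)
    return np.fft.ifft(np.einsum('iak,ajk->ijk', Fa, Fb), axis=2)

def t_transpose(A):
    # Conjugate-transpose each slice, then reverse slices 2..n
    At = np.conj(np.transpose(A, (1, 0, 2)))
    return np.concatenate([At[:, :, :1], At[:, :, :0:-1]], axis=2)

A = np.random.default_rng(8).standard_normal((4, 3, 5))
U, S, V = t_svd(A)
Ahat = t_product(t_product(U, S), t_transpose(V))
assert np.allclose(Ahat, A)
assert np.max(np.abs(Ahat.imag)) < 1e-8
```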

Page 34

Tubal-tensor Sparse Coding

Tubal-tensor Linear Combination

A two-dimensional image of size m × k is represented by a third-order tensor X ∈ R^(m×1×k), which can be approximated by the t-product of D ∈ R^(m×r×k) and C ∈ R^(r×1×k):

X = D ∗ C = D(:, 1, :) ∗ C(1, 1, :) + D(:, 2, :) ∗ C(2, 1, :) + ··· + D(:, r, :) ∗ C(r, 1, :).

The tubal-tensor sparse coding model is based on the circular convolution operation.

Page 35

Tubal-tensor Sparse Coding

Tubal-tensor Sparse Representation

A third-order tensor X ∈ R^(m×n×k) represents n images of size m × k. Let D ∈ R^(m×r×k) be the tensor dictionary, where each lateral slice D(:, j, :) is a tensor basis, and let C ∈ R^(r×n×k) hold the corresponding tensor representations. Each image X(:, j, :) is approximated by a sparse t-linear combination of the tensor bases. The tubal-tensor sparse coding (TubSC) model can be formulated as

min_(D,C) (1/2)‖X − D ∗ C‖²_F + β‖C‖₁
s.t. ‖D(:, j, :)‖²_F ≤ 1, j = 1, 2, ···, r.

The TubSC model can be solved by alternating between tensor coefficient learning and tensor dictionary learning.

Page 36

Tubal-tensor Sparse Coding

Tensor Coefficients Learning

min_C (1/2)‖X − D ∗ C‖²_F + β‖C‖₁

According to the low-tubal-rank model, the problem can be transformed to

min_(unfold(C)) (1/2)‖unfold(X) − bcirc(D) · unfold(C)‖²_F + β‖unfold(C)‖₁.

It can be solved by the Iterative Shrinkage Thresholding algorithm based on Tensors (ISTT), for which the problem is rewritten as

min_C f(C) + β g(C),

where f(C) = (1/2)‖X − D ∗ C‖²_F and g(C) = ‖C‖₁.
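As a hedged illustration of this step, here is plain ISTA on the equivalent matrix form, where `A` stands in for bcirc(D) and `x` for unfold(X) (sizes, seed, and names are ours; ISTT itself operates on tensors):

```python
import numpy as np

def soft_threshold(Z, tau):
    # Proximal operator of tau * ||.||_1 (the "shrinkage" step)
    return np.sign(Z) * np.maximum(np.abs(Z) - tau, 0.0)

def ista(A, x, beta, iters=200):
    # min_c 0.5 * ||x - A c||^2 + beta * ||c||_1
    L = np.linalg.norm(A, 2) ** 2       # Lipschitz constant of the gradient
    c = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ c - x)        # gradient of the smooth part f
        c = soft_threshold(c - grad / L, beta / L)
    return c

rng = np.random.default_rng(9)
A = rng.standard_normal((30, 10))
x = A @ (np.eye(10)[0] * 2.0)           # sparse ground truth: 2 * e_0
c = ista(A, x, beta=0.1)
assert np.argmax(np.abs(c)) == 0        # the dominant coefficient is recovered
```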

Page 37

Tubal-tensor Sparse Coding

Tensor Dictionary Learning

min_D (1/2)‖X − D ∗ C‖²_F
s.t. ‖D(:, j, :)‖²_F ≤ 1, j = 1, 2, ···, r.

We transform this problem into the frequency domain:

min_(D^(l)) Σ_(l=1..k) ‖X^(l) − D^(l) C^(l)‖²_F, l = 1, 2, ···, k
s.t. Σ_(l=1..k) ‖D^(l)(:, j)‖²_F ≤ k, j = 1, 2, ···, r.

Then the Lagrange dual (Lee et al., 2007) is adopted, solving for the dual variables by Newton's algorithm.

Page 38

Agenda

- Background
- Tensor Decompositions (CP, Tucker, and Tensor-Train/Tensor-Ring)
- Transform-based Tensor Model and Applications
- Tensor Computations (cuTensor, TenDeC++)

Page 39

cuTensor-tubal (GPU)

This library is a general approach to computing low-tubal-rank tensor operations in the frequency domain on GPUs:

1. Obtain the frequency-domain representation of the input tensor by performing the Fourier transform along the third dimension (tube-wise DFT) on the GPU.

2. In the frequency domain, the tensor operations separate into multiple independent complex matrix computations with strong parallelism.

3. Convert the frequency-domain results back to the time domain through the inverse Fourier transform along the third dimension on the GPU (tube-wise inverse DFT).

System architecture of the cuTensor-tubal library

Page 40

cuTensor-tubal (GPU)

Operation        Input                             Output
t-FFT            A ∈ R^(m×n×k)                     A ∈ C^(m×n×k)
inverse t-FFT    A ∈ C^(m×n×k)                     A ∈ R^(m×n×k)
t-product        A ∈ R^(m×l×k), B ∈ R^(l×n×k)      C ∈ R^(m×n×k)
t-SVD            T ∈ R^(m×n×k)                     U ∈ R^(m×m×k), S ∈ R^(m×n×k), V ∈ R^(n×n×k)
t-QR             T ∈ R^(m×n×k)                     Q ∈ R^(m×m×k), R ∈ R^(m×n×k)
t-inverse        T ∈ R^(n×n×k)                     T^(−1) ∈ R^(n×n×k)
t-normalization  T ∈ R^(m×1×k)                     T ∈ R^(m×1×k)

Table: Seven tensor operations in the cuTensor-tubal library

Page 41

cuTensor-tubal (GPU)

Key Challenges

- Data transfer between the CPU and GPU
- Alternative access to tube and slice data structures
- Parallelizing the Fourier transforms and matrix computations

Page 42

cuTensor-tubal (GPU)

Efficient Data Transfer

Overlapping data transfer with computations

Page 43

cuTensor-tubal (GPU)

Uniform Memory Access to Tube and Slice Data Structures

Tensors are stored as a 1D array in memory
Data structures in tensor computations

Page 44

cuTensor-tubal (GPU)

Parallelizing the Fourier Transforms and Matrix Computations

Operation        #(FFT operations)   #(matrix operations)   #(inverse FFT operations)
t-FFT            m × n               None                   None
inverse t-FFT    None                None                   m × n
t-product        m × l + l × n       k                      m × n
t-SVD            m × n               k                      m × m + n × n + m × n
t-QR             m × n               k                      m × m + m × n
t-inverse        n × n               k                      n × n
t-normalization  m                   k                      m

Table: FFT, matrix, and inverse-FFT workloads of the seven tensor operations in the cuTensor-tubal library

Page 45

cuTensor-tubal (GPU)

System workflow of the cuTensor-tubal library

Memory access operators

Page 46

TenDeC++ (CPU)

TenDeC++ is a new C++ library for tensor decompositions, in which a novel underlying technique, PointerDeformer, leveraging pointer manipulation, is proposed to further explore the potential of C++. TenDeC++ supports:

- Canonical Polyadic (CP) decomposition
- Tucker decomposition
- Tensor-train decomposition
- t-SVD

Compared with TensorLy in Python and Tensorlab in MATLAB, TenDeC++ reduces decomposition time by more than 83.7% and 53.3%, respectively, and supports tensors 2.5× and 2× larger.

System architecture of the TenDeC++library

Page 47

TenDeC++ (CPU)

PointerDeformer

A 3D tensor is stored as a 1D array in memory. Accessing this data in different orders can form size-specific matrices, including three mode-n views: the column-major, row-major, and concatenation views. These virtual views motivate the design of PointerDeformer, which skips the time-consuming unfolding operation in C++.
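The same "virtual view" idea can be demonstrated in NumPy, where changing strides reinterprets the one contiguous buffer without copying (an analogy to PointerDeformer, not its actual C++ implementation):

```python
import numpy as np

X = np.arange(24).reshape(2, 3, 4)      # one contiguous 1D buffer underneath

# A mode-n "view": moving the mode to the front only changes the strides,
# not the underlying memory.
view = np.moveaxis(X, 1, 0)             # no copy: same buffer, new strides
assert np.shares_memory(view, X)

# The unfolding matrix only materializes when reshape is forced to copy.
X1 = view.reshape(3, 8)
assert X1[1, 0] == X[0, 1, 0]
```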

Page 48

TenDeC++ (CPU)

Optimized Basic Tensor Operation: n-mode Product

Compared with the traditional process, the optimized process does not need the time-consuming unfold/fold operations. Instead, PointerDeformer achieves the virtual transformation by accessing the data in a specific sequence in memory.

Page 49

TenDeC++ (CPU)

Other Acceleration Techniques

- Exploit symmetry with PointerDeformer
- Exploit conjugate symmetry for the t-SVD decomposition

Based on the conjugate-symmetry property of the FFT of real input data, conj(X^(j)) = X^(k−j+2) for j = 2, 3, ···, ⌈(k+1)/2⌉, where conj(X) denotes the conjugate of the matrix X. Taking conjugates on both sides of X^(j) = U^(j) S^(j) V^(j) gives

X^(k−j+2) = conj(U^(j)) · conj(S^(j)) · conj(V^(j)).

Hence, with conjugate symmetry, the t-SVD decomposition only needs to perform SVDs on about half of the frontal slices.
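The conjugate-symmetry property itself is easy to check numerically (0-indexed slices, so the slide's j and k − j + 2 become j and k − j here):

```python
import numpy as np

rng = np.random.default_rng(10)
X = rng.standard_normal((3, 3, 6))          # real tensor, k = 6 frontal slices
Xf = np.fft.fft(X, axis=2)

# FFT of real data: Xf[:, :, j] == conj(Xf[:, :, k - j]) for j = 1..k-1,
# so SVDs are only needed for slices 0 .. k//2; the rest follow by conjugation.
k = X.shape[2]
for j in range(1, k):
    assert np.allclose(Xf[:, :, j], np.conj(Xf[:, :, k - j]))
```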

Page 50

TenDeC++ (CPU)

Performance

Running time of CP decomposition
Running time of Tucker decomposition

Page 51

TenDeC++ (CPU)

Performance

Running time of t-SVD