
Efficient tensor completion for color image and video recovery: Low-rank tensor train

Johann A. Bengua1, Ho N. Phien1, Hoang D. Tuan1 and Minh N. Do2

Abstract—This paper proposes a novel approach to tensor completion, which recovers missing entries of data represented by tensors. The approach is based on the tensor train (TT) rank, which is able to capture hidden information from tensors thanks to its definition from a well-balanced matricization scheme. Accordingly, new optimization formulations for tensor completion are proposed, as well as two new algorithms for their solution. The first one, called simple low-rank tensor completion via tensor train (SiLRTC-TT), is intimately related to minimizing a nuclear norm based on TT rank. The second one starts from a multilinear matrix factorization model to approximate the TT rank of a tensor, and is called tensor completion by parallel matrix factorization via tensor train (TMac-TT). A tensor augmentation scheme transforming a low-order tensor to higher orders is also proposed to enhance the effectiveness of SiLRTC-TT and TMac-TT. Simulation results for color image and video recovery show the clear advantage of our method over all other methods.

Index Terms—Color image recovery, video recovery, tensor completion, tensor train decomposition, tensor train rank, tensor train nuclear norm, Tucker decomposition.

I. INTRODUCTION

Tensors are multi-dimensional arrays, which are higher-order generalizations of matrices and vectors [1]. Tensors provide a natural way to represent multidimensional data whose entries are indexed by several continuous or discrete variables. Employing tensors and their decompositions to process data has become increasingly popular since [2]–[4]. For instance, a color image is a third-order tensor defined by two indices for spatial variables and one index for color mode. A video comprised of color images is a fourth-order tensor with an additional index for a temporal variable. Residing in extremely high-dimensional data spaces, the tensors in practical applications are nevertheless often of low rank [1]. Consequently, they can be effectively projected to much smaller subspaces through underlying decompositions such as the CANDECOMP/PARAFAC (CP) [5], [6], Tucker [7] and tensor train (TT) [8], or matrix product state (MPS) [9]–[11].

Motivated by the success of low rank matrix completion (LRMC) [12]–[14], recent effort has been made to extend its concept to low rank tensor completion (LRTC). In fact, LRTC has found applications in computer vision and graphics, signal processing and machine learning [15]–[21]. Additionally, it

1 Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, NSW 2007, Australia; Email: [email protected], [email protected], [email protected]

2 Department of Electrical and Computer Engineering and the Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA; Email: [email protected]

has been shown in [15] that tensor-based completion algorithms are more efficient in completing tensors than matrix-based completion algorithms. The common target in LRTC is to recover the missing entries of a tensor from its partially observed entries [22]–[24]. LRTC remains a grand challenge due to the fact that computing the tensor rank, defined as the CP rank, is already an NP-hard problem [1]. There have been attempts to approach LRTC via the Tucker rank [15], [18], [21]. A conceptual drawback of the Tucker rank is that its components are ranks of matrices constructed by an unbalanced matricization scheme (one mode versus the rest). The upper bound of each individual rank is often small and may not be suitable for describing the global information of the tensor. In addition, matrix rank minimization is only efficient when the matrix is balanced. As the rank of a matrix is never more than $\min\{m, n\}$, where $m$ and $n$ are the numbers of rows and columns of the matrix, respectively, a high ratio $\max\{m, n\}/\min\{m, n\}$ would effectively rule out the need for matrix rank minimization. It is not surprising that present state-of-the-art LRMC methods [12], [13], [25], [26] implicitly assume that the considered matrices are balanced.

Another type of tensor rank is the TT rank, which consists of ranks of matrices formed by a well-balanced matricization scheme, i.e., matricizing the tensor along permutations of modes. The TT rank was defined in [8], yet low-rank tensor analysis via TT rank can be seen in earlier work in physics, specifically in simulations of quantum dynamics [27], [28]. Realizing the computational efficiency of low TT rank tensors, there have been numerous works applying them to numerical linear algebra [29]–[31]. Low TT rank tensors were used for the singular value decomposition (SVD) of large-scale matrices in [32], [33]. The alternating least squares (ALS) algorithms for tensor approximation [34], [35] are also used for the solution of linear equations and for eigenvector/eigenvalue approximation. In [36], [37], low TT rank tensors were also used in implementing the steepest descent iteration for large-scale least squares problems. The common assumption in all these works is that all tensors used during the computation are of low TT rank, for computational practicability. How low TT rank tensors are relevant to real-world problems was not really their concern. Applications of the TT decomposition to fields outside of mathematics and physics have rarely been seen, with only a recent application of TT to machine learning [38]. As mentioned above, color images and videos are perfect examples of tensors, so their completion can be formulated as a tensor completion problem. However, it is still not known whether TT rank-based completion is useful for practical solutions. The main purpose of this paper is to show that TT rank is the right


approach for LRTC, which can be addressed by TT rank-based optimization. The paper's contributions are as follows:

1) Using the concept of von Neumann entropy in quantum information theory [39], we show that the Tucker rank does not capture the global correlation of the tensor entries and thus is hardly ideal for LRTC. Since the TT rank consists of ranks of matrices formed by a well-balanced matricization scheme, it is capable of capturing the global correlation of the tensor entries and is thus a promising tool for LRTC.

2) We show that, unlike the Tucker rank, which is often low and not appropriate for optimization, TT rank optimization is a tractable formulation for LRTC. Two new algorithms are introduced to address the TT rank optimization based LRTC problems. The first algorithm, called simple low-rank tensor completion via tensor train (SiLRTC-TT), solves an optimization problem based on the TT nuclear norm. The second algorithm, called tensor completion by parallel matrix factorization via tensor train (TMac-TT), uses a multilinear matrix factorization model to approximate the TT rank of a tensor, bypassing the computationally expensive SVD. Avoiding the direct TT decomposition enables the proposed algorithms to outperform other state-of-the-art tensor completion algorithms.

3) We also introduce a novel technique called ket augmentation (KA) to represent a low-order tensor by a higher-order tensor without changing the total number of entries. The KA scheme provides a perfect means to obtain a higher-order tensor representation of visual data, maximally exploiting the potential of TT rank-based optimization for color image and video completion. TMac-TT especially performs well in recovering videos with 95% missing entries.

The rest of the paper is organized as follows. Section II introduces notation and gives a brief review of tensor decompositions. In Section III, the conventional formulation of LRTC is reviewed, the advantage of TT rank over Tucker rank in terms of global correlations is discussed, and the proposed reformulations of LRTC in terms of TT rank are presented. Section IV introduces two algorithms to solve the LRTC problems based on TT rank, followed by a discussion of computational complexity. Subsequently, the tensor augmentation scheme known as KA is proposed in Section V. Section VI provides experimental results and, finally, we conclude our work in Section VII.

II. TENSOR RANKS

Some mathematical notations and preliminaries of tensors are adopted from [1]. A tensor is a multi-dimensional array, and its order or mode is the number of its dimensions. Scalars are zero-order tensors denoted by lowercase letters ($x, y, z, \ldots$). Vectors and matrices are first- and second-order tensors, denoted by boldface lowercase letters ($\mathbf{x}, \mathbf{y}, \mathbf{z}, \ldots$) and capital letters ($X, Y, Z, \ldots$), respectively. A higher-order tensor (a tensor of order three or above) is denoted by a calligraphic letter ($\mathcal{X}, \mathcal{Y}, \mathcal{Z}, \ldots$).

An $N$th-order tensor is denoted as $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$, where $I_k$, $k = 1, \ldots, N$, is the dimension corresponding to mode $k$. The elements of $\mathcal{X}$ are denoted as $x_{i_1 \cdots i_k \cdots i_N}$, where $1 \le i_k \le I_k$, $k = 1, \ldots, N$. The Frobenius norm of $\mathcal{X}$ is $||\mathcal{X}||_F = \sqrt{\sum_{i_1} \sum_{i_2} \cdots \sum_{i_N} x^2_{i_1 i_2 \cdots i_N}}$.

A mode-$n$ fiber of a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ is a vector defined by fixing all indices but $i_n$, denoted by $x_{i_1 \ldots i_{n-1} \, : \, i_{n+1} \ldots i_N}$.

Mode-$n$ matricization (also known as mode-$n$ unfolding or flattening) of a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ is the process of unfolding or reshaping the tensor into a matrix $X_{(n)} \in \mathbb{R}^{I_n \times (I_1 \cdots I_{n-1} I_{n+1} \cdots I_N)}$ by rearranging the mode-$n$ fibers to be the columns of the resulting matrix. Tensor element $(i_1, \ldots, i_{n-1}, i_n, i_{n+1}, \ldots, i_N)$ maps to matrix element $(i_n, j)$ such that

$$j = 1 + \sum_{k=1, k \neq n}^{N} (i_k - 1) J_k \quad \text{with} \quad J_k = \prod_{m=1, m \neq n}^{k-1} I_m. \qquad (1)$$

The vector $r = (r_1, r_2, \ldots, r_N)$, where $r_n$ is the rank of the corresponding matrix $X_{(n)}$, denoted $r_n = \text{rank}(X_{(n)})$, is called the Tucker rank of the tensor $\mathcal{X}$. It is obvious that $\text{rank}(X_{(n)}) \le I_n$.
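For illustration, the mode-$n$ unfolding and the Tucker rank can be computed as in the sketch below (ours, not part of the paper; it assumes NumPy, and the Fortran-order reshape reproduces the column index $j$ of (1)):

import numpy as np

def unfold(X, n):
    # Mode-n matricization X_(n): mode-n fibers become the columns.
    # order='F' reproduces the column ordering j defined in (1).
    return np.reshape(np.moveaxis(X, n, 0), (X.shape[n], -1), order='F')

def tucker_rank(X):
    # Tucker rank r = (r_1, ..., r_N), with r_n = rank(X_(n)) <= I_n.
    return tuple(int(np.linalg.matrix_rank(unfold(X, n))) for n in range(X.ndim))

Because each $X_{(n)}$ has only $I_n$ rows, $r_n \le I_n$ no matter how the remaining modes are correlated, which is precisely the imbalance discussed in Section I.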

Using Vidal's decomposition [28], $\mathcal{X}$ can be represented by a sequence of connected low-order tensors in the form

$$\mathcal{X} = \sum_{i_1, \ldots, i_N} \Gamma^{[1]}_{i_1} \lambda^{[1]} \Gamma^{[2]}_{i_2} \lambda^{[2]} \cdots \lambda^{[N-1]} \Gamma^{[N]}_{i_N} \, e_{i_1} \otimes e_{i_2} \otimes \cdots \otimes e_{i_N}, \qquad (2)$$

where, for $k = 1, \ldots, N$, $\Gamma^{[k]}_{i_k}$ is an $r_{k-1} \times r_k$ matrix and $\lambda^{[k]}$ is the $r_k \times r_k$ diagonal singular value matrix, with $r_0 = r_{N+1} = 1$. For every $k$, the following orthogonality conditions are fulfilled:

$$\sum_{i_k=1}^{I_k} \Gamma^{[k]}_{i_k} \lambda^{[k]} \big(\Gamma^{[k]}_{i_k} \lambda^{[k]}\big)^T = I^{[k-1]}, \qquad (3)$$

$$\sum_{i_k=1}^{I_k} \big(\lambda^{[k-1]} \Gamma^{[k]}_{i_k}\big)^T \lambda^{[k-1]} \Gamma^{[k]}_{i_k} = I^{[k]}, \qquad (4)$$

where $I^{[k-1]}$ and $I^{[k]}$ are the identity matrices of sizes $r_{k-1} \times r_{k-1}$ and $r_k \times r_k$, respectively. The TT rank of the tensor is simply defined as $r = (r_1, r_2, \ldots, r_{N-1})$, and can be determined directly via the singular value matrices $\lambda^{[k]}$. Specifically, to determine $r_k$, rewrite (2) as

$$\mathcal{X} = \sum_{i_1, i_2, \ldots, i_N} u^{[1 \cdots k]}_{i_1 \cdots i_k} \, \lambda^{[k]} \, v^{[k+1 \cdots N]}_{i_{k+1} \cdots i_N}, \qquad (5)$$

where

$$u^{[1 \cdots k]}_{i_1 \cdots i_k} = \Gamma^{[1]}_{i_1} \lambda^{[1]} \cdots \Gamma^{[k]}_{i_k} \otimes_{l=1}^{k} e_{i_l}, \qquad (6)$$

and

$$v^{[k+1 \cdots N]}_{i_{k+1} \cdots i_N} = \Gamma^{[k+1]}_{i_{k+1}} \lambda^{[k+1]} \cdots \Gamma^{[N]}_{i_N} \otimes_{l=k+1}^{N} e_{i_l}. \qquad (7)$$

We can also rewrite (5) in terms of the matrix form of an SVD as

$$X_{[k]} = U \lambda^{[k]} V^T, \qquad (8)$$

where $X_{[k]} \in \mathbb{R}^{m \times n}$ ($m = \prod_{l=1}^{k} I_l$, $n = \prod_{l=k+1}^{N} I_l$) is the mode-$(1, 2, \ldots, k)$ matricization of the tensor $\mathcal{X}$ [8], and $U \in \mathbb{R}^{m \times r_k}$ and $V \in \mathbb{R}^{n \times r_k}$ are orthogonal matrices. The rank of $X_{[k]}$ is $r_k$, defined as the number of nonvanishing singular values of $\lambda^{[k]}$.

Since the matrix $X_{[k]}$ is obtained by matricizing along $k$ modes, its rank $r_k$ is bounded by $\min\big(\prod_{l=1}^{k} I_l, \prod_{l=k+1}^{N} I_l\big)$.
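For illustration, $X_{[k]}$ and the TT rank follow directly from (8) by counting the singular values above a small threshold (our sketch, assuming NumPy; the row/column ordering convention used for the reshape does not affect the ranks):

import numpy as np

def matricize_k(X, k):
    # Mode-(1,...,k) matricization X_[k] of size (I_1...I_k) x (I_{k+1}...I_N).
    m = int(np.prod(X.shape[:k]))
    return X.reshape(m, -1)

def tt_rank(X, tol=1e-10):
    # TT rank (r_1, ..., r_{N-1}): nonvanishing singular values of each X_[k].
    ranks = []
    for k in range(1, X.ndim):
        s = np.linalg.svd(matricize_k(X, k), compute_uv=False)
        ranks.append(int(np.sum(s > tol * s[0])))
    return tuple(ranks)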

III. TENSOR COMPLETION

This section first revisits the conventional formulation of LRTC based on the Tucker rank. Then, we propose a new approach to LRTC via TT rank optimization, which leads to two new optimization formulations, one based on nuclear norm minimization and the other on multilinear matrix factorization.

A. Conventional tensor completion

As tensor completion is fundamentally based on matrix completion, we first give an overview of the latter. Recovering missing entries of a matrix $T \in \mathbb{R}^{m \times n}$ from its partially known entries given by a subset $\Omega$ can be studied via the well-known matrix-rank optimization problem [40], [41]:

$$\min_{X} \ \text{rank}(X) \quad \text{s.t.} \quad X_\Omega = T_\Omega. \qquad (9)$$

The missing entries of $X$ are completed such that the rank of $X$ is as small as possible, i.e., the vector $(\sigma_1, \ldots, \sigma_{\min\{m,n\}})$ of the singular values $\sigma_k$ of $X$ is as sparse as possible. The sparsity of $(\sigma_1, \ldots, \sigma_{\min\{m,n\}})$ leads to an effective representation of $X$ for accurate completion. Due to the combinatorial nature of the function $\text{rank}(\cdot)$, however, the problem (9) is NP-hard. For the nuclear norm $||X||_* = \sum_{k=1}^{\min\{m,n\}} \sigma_k$, the following convex $\ell_1$ optimization problem in $(\sigma_1, \ldots, \sigma_{\min\{m,n\}})$ has been proved the most effective surrogate for (9) [12], [13], [25]:

$$\min_{X} \ ||X||_* \quad \text{s.t.} \quad X_\Omega = T_\Omega. \qquad (10)$$

It should be emphasized that the formulation (9) is efficient only when $X$ is balanced (square), i.e., $m \approx n$. It is likely that $\text{rank}(X) \approx m$ for unbalanced $X$ with $m \ll n$, i.e., there is not much difference between the optimal value of (9) and its upper bound $m$, in which case the rank optimization problem (9) is not interesting. More importantly, one needs at least $C n^{6/5} \, \text{rank}(X) \log n \approx C n^{6/5} m \log n$ sampled entries [26], with a positive constant $C$, to successfully complete $X$, which is almost the total $nm$ entries of $X$.
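For concreteness, a minimal sketch of attacking the surrogate (10) by iterating singular value thresholding and re-imposing the data constraint, in the spirit of [12] (our simplified illustration with a fixed hypothetical threshold tau, not the exact step-size schedule of [12]):

import numpy as np

def matrix_complete(T, omega, tau=1.0, iters=500):
    # T: observed matrix (zeros off omega); omega: boolean mask of known entries.
    X = T.copy()
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        X = (U * np.maximum(s - tau, 0.0)) @ Vt   # shrink singular values
        X[omega] = T[omega]                        # enforce X_Omega = T_Omega
    return X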

Completing an $N$th-order tensor $\mathcal{T} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ from its known entries given by an index set $\Omega$ is formulated by the following Tucker rank optimization problem [15], [18], [20], [21]:

$$\min_{X_{(k)}} \ \sum_{k=1}^{N} \alpha_k \, \text{rank}(X_{(k)}) \quad \text{s.t.} \quad \mathcal{X}_\Omega = \mathcal{T}_\Omega, \qquad (11)$$

where the weights $\{\alpha_k\}_{k=1}^{N}$ fulfill the condition $\sum_{k=1}^{N} \alpha_k = 1$. Problem (11) is then addressed by the following $\ell_1$ optimization problem [15]:

$$\min_{X_{(k)}} \ \sum_{k=1}^{N} \alpha_k ||X_{(k)}||_* \quad \text{s.t.} \quad \mathcal{X}_\Omega = \mathcal{T}_\Omega. \qquad (12)$$

Each matrix $X_{(k)}$ in (11) is obtained by matricizing the tensor along one single mode and thus is highly unbalanced. For instance, when all the modes have the same dimension ($I_1 = \cdots = I_N \equiv I$), its dimension is $I \times I^{N-1}$. As a consequence, its rank is low, which makes the matrix rank optimization formulation (11) less efficient for completing $\mathcal{T}$. Moreover, as analyzed above, it also makes the $\ell_1$ optimization problem (12) inefficient in addressing the rank optimization problem (11).

In the remainder of this subsection we show that $\text{rank}(X_{(k)})$ is not an appropriate means for capturing the global correlation of a tensor, as it captures only the correlation between a single mode (rather than a few modes) and the rest of the tensor.

Firstly, normalize $\mathcal{X}$ ($||\mathcal{X}||_F = 1$) and represent it as

$$\mathcal{X} = \sum_{i_1, i_2, \ldots, i_N} x_{i_1 i_2 \cdots i_N} \, e_{i_1} \otimes e_{i_2} \otimes \cdots \otimes e_{i_N}, \qquad (13)$$

where "$\otimes$" denotes a tensor product [1] and the $e_{i_k} \in \mathbb{R}^{I_k}$ form an orthonormal basis in $\mathbb{R}^{I_k}$ for each $k = 1, \ldots, N$. Applying mode-$k$ matricization of $\mathcal{X}$ results in $X_{(k)}$, representing a pure state of a composite system $AB$ in the space $\mathcal{H}_{AB} \in \mathbb{R}^{m \times n}$, which is a tensor product of two subspaces $\mathcal{H}_A \in \mathbb{R}^m$ and $\mathcal{H}_B \in \mathbb{R}^n$ of dimensions $m = I_k$ and $n = \prod_{l=1, l \neq k}^{N} I_l$, respectively. The subsystems $A$ and $B$ are seen as two contiguous partitions consisting of mode $k$ and all other modes of the tensor, respectively. It follows from (13) that

$$X_{(k)} = \sum_{i_k, j} x_{i_k j} \, e_{i_k} \otimes e_j, \qquad (14)$$

where the new index $j$ is defined as in (1) and $e_j = \otimes_{l=1, l \neq k}^{N} e_{i_l} \in \mathbb{R}^n$. According to the Schmidt decomposition [39], there exist orthonormal bases $\{u^A_l\}$ in $\mathcal{H}_A$ and $\{v^B_l\}$ in $\mathcal{H}_B$ such that

$$X_{(k)} = \sum_{l=1}^{r_k} \sigma_l \, u^A_l \otimes v^B_l, \qquad (15)$$

where $r_k$ is the rank of $X_{(k)}$ and the $\sigma_l$ are nonvanishing singular values satisfying $\sum_{l=1}^{r_k} \sigma_l^2 = 1$. The correlation between the two subsystems $A$ and $B$ can be studied via the von Neumann entropy, defined as [39]

$$S^A = -\text{Trace}\big(\rho^A \log_2(\rho^A)\big), \qquad (16)$$

where $\rho^A$ is the reduced density matrix operator of the composite system, computed by taking the partial trace of the density matrix $\rho^{AB}$ with respect to $B$. Specifically, we have

$$\rho^{AB} = X_{(k)} \otimes (X_{(k)})^T = \Big(\sum_{l=1}^{r_k} \sigma_l u^A_l \otimes v^B_l\Big) \otimes \Big(\sum_{j=1}^{r_k} \sigma_j u^A_j \otimes v^B_j\Big)^T. \qquad (17)$$

Then $\rho^A$ is computed as

$$\rho^A = \text{Trace}_B(\rho^{AB}) = \sum_{l=1}^{r_k} \sigma_l^2 \, u^A_l \otimes (u^A_l)^T. \qquad (18)$$

Substituting (18) into (16) yields

$$S^A = -\sum_{l=1}^{r_k} \sigma_l^2 \log_2 \sigma_l^2. \qquad (19)$$

Similarly,

$$S^B = -\text{Trace}\big(\rho^B \log_2(\rho^B)\big) = -\sum_{l=1}^{r_k} \sigma_l^2 \log_2 \sigma_l^2, \qquad (20)$$

which is the same as $S^A$; simply $S^A = S^B = S$. This entropy reflects the correlation, or degree of entanglement, between subsystem $A$ and its complement $B$ [42]. It is bounded by $0 \le S \le \log_2 r_k$. Obviously, there is no correlation between subsystems $A$ and $B$ whenever $S = 0$ (where $\sigma_1 = 1$ and the other singular values are zeros). There exists correlation between subsystems $A$ and $B$ whenever $S \neq 0$, with its maximum $S = \log_2 r_k$ (when $\sigma_1 = \cdots = \sigma_{r_k} = 1/\sqrt{r_k}$).

Furthermore, if the singular values decay significantly, e.g. exponentially, we can keep only a few of the largest singular values without considerably losing accuracy in quantifying the amount of correlation between the subsystems. The number $r_k$ of retained values is then referred to as the approximate low rank of the matrix $X_{(k)}$, which also means that the amount of correlation between the two subsystems $A$ (of mode $k$) and $B$ (of the other modes) is small. On the contrary, if the two subsystems are highly correlated, i.e. the singular values decay very slowly, then $r_k$ needs to be as large as possible.

From the above analysis, we see that the rank $r_k$ of $X_{(k)}$ is only capable of capturing the correlation between one mode $k$ and the others. Hence, the problem (11) does not take into account the correlation between a few modes and the rest of the tensor, and thus may not be sufficient for completing high-order tensors ($N > 3$). To overcome this weakness, in the next subsection we approach LRTC problems by optimizing the TT rank, which is defined by more balanced matrices and is able to capture the hidden correlation between the modes of the tensor more effectively.
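For illustration, the entropy (19) of any mode-$k$ matricization can be computed from its singular values, as in the following sketch (ours, assuming NumPy; the normalization $\sum_l \sigma_l^2 = 1$ is enforced inside):

import numpy as np

def von_neumann_entropy(X, k):
    # Mode-k unfolding; any consistent column ordering gives the same spectrum.
    Xk = np.moveaxis(X, k, 0).reshape(X.shape[k], -1)
    s = np.linalg.svd(Xk, compute_uv=False)
    p = (s / np.linalg.norm(s)) ** 2    # sigma_l^2, summing to 1
    p = p[p > 1e-15]                    # drop vanishing singular values
    return -np.sum(p * np.log2(p))      # S = -sum_l sigma_l^2 log2 sigma_l^2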

B. Tensor completion by TT rank optimization

A new approach to the LRTC problem in (11) is to address it by the following TT rank optimization:

$$\min_{X_{[k]}} \ \sum_{k=1}^{N-1} \alpha_k \, \text{rank}(X_{[k]}) \quad \text{s.t.} \quad \mathcal{X}_\Omega = \mathcal{T}_\Omega, \qquad (21)$$

where $\alpha_k$ denotes the weight that the TT rank of the matrix $X_{[k]}$ contributes, with the condition $\sum_{k=1}^{N-1} \alpha_k = 1$. Recall that $X_{[k]}$ is obtained by matricizing along $k$ modes, and thus its rank captures the correlation between $k$ modes and the other $N - k$ modes. Therefore, $(\text{rank}(X_{[1]}), \text{rank}(X_{[2]}), \ldots, \text{rank}(X_{[N-1]}))$ provides a much better means to capture the global correlation of a tensor, because it encompasses correlations between permutations of all modes.

The problem (21) is still difficult to handle, as $\text{rank}(\cdot)$ is presumably hard to optimize. Therefore, from (21), we propose the following two problems.

The first one, based on the so-called TT nuclear norm, defined as

$$||\mathcal{X}||_* = \sum_{k=1}^{N-1} \alpha_k ||X_{[k]}||_*, \qquad (22)$$

is given by

$$\min_{\mathcal{X}} \ \sum_{k=1}^{N-1} \alpha_k ||X_{[k]}||_* \quad \text{s.t.} \quad \mathcal{X}_\Omega = \mathcal{T}_\Omega. \qquad (23)$$

The concerned matrices in (23) are much more balanced than their counterparts in (12). As a result, the $\ell_1$ optimization problem (23) provides an effective means for the matrix rank optimization problem (21).

A particular case of (23) is the square model [43],

$$\min_{\mathcal{X}} \ ||X_{[\text{round}(N/2)]}||_* \quad \text{s.t.} \quad \mathcal{X}_\Omega = \mathcal{T}_\Omega, \qquad (24)$$

obtained by choosing the weights such that $\alpha_k = 1$ if $k = \text{round}(N/2)$ and $\alpha_k = 0$ otherwise. Although the single matrix $X_{[\text{round}(N/2)]}$ is balanced, and thus (24) is an effective means for minimizing $\text{rank}(X_{[\text{round}(N/2)]})$, it should be realized that it only captures the local correlation between $\text{round}(N/2)$ modes and the other $\text{round}(N/2)$ modes.

The second problem is based on the factorization model $X_{[k]} = U V$ for a matrix $X_{[k]} \in \mathbb{R}^{m \times n}$ of rank $r_k$, where $U \in \mathbb{R}^{m \times r_k}$ and $V \in \mathbb{R}^{r_k \times n}$. Instead of optimizing the nuclear norm of the unfolding matrices $X_{[k]}$ as in (23), the Frobenius norm is minimized:

$$\min_{U_k, V_k, \mathcal{X}} \ \sum_{k=1}^{N-1} \frac{\alpha_k}{2} ||U_k V_k - X_{[k]}||_F^2 \quad \text{s.t.} \quad \mathcal{X}_\Omega = \mathcal{T}_\Omega, \qquad (25)$$

where $U_k \in \mathbb{R}^{\prod_{j=1}^{k} I_j \times r_k}$ and $V_k \in \mathbb{R}^{r_k \times \prod_{j=k+1}^{N} I_j}$. This model is similar to the one proposed in [20], [21] (which is an extension of the matrix completion model [44]), where the Tucker rank is employed.

IV. PROPOSED ALGORITHMS

This section is devoted to the algorithmic development for the solution of the two optimization problems (23) and (25).

A. SiLRTC-TT

To address the problem (23), we first convert it to the following problem:

$$\min_{\mathcal{X}, M_k} \ \sum_{k=1}^{N-1} \Big( \alpha_k ||M_k||_* + \frac{\beta_k}{2} ||X_{[k]} - M_k||_F^2 \Big) \quad \text{s.t.} \quad \mathcal{X}_\Omega = \mathcal{T}_\Omega, \qquad (26)$$

where the $\beta_k$ are positive numbers. The central concept is based on the block coordinate descent (BCD) method: alternately optimize one group of variables while the other group remains fixed. More specifically, the variables are divided into two main groups. The first one contains the unfolding matrices $M_1, M_2, \ldots, M_{N-1}$, and the other is the tensor $\mathcal{X}$. Computing each matrix $M_k$ amounts to solving the following optimization problem:

$$\min_{M_k} \ \alpha_k ||M_k||_* + \frac{\beta_k}{2} ||X_{[k]} - M_k||_F^2, \qquad (27)$$


with fixed $X_{[k]}$. The optimal solution of this problem has the closed form [13]

$$M_k = D_{\gamma_k}(X_{[k]}), \qquad (28)$$

where $\gamma_k = \alpha_k / \beta_k$ and $D_{\gamma_k}(X_{[k]})$ denotes the thresholding SVD of $X_{[k]}$ [12]. Specifically, if the SVD of $X_{[k]}$ is $U \Sigma V^T$, its thresholding SVD is defined as

$$D_{\gamma_k}(X_{[k]}) = U \Sigma_{\gamma_k} V^T, \qquad (29)$$

where $\Sigma_{\gamma_k} = \text{diag}(\max(\sigma_l - \gamma_k, 0))$. After updating all the $M_k$ matrices, we turn to the other block to compute the tensor $\mathcal{X}$, whose elements are given by

$$x_{i_1 \cdots i_N} = \begin{cases} \left(\dfrac{\sum_{k=1}^{N-1} \beta_k \, \text{fold}(M_k)}{\sum_{k=1}^{N-1} \beta_k}\right)_{i_1 \cdots i_N}, & (i_1 \cdots i_N) \notin \Omega \\ t_{i_1 \cdots i_N}, & (i_1 \cdots i_N) \in \Omega \end{cases} \qquad (30)$$

The pseudocode of this algorithm is given in Algorithm I. We call it simple low-rank tensor completion via tensor train (SiLRTC-TT), as it is an enhancement of SiLRTC [15]. The convergence condition is reached when the relative error between two successive tensors $\mathcal{X}$ is smaller than a threshold. The algorithm is guaranteed to converge and gives rise to a global solution, since the objective in (26) is convex and the nonsmooth term is separable. We can also apply this algorithm to the square model [43] by simply choosing the weights such that $\alpha_k = 1$ if $k = \text{round}(N/2)$ and $\alpha_k = 0$ otherwise. For this particular case, the algorithm is called SiLRTC-Square.

Algorithm I: SiLRTC-TT

Input: The observed data $\mathcal{T} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$, index set $\Omega$.
Parameters: $\alpha_k, \beta_k$, $k = 1, \ldots, N-1$.
1: Initialization: $\mathcal{X}^0$ with $\mathcal{X}^0_\Omega = \mathcal{T}_\Omega$, $l = 0$.
2: While not converged do:
3:   for $k = 1$ to $N-1$ do
4:     Unfold the tensor $\mathcal{X}^l$ to get $X^l_{[k]}$
5:     $M^{l+1}_k = D_{\alpha_k/\beta_k}(X^l_{[k]})$
6:   end for
7:   Update $\mathcal{X}^{l+1}$ from the $M^{l+1}_k$ by (30)
8: End while
Output: The recovered tensor $\mathcal{X}$ as an approximation of $\mathcal{T}$
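For concreteness, one run of Algorithm I can be sketched as follows (our illustration, not the authors' released code; it assumes NumPy, a boolean mask omega of observed entries, and weight vectors alpha and beta of length $N-1$):

import numpy as np

def svt(M, tau):
    # Thresholding SVD D_tau(M) from (29).
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def silrtc_tt(T, omega, alpha, beta, tol=1e-4, maxiter=1000):
    # T: observed tensor (zeros off omega); omega: boolean mask of known entries.
    X, dims = T.copy(), T.shape
    for _ in range(maxiter):
        X_old = X.copy()
        acc = np.zeros_like(X)
        for k in range(1, X.ndim):                  # k = 1, ..., N-1
            m = int(np.prod(dims[:k]))
            Mk = svt(X.reshape(m, -1), alpha[k-1] / beta[k-1])   # (28)
            acc += beta[k-1] * Mk.reshape(dims)                  # fold back
        X = acc / np.sum(beta)                      # (30), entries off omega
        X[omega] = T[omega]                         # keep the known entries
        if np.linalg.norm(X - X_old) / np.linalg.norm(T) < tol:  # (43)
            break
    return X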

B. TMac-TT

To solve the problem given by (25), following TMac and TC-MLFM in [21] and [20], we apply the BCD method to alternately optimize different groups of variables. Specifically, we focus on the following problem:

$$\min_{U_k, V_k, X_{[k]}} \ ||U_k V_k - X_{[k]}||_F^2, \qquad (31)$$

for $k = 1, 2, \ldots, N-1$. This problem is convex in each of the variables $U_k$, $V_k$ and $X_{[k]}$ when the other two are kept fixed. To update each variable, perform the following steps:

$$U^{l+1}_k = X^l_{[k]} (V^l_k)^T \big(V^l_k (V^l_k)^T\big)^\dagger, \qquad (32)$$

$$V^{l+1}_k = \big((U^{l+1}_k)^T U^{l+1}_k\big)^\dagger (U^{l+1}_k)^T X^l_{[k]}, \qquad (33)$$

$$X^{l+1}_{[k]} = U^{l+1}_k V^{l+1}_k, \qquad (34)$$

where "$\dagger$" denotes the Moore-Penrose pseudoinverse. It was shown in [21] that we can replace (32) by

$$U^{l+1}_k = X^l_{[k]} (V^l_k)^T, \qquad (35)$$

to avoid computing the Moore-Penrose pseudoinverse $\big(V^l_k (V^l_k)^T\big)^\dagger$. The rationale behind this is that we only need the product $U^{l+1}_k V^{l+1}_k$ to compute $X^{l+1}_{[k]}$ in (34), which is the same whether (32) or (35) is used. After updating $U^{l+1}_k$, $V^{l+1}_k$ and $X^{l+1}_{[k]}$ for all $k = 1, 2, \ldots, N-1$, we compute the elements of the tensor $\mathcal{X}^{l+1}$ as follows:

$$x^{l+1}_{i_1 \cdots i_N} = \begin{cases} \Big(\sum_{k=1}^{N-1} \alpha_k \, \text{fold}(X^{l+1}_{[k]})\Big)_{i_1 \cdots i_N}, & (i_1 \cdots i_N) \notin \Omega \\ t_{i_1 \cdots i_N}, & (i_1 \cdots i_N) \in \Omega \end{cases} \qquad (36)$$

This algorithm is defined as tensor completion by parallel matrix factorization in the concept of tensor train (TMac-TT), and its pseudocode is summarized in Algorithm II. The essential advantage of this algorithm is that it avoids a lot of SVDs, and hence can substantially reduce computational time.

The algorithm can also be applied to the square model [43] by choosing the weights such that $\alpha_k = 1$ if $k = \text{round}(N/2)$ and $\alpha_k = 0$ otherwise. For this case, we define the algorithm TMac-Square.

Algorithm II: TMac-TT

Input: The observed data $\mathcal{T} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$, index set $\Omega$.
Parameters: $\alpha_k, r_k$, $k = 1, \ldots, N-1$.
1: Initialization: $U^0, V^0, \mathcal{X}^0$ with $\mathcal{X}^0_\Omega = \mathcal{T}_\Omega$, $l = 0$.
While not converged do:
2:   for $k = 1$ to $N-1$ do
3:     Unfold the tensor $\mathcal{X}^l$ to get $X^l_{[k]}$
4:     $U^{l+1}_k = X^l_{[k]} (V^l_k)^T$
5:     $V^{l+1}_k = \big((U^{l+1}_k)^T U^{l+1}_k\big)^\dagger (U^{l+1}_k)^T X^l_{[k]}$
6:     $X^{l+1}_{[k]} = U^{l+1}_k V^{l+1}_k$
7:   end for
8: Update the tensor $\mathcal{X}^{l+1}$ using (36)
End while
Output: The recovered tensor $\mathcal{X}$ as an approximation of $\mathcal{T}$
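For concreteness, Algorithm II can be sketched as follows (our illustration, not the authors' code; it assumes NumPy, a boolean mask omega, weights alpha and initial TT ranks of length $N-1$, and random initialization of the $V_k$):

import numpy as np

def tmac_tt(T, omega, alpha, ranks, tol=1e-4, maxiter=1000):
    # T: observed tensor (zeros off omega); omega: boolean mask of known entries.
    X, dims = T.copy(), T.shape
    N = T.ndim
    # V_k has shape r_k x (I_{k+1}...I_N); U_k is produced on the fly by (35).
    V = [np.random.randn(ranks[k-1], int(np.prod(dims[k:]))) for k in range(1, N)]
    for _ in range(maxiter):
        X_old = X.copy()
        acc = np.zeros_like(X)
        for k in range(1, N):
            Xk = X.reshape(int(np.prod(dims[:k])), -1)           # X_[k]
            U = Xk @ V[k-1].T                                    # (35)
            V[k-1] = np.linalg.pinv(U.T @ U) @ U.T @ Xk          # (33)
            acc += alpha[k-1] * (U @ V[k-1]).reshape(dims)       # (34), fold back
        X = acc                                                  # (36), off omega
        X[omega] = T[omega]
        if np.linalg.norm(X - X_old) / np.linalg.norm(T) < tol:
            break
    return X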

C. Computational complexity of algorithms

The computational complexities of the algorithms, per iteration, are given in Table I for completing a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$, where we assume that $I_1 = I_2 = \cdots = I_N = I$. The Tucker rank and TT rank are assumed to be equal, i.e. $r_1 = r_2 = \cdots = r_N = r$.

Table I: Computational complexity of algorithms for one iteration.

Algorithm      Computational complexity
SiLRTC         $O(N I^{N+1})$
SiLRTC-TT      $O(I^{3N/2} + I^{3N/2-1})$
TMac           $O(3 N I^N r)$
TMac-TT        $O(3 (N-1) I^N r)$


V. TENSOR AUGMENTATION

In this section, we introduce ket augmentation (KA) to represent a low-order tensor by a higher-order one, i.e. to cast an $N$th-order tensor $\mathcal{T} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ into a $K$th-order tensor $\tilde{\mathcal{T}} \in \mathbb{R}^{J_1 \times J_2 \times \cdots \times J_K}$, where $K \ge N$ and $\prod_{l=1}^{N} I_l = \prod_{l=1}^{K} J_l$. A higher-order representation of the tensor offers some important advantages. For instance, the TT decomposition is more efficient for the augmented tensor because the local structure of the data can be exploited effectively in terms of computational resources. Actually, if the tensor is slightly correlated, its augmented tensor can be represented by a low-rank TT [8], [45].

The concept of KA was originally introduced in [45] for casting a grayscale image into a real ket state of a Hilbert space, which is simply a higher-order tensor, using an appropriate block-structured addressing.

We define KA as a generalization of the original scheme to third-order tensors $\mathcal{T} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$ that represent color images, where $I_1 \times I_2 = 2^n \times 2^n$ ($n \ge 1 \in \mathbb{Z}$) is the number of pixels in the image and $I_3 = 3$ is the number of colors (red, green and blue). Let us start with an initial block, labeled as $i_1$, of $2 \times 2$ pixels corresponding to a single color $j$ (assume that the color is indexed by $j$, where $j = 1, 2, 3$ corresponds to red, green and blue, respectively). This block can be represented as

$$\mathcal{T}_{[2^1 \times 2^1 \times 1]} = \sum_{i_1=1}^{4} c_{i_1 j} \, e_{i_1}, \qquad (37)$$

where $c_{i_1 j}$ is the pixel value corresponding to color $j$ and $e_{i_1}$ is the orthonormal basis defined as $e_1 = (1, 0, 0, 0)$, $e_2 = (0, 1, 0, 0)$, $e_3 = (0, 0, 1, 0)$ and $e_4 = (0, 0, 0, 1)$. The values $i_1 = 1, 2, 3$ and $4$ can be considered as labeling the up-left, up-right, down-left and down-right pixels, respectively. For all three colors, we have three blocks, which are represented by

$$\mathcal{T}_{[2^1 \times 2^1 \times 3]} = \sum_{i_1=1}^{4} \sum_{j=1}^{3} c_{i_1 j} \, e_{i_1} \otimes u_j, \qquad (38)$$

where $u_j$ is also an orthonormal basis, defined as $u_1 = (1, 0, 0)$, $u_2 = (0, 1, 0)$, $u_3 = (0, 0, 1)$. We now consider a larger block, labeled as $i_2$, made up of four inner sub-blocks for each color $j$, as shown in Fig. 1. In total, the new block is represented by

$$\mathcal{T}_{[2^2 \times 2^2 \times 3]} = \sum_{i_2=1}^{4} \sum_{i_1=1}^{4} \sum_{j=1}^{3} c_{i_2 i_1 j} \, e_{i_2} \otimes e_{i_1} \otimes u_j. \qquad (39)$$

Generally, this block structure can be extended to a size of $2^n \times 2^n \times 3$ after several steps, until it represents all the pixel values of the image. Finally, the image can be cast into an $(n+1)$th-order tensor $\mathcal{C} \in \mathbb{R}^{4 \times 4 \times \cdots \times 4 \times 3}$ containing all the pixel values as follows:

$$\mathcal{T}_{[2^n \times 2^n \times 3]} = \sum_{i_n, \ldots, i_1 = 1}^{4} \sum_{j=1}^{3} c_{i_n \cdots i_1 j} \, e_{i_n} \otimes \cdots \otimes e_{i_1} \otimes u_j. \qquad (40)$$

This representation is suitable for image processing, as it not only preserves the pixel values but also rearranges them in a higher-order tensor such that the richness of textures in the image can be studied via the correlation between modes of the tensor [45]. Therefore, due to the flexibility of the TT rank, our proposed algorithms can ideally take advantage of KA.

Figure 1: A structured block addressing procedure to cast an image into a higher-order tensor. (a) Example for an image of size $2 \times 2 \times 3$ represented by (38). (b) Illustration for an image of size $2^2 \times 2^2 \times 3$ represented by (39).
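For illustration, KA of a $2^n \times 2^n \times 3$ image amounts to a bit-wise factorization of the spatial axes and a transpose (our sketch under the block addressing of (37)-(40); it assumes NumPy and orders each 4-valued index as up-left, up-right, down-left, down-right, coarsest block first):

import numpy as np

def ket_augment(img):
    # img: (2^n, 2^n, 3) array -> (4, 4, ..., 4, 3) tensor of order n+1.
    n = int(np.log2(img.shape[0]))
    assert img.shape[:2] == (2 ** n, 2 ** n)
    # Split each spatial axis into n factors of 2: (2,...,2, 2,...,2, 3).
    X = img.reshape((2,) * n + (2,) * n + (3,))
    # Pair (row bit, column bit) at each level into one size-4 KA mode,
    # coarsest block first; value 2*rowbit + colbit matches the labeling above.
    order = [ax for k in range(n) for ax in (k, n + k)] + [2 * n]
    return np.transpose(X, order).reshape((4,) * n + (3,))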

VI. SIMULATIONS

Extensive experiments are conducted with synthetic data, color images and videos. The proposed algorithms are benchmarked against TMac [21], TMac-Square, SiLRTC [15], SiLRTC-Square [43] and the state-of-the-art tensor completion methods FBCP [46] and STDC [47] (the latter applicable only to tensors of order $N = 3$). Additionally, we also benchmark the TT rank-based optimization algorithm ALS [34], [35].

The simulations for the algorithms are tested with respect to different missing ratios (mr) of the test data, with mr defined as

$$mr = \frac{p}{\prod_{k=1}^{N} I_k}, \qquad (41)$$

where $p$ is the number of missing entries, which are chosen randomly from a tensor $\mathcal{T}$ based on a uniform distribution.

To measure the performance of an LRTC algorithm, the relative square error (RSE) between the approximately recovered tensor $\mathcal{X}$ and the original one $\mathcal{T}$ is used, defined as

$$\text{RSE} = ||\mathcal{X} - \mathcal{T}||_F / ||\mathcal{T}||_F. \qquad (42)$$

The convergence criterion of our proposed algorithms is defined by computing the relative error of the tensor $\mathcal{X}$ between two successive iterations as follows:

$$\epsilon = \frac{||\mathcal{X}^{l+1} - \mathcal{X}^l||_F}{||\mathcal{T}||_F} \le tol, \qquad (43)$$

where $tol = 10^{-4}$ and the maximum number of iterations is maxiter = 1000. These simulations are implemented in a Matlab environment.
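For reference, the quantities (41)-(43) amount to the following (our sketch, assuming NumPy and a boolean mask omega marking the known entries):

import numpy as np

def missing_ratio(omega):
    return 1.0 - np.mean(omega)                    # p / prod(I_k), per (41)

def rse(X, T):
    return np.linalg.norm(X - T) / np.linalg.norm(T)   # (42)

def converged(X_new, X_old, T, tol=1e-4):
    return np.linalg.norm(X_new - X_old) / np.linalg.norm(T) <= tol  # (43)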


A. Initial parameters

In the experiments there are three parameters that must be initialized: the weighting parameters $\alpha$ and $\beta$, and the initial TT ranks ($r_k$, $k = 1, \ldots, N-1$) for TMac, TMac-TT and TMac-Square. Firstly, the weights $\alpha_k$ are defined as follows:

$$\alpha_k = \frac{\delta_k}{\sum_{k=1}^{N-1} \delta_k} \quad \text{with} \quad \delta_k = \min\Big(\prod_{l=1}^{k} I_l, \prod_{l=k+1}^{N} I_l\Big), \qquad (44)$$

where $k = 1, \ldots, N-1$. In this way, we assign the larger weights to the more balanced matrices. The positive parameters are chosen as $\beta_k = f \alpha_k$, where $f$ is chosen empirically from the values $[0.01, 0.05, 0.1, 0.5, 1]$ such that the algorithm performs best. Similarly, for SiLRTC and TMac, the weights are chosen as follows:

$$\alpha_k = \frac{I_k}{\sum_{k=1}^{N} I_k}, \qquad (45)$$

where $k = 1, \ldots, N$. The positive parameters are again chosen such that $\beta_k = f \alpha_k$, with $f$ chosen empirically from $[0.01, 0.05, 0.1, 0.5, 1]$ to give the best performance.

To obtain the initial TT ranks for TMac, TMac-TT and TMac-Square, each rank $r_i$ is bounded by keeping only the singular values that satisfy the following inequality:

$$\frac{\sigma^{[i]}_j}{\sigma^{[i]}_1} > th, \qquad (46)$$

with $j = 1, \ldots, r_i$, threshold $th$, and the $\{\sigma^{[i]}_j\}$ assumed to be in descending order. This condition is chosen so that matricizations of low rank (small correlation) have more singular values truncated. We also choose $th$ empirically based on the algorithms' performance.

It is important to highlight that these initial parameters can affect the performance of the proposed algorithms. Consequently, the proposed algorithms' performance may not necessarily be optimal, and future work will consider determining the optimal TT rank and weights via automatic [46] and/or adaptive methods [21].
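For illustration, the weights (44) and the threshold-based initial TT ranks of (46) can be computed as follows (our sketch, assuming NumPy; in practice the SVDs would be taken on the zero-filled observed tensor):

import numpy as np

def tt_weights(dims):
    # alpha_k per (44): larger weight for the more balanced matricizations.
    d = [min(int(np.prod(dims[:k])), int(np.prod(dims[k:]))) for k in range(1, len(dims))]
    return np.array(d, dtype=float) / np.sum(d)

def initial_tt_ranks(T, th=0.01):
    # r_i per (46): keep singular values with sigma_j / sigma_1 > th.
    dims, ranks = T.shape, []
    for k in range(1, T.ndim):
        s = np.linalg.svd(T.reshape(int(np.prod(dims[:k])), -1), compute_uv=False)
        ranks.append(int(np.sum(s / s[0] > th)))
    return ranks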

B. Synthetic data completion

We first perform simulations on two different types of low-rank tensors, generated synthetically in such a way that the Tucker and TT ranks are known in advance.

1) Completion of low TT rank tensors: The $N$th-order tensors $\mathcal{T} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ of TT rank $(r_1, r_2, \ldots, r_{N-1})$ are generated such that their elements are represented by a TT format [8]. Specifically, each element is $t_{i_1 i_2 \ldots i_N} = A^{[1]}_{i_1} A^{[2]}_{i_2} \cdots A^{[N]}_{i_N}$, where $A^{[1]} \in \mathbb{R}^{I_1 \times r_1}$, $A^{[N]} \in \mathbb{R}^{r_{N-1} \times I_N}$ and $A^{[k]} \in \mathbb{R}^{r_{k-1} \times I_k \times r_k}$ with $k = 2, \ldots, N-1$ are generated randomly with respect to the standard Gaussian distribution $\mathcal{N}(0, 1)$. For simplicity, in this paper we set all components of the TT rank the same, and likewise the dimension of each mode, i.e. $r_1 = r_2 = \cdots = r_{N-1} = r$ and $I_1 = I_2 = \cdots = I_N = I$.

The plots of RSE with respect to mr are shown in Figure 2 for tensors of different sizes: $40 \times 40 \times 40 \times 40$ (4D), $20 \times 20 \times 20 \times 20 \times 20$ (5D), $10 \times 10 \times 10 \times 10 \times 10 \times 10$ (6D) and $10 \times 10 \times 10 \times 10 \times 10 \times 10 \times 10$ (7D), with corresponding TT rank tuples $(10, 10, 10)$ (4D), $(5, 5, 5, 5)$ (5D), $(4, 4, 4, 4, 4)$ (6D) and $(4, 4, 4, 4, 4, 4)$ (7D). From the plots we can see that TMac-TT shows the best performance in most cases. In particular, TMac-TT can recover the tensor successfully despite high missing ratios; in most cases with high missing ratios, e.g. mr = 0.9, it can recover the tensor with RSE $\approx 10^{-4}$. More importantly, the proposed algorithms SiLRTC-TT and TMac-TT often perform better than their corresponding counterparts, SiLRTC and TMac. FBCP and ALS have the worst results with random synthetic data, so for the remaining synthetic data experiments only SiLRTC, SiLRTC-Square, SiLRTC-TT, TMac, TMac-Square and TMac-TT are compared.

Figure 2: The RSE comparison when applying different LRTC algorithms to synthetic random tensors of low TT rank. Simulation results are shown for different tensor dimensions: (a) 4D, (b) 5D, (c) 6D and (d) 7D. (Each panel plots RSE versus mr for FBCP, ALS, SiLRTC-TT, SiLRTC, TMac, TMac-TT, SiLRTC-Square and TMac-Square.)

For a better comparison of the performance of the different LRTC algorithms, we present phase diagrams, using grayscale to indicate how successfully a tensor can be recovered for a range of TT ranks and missing ratios. If RSE $\le \epsilon$, where $\epsilon$ is a small threshold, we say that the tensor is recovered successfully, represented by a white block in the phase diagram. Otherwise, if RSE $> \epsilon$, the tensor is recovered only partially, with a relative error, and the block color is gray. In particular, the recovery fails completely if RSE $= 1$. Concretely, we show in Fig. 3 the phase diagrams for the different algorithms applied to complete a 5D tensor of size $20 \times 20 \times 20 \times 20 \times 20$, where the TT rank $r$ varies from 2 to 16 and $\epsilon = 10^{-2}$.

Figure 3: Phase diagrams for low TT rank tensor completion when applying different algorithms to a 5D tensor. (Panels: (a) SiLRTC, (b) TMac, (c) SiLRTC-Square, (d) TMac-Square, (e) SiLRTC-TT, (f) TMac-TT; axes mr versus $r$, shaded from success to fail.)

We can see that our LRTC algorithms outperform the others. In particular, TMac-TT always successfully recovers the tensor for any TT rank and missing ratio.

2) Completion of low Tucker rank tensors: Let us now apply our proposed algorithms to synthetic random tensors of low Tucker rank, to make conditions most favorable for the Tucker-based algorithms. The $N$th-order tensor $\mathcal{T} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ of Tucker rank $(r_1, r_2, \ldots, r_N)$ is constructed by $\mathcal{T} = \mathcal{G} \times_1 A^{(1)} \times_2 A^{(2)} \cdots \times_N A^{(N)}$, where the core tensor $\mathcal{G} \in \mathbb{R}^{r_1 \times r_2 \times \cdots \times r_N}$ and the factor matrices $A^{(k)} \in \mathbb{R}^{r_k \times I_k}$, $k = 1, \ldots, N$, are generated randomly using the standard Gaussian distribution $\mathcal{N}(0, 1)$. Here, we choose $r_1 = r_2 = \cdots = r_N = r$ and $I_1 = I_2 = \cdots = I_N = I$ for simplicity. To compare the performance of the algorithms, we show in Fig. 5 the phase diagrams for the different algorithms applied to complete a 5D tensor of size $20 \times 20 \times 20 \times 20 \times 20$, where the Tucker rank $r$ varies from 2 to 16 and $\epsilon = 10^{-2}$. We can see that both TMac and TMac-TT perform much better than the others. Besides, SiLRTC-TT shows better performance when compared to SiLRTC and SiLRTC-Square. Similarly, TMac-TT is better than its particular case, TMac-Square.

In summary, TMac is well optimized for completing this type of data. Despite this, TMac-TT still performed close to TMac. In the case of low TT rank in Fig. 3, TMac performed poorly, which shows that Tucker-based algorithms are not flexible across different types of data.

The synthetic data experiments were performed as an initial test of the proposed algorithms. In order to have a better comparison between the algorithms, we benchmark the methods against real-world data such as color images and videos, where the ranks of the tensors are not known in advance. These results are presented in the subsequent subsections.

C. Image completion

The color images known as Peppers, Lena and House are employed to test the algorithms. All the images are initially represented by third-order tensors of the same size, $256 \times 256 \times 3$.

Note that when completing the third-order tensors directly, we do not expect the proposed methods to prevail against the conventional ones, due to the fact that the TT rank of such a tensor is a special case of the Tucker rank. Thus, the performance of the algorithms should be mutually comparable. However, for the purpose of comparing the performance of the different algorithms on real data (images) represented by higher-order tensors, we apply the tensor augmentation scheme KA mentioned above to reshape the third-order tensors to higher-order ones without changing the number of entries in the tensor. Specifically, we start our simulation by casting a third-order tensor $\mathcal{T} \in \mathbb{R}^{256 \times 256 \times 3}$ into a ninth-order $\tilde{\mathcal{T}} \in \mathbb{R}^{4 \times 4 \times 4 \times 4 \times 4 \times 4 \times 4 \times 4 \times 3}$ and then apply the tensor completion algorithms to impute its missing entries. We perform the simulation for the Peppers and Lena images, where missing entries of each image are chosen randomly according to a uniform distribution and the missing ratio mr varies from 0.5 to 0.9.

In Fig. 9, the performance of the algorithms on completing the Peppers image is shown. When the image is represented by a third-order tensor, the STDC algorithm performs very well against all methods, with the ALS algorithm performing poorly and the remaining algorithms having similar performance. However, for the case of the ninth-order tensors, the performances of the algorithms are clearly distinguished. Specifically, our proposed algorithms (especially TMac-TT) prevail against all other methods, as demonstrated in Fig. 4 for mr = 0.9. Furthermore, using the KA scheme to increase the tensor order, SiLRTC-TT and TMac-TT are at least comparable to STDC, with TMac-TT having the lowest RSE for mr = 0.9. More precisely, TMac-TT gives the best result of RSE $\approx 0.156$ when using the KA scheme. These strong results can be attributed to the capability of TT rank-based optimization algorithms to take full advantage of the correlations between modes of the tensor, which represent the richness of textures in the image.

The recovered images and the RSE results for the experiment performed on the Lena image with mr = 0.9 are shown in Fig. 6 and Fig. 10, respectively. These figures show that TMac-TT again gives the lowest RSE for each mr, thanks to its ability to utilize KA effectively.

For the House image, the missing entries are now chosen as white text, and hence the missing rate is fixed. The result is shown in Fig. 7. STDC provides the best performance without augmentation, while all other algorithms are comparable. However, the outlines of the text can still be clearly seen in the STDC image. Using tensor augmentation, TMac-TT and TMac-Square provide the best performance, with the text almost completely removed using TMac-TT.

D. Video completion with ket augmentation

In color video completion we benchmark FBCP, ALS, TMac, TMac-TT and TMac-Square on two videos, New York City (NYC) and bus (both available at https://engineering.purdue.edu/~reibman/ece634/). The other methods are computationally intractable or not applicable for $N \ge 4$ in this experiment.

Figure 4: Recovering the Peppers image with 90% of missing entries using different algorithms. Top row from left to right: the original image and its copy with 90% of missing entries. Second and third rows represent the recovery results of third-order (no order augmentation) and ninth-order tensors (KA augmentation), using different algorithms: STDC (only on second row), FBCP, ALS, SiLRTC-TT, SiLRTC, TMac, TMac-TT, SiLRTC-Square and TMac-Square from left to right, respectively. The Tucker-based SiLRTC and TMac perform considerably worse when using KA because they are based on an unbalanced matricization scheme, which is unable to take full advantage of KA.

Figure 5: Phase diagrams for low Tucker rank tensor completion when applying different algorithms to a 5D tensor. (Panels: (a) SiLRTC, (b) TMac, (c) SiLRTC-Square, (d) TMac-Square, (e) SiLRTC-TT, (f) TMac-TT.)

For each video, the following preprocessing is performed. First, resize the video to a tensor of size $81 \times 729 \times 1024 \times 3$ (frame $\times$ image row $\times$ image column $\times$ RGB). The first frame of each video can be seen in Figs. 8a and 8b. The frame mode is then merged with the image row mode to form a third-order tensor, which we define here as a video sequence tensor (VST), of size $59049 \times 1024 \times 3$ (combined row $\times$ image column $\times$ RGB). Examples of the VST can be seen for the range 20000:20700 of the combined row in Figs. 8c and 8d. Hence, rather than performing an image completion on each frame, we perform our tensor completion benchmark on the entire video. It is important to highlight that we only benchmark with a ket-augmented (not the third-order) VST, due to computational intractability for high-dimensional low-order tensors.

Using KA, the VST is reshaped to a low-dimensional high-order tensor of size $6 \times 6 \times 6 \times 6 \times 6 \times 6 \times 6 \times 6 \times 6 \times 6 \times 3$. This eleventh-order VST is directly used for the tensor completion algorithms; a sketch of the construction follows.
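The preprocessing amounts to two reshapes (our sketch, assuming NumPy; pairing one size-3 row factor with one size-2 column factor per level, coarsest first, is our reading of the KA block addressing, and is size-consistent since $59049 \times 1024 = 6^{10}$):

import numpy as np

def video_to_vst(video):
    # (81, 729, 1024, 3) -> (59049, 1024, 3): merge frame and image-row modes.
    f, r, c, ch = video.shape
    return video.reshape(f * r, c, ch)

def vst_to_ka(vst):
    # (59049, 1024, 3) -> 6 x 6 x ... x 6 x 3 (eleventh order):
    # factor rows into ten 3s and columns into ten 2s, then pair level by level.
    X = vst.reshape((3,) * 10 + (2,) * 10 + (3,))
    order = [ax for k in range(10) for ax in (k, 10 + k)] + [20]
    return np.transpose(X, order).reshape((6,) * 10 + (3,))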

Table II: RSE and SSIM tensor completion results for 95%, 90% and 70% missing entries from the NYC video.

Algorithm      mr = 0.95 (RSE / SSIM)   mr = 0.9 (RSE / SSIM)   mr = 0.7 (RSE / SSIM)
FBCP           0.210 / 0.395            0.210 / 0.395           0.210 / 0.396
ALS            0.193 / 0.397            0.189 / 0.398           0.168 / 0.429
TMac           0.185 / 0.605            0.143 / 0.750           0.055 / 0.967
TMac-TT        0.072 / 0.876            0.066 / 0.902           0.053 / 0.949
TMac-Square    0.111 / 0.722            0.076 / 0.901           0.056 / 0.946

For the case of 95% missing entries, the results of the benchmark can be seen in Figs. 11 and 12. The NYC results in Fig. 11 show that the FBCP and ALS reconstructions are completely incomprehensible, whereas only the TMac-based algorithms can successfully complete the video. Moreover, in this case TMac-TT outperforms all algorithms, which can be seen from the RSE and mean structural similarity index (SSIM) [48] (over all 81 frames) in Table II for mr = 0.95. For the bus results in Fig. 12, TMac-TT outperforms all algorithms. The other TT rank-based algorithm, ALS, can only manage a simple structure of the bus, and FBCP cannot produce any resemblance to the original video. Table III summarizes the RSE and mean SSIM results. With 90% missing entries, the results are similar to those of 95% missing entries; however, TMac-TT and TMac-Square are now comparable in performance for the NYC video.


Figure 6: Recovering the Lena image with 90% of missing entries using different algorithms. Top row from left to right: the original image and its copy with 90% of missing entries. Second and third rows represent the recovery results of third-order (no order augmentation) and ninth-order tensors (KA augmentation), using different algorithms: STDC (only on second row), FBCP, ALS, SiLRTC-TT, SiLRTC, TMac, TMac-TT, SiLRTC-Square and TMac-Square from left to right, respectively.

Figure 7: Recovering the House image with missing entries described by the white letters using different algorithms. Top row from left to right: the original image and its copy with white letters. Second and third rows represent the recovery results of third-order (no order augmentation) and ninth-order tensors (KA augmentation), using different algorithms: STDC (only on second row), FBCP, ALS, SiLRTC-TT, SiLRTC, TMac, TMac-TT, SiLRTC-Square and TMac-Square from left to right, respectively.

For the NYC video with mr = 0.7, Table II shows that all TMac-based algorithms are comparable, with FBCP and ALS unable to reproduce a sufficient approximation. In the bus video, TMac-TT and TMac-Square provide comparable RSE and SSIM.

Table III: RSE and SSIM tensor completion results for 95%, 90% and 70% missing entries from the bus video.

Algorithm      mr = 0.95 (RSE / SSIM)   mr = 0.9 (RSE / SSIM)   mr = 0.7 (RSE / SSIM)
FBCP           0.527 / 0.269            0.527 / 0.269           0.504 / 0.271
ALS            0.447 / 0.323            0.342 / 0.387           0.271 / 0.513
TMac           0.518 / 0.316            0.496 / 0.374           0.402 / 0.598
TMac-TT        0.154 / 0.807            0.092 / 0.932           0.062 / 0.974
TMac-Square    0.267 / 0.582            0.196 / 0.781           0.077 / 0.968

In summary, the bus video includes more vibrant colours and textures compared to the NYC video, which can be clearly seen from the overall SSIM performance in Tables II and III. It is important to highlight that TMac-TT still provides a high quality (SSIM = 0.807) approximation for the high missing ratio (mr = 0.95) test, where the next best result, of TMac-Square, had only SSIM = 0.582. This demonstrates the superiority of TMac-TT over the other algorithms for high missing ratio video completion problems.

VII. CONCLUSION

A novel approach to the LRTC problem based on TT rank was introduced, along with corresponding algorithms for its solution.

Figure 8: The first frames of the bus and NYC videos are shown in (a) and (b), respectively. In (c) and (d), the third-order VSTs for bus and NYC are shown for the range 20000:20700 of the combined row mode, respectively.

Figure 9: Performance comparison between different tensor completion algorithms, based on the RSE versus the missing rate when applied to the Peppers image. (a) Original tensor (no order augmentation). (b) Augmented tensor using the KA scheme.

Figure 10: Performance comparison between different tensor completion algorithms, based on the RSE versus the missing rate when applied to the Lena image. (a) Original tensor (no order augmentation). (b) Augmented tensor using the KA scheme.

The SiLRTC-TT algorithm was defined to minimize the TT rank of a tensor by TT nuclear norm optimization. Meanwhile, TMac-TT, based on the multilinear matrix factorization model, was proposed to minimize the TT rank; it is more computationally efficient because it does not need the SVD. The proposed algorithms were employed on both synthetic and real-world data represented by higher-order tensors. For synthetic data, our algorithms prevail against the others when the tensors have low TT rank, and their performance is comparable in the case of low Tucker rank tensors. The TT-based algorithms are quite promising and reliable when applied to real-world data. To validate this, we studied image and video completion problems. Benchmark results show that, when applied to original tensors without tensor augmentation, our algorithms are comparable to STDC in image completion. However, in the case of augmented tensors, our proposed algorithms not only outperform the others, but also provide better recovery results than the case without tensor order augmentation, in both the image and video completion experiments.

Applications of the proposed TT rank optimization based tensor completion to data compression, text mining, image classification and video indexing are of interest for future work.

REFERENCES

[1] T. G. Kolda and B. W. Bader, "Tensor decompositions and applications," SIAM Review, vol. 51, no. 3, pp. 455–500, 2009.

[2] M. Vasilescu and D. Terzopoulos, "Multilinear subspace analysis of image ensembles," in Proc. IEEE Conf. Computer Vision and Pattern Recognition, vol. 2, June 2003, pp. 93–99.

[3] J.-T. Sun, H.-J. Zeng, H. Liu, Y. Lu, and Z. Chen, "CubeSVD: A novel approach to personalized web search," in Proc. 14th Int'l World Wide Web Conf. (WWW '05), 2005, pp. 382–390.

[4] T. Franz, A. Schultz, S. Sizov, and S. Staab, "TripleRank: Ranking semantic web data by tensor decomposition," in The Semantic Web – ISWC 2009. Springer Berlin Heidelberg, 2009, vol. 5823, pp. 213–228.

[5] J. Carroll and J.-J. Chang, "Analysis of individual differences in multidimensional scaling via an n-way generalization of "Eckart-Young" decomposition," Psychometrika, vol. 35, no. 3, pp. 283–319, 1970.

[6] R. A. Harshman, "Foundations of the PARAFAC procedure: Models and conditions for an "explanatory" multi-modal factor analysis," UCLA Working Papers in Phonetics, vol. 16, no. 1, p. 84, 1970.

[7] L. R. Tucker, "Some mathematical notes on three-mode factor analysis," Psychometrika, vol. 31, no. 3, pp. 279–311, Sep 1966.

[8] I. V. Oseledets, "Tensor-train decomposition," SIAM J. Sci. Comput., vol. 33, no. 5, pp. 2295–2317, Jan 2011.

[9] M. Fannes, B. Nachtergaele, and R. Werner, "Finitely correlated states on quantum spin chains," Comm. Math. Phys., vol. 144, no. 3, pp. 443–490, 1992.

[10] A. Klümper, A. Schadschneider, and J. Zittartz, "Matrix product ground states for one-dimensional spin-1 quantum antiferromagnets," EPL, vol. 24, no. 4, p. 293, 1993.



[11] D. Perez-Garcia, F. Verstraete, M. M. Wolf, and J. I. Cirac, “Matrix product state representations,” Quant. Info. Comput., vol. 7, no. 5, pp. 401–430, 2007.

[12] J.-F. Cai, E. J. Candès, and Z. Shen, “A singular value thresholding algorithm for matrix completion,” SIAM J. Optim., vol. 20, no. 4, pp. 1956–1982, Jan 2010.

[13] S. Ma, D. Goldfarb, and L. Chen, “Fixed point and Bregman iterative methods for matrix rank minimization,” Math. Programm., vol. 128, no. 1-2, pp. 321–353, Sep 2009.

[14] B. Recht, M. Fazel, and P. A. Parrilo, “Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization,” SIAM Rev., vol. 52, no. 3, pp. 471–501, Jan 2010.

[15] J. Liu, P. Musialski, P. Wonka, and J. Ye, “Tensor completion for estimating missing values in visual data,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 208–220, Jan 2013.

[16] M. Signoretto, L. De Lathauwer, and J. A. K. Suykens, “Nuclear norms for tensors and their use for convex multilinear estimation,” ESAT-SISTA, K. U. Leuven (Leuven, Belgium), Tech. Rep., 2010.

[17] M. Signoretto, R. Van de Plas, B. De Moor, and J. Suykens, “Tensor versus matrix completion: A comparison with application to spectral data,” IEEE Signal Process. Lett., vol. 18, no. 7, pp. 403–406, Jul 2011.

[18] S. Gandy, B. Recht, and I. Yamada, “Tensor completion and low-n-rank tensor recovery via convex optimization,” Inv. Probl., vol. 27, no. 2, p. 025010, Jan 2011.

[19] R. Tomioka, T. Suzuki, K. Hayashi, and H. Kashima, “Statistical performance of convex tensor decomposition,” in Proc. Adv. Neural Inf. Process. Syst., Dec 2011, pp. 972–980.

[20] H. Tan, B. Cheng, W. Wang, Y.-J. Zhang, and B. Ran, “Tensor completion via a multi-linear low-n-rank factorization model,” Neurocomputing, vol. 133, pp. 161–169, Jun 2014.

[21] Y. Xu, R. Hao, W. Yin, and Z. Su, “Parallel matrix factorization for low-rank tensor completion,” Inverse Probl. Imaging, vol. 9, no. 2, pp. 601–624, Mar 2015.

[22] M. Bertalmio, G. Sapiro, V. Caselles, and C. Ballester, “Image inpainting,” in Proc. ACM Conf. Comp. Graph. (SIGGRAPH ’00), 2000, pp. 417–424.

[23] N. Komodakis, “Image completion using global optimization,” in Proc. IEEE Comput. Vis. Pattern Recognit., vol. 1, 2006, pp. 442–452.

[24] T. Korah and C. Rasmussen, “Spatiotemporal inpainting for recovering texture maps of occluded building facades,” IEEE Trans. Image Processing, vol. 16, no. 9, pp. 2262–2271, Sep 2007.

[25] F. R. Bach, “Consistency of trace norm minimization,” J. Mach. Learn. Res., vol. 9, pp. 1019–1048, Jun 2008.

[26] E. Candès and B. Recht, “Exact matrix completion via convex optimization,” Found. Comput. Math., vol. 9, no. 6, pp. 717–772, 2009.

[27] G. Vidal, “Efficient classical simulation of slightly entangled quantum computations,” Phys. Rev. Lett., vol. 91, p. 147902, Oct 2003.

[28] ——, “Efficient simulation of one-dimensional quantum many-body systems,” Phys. Rev. Lett., vol. 93, no. 4, Jul 2004.

[29] I. V. Oseledets, E. E. Tyrtyshnikov, and N. L. Zamarashkin, “Tensor-train ranks of matrices and their inverses,” Comput. Meth. Appl. Math., vol. 11, no. 3, pp. 394–403, 2011.



[30] E. Corona, A. Rahimian, and D. Zorin, “A tensor-train accelerated solver for integral equations in complex geometries,” arXiv preprint arXiv:1511.06029, 2015.

[31] I. V. Oseledets and S. V. Dolgov, “Solution of linear systems and matrix inversion in the TT-format,” SIAM Journal on Scientific Computing, vol. 34, no. 5, pp. A2718–A2739, 2012. [Online]. Available: http://dx.doi.org/10.1137/110833142

[32] T. Mach, “Computing inner eigenvalues of matrices in tensor train matrix format,” in Proc. 9th European Conference on Numerical Mathematics and Advanced Applications, Leicester, 2011, pp. 781–788.

[33] N. Lee and A. Cichocki, “Estimating a few extreme singular values and vectors for large-scale matrices in tensor train format,” SIAM Journal on Matrix Analysis and Applications, vol. 36, no. 3, pp. 994–1014, 2015.

[34] S. Holtz, T. Rohwedder, and R. Schneider, “The alternating linear scheme for tensor optimization in the tensor train format,” SIAM J. Sci. Comput., vol. 34, no. 2, pp. A683–A713, Jan 2012.

[35] L. Grasedyck, M. Kluge, and S. Kramer, “Variants of alternating least squares tensor completion in the tensor train format,” SIAM J. Sci. Comput., vol. 37, no. 5, pp. A2424–A2450, Jan 2015.

[36] C. D. Silva and F. J. Herrmann, “Optimization on the hierarchical Tucker manifold – applications to tensor completion,” Linear Algebra Appl., vol. 481, pp. 131–173, Sep 2015.

[37] H. Rauhut, R. Schneider, and Z. Stojanac, “Tensor completion in hierarchical tensor representations,” in Compressed Sensing Appl. Springer, 2015, pp. 419–450.

[38] J. A. Bengua, H. N. Phien, and H. D. Tuan, “Optimal feature extraction and classification of tensors via matrix product state decomposition,” in 2015 IEEE International Congress on Big Data, June 2015, pp. 669–672.

[39] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information. Cambridge University Press (CUP), 2009.

[40] M. Fazel, “Matrix rank minimization with applications,” Ph.D. dissertation, Stanford University, 2002.

[41] M. Kurucz, A. A. Benczur, and K. Csalogany, “Methods for large scale SVD with missing values,” in Proc. 13th ACM SIGKDD Int. Conf. Knowl. Discov. Data Mining, 2007.

[42] C. H. Bennett, D. P. DiVincenzo, J. A. Smolin, and W. K. Wootters, “Mixed-state entanglement and quantum error correction,” Phys. Rev. A, vol. 54, no. 5, pp. 3824–3851, Nov 1996.

[43] C. Mu, B. Huang, J. Wright, and D. Goldfarb, “Square deal: Lower bounds and improved relaxations for tensor recovery,” in Proc. 31st Int’l Conf. Machine Learning, ICML 2014, vol. 32, 2014, pp. 73–81.

[44] Z. Wen, W. Yin, and Y. Zhang, “Solving a low-rank factorization model for matrix completion by a nonlinear successive over-relaxation algorithm,” Math. Programm. Comput., vol. 4, no. 4, pp. 333–361, Jul 2012.

[45] J. I. Latorre, “Image compression and entanglement,” arXiv preprint quant-ph/0510031, 2005.

[46] Q. Zhao, L. Zhang, and A. Cichocki, “Bayesian CP factorization of incomplete tensors with automatic rank determination,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 9, pp. 1751–1763, Sept 2015.

[47] Y. L. Chen, C. T. Hsu, and H. Y. M. Liao, “Simultaneous tensor decomposition and completion using factor priors,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 3, pp. 577–591, March 2014.

[48] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, April 2004.

Johann A. Bengua received the B.E. degree in telecommunications engineering from the University of New South Wales, Kensington, Australia, in 2012, and the Ph.D. degree in electrical engineering from the University of Technology Sydney, Ultimo, Australia, in 2017. He is now a data scientist for Teradata Australia and New Zealand. His current research interests include tensor networks, machine learning, and image and video processing.

Ho N. Phien received the B.S. degree from Hue University, Vietnam, the M.Sc. degree in physics from the Hanoi Institute of Physics and Electronics, Vietnam, in 2007, and the Ph.D. degree in computational physics from the University of Queensland, Australia, in 2015. He was a postdoctoral fellow at the Faculty of Engineering and Information Technology, University of Technology Sydney, Australia, from 2015 to 2016. He is now a quantitative analyst with Westpac Group, Australia. His interests include mathematical modelling and numerical simulations for multi-dimensional data systems.

Hoang Duong Tuan received the Diploma (Hons.) and Ph.D. degrees in applied mathematics from Odessa State University, Ukraine, in 1987 and 1991, respectively. He spent nine academic years in Japan as an Assistant Professor in the Department of Electronic-Mechanical Engineering, Nagoya University, from 1994 to 1999, and then as an Associate Professor in the Department of Electrical and Computer Engineering, Toyota Technological Institute, Nagoya, from 1999 to 2003. He was a Professor with the School of Electrical Engineering and Telecommunications, University of New South Wales, from 2003 to 2011. He is currently a Professor with the Faculty of Engineering and Information Technology, University of Technology Sydney. He has been involved in research in the areas of optimization, control, signal processing, wireless communication, and biomedical engineering for more than 20 years.

Minh N. Do (M’01-SM’07-F’14) received the B.Eng. degree in computer engineering from the University of Canberra, Australia, in 1997, and the Dr.Sci. degree in communication systems from the Swiss Federal Institute of Technology Lausanne (EPFL), Switzerland, in 2001. Since 2002, he has been on the faculty at the University of Illinois at Urbana-Champaign (UIUC), where he is currently a Professor in the Department of ECE, and holds joint appointments with the Coordinated Science Laboratory, the Beckman Institute for Advanced Science and Technology, and the Department of Bioengineering. His research interests include signal processing, computational imaging, geometric vision, and data analytics. He received a CAREER Award from the National Science Foundation in 2003, and a Young Author Best Paper Award from IEEE in 2008. He was named a Beckman Fellow at the Center for Advanced Study, UIUC, in 2006, and received a Xerox Award for Faculty Research from the College of Engineering, UIUC, in 2007. He was a member of the IEEE Signal Processing Theory and Methods Technical Committee and the Image, Video, and Multidimensional Signal Processing Technical Committee, and an Associate Editor of the IEEE Transactions on Image Processing.