Multi-weight Matrix Completion with Arbitrary Subspace Prior Information

Hamideh Sadat Fazael Ardakani, Niloufar Rahmani, Sajad Daei

Abstract—Matrix completion refers to completing a low-rank matrix from a few observed entries and has been one of the most significant and widely used problems in recent years. The number of observations required for exact completion is directly proportional to the rank and the coherence parameter of the matrix. In many applications, additional information about the low-rank matrix of interest may exist. For example, in collaborative filtering, the Netflix problem, and dynamic channel estimation in communications, extra subspace information is available. More precisely, in these applications there are prior subspaces forming multiple angles with the ground-truth subspaces. In this paper, we propose a novel strategy to incorporate this information into the completion task. To this end, we design a multi-weight nuclear norm minimization where the weights are chosen so as to penalize each angle within the matrix subspace independently. We propose a new scheme for optimally choosing the weights: we first calculate an upper-bound expression describing the coherence of the matrix of interest, and then obtain the optimal weights by minimizing this expression. Simulation results certify the advantages of allowing multiple weights in the completion procedure. Explicitly, they indicate that our proposed multi-weight problem needs fewer observations compared to state-of-the-art methods.

Index Terms—Nuclear norm minimization, matrix completion, subspace prior information, non-uniform weights

I. INTRODUCTION

Noisy matrix completion refers to the task of recovering a low-rank matrix $X \in \mathbb{R}^{n\times n}$ with rank $r \ll n$ from a few noisy observations $Y = \mathcal{R}_\Omega(X+E) \in \mathbb{R}^{n\times n}$ [1], [2]. Here, $E$ is the noise matrix and the observation operator $\mathcal{R}_\Omega$ applied to a matrix $Z$ is defined as

$$\mathcal{R}_\Omega(Z) := \sum_{i,j=1}^{n} \frac{\varepsilon_{ij}}{p_{ij}}\,\langle Z, e_i e_j^T\rangle\, e_i e_j^T, \qquad (1)$$

where $p_{ij}$ is the probability of observing the $(i,j)$-th element of the matrix, and $\varepsilon_{ij}$ is a Bernoulli random variable taking the values one and zero with probabilities $p_{ij}$ and $1-p_{ij}$, respectively. While many matrices satisfy the observation model, it has been proved that, under suitable conditions, the consistent matrix of lowest rank is unique [3].
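As a concrete illustration of the sampling model, the following is a minimal sketch in Python, assuming NumPy; the function name and interface are our own:

import numpy as np

def sample_operator(Z, p, rng=None):
    # R_Omega(Z) from (1): keep entry (i, j) with probability p_ij and
    # rescale it by 1/p_ij, so that E[R_Omega(Z)] = Z.
    rng = np.random.default_rng() if rng is None else rng
    eps = rng.random(Z.shape) < p        # Bernoulli(p_ij) indicators eps_ij
    return np.where(eps, Z / p, 0.0)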

Hence, to promote the low-rank feature, the following rank minimization is employed:

$$\min_{Z\in\mathbb{R}^{n\times n}} \operatorname{rank}(Z) \quad \text{s.t.}\quad \|Y-\mathcal{R}_\Omega(Z)\|_F \le e, \qquad (2)$$

where $e$ is an upper bound on $\|\mathcal{R}_\Omega(E)\|_F$. Since (2) is generally NP-hard and intractable, it is common to replace it with the following surrogate optimization:

$$\min_{Z\in\mathbb{R}^{n\times n}} \|Z\|_* \quad \text{s.t.}\quad \|Y-\mathcal{R}_\Omega(Z)\|_F \le e, \qquad (3)$$

where $\|\cdot\|_*$ denotes the nuclear norm, which computes the sum of the singular values of a matrix and serves as a convex relaxation of the rank function [4].
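For intuition, a minimal sketch of (3) using CVXPY is shown below. The mask, tolerance, and all variable names are our own choices; for simplicity we absorb the $1/p_{ij}$ rescaling, which only rescales observed entries on both sides of the constraint:

import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, r = 20, 4
X = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # rank-r truth
M = (rng.random((n, n)) < 0.6).astype(float)  # observation mask
Y = M * X                                     # noiseless observations
e = 1e-6                                      # tolerance in (3)

Z = cp.Variable((n, n))
prob = cp.Problem(cp.Minimize(cp.norm(Z, "nuc")),
                  [cp.norm(cp.multiply(M, Z) - Y, "fro") <= e])
prob.solve()
X_hat = Z.value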

In many applications, such as quantum state tomography [5], MRI [6], [7], collaborative filtering [8], exploration seismology [9], and the Netflix problem [10], some prior knowledge about the ground-truth subspaces (i.e., the row and column subspaces of the ground-truth matrix $X$) is accessible. For instance, in the Netflix problem, prior evaluations of the movies by the referees can provide prior information about the ground-truth subspaces of the Netflix matrix. Further, in sensor network localization [11], some information about the positions of the sensors can be exploited as prior knowledge (c.f. [12, Section I] and [13, Section I.A] for more applications). The aforementioned prior subspace information often appears in the form of $r'$-dimensional column and row subspaces (denoted by $\tilde{\mathcal{U}}_{r'}$ and $\tilde{\mathcal{V}}_{r'}$) forming angles with the column and row spaces of the ground-truth matrix $X$, respectively. To incorporate the prior information into the recovery procedure, the following tractable problem for low-rank matrix completion is proposed:

$$\min_{Z\in\mathbb{R}^{n\times n}} \|Q_{\tilde{U}_{r'}}\, Z\, Q_{\tilde{V}_{r'}}\|_* \quad \text{s.t.}\quad \|Y-\mathcal{R}_\Omega(Z)\|_F \le e, \qquad (4)$$

where

$$Q_{\tilde{U}_{r'}} := \tilde{U}_{r'} \Lambda \tilde{U}_{r'}^H + P_{\tilde{U}_{r'}^\perp}, \qquad Q_{\tilde{V}_{r'}} := \tilde{V}_{r'} \Gamma \tilde{V}_{r'}^H + P_{\tilde{V}_{r'}^\perp}, \qquad (5)$$

$\Lambda, \Gamma$ are diagonal matrices whose elements are within the interval $[0,1]$, and $\tilde{U}_{r'}, \tilde{V}_{r'} \in \mathbb{R}^{n\times r'}$ denote bases for the subspaces $\tilde{\mathcal{U}}_{r'}$ and $\tilde{\mathcal{V}}_{r'}$, respectively. Also, the orthogonal projection onto the complement subspace is defined by $P_{\tilde{U}_{r'}^\perp} := I_n - \tilde{U}_{r'}\tilde{U}_{r'}^H$, where $I_n$ is the identity matrix of size $n$. According to these definitions, if $\Lambda = \Gamma = I_{r'}$, problem (4) reduces to the standard nuclear norm minimization (3). The values of the matrices $\Lambda$ and $\Gamma$ depend on the precision of the available prior knowledge for each direction (i.e., each column of $\tilde{U}_{r'}$), expressed through principal angles. For example, as the principal angle between an estimated basis vector (direction) and the true one increases, the accuracy of that direction estimate decreases, so the weight assigned to that direction should intuitively be large, close to $1$.
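A small sketch of how the reweighting matrices in (5) can be formed in the real case follows; the function name is ours:

import numpy as np

def reweight(U_tilde, w):
    # Q = U~ diag(w) U~^H + P_{U~ perp}, cf. (5).
    # U_tilde: n x r' orthonormal basis; w: r' weights in (0, 1].
    n = U_tilde.shape[0]
    P = U_tilde @ U_tilde.T                       # projection onto span(U~)
    return U_tilde @ np.diag(w) @ U_tilde.T + (np.eye(n) - P)

With w = np.ones(r1), reweight returns the identity, recovering the standard problem (3), consistent with the remark above.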

A. Contributions

In this paper, we propose a general scheme for low-rank matrix completion with prior subspace information. Since we


penalize the inaccuracy of each basis (direction) in the prior subspace individually, our proposed method provides more degrees of freedom compared to previous related works, which leads to fewer required observations for matrix completion. Moreover, we design an optimal strategy to promote the prior information so as to reduce the required number of samples for completion as much as possible. This is accomplished by assigning dedicated weights that penalize the bases of the prior subspaces in an optimal manner. Our theoretical and numerical results also certify that our devised method needs fewer samples for matrix completion than the similar existing methods in [13] and [14].

B. Prior Art and Key Differences

In this section, we summarize some of the related existing approaches for completing a low-rank matrix. The authors in [15] propose a weighted form of trace-norm regularization that outperforms the unweighted version:

$$\|X\|_{\mathrm{tr}} := \|\operatorname{diag}(\sqrt{p})\, X \operatorname{diag}(\sqrt{q})\|_*, \qquad (6)$$

in which $p(i)$ and $q(j)$ indicate the probabilities of observing the $i$-th row and the $j$-th column of the matrix, respectively.
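A one-function sketch of (6), assuming NumPy and with names of our choosing:

import numpy as np

def weighted_trace_norm(X, p, q):
    # ||diag(sqrt(p)) X diag(sqrt(q))||_* from (6);
    # p, q: length-n vectors of row/column observation probabilities.
    W = np.sqrt(p)[:, None] * X * np.sqrt(q)[None, :]
    return np.linalg.norm(W, ord="nuc")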

In [16], [17] and [18], the directions in the row and column subspaces of $X$ are penalized based on prior information. In [19], the authors discuss the problem of minimizing the re-weighted trace norm as an iterative heuristic and analyze its convergence. In [15] and [8], the authors consider a generalized nuclear norm to incorporate structural prior information into matrix completion and propose a scalable algorithm based on the approach of [20].

Aravkin et al. in [9] were the first to incorporate prior subspace information into low-rank matrix completion, via an iterative algorithm for solving

$$\min_{Z\in\mathbb{R}^{n\times n}} \|Q_{\tilde{U}_r}\, Z\, Q_{\tilde{V}_r}\|_* \quad \text{s.t.}\quad Y = \mathcal{A}(Z),$$

where

$$Q_{\tilde{U}_r} := \lambda P_{\tilde{U}_r} + P_{\tilde{U}_r^\perp}, \qquad Q_{\tilde{V}_r} := \gamma P_{\tilde{V}_r} + P_{\tilde{V}_r^\perp}, \qquad (7)$$

and $\lambda, \gamma$ depend on the maximum principal angle. Eftekhari et al. in [13] proved that the number of required

observations can be reduced compared to standard nuclear norm minimization in the presence of prior information. Their approach assigns the whole prior subspace a single weight, chosen by maximizing the coherence of the matrix of interest. Penalizing the whole subspace with a single weight seems unreasonable, since different directions within the subspace form different angles with those of the ground-truth matrix. Thus, it would be better to penalize the far directions more while encouraging the close ones, by assigning multiple weights. Further, their approach only works when the prior subspaces are sufficiently close to the ground-truth ones. In [14], the authors propose a weighting strategy similar to (4) for the low-rank matrix recovery problem. Their approach to choosing the weights is to weaken the restricted isometry property (RIP) condition of the measurement operator, and it is completely different from what the current work offers. Unfortunately, many measurement operators (including the matrix completion framework) fail to satisfy the RIP, and thus this approach cannot be applied to the matrix completion problem. Further, unlike [13], the prior subspaces considered in [14] can be either far from or close to the ground-truth subspaces. In other words, [14] has shown that even far prior subspaces can be beneficial in improving the performance of low-rank matrix recovery. There are also some works of a different flavor in matrix completion, such as [21], [22], which amount to designing algorithms for recovering coherent low-rank matrices. Despite these efforts, it is still unclear to what extent prior knowledge helps (or hurts) matrix completion.

Another related work with the same model as (7) is [12], where optimal weights are designed based on statistical dimension theory, in full contrast to other works that deal with RIP bounds. However, statistical dimension theory is not applicable to many measurement models, including the matrix completion framework. It is worth mentioning that our interest in weighted matrix completion is inspired by a closely related field known as compressed sensing [23], [24], in which Needell et al. [25] present recovery conditions for weighted $\ell_1$-minimization when several pieces of prior information about the support of a sparse signal are available. This prior information is organized into several sets, each contributing to the support with a certain degree of precision, and these sets are assigned non-uniform weights. It is equally important to note that the term "non-uniform weights" refers to multiple different weights; the two expressions are used interchangeably in this paper. This work can actually be regarded as an extension of [25] to the matrix completion case: in order to penalize different directions of the ground-truth matrix, we use non-uniform weights. Despite the shared general idea, the tools and analysis used in the current work differ substantially from those in [25] and involve highly challenging and non-trivial mathematical steps.

C. Outline and Notations

The paper is organized as follows. In Section II, we review the uniform (single-weight) strategy introduced in [13] in more detail. In Section III, we present our main result, which amounts to proposing a non-uniform weighted nuclear norm minimization. In Section IV, numerical results are provided which verify the superior performance of our method. Finally, the paper is concluded in Section V.

Throughout the paper, scalars are indicated by lowercase letters, vectors by lowercase boldface letters, and matrices by uppercase letters. The trace and Hermitian of a matrix are shown as $\operatorname{Tr}(\cdot)$ and $(\cdot)^H$, respectively. The Frobenius inner product is defined as $\langle A, B\rangle_F = \operatorname{Tr}(AB^H)$. $\|\cdot\|$ denotes the spectral norm, and $X \succeq 0$ means that $X$ is a positive semidefinite matrix. We describe a linear operator $\mathcal{A}: \mathbb{R}^{m\times n} \to \mathbb{R}^p$ by $\mathcal{A}X = [\langle X, A_1\rangle_F, \cdots, \langle X, A_p\rangle_F]^T$, where $A_i \in \mathbb{R}^{m\times n}$. The adjoint operator of $\mathcal{A}$ is defined as $\mathcal{A}^* y = \sum_{i=1}^{p} y_i A_i$, and $\mathcal{I}$ is the identity operator, i.e., $\mathcal{I}X = X$.

II. SINGLE WEIGHT NUCLEAR NORM MINIMIZATION

In this section, we explain the strategy of single-weight penalization employed in [13].


Definition 1 ([12]). Assume $\mathcal{U} \subseteq \mathbb{R}^n$ is an $r$-dimensional subspace and $P_{\mathcal{U}}$ indicates the orthogonal projection onto that subspace. The coherence of $\mathcal{U}$ is defined as

$$\mu(\mathcal{U}) := \frac{n}{r}\,\max_{1\le i\le n} \|P_{\mathcal{U}}\, e_i\|_2^2, \qquad (8)$$

where $e_i \in \mathbb{R}^n$ is the canonical vector having $1$ in the $i$-th location and zero elsewhere. In order to define principal angles between subspaces $\mathcal{U}$ and $\tilde{\mathcal{U}}$, let $r$ and $r'$ represent the dimensions of $\mathcal{U}$ and $\tilde{\mathcal{U}}$, respectively, with $r \le r'$. There exist $r$ non-increasing principal angles $\theta_u \in [0^\circ, 90^\circ]^r$,

$$\theta_u(i) = \min\Big\{\cos^{-1}\Big(\frac{|\langle u, \tilde u\rangle|}{\|u\|_2\|\tilde u\|_2}\Big) : u \in \mathcal{U},\ \tilde u \in \tilde{\mathcal{U}},\ u \perp u_j,\ \tilde u \perp \tilde u_j\ \ \forall j \in \{i+1, \cdots, r\}\Big\}, \qquad (9)$$

where $u_j$ and $\tilde u_j$ denote the principal vectors attaining $\theta_u(j)$, and $\theta_u(1)$ denotes the maximum principal angle.
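Both quantities in Definition 1 are easy to compute numerically. The sketch below, with names of our own, uses the standard characterization of principal angles through the singular values of $U^H\tilde U$:

import numpy as np

def coherence(U):
    # mu(U) = (n/r) max_i ||P_U e_i||_2^2 from (8); U: n x r orthonormal.
    # ||P_U e_i||_2 equals the l2-norm of the i-th row of U.
    n, r = U.shape
    return (n / r) * np.max(np.sum(U**2, axis=1))

def principal_angles_deg(U, U_tilde):
    # Principal angles of (9): arccos of the singular values of U^H U~.
    s = np.clip(np.linalg.svd(U.T @ U_tilde, compute_uv=False), 0.0, 1.0)
    return np.degrees(np.arccos(s))   # returned in increasing order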

Theorem 1 ([13]). Let $X_r = U_r \Sigma_r V_r^H \in \mathbb{R}^{n\times n}$ be a rank-$r$ truncated SVD of the matrix $X \in \mathbb{R}^{n\times n}$, for an integer $r \le n$, and let $X_{r^+} = X - X_r$ denote the residual. Consider $\tilde{\mathcal{U}}_r$ and $\tilde{\mathcal{V}}_r$ as prior subspace information for $\mathcal{U}_r = \operatorname{span}(X_r)$ and $\mathcal{V}_r = \operatorname{span}(X_r^H)$, respectively. Let $\eta(X_r) = \eta(U_r V_r^H) := \max_i \mu_i(\mathcal{U}_r) \vee \max_j \nu_j(\mathcal{V}_r)$ denote the coherence of $X_r$. Additionally, let $\bar U$ and $\bar V$ represent orthonormal bases for $\operatorname{span}([U_r, \tilde U_r])$ and $\operatorname{span}([V_r, \tilde V_r])$, respectively. For $\lambda, \gamma \in (0,1]$, if $\hat X$ is a solution to (4), then

$$\|\hat X - X\|_F \lesssim \frac{\|X_{r^+}\|_*}{\sqrt p} + e\sqrt{pn} \qquad (10)$$

provided that

$$1 \ge p \gtrsim \max[\log(\alpha_1 n), 1]\cdot\frac{\eta(X_r)\, r\log n}{n}\cdot\max\Big[\alpha_2^2\Big(1+\frac{\eta(\bar U\bar V^H)}{\eta(U_rV_r^H)}\Big), 1\Big], \qquad \alpha_3 \le \frac18, \qquad (11)$$

where $\eta(\bar U\bar V^H)$ represents the coherence of $\bar U\bar V^H$ and

$$\alpha_1 := \sqrt{\frac{\lambda^4\cos^2\theta_u(1)+\sin^2\theta_u(1)}{\lambda^2\cos^2\theta_u(1)+\sin^2\theta_u(1)}}\cdot\sqrt{\frac{\gamma^4\cos^2\theta_v(1)+\sin^2\theta_v(1)}{\gamma^2\cos^2\theta_v(1)+\sin^2\theta_v(1)}},$$

$$\alpha_2 := \Bigg(\sqrt{\frac{\lambda^2\cos^2\theta_u(1)+\sin^2\theta_u(1)}{\gamma^2\cos^2\theta_v(1)+\sin^2\theta_v(1)}} + \sqrt{\frac{\gamma^2\cos^2\theta_v(1)+\sin^2\theta_v(1)}{\lambda^2\cos^2\theta_u(1)+\sin^2\theta_u(1)}}\Bigg)\cdot\Big(\sqrt{\lambda^4\cos^2\theta_u(1)+\sin^2\theta_u(1)} + \sqrt{\gamma^4\cos^2\theta_v(1)+\sin^2\theta_v(1)}\Big),$$

$$\alpha_3 := \frac{3\sqrt{1-\lambda^2}\,\sin\theta_u(1)}{2\sqrt{\lambda^2\cos^2\theta_u(1)+\sin^2\theta_u(1)}} + \frac{3\sqrt{1-\gamma^2}\,\sin\theta_v(1)}{2\sqrt{\gamma^2\cos^2\theta_v(1)+\sin^2\theta_v(1)}}.$$

Remark 1. Taking $\lambda = \gamma = 1$ reduces problem (4) to the standard unweighted nuclear norm minimization (3); in this case $\alpha_1 = 1$, $\alpha_2 = 4$ and $\alpha_3 = 0$, so (11) becomes

$$1 \ge p \gtrsim \frac{\eta(X_r)\, r\log^2 n}{n}\Bigg(1+\sqrt{\frac{\eta(\bar U\bar V^H)}{\eta(U_rV_r^H)}}\Bigg). \qquad (12)$$

Due to the term $\sqrt{\eta(\bar U\bar V^H)/\eta(U_rV_r^H)}$, the required probability of observing an element is worse than the bound of [3] for solving (3) in the noisy case, which reads $1 \ge p \gtrsim \frac{\eta(X_r)\, r\log^2 n}{n}$.

III. NON-UNIFORM WEIGHTING

In this section, we generalize the single-weight (uniform) nuclear norm minimization approach to matrix completion of [13] to non-uniform weights. Consider $\tilde{\mathcal{U}}_{r'}$ and $\tilde{\mathcal{V}}_{r'}$ as prior subspace information forming angles with $\mathcal{U}_r$ and $\mathcal{V}_r$, respectively. We optimize the weights according to the values of the principal angles. Theorem 2 below provides performance guarantees for both noiseless and noisy matrix completion; the uniform weighting strategy is a special case of it.

Theorem 2. Let $X_r \in \mathbb{R}^{n\times n}$ be a rank-$r$ truncated SVD of the matrix $X$, and let $X_{r^+} = X - X_r$ denote the residual. Let $\mathcal{U}_r = \operatorname{span}(X_r)$ and $\mathcal{V}_r = \operatorname{span}(X_r^H)$ indicate the column and row subspaces of $X_r$, and let the $r'$-dimensional subspaces $\tilde{\mathcal{U}}_{r'}$ and $\tilde{\mathcal{V}}_{r'}$ be their corresponding prior subspace estimates, respectively. For each pair of subspaces, consider the non-increasing principal angle vectors

$$\theta_u = \angle[\mathcal{U}_r, \tilde{\mathcal{U}}_{r'}], \qquad \theta_v = \angle[\mathcal{V}_r, \tilde{\mathcal{V}}_{r'}], \qquad (13)$$

which indicate the accuracy of the prior information. Let $\eta(X_r) = \eta(U_rV_r^H) := \max_i \mu_i(\mathcal{U}_r) \vee \max_j \nu_j(\mathcal{V}_r)$ denote the coherence of $X_r$. Additionally, let $\bar U$ and $\bar V$ represent orthonormal bases for $\operatorname{span}([U_r, \tilde U_{r'}])$ and $\operatorname{span}([V_r, \tilde V_{r'}])$, respectively. Write $\Lambda = \operatorname{blkdiag}(\Lambda_1, \Lambda_2)$ and $\Gamma = \operatorname{blkdiag}(\Gamma_1, \Gamma_2)$ with $\Lambda_1, \Gamma_1 \in \mathbb{R}^{r\times r}$ and $\Lambda_2, \Gamma_2 \in \mathbb{R}^{(r'-r)\times(r'-r)}$, whose diagonals are $\lambda_1(i), \gamma_1(i), \lambda_2(i), \gamma_2(i)$. For $\Lambda_{i,i}, \Gamma_{i,i} \in (0,1]$, let $\hat X$ be a solution to (4). Then,

$$\|\hat X - X\|_F \lesssim \frac{\|X_{r^+}\|_*}{\sqrt p} + e\sqrt{pn} \qquad (14)$$

provided that

$$1 \ge p \gtrsim \max[\log(\alpha_4 n), 1]\cdot\frac{\eta(X_r)\, r\log n}{n}\cdot\max\Big[\alpha_5^2\Big(1+\frac{\eta(\bar U\bar V^H)}{\eta(U_rV_r^H)}\Big), 1\Big], \qquad \alpha_6 \le \frac14, \qquad (15)$$

where $\eta(\bar U\bar V^H)$ is the coherence of $\bar U\bar V^H$ and

$$\alpha_4 := \alpha_4(\theta_u, \theta_v, \lambda_1, \gamma_1) := \sqrt{\max_i \frac{\lambda_1(i)^4\cos^2\theta_u(i)+\sin^2\theta_u(i)}{\lambda_1(i)^2\cos^2\theta_u(i)+\sin^2\theta_u(i)}}\cdot\sqrt{\max_i \frac{\gamma_1(i)^4\cos^2\theta_v(i)+\sin^2\theta_v(i)}{\gamma_1(i)^2\cos^2\theta_v(i)+\sin^2\theta_v(i)}},$$

$$\alpha_5 := \alpha_5(\theta_u, \theta_v, \lambda_1, \gamma_1) := \sqrt{\max_i\big(\lambda_1(i)^2\cos^2\theta_u(i)+\sin^2\theta_u(i)\big)}\cdot\sqrt{\max_i \frac{\gamma_1(i)^4\cos^2\theta_v(i)+\sin^2\theta_v(i)}{\gamma_1(i)^2\cos^2\theta_v(i)+\sin^2\theta_v(i)}} + \sqrt{\max_i\big(\gamma_1(i)^2\cos^2\theta_v(i)+\sin^2\theta_v(i)\big)}\cdot\sqrt{\max_i \frac{\lambda_1(i)^4\cos^2\theta_u(i)+\sin^2\theta_u(i)}{\lambda_1(i)^2\cos^2\theta_u(i)+\sin^2\theta_u(i)}},$$

$$\alpha_6 := \alpha_6(\theta_u, \theta_v, \lambda_1, \lambda_2, \gamma_1, \gamma_2) := \sqrt{\max_i \frac{(1-\lambda_1(i)^2)^2\cos^2\theta_u(i)\sin^2\theta_u(i)}{\lambda_1(i)^2\cos^2\theta_u(i)+\sin^2\theta_u(i)}}\cdot\sqrt{\max_i \frac{(1-\gamma_1(i)^2)^2\cos^2\theta_v(i)\sin^2\theta_v(i)}{\gamma_1(i)^2\cos^2\theta_v(i)+\sin^2\theta_v(i)}}$$
$$\quad + \max\Big\{\max_i\big(1-\lambda_2(i)\big),\ \max_i\Big(1-\frac{\lambda_1(i)}{\sqrt{\lambda_1(i)^2\cos^2\theta_u(i)+\sin^2\theta_u(i)}}\Big)\Big\} + \max\Big\{\max_i\big(1-\gamma_2(i)\big),\ \max_i\Big(1-\frac{\gamma_1(i)}{\sqrt{\gamma_1(i)^2\cos^2\theta_v(i)+\sin^2\theta_v(i)}}\Big)\Big\}. \qquad (16)$$

Proof. See Appendix B.

Remark 2. If $\Lambda = \Gamma = I_{r'}$, then $Q_{\tilde U_{r'}} = Q_{\tilde V_{r'}} = I_n$ and problem (4) reduces to the standard matrix completion problem (3). Also, with $\Lambda = \lambda I_{r'}$ and $\Gamma = \gamma I_{r'}$, (4) reduces to the single-weight problem studied in [13].

Remark 3 (Finding optimal weights). Our goal is to reduce the number of required observations, or alternatively the recovery error, in the matrix completion problem. Therefore, we choose the optimal weights as those that minimize the lower bound on $p$ in (15), and consequently the recovery error in (14).

Remark 4. In our proposed model, all principal angles between the subspaces are assumed to be accessible. Accordingly, this provides higher degrees of freedom for reducing the number of required observations, which in turn translates into enhanced recovery accuracy. A numerical sketch of the weight selection of Remark 3 follows.
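As an illustration of Remark 3, the sketch below numerically minimizes the weight-dependent factor $\alpha_6$ of (16) as a simple proxy for the full lower bound in (15). It is a sketch under our own simplifying assumption $\lambda_2 = \gamma_2 = 1$ (so only $\lambda_1, \gamma_1$ are optimized), and all names are ours:

import numpy as np
from scipy.optimize import minimize

def alpha6(w, th_u, th_v):
    # alpha_6 from (16) with lambda_2 = gamma_2 = 1 (their max-terms vanish).
    r = len(th_u)
    lam, gam = w[:r], w[r:]
    cu, su = np.cos(th_u)**2, np.sin(th_u)**2
    cv, sv = np.cos(th_v)**2, np.sin(th_v)**2
    du = lam**2 * cu + su                       # lambda_1(i)^2 cos^2 + sin^2
    dv = gam**2 * cv + sv
    prod = np.sqrt(np.max((1 - lam**2)**2 * cu * su / du)) \
         * np.sqrt(np.max((1 - gam**2)**2 * cv * sv / dv))
    return prod + np.max(1 - lam / np.sqrt(du)) + np.max(1 - gam / np.sqrt(dv))

th_u = np.deg2rad([1.32, 1.72, 2.11, 3.07])     # example angles (degrees)
th_v = np.deg2rad([1.08, 1.70, 2.37, 2.73])
res = minimize(alpha6, x0=np.full(8, 0.5), args=(th_u, th_v),
               bounds=[(1e-3, 1.0)] * 8)        # weights constrained to (0, 1]
lam_opt, gam_opt = res.x[:4], res.x[4:]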

IV. SIMULATION RESULTS

In this section, we provide simulation experiments showing that the non-uniform weighting strategy performs better than the uniform weighting approach in the matrix completion problem. The simulation results are generated using the CVX package, and the optimal weights are obtained by numerical optimization. Consider a square matrix $X \in \mathbb{R}^{n\times n}$ with $n = 20$ and rank $r = 4$. We use a perturbation matrix $N \in \mathbb{R}^{n\times n}$, with independent zero-mean Gaussian entries of variance $10^{-4}$, to form $X' = X + N$. Then, we construct the prior subspaces $\tilde{\mathcal{U}}_{r'=r+4}$ and $\tilde{\mathcal{V}}_{r'=r+4}$ as spans of $X'$ and $X'^H$, respectively. Also, $\theta_u \in [0, 90]^r$ and $\theta_v \in [0, 90]^r$ are the known principal angles between the subspace pairs $[\mathcal{U}_r, \tilde{\mathcal{U}}_{r'}]$ and $[\mathcal{V}_r, \tilde{\mathcal{V}}_{r'}]$, respectively. Without loss of generality, the bases $U_r$ and $\tilde U_{r'}$ can be chosen such that

$$U_r^H \tilde U_{r'} = [\cos\theta_u \;\; 0_{r\times(r'-r)}], \qquad V_r^H \tilde V_{r'} = [\cos\theta_v \;\; 0_{r\times(r'-r)}], \qquad (17)$$

i.e., $U_r$ and $\tilde U_{r'}$ can be taken as the left and right singular matrices of $U_r^H \tilde U_{r'}$. Similar definitions hold for $V_r$ and $\tilde V_{r'}$. One way to reproduce this setup numerically is sketched below.
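The following sketch, with our own variable names, generates the experimental data described above (a noise standard deviation of $10^{-2}$ corresponds to variance $10^{-4}$):

import numpy as np

rng = np.random.default_rng(0)
n, r, r1 = 20, 4, 8

X = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # rank-r truth
Xp = X + 1e-2 * rng.standard_normal((n, n))  # X' = X + N, var(N_ij) = 1e-4

U, _, Vh = np.linalg.svd(X)
Ur, Vr = U[:, :r], Vh[:r, :].T               # true column/row bases

Up, _, Vph = np.linalg.svd(Xp)
U_tilde, V_tilde = Up[:, :r1], Vph[:r1, :].T # prior subspace bases

# principal angles between [U_r, U~_{r'}] via singular values, cf. (9)
cos_u = np.clip(np.linalg.svd(Ur.T @ U_tilde, compute_uv=False), 0, 1)
theta_u = np.degrees(np.arccos(cos_u))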

We compare the performance of the standard matrix completion problem (3) with the non-uniform weighting strategy (4); the single-weight nuclear norm minimization with optimal weights is also compared for different $\theta_u$ and $\theta_v$. Each experiment is repeated 50 times with different sampled entries and noise (in the noisy cases). With $\hat X$ denoting the solution of the problem, the normalized recovery error (NRE) is defined as $\mathrm{NRE} := \frac{\|\hat X - X\|_F}{\|X\|_F}$; an NRE less than $10^{-4}$ counts as a successful experiment.

Success rate and NRE (in the noiseless case) are shown in Fig. 1. In this experiment, the prior information is assumed to have good accuracy, and the principal angles between the subspaces are $\theta_u = [1.32, 1.72, 2.11, 3.07]$ and $\theta_v = [1.08, 1.70, 2.37, 2.73]$ degrees. As can be observed, matrix completion with non-uniform weights outperforms the standard and single-weight alternatives.

In Fig. 2, we investigate a case with similar parameters, assuming $\theta_u = [2.01, 8.28, 15.55, 20.26]$ and $\theta_v = [2.09, 10.5, 19.45, 22.00]$ degrees, with the difference that some directions are accurate and some are not. As expected, matrix completion with non-uniform weights again performs better than the other methods.

In Fig. 3, we consider prior information with weak accuracy, i.e., $\theta_u = [40.87, 49.63, 50.55, 69.39]$ and $\theta_v = [28.76, 37.83, 40.52, 63.65]$ degrees. In this experiment, similar to the previous ones, non-uniformly weighted matrix completion again shows superior performance compared to the other methods.

Fig. 1. Matrix completion using different approaches in the noiseless case, with $\theta_u = [1.32, 1.72, 2.11, 3.07]$ and $\theta_v = [1.08, 1.70, 2.37, 2.73]$ degrees.

Fig. 2. Matrix completion using different approaches in the noiseless case, with $\theta_u = [2.01, 8.28, 15.55, 20.26]$ and $\theta_v = [2.09, 10.5, 19.45, 22.00]$ degrees.

Fig. 3. Matrix completion using different approaches in the noiseless case, with $\theta_u = [40.87, 49.63, 50.55, 69.39]$ and $\theta_v = [28.76, 37.83, 40.52, 63.65]$ degrees.

V. CONCLUSION

In this work, we designed a novel framework for exploiting prior subspace information in the matrix completion problem. We first developed a weighted optimization problem to promote both the prior knowledge and the low-rank feature. Then, we proposed a novel theoretical way to obtain the optimal weights by minimizing the required number of observations. Numerical results were also provided which demonstrate the superior accuracy of our proposed approach compared to state-of-the-art methods.

APPENDIX A
REQUIRED LEMMAS AND BACKGROUND

In this section, some essential lemmas are provided which are required in the proof of Theorem 2.

A. Constructing the Bases

This section introduces the bases used to simplify the proofs.

Lemma 1 ([12]). Consider $X_r \in \mathbb{R}^{n\times n}$ as a rank-$r$ matrix with column and row subspaces $\mathcal{U}_r$ and $\mathcal{V}_r$, respectively. Also, let $\tilde{\mathcal{U}}_{r'}$ and $\tilde{\mathcal{V}}_{r'}$ be prior subspaces of dimension $r' \ge r$ with $r$ known principal angle vectors $\theta_u$ and $\theta_v$ relative to $\mathcal{U}_r$ and $\mathcal{V}_r$. Then there exist orthonormal matrices $U_r, V_r \in \mathbb{R}^{n\times r}$, $\tilde U_{r'}, \tilde V_{r'} \in \mathbb{R}^{n\times r'}$ and $B_L, B_R \in \mathbb{R}^{n\times n}$ such that

$$\mathcal{U}_r = \operatorname{span}(U_r), \quad \mathcal{V}_r = \operatorname{span}(V_r), \quad \tilde{\mathcal{U}}_{r'} = \operatorname{span}(\tilde U_{r'}), \quad \tilde{\mathcal{V}}_{r'} = \operatorname{span}(\tilde V_{r'}),$$

$$B_L := [U_r \;\; U'_{1,r} \;\; U'_{2,r'-r} \;\; U''_{n-r-r'}] \in \mathbb{R}^{n\times n}, \qquad B_R := [V_r \;\; V'_{1,r} \;\; V'_{2,r'-r} \;\; V''_{n-r-r'}] \in \mathbb{R}^{n\times n}. \qquad (18)$$

For the definitions of the submatrices, see [12, Section VI.A].

The following relation can be concluded from Lemma 1:

$$\tilde U_{r'} = B_L \begin{bmatrix} \cos\theta_u & \\ -\sin\theta_u & \\ & -I_{r'-r} \\ & \end{bmatrix}. \qquad (19)$$

Therefore, the orthogonal projections onto the subspaces $\tilde{\mathcal{U}}_{r'}$ and $\tilde{\mathcal{U}}_{r'}^\perp$ are

$$P_{\tilde U_{r'}} = \tilde U_{r'}\tilde U_{r'}^H = B_L \begin{bmatrix} \cos^2\theta_u & -\sin\theta_u\cos\theta_u & & \\ -\sin\theta_u\cos\theta_u & \sin^2\theta_u & & \\ & & I_{r'-r} & \\ & & & 0 \end{bmatrix} B_L^H,$$

$$P_{\tilde U_{r'}^\perp} = I - P_{\tilde U_{r'}} = B_L \begin{bmatrix} \sin^2\theta_u & \sin\theta_u\cos\theta_u & & \\ \sin\theta_u\cos\theta_u & \cos^2\theta_u & & \\ & & 0_{r'-r} & \\ & & & I_{n-r'-r} \end{bmatrix} B_L^H. \qquad (20)$$

We also have

$$Q_{\tilde U_{r'}} := \tilde U_{r'}\Lambda\tilde U_{r'}^H + P_{\tilde U_{r'}^\perp} = B_L \begin{bmatrix} \Lambda_1\cos^2\theta_u + \sin^2\theta_u & (I-\Lambda_1)\sin\theta_u\cos\theta_u & & \\ (I-\Lambda_1)\sin\theta_u\cos\theta_u & \Lambda_1\sin^2\theta_u + \cos^2\theta_u & & \\ & & \Lambda_2 & \\ & & & I_{n-r'-r} \end{bmatrix} B_L^H, \qquad (21)$$

in which $\Lambda := \operatorname{blkdiag}(\Lambda_1, \Lambda_2)$ with $\Lambda_1 \in \mathbb{R}^{r\times r}$ and $\Lambda_2 \in \mathbb{R}^{(r'-r)\times(r'-r)}$.

Now, in order to rewrite $Q_{\tilde U_{r'}}$ so that it contains an upper-triangular matrix, we first define the orthonormal basis

$$O_L := \begin{bmatrix} (\Lambda_1\cos^2\theta_u+\sin^2\theta_u)\Delta_L^{-1} & -(I-\Lambda_1)\sin\theta_u\cos\theta_u\,\Delta_L^{-1} & & \\ -(I-\Lambda_1)\sin\theta_u\cos\theta_u\,\Delta_L^{-1} & (\Lambda_1\cos^2\theta_u+\sin^2\theta_u)\Delta_L^{-1} & & \\ & & I_{r'-r} & \\ & & & I_{n-r'-r} \end{bmatrix}, \qquad (22)$$

in which, since $\Lambda_1 \succeq 0$, $\Delta_L := \sqrt{\Lambda_1^2\cos^2\theta_u+\sin^2\theta_u} \in \mathbb{R}^{r\times r}$ is an invertible diagonal matrix. Now (21) can be rewritten as

$$Q_{\tilde U_{r'}} = B_L (O_L O_L^H) \begin{bmatrix} \Lambda_1\cos^2\theta_u+\sin^2\theta_u & (I-\Lambda_1)\sin\theta_u\cos\theta_u & & \\ (I-\Lambda_1)\sin\theta_u\cos\theta_u & \Lambda_1\sin^2\theta_u+\cos^2\theta_u & & \\ & & \Lambda_2 & \\ & & & I_{n-r'-r} \end{bmatrix} B_L^H$$
$$= B_L O_L \begin{bmatrix} \Delta_L & (I-\Lambda_1^2)\sin\theta_u\cos\theta_u\,\Delta_L^{-1} & & \\ & \Lambda_1\Delta_L^{-1} & & \\ & & \Lambda_2 & \\ & & & I_{n-r'-r} \end{bmatrix} B_L^H =: B_L O_L \begin{bmatrix} L_{11} & L_{12} & & \\ & L_{22} & & \\ & & \Lambda_2 & \\ & & & I_{n-r'-r} \end{bmatrix} B_L^H = B_L O_L L B_L^H, \qquad (23)$$

where $L \in \mathbb{R}^{n\times n}$ is the block upper-triangular matrix

$$L := \begin{bmatrix} L_{11} & L_{12} & & \\ & L_{22} & & \\ & & \Lambda_2 & \\ & & & I_{n-r'-r} \end{bmatrix} = \begin{bmatrix} \Delta_L & (I-\Lambda_1^2)\sin\theta_u\cos\theta_u\,\Delta_L^{-1} & & \\ & \Lambda_1\Delta_L^{-1} & & \\ & & \Lambda_2 & \\ & & & I_{n-r'-r} \end{bmatrix}. \qquad (24)$$
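For concreteness, a sketch that assembles $L$ of (24) numerically, assuming $n \ge r'+r$ and diagonal weight blocks (names are ours):

import numpy as np

def build_L(lam1, lam2, theta_u, n):
    # Block upper-triangular L of (24); lam1: r weights, lam2: r'-r weights,
    # theta_u: r principal angles in radians. Requires n >= r' + r.
    r = len(lam1)
    rp = r + len(lam2)                               # rp = r'
    c, s = np.cos(theta_u), np.sin(theta_u)
    delta = np.sqrt(lam1**2 * c**2 + s**2)           # diagonal of Delta_L
    L = np.eye(n)                                    # trailing identity block
    L[:r, :r] = np.diag(delta)                       # L11 = Delta_L
    L[:r, r:2*r] = np.diag((1 - lam1**2) * s * c / delta)   # L12
    L[r:2*r, r:2*r] = np.diag(lam1 / delta)          # L22 = Lambda_1 Delta_L^{-1}
    L[2*r:r+rp, 2*r:r+rp] = np.diag(lam2)            # Lambda_2 block
    return L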

Since the matrices $B_L$ and $O_L$ are orthonormal, it follows that $\|Q_{\tilde U_{r'}}\| = \|L\| = 1$. Similar results can be deduced for the row subspace:

$$R := \begin{bmatrix} R_{11} & R_{12} & & \\ & R_{22} & & \\ & & \Gamma_2 & \\ & & & I_{n-r'-r} \end{bmatrix} = \begin{bmatrix} \Delta_R & (I-\Gamma_1^2)\sin\theta_v\cos\theta_v\,\Delta_R^{-1} & & \\ & \Gamma_1\Delta_R^{-1} & & \\ & & \Gamma_2 & \\ & & & I_{n-r'-r} \end{bmatrix}, \qquad (25)$$

where $\Delta_R := \sqrt{\Gamma_1^2\cos^2\theta_v+\sin^2\theta_v}$ has the same properties as $\Delta_L$. For an arbitrary matrix $H \in \mathbb{R}^{n\times n}$, one can write

$$Q_{\tilde U_{r'}} H Q_{\tilde V_{r'}} = B_L O_L L (B_L^H H B_R) R^H O_R^H B_R^H = B_L O_L L \bar H R^H O_R^H B_R^H \qquad (\bar H := B_L^H H B_R)$$
$$=: B_L O_L L \begin{bmatrix} \bar H_{11} & \bar H_{12} & \bar H_{13} & \bar H_{14} \\ \bar H_{21} & \bar H_{22} & \bar H_{23} & \bar H_{24} \\ \bar H_{31} & \bar H_{32} & \bar H_{33} & \bar H_{34} \\ \bar H_{41} & \bar H_{42} & \bar H_{43} & \bar H_{44} \end{bmatrix} R^H O_R^H B_R^H. \qquad (26)$$

Since $\operatorname{span}(X_r) = \operatorname{span}(U_r)$ and $\operatorname{span}(X_r^H) = \operatorname{span}(V_r)$, and with the upper-triangular matrices $L$ and $R$, we can rewrite $Q_{\tilde U_{r'}} X_r Q_{\tilde V_{r'}}$ in terms of the new bases:

$$Q_{\tilde U_{r'}} X_r Q_{\tilde V_{r'}} = B_L O_L L (B_L^H X_r B_R) R^H O_R^H B_R^H = B_L O_L L \bar X_r R^H O_R^H B_R^H \qquad (\bar X_r := B_L^H X_r B_R)$$
$$= B_L O_L L \begin{bmatrix} \bar X_{r,11} & \\ & 0_{n-r} \end{bmatrix} R^H O_R^H B_R^H = B_L O_L \begin{bmatrix} L_{11}\bar X_{r,11}R_{11}^H & \\ & 0_{n-r} \end{bmatrix} O_R^H B_R^H. \qquad (27)$$

Lemma 2. The operator norms of the sub-blocks of $L$ in (24) are as follows:

$$\|L_{11}\| = \|\Delta_L\| = \max_i \sqrt{\lambda_1(i)^2\cos^2\theta_u(i)+\sin^2\theta_u(i)},$$

$$\|L_{12}\| = \max_i \sqrt{\frac{(1-\lambda_1(i)^2)^2\cos^2\theta_u(i)\sin^2\theta_u(i)}{\lambda_1(i)^2\cos^2\theta_u(i)+\sin^2\theta_u(i)}},$$

$$\|I_r - L_{22}\| = \max_i \Big(1 - \frac{\lambda_1(i)}{\sqrt{\lambda_1(i)^2\cos^2\theta_u(i)+\sin^2\theta_u(i)}}\Big),$$

$$\|[L_{11}\;\; L_{12}]\| = \max_i \sqrt{\frac{\lambda_1(i)^4\cos^2\theta_u(i)+\sin^2\theta_u(i)}{\lambda_1(i)^2\cos^2\theta_u(i)+\sin^2\theta_u(i)}}, \qquad (28)$$

$$\Bigg\|\begin{bmatrix} 0_r & L_{12} & & \\ & L_{22}-I_r & & \\ & & \Lambda_2-I_{r'-r} & \\ & & & 0_{n-r'-r} \end{bmatrix}\Bigg\| = \sqrt{\max\{d_1(\theta_u, \lambda_1, \lambda_2),\, d_2(\theta_u, \lambda_1, \lambda_2)\}}, \qquad (29)$$

$$\Bigg\|\begin{bmatrix} I_r-L_{22} & \\ & I_{r'-r}-\Lambda_2 \end{bmatrix}\Bigg\| = \max\Big\{\max_i\big(1-\lambda_2(i)\big),\ \max_i\Big(1-\frac{\lambda_1(i)}{\sqrt{\lambda_1(i)^2\cos^2\theta_u(i)+\sin^2\theta_u(i)}}\Big)\Big\}, \qquad (30)$$

where

$$d_1(\theta, a, b) := \max_i \Bigg[\Big(\frac{a(i)}{\sqrt{a(i)^2\cos^2\theta(i)+\sin^2\theta(i)}}-1\Big)^2 + \frac{(1-a(i)^2)^2\cos^2\theta(i)\sin^2\theta(i)}{a(i)^2\cos^2\theta(i)+\sin^2\theta(i)}\Bigg], \qquad d_2(\theta, a, b) := \max_i \big(b(i)-1\big)^2. \qquad (31)$$

The same equalities hold for the sub-blocks of $R$.

Proof. See Appendix D.

B. Support Definitions

Let $X_r \in \mathbb{R}^{n\times n}$ be the rank-$r$ matrix obtained via the truncated SVD of $X$:

$$X = X_r + X_{r^+} = U_r \bar X_{r,11} V_r^H + X_{r^+},$$

where $U_r$ and $V_r$ are orthonormal bases of the column and row spaces of $X_r$; thus $\bar X_{r,11}$ is not necessarily diagonal. Also recall that $\mathcal{U}_r = \operatorname{span}(U_r) = \operatorname{span}(X_r)$ and $\mathcal{V}_r = \operatorname{span}(V_r) = \operatorname{span}(X_r^H)$ are the column and row subspaces of $X_r$, respectively. Then the support of $X_r$ is defined as

$$T := \{Z \in \mathbb{R}^{n\times n} : Z = P_{\mathcal{U}_r} Z P_{\mathcal{V}_r} + P_{\mathcal{U}_r} Z P_{\mathcal{V}_r^\perp} + P_{\mathcal{U}_r^\perp} Z P_{\mathcal{V}_r}\} = \operatorname{supp}(X_r), \qquad (32)$$

and the orthogonal projections onto $T$ and $T^\perp$ are

$$P_T(Z) = P_{\mathcal{U}} Z + Z P_{\mathcal{V}} - P_{\mathcal{U}} Z P_{\mathcal{V}}, \qquad P_{T^\perp}(Z) = P_{\mathcal{U}^\perp} Z P_{\mathcal{V}^\perp}. \qquad (33)$$

Using Lemma 1, we can rewrite $T$ as

$$T = \Big\{Z \in \mathbb{R}^{n\times n} : Z = B_L \bar Z B_R^H,\ \ \bar Z := \begin{bmatrix} \bar Z_{11} & \bar Z_{12} \\ \bar Z_{21} & 0_{n-r} \end{bmatrix}\Big\} = B_L \bar T B_R^H, \qquad (34)$$

where $\bar T \subset \mathbb{R}^{n\times n}$ is the support of $\bar X_r = B_L^H X_r B_R$:

$$\bar T = \Big\{\bar Z \in \mathbb{R}^{n\times n} : \bar Z := \begin{bmatrix} \bar Z_{11} & \bar Z_{12} \\ \bar Z_{21} & 0_{n-r} \end{bmatrix}\Big\}. \qquad (35)$$

For an arbitrary

$$\bar Z := \begin{bmatrix} \bar Z_{11} & \bar Z_{12} \\ \bar Z_{21} & \bar Z_{22} \end{bmatrix} \in \mathbb{R}^{n\times n}, \qquad (36)$$

the orthogonal projections onto $\bar T$ and its complement $\bar T^\perp$ are

$$P_{\bar T}(\bar Z) = \begin{bmatrix} \bar Z_{11} & \bar Z_{12} \\ \bar Z_{21} & 0_{n-r} \end{bmatrix}, \qquad P_{\bar T^\perp}(\bar Z) = \begin{bmatrix} 0_r & \\ & \bar Z_{22} \end{bmatrix}, \qquad (37)$$

respectively. Since $Z = B_L \bar Z B_R^H$, one can write

$$P_T(Z) = B_L P_{\bar T}(\bar Z) B_R^H, \qquad P_{T^\perp}(Z) = B_L P_{\bar T^\perp}(\bar Z) B_R^H. \qquad (38)$$
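The projections of (33) admit a direct implementation; the following is a minimal sketch for real matrices, with names of our own:

import numpy as np

def P_T(Z, Ur, Vr):
    # P_T(Z) = P_U Z + Z P_V - P_U Z P_V, cf. (33); Ur, Vr orthonormal.
    PU, PV = Ur @ Ur.T, Vr @ Vr.T
    return PU @ Z + Z @ PV - PU @ Z @ PV

def P_T_perp(Z, Ur, Vr):
    # P_{T perp}(Z) = (I - P_U) Z (I - P_V), cf. (33).
    n = Z.shape[0]
    return (np.eye(n) - Ur @ Ur.T) @ Z @ (np.eye(n) - Vr @ Vr.T)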

APPENDIX B
PROOF OF THEOREM 2

Proof. For the rank-$r$ matrix $X_r$, if $U_r$ and $V_r$ denote orthonormal bases, then the column and row subspaces are

$$\mathcal{U}_r = \operatorname{span}(X_r) = \operatorname{span}(U_r), \qquad \mathcal{V}_r = \operatorname{span}(X_r^H) = \operatorname{span}(V_r),$$

and the coherences of the $i$-th row of $U_r$ and the $j$-th row of $V_r$ are

$$\mu_i = \mu_i(\mathcal{U}_r) := \frac{n}{r}\|U_r[i,:]\|_2^2,\ \ i \in [1:n], \qquad \nu_j = \nu_j(\mathcal{V}_r) := \frac{n}{r}\|V_r[j,:]\|_2^2,\ \ j \in [1:n]. \qquad (39)$$

As these definitions show, the coherence of a subspace is independent of the choice of orthonormal basis. According to Definition 1 and (39), the coherence of a matrix equals the maximum coherence of its subspaces:

$$\eta(X_r) = \max_i \mu_i(\mathcal{U}_r) \vee \max_j \nu_j(\mathcal{V}_r). \qquad (40)$$

Similarly, for the subspaces $\bar{\mathcal{U}} = \operatorname{span}([U_r, \tilde U_{r'}])$ and $\bar{\mathcal{V}} = \operatorname{span}([V_r, \tilde V_{r'}])$, we define

$$\bar\mu_i = \mu_i(\bar{\mathcal{U}}),\ \ i \in [1:n], \qquad \bar\nu_j = \nu_j(\bar{\mathcal{V}}),\ \ j \in [1:n]. \qquad (41)$$

For simplicity, we collect the coherences in a diagonal matrix

$$\boldsymbol{\mu} = \begin{bmatrix} \mu_1 & & \\ & \ddots & \\ & & \mu_n \end{bmatrix}; \qquad (42)$$

although only $\mu_i$ is used in this definition, analogous matrices are formed for the other coherences. Letting $\|A\|_{2\to\infty}$ denote the maximum $\ell_2$-norm of the rows of $A$, we have

$$\Big\|\Big(\frac{\boldsymbol{\mu} r}{n}\Big)^{-\frac12} U_r\Big\|_{2\to\infty} = 1, \qquad \Big\|\Big(\frac{\boldsymbol{\nu} r}{n}\Big)^{-\frac12} V_r\Big\|_{2\to\infty} = 1,$$
$$\Big\|\Big(\frac{\boldsymbol{\mu} r}{n}\Big)^{-\frac12} U'_{r'}\Big\|_{2\to\infty} \le \sqrt{\frac{r'+r}{r}\max_i \frac{\bar\mu_i}{\mu_i}}, \qquad \Big\|\Big(\frac{\boldsymbol{\nu} r}{n}\Big)^{-\frac12} V'_{r'}\Big\|_{2\to\infty} \le \sqrt{\frac{r'+r}{r}\max_j \frac{\bar\nu_j}{\nu_j}}. \qquad (43)$$

With $\bar U$ an orthonormal basis of $\bar{\mathcal{U}}$, the latter bounds follow from

$$\Big\|\Big(\frac{\boldsymbol{\mu} r}{n}\Big)^{-\frac12} U'_{r'}\Big\|_{2\to\infty} = \max_i \frac{\|U'_{r'}[i,:]\|_2}{\|U_r[i,:]\|_2} \le \max_i \frac{\|\bar U[i,:]\|_2}{\|U_r[i,:]\|_2} \le \sqrt{\max_i \frac{\bar\mu_i \dim(\bar{\mathcal{U}})}{\mu_i\, r}} \le \sqrt{\frac{r'+r}{r}\max_i \frac{\bar\mu_i}{\mu_i}} \qquad \big(\dim(\bar{\mathcal{U}}) \le r'+r\big), \qquad (44)$$

where the first inequality in (44) follows from $\mathcal{U}'_{r'} \subset \bar{\mathcal{U}}$. According to the definition of the sampling operator in (1), we can write

$$\mathcal{R}_\Omega(X) = \sum_{i,j=1}^{n} \frac{\varepsilon_{ij}}{p_{ij}}\, X[i,j]\, C_{ij}, \qquad (45)$$

where $C_{ij} := e_i e_j^T$.


Now, we provide some properties of the sampling operator that are imperative for developing our main theory. To list these properties, we first define the following norms: $\|\cdot\|_{\mu(\infty)}$ and $\|\cdot\|_{\mu(\infty,2)}$ measure a weighted $\ell_\infty$-norm and the maximum weighted $\ell_2$-norm of the rows and columns of a matrix, respectively. For a square matrix $Z \in \mathbb{R}^{n\times n}$ we set

$$\|Z\|_{\mu(\infty)} = \Big\|\Big(\frac{\boldsymbol{\mu} r}{n}\Big)^{-\frac12} Z \Big(\frac{\boldsymbol{\nu} r}{n}\Big)^{-\frac12}\Big\|_\infty = \max_{i,j} \sqrt{\frac{n}{\mu_i r}}\,|Z[i,j]|\,\sqrt{\frac{n}{\nu_j r}}, \qquad (46)$$

$$\|Z\|_{\mu(\infty,2)} = \Big\|\Big(\frac{\boldsymbol{\mu} r}{n}\Big)^{-\frac12} Z\Big\|_{2\to\infty} \vee \Big\|\Big(\frac{\boldsymbol{\nu} r}{n}\Big)^{-\frac12} Z^H\Big\|_{2\to\infty} = \Big(\max_i \sqrt{\frac{n}{\mu_i r}}\,\|Z[i,:]\|_2\Big) \vee \Big(\max_j \sqrt{\frac{n}{\nu_j r}}\,\|Z[:,j]\|_2\Big). \qquad (47)$$
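The two weighted norms of (46)-(47) translate directly into code; a short sketch with our own names:

import numpy as np

def norm_mu_inf(Z, mu, nu, r):
    # ||Z||_{mu(inf)} from (46); mu, nu: length-n coherence vectors.
    n = Z.shape[0]
    wl = np.sqrt(n / (mu * r))          # (n / (mu_i r))^{1/2}
    wr = np.sqrt(n / (nu * r))
    return np.max(wl[:, None] * np.abs(Z) * wr[None, :])

def norm_mu_inf2(Z, mu, nu, r):
    # ||Z||_{mu(inf,2)} from (47): weighted max row/column l2-norms.
    n = Z.shape[0]
    rows = np.max(np.sqrt(n / (mu * r)) * np.linalg.norm(Z, axis=1))
    cols = np.max(np.sqrt(n / (nu * r)) * np.linalg.norm(Z, axis=0))
    return max(rows, cols)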

Lemma 3 ([26]). Consider the sampling operator defined in (1), let $T$ be the support of $X_r$, and let $P_T$ be the orthogonal projection onto that support. Then

$$\|(P_T - P_T \circ \mathcal{R}_\Omega \circ P_T)\|_{F\to F} \le \frac12$$

on condition that

$$\frac{(\mu_i+\nu_j)\, r\log n}{n} \le p_{ij} \le 1 \qquad \forall i, j \in [1:n].$$

Lemma 4 ([26]). Under the assumptions of Lemma 3, for every fixed matrix $Z$ we have

$$\|(\mathcal{I} - \mathcal{R}_\Omega)(Z)\| \le \|Z\|_{\mu(\infty)} + \|Z\|_{\mu(\infty,2)}$$

under the same condition as Lemma 3.

Lemma 5 ([26]). Under the assumptions of Lemma 3, for every fixed matrix $Z$ with $P_T(Z) = Z$ we have

$$\|(P_T - P_T \circ \mathcal{R}_\Omega \circ P_T)(Z)\|_{\mu(\infty,2)} \le \frac12\|Z\|_{\mu(\infty,2)} + \frac12\|Z\|_{\mu(\infty)}$$

under the same condition as Lemma 3.

Lemma 6 ([26]). Under the assumptions of Lemma 3, for a fixed matrix $Z \in T$,

$$\|(P_T - P_T \circ \mathcal{R}_\Omega \circ P_T)(Z)\|_{\mu(\infty)} \le \frac12\|Z\|_{\mu(\infty)}.$$

In some places we will use $\bar{\mathcal{R}}_\Omega$ instead of $\mathcal{R}_\Omega$, defined as

$$\bar{\mathcal{R}}_\Omega(\bar Z) = B_L^H\, \mathcal{R}_\Omega(B_L \bar Z B_R^H)\, B_R. \qquad (48)$$

In accordance with the sampling operator $\mathcal{R}_\Omega$, we define the orthogonal projection $\mathcal{P}_p$ that projects the input onto the support of $\mathcal{R}_\Omega$:

$$\mathcal{P}_p(Z) = \sum_{i,j=1}^{n} \varepsilon_{ij}\, Z[i,j]\, C_{ij}, \qquad (49)$$

and, similarly to $\bar{\mathcal{R}}_\Omega$, we set

$$\bar{\mathcal{P}}_p(\bar Z) = B_L^H\, \mathcal{P}_p(B_L \bar Z B_R^H)\, B_R. \qquad (50)$$
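A minimal sketch of the change-of-basis operator (48), assuming real orthonormal $B_L, B_R$ and reusing the sampling of (1); names are ours:

import numpy as np

def R_omega_bar(Z_bar, BL, BR, p, rng):
    # bar R_Omega(bar Z) = B_L^H R_Omega(B_L bar Z B_R^H) B_R, cf. (48).
    Z = BL @ Z_bar @ BR.T
    eps = rng.random(Z.shape) < p            # same Bernoulli masks as in (1)
    return BL.T @ np.where(eps, Z / p, 0.0) @ BR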

In the following, we state some principal properties of the variables just defined.

Lemma 7. For an arbitrary matrix $Z$ with $\bar Z := B_L^H Z B_R$, and for the operators $\mathcal{P}_p$, $\bar{\mathcal{P}}_p$, $\mathcal{R}_\Omega$ and $\bar{\mathcal{R}}_\Omega$, we have

$$\langle Z, \mathcal{R}_\Omega(Z)\rangle = \langle \bar Z, \bar{\mathcal{R}}_\Omega(\bar Z)\rangle, \qquad (51)$$
$$\|\mathcal{R}_\Omega(Z)\|_F = \|\bar{\mathcal{R}}_\Omega(\bar Z)\|_F. \qquad (52)$$

Also, if $0 \le l \le h \le 1$ and $p_{ij} \in [l, h]$, then

$$(\bar{\mathcal{R}}_\Omega \circ \bar{\mathcal{R}}_\Omega)(\cdot) \succeq \bar{\mathcal{R}}_\Omega(\cdot), \qquad (53)$$
$$\|\mathcal{R}_\Omega(\cdot)\|_{F\to F} = \|\bar{\mathcal{R}}_\Omega(\cdot)\|_{F\to F} \le l^{-1}. \qquad (54)$$

In the last two relations, for operators $\mathcal{A}(\cdot)$ and $\mathcal{B}(\cdot)$, $\mathcal{A}(\cdot) \succeq \mathcal{B}(\cdot)$ means that $\langle Z, \mathcal{A}(Z)\rangle \ge \langle Z, \mathcal{B}(Z)\rangle$ for every matrix $Z$. Finally,

$$\|\bar{\mathcal{P}}_p(\bar Z)\|_F \le h\,\|\bar{\mathcal{R}}_\Omega(\bar Z)\|_F. \qquad (55)$$

With these lemmas in hand, we can complete the proof of Theorem 2. Consider first $X = X_r$, a rank-$r$ matrix observed without noise; the extension to approximately low-rank matrices and noisy observations is given at the end of this appendix. Let $\hat X$ and $H := \hat X - X$ be the solution and the estimation error of (4), respectively. Then,

$$\|Q_{\tilde U_{r'}}(X+H)Q_{\tilde V_{r'}}\|_* \le \|Q_{\tilde U_{r'}} X Q_{\tilde V_{r'}}\|_*. \qquad (56)$$

For the right-hand side of (56), we have

$$\|Q_{\tilde U_{r'}} X Q_{\tilde V_{r'}}\|_* = \|Q_{\tilde U_{r'}} X_r Q_{\tilde V_{r'}}\|_* = \|B_L O_L L \bar X_r R^H O_R^H B_R^H\|_* = \|L \bar X_r R^H\|_* = \Bigg\|\begin{bmatrix} L_{11}\bar X_{r,11}R_{11}^H & \\ & 0_{n-r} \end{bmatrix}\Bigg\|_*, \qquad (57)$$

which holds since $B_L, O_L, B_R$ and $O_R$ are orthonormal. For the left-hand side of (56), we have

$$\|Q_{\tilde U_{r'}}(X+H)Q_{\tilde V_{r'}}\|_* = \|Q_{\tilde U_{r'}}(X_r+H)Q_{\tilde V_{r'}}\|_* \ge \|B_L O_L L(\bar X_r + \bar H)R^H O_R^H B_R^H\|_* - \|\bar X_{r^+}\|_*$$
$$= \|L(\bar X_r + \bar H)R^H\|_* - \|\bar X_{r^+}\|_* = \Bigg\|\begin{bmatrix} L_{11}\bar X_{r,11}R_{11}^H & \\ & 0_{n-r} \end{bmatrix} + L\bar H R^H\Bigg\|_* - \|\bar X_{r^+}\|_*. \qquad (58)$$

Substituting these two bounds in (56) and using the convexity of the nuclear norm, we get

$$\langle L\bar H R^H, G\rangle \le \Bigg\|\begin{bmatrix} L_{11}\bar X_{r,11}R_{11}^H & \\ & 0_{n-r} \end{bmatrix} + L\bar H R^H\Bigg\|_* - \Bigg\|\begin{bmatrix} L_{11}\bar X_{r,11}R_{11}^H & \\ & 0_{n-r} \end{bmatrix}\Bigg\|_* \le 0 \qquad \forall G \in \partial\Bigg\|\begin{bmatrix} L_{11}\bar X_{r,11}R_{11}^H & \\ & 0_{n-r} \end{bmatrix}\Bigg\|_*. \qquad (59)$$

In (59), $\partial\|A\|_*$ denotes the subdifferential of the nuclear norm at the point $A$ [4]. In order to describe the subdifferential fully, note that $\operatorname{rank}(\bar X_{r,11}) = \operatorname{rank}(X_r) = r$ and, for non-zero weights, $\operatorname{rank}(L_{11}\bar X_{r,11}R_{11}^H) = r$. Consider the SVD

$$L_{11}\bar X_{r,11}R_{11}^H = \bar U_r \bar\Delta_r \bar V_r^H.$$

Also, let $\bar S$ be the sign matrix defined as

$$\bar S := \begin{bmatrix} \bar S_{11} & \\ & 0_{n-r} \end{bmatrix} := \begin{bmatrix} \bar U_r \bar V_r^H & \\ & 0_{n-r} \end{bmatrix}. \qquad (60)$$

Finally, the subdifferential is

$$\partial\Bigg\|\begin{bmatrix} L_{11}\bar X_{r,11}R_{11}^H & \\ & 0_{n-r} \end{bmatrix}\Bigg\|_* = \Big\{G \in \mathbb{R}^{n\times n} : G = \begin{bmatrix} \bar S_{11} & \\ & \bar G_{22} \end{bmatrix},\ \bar G_{22} \in \mathbb{R}^{(n-r)\times(n-r)},\ \|\bar G_{22}\| \le 1\Big\}$$
$$= \Big\{G \in \mathbb{R}^{n\times n} : P_{\bar T}(G) = \bar S,\ \|P_{\bar T^\perp}(G)\| \le 1\Big\}. \qquad (61)$$

Considering (60), we have

$$\operatorname{rank}(\bar S) = \operatorname{rank}(\bar S_{11}) = r, \qquad \|\bar S\| = \|\bar S_{11}\| = 1, \qquad \|\bar S\|_F = \|\bar S_{11}\|_F = \sqrt r. \qquad (62)$$

According to the above relations, and for $\|\bar G_{22}\| \le 1$, (59) becomes

$$\Big\langle L\bar H R^H,\ \bar S + \begin{bmatrix} 0_r & \\ & \bar G_{22} \end{bmatrix}\Big\rangle \le 0. \qquad (63)$$

Hence,

$$0 \ge \langle L\bar H R^H, \bar S\rangle + \sup_{\|\bar G_{22}\|\le 1}\Big\langle L\bar H R^H, \begin{bmatrix} 0_r & \\ & \bar G_{22} \end{bmatrix}\Big\rangle = \langle L\bar H R^H, \bar S\rangle + \sup_{\|G\|\le 1}\langle P_{\bar T^\perp}(L\bar H R^H), G\rangle$$
$$= \langle L\bar H R^H, \bar S\rangle + \|P_{\bar T^\perp}(L\bar H R^H)\|_* = \langle \bar H, L^H \bar S R\rangle + \|P_{\bar T^\perp}(L\bar H R^H)\|_*$$
$$= \Bigg\langle \bar H, \begin{bmatrix} L_{11}\bar S_{11}R_{11} & L_{11}\bar S_{11}R_{12} & \\ L_{12}^H\bar S_{11}R_{11} & L_{12}^H\bar S_{11}R_{12} & \\ & & 0_{n-2r} \end{bmatrix}\Bigg\rangle + \Bigg\|\begin{bmatrix} 0_r & & & \\ & L_{22}\bar H_{22}R_{22} & L_{22}\bar H_{23}\Gamma_2 & L_{22}\bar H_{24} \\ & \Lambda_2\bar H_{32}R_{22} & \Lambda_2\bar H_{33}\Gamma_2 & \Lambda_2\bar H_{34} \\ & \bar H_{42}R_{22} & \bar H_{43}\Gamma_2 & \bar H_{44} \end{bmatrix}\Bigg\|_*$$
$$= \Bigg\langle \bar H, \begin{bmatrix} L_{11}\bar S_{11}R_{11} & L_{11}\bar S_{11}R_{12} & \\ L_{12}^H\bar S_{11}R_{11} & 0_r & \\ & & 0_{n-2r} \end{bmatrix}\Bigg\rangle + \langle \bar H_{22}, L_{12}^H\bar S_{11}R_{12}\rangle + \Bigg\|\begin{bmatrix} 0_r & & & \\ & L_{22}\bar H_{22}R_{22} & L_{22}\bar H_{23}\Gamma_2 & L_{22}\bar H_{24} \\ & \Lambda_2\bar H_{32}R_{22} & \Lambda_2\bar H_{33}\Gamma_2 & \Lambda_2\bar H_{34} \\ & \bar H_{42}R_{22} & \bar H_{43}\Gamma_2 & \bar H_{44} \end{bmatrix}\Bigg\|_*$$
$$=: \langle \bar H, \bar S'\rangle + \langle \bar H_{22}, L_{12}^H\bar S_{11}R_{12}\rangle + \|L' P_{\bar T^\perp}(\bar H) R'\|_*, \qquad (64)$$

where the third equality holds because the spectral norm is the dual of the nuclear norm. The matrices $\bar S'$, $L'$ and $R'$ are defined as

$$\bar S' := \begin{bmatrix} L_{11}\bar S_{11}R_{11} & L_{11}\bar S_{11}R_{12} & \\ L_{12}^H\bar S_{11}R_{11} & 0_r & \\ & & 0_{n-r'-r} \end{bmatrix}, \qquad L' := \begin{bmatrix} 0_r & & & \\ & L_{22} & & \\ & & \Lambda_2 & \\ & & & I_{n-r'-r} \end{bmatrix}, \qquad R' := \begin{bmatrix} 0_r & & & \\ & R_{22} & & \\ & & \Gamma_2 & \\ & & & I_{n-r'-r} \end{bmatrix}.$$

It is worth mentioning that $\bar S' \in \bar T$. Here are some properties of the matrix $\bar S'$:

$$\|\bar S'\|_F = \Bigg\|\begin{bmatrix} L_{11}\bar S_{11}R_{11} & L_{11}\bar S_{11}R_{12} & \\ L_{12}^H\bar S_{11}R_{11} & 0_r & \\ & & 0_{n-2r} \end{bmatrix}\Bigg\|_F \le \Bigg\|\begin{bmatrix} L_{11}\bar S_{11}R_{11} & L_{11}\bar S_{11}R_{12} & \\ L_{12}^H\bar S_{11}R_{11} & L_{12}^H\bar S_{11}R_{12} & \\ & & 0_{n-2r} \end{bmatrix}\Bigg\|_F$$
$$\le \|[L_{11},\, L_{12}]\|\,\|\bar S_{11}\|_F\,\|[R_{11},\, R_{12}]\| = \sqrt r\,\|[L_{11},\, L_{12}]\|\,\|[R_{11},\, R_{12}]\|$$
$$= \sqrt r\,\sqrt{\max_i\frac{\lambda_1(i)^4\cos^2\theta_u(i)+\sin^2\theta_u(i)}{\lambda_1(i)^2\cos^2\theta_u(i)+\sin^2\theta_u(i)}}\cdot\sqrt{\max_i\frac{\gamma_1(i)^4\cos^2\theta_v(i)+\sin^2\theta_v(i)}{\gamma_1(i)^2\cos^2\theta_v(i)+\sin^2\theta_v(i)}} =: \sqrt r\,\alpha_4(\theta_u, \theta_v, \lambda_1, \gamma_1). \qquad (65)$$

The second and last inequalities in (65) hold because $\|AB\|_F \le \|A\|\,\|B\|_F$. Considering (65), (64) can be written as

$$0 \ge \langle \bar H, \bar S'\rangle + \langle \bar H_{22}, L_{12}^H\bar S_{11}R_{12}\rangle + \|L' P_{\bar T^\perp}(\bar H) R'\|_*$$
$$\ge \langle \bar H, \bar S'\rangle - \|P_{\bar T^\perp}(\bar H)\|_*\,\|L_{12}\|\,\|\bar S_{11}\|\,\|R_{12}\| + \|P_{\bar T^\perp}(\bar H)\|_* - \|P_{\bar T^\perp}(\bar H) - L' P_{\bar T^\perp}(\bar H) R'\|_*$$
$$= \langle \bar H, \bar S'\rangle + \big(1 - \|L_{12}\|\,\|R_{12}\|\big)\|P_{\bar T^\perp}(\bar H)\|_* - \|P_{\bar T^\perp}(I_n) P_{\bar T^\perp}(\bar H) P_{\bar T^\perp}(I_n) - L' P_{\bar T^\perp}(\bar H) R'\|_*$$
$$\ge \langle \bar H, \bar S'\rangle + \big(1 - \|L_{12}\|\,\|R_{12}\|\big)\|P_{\bar T^\perp}(\bar H)\|_* - \|P_{\bar T^\perp}(I_n) - L'\|\,\|P_{\bar T^\perp}(\bar H)\|_* - \|L'\|\,\|P_{\bar T^\perp}(\bar H)\|_*\,\|P_{\bar T^\perp}(I_n) - R'\|$$
$$\ge \langle \bar H, \bar S'\rangle + \big(1 - \|L_{12}\|\,\|R_{12}\|\big)\|P_{\bar T^\perp}(\bar H)\|_* - \Bigg\|\begin{bmatrix} I_r - L_{22} & \\ & I_{r'-r} - \Lambda_2 \end{bmatrix}\Bigg\|\,\|P_{\bar T^\perp}(\bar H)\|_* - \|P_{\bar T^\perp}(\bar H)\|_*\,\Bigg\|\begin{bmatrix} I_r - R_{22} & \\ & I_{r'-r} - \Gamma_2 \end{bmatrix}\Bigg\|$$
$$= \langle \bar H, \bar S'\rangle + \Bigg(1 - \sqrt{\max_i\frac{(1-\lambda_1(i)^2)^2\cos^2\theta_u(i)\sin^2\theta_u(i)}{\lambda_1(i)^2\cos^2\theta_u(i)+\sin^2\theta_u(i)}}\cdot\sqrt{\max_i\frac{(1-\gamma_1(i)^2)^2\cos^2\theta_v(i)\sin^2\theta_v(i)}{\gamma_1(i)^2\cos^2\theta_v(i)+\sin^2\theta_v(i)}}$$
$$\quad - \max\Big\{\max_i\big(1-\lambda_2(i)\big),\ \max_i\Big(1-\frac{\lambda_1(i)}{\sqrt{\lambda_1(i)^2\cos^2\theta_u(i)+\sin^2\theta_u(i)}}\Big)\Big\} - \max\Big\{\max_i\big(1-\gamma_2(i)\big),\ \max_i\Big(1-\frac{\gamma_1(i)}{\sqrt{\gamma_1(i)^2\cos^2\theta_v(i)+\sin^2\theta_v(i)}}\Big)\Big\}\Bigg)\|P_{\bar T^\perp}(\bar H)\|_*$$
$$=: \langle \bar H, \bar S'\rangle + \big(1 - \alpha_6(\theta_u, \theta_v, \lambda_1, \lambda_2, \gamma_1, \gamma_2)\big)\|P_{\bar T^\perp}(\bar H)\|_*, \qquad (66)$$

in which the inequalities follow from Hölder's inequality, the fact that $\bar H_{22}$ is a sub-block of $P_{\bar T^\perp}(\bar H)$, the reverse triangle inequality, $\|\bar S_{11}\| = 1$, $\|L'\| \le 1$, and $ab + a + b \le \frac32(a+b)$ for $a, b \in [0,1]$. The following lemma defines the dual certificate.

Lemma 8. Let $\bar T$ be the support of $\bar X_r$ as defined in (34), let $\min_{i,j} p_{ij} \ge l$, and consider $\bar{\mathcal{R}}_\Omega$ and $\bar{\mathcal{P}}_p$ as defined in (48) and (50), respectively. As long as, for all $i, j \in [1:n]$,

$$\max[\log(\alpha_4 n), 1]\cdot\frac{(\mu_i+\nu_j)\, r\log n}{n}\cdot\max\Big[\alpha_5\Big(1+\max_i\frac{\bar\mu_i}{\mu_i}+\max_j\frac{\bar\nu_j}{\nu_j}\Big), 1\Big] \lesssim p_{ij} \le 1, \qquad (67)$$

it holds that

$$\|(P_{\bar T} - P_{\bar T} \circ \bar{\mathcal{R}}_\Omega \circ P_{\bar T})(\cdot)\|_{F\to F} \le \frac12, \qquad (68)$$

and there exists a matrix $\Pi \in \mathbb{R}^{n\times n}$ that satisfies (69) to (71):

$$\|\bar S' - P_{\bar T}(\Pi)\|_F \le \frac{l}{4\sqrt2}, \qquad (69)$$
$$\|P_{\bar T^\perp}(\Pi)\| \le \frac12, \qquad (70)$$
$$\Pi = \bar{\mathcal{P}}_p(\Pi), \qquad (71)$$

where $\alpha_4$ and $\alpha_5$ are given by

$$\alpha_4 = \alpha_4(\theta_u, \theta_v, \lambda_1, \gamma_1) := \sqrt{\max_i\frac{\lambda_1(i)^4\cos^2\theta_u(i)+\sin^2\theta_u(i)}{\lambda_1(i)^2\cos^2\theta_u(i)+\sin^2\theta_u(i)}}\cdot\sqrt{\max_i\frac{\gamma_1(i)^4\cos^2\theta_v(i)+\sin^2\theta_v(i)}{\gamma_1(i)^2\cos^2\theta_v(i)+\sin^2\theta_v(i)}},$$

$$\alpha_5 = \alpha_5(\theta_u, \theta_v, \lambda_1, \gamma_1) := \sqrt{\max_i\big(\lambda_1(i)^2\cos^2\theta_u(i)+\sin^2\theta_u(i)\big)}\cdot\sqrt{\max_i\frac{\gamma_1(i)^4\cos^2\theta_v(i)+\sin^2\theta_v(i)}{\gamma_1(i)^2\cos^2\theta_v(i)+\sin^2\theta_v(i)}}\cdot\Big(1+\sqrt{\tfrac{r'+r}{r}\max_i\tfrac{\bar\mu_i}{\mu_i}}\Big)$$
$$\quad + \sqrt{\max_i\big(\gamma_1(i)^2\cos^2\theta_v(i)+\sin^2\theta_v(i)\big)}\cdot\sqrt{\max_i\frac{\lambda_1(i)^4\cos^2\theta_u(i)+\sin^2\theta_u(i)}{\lambda_1(i)^2\cos^2\theta_u(i)+\sin^2\theta_u(i)}}\cdot\Big(1+\sqrt{\tfrac{r'+r}{r}\max_j\tfrac{\bar\nu_j}{\nu_j}}\Big). \qquad (72)$$

Now, according to Lemma 8, there exists a $\Pi$ satisfying (69) to (71). Thus, (66) can be rewritten as

$$0 \ge \langle \bar H, \bar S'\rangle + (1-\alpha_6)\|P_{\bar T^\perp}(\bar H)\|_*$$
$$= \langle \bar H, P_{\bar T}(\Pi)\rangle + \langle \bar H, \bar S' - P_{\bar T}(\Pi)\rangle + (1-\alpha_6)\|P_{\bar T^\perp}(\bar H)\|_*$$
$$= \langle \bar H, \Pi\rangle - \langle \bar H, P_{\bar T^\perp}(\Pi)\rangle + \langle \bar H, \bar S' - P_{\bar T}(\Pi)\rangle + (1-\alpha_6)\|P_{\bar T^\perp}(\bar H)\|_*$$
$$= -\langle \bar H, P_{\bar T^\perp}(\Pi)\rangle + \langle \bar H, \bar S' - P_{\bar T}(\Pi)\rangle + (1-\alpha_6)\|P_{\bar T^\perp}(\bar H)\|_*$$
$$\ge -\|P_{\bar T^\perp}(\bar H)\|_*\,\|P_{\bar T^\perp}(\Pi)\| - \|P_{\bar T}(\bar H)\|_F\,\|\bar S' - P_{\bar T}(\Pi)\|_F + (1-\alpha_6)\|P_{\bar T^\perp}(\bar H)\|_*$$
$$\ge -\frac12\|P_{\bar T^\perp}(\bar H)\|_* - \frac{l}{4\sqrt2}\|P_{\bar T}(\bar H)\|_F + (1-\alpha_6)\|P_{\bar T^\perp}(\bar H)\|_*$$
$$= \Big(\frac12 - \alpha_6\Big)\|P_{\bar T^\perp}(\bar H)\|_* - \frac{l}{4\sqrt2}\|P_{\bar T}(\bar H)\|_F, \qquad (73)$$

which is equivalent to

$$\Big(\frac12 - \alpha_6\Big)\|P_{\bar T^\perp}(\bar H)\|_* \le \frac{l}{4\sqrt2}\|P_{\bar T}(\bar H)\|_F. \qquad (74)$$

The first inequality in (73) comes from Hölder's inequality; the second holds due to (69) and (70); and $\langle \bar H, \Pi\rangle = 0$ follows from

$$\|\bar{\mathcal{R}}_\Omega(\bar H)\|_F = \|\mathcal{R}_\Omega(H)\|_F = \|\mathcal{R}_\Omega(\hat X - X)\|_F = 0, \qquad (75)$$
$$\langle \bar H, \Pi\rangle = \langle \bar H, \bar{\mathcal{P}}_p(\Pi)\rangle = \langle \bar{\mathcal{P}}_p(\bar H), \Pi\rangle = 0. \qquad (76)$$

To upper-bound the right-hand side of (74), first note that, by (75) and (54),

$$\|\bar{\mathcal{R}}_\Omega(P_{\bar T}(\bar H))\|_F = \|\bar{\mathcal{R}}_\Omega(P_{\bar T^\perp}(\bar H))\|_F \le \|\bar{\mathcal{R}}_\Omega(\cdot)\|_{F\to F}\,\|P_{\bar T^\perp}(\bar H)\|_F \le \frac1l\|P_{\bar T^\perp}(\bar H)\|_F. \qquad (77)$$

On the other hand, using (53) and Lemma 8, one can say

$$\|\bar{\mathcal{R}}_\Omega(P_{\bar T}(\bar H))\|_F^2 = \langle P_{\bar T}(\bar H), (\bar{\mathcal{R}}_\Omega \circ \bar{\mathcal{R}}_\Omega)(P_{\bar T}(\bar H))\rangle \ge \langle P_{\bar T}(\bar H), \bar{\mathcal{R}}_\Omega(P_{\bar T}(\bar H))\rangle$$
$$= \langle P_{\bar T}(\bar H), P_{\bar T}(\bar H)\rangle + \langle P_{\bar T}(\bar H), (P_{\bar T} \circ \bar{\mathcal{R}}_\Omega \circ P_{\bar T} - P_{\bar T}) \circ P_{\bar T}(\bar H)\rangle$$
$$\ge \|P_{\bar T}(\bar H)\|_F^2 - \|P_{\bar T} \circ \bar{\mathcal{R}}_\Omega \circ P_{\bar T} - P_{\bar T}\|_{F\to F}\,\|P_{\bar T}(\bar H)\|_F^2 \ge \frac12\|P_{\bar T}(\bar H)\|_F^2. \qquad (78)$$

Combining (77) and (78) leads to

$$\|P_{\bar T}(\bar H)\|_F \le \frac{\sqrt2}{l}\|P_{\bar T^\perp}(\bar H)\|_F. \qquad (79)$$

Finally, substituting (79) into (74) gives

$$\Big(\frac12 - \alpha_6\Big)\|P_{\bar T^\perp}(\bar H)\|_* \le \frac14\|P_{\bar T^\perp}(\bar H)\|_F. \qquad (80)$$

Since $\|P_{\bar T^\perp}(\bar H)\|_F \le \|P_{\bar T^\perp}(\bar H)\|_*$, it can be concluded that, as long as $\alpha_6 \le \frac14$, $P_{\bar T^\perp}(\bar H) = 0$. Accordingly, the error $\bar H$ is bounded as

$$\|\bar H\|_F \le \|P_{\bar T^\perp}(\bar H)\|_F + \|P_{\bar T}(\bar H)\|_F \le \Big(\frac{\sqrt2}{l}+1\Big)\|P_{\bar T^\perp}(\bar H)\|_F = 0, \qquad (81)$$

and hence $\hat X = X$. Extending these results to approximately low-rank matrices and noisy observations, one obtains

$$\|\hat X - X\|_F \le \frac{\sqrt h}{l}\|Q_{\tilde U_{r'}} X_{r^+} Q_{\tilde V_{r'}}\|_* + \frac{e\sqrt n\, h^{3/2}}{l} \le \frac{\sqrt h}{l}\|X_{r^+}\|_* + \frac{e\sqrt n\, h^{3/2}}{l}, \qquad (82)$$

$$\|\hat X - X\|_F \le \frac{\sqrt h}{l}\|X_{r^+}\|_* + \frac{e\sqrt n\, h^{3/2}}{l}. \qquad (83)$$

APPENDIX C
PROOF OF LEMMA 1

As stated in Lemma 1, $U_r$ and $\tilde U_{r'}$ are orthonormal bases of the subspaces $\mathcal{U}_r$ and $\tilde{\mathcal{U}}_{r'}$, respectively. Without loss of generality, suppose that

$$U_r^H \tilde U_{r'} = [\cos\theta_u \;\; 0_{r\times(r'-r)}] = U_r^H[\tilde U_{1,r} \;\; \tilde U_{2,r'-r}] \in \mathbb{R}^{r\times r'}, \qquad (84)$$

where $\tilde U_{1,r} \in \mathbb{R}^{n\times r}$ and $\tilde U_{2,r'-r} \in \mathbb{R}^{n\times(r'-r)}$ are orthonormal bases for subspaces $\tilde{\mathcal{U}}_{1,r} \subseteq \tilde{\mathcal{U}}_{r'}$ and $\tilde{\mathcal{U}}_{2,r'-r} \subseteq \tilde{\mathcal{U}}_{r'}$. To construct the orthonormal basis $B_L$ in (18), we set

$$U'_{1,r} := -(I - U_r U_r^H)\,\tilde U_{1,r}\,\sin^{-1}(\theta_u) = -P_{\mathcal{U}_r^\perp}\tilde U_{1,r}\,\sin^{-1}(\theta_u) \in \mathbb{R}^{n\times r},$$
$$U'_{2,r'-r} := -(I - U_r U_r^H)\,\tilde U_{2,r'-r} = -P_{\mathcal{U}_r^\perp}\tilde U_{2,r'-r} \in \mathbb{R}^{n\times(r'-r)}, \qquad (85)$$

and take

$$\operatorname{span}(U''_{n-r'-r}) = \operatorname{span}([U_r \;\; U'_{r'}])^\perp.$$

Although we have stated the construction and proof for the column space, analogous results hold for the row space.

APPENDIX D
PROOF OF LEMMA 2

We use the following two facts to prove the equalities in Lemma 2:

1) the operator norm of a diagonal matrix equals its largest-magnitude diagonal element;
2) for $X \in \mathbb{R}^{n\times n}$, $\|X\| = \sqrt{\lambda_{\max}(X^H X)} = \sigma_{\max}(X)$, where $\lambda_{\max}(X^H X)$ is the largest eigenvalue of $X^H X$ and $\sigma_{\max}(X)$ is the largest singular value of $X$ [27, Lemma A.5].

Accordingly,

$$\|L_{11}\| = \|\Delta_L\| = \sqrt{\max_i\big(\lambda_1(i)^2\cos^2\theta_u(i)+\sin^2\theta_u(i)\big)},$$

$$\|L_{12}\| = \sqrt{\max_i\frac{(1-\lambda_1(i)^2)^2\cos^2\theta_u(i)\sin^2\theta_u(i)}{\lambda_1(i)^2\cos^2\theta_u(i)+\sin^2\theta_u(i)}},$$

$$\|I_r - L_{22}\| = \|I_r - \Lambda_1\Delta_L^{-1}\| = \sqrt{\max_i\frac{\big(\lambda_1(i)-\sqrt{\lambda_1(i)^2\cos^2\theta_u(i)+\sin^2\theta_u(i)}\big)^2}{\lambda_1(i)^2\cos^2\theta_u(i)+\sin^2\theta_u(i)}},$$

$$\Bigg\|\begin{bmatrix} I_r - L_{22} & \\ & I_{r'-r} - \Lambda_2 \end{bmatrix}\Bigg\| = \max\Big\{\max_i\big(1-\lambda_2(i)\big),\ \max_i\Big(1-\frac{\lambda_1(i)}{\sqrt{\lambda_1(i)^2\cos^2\theta_u(i)+\sin^2\theta_u(i)}}\Big)\Big\},$$

$$\|[L_{11}\;\; L_{12}]\|^2 = \max_i\Bigg\|\begin{bmatrix} \sqrt{\lambda_1(i)^2\cos^2\theta_u(i)+\sin^2\theta_u(i)} \\ \dfrac{(1-\lambda_1(i)^2)\cos\theta_u(i)\sin\theta_u(i)}{\sqrt{\lambda_1(i)^2\cos^2\theta_u(i)+\sin^2\theta_u(i)}} \end{bmatrix}\Bigg\|_2^2 = \max_i\frac{\lambda_1(i)^4\cos^2\theta_u(i)+\sin^2\theta_u(i)}{\lambda_1(i)^2\cos^2\theta_u(i)+\sin^2\theta_u(i)},$$

$$\Bigg\|\begin{bmatrix} 0_r & L_{12} & & \\ & L_{22}-I_r & & \\ & & \Lambda_2-I_{r'-r} & \\ & & & 0_{n-r'-r} \end{bmatrix}\Bigg\|^2 \le \max\Bigg\{\Bigg\|\begin{bmatrix} L_{12} \\ L_{22}-I_r \end{bmatrix}\Bigg\|_2^2,\ \|\Lambda_2-I_{r'-r}\|_2^2\Bigg\}$$
$$= \max\Bigg\{\max_i\Bigg[\Big(1-\frac{\lambda_1(i)}{\sqrt{\lambda_1(i)^2\cos^2\theta_u(i)+\sin^2\theta_u(i)}}\Big)^2 + \frac{(1-\lambda_1(i)^2)^2\cos^2\theta_u(i)\sin^2\theta_u(i)}{\lambda_1(i)^2\cos^2\theta_u(i)+\sin^2\theta_u(i)}\Bigg],\ \max_i\big(\lambda_2(i)-1\big)^2\Bigg\}. \qquad (86)$$

REFERENCES

[1] E. J. Candès and B. Recht, "Exact matrix completion via convex optimization," Foundations of Computational Mathematics, vol. 9, no. 6, p. 717, 2009.

[2] B. Recht, "A simpler approach to matrix completion," Journal of Machine Learning Research, vol. 12, no. 12, 2011.

[3] E. J. Candès and Y. Plan, "Matrix completion with noise," Proceedings of the IEEE, vol. 98, no. 6, pp. 925–936, 2010.

[4] B. Recht, M. Fazel, and P. A. Parrilo, "Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization," SIAM Review, vol. 52, no. 3, pp. 471–501, 2010.

[5] D. Gross, Y.-K. Liu, S. T. Flammia, S. Becker, and J. Eisert, "Quantum state tomography via compressed sensing," Physical Review Letters, vol. 105, no. 15, p. 150401, 2010.

[6] J. P. Haldar and Z.-P. Liang, "Spatiotemporal imaging with partially separable functions: A matrix recovery approach," in 2010 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, pp. 716–719, IEEE, 2010.

[7] B. Zhao, J. P. Haldar, C. Brinegar, and Z.-P. Liang, "Low rank matrix recovery for real-time cardiac MRI," in 2010 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, pp. 996–999, IEEE, 2010.

[8] N. Srebro and R. R. Salakhutdinov, "Collaborative filtering in a non-uniform world: Learning with the weighted trace norm," in Advances in Neural Information Processing Systems, pp. 2056–2064, 2010.

[9] A. Aravkin, R. Kumar, H. Mansour, B. Recht, and F. J. Herrmann, "Fast methods for denoising matrix completion formulations, with applications to robust seismic data interpolation," SIAM Journal on Scientific Computing, vol. 36, no. 5, pp. S237–S266, 2014.

[10] J. Bennett, S. Lanning, et al., "The Netflix prize," in Proceedings of KDD Cup and Workshop, vol. 2007, p. 35, New York, 2007.

[11] A. M.-C. So and Y. Ye, "Theory of semidefinite programming for sensor network localization," Mathematical Programming, vol. 109, no. 2-3, pp. 367–384, 2007.

[12] S. Daei, A. Amini, and F. Haddadi, "Optimal weighted low-rank matrix recovery with subspace prior information," arXiv preprint arXiv:1809.10356, 2018.

[13] A. Eftekhari, D. Yang, and M. B. Wakin, "Weighted matrix completion and recovery with prior subspace information," IEEE Transactions on Information Theory, vol. 64, no. 6, pp. 4044–4071, 2018.

[14] H. S. F. Ardakani, S. Daei, and F. Haddadi, "Multi-weight nuclear norm minimization for low-rank matrix recovery in presence of subspace prior information," arXiv preprint arXiv:2005.10878, 2020.

[15] N. Rao, H.-F. Yu, P. Ravikumar, and I. S. Dhillon, "Collaborative filtering with graph information: Consistency and scalable methods," in NIPS, vol. 2, p. 7, Citeseer, 2015.

[16] R. Angst, C. Zach, and M. Pollefeys, "The generalized trace-norm and its application to structure-from-motion problems," in 2011 International Conference on Computer Vision, pp. 2502–2509, IEEE, 2011.

[17] P. Jain and I. S. Dhillon, "Provable inductive matrix completion," arXiv preprint arXiv:1306.0626, 2013.

[18] M. Xu, R. Jin, and Z.-H. Zhou, "Speedup matrix completion with side information: Application to multi-label learning," in Advances in Neural Information Processing Systems, pp. 2301–2309, 2013.

[19] K. Mohan and M. Fazel, "Reweighted nuclear norm minimization with application to system identification," in Proceedings of the 2010 American Control Conference, pp. 2953–2959, IEEE, 2010.

[20] T. Zhou, H. Shan, A. Banerjee, and G. Sapiro, "Kernelized probabilistic matrix factorization: Exploiting graphs and side information," in Proceedings of the 2012 SIAM International Conference on Data Mining, pp. 403–414, SIAM, 2012.

[21] H. Fathi, E. Rangriz, and V. Pourahmadi, "Two novel algorithms for low-rank matrix completion problem," IEEE Signal Processing Letters, vol. 28, pp. 892–896, 2021.

[22] H. Ardakani, S. Fazael, S. Daei, and F. Haddadi, "A greedy algorithm for matrix recovery with subspace prior information," arXiv preprint arXiv:1907.11868, 2019.

[23] D. L. Donoho, "Compressed sensing," IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289–1306, 2006.

[24] E. J. Candès, "The restricted isometry property and its implications for compressed sensing," Comptes Rendus Mathematique, vol. 346, no. 9-10, pp. 589–592, 2008.

[25] D. Needell, R. Saab, and T. Woolf, "Weighted $\ell_1$-minimization for sparse recovery under arbitrary prior information," Information and Inference: A Journal of the IMA, vol. 6, no. 3, pp. 284–309, 2017.

[26] Y. Chen, S. Bhojanapalli, S. Sanghavi, and R. Ward, "Completing any low-rank matrix, provably," The Journal of Machine Learning Research, vol. 16, no. 1, pp. 2999–3034, 2015.

[27] S. Foucart and H. Rauhut, "A mathematical introduction to compressive sensing," Bull. Am. Math, vol. 54, pp. 151–165, 2017.