Page 1: Singular Values of Tensors

Singular Values of Tensors

Harm Derksen

University of Michigan

CUNY/NYU

February 15, 2019

Harm Derksen Singular Values of Tensors

Page 2: Singular Values of Tensors

Tensor Rank

$F$ a field, $F = \mathbb{R}$ or $F = \mathbb{C}$
$V^{(i)} \cong F^{n_i}$ for $i = 1, 2, \dots, d$
$V = V^{(1)} \otimes V^{(2)} \otimes \cdots \otimes V^{(d)} \cong F^{n_1 \times \cdots \times n_d}$ (tensor product space)

Definition

A simple tensor is a tensor of the form $v^{(1)} \otimes v^{(2)} \otimes \cdots \otimes v^{(d)}$ with $v^{(i)} \in V^{(i)}$.

Definition (tensor rank)

The rank of $T$ is the smallest positive integer $r$ such that $T$ is the sum of $r$ simple tensors.
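The two definitions above can be illustrated with a small sketch. The helper `simple_tensor` and the example tensors are hypothetical names chosen for this illustration, and NumPy arrays stand in for elements of $F^{n_1 \times \cdots \times n_d}$ with $F = \mathbb{R}$.

```python
import numpy as np

def simple_tensor(*vectors):
    """Return v1 ⊗ v2 ⊗ ... ⊗ vd as a d-way array of outer products."""
    result = np.asarray(vectors[0], dtype=float)
    for v in vectors[1:]:
        result = np.multiply.outer(result, np.asarray(v, dtype=float))
    return result

e1, e2 = np.eye(2)

# A single simple tensor: every flattening of it is a rank-1 matrix.
S = simple_tensor(e1 + e2, e1, e2)
simple_flattening_rank = np.linalg.matrix_rank(S.reshape(2, 4))

# A sum of two simple tensors, so rank(T) <= 2.
T = simple_tensor(e1, e1, e1) + simple_tensor(e2, e2, e2)
# Its flattening has matrix rank 2, which rules out rank(T) = 1.
flattening_rank = np.linalg.matrix_rank(T.reshape(2, 4))
```

Flattening only gives a lower bound on tensor rank; for $d > 2$ the rank can be strictly larger than the rank of any flattening.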

Page 4: Singular Values of Tensors

Matrix Multiplication Tensor

$M_d = \sum_{i=1}^{d} \sum_{j=1}^{d} \sum_{k=1}^{d} e_{i,j} \otimes e_{j,k} \otimes e_{k,i} \in F^{d \times d} \otimes F^{d \times d} \otimes F^{d \times d}$

Clearly $\operatorname{rank}(M_d) \le d^3$.

Theorem (Strassen)

If $\operatorname{rank}(M_d) = s$ then the complexity of $n \times n$ matrix multiplication is $O(n^{\log_d(s)})$ (the standard algorithm is $O(n^3)$).

Theorem (Strassen)

$\operatorname{rank}(M_2) \le 7$, so the complexity of $n \times n$ matrix multiplication is $O(n^{\log_2 7}) = O(n^{2.81})$.

Current record (as of this talk): $O(n^{2.373})$ (Le Gall)
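The bound $\operatorname{rank}(M_2) \le 7$ corresponds to multiplying two $2 \times 2$ matrices with seven scalar products. The sketch below uses the standard Strassen products; the function name `strassen_2x2` is chosen here for illustration.

```python
import numpy as np

def strassen_2x2(A, B):
    """Multiply two 2x2 matrices using 7 scalar multiplications."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    # Each entry of the product is a signed sum of the seven products.
    return np.array([[m1 + m4 - m5 + m7, m3 + m5],
                     [m2 + m4, m1 - m2 + m3 + m6]])

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])
product = strassen_2x2(A, B)

# Exponent obtained by applying the 2x2 scheme recursively to blocks.
exponent = np.log2(7)
```

Applying the same scheme recursively to $2 \times 2$ block matrices is what yields the exponent $\log_2 7$.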

Page 7: Singular Values of Tensors

The Canonical Polyadic (CP) Model

aka PARAFAC, CANDECOMP

Problem

Given a tensor $T \in V = V^{(1)} \otimes \cdots \otimes V^{(d)}$, write $T = v_1 + v_2 + \cdots + v_r$ where $v_1, v_2, \dots, v_r$ are simple tensors and $r$ is minimal.

- the CP-decomposition is sometimes unique, not always
- not numerically stable: there is no upper bound for $\max\{\|v_1\|, \dots, \|v_r\|\}$ as a function of $\|T\|$
- difficult to compute: for many tensors of interest the rank is unknown

Many numerical applications in psychometrics, signal processing, numerical linear algebra, computer vision, numerical analysis, data mining, graph analysis, neuroscience, chemometrics, etc.

Page 9: Singular Values of Tensors

The Canonical Polyadic (CP) Model

To make it more numerically stable, we could allow an error term.

Problem

Given a tensor $T \in V$ and fixed $r$, write $T = v_1 + v_2 + \cdots + v_r + E$ where $v_1, v_2, \dots, v_r$ are simple tensors and $\|E\|$ is minimal.

This optimization problem might be ill-posed. Define

$T = e_2 \otimes e_1 \otimes e_1 + e_1 \otimes e_2 \otimes e_1 + e_1 \otimes e_1 \otimes e_2,$

$v_1(t) = t^{-1} (e_1 + t e_2) \otimes (e_1 + t e_2) \otimes (e_1 + t e_2), \quad v_2(t) = -t^{-1} e_1 \otimes e_1 \otimes e_1.$

Then $T = v_1(t) + v_2(t) + E(t)$ with $\|E(t)\| = t \sqrt{3 + t^2}$.

So we can write $T = v_1 + v_2 + E$ with $\|E\| = \varepsilon$ for every $\varepsilon > 0$, but we cannot write $T = v_1 + v_2 + E$ with $\|E\| = 0$.
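The error norm in this example can be checked numerically. The sketch below is only an illustration: `simple_tensor`, `error_norm` and `term_norm` are hypothetical helper names, and the tensors live in $\mathbb{R}^2 \otimes \mathbb{R}^2 \otimes \mathbb{R}^2$.

```python
import numpy as np

def simple_tensor(u, v, w):
    """Return u ⊗ v ⊗ w as a 3-way array."""
    return np.einsum("i,j,k->ijk", u, v, w)

e1, e2 = np.eye(2)
T = (simple_tensor(e2, e1, e1) + simple_tensor(e1, e2, e1)
     + simple_tensor(e1, e1, e2))

def error_norm(t):
    """Norm of E(t) = T - v1(t) - v2(t) for the two simple tensors above."""
    u = e1 + t * e2
    v1 = simple_tensor(u, u, u) / t
    v2 = -simple_tensor(e1, e1, e1) / t
    return np.linalg.norm(T - v1 - v2)

def term_norm(t):
    """Norm of v1(t), which blows up as t tends to 0."""
    u = e1 + t * e2
    return np.linalg.norm(simple_tensor(u, u, u) / t)

# Compare the computed error with the closed form t * sqrt(3 + t^2).
checks = [(t, error_norm(t), t * np.sqrt(3 + t**2)) for t in (1.0, 0.1, 0.01)]
```

As $t \to 0$ the error shrinks while `term_norm(t)` grows without bound, which is the instability mentioned for the CP model.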

Page 11: Singular Values of Tensors

Motivation: Compressed Sensing and Convex Relaxation

For $x \in \mathbb{R}^n$, its sparsity is measured by $\|x\|_0 = \#\{ i \mid x_i \neq 0 \}$.

Problem

Given $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, find a solution $x \in \mathbb{R}^n$ for $Ax = b$ with $\|x\|_0$ minimal (a sparsest solution).

But $\|\cdot\|_0$ is not convex and this optimization problem is difficult, so instead we consider:

Problem (Basis Pursuit)

Given $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, find a solution $x \in \mathbb{R}^n$ for $Ax = b$ with $\|x\|_1$ minimal.

Basis Pursuit can be solved by linear programming and is generally fast. Under reasonable assumptions, Basis Pursuit also gives the sparsest solutions (Candes-Tao, Donoho).
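One common way to cast Basis Pursuit as a linear program is to split $x = p - q$ with $p, q \ge 0$. The sketch below assumes SciPy is available and uses `scipy.optimize.linprog`; the function name `basis_pursuit` and the small test matrix are made up for this illustration.

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(A, b):
    """Minimize ||x||_1 subject to Ax = b via the split x = p - q, p, q >= 0."""
    m, n = A.shape
    cost = np.ones(2 * n)          # sum(p) + sum(q) equals ||x||_1 at an optimum
    A_eq = np.hstack([A, -A])      # encodes A p - A q = b
    result = linprog(cost, A_eq=A_eq, b_eq=b, bounds=(0, None))
    if not result.success:
        raise ValueError(result.message)
    return result.x[:n] - result.x[n:]

# Hypothetical underdetermined system with a 1-sparse solution.
A = np.array([[1.0, 2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0, 1.0]])
x_sparse = np.array([0.0, 1.0, 0.0, 0.0])
b = A @ x_sparse
x = basis_pursuit(A, b)
l1_norm = np.abs(x).sum()
```

For this particular system the minimal $\ell_1$ norm equals 1, the norm of the sparse vector used to generate $b$.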

Page 13: Singular Values of Tensors

The Nuclear and Spectral Norms

Low-rank tensors are sparse in some sense, and by convex relaxation:

Definition

The nuclear norm $\|T\|_\star$ is the smallest value of $\sum_{i=1}^{r} \|v_i\|$ where $T = \sum_{i=1}^{r} v_i$ and $v_1, \dots, v_r$ are simple tensors. (well-defined)

$V^{(1)}, \dots, V^{(d)}, V$ are spaces with a positive definite bilinear/Hermitian form $\langle \cdot, \cdot \rangle$.

Definition

The spectral norm is defined by

$\|T\|_\sigma = \max\{ |\langle T, v \rangle| : v \text{ a simple tensor with } \|v\| = 1 \}.$

The spectral norm is dual to the nuclear norm; in particular,

$|\langle T, S \rangle| \le \|T\|_\star \|S\|_\sigma$ for all tensors $S, T$.

Page 16: Singular Values of Tensors

The Nuclear and Spectral Norms

If $A \in \mathbb{R}^{n \times m} = \mathbb{R}^n \otimes \mathbb{R}^m$ is an $n \times m$ matrix, then the tensor rank of $A$ coincides with the matrix rank of $A$.

If $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_r$ are the singular values of $A$, then $\|A\|_\star = \lambda_1 + \cdots + \lambda_r$, $\|A\|_\sigma = \lambda_1$ (spectral/operator norm) and $\|A\| = \|A\|_2 = \|A\|_F = \sqrt{\lambda_1^2 + \cdots + \lambda_r^2}$ (Euclidean/Frobenius norm).
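For matrices all three norms can be read off the singular values. A minimal sketch with NumPy, using an arbitrary example matrix:

```python
import numpy as np

A = np.array([[3.0, 0.0],
              [4.0, 5.0]])

# Singular values in decreasing order.
singular_values = np.linalg.svd(A, compute_uv=False)

nuclear = singular_values.sum()                    # ||A||_* = sum of singular values
spectral = singular_values.max()                   # ||A||_sigma = largest singular value
frobenius = np.sqrt((singular_values ** 2).sum())  # ||A||_F

# Matrix rank = number of nonzero singular values = tensor rank of A.
rank = np.linalg.matrix_rank(A)
```

The same values are returned by `np.linalg.norm(A, 'nuc')`, `np.linalg.norm(A, 2)` and `np.linalg.norm(A, 'fro')`.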

Page 18: Singular Values of Tensors

Example: Determinant Tensor

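$D_n = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma)\, e_{\sigma(1)} \otimes e_{\sigma(2)} \otimes \cdots \otimes e_{\sigma(n)} \in \mathbb{C}^n \otimes \cdots \otimes \mathbb{C}^n$

Clearly $\|D_n\|_\star \le n!$ and $\operatorname{rank}(D_n) \le n!$.

$\|D_n\|_\sigma = \max\{ |\det(v_1 v_2 \cdots v_n)| : \|v_1\| = \cdots = \|v_n\| = 1 \} = 1$

$\|D_n\|_\star = \|D_n\|_\star \|D_n\|_\sigma \ge \langle D_n, D_n \rangle = n!$,

so $\|D_n\|_\star \ge n!$.

Theorem (D.)

$\|D_n\|_\star = n!$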
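The determinant tensor is small enough to build explicitly for $n = 3$. In this sketch, `permutation_sign` and `determinant_tensor` are hypothetical helper names and the matrix `M` is an arbitrary example; it checks that $\langle D_n, D_n \rangle = n!$ and that contracting $D_n$ with the columns of a matrix gives its determinant.

```python
import itertools
import math
import numpy as np

def permutation_sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    inversions = sum(perm[i] > perm[j]
                     for i in range(len(perm))
                     for j in range(i + 1, len(perm)))
    return -1 if inversions % 2 else 1

def determinant_tensor(n):
    """Return D_n as an n-way array with entries sgn(sigma) at index sigma."""
    D = np.zeros((n,) * n)
    for perm in itertools.permutations(range(n)):
        D[perm] = permutation_sign(perm)
    return D

n = 3
D = determinant_tensor(n)
inner = np.sum(D * D)   # <D_n, D_n>, one unit term per permutation

M = np.array([[2.0, 0.0, 1.0],
              [1.0, 3.0, 0.0],
              [0.0, 1.0, 1.0]])
# <D_3, v1 ⊗ v2 ⊗ v3> with v1, v2, v3 the columns of M.
contraction = np.einsum("ijk,i,j,k->", D, M[:, 0], M[:, 1], M[:, 2])
```

With unit columns, the bound $|\det| \le 1$ is Hadamard's inequality, which is why $\|D_n\|_\sigma = 1$.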

Page 20: Singular Values of Tensors

Example: Permanent Tensor

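$P_n = \sum_{\sigma \in S_n} e_{\sigma(1)} \otimes e_{\sigma(2)} \otimes \cdots \otimes e_{\sigma(n)} \in \mathbb{C}^n \otimes \cdots \otimes \mathbb{C}^n$

$\|P_n\|_\sigma = \max\{ |\operatorname{perm}(v_1 v_2 \cdots v_n)| : \|v_1\| = \cdots = \|v_n\| = 1 \}$

Theorem (Carlen, Lieb and Loss, 2006)

$\max\{ \operatorname{perm}(v_1 v_2 \cdots v_n) : \|v_1\| = \cdots = \|v_n\| = 1 \} = n!/n^{n/2}$

$\frac{n!}{n^{n/2}} \|P_n\|_\star = \|P_n\|_\star \|P_n\|_\sigma \ge \langle P_n, P_n \rangle = n!$,

so $\|P_n\|_\star \ge n^{n/2}$.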

Page 22: Singular Values of Tensors

Example: Permanent Tensor

Theorem (Glynn 2010)

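$P_n = \frac{1}{2^{n-1}} \sum_{\delta} \Big( \prod_{i=1}^{n} \delta_i \Big) \Big( \sum_{i=1}^{n} \delta_i e_i \Big) \otimes \cdots \otimes \Big( \sum_{i=1}^{n} \delta_i e_i \Big)$

where $\delta$ runs over $\{1\} \times \{-1, 1\}^{n-1}$.

In particular, $\binom{n}{\lfloor n/2 \rfloor} \le \operatorname{rank}(P_n) \le 2^{n-1}$ and $\|P_n\|_\star \le n^{n/2}$, so

Theorem (D.)

$\|P_n\|_\star = n^{n/2}$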
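Glynn's formula can be verified directly for small $n$ by building both sides as arrays. The helper names `permanent_tensor` and `glynn_sum` are chosen for this sketch only.

```python
import itertools
import numpy as np

def permanent_tensor(n):
    """Return P_n: entry 1 at every index that is a permutation, 0 elsewhere."""
    P = np.zeros((n,) * n)
    for perm in itertools.permutations(range(n)):
        P[perm] = 1.0
    return P

def glynn_sum(n):
    """Right-hand side of Glynn's formula as an n-way array."""
    total = np.zeros((n,) * n)
    for tail in itertools.product((1.0, -1.0), repeat=n - 1):
        delta = np.array((1.0,) + tail)        # delta_1 is fixed to 1
        term = delta                           # the vector sum_i delta_i e_i
        for _ in range(n - 1):
            term = np.multiply.outer(term, delta)
        total += np.prod(delta) * term
    return total / 2 ** (n - 1)

matches_for_3 = np.allclose(glynn_sum(3), permanent_tensor(3))
matches_for_4 = np.allclose(glynn_sum(4), permanent_tensor(4))
```

Each of the $2^{n-1}$ terms is a simple tensor of norm $n^{n/2}/2^{n-1}$, so summing the norms gives the upper bound $\|P_n\|_\star \le n^{n/2}$.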

Page 24: Singular Values of Tensors

The Convex Decomposition (CoDe) Model

Problem

Given a tensor $T \in V$, write $T = v_1 + v_2 + \cdots + v_r$ where $v_1, v_2, \dots, v_r$ are simple tensors and $\|v_1\| + \|v_2\| + \cdots + \|v_r\|$ is minimal.

Well-defined, but the decomposition may not be unique.

We can also allow an error term . . .

Page 26: Singular Values of Tensors

The Convex Decomposition (CoDe) Model

Problem

Given a tensor $T \in V$ and fixed $x \ge 0$, write $T = v_1 + v_2 + \cdots + v_r + E$ where $r$ is a nonnegative integer, $v_1, v_2, \dots, v_r$ are simple tensors, $\|v_1\| + \|v_2\| + \cdots + \|v_r\| \le x$ and $\|E\|$ is minimal.

The decomposition is not unique.

If $A = v_1 + v_2 + \cdots + v_r$ then $A$ is the unique tensor with $\|A\|_\star \le x$ for which $\|T - A\|$ is minimal.

Problem

Given a tensor $T \in V$ and fixed $y \ge 0$, write $T = v_1 + v_2 + \cdots + v_r + E$ where $r$ is a nonnegative integer, $v_1, v_2, \dots, v_r$ are simple tensors, $\|E\| \le y$ and $\|v_1\| + \|v_2\| + \cdots + \|v_r\|$ is minimal.

Page 28: Singular Values of Tensors

t-Orthogonality

Definition

Simple unit tensors $v_1, v_2, \dots, v_r$ are $t$-orthogonal if

$\sum_{i=1}^{r} |\langle v_i, w \rangle|^{2/t} \le 1$

for every simple tensor $w$ with $\|w\|_2 = 1$.

1-orthogonal $\Leftrightarrow$ orthogonal
If $t > s$ then $t$-orthogonal $\Rightarrow$ $s$-orthogonal.

Theorem (D.)

If $v_1, \dots, v_r \in V$ are $t$-orthogonal, then $r \le \dim(V)^{1/t}$.

Page 31: Singular Values of Tensors

Horizontal and Vertical Tensor Product

Theorem (“horizontal tensor product”, D.)

If $v_1, \dots, v_r$ are $t$-orthogonal, and $w_1, \dots, w_r$ are $s$-orthogonal, then $v_1 \otimes w_1, \dots, v_r \otimes w_r$ are $(s + t)$-orthogonal.

If $V = V^{(1)} \otimes \cdots \otimes V^{(d)}$ and $W = W^{(1)} \otimes \cdots \otimes W^{(d)}$, then

$V \boxtimes W := (V^{(1)} \otimes W^{(1)}) \otimes \cdots \otimes (V^{(d)} \otimes W^{(d)}).$

$(v_1 \otimes \cdots \otimes v_d) \boxtimes (w_1 \otimes \cdots \otimes w_d) = (v_1 \otimes w_1) \otimes (v_2 \otimes w_2) \otimes \cdots \otimes (v_d \otimes w_d)$

Theorem (“vertical tensor product”, D.)

If $v_1, v_2, \dots, v_r \in V$ and $w_1, \dots, w_s \in W$ are $t$-orthogonal, then $\{ v_i \boxtimes w_j \mid 1 \le i \le r,\ 1 \le j \le s \}$ are $t$-orthogonal.

Page 33: Singular Values of Tensors

The Diagonal Singular Value Decomposition

Definition

Suppose that $(\star)$: $T = \sum_{i=1}^{r} \lambda_i v_i$ such that $\lambda_1 \ge \cdots \ge \lambda_r > 0$ and $v_1, \dots, v_r$ are 2-orthogonal simple tensors of unit length. Then $(\star)$ is called a diagonal singular value decomposition (DSVD) of $T$.

If $d = 2$ (tensor product of 2 spaces) then the DSVD is the usual singular value decomposition. For $d > 2$, the DSVD is different from the Higher Order Singular Value Decomposition defined by De Lathauwer, De Moor, and Vandewalle. Not every tensor has a DSVD.

Page 34: Singular Values of Tensors

The Diagonal Singular Value Decomposition

Theorem (D.)

If $T$ has a DSVD then

$\|T\|_\star = \sum_i \lambda_i, \quad \|T\|_2 = \sqrt{\sum_i \lambda_i^2}, \quad \|T\|_\sigma = \lambda_1$

Theorem (D.)

If $\lambda_1 > \lambda_2 > \cdots > \lambda_r$ then the DSVD is unique.

Theorem (D.)

If $v_1, \dots, v_r$ are $t$-orthogonal with $t > 2$, then the DSVD is unique.

Page 36: Singular Values of Tensors

Example: Matrix Multiplication Tensor

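$e_1, \dots, e_n \in \mathbb{C}^n$ are orthogonal
$e_1 \otimes e_1, \dots, e_n \otimes e_n \in \mathbb{C}^n \otimes \mathbb{C}^n$ are 2-orthogonal

$e_1 \otimes e_1 \otimes 1, \dots, e_n \otimes e_n \otimes 1 \in \mathbb{C}^n \otimes \mathbb{C}^n \otimes \mathbb{C}$ are 2-orthogonal
$e_1 \otimes 1 \otimes e_1, \dots, e_n \otimes 1 \otimes e_n \in \mathbb{C}^n \otimes \mathbb{C} \otimes \mathbb{C}^n$ are 2-orthogonal
$1 \otimes e_1 \otimes e_1, \dots, 1 \otimes e_n \otimes e_n \in \mathbb{C} \otimes \mathbb{C}^n \otimes \mathbb{C}^n$ are 2-orthogonal

Using the vertical tensor product, we get that

$\{ (e_i \otimes e_j) \otimes (e_j \otimes e_k) \otimes (e_k \otimes e_i) \mid 1 \le i, j, k \le n \}$

are 2-orthogonal.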
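The first 2-orthogonality claim, for $e_1 \otimes e_1, \dots, e_n \otimes e_n$, can be probed numerically. The sketch below samples random simple unit tensors $w = a \otimes b$ over the reals; it is a sanity check rather than a proof, and `orthogonality_sum` is a hypothetical helper name.

```python
import numpy as np

def orthogonality_sum(tensors, w, t):
    """Return sum_i |<v_i, w>|^(2/t) for a list of tensors v_i and a tensor w."""
    return sum(abs(np.vdot(v, w)) ** (2.0 / t) for v in tensors)

n = 4
basis = np.eye(n)
diagonal = [np.multiply.outer(basis[i], basis[i]) for i in range(n)]  # e_i ⊗ e_i

rng = np.random.default_rng(0)
worst = 0.0
for _ in range(1000):
    a = rng.normal(size=n)
    b = rng.normal(size=n)
    a /= np.linalg.norm(a)
    b /= np.linalg.norm(b)
    w = np.multiply.outer(a, b)            # random simple unit tensor
    worst = max(worst, orthogonality_sum(diagonal, w, t=2))

# The uniform unit vector attains the bound for t = 2 and violates it for t = 3.
u = np.ones(n) / np.sqrt(n)
at_uniform_t2 = orthogonality_sum(diagonal, np.multiply.outer(u, u), t=2)
at_uniform_t3 = orthogonality_sum(diagonal, np.multiply.outer(u, u), t=3)
```

For $t = 2$ the sum is $\sum_i |a_i b_i| \le \|a\| \|b\| = 1$ by Cauchy-Schwarz. The violation for $t = 3$ is consistent with the bound $r \le \dim(V)^{1/t}$, since $n > (n^2)^{1/3}$.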

Page 39: Singular Values of Tensors

Example: Matrix Multiplication Tensor

Theorem (D.)

The matrix multiplication tensor

$T_n = \sum_{i,j,k=1}^{n} e_{i,j} \otimes e_{j,k} \otimes e_{k,i}$

is a DSVD.

The singular values of $T_n$ are

$\underbrace{1, 1, \dots, 1}_{n^3}$

In particular,

$\|T_n\|_\star = \sum_{i=1}^{n^3} 1 = n^3.$

Page 40: Singular Values of Tensors

Example: Group Algebra Multiplication Tensor

$G$ is a group with $n$ elements and $\mathbb{C}G \cong \mathbb{C}^n$ is the group algebra.

$T_G = \sum_{g, h \in G} g \otimes h \otimes h^{-1} g^{-1}.$

The DFT case corresponds to $G = \mathbb{Z}/n\mathbb{Z}$.

Theorem (D.)

$T_G$ has a DSVD and its singular values are

$\underbrace{\sqrt{n/d_1}, \dots, \sqrt{n/d_1}}_{d_1^3}, \ \dots, \ \underbrace{\sqrt{n/d_s}, \dots, \sqrt{n/d_s}}_{d_s^3}$

where $d_1, d_2, \dots, d_s$ are the dimensions of the irreducible representations of $G$.
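A quick consistency check of this theorem: $T_G$ is a sum of $n^2$ orthonormal terms, so $\|T_G\|^2 = n^2$, and this should equal the sum of the squared singular values. The sketch below reads the singular values as $\sqrt{n/d_i}$ with multiplicity $d_i^3$ and uses the standard identity $n = \sum_i d_i^2$; the function name is hypothetical.

```python
import math

def group_tensor_singular_values(irreducible_dims):
    """Return (singular value, multiplicity) pairs: sqrt(n/d_i) with multiplicity d_i^3."""
    n = sum(d * d for d in irreducible_dims)   # |G| = sum of d_i^2
    return [(math.sqrt(n / d), d ** 3) for d in irreducible_dims]

# Symmetric group S_3: irreducible dimensions 1, 1, 2 and n = 6.
s3 = group_tensor_singular_values([1, 1, 2])
s3_nuclear = sum(m * lam for lam, m in s3)
s3_squared_norm = sum(m * lam ** 2 for lam, m in s3)

# Cyclic group Z/5Z (the DFT case): five 1-dimensional representations.
z5 = group_tensor_singular_values([1] * 5)
z5_squared_norm = sum(m * lam ** 2 for lam, m in z5)
```

In both examples the squared norm comes out as $n^2$, as expected.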

Page 42: Singular Values of Tensors

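$P_n$ has no DSVD

Suppose that $P_n$ has a DSVD with singular values $\lambda_1, \dots, \lambda_r$.

$\|P_n\|_\star = n^{n/2} = \sum_{i=1}^{r} \lambda_i$

$\|P_n\|_\sigma = \frac{n!}{n^{n/2}} = \lambda_1$

$\|P_n\|^2 = n! = \sum_{i=1}^{r} \lambda_i^2$

$\lambda_1 \sum_{i=1}^{r} \lambda_i = n! = \sum_{i=1}^{r} \lambda_i^2$

so $\lambda_1 = \cdots = \lambda_r$, and $r = \frac{r \lambda_1}{\lambda_1} = \frac{\|P_n\|_\star}{\|P_n\|_\sigma} = \frac{n^n}{n!}$. If $n > 2$ then $r$ is not an integer!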
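The final step, that $n^n/n!$ is not an integer for $n > 2$, is easy to check for small $n$ with exact rational arithmetic. The function name `required_term_count` is made up for this sketch.

```python
from fractions import Fraction
import math

def required_term_count(n):
    """Return n^n / n! exactly, the number of terms a DSVD of P_n would need."""
    return Fraction(n ** n, math.factorial(n))

counts = {n: required_term_count(n) for n in range(2, 11)}
# A Fraction is an integer exactly when its reduced denominator is 1.
non_integer = [n for n, value in counts.items() if value.denominator != 1]
```

For $n = 2$ the count is the integer 2; for $n = 3$ it is $9/2$. In general, any prime factor of $n - 1$ divides $n!$ but not $n^n$, so the ratio cannot be an integer when $n > 2$.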

Page 43: Singular Values of Tensors

Noisy Signal Model

Suppose that $V$ is a finite dimensional $\mathbb{R}$-vector space of signals. If $c \in V$ is a noisy signal, then there is a decomposition

$c = a + b$

where $a \in V$ is a sparse original signal, and $b \in V$ is additive noise.

Using convex relaxation, sparsity is measured by some $\ell_1$-type norm $\|\cdot\|_X$. The noise is measured by some $\ell_2$-type norm $\|\cdot\|_Y$.

For example, $c$ is a tensor in $V = V^{(1)} \otimes \cdots \otimes V^{(d)}$, $\|\cdot\|_X = \|\cdot\|_\star$ is the nuclear norm, and $\|\cdot\|_Y = \|\cdot\|_2 = \|\cdot\|$ is the usual $\ell_2$-norm (see the CoDe Model).

Page 45: Singular Values of Tensors

The Pareto Frontier

$V$ finite dimensional $\mathbb{R}$-vector space
$\|\cdot\|_X, \|\cdot\|_Y$ norms on $V$
$c \in V$

Definition

A pair $(x, y) \in \mathbb{R}^2$ is Pareto-efficient if there exists a decomposition $c = a + b$ with $\|a\|_X = x$, $\|b\|_Y = y$ and for every decomposition $c = a' + b'$ we have $\|a'\|_X > x$, $\|b'\|_Y > y$ or $(\|a'\|_X, \|b'\|_Y) = (x, y)$. We call $c = a + b$ an $XY$-decomposition. The Pareto-frontier consists of all Pareto-efficient pairs.

Lemma

The Pareto-frontier is the graph of a strictly decreasing, convex homeomorphism $f^c_{YX} : [0, \|c\|_X] \to [0, \|c\|_Y]$.

Page 47: Singular Values of Tensors

The Pareto Sub-Frontier

$\langle \cdot, \cdot \rangle$ positive definite bilinear form
$\|v\|_2 := \|v\| = \sqrt{\langle v, v \rangle}$ ($\ell_2$-norm)
$\|\cdot\|_X, \|\cdot\|_Y$ dual to each other

Theorem

The following are equivalent:

1. $c = a + b$ is an $X2$-decomposition

2. $c = a + b$ is a $2Y$-decomposition

3. $\langle a, b \rangle = \|a\|_X \|b\|_Y$.

Definition

The Pareto sub-frontier consists of all pairs $(x, y)$ such that there exists an $X2$-decomposition $c = a + b$ with $\|a\|_X = x$ and $\|b\|_Y = y$.

Page 50: Singular Values of Tensors

The Pareto Sub-Frontier

Lemma

The Pareto sub-frontier is the graph of a strictly decreasing homeomorphism $h^c_{YX} : [0, \|c\|_X] \to [0, \|c\|_Y]$.

Clearly, $f^c_{YX} \le h^c_{YX}$.

Definition

A vector $c \in V$ is called tight if $f^c_{YX} = h^c_{YX}$. If all vectors in $V$ are tight then the norms $\|\cdot\|_X$ and $\|\cdot\|_Y$ are called tight.

Page 51: Singular Values of Tensors

(Sub)-Pareto Example

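$c = (3 \ 3)^t \in V = \mathbb{R}^2$

$\left\| \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} \right\|_X = \sqrt{\tfrac{1}{2} z_1^2 + 2 z_2^2}, \qquad \left\| \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} \right\|_Y = \sqrt{2 z_1^2 + \tfrac{1}{2} z_2^2}$

(dual norms)

[Figure: graphs of $f_{YX}$ and $h_{YX}$; both axes run from 0 to 5]

So $c$ is not tight.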

Page 52: Singular Values of Tensors

The Area Below the Pareto Sub-Frontier

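Suppose $c = a + b$ is an $X2$-decomposition with $\|a\|_X = x_0$ and $\|b\|_Y = y_0$. The area under the Pareto sub-frontier is

$\tfrac{1}{2} \|c\|^2 = \tfrac{1}{2} \|a\|^2 + \langle a, b \rangle + \tfrac{1}{2} \|b\|^2.$

[Figure: graph of $h_{YX}$ in the $X$-$Y$ plane, running from $\|c\|_Y$ on the $Y$-axis to $\|c\|_X$ on the $X$-axis, with the point $(\|a\|_X, \|b\|_Y) = (x_0, y_0)$ marked; the region under the curve is split into pieces of area $\tfrac{1}{2}\|a\|_2^2$, $\langle a, b \rangle$ and $\tfrac{1}{2}\|b\|_2^2$]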

Page 53: Singular Values of Tensors

The Area Below the Pareto Sub-Frontier

The sub-Pareto frontier applies to:

- denoising a sparse 1D signal: $\|\cdot\|_1$ vs. $\|\cdot\|_\infty$ (tight!)
- denoising of a matrix by soft thresholding: $\|\cdot\|_\star$ vs. $\|\cdot\|_\sigma$ (tight!)
- denoising tensors: $\|\cdot\|_\star$ vs. $\|\cdot\|_\sigma$
- compressive sensing (Basis Pursuit Denoising, LASSO, Dantzig Selector)
- Fatemi-Osher-Rudin image denoising: total variation norm vs. its dual
- 1D total variation denoising: Taut String Method

Page 54: Singular Values of Tensors

Example: Singular Value Decomposition

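$\mathbb{R}^{n \times m} = \mathbb{R}^n \otimes \mathbb{R}^m$, $r = \min\{n, m\}$
$A \in \mathbb{R}^{n \times m}$ with singular values $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_r \ge 0$

$\|A\|_X = \|A\|_\star = \lambda_1 + \lambda_2 + \cdots + \lambda_r$

$\|A\|_Y = \|A\|_\sigma = \lambda_1$

$\|A\| = \sqrt{\lambda_1^2 + \lambda_2^2 + \cdots + \lambda_r^2}$

Lemma

The norms $\|\cdot\|_\star$ and $\|\cdot\|_\sigma$ are tight.

$XY$-decompositions are related to soft thresholding.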

Page 56: Singular Values of Tensors

Example: Singular Value Decomposition

(note that the Pareto and sub-Pareto frontiers are the same here)

Lemma

If $A$ has singular values $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_r \ge 0$ then the Pareto (sub-)frontier is the piecewise linear function through the points

$(\lambda_1 + \lambda_2 + \cdots + \lambda_k - k \lambda_{k+1}, \ \lambda_{k+1}), \quad k = 0, 1, 2, \dots, r$

(with the convention $\lambda_{r+1} = 0$).

Corollary

The singular values are uniquely determined by the (sub-)Pareto curve, and vice versa.
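The breakpoints in the lemma, and their link to soft thresholding, can be sketched as follows. The function names are hypothetical, and the sketch assumes the standard fact that soft thresholding the singular values at level $y$ splits $A = a + b$ with $\|b\|_\sigma = y$ whenever $y \le \lambda_1$.

```python
import numpy as np

def pareto_points(singular_values):
    """Breakpoints (lam_1 + ... + lam_k - k*lam_{k+1}, lam_{k+1}) for k = 0..r."""
    lam = np.append(np.sort(singular_values)[::-1], 0.0)   # lam_{r+1} = 0
    return [(lam[:k].sum() - k * lam[k], lam[k]) for k in range(len(lam))]

def soft_threshold(A, y):
    """Split A = a + b by shrinking every singular value of A by y (floored at 0)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    a = (U * np.maximum(s - y, 0.0)) @ Vt
    return a, A - a

A = np.diag([3.0, 2.0, 1.0])
points = pareto_points([3.0, 2.0, 1.0])

# Thresholding at y = 1.5 should land on the segment between (1, 2) and (3, 1).
a, b = soft_threshold(A, 1.5)
x = np.linalg.norm(a, 'nuc')     # nuclear norm of the "signal" part
y = np.linalg.norm(b, 2)         # spectral norm of the "noise" part
```

Here the breakpoints are $(0, 3)$, $(1, 2)$, $(3, 1)$, $(6, 0)$, and the thresholded pair $(x, y) = (2, 1.5)$ is the midpoint of the middle segment.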

Page 58: Singular Values of Tensors

Slope decomposition

Definition

$c = c_1 + c_2 + \cdots + c_s$ is called a slope decomposition if $\langle c_i, c_j \rangle = \|c_i\|_X \|c_j\|_Y$ for all $i \le j$ and

$\frac{\|c_1\|_X}{\|c_1\|_Y} < \frac{\|c_2\|_X}{\|c_2\|_Y} < \cdots < \frac{\|c_s\|_X}{\|c_s\|_Y}.$

Theorem

$c \in V$ has a slope decomposition if and only if $c$ is tight.

Theorem

The slope decomposition is unique when it exists.

Page 61: Singular Values of Tensors

Singular values for tight vectors

Suppose $c \in V$ is tight, with slope decomposition $c = c_1 + c_2 + \cdots + c_s$.

Definition

Let $\mu_i = \|c_i\|_Y$, and $\lambda_i = \mu_i + \mu_{i+1} + \cdots + \mu_s$ for all $i$. Then $c$ has singular value $\lambda_i$ with multiplicity $\frac{\|c_i\|_X}{\|c_i\|_Y} - \frac{\|c_{i-1}\|_X}{\|c_{i-1}\|_Y}$.

Multiplicities and singular values are positive, but not necessarily integers.

Theorem

If $c$ has singular values $\lambda_1, \dots, \lambda_s$ with multiplicities $m_1, \dots, m_s$ respectively, then $\|c\|_X = \sum_i m_i \lambda_i$, $\|c\|_Y = \lambda_1$ and $\|c\| = \sqrt{\sum_i m_i \lambda_i^2}$.
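The definition translates into a short computation once the norms of the pieces $c_i$ are known. The sketch below assumes that the subtracted ratio is taken to be 0 for $i = 1$, which the slide does not state explicitly; the function name and the example are hypothetical.

```python
def singular_values_from_slopes(norm_pairs):
    """Return (lambda_i, m_i) from pairs (||c_i||_X, ||c_i||_Y) of a slope decomposition."""
    mus = [y for _, y in norm_pairs]
    ratios = [x / y for x, y in norm_pairs]
    spectrum = []
    previous_ratio = 0.0               # assumed convention for i = 1
    for i, ratio in enumerate(ratios):
        lam = sum(mus[i:])             # lambda_i = mu_i + ... + mu_s
        spectrum.append((lam, ratio - previous_ratio))
        previous_ratio = ratio
    return spectrum

# Matrix with singular values 3 and 1 and singular vector pairs (u1, v1), (u2, v2):
# c_1 = 2 * u1 v1^T has (nuclear, spectral) norms (2, 2),
# c_2 = u1 v1^T + u2 v2^T has (nuclear, spectral) norms (2, 1).
spectrum = singular_values_from_slopes([(2.0, 2.0), (2.0, 1.0)])

x_norm = sum(m * lam for lam, m in spectrum)   # should equal ||c||_X = 3 + 1
y_norm = spectrum[0][0]                        # should equal ||c||_Y = 3
```

In this matrix example the procedure returns the usual singular values 3 and 1, each with multiplicity 1.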

Page 63: Singular Values of Tensors

Singular Values for Tensors

Norms $\|\cdot\|_X = \|\cdot\|_\star$, $\|\cdot\|_Y = \|\cdot\|_\sigma$

Theorem

If $T = \sum_{i=1}^{r} \lambda_i v_i$ is a diagonal singular value decomposition, then the singular values of $T$ are $\lambda_1, \dots, \lambda_r$.

The permanent tensor $P_n$ is tight, but has no DSVD. It has singular value $\frac{n!}{n^{n/2}}$ with multiplicity $\frac{n^n}{n!}$.

Page 65: Singular Values of Tensors

Singular Value Region of a Tensor

There is an easy procedure to go from the Pareto sub-frontier to the singular values and their multiplicities for tight vectors.

Even if a vector is not tight, we can use the same procedure to define a singular spectrum. Some singular values may have infinitesimal multiplicity.

Page 66: Singular Values of Tensors

Example: Singular Value Region of a Tensor

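$\|\cdot\|_X = \|\cdot\|_\star$ and $\|\cdot\|_Y = \|\cdot\|_\sigma$ are dual norms on $\mathbb{R}^{p \times q \times r}$

$C = e_2 \otimes e_1 \otimes e_1 + e_1 \otimes e_2 \otimes e_1 + e_1 \otimes e_1 \otimes e_2 \in \mathbb{R}^2 \otimes \mathbb{R}^2 \otimes \mathbb{R}^2$

has continuous singular spectrum

[Figure: singular value region of $C$; horizontal axis from 0 to 4, vertical axis from 0 to 1.5]

singular value $\frac{2}{\sqrt{3}}$ with multiplicity $\frac{27}{13}$, singular value $\frac{1}{4}$ with multiplicity $\frac{4}{3}$, and singular values between $\frac{1}{4}$ and $\frac{2}{\sqrt{3}}$ with infinitesimal multiplicity

$\|C\|_\sigma = \frac{2}{\sqrt{3}}$ (largest singular value), area is $\|C\|_\star = 3$, and integral of $2y$ over region is $\|C\|_2^2 = 3$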
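Two of the quoted values can be checked numerically. The sketch below evaluates $|\langle C, v \otimes v \otimes v \rangle|$ over unit vectors $v = (\cos\theta, \sin\theta)$. Restricting to symmetric simple tensors is an assumption of this sketch, so in general the result is only a lower bound for $\|C\|_\sigma$; since $C$ is a symmetric tensor, the bound is expected to be attained.

```python
import numpy as np

def simple_tensor(u, v, w):
    """Return u ⊗ v ⊗ w as a 3-way array."""
    return np.einsum("i,j,k->ijk", u, v, w)

e1, e2 = np.eye(2)
C = (simple_tensor(e2, e1, e1) + simple_tensor(e1, e2, e1)
     + simple_tensor(e1, e1, e2))

# Unit vectors v(theta) on a fine grid of angles.
theta = np.linspace(0.0, np.pi, 200001)
V = np.stack([np.cos(theta), np.sin(theta)], axis=1)

# |<C, v ⊗ v ⊗ v>| for every grid vector at once.
values = np.abs(np.einsum("ijk,ni,nj,nk->n", C, V, V, V))
spectral_lower_bound = values.max()

squared_norm = np.sum(C * C)     # ||C||_2^2
```

The grid maximum agrees with $2/\sqrt{3} \approx 1.1547$ and the squared norm is 3, matching the values above.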

Page 67: Singular Values of Tensors

Thanks

Thank You!
