BOOLEAN TENSOR FACTORIZATIONS Pauli Miettinen 14 December 2011




BACKGROUND: TENSORS AND TENSOR FACTORIZATIONS

[Figure: a 3-way tensor X and its factor matrices A, B, and C]


BACKGROUND: BOOLEAN MATRIX FACTORIZATIONS

• Given a binary matrix X and a positive integer R, find two binary matrices A and B such that A has R columns, B has R rows, and X ≈ A ∘ B.

• A ∘ B is the Boolean matrix product of A and B:

$(A \circ B)_{ij} = \bigvee_{r=1}^{R} a_{ir} b_{rj}$
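The Boolean matrix product can be sketched in a few lines of NumPy. This is only an illustration: the function name `boolean_product` and the small matrices are assumptions made for the example, not part of the slides.

```python
import numpy as np

def boolean_product(A, B):
    """Boolean matrix product: entry (i, j) is OR over r of (A[i, r] AND B[r, j])."""
    # The integer product counts the matching r; any positive count is a Boolean 1.
    return (A.astype(int) @ B.astype(int) > 0).astype(int)

# Hypothetical 3-by-2 and 2-by-3 binary factors.
A = np.array([[1, 0],
              [1, 1],
              [0, 1]])
B = np.array([[1, 1, 0],
              [0, 1, 1]])

X_bool = boolean_product(A, B)   # Boolean arithmetic: 1 + 1 = 1
X_normal = A @ B                 # normal arithmetic: 1 + 1 = 2
print(X_bool)
print(X_normal)
```

The two products differ only in the middle entry, where both rank-1 parts overlap.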


BOOLEAN TENSOR FACTORIZATIONS: THE IDEA

1. Take existing (normal) tensor factorization

2. Make everything binary and define summation as 1 + 1 = 1

3. Try to understand what you just did.

Research problem. What can we say about Boolean tensor factorizations and how do they relate to normal tensor factorizations and Boolean matrix factorizations?


RANK-1 (BOOLEAN) TENSORS

[Figure: a rank-1 matrix X drawn as the outer product of vectors a and b]

$X = a \times b$

[Figure: a rank-1 3-way tensor X drawn as the outer product of vectors a, b, and c]

$X = a \times_1 b \times_2 c$
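As a minimal sketch, a rank-1 binary 3-way tensor can be built as the outer product of three binary vectors. The vectors below are made up for illustration.

```python
import numpy as np

# Hypothetical binary vectors, one per mode.
a = np.array([1, 0, 1])
b = np.array([1, 1, 0, 0])
c = np.array([0, 1])

# Outer product: X[i, j, k] = a[i] * b[j] * c[k].
X = np.einsum("i,j,k->ijk", a, b, c)

print(X.shape)   # one axis per vector
print(X.sum())   # number of 1s = (1s in a) * (1s in b) * (1s in c)
```

Because the entries are 0 or 1, the product of three entries is the same as their logical AND, so the tensor is rank-1 in both normal and Boolean arithmetic.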


THE CP TENSOR DECOMPOSITION

$x_{ijk} \approx \sum_{r=1}^{R} a_{ir} b_{jr} c_{kr}$

[Figure: X ≈ (a_1 × b_1 × c_1) + (a_2 × b_2 × c_2) + ··· + (a_R × b_R × c_R); equivalently, X is approximated from the factor matrices A, B, and C]


THE BOOLEAN CP TENSOR DECOMPOSITION

$x_{ijk} \approx \bigvee_{r=1}^{R} a_{ir} b_{jr} c_{kr}$

[Figure: X ≈ (a_1 × b_1 × c_1) ∨ (a_2 × b_2 × c_2) ∨ ··· ∨ (a_R × b_R × c_R); equivalently, X is approximated by the Boolean product of the factor matrices A, B, and C]
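The difference between the normal and the Boolean CP reconstruction can be sketched as follows. The function names and the tiny factor matrices are assumptions chosen for the example; the error measure (number of disagreeing entries) is likewise an illustrative choice.

```python
import numpy as np

def cp_reconstruct(A, B, C):
    """Normal CP: x_ijk = sum over r of a_ir * b_jr * c_kr."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)

def boolean_cp_reconstruct(A, B, C):
    """Boolean CP: x_ijk = OR over r of (a_ir AND b_jr AND c_kr)."""
    return (cp_reconstruct(A, B, C) > 0).astype(int)

def reconstruction_error(X, X_hat):
    """Number of entries where the binary tensors disagree."""
    return int(np.sum(X != X_hat))

# Hypothetical binary factors with R = 2 overlapping components.
A = np.array([[1, 0], [1, 1], [0, 1]])   # 3 x 2
B = np.array([[1, 0], [1, 1]])           # 2 x 2
C = np.array([[1, 1], [0, 1]])           # 2 x 2

X_sum = cp_reconstruct(A, B, C)
X_or = boolean_cp_reconstruct(A, B, C)
print(X_sum[1, 1, 0], X_or[1, 1, 0])     # the overlap counts twice only under normal addition
print(reconstruction_error(X_or, X_sum))
```

Where the two rank-1 components overlap, normal addition yields a 2, which is not a valid binary value, while the Boolean OR keeps the entry at 1.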


DIGRESSION: FREQUENT TRI-ITEMSET MINING

• A rank-1 N-way binary tensor defines an N-way itemset

• In particular, a rank-1 binary matrix defines an itemset

• In itemset mining, the induced sub-tensor must be full of 1s

• Here, the itemsets can have holes

• Boolean CP decomposition = lossy N-way tiling
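To make the "holes" concrete, the following sketch extracts the sub-tensor induced by a rank-1 block and checks whether it is full of 1s. The helper name `induced_subtensor` and the toy data are assumptions made for illustration.

```python
import numpy as np

def induced_subtensor(X, a, b, c):
    """Sub-tensor of X selected by the 1s of the binary vectors a, b, and c."""
    return X[np.ix_(a.astype(bool), b.astype(bool), c.astype(bool))]

# Hypothetical 3 x 3 x 2 data tensor: a 2 x 2 x 2 block of 1s with one hole.
X = np.zeros((3, 3, 2), dtype=int)
X[:2, :2, :] = 1
X[1, 1, 0] = 0

a = np.array([1, 1, 0])
b = np.array([1, 1, 0])
c = np.array([1, 1])

block = induced_subtensor(X, a, b, c)
is_exact_itemset = bool(block.all())   # what frequent itemset mining would require
density = block.mean()                 # a lossy tile tolerates density below 1
print(is_exact_itemset, density)
```

Under the exact itemset requirement this block would be rejected, whereas a Boolean CP component may still cover it at the cost of one wrongly covered entry.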


TENSOR RANK

The rank of a tensor is the minimum number of rank-1 tensors needed to represent the tensor exactly.

[Figure: X = (a_1 × b_1 × c_1) + (a_2 × b_2 × c_2) + ··· + (a_R × b_R × c_R)]


BOOLEAN TENSOR RANK

The Boolean rank of a binary tensor is the minimum number of binary rank-1 tensors needed to represent the tensor exactly using Boolean arithmetic.

[Figure: X = (a_1 × b_1 × c_1) ∨ (a_2 × b_2 × c_2) ∨ ··· ∨ (a_R × b_R × c_R)]
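A small worked example may help to see how the two notions of rank differ. The example below is not from the slides; it uses a 3-by-3 binary matrix (a 2-way tensor) for readability and checks that two binary rank-1 matrices reproduce it under Boolean arithmetic, while its normal rank is 3.

```python
import numpy as np

X = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1]])

# Two overlapping binary rank-1 matrices (chosen by hand for this example).
a1, b1 = np.array([1, 1, 0]), np.array([1, 1, 0])
a2, b2 = np.array([0, 1, 1]), np.array([0, 1, 1])

boolean_sum = np.logical_or(np.outer(a1, b1), np.outer(a2, b2)).astype(int)
normal_sum = np.outer(a1, b1) + np.outer(a2, b2)

print(np.array_equal(boolean_sum, X))   # exact under Boolean arithmetic
print(np.array_equal(normal_sum, X))    # not exact under normal arithmetic
print(np.linalg.matrix_rank(X))         # normal (real-valued) rank
```

The overlap at the centre entry is harmless under OR but produces a 2 under normal addition, which is why the same two components do not give an exact normal decomposition.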


SOME RESULTS ON RANKS

• Normal tensor rank is NP-hard to compute

• Normal tensor rank of an n-by-m-by-k tensor can be more than min{n, m, k}

• But no more than min{nm, nk, mk}

• Boolean tensor rank is NP-hard to compute

• Boolean tensor rank of an n-by-m-by-k tensor can be more than min{n, m, k}

• But no more than min{nm, nk, mk}
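One way to see the upper bound is a trivial decomposition that uses one rank-1 tensor per fiber. The sketch below builds it along the third mode, giving nm components; the same construction along the other modes gives nk or mk, hence the minimum. The function name `fiber_decomposition` is an assumption for this example.

```python
import numpy as np

def fiber_decomposition(X):
    """Write an n-by-m-by-k tensor as n*m rank-1 tensors, one per mode-3 fiber X[i, j, :]."""
    n, m, k = X.shape
    A = np.zeros((n, n * m), dtype=X.dtype)
    B = np.zeros((m, n * m), dtype=X.dtype)
    C = np.zeros((k, n * m), dtype=X.dtype)
    for i in range(n):
        for j in range(m):
            r = i * m + j
            A[i, r] = 1            # unit vector selecting row i
            B[j, r] = 1            # unit vector selecting column j
            C[:, r] = X[i, j, :]   # the fiber itself
    return A, B, C

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(3, 4, 5))   # random binary tensor, nm = 12 is the smallest product
A, B, C = fiber_decomposition(X)

X_sum = np.einsum("ir,jr,kr->ijk", A, B, C)
X_or = (X_sum > 0).astype(int)
print(A.shape[1], np.array_equal(X_sum, X), np.array_equal(X_or, X))
```

No two components overlap, so the decomposition is exact under both normal and Boolean arithmetic.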


SPARSITY

• A binary matrix X of Boolean rank R with |X| 1s has a Boolean rank-R decomposition A ∘ B such that |A| + |B| ≤ 2|X| [M., ICDM ’10]

• A binary N-way tensor X of Boolean tensor rank R has a Boolean rank-R CP decomposition with factor matrices A_1, A_2, …, A_N such that ∑_i |A_i| ≤ N|X|

• Both results are existential only and extend to approximate decompositions


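THE TUCKER TENSOR DECOMPOSITION

[Figure: X approximated from a core tensor G and factor matrices A, B, and C]

$x_{ijk} \approx \sum_{p=1}^{P} \sum_{q=1}^{Q} \sum_{r=1}^{R} g_{pqr} a_{ip} b_{jq} c_{kr}$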


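THE BOOLEAN TUCKER TENSOR DECOMPOSITION

[Figure: X approximated from a binary core tensor G and binary factor matrices A, B, and C]

$x_{ijk} \approx \bigvee_{p=1}^{P} \bigvee_{q=1}^{Q} \bigvee_{r=1}^{R} g_{pqr} a_{ip} b_{jq} c_{kr}$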
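The Boolean Tucker reconstruction can be sketched with a single `einsum` followed by thresholding. The function names and the random binary factors are assumptions for illustration; the last lines check the well-known fact that CP is the special case of Tucker with a superdiagonal core.

```python
import numpy as np

def tucker_reconstruct(G, A, B, C):
    """Normal Tucker: x_ijk = sum over p, q, r of g_pqr * a_ip * b_jq * c_kr."""
    return np.einsum("pqr,ip,jq,kr->ijk", G, A, B, C)

def boolean_tucker_reconstruct(G, A, B, C):
    """Boolean Tucker: the sums become ORs, so any positive count is a 1."""
    return (tucker_reconstruct(G, A, B, C) > 0).astype(int)

rng = np.random.default_rng(0)
R = 3
A = rng.integers(0, 2, size=(4, R))
B = rng.integers(0, 2, size=(5, R))
C = rng.integers(0, 2, size=(6, R))

# Superdiagonal core: g_rrr = 1 and 0 elsewhere, which reduces Tucker to CP.
G = np.zeros((R, R, R), dtype=int)
G[np.arange(R), np.arange(R), np.arange(R)] = 1

X_tucker = boolean_tucker_reconstruct(G, A, B, C)
X_cp = (np.einsum("ir,jr,kr->ijk", A, B, C) > 0).astype(int)
print(np.array_equal(X_tucker, X_cp))
```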


THE ALGORITHMS

• The normal CP-decomposition can be solved using matricization and ALS

• ⊙ is the Khatri–Rao matrix product

• $(C \odot B)^T$ is R-by-mk

• For normal matrices, we can use standard least-squares projections

• One projection per mode

• Similar algorithms for the Tucker decomposition

$X_{(1)} = A (C \odot B)^T$

$X_{(2)} = B (C \odot A)^T$

$X_{(3)} = C (B \odot A)^T$
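The matricized identities and one least-squares projection can be checked numerically as sketched below. The helpers `khatri_rao` and `unfold` are written for this example, and the unfolding order (the lower-numbered remaining mode varies fastest) is an assumption chosen to match the formulas above.

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product; column r equals kron(U[:, r], V[:, r])."""
    return np.einsum("ur,vr->uvr", U, V).reshape(-1, U.shape[1])

def unfold(X, mode):
    """Matricize a 3-way array along `mode`; the lower-numbered remaining mode varies fastest."""
    rest = [ax for ax in (2, 1, 0) if ax != mode]
    return X.transpose([mode] + rest).reshape(X.shape[mode], -1)

rng = np.random.default_rng(0)
n, m, k, R = 4, 5, 6, 3
A = rng.normal(size=(n, R))
B = rng.normal(size=(m, R))
C = rng.normal(size=(k, R))
X = np.einsum("ir,jr,kr->ijk", A, B, C)   # an exact rank-R tensor

# The three identities, one per mode.
ok1 = np.allclose(unfold(X, 0), A @ khatri_rao(C, B).T)
ok2 = np.allclose(unfold(X, 1), B @ khatri_rao(C, A).T)
ok3 = np.allclose(unfold(X, 2), C @ khatri_rao(B, A).T)

# One least-squares projection: solve X_(1) = A (C ⊙ B)^T for A with B and C fixed.
A_hat = np.linalg.lstsq(khatri_rao(C, B), unfold(X, 0).T, rcond=None)[0].T
print(ok1, ok2, ok3, np.allclose(A_hat, A))
```

A full ALS loop would repeat such a projection for each mode in turn until the reconstruction error stops improving.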


THE ALGORITHMS

• For the Boolean case, the matrix product must be changed

• Khatri–Rao stays as it is

• Finding the optimal projection is NP-hard even to approximate

• Good initial values are needed due to multiple local minima

• Obtained by applying Boolean matrix factorization to the matricizations

$X_{(1)} = A \circ (C \odot B)^T$

$X_{(2)} = B \circ (C \odot A)^T$

$X_{(3)} = C \circ (B \odot A)^T$
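The Boolean identities can be verified in the same way. The sketch below only checks the identities on random binary factors; it does not attempt the projection step, which the slide notes is NP-hard. All helper names are assumptions for this example.

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product; for binary inputs the result is binary."""
    return np.einsum("ur,vr->uvr", U, V).reshape(-1, U.shape[1])

def unfold(X, mode):
    """Matricize a 3-way array along `mode`; the lower-numbered remaining mode varies fastest."""
    rest = [ax for ax in (2, 1, 0) if ax != mode]
    return X.transpose([mode] + rest).reshape(X.shape[mode], -1)

def boolean_product(A, B):
    """Boolean matrix product: OR of ANDs instead of sum of products."""
    return (A.astype(int) @ B.astype(int) > 0).astype(int)

rng = np.random.default_rng(1)
n, m, k, R = 4, 5, 6, 3
A = rng.integers(0, 2, size=(n, R))
B = rng.integers(0, 2, size=(m, R))
C = rng.integers(0, 2, size=(k, R))

# Boolean CP reconstruction of a binary tensor from binary factors.
X = (np.einsum("ir,jr,kr->ijk", A, B, C) > 0).astype(int)

ok1 = np.array_equal(unfold(X, 0), boolean_product(A, khatri_rao(C, B).T))
ok2 = np.array_equal(unfold(X, 1), boolean_product(B, khatri_rao(C, A).T))
ok3 = np.array_equal(unfold(X, 2), boolean_product(C, khatri_rao(B, A).T))
print(ok1, ok2, ok3)
```

The Khatri–Rao product needs no change because the product of two 0/1 entries already equals their logical AND; only the outer matrix product has to switch from sums to ORs.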


THE TUCKER CASE

• The core tensor has global effects

• Updates are hard

• Core tensor is usually small

• We can afford more time per element

• In the Boolean case, many changes make no difference

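[Figure: the Tucker decomposition of X into a core tensor G and factor matrices A, B, and C, with the element-wise formula for x_ijk in terms of g_pqr, a_ip, b_jq, and c_kr]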
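The last bullet can be illustrated by counting how many single-entry flips of the core leave the Boolean reconstruction unchanged. This is a sketch under assumed names (`boolean_tucker`, `count_ineffective_flips`) with two hand-picked extreme cases; it is not the update rule used in the talk.

```python
import numpy as np

def boolean_tucker(G, A, B, C):
    """Boolean Tucker reconstruction: OR over p, q, r of g_pqr AND a_ip AND b_jq AND c_kr."""
    return (np.einsum("pqr,ip,jq,kr->ijk", G, A, B, C) > 0).astype(int)

def count_ineffective_flips(G, A, B, C):
    """Count core entries whose flip (0 <-> 1) leaves the reconstruction unchanged."""
    base = boolean_tucker(G, A, B, C)
    count = 0
    for idx in np.ndindex(G.shape):
        G_flipped = G.copy()
        G_flipped[idx] = 1 - G_flipped[idx]
        if np.array_equal(boolean_tucker(G_flipped, A, B, C), base):
            count += 1
    return count

G = np.ones((2, 2, 2), dtype=int)
ones = np.ones((2, 2), dtype=int)   # fully overlapping factor columns
eye = np.eye(2, dtype=int)          # disjoint factor columns

overlapping = count_ineffective_flips(G, ones, ones, ones)
disjoint = count_ineffective_flips(G, eye, eye, eye)
print(overlapping, disjoint)
```

With fully overlapping factors every core entry covers the same cells, so removing any single one changes nothing; with disjoint factors each core entry controls its own cell, so every flip matters.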


SYNTHETIC EXPERIMENTS

[Figure, left panel: reconstruction error (×10^4) versus rank r ∈ {4, 8, 16, 32, 64} for BCP_ALS, CP_ALS_B, CP_ALS_F, CP_OPT_B, and CP_OPT_F]

[Figure, right panel: reconstruction error (×10^4) versus ranks (4, 4, 4), (8, 4, 4), (8, 8, 8), and (16, 8, 4) for BTucker_ALS, Tucker_ALS_B, and Tucker_ALS_F]


REAL-WORLD EXPERIMENTS

[Figure, left panel: reconstruction error versus rank r ∈ {5, 10, 15, 30} for BCP_ALS, CP_ALS_B, CP_ALS_F, CP_OPT_B, and CP_OPT_F]

[Figure, right panel: reconstruction error versus ranks (5, 5, 5), (10, 10, 10), (15, 15, 15), and (30, 30, 15) for BTucker_ALS, Tucker_ALS_B, and Tucker_ALS_F]


CONCLUSIONS

• Boolean tensor decompositions are a bit like normal tensor decompositions

• And a bit like Boolean matrix factorizations

• They generalize other data mining techniques in many ways

• The playing field between Boolean and normal tensor factorizations is more level


Thank You!