High-performance sampling of Determinantal Point Processes

Jack Poulson

hodgestar.com

Numerical algorithms for high-perf. computational science
The Royal Society, London

April 08, 2019

Overview

• Will draw strong connection between techniques for efficiently factoring matrices and for sampling structured subsets of a ground set.

• The basic bridge: forming a Schur complement equates to forming a representation of a conditional distribution.

• One can import HPC techniques, such as DAG-scheduled dense and sparse-direct blocked algorithms, from factorizations to Determinantal Point Processes [Macchi-1975, Burton/Pemantle-1993, Benjamini/Lyons/Peres/Schramm-2001].

• Implementations are available in the permissively licensed, header-only C++14 package Catamari [P-2018], available at hodgestar.com/catamari.
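
The Schur-complement bridge in the second bullet can be written out with a block partition. This is a sketch in notation introduced here, not taken from the slides: K is the marginal kernel, Y is the random sample, the items in A are ordered first, B is the remainder, and K_AA is assumed invertible. It uses only the DPP rule that the probability of a subset being included is the corresponding principal minor of K, together with the block determinant identity.

```latex
\[
K = \begin{pmatrix} K_{AA} & K_{AB} \\ K_{BA} & K_{BB} \end{pmatrix},
\qquad
S = K_{BB} - K_{BA} K_{AA}^{-1} K_{AB},
\]
\[
P[J \subseteq \mathbf{Y} \mid A \subseteq \mathbf{Y}]
  = \frac{\det(K_{A \cup J})}{\det(K_{AA})}
  = \det(S_J)
  \quad \text{for all } J \subseteq B .
\]
```

In words: after conditioning on A being included, the remaining items again follow a DPP whose marginal kernel is the Schur complement S.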

Main idea: pivots as inclusion probabilities

Sampling a DPP can be reinterpreted as ‘factoring’ a class of matrices such that the j’th pivot is the probability of including the j’th item.

Flip a coin weighted by the pivot to determine inclusion:

• If the item is kept, proceed as in an LU/LDL factorization.

• If the item is dropped, take the pivot’s complement in [0, 1] and negate it (i.e., subtract one from the pivot), then proceed as normal.

The likelihood of the sample is thus the product of the absolute values of the diagonal entries of the ‘factorization’.

Essentially all high-performance techniques for dense and sparse-direct factorizations therefore carry over.
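
The coin-flipping procedure above can be sketched as an unblocked, right-looking LU loop. This is a minimal illustration in plain Python, assuming a small real-valued marginal kernel stored as a list of lists; the function name `sample_dpp` is hypothetical and this is not Catamari's API, which implements blocked and sparse-direct variants in C++.

```python
import random


def sample_dpp(kernel, rng=None):
    """Sample from a DPP given its n x n real marginal kernel.

    Runs an unpivoted LU-style elimination in which each pivot is read
    as an inclusion probability. Returns (sample, likelihood).
    """
    rng = rng or random.Random()
    n = len(kernel)
    a = [list(row) for row in kernel]  # work on a copy of the kernel
    sample = []
    likelihood = 1.0
    for j in range(n):
        # Pivot = probability of including item j, conditioned on the
        # keep/drop decisions already made for items 0..j-1.
        pivot = a[j][j]
        if rng.random() < pivot:
            sample.append(j)  # kept: use the pivot unchanged
        else:
            pivot -= 1.0      # dropped: negated complement, i.e. subtract one
        a[j][j] = pivot
        likelihood *= abs(pivot)
        # Schur complement update of the trailing submatrix.
        for i in range(j + 1, n):
            a[i][j] /= pivot
            for k in range(j + 1, n):
                a[i][k] -= a[i][j] * a[j][k]
    return sample, likelihood


# Rank-one projection kernel: every sample contains exactly one item.
print(sample_dpp([[0.5, 0.5], [0.5, 0.5]], random.Random(0)))
```

A pivot of exactly 0 is always dropped and a pivot of exactly 1 is always kept, so in exact arithmetic the modified pivot is never zero and no pivoting is needed in this sketch.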

What is meant by a ‘structured subset’?

The basic mechanism of a (finite) Point Process is to define a probability distribution over the power set of a ground set [n] = {0, ..., n − 1}.

A determinantal point process sets the probability of a subset J ⊆ [n] being in the sample equal to the J-minor (the determinant of the principal submatrix indexed by J) of a fixed marginal kernel matrix.

The kernel matrix is often assumed Hermitian positive semi-definite, with spectrum in [0, 1], but Hermiticity does not hold in some important cases.

Inadmissible combinations of members of the set can therefore be encoded through linear dependencies in the kernel matrix.

Before diving into the details, it will be instructive to describe some Hermitian and non-Hermitian standard DPPs.
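
As a small illustration of how linear dependencies encode inadmissible combinations, the sketch below evaluates a few principal minors of a 3 × 3 kernel whose first two rows are identical. The kernel values and the helper names `det` and `inclusion_probability` are made up for this example.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (tiny matrices only)."""
    if not m:
        return 1.0
    total = 0.0
    for c, entry in enumerate(m[0]):
        minor = [row[:c] + row[c + 1:] for row in m[1:]]
        total += (-1) ** c * entry * det(minor)
    return total


def inclusion_probability(kernel, subset):
    """P[subset is contained in the sample] = principal minor indexed by subset."""
    idx = sorted(subset)
    return det([[kernel[i][j] for j in idx] for i in idx])


# Rows 0 and 1 are identical, so items 0 and 1 can never appear together.
kernel = [
    [0.5, 0.5, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 0.3],
]

print(inclusion_probability(kernel, {0}))     # marginal of item 0
print(inclusion_probability(kernel, {0, 1}))  # zero: linearly dependent rows
print(inclusion_probability(kernel, {0, 2}))  # independent block: product of marginals
```

Items 0 and 1 each appear with probability 0.5 but never jointly, while item 2 is independent of both.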

Aztec diamond: d = 5

$ ./aztec_diamond --diamond_size=5

Complex non-Hermitian kernel; Sample likelihoods: exp(−10.3972)

Aztec diamond: d = 10

$ ./aztec_diamond --diamond_size=10

Complex non-Hermitian kernel; Sample likelihoods: exp(−38.1231)

Aztec diamond: d = 40

$ ./aztec_diamond --diamond_size=40

Complex non-Hermitian kernel; Sample likelihoods: exp(−568.381)

Aztec diamond: d = 80

$ ./aztec_diamond --diamond_size=80

Complex non-Hermitian kernel; Sample likelihoods: exp(−2245.8)
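
The likelihoods reported on the Aztec diamond slides can be cross-checked against a simple count. The sketch below assumes, which the slides do not state explicitly, that the example draws uniformly random domino tilings; an Aztec diamond of order d has 2^(d(d+1)/2) domino tilings, so every sample would have log-likelihood −(d(d+1)/2) ln 2. The function name is hypothetical.

```python
import math


def aztec_log_likelihood(d):
    """Log-likelihood of one uniform domino tiling of the order-d Aztec diamond."""
    return -(d * (d + 1) // 2) * math.log(2.0)


# Exponents printed on the slides, keyed by diamond size.
reported = {5: -10.3972, 10: -38.1231, 40: -568.381, 80: -2245.8}

for d, value in reported.items():
    print(d, aztec_log_likelihood(d), value)
```

Under that assumption the computed values agree with the reported exponents to the digits shown.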

Uniform Spanning Tree in Z^2 (d = 10)

$ ./uniform_spanning_tree --x_size=10 --y_size=10

Real-symmetric elementary kernel; Sample likelihoods: exp(−98.448)

Uniform Spanning Tree in Z^2 (d = 40)

$ ./uniform_spanning_tree --x_size=40 --y_size=40

Real-symmetric elementary kernel; Sample likelihoods: exp(−1794.24)

Uniform Spanning Tree in Z^2 (d = 100)

$ ./uniform_spanning_tree --x_size=100 --y_size=100

Real-symmetric elementary kernel; Sample likelihoods: exp(−11,484.5)
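
A uniform spanning tree has likelihood one over the number of spanning trees, which Kirchhoff's matrix-tree theorem gives in closed form for a rectangular grid. The sketch below assumes that `x_size` and `y_size` count vertices of a free-boundary grid graph (an interpretation, not something the slides state) and uses the known eigenvalues of the grid Laplacian; the function name is hypothetical.

```python
import math


def log_spanning_trees(x_size, y_size):
    """Log of the number of spanning trees of the x_size-by-y_size grid graph.

    Matrix-tree theorem: the count is the product of the nonzero Laplacian
    eigenvalues divided by the number of vertices.
    """
    total = -math.log(x_size * y_size)
    for j in range(x_size):
        for k in range(y_size):
            if j == 0 and k == 0:
                continue  # skip the single zero eigenvalue
            eig = (4.0
                   - 2.0 * math.cos(math.pi * j / x_size)
                   - 2.0 * math.cos(math.pi * k / y_size))
            total += math.log(eig)
    return total


for size in (10, 40, 100):
    print(size, -log_spanning_trees(size, size))
```

For the 10 × 10 grid this evaluates to a log-likelihood of about −98.448, in line with the d = 10 slide; the larger sizes can be compared against the reported exponents in the same way.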

UST for hexagonal tiling of plane (d = 10)

$ ./uniform_spanning_tree --x_size=10 --y_size=10 --hexagonal=true

Real-symmetric elementary kernel; Sample likelihoods: exp(−299.101)

UST for hexagonal tiling of plane (d = 60)

$ ./uniform_spanning_tree --x_size=60 --y_size=60 --hexagonal=true

Real-symmetric elementary kernel; Sample likelihoods: exp(−11,486.8)

Hermitian Determinantal Point Processes

Definition 1. A (Hermitian) marginal kernel matrix is a (real or complex) Hermitian matrix whose eigenvalues live in $[0, 1]$.

Definition 2. A (finite, Hermitian) Determinantal Point Process (DPP) is a random variable $\mathbf{Y}$ over the power set of $\mathcal{Y} = \{0, ..., n - 1\} = [n]$ generated by an $n \times n$ (Hermitian) marginal kernel matrix $K$ via the rule

$$P_K[Y \subseteq \mathbf{Y}] = \det(K_Y),$$

where $K_Y$ is the $|Y| \times |Y|$ submatrix of $K$ formed by restricting to the rows and columns in the index set $Y$.

Definition 3. A (Hermitian) DPP is called elementary if the eigenvalues of its marginal kernel matrix all lie in $\{0, 1\}$.
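
To see what Definitions 2 and 3 imply in a concrete case, the sketch below turns the inclusion rule into exact-set probabilities by inclusion-exclusion over supersets, then applies it to a rank-one orthogonal projection (eigenvalues 1, 0, 0, hence elementary). The kernel and all helper names are chosen for illustration only.

```python
from itertools import combinations


def det(m):
    """Determinant by Laplace expansion along the first row (tiny matrices only)."""
    if not m:
        return 1.0
    total = 0.0
    for c, entry in enumerate(m[0]):
        minor = [row[:c] + row[c + 1:] for row in m[1:]]
        total += (-1) ** c * entry * det(minor)
    return total


def principal_minor(kernel, subset):
    """det(K_Y): the inclusion probability of the index set `subset`."""
    idx = sorted(subset)
    return det([[kernel[i][j] for j in idx] for i in idx])


def exact_probability(kernel, subset):
    """P[sample == subset], by inclusion-exclusion over supersets of subset."""
    rest = [i for i in range(len(kernel)) if i not in subset]
    total = 0.0
    for r in range(len(rest) + 1):
        for extra in combinations(rest, r):
            total += (-1) ** r * principal_minor(kernel, set(subset) | set(extra))
    return total


# Orthogonal projection onto v = (1, 2, 2) / 3.
v = [1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0]
kernel = [[vi * vj for vj in v] for vi in v]

for size in range(4):
    mass = sum(exact_probability(kernel, set(s)) for s in combinations(range(3), size))
    print(size, mass)
```

All of the probability mass lands on subsets of size one, the rank of the projection, with item i chosen with probability v_i squared.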

Non-Hermitian DPP kernels

Definition 4. A (finite) Determinantal Point Process is a random variable $\mathbf{Y}$ over the power set of $\mathcal{Y} = [n]$ generated by an admissible $K \in \mathbb{C}^{n \times n}$ that is consistent with the rule:

$$P_K[Y \subseteq \mathbf{Y}] = \det(K_Y).$$

Proposition 1 (Brunel-2018). A matrix $K \in \mathbb{C}^{n \times n}$ is admissible as a DPP marginal kernel iff

$$(-1)^{|J|} \det(K - I_J) \ge 0, \quad \forall J \subseteq [n].$$
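For small $n$, the admissibility condition can be tested by brute force. The sketch below assumes that $I_J$ denotes the diagonal 0/1 matrix with ones exactly at the indices in $J$ (the slide does not spell this out); the function name `is_admissible` and the tolerance are illustrative, and the cost is $2^n$ determinants.

```python
import itertools
import numpy as np

def is_admissible(K, tol=1e-12):
    """Check (-1)^{|J|} det(K - I_J) >= 0 for every J (exponential cost)."""
    n = K.shape[0]
    for size in range(n + 1):
        for J in itertools.combinations(range(n), size):
            # Assumed meaning of I_J: ones on the diagonal at indices in J.
            indicator = np.zeros(n)
            for j in J:
                indicator[j] = 1.0
            value = (-1) ** size * np.linalg.det(K - np.diag(indicator))
            if abs(np.imag(value)) > tol or np.real(value) < -tol:
                return False
    return True

hermitian = np.array([[0.5, 0.25], [0.25, 0.5]])
nonsymmetric = np.array([[0.5, 0.5], [-0.5, 0.5]])
too_large = np.array([[1.5, 0.0], [0.0, 0.5]])   # a "probability" above 1
print(is_admissible(hermitian), is_admissible(nonsymmetric), is_admissible(too_large))
```

Under the stated reading of $I_J$, each tested quantity is the probability that the sample equals the complement of $J$, which is why non-negativity of all of them characterizes a valid kernel.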

Equivalence classes of DPP kernels

Proposition 2 (P-2019). The equivalence class of a structurally symmetric DPP kernel $K \in \mathbb{C}^{n \times n}$ is its orbit under the group of diagonal similarity transformations, i.e.,

$$\{D^{-1} K D : D = \mathrm{diag}(d),\ d \in (\mathbb{C}^\times)^n\}.$$

For complex Hermitian and real symmetric $K$, the entries of $D$ must respectively lie in $U(1)$ and $O(1)$.

Proposition 3 (P-2019). The equivalence class of a structurally nonsymmetric DPP kernel $K$ strictly contains its orbit under the group of diagonal similarity transformations.

Proof. If structural symmetry is broken at a $2 \times 2$ submatrix, we need only observe that:

$$\mathrm{DPP}\left(\begin{pmatrix} \alpha & 0 \\ \beta & \gamma \end{pmatrix}\right) \equiv \mathrm{DPP}\left(\begin{pmatrix} \alpha & 0 \\ 0 & \gamma \end{pmatrix}\right),$$

but neither is contained in the orbit of the other.
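Two kernels define the same DPP exactly when all of their principal minors agree, which makes both propositions easy to spot-check numerically. This is a small sketch under that observation; `principal_minors` and `same_dpp` are hypothetical helper names, and the specific matrices are arbitrary test data.

```python
import itertools
import numpy as np

def principal_minors(K):
    """Map each nonempty index tuple Y to det(K_Y)."""
    n = K.shape[0]
    return {
        Y: np.linalg.det(K[np.ix_(Y, Y)])
        for size in range(1, n + 1)
        for Y in itertools.combinations(range(n), size)
    }

def same_dpp(K1, K2, tol=1e-10):
    """True if every principal minor of K1 matches that of K2."""
    m1, m2 = principal_minors(K1), principal_minors(K2)
    return all(abs(m1[Y] - m2[Y]) <= tol for Y in m1)

# Diagonal similarity D^{-1} K D leaves every principal minor unchanged.
K = np.array([[0.6, 0.2, 0.1], [0.2, 0.5, 0.1], [0.1, 0.1, 0.4]], dtype=complex)
D = np.diag([1.0, 2.0j, -0.5])
K_similar = np.linalg.inv(D) @ K @ D
print(same_dpp(K, K_similar))

# The 2x2 example from the proof: same minors, yet D^{-1} K D keeps the
# (1, 0) entry nonzero for any invertible diagonal D.
alpha, beta, gamma = 0.3, 0.7, 0.6
lower = np.array([[alpha, 0.0], [beta, gamma]])
diagonal = np.diag([alpha, gamma])
print(same_dpp(lower, diagonal))
```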

Conditioning and Schur complements

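Proposition 4. Given disjoint subsets $A, B \subset \mathcal{Y}$,

$$P[B \subseteq \mathbf{Y} \mid A \subseteq \mathbf{Y}] = \det(K_B - K_{B,A} K_A^{-1} K_{A,B}).$$

Proof.

$$\det(K_{A \cup B}) = \det(K_A) \det(K_B - K_{B,A} K_A^{-1} K_{A,B})$$

and

$$P[B \subseteq \mathbf{Y} \mid A \subseteq \mathbf{Y}] = \frac{\det(K_{A \cup B})}{\det(K_A)}.$$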
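The identity in the proof can be confirmed on a concrete kernel. The snippet below is an illustrative check only; the function name `conditional_inclusion` and the test matrix are assumptions made for the example.

```python
import numpy as np

def conditional_inclusion(K, A, B):
    """det of the Schur complement of K_A in K restricted to A and B."""
    K_A = K[np.ix_(A, A)]
    K_B = K[np.ix_(B, B)]
    K_BA = K[np.ix_(B, A)]
    K_AB = K[np.ix_(A, B)]
    schur = K_B - K_BA @ np.linalg.solve(K_A, K_AB)
    return np.linalg.det(schur)

K = np.array([[0.6, 0.2, 0.1], [0.2, 0.5, 0.1], [0.1, 0.1, 0.4]])
A, B = [0], [1, 2]
union = A + B
ratio = np.linalg.det(K[np.ix_(union, union)]) / np.linalg.det(K[np.ix_(A, A)])
print(conditional_inclusion(K, A, B), ratio)  # the two values should agree
```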

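Conditioning and Schur complements

Proposition 5. Given disjoint subsets $A, B \subset \mathcal{Y}$,

$$P[B \subseteq \mathbf{Y} \mid A \subseteq \mathbf{Y}^c] = \det(K_B - K_{B,A} (K_A - I)^{-1} K_{A,B}).$$

Proof. The claim follows from repeated application of the case where $A$ is a single element, $a$:

$$\begin{aligned}
P[B \subseteq \mathbf{Y} \mid a \notin \mathbf{Y}]
&= \frac{P[a \notin \mathbf{Y} \mid B \subseteq \mathbf{Y}]\, P[B \subseteq \mathbf{Y}]}{P[a \notin \mathbf{Y}]} \\
&= \frac{(1 - P[a \in \mathbf{Y} \mid B \subseteq \mathbf{Y}])\, P[B \subseteq \mathbf{Y}]}{1 - P[a \in \mathbf{Y}]} \\
&= \frac{P[B \subseteq \mathbf{Y}] - P[a \in \mathbf{Y} \wedge B \subseteq \mathbf{Y}]}{1 - P[a \in \mathbf{Y}]} \\
&= \frac{\det(K_B) - \det(K_{a \cup B})}{1 - K_a} \\
&= \det(K_B) \left(1 - \frac{K_{a,B} K_B^{-1} K_{B,a}}{K_a - 1}\right) \\
&= \det(K_B - K_{B,a} (K_a - 1)^{-1} K_{a,B}).
\end{aligned}$$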
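As a sanity check of the multi-element statement, one can compare the Schur-complement formula against a direct inclusion-exclusion computation. This is a sketch with hypothetical helper names (`minor`, `prob_B_in_and_A_out`, `conditional_given_excluded`); inclusion-exclusion is used here only as an independent reference, not as something the talk proposes.

```python
import itertools
import numpy as np

def minor(K, index):
    """det(K_index), with the empty minor defined as 1."""
    index = list(index)
    return 1.0 if not index else np.linalg.det(K[np.ix_(index, index)])

def prob_B_in_and_A_out(K, A, B):
    """P[B in Y and no element of A in Y], by inclusion-exclusion over S in A."""
    total = 0.0
    for size in range(len(A) + 1):
        for S in itertools.combinations(A, size):
            total += (-1) ** size * minor(K, list(B) + list(S))
    return total

def conditional_given_excluded(K, A, B):
    """Right-hand side of Proposition 5."""
    shifted = K[np.ix_(A, A)] - np.eye(len(A))
    correction = K[np.ix_(B, A)] @ np.linalg.solve(shifted, K[np.ix_(A, B)])
    return np.linalg.det(K[np.ix_(B, B)] - correction)

K = np.array([[0.6, 0.2, 0.1], [0.2, 0.5, 0.1], [0.1, 0.1, 0.4]])
A, B = [0, 1], [2]
reference = prob_B_in_and_A_out(K, A, B) / prob_B_in_and_A_out(K, A, [])
print(conditional_given_excluded(K, A, B), reference)  # should agree
```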

Traditional Hermitian DPP sampling

Lemma 5 (Hough et al.-2006). Given any $\mathbf{Y} \sim \mathrm{DPP}(K)$, where $K$ has spectral decomposition $Q \Lambda Q^*$, sampling from $\mathbf{Y}$ is equivalent to sampling from the random elementary DPP with kernel $P(Q_Z)$, where $P(U) \equiv U U^*$ and $Q_Z$ consists of the columns of $Q$ with indices from $Z \sim \mathrm{DPP}(\Lambda)$.

“Alg. 1 runs in time $O(Nk^3)$, where $k$ is the number of eigenvectors selected [...] the initial eigendecomposition of [$K$] is often the computational bottleneck, requiring $O(N^3)$ time. Modern multi-core machines can compute eigendecompositions up to $N \approx 1,000$ at interactive speeds of a few seconds, or larger problems up to $N \approx 10,000$ in around ten minutes.” [Kulesza/Taskar-2012]

[Gillenwater-2014] reduced the factored elementary DPP sampling down to cubic complexity via what is equivalent to diagonally-pivoted Cholesky.¹

¹ [Gillenwater-2014] Approximate inference for determinantal point processes
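The first phase of the spectral approach can be sketched in a few lines. Because $\Lambda$ is diagonal, $Z \sim \mathrm{DPP}(\Lambda)$ amounts to an independent Bernoulli draw per eigenvalue. The code below only illustrates that phase and checks that the elementary kernels average back to $K$, which confirms the first-order marginals and not the full lemma; the function names are assumptions for this example.

```python
import itertools
import numpy as np

def sample_elementary_kernel(K, rng):
    """Draw Z ~ DPP(Lambda) and return the projector P(Q_Z) = Q_Z Q_Z^*."""
    eigenvalues, Q = np.linalg.eigh(K)
    keep = rng.random(eigenvalues.size) < eigenvalues  # Bernoulli per eigenvalue
    Q_Z = Q[:, keep]
    return Q_Z @ Q_Z.conj().T

def expected_elementary_kernel(K):
    """Average P(Q_Z) over all 2^n subsets Z, weighted by P[Z]."""
    eigenvalues, Q = np.linalg.eigh(K)
    mean = np.zeros_like(K, dtype=Q.dtype)
    for bits in itertools.product([False, True], repeat=eigenvalues.size):
        keep = np.array(bits)
        weight = np.prod(np.where(keep, eigenvalues, 1.0 - eigenvalues))
        Q_Z = Q[:, keep]
        mean += weight * (Q_Z @ Q_Z.conj().T)
    return mean

K = np.array([[0.6, 0.2, 0.1], [0.2, 0.5, 0.1], [0.1, 0.1, 0.4]])
E = sample_elementary_kernel(K, np.random.default_rng(0))
print(np.allclose(E @ E, E))                          # E is a projector
print(np.allclose(expected_elementary_kernel(K), K))  # mixture averages to K
```

The eigendecomposition in the first line of each function is the $O(N^3)$ preprocessing step that the quoted passage identifies as the bottleneck.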

Traditional Hermitian DPP sampling

Recently, authors are noticing connections to $LDL^H$ factorizations.²,³

In [Launay et al.-2018], timings are provided for the spectrally-preprocessed and “sequentially thinned” algorithm for elementary real symmetric kernels of rank 20 and varying size (left) and varying rank and size 5000 (right):

[Figure: timing plots from Launay et al.-2018.]

This talk decreases runtimes by 100-1000x, for more general kernels, by importing dense factorization techniques. We then extend to non-Hermitian and sparse-direct analogues.

² [Chen et al.-2017] Fast Greedy MAP inference for Det' Point Processes

³ [Launay et al.-2018] Exact sampling of determinantal point processes without eigendecomposition. arxiv.org/abs/1802.08429v3

Unblocked DPP sampling factorization

    sample = []
    for j in range(n):
        J2 = [j+1:n]
        if Bernoulli(K(j,j)):
            sample.append(j)
        else:
            K(j,j) -= 1
        K(J2,j) /= K(j,j)
        K(J2,J2) -= K(J2,j) * K(j,J2)
    return sample

This is a small tweak of unblocked, unpivoted LU factorization – readily specializable to $LDL^H$ and $LDL^T$ for Hermitian and complex symmetric matrices.

The majority of the work is in rank-1 updates. And the standard optimizations apply (e.g., blocking and sparse-direct factorization)!

The likelihood of the sample is equal to the product of the absolute value of the diagonal of the result.
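The pseudocode above translates almost line for line into NumPy. The following is a minimal runnable sketch for real kernels, not Catamari's implementation; the function name `sample_dpp_unblocked`, the use of `numpy.random.Generator`, and returning the log-likelihood alongside the factored matrix are choices made for this illustration.

```python
import numpy as np

def sample_dpp_unblocked(K, rng):
    """Sample from DPP(K) with the unpivoted, LU-like sweep.

    Returns (sample, log_likelihood, factored copy of K).
    """
    A = np.array(K, dtype=float)  # work on a copy; real kernels assumed
    n = A.shape[0]
    sample = []
    log_likelihood = 0.0
    for j in range(n):
        # A[j, j] now holds P[j in Y | all earlier decisions].
        if rng.random() < A[j, j]:
            sample.append(j)
        else:
            A[j, j] -= 1.0  # condition on j being excluded
        log_likelihood += np.log(abs(A[j, j]))
        # Rank-1 Schur-complement update of the trailing submatrix.
        A[j + 1:, j] /= A[j, j]
        A[j + 1:, j + 1:] -= np.outer(A[j + 1:, j], A[j, j + 1:])
    return sample, log_likelihood, A

K = np.array([[0.6, 0.2, 0.1], [0.2, 0.5, 0.1], [0.1, 0.1, 0.4]])
sample, log_likelihood, factor = sample_dpp_unblocked(K, np.random.default_rng(0))
print(sample, log_likelihood)
```

The inclusion branch keeps the pivot as is (Proposition 4) and the exclusion branch subtracts one (Proposition 5), so the sweep ends up as an LU factorization of $K$ with ones subtracted at the excluded diagonal positions. That is why the product of the absolute pivots gives the likelihood of the drawn sample.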

Blocked DPP sampling factorization

    sample = []
    J1_beg = 0
    while J1_beg < n:
        J1_end = min(n, J1_beg+blocksize)
        J1 = [J1_beg:J1_end]; J2 = [J1_end:n]
        subsample, K(J1,J1) = unblocked_dpp(K(J1,J1))
        sample.extend(subsample + J1_beg)
        K(J2,J1) /= triu(K(J1,J1))
        K(J1,J2) \= unit_tril(K(J1,J1))
        K(J2,J2) -= K(J2,J1) * K(J1,J2)
        J1_beg = J1_end
    return sample

OpenMP 4.0 tasks – say, with tile sizes of 128 – can be readily used to provide shared-memory, DAG-scheduled parallelism [Agullo/Langou/Luszczek-2010, Yarkhan et al.-2011, Chan et al.-2007].
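A serial NumPy rendering of the blocked algorithm can be compared against the unblocked sweep using the same random stream. This is only a sketch under simplifying assumptions: real kernels, no task parallelism, and general `np.linalg.solve` calls standing in for the triangular solves that a tuned implementation would use. The names `unblocked_dpp` and `blocked_dpp` follow the pseudocode but the signatures are illustrative.

```python
import numpy as np

def unblocked_dpp(A, rng):
    """In-place sweep on a square block; returns the block-local sample."""
    sample = []
    for j in range(A.shape[0]):
        if rng.random() < A[j, j]:
            sample.append(j)
        else:
            A[j, j] -= 1.0
        A[j + 1:, j] /= A[j, j]
        A[j + 1:, j + 1:] -= np.outer(A[j + 1:, j], A[j, j + 1:])
    return sample

def blocked_dpp(K, rng, blocksize=2):
    """Right-looking blocked sampler; returns (sample, factored copy of K)."""
    A = np.array(K, dtype=float)
    n = A.shape[0]
    sample = []
    for beg in range(0, n, blocksize):
        end = min(n, beg + blocksize)
        A11 = A[beg:end, beg:end]  # a view, so it is factored in place
        sample.extend(beg + j for j in unblocked_dpp(A11, rng))
        if end < n:
            U11 = np.triu(A11)
            L11 = np.tril(A11, -1) + np.eye(end - beg)
            # A21 := A21 U11^{-1}, A12 := L11^{-1} A12, then the Schur update.
            A[end:, beg:end] = np.linalg.solve(U11.T, A[end:, beg:end].T).T
            A[beg:end, end:] = np.linalg.solve(L11, A[beg:end, end:])
            A[end:, end:] -= A[end:, beg:end] @ A[beg:end, end:]
    return sample, A

K = 0.4 * np.eye(5) + 0.1 * np.ones((5, 5))   # eigenvalues 0.4 and 0.9
blocked_sample, blocked_factor = blocked_dpp(K, np.random.default_rng(7), blocksize=2)
plain_sample, plain_factor = blocked_dpp(K, np.random.default_rng(7), blocksize=5)
print(blocked_sample == plain_sample, np.allclose(blocked_factor, plain_factor))
```

Blocking moves most of the arithmetic from rank-1 updates into the matrix-matrix product on the trailing block, which is the operation that tiled, DAG-scheduled implementations parallelize.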

Unblocked, greedy, MAP DPP sampling

    sample = []
    for j in range(n):
        J2 = [j+1:n]
        if K(j,j) >= 0.5:
            sample.append(j)
        else:
            K(j,j) -= 1
        K(J2,j) /= K(j,j)
        K(J2,J2) -= K(J2,j) * K(j,J2)
    return sample

Greedy MAP sampling is a trivial tweak of the standard sampler, and the blocked extension is essentially identical.
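The greedy variant replaces the Bernoulli draw with a deterministic threshold, which makes it easy to test. Below is a small runnable sketch for real kernels; the function name `greedy_map_dpp` and the returned log-likelihood are illustrative additions rather than part of the slide.

```python
import numpy as np

def greedy_map_dpp(K):
    """Greedy MAP sweep: keep index j iff its conditional probability is >= 0.5."""
    A = np.array(K, dtype=float)  # work on a copy; real kernels assumed
    sample = []
    log_likelihood = 0.0
    for j in range(A.shape[0]):
        if A[j, j] >= 0.5:
            sample.append(j)
        else:
            A[j, j] -= 1.0
        log_likelihood += np.log(abs(A[j, j]))
        A[j + 1:, j] /= A[j, j]
        A[j + 1:, j + 1:] -= np.outer(A[j + 1:, j], A[j, j + 1:])
    return sample, log_likelihood

# With a diagonal kernel the decisions are independent thresholds on the diagonal.
sample, log_likelihood = greedy_map_dpp(np.diag([0.9, 0.2, 0.5]))
print(sample, np.exp(log_likelihood))  # likelihood is 0.9 * 0.8 * 0.5
```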

Full-rank real symmetric DPP on i9-7960x (16-core)

Dense, real $LDL^H$-based DPP sampler [P-2019].

$ OMP_NUM_THREADS=16 ./dense_dpp

[Figure: two panels plotting TFlop/s and seconds against kernel matrix size (1,000 to 10,000), each with curves for single-precision DPP, single-precision $LDL^H$, double-precision DPP, and double-precision $LDL^H$.]

Aztec diamond DPP on i9-7960x (16-core)

Dense, complex LU-based DPP sampler [P-2019].*

[Figure: two panels plotting TFlop/s and seconds against $d$ (with $n = 4d^2$), each with single-precision and double-precision curves.]

*Generated from the Kenyon formula over the Kasteleyn matrix [Chhita et al.-2015].

$\mathbb{Z}^2$ UST DPP on i9-7960x (16-core)

Dense, real $LDL^H$-based DPP sampler [P-2019].*

[Figure: two panels plotting TFlop/s and seconds against $d$ (with $n \approx 2d^2$), each with single-precision and double-precision curves.]

*Over the Gramian of the star space basis [Lyons/Peres-2016].

Hexagonal UST DPP on i9-7960x (16-core)

Dense, real $LDL^H$-based DPP sampler [P-2019].*

[Figure: two panels plotting TFlop/s and seconds against $d$ (with $n \approx 6d^2$), each with single-precision and double-precision curves.]

*Over the Gramian of the star space basis [Lyons/Peres-2016].

Low precision corrupting sampling

$ ./aztec_diamond --diamond_size=80

[Figures: double-precision sample; single-precision sample (visibly erroneous).]

Basic questions for DPP factorizations

Given the close connection between DPP sampling and dense factorization:

• One should be able to probabilistically generalize element growth and numerical stability bounds.

• Use maximum-entropy diagonal pivot selection? Minimizes worst case pivot.

• High-performance techniques for backpropagating through Cholesky are now known [Murray-2016].⁴ Do these blocked algorithms extend to DPPs?

⁴ [Murray-2016] Differentiation of the Cholesky decomposition. arxiv.org/abs/1602.07527

Sparse-direct DPP factorizations

We have so far discussed analogues of dense factorizations, and sparse-direct analogues are a natural extension.

Catamari implements templated, real and complex, Cholesky / $LDL^H$ / $LDL^T$ – switching between DAG-scheduled, right-looking supernodal and up-looking simplicial based upon arithmetic intensity [Chen/Davis/Hager/Rajamanickam-2008].

A variant of the sparse-direct $LDL^H$ is provided for sparse DPPs.

(Unpivoted) sparse-direct LU and $LDL^T$ DPP sampling is a straightforward extension.

Complex sparse $LDL^T$ on i9-7960x (16-core)

3D Helmholtz w/ PML and trilinear, hexahedral elements

$ OMP_NUM_THREADS=16 ./helmholtz_3d_pml

[Figure: two panels plotting TFlop/s and seconds against $d$ (with $n = d^3$), each with curves labeled "Double-precision (with symbolic)" and "Double-precision (w/o symbolic)".]

(MAP) Sampling from 2D −σ∆

$ ./dpp_shifted_2d_negative_laplacian \
    --x_size=200 --y_size=200 --scale=0.72

Log-likelihood: -27472.2
Sample time: 0.0107 seconds

Log-likelihood: -26058
Sample time: 0.0112 seconds

(MAP) Sampling from 2D −σ∆

$ ./dpp_shifted_2d_negative_laplacian \
    --x_size=200 --y_size=200 --scale=0.75

Log-likelihood: -27612.6
Sample time: 0.0124 seconds

Log-likelihood: -26009
Sample time: 0.0114 seconds

(MAP) Sampling from 2D −σ∆

$ ./dpp_shifted_2d_negative_laplacian \
    --x_size=200 --y_size=200 --scale=0.85

Log-likelihood: -27581.7
Sample time: 0.0114 seconds

Log-likelihood: -25765
Sample time: 0.0118 seconds

Closing

Availability:

Quotient is available under the Mozilla Public License 2.0 at hodgestar.com/quotient/ and gitlab.com/hodge_star/quotient. This talk is based on version 0.2.

Catamari is available under the Mozilla Public License 2.0 at hodgestar.com/catamari/ and gitlab.com/hodge_star/catamari. This talk is based on version 0.2.3.

These slides are available at: hodgestar.com/catamari/April8-2019-RoyalSociety.pdf

Acknowledgements:

• Alex Kulesza and Jenny Gillenwater: For answering my initial DPP sampling questions.

Questions/comments?
