Graph signal processing for clustering
Nicolas Tremblay
PANAMA Team, INRIA Rennes, with Remi Gribonval; Signal Processing Laboratory 2, EPFL, Lausanne, with Pierre Vandergheynst.
Outline:
• Introduction
• Graph signal processing...
• ... applied to clustering
• Conclusion
N. Tremblay, Graph signal processing for clustering, Rennes, 13th of January 2016
What's clustering?
Given a series of N objects:
1/ Find adapted descriptors
2/ Cluster
After step 1, one has:
• N vectors in d dimensions (the descriptor dimension): x1, x2, · · · , xN ∈ R^d
• and their distance matrix D ∈ R^(N×N).

The goal of clustering is to assign a label c(i) ∈ {1, · · · , k} to each object i, in order to organize / simplify / analyze the data.

There exist two general types of methods:
• methods directly based on the xi and/or D, like k-means or hierarchical clustering;
• graph-based methods.
Graph construction from the distance matrix D

Create a graph G = (V, E):
• each node in V is one of the N objects;
• a pair of nodes (i, j) is connected if the associated distance D(i, j) is small enough.

For example, two connectivity possibilities:
• Gaussian kernel:
1. connect all pairs of nodes with links of weight exp(−D(i, j)/σ);
2. remove all links of weight smaller than ε.
• k nearest neighbors: connect each node to its k nearest neighbors.
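As a sketch, the two constructions above can be written in a few lines of NumPy/SciPy (function names and parameter values are ours, for illustration):

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def gaussian_graph(X, sigma, eps):
    """Gaussian-kernel graph: weights exp(-D(i,j)/sigma), then threshold at eps."""
    D = squareform(pdist(X))      # N x N distance matrix
    W = np.exp(-D / sigma)
    np.fill_diagonal(W, 0)        # no self-loops
    W[W < eps] = 0                # remove all links weaker than eps
    return W

def knn_graph(X, k):
    """k-nearest-neighbor graph (symmetrized, unit weights)."""
    D = squareform(pdist(X))
    np.fill_diagonal(D, np.inf)
    W = np.zeros_like(D)
    idx = np.argsort(D, axis=1)[:, :k]          # k closest nodes per row
    rows = np.repeat(np.arange(len(X)), k)
    W[rows, idx.ravel()] = 1
    return np.maximum(W, W.T)                   # make the graph undirected
```

Both return a symmetric adjacency matrix W with an empty diagonal, ready for the spectral methods that follow.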
The problem now states: given the graph G representing the similarity between the N objects, find a partition of all nodes into k clusters.

Many methods exist [Fortunato '10]:
• modularity (or other cost-function) optimisation methods [Newman '04]
• random walk methods [Delvenne '10]
• methods inspired from statistical physics [Krzakala '12], information theory [Rosvall '07]...
• spectral methods
• ...
Three useful matrices

The adjacency matrix:
W =
 0  1  1  0
 1  0  1  1
 1  1  0  0
 0  1  0  0

The degree matrix:
S =
 2  0  0  0
 0  3  0  0
 0  0  2  0
 0  0  0  1

The Laplacian matrix:
L = S − W =
  2 −1 −1  0
 −1  3 −1 −1
 −1 −1  2  0
  0 −1  0  1
The same three matrices, for a weighted graph:

W =
  0  .5  .5   0
 .5   0  .5   4
 .5  .5   0   0
  0   4   0   0

S =
 1  0  0  0
 0  5  0  0
 0  0  1  0
 0  0  0  4

L = S − W =
   1  −.5  −.5    0
 −.5    5  −.5   −4
 −.5  −.5    1    0
   0   −4    0    4
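These matrices are quick to check numerically; by construction, every row of L sums to zero:

```python
import numpy as np

W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)

S = np.diag(W.sum(axis=1))   # degree matrix
L = S - W                    # combinatorial Laplacian

print(np.diag(S))            # degrees: [2. 3. 2. 1.]
print(L.sum(axis=1))         # each row of L sums to 0: [0. 0. 0. 0.]
```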
The classical spectral clustering algorithm [Von Luxburg '06]:

Given the N-node graph G of adjacency matrix W:
1. Compute Uk = (u1|u2| · · · |uk), the first k eigenvectors of L = S − W.
2. Consider each node i as a point in R^k: fi = Uk^⊤ δi.
3. Run k-means with the Euclidean distance Dij = ||fi − fj|| and obtain k clusters.
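A minimal dense sketch of these three steps, using SciPy's eigensolver and scikit-learn's k-means (unnormalized Laplacian, as on this slide; real implementations use sparse solvers):

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.cluster import KMeans

def spectral_clustering(W, k):
    """Classical spectral clustering on adjacency matrix W (dense sketch)."""
    L = np.diag(W.sum(axis=1)) - W
    # step 1: first k eigenvectors of L (the k smallest eigenvalues)
    _, Uk = eigh(L, subset_by_index=[0, k - 1])
    # step 2: row i of Uk is the point f_i in R^k
    # step 3: k-means on those rows
    return KMeans(n_clusters=k, n_init=10).fit_predict(Uk)
```

On two disjoint cliques, for instance, this recovers the two components exactly.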
What's the point of using a graph?

N points in d = 2 dimensions. Result with k-means (k = 2): [figure]

After creating a graph from the N points' interdistances, and running the spectral clustering algorithm (with k = 2): [figure]
Computation bottlenecks of the spectral clustering algorithm

When N and/or k become too large, there are two main bottlenecks in the algorithm:
1. the partial eigendecomposition of the Laplacian;
2. k-means.

Our goal: circumvent both!
What's graph signal processing?
What's a graph signal?

[Figure: example graph signals on a sensor network — monthly temperatures (°C) over the year (Jan–Dec), and hourly temperatures (°C) over a day.]
What's the graph Fourier matrix? [Hammond '11]

The "classical" graph (a ring):
Lcl =
  2 −1  0 · · ·  0 −1
 −1  2 −1 · · ·  0  0
  ⋮
  0  0  0 · · ·  2 −1
 −1  0  0 · · · −1  2

All classical Fourier modes are the eigenvectors of Lcl.

Any graph, with Laplacian L: by analogy, any graph's Fourier modes are the eigenvectors of its Laplacian matrix L.
The graph Fourier matrix

L = S − W

Its eigenvectors U = (u1|u2| · · · |uN) form the graph Fourier orthonormal basis.

Its eigenvalues 0 = λ1 ≤ λ2 ≤ · · · ≤ λN represent the graph frequencies: λi is the squared frequency associated to the Fourier mode ui.
Illustration

[Figure: a low-frequency Fourier mode vs. a high-frequency Fourier mode on the same graph.]
The Fourier transform
• Given f ∈ R^N a signal on a graph of size N.
• Its Fourier transform f̂ is obtained by decomposing f on the eigenvectors ui:
f̂ = (< u1, f >, < u2, f >, · · · , < uN, f >)^⊤, i.e. f̂ = U^⊤ f.
• Inversely, the inverse Fourier transform reads: f = U f̂.
• The Parseval theorem stays valid: ∀(g, h), < g, h > = < ĝ, ĥ >.
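The transform pair and Parseval's identity can be verified numerically, here on the 4-node example graph from earlier (`np.linalg.eigh` returns the orthonormal basis U and the frequencies λ):

```python
import numpy as np

# the 4-node example graph from earlier in the talk
W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)
L = np.diag(W.sum(axis=1)) - W

lam, U = np.linalg.eigh(L)     # graph frequencies and Fourier basis U

f = np.array([1.0, 2.0, 0.5, -1.0])   # an arbitrary graph signal
f_hat = U.T @ f                        # forward graph Fourier transform
f_rec = U @ f_hat                      # inverse transform

assert np.allclose(f, f_rec)               # perfect reconstruction
assert np.isclose(f @ f, f_hat @ f_hat)    # Parseval
```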
Filtering

Given a filter function g defined in the Fourier space. [Figure: an example filter g(λ) over the spectrum [0, 2].]

In the Fourier space, the signal filtered by g reads:
f̂g = (f̂(1) g(λ1), f̂(2) g(λ2), · · · , f̂(N) g(λN))^⊤ = Ĝ f̂, with Ĝ = diag(g(λ1), g(λ2), · · · , g(λN)).

In the node space, the filtered signal fg therefore reads:
fg = U Ĝ U^⊤ f = G f.
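A sketch of the identity fg = U Ĝ U^⊤ f, with an exponential low-pass kernel chosen arbitrarily for illustration:

```python
import numpy as np

W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)
L = np.diag(W.sum(axis=1)) - W
lam, U = np.linalg.eigh(L)

g = lambda x: np.exp(-2.0 * x)   # an arbitrary low-pass kernel g(lambda)

G = U @ np.diag(g(lam)) @ U.T    # filter matrix in the node space
f = np.array([1.0, -1.0, 2.0, 0.0])
f_g = G @ f                      # filtered signal

# identical result computed entirely in the Fourier space
assert np.allclose(f_g, U @ (g(lam) * (U.T @ f)))
```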
So where's the link?
Remember: the classical spectral clustering algorithm

Given the N-node graph G of adjacency matrix W:
1. Compute Uk = (u1|u2| · · · |uk), the first k eigenvectors of L = S − W.
2. Consider each node i as a point in R^k: fi = Uk^⊤ δi.
3. Run k-means with the Euclidean distance Dij = ||fi − fj|| and obtain k clusters.

Let's work on the first bottleneck: estimate Dij without partially diagonalizing the Laplacian matrix.
Ideal low-pass filtering. 1st step: assume we know Uk and λk.

Given hλk an ideal low-pass filter with cutoff λk, Hλk = U Ĥλk U^⊤ = Uk Uk^⊤ is its filter matrix. [Figure: the ideal low-pass filter hλk(λ), equal to 1 on [0, λk] and 0 beyond.]

Let R = (r1|r2| · · · |rη) ∈ R^(N×η) be a random Gaussian matrix.

We define f̃i = (Hλk R)^⊤ δi ∈ R^η and D̃ij = ||f̃i − f̃j||.

Norm conservation theorem for the ideal filter:
Let ε > 0. If η > η0 ∼ log N / ε², then, with probability > 1 − 1/N, we have:
∀(i, j) ∈ [1, N]², (1 − ε) Dij ≤ D̃ij ≤ (1 + ε) Dij.
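A numerical sketch of this distance-preservation claim, using the exact projector Uk Uk^⊤ as the ideal filter (the graph, η, and the 1/√η scaling that makes the sketched distance an unbiased estimate of the spectral one are our illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy graph: two 10-node cliques joined by a single edge
N, k = 20, 2
W = np.zeros((N, N))
W[:10, :10] = 1.0
W[10:, 10:] = 1.0
np.fill_diagonal(W, 0)
W[9, 10] = W[10, 9] = 1.0
L = np.diag(W.sum(axis=1)) - W

lam, U = np.linalg.eigh(L)
Uk = U[:, :k]                        # exact spectral features (row i = f_i)

# Ideal low-pass filtering of eta random Gaussian signals:
# H = Uk Uk^T, sketched features are the rows of H R (rescaled)
eta = 200
R = rng.standard_normal((N, eta))
F = (Uk @ Uk.T @ R) / np.sqrt(eta)

D_exact = np.linalg.norm(Uk[0] - Uk[15])    # spectral distance
D_sketch = np.linalg.norm(F[0] - F[15])     # sketched distance
print(D_sketch / D_exact)                   # close to 1
```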
Non-ideal low-pass filtering. 2nd step: assume all we know is λk.

In practice, we use a polynomial approximation of order m of hλk:
h̃λk(λ) = ∑_{l=1}^{m} αl λ^l ≃ hλk(λ).

[Figure: the ideal filter and its polynomial approximations of orders m = 100, 20 and 5.]

Indeed, in this case, filtering a vector x reads:
H̃λk x = U h̃λk(Λ) U^⊤ x = U (∑_{l=1}^{m} αl Λ^l) U^⊤ x = ∑_{l=1}^{m} αl L^l x.

• Does not require the knowledge of Uk.
• Only involves matrix-vector multiplications [costs O(m|E|)].

The theorem stays (more or less) valid with this non-ideal filtering!
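A sketch of the identity H̃λk x = ∑_l αl L^l x. The slide does not specify the approximation scheme (the associated papers use Chebyshev polynomials); plain least squares on a grid is used here as a simpler stand-in:

```python
import numpy as np

def poly_filter_apply(L, x, alpha):
    """Apply sum_{l=1}^m alpha[l-1] * L^l x using only matrix-vector products."""
    y = np.zeros_like(x)
    v = x.copy()
    for a in alpha:          # alpha[0] multiplies L^1 x, alpha[1] L^2 x, ...
        v = L @ v            # v = L^l x
        y += a * v
    return y

def fit_step(lam_k, lam_max, m, n_grid=200):
    """Least-squares fit of the step 1_{lambda <= lam_k} by sum_{l=1}^m a_l x^l."""
    grid = np.linspace(0.0, lam_max, n_grid)
    target = (grid <= lam_k).astype(float)
    V = np.vander(grid, m + 1, increasing=True)[:, 1:]   # columns x^1 .. x^m
    alpha, *_ = np.linalg.lstsq(V, target, rcond=None)
    return alpha
```

Applying `poly_filter_apply` gives exactly the same vector as filtering with U h̃λk(Λ) U^⊤, while never touching the eigenvectors.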
Last step: estimate λk

Goal: given L, estimate its k-th eigenvalue as fast as possible.

We use eigencount techniques (also based on polynomial filtering of random vectors!):
• given an interval [0, b], get an approximation of the number of enclosed eigenvalues;
• find λk by dichotomy on b.
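A sketch of the dichotomy. The counter below uses a full eigendecomposition as a stand-in for the fast estimator described on the slide, which approximates the same count by polynomial filtering of random vectors:

```python
import numpy as np

def eigcount(lam_all, b):
    """Number of eigenvalues in [0, b] (exact stand-in for the fast estimator)."""
    return int(np.sum(lam_all <= b))

def estimate_lambda_k(L, k, lam_max, tol=1e-6):
    """Find lambda_k by dichotomy on the upper bound b of the interval [0, b]."""
    lam_all = np.linalg.eigvalsh(L)   # computed once, reused by the counter
    lo, hi = 0.0, lam_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if eigcount(lam_all, mid) >= k:
            hi = mid                  # [0, mid] already encloses k eigenvalues
        else:
            lo = mid
    return hi

# demo on the 4-node example graph used throughout the talk
W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)
L = np.diag(W.sum(axis=1)) - W
est = estimate_lambda_k(L, k=2, lam_max=6.0)
print(est)   # converges to lambda_2 of L
```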
Accelerated spectral algorithm

Given the N-node graph G of adjacency matrix W:
1. Estimate λk, the k-th eigenvalue of L.
2. Generate η random graph signals in a matrix R ∈ R^(N×η).
3. Filter them with H̃λk and treat each node i as a point in R^η: f̃i^⊤ = δi^⊤ H̃λk R.
4. Run k-means with the Euclidean distance D̃ij = ||f̃i − f̃j|| and obtain k clusters.

Let's work on the second bottleneck: avoid running k-means on a possibly very large number N of points (step 4).
Fast spectral algorithm? Given the N-node graph G of adjacency matrix W:
1. Estimate λk, the k-th eigenvalue of L.
2. Generate η random graph signals in a matrix R ∈ R^(N×η).
3. Filter them with H̃λk and treat each node i as a point in R^η: f̃i^⊤ = δi^⊤ H̃λk R.
4. Sample randomly ρ ≃ k log k ≪ N nodes out of N: f̃i^r = M f̃i = (M H̃λk R)^⊤ δi^r.
5. Run k-means in this reduced space with the Euclidean distance D^r_ij = ||f̃i^r − f̃j^r|| and obtain k clusters.
6. Interpolate the cluster indicator functions c^r_l on the whole graph:
cl = arg min_{x ∈ R^N} ||Mx − c^r_l||² + µ x^⊤ L x.
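Step 6 is a standard regularized least-squares problem: setting the gradient of ||Mx − c^r_l||² + µ x^⊤ L x to zero gives the normal equations (M^⊤M + µL) x = M^⊤ c^r_l. A dense sketch (function name and the value of µ are ours):

```python
import numpy as np

def interpolate_indicator(L, sample_idx, c_r, mu=1e-2):
    """Solve min_x ||Mx - c_r||^2 + mu * x^T L x, where M selects the
    rows in sample_idx. Normal equations: (M^T M + mu L) x = M^T c_r."""
    N = L.shape[0]
    A = mu * L.copy()
    A[sample_idx, sample_idx] += 1.0   # M^T M: ones on the sampled diagonal
    rhs = np.zeros(N)
    rhs[sample_idx] = c_r              # M^T c_r
    return np.linalg.solve(A, rhs)
```

On a two-cluster graph with a handful of labeled nodes per cluster, the interpolated indicator is high on one cluster and low on the other, so thresholding it recovers the full partition.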
Compressive spectral clustering: a summary
1. generate a feature vector for each node by filtering a few random Gaussian signals on G;
2. subsample the set of nodes;
3. cluster the reduced set of nodes;
4. interpolate the cluster indicator vectors back to the complete graph.
This work was done in collaboration with:
• Gilles PUY and Remi GRIBONVAL from the PANAMA team (INRIA);
• Pierre VANDERGHEYNST from EPFL.

Part of this work has been published (or submitted):
• circumventing the first bottleneck has been accepted to ICASSP 2016;
• interpolation of k-bandlimited graph signals has been submitted to ACHA in November (an application of which helps us circumvent the second bottleneck).
Perspectives and difficult questions

Two difficult questions (among others):
1. Given a positive semi-definite matrix, how to estimate its k-th eigenvalue as fast as possible, and only that one?
2. How to subsample ρ nodes out of N while ensuring that clustering them in k classes yields the result one would have obtained by clustering all N nodes?

Perspectives:
1. What if nodes are added one by one?
2. Rational filters instead of polynomial filters?
3. Approximating other spectral clustering algorithms?