spectral methods for complex networks · 2016-07-15 · spectral methods for complex networks...
Spectral Methods for Complex Networks
Richard C. Wilson
Dept. of Computer Science
University of York
Outline
Part I
1. Brief recap of spectral graph theory
2. Representation
3. Spectra of graph models
4. Application to graph partitioning
Part II
1. Paths and Cycles
2. Formal Series
3. Counting paths
4. Counting cycles
Matrix Representation
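A matrix representation X of a network is a matrix with entries representing the vertices and edges
Adjacency matrix: $A_{ij} = 1$ if vertices i and j are joined by an edge, and 0 otherwise
[Figure: example network on vertices 1 to 5]
Degree matrix: $D = \mathrm{diag}(d_1, \dots, d_n)$, where $d_i = \sum_j A_{ij}$ is the degree of vertex i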
Matrix Representation
The Laplacian (L) is $L = D - A$
Signless Laplacian: $|L| = D + A$
Matrix Representation
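Normalized Laplacian: $\hat{L} = D^{-1/2} L D^{-1/2} = I - D^{-1/2} A D^{-1/2}$
Entries are $\hat{L}_{ij} = 1$ if $i = j$, $-1/\sqrt{d_i d_j}$ if i and j are joined by an edge, and 0 otherwise (assuming no isolated vertices)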
Incidence matrix
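The incidence matrix M of a graph is a |V|×|E| matrix describing the relationship between vertices and edges: $M_{ie} = 1$ if vertex i is an endpoint of edge e, and 0 otherwise
Relationship to signless Laplacian: $M M^T = D + A$
Adjacency: $A = M M^T - D$
Laplacian: $L = 2D - M M^T$
[Figure: example network on vertices 1 to 3]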
Matrix Representation
Consider the Laplacian (L) of this network
Clearly, if we label the network differently, we get a different matrix
In fact $L' = P L P^T$ represents the same graph for any permutation matrix P of the n labels
[Figure: the same network on vertices 1 to 5 under two different labellings]
Characterisations
Are two networks the same (Graph Isomorphism)? That is, is there a bijection between the vertices such that all the edges are in correspondence?
An interesting problem in computational complexity: it lies in NP, but it is not known to be in P or to be NP-complete. It is hypothesised to define a separate class (GI) in the NP hierarchy; problems at least as hard are called GI-hard
Graph Automorphism: Isomorphism between a graph and itself.
Characterisations
An equivalent statement: two networks with matrix representations $X_1$ and $X_2$ are isomorphic iff there exists a permutation matrix P such that $X_1 = P X_2 P^T$
X should contain all information about the network
– Applies to L, A etc., not to D
P is a relabelling; changes the order in which we label the vertices
Our measurements from a matrix representation should be invariant under this transformation (similarity transform)
X is a full matrix representation
Spectral Graph Theory
Properties of the graph from the eigenvalues (eigenvectors) of a matrix representation of the graph
Symmetric (undirected): always has n real eigenvalues
Non-symmetric: possibly complex eigenvalues
Perron-Frobenius Theorem
Perron-Frobenius Theorem: If X is an irreducible square matrix with non-negative entries, then there exists an eigenpair (λ,u) such that $Xu = \lambda u$, where λ is real and positive, $\lambda \ge |\mu|$ for every eigenvalue μ of X, and $u_i > 0$ for all i
• Applies to both left and right eigenvector
• Key theorem: if our matrix is non-negative, we can find a principal (largest) eigenvalue which is positive and has a non-negative eigenvector
• Irreducible implies associated digraph is strongly connected
Spectrum
The graph has an ordered set of eigenvalues (λ1, λ2, …, λn), ordered by size (I will use smallest first).
The (ordered) set of eigenvalues is called the spectrum of the graph.
Theorem: The spectrum is unchanged by the relabelling transform
Corollary: If two graphs are isomorphic, they have the same spectrum
This does not solve the isomorphism problem, as two different graphs may have the same spectrum
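The invariance theorem can be checked numerically. The following is a minimal sketch, assuming NumPy and a small hypothetical 5-vertex graph that is not taken from the slides: it relabels the vertices with a random permutation matrix P and compares the Laplacian spectra.

```python
import numpy as np

def laplacian(A):
    """Combinatorial Laplacian L = D - A of an undirected graph."""
    A = np.asarray(A, dtype=float)
    return np.diag(A.sum(axis=1)) - A

# Hypothetical example graph: a 4-cycle 0-1-2-3 with a pendant vertex 4.
A = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 0), (3, 4)]:
    A[i, j] = A[j, i] = 1.0

# Random relabelling expressed as a permutation matrix P.
rng = np.random.default_rng(0)
P = np.eye(5)[rng.permutation(5)]

L = laplacian(A)
L_relabelled = P @ L @ P.T

# eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
spec = np.linalg.eigvalsh(L)
spec_relabelled = np.linalg.eigvalsh(L_relabelled)
print(np.allclose(spec, spec_relabelled))
```

The converse check is not possible: equal spectra do not certify isomorphism, which is the point of the last line of the slide.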
Undirected networks: Spectrum of A
Spectrum of A: Positive and negative eigenvalues
Undirected networks: Spectrum of A
Bipartite graph: If λ is an eigenvalue, then so is –λ, Sp(A) symmetric around 0
Perron-Frobenius Theorem (A is a non-negative matrix): $\lambda_n$ is the largest-magnitude eigenvalue, and the corresponding eigenvector $u_n$ is non-negative
Undirected Networks: Spectrum of L
Spectrum of L: L positive semi-definite
There always exists an all-ones eigenvector $\mathbf{1}$ with eigenvalue 0, because of zero row-sums
The number of zeros in the spectrum is the number of connected components of the network.
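As an illustration of the zero-eigenvalue count, here is a small sketch assuming NumPy; the graph and the tolerance `tol` are hypothetical choices, and a numerical threshold is needed because computed eigenvalues are only approximately zero.

```python
import numpy as np

def count_components(A, tol=1e-9):
    """Count connected components as the number of (near-)zero Laplacian eigenvalues."""
    A = np.asarray(A, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    eigenvalues = np.linalg.eigvalsh(L)
    return int(np.sum(np.abs(eigenvalues) < tol))

# Hypothetical example: a triangle {0,1,2}, a single edge {3,4}, and an isolated vertex 5.
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4)]:
    A[i, j] = A[j, i] = 1.0

n_components = count_components(A)
print(n_components)
```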
Spectrum of L
A spanning tree of a graph is a tree containing only edges in the network and all the vertices
Example
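Kirchhoff's theorem: the number of spanning trees of a graph is $\tau(G) = \frac{1}{n} \prod_{i=2}^{n} \lambda_i$, where $0 = \lambda_1 \le \lambda_2 \le \dots \le \lambda_n$ are the eigenvalues of L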
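Kirchhoff's theorem is easy to test on a graph whose spanning-tree count is known. The sketch below assumes NumPy and uses the eigenvalue form of the theorem (product of the non-zero Laplacian eigenvalues divided by n); the complete graph on 4 vertices is a hypothetical test case with $4^{4-2} = 16$ spanning trees by Cayley's formula.

```python
import numpy as np

def spanning_tree_count(A):
    """Number of spanning trees from the Laplacian spectrum (Kirchhoff's theorem)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    L = np.diag(A.sum(axis=1)) - A
    eigenvalues = np.linalg.eigvalsh(L)  # ascending; eigenvalues[0] is (numerically) zero
    return np.prod(eigenvalues[1:]) / n

K4 = np.ones((4, 4)) - np.eye(4)
count = spanning_tree_count(K4)
print(round(count))
```

For a disconnected graph a second eigenvalue is zero, so the product vanishes, matching the fact that no spanning tree exists.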
Spectrum of normalised L
Spectrum of $\hat{L}$: positive semi-definite
As with the Laplacian, the number of zeros in the spectrum is the number of connected components of the network.
An eigenvector exists with eigenvalue 0 and entries $\sqrt{d_i}$
'Scale invariance' for eigenvalues: $0 \le \lambda_i \le 2$
Regular networks
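• A network is regular if all vertices have the same degree k
• Spectra (eigenvalues and eigenvectors) of A, L and the normalised Laplacian are essentially the same: $L = kI - A$ and $\hat{L} = I - A/k$, so the eigenvectors coincide and the eigenvalues differ only by a shift and a scale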
Eigensystem stability and the spectral difference
• If the network changes for some reason
– Rewiring, random noise etc.
• The eigenvalues and eigenvectors will change
• Let N be a symmetric matrix representing the change (deleted/extra edges)
• The change in an eigenvalue is bounded above by the Frobenius norm of N: $|\lambda_k(X+N) - \lambda_k(X)| \le \|N\|_F$
– Small perturbation, small change in eigenvalues
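The eigenvalue bound can be observed on a rewired random graph. This is a sketch assuming NumPy; the graph size, edge probability, and the particular rewiring (delete one edge, add one edge) are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 30

# Random undirected graph X (adjacency matrix).
upper = np.triu(rng.random((n, n)) < 0.2, k=1).astype(float)
X = upper + upper.T

# Change matrix N: -1 for a deleted edge, +1 for an added edge (kept symmetric).
N = np.zeros((n, n))
present = np.argwhere(np.triu(X, k=1) > 0)      # existing edges (i < j)
absent = np.argwhere(np.triu(1 - X, k=1) > 0)   # missing edges (i < j)
i, j = present[0]
N[i, j] = N[j, i] = -1.0
i, j = absent[0]
N[i, j] = N[j, i] = 1.0

# Compare sorted spectra before and after the change.
shift = np.abs(np.linalg.eigvalsh(X + N) - np.linalg.eigvalsh(X))
bound = np.linalg.norm(N, "fro")
print(shift.max(), bound)
```

Each rewired edge contributes two entries of magnitude 1 to N, so rewiring r edges gives a Frobenius norm of $\sqrt{2r}$.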
Eigensystem stability and the spectral difference
• If N is small compared to X we can apply eigenperturbation theory
• Eigenvectors not stable if spectral difference |λk-λj| is small
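For reference, the standard first-order perturbation expressions for a symmetric X with distinct eigenvalues are given below. They are the usual textbook results, not necessarily the exact form shown on the slide; $\lambda_k'$ and $u_k'$ denote the eigenpair of X + N.

```latex
\lambda_k' \approx \lambda_k + u_k^T N u_k ,
\qquad
u_k' \approx u_k + \sum_{j \neq k} \frac{u_j^T N u_k}{\lambda_k - \lambda_j} \, u_j
```

The denominator $\lambda_k - \lambda_j$ explains the instability: when two eigenvalues are close, a small N produces a large change in the corresponding eigenvectors.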
Spectral Methods and Labels
So far, we have considered edges only as present or absent {0,1}. If we have more edge information, we can encode it in a variety of ways. Edges can be weighted to encode attributes, and diagonal entries can be included to encode vertex attributes
[Figure: example network with edge weights 0.4, 0.6 and 0.2]
Coding Attributes
• Note: When using Laplacian, add diagonal elements after forming L
• Label attributes: code labels into [0,1]
• Example: chemical structures
Edges: ─ (single bond) 0.5, ═ (double bond) 1.0, aromatic 0.75
Vertices: C 0.7, N 0.8, O 0.9
Coding Attributes
Spectral theory works equally well for complex matrices
Matrix entry is x+iy so can encode two independent attributes per entry, x and y. Symmetric matrix becomes Hermitian matrix
Eigenvalues real, eigenvectors complex
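A minimal sketch of the complex encoding, assuming NumPy: two real edge attributes x and y are packed into one Hermitian matrix with $H_{ij} = x_{ij} + i y_{ij}$ and $H_{ji}$ its complex conjugate. The function name and the attribute values are hypothetical.

```python
import numpy as np

def hermitian_encoding(x_attr, y_attr):
    """Pack two real edge attributes into a Hermitian matrix H = x + iy (upper), conj (lower)."""
    x_up = np.triu(np.asarray(x_attr, dtype=float), k=1)
    y_up = np.triu(np.asarray(y_attr, dtype=float), k=1)
    upper = x_up + 1j * y_up
    return upper + upper.conj().T

# Hypothetical attributes on a 3-vertex path 0-1-2 (only the upper triangle is read).
x = np.array([[0, 0.5, 0], [0, 0, 1.0], [0, 0, 0]])
y = np.array([[0, 0.2, 0], [0, 0, 0.7], [0, 0, 0]])

H = hermitian_encoding(x, y)
eigenvalues, eigenvectors = np.linalg.eigh(H)  # eigh handles complex Hermitian input
print(eigenvalues)
```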
Spectra of Network Models
• A number of famous network models give very distinctive eigenvalue distributions
• Example: Erdos-Renyi random graph model
• Edges are chosen by connecting each pair of vertices with fixed probability p
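The Erdos-Renyi spectrum can be reproduced with a short simulation. The sketch below assumes NumPy; n, p and the seed are hypothetical choices. It relies on the standard result that the bulk of the adjacency spectrum follows a semicircle of radius about $2\sqrt{np(1-p)}$, with one separate eigenvalue near np.

```python
import numpy as np

def erdos_renyi_adjacency(n, p, rng):
    """Sample an undirected Erdos-Renyi graph: each vertex pair is joined with probability p."""
    upper = np.triu(rng.random((n, n)) < p, k=1).astype(float)
    return upper + upper.T

rng = np.random.default_rng(42)
n, p = 400, 0.05
A = erdos_renyi_adjacency(n, p, rng)

eigenvalues = np.linalg.eigvalsh(A)          # ascending order
largest = eigenvalues[-1]                    # isolated eigenvalue, roughly n * p
bulk = eigenvalues[:-1]                      # remaining eigenvalues
bulk_radius = 2.0 * np.sqrt(n * p * (1 - p)) # semicircle radius
fraction_inside = np.mean(np.abs(bulk) <= 1.1 * bulk_radius)
print(largest, bulk_radius, fraction_inside)
```

A histogram of `bulk` (for example with matplotlib) would show the semicircle shape referred to on the next slide.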
Erdos-Renyi Spectrum of A
Scale free
• Scale-free (Preferential attachment)
• Network grows by adding new vertices
– m new edges added each time
• Probability of connection proportional to degree
Scale-free Spectrum of A
Small world
• Small world (Watts-Strogatz)
• Basic ring topology with m neighbours
• Reconnect edges randomly with prob. p
• When p=0, regular graph with degree m
– Degenerate spectrum with sharp peaks
• When p=1, ER random graph
– Semi-circle law
• Transitions between the two for p between 0 and 1
Small world
[Figure: small-world spectra, four panels: p=0.0, p=0.1, p=0.3, p=0.5]
Spectral Partitioning and Cuts
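• Divide a network into modules or clusters
• Minimise the cut C between the two parts $V_1$ and $V_2$: $C = \mathrm{cut}(V_1, V_2) = \sum_{i \in V_1,\, j \in V_2} A_{ij}$
– This simple approach does not work: the minimum is often achieved by splitting off a single vertex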
Spectral Partitioning
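• Should prefer equal partitions
Ratio cut, for parts $V_1$ and $V_2$ with cut C: $\mathrm{Rcut} = \frac{C}{|V_1|} + \frac{C}{|V_2|}$
Normalized cut: $\mathrm{Ncut} = \frac{C}{\mathrm{vol}(V_1)} + \frac{C}{\mathrm{vol}(V_2)}$, where $\mathrm{vol}(V_k)$ is the sum of the degrees of the vertices in $V_k$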
Spectral Partitioning
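• Analysis (ratio cut)
Introduce indicator vector x
Has following properties
1. $x^T \mathbf{1} = 0$
2. $x^T x = n$
3. $x_i$ takes only two values
With these properties $x^T L x$ is proportional to the ratio cut. Dropping property 3 relaxes the problem, and the minimiser is the eigenvector of L with the second-smallest eigenvalue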
Spectral Partitioning
• Similarly, for normalized cut
• Discretize x into two values to obtain partitions
• Solution depends on finding eigenvector
• Type of cut depends on matrix
– Equally well use another matrix, e.g. adjacency
• A measures affinity between vertices for being in the same partition
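The partitioning recipe above can be sketched as follows, assuming NumPy and the Laplacian (ratio-cut) variant: take the eigenvector of L with the second-smallest eigenvalue and discretise it by sign. The thresholding at zero and the two-clique test graph are hypothetical choices, and the graph is assumed to be connected.

```python
import numpy as np

def spectral_bisection(A):
    """Split a connected graph in two using the sign of the second Laplacian eigenvector."""
    A = np.asarray(A, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    eigenvalues, eigenvectors = np.linalg.eigh(L)  # ascending eigenvalues
    second = eigenvectors[:, 1]                    # eigenvector of second-smallest eigenvalue
    return second >= 0                             # discretise into two values

# Hypothetical example: two 4-cliques joined by the single edge (3, 4).
A = np.zeros((8, 8))
A[:4, :4] = 1.0
A[4:, 4:] = 1.0
np.fill_diagonal(A, 0.0)
A[3, 4] = A[4, 3] = 1.0

labels = spectral_bisection(A)
cut_size = A[labels][:, ~labels].sum()  # number of edges crossing the partition
print(labels, cut_size)
```

The overall sign of an eigenvector is arbitrary, so which clique is labelled True may differ between runs or library versions; only the grouping is meaningful.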
Modularity
• Modularity is a measure of partition quality relative to some base graph model
• Can be summarised in the modularity matrix $B = A - P$
• $P_{ij}$ is the expected affinity according to base model
– Needs to be more clustered than the model
• Common to use the configuration model as the base: $P_{ij} = d_i d_j / 2|E|$
Modularity
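• Modularity: $\frac{1}{2|E|} \sum_{ij} (A_{ij} - P_{ij}) \, \delta(c_i, c_j)$, where $c_i$ is the module containing vertex i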
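A short sketch of the modularity computation, assuming NumPy and the configuration model as the base, so that the expected affinity is $d_i d_j / 2|E|$. The function name and the example graph (two triangles joined by one edge) are hypothetical.

```python
import numpy as np

def modularity(A, communities):
    """Modularity of a partition, using the configuration model as the base."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)
    two_m = d.sum()                          # 2|E|
    B = A - np.outer(d, d) / two_m           # modularity matrix A - P
    c = np.asarray(communities)
    same = c[:, None] == c[None, :]          # delta(c_i, c_j)
    return (B * same).sum() / two_m

# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge (2, 3).
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

q_split = modularity(A, [0, 0, 0, 1, 1, 1])  # natural two-module partition
q_single = modularity(A, [0, 0, 0, 0, 0, 0]) # everything in one module
print(q_split, q_single)
```

For this graph the two-triangle partition gives 5/14, while placing all vertices in one module gives 0, since the rows of the modularity matrix sum to zero.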
Paths
Paths
• The structure of a network can be probed by looking at the paths
– Communicability
– Commute time
• Generally not tractable to enumerate paths: too many
• Need to think carefully about what can be computed in practice
– Powers of A, exp(A) etc.
Path
A path is a contiguous sequence of edges in the network, from a start vertex i to an end vertex j
The length of a path p, l(p), is the number of edges traversed
A simple path is a self-avoiding path, which does not repeat any vertices (with the possible exception of i and j)
[Figure: example network on vertices 1 to 5]
Cycles
• A cycle is a closed path in a network, i.e. a path across edges returning to the same vertex (i=j)
• Cycles are often an important structural component of networks
Cycles
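• A cycle is a sequence of vertices that starts and ends at the same vertex
• A simple cycle does not repeat any vertex except the first/last
• Two cycles may be considered equivalent if they are the same cycle with different starting points
[Figure: example network on vertices 1 to 5]
Examples:
1-2-1: simple
2-3-4-2 ~ 3-4-2-3 ~ 4-2-3-4: simple
5-3-4-3-5: non-simple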
Counting paths
• Formal adjacency matrix
• Replace {0,1} with formal variables representing edges
• Allows us to keep track of which sequences contribute to a particular calculation
– Substitute specific values to find actual values
Counting paths
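• Example
Weighted sum of paths of all lengths
Walk generating function: $W(z) = \sum_{k=0}^{\infty} z^k A^k = (I - zA)^{-1}$
• Can use z to control convergence: the series converges for $z < 1/\lambda_n$, where $\lambda_n$ is the largest eigenvalue of A, so $z < 1/n$ is sufficient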
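The closed form of the walk generating function can be compared with a truncated power series. This is a sketch assuming NumPy; the 4-vertex graph, the value z = 0.2 (below 1/n = 0.25) and the truncation at 200 terms are hypothetical choices.

```python
import numpy as np

def walk_generating_function(A, z):
    """Closed form (I - zA)^(-1) of the series sum_k z^k A^k."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    return np.linalg.inv(np.eye(n) - z * A)

# Hypothetical example: a 4-cycle 0-1-2-3 with the chord (0, 2).
A = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]:
    A[i, j] = A[j, i] = 1.0

z = 0.2  # below 1/n, so the series converges
W = walk_generating_function(A, z)

# Truncated series: entry (i, j) of A^k counts walks of length k from i to j.
series = sum(z**k * np.linalg.matrix_power(A, k) for k in range(200))
print(np.allclose(W, series))
```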
Example: MUTAG
Collection of 188 labelled chemical compounds. Task is to predict whether each compound has mutagenicity or not.
Method                                          Dataset             Accuracy
Random walk kernel                              Mutag (labelled)    90.0%
Backtrackless walk kernel                       Mutag (labelled)    91.1%
Feature vector from random walk                 COIL (unlabeled)    94.4%
Feature vector from backtrackless random walk   COIL (unlabeled)    95.5%
Feature vector from Ihara coefficients          COIL (unlabeled)    94.4%
Shortest path kernel                            COIL (unlabeled)    86.7%
Feature vector from random walk                 Mutag (unlabeled)   89.4%
Feature vector from backtrackless random walk   Mutag (unlabeled)   90.5%
Feature vector from Ihara coefficients          Mutag (unlabeled)   80.5%
Graph Kernels
• The walk generating function efficiently counts paths
• Including backtracks
• Tottering masks interesting information
• Simple paths difficult to compute
Oriented Line Graph
• Oriented Line graph:
[Figure: a graph on vertices 1 to 4 with edges 1-2, 2-3, 2-4, 1-4 and 3-4, and its oriented line graph on the directed edges e12, e21, e23, e32, e24, e42, e14, e41, e34, e43]
Oriented Line graph (OLG): no backtracking
1. Convert edges into directed pairs
2. Each directed edge becomes a vertex
3. Join vertices where the head of one edge meets the tail of another
4. Reverse pairs are not joined (e.g. e12, e21)
Backtrackless Walks
• The adjacency of the OLG is given by T (the Hashimoto matrix of the network)
• Paths on T are paths on A, except backtracks do not appear
– Path of length l on T is path of length l+1 on A
• Count paths on T, but T can be big (2|E|×2|E|)
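The oriented line graph construction can be sketched directly from its four steps. The code below assumes NumPy and a simple undirected graph; the function name `hashimoto_matrix` and the triangle example are hypothetical.

```python
import numpy as np

def hashimoto_matrix(A):
    """Adjacency T of the oriented line graph: arc (i,j) -> arc (j,k) whenever k != i."""
    A = np.asarray(A)
    n = A.shape[0]
    # Steps 1-2: every undirected edge becomes two arcs, and each arc is an OLG vertex.
    arcs = [(i, j) for i in range(n) for j in range(n) if A[i, j]]
    index = {arc: a for a, arc in enumerate(arcs)}
    T = np.zeros((len(arcs), len(arcs)), dtype=np.int64)
    for (i, j), a in index.items():
        for k in range(n):
            # Step 3: the head j of one arc meets the tail of the next.
            # Step 4: the reverse arc (j, i) is excluded.
            if A[j, k] and k != i:
                T[a, index[(j, k)]] = 1
    return T, arcs

# Hypothetical example: a triangle, which has 3 edges and therefore 6 arcs.
triangle = np.ones((3, 3), dtype=np.int64) - np.eye(3, dtype=np.int64)
T, arcs = hashimoto_matrix(triangle)

closed_2 = np.trace(np.linalg.matrix_power(T, 2))  # backtracking 2-cycles are excluded
closed_3 = np.trace(np.linalg.matrix_power(T, 3))  # the triangle: 3 starts x 2 directions
print(T.shape, closed_2, closed_3)
```

T has one row per arc, which is the 2|E|×2|E| size noted above and the motivation for the vertex-level recursion on the next slide.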
Efficient computation
• Complexity is a problem
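• We can directly compute the n×n matrix $A_k$, defined as $(A_k)_{ij}$ = number of backtrackless walks of length k from i to j; here i, j run over the vertices of G.
• Recursions for the matrices $A_k$
– Let A be the adjacency matrix of a simple graph G and Q be an n×n diagonal matrix whose ith diagonal entry is the degree of the ith node minus 1. Then $A_1 = A$, $A_2 = A^2 - (Q + I)$, and $A_k = A_{k-1} A - A_{k-2} Q$ for $k \ge 3$
[Stark and Terras 1996, Aziz et al 2013]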
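A sketch of the vertex-level recursion, assuming NumPy and the standard form $A_1 = A$, $A_2 = A^2 - (Q + I)$, $A_k = A_{k-1} A - A_{k-2} Q$ for k ≥ 3. The brute-force enumerator is a hypothetical helper added only to cross-check the recursion on a small example graph.

```python
import numpy as np

def backtrackless_walk_counts(A, kmax):
    """Return [A_1, ..., A_kmax]; A_k[i, j] counts backtrackless walks of length k from i to j."""
    A = np.asarray(A, dtype=np.int64)
    n = A.shape[0]
    Q = np.diag(A.sum(axis=1) - 1)          # degree minus one on the diagonal
    I = np.eye(n, dtype=np.int64)
    counts = [A.copy()]
    if kmax >= 2:
        counts.append(A @ A - (Q + I))
    for _ in range(3, kmax + 1):
        counts.append(counts[-1] @ A - counts[-2] @ Q)
    return counts[:kmax]

def brute_force_counts(A, k):
    """Enumerate walks of length k that never immediately reverse an edge (check only)."""
    n = len(A)
    out = np.zeros((n, n), dtype=np.int64)

    def extend(start, prev, cur, steps):
        if steps == k:
            out[start, cur] += 1
            return
        for nxt in range(n):
            if A[cur][nxt] and nxt != prev:
                extend(start, cur, nxt, steps + 1)

    for s in range(n):
        extend(s, -1, s, 0)
    return out

# Hypothetical 5-vertex example graph.
A = np.zeros((5, 5), dtype=np.int64)
for i, j in [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (1, 4)]:
    A[i, j] = A[j, i] = 1

counts = backtrackless_walk_counts(A, 5)
print(all(np.array_equal(counts[k - 1], brute_force_counts(A, k)) for k in range(1, 6)))
```

The recursion only multiplies n×n matrices, whereas the enumeration grows exponentially with k.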
Cycles
• It is easy to count short simple cycles in a network
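• As we noted earlier, $\mathrm{Tr}(A^2) = 2|E|$ (number of 2-cycles)
• $\mathrm{Tr}(A^3)$ (number of simple 3-cycles; each triangle is counted 6 times, once per starting vertex and direction)
• $\mathrm{Tr}(A^4)$, which is the number of 4-cycles, most of which are not simple
[Figure: example network on vertices 1 to 4]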
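The trace counts for short cycles are simple to compute. The sketch below assumes NumPy and uses the complete graph on 4 vertices as a hypothetical example; the traces count closed walks, with every starting vertex and direction counted separately.

```python
import numpy as np

def closed_walk_counts(A, kmax):
    """Return {k: Tr(A^k)}, the number of closed walks of each length k up to kmax."""
    A = np.asarray(A, dtype=np.int64)
    power = np.eye(A.shape[0], dtype=np.int64)
    counts = {}
    for k in range(1, kmax + 1):
        power = power @ A
        counts[k] = int(np.trace(power))
    return counts

K4 = np.ones((4, 4), dtype=np.int64) - np.eye(4, dtype=np.int64)
counts = closed_walk_counts(K4, 4)

n_edges = counts[2] // 2       # each edge gives two closed 2-walks
n_triangles = counts[3] // 6   # each triangle: 3 starting vertices x 2 directions
print(counts, n_edges, n_triangles)
```

For this graph only 24 of the 84 closed 4-walks trace a simple 4-cycle (3 distinct cycles, each counted 8 times); the rest retrace edges.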
Cycles
• A cycle in OLG(G) induces a cycle in G
• Since backtracks are not allowed, certain cycles do not appear
– Cycles of length 2
– Cycles with tails
• Let T be the adjacency of the OLG
– Called the Hashimoto matrix of G
• Still get repeats at larger size, e.g. $c_1^2$, $c_1 c_2$
Cycles
• What about other matrix functions?
• Structural measures should be invariant to permutation similarity transform
• det and perm seem obvious choices
– Counts hikes of length n, collections of disjoint cycles
• perm hard to compute
Ihara Zeta Function
• Ihara (1966), Sunada (1986)
• Prime cycle of a graph:
– A cycle which has no backtracking and is not a multiple of another cycle
[Figure: three example cycles: prime; not prime (backtracking); not prime (twice round a single cycle)]
Prime Cycles
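– Similar trick to walk generating function
– Sum over hikes of any length
• Use T to eliminate backtracks in the hikes and let wij → z to get a generating function
• Ihara zeta function of network: $\zeta_G(z) = \prod_{p} (1 - z^{l(p)})^{-1} = \det(I - zT)^{-1}$, where the product runs over equivalence classes of prime cycles p
– Effectively series over Ihara prime cycles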
• Efficient evaluation using (large) eigenvalues of T
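A sketch of evaluating the reciprocal of the Ihara zeta function as $\det(I - zT)$, assuming NumPy. As a cross-check it also evaluates the Ihara-Bass determinant formula $(1 - z^2)^{|E| - |V|} \det(I - zA + z^2 Q)$ with Q the diagonal matrix of degrees minus 1; that identity is a known result brought in here for verification and is not stated in the slides. The function names and the value of z are hypothetical.

```python
import numpy as np

def hashimoto_matrix(A):
    """Adjacency T of the oriented line graph (no backtracking between consecutive arcs)."""
    n = A.shape[0]
    arcs = [(i, j) for i in range(n) for j in range(n) if A[i, j]]
    index = {arc: a for a, arc in enumerate(arcs)}
    T = np.zeros((len(arcs), len(arcs)))
    for (i, j), a in index.items():
        for k in range(n):
            if A[j, k] and k != i:
                T[a, index[(j, k)]] = 1.0
    return T

def ihara_zeta_reciprocal(A, z):
    """1 / zeta(z) computed on the 2|E| x 2|E| Hashimoto matrix."""
    T = hashimoto_matrix(np.asarray(A, dtype=float))
    return np.linalg.det(np.eye(T.shape[0]) - z * T)

def ihara_bass_reciprocal(A, z):
    """1 / zeta(z) computed on n x n matrices via the Ihara-Bass formula (cross-check)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    m = int(A.sum()) // 2
    Q = np.diag(A.sum(axis=1) - 1)
    return (1 - z * z) ** (m - n) * np.linalg.det(np.eye(n) - z * A + z * z * Q)

K4 = np.ones((4, 4)) - np.eye(4)
z = 0.3
via_T = ihara_zeta_reciprocal(K4, z)
via_A = ihara_bass_reciprocal(K4, z)
print(via_T, via_A)
```

For a single triangle the only prime cycles are the two orientations of the triangle itself, so the reciprocal reduces to $(1 - z^3)^2$.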
Application: Social Balance
• Some social interactions can be characterised by a positive/negative interaction
– Friend/enemy, for/against
• Social theory suggests that networks should evolve into a balanced state to decrease tension
– Does this happen in practice?
[Figure: signed triangle between Alice, Bob and Carol, with two + edges and one - edge]
Balance in Networks
• Early work focussed on triangles
– Easy to count
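Counting balanced triangles can be sketched with traces of a signed adjacency matrix S (entries +1, -1 or 0), assuming NumPy. A triangle is taken to be balanced when the product of its three edge signs is positive, so Tr(S^3) counts balanced minus unbalanced triangles (times 6) and Tr(|S|^3) counts all triangles (times 6). The function name and the examples are hypothetical.

```python
import numpy as np

def triangle_balance(S):
    """Return (balanced, unbalanced) triangle counts for a signed adjacency matrix S."""
    S = np.asarray(S, dtype=np.int64)
    U = np.abs(S)                               # unsigned adjacency
    total = np.trace(U @ U @ U) // 6            # all triangles
    signed = np.trace(S @ S @ S) // 6           # balanced minus unbalanced
    balanced = (total + signed) // 2
    return int(balanced), int(total - balanced)

def signed_graph(n, edges):
    """Build a symmetric signed adjacency matrix from (i, j, sign) triples."""
    S = np.zeros((n, n), dtype=np.int64)
    for i, j, sign in edges:
        S[i, j] = S[j, i] = sign
    return S

# Alice (0), Bob (1), Carol (2): two positive edges and one negative edge.
abc = signed_graph(3, [(0, 1, +1), (0, 2, +1), (1, 2, -1)])
# Two negative edges and one positive edge ("the enemy of my enemy is my friend").
enemy = signed_graph(3, [(0, 1, -1), (0, 2, -1), (1, 2, +1)])

print(triangle_balance(abc), triangle_balance(enemy))
```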
Cycles
• Problem with counting balanced squares:
• There is no simple way of counting simple cycles of arbitrary length in a graph
• A Hamiltonian cycle is a simple cycle which visits all vertices of the graph
• Determining whether such a cycle exists is known to be NP-complete
– No polynomial-time algorithm likely for general simple cycle counting
Cycles
• Can count Ihara cycles instead
– Simple up to length 5
– No cycle powers $c^k$
Simple cycles
• Generating function for simple cycles
– S all simple cycles
• There is a trace formula for this function [Giscard et al 2016]
• Naturally this is NP-hard to compute
– Can get efficient approximations for shorter cycles using Monte Carlo sampling, particularly on sparse graphs
Balance in real networks
• WikiElections network represents the votes of Wikipedia users during the elections of other users to adminship.
– Directed, 8,297 vertices, 12,915 edges
Balance in real networks
• The Epinions network is a large directed graph on 131,828 vertices representing relations between the users of the consumer review website Epinions.com.
– Directed with 841,372 edges