![Page 1: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/1.jpg)
Edo Airoldi
Department of Statistics, Harvard Broad Institute of Harvard and MIT
Guest lecture for EE380L at UT Austin (Prof. J Ghosh), November 10, 2011
Models of networks and mixed membership stochastic blockmodels
![Page 2: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/2.jpg)
Guest lecture for EE380L (November 2011) 2
Agenda
• Overview • Models of networks • Mixed membership blockmodels
1. Inference 2. Results
• Concluding remarks
![Page 3: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/3.jpg)
Overview
• Structured data vs. latent dependence structure Leveraging observed (noisy) structure for estimation As opposed to dim redux, graphical models, sparsity, …
• Technical challenges Abandon convenient representations of dependence Deal with structured measurements and interfering units
• This talk Statistical problems when structure is expressed by a graph
3
![Page 4: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/4.jpg)
What is a complex network?
• Define as a collection of measurements on pairs of sampling units and of unit-specific attributes
• Traditionally, can only choose 2 out of 3 1. Large scale, e.g. millions of nodes 2. Realistic 3. Completely mapped, or to a large extent
• Today, a number of systems fall under this data setting that satisfy all three characteristics
Guest lecture for EE380L (November 2011) 4
![Page 5: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/5.jpg)
A few examples
• Internet, WWW and Wikipedia • Signaling pathways and metabolic networks • JStor and scientific literature • Cell-phone data, e.g. Rwanda, UK, ATT • Yahoo and other instant messaging systems • Linked-In and Facebook • Blogs and Twitter
Guest lecture for EE380L (November 2011) 5
![Page 6: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/6.jpg)
Rich, interdisciplinary literature
• Historical notes Moreno formalizes the sociogram (’34), Sociometry (‘37)
50s: Sociology (Coleman et al. ‘57), Mathematics (Erdos & Reniy ‘59, Gilbert ‘59), Psychology (Milgram ‘67, ’69)
70s: Statistics (Holland, Leinhardt, Fienberg, Wasserman)
90s: Computer Science (Faloutsos3 ’99), Physics (Huberman & Adamic ‘99, Albert & Barabasi ‘99)
Guest lecture for EE380L (November 2011) 6
![Page 7: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/7.jpg)
Statistical issues in network analysis
• Representation and compressed sensing How to smoothly represent the space of all graph structures? Motifs, metrics, spectral, …, semi-parametric
• Population models Sample size? Notions of variability? (See survey paper)
• Diffusion of information on a network How to infer who talks to whom from aggregate traffic?
Guest lecture for EE380L (November 2011) 7
![Page 8: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/8.jpg)
Statistical issues in network analysis
• Confidence sets, tests, GoF, model selection How to establish confidence sets for network structure? The Newman-Girvan modularity score is inconsistent
• Inference from a sample CDC sponsored more than 90 studies to date using RDS Are network sampling designs ignorable? No.
• Causal inference with interference How to separate peer-influence effects from homophily?
Guest lecture for EE380L (November 2011) 8
![Page 9: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/9.jpg)
Some details to think about
• Easy to measure things. Hard to pose questions. May not really know what any node or link means.
• What does Yij=0 mean?
• Valued measurements and censoring.
• Notion of variability. (sample size, populations)
• Global properties must be non-trivial outcomes of the composition of local properties and structures
Guest lecture for EE380L (November 2011) 9
![Page 10: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/10.jpg)
Guest lecture for EE380L (November 2011) 10
Agenda
• Overview • Models of networks • Mixed membership blockmodels
1. Inference 2. Results
• Concluding remarks
![Page 11: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/11.jpg)
Network modeling 101
• Graphs or networks?
• Usually a graph is defined as, G = (V,E)
• For the purpose of this seminar, G = (1:N,YN✕N)
• Complex networks, G = (1:N,YN✕N,XN✕P)
• Random graphs via P(G|Θ) or P(Y|Θ)
• Frequentist or Bayes?
Guest lecture for EE380L (November 2011) 11
![Page 12: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/12.jpg)
Erdös-Renyi-Gilbert
• The most widely known random graph model
• Binary edges are sampled independently G(N,θ): sample Yij from Bernoulli(θ) for i,j=1..N G(N,M): sample Y from SRS(θ,M)
• Likelihood for G(N,θ) P(Y|Θ) = Πij θYij (1-θ)(1-Yij)
Guest lecture for EE380L (November 2011) 12
![Page 13: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/13.jpg)
Emergence of the giant component
• ER studied G(N,M) as θ=M/ increases in [0,1]
• For a graph with N nodes, θ=1/N is a critical value 1. If θ<1/N, no connected components of size larger than
O(log N) will exist in the graph, as N↑∞
2. If θ=1/N, largest connected component of size O(N2/3) will exist in the graph, as N↑∞
3. If θ>1/N, unique connected component of size O(N) will exist in the graph, as N↑∞. No other components with more than O(log N) will exist, as N↑∞
Guest lecture for EE380L (November 2011) 13
!
N2"
# $ %
& '
![Page 14: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/14.jpg)
0.00 0.01 0.02 0.03 0.04 0.05
020
40
60
80
100
probability of an edge
siz
e o
f la
rgest c-c
om
p (
mean)
5e-05 5e-04 5e-03 5e-02
020
40
60
80
100
probability of an edge
siz
e o
f la
rgest c-c
om
p (
mean)
0.00 0.01 0.02 0.03 0.04 0.05
05
10
15
probability of an edge
siz
e o
f la
rgest c-c
om
p (
st.dev)
5e-05 5e-04 5e-03 5e-02
05
10
15
probability of an edge
siz
e o
f la
rgest c-c
om
p (
st.dev)
0.00 0.01 0.02 0.03 0.04 0.05
0.0
0.2
0.4
0.6
0.8
1.0
probability of an edge
pro
b o
f gia
nt com
p (
mean)
5e-05 5e-04 5e-03 5e-02
0.0
0.2
0.4
0.6
0.8
1.0
probability of an edge
pro
b o
f gia
nt com
p (
mean)
0.00 0.01 0.02 0.03 0.04 0.05
0.0
0.1
0.2
0.3
0.4
0.5
probability of an edge
pro
b o
f gia
nt com
p (
st.dev)
5e-05 5e-04 5e-03 5e-02
0.0
0.1
0.2
0.3
0.4
0.5
probability of an edge
pro
b o
f gia
nt com
p (
st.dev)
14
![Page 15: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/15.jpg)
p* or ERG models
Pr (Y=y|Θ=θ) = exp{ Σk θkSk(y) + A(θ) }
where Sk(y) counts specific structure k, such as • edges S1(y) = Σ1≤i≤j≤n yij
• triangles S3(y) = Σ1≤i≤j≤h≤n yij yih yjh.
Frank & Strauss (JASA, 1986), Snijders et al. (Soc. Met., 2004), Hanneke & Xing (LNCS, 2007)
Guest lecture for EE380L (November 2011) 15
![Page 16: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/16.jpg)
Towards exchangeable graphs
• Symmetry suggests the nodes should be treated as exchangeable in the following sense
• A result by Hoover and Aldous: any model that satisfies this condition for any N is of the form
for ui,uj i.i.d. and εij i.i.d node/pair-specific effects
Guest lecture for EE380L (November 2011) 16
![Page 17: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/17.jpg)
Exchangeable graph models
• Alternative specifications of h(µ,ui,uj,εij) lead to different models. With some generality
P(Yij=1|µ,ui,uj,εij) = h’(µ + α(ui,uj) + εij) = θij
• Likelihood P(Y|c) = ∫Θ P(Θ|c) ⋅ Πij θij
Yij (1-θij)(1-Yij) dθij
Guest lecture for EE380L (November 2011) 17
![Page 18: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/18.jpg)
Guest lecture for EE380L (November 2011) 18
Approach
• Issues: scalability, global vs. local perspectives
Data
Probabilistic Hierarchical Models
Bayesian Posterior Inference Hidden Mechanism
(Statistician)
Domain Knowledge and Hypotheses
(Domain expert)
![Page 19: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/19.jpg)
Three basic models
• Latent space model α(ui,uj) = -|ui-uj|; ui real vectors, for i=1…N
• Latent eigenmodel α(ui,uj) = ui
’Λuj; ui real vectors, for i=1…N; Λ diag. K×K
• Latent class model α(ui,uj) = Bui,uj; ui =1…K, for i=1…N; B symm. K×K
Guest lecture for EE380L (November 2011) 19
![Page 20: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/20.jpg)
Latent space models
log-odds (Yij=1|ui,uj,µ) = µ – |ui–uj| = ηij
where ui is a point in Rk, for all nodes i in N.
Idea: close points in Rk are likely to be connected.
Here uis are constants; θij = [1+exp{–ηij}]-1 and likelihood is P(Y|U,µ) = Σij [ηijYij – log(1+exp{ηij}) ]
Hoff et at. (JASA, 2002), Handcock et al. (JRSS/A, 2007), Krivitsky et al. (Soc. Net., 2009)
Guest lecture for EE380L (November 2011) 20
![Page 21: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/21.jpg)
Guest lecture for EE380L (November 2011) 21
Shortcomings so far
• ERG models (Wasserman et al., Handcock et al.)
Summarize graphs using exp model on motif-counts Issues: cannot offer node-specific predictions, ..
• Latent space models (Hoff et al. 02; Hoff 03)
Project adjacency matrix onto a latent RK via logistic regression; closer points increase chance of connectivity
Issues: MCMC does not scale, hard identifiability problem, no clustering effect
![Page 22: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/22.jpg)
Model specifications
πi ~ Dirichlet (α), for all nodes i=1..N yij|πi,πj ~ Bernoulli (πi`B πj), for all pairs (i,j)
where πi is a point in the K-simplex, and B is K×K.
Nodes in the same block share similar connectivity.
Loraine & White (JMS, 1971), Fienberg et al. (JASA, 1985), Nowicki & Snijders (JASA, 2001), Airoldi et al. (JMLR, 2008)
22
1 2
3
4 5
6
8
9 7
![Page 23: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/23.jpg)
Guest lecture for EE380L (November 2011) 23
Agenda
• Overview • Models of networks • Mixed membership blockmodels
1. Inference 2. Results 3. Remarks
• Concluding remarks
![Page 24: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/24.jpg)
The cell
Guest lecture for EE380L (November 2011) 24
(Source: fig.cox.miami.edu)
![Page 25: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/25.jpg)
Functions & mechanisms
• Cytoplasm is a busy place Proteins, small molecules
• Taxonomy of functions Gene Ontology annotations (e.g., cell division)
• Mechanisms Pathways as complex graphs (e.g., carbon metabolism)
Guest lecture for EE380L (November 2011) 25
(Source: SGD and own work)
![Page 26: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/26.jpg)
26
(Source: Nature, and BMC Bioinformatics)
Domain knowledge
Proteins form stable protein complexes to carry out functions in the cell
Protein interaction data
![Page 27: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/27.jpg)
Guest lecture for EE380L (November 2011) 27
Scientific questions
• Can interaction motifs: – indicate proteins’ multifaceted functional role? – reveal protein complexes and relations among them?
Protein interaction data Functions (GO Slim) Yeast cell
(Source: fig.cox.miami.edu, SGD, and own work)
![Page 28: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/28.jpg)
• Structural equivalence (Lorrain & White, 1971) – Nodes with similar connectivity collapsed into a block
• Instantiated by – Blockmodel (B) (≈ Nowiki & Snijders, 01, Airoldi et al. 05, 07, 08)
• Combined with – Mixed membership (Π) (Airoldi et al. 05, 07, 08)
Guest lecture for EE380L (November 2011) 28
Two modeling ideas 1 2
3
4 5
6
8
9 7
(7,8,9)
(1,2,3) (4,5,6)
9
![Page 29: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/29.jpg)
Guest lecture for EE380L (November 2011) 29
Blockmodel, B
• Captures salient structure at the block level
• Connectivity among nodes within the same block (across blocks) is only specified on average
A B C
1.0 0 0.3 A
0.3 1.0 0 B
0 0.3 0 C
C = (7,8,9)
A = (1,2,3) B = (4,5,6)
1 2
3
4 5
6
8
9 7
From
To
![Page 30: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/30.jpg)
Guest lecture for EE380L (November 2011) 30
Mixed membership, Π
• Nodes can be mapped to multiple blocks
• Extends the idea of a mixture (i.e., local weights)
• Node-specific weights useful for prediction
A B C node
1.0 0 0 1
1.0 0 0 2
. . . .
0.1 0.1 0.8 9
1 2
3
4 5
6
8
9 7
9
A B C
![Page 31: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/31.jpg)
Guest lecture for EE380L (November 2011) 31
Model: projecting Y onto B via Π
Mixed Membership
Stochastic Blockmodel
![Page 32: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/32.jpg)
Blockmodel + node-specific memberships
Likelihood
Note: the matrix B has size K✕K Guest lecture for EE380L (November 2011) 32
Model: variant for prediction
Y (n, m) ! Bernoulli (!"!
nB !"m), (n, m) " [1, N ]2
!"n ! Dirichlet (#), n " [1, N ]
!(Y |", B) =!!
"n
p(#$n|")"
nmp(Y (n, m)|#$n,#$m, B) d!
![Page 33: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/33.jpg)
Guest lecture for EE380L (November 2011) 33
Model: variant for de-noising
Blockmodel + relation-specific memberships
Note: the matrix B has size K✕K
!"n ! Dirichlet (#), n " [1, N ]
!znm! ! multinomial (!"n, 1), (n, m) " [1, N ]2
!znm! ! multinomial (!"m, 1), (n, m) " [1, N ]2
Y (n, m) ! Bernoulli (!z!nm"B !znm#), (n, m) " [1, N ]2
![Page 34: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/34.jpg)
Guest lecture for EE380L (November 2011) 34
Agenda
• Overview • Models of networks • Mixed membership blockmodels
1. Inference 2. Results
• Concluding remarks
![Page 35: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/35.jpg)
Guest lecture for EE380L (November 2011) 35
Revisiting EM
• Data Y, latent variables X =(Π,Z), and constants Θ =(α,B)
log
![Page 36: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/36.jpg)
q ! q!(X) " p(X | Y ) at !! = !!(Y )
Guest lecture for EE380L (November 2011) 36
Variational EM
• EM maximizes the lower bound over (q,Θ) • In EM we set
• If not feasible, we can posit approximation for q using free parameters Δ ⎯ this is vEM
q = p(X | Y,!)
![Page 37: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/37.jpg)
Eq!
!
log p(Y, X | !) ! log q!(X)"
=: L(q!, !)
Guest lecture for EE380L (November 2011) 37
Variational EM (cont.)
• Leads to approximate lower bound
• Iterate
Variational E-step:
M-step:
!! = arg max! L(q!, ")
!! = arg max! L(q"! , !)
![Page 38: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/38.jpg)
38
Nested variational EM
• Mean field:
Vanilla vEM (Jordan et al. 99) E-step: initialize γ1:N, ϕ1:N,1:N 1. update ϕ1:N,1:N 2. update γ1:N
M-step: update α, B
Nested vEM (Airoldi et al. 05, 08)
E-step: initialize γ1:N loop pairs (n,m) 1. init & optimize ϕn,m 2. partially update γ n,γm
M-step: update α, B
q!(!, Z) =!
n q!"n(!"n) ·
!nm q!#nm
(!znm)
![Page 39: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/39.jpg)
Guest lecture for EE380L (November 2011) 39
Agenda
• Overview • Models of networks • Mixed membership blockmodels
1. Inference 2. Results
• Concluding remarks
![Page 40: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/40.jpg)
Guest lecture for EE380L (November 2011) 40
• Functional content in P(Y | )
• Model reveals information about functional modules (cross-validation: K*=50; gold standard in Myers et al. 06)
Evaluation: recovering function
3
2
1
Prec
isio
n
Recall
![Page 41: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/41.jpg)
41
Evaluation: identifying blocks
• Two model variants capture a different number of functional processes, with equally high accuracy
GO functional processes (Area under the curve, red = high)
2 1 3
![Page 42: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/42.jpg)
42
Evaluation: mixed membership
• Amount of mixed membership is substantial • Membership reveals multifaceted functional roles
Mixed membership
Estimated memberships 0 1
Estim
ated
mem
bers
hips
15 high level functions 15 high level functions
NAT-1 GAL-4 NOP-1
NOP-58 MET-31
![Page 43: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/43.jpg)
Guest lecture for EE380L (November 2011) Edo Airoldi
National study on adolescents
• A friendship network among 69 students in grades 7-12
Original data node-specific relation-specific (prediction) (de-noising)
![Page 44: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/44.jpg)
Columbia University, Nov. 26th, 2007, New York, NY Edo Airoldi
![Page 45: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/45.jpg)
Sampson’s monastery data
• Multivariate sociometric relations among novices in a NE monastery, over two years.
• Anthropological observations as ground truth
• Two factions, plus social outcasts and waverers
• After two years John and Greg get expelled, most young turks leave and the order dissolves
Guest lecture for EE380L (November 2011) 45
![Page 46: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/46.jpg)
Guest lecture for EE380L (November 2011) 46
Expressing connectivity
• Two variants provide increasing levels of definition
Original data node-specific relation-specific (prediction) (de-noising)
![Page 47: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/47.jpg)
Social structure: blockmodel
Young Turks
Loyal
Opposition
Outcasts
0.9
0.9 0.5
0.3
0.4
Guest lecture for EE380L (November 2011) 47
![Page 48: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/48.jpg)
Columbia University, Nov. 26th, 2007, New York, NY Edo Airoldi
Social structure: membership
![Page 49: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/49.jpg)
Guest lecture for EE380L (November 2011) 49
Evaluation: nested variational EM
(Simulated data; 300 nodes, 10 blocks)
— Vanilla
— Nested
Variational EM
(Airoldi et al. 07)
(Jordan et al. 99) Hel
d-ou
t log
like
lihoo
d
Run time (seconds)
![Page 50: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/50.jpg)
Guest lecture for EE380L (November 2011) 50
Model extensions
• Sparsity, general formulation, informative priors and full Bayes (Airoldi, Blei, Fienberg & Xing, 05, 06, 08)
• Node attributes (Airoldi, Markowetz, Blei & Troyanskaya)
• Dynamic (Airoldi, Fienberg & Krackhardt, 08)
• Extensions by others (Hofman & Wiggins 07; Eliassi-Rad, Griffiths & Jordan; Nallapati, Cohen & Lafferty; Frey et al., 06, Chang & Blei)
![Page 51: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/51.jpg)
Guest lecture for EE380L (November 2011) 51
Y
Dynamics of social failure
• Analysis suggests a theory of social failure in isolated communities. Try longitudinal model
• Data:
Y
![Page 52: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/52.jpg)
Guest lecture for EE380L (November 2011) 52
Whom do like (epoch 1) Whom do like (epoch 2)
Whom do like (epoch 3)
![Page 53: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/53.jpg)
Guest lecture for EE380L (November 2011) 53
Agenda
• Overview • Models of networks • Mixed membership blockmodels • Concluding remarks
![Page 54: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/54.jpg)
Take home points
• Complex networks are an exciting research area that is generating new statistical problems
• The familiar notions of sampling variability and sampling designs are challenged
• Potential for impact in the sciences, from biology to communications, and from computational social science to healthcare survey design and analysis
Guest lecture for EE380L (November 2011) 54
![Page 55: Models of networks and mixed membership stochastic blockmodels · Models of networks and mixed membership stochastic blockmodels. Guest lecture for EE380L (November 2011) 2 Agenda](https://reader033.vdocuments.net/reader033/viewer/2022060514/5f86988df16208009322f0f0/html5/thumbnails/55.jpg)
Acknowledgements and pointers
CDC, Facebook, Bell Labs. S Fienberg, E Xing, D Blei, B Singer, A Gelman, Z Ghahramani, J Leskovec, J Kleinberg, D Rubin.
1. Getting started in probabilistic graphical models. Airoldi, PLoS Computational Biology, 2007.
2. Mixed membership stochastic blockmodels. Airoldi, Blei, Fienberg & Xing, Journal of Machine Learning Research, 2008. (in R: iGraph, LDA)
3. A survey of statistical network models. Goldenberg, Zheng, Fienberg & Airoldi. Foundations & Trends in Machine Learning, 2009.
4. Deconvolution of mixing time series on a graph. Blocker & Airoldi. Uncertainty in Artificial Intelligence (UAI), 2011.