An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre for Network Analysis Workshop: Monday, 29 August 2011

Page 1: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

An introduction to exponential random graph models (ERGM)

Johan Koskinen

The Social Statistics Discipline Area, School of Social Sciences

Mitchell Centre for Network Analysis

Workshop: Monday, 29 August 2011

Page 2

Part 1a

Why an ERGM

Page 3

Networks matter – ERGMS matter

6018 grade 6 children 1966

[Figure: network plot; legend: female, male]

Page 4

Networks matter – ERGMS matter

6018 grade 6 children 1966 – 300 schools Stockholm

Page 5

Networks matter – ERGMS matter

6018 grade 6 children 1966 – 200 schools Stockholm

[Figure: eight panels, one per effect (Density, Mutuality, Alt. in-stars, Alt. out-stars, Alt. trans. triangles, Alt. indep. 2-paths, Homophily girl, Main girl); legend: CI, WLS]

Koskinen and Stenberg (in press) JEBS

Page 6

Networks matter – ERGMS matter

6018 grade 6 children 1966 – 200 schools Stockholm

[Figure: eight panels, one per effect (Density, Mutuality, Alt. in-stars, Alt. out-stars, Alt. trans. triangles, Alt. indep. 2-paths, Homophily girl, Main girl); legend: CI, WLS]

[Diagram: nodes i, j]

Koskinen and Stenberg (in press) JEBS

Page 7

Networks matter – ERGMS matter

6018 grade 6 children 1966 – 200 schools Stockholm

[Figure: eight panels, one per effect (Density, Mutuality, Alt. in-stars, Alt. out-stars, Alt. trans. triangles, Alt. indep. 2-paths, Homophily girl, Main girl); legend: CI, WLS]

[Diagram (a): nodes i, j with h1, h2, h3, …, hk]

Koskinen and Stenberg (in press) JEBS

Page 8

Networks matter – ERGMS matter

6018 grade 6 children 1966 – 200 schools Stockholm

[Figure: eight panels, one per effect (Density, Mutuality, Alt. in-stars, Alt. out-stars, Alt. trans. triangles, Alt. indep. 2-paths, Homophily girl, Main girl); legend: CI, WLS]

[Diagram: nodes i, j]

Koskinen and Stenberg (in press) JEBS

Page 9

Part 1b

Modelling graphs

Page 10

Social Networks

• Notational preliminaries
• Why and what is an ERGM
• Dependencies
• Estimation
  - Geyer-Thompson
  - Robbins-Monro
  - Bayes (the issue, Møller et al., LISA, exchange algorithm)
• Interpretation of effects
• Convergence and GOF
• Further issues

Page 11

Networks matter – ERGMS matter

Numerous recent substantively driven studies

• Gondal, The local and global structure of knowledge production in an emergent research field, SOCNET 2011
• Lomi and Pallotti, Relational collaboration among spatial multipoint competitors, SOCNET 2011
• Wimmer & Lewis, Beyond and Below Racial Homophily, AJS 2010
• Lusher, Masculinity, educational achievement and social status, GENDER & EDUCATION 2011
• Rank et al., Structural logic of intra-organizational networks, ORG SCI 2010

Book for applied researchers: Lusher, Koskinen, Robins ERGMs for SN, CUP, 2011

Page 12

The exponential random graph model (p*) framework

• An ERGM (p*) model is a statistical model for the ties in a network
• Independent (pairs of) ties (p1, Holland and Leinhardt, 1981; Fienberg and Wasserman, 1979, 1981)
• Markov graphs (Frank and Strauss, 1986)
• Extensions (Pattison & Wasserman, 1999; Robins, Pattison & Wasserman, 1999; Wasserman & Pattison, 1996)
• New specifications (Snijders et al., 2006; Hunter & Handcock, 2006)

Page 13

ERGMS – modelling graphs

Page 14

ERGMS – modelling graphs

• We want to model tie variables
• But structure – overall pattern – is evident
• What kind of structural elements can we incorporate in the model for the tie variables?

Page 15

ERGMS – modelling graphs

If we believe that the frequency of interaction/density is an important aspect of the network

We should include counts of the number of ties in our model

[Figure: example graph on nodes i, j, k, l]

Page 16

ERGMS – modelling graphs

If we believe that reciprocity is an important aspect of the (directed) network

We should include counts of the number of mutual ties in our model

[Figure: example graph on nodes i, j, k, l]

Page 17

ERGMS – modelling graphs

If we believe that an important aspect of the network is that two edge indicators {i,j} and {i',k} are conditionally dependent if $\{i,j\} \cap \{i',k\} \neq \emptyset$

[Figure: nodes i, j, k]

We should include counts of

• stars: degree distribution; preferential attachment, etc.
• triangles: friends meet through friends; clustering; etc.

Page 18

ERGMS – modelling graphs

If we believe that the attributes of the actors are important (selection effects, homophily, etc.)

We should include counts of

• Heterophily/homophily
• Distance/similarity

Page 19

ERGMS – modelling graphs

If we believe that (Snijders et al., 2006) two edge indicators {i,k} and {l,j} are conditionally dependent if $\{i,l\}, \{k,j\} \in E$

[Figure: nodes i, j, k, l]

clustered regions

Page 20

ERGMS – modelling graphs

$$\log \frac{\Pr(X_{ik} = 1 \mid \text{given the rest})}{\Pr(X_{ik} = 0 \mid \text{given the rest})} = \theta_1 \#(\cdot) + \theta_2 \#(\cdot) + \theta_3 \#(\cdot) + \dots$$

where each $\#(\cdot)$ counts the configurations of one type that are created by adding the edge {i,k}, e.g.:

[Figure: example graphs on nodes i, j, k, l, m, n showing the configurations created by adding the edge {i,k}]

Page 21

ERGMS – modelling graphs

The conditional formulation

$$\log \frac{\Pr(X_{ik} = 1 \mid \text{given the rest})}{\Pr(X_{ik} = 0 \mid \text{given the rest})} = \theta_1 \delta_{ik,1}(x) + \theta_2 \delta_{ik,2}(x) + \dots + \theta_p \delta_{ik,p}(x)$$

where $\delta_{ik,r}(x) = z_r(x^{+}_{ik}) - z_r(x^{-}_{ik})$ is the difference in counts of structure type r between x with the tie {i,k} present and x with it absent

May be "aggregated" for all dyads so that the model for the entire adjacency matrix can be written

$$\log \Pr(X = x) = \theta_1 z_1(x) + \theta_2 z_2(x) + \dots + \theta_p z_p(x) - \psi(\theta)$$
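The change statistics in the conditional formulation can be computed by brute force: count the structures with the tie present, count them with the tie absent, and subtract. The sketch below does this in base R for edges, 2-stars and triangles. The 5-node graph and the parameter values are hypothetical and chosen only for illustration; they are not from the workshop data.

```r
# Toy undirected graph on 5 nodes (hypothetical data, for illustration only)
x <- matrix(0, 5, 5)
edges <- rbind(c(1, 2), c(2, 3), c(1, 3), c(3, 4))
for (e in seq_len(nrow(edges))) {
  x[edges[e, 1], edges[e, 2]] <- 1
  x[edges[e, 2], edges[e, 1]] <- 1
}

# Sufficient statistics z(x): number of edges, 2-stars and triangles
z <- function(x) {
  d <- rowSums(x)
  c(edges     = sum(x) / 2,
    twostars  = sum(choose(d, 2)),
    triangles = sum(diag(x %*% x %*% x)) / 6)
}

# Change statistic: z(x with tie ik present) - z(x with tie ik absent)
delta <- function(x, i, k) {
  x_plus  <- x; x_plus[i, k]  <- x_plus[k, i]  <- 1
  x_minus <- x; x_minus[i, k] <- x_minus[k, i] <- 0
  z(x_plus) - z(x_minus)
}

theta <- c(edges = -2, twostars = 0.1, triangles = 0.5)  # hypothetical values
d_24 <- delta(x, 2, 4)              # effect of adding the tie {2,4}
log_odds_24 <- sum(theta * d_24)    # conditional log odds of the tie {2,4}
p_24 <- plogis(log_odds_24)         # conditional probability of the tie {2,4}
```

Adding the tie {2,4} here creates one edge, three 2-stars and one triangle, so the conditional log odds are -2 + 3(0.1) + 0.5 = -1.2.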

Page 22

ERGMS – modelling graphs

For a model with edges and triangles

The model for the adjacency matrix X is a weighted sum

$$\log \Pr(X = x) = \theta_1 L(x) + \tau T(x) - \psi(\theta)$$

where $L(x)$ = # edges and $T(x)$ = # triangles

The parameters $\theta_1$ and $\tau$ weight the relative importance of ties and triangles, respectively

- graphs with many triangles but not too dense are more probable than dense graphs with few triangles

Page 23

ERGMS – modelling graphs: example

Padgett’s Florentine families (Padgett and Ansell, 1993): business ties among 16 families

BusyNet <- as.matrix(read.table("PADGB.txt", header=FALSE))

Page 24

ERGMS – modelling graphs: example

Requires libraries 'sna', 'network'

library(network)
library(sna)
BusyNetNet <- network(BusyNet, directed=FALSE)
plot(BusyNetNet)

Page 25

ERGMS – modelling graphs: example

Observed

$$L(x) = \sum_{i<j} x_{ij} = 15 \quad \text{and} \quad M = \binom{16}{2} = 120$$

Density:

$$d(x) = \frac{L(x)}{M} = \frac{15}{120} = \frac{1}{8} = 0.125$$
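The same arithmetic can be checked in base R. Since PADGB.txt is not reproduced here, the sketch uses a stand-in graph (a path on 16 nodes, which also has 15 edges); this is an assumption for illustration and not the Padgett data.

```r
# Stand-in graph: a path on 16 nodes has 15 edges, like the observed network.
# It is NOT the Padgett data, only a graph with the same n and L(x).
n <- 16
x <- matrix(0, n, n)
for (i in 1:(n - 1)) x[i, i + 1] <- x[i + 1, i] <- 1

L <- sum(x[upper.tri(x)])   # L(x): sum of x_ij over pairs i < j
M <- choose(n, 2)           # number of dyads
d <- L / M                  # density
```

With the real data, the same three lines applied to BusyNet should reproduce L(x) = 15 and d(x) = 0.125 as stated on the slide.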

Page 26

ERGMS – modelling graphs: example

For a model with only edges

$$\log \Pr(X = x) = \theta_1 L(x) - \psi(\theta_1)$$

Equivalent with the model:

For each pair flip a p-coin (heads: tie, tails: no tie)

Where p is the probability the coin comes up heads

Page 27

ERGMS – modelling graphs: example

The (maximum likelihood) estimate of p is the density, here:

$$\hat{p}_{MLE} = d(x) = 0.125$$

i.e., an estimated one in 8 pairs establish a tie

For an ERGM with edges, $\log \Pr(X = x) = \theta_1 L(x) - \psi(\theta_1)$, and hence

$$p = \frac{e^{\theta_1}}{1 + e^{\theta_1}}$$

Page 28

ERGMS – modelling graphs: example

Solving

$$p = \frac{e^{\theta_1}}{1 + e^{\theta_1}}$$

for $\theta_1$, we have that the density parameter

$$\theta_1 = -\log(1/p - 1)$$

and for the MLE

$$\hat{\theta}_{1,MLE} = -\log(1/\hat{p}_{MLE} - 1) = -\log(8 - 1) \approx -1.946$$

with $\hat{p}_{MLE} = d(x) = 0.125$

Let’s check in statnet

Page 29

ERGMS – modelling graphs: example

library(ergm)
Estim1 <- ergm(BusyNetNet ~ edges)
summary(Estim1)

Annotations on the output:

Parameter corresponding to $L(x)$ = # edges: $\hat{\theta}_{1,MLE} \approx -1.946$

approx. standard error of MLE of $\theta_1$

Success: $\hat{\theta}_{1,MLE} = -\log(1/\hat{p}_{MLE} - 1)$, with $\hat{p}_{MLE} = d(x) = 0.125$
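Because the edge-only model treats the 120 dyads as independent coin flips, the estimate can also be cross-checked without statnet, as an intercept-only logistic regression on the dyads. The sketch below assumes only the counts reported above (15 ties among 120 dyads) and uses base R's glm; it is a sanity check, not a substitute for the ergm call.

```r
# Edge-only ERGM as a logistic regression on dyads
# (15 ties among 120 dyads, the counts reported above)
y <- c(rep(1, 15), rep(0, 105))
fit <- glm(y ~ 1, family = binomial)

theta_hat <- unname(coef(fit))                    # estimated density parameter
se_hat    <- sqrt(diag(vcov(fit)))[[1]]           # approximate standard error
theta_closed <- -log(1 / 0.125 - 1)               # closed form: -log(7)
se_closed    <- sqrt(1 / (120 * 0.125 * 0.875))   # from the Fisher information M p (1 - p)
```

Both routes give the same point estimate, and the standard error follows from the binomial information M p (1 - p).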

Page 30

ERGMS – modelling graphs: example

Do we believe in the model: for each pair flip a p-coin (heads/tails)?

Let’s fit a model that takes Markov dependencies into account (statnet)

$$\log \Pr(X = x) = \theta_1 L(x) + \sigma_2 S_2(x) + \sigma_3 S_3(x) + \tau T(x) - \psi(\theta)$$

where $L(x)$ = # edges, $S_2(x)$ = # 2-stars, $S_3(x)$ = # 3-stars, $T(x)$ = # triangles

Page 31

ERGMS – modelling graphs: example

Estim2 <- ergm(BusyNetNet ~ kstar(1:3) + triangles)
summary(Estim2)

Annotations on the output:

MLE of the parameters for $L(x)$ = # edges, $S_2(x)$ = # 2-stars, $S_3(x)$ = # 3-stars, $T(x)$ = # triangles

approx. standard error of MLE

Page 32

Part 3

Estimation

Page 33

Likelihood equations for exponential fam

"Aggregated" to a joint model for entire adjacency matrix X

$$\log \Pr(X = x) = \theta_1 z_1(x) + \theta_2 z_2(x) + \dots + \theta_p z_p(x) - \psi(\theta)$$

The MLE solves the equation (cf. Lehmann, 1983):

$$E_{\hat{\theta}_{MLE}}\{z(X)\} = z(x_{obs})$$

Sum over all $2^{n(n-1)/2}$ graphs
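Why the MLE solves this moment equation can be seen in one step by differentiating the log-likelihood. The block below is a short derivation under the assumption that the normalising term is $\psi(\theta) = \log \sum_y \exp\{\theta^{T} z(y)\}$, with the sum running over all graphs y on n nodes.

```latex
\begin{align*}
\ell(\theta) &= \theta^{T} z(x_{obs}) - \psi(\theta),
\qquad \psi(\theta) = \log \sum_{y} \exp\{\theta^{T} z(y)\} \\
\frac{\partial \ell(\theta)}{\partial \theta_k}
  &= z_k(x_{obs}) - \sum_{y} z_k(y)\, \exp\{\theta^{T} z(y) - \psi(\theta)\} \\
  &= z_k(x_{obs}) - E_{\theta}\{z_k(X)\}
\end{align*}
```

Setting every partial derivative to zero gives $E_{\hat{\theta}}\{z(X)\} = z(x_{obs})$; the difficulty is that the expectation is a sum over all $2^{n(n-1)/2}$ graphs.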

Page 34

Likelihood equations for exponential fam

Solving $E_{\hat{\theta}_{MLE}}\{z(X)\} = z(x_{obs})$

• Using the cumulant generating function (Corander, Dahmström, and Dahmström, 1998)

• Stochastic approximation (Snijders, 2002, based on Robbins-Monro, 1951)

• Importance sampling (Handcock, 2003; Hunter and Handcock, 2006, based on Geyer-Thompson 1992)

Page 35

Robbins-Monro algorithm

Solving $E_{\hat{\theta}_{MLE}}\{z(X)\} = z(x_{obs})$

Snijders, 2002, algorithm

- Initialisation phase

- Main estimation

- Convergence check and calculation of standard errors

MAIN:

$$\theta^{(m+1)} = \theta^{(m)} - a_r D_0^{-1} \{ z(x^{(m)}) - z(x_{obs}) \}$$

Draw $x^{(m)}$ using MCMC

Page 36

Robbins-Monro algorithm

Parameter space Network space

We can simulate the distribution of graphs associated with any given set of parameters.

Observed data

A set of parameter values

In this case, the distribution of graphs from the parameter values is not consistent with the observed data

Page 37

Robbins-Monro algorithm

Parameter space Network space

We want to find the parameter estimates for which the observed graph is central in the distribution of graphs

Page 38

Robbins-Monro algorithm

Parameter space Network space

So:
1. compare distribution of graphs with observed data
2. adjust parameter values slightly to get the distribution closer to the observed data
3. keep doing this until the estimates converge to the right point

Observed data

Based on this discrepancy, adjust parameter values

Page 39

Robbins-Monro algorithm

Over many iterations we walk through the parameter space until we find the right estimates

Parameter space Network space

Page 40

Robbins-Monro algorithm

Phase 1, Initialisation phase

Find good values of the initial parameter state $\theta^{(0)}$

And the scaling matrix $D_0$

(use the score-based method, Schweinberger & Snijders, 2006)

Page 41

Robbins-Monro algorithm

Phase 2, Main estimation phase

Iteratively update θ

$$\theta^{(m+1)} = \theta^{(m)} - a_r D_0^{-1} \{ z(x^{(m)}) - z(x_{obs}) \}$$

by drawing one realisation $x^{(m)}$ from the model defined by the current $\theta^{(m)}$

Repeated in sub-phases with fixed $a_r$
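The update rule can be tried out on the edge-only model, where a graph statistic can be drawn exactly, so no MCMC is needed. The following base R sketch is only an illustration of the iteration: the gain sequence, the sub-phase lengths and the choice of D0 (variance of the edge count at the starting value) are assumptions made here, not the settings used in Snijders (2002) or in PNet.

```r
set.seed(1)
M     <- 120          # dyads in a 16-node undirected graph
z_obs <- 15           # observed number of edges

# Exact draw of the edge count; possible only because ties are independent here
draw_z <- function(theta) rbinom(1, M, plogis(theta))

theta <- 0                                        # initial parameter state
D0 <- M * plogis(theta) * (1 - plogis(theta))     # scaling: Var of z at the start
for (r in 1:4) {                                  # sub-phases with fixed gain a_r
  a_r  <- 0.4 / 2^(r - 1)                         # assumed gain sequence
  n_it <- 200 * 2^(r - 1)                         # assumed sub-phase lengths
  path <- numeric(n_it)
  for (m in seq_len(n_it)) {
    theta   <- theta - a_r * (draw_z(theta) - z_obs) / D0
    path[m] <- theta
  }
  theta <- mean(path)                             # next sub-phase starts at the average
}
theta_hat <- theta
```

The final value should land close to the closed-form MLE -log(7) found earlier; for models with dependence, draw_z would have to be replaced by an MCMC draw.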

Page 42

[Figure: four trace plots against iterations (200 to 1400): edge parameter $\theta_1^{(m)}$; alternating triangle parameter $\theta_2^{(m)}$; $z_1(x^{(m)}) - z_1(x_{obs})$; $z_2(x^{(m)}) - z_2(x_{obs})$]

Page 43

Robbins-Monro algorithm

Phase 2, Main estimation phase

Relies on us being able to draw one realisation $x^{(m)}$ from the ERGM defined by the current θ

We can NOT do this directly

We have to simulate x

More specifically use Markov chain Monte Carlo

Page 44

Robbins-Monro algorithm

What do we need to know about MCMC?

Method:

Generate a sequence of graphs $x^{(0)}, x^{(1)}, x^{(2)}, \dots$

for arbitrary $x^{(0)}$, using an updating rule… so that the distribution of $x^{(t)}$ approaches the ERGM defined by θ as $t \to \infty$
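One common updating rule is a Metropolis step that proposes to toggle a single, randomly chosen dyad and accepts with a probability determined by the change statistics. The base R sketch below assumes this toggle proposal and a model with edges and triangles; it is not the sampler implemented in PNet or statnet. The triangle weight is set to 0 so that the result can be checked against the p-coin model.

```r
set.seed(2)
n     <- 16
theta <- c(edges = qlogis(0.125), triangles = 0)  # triangle weight 0: plain p-coin model

# Change statistics (edges, triangles) for the dyad {i, j}:
# one edge, and one triangle per common neighbour of i and j
delta <- function(x, i, j) c(1, sum(x[i, ] * x[j, ]))

x <- matrix(0, n, n)            # arbitrary starting graph x^(0): the empty graph
n_steps <- 20000
edge_path <- numeric(n_steps)
for (t in seq_len(n_steps)) {
  ij <- sample(n, 2)            # propose toggling one randomly chosen dyad
  i <- ij[1]; j <- ij[2]
  log_ratio <- sum(theta * delta(x, i, j))
  if (x[i, j] == 1) log_ratio <- -log_ratio   # removing a tie reverses the sign
  if (log(runif(1)) < log_ratio) {
    x[i, j] <- x[j, i] <- 1 - x[i, j]         # accept: toggle the tie
  }
  edge_path[t] <- sum(x) / 2
}
mean_edges <- mean(edge_path[-(1:2000)])      # discard burn-in
```

With the triangle weight at 0 the long-run average number of edges should be near 120 × 0.125 = 15; a non-zero triangle weight makes the ties dependent, and the same loop still applies.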

Page 45

Robbins-Monro algorithm

What do we need to know about MCMC?

So if we generate an infinite number of graphs in the “right” way we have the ONE draw we need to update θ once?

Typically we can’t wait an infinite amount of time so we settle for a very large number of iterations

In PNet, “very large” is set by the multiplication factor

Page 46

Robbins-Monro algorithm

Phase 3, Convergence check and calculating standard errors

At the end of phase 2 we always get a value $\hat{\theta}$

But is it the MLE?

Does it satisfy $E_{\hat{\theta}}\{z(X)\} = z(x_{obs})$?

Page 47

Robbins-Monro algorithm

Phase 3, Convergence check and calculating standard errors

Phase 3 simulates a large number of graphs

And checks if

A minor discrepancy - due to numerical inaccuracy - is acceptable

Convergence statistics:

Page 48

Geyer-Thompson

Solving $E_{\hat{\theta}_{MLE}}\{z(X)\} = z(x_{obs})$

Handcock, 2003, approximate Fisher scoring

MAIN:

$$\theta^{(g)} = \theta^{(g-1)} - I(\theta^{(g-1)})^{-1} \Big\{ \sum_{m=1}^{M} w^{(m)} z(x^{(m)}) - z(x_{obs}) \Big\}$$

The weighted sum is approximated using importance sample from MCMC

Page 49

Bayes: dealing with likelihood

The normalising constant of the posterior not essential for Bayesian inference, all we need is:

$$\pi(\theta \mid x) = \frac{\pi(x; \theta)\,\pi(\theta)}{\int \pi(x; \theta)\,\pi(\theta)\,d\theta} \propto \pi(x; \theta)\,\pi(\theta)$$

… but

$$\pi(x; \theta) = \frac{\exp\{\sum_{k=1}^{p} \theta_k z_k(x)\}}{\sum_{y} \exp\{\sum_{k=1}^{p} \theta_k z_k(y)\}}$$

Sum over all $2^{n(n-1)/2}$ graphs

Page 50

Bayes: MCMC?

Consequently, in e.g. Metropolis-Hastings, acceptance probability of move to θ*

$$\min\Big\{1, \frac{\pi(\theta^* \mid x)\, q_{prop}(\theta \mid \theta^*)}{\pi(\theta \mid x)\, q_{prop}(\theta^* \mid \theta)}\Big\} = \min\Big\{1, \frac{\pi(x; \theta^*)\,\pi(\theta^*)\, q_{prop}(\theta \mid \theta^*)}{\pi(x; \theta)\,\pi(\theta)\, q_{prop}(\theta^* \mid \theta)}\Big\}$$

… which contains

$$\frac{\sum_{y} \exp\{\sum_{k=1}^{p} \theta_k z_k(y)\}}{\sum_{y} \exp\{\sum_{k=1}^{p} \theta^*_k z_k(y)\}}$$

Page 51

Bayes: Linked Importance Sampler Auxiliary Variable MCMC

LISA (Koskinen, 2008; Koskinen, Robins & Pattison, 2010): Based on Møller et al. (2006), we define an auxiliary variable

And produce draws from the joint posterior of the parameters and the auxiliary variable

[Equation: joint posterior density given $x_{obs}$]

using the proposal distributions $\theta^* \mid \theta^{(t)} \sim N(\theta^{(t)}, \Sigma)$ and a proposal for the auxiliary variable given $\theta^*$

[Equation: proposal distribution of the auxiliary variable]

Page 52

Bayes: alternative auxiliary variable

LISA (Koskinen, 2008; Koskinen, Robins & Pattison, 2010): Based on Møller et al. (2006), we define an auxiliary variable

[Figure: many linked chains]

Many linked chains:
- Computation time
- Storage (memory and time issues)

Improvement: use exchange algorithm (Murray et al. 2006)

$\theta^* \mid \theta^{(t)} \sim N(\theta^{(t)}, \Sigma)$ and $x^* \mid \theta^* \sim \mathrm{ERGM}(\theta^*)$

Accept θ* with log-probability: $\min\{0, (\theta - \theta^*)^{T} (z(x^*) - z(x_{obs}))\}$

Caimo & Friel, 2011
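The reason the exchange step avoids the intractable sum can be written out in two lines. The derivation below assumes a symmetric proposal for θ* and a flat prior, so that only likelihood terms remain; $c(\theta) = \sum_y \exp\{\theta^{T} z(y)\}$ denotes the normalising constant.

```latex
\begin{align*}
\frac{\pi(x_{obs}; \theta^*)\, \pi(x^*; \theta)}{\pi(x_{obs}; \theta)\, \pi(x^*; \theta^*)}
 &= \frac{\exp\{\theta^{*T} z(x_{obs})\} / c(\theta^*) \;\cdot\; \exp\{\theta^{T} z(x^*)\} / c(\theta)}
         {\exp\{\theta^{T} z(x_{obs})\} / c(\theta) \;\cdot\; \exp\{\theta^{*T} z(x^*)\} / c(\theta^*)} \\
 &= \exp\{(\theta - \theta^*)^{T} (z(x^*) - z(x_{obs}))\}
\end{align*}
```

Both $c(\theta)$ and $c(\theta^*)$ appear once in the numerator and once in the denominator, so they cancel; taking logs and capping at 0 gives the acceptance log-probability stated above.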

Page 53

Bayes: Implications of using alternative auxiliary variable

Improvement: use exchange algorithm (Murray et al. 2006)

$\theta^* \mid \theta^{(t)} \sim N(\theta^{(t)}, \Sigma)$ and $x^* \mid \theta^* \sim \mathrm{ERGM}(\theta^*)$

Accept θ* with log-probability: $\min\{0, (\theta - \theta^*)^{T} (z(x^*) - z(x_{obs}))\}$

Caimo & Friel, 2011

• Storing only parameters
• No pre tuning – no need for good initial values
• Standard MCMC properties of sampler
• Less sensitive to near degeneracy in estimation
• Easier than anything else to implement

QUICK and ROBUST
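How little is needed to implement the exchange step can be seen on the edge-only model. The base R sketch below is a toy version under stated assumptions: a flat prior, a normal random-walk proposal with standard deviation 0.5 (chosen arbitrarily), and an exact auxiliary draw of the edge count in place of the MCMC draw that a model with dependence would need. It is not the Caimo & Friel implementation.

```r
set.seed(3)
M     <- 120          # dyads in a 16-node undirected graph
z_obs <- 15           # observed number of edges

n_iter <- 6000
theta  <- numeric(n_iter)        # chain starts at theta = 0
for (t in 2:n_iter) {
  theta_star <- rnorm(1, theta[t - 1], 0.5)        # symmetric random-walk proposal
  z_star     <- rbinom(1, M, plogis(theta_star))   # auxiliary draw: z(x*) with x* ~ ERGM(theta*)
  log_acc    <- min(0, (theta[t - 1] - theta_star) * (z_star - z_obs))
  theta[t]   <- if (log(runif(1)) < log_acc) theta_star else theta[t - 1]
}
post_mean <- mean(theta[-(1:1000)])   # posterior mean after burn-in
```

Only the parameter chain is stored, and the posterior mean should sit close to the MLE of the density parameter found earlier.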

Page 54

Part 4

Interpretation of effects

Page 55

Problem with Markov models

Markov dependence assumption: two edge indicators {i,j} and {i',k} are conditionally dependent if $\{i,j\} \cap \{i',k\} \neq \emptyset$

[Figure: nodes i, j, k]

We have shown that the only effects are:

• stars: degree distribution; preferential attachment, etc.
• triangles: friends meet through friends; clustering; etc.

Page 56

Problem with Markov models

Often for Markov model

• stars: degree distribution; preferential attachment, etc.
• triangles: friends meet through friends; clustering; etc.

Matching $E_{\hat{\theta}_{MLE}}\{z(X)\} = z(x_{obs})$ is hard or impossible!

Page 57

Problem with Markov models

Matching $E_{\hat{\theta}_{MLE}}\{z(X)\} = z(x_{obs})$ is hard or impossible!

[Figure: # triangles against # edges, or some statistic k]

Page 58

Problem with Markov models

Matching $E_{\hat{\theta}_{MLE}}\{z(X)\} = z(x_{obs})$ is hard

[Figure: distribution of # triangles; x-axis: number of triangles, 0 to 400]

The number of triangles for a Markov model with edges (-3), 2-stars (.5), 3-stars (-.2) and triangles (1.1524)

Page 59

Problem with Markov models

Matching $E_{\hat{\theta}_{MLE}}\{z(X)\} = z(x_{obs})$ is hard

[Figure: two panels against the triangle parameter (0 to 1.5): number of edges (50 to 140) and number of triangles (0 to 400)]

Exp. # triangles and edges for a Markov model with edges, 2-stars, 3-stars and triangles

Page 60

Problem with Markov models

Matching $E_{\hat{\theta}_{MLE}}\{z(X)\} = z(x_{obs})$ impossible!

[Figure: # 2-stars against # edges]

If for some statistic k

Similarly for max (or conditional min/max), e.g. a graph on 7 vertices with 5 edges

Implies:

Page 61

Problem with Markov models

First solutions (Snijders et al., 2006)

Markov:

adding k-star adds (k-1) stars

alternating sign compensates but eventually zk(x)=0

Alternating stars:

Restriction on star parameters - Alternating sign

prevents explosion, and

models degree distribution

Page 62

Problem with Markov models

First solutions (Snijders et al., 2006)

Markov:

adding k-star adds (k-1) stars

alternating sign compensates but eventually zk(x)=0

$L(x)$ = # edges, $S_2(x)$ = # 2-stars, $S_3(x)$ = # 3-stars, $T(x)$ = # triangles

Page 63

Problem with Markov models

First solutions (Snijders et al., 2006)

Alternating stars:

Restriction on star parameters - Alternating sign

prevents explosion, and

models degree distribution

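Include all stars but restrict parameters:

$\sigma_3 = -\sigma_2/\lambda$, $\sigma_4 = -\sigma_3/\lambda$, $\sigma_5 = -\sigma_4/\lambda$, …

$$\sigma_2 S_2(x) + \sigma_3 S_3(x) + \dots + \sigma_{n-1} S_{n-1}(x) = \sigma_{AKS}\, AKS(x; \lambda) \quad \text{(new)}$$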

Page 64

Problem with Markov models

First solutions (Snijders et al., 2006)

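Include all stars but restrict parameters:

$\sigma_3 = -\sigma_2/\lambda$, $\sigma_4 = -\sigma_3/\lambda$, $\sigma_5 = -\sigma_4/\lambda$, …

$$\sigma_2 S_2(x) + \sigma_3 S_3(x) + \dots + \sigma_{n-1} S_{n-1}(x) = \sigma_{AKS}\, AKS(x; \lambda) \quad \text{(new)}$$

Expressed in terms of degree distribution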
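That the alternating star statistic depends on the graph only through its degrees can be checked numerically. The base R sketch below assumes the definition AKS(x; λ) = S_2 - S_3/λ + S_4/λ² - …, which is what the parameter restriction above implies, and compares it with the closed form in the degrees obtained from the binomial theorem. The 6-node graph and λ = 2 are hypothetical choices for illustration.

```r
# Alternating k-star statistic from the star counts S_k = sum_i choose(d_i, k)
aks_from_stars <- function(x, lambda) {
  d <- rowSums(x); n <- nrow(x)
  k <- 2:(n - 1)
  S <- sapply(k, function(kk) sum(choose(d, kk)))   # S_k: number of k-stars
  sum((-1)^k * S / lambda^(k - 2))
}

# Same statistic written directly in terms of the degrees d_i
aks_from_degrees <- function(x, lambda) {
  d <- rowSums(x)
  lambda^2 * sum((1 - 1 / lambda)^d - 1) + lambda * sum(d)
}

# Toy graph on 6 nodes (hypothetical data)
x <- matrix(0, 6, 6)
el <- rbind(c(1, 2), c(1, 3), c(1, 4), c(1, 5), c(2, 3), c(5, 6))
for (e in seq_len(nrow(el))) x[el[e, 1], el[e, 2]] <- x[el[e, 2], el[e, 1]] <- 1

aks_a <- aks_from_stars(x, lambda = 2)
aks_b <- aks_from_degrees(x, lambda = 2)
```

For this graph S_2 = 9, S_3 = 4 and S_4 = 1, so both routes give 9 - 4/2 + 1/4 = 7.25.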

Page 65

Problem with Markov models

First solutions (Snijders et al., 2006)

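Include all stars but restrict parameters:

$\sigma_3 = -\sigma_2/\lambda$, $\sigma_4 = -\sigma_3/\lambda$, $\sigma_5 = -\sigma_4/\lambda$, …

$$\sigma_2 S_2(x) + \sigma_3 S_3(x) + \dots + \sigma_{n-1} S_{n-1}(x) = \sigma_{AKS}\, AKS(x; \lambda)$$

Interpretation:

Positive parameter (λ ≥ 1) – graphs with some high degree nodes and larger degree variance more likely than graphs with more homogeneous degree distribution

Negative parameter (λ ≥ 1) – the converse…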

Page 66

Problem with Markov models

Markov:

triangles evenly spread out

but one edge can add many triangles…

Alternating triangles:

• Restrictions on different order triangles – alternating sign

• Prevents explosion, and

• Models multiply clustered regions

• Social circuit dependence assumption

First solutions (Snijders et al., 2006)

Page 67

Problem with Markov models

Markov:

triangles evenly spread out

but one edge can add many triangles…

First solutions (Snijders et al., 2006)

k-triangles:

1-triangles: 2-triangles: 3-triangles:

Page 68

Problem with Markov models

Alternating triangles:

• Restrictions on different order triangles – alternating sign

• Prevents explosion, and

• Models multiply clustered regions

First solutions

Weigh the k-triangles:

$$\tau_2 T_2(x) + \tau_3 T_3(x) + \dots + \tau_{n-2} T_{n-2}(x) = \tau_{AKT}\, AKT(x; \lambda)$$

Where: $\tau_{k+1} = -\tau_k/\lambda$

Page 69

Problem with Markov models

Alternating triangles:

• Restrictions on different order triangles – alternating sign

• Prevents explosion, and

• Models multiply clustered regions

First solutions

Underlying assumption: two edge indicators {i,k} and {l,j} are conditionally dependent if $\{i,l\}, \{k,j\} \in E$

[Figure: nodes i, j, k, l]

Social circuit dependence assumption

Page 70

Problem with Markov models

Alternative interpretation

We may (geometrically) weight together :

i j

Page 71

Problem with Markov models

Alternative interpretation

We may also define the Edgewise Shared Partner Statistic:

[Figure: nodes i, j with shared partners]

… and we can weigh together the ESP statistics using geometrically decreasing weights: GWESP
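The two views of the clustering statistic, alternating k-triangles and geometrically weighted edgewise shared partners, can be compared numerically. The base R sketch below assumes the alternating form 3T_1 - T_2/λ + T_3/λ² - … (the form shown for GWESP in the Lazega example of this deck) and the closed form λ Σ [1 - (1 - 1/λ)^{sp_ij}] over tied pairs, where sp_ij is the number of partners shared by i and j. The 5-node graph and λ = 2 are hypothetical.

```r
# Edgewise shared partners: number of common neighbours for each tied pair i < j
esp <- function(x) {
  sp <- x %*% x                 # sp[i, j]: number of common neighbours of i and j
  sp[upper.tri(x) & x == 1]     # keep only pairs i < j that are tied
}

# Alternating sum over k-triangle counts
akt_from_ktriangles <- function(x, lambda) {
  s <- esp(x); n <- nrow(x)
  k <- 1:(n - 2)
  Tk <- sapply(k, function(kk) sum(choose(s, kk)))  # Tk[1] equals 3 * (# triangles)
  sum((-1)^(k - 1) * Tk / lambda^(k - 1))
}

# Geometrically weighted form in the shared partner counts
gwesp <- function(x, lambda) {
  s <- esp(x)
  lambda * sum(1 - (1 - 1 / lambda)^s)
}

# Toy graph on 5 nodes (hypothetical data): two triangles sharing the edge {1,2}
x <- matrix(0, 5, 5)
el <- rbind(c(1, 2), c(1, 3), c(2, 3), c(1, 4), c(2, 4), c(4, 5))
for (e in seq_len(nrow(el))) x[el[e, 1], el[e, 2]] <- x[el[e, 2], el[e, 1]] <- 1

akt_val   <- akt_from_ktriangles(x, lambda = 2)
gwesp_val <- gwesp(x, lambda = 2)
```

Here there are two triangles and one 2-triangle, so the alternating sum is 3(2) - 1/2 = 5.5, and the geometrically weighted form gives the same value.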

Page 72

Part 4b

Example Lazega’s law firm partners

Page 73

Bayesian Data Augmentation

Lazega’s (2001) Lawyers

Collaboration network among 36 lawyers in a New England law firm (Lazega, 2001)

[Figure: network plot; legend: Boston office, Hartford office, Providence office; least senior to most senior]

Page 74

Bayesian Data Augmentation

Lazega’s (2001) Lawyers

$$\log \Pr(X = x) = \theta_1 z_1(x) + \theta_2 z_2(x) + \dots + \theta_p z_p(x) - \psi(\theta)$$

Fit a model with "new specifications" and covariates

Edges: $\sum_{i<j} x_{ij}$

Main effect
  Seniority: $\sum_{i<j} x_{ij}(a_i + a_j)$
  Practice: $\sum_{i<j} x_{ij}(b_i + b_j)$

Homophily
  Practice: $\sum_{i<j} x_{ij}\, 1(b_i = b_j)$
  Sex: $\sum_{i<j} x_{ij}\, 1(c_i = c_j)$
  Office: $\sum_{i<j} x_{ij}\, 1(d_i = d_j)$

GWESP: $3 t_1(x) - \dfrac{t_2(x)}{\lambda} + \dots + (-1)^{n-3} \dfrac{t_{n-2}(x)}{\lambda^{n-3}}$

Page 75

Bayesian Data Augmentation

Lazega’s (2001) Lawyers

Fit a model with ”new specifications” and covariates

PNet:

Page 76

Bayesian Data Augmentation

Lazega’s (2001) Lawyers

Fit a model with "new specifications" and covariates

Edges: $\sum_{i<j} x_{ij}$

Main effect
  Seniority: $\sum_{i<j} x_{ij}(a_i + a_j)$
  Practice: $\sum_{i<j} x_{ij}(b_i + b_j)$

Homophily
  Practice: $\sum_{i<j} x_{ij}\, 1(b_i = b_j)$
  Sex: $\sum_{i<j} x_{ij}\, 1(c_i = c_j)$
  Office: $\sum_{i<j} x_{ij}\, 1(d_i = d_j)$

GWESP: $3 t_1(x) - \dfrac{t_2(x)}{\lambda} + \dots + (-1)^{n-3} \dfrac{t_{n-2}(x)}{\lambda^{n-3}}$

Page 77

Part 4c

Interpreting attribute-related effects

Page 78

Bayesian Data Augmentation

Fitting an ERGM in Pnet: a business communications network

[Figure: network with nodes marked office or sales; effect labels: R for attribute, Rb for attribute]

For "wwbusiness.txt" we have recorded whether the employee works in the central office or is a travelling sales representative

Page 79: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

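Fitting an ERGM in PNet: a business communications network

[Figure: network with actors marked as office or sales; PNet effects “R for attribute” and “Rb for attribute”]

Consider a dyad-independent model

$$\log \Pr(X_{ij} = x_{ij}) = \theta_1 x_{ij} + \theta_R\, x_{ij}(\mathrm{OFF}_i + \mathrm{OFF}_j) + \theta_{Rb}\, x_{ij}\,\mathrm{OFF}_i\,\mathrm{OFF}_j - \psi(\theta)$$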

Page 80: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

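Fitting an ERGM in PNet: a business communications network

With log odds

$$\log \frac{\Pr(X_{ij} = 1)}{\Pr(X_{ij} = 0)} = \theta_1 + \theta_R(\mathrm{OFF}_i + \mathrm{OFF}_j) + \theta_{Rb}\,\mathrm{OFF}_i\,\mathrm{OFF}_j$$

Log odds $\log\{\Pr(X_{ij} = 1)/\Pr(X_{ij} = 0)\}$ by attribute values:

               OFF_j = 1                              OFF_j = 0
OFF_i = 1      $\theta_1 + 2\theta_R + \theta_{Rb}$   $\theta_1 + \theta_R$
OFF_i = 0      $\theta_1 + \theta_R$                  $\theta_1$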
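The log odds for the four kinds of dyad can be turned into tie probabilities with the inverse logit. The sketch below is only an illustration: the function name and the parameter values are hypothetical and are not estimates from “wwbusiness.txt”.

```python
import math

def tie_probability(off_i, off_j, theta_1, theta_r, theta_rb):
    """Tie probability for one dyad under the dyad-independent model."""
    log_odds = theta_1 + theta_r * (off_i + off_j) + theta_rb * off_i * off_j
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical parameter values, chosen only for illustration.
theta_1, theta_r, theta_rb = -2.0, 0.5, 1.0

for off_i in (1, 0):
    for off_j in (1, 0):
        p = tie_probability(off_i, off_j, theta_1, theta_r, theta_rb)
        print(f"OFF_i={off_i}, OFF_j={off_j}: Pr(X_ij = 1) = {p:.3f}")
```

With these made-up values the log odds for two office workers are $-2 + 2(0.5) + 1 = 0$, so their tie probability is one half.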

Page 81: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Part 4d

Interpreting higher order effects

Page 82: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Unpacking the alternating star effect

Alternating stars are a way of

- “fixing” the Markov problems (models all degrees)

- controlling for paths in clustering

$$\theta_2 S_2(x) + \theta_3 S_3(x) + \cdots + \theta_{n-1} S_{n-1}(x) \;\longrightarrow\; \theta_{AKS}\, AKS(x; \lambda)$$

k-triangles measure clustering…

[Figure: 1-triangles, 2-triangles, 3-triangles]

Page 83: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Unpacking the alternating star effect

Alternating stars are a way of

- “fixing” the Markov problems (models all degrees)

- controlling for paths in clustering

$$\theta_2 S_2(x) + \theta_3 S_3(x) + \cdots + \theta_{n-1} S_{n-1}(x) \;\longrightarrow\; \theta_{AKS}\, AKS(x; \lambda)$$

Is it closure or an artefact of many stars/2-paths?

[Figure: 1-triangles, 2-triangles, 3-triangles]
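As a rough illustration of how a single alternating star statistic “models all degrees”, the sketch below assumes the usual definition $AKS(x;\lambda) = \sum_{k \ge 2} (-1)^k S_k(x)/\lambda^{k-2}$ with $S_k(x) = \sum_i \binom{d_i}{k}$, and checks it against the closed form in the degrees that follows from the binomial theorem. The function names and the degree sequence are hypothetical.

```python
from math import comb

def k_star_count(degrees, k):
    """S_k(x): number of k-stars, the sum over nodes of C(d_i, k)."""
    return sum(comb(d, k) for d in degrees)

def alternating_k_star(degrees, lam):
    """AKS(x; lambda) as an alternating sum of k-star counts.

    Terms with k above the largest degree are zero, so the sum stops there.
    """
    return sum((-1) ** k * k_star_count(degrees, k) / lam ** (k - 2)
               for k in range(2, max(degrees) + 1))

def alternating_k_star_from_degrees(degrees, lam):
    """The same statistic written directly in terms of the degrees."""
    return lam ** 2 * sum((1 - 1 / lam) ** d + d / lam - 1 for d in degrees)

degrees = [3, 2, 2, 1]  # hypothetical degree sequence of a 4-node graph
print(alternating_k_star(degrees, lam=2.0))
print(alternating_k_star_from_degrees(degrees, lam=2.0))
```

Both functions give the same value, which shows that the statistic depends on the graph only through its degree distribution.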

Page 84: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Unpacking the alternating star effect

Interpreting the alternating star parameter:

centralization – +

The variance of the degree distribution measures centralization

Page 85: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Unpacking the alternating star effect

Interpreting the alternating star parameter:

centralization – +

alternating star parameter – +

Page 86: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Unpacking the alternating star effect

Statistics for graph (n = 16); fixed density; alt trian: 1.17

alternating star parameter – +

Page 87: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Unpacking the alternating star effect

Statistics for graph (n = 16); fixed density; alt trian: 1.17

alternating star parameter – +

Page 88: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Unpacking the alternating star effect

Graphs (n = 16); fixed density; alt trian: 1.17

alternating star parameter – +

Page 89: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Problem with Markov models

Alt. stars in terms of the degree distribution

Note also the influence of isolates:

Page 90: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Problem with Markov models

Note also the influence of isolates:

alternating star parameter – +

Page 91: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Problem with Markov models

Note also the influence of isolates:

This is because we impose a particular shape on the degree distribution

);()()()( 113322 xAKSxSxSxS AKSnn

Page 92: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Problem with Markov models

The degree distribution for different values of the alternating star parameter:

[Figure: degree distributions; vertical axis: alternating star parameter, increasing towards +]

Page 93: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Unpacking the alternating triangle effect

Alternating triangles measure clustering

We may also define the Edgewise Shared Partner (ESP) statistic:

[Figure: tied pair i, j and their shared partners]

… and we can weigh together the ESP statistics using geometrically decreasing weights: GWESP
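A minimal sketch of the two steps just described: count, for every tie, how many partners its endpoints share, then combine the counts with geometrically decreasing weights. It assumes the common form $e^{\alpha}\sum_k \{1 - (1 - e^{-\alpha})^k\}\,ESP_k(x)$; the function names, the decay value and the toy graph are hypothetical.

```python
import math

def esp_distribution(n, edges):
    """Return {k: number of ties whose endpoints share exactly k partners}."""
    neighbours = {i: set() for i in range(n)}
    for i, j in edges:
        neighbours[i].add(j)
        neighbours[j].add(i)
    esp = {}
    for i, j in edges:
        k = len(neighbours[i] & neighbours[j])
        esp[k] = esp.get(k, 0) + 1
    return esp

def gwesp(n, edges, alpha):
    """Geometrically weighted edgewise shared partner statistic."""
    esp = esp_distribution(n, edges)
    return math.exp(alpha) * sum(
        (1 - (1 - math.exp(-alpha)) ** k) * count for k, count in esp.items())

# Toy graph: a triangle 0-1-2 with a pendant tie 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
print(esp_distribution(4, edges))
print(gwesp(4, edges, alpha=0.7))
```

In this toy graph every tie of the triangle has exactly one shared partner, so each contributes weight one whatever the decay; the weights only start to discount once ties have two or more shared partners.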

Page 94: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Unpacking the alternating triangle effect

Statistics for graph (n = 16); fixed density; alt star: 2.37

alternating Triangle parameter – +

Page 95: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Unpacking the alternating triangle effect

Statistics for graph (n = 16); fixed density; alt star: 2.37

alternating Triangle parameter – +

Page 96: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Unpacking the alternating triangle effect

Statistics for graph (n = 16); fixed density; alt star: 2.37

alternating Triangle parameter – +

[Figure: vertical axis: clustering (+); points marked closed and open]

Page 97: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Unpacking the alternating triangle effect

Statistics for graph (n = 16); fixed density; alt star: 2.37

alternating Triangle parameter – +

[Figure: vertical axis: clustering (+); points marked closed and open]

Page 98: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Unpacking the alternating triangle effect

Alternating triangles model multiply clustered areas

For multiply clustered areas triangles stick together

We model how many others tied actors have

Edgewise Shared Partner Statistic:

i j

Page 99: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Unpacking the alternating triangle effect

Statistics for graph (n = 16); fixed density; alt star: 2.37

[Figure: horizontal axis: # edgewise shared partners (– to +); vertical axis: alternating triangle parameter (+)]

Page 100: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Part 5

Convergence check and goodness of fit

Page 101: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Revisiting the Florentine families

$L(x)$ = # edges, $S_2(x)$ = # 2-stars, $S_3(x)$ = # 3-stars, $T(x)$ = # triangles

PNet

statnet

Why difference?

Page 102: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Revisiting the Florentine families

$L(x)$ = # edges, $S_2(x)$ = # 2-stars, $S_3(x)$ = # 3-stars, $T(x)$ = # triangles

PNet checks convergence in the 3rd phase:

$$E_{\hat\theta_{MLE}}\{z(X)\} = z(x_{obs})$$
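The moment equation can be checked numerically by simulating graphs at the estimate and comparing the simulated statistics with the observed one. The sketch below computes a t-ratio for a single statistic; the numbers are invented, and the 0.1 threshold is an assumption based on the rule of thumb commonly quoted for PNet, not something stated on the slide.

```python
from statistics import mean, stdev

def t_ratio(observed, simulated):
    """(mean of simulated statistic - observed statistic) / standard deviation."""
    return (mean(simulated) - observed) / stdev(simulated)

# Hypothetical numbers: observed 2-star count and counts in simulated graphs.
observed_two_stars = 47
simulated_two_stars = [44, 49, 46, 50, 45, 48, 47, 46, 49, 47]

t = t_ratio(observed_two_stars, simulated_two_stars)
converged = abs(t) < 0.1  # assumed threshold
print(round(t, 3), converged)
```

In practice the check is applied to every statistic in the model, and the estimation is rerun if any of them fails.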

Page 103: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Revisiting the Florentine families

$L(x)$ = # edges, $S_2(x)$ = # 2-stars, $S_3(x)$ = # 3-stars, $T(x)$ = # triangles

Let’s check for statnet

$$E_{\hat\theta_{MLE}}\{z(X)\} = z(x_{obs})$$

Page 104: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

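$L(x)$ = # edges, $S_2(x)$ = # 2-stars, $S_3(x)$ = # 3-stars, $T(x)$ = # triangles

library(ergm)  # statnet's ERGM package
# BusyNetNet is assumed to be a network object created earlier in the session
Estim3 <- ergm(BusyNetNet ~ kstar(1:3) + triangles, verbose = TRUE)
mcmc.diagnostics(Estim3)  # inspect the MCMC sample statistics for convergence

$$E_{\hat\theta_{MLE}}\{z(X)\} \overset{?}{=} z(x_{obs})$$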

Page 105: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Simulation, GOF, and Problems

Given a specific model we can simulate potential outcomes under that model. This is used for

• Estimation: (a) to match observed statistics and expected statistics; (b) to check that we have “the solution”

• GOF: to check whether the model can replicate features of the data that we have not explicitly modelled

• Investigating the behaviour of the model, e.g. degeneracy and dependence on scale
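To make “simulate potential outcomes under that model” concrete, here is a minimal sketch of a single-tie-toggle Metropolis sampler for an undirected edge + triangle model. It is a generic textbook sampler written for illustration, not the algorithm implemented in PNet or statnet, and all names and parameter values are hypothetical.

```python
import math
import random
from itertools import combinations

def change_statistics(adj, i, j):
    """Change in (edges, triangles) when the tie i-j goes from 0 to 1."""
    shared = sum(1 for h in range(len(adj)) if adj[i][h] and adj[j][h])
    return (1, shared)

def simulate_ergm(n, theta, steps, seed=0):
    """Metropolis sampler for an edge + triangle ERGM on n nodes."""
    rng = random.Random(seed)
    adj = [[0] * n for _ in range(n)]
    dyads = list(combinations(range(n), 2))
    for _ in range(steps):
        i, j = rng.choice(dyads)
        delta = change_statistics(adj, i, j)
        # Adding the tie changes the log probability by theta . delta,
        # removing it by minus that amount.
        sign = -1 if adj[i][j] else 1
        log_ratio = sign * sum(t * d for t, d in zip(theta, delta))
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            adj[i][j] = adj[j][i] = 1 - adj[i][j]
    return adj

def count_edges(adj):
    return sum(adj[i][j] for i, j in combinations(range(len(adj)), 2))

# Hypothetical parameters (edges, triangles) for a 16-node graph.
graph = simulate_ergm(16, theta=(-1.0, 0.2), steps=5000)
print(count_edges(graph))
```

Repeating the simulation many times gives a sample of graphs whose statistics can be compared with the observed ones, both for the convergence check and for goodness of fit.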

Page 106: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Simulation, GOF, and Problems

Standard goodness of fit procedures are not valid for ERGMs – no F-tests or Chi-square tests available

If indeed the fitted model adequately describes the data generating process, then the graphs that the model produces should be similar to observed data

For fitted effects this is true by construction

For effects/structural features that are not fitted this may not be true

If it is true, the modelled effects are the only effects necessary to produce data – a “proof” of the concept of ERGMs

Page 107: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Simulation, GOF, and Problems

Example: for our fitted model for Lazega

For arbitrary function g

We can look at how well the model reproduces

Page 108: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Simulation, GOF, and Problems

From the goodness-of-fit tab in Pnet we get

Page 109: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Simulation, GOF, and Problems

Effect MLE s.e.

Density -6.501 0.727

Main effect seniority 1.594 0.324

Main effect practice 0.902 0.163

Homophily effect practice 0.879 0.231

Homophily sex 1.129 0.349

Homophily office 1.654 0.254

Page 110: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Simulation, GOF, and Problems

[Figure: two histograms; horizontal axes: degree (1–19) and number of shared partners (1–11); vertical axes: frequency]

Page 111: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Simulation, GOF, and Problems

Effect                       MLE      s.e.     MLE      s.e.
Density                     -6.501    0.727   -6.510    0.637
Main effect seniority        1.594    0.324    0.855    0.235
Main effect practice         0.902    0.163    0.410    0.118
Homophily effect practice    0.879    0.231    0.759    0.194
Homophily sex                1.129    0.349    0.704    0.254
Homophily office             1.654    0.254    1.146    0.195
GWESP                                          0.897    0.304
Log-lambda                                     0.778    0.215

Page 112: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Simulation, GOF, and Problems

[Figure: two pairs of histograms; horizontal axes: degree (1–19) and number of shared partners (1–11); vertical axes: frequency]

Page 113: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Part 2

Dependencies

Page 114: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Independence - Deriving the ERGM

[Figure: nodes i, j, k, l, m, n; the ties i–l and i–k are each decided by a coin toss (heads / tails)]

Page 115: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Independence - Deriving the ERGM

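Joint probabilities of the two coin tosses (AUD and SEK):

             AUD heads   AUD tails   margin
SEK heads      0.25        0.25       0.5
SEK tails      0.25        0.25       0.5
margin         0.5         0.5

Knowledge of AUD, e.g., does not help us predict SEK

e.g. whether i and k are tied or not

[Figure: nodes i, j, k, l, m, n]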

Page 116: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Independence - Deriving the ERGM

Knowledge of AUD, e.g., does not help us predict SEK

e.g. whether i and k are tied or not

even though dyad {i,l} and dyad {i,k} have vertex i in common

Page 117: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Independence - Deriving the ERGM

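Joint probabilities of the two coin tosses (AUD and SEK):

             AUD heads   AUD tails   margin
SEK heads      0.4         0.1        0.5
SEK tails      0.1         0.4        0.5
margin         0.5         0.5

[Figure: nodes i, j, k, l, m, n]

Can we find a model such that knowledge of AUD, e.g., does help us predict SEK

e.g. whether i and k are tied or not?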
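The two probability tables can be compared directly: under independence every joint probability equals the product of its margins. The sketch below checks this for both tables; which row or column is called heads is an arbitrary labelling assumed here.

```python
def is_independent(joint, tol=1e-12):
    """joint[a][b] = Pr(AUD = a, SEK = b), with 0 = heads and 1 = tails."""
    p_aud = [sum(row) for row in joint]
    p_sek = [sum(col) for col in zip(*joint)]
    return all(abs(joint[a][b] - p_aud[a] * p_sek[b]) < tol
               for a in range(2) for b in range(2))

independent_table = [[0.25, 0.25], [0.25, 0.25]]
dependent_table = [[0.4, 0.1], [0.1, 0.4]]

print(is_independent(independent_table))
print(is_independent(dependent_table))
# Pr(SEK = heads | AUD = heads) in the second table
print(dependent_table[0][0] / sum(dependent_table[0]))
```

In the second table the margins are still one half each, but knowing AUD moves the probability for SEK away from one half, which is exactly what dependence between two tie-variables means.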

Page 118: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Deriving the ERGM: From Markov graph to Dependence graph

[Figure: actors john, pete, mary and paul]

Consider the tie-variables that have Mary in common

How may we make these “dependent”?

Page 119: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Deriving the ERGM: From Markov graph to Dependence graph

[Figure, built up over pages 119–124: the actors john, pete, mary and paul, with the tie-variables added one at a time: pete–mary, john–mary, paul–mary, pete–john, paul–john, paul–pete]

Page 125: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Deriving the ERGM: From Markov graph to Dependence graph

[Figure: actors john, pete, mary and paul, and the dependence graph with one node per tie-variable: m,pa; pa,pe; pa,j; m,pe; pe,j; m,j]

The “probability structure” of a Markov graph is described by the cliques of the dependence graph (Hammersley-Clifford)…
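A small sketch of what the cliques look like for these four actors, assuming Markov dependence in the usual sense (two tie-variables are dependent when they share an actor). The helper names are hypothetical.

```python
from itertools import combinations

actors = ["mary", "john", "pete", "paul"]
tie_variables = list(combinations(actors, 2))  # the 6 tie-variables

def dependent(u, v):
    """Markov dependence: two distinct tie-variables share an actor."""
    return len(set(u) & set(v)) == 1

def cliques_of_size(size):
    """All sets of `size` tie-variables that are pairwise dependent."""
    return [c for c in combinations(tie_variables, size)
            if all(dependent(u, v) for u, v in combinations(c, 2))]

def common_actors(clique):
    """Actors that appear in every tie-variable of the clique."""
    return set.intersection(*(set(tie) for tie in clique))

clique_counts = {size: len(cliques_of_size(size)) for size in (1, 2, 3, 4)}
three_stars = [c for c in cliques_of_size(3) if common_actors(c)]
triangles = [c for c in cliques_of_size(3) if not common_actors(c)]
print(clique_counts, len(three_stars), len(triangles))
```

The cliques are single ties, 2-stars, 3-stars and triangles, which is why these configurations are the ones that appear as statistics in a Markov graph model.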

Page 126: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Deriving the ERGM: From Markov graph to Dependence graph

[Figure sequence, pages 126–137: the actors john, pete, mary and paul next to the dependence graph on the tie-variables m,pa; pa,pe; pa,j; m,pe; pe,j; m,j, with successive subsets of tie-variables highlighted, e.g. {m,pe}, {m,pe; m,j} and {m,pe; pe,j; m,j}]

Page 138: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

From Markov graph to Dependence graph – distinct subgraphs?

too many statistics (parameters)

Page 139: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

The homogeneity assumption

[Figure: four pairs of isomorphic configurations joined by “=”, i.e. given equal parameters]

Page 140: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

A log-linear model (ERGM) for ties

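$$\log \Pr(X = x) = \theta_1 z_1(x) + \theta_2 z_2(x) + \cdots + \theta_p z_p(x) - \psi(\theta)$$

“Aggregated” to a joint model for the entire adjacency matrix

Interaction terms in the log-linear model are of the types

$X_{ij}$, $X_{ij}X_{ik}$, $X_{ij}X_{ik}X_{jk}$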

Page 141: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

A log-linear model (ERGM) for ties

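By definition of (in-)dependence, the tie-variables $X_{ij}$ and $X_{ik}$ are independent if

$$\Pr(X_{ij} = x_{ij}, X_{ik} = x_{ik}) = \Pr(X_{ij} = x_{ij})\Pr(X_{ik} = x_{ik})$$

and dependent otherwise.

E.g. the ties i–j and i–k co-occurring more than is explained by the margins

[Figure: the tie i–j, the tie i–k, and the 2-star j–i–k]

Main effects: $X_{ij}$, $X_{ik}$; interaction term: $X_{ij}X_{ik}$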
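A worked two-tie version of this argument may help. Assuming a common main effect $\theta_1$ for both ties and an interaction $\theta_2$ (labels chosen here for illustration), the log-linear model for the pair is

$$\log \Pr(X_{ij} = x_{ij}, X_{ik} = x_{ik}) = \theta_1 (x_{ij} + x_{ik}) + \theta_2\, x_{ij} x_{ik} - \psi(\theta), \qquad \psi(\theta) = \log\left(1 + 2e^{\theta_1} + e^{2\theta_1 + \theta_2}\right).$$

The log odds ratio of the two ties is then

$$\log \frac{\Pr(1,1)\Pr(0,0)}{\Pr(1,0)\Pr(0,1)} = (2\theta_1 + \theta_2) + 0 - \theta_1 - \theta_1 = \theta_2,$$

so the joint distribution factorises into its margins exactly when $\theta_2 = 0$, and a positive $\theta_2$ means the two ties co-occur more often than the margins alone would explain.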

Page 142: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Part 7

Summary of fitting routine

Page 143: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

The steps of fitting an ERGM

• fit base-line model

• check convergence

• rerun if model not converged

• include more parameters? GOF

• candidate models

Page 144: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Part 6

Bipartite data

Page 147: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Bi-partite networks

• people participating in social events (Breiger, 1974)

• people belonging to voluntary organisations (Bonacich, 1978)

• directors sitting on corporate boards (e.g. Mizruchi, 1982)

Page 148: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

One-mode projection

Tie: If two directors share a board

Two-mode

one-mode
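The projection rule can be sketched as follows, with made-up directors and boards. The function keeps the number of shared boards as a tie weight; a plain one-mode network would keep only whether that number is positive.

```python
from itertools import combinations

def one_mode_projection(memberships):
    """Tie two directors if they share a board; value = number of shared boards."""
    return {(a, b): len(memberships[a] & memberships[b])
            for a, b in combinations(sorted(memberships), 2)
            if memberships[a] & memberships[b]}

# Hypothetical two-mode data: director -> set of boards
boards = {
    "ann": {"B1", "B2"},
    "bob": {"B1", "B2"},
    "cat": {"B2", "B3"},
    "dan": {"B4"},
}
projection = one_mode_projection(boards)
print(projection)
```

Different two-mode networks can give the same projection, which is one way of seeing what is lost when only the one-mode network is analysed.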

Page 152: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

One-mode projection

Which mode is given priority? (Duality of social actors and social groups; e.g. Breiger, 1974; Breiger and Pattison, 1986)

Loss of information

Page 153: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

ERGM for bipartite networks

$$\log \Pr(X = x) = \theta_1 z_1(x) + \theta_2 z_2(x) + \cdots + \theta_p z_p(x) - \psi(\theta)$$

The model is the same as for one-mode networks (Wang et al., 2007)

[Figure: configurations on nodes labelled i, j, k, ℓ: edges, “people” 2-stars, “affiliation” 2-stars, 3-paths, 4-cycles]

Page 154: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

ERGM for bipartite networks

[Figure: a 4-cycle]

4-cycles

A form of bipartite clustering

Also known as Bi-cliques

One possible interpretation:

Page 155: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

ERGM for bipartite networks

[Figure: a 4-cycle, with a candidate node x]

4-cycles

A form of bipartite clustering

Also known as Bi-cliques

One possible interpretation: Who to appoint?

How about x?
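Counting 4-cycles in a two-mode network is straightforward: every pair of people who share at least two affiliations closes one 4-cycle for each pair of shared affiliations. The data and function name below are hypothetical.

```python
from itertools import combinations
from math import comb

def count_four_cycles(memberships):
    """Number of 4-cycles: pairs of people times pairs of shared affiliations."""
    return sum(comb(len(memberships[a] & memberships[b]), 2)
               for a, b in combinations(sorted(memberships), 2))

# Hypothetical two-mode data: director -> set of boards
boards = {
    "ann": {"B1", "B2"},
    "bob": {"B1", "B2"},
    "cat": {"B2", "B3"},
    "dan": {"B4"},
}
print(count_four_cycles(boards))
```

Here only ann and bob sit together on two boards, so the network contains a single 4-cycle.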

Page 156: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

ERGM for bipartite networks

Fitting the model in (B)PNet is a straightforward extension

These statistics are all Markov:

[Figure: configurations on nodes labelled i, j, k, ℓ: edges, “people” 2-stars, “affiliation” 2-stars, 3-paths, 4-cycles]

<BPNet>

Page 157: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Part 7

Missing data

Page 158: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Methods for missing network data

Effects of missingness

Perils

- Some investigations on the effects on indices of structural properties (Kossinets, 2006; Costenbader & Valente, 2003; Huisman, 2007)

- Problems with the “boundary specification issue”

Few remedies

- Deterministically “complement” data (Stork & Richards, 1992)

- Stochastically impute missing data (Huisman, 2007)

- Ad-hoc “likelihood” (score) for missing ties (Liben-Nowell and Kleinberg, 2007)

Page 159: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Model assisted treatment of missing network data

missing data

observed data

If you don’t have a model for what you have observed, how are you going to be able to say something about what you have not observed using what you have observed?

Page 160: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Model assisted treatment of missing network data

• Importance sampling (Handcock & Gile 2010; Koskinen, Robins & Pattison, 2010)

• Stochastic approximation and the missing data principle (Orchard & Woodbury, 1972; Koskinen & Snijders, forthcoming)

• Bayesian data augmentation (Koskinen, Robins & Pattison, 2010)

Page 161: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Subgraph of ERGM not ERGM

i

j

k

Dependence in ERGM We may also have dependence

i

j

lk

But if

k

?

j

We should include

counts of:

Marginalisation (Snijders, 2010; Koskinen et al, 2010)

Page 162: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Bayesian Data Augmentation

Simulate parameters $\theta$

With missing data: in each iteration also simulate graphs (the missing tie-variables)

[Figure sequence, pages 162–169: most likely missing given current $\theta$; most likely $\theta$ given current missing; most likely missing given current $\theta$; and so on… until]

Page 170: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Bayesian Data Augmentation

What does it give us?

Distribution of parameters

Distribution of missing data

Subtle point

Missing data does not depend on the parameters (we don’t have to choose parameters to simulate missing)
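The alternation between parameters and missing data can be sketched in a few lines if the ERGM is swapped for a much simpler model. The example below uses a homogeneous Bernoulli graph with a Beta(1, 1) prior, where both steps have closed forms; for an actual ERGM each step would itself need MCMC. All counts and names are hypothetical.

```python
import random

def data_augmentation(n_obs_ties, n_obs_dyads, n_missing, iterations, seed=1):
    """Gibbs sampler for a Bernoulli graph with a Beta(1, 1) prior on p."""
    rng = random.Random(seed)
    p = 0.5  # arbitrary starting value
    p_draws, missing_draws = [], []
    for _ in range(iterations):
        # 1. simulate the missing tie-variables given the current parameter
        missing_ties = sum(rng.random() < p for _ in range(n_missing))
        # 2. simulate the parameter given observed + imputed tie-variables
        ties = n_obs_ties + missing_ties
        non_ties = (n_obs_dyads + n_missing) - ties
        p = rng.betavariate(1 + ties, 1 + non_ties)
        p_draws.append(p)
        missing_draws.append(missing_ties)
    return p_draws, missing_draws

# Hypothetical data: 30 ties among 100 observed dyads, 20 dyads missing.
p_draws, missing_draws = data_augmentation(30, 100, 20, iterations=2000)
print(sum(p_draws) / len(p_draws))
print(sum(missing_draws) / len(missing_draws))
```

The two lists returned correspond to the two outputs named on the slide: a sample from the distribution of the parameter and a sample from the distribution of the missing data.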


Page 171: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Bayesian Data Augmentation

Lazega’s (2001) Lawyers

Collaboration network among 36 lawyers in a New England law firm (Lazega, 2001)

Boston office:

Hartford office:

Providence off.:

least senior:

most senior:

Page 172: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

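Bayesian Data Augmentation

Lazega’s (2001) Lawyers

Edges: $\sum_{i<j} x_{ij}$

Main effect

Seniority: $\sum_{i<j} x_{ij}(a_i + a_j)$

Practice: $\sum_{i<j} x_{ij}(b_i + b_j)$ ($b_i = 1$ if $i$ corporate, 0 if litigation)

Homophily

Practice: $\sum_{i<j} x_{ij}\,\mathbf{1}(b_i = b_j)$

Sex: $\sum_{i<j} x_{ij}\,\mathbf{1}(c_i = c_j)$

Office: $\sum_{i<j} x_{ij}\,\mathbf{1}(d_i = d_j)$

GWESP: $3t_1(x) - \dfrac{t_2(x)}{\lambda} + \cdots + (-1)^{n-3}\dfrac{t_{n-2}(x)}{\lambda^{n-3}}$, with $\theta_8 = \log(\lambda)$

[Figure: the configurations counted by $t_1$, $t_2$, $t_3$, etc.]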

Page 173: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Bayesian Data Augmentation

Lazega’s (2001) Lawyers – ERGM posteriors (Koskinen, 2008)

Page 174: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Bayesian Data Augmentation

Cross validation (Koskinen, Robins & Pattison, 2010)

Remove 200 of the 630 dyads at random

Fit an inhomogeneous Bernoulli model and obtain the posterior predictive tie-probabilities for the missing tie-variables

Fit an ERGM and obtain the posterior predictive tie-probabilities for the missing tie-variables (Koskinen et al., in press)

Fit Hoff’s (2008) latent variable probit model with linear predictor $\theta^T z(x_{ij}) + w_i w_j^T$

Repeat many times
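The predictive tie-probabilities for the held-out dyads can be summarised by an ROC curve. The sketch below computes the curve and the area under it from scratch; the held-out values and probabilities are invented and have nothing to do with the Lazega results.

```python
def roc_points(labels, scores):
    """(false positive rate, true positive rate) for each threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical held-out tie-variables and predictive tie-probabilities
held_out_ties = [1, 0, 1, 0, 0, 1]
predicted = [0.9, 0.2, 0.4, 0.5, 0.1, 0.8]

curve = roc_points(held_out_ties, predicted)
print(curve)
print(auc(curve))
```

A model whose curve lies closer to the top-left corner ranks the removed ties above the removed non-ties more reliably.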

Page 175: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Bayesian Data Augmentation

ROC curve for predictive probabilities combined over 20 replications (Koskinen et al. 2010)

[Figure: ROC curves; horizontal axis: False Positive Rate (0 to 1); vertical axis: True Positive Rate (0 to 1)]

Page 178: An introduction to exponential random graph models (ERGM) Johan Koskinen The Social Statistics Discipline Area, School of Social Sciences Mitchell Centre

Bayesian Data Augmentation

Snowball sampling

• The snowball sampling design is ignorable for ERGM (Thompson & Frank, 2000; Handcock & Gile, 2010; Koskinen, Robins & Pattison, 2010)

• ... but snowball sampling is rarely used when the population size is known ...

• Using the Sageman (2004) “clandestine” network as a test-bed for unknown N
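As a reminder of what a snowball design observes, here is a minimal sketch of wave-by-wave sampling on a toy undirected network. It assumes that every actor in a wave names all of their neighbours, and that actors in the final wave are named but not interviewed, so a dyad is observed only if at least one endpoint was interviewed. The function names and the toy network are hypothetical.

```python
import itertools


def snowball_sample(adjacency, seeds, n_waves):
    """Return a list of waves; wave 0 holds the seeds.

    Each later wave holds the not-yet-sampled neighbours named by the
    previous wave.
    """
    sampled = set(seeds)
    waves = [set(seeds)]
    for _ in range(n_waves):
        new = set()
        for node in waves[-1]:
            new.update(adjacency[node])
        new -= sampled
        waves.append(new)
        sampled |= new
    return waves


def observed_dyads(nodes, respondents):
    """Dyads whose tie-variable is observed: at least one endpoint interviewed."""
    return [(i, j) for i, j in itertools.combinations(sorted(nodes), 2)
            if i in respondents or j in respondents]


# Toy network: a path 0-1-2-3-4 plus an isolated node 5
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3], 5: []}
waves = snowball_sample(adjacency, seeds=[0], n_waves=2)
respondents = set().union(*waves[:-1])  # final wave is named, not interviewed
observed = observed_dyads(adjacency, respondents)
```

In this toy case the tie-variables among nodes 2 to 5 remain unobserved; these are the missing values that data augmentation would have to fill in. Note that the sketch uses the full node set to enumerate dyads, which presupposes a known N.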

Page 179

Part 7

Spatial embedding

Page 180

Spatial embedding – Daraganova et al. 2011 SOCNET

306 actors in Victoria, Australia

Page 181

Spatial embedding – Daraganova et al. 2011 SOCNET

306 actors in Victoria, Australia

... all living within 14 kilometres of each other

spatially embedded

Page 182

Spatial embedding – Daraganova et al. 2011 SOCNET

306 actors in Victoria, Australia

... all living within 14 kilometres of each other

Bernoulli conditional on distance

Empirical probability
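The comparison named on this slide, an empirical tie probability set against a Bernoulli model conditional on distance, can be sketched as follows. This is a minimal illustration with made-up distances and ties, not the Victoria data. The logistic form with log distance as the dyad covariate is an assumption about the parameterisation, and the bin edges, function names, and parameter values are hypothetical.

```python
import math


def empirical_tie_probability(distances, ties, bin_edges):
    """Proportion of tied dyads in each distance bin [edge_k, edge_{k+1})."""
    n_bins = len(bin_edges) - 1
    tied = [0] * n_bins
    total = [0] * n_bins
    for d, x in zip(distances, ties):
        for k in range(n_bins):
            if bin_edges[k] <= d < bin_edges[k + 1]:
                tied[k] += x
                total[k] += 1
                break
    # None marks an empty bin, where no proportion can be computed
    return [t / m if m else None for t, m in zip(tied, total)]


def bernoulli_tie_probability(distance, intercept, slope):
    """Tie probability under a logistic model in log distance."""
    eta = intercept + slope * math.log(distance)
    return 1.0 / (1.0 + math.exp(-eta))


# Toy dyads: distance between the two actors and whether they are tied
distances = [0.5, 0.8, 1.5, 1.7, 3.0, 3.5]
ties = [1, 1, 1, 0, 0, 0]
empirical = empirical_tie_probability(distances, ties, [0, 1, 2, 4])

# With a negative slope the modelled probability falls as distance grows
near = bernoulli_tie_probability(1.0, intercept=0.0, slope=-1.0)
far = bernoulli_tie_probability(2.0, intercept=0.0, slope=-1.0)
```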

Page 183

Spatial embedding – Daraganova et al. 2011 SOCNET

Parameter           Model 1         Model 2         Model 3         Model 4
Edges               -4.87* (0.13)   1.56* (0.65)    -4.79* (0.66)   -0.20 (0.87)
Alt. star                                           -0.86* (0.18)   -0.86* (0.2)
Alt. triangle                                       2.74* (0.15)    2.69* (0.14)
Log distance                        -0.78* (0.08)                   -0.56* (0.07)
Age heterophily     -0.07* (0.01)   -0.07* (0.01)   0.001 (0.07)    0.002 (0.06)
Gender homophily    -1.13* (0.61)   -1.13 (0.69)    0.09 (0.83)     0.07 (0.47)

ERGM: distance and endogenous dependence explain different things

Page 184

Part 8

Further issues

Page 185

Other extensions

There are ERGMs (ERGM-like models) for
- directed data
- valued data
- bipartite data
- multiplex data
- longitudinal data
- modelling actor autoregressive attributes

Page 186

Further issues – relaxing homogeneity assumption

ERGMs typically assume homogeneity

[Figure: configurations joined by equality signs]

(A) Block modelling and ERGM (Koskinen, 2009)
(B) Latent class ERGM (Schweinberger & Handcock)
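The homogeneity assumption can be written out explicitly. The notation below is the generic ERGM form and is an assumption here, since this part of the slides does not show the formula: $x_{ij}$ is the tie-variable for the pair $(i,j)$, $A$ ranges over configurations (sets of tie-variables), and $\kappa(\theta)$ is the normalising constant.

```latex
% General form: one parameter per configuration A
P_\theta(X = x) = \frac{1}{\kappa(\theta)}
  \exp\Big\{ \sum_{A} \theta_A \prod_{(i,j) \in A} x_{ij} \Big\}

% Homogeneity: isomorphic configurations share one parameter,
% \theta_A = \theta_{A'} = \theta_k whenever A and A' are both of type k,
% so the sum collapses to counts z_k(x) of each configuration type
P_\theta(X = x) = \frac{1}{\kappa(\theta)}
  \exp\Big\{ \sum_{k} \theta_k z_k(x) \Big\},
\qquad
z_k(x) = \sum_{A \text{ of type } k} \prod_{(i,j) \in A} x_{ij}
```

Relaxing homogeneity, as in (A) and (B), roughly amounts to letting $\theta_k$ differ between blocks or latent classes of actors instead of being constant over the whole network.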

Page 187

Further issues – model selection

Assessing goodness of fit:
- Posterior predictive distributions (Koskinen, 2008; Koskinen, Robins & Pattison, 2010; Caimo & Friel, 2011)
- Non-Bayesian heuristic GOF (Robins et al., 2007; Hunter et al., 2008; Robins et al., 2009; Wang et al., 2009)

Model selection:
- Path sampling for AIC (Hunter & Handcock, 2006). Conceptual caveat: what is model complexity when the variables are dependent?
- Bayes factors (Wang & Handcock…?)
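The logic of a posterior predictive goodness-of-fit check can be sketched on a toy example. To keep it self-contained, the sketch swaps in a homogeneous Bernoulli model with a Beta(1, 1) prior on the tie probability in place of an ERGM, whose posterior would require MCMC. The "observed" network, the choice of the triangle count as the statistic, and all names are hypothetical.

```python
import itertools
import random


def count_triangles(n, ties):
    """Number of closed triads in an undirected graph stored as {(i, j): 0/1}."""
    return sum(ties[(i, j)] * ties[(i, k)] * ties[(j, k)]
               for i, j, k in itertools.combinations(range(n), 3))


def posterior_predictive_triangles(n, ties, n_draws, rng):
    """Triangle counts of networks replicated from the posterior predictive.

    Each draw samples the tie probability p from its Beta posterior, then
    simulates a Bernoulli graph with that p.
    """
    dyads = list(itertools.combinations(range(n), 2))
    n_ties = sum(ties.values())
    draws = []
    for _ in range(n_draws):
        p = rng.betavariate(1 + n_ties, 1 + len(dyads) - n_ties)
        replicate = {d: int(rng.random() < p) for d in dyads}
        draws.append(count_triangles(n, replicate))
    return draws


# Toy "observed" network: two disjoint cliques of six nodes each
n = 12
observed = {(i, j): int(i // 6 == j // 6)
            for i, j in itertools.combinations(range(n), 2)}
obs_triangles = count_triangles(n, observed)  # 2 * C(6, 3) = 40

rng = random.Random(2011)
draws = posterior_predictive_triangles(n, observed, 500, rng)
# Share of replicated networks with at least as many triangles as observed
tail = sum(t >= obs_triangles for t in draws) / len(draws)
```

A small value of `tail` would indicate that the model reproduces the observed density but not the observed clustering, which is the kind of misfit that motivates adding endogenous dependence terms.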

Page 188

Wrap-up

ERGMs
- Increasingly being used
- Increasingly being understood
- Increasingly able to handle imperfect data (also missing link prediction)

Methods
- Plenty of open issues
- Bayes is the way of the future

Legitimacy and dissemination
- e.g. Lusher, Koskinen, Robins ERGMs for SN, CUP, 2011

Page 189

Remaining question: used to be p*…

... why ERGM?