Network Science Workshop
April 29, 2013 3rd International Business Complexity and Global Leadership Conference 1/37
WARNING! Network Science is extremely contagious. Once you learn it, you start seeing networks everywhere.
D. Zinoviev
Outline
● What Is Network Science?
● Terms and Definitions
● Measures
● Formation
● Complex Behavior
● Tools of the Craft
● Unusual Applications of Network Science
What is Network Science?
Network science is an interdisciplinary academic field that studies complex networks such as telecommunication, transportation, electrical, computer, biological, cognitive and semantic, and social networks.
What is it based upon?
The field draws on theories and methods including:
● Graph theory from mathematics (Erdős, Rényi, Strogatz)
● Game theory from economics (Jackson)
● Statistical mechanics from physics (Barabási, Newman, Vespignani, Watts)
● Data mining and information visualization from computer science (Adamic)
● Social structure from sociology (Watts)
Terms and Definitions
● Network = Graph
● Nodes (vertices, actors, members) represent entities
● Nodes have properties (gender, capacity, political view)
● Edges (arcs, links, ties) represent relationships
● Edges have properties (direction, weight, kind)
● Directed vs. undirected
● Multigraph: graph with parallel edges
● Simple graph: undirected, no loops, no parallel edges
● Connected graphs
[Figure: example rail network whose nodes are Boston SS, Albany, Brunswick, Boston NS, St Albans, Providence, Hartford, Springfield, New Haven, New York PS, Montreal, and Rutland]
Adjacency Matrix A
[Figure: the same rail network with its nodes numbered 1 to 12]
A =
0 0 1 0 0 0 0 1 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0
1 1 0 0 0 1 1 0 0 0 0 0
0 0 0 0 0 0 1 0 0 1 0 0
0 0 0 0 0 0 1 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0
0 0 1 1 1 0 0 0 1 0 0 0
1 0 0 0 0 0 0 0 1 1 0 0
0 0 0 0 0 0 1 1 0 0 0 0
0 0 0 1 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 1 0
A_{ij} = 1 if and only if nodes i and j are connected
Incidence Matrix B
B =
1 0 1 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 1 0 0 0 0
0 1 1 0 0 0 0 0 0 0 0 0
0 0 1 0 0 1 0 0 0 0 0 0
0 0 1 0 0 0 1 0 0 0 0 0
0 0 0 0 0 0 0 1 1 0 0 0
0 0 0 0 0 0 1 0 1 0 0 0
0 0 0 0 0 0 0 1 0 1 0 0
0 0 0 1 0 0 0 0 0 1 0 0
0 0 0 1 0 0 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 0 1 0 1 0 0 0 0 0
[Figure: the rail network with nodes numbered 1 to 12 and edges labeled A to L]
In the matrix as printed, each row is an edge (A to L) and each column is a node (1 to 12): B_{ij} = 1 if and only if edge i is incident to node j.
A = B^T B - D, where D is the diagonal matrix of node degrees.
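The relation between the two matrices can be checked numerically. The sketch below assumes the layout of B printed above (one row per edge, one column per node) and an edge list read off that matrix; the variable names are illustrative. The product B^T B then carries the node degrees on its diagonal and the adjacency matrix off the diagonal.

```python
# Edges A..L read off the rows of the incidence matrix (nodes numbered 1..12).
EDGES = [(1, 3), (1, 8), (2, 3), (3, 6), (3, 7), (8, 9),
         (7, 9), (8, 10), (4, 10), (4, 7), (11, 12), (5, 7)]
N = 12

# Incidence matrix: one row per edge, one column per node.
B = [[1 if node in edge else 0 for node in range(1, N + 1)] for edge in EDGES]

# (B^T B)[j][k] counts the edges incident to both node j+1 and node k+1.
BtB = [[sum(B[e][j] * B[e][k] for e in range(len(EDGES))) for k in range(N)]
       for j in range(N)]

# The diagonal holds the degrees; everything off the diagonal is A.
degree = [BtB[j][j] for j in range(N)]
A = [[BtB[j][k] if j != k else 0 for k in range(N)] for j in range(N)]

for row in A:
    print(" ".join(str(x) for x in row))
```

The printed rows should match the adjacency matrix A given earlier.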
PATHS
[Figure: the rail network with nodes numbered 1 to 12 and edges labeled A to L]
Path = sequence of connected edges (e.g., B – H – I)
Can be simple (no self-intersections)
Can be a loop (ends where it starts)
Paths have lengths
Geodesic = a shortest path (B – F – G – J is not a geodesic, but B – H – I is)
What if edges are weighted?
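For unweighted edges, geodesic lengths can be found with a breadth-first search. The following is a minimal sketch on the 12-node example, using an edge list read off the adjacency matrix; the function names are illustrative.

```python
from collections import deque

# Edge list of the 12-node example (nodes numbered 1..12).
EDGES = [(1, 3), (1, 8), (2, 3), (3, 6), (3, 7), (8, 9),
         (7, 9), (8, 10), (4, 10), (4, 7), (11, 12), (5, 7)]

def neighbors(edges):
    """Build a node -> set-of-neighbors map for an undirected graph."""
    adj = {}
    for i, j in edges:
        adj.setdefault(i, set()).add(j)
        adj.setdefault(j, set()).add(i)
    return adj

def geodesic_lengths(adj, source):
    """Breadth-first search: number of edges on a shortest path from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist  # unreachable nodes are simply absent

adj = neighbors(EDGES)
dist = geodesic_lengths(adj, 1)
print(dist[4])     # 3: as long as B - H - I, one edge shorter than B - F - G - J
print(11 in dist)  # False: node 11 lies in the other component
```

When edges carry weights, breadth-first search no longer gives shortest paths; Dijkstra's algorithm is the usual replacement for non-negative weights.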
Small World
We are on average just 4–6 links (“handshakes”) away from any other living person on Earth (Milgram's experiment); hence, “six degrees of separation”
Not all networks have the “small world” property
[Figure: chain of acquaintances: I, someone I know, Boris Berezovsky, Vladimir Putin, Barack Obama. Caption: “Wait, how do you know Obama?”]
Centrality
● How “central” is a node in the network?
● Possibly affects influence, resilience, susceptibility, etc.
● Several flavors: degree, closeness, betweenness, eigenvector, etc.
Degree Centrality
[Figure: rail network with the degree of each node: Boston SS (2), Albany (4), Brunswick (1), Boston NS (1), St Albans (1), Providence (2), Hartford (2), Springfield (4), New Haven (3), New York PS (2), Montreal (1), Rutland (1)]
Just count the neighbors!
More neighbors = more “friends” = more importance
Distinguish in-degree, out-degree, and [total] degree
Can be defined in two ways (N is the total number of nodes, a_{ij} ∈ A):
d_i = \sum_j a_{ij}
d_i = \frac{1}{N-1} \sum_j a_{ij}
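Both definitions take only a few lines of code. A minimal sketch on the 12-node example, with an edge list read off the adjacency matrix (variable names are illustrative):

```python
# Edge list of the 12-node example (nodes numbered 1..12).
EDGES = [(1, 3), (1, 8), (2, 3), (3, 6), (3, 7), (8, 9),
         (7, 9), (8, 10), (4, 10), (4, 7), (11, 12), (5, 7)]
N = 12

# Raw degree: count the edges that touch each node.
degree = {node: 0 for node in range(1, N + 1)}
for i, j in EDGES:
    degree[i] += 1
    degree[j] += 1

# Normalized degree: divide by the largest possible number of neighbors.
normalized = {node: d / (N - 1) for node, d in degree.items()}

for node in sorted(degree):
    print(node, degree[node], round(normalized[node], 2))
```

The multiset of raw degrees (two nodes of degree 4, one of degree 3, four of degree 2, five of degree 1) matches the values shown in the figure.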
Degree Distribution
Degree [centrality] distribution is an important network measure—it relates to the network formation process
Most common distributions in complex networks: binomial (Poisson for n→∞) and power law (a.k.a. Pareto, Zipf, scale free)
Why is it what it is?
Closeness Centrality
[Figure: rail network with the closeness centrality of each node: Boston SS (0.5), Albany (0.6), Brunswick (1), Boston NS (1), St Albans (0.4), Providence (0.4), Hartford (0.5), Springfield (0.6), New Haven (0.5), New York PS (0.5), Montreal (0.4), Rutland (0.4)]
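Calculate the average inverse shortest-path length to all other nodes
Shorter path = closer “friends” = better connectivity
Can be defined in two ways (N is the total number of nodes, p_{ij} is the length of a geodesic path from i to j):
c_i = \sum_{j \neq i} 1 / p_{ij}
c_i = \frac{1}{N-1} \sum_{j \neq i} 1 / p_{ij}
Takes care of disconnected networks! (An unreachable node j has p_{ij} = \infty and contributes 0 to the sum.)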
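The inverse-distance sum can be sketched with one breadth-first search per node. The code below is an illustration on the 12-node example (edge list read off the adjacency matrix, function names made up here); nodes that cannot be reached are skipped, which is the same as adding 0 for them. The figure's values may have been normalized differently, so the numbers need not coincide.

```python
from collections import deque

# Edge list of the 12-node example (nodes numbered 1..12).
EDGES = [(1, 3), (1, 8), (2, 3), (3, 6), (3, 7), (8, 9),
         (7, 9), (8, 10), (4, 10), (4, 7), (11, 12), (5, 7)]
N = 12

adj = {node: set() for node in range(1, N + 1)}
for i, j in EDGES:
    adj[i].add(j)
    adj[j].add(i)

def geodesic_lengths(source):
    """Breadth-first search from source; unreachable nodes are omitted."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def closeness(i):
    """Sum of inverse geodesic lengths from i to every reachable node."""
    return sum(1 / d for node, d in geodesic_lengths(i).items() if node != i)

raw = {i: closeness(i) for i in adj}
normalized = {i: c / (N - 1) for i, c in raw.items()}
print(round(normalized[3], 2))
```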
Betweenness Centrality
[Figure: rail network with the betweenness centrality of each node: Boston SS (0.1), Albany (0.5), Brunswick (0), Boston NS (0), St Albans (0), Providence (0.04), Hartford (0.06), Springfield (0.5), New Haven (0.14), New York PS (0.13), Montreal (0), Rutland (0)]
Calculate how many shortest paths go through the node
More paths = better brokerage opportunities (= more vulnerability)
Can be defined in two ways (N is the total number of nodes, p_{jk} is a geodesic path from j to k, p_{jik} is such a path that passes through i, and n(·) is the number of such paths):
\mathrm{bw}_i = \sum_{j \neq i \neq k} \frac{n(p_{jik})}{n(p_{jk})}
\mathrm{bw}_i = \frac{1}{(N-1)(N-2)} \sum_{j \neq i \neq k} \frac{n(p_{jik})}{n(p_{jk})}
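A direct, unoptimized way to evaluate this sum: run a breadth-first search from every node that records both geodesic lengths and the number of geodesics, then use the fact that i lies on a geodesic from j to k exactly when the two distances add up. This is a sketch on the 12-node example with illustrative names; it counts ordered pairs (j, k), and the figure's values may use a different normalization.

```python
from collections import deque
from itertools import permutations

# Edge list of the 12-node example (nodes numbered 1..12).
EDGES = [(1, 3), (1, 8), (2, 3), (3, 6), (3, 7), (8, 9),
         (7, 9), (8, 10), (4, 10), (4, 7), (11, 12), (5, 7)]
N = 12

adj = {node: set() for node in range(1, N + 1)}
for i, j in EDGES:
    adj[i].add(j)
    adj[j].add(i)

def bfs_counts(source):
    """Return (dist, sigma): geodesic lengths and numbers of geodesics."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:            # first visit fixes the distance
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:   # u precedes v on some geodesic
                sigma[v] += sigma[u]
    return dist, sigma

info = {s: bfs_counts(s) for s in adj}
bw = {i: 0.0 for i in adj}
for j, k in permutations(adj, 2):        # ordered pairs with j != k
    dist_j, sigma_j = info[j]
    if k not in dist_j:
        continue                         # j and k are not connected
    for i in adj:
        if i in (j, k) or i not in dist_j:
            continue
        dist_i, sigma_i = info[i]
        if dist_j[i] + dist_i[k] == dist_j[k]:
            bw[i] += sigma_j[i] * sigma_i[k] / sigma_j[k]

normalized = {i: b / ((N - 1) * (N - 2)) for i, b in bw.items()}
print(bw[3], round(normalized[3], 2))
```

For large networks, Brandes' algorithm computes the same quantity far more efficiently.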
Eigenvector Centrality
[Figure: rail network with the eigenvector centrality of each node: Boston SS (0.29), Albany (0.45), Brunswick (0), Boston NS (0), St Albans (0.19), Providence (0.25), Hartford (0.33), Springfield (0.49), New Haven (0.34), New York PS (0.31), Montreal (0.17), Rutland (0.17)]
Recursive definition: A node is as important as its neighbors are
e_i = \frac{1}{\lambda} \sum_j a_{ij} e_j
(A - \lambda I) E = 0, where (E, \lambda) = \mathrm{eig}(A)
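One common way to obtain the leading eigenvector is power iteration: multiply a start vector by A repeatedly and rescale. The sketch below assumes that approach (the slides do not say how the values were computed) and uses the 12-node example; names and the iteration count are illustrative.

```python
# Edge list of the 12-node example (nodes numbered 1..12).
EDGES = [(1, 3), (1, 8), (2, 3), (3, 6), (3, 7), (8, 9),
         (7, 9), (8, 10), (4, 10), (4, 7), (11, 12), (5, 7)]
N = 12

A = [[0] * N for _ in range(N)]
for i, j in EDGES:
    A[i - 1][j - 1] = A[j - 1][i - 1] = 1

def multiply(matrix, vector):
    """Matrix-vector product with plain lists."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

def eigenvector_centrality(matrix, iterations=1000):
    """Power iteration; returns the unit-length vector and its eigenvalue."""
    e = [1.0] * len(matrix)
    for _ in range(iterations):
        e = multiply(matrix, e)
        norm = sum(x * x for x in e) ** 0.5
        e = [x / norm for x in e]
    lam = sum(x * y for x, y in zip(e, multiply(matrix, e)))  # Rayleigh quotient
    return e, lam

E, lam = eigenvector_centrality(A)
for node, value in enumerate(E, start=1):
    print(node, round(value, 2))
```

The two nodes of the small component end up with centrality 0, because the leading eigenvector lives entirely on the larger component.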
Similarity and Triadic Closure
Connectivity between nodes may imply similarity: A is connected to B ⇒ A is similar to B (known as homophily in social networks).
Two dyads sharing a node become a triad.
[Figure: two diagrams of nodes A, B, and C]
Alternative interpretation: weak ties become strong ties (Granovetter).
[Figure: two diagrams of nodes A, B, and C]
Clustering Coefficient
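Clustering coefficient of a node with n neighbors:
C_i = \frac{2 \sum_{j<k} a_{ij} a_{ik} a_{jk}}{n(n-1)}
(the sum runs over unordered pairs j < k, so it counts the links among the neighbors of i)
C_i = 0: star
C_i = 1: clique (1, 4, 5, 6)
C_1 = 6/10
Average clustering coefficient: C = (.6 + .67 + 1 + 1 + 1 + 1)/6 = .88
“Birds of a feather flock together...” (William Turner)
[Figure: six-node example graph with clustering coefficients 1 (.6), 2 (.67), 3 (1.), 4 (1.), 5 (1.), 6 (1.)]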
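A minimal sketch of the computation. The slide's six-node graph cannot be recovered from the text, so the example below uses a small hypothetical graph (a triangle 1-2-3 with a pendant node 4 attached to node 1); nodes with fewer than two neighbors are assigned 0 by convention.

```python
from itertools import combinations

# Hypothetical graph: triangle 1-2-3 plus a pendant node 4 attached to node 1.
EDGES = [(1, 2), (1, 3), (2, 3), (1, 4)]

adj = {}
for i, j in EDGES:
    adj.setdefault(i, set()).add(j)
    adj.setdefault(j, set()).add(i)

def clustering(i):
    """Fraction of pairs of neighbors of i that are themselves linked."""
    n = len(adj[i])
    if n < 2:
        return 0.0  # convention: no pairs of neighbors to close
    links = sum(1 for j, k in combinations(adj[i], 2) if k in adj[j])
    return 2 * links / (n * (n - 1))

C = {i: clustering(i) for i in sorted(adj)}
average = sum(C.values()) / len(C)
print(C)
print(round(average, 2))
```

Node 1 has three neighbors and only one link among them (2-3), so its coefficient is 1/3; nodes 2 and 3 sit in a closed triangle and score 1.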
Modularity and Components
NSSI (self-cutters) online communities in LiveJournal (blogging social Web site) form six components
If these two components are merged, they form a giant component
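Modularity Q ∈ [−1/2, 1] measures the density of links inside clusters as compared to links between clusters:
Q = \frac{1}{\sum_{ij} a_{ij}} \sum_{ij} \left[ a_{ij} - \frac{\sum_{k} a_{ik} \sum_{k} a_{kj}}{\sum_{ij} a_{ij}} \right] \delta_{ij}
where \delta_{ij} = 1 if nodes i and j belong to the same cluster and 0 otherwise.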
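The modularity of a given partition can be sketched as follows. The code assumes the standard form of the formula, in which only pairs of nodes in the same cluster contribute; the example graph (two triangles joined by a single link) and all names are hypothetical.

```python
# Hypothetical graph: triangles 1-2-3 and 4-5-6 joined by the link 3-4.
EDGES = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]

def modularity(edges, community):
    """Modularity of the partition given as a node -> cluster label map."""
    nodes = sorted(community)
    a = {(i, j): 0 for i in nodes for j in nodes}
    for i, j in edges:
        a[i, j] = a[j, i] = 1
    k = {i: sum(a[i, j] for j in nodes) for i in nodes}  # node degrees
    total = sum(k.values())                              # sum of all a_ij
    q = 0.0
    for i in nodes:
        for j in nodes:
            if community[i] == community[j]:             # same cluster only
                q += a[i, j] - k[i] * k[j] / total
    return q / total

two_triangles = {1: "a", 2: "a", 3: "a", 4: "b", 5: "b", 6: "b"}
one_cluster = {node: "a" for node in range(1, 7)}
print(round(modularity(EDGES, two_triangles), 3))  # 0.357
print(round(modularity(EDGES, one_cluster), 3))    # 0.0
```

Putting every node into a single cluster always gives Q = 0, which is why modularity is read relative to that baseline.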
Assortativity
Assortative networks: nodes connect to nodes with similar degree; high modularity, better community structure
Disassortative networks: nodes connect to nodes with different degree
Network Formation
● Networks are complex systems composed of interconnected parts that as a whole exhibit properties not obvious from the properties of the individual parts.
● Most networks are not an immediate product of intelligent design.
Exponential Networks
● A.k.a. Erdős–Rényi networks
● Start with a fixed set of N nodes
● Randomly connect them with probability p
● Average degree λ = pN
● Binomial / Poisson degree distribution (decays exponentially after max)
● No small-world property!
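The construction above can be sketched in a few lines: visit every pair of nodes once and link it with probability p. The function name and the `seed` parameter are illustrative additions.

```python
import random
from itertools import combinations

def erdos_renyi(n, p, seed=None):
    """Link each of the n(n-1)/2 node pairs independently with probability p."""
    rng = random.Random(seed)
    return [(i, j) for i, j in combinations(range(n), 2) if rng.random() < p]

n, p = 200, 0.05
edges = erdos_renyi(n, p, seed=1)
average_degree = 2 * len(edges) / n
print(round(average_degree, 2))  # close to p * (n - 1), i.e. roughly p * n
```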
Small World Networks
● A.k.a. Watts–Strogatz networks
● Start with a fixed set of N nodes
● Connect each node to its m neighbors
● Rewire the connections with probability β
● Degree distribution: δ-function for β→0, binomial/Poisson for β→1 (unrealistic)
● Small-world, but no clustering!
[Figure: three panels, β=0, 0<β<1, β=1]
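A sketch of the rewiring procedure, under some assumptions not stated on the slide: the nodes sit on a ring, m is even so that each node starts with m/2 neighbors on either side, and a rewired link keeps one endpoint and picks a new partner uniformly among nodes it is not yet linked to. Function and parameter names are illustrative.

```python
import random

def watts_strogatz(n, m, beta, seed=None):
    """Ring lattice with m neighbors per node, each link rewired with prob. beta."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    # Regular ring lattice: link every node to m // 2 neighbors on each side.
    for i in range(n):
        for step in range(1, m // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    # Rewiring pass: keep endpoint i, replace endpoint j with a random node.
    for i in range(n):
        for step in range(1, m // 2 + 1):
            j = (i + step) % n
            if j in adj[i] and rng.random() < beta:
                candidates = [k for k in range(n) if k != i and k not in adj[i]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[i].discard(j)
                    adj[j].discard(i)
                    adj[i].add(new)
                    adj[new].add(i)
    return adj

lattice = watts_strogatz(20, 4, 0.0, seed=7)   # beta = 0: nothing is rewired
rewired = watts_strogatz(20, 4, 0.3, seed=7)   # some links become shortcuts
print(sorted(len(v) for v in rewired.values()))
```

Rewiring moves links without creating or destroying them, so the number of links stays at N·m/2 for every β.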
Scale Free Networks
● A.k.a. Barabási–Albert networks
● Start with a few nodes
● Attach a new node X to m existing nodes Y_i with probability proportional to the degrees of Y_i (preferential attachment)
● Power law degree distribution
● Small-world, community structure
● No meaningful average degree (scale-free)
● Fat tail
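Preferential attachment can be sketched with a list in which every node appears once per incident link, so that a uniform draw from the list picks a node with probability proportional to its degree. The seed graph (a complete graph on m + 1 nodes) and the names below are assumptions made for the illustration.

```python
import random

def barabasi_albert(n, m, seed=None):
    """Grow a network to n nodes; each new node links to m distinct old nodes."""
    rng = random.Random(seed)
    # Assumed starting point: a complete graph on m + 1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i + 1, m + 1)]
    # Each node appears in `stubs` once per incident link.
    stubs = [node for edge in edges for node in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:              # draw until m distinct targets
            targets.add(rng.choice(stubs))   # degree-proportional choice
        for t in targets:
            edges.append((new, t))
            stubs.extend((new, t))
    return edges

edges = barabasi_albert(300, 2, seed=1)
degree = {}
for i, j in edges:
    degree[i] = degree.get(i, 0) + 1
    degree[j] = degree.get(j, 0) + 1
print(len(edges), min(degree.values()), max(degree.values()))
```

The early nodes tend to collect the most links, which is what produces the fat tail of the degree distribution.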
Strategic Network Formation
● Formed on purpose
● Start with a fixed set of N nodes
● Add links to maximize utility: either globally or pairwise
● Topology depends on the costs and benefits
● Link cost c
● Benefit from direct connection δ
● Benefits from indirect connections δ^2, δ^3, δ^4, etc.
[Figure: three four-node networks arranged by δ vs c, from “cheap” links to “expensive” links. Complete network: every node earns 3δ−3c. Star: the center earns 3δ−3c and each peripheral node earns δ+2δ^2−c. Empty network: every node earns 0.]
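The payoffs in the figure follow from one rule: a node earns δ raised to the geodesic distance for every node it can reach and pays c for each of its own links. The sketch below assumes that rule (it appears to be the connections model associated with Jackson) and reproduces the three cases; the function name and the sample values δ = 0.5, c = 0.2 are illustrative.

```python
from collections import deque

def utilities(n, edges, delta, c):
    """Utility of each node: sum of delta**distance minus c per own link."""
    adj = {i: set() for i in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    result = {}
    for i in adj:
        dist = {i: 0}                      # breadth-first search from i
        queue = deque([i])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        benefit = sum(delta ** d for node, d in dist.items() if node != i)
        result[i] = benefit - c * len(adj[i])
    return result

delta, c = 0.5, 0.2
complete = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
star = [(0, 1), (0, 2), (0, 3)]            # node 0 is the center

print(utilities(4, complete, delta, c))    # every node: 3*delta - 3*c
print(utilities(4, star, delta, c))        # leaves: delta + 2*delta**2 - c
print(utilities(4, [], delta, c))          # empty network: all zeros
```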
Complex Behaviors
● Simple contagion: epidemics, rumor propagation
● Complex contagion: collective action, political views, fashion
● Information diffusion: effect of feedback
● Resilience
Simple Contagion
Susceptible – Infectious – Susceptible (SIS): At each step, a “healthy” (but susceptible) node gets infected by an infected neighbor with probability p, and an infected node recovers with probability r
Susceptible – Infectious – Recovered (SIR): same as in SIS, but a node cannot be reinfected
Spreads fast in power-law networks
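One time step of either model can be sketched as below. The code makes two assumptions that the slide leaves open: all nodes are updated at once from the previous state, and each infected neighbor gets its own independent chance p to pass on the infection. States are encoded as the strings "S", "I", and "R"; all names are illustrative.

```python
import random

def step(adj, state, p, r, rng, sir=False):
    """Advance the epidemic by one synchronous step and return the new state."""
    new_state = dict(state)
    for node, s in state.items():
        if s == "S":
            # Every infected neighbor independently transmits with probability p.
            for nb in adj[node]:
                if state[nb] == "I" and rng.random() < p:
                    new_state[node] = "I"
                    break
        elif s == "I" and rng.random() < r:
            # SIS: back to susceptible. SIR: recovered, never infected again.
            new_state[node] = "R" if sir else "S"
    return new_state

# Hypothetical three-node chain 0 - 1 - 2 with node 0 infected at the start.
adj = {0: {1}, 1: {0, 2}, 2: {1}}
rng = random.Random(0)
state = {0: "I", 1: "S", 2: "S"}
for _ in range(3):
    state = step(adj, state, p=1.0, r=1.0, rng=rng, sir=True)
    print(state)
```

With p = r = 1 the run is deterministic: the infection moves one hop per step and leaves recovered nodes behind it.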
Collective Action
A node becomes infected with probability p when either a certain number M or a certain fraction m of its neighbors is infectious
✔ “I will wear red pants if at least 50% of my friends wear red pants.”
✔ “I will use protocol X if at least 10 of my partners support protocol X.”
✔ “I will go to protest tax hikes if all my friends go with me.”
✔ “I will feel happy if people around me are happy.”
Supported by community structure:
✔ Structural trapping (few external links)
✔ Social reinforcement (many internal links)
✔ Homophily (“connected” means “similar”)
Success depends on the point of origin
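The fractional-threshold rule can be sketched as a simple cascade. The code assumes the deterministic case p = 1 (a node adopts as soon as the threshold is met) and a hypothetical graph of two triangles joined by one link; names are illustrative.

```python
# Hypothetical graph: triangles 1-2-3 and 4-5-6 joined by the link 3-4.
EDGES = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]

adj = {}
for i, j in EDGES:
    adj.setdefault(i, set()).add(j)
    adj.setdefault(j, set()).add(i)

def spread(adj, seeds, fraction):
    """Activate nodes until no inactive node has enough active neighbors."""
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for node in adj:
            if node in active or not adj[node]:
                continue
            share = len(adj[node] & active) / len(adj[node])
            if share >= fraction:
                active.add(node)
                changed = True
    return active

print(sorted(spread(adj, {1, 2}, 0.5)))  # [1, 2, 3]
print(sorted(spread(adj, {3, 4}, 0.5)))  # [1, 2, 3, 4, 5, 6]
```

Seeding inside one triangle leaves the behavior trapped there, because node 4 sees only one adopter among its three neighbors; seeding the two bridge nodes reaches everyone. The same threshold gives different outcomes depending on where the cascade starts.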
Information Diffusion
A network of senders and receivers
Each actor has knowledge, credibility, and popularity
Options for sender (speaker):
● To send (gain popularity, gain or lose credibility)
● Not to send (lose popularity)
Options for receiver (listener):
● Listen silently (gain knowledge, lose popularity)
● Listen and provide feedback (gain knowledge, gain popularity, gain or lose credibility)
Action based on Nash equilibrium
Resilience
Random attacks: fail random nodes
Targeted attacks: attack selected nodes
Exponential random networks:
● No difference between random and targeted attacks: the network gracefully degrades
Scale-free networks (robust yet fragile):
● Random attacks: the giant component survives.
● Targeted attacks: the giant component rapidly falls apart.
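The effect of an attack can be measured as the size of the largest component that remains. The sketch below is a small-scale illustration on the 12-node rail example (edge list read off the adjacency matrix) rather than a real scale-free network; the function name is illustrative, and nodes left without any link are ignored.

```python
# Edge list of the 12-node example (nodes numbered 1..12).
EDGES = [(1, 3), (1, 8), (2, 3), (3, 6), (3, 7), (8, 9),
         (7, 9), (8, 10), (4, 10), (4, 7), (11, 12), (5, 7)]

def giant_component_size(edges, removed=()):
    """Size of the largest connected component after deleting some nodes."""
    removed = set(removed)
    adj = {}
    for i, j in edges:
        if i in removed or j in removed:
            continue                      # a deleted node takes its links along
        adj.setdefault(i, set()).add(j)
        adj.setdefault(j, set()).add(i)
    seen, best = set(), 0
    for start in adj:                     # isolated nodes never enter adj
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:                      # depth-first traversal
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best

print(giant_component_size(EDGES))                  # 10: intact network
print(giant_component_size(EDGES, removed=(2, 5)))  # 8: two peripheral nodes fail
print(giant_component_size(EDGES, removed=(3, 7)))  # 5: the two hubs are attacked
```

Losing two low-degree nodes barely dents the largest component, while removing the two highest-degree nodes cuts it in half.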
Tools of the Craft
● Gephi: graph visualization
● Pajek: network algorithms and some visualization
● NetLogo: simple simulation environment (good for small-scale experiments)
● CFinder: community finder
● NodeXL: network visualization plugin for Excel
● networkx: Python library for network algorithms
Gephi
● Network science “paintbrush”
● Analysis and visualization of large networks
● Windows, Linux, MacOS
● Developed by the Gephi consortium
● Free and open source
Pajek
● “Spider” in Slovene
● Analysis and visualization of large networks
● Windows (runs on Linux in Wine)
● Developed by Batagelj and Mrvar
● Free, but not open source
Unusual applications
Reminder: If all you know is Network Science, everything looks like a network.
Unusual networks
● Networks of recipes and cooking ingredients (Adamic)
● Product space networks (Hidalgo)
● Human disease networks (Barabási)
● Flavor networks (Ahn)
● Soccer player networks (Onody / de Castro)
● And more!
Semantic Networks
● Two words are similar if they are used by similar people
● (But two people are similar if they use similar words!)
● Zinoviev, Stefanescu, Swenson, and Fireman, “Semantic Networks of Interests in Online NSSI Communities,” Proc. of Workshop “Words and Networks,” Evanston, IL, June 2012
Textual Networks
● Co-occurrence of actors in the New Testament
● A node is an actor; an edge is introduced if two actors are mentioned in the same chapter of a book at least once
● Bigger nodes: more mentions
● Zinoviev, research in progress, unpublished