Network theory David Lusseau BIOL4062/5062 d.lusseau@dal.ca

Upload: tyler-lawson

Post on 20-Jan-2016

Page 1: Network theory David Lusseau BIOL4062/5062 d.lusseau@dal.ca

Network theory

David Lusseau

BIOL4062/5062

d.lusseau@dal.ca

Outline

Today: basics of graph theory and network statistics

8 March: incorporating uncertainty, network models

13 March: community structure

Suggested readings:

Newman M.E.J. 2003. The structure and function of complex networks. SIAM Review 45, 167-256

What is a network

Set of objects (vertices) with connections (edges)

Represented by an adjacency matrix or a list

Adjacency matrix:

    1  2  3  4
1   0  0  0  1
2   0  0  1  0
3   0  1  0  1
4   1  0  0  1

Edge list:

v1     v2      weight
Hal    John    5
John   George  10
Liz    Hal     2
Beth   Liz     1
Beth   John    20
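
The matrix and the list carry the same kind of information, so one can be built from the other. As a minimal sketch, the edge list above can be converted into a weighted adjacency matrix as follows. The function name `edge_list_to_matrix` is illustrative, and treating the edges as undirected (symmetric matrix) is an assumption; the slide does not say whether these edges are directed.

```python
def edge_list_to_matrix(edges):
    """Build a symmetric weighted adjacency matrix from (v1, v2, weight) rows."""
    # Assumption: edges are undirected, so each weight is stored in both directions.
    names = sorted({v for v1, v2, _ in edges for v in (v1, v2)})
    index = {name: i for i, name in enumerate(names)}
    matrix = [[0] * len(names) for _ in names]
    for v1, v2, weight in edges:
        i, j = index[v1], index[v2]
        matrix[i][j] = weight
        matrix[j][i] = weight
    return names, matrix


# Edge list taken from the slide.
edges = [
    ("Hal", "John", 5),
    ("John", "George", 10),
    ("Liz", "Hal", 2),
    ("Beth", "Liz", 1),
    ("Beth", "John", 20),
]
names, matrix = edge_list_to_matrix(edges)
```

Rows and columns follow the alphabetical order stored in `names`, so `matrix[names.index("Beth")][names.index("John")]` holds the weight of the Beth-John edge.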

Types of networks

Undirected graph (weighted or not)

Directed graph (weighted or not)

Cyclic (contains at least one cycle)

Acyclic (no cycles)

Hypergraph (one edge can join more than two vertices)

Undirected graph

Directed graph

cyclic

acyclic?

Cycle: <(a,b),(b,c),(c,a)>
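
Whether a directed graph is cyclic can be checked with a depth-first search: a cycle exists when the search meets a vertex that is still on the current path. The sketch below is one common way to do this, not necessarily the approach used in the course; the function name `has_cycle` and the dictionary-of-successors input format are assumptions.

```python
def has_cycle(adj):
    """Return True if the directed graph contains a cycle (depth-first search).

    adj maps every vertex to a list of the vertices its out-edges point to.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {v: WHITE for v in adj}

    def visit(v):
        colour[v] = GREY
        for w in adj[v]:
            if colour[w] == GREY:  # back edge: w is an ancestor of v
                return True
            if colour[w] == WHITE and visit(w):
                return True
        colour[v] = BLACK
        return False

    return any(colour[v] == WHITE and visit(v) for v in adj)


# The cycle from the slide: <(a,b),(b,c),(c,a)>
cyclic = {"a": ["b"], "b": ["c"], "c": ["a"]}
# Hypothetical acyclic counterpart: the edge (c,a) is replaced by (a,c).
acyclic = {"a": ["b", "c"], "b": ["c"], "c": []}
```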

Hypergraph

Meyers et al. 2004 J. Theor. Biol.

Some terminology

Component: set of interconnected vertices (s)

(in- and out- components in a directed graph)

Giant component: the largest component in the graph (S)
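
Components of an undirected graph can be found by repeatedly running a breadth-first search from any vertex not yet visited. This is a minimal sketch under that assumption; the function names and the small example graph are hypothetical.

```python
from collections import deque


def components(adj):
    """Return the components of an undirected graph as a list of vertex sets."""
    seen = set()
    result = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        members = {start}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    members.add(w)
                    queue.append(w)
        result.append(members)
    return result


def giant_component(adj):
    """Return the largest component (S)."""
    return max(components(adj), key=len)


# Hypothetical graph with three components of sizes 3, 2 and 1.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "d": ["e"], "e": ["d"], "f": []}
sizes = sorted(len(c) for c in components(adj))
```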

Degree: number of edges connected to a vertex (k)

(in- and out- degrees in a directed graph)

Geodesic path: the shortest path through the network from one vertex to another (l)

Diameter: length of the longest geodesic path (d)
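
The three definitions above translate directly into code for an unweighted, undirected graph. The sketch below assumes the graph is connected (so every geodesic is finite) and uses a hypothetical five-vertex example; the function names are illustrative.

```python
from collections import deque


def degree(adj, v):
    """k: number of edges connected to v."""
    return len(adj[v])


def geodesic_lengths(adj, source):
    """l: shortest path length from source to every reachable vertex (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def diameter(adj):
    """d: length of the longest geodesic path (assumes a connected graph)."""
    return max(max(geodesic_lengths(adj, v).values()) for v in adj)


# Hypothetical graph: a path a-b-c-d with an extra vertex e attached to b.
adj = {"a": ["b"], "b": ["a", "c", "e"], "c": ["b", "d"], "d": ["c"], "e": ["b"]}
```

Here `degree(adj, "b")` is 3, the geodesic from a to d has length 3, and the diameter is also 3.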

[Figure: example network with four components: v=7, e=9; v=19, e=27; v=3, e=2; v=1, e=0 (v = number of vertices, e = number of edges)]

[Figure: example vertices with degree k=0 and k=4]

[Figure: directed graph with example vertices labelled k_in=4, k_out=4; k_in=2, k_out=3; k_in=2, k_out=1]

[Figure: geodesic path length l(a,b)=2; diameter of component 4, d(4)=5]

Other centrality measures

Betweenness

Eigenvector

Reach

Clustering coefficient

Betweenness and bottleneck

Number of geodesic paths passing through a vertex

[Figure: example graph with vertices A, B, C, D, E]

Betweenness of B = 1 + 1 + 1 = 3

Betweenness of D = ½ + ½ = 1
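
As a rough sketch of how values like these arise, the code below counts geodesic paths by breadth-first search on a hypothetical five-vertex graph. The edge set is an assumption chosen so that the numbers for B and D come out as on the slides (the figure itself is not recoverable), and the function names are illustrative. When two equally short geodesic paths join a pair of vertices, each intermediate vertex is credited with the fraction of those paths it lies on, which is where the ½ terms come from.

```python
from collections import deque
from itertools import combinations


def bfs_counts(adj, source):
    """Return (distance, number of geodesic paths) from source to each vertex."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:  # u precedes w on a geodesic path
                sigma[w] += sigma[u]
    return dist, sigma


def betweenness(adj):
    """Betweenness of every vertex in an undirected, unweighted graph."""
    info = {v: bfs_counts(adj, v) for v in adj}
    score = {v: 0.0 for v in adj}
    for s, t in combinations(sorted(adj), 2):
        dist_s, sigma_s = info[s]
        dist_t, sigma_t = info[t]
        if t not in dist_s:
            continue  # s and t lie in different components
        for v in adj:
            if v in (s, t) or v not in dist_s:
                continue
            # v lies on a geodesic s-t path when the two legs add up exactly.
            if dist_s[v] + dist_t[v] == dist_s[t]:
                score[v] += sigma_s[v] * sigma_t[v] / sigma_s[t]
    return score


# Hypothetical graph (assumed, not taken from the slide's figure).
adj = {
    "A": ["B"],
    "B": ["A", "C", "D"],
    "C": ["B", "D", "E"],
    "D": ["B", "C", "E"],
    "E": ["C", "D"],
}
score = betweenness(adj)  # score["B"] is 3.0 and score["D"] is 1.0
```

B lies on every geodesic path joining A to C, D and E (1 + 1 + 1), while D lies on one of the two geodesic paths for each of the pairs A-E and B-E (½ + ½).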

Eigenvector

Eigenvector associated with the dominant eigenvalue of the adjacency matrix

e_i integrates the connectivity of i (its degree) and the connectivity of its neighbours
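
One simple way to approximate this eigenvector is power iteration: repeatedly replace each vertex's score by a sum over its neighbours' scores and renormalise. The sketch below assumes a connected, undirected, unweighted graph and iterates with A + I rather than A (a shift that leaves the eigenvectors unchanged but avoids oscillation on bipartite graphs). The function name, the iteration count and the example graph are all assumptions.

```python
import math


def eigenvector_centrality(adj, iterations=200):
    """Approximate the dominant eigenvector of the adjacency matrix."""
    x = {v: 1.0 for v in adj}
    for _ in range(iterations):
        # (A + I) x: each vertex keeps its own score and adds its neighbours'.
        new = {v: x[v] + sum(x[w] for w in adj[v]) for v in adj}
        norm = math.sqrt(sum(value * value for value in new.values()))
        x = {v: value / norm for v, value in new.items()}
    return x


# Hypothetical graph: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
scores = eigenvector_centrality(adj)
```

Vertex 2 scores highest because it has the most neighbours; vertices 0 and 1 score higher than the pendant vertex 3, which has a single neighbour.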

Reach

Number of vertices that can be reached in k steps as a proportion of vertices in the network

Typically 2 or fewer steps

Reach is a centrality measure that also integrates link redundancy (are your friends only talking to your friends?)
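
Reach can be computed with a breadth-first search that stops after k steps. In this sketch the start vertex is not counted and the proportion is taken over the other N - 1 vertices; that denominator is an assumption, since the slides only say "proportion of vertices in the network". The function name and example graph are hypothetical.

```python
from collections import deque


def reach(adj, start, k=2):
    """Proportion of the other vertices reachable from start in at most k steps."""
    if len(adj) < 2:
        return 0.0
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if dist[u] == k:
            continue  # do not expand beyond k steps
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return (len(dist) - 1) / (len(adj) - 1)


# Hypothetical path graph a-b-c-d-e.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
```

With k = 2, the end vertex a reaches two of the four other vertices (0.5), while the middle vertex c reaches all of them (1.0). If a vertex's neighbours are mostly connected to each other, few new vertices appear at the second step, which is how redundancy lowers reach.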

Clustering coefficient

1 triangle, 8 connected triples: C = (3*1)/8 = 3/8

Each triangle contributes to 3 triples

Local clustering coefficient: C_i = (number of triangles connected to i) / (number of triples centred on i)

[Figure labels, local clustering coefficient per vertex: 3/3=1, 3/3=1, 0/1=0, 0/1=0, 3/6=0.5]
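
The counts above can be reproduced on a small graph. The sketch below uses a hypothetical five-vertex graph with one triangle and eight connected triples (a triangle 0-1-2 plus two pendant vertices on vertex 0); it is an assumption that this matches the slide's figure. Under the definition used here (triangles through i divided by triples centred on i, as in Newman 2003), the degree-4 vertex scores 1/6, whereas the slide labels it 3/6; the slide may be counting each triangle three times, so the local values should be compared with care.

```python
from fractions import Fraction
from itertools import combinations


def triples_and_triangles(adj, i):
    """Return (connected triples centred on i, triangles that include i)."""
    neighbours = sorted(adj[i])
    triples = len(neighbours) * (len(neighbours) - 1) // 2
    triangles = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    return triples, triangles


def global_clustering(adj):
    """C = 3 * (number of triangles) / (number of connected triples)."""
    total_triples = 0
    closed = 0
    for i in adj:
        triples, triangles = triples_and_triangles(adj, i)
        total_triples += triples
        closed += triangles  # each triangle is seen once per corner, i.e. 3 times
    return Fraction(closed, total_triples) if total_triples else Fraction(0)


def local_clustering(adj, i):
    """C_i; vertices with fewer than two neighbours have no triples and score 0."""
    triples, triangles = triples_and_triangles(adj, i)
    return Fraction(triangles, triples) if triples else Fraction(0)


# Hypothetical graph: triangle 0-1-2, with pendant vertices 3 and 4 on vertex 0.
adj = {0: {1, 2, 3, 4}, 1: {0, 2}, 2: {0, 1}, 3: {0}, 4: {0}}
C = global_clustering(adj)  # Fraction(3, 8)
```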

Dealing with weighted matrices

First option: do not deal with them

Ignore the weight of the edges

Transform the weighted matrices into binary matrices

Meaningful measures: w_ij > expected by chance

Significance and relevance to hypotheses

Keep edge ij when w_ij exceeds a threshold weight w
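
Transforming a weighted matrix into a binary one amounts to applying a cutoff to every entry. This is a minimal sketch assuming a strict "greater than" rule and a user-chosen threshold; the slide does not specify how the threshold is picked, and the function name and example weights are hypothetical.

```python
def binarise(weights, threshold):
    """Return a 0/1 matrix with 1 wherever the weight exceeds the threshold."""
    return [[1 if w > threshold else 0 for w in row] for row in weights]


# Hypothetical symmetric weight matrix for three individuals.
weights = [
    [0, 5, 1],
    [5, 0, 2],
    [1, 2, 0],
]
binary = binarise(weights, threshold=1.5)
```

With a threshold of 1.5 the weakest association (weight 1) is dropped and the other two edges are kept.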

Extending to weighted matrices

Retrieve more information

Relevance of binary matrix statistics

strength ↔ degree:

s_i = \sum_{j=1}^{N} a_{ij}
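
Strength is the weighted counterpart of degree: it sums the weights in row i, whereas degree only counts the non-zero entries. A small sketch with a hypothetical weight matrix (function names are illustrative):

```python
def strength(a, i):
    """s_i: sum of the weights of the edges attached to vertex i."""
    return sum(a[i])


def degree(a, i):
    """k_i: number of edges attached to vertex i, ignoring their weights."""
    return sum(1 for value in a[i] if value != 0)


# Hypothetical symmetric weight matrix.
a = [
    [0, 5, 0],
    [5, 0, 2],
    [0, 2, 0],
]
```

Vertex 1 has degree 2 but strength 7, so two vertices with the same degree can differ widely in strength.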

Some examples of real world networks

Social networks

Contact networks

Food webs

Man-made networks (internet, electricity grid)

Metabolite interactions

…

High school dating

Bearman et al. 2004 Am. J. Soc.

Graph by M.E.J. Newman

High school friendship

Moody 2001 Am. J. Soc.

Internet

Cheswick, Bell Labs

Food web: Caribbean coral reef system

Human protein-protein interactions

Chinnaiyan et al. 2005 Nature Biotech

Tools for network analyses

Ucinet/Netdraw (http://www.analytictech.com/)

Socprog (http://myweb.dal.ca/hwhitehe/social.htm)

Pajek (http://vlado.fmf.uni-lj.si/pub/networks/pajek/)

Jung (JAVA) (http://jung.sourceforge.net)

SNA (R package) (http://erzuli.ss.uci.edu/R.stuff)

Net.Linux (Linux OS) (http://pil.phys.uniroma1.it/%7Eservedio/software.html)

Visualising large graphs

Graphviz (http://www.graphviz.org)

Yed (http://www.yworks.com/en/products_yed_about.htm)