alternativa 1alternativa 2mattina (fariselli)pomeriggio (martelli) lun 1 ott svm (ml)grafi (sb) mar...

66
Alternativ a 1 Alternativ a 2 Mattina (Fariselli) Pomeriggio (Martelli) Lun 1 Ott Lun 1 Ott SVM (ML) Grafi (SB) Mar 2 Ott Mar 2 Ott System Biology (SB) Probabilita (ML) Mer 3 Ott Mer 3 Ott System Biology (SB) HMM (ML) Ven 5 Ott Lun 8 Ott System Biology (SB) HMM (ML) Lun 8 Ott Mar 9 Ott Linux Linux e python Mar 9 Ott Mer 10 Ott Esercitazione su SVM Mer 10 Ott Gio 11 Ott Esercitazione su HMM rsi vengono integrati e conterranno grosso modo due moduli: SB: systems biology ML: machine learning

Post on 19-Dec-2015

219 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Alternativa 1 Alternativa 2 Mattina (Fariselli) Pomeriggio (Martelli)

Lun 1 Ott Lun 1 Ott SVM (ML) Grafi (SB)

Mar 2 Ott Mar 2 Ott System Biology (SB) Probabilita (ML)

Mer 3 Ott Mer 3 Ott System Biology (SB) HMM (ML)

Ven 5 Ott Lun 8 Ott System Biology (SB) HMM (ML)

Lun 8 Ott Mar 9 Ott Linux Linux e python

Mar 9 Ott Mer 10 Ott Esercitazione su SVM

Mer 10 Ott Gio 11 Ott Esercitazione su HMM

I corsi vengono integrati e conterranno grosso modo due moduli:SB: systems biologyML: machine learning

Page 2: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

A branch of science that seeks to integrate different levels of information to understand how biological systems function.

It is not (only) the number and properties of system elements but their relations!!

L. Hood: “Systems biology defines and analyses the interrelationships of all of the elements in a functioning system in order to understand how the system works.”

Systems Biology. What Is It?

Page 3: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

More on Systems BiologyMore on Systems Biology

Essence of living systems is flow of mass,energy, and information in space and time.

The flow occurs along specific networks

Flow of mass and energy (metabolic networks)

Flow of information involving DNA (transcriptional regulation networks)

Flow of information not involving DNA (signaling networks)

The Goal of Systems Biology: To understand the flow of mass, energy, and information in living systems.

Page 4: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Networks and the Core Concepts of Systems Biology

(i) Complexity emerges at all levels of the hierarchy of life

(ii) System properties emerge from interactions of components

(iii) The whole is more than the sum of the parts.

(iv) Applied mathematics provides approaches to modeling biological systems.

Page 5: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

How to Describe a System As a Whole?

Networks - The Language of Complex Systems

Page 6: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Air Transportation Network

Page 7: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

The World Wide Web

Page 8: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Fragment of a Social Network(Melburn, 2004)

Friendship among 450 people in Canberra

Page 9: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Biological NetworksA. Intra-Cellular Networks

Protein interaction networks

Metabolic Networks

Signaling NetworksGene Regulatory Networks

Composite networksNetworks of Modules, Functional Networks Disease networks

B. Inter-Cellular NetworksNeural Networks

C. Organ and Tissue Networks

D. Ecological Networks

E. Evolution Network

Page 10: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

The Protein Interaction Network of Yeast

Yeast two hybridUetz et al, Nature 2000

Page 11: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Source: ExPASy

Metabolic Networks

Page 12: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott
Page 13: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Gene Regulation Networks

Page 14: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

protein-gene interactions

protein-protein interactions

PROTEOME

GENOME

Citrate Cycle

METABOLISM

Bio-chemical reactions

L-A Barabasi

miRNAregulation?

- -

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

Page 15: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Functional Networks

Number of shared proteins

20125

55740

43221

33419

28692

12160

20147

7

83 65

41 49

172 75

321

24 260

11

596

35

103

Cell Cycle Cell Polarity & Structure

Intermediateand EnergyMetabolism

Protein Synthesisand Turnover

Protein RNA / Transport

RNAMetabolism

Signaling

Transcription/DNAMaintenance/Chromatin Structure

Number of protein complexes

Number of proteins 13111

8 6125 40

77 19 14

97

30 16 27

11

75 299

53

37

19

7 15

22187

33 73

13

94

MembraneBiogenesis &Turnover

Yeast: 1400 proteins, 232 complexes, nine functional groups of complexes

(Data A.-M. Gavin et al. (2002) Nature 415,141-147)

D. Bonchev, Chemistry & Biodiversity 1(2004)312-326

Page 16: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

What is a Network?

Network is a mathematical structure composed of points connected by lines

Network Theory <-> Graph Theory

Network Graph

Nodes Vertices (points)

Links Edges (Lines)

A network can be build for any functional system

System vs. Parts = Networks vs. Nodes

Page 17: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

The 7 bridges of Königsberg

The question is whether it is possible to walk with a route that crosses each bridge exactly once.

Page 18: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

The representation of Euler

The shape of a graph may be distorted in any way without changing the graph itself, so long as the links between nodes are unchanged. It does not matter whether the links are straight or curved, or whether one node is to the left or right of another.

In 1736 Leonhard Euler formulated the problem in terms of abstracted the case of Königsberg:

1) by eliminating all features except the landmasses and the bridges connecting them;

2) by replacing each landmass with a dot (vertex) and each bridge with a line (edge).

Page 19: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

3

3

5 3

The solution depends on the node degree

In a continuous path crossing the edges exactly once, each visited node requires an edge for entering and a different edge for exiting (except for the start and the end nodes).

A path crossing once each edge is called Eulerian path.It possible IF AND ONLY IF there are exactly two or zero nodes of odd degree. Since the graph corresponding to Königsberg has four nodes of odd degree, it cannot have an Eulerian path.

Page 20: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

The solution depends on the node degree

If there are two nodes of odd degree, those must be the starting and ending points of an Eulerian path.

2

3

4

5

1

Start

6

End

Page 21: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Hamiltonian paths

Find a path visiting each node exactly one

Conditions of existence for Hamiltonian paths are not simple

Page 22: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Hamiltonian paths

Page 23: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Graph nomenclature

Graphs can be simple or multigraphs, depending on whetherthe interaction between two neighboring nodes is unique or can be multiple, respectively.

A node can have or not self loops

Page 24: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Graph nomenclature

Networks can be undirected or directed, depending on whether the interaction between two neighboring nodes proceeds in both directions or in only one of them, respectively.

The specificity of network nodes and links can be quantitatively characterized by weights

2.5

2.5

7.3 3.3 12.7

8.1

5.4

Vertex-Weighted Edge-Weighted

1 2 3 4 5 6

Page 25: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Networks having no cycles are termed trees. The

more cycles thenetwork has, the more complex it

is.

A network can be connected (presented by a

single component) or disconnected (presented by

several disjoint components).

connected disconnected

trees

cyclic graphs

Graph nomenclature

Page 26: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Paths

Stars

Cycles

Complete Graphs

Graph nomenclature

Page 27: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Large graphs = Networks

Page 28: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

• Vertex degree distribution (the degree of a vertex is the number of vertices connected with it via an edge)

Statistical features of networks

Page 29: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

• Clustering coefficient: the average proportion of neighbours of a vertex that are themselves neighbours

Statistical features of networks

Node4 Neighbours (N)

2 Connections among the Neighbours

6 possible connections among the Neighbours(Nx(N-1)/2)

Clustering for the node = 2/6

Clustering coefficient: Average over all the nodes

Page 30: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

• Clustering coefficient: the average proportion of neighbours of a vertex that are themselves neighbours

Statistical features of networks

C=0

C=0

C=0

C=1

Page 31: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Given a pair of nodes, compute the shortest path between them

• Average shortest distance between two vertices

• Diameter: maximal shortest distance

Statistical features of networks

How many degrees of separation are they between two random people in the world, when friendship networks are considered?

Page 32: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

How to compute the shortest path between home and work?

Edge-weighted Graph

The exaustive search can be too much time-consuming

Page 33: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

The Dijkstra’s algorithm

Initialization:Fix the distance between “Casa” and

“Casa” equal to 0Compute the distance between “Casa” and

its neighboursSet the distance between “Casa” and its

NON-neighbours equal to ∞

Fixed nodesNON –fixed nodes

Page 34: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

The Dijkstra’s algorithm

Iteration (1):Search the node with the minimum

distance among the NON-fixed nodes and Fix its distance, memorizing the incoming direction

Fixed nodesNON –fixed nodes

Page 35: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

4

Iteration (2):

Update the distance of NON-fixed nodes, starting from the fixed distances

The Dijkstra’s algorithm

Fixed nodesNON –fixed nodes

Page 36: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

The Dijkstra’s algorithm

Fixed nodesNON –fixed nodes

The updated distance is different from the previous one

Iteration:

Fix the NON-fixed nodes with minimum distance

Update the distance of NON-fixed nodes, starting from the fixed distances.

Page 37: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

The Dijkstra’s algorithm

Fixed nodesNON –fixed nodes

Iteration:

Fix the NON-fixed nodes with minimum distance

Update the distance of NON-fixed nodes, starting from the fixed distances.

Page 38: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

The Dijkstra’s algorithm

Fixed nodesNON –fixed nodes

Iteration:

Fix the NON-fixed nodes with minimum distance

Update the distance of NON-fixed nodes, starting from the fixed distances.

Page 39: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

The Dijkstra’s algorithm

Fixed nodesNON –fixed nodes

Iteration:

Fix the NON-fixed nodes with minimum distance

Update the distance of NON-fixed nodes, starting from the fixed distances.

Page 40: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

The Dijkstra’s algorithm

Fixed nodesNON –fixed nodes

Iteration:

Fix the NON-fixed nodes with minimum distance

Update the distance of NON-fixed nodes, starting from the fixed distances.

Page 41: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

The Dijkstra’s algorithm

Fixed nodesNON –fixed nodes

Conclusion:

The label of each node represents the minimal distance from the starting node

The minimal path can be reconstructed with a back-tracing procedure

Page 42: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

• Average shortest distance between two vertices

• Diameter: maximal shortest distance

Statistical features of networks

• Vertex degree distribution

• Clustering coefficient

Page 43: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Two reference models for networks

Regular network (lattice) Random network (Erdös+Renyi, 1959)

Each edge is randomly set with probability p

Regular connections

Page 44: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Two reference models for networksComparing networks with the same number of nodes (N) and edges

Degree distribution

Poisson distribution

!k

ekPk

Exp decay

Average shortest path

Average connectivity

≈ N ≈ log (N)

high low

Page 45: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Some examples for real networks

Network size vertex degree

shortest path

Shortest path in fitted random graph

Clustering

Clustering in random graph

Film actors 225,226 61 3.65 2.99 0.79 0.00027

MEDLINE coauthorship

1,520,251

18.1 4.6 4.91 0.43 1.8 x 10-4

E.Coli substrate graph

282 7.35 2.9 3.04 0.32 0.026

C.Elegans neuron network

282 14 2.65 2.25 0.28 0.05

Real networks are not regular (low shortest path)Real networks are not random (high clustering)

Page 46: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Adding randomness in a regular network

Random changes in edges

OR

Addition of random links

Page 47: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Adding randomness in a regular network(rewiring)

Networks with high clustering (like regular ones) and low path length (like random ones) can be obtained:SMALL WORLD NETWORKS (Strogatz and Watts, 1999)

Page 48: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Small World Networks

A small amount of random shortcuts can decrease the path length, still maintaining a high clustering: this model “explains” the 6-degrees of separations in human friendship network

Page 49: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

What about the degree distribution in real networks?

Both random and small world models predict an approximate Poisson distribution: most of the values are near the mean;Exponential decay when k gets higher: P(k) ≈ e-k, for large k.

Page 50: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

What about the degree distribution in real networks?

In 1999, modelling the WWW (pages: nodes; link: edges), Barabasi and Albert discover a slower than exponential decay:

P(k) ≈ k-a with 2 < a < 3, for large k

Page 51: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Scale-free networks

Networks that are characterized by a power-law degree distribution are highly non-uniform: most of the nodes have only a few links. A few nodes with a very large number of links, which are often called hubs, hold these nodes together. Networks with a power degree distribution are called scale-free

hubs

It is the same distribution of wealth following Pareto’s 20-80 law:Few people (20%) possess most of the wealth (80%), most of the people (80%) possess the rest (20%)

Page 52: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Hubs

Attacks to hubs can rapidly destroy the network

Page 53: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Three non biological scale-free networks

Albert and Barabasi, Science 1999

Note the log-log scaleLINEAR PLOT

kAkPkAkP loglog)(log)(

Page 54: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

How can a scale-free network emerge?

Network growth models: start with one vertex.

Page 55: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

How can a scale-free network emerge?

Network growth models: new vertex attaches to existing vertices by preferential attachment: vertex tends choose vertex according to vertex degree

In economy this is called Matthew’s effect: The rich get richerThis explain the Pareto’s distribution of wealth

Page 56: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

How can a scale-free network emerge?

Network growth models: hubs emerge(in the WWW: new pages tend to link to existing, well linked pages)

Page 57: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Metabolic pathways are scale-free

Hubs are pyruvate, coenzyme A….

Page 58: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Protein interaction networks are scale-free

Degree is in some measure related to phenotypic effect upon gene knock-out

Red : lethalGreen: non lethalYellow: Unknown

Page 59: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Titz et al, Exp Review Proteomics, 2004

Caveat: different experiments give different results

Page 60: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

How can a scale-free network emerge?

Gene duplication (and differentiation): duplicated genes give origin to a protein that interacts with the same proteins as the original protein (and then specializes its functions)

Page 61: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Caveat on the use of the scale-free theory

The same noisy data can be fitted in different ways

Finding a scale-free behaviour do NOT imply the growth with preferential attachment mechanism

A sub-net of a non-free-scale network can have a scale-free behaviour

Keller, BioEssays 2006

Page 62: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Hierarchical networks

Standard free scale models have low clustering: a modular hierarchical model accounts for high clustering, low average path and scale-freeness

Page 63: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Sub-graphs more represented than expected

Modules

209 bi-fan motifs found in the E.coli regulatory network

Page 64: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Summary

Many complex networks in nature and technology have common features.

They differ considerably from random networks of the same size

By studying network structure and dynamics, and by using comparative network analysis, one can get answers of important biological questions.

Page 65: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Fundamental Biological Questions to Answer

(i) Which interactions and groups of interactions are likely to have equivalent functions across species? (ii) Based on these similarities, can we predict new functional information about proteins and interactions that are poorly characterized?(iii) What do these relationships tell us about the evolution of proteins, networks and whole species? (iv) How to reduce the noise in biological data: Which interactions represent true binding events?

False-positive interaction is unlikely to be reproduced across the interaction maps of multiple species.

Page 66: Alternativa 1Alternativa 2Mattina (Fariselli)Pomeriggio (Martelli) Lun 1 Ott SVM (ML)Grafi (SB) Mar 2 Ott System Biology (SB)Probabilita (ML) Mer 3 Ott

Barabasi and Oltvai (2004) Network Biology: understanding the cell’s functional organization. Nature Reviews Genetics 5:101-113

Stogatz (2001) Exploring complex networks. Nature 410:268-276

Hayes (2000) Graph theory in practice. American Scientist 88:9-13/104-109

Mason and Verwoerd (2006) Graph theory and networks in Biology

Keller (2005) Revisiting scale-free networks. BioEssays 27.10: 1060-1068