DREAM4 Puzzle – inferring network structure from microarray data Qiong Cheng

Upload: diane-borders

Post on 14-Dec-2015

DREAM4 Puzzle – inferring network structure from microarray data

Qiong Cheng

Outline

– Gene Network
– Gene Regulatory Systems and Related Work
– FunGen: Reconstructing Biological Networks Using Conditional Correlation Analysis
– ARACNE: Algorithm for Reconstructing Accurate Cellular Network

Gene Network

Directed network
– nodes: genes
– edges: regulation
– including loops
– Scale-free:
  • Degree distribution: power law, $P(k) \sim k^{-\lambda}$

Genetic Network Generation Schematic

de Jong H. Modeling and simulation of genetic regulatory systems: a literature review. J Comput Biol 2002;9(1):67-103

Random Network Model

ER model
– each pair of nodes is connected by an edge with probability p
– edges are independent of one another
– Poisson degree distribution (e.g. $P(k) \sim e^{-k}$ for large $k$)

BA model
– Scale-free degree distribution ($P(k) \sim k^{-x}$)
– Process: new nodes preferentially attach to nodes that already have high degree

http://arxiv.org/pdf/cond-mat/0010278
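The two generative models above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the generator used by DREAM: the function names, the seed-clique initialization, and the parameter values (500 nodes, p = 0.008, m = 2) are assumptions chosen so that both graphs have a similar number of edges.

```python
import random
from collections import Counter

def er_graph(n, p, rng):
    """ER model: every node pair is linked independently with probability p."""
    return [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p]

def ba_graph(n, m, rng):
    """BA-style growth: each new node attaches to m distinct existing nodes,
    picked with probability proportional to their current degree."""
    # Assumed initialization: a small fully connected seed of m + 1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i + 1, m + 1)]
    # Each node appears in `stubs` once per incident edge, so a uniform draw
    # from `stubs` is a degree-proportional draw over nodes.
    stubs = [v for e in edges for v in e]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.append((new, t))
            stubs += [new, t]
    return edges

def degree_counts(edges):
    """Map degree k -> number of nodes with that degree."""
    deg = Counter(v for e in edges for v in e)
    return Counter(deg.values())

rng = random.Random(0)
er = er_graph(500, 0.008, rng)
ba = ba_graph(500, 2, rng)
print("ER: edges =", len(er), "max degree =", max(degree_counts(er)))
print("BA: edges =", len(ba), "max degree =", max(degree_counts(ba)))
```

Plotting `degree_counts` on log-log axes is the usual way to compare the narrow ER distribution with the heavy tail produced by preferential attachment.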

Random Network Model

Module extraction from a source random scale-free network (used by DREAM3)
– Hierarchical scale-free network
– Extraction: start from a random seed node, then iteratively add the neighbor node that gives the highest modularity Q

Marbach D, Schaffter T, Mattiussi C, and Floreano D (2009) Generating Realistic in silico Gene Networks for Performance Assessment of Reverse Engineering Methods. J Comput Biol, 16(2):229–239

$$Q = \frac{1}{4m}\, s^{T} B s; \qquad B_{ij} = A_{ij} - P_{ij}; \qquad P_{ij} = \frac{k_i k_j}{2m}$$
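As a worked instance of the modularity formula, the sketch below evaluates Q for a two-way split of a small undirected graph. It assumes the usual reading of the symbols (A is the adjacency matrix, k_i the degree of node i, m the number of edges, and s_i = +1 or -1 marks which side of the split node i is on); the toy graph is hypothetical, and the code only scores a given split rather than performing the seed-and-grow extraction.

```python
def modularity(adj, s):
    """Q = (1/4m) * s^T B s with B_ij = A_ij - k_i*k_j/(2m).

    adj: symmetric 0/1 adjacency matrix (list of lists)
    s:   list of +1/-1 labels, one per node
    """
    n = len(adj)
    k = [sum(row) for row in adj]   # node degrees
    m = sum(k) / 2                  # number of undirected edges
    q = 0.0
    for i in range(n):
        for j in range(n):
            b_ij = adj[i][j] - k[i] * k[j] / (2 * m)
            q += s[i] * b_ij * s[j]
    return q / (4 * m)

# Hypothetical toy graph: two triangles (0,1,2) and (3,4,5) joined by edge 2-3.
edge_list = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for a, b in edge_list:
    adj[a][b] = adj[b][a] = 1

q_split = modularity(adj, [1, 1, 1, -1, -1, -1])   # one triangle per module
q_whole = modularity(adj, [1, 1, 1, 1, 1, 1])      # no split at all
print(q_split, q_whole)
```

Splitting along the bridge gives Q = 5/14, while leaving the graph undivided gives Q = 0, which is why a growth procedure that maximizes Q tends to stay inside densely connected groups.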

Microarray Data Distributions

Benford’s law (in base 10): $P(D) = \log_{10}(1 + D^{-1})$

Zipf’s law: microarray data

Log-normal distribution: a potential distribution for normalizing the bulk of the corrected spot intensities

Noise

Source: “Make Sense Of Microarray Data Distributions”
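Benford's law is easy to check numerically. The sketch below tabulates the expected first-digit probabilities and extracts leading digits from a handful of intensities; the helper names are assumptions, and the six sample values are taken from the demo input file shown later in these slides, far too few to test the law and used only to show the mechanics.

```python
import math

def benford(d):
    """Expected probability that the first significant digit equals d (1..9)."""
    return math.log10(1 + 1 / d)

def leading_digit(x):
    """First significant digit of a positive number, via scientific notation."""
    return int(f"{x:e}"[0])

expected = {d: benford(d) for d in range(1, 10)}
for d, p in expected.items():
    print(d, round(p, 4))

# Illustrative intensities (values from the demo .exp file).
intensities = [16.477367, 7.6989274, 8.8098955, 20.150969, 26.04019, 21.554955]
digits = [leading_digit(x) for x in intensities]
print(digits)
```

With a real array one would compare the observed digit frequencies against `expected`, for example with a chi-square statistic.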

Reverse Engineering

Clustering + …
Correlation measures + …

Optimization methods
– Bayesian network (conditional independence via DAG)
– Markov chains
– Dynamic Bayesian network
– Expectation maximization (maximum likelihood)
– Genetic algorithm (GA)
– Neural network

Simulation
– Piecewise-linear differential equations
– Stochastic equations
– Stochastic/hybrid Petri net
– Boolean network

Regression techniques

FunGen: Reconstructing Biological Networks Using Conditional Correlation Analysis

Synthetic network
Network dynamics
Simulation protocol: perturbation
Conditional correlation
– Correlation is symmetric
– Matrix is non-symmetric
– May lead to indirect connections

False positives (indirect connections) + false negatives (noise)
– error = FP/(FP+TN) + FN/(FN+TP)

Reducing false positives
– Choose optimal ρ_opt
– Triangle reduction construction
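The error measure and the choice of ρ_opt can be illustrated with a small threshold scan. This is a sketch under assumptions, not FunGen's actual procedure: the function names, the candidate thresholds, and the 3-gene score matrix are hypothetical, self-loops are excluded, and the true network is assumed to contain at least one edge and one non-edge so neither denominator is zero.

```python
def network_error(true_edges, pred_edges, n):
    """error = FP/(FP+TN) + FN/(FN+TP) over all ordered gene pairs (i != j)."""
    n_pairs = n * (n - 1)
    tp = len(true_edges & pred_edges)
    fp = len(pred_edges - true_edges)
    fn = len(true_edges - pred_edges)
    tn = n_pairs - tp - fp - fn
    return fp / (fp + tn) + fn / (fn + tp)

def predict(scores, rho):
    """Keep a directed edge i -> j when |score[i][j]| reaches the threshold."""
    n = len(scores)
    return {(i, j) for i in range(n) for j in range(n)
            if i != j and abs(scores[i][j]) >= rho}

def best_threshold(scores, true_edges, candidates):
    """Return the candidate threshold with the smallest error."""
    n = len(scores)
    return min(candidates,
               key=lambda rho: network_error(true_edges, predict(scores, rho), n))

# Hypothetical non-symmetric score matrix for a chain 0 -> 1 -> 2.
# The 0 -> 2 entry is an indirect connection with a moderate score.
scores = [[0.0, 0.9, 0.5],
          [0.1, 0.0, 0.8],
          [0.1, 0.1, 0.0]]
true_edges = {(0, 1), (1, 2)}

errors = {rho: network_error(true_edges, predict(scores, rho), 3)
          for rho in (0.3, 0.6, 0.95)}
rho_opt = best_threshold(scores, true_edges, (0.3, 0.6, 0.95))
print(errors, rho_opt)
```

A low threshold admits the indirect edge (a false positive), a high one drops true edges (false negatives), and the minimum-error threshold sits between the two.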

ARACNE: Algorithm for Reconstructing Accurate Cellular Network

Assume two-way interaction: pairwise potentials determine all statistical dependencies + uniform marginal distributions

Mutual information (MI) = measure of relatedness, estimated from M samples:

$$I(\{x_i\},\{y_i\}) = \frac{1}{M}\sum_{i}\log\frac{p(x_i, y_i)}{p(x_i)\,p(y_i)}$$

with Gaussian kernel density estimates of width d:

$$p(x_i) = \frac{1}{M}\sum_{j}\frac{1}{\sqrt{2\pi}\,d}\exp\left(-\frac{(x_i - x_j)^2}{2d^2}\right)$$

$$p(x_i, y_i) = \frac{1}{M}\sum_{j}\frac{1}{2\pi d^2}\exp\left(-\frac{(x_i - x_j)^2 + (y_i - y_j)^2}{2d^2}\right)$$

Independence: $I(x_i, y_j) = 0$ iff $p(x_i, y_j) = p(x_i)\,p(y_j)$

Data processing inequality: if genes g1 and g3 interact only through g2, then $I(g_1, g_3) \le \min[I(g_1, g_2),\, I(g_2, g_3)]$

ARACNE starts with a network that links every gene pair whose MI is above a threshold; then, for every gene triplet whose three edges are all present, it removes the edge with the smallest MI

Ignores the direction of the edges

Reconstructs tree-network topologies exactly
– higher-order potential interactions will not be accounted for (ARACNE will open 3-gene loops).
– A two-gene interaction will be detected iff there are no alternate paths.
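The two ingredients above, the kernel MI estimate and the triplet pruning based on the data processing inequality, can be sketched as follows. This is an illustrative reading of the slide, not the ARACNE implementation: the function names are hypothetical, the kernel width `d` is passed in by hand, ties and the tolerance parameter of the real tool are ignored, and the MI values in the example are made up.

```python
import math
from itertools import combinations

def gaussian_mi(x, y, d):
    """(1/M) * sum_i log[p(x_i, y_i) / (p(x_i) p(y_i))] with Gaussian kernels of width d."""
    M = len(x)
    total = 0.0
    for i in range(M):
        kx = [math.exp(-(x[i] - x[j]) ** 2 / (2 * d * d)) for j in range(M)]
        ky = [math.exp(-(y[i] - y[j]) ** 2 / (2 * d * d)) for j in range(M)]
        px = sum(kx) / (M * math.sqrt(2 * math.pi) * d)
        py = sum(ky) / (M * math.sqrt(2 * math.pi) * d)
        # The 2-D Gaussian kernel factorizes into the product of the 1-D kernels.
        pxy = sum(a * b for a, b in zip(kx, ky)) / (M * 2 * math.pi * d * d)
        total += math.log(pxy / (px * py))
    return total / M

def dpi_prune(mi, threshold=0.0):
    """Drop the weakest edge of every fully connected gene triplet.

    mi: dict mapping frozenset({gene_a, gene_b}) -> MI value
    Edges are only marked during the scan and removed at the end, so every
    triplet is judged against the initial network.
    """
    edges = {e for e, v in mi.items() if v > threshold}
    genes = sorted({g for e in edges for g in e})
    marked = set()
    for a, b, c in combinations(genes, 3):
        triangle = [frozenset((a, b)), frozenset((b, c)), frozenset((a, c))]
        if all(e in edges for e in triangle):
            marked.add(min(triangle, key=lambda e: mi[e]))
    return edges - marked

# Made-up MI values for a chain g1 - g2 - g3 with an indirect g1 - g3 signal.
mi = {frozenset(("g1", "g2")): 0.9,
      frozenset(("g2", "g3")): 0.8,
      frozenset(("g1", "g3")): 0.3}
kept = dpi_prune(mi)
print(sorted(sorted(e) for e in kept))

# The estimator is positive for a variable paired with itself.
mi_self = gaussian_mi([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 3.0], 0.5)
print(mi_self)
```

The pruning step is what opens 3-gene loops: in a true triangle the weakest of the three real edges is removed just as an indirect edge would be.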

ARACNE – Example & Evaluation

Synthetic networks: ER, BA

Performance is assessed via Precision-Recall curves (PRCs)

Example:

Precision: fraction of inferred interactions that are true (expected success ratio)

$$\text{Precision} = \frac{N_{TP}}{N_{TP} + N_{FP}}$$

Recall: fraction of true interactions that are correctly inferred

$$\text{Recall} = \frac{N_{TP}}{N_{TP} + N_{FN}}$$
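A precision-recall curve is obtained by sweeping the MI threshold from strict to permissive and recomputing both ratios at each step. The sketch below does this for undirected edges; the function names, the convention of returning 1.0 when a denominator is zero, and the example scores are assumptions.

```python
def precision_recall(true_edges, inferred_edges):
    """Precision = TP/(TP+FP), recall = TP/(TP+FN) for sets of edges."""
    tp = len(true_edges & inferred_edges)
    fp = len(inferred_edges - true_edges)
    fn = len(true_edges - inferred_edges)
    precision = tp / (tp + fp) if tp + fp else 1.0   # assumed convention
    recall = tp / (tp + fn) if tp + fn else 1.0      # assumed convention
    return precision, recall

def pr_curve(scores, true_edges):
    """One (precision, recall) point per cutoff, adding edges in order of decreasing score."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [precision_recall(true_edges, set(ranked[:k]))
            for k in range(1, len(ranked) + 1)]

# Made-up scores: two true edges and one indirect edge with a lower score.
ab, bc, ac = frozenset("ab"), frozenset("bc"), frozenset("ac")
scores = {ab: 0.9, bc: 0.8, ac: 0.3}
curve = pr_curve(scores, {ab, bc})
for p, r in curve:
    print(f"precision={p:.2f} recall={r:.2f}")
```

Precision stays at 1 while only true edges are admitted and drops once the indirect edge enters, whereas recall can only grow as the threshold is relaxed.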

(Demo) Sample input data file

Input_file_name.exp

N = 3 (# genes)
M = 2 (# microarrays)

The input file has N+1 = 4 lines; each line has M+2 fields (2M+2 when every value is paired with a p-value)

AffyID  HG_U95Av2  SudHL6.CHP  ST486.CHP
G1  G1  16.477367  0.69939363  20.150969  0.5297595
G2  G2  7.6989274  0.55935365  26.04019   0.5445875
G3  G3  8.8098955  0.5445875   21.554955  0.31372303

– First line: header line
– Field 1: AffyID; field 2: annotation name
– Header fields 3 onward: microarray chip names
– Data fields 3 onward: (value, p-value) pairs, one pair per chip, starting with chip 1

Source: ARACNE slides
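A small reader for this layout might look like the sketch below. It is written against the sample shown on the slide only: the function name is hypothetical, fields are split on any whitespace (an assumption that would break if annotations contain spaces), and both the M-field and the 2M-field variants of a data line are accepted.

```python
def parse_exp(text):
    """Parse the .exp layout above into (chip_names, {gene: (values, p_values)})."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    chips = header[2:]              # header fields 3 onward are the chip names
    m = len(chips)
    data = {}
    for ln in lines[1:]:
        fields = ln.split()
        gene = fields[0]            # fields[1] is the annotation name
        nums = [float(v) for v in fields[2:]]
        if len(nums) == 2 * m:      # (value, p-value) pairs, one per chip
            values, pvals = nums[0::2], nums[1::2]
        elif len(nums) == m:        # values only
            values, pvals = nums, None
        else:
            raise ValueError(f"unexpected field count for {gene}")
        data[gene] = (values, pvals)
    return chips, data

sample = """AffyID HG_U95Av2 SudHL6.CHP ST486.CHP
G1 G1 16.477367 0.69939363 20.150969 0.5297595
G2 G2 7.6989274 0.55935365 26.04019 0.5445875
G3 G3 8.8098955 0.5445875 21.554955 0.31372303
"""
chips, data = parse_exp(sample)
print(chips)
print(data["G1"])
```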

(Demo, cont’d) Sample output data file

input_data_file_name[non-default_param_vals].adj

# lines = N = # genes

G1:0 8 0.064729
G2:1 2 0.0298643 7 0.0521425
G3:2 1 0.0298643
G4:3 8 0.0427217
G5:4 5 0.403516
G6:5 4 0.403516 6 0.582265
G7:6 5 0.582265 9 0.38039
G8:7 1 0.0521425 8 0.743262
G9:8 0 0.064729 3 0.0427217 7 0.743262 9 0.333104
G10:9 6 0.38039 8 0.333104

Each line: AffyID:ID#, followed by (associated gene ID#, MI value) pairs

[Figure: graph of the inferred network, nodes labeled 1 to 10]

Source: ARACNE slides
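The adjacency output can be read back into a graph with a few lines of Python. The sketch below follows the line layout of the sample only (a label and numeric ID joined by a colon, then neighbor ID and MI pairs); the function name is hypothetical and any header or comment lines a real file may carry are not handled.

```python
def parse_adj(text):
    """Return ({id: label}, {id: {neighbor_id: mi}}) from .adj-style lines."""
    labels, adj = {}, {}
    for ln in text.splitlines():
        if not ln.strip():
            continue
        fields = ln.split()
        label, idx = fields[0].split(":")
        idx = int(idx)
        labels[idx] = label
        # Remaining fields alternate: neighbor ID, MI value.
        adj[idx] = {int(fields[k]): float(fields[k + 1])
                    for k in range(1, len(fields), 2)}
    return labels, adj

sample = """G1:0 8 0.064729
G2:1 2 0.0298643 7 0.0521425
G3:2 1 0.0298643
G4:3 8 0.0427217
G5:4 5 0.403516
G6:5 4 0.403516 6 0.582265
G7:6 5 0.582265 9 0.38039
G8:7 1 0.0521425 8 0.743262
G9:8 0 0.064729 3 0.0427217 7 0.743262 9 0.333104
G10:9 6 0.38039 8 0.333104
"""
labels, adj = parse_adj(sample)

# Every edge is listed from both endpoints, so count each pair once.
undirected = {frozenset((i, j)) for i, nbrs in adj.items() for j in nbrs}
symmetric = all(adj[j][i] == w for i, nbrs in adj.items() for j, w in nbrs.items())
print(len(undirected), "edges; symmetric:", symmetric)
```

The symmetry check reflects that MI is undirected: each interaction appears on the lines of both genes with the same value.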