Dependency networks
Sushmita Roy, BMI/CS 576
www.biostat.wisc.edu/bmi576
sroy@biostat.wisc.edu
Nov 26th, 2013

Page 1: Dependency networks Sushmita Roy BMI/CS 576  sroy@biostat.wisc.edu Nov 26 th, 2013

Dependency networks

Sushmita Roy, BMI/CS 576

www.biostat.wisc.edu/bmi576
sroy@biostat.wisc.edu

Nov 26th, 2013

Page 2:

Goals for today

• Introduction to dependency networks
• GENIE3: a network inference algorithm for learning a dependency network from gene expression data
• Comparison of various network inference algorithms

Page 3:

What you should know

• What are dependency networks?
• How do they differ from Bayesian networks?
• Learning a dependency network from expression data
• Evaluation of various network inference methods

Page 4:

Graphical models for representing regulatory networks

• Bayesian networks
• Dependency networks

[Figure: a yeast signaling pathway (Msb2, Sho1, Ste20) drawn as a graph. Nodes are random variables encoding expression levels; edges correspond to some form of statistical dependency. Regulators X1 and X2 point to the target Y3, with the function Y3 = f(X1, X2).]

Page 5:

Dependency network

• A type of probabilistic graphical model
• As in Bayesian networks, has
  – A graph component
  – A probability component
• Unlike Bayesian networks, can have cyclic dependencies

Dependency Networks for Inference, Collaborative Filtering and Data Visualization. Heckerman, Chickering, Meek, Rounthwaite, Kadie, 2000.

Page 6:

Notation

• X_i: the i-th random variable

• X = {X_1, ..., X_p}: the set of p random variables

• x_i^k: an assignment of X_i in the k-th sample

• x_{-i}^k: the set of assignments to all variables other than X_i in the k-th sample
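The notation above can be made concrete with a toy data layout. This is illustrative only; the variable names are mine, not from the slides:

```python
# Toy dataset matching the slide's notation: samples[k][i] holds x_i^k,
# the value of variable X_i in the k-th sample, for p = 3 variables.
p = 3
samples = [
    [0.1, 2.0, 1.5],   # sample k = 0
    [0.4, 1.8, 1.1],   # sample k = 1
]

def x_minus(i, k):
    """Return x_{-i}^k: all values in sample k except variable i."""
    return [v for j, v in enumerate(samples[k]) if j != i]

# x_minus(1, 0) gives the values of X_0 and X_2 in sample 0
```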

Page 7:

Dependency networks

[Figure: a target gene X_j with its regulators feeding into the function f_j.]

• The function f_j can be of different types
• Learning requires estimating each of the f_j functions
• In all cases the goal is to minimize the error of predicting X_j from its neighborhood

Page 8:

Different representations of the fj function

• If X is continuous
  – f_j can be a linear function
  – f_j can be a regression tree
  – f_j can be a random forest (an ensemble of trees)

• If X is discrete
  – f_j can be a conditional probability table
  – f_j can be a conditional probability tree

Page 9:

Linear regression

[Figure: output Y plotted against input X with a fitted line.]

Linear regression assumes that the output (Y) is a linear function of the input (X): Y = a + bX, where b is the slope and a is the intercept.

Page 10:

Estimating the regression coefficient

• Assume we have N training samples
• We want to minimize the sum of squared errors between the true and predicted values of the output Y
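For a single input, the least-squares coefficients have a closed form: the slope is the covariance of X and Y divided by the variance of X, and the intercept follows from the means. A minimal sketch (function and variable names are my own):

```python
# Closed-form simple linear regression, minimizing
# sum_k (y_k - (a + b*x_k))^2 over the N training samples.
def fit_line(xs, ys):
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    # slope b = covariance(x, y) / variance(x)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    # intercept a chosen so the line passes through the mean point
    a = my - b * mx
    return a, b

a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])   # data on the line y = 1 + 2x
```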

Page 11:

An example random forest for predicting gene expression

[Figure: an ensemble of regression trees mapping inputs to an output; a selected path through one tree (e.g., a split on Sox6 > 0.5) applies to a set of genes.]

Page 12:

Considerations for learning regression trees

• Assessing the purity of samples under a leaf node
  – Minimize prediction error
  – Minimize entropy

• How to determine when to stop building a tree?
  – Minimum number of data points at each leaf node
  – Depth of the tree
  – Purity of the data points under any leaf node

Page 13:

Algorithm for learning a regression tree

• Input: output variable X_j, input variables X_{-j}

• Initialize the tree to a single node with all samples under the node
  – Estimate m_c, the mean of all samples under the node, and S, the sum of squared errors

• Repeat until there are no more nodes to split
  – Search over all input variables and split values, computing S for each possible split
  – Pick the variable and split value with the greatest improvement in error
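The loop above can be sketched in code. This is a minimal single-input version with a minimum-leaf-size stopping rule; names and data structures are my own:

```python
def sse(ys):
    """Sum of squared errors of ys around their mean (the slide's S)."""
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def build_tree(xs, ys, min_leaf=2):
    # Stopping rule: too few samples to split -> leaf predicting the mean m_c.
    if len(ys) < 2 * min_leaf:
        return {"mean": sum(ys) / len(ys)}
    best = None
    # Search over split values; keep the one with the largest error improvement.
    for t in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        gain = sse(ys) - sse(left) - sse(right)
        if best is None or gain > best[0]:
            best = (gain, t, left, right)
    if best is None:
        return {"mean": sum(ys) / len(ys)}
    _, t, left, right = best
    lx = [x for x in xs if x < t]
    rx = [x for x in xs if x >= t]
    return {"split": t,
            "left": build_tree(lx, left, min_leaf),
            "right": build_tree(rx, right, min_leaf)}

def predict(tree, x):
    while "split" in tree:
        tree = tree["left"] if x < tree["split"] else tree["right"]
    return tree["mean"]
```

With multiple inputs, the search at each node would additionally loop over candidate variables, as the algorithm on the slide describes.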

Page 14:

GENIE3: GEne Network Inference with Ensemble of trees

• Solves a set of regression problems, one per random variable
• Models non-linear dependencies
• Outputs a directed, cyclic graph with a confidence for each edge
• Focuses on generating a ranking over edges rather than a graph structure and parameters

Inferring Regulatory Networks from Expression Data Using Tree-Based Methods. Van Anh Huynh-Thu, Alexandre Irrthum, Louis Wehenkel, Pierre Geurts. PLoS ONE, 2010.

Page 15:

GENIE3 algorithm sketch

• For each gene j, generate input/output pairs
  – LS_j = {(x_{-j}^k, x_j^k), k = 1..N}
  – Use a feature selection technique on LS_j, such as tree building, to compute w_ij for all genes i ≠ j
  – w_ij quantifies the confidence of the edge between X_i and X_j

• Generate a global ranking of regulators based on each w_ij
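The control flow of this sketch is easy to make concrete. GENIE3 itself computes w_ij from random-forest feature importances; in the runnable sketch below a stand-in score (absolute Pearson correlation, which is linear and NOT what GENIE3 uses) plays that role, so only the loop structure and ranking step should be read as the algorithm:

```python
def correlation(a, b):
    """Pearson correlation of two equal-length sample lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a) ** 0.5
    vb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (va * vb) if va and vb else 0.0

def rank_edges(expr):
    """expr[g]: list of expression values for gene g across samples.
    Returns directed edges (w_ij, i, j) sorted by decreasing confidence."""
    genes = list(expr)
    edges = []
    for j in genes:                  # one regression problem per target gene
        for i in genes:
            if i != j:
                # stand-in for the forest-derived importance w_ij
                w_ij = abs(correlation(expr[i], expr[j]))
                edges.append((w_ij, i, j))
    return sorted(edges, reverse=True)   # global ranking over edges
```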

Page 16:

GENIE3 algorithm sketch

Figure from Huynh-Thu et al.

Page 17:

Feature selection in GENIE3

• A random forest represents each f_j
• Learning the random forest
  – Generate M = 1000 bootstrap samples
  – At each node to be split, search for the best split among K randomly selected variables
  – K was set to p-1 or sqrt(p-1)
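The two sources of randomness in the forest-learning loop above, bootstrap resampling and random candidate variables per split, can be sketched as follows (function names are mine):

```python
import random

def bootstrap(samples, rng):
    """Draw len(samples) samples with replacement (one bootstrap replicate)."""
    return [rng.choice(samples) for _ in samples]

def candidate_variables(p, K, j, rng):
    """Pick K of the p-1 variables other than the target j to try at a split."""
    others = [i for i in range(p) if i != j]
    return rng.sample(others, K)

rng = random.Random(0)   # seeded for reproducibility
```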

Page 18:

Computing the importance weight of each predictor

• Feature importance is computed at each test node
• Remember there can be multiple test nodes per regulator
• For a test node N, importance is given by the reduction in variance achieved by splitting on that node:

  I(N) = #S Var(S) - #S_t Var(S_t) - #S_f Var(S_f)

  where S is the set of data samples that reach the test node, S_t and S_f are the subsets for which the split test is true or false, #S is the size of the set S, and Var(S) is the variance of the output variable in set S

Page 19:

Computing the importance of a predictor

• For a single tree, the overall importance of a predictor is the sum over all points in the tree where that predictor is used to split

• For an ensemble, the importance is averaged over all trees.
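The per-node score feeding into these sums can be written directly, under the assumption (consistent with the GENIE3 paper) that it is the variance reduction #S Var(S) - #S_t Var(S_t) - #S_f Var(S_f):

```python
def var(ys):
    """Population variance of the output values in a sample set."""
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys) / len(ys)

def node_importance(s, s_true, s_false):
    """Variance reduction at one test node: s is the set of output values
    reaching the node; s_true/s_false are the values sent to each branch."""
    return (len(s) * var(s)
            - len(s_true) * var(s_true)
            - len(s_false) * var(s_false))
```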

Page 20:

Computational complexity of GENIE3

• Complexity per variable: O(T K N log N)
  – T: the number of trees
  – K: the number of random attributes selected per split
  – N: the learning sample size

Page 21:

Evaluation of network inference methods

• Assume we know what the “right” network is
• One can use precision-recall (PR) curves to evaluate the predicted network
• The area under the PR curve (AUPR) quantifies performance

Page 22:

AUPR-based performance comparison

Page 23:

DREAM: Dialogue for Reverse Engineering Assessments and Methods

Community effort to assess regulatory network inference

DREAM 5 challenge

Previous challenges: 2006, 2007, 2008, 2009, 2010 Marbach et al. 2012, Nature Methods

Page 24:

Where do different methods rank?

[Figure from Marbach et al., 2010: methods ranked by performance, with “Community” and “Random” predictions included for comparison.]

Page 25:

Comparing module (LeMoNe) and per-gene (CLR) methods

Page 26:

Summary of network inference methods

• Probabilistic graphical models provide a natural representation of networks

• A lot of network inference is done using gene expression data

• Many algorithms exist; we have seen three
  – Bayesian networks
    • Sparse candidates
    • Module networks
  – Dependency networks
    • GENIE3

• Algorithms can be grouped into per-gene and per-module