Bayesian network for gene regulatory network construction
Jin Chen. CSE891-001, 2012 Fall. Layout: Bayesian network learning; scalability and precision; large-scale learning algorithms; integrative approaches.
Bayesian network for gene regulatory network construction
Jin Chen, CSE891-001
2012 Fall
Layout
• Bayesian network learning
• Scalability and precision
• Large-scale learning algorithms
• Integrative approaches
Bayesian network - concept
• A Bayesian network is a probabilistic graphical model that represents a set of random variables and their conditional dependencies via a directed acyclic graph (DAG)
– Nodes represent variables; edges represent conditional dependencies
– Nodes that are not connected represent variables that are conditionally independent of each other
– Each node is associated with a probability function that takes as input a set of values for the node's parents and gives the probability of the variable represented by the node
Adapted from Wikipedia
Bayesian network - example
Bayesian network structure: two events can cause the grass to be wet, either the sprinkler is on or it is raining. Rain also has a direct effect on the use of the sprinkler. The conditional probability tables (CPTs) are learned from historical data.
The joint probability is then P(G,S,R) = P(G|S,R) P(S|R) P(R), where G = grass wet, S = sprinkler on, and R = raining.
Adapted from Wikipedia
Bayesian network - example
Adapted from Wikipedia
What is the probability that it is raining, given the grass is wet?
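The question can be answered by summing the joint distribution P(G,S,R) = P(G|S,R) P(S|R) P(R) over the unobserved sprinkler state. The sketch below does this by enumeration. The CPT values are not in this transcript; they are assumed here to be the ones from the Wikipedia sprinkler example, and the names `joint` and `prob_rain_given_wet` are made up for illustration.

```python
from itertools import product

# ASSUMPTION: CPT values taken from the Wikipedia sprinkler example,
# not from the slides themselves.
P_R = {True: 0.2, False: 0.8}                       # P(R)
P_S_GIVEN_R = {True: 0.01, False: 0.4}              # P(S = on | R)
P_G_GIVEN_SR = {(True, True): 0.99, (True, False): 0.9,
                (False, True): 0.8, (False, False): 0.0}  # P(G = wet | S, R)


def joint(g, s, r):
    """P(G=g, S=s, R=r) = P(G|S,R) * P(S|R) * P(R)."""
    p_r = P_R[r]
    p_s = P_S_GIVEN_R[r] if s else 1 - P_S_GIVEN_R[r]
    p_g = P_G_GIVEN_SR[(s, r)] if g else 1 - P_G_GIVEN_SR[(s, r)]
    return p_g * p_s * p_r


def prob_rain_given_wet():
    """P(R = true | G = true), marginalizing over the sprinkler."""
    numerator = sum(joint(True, s, True) for s in (True, False))
    denominator = sum(joint(True, s, r)
                      for s, r in product((True, False), repeat=2))
    return numerator / denominator


print(round(prob_rain_given_wet(), 4))  # 0.3577
```

Under these assumed tables, wet grass raises the probability of rain from the prior 0.2 to roughly 0.36.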
Bayesian network – structure learning
• In the simplest case, a Bayesian network structure is specified by an expert and is then used to perform inference
• When defining the network structure is too complex for humans, the network structure and the parameters of the local distributions must be learned from data
• Automatically learning the structure of a Bayesian network is a challenge pursued within machine learning
– Structure learning methods usually use optimization-based search, which requires a scoring function and a search strategy
– The time required for an exhaustive search that returns a structure maximizing the score is super-exponential in the number of variables
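To give a feel for the super-exponential growth, the following sketch counts the labeled DAGs on n nodes using Robinson's recurrence, which is a standard result and not something stated on the slide. The function name `count_dags` is chosen for this example.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n):
    """Number of labeled DAGs on n nodes (Robinson's recurrence)."""
    if n == 0:
        return 1
    return sum((-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
               for k in range(1, n + 1))


# The search space an exhaustive structure search would have to cover.
for n in range(1, 8):
    print(n, count_dags(n))
```

Already at 5 nodes there are 29281 candidate structures, which is why practical methods restrict the search space or use greedy strategies.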
Bayesian network learning for gene regulatory networks
• Bayesian networks (BNs) are well suited to modeling relationships between genes because:
1. A BN uses a directed acyclic graph to denote the relationships between the variables of interest (genes), and can thus naturally model causal relationships between genes
2. A BN has a solid theoretical foundation and offers a probabilistic approach that accommodates the variation typically observed in microarray experiments
3. A BN can accommodate missing data and incorporate prior knowledge through the prior distribution of the parameters
Gene regulatory network construction
• Various gene regulatory network (GRN) structure learning approaches
– Pair-wise comparison
– Differential equation estimation
– Bayesian network learning
– Common problem: only a relatively small number of genes are included in the network
• Recent studies have aimed at deriving large-scale or even complete networks
Gene regulatory network construction
• Use a combination of scoring approaches and the K2 algorithm to maximize the computational efficiency of network inference
– Step 1. Construct an undirected network based on mutual information (MI). This allows the search for the best DAG to take place in a reduced space
– Step 2. Assign directions to the edges. The undirected network is split into sub-networks. Given the node ordering, the sub-networks are trained sequentially with the K2 algorithm. For each sub-network, the edge directions are identified based on the BDe score
Constructing undirected networks
• Construct undirected networks based on mutual information (MI)
• The MI between two variables X and Y, denoted by I(X; Y), is defined as the amount of information shared between the two variables. It is used to detect general dependencies in data
[Equation defining I(X; Y): shown as an image on the slide, not preserved in this transcript]
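The slide's defining equation did not survive in this transcript. A common definition for discrete variables is $I(X;Y) = \sum_{x,y} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)}$; whether the slide used this form or the equivalent entropy form is not known. The sketch below estimates it from paired samples. It assumes the expression values have already been discretized, uses base-2 logarithms (bits), and the name `mutual_information` is chosen for this example.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits from paired discrete samples."""
    n = len(xs)
    joint_counts = Counter(zip(xs, ys))
    x_counts = Counter(xs)
    y_counts = Counter(ys)
    mi = 0.0
    for (x, y), c in joint_counts.items():
        # p(x,y) / (p(x) p(y)) simplifies to c * n / (count_x * count_y)
        mi += (c / n) * log2(c * n / (x_counts[x] * y_counts[y]))
    return mi


print(mutual_information([0, 1, 0, 1], [0, 1, 0, 1]))  # 1.0 (fully dependent)
print(mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))  # 0.0 (independent)
```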
Constructing undirected networks
• MI measures the dependency between two random variables
• The greater the MI value I(X; Y), the more closely the two variables are related
• If there is a direct edge between X and Y in the GRN, there is a strong dependency between X and Y
• This allows the search for the best DAG to be restricted to a reduced space
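One simple way to turn pairwise MI values into an undirected network is to keep only the pairs whose MI reaches a cutoff. The slides do not say how edges are selected, so the fixed `threshold`, the function name `build_undirected_network`, and the gene names and MI values below are all assumptions for illustration.

```python
from collections import defaultdict


def build_undirected_network(mi_values, threshold):
    """Keep an undirected edge for every gene pair with MI >= threshold.

    mi_values maps (gene_a, gene_b) pairs to precomputed MI values.
    Returns an adjacency mapping: gene -> set of neighboring genes.
    """
    adjacency = defaultdict(set)
    for (a, b), mi in mi_values.items():
        if mi >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return dict(adjacency)


# Made-up MI values for three hypothetical genes.
mi_values = {("g1", "g2"): 0.9, ("g1", "g3"): 0.05, ("g2", "g3"): 0.6}
network = build_undirected_network(mi_values, threshold=0.5)
print(network)
```

Only edges present in this undirected network need to be considered when orienting edges, which is what shrinks the DAG search space.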
Graph splitting
• Every node and all its neighbors form a sub-network
• For each sub-network, the K2 algorithm is used to find the optimal edge orientations that maximize the BDe (Bayesian Dirichlet equivalence) score
• This is reasonable because, to maximize the BDe score for the whole network, one only needs to find all the sub-networks with the best BDe scores
Cooper,G.F. and Herskovits,E. (1992) A Bayesian method for the induction of probabilistic networks from data. Mach. Learn., 9, 309–347.
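The splitting rule (a node plus all its neighbors) can be sketched directly on an adjacency mapping. The representation (a dict from node to neighbor set) and the name `split_into_subnetworks` are assumptions made for this example.

```python
def split_into_subnetworks(adjacency):
    """Map each node to the sub-network formed by itself and its neighbors."""
    return {node: {node} | set(neighbors)
            for node, neighbors in adjacency.items()}


# Hypothetical undirected network: g1 - g2 - g3
adjacency = {"g1": {"g2"}, "g2": {"g1", "g3"}, "g3": {"g2"}}
subnetworks = split_into_subnetworks(adjacency)
print(subnetworks["g2"])
```

Note that sub-networks overlap: `g2` appears in all three here, which is why the processing order matters.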
Decide the order of sub-networks
• In each sub-network, the K2 algorithm is run to obtain the best directed sub-network structure
• The K2 result for one sub-network may affect the topology of other sub-networks, so the order in which K2 processes the sub-networks must be decided
• Ordering: for each node in the whole undirected network, the number of edges connected to it is counted; nodes are then sorted in descending order of this count
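The ordering rule amounts to sorting nodes by degree, highest first. In this sketch the tie-breaking by node name is an assumption (the slide does not specify how ties are resolved), as is the function name `order_by_degree`.

```python
def order_by_degree(adjacency):
    """Sort nodes by number of incident edges, descending.

    Ties are broken by node name; this tie-break is an assumption.
    """
    return sorted(adjacency, key=lambda node: (-len(adjacency[node]), str(node)))


# Hypothetical undirected network: g1 - g2 - g3
adjacency = {"g1": {"g2"}, "g2": {"g1", "g3"}, "g3": {"g2"}}
print(order_by_degree(adjacency))  # ['g2', 'g1', 'g3']
```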
K2 algorithm
http://web.cs.wpi.edu/~cs539/s05/Projects/k2_algorithm.pdf
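The algorithm shown on this slide is not preserved in the transcript. As a rough sketch of the greedy procedure described by Cooper and Herskovits (1992): nodes are visited in a fixed order, and each node repeatedly adds the preceding node that most improves its score, until no candidate helps or a parent limit is reached. The function names, the `max_parents` default, and the toy data are assumptions for illustration, and the score used here is the log of the K2 metric rather than BDe.

```python
from collections import Counter
from math import lgamma


def log_k2_score(data, child, parents, arity):
    """Log of the K2 metric for `child` given a list of parent columns.

    For each observed parent configuration j:
      log[(r - 1)! / (N_ij + r - 1)!] + sum_k log(N_ijk!)
    Unobserved configurations contribute log(1) = 0 and are skipped.
    """
    r = arity[child]
    counts = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    totals = Counter()
    for (config, _), n in counts.items():
        totals[config] += n
    score = sum(lgamma(r) - lgamma(n_ij + r) for n_ij in totals.values())
    score += sum(lgamma(n_ijk + 1) for n_ijk in counts.values())
    return score


def k2(data, order, arity, max_parents=2):
    """Greedy K2 search: returns a mapping node -> list of chosen parents."""
    parents = {node: [] for node in order}
    for idx, child in enumerate(order):
        best = log_k2_score(data, child, parents[child], arity)
        candidates = set(order[:idx])  # only nodes earlier in the ordering
        while candidates and len(parents[child]) < max_parents:
            scored = [(log_k2_score(data, child, parents[child] + [c], arity), c)
                      for c in candidates]
            new_score, new_parent = max(scored, key=lambda pair: pair[0])
            if new_score <= best:
                break  # no candidate improves the score
            best = new_score
            parents[child].append(new_parent)
            candidates.remove(new_parent)
    return parents


# Toy data: column 1 copies column 0, column 2 is unrelated to both.
data = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)] * 5
result = k2(data, order=[0, 1, 2], arity=[2, 2, 2])
print(result)  # {0: [], 1: [0], 2: []}
```

Because a node can only take parents that precede it in `order`, the result is always acyclic, but it also depends on that ordering.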
Scoring function
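The formula on this slide is not preserved in the transcript. For reference, the scoring function that K2 maximizes in Cooper and Herskovits (1992) is the K2 metric, a Bayesian Dirichlet score with uniform priors; whether the slide showed this form or the BDe variant is not known. For node $x_i$ with parent set $\pi_i$:

```latex
g(i, \pi_i) \;=\; \prod_{j=1}^{q_i} \frac{(r_i - 1)!}{(N_{ij} + r_i - 1)!} \prod_{k=1}^{r_i} N_{ijk}!
```

Here $r_i$ is the number of values $x_i$ can take, $q_i$ is the number of distinct configurations of $\pi_i$, $N_{ijk}$ is the number of cases in which $x_i$ takes its $k$-th value while its parents are in their $j$-th configuration, and $N_{ij} = \sum_{k} N_{ijk}$.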
http://web.cs.wpi.edu/~cs539/s05/Projects/k2_algorithm.pdf
Performance
Columns: correct edges, missed edges, wrong orientation, wrong connection (table values not preserved in this transcript)
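The four categories can be computed by comparing an inferred edge list against a reference network. The slide does not define them, so the definitions in this sketch are assumptions: a correct edge matches in both endpoints and direction, a wrong orientation is a reversed true edge, a wrong connection joins a pair with no true edge, and a miss is a true edge absent in both directions. The name `evaluate_edges` and the example edges are made up.

```python
def evaluate_edges(true_edges, inferred_edges):
    """Count edge categories for directed edges given as (source, target) pairs."""
    true_set, inferred = set(true_edges), set(inferred_edges)
    correct = inferred & true_set
    wrong_orientation = {(a, b) for a, b in inferred - true_set
                         if (b, a) in true_set}
    wrong_connection = inferred - true_set - wrong_orientation
    missed = {(a, b) for a, b in true_set
              if (a, b) not in inferred and (b, a) not in inferred}
    return {
        "correct": len(correct),
        "missed": len(missed),
        "wrong_orientation": len(wrong_orientation),
        "wrong_connection": len(wrong_connection),
    }


true_edges = [("A", "B"), ("B", "C"), ("C", "D")]
inferred_edges = [("A", "B"), ("C", "B"), ("A", "D")]
summary = evaluate_edges(true_edges, inferred_edges)
print(summary)
```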
Small network
Large network
Further improvement
• Ko et al. further developed a new Bayesian network approach in which Gaussian mixture models are used to describe continuous gene expression data and learn gene pathways
• Data discretization is often required because many approaches to learning network structures were developed for binary or discrete input data
• Discretizing continuous values can result in loss of information, and different discretizations can substantially change the input values and the inferred network
Ko et al. Inference of Gene Pathways Using Gaussian Mixture Models. IEEE International Conference on Bioinformatics and Biomedicine. pp 362-367. 2007
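A small illustration of how the choice of discretization changes the input: the same made-up expression values are binarized with an equal-width split and with a median split. Both schemes and both function names are chosen for this example and are not taken from Ko et al.

```python
from statistics import median


def equal_width_binary(values):
    """Binarize by splitting the observed range at its midpoint."""
    midpoint = (min(values) + max(values)) / 2
    return [int(v > midpoint) for v in values]


def median_binary(values):
    """Binarize by splitting at the sample median."""
    m = median(values)
    return [int(v > m) for v in values]


expression = [0.1, 0.2, 0.3, 0.4, 5.0]  # made-up values with one outlier
print(equal_width_binary(expression))  # [0, 0, 0, 0, 1]
print(median_binary(expression))       # [0, 0, 0, 1, 1]
```

The single outlier dominates the equal-width split, so the two schemes disagree on one sample, and any MI value or score computed from the discretized data would change accordingly.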
Integrative approaches
Tamada et al. Bioinformatics Vol. 19 Suppl. 2 2003, pages ii227–ii236
Dynamic approaches
• Reconstruct gene regulatory networks from expression data using a dynamic Bayesian network (DBN)
Zou M, Conzen SD: A new dynamic Bayesian network (DBN) approach for identifying gene regulatory networks from time course microarray data. Bioinformatics 2005, 21(1):71-79.