problem limited number of experimental replications. postgenomic data intrinsically noisy. poor...
TRANSCRIPT
![Page 1: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/1.jpg)
![Page 2: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/2.jpg)
![Page 3: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/3.jpg)
Problem
• Limited number of experimental replications.
• Postgenomic data intrinsically noisy.
• Poor network reconstruction.
![Page 4: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/4.jpg)
Problem
• Limited number of experimental replications.
• Postgenomic data intrinsically noisy.
• Can we improve the network reconstruction by systematically integrating different sources of biological prior knowledge?
![Page 5: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/5.jpg)
![Page 6: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/6.jpg)
+
![Page 7: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/7.jpg)
+
+
![Page 8: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/8.jpg)
+
+
+
+…
![Page 9: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/9.jpg)
• Which sources of prior knowledge are reliable?
• How do we trade off the different sources of prior knowledge against each other and against the data?
![Page 10: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/10.jpg)
Overview of the talk
• Revision: Bayesian networks
• Integration of prior knowledge
• Empirical evaluation
![Page 11: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/11.jpg)
Overview of the talk
• Revision: Bayesian networks
• Integration of prior knowledge
• Empirical evaluation
![Page 12: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/12.jpg)
Bayesian networks
A
CB
D
E F
NODES
EDGES
•Marriage between graph theory and probability theory.
•Directed acyclic graph (DAG) representing conditional independence relations.
•It is possible to score a network in light of the data: P(D|M), D:data, M: network structure.
•We can infer how well a particular network explains the observed data.
),|()|(),|()|()|()(
),,,,,(
DCFPDEPCBDPACPABPAP
FEDCBAP
![Page 13: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/13.jpg)
![Page 14: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/14.jpg)
Bayesian networks versus causal networks
Bayesian networks represent conditional (in)dependence relations - not necessarily causal interactions.
![Page 15: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/15.jpg)
Bayesian networks versus causal networks
A
CB
A
CB
True causal graph
Node A unknown
![Page 16: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/16.jpg)
Bayesian networks versus causal networks
A
CB
• Equivalence classes: networks with the same scores: P(D|M).
• Equivalent networks cannot be distinguished in light of the data.
A
CB
A
CB
A
CB
![Page 17: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/17.jpg)
Symmetry breaking
A
CB
Prior knowledge
A
CB
A
CB
A
CB
P(M|D) = P(D|M) P(M) / Z
D: data. M: network structure
![Page 18: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/18.jpg)
P(D|M)
![Page 19: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/19.jpg)
Prior knowledge:
B is a transcription factor with binding sites in the upstream regions of A and C
P(M)
![Page 20: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/20.jpg)
P(M|D) ~ P(D|M) P(M)
![Page 21: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/21.jpg)
Learning Bayesian networks
P(M|D) = P(D|M) P(M) / Z
M: Network structure. D: Data
![Page 22: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/22.jpg)
![Page 23: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/23.jpg)
![Page 24: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/24.jpg)
Overview of the talk
• Revision: Bayesian networks
• Integration of prior knowledge
• Empirical evaluation
![Page 25: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/25.jpg)
![Page 26: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/26.jpg)
Use TF binding motifs in promoter sequences
![Page 27: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/27.jpg)
Biological prior knowledge matrix
Biological Prior Knowledge
Indicates some knowledge aboutthe relationship between genes i and j
![Page 28: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/28.jpg)
Biological prior knowledge matrix
Biological Prior Knowledge
Define the energy of a Graph G
Indicates some knowledge aboutthe relationship between genes i and j
![Page 29: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/29.jpg)
Notation
• Prior knowledge matrix:
P B (for “belief”)
• Network structure:
G (for “graph”) or M (for “model”)
• P: Probabilities
![Page 30: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/30.jpg)
Prior distribution over networks
Energy of a network
![Page 31: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/31.jpg)
Sample networks and hyperparameters from the posterior distribution • Capture intrinsic inference uncertainty• Learn the trade-off parameters automatically
P(M|D) = P(D|M) P(M) / Z
![Page 32: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/32.jpg)
Prior distribution over networks
Energy of a network
![Page 33: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/33.jpg)
Rewriting the energy
Energy of a network
![Page 34: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/34.jpg)
Approximation of the partition function
Partition function of a perfect gas
![Page 35: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/35.jpg)
Multiple sources of prior knowledge
![Page 36: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/36.jpg)
MCMC sampling scheme
![Page 37: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/37.jpg)
Sample networks and hyperparameters from the posterior distribution
Metropolis-Hastings scheme
Proposal probabilities
![Page 38: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/38.jpg)
Bayesian networkswith biological prior knowledge
•Biological prior knowledge: Information about the interactions between the nodes.
•We use two distinct sources of biological prior knowledge.
•Each source of biological prior knowledge is associated with its own trade-off parameter: 1 and 2.
•The trade off parameter indicates how much biological prior information is used.
•The trade-off parameters are inferred. They are not set by the user!
![Page 39: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/39.jpg)
Bayesian networkswith two sources of prior
Data
BNs + MCMC
Recovered Networks and trade off parameters
Source 1 Source 2
1 2
![Page 40: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/40.jpg)
Bayesian networkswith two sources of prior
Data
BNs + MCMC
Source 1 Source 2
1 2
Recovered Networks and trade off parameters
![Page 41: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/41.jpg)
Bayesian networkswith two sources of prior
Data
BNs + MCMC
Source 1 Source 2
1 2
Recovered Networks and trade off parameters
![Page 42: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/42.jpg)
Overview of the talk
• Revision: Bayesian networks
• Integration of prior knowledge
• Empirical evaluation
![Page 43: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/43.jpg)
Evaluation
• Can the method automatically evaluate how useful the different sources of prior knowledge are?
• Do we get an improvement in the regulatory network reconstruction?
• Is this improvement optimal?
![Page 44: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/44.jpg)
Raf regulatory network
From Sachs et al Science 2005
![Page 45: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/45.jpg)
Raf regulatory network
![Page 46: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/46.jpg)
Evaluation: Raf signalling pathway
• Cellular signalling network of 11 phosphorylated proteins and phospholipids in human immune systems cell
• Deregulation carcinogenesis
• Extensively studied in the literature gold standard network
![Page 47: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/47.jpg)
DataPrior knowledge
![Page 48: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/48.jpg)
Flow cytometry data
• Intracellular multicolour flow cytometry experiments: concentrations of 11 proteins
• 5400 cells have been measured under 9 different cellular conditions (cues)
• Downsampling to 100 instances (5 separate subsets): indicative of microarray experiments
![Page 49: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/49.jpg)
Microarray example Spellman et al (1998)Cell cycle73 samples
Tu et al (2005)Metabolic cycle36 samples
Ge
nes
Ge
nes
time time
![Page 50: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/50.jpg)
DataPrior knowledge
![Page 51: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/51.jpg)
KEGG PATHWAYS are a collection of manually drawn pathway maps representing our knowledge of molecular interactions and reaction networks.
http://www.genome.jp/kegg/
Flow cytometry data and KEGG
![Page 52: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/52.jpg)
Prior knowledge from KEGG
![Page 53: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/53.jpg)
Prior distribution
![Page 54: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/54.jpg)
The data and the priors
+ KEGG
+ Random
![Page 55: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/55.jpg)
Evaluation
• Can the method automatically evaluate how useful the different sources of prior knowledge are?
• Do we get an improvement in the regulatory network reconstruction?
• Is this improvement optimal?
![Page 56: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/56.jpg)
Bayesian networkswith two sources of prior
Data
BNs + MCMC
Recovered Networks and trade off parameters
Source 1 Source 2
1 2
![Page 57: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/57.jpg)
Bayesian networkswith two sources of prior
Data
BNs + MCMC
Source 1 Source 2
1 2
Recovered Networks and trade off parameters
![Page 58: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/58.jpg)
Sampled values of the
hyperparameters
![Page 59: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/59.jpg)
Evaluation
• Can the method automatically evaluate how useful the different sources of prior knowledge are?
• Do we get an improvement in the regulatory network reconstruction?
• Is this improvement optimal?
![Page 60: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/60.jpg)
How can we evaluate the reconstruction accuracy?
![Page 61: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/61.jpg)
![Page 62: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/62.jpg)
Flow cytometry data and KEGG
![Page 63: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/63.jpg)
Evaluation
• Can the method automatically evaluate how useful the different sources of prior knowledge are?
• Do we get an improvement in the regulatory network reconstruction?
• Is this improvement optimal?
![Page 64: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/64.jpg)
Learning the trade-off hyperparameter
• Repeat MCMC simulations for large set of fixed hyperparameters β
• Obtain AUC scores for each value of β
• Compare with the proposed scheme in which β is automatically inferred.
Mean and standard deviation of the sampled trade off parameter
![Page 65: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/65.jpg)
![Page 66: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/66.jpg)
Conclusion• Bayesian scheme for the systematic
integration of different sources of biological prior knowledge.
• The method can automatically evaluate how useful the different sources of prior knowledge are.
• We get an improvement in the regulatory network reconstruction.
• This improvement is close to optimal.
![Page 67: Problem Limited number of experimental replications. Postgenomic data intrinsically noisy. Poor network reconstruction](https://reader031.vdocuments.net/reader031/viewer/2022020417/56649f315503460f94c4c407/html5/thumbnails/67.jpg)
Thank you