A Random Walk Based Approach for Improving Interaction Network and Increasing Prediction Accuracy Chengwei LEI, Ph.D. Assistant Professor of Computer Science Department of Electrical Engineering and Computer Science McNeese State University


Post on 03-Jan-2016


Page 1: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

A Random Walk Based Approach

for Improving Interaction Network and

Increasing Prediction Accuracy

Chengwei LEI, Ph.D.

Assistant Professor of Computer Science

Department of Electrical Engineering and Computer Science

McNeese State University

Page 2: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

What is Interaction Network

• An interaction network is a network of nodes that are connected by features.

Page 3: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

First Introduced in Biology

• If the feature is physical and molecular, the interaction network describes the molecular interactions usually found in cells.

Page 4: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Network View of Protein Interaction Network

Page 5: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science
Page 6: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Sounds familiar?

Page 7: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science
Page 8: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Sounds familiar?

Page 9: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Even In Mechanical Engineering

Page 10: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Real-world Classification

• Noisy data

• Overfitting problem

• Few true “driver” changes / vast number of “passenger” changes.

Good Bad

Page 11: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Current Methods

Classifier

Prediction

Page 12: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Current Methods

Classifier

Prediction

Statistical test

Pick the most significant ones
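The "statistical test, pick the most significant ones" step can be sketched as below. This is a minimal illustration, not the method used in the talk: it assumes a Welch t statistic per feature between the Good and Bad groups and keeps the k features with the largest absolute value. The function names, the toy data and the choice of k are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (lists of floats)."""
    va, vb = variance(a), variance(b)
    return (mean(a) - mean(b)) / sqrt(va / len(a) + vb / len(b))


def top_features(good, bad, k):
    """Rank feature indices by |t| between the two groups and keep the top k.

    good, bad: lists of samples; each sample is a list of feature readouts.
    """
    n_features = len(good[0])
    scores = []
    for f in range(n_features):
        t = welch_t([s[f] for s in good], [s[f] for s in bad])
        scores.append((abs(t), f))
    scores.sort(reverse=True)
    return [f for _, f in scores[:k]]


# Toy data (hypothetical): 3 features, only feature 1 separates the groups.
good = [[1.0, 5.0, 2.0], [1.2, 5.5, 2.1], [0.9, 4.8, 1.9]]
bad = [[1.1, 1.0, 2.0], [1.0, 1.4, 2.2], [1.1, 0.9, 1.8]]
selected = top_features(good, bad, k=1)
```

Each feature is scored on its own here, which is exactly the limitation raised on the next slide: relationships between features play no part in the ranking.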

Page 13: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Problem?

• Ignore the relationships between nodes/features/sensors

Page 14: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Our approach

• Improve prognosis by combining

– Node readout data

– Node-node interaction networks

Page 15: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science
Page 16: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Classifier

Prediction

Page 17: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Network

Transformation Matrix

Page 18: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Network

Transformation Matrix

Page 19: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Classifier

Network

Prediction

Transformation Matrix

Page 20: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Transformation Matrix

• The transformation matrix is generated by applying the Random Walk with Restart (RWR) algorithm to the interaction network.

Page 21: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Random Walk

• A random walk is a mathematical formalization of a path that consists of a succession of random steps.

Page 22: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Random Walk

• A random walk is a mathematical formalization of a path that consists of a succession of random steps.

• A random walk for one node on a graph G is a walk on G where the next node is chosen uniformly at random from the set of neighbors of the current node

– when the walk is at node v, the probability of moving in the next step to the neighbor u is P_vu = 1/d(v) if (v, u) is connected and 0 otherwise, where d(v) is the degree of v.
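The transition rule P_vu = 1/d(v) can be written out directly. The sketch below assumes an undirected graph given as a 0/1 adjacency matrix; the 4-node path graph and the function names are hypothetical and only serve as an illustration.

```python
def transition_matrix(adjacency):
    """Return P where P[v][u] = 1/d(v) if (v, u) is an edge, else 0."""
    P = []
    for row in adjacency:
        degree = sum(row)
        # Assumption: an isolated node (degree 0) keeps an all-zero row.
        P.append([a / degree if degree else 0.0 for a in row])
    return P


def step(dist, P):
    """Advance a probability distribution over nodes by one random-walk step."""
    n = len(P)
    return [sum(dist[v] * P[v][u] for v in range(n)) for u in range(n)]


# Hypothetical 4-node path graph: 0 - 1 - 2 - 3
A = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
P = transition_matrix(A)
dist = [1.0, 0.0, 0.0, 0.0]  # the walker starts at node 0
dist = step(dist, P)         # step 1: all probability moves to node 1
dist = step(dist, P)         # step 2: probability splits between nodes 0 and 2
```

Repeating `step` N times gives the distribution after N steps, which is what the step-by-step slides that follow illustrate.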

Page 23: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Random Walk

Page 24: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Random Walk

Step 1

Page 25: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Random Walk

Page 26: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Random Walk

Step 2

Page 27: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Random Walk

Page 28: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Random Walk

Page 29: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Random Walk

Page 30: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Random Walk

Step 3

Page 31: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Random Walk

Step 1, Step 2, Step 3, …… Step N

Page 32: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science
Page 33: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Random Walk with Restart

• A random walker starts from a node (v) with

– uniform probability to visit its neighbors

– fixed probability c to revisit the start node (v)

• The probability for a random walker to be on node j after k steps is

– f_ij^k(v) is the probability for a random walker to take path i to j at time k

– F_j(v) at equilibrium is the probability for a random walker starting from node v to reach node j

=> Similarity between patient v and j
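A minimal sketch of random walk with restart follows. The slide's own formula is not reproduced in this transcript, so the update used here is the commonly used form p_{k+1} = (1 - c) P^T p_k + c e_v, which is an assumption rather than the talk's exact definition. The value c = 0.3, the 3-node graph and the function names are hypothetical. Stacking F(v) for every start node v as rows is one plausible reading of how a transformation matrix could be formed.

```python
def rwr(adjacency, start, c=0.3, tol=1e-10, max_iter=1000):
    """Approximate the equilibrium distribution F(start) of a walk with restart.

    adjacency: square 0/1 matrix (list of lists) of an undirected graph
    start:     index of the start node v
    c:         restart probability (0.3 is an arbitrary illustrative value)
    """
    n = len(adjacency)
    degree = [sum(row) for row in adjacency]
    p = [0.0] * n
    p[start] = 1.0
    for _ in range(max_iter):
        nxt = [0.0] * n
        for v in range(n):
            if degree[v] == 0:
                continue  # assumption: isolated nodes pass no probability on
            share = (1.0 - c) * p[v] / degree[v]
            for u in range(n):
                if adjacency[v][u]:
                    nxt[u] += share
        nxt[start] += c  # restart: return to the start node with probability c
        if max(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


def transformation_matrix(adjacency, c=0.3):
    """Row v holds F(v), the equilibrium distribution for start node v."""
    return [rwr(adjacency, v, c) for v in range(len(adjacency))]


# Hypothetical 3-node path graph: 0 - 1 - 2
A = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
T = transformation_matrix(A)
```

On a connected graph each row of `T` sums to 1, and nodes closer to the start node receive larger values, which matches the similarity interpretation on the slide.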

Page 34: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science
Page 35: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science
Page 36: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

How about Two?

Page 37: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science
Page 38: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science
Page 39: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science
Page 40: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science
Page 41: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science
Page 42: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science
Page 43: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science
Page 44: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science
Page 45: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Experiments

• Biology Data

– Cancer prediction

Page 46: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Classification results

Page 47: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Wang’s Dataset

Network

Transformation Matrix

Page 48: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Wang’s Dataset

[Figure: 0/1 matrices with dimension labels 286, 7885, 10144 and 2259]

Page 49: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

[Figure: matrices labeled 286, 7885 and 2259; T-test, Good vs. Bad; 1247]

Page 50: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

[Figure: matrices labeled 286, 7885 and 2259; T-test, Good vs. Bad; 1678]

Page 51: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

[Figure: matrices labeled 286, 7885 and 2259; counts 483, 1195 and 52]

Page 52: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

P-value comparison for Wang’s data

Significantly down-regulated genes

Significantly up-regulated genes

Page 53: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

For Vijver’s dataset

146349 856

DE Genes

Page 54: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Further verification

• For verification, search each gene in the PubMed database

– pick the top DE genes from the original dataset and the enhanced dataset,

– with keyword “( GENE-NAME ) AND Cancer AND (Metastasis or Metastatic)”.
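The keyword template above can be filled in per gene with a small helper. This is only a sketch of the string construction: the function name is hypothetical, the two gene names are taken from later slides, and writing the operator as uppercase "OR" is an assumption based on the usual convention for Boolean search terms. Submitting the query to PubMed is not shown.

```python
def pubmed_query(gene_name):
    """Build the search string described on the slide for one gene."""
    return f"({gene_name}) AND Cancer AND (Metastasis OR Metastatic)"


# Gene names taken from the slides that follow.
queries = [pubmed_query(gene) for gene in ["RPS6", "RACGAP1"]]
```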

Page 55: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Top 15 DE genes in original dataset

Page 56: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Top 15 original non-significant genes in the enhanced dataset

Page 57: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Top 15 original non-significant genes in the enhanced dataset

• SLC26A8 is a gene related to male reproductive system diseases

• It is also related to breast cancer

Page 58: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Top 15 original non-significant genes in the enhanced dataset

• SLC26A8 is a gene related to male reproductive system diseases

• It is also related to breast cancer

– A. E. Dahm, A. L. Eilertsen, J. Goeman, “A microarray study on the effect of four hormone therapy regimens on gene transcription in whole blood from healthy postmenopausal women,” Thrombosis research, vol. 130, no. 1, pp. 45–51, 2012.

– J.-H. Shin, E. Son, H. Lee, S. Kim, “Molecular and functional expression of anion exchangers in cultured normal human nasal epithelial cells,” Acta physiologica, vol. 191, no. 2, pp. 99–110, 2007.

Page 59: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Top 15 original non-significant genes in the enhanced dataset

• RPS6 is a very important gene in cancer research, especially for the development of cancer antibody drugs

Page 60: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Top 15 original non-significant genes in the enhanced dataset

• RPS6 is a very important gene in cancer research, especially for the development of cancer antibody drugs

– J. C. Potratz, D. N. Saunders, D. H. Wai, et al., “Synthetic lethality screens reveal rps6 and mst1r as modifiers of insulin-like growth factor-1 receptor inhibitor activity in childhood sarcomas,” Cancer research, vol. 70, no. 21, pp. 8770–8781, 2010.

– F. Henjes, C. Bender, S. von der Heyde, L. Braun, H. et al., “Strong egfr signaling in cell line models of erbb2-amplified breast cancer attenuates response towards erbb2-targeting drugs,” Oncogenesis, vol. 1, no. 7, p. e16, 2012.

Page 61: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Top 15 original non-significant genes in the enhanced dataset

• G2E3 is a dual function ubiquitin ligase required for early embryonic development

• and also a nucleo-cytoplasmic shuttling protein with DNA damage responsive localization

Page 62: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Top 15 original non-significant genes in the enhanced dataset

• G2E3 is a dual function ubiquitin ligase required for early embryonic development

• and also a nucleo-cytoplasmic shuttling protein with DNA damage responsive localization

– W. S. Brooks, E. S. Helton, S. Banerjee, “G2E3 is a dual function ubiquitin ligase required for early embryonic development,” Journal of Biological Chemistry, vol. 283, no. 32, pp. 22304–22315, 2008.

Page 63: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Top 15 original non-significant genes in the enhanced dataset

• RACGAP1 plays a regulatory role in cell growth, transformation and metastasis

Page 64: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Top 15 original non-significant genes in the enhanced dataset

• RACGAP1 plays a regulatory role in cell growth, transformation and metastasis

– S. Saigusa, K. Tanaka, Y. Mohri, M. Ohi, T. Shimura, et al., “Clinical significance of racgap1 expression at the invasive front of gastric cancer,” Gastric Cancer, pp. 1–9, 2014.

– V. Kotoula, K. T. Kalogeras, G. Kouvatseas, D. Televantou, R. Kronenwett, “Sample parameters affecting the clinical relevance of rna biomarkers in translational breast cancer research,” Virchows Archiv, vol. 462, no. 2, pp. 141–154, 2013.

– K. Pliarchopoulou, K. Kalogeras, R. Kronenwett, et al., “Prognostic significance of racgap1 mrna expression in high-risk early breast cancer: a study in primary tumors of breast cancer patients participating in a randomized hellenic cooperative oncology group trial,” Cancer chemotherapy and pharmacology, vol. 71, no. 1, pp. 245–255, 2013.

Page 65: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Top 15 original non-significant genes in the enhanced dataset

Page 66: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Ongoing Experiment

Page 67: Chengwei  LEI, Ph.D. Assistant Professor of Computer Science

Thank you