Pathway Ranking Tool Dimitri Kosturos Linda Tsai SoCalBSI, 8/21/2003


Page 1: Pathway Ranking Tool

Dimitri Kosturos, Linda Tsai

SoCalBSI, 8/21/2003

Page 2: Project Overview

BioDiscovery, Inc. at Marina del Rey

Analyzing microarray data on the pathway level instead of the individual gene level

Methods:

-Enrichment Analysis
-Permutational Statistics
-S. Metric
-Multivariate test

Page 3: Project Overview, cont.

Validation of statistical methods

2 data sets: Brain Tumor, Interferon-gamma.

Sources of annotation: BioCarta, KEGG, Gene Ontology.

Page 4: Project Flowchart

[Flowchart. Boxes: phenotype, microarray, algorithm, pathway. Roles: Dimitri (computer scientist), Linda (biologist).]

Page 5: Research and Development in GeneSight

GeneSight is a data analysis software package. Features:

-Statistical significance testing
-Multiple data visualizations
-Automated gene annotation
-Complete result reports
-Pathway analysis (?)

Page 6: Biology of Brain Tumor

Glioblastoma multiforme (GBM) is the most malignant of the glial tumors, classified as grade IV.

Many brain tumors are currently incurable.

Average survival time: 1 year

Page 7: Bad Genes Foment Trouble

Proto-oncogenes: promote normal cell growth; when mutated into oncogenes, they drive uncontrolled growth

Tumor suppressor genes: retard cell growth

Source: http://www.med.harvard.edu/publications/On_The_Brain/Volume4/Number2/SP95Awry.html

Page 8: Biology of Interferon

Interferons (IFNs) are a class of cytokines that mediate antiviral, antiproliferative, and antitumor activities, among others.

IFN gamma is produced by T lymphocytes in response to mitogens or antigens.

IFNs bind to their receptors and initiate the JAK-STAT signaling cascade.

Page 9: Biology of Interferon, cont.

[Figure from http://www.grt.kyushu-u.ac.jp/eny-doc/pathway/ifn_gamma.html]

Page 10: Gene Annotations

Grouping related genes together into pathways

(A) BioCarta. Ex: p53 Signaling Pathway
(B) KEGG. Ex: Citrate cycle (TCA cycle)

Grouping genes into structured, controlled vocabularies (ontologies): Gene Ontology

-Biological Process. Ex: angiogenesis, apoptosis
-Molecular Function. Ex: DNA binding activity
-Cellular Component. Ex: nucleus, mitochondria

Page 11: Traditional method of ranking gene pathways

Steps:

1. Mann-Whitney Test: obtain the list of probe sets that satisfy a chosen p-value cutoff.

2. Cluster analysis: count how many of the listed probe sets occur in each cluster (pathway).

Example:

1. Original data: 12,625 genes. Select genes with p-value < 0.001, which narrows the list to 927 genes.

2. Group those 927 genes into clusters (pathways).
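The second step can be sketched in a few lines of Python. The slides do not say how the per-pathway p-value is computed, so the hypergeometric tail used here is an assumption (a common choice for enrichment analysis), and the gene names, pathway names, and universe size are made up for illustration. The first step (the Mann-Whitney test) is taken as already done: `significant` stands for the list it produced.

```python
from math import comb

def hypergeom_tail(k, n, K, N):
    """P(X >= k): chance that a pathway of n genes contains at least k of the
    K significant genes when the genes come from a universe of N genes."""
    upper = min(n, K)
    total = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return total / comb(N, n)

def rank_pathways(significant, pathways, universe_size):
    """Return (pathway, hits, size, p) tuples sorted by ascending p-value."""
    rows = []
    for name, genes in pathways.items():
        hits = len(significant & genes)  # listed genes that fall in this pathway
        p = hypergeom_tail(hits, len(genes), len(significant), universe_size)
        rows.append((name, hits, len(genes), p))
    return sorted(rows, key=lambda row: row[3])

# Toy data, hypothetical names only.
significant = {"g1", "g2", "g3", "g4"}
pathways = {
    "pathway_X": {"g1", "g2", "g3", "g9"},
    "pathway_Y": {"g4", "g10", "g11", "g12", "g13"},
}
ranking = rank_pathways(significant, pathways, universe_size=100)
for name, hits, size, p in ranking:
    print(f"{name}: {hits} of {size}, p={p:.4g}")
```

The sketch assumes every pathway gene is part of the measured universe; a real annotation source would need that filtering first.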

Page 12: Mann-Whitney Test, Denovo Glioblastoma p<0.001

4 of the 9 genes in the SODD/TNFR1 Signaling Pathway satisfy p-value < 0.001

Annotations \ Lists: DG-Less_than_0.001

BioCarta Pathway SODD/TNFR1 Signaling Pathway p=0.012: CASP8,FADD,LTA,TNF (4 of 9)
BioCarta Pathway D4-GDI Signaling Pathway p=0.017: CASP1,CASP10,CASP8,JUN (4 of 10)
BioCarta Pathway TNFR1 Signaling Pathway p=0.021: CASP8,FADD,JUN,LMNB1,LTA,MADD,TNF (7 of 28)
BioCarta Pathway Cadmium induces DNA synthesis and proliferation in macrophages p=0.021: JUN,LTA,MAPK3,PRKCB1,TNF (5 of 16)
BioCarta Pathway Visceral Fat Deposits and the Metabolic Syndrome p=0.022: LPL,LTA,TNF (3 of 6)
BioCarta Pathway Fibrinolysis Pathway p=0.032: F13A1,F2R,SERPINE1 (3 of 7)
BioCarta Pathway EPO Signaling Pathway p=0.033: EPO,EPOR,GRB2,JUN,MAPK3 (5 of 18)

Page 13: How Affy. Microarray Chips Work

PM: Perfect Match
MM: Mismatch

Best results: genes hybridize perfectly with the Perfect Match probes and not at all with the Mismatch probes.

Source: http://www.ucl.ac.uk/oncology/MicroCore/HTML_resource/Norm_Affy1.htm

Page 14: Example of GeneSight PlotData

Theoretical Tumor Expression Levels (Log Transformed)

              Normal  Normal  Tumor  Tumor
Probe Set A   4.5     3.8     10.2   11.1
Probe Set B   2.3     2.7     13.5   13.6
Probe Set C   7.8     8.2     1.4    1.8
Probe Set A   3.5     4.2     8.9    9.6

Columns are conditions; rows are genes. Notice the column replicates and the Probe Set replicates.

Page 15: Given Data Sets

Given two data sets: Brain Tumor, IFN-γ

The Brain Tumor data set has 5+ tumor types; however, only 2 tumor types were used (Denovo Glioblastoma, Progressive Glioblastoma).

IFN-γ data set: the entire data set was used.

Page 16: What and why?

Goal: write a prototype extension to GeneSight that uses permutational statistics to build a custom distribution for a given microarray data set.

Overall significance: the software provides a list of (potentially) significant pathways, which lets researchers focus their work.

Page 17: What is permutational statistics?

Original grouping (E = Experiment, C = Control):

Sample:  1  2  3  4
Group:   E  E  C  C

Choose different Control and Experiment groupings (permute), for example:

Sample:  1  2  3  4
Group:   E  C  E  C

By iterating through an adequate number of permutations, we can determine whether a pathway is likely to be significant in this context (p-value).
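A minimal sketch of the regrouping idea, assuming that a permutation means reassigning the E and C labels while keeping the group sizes fixed. The function name is hypothetical and not taken from the tool.

```python
from itertools import combinations

def unique_groupings(n_samples, n_experiment):
    """Yield every distinct E/C labeling with exactly n_experiment 'E' labels."""
    for exp_idx in combinations(range(n_samples), n_experiment):
        chosen = set(exp_idx)
        yield "".join("E" if i in chosen else "C" for i in range(n_samples))

# Four samples, two of them in the Experiment group, as on the slide.
groupings = list(unique_groupings(4, 2))
print(groupings)  # ['EECC', 'ECEC', 'ECCE', 'CEEC', 'CECE', 'CCEE']
```

With only four samples there are just six groupings, including the original one, which is far too few to resolve a small p-value.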

Page 18: Permutational Stats.

There are two versions of the S. Metric currently implemented.

S. Metric I =

S. Metric II =

M = Number of Genes flagged as significant

Total = Total number of Genes in the Pathway

Page 19: (Layman's) How Statistics Works

[Flow diagram: Data -> Statistic (S. Metric I, II) -> P-Value, with "Permute Here" marked on the diagram.]

After all permutations are done, calculate the p-value.
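The final step can be sketched as an empirical p-value: the share of permuted statistics that are at least as large as the one observed on the real grouping. The add-one correction below is a common convention and an assumption here, not something the slides state; the numbers are invented.

```python
def permutation_p_value(observed, permuted_stats):
    """Empirical p-value: fraction of permuted statistics >= the observed one,
    with an add-one correction so the result is never exactly zero."""
    extreme = sum(1 for s in permuted_stats if s >= observed)
    return (extreme + 1) / (len(permuted_stats) + 1)

# Hypothetical pathway statistic and nine permuted values.
observed = 0.6
permuted = [0.1, 0.2, 0.6, 0.3, 0.7, 0.0, 0.1, 0.2, 0.4]
p = permutation_p_value(observed, permuted)
print(p)  # 0.3
```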

Page 20: Algorithm

Take at least 10,000 unique permutations. A unique permutation is determined by a Permute class.

For each condition
    For each permutation
        For each gene                      (Initial Significance Flagging)
            Calc. Mean diff.
            Calc. T-stat
        End For
        For each pathway                   (S. Metric)
            store the statistic
        End For
    End For
    calcPvalue(stored statistic)           (pValue)
End For
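The loop structure above can be sketched in Python for a single condition. Several parts are assumptions rather than the tool's actual code: the S. Metric formulas are not preserved in this transcript, so the pathway statistic below is simply M / Total as a stand-in; the flagging rule (|t| at or above a cutoff) and the Welch-style t statistic are illustrative choices; and all names and data are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance

def t_stat(a, b):
    """Welch-style t statistic for two lists of expression values."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0

def pathway_scores(data, exp_idx, pathways, t_cutoff):
    """Flag genes with |t| >= t_cutoff, then score each pathway as M / Total."""
    n_samples = len(next(iter(data.values())))
    ctl_idx = [i for i in range(n_samples) if i not in exp_idx]
    flagged = set()
    for gene, values in data.items():
        t = t_stat([values[i] for i in exp_idx], [values[i] for i in ctl_idx])
        if abs(t) >= t_cutoff:
            flagged.add(gene)
    return {name: len(flagged & genes) / len(genes)
            for name, genes in pathways.items()}

def permutation_test(data, exp_idx, pathways, t_cutoff=3.0):
    """Empirical p-value per pathway over every unique regrouping of samples."""
    n_samples = len(next(iter(data.values())))
    observed = pathway_scores(data, exp_idx, pathways, t_cutoff)
    counts = {name: 0 for name in pathways}
    n_perm = 0
    for perm in combinations(range(n_samples), len(exp_idx)):
        n_perm += 1
        scores = pathway_scores(data, perm, pathways, t_cutoff)
        for name in pathways:
            if scores[name] >= observed[name]:
                counts[name] += 1
    return {name: counts[name] / n_perm for name in pathways}

# Toy log-expression data: samples 0-2 are Experiment, samples 3-5 are Control.
data = {
    "g1": [10.2, 11.1, 10.6, 4.5, 3.8, 4.1],
    "g2": [13.5, 13.6, 13.2, 2.3, 2.7, 2.5],
    "g3": [5.0, 5.2, 4.9, 5.1, 4.8, 5.3],
    "g4": [6.1, 5.9, 6.3, 6.0, 6.2, 5.8],
}
pathways = {"up": {"g1", "g2"}, "flat": {"g3", "g4"}}
p_values = permutation_test(data, (0, 1, 2), pathways)
print(p_values)
```

This toy case enumerates all 20 regroupings of six samples. For larger designs the slide's target of at least 10,000 unique permutations would be met by sampling distinct regroupings instead of listing them all.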

Page 21: Limitations

-Computational power (memory, CPU)
-Required number of replicates (8,8)
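One way to read the (8,8) requirement, assuming a unique permutation is a distinct split of the samples into Experiment and Control groups of the original sizes: the number of such splits is a binomial coefficient, and 8 replicates per group is the smallest balanced design that exceeds the 10,000 permutations named in the algorithm. The helper name is hypothetical.

```python
from math import comb

def unique_permutations(n_experiment, n_control):
    """Number of distinct ways to pick the Experiment group from all samples."""
    return comb(n_experiment + n_control, n_experiment)

# Balanced designs with 4, 6, 7, and 8 replicates per group.
counts = {reps: unique_permutations(reps, reps) for reps in (4, 6, 7, 8)}
print(counts)  # {4: 70, 6: 924, 7: 3432, 8: 12870}
```

If a grouping and its mirror image (E and C swapped) are treated as the same permutation, these counts are halved, and (8,8) no longer reaches 10,000.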

Page 22: Output of result

Page 23: Validation of pathway analysis, Method 1

                                              Algorithm: significant   Algorithm: insignificant
Linda's selection of significant pathways     True Positive            False Negative
Linda's selection of insignificant pathways   False Positive           True Negative

Problem: lack of insignificant pathways

????
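A small sketch of how the table would be tallied from sets of pathway names, and of why missing insignificant selections are a problem. The function and the pathway names are hypothetical.

```python
def confusion_counts(algorithm_significant, expert_significant, expert_insignificant):
    """Tally the 2x2 table from sets of pathway names."""
    return {
        "TP": len(algorithm_significant & expert_significant),
        "FN": len(expert_significant - algorithm_significant),
        "FP": len(algorithm_significant & expert_insignificant),
        "TN": len(expert_insignificant - algorithm_significant),
    }

# Hypothetical selections; the expert has named no insignificant pathways.
table = confusion_counts(
    algorithm_significant={"A", "B", "C"},
    expert_significant={"A", "B", "D"},
    expert_insignificant=set(),
)
sensitivity = table["TP"] / (table["TP"] + table["FN"])
negatives = table["FP"] + table["TN"]
specificity = table["TN"] / negatives if negatives else None  # undefined here
print(table, sensitivity, specificity)
```

With an empty set of insignificant pathways the second row stays at zero, so only sensitivity can be estimated.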

Page 24: Validation of pathway analysis, Method 2

[Figure: "Comparison of Prediction Methods". X-axis: # of pathways in BioCarta sorted by p-value (1 to 96). Y-axis: # of identified significant pathways (0 to 16). Curves: Best algorithm, Random, Worst.]
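A sketch of how a curve of this kind could be computed, assuming the plot counts how many of the pathways expected to be significant have appeared after the first k pathways in the p-value ranking. The reference curves are one interpretation of the Best, Random, and Worst labels; all names and data are hypothetical.

```python
def cumulative_hits(ranked_pathways, expected_significant):
    """Running count of expected pathways found while walking down the ranking."""
    hits, curve = 0, []
    for name in ranked_pathways:
        if name in expected_significant:
            hits += 1
        curve.append(hits)
    return curve

def reference_curves(n_total, n_expected):
    """Best: expected pathways ranked first. Worst: ranked last. Random: diagonal."""
    ranks = range(1, n_total + 1)
    best = [min(k, n_expected) for k in ranks]
    worst = [max(0, k - (n_total - n_expected)) for k in ranks]
    random_mean = [k * n_expected / n_total for k in ranks]
    return best, worst, random_mean

# Hypothetical ranking of five pathways, two of which are expected to matter.
curve = cumulative_hits(["A", "B", "C", "D", "E"], {"A", "D"})
best, worst, random_mean = reference_curves(5, 2)
print(curve, best, worst, random_mean)
```

A method whose curve stays close to the best curve ranks the expected pathways early; a curve near the diagonal is no better than a random ordering.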

Page 25: Result, Brain Tumor-BioCarta

[Figure: "D. glioblastoma". X-axis: # of pathways in BioCarta sorted by p-value (1 to 251). Y-axis: # of sig. identified pathways (0 to 40). Series: SMI, SMII, DG0.001, DG0.01.]

Page 26: Result, IFNG-Molecular Function (GO)

[Figure: "IFNG-Molecular Function". X-axis: number of terms in MF (GO) sorted by p-value (1 to 1737). Y-axis: number of sig. identified terms (0 to 80). Series: SMI, SMII, enrich0.01, enrich0.001.]

Page 27: Biological Limitations

Predicting which pathways are significant in the conditions of interest is subjective.

The analysis assumes similar biological states in Denovo Glioblastoma and Progressive Glioblastoma.

Page 28: Future Direction

Finish modifying the Multivariate Statistic for use in the permutational method. This method uses PCA and multivariate statistics.

Finish validating the data produced using the Multivariate Statistic.

Page 29: Initial Results of Multivariate Stat.

[Figure: "IFNG-Biological Process (GO)". X-axis: number of terms in BP (GO), sorted by p-value (1 to 1165). Y-axis: number of terms identified (0 to 50). Series: SMI, SMII, enrich0.01, enrich0.001, M. Perm.]

Page 30: Conclusion

It is not clear which is better: the S. metric or traditional Enrichment Analysis.

Improvements can be made to the S. metric.

Page 31: Acknowledgements

Dr. Bruce Hoff
Dr. Anton Petrov
SoCalBSI: Dr. Jamil Momand, Dr. Sandra Sharp, Dr. Nancy Warter-Perez, Dr. Wendie Johnston
National Science Foundation
National Institutes of Health