HVP Critical Assessment of Genome Interpretation

CAGI (\ˈkā-jē\) Critical Assessment of Genome Interpretation A community experiment to evaluate phenotype prediction Reece Hart (with Steven Brenner and John Moult) QB3 / Center for Computational Biology UC Berkeley [email protected] Human Variome Project Meeting Paris 2010-05-12 ca·gey \ˈkā-jē\ adjective 1: hesitant about committing oneself; 2a: wary of being trapped or deceived; 2b: marked by cleverness

Upload: reece-hart

Post on 11-May-2015

DESCRIPTION

Note: CAGI occurred in Dec 2010, after I left Berkeley. Susanna Repo made the event happen and it would not have occurred without her.

TRANSCRIPT

Page 1: HVP Critical Assessment of Genome Interpretation

CAGI (\ˈkā-jē\)
Critical Assessment of Genome Interpretation
A community experiment to evaluate phenotype prediction

Reece Hart (with Steven Brenner and John Moult)
QB3 / Center for Computational Biology
UC Berkeley
[email protected]

Human Variome Project Meeting
Paris 2010-05-12

ca·gey \ˈkā-jē\ adjective
1: hesitant about committing oneself;
2a: wary of being trapped or deceived;
2b: marked by cleverness

Page 2: HVP Critical Assessment of Genome Interpretation

The Significance of “Variants of Uncertain Significance”

“VUS – Variant of uncertain significance. A variation in a genetic sequence whose association with disease risk is unknown. Also called variant of uncertain significance, variant of unknown significance, and unclassified variant.”
http://www.cancer.gov/cancertopics/genetics-terms-alphalist

Page 3: HVP Critical Assessment of Genome Interpretation

The long tail of rare diseases.

“A rare disease typically affects a patient population estimated at fewer than 200,000 in the U.S. There are more than 6,000 rare diseases known today and they affect an estimated 25 million persons in the U.S.”

NIH Office of Rare Diseases Research
http://rarediseases.info.nih.gov/

Page 4: HVP Critical Assessment of Genome Interpretation

Interpretation of Unclassified Variants
a sampling of responses from genetic counselors

➢ Routinely used
● dbSNP
● OMIM
● GeneReviews
● PolyPhen
● SIFT
● PubMed
● Mailing lists

➢ Selectively used
● PharmGKB
● LSDBs
● Domain prediction
● Structure impact analysis
● Homology

Page 5: HVP Critical Assessment of Genome Interpretation

Genome Variant Impact Prediction Tools
an incomplete list

Program               URL
Align-GVGD            http://agvgd.iarc.fr/
AutoMute              http://proteins.gmu.edu/automute/
CUPSAT                http://cupsat.tu-bs.de/
Dmutant               http://sparks.informatics.iupui.edu/hzhou/mutation.html
nsSNPAnalyzer         http://snpanalyzer.uthsc.edu/
PantherPSEC           http://www.pantherdb.org/tools/csnpScoreForm.jsp
PhD-SNP               http://gpcr.biocomp.unibo.it/~emidio/PhD-SNP/PhD-SNP.htm
Pmut                  http://mmb2.pcb.ub.es:8080/PMut/
PolyPhen              http://coot.embl.de/PolyPhen/
SIFT                  http://sift.jcvi.org/
SNAP                  http://cubic.bioc.columbia.edu/services/snap/
SNP Function Pred.    http://www.ensembl.org/ [N.B. login required]
SNPinfo / FuncPred    http://snpinfo.niehs.nih.gov/snpfunc.htm
SNPs3D                http://snps3d.org/
UMD-predictor         http://www.umd.be/

Page 6: HVP Critical Assessment of Genome Interpretation

Current methods are the tip of the iceberg.

[Figure: iceberg with portions labeled ~1% and ~99%. Other labels: protein transcripts, non-protein transcripts, repeats, indels, epigenetics.]

Page 7: HVP Critical Assessment of Genome Interpretation

Objectively Assessing Computational Predictions

[Figure: timeline with the labels Data Acquisition and Publication.]

The Prediction Window: ~1-12 months when unpublished, high-quality data are available

➢ CASP – Structure prediction
➢ CAPRI – Protein-ligand docking
➢ EGASP – ENCODE Gene Annotation
➢ RGASP – RNA-Seq mapping
➢ DREAM – network model assessment

Page 8: HVP Critical Assessment of Genome Interpretation

CAGI – Critical Assessment of Genome Interpretation
A community assessment of the state of the art in phenotype prediction.

➢ Follow the successful critical assessment framework:

● Solicit pre-publication genotype-phenotype associations

● Provide genomic data to predictors and collect their predictions

● Assess predictions against revealed annotations, mechanisms, and phenotypes
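
The assessment step of this framework can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each predictor submits one categorical call per variant, the withheld annotations are revealed as a simple lookup table, and plain accuracy stands in for whatever metrics the assessors eventually choose. The predictor names, variant identifiers, and labels below are hypothetical and are not CAGI data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Score:
    predictor: str
    n_scored: int      # submitted calls that have a revealed answer
    accuracy: float    # fraction of scored calls that match the answer


def assess(submissions, revealed):
    """Score blinded submissions against annotations revealed afterwards.

    submissions: {predictor name: {variant id: predicted label}}
    revealed:    {variant id: true label}
    Calls for variants without a revealed label are ignored.
    """
    scores = []
    for predictor, calls in submissions.items():
        scored = [variant for variant in calls if variant in revealed]
        correct = sum(calls[v] == revealed[v] for v in scored)
        accuracy = correct / len(scored) if scored else 0.0
        scores.append(Score(predictor, len(scored), accuracy))
    # Best accuracy first; ties broken alphabetically by predictor name.
    return sorted(scores, key=lambda s: (-s.accuracy, s.predictor))


if __name__ == "__main__":
    # Hypothetical example data, for illustration only.
    submissions = {
        "group_a": {"var1": "deleterious", "var2": "neutral"},
        "group_b": {"var1": "neutral", "var2": "neutral", "var3": "neutral"},
    }
    revealed = {"var1": "deleterious", "var2": "neutral"}
    for score in assess(submissions, revealed):
        print(score)
```

Keeping the revealed labels in a separate table from the submissions mirrors the blinding in the framework: predictions are collected first and compared only after the annotations are released. A real assessment would likely use several metrics and handle quantitative phenotypes, which this sketch does not attempt.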

Page 9: HVP Critical Assessment of Genome Interpretation

Sample Prediction Categories

Molecular
Organismal
Cellular

MTHFR mutants – Yeast growth rates with various MTHFR mutations and [folate]. (Jasper Rine)

Breast Cancer – Segregation of rare variants among 2500 cases and controls. (Sean Tavtigian)

PGP100 – Unpublished phenotypes from PGP100 project. (George Church)

Please contact us if you have pre-publication genotype-phenotype association data.

Page 10: HVP Critical Assessment of Genome Interpretation

Census of Molecular Mechanisms
possible mechanisms of variant impact for WTCCC SNVs

Wellcome Trust Case Control Consortium Nature. 2007;447(7145):661-78.

Page 11: HVP Critical Assessment of Genome Interpretation

Contributors, Predictors, Assessors
an incomplete list of participants

Gad Getz
Pauline Ng
Sean Tavtigian
George Church
Marc Greenblatt
Jasper Rine
Rachel Karchin
Mauno Vihinen

Page 12: HVP Critical Assessment of Genome Interpretation

Sample CAGI Timeline

[Figure: timeline with weekly tick marks from 05-03 through 01-31, showing three phases: Data Gathering, Prediction Season, Assessment.]

Key Dates
▲ finalize data sources
▲ workshop
▲ release prospectus / rules
▲ open participant registration

Dates are for illustration – exact dates have not been set.

Page 13: HVP Critical Assessment of Genome Interpretation

CAGI Summary

➢ CAGI will:
● objectively assess phenotype prediction methods
● inform future research directions
● introduce researchers in diverse fields

➢ CAGI is being planned for the end of 2010 or early 2011.

➢ Now seeking data contributors, assessors, and predictors.

➢ Feedback is sought! [email protected]

➢ See http://genomecommons.org/cagi for more information.

Page 15: HVP Critical Assessment of Genome Interpretation

The Genome Commons: A Flagship Project Within QB3

[Figure: map with a 10 km scale bar.]

Page 16: HVP Critical Assessment of Genome Interpretation

Reece Hart
Chief Scientist
UC Berkeley

Steven Brenner
Plant & Mol. Biology
UC Berkeley

Sandrine Dudoit
Biostatistics
UC Berkeley

Robert Nussbaum
Chief, Medical Genetics
UCSF

Jasper Rine
Genetics, Genomics & Dev
Chair, Computational Biology
UC Berkeley

Lior Pachter
Mathematics
Mol., Cell, Biol
UC Berkeley

Bernie Lo
Director, Medical Ethics
Department of Medicine
UCSF

Rasmus Nielsen
Michael I. Jordan
Ian Holmes
Kimmen Sjölander
Yun Song
Monty Slatkin
Terry Speed
Mark van der Laan
Richard Karp
Bernd Sturmfels
Steven Evans
Elizabeth Purdom
Haiyan Huang
Peter Bickel
Susan Marqusee
Michael Eisen
Lisa Barcellos
Rachel Brem
Tom Alber

Program in Translational Genomics