PhD thesis presentation

Next-generation text-mining applied to toxicogenomics data analysis. Kristina Hettne, PhD thesis defense, 20 December, 2012


Page 1: PhD thesis presentation

Next-generation text-mining applied to toxicogenomics data analysis

Kristina Hettne

PhD thesis defense

20 December, 2012

Page 2: PhD thesis presentation

Toxicogenomics: studying whether a chemical causes damage to genes

Text mining: teaching a computer to “read” articles and extract explicit information

Next-generation text mining: teaching a computer to find implicit information in articles

Page 3: PhD thesis presentation
Page 4: PhD thesis presentation

Image source: The Independent, July 12, 2012

Drug safety is essential! But… how to minimize animal testing?

Page 5: PhD thesis presentation

Toxicogenomics data

Interpretation using knowledge from manually curated databases

Image sources: Verhallen and Piersma, 2011, de Jong et al 2011, http://www.flickr.com/photos/jseita/3764113525/

Page 6: PhD thesis presentation

Toxicogenomics data

Interpretation using knowledge from manually curated databases

Not sufficient in coverage

We hypothesize that next-generation text mining can increase the information coverage

Image sources: Verhallen and Piersma, 2011, de Jong et al 2011, http://www.flickr.com/photos/jseita/3764113525/

Page 7: PhD thesis presentation

Information cloud for a chemical concept

Information cloud for a gene concept

Shared concepts

Next-generation text mining = concept profile matching

Image source: Herman van Haagen
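
The matching idea on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the thesis: it assumes that an "information cloud" can be represented as a mapping from concept to association weight, and that two clouds are compared by summing the products of the weights of their shared concepts. The concept names, the weights and the function names are all invented for the example.

```python
from typing import Dict, List

# Assumption: a concept profile maps each concept to an association weight.
ConceptProfile = Dict[str, float]


def match_score(a: ConceptProfile, b: ConceptProfile) -> float:
    """Sum of weight products over the concepts that both profiles contain."""
    shared = a.keys() & b.keys()
    return sum(a[c] * b[c] for c in shared)


def shared_concepts(a: ConceptProfile, b: ConceptProfile) -> List[str]:
    """Shared concepts, ordered by how much each contributes to the score."""
    shared = a.keys() & b.keys()
    return sorted(shared, key=lambda c: a[c] * b[c], reverse=True)


# Invented toy profiles for one chemical concept and one gene concept.
chemical_profile = {"concept_A": 0.8, "concept_B": 0.6, "concept_C": 0.3}
gene_profile = {"concept_B": 0.9, "concept_A": 0.2, "concept_D": 0.5}

score = match_score(chemical_profile, gene_profile)
explanation = shared_concepts(chemical_profile, gene_profile)
```

The list of shared concepts is what makes an implicit chemical-gene link inspectable: it shows through which intermediate concepts the two profiles overlap, even when no article mentions the chemical and the gene together.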

Page 8: PhD thesis presentation

Concepts come from a thesaurus and are identified in text with concept identification software

A good thesaurus = the basis for good concept identification

Image source: Herman van Haagen
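
To make the role of the thesaurus concrete, here is a minimal sketch of dictionary-based concept identification in Python. It is not the concept identification software referred to on the slide; the toy thesaurus, the concept identifiers and the function name are assumptions made for illustration only. Terms are matched case-insensitively on word boundaries, and longer terms are tried first.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical toy thesaurus: concept identifier -> terms (synonyms).
THESAURUS: Dict[str, List[str]] = {
    "C_TRIAZOLE": ["triazole", "triazoles"],
    "C_RAT_EMBRYO": ["rat embryo", "rat embryos"],
}


def identify_concepts(
    text: str, thesaurus: Dict[str, List[str]]
) -> List[Tuple[int, int, str]]:
    """Return (start, end, concept_id) for every thesaurus term found in text."""
    term_to_concept = {
        term.lower(): concept_id
        for concept_id, terms in thesaurus.items()
        for term in terms
    }
    if not term_to_concept:
        return []
    # Longest terms first, so "rat embryos" is preferred over "rat embryo".
    alternatives = sorted(term_to_concept, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in alternatives) + r")\b",
        re.IGNORECASE,
    )
    return [
        (m.start(), m.end(), term_to_concept[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]


sentence = "Triazoles cause defects in the rat embryo."
hits = identify_concepts(sentence, THESAURUS)
```

Every term that is missing from the thesaurus is invisible to this kind of matcher, and every ambiguous term produces false hits, which is why thesaurus quality matters so much.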

Page 9: PhD thesis presentation


Research objectives:

• Investigate information coverage in public biomedical and chemical thesauri and databases

• Provide methods to improve the quality and coverage

• Give recommendations for use

• Investigate added value of next-generation text mining when interpreting toxicogenomics data

Page 10: PhD thesis presentation


Results

Page 11: PhD thesis presentation


A thesaurus of chemical concepts [1] and methods [1,2,3] to prepare a thesaurus to be used with concept identification software

1. Hettne et al. Bioinformatics, 2009

2. Hettne et al. Journal of Biomedical Semantics, 2010

3. Hettne et al. Journal of Cheminformatics, 2010

http://www.biosemantics.org/casper

http://www.biosemantics.org/jochem
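
As an illustration of what preparing a thesaurus for concept identification can involve, the Python sketch below applies a few generic clean-up rules. These rules are assumptions chosen for the example and are not the rules from the cited papers: whitespace is normalised, very short terms and terms without any letter are suppressed, and case-insensitive duplicates within a concept are removed. The function name and the `min_length` parameter are hypothetical.

```python
from typing import Dict, List


def prepare_thesaurus(
    raw: Dict[str, List[str]], min_length: int = 3
) -> Dict[str, List[str]]:
    """Apply simple, illustrative clean-up rules to a concept -> terms mapping."""
    cleaned: Dict[str, List[str]] = {}
    for concept_id, terms in raw.items():
        kept: List[str] = []
        seen = set()
        for term in terms:
            normalised = " ".join(term.split())  # collapse stray whitespace
            key = normalised.lower()
            if len(normalised) < min_length:
                continue  # very short terms tend to be ambiguous
            if not any(ch.isalpha() for ch in normalised):
                continue  # terms without letters, e.g. bare numbers
            if key in seen:
                continue  # case-insensitive duplicate
            seen.add(key)
            kept.append(normalised)
        if kept:
            cleaned[concept_id] = kept  # drop concepts left without terms
    return cleaned


raw_thesaurus = {
    "C1": ["Triazole", "triazole ", "TZ", "123-45"],
    "C2": ["  ", "42"],
}
prepared = prepare_thesaurus(raw_thesaurus)
```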

Page 12: PhD thesis presentation


A next-generation text mining-based method for interpreting biological data

Diagram labels: Biological data; Statistical test; Next-generation text mining

This method gives more, and more specific, results [1] than other available tools

1. Jelier R, Goeman JJ, Hettne KM, Schuemie MJ, den Dunnen JT, 't Hoen PA. Briefings in Bioinformatics, 2011

http://www.biosemantics.org/weightedglobaltest
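
One way to picture how text-mining scores and a statistical test can be combined is the simplified Python sketch below. It is a stand-in for illustration and not the published weighted global test: it assumes that each gene in a set carries a text-mining-derived weight, and it computes a weighted average of the squared covariance between each gene's expression and the response (for example, exposed versus control). A real analysis would also need a p-value, which this sketch does not compute. All names and numbers are invented.

```python
from typing import Dict, List


def weighted_set_statistic(
    expression: Dict[str, List[float]],
    response: List[float],
    weights: Dict[str, float],
) -> float:
    """Weighted mean of squared gene-response covariances for one gene set."""
    n = len(response)
    mean_y = sum(response) / n
    centred = [y - mean_y for y in response]

    weighted_sum = 0.0
    total_weight = 0.0
    for gene, weight in weights.items():
        if gene not in expression or weight <= 0:
            continue  # ignore genes without data or without support
        values = expression[gene]
        mean_x = sum(values) / n
        cov = sum((x - mean_x) * r for x, r in zip(values, centred))
        weighted_sum += weight * cov ** 2
        total_weight += weight
    return weighted_sum / total_weight if total_weight else 0.0


# Invented toy data: two control samples (0) and two exposed samples (1).
response = [0.0, 0.0, 1.0, 1.0]
expression = {
    "gene_1": [1.0, 1.0, 3.0, 3.0],  # changes with exposure
    "gene_2": [2.0, 2.0, 2.0, 2.0],  # does not change
}

equal_weights = {"gene_1": 1.0, "gene_2": 1.0}
text_mining_weights = {"gene_1": 0.9, "gene_2": 0.1}  # hypothetical scores

unweighted = weighted_set_statistic(expression, response, equal_weights)
weighted = weighted_set_statistic(expression, response, text_mining_weights)
```

In this toy case the statistic is larger when the gene that responds to exposure also has the stronger literature association, which is the intuition behind letting text-mining weights sharpen a gene-set test.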

Page 13: PhD thesis presentation

Application to toxicogenomics

http://www.biosemantics.org/index.php?page=chemicalresponse-specific-gene-sets

Hettne et al. (submitted)

Page 14: PhD thesis presentation

Image sources: 1. Verhallen and Piersma, 2011; 2. De Jong et al., 2012

See developmental defects in stem cells instead of in animal embryos

Figure 1: Embryonic structure

Figure 2: A) Control group rat embryo B) Triazole-exposed rat embryo

Figure label: Posterior neuropore open

Page 15: PhD thesis presentation

Toxicity class prediction (case study: Triazoles)

1. Chemical

Image source 1: Verhallen and Piersma, 2011

Chemical-gene matrix 25 times larger than the manually curated one (Comparative Toxicogenomics Database)

Page 16: PhD thesis presentation

Conclusions

Next-generation text mining combined with statistical tests complements, and is sometimes superior to, manually curated databases in:

- Relating chemical information to gene expression data

- Identifying toxic effects already at the gene expression stage

- Discriminating between different classes of chemicals

Page 17: PhD thesis presentation

Future

1. Make the method easier to use (currently being worked on)

2. Apply the method to new drugs with unknown toxicity

Early prediction of toxicity -> less animal testing and safer drugs

Page 18: PhD thesis presentation

Thank you to all who made this possible!