VIVO Mini-Grant: Integrating the UMLS Ontology into VIVO for Linking Biomedical Scientists


Upload: janos-hajagos

Post on 12-Jul-2015

Page 1: VIVO Mini-Grant: Integrating the UMLS Ontology into VIVO for Linking Biomedical Scientists

Stony Brook University

School of Medicine

8/25/2011

VIVO Mini-Grant: Integrating the UMLS Ontology into VIVO for Linking Biomedical Scientists

Moises Eisenberg* and Janos Hajagos*

Page 2

Contributors:

• Erich Bremer*

• Jizu Zhi**

• Tammy DiPrima*

• Ann Gardner***

• Naresh Singh****

• Aniket Divecha****

Dept. of Medical Informatics* | OSA** | SOM*** | Dept. of Computer Science****

Page 3

SUNY REACH

Page 7

VIVO has an ontology

Page 8

The Semantic Web starts simple, with an RDF triple

Subject → Predicate → Object
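
As a minimal sketch of the triple idea, the Python below models one statement as a (subject, predicate, object) tuple and serializes it as an N-Triples line. The example.org URIs are placeholders invented for illustration and are not identifiers from VIVO or the UMLS; the rdfs:label URI is the standard RDFS property.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # URI of the thing being described
    predicate: str  # URI of the property or relationship
    obj: str        # URI of another resource, or a plain literal value


def to_ntriples(t: Triple, object_is_uri: bool = True) -> str:
    """Serialize one triple as a single N-Triples line."""
    if object_is_uri:
        obj = f"<{t.obj}>"
    else:
        # Escape backslashes and double quotes inside a plain literal.
        escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    return f"<{t.subject}> <{t.predicate}> {obj} ."


# Hypothetical URIs, for illustration only.
person = "http://example.org/person/1"
statements = [
    (Triple(person, "http://www.w3.org/2000/01/rdf-schema#label", "Example Person"), False),
    (Triple(person, "http://example.org/hasResearchArea", "http://example.org/concept/1"), True),
]

for triple, is_uri in statements:
    print(to_ntriples(triple, is_uri))
```

Because both statements share the same subject URI, and the object of the second could itself be the subject of further statements, a set of such lines already forms a small graph.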

Page 9

Builds into a more complex network of interlinked URIs

Source data from NCIt

Page 11

The CUI: Concept Unique Identifier

Page 12

UMLS as linked data

• Developed a tool to publish large databases as RDFS

• Published the 2011AA version of the UMLS and the corresponding RxNorm release

– Publicly available sources (SRL=0)

– RxNorm is linked to DrugBank

• Includes attributes, relationships, and semantic types from the UMLS

Page 13

UMLS RDF in the wider world

Page 14

Faceted browser view

Page 15

UMLS CUI Alignment Tool

Page 17

Algorithm for aligning free text

• Parse the free text into its component words

• Build phrases of different word lengths

• Query the UMLS to check whether each phrase exists

• Sort the matches in descending order of word count

• Break ties by the number of occurrences in different source vocabularies

– The most widely used match gets a higher rank
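
The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the tool's actual code: the UMLS query is replaced by a small in-memory dictionary (`TOY_INDEX`) whose CUIs and source names are placeholders, the tie-break is read as "number of distinct source vocabularies containing the phrase", and the `max_words` cap is a made-up parameter.

```python
import re
from typing import Dict, List, Set, Tuple

# Hypothetical stand-in for a UMLS lookup: phrase -> (CUI, source vocabularies).
# The CUIs and source names are placeholders, not real UMLS data.
TOY_INDEX: Dict[str, Tuple[str, Set[str]]] = {
    "pediatric hiv": ("CUI-A", {"SRC1"}),
    "hiv": ("CUI-B", {"SRC1", "SRC2", "SRC3"}),
    "pediatric": ("CUI-C", {"SRC1", "SRC2"}),
}


def candidate_phrases(text: str, max_words: int = 5) -> List[str]:
    """Split text into words and build every contiguous phrase of 1..max_words words."""
    words = re.findall(r"[A-Za-z0-9]+", text.lower())
    phrases = []
    for n in range(1, min(max_words, len(words)) + 1):
        for i in range(len(words) - n + 1):
            phrases.append(" ".join(words[i:i + n]))
    return phrases


def align(text: str, index: Dict[str, Tuple[str, Set[str]]] = TOY_INDEX) -> List[Tuple[str, str, int]]:
    """Return (phrase, CUI, source count) matches, best match first."""
    matches = []
    # dict.fromkeys drops repeated phrases while keeping first-seen order.
    for phrase in dict.fromkeys(candidate_phrases(text)):
        if phrase in index:
            cui, sources = index[phrase]
            matches.append((phrase, cui, len(sources)))
    # Longer phrases first; among equal lengths, more source vocabularies first.
    matches.sort(key=lambda m: (-len(m[0].split()), -m[2]))
    return matches


for phrase, cui, n_sources in align("Pediatric HIV"):
    print(phrase, cui, n_sources)
```

With the toy index, the two-word phrase ranks first, and the single-word match found in three placeholder sources outranks the one found in two.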

Page 18

UMLS Web Service

• Base address: http://link.informatics.stonybrook.edu

• Sample call: /MeaningLookup/MlServiceServlet?textToProcess=Pediatric%20HIV&format=json

• Response formats: JSON, N-Triples, RDF/XML

• Response content: best choices and all choices for matching CUIs
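
A request URL for this service can be assembled as in the sketch below, which reproduces the sample call from the slide. The function name is hypothetical; only the `json` value of `format` appears in the source, so the values for the other formats are not assumed here, and the structure of the response is not documented in the slides, so no parsing is shown.

```python
from urllib.parse import quote, urlencode

BASE_ADDRESS = "http://link.informatics.stonybrook.edu"
SERVICE_PATH = "/MeaningLookup/MlServiceServlet"


def meaning_lookup_url(text: str, fmt: str = "json") -> str:
    """Build the lookup URL for a piece of free text (hypothetical helper)."""
    # quote_via=quote encodes spaces as %20, matching the sample call.
    query = urlencode({"textToProcess": text, "format": fmt}, quote_via=quote)
    return f"{BASE_ADDRESS}{SERVICE_PATH}?{query}"


print(meaning_lookup_url("Pediatric HIV"))
```

The resulting string can be passed to any HTTP client, provided the service is still reachable at that address.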

Page 19

Alignment to the UMLS CUIs

Page 20

PubMed RDF Conversion

• Started with XSLT published in 2008 by Pierre Lindenbaum

• A prototype project linked 2010 PubMed to the internal Health Sciences Library MARC holdings data (>800,000,000 triples)

• Allowed linked data search joining article data with holdings data

• PubMed XSLT updated to 2011 schema with MeSH aligned to the UMLS CUIs

• Current translation generated 1,973,880,813 triples

Page 22

PubMed CUI Web Service

• Base address: http://link.informatics.stonybrook.edu

• Sample call: /weaver/pubmed2cuis?pmid=17952453

• Response format: JSON

• Response content: UMLS CUIs with labels
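
The PubMed call takes a single `pmid` parameter, so a small helper mainly needs to guard against malformed identifiers. In the sketch below the function name and the digits-only check are assumptions made for illustration; the path and parameter name come from the sample call.

```python
from urllib.parse import urlencode

BASE_ADDRESS = "http://link.informatics.stonybrook.edu"


def pubmed2cuis_url(pmid) -> str:
    """Build the URL that requests the UMLS CUIs for one PubMed ID (hypothetical helper)."""
    pmid_str = str(pmid).strip()
    # Assumed validation: PubMed IDs are treated as plain ASCII digit strings.
    if not (pmid_str.isascii() and pmid_str.isdigit()):
        raise ValueError(f"not a numeric PubMed ID: {pmid!r}")
    return f"{BASE_ADDRESS}/weaver/pubmed2cuis?{urlencode({'pmid': pmid_str})}"


print(pubmed2cuis_url(17952453))
```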

Page 23

Linking subject areas to publications

Page 24

Data facts

• UMLS RDF (2011AA release; English language; SRL=0)

– Number of triples: 110,415,427

– Number of different sources: 46

– Number of CUIs: 2,404,344

– Number of AUIs: 3,594,372

• REACH VIVO (Data extracted 8/23/2011)

– Number of people: 684

– Number of triples: 448,112

• UMLS alignment of subject areas

– Number of subject areas: 425

– Number of UMLS CUIs generated: 899

– Number of distinct UMLS CUIs: 604

• PubMed alignment to REACH

– Number of UMLS CUIs generated: 192,450

– Number of distinct UMLS CUIs: 11,039

– Articles with no MeSH: 1,293 out of 15,975
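
A few ratios follow directly from the counts on this slide. The short calculation below derives them; the variable names are ad hoc, and the ratios are simple arithmetic on the reported figures rather than results stated by the authors.

```python
# Counts copied from the slide.
subject_areas = 425
cuis_from_subject_areas = 899
distinct_subject_cuis = 604
articles_total = 15_975
articles_without_mesh = 1_293

# Average number of CUIs generated per subject area.
cuis_per_subject_area = cuis_from_subject_areas / subject_areas

# Share of generated subject-area CUIs that are distinct.
distinct_share = distinct_subject_cuis / cuis_from_subject_areas

# Articles that do carry MeSH terms, and the share that do not.
articles_with_mesh = articles_total - articles_without_mesh
share_without_mesh = articles_without_mesh / articles_total

print(f"CUIs per subject area: {cuis_per_subject_area:.2f}")
print(f"Distinct share of subject-area CUIs: {distinct_share:.1%}")
print(f"Articles with MeSH: {articles_with_mesh}")
print(f"Articles without MeSH: {share_without_mesh:.1%}")
```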

Page 25

Links

• SPARQL endpoint:

– http://link.informatics.stonybrook.edu/sparql/

• CUI alignment tool:

– http://link.informatics.stonybrook.edu/MeaningLookup/

• Points to start browsing linked data:

– http://link.informatics.stonybrook.edu/umls/

– http://link.informatics.stonybrook.edu/umls/SAB

• Open source code developed at SBU:

– http://code.google.com/p/py-triple-simple/

   • Native Python RDF utility

– http://code.google.com/p/sbu-mi-vivo-tools/

   • Automated dumping of VIVO sites RDF and alignment to UMLS and PubMed

– http://code.google.com/p/spyder-web/

   • Faceted browser and lightweight web service for parameterized SPARQL queries

Page 26

Acknowledgements

• Supported through VIVO: Enabling National Networking of Scientists (NIH U24 RR029822)

• Original interactive CUI alignment tool created by Jakub Pezacki (SBU Class of 2010)