Olivier Bodenreider: Experiences in visualizing and navigating biomedical ontologies and knowledge bases

Lister Hill National Center for Biomedical Communications, Bethesda, Maryland - USA. ISMB 2002, Fifth Annual Bio-Ontologies Meeting, August 8, 2002.


DESCRIPTION

Olivier Bodenreider, Lister Hill National Center for Biomedical Communications, Bethesda, Maryland - USA. ISMB 2002, Fifth Annual Bio-Ontologies Meeting, August 8, 2002. Experiences in visualizing and navigating biomedical ontologies and knowledge bases.

TRANSCRIPT

Page 1: Olivier Bodenreider

Olivier Bodenreider

Lister Hill National Center for Biomedical Communications
Bethesda, Maryland - USA

Experiences in visualizing and navigating biomedical ontologies and knowledge bases

ISMB 2002
Fifth Annual Bio-Ontologies Meeting

August 8, 2002

Page 2: Olivier Bodenreider

Introduction 1

- Biomedical knowledge
  - Terminologies (names)
  - Ontologies (objects)
  - Knowledge bases (facts)
- Common features
  - Terms / Concepts
  - Inter-concept relationships
    - Hierarchical
    - Associative

Page 3: Olivier Bodenreider

Introduction 2

- Challenges
  - Volume of information
    - 10^4 - 10^6 concepts
    - 10^5 - 10^7 relationships
  - Orientation
    - Mapping to concepts
    - Visualizing concept spaces
    - Navigating concept spaces

[Diagram labels: term, knowledge]

Page 4: Olivier Bodenreider

Introduction 3

SemNav
- UMLS browser
- Entry point: biomedical term
- Display related concepts
- Display properties of interconcept relationships
- Allow navigation among concepts

GenNav
- GO browser
- Entry point: GO term or gene product name/symbol
- Display related GO terms and gene products
- Display properties of term/term and term/gene product relationships
- Allow navigation between GO terms and gene products

Page 5: Olivier Bodenreider

Outline

- Background
  - Unified Medical Language System (UMLS)
  - Gene Ontology
- Overview of the browsers
  - SemNav
  - GenNav
- Common features
- Differences

Page 6: Olivier Bodenreider

UMLS and GO

Page 7: Olivier Bodenreider

Unified Medical Language System

- Developed at NLM since 1990
- 13th edition in 2002
- Integrates some 60 terminological resources
  - Clinical vocabularies (including specialties)
  - Core terminologies (anatomy, drugs, med. devices)
  - Administrative terminologies, standards
- Integration
  - Synonymous terms are clustered in a concept
  - Hierarchies (trees) are combined in a graph structure

Page 8: Olivier Bodenreider

Terminology integration: Terms

Synonymous terms for one concept, with the source vocabularies in which each term appears:

- Duchenne muscular dystrophy (MeSH, SNOMED, CTV3, Jablonski, CRISP, DxPlain, MedDRA, LOINC)
- pseudohypertrophic muscular dystrophy (MeSH, CTV3, SNOMED)
- X-linked recessive muscular dystrophy (Jablonski)
- Duchenne de Boulogne muscular dystrophy (Jablonski)
- Duchenne’s muscular dystrophy (COSTAR)
- severe generalized familial muscular dystrophy (SNOMED)
- Duchenne type progressive muscular dystrophy (SNOMED)
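The clustering of synonymous terms into a single concept can be sketched as a small data structure. This is only an illustration under stated assumptions: the identifier `CONCEPT_DMD` is a hypothetical placeholder (not a real UMLS identifier), the function names are invented for this example, and only three of the slide's terms are included to keep it short.

```python
from collections import defaultdict

# Terms and source vocabularies taken from the slide (subset).
# "CONCEPT_DMD" is a hypothetical placeholder, not a real UMLS identifier.
TERMS = [
    ("Duchenne muscular dystrophy",
     ["MeSH", "SNOMED", "CTV3", "Jablonski", "CRISP", "DxPlain", "MedDRA", "LOINC"]),
    ("pseudohypertrophic muscular dystrophy", ["MeSH", "CTV3", "SNOMED"]),
    ("Duchenne type progressive muscular dystrophy", ["SNOMED"]),
]


def build_concept(concept_id, terms):
    """Cluster synonymous terms under one concept, keeping each term's sources."""
    concept = {"id": concept_id, "terms": {}}
    for name, sources in terms:
        concept["terms"].setdefault(name, set()).update(sources)
    return concept


def build_term_index(concepts):
    """Map each lowercased term to the ids of the concepts it names."""
    index = defaultdict(set)
    for concept in concepts:
        for name in concept["terms"]:
            index[name.lower()].add(concept["id"])
    return index


concept = build_concept("CONCEPT_DMD", TERMS)
index = build_term_index([concept])
print(index["pseudohypertrophic muscular dystrophy"])  # {'CONCEPT_DMD'}
```

Keeping the source vocabularies next to each term makes it possible to restrict a display to one vocabulary later on.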

Page 9: Olivier Bodenreider

Terminology integration: Relationships

[Diagram: relationships among concepts in the UMLS, drawn from SNOMED, MeSH, AOD and Read Codes. Concepts shown: Adrenal Gland Diseases, Adrenal Cortex Diseases, Adrenal Gland Hypofunction, Adrenal cortical hypofunction, Hypoadrenalism, Addison’s Disease]
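To illustrate how separate hierarchies (trees) can combine into a graph, here is a minimal sketch. The two vocabularies `vocab_A` and `vocab_B` and all of their parent/child edges are hypothetical; they reuse concept names from the slide for readability but are not copied from SNOMED, MeSH, AOD or Read Codes.

```python
from collections import defaultdict

# Hypothetical parent -> child edges from two made-up vocabularies.
# Illustrative only: not copied from any real source vocabulary.
HIERARCHIES = {
    "vocab_A": [
        ("Adrenal Gland Diseases", "Adrenal Gland Hypofunction"),
        ("Adrenal Gland Hypofunction", "Addison's Disease"),
    ],
    "vocab_B": [
        ("Adrenal Gland Diseases", "Adrenal Cortex Diseases"),
        ("Adrenal Cortex Diseases", "Addison's Disease"),
    ],
}


def merge_hierarchies(hierarchies):
    """Combine per-source trees into one graph: edge -> set of asserting sources."""
    edges = defaultdict(set)
    for source, pairs in hierarchies.items():
        for parent, child in pairs:
            edges[(parent, child)].add(source)
    return edges


def parents_of(edges, node):
    """Return the sorted parents of a node in the merged graph."""
    return sorted(parent for (parent, child) in edges if child == node)


graph = merge_hierarchies(HIERARCHIES)
print(parents_of(graph, "Addison's Disease"))
# ['Adrenal Cortex Diseases', 'Adrenal Gland Hypofunction']
```

Each input is a tree (one parent per node), yet the merged structure gives one node two parents, which is why the result has to be handled as a graph.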

Page 10: Olivier Bodenreider

UMLS

- Two-level structure
  - Semantic Network
    - 134 Semantic Types (STs)
    - 54 types of relationships among STs
  - Metathesaurus
    - 800,000 concepts
    - ~10 M inter-concept relationships
  - Link = categorization

[Diagram: Concept (Metathesaurus) linked by categorization to Semantic Type (Semantic Network)]

Page 11: Olivier Bodenreider

[Diagram: the concept Heart and related concepts in the Metathesaurus (Esophagus, Left Phrenic Nerve, Heart Valves, Fetal Heart, Mediastinum, Saccular Viscus, Angina Pectoris, Cardiotonic Agents, Tissue Donors), linked to Semantic Types in the Semantic Network (Anatomical Structure; Fully Formed Anatomical Structure; Embryonic Structure; Body Part, Organ or Organ Component; Pharmacologic Substance; Disease or Syndrome; Population Group). Numeric labels in the figure: 22, 225, 97, 4, 12, 9, 31]
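The categorization link between the two levels can be sketched as a mapping from concepts to Semantic Types. The pairings below are assumptions read off the labels of the diagram, not an export of the UMLS, and the function name is invented for this example.

```python
from collections import defaultdict

# Concept -> Semantic Type pairings. These are assumptions inferred from
# the diagram's labels, not data extracted from the UMLS itself.
CATEGORIZATION = {
    "Heart": {"Body Part, Organ or Organ Component"},
    "Fetal Heart": {"Embryonic Structure"},
    "Angina Pectoris": {"Disease or Syndrome"},
    "Cardiotonic Agents": {"Pharmacologic Substance"},
    "Tissue Donors": {"Population Group"},
}


def group_by_semantic_type(concepts, categorization):
    """Group concepts by the Semantic Types that categorize them."""
    groups = defaultdict(list)
    for concept in concepts:
        for semantic_type in sorted(categorization.get(concept, ())):
            groups[semantic_type].append(concept)
    return dict(groups)


related = ["Fetal Heart", "Angina Pectoris", "Cardiotonic Agents", "Tissue Donors"]
for semantic_type, members in group_by_semantic_type(related, CATEGORIZATION).items():
    print(f"{semantic_type}: {', '.join(members)}")
```

Grouping the concepts related to Heart by Semantic Type gives a coarse, high-level view of a neighborhood that would otherwise be a flat list.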

Page 12: Olivier Bodenreider

Gene Ontology

- Developed by the GO Consortium
- Several components
  - Ontology (~11,000 concepts)
    - Molecular functions
    - Cellular components
    - Biological processes
  - Gene products (~125,000)
  - Associations between Gene products and GO concepts (~357,000)

Page 13: Olivier Bodenreider

SemNav

Page 14: Olivier Bodenreider
Page 15: Olivier Bodenreider
Page 16: Olivier Bodenreider

MeSH Browser

Page 17: Olivier Bodenreider
Page 18: Olivier Bodenreider

SemNav: Visualization options

Page 19: Olivier Bodenreider
Page 20: Olivier Bodenreider
Page 21: Olivier Bodenreider
Page 22: Olivier Bodenreider
Page 23: Olivier Bodenreider

SemNav: Relationships

[Diagram: the concepts Dystrophin and Muscular Dystrophy, Duchenne with their Semantic Types (Amino Acid, Peptide or Protein; Biologically Active Substance; Disease or Syndrome). Numeric label in the figure: 55]

Page 24: Olivier Bodenreider

GenNav

Page 25: Olivier Bodenreider
Page 26: Olivier Bodenreider

Material and Methods

Page 27: Olivier Bodenreider
Page 28: Olivier Bodenreider
Page 29: Olivier Bodenreider
Page 30: Olivier Bodenreider

Common features and differences

Page 31: Olivier Bodenreider

Mapping query terms

- Mapping terms to concepts
  - Matching criteria (exact, approximate)
  - Normalization techniques
    - work well on clinical terms
    - less applicable to gene names
- Query disambiguation
  - With semantic type in SemNav
  - With species in GenNav
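A rough sketch of exact versus normalized matching is given below. The `normalize` function is a deliberately simplified stand-in (lowercasing, possessive and punctuation removal, word sorting) and should not be read as the normalization actually used by SemNav or by the UMLS lexical tools; the concept label `CONCEPT_DMD` is a hypothetical placeholder.

```python
import re


def normalize(term):
    """Lowercase, drop possessives and punctuation, and sort the words."""
    text = term.lower()
    text = re.sub(r"['’]s\b", "", text)       # "duchenne's" -> "duchenne"
    text = re.sub(r"[^a-z0-9]+", " ", text)    # punctuation -> single space
    return " ".join(sorted(text.split()))


def build_indexes(term_to_concept):
    """Build an exact index and a normalized index over the known terms."""
    exact = dict(term_to_concept)
    normalized = {normalize(term): cid for term, cid in term_to_concept.items()}
    return exact, normalized


def map_term(query, exact, normalized):
    """Try an exact match first, then fall back to a normalized match."""
    if query in exact:
        return exact[query], "exact"
    key = normalize(query)
    if key in normalized:
        return normalized[key], "normalized"
    return None, "none"


# "CONCEPT_DMD" is a hypothetical placeholder identifier.
exact, normalized = build_indexes({"Duchenne muscular dystrophy": "CONCEPT_DMD"})
print(map_term("Duchenne muscular dystrophy", exact, normalized))
print(map_term("Muscular Dystrophy, Duchenne's", exact, normalized))
print(map_term("dystrophin", exact, normalized))
```

Discarding case, punctuation and word order suits variants of clinical terms like these. Treating those same features as noise may be less safe for short gene names and symbols, which is one possible reading of the slide's remark (an assumption, not a claim from the talk).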

Page 32: Olivier Bodenreider

Visualization

- Graph vs. Trees (Forest)
  - Multiple inheritance is better visualized by graphs than by trees
  - Off-the-shelf, freely available graph visualization packages exist (GraphViz)
- Need to reduce complexity
  - Transitive reduction on complex graphs
  - Feature selection
    - e.g., a given vocabulary in SemNav
    - e.g., a given species in GenNav
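Transitive reduction can be sketched in a few lines for a small acyclic graph. This is a straightforward, unoptimized version written for illustration, not the code used in SemNav or GenNav. The node names are placeholders, and the last function only emits text in GraphViz's DOT syntax; rendering would still require GraphViz itself.

```python
def reachable(graph, start, target, skip_edge):
    """Depth-first search from start to target that ignores one given edge."""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        for child in graph.get(node, ()):
            if (node, child) == skip_edge or child in seen:
                continue
            if child == target:
                return True
            seen.add(child)
            stack.append(child)
    return False


def transitive_reduction(graph):
    """Drop each edge u -> v for which another path from u to v exists.

    Assumes the graph is a directed acyclic graph.
    """
    return {
        u: [v for v in children if not reachable(graph, u, v, (u, v))]
        for u, children in graph.items()
    }


def to_dot(graph):
    """Serialize the graph as DOT text for a GraphViz layout program."""
    lines = ["digraph G {"]
    for u in sorted(graph):
        for v in sorted(graph[u]):
            lines.append(f'  "{u}" -> "{v}";')
    lines.append("}")
    return "\n".join(lines)


# Placeholder nodes: A -> C and A -> D are implied by A -> B -> C -> D.
GRAPH = {"A": ["B", "C", "D"], "B": ["C"], "C": ["D"], "D": []}
reduced = transitive_reduction(GRAPH)
print(to_dot(reduced))
```

On this example, the six original edges shrink to the three-edge chain A, B, C, D without losing any ancestor/descendant information.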

Page 33: Olivier Bodenreider

Navigation

- Tool for exploration
- Navigation among concepts (SemNav and GenNav)
- Navigation between two poles (Gene products and GO concepts in GenNav)
- Self-contained (SemNav) or open to external resources (GenNav)
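Navigation between two poles can be pictured as two indexes built from the same association list, one per direction. The sketch below uses placeholder names (`product_1`, `term_X`, and so on) rather than real gene products or GO terms, and the function names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical (gene product, GO term) associations; placeholders only,
# not real annotations.
ASSOCIATIONS = [
    ("product_1", "term_X"),
    ("product_1", "term_Y"),
    ("product_2", "term_Y"),
    ("product_3", "term_Z"),
]


def build_poles(associations):
    """Index the associations in both directions."""
    terms_of = defaultdict(set)     # gene product -> GO terms
    products_of = defaultdict(set)  # GO term -> gene products
    for product, term in associations:
        terms_of[product].add(term)
        products_of[term].add(product)
    return terms_of, products_of


def neighbors_via_terms(product, terms_of, products_of):
    """Gene products reachable from `product` through a shared GO term."""
    found = set()
    for term in terms_of.get(product, ()):
        found |= products_of[term]
    found.discard(product)
    return sorted(found)


terms_of, products_of = build_poles(ASSOCIATIONS)
print(sorted(terms_of["product_1"]))                           # ['term_X', 'term_Y']
print(neighbors_via_terms("product_1", terms_of, products_of)) # ['product_2']
```

With both indexes in place, a user can hop from a gene product to its terms and back to other gene products without rescanning the association list.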

Page 34: Olivier Bodenreider

Conclusions

Page 35: Olivier Bodenreider

Conclusions

- Most of the lessons learned while developing SemNav (for browsing general biomedical knowledge) were applicable to GenNav (for browsing molecular biology knowledge)
- The lexical techniques suitable for mapping text to clinical terminologies require adaptation to the specificity of molecular biology terminologies

Page 36: Olivier Bodenreider

Contact: [email protected] (@nlm.nih.gov)

Olivier Bodenreider

Lister Hill National Center for Biomedical Communications
Bethesda, Maryland - USA

SemNav: http://umlsks.nlm.nih.gov* ► Resources ► Semantic Navigator
(* free UMLS registration required)

GenNav: http://etbsun2.nlm.nih.gov:8000/perl/gennav.pl