UMLS and phenotype coding

Anita Burgun, Fleur Mougin, Olivier Bodenreider
INSERM U936, EA 3888 - Faculté de Médecine, Univ. Rennes 1
ISPED, Univ. Bordeaux, France
US National Library of Medicine, NIH, Bethesda, USA

December 2, 2008

One Medicine One Pathology: 2nd annual CASIMIR Symposium on Human and Mouse Disease Informatics

Phenotype coding

• Where phenotypes are recorded
– Clinical repositories
– Research articles
– Annotation databases
• Terminologies used to code them
– MeSH
– SNOMED CT
– MPO
– ICD, NCI Thesaurus, …….

Phenotype coding: role of the UMLS

• MOUSE: annotation databases, coded with the Mammalian Phenotype Ontology
• HUMAN: OMIM and clinical databases, coded with clinical terminology
• UMLS: positioned between the phenotype ontology and the clinical terminology

Unified Medical Language System

• Addresses heterogeneity issues
– More than 100 source vocabularies
– Unification
• Clusters terms into concepts
– Metathesaurus: CUIs
• Organizes hierarchies
– Metathesaurus: relations
• Categorizes concepts
– Semantic Network: Semantic Types

UMLS Metathesaurus

• Craniosynostosis in UMLS Release 2008AA
• Source vocabularies
– ICD-10
– ICPC
– MedDRA
– MeSH
– OMIM
– Read Codes
– SNOMED CT
– …..
• Definition (MeSH): Premature closure of one or more sutures of the skull

• UMLS Metathesaurus
• CUI C0010278 Craniosynostosis (Preferred Term)
– Craniostenosis (ICD, ICPC, OMIM, SNOMED CT)
– Craniosynostosis syndrome (SNOMED CT)
– Synostosis (cranial) (CRISP)
– Word phrases
  • Premature closure of cranial sutures (MedDRA, SNCT)
  • Congenital ossification of cranial sutures
  • Congenital ossification of sutures
  • Congenital ossification of sutures of skull
  • Premature cranial suture closure (SNOMED)
– Abbreviations
  • CRS, CSO, CRS1 (OMIM)
– More specific terms
  • Craniosynostosis, type 1 (OMIM)
• Possible synonyms and related
– Hurst syndrome (C0014077)
– Christian syndrome 1 (C0795794)
– SCARF (skeletal abnormalities, cutis laxa, craniostenosis, psychomotor retardation, facial abnormalities) syndrome (C0796146)
– …..
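
The clustering of terms under one CUI can be pictured as a small lookup structure. The following is a minimal sketch, not the Metathesaurus data model: the `Concept` class and the `find_cui` helper are hypothetical names introduced for illustration, and only a handful of the terms listed on the slide are included.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One Metathesaurus-style concept: a CUI plus the terms clustered under it."""
    cui: str
    preferred_term: str
    # term string -> source vocabularies that contribute it (as listed on the slide)
    terms: dict = field(default_factory=dict)


craniosynostosis = Concept(
    cui="C0010278",
    preferred_term="Craniosynostosis",
    terms={
        "Craniostenosis": ["ICD", "ICPC", "OMIM", "SNOMED CT"],
        "Craniosynostosis syndrome": ["SNOMED CT"],
        "Synostosis (cranial)": ["CRISP"],
        "Premature closure of cranial sutures": ["MedDRA", "SNOMED CT"],
        "CRS": ["OMIM"],
    },
)


def find_cui(term, concepts):
    """Return the CUI of the first concept naming `term` (case-insensitive), else None."""
    wanted = term.lower()
    for concept in concepts:
        names = [concept.preferred_term, *concept.terms]
        if any(name.lower() == wanted for name in names):
            return concept.cui
    return None


print(find_cui("craniostenosis", [craniosynostosis]))  # C0010278
```

Terms listed under "Possible synonyms and related" carry their own CUIs, so a lookup for "Hurst syndrome" against this single concept returns `None`.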

C0010278 Craniosynostosis in ICD-10

• Parent: C0495614 Other congenital malformations of skull and face bones

C0010278 Craniosynostosis in MedDRA

• Parent: C0852332 Musculoskeletal and connective tissue deformities of skull, face and buccal cavity

C0010278 Craniosynostosis in MeSH

• Parent: C0039093 Congenital abnormal synostosis
• Parent: C0376634 Craniofacial Abnormalities

C0010278 Craniosynostosis in OMIM

• Parent: C0037303 Skull
• Parent: C0018670 Head

C0010278 Craniosynostosis in SNOMED CT

• Parent: C0424705 Cranial suture finding
• Parent: C1279960 Congenital anomaly of bone and joint
• Parent: C0495615 Congenital abnormality of skull and face bones
• Finding_site (SNCT): Bone structure of cranium = skull

C0010278 Craniosynostosis in the UMLS (all sources combined)

• C0495614 Other congenital malformations of skull and face bones (ICD)
• C0852332 Musculoskeletal and connective tissue deformities of skull, face and buccal cavity (MedDRA)
• C0039093 Congenital abnormal synostosis (MeSH)
• C0376634 Craniofacial Abnormalities (MeSH)
• C0424705 Cranial suture finding (SNCT)
• C1279960 Congenital anomaly of bone and joint (SNCT)
• C0495615 Congenital abnormality of skull and face bones (SNCT, isa)
• Bone structure of cranium = skull: Finding_site (SNCT), Parent (OMIM)

“Biologic Function” hierarchy (isa)

• Biologic Function
– Physiologic Function
  • Organism Function
    – Mental Process
  • Organ or Tissue Function
  • Cell Function
  • Molecular Function
    – Genetic Function
– Pathologic Function
  • Disease or Syndrome
    – Mental or Behavioral Dysfunction
    – Neoplastic Process
  • Cell or Molecular Dysfunction
  • Experimental Model of Disease

Example concept shown on the slide: CUI C0010278 Craniosynostosis

Overview of the Semantic Network

[Diagram: semantic types and the relationships among them.]
• Semantic types shown: Organism; Organism Attribute; Anatomical Structure (Embryonic Structure; Anatomical Abnormality, including Congenital Abnormality and Acquired Abnormality; Fully Formed Anatomical Structure, including Body Part, Organ or Organ Component, Tissue, Cell, Cell Component, and Gene or Genome); Body Substance; Body Space or Junction; Body Location or Region; Finding (Laboratory or Test Result; Sign or Symptom); Biologic Function (Physiologic Function; Pathologic Function); Injury or Poisoning
• Relationships shown: part of, conceptual part of, property of, process of, evaluation of, contains, produces, adjacent to, location of, disrupts, co-occurs with

Overview of the Semantic Network

[Diagram: reduced view of the Semantic Network, with CUI C0010278 Craniosynostosis as the example concept.]
• Semantic types shown: Organism; Organism Attribute; Anatomical Structure (Embryonic Structure; Anatomical Abnormality, including Congenital Abnormality and Acquired Abnormality; Fully Formed Anatomical Structure); Finding (Laboratory or Test Result; Sign or Symptom); Biologic Function (Physiologic Function; Pathologic Function, including Disease or Syndrome, Experimental Model of Disease, and Cell or Molecular Dysfunction); Injury or Poisoning
• Relationships shown: part of, property of, evaluation of, process of, disrupts

Semantic Groups

[Diagram: the same reduced view of the Semantic Network, labeled with the Semantic Group "Disorders".]
• Semantic types shown: Organism; Organism Attribute; Anatomical Structure (Embryonic Structure; Anatomical Abnormality, including Congenital Abnormality and Acquired Abnormality; Fully Formed Anatomical Structure); Finding (Laboratory or Test Result; Sign or Symptom); Biologic Function (Physiologic Function; Pathologic Function, including Disease or Syndrome, Experimental Model of Disease, and Cell or Molecular Dysfunction); Injury or Poisoning
• Relationships shown: part of, property of, evaluation of, process of, disrupts
• Semantic Group: Disorders
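
Semantic Groups coarsen the semantic types into a few broad categories such as Disorders. A minimal sketch of that lookup follows. The `SEMANTIC_GROUP` table is a hand-written excerpt, an assumption that should be checked against the official Semantic Groups file, and `groups_of` and `is_disorder` are hypothetical helper names.

```python
# Hand-written excerpt: semantic type -> semantic group abbreviation.
# Assumption: verify against the official Semantic Groups file before relying on it.
SEMANTIC_GROUP = {
    "Disease or Syndrome": "DISO",
    "Congenital Abnormality": "DISO",
    "Acquired Abnormality": "DISO",
    "Anatomical Abnormality": "DISO",
    "Pathologic Function": "DISO",
    "Cell or Molecular Dysfunction": "DISO",
    "Experimental Model of Disease": "DISO",
    "Injury or Poisoning": "DISO",
    "Finding": "DISO",
    "Sign or Symptom": "DISO",
    "Tissue": "ANAT",
    "Organ or Tissue Function": "PHYS",
}


def groups_of(semantic_types):
    """Return the set of semantic groups covering the given semantic types."""
    return {SEMANTIC_GROUP[t] for t in semantic_types if t in SEMANTIC_GROUP}


def is_disorder(semantic_types):
    """True when at least one semantic type belongs to the Disorders group."""
    return "DISO" in groups_of(semantic_types)


print(is_disorder(["Disease or Syndrome"]))       # True
print(is_disorder(["Organ or Tissue Function"]))  # False
```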

Mammalian Phenotype Ontology (MPO)

• 14,662 terms
• 6,307 concepts
• MP:0003561 Rheumatoid arthritis
• MP:0000218 increased leukocyte cell number
– increased leukocyte count
– increased WBC count
– increased WBC number
– increased white blood cell number
– leukocytosis
• MP:0000410 waved hair
– curly hair
– waved fur
– wavy hair

Mapping to UMLS (1/3)

• Step 1: Exact/normalized match
• Results
– 2,065 MPO terms mapped successfully (14%)
– 1,495 MPO concepts mapped successfully (24%)
– Among them, 1,432 correspond to Disorders in UMLS (SG)
• Examples mapped successfully
– MP:0000062 increased bone density -> C1141880 Bone density increased (NSI)
– MP:0000081 craniostosis -> C0010278 Craniosynostosis (syn in MPO, EM)
– MP:0000061 brittle bones -> C0029434 Osteogenesis Imperfecta (syn in UMLS, EM)
• Examples unmapped
– MP:0000100 abnormal ethmoidal bone
– MP:0000101 absent ethmoidal bone
– MP:0000687 small lymphoid organs
– MP:0000689 abnormal spleen structure
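
Step 1 can be sketched as a two-stage lookup: try the string as given, then try a normalized form. The code below is an illustration under stated assumptions, not the authors' implementation. `normalize` is a crude stand-in (lowercase, drop punctuation, sort words) for UMLS lexical normalization, which does more; `UMLS_STRINGS` is a toy table holding only strings that appear on the slides; `map_term` is a hypothetical name.

```python
import re

# Toy table: UMLS string -> CUI, limited to strings shown on the slides.
UMLS_STRINGS = {
    "Bone density increased": "C1141880",
    "Craniosynostosis": "C0010278",
    "Osteogenesis Imperfecta": "C0029434",
}


def normalize(term):
    """Crude normalization: lowercase, keep alphanumeric words, sort them."""
    return " ".join(sorted(re.findall(r"[a-z0-9]+", term.lower())))


# Index of normalized string -> set of CUIs.
NORMALIZED_INDEX = {}
for string, cui in UMLS_STRINGS.items():
    NORMALIZED_INDEX.setdefault(normalize(string), set()).add(cui)


def map_term(term):
    """Return (match_type, cuis); match_type is 'exact', 'normalized', or None."""
    exact = {cui for s, cui in UMLS_STRINGS.items() if s.lower() == term.lower()}
    if exact:
        return "exact", exact
    cuis = NORMALIZED_INDEX.get(normalize(term), set())
    if cuis:
        return "normalized", set(cuis)
    return None, set()


print(map_term("increased bone density"))   # ('normalized', {'C1141880'})
print(map_term("abnormal ethmoidal bone"))  # (None, set())
```

Word-order differences such as "increased bone density" versus "Bone density increased" are resolved by the sorted-word form, while modifier-bearing terms such as "abnormal ethmoidal bone" stay unmapped at this stage.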

Mapping to UMLS (2/3)

• 11,466 unmapped terms (4,812 concepts)
– MP:0000100 abnormal ethmoidal bone
– MP:0000101 absent ethmoidal bone
– MP:0000687 small lymphoid organs
– MP:0000689 abnormal spleen structure
• Step 2: demodification
– 30 modifiers: abnormal, absent, small…..
– Demodified terms
– Mapping to UMLS
• Results: demodified terms
– 9,845 <<modifier> xxx> terms in MPO
– 7,925 unique terms (after demodification)
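
Demodification can be sketched as stripping a leading modifier and then collapsing duplicates, which is why the number of unique terms is lower than the number of modified terms. This is a minimal sketch under stated assumptions: only six of the 30 modifiers are named on the slides, so `MODIFIERS` is incomplete, and `demodify` is a hypothetical helper name.

```python
# Six of the 30 modifiers are named on the slides; the full list is not given.
MODIFIERS = {"abnormal", "absent", "small", "decreased", "increased", "reduced"}


def demodify(term):
    """Split a leading modifier from a term.

    Returns (modifier, remainder), or (None, term) when the first word
    is not a known modifier or nothing follows it.
    """
    head, _, rest = term.partition(" ")
    if rest and head.lower() in MODIFIERS:
        return head.lower(), rest
    return None, term


unmapped = [
    "abnormal ethmoidal bone",
    "absent ethmoidal bone",
    "small lymphoid organs",
    "abnormal spleen structure",
]

# Different modifiers on the same entity collapse to one demodified term.
unique_demodified = sorted({demodify(t)[1] for t in unmapped})
print(unique_demodified)
# ['ethmoidal bone', 'lymphoid organs', 'spleen structure']
```

The demodified strings would then go through the same exact/normalized lookup as in Step 1.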

Mapping to UMLS (3/3)

• Results: mapping after demodification
– 2,359 MPO terms mapped successfully after demodification (out of 11,466, 20%)
– Unique terms: 1,586 (out of 7,925, 20%)
– 1,645 MPO concepts mapped successfully after demodification (out of 4,812, 34%)
• Demodified terms correspond mostly to:
– Anatomical concepts
  • Semantic Group Anatomy in UMLS
  • 1,410 terms, e.g., abnormal <anatomical structure>
  • MP:0000005 (increased) brown fat -> C0006298 Brown Fat (Tissue, ANAT)
– Physiology
  • Semantic Group Physiology in UMLS
  • 516 terms, e.g., abnormal <physiological process>
  • MP:0000057 (abnormal) osteogenesis -> C0029433 Osteogenesis (Organ or Tissue Function, PHYS)

Discussion

• Compositionality in phenotype terms
– Pattern: <abnormal <anatomical structure>>
– 30 modifiers (abnormal, decreased, increased, reduced, small …..): Functional Concept, Qualitative Concept
– Modified entities: Anatomy, Physiology (SGs)
• Phenote: EQ model: combine entities from any ontology with qualities / traits (such as those in PATO)
– Class Name: abnormal (PATO:0000460)
– Is A: deviation(from_normal)
– Synonym: aberrant, atypical, defective [R]
– Is A: pathological
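
The entity-plus-quality idea can be illustrated with a small record type. This is a sketch only, not Phenote's data model: the `EQ` class and `compose` function are hypothetical names, and a UMLS CUI stands in for the entity identifier purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EQ:
    """An entity-quality pair: what is affected, and how."""
    entity_id: str
    entity_label: str
    quality_id: str
    quality_label: str


# Quality taken from the slide: abnormal (PATO:0000460).
ABNORMAL = ("PATO:0000460", "abnormal")


def compose(entity_id, entity_label, quality=ABNORMAL):
    """Build an EQ statement from an entity and a (quality_id, quality_label) pair."""
    quality_id, quality_label = quality
    return EQ(entity_id, entity_label, quality_id, quality_label)


# Entity taken from the demodification example: C0029433 Osteogenesis.
eq = compose("C0029433", "Osteogenesis")
print(f"{eq.quality_label} {eq.entity_label.lower()}")  # abnormal osteogenesis
```

Read this way, a precomposed term such as "abnormal osteogenesis" and an EQ pair carry the same two pieces of information, which is what makes demodification a plausible bridge between the two styles.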

Discussion

• Role of the UMLS in integrating phenotype terminologies
– 6,307 MPO concepts (MOUSE) compared with the clinical terminology in the UMLS (HUMAN)
– 3,140 concepts present in the UMLS: 1,495 by exact/normalized match (24%) + 1,645 after demodification (26%)
– 3,167 concepts absent (50%), for example:
  • impaired ossification of basisphenoid bone
  • snout shape abnormalities
  • inability to present cytosolic antigens to Class-I restricted cytotoxic T cells
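
The concept counts on this slide can be cross-checked against the mapping results reported earlier (1,495 concepts from the exact/normalized step, 1,645 after demodification). The snippet below is a small arithmetic check; the variable names and the `share` helper are illustrative.

```python
total = 6307       # MPO concepts
exact = 1495       # mapped by exact/normalized match
demodified = 1645  # mapped after demodification
absent = 3167      # not found in the UMLS

# The three categories partition the ontology.
assert exact + demodified + absent == total

present = exact + demodified
print(present)  # 3140


def share(count, whole):
    """Percentage of `whole`, rounded to the nearest integer."""
    return round(100 * count / whole)


for label, count in [("exact/normalized", exact),
                     ("demodified", demodified),
                     ("absent", absent)]:
    print(f"{label}: {count} ({share(count, total)}%)")
# exact/normalized: 1495 (24%)
# demodified: 1645 (26%)
# absent: 3167 (50%)
```

The 24% figure therefore belongs to the 1,495 directly matched concepts, and 3,140 is the combined count of concepts linked to the UMLS by either route.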

Discussion

• Role of the UMLS in integrating phenotype terminologies
– 2,065 terms present in the UMLS
– + 2,359 demodified
– 3,167 concepts absent (of 6,307)
• By source vocabulary
– SNOMED CT: 1,557
– MedDRA: 1,358
– OMIM: 1,151
– NCI Thesaurus: 714
– ICD: 1,082

Discussion

• Role of the UMLS in integrating phenotype data
– OMIM, Clinical DBs, Mouse DBs
• Example
– MOUSE: craniosynostosis
– HUMAN: "The newborn presents an anomaly of cranial bones"; Christian syndrome
– UMLS: shown between the mouse and human sides

Acknowledgements

• Olivier Bodenreider, NLM
• Fleur Mougin, ISPED
• Download/customize/browse the UMLS
– Knowledge Source Server
– umlsks.nlm.nih.gov/