NCI Thesaurus and Enterprise Vocabulary Services: Resources for Cancer Research Lawrence W. Wright Program Manager NCI Enterprise Vocabulary Services http://evs.nci.nih.gov/ May 13, 2015



NCI Thesaurus and Enterprise Vocabulary Services: Resources for Cancer Research

Lawrence W. Wright
Program Manager
NCI Enterprise Vocabulary Services
http://evs.nci.nih.gov/

May 13, 2015


EVS Purpose and Scope

EVS provides terminology and ontology services to support NCI and cancer researchers, and has found many shared interests and strong partnerships in the broader biomedical community.

- Encode Precise, Stable Meanings:
  • Support best-practice, science-based, quick-response terminology/ontology resources to help researchers accurately collect, code, and analyze data.
- Support Semantic Infrastructures:
  • Support metadata, models, value sets, and mappings that provide broader, computable representations to structure meanings and make them interoperable.
- Build Shared Standards:
  • Partner and harmonize with other ICs, agencies, SDOs, and researchers in creating and improving shared standards for increasingly international, cross-cutting research.
- Promote Open Content and Tools:
  • Promote open access, open source content and tools to lower barriers, share burdens, and build shared resources.


NCI Thesaurus (NCIt): Semantic Backbone for Research Information

• Terminology: Provide best-practice coding as needed in all relevant domains.
  - Cancers and other diseases, findings and abnormalities.
  - Clinical & research trials/studies, procedures, tools, management, etc.
  - Agents, regimens, chemicals, nutrients, nanoparticles, & other substances.
  - Anatomy, tissues, subcellular structures.
  - Genes, gene products, pathways, biological processes.
  - Animal models – mouse, rat, zebrafish, other.
  - Concepts, properties, qualifiers, administrative & other misc. terminology.
• Ontology: Deep & precise representation of key research and health concepts.
  - Neoplasms
    • 8,000+ concepts defined using 200,000 description logic relationships plus text definitions.
    • Tracks latest molecular, pathological, and clinical classifications.
  - Drugs
    • 17,000+ individual agents & related substances, including nutritional.
    • 3,400+ agent combinations being extended to cover specific regimens.
  - Molecular
    • 16,000+ genes, gene products, pathways, and abnormalities.
  - Anatomy
    • 6,750+ concepts including systems, structures, tissues, and an extensive microanatomy.
    • Federal Consolidated Health Informatics (CHI) standard.


NCIt Example: Lymphoma (1 of 5)

Concept Code

Links to caDSR and NCIm

NCI Preferred Term

NCI Definition


NCIt Example: Lymphoma (2 of 5)

Term Source

Tagged C3208 stakeholders (incl. Contributing_Source): CTEP, CTRP, PDQ, TCGA, NICHD, CDISC, FDA

Term Code or Subsource


NCIt Example: Lymphoma (3 of 5)

Relationships 1: Parent and Child Concepts


NCIt Example: Lymphoma (4 of 5)

Relationships 2: Role Relationships & Subset Associations

Role Relationships

Associations


NCIt Example: Lymphoma (5 of 5)
Activated B-Cell-Like Diffuse Large B-Cell Lymphoma

Preferred Name: Activated B-Cell-Like Diffuse Large B-Cell Lymphoma
Code: C36081
Semantic Type: Neoplastic Process
Parent Concepts:
  Diffuse Large B-Cell Lymphoma by Gene Expression Profile
  Aggressive Non-Hodgkin Lymphoma

Definition: A biologic subset of diffuse large B-cell lymphomas with a unique molecular signature or expression profile. It represents approximately 30% of diffuse large B-cell lymphomas, and is characterized by the expression of CD44, PKCbeta1, Cyclin D2, BCL-2, and IRF4/MUM1 genes. Morphologically, these lymphomas are either centroblastic or immunoblastic (ratio 2:1). Patients with this type of diffuse large B-cell lymphoma are reported to have a less favorable outcome compared to those with a germinal center B-cell expression profile, with a 5-year survival rate of 35% and a median survival of 2 years.

* Partial List
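
The fields on this slide can be read as a simple record keyed on the concept code. A minimal Python sketch, assuming hypothetical class and field names (`NcitConcept`, `preferred_name`, `describe`, and so on) chosen only for illustration; this is not the NCIt or LexEVS data model, and the definition text is abbreviated:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NcitConcept:
    """Illustrative container for the concept fields shown on the slide.

    The class and its field names are assumptions, not an EVS API.
    """
    code: str
    preferred_name: str
    semantic_type: str
    parents: tuple = ()
    definition: str = ""


# Values copied from the slide; the definition is truncated to one sentence.
concept = NcitConcept(
    code="C36081",
    preferred_name="Activated B-Cell-Like Diffuse Large B-Cell Lymphoma",
    semantic_type="Neoplastic Process",
    parents=(
        "Diffuse Large B-Cell Lymphoma by Gene Expression Profile",
        "Aggressive Non-Hodgkin Lymphoma",
    ),
    definition=(
        "A biologic subset of diffuse large B-cell lymphomas with a unique "
        "molecular signature or expression profile."
    ),
)


def describe(c: NcitConcept) -> str:
    """Render a one-line summary that leads with the stable concept code."""
    return f"{c.code} | {c.preferred_name} | parents: {len(c.parents)}"


print(describe(concept))
```

Freezing the record mirrors the idea that a code and its meaning stay stable once published; anything that changes would be modeled as a new version rather than an in-place edit.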


NCIt Drugs (1 of 2)


NCIt Drugs (2 of 2)


NCIt Biomarker Types with example concepts

• Molecular/Genetic Markers
  BRCA1 Gene; BRAF NP_004324.2:p.V600X (BRAF V600 Mutation); t(11;18)(q21;q21); BCR/ABL1 Fusion Protein p230; N-Telopeptide
• Laboratory Test Results
  Estrogen Receptor Status; Methemoglobin Reductase Deficiency; CD34-Positive Neoplastic Cells Present; HMB-45-Positive Neoplastic Cells Present
• Histology/Pathology Findings
  Positive Surgical Margin; Blast Cells Present in Peripheral Blood; Ductal Carcinoma Cell; Cervical Glandular Dysplasia; Atypical Mitotic Figures
• Antigens and Metabolic Markers
  Ganglioside GM2; CD15; 2-Methoxyestradiol; 4-Hydroxyestrone; N(6)-Carboxymethyllysine; 8-Oxoguanine
• Physiological and Pathological Processes
  DNA Methylation; Tumor Angiogenesis; Oxidative Stress; Lipid Peroxidation; S-Nitrosylation; Histone Acetylation


Partnered NCIt subsets
http://ncit.nci.nih.gov/ncitbrowser/pages/subset.jsf


Partnered NCIt subsets (1 of 550): FDA SPL Drug Route of Administration Terminology


Cross Map by Meaning
NCI Metathesaurus https://ncim.nci.nih.gov

Definitions

Terms & Sources
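
Cross-mapping by meaning groups terms from different source vocabularies under one shared concept, so that a code in one source can be translated to the equivalent codes in another. A minimal sketch, assuming invented source names, codes, and concept identifiers (`SRC_A`, `A-001`, `M0001`) and hypothetical function names; none of this is real NCIm content or an NCIm API:

```python
from collections import defaultdict

# (concept_id, source, source_code, term)
# Every value below is a made-up placeholder used only for illustration.
ROWS = [
    ("M0001", "SRC_A", "A-001", "Lymphoma"),
    ("M0001", "SRC_B", "B-917", "Malignant lymphoma"),
    ("M0002", "SRC_A", "A-002", "Leukemia"),
]


def build_index(rows):
    """Index rows by shared concept and by (source, code) pair."""
    by_concept = defaultdict(list)
    by_source_code = {}
    for concept_id, source, code, term in rows:
        by_concept[concept_id].append((source, code, term))
        by_source_code[(source, code)] = concept_id
    return by_concept, by_source_code


def translate(source, code, target_source, rows=ROWS):
    """Return target-source codes that share a concept with (source, code)."""
    by_concept, by_source_code = build_index(rows)
    concept_id = by_source_code.get((source, code))
    if concept_id is None:
        return []  # unknown source code: nothing to map
    return [c for s, c, _ in by_concept[concept_id] if s == target_source]


print(translate("SRC_A", "A-001", "SRC_B"))
```

An empty result means the shared concept has no term from the target source, which is a gap in coverage rather than an error.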


NCI Hosted Mappings

nciterms.nci.nih.gov/ncitbrowser/pages/mapping_search.jsf?nav_type=mappings


NCI Unified, Open Infrastructure
LexEVS Server & NCI Term Browser http://nciterms.nci.nih.gov/

22 Sources

Search

Linked Resources

3 Resource Types

25 / 75 Subsources


NCI Metadata in caDSR
Widely Used in NCI & Partner Semantic Infrastructures


Some Lessons (1 of 2)

• Coding and representation of biomedical information will remain diverse and dynamic.
  - Many ‘legacy’ systems will be widely used for a long time to come.
  - Their content and use can be improved in important ways.
  - There is a large and growing role for more innovative resources responsive to specific research and care needs.
• Responsiveness and partnerships are vital: Engage to analyze and address needs quickly, form strategic partnerships and communities around key needs.
• Open standards encourage participation and reuse: Harmonize and share as openly and widely as possible, have expert staff to support operations.
  - Scale of reuse can easily exceed scale of original uses.
• Open technical standards and tools such as OWL/RDF, CTS2/REST, Protégé, LexEVS, and NCI browsers increase sharing and compatibility.


Some Lessons (2 of 2)

• Core best practices are vital: Stable codes for precise meanings, clear terms and synonymy, human-readable text definitions, extensive quality control, expert staff and community input.

• NCIt reference terminology provides the semantic backbone for most EVS-supported coding, analysis, and sharing of research data.

• NCIt-embedded partner terminology combines tighter semantics, harmonization, shared coding, and partner-appropriate terms.

• NCIm and related mappings are very useful for reference, NLP, and translation, but have weaker semantics and use.

• User-driven priorities create relevance but also unevenness: EVS combines broad scope and rich ontology with some gaps and simple coding.
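
The first best practice above (stable codes, clear terms and synonymy) can be illustrated with a small lookup that normalizes free-text terms to a concept code. A minimal sketch: the codes C3208 (Lymphoma) and C36081 come from the example slides, while the extra synonym strings and the function names are illustrative assumptions, not NCIt content or an EVS API:

```python
# Concept code -> terms. The preferred names come from the example slides;
# "Malignant Lymphoma" and "ABC-DLBCL" are assumed synonyms for illustration.
SYNONYMS = {
    "C3208": ["Lymphoma", "Malignant Lymphoma"],
    "C36081": [
        "Activated B-Cell-Like Diffuse Large B-Cell Lymphoma",
        "ABC-DLBCL",
    ],
}


def normalize(term: str) -> str:
    """Case-fold and collapse whitespace so lookups ignore formatting."""
    return " ".join(term.casefold().split())


# Reverse index: normalized term -> stable code.
INDEX = {
    normalize(term): code
    for code, terms in SYNONYMS.items()
    for term in terms
}


def code_for(term: str):
    """Return the concept code for a term, or None if it is not known."""
    return INDEX.get(normalize(term))


print(code_for("  malignant   LYMPHOMA "))
print(code_for("abc-dlbcl"))
print(code_for("unlisted term"))
```

Data coded this way stores only the code, so terms and synonyms can be added or reworded later without touching the coded records.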


EVS Resources

Web & Wiki Pages:
• EVS Web Portal: http://evs.nci.nih.gov/
• EVS Wiki: https://wiki.nci.nih.gov/display/EVS/EVS+Wiki
• EVS Bibliography: https://wiki.nci.nih.gov/display/EVS/Bibliography+on+EVS+and+Its+Use
• EVS Use & Collaborations: https://wiki.nci.nih.gov/display/EVS/EVS+Use+and+Collaborations

Browsers and Term Request:
• NCI Term Browser: http://nciterms.nci.nih.gov/
• NCI Thesaurus: http://ncit.nci.nih.gov/
• NCI Metathesaurus: http://ncim.nci.nih.gov/
• NCI EVS Term Request Page: http://ncitermform.nci.nih.gov/

EVS/NCIt Staff email: [email protected]