Beyond Ontologies: Putting Biomedical Knowledge to Work Philip R.O. Payne, Ph.D. Associate Professor and Chair, Biomedical Informatics (College of Medicine) Associate Professor, Health Services Management and Policy (College of Public Health) Associate Director for Data Sciences, Center for Clinical and Translational Science Executive-in-residence, Office of Technology Transfer and Commercialization NCBO Project Meeting March 13, 2013

Upload: philip-payne

Post on 30-Nov-2014

Page 1: Beyond Ontologies: Putting Biomedical Knowledge to Work

Beyond Ontologies: Putting Biomedical Knowledge to Work

Philip R.O. Payne, Ph.D.
Associate Professor and Chair, Biomedical Informatics (College of Medicine)
Associate Professor, Health Services Management and Policy (College of Public Health)
Associate Director for Data Sciences, Center for Clinical and Translational Science
Executive-in-residence, Office of Technology Transfer and Commercialization

NCBO Project Meeting
March 13, 2013

Page 2: Beyond Ontologies: Putting Biomedical Knowledge to Work

COI/Disclosures

Federal Funding: NCI, NLM, NCATS, AHRQ

Additional Research Funding: SAIC, Rockefeller Philanthropy Associates, Academy Health, Pfizer

Academic Consulting: CWRU, Cleveland Clinic, University of Cincinnati, Columbia University, Emory University, Virginia Commonwealth University, University of California San Diego, University of California Irvine, University of California San Francisco, University of Minnesota, Northwestern University

Other Consulting/Honoraria: American Medical Informatics Association (AMIA), Institute of Medicine (IOM)

Editorial Boards: Journal of the American Medical Informatics Association, Journal of Biomedical Informatics

Study Sections: NLM (BLIRC), NCATS (formerly NCRR), NIDDK

Corporate: Oracle, HP, Epic, Accelmatics (interim-CEO)


Page 3: Beyond Ontologies: Putting Biomedical Knowledge to Work

Outline

• Working definitions and assumptions
• Putting biomedical knowledge to work
– A practical approach to CDEs
– Resource discovery
– Hypothesis generation
• Discussion
– Knowledge-based systems engineering
– Big data

Page 5: Beyond Ontologies: Putting Biomedical Knowledge to Work

The Multiple Dimensions of Biomedical Knowledge Engineering

Technology
• Deployment model
• Systems integration
• Extensibility

Data Sharing
• Vocabularies
• Semantics
• Knowledge engineering processes

Empowering Knowledge Workers
• Tools and best practices
• Governance
• Socio-cultural factors

Creating Dynamic, Interoperable Systems

Lowering Barriers to Adoption

Facilitating Sharing and Integration

Solving Real World Problems

Page 6: Beyond Ontologies: Putting Biomedical Knowledge to Work

A Balanced Approach to Realizing the Benefits of Shared Semantics

Computable Interoperability

Working Interoperability

Peer-to-Peer Negotiation of Project-Specific Constructs

Community-wide Vocabularies and Semantics

Historical Focus

Impact on Community-wide:
• Governance
• Technologies
• Software Engineering Approaches

Page 7: Beyond Ontologies: Putting Biomedical Knowledge to Work

Empowering Knowledge Workers

Driving Biological and Clinical Problems

Subject Matter Experts

Solutions to Real World Interoperability Needs

Critical Issues:
• Workflows that enable engagement by Subject Matter Experts
• Tight coupling of knowledge engineering efforts and research programs that can define “real world” driving problems
• Facilitation and support of interdisciplinary, team science models (including basic and translational scientists, clinical researchers, and informaticians)

Biomedical Informatics ≠ Engineering

Systems-level Approaches To Knowledge Engineering and Usability Are Essential

Page 8: Beyond Ontologies: Putting Biomedical Knowledge to Work

4 Assumptions Regarding the Current State and Future of the NCBO

1) The tools and knowledge collections created and maintained by the NCBO have become a substrate for a broad spectrum of biomedical informatics innovations
– Analogous to the role played by NLM-provided resources

2) Future directions for the center and its work will trend towards an applied science focus

3) At the same time, outreach, engagement, and education will remain high priorities

4) The current funding climate presents a significant and unknown challenge to the preceding 3 assumptions

Page 10: Beyond Ontologies: Putting Biomedical Knowledge to Work

Outline

• Working definitions and assumptions
• Putting biomedical knowledge to work
– A practical approach to CDEs
– Resource discovery
– Hypothesis generation
• Discussion
– Knowledge-based systems engineering
– Big data

Page 11: Beyond Ontologies: Putting Biomedical Knowledge to Work

A Pragmatic Approach to CDEs: The openMDR Project

Page 12: Beyond Ontologies: Putting Biomedical Knowledge to Work

Defining Common Data Elements (CDEs)

Common Data Elements (CDEs) are standardized terms for the collection and exchange of data

• CDEs are metadata
• CDEs describe the type of data being collected, not the data itself

Critical role(s) for CDEs:
• to identify discrete, defined items for data collection
• to promote consistent data collection in the field
• to eliminate unneeded or redundant data collection
• to promote consistent reporting and analysis
• to reduce the possibility of error related to data translation and transmission
• to facilitate data sharing

Source: National Cancer Institute (NCI)
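
The distinction between a CDE (metadata) and the data it describes can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class name `CommonDataElement`, its fields, the concept codes, and the permissible values are hypothetical and are not taken from NCI or openMDR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CommonDataElement:
    """Metadata describing WHAT is collected, not a collected value."""
    name: str
    definition: str
    concept_code: str                      # hypothetical ontology anchor
    datatype: type
    permissible_values: Optional[frozenset] = None
    unit: Optional[str] = None

    def validate(self, value) -> bool:
        """Check a collected value against this element's description."""
        if not isinstance(value, self.datatype):
            return False
        if self.permissible_values is not None:
            return value in self.permissible_values
        return True


# Two hypothetical CDEs (the metadata).
WBC_COUNT = CommonDataElement(
    name="white_blood_cell_count",
    definition="Number of white blood cells per unit volume of blood",
    concept_code="EX:0001",                # made-up code
    datatype=float,
    unit="10^9/L",
)
DISEASE_STATUS = CommonDataElement(
    name="disease_status",
    definition="Current status of the disease under study",
    concept_code="EX:0002",                # made-up code
    datatype=str,
    permissible_values=frozenset({"untreated", "in_remission", "relapsed"}),
)

# One collected record (the data), checked against the metadata.
record = {"white_blood_cell_count": 42.5, "disease_status": "relapsed"}
cdes = {cde.name: cde for cde in (WBC_COUNT, DISEASE_STATUS)}
record_is_valid = all(cdes[key].validate(value) for key, value in record.items())
```

Because every site validates against the same element definitions, the sketch touches on several of the roles listed above: consistent collection, fewer translation errors, and easier sharing.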

Page 13: Beyond Ontologies: Putting Biomedical Knowledge to Work

OpenMDR: a Distributed CDE Platform

• Semantic Metadata Management Suite
– Locally relevant ontology-anchored data elements
– Rapid and agile development paradigm
• Distributed terminology ecosystem
– Federated queries across multiple deployments
– Interaction with other semantic management systems
• ISO 11179 semantic repository
– Integration with industry standard tools

http://openmdr.org

Page 14: Beyond Ontologies: Putting Biomedical Knowledge to Work

OpenMDR Functional Components

• Create and manage terminology
• Discover and reuse concepts
• Annotate models for discovery and interoperability
• Utilize data elements to build semantically anchored services

Page 15: Beyond Ontologies: Putting Biomedical Knowledge to Work

OpenMDR Is Federated


Multiple deployments for locally relevant terminology.

DNS-like hierarchy of authority
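
One way to picture a DNS-like hierarchy of authority is a lookup that starts at the local deployment and walks up to its parent authorities until some registry knows the element. The Python sketch below is a toy model under that assumption. The `Registry` class, its methods, and the element names are hypothetical. The slides do not describe how OpenMDR actually resolves elements.

```python
from typing import Dict, Optional, Tuple


class Registry:
    """A toy metadata registry that defers to a parent authority."""

    def __init__(self, name: str, parent: Optional["Registry"] = None):
        self.name = name
        self.parent = parent
        self.elements: Dict[str, str] = {}

    def register(self, element_id: str, definition: str) -> None:
        self.elements[element_id] = definition

    def resolve(self, element_id: str) -> Optional[Tuple[str, str]]:
        """Return (authority name, definition) from the nearest registry
        that defines the element, or None if no authority knows it."""
        node: Optional[Registry] = self
        while node is not None:
            if element_id in node.elements:
                return node.name, node.elements[element_id]
            node = node.parent
        return None


# Hypothetical two-level hierarchy: a shared root and one local site.
root = Registry("root")
site = Registry("site-a", parent=root)
root.register("tumor_stage", "Shared definition of tumor stage")
site.register("local_protocol_arm", "Arm label used only at site A")
```

In this model, `site.resolve("tumor_stage")` is answered by the root authority, while locally relevant terms stay in the local deployment and are not visible from the root.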


Page 16: Beyond Ontologies: Putting Biomedical Knowledge to Work

OpenMDR as part of an MDA Workflow

• Empowers knowledge workers
• Enterprise Architect plugin
– Formulate searches against local or distributed OpenMDR instances
• Identify semantic terms in detail
– Concept codes help distinguish similar elements
• Apply annotations to the data model

Page 17: Beyond Ontologies: Putting Biomedical Knowledge to Work

OpenMDR and the TRIAD SOA


Page 18: Beyond Ontologies: Putting Biomedical Knowledge to Work


Resource Discovery: ResearchIQ

Page 19: Beyond Ontologies: Putting Biomedical Knowledge to Work

Motivation for the Design of ResearchIQ

• Clinical and translational researchers frequently need to identify and engage:
– Collaborators
– Shared resources
– Data, information, and knowledge collections

• A multitude of sources can support such needs; however, they are usually:
– Heterogeneous
– Difficult to find
– Not linked

How do we overcome these barriers to the efficient planning and conduct of clinical and translational studies?

Page 20: Beyond Ontologies: Putting Biomedical Knowledge to Work

A Potential Solution:

What is ResearchIQ (Research Integrative Query)?
• A single knowledge resource portal for the clinical and translational research community that will provide a “front door” for a variety of resources.

How does ResearchIQ work?
• Knowledge-anchored semantic search
• Leveraging semantic web technologies

Current project focus is on the development and deployment of an end-user-facing proof-of-concept:
• Can it be done?
• How difficult will it be?
• Can it scale?
• What kind of coverage would we have?
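
As a rough illustration of what knowledge-anchored search can mean, the Python sketch below expands a query concept to its descendants in a small is-a hierarchy and returns every resource annotated with any of them. This is a simplified sketch, not the ResearchIQ implementation: the hierarchy, the resource names, and the functions `descendants` and `semantic_search` are all hypothetical, and a real system would query an ontology and a triple store instead of in-memory dictionaries.

```python
from typing import Dict, List, Set


def descendants(concept: str, children: Dict[str, List[str]]) -> Set[str]:
    """Return the concept plus everything below it in the is-a hierarchy."""
    seen = {concept}
    stack = [concept]
    while stack:
        current = stack.pop()
        for child in children.get(current, []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def semantic_search(
    concept: str,
    children: Dict[str, List[str]],
    annotations: Dict[str, Set[str]],
) -> List[str]:
    """Return resources annotated with the concept or any descendant."""
    wanted = descendants(concept, children)
    return sorted(name for name, tags in annotations.items() if tags & wanted)


# Hypothetical hierarchy and resource annotations.
children = {
    "hematologic_malignancy": ["leukemia", "lymphoma"],
    "leukemia": ["CLL", "AML"],
}
annotations = {
    "CLL biobank": {"CLL"},
    "Imaging core": {"MRI"},
    "Lymphoma registry": {"lymphoma"},
}

narrow_hits = semantic_search("leukemia", children, annotations)
broad_hits = semantic_search("hematologic_malignancy", children, annotations)
```

A keyword search for "leukemia" would miss the resource tagged only with "CLL". Anchoring the search to the hierarchy finds it, which is one way heterogeneous, unlinked sources can be reached through a single front door.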

Page 21: Beyond Ontologies: Putting Biomedical Knowledge to Work

High-level System Architecture

Query Performance Optimization

Page 22: Beyond Ontologies: Putting Biomedical Knowledge to Work

Presentation Layer

Page 23: Beyond Ontologies: Putting Biomedical Knowledge to Work

Knowledge Base Growth (2012)

[Figure: Cumulative Total Resources by month, Jan-12 through Dec-12. X-axis: Months - 2012. Y-axis: Log (Monthly Cumulative Resource Count), ticks from 3 to 4.4.]

Page 24: Beyond Ontologies: Putting Biomedical Knowledge to Work

Managing Knowledge Base Growth

Page 25: Beyond Ontologies: Putting Biomedical Knowledge to Work

Hypothesis Generation: TOKEn Knowledge Synthesis Platform

Page 26: Beyond Ontologies: Putting Biomedical Knowledge to Work

Putting Conceptual Knowledge to Work: Constructive Induction (CI) & Hypothesis Generation

Conceptual Knowledge Constructs (CKCs)
• Conceptual knowledge-anchored concepts + relationships
• Higher order constructs (multiple intermediate concepts)
• Controls for concept granularity (search depth)
• Basis for inference of hypotheses concerning relationships between data elements
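
The CKC idea can be sketched as a depth-limited path search over a concept graph: start from a concept that a data element is mapped to, walk through intermediate concepts, and stop when another mapped concept is reached. The Python below is a simplified sketch under that reading, not the TOKEn algorithm. The function names, the toy graph, and the single-letter concept labels are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

Graph = Dict[str, Set[str]]


def build_graph(edges: Iterable[Tuple[str, str]]) -> Graph:
    """Build an undirected concept graph from (concept, concept) pairs."""
    graph: Graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def find_ckcs(graph: Graph, anchors: Set[str], max_depth: int) -> List[List[str]]:
    """Enumerate simple paths that link two anchored concepts using at
    most max_depth relationships (the search depth control)."""
    results: List[List[str]] = []

    def walk(path: List[str]) -> None:
        last = path[-1]
        if len(path) > 1 and last in anchors:
            results.append(path)          # reached another anchored concept
            return
        if len(path) - 1 == max_depth:    # depth budget exhausted
            return
        for nxt in sorted(graph.get(last, ())):
            if nxt not in path:           # keep paths simple (no cycles)
                walk(path + [nxt])

    for start in sorted(anchors):
        walk([start])
    # Each undirected path is found from both ends; keep one direction.
    return [p for p in results if p[0] < p[-1]]


# Toy graph: A and D stand for concepts mapped to data elements;
# B, C, and E are intermediate concepts.
graph = build_graph([("A", "B"), ("B", "D"), ("A", "C"), ("C", "E"), ("E", "D")])
shallow = find_ckcs(graph, {"A", "D"}, max_depth=2)
deeper = find_ckcs(graph, {"A", "D"}, max_depth=3)
```

With a depth of 2 only the construct A-B-D is found. Raising the depth to 3 also yields A-C-E-D, which shows how the depth control trades granularity against the number of candidate hypotheses.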

Page 27: Beyond Ontologies: Putting Biomedical Knowledge to Work

Experimental Context: CLL Research Consortium

• NCI-funded Program/Project (P01)
– Translational research targeting Chronic Lymphocytic Leukemia (CLL)
– Established in 1999
– Cohort of over 6,000 patients
– Comprehensive phenotypic and bio-molecular data sets, as well as bio-specimens

• 8 participating sites

• Informatics platform:
– Research networking
– Clinical trials management
– Correlative data management
– Bio-specimen management

Page 28: Beyond Ontologies: Putting Biomedical Knowledge to Work

Multi-part CI Evaluation Study in CLL

(1) Efficacy

CKC Evaluation
• 108 data elements
• 822 UMLS concepts
• 5800 CKCs
• 5 SMEs
• Random sample (250)
• 86% valid
• 90% “meaningful”

(2) Verification & Validation

• Search depth controls
• TOKEn browser
• Automated lit. queries
– Random sample (50)
• SME “gold standard”
– Support metric
• Critical relationship
– Support metric
– “Meaningful”
– Significant correlation

(3) Mining Domain Literature

Mining CLL literature
• Medline, 2005-2008

Comparison
• Literature-based CKCs
• Ontology-based CKCs

Critical findings
• No overlap
• Differing granularity
• More timely (SMEs)

1. Payne PR, Borlawsky T, Kwok A, Dhaval R, Greaves A. Ontology-anchored Approaches to Conceptual Knowledge Discovery in a Multi-dimensional Research Data Repository. AMIA Translational Bioinformatics Summit Proc. 2008.
2. Payne PR, Borlawsky T, Kwok A, Greaves A. Supporting the Design of Translational Clinical Studies Through the Generation and Verification of Conceptual Knowledge-anchored Hypotheses. AMIA Annu Symp Proc. 2008.
3. Payne PR, Borlawsky T, Lele O, James S, Greaves AW. The TOKEn Project: Knowledge Synthesis for in-silico Science. Journal of the American Medical Informatics Association (JAMIA). 2011.

Page 29: Beyond Ontologies: Putting Biomedical Knowledge to Work

CKC Visualization

[Figure: TOKEn CKC Network: CLL Research Consortium Metadata. Labeled clusters: Cytogenetic & Chromosomal Abnormalities; Bio-molecular Products; Hematologic Malignancies; Bone Marrow Morphology; Tissues of Origin; Solid Tumors; Myelogenous Malignancies.]

Page 30: Beyond Ontologies: Putting Biomedical Knowledge to Work

[Figure: TOKEn CKC Network: Semantic Partitions. Labeled partitions: Cytogenetic Abnormalities; Treatment Response; Bone Marrow Morphology; Lymphomas; Leukemias; Chromosome Loss; Laboratory Findings; Protein Expression; Molecular Abnormalities; Tissues of Origin (labeled twice).]

Page 31: Beyond Ontologies: Putting Biomedical Knowledge to Work

Outline

• Working definitions and assumptions
• Putting biomedical knowledge to work
– A practical approach to CDEs
– Resource discovery
– Hypothesis generation
• Discussion
– Knowledge-based systems engineering
– Big data

Page 32: Beyond Ontologies: Putting Biomedical Knowledge to Work

Applying Conceptual Knowledge: Building Knowledge-Based Systems

Payne PR et al. Translational informatics: enabling high-throughput research paradigms. In: Physiol. Genomics 39: 131-140, 2009

Page 33: Beyond Ontologies: Putting Biomedical Knowledge to Work

Knowledge-based Systems: Replicating Expert Performance

Adapted from Gaines and Shaw, “Knowledge Acquisition Tools Based On Personal Construct Psychology”, 1993

Page 34: Beyond Ontologies: Putting Biomedical Knowledge to Work

The Importance of Knowledge-based Systems Engineering is Amplified by Our Increased Focus on Big Data

Volume

Velocity

Variability

• Scalability
• Extensibility
• Reproducibility
• Multi-dimensional data, information, and knowledge Integration

Moving beyond the “hype cycle” and solving real world problems

Over $100M investment by NIH, including the creation of centers of excellence

Page 35: Beyond Ontologies: Putting Biomedical Knowledge to Work

Acknowledgements

Collaborators:
• Peter J. Embi, MD, MS
• Albert M. Lai, PhD
• Kun Huang, PhD
• Po-Yin Yen, RN, PhD
• Yang Xiang, PhD
• Marcelo Lopetegui, MD
• Tara Borlawsky-Payne, MA
• Omkar Lele, MS, MBA
• Marjorie Kelley
• William Stephens
• Arka Pattanayak
• Caryn Roth
• Andrew Greaves

Funding:
• NCI: R01CA134232, R01CA107106, P01CA081534, P50CA140158, P30CA016058
• NCATS: U54RR024384
• NLM: R01LM009533, T15LM011270
• AHRQ: R01HS019908
• Rockefeller Philanthropy Associates
• Academy Health – EDM Forum

Laboratory for Knowledge Based Applications and Systems Engineering (KBASE):

Page 36: Beyond Ontologies: Putting Biomedical Knowledge to Work

Thank you for your time and attention!
• [email protected]
• http://go.osu.edu/payne