The Future of Health Information Barry Smith Ontology Research Group Center of Excellence in Bioinformatics and Life Sciences University at Buffalo ontology.buffalo.edu/smith

Post on 19-Dec-2015


The Future of Health Information

Barry Smith

Ontology Research Group

Center of Excellence in Bioinformatics and Life Sciences

University at Buffalo

ontology.buffalo.edu/smith

2

Collaborations

• National Center for Biomedical Ontology (http://NCBO.us)

• WHO Collaborating Center for Terminology

• Cleveland Clinic Semantic Database

• SNOMED CT – Disease Ontology

• German national Electronic Health Record initiative [Health Version 11]

3

Overview of this talk

• The role of ontology

• The role of HL7

• The future of health information

4

Overview of this talk

• The role of ontology

• The role of HL7

• The future of health information

5

we need to know where in the body,

where in the cell

we need to know what kind of

disease process

= we need ontologies

we need semantic annotation of data

7

8

9

Ontologies are systems of terms for annotating data

They are controlled vocabularies designating the types of entities in reality

Data designate the instances of these types

10

The Gene Ontology: a set of standardized textual descriptions of

• cellular locations

• molecular functions

• biological processes

• used to annotate the entities represented in the major biochemical databases

• thereby creating integration across these databases

11

what cellular component?

what molecular function?

what biological process?

12

The process of data annotation

• yields a slowly growing computer-interpretable map of biological reality within which major databases are automatically integrated in semantically searchable form
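A minimal sketch of how shared annotations can integrate separate databases: records annotated with the same ontology term become retrievable together through a term index. The database names, record IDs and annotations below are hypothetical and invented purely for illustration.

```python
from collections import defaultdict

# Hypothetical records from two unrelated databases, each annotated
# with ontology term labels (all IDs and annotations are made up).
DB_A = {"gene_a1": {"osteoblast differentiation"}, "gene_a2": {"bone mineralization"}}
DB_B = {"protein_b1": {"osteoblast differentiation"}, "protein_b2": {"cell adhesion"}}


def index_by_term(*databases):
    """Build a map from ontology term to the set of record IDs annotated with it."""
    index = defaultdict(set)
    for db in databases:
        for record_id, terms in db.items():
            for term in terms:
                index[term].add(record_id)
    return index


index = index_by_term(DB_A, DB_B)
# One query on a shared term now reaches records in both databases.
print(sorted(index["osteoblast differentiation"]))
```

The integration here comes entirely from the two sources using the same term; nothing in either database refers to the other.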

13

But now

need to extend the methodology to other domains, including clinical medicine

need disease, symptom (phenotype) ontologies

14

The Problem

need for prospective standards to ensure mutual consistency and high quality of clinical counterparts of GO

need to ensure consistency of the new clinical ontologies with the basic biomedical sciences

if we do not start now, the problem will only get worse

15

The Solution

• establish common rules governing best practices for creating ontologies and for using these in annotations

• apply these rules to create a complete suite of orthogonal interoperable biomedical reference ontologies

16

First step (2003)

• a shared portal for (so far) 58 ontologies (low regimentation)

• http://obo.sourceforge.net

• NCBO BioPortal

17

Second step (2004): reform efforts initiated, e.g. linking GO to other OBO ontologies to ensure interoperability

Cell type:

    id: CL:0000062
    name: osteoblast
    def: "A bone-forming cell which secretes an extracellular matrix. Hydroxyapatite crystals are then deposited into the matrix to form bone."
    is_a: CL:0000055
    relationship: develops_from CL:0000008
    relationship: develops_from CL:0000375

GO + Cell type = New Definition

Osteoblast differentiation: Processes whereby an osteoprogenitor cell or a cranial neural crest cell acquires the specialized features of an osteoblast, a bone-forming cell which secretes extracellular matrix.
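As a rough illustration of why such term records are machine-usable, the osteoblast stanza can be read into a tag-to-values map with a few lines of code. This is a sketch only: the function name `parse_stanza` is made up here, and it handles just the simple `tag: value` lines shown on the slide, not the full OBO file format.

```python
# The osteoblast stanza as given on the slide.
STANZA = '''\
id: CL:0000062
name: osteoblast
def: "A bone-forming cell which secretes an extracellular matrix. Hydroxyapatite crystals are then deposited into the matrix to form bone."
is_a: CL:0000055
relationship: develops_from CL:0000008
relationship: develops_from CL:0000375
'''


def parse_stanza(text):
    """Return a dict mapping each tag to the list of its values (hypothetical helper)."""
    term = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split at the first ": " only, so colons inside IDs survive.
        tag, _, value = line.partition(": ")
        term.setdefault(tag, []).append(value.strip())
    return term


osteoblast = parse_stanza(STANZA)
# Collect the cell types this cell type develops from.
precursors = [
    value.split()[1]
    for value in osteoblast["relationship"]
    if value.startswith("develops_from ")
]
print(osteoblast["name"][0], "develops from", precursors)
```

The two `develops_from` targets are the pieces a GO definition of osteoblast differentiation can reuse instead of restating them in free text.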

19

Third step (2006)

The OBO Foundry

http://obofoundry.org/

20

The OBO Foundry

• a family of interoperable gold standard biomedical reference ontologies to serve the annotation of

– scientific literature

– model organism databases

– clinical data

– experimental results

21

Compare the UMLS Metathesaurus

a system of post hoc mappings between independent source vocabularies

built by trained experts

massively useful for information retrieval and information integration

creates out of literature a semantically searchable space

22

for UMLS

local usage respected

regimentation frowned upon

cross-framework consistency not important

no concern to establish consistency with basic science

different grades of formal rigor, different degrees of completeness, different update policies

no path towards improvement

no path towards support for logical reasoning

23

The OBO Foundry is a prospective standard

designed to guarantee interoperability of ontologies from the very start (contrast to: post hoc mapping)

established March 2006

12 initial candidate OBO ontologies – focused primarily on basic science domains

several being constructed ab initio

now 16 ontologies

Ontology | Scope | URL | Custodians
Cell Ontology (CL) | cell types from prokaryotes to mammals | obo.sourceforge.net/cgi-bin/detail.cgi?cell | Jonathan Bard, Michael Ashburner, Oliver Hofman
Chemical Entities of Biological Interest (ChEBI) | molecular entities | ebi.ac.uk/chebi | Paula Dematos, Rafael Alcantara
Common Anatomy Reference Ontology (CARO) | anatomical structures in human and model organisms | (under development) | Melissa Haendel, Terry Hayamizu, Cornelius Rosse, David Sutherland
Foundational Model of Anatomy (FMA) | structure of the human body | fma.biostr.washington.edu | JLV Mejino Jr., Cornelius Rosse
Functional Genomics Investigation Ontology (FuGO) | design, protocol, data instrumentation, and analysis | fugo.sf.net | FuGO Working Group
Gene Ontology (GO) | cellular components, molecular functions, biological processes | www.geneontology.org | Gene Ontology Consortium
Phenotypic Quality Ontology (PaTO) | qualities of anatomical structures | obo.sourceforge.net/cgi-bin/detail.cgi?attribute_and_value | Michael Ashburner, Suzanna Lewis, Georgios Gkoutos
Protein Ontology (PrO) | protein types and modifications | (under development) | Protein Ontology Consortium
Relation Ontology (RO) | relations | obo.sf.net/relationship | Barry Smith, Chris Mungall
RNA Ontology (RnaO) | three-dimensional RNA structures | (under development) | RNA Ontology Consortium
Sequence Ontology (SO) | properties and features of nucleic sequences | song.sf.net | Karen Eilbeck

Building out from the original GO

GRANULARITY / RELATION TO TIME | CONTINUANT, INDEPENDENT | CONTINUANT, DEPENDENT | OCCURRENT
ORGAN AND ORGANISM | Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Biological Process (GO)
CELL AND CELLULAR COMPONENT | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) |
MOLECULE | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)

26

The vision

OBO: low-regimentation ontology portal

OBO Foundry: high-regimentation collaborative initiative to create a gold standard suite of interoperable ontologies

27

Ontologies under construction

Common Anatomy Reference Ontology

Disease Ontology (DO) [SNOMED CT]

Biomedical Image Ontology (BIO)

Environment Ontology (EnvO)

Biobank Ontology (BrO)

Clinical Trial Ontology (CTO) [with WHO Global Trial Bank, Immune Tolerance Network, ACGT Advancing Genomics Clinical Trials in Cancer EU IP]

28

Clinical Trial Ontology

• part of a larger project called the Ontology for Biomedical Investigations (OBI)

29

OBI

controlled vocabulary for biomedical investigations including

– protocols

– instrumentation

– material

– data

– types of analysis and statistical tools applied to the data

http://obofoundry.org/

30

Clinical Trial Ontology

• To serve merger of data schemas

• To serve flexibility of collaborative clinical trial research

• To serve design and management of clinical trials

• To serve data access and reuse – send me all trials which ...

31

Ontology vs. Database Schema

• Separate development of data schemas and ‘information models’ (HL7) and terminologies such as SNOMED CT

• the two do not work together

32

Ontology vs. Database Schema

• diabetes => disease

• diabetes => string

• temperature => quality

• temperature => integer
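The contrast can be sketched in a few lines. A schema records only how a value is stored, while an ontology records what kind of entity the term designates, so only the latter can answer a question such as "is this a disease?". The `IS_A` links and helper names below are illustrative assumptions, not taken from any real ontology or schema.

```python
# What a database schema knows: storage types only.
SCHEMA = {"diabetes": str, "temperature": int}

# What an ontology adds: the kind of entity each term designates.
# These is_a links are invented for illustration.
IS_A = {
    "diabetes": "disease",
    "temperature": "quality",
    "disease": "entity",
    "quality": "entity",
}


def ancestors(term):
    """Follow is_a links upward and return every more general type."""
    chain = []
    while term in IS_A:
        term = IS_A[term]
        chain.append(term)
    return chain


def is_kind_of(term, kind):
    """True if `kind` appears among the ancestors of `term`."""
    return kind in ancestors(term)


print(SCHEMA["diabetes"].__name__)        # the schema view: a string
print(ancestors("diabetes"))              # the ontology view: a disease
print(is_kind_of("temperature", "disease"))
```

Both views are needed; the point of the slide is that when they are developed separately, nothing connects the string column to the disease type.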

33

CTO

34

CTO Continuant

35

CTO Occurrent

36

Clinical Trial Ontology Working Group

• http://www.bioontology.org/wiki/

• Workshop on May 16-17, 2007

37

• The role of ontology

• The role of HL7

• The future of health information

38

HL7 V3

“the data standard for biomedical informatics”

• http://aurora.regenstrief.org/~schadow/HL7TheDataStandardForBiomedicalInformatics.ppt

39

HL7 V2

a workable messaging standard, but faced the problem of local dialects

HL7 V3

seeks to solve this problem by having all HL7 artifacts conform to a single ‘Reference Information Model’ (the RIM)

40

After 10 years?

And many attempts?

And gigantic investments of energy and funding?

Is there a single, successful RIM-implementation?

There are clear examples of failure of billion-dollar implementations resting on the RIM

and of programmers involved in such failures who are tearing out their hair, and blaming HL7

42

Is it justified, in these circumstances, to promote HL7 V3 as an ISO Standard

in the domain of patient care?

43

One indispensable foundation for a successful standard

a correct and uniform interpretation of its basic terms

• Act

• Participation

• Entity

• Role

• ActRelationship

• RoleLink

44

• Sometimes ‘Act’ means information about an act

• Sometimes ‘Act’ means real-world action

• Sometimes ‘Act’ means a mixture of the above

• Sometimes in the very same sentence

Demonstrably, the HL7 community does not understand its own basic terms

45

Consequences of unclarity here

• Different user groups have interpreted the same classes in different ways

• Different message specifications used different interpretations

• This recreates interoperability problems

• Can we be sure that these problems will not lead to incidents relevant to patient safety?

46

Even with clarity – and clear documentation – the RIM would still be in bad shape

http://hl7-watch.blogspot.com/

47

Where are diseases?

• Acts?

• Things, Persons, Organizations?

• Participations?

• Roles?

• ActRelationships?

• RoleLinks?

48

The HL7 Clinical Genomic Standard

• defines an allele as the observation of an allele

• defines a phenotype as the observation of an observation

49

The $35 bn NHS Program “Connecting for Health”

• has applied the RIM rigorously, using all the normative elements, and discovered that it needed to create dialects of its own to make the V3-based system work for its purposes (it still does not work)

50

The RIM has no coherent answer

• Basic categories cannot be agreed upon even for common phenomena like snakebites.

• HL7 V3 dialects are formed – and the RIM does not do its job.

51

The moral of this story

• Don’t claim to be “the data standard for biomedical informatics” until you have a system that works

• http://aurora.regenstrief.org/~schadow/HL7TheDataStandardForBiomedicalInformatics.ppt

52

• The role of ontology

• The role of HL7

• The future of health information

53

A New Paradigm for Health Information

• How to achieve semantic interoperability amongst healthcare applications?

• Through referent tracking

www.org.buffalo.edu/RTU

54

The myth of ‘unambiguous’ understanding through biomedical terminologies

PtID   Date        ObsCode    Narrative
5572   04/07/1990  26442006   closed fracture of shaft of femur
5572   04/07/1990  81134009   Fracture, closed, spiral
5572   12/07/1990  26442006   closed fracture of shaft of femur
5572   12/07/1990  9001224    Accident in public building (supermarket) *
5572   04/07/1990  79001      Essential hypertension
0939   24/12/1991  255174002  benign polyp of biliary tract
2309   21/03/1992  26442006   closed fracture of shaft of femur
2309   21/03/1992  9001224    Accident in public building (supermarket) *
47804  03/04/1993  58298795   Other lesion on other specified region
5572   17/05/1993  79001      Essential hypertension
298    22/08/1993  2909872    Closed fracture of radial head
298    22/08/1993  9001224    Accident in public building (supermarket) *
5572   01/04/1997  26442006   closed fracture of shaft of femur
5572   01/04/1997  79001      Essential hypertension
0939   20/12/1998  255087006  malignant polyp of biliary tract

* cause, not disorder

How many disorders have patients 5572, 2309 and 298 each had thus far in their lifetime?

How many numerically different disorders are listed here?

How many different types of disorders are listed here?

55

Does seeing the labels help ?

PtID   Date        ObsCode    Narrative
5572   04/07/1990  26442006   closed fracture of shaft of femur
5572   04/07/1990  81134009   Fracture, closed, spiral
5572   12/07/1990  26442006   closed fracture of shaft of femur
5572   12/07/1990  9001224    Accident in public building (supermarket)
5572   04/07/1990  79001      Essential hypertension
0939   24/12/1991  255174002  benign polyp of biliary tract
2309   21/03/1992  26442006   closed fracture of shaft of femur
2309   21/03/1992  9001224    Accident in public building (supermarket)
47804  03/04/1993  58298795   Other lesion on other specified region
5572   17/05/1993  79001      Essential hypertension
298    22/08/1993  2909872    Closed fracture of radial head
298    22/08/1993  9001224    Accident in public building (supermarket)
5572   01/04/1997  26442006   closed fracture of shaft of femur
5572   01/04/1997  79001      Essential hypertension
0939   20/12/1998  255087006  malignant polyp of biliary tract

Same patient, same hypertension code: same (numerically identical) hypertension?

Different patients, same fracture codes: same (numerically identical) fracture?

Same patient, different dates, same fracture codes: same (numerically identical) fracture?

Same patient, same date, two different fracture codes: same (numerically identical) fracture?

Same patient, different dates, different codes: same (numerically identical) polyp?

Different patients: same supermarket? Maybe the same freezer section? Or different supermarkets, but always in the freezer sections?

56

We have unique IDs

• for patients

• for healthcare deliverers

• for images

• for invoices

57

Let’s introduce unique IDs

• for everything that is mentioned in the record:

– lesions

– fractures

– presentings

– surgical procedures

• IUI = instance unique identifier

http://sourceforge.net/projects/rtsystem

58

Better public health statistics

PtID   Date        ObsCode    Narrative
5572   04/07/1990  26442006   closed fracture of shaft of femur
5572   04/07/1990  81134009   Fracture, closed, spiral
5572   12/07/1990  26442006   closed fracture of shaft of femur
5572   12/07/1990  9001224    Accident in public building (supermarket)
5572   04/07/1990  79001      Essential hypertension
0939   24/12/1991  255174002  benign polyp of biliary tract
2309   21/03/1992  26442006   closed fracture of shaft of femur
2309   21/03/1992  9001224    Accident in public building (supermarket)
47804  03/04/1993  58298795   Other lesion on other specified region
5572   17/05/1993  79001      Essential hypertension
298    22/08/1993  2909872    Closed fracture of radial head
298    22/08/1993  9001224    Accident in public building (supermarket)
5572   01/04/1997  26442006   closed fracture of shaft of femur
5572   01/04/1997  79001      Essential hypertension
0939   20/12/1998  255087006  malignant polyp of biliary tract

IUI labels attached to the entries on the slide: IUI-001, IUI-001, IUI-001, IUI-003, IUI-004, IUI-004, IUI-005, IUI-005, IUI-005, IUI-007, IUI-007, IUI-007, IUI-002, IUI-012
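A small sketch of why instance identifiers change the statistics: counting distinct codes per patient and counting distinct IUIs per patient give different answers for the same entries. The rows are a subset of the record table; the IUI column (`IUI-A`, `IUI-B`, `IUI-C`) is a hypothetical assignment made up for this example and is not the assignment shown on the slide.

```python
from collections import defaultdict

# (patient ID, date, observation code, narrative, IUI)
# The IUI values are hypothetical, chosen only to illustrate the idea.
RECORDS = [
    ("5572", "04/07/1990", "26442006", "closed fracture of shaft of femur", "IUI-A"),
    ("5572", "04/07/1990", "81134009", "Fracture, closed, spiral", "IUI-A"),
    ("5572", "12/07/1990", "26442006", "closed fracture of shaft of femur", "IUI-A"),
    ("5572", "04/07/1990", "79001", "Essential hypertension", "IUI-B"),
    ("5572", "17/05/1993", "79001", "Essential hypertension", "IUI-B"),
    ("2309", "21/03/1992", "26442006", "closed fracture of shaft of femur", "IUI-C"),
]

CODE, IUI = 2, 4  # column positions within each record


def distinct_per_patient(records, column):
    """Count the distinct values of one column for each patient."""
    seen = defaultdict(set)
    for record in records:
        seen[record[0]].add(record[column])
    return {patient: len(values) for patient, values in seen.items()}


by_code = distinct_per_patient(RECORDS, CODE)
by_iui = distinct_per_patient(RECORDS, IUI)
print("distinct codes per patient:", by_code)
print("distinct IUIs per patient: ", by_iui)
```

Under these assumed assignments, patient 5572 has three distinct codes but only two disorders, because two fracture codes describe one and the same fracture.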

59

Better reasoning over health information

‘John Doe’s liver tumor was treated with RPCI’s irradiation device’

‘John Smith’s liver tumor was treated with RPCI’s irradiation device’

[Figure: the particulars mentioned in the first sentence carry the identifiers #1 to #6 and are linked by instance-of at t1 to the types treating, person, liver, tumor, clinic and device. The particulars in the second sentence carry the identifiers #10, #20, #30, #40, #5 and #6 and are linked by inst-of at t2. The identifiers #5 and #6 appear in both.]

60

Application principles

#IUI-1 ‘affects’ #IUI-2

#IUI-3 ‘affects’ #IUI-2

#IUI-1 ‘causes’ #IUI-3

[Figure: EHR, Referent Tracking Database and Ontology; labels: CAG repeat, Juvenile HD, person, disorder, continuant]
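The three statements above can be held as subject-relation-object triples over IUIs and queried directly. The sketch below assumes, for illustration only, that #IUI-1 is the CAG repeat, #IUI-2 the person and #IUI-3 the Juvenile HD; the slide does not spell out this pairing, and the helper names are invented.

```python
# Relationship statements from the slide, stored as triples.
RELATIONS = [
    ("IUI-1", "affects", "IUI-2"),
    ("IUI-3", "affects", "IUI-2"),
    ("IUI-1", "causes", "IUI-3"),
]

# Assumed typing of each particular (not stated explicitly on the slide).
INSTANCE_OF = {"IUI-1": "CAG repeat", "IUI-2": "person", "IUI-3": "Juvenile HD"}


def subjects(relation, obj):
    """Return every particular standing in `relation` to `obj`."""
    return [s for s, r, o in RELATIONS if r == relation and o == obj]


def describe(triple):
    """Render a triple with the assumed type of each particular."""
    s, r, o = triple
    return f"{INSTANCE_OF[s]} ({s}) {r} {INSTANCE_OF[o]} ({o})"


print(subjects("affects", "IUI-2"))
for triple in RELATIONS:
    print(describe(triple))
```

Because every statement names particulars rather than codes, a query such as "what affects this person?" returns specific entities that other records can refer to again.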

61

Goal: A New Form of Evidence Based Medicine

• Now:

– Decisions based on the outcomes of (reproducible) results of well-designed studies

• Guidelines and protocols

– Evidence is hard to get, takes time to accumulate.

• Future:

– Each discovered fact or expressed belief should instantly become available as contributing to the total body of evidence, wherever its description is generated.

– Data ‘eternally’ reusable independent of the purpose for which they have been generated.