2007 CDISC International Interchange
Ontologies in Clinical Research: Representation of clinical research data in the framework of formal biomedical ontologies
Richard H. Scheuermann
Chief, Division of Biomedical Informatics
U.T. Southwestern Medical Center
Outline
• Motivation - US NIH Clinical and Translational Science Award (CTSA)
• Ontologies and the Open Biomedical Ontologies (OBO) Foundry
• Ontology for Biomedical Investigations (OBI)
• Ontology for Clinical Investigations (OCI)
– Approach
– Current status
– Future direction
Clinical and Translational Science Award (CTSA)
Implementing biomedical discoveries made in the last 10 years demands an evolution of clinical science.
New prevention strategies and treatments must be developed, tested, and brought into medical practice more rapidly.
CTSA awards will help to lower barriers between disciplines, and encourage creative, innovative approaches to solving complex medical problems.
These awards will catalyze change -- breaking silos, breaking barriers, and breaking conventions.
Building a National CTSA Consortium
• Trial Design
• Advanced Degree-Granting Programs
• Participant & Community Involvement
• Regulatory Support
• Biostatistics
• Clinical Resources
• Biomedical Informatics
• Clinical Research Ethics
• CTSA HOME
• NIH & other government agencies
• Healthcare organizations
• Industry
Each academic health center will create a home for clinical and translational science
Clinical Research Information System (CRIS)
• Data management - to develop a comprehensive controlled information system infrastructure to capture and manage clinical and translational research data
• Data integration - to integrate clinical and translational research data with data and knowledge from external public database resources
• Data analysis - to support clinical and translational research data analysis by providing state-of-the-art software analytical tools
• Support - to provide training and support for CRIS use
Clinical Research Information Systems
Workflow (Specification Phase, Implementation Phase, Analysis Phase)
• Protocol Design
• Statistical Endpoint Analysis
• Case Report Form Development
• Visit Management
• Subject Enrollment & Consenting
• Clinical Data Capture
• Laboratory Experimentation
• Sample Procurement & Processing
• Consent Form Development
• IRB Submission & Approval
• Grant Proposal Development
• Enrollment Criteria Specification
• Subject Identification & Recruitment
• Agency & Scientific Reporting
• Laboratory Results Analysis
• Integrative Data Analysis
Stakeholders
• Principal Investigator
• IRB
• Grants Management
• Research Coordinator
• Study Subject
• Lab Personnel
• Study Monitor
• Data & Statistics Analyst
• Database Analyst
• Study Sponsor
Functions
• Feasibility Study
Requirements
• Accurate Representation
– therapeutic drug as a design variable vs. medical history
– DNA as a therapeutic agent vs. analysis specimen
• Interoperability
– unambiguous data exchange between research sites
– effective data exchange between software applications
• Customization
– support of study-specific details
• Dynamics
– role changes throughout and between studies
– eligibility criteria to relevant clinical phenotype
• Inference
– semantic queries (e.g. patients with autoimmune disease)
• Meta-analysis
– studies with common features (e.g. all studies where flu vaccine was evaluated as a conditional variable)
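The inference requirement (a semantic query such as "patients with autoimmune disease") can be illustrated with a minimal sketch. It assumes a toy is_a hierarchy and made-up patient records; the term names, patient identifiers, and helper functions `ancestors` and `patients_with` are illustrative assumptions, not part of OBI, OCI, or any CDISC standard.

```python
from typing import Dict, List, Set

# Toy is_a hierarchy (child -> parents). Illustrative only; not taken from
# a released ontology.
IS_A: Dict[str, List[str]] = {
    "rheumatoid arthritis": ["autoimmune disease"],
    "systemic lupus erythematosus": ["autoimmune disease"],
    "autoimmune disease": ["immune system disease"],
    "influenza": ["infectious disease"],
    "immune system disease": ["disease"],
    "infectious disease": ["disease"],
}

# Hypothetical patient records: patient id -> coded diagnosis.
DIAGNOSES: Dict[str, str] = {
    "P001": "rheumatoid arthritis",
    "P002": "influenza",
    "P003": "systemic lupus erythematosus",
}


def ancestors(term: str) -> Set[str]:
    """Return every term reachable from `term` by following is_a links."""
    seen: Set[str] = set()
    stack = list(IS_A.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(IS_A.get(parent, []))
    return seen


def patients_with(category: str, diagnoses: Dict[str, str]) -> List[str]:
    """Select patients whose diagnosis is the category or a subtype of it."""
    return sorted(
        pid
        for pid, dx in diagnoses.items()
        if dx == category or category in ancestors(dx)
    )


if __name__ == "__main__":
    print(patients_with("autoimmune disease", DIAGNOSES))
```

No record is coded directly as "autoimmune disease"; the query succeeds only because the hierarchy is traversed, which is the kind of inference a flat code list cannot support.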
Constraints
• Essential to build upon and extend, or map to, existing and emerging data standards (e.g. HL7, CDISC) and relevant vocabularies (e.g. ICD-9/10, NCI Thesaurus, SNOMED-CT)
• Recognize the difference between medical (hospital) IT and biological (science) IT
• Support wide variety of different clinical and translational study types - reduce complexity by modeling commonalities
• Support needs of multiple stakeholders - different uses of same data
• Standards should be easy to implement and use
• Standards need to be easily and logically extensible
• Support clinical research data use cases
Need for standard representations
Data standards + Common vocabularies + Extensible data model = Data interoperability
Description Framework
Data Standards and Interoperability:
• Minimum Information Sets - CDISC Codelists, MIBBI
• Vocabularies & Ontologies - ICD-9/10, SNOMED, LOINC, NCI Thesaurus, OBO Foundry
• Object Models - CDISC, HL7 RIM, BRIDG, FuGE
• Exchange Syntaxes - HL7, XML, RDF
Definition of “Ontology”
Philosophical
• “The study of that which exists” (ISMB 2005)
• “The science of what is: of the kinds and structures of the objects, and their properties and relations in every area of reality” (ISMB 2005)
Information/computer scientists
• “A shared, common, backbone taxonomy of relevant entities, and the relationships between them, within an application domain” (ISMB 2005)
• “A computable representation of biological reality” (ISMB 2005)
• “A structured vocabulary”
• “A formal way of representing knowledge in which concepts are described both by their meaning and their relationship to each other” (Bard 2004)
• “A data model that represents a domain and is used to reason about the objects in that domain and the relations between them” (Wikipedia)
Ontology Goals
• Provide clear thinking about how to structure information
• Support data integration, modeling, query processing, user interface development, data exchange/export
• Enforce data correctness
• Map to database management systems
• Enable a computer to reason over the data
• Provide the capability to infer relationships that have not been explicitly defined
The OBO Foundry - 2006
The OBO foundry is a set of interoperable ontologies that adhere to a growing set of principles set forth for best practices in ontology development
The OBO Foundry
Initial OBO Foundry ontologies, building out from the original GO
(columns: RELATION TO TIME, i.e. CONTINUANT (INDEPENDENT, DEPENDENT) and OCCURRENT; rows: GRANULARITY)
• ORGAN AND ORGANISM
– Independent continuant: Organism (NCBI Taxonomy / placeholder); Anatomical Entity (FMA, CARO)
– Dependent continuant: Organ Function (placeholder); Phenotypic Quality (PaTO)
– Occurrent: Biological Process (GO)
• CELL AND CELLULAR COMPONENT
– Independent continuant: Cell (CL); Cellular Component (FMA, GO)
– Dependent continuant: Cellular Function (GO)
• MOLECULE
– Independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
– Dependent continuant: Molecular Function (GO)
– Occurrent: Molecular Process (GO)
Mature OBO Foundry ontologies (now undergoing reform)
• Cell Ontology (CL)
• Chemical Entities of Biological Interest (ChEBI)
• Foundational Model of Anatomy (FMA)
• Gene Ontology (GO)
• Phenotypic Quality Ontology (PaTO)
• Relation Ontology (RO)
• Sequence Ontology (SO)
Ontologies being built to satisfy Foundry principles ab initio
• Ontology for Clinical Investigations (OCI)
• Common Anatomy Reference Ontology (CARO)
• Environment Ontology (EnvO)
• Ontology for Biomedical Investigations (OBI)
• Protein Ontology (PRO)
• RNA Ontology (RnaO)
• Subcellular Anatomy Ontology (SAO)
• Disease Ontology (DO)
Disease Ontology
OBO Foundry provides a suite of basic science Reference Ontologies designed to serve as modules for re-use in Application Ontologies such as:
• Infectious Disease Ontology
• Immunology Ontology
• Multiple Sclerosis Ontology
• Mammalian Adult Neurogenesis Ontology
Ontology for Biomedical Investigations - Overview
International collaboration (since 2006)
• Communities developing ontologies/terminologies
- Unambiguous description of how the investigation was performed
- Consistent annotation, powerful queries and data integration
Describe the laboratory workflow
• Set of universal terms
- Investigation (organization, intent, design etc.)
- Material (biological and chemical, manipulation and transformation)
- Protocols and instrumentation
- Data generated and types of analysis performed on it
• Set of biological and technological domain-specific terms
- To meet the annotation requirements of any given community (e.g. clinical research)
Part of the Open Biomedical Ontologies (OBO) Foundry
• Orthogonality and cross-referencing with existing bio-ontologies
• 'Interoperable by construction' with those under the Foundry
- Including Unit, Quality (PATO), Environment and Chemical (ChEBI) ontologies
• Agree on an initial structure (trunk) with is_a relationship
- Rely on Relation Ontology (RO)
OBI – Communities and Structure
1. Coordination Committee (CC): Representatives of the communities -> Monthly conferences
2. Developers WG: CC and other communities’ members -> Weekly conference calls
3. Advisors:
-> cBiO will oversee the Open BioMedical Ontology (OBO) initiative
OBI – Top Level Classes
Continuant: an entity that endures/remains the same through time
• Independent Continuant: stands on its own
E.g. All physical entities (instrument, technology platform, document etc.)
E.g. Biological material (organism, population etc.)
• Dependent Continuant: inheres in another entity
E.g. Environment (depends on the set of ranges of conditions, e.g. geographic location)
E.g. Characteristics (entity that can be measured, e.g. temperature, unit)
- Realizable: an entity that is realized through a process (executed/run)
E.g. Software (a set of machine instructions)
E.g. Design (the plan that can be realized in a process)
E.g. Role (the part played by an entity within the context of a process)
Occurrent: an entity that occurs/unfolds in time
E.g. Temporal Regions, Spatio-Temporal Regions (single actions or events)
• Process
E.g. Investigation (the entire ‘experimental’ process)
E.g. Assay (process of performing some tests and recording the results)
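The top-level classes above can be sketched as a small class hierarchy. This is an illustration only: it assumes that Realizable sits under Dependent Continuant (as the slide's indentation suggests), and the Python class names and the helpers `lineage` and `top_level_branch` are hypothetical, not identifiers from OBI or BFO.

```python
from typing import List, Type


class Entity:
    """Root of the sketch hierarchy."""


class Continuant(Entity):
    """An entity that endures through time."""


class IndependentContinuant(Continuant):
    """Stands on its own (e.g. instrument, organism)."""


class DependentContinuant(Continuant):
    """Inheres in another entity (e.g. a measurable characteristic)."""


class Realizable(DependentContinuant):
    """Realized through a process (assumed placement under DependentContinuant)."""


class Occurrent(Entity):
    """An entity that occurs or unfolds in time."""


class Process(Occurrent):
    """An occurrent with participants, e.g. an investigation."""


# Example leaf classes taken from the "E.g." lines of the slide.
class Instrument(IndependentContinuant): ...
class Role(Realizable): ...
class Investigation(Process): ...
class Assay(Process): ...


def lineage(cls: Type[Entity]) -> List[str]:
    """Return class names from Entity down to `cls`."""
    return [c.__name__ for c in reversed(cls.__mro__) if c is not object]


def top_level_branch(cls: Type[Entity]) -> str:
    """Report whether `cls` falls under Continuant or Occurrent."""
    for base in cls.__mro__:
        if base in (Continuant, Occurrent):
            return base.__name__
    raise ValueError(f"{cls.__name__} is not placed under a top-level branch")


if __name__ == "__main__":
    print(lineage(Role))
    print(top_level_branch(Assay))
```

Placing each term under exactly one top-level branch is what lets a tool distinguish, for instance, a role (a continuant) from the assay in which that role is played (an occurrent).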
Ontology for Clinical Investigations Approach
• Transparency and inclusivity (http://www.bioontology.org/wiki/index.php/OCI:Main_Page; Google “OCI wiki”)
• Combined top down/bottom up approach (prospective standardization)
– Assemble term lists
– Combine terms
– Separate homonyms
– Combine synonyms
– Assign membership into BFO/OBI branches
– Position terms within branches
– Define terms
• Testing
OCI Wiki
Term lists
Homonyms
sample size:
1. A subset of a larger population, selected for investigation to draw conclusions or make estimates about the larger population.
2. The number of subjects in a clinical trial.
3. Number of subjects required for primary analysis.
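The "separate homonyms" step can be sketched as grouping pooled term-list entries by label and flagging any label that carries more than one distinct definition. The three "sample size" definitions are the ones quoted above; the "case report form" entry, the `TERMS` list, and the helpers `normalize` and `find_homonyms` are assumptions made for illustration, not the working group's actual tooling.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Pooled (label, definition) pairs, as they might come from several term lists.
TERMS: List[Tuple[str, str]] = [
    ("sample size", "A subset of a larger population, selected for investigation "
                    "to draw conclusions or make estimates about the larger population."),
    ("Sample Size", "The number of subjects in a clinical trial."),
    ("sample size", "Number of subjects required for primary analysis."),
    # Exact repeat of an earlier definition: should not count as a new sense.
    ("sample  size", "The number of subjects in a clinical trial."),
    # Placeholder entry with a single sense (definition text is hypothetical).
    ("case report form", "placeholder definition, not from the source"),
]


def normalize(label: str) -> str:
    """Collapse whitespace and ignore case so label variants group together."""
    return " ".join(label.split()).casefold()


def find_homonyms(entries: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Return labels that have more than one distinct definition."""
    senses: Dict[str, List[str]] = defaultdict(list)
    for label, definition in entries:
        key = normalize(label)
        if definition not in senses[key]:
            senses[key].append(definition)
    return {label: defs for label, defs in senses.items() if len(defs) > 1}


if __name__ == "__main__":
    for label, definitions in find_homonyms(TERMS).items():
        print(f"{label}: {len(definitions)} distinct senses")
        for number, definition in enumerate(definitions, start=1):
            print(f"  {number}. {definition}")
```

A check like this only surfaces candidates. Deciding whether two definitions are genuinely different senses, and giving each sense its own term, remains a curation task.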
Study Design
• Descriptive research – research in which the investigator attempts to describe a group of individuals based on a set of variables in order to document their characteristics
– Case study – description of one or more patients
– Developmental research – description of pattern of change over time
– Normative research – establishing normal values
– Qualitative research – gathering data through interview or observation
– Evaluation research – objectively assess a program or policy by describing the needs for the services or policy, often using surveys or questionnaires
• Exploratory research
– Cohort or case-control studies – establish associations through epidemiological studies
– Methodological studies – establish reliability and validity of a new method
– Secondary analysis – exploring new relationships in old data
– Historical research – reconstructing the past through an assessment of archives or other records
• Experimental research
– Randomized clinical trial – controlled comparison of an experimental intervention allowing the assessment of the causes of outcomes
  • Single-subject design
  • Sequential clinical trial
  • Evaluation research – assessment of the success of a program or policy
– Quasi-experimental research
– Meta-analysis – statistically combining findings from several different studies to obtain a summary analysis
Assign membership into BFO/OBI branches
Biological marker (CDISC); Study populations (CDISC); Trial coordinator (CDISC); Study variable (CDISC); Drug (RCT); Subject (MUSC)
Case report form (CDISC); Patient file (CDISC); Consent form (CDISC); New drug application (MUSC); Investigational new drug application (MUSC)
Meta-analysis (CDISC); Quality assurance (CDISC); Quality control (CDISC); Baseline assessment (CDISC); Validation (CDISC); Coding (MUSC); Permuted block randomization (MUSC); Secondary-study-protocol (RCT); Intervention-step (RCT); Blinding-method (RCT)
Study design
Development plan (CDISC); Standard operating procedures (CDISC); Statistical analysis plan (CDISC)
Future directions
• Engage more stakeholders
• Direct collaboration with organizations such as CDISC
• Continue development
• Evaluation approaches and metrics
– Based on scientific use cases
– Categories of use cases
  • Interoperability
    – Data exchange
    – Accuracy of representation
    – Homonyms and context; ontology helps us do that
  • Reasoning and inference
– Test with CTSA IT Project (trial registration)
OCI Working Group
• Jennifer Fostel
• Richard Scheuermann
• Cristian Cocos
• W. Jim Zheng
• Wenle Zhao
• Herb Hagler
• Jamie Lee
• Matthias Brochhausen
• Amar K. Das
• Dave Parrish
• Barry Smith
• Trish Whetzel