Semantic Models for CDISC Based Standards and Metadata Management
Post on 19-Oct-2014
© CDISC 2012
Presented at CDISC Interchange Europe, Stockholm, 19 April
2012, by
Kerstin Forsberg, R&D, AstraZeneca
Frederik Malfait, IMOS Consulting and Hoffmann-La Roche
Semantic Models for CDISC Based
Standards and Metadata Management
Key Message
• Things converge to create new and unique opportunities.
  - The coverage and maturity of existing CDISC standards.
  - The establishment of these standards within the industry.
  - The use of these standards as a foundation for metadata-driven systems.
  - The upcoming role of semantic web standards and linked data principles.
• See also the presentation and blog post from last year’s conference: Linking Clinical Data Standards
Two real-world uses of semantic web
standards and linked data principles
Today’s Situation
• “Not if and when, but how” to best adopt CDISC
based data standards is becoming the leading
question.
• We see a variety of CDISC standards at different
levels of maturity, not linked together and
published in different formats.
• Sponsors are faced with challenges on all levels:
architecture, process, and application.
An Emerging Insight
• The CDISC standards are all about the meaning of
what is studied in the biological and clinical reality
(often referred to as concepts).
• They are also about how these concepts are represented
as data elements from protocol to submission, and beyond.
• We are dealing with semantics and metadata for
biomedical and clinical research knowledge and
data.
• “Put semantic into the semantic”
Use semantic web standards
and linked data principles.
RDF Triples
• Resource Description Framework (RDF)
A general model of how any piece of data, and
representations of knowledge, can be expressed
as so called triples.
subject | predicate | object (or value)
Stockholm | type | place
Stockholm | capital | Sweden
Stockholm | subject | Port cities in Sweden
Stockholm | areaCode | “+46-8”
“http://en.wikipedia.org/wiki/Stockholm” | primaryTopic | Stockholm
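As a minimal sketch of the triple model, the Stockholm statements on this slide can be written as plain Python tuples. The `Triple` class, the `objects_of` helper, and the bare string identifiers are illustrative assumptions; real RDF identifies subjects and predicates with full URIs.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str  # object, or a literal value such as an area code


# The five statements from the slide, in (subject, predicate, object) order.
triples = [
    Triple("Stockholm", "type", "place"),
    Triple("Stockholm", "capital", "Sweden"),
    Triple("Stockholm", "subject", "Port cities in Sweden"),
    Triple("Stockholm", "areaCode", "+46-8"),
    Triple("http://en.wikipedia.org/wiki/Stockholm", "primaryTopic", "Stockholm"),
]


def objects_of(subject: str, predicate: str) -> List[str]:
    """Return every object stated for the given subject and predicate."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]


print(objects_of("Stockholm", "areaCode"))
```

Every statement has the same three-part shape, which is what lets data from unrelated sources be stored and queried in one uniform way.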
RDF Triples
• Triples can be aggregated into graphs with subjects
and objects as nodes, and predicates as arcs.
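[Figure: the Stockholm triples drawn as a graph. Stockholm is linked by the arcs type, capital, subject, and areaCode to the nodes City, Sweden, Port cities in Sweden, and “+46-8”; “http://en.wikipedia.org/wiki/Stockholm” is linked to Stockholm by primaryTopic.]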
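The step from triples to a graph can be sketched as follows, assuming the triples are held as plain tuples. The `build_graph` helper is hypothetical and stands in for what an RDF library would do internally.

```python
from collections import defaultdict

triples = [
    ("Stockholm", "type", "City"),
    ("Stockholm", "capital", "Sweden"),
    ("Stockholm", "subject", "Port cities in Sweden"),
    ("Stockholm", "areaCode", "+46-8"),
    ("http://en.wikipedia.org/wiki/Stockholm", "primaryTopic", "Stockholm"),
]


def build_graph(triples):
    """Map each subject node to its outgoing (predicate, object) arcs."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


graph = build_graph(triples)

# Nodes are everything that appears as a subject or as an object.
nodes = {s for s, _, _ in triples} | {o for _, _, o in triples}

for predicate, obj in graph["Stockholm"]:
    print(f"Stockholm --{predicate}--> {obj}")
```

Because Stockholm appears both as a subject and as an object, the separate statements join up into one connected graph.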
RDF Triples
• Graphs of triples can be extended across different
sources and for different purposes.
[Figure: the Stockholm graph extended with nodes from other sources: Country (type of Sweden), Gothenburg (subject: Port cities in Sweden), CDISC, and CDISC Interchange EU 2012.]
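Extending a graph across sources amounts to taking the union of their triples, as this sketch suggests. The two source sets and the `subjects_with` helper are hypothetical, and the pairings of Sweden with Country and of Gothenburg with Port cities in Sweden are read from the figure, so treat them as assumptions.

```python
# Hypothetical source 1: statements about Stockholm.
city_source = {
    ("Stockholm", "type", "City"),
    ("Stockholm", "capital", "Sweden"),
    ("Stockholm", "subject", "Port cities in Sweden"),
}

# Hypothetical source 2: statements published elsewhere.
other_source = {
    ("Sweden", "type", "Country"),
    ("Gothenburg", "subject", "Port cities in Sweden"),
    ("Stockholm", "type", "City"),  # repeated statement, collapsed by the set
}

# Merging graphs is a set union; shared node names connect the two sources.
merged = city_source | other_source


def subjects_with(predicate, obj, triples):
    """Return, sorted, every subject that has the given predicate and object."""
    return sorted(s for s, p, o in triples if p == predicate and o == obj)


print(subjects_with("subject", "Port cities in Sweden", merged))
```

The merge only works because both sources use the same identifier for the same thing, which is the reason linked data principles insist on shared URIs.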
RDF Triples
• RDF Schema and the RDF-based Web Ontology
Language (OWL) add a typing mechanism to
classify subjects and objects into hierarchies.
[Figure: the graph with a class hierarchy added. City and Country are subclasses of Adm.Area, which is a subclass of Place; Business Event is a subclass of Event; Place, Organization, and Event are subclasses of Thing. Sweden has type Country, CDISC has type Organization, and CDISC Interchange EU 2012 has type Business Event. The Stockholm and Gothenburg nodes are unchanged.]
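The typing mechanism can be illustrated with a small sketch that walks a subclass hierarchy to find every class a resource belongs to. The child-to-parent pairings are read from the figure and are assumptions; the single-parent dictionary is also a simplification, since RDF Schema allows a class to have several superclasses.

```python
# Class hierarchy (child -> parent), as read from the slide; an assumption.
SUBCLASS_OF = {
    "City": "Adm.Area",
    "Country": "Adm.Area",
    "Adm.Area": "Place",
    "Place": "Thing",
    "Organization": "Thing",
    "Business Event": "Event",
    "Event": "Thing",
}

# Directly asserted types for individual resources.
TYPE_OF = {
    "Stockholm": "City",
    "Sweden": "Country",
    "CDISC": "Organization",
    "CDISC Interchange EU 2012": "Business Event",
}


def all_types(resource):
    """Return the asserted type followed by each superclass up to the root."""
    types = []
    current = TYPE_OF.get(resource)
    while current is not None:
        types.append(current)
        current = SUBCLASS_OF.get(current)
    return types


print(all_types("Stockholm"))
```

A query for every Place would then find Stockholm even though only the type City was stated for it.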
RDF Triples
• Google, Bing (Microsoft) and Yahoo use OWL to
publish a joint vocabulary.
[Figure: the class hierarchy from the previous slide: City and Country under Adm.Area under Place; Business Event under Event; Place, Organization, and Event under Thing.]
Example: http://schema.org/City
RDF Triples
• NCI uses OWL to publish the NCI Thesaurus (the
source for CDISC’s controlled terminologies) in RDF/XML format.
[Figure: the NCI Thesaurus concept Hemoglobin Measurement drawn as a graph. Arcs: subClass to Hematology Test; Concept in Subset to CDISC Laboratory Test Name Terminology and to CDISC Laboratory Test Terminology; definition “A quantitative measurement of the amount of hemoglobin present in a sample.” A further node, Laboratory Procedure, is attached by a Has NCIHD Parent arc.]
NCI Thesaurus: http://ncicb.nci.nih.gov/download/evsportal.jsp
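Once a terminology is available as triples, questions such as "which CDISC subsets contain this concept?" become pattern matches. The sketch below assumes the Hemoglobin Measurement statements as read from the figure and uses a hypothetical `match` helper with `None` as a wildcard; a real deployment would more likely run a SPARQL query against the published RDF/XML.

```python
triples = [
    ("Hemoglobin Measurement", "subClass", "Hematology Test"),
    ("Hemoglobin Measurement", "Concept in Subset",
     "CDISC Laboratory Test Name Terminology"),
    ("Hemoglobin Measurement", "Concept in Subset",
     "CDISC Laboratory Test Terminology"),
    ("Hemoglobin Measurement", "definition",
     "A quantitative measurement of the amount of hemoglobin present in a sample."),
]


def match(pattern, triples):
    """Yield triples matching a (subject, predicate, object) pattern.

    None in any position of the pattern acts as a wildcard.
    """
    for triple in triples:
        if all(want is None or want == got
               for want, got in zip(pattern, triple)):
            yield triple


# Which terminology subsets is the concept a member of?
subsets = [obj for _, _, obj in
           match(("Hemoglobin Measurement", "Concept in Subset", None), triples)]
print(subsets)
```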
Linked Open Data Cloud
http://lod-cloud.net/
Richard Cyganiak and Anja Jentzsch
Real-world use
• Two examples of how sponsors have started to
use semantic web standards and apply linked data
principles.
AstraZeneca:
• Integrative Informatics (i2) program, establishing the
components to let a Linked Data cloud grow across
AstraZeneca R&D.
Roche:
• Implementing an internally built metadata repository (MDR).
Roche Biomedical MDR
[Figure: diagram labelled CDISC Standards, Metadata Management, Knowledge Management, and Schema Architecture. Legend: Production; Partial / Future.]
Roche Biomedical MDR
Content
• External content
  - SDTM 1.2, SDTMIG 3.1.2
  - NCI Thesaurus, CDISC Controlled Terminology
• Integrated Data Standards, Roche and Genentech
  - Safety and every Roche TA, ~2000 data elements
  - Data Collection and Data Tabulation
• Value level metadata
  - Lab measurements, Unit conversions, Questionnaires
• Looking at metadata for
  - SDTM Conformance Checking, Biomarker (HGNC), …
Roche Biomedical MDR
Information Architecture
[Figure: layered diagram.
Process stages: Study Design, Data Collection, Data Tabulation, Data Analysis, Regulatory Submission
CDISC Data Standards: PRM, CDASH, SDTM, ADaM, Define
Biomedical Domain Model: BRIDG, SHARE, NCI Thesaurus, Data Element Concepts
Other layers: Transformation Models, Roche Global Data Standards, Study & Project Level Metadata
Legend: Production, Partial, Future]
Roche Biomedical MDR
System Architecture
[Figure: components Content Management, Metadata Repository, Single Point of Access, and Content Publishing.]
Roche Biomedical MDR
Value Proposition
• Current
  - Integrated knowledge, metadata, and data standards management
  - System independent information asset
  - Single point of access
• Future
  - Leverage the SOA interface to create a framework for integrated metadata-driven workflow
  - Integrate MDR and Component Based Authoring capabilities (study design, protocol, CSR)
Key Message
• We now see all of these things converge to create new and unique opportunities.
  - The coverage and maturity of existing CDISC standards.
  - The establishment of these standards within the industry at large.
  - The use of these standards as a foundation for metadata-driven systems.
  - The upcoming role of semantic web standards and linked data principles.
Oh well, if you really
want that Excel sheet