Semantic Models for CDISC Based Standards and Metadata Management

© CDISC 2012. Presented at CDISC Interchange Europe, Stockholm, 19 April 2012, by Kerstin Forsberg (R&D, AstraZeneca) and Frederik Malfait (IMOS Consulting and Hoffmann-La Roche).


Page 1: Semantic models for cdisc based standards and metadata management

© CDISC 2012

Presented at CDISC Interchange Europe, Stockholm, 19 April 2012, by

Kerstin Forsberg, R&D, AstraZeneca

Frederik Malfait, IMOS Consulting and Hoffmann-La Roche

Semantic Models for CDISC Based Standards and Metadata Management

Page 2: Semantic models for cdisc based standards and metadata management

Key Message

• Things converge to create new and unique opportunities.
  - The coverage and maturity of existing CDISC standards.
  - The establishment of these standards within the industry.
  - The use of these standards as a foundation for metadata driven systems.
  - The upcoming role of semantic web standards and linked data principles.
• See also presentation and blog post from last year’s conference: Linking Clinical Data Standards

Page 3: Semantic models for cdisc based standards and metadata management

Two real-world uses of semantic web standards and linked data principles

Page 4: Semantic models for cdisc based standards and metadata management

Today’s Situation

• “Not if and when, but how” to best adopt CDISC based data standards is becoming the leading question.
• We see a variety of CDISC standards at different levels of maturity, not linked together and published in different formats.
• Sponsors are faced with challenges on all levels: architecture, process, and application.

Page 5: Semantic models for cdisc based standards and metadata management

An Emerging Insight

• The CDISC standards are all about the meaning of what is studied in the biological and clinical reality (often referred to as concepts).
• They are also about how these concepts are represented as data elements from protocol to submission, and beyond.
• We are dealing with semantics and metadata for biomedical and clinical research knowledge and data.
• “Put semantic into the semantic”
  - Use semantic web standards and linked data principles.

Page 6: Semantic models for cdisc based standards and metadata management

RDF Triples

• Resource Description Framework (RDF)
  - A general model of how any piece of data, and representations of knowledge, can be expressed as so-called triples.

subject                                     predicate       object (or value)
Stockholm                                   type            place
Stockholm                                   capital         Sweden
Stockholm                                   subject         Port cities in Sweden
Stockholm                                   areaCode        “+46-8”
“http://en.wikipedia.org/wiki/Stockholm”    primaryTopic    Stockholm
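As a rough illustration of the triple model on this slide, the rows can be held as plain (subject, predicate, object) tuples. This is only a sketch: real RDF identifies subjects and predicates with URIs, whereas the short labels below are copied from the slide, and the `Triple` type and `objects_of` helper are hypothetical names introduced here.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One statement: subject, predicate, object (or literal value)."""
    subject: str
    predicate: str
    obj: str


# Labels as shown on the slide; real RDF would use full URIs for these.
STOCKHOLM_PAGE = "http://en.wikipedia.org/wiki/Stockholm"

TRIPLES: List[Triple] = [
    Triple("Stockholm", "type", "place"),
    Triple("Stockholm", "capital", "Sweden"),
    Triple("Stockholm", "subject", "Port cities in Sweden"),
    Triple("Stockholm", "areaCode", "+46-8"),
    Triple(STOCKHOLM_PAGE, "primaryTopic", "Stockholm"),
]


def objects_of(triples: List[Triple], subject: str, predicate: str) -> List[str]:
    """Return every object stated for the given subject and predicate."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]


if __name__ == "__main__":
    print(objects_of(TRIPLES, "Stockholm", "areaCode"))
```

Because every fact has the same three-part shape, a single lookup function covers any predicate without a schema change.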

Page 7: Semantic models for cdisc based standards and metadata management

RDF Triples

• Triples can be aggregated into graphs with subjects and objects as nodes, and predicates as arcs.

[Figure: graph of the Stockholm triples. Nodes: Stockholm, City, Sweden, Port cities in Sweden, “+46-8”, “http://en.wikipedia.org/wiki/Stockholm”. Arcs: type, capital, subject, areaCode, primaryTopic.]
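The aggregation of triples into a graph can be sketched as an adjacency list, where each subject node maps to its outgoing (predicate, object) arcs. The labels come from the slide; the function names `build_graph` and `all_nodes` are hypothetical, and plain strings stand in for the URIs a real RDF store would use.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

TRIPLES: List[Triple] = [
    ("Stockholm", "type", "City"),
    ("Stockholm", "capital", "Sweden"),
    ("Stockholm", "subject", "Port cities in Sweden"),
    ("Stockholm", "areaCode", "+46-8"),
    ("http://en.wikipedia.org/wiki/Stockholm", "primaryTopic", "Stockholm"),
]


def build_graph(triples: List[Triple]) -> Dict[str, List[Tuple[str, str]]]:
    """Map each subject node to its outgoing (predicate, object) arcs."""
    graph: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


def all_nodes(triples: List[Triple]) -> Set[str]:
    """Subjects and objects are the nodes; predicates only label arcs."""
    return {s for s, _, _ in triples} | {o for _, _, o in triples}


if __name__ == "__main__":
    for node, arcs in build_graph(TRIPLES).items():
        for predicate, target in arcs:
            print(f"{node} --{predicate}--> {target}")
```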

Page 8: Semantic models for cdisc based standards and metadata management

RDF Triples

• Graphs of triples can be extended across different sources and for different purposes.

[Figure: the Stockholm graph extended with further nodes: Country (as the type of Sweden), Gothenburg (sharing the subject Port cities in Sweden), CDISC, and CDISC Interchange EU 2012. Arcs: type, capital, subject, areaCode, primaryTopic.]
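Extending a graph across sources amounts to taking the union of the triple sets, provided both sources use the same identifiers for the same things. A minimal sketch follows; the split into `SOURCE_A` and `SOURCE_B` and the helper `subjects_with` are assumptions made for illustration, and only arcs whose labels are legible on the slide are included.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical first source: facts about Stockholm.
SOURCE_A: Set[Triple] = {
    ("Stockholm", "type", "City"),
    ("Stockholm", "capital", "Sweden"),
    ("Stockholm", "subject", "Port cities in Sweden"),
    ("Stockholm", "areaCode", "+46-8"),
}

# Hypothetical second source: facts published elsewhere, reusing the same names.
SOURCE_B: Set[Triple] = {
    ("Sweden", "type", "Country"),
    ("Gothenburg", "subject", "Port cities in Sweden"),
}


def subjects_with(triples: Set[Triple], predicate: str, obj: str) -> Set[str]:
    """Find every subject that has the given predicate pointing at obj."""
    return {s for s, p, o in triples if p == predicate and o == obj}


# Merging is a set union; shared node names join the two graphs.
MERGED: Set[Triple] = SOURCE_A | SOURCE_B

if __name__ == "__main__":
    print(sorted(subjects_with(MERGED, "subject", "Port cities in Sweden")))
```

The query over the merged set returns both cities, which neither source could answer alone.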

Page 9: Semantic models for cdisc based standards and metadata management

RDF Triples

• RDF Schema and the RDF based Web Ontology Language (OWL) add a typing mechanism to classify subjects and objects into hierarchies.

[Figure: the extended Stockholm graph with a class hierarchy added. Classes: Thing, Place, Adm.Area, City, Country, Organization, Event, Business Event, connected by subClass arcs. Instances such as Stockholm, Sweden, CDISC, and CDISC Interchange EU 2012 are attached to classes by type arcs.]
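The effect of a class hierarchy can be sketched as follows: once a resource has a direct type, every superclass of that type also applies. The subClass edges and type assignments below are assumptions read from the class names on the slide and arranged as in the schema.org hierarchy; they are not a verified transcription of the figure. The single-parent dictionary is also a simplification, since RDF Schema allows a class to have several superclasses.

```python
from typing import Dict, List

# Assumed subClass edges (child -> parent), using the slide's short labels.
SUBCLASS_OF: Dict[str, str] = {
    "City": "Adm.Area",
    "Country": "Adm.Area",
    "Adm.Area": "Place",
    "Place": "Thing",
    "Organization": "Thing",
    "Business Event": "Event",
    "Event": "Thing",
}

# Assumed direct type of each resource.
TYPE_OF: Dict[str, str] = {
    "Stockholm": "City",
    "Sweden": "Country",
    "CDISC": "Organization",
    "CDISC Interchange EU 2012": "Business Event",
}


def superclasses(cls: str) -> List[str]:
    """Walk the subClass chain upward from cls, nearest parent first."""
    chain: List[str] = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def inferred_types(resource: str) -> List[str]:
    """Direct type of the resource followed by all inherited types."""
    direct = TYPE_OF[resource]
    return [direct] + superclasses(direct)


if __name__ == "__main__":
    print(inferred_types("Stockholm"))
```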

Page 10: Semantic models for cdisc based standards and metadata management

RDF Triples

• Google, Bing (Microsoft) and Yahoo use OWL to publish a joint vocabulary.

[Figure: class hierarchy with Thing, Place, Adm.Area, City, Country, Organization, Event, Business Event, connected by subClass arcs.]

Example: http://schema.org/City

Page 11: Semantic models for cdisc based standards and metadata management

RDF Triples

• NCI uses OWL to publish the NCI Thesaurus (the source for CDISC’s controlled terminologies, CTs) in an RDF/XML format.

[Figure: NCI Thesaurus excerpt. Concepts: Hemoglobin Measurement, Hematology Test, Laboratory Procedure. Subsets: CDISC Laboratory Test Name Terminology, CDISC Laboratory Test Terminology. Arcs: subClass, Concept in Subset, Has NCIHD Parent, definition. Definition of Hemoglobin Measurement: “A quantitative measurement of the amount of hemoglobin present in a sample.”]

NCI Thesaurus: http://ncicb.nci.nih.gov/download/evsportal.jsp

Page 12: Semantic models for cdisc based standards and metadata management

Linked Open Data Cloud

[Figure: Linked Open Data cloud diagram by Richard Cyganiak and Anja Jentzsch, http://lod-cloud.net/]

Page 13: Semantic models for cdisc based standards and metadata management

Real world use

• Two examples of how sponsors have started to use semantic web standards and apply linked data principles.
  - AstraZeneca: Integrative Informatics (i2) program establishing the components to let a Linked Data cloud grow across AstraZeneca R&D.
  - Roche: Implementing an internally built MDR.

Page 14: Semantic models for cdisc based standards and metadata management

Roche Biomedical MDR

[Figure: Roche Biomedical MDR bringing together CDISC Standards, Metadata Management, and Knowledge Management. Labels: Schema Architecture. Legend: Production, Partial / Future.]

Page 15: Semantic models for cdisc based standards and metadata management

Roche Biomedical MDR

Content

• External content
  - SDTM 1.2, SDTMIG 3.1.2
  - NCI Thesaurus, CDISC Controlled Terminology
• Integrated Data Standards, Roche and Genentech
  - Safety and every Roche TA, ~ 2000 data elements
  - Data Collection and Data Tabulation
• Value level metadata
  - Lab measurements, Unit conversions, Questionnaires
• Looking at metadata for
  - SDTM Conformance Checking, Biomarker (HGNC), …

Page 16: Semantic models for cdisc based standards and metadata management

Roche Biomedical MDR

Information Architecture

[Figure: information architecture diagram. Process stages: Study Design, Data Collection, Data Tabulation, Data Analysis, Regulatory Submission. Layers: CDISC Data Standards, Biomedical Domain Model, Transformation Models, Roche Global Data Standards, Study & Project Level Metadata. Standards shown: PRM, CDASH, SDTM, ADaM, Define. Supporting models: BRIDG, SHARE, NCI Thesaurus, Data Element Concepts. Legend: Production, Partial, Future.]

Page 17: Semantic models for cdisc based standards and metadata management

Roche Biomedical MDR

System Architecture

[Figure: system architecture diagram. Components: Content Management, Metadata Repository, Single Point of Access, Content Publishing.]

Page 18: Semantic models for cdisc based standards and metadata management

Roche Biomedical MDR

Value Proposition

• Current
  - Integrated knowledge, metadata, and data standards management
  - System independent information asset
  - Single point of access
• Future
  - Leverage the SOA interface to create a framework for integrated metadata driven workflow
  - Integrate MDR and Component Based Authoring capabilities (study design, protocol, CSR)

Page 19: Semantic models for cdisc based standards and metadata management

Key Message

• We now see all of these things converge to create new and unique opportunities.
  - The coverage and maturity of existing CDISC standards.
  - The establishment of these standards within the industry at large.
  - The use of these standards as a foundation for metadata driven systems.
  - The upcoming role of semantic web standards and linked data principles.

Page 20: Semantic models for cdisc based standards and metadata management

TopBraid Semantic Modeling Workbench

Page 29: Semantic models for cdisc based standards and metadata management

Roche Global Data Standards Browser

Page 38: Semantic models for cdisc based standards and metadata management

Publishing and Item Level Versioning

Page 42: Semantic models for cdisc based standards and metadata management

Using Web Services to Export to…

Page 44: Semantic models for cdisc based standards and metadata management

Oh well, if you really want that Excel sheet