caBIG Semantic Infrastructure 2.0:

Supporting TBPT Needs

Dave Hau, M.D., M.S.

Acting Director, Semantic Infrastructure

NCI Center for Biomedical Informatics and Information Technology

Key Extensions from 1.0 Infrastructure

• Lower barrier of entry - “Make easy things easy”
• Linear value proposition
  • Vs. all or none
  • Immediate return upon initial investment
  • Tools guide user to increase semantics for more “return”
  • Support all levels of participation based on user needs
• Support non-NCI semantic representation strategies
• Legacy support for 1.0 users
• Leverage existing open-source technologies and standards from the community

Semantic Infrastructure 2.0

• Knowledge Management Service (SAIF / ECCF Registry)
  • Informational (Static) Semantics
    • Layered (~ upper ontology) – promote reuse; enhance interoperability; avoid data element “explosion”
    • Contextual – all elements have a model context; vs. individual element curation
    • RIM-based semantics and ISO 21090 healthcare datatypes
    • Ability to transform between different formats based on use case
  • Behavioral (Dynamic) Semantics
    • Support semantic workflow composition on caGrid 2.0
    • Analytical and transactional services
  • Artifact Management – SAIF metamodel (UML profile); DITA
    • Form templates – HL7 CDA, CDISC ODM, XForms
• Governance Service
  • Conformance checking – design time and run time

Leverage Open-Source Technologies

Informational (Static) Semantics

Eclipse Modeling Framework (EMF)

Transformation between model formats

HL7 MIF

• HL7 Static Model Designer

UML

• MDT (Model Development Tools)

Ontology

• Protégé?

Linear Value Proposition

“Just enough semantics” for different deployment contexts (Lab-wide vs. Institution-wide vs. Global)

[Diagram labels: Innovation Path; Transition; Enterprise Path; “VALUE”]

Linear Value Proposition – Static Semantics

Value – Data integration across caGrid 2.0

Any community ontology (e.g. Gene Ontology)

Mapping (e.g. ISA-TAB -> LSDAM mapping)

caBIG Domain Analysis Model (e.g. BRIDG, LSDAM)

Linear Value Proposition – Behavioral Semantics

Value – Semantic Workflow Composition

WSDL / WADL

WSDL / WADL + pre- and post-conditions

SAIF Behavioral Framework (Service Contract – roles, interactions)
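
One way to read the middle rung of this ladder: once a service description carries pre- and post-conditions alongside its WSDL / WADL interface, a composer can check mechanically whether each step of a workflow has what it needs. The following is a minimal sketch in Python, assuming conditions are plain string labels. The service names, condition labels, and the `can_compose` helper are hypothetical and are not taken from caGrid or SAIF.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceDescription:
    """Hypothetical service description: a name plus declared conditions."""
    name: str
    preconditions: frozenset   # facts that must hold before the call
    postconditions: frozenset  # facts guaranteed to hold after the call


def can_compose(workflow, initial_facts=frozenset()):
    """Check the steps in order.

    Returns (True, None) if every step's preconditions are satisfied,
    otherwise (False, (step name, sorted list of missing facts)).
    """
    known = set(initial_facts)
    for step in workflow:
        missing = step.preconditions - known
        if missing:
            return False, (step.name, sorted(missing))
        known |= step.postconditions
    return True, None


# Illustrative services only (assumed names, not real caGrid services).
query = ServiceDescription(
    "SpecimenQuery", frozenset({"authenticated"}), frozenset({"specimen_ids"})
)
annotate = ServiceDescription(
    "SpecimenAnnotate", frozenset({"specimen_ids"}), frozenset({"annotated_specimens"})
)

ok, problem = can_compose([query, annotate], initial_facts={"authenticated"})
bad, bad_problem = can_compose([annotate, query], initial_facts={"authenticated"})
```

Running the steps in the wrong order is rejected because `SpecimenAnnotate` needs `specimen_ids`, which only `SpecimenQuery` produces. A full service contract (roles, interactions) would carry more than this sketch models.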

SI 2.0 is federated

[Diagram: one Standard level; two Institutional levels, each with two Labs]

SI 2.0 Supports Multiple Platforms

Service Registry

• caGrid 2.0

Service Registry

• CVGrid

Service Registry

• Other Platform

Semantic Infrastructure 2.0

SI 2.0 -> Service Registry -> Service Instance

SI 2.0 – caTissue specification

caGrid 2.0 Service Registry

(stores reference)

caTissue instance at Wash U

caTissue instance at Fox Chase

caTissue instance at U Leicester, UK

caTissue instance at Lowy, UNSW

Semantic Infrastructure

Service Registry

Service Instances

Dynamic Extensions

SI 2.0 – caTissue specification

caGrid 2.0 Service Registry

(stores reference)

caTissue instance at Wash U

caTissue instance at Fox Chase

caTissue instance at U Leicester, UK

caTissue instance at Lowy, UNSW

Semantic Infrastructure (NCI instance or local instance)

Service Registry

Service Instances

DEs

CDA template / xform

Version bump - resync

LexEVS

Value sets, pick lists

Dynamic Extensions - Details

• SI 2.0 will provide portlet and service capabilities for caTissue to create Dynamic Extensions (DEs) directly on an SI 2.0 instance.

• SI 2.0 will provide capabilities for querying and reusing existing models and attributes.

• Newly created DEs can be shared from the SI 2.0 instance at a time of the owner's choosing.

• LexEVS will provide value set querying, creation and management capabilities for the DEs.

• Forms will be created on SI 2.0, and retrievable by caTissue as a CDA template, XForms form, etc., with validation mechanisms (e.g. Schematron rules associated with a CDA template).
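
The value-set bullet above can be pictured as a pick-list check at form-entry time. The sketch below is illustrative only: the in-memory `VALUE_SETS` table stands in for whatever a terminology service would return, and the value set name, its values, and both functions are hypothetical rather than part of the LexEVS or caTissue APIs.

```python
# Hypothetical stand-in for value sets that a terminology service would supply.
VALUE_SETS = {
    "specimen_laterality": ["Left", "Right", "Bilateral"],
}


def pick_list(value_set_name):
    """Return the permitted values to show in a drop-down for a DE attribute."""
    return list(VALUE_SETS.get(value_set_name, []))


def validate_entry(value_set_name, entered_value):
    """Accept an entry only if it is one of the value set's permitted values."""
    return entered_value in VALUE_SETS.get(value_set_name, [])


options = pick_list("specimen_laterality")
accepted = validate_entry("specimen_laterality", "Left")
rejected = validate_entry("specimen_laterality", "Sideways")
```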

Dynamic Extensions - Details

• For defining DEs, SI 2.0 will support use of NCIt concepts, and also non-NCI semantic representations such as community ontologies (NCBO BioPortal ontologies, OBO Foundry, etc.).

• Upon creation of DEs, the version of a user's model will be incremented. SI 2.0 will prompt the runtime registry to reload the new version of the model, so discovery can be based on the new extended model.

• (Possibly) Entity-Attribute-Value (EAV) or RDF triple representation, to avoid building and deploying a new data service.
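
The last two bullets can be sketched together: defining a DE increments the model version and notifies the runtime registry, while the data for the new attribute lands as EAV rows, so no new table or data service is needed. This is a minimal illustration under stated assumptions. `ModelStore`, `RuntimeRegistry`, and the attribute names are hypothetical and do not describe the actual SI 2.0 design.

```python
from dataclasses import dataclass, field


@dataclass
class ModelStore:
    """Hypothetical SI 2.0-side store: attribute definitions, EAV rows, a version."""
    version: int = 1
    attributes: dict = field(default_factory=dict)  # entity type -> set of attribute names
    rows: list = field(default_factory=list)        # (entity, attribute, value) tuples
    listeners: list = field(default_factory=list)   # callbacks run on each version bump

    def define_extension(self, entity_type, attribute):
        """Add a dynamic extension, bump the model version, notify listeners."""
        self.attributes.setdefault(entity_type, set()).add(attribute)
        self.version += 1
        for notify in self.listeners:
            notify(self.version)

    def record(self, entity_type, entity_id, attribute, value):
        """Store a value as an EAV row; only defined attributes are accepted."""
        if attribute not in self.attributes.get(entity_type, set()):
            raise KeyError(f"{attribute!r} is not defined for {entity_type}")
        self.rows.append((f"{entity_type}#{entity_id}", attribute, value))


class RuntimeRegistry:
    """Hypothetical registry that reloads the model when told of a new version."""

    def __init__(self):
        self.loaded_version = None

    def reload(self, version):
        self.loaded_version = version


store = ModelStore()
registry = RuntimeRegistry()
store.listeners.append(registry.reload)

store.define_extension("Specimen", "gleason_score")   # version bump, registry resync
store.record("Specimen", 42, "gleason_score", 7)      # plain EAV row, no redeploy
```

The trade-off is the usual one for EAV: adding attributes is cheap, but type checking and querying across attributes move from the schema into application code.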

AIM v3 Sample Annotations

<Calculation description="Linear Measurement" cagridId="0" mathML="" codeMeaning="Length" codeValue="G-A22A" codingSchemeDesignator="SRT" codingSchemeVersion="" uid="1.24897.57654138621.1646">
  <referencedCalculationCollection/>
  <calculationResultCollection>
    <CalculationData cagridId="0" value="28.8822383880615">
      <!-- excerpt ends here on the slide -->
    </CalculationData>
  </calculationResultCollection>
</Calculation>

AIM v3 Sample Annotations

<AnatomicEntity codeMeaning="LUNG" codeValue="REX0001" codingSchemeDesignator="RADLEX" cagridId="0" label="">

</AnatomicEntity>

AIM v3 Sample Annotations

<ImagingObservation codeMeaning="Conspicuity" codeValue="REX4001" codingSchemeDesignator="RADLEX" comment="" cagridId="0" label="">
  <imagingObservationCharacteristicCollection>
    <ImagingObservationCharacteristic codeMeaning="Very Obvious" codeValue="REX4006" codingSchemeDesignator="RADLEX" comment="" cagridId="0" label="" />
  </imagingObservationCharacteristicCollection>
</ImagingObservation>

AIM – Radiology / Pathology Imaging

• The metadata is in the data.

• We are annotating each data instance, rather than each class and attribute: “one annotation per object”.

• Entity-Attribute-Value representation

• RDF triples / SPARQL endpoint
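
To make “the metadata is in the data” concrete, the sketch below flattens an annotation, adapted from the AIM v3 samples earlier in the deck, into Entity-Attribute-Value rows using only the Python standard library. The `to_triples` function and the `Tag/index` entity identifiers are assumptions made for illustration. A real RDF export would mint URIs and use an RDF library instead.

```python
import xml.etree.ElementTree as ET

# Fragment adapted from the AIM v3 sample annotations.
AIM_FRAGMENT = """
<ImagingObservation codeMeaning="Conspicuity" codeValue="REX4001" codingSchemeDesignator="RADLEX" comment="" cagridId="0" label="">
  <imagingObservationCharacteristicCollection>
    <ImagingObservationCharacteristic codeMeaning="Very Obvious" codeValue="REX4006" codingSchemeDesignator="RADLEX" comment="" cagridId="0" label="" />
  </imagingObservationCharacteristicCollection>
</ImagingObservation>
"""


def to_triples(xml_text):
    """Flatten every element's non-empty attributes into (entity, attribute, value) rows."""
    root = ET.fromstring(xml_text)
    triples = []
    for index, element in enumerate(root.iter()):
        entity = f"{element.tag}/{index}"  # hypothetical local identifier
        for attribute, value in element.attrib.items():
            if value:  # skip empty strings such as comment="" and label=""
                triples.append((entity, attribute, value))
    return triples


triples = to_triples(AIM_FRAGMENT)
for row in triples:
    print(row)
```

Each row carries its own coding scheme and code, so the annotation travels with the data instance instead of living only in a class-level registry.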

Semantic Infrastructure 2.0 Timeline (Tentative)

• Nov 2010 – Jan 2011: “Inception Activities” – Risk Mitigation:
  • Data Migration
  • Semantic Workflow Composition
• Summer 2011: SI 2.0 repository initial release; data migration starts
• Fall 2011: Knowledge Management Service
• Winter 2011: Governance Service
• Spring 2012: Tools
• Summer 2012: Decision Support / Analysis Service

Use Case – Reuse Metadata (Interim Strategy)

I'm in a Prostate SPORE group, and we have a list of new data elements that we want to add to caTissue (or other caBIG tools). As I fill out a dynamic extension form in caTissue, I'd like to find out which data elements already exist and query them from within caTissue. How can we use a vocabulary service to send "prostate cancer" (or a similar term) and get back a drop-down list of values that are in caDSR and thus registered in NCI Thesaurus? In this very simple example, the list would include "prostate carcinoma".

“CDE Curation Tool” (Querying now open to public)

http://cdecurate.nci.nih.gov
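
The lookup described in this use case can be sketched as a synonym-aware search that returns preferred terms for a drop-down. This is a toy illustration only: the `VOCABULARY` table is a hypothetical in-memory stand-in, and `suggest_terms` is not an API of caDSR, NCI Thesaurus, or the CDE Curation Tool.

```python
# Hypothetical in-memory vocabulary: preferred term -> synonyms.
# A real deployment would query a terminology service instead.
VOCABULARY = {
    "Prostate Carcinoma": ["prostate cancer", "cancer of the prostate"],
    "Prostate Adenocarcinoma": ["adenocarcinoma of the prostate"],
    "Breast Carcinoma": ["breast cancer"],
}


def suggest_terms(query):
    """Return preferred terms whose name or synonyms contain the query (case-insensitive)."""
    needle = query.strip().lower()
    if not needle:
        return []
    matches = []
    for preferred, synonyms in VOCABULARY.items():
        candidates = [preferred.lower()] + [s.lower() for s in synonyms]
        if any(needle in candidate for candidate in candidates):
            matches.append(preferred)
    return sorted(matches)


suggestions = suggest_terms("prostate cancer ")
```

With this toy table, the query "prostate cancer" resolves through a synonym to the preferred term "Prostate Carcinoma", which is what the form would offer in its drop-down.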

Semantic Infrastructure 2.0 Roadmap

• SI 2.0 Roadmap public website:
  • https://wiki.nci.nih.gov/x/IRnDAQ
• SI 2.0 Roadmap, Sep 6, 2010 (next release Nov 19, 2010):
  • https://wiki.nci.nih.gov/x/vw-0AQ (online version)
  • https://wiki.nci.nih.gov/download/attachments/29563169/CCBIIT_Semantic_Infrastructure_2.0_Roadmap_Sept_6_2010.pdf (pdf version)
• SI 2.0 Roadmap community input form:
  • https://wiki.nci.nih.gov/download/attachments/29563169/Semantic_Infrastructure_Community_Input_Form.xlsx