
Page 1

Cancer Bioinformatics Grid (caBIG) CANS 2006

Chicago, Illinois

Shannon Hastings, hastings@bmi.osu.edu

Department of Biomedical Informatics, Ohio State University

Page 2

National Cancer Institute 2015 Goal

Relieve suffering and death due to cancer by the year 2015

Page 3

Cancer Biomedical Informatics Grid (caBIG™)

The cancer Biomedical Informatics Grid (caBIG™) is a voluntary network, or grid, connecting individuals and institutions to enable the sharing of data and tools, creating a World Wide Web of cancer research. The goal is to speed the delivery of innovative approaches for the prevention and treatment of cancer. The infrastructure and tools created by caBIG™ also have broad utility outside the cancer community.

National Cancer Institute Initiative

Over 800 Participants

Over 80 Organizations

Over 70 Projects

Page 4

Origins of caBIG

Need: Enable investigators and research teams nationwide to combine and leverage their findings and expertise in order to meet the NCI 2015 Goal.

Strategy: Create a scalable, actively managed organization that will connect members of the NCI-supported cancer enterprise by building a biomedical informatics network.

Page 5

caBIG Community Organization

Page 6

caBIG Overview

Common, widely distributed infrastructure that permits the cancer research community to focus on innovation

Shared, harmonized set of terminology, data elements, and data models that facilitate information exchange

Collection of interoperable applications developed to common standards

Cancer research data is available for mining and integration

Page 7

Interoperability

The ability of multiple systems to exchange information and to be able to use the information that has been exchanged.

Syntactic interoperability

Semantic interoperability
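The difference between the two kinds of interoperability can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the JSON message, the `tumor_size` field, and the `DATA_ELEMENTS` dictionary (a toy stand-in for a shared data-element definition) do not come from caBIG. Parsing the message is the syntactic step; using the shared definition to interpret the value is the semantic step.

```python
import json

# Hypothetical message exchanged between two systems. Both can parse it
# (syntactic interoperability), but the number alone does not say what
# "tumor_size" means or which unit it uses.
MESSAGE = json.dumps({"patient_id": "P-001", "tumor_size": 2.5})

# Hypothetical shared data-element definition. Agreeing on this is what
# lets the receiver use the value correctly (semantic interoperability).
DATA_ELEMENTS = {
    "tumor_size": {"unit": "cm", "definition": "Longest tumor diameter"},
}

MM_PER_UNIT = {"cm": 10.0, "mm": 1.0}


def parse(message: str) -> dict:
    """Syntactic step: decode the exchanged message into a record."""
    return json.loads(message)


def tumor_size_mm(record: dict) -> float:
    """Semantic step: interpret the value using the shared definition."""
    unit = DATA_ELEMENTS["tumor_size"]["unit"]
    return record["tumor_size"] * MM_PER_UNIT[unit]
```

Without the shared definition, a receiver that assumed millimeters would read the same well-formed message as a tumor ten times smaller.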

Page 8

caBIG Compatibility Guidelines

[Figure: caBIG compatibility guidelines, with one category labeled SYNTACTIC and three categories labeled SEMANTIC.]

Page 9

What is caGrid?

Development project of the Architecture Workspace, aimed at helping define and implement Gold compliance (the highest level of caBIG compatibility)

Gold compliance creates the G in caBIG: Gold => Grid => connecting Silver-compliant systems

Gold compliance places no requirements on implementation technology

Specifications will be created defining requirements for interoperability

caGrid provides core infrastructure and tooling to provide “a way” to achieve Gold compliance

Page 10

caGrid Conceptual View

[Figure: NCICB and research centers expose tools and resources (microarray data, caArray, a gene database, a protein database, images, and Tools 1 to 4) as Grid Data Services and Analytical Services. A grid-enabled client and a grid portal reach them through the Grid Services Infrastructure (metadata, registry, query, invocation, security, etc.).]

Page 11

caGrid Data Description Infrastructure

Client and service APIs are object oriented, and operate over well-defined and curated data types

Objects are defined in UML and converted into ISO/IEC 11179 Administered Components, which are in turn registered in the Cancer Data Standards Repository (caDSR)

Object definitions draw from vocabulary registered in the Enterprise Vocabulary Services (EVS), and their relationships are thus semantically described

XML serialization of objects adheres to XML schemas registered in the Global Model Exchange (GME)
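The last point, objects serializing to XML that conforms to a registered schema, can be sketched in a few lines. This is a rough illustration under stated assumptions, not caGrid code: the `Gene` type, the namespace string, and the in-memory `SCHEMA_REGISTRY` are hypothetical, and the structural check is a much weaker stand-in for real XML Schema validation against a schema fetched from the GME.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, asdict

# Hypothetical namespace and in-memory registry standing in for the GME.
# A real deployment would retrieve an XML Schema and validate against it.
GENE_NS = "gme://example.org/1.0/gene"
SCHEMA_REGISTRY = {GENE_NS: {"Gene": {"symbol", "name"}}}


@dataclass
class Gene:
    """Hypothetical curated data type."""
    symbol: str
    name: str


def serialize(gene: Gene) -> str:
    """Serialize the object to namespace-qualified XML."""
    root = ET.Element(f"{{{GENE_NS}}}Gene")
    for field, value in asdict(gene).items():
        ET.SubElement(root, f"{{{GENE_NS}}}{field}").text = value
    return ET.tostring(root, encoding="unicode")


def conforms(xml_text: str) -> bool:
    """Check the root namespace and child element names against the registry."""
    root = ET.fromstring(xml_text)
    if not root.tag.startswith("{"):
        return False  # no namespace, so no registered schema applies
    namespace, _, local_name = root.tag[1:].partition("}")
    expected = SCHEMA_REGISTRY.get(namespace, {}).get(local_name)
    if expected is None:
        return False
    children = {child.tag.partition("}")[2] for child in root}
    return children == expected
```

Keying the registry by namespace mirrors the idea that a client can look up how to interpret a document from the document itself, without prior knowledge of the service that produced it.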

[Figure: a Grid Client uses a Client API to call a Grid Service, which exposes a Service API. The Service Definition (WSDL) and Data Type Definitions (XSD) describe the service. Objects serialize to XML objects that validate against XSDs registered in the Global Model Exchange (GME). Object definitions are registered in the Cancer Data Standards Repository and semantically described in the Enterprise Vocabulary Services. The caDSR, EVS, and GME are core services.]

Page 12

Conceptual View of the Problem

[Figure: a Grid Client (Client API) and a Grid Service (Service API) described by a Service Definition (WSDL) and Data Type Definitions (XSD). Objects serialize to XML objects that validate against XSDs registered in the Global Model Exchange (GME). Object definitions are registered in the Cancer Data Standards Repository and semantically described in the Enterprise Vocabulary Services.]

Page 13

caGrid Components

Leverage existing technologies: caDSR, EVS, and Mobius GME (common data elements, controlled vocabularies, schema management); Globus Toolkit, currently version 4.0.3 (core grid services infrastructure: service deployment, service registry, invocation, base security infrastructure)

Additional Core Infrastructure: higher-level security services (Dorian, GTS, GridGrouper); grid service access to metadata components (caDSR, GME, etc.); workflow and identifier services

Service Provider Tooling (Introduce): graphical service development and configuration environment; abstractions from service infrastructure for Data and Analytical services; deployment wizards

Client Tooling: high-level APIs for interacting with core components and services; graphical tools

Page 14

Grid Authentication and Authorization with Reliably Distributed Services (GAARDS)

Security Infrastructure for the Cancer Biomedical Informatics Grid (caBIG™)

The GAARDS Security Infrastructure provides services and tools for the administration and enforcement of security policy in an enterprise Grid.

Developed on top of the Globus Toolkit

Extends the Grid Security Infrastructure (GSI)

Provides enterprise services and administrative tools for: grid user management; identity federation; trust management; group/VO management; access control policy management and enforcement; integration between existing security domains and the grid security domain

Page 15

GAARDS Services

Dorian: Grid user account management. Integration point between external security domains and the grid. Allows accounts managed in external domains to be federated and managed in the grid. Dorian allows users to use their existing credentials (external to the grid) to authenticate to the grid.

Grid Trust Service (GTS): Creation and management of a federated trust fabric. Supports applications and services in deciding whether or not signers of digital credentials/user attributes can be trusted. Supports the provisioning of trusted certificate authorities and corresponding CRLs.

Grid Grouper: Group management service for the grid. Provides a group-based authorization solution for the Grid. Enforces authorization policy based on membership in groups.

[Figure: GAARDS Security Infrastructure. Authentication: users authenticate locally with registered, trusted identity providers (OSU, Duke, NCI) and obtain grid credentials from Dorian services. Grid Trust Fabric: Grid Trust Service (GTS) instances handle certificate/CRL publishing for certificate authorities and authentication services, and grid services use them to validate trust and authenticate callers. Authorization: when a grid service is invoked, access control policy is enforced through the Common Security Module (CSM) and Grid Grouper membership lookups.]
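The sequence described above (local authentication, obtaining grid credentials, trust validation, membership lookup, invocation) can be traced with a toy model. This is only a sketch under loud assumptions: the class and method names below are invented for illustration and are not the GAARDS APIs, passwords stand in for whatever the local identity provider uses, and a plain record stands in for a signed digital credential.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridCredential:
    """Toy stand-in for a signed grid credential."""
    identity: str
    issuer: str


class Dorian:
    """Toy stand-in: turns a local login into a grid credential."""

    def __init__(self, name: str, local_users: dict):
        self.name = name
        self._local_users = local_users  # username -> password (hypothetical)

    def obtain_grid_credential(self, username: str, password: str) -> GridCredential:
        if self._local_users.get(username) != password:
            raise PermissionError("local authentication failed")
        return GridCredential(identity=f"{self.name}/{username}", issuer=self.name)


class GridTrustService:
    """Toy stand-in: answers whether a credential's signer is trusted."""

    def __init__(self, trusted_issuers: set):
        self._trusted = set(trusted_issuers)

    def is_trusted(self, credential: GridCredential) -> bool:
        return credential.issuer in self._trusted


class GridGrouper:
    """Toy stand-in: answers group membership lookups."""

    def __init__(self, groups: dict):
        self._groups = groups  # group name -> set of identities

    def is_member(self, identity: str, group: str) -> bool:
        return identity in self._groups.get(group, set())


def invoke(credential, gts, grouper, required_group: str) -> str:
    """Service-side check: validate trust first, then authorize by group."""
    if not gts.is_trusted(credential):
        return "rejected: untrusted issuer"
    if not grouper.is_member(credential.identity, required_group):
        return "rejected: not in group"
    return "ok"
```

Keeping the trust decision and the group decision in separate components reflects the split on the slide: the service itself only asks two questions and leaves policy administration to the shared services.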

Page 16

Accessing caGrid workflow

[Figure: a researcher submits a BPEL workflow document and workflow inputs to the Workflow Management Service. Its BPEL engine invokes a data service at uchicago.edu and analytic services at osu.edu and duke.edu, and the workflow results are returned to the researcher.]

Workflow management service: sharing workflows; get workflow status
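As a rough sketch of what a workflow management service does, the toy below runs a chain of steps and records a status that can be queried afterwards. It is an illustration under assumptions, not BPEL and not the caGrid workflow service: the three step functions, the `WorkflowManager` class, and the status strings are all hypothetical, and the steps are local functions rather than remote grid services.

```python
from typing import Any, Callable, Dict, List

Step = Callable[[Any], Any]


def data_service(query: str) -> List[float]:
    """Hypothetical data service: returns values for a query."""
    return [4.0, 8.0, 6.0]


def normalize(values: List[float]) -> List[float]:
    """Hypothetical analytic service: scale values by the maximum."""
    top = max(values)
    return [value / top for value in values]


def summarize(values: List[float]) -> float:
    """Hypothetical analytic service: compute the mean."""
    return sum(values) / len(values)


class WorkflowManager:
    """Toy stand-in for a workflow management service."""

    def __init__(self) -> None:
        self._status: Dict[str, str] = {}
        self._results: Dict[str, Any] = {}

    def run(self, workflow_id: str, steps: List[Step], inputs: Any) -> Any:
        """Feed each step's output into the next and record the outcome."""
        self._status[workflow_id] = "RUNNING"
        value = inputs
        try:
            for step in steps:
                value = step(value)
        except Exception:
            self._status[workflow_id] = "FAILED"
            raise
        self._status[workflow_id] = "COMPLETED"
        self._results[workflow_id] = value
        return value

    def get_status(self, workflow_id: str) -> str:
        return self._status.get(workflow_id, "UNKNOWN")
```

A call such as `WorkflowManager().run("wf-1", [data_service, normalize, summarize], "gene=TP53")` chains the data service into the two analytic steps, and `get_status("wf-1")` can then be queried separately.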

Page 17

Introduce Graphical Development Environment (GDE)

GUI for creating and manipulating a grid service

Provides a simple means of creating a service skeleton that a developer can then implement, build, and deploy

Automatic code generation of a complete caBIG-compliant grid service, which is configured to provide: security, advertisement, discovery, and a complete client API

Provides a set of tools that enable the developer to add, remove, modify, and import methods of the service, as well as create sub-services

Automatic generation of all the required code: Globus grid service code and configuration, service configuration, implementation of the client, and a stubbed implementation of the service

Page 18

Introduce Generated Grid Service Architecture

Base service is a GT4-based, WSRF-capable grid service.

Utilize compositional inheritance (in lieu of non-standard port type extensions) to enable the service to inherit required features such as providing service security metadata and access to resource properties.

Utilize JNDI for server-side configuration properties, resources, and resource properties.

Provide client-side and service-side wrappers which implement the service designer's interface as opposed to the document literal interface generated by Axis.

Provide metadata registration to the index service by configuring the Resource to register its service groups with a predefined caGrid MDS-based Index Service.
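The wrapper idea, presenting the designer's interface instead of the document literal one, can be sketched as follows. This is an illustration in Python rather than the Java that Introduce generates, and every name in it is hypothetical: `GeneratedStub` stands in for a generated document literal stub (one request document in, one response document out), and `GeneServiceClient` stands in for the wrapper a developer would actually call.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    """Hypothetical domain type from the service designer's interface."""
    symbol: str
    organism: str


class GeneratedStub:
    """Stand-in for a document literal stub: documents in, documents out."""

    def get_gene(self, request: dict) -> dict:
        symbol = request["GetGeneRequest"]["symbol"]
        return {"GetGeneResponse": {"gene": {"symbol": symbol, "organism": "human"}}}


class GeneServiceClient:
    """Wrapper exposing the designer's interface: typed arguments and results."""

    def __init__(self, stub: GeneratedStub):
        self._stub = stub

    def get_gene(self, symbol: str) -> Gene:
        # Wrap the argument in the request document the stub expects.
        response = self._stub.get_gene({"GetGeneRequest": {"symbol": symbol}})
        # Unwrap the response document into the domain type.
        return Gene(**response["GetGeneResponse"]["gene"])
```

With the wrapper in place, calling code deals only with `get_gene("TP53")` and a `Gene` result, and the request and response envelopes stay an implementation detail.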

Page 19

Collaborating Architects and Developers

Ohio State University, Argonne National Lab, Duke University, Georgetown University, Semantic Bits

Page 20

Project Resources and Communication

caBIG at NCI: http://cabig.nci.nih.gov

Globus Dev: http://dev.globus.org

caGrid 1.0 GForge Home (feature requests, bug reports, discussion forums, public wiki, quality dashboards, downloads / source repository): http://gforge.nci.nih.gov/projects/cagrid-1-0/

caGrid Users Mailing List: https://list.nih.gov/archives/cagrid_users-l.html [email protected]
