Cancer Bioinformatics Grid (caBIG) CANS 2006
Chicago, Illinois
Shannon Hastings [email protected]
Department of Biomedical Informatics, Ohio State University
National Cancer Institute 2015 Goal
Relieve suffering and death due to cancer by the year 2015
Cancer Biomedical Informatics Grid (caBIG™)
The cancer Biomedical Informatics Grid (caBIG™) is a voluntary network or grid connecting individuals and institutions to enable the sharing of data and tools, creating a World Wide Web of cancer research. The goal is to speed the delivery of innovative approaches for the prevention and treatment of cancer. The infrastructure and tools created by caBIG™ also have broad utility outside the cancer community.
National Cancer Institute Initiative
Over 800 Participants
Over 80 Organizations
Over 70 Projects
Origins of caBIG
Need: Enable investigators and research teams nationwide to combine and leverage their findings and expertise in order to meet NCI 2015 Goal.
Strategy: Create a scalable, actively managed organization that will connect members of the NCI-supported cancer enterprise by building a biomedical informatics network.
caBIG Community Organization
caBIG Overview
Common, widely distributed infrastructure that permits the cancer research community to focus on innovation
Shared, harmonized set of terminology, data elements, and data models that facilitate information exchange
Collection of interoperable applications developed to common standards
Cancer research data is available for mining and integration
Interoperability
The ability of multiple systems to exchange information and to be able to use the information that has been exchanged.
Syntactic interoperability
Semantic interoperability
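As a rough illustration of the distinction (this is not caBIG code; the field names, local codes, and concept codes are all made up), two systems can share a message format and still disagree on meaning until both map their local values onto a shared vocabulary:

```python
import json

# Hypothetical shared vocabulary: concept code -> meaning.
SHARED_VOCABULARY = {"C-0001": "female", "C-0002": "male"}

# Each site's local coding mapped to the shared concept codes (made-up values).
SITE_A_TO_CONCEPT = {"F": "C-0001", "M": "C-0002"}
SITE_B_TO_CONCEPT = {"1": "C-0001", "2": "C-0002"}


def parse(message: str) -> dict:
    """Syntactic interoperability: both sides can parse the same format."""
    return json.loads(message)


def meaning(record: dict, local_to_concept: dict) -> str:
    """Semantic interoperability: local values resolve to shared concepts."""
    return SHARED_VOCABULARY[local_to_concept[record["sex"]]]


record_a = parse('{"patient": "p1", "sex": "F"}')
record_b = parse('{"patient": "p2", "sex": "1"}')

# The raw values differ, so a naive comparison fails...
raw_match = record_a["sex"] == record_b["sex"]
# ...but both records resolve to the same shared concept.
semantic_match = (
    meaning(record_a, SITE_A_TO_CONCEPT) == meaning(record_b, SITE_B_TO_CONCEPT)
)
```

Parsing succeeds for both records, which is the syntactic part; only the vocabulary mapping makes the two values comparable.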
[Figure labels: SYNTACTIC, SEMANTIC, SEMANTIC, SEMANTIC]
caBIG Compatibility Guidelines
What is caGrid?
Development project of the Architecture Workspace, aimed at helping define and implement Gold Compliance (the highest level of caBIG compatibility)
Gold compliance creates the G in caBIG
Gold => Grid => connecting Silver Compliant Systems
No requirement on implementation technology is necessary for Gold compliance
Specifications will be created defining requirements for interoperability
caGrid provides core infrastructure and tooling to provide “a way” to achieve Gold compliance
caGrid Conceptual View
[Figure: NCICB and research centers expose tools and data sources (microarray data via caArray, a gene database, a protein database, images) as Grid Data Services and Analytical Services. Grid-enabled clients and a Grid Portal reach them through the Grid Services Infrastructure (Metadata, Registry, Query, Invocation, Security, etc.).]
caGrid Data Description Infrastructure
Client and service APIs are object oriented, and operate over well-defined and curated data types
Objects are defined in UML and converted into ISO/IEC 11179 Administered Components, which are in turn registered in the Cancer Data Standards Repository (caDSR)
Object definitions draw from vocabulary registered in the Enterprise Vocabulary Services (EVS), and their relationships are thus semantically described
XML serialization of objects adheres to XML schemas registered in the Global Model Exchange (GME)
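The idea of objects serializing to XML that is checked against a registered schema can be sketched roughly as follows. This is only an illustration under stated assumptions: the namespace, the `Gene` class, and the `SCHEMA_REGISTRY` dictionary are hypothetical stand-ins, and the structural check below is far simpler than real XSD validation against a schema held in the GME.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields

# Hypothetical namespace; a real deployment would use one registered in the GME.
GENE_NS = "gme://example.org/1.0/gene"

# Stand-in for a schema registry: namespace -> element -> required children.
# Real validation would use the registered XSD, not this simplified check.
SCHEMA_REGISTRY = {GENE_NS: {"Gene": ["symbol", "taxon"]}}


@dataclass
class Gene:
    symbol: str
    taxon: str


def serialize(gene: Gene) -> str:
    """Serialize the object to namespace-qualified XML."""
    root = ET.Element(f"{{{GENE_NS}}}Gene")
    for f in fields(gene):
        child = ET.SubElement(root, f"{{{GENE_NS}}}{f.name}")
        child.text = getattr(gene, f.name)
    return ET.tostring(root, encoding="unicode")


def conforms(xml_text: str) -> bool:
    """Check the root namespace is registered and the children match."""
    root = ET.fromstring(xml_text)
    if not root.tag.startswith("{"):
        return False  # no namespace, so no registered schema applies
    namespace, _, local_name = root.tag[1:].partition("}")
    required = SCHEMA_REGISTRY.get(namespace, {}).get(local_name)
    if required is None:
        return False
    present = [child.tag.partition("}")[2] for child in root]
    return present == required


gene_xml = serialize(Gene(symbol="TP53", taxon="Homo sapiens"))
```

A consumer that receives `gene_xml` can look up its namespace and decide whether the payload has the agreed structure before using it.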
[Figure: A Grid Service exposes a Service API and a Grid Client uses a Client API. The service definition is given in WSDL and the data type definitions in XSD. Objects serialize to XML that validates against XSD registered in the Global Model Exchange (GME). Object definitions are registered in the Cancer Data Standards Repository and semantically described in the Enterprise Vocabulary Services.]
Conceptual View of the Problem
[Figure: the same service, client, and core services diagram shown under caGrid Data Description Infrastructure.]
caGrid Components
Leverage existing technologies:
caDSR, EVS, Mobius GME: common data elements, controlled vocabularies, schema management
Globus Toolkit (currently version 4.0.3): core grid services infrastructure (service deployment, service registry, invocation, base security infrastructure)
Additional Core Infrastructure: higher-level security services (Dorian, GTS, GridGrouper); grid service access to metadata components (caDSR, GME, etc.); workflow and identifier services
Service Provider Tooling (Introduce): graphical service development and configuration environment; abstractions from service infrastructure for Data and Analytical services; deployment wizards
Client Tooling: high-level APIs for interacting with core components and services; graphical tools
Grid Authentication and Authorization with Reliably Distributed Services (GAARDS)
The GAARDS Security Infrastructure provides services and tools for the administration and enforcement of security policy in an enterprise Grid.
Developed on top of the Globus Toolkit
Extends the Grid Security Infrastructure (GSI)
Provides enterprise services and administrative tools for: Grid User Management; Identity Federation; Trust management; Group/VO management; Access Control Policy management and enforcement; integration between existing security domains and the grid security domain.
Security Infrastructure for the Cancer Biomedical Informatics Grid (caBIG™)
GAARDS Services
Dorian: Grid User Account Management. Integration point between external security domains and the grid. Allows accounts managed in external domains to be federated and managed in the grid. Dorian allows users to use their existing credentials (external to the grid) to authenticate to the grid.
Grid Trust Service (GTS): Creation and management of a federated trust fabric. Supports applications and services in deciding whether or not signers of digital credentials/user attributes can be trusted. Supports the provisioning of trusted certificate authorities and corresponding CRLs.
Grid Grouper: Group management service for the grid. Provides a group-based authorization solution for the Grid. Enforces authorization policy based on membership in groups.
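How a trust decision and a group-based authorization decision might combine can be sketched as below. This is a minimal sketch, not the GTS or Grid Grouper API: the class names, the issuer name, and the group name are hypothetical, and no real certificate or signature checking is performed.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Credential:
    subject: str
    issuer: str  # name of the certificate authority that signed it
    serial: int


@dataclass
class TrustFabric:
    """Stand-in for the trusted CAs and CRLs a trust service would provision."""
    trusted_issuers: set = field(default_factory=set)
    revoked: set = field(default_factory=set)  # (issuer, serial) pairs, like a CRL

    def is_trusted(self, cred: Credential) -> bool:
        return (
            cred.issuer in self.trusted_issuers
            and (cred.issuer, cred.serial) not in self.revoked
        )


@dataclass
class GroupRegistry:
    """Stand-in for a group management service."""
    members: dict = field(default_factory=dict)  # group name -> set of subjects

    def is_member(self, subject: str, group: str) -> bool:
        return subject in self.members.get(group, set())


def authorize(cred: Credential, fabric: TrustFabric,
              groups: GroupRegistry, required_group: str) -> bool:
    """Allow access only for trusted credentials whose subject is in the group."""
    return fabric.is_trusted(cred) and groups.is_member(cred.subject, required_group)


fabric = TrustFabric(trusted_issuers={"Example CA"}, revoked={("Example CA", 7)})
groups = GroupRegistry(members={"researchers": {"alice"}})

alice = Credential(subject="alice", issuer="Example CA", serial=1)
revoked_alice = Credential(subject="alice", issuer="Example CA", serial=7)
bob = Credential(subject="bob", issuer="Example CA", serial=2)
stranger = Credential(subject="alice", issuer="Unknown CA", serial=1)
```

Here `alice` is authorized for the `researchers` group, while a revoked credential, a non-member, and a credential from an untrusted issuer are all refused.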
[Figure: GAARDS Security Infrastructure. Authentication: users authenticate locally with registered trusted identity providers (OSU, Duke, NCI) and obtain grid credentials from Dorian services. Grid Trust Fabric: Grid Trust Service (GTS) instances handle certificate/CRL publishing for certificate authorities and authentication services, and grid services use the fabric to validate trust and authenticate callers. Authorization: when a grid service is invoked, access control policy is enforced through the Common Security Module (CSM) and membership lookups against Grid Grouper services.]
Accessing caGrid workflow
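[Figure: A researcher submits a BPEL workflow document and workflow inputs to the Workflow Management Service. The BPEL engine invokes a data service at uchicago.edu and analytic services at osu.edu and duke.edu, and the workflow results are returned to the researcher.]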
Workflow management service; sharing workflows; get workflow status
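The submit, status, and results interaction with a workflow management service can be pictured with a small in-memory sketch. Everything here is an assumption for illustration: the `WorkflowManager` class and the three step functions are hypothetical, the steps are plain local functions rather than remote grid services, and the workflow runs synchronously, unlike a real BPEL engine.

```python
import uuid
from typing import Callable, Dict, List


class WorkflowManager:
    """Runs a chain of steps and tracks status per workflow id."""

    def __init__(self) -> None:
        self._status: Dict[str, str] = {}
        self._results: Dict[str, object] = {}

    def submit(self, steps: List[Callable], inputs: object) -> str:
        workflow_id = str(uuid.uuid4())
        self._status[workflow_id] = "RUNNING"
        try:
            value = inputs
            for step in steps:
                value = step(value)  # output of one step feeds the next
        except Exception:
            self._status[workflow_id] = "FAILED"
        else:
            self._results[workflow_id] = value
            self._status[workflow_id] = "DONE"
        return workflow_id

    def status(self, workflow_id: str) -> str:
        return self._status.get(workflow_id, "UNKNOWN")

    def results(self, workflow_id: str) -> object:
        return self._results[workflow_id]


# Hypothetical stand-ins for one data service and two analytic services.
def query_data_service(query: str) -> List[float]:
    return [4.0, 8.0, 12.0]


def normalize(values: List[float]) -> List[float]:
    top = max(values)
    return [v / top for v in values]


def summarize(values: List[float]) -> dict:
    return {"count": len(values), "mean": sum(values) / len(values)}


manager = WorkflowManager()
workflow_id = manager.submit([query_data_service, normalize, summarize], "expression data")
summary = manager.results(workflow_id)
```

The caller only holds the workflow id, which is what makes it possible to ask for status later or hand the id to a colleague.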
Introduce Graphical Development Environment (GDE)
GUI for creating and manipulating a grid service
Provides a simple means of creating a service skeleton that a developer can then implement, build, and deploy
Automatic code generation of a complete caBIG-compliant grid service, configured to provide: security, advertisement, discovery, and a complete client API
Provides a set of tools which enable the developer to add/remove/modify/import methods of the service as well as create sub-services.
Automatic generation of all the required code: Globus grid service code/configuration, service configuration, implementation of the client, and a stubbed implementation of the service
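The notion of generating a stubbed service implementation from a description of its methods can be shown with a toy generator. This is not how Introduce itself works (it generates Java and Globus artifacts); the function name, the service name, and the method names below are hypothetical.

```python
from typing import Dict, List


def generate_service_stub(service_name: str, methods: Dict[str, List[str]]) -> str:
    """Return source text for a skeleton class with one stub per method."""
    lines = [
        f"class {service_name}Impl:",
        f'    """Generated skeleton for {service_name}; fill in the method bodies."""',
    ]
    for name, params in methods.items():
        signature = ", ".join(["self"] + params)
        lines.append("")
        lines.append(f"    def {name}({signature}):")
        lines.append(f'        raise NotImplementedError("{name} is not implemented yet")')
    return "\n".join(lines) + "\n"


source = generate_service_stub(
    "GeneLookup", {"find_gene": ["symbol"], "list_taxa": []}
)

# Load the generated skeleton to show that it is valid code.
namespace: dict = {}
exec(source, namespace)
stub = namespace["GeneLookupImpl"]()
```

Calling `stub.find_gene("TP53")` raises `NotImplementedError` until the developer supplies the body, which mirrors the implement, build, and deploy cycle described above.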
Introduce Generated Grid Service Architecture
Base service is a GT4-based, WSRF-capable grid service.
Utilize compositional inheritance (in lieu of non-standard port type extensions) to enable the service to inherit required features such as providing service security metadata and access to resource properties.
Utilize JNDI for server side configuration properties, and resources and resource properties.
Provide client- and service-side wrappers which implement the service designer's interface as opposed to the document literal interface generated by Axis.
Provide metadata registration to the index service by configuring the Resource to register its service groups to a predefined caGrid MDS-based Index Service.
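Two of these ideas, wrapping a document-style stub behind the designer's interface and gaining features by composition instead of inheritance, can be sketched as follows. The classes, the request and response document shapes, and the metadata values are hypothetical and only mimic the pattern; they are not the code that Axis or Introduce generates.

```python
class DocumentLiteralStub:
    """Stand-in for a generated stub: one request document in, one response out."""

    def invoke(self, document: dict) -> dict:
        symbol = document["findGeneRequest"]["symbol"]
        return {"findGeneResponse": {"name": f"gene:{symbol}"}}


class GeneLookupClient:
    """Client wrapper exposing the designer's method instead of raw documents."""

    def __init__(self, stub: DocumentLiteralStub) -> None:
        self._stub = stub

    def find_gene(self, symbol: str) -> str:
        response = self._stub.invoke({"findGeneRequest": {"symbol": symbol}})
        return response["findGeneResponse"]["name"]


class SecurityMetadataProvider:
    """Reusable feature that a service gains by composition, not inheritance."""

    def get_service_security_metadata(self) -> dict:
        return {"transport": "https", "anonymous_allowed": False}


class GeneLookupService:
    """Service that delegates the shared feature to a composed provider."""

    def __init__(self, security: SecurityMetadataProvider) -> None:
        self._security = security

    def get_service_security_metadata(self) -> dict:
        return self._security.get_service_security_metadata()


client = GeneLookupClient(DocumentLiteralStub())
gene_name = client.find_gene("TP53")

service = GeneLookupService(SecurityMetadataProvider())
security_metadata = service.get_service_security_metadata()
```

The caller writes `client.find_gene("TP53")` and never builds the request document by hand, and the service exposes security metadata without extending a common base type.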
Collaborating Architects and Developers
Ohio State University
Argonne National Lab
Duke University
Georgetown University
Semantic Bits
Project Resources and Communication
caBIG at NCI http://cabig.nci.nih.gov
Globus Dev http://dev.globus.org
caGrid 1.0 GForge Home (Feature Requests, Bug Reports, Discussion Forums, Public Wiki, Quality Dashboards, Downloads / Source Repository) http://gforge.nci.nih.gov/projects/cagrid-1-0/
caGrid Users Mailing List https://list.nih.gov/archives/cagrid_users-l.html [email protected]