caGrid 1.0 Service Infrastructure

ISMB/ECCB 2007 Bioinformatics Open Source Conference, Vienna, Austria, July 18, 2007. caGrid 1.0 Service Infrastructure. Avinash Shanbhag, Director, Core Infrastructure Engineering, National Cancer Institute Center for Biomedical Informatics and Information Technology, USA

Upload: bosc

Post on 11-May-2015


DESCRIPTION

Title: caGrid 1.0 Service Infrastructure

Author: Avinash Shanbhag

TRANSCRIPT

Page 1: CaGrid 1.0 Service Infrastructure

ISMB/ECCB 2007

Bioinformatics Open Source Conference Vienna, Austria

July 18, 2007

caGrid 1.0 Service Infrastructure

Avinash Shanbhag

Director, Core Infrastructure Engineering

National Cancer Institute Center for Biomedical Informatics and Information Technology

USA

Page 2: CaGrid 1.0 Service Infrastructure

Agenda

• High Level Overview
• caGrid Service Architecture
• Component Highlights
• Project Resources

Page 3: CaGrid 1.0 Service Infrastructure

What is caBIG?

• Common, widely distributed infrastructure that permits the cancer research community in the USA to focus on secure data sharing

• Shared, harmonized set of terminology, data elements, and data models that facilitate information exchange

• Collection of interoperable applications developed to common standards

• Cancer research data is available for mining and integration

Page 4: CaGrid 1.0 Service Infrastructure

caGrid – Service Infrastructure supporting caBIG

• Requirements:
  • Support scientific requirements: use cases from the cancer research community
  • Support functional requirements: identifiers, workflow, query, etc.
  • Support non-functional requirements: security, reliability, performance, etc.
• Principles:
  • Driven by cancer research community requirements
  • caBIG Principles
    • Open Source, Open Access, Open Development
    • Federated
    • Syntactic and Semantic Interoperability
  • Services-Oriented Architecture
  • Metadata driven and implements Virtualization
  • Standards based

Page 5: CaGrid 1.0 Service Infrastructure

History of caGrid

• 12/1/03: caGrid Concept Origin
• 7/1/04: Initial caGrid Prototype
• 8/31/05: caGrid 0.5 Release
• 10/7/05: caGrid 0.5.1 Release
• 11/15/05: caGrid 0.5.2 Release
• 1/25/06: caGrid 0.5.3 Release
• 5/12/06: caGrid 0.5.4 Release
• July 2006: caGrid 1.0 Beta Release
• December 2006: caGrid 1.0 Official Release

Page 6: CaGrid 1.0 Service Infrastructure

What is a Community Provided caGrid Service?

• Standardized, common pattern and mechanism for remote access
  • Language and implementation technology independent
• Common security infrastructure for authentication and authorization
• Standardized service metadata models and metadata advertisement mechanisms
• Community provided service types:
  • Data Services
    • Expose data to the grid in a unified way
  • Analytical Services
    • Expose analytical operations to the grid

Page 7: CaGrid 1.0 Service Infrastructure

caGrid Services - Strongly Typed and Semantically Rich

• Object Oriented APIs and data resources are developed using Object types and UML information models registered in the caDSR

• These systems are grid-enabled by defining a grid service interface that defines the functionality to be exposed to the grid

• The grid service interface uses the same Object types as the existing system, but leverages a platform and language neutral representation (XML) of them

• The grid service implementation maps service invocations to API calls or queries into the existing system
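
Illustrative note (not from the slides): the pattern described above, an existing object-oriented API fronted by a grid interface that exchanges XML representations of the same object types, might be sketched as follows. Everything here is an assumption made for illustration: `Gene`, `ExistingGeneApi`, `GeneGridService`, and the XML element names are hypothetical and are not caGrid, caDSR, or Globus APIs.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Gene:
    """Hypothetical object type shared by the existing API and the grid interface."""
    symbol: str
    chromosome: str


class ExistingGeneApi:
    """Stand-in for an existing object-oriented API (hypothetical)."""

    def __init__(self):
        self._genes = {"TP53": Gene("TP53", "17"), "BRCA1": Gene("BRCA1", "17")}

    def find_by_symbol(self, symbol):
        return self._genes.get(symbol)


def gene_to_xml(gene):
    """Serialize a Gene into a platform- and language-neutral XML string."""
    elem = ET.Element("Gene")
    ET.SubElement(elem, "symbol").text = gene.symbol
    ET.SubElement(elem, "chromosome").text = gene.chromosome
    return ET.tostring(elem, encoding="unicode")


class GeneGridService:
    """Grid-facing interface: takes and returns XML, delegates to the existing API."""

    def __init__(self, api):
        self._api = api

    def get_gene(self, request_xml):
        # Map the service invocation onto a call into the existing system.
        symbol = ET.fromstring(request_xml).findtext("symbol")
        gene = self._api.find_by_symbol(symbol)
        if gene is None:
            return "<NotFound/>"
        return gene_to_xml(gene)


service = GeneGridService(ExistingGeneApi())
response = service.get_gene("<GetGene><symbol>TP53</symbol></GetGene>")
print(response)
```

The point of the sketch is the separation of concerns: the existing API is untouched, and only the thin service layer knows about the XML form.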

Page 8: CaGrid 1.0 Service Infrastructure

caGrid Components

• Leverage existing technologies:
  • caDSR, EVS, Mobius GME: Common data elements, controlled vocabularies, schema management
  • Globus Toolkit (currently version 4.0.3)
    • Core grid services infrastructure
    • Service deployment, service registry, invocation, base security infrastructure
• Additional Core Infrastructure
  • Higher-level security services
  • Grid service access to metadata components (caDSR, EVS, GME, etc.)
  • Workflow, Identifier, Federated Query services
• Service Provider Tooling (Introduce)
  • Graphical service development and configuration environment
  • Abstractions from grid service infrastructure for Data and Analytical services
  • Deployment wizards
• Client Tooling
  • Installer
  • High-level APIs for interacting with core components and services
  • Graphical Tools (administration tools, sample applications, etc.)
• Production Deployment and Support of Infrastructure Services

Page 9: CaGrid 1.0 Service Infrastructure

caGrid Production Environment

Page 10: CaGrid 1.0 Service Infrastructure

caGrid Projects

• The caGrid release is oriented around a number of individual projects

• Build process manages inter-project dependencies

• Each project provides a specific set of functionality, and is self-contained once caGrid is built

• Grid Services:

• authentication-service, cadsr, dorian, evs, fqp, gme, gridgrouper, gts, index, syncgts, workflow, ws-naming, ws-transfer

• Grid Service Components and Extensions:

• authz, bulkDataTransfer, cabigextensions, data, sdkQuery, sdkQuery32, service-security-provider, ws-enum, ws-handlesystem

• Utilities and APIs:

• AntInstallerFramework, core, discovery, graph, gridca, metadata, metadatautils, opensaml

• Applications:

• installer, introduce, portal, security-ui

Page 11: CaGrid 1.0 Service Infrastructure

Metadata Services

• Cancer Data Standards Repository (caDSR)
  • caBIG projects register their data models as Common Data Elements (CDEs), which are semantically harmonized and then centrally stored and managed in the caDSR
  • The caDSR grid service provides:
    • Model discovery and traversal
    • caGrid standard metadata generation capabilities
• Enterprise Vocabulary Services (EVS)
  • EVS is a set of services and resources that address the need for controlled vocabulary
  • The EVS grid service provides:
    • Query access to the data semantics and controlled vocabulary managed by the EVS
• Global Model Exchange (GME)
  • GME is a DNS-like data definition registry and exchange service that is responsible for storing and linking together data models in the form of XML schema
  • The GME grid service provides:
    • Access to the authoritative structural representation of data types on the grid
• Globus Information Services: Index Service
  • The Globus Information Services infrastructure provides a generic framework for aggregation of service metadata, a registry of running Grid services, and a dynamic data-generating and indexing node, suitable for use in a hierarchy or federation of services
  • The Index grid service provides:
    • Yellow and white pages for the grid
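
Illustrative note (not from the slides): the idea of a registry that stores XML schemas by namespace and links them together can be sketched with a toy in-memory class. This is only a sketch under stated assumptions: `SchemaRegistry`, its methods, and the `gme://example/...` namespaces are hypothetical and do not reflect the actual GME interface.

```python
class SchemaRegistry:
    """Toy namespace-keyed schema registry (hypothetical; not the GME API)."""

    def __init__(self):
        self._schemas = {}  # namespace -> XML schema text
        self._imports = {}  # namespace -> namespaces it references

    def publish(self, namespace, schema_text, imports=()):
        """Store a schema; every imported namespace must already be published."""
        missing = [ns for ns in imports if ns not in self._schemas]
        if missing:
            raise KeyError(f"unknown imported namespaces: {missing}")
        self._schemas[namespace] = schema_text
        self._imports[namespace] = list(imports)

    def get_schema(self, namespace):
        """Return the stored schema text for a namespace."""
        return self._schemas[namespace]

    def resolve(self, namespace):
        """Return the namespace and its transitive imports, dependencies first."""
        ordered, seen = [], set()

        def visit(ns):
            if ns in seen:
                return
            seen.add(ns)
            for dep in self._imports[ns]:
                visit(dep)
            ordered.append(ns)

        visit(namespace)
        return ordered


registry = SchemaRegistry()
registry.publish("gme://example/common", "<xs:schema/>")
registry.publish("gme://example/specimen", "<xs:schema/>",
                 imports=["gme://example/common"])
closure = registry.resolve("gme://example/specimen")
print(closure)
```

A client that needs to validate or deserialize a data type would fetch the schemas in the resolved order, so that referenced types are always available before the schemas that use them.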

Page 12: CaGrid 1.0 Service Infrastructure

caGrid Security Components

• Dorian
  • Grid User Account Management
  • Enables Identity Management and Federation
• Authentication Service
  • Provides a uniform authentication interface on which applications can be built, and a framework for issuing SAML assertions for existing credential providers such that they may be easily integrated with Dorian and other grid credential providers
• Grid Trust Service (GTS)
  • Creation and Management of a federated trust fabric
  • Supports applications and services in deciding whether or not signers of digital credentials/user attributes can be trusted
• Grid Grouper
  • Grid Group / VO Management
  • Enables Group/VO Based Authorization
• Authorization Support
  • Provides a framework to perform service authorization based on permissions from both the Common Security Module (CSM) and Grid Grouper groups
• Security Communication Metadata
  • Metadata providing the ability for two parties to negotiate a communication mechanism which meets the service’s requirements
• Grid CA
  • APIs and Command Line for platform independent certificate authority
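
Illustrative note (not from the slides): service authorization that draws on both a local permission policy and grid group membership might look roughly like the sketch below. The class names, the identity strings, and the choice to require both checks to pass are assumptions for illustration; they are not the CSM or Grid Grouper APIs, and a real deployment may combine the checks differently.

```python
from dataclasses import dataclass, field


@dataclass
class GroupDirectory:
    """Stand-in for a group membership lookup (hypothetical; not the Grid Grouper API)."""
    members: dict = field(default_factory=dict)  # group name -> set of identities

    def is_member(self, identity, group):
        return identity in self.members.get(group, set())


@dataclass
class PermissionStore:
    """Stand-in for a local permission policy (hypothetical; not the CSM API)."""
    grants: set = field(default_factory=set)  # (identity, operation) pairs

    def has_permission(self, identity, operation):
        return (identity, operation) in self.grants


def authorize(identity, operation, required_group, groups, permissions):
    """Allow the call only if the group check and the permission check both pass."""
    return (groups.is_member(identity, required_group)
            and permissions.has_permission(identity, operation))


alice = "/O=Example/CN=alice"  # hypothetical grid identity
bob = "/O=Example/CN=bob"      # hypothetical grid identity

groups = GroupDirectory({"study-investigators": {alice, bob}})
permissions = PermissionStore({(alice, "query")})

print(authorize(alice, "query", "study-investigators", groups, permissions))
print(authorize(bob, "query", "study-investigators", groups, permissions))
```

In this sketch Alice is allowed because she is in the group and holds the permission, while Bob is denied because he has group membership only.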

Page 13: CaGrid 1.0 Service Infrastructure

Introduce Overview

• A framework which enables fast and easy creation of strongly typed and highly interoperable grid services

• Provides a powerful extension system wherein specific functionality can be added to the service or service editing process
  • Support for caDSR, GME, caGrid metadata, Data Services, and caGrid authorization services are all added this way

• Abstracts all the details of the grid from the developer, allowing them to focus on the business logic being exposed

• Provides a graphical environment

Page 14: CaGrid 1.0 Service Infrastructure

Introduce Graphical Development Environment

• GUI for creating and manipulating a grid service
  • Provides a simple means of creating a service skeleton that a developer can then implement, build, and deploy
• Automatic code generation of a complete caBIG compliant grid service which is configured to provide:
  • Advertisement
  • Standard Metadata
  • Security
  • Complete Client API

Page 15: CaGrid 1.0 Service Infrastructure

GAARDS Security Infrastructure

[Figure: GAARDS Security Infrastructure diagram. Authentication: Dorian Services, Authentication Services, and Registered Trusted Identity Providers (OSU, Duke, NCI). Grid Trust Fabric: Grid Trust Service (GTS) instances, Certificate Authorities, Certificate/CRL Publishing. Authorization: Access Control Policy, Common Security Module (CSM), Grid Grouper Services. Flow labels: Local Authentication, Obtain Grid Credentials, Invoke, Trust Validate/Authenticate, Membership Lookup, Local Authorization and Policy, against Grid Services.]

Page 16: CaGrid 1.0 Service Infrastructure

Project Resources and Communication

• caGrid Homepage:
  • https://cabig.nci.nih.gov/workspaces/Architecture/caGrid
  • http://www.cagrid.org
• caGrid 1.0 Release:
  • Release Notes: http://gforge.nci.nih.gov/frs/shownotes.php?release_id=952
  • http://gforge.nci.nih.gov/frs/?group_id=25&release_id=952
• caGrid 1.0 GForge Home:
  • Feature Requests
  • Bug Reports
  • Discussion Forums
  • Public Wiki
  • Downloads / Source Repository
  • http://gforge.nci.nih.gov/projects/cagrid-1-0/
• caGrid Users Mailing List
  • https://list.nih.gov/archives/cagrid_users-l.html
  • [email protected]
• Architecture Workspace
  • Community direction from Working Groups
  • Report out and feedback during WS calls

Page 17: CaGrid 1.0 Service Infrastructure

Acknowledgements : caGrid Team

• Ohio State University
  • Joel Saltz
  • Scott Oster
  • Shannon Hastings
  • Stephen Langella
  • David Ervin
  • Tahsin Kurc
• Argonne National Laboratory
  • Ian Foster
  • William E. Allcock
  • Frank Siebenlist
  • Mike Wilde
  • Ravi Madduri
  • Jarek Gawor
  • Rachana Ananthakrishnan
• Duke University
  • Patrick McConnell
• Georgetown University
  • Steve Moore
  • Arnie Miles
  • Paul Kennedy
  • Chad La Joie
• Science Applications International Inc.
  • Manav Kher
• ScenPro Inc
  • David Wellborn
  • Val Bragg
• SemanticBits, LLC
  • Vinay Kumar
• Oracle Corp.
  • Christophe Ludet
• Booz Allen Hamilton
  • Arumani Manisundaram

Page 18: CaGrid 1.0 Service Infrastructure

Acknowledgements

• National Cancer Institute Center for Bioinformatics

• George Komatsoulis

• Frank Hartel

• Denise Warzel

• Peter Covitz