grid computing, sao, and autonomic computing

Colorado Software Summit: October 24 – 29, 2004 © Copyright 2004, IBM Corporation Paul Giangarra — Grid Computing, SAO, and Autonomic Computing Page 1 Grid Computing, SAO, and Autonomic Computing Paul Giangarra Sr. Technical Staff Member e-mail: [email protected]


Page 1: Grid Computing, SAO, and Autonomic Computing

Colorado Software Summit: October 24 – 29, 2004 © Copyright 2004, IBM Corporation

Paul Giangarra — Grid Computing, SAO, and Autonomic Computing Page 1

Grid Computing, SAO, and Autonomic Computing

Paul Giangarra
Sr. Technical Staff Member
e-mail: [email protected]

Page 2: Grid Computing, SAO, and Autonomic Computing


Agenda

• Grid Computing, a Brief Introduction
• Grid Computing Core Concepts
• Grid Computing Standards and Architecture
• Information and Grid Computing
• Autonomic Computing and Grid Computing
• Service Oriented Architecture and Grid Computing
• (Now What do I do With All This?) The Realm of the Possible
• Summary and Questions

Page 3: Grid Computing, SAO, and Autonomic Computing


What’s the Problem?

Grid Problem:
• Provide for flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions & resources (a.k.a. virtual organizations)
• This includes unique authentication, authorization, resource access, and resource discovery

Grid Challenge:
• Create an architecture and solution set based on open standards and, where they exist, exploit existing technologies to solve this

See: The Anatomy of the Grid by Foster, Kesselman, Tuecke

Page 4: Grid Computing, SAO, and Autonomic Computing


What Is NOT a Grid?

• The 8:00 AM rush hour (that’s gridlock)
• A bunch of PCs on a network (it’s a lot more than that)
• A cluster, a network attached storage device, a scientific instrument, a network, etc. (each is an important component of a Grid, but by itself each does not constitute a Grid)

KEY: Grid Computing is NOT a silver bullet!

Page 5: Grid Computing, SAO, and Autonomic Computing


So, What Is a Grid?

More correctly, what is Grid Computing?
• Based on services-oriented architecture
• Based on standard, open, general-purpose protocols and interfaces

Grid Computing, Services, and Technologies:
• Help coordinate and manage disparate and possibly heterogeneous resources that are not subject to centralized control
• Can be used to deliver non-trivial quantities of service
• Can be used to aggregate disparate IT elements such as compute resources, data storage and filing systems to create a single, unified virtual system

Page 6: Grid Computing, SAO, and Autonomic Computing


What Is Grid Computing?

[Figure: Microcosm – Pre-Internet “System”, showing Applications, Data, Storage, Processing, I/O, and Operating System]

Page 7: Grid Computing, SAO, and Autonomic Computing


What Is Grid Computing?

....a single unified image

[Figure: Macrocosm – Distributed Resources and Applications, showing Applications, Data, Storage, Processing, I/O, and Operating System]

Page 8: Grid Computing, SAO, and Autonomic Computing


Grid Computing Enables

Distributed computing across networks using open standards supporting heterogeneous resources by providing facilities for:
• Virtualized Sharing of Resources
• Virtual Organizations & Collaboration
• Autonomic Management of Resources
• Quality of Service & Optimization
• Secure Reliable Access to Resources
• On Demand Computing and Utility Models

Page 9: Grid Computing, SAO, and Autonomic Computing


Grid Computing, SAO, and Autonomic Computing

Grid Computing Core Concepts

Page 10: Grid Computing, SAO, and Autonomic Computing


Grid Computing Value Proposition

3 Models and Unique Value Propositions

• On Demand: “Access data & processing capabilities in a utility-like fashion… Make vs. Buy”
• Processing: “Aggregate processing power from a distributed collection of heterogeneous systems”
• Data: “Secure access and sharing of distributed data & information in a collaborative fashion”
• Resiliency: “Improve the quality of service of distributed systems, despite unplanned events”

Results:
• Increased: Resource use, Flexibility, Productivity, Reliability/Availability
• Decreased: Complexity, Total cost of ownership

Page 11: Grid Computing, SAO, and Autonomic Computing


Grid Computing Resources & Types

Grid Resources
• Computation
• Storage
• Data
• Applications
• Communication (I/O)
• Software & Licenses
• Special equipment, capacities, architectures, & policies

Grid Types
• Collaboration Grid
• Compute Grids
  • Desktop Scavenging
  • Server
• Data/Information Grids
  • Content
  • Data
  • File
  • Storage

Grid Resources Virtualized Across the Grid Types

Page 12: Grid Computing, SAO, and Autonomic Computing


Grid Deployment Options
A Function of Business Need, Technology and Organizational Flexibility

1. Intra-Grids

[Figure: grids, each attached to NAS/SAN storage]

Page 13: Grid Computing, SAO, and Autonomic Computing


Grid Deployment Options
A Function of Business Need, Technology and Organizational Flexibility

1. Intra-Grids
2. Extra-Grids

[Figure: grids, each attached to NAS/SAN storage, with a VPN link]

Page 14: Grid Computing, SAO, and Autonomic Computing


Grid Deployment Options
A Function of Business Need, Technology and Organizational Flexibility

1. Intra-Grids
2. Extra-Grids
3. Inter-Grids

[Figure: grids, each attached to NAS/SAN storage, with a VPN link]

Page 15: Grid Computing, SAO, and Autonomic Computing


Motivations for Grid Computing

• Support Heterogeneous Systems
• Enable Collaboration
• Reduce Time to Results
• Increase Capacity
• Improve Efficiency / Reduce Costs
• Provide Reliability & Availability

Page 16: Grid Computing, SAO, and Autonomic Computing


Motivations for Grid Computing

Increase Capacity
• Exploit distributed resources to provide capacity for high-demand applications
  • Existing applications that cannot be run effectively on a single processor
  • New large-scale applications that provide strategic business advantages

Page 17: Grid Computing, SAO, and Autonomic Computing


Motivations for Grid Computing

Increase Capacity
• Exploit distributed resources to provide capacity for high-demand applications

Improve Efficiency / Reduce Costs
• Reduce infrastructure cost associated with over-provisioned resources
• Reduce the cost of manpower to manage and configure resources

Page 18: Grid Computing, SAO, and Autonomic Computing


Motivations for Grid Computing

Provide Reliability / Availability
• Use distributed resources
• Monitor work progress
• Restart failed jobs

[Figure: a Job Scheduler dispatches JOB 1, JOB 2, and JOB 3 to IBM servers against a clock face; JOB 1 hits a TIMEOUT and goes through Recovery / Restart]
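The monitor-and-restart behaviour on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Job`, `JobTimeout`, `run_jobs`, and the `flaky` work function are hypothetical names, and the timeout is simulated by an exception rather than measured against a real clock or a real scheduler product.

```python
from dataclasses import dataclass
from typing import Callable


class JobTimeout(Exception):
    """Hypothetical signal that a job made no progress within its time budget."""


@dataclass
class Job:
    name: str
    work: Callable[[int], str]  # receives the attempt number, returns a result
    attempts: int = 0


def run_jobs(jobs, max_attempts=3):
    """Run each job; a job that times out is put back on the queue and restarted."""
    results, log = {}, []
    queue = list(jobs)
    while queue:
        job = queue.pop(0)
        job.attempts += 1
        try:
            results[job.name] = job.work(job.attempts)
            log.append((job.name, job.attempts, "done"))
        except JobTimeout:
            log.append((job.name, job.attempts, "timeout"))
            if job.attempts < max_attempts:
                queue.append(job)  # recovery / restart
            else:
                results[job.name] = None  # give up after max_attempts


    return results, log


def flaky(attempt):
    # Simulated failure: the first attempt never finishes.
    if attempt == 1:
        raise JobTimeout("node stopped responding")
    return "ok"


jobs = [Job("JOB 1", flaky), Job("JOB 2", lambda a: "ok"), Job("JOB 3", lambda a: "ok")]
results, log = run_jobs(jobs)
```

Here JOB 1 times out on its first attempt, is re-queued behind the other jobs, and completes on its second attempt. A production scheduler would also pick a different node for the restart; that choice is left out of this sketch.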

Page 19: Grid Computing, SAO, and Autonomic Computing


Motivations for Grid Computing

Reduce “Time to Results”
• Exploit opportunities for parallel computing to allow business critical computation to be completed in a timely fashion
• Gain competitive advantage by allowing computation to be executed more frequently and on customer demand
• Deliver real-time results to internal and external customers

[Figure: a clock face and calendar pages for March 27, 28, and 29, contrasting Serial Execution with Parallel Execution]
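The serial versus parallel contrast can be illustrated with Python's standard `concurrent.futures` module. This is a local sketch, not a grid: `run_batch` is a hypothetical stand-in for one independent unit of work, and a thread pool stands in for separate grid nodes. In CPython, threads do not speed up CPU-bound work, so the example shows the structure of the fan-out rather than a measured speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def run_batch(day):
    """Hypothetical independent unit of work, e.g. one day's computation."""
    return sum(i * i for i in range(day * 1000))


days = [27, 28, 29]

# Serial execution: each batch waits for the previous one to finish.
serial = [run_batch(d) for d in days]

# Parallel execution: the batches are independent, so they can be dispatched
# at the same time; on a grid each would go to a different resource.
with ThreadPoolExecutor(max_workers=len(days)) as pool:
    parallel = list(pool.map(run_batch, days))
```

Both approaches produce the same results in the same order. The benefit of the parallel form depends on the batches being truly independent.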

Page 20: Grid Computing, SAO, and Autonomic Computing


Motivations for Grid Computing

Provide Reliability / Availability
• Use distributed resources
• Monitor work progress
• Restart failed jobs

Support Heterogeneous Systems
• Different hardware, system platforms, and available middleware
• Specialized equipment

[Figure: IBM systems running Linux / z/OS and AIX / Linux, alongside a rack of IBM eServer pSeries machines]

Page 21: Grid Computing, SAO, and Autonomic Computing


Motivations for Grid Computing

Enable Collaborations
• Enable collaboration across applications to integrate results
• Support large multi-disciplinary collaborations
• Both within a single organization and between partners

[Figure: Air Force, Army, and Navy linked through C2C and Mission Planning]

Page 22: Grid Computing, SAO, and Autonomic Computing


Grid Computing, SAO, and Autonomic Computing

Grid Computing Standards and Architecture

Page 23: Grid Computing, SAO, and Autonomic Computing


The Value of Open Standards

• Networking: The Internet (TCP/IP)
• Communications: e-mail (POP3, SMTP, MIME)
• Information: World-wide Web (HTML, HTTP, J2EE, XML)
• Applications: Web Services (SOAP, WSDL, UDDI)
• Distributed Computing: Grid (Globus / OGSA)
• Operating System: Linux

Page 24: Grid Computing, SAO, and Autonomic Computing


Cooperation on Standards

[Figure: logos of the organizations cooperating on standards; only the word “Microsystems” survives in the transcript]

Page 25: Grid Computing, SAO, and Autonomic Computing


Core Web Services Technologies

• WSDL: Describes what the service is, how to use it (XML document)
• UDDI (optional): Yellow pages for web services (Universal Description, Discovery and Integration) Directory
• SOAP: Connect the service (“the envelope”)
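To make “the envelope” concrete, here is a minimal sketch that builds a SOAP 1.1 request with Python's standard `xml.etree.ElementTree`. The envelope namespace is the standard SOAP 1.1 one; the `urn:example:jobs` namespace, the `SubmitJob` operation, and its parameters are hypothetical and not taken from the talk.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:jobs"  # hypothetical service namespace


def build_request(operation, params):
    """Wrap an operation call and its parameters in a SOAP Envelope/Body."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(op, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


xml_text = build_request("SubmitJob", {"executable": "/bin/hostname", "count": 4})
print(xml_text)
```

In practice the operation name and parameter types would come from the service's WSDL document rather than being hard-coded by the caller.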

Page 26: Grid Computing, SAO, and Autonomic Computing


Making Web Services Work

Open standards-based technology for flexible integration

Value Proposition
• Increase business flexibility through standardized services
• Enabling the ecosystem
• Extend IT Infrastructure to suppliers and Business Partners
• Radical reduction in complexity of integration
• Leverage existing investments and skills

IBM Software Activities
• IBM provides the industry's broadest support for Web services
  • Development Lifecycle
  • Transaction Services
  • Information Integration
  • Collaboration Services
  • Management Services
• Drive definition, adoption and interoperability of Web services

Basic Profile 1.1 - Final Specification published August 24, 2004

Page 27: Grid Computing, SAO, and Autonomic Computing


Open Grid Services Architecture (OGSA)

Objectives:
• Manage resources across distributed heterogeneous platforms
• Deliver seamless QoS
• Provide a common base for autonomic management solutions
• Define open, published interfaces
• Exploit industry-standard integration technologies
  • Web Services: SOAP, XML, WSDL, WS-Security, UDDI…
• Integrate with existing IT resources

Page 28: Grid Computing, SAO, and Autonomic Computing


Web Services “Stack”

[Figure: the Web services stack, listed here from bottom to top]
• Transport: HTTP(S), SMTP, FTP, BEEP, TCP/IP, …
• Messaging: SOAP, RMI/IIOP, JMS, …
• Description: WSDL, WS-Policy, UDDI, WS-Addressing, WS-Inspection
• Quality of Service: WS-Security, WS-Reliable Messaging, WS-Coordination, WS-Transactions
• Components (Atomic, Composite): BPEL4WS, WS-Coord

Page 29: Grid Computing, SAO, and Autonomic Computing


Grid Protocol vs. Internet Protocol

[Figure: the two layered architectures side by side]
• Grid Protocol Architecture (top to bottom): Applications, Collective, Resource, Connectivity, Fabric
• Internet Protocol Architecture (top to bottom): Applications, Transport, Internet, Link

Page 30: Grid Computing, SAO, and Autonomic Computing


Grid Computing Protocol Architecture

The layered Grid Computing protocol architecture is based on Open Standards

• Resource and Connectivity protocols, which facilitate the sharing of resources
• Build on capabilities provided by lower layers
• Design goals:
  • Place few constraints on implementation
  • Focus on small set of core abstractions
  • Emphasize identification and definition of protocols and services
  • Identify and define APIs and SDKs
  • Provide for a Secure Environment

[Figure: layer stack (top to bottom): Applications, Collective, Resource, Connectivity, Fabric]

Page 31: Grid Computing, SAO, and Autonomic Computing


Grid Protocol – Fabric

• Provides the resources to which shared access is mediated by Grid protocols
• Examples include computational resources, storage systems, catalogs, or network resources
  • Includes logical resources such as distributed file systems and clusters
• Resources implement inquiry mechanisms that permit discovery of their structure, state, and capabilities
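One way to picture an inquiry mechanism is a resource object that can describe itself, plus a discovery routine that relies only on that description. The sketch below is a plain-Python analogy under stated assumptions: `FabricResource`, `inquire`, `discover`, and the example resource names and capability keys are all hypothetical, not part of any Grid toolkit.

```python
from dataclasses import dataclass, field


@dataclass
class FabricResource:
    name: str
    kind: str  # e.g. "compute" or "storage"
    capabilities: dict = field(default_factory=dict)
    state: str = "idle"

    def inquire(self):
        """Report structure, state, and capabilities so others can discover them."""
        return {
            "name": self.name,
            "kind": self.kind,
            "state": self.state,
            "capabilities": dict(self.capabilities),
        }


def discover(resources, kind, **minimums):
    """Return names of resources of the given kind meeting each numeric minimum."""
    matches = []
    for resource in resources:
        info = resource.inquire()
        if info["kind"] != kind:
            continue
        caps = info["capabilities"]
        if all(caps.get(key, 0) >= value for key, value in minimums.items()):
            matches.append(info["name"])
    return matches


resources = [
    FabricResource("cluster-a", "compute", {"cpus": 64}),
    FabricResource("desktop-7", "compute", {"cpus": 2}),
    FabricResource("san-1", "storage", {"free_gb": 500}),
]
big_compute = discover(resources, "compute", cpus=16)
```

Because `discover` only reads what `inquire` returns, a new kind of resource can join without changes to the discovery code.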

Page 32: Grid Computing, SAO, and Autonomic Computing


Grid Protocol – Connectivity

• Defines core communication and authentication protocols required for Grid-specific network transactions
• Communication protocols enable the exchange of data between Fabric layer resources
  • Transport, Routing, Naming
• Authentication protocols build on communication services
• Provide cryptographically secure mechanisms for verifying the identity of users and resources
  • Asymmetric cryptography
  • Single Sign On, Delegation, Security Integration, Trust Relationships

Page 33: Grid Computing, SAO, and Autonomic Computing


Grid Protocol – Resource

• Builds on Connectivity layer communication and authentication protocols
• Defines protocols (and APIs and SDKs) for the secure negotiation, initiation, monitoring, control, accounting, and payment of sharing operations on individual resources
• Concerned entirely with individual resources
• Ignores issues of global state and atomic actions across distributed collections

[Figure: API/SDK surrounded by Negotiation, Initiation, Monitor, Control, Accounting, and Payment]

Page 34: Grid Computing, SAO, and Autonomic Computing


Grid Protocol – Collective

Protocols and services (and APIs and SDKs) that are not associated with any one specific resource but rather are global in nature and capture interactions across collections of resources
• Directory services
• Co-allocation, scheduling, and brokering services
• Monitoring and diagnostic services
• Data replication services
• Grid-enabled programming systems
• Workload management
• Community authorization and accounting
• Software discovery services
• Collaborative services
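As a small illustration of how a directory service and a co-allocation service might work together, the sketch below registers resources in a directory and then splits one request across several of them. The `Directory` class, the `co_allocate` function, the greedy strategy, and the resource names are assumptions made for this example, not a description of any real Grid broker.

```python
class Directory:
    """Hypothetical directory service: maps resource names to free CPU counts."""

    def __init__(self):
        self._entries = {}

    def register(self, name, free_cpus):
        self._entries[name] = free_cpus

    def lookup(self):
        return dict(self._entries)


def co_allocate(directory, cpus_needed):
    """Greedy co-allocation: take CPUs from the largest resources first.

    Returns a {resource name: cpus} plan, or None if the request cannot be met.
    """
    plan, remaining = {}, cpus_needed
    by_size = sorted(directory.lookup().items(), key=lambda item: -item[1])
    for name, free in by_size:
        if remaining == 0:
            break
        take = min(free, remaining)
        if take > 0:
            plan[name] = take
            remaining -= take
    return plan if remaining == 0 else None


directory = Directory()
directory.register("cluster-a", 8)
directory.register("cluster-b", 4)
directory.register("desktop-pool", 2)

plan = co_allocate(directory, 10)
too_big = co_allocate(directory, 20)
```

The broker works across the whole collection and never talks to a single resource directly, which is the distinction the slide draws between the Collective and Resource layers.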

Page 35: Grid Computing, SAO, and Autonomic Computing


Grid Protocol – Application Layer

Putting it all together

• Application layer: Grid enabled, secure and scalable Virtual Organizations
  • Interagency Collaborative Data Grid
  • Compute Intensive Simulation
  • Weather Simulation and Modeling
  • Utility compute providers
  • HA Operational Support Systems
  • B2B Hubs and trading networks
• Collective layer: Global interactions and services
• Resource layer: Resource management services
• Connectivity layer: Security, transport, routing
• Fabric layer: Physical resources

Page 36: Grid Computing, SAO, and Autonomic Computing


OGSA – Open Grid Services Architecture

Open Standards Based Architecture: 2003

[Figure: layered stack, listed here from top to bottom]
• Applications
• OGSA Architected Services: Grid Core Services, Grid Data Services, Grid Program Execution Services, Domain Specific Services
• Open Grid Services Architecture (OGSA)
• OGSI – Open Grid Services Infrastructure
• Web Services
• OGSA Enabled: Messaging, Directory, File Systems, Database, Workflow, Security
• OGSA Enabled: Network, Storage, Servers

Figure annotations:
• Applications & systems built on standards
• Open and value-added vendor implementations
• Open architecture for interoperability
• Support for web services on a variety of platforms, languages and protocols
• Enabled “general purpose” middleware
• Enabled Hardware and Operating System Platforms

Page 37: Grid Computing, SAO, and Autonomic Computing


WS-RF & WS-Notification and OGSA

• OGSA services can be defined and implemented as Web services
• OGSA can take advantage of other Web services standards
• OGSA can be implemented using standard Web services development tools
• Grid applications will NOT require special Web services infrastructure

WS-Resource Framework & WS-Notification are an evolution of OGSI

[Figure: layered stack, listed here from top to bottom]
• Applications
• OGSA Architected Services
• OGSI – Open Grid Services Infrastructure, shown with the specifications Modeling Stateful Resources with Web Services, WS-ResourceProperties, WS-ResourceLifetime, WS-RenewableReferences, WS-ServiceGroup, WS-BaseFaults, and WS-Notification
• Web Services
• OGSA Enabled: Messaging, Directory, File Systems, Database, Workflow, Security
• OGSA Enabled: Network, Storage, Servers

Page 38: Grid Computing, SAO, and Autonomic Computing


OGSA Structure

Web Services: dynamic, addressable, stateful, manageable
• WS-Addressing, WS-Policy, WS-Coordination, WS-Security, WS-Trust

OGSA Architected Services

Grid Core Services
• Service Management
  • Registries and Discovery Services (SG)
  • Attribute Propagation and Query
  • Service Domain
  • Service Orchestration
  • Metering & Accounting
  • Installation & Deployment
• Service Communication
  • Messaging and Queuing Services
  • Event Services
  • Distributed Secure Logging Service
• Policy Management
  • Policy Service Manager
  • Policy Agent
  • Policy Transformation Service
  • Policy Resolution Service
  • Policy Validation Service
  • Policy Administration Services and Negotiation Framework
• Security
  • Authentication
  • Authorization & Access Control
  • Credential Validation & Transformation
  • Trust Broker

Grid Program Execution Services
• Job Scheduler & Queuing Services
• Resource Reservation Services
• Workload Managers and Micro-Scheduling Services

Grid Data Services
• Data Access Services
• Data Transformation & Federation Services
• Data Replication Service
• Data Caching Service
• MetaData Catalog Services

Domain Specific Services

Page 39: Grid Computing, SAO, and Autonomic Computing


Meta OS Grid Services: Service Collections, Job Scheduling, File Transfer, Data Replication, Provisioning, Logging, Problem Determination, Resource Management, Cluster Management, Policy

Security APIs
• globus_gss_assist - simplifies the use of the GSSAPI in the globus environment [1.1.x, 2.0]
• GSS API - the Generic Security Service API C bindings (IETF draft) [version 2]

Information Service APIs
• OpenLDAP - an API for the LDAP protocol used by MDS (developed by the OpenLDAP Project) [version 1.2]

Communication APIs
• globus_io - provides high-performance I/O with integrated security and a socket-like interface [1.1.x,2.0]
• globus_nexus - provides multithreaded, asynchronous, thread-safe multiprotocol communication facilities [1.1.x,2.0]
• globus_nexus_fd - provides NEXUS-based support for file descriptors and timed events (This API is obsolete as of release 1.1.2. We recommend use of globus_io instead.) [1.1.1]

Data Access APIs
• globus_ftp_control - provides low-level services for implementing FTP client and servers [2.0]
• globus_ftp_client - provides a convenient way of accessing files on remote FTP servers [2.0]
• globus_gass_copy - provides a uniform interface for accessing files using a variety of protocols [2.0]
• globus_gass - provides clients with access to remote files [1.1.x]
• globus_gass_transfer - provides an API for clients and servers involved in GASS data transfer
• globus_gass_cache - manages the local GASS cache on a client system [1.1.x,2.0]
• globus_gass_server_ez - provides a simple set of GASS server capabilities [1.1.x,2.0]
• globus_gass_server - provides GASS server functionality (This API is obsolete as of release 1.1.2. We recommend use of globus_gass_transfer instead.) [1.1.1]
• globus_gass_client - allows clients to get and put remote files via several protocols (This API is obsolete as of release 1.1.2. We recommend use of globus_gass_transfer instead.) [1.1.1]

Data Management APIs
• globus_replica_catalog - provides an interface to a catalog of data collections, logical files, and physical locations [2.0]
• globus_replica_management - allows clients to manage files within a file replication system [2.0]

Resource Management APIs
• globus_gram_client - provides remote job submission and management capabilities [1.1.x,2.0]
• globus_gram_myjob - provides a basic communication mechanism for processes within a GRAM job [1.1.x,2.0]
• globus_gram_jobmanager - provides a simple, consistent way to interact locally with a variety of schedulers such as LSF, LoadLeveler, PBS, Condor, etc. [1.1.x,2.0]
• globus_duroc - provides resource coallocation services for starting distributed jobs [1.1.x,2.0]

Fault Detection APIs
• globus_hbm_client - allows a client process to be monitored by a Heartbeat Monitor system [1.1.x]
• globus_hbm_datacollector - allows clients to monitor multiple processes and enables the notification of exceptions [1.1.x]

Portability APIs
• globus_module - provides a mechanism for activating and deactivating software modules [1.1.x,2.0]
• globus_libc - provides a portable implementation of libc [1.1.x,2.0]
• globus_thread - implements threads and synchronization mechanisms [1.1.x,2.0]
• globus_dc - provides cross-platform data conversion services
• globus_utp - supports the use of timers for monitoring applications and other programs [1.1.x,2.0]
• globus_list - support for linked lists [1.1.x,2.0]
• globus_fifo - supports first-in-first-out queues [1.1.x,2.0]
• globus_hashtable - supports hash tables [1.1.x,2.0]
• globus_url - supports URL strings [2.0]
• globus_error - provides an abstract error type for function return codes
• globus_poll - supports polling on I/O channels

see: http://www.globus.org/developer/api-reference.html

Page 40: Grid Computing, SAO, and Autonomic Computing


Recent Developments (Jan 20, 2004)

• WS-Resource Framework & WS-Notification announced January 20th 2004 at Globus World in San Francisco
• Proposals to extend to Web services
  • Modeling Stateful Resources with Web Services
• Driven by requirements from:
  • Grid computing
  • Systems Management
  • Business computing

[Figure: the specification family: Modeling Stateful Resources with Web Services, WS-ResourceProperties, WS-ResourceLifetime, WS-RenewableReferences, WS-ServiceGroup, WS-BaseFaults, WS-Notification]

Page 41: Grid Computing, SAO, and Autonomic Computing


What Was Announced

• A family of Web services specification proposals
• Introduces a design pattern to specify how to use Web services to access “stateful” components
• Introduce message based publish-subscribe to Web services

[Figure: the specification family (Modeling Stateful Resources with Web Services, WS-ResourceProperties, WS-ResourceLifetime, WS-RenewableReferences, WS-ServiceGroup, WS-BaseFaults, WS-Notification), with a legend marking each as “Introduced in Jan” or “To be developed”]

Page 42: Grid Computing, SAO, and Autonomic Computing


What Was Announced

WS-Notification
• Provides a publish-subscribe messaging capability for Web Services

WS-Resource Framework
• There are many possible ways Web services might model, access and manage state
• WS-RF is a family of Web services specifications that clarify how “state” and Web services combine

Both:
• Build upon existing Web services specifications and technology
• Help align Grid computing, Systems Management and Web services

Contributed to by:
• WS-Resource Framework: IBM, Globus, HP
• WS-Notification: IBM, Globus, Akamai, HP, SAP, Tibco, Sonic

Page 43: Grid Computing, SAO, and Autonomic Computing


The WS-Resource Framework Model

What is a WS-Resource?

Examples of WS-Resources:
• Physical entities (e.g. processor, communication link, disk drive) or Logical construct (e.g. agreement, running task, subscription)
• Real or virtual
• Static (long-lived, pre-existing) or Dynamic (created and destroyed as needed)
• Simple (one), or Compound (collection)

Unique – Has a distinguishable identity and lifetime

[Figure: a resource]

Page 44: Grid Computing, SAO, and Autonomic Computing


The WS-Resource Framework Model

Architecture rationale
• WS-Resource framework exploits WS-Addressing
  • Web services and WS-Resources are referenced using an “Endpoint Reference”
  • Services that create or locate WS-Resources return Endpoint References
• Web service and WS-Resource are separate:
  • A Web service is stateless
  • A WS-Resource provides a context / mechanism for stateful execution
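The separation between a stateless service and a stateful resource can be sketched as a plain-Python analogy. Real Endpoint References are XML structures defined by WS-Addressing; here `EndpointReference`, `CounterService`, the `store` dictionary, and the example URL are hypothetical names chosen only to show the pattern.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class EndpointReference:
    address: str      # where the (stateless) service is reached
    resource_id: str  # which stateful resource the message refers to


class CounterService:
    """Stateless service: it keeps no per-client data between calls.

    All state lives in an external resource store, keyed by resource id.
    """

    ADDRESS = "http://example.org/counter"  # hypothetical address

    def __init__(self, store):
        self._store = store

    def create(self):
        """Create a resource and return a reference to it."""
        resource_id = str(uuid.uuid4())
        self._store[resource_id] = {"value": 0}
        return EndpointReference(self.ADDRESS, resource_id)

    def add(self, epr, amount):
        resource = self._store[epr.resource_id]
        resource["value"] += amount
        return resource["value"]

    def destroy(self, epr):
        del self._store[epr.resource_id]


store = {}
service_a = CounterService(store)
service_b = CounterService(store)  # a second instance of the same service

epr = service_a.create()
service_a.add(epr, 2)
total = service_b.add(epr, 3)  # any instance can continue the work
```

Because the reference carries the resource identity, either service instance can handle the next request. The explicit `create` and `destroy` calls mirror the idea that a resource has its own identity and lifetime.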

Page 45: Grid Computing, SAO, and Autonomic Computing


WS-Notification

• Brings enterprise quality publish and subscribe messaging to Web services
  • Loosely coupled, asynchronous messaging in a Web services context
  • Composes with other Web services technologies
  • Facilitates integration between different messaging middleware environments
• Exploits WS-Resource framework and Web services technologies
• Standardizes the role of Brokers, Publishers, Subscribers and Consumers
• Provides two forms of publish/subscribe: direct publishing and brokered publishing
• Standardizes Web service message exchanges for publishing, subscribing and notification delivery
• Defines XML model of Topics and TopicSpaces to categorize and organize notification messages
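Brokered publishing with topics can be illustrated in-process with a few lines of Python. This sketch shows only the roles (broker, publisher, subscriber, consumer) and topic-based routing; it does not model the WS-Notification message formats, and `NotificationBroker`, its methods, and the topic strings are hypothetical.

```python
from collections import defaultdict


class NotificationBroker:
    """Hypothetical broker: routes each message to the consumers of its topic."""

    def __init__(self):
        self._subscriptions = defaultdict(list)

    def subscribe(self, topic, consumer):
        """A subscriber registers a consumer callback for one topic."""
        self._subscriptions[topic].append(consumer)

    def notify(self, topic, message):
        """A publisher hands a message to the broker; returns the delivery count."""
        delivered = 0
        for consumer in self._subscriptions.get(topic, []):
            consumer(topic, message)
            delivered += 1
        return delivered


inbox = []
broker = NotificationBroker()
broker.subscribe("jobs/failed", lambda topic, message: inbox.append((topic, message)))

sent = broker.notify("jobs/failed", "JOB 1 timed out")
ignored = broker.notify("jobs/done", "JOB 2 finished")  # no subscribers for this topic
```

The publisher never learns who the consumers are, which is the loose coupling the slide refers to. In direct publishing the publisher would hold the subscription list itself instead of delegating to a broker.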

Page 46: Grid Computing, SAO, and Autonomic Computing


Open Grid Services Infrastructure (OGSI)

Grid Service Implementation Independence

[Figure: a service Implementation layered on a Hosting Environment, Other Middleware, Operating System, and Hardware; the abstract service interface remains the same]

Page 47: Grid Computing, SAO, and Autonomic Computing


Open Grid Services Infrastructure (OGSI)

Grid Service Implementation – Examples

[Figure: a File Transfer Service whose Implementation may sit on a File System, a Storage System (NAS/SAN), or a Database (DB2), within a Hosting Environment (J2EE) on Other Middleware, Operating System, and Hardware; the abstract service interface remains the same]
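The idea that the abstract service interface stays the same while the implementation changes maps naturally onto an abstract base class. The sketch below is an analogy in plain Python: `FileTransferService`, the two backends, and the `copy` helper are hypothetical names, and the backends are an in-memory dictionary and a local directory rather than a NAS/SAN or DB2.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class FileTransferService(ABC):
    """Abstract service interface: callers depend only on put and get."""

    @abstractmethod
    def put(self, name, data):
        ...

    @abstractmethod
    def get(self, name):
        ...


class InMemoryBackend(FileTransferService):
    def __init__(self):
        self._files = {}

    def put(self, name, data):
        self._files[name] = bytes(data)

    def get(self, name):
        return self._files[name]


class DirectoryBackend(FileTransferService):
    def __init__(self, root):
        self._root = Path(root)

    def put(self, name, data):
        (self._root / name).write_bytes(data)

    def get(self, name):
        return (self._root / name).read_bytes()


def copy(source, target, name):
    """Client code: works with any pair of implementations."""
    target.put(name, source.get(name))


with tempfile.TemporaryDirectory() as tmp:
    memory = InMemoryBackend()
    disk = DirectoryBackend(tmp)
    memory.put("result.dat", b"42")
    copy(memory, disk, "result.dat")
    copied = disk.get("result.dat")
```

The `copy` function never inspects which backend it was given, so swapping the storage layer does not require changes on the client side.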

Page 48: Grid Computing, SAO, and Autonomic Computing


Grid Computing, SAO, and Autonomic Computing

Information…… and Grid Computing

Page 49: Grid Computing, SAO, and Autonomic Computing


Managing Information at Different Levels

Data
• Global Naming
• Meta-data and catalog
• Federation and Transformation

File
• Distributed File Systems / Remote Access
• File Transfer / Data Replication
• Caching

Storage
• NAS / SAN “Storage Cluster”
• Automatic or Dynamic provisioning of storage
• Support for hierarchy management

Page 50: Grid Computing, SAO, and Autonomic Computing


IBM Products for an Information Grid

Data

DB2 Information Integrator
Features: Federated data server, replication server
Benefits: Query and access distributed data without requiring central repository. Supports movement of data from mixed relational data sources.

DB2 UDB
Features: Relational database that runs on Linux, Unix, Windows, z/OS, and OS/390
Benefits: Manageability features, Integrated Information capabilities via Web Services, Integrated business intelligence, and more

Avaki Data Grid 5.0*
Features: Data catalog, data provisioning, reusable data integrations, caching capabilities.
Benefits: Provisioning, access, and integration of data from multiple, heterogeneous, distributed sources.

File

GPFS (General Parallel File System)
Features: Cluster based, shared disk, parallel file system. Data and metadata can flow to all nodes and all disks in parallel. Featured in HPC environments. Available on pSeries and Linux clusters.
Benefits: Not a client-server file system like NFS, DFS, or AFS: no single server bottleneck, no protocol overhead for data transfer.

NFS v4
Features: Provides scalable access to GPFS from outside cluster. GPFS + NFSv4 provides the performance of a SAN File System scalable to a WAN.
Benefits: Security and access control in a grid environment.

SAN File System
Features: Provides a common file system specifically designed for storage networks. Manages the metadata on the storage network instead of within individual network servers.
Benefits: Provides high performance access to data and enables sharing across heterogeneous application servers. Allows applications on any server within the SAN to access any file in the network without making changes to the application.

Storage

SAN Volume Controller
Features: Creates pools of managed disks spanning multiple storage subsystems. Includes dynamic data-migration function.
Benefits: Centralized point of control for volume mgmt. Allows administrators to migrate storage from one device to another w/o taking it offline.

Tivoli Storage Resource Manager
Features: Enterprise wide reporting, file level analysis, subsystem reporting, automated capacity provisioning
Benefits: Storage on demand for file systems. Reclaim wasted space consumed by non-essential files. Ensure storage used efficiently for future capacity.

Tivoli Storage Manager
Features: Data backup/restore, data archive and retrieve
Benefits: Centralized protection leading to faster backups and restores with less resources needed.

* Avaki is an IBM business partner

Page 51: Grid Computing, SAO, and Autonomic Computing


Grid Computing, SAO, and Autonomic Computing

Autonomic Computing… …and Grid Computing

Page 52: Grid Computing, SAO, and Autonomic Computing


Autonomic Computing Is

A continuously evolving and dynamic state that establishes the correct balance between what is managed by a person and what is managed by the system

Focus on business, not infrastructure

Page 53: Grid Computing, SAO, and Autonomic Computing


Why Autonomic Computing?

The interconnected characteristics of a complex system…
• Heterogeneity
• Large state space
• Unpredictable human element
• Unpredictable scalability
• Continuous Change
• Open-endedness
• Connectedness

…need systems level understanding with certain component and system characteristics
• Real-time
• Self-adaptive
• Self-organizing
• Self-healing
• Self-forming
• Self-testing
• Resilient

Page 54: Grid Computing, SAO, and Autonomic Computing


Self-managing Systems Deliver:

Increased Responsiveness
• Adapt to dynamically changing environments

Business Resiliency
• Discover, diagnose, and act to prevent disruptions

Operational Efficiency
• Tune resources and balance workloads to maximize use of IT resources

Secure Information and Resources
• Anticipate, detect, identify, and protect against attacks

“Autonomic computing allows companies to operate more efficiently and achieve more from their existing IT environments, enabling increased responsiveness, business continuance and availability.” — Rick Sturm

Page 55: Grid Computing, SAO, and Autonomic Computing


The Autonomic Element: Sense & Respond

An autonomic element contains a continuous control loop that monitors activities and takes action
Autonomic elements learn from past experience to build action plans
Managed elements are consistently monitored

[Figure: the autonomic computing control loop: Monitor, Analyze, Plan, and Execute around shared Knowledge, connected to the managed Element through Sensors and Effectors]

“IBM’s autonomic approach to automation goes well beyond integration to the truly intelligent, responsive and proactive capabilities needed to deliver e-business on demand.”

— Mark Hydar
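
The Monitor, Analyze, Plan, Execute loop around shared Knowledge can be sketched as a toy Python program. Everything here is an assumption made for illustration: the `ManagedElement` with a single load sensor and an `add_worker` effector, the 0.8 threshold, and the `AutonomicManager` class are hypothetical and do not come from any IBM toolkit.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ManagedElement:
    """Hypothetical managed resource with one sensor and one effector."""
    load: float        # fraction of capacity in use per worker
    workers: int = 1

    def sense(self) -> float:
        # Sensor: report the current load.
        return self.load

    def add_worker(self) -> None:
        # Effector: the same work is spread over one more worker.
        self.load = self.load * self.workers / (self.workers + 1)
        self.workers += 1


@dataclass
class AutonomicManager:
    threshold: float = 0.8
    knowledge: List[float] = field(default_factory=list)  # past observations

    def monitor(self, element: ManagedElement) -> float:
        reading = element.sense()
        self.knowledge.append(reading)
        return reading

    def analyze(self, reading: float) -> bool:
        return reading > self.threshold

    def plan(self, overloaded: bool) -> Optional[str]:
        return "add_worker" if overloaded else None

    def execute(self, element: ManagedElement, action: Optional[str]) -> None:
        if action == "add_worker":
            element.add_worker()

    def run_once(self, element: ManagedElement) -> None:
        reading = self.monitor(element)
        self.execute(element, self.plan(self.analyze(reading)))


element = ManagedElement(load=2.0)
manager = AutonomicManager()
for _ in range(5):
    manager.run_once(element)
```

With these starting values the loop adds workers until the load drops below the threshold and then leaves the element alone, while every reading is kept in `knowledge` for later use.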

Page 56: Grid Computing, SAO, and Autonomic Computing


Levels of Automation

Level 1, Basic: Manual analysis and problem solving
Level 2, Managed: Centralized tools, manual actions
Level 3, Predictive: Cross-resource correlation and guidance
Level 4, Adaptive: System monitors, correlates and takes action
Level 5, Autonomic: Dynamic business policy based management

Evolution not revolution

“Autonomic computing is a vision that will take several years to realize, but with the model that IBM has outlined, there are benefits attainable at every step, which pay you back... fairly quickly for the investments you make.”

— Mike Gilpin

Page 57: Grid Computing, SAO, and Autonomic Computing


Autonomic Computing: Self Managing Systems

Self-configuring: Adapt automatically to the dynamically changing environments
Self-healing: Discover, diagnose, and react to disruptions
Self-optimizing: Monitor and tune resources automatically
Self-protecting: Anticipate, detect, identify, and protect against attacks from anywhere

Autonomic Capabilities
OGSA Structure + Autonomic Backplane
Adaptive Grid

Page 58: Grid Computing, SAO, and Autonomic Computing


Grid Computing and the oDOE

Open: Linux, XML, WSDL, SOAP, OGSA
Autonomic: Self-protecting, Self-healing, Self-optimizing, Self-configuring
Virtualized
Integrated

Page 59: Grid Computing, SAO, and Autonomic Computing


Service-Oriented Architecture Evolution

Web Services

Complex Event Processing

Enterprise Infrastructure

Component Orchestration

Semantic Web

Standards-based info management framework

Warfighter events pattern recognition

Distributed collaborative processing with discovery

Orchestration of C4ISR components

Intelligent M2M collaboration

Page 60: Grid Computing, SAO, and Autonomic Computing


Service Oriented Architecture

Change of Paradigm at the core of Grid Computing
Services “encapsulate” heterogeneous resources
Services provide a compose-able, orchestrable, extensible base
Common Resource Model (CRM) for abstractions key to manageability of resources

Simple Rules:
Any function is implemented once and once only as a Service
Services can be runtime or deployment-time re-used
Service providers and requesters are loosely bound:
• Each service is defined by an implementation independent interface.
• Services are defined in terms of common business function and data models.
• Communication protocols that emphasize interoperability and location transparency are used to mediate service interactions

Service “contract” can come with a QoS “clause” (SLA)

Page 61: Grid Computing, SAO, and Autonomic Computing


Anatomy of a Service Interface

Interface by contract
An explicit interface definition or contract is used to bind a service requestor and a service provider
Specifies explicitly only the mutual behaviour; specifies nothing about the implementation of the requestor or the provider
Allows either to change implementation or identity freely

Interface granularity
Based on Service Type
Examples:
• Business Process Services
• Business Transaction Services
• Business Function Services
• Technical Function Services

[Figure: SYSTEM 1 and SYSTEM 2, each with internal code and process behind its interface code, bound by a CONTRACT of shared process and interface definitions]

Page 62: Grid Computing, SAO, and Autonomic Computing


Refactoring: Things to Deal With

Many Existing Applications are Monolithic or Tightly Coupled
Need to Re-Factor Applications
Some things to worry about are:
• Distributed threads
• Data locking
• Latency

Re-Hosting Applications
Exploit Meta-OS services
Achieve platform independence
Re-Factor for distributed parallel execution

Need for Re-Hosted Middleware
Ability to Exploit Grid computing services, e.g. Distributed Provisioning
Manage (and exploit) Quality of Service across the Grid

Challenge: Move to and Exploit Services Oriented Architecture

Page 63: Grid Computing, SAO, and Autonomic Computing


Can Your Application Benefit from Grid Computing?

How do you know if your application can benefit from Grid computing? Ask these questions:

Q. Is the application computationally intensive?
Q. Does it serve a distributed or collaborative community?
Q. Can the tasks or jobs the application performs run in parallel?
Q. Does the application do pattern matching?
Q. Does it have a reasonable network bandwidth profile?

A. If the answer to any or all of these is yes, then Grid-enablement is feasible.

Q. What is the application processing type (e.g., serial or batch)?

A. Batch is currently more amenable to Grid enablement.

Q. Do the operations within the task have time and/or sequencing dependencies?

A. The fewer dependencies, the better.

Q. What are the bottlenecks in the existing use of the application (e.g., single processor performance, scalability, memory, data output volume, pre/post processing)?

A. Grid can potentially address these bottlenecks.

Page 64: Grid Computing, SAO, and Autonomic Computing


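CPU – Make Execution Parallel

Rearranging computations to execute in parallel on Grid

[Figure: processors plotted against time]
Serial application, one processor: 223+837+383+662+121+554+123+816+228+772+452+827+972+274+...+832+971+753+981+2282+23
Parallel application, one partial sum per processor:
• 223+...+772
• 452+...+845
• 183+...+559
• 884+...+121
• 314+...+265
• 271+...+173
• 491+...+23
Final step combining the partial sums: 2443+...+9772
"Parallel application done" is reached earlier on the time axis than "Serial application done".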
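
The rearrangement of one long sum into partial sums can be sketched as follows. This is a minimal illustration, assuming a local thread pool as a stand-in for grid nodes (threads in CPython do not speed up pure arithmetic; the point is the structure). The helper names `chunked` and `parallel_sum` are made up for the example, and the input is the first ten numbers from the slide.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def chunked(values: List[int], parts: int) -> List[List[int]]:
    """Split values into at most `parts` contiguous chunks."""
    if not values:
        return []
    size = -(-len(values) // parts)  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def parallel_sum(values: List[int], workers: int = 4) -> int:
    chunks = chunked(values, workers)
    # Each worker computes one partial sum (one row of the diagram).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sum, chunks))
    # Final step: combine the partial sums.
    return sum(partial_sums)


numbers = [223, 837, 383, 662, 121, 554, 123, 816, 228, 772]
total = parallel_sum(numbers)
chunk_sizes = [len(chunk) for chunk in chunked(numbers, 4)]
```

The rearrangement is valid because addition is associative, so the partial sums can be computed in any order and on any node before the final combine step.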

Page 65: Grid Computing, SAO, and Autonomic Computing


CPU – Programming Code Control Graph

[Figure: program control graph built from Sequence, if, and Loop nodes]

Rearranging computations
• Separate subgraphs to run in parallel
• Consider data dependencies
• Change algorithms

Page 66: Grid Computing, SAO, and Autonomic Computing


Compute & Data Intensive Application

Video conversion problem
Capture video tape onto computer hard drive
• About 200 Megabytes per minute
• 25 Gigabytes for a 2 hour tape
Compress video and audio
• Can take days at higher quality level
Write VCD, SVCD, or DVD disk (650 MB to 4.7 GB)

Page 67: Grid Computing, SAO, and Autonomic Computing


Compute & Data Intensive Application

Single stream:
[Figure: VCR, 2 hours, HD (24 GB), compression (30 hours), HD (4.7 GB), 10 minutes]

Using a Grid:
[Figure: VCR, 2 hours, HD (24 GB), data transfer to the Grid (45 minutes at 100 Mb/s), compression in parallel on grid nodes with their own HDs (<<30 hours), data transfer back (9 minutes at 100 Mb/s), HD (4.7 GB), 10 minutes]
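
The transfer times in the diagram can be checked with a little arithmetic. The sketch below assumes decimal units (1 MB = 10^6 bytes), reads "100mb/s" as 100 megabits per second, and assumes compression scales perfectly with the number of nodes; all three are assumptions, and the function names are made up for the example. Under those assumptions 24 GB needs 32 minutes at the full line rate, so the 45 minutes on the slide implies an effective rate of roughly 70 Mb/s, and the 9 minutes for 4.7 GB implies about the same.

```python
MB = 10**6  # decimal megabyte (assumption)
GB = 10**9  # decimal gigabyte (assumption)


def transfer_minutes(num_bytes: float, link_mbps: float) -> float:
    """Minutes to move num_bytes over a link of link_mbps megabits per second."""
    return num_bytes * 8 / (link_mbps * 10**6) / 60


def effective_mbps(num_bytes: float, minutes: float) -> float:
    """Throughput in megabits per second implied by a measured transfer time."""
    return num_bytes * 8 / (minutes * 60) / 10**6


def grid_hours(nodes: int, serial_compress_hours: float = 30.0,
               transfer_in_min: float = 45.0, transfer_out_min: float = 9.0) -> float:
    """Transfer plus compression time, assuming perfect scaling across nodes."""
    return (transfer_in_min + transfer_out_min) / 60 + serial_compress_hours / nodes


capture_bytes = 200 * MB * 120   # about 200 MB per minute for a 2 hour tape
dvd_bytes = 4.7 * GB             # size of the compressed result

ideal_in_minutes = transfer_minutes(capture_bytes, 100)
ideal_out_minutes = transfer_minutes(dvd_bytes, 100)
implied_in_mbps = effective_mbps(capture_bytes, 45)
implied_out_mbps = effective_mbps(dvd_bytes, 9)
ten_node_hours = grid_hours(10)
```

The 54 minutes of transfer is a fixed cost, so the grid only pays off when the compression time saved is larger than that; with ten perfectly scaling nodes the 30 hours would drop to about 3.9 hours including transfer.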

Page 68: Grid Computing, SAO, and Autonomic Computing


Compute & Data Intensive Application

Overlapping data transfer with capture and computing:
[Figure: VCR, 2 hours, HD (24 GB); many small data transfers move the captured data to HDs in the Grid, where compression runs in parallel on several nodes; the results are transferred back to HD (4.7 GB), 10 minutes]

Page 69: Grid Computing, SAO, and Autonomic Computing


Six Strategies for Grid Application Enablement

Page 70: Grid Computing, SAO, and Autonomic Computing


Six Strategies for Grid Application Enablement

Strategy 1: Batch Anywhere
• Only the grid (not the application, the client, the user, or anything else) decides which node to use for the job
• The machine submitting the job might not be a node in the grid
• Example application: a query to determine whether a given number, x, is a prime number. More than one node in the grid can submit the same query. The grid returns the correct results to the submitter.

Strategy 2: Independent Concurrent Batch
• Multiple independent instances of the same application run concurrently and independently without interference.
• Independent jobs are common. For example, Job X for Account A can run concurrently with Job X for Account B.
• Databases and other resources don't have hot spots or deadlocks.

Strategy 3: Parallel Batch
• Take each user's batch work, subdivide it, disperse it out to multiple nodes, collect it, and then aggregate the results.
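
The Parallel Batch steps (subdivide, disperse, collect, aggregate) can be sketched with the prime-number example from Strategy 1. This is a minimal sketch that assumes a local thread pool in place of real grid nodes; `subdivide`, `count_primes`, and `parallel_batch` are hypothetical names, not part of any grid middleware.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def is_prime(n: int) -> bool:
    """Trial division; fine for a small illustration."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def subdivide(low: int, high: int, parts: int) -> List[Tuple[int, int]]:
    """Split [low, high) into at most `parts` contiguous sub-ranges."""
    step = max(1, -(-(high - low) // parts))  # ceiling division
    return [(start, min(start + step, high)) for start in range(low, high, step)]


def count_primes(bounds: Tuple[int, int]) -> int:
    """The unit of work one node would run."""
    low, high = bounds
    return sum(1 for n in range(low, high) if is_prime(n))


def parallel_batch(low: int, high: int, nodes: int = 4) -> int:
    jobs = subdivide(low, high, nodes)                    # subdivide
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        partial = list(pool.map(count_primes, jobs))      # disperse and collect
    return sum(partial)                                   # aggregate


prime_count = parallel_batch(0, 100)
jobs_for_four_nodes = subdivide(0, 100, 4)
```

The sub-ranges share no state, so each one could run on a different node; only the final `sum` needs all partial results in one place.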

Page 71: Grid Computing, SAO, and Autonomic Computing


Six Strategies for Grid Application Enablement

Strategies 4, 5, & 6 use services on the grid in order to get jobs done.

Strategy 4: Service
• Focus on the transition from a batch to a service-oriented architecture
• A follow-on to Independent Concurrent Batch
• It is not assumed that each client subdivides its work and spreads it over multiple service instances

Strategy 5: Parallel Services
• Service with the subdivided work model of Parallel Batch.
• Provides multiple service instances
• Permits these instances to be invoked in parallel on the client's behalf

Strategy 6: Tightly Coupled Parallel Programs
• The domain of specialized applications in engineering, physics, and biological modeling, such as finite state analysis
• Provides intense communications and synchronization between client and services and among services

Page 72: Grid Computing, SAO, and Autonomic Computing


From Enablement to Exploitation

Page 73: Grid Computing, SAO, and Autonomic Computing


Three Stages for Implementation

Run
Strategies 1 and 2, and the simplest form of Strategy 3, focus on the ability of an application to run in a grid.

Adapt
The more complex form of Strategy 3 as well as Strategies 4 and 5 significantly adapt the function and value of the business application by enabling it to use a grid without requiring many changes that are specific to grid middleware. The same application could be structured to run in a non-grid environment.

Exploit
Applications at Strategy 6 exploit the grid or cluster infrastructure for their operation because they were written from the start with a grid in mind. Strategy 6 applications cannot finish in a timely and successful manner without running in a grid.

See: http://www-106.ibm.com/developerworks/grid/library/gr-enable/

Page 74: Grid Computing, SAO, and Autonomic Computing


It’s Not Just Limited to Applications

Middleware
• Application Servers
• GPFS, Database, Transaction Managers
• Systems Management Software
• Collaborative Software
• …

Resources
• Processors
• Storage
• Network
• …

And more…

Page 75: Grid Computing, SAO, and Autonomic Computing


Example: GPFS Parallel Access

Parallel Cluster File System
Cluster – fabric-interconnected nodes (IP, SAN, …)
Shared disk – all data and metadata on fabric-attached disk
Parallel – data and metadata flows from all of the nodes to all of the disks in parallel under control of distributed lock manager.
Fine grain locks – efficient sharing of individual files

[Figure: GPFS File System Nodes connected through a switching fabric (system or storage area network) to shared disks (SAN-attached or network block device)]

Page 76: Grid Computing, SAO, and Autonomic Computing


GPFS: Information Management For The Grid

Goal: sharing GPFS file systems over the WAN
WAN adds 10-60 ms latency… but under load, storage latency is much higher than this anyway!

New GPFS feature
• GPFS NSD now allows both SAN and IP access to storage
• SAN-attached nodes go direct
• Non-SAN nodes use NSD over IP

Award winning demo at SC03
Work in progress

[Figure: three sites (SDSC, NCSA, Sc2003), each with compute nodes, NSD servers, a SAN, and a GPFS file system (/SDSC, /NCSA, /Sc2003), connected by Scinet; /SDSC and /NCSA are shown both over SAN and over WAN; additional label: Visualization]

Page 77: Grid Computing, SAO, and Autonomic Computing


Some Important Infrastructure Considerations

Security
• Authentication/authorization
• Client and server concerns

Information services
• What resources exist, what is their state and how do I access them?

Data management
• How do I access, move, replicate data to where I need it?

Resource management
• How do I run a job and monitor its state?

Page 78: Grid Computing, SAO, and Autonomic Computing


Grid Computing, SAO, and Autonomic Computing

The Realm of the Possible

Page 79: Grid Computing, SAO, and Autonomic Computing


Why Are Customers Implementing Grid Computing Solutions?

Accelerate Business Processes
Grids provide the ability to shorten application run-times without upgrading existing servers. (e.g. Charles Schwab, MassMutual, RBC Insurance, Nippon Life Insurance, Royal Dutch Shell, EADS)

Ability to run new High Performance Computing (HPC) applications
Grid computing provides the opportunity to run new applications due to the cost effective grid virtual computing environment. (e.g. AIST, UMass, FNMOC, TeraGrid)

Data Sharing & Collaboration
Grid architecture provides the ability to store, share and analyze large volumes of data. (e.g. eDiamond, NDMA, WestGrid, CERN, European DataGrid, Kansai Electric)

Accelerate Research & Development
Grids provide Life Science companies the ability to speed up drug research & development. (e.g. Smallpox Grid, Aventis, Novartis)

I/T Optimization & Resiliency – Virtualization of Servers & Storage

Page 80: Grid Computing, SAO, and Autonomic Computing


Grid Computing – Industry Applications

Unique by Industry with Common Characteristics

Industries: Energy, Financial Services, Manufacturing, Life Sciences, Telco & Media, Government & Higher Education

Applications: Derivatives Analysis, Statistical Analysis, Portfolio Risk Analysis, Batch Throughput, Product Design, Process Simulation, Finite Element Analysis, Failure Analysis, Cancer Research, Drug Discovery, Protein Folding, Protein Sequencing, Collaborative Research, Weather Analysis, High Energy Physics, Seismic Analysis, Reservoir Analysis, Bandwidth Consumption, Digital Rendering, Multiplayer Gaming

[Figure: the industry applications sit on a common Grid Infrastructure layer; legend: Primary Focus]

Page 81: Grid Computing, SAO, and Autonomic Computing


IBM Grid Focus Areas and Information Grid

Virtualization of Compute and Information Resources

Sectors (across the five focus areas): Financial Services, Public, Industrial

Research and Development Grid
Description: Accelerate and enhance the R&D process by enabling the sharing of data and computing power seamlessly for research intensive applications
Information Grid: Sharing of public data sources. Also supports use of shared compute resources.

Engineering and Design Grid
Description: Share data and computing power, for computing intensive engineering and scientific applications, to accelerate product design
Information Grid: Sharing design data across large multi-party projects.

Business Analytics Grid
Description: Enable faster and more comprehensive business planning and analysis through the sharing of data and computing power
Information Grid: Facilitating access to large scale data marts and broadly distributed client data.

Enterprise Optimization Grid
Description: Optimize computing and data assets to improve utilization, efficiency and business continuity
Information Grid: Virtualized distributed storage and data resources

Government Development Grid
Description: Create large-scale IT infrastructures to drive economic development and/or enable new government services
Information Grid: Provide large scale data sharing infrastructure for industry and scientific collaboration

Page 82: Grid Computing, SAO, and Autonomic Computing


National Digital Mammographic Archive

Sponsored by: University of Pennsylvania

[Figure: network map. Labels: MREN, STARTAP, MAGPI, NCNI, SOX, ESNet, CANet*2, U Toronto, U Penn, U NC, Oak Ridge, U CHI, NGIX Chicago Peering Point, Indianapolis GigaPop, Atlanta GigaPop, New York GigaPop. Legend: Abilene Peer Network, Abilene Connector, Project Site]

Page 83: Grid Computing, SAO, and Autonomic Computing


NDMA: National Digital Mammographic Archive

Research & Development
Combines Grid Computing with Radiology to make breast cancer diagnosis faster and treatment more effective
IBM assisted with implementing a Grid infrastructure across the hospitals to manage and retrieve digital mammograms
Secure transmission of all patient records
Grid solution architecture includes:
• IBM pSeries, xSeries, Linux, DB2, GPFS, Globus

WebSite: http://nscp.upenn.edu/NDMA

Page 84: Grid Computing, SAO, and Autonomic Computing


The TeraGrid – Extensible Terascale Facility

National Science Foundation Grid Computing project ($90M):
Connects nine (9) major supercomputing sites: NCSA, SDSC, Argonne NL, CalTech, PSC, UTexas, IndianaU, PurdueU, Oak Ridge NL
• 40 gigabit network backbone connecting the sites
• 20 Teraflops of computing power
• 1 Petabyte of disk accessible data storage
Accessible to thousands of scientists working on advanced research

Applications include:
• Real Time Brain Mapping
• Earthquake Modeling
• Molecular Dynamics simulation
• Mcell – Monte Carlo simulation of cellular micro physiology
• Encyclopedia of Life – Protein catalog

IBM project team and solution includes:
• IBM High Performance Computing (HPC) expertise
• IBM GPFS expertise
• IBM Linux Clusters – Itanium2 processors
• IBM Power4 processors – p690 Regattas
• IBM Grid Computing & Linux consulting services

Page 85: Grid Computing, SAO, and Autonomic Computing


UK eScience Grid

Multiple Grid Applications including:
• High Energy Physics
• Aircraft Engine Maintenance
• Combinatorial Chemistry
• Oceanographic study
• Particle Physics & Astronomy
• Biomolecular analysis
• Environmental simulation

Heterogeneous Grid:
• IBM, Sun, HP servers
• Linux, Globus, Condor, SRB

[Figure: map of the grid. Labels: CERN, Cambridge, Newcastle, Edinburgh, US Sites, EU, Glasgow, Cardiff, Southampton, London, Belfast, Dublin, Oxford, Manchester]

Page 86: Grid Computing, SAO, and Autonomic Computing


The SMALLPOX Research Grid

A massive distributed computing grid running a computational chemistry application to help fight the smallpox virus:
• Screened 35 million potential drug molecules
• Two (2) million computer processors in 200 countries were connected to this grid

The Grid architecture will reduce the time required to develop a commercial drug by several years:
• “In-Silico” Research

IBM collaborated with:
• United Devices
• Accelrys
• Evotec OAI
• US Department of Defense
• Oxford University

IBM provided the hardware and software for storing and analyzing the molecule screening results:
• p690, AIX, DB2, Linux

Page 87: Grid Computing, SAO, and Autonomic Computing


Butterfly.net

The Butterfly Grid:
• an end-to-end solution designed to support up to one million simultaneous users
• based on IBM WebSphere Application Server, DB2 and the Globus Toolkit
• running on IBM eServer xSeries clusters at an IBM e-business Hosting Center

Modeling and Simulation platform

Page 88: Grid Computing, SAO, and Autonomic Computing


The Butterfly Grid: Service Provider Program

Package:
• Butterfly server software suite
• Butterfly game admin apps
• Globus, provisioning, policy mgt, billing

Shared Grid or Dedicated
• Gamers
• Industrial
• Military

Notes
• Inter-node resource sharing
• Value-added broadband package
• SLAs, QoS guarantees, ratings/certification

Page 89: Grid Computing, SAO, and Autonomic Computing


Japan AIST (National Institute of Advanced Industrial Science & Technology)

Challenge
AIST, Japan's largest national research organization, needed to provide an on-demand computing infrastructure which dynamically adapts to support various research requirements of its collaborators focusing on grid computing, life sciences, and nanotechnology.

Solution
Linux Cluster
• 2116 CPU AMD Opteron Cluster
• 520 CPU Intel Madison Cluster
Globus Toolkit 3.0 (OGSA)

One of the world’s most powerful Linux-based supercomputers
More than 11 trillion calculations per second
More powerful than the current third most powerful supercomputer in the world

[Figure: Advanced Computing Center connected over LAN and Internet. Labels: Collaborations, Government, Academia, Corporations, Other Research Institutes, Grid Technology, Life Science, Nanotechnology]

Page 90: Grid Computing, SAO, and Autonomic Computing


Grid Computing @ IBM

27 different geographic locations
137 end user teams
66 Grid applications

Heterogeneous platforms:
- Linux on x, z, p series
- AIX on pSeries

Globus 2.2 & Globus 3.0

[Figure: map of locations, labels as shown on the slide]
Charlotte (1) 3
RTP (2) 7
Cambridge (2) 4
Hawthorne (2) 4
Poughkeepsie (4) 4
Somers (1) 1
Southbury (4) 13
Yorktown Heights (7) 28
Markham (2) 14
San Jose (7) 13
San Mateo (2) 7
Hursley (1) 3
London (1) 2
Montpellier (2) 10
Uithorn (1) 2
Boeblingen (2) 2
Zurich (1) 6
Haifa (1) 2
Austin (9) 58
Roanoke (2) 4
Bangalore (1) 1
Chiba (1) 2
Tokyo (2) 2
Taipei (2) 2
Rochester (3) 15
Chicago (1) 1
Sapporo (1) 4
Beijing (2) 7

Page 91: Grid Computing, SAO, and Autonomic Computing


IBM Grid Middleware –Product Roadmap

Grid Services (OGSA) & Web Services

Scheduling

Information Virtualization

Provisioning

Workload Management

Billing and Metering

Transaction Management

Grid Capabilities

TotalStorage

GridXpert

IBM Grid Toolbox

Page 92: Grid Computing, SAO, and Autonomic Computing


Grid Deployment Options

A Function of Business Need, Technology and Organizational Flexibility

1. Intra-Grids
2. Extra-Grids
3. Inter-Grids

[Diagram labels: Grid; NAS/SAN; VPN]

Page 93: Grid Computing, SAO, and Autonomic Computing


… Look Familiar?

Page 94: Grid Computing, SAO, and Autonomic Computing


… How About This?

Page 95: Grid Computing, SAO, and Autonomic Computing


Summary

Grid Computing is still evolving
It is built on existing and new open computing standards
It exploits existing components and technologies
It can be, and is being, used today
There are many ways and places to exploit Grid Computing
Make decisions based on “business” needs
IBM is leading with both products and services for Grid Computing

Page 96: Grid Computing, SAO, and Autonomic Computing


Thank You

Questions?

Page 97: Grid Computing, SAO, and Autonomic Computing


References (Articles and Publications)

M. Mitchell Waldrop, Grid Computing, MIT Technology Review, May 2002, pp. 30-37

I. Foster, C. Kesselman, S. Tuecke, The Anatomy of the Grid, http://www.globus.org/research/papers/anatomy.pdf

I. Foster, C. Kesselman, J. Nick, S. Tuecke, The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration, http://www.globus.org/research/papers/ogsa.pdf

I. Foster, C. Kesselman, eds., The Grid: Blueprint for a New Computing Infrastructure, Morgan Kaufmann, San Francisco, Calif. (1999)

IBM Redbook: Introduction to Grid Computing with Globus, http://www.ibm.com/redbooks/

Page 98: Grid Computing, SAO, and Autonomic Computing


References (URLs)

IBM Grid Web Site: http://www.ibm.com/grid/
Globus: http://www.globus.org/
OGSA (Open Grid Services Architecture): http://www.globus.org/ogsa/
Global Grid Forum: http://www.gridforum.org
Grid Computing Planet: http://www.gridcomputingplanet.com
Grid Today Newsletter: http://www.gridtoday.com
NASA's Information Power Grid: http://www.ipg.nasa.gov
DOE Science Grid: http://www.doesciencegrid.org
Particle Physics Data Grid, PPDG: http://www.ppdg.net/
National Digital Mammographic Archive: http://www.isi.edu/us-uk.gridworkshop/presentations/hollebeek.pdf
NSF TeraGrid: http://www.teragrid.org/
NASA Information Power Grid: http://www.nas.nasa.gov/About/IPG/ipg.html
UK eScience Program: http://www.research-councils.ac.uk/escience/
UK e-Science Grid Program: http://www.escience-grid.org.uk/
e-Diamond: http://www.gridoutreach.org.uk/docs/pilots/ediamond.htm
European Union DataGrid Project: http://www.eu-datagrid.org/

Page 99: Grid Computing, SAO, and Autonomic Computing


Additional IBM Grid Information: Red Paper & Red Book

Download from http://www.redbooks.ibm.com

Page 100: Grid Computing, SAO, and Autonomic Computing


IBM Redbook: Grid Enabling Applications

Download from www.redbooks.ibm.com

Page 101: Grid Computing, SAO, and Autonomic Computing


References:

http://www.ibm.com/developerworks/grid/library/gr-visual

developerWorks Journal, November 2003 Issue

Good reference for IBM and customer technical people

Covers some of the same material as this presentation

Page 102: Grid Computing, SAO, and Autonomic Computing


References: (Continued)

http://www.varbusiness.com/sections/news/breakingnews.asp?articleid=45311
varBusiness, October 27, 2003 Issue