grid computing & web services: a natural partnership

Grid Computing & Web Services: A Natural Partnership Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer Science The University of Chicago Address of Poznan Supercomputing & Networking Center Poznan, Poland February 7, 2002 Dave Angulo Department of Computer Science The University of Chicago and Mathematics and Computer Science Division Argonne National Laboratory


Page 1: Grid Computing & Web Services: A Natural Partnership

Grid Computing & Web Services: A Natural Partnership

Ian Foster
Mathematics and Computer Science Division, Argonne National Laboratory
and Department of Computer Science, The University of Chicago

Address of Poznan Supercomputing & Networking Center
Poznan, Poland
February 7, 2002

Dave Angulo
Department of Computer Science, The University of Chicago
and Mathematics and Computer Science Division, Argonne National Laboratory

Page 2: Grid Computing & Web Services: A Natural Partnership

[email protected] University of Chicago

Partial Acknowledgements

Open Grid Services Architecture work is performed by
  – Ian Foster, Globus Co-PI @ Argonne/UofC
  – Carl Kesselman, Globus Co-PI @ USC/ISI
  – Steve Tuecke, Globus Toolkit Architect @ ANL
  – Jeff Nick, Steve Graham, Jeff Frey @ IBM
Globus Toolkit R&D involves many fine scientists & engineers at ANL, USC/ISI, and elsewhere (see www.globus.org)
Strong collaborations with many outstanding EU, UK, US Grid projects
Support from DOE, NASA, NSF, Microsoft

Page 3: Grid Computing & Web Services: A Natural Partnership

Partial Acknowledgements

Globus Toolkit™
  – R&D involves
    > many fine scientists & engineers at ANL/UofC, USC/ISI, and elsewhere (see www.globus.org)
  – Led by
    > Ian Foster @ Argonne/UofC
    > Carl Kesselman @ USC/ISI
Open Grid Services Architecture work performed by
  – Ian Foster, Globus Co-PI @ Argonne/UofC
  – Carl Kesselman, Globus Co-PI @ USC/ISI
  – Steve Tuecke, Globus Toolkit Architect @ ANL
  – Jeff Nick, Steve Graham, Jeff Frey @ IBM
Strong collaborations with many outstanding EU, UK, US Grid projects
Support from DOE, NASA, NSF, Microsoft, IBM

Page 4: Grid Computing & Web Services: A Natural Partnership


Grid Computing

Page 5: Grid Computing & Web Services: A Natural Partnership

The Grid Problem

Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations

Page 6: Grid Computing & Web Services: A Natural Partnership

Why Grids?

A biochemist exploits 10,000 computers to screen 100,000 compounds in an hour
1,000 physicists worldwide pool resources for petaflop analyses of petabytes of data
Civil engineers collaborate to design, execute, & analyze shake table experiments
Climate scientists visualize, annotate, & analyze terabyte simulation datasets
A home user invokes architectural design functions at an application service provider
  – An application service provider purchases cycles from compute cycle providers

Page 7: Grid Computing & Web Services: A Natural Partnership

Elements of the Problem

Resource sharing
  – Computers, storage, sensors, networks, …
  – Sharing always conditional: issues of trust, policy, payment, …
Coordinated problem solving
  – Beyond client-server: distributed data analysis, computation, …
Dynamic, multi-institutional virtual orgs
  – Community overlays on classic org structures
  – Large or small, static or dynamic

Page 8: Grid Computing & Web Services: A Natural Partnership

Grids: Why Now?

Moore’s law improvements in computing produce highly functional end systems
The Internet and burgeoning wired and wireless provide universal connectivity
Network exponentials produce dramatic changes in geometry and geography

Page 9: Grid Computing & Web Services: A Natural Partnership

Grids: Why Now?

Moore’s law improvements in computing produce highly functional end systems
The Internet and burgeoning wired and wireless provide universal connectivity
Network exponentials produce dramatic changes in geometry and geography
  – 9-month doubling: double Moore’s law!
  – 1986-2001: x340,000; 2001-2010: x4000?
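The growth figures on this slide can be checked with a little arithmetic. The sketch below assumes steady exponential growth and, for comparison only, an 18-month doubling period for Moore's law (an assumption; the slide does not state one). The function names are illustrative.

```python
import math


def growth_factor(months: float, doubling_months: float) -> float:
    """Growth multiple after `months` at a fixed doubling period."""
    return 2.0 ** (months / doubling_months)


def implied_doubling_months(months: float, factor: float) -> float:
    """Doubling period that would produce `factor` growth over `months`."""
    return months / math.log2(factor)


# 2001-2010 is 9 years; a 9-month doubling gives 12 doublings.
network_2001_2010 = growth_factor(9 * 12, 9)

# Assumed 18-month Moore's-law doubling over the same 9 years.
moore_2001_2010 = growth_factor(9 * 12, 18)

# Doubling period implied by x340,000 over 1986-2001 (15 years).
historical_doubling = implied_doubling_months(15 * 12, 340_000)

print(network_2001_2010)              # 4096.0, i.e. roughly x4000
print(moore_2001_2010)                # 64.0
print(round(historical_doubling, 1))  # 9.8 (months)
```

Under these assumptions the x4000 projection is simply twelve doublings, and the 1986-2001 figure is itself consistent with a doubling period of a little under ten months.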

Page 10: Grid Computing & Web Services: A Natural Partnership

The Grid World: Current Status

Dozens of major Grid projects in scientific & technical computing/research & education
  – Deployment, application, technology
Considerable consensus on key concepts and technologies
  – Globus Toolkit™ has emerged as de facto standard for major protocols & services
Global Grid Forum has emerged as a significant force
  – And first “Grid” proposals at IETF

Page 11: Grid Computing & Web Services: A Natural Partnership

Selected Major Grid Projects

Name | URL | Sponsors | Focus
Access Grid | www.mcs.anl.gov/FL/accessgrid | DOE, NSF | Create & deploy group collaboration systems using commodity technologies
BlueGrid | (none listed) | IBM | Grid testbed linking IBM laboratories
DISCOM | www.cs.sandia.gov/discom | DOE Defense Programs | Create operational Grid providing access to resources at three U.S. DOE weapons laboratories
DOE Science Grid | sciencegrid.org | DOE Office of Science | Create operational Grid providing access to resources & applications at U.S. DOE science laboratories & partner universities
Earth System Grid (ESG) | earthsystemgrid.org | DOE Office of Science | Delivery and analysis of large climate model datasets for the climate research community
European Union (EU) DataGrid | eu-datagrid.org | European Union | Create & apply an operational grid for applications in high energy physics, environmental science, bioinformatics

Page 12: Grid Computing & Web Services: A Natural Partnership

Selected Major Grid Projects

Name | URL | Sponsor | Focus
EuroGrid, Grid Interoperability (GRIP) | eurogrid.org | European Union | Create technologies for remote access to supercomputer resources & simulation codes; in GRIP, integrate with Globus
Fusion Collaboratory | fusiongrid.org | DOE Off. Science | Create a national computational collaboratory for fusion research
Globus Project | globus.org | DARPA, DOE, NSF, NASA, Msoft | Research on Grid technologies; development and support of Globus Toolkit; application and deployment
GridLab | gridlab.org | European Union | Grid technologies and applications
GridPP | gridpp.ac.uk | U.K. eScience | Create & apply an operational grid within the U.K. for particle physics research
Grid Research Integration Dev. & Support Center | grids-center.org | NSF | Integration, deployment, support of the NSF Middleware Infrastructure for research & education

Page 13: Grid Computing & Web Services: A Natural Partnership

Selected Major Grid Projects

Name | URL | Sponsor | Focus
Grid Application Dev. Software | hipersoft.rice.edu/grads | NSF | Research into program development technologies for Grid applications
Grid Physics Network | griphyn.org | NSF | Technology R&D for data analysis in physics expts: ATLAS, CMS, LIGO, SDSS
Information Power Grid | ipg.nasa.gov | NASA | Create and apply a production Grid for aerosciences and other NASA missions
International Virtual Data Grid Laboratory | ivdgl.org | NSF | Create international Data Grid to enable large-scale experimentation on Grid technologies & applications
Network for Earthquake Eng. Simulation Grid | neesgrid.org | NSF | Create and apply a production Grid for earthquake engineering
Particle Physics Data Grid | ppdg.net | DOE Science | Create and apply production Grids for data analysis in high energy and nuclear physics experiments

Page 14: Grid Computing & Web Services: A Natural Partnership

Selected Major Grid Projects

Name | URL | Sponsor | Focus
TeraGrid | teragrid.org | NSF | U.S. science infrastructure linking four major resource sites at 40 Gb/s
UK eScience Grid | grid-support.ac.uk | U.K. eScience | Support center for Grid projects within the U.K.
Unicore | (none listed) | BMBFT | Technologies for remote access to supercomputers

Also many technology R&D projects: e.g., Condor, NetSolve, Ninf, NWS

See also www.gridforum.org

Page 15: Grid Computing & Web Services: A Natural Partnership

Grid Communities & Applications: Data Grids for High Energy Physics

[Figure: tiered data grid for high energy physics. Elements shown: Online System; Offline Processor Farm (~20 TIPS); CERN Computer Centre; FermiLab (~4 TIPS); France, Italy, and Germany Regional Centres; Caltech and other Tier2 Centres (~1 TIPS each); Institutes (~0.25 TIPS) with a physics data cache; physicist workstations. Tier labels: Tier 0, Tier 1, Tier 2, Tier 4. Link labels: ~PBytes/sec; ~100 MBytes/sec; ~622 Mbits/sec; ~622 Mbits/sec or Air Freight (deprecated); ~1 MBytes/sec.]

There is a “bunch crossing” every 25 nsecs.
There are 100 “triggers” per second
Each triggered event is ~1 MByte in size
Physicists work on analysis “channels”.
Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server
1 TIPS is approximately 25,000 SpecInt95 equivalents

www.griphyn.org www.ppdg.net www.eu-datagrid.org
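The event figures on this slide imply a recorded data rate that can be worked out directly. The sketch below uses only the numbers stated on the slide; the per-day total additionally assumes continuous running and decimal units (1 TByte = 10^6 MBytes), which the slide does not state.

```python
BUNCH_CROSSING_INTERVAL_S = 25e-9   # one bunch crossing every 25 ns
TRIGGERS_PER_SECOND = 100           # events kept per second
EVENT_SIZE_MBYTES = 1.0             # ~1 MByte per triggered event

crossings_per_second = 1.0 / BUNCH_CROSSING_INTERVAL_S
crossings_per_trigger = crossings_per_second / TRIGGERS_PER_SECOND
recorded_mbytes_per_second = TRIGGERS_PER_SECOND * EVENT_SIZE_MBYTES

# Assumption for illustration: continuous running, decimal units.
recorded_tbytes_per_day = recorded_mbytes_per_second * 86_400 / 1e6

print(f"{crossings_per_second:.0f} crossings/s")          # 40000000 crossings/s
print(f"1 trigger per {crossings_per_trigger:.0f} crossings")
print(f"{recorded_mbytes_per_second:.0f} MBytes/s recorded")
print(f"{recorded_tbytes_per_day:.2f} TBytes/day recorded")
```

The 100 MBytes/s result is consistent with the ~100 MBytes/sec link label in the figure.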

Page 16: Grid Computing & Web Services: A Natural Partnership

Grid Communities and Applications: Mathematicians Solve NUG30

Community = an informal collaboration of mathematicians and computer scientists
Condor-G delivers 3.46E8 CPU seconds in 7 days (peak 1009 processors) in U.S. and Italy (8 sites)
Solves NUG30 quadratic assignment problem
  14,5,28,24,1,3,16,15,10,9,21,2,4,29,25,22,13,26,17,30,6,20,19,8,18,7,27,12,11,23

www.mcs.anl.gov/metaneos: Argonne, Iowa, NWU, Wisconsin
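To put the Condor-G figures in perspective, the average number of busy processors follows from the CPU seconds and the elapsed time. The sketch assumes the run occupied the full 7 days of wall-clock time (the slide gives only "7 days"), and it also checks that the listed solution is a permutation of 1..30, as an assignment of 30 facilities to 30 locations must be.

```python
CPU_SECONDS = 3.46e8
WALL_CLOCK_SECONDS = 7 * 24 * 3600   # assumed: the full 7 days
PEAK_PROCESSORS = 1009

average_processors = CPU_SECONDS / WALL_CLOCK_SECONDS
average_to_peak = average_processors / PEAK_PROCESSORS

solution = [14, 5, 28, 24, 1, 3, 16, 15, 10, 9, 21, 2, 4, 29, 25,
            22, 13, 26, 17, 30, 6, 20, 19, 8, 18, 7, 27, 12, 11, 23]
is_valid_assignment = sorted(solution) == list(range(1, 31))

print(round(average_processors))    # 572
print(round(average_to_peak, 2))    # 0.57
print(is_valid_assignment)          # True
```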

Page 17: Grid Computing & Web Services: A Natural Partnership

Grid Communities and Applications: Network for Earthquake Eng. Simulation

NEESgrid: national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, & each other
On-demand access to experiments, data streams, computing, archives, collaboration

NEESgrid: Argonne, Michigan, NCSA, UIUC, USC www.neesgrid.org

Page 18: Grid Computing & Web Services: A Natural Partnership

The 13.6 TF TeraGrid: Computing at 40 Gb/s

[Figure: TeraGrid sites and interconnect. Site labels: NCSA/PACI (8 TF, 240 TB), SDSC (4.1 TF, 225 TB), Caltech, Argonne. Each site shows site resources, archival storage (HPSS or UniTree), and external networks.]

TeraGrid/DTF: NCSA, SDSC, Caltech, Argonne www.teragrid.org

Page 19: Grid Computing & Web Services: A Natural Partnership

Intl. Virtual Data Grid Lab.

[Figure: iVDGL facilities and links. Legend: Tier0/1 facility, Tier2 facility, Tier3 facility; 10+ Gbps link, 2.5 Gbps link, 622 Mbps link, other link.]

www.ivdgl.org

Page 20: Grid Computing & Web Services: A Natural Partnership

Access Grid

Collaborative work among large groups
~50 sites worldwide
Use Grid services for discovery, security
www.scglobal.org

[Figure: Access Grid room. Labels: ambient mic (tabletop), presenter mic, presenter camera, audience camera.]

Access Grid: Argonne, others www.accessgrid.org

Page 21: Grid Computing & Web Services: A Natural Partnership

Grid Architecture & Globus Toolkit™

The question:
  – What is needed for resource sharing & coordinated problem solving in dynamic virtual organizations (VOs)?
The answer:
  – Major issues identified: membership, resource discovery & access, …, …
  – Grid architecture captures core elements, emphasizing pre-eminent role of protocols
  – Globus Toolkit™ has emerged as de facto standard for major protocols & services

Page 22: Grid Computing & Web Services: A Natural Partnership

The Critical Role of Protocols

Need for interoperability when different groups want to share resources
  – E.g., IP lets me talk to your computer, but how do we establish & maintain sharing?
  – How do I discover, authenticate, authorize, describe what I want to do, etc., etc.?
Need for shared infrastructure services to avoid repeated development, installation, e.g.
  – One port/service for remote access to computing, not one per tool/application
  – X.509 enables sharing of Certificate Authorities

Page 23: Grid Computing & Web Services: A Natural Partnership

Grid Architecture

[Figure: Grid layers, listed here from top to bottom, drawn alongside the Internet Protocol Architecture (Application, Transport, Internet, Link).]

Application
Collective: “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services
Resource: “Sharing single resources”: negotiating access, controlling use
Connectivity: “Talking to things”: communication (Internet protocols) & security
Fabric: “Controlling things locally”: access to, & control of, resources

For more info: www.globus.org/research/papers/anatomy.pdf

Page 24: Grid Computing & Web Services: A Natural Partnership

Globus Project and Toolkit

Globus Project™
  – R&D project at ANL, U.Chicago, USC/ISI
  – Emphasis on identifying and defining core protocols and services
  – O(40) researchers & developers
Globus Toolkit™
  – A major product of the Globus Project
  – Open source software: reference implementation of core protocols & services
  – Growing open source developer community

Page 25: Grid Computing & Web Services: A Natural Partnership

Globus Toolkit: Evaluation (1)

Good technical solutions for key problems, e.g.
  – Authentication and authorization
  – Resource discovery and monitoring
  – Reliable remote service invocation
  – High-performance remote data access
This + good engineering is enabling progress
  – Good quality reference implementation, multi-language support, interfaces to many systems, large user base, industrial support
  – Growing community code base built on tools

Page 26: Grid Computing & Web Services: A Natural Partnership

Globus Toolkit: Evaluation (2)

Protocol deficiencies, e.g.
  – Heterogeneous basis: HTTP, LDAP, FTP
  – No standard means of error propagation
Significant missing functionality, e.g.
  – Databases, sensors, instruments
  – Programming tools: workflow, …
  – Virtualization of end systems (hosting envs.)
Little work on total system properties, e.g.
  – Dependability, end-to-end QoS, …
  – Reasoning about system properties

Page 27: Grid Computing & Web Services: A Natural Partnership

“Web Services”

Increasingly popular standards-based framework for accessing network applications
  – W3C standardization; Microsoft, IBM, Sun, others
WSDL: Web Services Description Language
  – Interface Definition Language for Web services
SOAP: Simple Object Access Protocol
  – XML-based RPC protocol; common WSDL target
WS-Inspection (WSIL)
  – Conventions for locating service descriptions
UDDI: Universal Desc., Discovery, & Integration
  – Directory for Web services

Page 28: Grid Computing & Web Services: A Natural Partnership

Transient Service Instances

“Web services” address discovery & invocation of persistent services
In Grids, must also support transient service instances, created/destroyed dynamically
  – E.g., to manage eBusiness workflow, video conference, or distributed data analysis
Significant implications for how services are managed, named, discovered, and used
  – In fact, much of our work is concerned with the management of service instances

Page 29: Grid Computing & Web Services: A Natural Partnership

Open Grid Services Architecture

Service orientation to virtualize resources
From Web services:
  – Standard interface definition mechanisms: multiple protocol bindings, multiple implementations, local/remote transparency
Building on Globus Toolkit:
  – The Grid service defines standard semantics for service interactions
  – Factory, registry, and mapper services
  – Reliable and secure transport
Multiple hosting targets: J2EE, .NET, “C”, etc.

Page 30: Grid Computing & Web Services: A Natural Partnership

OGSA Service Model

System comprises (typically a few) persistent services & (potentially many) transient services
All services adhere to specified Grid service interfaces and behaviors
  – Reliable invocation, lifetime management, discovery, authorization, notification, upgradeability, concurrency, manageability
Interfaces for managing Grid service instances
  – Factory, registry, mapper
Heavily leverage Globus Toolkit technology
=> Reliable secure mgmt of distributed state

Page 31: Grid Computing & Web Services: A Natural Partnership

The Grid Service

A (potentially transient) Web service with specified interfaces & behaviors, including
  – Creation (Factory)
  – Global naming (GSH) & references (GSR)
  – Lifetime management
  – Registration & Discovery
  – Authorization
  – Notification
  – Concurrency
  – Manageability

Page 32: Grid Computing & Web Services: A Natural Partnership

Factory

A Grid service with Factory interface can be requested to create a new Grid service instance
  – Reliable creation (once-and-only-once)
Create operation can be extended to accept Grid-service-specific creation parameters
Returns a Grid Service Handle (GSH)
  – A globally unique URL
  – Uniquely identifies the instance for all time
  – Based on name of a home mapper service
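The Factory behavior described on this slide can be sketched in a few lines. This is a toy model, not the OGSA interface: the class, the `request_id` mechanism used to get once-and-only-once creation, the UUID-based handle format, and the example URL are all assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Factory:
    """Toy Factory: creates instances and hands back a handle (GSH)."""

    home_mapper_url: str
    instances: Dict[str, Dict[str, Any]] = field(default_factory=dict)
    _handle_by_request: Dict[str, str] = field(default_factory=dict)

    def create(self, request_id: str, **creation_params: Any) -> str:
        # A retried request (same request_id) must not create a second
        # instance; it gets the handle issued the first time.
        if request_id in self._handle_by_request:
            return self._handle_by_request[request_id]
        # The handle is a URL rooted at the home mapper's name.
        gsh = f"{self.home_mapper_url}/{uuid.uuid4()}"
        self.instances[gsh] = dict(creation_params)
        self._handle_by_request[request_id] = gsh
        return gsh


factory = Factory("http://mapper.example.org/handles")
first = factory.create("req-1", dataset="run42")
retry = factory.create("req-1", dataset="run42")   # same handle as `first`
other = factory.create("req-2", dataset="run43")   # a distinct instance
```

Keying creation on a client-chosen request identifier is one common way to make a retried create call safe; the slide itself does not say how once-and-only-once is achieved.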

Page 33: Grid Computing & Web Services: A Natural Partnership

Mapper

A GSH is a stable name for a Grid service, but does not allow client to actually communicate with the Grid service
A Grid Service Reference (GSR) is a WSDL document that describes how to communicate with the Grid service
  – Contains protocol binding, network address, …
  – May expire (i.e., GSR information may change)
The Mapper interface allows a client to map from a GSH to a GSR
  – http get on GSH also returns a GSR
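Because a GSR may expire while the GSH stays stable, a client would plausibly cache the reference and go back to the mapper only when it has lapsed. The sketch below illustrates that pattern; the `GSR` fields, the `HandleResolver` class, the toy mapper, and the example URLs are hypothetical and stand in for the WSDL document and Mapper interface named on the slide.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class GSR:
    binding: str       # e.g. "SOAP/HTTP"
    address: str       # network address of the service
    expires_at: float  # time after which the reference must be refreshed


class HandleResolver:
    """Client-side cache that re-asks the mapper when a GSR has expired."""

    def __init__(self, mapper: Callable[[str], GSR], clock: Callable[[], float]):
        self._mapper = mapper
        self._clock = clock
        self._cache: Dict[str, GSR] = {}

    def resolve(self, gsh: str) -> GSR:
        gsr = self._cache.get(gsh)
        if gsr is None or gsr.expires_at <= self._clock():
            gsr = self._mapper(gsh)
            self._cache[gsh] = gsr
        return gsr


now = [0.0]                 # fake clock so the example is deterministic
lookups: List[str] = []     # records each call that reaches the mapper


def toy_mapper(gsh: str) -> GSR:
    lookups.append(gsh)
    host = f"host{len(lookups)}.example.org"   # pretend the service moved
    return GSR("SOAP/HTTP", f"http://{host}/service", expires_at=now[0] + 60)


resolver = HandleResolver(toy_mapper, clock=lambda: now[0])
gsh = "http://mapper.example.org/handles/abc"
first = resolver.resolve(gsh)    # mapper consulted
second = resolver.resolve(gsh)   # served from cache
now[0] = 120.0                   # the cached reference has expired
third = resolver.resolve(gsh)    # mapper consulted again
```

The handle never changes across the three calls; only the reference behind it does.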

Page 34: Grid Computing & Web Services: A Natural Partnership

Lifetime Management

GS instances created by factory or manually; destroyed explicitly or via soft state
  – Negotiation of initial lifetime with Factory
SoftStateDestruction interface supports
  – GetTerminationTime message for inquiry
    > Notification interface also allows for lifetime notification
  – SetTerminationTime message for keepalive
Soft state lifetime management avoids
  – Explicit client teardown of complex state
  – Resource “leaks” in hosting environments
ExplicitDestruction interface also available
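Soft-state lifetime management can be illustrated with a small in-memory model: each instance carries a termination time, clients extend it with keepalives, and the hosting environment reclaims whatever has lapsed. The class and method names below loosely mirror the messages on the slide but are an illustrative sketch, not the OGSA interface definitions.

```python
from typing import Dict, List


class HostingEnvironment:
    """Tracks a termination time per service instance and reaps expired ones."""

    def __init__(self) -> None:
        self._termination_time: Dict[str, float] = {}

    def create(self, gsh: str, now: float, initial_lifetime: float) -> None:
        self._termination_time[gsh] = now + initial_lifetime

    def get_termination_time(self, gsh: str) -> float:
        return self._termination_time[gsh]

    def set_termination_time(self, gsh: str, new_time: float) -> None:
        # Keepalive: the client pushes the termination time into the future.
        self._termination_time[gsh] = new_time

    def destroy(self, gsh: str) -> None:
        # Explicit destruction remains available.
        del self._termination_time[gsh]

    def sweep(self, now: float) -> List[str]:
        """Destroy and return every instance whose termination time has passed."""
        expired = [g for g, t in self._termination_time.items() if t <= now]
        for g in expired:
            del self._termination_time[g]
        return expired


env = HostingEnvironment()
env.create("gsh-a", now=0.0, initial_lifetime=10.0)
env.create("gsh-b", now=0.0, initial_lifetime=10.0)
env.set_termination_time("gsh-a", 20.0)   # client of gsh-a sends a keepalive
reaped = env.sweep(now=12.0)              # client of gsh-b has gone away
print(reaped)                             # ['gsh-b']
```

No client had to tear down gsh-b explicitly: once its keepalives stopped, the state was reclaimed, which is the "leak" avoidance the slide refers to.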

Page 35: Grid Computing & Web Services: A Natural Partnership

Discovery

A Grid service instance may maintain a set of service information
  – XML fragments encapsulated in standard <name, type, TTL-info> containers
Discovery interface allows clients to query the Grid service instance for this information
  – Query operation, plus supporting operations
    > Extensible query language support
See also Notification interfaces
  – Allows notification of service existence and about service information
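A minimal sketch of the service-information containers and a query over them follows. It assumes the simplest possible query (lookup by name, ignoring stale entries) and models the TTL info as a single "good until" timestamp; the class names, field names, and element names are hypothetical, and the extensible query language mentioned on the slide is not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ServiceDataElement:
    name: str
    type: str
    good_until: float   # stand-in for the container's TTL info
    xml: str            # the encapsulated XML fragment


class DiscoverableService:
    def __init__(self) -> None:
        self._elements: List[ServiceDataElement] = []

    def publish(self, element: ServiceDataElement) -> None:
        self._elements.append(element)

    def query_by_name(self, name: str, now: float) -> List[ServiceDataElement]:
        # Return only entries that match and are still within their TTL.
        return [e for e in self._elements
                if e.name == name and e.good_until > now]


service = DiscoverableService()
service.publish(ServiceDataElement("cpuLoad", "xsd:float", 30.0,
                                   "<cpuLoad>0.7</cpuLoad>"))
service.publish(ServiceDataElement("queueLength", "xsd:int", 5.0,
                                   "<queueLength>12</queueLength>"))

fresh = service.query_by_name("cpuLoad", now=10.0)       # still valid
stale = service.query_by_name("queueLength", now=10.0)   # TTL has passed
```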

Page 36: Grid Computing & Web Services: A Natural Partnership

Registry

The Registry interface may be used to discover a set of Grid service instances
  – Returns a WS-Inspection document containing the GSHs of a set of Grid services
  – Also returns policy associated with the set
  – Also available through Discovery interface
The RegistryManagement interface allows for soft-state registration of a Grid service
  – A set of Grid services can periodically register their GSHs into a registry service, to allow for discovery of services in that set
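Soft-state registration means an entry survives only as long as the service keeps re-registering. The sketch below shows that behavior with a fixed registration lifetime; the `Registry` class, its methods, and the example handles are illustrative assumptions, and the result is a plain list of GSHs rather than the WS-Inspection document a real registry would return.

```python
from typing import Dict, List


class Registry:
    """Soft-state registry: entries lapse unless refreshed."""

    def __init__(self, registration_ttl: float) -> None:
        self._ttl = registration_ttl
        self._expires_at: Dict[str, float] = {}

    def register(self, gsh: str, now: float) -> None:
        # Registering again simply extends the entry's lifetime.
        self._expires_at[gsh] = now + self._ttl

    def live_handles(self, now: float) -> List[str]:
        return sorted(g for g, t in self._expires_at.items() if t > now)


registry = Registry(registration_ttl=30.0)
handle_a = "http://mapper.example.org/handles/a"
handle_b = "http://mapper.example.org/handles/b"
registry.register(handle_a, now=0.0)
registry.register(handle_b, now=0.0)
registry.register(handle_a, now=20.0)        # periodic re-registration
visible = registry.live_handles(now=40.0)    # b stopped registering
print(visible)   # ['http://mapper.example.org/handles/a']
```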

Page 37: Grid Computing & Web Services: A Natural Partnership

Authorization

Protocol binding handles authentication during invocation of Grid service operation
  – Gives service URI for authenticated subject
Grid service instance should apply authorization policy on all operations
  – May be site-, service-, instance-, etc., specific
OGSA defines standard interfaces for remote management of access control policy
  – OperationAuthorizationManagement
  – SubjectEquivalency

Page 38: Grid Computing & Web Services: A Natural Partnership

Notification Interfaces

NotificationSource for client subscription
  – One or more notification generators
    > Generates notification message of a specific type
    > Typed interest statements: e.g., filters, topics, …
    > Supports messaging services, 3rd party filter services, …
  – Soft state subscription to a generator
NotificationSink for asynchronous delivery of notification messages
A wide variety of uses are possible
  – E.g., dynamic discovery/registry services, monitoring, application error notification, …
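The source/sink split can be sketched as a topic-based publish/subscribe object whose subscriptions are soft state. This is a simplified illustration: the class, the callable-as-sink convention, and the topic name are assumptions, and delivery here is a direct synchronous call, whereas the slide describes asynchronous delivery.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Tuple

Sink = Callable[[str, str], None]   # receives (topic, message)


class NotificationSource:
    def __init__(self) -> None:
        self._subscriptions: DefaultDict[str, List[Tuple[Sink, float]]] = (
            defaultdict(list)
        )

    def subscribe(self, topic: str, sink: Sink, expires_at: float) -> None:
        # Soft state: the subscription lapses at `expires_at` unless renewed.
        self._subscriptions[topic].append((sink, expires_at))

    def notify(self, topic: str, message: str, now: float) -> int:
        live = [(s, t) for s, t in self._subscriptions[topic] if t > now]
        self._subscriptions[topic] = live   # drop lapsed subscriptions
        for sink, _ in live:
            sink(topic, message)
        return len(live)


received: List[Tuple[str, str]] = []
source = NotificationSource()
source.subscribe("job-status",
                 lambda topic, msg: received.append(("monitor", msg)),
                 expires_at=60.0)
source.subscribe("job-status",
                 lambda topic, msg: received.append(("logger", msg)),
                 expires_at=10.0)

delivered_early = source.notify("job-status", "RUNNING", now=5.0)   # both live
delivered_late = source.notify("job-status", "DONE", now=30.0)      # logger lapsed
```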

Page 39: Grid Computing & Web Services: A Natural Partnership

Use of Web Services (1)

A Grid service interface is a WSDL portType
A Grid service definition is a WSDL extension (serviceType) containing:
  – A set of one or more portTypes supported by the service
  – portType & serviceType compatibility statements, to support upgradability
    > For discovery of compatible services when interfaces are upgraded
  – Implementation version information

Page 40: Grid Computing & Web Services: A Natural Partnership

Use of Web Services (2)

A GSR is a WSDL document with extensions:
  – Extension to service element to reference serviceType
  – Service element extensions to carry the GSH, and the expiration time of the GSR
A GSH is a URL, with the following properties:
  – Globally unique for all time
  – http get on GSH + “.wsdl” returns GSR
  – Can derive GSH to Mapper from it
Registry returns WS-Inspection documents
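The rule that an HTTP GET on GSH + ".wsdl" returns the GSR amounts to a one-line URL derivation. The helper below is a hypothetical convenience that only builds the URL and checks that the handle looks like an http(s) URL; it performs no network access, and the example handle is made up.

```python
from urllib.parse import urlsplit


def gsr_url_for(gsh: str) -> str:
    """Return the URL whose HTTP GET is expected to yield the GSR for `gsh`."""
    parts = urlsplit(gsh)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an http(s) URL: {gsh!r}")
    return gsh + ".wsdl"


url = gsr_url_for("http://mapper.example.org/handles/abc")
print(url)   # http://mapper.example.org/handles/abc.wsdl
```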

Page 41: Grid Computing & Web Services: A Natural Partnership

Using OGSA to Construct Grid Environments

[Figure: three configurations of Grid services.
(a) Simple Hosting Environment. Labels: Factory, Registry, H2R Mapper, Service instances.
(b) Virtual Hosting Environment. Labels: Factory, Registry, H2R Mapper, and Services, together with two groups each labeled F, R, M, and S.
(c) Compound Services. Labels: E2E Factory, E2E Reg, E2E H2R Mapper, E2E S instances, and two groups labeled F1 and F2, each with R, M, and S.]

In each case, Registry handle is effectively the unique name for the virtual organization.

Page 42: Grid Computing & Web Services: A Natural Partnership

OGSA and the Globus Toolkit

Technically, OGSA enables
  – Refactoring of protocols (GRAM, MDS-2, etc.), while preserving all GT concepts/features!
  – Integration with hosting environments: simplifying components, distribution, etc.
  – Greatly expanded standard service set
Pragmatically, we are proceeding as follows
  – Develop open source OGSA implementation
    > Globus Toolkit 3.0; supports Globus Toolkit 2.0 APIs
  – Partnerships for service development
  – Also expect commercial value-adds

Page 43: Grid Computing & Web Services: A Natural Partnership

Globus Toolkit Refactoring

Grid Security Infrastructure (GSI)
  – Used in Grid service network protocol bindings
Meta Directory Service 2 (MDS-2)
  – Native part of each Grid service:
    > Discovery, Registry, RegistryManagement, Notification
Grid Resource Allocation & Mngt (GRAM)
  – Gatekeeper -> Factory for job mgr instances
GridFTP
  – Refactor control channel protocol
Other services refactored to use Grid services

Page 44: Grid Computing & Web Services: A Natural Partnership

Timeline

Summer 2002 – Alpha releases of high-level Grid Services
Late 2002, Early 2003 – Alpha release of new core Grid Services (MDS, GRAM, GridFTP)

Page 45: Grid Computing & Web Services: A Natural Partnership

Migration Paths

Globus Toolkit™ evolutionary in nature
  – Toolkit implementation may change
  – Underlying model of Grid Computing remains the same
  – Capabilities of future Toolkits will be superset of today’s Toolkit
New implementations integrate better with existing commodity technologies
In cases of radical departure from current implementations, migration paths will be provided
  – possibly maintain compatible APIs
  – possibly create gateways to today’s protocols

Page 46: Grid Computing & Web Services: A Natural Partnership

Summary: Evolution of Grid Technologies

Initial exploration (1996-1999; Globus 1.0)
  – Extensive appln experiments; core protocols
Data Grids (1999-??; Globus 2.0+)
  – Large-scale data management and analysis
Open Grid Services Architecture (2001-??, Globus 3.0)
  – Integration w/ Web services, hosting environments, resource virtualization
  – Databases, higher-level services
Radically scalable systems (2003-??)
  – Sensors, wireless, ubiquitous computing

Page 47: Grid Computing & Web Services: A Natural Partnership

Summary

The Grid problem: Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations
Grid architecture: Protocol, service definition for interoperability & resource sharing
Globus Toolkit a source of protocol and API definitions, and reference implementations
  – And many projects applying Grid concepts (& Globus technologies) to important problems
Open Grid Services Architecture represents (we hope!) next step in evolution

Page 48: Grid Computing & Web Services: A Natural Partnership

For More Information

The Globus Project™
  – www.globus.org
Grid architecture
  – www.globus.org/research/papers/anatomy.pdf
Open Grid Services Architecture
  – www.globus.org/research/papers/ogsa.pdf
  – www.globus.org/research/papers/gsspec.pdf