Grid Services: Middleware Infrastructure for use of Distributed Resources


Page 1: Grid Services: Middleware Infrastructure for use of Distributed Resources

National Computational Science

Grid Services: Middleware Infrastructure for use of Distributed Resources

John Towns Principal Investigator, NLANR Distributed Applications Support Team

Division Director, Scientific Computing, NCSA / Univ of Illinois

[email protected]

Page 2: Grid Services: Middleware Infrastructure for use of Distributed Resources

25 September 2002, John Towns <[email protected]>

iGrid 2002


Outline

• Emergence of Distributed Computing
  – Middleware develops
• Establishment of Grid Services
  – What are Grid Services?
  – How do they relate to Web Services?
• Current Middleware Development Projects
  – Grid Services Middleware
  – Toolkits
  – Packaging Efforts
• Deployment/Leverage of Grid Services
  – Infrastructure Projects Deploying Grid Services Infrastructure
  – Projects Leveraging Grid Services for Science and Engineering Research and Development

Page 3: Grid Services: Middleware Infrastructure for use of Distributed Resources


Late 1980s – Early 1990s

• Late 1980s - Metacomputing
  – Focus on Distributed Computation: running applications across several supercomputing resources
• Early 1990s - Gigabit Testbeds
  – Networking research testbeds pushing limits of communication bandwidth to Gigabit/s levels
  – BLANCA, CASA, Aurora and other testbeds in the US
  – Additional such testbeds follow in other countries

Page 4: Grid Services: Middleware Infrastructure for use of Distributed Resources


Communications Libraries

• Data communications libraries developed for wide area networks
  – Some use of pre-existing PVM-type libraries
    • Typically not good for wide area
  – Development of software optimized for larger messages, higher latencies
    • Data Transfer Mechanism (DTM)
    • PVM extensions
  – Plethora of messaging libraries is a problem
    • Some unification in MPI standardization process

Page 5: Grid Services: Middleware Infrastructure for use of Distributed Resources


Early Infrastructure

• Poor network infrastructure
  – Network testbeds were exactly that: finite lifetimes, experimental environment
  – All distributed application runs manually scheduled on networks used
• Poor distributed computing infrastructure
  – Distributed applications were experiments and difficult to schedule time on “production” compute resources
  – All distributed application runs manually scheduled on supercomputing systems used
• Poor software environment
  – Communications libraries were relatively immature
  – Disparity of communications libraries required installation by applications teams on all systems of interest
    • Little support from system admins
• Little support for anything beyond distributed simulations on supercomputers

Page 6: Grid Services: Middleware Infrastructure for use of Distributed Resources


Why “The Grid”?

• New Applications Based on High-speed Coupling of People, Computers, Databases, Instruments, etc.
  – Computer-enhanced Instruments
  – Collaborative Engineering
  – Browsing of Remote Datasets
  – Use of Remote Software
  – Data-intensive Computing
  – Multi-supercomputer Simulation
  – Large-scale Parameter Studies

Source: Ian Foster, ANL

Page 7: Grid Services: Middleware Infrastructure for use of Distributed Resources


The Grid: Blueprint for a New Computing Infrastructure

• Published in 1999
  – Ian Foster, Carl Kesselman (Eds)
  – ISBN 1-55860-475-8, www.mkp.com/grids
• 22 chapters by expert authors including:
  – Andrew Chien,
  – Jack Dongarra,
  – Tom DeFanti,
  – Andrew Grimshaw,
  – Roch Guerin,
  – Ken Kennedy,
  – Paul Messina,
  – Cliff Neuman,
  – Jon Postel,
  – Larry Smarr,
  – Rick Stevens,
  – and many others

“A source book for the history of the future” -- Vint Cerf

Page 8: Grid Services: Middleware Infrastructure for use of Distributed Resources


The Grid

• “Dependable, Consistent, Pervasive Access to [High-end] Resources”
• Dependable:
  – Can Provide Performance and Functionality Guarantees
• Consistent:
  – Uniform Interfaces to a Wide Variety of Resources
• Pervasive:
  – Ability to “Plug In” From Anywhere

Source: Ian Foster, ANL

Page 9: Grid Services: Middleware Infrastructure for use of Distributed Resources


I-WAY

• SC95 and the Information Wide Area Year
  – 17 sites and 10 networks connected using early middleware at SC95
  – 60+ applications, 15+ disciplines
  – Lots of lessons learned

Page 10: Grid Services: Middleware Infrastructure for use of Distributed Resources


Middleware Emerges

• Globus Develops
  – 1994-1996
    • Initial development and experimentation
  – 1997-1999
    • Creation of Initial Globus Toolkit (1998)
    • First Adoption / Deployment Successes
    • Partnerships With NCSA, NASA, others
• UNICORE
  – 1997
    • Development begins in Germany as a national research project
  – 1999-2000
    • Proof of concept prototype released (1999)
    • First successes
• Other related projects in mid-1990s
  – OSF’s Distributed Computing Environment (DCE)
  – Object Management Group's Common Object Request Broker Architecture (CORBA)
  – Microsoft's COM/DCOM
  – Many others…

Page 11: Grid Services: Middleware Infrastructure for use of Distributed Resources


Layered Approach to Building the GRID

[Figure: Alliance Grid Model ("Build the GRID"). Diagram labels: Science Portals & Workbenches; Twenty-First Century Applications; Capability Computing; Computational Services; Performance; Access Services & Technology; Access Grid; Computational Grid; Grid Middleware (resource independent); Grid Fabric (resource dependent); Networking, Devices and Systems]

Page 12: Grid Services: Middleware Infrastructure for use of Distributed Resources


Grid Testbeds

• I-WAY
• NASA’s Information Power Grid
• The Alliance National Technology Grid

Page 13: Grid Services: Middleware Infrastructure for use of Distributed Resources


Grid Applications

• What’s the Grid about?
  – initially, most thought just “parallel MPI jobs”
    • that missed some of the real opportunities
  – but, how does the Grid add maximal value?
• What applications most need Grids?
  – remote instruments and sensors
    • inhospitable environments
    • remote telescopes, environmental monitoring, …
    • equipment and logistics monitoring
  – distributed data archives
    • multi-spectral astronomy (NVO), LIGO, LHC, genomics, …
    • discipline engineering (e.g., earthquake engineering)

Page 14: Grid Services: Middleware Infrastructure for use of Distributed Resources


Teams Require Grid Technologies

[Figure: collaborating sites: WashU, NCSA, Hong Kong, AEI, ZIB, Thessaloniki, Paris]

How Do We:
• Maintain/develop Code?
• Manage Computer Resources?
• Carry Out/monitor Simulation?

Page 15: Grid Services: Middleware Infrastructure for use of Distributed Resources


Regionalization and Customization of NWP

• Distributed data acquisition (NEXRAD radars)
• Distributed dynamic computing
• Distributed decision making and data dissemination
• Intelligent networking and data routing

Link NEXRAD Radars with Nested Simulation
• CONUS Forecasts (20 km resolution)
• Regional (5 km resolution)
• Sub-regional (2 km resolution)
• Local (0.5-1.0 km resolution)

[Figure: NEXRAD Regional Linking and Abilene]

Page 16: Grid Services: Middleware Infrastructure for use of Distributed Resources


Access Grid

STARTAP Links Russia to Six US Sites

Page 17: Grid Services: Middleware Infrastructure for use of Distributed Resources


Some Projects Needing Grids

• NEES: earthquake engineering and simulation
  – integrated simulation, experiment, and collaboration
• LHC: CERN Large Hadron Collider
  – multiple detectors and international teams
  – petabytes of data
• ALMA: millimeter telescope array
  – mountain top remote site
  – remote data analysis and management
• NEON: National Ecological Observatory Network
  – remote sensing and data correlation
• EarthScope
  – USArray and San Andreas Fault Observatory at Depth

Page 18: Grid Services: Middleware Infrastructure for use of Distributed Resources


Grid Services Emerge in Middleware

• Global Grid Forum
  – Formation of Open Grid Services Infrastructure Working Group (OGSI-WG)
• Globus
  – 2000-2002
    • Push Concept of “Grid Services” into Network
    • Development of Application-Specific Toolkits
• UNICORE
  – 2001-2002
    • Development of Grid Services compatibility with Globus

Page 19: Grid Services: Middleware Infrastructure for use of Distributed Resources


What are Grid Services?

• Defined by the Open Grid Services Architecture (OGSA)
  – The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration
    • Draft 2.9, 6/22/2002
    • http://www.gridforum.org/ogsi-wg/drafts/ogsa_draft2.9_2002-06-22.pdf
• Based on the Open Grid Services Infrastructure (OGSI)
  – Grid Service Specification
    • Draft 3, 7/17/2002
    • http://www.gridforum.org/ogsi-wg/drafts/GS_Spec_draft03_2002-07-17.pdf
  – defines the standard interfaces and behaviors of a Grid service
  – builds on a Web Services base
• Open Grid Services Infrastructure Working Group
  – OGSI-WG formed within the Global Grid Forum
  – Refinement of infrastructure-related portions of OGSA
  – OGSA builds on Web services; likely to incorporate specifications defined elsewhere
    • W3C, IETF, OASIS, others…

Page 20: Grid Services: Middleware Infrastructure for use of Distributed Resources


So what are Web Services?

• Web Services
  – self-contained, self-describing, modular “applications” (components) that can be published, located, and typically invoked using standard HTTP port 80
  – new generation of capabilities using HTTP
  – not specific to the Web
  – can perform a variety of functions and can make use of other Web Services
• Web Services consist of
  – Simplest form
    • HTTP and XML
  – Can also (generally do) include any of
    • SOAP, WSDL, WS-I, XML Query, Z39.50, JDBC, Jini, …

Page 21: Grid Services: Middleware Infrastructure for use of Distributed Resources


More on Web Services

• Define a means to:
  – describe software components to be accessed
  – access these components
  – enable the identification of relevant services/service providers
• They are neutral with respect to:
  – programming languages
  – programming models
  – system software
• Web Services provide:
  – uniform and widely accessible interface and access glue over services
  – a veneer for programmatic access to existing services
  – interoperability between middleware solutions

Page 22: Grid Services: Middleware Infrastructure for use of Distributed Resources


OK… so what are Grid Services then?

• Grid Service:
  – a Web service that conforms to a set of conventions (interfaces and behaviors) that define how a client interacts with a Grid service
  – Interfaces address
    • discovery
    • dynamic service creation
    • lifetime management
    • notification
    • manageability
  – Conventions address
    • naming
    • upgradeability
• Effect
  – Provide a useful abstraction of capabilities and a simple means of interaction that is independent of implementation

Page 23: Grid Services: Middleware Infrastructure for use of Distributed Resources


Relevant Web Services Components

• Plenty of web services standards being defined; most relevant for Grid Services are:
  – Simple Object Access Protocol (SOAP)
    • means of messaging between a service (provider) and a service requestor (client)
  – Web Services Description Language (WSDL)
    • an XML document for describing Web services as a set of endpoints operating on messages containing either document-oriented (messaging) or RPC payloads
  – WS-Inspection
    • a simple XML language for locating service descriptions published by a service provider
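
To make the SOAP messaging role concrete, here is a minimal sketch of a requestor wrapping an RPC-style call in a SOAP 1.1 envelope and a provider unpacking it. It uses only the Python standard library. The application namespace `urn:example:grid`, the operation name `submitJob`, and its parameters are hypothetical and do not come from the slides or from any Grid toolkit.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:grid"  # hypothetical application namespace (assumption)


def build_request(operation: str, params: dict) -> bytes:
    """Requestor side: wrap an operation and its parameters in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def parse_request(message: bytes):
    """Provider side: recover the operation name and its parameters."""
    envelope = ET.fromstring(message)
    body = envelope.find(f"{{{SOAP_ENV}}}Body")
    call = list(body)[0]  # first child of Body is the RPC payload
    operation = call.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return operation, params


# Hypothetical operation and parameters, for illustration only.
message = build_request("submitJob", {"executable": "/bin/hostname", "count": 4})
operation, params = parse_request(message)
```

In a real deployment the message would travel over HTTP and the operation signature would be published in a WSDL document; both are left out to keep the sketch self-contained.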

Page 24: Grid Services: Middleware Infrastructure for use of Distributed Resources


Grid Service: Characteristics

• Everything is a service
  – a network enabled entity that provides some capability through the exchange of messages
• Dynamic entities
  – Can be dynamically created/destroyed
  – Can be upgraded dynamically
• Maintain internal state for life of service
• Implement one or more interfaces
  – MUST provide a GridService interface

Page 25: Grid Services: Middleware Infrastructure for use of Distributed Resources


Grid Services are Dynamic

• Can be dynamically created/destroyed
  – Explicitly created/destroyed
  – System failure causing inaccessibility or destruction
  – Each instance assigned a unique global handle to identify it
    • Grid service instance: Grid Service Handle
• Can be upgraded dynamically
    • i.e., to support new protocol versions or to add alternative protocols
  – Must maintain information related to a specific instance during upgrade
  – Provides independence in upgrading Grid services: Grid Service Reference
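
The creation and destruction behavior above can be sketched as a small in-memory factory. This is an illustration under stated assumptions, not the Globus or OGSI API: the class and method names, the `gsh://` handle format, and the use of an expiry time to reclaim instances whose clients have become unreachable are all choices made for this example.

```python
import itertools


class Instance:
    """One service instance with a unique handle and an expiry time."""

    def __init__(self, handle: str, termination_time: float):
        self.handle = handle
        self.termination_time = termination_time


class Factory:
    """Hypothetical factory that creates, renews, and destroys instances."""

    def __init__(self, authority: str):
        self._authority = authority
        self._ids = itertools.count(1)
        self.instances = {}

    def create(self, now: float, lifetime: float) -> Instance:
        # Each new instance gets a handle that is never reused.
        handle = f"gsh://{self._authority}/instance/{next(self._ids)}"
        instance = Instance(handle, now + lifetime)
        self.instances[handle] = instance
        return instance

    def destroy(self, handle: str) -> None:
        # Explicit destruction requested by a client.
        self.instances.pop(handle, None)

    def keep_alive(self, handle: str, now: float, lifetime: float) -> None:
        # A client that is still interested extends the lifetime.
        self.instances[handle].termination_time = now + lifetime

    def sweep(self, now: float) -> list:
        # Reclaim instances nobody renewed, e.g. after a client failure.
        expired = [h for h, i in self.instances.items() if i.termination_time <= now]
        for handle in expired:
            del self.instances[handle]
        return expired


factory = Factory("grid.example.org")  # hypothetical authority name
job_a = factory.create(now=0.0, lifetime=10.0)
job_b = factory.create(now=0.0, lifetime=5.0)
reclaimed = factory.sweep(now=6.0)  # job_b was never renewed
factory.keep_alive(job_a.handle, now=6.0, lifetime=10.0)
```

Times are passed in explicitly so the example is deterministic; a real service would read a clock instead.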

Page 26: Grid Services: Middleware Infrastructure for use of Distributed Resources


Grid Services: State

• Each Grid Service maintains internal state for the life of service
• Grid Service Handle
  – Globally unique name assigned to each Grid service instance
  – Differentiates between different instances of the service
  – Invariant over the lifetime of a service instance
• Grid Service Reference
  – Instance-specific information required to interact with a specific service instance
  – Can change over the lifetime of the service instance
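
The split between an invariant handle and a changeable reference can be illustrated with a small lookup table. The sketch below is an assumption-laden toy: `HandleMap` borrows its name from the interface listed in the slides, but its methods, the `Reference` fields, the handle string, and the endpoints are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    """Instance-specific details a client needs in order to talk to the service."""
    endpoint: str
    protocol: str


class HandleMap:
    """Hypothetical resolver from an invariant handle to the current reference."""

    def __init__(self):
        self._current = {}

    def bind(self, handle: str, reference: Reference) -> None:
        # Called when the instance is created, moved, or upgraded.
        self._current[handle] = reference

    def resolve(self, handle: str) -> Reference:
        return self._current[handle]


handle_map = HandleMap()
handle = "gsh://grid.example.org/instance/42"  # hypothetical handle; never changes

handle_map.bind(handle, Reference("http://hostA.example.org:8080/svc", "soap-1.1"))
before = handle_map.resolve(handle)

# The service is upgraded and moved: the reference changes, the handle does not.
handle_map.bind(handle, Reference("https://hostB.example.org/svc", "soap-1.2"))
after = handle_map.resolve(handle)
```

A client that stores only the handle keeps working across the upgrade, because it resolves a fresh reference whenever the old one stops working.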

Page 27: Grid Services: Middleware Infrastructure for use of Distributed Resources


Grid Services: Interfaces

• Grid Service Interface
  – Set of operations invoked by exchanging a defined sequence of messages
    • Correspond to portTypes in WSDL
• MUST provide a GridService interface
  – Required for all Grid Services
    • Standard WSDL operation
  – Mechanism for obtaining service data
    • Basic information about a Grid service instance (XML representation) including:
      – Grid Service Handle
      – Grid Service Reference
• May provide a Registry Interface
  – Mechanism to support service discovery using a registry service
  – Used to register a Grid service with a registry service
  – GridService interface used by registry service to get information about the Grid service
• Other defined Grid service interfaces
  – NotificationSource, NotificationSink, Factory, HandleMap
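
The relationship between the mandatory GridService interface and an optional registry can be sketched as follows. This is a sketch, not the OGSI portType definitions: the Python method names, the service data keys `handle` and `interfaces`, and the example handles are assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class GridService(ABC):
    """Interface every Grid service provides (operation name is illustrative)."""

    @abstractmethod
    def find_service_data(self, name: str):
        """Return one named item of service data, e.g. 'handle' or 'interfaces'."""


class SimpleService(GridService):
    """Hypothetical service that exposes its service data from a dictionary."""

    def __init__(self, handle: str, interfaces: list):
        self._data = {"handle": handle, "interfaces": ["GridService", *interfaces]}

    def find_service_data(self, name: str):
        return self._data[name]


class Registry(SimpleService):
    """Hypothetical registry: itself a Grid service, plus register and discover."""

    def __init__(self, handle: str):
        super().__init__(handle, ["Registry"])
        self._entries = {}

    def register(self, service: GridService) -> None:
        # The registry learns about a service only through the GridService interface.
        handle = service.find_service_data("handle")
        self._entries[handle] = service.find_service_data("interfaces")

    def discover(self, interface: str) -> list:
        return sorted(h for h, ifaces in self._entries.items() if interface in ifaces)


registry = Registry("gsh://grid.example.org/registry")
registry.register(SimpleService("gsh://grid.example.org/factory/1", ["Factory"]))
registry.register(SimpleService("gsh://grid.example.org/events/7", ["NotificationSource"]))
factories = registry.discover("Factory")
```

Because the registry is itself a Grid service, it can be registered in another registry in exactly the same way.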

Page 28: Grid Services: Middleware Infrastructure for use of Distributed Resources


Accessing Grid Services Example

• Use WSDL to describe
  – multiple protocol bindings
  – encoding styles
  – messaging styles
  – etc.
• Avoid binding-specific interactions
  – Client interface and proxy allow for generalized representation from client application
  – Allows flexibility in using alternate services that support specific bindings
  – Some performance implications
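
One way to read the proxy idea is that application code calls a generic `invoke` and the proxy picks whichever binding both sides support. The sketch below assumes that reading. The binding names, the preference list, and the stand-in transport functions are hypothetical, and no real network traffic is generated.

```python
from typing import Callable, Dict, List

# A binding is modeled as a function that delivers (operation, arguments).
Binding = Callable[[str, dict], str]


def soap_over_http(operation: str, args: dict) -> str:
    # Stand-in for serializing to SOAP and sending it over HTTP.
    return f"soap/http: {operation} {sorted(args.items())}"


def local_call(operation: str, args: dict) -> str:
    # Stand-in for a cheaper binding usable when the service is co-located.
    return f"local: {operation} {sorted(args.items())}"


class ServiceProxy:
    """Hypothetical client proxy that hides the chosen binding from the caller."""

    def __init__(self, advertised: Dict[str, Binding], preferred: List[str]):
        # Pick the first binding the client prefers that the service advertises.
        for name in preferred:
            if name in advertised:
                self.binding_name = name
                self._send = advertised[name]
                break
        else:
            raise LookupError("no mutually supported binding")

    def invoke(self, operation: str, **args) -> str:
        return self._send(operation, args)


preference = ["local", "soap/http"]
remote = ServiceProxy({"soap/http": soap_over_http}, preference)
nearby = ServiceProxy({"soap/http": soap_over_http, "local": local_call}, preference)

remote_result = remote.invoke("status", job=7)
nearby_result = nearby.invoke("status", job=7)
```

The application code is identical in both cases. The extra indirection is one plausible source of the performance cost the slide mentions.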

Page 29: Grid Services: Middleware Infrastructure for use of Distributed Resources


Making it More Interesting

• Composing services
  – A service can be a complex composition of other services
• Create higher level services:
  – Accounting service
  – Workflow service
  – Authentication service
  – Data management service
    • Archive management
    • Data transfer
  – Remote Access service
    • telnet, ssh
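
Composition can be sketched by building a data management service out of authentication, transfer, and accounting services. All class names, method names, users, and URLs below are hypothetical stand-ins chosen for this example; real services would be invoked through their Grid service interfaces instead of direct Python calls.

```python
class AuthenticationService:
    def __init__(self, known_users):
        self._known = set(known_users)

    def authenticate(self, user: str) -> bool:
        return user in self._known


class DataTransferService:
    def transfer(self, source: str, destination: str, size_mb: int) -> dict:
        # Stand-in for a real transfer; returns a receipt describing it.
        return {"source": source, "destination": destination, "size_mb": size_mb}


class AccountingService:
    def __init__(self):
        self.usage = {}

    def charge(self, user: str, amount: int) -> None:
        self.usage[user] = self.usage.get(user, 0) + amount


class DataManagementService:
    """Higher-level service composed from three lower-level services."""

    def __init__(self, auth, transfer, accounting):
        self._auth = auth
        self._transfer = transfer
        self._accounting = accounting

    def replicate(self, user: str, source: str, destination: str, size_mb: int) -> dict:
        if not self._auth.authenticate(user):
            raise PermissionError(f"unknown user: {user}")
        receipt = self._transfer.transfer(source, destination, size_mb)
        self._accounting.charge(user, size_mb)  # bill only after a successful transfer
        return receipt


accounting = AccountingService()
data_service = DataManagementService(
    AuthenticationService(["alice"]), DataTransferService(), accounting
)
receipt = data_service.replicate(
    "alice", "archive://siteA/run1", "archive://siteB/run1", size_mb=250
)
```

The caller sees a single `replicate` operation and does not need to know which lower-level services sit behind it.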

Page 30: Grid Services: Middleware Infrastructure for use of Distributed Resources


Basic Grid Services Examples

• GSI (Grid Security Infrastructure)
  – PKI-based single sign-on
  – mutual authentication
    • users and resources
  – mapping to local user identifiers and accounts
  – data privacy and integrity
• GSI-enabled SSH
  – secure, remote access
• GridFTP
  – secure, reliable, high-performance remote access
  – third-party transfer between storage systems

Page 31: Grid Services: Middleware Infrastructure for use of Distributed Resources


Intermediate Grid Services Examples

• MDS (Metacomputing Directory Service)
  – secure information service
    • distributed access to resource state and status information
• GRAM (Grid Resource Allocation & Management)
  – secure remote access
  – resource allocation and management
• MPICH-G2
  – Grid-enabled Message Passing Interface (MPI)
    • based on the MPICH implementation of MPI
• Distributed accounting
  – distributed access and management of accounting data

Page 32: Grid Services: Middleware Infrastructure for use of Distributed Resources


Advanced Grid Services Examples

• Replica Management Tools
  – secure, distributed management
    • location of replicas of large scientific datasets
• GRAM-2 (GRAM extensions)
  – advance resource reservations
    • networks, storage, and graphics pipelines
    • co-reservation of multiple resources
• CAS (Community Authorization Service)
  – group access control and policy management
• Condor-G (brokering “super scheduler”)
  – single submission point for all resources within a virtual organization
  – co-allocation of multiple resources

Page 33: Grid Services: Middleware Infrastructure for use of Distributed Resources


OGSA

• Builds on:
  – Globus Toolkit
  – Web Services
• New definition of a Grid:
  – “an extensible set of Grid services that may be aggregated in various ways to meet the needs of virtual organizations, which themselves can be defined in part by the services that they operate and share”

Page 34: Grid Services: Middleware Infrastructure for use of Distributed Resources


Grid Services Middleware

• The Globus Project
  – http://www.globus.org/
• The Condor Project
  – http://www.cs.wisc.edu/condor/
• The Legion Project
  – http://legion.virginia.edu/
• UNICORE
  – http://www.unicore.de/

Page 35: Grid Services: Middleware Infrastructure for use of Distributed Resources


Building on Grid Services: Toolkits

• GridLab
  – Grid Application Toolkit
    • APIs for accessing Grid services from, e.g., application codes, portals, data management systems, …
  – http://www.gridlab.org/
• Grid Application Development Software (GrADS) Project
  – http://hipersoft.cs.rice.edu/grads/

Page 36: Grid Services: Middleware Infrastructure for use of Distributed Resources


Integration Efforts

• NSF Middleware Initiative (NMI)
  – NSF integration award
    • http://www.nsf-middleware.org/
  – package and deploy NMI software and documents
    • transparently use and share distributed resources
    • develop effective scientific collaborations
  – GRIDS Center Software Suite
    • http://www.grids-center.org/
• Virtual Data Toolkit
  – From GriPhyN and iVDGL
  – Will include NMI and software from GriPhyN for virtual data
  – http://www.lsc-group.phys.uwm.edu/vdt/
• Grid Starter Kit
  – UK e-Science Grid product
  – Globus, SRB, Condor
  – http://esc.dl.ac.uk/StarterKit/

Page 37: Grid Services: Middleware Infrastructure for use of Distributed Resources


Deploying Grid Services

• http://lhcgrid.web.cern.ch/LHCgrid/
• http://www.nordugrid.org/about.html
• http://server11.infn.it/grid/
• http://www.eurogrid.org/
• http://doesciencegrid.org/
• NASA Information Power Grid: http://www.ipg.nasa.gov/
• http://www.teragrid.org
• http://datatag.web.cern.ch/datatag/
• http://eu-datagrid.web.cern.ch/eu-datagrid/
• http://www.nesc.ac.uk/
• http://grangenet.net

Also see:
• http://www-fp.mcs.anl.gov/~foster/grid-projects/#International Projects and Activities
• http://www.gridcomputing.com/

Page 38: Grid Services: Middleware Infrastructure for use of Distributed Resources


Grid Projects

• Particle Physics Data Grid: http://www.ppdg.net/
• International Virtual Data Grid Laboratory: http://www.ivdgl.org/index.php
• http://www.eu-crossgrid.org/
• http://gridtest.hpcnet.ne.kr/
• http://www.gridlab.org/
• http://www.griphyn.org/index.php
• http://www.ascportal.org/
• http://www.earthsystemgrid.org
• http://www.apgrid.org/
• http://www.gridpp.ac.uk/
• http://www.astrogrid.org
• http://www.apbionet.org/

Page 39: Grid Services: Middleware Infrastructure for use of Distributed Resources


Corporate Buy-In

• IBM and Globus Announce Open Grid Services for Commercial Computing
  – Feb 20, 2002
  – http://www-916.ibm.com/press/prnews.nsf/jan/2C818325D8D4D23585256B660050DF6F
• United Devices Announces Support for the Open Grid Services Architecture
  – May 15, 2002
  – http://www.ud.com/company/press/press_releases/05132002.htm
• Sun Releases Enhanced Grid Computing Services
  – February 15, 2002
  – http://www.internetnews.com/ent-news/article.php/7_975901
• Sun ONE Grid Engine Software
  – http://wwws.sun.com/software/gridware/
• Platform Globus
  – http://www.platform.com/products/globus

Page 40: Grid Services: Middleware Infrastructure for use of Distributed Resources


Forums

• The Global Grid Forum
  – http://www.globalgridforum.org
• The European Grid Forum, EGrid
  – Part of GGF
  – http://www.egrid.org/
• Grid Forum Korea
  – http://www.gridforumkorea.org/

Page 41: Grid Services: Middleware Infrastructure for use of Distributed Resources


References

• The Grid: Blueprint for a New Computing Infrastructure
  – http://www.mkp.com/grids
• The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration
  – Draft 2.9, 6/22/2002
  – http://www.gridforum.org/ogsi-wg/drafts/ogsa_draft2.9_2002-06-22.pdf
• Grid Service Specification
  – Draft 3, 7/17/2002
  – http://www.gridforum.org/ogsi-wg/drafts/GS_Spec_draft03_2002-07-17.pdf
• An Introduction to Web Services and related Technology for building an e-Science Grid
  – UK Grid Engineering Task Force
  – Web and Grid Services Working Group
  – http://esc.dl.ac.uk/WebServices/