1 overview of grid middleware concepts peter kacsuk mta sztaki, hungary univ. westminster, uk...

52
1 Overview of Grid middleware Overview of Grid middleware concepts concepts Peter Kacsuk •MTA SZTAKI, Hungary • Univ. Westminster, UK [email protected]

Post on 21-Dec-2015

220 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

1

Overview of Grid middleware Overview of Grid middleware conceptsconcepts

Peter Kacsuk

•MTA SZTAKI, Hungary

• Univ. Westminster, UK

[email protected]

Page 2: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

2

The goal of this lecture

• To overview the main trends of the fast evolution of Grid systems

• Explaining the main features of the three generation of Grid systems– 1st gen. Grids: Metacomputers– 2nd gen. Grids: Resource-oriented Grids– 3rd gen. Grids: Service-oriented Grids

• To show how these Grid systems can be handled by the users

Page 3: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

3

OGSA/OGSI

Super-computing

Network Computing

Clustercomputing

High-throughputcomputing

High-performancecomputing Web Services

Condor Globus

Client/server

Progress in Grid Systems

OGSA/WSRF Grid Systems

2nd Gen.

3rd Gen.

1st Gen.

Page 4: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

4

1st Generation Grids1st Generation GridsMetacomputersMetacomputers

Page 5: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

5

Original motivation for Original motivation for metacomputingmetacomputing

• Grand challenge problems run weeks and months even on supercomputers and clusters

• Various supercomputers/clusters must be connected by wide area networks in order to solve grand challenge problems in reasonable time

Page 6: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

6

Progress to MetacomputersProgress to Metacomputers

Single processor

2100 2100 2100 2100

2100 2100 2100 2100

Cluster Meta-computer

TFlops

Time

Super-computer

1980 1990 1995

Page 7: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

7

Original meaning of metacomputing

Wide area network

Original goal of metacomputing:Original goal of metacomputing:

• Distributed supercomputing to achieve higher performance than individual supercomputers/clusters can provide

Supercomputing +Metacomputing =

Page 8: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

8

Distributed Supercomputing

• Issues:– Resource discovery,

scheduling– Configuration– Multiple comm methods– Message passing (MPI)– Scalability– Fault tolerance

NCSAOrigin

CaltechExemplar

ArgonneSP

MauiSP

SF-Express Distributed Interactive Simulation (SC’1995)

Page 9: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

9

High-throughput computing (HTC) and the Grid

• Better usage of computing and other resources accessible via wide area network

• To exploit the spare cycles of various computers connected by wide area networks

• Two main representatives– SETI

– Condor

Page 10: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

10

OGSA/OGSI

Super-computing

Network Computing

Clustercomputing

High-throughputcomputing

High-performancecomputing Web Services

Condor Globus

Client/server

Progress in Grid Systems

OGSA/WSRF Grid Systems

1st Gen.

Page 11: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

11

TCP/IP

Resource requirement

ClassAdds

Match-maker

Resource requestor

Resource provider

Publish

(configuration description)

Client program moves to resource(s)

Security is a serious problem!

The Condor model

Page 12: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

12

ClassAdsClassAds

• Resources of the Grid have different properties (architecture, OS, performance, etc.) and these are described as advertisements (ClassAds)

• Creating a job, we can describe our requirements (and preferencies) for these properties.

• Condor tries to match the requirements and the ClassAds to provide the most optimal resources for our jobs.

Page 13: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

13

yourworkstation

personalCondor

Condorjobs

The concept of personal Condor

Page 14: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

14

yourworkstation

personalCondor

Condorjobs

GroupCondor

The concept of Condor pool

Page 15: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

15

Architecture of a Condor poolArchitecture of a Condor pool

Central Manager

Submit Machine

Execution Machines

Page 16: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

16

Components of a Condor poolComponents of a Condor pool

Page 17: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

17

yourworkstation

friendly Condor

personalCondor

Condorjobs

GroupCondor Your schedd daemons

see the CM of the other pool as if it was part of your pool

The concept of Condor flocking

Page 18: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

18

Condor flocking “grids”

Condor pool

Condor pool

Condor pool

Condor pool

Condor pool

Client machine

Resources do not meet the

requirements of the job: forward it to a

friendly pool

Resources do meet the requirements of the job: execute it

submitJob

Job

Page 19: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

19

yourworkstation

friendly Condor

personalCondor

Condorjobs

Grid

PBS LSF

Condor

GroupCondor

glide-ins

The concept of glide-in

Page 20: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

20

Three levels of scalability in CondorThree levels of scalability in Condor

2100 2100 21002100

2100 2100 21002100

Among nodes of a

cluster

Grid

Gliding

Flocking among clusters

2100 2100 21002100

2100 2100 21002100

2100 2100 21002100

2100 2100 21002100

Page 21: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

21

NUG30 - Solved!!!

Numberofworkers

Solved in 7 days instead of 10.9 years

Page 22: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

22

NUG30 Personal Grid …NUG30 Personal Grid …

Managed by one Linux box at Wisconsin

Flocking: -- Condor pool at Wisconsin (500 processors)

-- Condor pool at Georgia Tech (284 Linux boxes)

-- Condor pool at UNM (40 processors)

-- Condor pool at Columbia (16 processors)

-- Condor pool at Northwestern (12 processors)

-- Condor pool at NCSA (65 processors)

-- Condor pool at INFN Italy (54 processors)

Glide-in: -- Origin 2000 (through LSF ) at NCSA. (512 processors)

-- Origin 2000 (through LSF) at Argonne (96 processors)

Page 23: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

23

Problems with Condor flocking “grids”

• Friendly relationships are defined statically.• Firewalls are not allowed between friendly pools.• Client can not choose resources (pools) directly.• Private (non-standard) “Condor protocols” are used

to connect friendly pools together.• Not service-oriented

Page 24: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

24

2nd Generation Grids2nd Generation GridsResource-oriented GridResource-oriented Grid

Page 25: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

25

The main goal of 2nd gen. Grids

• To enable a – geographically distributed community [of thousands] – to perform sophisticated, computationally intensive

analyses – on large set (Petabytes) of data

• To provide – on demand– dynamic resource aggregation – as virtual organizations

Example virtual organizations :– Physics community (EDG, EGEE)– Climate community, etc.

Page 26: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

26

Resource intensive issues include

• Harness data, storage, computing and network resources located in distinct administrative domains

• Respect local and global policies governing what can be used for what

• Schedule resources efficiently, again subject to local and global constraints

• Achieve high performance, with respect to both speed and reliability

Page 27: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

27

Grid Protocols, Services and Tools

• Protocol-based access to resources– Mask local heterogeneities– Negotiate multi-domain security, policy– “Grid-enabled” resources speak Grid protocols– Multiple implementations are possible

• Broad deployment of protocols facilitates creation of services that provide integrated view of distributed resources

• Tools use protocols and services to enable specific classes of applications

Page 28: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

28

The Role of Grid Middleware and Tools

Remotemonitor

Remoteaccess

Informationservices

Datamgmt

. . .Resourcemgmt

CollaborationTools

Data MgmtTools

Distributedsimulation

. . .

net

Credit to Ian Foster

Page 29: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

29

OGSA/OGSI

Super-computing

Network Computing

Clustercomputing

High-throughputcomputing

High-performancecomputing Web Services

Condor Globus

Client/server

Progress in Grid Systems

OGSA/WSRF Grid Systems

2nd Gen.

Page 30: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

30

Solutions by Globus (GT-2)

• Creation of Virtual Organizations (VOs)• Standard protocols are used to connect Globus sites• Security issues are basically solved

– Firewalls are allowed between Grid sites– PKI: CAs and X.509 certificates– SSL for authentication and message protection

• The client does not need account on every Globus site:– Proxies and delegation for secure single Sign-on

• Still: – provides metacomputing facilities (MPICH-G2)– Not service-oriented either

Page 31: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

31

GRAM API

MDS-2 API

Resource description

MDS-2)

Resource requestor

Resource provider

Publish

(configuration description)

Client program moves to resource(s)

Security is a serious problem!

The Globus-2 model

Page 32: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

32

The Role of the Globus ToolkitThe Role of the Globus Toolkit

• A collection of solutions to problems that come up frequently when building collaborative distributed applications

• Heterogeneity

– A focus, in particular, on overcoming heterogeneity for application developers

• Standards

– We capitalize on and encourage use of existing standards (IETF, W3C, OASIS, GGF)

– GT also includes reference implementations of new/proposed standards in these organizations

Page 33: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

33

Without the Globus ToolkitWithout the Globus Toolkit

WebBrowser

ComputeServer

DataCatalog

DataViewer

Tool

Certificateauthority

ChatTool

CredentialRepository

WebPortal

ComputeServer

Resources implement standard access & management interfaces

Collective services aggregate &/or

virtualize resources

Users work with client applications

Application services organize VOs & enable

access to other services

Databaseservice

Databaseservice

Databaseservice

SimulationTool

Camera

Camera

TelepresenceMonitor

RegistrationService

A

B

C

D

E

Application Developer

10

Off the Shelf 12

Globus Toolkit

0

Grid Community

0

Page 34: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

34

A possibility with the Globus ToolkitA possibility with the Globus Toolkit

WebBrowser

ComputeServer

GlobusMCS/RLS

DataViewer

Tool

CertificateAuthority

CHEF ChatTeamlet

MyProxy

CHEF

ComputeServer

Resources implement standard access & management interfaces

Collective services aggregate &/or

virtualize resources

Users work with client applications

Application services organize VOs & enable

access to other services

Databaseservice

Databaseservice

Databaseservice

SimulationTool

Camera

Camera

TelepresenceMonitor

Globus IndexService

GlobusGRAM

GlobusGRAM

GlobusDAI

GlobusDAI

GlobusDAI

Application Developer

2

Off the Shelf 9

Globus Toolkit

8

Grid Community

3

Page 35: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

35

Data MgmtSecurityCommonRuntime

Execution Mgmt

Info Services

AuthenticationAuthorization

(GSI)GridFTP

C CommonLibraries

Globus Toolkit version 2 (Globus Toolkit version 2 (GT2GT2))

Grid ResourceAlloc. Mgmt

(GRAM)

Monitoring& Discovery

(MDS)

User applications&

Higher level services

Page 36: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

36

Globus ComponentsGlobus Components

Grid SecurityInfrastructure

Job Manager

GRAM client API calls to request resource allocation

and process creation.

MDS client API callsto locate resources

Query current statusof resource

Create

RSL Library

Parse

RequestAllocate &

create processes

Process

Process

Process

Monitor &control

Site boundary

Client MDS: Grid Index Info Server

Gatekeeper

MDS: Grid Resource Info Server

Local Resource Manager

MDS client API callsto get resource info

GRAM client API statechange callbacks

Page 37: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

37

Example 1 for a GT2 Grid: TeraGrid

HPSS

HPSS

574p IA-32 Chiba City

128p Origin

HR Display & VR Facilities

MyrinetMyrinet MyrinetMyrinet

1176p IBM SPBlue Horizon

Sun E10K1500p Origin

UniTree

1024p IA-32 320p IA-64

HPSS

256p HP X-Class

128p HP V2500

92p IA-32

NCSA: Compute-Intensive

ANL: Visualization

Caltech: Data collection and analysis applications

SDSC: Data-orientedcomputing

Credit to Fran Berman

Page 38: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

38

TeraGrid Common Infrastructure Environment

• Linux Operating Environment• Basic and Core Globus Services

– GSI (Grid Security Infrastructure)

– GSI-enabled SSH and GSIFTP

– GRAM (Grid Resource Allocation & Management)

– GridFTP– Information Service– Distributed accounting– MPICH-G2– Science Portals

• Advanced and Data Services– Replica

Management Tools

– GRAM-2 (GRAM extensions)

– Condor-G (as brokering “super scheduler”)

– SDSC SRB (Storage Resource Broker)

Credit to Fran Berman

Page 39: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

39

Example 2 for a GT2 Grid: LHC Grid and LCG-2

• LHC Grid– A homogeneous Grid developed by CERN– Restrictive policies (global policies overrule local policies) – A dedicated Grid to the Large Hydron Collider experiments

• LCG-2– A homogeneous Grid developed by CERN and the EDG

and EGEE projects– Restrictive policies (global policies overrule local policies) – A non-dedicated Grid– Works 24 hours/day and has been used in EGEE and EGEE-

related Grids (SEEGRID, BalticGrid, etc.)

Page 40: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

40

Main Logical Machine Types (Services) Main Logical Machine Types (Services) in LCG-2in LCG-2

• User Interface (UI)

• Information Service (IS)

• Computing Element (CE)– Frontend Node– Worker Nodes (WN)

• Storage Element (SE)

• Replica Catalog (RC,RLS)

• Resource Broker (RB)

Page 41: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

41

The LCG-2 ArchitectureThe LCG-2 Architecture

Collective ServicesCollective Services

Information & MonitoringInformation

& MonitoringReplica

ManagerReplica

ManagerGrid

SchedulerGrid

Scheduler

Local ApplicationLocal Application Local DatabaseLocal Database

Underlying Grid ServicesUnderlying Grid Services

Computing Element Services

Computing Element Services

Authorization Authentication & Accounting

Authorization Authentication & Accounting

Replica CatalogReplica Catalog

Storage Element Services

Storage Element Services

Database Services

Database Services

Fabric servicesFabric services

ConfigurationManagement

ConfigurationManagement

Node Installation &Management

Node Installation &Management

Monitoringand Fault Tolerance

Monitoringand Fault Tolerance

Resource Management

Resource Management

Fabric StorageManagement

Fabric StorageManagement

Grid

Fabric

Local Computing

Grid Grid Application LayerGrid Application Layer

Data Management

Data Management

Job Management

Job Management

Metadata Management

Metadata Management

Logging & Book-

keeping

Logging & Book-

keeping

Page 42: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

42

3rd Generation Grids3rd Generation Grids

Service-oriented GridsService-oriented GridsOGSAOGSA

(Open Grid Service Architecture)(Open Grid Service Architecture)andand

WSRFWSRF(Web Services Resource Framework)(Web Services Resource Framework)

Page 43: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

43

OGSA/OGSI

Super-computing

Network Computing

Clustercomputing

High-throughputcomputing

High-performancecomputing Web Services

Condor Globus

Client/server

Progress in Grid Systems

OGSA/WSRF Grid Systems

3rd Gen.

Page 44: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

44

Bind

Find (WSDL, UDDI)

Service description

Service registry

Service requestor

Service provider

Publish (WSDL, UDDI)

(SOAP)

Predefined programs (services) wait for invocation

Much more secure than the GT-2 concept

The Web Services model

UDDI provider

Page 45: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

45

Grid and Web Services: Convergence

Grid

Web

However, despite enthusiasm for OGSI, adoption within Web community turned out to be problematic

Started far apart in apps & tech

OGSI/GT3

GT2

GT1

HTTPWSDL,

WS-*

WSDL 2,

WSDM

Have beenconverging ?

Page 46: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

46

Concerns

• Too much stuff in one specification

• Does not work well with existing Web services tooling

• Too object oriented

Page 47: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

47

Grid and Web Services: Convergence

Grid

Web

The definition of WSRF means that Grid and Web communities can move forward on a common base

WSRF

Started far apart in apps & tech

OGSI

GT2

GT1

HTTPWSDL,

WS-*

WSDL 2,

WSDM

Have beenconverging

Page 48: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

48

Layered diagram of Layered diagram of OGSA, GT4, WSRF, and Web ServicesOGSA, GT4, WSRF, and Web Services

Page 49: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

49

Relationship between Relationship between OGSA, GT4, WSRF, and Web ServicesOGSA, GT4, WSRF, and Web Services

Page 50: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

50

Towards GT4 production GridsTowards GT4 production Grids

Stable highly-available GT2 production Grid Extension with GT4 site and services by UoW

Westminster• UoW (Univ of Westminster)

Core members:• Manchester• CCLRC RAL

• Oxford• Leeds

• CSAR• HPCx

Partner sites• Bristol• Cardiff• Lancaster

Data clusters

Compute clusters

National HPC

services

Page 51: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

51

Workload ManagementData Management

SecurityInformation & Monitoring

Access

gLite Grid Middleware ServicesgLite Grid Middleware Services

API

ComputingElement

WorkloadManagement

MetadataCatalog

StorageElement

DataMovement

File & ReplicaCatalog

Authorization

Authentication

Information &Monitoring

Application

MonitoringAuditing

JobProvenance

PackageManager

CLI

Accounting

Site Proxy

Overview paper http://doc.cern.ch//archive/electronic/egee/tr/egee-tr-2006-001.pdf

Page 52: 1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK kacsuk@sztaki.hu

52

Conclusions

• Fast evolution of Grid systems and middleware:– GT1, GT2, OGSA, OGSI, GT3, WSRF, GT4, …

• Current production scientific Grid systems are built based on 1st and 2nd gen. Grid technologies

• Enterprise Grid systems are emerging based on the new OGSA and WSRF concepts