Grid Networks

Upload: kai-markham

Post on 14-Dec-2015


What Are Grids?

Cluster of clusters – geographically distributed and connected with high-speed MAN and WAN links.

Made up of tens to thousands of small commodity servers interconnected with scalable, high-performance Ethernet networks.

Typical Grid Computing Model

http://www.doc.ic.ac.uk/~sjn5/INDOUK/TYM-GCG.pdf

Why Grids?

A biochemist exploits 10,000 computers to screen 100,000 compounds in an hour

1,000 physicists worldwide pool resources for petaop analyses of petabytes of data

Civil engineers collaborate to design, execute, & analyze shake table experiments

Climate scientists visualize, annotate, & analyze terabyte simulation datasets

An emergency response team couples real time data, weather model, population data

http://www.nesc.ac.uk/talks/talks/BiGUM1.pdf

Why Grids? (contd.)

A multidisciplinary analysis in aerospace couples code and data in four companies

A home user invokes architectural design functions at an application service provider

An application service provider purchases cycles from compute cycle providers

Scientists working for a multinational soap company design a new product

A community group pools members’ PCs to analyze alternative designs for a local road

http://www.nesc.ac.uk/talks/talks/BiGUM1.pdf

The Grid Problem

Flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources. From “The Anatomy of the Grid: Enabling Scalable Virtual Organizations”

Enable communities (“virtual organizations”) to share geographically distributed resources as they pursue common goals, assuming the absence of:

Central location, central control, omniscience, existing trust relationships.

http://www.nesc.ac.uk/talks/talks/BiGUM1.pdf

Why Now?

Moore’s law improvements in computing produce highly functional end-systems

The Internet and burgeoning wired and wireless networks provide universal connectivity

Changing modes of working and problem solving emphasize teamwork, computation

Network exponentials produce dramatic changes in geometry and geography

http://www.nesc.ac.uk/talks/talks/BiGUM1.pdf

Network Exponentials

Network vs. computer performance

Computer speed doubles every 18 months

Network speed doubles every 9 months

1986 to 2000: Computers x 500, Networks x 340,000

2001 to 2010: Computers x 60, Networks x 4,000
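As a rough check on the doubling rates, the sketch below computes the growth multiple implied by a fixed doubling period. The function name `growth_factor` and the reading of 2001 to 2010 as nine years are assumptions made for illustration; the slide's own figures are rounded.

```python
def growth_factor(months: float, doubling_period_months: float) -> float:
    """Growth multiple after `months` if capacity doubles every `doubling_period_months`."""
    return 2 ** (months / doubling_period_months)


# Assumption: 2001 to 2010 is treated as 9 years (108 months).
months = 9 * 12
computers = growth_factor(months, 18)  # computer speed doubles every 18 months
networks = growth_factor(months, 9)    # network speed doubles every 9 months

print(f"Computers: x {computers:.0f}")
print(f"Networks:  x {networks:.0f}")
```

Under that assumption the formula gives x 64 and x 4096, in line with the rounded x 60 and x 4,000 above.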

http://www.nesc.ac.uk/talks/talks/BiGUM1.pdf

Moore’s Law vs. storage improvements vs. optical improvements. Graph from Scientific American (Jan. 2001) by Cleo Vilett, source Vinod Khosla, Kleiner, Caufield and Perkins.

Broader Context

“Grid Computing” has much in common with major industrial thrusts: business-to-business, peer-to-peer, Application Service Providers, Storage Service Providers, distributed computing, Internet computing…

Sharing issues are not adequately addressed by existing technologies

Complicated requirements: “run program X at site Y subject to community policy P, providing access to data at Z according to policy Q”

High performance: unique demands of advanced & high-performance systems
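The “run program X at site Y subject to community policy P, providing access to data at Z according to policy Q” requirement can be read as a request that must pass several independent policies. The sketch below is one minimal way to express that; every name in it (`Request`, `community_policy_p`, `data_policy_q`, `authorize`, and the sample users and sites) is hypothetical and not taken from any Grid toolkit.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Request:
    user: str
    program: str    # "program X"
    site: str       # "site Y"
    data_host: str  # "data at Z"


Policy = Callable[[Request], bool]


def community_policy_p(req: Request) -> bool:
    # Hypothetical policy P: only approved programs may run at a given site.
    approved = {("site-y", "program-x")}
    return (req.site, req.program) in approved


def data_policy_q(req: Request) -> bool:
    # Hypothetical policy Q: only listed users may read data held at Z.
    readers = {"data-z": {"alice"}}
    return req.user in readers.get(req.data_host, set())


def authorize(req: Request, policies: Sequence[Policy]) -> bool:
    """Grant the request only if every policy allows it."""
    return all(policy(req) for policy in policies)


policies = [community_policy_p, data_policy_q]
ok = authorize(Request("alice", "program-x", "site-y", "data-z"), policies)
denied = authorize(Request("bob", "program-x", "site-y", "data-z"), policies)
print(ok, denied)  # True False
```

Keeping each policy as a separate function mirrors the point of the slide: the site's policy and the data owner's policy are set by different parties and have to be combined at request time.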

http://www.nesc.ac.uk/talks/talks/BiGUM1.pdf

The Globus Project™

Close collaboration with real Grid projects in science and industry

Development and promotion of standard Grid protocols to enable interoperability and shared infrastructure

Development and promotion of standard Grid software APIs and SDKs to enable portability and code sharing

The Globus Toolkit™: open-source reference software base for building Grid infrastructure and applications

Global Grid Forum: Development of standard protocols and APIs for Grid computing

http://www.nesc.ac.uk/talks/talks/BiGUM1.pdf

Basic Grid Architecture

Layered Grid Architecture

http://www.nesc.ac.uk/talks/talks/BiGUM1.pdf

Collective (“Coordinating multiple resources”): ubiquitous infrastructure services, app-specific distributed services

Resource (“Sharing single resources”): negotiating access, controlling use

Connectivity (“Talking to things”): communication (Internet protocols) & security

Fabric (“Controlling things locally”): access to, & control of, resources

The Single System Model

[Diagram: a user interface/API over resource discovery (RD), process management (PM), authentication, authorization, and accounting (3A), message passing (MP), and data management (DM), all on an operating system with storage and compute.]

http://www.ncbiogrid.org/resources/slides/grid-overview.ppt

What Makes a Cluster?

Uses a Distributed Resource Manager (DRM) to manage job scheduling

Tightly coupled - high-speed, low-latency interconnect network

Shared storage for home directories, high-throughput scratch space, applications

Fairly homogeneous - configuration management is important!

Single administrative domain

User accounts managed with traditional mechanisms

http://www.ncbiogrid.org/resources/slides/grid-overview.ppt

The Cluster Model

[Diagram: four cluster nodes and a master node joined by a high-speed interconnect. Each cluster node runs the RD, PM, 3A, DM, and MP services over its operating system, storage, and compute, fronted by the cluster DRM. The master node adds the user interface/API; shared storage and configuration management serve the whole cluster.]

http://www.ncbiogrid.org/resources/slides/grid-overview.ppt

How is an Enterprise Grid Different from a Cluster?

Heterogeneous - clusters, SMP, even workstations of dissimilar configurations, but all are tied together through a grid middleware layer

Lightly coupled - connected via 100 or 1000 Mbps Ethernet

Introduces a resource registry and grid security service

But usually only a single registry and security service for the grid

Not necessarily a single administrative domain
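To make the idea of a single resource registry concrete, here is a minimal in-memory sketch in which heterogeneous resources register themselves and a client queries by requirement. The classes `Resource` and `ResourceRegistry`, their fields, and the sample entries are all hypothetical; a real enterprise grid would back this with a directory service rather than a Python dictionary.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Resource:
    name: str
    kind: str  # "cluster", "smp", or "workstation"
    cpus: int
    os: str


class ResourceRegistry:
    """Stands in for the one registry service an enterprise grid usually has."""

    def __init__(self) -> None:
        self._resources: Dict[str, Resource] = {}

    def register(self, resource: Resource) -> None:
        self._resources[resource.name] = resource

    def find(self, min_cpus: int = 1, os: Optional[str] = None) -> List[Resource]:
        """Return matching resources, sorted by name for stable output."""
        matches = [
            r for r in self._resources.values()
            if r.cpus >= min_cpus and (os is None or r.os == os)
        ]
        return sorted(matches, key=lambda r: r.name)


registry = ResourceRegistry()
registry.register(Resource("cluster-a", "cluster", 64, "linux"))
registry.register(Resource("smp-b", "smp", 16, "solaris"))
registry.register(Resource("desk-c", "workstation", 2, "linux"))

for r in registry.find(min_cpus=8):
    print(r.name, r.kind, r.cpus)
```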

http://www.ncbiogrid.org/resources/slides/grid-overview.ppt

The Enterprise Grid Model

[Diagram: clusters (nodes with a cluster interface behind a cluster DRM) and SMP systems, each fronted by a grid interface and connected over an enterprise LAN or WAN to a single resource registry and security infrastructure. Users enter through a user interface/API with its own grid interface.]

http://www.ncbiogrid.org/resources/slides/grid-overview.ppt

How is a Global Grid Different from an Enterprise Grid?

"Grid of Grids" - collection of enterprise grids

Loosely coupled between sites

Mutually distrustful administrative domains

Multiple grid resource registries and grid security services

http://www.ncbiogrid.org/resources/slides/grid-overview.ppt

The Global Grid Model

[Diagram: three sites (A, B, and C) joined over a WAN. Each site has clusters and SMP systems behind grid interfaces on a local LAN, its own resource registry (RR) and security infrastructure (SI), and UI/API entry points.]

http://www.ncbiogrid.org/resources/slides/grid-overview.ppt

Grid Platforms

Examples: Globus

Grid Platform Example: Globus Toolkit V2

Primary development occurred at Argonne National Laboratory

Principals were Ian Foster and Carl Kesselman

Open source, but architecture development was a closed process

Toolkit approach: different “bundles” that can be installed depending upon what functions are desired

API through CoG (Commodity Grid) kits: Java, Python, CORBA, Perl, Matlab, Web services, JSP (JavaServer Pages)

http://www.ncbiogrid.org/resources/slides/grid-overview.ppt

Globus Toolkit V2

Majority of its use is in university and government research environments

Some vendors offer value-added versions: IBM Grid Toolbox, Platform Globus

NSF Middleware Initiative (NMI) is packaging pre-built Globus with other relevant components: NWS (Network Weather Service), KX.509/KCA (Kerberos-X.509 integration), Condor-G as a “metascheduler”, GSI-enabled OpenSSH

http://www.ncbiogrid.org/resources/slides/grid-overview.ppt

* GSI: Grid Security Infrastructure

Globus Toolkit V2 “Pillars”

[Diagram: three pillars, Information Services (MDS), Data Management (GASS), and Resource Management (GRAM), resting on the Grid Security Infrastructure (GSI).]

http://www.ncbiogrid.org/resources/slides/grid-overview.ppt

Globus Toolkit V2 Stack

[Diagram: protocol stack, top to bottom: MDS, GASS/GridFTP, and GRAM; GSI; HTTP, LDAP, and FTP; TLS/SSL; TCP/IP.]

http://www.ncbiogrid.org/resources/slides/grid-overview.ppt

Globus Toolkit V2 Key Components

Grid Resource Allocation Manager (GRAM):

Server-side: “gatekeeper” process that controls execution of job managers

Client-side: “globusrun” UI to launch jobs

Monitoring and Discovery Service (MDS):

GRIS: Grid Resource Information Service collects local info

GIIS: Grid Index Information Service collects GRIS info

Global Access to Secondary Storage (GASS):

GridFTP, implemented through “in.ftpd” daemon and “globus-url-copy” command

Files accessed through a URI, e.g. gsiftp://node1.ncbiogrid.org/data/ncbi/ecoli.nt
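A gsiftp URI has the usual scheme, host, and path parts, so it can be taken apart with a standard URL parser. The sketch below does that for the example URI and then builds, without running it, a `globus-url-copy` command line in the source-URL-then-destination-URL form. The helper names `parse_grid_uri` and `url_copy_command` and the local destination `/tmp/ecoli.nt` are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def parse_grid_uri(uri: str) -> dict:
    """Split a gsiftp URI into scheme, host, and path."""
    parts = urlsplit(uri)
    if parts.scheme != "gsiftp":
        raise ValueError(f"expected a gsiftp URI, got scheme {parts.scheme!r}")
    return {"scheme": parts.scheme, "host": parts.hostname, "path": parts.path}


def url_copy_command(source_uri: str, local_path: str) -> list:
    """Build (but do not run) a globus-url-copy argument list."""
    parse_grid_uri(source_uri)  # raises ValueError on a non-gsiftp source
    return ["globus-url-copy", source_uri, f"file://{local_path}"]


uri = "gsiftp://node1.ncbiogrid.org/data/ncbi/ecoli.nt"
info = parse_grid_uri(uri)
print(info["host"], info["path"])

# Hypothetical local destination path.
command = url_copy_command(uri, "/tmp/ecoli.nt")
print(" ".join(command))
```

Running the command itself would need a Globus installation and a valid proxy certificate, which is why the sketch stops at building the argument list.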

http://www.ncbiogrid.org/resources/slides/grid-overview.ppt

Globus Toolkit V2 Key Components: GSI

Uses a TLS/SSL-based PKI

All server resources (e.g. gatekeeper, GRIS) and users have a public key that has been digitally signed by the CA (the “certificate”) and a private key

“grid-cert-request” is used to generate a key pair

User/sysadmin sends the public key to the CA

CA signs the public key with its private key and returns the signed certificate to the user/sysadmin

The user/sysadmin stores the signed certificate in the local filesystem

Certificate contains: the subject name, the subject’s public key, the CA’s name, and the CA’s signature
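The signing steps above can be illustrated with a toy model. The sketch below uses textbook RSA with small, well-known Mersenne primes and no padding, so it is insecure and is not how GSI or X.509 certificates are actually encoded; it only shows a CA signing a record with the four fields listed above and a verifier checking that signature with the CA's public key. All names (`make_keypair`, `ca_sign`, `verify`, the subject strings) are hypothetical.

```python
import hashlib
from dataclasses import dataclass, replace


def make_keypair(p: int, q: int, e: int = 65537):
    """Textbook RSA from two primes: returns (public, private) as (n, e) and (n, d)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))  # modular inverse, Python 3.8+
    return (n, e), (n, d)


@dataclass(frozen=True)
class Certificate:
    subject: str       # the subject name
    public_key: tuple  # the subject's public key (n, e)
    issuer: str        # the CA's name
    signature: int     # the CA's signature over the three fields above


def _digest(subject: str, public_key: tuple, issuer: str, modulus: int) -> int:
    data = f"{subject}|{public_key}|{issuer}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % modulus


def ca_sign(subject: str, public_key: tuple, ca_name: str, ca_private: tuple) -> Certificate:
    """The CA signs the subject's public key (and names) with its private key."""
    n, d = ca_private
    signature = pow(_digest(subject, public_key, ca_name, n), d, n)
    return Certificate(subject, public_key, ca_name, signature)


def verify(cert: Certificate, ca_public: tuple) -> bool:
    """Anyone holding the CA's public key can check the signature."""
    n, e = ca_public
    expected = _digest(cert.subject, cert.public_key, cert.issuer, n)
    return pow(cert.signature, e, n) == expected


# Toy keys built from Mersenne primes; far too small for real use.
ca_public, ca_private = make_keypair(2**61 - 1, 2**89 - 1)
user_public, user_private = make_keypair(2**31 - 1, 2**107 - 1)

cert = ca_sign("/O=Grid/CN=Example User", user_public, "/O=Grid/CN=Example CA", ca_private)
print(verify(cert, ca_public))    # True

forged = replace(cert, subject="/O=Grid/CN=Someone Else")
print(verify(forged, ca_public))  # False
```

Changing any signed field invalidates the signature, which is what lets a remote server trust the subject name without contacting the CA.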

http://www.ncbiogrid.org/resources/slides/grid-overview.ppt

Globus Toolkit V2 Key Components: GSI

Logging in to the grid (“grid-proxy-init”):

User creates a temporary public-private key pair

User’s private key is used to digitally sign the temporary public key -- this becomes the “proxy” certificate

This creates a chain of trust from the CA to the user to the proxy certificate

The proxy certificate and associated private key are transmitted with a job

The proxy certificate can be used to issue commands on remote servers on the user’s behalf (“delegation”)

On remote servers, there is a “grid-mapfile” that maps user cert subject names to local userids
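The last step, mapping a certificate subject name to a local userid, amounts to a table lookup. The sketch below parses text in the grid-mapfile style (a quoted subject name followed by a local userid on each line) and looks up a user. The sample entries and the function names `parse_grid_mapfile` and `local_user_for` are hypothetical.

```python
import shlex
from typing import Dict, Optional

# Hypothetical entries: quoted certificate subject name, then local userid.
SAMPLE_GRID_MAPFILE = '''\
# subject name                          local userid
"/O=Grid/OU=Example/CN=Alice Example" alice
"/O=Grid/OU=Example/CN=Bob Example" bob
'''


def parse_grid_mapfile(text: str) -> Dict[str, str]:
    """Return a dict from certificate subject name to local userid."""
    mapping: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        subject, userid = shlex.split(line)[:2]
        mapping[subject] = userid
    return mapping


def local_user_for(subject: str, mapping: Dict[str, str]) -> Optional[str]:
    """Look up the local account for a subject; None means access is not mapped."""
    return mapping.get(subject)


mapping = parse_grid_mapfile(SAMPLE_GRID_MAPFILE)
print(local_user_for("/O=Grid/OU=Example/CN=Alice Example", mapping))  # alice
print(local_user_for("/O=Grid/OU=Example/CN=Unknown User", mapping))   # None
```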

http://www.ncbiogrid.org/resources/slides/grid-overview.ppt

Globus Toolkit V2 Additional Components

Grid Packaging Tools (GPT): used to build (“gpt-build”), install (“gpt-install”) and localize (“gpt-postinstall”) Globus components

MPICH-G2: a Globus V2 enabled version of MPI (Message Passing Interface), based on MPICH; utilizes GSI, MDS and GRAM

http://www.ncbiogrid.org/resources/slides/grid-overview.ppt

Globus Toolkit V2 Network Services

[Diagram: a certificate authority, a GIIS server, a client node running the GRAM client, and four grid nodes, each running GRIS, the gatekeeper, and in.ftpd, all attached to a common network.]

http://www.ncbiogrid.org/resources/slides/grid-overview.ppt

GRAM, MDS and GASS Interactions

[Diagram: the client (user/proxy) talks to three services. GRAM: the gatekeeper starts a job manager, which manages processes on the resource (job allocation and job management, via RSL/DUROC/HTTP 1.1). MDS: GRIS reports on the resource and feeds GIIS (resource discovery, via LDAP). GASS: GridFTP through in.ftpd (data transfer and data control, via gsiftp).]

http://www.ncbiogrid.org/resources/slides/grid-overview.ppt

Globus Toolkit V2 Strengths and Weaknesses

Strengths:

Mindshare and collaboration in both industry & academia

Open source

Standards-based underpinnings (e.g. SSL, LDAP)

Flexibility and CoG APIs

Driving OGSA with heavy resource commitment from IBM

Weaknesses:

Significant effort required to get applications working on a grid

Not production quality at this time

No “metascheduler” -- user has to explicitly tell their jobs where to run
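To show what the missing metascheduler would do, the sketch below picks a site for a job from a snapshot of resource information, so the user does not have to name one. It is a hypothetical illustration, not part of Globus Toolkit V2: `SiteInfo`, `choose_site`, the contact strings, and the selection rule (fewest queued jobs, then most free CPUs) are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class SiteInfo:
    contact: str      # hypothetical gatekeeper contact string
    free_cpus: int
    queued_jobs: int


def choose_site(sites: Sequence[SiteInfo], cpus_needed: int) -> Optional[SiteInfo]:
    """Pick a site with enough free CPUs, preferring short queues, then more free CPUs."""
    candidates = [s for s in sites if s.free_cpus >= cpus_needed]
    if not candidates:
        return None
    return min(candidates, key=lambda s: (s.queued_jobs, -s.free_cpus))


# Hypothetical snapshot, of the kind an index service could aggregate.
snapshot = [
    SiteInfo("gatekeeper.site-a.example", free_cpus=4, queued_jobs=0),
    SiteInfo("gatekeeper.site-b.example", free_cpus=32, queued_jobs=5),
    SiteInfo("gatekeeper.site-c.example", free_cpus=16, queued_jobs=1),
]

chosen = choose_site(snapshot, cpus_needed=8)
print(chosen.contact if chosen else "no site can run this job")
```

With this snapshot the job goes to site C: site A is too small, and site C has a shorter queue than site B.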

http://www.ncbiogrid.org/resources/slides/grid-overview.ppt

References

Dr. Carl Kesselman, “Grid Computing,” Information Sciences Institute, University of Southern California. Joint work with Ian Foster, ANL and U Chicago

Bryan Carpenter, Geoffrey Fox, and Marlon Pierce, “e-Science e-Business e-Government and their Technologies Introduction,” Pervasive Technology Laboratories, Indiana University, http://www.grid2004.org/spring2004

Fran Berman and Anthony J.G. Hey, “Grid Computing: Making The Global Infrastructure a Reality,” Wiley, ISBN: 0-470-85319-0, March 2003

“High-Performance Computing with Scalable Server Cluster and Grid Networks,” FORCE10, http://www.force10networks.com/applications/pdf/ClusterGridapV1_0.pdf

Ian Foster, et al., “The Anatomy of the Grid,” http://www.globus.org/research/papers/anatomy.pdf

Ian Foster, et al., “Computational Grid,” http://www-fp.globus.org/research/papers/chapter2.pdf

“Grid Networks,” ITU, http://www.itu.int/osg/spu/newslog/categories/gridNetworks/

Steve Tuecke, “National eScience Core Programme & Grid Highlights,” http://www.nesc.ac.uk/talks/talks/BiGUM1.pdf

J. Charles Kesler, “Grid Overview,” http://www.ncbiogrid.org/resources/slides/grid-overview.ppt