Grid Networks
What Are Grids?
Cluster of clusters – geographically distributed and connected with high-speed MAN and WAN links.
Made up of tens to thousands of small commodity servers interconnected with scalable, high-performance Ethernet networks.
Why Grids?
A biochemist exploits 10,000 computers to screen 100,000 compounds in an hour
1,000 physicists worldwide pool resources for petaop analyses of petabytes of data
Civil engineers collaborate to design, execute, & analyze shake table experiments
Climate scientists visualize, annotate, & analyze terabyte simulation datasets
An emergency response team couples real time data, weather model, population data
http://www.nesc.ac.uk/talks/talks/BiGUM1.pdf
Why Grids? (contd.)
A multidisciplinary analysis in aerospace couples code and data in four companies
A home user invokes architectural design functions at an application service provider
An application service provider purchases cycles from compute cycle providers
Scientists working for a multinational soap company design a new product
A community group pools members’ PCs to analyze alternative designs for a local road
The Grid Problem
Flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources
From “The Anatomy of the Grid: Enabling Scalable Virtual Organizations”
Enable communities (“virtual organizations”) to share geographically distributed resources as they pursue common goals, assuming the absence of:
Central location, central control, omniscience, existing trust relationships
Why Now?
Moore’s law improvements in computing produce highly functional end-systems
The Internet and burgeoning wired and wireless networks provide universal connectivity
Changing modes of working and problem solving emphasize teamwork, computation
Network exponentials produce dramatic changes in geometry and geography
Network Exponentials
Network vs. computer performance
Computer speed doubles every 18 months
Network speed doubles every 9 months
1986 to 2000: computers x 500, networks x 340,000
2001 to 2010: computers x 60, networks x 4,000
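The factors above follow from the two doubling periods. As a rough check, assuming steady exponential growth (the function name is illustrative, not from the slides), nine years gives 6 doublings for computers and 12 for networks, close to the quoted x 60 and x 4,000. The same formula over the 14 years from 1986 to 2000 gives values of the same order as the quoted x 500 and x 340,000.

```python
def growth_factor(years: float, doubling_months: float) -> float:
    """Return the multiplicative speed-up after `years` of steady doubling."""
    doublings = years * 12 / doubling_months
    return 2 ** doublings


# 2001 to 2010 is 9 years.
computers = growth_factor(9, 18)  # 6 doublings
networks = growth_factor(9, 9)    # 12 doublings
print(f"computers: x{computers:.0f}, networks: x{networks:.0f}")
```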
Moore’s Law vs. storage improvements vs. optical improvements. Graph from Scientific American (Jan. 2001) by Cleo Vilett; source: Vinod Khosla, Kleiner Perkins Caufield & Byers.
Broader Context
“Grid Computing” has much in common with major industrial thrusts:
Business-to-business, Peer-to-peer, Application Service Providers, Storage Service Providers, Distributed Computing, Internet Computing…
Sharing issues not adequately addressed by existing technologies:
Complicated requirements: “run program X at site Y subject to community policy P, providing access to data at Z according to policy Q”
High performance: unique demands of advanced & high-performance systems
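To see why such a requirement is "complicated", note that a single request has to satisfy two independent policies held by different parties. The sketch below is only a toy model under that reading. The `Policy` class, the `may_run` function, and the "run:"/"read" action names are all hypothetical and do not correspond to any Globus API.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical policy: the set of (user, action, target) triples it permits."""
    allowed: set = field(default_factory=set)

    def permits(self, user: str, action: str, target: str) -> bool:
        return (user, action, target) in self.allowed


def may_run(user, program, site, data, community_policy, data_policy):
    """True only if policy P allows the run at the site AND policy Q allows the data access."""
    return (community_policy.permits(user, "run:" + program, site)
            and data_policy.permits(user, "read", data))


# Hypothetical example: P is owned by the community, Q by the data holder.
P = Policy({("alice", "run:X", "siteY")})
Q = Policy({("alice", "read", "dataZ")})
print(may_run("alice", "X", "siteY", "dataZ", P, Q))
print(may_run("bob", "X", "siteY", "dataZ", P, Q))
```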
The Globus Project™
Close collaboration with real Grid projects in science and industry
Development and promotion of standard Grid protocols to enable interoperability and shared infrastructure
Development and promotion of standard Grid software APIs and SDKs to enable portability and code sharing
The Globus Toolkit™: open source reference software base for building grid infrastructure and applications
Global Grid Forum: Development of standard protocols and APIs for Grid computing
Layered Grid Architecture
Collective, “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services
Resource, “Sharing single resources”: negotiating access, controlling use
Connectivity, “Talking to things”: communication (Internet protocols) & security
Fabric, “Controlling things locally”: access to, & control of, resources
The Single System Model
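[Figure: a single system drawn as a stack. Top: User Interface / API. Middle: Resource Discovery (RD), Process Management (PM), Authentication, Authorization, Accounting (3A), Message Passing (MP), Data Management (DM). Below: Operating System. Bottom: Storage and Compute.]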
http://www.ncbiogrid.org/resources/slides/grid-overview.ppt
What Makes a Cluster?
Uses a Distributed Resource Manager (DRM) to manage job scheduling
Tightly coupled: high-speed, low-latency interconnect network
Shared storage for home directories, high-throughput scratch space, applications
Fairly homogeneous: configuration management is important!
Single administrative domain
User accounts managed with traditional mechanisms
The Cluster Model
[Figure: four cluster nodes, each a stack of Storage/Compute, Operating System, Cluster DRM, and the RD, PM, 3A, DM, MP services. A high-speed interconnect links them to a master node that provides the User Interface/API and Cluster DRM, along with shared storage and configuration management.]
How is an Enterprise Grid Different from a Cluster?
Heterogeneous: clusters, SMP, even workstations of dissimilar configurations, but all are tied together through a grid middleware layer
Lightly coupled: connected via 100 or 1000 Mbps Ethernet
Introduces a resource registry and grid security service, but usually only a single registry and security service for the grid
Not necessarily a single administrative domain
The Enterprise Grid Model
[Figure: clusters (nodes behind a cluster interface and a cluster DRM), SMP systems, and a User Interface/API host each expose a grid interface over the RD, PM, 3A, DM, MP services. All are connected by an enterprise LAN or WAN, which also hosts a single security infrastructure and a single resource registry.]
How is a Global Grid Different from an Enterprise Grid?
"Grid of Grids": collection of enterprise grids
Loosely coupled between sites
Mutually distrustful administrative domains
Multiple grid resource registries and grid security services
The Global Grid Model
[Figure: Sites A, B and C connected by a WAN. Each site has its own LAN, clusters and SMPs behind grid interfaces, UI/API hosts, and its own resource registry (RR) and security infrastructure (SI).]
Grid Platform Example: Globus Toolkit V2
Primary development occurred at Argonne National Labs
Principals were Ian Foster and Carl Kesselman
Open source, but architecture development was a closed process
Toolkit approach: different “bundles” that can be installed depending upon what functions are desired
API through CoG (Commodity Grid) kits: Java, Python, CORBA, Perl, Matlab, Web services, JSP (JavaServer Pages)
Globus Toolkit V2
Majority of its use is in university and government research environments
Some vendors offer value-added versions: IBM Grid Toolbox, Platform Globus
NSF Middleware Initiative (NMI) is packaging pre-built Globus with other relevant components: NWS (Network Weather Service), KX.509/KCA (Kerberos-X.509 integration), Condor-G as a “metascheduler”, GSI-enabled OpenSSH
* GSI: Grid Security Infrastructure
Globus Toolkit V2 “Pillars”
Information Services (MDS)
Data Management (GASS)
Resource Management (GRAM)
Grid Security Infrastructure (GSI)
Globus Toolkit V2 Stack
MDS, GASS/GridFTP, GRAM
GSI
HTTP, LDAP, FTP
TLS/SSL
TCP/IP
Globus Toolkit V2 Key Components
Globus Resource Allocation Manager (GRAM):
Server-side: “gatekeeper” process that controls execution of job managers
Client-side: “globusrun” UI to launch jobs
Monitoring and Discovery Service (MDS):
GRIS: Grid Resource Information Service collects local info
GIIS: Grid Index Information Service collects GRIS info
Global Access to Secondary Storage (GASS):
GridFTP, implemented through “in.ftpd” daemon and “globus-url-copy” command
Files accessed through a URI, e.g. gsiftp://node1.ncbiogrid.org/data/ncbi/ecoli.nt
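A gsiftp URI has the same shape as any other URL, so it can be split into the grid node and the file path with a standard URL parser. The sketch below uses Python's `urllib.parse`. The helper name `split_grid_uri` is illustrative only and is not part of the Globus Toolkit.

```python
from urllib.parse import urlparse


def split_grid_uri(uri: str) -> tuple:
    """Split a gsiftp:// URI into (host, path); reject other schemes."""
    parts = urlparse(uri)
    if parts.scheme != "gsiftp":
        raise ValueError(f"not a gsiftp URI: {uri!r}")
    return parts.hostname, parts.path


host, path = split_grid_uri("gsiftp://node1.ncbiogrid.org/data/ncbi/ecoli.nt")
print(host)  # node1.ncbiogrid.org
print(path)  # /data/ncbi/ecoli.nt
```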
Globus Toolkit V2 Key Components: GSI
Uses a TLS/SSL-based PKI
All server resources (e.g. gatekeeper, GRIS) and users have a public key that has been digitally signed by the CA (the “certificate”) and a private key
“grid-cert-request” to generate key pair
User/sysadmin sends the public key to CA
CA signs the public key with its private key and returns the signed certificate to the user/sysadmin
The user/sysadmin stores the signed certificate in the local filesystem
Certificate contains: the subject name, the subject’s public key, the CA’s name, and the CA’s signature
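The signing step can be illustrated with a toy model. This is emphatically not how GSI is implemented: it uses textbook RSA with tiny hard-coded primes, which is trivially breakable, and the names, subject strings and key values are made up for illustration. It only shows the relationship between the four certificate fields listed above: the CA signs a digest of the first three with its private key, and anyone holding the CA's public key can check the fourth.

```python
import hashlib
from dataclasses import dataclass

# Textbook RSA with tiny primes: illustration ONLY, trivially breakable.
CA_N, CA_E, CA_D = 3233, 17, 2753   # hypothetical CA key pair (n = 61 * 53)
USER_N, USER_E = 2773, 17           # hypothetical user public key (n = 47 * 59)


@dataclass
class Certificate:
    subject: str               # the subject name
    subject_public_key: tuple  # the subject's public key (n, e)
    issuer: str                # the CA's name
    signature: int = 0         # the CA's signature, filled in by ca_sign()

    def to_be_signed(self) -> bytes:
        return f"{self.subject}|{self.subject_public_key}|{self.issuer}".encode()


def digest(data: bytes, n: int) -> int:
    """SHA-256 digest reduced modulo n so it fits the toy key size."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def ca_sign(cert: Certificate, n: int, d: int) -> Certificate:
    """CA step: sign the digest of the certificate body with the CA private key."""
    cert.signature = pow(digest(cert.to_be_signed(), n), d, n)
    return cert


def verify(cert: Certificate, ca_n: int, ca_e: int) -> bool:
    """Relying-party step: check the signature with the CA public key."""
    return pow(cert.signature, ca_e, ca_n) == digest(cert.to_be_signed(), ca_n)


cert = ca_sign(
    Certificate("/O=Grid/CN=Example User", (USER_N, USER_E), "/O=Grid/CN=Example CA"),
    CA_N, CA_D,
)
print(verify(cert, CA_N, CA_E))
```

A real deployment uses X.509 certificates and full-size keys; the point here is only that verification needs nothing but the CA's public key.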
Globus Toolkit V2 Key Components: GSI (contd.)
Logging in to the grid (“grid-proxy-init”):
User creates a temporary public-private key pair
User’s private key is used to digitally sign the temporary public key; this becomes the “proxy” certificate
This creates a chain of trust from the CA to the user to the proxy certificate
The proxy certificate and associated private key are transmitted with a job
The proxy certificate can be used to issue commands on remote servers on the user’s behalf (“delegation”)
On remote servers, there is a “grid-mapfile” that maps user cert subject names to local userids
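The last step, mapping a certificate subject name to a local userid, amounts to a table lookup. The sketch below assumes each grid-mapfile line holds a quoted subject name followed by one local userid. The sample subjects and userids are invented, and the parser is a simplified illustration rather than the toolkit's own code.

```python
import shlex


def parse_grid_mapfile(text: str) -> dict:
    """Return {certificate subject name: local userid} from grid-mapfile text."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = shlex.split(line)  # honours the quotes around the subject name
        if len(fields) != 2:
            raise ValueError(f"unexpected grid-mapfile line: {line!r}")
        subject, userid = fields
        mapping[subject] = userid
    return mapping


# Hypothetical sample entries.
SAMPLE = '''
# subject name                      local userid
"/O=Grid/OU=Example/CN=Jane Doe" jdoe
"/O=Grid/OU=Example/CN=John Roe" jroe
'''

users = parse_grid_mapfile(SAMPLE)
print(users["/O=Grid/OU=Example/CN=Jane Doe"])  # jdoe
```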
Globus Toolkit V2 Additional Components
Grid Packaging Tools (GPT):
Used to build (“gpt-build”), install (“gpt-install”) and localize (“gpt-postinstall”) Globus components
MPICH-G2:
A Globus V2 enabled version of MPI (Message Passing Interface)
Based on MPICH
Utilizes GSI, MDS and GRAM
Globus Toolkit V2 Network Services
[Figure: a certificate authority, a GIIS server, a client node running the GRAM client, and four grid nodes each running GRIS, gatekeeper and in.ftpd, all attached to the network.]
GRAM, MDS and GASS Interactions
[Figure: a client (user / proxy) talks to three services. GRAM (job allocation, job management): gatekeeper, job manager and processes, reached via RSL/DUROC/HTTP 1.1. MDS (resource discovery): GRIS and GIIS, reached via LDAP. GASS (data transfer, data control): GridFTP in.ftpd, reached via gsiftp.]
Globus Toolkit V2 Strengths and Weaknesses
Strengths:
Mindshare and collaboration in both industry & academia
Open source
Standards-based underpinnings (e.g. SSL, LDAP)
Flexibility and CoG APIs
Driving OGSA with heavy resource commitment from IBM
Weaknesses:
Significant effort required to get applications working on a grid
Not production quality at this time
No “metascheduler”: user has to explicitly tell their jobs where to run
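The last weakness means the user picks the target resource for each job by hand. The sketch below shows what that looks like from the client side: it builds, but does not run, a “globusrun” command line naming the resource. The RSL string and the `-r` (resource contact) option are written from memory of Globus Toolkit 2 usage and should be checked against the installed version. The helper names are illustrative, and the host name is reused from the gsiftp example earlier in the transcript.

```python
def make_rsl(executable: str, count: int = 1) -> str:
    """Build a minimal RSL job description (assumed syntax: &(attr=value)...)."""
    return f"&(executable={executable})(count={count})"


def globusrun_command(contact: str, rsl: str) -> list:
    """Return the argv for submitting `rsl` to the gatekeeper at `contact`.

    The resource is chosen by the caller: nothing here selects it automatically.
    """
    return ["globusrun", "-r", contact, rsl]


cmd = globusrun_command("node1.ncbiogrid.org", make_rsl("/bin/hostname"))
print(cmd)
```

A metascheduler would replace the hard-coded contact with a choice made from resource information such as that published through MDS.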
References
Carl Kesselman, “Grid Computing,” Information Sciences Institute, University of Southern California. Joint work with Ian Foster, ANL and U Chicago.
Bryan Carpenter, Geoffrey Fox, and Marlon Pierce, “e-Science e-Business e-Government and their Technologies Introduction,” Pervasive Technology Laboratories, Indiana University, http://www.grid2004.org/spring2004
Fran Berman and Anthony J.G. Hey, “Grid Computing: Making The Global Infrastructure a Reality,” Wiley, ISBN: 0-470-85319-0, March 2003
“High-Performance Computing with Scalable Server Cluster and Grid Networks,” FORCE10
http://www.force10networks.com/applications/pdf/ClusterGridapV1_0.pdf
Ian Foster, et al., “The Anatomy of the Grid,” http://www.globus.org/research/papers/anatomy.pdf
Ian Foster, et al., “Computational Grid,” http://www-fp.globus.org/research/papers/chapter2.pdf
“Grid Networks,” ITU, http://www.itu.int/osg/spu/newslog/categories/gridNetworks/