Grid Architecture
Prof. Ruay-Shiung Chang
March 2004

Course Contents
• Grid Architecture
• Open Grid Services Architecture
• Resource and Service Management
• Building Reliable Clients and Services
• Instrumentation and Monitoring

Grid Architecture

Why Discuss Architecture?
Descriptive
• Provide a common vocabulary for use when describing Grid systems
Guidance
• Identify key areas in which services are required
Prescriptive
• Define standard “Intergrid” protocols and APIs to facilitate creation of interoperable Grid systems and portable applications

[Image: Water Garden in England]

One View of Requirements
• Identity & authentication
• Authorization & policy
• Resource discovery
• Resource characterization
• Resource allocation
• (Co-)reservation, workflow
• Distributed algorithms
• Remote data access
• High-speed data transfer
• Performance guarantees
• Monitoring
• Adaptation
• Intrusion detection
• Resource management
• Accounting & payment
• Fault management
• System evolution
• etc.

Another View: “Three Obstacles to Making Grid Computing Routine”
1) New approaches to problem solving
• Data Grids, distributed computing, peer-to-peer, collaboration grids, …
2) Structuring and writing programs
• Abstractions, tools
3) Enabling resource sharing across distinct institutions
• Resource discovery, access, reservation, allocation; authentication, authorization, policy; communication; fault detection and notification; …

The Systems Problem: Resource Sharing Mechanisms That …
• Address security and policy concerns of resource owners and users
• Are flexible enough to deal with many resource types and sharing modalities
• Scale to large numbers of resources, many participants, many program components
• Operate efficiently when dealing with large amounts of data & computation

Aspects of the Systems Problem
1) Need for interoperability when different groups want to share resources
• Diverse components, policies, mechanisms
• E.g., standard notions of identity, means of communication, resource descriptions
2) Need for shared infrastructure services to avoid repeated development, installation
• E.g., one port/service/protocol for remote access to computing, not one per tool/app
• E.g., Certificate Authorities: expensive to run
A common need for protocols & services

Hence, a Protocol-Oriented View of Grid Architecture, that Emphasizes …
Development of Grid protocols & services
• Protocol-mediated access to remote resources
• New services: e.g., resource brokering
• “On the Grid” = speak Intergrid protocols
• Mostly (extensions to) existing protocols
Development of Grid APIs & SDKs
• Interfaces to Grid protocols & services
• Facilitate application development by supplying higher-level abstractions
The (hugely successful) model is the Internet

Layered Grid Architecture (By Analogy to Internet Architecture)
Grid layers, top to bottom:
• Application
• Collective: “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services
• Resource: “Sharing single resources”: negotiating access, controlling use
• Connectivity: “Talking to things”: communication (Internet protocols) & security
• Fabric: “Controlling things locally”: access to, & control of, resources
Internet Protocol Architecture layers, top to bottom: Application, Transport, Internet, Link

Protocols, Services, and APIs Occur at Each Level
Stack shown in the figure, top to bottom:
• Applications
• Languages/Frameworks
• Collective Service APIs and SDKs
• Collective Service Protocols, Collective Services
• Resource APIs and SDKs
• Resource Service Protocols, Resource Services
• Connectivity APIs
• Connectivity Protocols
• Local Access APIs and Protocols
• Fabric Layer

Important Points
Built on Internet protocols & services
• Communication, routing, name resolution, etc.
“Layering” here is conceptual, does not imply constraints on who can call what
• Protocols/services/APIs/SDKs will, ideally, be largely self-contained
• Some things are fundamental: e.g., communication and security
• But, advantageous for higher-level functions to use common lower-level functions

The Hourglass Model
Focus on architecture issues
• Propose set of core services as basic infrastructure
• Use to construct high-level, domain-specific solutions
Design principles
• Keep participation cost low
• Enable local control
• Support for adaptation
• “IP hourglass” model
[Figure labels: Applications, Diverse global services, Core services, Local OS]

[Image: Hourglass]

Where Are We With Architecture?
No “official” standards exist
But:
• Globus Toolkit™ has emerged as the de facto standard for several important Connectivity, Resource, and Collective protocols
• GGF has an architecture working group
• Technical specifications are being developed for architecture elements: e.g., security, data, resource management, information
• Internet drafts submitted in security area

Fabric Layer Protocols & Services
Just what you would expect: the diverse mix of resources that may be shared
• Individual computers, Condor pools, file systems, archives, metadata catalogs, networks, sensors, etc.
Few constraints on low-level technology: connectivity and resource level protocols form the “neck in the hourglass”
Defined by interfaces, not physical characteristics

[Image: Fabrics in general]

Connectivity Layer Protocols & Services
Communication
• Internet protocols: IP, DNS, routing, etc.
Security: Grid Security Infrastructure (GSI)
• Uniform authentication, authorization, and message protection mechanisms in multi-institutional setting
• Single sign-on, delegation, identity mapping
• Public key technology, SSL, X.509, GSS-API
• Supporting infrastructure: Certificate Authorities, certificate & key management, …

[Image: Not too many connections!]

Resource Layer Protocols & Services
Grid Resource Allocation Management (GRAM)
• Remote allocation, reservation, monitoring, control of compute resources
GridFTP protocol (FTP extensions)
• High-performance data access & transport
Grid Resource Information Service (GRIS)
• Access to structure & state information
Others emerging: catalog access, code repository access, accounting, etc.
All built on connectivity layer: GSI & IP

Collective Layer Protocols & Services
Index servers, aka metadirectory services
• Custom views on dynamic resource collections assembled by a community
Resource brokers (e.g., Condor Matchmaker)
• Resource discovery and allocation
Replica catalogs
Replication services
Co-reservation and co-allocation services
Workflow management services
etc.
Condor: www.cs.wisc.edu/condor

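The resource-broker idea above (discovery, then allocation) can be sketched in a few lines. This is only an illustration: the resource advertisements, field names, and the "smallest sufficient resource" rule are assumptions made for the example and are not how the Condor Matchmaker is actually implemented.

```python
# Hypothetical resource advertisements published by resource owners.
resources = [
    {"name": "cluster-a", "cpus": 64, "memory_gb": 128, "free": True},
    {"name": "desktop-7", "cpus": 4, "memory_gb": 8, "free": True},
    {"name": "cluster-b", "cpus": 256, "memory_gb": 512, "free": False},
]


def match(job, ads):
    """Return the smallest free resource that satisfies the job, or None."""
    candidates = [
        ad for ad in ads
        if ad["free"]
        and ad["cpus"] >= job["cpus"]
        and ad["memory_gb"] >= job["memory_gb"]
    ]
    # Assumed policy: prefer the resource with the fewest CPUs that still fits.
    return min(candidates, key=lambda ad: ad["cpus"], default=None)


job = {"cpus": 16, "memory_gb": 32}
chosen = match(job, resources)
print(chosen["name"] if chosen else "no match")
```

A real broker would also apply the owner's policy and the user's credentials before allocating; the sketch covers only the matching step.
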
Collectives: The Borg
“Resistance is futile. You will be assimilated.”

Example: High-Throughput Computing System
Layers, top to bottom:
• App: High Throughput Computing System
• Collective (App): Dynamic checkpoint, job management, failover, staging
• Collective (Generic): Brokering, certificate authorities
• Resource: Access to data, access to computers, access to network performance data
• Connect: Communication, service discovery (DNS), authentication, authorization, delegation
• Fabric: Storage systems, schedulers
[Figure labels: Compute Resource (SDK, API, Access Protocol); Checkpoint Repository (SDK, API, C-point Protocol)]

Example: Data Grid Architecture
Layers, top to bottom:
• App: Discipline-Specific Data Grid Application
• Collective (App): Coherency control, replica selection, task management, virtual data catalog, virtual data code catalog, …
• Collective (Generic): Replica catalog, replica management, co-allocation, certificate authorities, metadata catalogs, …
• Resource: Access to data, access to computers, access to network performance data, …
• Connect: Communication, service discovery (DNS), authentication, authorization, delegation
• Fabric: Storage systems, clusters, networks, network caches, …

CERN’s Data Grid Application
The Compact Muon Solenoid

Open Grid Service Architecture
“Minds are like parachutes, they only function when they are open.”
– Thomas Robert Dewar

Service-Oriented Architecture
Service: an entity that provides some capability to its clients by exchanging messages
Service: defined by identifying sequences of specific message exchanges that cause the service to perform some operations

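A minimal sketch of this definition, assuming a hypothetical in-memory storage service: the client never calls storage methods directly, it only sends messages and reads the replies. The `Message` shape and the operation names (`put`, `get`, `fault`) are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    """A self-describing message: an operation name plus a payload."""
    operation: str
    payload: dict = field(default_factory=dict)


class StorageService:
    """Hypothetical service: clients see only the messages it accepts."""

    def __init__(self):
        self._blobs = {}

    def handle(self, msg: Message) -> Message:
        # The service is defined by the message exchanges it supports.
        if msg.operation == "put":
            self._blobs[msg.payload["name"]] = msg.payload["data"]
            return Message("put_ok", {"name": msg.payload["name"]})
        if msg.operation == "get":
            name = msg.payload["name"]
            if name not in self._blobs:
                return Message("fault", {"reason": f"unknown object: {name}"})
            return Message("get_ok", {"name": name, "data": self._blobs[name]})
        return Message("fault", {"reason": f"unsupported operation: {msg.operation}"})


service = StorageService()
service.handle(Message("put", {"name": "run42.dat", "data": b"\x00\x01"}))
reply = service.handle(Message("get", {"name": "run42.dat"}))
print(reply.operation, reply.payload["data"])
```

Because the only contract is the message exchange, the same client code would work whether the service runs in-process, as here, or behind a network protocol.
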
Service-oriented Architecture
Great flexibility in implementation and location, because all operations are defined in terms of message exchanges
SOAP: Simple Object Access Protocol

Service-oriented Architecture
Definition: a service-oriented architecture is one in which all entities are services, and thus any operation visible to the architecture is the result of message exchange
[Figure labels: Underlying infrastructure, Message exchanges, services]

Service-oriented architecture
Examples:
• A storage service
• A data transfer service
• A troubleshooting service
Two important themes
[Figure labels: low-level service, high-level service]

But first, four personality types

Service-oriented architecture
Common behaviors can reoccur in different contexts
• A goal of OGSA design is to allow these behaviors to be expressed in standard ways regardless of context, so as to simplify application design and encourage code reuse

Service-oriented architecture
A higher-level service behavior (data transfer) can be implemented via the composition of simpler behaviors (storage service)
• Ease of composition is a second major design goal for OGSA

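The composition point can be illustrated with a small sketch in which a data transfer service is built purely from the operations of two storage services. Class and method names are hypothetical, chosen for the example rather than taken from OGSA.

```python
class StorageService:
    """Hypothetical low-level service: stores named byte strings."""

    def __init__(self, site):
        self.site = site
        self._objects = {}

    def put(self, name, data):
        self._objects[name] = data

    def get(self, name):
        return self._objects[name]


class DataTransferService:
    """Hypothetical higher-level service built only from storage operations."""

    def transfer(self, name, source, destination):
        # Compose two simpler behaviors: read at the source, write at the target.
        data = source.get(name)
        destination.put(name, data)
        return len(data)


site_a = StorageService("site-a")
site_b = StorageService("site-b")
site_a.put("events.dat", b"muon tracks")
moved = DataTransferService().transfer("events.dat", site_a, site_b)
print(moved, site_b.get("events.dat"))
```

The transfer service needs no knowledge of how either store is implemented, which is what makes the composition easy.
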
Service-oriented architecture
Similarity in protocols

Service-oriented architecture
By encapsulating service operations behind a common message-oriented service interface, service-oriented architecture encourages service virtualization, isolating users from details of service implementation and location

Service-oriented architecture
[Image: Virtualization]

Service-oriented architecture
Virtualization
• Everything is becoming virtual: virtual stores, virtual workspace, virtual organization, virtual networks, …
• More than 60 million US workers work remotely

Service-oriented architecture
Interaction with a given service is facilitated by using a standard interface definition language (IDL), such as WSDL, to describe the service’s interfaces.

Service-oriented architecture
Web Services Description Language (WSDL)

Service-oriented architecture
IDL: a cornerstone of interoperability and transparency

Service-oriented architecture
An IDL defines the operations supported by a service, by specifying the messages that a service consumes and produces
An interface specification describes the messages the service expects but does not define what the service does in response to those messages (i.e., its behavior)

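The distinction between interface and behavior can be shown with a sketch that uses Python's `typing.Protocol` as a stand-in for an IDL such as WSDL. The `Storage` interface and both implementations are hypothetical; the point is only that two services can accept exactly the same messages and still respond differently.

```python
from typing import Protocol


class Storage(Protocol):
    """Interface only: which operations exist and what they consume/produce."""

    def put(self, name: str, data: bytes) -> None: ...
    def get(self, name: str) -> bytes: ...


class PlainStore:
    """Behavior A: a later put overwrites an earlier one."""

    def __init__(self):
        self._objects = {}

    def put(self, name, data):
        self._objects[name] = data

    def get(self, name):
        return self._objects[name]


class WriteOnceStore:
    """Behavior B: the first put wins; later puts are ignored."""

    def __init__(self):
        self._objects = {}

    def put(self, name, data):
        self._objects.setdefault(name, data)

    def get(self, name):
        return self._objects[name]


def overwrite_then_read(store: Storage) -> bytes:
    # The client relies on the interface alone; the result depends on behavior.
    store.put("x", b"first")
    store.put("x", b"second")
    return store.get("x")


print(overwrite_then_read(PlainStore()), overwrite_then_read(WriteOnceStore()))
```

Both classes satisfy the interface, so the interface description alone cannot tell a client which result to expect.
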
Service-oriented architecture
A well-defined interface definition language and a separation of concerns between service interface and implementation simplify the manipulation and management of services in four important respects: service discovery, composition, specialization, and interface extension.

Service-oriented architecture
Service discovery

Service-oriented architecture
Service discovery
• Service Location Protocol (SLP)
• Jini: a service discovery architecture based on Java
• Universal Plug and Play (UPnP): Microsoft's solution to service discovery
• Salutation: a lightweight, network-protocol-independent service discovery protocol

Service-oriented architecture
Service composition

Service-oriented architecture
Service specialization: use of different implementations of a service interface on different platforms

Service-oriented architecture
Interface extension: allow for specialized implementations to add additional, implementation-specific functionality, while still supporting the common interface
[Figure label: A value-added extension]

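Specialization and interface extension can be sketched together: two implementations share one common interface, and one of them adds an extra, implementation-specific operation. The `FileTransfer` interface and the `copy_striped` extension are hypothetical names used only for illustration.

```python
from abc import ABC, abstractmethod


class FileTransfer(ABC):
    """Hypothetical common interface that every implementation supports."""

    @abstractmethod
    def copy(self, source: str, destination: str) -> str: ...


class BasicTransfer(FileTransfer):
    """One specialization of the common interface."""

    def copy(self, source, destination):
        return f"copied {source} -> {destination}"


class StripedTransfer(FileTransfer):
    """Another specialization that also offers a value-added extension."""

    def copy(self, source, destination):
        return f"copied {source} -> {destination}"

    def copy_striped(self, source, destination, streams):
        # Extension: not part of the common interface.
        return f"copied {source} -> {destination} over {streams} streams"


def client(transfer: FileTransfer) -> str:
    # Use the extension when it is offered, otherwise the common operation.
    if isinstance(transfer, StripedTransfer):
        return transfer.copy_striped("a.dat", "b.dat", streams=4)
    return transfer.copy("a.dat", "b.dat")


print(client(BasicTransfer()))
print(client(StripedTransfer()))
```

Clients that know nothing about the extension keep working, because `copy` remains available on every implementation.
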
Open Grid Service Architecture
Objectives of OGSA
• Manage resources across distributed heterogeneous platforms
• Deliver seamless quality of service
• Provide a common base for autonomic management solutions
• Define open, published interfaces
• Exploit industry standard integration technologies

Open Grid Service Architecture
Three principal elements of OGSA: OGSI, OGSA services, OGSA applications
[Figure: OGSA main architecture]

Open Grid Service Infrastructure
OGSI: provides a uniform way for software developers to model and interact with grid services by providing interfaces for discovery, life cycle, state management, creation and destruction, event notification, and reference management.

[Image: Requesting a service]

Important OGSI concepts and interactions
[Figure labels: service requester; service discovery; service registry; register service; service factory; create service; grid service handle; resource allocation; service instances; service invocation; service data, keep-alive, notifications]

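The interactions named in the figure can be walked through in a small sketch: a factory registers with a registry, a requester discovers the factory, asks it to create an instance, receives a handle, and then invokes the instance. Everything here is a simplification under stated assumptions: the class names, the `gsh://example.org/...` handle format, and the in-memory lookup are invented for the example and are not the OGSI interfaces themselves.

```python
import itertools


class ServiceInstance:
    """Hypothetical stateful service instance."""

    def __init__(self, handle, kind):
        self.handle = handle
        self.kind = kind
        self.service_data = {"kind": kind, "state": "active"}

    def invoke(self, operation):
        return f"{self.kind} instance {self.handle} ran {operation}"


class ServiceFactory:
    """Creates instances on request and hands back a handle for each."""

    def __init__(self, kind):
        self.kind = kind
        self._ids = itertools.count(1)
        self.instances = {}

    def create_service(self):
        # Assumed handle format, for illustration only.
        handle = f"gsh://example.org/{self.kind}/{next(self._ids)}"
        self.instances[handle] = ServiceInstance(handle, self.kind)
        return handle


class ServiceRegistry:
    """Lets requesters discover factories by the kind of service they create."""

    def __init__(self):
        self._factories = {}

    def register_service(self, factory):
        self._factories[factory.kind] = factory

    def discover(self, kind):
        return self._factories[kind]


registry = ServiceRegistry()
registry.register_service(ServiceFactory("mining"))   # register service
factory = registry.discover("mining")                 # service discovery
handle = factory.create_service()                     # create service
instance = factory.instances[handle]
print(instance.invoke("analyze"))                     # service invocation
```
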
Open Grid Service Infrastructure
OGSI and Web services: the OGSA architecture enhances Web services to accommodate the requirements of the grid. These enhancements are specified in OGSI. Over time, much of the OGSI functionality is expected to be incorporated into Web services standards.

Open Grid Service Infrastructure
Implementation of OGSI: the Globus Toolkit 3 (GT3) is the first full-scale implementation of the OGSI standard.

OGSA Architected Services
• OGSA architected services
• Grid core services
• Grid program execution and data services
• Grid program execution and data services hosting

Naming
Because Grid services are dynamic and stateful, we need a way to distinguish one dynamically created service instance from another.
Thus, we need a naming scheme for Grid service instances.

Naming
OGSI defines a two-level naming scheme for Grid service instances based on simple, abstract, long-lived Grid service handles (GSH).
A GSH can be mapped by handle resolution services to concrete but potentially short-lived Grid service references (GSR).

Naming
A GSH is a globally unique name that distinguishes that specific Grid service instance from all other Grid service instances that have existed, exist now, or will exist in the future.
A GSH is represented using a Uniform Resource Identifier.

Naming
A GSH carries no protocol- or instance-specific information such as network address or supported protocol bindings.
All other instance-specific information is encapsulated into a single abstraction called a Grid service reference (GSR).

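Naming
[Figure: a client sends Resolve(GSH) to a handle resolver. At time < T the resolver returns GSR1, which refers to one service instance; the instance migrates at time T; at time > T the resolver returns GSR2, which refers to the migrated service instance.]
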
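The two-level naming scheme can be sketched as a lookup table from a stable handle to a replaceable reference. The `GSR` fields, the host names, and the `gsh://` handle string are assumptions made for the example; OGSI itself only requires that the handle be a URI and that the reference carry the instance-specific binding details.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GSR:
    """Concrete, possibly short-lived reference (fields are illustrative)."""
    address: str
    binding: str


class HandleResolver:
    """Maps long-lived handles (GSH) to the current reference (GSR)."""

    def __init__(self):
        self._table = {}

    def bind(self, gsh, gsr):
        # Called when an instance is created or when it migrates.
        self._table[gsh] = gsr

    def resolve(self, gsh):
        return self._table[gsh]


resolver = HandleResolver()
gsh = "gsh://example.org/mining/1"  # hypothetical handle URI
resolver.bind(gsh, GSR("hostA.example.org:8080", "soap/http"))
before = resolver.resolve(gsh)      # time < T: GSR1
resolver.bind(gsh, GSR("hostB.example.org:8080", "soap/http"))  # migrate at T
after = resolver.resolve(gsh)       # time > T: GSR2
print(before.address, after.address)
```

The client keeps using the same handle across the migration; only the reference it gets back from the resolver changes.
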
Service lifetime management
The introduction of transient service instances raises the issue of determining the service’s lifetime, that is, determining when a service can or should be terminated so that associated resources can be recovered.

Happiness for a lifetime
If you want happiness for an hour -- take a nap. If you want happiness for a day -- go fishing. If you want happiness for a month -- get married. If you want happiness for a year -- inherit a fortune. If you want happiness for a lifetime -- help someone else.
- Chinese Proverb

Service lifetime management
OGSA addresses this problem through a soft-state approach in which Grid service instances are created with a specified lifetime.
Three steps:
• Negotiating an initial lifetime
• Explicit termination
• Requesting a lifetime modification

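The three steps above can be sketched as follows. The class, the method names, and the 60-second cap are hypothetical; time is passed in explicitly as `now` so the example is deterministic.

```python
class SoftStateInstance:
    """Hypothetical instance that expires unless its lifetime is extended."""

    def __init__(self, now, requested_lifetime, max_lifetime=60.0):
        # Negotiation: the hosting side may grant less than was requested.
        self.termination_time = now + min(requested_lifetime, max_lifetime)
        self.destroyed = False

    def request_termination_after(self, now, extra):
        # Lifetime modification (keep-alive): push the termination time out.
        if self.is_alive(now):
            self.termination_time = max(self.termination_time, now + extra)
        return self.termination_time

    def destroy(self):
        # Explicit termination requested by the client.
        self.destroyed = True

    def is_alive(self, now):
        return not self.destroyed and now < self.termination_time


inst = SoftStateInstance(now=0.0, requested_lifetime=100.0)  # granted 60 s
alive_at_30 = inst.is_alive(30.0)
inst.request_termination_after(now=30.0, extra=60.0)         # now ends at 90
alive_at_80 = inst.is_alive(80.0)
alive_at_95 = inst.is_alive(95.0)  # no further keep-alive arrived
print(alive_at_30, alive_at_80, alive_at_95)
```

The benefit of soft state is visible in the last check: if the client disappears and stops sending keep-alives, the instance simply expires and its resources can be reclaimed without any explicit cleanup message.
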
Soft-state used in RSVP
What is RSVP? (Resource reSerVation Protocol)
[Figure labels: Path message; Resv message; Merge the reservation]

OGSA ImplementationsOGSA Implementations
demarshaling/
decoding
Grid serviceimplementation
Grid serviceimplementation
Grid serviceimplementation
protocoltermination
protocoltermination
protocoltermination
Container (good for code reuse)
Grid service
invocation
adaptationlayer
Future Directions
• Service and tools
• Implementation
• Semantics
• Scalability
• …