Grid Computing: Concepts, Applications, and Technologies
Ian Foster
Mathematics and Computer Science Division
Argonne National Laboratory
and
Department of Computer Science
The University of Chicago
http://www.mcs.anl.gov/~foster
Grid Computing in Canada Workshop, University of Alberta, May 1, 2002
Outline
The technology landscape
Grid computing
The Globus Toolkit
Applications and technologies
– Data-intensive; distributed computing; collaborative; remote access to facilities
Grid infrastructure
Open Grid Services Architecture
Global Grid Forum
Summary and conclusions
Living in an Exponential World: (1) Computing & Sensors
Moore’s Law: transistor count doubles every 18 months
(Images: magnetohydrodynamics; star formation.)
Living in an Exponential World: (2) Storage
Storage density doubles every 12 months
Dramatic growth in online data (1 petabyte = 1,000 terabytes = 1,000,000 gigabytes)
– 2000: ~0.5 petabyte
– 2005: ~10 petabytes
– 2010: ~100 petabytes
– 2015: ~1000 petabytes?
Transforming entire disciplines in the physical and, increasingly, biological sciences; humanities next?
Data Intensive Physical Sciences
High energy & nuclear physics
– Including new experiments at CERN
Gravity wave searches
– LIGO, GEO, VIRGO
Time-dependent 3-D systems (simulation, data)
– Earth observation, climate modeling
– Geophysics, earthquake modeling
– Fluids, aerodynamic design
– Pollutant dispersal scenarios
Astronomy: digital sky surveys
Ongoing Astronomical Mega-Surveys
Large number of new surveys
– Multi-TB in size, 100M objects or larger
– In databases
– Individual archives planned and under way
Multi-wavelength view of the sky
– >13 wavelength coverage within 5 years
Impressive early discoveries
– Finding exotic objects by unusual colors
> L, T dwarfs; high-redshift quasars
– Finding objects by time variability
> Gravitational micro-lensing
Surveys include: MACHO, 2MASS, SDSS, DPOSS, GSC-II, COBE, MAP, NVSS, FIRST, GALEX, ROSAT, OGLE, ...
Coming Floods of Astronomy Data
The planned Large Synoptic Survey Telescope will produce over 10 petabytes per year by 2008!
– All-sky survey every few days, so will have fine-grain time series for the first time
Data Intensive Biology and Medicine
Medical data
– X-ray, mammography data, etc. (many petabytes)
– Digitizing patient records (ditto)
X-ray crystallography
Molecular genomics and related disciplines
– Human Genome, other genome databases
– Proteomics (protein structure, activities, ...)
– Protein interactions, drug delivery
Virtual Population Laboratory (proposed)
– Simulate likely spread of disease outbreaks
Brain scans (3-D, time dependent)
A Brain is a Lot of Data! (Mark Ellisman, UCSD)
We need to get to one micron to know the location of every cell. We're just now starting to get to 10 microns – Grids will help get us there and further.
And comparisons must be made among many brains.
An Exponential World: (3) Networks
(Or, Coefficients Matter ...)
Network vs. computer performance
– Computer speed doubles every 18 months
– Network speed doubles every 9 months
– Difference = order of magnitude per 5 years
1986 to 2000
– Computers: x 500
– Networks: x 340,000
2001 to 2010
– Computers: x 60
– Networks: x 4,000
Moore's Law vs. storage improvements vs. optical improvements. Graph from Scientific American (Jan 2001) by Cleo Vilett; source: Vinod Khosla, Kleiner Perkins Caufield & Byers.
Evolution of the Scientific Process
Pre-electronic
– Theorize &/or experiment, alone or in small teams; publish paper
Post-electronic
– Construct and mine very large databases of observational or simulation data
– Develop computer simulations & analyses
– Exchange information quasi-instantaneously within large, distributed, multidisciplinary teams
Evolution of Business
Pre-Internet
– Central corporate data processing facility
– Business processes not compute-oriented
Post-Internet
– Enterprise computing is highly distributed, heterogeneous, inter-enterprise (B2B)
– Outsourcing becomes feasible => service providers of various sorts
– Business processes increasingly computing- and data-rich
The Grid
“Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations”
An Example Virtual Organization: CERN’s Large Hadron Collider
1800 Physicists, 150 Institutes, 32 Countries
100 PB of data by 2010; 50,000 CPUs?
Grid Communities & Applications: Data Grids for High Energy Physics
(Diagram: the LHC online system emits ~PBytes/sec to a physics data cache; ~100 MBytes/sec flows to the CERN computer centre (Tier 0, offline processor farm, ~20 TIPS); ~622 Mbits/sec links (or air freight, deprecated) feed Tier 1 regional centres (FermiLab ~4 TIPS; France, Italy, Germany); further ~622 Mbits/sec links feed Tier 2 centres (~1 TIPS each, e.g. Caltech); institutes (~0.25 TIPS) and physicist workstations (Tier 4) connect at ~100 MBytes/sec and ~1 MBytes/sec.)
There is a "bunch crossing" every 25 nsecs.
There are 100 "triggers" per second.
Each triggered event is ~1 MByte in size.
Physicists work on analysis "channels". Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server.
1 TIPS is approximately 25,000 SpecInt95 equivalents.
www.griphyn.org www.ppdg.net www.eu-datagrid.org
Data Integration and Mining (credit: Sara Graves)
From Global Information to Local Knowledge
Precision Agriculture
Emergency Response
Weather Prediction
Urban Environments
Intelligent Infrastructure: Distributed Servers and Services
Grid Computing

The Grid: A Brief History
Early 90s
– Gigabit testbeds, metacomputing
Mid to late 90s
– Early experiments (e.g., I-WAY), academic software projects (e.g., Globus, Legion), application experiments
2002
– Dozens of application communities & projects
– Major infrastructure deployments
– Significant technology base (esp. Globus Toolkit™)
– Growing industrial interest
– Global Grid Forum: ~500 people, 20+ countries
The Grid World: Current Status
Dozens of major Grid projects in scientific & technical computing/research & education
– www.mcs.anl.gov/~foster/grid-projects
Considerable consensus on key concepts and technologies
– Open source Globus Toolkit™ a de facto standard for major protocols & services
Industrial interest emerging rapidly
– IBM, Platform, Microsoft, Sun, Compaq, ...
Opportunity: convergence of eScience and eBusiness requirements & technologies
Grid Technologies:
Resource Sharing Mechanisms That …
Address security and policy concerns of resource owners and users
Are flexible enough to deal with many resource types and sharing modalities
Scale to large number of resources, many participants, many program components
Operate efficiently when dealing with large amounts of data & computation
Aspects of the Problem
1) Need for interoperability when different groups want to share resources
– Diverse components, policies, mechanisms
– E.g., standard notions of identity, means of communication, resource descriptions
2) Need for shared infrastructure services to avoid repeated development, installation
– E.g., one port/service/protocol for remote access to computing, not one per tool/application
– E.g., Certificate Authorities: expensive to run
A common need for protocols & services
The Hourglass Model
Focus on architecture issues
– Propose set of core services as basic infrastructure
– Use to construct high-level, domain-specific solutions
Design principles
– Keep participation cost low
– Enable local control
– Support for adaptation
– "IP hourglass" model
(Diagram: applications atop diverse global services, atop a narrow set of core services, atop the local OS.)
Layered Grid Architecture (By Analogy to Internet Architecture)
Application
Collective: "Coordinating multiple resources": ubiquitous infrastructure services, app-specific distributed services
Resource: "Sharing single resources": negotiating access, controlling use
Connectivity: "Talking to things": communication (Internet protocols) & security
Fabric: "Controlling things locally": access to, & control of, resources
(Analogy: the Internet Protocol architecture's application, transport, Internet, and link layers.)
Globus Toolkit™
A software toolkit addressing key technical problems in the development of Grid-enabled tools, services, and applications
– Offer a modular set of orthogonal services
– Enable incremental development of grid-enabled tools and applications
– Implement standard Grid protocols and APIs
– Available under liberal open source license
– Large community of developers & users
– Commercial support
General Approach
Define Grid protocols & APIs
– Protocol-mediated access to remote resources
– Integrate and extend existing standards
– "On the Grid" = speak "Intergrid" protocols
Develop a reference implementation
– Open source Globus Toolkit
– Client and server SDKs, services, tools, etc.
Grid-enable wide variety of tools
– Globus Toolkit, FTP, SSH, Condor, SRB, MPI, ...
Learn through deployment and applications
Key Protocols
The Globus Toolkit™ centers around four key protocols
– Connectivity layer:
> Security: Grid Security Infrastructure (GSI)
– Resource layer:
> Resource Management: Grid Resource Allocation Management (GRAM)
> Information Services: Grid Resource Information Protocol (GRIP) and Grid Index Information Protocol (GIIP)
> Data Transfer: Grid File Transfer Protocol (GridFTP)
Also key collective layer protocols
– Info Services, Replica Management, etc.
Globus Toolkit Structure
(Diagram: GRAM, GridFTP, and MDS services, each built on GSI, front compute resources, data resources, and other services or applications via job managers; common behaviors include service naming, reliable invocation, soft-state management, and notification.)
Connectivity Layer Protocols & Services
GSI: www.gridforum.org/security/gsi
Communication
– Internet protocols: IP, DNS, routing, etc.
Security: Grid Security Infrastructure (GSI)
– Uniform authentication, authorization, and message protection mechanisms in multi-institutional setting
– Single sign-on, delegation, identity mapping
– Public key technology, SSL, X.509, GSS-API
– Supporting infrastructure: Certificate Authorities, certificate & key management, ...
Why Grid Security is Hard
Resources being used may be extremely valuable & the problems being solved extremely sensitive
Resources are often located in distinct administrative domains
– Each resource may have own policies & procedures
The set of resources used by a single computation may be large, dynamic, and/or unpredictable
– Not just client/server
It must be broadly available & applicable
– Standard, well-tested, well-understood protocols
– Integration with wide variety of tools
Grid Security Requirements
User view:
1) Easy to use
2) Single sign-on
3) Run applications: ftp, ssh, MPI, Condor, Web, ...
4) User-based trust model
5) Proxies/agents (delegation)
Resource owner view:
1) Specify local access control
2) Auditing, accounting, etc.
3) Integration w/ local system: Kerberos, AFS, license manager
4) Protection from compromised resources
Developer view:
API/SDK with authentication, flexible message protection, flexible communication, delegation, ...
Direct calls to various security functions (e.g. GSS-API), or security integrated into higher-level SDKs: e.g. GlobusIO, Condor-G, MPICH-G2, HDF5, etc.
Grid Security Infrastructure (GSI)
Extensions to existing standard protocols & APIs
– Standards: SSL/TLS, X.509 & CA, GSS-API
– Extensions for single sign-on and delegation
Globus Toolkit reference implementation of GSI
– SSLeay/OpenSSL + GSS-API + delegation
– Tools and services to interface to local security
> Simple ACLs; SSLK5 & PKINIT for access to K5, AFS, etc.
– Tools for credential management (usage sketched below)
> Login, logout, etc.
> Smartcards
> MyProxy: Web portal login and delegation
> K5cert: Automatic X.509 certificate creation
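As a concrete illustration of single sign-on, the sketch below drives the toolkit's credential tools (grid-proxy-init, grid-proxy-info) from Python; the 12-hour lifetime is an arbitrary choice, and the sketch assumes the tools and a user certificate are already installed.

# Minimal sketch: create and inspect a short-lived GSI proxy credential.
# Assumes the Globus command-line tools and a user certificate are in place.
import subprocess

def create_proxy(hours=12):
    # grid-proxy-init signs a proxy certificate with the user's long-term
    # key, enabling single sign-on for subsequent Grid operations.
    subprocess.run(["grid-proxy-init", "-valid", f"{hours}:00"], check=True)

def proxy_seconds_left():
    # grid-proxy-info -timeleft prints the proxy's remaining lifetime.
    out = subprocess.run(["grid-proxy-info", "-timeleft"],
                         capture_output=True, text=True, check=True)
    return int(out.stdout.strip())

create_proxy()
print(f"Proxy valid for {proxy_seconds_left()} more seconds")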
GSI in Action: "Create Processes at A and B that Communicate & Access Files at C"
(Diagram: the user performs single sign-on via a "grid-id" and generation of a proxy credential, or retrieves a proxy credential from an online repository. The user proxy issues remote process creation requests, with mutual authentication, to GSI-enabled GRAM servers at Site A (Kerberos) and Site B (Unix); each site authorizes the request, maps it to a local id, creates the process, and generates credentials (a Kerberos ticket at Site A; restricted proxies for the processes at both). The processes communicate with mutual authentication, and a remote file access request to the GSI-enabled FTP server at Site C (Kerberos) is authorized, mapped to a local id, and given access to the storage system.)
GSI Working Group Documents
Grid Security Infrastructure (GSI) Roadmap
– Informational draft overview of working group activities and documents
Grid Security Protocols & Syntax
– X.509 Proxy Certificates
– X.509 Proxy Delegation Protocol
– The GSI GSS-API Mechanism
Grid Security APIs
– GSS-API Extensions for the Grid
– GSI Shell API
GSI Futures
Scalability in numbers of users & resources
– Credential management
– Online credential repositories ("MyProxy")
– Account management
Authorization
– Policy languages
– Community authorization
Protection against compromised resources
– Restricted delegation, smartcards
Community Authorization Service (CAS)
1. CAS request, with resource names and operations. The CAS asks: does the collective policy authorize this request for this user? (using user/group membership, resource/collective membership, and collective policy information)
2. CAS reply, with capability and resource CA info
3. Resource request, authenticated with the capability. The resource asks: is this request authorized for the CAS? Is this request authorized by the capability? (using local policy information)
4. Resource reply
Laura Pearlman, Steve Tuecke, Von Welch, others
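As a rough illustration of the four-step flow above (not the actual CAS wire protocol), the following sketch models the two policy checks; all class and method names are hypothetical.

# Hypothetical sketch of the CAS authorization flow (not the real protocol).
from dataclasses import dataclass

@dataclass
class Capability:
    user: str
    resource: str
    operations: frozenset  # operations the CAS has granted
    issuer: str = "CAS"

class CAS:
    def __init__(self, collective_policy):
        # collective_policy: {(user, resource): allowed operations}
        self.policy = collective_policy

    def request(self, user, resource, ops):           # step 1
        allowed = self.policy.get((user, resource), set())
        granted = set(ops) & allowed
        return Capability(user, resource, frozenset(granted))  # step 2

class Resource:
    def __init__(self, name, trusted_cas, local_policy):
        self.name, self.trusted_cas = name, trusted_cas
        self.local_policy = local_policy  # operations the site allows the CAS

    def handle(self, cap, op):                        # step 3
        ok = (cap.issuer in self.trusted_cas          # authorized for the CAS?
              and op in self.local_policy             # allowed by local policy?
              and op in cap.operations)               # authorized by capability?
        return "granted" if ok else "denied"          # step 4

cas = CAS({("alice", "disk1"): {"read", "write"}})
disk = Resource("disk1", {"CAS"}, {"read"})
cap = cas.request("alice", "disk1", ["read", "write"])
print(disk.handle(cap, "read"))   # granted
print(disk.handle(cap, "write"))  # denied by local policy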
Resource Layer Protocols & Services
GRAM, GridFTP, GRIS: www.globus.org
Grid Resource Allocation Management (GRAM)
– Remote allocation, reservation, monitoring, control of compute resources
GridFTP protocol (FTP extensions)
– High-performance data access & transport
Grid Resource Information Service (GRIS)
– Access to structure & state information
Others emerging: catalog access, code repository access, accounting, etc.
All built on connectivity layer: GSI & IP
Resource Management
The Grid Resource Allocation Management (GRAM) protocol and client API allow programs to be started and managed on remote resources, despite local heterogeneity
The Resource Specification Language (RSL) is used to communicate requirements (see the sketch below)
A layered architecture allows application-specific resource brokers and co-allocators to be defined in terms of GRAM services
– Integrated with Condor, PBS, MPICH-G2, ...
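To make the RSL idea concrete, here is a minimal sketch of starting a remote job with the GT2-era globusrun client and an RSL string; the gatekeeper host is hypothetical, and the RSL attributes shown are a small illustrative subset.

# Minimal sketch: start a remote job through GRAM using globusrun and an
# RSL description. Assumes a valid proxy (grid-proxy-init) and that the
# gatekeeper host below (hypothetical) is reachable.
import subprocess

rsl = "&(executable=/bin/hostname)(count=4)(maxWallTime=5)"

subprocess.run(
    ["globusrun",
     "-o",                            # capture and print the job's stdout
     "-r", "gatekeeper.example.edu",  # hypothetical GRAM gatekeeper contact
     rsl],
    check=True)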
Resource Management Architecture
(Diagram: an application passes RSL to a broker, which specializes the RSL using queries & info from an information service; ground RSL goes to a co-allocator, which distributes simple ground RSL to GRAM servers fronting local resource managers such as LSF, Condor, and NQE.)
Data Access & Transfer
GridFTP: extended version of the popular FTP protocol for Grid data access and transfer
Secure, efficient, reliable, flexible, extensible, parallel, concurrent, e.g.:
– Third-party data transfers, partial file transfers
– Parallelism, striping (e.g., on PVFS)
– Reliable, recoverable data transfers
Reference implementations (client usage sketched below)
– Existing clients and servers: wuftpd, ncftp
– Flexible, extensible libraries in Globus Toolkit
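As a concrete illustration, a client-side transfer with the toolkit's globus-url-copy tool might look like the sketch below; the endpoints are hypothetical, and the flags shown (parallel streams, TCP buffer size) are the commonly used subset.

# Minimal sketch: a parallel GridFTP transfer driven by globus-url-copy.
# Assumes a valid GSI proxy and a GridFTP server at the (hypothetical)
# source endpoint.
import subprocess

src = "gsiftp://data.example.org/datasets/run42.dat"   # hypothetical
dst = "file:///scratch/run42.dat"

subprocess.run(
    ["globus-url-copy",
     "-p", "8",              # 8 parallel TCP streams
     "-tcp-bs", "1048576",   # 1 MB TCP buffer per stream
     src, dst],
    check=True)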
The Grid Information Problem
Large numbers of distributed “sensors” with different properties
Need for different “views” of this information, depending on community membership, security constraints, intended purpose, sensor type
The Globus Toolkit Solution: MDS-2
Registration & enquiry protocols, information models, query languages
– Provides standard interfaces to sensors
– Supports different "directory" structures supporting various discovery/access strategies
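Because MDS-2 speaks LDAP, a directory can be queried with a stock LDAP client; the sketch below uses ldapsearch, with a hypothetical host, and the port and base DN shown are the GT2 defaults as best I recall them (verify against a local deployment).

# Minimal sketch: query a GRIS/GIIS through MDS-2's LDAP interface.
# Host name is hypothetical; port 2135 and the base DN below follow
# GT2-era conventions and should be verified locally.
import subprocess

subprocess.run(
    ["ldapsearch", "-x",
     "-h", "giis.example.edu",           # hypothetical index server
     "-p", "2135",
     "-b", "mds-vo-name=local, o=grid",  # default MDS suffix
     "(objectclass=*)"],                 # dump all published entries
    check=True)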
Globus Applications and Deployments
Application projects include
– GriPhyN, PPDG, NEES, EU DataGrid, ESG, Fusion Collaboratory, etc., etc.
Infrastructure deployments include
– DISCOM, NASA IPG, NSF TeraGrid, DOE Science Grid, EU DataGrid, etc., etc.
– UK Grid Center, U.S. GRIDS Center
Technology projects include
– Data Grids, Access Grid, Portals, CORBA, MPICH-G2, Condor-G, GrADS, etc., etc.
Globus Futures
Numerous large projects are pushing hard on production deployment & application
– Much will be learned in next 2 years!
Active R&D program, focused for example on
– Security & policy for resource sharing
– Flexible, high-perf., scalable data sharing
– Integration with Web Services etc.
– Programming models and tools
Community code development producing a true Open Grid Architecture
Important Grid Applications
Data-intensive
Distributed computing (metacomputing)
Collaborative
Remote access to, and computer enhancement of, experimental facilities
Data Intensive Science: 2000-2015
Scientific discovery increasingly driven by IT
– Computationally intensive analyses
– Massive data collections
– Data distributed across networks of varying capability
– Geographically distributed collaboration
Dominant factor: data growth (1 petabyte = 1000 TB)
– 2000: ~0.5 petabyte
– 2005: ~10 petabytes
– 2010: ~100 petabytes
– 2015: ~1000 petabytes?
How to collect, manage, access and interpret this quantity of data?
Drives demand for "Data Grids" to handle additional dimension of data access & movement
Data Grid Projects
Particle Physics Data Grid (US, DOE)
– Data Grid applications for HENP experiments
GriPhyN (US, NSF)
– Petascale Virtual-Data Grids
iVDGL (US, NSF)
– Global Grid lab
TeraGrid (US, NSF)
– Distributed supercomputing resources (13 TFlops)
European Data Grid (EU, EC)
– Data Grid technologies, EU deployment
CrossGrid (EU, EC)
– Data Grid technologies, EU emphasis
DataTAG (EU, EC)
– Transatlantic network, Grid applications
Japanese Grid Projects (APGrid) (Japan)
– Grid deployment throughout Japan
Common threads: collaborations of application scientists & computer scientists; infrastructure development & deployment; Globus based
Biomedical Informatics Research Network (BIRN)
Evolving reference set of brains provides essential data for developing therapies for neurological disorders (multiple sclerosis, Alzheimer's, etc.).
Today
– One lab, small patient base
– 4 TB collection
Tomorrow
– 10s of collaborating labs
– Larger population sample
– 400 TB data collection: more brains, higher resolution
– Multiple scale data integration and analysis
Digital Radiology (Hollebeek, U. Pennsylvania)
Hospital digital data
– Very large data sources: great clinical value to digital storage and manipulation, and significant cost savings
– 7 terabytes per hospital per year
– Dominated by digital images
Why mammography?
– Clinical need for film recall & computer analysis
– Large volume (4,000 GB/year; 57% of total)
– Storage and records standards exist
– Great clinical value
(Image types: mammograms, X-rays, MRI, CAT scans, endoscopies, ...)
Earth System Grid
(ANL, LBNL, LLNL, NCAR, ISI, ORNL)
Enable a distributed community [of thousands] to perform computationally intensive analyses on large climate datasets
Via
– Creation of Data Grid supporting secure, high-performance remote access
– "Smart data servers" supporting reduction and analyses
– Integration with environmental data analysis systems, protocols, and thin clients
www.earthsystemgrid.org (soon)
Earth System Grid Architecture
(Diagram: an application presents an attribute specification to a metadata catalog, obtaining a logical collection and logical file name; a replica catalog maps these to multiple locations; replica selection consults NWS performance information & predictions to choose among replica locations 1-3 (tape library, disk cache, disk array), and GridFTP commands move the data from the selected replica. MDS supports discovery throughout.)
Data Grid Toolkit Architecture
Data Movement Service: optimized endpoint management (bulk parallel transfer, rate-limited transfer, disk/network scheduling)
Data Transfer Service: end-to-end file transfer (link optimization, performance guarantees, admission control)
Collective Data Management Service: collective file movement (collection management, priority, fault recovery, replication, resource selection)
A Universal Access/Transport Protocol
Suite of communication libraries and related tools that support
– GSI security
– Third-party transfers
– Parameter set/negotiate
– Partial file access
– Reliability/restart
– Logging/audit trail
– Integrated instrumentation
– Parallel transfers
– Striping (cf. DPSS)
– Policy-based access control
– Server-side computation [later]
All based on a standard, widely deployed protocol
And the Universal Protocol is … GridFTP
Why FTP?
– Ubiquity enables interoperation with many commodity tools
– Already supports many desired features, easily extended to support others
We use the term GridFTP to refer to
– Transfer protocol which meets requirements
– Family of tools which implement the protocol
Note GridFTP > FTP
Note that despite the name, GridFTP is not restricted to file transfer!
GridFTP: Basic Approach
FTP is defined by several IETF RFCs
Start with most commonly used subset
– Standard FTP: get/put etc., 3rd-party transfer
Implement RFCed but often unused features
– GSS binding, extended directory listing, simple restart
Extend in various ways, while preserving interoperability with existing servers
– Striped/parallel data channels, partial file, automatic & manual TCP buffer setting, progress and extended restart
The GridFTP Family of Tools
Patches to existing FTP code
– GSI-enabled versions of existing FTP client and server, for high-quality production code
Custom-developed libraries
– Implement full GridFTP protocol, targeting custom use, high-performance
Custom-developed tools
– E.g., high-performance striped FTP server
High-Performance Data Transfer
(Diagram: a control interface with scheduling modules and enquiry (GRIP) at each endpoint; data channels supporting a bulk transfer protocol and a TCP transfer protocol behind rate-limiting interfaces; resource management (disk, NIC) via GRAM; GridFTP between endpoints over TCP, BTP, ...)
GridFTP for Efficient WAN Transfer
Transfer Tb+ datasets
– Highly secure authentication
– Parallel transfer for speed
LLNL -> Chicago transfer (slow site network interfaces)
Parallel transfer: fully utilizes bandwidth of the network interface on single nodes.
Striped transfer: fully utilizes bandwidth of a Gb+ WAN using multiple nodes, with a parallel filesystem at each end; driven by GridFTP (globus-url-copy).
(Chart: bandwidth (Mb/s, 0-80) vs. number of parallel streams (0-35).)
FUTURE: Integrate striped GridFTP with parallel storage systems, e.g., HPSS
GridFTP for User-Friendly Visualization Setup
High-res visualization is too large for display on a single system
– Needs to be tiled, 24-bit -> 16-bit depth
– Needs to be staged to display units
GridFTP/ActiveMural integration application performs tiling, data reduction, and staging in a single operation
– PVFS/MPI-IO on server
– MPI process group transforms data as needed before transfer
– Performance is currently bounded by 100 Mb/s NICs on display nodes
Distributed Computing + Visualization
Job submission: simulation code submitted to a remote center for execution on 1000s of nodes
The remote center generates Tb+ datasets from the simulation code
LAN/WAN transfer: FLASH data transferred to ANL for visualization; GridFTP parallelism utilizes high bandwidth (capable of utilizing >Gb/s WAN links)
A user-friendly striped GridFTP application tiles the frames and stages tiles onto display nodes (Chiba City); visualization code constructs and stores high-resolution visualization frames for display on many devices
ActiveMural display: displays very high resolution large-screen dataset animations
FUTURE (1-5 yrs): 10s Gb/s LANs, WANs; end-to-end QoS; automated replica management; server-side data reduction & analysis; interactive portals
SC'2001 Experiment: Simulation of HEP Tier 1 Site
Tiered (hierarchical) site structure
– All data generated at lower tiers must be forwarded to the higher tiers
– Tier 1 sites may have many sites transmitting to them simultaneously and will need to sink a substantial amount of bandwidth
– We demonstrated the ability of GridFTP to support this at SC 2001 in the Bandwidth Challenge
– 16 sites, with 27 hosts, pushed a peak of 2.8 Gb/s to the show floor in Denver, with a sustained bandwidth of nearly 2 Gb/s
The Replica Management Problem
Maintain a mapping between logical names for files and collections and one or more physical locations
Important for many applications
Example: CERN high-level trigger data
– Multiple petabytes of data per year
– Copy of everything at CERN (Tier 0)
– Subsets at national centers (Tier 1)
– Smaller regional centers (Tier 2)
– Individual researchers will have copies
Our Approach to Replica Management
Identify replica cataloging and reliable replication as two fundamental services
– Layer on other Grid services: GSI, transport, information service
– Use as a building block for other tools
Advantage
– These services can be used in a wide variety of situations
Replica Catalog Structure: A Climate Modeling Example
(Replica catalog structure:
Replica Catalog
– Logical collection "C02 measurements 1998"
> Logical files: Jan 1998 (size: 1468762), Feb 1998, ...
> Location jupiter.isi.edu: Filenames Mar 1998, Jun 1998, Oct 1998; Protocol: gsiftp; UrlConstructor: gsiftp://jupiter.isi.edu/nfs/v6/climate
> Location sprite.llnl.gov: Filenames Jan 1998 ... Dec 1998; Protocol: ftp; UrlConstructor: ftp://sprite.llnl.gov/pub/pcmdi
– Logical collection "C02 measurements 1999")
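A toy sketch of the mapping this catalog encodes: resolving a logical file name to concrete URLs by combining each location's UrlConstructor with its registered filenames. The data values come from the example above; the API and the URL-joining rule are illustrative, not the Globus replica catalog API.

# Toy model of the replica catalog example above: logical collection ->
# locations -> physical URLs. Illustrative only; not the Globus API.
catalog = {
    "C02 measurements 1998": [
        {"url_constructor": "gsiftp://jupiter.isi.edu/nfs/v6/climate/",
         "filenames": ["Mar 1998", "Jun 1998", "Oct 1998"]},
        {"url_constructor": "ftp://sprite.llnl.gov/pub/pcmdi/",
         "filenames": [f"{m} 1998" for m in
                       ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
                        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]]},
    ],
}

def resolve(collection, logical_file):
    """Return all physical URLs holding a replica of logical_file.

    Joining UrlConstructor + filename (spaces stripped) is an assumed
    convention for this sketch.
    """
    return [loc["url_constructor"] + logical_file.replace(" ", "")
            for loc in catalog.get(collection, [])
            if logical_file in loc["filenames"]]

print(resolve("C02 measurements 1998", "Jun 1998"))
# -> URLs at both jupiter.isi.edu and sprite.llnl.gov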
Giggle: A Scalable Replica Location Service
Local replica catalogs maintain definitive information about replicas
Publish (perhaps approximate) information using soft state techniques
Variety of indexing strategies possible
(Charts: LRC create/add/delete/query and RLI query times (ms) vs. number of LFNs (thousands, 1-1000); soft-state update time (sec) vs. number of LFNs for 1 vs. 2 LRCs.)
GriPhyN = App. Science + CS + Grids
GriPhyN = Grid Physics Network
– US-CMS High Energy Physics
– US-ATLAS High Energy Physics
– LIGO/LSC gravity wave research
– SDSS Sloan Digital Sky Survey
– Strong partnership with computer scientists
Design and implement production-scale grids
– Develop common infrastructure, tools and services
– Integration into the 4 experiments
– Application to other sciences via "Virtual Data Toolkit"
Multi-year project
– R&D for grid architecture (funded at $11.9M + $1.6M)
– Integrate Grid infrastructure into experiments through VDT
GriPhyN Institutions
– U Florida
– U Chicago
– Boston U
– Caltech
– U Wisconsin, Madison
– USC/ISI
– Harvard
– Indiana
– Johns Hopkins
– Northwestern
– Stanford
– U Illinois at Chicago
– U Penn
– U Texas, Brownsville
– U Wisconsin, Milwaukee
– UC Berkeley
– UC San Diego
– San Diego Supercomputer Center
– Lawrence Berkeley Lab
– Argonne
– Fermilab
– Brookhaven
GriPhyN: PetaScale Virtual Data Grids
(Diagram: production teams, individual investigators, and workgroups use interactive user tools backed by virtual data tools, request planning & scheduling tools, and request execution & management tools; these rest on resource management services, security and policy services, and other Grid services, over distributed resources (code, storage, CPUs, networks) and raw data sources. Scale: ~1 petaop/s, ~100 petabytes.)
GriPhyN/PPDG Data Grid Architecture
(Diagram: an application hands a DAG to a planner, which consults catalog services (MCAT; GriPhyN catalogs), info services (MDS), policy/security (GSI, CAS), monitoring, and replica management (GDMP); the planner hands a DAG to an executor (DAGMan, Kangaroo), which drives compute resources via GRAM and storage resources via GridFTP, GRAM, and SRM, using a reliable transfer service, all over Globus and MDS. The initial solution is operational.)
Ewa Deelman, Mike Wilde
GriPhyN Research Agenda
Virtual Data technologies
– Derived data, calculable via algorithm
– Instantiated 0, 1, or many times (e.g., caches)
– "Fetch value" vs. "execute algorithm"
– Potentially complex (versions, cost calculation, etc.)
– E.g., LIGO: "Get gravitational strain for 2 minutes around 200 gamma-ray bursts over last year"
For each requested data value, need to
– Locate item materialization, location, and algorithm
– Determine costs of fetching vs. calculating
– Plan data movements, computations to obtain results
– Execute the plan
Virtual Data in Action
Data request may
– Compute locally
– Compute remotely
– Access local data
– Access remote data
Scheduling based on
– Local policies
– Global policies
– Cost
(Diagram: fetch item from local facilities & caches, regional facilities & caches, or major facilities & archives.)
GriPhyN Research Agenda (cont.)
Execution management
– Co-allocation (CPU, storage, network transfers)
– Fault tolerance, error reporting
– Interaction, feedback to planning
Performance analysis (with PPDG)
– Instrumentation, measurement of all components
– Understand and optimize grid performance
Virtual Data Toolkit (VDT)
– VDT = virtual data services + virtual data tools
– One of the primary deliverables of R&D effort
– Technology transfer to other scientific domains
Programs as Community Resources: Data Derivation and Provenance
Most scientific data are not simple "measurements"; essentially all are:
– Computationally corrected/reconstructed
– And/or produced by numerical simulation
And thus, as data and computers become ever larger and more expensive:
– Programs are significant community resources
– So are the executions of those programs
(Diagram: a Derivation is an execution-of a Transformation; Data is created-by a Derivation and is consumed-by/generated-by executions.)
"I've detected a calibration error in an instrument and want to know which derived data to recompute."
"I've come across some interesting data, but I need to understand the nature of the corrections applied when it was constructed before I can trust it for my purposes."
"I want to search an astronomical database for galaxies with certain characteristics. If a program that performs this analysis exists, I won't have to write one from scratch."
"I want to apply an astronomical analysis program to millions of objects. If the results already exist, I'll save weeks of computation."
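The first question above (which derived data must be recomputed after a calibration error?) reduces to a reachability query over the transformation/derivation/data graph; the sketch below models that, with entirely hypothetical record structures and data names.

# Hypothetical provenance graph: which derived data depend (transitively)
# on a given input, e.g. data from a miscalibrated instrument?
derivations = [
    # (inputs consumed, transformation, outputs generated)
    ({"raw_scan_001"}, "calibrate_v2", {"cal_scan_001"}),
    ({"cal_scan_001"}, "find_galaxies", {"galaxy_cat_001"}),
    ({"galaxy_cat_001", "ref_cat"}, "cross_match", {"matched_cat_001"}),
]

def downstream(tainted):
    """Transitively expand the set of data items derived from `tainted`."""
    affected = set(tainted)
    changed = True
    while changed:
        changed = False
        for inputs, _, outputs in derivations:
            if inputs & affected and not outputs <= affected:
                affected |= outputs
                changed = True
    return affected - set(tainted)

print(downstream({"raw_scan_001"}))
# -> {'cal_scan_001', 'galaxy_cat_001', 'matched_cat_001'}: recompute these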
The Chimera Virtual Data System (GriPhyN Project)
Virtual data catalog
– Transformations, derivations, data
Virtual data language
– Data definition + query
Applications include browsers and data analysis applications
(Diagram: virtual data applications use the Virtual Data Language (definition and query); a VDL interpreter (manipulating derivations and transformations) queries the virtual data catalog (which implements the Chimera virtual data schema) via SQL and emits task graphs (compute and data movement tasks, with dependencies) for Data Grid resources (distributed execution and data management).)
SDSS Galaxy Cluster Finding
Cluster-Finding Data Pipeline
(Pipeline diagram: tsObj inputs flow through numbered stages: field (1), brg (2), core (3), cluster (4), catalog (5); several tsObj/field/brg branches merge at the core stage.)
Virtual Data in CMS
Virtual Data Long Term Vision of CMS: CMS Note 2001/047, GRIPHYN 2001-16
Early GriPhyN Challenge Problem: CMS Data Reconstruction
(Scott Koranda, Miron Livny, others)
A master Condor job runs at a Caltech workstation:
2) Launch secondary job on Wisconsin (WI) pool; input files via Globus GASS
3) 100 Monte Carlo jobs on Wisconsin Condor pool
4) 100 data files transferred via GridFTP, ~1 GB each
5) Secondary reports complete to master
6) Master starts reconstruction jobs via Globus jobmanager on NCSA Linux cluster
7) GridFTP fetches data from NCSA UniTree (a GridFTP-enabled FTP server)
8) Processed Objectivity database stored to UniTree
9) Reconstruction job reports complete to master
Trace of a Condor-G Physics Run
(Chart: number of jobs (0-120) over time for pre/simulation/post jobs on UW Condor, ooHits at NCSA, and ooDigis at NCSA, with a visible delay due to a script error.)
Distributed Computing
Aggregate computing resources & codes
– Multidisciplinary simulation
– Metacomputing/distributed simulation
– High-throughput/parameter studies
Challenges
– Heterogeneous compute & network capabilities, latencies, dynamic behaviors
Example tools
– MPICH-G2: Grid-aware MPI
– Condor-G, Nimrod-G: parameter studies
Multidisciplinary Simulations: Aviation Safety
Whole-system simulations are produced by coupling all of the sub-system simulations, e.g.:
– Wing models: lift capabilities, drag capabilities, responsiveness
– Stabilizer models: deflection capabilities, responsiveness
– Engine models: thrust performance, reverse thrust performance, responsiveness, fuel consumption
– Landing gear models: braking performance, steering capabilities, traction, dampening capabilities
– Human models: crew capabilities (accuracy, perception, stamina, reaction times, SOPs)
– Airframe models
MPICH-G2: A Grid-Enabled MPI
A complete implementation of the Message Passing Interface (MPI) for heterogeneous, wide area environments
– Based on the Argonne MPICH implementation of MPI (Gropp and Lusk)
Requires services for authentication, resource allocation, executable staging, output, etc.
Programs run in wide area without change
– Modulo accommodating heterogeneous communication performance
See also: MetaMPI, PACX, STAMPI, MAGPIE
www.globus.org/mpi
Grid-based Computation: Challenges
Locate "suitable" computers
Authenticate with appropriate sites
Allocate resources on those computers
Initiate computation on those computers
Configure those computations
Select "appropriate" communication methods
Compute with "suitable" algorithms
Access data files, return output
Respond "appropriately" to resource changes
MPICH-G2 Use of Grid Services
(Diagram: mpirun generates a resource specification; globusrun authenticates and submits multiple jobs; MDS locates hosts; GASS stages executables; DUROC coordinates startup across GRAM instances, which initiate jobs via fork, LSF, LoadLeveler, etc.; globusrun monitors/controls the jobs and detects termination; processes communicate via vendor MPI and TCP/IP (globus-io).)
% grid-proxy-init
% mpirun -np 256 myprog
Cactus (Allen, Dramlitsch, Seidel, Shalf, Radke)
Modular, portable framework for parallel, multidimensional simulations
Construct codes by linking
– Small core (flesh): management services
– Selected modules (thorns): numerical methods, grids & domain decompositions, visualization and steering, etc.
Custom linking/configuration tools
Developed for astrophysics, but not astrophysics-specific
www.cactuscode.org
Cactus: An Application Framework for Dynamic Grid Computing
Cactus thorns for active management of application behavior and resource use
Heterogeneous resources, e.g.:
– Irregular decompositions
– Variable halo for managing message size
– Message compression (computation/communication tradeoff)
– Communication scheduling for computation/communication overlap
Dynamic resource behaviors/demands, e.g.:
– Performance monitoring, contract violation detection
– Dynamic resource discovery & migration
– User notification and steering
Cactus Example: Terascale Computing
(Diagram: SDSC IBM SP, 1024 processors (5x12x17 = 1020), linked by Gig-E at 100 MB/s to the NCSA Origin Array, 256+128+128 processors (5x12x(4+2+2) = 480), over an OC-12 line (but only 2.5 MB/s).)
Solved Einstein's equations for gravitational waves (real code)
– Tightly coupled, communications required through derivatives
– Must communicate 30 MB/step between machines
– Time step takes 1.6 sec
Used 10 ghost zones along the direction between machines: communicate every 10 steps
Compression/decompression on all data passed in this direction
Achieved 70-80% scaling, ~200 GF (only 14% scaling without these tricks)
Cactus Example (2): Migration in Action
(Chart: iterations/second (0-1.4) vs. clock time. The run starts at UIUC; a load is applied, causing 3 successive contract violations; resource discovery & migration follow, and the run resumes at UC. Migration time not to scale.)
IPG Milestone 3: Large Computing Node (completed 12/2000)
OVERFLOW on IPG, using Globus and MPICH-G2 for intra-problem, wide area communication across NASA Ames (Moffett Field, CA), Glenn (Cleveland, OH), and Langley (Hampton, VA), on systems including Lomax (a 512-node SGI Origin 2000), Whitcomb, and Sharp, computing a high-lift subsonic wind tunnel model.
Application POC: Mohammad J. Djomehri
Slide courtesy Bill Johnston, LBNL & NASA
High-Throughput Computing: Condor
High-throughput computing platform for mapping many tasks to idle computers
Three major components
– Scheduler manages pool(s) of [distributively owned or dedicated] computers
– DAGMan manages user task pools
– Matchmaker schedules tasks to computers
Parameter studies, data analysis
Condor-G extensions support wide area execution in Grid environment
www.cs.wisc.edu/condor
Defining a DAG
A DAG is defined by a .dag file, listing each of its nodes and their dependencies:

# diamond.dag
Job A a.sub
Job B b.sub
Job C c.sub
Job D d.sub
Parent A Child B C
Parent B C Child D

Each node runs the Condor job specified by its accompanying Condor submit file.
(Diagram: diamond DAG with Job A at the top, Jobs B and C in the middle, Job D at the bottom.)
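Handing a DAG like this to DAGMan is a one-line operation with Condor's standard submit tool (the file name matches the example above):
% condor_submit_dag diamond.dag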
High-Throughput Computing: Mathematicians Solve NUG30
Looking for the solution to the NUG30 quadratic assignment problem
An informal collaboration of mathematicians and computer scientists
Condor-G delivered 3.46E8 CPU seconds in 7 days (peak 1009 processors) in the U.S. and Italy (8 sites)
Solution: 14,5,28,24,1,3,16,15,10,9,21,2,4,29,25,22,13,26,17,30,6,20,19,8,18,7,27,12,11,23
MetaNEOS: Argonne, Iowa, Northwestern, Wisconsin
Grid Application Development Software (GrADS) Project
hipersoft.rice.edu/grads
Access Grid
High-end group work and collaboration technology
Grid services being used for discovery, configuration, authentication
O(50) systems deployed worldwide
Basis for the SC Global event at SC'2001 in November 2001
– www.scglobal.org
(Node components: ambient mic (tabletop), presenter mic, presenter camera, audience camera.)
www.accessgrid.org
Grid-Enabled Research Facilities: Leverage Investments
Research instruments, satellites, particle accelerators, MRI machines, etc., cost a great deal
Data from those devices can be accessed and analyzed by many more scientists
– Not just the team that gathered the data
More productive use of instruments
– Calibration, data sampling during a run, via on-demand real-time processing
Telemicroscopy & Grid-Based Computing
(Diagram: imaging instruments connected over the network to computational resources, large-scale databases, data acquisition, processing & analysis, and advanced visualization.)
APAN Trans-Pacific Telemicroscopy Collaboration: Osaka-U, UCSD, ISI
(Diagram: the UHVEM at Osaka, Japan (CRL/MPT) linked to NCMIR at San Diego (UCSD, SDSC) via Tokyo XP, TransPAC, STAR TAP (Chicago), and vBNS, using Globus.)
(Slide courtesy Mark Ellisman, UCSD)
Network for Earthquake Engineering Simulation
NEESgrid: US national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, & each other
On-demand access to experiments, data streams, computing, archives, collaboration
Argonne, Michigan, NCSA, UIUC, USC
www.neesgrid.org
"Experimental Facilities" Can Include Field Sites
Remotely controlled sensor grids for field studies, e.g., in seismology and biology
– Wireless/satellite communications
– Sensor net technology for low-cost communications
Nature and Role of Grid Infrastructure
Persistent Grid infrastructure is critical to the success of many eScience projects
– High-speed networks, certainly
– Remotely accessible compute & storage
– Persistent, standard services: PKI, directories, reservation, ...
– Operational & support procedures
Many projects creating such infrastructures
– Production operation is the goal, but much to learn about how to create & operate
A National Grid Infrastructure
(Diagram: several regional Grids, each aggregating many sites, interconnected into a national infrastructure.)
Example Grid Infrastructure Projects
I-WAY (1995): 17 U.S. sites for one week
GUSTO (1998): 80 sites worldwide, experimental
NASA Information Power Grid (since 1999)
– Production Grid linking NASA laboratories
INFN Grid, EU DataGrid, iVDGL, ... (2001+)
– Grids for data-intensive science
TeraGrid, DOE Science Grid (2002+)
– Production Grids link supercomputer centers
U.S. GRIDS Center
– Software packaging, deployment, support
The 13.6 TF TeraGrid: Computing at 40 Gb/s
(Diagram: four sites on a 40 Gb/s backbone, each with external network connections and site resources: NCSA/PACI (8 TF, 240 TB), SDSC (4.1 TF, 225 TB), Caltech, and Argonne; HPSS archives at three sites, UniTree at one.)
NCSA, SDSC, Caltech, Argonne; www.teragrid.org
TeraGrid (Details)
(Detailed diagram. Chicago & LA DTF core switch/routers: Cisco 65xx Catalyst switches (256 Gb/s crossbar) and Cisco 6509 Catalyst switch/routers, with Juniper M160/M40 border routers connecting to ESnet, HSCC, MREN/Abilene, Starlight, vBNS, NTON, and CalREN over OC-3/OC-12/OC-48 links and 10 GbE. Site resources: NCSA, 500 nodes (8 TF, 4 TB memory, 240 TB disk; 1024p IA-32, 320p IA-64, 1500p Origin, UniTree); SDSC, 256 nodes (4.1 TF, 2 TB memory, 225 TB disk; 1176p IBM SP Blue Horizon, Sun E10K, Sun Starcat, HPSS); Caltech, 32 nodes (0.5 TF, 0.4 TB memory, 86 TB disk; 256p HP X-Class, 128p HP V2500, 92p IA-32, HPSS); Argonne, 64 nodes (1 TF, 0.25 TB memory, 25 TB disk; 574p IA-32 Chiba City, 128p Origin, HR display & VR facilities). Clusters of quad-processor McKinley servers (64p or 128p @ 4 GF, 8-12 GB memory/server), Myrinet Clos spine interconnects, and GbE, Myrinet, and FibreChannel links to storage.)
Targeted StarLight Optical Network Connections
(Map: the StarLight hub at Northwestern University (Chicago), with a Chicago cross-connect to UIC, Illinois Institute of Technology, the University of Chicago, ANL, and NCSA/UIUC via I-WIRE; the DTF 40 Gb network; domestic links to Seattle, Portland, San Francisco, Los Angeles, San Diego (SDSC), NYC, Atlanta, PSC, IU, U Wisconsin, Indianapolis (Abilene NOC), and the St. Louis GigaPoP; NTON on the west coast; international links to Vancouver, CA*net4, SURFnet, CERN, the Asia-Pacific, and AMPATH.)
www.startap.net
CA*net 4 Architecture
(Map: CA*net 4 nodes at Victoria, Vancouver, Calgary, Edmonton, Saskatoon, Regina, Winnipeg, Thunder Bay, Windsor, Toronto, Ottawa, Montreal, Quebec, Fredericton, Charlottetown, Halifax, and St. John's, with U.S. connections at Seattle, Chicago, New York, and Boston; CANARIE GigaPOP, ORAN DWDM, and carrier DWDM links; some shown as possible future CA*net 4 nodes.)
APAN Network Topology (2001.9.3)
(Map: current and planned (2001) links among Japan, Korea, China, Hong Kong, Philippines, Vietnam, Thailand, Malaysia, Singapore, Indonesia, Sri Lanka, and Australia, plus links to Europe and STAR TAP (USA); 622 Mbps x 2 trans-Pacific.)
iVDGL: A Global Grid Laboratory
International Virtual-Data Grid Laboratory
– A global Grid laboratory (US, Europe, Asia, South America, ...)
– A place to conduct Data Grid tests "at scale"
– A mechanism to create common Grid infrastructure
– A laboratory for other disciplines to perform Data Grid tests
– A focus of outreach efforts to small institutions
U.S. part funded by NSF (2001-2006)
– $13.7M (NSF) + $2M (matching)
"We propose to create, operate and evaluate, over a sustained period of time, an international research laboratory for data-intensive science." (From NSF proposal, 2001)
Initial US-iVDGL Data Grid
(Map: Tier1 at Fermilab; proto-Tier2 and Tier3 university sites including UCSD, Caltech, Florida, Wisconsin, Indiana, BU, BNL, SKC, Brownsville, Hampton, PSU, and JHU; other sites to be added in 2002.)
iVDGL: International Virtual Data Grid Laboratory
U.S. PIs: Avery, Foster, Gardner, Newman, Szalay; www.ivdgl.org
(Map legend: Tier0/1 facilities, Tier2 facilities, Tier3 facilities; 10 Gbps, 2.5 Gbps, 622 Mbps, and other links.)
US iVDGL Interoperability
US-iVDGL-1 milestone (August 2002): interoperation across the ATLAS, CMS, LIGO, and SDSS/NVO testbeds, coordinated by the iGOC.
Transatlantic Interoperability
iVDGL-2 milestone (November 2002): transatlantic interoperation spanning the ATLAS, CMS, LIGO, and SDSS/NVO testbeds and CS research sites (ANL, BNL, BU, IU, UM, OU, UTA, HU, LBL, CIT, UCSD, UF, FNAL, JHU, UCB, UC, NU, UW, ISI, UTB, PSU, UWM), coordinated by the iGOC with outreach, together with DataTAG partners CERN, INFN, UK PPARC, and U of A.
Another Example: INFN Grid in Italy
20 sites, ~200 persons, ~90 FTEs, ~20 IT
Preliminary budget for 3 years: 9 M euros
Activities organized around
– S/w development with DataGrid, DataTAG, ...
– Testbeds (financed by INFN) for DataGrid, DataTAG, US-EU InterGrid
– Experiments applications
– Tier1..Tiern prototype infrastructure
Large scale testbeds provided by LHC experiments, Virgo, ...
U.S. GRIDS Center
NSF Middleware Infrastructure Program
GRIDS = Grid Research, Integration, Deployment, & Support
NSF-funded center to provide
– State-of-the-art middleware infrastructure to support national-scale collaborative science and engineering
– Integration platform for experimental middleware technologies
ISI, NCSA, SDSC, UC, UW
NMI software release one: May 2002
www.grids-center.org
Globus Toolkit: Evaluation (+)
Good technical solutions for key problems, e.g.
– Authentication and authorization
– Resource discovery and monitoring
– Reliable remote service invocation
– High-performance remote data access
This & good engineering is enabling progress
– Good quality reference implementation, multi-language support, interfaces to many systems, large user base, industrial support
– Growing community code base built on tools
Globus Toolkit: Evaluation (-)
Protocol deficiencies, e.g.
– Heterogeneous basis: HTTP, LDAP, FTP
– No standard means of invocation, notification, error propagation, authorization, termination, ...
Significant missing functionality, e.g.
– Databases, sensors, instruments, workflow, ...
– Virtualization of end systems (hosting environments)
Little work on total system properties, e.g.
– Dependability, end-to-end QoS, ...
– Reasoning about system properties
Globus Toolkit Structure
(Diagram, as before: GRAM, GridFTP, and MDS services built on GSI, fronting compute resources, data resources, and other services or applications via job managers, with service naming, reliable invocation, soft-state management, and notification.)
Lots of good mechanisms, but (with the exception of GSI) not that easily incorporated into other systems
Open Grid Services Architecture
Service orientation to virtualize resources
Define fundamental Grid service behaviors
– Core set required, others optional
A unifying framework for interoperability & establishment of total system properties
Integration with Web services and hosting environment technologies
– Leverage tremendous commercial base
– Standard IDL accelerates community code
Delivery via open source Globus Toolkit 3.0
– Leverage GT experience, code, mindshare
"Web Services"
Increasingly popular standards-based framework for accessing network applications
– W3C standardization; Microsoft, IBM, Sun, others
WSDL: Web Services Description Language
– Interface Definition Language for Web services
SOAP: Simple Object Access Protocol
– XML-based RPC protocol; common WSDL target
WS-Inspection
– Conventions for locating service descriptions
UDDI: Universal Description, Discovery, & Integration
– Directory for Web services
Web Services Example: Database Service
WSDL definition for a "DBaccess" portType defines operations and bindings, e.g.:
– Query(QueryLanguage, Query, Result)
– SOAP protocol
Client C, Java, Python, etc., APIs can then be generated (a hypothetical client sketch follows)
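To give a flavor of what a generated client might look like, here is a hypothetical hand-written Python stub for the Query operation; the endpoint URL, envelope contents, and argument shapes are illustrative, not part of any specified DBaccess interface.

# Hypothetical stub for the DBaccess portType's Query operation.
# A real stub would be generated from the WSDL; names and the endpoint
# URL are illustrative only.
import urllib.request

SOAP_BODY = """<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <Query>
      <QueryLanguage>SQL</QueryLanguage>
      <Query>SELECT name FROM stars WHERE redshift &gt; 2</Query>
    </Query>
  </soap:Body>
</soap:Envelope>"""

def db_query(endpoint="http://dbservice.example.org/DBaccess"):
    req = urllib.request.Request(
        endpoint, data=SOAP_BODY.encode(),
        headers={"Content-Type": "text/xml", "SOAPAction": "Query"})
    with urllib.request.urlopen(req) as resp:
        return resp.read()  # SOAP envelope containing the Result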
Transient Service Instances
"Web services" address discovery & invocation of persistent services
– Interface to persistent state of entire enterprise
In Grids, must also support transient service instances, created/destroyed dynamically
– Interfaces to the states of distributed activities
– E.g. workflow, video conferencing, distributed data analysis
Significant implications for how services are managed, named, discovered, and used
– In fact, much of our work is concerned with the management of service instances
The Grid Service = Interfaces + Service Data
(Diagram: an implementation in a hosting environment/runtime ("C", J2EE, .NET, ...) maintains service data elements and exposes the GridService interface plus other interfaces; behaviors include service data access, explicit destruction, soft-state lifetime, notification, authorization, service creation, service registry, manageability, concurrency, reliable invocation, and authentication.)
Open Grid Services Architecture: Fundamental Structure
1) WSDL conventions and extensions for describing and structuring services
– Useful independent of "Grid" computing
2) Standard WSDL interfaces & behaviors for core service activities
– portTypes and operations => protocols
WSDL Conventions & Extensions
portType (standard WSDL)
– Define an interface: a set of related operations
serviceType (extensibility element)
– List of port types: enables aggregation
serviceImplementation (extensibility element)
– Represents actual code
service (standard WSDL)
– instanceOf extension: map description -> instance
compatibilityAssertion (extensibility element)
– portType, serviceType, serviceImplementation
Structure of a Grid Service
(Diagram: in the service description, portTypes (standard WSDL) are aggregated into serviceTypes, realized by serviceImplementations, with compatibilityAssertions relating portTypes, serviceTypes, and serviceImplementations; in service instantiation, each service (standard WSDL) is an instanceOf a serviceImplementation.)
Standard Interfaces & Behaviors: Four Interrelated Concepts
Naming and bindings
– Every service instance has a unique name, from which one can discover supported bindings
Information model
– Service data associated with Grid service instances, operations for accessing this information
Lifecycle
– Service instances created by factories
– Destroyed explicitly or via soft state
Notification
– Interfaces for registering interest and delivering notifications
OGSA Interfaces and Operations Defined to Date
GridService (required)
– FindServiceData
– Destroy
– SetTerminationTime
NotificationSource
– SubscribeToNotificationTopic
– UnsubscribeToNotificationTopic
NotificationSink
– DeliverNotification
Factory
– CreateService
PrimaryKey
– FindByPrimaryKey
– DestroyByPrimaryKey
Registry
– RegisterService
– UnregisterService
HandleMap
– FindByHandle
Authentication, reliability are binding properties
Manageability, concurrency, etc., to be defined
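As an illustration only, the portTypes above can be rendered as Java interfaces. Only the interface and operation names come from the slide; every signature below is an assumption for the sketch (the real operations are defined in WSDL, with protocol bindings negotiated separately).

```java
// Illustrative Java rendering of the OGSA portTypes listed above.
interface GridService {                              // required on every Grid service
    String findServiceData(String queryExpression);  // query service data
    void destroy();                                  // explicit destruction
    void setTerminationTime(long absoluteMillis);    // soft-state keepalive
}
interface NotificationSink {
    void deliverNotification(String message);        // asynchronous delivery
}
interface NotificationSource {
    void subscribeToNotificationTopic(String topic, NotificationSink sink);
    void unsubscribeToNotificationTopic(String topic, NotificationSink sink);
}
interface Factory {
    String createService(String creationParams);     // returns a GSH (a URL)
}
interface PrimaryKey {
    String findByPrimaryKey(String key);
    void destroyByPrimaryKey(String key);
}
interface Registry {
    void registerService(String gsh);
    void unregisterService(String gsh);
}
interface HandleMap {
    Object findByHandle(String gsh);  // resolve a handle to a usable reference
}
```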
Service Data
A Grid service instance maintains a set of service data elements
– XML fragments encapsulated in standard <name, type, TTL-info> containers
– Includes basic introspection information, interface-specific data, and application data
FindServiceData operation (GridService interface) queries this information
– Extensible query language support
See also notification interfaces
– Allow notification of service existence and changes in service data
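A minimal sketch of the <name, type, TTL-info> container and a FindServiceData-style lookup. A trivial name match stands in for the extensible query language; all field and method names are illustrative assumptions.

```java
import java.util.List;

// ServiceDataElement mirrors the <name, type, TTL-info> container above;
// field names are illustrative, not a Globus Toolkit type.
record ServiceDataElement(String name, String type, long ttlMillis, String xml) {}

public class ServiceDataExample {
    // Stand-in for FindServiceData: return fragments whose name matches.
    // A real service supports an extensible query language instead.
    static String findServiceData(List<ServiceDataElement> elements, String name) {
        StringBuilder out = new StringBuilder();
        for (ServiceDataElement e : elements)
            if (e.name().equals(name)) out.append(e.xml());
        return out.toString();
    }

    public static void main(String[] args) {
        List<ServiceDataElement> sde = List.of(
            new ServiceDataElement("interfaces", "xsd:string", 60_000,
                    "<interfaces>GridService DBaccess</interfaces>"),
            new ServiceDataElement("load", "xsd:int", 5_000, "<load>3</load>"));
        System.out.println(findServiceData(sde, "load"));  // prints <load>3</load>
    }
}
```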
Grid Service Example: Database Service
A DBaccess Grid service will support at least two portTypes
– GridService
– DBaccess
Each has service data
– GridService: basic introspection information, lifetime, …
– DBaccess: database type, query languages supported, current load, …
[Diagram: one service exposing the GridService portType (service data: name, lifetime, etc.) and the DBaccess portType (service data: DB info)]
Lifetime Management
GS instances created by factory or manually; destroyed explicitly or via soft state
– Negotiation of initial lifetime with a factory (= a service supporting the Factory interface)
GridService interface supports
– Destroy operation for explicit destruction
– SetTerminationTime operation for keepalive
Soft state lifetime management avoids
– Explicit client teardown of complex state
– Resource “leaks” in hosting environments
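A minimal client-side keepalive sketch under these semantics. GridServiceStub and the renewal policy (renew at half the lease length) are assumptions for illustration, not actual Globus Toolkit code.

```java
// GridServiceStub stands in for a client stub exposing the two lifetime
// operations above; names are illustrative.
interface GridServiceStub {
    void setTerminationTime(long absoluteMillis);
    void destroy();
}

public class KeepaliveExample {
    // Renew the lease at half its length. If this client dies, renewals stop
    // and the hosting environment reclaims the instance: no explicit teardown
    // of complex state, and no resource leak.
    static void keepAlive(GridServiceStub instance, long leaseMillis)
            throws InterruptedException {
        for (int i = 0; i < 10; i++) {  // bounded loop, for the sketch
            instance.setTerminationTime(System.currentTimeMillis() + leaseMillis);
            Thread.sleep(leaseMillis / 2);
        }
        instance.destroy();  // finished early: destroy explicitly
    }

    public static void main(String[] args) throws InterruptedException {
        keepAlive(new GridServiceStub() {
            public void setTerminationTime(long t) { System.out.println("renewed to " + t); }
            public void destroy() { System.out.println("destroyed"); }
        }, 100);
    }
}
```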
Factory
Factory interface’s CreateService operation creates a new Grid service instance
– Reliable creation (once-and-only-once)
CreateService operation can be extended to accept service-specific creation parameters
Returns a Grid Service Handle (GSH)
– A globally unique URL
– Uniquely identifies the instance for all time
– Based on the name of a home handleMap service
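A sketch of the factory pattern described here, assuming a hypothetical handle format minted under a home handleMap's name. Nothing below is actual Globus Toolkit code; the handleMap URL and parameter string are invented for illustration.

```java
import java.util.UUID;

// Reliable creation returning a GSH; factory internals are elided.
public class DatabaseFactoryExample {
    // GSHs are minted under the name of a home handleMap service.
    static final String HOME_HANDLE_MAP = "http://handles.example.org/handleMap";

    static String createService(String creationParams) {
        // A globally unique URL identifying the new instance for all time.
        String gsh = HOME_HANDLE_MAP + "/" + UUID.randomUUID();
        // ... allocate the instance, negotiate its initial lifetime,
        // register the GSH with the handleMap (elided) ...
        return gsh;
    }

    public static void main(String[] args) {
        System.out.println(createService("database=ecoli-metabolism"));
    }
}
```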
Transient Database Services
[Diagram: a client asks a DBaccessFactory “What services can you create?”, asks a Registry “What database services exist?”, and tells the factory “Create a database service”; the factory, the registry, and each DBaccess instance all expose the GridService portType alongside their own service data (factory info, registry info, DB info, instance names, lifetimes)]
Example: Data Mining for Bioinformatics
[Diagram: a user application, a community registry, a compute service provider hosting a MiningFactory, and a storage service provider hosting a DatabaseFactory and database services fronting BioDB 1 … BioDB n; the steps below animate this picture]
1) The user asks: “I want to create a personal database containing data on e.coli metabolism.”
2) The application asks the community registry: “Find me a data mining service, and somewhere to store data.”
3) The registry returns GSHs for the Mining and Database factories.
4) The application invokes the factories: “Create a data mining service with initial lifetime 10” and “Create a database with initial lifetime 1000.”
5) The factories create Miner and Database service instances.
6) The Miner queries the BioDB database services.
7) While mining proceeds, the application sends keepalives to both new instances.
8) The Miner stores results in the Database, with keepalives continuing.
9) The application now keeps only the Database alive; the Miner’s lifetime expires and it is destroyed.
10) The Database persists for as long as the application keeps it alive.
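The whole exchange compresses into a short sketch with in-process stand-ins for the registry and factories. Every name and signature here is illustrative; only the lifetimes and the e.coli scenario come from the slides.

```java
import java.util.Map;

// In-process stand-ins for the walkthrough above; all names are illustrative.
public class DataMiningFlow {
    interface Factory { String createService(long initialLifetime, String params); }

    public static void main(String[] args) {
        // Steps 2-3: the community registry resolves service types to factory GSHs.
        Map<String, Factory> registry = Map.of(
                "mining",   (life, p) -> "gsh:miner-instance",
                "database", (life, p) -> "gsh:db-instance");

        // Steps 4-5: a short-lived miner and a longer-lived personal database.
        String miner = registry.get("mining").createService(10, "e.coli metabolism");
        String db = registry.get("database").createService(1000, "personal db");

        // Steps 6-10 (elided): the miner queries the BioDB services and writes
        // results to the database while the application sends keepalives; once
        // results are stored, only the database lease is renewed, and the miner
        // is reclaimed when its lifetime expires.
        System.out.println("miner=" + miner + ", db=" + db);
    }
}
```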
Notification Interfaces
NotificationSource for client subscription
– One or more notification generators
> Generate notification messages of a specific type
> Typed interest statements: e.g., filters, topics, …
> Support messaging services, 3rd-party filter services, …
– Soft state subscription to a generator
NotificationSink for asynchronous delivery of notification messages
A wide variety of uses are possible
– E.g. dynamic discovery/registry services, monitoring, application error notification, …
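A minimal in-process sketch of the source/sink pattern, with topic filtering and soft-state subscription expiry elided. The interfaces are illustrative renderings of the portTypes above, not an actual API.

```java
import java.util.ArrayList;
import java.util.List;

public class NotificationExample {
    interface Sink { void deliverNotification(String message); }

    static class Source {
        private final List<Sink> subscribers = new ArrayList<>();
        // Subscriptions are soft state; a real source would expire them
        // unless periodically renewed.
        void subscribeToNotificationTopic(String topic, Sink sink) {
            subscribers.add(sink);  // topic ignored in this toy
        }
        void publish(String message) {
            for (Sink s : subscribers) s.deliverNotification(message);  // async in reality
        }
    }

    public static void main(String[] args) {
        Source dbService = new Source();
        dbService.subscribeToNotificationTopic("membrane-proteins",
                msg -> System.out.println("got: " + msg));
        dbService.publish("new data about membrane proteins");
    }
}
```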
Notification Example
Notifications can be associated with any (authorized) service data elements
[Diagram, in four steps: (1) a subscriber’s NotificationSink attaches to a DBaccess service’s NotificationSource; (2) the subscriber requests “Notify me of new data about membrane proteins”; (3) the subscription is maintained by soft-state keepalives; (4) when matching data arrives, the source delivers a “New data” message to the sink]
Open Grid Services Architecture: Summary
Service orientation to virtualize resources
– Everything is a service
From Web services
– Standard interface definition mechanisms: multiple protocol bindings, local/remote transparency
From Grids
– Service semantics, reliability and security models
– Lifecycle management, discovery, other services
Multiple “hosting environments”
– C, J2EE, .NET, …
Recap: The Grid Service
[Diagram repeated from “The Grid Service = Interfaces + Service Data” above: an implementation in a hosting environment/runtime (“C”, J2EE, .NET, …) exposes the GridService and other interfaces, with service data elements attached]
OGSA and the Globus Toolkit
Technically, OGSA enables
– Refactoring of protocols (GRAM, MDS-2, etc.) while preserving all GT concepts/features!
– Integration with hosting environments: simplifying components, distribution, etc.
– Greatly expanded standard service set
Pragmatically, we are proceeding as follows
– Develop open source OGSA implementation
> Globus Toolkit 3.0; supports Globus Toolkit 2.0 APIs
– Partnerships for service development
– Also expect commercial value-adds
GT3: An Open Source OGSA-Compliant Globus Toolkit
GT3 Core
– Implements Grid service interfaces & behaviors
– Reference implementation of the evolving standard
– Java first, C soon, C#?
GT3 Base Services
– Evolution of current Globus Toolkit capabilities
– Backward compatible
Many other Grid services
[Diagram: GT3 Core beneath GT3 Base Services, with GT3 Data Services and other Grid services layered on top]
Hmm, Isn’t This Just Another Object Model?
Well, yes, in a sense
– Strong encapsulation
– We (can) profit greatly from the experiences of previous object-based systems
But
– Focus on encapsulation, not inheritance
– Does not require OO implementations
– Value lies in specific behaviors: lifetime, notification, authorization, …
– Document-centric, not type-centric
Grids and OGSA: Research Challenges
Grids pose profound problems, e.g.
– Management of virtual organizations
– Delivery of multiple qualities of service
– Autonomic management of infrastructure
– Software and system evolution
OGSA provides a foundation for tackling these problems in a rigorous fashion?
– Structured establishment/maintenance of global properties
– Reasoning about total system properties
Summary
OGSA represents a refactoring of current Globus Toolkit protocols and their integration with Web services technologies
Several desirable features
– Significant evolution of functionality
– Uniform IDL facilitates code sharing
– Allows for alignment of potentially divergent directions (e.g., info service, service registry, monitoring)
Evolution
This is not happening all at once
– We have an early prototype of Core (alpha release May?)
– Next we will work on Base, others
– Full release by end of 2002??
– Establishing partnerships for other services
Backward compatibility
– API level seems straightforward
– Protocol level: gateways?
– We need input on best strategies
For More Information
OGSA architecture and overview
– “The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration”, at www.globus.org/ogsa
Grid service specification
– At www.globus.org/ogsa
Open Grid Services Infrastructure WG, GGF
– www.gridforum.org/ogsi (?), soon
Globus Toolkit OGSA prototype
– www.globus.org/ogsa
Outline
The technology landscape
Grid computing
The Globus Toolkit
Applications and technologies
– Data-intensive; distributed computing; collaborative; remote access to facilities
Grid infrastructure
Open Grid Services Architecture
Global Grid Forum
Summary and conclusions
GGF Objectives
An open process for development of standards
– Grid “Recommendations” process modeled after the Internet Standards Process (IETF)
A forum for information exchange
– Experiences, patterns, structures
A regular gathering to encourage shared effort
– In code development: libraries, tools, …
– Via resource sharing: shared Grids
– In infrastructure: consensus standards
GGF Groups
Working Groups
– Tightly focused on development of a spec or set of related specs
> Protocol, API, etc.
– Finite set of objectives and schedule of milestones
Research Groups
– More exploratory than Working Groups
– Focused on understanding requirements, taxonomies, models, methods for solving a particular set of related problems
– May be open-ended, but with a definite set of objectives and milestones to drive progress
Groups are approved and evaluated by the GGF Steering Group (GFSG) based on written charters. Among the criteria for group formation:
• Is this work better done (or already being done) elsewhere, e.g. IETF, W3C?
• Are the leaders involved and/or in touch with relevant efforts elsewhere?
Current GGF Groups (Out-of-date List, Sorry…)
Working and research groups by area:
– Grid Information Services: Grid Object Specification; Grid Notification Framework; Metacomputing Directory Services; Relational Database Information Services
– Scheduling and Resource Management: Advanced Reservation; Scheduling Dictionary; Scheduler Attributes
– Security: Grid Security Infrastructure; Grid Certificate Policy
– Performance: Grid Performance Monitoring Architecture
– Architectures: JINI; NPI Architecture; Grid Protocol Architecture; Accounting Models
– Data: GridFTP; Data Replication
– Applications, Programming Models, and User Environments: Applications; Grid User Services; Grid Computing Env.; Adv Programming Models; Adv Collaboration Env
Proposed GGF Groups (Again, Out of Date…)
Proposed working and research groups by area:
– Scheduling and Resource Management: Scheduling Command Line API; Distributed Resource Mgmt Application API; Grid Resource Management Protocol; Scheduling Optimization
– Performance: Network Monitoring/Measurement; Sensor Management; Grid Event Service
– Architectures: Open Grid Services Architecture; Grid Economies
– Data: Archiving Command Line API; Persistent Archives; DataGrid Schema; Application Metadata; Network Storage
– Area TBD…: Open Source Software Licensing; Cluster Standardization; High-Performance Networks for Grids
Getting Involved
Participate in a GGF Meeting
– 3x/year; the last one had 500 people
– July 21-24, 2002 in Edinburgh (with HPDC)
– October 15-17, 2002 in Chicago
Join a working group or research group
– Electronic participation via mailing lists (see www.gridforum.org)
Grid Events
Global Grid Forum: working meeting
– Meets 3 times/year, alternating U.S.-Europe, with the July meeting as the major event
HPDC: major academic conference
– HPDC’11 in Scotland with GGF’5, July 2002
Other meetings with Grid content include
– SC’XY, CCGrid, Globus Retreat
www.gridforum.org, www.hpdc.org
Outline
The technology landscape
Grid computing
The Globus Toolkit
Applications and technologies
– Data-intensive; distributed computing; collaborative; remote access to facilities
Grid infrastructure
Open Grid Services Architecture
Global Grid Forum
Summary and conclusions
Summary
The Grid problem: resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations
– Real application communities emerging
– Significant infrastructure deployments
– Substantial open source/architecture technology base: the Globus Toolkit
– Pathway defined to industrial adoption, via open source + OGSA
– Rich set of intellectual challenges
Major Application Communities are Emerging
Intellectual buy-in, commitment
– Earthquake engineering: NEESgrid
– Exp. physics, etc.: GriPhyN, PPDG, EU Data Grid
– Simulation: Earth System Grid, Astrophysical Sim. Collaboratory
– Collaboration: Access Grid
Emerging, e.g.
– Bioinformatics Grids
– National Virtual Observatory
Major Infrastructure Deployments are Underway
For example:
– NSF “National Technology Grid”
– NASA “Information Power Grid”
– DOE ASCI DISCOM Grid
– DOE Science Grid
– EU DataGrid
– iVDGL
– NSF Distributed Terascale Facility (“TeraGrid”)
– DOD MOD Grid
A Rich Technology Base has been Constructed
6+ years of R&D have produced a substantial code base built on open architecture principles, esp. the Globus Toolkit, including
– Grid Security Infrastructure
– Resource directory and discovery services
– Secure remote resource access
– Data Grid protocols, services, and tools
Essentially all projects have adopted this as a common suite of protocols & services
Enabling a wide range of higher-level services
Pathway Defined to Industrial Adoption
Industry need
– eScience applications, service provider models, the need to integrate internal infrastructures, collaborative computing in general
Technical capability
– Maturing open source technology base
– Open Grid Services Architecture enables integration with industry standards
Result likely to be exponential industrial uptake
Rich Set of Intellectual Challenges
Transforming the Internet into a robust, usable computational platform
Delivering (multi-dimensional) qualities of service within large systems
Community dynamics and collaboration modalities
Program development methodologies and tools for Internet-scale applications
Etc., etc., etc.
New Programs
U.K. eScience program
EU 6th Framework
U.S. Committee on Cyberinfrastructure
Japanese Grid initiative
U.S. Cyberinfrastructure: Draft Recommendations
A new INITIATIVE to revolutionize science and engineering research at NSF and worldwide, capitalizing on new computing and communications opportunities
21st Century Cyberinfrastructure includes supercomputing, but also massive storage, networking, software, collaboration, visualization, and human resources
– Current centers (NCSA, SDSC, PSC) are a key resource for the INITIATIVE
– Budget estimate: incremental $650M/year (continuing)
An INITIATIVE OFFICE with a highly placed, credible leader empowered to
– Initiate competitive, discipline-driven, path-breaking applications of cyberinfrastructure within NSF that contribute to the shared goals of the INITIATIVE
– Coordinate policy and allocations across fields and projects, with participants across NSF directorates, Federal agencies, and international e-science
– Develop high-quality middleware and other software that is essential and special to scientific research
– Manage individual computational, storage, and networking resources at least 100x larger than individual projects or universities can provide
For More Information
The Globus Project™
– www.globus.org
Grid concepts, projects
– www.mcs.anl.gov/~foster
Open Grid Services Architecture
– www.globus.org/ogsa
Global Grid Forum
– www.gridforum.org
GriPhyN project
– www.griphyn.org