Grid Deployments and Cyberinfrastructure
Andrew J. Younge
102 Lomb Memorial Drive
Rochester, NY 14623
[email protected]
http://grid.rit.edu
How Do We Make Use of These Tools?
Grid Hierarchy
Cluster Systems
All batch queuing systems!
PBS – Portable Batch System
• Used for dedicated cluster resources
  – Homogeneous clusters with MPI
• Manages thousands of CPUs in near real time
• Schedules large numbers of jobs quickly and efficiently
• Many different implementations
  – PBS Pro (not free but advanced)
  – OpenPBS (free but old)
  – Torque & Maui (free, stable, advanced)
• Deployments
  – Dedicated clusters in academic and corporate settings
  – PlayStation 3 clusters
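As a rough illustration of what a job handed to a PBS-style queue looks like, the sketch below renders a Torque-flavoured job script for an MPI run. The job name, resource numbers, and the `./hello_mpi` binary are placeholder assumptions, not taken from any real cluster. The `-l nodes=...:ppn=...` form is the Torque resource syntax; other PBS implementations may expect a different one.

```python
def render_pbs_script(job_name, nodes, ppn, walltime, command, queue=None):
    """Return the text of a Torque/PBS job script for an MPI program.

    All argument values are supplied by the caller; nothing here queries
    a real scheduler.
    """
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",               # job name shown in the queue
        f"#PBS -l nodes={nodes}:ppn={ppn}",  # nodes and processors per node
        f"#PBS -l walltime={walltime}",      # wall-clock limit, HH:MM:SS
    ]
    if queue is not None:
        lines.append(f"#PBS -q {queue}")     # optional destination queue
    total_procs = nodes * ppn
    lines += [
        # PBS starts the job in the home directory; move to the submit directory.
        'cd "$PBS_O_WORKDIR"',
        # PBS_NODEFILE lists the hosts the scheduler allocated to this job.
        f'mpirun -np {total_procs} -machinefile "$PBS_NODEFILE" {command}',
    ]
    return "\n".join(lines) + "\n"


# Hypothetical example: 2 nodes x 4 processors for ten minutes.
script = render_pbs_script(
    "hello_mpi", nodes=2, ppn=4, walltime="00:10:00", command="./hello_mpi"
)
print(script)
```

Saved to a file, a script like this would normally be handed to the batch system with `qsub`.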
Condor
• Used for dedicated and non-dedicated resources
  – Typically used to “scavenge” CPUs in places where a lot of workstations are available
  – Heterogeneous environments
• Separates Condor tasks into Resource Management and Job Management
• User interface is simple: commands that use small config files
• Not good for MPI jobs
• Deployments
  – Campus workstations and desktops
  – Corporate servers
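To give a feel for the "small config files" mentioned above, here is a minimal sketch that renders a Condor submit description file for a batch of independent serial jobs. The executable name `analyze.sh` and the input file pattern are hypothetical. `$(Process)` is the Condor macro that expands to the index of each queued job.

```python
def render_condor_submit(executable, arguments, count, universe="vanilla"):
    """Return the text of a Condor submit description file.

    `count` independent jobs are queued; each gets its own output and
    error file through the $(Process) macro.
    """
    lines = [
        f"universe   = {universe}",
        f"executable = {executable}",
        f"arguments  = {arguments}",
        "output     = job.$(Process).out",
        "error      = job.$(Process).err",
        "log        = job.log",
        f"queue {count}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical example: ten jobs, each reading a different input file.
submit_text = render_condor_submit("analyze.sh", "input.$(Process).dat", count=10)
print(submit_text)
```

The resulting file is what would be passed to `condor_submit`. The vanilla universe used here runs ordinary serial programs, which matches the point above that Condor is a poor fit for MPI jobs.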
Grid tools in Condor
• Condor-G
  – Replicates the Job Management functionality
  – Submission to a grid resource using the Globus Toolkit
  – NOT a grid service, just a way to submit to a grid
• Flocking
  – Allows queued jobs in one Condor cluster to be executed on another Condor cluster
  – Unidirectional flocking (A => B but not B => A)
  – Bidirectional flocking (A <=> B)
• Glidein
  – Dynamically adds machines to a Condor cluster
  – Can be used to create your own personal Condor cluster on the TeraGrid!
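The direction of a flocking relationship can be pictured as a directed graph between pools. The sketch below is a toy model only, not Condor code: the pool names are hypothetical, and it assumes that jobs flock only across direct links (no chaining through an intermediate pool). In a real deployment the links would be expressed through Condor's `FLOCK_TO` and `FLOCK_FROM` configuration settings.

```python
from collections import defaultdict


def build_flock_map(links):
    """Build a map from each pool to the set of pools it may flock to.

    `links` is an iterable of (source_pool, target_pool) pairs, each
    describing a one-way flocking link.
    """
    flock_to = defaultdict(set)
    for source, target in links:
        flock_to[source].add(target)
    return flock_to


def can_run_on(flock_to, submit_pool, execute_pool):
    """True if a job queued in submit_pool may execute in execute_pool."""
    return submit_pool == execute_pool or execute_pool in flock_to[submit_pool]


# A => B is a one-way link; C <=> D needs one link in each direction.
flock_map = build_flock_map([("A", "B"), ("C", "D"), ("D", "C")])

print(can_run_on(flock_map, "A", "B"))  # True
print(can_run_on(flock_map, "B", "A"))  # False
print(can_run_on(flock_map, "C", "D"), can_run_on(flock_map, "D", "C"))  # True True
```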
Clusters in Action
Ganglia
BOINC
• Desktop-based grid computing: “Volunteer Computing”
  – Centralized grid system
  – Users are encouraged by earning credits for their computations
  – Users can take part in one or many different projects
• Open access for contributing resources, closed access for using the grid
• Allows organizations to gain enormous amounts of computational power at very little cost
• BOINC is really a cluster and a grid system in one!
BOINC Projects
BOINC Projects (2)
Full List of Projects: http://boinc.berkeley.edu/wiki/Project_list
The Lattice Project
The Open Science Grid - OSG
• Large national-scale grid computing infrastructure
• 5 DOE labs, 65 universities, 5 regional/campus grids
• 43,000 CPUs, 6 petabytes of disk space
• Uses the Globus Toolkit
  – GT4, but relies on the pre-WS (GT2) services
  – Typically connects to Condor pools
• Virtual Data Toolkit (VDT) & OSG Release Tools
  – NMI + VOMS, CEMon, MonALISA, AuthZ, VO management tools, etc.
  – VORS – Resource Selector: http://vors.grid.iu.edu/
The TeraGrid
• NSF-funded national-scale grid infrastructure
  – 11 locations: LONI, NCAR, NCSA, NICS, ORNL, PSC, IU, PU, SDSC, TACC, UC/ANL
  – 1.1 petaflops, 161 thousand CPUs, 60 petabytes of disk space
  – Dedicated 10G fiber lines to each location
  – Specialized visualization servers
• Uses Globus Toolkit 4’s basic WS services and security protocols
• Grid Infrastructure Group (GIG) at U. Chicago
  – Committee for TeraGrid planning, management, and coordination
• Science Gateways
  – Independent services for specialized groups and organizations
  – “TeraGrid Inside” capabilities
  – Web portals, desktop apps, coordinated access points
  – Not Virtual Organizations (VOs)
TeraGrid Overview
[Map of TeraGrid sites: SDSC, TACC, UC/ANL, NCSA, ORNL, PU, IU, PSC, NCAR, Caltech, UNC/RENCI, UW, LONI, NICS. Legend: Resource Provider (RP); Software Integration Partner; Grid Infrastructure Group (UChicago)]
TeraGrid User Portal
EGEE
• European Commission-funded international grid system
  – 250 resource locations, 40,000 CPUs, 20 petabytes of storage
  – Originally a European grid, but expanded to the US and Asia
• Uses the gLite middleware system
  – Uses Globus’ Grid Security Infrastructure (GSI)
  – Specialized elements to utilize underlying hardware
  – Groups are organized as Virtual Organizations (VOs) and use VOMS membership services to enable user privileges
• Originally based on the old LHC Grid
  – EGEE-I ended in April 2006 and continued on as EGEE-II
  – Now part of WLCG
Worldwide LHC Computing Grid
• Large grid to support the massive computational needs of the Large Hadron Collider at CERN
  – Project produces >15 petabytes of data per year!
• WLCG is really a mashup of other grids
  – EGEE, OSG, GridPP, INFN Grid, NorduGrid
  – Uses specialized upperware to manage these grids
• Multi-tier system for efficiently distributing data to scientists and researchers around the world
• Used mostly for the ATLAS, ALICE, CMS, LHCb, LHCf and TOTEM experiments
What If We Could Use All of the Grids Together?
Cyberaide Shell
• There are many different cyberinfrastructure deployments today.
  – How do we make sense of them?
  – How do we use them for our benefit?
• Our idea for Cyberaide Gridshell is to link to these grid deployments
  – Provide an easy, all-in-one interface for many different grids
  – Automate scheduling and resource management
  – Leverage Web 2.0 technologies
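One way to picture an "all-in-one interface" with automated scheduling is a thin layer that knows how to submit to each backend and picks one by policy. The sketch below is purely illustrative and is not Cyberaide's actual design: the `Backend` class, the backend names, the queue depths, and the shortest-queue policy are all assumptions made for the example. Only `qsub` and `condor_submit` are real submit commands.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """One cluster or grid entry point (all fields are illustrative)."""
    name: str
    submit_program: str  # command-line tool used to submit a job file
    queued_jobs: int     # current queue depth, assumed to be known already

    def submit_command(self, job_file):
        """Build the argument list that would submit job_file here."""
        return [self.submit_program, job_file]


def pick_backend(backends):
    """Naive scheduling policy: choose the backend with the shortest queue."""
    return min(backends, key=lambda b: b.queued_jobs)


# Hypothetical backends with made-up queue depths.
backends = [
    Backend("campus-pbs", "qsub", queued_jobs=120),
    Backend("campus-condor", "condor_submit", queued_jobs=35),
]

chosen = pick_backend(backends)
# The command is only constructed here, never executed.
print(chosen.name, chosen.submit_command("job.sub"))
```

A real tool would also have to translate the job description into each backend's own format and handle credentials, which this sketch leaves out.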
References
• http://www.cs.wisc.edu/condor/
• http://www.gridpp.ac.uk/wiki/Torque_and_Maui
• http://www.opensciencegrid.org/
• http://teragrid.org/
• https://twiki.grid.iu.edu/pub/Education/MWGS2008Syllabus/6_NationalGrids.ppt
• http://globus.org/toolkit/
• https://edms.cern.ch/file/722398//gLite-3-UserGuide.html
• http://www.eu-egee.org/
• http://lattice.umiacs.umd.edu/resources/
• http://lcg.web.cern.ch/LCG/
• http://grid.rit.edu/lab/doku.php/grid:shell