CHEP 2004
Grid Enabled Analysis: Prototype, Status and Results
(on behalf of the GAE collaboration) Caltech, University of Florida, NUST, UBP
Frank van Lingen ([email protected])
ARCHITECTURE
System view
• Support 100s to 1000s of analysis and production tasks
• Batch and interactive (interactive should be really interactive!)
• “Chaotic” behavior (not only production-like workflows)
• Resource limited and policy constrained
• Who is allowed to access what resources?
• Real-time monitoring and trend analysis
• Workflow tracking and data provenance
• Collaborate on analysis (country, world wide)
• Provide secure access to data and resources
• Self organizing (prevent the “100 system administrators” nightmare)
• Detect bottlenecks within the grid (network, storage, CPU) and take action without human intervention
• Secure, robust, fast data transfer
• High-level services: autonomous replication, steering of jobs, workflow management (service flows, data analysis flows)
• Create a robust end-to-end system for physics analysis
• No single point of failure
• Composite services
• Provide a simple access point for the user, while performing complex tasks behind the scenes
GAE is not all “new” development, but also focuses on integration of existing components which can include experiment specific applications
User view
Provide a transparent environment for a physicist to perform his/her analysis (batch/interactive) in a distributed dynamic environment: identify your data (Catalogs), submit your (complex) job (Scheduling, Workflow, JDL), get “fair” access to resources (Priority, Accounting), monitor job progress (Monitor, Steering), get the results (Storage, Retrieval), repeat the process and refine results.
• I want to share my results/code with a selected audience!
• I want access to data as quickly as possible!
[Figure: simplistic view! Components: Catalogs, Scheduler, Farm, Storage. Actions: identify, locate, submit, execute, monitor/steering, store, notify/move.]
Architecture
Example implementations associated with GAE components
[Figure labels, components: Analysis Client (two shown); Grid Services Web Server; Scheduler; Catalogs; Metadata; Virtual Data; Replica; Fully-Abstract Planner; Partially-Abstract Planner; Fully-Concrete Planner; Execution Priority Manager; Grid Wide Execution Service; Data Management; Monitoring; Applications. Protocols: HTTP, SOAP, XML-RPC.]
[Figure labels, example implementations: Clarens; Sphinx; Chimera; MonALISA; ROOT (analysis tool); Python; Cojac (detector viz.)/IGUANA (CMS viz tool); MCRunjob; BOSS; RefDB; POOL; ORCA; ROOT; FAMOS; VDT-Server; MOPDB.]
• Discovery
• ACL management
• Certificate based access
• Analysis Clients talk standard protocols to the “Grid Services Web Server”, a.k.a. the Clarens data/services portal.
• Simple Web service API allows Analysis Clients (simple or complex) to operate in this architecture.
• Typical clients: ROOT, Web Browser, IGUANA, COJAC.
• The Clarens portal hides the complexity of the Grid Services from the client, but can expose it in as much detail as required, e.g. for monitoring.
• Key features: Global Scheduler, Catalogs, Monitoring, and Grid-wide Execution service.
Peer 2 Peer System
Allow a “Peer-to-Peer” configuration to be built, with associated robustness and scalability features.
Discovery of Services
No Single point of failure
Robust file download
[Figure labels: Client; Catalog; File_x; find service (e.g. Catalog); discover services (between servers); query for data; download file.]
Self Organizing
• Autonomous replica management
• Trend analysis
• Job scheduling
• Real time feedback
• Steering jobs, job feedback
[Figure labels: Data_1; Data_2; replicate Data_1; remove.]
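The replica management idea on this slide (replicate data that is in demand, remove copies that are not) could be sketched roughly as below. This is only an illustration: the function name `plan_replication`, the `hot` and `cold` thresholds, and the data layout are assumptions and not part of GAE.

```python
from collections import Counter


def plan_replication(access_log, replicas, hot=3, cold=0):
    """Suggest replica actions from recent accesses (hypothetical policy).

    access_log: list of dataset names, one entry per recent access.
    replicas:   dict mapping dataset name -> list of sites holding a copy.
    hot / cold: assumed thresholds on the access count.
    """
    counts = Counter(access_log)
    actions = []
    for dataset, sites in replicas.items():
        accesses = counts[dataset]  # Counter returns 0 for unseen names
        if accesses >= hot:
            actions.append(("replicate", dataset))
        elif accesses <= cold and len(sites) > 1:
            # Never remove the last remaining copy.
            actions.append(("remove", dataset))
    return actions


replicas = {"Data_1": ["site_a"], "Data_2": ["site_a", "site_b"]}
access_log = ["Data_1", "Data_1", "Data_1", "Data_1"]
actions = plan_replication(access_log, replicas)
```

A real service would take the access counts from monitoring data and apply the trend over time rather than a single snapshot.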
Development
(for more details, see the talks on Clarens and Sphinx)
GAE Backbone: Clarens Service Framework
• X509 certificate based access
• Good performance
• Access control management
• Remote file access
• Dynamic discovery of services on a global scale
• Available in Python and Java
• Easy to install, as root or normal user, and part of the DPE distribution. As root do:
    wget -q -O - http://hepgrid1.caltech.edu/clarens/setup_clump.sh | sh
    export opkg_root=/opt/openpkg
• Interoperability with other web service environments such as Globus, through SOAP
• Interoperability with MonALISA (publication of service methods via MonALISA)
[Figure labels: service publication; monitoring Clarens parameters.]
Python:
    import Clarens  # Clarens Python client module, as used on the next line
    dbsvr = Clarens.clarens_client('http://tier2c.cacr.caltech.edu:8080/clarens/')
    dbsvr.echo.echo('alive?')
    dbsvr.file.size('index.html')
    dbsvr.file.ls('/web/system', '*.html')
    dbsvr.file.find(['//web'], '*', 'all')
    dbsvr.catalog.getMetaDataSpec('cat4')
    dbsvr.catalog.queryCatalog('cat4', 'val1 LIKE "%val%"', 'meta')
    dbsvr.refdb.listApplications('cms', 0, 20, 'Simulation')
Grid Portal: secure cert-based access to services through a browser (http/https)
Clients:
• Java client
• ROOT (analysis tool)
• IGUANA (CMS viz. tool)
• ROOT-CAVES client (analysis sharing tool)
• CLASH (Giulio)
• … any app that can make XML-RPC/SOAP calls
Services (3rd party applications):
• POOL catalog
• RefDB/PubDB (CERN)
• BOSS
• Phedex (CERN)
• MCRunjob/MopDB (FNAL)
• Sphinx (Grid scheduler) (UFL)
• …
[Figure labels: Client (several); Web server; Service; 3rd party applications; other servers.]
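Any application that can make XML-RPC calls can act as a client. As a rough illustration of that call pattern using only the Python standard library (not the Clarens client, and without certificates), the sketch below starts a throwaway local server and calls a dotted method name on it. The `echo` function, the `echo.echo` registration, and the helper names are stand-ins invented for this example.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def echo(message):
    """Hypothetical stand-in for a remote echo method."""
    return message


def start_demo_server():
    """Start a throwaway XML-RPC server on a free local port."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    # A dotted name mimics the "module.method" style of remote calls.
    server.register_function(echo, "echo.echo")
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server


def call_echo(url, message):
    """Call echo.echo on the XML-RPC endpoint at url."""
    proxy = ServerProxy(url)
    return proxy.echo.echo(message)


server = start_demo_server()
host, port = server.server_address
reply = call_echo(f"http://{host}:{port}/", "alive?")
server.shutdown()
server.server_close()
```

The client side is just the `ServerProxy` call; everything else exists only so the example can run without a remote server.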
Clarens Cont.:
Clarens Grid Portals
[Figure labels: catalog service; job execution; collaborative analysis desktop; PDA.]
MonALISA Integration
• Publish web services information for discovery in other distribution systems
• Query repositories for monitor information
• Gather and publish access patterns on collections of data
SPHINX: Grid scheduler
• Simple sanity checks
  • 120 canonical virtual data workflows submitted to US-CMS Grid
• Round-robin strategy
  • Equally distribute work to all sites
• Upper-limit strategy
  • Makes use of global information (site capacity)
  • Throttle jobs using just-in-time planning
  • 40% better throughput (given grid topology)
[Figure: Distribution of jobs across the US-CMS Grid Testbed. Bar chart of jobs per site (DGT, IGT, UCSD, CALTECH; axis 0 to 120) for the Round Robin and Upper Limit strategies. Data labels as extracted: 60, 60, 60, 60 and 52, 106, 25, 57.]
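The difference between the two strategies can be illustrated with a small sketch. It is not the Sphinx implementation: the function names, the site names, and the per-site limits are assumptions, and the "upper limit" rule here is simply "pick the site with the most free slots and hold back jobs once every site is full".

```python
from itertools import cycle


def round_robin(jobs, sites):
    """Hand out jobs to sites in turn, ignoring site capacity."""
    plan = {site: [] for site in sites}
    for job, site in zip(jobs, cycle(sites)):
        plan[site].append(job)
    return plan


def upper_limit(jobs, limits):
    """Place each job on the site with the most free slots.

    limits maps site -> maximum number of jobs it may hold (assumed to be
    known from monitoring). Jobs that do not fit are held back for a
    later planning pass.
    """
    plan = {site: [] for site in limits}
    held = []
    for job in jobs:
        site = max(limits, key=lambda s: limits[s] - len(plan[s]))
        if len(plan[site]) < limits[site]:
            plan[site].append(job)
        else:
            held.append(job)  # throttled: every site is at its limit
    return plan, held


jobs = [f"job{i}" for i in range(10)]
rr_plan = round_robin(jobs, ["A", "B"])
ul_plan, ul_held = upper_limit(jobs, {"A": 6, "B": 2})
```

With these made-up limits, round robin sends five jobs to each site regardless of capacity, while the capacity-aware variant fills site A to six, site B to two, and holds the remaining two jobs.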
[Figure labels: Sphinx Server (Request Processing, Data Warehouse, Data Management, Information Gathering); Sphinx Client; VDT Client; VDT Server Site; Chimera Virtual Data System; Clarens WS Backbone; MonALISA Monitoring Service; Globus Resource; Replica Location Service; Condor-G/DAGMan.]
• Data warehouse
  • Policies, account information, grid weather, resource properties and status, request tracking, workflows
• Control process
  • Finite state machine
  • Different modules modify jobs, graphs, workflow
• Flexible
• Extensible
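A control process built as a finite state machine might look like the sketch below. The state names and the allowed transitions are hypothetical and chosen only for illustration; they are not taken from Sphinx.

```python
from enum import Enum


class State(Enum):
    SUBMITTED = "submitted"
    PLANNED = "planned"
    RUNNING = "running"
    DONE = "done"
    FAILED = "failed"


# Allowed transitions (assumed for this example).
TRANSITIONS = {
    State.SUBMITTED: {State.PLANNED, State.FAILED},
    State.PLANNED: {State.RUNNING, State.FAILED},
    State.RUNNING: {State.DONE, State.FAILED},
    State.DONE: set(),
    State.FAILED: {State.SUBMITTED},  # a failed job may be resubmitted
}


class Job:
    """A job whose state may only change along TRANSITIONS."""

    def __init__(self, name):
        self.name = name
        self.state = State.SUBMITTED
        self.history = [State.SUBMITTED]

    def advance(self, new_state):
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"{self.state.name} -> {new_state.name} not allowed"
            )
        self.state = new_state
        self.history.append(new_state)


job = Job("orca-reco-1")
for step in (State.PLANNED, State.RUNNING, State.DONE):
    job.advance(step)
```

Each module then only needs to know which transition it is responsible for, which is one way to keep such a process flexible and extensible.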
CODESH
• Virtual log-book for “shell” sessions
• Parts can be local (private) or shared
• Tracks environment variables, aliases etc. during a session
• Reproduce complete working sessions
• First prototypes use popular tools: Python, ROOT and CVS; e.g. all ROOT commands and CAVES commands available
• Three tier architecture: isolate client from back-end details; different implementations possible
• Lightweight clients (use ROOT; C++; Python; e.g. CVS API)
• Back-ends: e.g. CVS pservers (remote stores) with read/write access control; ARCH, Clarens etc.
• Optional MySQL servers for metadata (fast search for large data volumes)
• More info: http://bourilko.home.cern.ch/bourilko/dpf04caves.ppt
Physics Analysis
• Both Florida and Caltech moved DC04 data successfully from FNAL.
• DC04 catalog and data distributed over several hosts, and catalog available as web service
• Within CODESH a complex CMS ORCA example is available
• …
Services within GAE
[Figure labels, services: MCRunjob; Refdb; BOSS; POOL catalog; GROSS; MOPDB; TMDB; Sphinx; MonALISA; SRM; SRB; Chimera; File access; VO management; ACL management; Service discovery; PubDB; Monte Carlo processing service; Codesh.]
[Figure labels, institutes: Caltech/CERN; CERN/Caltech/INFN; FNAL; FNAL; UFL; Caltech; CERN/Caltech; CERN; CERN; FNAL/Caltech; UFL.]
[Legend (symbols lost in extraction): accessible through a GAE web service; has javascript front end; service being developed; on wish list to become a service or to interoperate with this service; Clarens core service.]
Deployment
• The new Clarens distributions register automatically with MonALISA. (Notice there are several entries for the same server, representing different protocols.)
• Approximately 20+ installations world wide
[Figure labels: UCSD; CERN; CACR; Pakistan; UK; “Conference User!”.]
GAE testbed
Querying for datasets
(1) Discover pool catalog, refdb, grid schedulers
(2) Query for dataset
(3) Submit orca/root job(s) with dataset(s) for reconstruction/analysis
• Multiple clients will query and submit jobs
• Client code has no knowledge about location of services, except for several URLs for discovery services
• TODO: integration of pubdb
[Figure labels: Client; discovery (two shown); Refdb; Refdb (replica); Pool Catalog and Pool catalog replicas (host 1 to host 4 replicas) spread over Host 1 to Host 7; Grid scheduler; runjob. Legend: = service; = not yet deployed.]
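The point that client code only needs a few discovery URLs can be sketched as follows. Everything here is hypothetical: the `discover` helper, the `lookup` callable, the example URLs, and the in-memory registry are stand-ins for illustration, not the Clarens discovery API.

```python
class DiscoveryUnavailable(Exception):
    """Raised when no discovery service could be reached."""


def discover(service_name, discovery_urls, lookup):
    """Return endpoints for service_name from the first reachable
    discovery service.

    lookup(url, service_name) is a hypothetical callable that queries one
    discovery service and raises ConnectionError if it is down.
    """
    for url in discovery_urls:
        try:
            return lookup(url, service_name)
        except ConnectionError:
            continue  # try the next discovery service
    raise DiscoveryUnavailable(service_name)


# In-memory stand-in: only discovery service "b" is up.
REGISTRY = {
    "http://discovery-b.example/": {
        "pool_catalog": ["http://host1.example/", "http://host5.example/"],
    }
}


def fake_lookup(url, service_name):
    if url not in REGISTRY:
        raise ConnectionError(url)
    return REGISTRY[url].get(service_name, [])


catalogs = discover(
    "pool_catalog",
    ["http://discovery-a.example/", "http://discovery-b.example/"],
    fake_lookup,
)
```

Because the client falls through to the next discovery URL, a single discovery service going down does not stop it from finding a catalog.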
Scheduling Push Model
(1) Submit orca/root job(s) with dataset(s) for reconstruction/analysis
(2) Query resource status
(3) Submit job(s)
• Potentially BOSS can also be used for global job submission
• Push model has limitations once the system becomes resource limited
[Figure labels: SPHINX; uniform job submission layer; MonALISA monitors; Farm 1 and Farm 2 (BOSS, CondorG); Farm 3 and Farm 4 (BOSS, PBS). Legend: = service.]
Scheduling Pull Model
(1) Submit orca/root job(s) with dataset(s) for reconstruction/analysis
(2) Resources are available, give me a job
(3) Pull job(s)
• Combining push and pull to get better scalability
[Figure labels: Queue; uniform job submission layer; MonALISA monitors; Farm 1 and Farm 2 (BOSS, CondorG); Farm 3 and Farm 4 (BOSS, PBS).]
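A minimal sketch of the pull side, assuming a single central queue and farms that ask for work only when they have free slots. The `JobQueue` class and its method names are made up for this illustration.

```python
from collections import deque


class JobQueue:
    """Central queue: clients submit jobs, farms pull them when free."""

    def __init__(self):
        self._jobs = deque()

    def submit(self, job):
        self._jobs.append(job)

    def pull(self, free_slots):
        """Give a farm at most free_slots jobs, oldest first."""
        batch = []
        while self._jobs and len(batch) < free_slots:
            batch.append(self._jobs.popleft())
        return batch


queue = JobQueue()
for i in range(5):
    queue.submit(f"job{i}")

# Each farm reports its free slots and receives work accordingly.
farm1 = queue.pull(free_slots=2)
farm2 = queue.pull(free_slots=4)
farm3 = queue.pull(free_slots=1)
```

In this arrangement the scheduler never overcommits a farm, because a farm only receives what it asked for; the cost is that jobs wait in the queue until some farm pulls them.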
(1) Discover a global manager
(2) Request session (dataset)
(3) Discover catalog service
(4) Get list of farms that have this dataset
(5) Reserve process time
(6) Allocate time
(7) Submit job(s)
(7) Move data to nodes
(7) Report access statistics to MonALISA
(8) Create job
(9) Data moved
(9) Data ready?
(10) Alive signal during processing
• Multiple clients query and submit jobs
• Client code and global manager have no knowledge about location of services, except for several URLs for discovery services
• Similarity with other approaches
[Figure labels: client; discover; catalog; global manager (several); farm 1, farm 2, farm 3, each with local manager, dcache, and job submission; Job; MonALISA. Legend: = service.]
SUMMARY
Lessons learned
• Quality of (the) service(s)
  • A lot of exception handling is needed for robust services (graceful failure of services)
  • Timeouts are important
• Need very good performance for composite services
• Discovery service enables location-independent service composition
• Semantics of services are important (different name, name space, and/or WSDL)
• Web service design: not every application is developed with a web service interface in mind
• Interfaces of 3rd party applications change: Rapid Application Development
• Social engineering
  • Finding out what people want/need
• Overlapping functionality of applications (but not the same interfaces!)
• Not one single solution for CMS
• Not every problem has a technical solution; conventions are also important
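The remarks on exception handling and timeouts can be made concrete with a small wrapper around a service call. This is a sketch under assumptions: `call_with_timeout` and the stand-in service functions are invented for the example, and a real composite service would also want logging and retries.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as CallTimeout


def call_with_timeout(func, timeout_s, default=None):
    """Run func() in a worker thread and fail gracefully.

    Returns default if the call takes longer than timeout_s seconds or
    raises an exception, instead of letting the caller hang or crash.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(func)
    try:
        return future.result(timeout=timeout_s)
    except CallTimeout:
        return default  # service too slow
    except Exception:
        return default  # service failed
    finally:
        # Do not block on a slow call; the worker finishes on its own.
        pool.shutdown(wait=False)


def slow_service():
    """Stand-in for a remote call that responds too late."""
    time.sleep(0.3)
    return "late answer"


def broken_service():
    """Stand-in for a remote call that fails."""
    raise RuntimeError("service down")


fast = call_with_timeout(lambda: "ok", timeout_s=1.0)
slow = call_with_timeout(slow_service, timeout_s=0.05, default="gave up")
broken = call_with_timeout(broken_service, timeout_s=1.0, default="failed")
```

Returning a default keeps a composite service responsive when one of its parts misbehaves; whether a silent default is acceptable depends on the service.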
Future Work
• Integration of runjob into current deployment of services
  • Full chain of end-to-end analysis
• Develop/deploy accounting service (PPDG activity)
• Steering service
• Autonomous replication
• Trend analysis using monitor data
• Improve exception handling
• Integrate mass storage (e.g. SRM) applications into the Clarens environment, or make them interoperate with it
GAE Pointers
• GAE web page: http://ultralight.caltech.edu/gaeweb/
• Clarens web page: http://clarens.sourceforge.net
• Service descriptions: http://hepgrid1.caltech.edu/GAE/services/
• MonaLisa: http://monalisa.cacr.caltech.edu/
• SPHINX: http://www.griphyn.org/sphinx/Research/research.php