CMS Production System
CMS Distributed Production
Michael Thomas, Sep. 20, 2006
With thanks to Ajit Mohapatra from UW
Old Production Model
Large-scale MC production in CMS has been going on for the past several years.
• Based on agreement/willingness of individual CMS institutes to participate in the MC production.
• Installation/maintenance of CMS software at the involved sites by their own people.
• MC production, storage and transfer were managed by the sites.
• Each site was responsible for arranging/negotiating the resources it needed.
• Each production site performed its task based on directions from the production management group at CERN.
New Grid-based Distributed Production
• Based on new CMSSW software framework
• Centralized software installation at both LCG and OSG sites
• (Centralized) Production Manager maintains all work to be done (1 OSG operator, 3 for LCG)
• Remote Production Agents pull work from the Production Manager, run and track jobs
• Jobs are monitored and resubmitted in case of failure
• Makes extensive use of Grid resources
• In use since July 2006
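The pull model on this slide (agents pull work from a central manager, run it, and resubmit on failure) can be illustrated with a minimal sketch. This is not the ProdAgent code: the class names, the `max_attempts` retry limit, and the toy `flaky_job` function are all assumptions made up for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ProductionManager:
    """Central queue of work units (hypothetical stand-in for the real service)."""
    pending: deque = field(default_factory=deque)

    def add_work(self, work_id: str) -> None:
        self.pending.append(work_id)

    def pull(self):
        """Hand out the next work unit, or None when the queue is empty."""
        return self.pending.popleft() if self.pending else None


@dataclass
class ProductionAgent:
    name: str
    max_attempts: int = 3                        # assumed retry limit, not from the slides
    done: dict = field(default_factory=dict)     # work_id -> attempts used
    failed: list = field(default_factory=list)   # work that exhausted its retries

    def process(self, manager: ProductionManager, run_job) -> None:
        """Pull work until the manager has none left, resubmitting failures."""
        while (work_id := manager.pull()) is not None:
            for attempt in range(1, self.max_attempts + 1):
                if run_job(work_id, attempt):
                    self.done[work_id] = attempt
                    break
            else:
                # No attempt succeeded within the retry limit.
                self.failed.append(work_id)


def flaky_job(work_id: str, attempt: int) -> bool:
    """Toy job: 'w2' fails on its first attempt, 'w3' always fails."""
    if work_id == "w3":
        return False
    return not (work_id == "w2" and attempt == 1)


manager = ProductionManager()
for wid in ("w1", "w2", "w3"):
    manager.add_work(wid)

agent = ProductionAgent(name="osg-agent")
agent.process(manager, flaky_job)
print(agent.done)    # {'w1': 1, 'w2': 2}
print(agent.failed)  # ['w3']
```

In this sketch the agent, not the manager, decides when to fetch more work, so several agents could drain the same queue without the manager tracking their capacity.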
CMS Computing Infrastructure
ProdAgent
Production System
Diagram: one Production Manager, two Production Agents, and three Sites, each Site running several Jobs
Production System
Diagram: Production/Workflow Manager, Production Agent, and the chain Workflow Spec, Job Spec, Job, Application, with three * (multiplicity) markers
Production System
Workflow Spec
• Request/Assignment level template
• Has a unique ID
• Corresponds to a CTDR Task
• Used by the (central) Production Manager process
Production System II
Job Spec
• Processing job definition
• Has a unique ID within a workflow/request
• Used to create physical jobs
• Dealt with by the Production Agent
Production System III
Job
• Physical job created from a Job Spec
• The combination of JobSpecID and batch/grid job ID is unique
• Corresponds to a CTDR job
• Multiple jobs created from the same JobSpec are the same job in terms of physics (retries, etc.)
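The identity rule on this slide can be sketched as follows: a retry shares the JobSpecID of the original job (same physics) but gets a new batch/grid job id, so the pair stays unique. The `PhysicalJob` class, the `submit` helper, and the `batch-N` id format are hypothetical; a real batch or grid system would assign its own ids.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class PhysicalJob:
    job_spec_id: str   # identifies the physics content; shared by retries
    batch_job_id: str  # assigned at submission; differs for every physical job

    @property
    def key(self):
        """The (JobSpecID, batch/grid job id) pair that identifies a physical job."""
        return (self.job_spec_id, self.batch_job_id)


_batch_ids = count(1000)  # stand-in for ids handed out by a batch system


def submit(job_spec_id: str) -> PhysicalJob:
    """Create a physical job from a job spec id (illustrative only)."""
    return PhysicalJob(job_spec_id, f"batch-{next(_batch_ids)}")


first = submit("wf42-job7")
retry = submit("wf42-job7")  # resubmission of the same job spec

same_physics = first.job_spec_id == retry.job_spec_id
distinct_jobs = first.key != retry.key
print(same_physics, distinct_jobs)  # True True
```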
Production System IV
Application
• Application-like nodes within a job, managed by SHREEK
• Task types: CMSSW, StageOut, Rescue
• Multiple CMSSW apps may run in a single job
• Each app has a unique node name within the workflow
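As a rough illustration of the node-naming rule, the sketch below checks that application node names are unique within a workflow and that each node uses one of the three task types named on the slide. The `AppNode` and `Workflow` classes and the example node names (`cmsRun1`, `stageOut1`) are assumptions, not the SHREEK data model.

```python
from collections import Counter
from dataclasses import dataclass, field

TASK_TYPES = {"CMSSW", "StageOut", "Rescue"}  # task types named on the slide


@dataclass(frozen=True)
class AppNode:
    name: str       # expected to be unique within the workflow
    task_type: str


@dataclass
class Workflow:
    workflow_id: str
    nodes: list = field(default_factory=list)

    def problems(self) -> list:
        """Return human-readable problems; an empty list means the nodes are valid."""
        issues = []
        counts = Counter(node.name for node in self.nodes)
        for name, n in counts.items():
            if n > 1:
                issues.append(f"duplicate node name: {name}")
        for node in self.nodes:
            if node.task_type not in TASK_TYPES:
                issues.append(f"unknown task type for {node.name}: {node.task_type}")
        return issues


good = Workflow("wf42", [
    AppNode("cmsRun1", "CMSSW"),
    AppNode("cmsRun2", "CMSSW"),      # several CMSSW apps in one job
    AppNode("stageOut1", "StageOut"),
])
bad = Workflow("wf43", [
    AppNode("cmsRun1", "CMSSW"),
    AppNode("cmsRun1", "CMSSW"),      # name reused: violates the uniqueness rule
])

print(good.problems())  # []
print(bad.problems())   # ['duplicate node name: cmsRun1']
```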
Monitoring
• Prod. Manager receives notifications from Prod. Agent
    • Allocated
    • Final states: success, failure
• Prod. Agent monitors the high-level status of jobs via Message Service and reports to CMS Dashboard
    • Created
    • Running
    • Finished
    • Fail (requeue)
• Job Tracker monitors the job status
    • Queued
    • % completed
    • Output text
• Job publishes data
    • BOSS wrapper captures stdout
    • Job may use ApMon directly to publish to MonALISA
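The job states listed on this slide could be tracked with a small state machine, sketched below. The slide names the states but not the allowed transitions, so the `TRANSITIONS` table is an assumption, as is merging the Prod. Agent states (Created, Running, Finished, Fail) with the Job Tracker's Queued state into one enum.

```python
from enum import Enum


class JobState(Enum):
    CREATED = "Created"
    QUEUED = "Queued"
    RUNNING = "Running"
    FINISHED = "Finished"
    FAILED = "Fail (requeue)"


# Assumed transition table; the slide lists the states but not the edges.
TRANSITIONS = {
    JobState.CREATED: {JobState.QUEUED},
    JobState.QUEUED: {JobState.RUNNING},
    JobState.RUNNING: {JobState.FINISHED, JobState.FAILED},
    JobState.FAILED: {JobState.QUEUED},  # a failed job is requeued
    JobState.FINISHED: set(),            # terminal state
}


def advance(current: JobState, new: JobState) -> JobState:
    """Move to a new state, rejecting transitions the table does not allow."""
    if new not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {new.name}")
    return new


def replay(states) -> JobState:
    """Apply a sequence of reported states, starting from CREATED."""
    current = JobState.CREATED
    for state in states:
        current = advance(current, state)
    return current


history = [
    JobState.QUEUED, JobState.RUNNING, JobState.FAILED,   # first attempt fails
    JobState.QUEUED, JobState.RUNNING, JobState.FINISHED, # requeued attempt succeeds
]
print(replay(history).value)  # Finished
```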
Data Storage on OSG
Data Transfer and Management
Production Summary
Looking Forward
• Integration of Prod Manager
• Scale to multiple Prod Agents
• Extend production to non-CMS OSG sites
• Ramp up for CSA06