understanding priorities in htcondor

Upload: igor-sfiligoi

Post on 15-Jan-2015

DESCRIPTION

Introductory talk describing how priorities are handled in HTCondor (formerly known as Condor).

TRANSCRIPT

Page 1: Understanding priorities in HTCondor

CERN, Dec 2012 HTCondor priorities 1

glideinWMS for users

Understanding priorities in HTCondor

by Igor Sfiligoi (UCSD)

Page 2: Understanding priorities in HTCondor

Scope of this talk

This talk provides an overview of how priorities work in HTCondor, both between users and among jobs of the same user, and how the user can affect policies.

The reader is expected to already have a basic understanding of HTCondor.

Page 3: Understanding priorities in HTCondor

HTCondor Architecture

● As a reminder

[Diagram: one central manager, several submit nodes and several execute nodes, each running Condor]

Page 4: Understanding priorities in HTCondor

HTCondor Architecture

● And with relevant daemon names

[Diagram: the central manager runs the Negotiator and each submit node runs a Schedd; the execute nodes are labeled Condor]

Page 5: Understanding priorities in HTCondor

User Priorities

Page 6: Understanding priorities in HTCondor

What is a user?

● Before talking about priorities between users, we need to define what IS a user
● A “HTCondor user” is represented as Owner@Domain
  ● In most setups, the Owner is the “Login User Name” on the submit node
  ● The Domain may either represent the submit node itself, or a set of submit nodes that share the same Owner identification policies

Yes, priorities are based on the User, not the Owner

Both rules are defined by the HTCondor admin and cannot be changed by the final user

Page 7: Understanding priorities in HTCondor

User priorities

● By default, the Negotiator treats all users equally
  ● You get fair-share out of the box
● Each user is assigned a priority number
  ● The lower, the better
  ● Two users with the same priority number on average get half of the Slots each
● User priority asymptotically steers toward the number of Slots used
  ● Both up and down

http://research.cs.wisc.edu/htcondor/manual/v7.8/3_4User_Priorities.html#SECTION00444000000000000000
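
The asymptotic behavior can be illustrated with a small simulation. This is only a sketch: it assumes the exponential-smoothing update described in the HTCondor manual, where the priority moves toward the number of Slots in use with a configurable half-life (one day by default). The function name, the starting value and the Slot counts are hypothetical.

```python
def update_priority(priority, slots_used, dt, halflife=86400.0):
    """One smoothing step: move priority toward slots_used over dt seconds."""
    beta = 0.5 ** (dt / halflife)
    return beta * priority + (1.0 - beta) * slots_used

# A user starting at a (hypothetical) priority of 0.5 keeps 100 Slots busy.
prio = 0.5
for day in range(1, 4):
    prio = update_priority(prio, slots_used=100, dt=86400.0)
    print(f"after day {day}: {prio:.2f}")
```

After each half-life the gap between the priority and the current usage is cut in half, which is why the number drifts both up (while running) and down (while idle).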

Page 8: Understanding priorities in HTCondor

Special users

● If not all users are equally important, the Negotiator supports
  ● Accounting groups – When you need to group users
  ● Priority factors – Works on user-by-user basis

● The two mechanisms can be combined

http://research.cs.wisc.edu/htcondor/manual/v7.8/3_4User_Priorities.html

Page 9: Understanding priorities in HTCondor

Accounting groups

● Users can be joined in accounting groups
  ● The Negotiator defines the groups, but jobs specify which group they belong to
● Each group can be given a quota
  ● Can be absolute or relative to the size of the pool
  ● Sum of running jobs in the group cannot exceed it
● If quotas >100%, can be used for relative prio
  ● Here higher is better
  ● Each group will be given, on average, quotaG/sum(quotas) of slots

Jobs without any group may never get anything
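
The quotaG/sum(quotas) rule can be worked through with a few lines of Python. This is a sketch only; the group names, the quota values (given here as percentages of the pool) and the pool size are hypothetical.

```python
def expected_slots(quotas, pool_size):
    """Average Slots per group: pool_size * quota_G / sum(quotas)."""
    total = sum(quotas.values())
    return {group: pool_size * quota / total for group, quota in quotas.items()}

# Hypothetical relative quotas, in percent of the pool (they sum to 150%).
quotas = {"group_higgs": 90, "group_susy": 60}
shares = expected_slots(quotas, pool_size=1000)
for group, slots in shares.items():
    print(f"{group}: about {slots:.0f} Slots on average")
```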

Page 10: Understanding priorities in HTCondor

Mapping jobs to A.G.

● Users must specify which group they belong to
  ● No automatic mapping or validation in Condor
  ● Based on trust
● Jobs must add to their submit file
  +AccountingGroup = "<group>.<owner>"

    Universe = vanilla
    Executable = cosmos
    Arguments = -k 1543.3
    Output = cosmos.out
    Input = cosmos.in
    Log = cosmos.log
    +AccountingGroup = "group_higgs.frieda"
    Queue 1

Page 11: Understanding priorities in HTCondor

Mapping jobs to A.G.

“AccountingGroup@Domain” is effectively the identifier used by the Negotiator for Priority purposes

With the default being A.G.==Owner
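
As a small illustration of that rule, the helper below builds the identifier from a job's attributes. It is a hypothetical sketch, not HTCondor code: the function name and the example.org domain are made up for the example.

```python
def negotiator_identity(owner, domain, accounting_group=None):
    """Return the name used for priority accounting.

    accounting_group is the value of +AccountingGroup ("<group>.<owner>"),
    or None when the job does not set it (the default is then the Owner).
    """
    name = accounting_group if accounting_group else owner
    return f"{name}@{domain}"

print(negotiator_identity("frieda", "example.org"))
print(negotiator_identity("frieda", "example.org", "group_higgs.frieda"))
```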

Page 12: Understanding priorities in HTCondor

Priority Factors

● Each user can be assigned a Priority Factor
  ● PF>1 will reduce a user's priority
    – If users X and Y have PF_X = (N-1)*PF_Y, on average user X gets 1/N of slots (with user Y the rest)
● Can manage with cmdline tool condor_userprio
● Admins have likely set a high default PF (e.g. 1000)
    – PF cannot go below 1

    $ condor_userprio -all -allusers |grep [email protected]@node1 8016.22 8.02 10.00 0 15780.63 11/23/2011 05:59 11/30/2012 20:37
    $ condor_userprio -setfactor group1.user1@node1 1000
    The priority factor of group1.user1@node1 was set to 1000.000000
    $ condor_userprio -all -allusers |grep [email protected]@node1 8016.22 8.02 1000.00 0 15780.63 11/23/2011 05:59 11/30/2012 20:37

http://research.cs.wisc.edu/htcondor/manual/v7.8/2_7Priorities_Preemption.html#sec:user-priority-explained

Only superuser can set
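
The 1/N rule follows if, on average, each user's share of the Slots is proportional to 1/PF. The sketch below checks the arithmetic under that assumption, and assumes both users always have idle jobs waiting; the function name and the factor values are hypothetical.

```python
def average_shares(priority_factors):
    """Fraction of Slots per user, assuming shares proportional to 1/PF."""
    weights = {user: 1.0 / pf for user, pf in priority_factors.items()}
    total = sum(weights.values())
    return {user: w / total for user, w in weights.items()}

# N = 4: user X has a factor (N-1) = 3 times larger than user Y.
shares = average_shares({"X": 3000.0, "Y": 1000.0})
print(shares)
```

With PF_X = (N-1)*PF_Y the share of X is PF_Y / (PF_X + PF_Y) = 1/N, so here X gets about a quarter of the Slots and Y the rest.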

Page 13: Understanding priorities in HTCondor

Efficiency trade-off

● After getting a Slot, the schedd will keep it for an extended period of time
  ● i.e. will schedule several jobs of the same user on it
  ● For efficiency reasons
    – Negotiator can take a few mins to do the matching
● As a side effect
  ● A low priority user may keep the execute node even if jobs from a higher priority user show up

Configurable, but it is a trade-off.

In glideinWMS, lifetime of the glidein by default

Page 14: Understanding priorities in HTCondor

Preemption

● HTCondor has the notion of preemption
  ● If a job from a higher priority user shows up, the Negotiator may instruct an execute node to kill the running job and re-negotiate
  ● Yes, all work done to that point is lost (unless the job is able to checkpoint)
● Disabled by default on glideinWMS systems

Page 15: Understanding priorities in HTCondor

Submit node limits

● HTCondor resource usage on the submit node scales with the number of running jobs
  ● So an admin will likely set a limit: MAX_JOBS_RUNNING
● If the submit node gets close to the limit, you are likely to see “weird behavior”
  ● The negotiator will try to be fair, and distribute the remaining wiggle room to several users with a similar priority number
  ● Remember: User priority is a dynamic property

Page 16: Understanding priorities in HTCondor

Monitoring per-user usage

● Submitter ClassAds provide per-user info
  ● But one ClassAd per submitter node
● The long format contains info about limits

    $ condor_status -submitters

Actual ClassAds:

    Name                 Machine     Running IdleJobs HeldJobs

    uscms3024@cmsanalysi glidein-2.      802      299        1
    uscms3024@cmsanalysi submit-2.t     2063     1131        0
    uscms3044@cmsanalysi submit-2.t      663      344        0
    uscms3045@cmsanalysi submit-2.t        0        1        0

Summary:

                         RunningJobs IdleJobs HeldJobs

    uscms3024@cmsanalysi        2865     1430        1
    uscms3044@cmsanalysi         663      344        0
    uscms3045@cmsanalysi           0        1        0

    Total                       3528     1775        1
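
The summary part of the listing is just the per-node ClassAds added up per user. The sketch below reproduces it from the four rows shown in the listing; the `summarize` helper is a hypothetical name, not part of any HTCondor tool.

```python
from collections import defaultdict

# (name, machine, running, idle, held) rows copied from the listing.
rows = [
    ("uscms3024@cmsanalysi", "glidein-2.", 802, 299, 1),
    ("uscms3024@cmsanalysi", "submit-2.t", 2063, 1131, 0),
    ("uscms3044@cmsanalysi", "submit-2.t", 663, 344, 0),
    ("uscms3045@cmsanalysi", "submit-2.t", 0, 1, 0),
]

def summarize(rows):
    """Sum running, idle and held jobs per user across submit nodes."""
    totals = defaultdict(lambda: [0, 0, 0])
    for name, _machine, running, idle, held in rows:
        for i, value in enumerate((running, idle, held)):
            totals[name][i] += value
    return {name: tuple(counts) for name, counts in totals.items()}

summary = summarize(rows)
for name, counts in summary.items():
    print(name, *counts)
print("Total", *(sum(col) for col in zip(*summary.values())))
```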

Page 17: Understanding priorities in HTCondor

Job Priorities

Page 18: Understanding priorities in HTCondor

Priority-FIFO

● So, a user will have many jobs
  ● In which order will they be executed?
● HTCondor guarantees the Priority-FIFO policy
  ● Each job has a priority associated with it
  ● Jobs in the same priority class will start in FIFO order
  ● Jobs with higher priority always start before jobs with lower priority
    – i.e. higher priority is better

User-specific – will not affect priority between users
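
The Priority-FIFO order amounts to a sort on two keys: job priority (descending) and submission order (ascending). A minimal sketch, with hypothetical job records and field names:

```python
def priority_fifo(jobs):
    """Order jobs by descending priority, then by submission order."""
    return sorted(jobs, key=lambda job: (-job["prio"], job["submitted"]))

# Hypothetical jobs of one user; "submitted" is a submission counter.
jobs = [
    {"id": "1.0", "prio": 0, "submitted": 1},
    {"id": "2.0", "prio": 5, "submitted": 2},
    {"id": "3.0", "prio": 0, "submitted": 3},
    {"id": "4.0", "prio": 5, "submitted": 4},
]
order = [job["id"] for job in priority_fifo(jobs)]
print(order)
```

The two priority-5 jobs come first, in the order they were submitted, followed by the two priority-0 jobs.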

Page 19: Understanding priorities in HTCondor

Non-uniform environments

● Of course, everything is contingent on matching
  ● P-FIFO only applies to jobs that match at least one Slot
● If not all Slots are uniform
  ● Lower priority (or submitted late) Jobs may start before high priority (or submitted early) Jobs if the latter do not match any Unclaimed Slots
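
The effect of matching on the start order can be sketched as follows. Real matching evaluates ClassAd requirements; here a single memory comparison stands in for it, and all names and numbers are hypothetical.

```python
def next_job_to_start(jobs, free_slots):
    """Return the first job, in Priority-FIFO order, that matches a free Slot."""
    ordered = sorted(jobs, key=lambda job: (-job["prio"], job["submitted"]))
    for job in ordered:
        if any(slot["memory_mb"] >= job["memory_mb"] for slot in free_slots):
            return job
    return None

# The high priority job needs more memory than any free Slot offers.
jobs = [
    {"id": "big", "prio": 10, "submitted": 1, "memory_mb": 8000},
    {"id": "small", "prio": 0, "submitted": 2, "memory_mb": 1000},
]
free_slots = [{"memory_mb": 2000}]
print(next_job_to_start(jobs, free_slots)["id"])
```

The lower priority job is selected because the higher priority one matches none of the free Slots.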

Page 20: Understanding priorities in HTCondor

Job restarts

● If an execute node dies for whatever reason, HTCondor will try to re-start the job that was running there somewhere else
● In a typical (glidein) setup, it will get the next available matching slot for that user
  ● i.e. it will not preempt a lower priority job

Page 21: Understanding priorities in HTCondor

Multiple submit nodes

● The same user may have submitted jobs on many submit nodes
  ● Here assuming they share the same Domain name
● Each submit node will handle its jobs on its own
  ● No guarantee on the execution order between jobs on different nodes
  ● HTCondor will try to Round-Robin between them
● NEW: In 7.9.x, HTCondor can be configured to treat the Job priority as a global property
  ● i.e. first high priority jobs, no matter which submitter
  ● But still no guarantee within the prio. class

Page 22: Understanding priorities in HTCondor

Priorities in glideinWMS

Page 23: Understanding priorities in HTCondor

None

● The glideinWMS layer does not handle priorities in any shape or form

● All jobs from all users treated the same
  ● Although it may create different execute node requirements for some of them
    – But it is effectively a binary decision

Page 24: Understanding priorities in HTCondor

The End

Page 25: Understanding priorities in HTCondor

Pointers

● HTCondor Home Page
  http://research.cs.wisc.edu/htcondor/

● HTCondor [email protected]@cs.wisc.edu

Page 26: Understanding priorities in HTCondor

Acknowledgments

● The creation of this document was sponsored by grants from the US NSF and US DOE, and by the University of California system