![Page 1: Gridengine Configuration revie · 2007. 7. 6. · Current setup All hosts are admin hosts Single “head node” configured as submit/master Execution hosts have ssh blocked Users](https://reader036.vdocuments.net/reader036/viewer/2022071411/61063bbd00d1994d987cf03d/html5/thumbnails/1.jpg)
Gridengine Configuration review
● Gridengine overview
● Our current setup
● The scheduler
● Scheduling policies
● Stats from the clusters
Gridengine Overview
● Accepts jobs from the outside world
● Puts jobs in a holding area until they can be run
● Sends jobs from the holding area to an execution device
● Manages running jobs
● Records details about finished jobs
Gridengine Overview (2)
● Four types of hosts
  – Execution: runs jobs.
  – Submit: a host that jobs can be submitted from.
  – Master: schedules jobs.
  – Admin: a host that cluster admin commands can be run from.
● A host can be of several types, but there is only one master (with a hot spare).
● Could run everything on one host... silly but possible.
Queues (Cluster Queues)
● Container for a class of jobs
● Can define specific resources
  – large memory machines
  – specific processor
  – architecture
  – time restricted (runtime or time of day/week)
● Contain one or more execution hosts
● Can be preemptive
● Can contain subqueues
Queues (2)
● Queue instance
  – Each queue is bound to an included execution host via a queue instance.
  – Each execution host can have multiple queue instances attached.
  – Each queue instance can have one or more job slots.
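The relationship between cluster queues, execution hosts, queue instances and slots can be sketched as a small data model. This is an illustration only, not Grid Engine code: the class names, the `short.q` queue and the host names are hypothetical. The `queue@host` naming follows how Grid Engine refers to queue instances.

```python
from dataclasses import dataclass, field


@dataclass
class QueueInstance:
    """One cluster queue bound to one execution host."""
    queue: str
    host: str
    slots: int = 1  # number of jobs this instance may run at once

    @property
    def name(self) -> str:
        # Queue instances are referred to as "queue@host".
        return f"{self.queue}@{self.host}"


@dataclass
class ClusterQueue:
    """A named container for a class of jobs, spanning one or more hosts."""
    name: str
    instances: dict = field(default_factory=dict)  # host name -> QueueInstance

    def add_host(self, host: str, slots: int = 1) -> QueueInstance:
        inst = QueueInstance(self.name, host, slots)
        self.instances[host] = inst
        return inst

    def total_slots(self) -> int:
        return sum(inst.slots for inst in self.instances.values())


def instances_on(host, queues):
    """Names of all queue instances attached to one execution host."""
    return [q.instances[host].name for q in queues if host in q.instances]


# Hypothetical layout: two cluster queues sharing the host node01.
all_q = ClusterQueue("all.q")
short_q = ClusterQueue("short.q")
all_q.add_host("node01", slots=2)
all_q.add_host("node02", slots=2)
short_q.add_host("node01", slots=1)
```

Here `instances_on("node01", [all_q, short_q])` lists both instances on the shared host, which is the "multiple queue instances per execution host" case from the slide.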
Simple configuration
● One cluster queue
● Each execution host has one queue instance
● Jobs are scheduled in FIFO order.
● This is the default configuration gridengine ships with.
Our Hardware
● 4 clusters running gridengine
  – Lion: 64+ nodes (GX240)
  – Lutzow: 16 nodes (PE530)
  – Townhill: 34 nodes (PE1425 dual CPU)
  – Hermes: 24 nodes (PE1425 single CPU)
● 4 head nodes (1 per cluster)
● 1TB local home directories
● 1TB “scratch” space
Current setup
● All hosts are admin hosts
● Single “head node” configured as submit/master
● Execution hosts have ssh blocked
● Users ssh onto the head node and submit jobs.
  – Actually they tend to run scripts which submit jobs
  – Lots of jobs
  – Not all of them will run properly.
The Scheduler Process
Prioritisation
● Prioritisation based on
  – Entitlement
  – Urgency
  – Custom
● Generates a dispatch priority
● Real number based on a combination of the above.
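How the three policies might combine into one real number can be sketched as a weighted sum of normalised values. This is a minimal sketch, not the scheduler's code: the weights are assumed values for illustration (the real ones live in the scheduler configuration), the job fields are hypothetical, and mapping a zero-width range to 0.5 is a choice made by this sketch.

```python
def normalise(value, lo, hi):
    """Scale value into [0, 1]; this sketch maps a zero-width range to 0.5."""
    if hi == lo:
        return 0.5
    return (value - lo) / (hi - lo)


def dispatch_priorities(jobs, w_priority=1.0, w_urgency=0.1, w_ticket=0.01):
    """Return {job name: dispatch priority} for a list of pending jobs.

    Each job is a dict with 'name', 'tickets' (entitlement), 'urgency'
    and 'posix' (the custom/POSIX priority, -1023..1024).
    The weights are assumptions for illustration.
    """
    tickets = [j["tickets"] for j in jobs]
    urgencies = [j["urgency"] for j in jobs]
    priorities = {}
    for j in jobs:
        # Tickets and urgency are scaled relative to the other pending jobs.
        n_ticket = normalise(j["tickets"], min(tickets), max(tickets))
        n_urgency = normalise(j["urgency"], min(urgencies), max(urgencies))
        # POSIX priority is scaled over its fixed range.
        n_posix = normalise(j["posix"], -1023, 1024)
        priorities[j["name"]] = (w_priority * n_posix
                                 + w_urgency * n_urgency
                                 + w_ticket * n_ticket)
    return priorities


# Hypothetical pending list.
pending = [
    {"name": "a", "tickets": 100, "urgency": 0, "posix": 0},
    {"name": "b", "tickets": 0, "urgency": 1000, "posix": 0},
    {"name": "c", "tickets": 100, "urgency": 1000, "posix": -100},
]
ranked = sorted(dispatch_priorities(pending).items(),
                key=lambda kv: kv[1], reverse=True)
```

With these assumed weights the custom priority dominates: job `c` has the most tickets and urgency but still ranks below `b` because its owner lowered its POSIX priority.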
Entitlement
● Priority based on users/groups
● Can be explicit (user A's jobs before user B's)
● Can allocate a ratio of resources (group A gets 60% of CPU usage, group B gets 40%)
● Share tree allows the allocation to be spread over a defined time period.
● Need to configure information for users/groups
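The 60/40 split and the "spread over a time period" idea can be illustrated with a flat, two-group share calculation. This is a deliberately simplified sketch, not the share tree algorithm itself: the half-life decay, the function names and the sample numbers are all assumptions made for illustration.

```python
def decayed_usage(samples, now, half_life):
    """Sum (timestamp, cpu_seconds) samples, halving their weight every half_life seconds."""
    return sum(cpu * 0.5 ** ((now - t) / half_life) for t, cpu in samples)


def entitlement_gap(shares, usage):
    """Target fraction minus actual fraction per group; positive means under-served."""
    total_shares = sum(shares.values())
    total_usage = sum(usage.values())
    gaps = {}
    for group, share in shares.items():
        target = share / total_shares
        actual = usage[group] / total_usage if total_usage else 0.0
        gaps[group] = target - actual
    return gaps


# Hypothetical numbers: group A is entitled to 60%, group B to 40%,
# but both have consumed the same amount of CPU recently.
shares = {"A": 60, "B": 40}
usage = {
    "A": decayed_usage([(0, 100.0)], now=0, half_life=3600),
    "B": decayed_usage([(0, 100.0)], now=0, half_life=3600),
}
gaps = entitlement_gap(shares, usage)
```

In this example group A has a positive gap, so its pending jobs would deserve a boost until actual usage drifts back towards 60/40.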
Share tree example
Urgency
● Deadline contribution
  – Priority rises closer to the deadline specified at submission
● Wait time contribution
  – Priority rises with time spent waiting
● Resource contribution
  – Can assign urgency to a resource (e.g. Matlab licenses)
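The three contributions can be sketched as one sum. This is a rough sketch under stated assumptions: the weight values, the one-second clamp on the remaining time, and the `matlab` resource name are illustrative, not values taken from our configuration.

```python
def urgency(now, submitted, deadline=None, requests=None, resource_urgency=None,
            weight_waiting_time=0.01, weight_deadline=3_600_000.0):
    """Urgency as resource + wait time + deadline contributions.

    Times are in seconds. Both weights are assumed values for illustration.
    """
    requests = requests or {}
    resource_urgency = resource_urgency or {}

    # Wait time: grows linearly while the job sits in the pending list.
    wait_contribution = (now - submitted) * weight_waiting_time

    # Deadline: grows as the remaining time shrinks (clamped to 1 second).
    if deadline is None:
        deadline_contribution = 0.0
    else:
        remaining = max(deadline - now, 1)
        deadline_contribution = weight_deadline / remaining

    # Resource: each requested resource adds its configured urgency per unit.
    resource_contribution = sum(resource_urgency.get(name, 0) * amount
                                for name, amount in requests.items())

    return resource_contribution + wait_contribution + deadline_contribution


# Hypothetical job: waited 1000 s, asks for one Matlab license valued at 1000.
example = urgency(now=1000, submitted=0,
                  requests={"matlab": 1}, resource_urgency={"matlab": 1000})
```

Giving a scarce resource a high urgency pushes the jobs that need it to the front, so licenses are not left idle behind a backlog of ordinary jobs.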
Custom
● Allows for prioritisation based on site-specific requirements
● Run an arbitrary script which alters priority.
● Defaults to POSIX priority (like nice)
  – Users can lower priority
  – Admins can raise priority
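The "users lower, admins raise" rule can be written as a small check. This is a sketch only: the function name is hypothetical, and the -1023..1024 range is the one accepted by the `-p` option of `qsub`.

```python
def validate_posix_priority(requested, is_admin):
    """Return the priority if this caller is allowed to request it."""
    if not -1023 <= requested <= 1024:
        raise ValueError("priority must be between -1023 and 1024")
    if requested > 0 and not is_admin:
        # Ordinary users start at 0 and may only move downwards.
        raise PermissionError("only admins can raise priority above 0")
    return requested
```

A user calling `validate_posix_priority(-100, is_admin=False)` gets `-100` back, while asking for `+100` raises `PermissionError`.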
Summary
● Can control job execution based on
  – Queues: assign specific execution hosts to specific tasks or users/groups. Queues can be calendar controlled.
  – Scheduler: prioritise jobs based on who submitted them or what resources they require.
Current setup
● Single queue containing all nodes in a cluster
● Limited user/group support (FC5)
● Allocates equal priority to each user with jobs in the pending queue.
It's mostly downhill from here
Gathering job data
● Sun dbwriter
● Java program reads the accounting/reporting file and populates a PostgreSQL database (42GB footprint).
● Data from Dec/Jan until yesterday, “with holes”
● Difficult to analyse some jobs (parallel, stopped jobs)
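For quick checks without the database, the accounting file can also be read directly. The sketch below is an alternative to dbwriter, not how the stats in these slides were produced. It assumes the colon-separated layout of the Grid Engine accounting file and uses only its first 14 fields; the sample records are made up. A job name containing a colon would break this simple split.

```python
from collections import Counter

# First 14 fields of an accounting record, in file order.
FIELDS = ["qname", "hostname", "group", "owner", "job_name", "job_number",
          "account", "priority", "submission_time", "start_time", "end_time",
          "failed", "exit_status", "ru_wallclock"]


def parse_accounting(lines):
    """Yield one dict per accounting record, skipping comments and short lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(":")
        if len(parts) < len(FIELDS):
            continue
        record = dict(zip(FIELDS, parts))
        for key in ("submission_time", "start_time", "end_time"):
            record[key] = int(record[key])
        yield record


def runtime_histogram(records, max_hours=9):
    """Percentage of jobs per one-hour runtime bin, with a final 'N+' bin."""
    counts = Counter()
    total = 0
    for rec in records:
        hours = (rec["end_time"] - rec["start_time"]) // 3600
        label = f"{hours}-{hours + 1}" if hours < max_hours else f"{max_hours}+"
        counts[label] += 1
        total += 1
    return {label: 100.0 * n / total for label, n in counts.items()} if total else {}


# Made-up sample records (not real cluster data).
sample = [
    "all.q:node01:users:alice:sim:101:sge:0:1000:1010:1100:0:0:90",
    "all.q:node02:users:bob:test:102:sge:0:1990:2000:2060:0:0:60",
    "all.q:node01:users:alice:fit:103:sge:0:2500:3000:10200:0:0:7200",
    "all.q:node03:users:carol:mc:104:sge:0:3900:4000:44000:0:0:40000",
]
histogram = runtime_histogram(parse_accounting(sample))
```

The same records also give wait time per job as `start_time - submission_time`.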
How many jobs
[Chart “Job throughput”: series Hermes, Lion, Lutzow, Townhill; y-axis 0 to 1,000,000 jobs.]
Hmm, that's a lot of short jobs
[Chart “% of jobs by runtime”: x-axis run time (hours), bins 0-1 to 9+; y-axis % of jobs, 0 to 100; series Hermes, Lion, Lutzow, Townhill, Average.]
That's really a lot of short jobs
● Remember all those scripts?
● How many of these jobs actually run for any length of time?
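One way to answer that question is to split jobs at a runtime threshold. A minimal sketch, assuming runtimes in seconds; the 3-minute cut-off matches the one used in the following slides, and the sample runtimes are made up.

```python
SHORT_JOB_SECONDS = 3 * 60  # 3-minute cut-off


def short_job_share(runtimes, threshold=SHORT_JOB_SECONDS):
    """Return (percent of jobs shorter than threshold, the remaining runtimes)."""
    if not runtimes:
        return 0.0, []
    kept = [r for r in runtimes if r >= threshold]
    percent_short = 100.0 * (len(runtimes) - len(kept)) / len(runtimes)
    return percent_short, kept


# Made-up runtimes in seconds (not real cluster data).
runtimes = [5, 12, 170, 180, 3600, 90000]
percent_short, kept = short_job_share(runtimes)
```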
How many jobs (>3min)
[Chart: x-axis hosts; y-axis total jobs, 0 to 550,000; series Hermes, Lion, Lutzow, Townhill.]
Remove the <3min jobs
[Chart “% of jobs by runtime (no short jobs)”: x-axis runtime (hours), bins 0-1 to 9+; y-axis % of jobs, 0 to 100; series Hermes, Lion, Lutzow, Townhill, Average.]
[Chart “% Run length”: x-axis job length (cpuhours), 1 to 24 and 24+; y-axis % of system run time, 0 to 60; series Hermes, Lion, Lutzow, Townhill, Average.]
[Chart “% Run time in days”: x-axis job run length (cpudays), 1 to 28 and 28+; y-axis % of system time, 0 to 70; series Hermes, Lion, Lutzow, Townhill, Average.]
[Chart “% Jobs By Submission Time”: x-axis time of day, hourly bins 00-01 to 23-00; y-axis % of jobs, 0 to 18; series Hermes, Lion, Lutzow, Townhill, Average.]
[Chart “Wait time”: x-axis wait time (hours), 1 to 24 and 24+; y-axis % of jobs, 0 to 90; series Hermes, Lion, Lutzow, Townhill, Average.]
[Chart “free slots”: x-axis time of day, 0 to 23; y-axis free slots, 0 to 18; series Hermes, Lion, Lutzow, Townhill, Average.]
Tentative conclusions
● Could add more submit hosts/backup scheduler for redundancy (virtualisation).
● Need to set up a queue to handle short jobs with quick turnaround
● Also need a preempted queue for longer running jobs.
● User scripts can muddy the water; can't assume quiet time for system admin tasks