Workflows Scheduler for Monte Carlo samples production at CMS
Author: Julius Skripkauskas, Vilnius University, Mathematics and Informatics Faculty
Supervisors: Jean-Roch Vlimant, Giovanni Franzoni




Project description
• A timetable is displayed so that a user can check the position of requests and interact with a variety of configurations.
• The goal is to predict the completion time of single samples as well as of whole production campaigns, taking into account the several concurrent production campaigns and the constraints on CMS computing resources.


What is an MC production request?
• MC production requests are requests for the production of Monte Carlo samples.
• They are submitted by physicists.
• Each request carries its own information, such as the number of events, keywords, and the time it takes to compute an event. The “event” is the main element of a request.
• A single event is a single simulation of some type of particle collision.
• A request is a set of N simulations of the same type.
• A single request is displayed as a single block in the timetable.
• Requests are kept in the MCM database.
• MCM (Monte Carlo Management) is a replacement for PREP for sample request management.


Requests data format
• Attributes of a converted request (request data in the scheduler database):
  • Id
  • Width – the time it takes to complete the request.
  • Height – the number of events per time unit.
  • Type:
    • Priority – requests scheduled by importance.
    • Deadline – requests scheduled by importance and by the date by which they have to be finished.
  • Keywords
  • Group
  • Source
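The attribute list can be pictured as a small record type. The sketch below is only an illustration: the field names follow the slide, but the `convert` helper, its parameters, and the assumption that a request runs on a fixed number of parallel slots are hypothetical and not taken from the MCM or scheduler code. It relies on the one relation implied by the definitions: width multiplied by height gives the total number of events.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ConvertedRequest:
    """One timetable block; fields mirror the attribute list above."""
    id: str
    width: float   # time needed to complete the request (seconds here)
    height: float  # events produced per time unit (events per second here)
    type: str      # "priority" or "deadline"
    keywords: List[str] = field(default_factory=list)
    group: str = ""
    source: str = ""


def convert(prepid, total_events, time_per_event, slots, request_type="priority"):
    """Hypothetical conversion: `slots` parallel slots, each finishing one
    event every `time_per_event` seconds (an assumption, not the MCM schema)."""
    height = slots / time_per_event   # events per second across all slots
    width = total_events / height     # seconds until all events are done
    return ConvertedRequest(id=prepid, width=width, height=height, type=request_type)


# Made-up numbers, purely for illustration.
block = convert("EXAMPLE-00001", total_events=1000, time_per_event=2.0, slots=10)
print(block.width, block.height)
```

With these made-up inputs the block is 200 seconds wide and 5 events per second high, so its area equals the 1000 requested events.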


Requests data format
(Figure: request data as stored in the MCM database and in the Scheduler database.)


Scheduler
• Schedules requests and visualizes them in a graphical environment.
• Allows MC production managers to predict when the production of a certain sample will be ready.
• Predicting the evolution of the production is necessary because of the limited, distributed resources of CMS.
• Sample production uses resources from different clusters in different Tiers; one of the inputs to the scheduler is the number of such resources available.


Main parts of scheduler
(Diagram with the boxes: MC production requests; Scheduling Views; Conversion of requests data; Scheduling of converted requests data; Displaying scheduled data.)


Data paths
• Different sources – the data sources are either a database such as MCM, holding non-converted data, or a csv file uploaded to a server.
• Translator – converts requests data from the different sources. The data is either saved in the database or given directly to the scheduler, depending on its source.
• Database – holds converted requests data from other databases such as MCM.
• Scheduler – schedules the converted data to be displayed in a timetable.
• Web page – interactive web interface that displays the timetable and allows different configurations or more data to be passed to the scheduler.
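As a rough illustration of the translator step for an uploaded csv file, the sketch below reads rows, skips unusable ones, and tags each record with the label of its source. The column names (`prepid`, `total_events`, `time_per_event`), the output record layout, and the function name are assumptions made for this example; the real csv format is not described in the slides.

```python
import csv
import io


def translate_csv(text, source_label):
    """Turn csv rows into simple request records, skipping unusable rows.

    Column names are hypothetical; the actual upload format may differ.
    """
    converted = []
    for row in csv.DictReader(io.StringIO(text)):
        try:
            events = int(row["total_events"])
            time_per_event = float(row["time_per_event"])
        except (KeyError, TypeError, ValueError):
            continue  # missing column, missing value, or non-numeric value
        if events <= 0 or time_per_event <= 0 or not row.get("prepid"):
            continue  # a request without an id or with empty work is unusable
        converted.append({
            "id": row["prepid"],
            "events": events,
            "time_per_event": time_per_event,
            "source": source_label,  # label supplied by the uploading user
        })
    return converted


# Made-up upload: only the first row is complete and valid.
sample = (
    "prepid,total_events,time_per_event\n"
    "A-001,1000,2.5\n"
    "A-002,abc,1.0\n"
    ",500,1.0\n"
    "A-004,0,1.0\n"
)
records = translate_csv(sample, "my-upload")
print([r["id"] for r in records])
```

Keeping the source label on every record is what would let the web page later switch individual sources on and off.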


Scheduler – algorithm
• Previously developed by Štěpán Balcar.
• The algorithm does not have to find the best solution.
• The timetable is represented as a permutation, inspired by the Smith Evolution Algorithm.
• The permutation determines the order in which tasks are sequentially inserted.
• An inserter function tries to insert tasks into all positions in the free space.
• Available positions are sorted according to the direction in which insertion is done, inspired by the algorithm of Hwang et al.
• The asymptotic complexity is O(N * log(N) + A * N).
  • A = number of operations needed to insert one block.
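The idea of a permutation driving sequential insertion can be sketched as follows. This is a deliberately simplified first-fit inserter, not Štěpán Balcar's implementation: time is discretized into integer units, the Y axis is modelled as a per-time-unit counter of used slots rather than as exact rectangles, and all names are hypothetical. It is also far less efficient than the complexity quoted above.

```python
def insert_in_order(blocks, order, capacity, horizon):
    """Place blocks one by one, in the order given by the permutation `order`.

    blocks   : list of (name, width, height) with integer width in time units
    order    : permutation of block indices (the timetable representation)
    capacity : number of slots available at any time (Y axis)
    horizon  : number of time units in the timetable (X axis)
    """
    used = [0] * horizon          # slots already occupied in each time unit
    placed = {}
    for index in order:
        name, width, height = blocks[index]
        # Try every start position from the left; take the first that fits.
        for start in range(horizon - width + 1):
            span = range(start, start + width)
            if all(used[t] + height <= capacity for t in span):
                for t in span:
                    used[t] += height
                placed[name] = start
                break
    return placed


# Made-up blocks: (name, width, height).
blocks = [("a", 2, 3), ("b", 2, 3), ("c", 1, 2)]
placed = insert_in_order(blocks, order=[0, 1, 2], capacity=5, horizon=6)
print(placed)
```

In this toy run block "b" cannot share any time unit with "a" (3 + 3 slots exceed the capacity of 5), so it starts at time 2, while the smaller block "c" still fits next to "a" at time 0.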


Scheduler – algorithm (1)
• First, insert the requests with a deadline.
• Deadlines do not exist yet.
• Insertion proceeds from the bottom right corner into the area bounded by the deadline.
• Requests are inserted one by one in ascending order of priority.


Scheduler – algorithm (2)
• Insert priority blocks into the free spaces (the empty space between deadlines).
• Insertion proceeds from the bottom left corner.
• Blocks are inserted one by one in descending order of priority.
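The two phases differ in the order in which requests are visited and in the direction in which candidate positions are tried. The sketch below only builds that visiting plan; it does not place anything. The record layout, the integer time units, and the convention that a larger `priority` number means a more important request are assumptions for illustration.

```python
def insertion_plan(requests, horizon):
    """Return [(request id, candidate start times)] in visiting order.

    Phase 1: deadline requests, ascending priority, latest start first
             (insertion from the right, bounded by the deadline).
    Phase 2: priority requests, descending priority, earliest start first
             (insertion from the left).
    """
    deadline = sorted((r for r in requests if r["type"] == "deadline"),
                      key=lambda r: r["priority"])
    priority = sorted((r for r in requests if r["type"] == "priority"),
                      key=lambda r: r["priority"], reverse=True)

    plan = []
    for r in deadline:
        latest = min(r["deadline"], horizon) - r["width"]
        plan.append((r["id"], list(range(latest, -1, -1))))
    for r in priority:
        plan.append((r["id"], list(range(0, horizon - r["width"] + 1))))
    return plan


# Made-up requests; widths and deadlines are in integer time units.
requests = [
    {"id": "p_low", "type": "priority", "priority": 1, "width": 2},
    {"id": "d_hi", "type": "deadline", "priority": 9, "width": 2, "deadline": 5},
    {"id": "p_hi", "type": "priority", "priority": 7, "width": 3},
    {"id": "d_lo", "type": "deadline", "priority": 2, "width": 1, "deadline": 3},
]
plan = insertion_plan(requests, horizon=6)
for request_id, starts in plan:
    print(request_id, starts)
```

An inserter would walk each candidate list and take the first start time at which the block fits in the free space.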


Scheduler – older version
• Developed by Štěpán Balcar.
• Used fake data, which did not reflect real data well enough.
• Fully implemented scheduling algorithm (placing request blocks in the timetable).
• Almost no interactivity or modification in the scheduler.
• Defined the format of converted requests data.


Scheduler – new version
(Screenshot of the scheduler with the Y axis, the X axis and the coloring controls labelled.)
Developed by Julius Skripkauskas: https://cms-pdmv.cern.ch/scheduler/


Scheduler – new version
• X axis – period of time. Scheduled requests (colorful blocks in the timetable) have a width in days, hours and minutes that occupies part of the X axis. The X axis can be modified to display a longer or shorter period of time by choosing dates from the dropdown lists.
• Y axis – available resources (slots, processing power) for computing requests. The default value is ~86000 slots; the Y axis (number of slots) can be modified by entering a number into the slots input field.
• Coloring – recoloring of already scheduled data by the desired option. The option is chosen by clicking the button with the option name on it. Additionally, a list of color labels is displayed to help distinguish among the colors and requests.


Scheduler – new version
Colored by the “Member of Campaign” option.

The label shows that “Spring14dr” is brown; requests with “Spring14dr” are also colored brown.


Scheduler – new version
Drop-down lists change the displayed part of the X axis.

Input fields:
“Slots” – number of slots available; after entering a number and clicking “Redraw”, everything is recalculated with the new Y axis.
“Keywords” – filters requests by the keywords in them; after entering keywords such as the status “new” and clicking “Redraw”, everything is recalculated using only the requests that have that keyword.

Data source configuration window – allows uploading data in csv files to the server. Csv data is uploaded with a label provided by the user. The sources of data may be configured by the user by clicking on checkboxes. After configuration and a reload of the page, data is scheduled and displayed from the previously chosen sources.


Scheduler – filtering by keywords example

Scheduled after filtering the requests down to those of the “Spring14dr” campaign. As we can see, production of all “Spring14dr” requests takes about 4-5 days.
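Keyword filtering of this kind can be sketched in a few lines. The record layout and the function name are hypothetical, and two behaviours are assumptions rather than documented facts: matching is case-insensitive, and a request must carry every entered keyword to be kept.

```python
def filter_by_keywords(requests, keywords):
    """Keep only the requests whose keyword list contains every given keyword."""
    wanted = [k.lower() for k in keywords]
    kept = []
    for request in requests:
        present = {k.lower() for k in request["keywords"]}
        if all(k in present for k in wanted):
            kept.append(request)
    return kept


# Made-up requests for illustration.
requests = [
    {"id": "r1", "keywords": ["Spring14dr", "new"]},
    {"id": "r2", "keywords": ["Spring14dr", "done"]},
    {"id": "r3", "keywords": ["Fall13", "new"]},
]
campaign_only = filter_by_keywords(requests, ["Spring14dr"])
campaign_new = filter_by_keywords(requests, ["Spring14dr", "new"])
print([r["id"] for r in campaign_only], [r["id"] for r in campaign_new])
```

Only the filtered list would then be handed to the scheduler, which is why the timetable is recalculated rather than merely hidden in part.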


Scheduler – new version
• Real data – requests from the MCM database, in other words input from the actual production status of CMS, as well as csv files.
• A lot of faulty data is discarded.
• A lot more interactivity and modification is allowed:
  • Data is scheduled on the fly, with no scheduling on past dates.
  • The user chooses the range of the schedule to be displayed (by specifying start and end dates).
  • Possibility to choose a different coloring besides coloring by priority.
  • Ability to filter by keywords (prepid, energy, source, status, etc.).
  • Functionality to modify the number of slots available.
  • Ability to upload either temporary or permanent requests data in csv files.
  • Simple interface for configuring the source list (which sources are to be used).
  • Variety of tooltips and performance improvements.


Improvements
• Possibility to allow more modifications, for example modifying the information of a single request.
• Improve the scheduling algorithm, perhaps by calculating and splitting the number of priority-type requests that fit into each free interval, and then scheduling the intervals at the same time using multi-threading.
• Improve the graphical design of the scheduler; hire a designer.
