
8-Dec-15, T. Wildish / Princeton

CMS analytics

A proposal for a pilot project

CMS Analytics


Goals for this project

• Learn how to run an analytics project in CMS

• Find holes in our monitoring, plug them

• Improve the way we use computing resources in Run-2

• First of a series, expand scope and ambition along with our experience


How can we make more efficient use of our computing resources?

• CPU: CMS has a shortage of CPU

• Disk: we have ~enough, but don’t use it well

• Network: we treat it as free and infinite

• Budgets: flat, if we’re lucky

• Manpower: shrinking

• Expectations for Run-2: very high!

• => Need greater understanding of how we do things now so we can improve them


How do we understand our current system better than we do now?

• Simple monitoring not enough

– Monitor for debugging in near-time, not good enough for long term

– Monitoring per sub-system not well integrated, hard to find correlations between them

• E.g. interference between user stageout of files from batch jobs and scheduled transfers?

– But we have all that data – in principle!

• => Need to consolidate our monitoring and figure out how to use it!
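
A minimal sketch of the kind of cross-subsystem check described above, assuming two monitoring streams have already been binned by hour. The series names, the numbers and the helper functions are all hypothetical; they only illustrate aligning two independently collected series and testing for a correlation between them.

```python
from math import sqrt


def align(series_a, series_b):
    """Keep only the time bins present in both series, in time order."""
    common = sorted(set(series_a) & set(series_b))
    return [series_a[t] for t in common], [series_b[t] for t in common]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up hourly rates (MB/s) at one site, keyed by hour.
# Hypothetical: user stageout from batch jobs vs. scheduled transfers.
stageout_rate = {0: 10, 1: 40, 2: 80, 3: 20, 4: 5}
transfer_rate = {1: 90, 2: 50, 3: 110, 4: 120, 5: 100}

xs, ys = align(stageout_rate, transfer_rate)
r = pearson(xs, ys)
print(f"correlation over {len(xs)} common hours: {r:.2f}")
```

A strongly negative value here would only hint at interference between the two activities; it would not by itself establish cause.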


How do we start?

• Top-down approach:

– Lots of planning, meetings, targets, manpower…

– Unlikely to deliver anything useful with long projects

– We don’t even know exactly what we want yet!

• Bottom-up approach:

– Pick a few things we want to learn about the system

– Start pilot projects to see how we can measure them

– Incremental improvements, learn as we go


Pilot project: Predicting the popularity of new datasets before they are delivered

• We have popularity data going back ~3 years

• We have DDP that can make more replicas of popular data or delete unpopular data

• Don’t know how many replicas to make of a dataset before it’s been used for a while

• Can we predict popularity of a new dataset?

– Want to do this for data and MC

– Want to do this before the data becomes available


Predicting popularity: inputs

• Data type/content

– Physics triggers or MC parameters (lepton, jets…)

– Software processing steps (RECO, AOD…)

– Software version (look for behavioural interactions)

– For MC: who requested it (which physics group?)

– #unique users, their physics interests and past activities

– Sources: SiteDB, HN, DBS, dashboard, CRAB…

• Popularity and replica information

– PopDB & PhEDEx (found a hole already!)


Predicting popularity: outputs

• Estimate of number of replicas needed initially

– Don’t need long term prediction, DDP takes care of that

– N.B. only care about popular data, for which having too few initial replicas creates bottlenecks

• Estimate of where to place the data

– Based on pattern of user activity

– How much CPU or I/O is typically used on similar datasets?

– Site reliability, both in past for similar data and now
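
As an illustration of the first output, the sketch below turns a predicted access rate into an initial replica count. The function name, the accesses-per-replica threshold and the min/max bounds are assumptions made up for this example, not values from CMS or DDP.

```python
from math import ceil


def initial_replicas(predicted_weekly_accesses,
                     accesses_per_replica=500,  # hypothetical capacity of one replica
                     min_replicas=1,
                     max_replicas=5):
    """Map a predicted access rate to a bounded initial replica count."""
    if predicted_weekly_accesses < 0:
        raise ValueError("predicted accesses must be non-negative")
    needed = ceil(predicted_weekly_accesses / accesses_per_replica)
    # Clamp to the allowed range; later adjustment is left to dynamic placement.
    return max(min_replicas, min(max_replicas, needed))


for accesses in (0, 1200, 10000):
    print(accesses, "->", initial_replicas(accesses))
```

Only the initial value matters in this sketch, in line with the slide's point that long-term adjustment is handled by DDP.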


Predicting popularity: method

• Basics: data import & cleaning, define a data-frame

• Try a few simple algorithms to start with:

– Clustering, decision trees…

– Recommender systems (collaborative filtering)

• Measure performance

– On historical data and on new data produced today

• This is the bulk of our learning curve
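
The loop described above (build a table, fit something simple, measure it on data the model has not seen) can be sketched as follows. This uses a plain group-mean baseline, not the clustering, decision-tree or recommender methods listed on the slide, and every record, field name and number is invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical historical records:
# (data tier, requesting physics group, accesses in the first month)
history = [
    ("AOD", "higgs", 900),
    ("AOD", "higgs", 1100),
    ("AOD", "top", 400),
    ("RECO", "higgs", 100),
    ("RECO", "top", 60),
]


def fit_group_means(records):
    """Baseline model: mean accesses per (tier, group), plus an overall mean."""
    groups = defaultdict(list)
    for tier, group, accesses in records:
        groups[(tier, group)].append(accesses)
    overall = mean(accesses for _, _, accesses in records)
    return {key: mean(values) for key, values in groups.items()}, overall


def predict(model, tier, group):
    """Predict accesses for a new dataset; fall back to the overall mean."""
    group_means, overall = model
    return group_means.get((tier, group), overall)


def mean_abs_error(model, records):
    """Average absolute difference between predicted and observed accesses."""
    return mean(abs(predict(model, t, g) - a) for t, g, a in records)


model = fit_group_means(history)

# Hypothetical "new data produced today", used only for evaluation.
new_data = [("AOD", "higgs", 1000), ("RECO", "exotica", 200)]
mae = mean_abs_error(model, new_data)
print(f"mean absolute error on new data: {mae}")
```

A baseline like this gives a reference error that any more elaborate algorithm would need to beat before it is worth adopting.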


Next steps…

• Taking it further:

– Look for ‘conference effects’

– Look for interactions between older and newer processings of the same data

– See what else we can do with that data?

• Other analytics projects

– Data transfer latency, understanding network traffic, analysis job runtime, job-scheduler optimisation…

– https://twiki.cern.ch/twiki/bin/view/Main/CMSDataMiningForAnalysisModel


Open question: co-operation between IT and CMS?

• What do we share, and how?

– Hardware? Can we use your toys or should we get our own?

– Tools? It makes sense to converge where we can, understand why not where we can’t. OTOH, explore vs. exploit, may be useful to share trying alternatives

– Experience? Replicating mistakes is not useful

• Larger projects?

– Can we do anything together that we couldn’t do alone?

– Look forward to common projects: IT/CMS/ATLAS?


Conclusion

• CMS is embarking on a series of analytics projects

– We want to learn how to run analytics projects, what hardware, software and skills it takes

– Start a few pilot projects to bootstrap this effort

• Consolidate and improve our monitoring

– Plug gaps that will improve our analytics potential in the future

• Improve use of computing resources for >= Run-2

– Aim for incremental improvement, proven ROI
