Big Data from the LHC Commissioning: Practical Lessons from Big Science - Simon Metson (Cloudant)


Big Data from the LHC Commissioning


Practical Lessons from Big Science

Simon/@drsm79

Hello!

Bristol University, Cloudant

Time at places I’ve worked

[Chart: 2002-2013, 0-100 scale, legend: Python, Perl, Bash, C++, Java, Javascript, Fortran]

The formula

G * E

(G: fixed. E: usually fixed.)

The formula

Grant * Effectiveness
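One way to write the same idea down (a sketch; "Output" is my label, not the talk's): the grant is fixed, so the only term left to work on is effectiveness.

    \text{Output} = G \times E,\qquad G \text{ fixed} \;\Rightarrow\; \text{Output} \propto E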

The life of LHC data

1. Detected by experiment

2. “Online” filtering (hardware and software)

3. Transferred to CERN main campus, archived & reconstructed

4. Transferred to T1 sites, archived, reconstructed & skimmed

5. Transferred to T2 sites, reconstructed, skimmed, filtered & analysed

6. Written into locally analysable files, put on laptops

7. Turned into a plot in a paper


Dig big tunnels

Chain up series of “atom smashers”

Put sensitive cameras in awkward places

Record events

Process data on high end machines

http://www.chilton-computing.org.uk

The life of LHC data

1. Detected by experiment

2. “Online” filtering (hardware and software)

3. Transferred to CERN main campus, archived & reconstructed

4. Transferred to T1 sites, archived, reconstructed & skimmed

5. Transferred to T2 sites, reconstructed, skimmed, filtered & analysed

6. Written into locally analysable files, put on laptops

7. Turned into a plot in a paper

CMS online data flow

We have a big digital camera

It takes photos of this

[image courtesy of James Jackson]

which come out like this

[image courtesy of James Jackson]


CMS data flow

We have a big digital camera

Which goes into lots of computers (the HLT)

Which goes into lots of disk (the Storage Manager)

Write to HLT at ~200GB/s

Write to Storage Manager at ~2GB/s

Write to T0 at ~2GB/s
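As a rough sanity check on those rates (a sketch, not official numbers; the duty factor below is my assumption), the 200 GB/s to 2 GB/s step is a ~100x reduction in volume, and 2 GB/s sustained for part of a year lands in the same ballpark as the ~10 PB/year quoted on the next slide:

    # Back-of-envelope check of the quoted CMS data rates.
    # The duty factor (fraction of the year data is actually being written)
    # is an illustrative assumption, not a number from the talk.
    hlt_in_gb_s = 200        # write rate into the HLT farm, GB/s
    storage_gb_s = 2         # write rate to the Storage Manager / T0, GB/s

    reduction = hlt_in_gb_s / storage_gb_s
    print(f"online filtering keeps about 1/{reduction:.0f} of the data by volume")

    seconds_per_year = 365 * 24 * 3600
    assumed_duty_factor = 0.15            # LHC does not take data all year
    pb_per_year = storage_gb_s * seconds_per_year * assumed_duty_factor / 1e6
    print(f"~{pb_per_year:.0f} PB written out per year at that rate")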

The life of LHC data

1. Detected by experiment

2. “Online” filtering (hardware and software)

3. Transferred to CERN main campus, archived & reconstructed

4. Transferred to T1 sites, archived, reconstructed & skimmed

5. Transferred to T2 sites, reconstructed, skimmed, filtered & analysed

6. Written into locally analysable files, put on laptops

7. Turned into a plot in a paper

10 PB of data/year

The life of LHC data

1. Detected by experiment

2. “Online” filtering (hardware and software)

3. Transferred to CERN main campus, archived & reconstructed

4. Transferred to T1 sites, archived, reconstructed & skimmed

5. Transferred to T2 sites, reconstructed, skimmed, filtered & analysed

6. Written into locally analysable files, put on laptops

7. Turned into a plot in a paper

1PB/week

Why transfer so much data?

To process all the data taken in one year on one computer would take ~64,000 years
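The flip side of that number, as a sketch (the worker count is an illustrative order of magnitude, not an official WLCG figure):

    # If one machine needs ~64,000 years, the only way to finish in useful
    # time is to spread the work across many machines and, therefore,
    # to move the data to them.  Worker count is illustrative.
    single_machine_years = 64_000
    assumed_workers = 100_000
    years = single_machine_years / assumed_workers
    print(f"~{years:.1f} years (~{years * 12:.0f} months) with {assumed_workers:,} workers")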

The life of LHC data

1. Detected by experiment

2. “Online” filtering (hardware and software)

3. Transferred to CERN main campus, archived & reconstructed

4. Transferred to T1 sites, archived, reconstructed & skimmed

5. Transferred to T2 sites, reconstructed, skimmed, filtered & analysed

6. Written into locally analysable files, put on laptops

7. Turned into a plot in a paper

Analysis

• Each analysis is ~unique

• Query language is C++

• Runs on distributed system and local resources

• A series of “cut” selections identifies interesting events (sketched below)

• Data in the final plot may be substantially reduced from the original dataset
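Analyses are really written in C++ against the experiments' frameworks, but the shape of a cut flow is language independent; a minimal sketch in Python, with invented event fields and thresholds:

    # Minimal cut-flow sketch.  Event fields and thresholds are made up for
    # illustration; real analyses run C++ over the experiment data formats.
    events = [
        {"n_muons": 2, "missing_et": 35.0, "leading_pt": 52.0},
        {"n_muons": 1, "missing_et": 80.0, "leading_pt": 24.0},
        {"n_muons": 2, "missing_et": 10.0, "leading_pt": 61.0},
    ]

    cuts = [
        ("two muons",  lambda e: e["n_muons"] >= 2),
        ("missing ET", lambda e: e["missing_et"] > 30.0),
        ("leading pT", lambda e: e["leading_pt"] > 50.0),
    ]

    selected = events
    for name, passes in cuts:
        selected = [e for e in selected if passes(e)]
        print(f"after '{name}': {len(selected)} events left")
    # The few surviving events are what end up in the final plot.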

Workflow ladder

Rungs of the ladder, from small to large:

• Private datasets (0.1-10 GB), simple computation

• Shared datasets (0.1-10 GB), simple computation

• Shared datasets (10-100 GB), simple computation

• Shared datasets (10-500 GB), complex computation

• Shared datasets (>500 GB), complex computation

• Large datasets (>100 TB), simple computation

• Large datasets (>100 TB), complex computation

These group into three ways of working:

• Work on laptop/desktop machine, store resulting datasets to Grid storage

• Work on departmental resources, store resulting datasets to Grid storage

• Use Grid compute and storage exclusively

The other axis of the diagram is the number of users at each rung (a rough sketch of the decision logic follows).
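One plausible reading of that ladder as code (a sketch; the exact mapping of rungs to resources is my reading of the diagram, and the function is purely illustrative):

    # Toy restatement of the workflow ladder: pick where to run based on
    # dataset size and computation complexity.  The thresholds echo the
    # slide; the rung-to-resource mapping is an assumption.
    def where_to_work(dataset_gb: float, complex_computation: bool) -> str:
        if dataset_gb <= 10 and not complex_computation:
            return "laptop/desktop, store results to Grid storage"
        if dataset_gb <= 100 and not complex_computation:
            return "departmental resources, store results to Grid storage"
        return "Grid compute and storage exclusively"

    print(where_to_work(5, False))         # small shared/private dataset
    print(where_to_work(50, False))        # mid-sized shared dataset
    print(where_to_work(200_000, True))    # >100 TB, complex computation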

The life of LHC simulated data

1. Simulated by experimentalists at T0/T1/T2 sites

2. Transferred to T1 sites, archived, possibly reconstructed & skimmed

3. Transferred to T2 sites, reconstructed, skimmed, filtered & analysed

4. Written into locally analysable files, put on laptops

5. Turned into a plot in a paper

Most events get cut

“We are going to die, and that makes us the lucky ones. Most people are never going to die because they are never going to be born.”

- Richard Dawkins

Adoption & Use

Setup

• Maybe a bit different to other people

• Many sites (>100), with 100s of TB of storage and 10,000s of worker nodes

• Global system

• Why not at one site?

• politics, power budget, cost

The grid

We Have a “Big Data” Problem

We Have a Big “Data Problem”

Do what you do best, outsource the rest

What's interesting is that big data isn't interesting any more

NIH (Not Invented Here)

Define and refine workflows

Our situation

• Expert users, who are not interested in infrastructure

• Will work around things they perceive as unnecessary limitations

Disruptive users

How to engage disruptive users?

Open access

1PB/week

Open access

Our situation

• Limited resources for integration/testbed style activities

• Strange organisation

Data temperature

There is no such thing as now

Keep things as local as possible

Defining monitoring is difficult

Small files are bad, m'kay
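One way to feel why (a back-of-envelope sketch; the file sizes are illustrative, not CMS numbers): catalogues, transfers and tape systems all pay a per-file cost, so the same volume in tiny files means far more objects to track.

    # Same data volume, very different numbers of files to catalogue,
    # transfer and archive.  File sizes are illustrative.
    total_pb = 10
    total_bytes = total_pb * 1e15
    for size_mb in (1, 100, 2_000):
        n_files = total_bytes / (size_mb * 1e6)
        print(f"{total_pb} PB in {size_mb} MB files -> {n_files:,.0f} files")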

Compartmentalise metadata

Recognise, embrace and communicate failures

People are harder than computers

People are important

The formula


Consequences

• Automate all the things

• Learn to love a configuration management system

• Make sure everyone in the team knows how to interact with it

• Simple human solutions go a long way

Build good abstractions

Encourage collaboration

Workflow ladder

Rungs of the ladder, from small to large:

• Private datasets (0.1-10 GB), simple computation

• Shared datasets (0.1-10 GB), simple computation

• Shared datasets (10-100 GB), simple computation

• Shared datasets (10-500 GB), complex computation

• Shared datasets (>500 GB), complex computation

• Large datasets (>100 TB), simple computation

• Large datasets (>100 TB), complex computation

These group into three ways of working:

• Work on laptop/desktop machine, store resulting datasets to Grid storage

• Work on departmental resources, store resulting datasets to Grid storage

• Use Grid compute and storage exclusively

The other axis of the diagram is the number of users at each rung.

Summary
