
Page 1

Petabyte-scale computing for LHC

Ian Bird, CERN WLCG Project Leader

ISEF Students, 18th June 2012

Accelerating Science and Innovation

Page 2

Enter a New Era in Fundamental Science

Start-up of the Large Hadron Collider (LHC), one of the largest and most truly global scientific projects ever undertaken, is the most exciting turning point in particle physics.

Exploration of a new energy frontier

LHC ring: 27 km circumference

(Diagram: the LHC ring with the four experiments CMS, ALICE, LHCb and ATLAS, and the data they produce.)

Page 3

Date          Collaboration size   Data volume, archive technology
Late 1950's   2-3                  kilobits, notebooks
1960's        10-15                kB, punch cards
1970's        ~35                  MB, tape
1980's        ~100                 GB, tape, disk
1990's        700-800              TB, tape, disk
2010's        ~3000                PB, tape, disk


Some history of scale…

For comparison: the total LEP data set of the 1990's was a few TB; it would fit on a single tape today.

Today: one year of LHC data is ~25 PB.

Where does all this data come from?

CERN has about 60,000 physical disks to provide about 20 PB of reliable storage
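A rough sanity check of these figures (a sketch only, using the round numbers quoted above): dividing the reliable capacity by the disk count shows how much of each physical disk ends up as usable space once replication and redundancy overheads are paid.

```python
# Back-of-envelope only, using the round figures from the slide.
disks = 60_000
reliable_pb = 20

usable_tb_per_disk = reliable_pb * 1000 / disks   # PB -> TB, per disk
print(f"usable space per physical disk: ~{usable_tb_per_disk:.2f} TB")
# ~0.33 TB per disk: well below the raw size of a typical drive of the era,
# the remainder going into replication / RAID redundancy and spares.
```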


Page 5

150 million sensors deliver data …

… 40 million times per second
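To see why only a fraction of this can ever be kept, here is an illustrative calculation (the one-byte-per-sensor figure is an assumption made for the sketch, not a detector parameter):

```python
# Illustrative only: assume each sensor produced just 1 byte per bunch crossing.
sensors = 150e6            # ~150 million sensors
crossings_per_s = 40e6     # ~40 million collisions per second

raw_bytes_per_s = sensors * crossings_per_s
print(f"hypothetical raw rate: ~{raw_bytes_per_s / 1e15:.0f} PB/s")
# Even under this very modest assumption the raw rate is petabytes per
# second; only a tiny, triggered fraction can ever be written out.
```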

Page 6

• Raw data:
  – Was a detector element hit?
  – How much energy?
  – What time?

• Reconstructed data:
  – Momentum of tracks (4-vectors)
  – Origin
  – Energy in clusters (jets)
  – Particle type
  – Calibration information
  – …


What is this data?

Page 7

• HEP data are organized as Events (particle collisions)

• Simulation, Reconstruction and Analysis programs process one Event at a time
  – Events are fairly independent, which allows trivially parallel processing

• Event processing programs are composed of a number of Algorithms selecting and transforming “raw” Event data into “processed” (reconstructed) Event data and statistics
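Because events are independent, the processing model is embarrassingly parallel. The sketch below is a toy illustration of that idea (the real frameworks are large C++ applications; reconstruct() here is a hypothetical stand-in for the chain of algorithms):

```python
# Toy sketch of "one event at a time" processing with trivial parallelism.
from multiprocessing import Pool

def reconstruct(raw_event):
    # Stand-in for the algorithms that turn raw hits into reconstructed
    # quantities (tracks, clusters, particle candidates, ...).
    return {"n_hits": len(raw_event), "sum": sum(raw_event)}

def process(raw_events):
    # Events do not depend on each other, so they can simply be farmed
    # out to a pool of workers (or to thousands of grid jobs).
    with Pool() as pool:
        return pool.map(reconstruct, raw_events)

if __name__ == "__main__":
    fake_events = [[0.1, 0.5, 0.9], [0.2], [0.3, 0.7]]
    print(process(fake_events))
```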


Data and Algorithms


RAW        ~2 MB/event     Detector digitisation; triggered events recorded by the DAQ
ESD/RECO   ~100 kB/event   Reconstructed, pseudo-physical information: clusters, track candidates
AOD        ~10 kB/event    Physical (analysis) information: transverse momentum, association of particles, jets, particle ID
TAG        ~1 kB/event     Classification information, relevant for fast event selection
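A quick sketch of what these per-event sizes mean for total volume; the event count is an assumed round number, purely to show how much the derived formats shrink the data:

```python
# Per-event sizes from the table above; 10^9 events is an assumed round number.
sizes_kb = {"RAW": 2000, "ESD/RECO": 100, "AOD": 10, "TAG": 1}
events = 1_000_000_000

for tier, kb in sizes_kb.items():
    tb = events * kb / 1e9          # kB -> TB
    print(f"{tier:8s} ~{tb:6.0f} TB")
# RAW dominates the archive (petabytes), while most analyses can run
# from the far smaller AOD and TAG formats (terabytes).
```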

Page 8

(Flow diagram "Data Handling and Computation for Physics Analysis", showing: detector; event filter (selection & reconstruction); raw data; event summary data; event reprocessing; event simulation; batch physics analysis; analysis objects (extracted by physics topic); interactive physics analysis.)

Page 9


The LHC Computing Challenge

• Signal/Noise ratio: 10^-13 (10^-9 offline)

• Data volume: high rate × large number of channels × 4 experiments
  → 15 PetaBytes of new data each year (22 PB actually recorded in 2011)

• Compute power: event complexity × number of events × thousands of users
  → 200 k CPUs and 45 PB of disk storage (today: ~250 k CPUs, ~150 PB of disk)

• Worldwide analysis & funding: computing is funded locally in major regions and countries, yet efficient analysis must be possible everywhere
  → GRID technology
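Where does a number like 15 PB/year come from? A back-of-envelope sketch, with assumed round figures for trigger rate, event size and live time (the real experiments differ considerably in all three):

```python
# Back-of-envelope only; the inputs are assumed round numbers.
trigger_rate_hz = 300      # events written per second, per experiment
raw_event_mb = 2.0         # RAW size per event (see the earlier slide)
live_seconds = 1e7         # roughly a year of LHC running

pb_per_experiment = trigger_rate_hz * raw_event_mb * live_seconds / 1e9
print(f"~{pb_per_experiment:.0f} PB of RAW data per experiment per year")
# Numbers of this order, summed over four experiments with very different
# rates and event sizes, lead to the 15+ PB of new data per year.
```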

Page 10


A collision at LHC


Page 11


The Data Acquisition


Page 12


Tier 0 at CERN: Acquisition, First pass reconstruction, Storage & Distribution

Nominal design rate: 1.25 GB/sec (ions)

Achieved in 2011: 400-500 MB/sec, rising to 4-6 GB/sec during heavy-ion running
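Converting these sustained rates into volumes makes the scale concrete (simple arithmetic on the quoted figures):

```python
# Sustained data rate -> volume per day, using the rates quoted above.
def tb_per_day(rate_mb_per_s):
    return rate_mb_per_s * 86_400 / 1e6      # MB/s over a day -> TB

print(f"~{tb_per_day(450):.0f} TB/day at ~450 MB/s (2011 proton running)")
print(f"~{tb_per_day(5000):.0f} TB/day at ~5 GB/s (heavy-ion peaks)")
```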

Page 13

• A distributed computing infrastructure to provide the production and analysis environments for the LHC experiments

• Managed and operated by a worldwide collaboration between the experiments and the participating computer centres

• The resources are distributed – for funding and sociological reasons

• Our task was to make use of the resources available to us – no matter where they are located


WLCG – what and why?

Tier-0 (CERN):
• Data recording
• Initial data reconstruction
• Data distribution

Tier-1 (11 centres):
• Permanent storage
• Re-processing
• Analysis

Tier-2 (~130 centres):
• Simulation
• End-user analysis

Page 15

WLCG Grid Sites (map of Tier 0, Tier 1 and Tier 2 sites)

• Today >140 sites
• >250k CPU cores
• >150 PB disk

Page 16

WLCG Collaboration Status: Tier 0; 11 Tier 1s; 68 Tier 2 federations

Tier 0 and Tier 1 centres (from the map): CERN, Lyon/CCIN2P3, Barcelona/PIC, De-FZK, US-FNAL, Ca-TRIUMF, NDGF, US-BNL, UK-RAL, Taipei/ASGC, Amsterdam/NIKHEF-SARA, Bologna/CNAF

Today we have 49 MoU signatories, representing 34 countries:

Australia, Austria, Belgium, Brazil, Canada, China, Czech Rep, Denmark, Estonia, Finland, France, Germany, Hungary, Italy, India, Israel, Japan, Rep. Korea, Netherlands, Norway, Pakistan, Poland, Portugal, Romania, Russia, Slovenia, Spain, Sweden, Switzerland, Taipei, Turkey, UK, Ukraine, USA.

Page 17


Original Computing model

Page 18


From testing to data: Independent Experiment Data Challenges

Service Challenges, proposed in 2004, to demonstrate service aspects:
- Data transfers for weeks on end
- Data management
- Scaling of job workloads
- Security incidents ("fire drills")
- Interoperability
- Support processes

Timeline, 2004 to 2010:

SC1 Basic transfer rates

SC2 Basic transfer rates

SC3 Sustained rates, data management, service reliability

SC4 Nominal LHC rates, disk-to-tape tests, all Tier 1s, some Tier 2s

CCRC’08 Readiness challenge, all experiments, ~full computing models

STEP’09 Scale challenge, all experiments, full computing models, tape recall + analysis

• Focus on real and continuous production use of the service over several years (simulations since 2003, cosmic ray data, etc.)
• Data and Service challenges to exercise all aspects of the service – not just data transfers, but workloads, support structures, etc.

e.g. DC04 (ALICE, CMS, LHCb) and DC2 (ATLAS) in 2004 saw the first full chain of the computing models running on grids

Page 19

• In 2010+2011 ~38 PB of data were accumulated; about 30 PB more are expected in 2012

• Data rates to tape are in excess of the original plans: up to 6 GB/s in heavy-ion running (cf. nominal 1.25 GB/s)

WLCG: Data in 2010, 2011, 2012

(Plot annotations: in heavy-ion running, ALICE data into Castor exceeded 4 GB/s and overall rates to tape exceeded 6 GB/s; 23 PB of data were written in 2011, and in 2012 the rate is ~3 PB/month.)

Page 20

Large numbers of analysis users: ATLAS and CMS ~1000; LHCb and ALICE ~250

Use remains consistently high: >1.5 M jobs/day; ~150k CPU

Grid Usage

As well as LHC data, large simulation productions are always ongoing

CPU used at Tier 1s + Tier 2s (HS06.hrs/month) – last 24 months

At the end of 2010 we saw all Tier 1 and Tier 2 job slots being filled

CPU usage now >> double that of mid-2010 (inset shows build up over previous years)

In 2011 WLCG delivered ~150 CPU-millennia!

1.5M jobs/day

10^9 HEPSPEC-hours/month (~150 k CPU in continuous use)
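These figures are mutually consistent, as a quick cross-check shows (HS06 is the HEP-SPEC06 benchmark unit; the per-core value is derived here, not quoted on the slide):

```python
# Cross-check of the usage figures quoted on the slide.
hours_per_month = 730
hs06_hours_per_month = 1e9
quoted_cores = 150_000

continuous_hs06 = hs06_hours_per_month / hours_per_month       # ~1.4 million HS06
print(f"~{continuous_hs06 / quoted_cores:.0f} HS06 per core")  # ~9: plausible for 2011 CPUs

# 150k cores kept busy for a full year is 150,000 CPU-years,
# i.e. the ~150 CPU-millennia delivered in 2011.
```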

Page 21


Tier usage vs. pledges

We use everything we are given!

Page 22

• The grid really works
• All sites, large and small, can contribute
  – And their contributions are needed!


CPU – around the Tiers

Page 23

Data transfers

Global transfers > 10 GB/s (1 day)

Global transfers (last month)

CERN → Tier 1s (last 2 weeks)

Page 24

• Relies on:
  – OPN, GEANT, US-LHCNet
  – NRENs & other national & international providers

LHC Networking

Page 26

Security Services: Certificate Management Service, VO Membership Service, Authentication Service, Authorization Service

Information Services: Information System, Messaging Service, Site Availability Monitor, Accounting Service, Monitoring tools (experiment dashboards; site monitoring)

Data Management Services: Storage Element, File Catalogue Service, File Transfer Service, Grid file access tools, GridFTP service, Database and DB Replication Services, POOL Object Persistency Service

Job Management Services: Compute Element, Workload Management Service, VO Agent Service, Application Software Install Service

Today’s Grid Services

Experiments invested considerable effort into integrating their software with grid services, and into hiding this complexity from users

Page 27

Consider that:
• Computing models have evolved
• There is a far better understanding of requirements now than 10 years ago
  – and it has evolved even since the large-scale challenges
• Experiments have developed various workarounds to manage shortcomings in the middleware
• Pilot jobs and central task queues are (almost) ubiquitous
• Operational effort is often too high
  – lots of services were not designed for redundancy, fail-over, etc.
• Technology evolves rapidly, and the rest of the world also does (large-scale) distributed computing – we don't need entirely home-grown solutions
• We must be concerned about long-term support and where it will come from

Technical evolution: Background

Page 28


Computing model evolution

Hierarchy → Mesh

Page 29

• Not just bandwidth
• We are a global collaboration … but well-connected countries do better


Connectivity challenge

• Need to effectively connect everyone that wants to participate in LHC science

• Large actual and potential communities in Middle East, Africa, Asia, Latin America … but also on the edges of Europe

Page 30


• WLCG has been leveraged on both sides of the Atlantic, to benefit the wider scientific community
  – Europe:
    • Enabling Grids for E-sciencE (EGEE) 2004-2010
    • European Grid Infrastructure (EGI) 2010--
  – USA:
    • Open Science Grid (OSG) 2006-2012 (+ extension?)

• Many scientific applications


Impact of the LHC Computing Grid

Archeology, Astronomy, Astrophysics, Civil Protection, Comp. Chemistry, Earth Sciences, Finance, Fusion, Geophysics, High Energy Physics, Life Sciences, Multimedia, Material Sciences, …

Page 31

Spectrum of grids, clouds, supercomputers, etc.


Grids
• Collaborative environment
• Distributed resources (for political/sociological reasons)
• Commodity hardware (also supercomputers)
• (HEP) data management
• Complex interfaces (bug, not feature)

Supercomputers
• Expensive
• Low-latency interconnects
• Applications peer reviewed
• Parallel/coupled applications
• Traditional interfaces (login)
• Also SC grids (DEISA, TeraGrid)

Clouds
• Proprietary (implementation)
• Economies of scale in management
• Commodity hardware
• Virtualisation for service provision and for encapsulating the application environment
• Details of physical resources hidden
• Simple interfaces (too simple?)

Volunteer computing
• Simple mechanism to access millions of CPUs
• Difficult if (much) data is involved
• Control of environment check
• Community building – people involved in Science
• Potential for huge amounts of real work

Many different problems: amenable to different solutions

No right answer

Consider ALL of these as a combined e-Infrastructure ecosystem.
Aim for interoperability and combine the resources into a consistent whole.
Keep applications agile so they can operate in many environments.

Page 32

• Grid: is a distributed computing service
  – Integrates distributed resources
  – Global single sign-on (use the same credential everywhere)
  – Enables (virtual) collaboration

• Cloud: is a large (remote) data centre
  – Economy of scale – centralize resources in large centres
  – Virtualisation – enables dynamic provisioning of resources

• The technologies are not exclusive
  – In the future our collaborative grid sites will use cloud technologies (virtualisation, etc.)
  – We will also use cloud resources to supplement our own

Grid <-> Cloud??


Page 33

• We have a grid because:
  – We need to collaborate and share resources
  – Thus we will always have a "grid"
  – Our network of trust is of enormous value for us and for (e-)science in general

• We also need distributed data management
  – That supports very high data rates and throughputs
  – We will continually work on these tools

• But the rest can be more mainstream (open source, commercial, …)
  – We use message brokers more and more for inter-process communication
  – Virtualisation of our grid sites is happening
    • many drivers: power, dependencies, provisioning, …
  – Remote job submission … could be cloud-like
  – There is interest in making use of commercial cloud resources, especially for peak demand


Grids clouds??

Page 34

• Several strategies:

• Use of virtualisation in the CERN and other computer centres:
  – Lxcloud pilot + CVI dynamic virtualised infrastructure (which may include "bare-metal" provisioning)
  – No change to any grid or service interfaces (but new possibilities)
  – Likely based on OpenStack
  – Other WLCG sites are also virtualising their infrastructure

• Investigating the use of commercial clouds – "bursting"
  – Additional resources
  – Potential for outsourcing some services?
  – Prototype with the Helix Nebula project
  – Experiments have various activities (with Amazon, etc.)

• Can cloud technology replace/supplement some grid services?
  – More speculative: feasibility? timescales?


Clouds & Virtualisation

Page 35


CERN Data Centre Numbers

Systems                    7,899        Hard disks                  62,023
Processors                14,972        Raw disk capacity (TiB)     62,660
Cores                     64,623        Tape capacity (PiB)             47
Memory (TiB)                 165        Ethernet 1Gb ports          16,773
Racks                      1,070        Ethernet 10Gb ports            622
Power consumption (kW)     2,345

From http://sls.cern.ch/sls/service.php?id=CCBYNUM
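A couple of simple averages can be read off the table above (derived values, not quoted on the slide itself):

```python
# Averages derived from the CERN Data Centre numbers above.
raw_disk_tib = 62_660
hard_disks = 62_023
cores = 64_623
systems = 7_899

print(f"~{raw_disk_tib / hard_disks:.1f} TiB of raw capacity per disk")  # ~1.0
print(f"~{cores / systems:.1f} cores per system")                        # ~8.2
```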

Page 36


Evolution of capacity: CERN & WLCG