
Page 1: The WorldWide LHC computing Grid, GridPP and youcommunity.hartree.stfc.ac.uk/access/content/group/admin/e...The WorldWide LHC computing Grid, GridPP and you Christopher J. Walker C.J.Walker@qmul.ac.uk

The WorldWide LHC computing Grid,

GridPP and you

Christopher J. Walker

C.J.Walker@qmul.ac.uk


29/08/2013 S 2

Overview

• The Grid

– Motivation and history

• Thanks David Britton

– Live Demo


Introduction

The physics

The LHC

The Grid

The Experiments

July 4th 2012: Rolf Heuer, CERN Director General


David Britton, University of Glasgow

Outline

Why we need the Grid.

What is a Grid?

How the Grid works.

Grid usage and impact.

Evolution.

Summary.


Challenge – The Data Volume

[Figure: event rates for All Events, Standard Model processes (W, Z, Jets) and the Higgs]

Notionally 40 TB/sec; 200 MB/sec recorded.

Higgs? 10 orders of magnitude.
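As a rough check on the scale of the filtering, the two data rates quoted on this slide can be compared directly. The sketch below assumes decimal units (1 TB = 10^12 bytes) and treats both figures as steady rates; the function and constant names are illustrative only. This ratio is a separate quantity from the "10 orders of magnitude" label on the figure.

```python
import math

# Rates quoted on the slide, in bytes per second (decimal units assumed).
NOTIONAL_RATE = 40e12   # 40 TB/sec, notional detector output
RECORDED_RATE = 200e6   # 200 MB/sec written to storage


def reduction_factor(notional: float, recorded: float) -> float:
    """Return how many times smaller the recorded rate is than the notional one."""
    return notional / recorded


factor = reduction_factor(NOTIONAL_RATE, RECORDED_RATE)
orders = math.log10(factor)
print(f"Recorded rate is 1/{factor:,.0f} of the notional rate "
      f"(about {orders:.1f} orders of magnitude)")
```

On these numbers, only about one part in 200,000 of the notional data stream is kept.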


Challenge: Data Complexity

Multiple separate interactions during each “collision”.

TABLE OF MILLIONS

Collisions 40 million times a second, each a composite of many interactions.

150 million electronic channels on the ATLAS and CMS detectors.

15 million gigabytes (15 petabytes) of data recorded per year.

Expect a few per million recorded collisions to contain a Higgs (but individually you can’t tell that they are Higgs).


Data Pyramid

Real data pyramid: Raw Data, Reconstructed Data, Analysis Objects, Tag Data, Ntuples, H

Monte Carlo pyramid: Monte Carlo Data, Reconstructed Data, Analysis Objects, Tag Data, Ntuples, H

[Plot: Total ATLAS Disk Used, 2008 to 2012, scale to 100 PB]

1 Petabyte = 1000 Terabytes

1 Terabyte = 1000 Gigabytes

1 Gigabyte = 1000 Megabytes
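These are decimal (SI) prefixes, so each step is a factor of 1000 rather than 1024. A small helper, sketched here with hypothetical names, applies that convention to make the sizes quoted in this talk easier to compare.

```python
# Decimal prefixes, smallest first; "kB" is added here for completeness.
UNITS = ["B", "kB", "MB", "GB", "TB", "PB"]


def format_bytes(n_bytes: float) -> str:
    """Format a byte count using factors of 1000, stopping at petabytes."""
    value = float(n_bytes)
    for unit in UNITS:
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:.1f} {unit}"
        value /= 1000


print(format_bytes(15e15))   # yearly recorded data quoted earlier
print(format_bytes(200e6))   # one second of recorded data at 200 MB/sec
```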


What is the Grid?



Web vs Grid

Web: Focused historically on sharing information (high-level data: text, pictures, music, video). Allows a limited set of predetermined actions (data processing) such as search, filter, sort, stream, etc.

Grid: The idea is to share storage and computing power more directly, enabling much larger data sets to be shared with user-determined data processing.


Evolution towards Grid


Why “Grid”?


How does it work?

[Figure labels: E = mc²; Grid Middleware]


Middleware

[Diagram, single PC: the OPERATING SYSTEM sits between the hardware (CPU, disks, etc.) and the application layer (Word/Excel, Email/Web, Games, Your Program).]

[Diagram, Grid: MIDDLEWARE sits between Your Program and the resources (CPU Clusters, Disk Server), together with the User Interface Machine, Resource Broker, Information Service, Replica Catalogue and Bookkeeping Service.]

Middleware is the Operating System of a distributed computing system.


How does it work?

Getting Started

1. Get a digital certificate (UK Certificate Authority)

2. Join a Virtual Organisation (VO)

Authentication – who you are

Authorisation – what you are allowed to do


How does it work?

The details

[Diagram: numbered job flow (steps 0 to 11) between the Submitter, VOMS, the WMS (JS, RB), LFC, BDII, Logging & Bookkeeping, and four sets of Grid Enabled Resources (CPU Nodes, Storage) that make up The Grid.]

glite-wms-job-submit myjob.jdl

myjob.jdl:

JobType = "Normal";

Executable = "/sum.exe";

InputData = "LF:testbed0-00019";

DataAccessProtocol = "gridftp";

InputSandbox = {"/home/user/WP1testC", "/home/file*", "/home/user/DATA/*"};

OutputSandbox = {"sim.err", "test.out", "sim.log"};

Requirements = other.GlueHostOperatingSystemName == "linux" && other.GlueHostOperatingSystemRelease == "Red Hat 6.2" && other.GlueCEPolicyMaxWallClockTime > 10000;

Rank = other.GlueCEStateFreeCPUs;

[Diagram labels: gridui; JDL; step 0: voms-proxy-init; Job Status?]
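The job description on this slide is a flat list of `Attribute = value;` statements: strings are quoted, lists are written in braces, and expressions such as Requirements and Rank are left unquoted. The sketch below generates text in that shape from Python data. The names `jdl_literal` and `render_jdl` are hypothetical, the helper is not part of any gLite tool, and it does not escape quotes inside strings.

```python
def jdl_literal(value) -> str:
    """Render a Python value as a JDL literal: strings quoted, lists in braces."""
    if isinstance(value, (list, tuple)):
        return "{" + ", ".join(jdl_literal(item) for item in value) + "}"
    if isinstance(value, str):
        return '"' + value + '"'
    return str(value)


def render_jdl(attributes: dict, expressions: dict = None) -> str:
    """Build JDL text from plain attributes plus unquoted expressions."""
    lines = [f"{name} = {jdl_literal(value)};" for name, value in attributes.items()]
    # Requirements and Rank are expressions, so they are written without quotes.
    lines += [f"{name} = {expr};" for name, expr in (expressions or {}).items()]
    return "\n".join(lines) + "\n"


jdl_text = render_jdl(
    {
        "JobType": "Normal",
        "Executable": "/sum.exe",
        "OutputSandbox": ["sim.err", "test.out", "sim.log"],
    },
    {"Rank": "other.GlueCEStateFreeCPUs"},
)
print(jdl_text, end="")
```

Generating the file this way is mainly useful when many similar jobs differ only in a few attributes.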


Enabling Technology

CPU Development: The sustained exponential increase in the density of transistors, and the corresponding fall in the cost of computational power.

Storage Development: The similar exponential increase in storage density, and the corresponding fall in the cost of data storage.

Network Development: The similar exponential increase in the available bandwidth, and the corresponding fall in the cost of moving data.

[Plot: Density of storage]


When is a Grid useful?

Problems that are highly parallelizable

[Diagram: Problem, Grid, Solution]

Input data is independent, e.g. images. These pieces may be independent.

Simulation using different parameters: A=2, B=3; A=3, B=3; A=2, B=4.

Not so good for closely coupled problems. These pieces will have to interact.
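A parameter sweep like the one on this slide is a good fit because no run needs anything from another run. The sketch below uses local threads as a stand-in for separate grid jobs; `simulate` is a hypothetical toy workload, not anything from the talk.

```python
from concurrent.futures import ThreadPoolExecutor

# The three (A, B) parameter sets shown on the slide.
PARAMETER_SETS = [(2, 3), (3, 3), (2, 4)]


def simulate(params):
    """Stand-in for one independent job; it only uses its own parameters."""
    a, b = params
    return a * b  # hypothetical toy "simulation"


def run_sweep(parameter_sets):
    """Run every parameter set independently; map() keeps the input order."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        return list(pool.map(simulate, parameter_sets))


results = run_sweep(PARAMETER_SETS)
for params, result in zip(PARAMETER_SETS, results):
    print(params, "->", result)
```

A closely coupled problem would not decompose this way, because each piece would need intermediate results from its neighbours while it runs.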


Structure of the Grid

Studies in the late 90’s led to a hierarchical structure.

Tier 0: CERN computer centre (online system, offline farm)

Tier 1: National centres (RAL, UK; France; Italy; Germany; USA)

Tier 2: Regional groups (ScotGrid, NorthGrid, SouthGrid, London)

Institutes (e.g. Glasgow, Edinburgh, Durham), then workstations

“wLCG” labels the whole hierarchy; “GridPP” labels the UK part (RAL, the regional groups and their institutes).

Useful model for Particle Physics but not necessary for others.


Multiple Grids

Worldwide LHC Computing Grid (WLCG) combines:

• EGI (European Grid Infrastructure).

• OSG (Open Science Grid) in the US.

• NorduGrid in the Nordic countries.

Combined Resources (August 2012):

• 152 Sites in 36 Countries

• 325,000 logical CPUs

• 210 Petabytes of disk

• 180 Petabytes of tape

Comparing number of cores (which is not a fair measure of the computing power), “CERN’s (distributed) SuperComputer” would rank 3rd in the current top-10 supercomputers worldwide.


UK Contribution

UK Resources (August 2012):

• 19 Sites

• 36,000 logical CPUs

• 21 Petabytes of disk

• 5+ Petabytes of tape

[Chart: CPU contributions in 2012; labelled contributors include USA and UK]
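Putting the UK figures next to the combined WLCG figures quoted for the same month gives a feel for the UK share. The sketch below simply divides one set of numbers by the other; tape is left out because the UK figure is only given as a lower bound ("5+"). The dictionary keys and function name are illustrative.

```python
# Figures quoted for August 2012 on the WLCG and UK slides.
WLCG = {"sites": 152, "logical_cpus": 325_000, "disk_pb": 210}
UK = {"sites": 19, "logical_cpus": 36_000, "disk_pb": 21}


def share_percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100 * part / whole


uk_share = {key: share_percent(UK[key], WLCG[key]) for key in UK}
for key, pct in uk_share.items():
    print(f"{key}: {pct:.1f}%")
```

On these numbers the UK provides roughly a tenth of the combined sites, CPUs and disk.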


Grid in Action



Data Transfer

Nominal design rate was 1.3 GB/s


Moving Data – Quick quiz

• With a 1 Gbit/s connection, how long does it take to move

– 1 GB (Gbyte)?

– 1 TB?

– 10 TB?
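One way to work the quiz is to convert bytes to bits and divide by the link speed. The sketch below assumes decimal units, a 1 Gbit/s link that is fully available, and no protocol overhead, so real transfers will take longer; `transfer_seconds` is an illustrative name.

```python
def transfer_seconds(size_bytes: float, link_bits_per_s: float = 1e9) -> float:
    """Ideal transfer time: 8 bits per byte, link fully used, no overhead."""
    return size_bytes * 8 / link_bits_per_s


for label, size in [("1 GB", 1e9), ("1 TB", 1e12), ("10 TB", 10e12)]:
    seconds = transfer_seconds(size)
    print(f"{label}: {seconds:,.0f} s ({seconds / 3600:.1f} h)")
```

Under these assumptions the answers are about 8 seconds, a little over 2 hours, and nearly a day.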


Worldwide Usage


1 million jobs/day


Impact

“Wealth Creation”: Immense (pictures); Cambridge Ontology (n-grams); Econophysica (financial); Total Oil (exploration); Constellation Tech (software)

“Quality of Life”: Avian Flu (biomed); Malaria (Wisdom project); Landslide prediction; nano-CMOS; photonics; etc.


Evolution-I

Tier-structure for wLCG designed in the late 90’s assumed 600 Mbps links. Today’s multi-Gigabit links enable a more flexible and robust architecture. May increase complexity.

New CPU architectures require more application development in order to exploit the increase in computing capacity. This is a challenge for legacy code.

Although storage density continues to increase, it is getting more difficult to use, which puts demands on the architecture and applications, increasing the complexity.

[Plots: Maximum Sustained Bandwidth; Density of storage]

Evolution of computing models: Hierarchy to Mesh


Evolution-II

[Diagram, shown twice: MIDDLEWARE layer on the Grid connecting Your Program to CPU Clusters and Disk Server, with the User Interface Machine, Resource Broker, Information Service, Replica Catalogue and Bookkeeping Service.]

Too much middleware actually resides in the application-layer and is unique to an individual user group (virtual organisation).

In addition, there are multiple middleware stacks (gLite, ARC, Unicore, etc.) used by different user groups.

Some degree of rationalisation and consolidation is required; this is a natural part of the process when working in a development environment.


Evolution-III

The boundary between web and Grid has become blurred as Grid ideas are taken up. The web is becoming much more machine-readable; data movement is becoming more automated and more extensive as bandwidth improvements enable new services.

The Grid is also about collaboration: This is somewhat different in the commercial world, where partners tend not to share internal infrastructure but out-source to a third party. So we’ve seen the growth of “Cloud Computing”. Again, the boundaries are becoming blurred, with Grids of Clouds and Clouds of Grids likely in the future.


Summary


A Large Hadron Collider: delivering collisions up to 40 million times per second

A Global Supercomputer


Support for Non LHC VOs

VO usage (3 months)


Non LHC VO share

http://pprc.qmul.ac.uk/~walker/votable.html


Demo

• Submitting a job

– Helloworld

– Running a script

• Managing data

– Copying a file

• LFC

– Mounting via WebDAV

• In future redirect via LFC


Hello world example

walker@heplt019:~/talks/2013-daresbury/londongrid-example$ cat helloworld.jdl

#############Hello World#################

Executable = "/bin/echo";

Arguments = "Hello welcome to londongrid ";

StdOutput = "hello.out";

StdError = "hello.err";

OutputSandbox = {"hello.out","hello.err"};

######################################
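Submitting a file like this usually follows a fixed sequence: create a VOMS proxy, submit, poll the status, then fetch the output sandbox. The sketch below only builds and prints the command lines, it does not run them. The flags shown are ones commonly used with the gLite WMS client tools, and the VO name, job ID file and output directory are placeholders, so check them against the tools installed on your own UI machine.

```python
import shlex


def job_lifecycle_commands(jdl_path, vo, jobid_file="jobids.txt", output_dir="output"):
    """Return the usual command sequence for one WMS job as argument lists."""
    return [
        # Create a short-lived proxy carrying VO membership.
        ["voms-proxy-init", "--voms", vo],
        # Submit with automatic delegation; job IDs are appended to jobid_file.
        ["glite-wms-job-submit", "-a", "-o", jobid_file, jdl_path],
        # Query the status of the jobs listed in jobid_file.
        ["glite-wms-job-status", "-i", jobid_file],
        # Retrieve the OutputSandbox files once the job is done.
        ["glite-wms-job-output", "--dir", output_dir, "-i", jobid_file],
    ]


# "my.vo.example" is a placeholder VO name, not one taken from the talk.
commands = job_lifecycle_commands("helloworld.jdl", "my.vo.example")
for argv in commands:
    print(shlex.join(argv))
```

For this job, the retrieved output directory would be expected to contain hello.out and hello.err, as listed in the OutputSandbox.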


“Live Demo – mounting storage”

heplt019:~# mount -t davfs https://hepgrid11.ph.liv.ac.uk/dpm/ph.liv.ac.uk/home/ /mnt/liverpool
Please enter the username to authenticate with server
https://hepgrid11.ph.liv.ac.uk/dpm/ph.liv.ac.uk/home/ or hit enter for none.
Username:
Please enter the password to authenticate user with server
https://hepgrid11.ph.liv.ac.uk/dpm/ph.liv.ac.uk/home/ or hit enter for none.
Password:
Please enter the password to decrypt client
certificate /etc/davfs2/certs/private/my_cert.p12.
Password:
/sbin/mount.davfs: the server certificate is not trusted
issuer: Authority, eScienceCA, UK
subject: CSD, Liverpool, eScience, UK
identity: hepgrid11.ph.liv.ac.uk
fingerprint: 34:c1:2d:63:57:2d:ff:07:10:21:cc:1d:a7:7a:ad:58:f9:bd:4d:b0
You only should accept this certificate, if you can
verify the fingerprint! The server might be faked
or there might be a man-in-the-middle-attack.
Accept certificate for this session? [y,N] y
/sbin/mount.davfs: warning: the server does not support locks


“Live Demo”

heplt019:~# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        90G   78G  7.2G  92% /
tmpfs           2.0G     0  2.0G   0% /lib/init/rw
udev            2.0G  316K  2.0G   1% /dev
tmpfs           2.0G     0  2.0G   0% /dev/shm
https://hepgrid11.ph.liv.ac.uk/dpm/ph.liv.ac.uk/home/
                 26G   13G   13G  50% /mnt/liverpool
heplt019:~# cd /mnt/liverpool/atlas/atlasscratchdisk
heplt019:/mnt/liverpool/atlas/atlasscratchdisk# echo "hello webdav" >cjwhello
heplt019:/mnt/liverpool/atlas/atlasscratchdisk# cat cjwhello
hello webdav
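The point of the demo is that, once mounted, grid storage can be used through ordinary file operations. The same write-then-read round trip might look like this from a program. In this sketch a temporary directory stands in for the mount point so that it runs anywhere, and `write_and_read` is a hypothetical helper name.

```python
import tempfile
from pathlib import Path


def write_and_read(directory: Path, name: str, text: str) -> str:
    """Write text to a file in directory, then read it back."""
    target = directory / name
    target.write_text(text)
    return target.read_text()


# Stand-in for the mount point; on the demo machine this would be a
# directory under /mnt/liverpool instead of a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    echoed = write_and_read(Path(tmp), "cjwhello", "hello webdav\n")
print(echoed, end="")
```

Because the server reports that it does not support locks, programs that rely on file locking may not behave as they would on a local disk.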


What does this mean for you?

• GridPP expertise: Big Data

– Compute

– Data transfer over Wide area network

– Federated access

• If your problem fits our solution:

– Talk to us

• Share experience

• Some resources

– Scientific Linux (RHEL/CentOS compatible)


Lessons

• Grid good for

– embarrassingly parallel problems

• Need to deal with failure

– Bookkeeping difficult

– Ganga and Dirac solutions

• FTS for file transfers
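Dealing with failure mostly means recording every attempt and resubmitting what failed, up to some limit. The sketch below shows that bookkeeping loop in miniature. It is not how Ganga or DIRAC are implemented and uses none of their APIs; `run_with_retries` and `flaky_job` are hypothetical names, and the failures are simulated.

```python
def run_with_retries(job_ids, run_job, max_attempts=3):
    """Run each job, resubmit failures, and record attempts per job."""
    attempts = {job_id: 0 for job_id in job_ids}
    done, pending = {}, list(job_ids)
    while pending:
        retry = []
        for job_id in pending:
            attempts[job_id] += 1
            try:
                done[job_id] = run_job(job_id)
            except Exception:
                # Resubmit only while the attempt limit has not been reached.
                if attempts[job_id] < max_attempts:
                    retry.append(job_id)
        pending = retry
    failed = [job_id for job_id in job_ids if job_id not in done]
    return done, failed, attempts


_calls = {}


def flaky_job(job_id):
    """Simulated job: "b" fails once, "c" always fails, others succeed."""
    _calls[job_id] = _calls.get(job_id, 0) + 1
    if job_id == "c" or (job_id == "b" and _calls[job_id] == 1):
        raise RuntimeError(f"job {job_id} failed")
    return f"output-{job_id}"


done, failed, attempts = run_with_retries(["a", "b", "c"], flaky_job)
print("done:", done)
print("failed:", failed)
print("attempts:", attempts)
```

The attempts record is what makes the final report trustworthy: it separates jobs that succeeded after a retry from jobs that were abandoned.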


What does this mean for you?

• GridPP expertise: Big Data

– Compute

– Storage

– Data transfer over Wide area network

– Federated access

• Lots of people accessing the same data

• How do I learn more?

– Talk to me

– Talk to your local grid admin (high energy physics group)


GridPP sites in the UK

• If you want to know more about GridPP, talk to your local site admin


Conclusions

• Overview of Grid computing

– LHC and the Higgs

• GridPP

• Demo

– Submitted some jobs

– Transferred some data


Acknowledgements

• GridPP

• David Britton

– Many of the slides