Architecture and prototype of a WLCG data lake for HL-LHC

Gavin McCance, Ian Bird, Jaroslava Schovancova, Maria Girone, Simone Campana, Xavier Espinal Currul (CERN)

12/07/2018, [email protected] - CHEP2018


Page 1: Title

Architecture and prototype of a WLCG data lake for HL-LHC

Gavin McCance, Ian Bird, Jaroslava Schovancova, Maria Girone, Simone Campana, Xavier Espinal Currul (CERN)

Page 2: Motivation

HL-LHC needs exceed both the expected technology evolution (~15%/yr) and the expected funding (flat).

We need to optimize hardware usage and operational costs.


Page 3: Some ideas for reducing cost


Figure: three example sites with heterogeneous storage — Site A (EOS, 2 copies), Site B (CEPH, 3 copies), Site C (dCache, RAID-N) — interconnected by Tbps/Gbps links.

Reduce operational cost: deploy fewer (larger) storage services.

Reduce hardware cost: introduce the concept of QoS (Quality of Service); today we store more than we think.

Co-location of data and processors is not guaranteed.

Page 4: Data and Compute Infrastructures


Figure: the two infrastructures. The Data (Lake) Infrastructure comprises Distributed Storage, Distributed Regional Storage and Volatile Storage at the sites, tied together by Asynchronous Data Transfer and Storage Interoperability Services, with a Content Delivering and Caching layer on top. The Compute Infrastructure (Grid Compute at Site N, Cloud Compute, HPC Compute, @HOME) accesses the lake through Compute Provisioning and High Level Services.

Page 5: The eulake prototype

Goal: test and demonstrate some of those ideas.

We deployed a Distributed Storage prototype based on the EOS technology.

Figure: map of participating sites — CERN-CH, CERN-HU, RAL-UK, SARA-NL, NIKHEF-NL, PIC-ES, CNAF-IT, Dubna-RU, NRC-KI (PNPI) and Aarnet-AUS — with round-trip latencies to CERN ranging from 18 ms to 300 ms (Aarnet-AUS).

Page 6: Quality of Service and Transitions


Three different QoS classes were tested: 2 replicas; 5 stripes, Reed-Solomon encoded (n stripes, n-2 data, i.e. 3+2 chunks); and single copy. Transitions between classes were exercised both automatically (pre-established conversion after Δt = 1 h, driven by a 'namespace' attribute) and on demand (conversion triggered by a 'namespace' attribute change).

Dataset: 100 files of 1 GB each, written by a single client (a VM).
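The hardware-cost argument behind these QoS classes can be made concrete with some back-of-the-envelope arithmetic (illustrative only, not part of the original measurement), applied to the 100 × 1 GB dataset above:

```python
# Raw-storage cost of the three QoS classes tested on eulake,
# for the 100 x 1 GB dataset from the slide.

def raw_storage_gb(dataset_gb: float, data_stripes: int, parity_stripes: int) -> float:
    """Raw disk needed to store `dataset_gb` of user data with the given layout.

    Plain replication is the special case data_stripes=1,
    parity_stripes=(copies - 1).
    """
    return dataset_gb * (data_stripes + parity_stripes) / data_stripes

dataset = 100 * 1.0  # 100 files of 1 GB each

two_replicas = raw_storage_gb(dataset, 1, 1)  # 2 full copies
rs_3_plus_2  = raw_storage_gb(dataset, 3, 2)  # 5 stripes, RS(3+2)
single_copy  = raw_storage_gb(dataset, 1, 0)  # 1 copy, no redundancy

print(f"2 replicas : {two_replicas:.1f} GB raw")   # 200.0 GB
print(f"RS (3+2)   : {rs_3_plus_2:.1f} GB raw")    # ~166.7 GB
print(f"single copy: {single_copy:.1f} GB raw")    # 100.0 GB
```

Note that RS(3+2) tolerates the loss of any two stripes, yet needs less raw disk than 2 replicas, which tolerate only one loss — which is why layout transitions are worth automating.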

Page 7: Integration of eulake with ATLAS Data Management


We exposed eulake to the ATLAS data management system as a storage endpoint.

Data can be transferred from any ATLAS site into eulake.

We imported 4 input samples into different eulake areas for the tests that follow.

Figure: transfer monitoring — #files per source and #files per destination.

Page 8: Activities


1. Deployment and commissioning
2. Transfer tests and input data replication
3. Data access

Figures: writing into eulake (MB/s); eulake I/O (MB/s).

Page 9: HammerCloud


We integrated eulake with the HammerCloud framework.

This allows testing real workflows and data access patterns; the initial focus is on ATLAS.

Four test scenarios. Read from:

1. Local access (no eulake)
2. eulake, data@CERN, WN@CERN
3. eulake, data NOT@CERN, WN@CERN
4. eulake, 4+2 stripes, WN@CERN

Data is copied from storage to the WN.

Page 10: HammerCloud results

Low I/O intensity workflow: ~40MB input (1 file), 2 events, ~5 mins/event

High I/O intensity workflow: ~6GB input (1 file), 1000 events, ~2 seconds/event
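As a rough cross-check of the "I/O intensity" labels, the sustained input bandwidth each job needs can be estimated from the numbers quoted above (illustrative arithmetic, not from the talk):

```python
# Sustained input bandwidth implied by the two workflows,
# assuming the input is consumed evenly over the full run.

def input_rate_mb_s(input_mb: float, events: int, sec_per_event: float) -> float:
    """Average MB/s the job must read to keep up with event processing."""
    return input_mb / (events * sec_per_event)

low_io  = input_rate_mb_s(40.0,   2,    5 * 60)  # ~40 MB, 2 events, ~5 min/event
high_io = input_rate_mb_s(6000.0, 1000, 2.0)     # ~6 GB, 1000 events, ~2 s/event

print(f"low-I/O : {low_io:.3f} MB/s")   # ~0.067 MB/s
print(f"high-I/O: {high_io:.1f} MB/s")  # 3.0 MB/s
```

The high-I/O workflow needs roughly 45× more input bandwidth per job, so it is far more sensitive to remote or striped access.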

Results for the four access scenarios (reconstructed from the figure; panels paired with the workflows by their nominal per-event times and input sizes):

Scenario                  Stage-in ~6 GB (s)  Stage-in ~40 MB (s)  High-I/O (s/evt)  Low-I/O (s/evt)
Local access (no eulake)  85                  22                   2.6               545
eulake, data@CERN         186                 28                   3.5               600
eulake, data NOT@CERN     279                 40                   3.7               752
eulake, 4+2 stripes       948                 40                   5.6               686
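The per-event figures above can be normalized to the local baseline (illustrative arithmetic; it assumes the 2.6–5.6 s/evt panel belongs to the high-I/O workflow, nominal ~2 s/event, and the 545–752 s/evt panel to the low-I/O one):

```python
# Per-event slowdown of each eulake scenario relative to local access,
# computed from the per-event times on the slide.

high_io = {"local": 2.6, "data@CERN": 3.5, "data NOT@CERN": 3.7, "4+2 stripes": 5.6}
low_io  = {"local": 545.0, "data@CERN": 600.0, "data NOT@CERN": 752.0, "4+2 stripes": 686.0}

for name, times in (("high-I/O", high_io), ("low-I/O", low_io)):
    base = times["local"]  # normalize to the no-eulake baseline
    for scenario, t in times.items():
        print(f"{name:8s} {scenario:14s} {t / base:.2f}x")
```

The relative penalty is largest for the I/O-heavy workflow, with the 4+2-stripe layout costing the most over the WAN — consistent with the storage-vs-performance trade-off the QoS classes are meant to expose.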

Page 11: Conclusions

We set up a distributed storage instance based on EOS technology: small in terms of storage volume, large in the geographical sense. Deployment varieties: native EOS, EOS in Docker containers, and volume export (CEPH).

We integrated this instance with the ATLAS distributed computing services and with HammerCloud. The next step is integration with CMS.

We have a prototype in place and understand what we want to measure. We do not yet have enough statistics to draw firm conclusions.

The prototype we set up serves to test many ideas in preparation for HL-LHC.


Page 12: Acknowledgements

This work has been done in collaboration with many participating WLCG sites; credit goes to their teams.

We would like to thank ATLAS Distributed Computing, and in particular the Rucio team, for their help with the integration.

We thank the EOS experts at CERN.

Special thanks to Ivan Kadochnikov for his help in the final part of this work.
