hybrid cloud for cern

24
Hybrid Cloud for CERN Experience with Open Telecom Cloud and OpenStack Dr Helge Meinhard / CERN-IT OpenStack France Day 22-Nov-2016

Upload: helix-nebula-the-science-cloud

Post on 13-Jan-2017

55 views

Category:

Technology


0 download

TRANSCRIPT

Page 1: Hybrid Cloud for CERN

Hybrid Cloud for CERNExperience with Open Telecom Cloud and OpenStack

Dr Helge Meinhard / CERN-ITOpenStack France Day22-Nov-2016

Page 2: Hybrid Cloud for CERN

Hybrid Cloud for CERN - Helge Meinhard at CERN.ch

CERN• International organisation close to

Geneva, straddling Swiss-French border, founded 1954

• Facilities for fundamental research in particle physics

• 22 member states,1 B CHF budget

• 3’197 staff, fellows, apprentices, …• 13’128 associates

3

“Science for peace”

1954: 12 Member States

Members: Austria, Belgium, Bulgaria, Czech republic, Denmark, Finland, France, Germany, Greece, Hungary, Israel, Italy, Netherlands, Norway, Poland, Portugal, Romania, Slovak Republic, Spain, Sweden, Switzerland, United KingdomCandidate for membership: Serbia, CyprusObservers: European Commission, India, Japan, Russia, Turkey, UNESCO, United States of AmericaNumerous non-member states with collaboration agreements

2’531 staff members, 645 fellows, 21 apprentices

7’000 member states, 1’800 USA, 900 Russia, 270 Japan, …

22-Nov-2016

Birth place of World-wide Web

Page 3: Hybrid Cloud for CERN

Tools (1): LHC

Exploration of a new energy frontierin p-p and Pb-Pb collisions

Hybrid Cloud for CERN - Helge Meinhard at CERN.ch 422-Nov-2016

LHC ring:27 km circumference

Run 1 (2010-2013): 4+4 TeVRun 2 (2015-2018): 6.5 + 6.5 TeV

Page 4: Hybrid Cloud for CERN

Tools (2): Detectors

Exploration of a new energy frontierin p-p and Pb-Pb collisions

CMS

ALICE

LHCb

ATLAS

Hybrid Cloud for CERN - Helge Meinhard at CERN.ch 522-Nov-2016

Page 5: Hybrid Cloud for CERN

Tools (2): Detectors

Exploration of a new energy frontierin p-p and Pb-Pb collisions

CMS

ALICE

LHCb

ATLAS

Hybrid Cloud for CERN - Helge Meinhard at CERN.ch 522-Nov-2016

ATLAS (A Toroidal Lhc ApparatuS)• 25 m diameter, 46 m length, 7’000 tons• 3’000 scientists (including 1’000 grad students)• 150 million channels• 40 MHz collision rate• Event rate after filtering: 300 Hz in Run 1; up to

1’000 Hz in Run 2

Page 6: Hybrid Cloud for CERN

Hybrid Cloud for CERN - Helge Meinhard at CERN.ch

Results so far• Many… the most spectacular one

being• 04 July 2012: Discovery of a “Higgs-

like particle”• March 2013: The particle is indeed the

Higgs boson• 08 Oct 2013 / 10 Dec 2013: Nobel

price to Peter Higgs and François Englert• CERN, ATLAS and CMS explicitly

mentioned

622-Nov-2016

Page 7: Hybrid Cloud for CERN

Integrates computer centres worldwide that provide computing and storage resource into a single infrastructure accessible by all LHC physicists

The Worldwide LHC Computing Grid

Tier-1: permanent storage, re-processing, analysis

Tier-0 (CERN):data recording, reconstruction and distribution

Tier-2: Simulation,end-user analysis

> 2 million jobs/day

~350’000 cores

500 PB of storage

more than 170 sites, 40 countries

10-100 Gb links

An International collaboration to distribute and analyse LHC data

22-Nov-2016Hybrid Cloud for CERN - Helge Meinhard at CERN.ch 7

• Up to 6 GB/s to be permanently stored after filtering

• Almost 30 PB/y in Run 1

• Expect ~50 PB/y in Run 2

• 2023: 400 PB/y(?)

Page 8: Hybrid Cloud for CERN

Hybrid Cloud for CERN - Helge Meinhard at CERN.ch

ChallengesRun 2:• Moore’s law helps, but not

sufficient• Large effort spent to improve

software efficiency• Exploit multi-threading, new

instruction sets, …• Still need factor 2 in terms of

cores, storage etc.

22-Nov-2016 8

Page 9: Hybrid Cloud for CERN

Hybrid Cloud for CERN - Helge Meinhard at CERN.ch

Tier-0: 15% of WLCG

CERN data centre3.5 MW - full

Tier-0 extension:Wigner Research Centre,Budapest/Hungary

Two dedicated100GE connections

22-Nov-2016 9

Page 10: Hybrid Cloud for CERN

Hybrid Cloud for CERN - Helge Meinhard at CERN.ch

Transforming In-House ResourcesWe now have• Full support for physical and virtual servers• Full support for remote machines• Horizontal view

• Responsibilities by layers of service deployment• Large fraction of resources run as private cloud under OpenStack• Scaling to large numbers

(> 15’000 physical, several 100’000s virtual)• Support for dynamic host creation/deletion

• Deploy new services/servers in hours rather than weeks/months• Optimise operational and resource efficiency

22-Nov-2016 10

Page 11: Hybrid Cloud for CERN

Hybrid Cloud for CERN - Helge Meinhard at CERN.ch

Scaling up Further: Public Clouds (1)• Additional resources, perhaps later replacing

on-premise capacity• Potential benefits:

• Economy of scale• More elastic, adapts to changing demands• Somebody else worries about machines and

infrastructure

22-Nov-2016 11

Page 12: Hybrid Cloud for CERN

Hybrid Cloud for CERN - Helge Meinhard at CERN.ch

Scaling up Further: Public Clouds (2)• Potential issues:

• Cloud provider’s business models not well adapted to procurement rules and procedures of public organisations

• Lack of skills for and experience with procurements• Market largely not targeting compute-heavy tasks

• Performance metrics/benchmarks not established• Legal impediments• Not integrated with on-premise resources and/or publicly funded

e-infrastructures

22-Nov-2016 12

Page 13: Hybrid Cloud for CERN

Hybrid Cloud for CERN - Helge Meinhard at CERN.ch

CERN Approach

13

Open Telekom Cloud

22-Nov-2016

Series of short procurement projects of increasing size and complexity

T-Systems OTC: Detector simulation, reconstruction and analysis for ATLAS, CMS, ALICE, LHCb• Contract signed in February 2016• 4’000 cores for three months• Includes storage for data-heavy workflows

Page 14: Hybrid Cloud for CERN

Hybrid Cloud for CERN - Helge Meinhard at CERN.ch 15

Project with T-Systems• 1’000 four-core VMs, 8 GB

RAM, 100 GB disk• 870 used as compute, 50 as

storage head nodes, 80 as service nodes and spares

• 500 TB of clustered block storage

• External IP connectivity (10 Gbps) over GÉANT• 1’000 public IPv4 addresses

• Contract signed in February• Kick-off in March• Start of commissioning in May• Start of production on 1 Aug• 28 Oct: Production compute

phase ended

• Extension of in-house resources

22-Nov-2016

Page 15: Hybrid Cloud for CERN

Hybrid Cloud for CERN - Helge Meinhard at CERN.ch 16

Issues Identified and Addressed• OpenStack Neutron APIs

needed upgrading• DNS service needed to be

set up• APIs throttling adapted• UDP fragment corruption

due to bug in Open vSwitch

• VPN slow• Not used

• Live migration not tested• CERN image lacking a

required driver• Orchestration to take

account of I/O limitations of VMs• Significant number of

VMs needed as storage head nodes

22-Nov-2016

Page 16: Hybrid Cloud for CERN

XBatch - T-Systems 17

Usage by experiment

Page 17: Hybrid Cloud for CERN

Hybrid Cloud for CERN - Helge Meinhard at CERN.ch

Some Lessons Learnt• APIs difficult to use – many calls required, divergence between clouds

• Prefer using an ecosystem tool such as Terraform, libcloud, jclouds• Development needed to support all OpenStack APIs

• Clustered block storage only little used• Difficult to set up, not really worth the effort in a 90-day project

• Networking can compensate for clustered storage• Streaming directly from and to CERN’s storage• 10 Gbps was often saturated

• Finding right balance between storage and networking remains a challenge• Same for balance between object an block storage• Users not fully prepared yet to use object rather than block storage

• Transparency is key• Bug causing UDP corruptions known, but not communicated, causing unnecessary delays

22-Nov-2016 14

Page 18: Hybrid Cloud for CERN

Hybrid Cloud for CERN - Helge Meinhard at CERN.ch

Future Requirements• Not only LHC, but a number of particle physics projects with

high data rates• SuperKEKB, HL-LHC, FCC, LBNF, ILC

• Not only particle physics, but also other physics fields (e.g. astronomy)• SKA, LSST, CTA

• Not only physics, but also other sciences (e.g. life sciences, material science)• EBI expects data doubling every year (!)

22-Nov-2016 15

Page 19: Hybrid Cloud for CERN

Procurers: CERN, CNRS, DESY, EMBL-EBI, ESRF,IFAE, INFN, KIT, SURFSara, STFC

Experts: Trust-IT & EGI.eu

Procurers have committed funds (>1.6M€), manpower, use-cases with applications & data and in-house IT resources

Objective: procure innovative IaaS level cloud services• Fully and seamlessly integrating commercial cloud (Iaas)

resources with in-house resources and European e-Infrastructures

• To form a hybrid cloud platform for science

Services will be made available to end-users from many research communities: High-energy physics, astronomy, life sciences, neutron/photon sciences, long tail of science

Co-funded via H2020 (Jan’16-Jun’18)• Grant Agreement 687614• Total procurement volume: >5M€

HELIX NEBULA The Science CloudJoint Pre-Commercial Procurement

Hybrid Cloud for CERN - Helge Meinhard at CERN.ch 16Total procurement commitment >5M€

22-Nov-2016

Page 20: Hybrid Cloud for CERN

Hybrid Cloud for CERN - Helge Meinhard at CERN.ch

Technical Challenges• Compute

• Integration of some HPC requirements• Storage

• Caching at provider’s site, if possible automatically (avoid managed storage)

• Network• Connection via GÉANT• Support of identity federation (eduGAIN) for IT managers

• Procurement• Match of cloud providers’ business model with public procurement rules

22-Nov-2016 17

Page 21: Hybrid Cloud for CERN

Hybrid Cloud for CERN - Helge Meinhard at CERN.ch

HNSciCloud – Current Status• Official start of project: Jan 2016, duration: 30 months• Tender announced in Jan 2016• 17-Mar-2016: Open market consultation• 21-Jul-2016: Tender issued (> 200 downloads, > 70 requests for clarification)• 07-Sep-2016: Tender information day – design phase• 19-Sep-2016: Deadline for tender replies

• Sufficient number of valid tenders received• Evaluation by administrative and technical experts

• 07-Oct-2016: Award decision, contracts• Consortia led by T-Systems; IBM; RHEA; INDRA

• 02-Nov-2016: Kick-off meeting with Phase 1 contractors

22-Nov-2016 18

Page 22: Hybrid Cloud for CERN

Hybrid Cloud for CERN - Helge Meinhard at CERN.ch

HNSciCloud – Contacts• Interested?

• See http://www.hnscicloud.eu/• Subscribe to [email protected]

22-Nov-2016 19

Page 23: Hybrid Cloud for CERN

Hybrid Cloud for CERN - Helge Meinhard at CERN.ch

Summary• Public clouds have a large potential of addressing the requirements

of public research organisations for ever more resources and of dealing with peak demands• A full, seamless integration of public clouds with on-premise resources and

public e-infrastructures into a hybrid cloud infrastructure is required• OpenStack has proven to be very adequate for the massive

deployment of CERN’s resources without any increase in personnel, and is a very suitable basis for commercial clouds as well

• CERN’s project with T-Systems has been a positive experience of collaboration with important lessons learned… by both parties!

22-Nov-2016 20

Page 24: Hybrid Cloud for CERN

Hybrid Cloud for CERN - Helge Meinhard at CERN.ch

Merci pour votre attentionhttp://cern.ch

http://cern.ch/it-dephttp://cern.ch/wlcg

http://www.hnscicloud.eu/

Accelerating Science and Innovation

22-Nov-2016 21