Summary of the HEPiX Autumn 2013 Meeting




CERN IT Department

CH-1211 Genève 23

Switzerland

www.cern.ch/it

Summary of the HEPiX Autumn 2013 Meeting

Arne Wiebalck

Afroditi Xafi

Thomas Oulevey

CERN ITTF

November 22, 2013


Outline

• Miscellaneous

• Site reports

• Storage

• Basic IT Services

• Computing & Batch Systems

• IT facilities

• End User Services

• Clouds & Virtualisation

• Networking & Security

Presenters: Arne, Afroditi, Thomas


HEPiX – www.hepix.org

• Global organization of service managers and support staff providing computing facilities for the HEP community

• Participating sites include BNL, CERN, DESY, FNAL, IN2P3, NIKHEF, RAL, SLAC, TRIUMF …

• Meetings are held twice per year

– Spring: Europe, Autumn: U.S./Asia

• Exchange of experiences, reports on recent work, work in progress & future plans

– Usually no showing-off


Next HEPiX Meetings

• Spring 2014

– LAPP, Annecy, France

– May 19-23, 2014

• Autumn 2014

– University of Nebraska (NE), U.S.

– Final approval needed, dates to be determined

• Spring 2015

– U.K. discussed as an option


HEPiX Autumn 2013

• Oct 28 - Nov 1 at U Michigan, Ann Arbor (MI)

– Very well organized, pretty rich program

– Network access: eduroam (as in Bologna)

• 115 (!) registered participants

– Europe: 48, U.S./Canada: 47, Asia: 3, Australia: 2 (CERN: 13)

– Many first-timers, several North American WLCG Tier-2 universities

– DoE labs could mostly participate, only a few cancellations (ZFS)

– 15 participants from 9 companies

• 65 presentations from 35 institutes

– 26 hours of presentations

– Many offline discussions

• Sponsors: WD, UMICH, DDN, NetApp, and Univa


Updates from the WGs (1)

• Storage

– WG terminated, no summary as Andrei could not participate

• Batch

– WG terminated, updates to Wiki will continue

• IPv6

– Big ISPs move to IPv6 (CH: >10% of Google traffic already via IPv6)

– CERN seems well prepared, some smaller labs have not even started

– IPv6 support in batch systems?

– A lot of testing ongoing, including the experiments, test bed growing

– https://indico.cern.ch/getFile.py/access?contribId=26&sessionId=2&resId=1&materialId=slides&confId=247864


Updates from the WGs (2)

• Benchmarking

– New SPEC CPU benchmark suite planned for Oct 2014

– Plan is to start working with the experiments early (to identify apps to validate)

• Bit preservation

– New working group led by CERN (German Cancio) and DESY (Dimitry Ozerov)

– Follow-up on DPHEP presentation from J. Shiers during Bologna meeting

– Focus on technical advice on bit preservation

– https://indico.cern.ch/getFile.py/access?contribId=45&sessionId=3&resId=1&materialId=slides&confId=247864

• Configuration Management

– No update (chairs could not participate)

• Energy efficiency

– On hold for now, little feedback, no interest or no resources?

– To be re-discussed in Annecy


Site reports (1)

• Configuration Management (Puppet) “hot topic”

– Sites come from Rocks, Quattor, home-grown scripts, …

– Interesting: master-less Puppet at FNAL

– Other sites discuss similar topics as we do (workflow, secrets, …)

– Little synergy in the community so far, WG activity needed!

• Batch system reviews ongoing

– Univa GridEngine & HTCondor take the lead (SLURM did not survive testing at various sites)

– IPv6 and job authentication remain open issues

• Broad use of cloud services & virtualization

– Clouds move into production everywhere

– Complete virtualization of services (e.g. AFS at UMICH)


Site reports (2)

• “Dropbox”-like service at GridKA

– For 55’000 users from several universities (10GB quota)

– Powerfolder was picked as their solution

• Lustre/Hadoop established at various sites

– Lustre: GSI (10PB), IHEP (3PB), FNAL (0.2PB), JLAB, …

– Hadoop: smaller sites, PB installations

• Interest in & investigations around Ceph

– Mostly for OpenStack VMs, but also other use cases (RBD), backend for dCache, NFS replacement, CASTOR complement …

– Most sites still at an early stage


Site reports (3)

• Scientific Linux 6

– Many sites finished migration (of batch) to SL6: RAL, GridKA, INFN, …

– Significantly improved performance on older systems


Storage (1)

• dCache update

– Support for v4.1/pNFS currently being tested (looks OK)

– xroot and HTTP/WebDAV federations

– Backend testing (DDN, Ceph)

• Summary of FNAL USCMS T1 storage investigation

– Seeking solutions for online (2GB, POSIX) and nearline (1TB w/ tape)

– Currently on BlueArc & dCache & Lustre & EOS

– Goal: consolidation of storage solutions

– Evaluated: the current systems plus NetApp, GPFS, Nexsan, SnapScale

– Result: dCache for T1 production, EOS for LPC analysis, HNAS for home


Storage (2)

• Western Digital on disk drive technology

– Insights into the difficulties of doing macroscopic mechanics at the nano-scale

  • Platter ‘non-flatness’ plus unequal lube distribution can cause problems

  • Heads usually fly at 10nm and “descend” to ~2nm for actual I/O (by thermal expansion!)

– Introducing a new reliability metric (MPbF): disk failure rate dependent on load (not on power-on-hours)

– http://indico.cern.ch/getFile.py/access?contribId=37&sessionId=3&resId=3&materialId=slides&confId=247864

• 3 presentations on AFS

– OpenAFS status report

  • 1.6 released in Sep 2011, slow (server-side) uptake

  • Security advisories

– YFS: new security, new Rx (WAN), IPv4/IPv6, limits removed, …

– Summary of IPv6 investigations & survey, concluding that dual-stack seems to be the solution to the “IPv6/AFS issue”


Questions?

• “We built the first data centre with heaters!” (from Ulf Tigerstedt’s presentation on building the Kajaani DC)

• “Controlling a disk head is like flying a Jumbo 747 above a highway at a distance of less than 1 inch for 5 years!” (from Amit Chattopadhyay’s presentation on Disk Load Monitoring)