ATLAS computing in Geneva
Szymon Gadomski, NDGF meeting, September 2009
• the Geneva ATLAS Tier-3 cluster
• other sites in Switzerland
• issues with data movement
ATLAS computing in Geneva
• 268 CPU cores
• 180 TB for data
  – 70 TB in a Storage Element
• special features:
  – direct line to CERN at 10 Gb/s
  – latest software via CERN AFS
  – SE in the Tiers of ATLAS since Summer 2009
  – FTS channels from CERN and from the NDGF Tier 1
• the analysis facility for the Geneva group
• Trigger development, validation, commissioning
• grid batch production for ATLAS
How it is used
• NorduGrid production since 2005
• login and local batch
• trigger development and validation
• analysis preparations
• 75 accounts, 55 active users, not only Uni GE
Added value by resource sharing
• local jobs come in peaks
• the grid always has jobs
• little idle time, a lot of Monte Carlo done
Swiss ATLAS Grid
[Diagram: the Swiss ATLAS Grid, connecting the Uni of Geneva Tier-3, the Uni of Bern Tier-3, the shared CSCS Tier-2, CERN Tier-0 and CAF, and the Karlsruhe Tier-1]
CSCS
• 960 CPU cores, 520 TB (for the three LHC experiments)
• grid site since 2006
  – LCG gLite and NorduGrid
  – dCache Storage Element
  – mostly “production” for the three experiments
• change of personnel in the recent past
• large hardware upgrades in 2008 and 2009
• use of Lustre in the near future (worker node disk cache)
Bern
• 30 CPU cores, 30 TB in a local cluster
• 250 CPU cores in a shared University cluster
• grid site since 2005
  – NorduGrid
  – gsiftp storage element
  – mostly ATLAS production
• interactive and local batch use
• data analysis preparation
Swiss contribution to ATLAS computing
~1.4% of ATLAS computing in 2008
Issue 1 - data movement for grid jobs
local jobs can read the SE directly
grid jobs cannot read the SE directly

There is no grid middleware on the worker nodes. In general this is a good thing, but here it hurts us a little. Are there any plans to address this?
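As a workaround, input files have to be staged to the worker node's local disk before the job payload starts. A minimal sketch of such a stage-in step, assuming the gLite lcg-cp client is available on the node that prepares the job; the SURL and paths are hypothetical, for illustration only:

```python
import os
import subprocess
import sys

def stage_in(surl, scratch="/scratch"):
    """Copy one input file from the Storage Element to local scratch.

    lcg-cp is the gLite data-management copy client; it is assumed to be
    available on the node that prepares the job, since the worker nodes
    themselves carry no middleware.
    """
    dest = os.path.join(scratch, os.path.basename(surl))
    rc = subprocess.call(["lcg-cp", "--vo", "atlas", surl, "file://" + dest])
    if rc != 0:
        sys.exit("stage-in failed for %s" % surl)
    return dest

# Hypothetical SURL, for illustration only.
local_path = stage_in("srm://se.example.ch/dpm/example.ch/home/atlas/input.root")
print("staged to %s" % local_path)
```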
Issue 2 - data rates
Storage system         Direction   Max rate [MB/s]
NFS                    write       250
NFS                    read        370
DPM Storage Element    write       2 × 250
DPM Storage Element    read        2 × 270
Internal to the cluster, the data rates are OK.
Source/method                     MB/s      GB/day
dq2-get (average)                 6.6       560
dq2-get (max)                     58        5000
FTS from CERN to UNIGE-DPNC       10 – 59   840 – 5000
FTS from NDGF-T1 to UNIGE-DPNC    3 – 5     250 – 420
Transfers to Geneva need improvement
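As a cross-check of the units in the table, a sustained rate in MB/s maps onto a daily volume as rate × 86400 s. A quick sketch:

```python
# Convert a sustained transfer rate in MB/s into a daily volume in GB
# (1 day = 86400 s, 1 GB = 1000 MB).
def gb_per_day(mb_per_s):
    return mb_per_s * 86400 / 1000.0

for rate in (6.6, 58, 10, 59, 3, 5):
    print("%5.1f MB/s -> %6.0f GB/day" % (rate, gb_per_day(rate)))
# 6.6 MB/s gives ~570 GB/day, consistent with the 560 GB/day quoted
# for the dq2-get average (the small differences are rounding).
```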
Test of larger TCP buffers
• transfer from fts001.nsc.liu.se
• network latency 36 ms (CERN at 1.3 ms)
• increased the TCP buffer sizes on Friday, Sept 11th (Solaris default: 48 kB)
[Plot: FTS data rate per server vs. time, showing the effect of raising the TCP buffer size from the 48 kB default to 192 kB and then to 1 MB; the rate reaches ~25 MB/s per server, with an unexplained dip marked "Why?"]
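The buffer sizes matter because a single TCP stream can keep at most one window of unacknowledged data in flight, so its throughput is bounded by window / RTT. A back-of-the-envelope check for the 36 ms path, using the numbers from this slide (the Solaris tunables named in the comment are the usual knobs, but are an assumption here, since the slide does not name them):

```python
# Throughput of one TCP stream is limited to window_size / RTT.
RTT = 0.036  # seconds, measured to fts001.nsc.liu.se (1.3 ms to CERN)

# Solaris default send/receive buffer (48 kB) and the two sizes tested;
# on Solaris these are typically raised via the ndd tunables
# tcp_xmit_hiwat / tcp_recv_hiwat (an assumption, not from the slide).
for window in (48e3, 192e3, 1e6):  # bytes
    print("%4.0f kB window -> %5.1f MB/s max per stream"
          % (window / 1e3, window / RTT / 1e6))
# 48 kB  ->  ~1.3 MB/s
# 192 kB ->  ~5.3 MB/s
# 1 MB   -> ~27.8 MB/s, close to the ~25 MB/s per server observed
```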
Can we keep the FTS transfer at 25 MB/s/server?
Summary and outlook
• A large ATLAS T3 in Geneva
• Special site for Trigger development
• In NorduGrid since 2005
• Storage Element in the NDGF since July 2009
  – FTS from CERN and from the NDGF-T1
  – exercising data transfers, need to improve performance
• Short-term to-do list:
  – Add two more file servers to the SE
  – Move to SLC5
  – Write a note, including performance results
  – Keep working on data transfer rates
• Towards a steady-state operation!
SMSCG
• The Swiss Multi-Science Computing Grid is using ARC
Performance of dq2-get
• rates calculated using the timestamps of the downloaded files (see the sketch after this list)
• average data rate 6.6 MB/s
• large spread
• max close to the hardware limit of 70 MB/s (NFS write to a single server)
• average time to transfer 100 GB is 7 hours
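A minimal sketch of the rate estimate described above, reconstructing the average transfer rate from the modification times of the downloaded files; the directory path is hypothetical:

```python
import os

def dataset_rate(directory):
    """Estimate the average transfer rate in MB/s from file mtimes:
    total size divided by the span between the first and last file's
    modification time. The first file's own transfer time is not
    counted, so this slightly overestimates the true rate."""
    paths = [os.path.join(directory, f) for f in os.listdir(directory)]
    mtimes = sorted(os.path.getmtime(p) for p in paths)
    total_bytes = sum(os.path.getsize(p) for p in paths)
    elapsed = mtimes[-1] - mtimes[0]
    return total_bytes / elapsed / 1e6 if elapsed > 0 else float("nan")

# Hypothetical local directory of a finished dq2-get download.
print("%.1f MB/s" % dataset_rate("/data/atlas/some_dataset"))
```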