ScotGRID Report: Prototype for Tier-2 Centre for LHC

Akram Khan

On Behalf of the ScotGRID Team

(http://www.scotgrid.ac.uk)

GridPP6 Collaboration Meeting ScotGRID Report

Overview of Talk

What are we hoping to do..?
Hardware / Operation
Future Plans
Misc Bits
Summary & Outlook

Never Forget the Spirit of the Project

The LHC Computing Challenge for Scotland

2000: JREI Bid

The JREI funds will make it possible to commission and fully exercise a prototype LHC computing centre in Scotland.

The Centre would provide:

1. Technical service base for the grid (GIIS, VO services…)
2. DataStore to handle samples of data towards part. Analysis
3. Significant simulation production capability
4. Excellent network connection to RAL + regional sites
5. Support grid middleware development with CERN and RAL
6. Support core software development within LHCb and ATLAS
7. Support user applications in other scientific areas

This will enable us to answer:

Is the grid a viable solution for the LHC computing challenge?
Can a two-site Tier-2 centre be set up and operate effectively?
How can network topology between Edinburgh, Glasgow, RAL & CERN

ScotGRID: Glasgow / Edinburgh

Glasgow:
59 x330 dual PIII 1 GHz / 2 Gbyte compute nodes
2 x340 dual PIII 1 GHz / 2 Gbyte head nodes
3 x340 dual PIII 1 GHz / 2 Gbyte storage nodes, each with 11 x 34 Gbytes in RAID 5
1 x340 dual PIII 1 GHz / 0.5 Gbyte masternode

Edinburgh:
1 xSeries quad Pentium Xeon 700 MHz / 16 Gbytes server
1 FAStT 500 controller
7 disk arrays of 10 x 73 Gbyte disks
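The disk counts above can be turned into rough capacity figures. The following is a minimal sketch, assuming decimal gigabytes, one 11-disk RAID 5 array per Glasgow storage node with no hot spare, and no statement about the RAID level of the Edinburgh arrays (so only raw capacity is computed there). The function and constant names are illustrative only.

```python
# Disk figures quoted on the hardware slide.
GLASGOW_STORAGE_NODES = 3
GLASGOW_DISKS_PER_NODE = 11
GLASGOW_DISK_GB = 34

EDINBURGH_ARRAYS = 7
EDINBURGH_DISKS_PER_ARRAY = 10
EDINBURGH_DISK_GB = 73


def raw_capacity_gb(groups, disks_per_group, disk_gb):
    """Total disk space before any RAID overhead."""
    return groups * disks_per_group * disk_gb


def raid5_usable_gb(groups, disks_per_group, disk_gb):
    """RAID 5 spends one disk's worth of space per array on parity."""
    return groups * (disks_per_group - 1) * disk_gb


def decimal_gb_to_binary_tb(gb):
    """Convert decimal gigabytes (10**9 bytes) to binary terabytes (2**40 bytes)."""
    return gb * 10**9 / 2**40


glasgow_raw = raw_capacity_gb(
    GLASGOW_STORAGE_NODES, GLASGOW_DISKS_PER_NODE, GLASGOW_DISK_GB)
glasgow_usable = raid5_usable_gb(
    GLASGOW_STORAGE_NODES, GLASGOW_DISKS_PER_NODE, GLASGOW_DISK_GB)
edinburgh_raw = raw_capacity_gb(
    EDINBURGH_ARRAYS, EDINBURGH_DISKS_PER_ARRAY, EDINBURGH_DISK_GB)

print(f"Glasgow raw: {glasgow_raw} GB, RAID 5 usable: {glasgow_usable} GB")
print(f"Edinburgh raw: {edinburgh_raw} GB "
      f"(about {decimal_gb_to_binary_tb(edinburgh_raw):.1f} binary TB)")
```

Under these assumptions the Edinburgh raw figure is 5110 GB, which is about 4.6 binary terabytes; this may be where the 4.6 Tb total on the Edinburgh schematic comes from, although the slides do not say so.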

ScotGRID - Glasgow

ScotGRID: Glasgow - Schematic

[Schematic: masternode, storage nodes, head nodes and compute nodes; an Internet VLAN and a 10.0.0.0 VLAN; 100 Mbps and 1000 Mbps links; a campus backbone and a "bottleneck" label]

ScotGRID: Edinburgh - Schematic

[Schematic: disk arrays (total 4.6 Tb), FAStT 500 storage controller, server (4 x Pentium Xeon, 16 Gb RAM), SRIF network]

Towards a Prototype Tier-2

[Timeline: 2001 Q4 to 2005 Q4, phase label "Prototypes"]

Proposal JREI: 2000
ScotGRID delivery of kit: Dec 2001
xCAT tutorial, attempt on masternode
ScotGRID room handed over to builders
Building work complete, xCAT reinstall
User registration, trial production
Installation of software
Configuring disk array
Reconfiguring kernel drivers for FAStT storage controller
User registration, upgrade storage controller
Group disk (re)organisation to match project
Glasgow: MC-FARM
Edinburgh: Datastore

ScotGRID 1st Year Review

9:45 Arrive - Coffee

10:00-10:15 Welcome (Freddie Moran)

10:15-10:35 ScotGrid Introduction (Tony Doyle)

10:35-10:50 Technical Status Overview (Akram Khan)

10:50-11:05 Cluster Operations (David Martin)

11:05-11:30 Coffee

11:30-11:50 ScotGrid Upgrade Plans (Steve Playfer)

11:50-13:00 IBM IT Briefing Discussion

13:00-14:00 Lunch

14:00-14:30 IBM IT Briefing Discussion

14:40-14:55 Grid Data Management - simulations (David Cameron)

15:10-15:30 Tea

Particle Physics Applications

15:30-15:45 ATLAS (John Kennedy)

15:45-16:00 LHCb (Akram Khan)

16:00-16:15 BABAR (Steve Playfer)

16:15-16:30 CDF (Rick St Denis)

ScotGrid Meeting at IBM Briefing Centre (Greenock)

Friday 10th Jan

Complete success, as you will see!

ScotGRID Statistics

The amount of storage space in ScotGRID used by each group:

Edinburgh (5 TBytes), Glasgow (600 Gbytes)

ScotGRID: CPU Usage 24/6/02 – 6/1/2003

The % use by each group over the previous weeks

[Chart annotations: startup phase, Christmas period, different applications]

Forward Look: Introduction

ScotGrid JREI project includes a mid-term hardware upgrade.

As part of GridPP planning, we need to upgrade from Prototype to Production Tier 2 status by 2004.

JREI funding left to be spent by June 2003:

Edinburgh £220k + Glasgow £30k = £250k

Forward Look: Possible Upgrade Plan?

Edinburgh kit Glasgow

Dual FastT700+ 20-32 TB

IBM eServer xSeries 440, 8 x Xeon (1.9 GHz), scalable configuration

Forward Look: Front-End Grid Servers

Front end for EDG style Compute Engine/LCFG

Front end for EDG style Storage Engine

Overall ScotGrid front end to arbitrate Grid services being requested?

Would like to install Grid software on dedicated (modest-sized) servers. Decouples Grid software from Compute and Storage hardware.

Will there be a standard configuration for Grid access to Tier 2 sites? (RLS/SlashGrid)

Towards a Production Tier-2 & beyond

[Timeline: 2001 Q4 to 2005 Q4, phase label "Production"]

Delivery of more kit…
End of JREI funding
Start of ScotGRID-II
Start of GridPP-II
Links to other applications…
Production Tier-2 site
Future upgrades?

Technical Support Group

Core members of the group & invited to discuss wider issues:

CORE:
Akram Khan (Chair: Edinburgh)
David Martin (sysadm: Glasgow)
Roy de Ruiter-Koelemeiger (sysadm: Edinburgh)
Gavin McCance (EDG: Glasgow)
RA post (EDG: Edinburgh)

INVITED:
Paul Mitchell (sysadm: Edinburgh)
Alan J. Flavell (Networking: Glasgow)
Steve Traylen (EDG: RAL)
IBM Team

Webpage “technical group” of http://www.scotgrid.ac.uk/

Support is a real issue: we are just about OK, but for a production Tier-2?

[Network diagram: Glasgow and Edinburgh; link labels 1 Gb/s, 1 Gb/s and 2.5 Gb/s; annotations: all University traffic, packet filtering]

Traceroute:

1. 194.36.1.1 (194.36.1.1) 1.479 ms 0.743 ms 0.558 ms
2. 130.209.2.1 (130.209.2.1) 2.343 ms 0.678 ms 0.577 ms
3. 130.209.2.118 (130.209.2.118) 0.577 ms 0.322 ms 0.454 ms
4. glasgow-bar.ja.net (146.97.40.105) 0.564 ms 0.305 ms 0.341 ms
5. po9-0.glas-scr.ja.net (146.97.35.53) 0.546 ms 0.544 ms 0.465 ms
6. po3-0.edin-scr.ja.net (146.97.33.62) 1.644 ms 1.471 ms 1.634 ms
7. po0-0.edinburgh-bar.ja.net (146.97.35.62) 1.509 ms 1.474 ms 1.400 ms
8. 146.97.40.62 (146.97.40.62) 1.622 ms 1.493 ms 1.518 ms
9. vlan686.kb5-msfc.net.ed.ac.uk (194.81.56.58) 2.084 ms 2.528 ms 1.869 ms
10. 129.215.255.242 (129.215.255.242) 1.851 ms 1.828 ms 1.624 ms
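One way to read the traceroute output is to look for the hop where the round-trip time rises most. The sketch below assumes the hop lines exactly as quoted on the slide (hop number with a trailing dot, host, address, three RTT samples) and uses the minimum of the three samples per hop as a noise-resistant estimate. The helper names are illustrative and not part of any traceroute tool.

```python
import re

TRACEROUTE = """\
1. 194.36.1.1 (194.36.1.1) 1.479 ms 0.743 ms 0.558 ms
2. 130.209.2.1 (130.209.2.1) 2.343 ms 0.678 ms 0.577 ms
3. 130.209.2.118 (130.209.2.118) 0.577 ms 0.322 ms 0.454 ms
4. glasgow-bar.ja.net (146.97.40.105) 0.564 ms 0.305 ms 0.341 ms
5. po9-0.glas-scr.ja.net (146.97.35.53) 0.546 ms 0.544 ms 0.465 ms
6. po3-0.edin-scr.ja.net (146.97.33.62) 1.644 ms 1.471 ms 1.634 ms
7. po0-0.edinburgh-bar.ja.net (146.97.35.62) 1.509 ms 1.474 ms 1.400 ms
8. 146.97.40.62 (146.97.40.62) 1.622 ms 1.493 ms 1.518 ms
9. vlan686.kb5-msfc.net.ed.ac.uk (194.81.56.58) 2.084 ms 2.528 ms 1.869 ms
10. 129.215.255.242 (129.215.255.242) 1.851 ms 1.828 ms 1.624 ms
"""

HOP_RE = re.compile(r"^\s*(\d+)\.?\s+(\S+)\s+\((\S+)\)\s+(.*)$")
RTT_RE = re.compile(r"([\d.]+)\s*ms")


def parse_hops(text):
    """Return a list of (hop number, host, minimum RTT in ms)."""
    hops = []
    for line in text.splitlines():
        match = HOP_RE.match(line)
        if not match:
            continue
        rtts = [float(value) for value in RTT_RE.findall(match.group(4))]
        hops.append((int(match.group(1)), match.group(2), min(rtts)))
    return hops


def largest_rtt_step(hops):
    """Return (increase in ms, previous hop, current hop) for the biggest rise."""
    steps = [(cur[2] - prev[2], prev, cur) for prev, cur in zip(hops, hops[1:])]
    return max(steps, key=lambda step: step[0])


hops = parse_hops(TRACEROUTE)
delta, before, after = largest_rtt_step(hops)
print(f"Largest step: +{delta:.3f} ms between hop {before[0]} ({before[1]}) "
      f"and hop {after[0]} ({after[1]})")
```

With the figures quoted, the largest rise is about 1 ms between the hop named po9-0.glas-scr.ja.net and the hop named po3-0.edin-scr.ja.net, which by the router names looks like the Glasgow to Edinburgh backbone link.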

EDG Middleware: Replica Optimiser Simulation

Using ScotGrid for large-scale simulation runs.

Uses ~15 MB of memory for ~60 threads.

2-12 hours per simulation.

Results to appear in IJHPCA 2003.

BaBar: Monte Carlo Production (SP4)

ScotGrid (= edin): 8 million events in 3 weeks

Expect to import some streams/skims to Edinburgh in 2003

After the upgrade to ~30TB there may be interest in using ScotGrid to add to the storage available at the RAL Tier A site

LHCb: Production Centres

CERN (932 k)
Bologna (857 k)
RAL (471 k)
Imperial College and Karlsruhe (437 k)
Lyon (202 k)
ScotGrid (194 k)
Cambridge (100 k)
Bristol (92 k)
Moscow (87 k)
Liverpool (70 k)
Barcelona (56 k)
Rio (32 k)
CESGA (28 k)
Oxford (25 k)

We can be confident about the TDR production: with the current configuration we can produce 10 million events in 56 days (March-April 2003).

Included in draft of LHCC document: B0 -> J/psi K0s
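The production figures quoted in these slides (10 million LHCb events in 56 days, 8 million BaBar events in 3 weeks) can be converted into rates. This is a rough sketch only: the per-CPU estimate assumes that all 118 CPUs of the 59 dual-processor compute nodes run production for the whole period, which the slides do not state, and the function names are illustrative.

```python
SECONDS_PER_DAY = 86_400

# Assumption: 59 dual-CPU compute nodes, all dedicated to production.
ASSUMED_CPUS = 59 * 2


def events_per_day(events, days):
    """Average production rate over the whole period."""
    return events / days


def cpu_seconds_per_event(events, days, cpus):
    """Average CPU time per event if every CPU is busy for the full period."""
    return days * SECONDS_PER_DAY * cpus / events


lhcb_rate = events_per_day(10_000_000, 56)
babar_rate = events_per_day(8_000_000, 3 * 7)
lhcb_cpu_cost = cpu_seconds_per_event(10_000_000, 56, ASSUMED_CPUS)

print(f"LHCb target: {lhcb_rate:,.0f} events/day")
print(f"BaBar SP4:   {babar_rate:,.0f} events/day")
print(f"LHCb target implies about {lhcb_cpu_cost:.0f} CPU-seconds per event "
      f"on {ASSUMED_CPUS} CPUs")
```

The per-event CPU cost is an upper bound under the stated assumption; if only part of the farm is available to LHCb, each event must take proportionally less time to meet the same target.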

Summary and Outlook

Exciting time for ScotGRID: There has been a lot of effort during the past year to get ScotGRID up and operational; we have learnt many tricks!

Operational Prototype Centre:

We have an operational centre
Meeting the short term needs of the applications with modest resources (HEP + Middleware + non-PP)
Proof of Principle for Tier-2 Operation (pre-grid)

There is a lot that still needs to be done:

having a full production system (24*7) (opt-grid)
to prototype various architectural solutions for Tier-2
look towards upgrades with a view for LHC timetable

Support & Resources are a real issue for the near term future (Q1-2004)

RLS Architecture

[Diagram: Local Replica Catalogues (LRCs), one on each Storage Element, indexed by Replica Location Indices (RLIs); sites shown: Glasgow, Edinburgh, CERN. Annotations:
Multiply indexed LRC for higher availability
RLI indexing over the full namespace (all LRCs are indexed)
RLI indexing over a subset of LRCs
LRC indexed by only one RLI]

A Replica Location Service (RLS) is a system that maintains and provides access to information about the physical location of copies of data items.

Gavin McCance, Alasdair Earl, Akram Khan

(starting Feb)
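The two-level structure described on this slide, catalogues at each site and indices over them, can be illustrated with a toy in-memory model. This is a sketch only and not the EDG or Globus RLS API: the class names, method names, logical file names and storage URLs below are all hypothetical.

```python
from collections import defaultdict


class LocalReplicaCatalogue:
    """Maps logical file names (LFNs) to the physical copies held at one site."""

    def __init__(self, site):
        self.site = site
        self._replicas = defaultdict(set)

    def add(self, lfn, pfn):
        self._replicas[lfn].add(pfn)

    def lookup(self, lfn):
        return sorted(self._replicas.get(lfn, ()))

    def lfns(self):
        return set(self._replicas)


class ReplicaLocationIndex:
    """Records which catalogues know about each LFN, not the copies themselves."""

    def __init__(self):
        self._lrcs = {}
        self._index = defaultdict(set)

    def update_from(self, lrc):
        # Refresh this index with the current contents of one catalogue.
        self._lrcs[lrc.site] = lrc
        for sites in self._index.values():
            sites.discard(lrc.site)
        for lfn in lrc.lfns():
            self._index[lfn].add(lrc.site)

    def locate(self, lfn):
        # Two-step lookup: index -> catalogues -> physical file names.
        return {site: self._lrcs[site].lookup(lfn)
                for site in sorted(self._index.get(lfn, ()))}


# Hypothetical example data.
glasgow = LocalReplicaCatalogue("Glasgow")
edinburgh = LocalReplicaCatalogue("Edinburgh")
glasgow.add("lfn:example/run1.dat", "gsiftp://se.glasgow.example/data/run1.dat")
edinburgh.add("lfn:example/run1.dat", "gsiftp://se.edinburgh.example/store/run1.dat")
edinburgh.add("lfn:example/run2.dat", "gsiftp://se.edinburgh.example/store/run2.dat")

full_index = ReplicaLocationIndex()      # indexes every catalogue
full_index.update_from(glasgow)
full_index.update_from(edinburgh)

partial_index = ReplicaLocationIndex()   # indexes only a subset of catalogues
partial_index.update_from(edinburgh)

print(full_index.locate("lfn:example/run1.dat"))
print(partial_index.locate("lfn:example/run1.dat"))
```

Here the Edinburgh catalogue is indexed twice, so a lookup still succeeds if one index is unavailable, while the partial index only ever reports the Edinburgh copy.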