
GridPP Presentation to PPARC e-Science Committee, 26 July 2001

Steve Lloyd, Tony Doyle, John Gordon

GridPP e-Science Presentation Slide 2

Outline

• Component Model
• Resource Allocation and Funding Scenarios
• Intl Financial Comparisons
• Intl Grid Collaborations
• Grid Architecture(s)
• Links with Industry
• Summary
• Addendum: 1. VISTA and GridPP  2. GridPP monitoring page

GridPP e-Science Presentation Slide 3

GridPP Proposal

GridPP = Vertically integrated programme

= component model...

• Input to development of £15-20M funding scenarios

GridPP e-Science Presentation Slide 4

GridPP Workgroups

A - Workload Management

Provision of software that schedules application processing requests amongst resources

B - Information Services and Data Management

Provision of software tools to provide flexible, transparent and reliable access to the data

C - Monitoring Services

All aspects of monitoring Grid services

D - Fabric Management and Mass Storage

Integration of heterogeneous resources into a common Grid framework

E - Security

Security mechanisms from Certification Authorities to low level components

F - Networking

Network fabric provision through to integration of network services into middleware

G - Prototype Grid

Implementation of a UK Grid prototype tying together new and existing facilities

H - Software Support

Provide services to enable the development, testing and deployment of middleware and applications at institutes

I - Experimental Objectives

Responsible for ensuring that the development of GridPP is driven by the needs of UK PP experiments

J - Dissemination

Ensure good dissemination of developments arising from GridPP into other communities and vice versa

Technical work broken down into several workgroups - broad overlap with EU DataGrid

GridPP e-Science Presentation Slide 5

Components 1-4: £21M

[Pie chart: share of the £21M budget]
• CERN Staff 27.0%; CERN Hardware 6.8%
• UK Capital 15.3%; UK Managers 1.9%
• I (Experiment Objectives) 11.9%
• G (Prototype Grid) 9.7%
• H (Software Support) 3.2%; H* 5.4%
• J (Dissemination) 2.6%
• Work Groups A - F: A 1.4%, A* 0.4%; B 1.5%, B* 1.9%; C 1.1%, C* 0.6%; D 2.7%, D* 1.5%; E 1.7%; F 1.9%, F* 1.5%

GridPP e-Science Presentation Slide 6

£20M Project

[Pie chart as on Slide 5, workgroup percentages unchanged; CERN £7.1m reduced to £6.7m, UK Capital £3.2m reduced to £2.9m; 96.3% of the original programme]

GridPP e-Science Presentation Slide 7

£17M Project

[Pie chart as on Slide 5; Experiment Objectives (I) £2.49m reduced to £1.2m, CERN £7.1m → £6.7m → £6.0m, UK Capital £3.2m → £2.9m → £2.45m; 90.0% of the original programme]

GridPP e-Science Presentation Slide 8

Experiment Objectives

Vertically integrated programme? Broken component model…
• Specific experiments or overall reduction? To be determined by Experiments Board
• 50% reduction? 23 SY

Component | WG | Name  | Funding | SY
4 | I | ATLAS | UKPP | 8.00
4 | I | CMS   | UKPP | 9.50
4 | I | LHCb  | UKPP | 9.00
4 | I | ALICE | UKPP | 2.50
4 | I | BaBar | UKPP | 6.00
4 | I | UKDMC | UKPP | 3.00
4 | I | H1    | UKPP | 1.50
4 | I | CDF   | UKPP | 4.68
4 | I | D0    | UKPP | 2.00
4 | I | ZEUS  | UKPP | 0.00
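The "50% reduction? 23 SY" figure follows directly from the SY column of this table; a quick Python check of the arithmetic:

```python
# SY allocations from the Experiment Objectives table above (WG I, funding UKPP).
sy = {"ATLAS": 8.00, "CMS": 9.50, "LHCb": 9.00, "ALICE": 2.50,
      "BaBar": 6.00, "UKDMC": 3.00, "H1": 1.50, "CDF": 4.68,
      "D0": 2.00, "ZEUS": 0.00}

total = sum(sy.values())
print(round(total, 2))   # 46.18 SY in the full programme
print(round(total / 2))  # 23 -- the "50% reduction" figure quoted on the slide
```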

GridPP e-Science Presentation Slide 9

CERN (Component 3)

Basic Grid functionality: UK-CERN integrated programme - synergies, but cuts here will impact…
• 10% reduction? 3.1 SY

Component | WG | Name | Funding | SY
3 | K | Scalable fabric error and performance monitoring system | CERN | 1.00
3 | K | Automated, scalable installation system | CERN | 2.00
3 | K | Automated software maintenance system | CERN | 1.00
3 | K | Scalable, automated (re-)configuration system | CERN | 1.00
3 | K | Automated, self-diagnosing and repair system | CERN | 2.00
3 | K | Implement grid-standard APIs, meta-data formats | CERN | 2.00
3 | K | Data replication and synchronisation | CERN | 3.00
3 | K | Performance and monitoring of wide area data transfer | CERN | 3.00
3 | K | Integration of LAN and Grid-level monitoring | CERN | 1.00
3 | K | Adaptation of databases to Grid replication and caching | CERN | 5.00
3 | K | Preparation of training courses, material | CERN | 4.00
3 | K | Adaptation of application – science A | CERN | 3.00
3 | K | Adaptation of application – science B | CERN | 3.00

GridPP e-Science Presentation Slide 10

CERN (Component 4)

Experiments support: similar conclusions to UK-based programme
• Non-UK funding dependencies?
• 50% reduction? 11 SY

Component | WG | Name | Funding | SY
4 | K | Provision of basic physics environment for prototypes | CERN | 2.00
4 | K | Support of grid testbeds | CERN | 5.00
4 | K | Adaptation of physics core software to the grid environment | CERN | 6.00
4 | K | Exploitation of the grid environment by physics applications | CERN | 6.00
4 | K | Support for testbeds | CERN | 3.00

+ HARDWARE: pro-rata reduction on disk, tape, CPU... 15% reduction? £0.2M

GridPP e-Science Presentation Slide 11

Workload/Data Management

Component | WG | Name | Funding | SY
3 | A | Further testing and refinement | UKDG | 1.00
3 | A | Modify SAM | UKPP | 1.00
3 | A | Profiling HEP jobs and scheduler optimisation | UKPP | 1.50
3 | A | Super scheduler development | UKPP | 0.50
3 | B | Directory Services | EU | 3.00
3 | B | Distributed SQL Development | UKDG | 4.00
3 | B | Data Replication | UKDG | 1.50
3 | B | Query Optimisation and Data Mining | UKPP | 0.60
3 | B | Releases | UKPP | 2.00
3 | B | Liaison | UKPP | 2.50

Reduced long-term programme?
• e.g. scheduler optimisation (WG A), query optimisation (WG B)
• … or overall reduction?
• 10% reduction? 1.2 SY

GridPP e-Science Presentation Slide 12

£15M Project

[Pie chart as on Slide 5; Experiment Objectives (I) £2.49m reduced to £0, CERN reduced to £5m, UK Capital £3.2m → £2.9m → £2.45m; 90.0%]

GridPP e-Science Presentation Slide 13

£15M Project

Within the component model, it is impossible to achieve the programme described in the Proposal with £15M. The CERN component would have to be scaled back to £5M, saving an additional £1M. The other £1.2M saving would have to come from eliminating the experimental objectives (Component 4) completely from the scope of this project. The overall balance of the GridPP project would be radically distorted. We stress that the concurrent development of the applications is essential to provide feedback during the development of the Grid.

Summary

• Even a £21M to £20M reduction is not trivial...
• EU DataGrid commitments are built in
• Focus on CERN and UK Capital as largest single items, then reduce workgroup allocations
• £17M budget cuts hard into the project
  – Examples are based on original Component Model
• £15M budget is impossible within the Component Model
• A fixed allocation would help in planning the start-up phase

GridPP e-Science Presentation Slide 14

International Comparisons

PP Grids under development:

• France
  – Tier-1 RC for all 4 LHC experiments at CC-IN2P3 in Lyon
  – BaBar Tier A
  – an LHC prototype starting now
  – National Core Grid (2M€/year)
• Germany
  – Tier-1 starting up at Karlsruhe
  – BaBar Tier B at Karlsruhe
  – Tier-2 for ALICE at Darmstadt
  – No national Grid - project led
• Italy
  – INFN National Grid based round EU-DataGrid
  – Tier-1 RC and a prototype starting now at CNAF, Bologna
  – 15.9M€ allocated during 2001-3 for Tier-1 hardware alone
  – Tier-1 staff rising to 25 FTE by 2003
  – 10 Tier-2 centres at 1M€/year
• US
  – CMS: Tier-1 at FNAL and 5 Tier-2 centres; prototype built during 2000-04, with full deployment during 2005-7; staff estimates for the Tier-1 centre are 14 FTE by 2003, reaching 35 FTE in 2007; integrated costs to 2006 are $54.7M, excluding GriPhyN and PPDG
  – Atlas: plans very similar to CMS with costs foreseen to be the same; Tier-1 at Brookhaven

GridPP e-Science Presentation Slide 15

International Comparisons

Summary - different countries, different models
• France & Germany budget for hardware, assume staff
• Italy - lots of hardware and staff
• US - funds split between Tier-1/2, Universities, infrastructure, and R&D
• Italy > UK ~ France (EU) ~ US (GriPhyN, PPDG and iVDGL characteristics within GridPP: single UK programme)

GridPP e-Science Presentation Slide 16

GridPP Architecture

• Based on EU DataGrid developments feeding into GGF

• Status: Version 2 (2/7/01)
• Key elements:
  – Evolutionary capability
  – Service via Protocols and Client APIs
  – Representation using UML (TogetherSoft)
  – Defines responsibilities of Work Packages
  – Built from Infrastructure
  – Based on PP Use Cases (applies to GridPP)

The DataGrid Architecture, Version 2

German Cancio (CERN), Steve M. Fisher (RAL), Tim Folkes (RAL), Francesco Giacomini (INFN), Wolfgang Hoschek (CERN), Dave Kelsey (RAL), Brian L. Tierney (LBL/CERN)

July 2, 2001

GridPP e-Science Presentation Slide 17

The Grid and Industry

• Help us develop the Grid:
  – Supply hardware - PCs, Disks, Mass Storage, Networking etc
  – Supply software, middleware, management systems, databases etc
• Use the Grid for themselves:
  – Collaborative Engineering
  – Massive simulation
  – Federating their own worldwide databases
• Sell or develop the Grid for others:
  – Computation Services, Data Services etc

GridPP e-Science Presentation Slide 18

Summary
• Balanced exploitation programme costs £21M
• £20M-£17M-£15M 3-year funding scenarios examined
• £20M = maintains balanced programme
• £17M = reduced experimental objectives
• £15M = eliminates experimental objectives
• Final balance depends on funding allocation
• Emphasis on vertical integration: component model
• International comparisons: Italy > UK ~ France (EU) ~ US (GriPhyN, PPDG and iVDGL characteristics within GridPP: single UK programme)
• Contacts established with GriPhyN, PPDG and iVDGL
• InterGrid Co-ordination Group in development
• Architecture defined by GGF via lead in DataGrid
• Industry links: emphasis on partnership

GridPP e-Science Presentation Slide 19

GridPP and VISTA

• Astrogrid will federate VISTA data with other large databases elsewhere
  – this requires that VISTA data has already been processed and catalogues and images are available
• VISTA have a proposal (e-VPAS) that concentrates on producing the databases on which the Astrogrid tools will work. This work has much in common with GridPP:
  – a similar timescale
  – very large data flows from one remote site
  – many distributed users
  – reprocessing of data
  – utilization of distributed computing resources
• GridPP have started discussions with VISTA and EPCC (GenGrid) as to how we can collaborate and share expertise and middleware

GridPP e-Science Presentation Slide 20

GridPP Monitoring Page

• Various sites now set up with UK Globus certificates
• Grid Monitoring:
  – Polls Grid test-bed sites via the globus-job-run command
  – Runs a basic script producing XML-encoded status information
  – Load average and timestamp information retrieved
  – Current status and archived load information is plotted...
• To be done...
  – Java CoG Kit being investigated (more robust)
  – Simple monitoring system to verify test-bed timestamps (in case not everyone is using NTP)
  – Integrate with the Grid Monitoring Architecture
  – Incorporate current network bandwidth measurements into the graphical system
  – Automatic notification system
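The polling loop above can be sketched in a few lines of Python. This is a minimal illustration, not the actual monitoring script: globus-job-run is the real Globus command named on the slide, but the remote status script name (status.sh) and the XML fields (load1, timestamp) are assumptions chosen to match the slide's description.

```python
# Sketch of the monitoring poll: run a status script on a test-bed site via
# globus-job-run, parse the XML status report, and sanity-check its timestamp.
# The XML schema and the remote "status.sh" script are hypothetical.
import subprocess
import time
import xml.etree.ElementTree as ET

def parse_status(xml_text):
    """Extract load average and timestamp from an XML status report
    (assumed format: <status><load1>..</load1><timestamp>..</timestamp></status>)."""
    root = ET.fromstring(xml_text)
    return {
        "load1": float(root.findtext("load1")),
        "timestamp": int(root.findtext("timestamp")),
    }

def poll_site(contact):
    """Run the (hypothetical) status script on a site via globus-job-run.
    Returns the parsed status, or None if the site is unreachable or garbled."""
    try:
        out = subprocess.run(
            ["globus-job-run", contact, "/bin/sh", "-c", "status.sh"],
            capture_output=True, text=True, timeout=60, check=True,
        ).stdout
        return parse_status(out)
    except (OSError, subprocess.SubprocessError, ET.ParseError):
        return None

def clock_skew(status, now=None):
    """Seconds between our clock and the site's reported timestamp -- the
    'verify test-bed timestamps' check for sites not running NTP."""
    now = time.time() if now is None else now
    return abs(now - status["timestamp"])

# Example on a canned report (no Grid access needed):
report = "<status><load1>0.42</load1><timestamp>996000000</timestamp></status>"
status = parse_status(report)
print(status["load1"])  # 0.42
```

Archiving each parsed status with its timestamp, as the slide describes, then gives the time series from which the current-status and load-history plots are drawn.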