
Upload: thomas-lindsey

Post on 13-Dec-2015


Page 1

THE INFN GRID PROJECT

Scope: Study and develop a general INFN computing infrastructure, based on GRID technologies, to be validated (as a first use case) by implementing distributed Regional Center prototypes for the LHC experiments (ATLAS, CMS, ALICE) and, later on, for other INFN experiments (Virgo, Gran Sasso, …)

Project Status:
- Outline of proposal submitted to INFN management on 13-1-2000
- 3-year duration
- Next meeting with INFN management on 18th of February
- Feedback documents from the LHC experiments by end of February (sites, FTEs..)
- Final proposal to INFN by end of March

Page 2

INFN & “Grid Related Projects”

- Globus tests
- “Condor on WAN” as general purpose computing resource
- “GRID” working group to analyze viable and useful solutions (LHC computing, Virgo…)
  - Global architecture that allows strategies for the discovery, allocation, reservation and management of resource collections
- MONARC project related activities

Page 3

Evaluation of the Globus ToolKit

- 5-site testbed (Bologna, CNAF, LNL, Padova, Roma1)
- Use case: CMS HLT studies
  - MC production
  - Complete HLT chain
- Services to test/implement
  - Resource management
    - fork()
    - Interface to different local resource managers (Condor, LSF)
    - Resources chosen by hand
    - Smart Broker to implement a global resource manager
  - Data mover (GASS, GsiFTP…)
    - to stage executable and input files
    - to retrieve output files
  - Bookkeeping (is this worth a general tool?)
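The step from "resources chosen by hand" to a "Smart Broker" can be pictured with a small sketch. This is not the Globus API and not the testbed's broker: the `Site` fields, the CPU counts and the selection rule (prefer sites that already hold the input files, then the most idle CPUs) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str        # testbed site name
    manager: str     # local resource manager: "fork", "condor" or "lsf"
    free_cpus: int   # hypothetical: CPUs currently idle at the site
    has_input: bool  # hypothetical: input files already staged at the site


def pick_site(sites, cpus_needed):
    """Return the site a toy broker would choose, or None if nothing fits."""
    candidates = [s for s in sites if s.free_cpus >= cpus_needed]
    if not candidates:
        return None
    # Prefer sites that already hold the input data, then the most free CPUs.
    return max(candidates, key=lambda s: (s.has_input, s.free_cpus))


# Illustrative numbers only; they are not measurements from the testbed.
sites = [
    Site("Bologna", "condor", 4, False),
    Site("CNAF", "lsf", 8, False),
    Site("Padova", "condor", 6, True),
]
chosen = pick_site(sites, cpus_needed=5)
print(chosen.name if chosen else "no site available")
```

A real broker would read these values from an information service instead of a hard-coded list; the sketch only shows where the "by hand" decision would be replaced.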

Page 4

Use Case: CMS HLT studies

Page 5

Status

- Globus installed on 5 Linux PCs at 3 sites
- Globus Security Infrastructure
  - Works!
- MDS
  - Initial problems accessing data (long response times and timeouts)
- GRAM, GASS, Gloperf
  - Work in progress

Page 6

Condor on WAN Objectives

- Large INFN project of the Computing Commission involving ~20 sites
- INFN collaboration with the Condor Team (UWISC)
- Goal I: Condor “tuning” on WAN
  - Verify Condor reliability and robustness in a Wide Area Network environment
  - Verify suitability for INFN computing needs
  - Network I/O impact and measurements

Page 7

Goal II: Network as a Condor Resource

- Dynamic checkpointing and checkpoint domain configuration
- Pool partitioned into checkpoint domains (a dedicated ckpt server for each domain)
- Definition of a checkpoint domain according to:
  - Presence of a sufficiently large CPU capacity
  - Presence of a set of machines with efficient network connectivity
- Sub-pools
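The two criteria for defining a checkpoint domain can be sketched as a partitioning rule. Everything below is an assumption made for illustration: the host tuples, the thresholds `min_cpus` and `min_mbps`, and the rule that a site failing either criterion falls back to the default domain (the pool map does show a default checkpoint domain at CNAF). It is not Condor configuration.

```python
from collections import defaultdict

DEFAULT_DOMAIN = "cnaf"  # fallback domain; the name used here is illustrative


def assign_domains(hosts, min_cpus, min_mbps):
    """Group hosts into checkpoint domains.

    hosts: list of (hostname, site, cpus, lan_mbps) tuples.
    A site becomes its own checkpoint domain when it has enough total CPUs
    and every host in it has at least min_mbps of local connectivity;
    otherwise its hosts join the default domain.
    """
    by_site = defaultdict(list)
    for host in hosts:
        by_site[host[1]].append(host)

    domains = defaultdict(list)
    for site, members in by_site.items():
        total_cpus = sum(h[2] for h in members)
        well_connected = all(h[3] >= min_mbps for h in members)
        big_enough = total_cpus >= min_cpus
        target = site if (big_enough and well_connected) else DEFAULT_DOMAIN
        domains[target].extend(h[0] for h in members)
    return dict(domains)


# Hypothetical hosts and link speeds, not taken from the INFN pool.
hosts = [
    ("pc1.pd", "padova", 4, 100),
    ("pc2.pd", "padova", 4, 100),
    ("pc1.ts", "trieste", 2, 10),
    ("pc1.cnaf", "cnaf", 8, 100),
]
print(assign_domains(hosts, min_cpus=8, min_mbps=100))
```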

Page 8

Checkpointing: next step

- Distributed dynamic checkpointing
  - Pool machines select the “best” checkpoint server (from a network view)
  - Association between execution machine and checkpoint server dynamically decided
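One way to read "best from a network view" is "lowest measured round-trip time". The sketch below assumes that metric and assumes the measurements are already available as a dictionary; the server names are hypothetical, and the slides do not say which metric the project intended.

```python
def best_ckpt_server(rtt_ms):
    """Pick the closest reachable checkpoint server.

    rtt_ms: dict mapping server name -> measured round-trip time in ms,
    or None when the server did not answer.
    """
    reachable = {srv: rtt for srv, rtt in rtt_ms.items() if rtt is not None}
    if not reachable:
        raise RuntimeError("no checkpoint server reachable")
    return min(reachable, key=reachable.get)


# Hypothetical measurements taken by one execution machine.
measurements = {"ckpt.cnaf": 18.0, "ckpt.padova": 2.5, "ckpt.roma": None}
print(best_ckpt_server(measurements))
```

Because the choice is recomputed from fresh measurements, the association between execution machine and checkpoint server can change over time instead of being fixed in the pool configuration.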

Page 9

Implementation

Characteristics of the INFN Condor pool:

- Single pool
  - To optimize CPU usage of all INFN hosts
- Sub-pools
  - To define policies/priorities on resource usage
- Checkpoint domains
  - To guarantee the performance and the efficiency of the system
  - To reduce network traffic for checkpointing activity

Page 10

GARR-B Topology

[Figure: map of the GARR-B network, a 155 Mbps ATM based network. Legend: access points (PoP), main transport nodes. Sites labelled on the map: Torino, Padova, Bari, Palermo, Firenze, Pavia, Milano, Genova, Napoli, Cagliari, Trieste, Roma, Roma2, Pisa, L’Aquila, Catania, Bologna, Udine, Trento, Perugia, LNF, LNGS, Sassari, Lecce, LNS, LNL, Salerno, Cosenza, S.Piero, Ferrara, Parma. Other labels: USA, 155Mbps, T3, EsNet.]

INFN Condor Pool on WAN: checkpoint domains

[Figure overlay: CNAF Central Manager; default CKPT domain @ CNAF; legend “CKPT domain # hosts”. The per-domain host counts printed on the map cannot be attributed to sites in this transcript. Scale labels: “machines”, “500-1000 machines”, “6 ckpt servers”, “25 ckpt servers”.]

Page 11

Management

- Central management ([email protected])
- Local management ([email protected])
- Steering committee
- Software maintenance contract with the Condor_support team of the University of Wisconsin-Madison

Page 12

INFN-GRID project requirements

Networked Workload Management:
- Optimal co-allocation of data, CPU and network for a specific “grid/network-aware” job
- Distributed scheduling (data and/or code migration)
- Unscheduled/scheduled job submission
- Management of heterogeneous computing systems
- Uniform interface to various local resource managers and schedulers
- Priorities, policies on resource (CPU, Data, Network) usage
- Bookkeeping and ‘web’ user interface
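The "data and/or code migration" choice in the list above comes down to simple arithmetic: transfer time plus CPU time for each option. The sketch below assumes a single link with a fixed bandwidth and known CPU times. All sizes and times are made-up inputs; only the 155 Mbps figure echoes the GARR-B link speed.

```python
def transfer_seconds(size_mb, bandwidth_mbps):
    """Time to move size_mb megabytes over a link of bandwidth_mbps megabits/s."""
    return size_mb * 8 / bandwidth_mbps


def cheapest_plan(data_mb, code_mb, bandwidth_mbps,
                  cpu_s_at_data_site, cpu_s_at_user_site):
    """Compare 'move code to the data' against 'move data to the code'."""
    move_code = transfer_seconds(code_mb, bandwidth_mbps) + cpu_s_at_data_site
    move_data = transfer_seconds(data_mb, bandwidth_mbps) + cpu_s_at_user_site
    if move_code <= move_data:
        return "move code", move_code
    return "move data", move_data


# Hypothetical job: 2 GB of input, 20 MB executable, slower CPUs at the data site.
plan, seconds = cheapest_plan(
    data_mb=2000, code_mb=20, bandwidth_mbps=155,
    cpu_s_at_data_site=600, cpu_s_at_user_site=300,
)
print(plan, round(seconds, 1))
```

A "grid/network-aware" scheduler would extend the same comparison with queue waiting times and with the policies and priorities listed above.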

Page 13

Project req. (cont.)

Networked Data Management:
- Universal name space: transparent, location independent
- Data replication and caching
- Data mover (scheduled/interactive at OBJ/file/DB granularity)
- Loose synchronization between replicas
- Application metadata, interfaced with DBMS, e.g. Objectivity, …
- Network services definition for a given application
- End systems network protocol tuning
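A location-independent name space with replicas can be pictured as a catalog that maps one logical name to several physical copies. The class, method names and paths below are hypothetical and do not correspond to any Globus or INFN-GRID component. The sketch only shows the lookup a client would no longer have to do by hand.

```python
class ReplicaCatalog:
    """Toy catalog mapping a location-independent logical name to physical copies."""

    def __init__(self):
        self._replicas = {}  # logical name -> {site: physical path}

    def register(self, logical_name, site, physical_path):
        """Record that a copy of logical_name exists at site."""
        self._replicas.setdefault(logical_name, {})[site] = physical_path

    def locate(self, logical_name, preferred_sites):
        """Return (site, path) for the first preferred site holding a replica,
        falling back to any registered replica."""
        copies = self._replicas.get(logical_name)
        if not copies:
            raise KeyError(f"no replica registered for {logical_name}")
        for site in preferred_sites:
            if site in copies:
                return site, copies[site]
        site = next(iter(copies))
        return site, copies[site]


catalog = ReplicaCatalog()
catalog.register("cms/hlt/run42.db", "cern", "/data/cms/run42.db")
catalog.register("cms/hlt/run42.db", "cnaf", "/store/run42.db")
print(catalog.locate("cms/hlt/run42.db", preferred_sites=["padova", "cnaf"]))
```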

Page 14

Project req. (cont.)

Application Monitoring/Management:
- Performance, “instrumented systems” with timing information and analysis tools
- Run-time analysis of collected application events
- Bottleneck analysis
- Dynamic monitoring of GRID resources to optimize resource allocation
- Failure management
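As a minimal picture of an "instrumented system" with timing information, the sketch below wraps application stages in a timer and reports the stage with the largest accumulated time as the bottleneck. The `Instrument` class and the stage names are assumptions for illustration, not a tool named in the slides.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class Instrument:
    """Accumulates wall-clock time per named stage of an application."""

    def __init__(self):
        self.totals = defaultdict(float)  # stage name -> seconds spent

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the elapsed time even if the stage raised an exception.
            self.totals[name] += time.perf_counter() - start

    def bottleneck(self):
        """Return the stage with the largest accumulated time."""
        return max(self.totals, key=self.totals.get)


inst = Instrument()
with inst.stage("read input"):
    time.sleep(0.01)   # stand-in for I/O
with inst.stage("reconstruct"):
    time.sleep(0.05)   # stand-in for CPU-bound work
print(inst.bottleneck())
```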

Page 15

Project req. (cont.)

Computing Fabric and general utilities for a globally managed Grid:
- Configuration management of computing facilities
- Automatic software installation and maintenance
- System, service, network monitoring and global alarm notification, automatic recovery from failures
- Resource use accounting
- Security of GRID resources and infrastructure usage
- Information service

Page 16

Logical layout of the multi-tier client-server model

[Figure: boxes labelled CERN – Tier 0, Data Server Tier 1, Data Server Tier 2/3 (twice), Client (five times), desktop (twice) and Grid Tools.]