polish infrastructure for supporting computational science in the european research space

31
EUROPEAN UNION Polish Infrastructure for Supporting Polish Infrastructure for Supporting Computational Science in the Computational Science in the European Research Space European Research Space Jacek Kitowski, Kazimierz Wiatr, Marian Bubak Michał Turała, Łukasz Dutka and Zofia Mosurska ACK CYFRONET AGH, Cracow, Poland CMS’09 Computer Methods and Systems Cracow, Nov. 26-27, 2009

Upload: maitland

Post on 24-Jan-2016

35 views

Category:

Documents


1 download

DESCRIPTION

Polish Infrastructure for Supporting Computational Science in the European Research Space. Jacek Kitowski, Kazimierz Wiatr, Marian Bubak Michał Turała , Łukasz Dutka and Zofia Mosurska ACK CYFRONET AGH, Cracow , Poland. CMS’09 Computer Methods and Systems Cracow , Nov . 26-27, 2009. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

EUROPEAN UNION

Polish Infrastructure for Supporting Polish Infrastructure for Supporting Computational Science in the Computational Science in the

European Research SpaceEuropean Research Space

Jacek Kitowski, Kazimierz Wiatr, Marian BubakMichał Turała, Łukasz Dutka and Zofia Mosurska

ACK CYFRONET AGH, Cracow, Poland

CMS’09 Computer Methods and Systems

Cracow, Nov. 26-27, 2009

Page 2: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

National Grid Initiative in Poland in a nutshell Motivation – e-Science and Science 2.0 Rationales and Foundations Status

PL-Grid Project – Current Results

Summary

OutlineOutline

Integration Activity in the framework of European Grid Initiative

Page 3: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

Demand from Community for e-Science ApproachDemand from Community for e-Science Approachin spite of complexity: Computing, Storage, Infrastructurein spite of complexity: Computing, Storage, Infrastructure

E-Science: collaborative research supported by advanced distributed computations Multi-disciplinary, Multi-Site and Multi-National Building with and demanding advances in Computing/Computer Sciences

Goal: to enable better research in all disciplines System-level Science: beyond individual phenomena, components interact

and interrelate, experiments in silico to generate, interpret

and analyse rich data resources• From experiments, observations

and simulations

• Quality management, preservation and reliable evidence

to develop and explore models and simulations• Computation and data at all scales

• Trustworthy, economic, timely and relevant results to enable dynamic distributed collaboration

• Facilitating collaboration with information and resource sharing

• Security, trust, reliability, accountability, manageability and agility

ACK: M. Atkinson, e-Science (...), Grid2006 & 2-nd Int.Conf.e-Social Science 2006, National e-Science Centre UKACK: I. Foster, System Level Science and System Level Models, Snowmass, August 1-2, 2007

Page 4: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

Science 2.0Science 2.0

4ACK: David de Roure, http://www.soton.ac.uk/~dder/talks.html

Page 5: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

Users’ RequirementsUsers’ Requirements

5

e-Infrastructures in Europe: Research Network infrastructure:

GEANT pan-European network interconnecting National Research and Education Networks

Computing Grid Infrastructure: Enabling Grids for E-SciencE (EGEE project)

Transition to the sustainable European Grid Initiative (EGI) currently worked out through EGI_DS project

Data & Knowledge Infrastructure: Digital Libraries (DILIGENT) and repositories (DRIVER-II)

A series of other projects : Middleware interoperation, applications, policy and support

actions, etc.

Cyber-Infrastructures around the world: Similar in US and Asia Pacific

Research e-Infrastructures

ACK: Fabrizio Gagliardi, Microsoft, ICCS 2008

Explosion of Data

Experiments Archives LiteratureSimulations

Petabytes

Doubling every 2 years

ACK: Fabrizio Gagliardi, Microsoft, ICCS 2008

Petascale is coming

BlueGene/L, 596 TFlops peak, Low power, cheap processors

Earth Simulator, 35 TFlops peakVector processing

Jaguar / Franklin, 120 / 100 TFlops peak, Vector processing

Roadrunner, 1457 TFlops peak, Heterogeneous / PowerXCell 8iAMD Opteron DC 1.8GHzBE 3.2GHz (12.8 Gflops)

ACK: Hank Childs, Lawrence Livermore National Laboratory, ICCS 2008

129600 cores

212992 cores

TOP500, Nov.2008

ACK: F. Gagliardi

Resources -- Services Infrastructure Data Computational

Page 6: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

Introducing the GridIntroducing the Grid

6

Coordinates resources that are NOT subject of centalized control, which can be used as a virtual computer / storage

Using standard, open, general-purpose protocols and interfaces

Delivers nontrival quality of services Resources are shared by users due to Virtual

Organizations

Page 7: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

EGEEEGEE

7Mariusz Sterzel CGW'08 Kraków, 13 October 2008 7

EGEE

ArcheologyAstronomyAstrophysicsCivil ProtectionComp. ChemistryEarth SciencesFinanceFusionGeophysicsHigh Energy PhysicsLife SciencesMultimediaMaterial Sciences…

>250 sites48 countries>100,000 CPUs>30 PetaBytes>10,000 users>150 VOs>150,000 jobs/day

Page 8: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

The Consortium consists of five High Performance Computing Polish Centres representing Communities (coordinated by ACK CYFRONET) due to:

Participation in International and National Projects and Collaboration• ~35 international projects FP5, FP6, FP7 on Grids (50% common)

• ~15 Polish projects (50% common)

Needs by Polish Scientific Communities• ~75% publications in 5 Communities

Computational resources to date• Top500 list

European/Worldwide integration Activities

• EGEE I-III, EGI_DS, EGI, e-IRG, PRACE, DEISA, OMII, EU Unit F3 „Research Infrastructure” Experts

National Network Infrastructure ready• (thanks to Pionier National Project)

Rationales behind PL-Grid ConsortiumRationales behind PL-Grid Consortium

GEANT2GEANT2

Page 9: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

Partners’ Computing ResourcesPartners’ Computing Resources

TOP500 – November 2008TOP500 – November 2008

Page 10: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

PL-Grid Foundations – SummaryPL-Grid Foundations – Summary

Motivation E-Science approach to research EGI initiative ongoing in Europe (http://web.eu-egi.eu)

• Formation of stable e-intrastucture in Europe

• Collaboration with NGIs and coordination development activites

PL-Grid Project (2009-2012) Application in Operational Programme Innovative Economy, Activity 2.3

(in Sept. 2008) Got funded March 2, 2009 (via European Structural Funds)

Polish Infrastructure for Supporting Computational Science in the European Research Space

European Grid Initiative (EGI) National Grid Initiative (NGI) PL-Grid

Response to the needs of Polish scientists and ongoing Grid activities in Poland, other European countries and all over the world

Page 11: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

Grid infrastructure (Grid services) PL-Grid

App

licat

ion

App

licat

ion

App

licat

ion

App

licat

ion

Clusters High Performance Computers Data repositories

National Computer Network PIONIER

DomainGrid

Advanced Service Platforms

DomainGrid

DomainGrid

DomainGrid

Polish Grid is developing a common base infrastructure – compatible and interoperable with European and Worldwide Grids.

Specialized, domain Grid systems – including services and tools focused on specific types of applications.

Such an approach should enable efficient use of available financial resources. Plans for HPC and Scalability Computing enabled.

Offer for the Users Computing Power 215 Tflop/s Storage 2500 TB Support from PL-Grid staff

to using advanced Grid tools Support for porting legacy codes

to Grid environment Support for designing applications

for PL-Grid environment

PL-Grid AssumptionsPL-Grid Assumptions

Page 12: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

Elements and FunctionalityElements and Functionality

PL-Grid software comprises: user tools (portals, systems for

applications management and monitoring, result visualization and other purposes, compatible with the lower-layer software used in PL-Grid);

software libraries; virtual organization systems: certificates,

accounting, security, dynamic ; data management systems: metadata

catalogues, replica management, file transfer;

resource management systems: job management, applications, grid services and infrastructure monitoring, license management, local resource management, monitoring.

Users

Nationalcomputernetwork

Grid Application

Programming Interface

Virtual organizations andsecurity systems

Basic Grid services

Gridservices

LCG/gLite(EGEE)

UNICORE(DEISA)

OtherGrids

systems

Gridresources

Distributedcomputational

resources

Grid portals, development tools

Distributed data

repositories

Three Grid structures are maintained: production, reseach, development/testing.

Page 13: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

The PL-Grid Project is split into several workpackages

Planned Realization of AimsPlanned Realization of Aims

PLANNING AND DEVELOPMENT

OF INFRASTRUCTURE

P2

CoordinationStructure

Operation RulesDissemination

PROJECT MANAGEMENTP1

SECURITY CENTER

P6

Training

SUPPORT FOR VARIOUSDOMAIN GRIDS

P5P4 GRID SOFTWAREAND USERS

TOOLS DEVELOPMENTEGEE DEISA … .

OPERATIONS CENTERP3

Main Project Indicators: • Peak Perf.: 215 Tflops • Disk Storage: 2500 TB

Page 14: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

Operational Center: TasksOperational Center: Tasks

EGIGroup

EGI Testing

Middleware

EGI ProductionMiddleware

Coordination of Operation Management and accounting EGI and DEISA collaboration

(Scalability and HPC Computing) Users’ requirements analysis for

operational issues Running infrastructure for:

Production Developers Research

Future consideration:

Computational Cloud Data Cloud Internal and External

Clouds Virtualization aspects

Page 15: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

Operational Center: Current ResultsOperational Center: Current Results

PL-Grid VO (vo.plgrid.pl) fully operational Virtual production resources for PL-Grid users in

4 centres (Gdańsk, Kraków, Poznań, Wrocław) Infrastructure monitoring (failure detection) tools

based on Nagios & EGEE SAM in place Support teams, tools and EGEE/EGI-compliant

operations procedures aimed at achieving high availability of resources established

Knowledge sharing tool for operations problems Problem tracking tool to be released in November Provisional user registration procedure User registration portal in preparation

(1st prototype released in October) Accounting integrated with EGEE

PCSS

Cyfronet

WCSS

Page 16: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

Operational Center: NextOperational Center: Next PPlanslans

• User Registration– access to any person registered as researcher in Poland -

www.opi.org.pl– portal being already developed (to be released end of October'09)– access to resources based on

• grants – giving rights to use X CPUhours and TB of disk space

• access services – way of accessing resources in particular computing centre

• PL-Grid Research Infrastructure – “resources on demand” allows to request machines for different purposes– install a grid middleware for testing purposes– for developers to build a testing environment– any other research requiring grid resources– procedures are being defined

• Enable access to UNICORE resources – small amount of dedicated UNICORE machines– integrate UNICORE and gLite – make use of the same resources

Page 17: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

18

Software and Tools Development: Current ResultsSoftware and Tools Development: Current Results

Establishment of eight groups of research developers, 20+ people from five Polish supercomputing centers, collaborating together to improve the quality of existing European grid infrastructure services, tools and applications

Research and development testbed available based on dedicated physical and virtual machines from supercomputing centers

Initial evaluation of new user tools, environments and grid applications developed and supported in Poland

Basic extensions and improvement procedures proposed to various functionalities provided by gLite, Unicore and QosCosGrid

Software reengineering, development and testing of useful user tools, such as Migrating Desktop, gEclipse, ViroLab/GridSpace, FIVO, Vine Toolkit, QCG-OpenMPI, QCG-ProActive, etc.

Software quality management for new services and tools to be deployed and maintained in the second phase of Pl-Grid

Active links to new user communities in Poland interested in advanced computing technologies in order to collect additional requirements and scenarios:

www.plgrid.pl/ankieta

Page 18: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

19

Software and Tools Development: in nSoftware and Tools Development: in numbersumbers

Research developers: 20+Main research teams: 8Supercomputing centers involved : 5Dedicated physical / virtual machines: 30+Tested infrastructures: EGEE, DEISASoftware reengineering and tools testing:

QosCosGrid, GridSpace, Migrating Desktop, gEclipse, Virolab/GridSpace, FiVO, Vine, QCG-OpenMPI/ProActive, etc.

Software quality management :for new services and tools

Current applications under integration and testing procedures: 10+

Current user tools under integration and testing procedures: 5+

Active links to new user communities in Poland: www.plgrid.pl/ankieta

Page 19: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

20

Software and Tools Development: Next PSoftware and Tools Development: Next Planslans

Closer collaboration with user and developer communities representing various scientific domains (based on feedback received from our questionnaire)

Adaptation of virtualization techniques improving usability, fault tolerance, checkpointing and cluster management

Fancy web GUIs based on Liferay and Vine Toolkit accessing existing and extended e-Infrastructure services

Performance tests of various large scale cross-cluster parallel applications using co-allocation and advance reservation techniques

More analysis of grid management, monitoring and security solutions Common software repository and deployment rules Example services, tools and applications available as virtual boxes with different

configurations Main checkpoints and timeline:

IX 09VI 09 IX 10 IV 11 I 12

We are here

Page 20: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

Users’ Support and TrainingUsers’ Support and Training

General task: Running Help-Desk and users’ training and support tasks Supporting how to effectively exploit developed infrastructure including

selection of right applications for users’ problems and seamless running computation on the grid.

Pl-grid is an user oriented infrastructure, support and training team will be close to the users.

At the moment: Support Team is organized Internal training for support team already begun Selection of licensed application required by the users’ community to be

installed on the whole PL-Grid infrastructure First training for the users delivered.

Plans: Many, many trainings for beginners and advanced grid users

Page 21: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

PL-Grid Security PL-Grid Security Group:Group: Current ResultsCurrent Results

Defined security guidelines for architecture planning Defined security policy for local sites installations Incident reporting systems review to choose best solution for PL-

Grid GGUS -  Global Grid User Support RTIR - Request Tracking for Incident Response DIHS - Distributed Incident Handling System

Architecture and prototype version of on-line user mapping and credentials distribution system.

Page 22: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

PL-Grid Security Group: Next PlansPL-Grid Security Group: Next Plans

Establish NGI-wide CERT Introduce security policies into local sites Develop and deploy local sites security policy conformity monitoring

system check if software (kernel, services) is up to date check listening ports, firewall rules monitor suid/sgid lists and integrity

Develop distributed alert correlation system build on top of network and host sensors.

Page 23: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

How to become a User ?How to become a User ?

Each Polish scientist can make use of PL+Grid resources

Just to register

For registration and other information see:

http://www.plgrid.pl

Ask questions:

[email protected]

For registration send an e-mail to:

email: [email protected]

24ACK: M. Sterzel, Cyfronet

Page 24: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

ExamplesExamples

25

Page 25: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

Example: Virtual Laboratory GridSpaceExample: Virtual Laboratory GridSpace

Use of distributed computational resources and data repositories High-level tools offered for the user for in-silico experiments

ACK: EU Project Virolab, http://virolab.cyfronet.pl26

Page 26: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

Sample experiment in Sample experiment in ViroLabViroLab Environment Environment

Patient’s data Medical examination HIV genetic sequence put into

database

Experiment in-silico Collect HIV genetic sequences

from database Perform sequence matching

dopasowanie Calculate virus resistance

27

http://virolab.cyfronet.plhttp://www.virolab.org

Page 27: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

[email protected]

Example: Biotechnology in GridExample: Biotechnology in Grid

ComputingElement

Euchina Virtual Organization (EGEE)

StorageElement

2. Transfer

application

User InterfacePortal

AABTDDSAD

1.Submit

sequence

3. Store protein

4. Visualize

PDB1.32 3.23 3.442.77 4.33 5.661.32 3.23 3.44

ACK: Irena Roterman, Jagiellonian University, Tomasz Szepieniec, Cyfronet

Never Born Protein Folding

Page 28: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

Contract-based Dynamic Virtual Organizations FiVOContract-based Dynamic Virtual Organizations FiVO

29

FiVO FiVO

FiVO FiVO

inetOrgPerson

inetOrgPerson

inetOrgPerson

inetOrgPerson

inetOrgPerson

ORGANIZATION A

ORGANIZATION B

ORGANIZATION C

inetOrgPerson

inetOrgPerson

inetOrgPerson

inetOrgPerson

inetOrgPerson

inetOrgPerson

inetOrgPerson

ORGANIZATION D

MDS

MDS

ServiceRegistry

ServiceRegistry

ServiceRegistry

GVOSFGVOSF

GVOSFGVOSF

inetOrgPerson

inetOrgPerson

MDS

inetOrgPerson

inetOrgPerson

DATA

DATA

DATA

DATA

VO-1

VO-1CONTRACT

Goal: Allow end users to defined their

requirements for the Virtual Organization on a high level of abstraction Semantic description of domain

Results: Automatic deployment of the VO

Security and monitoring

Optimization of data access Replication and migration

Overall system architectureOverall system architecture

Page 29: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

Summary – ActivitiesSummary – Activities

Achieved Establish PL-Grid VO using Partners’ local and EGEE resources In operation for Users’ Communities Provide resources for covering operational costs Provide resources for keeping international collaboration ongoing

Long term Software and tools implementation Users’ support and traning Provide, keep and extend the necessary infrastructure Work on approaching paradigms and integration development

• HPCaaS and Scalable Computing (Capability and Capacity Computing)

• Clouds Computing (internal-external, computing clouds, data clouds)

• SOA paradigm, knowledge usage …

• „Future Internet” as defined by EC in Workprogramme

Page 30: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

AcknowledgementsAcknowledgements

ACK Cyfronet AGH Tomasz Szepieniec Marcin Radecki Mariusz Sterzel Karol Krawentek Agnieszka Szymańska Andrzej Oziębło Tadeusz Szymocha Alex Kusznir Teresa Ozga Aleksandra Mazur

ICM Piotr Bała Maciej Filocha

PCSS Norbert Meyer Krzysztof Kurowski Mirosław Kupczyk

WCSS Józef Janyszek Bartłomiej Balcerek Paweł Dziekoński

TASK Mścisław Nakonieczny Jarosław Rybicki Rafał Tylman

31

WORK PERFORMED IN THE FRAMEWORK OF PROJECT POIG.02.03.00-00-007/08

Page 31: Polish Infrastructure  for  Supporting Computational  Science  in the European Research Space

http://plgrid.plhttp://plgrid.pl