Page 1: EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Experiences with using the EGEE grid infrastructure

EGEE-III INFSO-RI-222667

Enabling Grids for E-sciencE

www.eu-egee.org

EGEE and gLite are registered trademarks

Experiences with using the EGEE grid infrastructure and lessons for the future

Bob Jones (CERN)

EGEE Project Director

Page 2: EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Experiences with using the EGEE grid infrastructure

Contents

• EGEE in one slide
• What EGEE does today (one more slide)
• Our understanding of what CLARIN wants
  – Based on Peter Wittenburg’s presentation at EGEE’09 last week and the “CLARIN short-guides” (very useful!): Centres, Trust Domain, Metadata, Virtual Collections, etc.
  – Mapped on to what exists today: I don’t pretend that we have a turn-key solution for CLARIN but rather these are examples of what is possible
• How CLARIN could interface with EGI
• Suggested Next Steps

Lots of material contributed by EGEE & WLCG colleagues

Bob Jones - NEERI 09 2

Page 3: EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Experiences with using the EGEE grid infrastructure

EGEE-III

Main Objectives
– Expand/optimise existing EGEE infrastructure, include more resources and user communities
– Prepare migration from a project-based model to a sustainable federated infrastructure based on National Grid Initiatives

Flagship Grid infrastructure project co-funded by the European Commission

Duration: 2 years
Consortium: ~140 organisations across 33 countries
EC co-funding: 32 million €

Page 4: EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Experiences with using the EGEE grid infrastructure

EGEE – What do we deliver?

• Infrastructure operation
  – Sites distributed across many countries
  – Large quantity of CPUs and storage
  – Continuous monitoring of grid services & automated site configuration/management
  – Support multiple Virtual Organisations from diverse research disciplines
• Middleware
  – Production quality software distributed under business friendly open source licence
  – Implements a service-oriented architecture that virtualises resources
  – Adheres to recommendations on web service inter-operability and evolving towards emerging standards
• User Support
  – Managed process from first contact through to production usage
  – Training
  – Expertise in grid-enabling applications
  – Online helpdesk
  – Dedicated support for specific disciplines
  – Networking events (User Forum, Conferences etc.) for cross-discipline interaction

Page 5: EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Experiences with using the EGEE grid infrastructure

CLARIN Centres

• Centres classification
  – Recognized: R
  – Metadata: C
  – Service: B
  – Infrastructure: A (roughly equivalent of EGEE Regional Operations Centres)
  – External: E
• Need to monitor quality of services provided by centres
  – Need more details on the service definitions for each type
  – Probably need a Service Level Agreement for each type
  – Example from EGEE: EGEE Service level agreement between Regional Operations Centres and Sites
• EGEE/EGI has an extendable monitoring infrastructure
  – Based on NAGIOS, a widely used and extendable open source monitoring toolkit
  – See “Service Availability Monitoring in EGEE and Beyond” video demo @ EGEE’09 on YouTube
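Nagios-style monitoring relies on small probe programs ("plugins") that print one status line and report the result through their exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The following is a minimal sketch of such a probe in Python. The function names, the TCP check, and the latency thresholds are illustrative assumptions and are not taken from the EGEE probes.

```python
import socket
import time

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify(latency_s, warn_s=1.0, crit_s=5.0):
    """Map a connection latency in seconds (None = failed) to a status code."""
    if latency_s is None or latency_s >= crit_s:
        return CRITICAL
    if latency_s >= warn_s:
        return WARNING
    return OK


def probe(host, port, timeout_s=5.0):
    """Return the seconds needed to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return time.monotonic() - start
    except OSError:
        return None


def check(args):
    """Probe the service given as [HOST, PORT]; print a status line and
    return the exit code a plugin wrapper would pass to sys.exit()."""
    try:
        host, port = args[0], int(args[1])
    except (IndexError, ValueError):
        print("UNKNOWN - usage: HOST PORT")
        return UNKNOWN
    latency = probe(host, port)
    status = classify(latency)
    detail = "no connection" if latency is None else f"connected in {latency:.3f}s"
    print(f"{LABELS[status]} - {detail}")
    return status
```

A wrapper script would end with `sys.exit(check(sys.argv[1:]))` so that the monitoring server can read the status from the process exit code.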

Page 6: EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Experiences with using the EGEE grid infrastructure

WLCG depends on two major science grid infrastructures:

• EGEE - Enabling Grids for E-sciencE
• OSG - US Open Science Grid

Interoperability & interoperation is vital: significant effort has gone into building the procedures to support it

Page 7: EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Experiences with using the EGEE grid infrastructure

An example: WLCG

Tier 0 – Tier 1 – Tier 2

Tier-0 (CERN):
• Data recording
• Initial data reconstruction
• Data distribution

Tier-1 (11 centres):
• Permanent storage
• Re-processing
• Analysis

Tier-2 (~130 centres):
• Simulation
• End-user analysis

The WLCG MoU: http://lcg.web.cern.ch/lcg/mou.htm

Page 8: EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Experiences with using the EGEE grid infrastructure

Monitoring Centres

http://gstat-dev/gstat/summary/grid/WLCG/

Page 9: EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Experiences with using the EGEE grid infrastructure

Trust Domain

https://www.eugridpma.org/members/worldmap/

• The choices made by CLARIN appear to be very sensible
  – Not exactly the same as EGEE/EGI but interoperation is possible
• Pilot project between BiG Grid, SURFnet and MPI already built an integrated online “SLCS” Certificate Authority service with an example use case of the IMDI browser (a linguistic corpus access browser)
• Talk to AAI community
  – GEANT and IGTF/EUGridPMA have a lot of useful experience
  – Europe should avoid separate sets of CAs
  – [email protected]

Page 10: EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Experiences with using the EGEE grid infrastructure

Centre availability/reliability reporting

See VO Specific Service Monitor using Service Level Status video demo @ EGEE’09 on YouTube
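Availability and reliability figures for a centre are typically derived from the status history that the monitoring probes produce. The sketch below assumes one common pair of definitions: availability is up time over known time, and reliability additionally excludes scheduled downtime. The formulas agreed in a particular SLA may differ, and the sample data is invented for illustration.

```python
from collections import Counter

# Hypothetical status labels, one per sampling interval.
UP, DOWN, SCHEDULED, UNKNOWN = "up", "down", "scheduled_down", "unknown"


def availability_reliability(samples):
    """Return (availability, reliability) as fractions in [0, 1].

    Assumed definitions (not taken from the slides):
      availability = up / (total - unknown)
      reliability  = up / (total - unknown - scheduled downtime)
    """
    counts = Counter(samples)
    known = len(samples) - counts[UNKNOWN]
    unscheduled = known - counts[SCHEDULED]
    availability = counts[UP] / known if known else 0.0
    reliability = counts[UP] / unscheduled if unscheduled else 0.0
    return availability, reliability


# Invented example: 24 hourly samples for one centre.
day = [UP] * 18 + [SCHEDULED] * 4 + [DOWN] * 2
avail, rel = availability_reliability(day)
print(f"availability={avail:.2%} reliability={rel:.2%}")
```

With these invented samples the centre was up for 18 of 24 hours, so availability is 75%, while reliability is 90% because the 4 hours of announced maintenance are not counted against it.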

Page 11: EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Experiences with using the EGEE grid infrastructure

Component Metadata

• AMGA – the ARDA Metadata Grid Application
• Metadata Catalogue of EGEE’s gLite Middleware
  – Millions of files, 6000+ users, 200+ computing centres
  – Mainly (read-only) file metadata
  – Main concerns: scalability, performance, fault-tolerance, support for hierarchical collections, security
    • Can replicate metadata between different AMGA instances, allowing the federation of metadata
    • Different authentication methods via (Grid-Proxy-) Certificates as well as very flexible access control mechanisms for individual data items based on ACLs
  – Does not yet support Persistent Identifiers
    • AMGA uses grid file LFNs (Logical File Names) as does the rest of gLite
    • Would require some development

http://amga.web.cern.ch/amga/

[email protected]

AMGA 2.0 presentation at EGEE’09

same campus as KAIST (possible ISOcat mirror for CLARIN)
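To make hierarchical collections and per-entry ACLs concrete, here is a small in-memory sketch. It is not AMGA's API: the class, the method names, the paths, and the user names are all hypothetical, and a real catalogue adds persistence, replication, and certificate-based authentication.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One catalogue entry: free-form attributes plus a read ACL."""
    attributes: dict
    readers: set = field(default_factory=set)


class Catalogue:
    """Toy hierarchical metadata catalogue keyed by slash-separated paths."""

    def __init__(self):
        self._entries = {}

    def add(self, path, attributes, readers):
        self._entries[path] = Entry(dict(attributes), set(readers))

    def get(self, path, user):
        """Return the attributes of one entry, enforcing its ACL."""
        entry = self._entries[path]
        if user not in entry.readers:
            raise PermissionError(f"{user} may not read {path}")
        return entry.attributes

    def list_collection(self, collection, user):
        """List the entries under a collection that the user may read."""
        prefix = collection.rstrip("/") + "/"
        return sorted(
            path for path, entry in self._entries.items()
            if path.startswith(prefix) and user in entry.readers
        )


cat = Catalogue()
cat.add("/corpora/dutch/rec001", {"language": "nl", "format": "wav"}, {"alice", "bob"})
cat.add("/corpora/dutch/rec002", {"language": "nl", "format": "wav"}, {"alice"})
cat.add("/corpora/sami/rec001", {"language": "se", "format": "wav"}, {"bob"})

print(cat.list_collection("/corpora/dutch", "bob"))
```

Keying entries by path keeps the collection hierarchy implicit in the names, which mirrors how logical file names are organised, and putting the ACL on each entry lets two users see different contents in the same collection.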

Page 12: EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Experiences with using the EGEE grid infrastructure

Workflows

• Many workflow managers supported
  – WMS (part of gLite)
  – GridWay (part of RESPECT)
  – Kepler, Taverna etc.
• Example - WISDOM
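Whatever tool is chosen, a workflow manager has to run tasks in an order that respects their dependencies. The sketch below shows that core idea with Python's standard `graphlib` module (Python 3.9 or later). The task names are invented, and the code does not use the API of WMS, GridWay, Kepler, or Taverna.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each task maps to the set of tasks it depends on.
workflow = {
    "stage_in": set(),
    "tokenise": {"stage_in"},
    "tag": {"tokenise"},
    "index": {"tokenise"},
    "publish": {"tag", "index"},
}


def run(workflow, execute):
    """Run tasks in dependency order and return the order actually used."""
    sorter = TopologicalSorter(workflow)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    order = []
    while sorter.is_active():
        # Tasks in one ready batch are independent and could run in parallel.
        for task in sorted(sorter.get_ready()):
            execute(task)
            order.append(task)
            sorter.done(task)
    return order


order = run(workflow, execute=lambda task: print("running", task))
```

Here `tag` and `index` become ready at the same time once `tokenise` has finished, which is the point at which a grid workflow manager would dispatch them to different resources.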

Page 13: EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Experiences with using the EGEE grid infrastructure

Virtual Collections

VOMS: Virtual Organization Membership Service

VOMS is a system for managing authorization data within multi-institutional collaborations. VOMS provides a database of user roles and capabilities and a set of tools for accessing and manipulating the database and using the database contents to generate Grid credentials for users when needed
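VOMS expresses group membership and roles as attribute strings known as FQANs, for example `/clarin/corpora/Role=curator/Capability=NULL`. The sketch below parses such a string and makes a simple authorisation decision. The VO name, group, role, and policy table are invented for illustration, and a real service would take the attributes from a verified proxy certificate and not from a plain string.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Fqan:
    vo: str
    group: str            # full group path, e.g. "/clarin/corpora"
    role: Optional[str]   # None when the FQAN has Role=NULL or no role


def parse_fqan(text):
    """Split an FQAN string into VO name, group path, and role."""
    if not text.startswith("/"):
        raise ValueError(f"FQAN must start with '/': {text!r}")
    groups, role = [], None
    for part in text.split("/"):
        if not part:
            continue  # skip the empty pieces around leading/trailing slashes
        if part.startswith("Role="):
            value = part[len("Role="):]
            role = None if value == "NULL" else value
        elif part.startswith("Capability="):
            continue  # capabilities are ignored in this sketch
        else:
            groups.append(part)
    if not groups:
        raise ValueError(f"FQAN has no VO name: {text!r}")
    return Fqan(vo=groups[0], group="/" + "/".join(groups), role=role)


# Hypothetical policy: (group, role) pairs allowed to modify a collection.
WRITE_POLICY = {("/clarin/corpora", "curator")}


def may_write(fqan_text):
    """Decide write access from the group and role carried by the FQAN."""
    fqan = parse_fqan(fqan_text)
    return (fqan.group, fqan.role) in WRITE_POLICY


print(may_write("/clarin/corpora/Role=curator/Capability=NULL"))
print(may_write("/clarin/corpora/Role=NULL/Capability=NULL"))
```

Because the decision depends only on group and role, a service never needs its own user list: membership is managed once in the VO and travels with the credential.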

http://www.gcube-system.org/

gCube offers a feature-rich platform for distributed hosting, management and retrieval of data and information

See EGEE09 demo on YouTube: A Virtual Research Environment for Species Distribution Map Generation and Management

Page 14: EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Experiences with using the EGEE grid infrastructure

Goal: Long-term sustainability of grid infrastructures in Europe

Approach: Establish a federated model bringing together National Grid Infrastructures (NGIs) to build the European Grid Infrastructure (EGI)

EGI Organisation: Coordination and operation of a common multi-national, multi-disciplinary Grid infrastructure

• To enable and support international Grid-based collaboration
• To provide support and added value to NGIs
• To liaise with corresponding infrastructures outside Europe

Page 15: EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Experiences with using the EGEE grid infrastructure

CLARIN and EGI

• The creation of National Grid Infrastructures and their overall coordination can provide an ICT context for the research infrastructures
  – An operational framework for centres involved in CLARIN
• In the EGI context, Specialised Support Centres (SSCs) are the means of interaction with user communities
  – The EGI SSCs are established and governed by the user communities
  – Humanities SSC foreseen in ROSCOE project proposal

Page 16: EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Experiences with using the EGEE grid infrastructure

ESFRI @ EGEE’09

Cherenkov Telescope Array

Page 17: EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Experiences with using the EGEE grid infrastructure

How to be future proof

• Consider ALL (production grids, supercomputers, commercial cloud systems, volunteer grids, network etc.) as a combined e-Infrastructure ecosystem
  – Aim for interoperability and combine the resources into a consistent whole
  – Work closely with EGEE/EGI, DEISA/PRACE and GEANT – they are ready to help! – they have links around the world
• Keep the applications agile
  – Don’t make the code so specialised that it can only use one specific installation – things will change!
• Make it easy for the users
  – Consider a community gateway/portal
    • Simplify authorisation/authentication
    • Easy access to common codes (handle license issues)
    • Relevant tutorials & documentation
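One way to keep an application agile in the sense described above is to hide the execution backend behind a small interface, so that moving to another installation means adding a backend and leaving the application untouched. The sketch below is an illustration only: the interface and both backends are hypothetical, and the "grid" backend merely records submissions instead of calling real middleware.

```python
from abc import ABC, abstractmethod


class Backend(ABC):
    """Minimal execution interface that the application codes against."""

    @abstractmethod
    def submit(self, task, argument):
        """Run (or queue) task(argument) and return a result or job handle."""


class LocalBackend(Backend):
    """Runs the task immediately in the current process."""

    def submit(self, task, argument):
        return task(argument)


class RecordingGridBackend(Backend):
    """Stand-in for a remote backend: records jobs instead of submitting them."""

    def __init__(self):
        self.queue = []

    def submit(self, task, argument):
        self.queue.append((task.__name__, argument))
        return len(self.queue) - 1  # position in the queue as a fake job id


def count_words(text):
    return len(text.split())


def analyse(backend, documents):
    """Application logic, unaware of which backend executes the tasks."""
    return [backend.submit(count_words, doc) for doc in documents]


docs = ["a small corpus", "two words"]
local_results = analyse(LocalBackend(), docs)
grid = RecordingGridBackend()
job_ids = analyse(grid, docs)
print(local_results, job_ids)
```

The `analyse` function is identical in both calls; only the object passed in changes, which is the property that protects the application when the underlying installation is replaced.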

Page 18: EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Experiences with using the EGEE grid infrastructure

Grids, clouds, supercomputers, etc.

Grids
• Collaborative environment
• Distributed resources (political/sociological)
• Commodity hardware (also supercomputers)
• (HEP) data management
• Complex interfaces (bug not feature)

Supercomputers
• Expensive
• Low latency interconnects
• Applications peer reviewed
• Parallel/coupled applications
• Traditional interfaces (login)
• Also SC grids (DEISA, Teragrid)

Clouds
• Proprietary (implementation)
• Economies of scale in management
• Commodity hardware
• Virtualisation for service provision and encapsulating application environment
• Details of physical resources hidden
• Simple interfaces (too simple?)

Volunteer computing
• Simple mechanism to access millions of CPUs
• Difficult if (much) data involved
• Control of environment check
• Community building – people involved in Science
• Potential for huge amounts of real work

Many different problems: amenable to different solutions

No right answer

Page 19: EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Experiences with using the EGEE grid infrastructure

European E-Infrastructure Forum

• Forum for the discussion of principles and practices to create synergies for distributed Infrastructures

• Goal: seamless interoperation of leading e-Infrastructures serving the European Research Area

• Focus: needs of the user communities that require services which can only be achieved by collaborating Infrastructures

• Initial membership:
  – EGEE & EGI
  – DEISA & PRACE
  – Terena & GEANT

• Offers a way of interacting as a whole with user communities of a multi-national nature that are interested in making use of the Infrastructures

Page 20: EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Experiences with using the EGEE grid infrastructure

Proposed next steps (1)

• Identify clear contact points between the ESFRI projects and e-Infrastructures
  – E-Infrastructure projects have been talking to individuals (users or partners)
  – Can we make contacts more official and identify contact points in specific areas:
    • Security
    • Data management
    • Network
    • Etc.
  – These will be useful for establishing links between different ESFRI projects, between ESFRI projects and e-Infrastructures etc.

Page 21: EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Experiences with using the EGEE grid infrastructure

Proposed next steps (2)

• Use these contacts to build matrix for technical requirements & organisational aspects

Matrix columns: requirement, CLARIN, DARIAH/CESSDA, EISCAT3D, EPOS, LIFEWATCH, ELIXIR, XFEL, CTA, FAIR, SKA

Matrix rows (cells still empty):
Single sign-on
Persistent storage
Global
workflows
Virt Org
stds
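A matrix like this is easy to keep in machine-readable form so that it can be filled in project by project and queried later. The sketch below uses the project names and row labels exactly as they appear on the slide. The helper functions are hypothetical, `None` means "not yet collected", and the two recorded values are illustrative only and are not results reported in the talk.

```python
PROJECTS = [
    "CLARIN", "DARIAH/CESSDA", "EISCAT3D", "EPOS", "LIFEWATCH",
    "ELIXIR", "XFEL", "CTA", "FAIR", "SKA",
]
REQUIREMENTS = [
    "Single sign-on", "Persistent storage", "Global",
    "workflows", "Virt Org", "stds",
]


def empty_matrix():
    """Every cell starts as None, meaning 'not yet collected'."""
    return {req: {proj: None for proj in PROJECTS} for req in REQUIREMENTS}


def record(matrix, requirement, project, needed):
    """Store one answer, rejecting unknown row or column names."""
    if requirement not in matrix or project not in matrix[requirement]:
        raise KeyError((requirement, project))
    matrix[requirement][project] = bool(needed)


def shared_by(matrix, requirement):
    """Projects that reported needing this requirement."""
    return [p for p, needed in matrix[requirement].items() if needed]


def open_cells(matrix):
    """Count the cells that still have to be filled in."""
    return sum(v is None for row in matrix.values() for v in row.values())


matrix = empty_matrix()
# Illustrative values only; not results reported in the talk.
record(matrix, "Single sign-on", "CLARIN", True)
record(matrix, "Single sign-on", "EPOS", True)
print(shared_by(matrix, "Single sign-on"), open_cells(matrix))
```

Queries such as `shared_by` are what turn the matrix into a planning tool: a requirement shared by several projects is a natural candidate for a common e-Infrastructure service.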

Page 22: EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Experiences with using the EGEE grid infrastructure

Proposed next steps (3)

• Once the matrix has been built it can be used to focus:
  – Collaboration between ESFRI projects
  – Collaboration between ESFRI projects and e-Infrastructures
  – Provide input to roadmaps for e-Infrastructures of the future
  – Provide input to national funding agencies and European Commission on their future funding programmes

Page 23: EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE  EGEE and gLite are registered trademarks Experiences with using the EGEE grid infrastructure

Summary

The key added value of grid infrastructures is a framework for collaboration

• Global secure access to computing resources, data, software and results
  – CPU power for computing-intensive tasks
  – Data management capabilities
    • Metadata and annotation
    • Security
    • Replication
    • High-speed data transfers
    • Facilitate creation of distributed data repositories, data mining, indexing and search
  – Software services
    • Availability of open source software
    • Integration with commercial software packages

• Scalable and dynamic architecture which can be extended with additional services as required

• All organisations can participate AND contribute

The EGI operational model and SSCs are a candidate mechanism for CLARIN to interact with EGI