dr kevin vella - staff.um.edu.mt

Introduction to Grids

Dr Kevin Vella

Department of Computer Science
University of Malta

Overview

• Introduction
• Grid Architecture
• Grid Applications
• Grid Projects

Grid Computing
• An analogy with the electrical power grid
  - It is pervasive: just log (or plug) in and use it anywhere
  - It is a utility: you ask for resources and you get what you asked for
  - You do not know or care where the resources are coming from (power station/data centre)
• The Web enables seamless sharing of information
  - Developed to enable information sharing between researchers
• The Grid enables seamless sharing of information, compute power, storage, databases and applications
  - Revolutionising the way research is conducted
  - Empowering researchers in remote areas with limited facilities

Grid Computing
• Grid Computing: controlled sharing of geographically distributed resources that are owned and administered by several organisations
  - Supercomputers (Roadrunner – c. 1 petaflop/s)
  - Big Science equipment (telescopes – SKA, particle accelerators – LHC)
  - Digital archives
• Grid: distributed supercomputer with access to huge data resources and world-class equipment (EGI/EGEE, OSG, WLCG)
• Agreed-upon standards and protocols are crucial (OGF, OGSA)
• Compliant middleware (often based on web service standards) needs to be present on all participating nodes (gLite, Globus)
• Batch jobs/interactive web-based grid applications

Related Technologies

• Distributed computing
• Cluster computing
• Meta-computing
• Application service provision (ASP)
• Software-as-a-Service (SaaS)
• Utility computing
• Cloud computing
• P2P computing
• Web services, WSRF

Grid technologies complement existing distributed technologies by extending the distribution across organisational boundaries

Resources and Services
• The grid needs to schedule access by multiple users (human or machine) to a wide range of distributed shared resources
  - Multiple servers or peers
• Standard protocols guarantee interoperability across a range of systems (hardware, OS, programming language)
• Services are defined solely in terms of
  - Protocol
  - Behaviour
  thus abstracting away internal heterogeneity of resources
  - All resources are exposed as services
• New generation grid middleware uses web service standards (see WSRF)
• Standard services include
  - Data access
  - Resource discovery
  - Access to computational resources
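A minimal sketch of "services defined solely by protocol and behaviour", assuming a hypothetical `StorageService` contract and two made-up resource types (`DiskArray`, `TapeArchive`). None of these names come from real grid middleware; the point is only that client code sees the contract, not the internals.

```python
from typing import Protocol


class StorageService(Protocol):
    """Hypothetical service contract: callers see only these operations."""

    def free_space_gb(self) -> float: ...
    def read(self, name: str) -> bytes: ...


class DiskArray:
    """One site's resource: files held in memory as a stand-in for disks."""

    def __init__(self, capacity_gb: float, files: dict[str, bytes]) -> None:
        self._capacity_gb = capacity_gb
        self._files = files

    def free_space_gb(self) -> float:
        used_gb = sum(len(data) for data in self._files.values()) / 1e9
        return self._capacity_gb - used_gb

    def read(self, name: str) -> bytes:
        return self._files[name]


class TapeArchive:
    """Another site's resource with a different internal layout."""

    def __init__(self, tapes: list[dict[str, bytes]], free_gb: float) -> None:
        self._tapes = tapes
        self._free_gb = free_gb

    def free_space_gb(self) -> float:
        return self._free_gb

    def read(self, name: str) -> bytes:
        # Search each tape in turn; the caller never sees this detail.
        for tape in self._tapes:
            if name in tape:
                return tape[name]
        raise KeyError(name)


def fetch(service: StorageService, name: str) -> bytes:
    """Client code depends on the contract, not on the resource type."""
    return service.read(name)


disk = DiskArray(100.0, {"run1.dat": b"abc"})
tape = TapeArchive([{"run0.dat": b"xyz"}], free_gb=500.0)
```

Both `fetch(disk, "run1.dat")` and `fetch(tape, "run0.dat")` go through the same call, which is the sense in which heterogeneity is abstracted away.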

Virtual Organisations
• A virtual organisation (VO) is a community whose members are sharing a set of resources/services
  - Multiple institutions may be involved (e.g. universities and research labs, business consortia)
  - Sophisticated rules govern sharing within a VO
  - Users and resources may be geographically dispersed
  - VOs are long-lived and dynamic
• VOs tend to be domain-specific (e.g. LHC, Biomed)
  - The LHC VO in EGEE enables European physicists to analyse immense datasets produced by the CERN LHC experiment using several supercomputers across Europe
• A grid generally contains several VOs
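A toy model of VO sharing rules, assuming a hypothetical `VirtualOrganisation` class in which membership and a per-resource set of permitted operations decide access. Real VO policy is far richer; the user names, resource names and operations below are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualOrganisation:
    name: str
    members: set[str] = field(default_factory=set)
    # Maps a resource name to the operations that members may perform on it.
    rules: dict[str, set[str]] = field(default_factory=dict)

    def may(self, user: str, operation: str, resource: str) -> bool:
        """Allow an operation only for members, and only where a rule permits."""
        if user not in self.members:
            return False
        return operation in self.rules.get(resource, set())


# Hypothetical VO spanning two institutions and two shared resources.
lhc = VirtualOrganisation(
    name="lhc",
    members={"alice@um.example", "bob@cern.example"},
    rules={"cluster-A": {"submit_job", "read"}, "archive-B": {"read"}},
)
```

Here `lhc.may("alice@um.example", "submit_job", "cluster-A")` is allowed, while the same operation on `archive-B`, or any request from a non-member, is refused.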

Overview

• Introduction
• Grid Architecture
• Grid Applications
• Grid Projects

Grid Architecture

Application

Collective (Groups)

Resource (Services)

Connectivity (TCP/IP, IPv6)

Fabric (Physical resources)

Fabric Layer
• The fabric layer exposes local resources located at various sites to be shared across the Grid
  - Inspect and change a resource’s state through a standard protocol
• Richer functionality enables more sophisticated sharing at higher layers
  - Advance reservation and co-scheduling
• Simpler functionality simplifies integration with existing management interfaces to resources
• Resource capabilities vary
  - Computational resources
  - Storage resources (file operations, free space)
  - Network resources (bandwidth reservation, load inspection)
  - Code repositories (source and object, e.g. CVS)
  - Catalogs (query and update, e.g. databases)
• ‘Exactly-once’ semantics are required
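The slides do not say how exactly-once semantics are achieved. One common approach, sketched here as an assumption, is to let clients retry freely and have the resource deduplicate by request id, so a state change is applied once however many times the message arrives. `StorageResource` and the request ids are hypothetical.

```python
class StorageResource:
    """Toy fabric resource whose state changes must not be applied twice."""

    def __init__(self, free_gb: int) -> None:
        self.free_gb = free_gb
        self._results: dict[str, int] = {}  # request id -> recorded result

    def allocate(self, request_id: str, size_gb: int) -> int:
        """Reserve space; a repeated request id returns the first result."""
        if request_id in self._results:
            return self._results[request_id]
        if size_gb > self.free_gb:
            raise ValueError("not enough free space")
        self.free_gb -= size_gb
        self._results[request_id] = self.free_gb
        return self.free_gb


disk = StorageResource(free_gb=100)
first = disk.allocate("req-1", 30)
retry = disk.allocate("req-1", 30)  # e.g. the client timed out and resent
```

The retry returns the recorded result and leaves the free space unchanged, so the allocation takes effect only once.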

Connectivity Layer
• The connectivity layer provides communication and authentication protocols
  - Currently TCP/IP is used for communication
  - Authentication requirements
    · Single sign-on (user authenticates only once, at the start of a session)
    · Delegation (user delegates rights to a program)
    · Integration with existing security solutions (e.g. Kerberos, UNIX)
    · User-based trust (multiple sites can interact with each other on behalf of a user, without specific intervention by individual site administrators)
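Single sign-on and delegation can be pictured as a chain of credentials. The sketch below is a toy with no cryptography: the `Credential` class, its fields and the subject names are all hypothetical, and real middleware would rely on signed certificates instead.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Credential:
    subject: str
    rights: frozenset[str]
    issuer: Optional["Credential"] = None  # None marks the user's own login

    def delegate(self, subject: str, rights: set[str]) -> "Credential":
        """Issue a child credential; it may only narrow the rights."""
        if not rights <= self.rights:
            raise PermissionError("cannot delegate rights you do not hold")
        return Credential(subject, frozenset(rights), issuer=self)

    def root_user(self) -> str:
        """Walk the chain back to the user who signed on."""
        cred = self
        while cred.issuer is not None:
            cred = cred.issuer
        return cred.subject


# Single sign-on: the user authenticates once and obtains one credential.
user = Credential("alice", frozenset({"read", "submit_job"}))
# Delegation: a job running at a remote site acts on the user's behalf.
job = user.delegate("job-17@site-B", {"read"})
```

Any site receiving `job` can trace it back to `alice` through `root_user()`, which is the idea behind user-based trust: sites act on the user's identity rather than on per-site arrangements.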

Resource Layer
• The resource layer enables sharing of individual resources on the grid by accessing fabric APIs through the connectivity layer
• Resource sharing is done using
  - Information protocols to obtain resource state (configuration, load, usage policy)
  - Management protocols to negotiate access (advance reservation, QoS, operation to be performed, status monitoring, accounting and payment)
  e.g. GridFTP, LDAP
• Small and focused set of resource abstractions based on fabric characterisation
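The split between information and management protocols can be sketched with one hypothetical `ComputeResource`: `query_state` only reports, while `reserve` negotiates an advance reservation that the resource may refuse. Times are plain hour numbers and the capacity check is deliberately conservative; none of this mirrors a real protocol.

```python
from dataclasses import dataclass, field


@dataclass
class ComputeResource:
    cpus: int
    # Accepted reservations as (start_hour, end_hour, cpus) tuples.
    reservations: list[tuple[int, int, int]] = field(default_factory=list)

    def cpus_booked(self, start: int, end: int) -> int:
        """Conservative count: sum every reservation overlapping the window."""
        return sum(c for s, e, c in self.reservations if s < end and start < e)

    # Information protocol: report state without changing it.
    def query_state(self, start: int, end: int) -> dict[str, int]:
        booked = self.cpus_booked(start, end)
        return {"cpus": self.cpus, "free": self.cpus - booked}

    # Management protocol: negotiate access; the resource may refuse.
    def reserve(self, start: int, end: int, cpus: int) -> bool:
        if self.cpus_booked(start, end) + cpus > self.cpus:
            return False
        self.reservations.append((start, end, cpus))
        return True


cluster = ComputeResource(cpus=8)
accepted = cluster.reserve(start=9, end=12, cpus=6)
rejected = cluster.reserve(start=11, end=14, cpus=4)  # overlaps hours 11-12
```

The second request is refused because it overlaps the first and would exceed the 8 available CPUs.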

Collective Layer
• The collective layer deals with the coordination of collections of resources
  - Persistent services across groups – a ‘session’ between several parties
  - Collective state is shared across multiple resources
• Collective layer services include
  - Directories (e.g. to query resources available on a VO)
  - Co-allocation, scheduling and brokering
  - Data replication (e.g. file system caching)
  - Software discovery
  - Collaboration
  - Grid-enabled software run-time environments
  - Monitoring and diagnostics
• Collective components may include individual resources as well as other collective components
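Co-allocation and brokering can be illustrated over a shared directory of free CPUs. This is a simplified sketch: the `co_allocate` and `broker` functions and the site names are made up, and a real broker would also weigh policy, data location and queue state.

```python
from typing import Optional


def co_allocate(free_cpus: dict[str, int], request: dict[str, int]) -> bool:
    """Grant CPUs at every requested site, or change nothing at all."""
    if any(free_cpus.get(site, 0) < n for site, n in request.items()):
        return False
    for site, n in request.items():
        free_cpus[site] -= n
    return True


def broker(free_cpus: dict[str, int], cpus: int) -> Optional[str]:
    """Pick the site with the most free CPUs that can fit the job."""
    fits = {site: n for site, n in free_cpus.items() if n >= cpus}
    return max(fits, key=lambda site: fits[site]) if fits else None


# Collective state: a directory of free CPUs per site (made-up sites).
directory = {"malta": 16, "catania": 64, "cairo": 8}
ok = co_allocate(directory, {"malta": 8, "catania": 32})
failed = co_allocate(directory, {"malta": 8, "cairo": 16})
choice = broker(directory, cpus=10)
```

The second co-allocation fails as a whole because one site cannot satisfy its share, so no CPUs are taken from the other site either.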

Overview

• Introduction
• Grid Architecture
• Grid Applications
• Grid Projects

CERN and the Grid

CERN Large Hadron Collider: the world’s most powerful particle accelerator

[Figure: 4 Large Experiments; visible label: ATLAS]

CERN and the Grid

Example from LHC: starting from this event, we are looking for this “signature”

Selectivity: 1 in 10^13

Like looking for 1 person in a thousand world populations, or for a needle in 20 million haystacks!

• ~100,000,000 electronic channels
• 0.0002 Higgs per second
• 15 PBytes of data a year
• (10 Million GBytes = 14 Million CDs)

[Figure: one year’s data from LHC would fill a stack of CDs 20 km high, shown against Concorde (15 km) and Mt. Blanc (4.8 km)]
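A quick check of the CD-stack and Higgs-rate figures, under two assumptions that are not on the slide: a CD holds 700 MB and is 1.2 mm thick.

```python
CD_CAPACITY_GB = 0.7   # assumed: a 700 MB disc
CD_THICKNESS_MM = 1.2  # assumed: standard disc thickness

data_gb = 10_000_000   # "10 Million GBytes" from the slide
cds = data_gb / CD_CAPACITY_GB
stack_km = cds * CD_THICKNESS_MM / 1_000_000  # mm -> km

higgs_per_second = 0.0002  # rate quoted on the slide
seconds_between_higgs = 1 / higgs_per_second
higgs_per_day = higgs_per_second * 24 * 3600

print(f"{cds / 1e6:.1f} million CDs, stack about {stack_km:.0f} km high")
print(f"one Higgs every {seconds_between_higgs:.0f} s, "
      f"about {higgs_per_day:.0f} per day")
```

With these assumptions the 10 million GBytes come to roughly 14 million CDs and a stack of about 17 km, the same order as the 20 km quoted, and the Higgs rate works out to about one every 5,000 seconds.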

Grid Applications
• A wide variety of scientific applications are running on European grids
• High Energy Physics: Large Hadron Collider experiments (ATLAS, CMS, ALICE, LHCb) at CERN
• Biomedical Applications
  - GPS@ portal: protein sequence similarity searches, sites and signatures detection, multiple alignment, secondary structure prediction and primary structure analysis
  - WISDOM: finding new drugs against malaria, H5N1, etc.
• Astrophysics Applications
  - ESA is simulating the forthcoming Planck satellite mission and testing the data pipelines, thus providing input to the mission’s hardware requirements
  - Processing data from MAGIC, an imaging atmospheric telescope located on the Canary Islands that is used for astro-particle physics research

Grid Applications
• Earth Science and Geophysics Applications
  - Analysis of ozone profiles from the GOME satellite and oil spill data from the ERS/SAR satellite, facilitating data sharing within the earth observation community
  - Monte Carlo simulations for seawater intrusion in a coastal aquifer of the Mediterranean basin
• Other areas such as Computational Chemistry, Financial and Economic Research, Digital Libraries, Fusion Research

Overview

• Introduction
• Grid Architecture
• Grid Applications
• Grid Projects

The EuroMed Scenario
• A pan-EU high-speed research network with full operational support is available
  - GÉANT / GÉANT-II
  - EUMEDCONNECT
• An overlying European Grid Infrastructure
  - EGEE / EGEE-II / EGEE-III
  - EUMEDGRID, SEEGRID, EUCHINAGRID, EELA
• Fostering collaboration between researchers from Europe and other countries
• Bridging the digital divide among areas with different levels of technological development

The EUMEDGRID Project
• EU-funded project (FP6 SSA)
• Principal objectives:
  - build the first high performance computing grid extending across the Mediterranean
  - foster National Grid Initiatives in the Mediterranean region

“Computing and storage capacity on demand for researchers in the Mediterranean”

EUMEDCONNECT

EUMEDGRID

GEANT 2

EGEE

Taken from EGEE 2008 report

>250 sites

48 countries

>50,000 CPUs

>20 PetaBytes

>10,000 users

>150 VOs

>150,000 jobs/day

Application areas include:

Archeology

Astronomy

Astrophysics

Civil Protection

Comp. Chemistry

Earth Sciences

Finance

Fusion

Geophysics

High Energy Physics

Life Sciences

Multimedia

Material Sciences

Taken from EGEE 2008 report

EGI: The European Grid Initiative
• To ensure long-term sustainability of the European Grid beyond the fixed-term EGEE projects
• To facilitate integration and interaction between European National Grid Initiatives (NGIs)

Collaborating e-Infrastructures

Taken from EGEE 2008 report

Information Sources
• Foster et al. The Anatomy of the Grid. Intl J. Supercomputer Applications, 2001
• EGEE site. www.eu-egee.org
• EUMEDGRID site. www.eumedgrid.org
• Grid Café. www.gridcafe.org
• Vella et al. EUMEDGRID: Grid computing in Malta and the Mediterranean. CSAW 2006