
Page 1

Clouds for HPC? Potential? Challenges?

Thomas Lippert
Institute for Advanced Simulation
Jülich Supercomputing Centre

Session: Cloud Computing and HPC – Synergy or Competition?
ISC09, June 24, 2009

Page 2

HPC


Page 3

The European HPC performance pyramid (roughly ten times as many systems on each tier going down):

- Tier-0: European services, PRACE principal partners (Europe) – 1
- Tier-1: national services, Gauss Centre general partners (Germany) – 10
- Tier-2: regional and topical services, Gauss Alliance – 100
- Tier-3: local services – 1000 – Grid, HPC Cloud?

Page 4

Hardware Aspects

- Leadership HPC systems are similar to large experimental projects
- Machine life cycle of 3 to 5 years
- Time scale of know-how: 15 to 30 years
- Usage: 24 hours a day, 7 days a week
- Most industries are more than 6 years behind leadership HPC
- Tendency: "full transparency" of the machine becomes more and more utopian
- Users need to know their machine like physicists need to know mathematics:
  Assembler, SSE, MPI, parallelization strategy, scalability (see the sketch below)
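To make the last bullet concrete: a minimal sketch of SSE2 vectorization in C. It is illustrative only; the function and array names are hypothetical, not taken from the talk:

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Element-wise sum of two double arrays. Each _mm_add_pd performs
   two double-precision additions at once, so the vector loop does
   half as many iterations as the scalar version. */
void add_arrays(const double *a, const double *b, double *c, int n)
{
    int i;
    for (i = 0; i + 2 <= n; i += 2) {
        __m128d va = _mm_loadu_pd(a + i);          /* load two doubles */
        __m128d vb = _mm_loadu_pd(b + i);
        _mm_storeu_pd(c + i, _mm_add_pd(va, vb));  /* packed add */
    }
    for (; i < n; i++)                             /* scalar remainder */
        c[i] = a[i] + b[i];
}
```

Knowing when such hand-tuning pays off, and when the compiler already does it, is exactly the machine knowledge the slide refers to.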

Page 5

JUGENE, JuRoPA + HPC-FF @ Jülich

JUGENE – highest scalability:
IBM Blue Gene/P, 72 racks, 294,912 cores, 1 Petaflop/s peak, 144 TByte memory; 6 PByte disks, 25 PByte tape capacity

JuRoPA – general-purpose HPC:
SUN-blade cluster, 2,208 nodes, 17,664 cores, 207 TF peak, Intel Nehalem, 48 GB memory, InfiniBand QDR (SUN M9), ParaStation Cluster-OS

HPC-FF – HPC for Fusion:
Bull NovaScale R422-E2 cluster, 1,080 nodes, 8,640 cores, 101 TF peak, Intel Nehalem, 24 GB memory, InfiniBand QDR (Mellanox), ParaStation Cluster-OS

Page 6

Needs of HPC Users

- Effective usage of tier-0, tier-1 and tier-2 systems requires high-level support structures
- Jülich: more than 50% of the staff work as domain scientists, mathematicians and computer scientists in simulation labs
- Support: research- and community-oriented, integrated in the community
- Parallelization has come closer to theory and model

Page 7

Simulation Labs

[Diagram: a core group, members, groups, and the associated community, overseen by an SL Steering Committee]

Core group tasks: disciplinary research, supportive tasks, outreach.

Expansion towards distributed European SLs.

Page 8

Simulation Labs @ FZJ and KIT

Simulation Laboratories (new): Earth & Environment, Plasma Physics, Energy, Biology, Molecular Systems, NanoMikro, Astro-Particle

Cross-Sectional Teams: Methods & Algorithms, Parallel Performance

Research Groups: Quantum Information, NIC Group, Distributed Computing

Education & Training Programmes

Page 9

Example: Simulation Lab Biology

Research: protein folding & interaction, structure prediction, systems biology

Support: libraries, bio databases, benchmarking; Monte Carlo, FFT docking, machine learning (see the sketch below)

Codes: PROFASI, SMMP

Outreach:
- FZJ: biological institutes (ISB, INM), Helmholtz groups
- Regional: ABC of Life Science Informatics
- International: UC Berkeley, Michigan Tech

[Figure: protein 1LQ7]
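To give a flavor of the Monte Carlo methods listed under Support, a minimal Metropolis sketch in C. The toy energy function and move size are hypothetical and not taken from PROFASI or SMMP:

```c
#include <stdlib.h>
#include <math.h>

/* Toy one-dimensional energy landscape standing in for a force field. */
static double energy(double x) { return x * x; }

/* Metropolis rule: always accept a move that lowers the energy,
   otherwise accept with probability exp(-dE/T). Long runs sample
   configurations with weight exp(-E/T). */
double metropolis(double x, double temperature, int steps)
{
    for (int i = 0; i < steps; i++) {
        double trial = x + 0.5 * (2.0 * rand() / RAND_MAX - 1.0);
        double dE = energy(trial) - energy(x);
        if (dE <= 0.0 || (double)rand() / RAND_MAX < exp(-dE / temperature))
            x = trial;
    }
    return x;
}
```

Production codes replace the toy energy with a protein force field and the scalar coordinate with a full conformation, but the core accept/reject logic is of this Metropolis type.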

Page 10

National and European User Group

[Pie charts: users by programme (Soft Matter Composites, DEISA, I3HP, Jülich Initiative, Other) and by field (Chemistry, Many-Particle Physics, Elementary Particle Physics, Biology/Biophysics, Material Science, Soft Matter, Other)]

- Proposals for computer time are accepted from Germany and Europe
- Peer review by international referees

Page 11

Cloud @ Jülich


Page 12

The performance pyramid again (cf. Page 3):

- Tier-0: European services, PRACE principal partners (Europe) – 1
- Tier-1: national services, Gauss Centre general partners (Germany) – 10
- Tier-2: regional and topical services, Gauss Alliance – 100
- Tier-3: local services – 1000 – Grid, SoftComp @ JSC

Page 13

Users and Members of SoftComp

Member groups of the NoE SoftComp:
- fzj: Forschungszentrum Jülich
- jogu: Johannes Gutenberg Universität Mainz
- scr: Schlumberger Cambridge Research Limited
- unid: Heinrich-Heine Universität Düsseldorf
- upv: Universidad del Pais Vasco / Euskal Herriko Unibertsitatea
- utcdr: University of Twente
- uutr: Utrecht University
- ulcrl: Unilever UK Central Resources Limited

German Federal Ministry for Education and Research (BMBF): promotion of applications in industry and science through grid infrastructures

Page 14

Our Cloud Computer: SoftComp Linux Cluster

- 125 compute nodes (500 cores), 2.5 TF
- Heterogeneous, AMD Opteron
- InfiniBand and Gigabit Ethernet, ParaStation, UNICORE

Page 15

Access:
- Open, extensible, interoperable
- Strong security, workflow support, powerful clients, application integration
- Widely established in academia (D-Grid, DEISA, EGI, SKIF-GRID, SoftComp) and industry (T-Systems, Philips, 52° North)
- Average downloads per month: ~1200

Data: management of scientific data with metadata and scalable storage

Clouds: user-defined execution environments with Clouds and virtualisation

Page 16

Page 17

Page 18

Parallel Codes

Why parallel?
- Results in a shorter time
- More precise results
- Local system memory is not sufficient

But: programs have to be adapted to run in parallel mode (see the sketch after this list).

Required:
- Support of SimLab and cross-cutting groups
- Mathematics
- Performance analysis
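A minimal sketch in C of what "adapted to run in parallel mode" means with MPI: the serial loop is split across ranks and a reduction combines the partial sums. The problem size, work distribution, and summand are hypothetical, chosen only for illustration:

```c
#include <mpi.h>
#include <stdio.h>

/* Each rank sums its share of the terms; MPI_Reduce combines the
   partial results on rank 0. Restructuring a serial loop like this
   is the "adaptation" a parallel machine requires. */
int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int N = 1000000;                  /* hypothetical problem size */
    double local = 0.0, total = 0.0;
    for (int i = rank; i < N; i += size)    /* cyclic work distribution */
        local += 1.0 / (1.0 + i);

    MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("total = %f (on %d ranks)\n", total, size);
    MPI_Finalize();
    return 0;
}
```

The same loop runs unchanged on one rank or on thousands; scalability then depends on the communication pattern, which is where the support groups above come in.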

Page 19

Lessons Learnt: Challenges for the HPC Cloud Provider

Page 20

To serve beyond tier-3 and desktop applications, a cloud provider must

• offer leading-edge tier-3, tier-2, tier-1 and tier-0 high-performance systems
• guarantee absolute security
• guarantee absolute privacy
• care for long-term data storage and curation
• guarantee uninterrupted service for critical applications
• actively offer highest-level support and research for science communities and industry
• provide a SimLab-like support structure for HPC