
Page 1: Computational Steering on Grids

Computational Steering on Grids

A survey of RealityGrid

Peter V. Coveney

Centre for Computational Science, University College London

Page 2: Computational Steering on Grids

Talk contents

• Some RealityGrid science applications

• High performance “capability” computing

• Computational steering

• Early experiences with the UK Level 2 Grid


Page 3: Computational Steering on Grids


RealityGrid

[Architecture diagram] A user with a laptop/PDA (web-based portal) and VR and/or Access Grid (AG) nodes connect, through the ReG steering API and Grid infrastructure (Globus, Unicore, …), to:

• HPC resources: scalable MD, MC, mesoscale modelling

• Visualization engines

• "Instruments": XMT devices, LUSI, …

• Storage devices

Steering and performance control/monitoring span the whole system.

Moving the bottleneck out of the hardware and into the human mind…

Page 4: Computational Steering on Grids


RealityGrid: Goals

Use grid technology to closely couple high performance computing, high performance visualization and high throughput experiment by means of computational steering.

Molecular and mesoscale condensed matter simulation in the terascale regime.

Deployment of component based middleware and performance control for optimal utilisation of dynamically varying grid resources.

Contribute to global grid standards via the GGF for benefit of RealityGrid and general modelling and simulation community.

Operate in a robust grid environment

Page 5: Computational Steering on Grids


Mesoscale Simulations: Lattice-Boltzmann methods

A coarse-grained lattice gas automaton; continuum fluid dynamicists see it as a numerical solver for the BGK approximation to Boltzmann's equation.

The dynamics relaxes mass densities to an equilibrium state, conserving mass and momentum (given infinite machine precision)

– Densities at each lattice node are altered during “collision” (relaxation) step.

– Judicious choice of equilibrium state.

– Relaxation time (viscosity) is an adjustable parameter.

Computer codes are simple, algorithmically efficient and readily parallelisable, but numerical stability is a serious problem.
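For reference, the single-relaxation-time (lattice-BGK) update that such codes implement has the standard form (notation mine; the slide does not give the equation):

\[
f_i(\mathbf{x} + \mathbf{c}_i \Delta t,\; t + \Delta t)
  = f_i(\mathbf{x}, t)
  - \frac{1}{\tau}\left[ f_i(\mathbf{x}, t) - f_i^{\mathrm{eq}}(\mathbf{x}, t) \right],
\]

where the \(f_i\) are the densities associated with the discrete velocities \(\mathbf{c}_i\), \(f_i^{\mathrm{eq}}\) is the chosen equilibrium distribution, and the relaxation time \(\tau\) sets the kinematic viscosity via \(\nu = c_s^2(\tau - \Delta t/2)\), with \(c_s\) the lattice speed of sound. Conservation of mass and momentum corresponds to \(\sum_i f_i^{\mathrm{eq}} = \sum_i f_i\) and \(\sum_i \mathbf{c}_i f_i^{\mathrm{eq}} = \sum_i \mathbf{c}_i f_i\).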

Page 6: Computational Steering on Grids


Three dimensional Lattice-Boltzmann simulations

Code (LB3D) written in Fortran90 and parallelized using MPI.

Scales linearly on all available resources.

Fully steerable.

Future plans include a move to the parallel data format PHDF5.

Data produced during a single large scale simulation can exceed hundreds of gigabytes or even terabytes.

Simulations require supercomputers.

High end visualization hardware and parallel rendering software (e.g. VTK) needed for data analysis.
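The slides do not show the parallelization itself; as a rough, self-contained illustration of the domain-decomposition and halo-exchange pattern that MPI-parallel lattice codes such as LB3D rely on, here is a minimal Fortran 90 sketch (a 1D decomposition of a single field; LB3D's actual 3D decomposition and data layout will differ):

! Minimal halo-exchange sketch for an MPI-parallel lattice code.
! Illustrative only: 1D periodic decomposition of one scalar field.
program halo_sketch
  use mpi
  implicit none
  integer, parameter :: nloc = 64        ! lattice sites owned by this rank
  real(kind=8) :: f(0:nloc+1)            ! local field plus one halo cell per side
  integer :: rank, nprocs, left, right, ierr
  integer :: status(MPI_STATUS_SIZE)

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)
  left  = mod(rank - 1 + nprocs, nprocs) ! periodic neighbours
  right = mod(rank + 1, nprocs)

  f = real(rank, kind=8)                 ! dummy field data

  ! Fill halo cells before streaming: send my last owned cell to the
  ! right neighbour while receiving my left halo, and vice versa.
  call MPI_Sendrecv(f(nloc), 1, MPI_DOUBLE_PRECISION, right, 0, &
                    f(0),    1, MPI_DOUBLE_PRECISION, left,  0, &
                    MPI_COMM_WORLD, status, ierr)
  call MPI_Sendrecv(f(1),      1, MPI_DOUBLE_PRECISION, left,  1, &
                    f(nloc+1), 1, MPI_DOUBLE_PRECISION, right, 1, &
                    MPI_COMM_WORLD, status, ierr)

  ! ... collision and streaming over cells 1..nloc would go here ...

  call MPI_Finalize(ierr)
end program halo_sketch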

3D datasets showing snapshots from a simulation of spinodal decomposition: a binary mixture of water and oil phase separates. 'Blue' areas denote high water densities and 'red' marks the interface between the two fluids.

Page 7: Computational Steering on Grids


Large Scale Molecular Dynamics

Molecules are modelled as particles moving according to Newton’s equations of motion with real atomistic interactions.

Simulated systems are LARGE: 30,000-300,000 atoms.

Interaction potentials: Lennard-Jones and Coulombic interactions, harmonic bond potentials, constraints on atoms, etc.

We use scalable codes: LAMMPS (Large-scale Atomic/Molecular Massively Parallel Simulator) and NAMD.
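In one common convention (the slide does not give explicit forms, and force fields differ in prefactors), these terms read:

\[
U_{\mathrm{LJ}}(r) = 4\varepsilon\!\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right],
\qquad
U_{\mathrm{Coul}}(r) = \frac{q_i q_j}{4\pi\varepsilon_0 r},
\qquad
U_{\mathrm{bond}}(r) = \frac{k_b}{2}\,(r - r_0)^2 .
\]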

Page 8: Computational Steering on Grids


MHC-peptide complexes

Ribbon representation of the HLA-A*0201:MAGE-A4 complex.

Page 9: Computational Steering on Grids


TCR-peptide-MHC complex

Peptide-MHC binding is just like the binding of drugs to other receptors.

We can use the molecular dynamics (MD) simulation method to examine and model the MHC-peptide interaction.

Garcia, K.C. et al., (1998). Science 279, 1166-1172.

Page 10: Computational Steering on Grids

Big iron currently used by us

• CSAR: Cray T3E; Origin 3800; Altix (available October 2003, a 256-CPU machine)

• HPCx, UK terascale facility (DL/EPCC): IBM Power4, initially 1280 processors, 3.24 teraflops

• Pittsburgh Supercomputing Center, Lemieux: HP AlphaServer cluster comprising 750 four-processor compute nodes

• Boston University Supercomputing Center: Origin and IBM systems

Page 11: Computational Steering on Grids


MHC-peptide complexes: Simulation models

... for the 58,825 atom model (whole model), we can perform a 1 ns simulation in 17 hours' wall clock time on 256 processors of a Cray T3E using LAMMPS.

Wan S., Coveney P. V., Flower D. R., preprint (2003).
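In throughput terms (my arithmetic, not a figure from the slide), this benchmark corresponds to roughly

\[
\frac{1~\mathrm{ns}}{17~\mathrm{h}} \times 24~\mathrm{h/day} \approx 1.4~\mathrm{ns/day}.
\]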

Page 12: Computational Steering on Grids


MHC-peptide complexes: Conclusions

• For the 58,825 atom system, a 1 ns simulation can be performed in 17 hours' wall clock time on 256 processors of a Cray T3E.

• More accurate results are obtained by simulating the whole complex than just a part of it.

• The α3 and β2m domains have a significant influence on the structural and dynamical features of the complex, which is very important for determining the binding efficiencies of epitopes.

We are now doing TCR-peptide-MHC simulations (ca. 100,000 atom model) using NAMD.

Page 13: Computational Steering on Grids


TCR-peptide-MHC complex: Simulation models

... we can perform a 1 ns simulation in 16 hours' wall clock time on 256 processors of an SGI Origin 3800 using NAMD. The Alpha cluster is about two times faster.

Alpha based Linux Cluster 'LeMieux'
SGI Origin 3800 'Green'

Page 14: Computational Steering on Grids


TCR-peptide-MHC complex: First results with HPCx

... a 1 ns simulation in < 8 hours' wall clock time on 256 processors of HPCx using NAMD, which is faster than LeMieux.

Alpha based Linux Cluster 'LeMieux'
HPCx

Page 15: Computational Steering on Grids


Hybrid Multiscale Modelling: MD/continuum

• Objective: To construct a numerical scheme (code) to couple two descriptions of matter with very different time and length characteristic scales.

• Applications: dynamical processes near interfaces governed by the interplay between micro- and macro-dynamics: proteins, complex fluids near surfaces (polymers, colloids, etc.), lipid membranes, wetting, crystal growth, melting, droplets, heating of critical fluids, Rayleigh-Taylor instability, etc.

R Delgado-Buscalioni and P V Coveney, Phys Rev E 67, 046704 (2003)

• Interesting grid problem – coupled system.

Page 16: Computational Steering on Grids


Capability Computing

What is "Capability Computing"?

– "Uses more than half of the available resources (CPUs, memory, …) on a single supercomputer for one job."

– This requires 'draining' the machine's queues in order to make the needed resources available.

Examples inside RealityGrid:

– LB3D on CSAR's SGI Origin 3800 using up to 504 CPUs

– LB3D on HPCx using up to 1024 CPUs (planned)

– NAMD on Lemieux @ PSC using 2048 CPUs

– NAMD on CSAR's SGI Origin 3800 using up to 256 CPUs

Page 17: Computational Steering on Grids


Capability Computing: Storage requirements

NAMD (TCR-peptide-MHC system, 96,796 atoms, 1 ns):

– Trajectory data: 1.1GB

– Checkpointing files: 4.5MB each, written every 50 ps

– Total (including input files and data from analysis): 1.4GB

LB3D (512x512x512 system, 5,000 timesteps, porous media):

– XMT sandstone data: 1.7GB

– Single dataset: 0.5-1.5GB, up to 7.5GB in total per measurement

– Checkpointing files: 60GB each

– Total (measure every 50 timesteps, checkpoint every 500 timesteps): 100 x 7.5GB + 10 x 60GB + XMT data: 1.5TB

Need terabyte storage facilities!
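As a sanity check on these figures (my arithmetic, not from the slide): a single 512^3 double-precision scalar field occupies \(512^3 \times 8~\mathrm{B} \approx 1.07~\mathrm{GB}\), consistent with the quoted 0.5-1.5GB per dataset; measuring every 50 of the 5,000 timesteps gives 100 measurements and checkpointing every 500 gives 10 checkpoints, so

\[
100 \times 7.5~\mathrm{GB} + 10 \times 60~\mathrm{GB} + 1.7~\mathrm{GB} \approx 1.35~\mathrm{TB},
\]

which the slide evidently rounds up to about 1.5TB.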

Page 18: Computational Steering on Grids


Capability Computing: Visualization requirements

We use the Visualization Toolkit (VTK), IRIS Explorer and AVS for parallel volume rendering and isosurfacing of 3D datasets.

Large scale simulations require specialised hardware like our SGI Onyx2 because

– data files can be huge, i.e. upwards of tens of gigabytes each

– isosurfaces can be very complicated.

Self-assembly of the gyroid cubic mesophase by lattice-Boltzmann simulation (Nélido González-Segredo and Peter V. Coveney, preprint (2003))

Lattice-Boltzmann simulation of an oil-filled sandstone

Page 19: Computational Steering on Grids


Computational Steering with LB3D

All simulation parameters are steerable using the ReG steering library.

Checkpointing/restarting functionality allows ‘rewinding’ of simulations and run time job migration across architectures.

Steering reduces storage requirements because the user can adapt data dumping frequencies.

CPU time can be saved because users do not have to wait for jobs to finish if they can already see that nothing relevant is happening.

Instead of doing “task farming”, i.e. launching many simulations at the same time, parameter searches can be done by “steering” through parameter space.

Analysis time is significantly reduced because less irrelevant data is produced.
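The slides do not show what adapting a dump frequency at run time looks like in code. Below is a minimal, self-contained Fortran 90 sketch of the file-based steering pattern described later in this talk (communication via a shared filesystem); the control-file name, its format and the polling logic are hypothetical illustrations, not the actual ReG steering library API:

! Sketch of file-based steering: every 50 steps the code polls a
! control file; if the user has written a new dump frequency there,
! it takes effect immediately. (Hypothetical, not the ReG API.)
program steer_sketch
  implicit none
  integer, parameter :: n_steps = 10000
  integer :: step, out_every, new_freq, ios

  out_every = 100                      ! initial data-dump frequency

  do step = 1, n_steps
     ! ... one simulation timestep would go here ...

     if (mod(step, 50) == 0) then
        open(unit=10, file='steer.ctl', status='old', action='read', iostat=ios)
        if (ios == 0) then
           read(10, *, iostat=ios) new_freq
           if (ios == 0 .and. new_freq > 0) out_every = new_freq
           close(10)
        end if
     end if

     if (mod(step, out_every) == 0) then
        print *, 'dumping data at step', step
     end if
  end do
end program steer_sketch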

Page 20: Computational Steering on Grids


A typical steered LB3D simulation

Initial condition: random water/surfactant mixture.

Self-assembly starts.

Rewind and restart from checkpoint.

Lamellar phase: surfactant bilayers between water layers.

Cubic micellar phase, low surfactant density gradient.

Cubic micellar phase, high surfactant density gradient.

Page 21: Computational Steering on Grids


Progress on short term goals

We are grid-based (GT2, Unicore, GT3…)

Capable of distributed and heterogeneous operation

Do not require wholesale commitment to a single framework

Fault tolerant (we can close down, move, and re-start either the simulation or the visualisation without disrupting the other)

We have a steering API and library that insulates the application from details of the implementation

We do steering in an OGSA framework

We have learned how to expose steering controls through OGSA services

Page 22: Computational Steering on Grids


“Fast Track” Steering Demo

[Architecture diagram] The demo couples:

• Simulation: LB3D with the RealityGrid steering API on "Dirac", an SGI Onyx @ QMUL.

• Visualization: VTK + SGI OpenGL VizServer on "Bezier", an SGI Onyx @ Manchester.

• Front end: a laptop at the SHU Conference Centre running the VizServer client and the steering GUI.

Jobs pass through UNICORE Gateways and NJSs at Manchester and QMUL, crossing site firewalls. Simulation data is delivered to the visualization engine, steering messages travel as XML, steering is exposed through The Mind Electric GLUE web service hosting environment with OGSA extensions, and single sign-on uses UK e-Science digital certificates.

Page 23: Computational Steering on Grids

23

Progress on long term goals

Steering is timely for scientific research using capability computing.

Need to make steering capabilities genuinely useful for scientists: the value added must be quantified. See the Contemporary Physics article.

Many codes have now been interfaced to the RealityGrid steering library: LB3D, NAMD, Oxford’s Monte Carlo code, Loughborough’s MD code, and Edinburgh’s Lattice-Boltzmann code.

Moving to a component architecture & incorporating performance control capabilities ("deep track" timeline):

– checkpoint/restart/migration now available for LB3D

Web based portal development (EPCC) - steering in a web environment.

HCI recommendations inform the ultimate steering GUIs

Page 24: Computational Steering on Grids


Deploying applications on a persistent grid: The “Level 2 Grid”

The components of this Grid are the computing and data resources contributed by the UK e-Science Centres linked through the SuperJanet4 backbone, regional and metropolitan area networks.

Many of the infrastructure services available on this Grid are provided by Globus software. A national Grid directory service links the information servers operated at each site and enables tasks to call on resources at any of the e-Science Centres.

The Grid operates a security infrastructure based on X.509 certificates issued by the e-Science Certificate Authority at the UK Grid Support Centre at CLRC.

In contrast to other Grid projects (like DataGrid), L2G resources are highly heterogeneous.

Page 25: Computational Steering on Grids


Examples of currently available resources on the “Level 2 Grid”

Compute resources:

– Various Linux clusters

– Various SGI Origin machines

– Various SUN clusters

– HPCx & CSAR resources

– And many others

Visualization resources:

– Our local SGI Onyx2 is currently the only L2G resource which is not based at an e-Science centre.

(… and of course ESNW’s famous Sony Playstation.)

Page 26: Computational Steering on Grids


RealityGrid-L2: LB3D on the L2G

[Architecture diagram] Simulation: LB3D with the RealityGrid steering API; visualization: VTK + SGI OpenGL VizServer on an SGI Onyx; front end: a laptop running the VizServer client and the ReG steering GUI. GLOBUS is used to launch jobs, simulation data is carried over GLOBUS-IO, steering messages (XML) pass between the simulation and the steering GUI by file-based communication via a shared filesystem, and X output is tunnelled back using ssh.

The slide shows the top of the instrumented LB3D main program:

program lbe
use lbe_init_module
use lbe_steer_module
use lbe_invasion_module
...

Page 27: Computational Steering on Grids


The Level 2 Grid: First experiences from a user’s point of view

In principle, use of the L2G is attractive for application scientists because there are many resources available which can be accessed in a similar manner using GLOBUS.

But today one has to be very enthusiastic to use it for daily production work, because:

– It is not trivial to get started, as the available documentation does not answer the users’ questions properly. For example:

• Which resources are actually available?

• How can I access these resources in the correct way (queues, logins, …)?

• How do I get sysadmins to sort out firewall problems?

– Support is limited because most sysadmins seem not to have extensive/favourable experience with GLOBUS.

– So far, most people involved are computer scientists who are more interested in the technology than in the usability of the grid.

“As you run into bumps in the road, remember that you are a Grid pioneer. Do not expect all the roads to be paved. (Do not expect roads.) Grids do not yet run smoothly.”

From the Globus Quickstart Guide

Page 28: Computational Steering on Grids


Summary

Fast track

– Steering capabilities deployed in several RealityGrid codes

– We work with Unicore and Globus Toolkit 2; GT3 forthcoming

– Capability computing possible via L2G

Deep track

– Checkpoint/restart and migration

– General performance control

– Componentisation via the ICENI framework

– GGF standards for advance reservation/co-allocation of resources

Page 29: Computational Steering on Grids


ReG Workshop - Outline

A pot pourri of presentations, posters and demos

• From RealityGrid:

– Software infrastructure, OGSI implementations and HCI aspects

– Componentisation/ICENI

– Performance control

– Applications: molecular dynamics and mesoscale modelling

• From external speakers:

– GridLab applications on grids

– Combinatorial chemistry

– Visualization environments

– Bio simulations and curation of simulation data

– Earth/environmental sciences