
CENTRE FOR PARALLEL COMPUTING

8th IDGF Workshop, Hannover, August 17th 2011
International Desktop Grid Federation

CENTRE FOR PARALLEL COMPUTING

Experiences with the University of Westminster Desktop Grid

S C Winter, T Kiss, G Terstyanszky, D Farkas, T Delaitre

CENTRE FOR PARALLEL COMPUTING

Contents

• Introduction to Westminster Local Desktop Grid (WLDG)
  – Architecture, deployment management
  – EDGeS Application Development Methodology (EADM)
• Application examples
• Conclusions

CENTRE FOR PARALLEL COMPUTING

Introduction to Westminster Local Desktop Grid (WLDG)

[Map of the six University of Westminster campuses with client node counts]
• New Cavendish St: 576
• Marylebone Road: 559
• Regent Street: 395
• Wells Street: 31
• Little Titchfield St: 66
• Harrow Campus: 254

CENTRE FOR PARALLEL COMPUTING

WLDG Environment
• DG Server on a private University network
• Over 1500 client nodes on 6 different campuses
• Most machines are dual core, all running Windows
• Running the SZTAKI Local Desktop Grid package
• Based on student laboratory PCs
  – If not used by a student, the PC switches to DG mode
  – If there is no more work from the DG server, it shuts down (green policy, sketched below)
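The idle/shutdown behaviour above can be summarised as a simple client-side policy loop. The sketch below is illustrative only; the helper functions (user_logged_in, server_has_work, and so on) are hypothetical stand-ins, not part of the BOINC or SZTAKI LDG APIs.

```python
import time

POLL_INTERVAL = 60  # seconds between policy checks (illustrative value)

def user_logged_in() -> bool:
    """Hypothetical check: is a student currently using this lab PC?"""
    return False

def server_has_work() -> bool:
    """Hypothetical check: does the WLDG server still have work units queued?"""
    return False

def run_dg_client_for(seconds: int) -> None:
    """Hypothetical hook: let the BOINC client crunch for a while."""

def shutdown_pc() -> None:
    """Hypothetical hook: power the machine off (the green policy)."""

def green_policy_loop() -> None:
    """Lab PC policy: compute only when idle, power off when the grid is drained."""
    while True:
        if user_logged_in():
            time.sleep(POLL_INTERVAL)         # a student has priority; stay out of the way
        elif server_has_work():
            run_dg_client_for(POLL_INTERVAL)  # idle PC: contribute cycles to the WLDG
        else:
            shutdown_pc()                     # no students, no work left: save energy
            break
```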

CENTRE FOR PARALLEL COMPUTING

The DG Scenario

[Diagram: UoW Local Desktop Grid]
• End-user: creates the graph and concrete workflow in the gUSE / WS P-GRADE portal and submits it to the DG
• DG Submitter: submits jobs and retrieves results via the 3G Bridge to the BOINC server
• BOINC workers: download the executable and input files, upload the results

CENTRE FOR PARALLEL COMPUTING

WLDG: ZENworks deployment

• BOINC clients are installed automatically and maintained by specifically developed Novell ZENworks objects
  – An MSI file was created to generate a ZENworks object that installs the client software
  – The BOINC client InstallShield executable was converted into an MSI package (/a switch on the BOINC client executable)
  – The BOINC client is part of the generic image installed on all lab PCs throughout the University
  – This guarantees that any newly purchased and installed PC automatically becomes part of the WLDG
• All clients are registered under the same user account

CENTRE FOR PARALLEL COMPUTING

EDGeS Application Development Methodology (EADM)
• Generic methodology for DG application porting
• Motivation: special focus is required when porting/developing an application to an SG/DG platform
• Defines how the recommended software tools, e.g. those developed by EDGeS, can aid this process
• Supports iterative methods:
  – well-defined stages suggest a logical order
  – but (since in most cases the process is non-linear) results of previous stages can be revisited and revised at any point

CENTRE FOR PARALLEL COMPUTING

EADM – Defined Stages

1. Analysis of current application

2. Requirements analysis

3. Systems design

4. Detailed design

5. Implementation

6. Testing

7. Validation

8. Deployment

9. User support, maintenance & feedback

CENTRE FOR PARALLEL COMPUTING

Application Examples

• Digital Alias-Free Signal Processing
• AutoDock Molecular Modelling

CENTRE FOR PARALLEL COMPUTING

Digital Alias-Free Signal Processing (DASP)
• Users: Centre for Systems Analysis, University of Westminster
• Traditional DSP is based on uniform sampling
  – Suffers from aliasing
• Aim: Digital Alias-free Signal Processing (DASP)
  – One solution is Periodic Non-uniform Sampling (PNS)
• The DASP application designs PNS sequences
• Selection of the optimal sampling sequence is a computationally expensive process
  – A linear equation has to be solved and a large number of solutions (~10^10) compared
• The analyses of the solutions are independent from each other, making the problem suitable for DG parallelisation

CENTRE FOR PARALLEL COMPUTING

DASP - Parallelisation

[Diagram: master–worker decomposition]
• Master: solve the linear equation, producing the candidate solutions q_r, q_{r+1}, …, q_{2r-1}
• Computer i (i = 1 … m): find the best permutation for solutions i, i+m, i+2m, …, returning its locally best solution
• Master: compare the locally best solutions and find the globally best solution
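A minimal sketch of this decomposition, assuming a placeholder scoring function: the candidate solutions are dealt out cyclically to m work units, each work unit scores its own slice, and a final reduction picks the globally best solution. The solve step and score function stand in for the actual DASP permutation analysis and are not taken from the original application.

```python
from typing import Callable, Sequence

def best_in_slice(solutions: Sequence, worker: int, m: int,
                  score: Callable[[object], float]) -> object:
    """Work unit 'worker' (0-based): examine solutions worker, worker+m, worker+2m, ...
    and return the locally best one."""
    return max(solutions[worker::m], key=score)   # cyclic (round-robin) partitioning

def dasp_parallel(solutions: Sequence, m: int,
                  score: Callable[[object], float]) -> object:
    """Master side: fan out m work units, then reduce the locally best solutions
    to the globally best one (on the WLDG the fan-out is handled by BOINC)."""
    local_bests = [best_in_slice(solutions, w, m, score) for w in range(m)]
    return max(local_bests, key=score)

# Toy usage with a made-up scoring function standing in for the permutation analysis:
candidates = list(range(1000))                    # stands in for the ~10^10 PNS candidates
print(dasp_parallel(candidates, m=8, score=lambda s: -abs(s - 617)))   # -> 617
```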

CENTRE FOR PARALLEL COMPUTING

DASP – Performance test results

Period T (factor) | Sequential   | DG worst   | DG median   | DG best      | # of work units | Speedup (best case) | # of nodes involved (median)
18                | 13 min       | 9 min      | 7 min       | 4 min        | 50              | 3.3                 | 59
20                | 2 hr 29 min  | 111 min    | 43 min      | 20 min       | 100             | 7.5                 | 97
22                | 26 hr 40 min | 5 hr 1 min | 3 hr 24 min | 2 hr 31 min  | 723             | 11                  | 179
24                | ~820 hr      | n/a        | n/a         | 17 hr 54 min | 980             | 46                  | 372
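The best-case speedup column is simply the sequential time divided by the best DG time; a quick check of the table values:

```python
# Best-case speedup = sequential time / best DG time, using values from the table above
rows = {
    18: (13, 4),                      # minutes
    20: (2 * 60 + 29, 20),
    22: (26 * 60 + 40, 2 * 60 + 31),
    24: (820 * 60, 17 * 60 + 54),     # the ~820 hr sequential estimate is approximate
}
for period, (seq_min, best_min) in rows.items():
    print(period, seq_min / best_min)
# roughly 3.25, 7.45, 10.6 and 45.8, consistent with the Speedup column after rounding
```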

CENTRE FOR PARALLEL COMPUTING

DASP – Addressing the performance issues
• Inefficient load balancing
  – solutions of the equation should be grouped based on the execution time required to analyse individual solutions (see the sketch after this list)
• Inefficient work unit generation
  – some of the solutions should be divided into subtasks (more work units)
  – limits to the possible speed-up
• User community / application developers to consider redesigning the algorithm
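A hedged sketch of the load-balancing idea above: instead of dealing solutions out blindly, estimate the analysis cost of each solution and pack them into work units of roughly equal total cost. The cost estimator (estimate_cost) is a hypothetical placeholder, and the greedy balancing below is one simple way to do the grouping, not the method used on the WLDG.

```python
import heapq
from typing import Callable, List, Sequence

def group_by_cost(solutions: Sequence, n_work_units: int,
                  estimate_cost: Callable[[object], float]) -> List[List]:
    """Greedy balancing: always add the next-most-expensive solution to the
    currently lightest work unit, so all work units take roughly equal time."""
    # (total_cost, index, solutions) heap, lightest work unit on top
    heap = [(0.0, i, []) for i in range(n_work_units)]
    heapq.heapify(heap)
    for sol in sorted(solutions, key=estimate_cost, reverse=True):
        cost, idx, bucket = heapq.heappop(heap)
        bucket.append(sol)
        heapq.heappush(heap, (cost + estimate_cost(sol), idx, bucket))
    return [bucket for _, _, bucket in heap]

# Toy usage with a made-up cost model:
units = group_by_cost(range(1000), n_work_units=50, estimate_cost=lambda s: 1 + s % 7)
print(max(len(u) for u in units), min(len(u) for u in units))
```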

CENTRE FOR PARALLEL COMPUTING

AutoDock Molecular Modelling

• Users: Dept of Molecular & Applied Biosciences, UoW
• AutoDock:
  – a suite of automated docking tools
  – designed to predict how small molecules, such as substrates or drug candidates, bind to a receptor of known 3D structure
• Application components:
  – AutoDock performs the docking of the ligand to a set of grids describing the target protein
  – AutoGrid pre-calculates these grids

CENTRE FOR PARALLEL COMPUTING

Need for Parallelisation
• One run of AutoDock finishes in a reasonable time on a single PC
• However, thousands of scenarios have to be simulated and analysed to get stable and meaningful results
  – AutoDock has to be run multiple times with the same input files but with different random factors (sketched below)
  – Simulation runs are independent from each other, making them suitable for DG execution
• AutoGrid does not require Grid resources
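A minimal sketch of how the independent runs could be turned into DG work units, assuming the usual AutoDock 4 command-line form (autodock4 -p file.dpf -l file.dlg) and a seed line in the docking parameter (dpf) file; the seed handling, file names and the per-work-unit commands are illustrative assumptions, not the exact mechanism used by the 3G Bridge / BOINC layer.

```python
import random
from pathlib import Path

def make_work_units(dpf_template: str, n_units: int, out_dir: str = "work_units"):
    """Create n_units copies of the docking parameter file, each with its own
    random seed, so every DG work unit explores a different search trajectory.
    (Seed handling is illustrative; the real dpf 'seed' line may differ.)"""
    template = Path(dpf_template).read_text()
    Path(out_dir).mkdir(exist_ok=True)
    cmds = []
    for i in range(n_units):
        seed = random.randrange(1, 2**31)
        dpf = Path(out_dir) / f"dock_{i:04d}.dpf"
        # naive seed injection: drop any existing 'seed ...' line, then append our own
        lines = [ln for ln in template.splitlines() if not ln.startswith("seed")]
        lines.append(f"seed {seed}")
        dpf.write_text("\n".join(lines) + "\n")
        # command a worker would run; one dlg output file per work unit
        cmds.append(f"autodock4 -p {dpf.name} -l dock_{i:04d}.dlg")
    return cmds

# e.g. make_work_units("ligand.dpf", n_units=1000) -> list of per-work-unit commands
```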

CENTRE FOR PARALLEL COMPUTING

AutoDock component workflow

[Diagram: data flow of the AutoDock component workflow]
• Inputs: gpf file, pdb file (ligand), pdb file (receptor)
• prepare_ligand4.py and prepare_receptor4.py convert the pdb files into pdbqt files
• AUTOGRID takes the gpf file and the receptor pdbqt file and produces the map files
• AUTODOCK instances (run in parallel) take the map files, the ligand pdbqt file and the dpf file, and produce dlg files
• SCRIPT1 / SCRIPT2 select the best dlg files and produce the final pdb file

CENTRE FOR PARALLEL COMPUTING

Computational workflow in P-GRADE

[Diagram: P-GRADE workflow with Generator, AutoGrid, AutoDock and Collector jobs]
Inputs: receptor.pdb, ligand.pdb, gpf descriptor file, dpf descriptor file, number of work units, AutoGrid executables and scripts (uploaded by the developer, do not change); output: pdb file

1. The Generator job creates the specified number of AutoDock jobs.

2. The AutoGrid job creates pdbqt files from the pdb files, runs the autogrid application and generates the map files, then zips them into an archive. This archive is the input of all AutoDock jobs.

3. The AutoDock jobs run on the Desktop Grid. As output they produce dlg files.

4. The Collector job collects the dlg files, takes the best results and concatenates them into a pdb file (sketched below).
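A hedged sketch of the Collector step, assuming the dlg files report a line of the form "Estimated Free Energy of Binding = ... kcal/mol" (the usual AutoDock 4 wording, lower is better); the parsing regex and the choice of keeping the N best files are illustrative, not the exact scripts used in the workflow.

```python
import re
from pathlib import Path

ENERGY_RE = re.compile(r"Estimated Free Energy of Binding\s*=\s*([-+]?\d+\.?\d*)")

def best_energy(dlg_path: Path) -> float:
    """Lowest (most favourable) binding energy reported in one dlg file."""
    energies = [float(m.group(1)) for m in ENERGY_RE.finditer(dlg_path.read_text())]
    return min(energies) if energies else float("inf")

def collect(dlg_dir: str, keep: int = 10) -> list:
    """Collector job sketch: rank all dlg files by their best binding energy
    and return the 'keep' best ones for concatenation into the final pdb."""
    dlgs = sorted(Path(dlg_dir).glob("*.dlg"), key=best_energy)
    return dlgs[:keep]

# e.g. for dlg in collect("results", keep=5): print(dlg.name, best_energy(dlg))
```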

CENTRE FOR PARALLEL COMPUTING

AutoDock – Performance test results

[Chart: speedup vs. number of work units (10, 100, 1000, 3000); speedup axis 0–200]

CENTRE FOR PARALLEL COMPUTING

DG Drawbacks: The “Tail” Problem

[Charts comparing job completion for the Jobs >> Nodes and Jobs ≈ Nodes cases]

CENTRE FOR PARALLEL COMPUTING

Tackling the Tail Problem

• Augment the DG infrastructure with more reliable nodes, e.g. service grid or cloud resources

• Redesign scheduler to detect tail and resubmit tardy tasks to SG or cloud
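One way such a tail-aware scheduler could work, as a rough sketch: once most work units of a batch have returned, any still-running unit that has exceeded a multiple of the median completion time is flagged and speculatively resubmitted to a reliable resource (service grid or cloud). The thresholds and the resubmit hook are assumptions, not the WLDG scheduler's actual policy.

```python
import statistics
import time
from dataclasses import dataclass, field

@dataclass
class Batch:
    start: float                                    # batch submission time (epoch seconds)
    finished: dict = field(default_factory=dict)    # work-unit id -> runtime (s)
    running: dict = field(default_factory=dict)     # work-unit id -> start time

def tardy_work_units(batch: Batch, done_fraction: float = 0.9,
                     slack: float = 2.0) -> list:
    """Detect the 'tail': if at least done_fraction of the batch has finished,
    return the running work units older than slack * median finished runtime."""
    total = len(batch.finished) + len(batch.running)
    if total == 0 or len(batch.finished) / total < done_fraction:
        return []                                   # still in the bulk phase, no tail yet
    median_rt = statistics.median(batch.finished.values())
    now = time.time()
    return [wu for wu, started in batch.running.items()
            if now - started > slack * median_rt]

def handle_tail(batch: Batch, resubmit_to_cloud) -> None:
    """Speculatively duplicate tardy work units on reliable (SG/cloud) nodes;
    whichever copy finishes first wins, the other is cancelled."""
    for wu in tardy_work_units(batch):
        resubmit_to_cloud(wu)                       # hypothetical hook into SG/cloud submission
```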

CENTRE FOR PARALLEL COMPUTING

Cloudbursting: Indicative Results

CENTRE FOR PARALLEL COMPUTING

AutoDock - Conclusions

• The Cygwin-on-Windows implementation inhibited performance
  – can be improved using, e.g.:
    • the DG to EGEE bridge
    • cloudbursting
• AutoDock is a black-box legacy application
  – source code not available, so code-based improvement is not possible

CENTRE FOR PARALLEL COMPUTING

Further Applications
• Ultrasound Computer Tomography - Forschungszentrum Karlsruhe
• EMMIL – E-marketplace optimization - SZTAKI
• Anti-Cancer Drug Research (CancerGrid) - SZTAKI
• Integrator of Stochastic Differential Equations in Plasmas - BIFI
• Distributed Audio Retrieval - Cardiff University
• Cellular Automata based Laser Dynamics - University of Sevilla
• Radio Network Design - University of Extremadura
• X-ray diffraction spectrum analysis - University of Extremadura
• DNA Sequence Comparison and Pattern Discovery - Erasmus Medical Center
• PLINK - Analysis of genotype/phenotype data - Atos Origin
• 3D video rendering - University of Westminster

CENTRE FOR PARALLEL COMPUTING

Conclusions – Performance Issues

• Performance enhancements
  – accrue from cyclical enterprise-level hardware and software upgrades
• Are countered by performance degradation
  – arising from the shared nature of resources
• Need to focus on robust performance measures
  – in the face of random, unpredictable run-time behaviours

CENTRE FOR PARALLEL COMPUTING

Conclusions – Load Balancing Strategies

• Heterogranular workflows
  – Tasks can differ widely in size and run times
  – Performance prediction, based e.g. on previous runs, can inform mapping (up to a point, sketched below) ..
  – .. but after this, may need to re-engineer code (white box only)
  – .. or consider offloading bottleneck tasks to reliable resources
• Homogranular workflows
  – Classic example: the parameter sweep problem
  – Fine-grain problems (#Tasks >> #Nodes) help smooth out the overall performance, but ..
  – .. the tail problem can be significant (especially if #Tasks ≈ #Nodes)
  – Smart detection of delayed tasks coupled with speculative duplication
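A small sketch of the performance-prediction idea for heterogranular workflows: keep a history of observed runtimes per task type and predict the next run from it (here a plain moving average), so the mapper can decide which tasks stay on the DG and which bottleneck tasks are offloaded to reliable resources. The task-type key, the moving-average predictor and the offload threshold are illustrative assumptions, not part of the WLDG tooling.

```python
from collections import defaultdict, deque
from statistics import mean

class RuntimePredictor:
    """Predict task runtimes from previous runs (simple moving average per task type)."""
    def __init__(self, window: int = 20):
        self.history = defaultdict(lambda: deque(maxlen=window))

    def record(self, task_type: str, runtime_s: float) -> None:
        self.history[task_type].append(runtime_s)

    def predict(self, task_type: str, default_s: float = 600.0) -> float:
        runs = self.history[task_type]
        return mean(runs) if runs else default_s

def plan(tasks, predictor: RuntimePredictor, offload_threshold_s: float = 3600.0):
    """Keep short tasks on the desktop grid; offload predicted bottlenecks
    to reliable (SG/cloud) resources."""
    dg, reliable = [], []
    for task_id, task_type in tasks:
        (reliable if predictor.predict(task_type) > offload_threshold_s else dg).append(task_id)
    return dg, reliable

# Toy usage with made-up runtimes (seconds):
p = RuntimePredictor()
for rt in (500, 650, 580):
    p.record("autodock", rt)
p.record("autogrid", 7200)
print(plan([("wu1", "autodock"), ("wu2", "autogrid")], p))   # -> (['wu1'], ['wu2'])
```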

CENTRE FOR PARALLEL COMPUTING

Conclusions – Deployment Issues
• Integration within the enterprise desktop management environment has many advantages, e.g.
  – PCs and applications are continually upgraded
  – Hosts and licenses are "free" on the DG
• … but also some drawbacks:
  – No direct control
    • Typical environments can be slack and dirty
    • Corporate objectives can override DG service objectives
    • Examples: current UoW Win7 deployment, green agenda
  – Service relationship, based on trust
    • DG bugs can easily damage the trust relationship if not caught quickly
    • Example: recent GenWrapper bug
  – Non-dedicated resource
    • Must give way to priority users, e.g. students

CENTRE FOR PARALLEL COMPUTING

The End

Any questions?