From the Heroic to the Logistical
Programming Model Implications of New Supercomputing Applications
Ian Foster
Computation Institute, Argonne National Laboratory &
The University of Chicago
With thanks to: Miron Livny, Ioan Raicu, Mike Wilde, Yong Zhao, and many others.
What will we do with 1+ Exaflops and 1M+ cores?
1) Tackle Bigger and Bigger Problems
Computational Scientist as Hero
2) Tackle Increasingly Complex Problems
Computational Scientist as Logistics Officer
“More Complex Problems”
- Use ensemble runs to quantify climate model uncertainty
- Identify potential drug targets by screening a database of ligand structures against target proteins
- Study economic model sensitivity to key parameters
- Analyze turbulence datasets from multiple perspectives
- Perform numerical optimization to determine optimal resource assignment in energy problems
- Mine collections of data from advanced light sources
- Construct databases of computed properties of chemical compounds
- Analyze data from the Large Hadron Collider
- Analyze log data from 100,000-node parallel computations
Programming Model Issues
- Massive task parallelism
- Massive data parallelism
- Integrating black-box applications
- Complex task dependencies (task graphs)
- Failure, and other execution management issues
- Data management: input, intermediate, output
- Dynamic task graphs
- Dynamic data access involving large amounts of data
- Long-running computations
- Documenting provenance of data products
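Several of these issues (massive task parallelism, failure management) already arise in a minimal many-task executor. The following Python sketch is illustrative only, built on the standard library; the task function and retry policy are hypothetical stand-ins, not part of Swift, Falkon, or any other tool named in this talk:

```python
import concurrent.futures
from collections import Counter

attempts = Counter()

def run_task(task_id):
    # Stand-in for a black-box application invocation; every 7th task
    # fails once (a simulated transient failure), then succeeds.
    attempts[task_id] += 1
    if task_id % 7 == 0 and attempts[task_id] == 1:
        raise RuntimeError(f"transient failure in task {task_id}")
    return task_id * 2

def run_with_retries(task_id, retries=2):
    # Execution management: retry each failed task a bounded number of times.
    for attempt in range(retries + 1):
        try:
            return run_task(task_id)
        except RuntimeError:
            if attempt == retries:
                raise

def run_all(n_tasks, workers=8):
    # Massive task parallelism: a bag of independent tasks over a worker pool.
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(run_with_retries, t): t for t in range(n_tasks)}
        return {futures[f]: f.result()
                for f in concurrent.futures.as_completed(futures)}

if __name__ == "__main__":
    results = run_all(100)
    print(len(results))  # all 100 tasks complete despite transient failures
```

Real many-task systems add what this sketch omits: data staging, provenance capture, and dispatch rates far beyond a single Python process.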
Problem Types
[Chart: problem space, with number of tasks (1, 1K, 1M) on one axis and input data size (Lo, Med, Hi) on the other]
- Few tasks, low data: heroic MPI tasks
- Few tasks, high data: data analysis, mining
- Many tasks, low data: many loosely coupled tasks
- Many tasks, high data: much data and complex tasks
An Incomplete and Simplistic View of Programming Models and Tools
- Many tasks: DAGMan+Pegasus, Karajan+Swift
- Much data: MapReduce/Hadoop, Dryad
- Complex tasks, much data: Dryad, Pig, Sawzall, Swift+Falkon
- Single task, modest data: MPI, etc., etc., etc.
Many Tasks: Climate Ensemble Simulations (using FOAM, 2005)
- NCAR computer + grad student: 160 ensemble members in 75 days
- TeraGrid + "Virtual Data System": 250 ensemble members in 4 days
Image courtesy Pat Behling and Yun Liu, UW Madison
Many, Many Tasks: Identifying Potential Drug Targets
2M+ ligands × protein target(s)
(Mike Kubal, Benoit Roux, and others)
[Workflow diagram: virtual screening pipeline, start → report → end]
- Inputs: PDB protein descriptions; 1 protein (1 MB); ZINC 3-D structures, 2M structures (6 GB); manually prepared DOCK6 and FRED receptor files (1 per protein, defining the pocket to bind to); NAB script template and parameters (defining flexible residues and # MD steps)
- DOCK6/FRED screening of ligand-receptor complexes: ~4M tasks × 60 s × 1 CPU ≈ 60K CPU-hours; select best ~5K
- Amber prep (2. AmberizeReceptor, 4. perl: generate NAB script) and Amber score (1. AmberizeLigand, 3. AmberizeComplex, 5. RunNABScript): ~10K tasks × 20 min × 1 CPU ≈ 3K CPU-hours; select best ~500
- GCMC: ~500 tasks × 10 hr × 100 CPUs ≈ 500K CPU-hours
- Total: 4 million tasks, 500K CPU-hours
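Structurally, this pipeline is a funnel: score everything with a cheap method, keep the best ~5K, rescore the survivors with a costlier method, keep the best ~500, and so on. A minimal Python sketch of that select-best-N staging; the scoring functions are toy stand-ins for DOCK6/FRED and Amber, and the sizes are scaled down from the 2M-ligand run:

```python
def funnel(candidates, stages):
    """Run each (score_fn, keep_n) stage, keeping only the top-scoring survivors."""
    survivors = list(candidates)
    for score_fn, keep_n in stages:
        survivors = sorted(survivors, key=score_fn, reverse=True)[:keep_n]
    return survivors

# Toy stand-ins for the real scoring codes (illustrative only)
def cheap_dock_score(ligand):   # fast, approximate (cf. DOCK6/FRED)
    return -abs(ligand - 500)

def amber_score(ligand):        # slower, more accurate (cf. Amber rescoring)
    return -abs(ligand - 510)

ligands = range(20_000)  # scaled down from the 2M+ ZINC ligands
best = funnel(ligands, [(cheap_dock_score, 5000), (amber_score, 500)])
print(len(best))  # 500
```

The point of the design is economic: each successive stage is far more expensive per candidate, so the cheap stages must cut the population before the expensive ones run.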
DOCK on SiCortex
- CPU cores: 5,760
- Tasks: 92,160
- Elapsed time: 12,821 s
- Compute time: 1.94 CPU-years
- Average task time: 660.3 s (does not include ~800 s to stage input data)
Ioan Raicu, Zhao Zhang
MARS Economic Model Parameter Study
[Chart: CPU cores (0-2,000) and micro-tasks (0-8M) vs. time (0-1,440 s); series: idle CPUs, busy CPUs, wait queue length, completed micro-tasks]
- 2,048 BG/P CPU cores
- Tasks: 49,152
- Micro-tasks: 7,077,888
- Elapsed time: 1,601 s
- CPU-hours: 894
Mike Wilde, Zhao Zhang
Montage in MPI and Swift (Yong Zhao, Ioan Raicu, U. Chicago)
- MPI: ~950 lines of C for one stage
- Pegasus: ~1,200 lines of C + tools to generate a DAG for a specific dataset
- SwiftScript: ~92 lines for any dataset
B. Berriman, J. Good (Caltech); J. Jacob, D. Katz (JPL)
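The SwiftScript version is short largely because it declares a dataflow over whatever images are present, rather than hard-coding one dataset. A rough Python analogue of that dataset-independence; the function names are hypothetical stand-ins, not the actual Montage or Swift APIs:

```python
from concurrent.futures import ThreadPoolExecutor

def project(image):
    # Hypothetical stand-in for a per-image Montage stage (e.g. reprojection)
    return f"projected({image})"

def coadd(projected_images):
    # Hypothetical stand-in for the final combining stage
    return " + ".join(sorted(projected_images))

def mosaic(images, workers=4):
    # Dataset-independent: the degree of parallelism follows the input data,
    # whether the list holds two images or two million.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        projected = list(pool.map(project, images))
    return coadd(projected)

print(mosaic(["a.fits", "b.fits"]))
# projected(a.fits) + projected(b.fits)
```

In Swift the same idea is expressed declaratively: independent per-image invocations are discovered from the dataflow and dispatched in parallel by the runtime.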
MapReduce/Hadoop
[Diagram: Hadoop DFS architecture. Clients send metadata ops (name, replicas, e.g. /home/sameerp/foo, 3; /home/sameerp/docs, 4) to the Namenode, and read/write data directly against Datanodes spread across Racks 1 and 2]
[Chart: word-count time (sec, log scale) vs. data size (75 MB, 350 MB, 703 MB), comparing Swift+PBS and Hadoop]
ALCF: 80 TB memory, 8 PB disk, 78 GB/s I/O bandwidth
Soner Balkir, Jing Tie, Quan Pham
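The benchmark above is MapReduce's canonical word-count example. A self-contained, single-process Python sketch of the map → shuffle → reduce structure (Hadoop of course distributes each phase across Datanodes; this only shows the programming model):

```python
from collections import defaultdict
from itertools import chain

def map_phase(document):
    # Map: emit a (word, 1) pair for every word in one input split
    return [(word, 1) for word in document.split()]

def shuffle(pairs):
    # Shuffle: group intermediate pairs by key
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: sum the counts for each word
    return {word: sum(counts) for word, counts in groups.items()}

def word_count(documents):
    return reduce_phase(shuffle(chain.from_iterable(map(map_phase, documents))))

print(word_count(["to be or not to be", "be"]))
# {'to': 2, 'be': 3, 'or': 1, 'not': 1}
```

The attraction for the applications in this talk is that the programmer supplies only the two pure functions; partitioning, scheduling, and re-execution on failure belong to the runtime.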
Extreme Scale Debugging: Stack Trace Sampling Tool (STAT)
[Chart: cost per sample on BlueGene/L; latency (0-2.5 s) vs. number of application tasks (0-140,000), for 1-deep, 2-deep, and 3-deep traces in VN mode; up to 131,072 processes]
Bart Miller, Wisconsin
Summary
- Peta- and exa-scale computers enable us to tackle new types of problems at far greater scales than before
  - Parameter studies, ensembles, interactive data analysis, "workflows" of various kinds
  - Potentially an important source of new applications
- Such apps frequently stress petascale hardware and software in interesting ways
- New programming models and tools are required
  - Mixed task and data parallelism, management of many tasks, complex data management, failure, …
  - Tools for such problems (DAGMan, Swift, Hadoop, …) exist but need refinement
- Interesting connections to the distributed systems community
More info: www.ci.uchicago.edu/swift
Amiga Mars – Swift+Falkon
- 1,024 tasks (147,456 micro-tasks)
- 256 CPU cores