TRANSCRIPT
Paving the Road to Exascales with Many-Task Computing
Speaker: Ke Wang
Home page: http://datasys.cs.iit.edu/~kewang
Supervisor: Ioan Raicu
Data-Intensive Distributed Systems Laboratory
Computer Science Department
Illinois Institute of Technology
November 14th, 2012
Many-Task Computing (MTC)
• Bridges the gap between HPC and HTC
• Applications structured as DAGs
• Data dependencies are files that are written to and read from a file system
• Loosely coupled apps with HPC orientations
[Figure: quadrant chart of number of tasks (1, 1K, 1M) vs. input data size (low, med, hi) — HPC (heroic MPI tasks) at few tasks; HTC/MTC (many loosely coupled tasks) at many tasks; MapReduce/MTC (data analysis, mining) at large data; MTC (big data and many tasks) at both]
• Falkon: Fast and Lightweight Task Execution Framework (http://datasys.cs.iit.edu/projects/Falkon/index.html)
• Swift: Parallel Programming System (http://www.ci.uchicago.edu/swift/index.php)
Load Balancing
• The technique of distributing computational and communication loads evenly across the processors of a parallel machine, or across the nodes of a supercomputer
• Different scheduling strategies
– Centralized scheduling: poor scalability (Falkon, Slurm, Cobalt)
– Hierarchical scheduling: moderate scalability (Falkon, Charm++)
– Distributed scheduling: possible approach to exascales (Charm++)
• Work stealing: a distributed load balancing strategy
– Starved processors steal tasks from overloaded ones
– Various parameters affect performance:
• Number of tasks to steal (half)
• Number of neighbors (square root of the total number of nodes)
• Static or dynamic random neighbors (dynamic random neighbors)
• Stealing poll interval (exponential back-off)
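The four tuning parameters listed above can be sketched as small helper functions. This is an illustrative Python sketch with hypothetical names, not the actual Falkon or MATRIX code:

```python
import math
import random

def tasks_to_steal(victim_load):
    """Steal half of the victim's queued tasks."""
    return victim_load // 2

def num_neighbors(total_nodes):
    """Poll sqrt(N) of the N nodes per steal attempt."""
    return max(1, int(math.sqrt(total_nodes)))

def dynamic_neighbors(node_id, total_nodes):
    """Dynamic random neighbors: resample a fresh set before every steal
    attempt (static neighbors would fix the set once at startup)."""
    others = [n for n in range(total_nodes) if n != node_id]
    return random.sample(others, num_neighbors(total_nodes))

def poll_intervals(first=0.001, limit=1.0):
    """Exponential back-off of the stealing poll interval: after each
    failed steal, wait twice as long before polling again."""
    t = first
    while t <= limit:
        yield t
        t *= 2
```

Resampling neighbors dynamically avoids a partition of the network into fixed steal cliques, and the back-off keeps idle nodes from flooding busy ones with load requests.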
SimMatrix
• A light-weight and scalable discrete event SIMulator for MAny-Task computing execution fabRIc at eXascales
• Supports centralized (FIFO) and distributed (work stealing) scheduling
• Has great scalability (millions of nodes, billions of cores, trillions of tasks)
• Future extensions: task dependencies, workflow system simulation, different network topologies, data-aware scheduling
[Figure: SimMatrix event flow — a global event queue sorted by time; the simulation starts when the first node needs tasks; events Insert Event(time: t), TaskDisp, TaskRec, TaskEnd, and Steal drive the loop; on TaskEnd, a node either dispatches waiting tasks or, with no waiting tasks, frees available cores; a failed steal retries; LogVisual records the run]
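The event loop in the diagram can be sketched as a minimal discrete-event simulator built around a global event queue sorted by time. This is an illustrative Python sketch under that assumption; `Simulator` and `task_end` are hypothetical names, not SimMatrix's actual classes:

```python
import heapq

class Simulator:
    """Minimal discrete-event loop: a global event queue sorted by time."""
    def __init__(self):
        self.queue = []   # heap of (time, seq, handler, payload)
        self.now = 0.0
        self._seq = 0     # tie-breaker for events with equal timestamps

    def insert_event(self, time, handler, payload=None):
        heapq.heappush(self.queue, (time, self._seq, handler, payload))
        self._seq += 1

    def run(self):
        """Pop events in time order; each handler may insert future events."""
        while self.queue:
            self.now, _, handler, payload = heapq.heappop(self.queue)
            handler(self, payload)

def task_end(sim, node):
    """A TaskEnd event: dispatch a waiting task, or free a core."""
    if node["waiting"]:
        task = node["waiting"].pop(0)
        # dispatching a task schedules its own TaskEnd in the future
        sim.insert_event(sim.now + task["length"], task_end, node)
    else:
        node["free_cores"] += 1
```

Because handlers only run when their event is popped, simulated time advances in jumps between events, which is what lets a single process model millions of nodes cheaply.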
MATRIX
• A real implementation of a distributed MAny-Task execution fabRIc at eXascales
[Figure: MATRIX architecture — a Client, an Index Server, and Compute nodes interacting as follows:
(1) compute nodes register with the index server
(2) the index server sends the membership list to the compute nodes
(3) the client requests the membership list
(4) the client submits tasks using ZHT
(5) the client looks up task status using ZHT
(6) task status info is sent back to the client
(7) a starved compute node requests the load of its neighbors
(8) the neighbors send their loads
(9) the starved node requests tasks from the most loaded neighbor
(10) that neighbor sends the tasks]
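The work-stealing exchange in steps (7)–(10) above can be sketched as follows. This is hypothetical Python for illustration; the real MATRIX exchanges network messages (with state in ZHT) rather than direct method calls:

```python
class ComputeNode:
    """A compute node with a local task queue (illustrative, not MATRIX's API)."""
    def __init__(self, name, tasks=None):
        self.name = name
        self.tasks = tasks if tasks is not None else []

    def report_load(self):
        """Steps (7)/(8): a neighbor reports its load (queue length)."""
        return len(self.tasks)

    def send_tasks(self, count):
        """Steps (9)/(10): hand over the requested number of tasks."""
        stolen, self.tasks = self.tasks[:count], self.tasks[count:]
        return stolen

def steal_round(thief, neighbors):
    """One steal attempt: poll neighbor loads, steal half the tasks
    from the most loaded neighbor. Returns the number of tasks stolen."""
    loads = {n: n.report_load() for n in neighbors}      # (7) + (8)
    victim = max(loads, key=loads.get)
    if loads[victim] == 0:
        return 0                                         # failed steal: back off
    stolen = victim.send_tasks(loads[victim] // 2)       # (9) + (10)
    thief.tasks.extend(stolen)
    return len(stolen)
```

A failed round (all polled neighbors idle) is where the exponential back-off of the poll interval would apply.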
[Figure: SimMatrix throughput vs. MATRIX throughput (tasks/sec, 0–6000) across scales of 64, 128, 256, and 512 nodes; the difference between simulated and measured throughput stays under 5% (3.8%, 4.7%, 3.7%, 5.0%)]
Acknowledgement
• DataSys Laboratory: Ioan Raicu, Anupam Rajendran, Tonglin Li, Kevin Brandstatter
• University of Chicago: Zhao Zhang