
Copyright 2012 ET International, Inc.

ET International, IDC-Panel 04-2012

HPC User Forum 2012 Panel on Potential Disruptive Technologies

Emerging Parallel Programming Approaches

Guang R. Gao, Founder

ET International Inc

Newark, Delaware

USA

[email protected]


Who is ETI?

From “Cool Vendors” Report – By Gartner (April 17, 2012):

[ET International, Newark, Delaware (www.etinternational.com)

Analysis by Carl Claunch

Why Cool: ET International delivers its dataflow-oriented ETI Swarm environment for garnering high efficiency from highly parallel software, based on the alternative ParalleX execution model. As highly parallel execution becomes essential to addressing the more substantial computing tasks that HPC users face today, progress is increasingly being stymied by the application's inability to keep all the parallel strands working productively.…]


Motivation

• Many-core is coming
  Current paradigms don't have the expressive power to harness concurrency
• Hardware is getting more heterogeneous
  Current hybrid programming techniques (OpenMP+MPI+OpenCL) are not maintainable: too complicated
• Caches are disappearing or becoming non-coherent
  Distributed memory is everywhere, and at different levels
• Fine-grained power management
  Use what you need and turn off/down the rest
• Failure is the norm
  Resilience must be baked into the whole stack (application, compiler, runtime, hardware)
• Increasing application computation/data irregularity
  Static scheduling can no longer properly load balance
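The load-balancing point on the Motivation slide can be illustrated with a small sketch. This is not SWARM code: the task costs, the function names and the four-worker setup are made up for illustration, and "dynamic" here simply means each task goes to whichever worker becomes free first.

```python
import heapq

def static_makespan(costs, workers):
    """Static schedule: contiguous equal-count blocks, one block per worker."""
    block = -(-len(costs) // workers)  # ceiling division
    loads = [sum(costs[i:i + block]) for i in range(0, len(costs), block)]
    return max(loads)

def dynamic_makespan(costs, workers):
    """Dynamic schedule: each task, in order, goes to the first free worker."""
    free_at = [0] * workers            # time at which each worker becomes idle
    heapq.heapify(free_at)
    for cost in costs:
        start = heapq.heappop(free_at)  # earliest-available worker
        heapq.heappush(free_at, start + cost)
    return max(free_at)

# Hypothetical irregular workload: a few heavy tasks followed by many light ones.
costs = [8, 7, 6, 5, 1, 1, 1, 1, 1, 1, 1, 1]

print(static_makespan(costs, 4))   # 21: one worker receives all the heavy tasks
print(dynamic_makespan(costs, 4))  # 9: close to the ideal 34 / 4 = 8.5
```

With regular task costs the two schedules would finish at about the same time; the gap only opens up when the costs are irregular, which is the case the slide is concerned with.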


ETI Vision

• We need new “Execution Models”!
• Leverage ETI’s deep and growing IP position based on 25+ years of applied R&D expertise and $20M+ in R&D software engineering and development (e.g. extensive system software base for Cyclops, CELL, SCC, Intel Runnemede, Intel x86-based machines, Adapteva, etc.)
• Provide high-performance SWARM software solutions to our OEMs, partners and direct customers
• Advance SWARM solutions to address optimization opportunities driven by heterogeneous multi-/many-core processing, including:
  Big Compute (Private HPC Cloud) systems
  Big Data HPC systems
  HPC embedded appliances
  etc.


Execution Paradigm Comparisons

• MPI, OpenMP, OpenCL: Communicating Sequential Processes; Bulk Synchronous; Message Passing
• SWARM: Asynchronous; Event-Driven; Tasks; Dependencies; Resources; Active Messages; Control Migration

[Figure: one timeline per paradigm (axis: Time), distinguishing active threads from waiting threads.]


SWARM Execution Overview

[Figure: the SWARM runtime sits between two pools. Task side: "Tasks with Unsatisfied Dependencies" become "Enabled Tasks" when their dependencies are satisfied. Resource side: "Available Resources" (CPUs, GPUs) become "Resources in Use". Arrows are labeled: dependencies satisfied, tasks enabled, tasks mapped to resources, resources allocated, resources released.]
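The cycle in the SWARM Execution Overview figure (dependencies satisfied, tasks enabled, tasks mapped to resources, resources released) can be sketched as a toy scheduler. This is an illustrative model only, not the SWARM API: the function `run`, the task names and the unit-time "rounds" are assumptions made for the example.

```python
from collections import deque

def run(tasks, deps, resources):
    """Run tasks in unit-time rounds; deps maps a task to the tasks it waits on."""
    waiting = {t: set(deps.get(t, ())) for t in tasks}  # unsatisfied dependencies
    dependents = {t: [] for t in tasks}
    for t, needs in waiting.items():
        for d in needs:
            dependents[d].append(t)
    enabled = deque(t for t in tasks if not waiting[t])  # tasks ready to run
    schedule = []
    while enabled:
        # Map as many enabled tasks as there are available resources.
        batch = [enabled.popleft() for _ in range(min(resources, len(enabled)))]
        schedule.append(batch)
        # The batch completes and its resources are released; every dependent
        # whose last dependency was just satisfied becomes enabled.
        for done in batch:
            for t in dependents[done]:
                waiting[t].discard(done)
                if not waiting[t]:
                    enabled.append(t)
    return schedule

# Hypothetical task graph: c needs a and b, d needs a, e needs c and d.
tasks = ["a", "b", "c", "d", "e"]
deps = {"c": {"a", "b"}, "d": {"a"}, "e": {"c", "d"}}
print(run(tasks, deps, resources=2))  # [['a', 'b'], ['d', 'c'], ['e']]
```

Nothing in this model waits at a global barrier: a task runs as soon as its own dependencies are satisfied and a resource is free.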


Case Studies of Fine-Grain Execution Models

• Static Dataflow Model (1970s - )
• EARTH Model (1988 - )
• TNT Model and Cyclops-64 (2003 - )
• Codelet Model under Intel-led DARPA/UHPC


Our Collaborators

[Figure: DARPA/Intel Runnemede Program overview, courtesy of the Intel DARPA UHPC Team. Organizations shown: ET International, Inc.; University of Illinois.]

Areas shown: CPU; Memory; Execution Model; Productivity; Resiliency; Interconnect Fabric; HW/SW Co-Design

Callouts shown:
• Event-driven codelets; self-aware introspection; code and data motion
• Model-based; goal-oriented; self-morphing
• 1000X energy reduction; overhauled DRAM µArch; resilient memory
• <10% overhead; checkpoint with Flash/CPM; security through sandboxing
• Heterogeneous & tapered; large local memory
• 1000X energy reduction; heterogeneous, tightly coupled; simple architecture

Goals shown: Application Efficiency; System Management & Concurrency; Data Movement; Assured Operation


Progress & Proof Points To-Date


Barnes-Hut: SWARM vs OpenMP

[Figure: Barnes-Hut speedup over serial (y-axis, 0 to 12) against number of threads (x-axis, 1 to 12). Series: Ideal, SWARM, OpenMP.]


SWARM/MPI Performance Comparison

[Figure: SWARM speedup relative to MPI (y-axis, 0% to 1500%) against number of nodes (x-axis, 4 to 512). Series: Lonestar, Redsky, Endeavor, Jaguar.]

Consistent Speed-up from 2X to 14.5X


Cholesky Decomposition (SWARM vs MKL/ScaLAPACK)

[Figure: "Cholesky % Peak": percent of peak (y-axis, 0 to 100) against number of nodes (x-axis, 2 to 64). Series: MKL/ScaLAPACK % Peak, SWARM % Peak.]

[Figure: "Cholesky Weak Scaling": GFLOPS (y-axis, 10 to 100000) against number of cores (x-axis, 1 to 1024). Series: SWARM, MKL_GFLOPS, Ideal.]


Summary and Acknowledgements

• Summary (productivity observation)
  N-Body: 1 man-day, 3X
  G-500: 1 man-month, up to 14x
  Cholesky: 2 man-weeks, 1.5x
  NOTE: the base is performance of optimized code
• Acknowledgements
  Our Sponsors
  Our Collaborators and Colleagues
  My Host
  Others


Cholesky Profiles

[Figure: two execution profiles of Cholesky, labeled SWARM and OpenMP.]