March 18, 2008 SSE Meeting 1 Mary Hall Dept. of Computer Science and Information Sciences Institute Multicore Chips and Parallel Programming


March 18, 2008 SSE Meeting 1

Mary Hall

Dept. of Computer Science and Information Sciences Institute

Multicore Chips and Parallel Programming


The Multicore Paradigm Shift: Technology Drivers


• Key ideas:
  – Movement away from increasingly complex processor design and faster clocks
  – Replicated functionality (i.e., parallel) is simpler to design
  – Resources more efficiently utilized
  – Huge power management advantages

What to do with all these transistors?

Part 1: Technology Trends


The Architectural Continuum

Supercomputer: IBM BG/L

Commodity Server: Sun Niagara

Embedded: Xilinx Virtex 4


Multicore: Impact on Software

Consequences:
  – Individual processors will no longer get faster. At first, they might get a little slower.
  – Today’s software may not perform as well on tomorrow’s hardware as written.
    • And forget about adding capability!

The very future of the computing industry demands successful strategies for applications to exploit parallelism across cores!


We are at the cusp of a transition to multicore, multithreaded architectures, and we still have not demonstrated the ease of programming the move will require… I have talked with a few people at Microsoft Research who say this is also at or near the top of their list [of critical CS research problems].

Justin Rattner, CTO, Intel Corporation

The Multicore Paradigm Shift: Computing Industry Perspective


The Rest of this Talk

• Convergence of high-end, conventional and embedded computing
  – Application development and compilation strategies for high-end (supercomputers) are now becoming important for the masses
• Why?
  – Technology trends (Motivation)
• Looking to the future
  1. Automatically generating parallel code is useful, but insufficient.
  2. Parallel computing for the masses demands better parallel programming paradigms.
  3. Compiler technology will become increasingly important to deal with a diversity of optimization challenges… and must be engineered for managing complexity and adapting to new architectures.
  4. Potential to exploit vast machine resources to automatically compose applications and systematically tune application performance.
  5. New tunable library and component technology.


1. Automatic Parallelization

• Old approaches:
  – Limited to loops and array computations
  – Difficult to find sufficient granularity (parallel work between synchronization)
  – Success from fragile, complex software
• New ideas in this area:
  – Finer granularity of parallelism -- more plentiful
  – Combine with hardware support (e.g., speculation and multithreading)

From Hall et al., “Maximizing Multiprocessor Performance with the SUIF Compiler”, IEEE Computer, Dec. 1996.
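
As a rough illustration of the "loops and array computations" that automatic parallelization targets, here is a minimal Python sketch. It is not how SUIF or any compiler works; the function names and the chunking scheme are assumptions made for the example. The first loop has independent iterations, so it can be split into coarse chunks (granularity). The second has a loop-carried dependence and cannot be split the same way. Python threads are used only to show the structure, not to demonstrate a speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def scale_serial(a, factor):
    # Each iteration touches only its own element: no loop-carried dependence.
    return [factor * x for x in a]


def scale_parallel(a, factor, workers=4):
    # Coarsen the granularity: one task per contiguous chunk rather than
    # one task per element, so there is real work between synchronizations.
    size = max(1, (len(a) + workers - 1) // workers)
    chunks = [a[i:i + size] for i in range(0, len(a), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(lambda c: [factor * x for x in c], chunks))
    # Concatenate the per-chunk results in their original order.
    return [y for part in parts for y in part]


def prefix_sum(a):
    # Iteration i needs the running total from iteration i-1: a loop-carried
    # dependence, so the chunking used above would give wrong answers here.
    out, running = [], 0
    for x in a:
        running += x
        out.append(running)
    return out


data = list(range(10))
assert scale_parallel(data, 3) == scale_serial(data, 3)
```

Telling these two cases apart automatically, for real programs, is the dependence analysis that makes the old approaches complex and fragile.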


2. Parallel Programming State of the Art

Three dominant classes of applications

Domains | Appl. Characteristics | Programming Paradigms
Scientific Computing | Very large arrays representing simulation region, loops, data parallel | MPI dominant; also OpenMP, PGAS, Grids & distributed computing
Databases | Queries over large data sets, often distributed | Query languages like SQL
Systems and Embedded Software | Fine-grain threads, small number of processors | Low-level threading such as Pthreads

Domain-specific, intellectually challenging and low-level programming models not suitable for the masses.


2. New Parallel Programming Paradigms

• Transactional memory
  – Section of code executes atomically with subsequent commit or rollback
  – Programming model + hardware support
• Streams and data-parallel models
  – Data streams describe the flow of data
  – Well-suited for certain applications and hardware (IBM Cell, GPUs)
• Domain-specific languages and libraries
  – Parallelism implicit within implementation

Different applications and users demand different solutions. Convergence unlikely. Architecture independence?
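
To make "executes atomically with subsequent commit or rollback" concrete, here is a toy software sketch in Python. It is an illustration only: real transactional memory relies on language and hardware support, and the names `TVar`, `atomically` and `demo` are hypothetical. A transaction reads a snapshot, computes speculatively without holding a lock, then commits only if no other transaction committed in between. Otherwise the speculative result is discarded (rollback) and the transaction retries.

```python
import threading


class TVar:
    """A shared value plus a version counter (hypothetical helper)."""

    def __init__(self, value):
        self.value = value
        self.version = 0
        self._lock = threading.Lock()  # held only for short read/commit steps


def atomically(tvar, update):
    """Apply update(old) -> new as one transaction, retrying on conflict."""
    while True:
        with tvar._lock:
            seen_version, snapshot = tvar.version, tvar.value
        new_value = update(snapshot)  # speculative work, no lock held
        with tvar._lock:
            if tvar.version == seen_version:
                # Commit: nobody else committed since the snapshot was taken.
                tvar.value = new_value
                tvar.version += 1
                return new_value
        # Rollback: drop new_value and loop to retry with a fresh snapshot.


def demo(num_threads=4, increments=1000):
    counter = TVar(0)

    def worker():
        for _ in range(increments):
            atomically(counter, lambda v: v + 1)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


assert demo() == 4000  # no increment is lost despite concurrent updates
```

The programmer states what must be atomic and leaves conflict detection to the runtime, which is the appeal of the model over hand-written locking.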


3. Engineering a Compiler

• Compiler research will play a crucial role in achieving performance and programmability of multi-core hardware.

• What is the state of compilers today?
  – Roughly 5 year lag between introducing a new architecture and a robust compiler
  – Many interesting new architectures fail in the marketplace due to inadequate software tools
• Today’s compilers are complex and monolithic
  – SUIF has ~500K LOC, Open64 has ~12M LOC

The best research ideas do not always make it into practice


3. A New Kind of “Compiler”

Traditional view:

[Figure: boxes labeled Batch Compiler, code, and input data]


3 & 4. Performance Tuning “Compiler”

[Figure: Code Translation and Experiments Engine, with labels: code, input data (characteristics), transformation script(s), search script(s)]


4. Auto-tuner

[Figure: Code Translation and Experiments Engine, with labels: code, input data (characteristics), transformation script(s), search script(s)]
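
One way to read the auto-tuner is as a loop: generate code variants, run each one on representative input, and keep the fastest. The sketch below is a minimal Python illustration under that reading, not the system described in the talk. The tunable parameter (`block`), the function names and the exhaustive search are assumptions chosen to keep the example small.

```python
import time


def blocked_sum(data, block):
    """Sum `data` block by block; `block` is the tunable parameter."""
    total = 0
    for start in range(0, len(data), block):
        total += sum(data[start:start + block])
    return total


def measure(fn, *args, repeats=3):
    """Stand-in for an experiments engine: best wall-clock time of a few runs."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - t0)
    return best


def autotune(fn, data, candidates):
    """Stand-in for a search script: try every candidate, keep the fastest."""
    timings = {c: measure(fn, data, c) for c in candidates}
    best = min(timings, key=timings.get)
    return best, timings


sample = list(range(10_000))
best_block, timings = autotune(blocked_sum, sample, [16, 256, 4096])
print("fastest block size on this machine:", best_block)
```

The winner depends on the machine and the input, which is the point: the choice is made by measurement rather than fixed in advance. A real tuner would replace the exhaustive loop with a smarter search over a much larger space of transformations.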


Heterogeneous: Additional Complexity

[Figure: Device Type 1, Device Type 2, Device Type 3, Device Type 4, Memory]

• Partitioning: Where to execute?
• Managing data movement and synchronization
• Staging data to/from global memory

Other:

• Utilizing highly tuned libraries

• Differences in programming models (GPP + FPGA is extreme example)
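
The "where to execute?" question can be sketched as a simple cost model: estimated time on a device is compute time plus the time to stage data to and from it. The Python below is a minimal illustration under that assumption. The `Device` fields, the device names and all throughput numbers are hypothetical, not measurements of any real hardware.

```python
import math
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    ops_per_sec: float    # hypothetical compute throughput
    bytes_per_sec: float  # hypothetical staging bandwidth; math.inf if no staging is needed


def estimated_time(dev, ops, bytes_moved):
    # Compute time plus data-movement time for this task on this device.
    return ops / dev.ops_per_sec + bytes_moved / dev.bytes_per_sec


def place(ops, bytes_moved, devices):
    # Partitioning decision: pick the device with the lowest estimated time.
    return min(devices, key=lambda d: estimated_time(d, ops, bytes_moved))


cpu = Device("cpu", ops_per_sec=1e9, bytes_per_sec=math.inf)
accel = Device("accelerator", ops_per_sec=1e10, bytes_per_sec=1e9)

# Little compute, lots of data: staging cost dominates, so stay on the CPU.
assert place(1e6, 1e8, [cpu, accel]).name == "cpu"
# Heavy compute on the same data: the faster device wins despite staging.
assert place(1e11, 1e8, [cpu, accel]).name == "accelerator"
```

Even this toy model shows why data movement has to be managed alongside partitioning: the same task lands on different devices depending on how much data it must stage.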


Traditional View
  – Code (source or binary)
  – Interface: Provides/Requires
  – Data Description: Types, Sizes

Expanded View
  – Partial Code (source or tunable binary)
  – Code Generator
  – Interface: Abstract Provides/Requires
  – Data Description: Types, Sizes
  – Interface: Device Dependencies
  – Data Description: Map Features to Optimization
  – Performance: Device, Data Features

5. Libraries and Component Technology

Support for automatic selection, tuning, scheduling, etc.
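
A small Python sketch of what "automatic selection" could look like for one component: the component advertises what it provides, carries several implementations, and maps a data feature (here, input size) to the variant to run. The class names, the `applies` predicate and the size threshold of 16 are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


def insertion_sort(xs):
    # Simple quadratic sort; often competitive on very small inputs.
    out = []
    for x in xs:
        i = len(out)
        while i > 0 and out[i - 1] > x:
            i -= 1
        out.insert(i, x)
    return out


@dataclass
class Variant:
    name: str
    run: Callable[[list], list]
    applies: Callable[[int], bool]  # maps a data feature (size) to suitability


@dataclass
class Component:
    provides: str
    variants: List[Variant]

    def select(self, data):
        # Automatic selection: first variant whose predicate accepts the data.
        n = len(data)
        for v in self.variants:
            if v.applies(n):
                return v
        raise ValueError("no variant applies")

    def __call__(self, data):
        return self.select(data).run(data)


sorter = Component(
    provides="sort",
    variants=[
        Variant("insertion", insertion_sort, lambda n: n <= 16),  # assumed threshold
        Variant("builtin", sorted, lambda n: True),               # general fallback
    ],
)

assert sorter([3, 1, 2]) == [1, 2, 3]
assert sorter.select(list(range(100))).name == "builtin"
```

In the expanded view, the threshold would not be hard-coded. It would come from the performance description, possibly filled in by an auto-tuner for each device.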


Summary

• Parallel computing is everywhere!
  – And we need software tools
  – Can we find some common ground?
• Strategies
  – Automatic parallelization
  – Libraries and domain-specific tools that hide parallelism component technology
  – New programming languages
  – Auto-tuners to “test” alternative solutions
• General approach to solving challenges
  – Education: CS503, Parallel Programming
  – Organize the community to support incremental LONG TERM development.