Bulk Synchronous Parallel Processing Model Jamie Perkins

Upload: stanley-chase

Post on 30-Dec-2015


Page 1: Bulk Synchronous Parallel Processing Model Jamie Perkins

Bulk Synchronous Parallel Processing Model

Jamie Perkins

Overview

Four W’s – Who, What, When and Why

Goals for BSP

BSP Design and Program

Cost Functions

Languages and Machines

A Bridge for Parallel Computation

Von Neumann model

    Designed to insulate hardware and software

BSP model (Bulk Synchronous Parallel)

    Proposed by Leslie Valiant of Harvard University in 1990

    Developed by W.F. McColl of Oxford

    Designed to be a “bridge” for parallel computation

Goals for BSP

Scalability – performance of HW & SW must be scalable from a single processor to thousands of processors

Portability – SW must run unchanged, with high performance, on any general purpose parallel architecture

Predictability – performance of SW on different architectures must be predictable in a straightforward way

BSP Design

Three Components

Node

    Processor and Local Memory

Router or Communication Network

    Message Passing or Point-to-Point communication

Barrier or Synchronization Mechanism

    Implemented in hardware

BSP Design

Fixed memory architecture

Hashing to allocate memory in “random” fashion

Fast access to local memory

Uniformly slow access to remote memory
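
The hashed memory layout can be sketched in a few lines of Python. This is only an illustration: the `owner_node` and `load_per_node` helpers, the multiplicative hash, and the node count are assumptions made for the example, since the slides do not say which hash function a BSP machine would use.

```python
# Sketch: spread a global address space over p nodes by hashing.
# owner_node, load_per_node and the hash constant are illustrative assumptions.
from collections import Counter

MULTIPLIER = 2654435761  # odd constant often used for multiplicative hashing


def owner_node(address, p):
    """Map a global address to the node (0 .. p-1) that stores it."""
    hashed = (address * MULTIPLIER) % (2 ** 32)
    return (hashed >> 16) % p  # use the better-mixed upper bits


def load_per_node(addresses, p):
    """Count how many of the given addresses land on each node."""
    return Counter(owner_node(a, p) for a in addresses)


if __name__ == "__main__":
    # A strided access pattern: with a plain "address % 8" layout,
    # every one of these addresses would sit on node 0.
    strided = range(0, 8 * 1000, 8)
    print(load_per_node(strided, p=8))
```

The point of the hash is that regular access patterns no longer pile onto a single node, which is what makes treating remote access as uniformly slow a reasonable approximation.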

Illustration of BSP Computer

[Figure: three nodes, each a processor (P) with local memory (M), attached to a communication network and a barrier. Source: http://peace.snu.ac.kr/courses/parallelprocessing/]

BSP Program

Composed of S supersteps

Superstep consists of:

    A computation where each processor uses only locally held values

    A global message transmission from each processor to any subset of the others

    A barrier synchronization
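
The three phases of a superstep can be simulated with Python threads. This is a rough sketch and not a real BSP library: `run_bsp`, the inbox lists, and the "pass a running total to the right-hand neighbour" workload are all made up for illustration.

```python
# Sketch of a BSP program: each superstep is compute, communicate, barrier.
# All names here are illustrative; this does not use any BSP library.
import threading


def run_bsp(p, supersteps):
    """Simulate p processors that each send a running total to a neighbour."""
    barrier = threading.Barrier(p)
    locks = [threading.Lock() for _ in range(p)]
    inboxes = [[] for _ in range(p)]  # messages delivered by the last barrier
    pending = [[] for _ in range(p)]  # messages sent in this superstep
    results = [0] * p

    def processor(pid):
        nonlocal inboxes, pending
        value = pid + 1  # locally held value
        for _ in range(supersteps):
            # 1. Computation: only local data and already-delivered messages.
            value += sum(inboxes[pid])
            # 2. Communication: send to a subset of the others (one neighbour).
            dest = (pid + 1) % p
            with locks[dest]:
                pending[dest].append(value)
            # 3. Barrier: wait until every processor has finished sending.
            if barrier.wait() == 0:
                # Exactly one thread delivers the messages for the next step.
                inboxes, pending = pending, [[] for _ in range(p)]
            barrier.wait()  # simulation detail: wait for delivery to finish
        results[pid] = value

    threads = [threading.Thread(target=processor, args=(i,)) for i in range(p)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_bsp(4, 2))
```

With four processors and two supersteps, `run_bsp(4, 2)` returns `[5, 3, 5, 7]`: nothing has been delivered in the first superstep, and in the second each processor adds the value its left-hand neighbour sent.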

Strategies for programming on BSP

Balance the computation between processes

Balance the communication between processes

Minimize the number of supersteps
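
Balance matters because a superstep ends at a barrier, so it finishes only when the most heavily loaded process does. A minimal sketch follows, in which the `block_sizes` and `superstep_work` helpers and the workload numbers are hypothetical:

```python
# Sketch: a superstep takes as long as its busiest process.
# block_sizes, superstep_work and the numbers below are illustrative.


def block_sizes(n, p):
    """Split n work items over p processes so sizes differ by at most one."""
    base, extra = divmod(n, p)
    return [base + 1 if i < extra else base for i in range(p)]


def superstep_work(sizes):
    """At the barrier everyone waits for the process with the most work."""
    return max(sizes)


balanced = block_sizes(1000, 8)                    # eight blocks of 125 items
skewed = [300, 100, 100, 100, 100, 100, 100, 100]  # same total, one hot spot

if __name__ == "__main__":
    print(superstep_work(balanced), superstep_work(skewed))  # prints: 125 300
```

Both layouts do 1000 units of work in total, but the skewed one keeps seven processes idle at the barrier while the eighth finishes.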

BSP Program

[Figure: processors P1, P2, P3 and P4 running Superstep 1 and Superstep 2, with computation, communication and barrier labelled. Source: http://peace.snu.ac.kr/courses/parallelprocessing/]

Advantages of BSP

Eliminates need for programmers to manage memory, assign communication and perform low-level synchronization (with sufficient parallel slackness)

Synchronization allows automatic optimization of the communication pattern

BSP model provides a simple cost function for analyzing the complexity of algorithms

Cost Function

g – “gap” or bandwidth inefficiency

L – “latency”, minimum time needed for one superstep

w – largest amount of work performed (per processor)

h – largest number of packets sent or received

$w_i + g h_i + L$ = execution time for superstep $i$
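
Summing the per-superstep cost over all S supersteps gives an estimate for a whole program. The sketch below assumes a hypothetical `Superstep` record and made-up values for g, L, w and h. Only the formula w + g·h + L comes from the slide.

```python
# Sketch: evaluate the BSP cost function for a small program.
# The Superstep record and all parameter values are illustrative.
from dataclasses import dataclass


@dataclass
class Superstep:
    w: float  # largest amount of work done by any processor
    h: float  # largest number of packets sent or received by any processor


def superstep_cost(step, g, L):
    """Execution time of one superstep: w + g*h + L."""
    return step.w + g * step.h + L


def program_cost(steps, g, L):
    """Total time is the sum of the superstep costs."""
    return sum(superstep_cost(s, g, L) for s in steps)


steps = [Superstep(w=1000, h=50), Superstep(w=400, h=200)]

if __name__ == "__main__":
    print(program_cost(steps, g=4, L=100))  # prints: 2600
```

Here the first superstep costs 1000 + 4·50 + 100 = 1300 and the second 400 + 4·200 + 100 = 1300. Each extra superstep adds another L even when it does little work, which is the reason to minimize the number of supersteps.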

Languages & Machines

Languages: BSP++, C, C++, Fortran, JBSP, Opal

Machines: IBM SP1, SGI Power Challenge (Shared Memory), Cray T3D, Hitachi SR2001, TCP/IP

Thank You

Any Questions?

References

http://peace.snu.ac.kr/courses/parallelprocessing/

http://wwwcs.uni-paderborn.de/fachbereich/AG/agmad

http://www.cs.mu.oz.au/677/notes/node41.html

McColl, W.F. The BSP Approach to Architecture Independent Parallel Programming. Technical report, Oxford University Computing Laboratory, Dec. 1994.

United States Patent 5083265.

Valiant, L.G. A Bridging Model for Parallel Computation. Communications of the ACM 33, 8 (1990), 103-111.