
Page 1: Making Parallel Processing on Clusters Efficient, Transparent and Easy for Programmers

Andrzej M. Goscinski
School of Computing and Mathematics
Deakin University

Joint work with Michael Hobbs, Jackie Silcock and Justin Rough

Page 2: Overview and Aims

Basic issues and solutions
– Parallel processing: user expectations, clusters, phases
– Parallelism management
– Transparency
– Communication paradigms
– What to do?
– Related systems

Cluster execution environments
– Middleware
– Cluster operating systems

GENESIS
– Architecture
– Services for parallelism management and transparency

GENESIS programming interface
– Message passing
– DSM
– Primitives

Easy to use and program environment

Performance study

Summary and future work

Page 3: Parallel Processing: User Expectations

Affordable – supercomputers for a "poor man"
Performance – good performance
Ease of use – free from creation and placement concerns
Transparency – unaware of location of processes
Ease of programming – choice and easy use of communication paradigm

Page 4: Parallel Processing: Clusters

Clusters are an ideal platform for the execution of parallel applications

Many institutions (universities, banks, industries) move toward homogeneous non-dedicated clusters

Advantages:
– Cheap to build: commodity PCs, networks
– Widely available
– Idle during weekends
– Low utilization during working hours

Disadvantages:
– Poor and difficult-to-use software (operating systems and runtime systems)
– User unfriendly
– Distribution of resources (CPUs and peripherals)

Page 5: Parallel Processing Phases

Three distinct phases:
– Initialization
– Execution
– Termination

Researchers and manufacturers mainly concentrate on execution to achieve the best performance

Ease of use of parallel systems and the programmer's time are neglected

Application developers are discouraged as they have to program many activities which are of an operating system nature

Page 6: Parallelism Management

Present operating systems that manage clusters are not built to support parallel processing

Reason: these operating systems do not provide services to manage parallelism

Parallelism management is the management of parallel processes and computational resources to
– achieve high performance
– use computational resources efficiently
– make programming and use of parallel systems easy

Page 7: Parallelism Management

Parallelism management in parallel programming tools, Distributed Shared Memory and enhanced operating system environments
– has been neglected
– is left to the application developers

Application developers must deal
– not only with parallel application development
– but also with the problems of initiating and controlling execution on the cluster

Transparency and reliability (SSI) have been neglected – users do not see a cluster as a single powerful computer

Page 8: Services for Parallelism Management on Clusters

Services for parallelism management and transparency:
– Establishment of a virtual machine
– Mapping of processes to computers
– Parallel process instantiation
– Data (including shared) distribution
– Initialisation of synchronization variables
– Coordination of parallel processes
– Dynamic load balancing

Page 9: Transparency

Users should see a cluster as a single powerful computer

Dimensions of parallel processing transparency:
– Location transparency
– Process relation transparency
– Execution transparency
– Device transparency

Page 10: Communication Paradigms

Two communication paradigms:

Message Passing (MP) – explicit communication between processes of a parallel application
– Fast
– Difficult for programmers to use

Distributed Shared Memory (DSM) – implicit communication between processes of a parallel application through shared memory objects
– Easy to use
– Demonstrates reduced performance

Claim: operating environments that offer MP and DSM should be provided as part of a cluster operating system, as they manage system resources

A short sketch contrasting the two paradigms follows.
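
To make the contrast concrete, the C fragment below expresses the same producer/consumer exchange in both paradigms. Every function name here (mp_send, mp_recv, sem_wait_s, sem_signal_s) is a hypothetical placeholder used only for illustration; none of them are GENESIS, PVM or standard-library calls.

    /* Hypothetical prototypes, declared only so the sketch is self-contained. */
    extern void mp_send(int dest_pid, const void *buf, int len);   /* explicit MP send    */
    extern void mp_recv(int src_pid, void *buf, int len);          /* explicit MP receive */
    extern void sem_wait_s(void *sem);                             /* DSM synchronization */
    extern void sem_signal_s(void *sem);

    /* Message passing: data moves only because the programmer sends it. */
    void producer_mp(int consumer)           { int v = 42; mp_send(consumer, &v, sizeof v); }
    void consumer_mp(int producer, int *out) { mp_recv(producer, out, sizeof *out); }

    /* DSM: both processes touch the same shared object; the DSM runtime */
    /* moves the data behind ordinary loads and stores.                  */
    void producer_dsm(int *shared, void *ready) { *shared = 42; sem_signal_s(ready); }
    void consumer_dsm(int *shared, void *ready, int *out) { sem_wait_s(ready); *out = *shared; }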

Page 11: What to Do?

– Affordable: clusters
– Performance: introduce special services
– Ease of use: parallelism management
– Transparency: operating systems
– Ease of programming: message passing and DSM

Development of cluster operating systems supporting parallel processing

Services of cluster operating systems:
– Distributed services for transparent communication and management of basic system resources
– Services for parallelism management and transparency

Page 12: Related Systems: Message Passing Systems

PVM
– A set of cooperating server processes and specialized libraries that support process communication, execution and synchronization
– A virtual machine must be set up by the user
– Provides transparent process creation and termination

MPI
– Objective is to standardize and coordinate the direction of various message passing applications, tools and environments
– Provides limited process management functions to support parallel processing

HARNESS
– Does not provide transparency
– Programmers are forced to specify computers and map processes to these computers
– Load imbalance is neglected

Page 13: Related Systems: DSM Systems

Research concentrates mainly on improving performance

Ease of use has been neglected

Munin
– Programmers must label different variables according to the consistency protocol they require
– The initialisation stage requires the application developer to define the number of computers to be used
– Programmers must create a thread on each computer, initialise shared data and create synchronization variables

TreadMarks
– The application developer has a substantial input into the initialisation of DSM processes
– Full transparency is not provided

Page 14: Related Systems: Execution Environments

An improvement on the PVM, MPI and DSM approach of running on top of an operating system is the enhancement of an operating system to support parallel processing

Beowulf
– Exploits a distributed process space to manage parallel processes
– Processes can be started on a remote computer after a logon operation into that computer has completed successfully
– Does not address resource allocation or load balancing
– Transparent process migration is not provided

Page 15: Related Systems: Execution Environments

NOW
– Combines specialized libraries and server processes with enhancements to the kernel
– Enhancement: scheduling and communication kernel modules – GLUnix provides network-wide process, file and VM management
– Parallelism management services: process initialisation on any cluster computer, semi-transparent start of parallel processes on multiple nodes (how to select nodes?), barriers, MPI

MOSIX
– Provides enhanced and transparent communication and scheduling within the kernel
– Employs PVM to provide parallelism support (initial placement)
– Process migration transparently migrates processes
– Provides dynamic load balancing and data collection
– Remote communication is handled through the originating computer

Page 16: Related Systems: Summary

All systems but MOSIX are based on middleware – there has been no attempt to develop a comprehensive operating system to support parallel processing on clusters

The solutions are performance driven – little work has been done on making them programmer friendly

Problems from a parallel processing point of view:
– Processes are created one at a time, although the primitives provided enable the user to create multiple processes
– These systems (with the exception of MOSIX) do not provide complete transparency
– The virtual machine is not set up automatically
– These systems do not provide load balancing

Page 17: Cluster Execution Environments

Execution environments that support parallel processing on clusters can be developed using
– the middleware approach – at the application level
– the underware approach – at the kernel level

Page 18: Middleware

[Diagram: user/application processes run on top of middleware – PVM software (library functions or a separate server process) or DSM software – which in turn runs on top of the operating system (Unix) on each computer.]

Page 19: Middleware – Summary

Middleware allows programmers to
– develop parallel applications (PVM, MPI)
– execute parallel applications on clusters (Beowulf)
– employ shared memory based programming (Munin)
– achieve good execution performance
– take advantage of portability

Middleware
– does not offer complete transparency
– reduces potential execution performance (services are duplicated)
– forces programmers to be involved in many time consuming and error prone activities that are of an operating system nature

Conclusion: to provide parallelism management, offer transparency, and make programming and use of a system easy, the needed services should be developed at the operating system level

Page 20: Cluster Operating Systems

A cluster is a special kind of distributed system

A cluster operating system supporting parallel processing should
– possess the features of a distributed operating system, to deal with distributed resources and their management and to hide distribution
– exploit additional services to manage parallelism for applications and offer complete transparency
– provide an enhanced programming environment

Three logical levels of a cluster operating system:
– Basic distributed operating system
– Parallelism management and transparency system
– Programming environment

Page 21: Logical Architecture of a Cluster Operating System

[Diagram: the programming environment (Message Passing/PVM and Shared Memory/DSM services) sits on top of the parallelism management system and the communication services, which in turn sit on an enhanced subset of a distributed operating system (microkernel, communication/file management).]

Page 22: GENESIS Cluster Operating System

Proof of concept

Client-server model, microkernel approach and object-based approach (all entities have names)

All basic resources – processor, main memory, network, interprocess communication, files – are managed by relevant servers

IPC – message passing services
– the basic communication paradigm
– the cornerstone of the architecture
– provided by the IPC Manager and the local IPC component of the microkernel

IPC placement and its relationship with other services are designed to achieve high performance and transparency

DSM is provided by the Space (memory) and IPC Managers

Page 23: The GENESIS Architecture

[Diagram: parallel processes (MP, PVM, DSM) run above the parallelism management system (Execution Manager, Global Scheduler, Resource Discovery, Migration Manager, DSM System) and the kernel servers (IPC Manager, Space Manager, Process Manager, File/Cache Manager, Network Manager), all built on the RHODOS microkernel.]

Page 24: GENESIS Services for Parallelism Management and Transparency

Basic services that provide parallelism management and offer transparency:
– Establishment of a virtual machine
– Process creation
– Process duplication
– Process migration
– Global scheduling

Page 25: Establishment of a Virtual Machine

The Resource Discovery Server supports adaptive establishment of a virtual machine

The Resource Discovery Server
– Identifies
  - idle and lightly loaded computers
  - computer resources, e.g., processor model, memory size
  - computational load and available memory
  - communication patterns for each process
– Passes this information to the Global Scheduling Server, per process and per server, averaged over the entire cluster

The virtual machine changes dynamically
– some computers become overloaded or go out of order
– some computers become idle

An illustrative record layout follows.
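
As an illustration of the kind of per-computer information such a resource discovery service might collect, the C struct below sketches one plausible record layout. The field names and types are assumptions; the slides name the quantities (processor model, memory size, load) but not their representation.

    /* Hypothetical per-computer record gathered by a resource discovery service.   */
    /* Field names and types are illustrative assumptions, not the GENESIS layout.  */
    #include <stddef.h>

    struct resource_record {
        char   host[32];          /* computer identity                    */
        char   cpu_model[32];     /* e.g. "Sun3/50"                       */
        size_t memory_total_kb;   /* installed memory                     */
        size_t memory_free_kb;    /* currently available memory           */
        double load_average;      /* computational load (lower = idler)   */
        int    idle;              /* 1 if the computer is currently idle  */
    };

    /* A very simple placement test a global scheduler could apply. */
    static int is_candidate(const struct resource_record *r, double max_load)
    {
        return r->idle || r->load_average < max_load;
    }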

Page 26: Process Creation

Requirements
– Multiple process creation: create many instances of a process on a single computer or over many computers
– Scalability: must scale to many computers
– Complete transparency: must hide the location of all resources and processes

Three forms of process creation: single, multiple and group

Creation is invoked when the Execution Manager receives a process create request from a parent process
– The Execution Manager notifies the Global Scheduler
– The Global Scheduler sends the location on which the process should be created
– The Execution Manager on the selected computer manages process creation

Page 27: Process Creation: Single and Multiple Services

Single process creation service
– Similar to the services found in traditional systems supporting parallel processing
– Requires the executable image to be downloaded from disk for each parallel process to be created

Multiple process creation service
– Supports the concurrent instantiation of a number of processes on a given computer through one creation call
– When many computers are involved in multiple process creation, each computer is addressed sequentially
– The executable image of a parallel child process must be downloaded separately for each computer involved – a scalability problem

Page 28: Process Creation: Group

Group process creation combines multiple process creation and group communication

Group process creation service
– allows multiple processes to be created concurrently on many computers
– a single executable is downloaded from a file server using group communication

A usage sketch follows.
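
A minimal sketch of how a parent program might request group creation of its parallel children, using the process_ncreate(GROUP_CREATE, n, "child_prog") primitive quoted later in the talk (Page 47). The prototype, return value and GROUP_CREATE constant are assumptions; only the call itself is taken from the slides.

    /* Sketch only: the prototype below is assumed, not the documented GENESIS API. */
    #include <stdio.h>

    #define GROUP_CREATE 1                 /* assumed constant selecting group creation */
    extern int process_ncreate(int form, int n, const char *image);

    int main(void)
    {
        int n = 8;                         /* number of parallel child processes */

        /* One call: the Global Scheduler picks the computers, the Execution   */
        /* Managers create the children, and the single executable image is    */
        /* downloaded once from the file server via group communication.       */
        if (process_ncreate(GROUP_CREATE, n, "child_prog") < 0) {
            fprintf(stderr, "group creation failed\n");
            return 1;
        }
        /* ... parent continues: distribute data, coordinate children ... */
        return 0;
    }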

Page 29: Group Process Creation: Behavior

[Diagram: numbered message sequence between the parent process, the Global Scheduler, the File Server and the Execution Managers on Computers 1..n, showing the child processes (Child 1 .. Child n) being instantiated concurrently on all selected computers.]

Page 30: Process Duplication: Single Local and Remote

Parallel processes are instantiated on selected computers by employing process duplication supported by process migration

Three forms of process duplication:
– single local and remote
– multiple local and remote
– group remote

Single local and remote process duplication
– Duplication is invoked when the Execution Manager receives a twin request from a parent process
  - the Execution Manager notifies the Global Scheduler
  - the Global Scheduler sends a location on which the twin should be placed
  - if this computer is remote, process migration is employed

Page 31: Process Duplication: Multiple Local and Remote

Multiple local and remote process duplication is an enhancement of single process duplication

Duplication is invoked when the Execution Manager receives a multiple duplication request from a parent process
– The Execution Manager notifies the Global Scheduler
– The Global Scheduler sends a location on which the twin should be placed
– If the computer is local, the Process Manager and Space Manager are requested to duplicate multiple copies of process entries and memory spaces
– If the computer is remote, the parent process is migrated to this destination, multiple copies of the parent process are duplicated, and the parent process on the remote computer is killed

When child processes should be duplicated on many computers, remote process duplication is performed for each selected computer

Page 32: Process Duplication: Group Remote

When more than one remote computer is involved in process duplication, the overall performance decreases

The decrease is caused by migrating a parent process to each remote computer sequentially

Performance is improved by employing group process migration
– Process Managers and Execution Managers each join a relevant group and use group communication
– The parent process is concurrently migrated to all selected remote computers involved in process duplication

Page 33: Group Remote Process Duplication: Behavior

[Diagram: numbered message sequence between the parent process, the Global Scheduler, and the Execution and Migration Managers on Computers 1..n; the parent is migrated to all selected computers concurrently using group communication and then duplicated into Child 1 .. Child n.]

Page 34: Process Migration

Designed to separate policy from mechanism
– The Migration Manager acts as the coordinator for migration of the various resources that combine to form a process
– Migration of resources (memory, process entries, buffers) is carried out by the Space, Process and IPC Managers, respectively

Two forms of process migration: single and group

Single process migration
– The Global Scheduler provides "which" process goes to "where" (which computer)
– The local Migration Manager requests its remote peer to prepare for a process
– The local Migration Manager requests the Space, Process and IPC Managers to migrate their respective resources
– The remote Migration Manager informs its local peer of a successful migration
– The local Migration Manager requests the Space, Process and IPC Managers to delete the respective resources of the migrated process

A sketch of this coordination follows.
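
The steps above can be read as a small coordination protocol. The C sketch below restates them as pseudocode; every function name (gs_select_destination, mm_prepare_remote, and so on) is a hypothetical stand-in for a message exchange with the named GENESIS server, not a real GENESIS call.

    /* Pseudocode sketch of single process migration as listed above.            */
    /* All functions are hypothetical stand-ins for messages to GENESIS servers. */
    typedef int gpid_t;    /* process identifier (illustrative)  */
    typedef int node_t;    /* computer identifier (illustrative) */

    extern node_t gs_select_destination(gpid_t p);          /* Global Scheduler: "where"  */
    extern int    mm_prepare_remote(node_t dst, gpid_t p);  /* remote Migration Manager   */
    extern int    space_migrate(node_t dst, gpid_t p);      /* Space Manager: memory      */
    extern int    process_migrate(node_t dst, gpid_t p);    /* Process Manager: entries   */
    extern int    ipc_migrate(node_t dst, gpid_t p);        /* IPC Manager: buffers       */
    extern void   local_delete_resources(gpid_t p);         /* clean up on the source     */

    int migrate_single(gpid_t p)
    {
        node_t dst = gs_select_destination(p);      /* 1. policy decision            */
        if (mm_prepare_remote(dst, p) != 0)          /* 2. remote peer prepares       */
            return -1;
        if (space_migrate(dst, p) != 0 ||            /* 3. move memory, entries and   */
            process_migrate(dst, p) != 0 ||          /*    IPC buffers                */
            ipc_migrate(dst, p) != 0)
            return -1;
        /* 4. remote peer reports success (implied by the calls returning 0) */
        local_delete_resources(p);                   /* 5. delete source-side copies  */
        return 0;
    }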

Page 35: Process Migration: Behavior

[Diagram: numbered message sequence for single process migration; the Global Scheduler triggers the Migration Managers on the source and destination computers, which direct the Process, Space and IPC Managers to transfer the process state, spaces and IPC buffers.]

Page 36: Group Process Migration

An enhancement of single process migration

The single communication between the peer Migration Managers, Process Managers, Space Managers and IPC Managers is changed to group communication

The Global Scheduler provides "which" process goes to "where" (which computers)
– Each server migrates its respective resources to multiple destination computers in a single message using group communication
– The parent process is duplicated on each remote computer
– At the end of a successful migration the parent process on each remote computer is killed

Page 37: Global Scheduling

Makes the policy decisions of which processes should be mapped to which computers

Input is provided by the Resource Discovery Manager

Relies on the mechanisms of
– single, multiple and group process creation and duplication
– single and group process migration

The server combines the services of
– static allocation – at the initial stage of parallel processing
– dynamic load balancing – to react to load fluctuations

Currently, the Global Scheduler is implemented as a centralized server

A placement sketch follows.

Page 38: GENESIS Programming Interface

Designed and developed to provide both communication paradigms:
– message passing
– shared memory

[Diagram: the logical architecture of Page 21, with the programming environment (Message Passing/PVM and Shared Memory/DSM services) highlighted above the parallelism management system and the distributed operating system subset.]

Page 39: Message Passing

Basic message passing
– Exploits basic interprocess communication concepts
– Transparent and reliable local and remote IPC
– An integral component of GENESIS
– Offers standard message passing and RPC primitives

GENESIS PVM
– PVM added to provide a well-known parallel programming tool
– Ported from the Unix-based PVM
– Implemented within a "library" in GENESIS
– Maps the standard PVM services onto the GENESIS services
– Performance improvements of PVM on GENESIS:
  - no additional "classic" PVM server processes required
  - direct interprocess communication model instead of the default model
  - load balancing provided

A standard PVM example follows.
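
For reference, this is the shape of a small master/worker program written against the standard PVM 3 library; according to the slides, GENESIS PVM maps these same calls onto GENESIS services, so code of this form is what the port targets. Treat it as a generic PVM example rather than code taken from GENESIS.

    /* Minimal PVM 3 master: spawns workers, sends each a number, sums the replies. */
    #include <stdio.h>
    #include <pvm3.h>

    #define NWORKERS 4

    int main(void)
    {
        int tids[NWORKERS];
        int spawned = pvm_spawn("worker", NULL, PvmTaskDefault, "", NWORKERS, tids);
        int sum = 0;

        for (int i = 0; i < spawned; i++) {          /* send each worker its input */
            pvm_initsend(PvmDataDefault);
            pvm_pkint(&i, 1, 1);
            pvm_send(tids[i], 1 /* msgtag */);
        }
        for (int i = 0; i < spawned; i++) {          /* collect the results */
            int result;
            pvm_recv(-1 /* any task */, 2 /* msgtag */);
            pvm_upkint(&result, 1, 1);
            sum += result;
        }
        printf("sum = %d\n", sum);
        pvm_exit();
        return 0;
    }

    /* A matching worker would call pvm_parent(), pvm_recv(..., 1), compute,  */
    /* pack its result, pvm_send() it back with tag 2, and call pvm_exit().   */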

Page 40: Architecture of PVM on Unix

[Diagram: on each computer a user task linked with libpvm runs on top of the Unix kernel and talks to a local PVM server (daemon) over TCP connections; the PVM servers on different computers exchange UDP datagrams across the network.]

Page 41: Architecture of PVM on GENESIS

[Diagram: on each computer, user PVM parallel processes linked with libpvm run directly on the GENESIS microkernel; PVM communication goes through the IPC and Network Managers, with the Execution and Migration Managers and the Global Scheduler providing parallelism management – no separate PVM server process is required.]

Page 42: Distributed Shared Memory

DSM is an integral component of the operating system

Since DSM is a memory management function, the DSM system is integrated into the Space Manager
– Shared memory is used as though it were physically shared
– Easy-to-use shared memory
– Low overhead, improved performance

Two consistency models are supported:
– sequential – implemented using an invalidation model
– release – implemented using a write-update model

Synchronization and coordination of processes
– Semaphores – owned by the Space Manager on a particular computer
– Gaining ownership is distributed and mutually exclusive
– Barriers are used for coordination – their management is centralized

A DSM usage sketch follows.

Page 43: Distributed Shared Memory

[Diagram: user DSM parallel processes on each computer access a shared memory region through the DSM component integrated into the Space Manager; the IPC and Process Managers and the microkernel carry the consistency traffic across the network.]

Page 44: GENESIS Primitives: Execution

Two groups of primitives:
– to support execution services
– for the provision of communication and coordination services

Execution primitives:

  MP              PVM           DSM
  proc-ncreate()  pvm_spawn()   proc-ncreate()
  proc-exit()     pvm_exit()    proc-exit()

Page 45: GENESIS Primitives: Communication and Coordination

  MP         PVM             DSM
  send()     pvm_send()      read access
  recv()     pvm_recv()      write access
             pvm_pkbuf()     wait()
             pvm_unpkbuf()   signal()
  barrier()  pvm_barrier()   barrier()

A sketch combining these primitives follows.
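
Putting the execution and communication primitives together, the fragment below sketches a parent written against the MP column of the tables above. Spellings differ between slides (proc-ncreate() here, process_ncreate() on Page 47) and no argument lists are given in the talk, so every prototype is an assumption, and send()/recv() are renamed to avoid clashing with the C socket API; only the primitive names come from the tables.

    /* Sketch built from the MP primitive names above; prototypes are assumed. */
    #define GROUP_CREATE 1                      /* assumed flag, as on Page 47 */
    extern int  process_ncreate(int form, int n, const char *image);
    extern int  recv_msg(int pid, void *buf, int len);   /* stands in for recv()      */
    extern void barrier(int id);                          /* table: barrier()          */
    extern void proc_exit(int code);                      /* table: proc-exit()        */

    void parent(int n)
    {
        process_ncreate(GROUP_CREATE, n, "child_prog");     /* instantiate n children  */
        for (int i = 0; i < n; i++) {
            int ack;
            recv_msg(-1 /* any child */, &ack, sizeof ack); /* gather acknowledgements */
        }
        barrier(0);                                         /* synchronize with children */
        proc_exit(0);
    }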

Page 46: Easy to Use and Program Environment

The GENESIS system
– Provides an efficient and transparent environment for the execution of parallel applications
– Offers transparency
– Relieves programmers from activities such as:
  - selection of computers for a virtual machine for the given application
  - setting up a virtual machine
  - mapping processes to the virtual machine
  - process instantiation using process creation and duplication supported by process migration
  - load balancing

Page 47: Easy to Use and Program Environment

In the GENESIS system
– The location of the remote computer(s) of the cluster is selected automatically by the Global Scheduler
– Users do not know process locations
– Programming of parallel applications has been made easy by providing
  - message passing: standard and PVM
  - Distributed Shared Memory
  - powerful primitives that implement sequences of operations and provide transparency, e.g., process_ncreate(GROUP_CREATE, n, "child_prog")
  - process instantiation using process creation and duplication supported by process migration
  - load balancing

Page 48: Performance of Standard Parallel Applications

GENESIS system
– 13 Sun3/50 workstations: 12 computation + 1 file server
– 10 Mbit/s shared Ethernet

Influence of process instantiation on execution performance

GENESIS PVM vs. Unix PVM

Standard parallel applications
– Successive Over-Relaxation
– Quicksort
– Traveling Salesman Problem

Page 49: Influence of Process Instantiation on Execution Performance
Parallel simulation (5, 25 and 50 second workloads)

Simulation – the amount of work relates to the overall execution time

Two parameters:
– workload (5, 25, 50 seconds)
– number of workstations (1..12)

Global scheduler and migration

Speed-ups for #comp = #proc

[Plots: speed-up vs. number of workstations (1 to 12) for Ideal, Group, Multi and Single process instantiation, one plot per workload.]

Page 50: GENESIS PVM vs. Unix PVM: IPC Latency

Support for IPC provided by the PVM server in Unix was substituted with GENESIS operating system mechanisms

To measure the time saved by removing the server, a simple PVM application that exchanges messages (1 Kbyte to 100 Kbytes) was used

Round-trip time (including data packing and unpacking) was measured

[Plot: round-trip time (ms) vs. message size (1 to 100 KBytes) for GENESIS PVM and Unix PVM (default route, no encoding).]

Page 51: GENESIS PVM vs. Unix PVM: Speedup

The application used to study the influence of process instantiation (where the amount of work relates to the overall execution time) was studied

Parameters:
– number of workstations
– GENESIS with and without load balancing

[Plot: maximum speed-up achieved vs. number of workstations (1 to 12) for Optimal, GENESIS PVM with load balancing, GENESIS PVM without load balancing, and Unix PVM.]

Page 52: Successive Over-Relaxation

Parallel applications were developed based on the algorithms of Rice University

Rice used superior cluster hardware: DECstation 5000/240 with a fast ATM network

For 8 computers – array size: Rice 512 x 2048 elements with 101 iterations; GENESIS 128 x 128 elements with 10 iterations
– DSM speed-up: TreadMarks 6.3; GENESIS 4.4
– PVM speed-up: Rice 6.91; GENESIS 5.1

[Plot: speed-up vs. number of workstations (1 to 12) for Ideal, MP, PVM and DSM on GENESIS.]

Page 53: Quicksort

Parallel applications were developed based on the algorithms of Rice University

Rice used superior cluster hardware: DECstation 5000/240 with a fast ATM network

For 8 computers – array size: Rice 256 x 1024 integers; GENESIS 256 x 256 integers
– DSM speed-up: TreadMarks 5.3; GENESIS 2.5
– PVM speed-up: Rice 6.79; GENESIS 6.07

[Plot: speed-up vs. number of workstations (1 to 12) for Ideal, MP, PVM and DSM on GENESIS.]

Page 54: Traveling Salesman Problem

Parallel applications were developed based on the algorithms of Rice University

Rice used superior cluster hardware: DECstation 5000/240 with a fast ATM network

For 8 computers – an 18-city tour with the minimum threshold set to 13 cities
– DSM speed-up: TreadMarks 4.74; GENESIS 6.33
– PVM speed-up: Rice 5.63; GENESIS 5.94

[Plot: speed-up vs. number of workstations (1 to 12) for Ideal, MP, PVM and DSM on GENESIS.]

Page 55: Summary

Non-dedicated clusters are commonly available, but they
– force application developers to program operating system operations
– do not offer transparency

Application developers need a computer system that
– processes applications efficiently
– uses cluster resources well
– allows the cluster to be seen as a single powerful computer rather than as a set of connected computers

Proposal: employ a cluster operating system

Design: a cluster operating system with three logical levels
– distributed operating system
– parallelism management and transparency system
– programming environment

Page 56: Summary

GENESIS was designed and developed as a "proof of concept"

GENESIS is a system that satisfies user requirements

The GENESIS approach is unique
– Offers both message passing (MP and PVM) and DSM environments
– Services providing parallelism management are integral components of an operating system
– Provides a comprehensive environment to transparently manage system resources

Programmers do not have to be involved in parallelism management

Use of the cluster has been made easy

Complete transparency is offered

Good performance results have been achieved

Page 57: Future Work

Port GENESIS to an Intel-like platform

Use virtual memory to support DSM

Offer reliable parallel computing services on clusters by employing
– reliable group communication
– checkpointing to offer fault tolerance