Introduction to MPI Continued


Page 1

Introduction to MPI Continued

Page 2

Remainder of the Course

1. Why bother with HPC

2. What is MPI

3. Point to point communication (today)

4. User-defined datatypes / Writing parallel code / How to use a super-computer (today)

5. Collective communication

6. Communicators

7. Process topologies

8. File I/O and parallel profiling

9. Hadoop/Spark

10. More Hadoop / Spark

11. Alternatives to MPI

12. The largest computations in history / general interest / exam prep

Page 3

Last Time

• Introduced distributed parallelism

• Introduced MPI

• General setup for MPI-based programming

Very brief, I hope you have enjoyed the not-break-break

Page 4

Today

• Deriving MPI

• Blocking / Non-blocking send and receive

• Datatypes in MPI

• How to use a super-computer

Get the basics right and everything else should fall into place

Page 5

Deriving MPI

Page 6

Advantages of Message Passing

• Why would we want to use a message passing model?
  • And sometimes you don’t (Map/Reduce, ZeroMQ etc.)

• Universality

• Expressivity

• Ease of Debugging

• Performance

Page 7

Advantages of Message Passing

• Why would we want to use a message passing model?
  • And sometimes you don’t (Map/Reduce, ZeroMQ etc.)

• Universality
  • No special hardware requirements
  • Works with special hardware

• Expressivity
  • Message passing is a complete model for parallel algorithms
  • Useful for automatic load balancing

• Ease of Debugging
  • Still difficult
  • Thinking w.r.t. messages is fairly intuitive

• Performance
  • Associates data with a processor
  • Allows the compiler and cache manager to work well together

Page 8

Introduction to MPI (for real)

• Simple goals
  • Portability
  • Efficiency
  • Functionality

• So we want a message passing model
  • Each process has a separate address space

• A message is one process copying some of its address space to another
  • Send
  • Receive

Page 9

Minimal MPI

• What does the sender send?
  • Data: starting address + length (bytes)
  • Destination: a destination address (an int will do)

• What does the receiver receive?
  • Data: starting address + length (bytes)
  • Source: the source address (filled in when received)

Page 10

Minimal MPI

• So we can send and receive messages

• Might be enough for some applications but there’s something missing

Message selection

• Currently all processes receive all messages

• If we add a tag field, processes will be able to ignore messages not intended for them

Page 11

Minimal MPI

• Our model now becomes

• Send(address, length, destination, tag)

• Receive(address, length, source, tag, actual length)

• We can make the source and tag arguments wildcards to go back to our original model

• This is a complete model for Message-Passing HPC

• Most exotic MPI functions are built by combining these two

Page 12

Minimal MPI – Problems

• There are still some issues that MPI solves

1. Describing message buffers

2. Separating families of messages

3. Naming processes

4. Communicators

Page 13

MPI – Describing Buffers

• (address, length) is not sufficient, for two main reasons

• Assumes data is contiguous
  • Often not the case
  • E.g. sending the row of a matrix stored column-wise

• Assumes the data representation is always known
  • Does not handle heterogeneous clusters
  • E.g. CPU + GPU machines

• MPI’s solution
  • MPI datatypes → abstract one layer up → allow users to specify their own layouts
  • (address, length, datatype)

Page 14

MPI – Separating families of messages

• Consider using a library written with MPI
  • It will have its own naming of tags and such
  • Your code may interact with this

• MPI’s solution
  • Contexts → think of these as super-tags
  • Provides one more layer of separation between codes running in one application

Page 15

MPI – Naming Processes

• Processes belong to groups

• A rank is associated with each process in a group

• Using an int is actually sufficient in this case

Page 16

MPI – Communicators

• Combines contexts and groups into a single structure

• Destination and source ranks are specified relative to a communicator

Page 17

MPI_Send(start, count, datatype, dest, tag, comm)

• Message buffer described by
  • Start
  • Count
  • Datatype

• Target process given by
  • Dest
  • Comm

• Tag can be used to create different ‘types’ of messages

Page 18

MPI_Recv(start, count, datatype, source, tag, comm, status)

• Waits until a matching (source, tag) message is available

• Reads into the buffer
  • Start
  • Count
  • Datatype

• Target process specified by
  • Source
  • Comm

• Status contains more information

• Receiving fewer than count occurrences of datatype is okay, more is an error

Page 19

MPI – Other Interesting Features

• Collective communication
  • Get all of your friends involved – light up the group chat

• Two flavours
  • Data movement – e.g. broadcast
  • Collective computation – min, max, average, logical OR etc.

Page 20

MPI – Other Interesting Features

• Virtual topologies
  • Allow graph and grid connections to be imposed on processes
  • ‘Send to my neighbours’

• Debugging and profiling help
  • MPI requires ‘hooks’ to be available for debugging implementations

• Communication modes
  • Blocking vs. non-blocking

• Support for libraries
  • Communicators allow libraries to exist in their own space

• Support for heterogeneous networks
  • MPI_Send/Recv are implementation independent

Page 21

MPI – Other Interesting Features

• Processes vs. processors
  • A process is a software concept
  • A processor is a rock we tricked into thinking

• Some implementations limit one process per processor

Page 22

Is MPI Large?

• There are many functions available and many strange idiosyncrasies

• The core is rather tight however

• A full MPI specification (fundamentally) requires
  • Init
  • Comm_rank
  • Comm_size
  • Send
  • Recv
  • Finalize
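To make that concrete, here is a minimal sketch of a complete MPI program using four of these six calls (Send/Recv were shown earlier); the greeting text is illustrative:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    int myrank, numprocs;

    MPI_Init(&argc, &argv);                     /* start up MPI */
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);   /* how many processes are there? */
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);     /* which one am I? */

    printf("Hello from process %d of %d\n", myrank, numprocs);

    MPI_Finalize();                             /* shut down MPI */
    return 0;
}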

Page 23

Point-to-Point Communication

Page 24

What’s the point?

• The fundamental mechanism in MPI is the transmission of data between a pair of processes
  • One sender
  • One receiver

• Almost all other MPI constructs are short-hand versions of tasks you could achieve with point-to-point methods

• We will linger on this topic a little longer than may first seem necessary
  • Many idiosyncrasies in MPI come from how point-to-point communication is achieved in-code

Page 25

What’s the point?

• Remember
  • Rank → ID of each process in a communicator
  • Communicator → collection of processes
  • MPI_COMM_WORLD → the communicator for all processes

Page 26

Example: Knock-Knock

What we want to do:

• Find our rank

• If process 0
  • Send “Knock, knock”

• If process 1
  • Receive a string from process 0

• Otherwise
  • Do nothing

Page 27

Example: Knock-Knock

char msg[20];
int myrank, tag = 99;
MPI_Status status;
...
MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
if (myrank == 0) {
    strcpy(msg, "Knock knock");
    MPI_Send(msg, strlen(msg)+1, MPI_CHAR, 1, tag, MPI_COMM_WORLD);
} else if (myrank == 1) {
    MPI_Recv(msg, 20, MPI_CHAR, 0, tag, MPI_COMM_WORLD, &status);
}

This code sends a single string from process 0 to process 1. Some things we’ll discuss in detail:
• MPI_Send
• MPI_Recv
• Tags

Page 28

MPI_Send

1. msg (in)
  • The buffer (location in memory) to send from

2. strlen(msg)+1 (in)
  • The number of items to send
  • +1 to include the null byte ‘\0’ → only relevant when sending strings

3. MPI_CHAR (in)
  • The MPI datatype (more on this later); indicates the size of each element in your buffer

if (myrank == 0) {
    strcpy(msg, "Knock knock");
    MPI_Send(msg, strlen(msg)+1, MPI_CHAR, 1, tag, MPI_COMM_WORLD);

Page 29

MPI_Send

4. 1 (in)
  • The rank of the destination process

5. tag (in)
  • The ‘topic’ of the message (it will only be received if process 1 Recv’s on tag 99)

6. MPI_COMM_WORLD (in)
  • The communicator we are sending through
  • Each communicator (with at least two processes) has a rank 0 process and a rank 1 process

if (myrank == 0) {
    strcpy(msg, "Knock knock");
    MPI_Send(msg, strlen(msg)+1, MPI_CHAR, 1, tag, MPI_COMM_WORLD);

Page 30

MPI_Recv

1. msg (out)
  • The buffer to receive into

2. 20 (in)
  • The maximum number of elements we want

3. MPI_CHAR (in)
  • The size of each element

} else if (myrank == 1) {
    MPI_Recv(msg, 20, MPI_CHAR, 0, tag, MPI_COMM_WORLD, &status);
}

Page 31

MPI_Recv

4. 0 (in)
  • The process we want to receive from

5. tag (in)
  • The ‘topic’ we want to receive on (more on this later)

6. MPI_COMM_WORLD (in)
  • The communicator we are communicating on

7. &status (out)
  • Filled in with information about the received message (e.g. the actual source and tag, and how many elements were received)

} else if (myrank == 1) {
    MPI_Recv(msg, 20, MPI_CHAR, 0, tag, MPI_COMM_WORLD, &status);
}

Page 32

Tags

• A rather good idea at the time

• Rarely used in practice

• Allow processes to provide ‘topics’ for communication
  • E.g. ‘42’ refers to all communication for a particular sub-task etc.

• MPI_ANY_TAG renders specifying tags useless

Generally, we write our own code

and we assume we know what we’re doing.

Page 33

MPI Datatypes

• Back to the MPI_CHAR

• Generally, data in a message (sent or received) is described as a triple
  • Address
  • Count
  • Datatype

Page 34

MPI Datatypes

• Back to the MPI_CHAR
• Generally, data in a message (sent or received) is described as a triple
  • Address
  • Count
  • Datatype

• An MPI datatype can be
  • Predefined, corresponding to a language primitive (MPI_INT, MPI_CHAR)
  • A contiguous array of MPI datatypes
  • A strided block of datatypes
  • An indexed array of blocks of datatypes
  • An arbitrary structure of datatypes

• There are MPI functions to construct custom datatypes – (int, float) tuples, for example

Page 35

MPI Datatypes

• Using MPI datatypes specifies messages as data-points, not bytes
  • Machine independent
  • Implementation independent
  • Portable between machines
  • Portable between languages

We have some more information

Page 36

Some really, really important points

• Communication requires cooperation; you need to know:
  • Who you are sending to / receiving from
  • What you are sending/receiving
  • When you want to send/receive

• Very specific, requires careful reasoning about algorithms

• All nodes (in general) will run the same executable

Page 37

Some really, really important points

• Communication requires cooperation; you need to know:
  • Who you are sending to / receiving from
  • What you are sending/receiving
  • When you want to send/receive

• Very specific, requires careful reasoning about algorithms

• All nodes (in general) will run the same executable
  • Very different style of programming
  • The ‘root’ (usually rank 0) may have very different tasks to all other nodes
  • Rank becomes very important for dividing up the bounds of a problem

Page 38

Example 2: Knock-knock, who’s there

What we want to do

• Find our rank

• If process 0
  • Send ‘knock knock’
  • Receive from process 1

• If process 1
  • Send ‘who’s there’
  • Receive from process 0

• Else
  • Do nothing

Page 39

Example 2: Knock-knock, who’s there

char msg[20];
int myrank, tag = 99;
MPI_Status status;
...
MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
if (myrank == 0) {
    strcpy(msg, "Knock knock");
    MPI_Send(msg, strlen(msg)+1, MPI_CHAR, 1, tag, MPI_COMM_WORLD);
    MPI_Recv(msg, 20, MPI_CHAR, 1, tag, MPI_COMM_WORLD, &status);
}

Page 40

Example 2: Knock-knock, who’s there

char msg[20];
int myrank, tag = 99;
MPI_Status status;
...
MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
if (myrank == 0) {
    strcpy(msg, "Knock knock");
    MPI_Send(msg, strlen(msg)+1, MPI_CHAR, 1, tag, MPI_COMM_WORLD);
    MPI_Recv(msg, 20, MPI_CHAR, 1, tag, MPI_COMM_WORLD, &status);
} else if (myrank == 1) {
    strcpy(msg, "Who's there?");
    MPI_Send(msg, strlen(msg)+1, MPI_CHAR, 0, tag, MPI_COMM_WORLD);
    MPI_Recv(msg, 20, MPI_CHAR, 0, tag, MPI_COMM_WORLD, &status);
}

Page 41

Example 2: Knock-knock, who’s there

char msg[20];
int myrank, tag = 99;
MPI_Status status;
...
MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
if (myrank == 0) {
    strcpy(msg, "Knock knock");
    MPI_Send(msg, strlen(msg)+1, MPI_CHAR, 1, tag, MPI_COMM_WORLD);
    MPI_Recv(msg, 20, MPI_CHAR, 1, tag, MPI_COMM_WORLD, &status);
} else if (myrank == 1) {
    strcpy(msg, "Who's there?");
    MPI_Send(msg, strlen(msg)+1, MPI_CHAR, 0, tag, MPI_COMM_WORLD);
    MPI_Recv(msg, 20, MPI_CHAR, 0, tag, MPI_COMM_WORLD, &status);
}

There may be a problem here

Page 42

Blocking vs. Non-blocking

• Depending on the implementation you use, this may cause a deadlock
  • If you have enough buffer space it might be okay (but don’t rely on this)

• We’ve been using the blocking send/receive functions
  • Halt execution until completed

• There exist non-blocking versions of send/recv
  • MPI_Isend – same arguments, plus an MPI_Request handle
  • MPI_Irecv – same arguments, but replace MPI_Status with MPI_Request
  • Return immediately and continue with computation

Page 43

When to use Non-blocking

• Should only be used where performance improves
  • E.g. sending a large amount of data when a large amount of compute is also available

• Using non-blocking communication will parallelise a little more

• To check for a communication’s success, we need to use
  • MPI_Wait()
  • MPI_Test()

• An alternate interpretation
  • MPI_Send/Recv is just MPI_Isend/Irecv + MPI_Wait()

Page 44

Example 3: Knock-Who’s Knock-There

int main(int argc, char *argv[]) {
    int myrank, numprocs, tag = 99;
    char msg[20], msg2[20];
    MPI_Request request;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    //… Next slide
}

Page 45

Example 3: Knock-Who’s Knock-There

if (myrank == 0) {
    strcpy(msg, "Knock knock");
    MPI_Irecv(msg2, 20, MPI_CHAR, 1, MPI_ANY_TAG, MPI_COMM_WORLD, &request);
    MPI_Send(msg, strlen(msg)+1, MPI_CHAR, 1, tag, MPI_COMM_WORLD); /* MPI_ANY_TAG is only valid for receives */
    MPI_Wait(&request, &status);
} else if (myrank == 1) {
    strcpy(msg, "Who's there?");
    MPI_Irecv(msg2, 20, MPI_CHAR, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &request);
    MPI_Send(msg, strlen(msg)+1, MPI_CHAR, 0, tag, MPI_COMM_WORLD);
    MPI_Wait(&request, &status);
}
MPI_Finalize();
return 0;

Page 46

On Multi-threading + MPI

• MPI does not specify the interaction of blocking communication calls and a thread scheduler

• A good implementation would block only the sending thread

• It is the user’s (your) responsibility to ensure other threads do not edit the communicating (possibly shared) buffer

Page 47

On Message Ordering

• Messages are non-overtaking

• The order a process sends messages is the order another process receives them

• The order multiple processes send messages in does not matter

[Figure: P0 sends M0¹ and then M0² to P2; P1 sends M1¹ to P2. Here Mpᵏ denotes the k-th message sent by process Pp.]

Can be received at P2 as:
• M1¹, M0¹, M0²
• M0¹, M1¹, M0²
• M0¹, M0², M1¹

But not:
• M1¹, M0², M0¹
• M0², M1¹, M0¹
• M0², M0¹, M1¹

Page 48

On Message Ordering

• Another important note: ordering is not transitive
  • Sounds goofy, but it is easy to make this mistake

• Be careful when using MPI_ANY_SOURCE

P0: Send(P1); Send(P2)
P1: Recv(P0); Send(P2)
P2: Recv(P?); Recv(P?)

Page 49

On Message Ordering

• One goal of MPI is to encourage deterministic communication patterns

• Using exact addresses, exact buffer sizes, enforced ordering etc.

• Makes code predictable

• Sources of non-determinism
  • MPI_ANY_SOURCE as the source argument
  • MPI_CANCEL()
  • MPI_WAITANY()
  • Threading

Page 50

Extended Example : Computing Pi (*groans)

• Your favourite example is back again

• This example is ‘perfect’ computationally
  • Automatic load balancing → all processes do as much work as possible
  • Verifiable answer
  • Minimal communication

• This time, we use numerical integration

$$\int_0^1 \frac{1}{1+x^2}\,dx = \big[\arctan x\big]_0^1 = \arctan 1 - \arctan 0 = \frac{\pi}{4}$$

Page 51

Extended Example – Computing PI

• So we integrate the function $f(x) = \frac{4}{1+x^2}$

• Our approach is very simple
  • Divide $[0, 1]$ into $n$ intervals
  • Each interval forms a rectangle of height $f(x_i)$ and width $\frac{1}{n}$
  • Add up all the rectangles to get a (not very good) approximation of the integral

• This gives us an approximation to $\pi$

Page 52

Extended Example – Computing PI

• Our parallelism will also be quite simple
  • One process (the root, rank 0) will obtain n from the user and broadcast this value to all others
  • All other processes will determine how many points they each compute
  • All other processes will compute their sub-approximations
  • All other processes will send back their approximations
  • The root will display the final result

Page 53

Code Time
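The in-lecture code is in the git repo linked later; a minimal sketch of the scheme just described might look like this. Note the sketch uses MPI_Bcast and MPI_Reduce for brevity (collectives are covered in a later lecture; the lecture version may use explicit sends instead), and the fixed n is an assumption in place of reading it from the user:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    int myrank, numprocs, n = 0;
    double h, x, sum = 0.0, mypi, pi;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

    if (myrank == 0)
        n = 100000;   /* number of rectangles; could be obtained from the user */
    MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);   /* root broadcasts n to all */

    h = 1.0 / (double)n;
    for (int i = myrank; i < n; i += numprocs) {    /* each process takes every numprocs-th strip */
        x = h * ((double)i + 0.5);                  /* midpoint of strip i */
        sum += 4.0 / (1.0 + x * x);
    }
    mypi = h * sum;

    MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);  /* combine partial sums at the root */
    if (myrank == 0)
        printf("pi is approximately %.16f\n", pi);

    MPI_Finalize();
    return 0;
}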

Page 54

Extended Example 2 – Matrix-Vector Multiplication

• The previous example does not have any ‘actual’ message passing

• We introduce one of the most common structures for a parallel program
  • Self-scheduling
  • Formerly ‘Master-slave’
  • I will use ‘Master-node’

• Matrix-vector multiplication is a good example, but there are many scenarios where this approach is applicable
  • The nodes do not need to communicate with each other
  • The amount of work for each node is difficult to predict

Page 55

Matrix-Vector Multiplication

[Figure: the matrix-vector product c = A b]

Page 56

Matrix-Vector Multiplication

• All processes receive a copy of the vector b

• Each unit of work is a dot-product of one row of matrix A

• The root sends rows to each of the nodes

• When the result is sent back, another row is sent if available

• After all rows are processed, termination messages are sent (see the sketch below)
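A minimal sketch of that master/worker loop, assuming an N × N matrix A and a vector b that every process already holds, and using the message tag to carry the row index. The names N, A, b and STOP_TAG are illustrative, not from the lecture code:

#define N 8            /* illustrative matrix size */
#define STOP_TAG 0     /* tag 0 signals "no more work"; tags 1..N carry row index + 1 */

MPI_Status status;
if (myrank == 0) {                                /* master */
    double c[N];
    int rows_sent = 0;
    for (int k = 1; k < numprocs; ++k) {          /* prime each worker with one row */
        MPI_Send(A[rows_sent], N, MPI_DOUBLE, k, rows_sent + 1, MPI_COMM_WORLD);
        rows_sent++;                              /* sketch assumes numprocs - 1 <= N */
    }
    for (int i = 0; i < N; ++i) {                 /* one reply arrives per row */
        double result;
        MPI_Recv(&result, 1, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG,
                 MPI_COMM_WORLD, &status);
        c[status.MPI_TAG - 1] = result;           /* the tag tells us which row this is */
        if (rows_sent < N) {                      /* more work: hand the sender the next row */
            MPI_Send(A[rows_sent], N, MPI_DOUBLE, status.MPI_SOURCE,
                     rows_sent + 1, MPI_COMM_WORLD);
            rows_sent++;
        } else {                                  /* no rows left: terminate the sender */
            MPI_Send(NULL, 0, MPI_DOUBLE, status.MPI_SOURCE,
                     STOP_TAG, MPI_COMM_WORLD);
        }
    }
} else {                                          /* worker: b already held locally */
    double row[N], dot;
    while (1) {
        MPI_Recv(row, N, MPI_DOUBLE, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
        if (status.MPI_TAG == STOP_TAG) break;    /* termination message */
        dot = 0.0;
        for (int j = 0; j < N; ++j) dot += row[j] * b[j];
        MPI_Send(&dot, 1, MPI_DOUBLE, 0, status.MPI_TAG, MPI_COMM_WORLD);
    }
}

Because each reply triggers the next row, faster workers naturally take on more rows; this is what gives the pattern its automatic load balancing.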

Page 57

Code Time II

There’s a git repo with all in-lecture code examples

https://github.com/pritchardn/teachingCITS3402

Page 58

Studying Parallel Performance

Page 59

Studying Parallel Performance

• We can do a little better than timing programs

• Goal is to estimate• Computation

• Communication

• Scaling w.r.t. problem size

Page 60

Studying Parallel Performance

• We can do a little better than timing programs

• Goal is to estimate• Computation

• Communication

• Scaling w.r.t. problem size

• Consider matrix-vector multiplication
  • Square, dense matrix, $n \times n$
  • Each element of c requires $n$ multiplications and $n - 1$ additions
  • There are $n$ elements in c, so our FLOP requirement is

$$n(n + n - 1) = 2n^2 - n$$

Page 61

Studying Parallel Performance

• We also consider communication costs

• We assume all processes have the original vector already

• Need to send $n + 1$ values per row ($n$ values out and 1 back)

• $n$ times (for each row)

$$n(n + 1) = n^2 + n$$

Page 62

Studying Parallel Performance

• The ratio of communication to computation is

$$\frac{n^2 + n}{2n^2 - n} \times \frac{T_{comm}}{T_{calc}}$$

• Computation is usually cheaper than communication
  • We try to minimise this ratio

• Often making the problem larger makes communication overhead insignificant
  • Here, this is not the case
  • For large $n$ the ratio approaches $\frac{1}{2}$
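To make the claim concrete: at $n = 1000$ the ratio is $\frac{n^2+n}{2n^2-n} = \frac{1001000}{1999000} \approx 0.50$, essentially already at the limiting value of $\frac{1}{2}$, so growing the problem buys us nothing here.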

Page 63

Studying Parallel Performance

• We could easily adapt our approach for matrix-matrix multiplication

• Instead of a vector b we have another square matrix B

• Each round sees a vector sent back instead of a single value

Page 64

Studying Parallel Performance

• Computation requirements
  • The operations for each element of C are $n$ multiplications and $n - 1$ additions
  • Now there are $n^2$ elements to compute

$$n^2(2n - 1) = 2n^3 - n^2$$

• Communication requirements
  • $n$ (to send each row) + $n$ (to send a row back), and there are $n$ rows

$$n \times 2n = 2n^2$$

• Comm/calc ratio

$$\frac{2n^2}{2n^3 - n^2} \times \frac{T_{comm}}{T_{calc}}$$

• Which scales as $\frac{1}{n}$ for large $n$
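By contrast with the matrix-vector case: at $n = 1000$ this ratio is $\frac{2n^2}{2n^3 - n^2} = \frac{2}{2n - 1} \approx 0.001$, so for matrix-matrix multiplication a larger problem really does make the communication overhead insignificant.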

Page 65

User Defined Datatypes / Writing Parallel Code

How to use a super-computer

Page 66

Today

• Introduction to User-Defined Datatypes
  • Datatype constructors
  • Use of derived datatypes
  • Addressing
  • Packing and Unpacking

• Writing parallel code
  • Some guidance on getting started
  • Reasoning with parallel algorithms

Page 67

User Defined Datatypes

• We’ve seen MPI communicate a sequence of identical elements, contiguous in memory

• We sometimes want to send other things
  • Structures containing setup parameters
  • Non-contiguous array subsections

• We’d like to send this data with a single communication call rather than several smaller ones
  • Communication speed is network bound

• MPI provides two mechanisms to help

We’ll look at how this works (relatively) briefly; most applications can be handled with primitive communication

Page 68

User Defined Datatypes

• Derived datatypes
  • Specifying your own data layouts
  • Useful for sending structs, for instance

• Data-packing
  • An extra routine before/after sending/receiving to compress non-contiguous data into a dense block

• Often, we can achieve the same data transfer with either method
  • Obviously, there are pros and cons to either

Page 69

Derived Datatypes

• All MPI communication functions take a datatype argument

• MPI provides functions to construct ‘types’ of our own (only relevant to MPI of course) to provide to these functions

• They describe layouts of memory of complex-data structures or non-contiguous memory

Page 70

Derived Datatypes

• Constructed from basic datatypes

• The writers of MPI recursively defined datatypes – allowing us to use them for our own nefarious tasks

• A derived datatype is an opaque object (we can’t edit it after construction) specifying two things
  • A sequence of primitive datatypes – the type signature
  • A sequence of integer (byte) displacements – the type map

Page 71

Type Signature

• A list of datatypes

• E.g. Typesig = {MPI_CHAR, MPI_INT}

• This describes some datatype which has one or more MPI_CHARs followed by one or more MPI_INTs

Page 72

Type map

• A list of pairs

• The first elements are your type signature

• The second elements are the displacement in memory (in bytes) from the first location

• E.g. MPI_INT has the type map {(int, 0)} – a single integer beginning at the start

• Type maps define the size of the buffer you are sending

This will make more sense once we introduce datatype-constructors

Page 73

Some utility functions (for reference)

• MPI_TYPE_EXTENT(datatype, extent)
  • datatype IN datatype (e.g. MPI_CHAR, MPI_MYTHINGY)
  • extent OUT the extent of the datatype, i.e. the span it occupies in memory, including any alignment padding

• MPI_TYPE_SIZE(datatype, size)
  • datatype IN datatype
  • size OUT the total size (in bytes) of the entries in the type signature

Page 74

Some utility functions (for reference)

• MPI_TYPE_COMMIT(datatype)
  • ‘Compiles’ or flattens the datatype into an internal representation for MPI to use
  • Must be done before using a derived datatype

Importantly, all processes in a parallel program must have the same datatypes committed

Page 75

MPI_TYPE_CONTIGUOUS(count, oldtype, newtype)

• count IN replication count

• oldtype IN the old datatype

• newtype OUT the new datatype

• Constructs a typemap of count copies of oldtype in contiguous memory

• Example, oldtype = {(double, 0), (char, 8)}

• MPI_TYPE_CONTIGUOUS(3, oldtype, newtype) yields the datatype
  • {(double, 0), (char, 8), (double, 16), (char, 24), (double, 32), (char, 40)}

Page 76

Constructors – Contiguous

• MPI_TYPE_CONTIGUOUS(3, oldtype, newtype) yields the datatype
  • {(double, 0), (char, 8), (double, 16), (char, 24), (double, 32), (char, 40)}

[Figure: three copies of oldtype laid out end-to-end (count 3) to form newtype]
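As a quick illustration of the constructor in use, a sketch that bundles three doubles into one unit (the buffer coords, and dest and tag, are assumed to be defined elsewhere):

MPI_Datatype vec3;
MPI_Type_contiguous(3, MPI_DOUBLE, &vec3);   /* three doubles as one element */
MPI_Type_commit(&vec3);                      /* must commit before use */
MPI_Send(coords, 10, vec3, dest, tag, MPI_COMM_WORLD);  /* 10 triples = 30 doubles, one call */
MPI_Type_free(&vec3);                        /* release the type when finished */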

Page 77

MPI_TYPE_INDEXED(count, array_of_blocklengths, array_of_displacements, oldtype, newtype)

• Allows one to specify a non-contiguous data layout

• The displacements of each block can differ (and can be repeated)

• count IN the number of blocks

• array_of_blocklengths IN # of elements for each block

• array_of_displacements IN displacement for each block (measured in number of elements)

• oldtype IN the old datatype

• newtype OUT the new datatype

Page 78

MPI_TYPE_INDEXED(count, array_of_blocklengths, array_of_displacements, oldtype, newtype)

• E.g. oldtype = {(double, 0), (char, 8)}
  • Blocklengths B = (3, 1)
  • Displacements D = (4, 0)

• MPI_TYPE_INDEXED(2, B, D, oldtype, newtype) yields
  • {(double, 64), (char, 72), (double, 80), (char, 88), (double, 96), (char, 104), (double, 0), (char, 8)}
  • (displacements are measured in multiples of oldtype’s extent, here assumed to be 16 bytes)

[Figure: a second illustration with count = 3, blocklengths = (2, 3, 1), displacements = (0, 3, 8)]

Page 79

Example – Upper Triangle of a Matrix

[0][0] [0][1] [0][2] …

[1][1] [1][2] …

[2][2] [2][3] …

[3][3] [3][4] …

Page 80

Example – Upper Triangle of a Matrix

double values[100][100];
int disp[100];
int blocklen[100];
MPI_Datatype upper;

/* find the start and length of each row's upper-triangular part */
for (int i = 0; i < 100; ++i) {
    disp[i] = 100 * i + i;   /* row i begins at element [i][i] */
    blocklen[i] = 100 - i;   /* row i contributes 100 - i elements */
}
/* Create datatype */
MPI_Type_indexed(100, blocklen, disp, MPI_DOUBLE, &upper);
MPI_Type_commit(&upper);
/* Send it */
MPI_Send(values, 1, upper, dest, tag, MPI_COMM_WORLD);

Page 81

Packing and Unpacking

• Based on how previous libraries achieved this

• In most cases one can avoid packing and unpacking in favour of derived datatypes
  • More portable
  • More descriptive
  • Generally simpler

• Sometimes, for simple use cases, it can be desirable

• This is where we make use of the MPI_PACKED datatype mentioned last time

Page 82

MPI_PACK(inbuf, incount, datatype, outbuf, outsize, position, comm)

• inbuf IN input buffer

• incount IN number of elements in the buffer

• datatype IN the type of each element

• outbuf OUT output buffer

• outsize IN output buffer size (bytes)

• position INOUT current position in buffer (bytes)

• comm IN communicator for packed message

Used by repeatedly calling MPI_PACK with changed inbuf and outbuf values

Page 83

MPI_UNPACK(inbuf, insize, position, outbuf, outcount, datatype, comm)

• inbuf IN input buffer

• insize IN size of input buffer (bytes)

• position INOUT current position (bytes)

• outbuf OUT output buffer

• outcount IN number of components to be unpacked

• datatype IN datatype of each output component

• comm IN communicator

The exact inverse of MPI_PACK. Used by repeatedly calling unpack, extracting each subsequent element
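Putting the two together, a sketch of packing an int and a double into one message and unpacking them on the other side (the buffer size, ranks and tag are illustrative):

int position = 0, a = 42;
double x = 3.14;
char packbuf[100];
MPI_Status status;

/* sender: pack, then send position bytes of MPI_PACKED data */
MPI_Pack(&a, 1, MPI_INT, packbuf, 100, &position, MPI_COMM_WORLD);
MPI_Pack(&x, 1, MPI_DOUBLE, packbuf, 100, &position, MPI_COMM_WORLD);
MPI_Send(packbuf, position, MPI_PACKED, 1, 99, MPI_COMM_WORLD);

/* receiver: receive, then unpack in the same order */
MPI_Recv(packbuf, 100, MPI_PACKED, 0, 99, MPI_COMM_WORLD, &status);
position = 0;
MPI_Unpack(packbuf, 100, &position, &a, 1, MPI_INT, MPI_COMM_WORLD);
MPI_Unpack(packbuf, 100, &position, &x, 1, MPI_DOUBLE, MPI_COMM_WORLD);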

Page 84

Finally

There are many other functions that are helpful.

Please have a look at the documentation and MPI spec

when you come across a bizarre communication requirement.

Page 85

How to write Parallel Code

Page 86

Why bother with this section

• Reasoning about parallel algorithms is bizarre relative to serial algorithms. Consequently, writing and reading parallel code can be painful if you are not prepared

• This is not helped by running code on a remote machine using a job-system.

This section is simply some guidance on where to start; everyone finds some things more useful than others.

Page 87

Writing parallel code – Quick tips

• Install MPI locally
  • Cannot be stressed enough
  • You can simulate running multiple nodes on a single machine with mpiexec/mpirun calls

• Write a serial version of your solution first
  • Tight and clean serial code is vastly easier to parallelise
  • Especially if you don’t know how to parallelise the code in the first place
  • I would not advise trying to do both simultaneously

Page 88

Writing parallel code – Quick tips

• Write a ‘dummy’ parallelised version of your code first
  • E.g. decomposing a matrix
  • Have each process compute its bounds first and print them out
  • Test your scheme for many different configurations and check you’re correct
  • Then start adding actual computation

• Read documentation
  • Most of the time, there will be a function to help make your life easier

• Test your parallel code in serial
  • A parallelised piece of code should still work if only one node is available

Page 89

Writing parallel code – Quick tips

• Use the sysadmins
  • If you end up using a commercial HPC system (e.g. Pawsey Supercomputing Centre), use the helpdesk – they want to help

• Invest in some tests
  • Writing a few small examples and making them easy to run will make testing changes easier

• Good printouts are invaluable
  • It helps to quickly find out where your code may be bugged and what values are changing

Page 90

Writing parallel code – Quick tips

• Make writing code easy for you
  • Many IDEs / editors (CLion, VSCode etc.) allow for a full remote mode. Write code directly on the supercomputer using your own editor

• Otherwise
  • Write code locally
  • Test locally
  • Commit with git
  • Pull the edits on the supercomputer and run

Write and run code – you’ll only improve with practice

Page 91

Common Errors Writing Parallel Code

• Expecting argc and argv to be passed to all processes

• Doing things before MPI_Init or after MPI_Finalize

• Matching MPI_Bcast with MPI_Recv

• Assuming your MPI implementation is thread-safe

Page 92

Summary

• Looked at point to point communication

• MPI_Send

• MPI_Recv

• MPI_Isend

• MPI_Irecv

• MPI_Wait

Bonus material: Tour-de-force of most useful point-to-point function descriptions (might be helpful later)

Quick reminder: https://www.mpi-forum.org/docs/→ Explanation of all function calls

Page 93

MPI Datatypes

Page 94

MPI Datatypes

• Since all data is given an MPI type, an MPI implementation can communicate between very different machines

• Specifying application-oriented data layout
  • Reduces memory-to-memory copies in the implementation
  • Allows the use of special hardware where available

Page 95

A list of MPI Datatypes (C)

MPI Datatype / C Datatype

MPI_CHAR signed char

MPI_SHORT signed short int

MPI_INT signed int

MPI_LONG signed long int

MPI_UNSIGNED_CHAR unsigned char

MPI_UNSIGNED_SHORT unsigned short int

MPI_UNSIGNED unsigned int

MPI_UNSIGNED_LONG unsigned long int

MPI_FLOAT float

MPI_DOUBLE double

MPI_LONG_DOUBLE long double

MPI_BYTE n/a

MPI_PACKED n/a

Page 96

MPI_BYTE / MPI_PACKED

• MPI_BYTE is precisely a byte (eight bits)

• Un-interpreted, and may be different to a character
  • Some machines may use two bytes for a character, for instance

• MPI_PACKED is a much more complicated beastie (we’ll get to it later)
  • For now, it’s just ‘any-data’
  • Used to send structs through MPI

Page 97

Point-to-Point-Pointers

Page 98

MPI_Send(start, count, datatype, dest, tag, comm)

• start (IN) initial address of send buffer

• count (IN) number of elements in buffer

• datatype (IN) datatype of each entry

• dest (IN) rank of destination

• tag (IN) message tag

• comm (IN) communicator

Performs a blocking send; returns an error code (int)

Page 99

MPI_Recv(start, count, datatype, source, tag, comm, status)

• start

• count

• datatype

• source

• tag

• comm

• status

Performs a blocking receive; returns an error flag. Status can be inspected for more information

Page 100

MPI_Sendrecv(sendbuf, sendcount, sendtype, dest, sendtag, recvbuf, recvcount, recvtype, source, recvtag, comm, status)

• sendbuf (IN) start of send buffer

• sendcount (IN) number of entries to send

• sendtype (IN) datatype of each entry

• dest (IN) rank of destination

• sendtag (IN) message tag

• recvbuf (OUT) start of receive buffer (must be a different buffer)

• recvcount (IN) number of entries to receive

• recvtype (IN) datatype of receive entries

• source (IN) rank of source

• recvtag(IN) message tag

• comm (IN) communicator

• status (OUT) return status

Performs a standard send and receive as if executed on two separate threads
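A sketch of how MPI_Sendrecv sidesteps the knock-knock deadlock from earlier: both ranks issue the same combined call and MPI pairs the send and receive internally (msg, msg2, tag and status as declared in the earlier examples):

int other = (myrank == 0) ? 1 : 0;   /* my partner's rank */
MPI_Sendrecv(msg,  strlen(msg)+1, MPI_CHAR, other, tag,
             msg2, 20,            MPI_CHAR, other, tag,
             MPI_COMM_WORLD, &status);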

Page 101

MPI_Isend(buf, count, datatype, dest, tag, comm, request)

• buf (IN) initial address of send buffer

• count (IN) number of entries to send

• datatype (IN) datatype of each entry

• dest (IN) rank of destination in comm

• tag (IN) message tag

• comm (IN) communicator

• request (OUT) request handle (MPI_Request)

Posts a standard nonblocking send

Page 102

MPI_Irecv(buf, count, datatype, source, tag, comm, request)

• buf (OUT) start of receive buffer

• count (IN) number of entries to receive

• datatype (IN) type of each entry

• source (IN) rank of source

• tag (IN) message tag

• comm (IN) communicator

• request (OUT) request handle

Posts a nonblocking receive

Page 103

MPI_Wait(request, status)

• request (INOUT) request handle

• status (OUT) status object

Returns when the operation in request is complete

Page 104

MPI_Test(request, flag, status)

• request (INOUT) request handle

• flag (OUT) true if operation completed

• status (OUT) status object

Sets flag to true if the operation identified by request is complete. This function allows single-threaded applications to schedule alternate tasks while waiting for communication to complete.
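For instance, a polling loop might look like this sketch (do_useful_work is a hypothetical placeholder for application work):

int flag = 0;
do {
    do_useful_work();                    /* hypothetical: overlap computation with communication */
    MPI_Test(&request, &flag, &status);  /* poll without blocking */
} while (!flag);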

Page 105

MPI_Waitall(count, requests, statuses)

• count (IN) list length

• requests (INOUT) array of request handles

• statuses (OUT) array of status objects

Blocks until all communications associated with the array of requests have resolved.
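A sketch of waiting on two outstanding receives at once (buffers, ranks and tag are illustrative):

MPI_Request reqs[2];
MPI_Status stats[2];
MPI_Irecv(bufA, 20, MPI_CHAR, 0, tag, MPI_COMM_WORLD, &reqs[0]);
MPI_Irecv(bufB, 20, MPI_CHAR, 1, tag, MPI_COMM_WORLD, &reqs[1]);
MPI_Waitall(2, reqs, stats);   /* blocks until both messages have arrived */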

Page 106

MPI_Testall(count, requests, flag, statuses)

• count (IN) list length

• requests (INOUT) array of request handles

• flag (OUT) true if all operations completed

• statuses (OUT) array of statuses

Tests for completion of all communications specified in the array of requests; flag is set to true if all have completed, false otherwise.

Page 107

MPI Constants

Page 108

Communicators

• MPI_ANY_TAG Matches a message with any tag

• MPI_ANY_SOURCE If passed when receiving, a message from any process is accepted

• MPI_COMM_NULL A return value occurring when processes are not in a given communicator

• MPI_COMM_SELF A communicator containing only the calling process

• MPI_COMM_WORLD Communicator containing all processes

Page 109

Error codes

• MPI_SUCCESS Status code of a successful call

• MPI_ERR Can be used to check if any error has occurred

• MPI_ERR_ARG Indicates an error with a passed argument

• MPI_ERR_COMM Indicates an invalid communicator

• MPI_ERR_IN_STATUS Indicates that the error code should be read from the status object

• MPI_ERR_ROOT Indicates an invalid root node argument

• MPI_ERR_TAG Indicates an invalid tag argument

• MPI_ERR_UNKNOWN An error code not matching anything known