12c.1
Collective Communication in MPI
UNC-Wilmington, C. Ferner, Nov 4, 2008
12c.2
Barrier
• A barrier is a way to synchronize all (or a subset) of the processors.
• When processors reach MPI_Barrier(), they block until all processors have reached the same barrier.
• All processors in the communicator must call the barrier function, or else you have a deadlock.
• Syntax: MPI_Barrier(MPI_COMM_WORLD);
12c.3
Barrier
• Example:
  MPI_Barrier(MPI_COMM_WORLD);
  if (mypid == 0) {
      gettimeofday(&tv1, NULL);
  }

  ... // Do some work

  MPI_Barrier(MPI_COMM_WORLD);
  if (mypid == 0) {
      gettimeofday(&tv2, NULL);
  }
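A complete, compilable version of this timing pattern might look as follows (a sketch: the "work" is a made-up placeholder loop, and the elapsed-time arithmetic is added for illustration):

  #include <stdio.h>
  #include <sys/time.h>
  #include <mpi.h>

  int main(int argc, char *argv[]) {
      int mypid;
      struct timeval tv1, tv2;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &mypid);

      /* Make sure every processor has started before the clock starts */
      MPI_Barrier(MPI_COMM_WORLD);
      if (mypid == 0) gettimeofday(&tv1, NULL);

      /* Do some work (placeholder computation) */
      double sum = 0.0;
      for (long i = 0; i < 10000000L; i++) sum += i * 0.5;

      /* Wait for the slowest processor before the clock stops */
      MPI_Barrier(MPI_COMM_WORLD);
      if (mypid == 0) {
          gettimeofday(&tv2, NULL);
          double elapsed = (tv2.tv_sec - tv1.tv_sec)
                         + (tv2.tv_usec - tv1.tv_usec) / 1000000.0;
          printf("Elapsed time: %f seconds (sum=%f)\n", elapsed, sum);
      }

      MPI_Finalize();
      return 0;
  }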
12c.4
Broadcast
• A broadcast is when one processor needs to send the same information to all (or a subset) of the other processors.
• Syntax: MPI_Bcast(buffer, count, datatype, root, MPI_COMM_WORLD)
• buffer, count, and datatype are the same as with MPI_Send()
• root is the rank of the process initiating the broadcast
12c.5
Broadcast
• Example:
  int N = ___;
  float b = ____;
  float a[N];

  MPI_Bcast(&N, 1, MPI_INT, 0, MPI_COMM_WORLD);
  MPI_Bcast(&b, 1, MPI_FLOAT, 0, MPI_COMM_WORLD);
  MPI_Bcast(a, N, MPI_FLOAT, 0, MPI_COMM_WORLD);
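As a sketch of how this fragment fits into a full program (the values 5 and 3.14 are made up for illustration; only the root sets them, and every rank has them after the broadcasts):

  #include <stdio.h>
  #include <stdlib.h>
  #include <mpi.h>

  int main(int argc, char *argv[]) {
      int mypid, N;
      float b, *a;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &mypid);

      if (mypid == 0) {      /* only the root knows the values initially */
          N = 5;             /* made-up example values */
          b = 3.14f;
      }

      /* Every rank (root and non-root) calls MPI_Bcast with the same arguments */
      MPI_Bcast(&N, 1, MPI_INT, 0, MPI_COMM_WORLD);
      MPI_Bcast(&b, 1, MPI_FLOAT, 0, MPI_COMM_WORLD);

      a = malloc(N * sizeof(float));   /* N is now known on every rank */
      if (mypid == 0)
          for (int i = 0; i < N; i++) a[i] = i * b;

      MPI_Bcast(a, N, MPI_FLOAT, 0, MPI_COMM_WORLD);

      printf("<pid %d>: N=%d b=%f a[N-1]=%f\n", mypid, N, b, a[N-1]);

      free(a);
      MPI_Finalize();
      return 0;
  }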
12c.6
Broadcast
• All processors participating in the broadcast (whether they are the source or a destination) must call the broadcast function with the same parameters, or else it won't work.
• The runtime of a broadcast is O(log(p)) instead of O(p), where p is the number of processors, as it would be if the root sent the data to each processor in turn (see the sketch below).
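To make the O(log(p)) structure concrete, here is a minimal sketch of a tree-structured broadcast built from point-to-point calls, assuming p is a power of two. This illustrates the idea behind the tree; it is not necessarily how MPI_Bcast is implemented by any particular library.

  #include <stdio.h>
  #include <mpi.h>

  /* Tree-structured broadcast of one int from rank 0, assuming the number of
     processes p is a power of two. Takes log2(p) communication rounds. */
  static void tree_bcast(int *value, int mypid, int p) {
      for (int step = p / 2; step >= 1; step /= 2) {
          if (mypid % (2 * step) == 0)
              /* This rank already has the value; pass it 'step' ranks to the right */
              MPI_Send(value, 1, MPI_INT, mypid + step, 0, MPI_COMM_WORLD);
          else if (mypid % (2 * step) == step)
              /* This rank receives the value in this round */
              MPI_Recv(value, 1, MPI_INT, mypid - step, 0,
                       MPI_COMM_WORLD, MPI_STATUS_IGNORE);
      }
  }

  int main(int argc, char *argv[]) {
      int mypid, p, value = 0;
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &mypid);
      MPI_Comm_size(MPI_COMM_WORLD, &p);
      if (mypid == 0) value = 42;      /* made-up value to broadcast */
      tree_bcast(&value, mypid, p);
      printf("<pid %d>: value = %d\n", mypid, value);
      MPI_Finalize();
      return 0;
  }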
12c.7
Broadcast
[Figure: tree-structured broadcast among eight processors. Step 1: 0 sends to 4; step 2: 0 and 4 send to 2 and 6; step 3: 0, 2, 4, 6 send to 1, 3, 5, 7. Legend distinguishes communication from non-communication steps.]
12c.8
Reduction
• A reduction is where an array of values is reduced to a single value by applying a binary (usually commutative) operator.
• Predefined reduction operators include:
  MPI_MAX   maximum
  MPI_MIN   minimum
  MPI_SUM   sum
  MPI_PROD  product
  MPI_LAND  logical and
  MPI_BAND  bit-wise and
  MPI_LOR   logical or
  MPI_BOR   bit-wise or
  MPI_LXOR  logical xor
12c.9
Reduction
[Figure: a sum reduction across eight processors P0–P7 holding 37, 16, 48, 2, 73, 44, 32, 11. The partial sums are combined pairwise over three communication steps (53, 50, 117, 43 after step 1; 103, 160 after step 2) until P0 holds the total, 263. Legend distinguishes communication from non-communication steps.]
12c.10
Reduction
• Syntax:
MPI_Reduce(sendbuf, recvbuf, count, MPI_Datatype, MPI_Op, root, MPI_Comm)
• sendbuf, count, datatype, and MPI_Comm are the same as with MPI_Send() and MPI_Bcast()
• root is the rank of the process which will possess the final value
• MPI_Op is one of the constants on the previous slide
12c.11
Reduction
• Example:
  int x, y;
  // Each processor has a different value for x
  MPI_Reduce(&x, &y, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
• The root process (0) has the sum of all x's in the variable y
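A minimal complete sketch of this example, using each rank's own id as its value of x so the result is easy to check (the expected answer is the sum of the ranks 0..p-1):

  #include <stdio.h>
  #include <mpi.h>

  int main(int argc, char *argv[]) {
      int mypid, p, x, y;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &mypid);
      MPI_Comm_size(MPI_COMM_WORLD, &p);

      x = mypid;   /* each processor has a different value for x */

      /* Every rank calls MPI_Reduce; only rank 0 receives a meaningful y */
      MPI_Reduce(&x, &y, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

      if (mypid == 0)
          printf("<pid 0>: sum of ranks = %d (expected %d)\n", y, p * (p - 1) / 2);

      MPI_Finalize();
      return 0;
  }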
12c.12
Reduction
• Example:
  int x[N], y[N];
  // Each processor has different values in the array x
  MPI_Reduce(x, y, N, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
• The root process (0) has the sum of all x[0]'s in y[0], the sum of all x[1]'s in y[1], ...
12c.13
Reduction
• All processors participating in the reduction (whether they are a source or the destination) must call the reduce function with the same parameters, or else it won't work.
• The runtime of a reduction is O(log(p)) instead of O(p), where p is the number of processors.
12c.14
Reduction
[Figure: the reduction follows the same eight-processor tree as the broadcast figure (0–4; 0–2 and 4–6; 0–1, 2–3, 4–5, 6–7), with data flowing toward processor 0, completing in three communication steps. Legend distinguishes communication from non-communication steps.]
12c.15
Scatter/Gather
• Scatter sends parts of an array from the root to each processor.
• Syntax: MPI_Scatter(send_data, send_count, send_type, recv_data, recv_count, recv_type, root, MPI_Comm)
• Gather brings together parts of an array from the different processors to the root.
• Syntax: MPI_Gather(send_data, send_count, send_type, recv_data, recv_count, recv_type, root, MPI_Comm)
12c.16
Scatter
[Figure: P0 holds the array 37 16 48 2 73 44 32 11; scatter sends one two-element block of it to each of P0, P1, P2, P3.]
12c.17
Gather
[Figure: the reverse of the scatter — the two-element blocks held by P0, P1, P2, P3 are collected back into the array 37 16 48 2 73 44 32 11 on P0.]
12c.18
Scatter/Gather
  float a[N], localA[N];
  ...
  if (mypid == 0) {
      printf("<pid %d>: a = ", mypid);
      for (i = 0; i < N; i++)
          printf("%f ", a[i]);
      printf("\n");
  }
12c.19
Scatter/Gather
  blksz = (int) ceil(((float) N) / P);

  MPI_Scatter(a, blksz, MPI_FLOAT, &localA[0], blksz, MPI_FLOAT,
              0, MPI_COMM_WORLD);
12c.20
Scatter/Gather
  for (i = 0; i < blksz; i++)
      printf("<pid%d>: localA = %.2f\n", mypid, localA[i]);

  for (i = 0; i < blksz; i++)
      localA[i] += mypid;

  for (i = 0; i < blksz; i++)
      printf("<pid%d>: new localA = %.2f\n", mypid, localA[i]);
12c.21
Scatter/Gather
  MPI_Gather(&localA[0], blksz, MPI_FLOAT, a, blksz, MPI_FLOAT,
             0, MPI_COMM_WORLD);

  if (mypid == 0) {
      printf("<pid %d>: A = ", mypid);
      for (i = 0; i < N; i++)
          printf("%f ", a[i]);
      printf("\n");
  }
12c.22
Scatter/Gather
$ mpirun -nolocal -np 3 mpiGatherScatter 6
<pid 0>: A = 84.019997 39.439999 78.309998 79.839996 91.160004 19.760000
<pid0>: localA = 84.02
<pid0>: localA = 39.44
<pid0>: new localA = 84.02
<pid0>: new localA = 39.44
12c.23
Scatter/Gather
<pid1>: localA = 78.31
<pid1>: localA = 79.84
<pid1>: new localA = 79.31
<pid1>: new localA = 80.84
<pid2>: localA = 91.16
<pid2>: localA = 19.76
<pid2>: new localA = 93.16
<pid2>: new localA = 21.76
<pid 0>: A = 84.019997 39.439999 79.309998 80.839996 93.160004 21.760000
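For reference, the fragments on slides 12c.18–12c.21 assemble into roughly the following complete program. This is a sketch, not the original source: the fixed bound MAX_N, reading N from the command line, and the random initialization of a are assumptions made to match the sample run above.

  #include <stdio.h>
  #include <stdlib.h>
  #include <math.h>
  #include <mpi.h>

  #define MAX_N 1000    /* assumed upper bound so the arrays can be statically sized */

  int main(int argc, char *argv[]) {
      int mypid, P, N, blksz, i;
      float a[MAX_N], localA[MAX_N];

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &mypid);
      MPI_Comm_size(MPI_COMM_WORLD, &P);

      N = atoi(argv[1]);               /* array size from the command line */

      if (mypid == 0) {
          for (i = 0; i < N; i++)      /* root fills a with some values */
              a[i] = 100.0f * rand() / (float) RAND_MAX;
          printf("<pid %d>: A = ", mypid);
          for (i = 0; i < N; i++) printf("%f ", a[i]);
          printf("\n");
      }

      blksz = (int) ceil(((float) N) / P);

      /* Distribute blksz elements of a to each processor's localA */
      MPI_Scatter(a, blksz, MPI_FLOAT, &localA[0], blksz, MPI_FLOAT,
                  0, MPI_COMM_WORLD);

      for (i = 0; i < blksz; i++)
          printf("<pid%d>: localA = %.2f\n", mypid, localA[i]);

      for (i = 0; i < blksz; i++)      /* each processor modifies its block */
          localA[i] += mypid;

      for (i = 0; i < blksz; i++)
          printf("<pid%d>: new localA = %.2f\n", mypid, localA[i]);

      /* Collect the modified blocks back into a on the root */
      MPI_Gather(&localA[0], blksz, MPI_FLOAT, a, blksz, MPI_FLOAT,
                 0, MPI_COMM_WORLD);

      if (mypid == 0) {
          printf("<pid %d>: A = ", mypid);
          for (i = 0; i < N; i++) printf("%f ", a[i]);
          printf("\n");
      }

      MPI_Finalize();
      return 0;
  }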
12c.24
For further reading
• Man pages on MPI routines: http://www-unix.mcs.anl.gov/mpi/www/www3/
• Barry Wilkinson and Michael Allen, Parallel Programming: Techniques and Applications Using Networked Workstations and Parallel Computers, Prentice Hall, Upper Saddle River, NJ, 1999, ISBN 0-13-671710-1
• Peter S. Pacheco, Parallel Programming with MPI, Morgan Kaufmann Publishers, Inc., San Francisco, CA, 1997, ISBN 1-55860-339-5