
Performance Analysis in Parallel Programming

Ezio Bartocci, PhD

SUNY at Stony Brook

January 13, 2011

Today's agenda

Serial Program Performance
- An example: the trapezoidal rule
- I/O statements
- Taking timings

Parallel Program Performance Analysis
- The cost of communication
- Amdahl's Law
- Sources of overhead
- Scalability

Performance of a Serial Program

Definition of Running Time:

Let T(n) be the running time of a program, in some fixed unit of time.

The actual time depends on:

- the hardware being used
- the programming language and compiler
- details of the input other than its size

Example: the trapezoidal rule

h = (b-a)/n;
integral = (f(a)+f(b))/2.0;
x = a;
for (i = 1; i <= n-1; i++) {
    x = x + h;
    integral = integral + f(x);
}
integral = integral * h;

[Figure: one trapezoid of width h on the interval from a to b.]

The statements outside the loop take a constant time c1, and each of the n−1 loop iterations takes a time c2, so:

T(n) = c1 + c2 (n − 1) ≈ k1 n + k2 ≈ k1 n   (when k1 n ≫ k2)

that is, T(n) is a linear polynomial in n.
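To make the example concrete, here is a minimal, self-contained version of the trapezoidal-rule program; the integrand f(x) = x^2, the interval [0, 1], and the value of n are assumptions chosen for illustration.

#include <stdio.h>

/* Hypothetical integrand chosen for illustration: f(x) = x^2. */
static double f(double x) {
    return x * x;
}

int main(void) {
    double a = 0.0, b = 1.0;  /* interval endpoints (assumed) */
    int n = 1000000;          /* number of trapezoids (assumed) */
    double h = (b - a) / n;
    double integral = (f(a) + f(b)) / 2.0;
    double x = a;
    int i;

    for (i = 1; i <= n - 1; i++) {
        x = x + h;
        integral = integral + f(x);
    }
    integral = integral * h;

    printf("Integral on [%g, %g] with n = %d: %f\n", a, b, n, integral);
    return 0;
}

The loop body runs exactly n−1 times, matching the count T(n) = c1 + c2 (n − 1) above.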

What about the I/O?

On a 33 MHz 486 PC (Linux, gcc compiler):
- a multiplication takes about a microsecond
- a printf takes about 300 microseconds

On faster systems the ratio may be even worse, with the arithmetic time decreasing while I/O times remain roughly the same.

Estimation of the time

T(n) = T_calc(n) + T_i/o(n)
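A rough way to check the arithmetic-versus-I/O gap on a given machine is to time a batch of each operation, as in the sketch below; the iteration counts are assumptions, and clock() measures processor time rather than wall-clock time.

#include <stdio.h>
#include <time.h>

int main(void) {
    enum { N_MUL = 10000000, N_PRINT = 10000 };
    volatile double x = 1.000001;  /* volatile keeps the loop from being optimized away */
    clock_t t0, t1;
    int i;

    t0 = clock();
    for (i = 0; i < N_MUL; i++)
        x = x * 1.000001;
    t1 = clock();
    printf("multiplication: %.4f us each\n",
           (double)(t1 - t0) * 1e6 / CLOCKS_PER_SEC / N_MUL);

    t0 = clock();
    for (i = 0; i < N_PRINT; i++)
        fprintf(stderr, ".");   /* stderr is unbuffered, so each call really performs I/O */
    t1 = clock();
    printf("printf:         %.4f us each\n",
           (double)(t1 - t0) * 1e6 / CLOCKS_PER_SEC / N_PRINT);
    return 0;
}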

Parallel Program Performance Analysis


T(n, p) denotes the runtime of the parallel solution with p processes.

Speedup of a parallel program:

S(n, p) = T(n) / T(n, p)

Efficiency of a parallel program:

E(n, p) = S(n, p) / p = T(n) / (p T(n, p))

The definition is ambiguous: is T(n) the runtime of the fastest known serial program, or is T(n) = T(n, 1)?

In general S(n, p) ≤ p; S(n, p) = p is called linear speedup.

Efficiency is a measure of process utilization in a parallel program, relative to the serial program: 0 < E(n, p) ≤ 1, where E(n, p) = 1 corresponds to linear speedup and E(n, p) < 1/p to a slowdown.
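Both quantities are simple to compute from measured runtimes; the helper functions and sample times below are illustrative, not part of the original slides.

#include <stdio.h>

/* S(n,p) = T(n) / T(n,p) */
static double speedup(double t_serial, double t_parallel) {
    return t_serial / t_parallel;
}

/* E(n,p) = S(n,p) / p */
static double efficiency(double t_serial, double t_parallel, int p) {
    return speedup(t_serial, t_parallel) / p;
}

int main(void) {
    double t_serial = 64.0, t_parallel = 10.0;  /* hypothetical times, in seconds */
    int p = 8;
    printf("S = %.2f, E = %.2f\n",
           speedup(t_serial, t_parallel),
           efficiency(t_serial, t_parallel, p));  /* prints S = 6.40, E = 0.80 */
    return 0;
}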

The Cost of Communication

For a reasonable estimate of the performance of a parallel program, we should also count the cost of communication.

T(n, p) = T_calc(n, p) + T_i/o(n, p) + T_comm(n, p)

The cost of sending a single message containing k units of data is:

t_s + k t_c

where t_s is called the message latency and 1/t_c is called the bandwidth.
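A sketch of this linear cost model follows; the latency and per-unit transfer values are hypothetical, chosen only for illustration.

#include <stdio.h>

/* Linear message cost model from the slides: t_s + k * t_c. */
static double message_time(double t_s, double t_c, long k) {
    return t_s + (double)k * t_c;
}

int main(void) {
    double t_s = 50.0;  /* latency in microseconds (assumed) */
    double t_c = 0.01;  /* time per data unit, i.e. bandwidth 1/t_c (assumed) */
    long k;
    for (k = 1; k <= 1000000; k *= 100)
        printf("k = %8ld units -> %12.2f us\n", k, message_time(t_s, t_c, k));
    return 0;
}

For small k the latency t_s dominates, which is why sending one large message is usually cheaper than sending many small ones.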

Taking Timings

#include <time.h>

....
clock_t c1, c2;
....
c1 = clock();
for (j = 0; j < MATRIX_SIZE; j++) {
    for (i = 0; i < MATRIX_SIZE; i++) {
        /* B(i,j) = A(j,i) + A(j,i): a transpose-like sweep used as the timed workload */
        B[i * MATRIX_SIZE + j] = A[j * MATRIX_SIZE + i] + A[j * MATRIX_SIZE + i];
    }
}
c2 = clock();

serial_comp = c2 - c1;
printf("Time serial computation: %.0f milliseconds\n",
       (double)serial_comp * 1000.0 / CLOCKS_PER_SEC);
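Note that clock() measures CPU time, not elapsed wall-clock time. On a POSIX system, a wall-clock measurement of the same kind of loop could look like the sketch below; CLOCK_MONOTONIC support and the stand-in workload are assumptions.

#include <stdio.h>
#include <time.h>

int main(void) {
    struct timespec t0, t1;
    double elapsed_ms;
    long i;
    volatile double x = 0.0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < 10000000L; i++)   /* stand-in for the real work */
        x += 1.0;
    clock_gettime(CLOCK_MONOTONIC, &t1);

    elapsed_ms = (t1.tv_sec - t0.tv_sec) * 1000.0
               + (t1.tv_nsec - t0.tv_nsec) / 1e6;
    printf("Elapsed wall-clock time: %.1f milliseconds\n", elapsed_ms);
    return 0;
}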

Amdahl's Law

If the parallelizable fraction of the runtime T is r, the parallel runtime on p processes is (1−r) T + r T / p, so the speedup is:

S(p) = T / ((1−r) T + r T / p) = 1 / ((1−r) + r/p)

dS/dp = r / ((1−r) p + r)^2 ≥ 0

lim_{p→∞} S(p) = 1 / (1−r)

0 ≤ r ≤ 1 is the fraction of the program that is perfectly parallelizable.

S(p) is an increasing function of p.

Example: if r = 0.5 the maximum speedup is 2; if r = 0.99 the maximum speedup is 100, even if we use 10000 processes.
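A few lines of C make the bound tangible; the values of r and p below include the ones mentioned on the slide plus a few extras for context.

#include <stdio.h>

/* Amdahl's law: speedup with parallel fraction r on p processes. */
static double amdahl(double r, int p) {
    return 1.0 / ((1.0 - r) + r / p);
}

int main(void) {
    double rs[] = { 0.5, 0.9, 0.99 };       /* illustrative parallel fractions */
    int ps[] = { 1, 2, 10, 100, 10000 };
    int i, j;
    for (i = 0; i < 3; i++) {
        printf("r = %.2f:", rs[i]);
        for (j = 0; j < 5; j++)
            printf("  S(%d) = %.2f", ps[j], amdahl(rs[i], ps[j]));
        printf("\n");
    }
    return 0;
}

Even with 10000 processes, r = 0.99 yields S ≈ 99, just under the asymptotic limit of 100.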

Work and Overhead

The amount of work done by a serial program is simply the runtime:

W(n) = T(n)

The amount of work done by a parallel program is the sum of the amounts of work done by each process:

W(n, p) = Σ_{q=0}^{p−1} W_q(n, p)

Thus, an alternative definition of efficiency is:

E(n, p) = T(n) / (p T(n, p)) = W(n) / W(n, p)

Work and Overhead

Overhead is the amount of work done by the parallel program that is not done by the serial program:

T_o(n, p) = W(n, p) − W(n) = p T(n, p) − T(n)

The per-process overhead is the difference between the parallel runtime and the ideal parallel runtime that would be obtained with linear speedup:

T'_o(n, p) = T(n, p) − T(n)/p,   so that   T_o(n, p) = p T'_o(n, p)

Main sources of overhead: communication, idle time, and extra computation.

In terms of Amdahl's law, the efficiency achievable by a parallel program on a given problem instance approaches zero as the number of processes is increased.
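Plugging measured runtimes into these definitions is mechanical; the times below are hypothetical and only serve to show the relationships.

#include <stdio.h>

int main(void) {
    /* Hypothetical measured runtimes (seconds): serial, and parallel on p processes. */
    double t_serial = 100.0;
    double t_parallel = 15.0;
    int p = 8;

    double overhead = p * t_parallel - t_serial;       /* T_o(n,p)  */
    double per_proc = t_parallel - t_serial / p;       /* T'_o(n,p) */
    double efficiency = t_serial / (p * t_parallel);   /* E(n,p)    */

    printf("total overhead:       %.1f s\n", overhead);    /* 20.0 s  */
    printf("per-process overhead: %.2f s\n", per_proc);    /* 2.50 s, and p * 2.50 = 20.0 */
    printf("efficiency:           %.3f\n", efficiency);    /* 0.833   */
    return 0;
}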

Scalability

Amdahl's law defines an upper bound on the speedup. In terms of efficiency we have:

E = T / (p ((1−r) T + r T / p)) = 1 / (p (1−r) + r)

Scalability

Looking at this in terms of overhead:

T_o(n, p) = p T(n, p) − T(n)

Efficiency can be written as:

E(n, p) = T(n) / (p T(n, p)) = T(n) / (T(n) + T_o(n, p)) = 1 / (1 + T_o(n, p) / T(n))

The behavior of the efficiency as p is increased is therefore completely determined by the overhead function T_o(n, p).

A parallel program is scalable if, as the number of processes p is increased, we can find a rate of increase for the problem size n so that the efficiency remains constant.
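To illustrate the definition, the sketch below assumes a hypothetical cost model, T(n) = n and T_o(n, p) = p log2(p), and solves for the problem size n that holds the efficiency at a chosen target as p grows; both the model and the target value are assumptions, not part of the original slides.

#include <stdio.h>
#include <math.h>

/* Hypothetical cost model: serial work T(n) = n, overhead T_o(n,p) = p*log2(p). */
static double efficiency(double n, int p) {
    double overhead = p * log2((double)p);
    return 1.0 / (1.0 + overhead / n);
}

int main(void) {
    double target = 0.8;  /* efficiency we want to maintain (assumed) */
    int p;
    for (p = 2; p <= 1024; p *= 2) {
        /* From E = 1/(1 + T_o/n), holding E = target gives n = T_o * target/(1-target). */
        double n = p * log2((double)p) * target / (1.0 - target);
        printf("p = %4d -> n = %10.0f (E = %.3f)\n", p, n, efficiency(n, p));
    }
    return 0;
}

Under this model n must grow like p log2 p to hold the efficiency constant, so the program is scalable, though the required problem size grows slightly faster than the process count.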
