Online Computation of Critical Paths for Multithreaded Languages

May/01/2000 HIPS 2000 1 Online Computation of Critical Paths for Multithreaded Languages Yoshihiro Oyama Kenjiro Taura Akinori Yonezawa University of Tokyo

Online Computation of Critical Paths for Multithreaded Languages

Yoshihiro Oyama, Kenjiro Taura, Akinori Yonezawa

University of Tokyo

Presentation Outline

• What is a critical path?
• Background & Overview
• Our work
  – Target language
  – Critical path computation algorithm
  – Instrumentation scheme
• Experimental results
• Related work

What is a Critical Path (CP)?

• The longest execution path
  – Nodes: sequential program parts
  – Edges: fork/sync points

[Figure: example execution graph with weighted parts. CP length: 31]
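
The definition above can be sketched as a longest-path computation over a weighted DAG. This is only an illustration: the graph is a made-up example (not the one in the slide's figure), weights are placed on edges as in the later slides, and `critical_path_length` is a hypothetical helper name.

```python
from functools import lru_cache


def critical_path_length(edges, source, sink):
    """Return the length of the longest path from source to sink.

    edges maps a node to a list of (successor, weight) pairs and must
    describe a DAG (no cycles).
    """
    @lru_cache(maxsize=None)
    def longest_from(node):
        if node == sink:
            return 0
        candidates = [w + longest_from(nxt) for nxt, w in edges.get(node, [])]
        # A node with no route to the sink contributes minus infinity.
        return max(candidates, default=float("-inf"))

    return longest_from(source)


# Hypothetical DAG: two threads fork at "start" and join at "end".
example_edges = {
    "start": [("a", 3), ("b", 5)],
    "a": [("end", 6)],
    "b": [("end", 2)],
}
example_length = critical_path_length(example_edges, "start", "end")
```

The talk's algorithm avoids building such a graph at all; this offline version only pins down what "the longest execution path" means.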

Benefits of Getting CPs (1/2)

• CP info gives us
  – Performance upper bound = Exec. time lower bound = $\lim_{\mathrm{PE} \to \infty} \{\text{exec. time}\}$
  – Important parts in need of tuning

Benefits of Getting CPs (2/2)

• CP info is useful for
  – Tuning
    • CP is short → Overhead should be reduced
    • Otherwise → CP should be shortened
  – Performance prediction
    • $T_P = T_1 / P + T_\infty$ (by Cilk group)
    • Exec. time is close to CP length → More processors: futile
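
A minimal sketch of the prediction formula, reading T1 as the one-processor execution time and T∞ as the CP length. The input values are invented for illustration, and `predicted_time` is a hypothetical name.

```python
def predicted_time(t1, t_inf, p):
    """Estimate the execution time on p processors as T1 / P + T_inf."""
    if p < 1:
        raise ValueError("p must be at least 1")
    return t1 / p + t_inf


# Made-up numbers: 1000 ms of total work and a 100 ms critical path.
for p in (1, 4, 16, 64):
    print(p, predicted_time(1000.0, 100.0, p))
```

As P grows, the estimate approaches the CP length, which is why adding processors stops helping once the execution time is already close to it.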


This Work

• Computing critical paths
  – Primary targets:
    • Multithreaded languages
    • Shared-memory machines
  – On-the-fly
    • Not using tracefiles
  – Source code instrumentation

Background (Shortcoming of Existing Work)

• Cilk [Frigo et al. 98]
  – Provides online computation of CPs
  – Supports fork-join synchronization only
  – Unrealistic setting
    • Fork: zero cost
    • Join: zero cost

Contribution

• Developed an algorithm for computing CPs
  – It deals with languages with threads and synchronization via first-class data
    • Not limited to fork-join model
  – It takes fork / communication cost into account
  – It gives length of each subpath in a CP
    • Helps us “pinpoint” important program parts
• Demonstrated its usefulness through experiments on an SMP

CP Info Example

• Displaying a sequence of all subpaths in a CP

  frame entry point        frame exit point                     time
  ====================================================================
  main()               --- move_mols(mols,100)               741 usec
  spawn                                                       10 usec
  move_mols(mols,n)    --- spawn move_one_mol(mols[i])        39 usec
  spawn                                                       10 usec
  move_one_mol(molp)   --- return                           4982 usec
  communication                                               15 usec
  v = recv(r)          --- send(s, v*2)                      128 usec
  communication                                               15 usec
  u = recv(s)          --- die                              1207 usec
  ====================================================================
  critical path length                                      7147 usec
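
The report above can be treated as a plain list of subpaths whose times add up to the CP length. The sketch below copies the rows from the slide; the tuple layout and the helper names are assumptions made for illustration.

```python
# (entry point, exit point or None, time in usec), copied from the report.
subpaths = [
    ("main()", "move_mols(mols,100)", 741),
    ("spawn", None, 10),
    ("move_mols(mols,n)", "spawn move_one_mol(mols[i])", 39),
    ("spawn", None, 10),
    ("move_one_mol(molp)", "return", 4982),
    ("communication", None, 15),
    ("v = recv(r)", "send(s, v*2)", 128),
    ("communication", None, 15),
    ("u = recv(s)", "die", 1207),
]


def cp_length(paths):
    """Sum the time of every subpath on the critical path."""
    return sum(usec for _, _, usec in paths)


def dominant_subpath(paths):
    """Return the subpath that contributes the most time."""
    return max(paths, key=lambda p: p[2])


total = cp_length(subpaths)
hot = dominant_subpath(subpaths)
```

Here the dominant subpath is the one inside `move_one_mol(molp)`, which accounts for about 70% of the path and is therefore the first candidate for tuning.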


Target Language

• Sequential language (C, Scheme, …)
  + Threads: spawn f(x1,…,xn)
  + Channels
    • are first-class sync. media
    • can express locks, barriers, and monitors

[Figure: thread th1 executes send(r,8); thread th2 executes v = recv(r); the value 8 travels through channel r]
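
A rough Python analogue of the target language's primitives, assuming `queue.Queue` as the channel and `threading.Thread` as the thread. `new_channel`, `slide_example`, and `locked_count` are hypothetical names; `join` is used only so the Python example terminates cleanly and is not part of the language on the slide.

```python
import queue
import threading


def new_channel():
    return queue.Queue()


def send(ch, value):
    ch.put(value)


def recv(ch):
    return ch.get()  # blocks until a value is available


def spawn(f, *args):
    t = threading.Thread(target=f, args=args)
    t.start()
    return t


def slide_example():
    """th1 sends 8 on channel r; th2 receives it and reports it back."""
    r, result = new_channel(), new_channel()
    spawn(lambda: send(result, recv(r)))  # th2: v = recv(r)
    send(r, 8)                            # th1: send(r, 8)
    return recv(result)


def locked_count(num_threads, increments):
    """Use a channel holding one token as a lock around a shared counter."""
    lock = new_channel()
    send(lock, "token")  # the lock starts out free
    state = {"count": 0}

    def worker():
        for _ in range(increments):
            token = recv(lock)   # acquire: take the token
            state["count"] += 1
            send(lock, token)    # release: put the token back

    threads = [spawn(worker) for _ in range(num_threads)]
    for t in threads:
        t.join()
    return state["count"]
```

The token-passing lock is one way to read "channels can express locks"; barriers and monitors can be built along similar lines.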

Sample Program

  main() {                /* Beginning of Program */
    spawn sum(r, vec);
    ...
    v = recv(r);
    ...
    die;                  /* End of Program */
  }

  sum(r, vec) {
    ...
    ...
    send(r, ans);
  }


Behavior of Sample Program

• DAG-structured execution
  – Nodes: fork & sync. points
  – Edges: inter-node dependencies

[Figure: DAG of the sample program with nodes main, spawn sum(r,vec), v=recv(r), die, sum(r,vec), and send(r,ans)]

Three Kinds of Edges (Dependencies)

• Arithmetic edges
• Spawn edges
• Communication edges

[Figure: the sample program's DAG (nodes main, spawn sum(r,vec), v=recv(r), die, sum(r,vec), send(r,ans)) with a weight on each edge]

CP Computation Algorithm: Basic Idea

• DAG not constructed
  – Each thread keeps only the longest path up to the current program point

[Figure: two paths, Path1 and Path2, meet at a recv in main; the shorter path is thrown away]

Key Questions

• How to determine edge values?
• How to compute CP without constructing DAG?
  – How to manage CP info?
  – How to keep the longest path?

Determining Edge Values

• Computing the amount of time that elapsed after leaving the previous node

  X (t1=time()) --8--> Y (t2=time()) --6--> Z (t3=time())
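
The timing rule can be sketched with a small helper that reads a clock at each node and returns the time since the previous node. `EdgeTimer` and `reach_node` are hypothetical names, and `time.perf_counter` stands in for whatever timer the real system uses; the clock is injectable so the example can replay the slide's values.

```python
import time


class EdgeTimer:
    """Measure the time elapsed between consecutive nodes."""

    def __init__(self, clock=time.perf_counter):
        self.clock = clock
        self.last = clock()  # t1 = time() at the first node

    def reach_node(self):
        """Return the value of the edge that ends at the current node."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        return elapsed


# Replay the slide: t1 = 0 at X, t2 = 8 at Y, t3 = 14 at Z.
ticks = iter([0, 8, 14])
timer = EdgeTimer(clock=lambda: next(ticks))
edge_xy = timer.reach_node()
edge_yz = timer.reach_node()
```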

Extending CP with Arithmetic Edge

  L1: X    CP=({…},{…},{…})
        | 8
  L2: Y    CP=({…},{…},{…}, {L1,L2,8})
        | 6
  L3: Z    CP=({…},{…},{…}, {L1,L2,8}, {L2,L3,6})

• The amount of time in nodes: NOT accounted
• CP info = a sequence of edge info
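
A minimal sketch of "CP info = a sequence of edge info", assuming each edge is a `(from_label, to_label, time)` tuple and the CP is an immutable tuple of such edges. The earlier edges that the slide elides as {…} are simply left out here, and `add_edge` and `length` are hypothetical names.

```python
def add_edge(cp, src, dst, weight):
    """Return a new CP extended with one edge; the input CP is unchanged."""
    return cp + ((src, dst, weight),)


def length(cp):
    """Total time along the path (time spent inside nodes is not counted)."""
    return sum(weight for _, _, weight in cp)


cp_at_x = ()                                   # CP when X (label L1) is reached
cp_at_y = add_edge(cp_at_x, "L1", "L2", 8)     # after the 8-unit edge X -> Y
cp_at_z = add_edge(cp_at_y, "L2", "L3", 6)     # after the 6-unit edge Y -> Z
```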

Extending CP with Spawn Edge

  X:                     CP=({…},{…},{…})
    X spawn Y
  Y (spawned thread):    CP=({…},{…},{…}, {…,…,Cspawn})
  Z (continuation of X): CP=({…},{…},{…})
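
One reading of the spawn rule, sketched under assumptions: the spawned thread starts with the parent's CP extended by a spawn edge of cost Cspawn, while the parent continues with its CP unchanged. The constant 10 is borrowed from the "spawn 10 usec" rows of the earlier report; treating Cspawn as a fixed constant, and the names `C_SPAWN` and `cp_for_child`, are assumptions.

```python
C_SPAWN = 10  # assumed constant spawn cost in usec


def cp_for_child(parent_cp, spawn_label, child_entry_label):
    """Return the CP that a newly spawned thread starts with."""
    return parent_cp + ((spawn_label, child_entry_label, C_SPAWN),)


parent_cp = (("L1", "L2", 8),)
child_cp = cp_for_child(parent_cp, "L2", "Y")
# parent_cp itself is untouched and is what the parent carries on to Z.
```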

Extending CP with Communication Edge

  send:  CPsend=({…},{…})
           ↓ [v, CPsend]   (Piggyback a sent value with CP)
  recv:  CPsend=({…},{…}, {…,…,Ccomm})
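
Piggybacking can be sketched by sending a `(value, CP)` pair instead of the bare value and letting the receiver append the communication edge. `queue.Queue` stands in for a channel; the constant 15 is borrowed from the "communication 15 usec" rows of the earlier report; the edge labels and the function names are assumptions.

```python
import queue

C_COMM = 15  # assumed constant communication cost in usec


def send_with_cp(ch, value, sender_cp):
    """Send the value together with the sender's current CP."""
    ch.put((value, sender_cp))


def recv_with_cp(ch, recv_label):
    """Receive [v, CPsend] and extend CPsend with a communication edge."""
    value, sender_cp = ch.get()
    extended = sender_cp + (("send", recv_label, C_COMM),)
    return value, extended


r = queue.Queue()
send_with_cp(r, 8, (("L1", "L2", 8),))
v, cp_send = recv_with_cp(r, "L3")
```

The receiver still has to decide whether this extended path or its own path is longer; that comparison is a separate step.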

Keeping the Longest Path (Throwing Shorter Paths Away)

  send:  CPsend = …
           ↓ [v, CPsend]
  recv:  CPrecv = …
         CPsend=({…},{…}, {…,…,Ccomm})
         CP=max( CPsend, CPrecv )


Instrumentation

• Source-to-source transformation
  – Independent of the implementation details
    • Ex. management of activation frames
  – Instrumentation code is inserted into
    • Sends, recvs, spawns
    • Entry/exit points of functions

Transformation Rule Example

  l: v = recv(r);

is transformed into

  t = time() - et;                        /* Compute CP up to recv */
  [v, cp’] = recv(r);                     /* Receive a value piggybacked with CP */
  cp’’ = addCommEdge(cp’);                /* Extend CP with comm. edge */
  if (t + length(cp) < length(cp’’)) {    /* Compare the two CPs */
    cp = cp’’;                            /* Use the sender’s CP */
    el = l;
    et = time();
  } else {                                /* Use the receiver’s CP */
    et = time() - t;
  }
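
A runnable approximation of the rule for `recv`, under assumptions: the CP is a tuple of `(from, to, time)` edges, `cp`, `el`, and `et` live in a per-thread state object, the communication cost is a constant, and the sender's path is compared after the communication edge has been added. `ThreadState` and `instrumented_recv` are hypothetical names; the clock is injectable so the behavior can be checked deterministically.

```python
import queue
import time

C_COMM = 15  # assumed constant communication cost


def length(cp):
    return sum(weight for _, _, weight in cp)


class ThreadState:
    """Per-thread CP bookkeeping: cp, el (last label), et (edge start time)."""

    def __init__(self, clock=time.perf_counter, entry_label="entry"):
        self.clock = clock
        self.cp = ()
        self.el = entry_label
        self.et = clock()


def instrumented_recv(state, label, channel):
    t = state.clock() - state.et           # time elapsed on the current edge
    v, cp_sender = channel.get()           # value piggybacked with sender's CP
    cp_ext = cp_sender + (("send", label, C_COMM),)
    if t + length(state.cp) < length(cp_ext):
        state.cp = cp_ext                  # sender's path is longer: adopt it
        state.el = label
        state.et = state.clock()           # a new edge starts at this recv
    else:
        state.et = state.clock() - t       # keep own path; edge keeps running
    return v
```

In the else branch, resetting `et` to `time() - t` discards the time spent blocked in `recv`, so only the receiver's own computation is charged to its current edge.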

Discussion (1/2)
-- Nondeterminism --

• DAG shape varies between different runs
  – The amounts of time for each part vary (e.g., cache effects)
    [Figure: the same edge X → Y takes 28 in one run and 5 in another]
  – Comm. edges may connect different pairs
    [Figure: the same sends and recvs paired differently in two runs]

Discussion (2/2)
-- What we Compute as CP --

• CP of a DAG created in an actual run
  – Programs may give different CPs in different runs
  – Other reasonable ways?


Experiments

• Schematic: concurrent OO language [Taura et al. 96]
• Sun Ultra Enterprise 10000
  – UltraSPARC × 64
• Apps:
  – Prime
  – Natural Language Parser
  – Raytrace
• Timer function: gethrtime()

Purpose of Experiments

• Checking that execution times get close to computed CPs
• Identifying how large instrumentation overhead is

Raytrace

[Chart: time (msec, 0 to 1500) vs. number of processors (0 to 60); series: Instrumented, CP, Org]

We could predict the best performance by using only one processor

Prime

[Chart: time (msec, 0 to 1200) vs. number of processors (0 to 60); series: Instrumented, CP, Org]

Small (< 5%) difference between the actual execution time and the predicted execution time

Information Useful for Future Tuning of Prime

• Gathering primes into a list → 95% of CP
• Dividing prime candidates by smaller primes → 5% of CP

Natural Language Parser

[Chart: time (msec, 0 to 1200) vs. number of processors (0 to 60); series: Instrumented, CP, Org]

Information Useful for Future Tuning of NL Parser

• Application of lexical rules → 4% of CP
• Application of production rules → 96% of CP

Instrumentation Overhead (Execution Time on One Processor)

[Bar chart: normalized time (axis 0 to 15) for Org and Instrumented. Data labels: Prime 9.9, NL Parser 4.7, Raytrace 3.7]


Related Work (1/2)

• Cilk
  – Breakdown of CP not shown
    • CP info: not detailed enough for tuning

  % foo -nproc 10 20
  result: 524288
  Running time on 10 procs: 416.33 ms
  Total work = 3.94 s
  Critical path = 1.08 ms
  Parallelism = 2800.92
  %

  Which function should we tune???

Related Work (2/2)

• Paradyn [Hollingsworth 98]
  – Main target is message-passing programs
  – It does not display all subpaths in CP
• Tracefile-based offline scheme (Dimemas [Pallas] etc.)
  – Tracefile contains the parameters and the timings of all communication operations
  – Required memory/storage is very large

Summary (1/2)

• Scheme for online CP computation
  – Supports synchronization via first-class data
    • Piggybacking communicated values with CP info
    • Keeping the maximum of two paths in receives
  – Takes spawn/communication cost into account
  – Shows all subpaths in CP
    • Attaching subpath info in each CP update

Summary (2/2)

• CP info we compute
  – Helps predict the MP performance
    • Small (< 10%) difference between
      – Actual execution time
      – Predicted execution time
  – Gives a useful guide to tuning
    • Prime: Tune list construction part!
    • Parser: Tune production rule application part!

Future Work

• More precise performance prediction
  – Taking thread mapping into account
• Adaptive optimization using CP info
  – Time-consuming optimizations are applied to the parts included in CP


Any Comments?