speculative parallelism in cilk++

Post on 12-Sep-2021

22 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Speculative Parallelism in Cilk++

Ruben Perez & Gregory Malecha

MIT

May 11, 2010

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 1 / 33

Parallelizing Embarrassingly Parallel Problems

Lots of search algorithms are embarrassingly parallel.

Search down all the paths of a tree.Search multiple disjoint branches in parallel.

No communcation overhead.Very little memory contention.

Since we often only need a single answer, we can stop once we’vefound any solution.

Cannonical algorithms are in game-search, minimax and α-β pruning.

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 2 / 33

2 Player Game Trees

Nodes represent game states.

Edges represent transitionsbetween states.

Leaf states are scored by aheuristic.

Goal is to maximize theheuristic against an adversary.

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 3 / 33

Parallelizing MiniMax

max

min

max

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 4 / 33

Parallelizing MiniMax

3

max

min

max

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 4 / 33

Parallelizing MiniMax

3 0

max

min

max

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 4 / 33

Parallelizing MiniMax

3 0

2

max

min

max

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 4 / 33

Parallelizing MiniMax

3 0

2 -1

max

min

max

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 4 / 33

Parallelizing MiniMax

3 0 -1

2 -1

max

min

max

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 4 / 33

Parallelizing MiniMax

3

3 0 -1

2 -1

max

min

max

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 4 / 33

Improving MiniMax

MiniMax exhaustively searches the game tree to a certain depth(iterative deepening).

Not efficient for most games due to large branching factor.

Can’t search deep enough for the computer to be smart.

Intuition Keep track of the range of feasible scores and prunebranches that fall outside the range.

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 5 / 33

α-β Pruning

Keep additional information with each game node: α and β.

α is the lower bound on the player’s score.

β is the upper bound on the player’s score.

This pruning allows us to search twice as deep on average.

Let’s see an example...

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 6 / 33

α-β Pruning

Keep additional information with each game node: α and β.

α is the lower bound on the player’s score.

β is the upper bound on the player’s score.

This pruning allows us to search twice as deep on average.

Let’s see an example...

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 6 / 33

Pruning Example

[−∞,∞]

max

min

max

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 7 / 33

Pruning Example

[3,∞]

3

max

min

max

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 7 / 33

Pruning Example

[3,∞]

3 0

max

min

max

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 7 / 33

Pruning Example

[3,∞]

3 0 [3,∞]

max

min

max

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 7 / 33

Pruning Example

[3,∞]

3 0 [3,∞]

2

max

min

max

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 7 / 33

Pruning Example

[3,∞]

3 0 [3, 2]

2

Prune!

max

min

max

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 7 / 33

A Recipe for Speculation

Outline

1 A Recipe for SpeculationPorting the ExampleImplementing abort

2 EvaluationPerformanceComplexity Porting Old Code

3 Conclusions

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 8 / 33

A Recipe for Speculation

A Recipe for Speculation

When we speculate, we don’t want to have to commit to somethingfinishing.

Need a way to abort currently running computations that we don’tneed anymore.For consistency, we’d like to be able to abort computations that arespeculating.

We also need a way to combine the results.

Reducers would work for this, but they will need to be able to callabort.Older versions of cilk had another mechanism for this.

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 9 / 33

A Recipe for Speculation

Speculation in Cilk5

Speculation is just based on spawn.

Combining results done through inlets.

inlets are functions that merge results together (similar to reducers).inlets can also make the choice to abort computations.

Let’s look at a simple example.

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 10 / 33

A Recipe for Speculation

Simple Example: Native abort & Inlets

Imagine you have two computations that should yield the same result,but one could take significantly longer to compute.

Speculatively execute both and abort when the first finishes.

i n t l o ng compu ta t i on 1 ( vo id ∗ a r g s ) ;i n t l o ng compu ta t i on 2 ( vo id ∗ a r g s ) ;

i n t f i r s t ( vo id ∗ args1 , vo id ∗ a rg s2 ) {i n t x ;inlet void reduce(int r) {

x = r;

abort;}reduce(c i l k spawn l o ng compu ta t i on 1 ( a rg s1 )) ;reduce(c i l k spawn l o ng compu ta t i on 2 ( a rg s2 )) ;c i l k s y n c ;re tu rn x ;

}

No support for abort or inlet in cilk++.

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 11 / 33

A Recipe for Speculation

Simple Example: Native abort & Inlets

Imagine you have two computations that should yield the same result,but one could take significantly longer to compute.

Speculatively execute both and abort when the first finishes.

i n t l o ng compu ta t i on 1 ( vo id ∗ a r g s ) ;i n t l o ng compu ta t i on 2 ( vo id ∗ a r g s ) ;

i n t f i r s t ( vo id ∗ args1 , vo id ∗ a rg s2 ) {i n t x ;inlet void reduce(int r) {

x = r;

abort;}reduce(c i l k spawn l o ng compu ta t i on 1 ( a rg s1 )) ;reduce(c i l k spawn l o ng compu ta t i on 2 ( a rg s2 )) ;c i l k s y n c ;re tu rn x ;

}

No support for abort or inlet in cilk++.

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 11 / 33

A Recipe for Speculation

Simple Example: Native abort & Inlets

Imagine you have two computations that should yield the same result,but one could take significantly longer to compute.

Speculatively execute both and abort when the first finishes.

i n t l o ng compu ta t i on 1 ( vo id ∗ a r g s ) ;i n t l o ng compu ta t i on 2 ( vo id ∗ a r g s ) ;

i n t f i r s t ( vo id ∗ args1 , vo id ∗ a rg s2 ) {i n t x ;inlet void reduce(int r) {

x = r;

abort;}reduce(c i l k spawn l o ng compu ta t i on 1 ( a rg s1 )) ;reduce(c i l k spawn l o ng compu ta t i on 2 ( a rg s2 )) ;c i l k s y n c ;re tu rn x ;

}

No support for abort or inlet in cilk++.Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 11 / 33

A Recipe for Speculation Porting the Example

Outline

1 A Recipe for SpeculationPorting the ExampleImplementing abort

2 EvaluationPerformanceComplexity Porting Old Code

3 Conclusions

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 12 / 33

A Recipe for Speculation Porting the Example

Handling inlets

Inlets are local functions that merge the result of spawnedcomputations.

Serve a similar purpose to reducers, but a little more general.Get access to the parent function’s stack frame.

Semanticsinlets are locally serial, i.e. all of the inlets for a particular stackframe will run serially.The order of executing them is non-deterministic, implementationbased on the amount of time that the computations take.

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 13 / 33

A Recipe for Speculation Porting the Example

Handling inlets

i n t l o ng compu ta t i on 1 ( vo id ∗ a r g s ) ;i n t l o ng compu ta t i on 2 ( vo id ∗ a r g s ) ;

i n t f i r s t ( vo id ∗ args1 , vo id ∗ a rg s2 ) {i n t x ;i n l e t vo id r educe ( i n t r ) {

x = r ;abort ;

}r educe ( c i l k spawn l o ng compu ta t i on 1 ( a rg s1 ) ) ;r educe ( c i l k spawn l o ng compu ta t i on 2 ( a rg s2 ) ) ;c i l k s y n c ;re tu rn x ;

}

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 14 / 33

A Recipe for Speculation Porting the Example

Handling inlets – via Translation

Translate inlets into continuations.

Not a particularly painful translation.It can be annoying to work with continuations in C code.Ideally this would be a compilation step.

Mostly mechanical translation preserves semantics

Depending on how they are used, it might be more efficient to usereducers.

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 15 / 33

A Recipe for Speculation Porting the Example

cilk5 Code

i n t f 1 ( vo id ∗ a r g s ) ;i n t f 2 ( vo id ∗ a r g s ) ;

i n t f i r s t ( vo id ∗ args1 , vo id ∗ a rg s2 ) {i n t x ;i n l e t vo id r educe ( i n t r ) {

x = r ;abort ;

}r educe ( c i l k spawn f 1 ( a rg s1 ) ) ;r educe ( c i l k spawn f 2 ( a rg s2 ) ) ;c i l k s y n c ;re tu rn x ;

}

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 16 / 33

A Recipe for Speculation Porting the Example

Translated cilk++ Code

s t r u c t I n l e t E n v { c i l k : : mutex m; i n t x ; } ;

i n t f 1 ( vo id ∗ a r g s, int(*cont)(int, InletEnv*), InletEnv* env ) ;i n t f 2 ( vo id ∗ a r g s, int(*cont)(int, InletEnv*), InletEnv* env ) ;

i n t f i r s t i n l e t ( i n t r e s u l t , I n l e t E n v ∗ env ) {env−>m. l o c k ( ) ; // S e r i a l e x e c u t i o nenv−>x = r e s u l t ;abort ;env−>m. un lock ( ) ; // S e r i a l e x e c u t i o n

}

i n t f i r s t ( vo id ∗ args1 , vo id ∗ a rg s2 ) {InletEnv env ;c i l k spawn f 1 ( a rg s1, first inlet, env ) ;c i l k spawn f 2 ( a rg s2, first inlet, env ) ;c i l k s y n c ;re tu rn env.x ;

}

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 17 / 33

A Recipe for Speculation Porting the Example

Translated cilk++ Code

s t r u c t I n l e t E n v { c i l k : : mutex m; i n t x ; } ;

i n t f 1 ( vo id ∗ a r g s, int(*cont)(int, InletEnv*), InletEnv* env ) ;i n t f 2 ( vo id ∗ a r g s, int(*cont)(int, InletEnv*), InletEnv* env ) ;

i n t f i r s t i n l e t ( i n t r e s u l t , I n l e t E n v ∗ env ) {env−>m. l o c k ( ) ; // S e r i a l e x e c u t i o nenv−>x = r e s u l t ;abort ;env−>m. un lock ( ) ; // S e r i a l e x e c u t i o n

}

i n t f i r s t ( vo id ∗ args1 , vo id ∗ a rg s2 ) {InletEnv env ;c i l k spawn f 1 ( a rg s1, first inlet, env ) ;c i l k spawn f 2 ( a rg s2, first inlet, env ) ;c i l k s y n c ;re tu rn env.x ;

}

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 17 / 33

A Recipe for Speculation Porting the Example

Translated cilk++ Code

s t r u c t I n l e t E n v { c i l k : : mutex m; i n t x ; } ;

i n t f 1 ( vo id ∗ a r g s, int(*cont)(int, InletEnv*), InletEnv* env ) ;i n t f 2 ( vo id ∗ a r g s, int(*cont)(int, InletEnv*), InletEnv* env ) ;

i n t f i r s t i n l e t ( i n t r e s u l t , I n l e t E n v ∗ env ) {env−>m. l o c k ( ) ; // S e r i a l e x e c u t i o nenv−>x = r e s u l t ;abort ; Still need to handle abort

env−>m. un lock ( ) ; // S e r i a l e x e c u t i o n}

i n t f i r s t ( vo id ∗ args1 , vo id ∗ a rg s2 ) {InletEnv env ;c i l k spawn f 1 ( a rg s1, first inlet, env ) ;c i l k spawn f 2 ( a rg s2, first inlet, env ) ;c i l k s y n c ;re tu rn env.x ;

}

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 17 / 33

A Recipe for Speculation Implementing abort

Outline

1 A Recipe for SpeculationPorting the ExampleImplementing abort

2 EvaluationPerformanceComplexity Porting Old Code

3 Conclusions

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 18 / 33

A Recipe for Speculation Implementing abort

abort Features

We saw a notion of global abort in Lab 5.

When you abort, you abort everything, completely done.

This is not compositional.

Compositionality

Compositionality requires the ability to abort speculating computations.Need an abort hierarchy.

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 19 / 33

A Recipe for Speculation Implementing abort

Hierarchical abort

Abort any subtree of the computation without affecting the rest ofthe computations.

0

0 0

0 0

0 0

0 0

0 0

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 20 / 33

A Recipe for Speculation Implementing abort

Hierarchical abort

Abort any subtree of the computation without affecting the rest ofthe computations.

0

X 0

0 0

0 0

0 0

0 0

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 20 / 33

A Recipe for Speculation Implementing abort

Hierarchical abort

Abort any subtree of the computation without affecting the rest ofthe computations.

0

X 0

X X

X X

0 0

0 0

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 20 / 33

A Recipe for Speculation Implementing abort

User-Implemented abort

Implement using polling...

Two possible implementations:1 Poll up toward the root.

Also, lazily copy values downward.

2 Abort down, push the abort flag down the tree

0

X 0

0 0

0 0

0 0

0 0

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 21 / 33

A Recipe for Speculation Implementing abort

User-Implemented abort

Implement using polling...

Two possible implementations:1 Poll up toward the root.

Also, lazily copy values downward.

2 Abort down, push the abort flag down the tree

0

X 0

0 0

0 0

0 0

0 0

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 21 / 33

A Recipe for Speculation Implementing abort

User-Implemented abort

Implement using polling...

Two possible implementations:1 Poll up toward the root.

Also, lazily copy values downward.

2 Abort down, push the abort flag down the tree

0

X 0

0 0

0 0Should I abort?

0 0

0 0

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 21 / 33

A Recipe for Speculation Implementing abort

User-Implemented abort

Implement using polling...

Two possible implementations:1 Poll up toward the root.

Also, lazily copy values downward.

2 Abort down, push the abort flag down the tree

0

X 0

0 0

0 0Should I abort?

0 0

0 0

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 21 / 33

A Recipe for Speculation Implementing abort

User-Implemented abort

Implement using polling...

Two possible implementations:1 Poll up toward the root.

Also, lazily copy values downward.

2 Abort down, push the abort flag down the tree

0

X 0

0 0

0 0Should I abort?

0 0

0 0

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 21 / 33

A Recipe for Speculation Implementing abort

User-Implemented abort

Implement using polling...

Two possible implementations:1 Poll up toward the root.

Also, lazily copy values downward.

2 Abort down, push the abort flag down the tree

0

X 0

0 0

0 0Should I abort?Yes

0 0

0 0

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 21 / 33

A Recipe for Speculation Implementing abort

User-Implemented abort

Implement using polling...

Two possible implementations:1 Poll up toward the root.

Also, lazily copy values downward.

2 Abort down, push the abort flag down the tree

0

X 0

0 0

0 0Should I abort?

0 0

0 0

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 21 / 33

A Recipe for Speculation Implementing abort

User-Implemented abort

Implement using polling...

Two possible implementations:1 Poll up toward the root.

Also, lazily copy values downward.

2 Abort down, push the abort flag down the tree

0

X 0

0 0

0 0Should I abort?

0 0

0 0

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 21 / 33

A Recipe for Speculation Implementing abort

User-Implemented abort

Implement using polling...

Two possible implementations:1 Poll up toward the root.

Also, lazily copy values downward.

2 Abort down, push the abort flag down the tree

0

X 0

0 0

0 0Should I abort?

0 0

0 0

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 21 / 33

A Recipe for Speculation Implementing abort

User-Implemented abort

Implement using polling...

Two possible implementations:1 Poll up toward the root.

Also, lazily copy values downward.

2 Abort down, push the abort flag down the tree

0

X 0

X 0

0 0Should I abort?

0 0

0 0

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 21 / 33

A Recipe for Speculation Implementing abort

User-Implemented abort

Implement using polling...

Two possible implementations:1 Poll up toward the root.

Also, lazily copy values downward.

2 Abort down, push the abort flag down the tree

0

X 0

X 0

X 0Should I abort?Yes

0 0

0 0

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 21 / 33

A Recipe for Speculation Implementing abort

User-Implemented abort

Implement using polling...

Two possible implementations:1 Poll up toward the root.

Also, lazily copy values downward.

2 Abort down, push the abort flag down the tree

0

X 0

0 0

0 0

0 0

0 0

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 21 / 33

A Recipe for Speculation Implementing abort

User-Implemented abort

Implement using polling...

Two possible implementations:1 Poll up toward the root.

Also, lazily copy values downward.

2 Abort down, push the abort flag down the tree

0

X 0

0 0

0 0

Abort!

0 0

0 0

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 21 / 33

A Recipe for Speculation Implementing abort

User-Implemented abort

Implement using polling...

Two possible implementations:1 Poll up toward the root.

Also, lazily copy values downward.

2 Abort down, push the abort flag down the tree

0

X 0

X 0

0 0

Abort!

0 0

0 0

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 21 / 33

A Recipe for Speculation Implementing abort

User-Implemented abort

Implement using polling...

Two possible implementations:1 Poll up toward the root.

Also, lazily copy values downward.

2 Abort down, push the abort flag down the tree

0

X 0

X 0

X 0

Abort!

0 0

0 0

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 21 / 33

A Recipe for Speculation Implementing abort

User-Implemented abort

Implement using polling...

Two possible implementations:1 Poll up toward the root.

Also, lazily copy values downward.

2 Abort down, push the abort flag down the tree

0

X 0

X 0

X X

Abort!

0 0

0 0

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 21 / 33

A Recipe for Speculation Implementing abort

User-Implemented abort

Implement using polling...

Two possible implementations:1 Poll up toward the root.

Also, lazily copy values downward.

2 Abort down, push the abort flag down the tree

0

X 0

X X

X X

Abort!

0 0

0 0

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 21 / 33

A Recipe for Speculation Implementing abort

High-level Performace

Poll-UpPolling is expensive (linear in depth of node)abort is cheap (constant time, single memory access).

Really simple implementation.

Abort-DownPolling is cheap (constant time, single memory access).abort is expensive (linear in the size of the subtree).

Code is much more complex.Some extra overhead, currently using a locking implementation.

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 22 / 33

A Recipe for Speculation Implementing abort

High-level Performace

Poll-UpPolling is expensive (linear in depth of node)abort is cheap (constant time, single memory access).Really simple implementation.

Abort-DownPolling is cheap (constant time, single memory access).abort is expensive (linear in the size of the subtree).Code is much more complex.Some extra overhead, currently using a locking implementation.

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 22 / 33

Evaluation

Outline

1 A Recipe for SpeculationPorting the ExampleImplementing abort

2 EvaluationPerformanceComplexity Porting Old Code

3 Conclusions

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 23 / 33

Evaluation Performance

Outline

1 A Recipe for SpeculationPorting the ExampleImplementing abort

2 EvaluationPerformanceComplexity Porting Old Code

3 Conclusions

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 24 / 33

Evaluation Performance

Different Flavors of Abort

Compare the different abort techniques.Build a tree of height 14 and branching factor of 4.Poll at each leaf 100 times.Abort the root.Poll at each leaf.

Page 1

Poll Up Poll Up (Caching) Poll Down

0

1000

2000

3000

4000

5000

6000

7000

8000

9000

10000

Distribution of Time

Tree Height 14, Branching Factor 4

Aborted Poll(x1)AbortPoll (x100)Build

Abort Method

Pro

cess

or

Tic

ks (

x10

00

00

0)

Note that abort is serial.Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 25 / 33

Evaluation Performance

Different Flavors of Abort

Compare the different abort techniques.Build a tree of height 20 and branching factor of 2.Poll at each leaf 100 times.Abort the root.Poll at each leaf.

Page 1

Poll Up Poll Up (Caching) Poll Down

0

1000

2000

3000

4000

5000

6000

Distribution of Time

Tree Height 20, Branching Factor 2

Aborted Poll(x1)AbortPoll (x100)Build

Abort Method

Pro

cess

or

Clo

ck T

icks

(x1

00

00

00

)

Note that abort is serial.Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 26 / 33

Evaluation Performance

Polling Granularity Effects – Nodes Explored

We’re running our ported pousse code (with some instrumentation).

Granularity N doesn’t poll at the lowest N levels of the tree.

Base only checks abort in the loop that spawns the childcomputations.

Page 1

Granularity 3 Granularity 4 Granularity 5 Base

54000

56000

58000

60000

62000

64000

66000

Total Number of Nodes Explored

Polling Granularity

# o

f No

de

s (x

10

00

)

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 27 / 33

Evaluation Performance

Polling Granularity Effects – # of Polls

We’re running our ported pousse code (with some instrumentation).

Granularity N doesn’t poll at the lowest N levels of the tree.

Base only checks abort in the loop that spawns the childcomputations.

Page 1

Granularity 3 Granularity 4 Granularity 5 Base

232000

234000

236000

238000

240000

242000

244000

246000

248000

Number of Polls

Polling Strategy

# o

f Po

lls (

x10

00

)

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 28 / 33

Evaluation Performance

Polling Granularity Effects – # of Aborts

We’re running our ported pousse code (with some instrumentation).

Granularity N doesn’t poll at the lowest N levels of the tree.

Base only checks abort in the loop that spawns the childcomputations.

Page 1

Granularity 3 Granularity 4 Granularity 5 Base

0

1000

2000

3000

4000

5000

6000

Number of Aborts

Polling Strategy

# o

f Ab

ort

s (x

10

00

)

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 29 / 33

Evaluation Complexity Porting Old Code

Outline

1 A Recipe for SpeculationPorting the ExampleImplementing abort

2 EvaluationPerformanceComplexity Porting Old Code

3 Conclusions

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 30 / 33

Evaluation Complexity Porting Old Code

Experience Porting Cilk Code

We ported Cilk Pousse 1 from cilk to cilk++.

LinesCilk Pousse 956Cilk++ Pousse 1011

Increase ≈5.07%

Pretty simple with the inlet translation.Most annoying part is adding the calls to poll.

This can use some tuning.

Code is a little more difficult to read, but not too bad.

The real problem is that this changes the interface.Need to pass around the abort object.If it is used with an inlet, need to add continuation.Whole-code transformation is bad.

1people.csail.mit.edu/pousse/Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 31 / 33

Conclusions

Outline

1 A Recipe for SpeculationPorting the ExampleImplementing abort

2 EvaluationPerformanceComplexity Porting Old Code

3 Conclusions

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 32 / 33

Conclusions

Speculations on Speculative Parallelism

Speculative parallelism is definitely implementable as a library.

Native runtime support might be more efficient/nicer.

Lack of some abstraction features makes using it at the high-levelrequire interface changes.

...but, using it locally at the leaves does not leak into the rest of thecode.

Ruben Perez & Gregory Malecha (MIT) Speculative Parallelism in Cilk++ May 11, 2010 33 / 33

top related