faster work–stealing with return barriers€¦ · • higher steal rate correlates to high steal...
TRANSCRIPT
-
Vivek Kumar, Stephen M Blackburn The Australian National University
Faster Work–Stealing With Return Barriers
-
• Commodity processors with parallel execution abilities
• A fundamental turn toward concurrency in software
2
The New Era of Computing
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
Background
-
• Modern hardware requires s/w parallelism • Software parallelism difficult to identify, expose
– Hard coded optimizations may get you there…
• Hard to realize potential of modern processors
The Challenge
Goal: performance and productivity
3
3
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
Background
-
• Language based features to expose parallelism – Dynamic task parallelism – Work–stealing scheduler
• A runtime to hide the hardware complexities
4
Options ?
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
Background
-
Contributions ✔ In-depth analysis
Of overheads associated with stealing tasks
✔ A new design Simple extension to JVM re-using old idea
✔ Detailed performance study Using standard work-stealing benchmarks
✔ Results That show we can significantly reduce the tasks stealing overhead
5
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
-
Understanding Work–Stealing
6
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
-
Work–Stealing
7
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
-
Work–Stealing
8
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
-
Work–Stealing
9
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
-
Work–Stealing
10
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
-
Initiation
Termination
State Management
Work–Stealing
11
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
-
12
Our Prior Work
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
-
OOPSLA 2012
13
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
Our Prior Work
-
• Sequential overheads – Initiation – State management – Code restructuring
• Exploit existing JVM mechanisms – Initiation: Execution stack for steal initiation – State management: Extract state from stack & registers – Code restructuring: Try–catch blocks for control flow
• Eliminated most sequential overheads
Fork–Join: 200% X10: 400%
12%
Eliminating Sequential Overheads
14
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
Our Prior Work
-
Motivating Analysis
15
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
-
16
Methodology
• Benchmarks – Jacobi – LU Decomposition – Heat Diffusion
• High steal ratio
• Hardware Platform – 2 Intel Xeon E7530
• 6 cores each
• Software Platform – Jikes RVM (3.1.2)
16
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
Motivating Analysis
-
1.00E-06
1.00E-05
1.00E-04
1.00E-03
1.00E-02
1.00E-01
1.00E+00
2 3 4 5 6 7 8 9 10 11 12
Stea
ls/T
ask
Threads
Heat Jacobi LUD
Measured using JavaWS (Try–Catch)
17
Steals:Task Ratio
17
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
Motivating Analysis
-
0
500
1000
1500
2000
2500
3000
3500
2 3 4 5 6 7 8 9 10 11 12
Stea
ls/s
econ
d
Threads
Heat Jacobi LUD
18
Steal Rate
18
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
Motivating Analysis
Measured using JavaWS (Try–Catch)
-
Insight
• Steal ratio and steal rate not correlated
19
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
Motivating Analysis
-
0
2
4
6
8
10
2 3 4 5 6 7 8 9 10 11 12
Vict
im W
ait T
ime
(% C
PU c
ycle
s)
Threads
Heat Jacobi LUD
20
Steal Overhead
20
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
Motivating Analysis
Measured using JavaWS (Try–Catch)
-
Insights
• Steal ratio and steal rate not correlated • Higher steal rate correlates to high steal overhead
21
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
Motivating Analysis
-
Our Approach
22
22
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
-
Reducing Stealing Overhead
• Two pronged attack – Reduce the cost of each steal
• Return barriers – Reduce total number of steal events
• Steal more than one continuation at a time
23
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
Our Approach
-
Implementation
24
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
-
25
Return Barrier • Allows runtime to intercept a common event • Hijack a return and bridge to some other method • Register and stack state preserved
E
D
C
B
A
TOP
BASE
Sta
ck G
row
th D
irect
ion Frame
Pointer Return
Address
Frame C Method C Frame C trampoline method
Frame D
Hijacked Frame Pointer
Hijacked Return
Address
Frame C Method C
trampoline method
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
Implementation
-
26
E
D
C
B
A
TOP
BASE
Sta
ck G
row
th D
irect
ion
Yieldpoint Mechanism
A
Thief Installs Return Barrier
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
Implementation
-
27
D
C
B
A
TOP
BASE
Sta
ck G
row
th D
irect
ion
A
Victim Moves The Return Barrier
D
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
Implementation
-
28
D
C
B
TOP
BASE
Sta
ck G
row
th D
irect
ion
A
Robbing A Victim With Return Barrier
D
B
Yieldpoint Mechanism
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
Implementation
-
Performance Evaluation
29
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
-
Vict
im w
ait t
ime
CP
U c
ycle
s (%
)
Threads
Jacobi 30
0
2
4
6
8
10
2 3 4 5 6 7 8 9 10 11 12 0
2
4
6
8
10
12
2 3 4 5 6 7 8 9 10 11 12 Threads
Spe
edup
Ove
r Seq
uent
ial
No Return Barrier With Return Barrier
30
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
Evaluation
-
Vict
im w
ait t
ime
CP
U c
ycle
s (%
)
Threads
LUD 31
0
2
4
6
8
10
2 3 4 5 6 7 8 9 10 11 12 0
2
4
6
8
10
12
2 3 4 5 6 7 8 9 10 11 12 Threads
Spe
edup
Ove
r Seq
uent
ial
No Return Barrier With Return Barrier
31
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
Evaluation
-
Summary
• Steal overhead dominated by steal rate • Two pronged attack
– Reduce the cost of each steal • Return barriers
– Reduce total number of steals • Steal more than one
• No change in speedup
32
58%
32
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
Summary
-
Future Work
• Steal overhead dominated by steal rate • Two pronged attack
– Reduce the cost of each steal • Return barriers
– Reduce total number of steals • Steal more than one
• No change in speedup • Merge both the techniques
33
58%
?%
?
Faster Work-Stealing With Return Barriers | Kumar & Blackburn | VMIL’12
Future Work