![Page 1: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/1.jpg)
Analyses and Optimizations for Multithreaded Programs
Martin Rinard, Alex Salcianu, Brian Demsky
MIT Laboratory for Computer Science
John Whaley
IBM Tokyo Research Laboratory
Motivation
• Threads are Ubiquitous
  • Parallel Programming for Performance
  • Manage Multiple Connections
  • System Structuring Mechanism
• Overhead
  • Thread Management
  • Synchronization
• Opportunities
  • Improved Memory Management
What This Talk is About
• New Abstraction: Parallel Interaction Graph
  • Points-To Information
  • Reachability and Escape Information
  • Interaction Information
    • Caller-Callee Interactions
    • Starter-Startee Interactions
  • Action Ordering Information
• Analysis Algorithm
• Analysis Uses (synchronization elimination, stack allocation, per-thread heap allocation)
Outline
• Example
• Analysis Representation and Algorithm
• Lightweight Threads
• Results
• Conclusion
Sum Sequence of Numbers
9 8 1 5 3 7 2 6
Group in Subsequences
(9 8) (1 5) (3 7) (2 6)
Sum Subsequences (in Parallel)
9 8 1 5 3 7 2 6
[Figure: each pair of adjacent numbers is summed in parallel, giving the partial sums 17, 6, 10, 8]
Add Sums Into Accumulator
9 8 1 5 3 7 2 6
[Figure sequence: the partial sums 17, 6, 10, 8 are added one at a time to a shared Accumulator, whose value goes 0, 17, 23, 33, 41]
Common Schema
• Set of tasks
• Chunk tasks to increase granularity
• Tasks have both
  • Independent computation
  • Updates to shared data
Realization in Java
    class Accumulator {
        int value = 0;

        synchronized void add(int v) {
            value += v;
        }
    }
Realization in Java
    class Task extends Thread {
        Vector work;
        Accumulator dest;

        Task(Vector w, Accumulator d) {
            work = w;
            dest = d;
        }

        public void run() {
            int sum = 0;
            Enumeration e = work.elements();
            while (e.hasMoreElements())
                sum += ((Integer) e.nextElement()).intValue();
            dest.add(sum);
        }
    }
[Figure: a Task object whose work field references a Vector (6, 2) and whose dest field references the Accumulator (0)]
[Figure: the same Task, Vector, and Accumulator objects, plus the Enumeration object that run() obtains from work.elements()]
Realization in Java
    void generateTask(int l, int u, Accumulator a) {
        Vector v = new Vector();
        for (int j = l; j < u; j++)
            v.addElement(new Integer(j));
        Task t = new Task(v, a);
        t.start();
    }

    void generate(int n, int m, Accumulator a) {
        // chunk i covers the integers in [i*m, (i+1)*m)
        for (int i = 0; i < n; i++)
            generateTask(i * m, (i + 1) * m, a);
    }
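The pieces above can be put together into a small runnable program. The following is a minimal sketch, not the authors' code: it assumes a hypothetical helper `sumInChunks` that takes the numbers as an `int[]` (the slides generate integer ranges instead), uses `List` in place of `Vector`, and joins every task so that the final value can be read deterministically. The class name `ParallelSumDemo` and the `get` method are also assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class ParallelSumDemo {
    // Shared data: every update goes through a synchronized method.
    static class Accumulator {
        private int value = 0;
        synchronized void add(int v) { value += v; }
        synchronized int get() { return value; }
    }

    // One task per chunk: independent computation, then one shared update.
    static class Task extends Thread {
        private final List<Integer> work;
        private final Accumulator dest;
        Task(List<Integer> w, Accumulator d) { work = w; dest = d; }
        @Override public void run() {
            int sum = 0;
            for (int x : work) sum += x;   // independent computation
            dest.add(sum);                 // update to shared data
        }
    }

    // Hypothetical driver: splits data into chunks of the given size
    // (chunk must be positive), starts one Task per chunk, and waits for all.
    static int sumInChunks(int[] data, int chunk) {
        Accumulator a = new Accumulator();
        List<Task> tasks = new ArrayList<>();
        for (int lo = 0; lo < data.length; lo += chunk) {
            List<Integer> work = new ArrayList<>();
            for (int j = lo; j < Math.min(lo + chunk, data.length); j++) work.add(data[j]);
            Task t = new Task(work, a);
            tasks.add(t);
            t.start();
        }
        for (Task t : tasks) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return a.get();
    }

    public static void main(String[] args) {
        // Same sequence as the example slides, grouped in pairs.
        System.out.println(sumInChunks(new int[] {9, 8, 1, 5, 3, 7, 2, 6}, 2)); // prints 41
    }
}
```

Each Task touches the shared Accumulator exactly once, which is the point of chunking: the synchronization cost is paid per chunk rather than per element.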
Task Generation
[Figure sequence: starting from the Accumulator (0), each call to generateTask fills a new Vector and creates a Task whose work field references that Vector and whose dest field references the shared Accumulator; three Tasks are shown, with Vectors (6, 2), (9, 8), and (5, 1)]
Analysis
Analysis Overview
• Interprocedural
• Interthread
• Flow-sensitive
  • Statement ordering within thread
  • Action ordering between threads
• Compositional, Bottom Up
• Explicitly Represent Potential Interactions Between Analyzed and Unanalyzed Parts
• Partial Program Analysis
Analysis Result for run Method
    public void run() {
        int sum = 0;
        Enumeration e = work.elements();
        while (e.hasMoreElements())
            sum += ((Integer) e.nextElement()).intValue();
        dest.add(sum);
    }
• Abstraction: Points-to Graph
  • Nodes Represent Objects
  • Edges Represent References
[Figure: points-to graph for run(): this references the Task node, whose work and dest fields reference the Vector and Accumulator nodes; an Enumeration node is also shown]
Analysis Result for run Method
• Inside Nodes
  • Objects Created Within Current Analysis Scope
  • One Inside Node Per Allocation Site
  • Represents All Objects Created At That Site
Analysis Result for run Method
• Outside Nodes
  • Objects Created Outside Current Analysis Scope
  • Objects Accessed Via References Created Outside Current Analysis Scope
Analysis Result for run Method
• Outside Nodes
  • One per Static Class Field
  • One per Parameter
  • One per Load Statement
    • Represents Objects Loaded at That Statement
Analysis Result for run Method
• Inside Edges
  • References Created Inside Current Analysis Scope
Analysis Result for run Method
• Outside Edges
  • References Created Outside Current Analysis Scope
  • Potential Interactions in Which Analyzed Part Reads Reference Created in Unanalyzed Part
Concept of Escaped Node
• Escaped Nodes Represent Objects Accessible Outside Current Analysis Scope
  • parameter nodes, load nodes
  • static class field nodes
  • nodes passed to unanalyzed methods
  • nodes reachable from unanalyzed but started threads
  • nodes reachable from escaped nodes
• Node is Captured if it is Not Escaped
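The last rule in the list makes the escaped set a reachability closure. As a rough sketch (not the analysis implementation), the graph can be treated as a map from node names to successor names, ignoring field labels, with escape computed by a worklist search from the root nodes. The class `EscapeClosure` and the node names `P_this`, `L_work`, `L_dest`, and `N_enum` are hypothetical labels for the parameter, load, and inside nodes of the run() example.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class EscapeClosure {
    // Returns every node reachable from the given escape roots (roots included).
    static Set<String> escaped(Map<String, Set<String>> edges, Set<String> roots) {
        Set<String> seen = new TreeSet<>(roots);
        Deque<String> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            String n = work.poll();
            for (String m : edges.getOrDefault(n, Collections.<String>emptySet()))
                if (seen.add(m)) work.add(m);   // newly escaped node
        }
        return seen;
    }

    // Illustrative graph for run(): the parameter node reaches both load nodes;
    // the Enumeration node points at the Vector but nothing escaped points at it.
    static String demo() {
        Map<String, Set<String>> edges = new TreeMap<>();
        edges.put("P_this", new TreeSet<>(Arrays.asList("L_work", "L_dest")));
        edges.put("N_enum", new TreeSet<>(Arrays.asList("L_work")));
        Set<String> roots = new TreeSet<>(Arrays.asList("P_this"));
        return escaped(edges, roots).toString();
    }
}
```

In this sketch `demo()` yields `[L_dest, L_work, P_this]`; `N_enum` is absent from the result, so it counts as captured.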
Why Escaped Concept is Important
• Completeness of Analysis Information
  • Complete information for captured nodes
  • Potentially incomplete for escaped nodes
• Lifetime Implications
  • Captured nodes are inaccessible when analyzed part of the program terminates
  • Memory Management Optimizations
    • Stack allocation
    • Per-Thread Heap Allocation
Intrathread Dataflow Analysis
• Computes a points-to escape graph for each program point
• Points-to escape graph is a triple <I,O,e>
  • I - set of inside edges
  • O - set of outside edges
  • e - escape information for each node
Dataflow Analysis
• Initial state:
  • $I$: formals point to parameter nodes, classes point to class nodes
  • $O = \emptyset$
• Transfer functions:
  • $I' = (I - \mathrm{Kill}_I) \cup \mathrm{Gen}_I$
  • $O' = O \cup \mathrm{Gen}_O$
• Confluence operator is $\cup$
Intraprocedural Analysis
• Must define transfer functions for:
  • copy statement $l = v$
  • load statement $l_1 = l_2.f$
  • store statement $l_1.f = l_2$
  • return statement return $l$
  • object creation site $l = \text{new } cl$
  • method invocation $l = l_0.op(l_1 \ldots l_k)$
copy statement $l = v$
$\mathrm{Kill}_I = \mathrm{edges}(I, l)$
$\mathrm{Gen}_I = \{l\} \times \mathrm{succ}(I, v)$
$I' = (I - \mathrm{Kill}_I) \cup \mathrm{Gen}_I$
[Figure: existing edges from l and v before the statement, then the generated edges from l to every node that v points to]
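The copy rule can be sketched in a few lines. This is an illustrative encoding, not the analysis implementation: it keeps only the variable edges of I as a map from variable names to sets of node names, and the class `CopyTransfer` and the names `n1`, `n2`, `n3` are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class CopyTransfer {
    // Variable edges of I: variable name -> names of the nodes it may point to.
    // Returns I' = (I - Kill_I) U Gen_I for the statement l = v; the input is not modified.
    static Map<String, Set<String>> copy(Map<String, Set<String>> in, String l, String v) {
        Map<String, Set<String>> out = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : in.entrySet())
            out.put(e.getKey(), new TreeSet<>(e.getValue()));
        // succ(I, v) is read from the input graph, so l = l is handled correctly.
        Set<String> succV = new TreeSet<>(in.getOrDefault(v, Collections.<String>emptySet()));
        out.put(l, succV);   // drops edges(I, l) and adds {l} x succ(I, v)
        return out;
    }

    static String demo() {
        Map<String, Set<String>> in = new TreeMap<>();
        in.put("l", new TreeSet<>(Arrays.asList("n1")));
        in.put("v", new TreeSet<>(Arrays.asList("n2", "n3")));
        return copy(in, "l", "v").toString();
    }
}
```

Here `demo()` returns `{l=[n2, n3], v=[n2, n3]}`: the old edge from l to n1 is killed and l now points wherever v does.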
load statement $l_1 = l_2.f$
$S_E = \{n_2 \in \mathrm{succ}(I, l_2) : \mathrm{escaped}(n_2)\}$
$S_I = \bigcup \{\mathrm{succ}(I, n_2, f) : n_2 \in \mathrm{succ}(I, l_2)\}$
case 1: $l_2$ does not point to an escaped node ($S_E = \emptyset$)
$\mathrm{Kill}_I = \mathrm{edges}(I, l_1)$
$\mathrm{Gen}_I = \{l_1\} \times S_I$
[Figure: existing edges from l1 and l2, then the generated edges from l1 to the nodes reached through f from the nodes that l2 points to]
load statement $l_1 = l_2.f$
case 2: $l_2$ does point to an escaped node ($S_E \neq \emptyset$)
$\mathrm{Kill}_I = \mathrm{edges}(I, l_1)$
$\mathrm{Gen}_I = \{l_1\} \times (S_I \cup \{n\})$
$\mathrm{Gen}_O = (S_E \times \{f\}) \times \{n\}$
where $n$ is the load node for $l_1 = l_2.f$
[Figure: existing edges from l1 and l2, then the generated edge from l1 to the load node n and the outside f edge from the escaped node to n]
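Both cases of the load rule fit in one method. The sketch below is an assumption-laden illustration rather than the analysis implementation: field edges are keyed by the string "node.f", the graph is updated in place, and the class `LoadTransfer` and all node names are hypothetical. Marking the load node as escaped follows the earlier list of escaped nodes, which includes load nodes.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class LoadTransfer {
    final Map<String, Set<String>> vars = new TreeMap<>();    // variable -> nodes (inside edges)
    final Map<String, Set<String>> inside = new TreeMap<>();  // "node.f" -> nodes (inside edges)
    final Map<String, Set<String>> outside = new TreeMap<>(); // "node.f" -> load nodes (outside edges)
    final Set<String> escaped = new TreeSet<>();

    // Transfer function for l1 = l2.f, where loadNode is the load node of this statement.
    void load(String l1, String l2, String f, String loadNode) {
        Set<String> sE = new TreeSet<>();   // escaped nodes that l2 points to
        Set<String> sI = new TreeSet<>();   // nodes reached via inside f edges
        for (String n2 : vars.getOrDefault(l2, Collections.<String>emptySet())) {
            if (escaped.contains(n2)) sE.add(n2);
            sI.addAll(inside.getOrDefault(n2 + "." + f, Collections.<String>emptySet()));
        }
        if (!sE.isEmpty()) {                // case 2: add outside edges to the load node
            for (String n2 : sE)
                outside.computeIfAbsent(n2 + "." + f, k -> new TreeSet<>()).add(loadNode);
            sI.add(loadNode);
            escaped.add(loadNode);          // load nodes count as escaped
        }
        vars.put(l1, sI);                   // kill edges(I, l1), then add {l1} x targets
    }

    static String demo() {
        LoadTransfer g = new LoadTransfer();
        // case 1: x points to a captured node N1 whose field f references N2
        g.vars.put("x", new TreeSet<>(Arrays.asList("N1")));
        g.inside.put("N1.f", new TreeSet<>(Arrays.asList("N2")));
        g.load("y", "x", "f", "L1");
        // case 2: this points to an escaped parameter node
        g.vars.put("this", new TreeSet<>(Arrays.asList("P_this")));
        g.escaped.add("P_this");
        g.load("w", "this", "work", "L_work");
        return g.vars + " " + g.outside;
    }
}
```

In case 1 the load node `L1` is never used, because every object that x can reference is fully described by inside edges. In case 2 the outside edge records that the unanalyzed part of the program may have written `P_this.work`.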
store statement $l_1.f = l_2$
$\mathrm{Gen}_I = (\mathrm{succ}(I, l_1) \times \{f\}) \times \mathrm{succ}(I, l_2)$
$I' = I \cup \mathrm{Gen}_I$
[Figure: existing edges from l1 and l2, then the generated f edges from each node that l1 points to to each node that l2 points to]
object creation site $l = \text{new } cl$
$\mathrm{Kill}_I = \mathrm{edges}(I, l)$
$\mathrm{Gen}_I = \{\langle l, n \rangle\}$
where $n$ is the inside node for $l = \text{new } cl$
[Figure: existing edges from l, then the generated edge from l to n]
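The store and object creation rules differ in one respect worth seeing side by side: creation kills the old edges from the variable, while a store only adds edges. A minimal sketch under the same assumptions as before (field edges keyed by "node.f", in-place updates, hypothetical class `StoreAndNew` and node names such as `N_Task`):

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class StoreAndNew {
    final Map<String, Set<String>> vars = new TreeMap<>();    // variable -> nodes
    final Map<String, Set<String>> inside = new TreeMap<>();  // "node.f" -> nodes

    // l = new cl: kill edges(I, l), generate the single edge <l, n>,
    // where insideNode names the inside node n of this allocation site.
    void create(String l, String insideNode) {
        vars.put(l, new TreeSet<>(Collections.singleton(insideNode)));
    }

    // l1.f = l2: I' = I U Gen_I, with no kill set, so existing f edges stay.
    void store(String l1, String f, String l2) {
        Set<String> targets = vars.getOrDefault(l2, Collections.<String>emptySet());
        for (String n1 : vars.getOrDefault(l1, Collections.<String>emptySet()))
            inside.computeIfAbsent(n1 + "." + f, k -> new TreeSet<>()).addAll(targets);
    }

    static String demo() {
        StoreAndNew g = new StoreAndNew();
        g.create("v", "N_Vector");
        g.create("t", "N_Task");
        g.store("t", "work", "v");     // N_Task.work now references N_Vector
        g.create("v", "N_Vector2");    // a second allocation site retargets v only
        return g.vars + " " + g.inside;
    }
}
```

`demo()` returns `{t=[N_Task], v=[N_Vector2]} {N_Task.work=[N_Vector]}`: reassigning v replaces its edge, but the field edge written by the earlier store is kept.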
Method Call
• Analysis of a method call:
  • Start with points-to escape graph before the call site
  • Retrieve the points-to escape graph from analysis of callee
  • Map outside nodes of callee graph to nodes of caller graph
  • Combine callee graph into caller graph
• Result is the points-to escape graph after the call site
Start With Graph Before Call
[Figure: points-to escape graph before the call to t = new Task(v,a), with variables v, t, a]
Retrieve Graph from Callee
[Figure: the caller graph next to the points-to escape graph from the analysis of Task(w,d), in which the work and dest fields of this reference the nodes for w and d]
Map Parameters from Callee to Caller
[Figure: callee parameter nodes this, w, d mapped to the caller nodes referenced by t, v, a]
Transfer Edges from Callee to Caller
[Figure: combined graph after the call, with work and dest edges added in the caller graph]
Discard Parameter Nodes from Callee
[Figure: combined graph after the call to t = new Task(v,a), containing v, t, a and the work and dest edges]
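For this constructor call, the mapping and edge transfer steps can be sketched as a projection of callee edges through a node mapping. This is a deliberately simplified illustration, not the algorithm from the talk: it handles only inside edges, takes the mapping as given instead of computing it by matching outside edges, and lets unmapped callee nodes stand for themselves. The class `CallMapping` and the node names (`P_this`, `P_w`, `P_d` for callee parameters; `N_Task`, `N_Vector`, `P_a` for caller nodes) are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class CallMapping {
    // Projects callee inside edges ("node.f" -> targets) into the caller's node space.
    static Map<String, Set<String>> project(Map<String, Set<String>> calleeInside,
                                            Map<String, Set<String>> mapping) {
        Map<String, Set<String>> result = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : calleeInside.entrySet()) {
            int dot = e.getKey().lastIndexOf('.');
            String src = e.getKey().substring(0, dot);
            String field = e.getKey().substring(dot + 1);
            for (String m1 : image(mapping, src))
                for (String n2 : e.getValue())
                    for (String m2 : image(mapping, n2))
                        result.computeIfAbsent(m1 + "." + field, k -> new TreeSet<>()).add(m2);
        }
        return result;   // callee parameter nodes no longer appear
    }

    // Simplifying assumption: a callee node without a mapping maps to itself.
    static Set<String> image(Map<String, Set<String>> mapping, String node) {
        return mapping.getOrDefault(node, Collections.singleton(node));
    }

    static String demo() {
        // Callee graph for Task(w, d): this.work = w; this.dest = d;
        Map<String, Set<String>> callee = new TreeMap<>();
        callee.put("P_this.work", new TreeSet<>(Arrays.asList("P_w")));
        callee.put("P_this.dest", new TreeSet<>(Arrays.asList("P_d")));
        // Formals mapped to the nodes the actuals t, v, a point to in the caller.
        Map<String, Set<String>> mapping = new TreeMap<>();
        mapping.put("P_this", new TreeSet<>(Arrays.asList("N_Task")));
        mapping.put("P_w", new TreeSet<>(Arrays.asList("N_Vector")));
        mapping.put("P_d", new TreeSet<>(Arrays.asList("P_a")));
        return project(callee, mapping).toString();
    }
}
```

`demo()` returns `{N_Task.dest=[P_a], N_Task.work=[N_Vector]}`, the two edges that appear in the combined graph once the callee's parameter nodes are discarded.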
More General Example
[Figure sequence: the points-to escape graph before the call to x.foo() shown beside the graph from the analysis of foo(), then the combined graph after the call]
• Initialize Mapping: Map Formals to Actuals
• Extend Mapping: Match Inside and Outside Edges
  • Mapping is Unidirectional, From Callee to Caller
• Complete Mapping: Automap Load and Inside Nodes Reachable from Mapped Nodes
• Combine Mapping: Project Edges from Callee Into Combined Graph
• Discard Callee Graph
• Discard Outside Edges From Captured Nodes
Interthread Analysis
• Augment Analysis Representation
  • Parallel Thread Set
  • Action Set (read, write, sync, create edge)
  • Action Ordering Information (relative to thread start actions)
• Thread Interaction Analysis
  • Combine points-to graphs
  • Induces combination of other information
• Can perform interthread analysis at any point to improve precision of results
![Page 63: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/63.jpg)
Points-to Escape Graphsometime after call to
x.start()
Points-to Escape Graphfrom analysis of
run()
Combining Points-to Graphs
x this
![Page 64: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/64.jpg)
Points-to Escape Graphsometime after call to
x.start()
Points-to Escape Graphfrom analysis of
run()
Initialize MappingMap Startee Thread to Starter
Thread
x this
![Page 65: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/65.jpg)
Points-to Escape Graphsometime after call to
x.start()
Points-to Escape Graphfrom analysis of
run()
Extend MappingMatch Inside and Outside Edges
x this
![Page 66: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/66.jpg)
Points-to Escape Graphsometime after call to
x.start()
Points-to Escape Graphfrom analysis of
run()
Extend MappingMatch Inside and Outside Edges
x this
![Page 67: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/67.jpg)
Points-to Escape Graphsometime after call to
x.start()
Points-to Escape Graphfrom analysis of
run()
Extend MappingMatch Inside and Outside Edges
x this
Mapping is BidirectionalFrom Startee to StarterFrom Starter to Startee
![Page 68: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/68.jpg)
Points-to Escape Graphsometime after call to
x.start()
Points-to Escape Graphfrom analysis of
run()
Complete Mapping Automap Load and Inside Nodes Reachable from Mapped Nodes
x this
![Page 69: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/69.jpg)
Combined Points-to Escape Graph sometime after call to
x.start()
Combine GraphsProject Edges Through Mappings Into
Combined Graph
x this
![Page 73: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/73.jpg)
Combined Points-to Escape Graph sometime after call to
x.start()
Discard Startee Thread Node
x this
![Page 74: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/74.jpg)
Combined Points-to Escape Graph sometime after call to
x.start()
Discard Startee Thread Node
x
![Page 75: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/75.jpg)
Combined Points-to Escape Graph sometime after call to
x.start()
Discard Outside Edges From Captured Nodes
x
![Page 76: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/76.jpg)
Life is not so Simple
• Dependences between phases
• Mapping best framed as constraint satisfaction problem
• Solved using constraint satisfaction algorithm
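To make the constraint view concrete, here is a minimal sketch that computes a mapping as a least fixed point. It is a simplified, one-directional illustration and not the algorithm from the talk: the class `MappingSketch`, the `Edge` type, the node names (`T`, `V`, `A`, `I`), and the field name `elem` are all assumptions introduced for this example. The single constraint is: if the startee has an outside edge n1 -f-> n2, the starter has an inside edge n3 -f-> n4, and n3 is in the mapping of n1, then n4 must be in the mapping of n2.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class MappingSketch {
    // Hypothetical edge type: source node, field name, target node.
    static final class Edge {
        final String src, field, dst;
        Edge(String src, String field, String dst) {
            this.src = src; this.field = field; this.dst = dst;
        }
    }

    // Least solution of the constraint described above, seeded with
    // mu(startee "this" node) containing the starter's thread node.
    static Map<String, Set<String>> solve(List<Edge> starteeOutside,
                                          List<Edge> starterInside,
                                          String starteeThis,
                                          String starterThreadNode) {
        Map<String, Set<String>> mu = new HashMap<>();
        mu.computeIfAbsent(starteeThis, k -> new TreeSet<>()).add(starterThreadNode);
        boolean changed = true;
        while (changed) { // repeat until no constraint adds a new pair
            changed = false;
            for (Edge out : starteeOutside) {
                Set<String> images = mu.getOrDefault(out.src, Collections.<String>emptySet());
                for (Edge in : starterInside) {
                    if (in.field.equals(out.field) && images.contains(in.src)) {
                        changed |= mu.computeIfAbsent(out.dst, k -> new TreeSet<>()).add(in.dst);
                    }
                }
            }
        }
        return mu;
    }

    // Assumed example graphs: the startee loads this.work, this.dest, and work.elem.
    static Map<String, Set<String>> example() {
        List<Edge> starteeOutside = Arrays.asList(
            new Edge("1", "work", "2"), new Edge("1", "dest", "5"), new Edge("2", "elem", "3"));
        List<Edge> starterInside = Arrays.asList(
            new Edge("T", "work", "V"), new Edge("T", "dest", "A"), new Edge("V", "elem", "I"));
        return solve(starteeOutside, starterInside, "1", "T");
    }

    public static void main(String[] args) {
        System.out.println(new TreeMap<>(example())); // prints {1=[T], 2=[V], 3=[I], 5=[A]}
    }
}
```

The mapping of node 3 depends on the mapping of node 2, which depends on the seed. This is the kind of dependence between phases that makes a simple one-pass scheme insufficient.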
![Page 77: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/77.jpg)
Interthread Analysis With Actions and Ordering
![Page 78: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/78.jpg)
Analysis Result for generateTask
[Figure: points-to graph with variable t; nodes a, b, c, d, e; object classes Task, Vector, Accumulator; field edges work, dest]
Parallel Threads: a
Actions: wr a; wr b; wr c; wr d; sync b; rd b
Action Ordering: “All actions happen before thread a starts executing”
![Page 79: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/79.jpg)
Analysis Result for run
[Figure: points-to graph with variable this; nodes 1 through 6; object classes Task, Vector, Accumulator, Enumeration; field edges work, dest]
Parallel Threads: no parallel threads
Actions: rd 1; rd 2; rd 3; rd 4; rd 5; wr 5; rd 6; wr 6; sync 2; sync 5; edge(1,2); edge(1,5); edge(2,3); edge(3,4)
Action Ordering: none
![Page 80: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/80.jpg)
Role of edge(1,2) Actions
• One edge action for each outside edge
• Action order for edge actions improves precision of interthread analysis
• If starter thread reads a reference before startee thread is started
• Then reference was not created by startee thread
• Outside edge actions record order
• Inside edges from startee matched only against parallel outside edges
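The matching rule in the last bullet can be sketched as a filter. This is an illustrative model only: the class `EdgeOrderSketch`, the `OutsideEdge` and `InsideEdge` types, the `parallelWithStartee` flag, and the node and field names are assumptions, and the nodes are assumed to have already been mapped into one shared name space.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class EdgeOrderSketch {
    // A load by the starter thread (outside edge src.field), tagged with the order
    // recorded for its edge action: true if it may run in parallel with the startee,
    // false if it is known to happen before the startee starts.
    static final class OutsideEdge {
        final String src, field;
        final boolean parallelWithStartee;
        OutsideEdge(String src, String field, boolean parallelWithStartee) {
            this.src = src; this.field = field; this.parallelWithStartee = parallelWithStartee;
        }
    }

    // A reference created by the startee thread (inside edge src.field -> dst).
    static final class InsideEdge {
        final String src, field, dst;
        InsideEdge(String src, String field, String dst) {
            this.src = src; this.field = field; this.dst = dst;
        }
    }

    // Lists which starter loads may observe which startee-created references.
    static List<String> match(List<OutsideEdge> loads, List<InsideEdge> writes) {
        List<String> result = new ArrayList<>();
        for (OutsideEdge load : loads) {
            if (!load.parallelWithStartee) {
                continue; // read before start: the reference was not created by the startee
            }
            for (InsideEdge w : writes) {
                if (w.src.equals(load.src) && w.field.equals(load.field)) {
                    result.add(load.src + "." + load.field + " -> " + w.dst);
                }
            }
        }
        return result;
    }

    static List<String> example() {
        List<OutsideEdge> loads = Arrays.asList(
            new OutsideEdge("1", "f", false),  // read before the startee is started
            new OutsideEdge("1", "g", true));  // read while the startee may be running
        List<InsideEdge> writes = Arrays.asList(
            new InsideEdge("1", "f", "7"), new InsideEdge("1", "g", "8"));
        return match(loads, writes);
    }

    public static void main(String[] args) {
        System.out.println(example()); // prints [1.g -> 8]
    }
}
```

Without the ordering flag, the load of `1.f` would also be matched against the startee's write, which adds an edge to the combined graph that cannot occur in any execution.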
![Page 81: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/81.jpg)
Edge Actions in Combining Points-to Graphs
Points-to Escape Graph sometime after call to x.start()
Points-to Escape Graph from analysis of run()
[Figure: nodes 1, 2, 3; variables x and this]
Action Ordering: edge(1,2) || 1
![Page 82: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/82.jpg)
Edge Actions in Combining Points-to Graphs
Points-to Escape Graph sometime after call to x.start()
Points-to Escape Graph from analysis of run()
[Figure: nodes 1, 2, 3; variables x and this]
Action Ordering: none (i.e., edge(1,2) created before 1 started)
![Page 83: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/83.jpg)
Analysis Result After Interaction
[Figure: points-to graph with variable t; nodes a, b, c, d, e; object classes Task, Vector, Accumulator; field edges work, dest]
Parallel Threads: a
Actions: wr a; wr b; wr c; wr d; sync b; rd b; rd a, a; rd b, a; rd c, a; rd d, a; rd e, a; wr e, a; sync b, a; sync e, a
Action Ordering: “All actions from current thread happen before thread a starts executing”
![Page 84: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/84.jpg)
Roles of Intrathread and Interthread Analyses
• Basic Analysis
• Intrathread analysis delivers parallel interaction graph at each program point
•records parallel threads
•does not compute thread interaction
• Choose program point (end of method)
• Interthread analysis delivers additional precision at that program point
• Does not exploit ordering information from thread join constructs
![Page 85: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/85.jpg)
Join Ordering
t = new Task();
t.start();
“computation that runs in parallel with task t”
t.join();
“computation that runs after task t”
t.run(); “computation from task t”
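The three regions in this picture can be shown in a small runnable program. This is a sketch under stated assumptions: the class `JoinOrderingSketch`, the `result` field, and the arithmetic are invented for illustration; only `Task`, `start()`, `join()`, and `run()` come from the slide.

```java
final class JoinOrderingSketch {
    static final class Task extends Thread {
        int result; // written only by the task thread

        @Override
        public void run() {
            result = 6 * 7; // "computation from task t"
        }
    }

    static int runOnce() {
        Task t = new Task();
        t.start();
        int local = 1; // "computation that runs in parallel with task t"; does not touch t.result
        try {
            t.join();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
        // "computation that runs after task t": run() has completed,
        // so this read is ordered after the write in run().
        return t.result + local;
    }

    public static void main(String[] args) {
        System.out.println(runOnce()); // prints 43
    }
}
```

Only the code between `start()` and `join()` can interact in parallel with the task. Everything after `join()` is ordered after the whole of `run()`.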
![Page 86: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/86.jpg)
Exploiting Join Ordering
• At join point
• Interthread analysis delivers new (more precise) parallel interaction graph
• Intrathread analysis uses new graph
• No parallel interactions between
• Thread
• Computation after join
![Page 87: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/87.jpg)
Extensions
• Partial program analysis
• can analyze method independent of callers
• can analyze method independent of methods it invokes
• can incrementally analyze callees to improve precision
• Dial down precision to improve efficiency
• Demand-driven formulations
![Page 88: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/88.jpg)
Key Ideas
• Explicitly represent potential interactions between analyzed and unanalyzed parts
• Inside versus outside nodes and edges
• Escaped versus captured nodes
• Precisely bound ignorance
• Exploit ordering information
• intrathread (flow sensitive)
• interthread (starts, edge orders, joins)
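One way to picture the escaped versus captured distinction is as graph reachability: a node is treated as escaped if it is reachable from something the unanalyzed part of the program can see, and as captured otherwise. The sketch below is a simplified illustration, not the talk's definition. The class `EscapeSketch`, the choice of escape roots, and the edge from node 6 to node 2 are assumptions; the other edges follow the edge(1,2), edge(1,5), edge(2,3), edge(3,4) actions listed for run.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class EscapeSketch {
    // Returns the nodes that are not reachable from any escape root
    // (for example parameters, static fields, or started thread objects).
    static Set<String> captured(Map<String, List<String>> edges,
                                Set<String> allNodes,
                                Set<String> escapeRoots) {
        Set<String> escaped = new HashSet<>(escapeRoots);
        Deque<String> work = new ArrayDeque<>(escapeRoots);
        while (!work.isEmpty()) {
            String n = work.pop();
            for (String m : edges.getOrDefault(n, Collections.<String>emptyList())) {
                if (escaped.add(m)) {
                    work.push(m); // first time seen: explore its successors too
                }
            }
        }
        Set<String> result = new TreeSet<>(allNodes);
        result.removeAll(escaped);
        return result;
    }

    static Set<String> example() {
        Map<String, List<String>> edges = new HashMap<>();
        edges.put("1", Arrays.asList("2", "5"));
        edges.put("2", Arrays.asList("3"));
        edges.put("3", Arrays.asList("4"));
        edges.put("6", Arrays.asList("2")); // assumed: node 6 refers to node 2
        Set<String> all = new TreeSet<>(Arrays.asList("1", "2", "3", "4", "5", "6"));
        Set<String> roots = new TreeSet<>(Arrays.asList("1")); // node 1 is the receiver (this)
        return captured(edges, all, roots);
    }

    public static void main(String[] args) {
        System.out.println(example()); // prints [6]
    }
}
```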
![Page 89: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/89.jpg)
Analysis Uses
Overheads in Standard Execution and How to Eliminate Them
![Page 90: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/90.jpg)
Intrathread Analysis Result from End of run Method
[Figure: points-to graph with variable this; nodes 1 through 6; object classes Task, Vector, Accumulator, Enumeration; field edges work, dest]
•Enumeration object is captured
•Does not escape to caller
•Does not escape to parallel threads
•Lifetime of Enumeration object is bounded by lifetime of run
•Can allocate Enumeration object on call stack instead of heap
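The source pattern behind this result might look like the sketch below. The class and field names `Task`, `Accumulator`, `Vector`, `Enumeration`, `work`, and `dest` appear in the figure, but the method bodies, the `sum` helper, and the wrapper class `StackAllocSketch` are a hypothetical reconstruction and not code taken from the talk.

```java
import java.util.Enumeration;
import java.util.Vector;

final class StackAllocSketch {
    static final class Accumulator {
        private int value;
        synchronized void add(int v) { value += v; }
        synchronized int get() { return value; }
    }

    static final class Task extends Thread {
        private final Vector<Integer> work;
        private final Accumulator dest;

        Task(Vector<Integer> work, Accumulator dest) {
            this.work = work;
            this.dest = dest;
        }

        @Override
        public void run() {
            // The Enumeration is created here and is never stored in a field,
            // returned, or handed to another thread, so nothing can reach it
            // once run() returns. That makes it a stack allocation candidate.
            Enumeration<Integer> e = work.elements();
            while (e.hasMoreElements()) {
                dest.add(e.nextElement());
            }
        }
    }

    // Hypothetical driver: sums the integers in [lo, hi) on a separate thread.
    static int sum(int lo, int hi) {
        Vector<Integer> v = new Vector<>();
        for (int j = lo; j < hi; j++) {
            v.addElement(j);
        }
        Accumulator a = new Accumulator();
        Task t = new Task(v, a);
        t.start();
        try {
            t.join();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
        return a.get();
    }

    public static void main(String[] args) {
        System.out.println(sum(0, 5)); // prints 10
    }
}
```

A standard Java compiler still allocates the Enumeration on the heap; the point is only that an analysis of this shape of code could prove the stack placement safe.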
![Page 91: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/91.jpg)
Interthread Analysis Result from End of generateTask Method
[Figure: points-to graph with variable t; nodes a, b, c, d, e; object classes Task, Vector, Accumulator; field edges work, dest]
Parallel Threads: a
Actions: wr a; wr b; wr c; wr d; sync b; rd b; rd a, a; rd b, a; rd c, a; rd d, a; rd e, a; wr e, a; sync b, a; sync e, a
Action Ordering: “All actions from current thread happen before thread a starts executing”
•Vector object is captured
•Multiple threads synchronize on Vector object
•But synchronizations from different threads do not occur concurrently
•Can eliminate synchronization on Vector object
![Page 92: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/92.jpg)
Interthread Analysis Result from End of generateTask Method
[Figure: points-to graph with variable t; nodes a, b, c, d, e; object classes Task, Vector, Accumulator; field edges work, dest]
Parallel Threads: a
Actions: wr a; wr b; wr c; wr d; sync b; rd b; rd a, a; rd b, a; rd c, a; rd d, a; rd e, a; wr e, a; sync b, a; sync e, a
Action Ordering: “All actions from current thread happen before thread a starts executing”
•Vectors, Tasks, Integers captured
•Parent, child access objects
•Parent completes accesses before child starts accesses
•Can allocate objects on child’s per-thread heap
![Page 93: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/93.jpg)
Thread Overhead
• Inefficient Thread Implementations
• Thread Creation Overhead
• Thread Management Overhead
• Stack Overhead
• Use a more efficient thread implementation
• User-level thread management
• Per-thread heaps
• Event-driven form
![Page 94: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/94.jpg)
Standard Thread Implementation
[Figure: thread stack with two call frames, each holding a return address and a frame pointer; locals x, y, a, b, c]
•Call frames allocated on stack
•Context Switch
• Save state on stack
• Resume another thread
•One stack per thread
![Page 95: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/95.jpg)
Standard Thread Implementation
[Figure: thread stack with two call frames, each holding a return address and a frame pointer; locals x, y, a, b, c; a save area on the stack]
•Call frames allocated on stack
•Context Switch
• Save state on stack
• Resume another thread
•One stack per thread
![Page 96: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/96.jpg)
Event-Driven Form
[Figure: stack with two call frames, each holding a return address and a frame pointer; locals x, y, a, b, c; heap continuations holding the live variables c and x, each with a resume method]
•Call frames allocated on stack
•Context Switch
• Build continuation on heap
• Copy out live variables
• Return out of computation
• Resume another continuation
•One stack per processor
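The context switch steps above can be sketched by hand in plain Java. The talk describes a compiler transformation; this example only shows the target shape. The `Continuation` interface, the `Counter` class, the scheduler loop, and the class `EventDrivenSketch` are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

final class EventDrivenSketch {
    // A heap-allocated continuation: it holds the live variables of a suspended computation.
    interface Continuation {
        // Runs until the next switch point; returns the next continuation, or null when done.
        Continuation resume(List<String> log);
    }

    // One logical thread that counts from i up to limit and switches after every step.
    static final class Counter implements Continuation {
        private final String name;
        private final int i;
        private final int limit;

        Counter(String name, int i, int limit) {
            this.name = name;
            this.i = i;
            this.limit = limit;
        }

        @Override
        public Continuation resume(List<String> log) {
            log.add(name + i);
            // Context switch: copy the live variables into a new continuation
            // and return out of the computation.
            return (i + 1 < limit) ? new Counter(name, i + 1, limit) : null;
        }
    }

    // A single scheduler loop (one stack) interleaves all logical threads.
    static List<String> run(List<Continuation> initial) {
        Deque<Continuation> ready = new ArrayDeque<>(initial);
        List<String> log = new ArrayList<>();
        while (!ready.isEmpty()) {
            Continuation next = ready.removeFirst().resume(log);
            if (next != null) {
                ready.addLast(next);
            }
        }
        return log;
    }

    static List<String> example() {
        List<Continuation> initial = Arrays.<Continuation>asList(
            new Counter("a", 0, 2), new Counter("b", 0, 2));
        return run(initial);
    }

    public static void main(String[] args) {
        System.out.println(example()); // prints [a0, b0, a1, b1]
    }
}
```

Because each logical thread returns to the scheduler instead of blocking, no per-thread stack is needed; only the small continuation objects live on the heap between steps.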
![Page 97: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/97.jpg)
Complications
• Standard thread models use blocking I/O
• Automatically convert blocking I/O to asynchronous I/O
• Scheduler manages interleaving of thread executions
• Stack Allocatable Objects May Be Live Across Blocking Calls
• Transfer allocation to per-thread heap
![Page 98: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/98.jpg)
Opportunity
• On a uniprocessor, compiler controls placement of context switch points
• If program does not hold lock across blocking call, can eliminate lock
![Page 99: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/99.jpg)
Experimental Results
• MIT Flex Compiler System
• Static Compiler
• Native code for StrongARM
• Server Benchmarks
• http, phone, echo, time
• Scientific Computing Benchmarks
• water, barnes
![Page 100: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/100.jpg)
Server Benchmark Characteristics
| Benchmark | IR Size (instrs) | Number of Methods | Pre Analysis Time (secs) | IntraThread Analysis Time (secs) | InterThread Analysis Time (secs) |
|---|---|---|---|---|---|
| echo | 4,639 | 131 | 28 | 74 | 73 |
| time | 4,573 | 136 | 29 | 70 | 74 |
| http | 10,643 | 292 | 103 | 199 | 269 |
| phone | 9,547 | 267 | 75 | 191 | 256 |
![Page 101: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/101.jpg)
Percentage of Eliminated Synchronization Operations
[Bar chart: y-axis from 0 to 100 percent; benchmarks http, phone, time, echo, mtrt; series: Intrathread only, Interthread]
![Page 102: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/102.jpg)
Compilation Options for Performance Results
• Standard
• kernel threads, synch included
• Event-Driven
• event-driven, no synch at all
• +Per-Thread Heap
• event-driven, no synch at all, per-thread heap allocation
![Page 103: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/103.jpg)
Throughput (Responses per Second)
[Bar chart: y-axis from 0 to 400 responses per second; benchmarks echo, time, http 2K, http 20K, phone; series: Standard, Event-Driven, +Per-Thread Heap]
![Page 104: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/104.jpg)
Scientific Benchmark Characteristics

| Benchmark | IR Size (instrs) | Number of Methods | Pre Analysis Time (secs) | Total Analysis Time (secs) |
|---|---|---|---|---|
| water | 25,583 | 335 | 380 | 1156 |
| barnes | 19,764 | 364 | 129 | 491 |
![Page 105: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/105.jpg)
Compiler Options
0: Sequential C++
1: Baseline - Kernel Threads
2: Lightweight Threads
3: Lightweight Threads + Stack Allocation
4: Lightweight Threads + Stack Allocation - Synchronization
![Page 106: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/106.jpg)
Execution Times
[Bar chart: y-axis is Proportion of Sequential C++ Execution Time, from 0 to 1; benchmarks water small, water, barnes; series: Baseline, +Light, +Stack, -Synch]
![Page 107: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/107.jpg)
Related Work
• Pointer Analysis for Sequential Programs
• Chatterjee, Ryder, Landi (POPL 99)
• Sathyanathan & Lam (LCPC 96)
• Steensgaard (POPL 96)
• Wilson & Lam (PLDI 95)
• Emami, Ghiya, Hendren (PLDI 94)
• Choi, Burke, Carini (POPL 93)
![Page 108: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/108.jpg)
Related Work
• Pointer Analysis for Multithreaded Programs
• Rugina and Rinard (PLDI 99) (fork-join parallelism, not compositional)
• We have extended our points-to analysis for multithreaded programs (irregular, thread-based concurrency, compositional)
• Escape Analysis
• Blanchet (POPL 98)
• Deutsch (POPL 90, POPL 97)
• Park & Goldberg (PLDI 92)
![Page 109: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/109.jpg)
Related Work
• Synchronization Optimizations
• Diniz & Rinard (LCPC 96, POPL 97)
• Plevyak, Zhang, Chien (POPL 95)
• Aldrich, Chambers, Sirer, Eggers (SAS 99)
• Blanchet (OOPSLA 99)
• Bogda, Hoelzle (OOPSLA 99)
• Choi, Gupta, Serrano, Sreedhar, Midkiff (OOPSLA 99)
• Ruf (PLDI 00)
![Page 110: Analyses and Optimizations for Multithreaded Programs Martin Rinard, Alex Salcianu, Brian Demsky MIT Laboratory for Computer Science John Whaley IBM Tokyo](https://reader035.vdocuments.net/reader035/viewer/2022062516/56649e465503460f94b3b3b8/html5/thumbnails/110.jpg)
Conclusion
• New Analysis Algorithm
• Flow-sensitive, compositional
• Multithreaded programs
• Explicitly represent interactions between analyzed and unanalyzed parts
• Analysis Uses
• Synchronization elimination
• Stack allocation
• Per-thread heap allocation
• Lightweight Threads