Sprint: Speculative Prefetching of Remote Data
Arun Raman, Princeton University
Greta Yorsh, ARM, UK
Martin Vechev, IBM Research - Zurich
Eran Yahav, Technion, Israel
Acknowledgments: Nick Mitchell and Mark Wegman, IBM Research
Client and Datasource
• Remote Processing (2 sec)
• Local Processing (2 sec)
• Network Latency (16 sec)
Techniques shown:
• Prefetching, Caching
• Async, Batching
• Query Planning, Cache Optimization
IBM Yellow Pages Application
[Figure: employee hierarchy with nodes Ted, Chandra, Walter, Heywood, Frank, Rama, Ralph, Dimitri, David]
Remote Access Latency
Sprint (Our Technique)
• Compiler: Expose Parallelism across remote accesses
• Execution engine: Optimize remote accesses
Node build(String email) {
  Employee emp = getEmployee(email);   // remote access
  if (emp == null) return null;
  Node root = new Node(emp);
  numNodes++;
  for (String reportee_email : emp.getReportees()) {
    Node child = build(reportee_email);
    if (child != null) {
      root.addToList(child);
      child.setParent(root);
    }
  }
  return root;
}
Legend: Remote Dependency, Local Dependency, Remote Access
IBM Yellow Pages Application
[Figure: employee hierarchy with nodes Ted, Chandra, Walter, Heywood, Frank, Rama, Ralph, Dimitri, David, shown twice]
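[Figure: a Program takes input, accesses the Remote Data-source, and produces output]
[Figure: with Sprint, the input goes to both the Optimist (prefetcher program) and the Pessimist (original program); the Sprint execution engine and its cache sit in front of the Remote Data-source, and output is produced]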
Compiler Transformations
Optimist (prefetcher program)
• Parallelization
• Memory Protection
• Output Protection
Pessimist (original program)
• Initiating the Optimist
• Deadlock Avoidance
IBM Yellow Pages Application
build(K) {
  V = get(K)
  for (k in V.keys)
    build(k)
}
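The abstract build(K) above makes one remote access per key, and each access only depends on the parent's value. A minimal sketch of how a prefetcher could exploit that is shown below. It is not the Sprint compiler's output: the `REMOTE` dictionary, the `prefetch` helper, and the level-by-level traversal are all assumptions made for illustration, with `get` standing in for a real remote call.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical datasource: key -> child keys (plays the role of V.keys).
REMOTE = {"K0": ["K1", "K2"], "K1": ["K3", "K4"], "K2": [], "K3": [], "K4": []}


def get(key):
    """Stand-in for a remote access; a real one would pay network latency."""
    return REMOTE[key]


def build(key, lookup=get):
    """Sequential build(K): visits keys depth-first, one access at a time."""
    order = [key]
    for child in lookup(key):
        order.extend(build(child, lookup))
    return order


def prefetch(root, workers=3):
    """Optimist-style pass: issue all accesses of one tree level in parallel."""
    cache = {}
    frontier = [root]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while frontier:
            values = list(pool.map(get, frontier))  # one parallel wave
            cache.update(zip(frontier, values))
            frontier = [k for v in values for k in v if k not in cache]
    return cache


cache = prefetch("K0")
# The original program can then run against the cache instead of the network.
result = build("K0", lookup=cache.__getitem__)
```

In this sketch the sequential program is unchanged apart from where it looks values up, which mirrors the Optimist and Pessimist split in the slides only loosely.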
Sprint Cache
Columns: Key, Value, St
St (State): Absent (A), Present (P), or Issued (I)
Initial state: every entry is Absent (A)
[Figure: Pessimist and Optimist running build(K) on Core 0, Core 1, and Core 2]
[Animation: after launch, Pessimist and Optimist both run build(K0) across Core 0, Core 1, and Core 2, over time steps t0 to t3]
Calls shown: build(K0) to build(K4), each with its get(K0) to get(K4)
Sprint Cache (Key, Value, St; St (State): Absent, Present, or Issued), entries over time:
• K0: I, then P with value V0
• K1, K2: I, then P with values V1, V2
• K3, K4: I, then P with values V3, V4
Events shown: WAIT (twice), HIT! (twice)
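The three cache states (Absent, Issued, Present) and the WAIT and HIT events can be read as a small state machine. The sketch below is one possible reading, not the Sprint engine's implementation. The class name `SprintCacheSketch` and the rule that a lookup on an Absent key counts as a miss and marks the key Issued are assumptions inferred from the state names and from the Hits, Waits, and Misses columns in the cache statistics.

```python
from enum import Enum


class St(Enum):
    ABSENT = "A"
    ISSUED = "I"
    PRESENT = "P"


class SprintCacheSketch:
    """Hypothetical single-threaded model of the Key / Value / St table."""

    def __init__(self):
        self.state = {}  # key -> St; keys not listed are Absent
        self.value = {}  # key -> value, filled in when a request completes

    def lookup(self, key):
        st = self.state.get(key, St.ABSENT)
        if st is St.PRESENT:
            return ("HIT", self.value[key])
        if st is St.ISSUED:
            return ("WAIT", None)  # a request is already in flight
        self.state[key] = St.ISSUED  # assumed: first access issues the request
        return ("MISS", None)

    def complete(self, key, value):
        """Called when the remote response for `key` arrives."""
        self.state[key] = St.PRESENT
        self.value[key] = value


c = SprintCacheSketch()
first = c.lookup("K0")    # Absent -> Issued
second = c.lookup("K0")   # still in flight
c.complete("K0", "V0")    # Issued -> Present
third = c.lookup("K0")    # served from the cache
```

A real engine would block the waiting thread rather than return a `"WAIT"` tag; returning a tag keeps the sketch deterministic and easy to follow.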
Client and Datasource
Original Execution: Local Processing (2 sec), Network Latency (16 sec), Remote Processing (2 sec)
Sprint-ed Execution: segments of (2 sec), (2 sec), and (3 sec)
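As a rough worked example of the comparison above, the segment times can simply be added up. This assumes the segments run back to back and that the three Sprint-ed segments are the whole Sprint-ed execution; the slide does not state either, and the transcript does not say which Sprint-ed segment is which.

```python
# Segment times in seconds, as they appear on the slide.
original_segments = {"local": 2, "network": 16, "remote": 2}
sprinted_segments = [2, 2, 3]  # labels not recoverable from the transcript

# Assumption: segments are sequential, so totals are plain sums.
total_original = sum(original_segments.values())
total_sprinted = sum(sprinted_segments)
speedup = total_original / total_sprinted

print(f"original: {total_original} s, Sprint-ed: {total_sprinted} s, "
      f"ratio: {speedup:.2f}x")
```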
In the paper:
• Batching optimization
• Task prioritization optimization
• Data access processing algorithm
• Data consistency with remote updates
• Correctness proof
Datasources
• IBM’s Yellow Pages Web Service
• Publications Database (DB2)
• Facebook Web Service
Clients
• Management Hierarchy
• Employee Search
• Citation Count
• Bibliography Aggregation
• Friend Connectivity
QUESTIONS ?
Cache Statistics (for Sprint with all optimizations turned on)
Client  Accesses  Hits  Waits  Misses  Miss %   Cached
MH           766   747     10       9   1.12%      757
ES           293   197     48      48  16.38%      714
CC           502   202    168     132  26.29%      370
BA          1268   949    178     141  11.12%     1127
FC           401   394      0       7   1.75%      394
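The columns above can be cross-checked with a few lines of arithmetic. The sketch assumes that every access is exactly one of a hit, a wait, or a miss, and that Miss % is misses divided by accesses; both are readings of the table, not definitions given on the slide. The Cached column is left out because the slide does not define it.

```python
# (accesses, hits, waits, misses) per client, copied from the table.
ROWS = {
    "MH": (766, 747, 10, 9),
    "ES": (293, 197, 48, 48),
    "CC": (502, 202, 168, 132),
    "BA": (1268, 949, 178, 141),
    "FC": (401, 394, 0, 7),
}


def miss_percent(accesses, misses):
    """Misses as a percentage of accesses, rounded to two decimals."""
    return round(100 * misses / accesses, 2)


computed = {}
for client, (accesses, hits, waits, misses) in ROWS.items():
    # Assumed invariant: the three outcomes partition the accesses.
    assert hits + waits + misses == accesses, client
    computed[client] = miss_percent(accesses, misses)
```

Under these assumptions the computed values match the slide for ES, CC, BA, and FC. For MH the computation gives about 1.17%, while the slide lists 1.12%.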
P1() {
  x = read(M, a);
  y = read(M, b);
  assert (y > x);
}
P2() {
  atomic {
    write(M, a, 2);
    write(M, b, 3)
  }
}
S = {(⟨a, 1⟩, ⟨b, 2⟩), (⟨a, 1⟩, ⟨b, 3⟩), (⟨a, 2⟩, ⟨b, 3⟩)}
read(b)                  // by Optimist of P1
write(a,2), write(b,3)   // by P2
read(a)                  // by Optimist of P1
read(a), read(b)         // by Pessimist of P1
S′ = (⟨a, 2⟩, ⟨b, 2⟩)
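The interleaving above can be replayed with a naive cache to see how S′ arises. This is a sketch of the problem only, not of the consistency mechanism covered in the paper. The initial store values a = 1 and b = 2 are an assumption inferred from the set S, and the helper names are hypothetical.

```python
store = {"a": 1, "b": 2}  # assumed initial state of M, inferred from S
cache = {}                # filled by the Optimist, read by the Pessimist


def optimist_read(key):
    """The Optimist prefetches a value into the cache."""
    cache[key] = store[key]


def p2_atomic_write():
    """P2 updates a and b in one atomic step."""
    store.update({"a": 2, "b": 3})


def pessimist_read(key):
    """The Pessimist prefers the cache and never revalidates (naive)."""
    return cache[key] if key in cache else store[key]


optimist_read("b")        # caches b = 2
p2_atomic_write()         # store becomes a = 2, b = 3
optimist_read("a")        # caches a = 2
x = pessimist_read("a")
y = pessimist_read("b")

S = {(1, 2), (1, 3), (2, 3)}  # (a, b) pairs P1 could observe without a cache
observed = (x, y)
assertion_holds = y > x
```

With these assumptions the Pessimist observes a = 2 and b = 2, which is not in S, so the assertion in P1 fails even though it holds for every state in S.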