![Page 1: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/1.jpg)
![Page 2: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/2.jpg)
![Page 3: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/3.jpg)
Learning
Supervised
• Classification
• Regression
• Recommender
Unsupervised
• Clustering
• Dimensionality reduction
• Topic modeling
Data: big (TiB - PiB)
Model: small (MiB - GiB)
![Page 4: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/4.jpg)
Example Formation → Modeling → Evaluation (passed between steps: Examples, Model)
![Page 5: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/5.jpg)
Example
Click Log
Bag of Words ID
Label ID
Bag of Words Label ID
Feature Extraction
Label Extraction
Data Parallel Functions
Large Scale Join
(Large Scale) Join
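The example-formation step above (feature extraction and label extraction as data-parallel functions, followed by a join on ID) can be sketched on a single machine as follows. This is only an illustrative sketch: the class `ExampleFormation`, its methods, and the sample IDs and texts are assumptions made for this example, not code from the talk.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy example formation: extract features and labels per ID, then join them on ID. */
final class ExampleFormation {

  /** One training example: the joined (bag of words, label) pair for an ID. */
  static final class Example {
    final String id;
    final Map<String, Integer> features;
    final int label;

    Example(String id, Map<String, Integer> features, int label) {
      this.id = id;
      this.features = features;
      this.label = label;
    }
  }

  /** Feature extraction: turn a text into a bag of words (word -> count). */
  static Map<String, Integer> bagOfWords(String text) {
    Map<String, Integer> bag = new HashMap<>();
    for (String word : text.toLowerCase().split("\\s+")) {
      if (!word.isEmpty()) {
        bag.merge(word, 1, Integer::sum);
      }
    }
    return bag;
  }

  /** Hash join of the two extracted relations on ID; IDs without a label are dropped. */
  static List<Example> join(Map<String, Map<String, Integer>> featuresById,
                            Map<String, Integer> labelsById) {
    List<Example> out = new ArrayList<>();
    for (Map.Entry<String, Map<String, Integer>> e : featuresById.entrySet()) {
      Integer label = labelsById.get(e.getKey());
      if (label != null) {
        out.add(new Example(e.getKey(), e.getValue(), label));
      }
    }
    return out;
  }

  public static void main(String[] args) {
    Map<String, Map<String, Integer>> features = new HashMap<>();
    features.put("d1", bagOfWords("cheap flights cheap hotels"));
    features.put("d2", bagOfWords("weather today"));

    Map<String, Integer> labels = new HashMap<>();
    labels.put("d1", 1); // hypothetical label, e.g. derived from a click log

    for (Example ex : join(features, labels)) {
      System.out.println(ex.id + " " + ex.features + " -> " + ex.label);
    }
  }
}
```

At scale, the two extraction functions run independently per record, while the join is the step that requires moving data between machines.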
![Page 6: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/6.jpg)
Step II: Modeling
Step III: Evaluation
![Page 7: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/7.jpg)
Step I: Example Formation Feature and Label Extraction
Step III: Evaluation
![Page 8: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/8.jpg)
Apply Model to Data
Observe Errors
Update Model
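The loop above (apply the model to the data, observe errors, update the model) is the core of most iterative learners. A minimal sketch, using a perceptron as a stand-in algorithm; the slides do not name a specific learner, so the choice of perceptron, the class `IterativeLearner`, and the toy data are assumptions for illustration.

```java
/** Toy perceptron: each pass applies the model, observes errors, and updates the model. */
final class IterativeLearner {

  /** Apply the model: sign of the dot product between weights and features. */
  static int predict(double[] w, double[] x) {
    double s = 0.0;
    for (int i = 0; i < w.length; i++) {
      s += w[i] * x[i];
    }
    return s >= 0 ? 1 : -1;
  }

  /** Runs passes over the data until a pass makes no errors or maxPasses is reached. */
  static double[] train(double[][] xs, int[] ys, int maxPasses) {
    double[] w = new double[xs[0].length];
    for (int pass = 0; pass < maxPasses; pass++) {
      int errors = 0;
      for (int n = 0; n < xs.length; n++) {
        if (predict(w, xs[n]) != ys[n]) {        // observe an error
          errors++;
          for (int i = 0; i < w.length; i++) {   // update the model
            w[i] += ys[n] * xs[n][i];
          }
        }
      }
      if (errors == 0) {
        break; // the model fits all examples
      }
    }
    return w;
  }

  public static void main(String[] args) {
    // Two features plus a constant bias feature; labels are +1 / -1.
    double[][] xs = {{2, 2, 1}, {3, 1, 1}, {-1, -1, 1}, {-2, 0, 1}};
    int[] ys = {1, 1, -1, -1};
    double[] w = train(xs, ys, 10);
    System.out.println(java.util.Arrays.toString(w));
  }
}
```

Every pass rereads the same data with a slightly different model, which is why support for iteration matters so much for this workload.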
![Page 9: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/9.jpg)
Sample Features
Copy Model
![Page 10: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/10.jpg)
+ MapReduce model fits statistical query model learning
- Hadoop MR does not support iterations (30x slowdown compared to others)
- Hadoop MR does not match other forms of algorithms
Hadoop Abuse
![Page 11: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/11.jpg)
Statistics
Model Updates
![Page 12: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/12.jpg)
Statistics / Updates
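The statistics-then-update pattern can be sketched as follows: every data partition computes a statistic against the current model, the statistics are aggregated, and the aggregate drives one model update. This is a hedged single-process sketch; the class `StatisticalQuerySketch`, the one-parameter linear model, and the learning rate are assumptions chosen to keep the example small.

```java
import java.util.Arrays;
import java.util.List;

/** Toy statistical-query iteration for fitting y ≈ w * x by gradient descent. */
final class StatisticalQuerySketch {

  /** "Map" side: one partition's statistic, the squared-loss gradient sum at w. */
  static double partitionGradient(double w, double[][] partition) {
    double g = 0.0;
    for (double[] xy : partition) {
      g += (w * xy[0] - xy[1]) * xy[0];
    }
    return g;
  }

  /** One iteration: aggregate the statistics ("reduce"), then update the model. */
  static double step(double w, List<double[][]> partitions, double learningRate) {
    double total = 0.0;
    int n = 0;
    for (double[][] p : partitions) {
      total += partitionGradient(w, p);
      n += p.length;
    }
    return w - learningRate * total / n;
  }

  /** Repeats the statistics/update cycle a fixed number of times, starting from w = 0. */
  static double fit(List<double[][]> partitions, double learningRate, int iterations) {
    double w = 0.0;
    for (int i = 0; i < iterations; i++) {
      w = step(w, partitions, learningRate);
    }
    return w;
  }

  public static void main(String[] args) {
    // Two partitions of (x, y) pairs generated from y = 3x.
    List<double[][]> partitions = Arrays.asList(
        new double[][] {{1, 3}, {2, 6}},
        new double[][] {{3, 9}});
    System.out.println(fit(partitions, 0.1, 50));
  }
}
```

Only the small statistic and the small model cross partition boundaries; the large data stays where it is, which matches the "big data, small model" setting described earlier.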
![Page 13: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/13.jpg)
![Page 14: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/14.jpg)
Rise of the Resource Managers
![Page 15: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/15.jpg)
Map Task
Reduce Task
Map Task
Reduce Task
Map Task
![Page 16: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/16.jpg)
![Page 17: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/17.jpg)
App Master
Resource Allocation = list of (node type, count, resource)
E.g. { (node1, 1, 1GB), (rack-1, 2, 1GB), (*, 1, 2GB) }
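The request format above can be modeled as a plain list of entries. The sketch below assumes that the first field names a host, a rack, or `*` for "anywhere"; that reading, the class `ResourceRequestSketch`, and the `matches` helper are assumptions for illustration and not the YARN API.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of a resource request as a list of (location, count, memory) entries. */
final class ResourceRequestSketch {

  /** One entry: where to run, how many containers, and how much memory each. */
  static final class Entry {
    final String location; // assumed: host name, rack name, or "*" for anywhere
    final int count;
    final int memoryGb;

    Entry(String location, int count, int memoryGb) {
      this.location = location;
      this.count = count;
      this.memoryGb = memoryGb;
    }
  }

  /** The example request from the slide. */
  static List<Entry> slideExample() {
    return Arrays.asList(
        new Entry("node1", 1, 1),
        new Entry("rack-1", 2, 1),
        new Entry("*", 1, 2));
  }

  /** True if a container on the given host and rack satisfies the entry's location. */
  static boolean matches(Entry e, String host, String rack) {
    return e.location.equals("*") || e.location.equals(host) || e.location.equals(rack);
  }

  public static void main(String[] args) {
    for (Entry e : slideExample()) {
      System.out.println(e.location + " x" + e.count + " @ " + e.memoryGb + "GB"
          + " accepts node2/rack-1: " + matches(e, "node2", "rack-1"));
    }
  }
}
```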
![Page 18: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/18.jpg)
App Master Container
Container Container
Container
![Page 19: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/19.jpg)
![Page 20: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/20.jpg)
![Page 21: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/21.jpg)
![Page 22: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/22.jpg)
![Page 23: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/23.jpg)
REEF: Retainable Evaluator Execution Framework
![Page 24: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/24.jpg)
![Page 25: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/25.jpg)
YARN / HDFS
SQL / Hive … … Machine Learning
![Page 26: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/26.jpg)
SQL / Hive
YARN / HDFS
… … Machine Learning
REEF
![Page 27: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/27.jpg)
YARN / HDFS
SQL / Hive … … Machine Learning
REEF
Physical Data Parallel Operators
Logical Abstraction
![Page 28: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/28.jpg)
Storage
Network
State Management
Job Driver: Control plane implementation. User code executed on YARN’s Application Master.
Activity: User code executed within an Evaluator.
Evaluator: Execution environment for Activities. One Evaluator is bound to one YARN Container.
![Page 29: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/29.jpg)
![Page 30: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/30.jpg)
![Page 31: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/31.jpg)
Client

    public class DistributedShell {
      ...
      public static void main(String[] args) {
        ...
        Injector i = new Injector(yarnConfiguration);
        ...
        REEF reef = i.getInstance(REEF.class);
        ...
        reef.submit(driverConf);
      }
    }

![Page 32: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/32.jpg)
Client
![Page 33: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/33.jpg)

    public class DistributedShellJobDriver {
      private final EvaluatorRequestor requestor;
      ...
      public void onNext(StartTime time) {
        // On start, ask for two small Evaluators.
        requestor.submit(EvaluatorRequest.newBuilder()
            .setSize(SMALL)
            .setNumber(2)
            .build());
      }
      ...
    }

Client
![Page 34: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/34.jpg)
Client

    public class DistributedShellJobDriver {
      private final EvaluatorRequestor requestor;
      ...
      public void onNext(AllocatedEvaluator eval) {
        // Once an Evaluator is allocated, submit a context configuration to it.
        Configuration contextConf = ...;
        eval.submitContext(contextConf);
      }
      ...
    }

![Page 35: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/35.jpg)
context config +
Client
![Page 36: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/36.jpg)
Client
![Page 37: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/37.jpg)

    public class DistributedShellJobDriver {
      private final String cmd = "ls";
      [...]
      public void onNext(ActiveContext ctx) {
        final String activityId = [...];
        // Describe the Activity to run and the command it should execute.
        Configuration activityConf = Activity.CONF
            .set(IDENTIFIER, "ShellActivity")
            .set(ACTIVITY, ShellActivity.class)
            .set(COMMAND, this.cmd)
            .build();
        ctx.submitActivity(activityConf);
      }
      [...]
    }

activity config
Client
![Page 38: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/38.jpg)

    class ShellActivity implements Activity {
      private final String command;

      @Inject
      ShellActivity(@Parameter(Command.class) String c) {
        this.command = c;
      }

      private String exec(final String command) { ... }

      @Override
      public byte[] call(byte[] memento) {
        // Run the injected command and return its output as the Activity result.
        String s = exec(this.command);
        return s.getBytes();
      }
    }

Client
![Page 39: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/39.jpg)
Client
![Page 40: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/40.jpg)
Retains State!
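What state retention buys can be sketched with a toy model: a long-lived execution environment keeps data in memory, so a second activity reuses what the first one loaded. The classes `RetainedStateSketch` and `ToyEvaluator` below are hypothetical stand-ins written for this illustration; they are not REEF's Evaluator or Activity API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToIntFunction;

/** Toy illustration of an execution environment that outlives individual activities. */
final class RetainedStateSketch {

  /** Hypothetical stand-in for an Evaluator: holds state across activity runs. */
  static final class ToyEvaluator {
    private final Map<String, Object> state = new HashMap<>();
    int loads = 0; // counts how often the "expensive" load actually ran

    /** Runs one activity against the retained state and returns its result. */
    int run(ToIntFunction<Map<String, Object>> activity) {
      return activity.applyAsInt(state);
    }
  }

  /** An activity that loads its data only if no earlier activity has cached it. */
  static int sumActivity(ToyEvaluator ev) {
    return ev.run(state -> {
      int[] data = (int[]) state.computeIfAbsent("data", k -> {
        ev.loads++; // stands in for reading a partition from storage
        return new int[] {1, 2, 3, 4};
      });
      int sum = 0;
      for (int v : data) {
        sum += v;
      }
      return sum;
    });
  }

  public static void main(String[] args) {
    ToyEvaluator ev = new ToyEvaluator();
    int first = sumActivity(ev);
    int second = sumActivity(ev); // reuses the cached data
    System.out.println(first + " " + second + " loads=" + ev.loads);
  }
}
```

For iterative algorithms this is the difference between reading the training data once and reading it on every pass.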
Client
![Page 41: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/41.jpg)
activity config
Client
![Page 42: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/42.jpg)
Client
![Page 43: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/43.jpg)
Client
![Page 44: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/44.jpg)
![Page 45: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/45.jpg)
![Page 46: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/46.jpg)
Name Node
Yarn RM
HDFS NM
REEF
HDFS NM
HDFS NM
Job Driver
Activity
Client
node1
node2
node3
node4
![Page 47: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/47.jpg)
![Page 48: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/48.jpg)
Name Node
Yarn RM
HDFS NM
REEF
HDFS NM
HDFS NM
Job Driver
Client
node1
node2
node3
node4
Activity
activity config +
![Page 49: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/49.jpg)
![Page 50: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/50.jpg)
![Page 51: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/51.jpg)
![Page 52: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/52.jpg)
Select, Project, Join, Group MapReduce MPI
Logical Layer
Physical Layer
ML algorithm
Graph Analysis SQM
![Page 53: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/53.jpg)
MapReduce MPI
Logical Layer
Physical Layer
ML algorithm
Graph Analysis SQM
Select, Project, Join, Group
![Page 54: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/54.jpg)
REEF
Parallel Recursive Dataflow
Query optimizer
Logical query over training data
ML algorithm
Graph Analysis SQM
![Page 55: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/55.jpg)
• Recursion is built into the language
• Amenable to optimizations
• Lots of existing work that we can leverage:
  J. Eisner and N. Filardo. Dyna: Extending Datalog for modern AI. In Datalog ’10.
  S. Funiak et al. Distributed inference with declarative overlay networks. EECS Tech Report, 2008.
  D. Deutch, C. Koch, T. Milo. On Probabilistic Fixpoint and Markov Chain Query Languages. In PODS ’10.
  Y. Bu et al. Scaling Datalog for Machine Learning on Big Data. Tech Report, http://arxiv.org/abs/1203.0160, 2012.
REEF
Parallel Recursive Dataflow
Query optimizer
Datalog query over training data
ML algorithm
Graph Analysis SQM
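To make "recursion is built into the language" concrete, consider the classic recursive Datalog program `reach(X,Y) :- edge(X,Y).` and `reach(X,Z) :- reach(X,Y), edge(Y,Z).` A minimal sketch of how such a program can be evaluated to a fixpoint with semi-naive iteration follows; the class `DatalogFixpointSketch` and the graph are assumptions for illustration, and a real engine would run the join as a parallel dataflow rather than nested loops.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy semi-naive evaluation of the recursive reach/edge Datalog program. */
final class DatalogFixpointSketch {

  /** Returns all reach(X, Y) facts derivable from the given edge(X, Y) facts. */
  static Set<List<Integer>> reach(Set<List<Integer>> edges) {
    Set<List<Integer>> all = new HashSet<>(edges);   // reach(X,Y) :- edge(X,Y).
    Set<List<Integer>> delta = new HashSet<>(edges); // facts that are new this round
    while (!delta.isEmpty()) {
      Set<List<Integer>> next = new HashSet<>();
      // reach(X,Z) :- reach(X,Y), edge(Y,Z), joining only the new reach facts.
      for (List<Integer> r : delta) {
        for (List<Integer> e : edges) {
          if (r.get(1).equals(e.get(0))) {
            List<Integer> fact = Arrays.asList(r.get(0), e.get(1));
            if (!all.contains(fact)) {
              next.add(fact);
            }
          }
        }
      }
      all.addAll(next);
      delta = next; // stop once a round derives nothing new
    }
    return all;
  }

  public static void main(String[] args) {
    Set<List<Integer>> edges = new HashSet<>(Arrays.asList(
        Arrays.asList(1, 2), Arrays.asList(2, 3), Arrays.asList(3, 4)));
    System.out.println(reach(edges));
  }
}
```

The loop has the same shape as an iterative learner: repeat a data-parallel step until nothing changes, which is what makes a recursive query language a plausible front end for ML algorithms.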
![Page 56: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/56.jpg)
• Implementation over Hyracks
• Supports both Iterative-MRU and Pregel
• Standard optimizations + some new tricks
Hyracks
Hardcoded optimizations
Datalog queries
Programming Models for ML algorithms
Pregel Iterative-MRU
REEF
![Page 57: REEF: The Retainable Evaluator Execution Framework•Recursion is built into the language • Amenable to optimizations • Lots of existing work that we can leverage J. Eisner and](https://reader034.vdocuments.net/reader034/viewer/2022042321/5f0acb5a7e708231d42d6036/html5/thumbnails/57.jpg)
Parallel Recursive Dataflow
REEF
Query optimizer
Datalog query over training data
ML algorithm
Graph Analysis SQM
• Storage/Networking services
• State Management
• Caching policies
• Dynamic resources
• Elastic operators
• Cost estimation for recursive computation
• Cost models (time vs money)
• Interactive Query Processing
• Provenance for triage
• “My model misbehaves - why?”
• Fault-awareness policies
• Incremental learning