Integrating R and the JVM Platform - Alpine Data Labs' R Execute Operator
DESCRIPTION
Reactive programming is a phenomenal idea, but it's not always achievable "all the way down" in practice. In the real world, one rarely writes entire platforms from scratch, and even then one often needs to integrate with third-party applications that are blocking, stateful, and seem to violate nearly every reactive principle. In my talk, I will explain how Akka is still ideally suited to integrating such systems into both reactive and non-reactive JVM code. To illustrate these claims, I will talk about Alpine Data Labs' JVM-R integration. Calls to the R language runtime to perform a data science computation are blocking, given the constraints of R itself. Sessions have to be maintained since many messages have to be sent per R session (populating the R heap with DTOs, sending the script to be executed, etc.), and each actor can hold a TCP connection to a single R runtime. R is very prone to failure, whether from poor memory management, buggy dynamically typed user code, or segmentation faults in native R packages. I will show how Akka can handle all of these problems gracefully to help integrate a faulty, non-engineering-grade technology like R into a JVM enterprise application.
TRANSCRIPT
Integrating Non-Reactive Legacy Code - The Case of R
Marek Kolodziej, Machine Learning Engineer
SF Scala Meetup, Sep. 10, 2014
Reactive Recap
Event-driven
- Asynchronous
- Non-blocking
- Optimized around Amdahl's Law
Scalable
- Location transparency (up and out)
- Factor in unreliable network
Resilient
- Failure isolation (bulkhead pattern, etc.)
- Clean service and failure handling separation (supervision)
Responsive
- Minimize latency
- Deal with bursty traffic
- Gracefully handle congestion (backpressure/active pull by subscriber)
The tough reality - Not everything's under your control
Not everything's an actor
- Legacy Java/Scala code
- Third-party libraries
Blocking calls
- Database queries
- Calls to services
- Non-threaded runtimes (R)
Long-running jobs
- Resource clean-up in case a network partition occurs way before the time-out is reached
- Timeouts vs. heartbeats
Not all failures are within the JVM
- Can we revive them from within the JVM?
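
One common way to contain the blocking calls listed above is to run them on a small, dedicated thread pool so that a stuck database query or R call cannot starve the rest of the application. The following is a minimal sketch of that idea using only the JDK; it is not taken from the talk, and the class name `BlockingCallBulkhead` and its methods are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical illustration: isolates blocking work on its own bounded pool.
final class BlockingCallBulkhead implements AutoCloseable {
    private final ExecutorService blockingPool;

    BlockingCallBulkhead(int maxConcurrentBlockingCalls) {
        // At most this many blocking calls run at once; further calls queue up.
        this.blockingPool = Executors.newFixedThreadPool(maxConcurrentBlockingCalls);
    }

    // Runs the blocking call on the dedicated pool and returns immediately.
    // An exception thrown by the call completes the future exceptionally.
    <T> CompletableFuture<T> submit(Supplier<T> blockingCall) {
        return CompletableFuture.supplyAsync(blockingCall, blockingPool);
    }

    @Override
    public void close() {
        // Stops accepting new work; calls already submitted still finish.
        blockingPool.shutdown();
    }
}
```

A caller would wrap each blocking call, for example `bulkhead.submit(() -> runQuery())` with a hypothetical `runQuery()`, and react to the returned future instead of waiting on the calling thread.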
Alpine’s R Operator
Alpine's R Operator - The cases for and against R
For
- 5,000+ statistical and machine learning libraries
- "[Numeric] gold standard" implementations
- Operator would allow arbitrary processing in a "canned" application
- Data scientists already know the language
- Support for client's existing code base (100s of scripts)
- Very rapid prototyping
- Focus on science instead of coding
Against
- Slow runtime (even with JIT)
- Memory hogging (by-copy semantics)
- Very slow garbage collection
- Single-threaded runtime (even worse than Python and Ruby)
- Native libraries written by people without much CS/engineering background (segfaults, etc.)
- Buggy libraries (infinite loops, etc.)
- Runtime crashes
- Terrible handling of big datasets
Challenges
Licensing Issues
- R is GPL
- Rserve is (L)GPL
- Shipped software (GPL SaaS loophole doesn't apply)
Distributed computing
- Need a cluster of R workers (multi-user, multi-operator concurrency given a single-threaded R runtime)
- REST is good for data but pretty bad for control (some structure would be nice)
- Sessions or backpressure
Fault tolerance
- R runtime failures
- Network partitions (R session clean-up)
Solutions
Licensing Issues
- Akka is Apache 2.0
- Rserve is (L)GPL
- Can open-source the R-Java server bridge
- Communication to Alpine backend via (open-source) message case classes
Distributed computing
- Akka's location transparency is ideal for distributing work
- Cluster API would have been preferred but Alpine uses Akka 2.2.3 due to Spark dependency
- Structure and semantics due to message case classes
- Rx streams would have been nice for backpressure, but we have an old Akka version (so sessions)
Fault tolerance
- Rserve forks R processes. Exception handling of the Connection object lets you restart processes.
- Akka's heartbeat allows session clean-up in case of network failure before time-out (important if time-out is ~1 day).
- Event bus lets you observe failure to connect to remote actor system.
- No need for exactly-once semantics (the user can re-run the flow), but you have to know that the failure occurred.
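
The heartbeat point above can be illustrated without Akka. The sketch below shows the general idea only: a session is considered dead once its client has been silent for longer than a short heartbeat timeout, so its R session can be freed long before a job time-out of roughly a day expires. It does not reproduce Akka's own heartbeat mechanism or Alpine's code; `HeartbeatSessionRegistry` and its methods are hypothetical names, and time is passed in explicitly to keep the example deterministic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: tracks the last heartbeat seen for each session.
final class HeartbeatSessionRegistry {
    private final long heartbeatTimeoutMillis;
    // session id -> timestamp of the most recent heartbeat from its client
    private final Map<String, Long> lastHeartbeat = new HashMap<>();

    HeartbeatSessionRegistry(long heartbeatTimeoutMillis) {
        this.heartbeatTimeoutMillis = heartbeatTimeoutMillis;
    }

    // Called when a session opens and each time its client sends a heartbeat.
    synchronized void heartbeat(String sessionId, long nowMillis) {
        lastHeartbeat.put(sessionId, nowMillis);
    }

    synchronized boolean isOpen(String sessionId) {
        return lastHeartbeat.containsKey(sessionId);
    }

    // Removes sessions whose client has been silent for longer than the
    // heartbeat timeout and returns their ids so the caller can clean up.
    synchronized List<String> reapDeadSessions(long nowMillis) {
        List<String> dead = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastHeartbeat.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() > heartbeatTimeoutMillis) {
                dead.add(entry.getKey());
                it.remove();
            }
        }
        return dead;
    }
}
```

In this sketch a periodic task would call `reapDeadSessions` and close the R session behind each returned id. The user can then simply re-run the flow, which matches the point that exactly-once semantics are not needed as long as the failure is detected.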
Sessions
- Arguably the ugliest part of the solution (can be replaced with alternatives)
- Worker actors blocked for long periods (hours).
- Large data blocks are sent to the Akka R server (~128 MB).
- No backpressure via Rx streams since it's Akka 2.2.3.
- Custom router - refuses requests if all workers are busy.
- Client needs to respond to request refusal by awaiting a free worker message (reactive but inelegant).
- Better solution - use reactive streams (we need to upgrade Akka)
- Improvement: use Akka for control but REST for data movement
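
The refuse-then-notify protocol described in the list above can be sketched in plain Java. This is an illustration under stated assumptions, not Alpine's router: the class `RefusingRouter`, its method names, and the use of string worker ids are all hypothetical, and a real Akka version would exchange messages between actors rather than call methods.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration: hands out idle workers and refuses when none are left.
final class RefusingRouter {
    private final Deque<String> idleWorkers = new ArrayDeque<>();
    // Clients that were refused and asked to be told when a worker frees up.
    private final Deque<Runnable> waitingClients = new ArrayDeque<>();

    RefusingRouter(List<String> workerIds) {
        idleWorkers.addAll(workerIds);
    }

    // Returns an idle worker, or an empty Optional as the "refusal".
    synchronized Optional<String> tryAcquire() {
        return Optional.ofNullable(idleWorkers.poll());
    }

    // A refused client registers a callback standing in for the
    // "free worker" message.
    synchronized void notifyWhenFree(Runnable onFreeWorker) {
        waitingClients.add(onFreeWorker);
    }

    // Called when a worker finishes its session; wakes at most one waiting client,
    // which is then expected to call tryAcquire() again.
    void release(String workerId) {
        Runnable next;
        synchronized (this) {
            idleWorkers.add(workerId);
            next = waitingClients.poll();
        }
        if (next != null) {
            next.run(); // run outside the lock so the callback may call back in
        }
    }
}
```

The client has to retry after being notified, which is the inelegant part the slide mentions; a pull-based stream would remove that extra round trip.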
Future Improvements
- Data movement via REST
- Replacement of sessions via reactive streams (Akka upgrade!)
- Kamon test drive for distributed actors (released ~2 weeks ago)
Conclusions
- Akka makes even non-reactive distributed programming easier and more reliable
- If you can, use the latest Akka version because a lot of the earlier pain can be avoided:
  - clustering
  - persistence
  - reactive streams
- Large data movement via Akka is probably not an ideal use of the framework:
  - use REST (including Spray, Play, etc.) and HTTP chunking
  - move the data directly using Netty, etc.
Thank You!
Miscellaneous
- Alpine is hiring
  - machine learning engineers (Scala/Java)
  - data scientists (R/Python)
  - front-end developers (Ruby on Rails)
- ScalaCourses.com is looking for reviewers:
  - Scala (beginner/intermediate)
  - Akka
  - Play
  - Java interop
  - contact Michael Slinn: [email protected]