Reactive programming is a phenomenal idea, but it's not always achievable "all the way down" in practice. In the real world, one rarely writes entire platforms from scratch, and even then one often needs to integrate with third-party applications that are blocking, stateful, and seem to violate nearly every reactive principle. In my talk, I will explain how Akka is still ideally suited to integrating such systems into both reactive and non-reactive JVM code.
To illustrate these claims, I will talk about Alpine Data Labs' JVM-R integration. Calls to the R language runtime to perform a data science computation are blocking, given the constraints of R itself. Sessions have to be maintained, since many messages have to be sent per R session (populating the R heap with DTOs, sending the script to be executed, etc.), and each actor can hold a TCP connection to a single R runtime. R is very prone to failure, be it due to poor memory management, buggy, dynamically typed user code, segmentation faults in native R packages, etc. I will show how Akka can handle all of these problems gracefully to help integrate a faulty, non-engineering-grade technology like R into a JVM enterprise application.
3. Reactive Recap
Event-driven
- Asynchronous
- Non-blocking
- Optimized around Amdahl's Law
Scalable
- Location transparency (up and out)
- Factor in unreliable network
Resilient
- Failure isolation (bulkhead pattern, etc.)
- Clean service and failure handling separation (supervision)
Responsive
- Minimize latency
- Deal with bursty traffic
- Gracefully handle congestion (backpressure/active pull by subscriber)
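As a rough illustration of "backpressure/active pull by subscriber", here is a minimal sketch using the JDK's own `java.util.concurrent.Flow` API (Java 9+). It is not Akka or Rx code; the class names `OneAtATimeSubscriber` and `BackpressureDemo` are made up for this example. The subscriber asks for exactly one item at a time, so the publisher can never push more than the subscriber has requested.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

/** Hypothetical subscriber that pulls one item at a time. */
class OneAtATimeSubscriber implements Flow.Subscriber<Integer> {
    final List<Integer> received = new ArrayList<>();
    final CountDownLatch done = new CountDownLatch(1);
    private Flow.Subscription subscription;

    @Override
    public void onSubscribe(Flow.Subscription subscription) {
        this.subscription = subscription;
        subscription.request(1); // signal demand for exactly one item
    }

    @Override
    public void onNext(Integer item) {
        received.add(item);      // "process" the item
        subscription.request(1); // only now ask for the next one
    }

    @Override
    public void onError(Throwable error) {
        done.countDown();
    }

    @Override
    public void onComplete() {
        done.countDown();
    }
}

class BackpressureDemo {
    /** Publishes 1..count and returns what the subscriber received. */
    static List<Integer> run(int count) {
        OneAtATimeSubscriber subscriber = new OneAtATimeSubscriber();
        try (SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            for (int i = 1; i <= count; i++) {
                publisher.submit(i); // blocks if the subscriber's buffer is full
            }
        } // close() signals onComplete once buffered items are delivered
        try {
            subscriber.done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return subscriber.received;
    }

    public static void main(String[] args) {
        System.out.println(run(5));
    }
}
```

The design point is that demand flows from the consumer to the producer: a slow subscriber simply requests less often, and `submit` blocks the producer once the buffer fills up.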
5. The tough reality
Not everything’s under your control
Not everything's an actor
- Legacy Java/Scala code
- Third-Party Libraries
Blocking calls
- Database queries
- Calls to services
- Non-threaded runtimes (R)
Long-running jobs
- Resource clean-up in case network partition occurs way before the time-out is reached
- Timeouts vs. heartbeats
Not all failures are within the JVM
- Can we revive them from within the JVM?
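To make the "timeouts vs. heartbeats" point concrete, here is a minimal plain-Java sketch of the idea, assuming a hypothetical `SessionRegistry` that tracks sessions by id. It does not use Akka's own heartbeat or failure-detection machinery; timestamps are passed in explicitly so the behaviour is easy to follow. A session is reclaimed as soon as its client falls silent, long before a day-long job time-out would fire.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical registry that reaps sessions whose client stopped sending heartbeats. */
class SessionRegistry {
    private static final class Session {
        final long startedAt;
        long lastHeartbeatAt;

        Session(long now) {
            startedAt = now;
            lastHeartbeatAt = now;
        }
    }

    private final long heartbeatToleranceMillis; // how long a client may stay silent
    private final long jobTimeoutMillis;         // hard upper bound for a job
    private final Map<String, Session> sessions = new HashMap<>();

    SessionRegistry(long heartbeatToleranceMillis, long jobTimeoutMillis) {
        this.heartbeatToleranceMillis = heartbeatToleranceMillis;
        this.jobTimeoutMillis = jobTimeoutMillis;
    }

    void open(String id, long now) {
        sessions.put(id, new Session(now));
    }

    void heartbeat(String id, long now) {
        Session session = sessions.get(id);
        if (session != null) {
            session.lastHeartbeatAt = now;
        }
    }

    boolean isOpen(String id) {
        return sessions.containsKey(id);
    }

    /** Removes and returns the ids of sessions that are silent or timed out at time 'now'. */
    List<String> reap(long now) {
        List<String> reaped = new ArrayList<>();
        Iterator<Map.Entry<String, Session>> it = sessions.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Session> entry = it.next();
            Session session = entry.getValue();
            boolean silent = now - session.lastHeartbeatAt > heartbeatToleranceMillis;
            boolean timedOut = now - session.startedAt > jobTimeoutMillis;
            if (silent || timedOut) {
                reaped.add(entry.getKey()); // real code would also free the remote resources here
                it.remove();
            }
        }
        return reaped;
    }
}

class HeartbeatDemo {
    public static void main(String[] args) {
        long day = 24L * 60 * 60 * 1000;
        SessionRegistry registry = new SessionRegistry(30_000, day);
        registry.open("alive", 0);
        registry.open("partitioned", 0);
        registry.heartbeat("alive", 50_000); // only one client keeps reporting in
        System.out.println(registry.reap(60_000));
    }
}
```

The class is deliberately not thread-safe; in an actor-based design this state would typically be owned by a single actor, which makes locking unnecessary.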
7. Alpine's R Operator
The cases for and against R
For
- 5,000+ statistical and machine learning libraries
- "[Numeric] gold standard" implementations
- Operator would allow arbitrary processing in a "canned" application
- Data scientists already know the language
- Support for client's existing code base (100s of scripts)
- Very rapid prototyping
- Focus on science instead of coding
Against
- Slow runtime (even with JIT)
- Memory hogging (by-copy semantics)
- Very slow garbage collection
- Single-threaded runtime (even worse than Python and Ruby)
- Native libraries written by people without much CS/engineering background (segfaults, etc.)
- Buggy libraries (infinite loops, etc.)
- Runtime crashes
- Terrible handling of big datasets
8. Challenges
Licensing Issues
- R is GPL
- RServe is (L)GPL
- Shipped software (GPL SaaS loophole doesn't apply)
Distributed computing
- Need a cluster of R workers (multi-user, multi-operator concurrency given a single-threaded R runtime)
- REST is good for data but pretty bad for control (some structure would be nice)
- Sessions or backpressure
Fault tolerance
- R runtime failures
- Network partitions (R session clean-up)
9. Solutions
Licensing Issues
- Akka is Apache 2.0
- RServe is (L)GPL
- Can open-source the R-Java server bridge
- Communication to Alpine backend via (open-source) message case classes
Distributed computing
- Akka's location transparency is ideal for distributing work
- Cluster API would have been preferred but Alpine uses Akka 2.2.3 due to Spark dependency
- Structure and semantics due to message case classes
- Rx streams would have been nice for backpressure, but we have an old Akka version (so sessions)
Fault tolerance
- Rserve forks R processes. Exception handling of the Connection object lets you restart processes.
- Akka's heartbeat allows session clean-up in case of network failure before time-out (important if time-out is ~1 day).
- Event bus lets you observe failure to connect to remote actor system.
- No need for exactly-once semantics (the user can re-run the flow), but you have to know that the failure occurred.
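The fault-tolerance points on this slide (restart on a connection failure, no exactly-once guarantee, but the caller must learn about the failure) can be sketched in plain Java roughly as follows. This is an illustration under assumptions, not the Rserve client API or Alpine's code: `RChannel`, `EvalResult`, and `RestartingWorker` are hypothetical names, and the "R process" is a fake that fails on one specific script.

```java
import java.io.IOException;
import java.util.function.Supplier;

/** Hypothetical stand-in for a connection to one R process. */
interface RChannel {
    String eval(String script) throws Exception;

    void close();
}

/** Outcome reported to the caller: a value on success, or the failure that occurred. */
final class EvalResult {
    final String value;    // null on failure
    final Exception error; // null on success

    private EvalResult(String value, Exception error) {
        this.value = value;
        this.error = error;
    }

    static EvalResult success(String value) {
        return new EvalResult(value, null);
    }

    static EvalResult failure(Exception error) {
        return new EvalResult(null, error);
    }

    boolean isSuccess() {
        return error == null;
    }
}

/** Replaces a broken channel with a fresh one and reports the failure instead of retrying. */
class RestartingWorker {
    private final Supplier<RChannel> connect;
    private RChannel channel;
    private int restarts = 0;

    RestartingWorker(Supplier<RChannel> connect) {
        this.connect = connect;
        this.channel = connect.get();
    }

    EvalResult eval(String script) {
        try {
            return EvalResult.success(channel.eval(script));
        } catch (Exception e) {
            channel.close();         // drop the broken channel
            channel = connect.get(); // fresh channel for the next request
            restarts++;
            return EvalResult.failure(e); // no retry: the caller decides whether to re-run
        }
    }

    int restarts() {
        return restarts;
    }
}

class RestartDemo {
    /** Fake channel: fails on the script "crash", echoes anything else. */
    static RChannel fake() {
        return new RChannel() {
            @Override
            public String eval(String script) throws Exception {
                if (script.equals("crash")) {
                    throw new IOException("R process died");
                }
                return "ok: " + script;
            }

            @Override
            public void close() {
            }
        };
    }

    public static void main(String[] args) {
        RestartingWorker worker = new RestartingWorker(RestartDemo::fake);
        System.out.println(worker.eval("crash").isSuccess());
        System.out.println(worker.eval("1 + 1").value);
    }
}
```

Returning the failure rather than retrying matches the at-most-once stance on the slide: the worker recovers on its own, while the decision to re-run the flow stays with the user.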
10. Solutions
Sessions
- Arguably the ugliest part of the solution (can be replaced with alternatives)
- Worker actors blocked for long periods (hours).
- Large data blocks are sent to the Akka R server (~128 MB).
- No backpressure via Rx streams since it's Akka 2.3.2.
- Custom router - refuses requests if all workers are busy.
- Client needs to respond to request refusal by awaiting a free worker message (reactive but inelegant).
- Better solution - use reactive streams (we need to upgrade Akka)
- Improvement: use Akka for control but REST for data movement
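The refuse-when-busy protocol described on this slide can be sketched without Akka as follows. This is a minimal plain-Java approximation, not the custom Akka router from the talk: `RefusingRouter` and its `Client` callback are hypothetical, with the callback standing in for the "free worker" message a client would await.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical router that refuses work when every worker is busy. */
class RefusingRouter {
    /** Stand-in for the "free worker" message sent to a refused client. */
    interface Client {
        void workerFree();
    }

    private int idleWorkers;
    private final Queue<Client> waiting = new ArrayDeque<>();

    RefusingRouter(int workers) {
        this.idleWorkers = workers;
    }

    /** Returns true if a worker was claimed; otherwise refuses and remembers the client. */
    synchronized boolean request(Client client) {
        if (idleWorkers > 0) {
            idleWorkers--;
            return true;
        }
        waiting.add(client);
        return false;
    }

    /** Called when a worker finishes; tells the longest-waiting client to try again. */
    void release() {
        Client next;
        synchronized (this) {
            idleWorkers++;
            next = waiting.poll();
        }
        if (next != null) {
            next.workerFree(); // notify outside the lock
        }
    }
}

class RouterDemo {
    public static void main(String[] args) {
        RefusingRouter router = new RefusingRouter(1);
        RefusingRouter.Client retrying =
            () -> System.out.println("worker free, retry accepted: " + router.request(() -> { }));
        System.out.println(router.request(() -> { })); // claims the only worker
        System.out.println(router.request(retrying));  // refused, client is queued
        router.release();                              // worker done, queued client retries
    }
}
```

The client still has to re-submit after being notified, which is the inelegant part the slide mentions; a demand-driven stream would fold the refusal and the retry into a single request signal.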
25. Future Improvements
- Data movement via REST
- Replacement of sessions via reactive streams (Akka upgrade!)
- Kamon test drive for distributed actors (released ~2 weeks ago)
26. Conclusions
- Akka makes even non-reactive distributed programming easier and more reliable
- If you can, use the latest Akka version because a lot of the earlier pain can be avoided:
  - clustering
  - persistence
  - reactive streams
- Large data movement via Akka is probably not an ideal use of the framework:
  - use REST (including Spray, Play, etc.) and HTTP chunking
  - move the data directly using Netty, etc.