4. Problem Statement
● Many applications run on computers with multiple CPU
cores, and the number of cores is likely to increase in the
future.
● To access the full power of the computer, applications need
to make good use of multiple threads. (Vertical scalability.)
● The traditional approach to employing multiple threads
(threads and locks) is error prone and extremely difficult to
maintain, and it often does not run much faster than when a
single thread is used.
5. The Heart of the Problem: Shared Mutable State
● Traditionally, mutable (changeable) state is shared by
multiple threads, using locks to prevent corruption when
partially updated state is accessed by another thread.
● Threads block when attempting to access locked state,
which increases context switching. (When a thread is
blocked, the underlying hardware thread tries to find a
thread to run which is not blocked.)
● As the complexity of an application increases, so does the
chance that unguarded state will be shared—which gives
rise to (often infrequent) race conditions.
● Complex applications usually require a locking hierarchy to
prevent deadlocks, and chances are good that the locking
hierarchy will be violated as the application code ages.
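To make the idea of a locking hierarchy concrete, here is a minimal Java sketch. The Account and LockOrderDemo classes and the "lower id first" rule are hypothetical, chosen only for illustration; the slides do not prescribe any particular ordering scheme. The point is that every code path must take the locks in the same order, which is exactly the discipline that tends to erode as code ages.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical example: the lock of the account with the lower id is always
// taken first, which is one simple way to define a locking hierarchy.
final class Account {
    final int id;
    final ReentrantLock lock = new ReentrantLock();
    private long balance; // guarded by lock

    Account(int id, long balance) {
        this.id = id;
        this.balance = balance;
    }

    long balance() {
        lock.lock();
        try {
            return balance;
        } finally {
            lock.unlock();
        }
    }

    static void transfer(Account from, Account to, long amount) {
        if (from.id == to.id) {
            throw new IllegalArgumentException("accounts must differ");
        }
        // Order by id, not by transfer direction, so two opposing transfers
        // can never each hold one lock while waiting for the other.
        Account first = from.id < to.id ? from : to;
        Account second = (first == from) ? to : from;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                from.balance -= amount;
                to.balance += amount;
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }
}

final class LockOrderDemo {
    // Runs two threads that transfer in opposite directions and returns the
    // final balances of both accounts.
    static long[] run(int transfersPerThread) {
        Account a = new Account(1, 1_000);
        Account b = new Account(2, 1_000);
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < transfersPerThread; i++) Account.transfer(a, b, 1);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < transfersPerThread; i++) Account.transfer(b, a, 1);
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] { a.balance(), b.balance() };
    }
}
```

If transfer instead locked "from" first and "to" second, the two threads in LockOrderDemo could each hold one lock while waiting for the other, which is the classic deadlock.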
6. Alternatives to Locks
● Concurrent and Atomic Data Structures
● Functional Programming with Immutable State
● Actors
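As a small illustration of the first alternative, the sketch below uses two standard java.util.concurrent classes so that the application code takes no explicit locks. The SharedCounters class and its method names are hypothetical and exist only for this example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical example class: shared counters updated from many threads
// without any explicit locking in the application code.
final class SharedCounters {
    private final AtomicLong total = new AtomicLong();
    private final ConcurrentHashMap<String, Long> perKey = new ConcurrentHashMap<>();

    void record(String key) {
        total.incrementAndGet();          // atomic read-modify-write
        perKey.merge(key, 1L, Long::sum); // atomic update of one map entry
    }

    long total() {
        return total.get();
    }

    long count(String key) {
        return perKey.getOrDefault(key, 0L);
    }

    // Starts the given number of threads, each recording alternating
    // "even" and "odd" keys, and waits for all of them to finish.
    static SharedCounters runDemo(int threads, int perThread) {
        SharedCounters counters = new SharedCounters();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counters.record(i % 2 == 0 ? "even" : "odd");
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counters;
    }
}
```

This works well for isolated values, but it does not by itself keep several related values consistent with each other, which is one reason the other two alternatives are also worth considering.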
8. What is an Actor?
● An actor is like an object, except that it sends messages to
other actors rather than calling their methods.
● When an actor has messages to process, a thread is
assigned to that actor; when there are no more messages
to process, the thread is released.
● Unlike other objects, an actor's state is not shared. The
state can only be accessed by a single thread at any given
time. So there is no need for locks.
● In many ways, Actors look like the ideal alternative to locks.
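The behavior described above can be sketched in a few lines of Java. This is a generic illustration under stated assumptions, not the API of any particular actor framework: SimpleActor, SimpleActorDemo, and their methods are hypothetical names. A thread from an Executor is assigned only while the inbox has messages, and a flag guarantees that at most one thread runs the actor at a time, so the actor's own state needs no lock.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Hypothetical minimal actor: an inbox plus a behavior that handles messages.
final class SimpleActor<M> {
    private final ConcurrentLinkedQueue<M> inbox = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean running = new AtomicBoolean(false);
    private final Executor executor;
    private final Consumer<M> behavior;

    SimpleActor(Executor executor, Consumer<M> behavior) {
        this.executor = executor;
        this.behavior = behavior;
    }

    void send(M message) {
        inbox.add(message);
        schedule();
    }

    // Assign a thread only if none is currently running this actor.
    private void schedule() {
        if (running.compareAndSet(false, true)) {
            executor.execute(this::drain);
        }
    }

    private void drain() {
        try {
            M message;
            while ((message = inbox.poll()) != null) {
                behavior.accept(message);
            }
        } finally {
            running.set(false); // release the thread
            // A message may have arrived after the last poll; reschedule if so.
            if (!inbox.isEmpty()) schedule();
        }
    }
}

final class SimpleActorDemo {
    // Several sender threads each send the value 1 to one counting actor.
    static long run(int senders, int perSender) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        long[] count = {0}; // actor state: plain and unsynchronized
        CountDownLatch done = new CountDownLatch(senders * perSender);
        SimpleActor<Integer> counter = new SimpleActor<>(pool, n -> {
            count[0] += n;
            done.countDown();
        });
        for (int s = 0; s < senders; s++) {
            new Thread(() -> {
                for (int i = 0; i < perSender; i++) counter.send(1);
            }).start();
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return count[0];
    }
}
```

Note that every send goes through a queue and a thread hand-off in this sketch, which is the per-message cost that limits throughput in a conventional actor implementation.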
9. Actors need to be Large
● When passing messages, the throughput is not especially
high—about a million messages per second per hardware
thread on a good actor implementation. So you need to
avoid having actors that pass a large volume of messages.
Message passing needs to be kept out of an application's
inner loops. So you end up having relatively large actors that
do a fair amount for each message received.
● Applications then tend not to be very modular and the actors
often need to process a number of different message types,
and to process them differently depending on the actor's
state.
10. Large Actors tend to get Complicated
● Large actors often need to address a variety of concerns,
and tend to turn into bowls of spaghetti code. Using a state
machine does help, but even then it is not always easy to
maintain the state transition model as the application ages.
● In practice, monitors are often used to check that an actor
continues to function and to restart it when it does not. This
of course makes it more difficult to ensure that messages
are not lost or processed more than once.
11. Flow Control is left to the Application Developer
● Messages are mostly one-way, so there is no inherent flow
control and it is easy for actors to flood the system with
messages.
● Programmers, unless they have some experience with
communication protocols, do not normally concern
themselves with flow control. Method calls (an object's
equivalent of a message) provide implicit flow control, as the
calling object resumes only on the completion of the call.
● Indeed, the developer may not even realize the need to
implement flow control until load testing is performed.
● Adding flow control to the application logic will, of course,
further complicate the code of the actors.
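One common way to add flow control is a bounded mailbox that refuses messages when it is full, so the sender has to slow down. The sketch below is only an illustration of that general technique, built on the standard ArrayBlockingQueue; the BoundedMailbox class and its method names are hypothetical, and the slides do not say which technique an application should use.

```java
import java.util.concurrent.ArrayBlockingQueue;

// Hypothetical bounded mailbox: a full mailbox pushes back on the sender
// instead of growing without limit.
final class BoundedMailbox<M> {
    private final ArrayBlockingQueue<M> queue;

    BoundedMailbox(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Returns false when the mailbox is full; the sender must then back off,
    // retry later, or drop the message.
    boolean trySend(M message) {
        return queue.offer(message);
    }

    // Called by the receiving actor; returns null when the mailbox is empty.
    M receive() {
        return queue.poll();
    }

    int pending() {
        return queue.size();
    }
}
```

The extra branch that every sender now needs for a rejected message is a small example of how flow control leaks into the actor logic.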
13. JActor: A High-Performance Actor Framework
● Messages are sent at up to 150 million per second, fast
enough that actors can be used ubiquitously.
● The actors are lightweight: a billion actors per second can
be created on a single thread.
● Large tables can be deserialized, updated and reserialized
at a rate of 400 nanoseconds per unchanged entry virtually
independent of the size and complexity of those entries.
● A transaction pipeline is provided that durably (with fsync)
logs and processes up to 900,000 transactions per second.
● (Tests were run on an i7-3770 CPU @ 3.40GHz with a
Vertex 3 SATA III SSD and 1600 MHz DDR3 RAM.)
14. Mailboxes
● Actors can share a common queue of received messages.
These message queues are called mailboxes.
● Actors which share a mailbox always operate on the same
thread, allowing them to directly call each other's methods.
● Messages sent to an actor with the same mailbox are
processed immediately without being enqueued.
15. Commandeering
● When a message is sent to an actor with an empty mailbox,
the sending thread commandeers that mailbox and
immediately processes the request.
● Commandeering prevents another thread from being
assigned to the mailbox, so the actor's state will still only be
accessible from a single thread at any given time.
● With commandeering we avoid having to enqueue the
message and assign a thread to dequeue and process the
message—giving JActor a significant boost in performance.
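The idea can be sketched with a single ownership flag. This is not JActor's implementation; CommandeeringMailbox and its members are hypothetical, and the sketch is simplified in that queued messages are drained by whichever sending thread next acquires the mailbox, rather than by a thread assigned from a pool.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical mailbox: a sender that finds it idle takes ownership and
// runs the message on its own thread instead of enqueueing it.
final class CommandeeringMailbox {
    private final ConcurrentLinkedQueue<Runnable> queue = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean owned = new AtomicBoolean(false);

    void send(Runnable message) {
        if (queue.isEmpty() && owned.compareAndSet(false, true)) {
            // Idle mailbox: commandeer it and process on the sending thread.
            try {
                message.run();
            } finally {
                owned.set(false);
            }
        } else {
            // Another thread owns the mailbox, or messages are waiting.
            queue.add(message);
        }
        drain();
    }

    // Process queued messages if ownership can be acquired. The loop rechecks
    // the queue after every release so a late arrival is not stranded.
    private void drain() {
        while (!queue.isEmpty() && owned.compareAndSet(false, true)) {
            try {
                Runnable next;
                while ((next = queue.poll()) != null) next.run();
            } finally {
                owned.set(false);
            }
        }
    }

    // Many threads increment an unsynchronized counter through one mailbox.
    static long stress(int threads, int perThread) {
        CommandeeringMailbox mailbox = new CommandeeringMailbox();
        long[] state = {0}; // touched only while the mailbox is owned
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) mailbox.send(() -> state[0]++);
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return state[0];
    }
}
```

In the uncontended case, send is just one compare-and-set followed by a direct call, with no enqueue and no thread hand-off. The ownership flag is what keeps the state single-threaded even though different threads run the messages.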
16. Asynchronous Mailboxes
● Sometimes we need to prevent commandeering, e.g. when
an actor performs blocking I/O or long computations.
Otherwise every actor that shares the mailbox of the
commandeering thread will also be blocked.
● Asynchronous mailboxes differ from the default type of
mailbox in that they do not allow commandeering.
● Messages sent to an actor with an asynchronous mailbox
are therefore always processed on a different thread.
17. Message Buffering
● Message buffering is a technique borrowed from flow-based
programming to increase message throughput at a small
cost to latency.
● A message buffer is simply an ArrayList, and all the
messages in a message buffer are destined for the same
mailbox. The message buffers themselves are held by a
mailbox.
● When an actor sends a message and the destination actor
uses the same mailbox, or the destination actor has an idle
mailbox, then the message is processed immediately on the
current thread. Otherwise the message is placed in a
message buffer.
● When a mailbox has no more incoming messages to
process, the last thing it does before releasing its assigned
task is to send all the message buffers.
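The buffering scheme can be modeled with one ArrayList per destination. The sketch below is a deliberately single-threaded model meant only to show the data flow; BufferingMailbox and its methods are hypothetical names, it assumes the buffers are held by the sending mailbox, and it omits the "process immediately" shortcut for shared or idle mailboxes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical single-threaded model of message buffering.
final class BufferingMailbox {
    // Batches received from other mailboxes, one list per delivery.
    private final ArrayDeque<List<Runnable>> incoming = new ArrayDeque<>();
    // One outgoing buffer per destination mailbox.
    private final Map<BufferingMailbox, ArrayList<Runnable>> outgoing = new LinkedHashMap<>();

    // Place a message in the buffer for its destination instead of sending it.
    void buffer(BufferingMailbox destination, Runnable message) {
        outgoing.computeIfAbsent(destination, d -> new ArrayList<>()).add(message);
    }

    int incomingBatches() {
        return incoming.size();
    }

    // Process every incoming batch; once nothing is left, send the buffers.
    void processAll() {
        List<Runnable> batch;
        while ((batch = incoming.poll()) != null) {
            for (Runnable message : batch) message.run();
        }
        flush();
    }

    // Deliver each buffer to its destination as a single batch.
    private void flush() {
        for (Map.Entry<BufferingMailbox, ArrayList<Runnable>> entry : outgoing.entrySet()) {
            entry.getKey().incoming.add(entry.getValue());
        }
        outgoing.clear();
    }
}
```

The throughput gain comes from handing over a whole list in one step rather than one message at a time; the latency cost is that a buffered message waits until the sending mailbox runs out of incoming work.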