Akka and the Zen of Reactive System Design

by Konrad Malawski (@ktosopl)

Konrad `@ktosopl` Malawski
akka.io
typesafe.com
geecon.org
Java.pl / KrakowScala.pl
sckrk.com / meetup.com/Paper-Cup @ London
GDGKrakow.pl
lambdakrk.pl

Agenda
Why such a talk?
1 : One actor is no Actor
2 : Structure your Actors
3 : Name your Actors
4 : “Matrix of mutability (Pain)”
5 : Blocking needs careful management
6 : Never Await, for/flatMap instead!
7 : Avoid Java Serialization
7.5 : Trust no-one, benchmark everything!
8 : Let it Crash!
9 : Backoff Supervision
10 : Design using State Machines
11 : Cluster Convergence and Joining
12 : Cluster Partitions and “Down”
13 : Akka is a Toolkit.
14 : Happy Hakking, Community!
Questions?

“The Tao / Zen of Programming”
Talk title loosely based on
the “Tao of Programming” book
by Geoffrey James (1987).

And the follow-up book
“Zen of Programming”.

Available here: http://www.mit.edu/~xela/tao.html
Series of nine “books”,
stories about an apprentice programmer and his sensei.
Thus spake the Master Programmer:
“Without the wind, the grass does not move.
Without software, hardware is useless.”

The Akka landscape
Akka
Actor
IO
Cluster
Cluster Tools (PubSub, Sharding, …)
Persistence & Persistence Query
Streams
HTTP
Typed

The Zen of Akka
Is best explained as a way of thinking about Architecture.
Akka provides building blocks, with specific semantics.
Actors are cheap – so they can be 1:1 for a user, a wallet, etc.
Actors are location transparent – can scale out trivially
Actors encapsulate state – avoiding global state
Actors are engines –
Streams / Streaming HTTP / Cluster Sharding / Distributed Data…
– all use Actors as their engines, as high-level architectural help.

If you have only one actor then it can only…
1. Reply
2. Drop the message (“ignore”)
3. Schedule another message to self
So we’re not really making any use of its
parallelism or concurrency capabilities.

- Actors are meant to work together.
- An Actor should do one thing and do it very well,
  then talk to other Actors to do other things for it.
- Child Actors are usually used for workers or “tasks” etc.
- Avoid using `actorSelection`;
  introduce Actors to each other instead.
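One way to read “introduce Actors to each other” is to hand over an `ActorRef` through the constructor (via `Props`) rather than looking the peer up by path. The sketch below assumes the classic Akka actor API; `Fetch`, `Fetched`, `Store`, `Fetcher` and `Introductions.wire` are hypothetical names used only for illustration.

```scala
import akka.actor.{Actor, ActorRef, ActorSystem, Props}

// Hypothetical messages, for illustration only.
final case class Fetch(url: String)
final case class Fetched(url: String)

// Hypothetical collaborator that receives the results.
class Store extends Actor {
  def receive = {
    case Fetched(url) => println(s"storing $url")
  }
}

// The Fetcher is introduced to its collaborator through its constructor,
// so it never has to resolve a path with actorSelection.
class Fetcher(store: ActorRef) extends Actor {
  def receive = {
    case Fetch(url) => store ! Fetched(url)
  }
}

object Introductions {
  // Creates both actors and hands the Store's ActorRef to the Fetcher.
  def wire(system: ActorSystem): ActorRef = {
    val store = system.actorOf(Props(new Store), "store")
    system.actorOf(Props(new Fetcher(store)), "fetcher")
  }
}
```

Because the reference is passed explicitly, the Fetcher keeps working if the Store is later renamed or moved, which a hard-coded path would not survive.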

Different types…
but no structure!

Parent / child relationships
also allow for Actor supervision.

// default
context.actorOf(childProps) // "$a", "$b", "$c"
Default names are: BASE64(sequence_nr++)
Here’s why:
- cheap to generate
- guarantees uniqueness
- less chars than plain numbers

// default: naming is BASE64(sequential numbers)
// better: but not very informative...
context.actorOf(childProps, nextFetchWorkerName) // "fetch-worker-1", "fetch-worker-2"
private var _fetchWorkers: Int = 0
private def nextFetchWorkerName: String = {
  _fetchWorkers += 1
  s"fetch-worker-${_fetchWorkers}"
}
Sequential names are a bit better sometimes.

If you use this pattern a lot, here’s a simple encapsulation of it:
import java.util.concurrent.atomic.AtomicLong
abstract class SeqActorName {
  def next(): String
  def copy(name: String): SeqActorName
}
object SeqActorName {
  def apply(prefix: String) = new SeqActorNameImpl(prefix, new AtomicLong(0))
}
// The counter is shared between copies, so numbers stay unique across prefixes.
final class SeqActorNameImpl(val prefix: String, counter: AtomicLong)
  extends SeqActorName {
  def next(): String = prefix + '-' + counter.getAndIncrement()
  def copy(newPrefix: String): SeqActorName = new SeqActorNameImpl(newPrefix, counter)
}

// BEST: proper names, based on useful information
context.actorOf(childProps, fetcherName(videoUrl)) // "fetch-yt-MRCWy2E_Ts", ...
def fetcherName(link: Link) = link match {
case YoutubeLink(id, metadata) => s"fetch-yt-$id"
case DailyMotionLink(id, metadata) => s"fetch-dm-$id"
case VimeoLink(id, metadata) => s"fetch-vim-$id"
}
Meaningful names are the best!

import akka.actor.{OneForOneStrategy, SupervisorStrategy}
import akka.actor.SupervisorStrategy._
import scala.concurrent.duration._
// ... extends Actor with ActorLogging {
override def supervisorStrategy: SupervisorStrategy =
  OneForOneStrategy(maxNrOfRetries = 10, withinTimeRange = 1.minute) {
    case ex: Exception =>
      log.warning("Child {} failed with {}, attempting restart...",
        sender().path.name, // the name of the failed child Actor!
        ex.getMessage)
      Restart
  }

// BAD –– String ALWAYS built
log.debug(s"Something heavy $generateId from $physicalAddress")
// GOOD! –– String built only when DEBUG level is ON
log.debug("Something heavy {} from {}", generateId, physicalAddress)
Side note: always use {} log formatting (or macros), not s""

See Jamie Allen’s talk on the subject.

Blocking operations are really bad.
Actors are all about resource sharing, and if someone is “behaving
badly” it hurts everyone.
Here is an example of how blocking can grind an app to a halt.
Next we’ll see how to avoid that… even if we have to live with the
blocking code.

In simple terms:
Blocking is bad because instead of doing something else,
we just wait and do nothing (wasting CPU time)…

Having said that, it’s not a bad question. Let’s investigate.

// BAD! (due to the blocking in Future):
implicit val defaultDispatcher = system.dispatcher
val routes: Route = post {
complete {
Future { // uses defaultDispatcher
Thread.sleep(5000) // will block on the default dispatcher,
System.currentTimeMillis().toString // starving the routing infra
}
}
}

// application.conf
my-blocking-dispatcher {
  type = Dispatcher
  executor = "thread-pool-executor"
  thread-pool-executor {
    // in Akka previous to 2.4.2:
    core-pool-size-min = 16
    core-pool-size-max = 16
    max-pool-size-min = 16
    max-pool-size-max = 16
    // or in Akka 2.4.2+
    fixed-pool-size = 16
  }
  throughput = 100
}

// GOOD (due to the blocking on a dedicated dispatcher):
implicit val blockingDispatcher = system.dispatchers.lookup("my-blocking-dispatcher")
val routes: Route = post {
complete {
Future { // uses the good "blocking dispatcher" that we configured,
// instead of the default dispatcher – the blocking is isolated.
Thread.sleep(5000)
System.currentTimeMillis().toString
}
}
}

The “Never block!” mantra sounds cool,
but actually what we mean by it is “blocking needs careful management”.
We use the “bulkhead” pattern to separate out potentially blocking
behaviours to their own independent dispatchers (and should always do so).
http://stackoverflow.com/questions/34641861/akka-http-blocking-in-a-future-blocks-the-server/34645097#34645097
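The same bulkhead idea can be tried without Akka at all. The following is a minimal sketch using only the Scala and Java standard libraries; the pool sizes, the `named` helper and the `slowCall` / `quickCall` functions are assumptions made up for illustration, not part of the talk.

```scala
import java.util.concurrent.{Executors, ThreadFactory}
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{ExecutionContext, Future}

object Bulkhead {
  // Hypothetical helper: names the threads so we can see where work runs.
  private def named(prefix: String): ThreadFactory = new ThreadFactory {
    private val counter = new AtomicInteger(0)
    def newThread(r: Runnable): Thread = {
      val t = new Thread(r, s"$prefix-${counter.incrementAndGet()}")
      t.setDaemon(true) // do not keep the JVM alive just for this sketch
      t
    }
  }

  // One pool is sacrificed to blocking calls, the other stays responsive.
  val blockingEc: ExecutionContext =
    ExecutionContext.fromExecutor(Executors.newFixedThreadPool(16, named("blocking")))
  val cpuEc: ExecutionContext =
    ExecutionContext.fromExecutor(Executors.newFixedThreadPool(4, named("cpu")))

  // Blocking work is explicitly sent to the blocking pool.
  def slowCall(): Future[String] = Future {
    Thread.sleep(200)
    Thread.currentThread().getName
  }(blockingEc)

  // Non-blocking work never competes with the sleeping threads.
  def quickCall(): Future[String] = Future {
    Thread.currentThread().getName
  }(cpuEc)
}
```

Even if all 16 blocking threads are asleep, `quickCall()` still finds a free thread, because the two pools share nothing.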

6 : Never Await, for/flatMap instead!

// ... extends Actor {
import context.dispatcher
import scala.concurrent.Await // bad sign!
// BAD!!!
val fThings: Future[Things] = computeThings()
val t: Things = Await.result(fThings, atMost = 3.seconds)
val d: Details = Await.result(moreDetailsFor(t), atMost = 3.seconds)

// Good:
val fThingsWithDetails = for {
  t <- computeThings()
  d <- moreDetailsFor(t)
} yield t -> d
fThingsWithDetails foreach {
  case (things, details) => // case (things: Things, details: Details) =>
    println(s"$things with $details")
}

// Good, now with a timeout added:
import java.util.concurrent.TimeoutException
val fThingsWithDetails = for {
  t <- computeThings()
  d <- moreDetailsFor(t)
} yield t -> d
val timeoutFuture = akka.pattern.after(3.seconds, context.system.scheduler) {
  Future.failed(new TimeoutException("My timeout details..."))
}
// whichever completes first wins: the result, or the failed timeout Future
Future.firstCompletedOf(fThingsWithDetails :: timeoutFuture :: Nil) foreach {
  case (things, details) =>
    println(s"$things with $details")
}

Java Serialization is the default one in Akka, since it’s easy to
get started with it – no configuration needed.
If you need performance and are running on multiple nodes,
you must change the serialization.
Popular formats are ProtoBuf or Kryo.
Kryo is easier to set up, but it is harder to evolve a schema with it.
ProtoBuf is harder to maintain, but offers great schema evolution.

Benchmarking serialization impact on “ping pong” case.
(Two actors sending a message between them.)
in-process messaging, super fast. 
no serialization overhead.

more work => increased latency => decreased throughput.
over-the-network messaging,
slower due to network and serialization.

Java Serialization is known to be:
very slow & footprint heavy

It is on by default in Akka… Why?
a) zero setup => simple to “play around”
b) historical reasons - hard to remove the default
Since 2.4 a warning is logged: 
WARNING: Using the default Java serializer for class [{}] which is not recommended
because of performance implications. Use another serializer or disable this warning
using the setting 'akka.actor.warn-about-java-serializer-usage'

sbt> jmh:run  
-f 1
-tu us  
-wi 20
-i 10
-jvm /home/ktoso/opt/jdk1.8.0_65/bin/java
-jvmArgsAppend -XX:+PreserveFramePointer
-bm avgt
.*pingPong.*
[info] # JMH 1.10.3 (released 184 days ago, please consider updating!)
[info] # VM version: JDK 1.8.0_65, VM 25.65-b01
[info] # VM invoker: /home/ktoso/opt/jdk1.8.0_65/bin/java
[info] # VM options: -XX:+PreserveFramePointer
[info] # Warmup: 20 iterations, 5 s each
[info] # Measurement: 10 iterations, 1 s each
[info] # Timeout: 10 min per iteration
[info] # Threads: 1 thread, will synchronize iterations
[info] # Benchmark mode: Average time, time/op
[info] # Benchmark: akka.actor.ForkJoinActorBenchmark.pingPong
[info] # Parameters: (serializer = java)
github.com/ktoso/sbt-jmh
openjdk.java.net/projects/code-tools/jmh/

[info] # Warmup Iteration 1: 35.717 us/op
. . .
[info] Iteration 1: 25.790 us/op
. . .
[info] Iteration 10: 26.168 us/op
 
[info] Result "pingPong":
[info] 25.464 ±(99.9%) 1.175 us/op [Average]
[info] (min, avg, max) = (24.383, 25.464, 26.888), stdev = 0.777
[info] CI (99.9%): [24.289, 26.639] (assumes normal distribution)
[info] ForkJoinActorBenchmark.pingPong java avgt 10 25.464 ± 1.175 us/op
[info] ForkJoinActorBenchmark.pingPong off avgt 10 0.967 ± 0.657 us/op

Good serializers include (but are not limited to):
Kryo, Google Protocol Buffers, SBE, Thrift, even JSON (sic!)
// dependencies
"com.github.romix.akka" %% "akka-kryo-serialization" % "0.4.0"
// application.conf
akka.extensions = ["com.romix.akka.serialization.kryo.KryoSerializationExtension$"]
akka.actor.serializers {
  java = "akka.serialization.JavaSerializer"
  kryo = "com.romix.akka.serialization.kryo.KryoSerializer"
}
akka.actor.serialization-bindings {
  "com.mycompany.Example": kryo
  . . .
}
[info] ForkJoinActorBenchmark.pingPong java avgt 10 25.464 ± 1.175 us/op
[info] ForkJoinActorBenchmark.pingPong kryo avgt 10 4.348 ± 4.346 us/op
[info] ForkJoinActorBenchmark.pingPong off avgt 10 0.967 ± 0.657 us/op

final case class Order(id: Long, description: String, totalCost: BigDecimal,
  orderLines: ArrayList[OrderLines], customer: Customer)
Java Serialization:
----sr--model.Order----h#-----J--idL--customert--Lmodel/Customer;L--descriptiont--Ljava/lang/String;L--orderLinest--Ljava/util/List;L--totalCostt--Ljava/
math/BigDecimal;xp--------ppsr--java.util.ArrayListx-----a----I--sizexp----w-----sr--model.OrderLine--&-1-S----I--lineNumberL--costq-~--L--descriptionq-
~--L--ordert--Lmodel/Order;xp----sr--java.math.BigDecimalT--W--(O---I--scaleL--intValt--Ljava/math/BigInteger;xr--java.lang.Number-----------xp----sr--
java.math.BigInteger-----;-----I--bitCountI--bitLengthI--firstNonzeroByteNumI--lowestSetBitI--signum[--magnitudet--[Bxq-~----------------------ur--[B------
T----xp----xxpq-~--xq-~--
XML…!
<order id="0" totalCost="0"><orderLines lineNumber="1" cost="0"><order>0</order></orderLines></order>
JSON…!
{"order":{"id":0,"totalCost":0,"orderLines":[{"lineNumber":1,"cost":0,"order":0}]}}
Kryo…!
------java-util-ArrayLis-----model-OrderLin----java-math-BigDecima---------model-Orde-----
Excellent post by James Sutherland @
http://java-persistence-performance.blogspot.com/2013/08/optimizing-java-serialization-java-vs.html

7 : Avoid Java Serialization for Persistence!!!
Java Serialization is a horrible idea if you’re going to store the
messages for a long time.
For example, with Akka Persistence we store events “forever”.
Use a serialization format that can evolve over time in a
compatible way. It can be JSON or ProtocolBuffers (or Thrift
etc).

7.5 : Trust no-one, benchmark everything!
Always measure and benchmark properly
before judging performance of a tool / library.
Benchmarking is often very hard,
use the right tools:
- JMH (for Scala via: ktoso/sbt-jmh)
- YourKit / JProfiler / …
- Linux perf_events

8 : Let it Crash! Supervision, Failures & Errors

http://www.reactivemanifesto.org/
Error
… which is an expected and coded-for condition—for
example an error discovered during input validation, that
will be communicated to the client …
Failure
… is an unexpected event within a service that
prevents it from continuing to function normally.  
A failure will generally prevent responses to the current,
and possibly all following, client requests.

Error: “Not enough cash.”

Error: “Unable to fulfil request”
Failure: “Row 3 is broken”
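The distinction can also be made visible in types. In this sketch (plain Scala, no Akka), an Error is a value returned to the caller, while a Failure is an exception that the code does not try to handle locally and would leave to a supervisor. `Wallet`, `NotEnoughCash`, `withdraw` and `loadBalance` are hypothetical names chosen to mirror the examples above.

```scala
object Wallet {
  // Errors: expected, coded-for conditions that are reported back to the client.
  sealed trait WithdrawError
  final case class NotEnoughCash(balance: BigDecimal, requested: BigDecimal)
    extends WithdrawError

  // Validation problems are returned as values, never thrown.
  def withdraw(balance: BigDecimal, amount: BigDecimal): Either[WithdrawError, BigDecimal] =
    if (amount > balance) Left(NotEnoughCash(balance, amount))
    else Right(balance - amount)

  // Failures: unexpected conditions. We do not try to recover here;
  // the exception escapes and whoever supervises this code decides what to do.
  def loadBalance(row: Option[String]): BigDecimal =
    row match {
      case Some(raw) => BigDecimal(raw) // throws NumberFormatException on garbage
      case None      => throw new IllegalStateException("Row is broken")
    }
}
```

In an Actor, the `Left` would typically become a reply message to the sender, while the thrown exception would crash the Actor and reach its parent's supervisor strategy.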

Our goal is to “let things crash”
and “recover gracefully”
Not to hammer the DB while it tries to recover!

If we allowed immediate restarts…
we could end up in “infinite replay+fail hell”.
(We don’t, since Persistence went stable in 2.4.x.)

Many PersistentActors Fail and Stop.

Backoff Supervisor counts failures
and starts “the same” entity after
exponential timeouts.
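To make “exponential timeouts” concrete, here is a dependency-free sketch of how such a delay schedule can be computed. It is not Akka's implementation: in Akka 2.4 this behaviour is provided by `akka.pattern.BackoffSupervisor`, which to my understanding also adds a random jitter factor that is left out here to keep the numbers predictable. `BackoffSchedule` and its parameters are hypothetical names.

```scala
import scala.concurrent.duration._

object BackoffSchedule {
  // Delay before the n-th restart: minBackoff doubled n times, capped at maxBackoff.
  def delay(restartCount: Int,
            minBackoff: FiniteDuration,
            maxBackoff: FiniteDuration): FiniteDuration = {
    val uncapped = minBackoff.toMillis * math.pow(2, restartCount)
    // Comparing as Double avoids overflow for large restart counts.
    val capped = math.min(maxBackoff.toMillis.toDouble, uncapped)
    capped.toLong.millis
  }
}
```

With `minBackoff = 3.seconds` and `maxBackoff = 30.seconds`, the restarts are spaced 3 s, 6 s, 12 s, 24 s and then 30 s apart, which gives a struggling database room to recover instead of being hammered.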

def receive = {
case Thingy() =>
// ...
case AnotherThingy() =>
// ...
case DoOtherThings() =>
// ...
case PleaseGoAway() =>
// ...
case CarryOn() =>
// ...
case MakeSomething() =>
// ...
// ...
}

def receive = {
case Thingy() =>
// ...
// ...
// ...
// ...
case CarryOn() =>
// ...
// ...
// ...
}
Good:
Actors avoid the “pyramid of doom”.
Pyramid of doom in some
async programming styles.

That works well because
“everything is a message”:

def receive = awaitingInstructions
def awaitingInstructions: Receive =
  terminationHandling orElse {
    case CarryOn() =>
      // ...
    case MakeSomething(metadata) =>
      // ...
      context become makeThings(metadata)
  }
def makeThings(metadata: Metadata): Receive =
  terminationHandling orElse {
    case Thingy() =>
      // make a thingy ...
      // make another thingy ...
    case DoOtherThings(meta) =>
      // ...
      context become awaitingInstructions
  }
def terminationHandling: Receive = {
  case PleaseGoAway() =>
    // ...
    context stop self
}
[State diagram: MakeSomething switches to makeThings; DoOtherThings switches back to awaitingInstructions.]

We also provide an FSM (Finite State Machine) helper trait.
You may enjoy it sometimes, give it a look.
http://doc.akka.io/docs/akka/2.4.1/scala/fsm.html
class Buncher extends FSM[State, Data] {
startWith(Idle, Uninitialized)
when(Idle) {
case Event(SetTarget(ref), Uninitialized) =>
stay using Todo(ref, Vector.empty)
}
// transition elided ...
when(Active, stateTimeout = 1 second) {
case Event(Flush | StateTimeout, t: Todo) =>
goto(Idle) using t.copy(queue = Vector.empty)
}
// unhandled elided ...
initialize()
}

http://doc.akka.io/docs/akka/2.4.1/common/cluster.html
Cluster Gossip Convergence
When a node can prove that the cluster state it is observing
has been observed by all other nodes in the cluster.
Convergence is required for “Leader actions”,
which include Join-ing and Remove-ing a node.
Down-ing can happen without convergence.

akka.cluster.allow-weakly-up-members=on

Everyone has ‘seen’
Kurt joining,
move him to Up.

Kurt is now with us
in the Cluster.

I can’t hear Kurt…
He’s unreachable…

Kurt has not
seen Rei… I cannot
mark him as Up!
allow-weakly-up-members=off // default

I’ll mark Rei as
“WeaklyUp” until Kurt
is back.
allow-weakly-up-members=on

Kurt is back! Once he
has seen Rei I’ll mark
Rei as Up.
allow-weakly-up-members=on

I’m going
home…

Make sure the others have heard you say goodbye before you leave.
Vanishes immediately.

“Where’s Bill?
I did not hear him say Goodbye!”
“Failure Detector” used to determine UNREACHABLE.

Failure detection only triggers “UNREACHABLE”.
Nodes can come back from that state.

Declaring DOWN is done either by timeouts (which is rather unsafe) [auto-downing],
or by “voting” or “majority” among the members of the cluster [split-brain-resolver].
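As a rough illustration of the “majority” idea, each side of a partition can count the members it still reaches and only stay up if it holds more than half of the cluster. This is a simplified sketch and not the actual split-brain-resolver logic; `KeepMajority`, `Decision` and the conservative handling of ties (both halves shut down) are assumptions made for this example.

```scala
object KeepMajority {
  sealed trait Decision
  case object DownUnreachable extends Decision // we are the majority: down the other side
  case object DownOurselves extends Decision   // we are the minority (or tied): shut down

  // `reachable` includes this node; `unreachable` is what the failure detector reports.
  def decide(reachable: Int, unreachable: Int): Decision =
    if (reachable > unreachable) DownUnreachable
    else DownOurselves
}
```

Because both sides apply the same rule to the same membership list, at most one side can conclude that it is the majority, so two separate clusters can never keep running at once.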

Node declared “DOWN” comes back…
“You are already dead.” (お前はもう死んでいる….)

12 : Cluster Partitions and “Down”
Why do we do that?
In order to guarantee consistency via the
“single writer principle”.
Note:
Akka Distributed Data has no need for a “single
writer”, it’s CRDT based. But it’s harder to model
things as CRDTs, so it’s a trade-off.
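To see why a CRDT needs no single writer, consider a grow-only counter. The sketch below is a toy version written for this explanation, not the API of Akka Distributed Data; `GCounterSketch` and its methods are hypothetical names. Every node only increments its own slot, and merging takes the per-node maximum, so replicas can be combined in any order.

```scala
// A toy grow-only counter: one slot per node, merged by taking the maximum.
final case class GCounterSketch(counts: Map[String, Long] = Map.empty) {

  // Each node only ever increments its own entry.
  def increment(node: String, by: Long = 1L): GCounterSketch =
    copy(counts = counts.updated(node, counts.getOrElse(node, 0L) + by))

  // The counter's value is the sum over all nodes.
  def value: Long = counts.values.sum

  // Merge is commutative, associative and idempotent,
  // so concurrent writers on different nodes never conflict.
  def merge(that: GCounterSketch): GCounterSketch =
    GCounterSketch((counts.keySet ++ that.counts.keySet).map { node =>
      node -> math.max(counts.getOrElse(node, 0L), that.counts.getOrElse(node, 0L))
    }.toMap)
}
```

The trade-off mentioned above shows up quickly: a counter that only grows is easy, but modelling something like “withdraw only if the balance allows it” this way is much harder.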

Notice that we do not mention “Quarantined”.
That is a state in Akka Remoting, not Cluster.
It’s a terminal state from which one can never recover.
TL;DR:
Use Akka Cluster instead of Remoting.
It’s pretty much always the thing you need (better than plain Remoting).

13 : A fishing rod is a Tool. Akka is a Toolkit.

Akka is a Toolkit,
not a Framework.
“Give a man a fish and you feed him for a day;
teach a man to fish and you feed him for a lifetime.”

Play is a Framework,
Lagom is a Framework.

13 : Akka is a Toolkit, pick the right tools for the job.
“Constraints Liberate,
Liberties Constrain”
Runar Bjarnason
Runar’s excellent talk @ Scala.World 2015

The less powerful abstraction
must be built on top of
more powerful abstractions.

Asynchronous processing toolbox:

Futures:
Single value, no streaming by definition.
Local abstraction.
Execution contexts.

Akka Streams:
Mostly static processing layouts.
Well typed and back-pressured!

Akka Typed:
Plain Actor’s younger brother, experimental.
Location transparent, well typed.
Technically unconstrained in actions performed.

Actors:
Location transparent.
Various resilience mechanisms
(watching, persistent recovering, migration, pools).
Untyped and unconstrained in actions performed.

13 : Picking the tools – distributing / sharding processing
Whitepaper by Pat Helland, 2005:
http://cidrdb.org/cidr2005/papers/P12.pdf

13 : Picking the tools – “element by element” processing
// types: _
Source[Int, Unit]
Flow[Int, String, Unit]
Sink[String, Future[String]]
Source.single(1).map(_.toString).runWith(Sink.head)

13 : Picking the tools – streaming HTTP APIs

No demand from TCP
=
No demand upstream
=
Source won’t generate tweets
=>
Bounded memory
stream processing!
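The principle that nothing is produced without demand can be illustrated with a plain lazy `Iterator`. This is only an analogy: Akka Streams signals demand asynchronously following Reactive Streams, whereas an `Iterator` is pulled synchronously. `Demand.run` and the fake tweet source are hypothetical.

```scala
object Demand {
  // Returns what the consumer received and how many elements the source produced.
  def run(demand: Int): (List[String], Int) = {
    var generated = 0
    // An "infinite" source: elements are only created when someone pulls.
    val tweets: Iterator[String] = Iterator.from(1).map { n =>
      generated += 1
      s"tweet-$n"
    }
    // The consumer signals demand for exactly `demand` elements.
    val consumed = tweets.take(demand).toList
    (consumed, generated)
  }
}
```

The source is conceptually endless, yet memory use stays bounded, because it never runs ahead of what the consumer asked for.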

14 : Happy hAkking, Community!
akka.io – website
github.com/akka/akka/issues – help out!
“community-contrib” or “small” for starters
groups.google.com/group/akka-user – mailing list
gitter.im/akka/akka – chat about using Akka
gitter.im/akka/dev – chat about developing Akka

Links
• http://stackoverflow.com/questions/34641861/akka-http-blocking-in-a-future-blocks-the-server/34645097#34645097
• The wonderful Zen paintings to buy here:
http://paintingwholesalechina.com/products/chinese-culture-zen-no-1-academic-chinese-painting
• http://doc.akka.io/docs/akka/2.4.1/common/cluster.html
• http://www.mit.edu/~xela/tao.html
• http://doc.akka.io/docs/akka/2.4.1/scala/cluster-usage.html
• akka.io et al.

Thanks!
ktoso @ typesafe.com
twitter: ktosopl
github: ktoso
team blog: letitcrash.com
home: akka.io
Thus spake the Master Programmer:
“After three days without programming,
life becomes meaningless.”
