Akka Stream and HTTP Experimental

Java Documentation
Release 1.0

Typesafe Inc

July 20, 2015

CONTENTS

Streams
1.1 Introduction
1.2 Quick Start Guide: Reactive Tweets
1.3 Design Principles behind Akka Streams
1.4 Basics and working with Flows
1.5 Working with Graphs
1.6 Modularity, Composition and Hierarchy
1.7 Buffers and working with rate
1.8 Custom stream processing
1.9 Integration
1.10 Error Handling
1.11 Working with streaming IO
1.12 Pipelining and Parallelism
1.13 Testing streams
1.14 Overview of built-in stages and their semantics
1.15 Streams Cookbook
1.16 Configuration

Akka HTTP
2.1 Configuration
2.2 HTTP Model
2.3 Low-Level Server-Side API
2.4 Server-Side WebSocket Support
2.5 High-level Server-Side API
2.6 Consuming HTTP-based Services (Client-Side)

CHAPTER ONE

STREAMS

1.1 Introduction
1.1.1 Motivation
The way we consume services from the internet today includes many instances of streaming data, both downloading from a service as well as uploading to it or peer-to-peer data transfers. Regarding data as a stream of
elements instead of in its entirety is very useful because it matches the way computers send and receive them (for
example via TCP), but it is often also a necessity because data sets frequently become too large to be handled as
a whole. We spread computations or analyses over large clusters and call it big data, where the whole principle
of processing them is by feeding those data sequentially, as a stream, through some CPUs.
Actors can be seen as dealing with streams as well: they send and receive series of messages in order to transfer
knowledge (or data) from one place to another. We have found it tedious and error-prone to implement all the
proper measures in order to achieve stable streaming between actors, since in addition to sending and receiving
we also need to take care to not overflow any buffers or mailboxes in the process. Another pitfall is that Actor
messages can be lost and must be retransmitted in that case lest the stream have holes on the receiving side. When
dealing with streams of elements of a fixed given type, Actors also do not currently offer good static guarantees
that no wiring errors are made: type-safety could be improved in this case.
For these reasons we decided to bundle up a solution to these problems as an Akka Streams API. The purpose is to
offer an intuitive and safe way to formulate stream processing setups such that we can then execute them efficiently
and with bounded resource usage (no more OutOfMemoryErrors). In order to achieve this our streams need to be
able to limit the buffering that they employ, and they need to be able to slow down producers if the consumers cannot
keep up. This feature is called back-pressure and is at the core of the Reactive Streams initiative of which Akka is a
founding member. For you this means that the hard problem of propagating and reacting to back-pressure has been
incorporated in the design of Akka Streams already, so you have one less thing to worry about; it also means that
Akka Streams interoperate seamlessly with all other Reactive Streams implementations (where Reactive Streams
interfaces define the interoperability SPI while implementations like Akka Streams offer a nice user API).
Relationship with Reactive Streams
The Akka Streams API is completely decoupled from the Reactive Streams interfaces. While Akka Streams
focus on the formulation of transformations on data streams, the scope of Reactive Streams is just to define a
common mechanism of how to move data across an asynchronous boundary without losses, buffering or resource
exhaustion.
The relationship between these two is that the Akka Streams API is geared towards end-users while the Akka
Streams implementation uses the Reactive Streams interfaces internally to pass data between the different processing stages. For this reason you will not find any resemblance between the Reactive Streams interfaces and the
Akka Streams API. This is in line with the expectations of the Reactive Streams project, whose primary purpose is
to define interfaces such that different streaming implementations can interoperate; it is not the purpose of Reactive
Streams to describe an end-user API.
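To make the demand-driven protocol behind back-pressure more concrete, here is a minimal sketch that is not taken from the Akka documentation. It assumes JDK 9 or later, where the four Reactive Streams interfaces are also available as nested types of java.util.concurrent.Flow, and it uses those JDK copies (together with the JDK's SubmissionPublisher) instead of Akka Streams so that it runs without extra dependencies. The class names OneAtATimeDemo and OneAtATimeSubscriber are hypothetical. The subscriber signals demand for one element at a time, so the publisher can never push more than the subscriber has asked for.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

final class OneAtATimeDemo {

    /** Subscriber that never has more than one element of outstanding demand. */
    static final class OneAtATimeSubscriber<T> implements Flow.Subscriber<T> {
        private final List<T> received = Collections.synchronizedList(new ArrayList<>());
        private final CountDownLatch done = new CountDownLatch(1);
        private volatile Throwable failure;
        private Flow.Subscription subscription;

        @Override
        public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            s.request(1); // signal demand for exactly one element
        }

        @Override
        public void onNext(T item) {
            received.add(item);
            subscription.request(1); // ask for the next element only after handling this one
        }

        @Override
        public void onError(Throwable t) {
            failure = t;
            done.countDown();
        }

        @Override
        public void onComplete() {
            done.countDown();
        }

        /** Waits for completion and returns a copy of everything received. */
        List<T> awaitResult(long timeoutSeconds) {
            try {
                if (!done.await(timeoutSeconds, TimeUnit.SECONDS)) {
                    throw new IllegalStateException("stream did not complete in time");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
            if (failure != null) {
                throw new IllegalStateException("stream failed", failure);
            }
            return new ArrayList<>(received);
        }
    }

    static List<Integer> runDemo() {
        OneAtATimeSubscriber<Integer> subscriber = new OneAtATimeSubscriber<>();
        try (SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            for (int i = 1; i <= 5; i++) {
                publisher.submit(i); // would block if the subscriber's buffer were full
            }
        } // close() signals onComplete once the submitted elements have been delivered
        return subscriber.awaitResult(5);
    }

    public static void main(String[] args) {
        System.out.println(runDemo()); // prints [1, 2, 3, 4, 5]
    }
}
```

With Akka Streams this bookkeeping is handled by the library; the sketch only illustrates the kind of signalling that happens at an asynchronous boundary.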

Akka Stream and HTTP Experimental Java Documentation, Release 1.0

1.1.2 How to read these docs


Stream processing is a different paradigm to the Actor Model or to Future composition, therefore it may take some
careful study of this subject until you feel familiar with the tools and techniques. The documentation is here to
help and for best results we recommend the following approach:
• Read the Quick Start Guide: Reactive Tweets to get a feel for what streams look like and what they can do.
• The top-down learners may want to peruse the Design Principles behind Akka Streams at this point.
• The bottom-up learners may feel more at home rummaging through the Streams Cookbook.
• For a complete overview of the built-in processing stages you can look at the table in Overview of built-in
stages and their semantics.
• The other sections can be read sequentially or as needed during the previous steps, each digging deeper into
specific topics.

1.2 Quick Start Guide: Reactive Tweets


A typical use case for stream processing is consuming a live stream of data that we want to extract or aggregate
some other data from. In this example we'll consider consuming a stream of tweets and extracting information
concerning Akka from them.
We will also consider the problem inherent to all non-blocking streaming solutions: "What if the subscriber is
too slow to consume the live stream of data?" Traditionally the solution is often to buffer the elements, but this
can (and usually will) cause eventual buffer overflows and instability of such systems. Instead Akka Streams
depend on internal back-pressure signals that allow you to control what should happen in such scenarios.
Here's the data model we'll be working with throughout the quickstart examples:
public static class Author {
    public final String handle;
    public Author(String handle) {
        this.handle = handle;
    }
    // ...
}
public static class Hashtag {
    public final String name;
    public Hashtag(String name) {
        this.name = name;
    }
    // ...
}
public static class Tweet {
    public final Author author;
    public final long timestamp;
    public final String body;
    public Tweet(Author author, long timestamp, String body) {
        this.author = author;
        this.timestamp = timestamp;
        this.body = body;
    }
    public Set<Hashtag> hashtags() {
        return Arrays.asList(body.split(" ")).stream()
            .filter(a -> a.startsWith("#"))
            .map(a -> new Hashtag(a))
            .collect(Collectors.toSet());
    }
    // ...
}
public static final Hashtag AKKA = new Hashtag("#akka");

Note: If you would like to get an overview of the used vocabulary first instead of diving head-first into an actual
example you can have a look at the Core concepts and Defining and running streams sections of the docs, and
then come back to this quickstart to see it all pieced together into a simple example application.

1.2.1 Transforming and consuming simple streams


The example application we will be looking at is a simple Twitter-fed stream from which we'll want to extract
certain information, like for example finding all twitter handles of users who tweet about #akka.
To prepare our environment we create an ActorSystem and an ActorMaterializer, which will be
responsible for materializing and running the streams we are about to create:
final ActorSystem system = ActorSystem.create("reactive-tweets");
final Materializer mat = ActorMaterializer.create(system);

The ActorMaterializer can optionally take ActorMaterializerSettings which can be used to define materialization properties, such as default buffer sizes (see also Buffers in Akka Streams), the dispatcher to be
used by the pipeline etc. These can be overridden with withAttributes on Flow, Source, Sink and Graph.
Let's assume we have a stream of tweets readily available; in Akka this is expressed as a Source:
Source<Tweet, BoxedUnit> tweets;

Streams always start flowing from a Source<Out,M1> then can continue through Flow<In,Out,M2> elements or more advanced graph elements to finally be consumed by a Sink<In,M3>.
The first type parameter (Tweet in this case) designates the kind of elements produced by the source while the
M type parameters describe the object that is created during materialization (see below). BoxedUnit (from the
scala.runtime package) means that no value is produced; it is the generic equivalent of void.
The operations should look familiar to anyone who has used the Scala Collections library, however they operate
on streams and not collections of data (which is a very important distinction, as some operations only make sense
in streaming and vice versa):
final Source<Author, BoxedUnit> authors =
    tweets
        .filter(t -> t.hashtags().contains(AKKA))
        .map(t -> t.author);

Finally in order to materialize and run the stream computation we need to attach the Flow to a Sink<T, M> that
will get the flow running. The simplest way to do this is to call runWith(sink) on a Source<Out, M>. For
convenience a number of common Sinks are predefined and collected as static methods on the Sink class. For now
let's simply print each author:
authors.runWith(Sink.foreach(a -> System.out.println(a)), mat);

or by using the shorthand versions (which are defined only for the most popular sinks such as Sink.fold and
Sink.foreach):


authors.runForeach(a -> System.out.println(a), mat);

Materializing and running a stream always requires a Materializer to be passed in explicitly, like this:
.run(mat).
The complete snippet looks like this:
final ActorSystem system = ActorSystem.create("reactive-tweets");
final Materializer mat = ActorMaterializer.create(system);
final Source<Author, BoxedUnit> authors =
    tweets
        .filter(t -> t.hashtags().contains(AKKA))
        .map(t -> t.author);
authors.runWith(Sink.foreach(a -> System.out.println(a)), mat);

1.2.2 Flattening sequences in streams


In the previous section we were working on 1:1 relationships of elements, which is the most common case, but
sometimes we might want to map from one element to a number of elements and receive a "flattened" stream,
similar to how flatMap works on Scala Collections. In order to get a flattened stream of hashtags from our
stream of tweets we can use the mapConcat combinator:
final Source<Hashtag, BoxedUnit> hashtags =
tweets.mapConcat(t -> new ArrayList<Hashtag>(t.hashtags()));

Note: The name flatMap was consciously avoided due to its proximity with for-comprehensions and monadic
composition. It is problematic for two reasons: firstly, flattening by concatenation is often undesirable in bounded
stream processing due to the risk of deadlock (with merge being the preferred strategy), and secondly, the monad
laws would not hold for our implementation of flatMap (due to the liveness issues).
Please note that the mapConcat requires the supplied function to return a strict collection (Out f ->
java.util.List<T>), whereas flatMap would have to operate on streams all the way through.
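As a collection-level analogue of the shape that mapConcat expects, the following sketch uses only the JDK. It is not Akka Streams code: it works on a finite list rather than on a back-pressured stream, and the names FlattenSketch, hashtagsOf and allHashtags are hypothetical. The point it illustrates is that the supplied function returns a strict list per element, and the results are concatenated in order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class FlattenSketch {

    /** Strict function from one element to a list of elements. */
    static List<String> hashtagsOf(String body) {
        List<String> tags = new ArrayList<>();
        for (String word : body.split(" ")) {
            if (word.startsWith("#")) {
                tags.add(word);
            }
        }
        return tags;
    }

    /** Applies hashtagsOf to every body and concatenates the resulting lists in order. */
    static List<String> allHashtags(List<String> bodies) {
        return bodies.stream()
            .flatMap(body -> hashtagsOf(body).stream())
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> bodies = Arrays.asList("#akka rocks #streams", "no tags here", "#java");
        System.out.println(allHashtags(bodies)); // prints [#akka, #streams, #java]
    }
}
```

An element that maps to an empty list simply contributes nothing to the output, which is how the second body above disappears from the result.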

1.2.3 Broadcasting a stream


Now let's say we want to persist all hashtags, as well as all author names from this one live stream. For example
we'd like to write all author handles into one file, and all hashtags into another file on disk. This means we have
to split the source stream into 2 streams which will handle the writing to these different files.
Elements that can be used to form such "fan-out" (or "fan-in") structures are referred to as "junctions" in Akka
Streams. One of these that we'll be using in this example is called Broadcast, and it simply emits elements
from its input port to all of its output ports.
Akka Streams intentionally separate the linear stream structures (Flows) from the non-linear, branching ones
(FlowGraphs) in order to offer the most convenient API for both of these cases. Graphs can express arbitrarily
complex stream setups at the expense of not reading as familiarly as collection transformations.
A graph can be either closed, which is also known as a "fully connected graph", or partial, which can be
seen as a partial graph (a graph with some unconnected ports), thus being a generalisation of the Flow concept,
where Flow is simply a partial graph with one unconnected input and one unconnected output. Concepts around
composing and nesting graphs in large structures are explained in detail in Modularity, Composition
and Hierarchy.
It is also possible to wrap complex computation graphs as Flows, Sinks or Sources, which will be explained in
detail in Constructing Sources, Sinks and Flows from Partial Graphs. FlowGraphs are constructed like this:


Sink<Author, BoxedUnit> writeAuthors;
Sink<Hashtag, BoxedUnit> writeHashtags;
FlowGraph.factory().closed(b -> {
    final UniformFanOutShape<Tweet, Tweet> bcast = b.graph(Broadcast.create(2));
    final Flow<Tweet, Author, BoxedUnit> toAuthor = Flow.of(Tweet.class).map(t -> t.author);
    final Flow<Tweet, Hashtag, BoxedUnit> toTags =
        Flow.of(Tweet.class).mapConcat(t -> new ArrayList<Hashtag>(t.hashtags()));
    b.from(tweets).via(bcast).via(toAuthor).to(writeAuthors);
    b.from(bcast).via(toTags).to(writeHashtags);
}).run(mat);

As you can see, we use the graph builder b to mutably construct the graph using its from, via and to methods. Once
the FlowGraph has been constructed it is immutable, thread-safe, and freely shareable. A graph can be run()
directly, assuming all ports (sinks/sources) within a flow have been connected properly. It is possible to construct PartialFlowGraphs where this is not required but this will be covered in detail in Constructing and
combining Partial Flow Graphs.
Like all Akka Streams elements, Broadcast will properly propagate back-pressure to its upstream element.
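The fan-out idea can be sketched in plain Java, independent of the Akka API. The class BroadcastSketch and its broadcast method are hypothetical names, and the tweets are modelled as simple "author: text" strings purely for illustration. Because the calls are synchronous, the next element is only read after every output has accepted the current one, which is a very crude stand-in for the back-pressure that the real Broadcast junction propagates upstream.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

final class BroadcastSketch {

    /** Emits every element of the input to each output, in order. */
    static <T> void broadcast(Iterable<T> input, List<? extends Consumer<? super T>> outputs) {
        for (T element : input) {
            for (Consumer<? super T> output : outputs) {
                // The next element is not read until every output has returned.
                output.accept(element);
            }
        }
    }

    public static void main(String[] args) {
        List<String> tweets = Arrays.asList("alice: hello #akka", "bob: #streams #akka");
        List<String> authors = new ArrayList<>();
        List<String> hashtags = new ArrayList<>();

        // First output: collect the author handle (text before the colon).
        Consumer<String> writeAuthors = t -> authors.add(t.substring(0, t.indexOf(':')));
        // Second output: collect every word that starts with '#'.
        Consumer<String> writeHashtags = t -> {
            for (String word : t.split(" ")) {
                if (word.startsWith("#")) {
                    hashtags.add(word);
                }
            }
        };

        List<Consumer<String>> outputs = new ArrayList<>();
        outputs.add(writeAuthors);
        outputs.add(writeHashtags);
        broadcast(tweets, outputs);

        System.out.println(authors);  // prints [alice, bob]
        System.out.println(hashtags); // prints [#akka, #streams, #akka]
    }
}
```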

1.2.4 Back-pressure in action


One of the main advantages of Akka Streams is that they always propagate back-pressure information from stream
Sinks (Subscribers) to their Sources (Publishers). It is not an optional feature, and is enabled at all times. To
learn more about the back-pressure protocol used by Akka Streams and all other Reactive Streams compatible
implementations read Back-pressure explained.
A typical problem that applications like this (not using Akka Streams) often face is that they are unable to process the
incoming data fast enough, either temporarily or by design, and will start buffering incoming data until there's
no more space to buffer, resulting in either OutOfMemoryErrors or other severe degradations of service
responsiveness. With Akka Streams buffering can and must be handled explicitly. For example, if we are only
interested in the "most recent tweets, with a buffer of 10 elements" this can be expressed using the buffer
element:
tweets
    .buffer(10, OverflowStrategy.dropHead())
    .map(t -> slowComputation(t))
    .runWith(Sink.ignore(), mat);

The buffer element takes an explicit and required OverflowStrategy, which defines how the buffer should
react when it receives another element while it is full. Strategies provided include dropping the oldest element
(dropHead), dropping the entire buffer, signalling failures etc. Be sure to pick and choose the strategy that fits
your use case best.
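To illustrate what "dropping the oldest element" means for a full buffer, here is a minimal single-threaded sketch. It is not how Akka Streams implements buffer; the class DropHeadBuffer and its methods are hypothetical and only model the overflow behaviour described above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class DropHeadBuffer<T> {
    private final int capacity;
    private final Deque<T> elements = new ArrayDeque<>();

    DropHeadBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Adds an element; returns the evicted oldest element, or null if nothing was dropped. */
    T offer(T element) {
        T dropped = null;
        if (elements.size() == capacity) {
            dropped = elements.removeFirst(); // buffer is full: drop the head (oldest element)
        }
        elements.addLast(element);
        return dropped;
    }

    /** Removes and returns the oldest buffered element, or null if the buffer is empty. */
    T poll() {
        return elements.pollFirst();
    }

    /** Copy of the current contents, oldest first. */
    List<T> snapshot() {
        return new ArrayList<>(elements);
    }

    public static void main(String[] args) {
        DropHeadBuffer<Integer> buffer = new DropHeadBuffer<>(3);
        for (int i = 1; i <= 5; i++) {
            buffer.offer(i);
        }
        System.out.println(buffer.snapshot()); // prints [3, 4, 5]
    }
}
```

With a capacity of 3 and five incoming elements, the two oldest elements are discarded and a slow consumer only ever sees the most recent ones.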

1.2.5 Materialized values


So far we've been only processing data using Flows and consuming it into some kind of external Sink, be it by
printing values or storing them in some external system. However sometimes we may be interested in some value
that can be obtained from the materialized processing pipeline. For example, we want to know how many tweets
we have processed. This question is not as obvious to answer in the case of an infinite stream of
tweets (one way to answer it in a streaming setting would be to create a stream of counts described as "up
until now, we've processed N tweets"), but in general it is possible to deal with finite streams and come up with a
nice result such as a total count of elements.
First, let's write such an element counter using Flow.of(Class) and Sink.fold to see what the types look
like:
final Sink<Integer, Future<Integer>> sumSink =
    Sink.<Integer, Integer>fold(0, (acc, elem) -> acc + elem);
final RunnableGraph<Future<Integer>> counter =
    tweets.map(t -> 1).toMat(sumSink, Keep.right());
final Future<Integer> sum = counter.run(mat);
sum.foreach(new Foreach<Integer>() {
    public void each(Integer c) {
        System.out.println("Total tweets processed: " + c);
    }
}, system.dispatcher());

First we prepare a reusable Flow that will change each incoming tweet into an integer of value 1. We combine
all values of the transformed stream using Sink.fold, which will sum all Integer elements of the stream and make
its result available as a Future<Integer>. Next we connect the tweets stream through a map step which
converts each tweet into the number 1; finally we connect the flow to the previously prepared Sink using toMat.
Remember those mysterious Mat type parameters on Source<Out, Mat>, Flow<In, Out, Mat> and
Sink<In, Mat>? They represent the type of values these processing parts return when materialized. When
you chain these together, you can explicitly combine their materialized values: in our example we used
the Keep.right predefined function, which tells the implementation to only care about the materialized
type of the stage currently appended to the right. As you can notice, the materialized type of sumSink is
Future<Integer> and because of using Keep.right, the resulting RunnableGraph has also a type
parameter of Future<Integer>.
This step does not yet materialize the processing pipeline, it merely prepares the description of the
Flow, which is now connected to a Sink, and therefore can be run(), as indicated by its type:
RunnableGraph<Future<Integer>>. Next we call run() which uses the ActorMaterializer to
materialize and run the flow. The value returned by calling run() on a RunnableGraph<T> is of type T. In
our case this type is Future<Integer> which, when completed, will contain the total length of our tweets
stream. In case of the stream failing, this future would complete with a Failure.
A RunnableGraph may be reused and materialized multiple times, because it is just the "blueprint" of the
stream. This means that if we materialize a stream twice, for example one that consumes a live stream of tweets within
a minute, the materialized values for those two materializations will be different, as illustrated by this example:
final Sink<Integer, Future<Integer>> sumSink =
    Sink.<Integer, Integer>fold(0, (acc, elem) -> acc + elem);
final RunnableGraph<Future<Integer>> counterRunnableGraph =
    tweetsInMinuteFromNow
        .filter(t -> t.hashtags().contains(AKKA))
        .map(t -> 1)
        .toMat(sumSink, Keep.right());
// materialize the stream once in the morning
final Future<Integer> morningTweetsCount = counterRunnableGraph.run(mat);
// and once in the evening, reusing the blueprint
final Future<Integer> eveningTweetsCount = counterRunnableGraph.run(mat);

Many elements in Akka Streams provide materialized values which can be used for obtaining either results of
computation or steering these elements which will be discussed in detail in Stream Materialization. Summing up
this section, now we know what happens behind the scenes when we run this one-liner, which is equivalent to the
multi line version above:
final Future<Integer> sum = tweets.map(t -> 1).runWith(sumSink, mat);

Note: runWith() is a convenience method that automatically ignores the materialized value of any other stages
except those appended by the runWith() itself. In the above example it translates to using Keep.right as
the combiner for materialized values.
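The distinction between a reusable blueprint and the value produced by each run can also be sketched with JDK types alone. This is an analogy under stated assumptions, not Akka code: CountBlueprint is a hypothetical class, CompletableFuture stands in for the materialized Future<Integer>, and a Supplier stands in for the source so that every run reads the data afresh.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

final class CountBlueprint<T> {
    private final Supplier<List<T>> source;

    CountBlueprint(Supplier<List<T>> source) {
        this.source = source;
    }

    /** Each call starts an independent run and returns that run's own future. */
    CompletableFuture<Integer> run() {
        return CompletableFuture.supplyAsync(() -> {
            int count = 0;
            for (T ignored : source.get()) {
                count++;
            }
            return count;
        });
    }

    /** Runs the same blueprint twice over data that changes in between. */
    static int[] demo() {
        List<String> tweets = new ArrayList<>(Arrays.asList("first", "second"));
        CountBlueprint<String> counter =
            new CountBlueprint<String>(() -> new ArrayList<String>(tweets));

        int morning = counter.run().join(); // first run sees two tweets
        tweets.add("third");
        int evening = counter.run().join(); // second run of the same blueprint sees three
        return new int[] {morning, evening};
    }

    public static void main(String[] args) {
        int[] counts = demo();
        System.out.println(counts[0] + " then " + counts[1]); // prints 2 then 3
    }
}
```

The blueprint object itself never changes; only the values obtained from running it differ, which mirrors the morning and evening counts in the example above.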


1.3 Design Principles behind Akka Streams


It took quite a while until we were reasonably happy with the look and feel of the API and the architecture of the
implementation, and while being guided by intuition the design phase was very much exploratory research. This
section details the findings and codifies them into a set of principles that have emerged during the process.
Note: As detailed in the introduction keep in mind that the Akka Streams API is completely decoupled from
the Reactive Streams interfaces which are just an implementation detail for how to pass stream data between
individual processing stages.

1.3.1 What shall users of Akka Streams expect?


Akka is built upon a conscious decision to offer APIs that are minimal and consistent, as opposed to easy or
intuitive. The credo is that we favor explicitness over magic, and if we provide a feature then it must work always,
no exceptions. Another way to say this is that we minimize the number of rules a user has to learn instead of trying
to keep the rules close to what we think users might expect.
From this follows that the principles implemented by Akka Streams are:
• all features are explicit in the API, no magic
• supreme compositionality: combined pieces retain the function of each part
• exhaustive model of the domain of distributed bounded stream processing
This means that we provide all the tools necessary to express any stream processing topology, that we model all
the essential aspects of this domain (back-pressure, buffering, transformations, failure recovery, etc.) and that
whatever the user builds is reusable in a larger context.
Resulting Implementation Constraints
Compositionality entails reusability of partial stream topologies, which led us to the lifted approach of describing
data flows as (partial) FlowGraphs that can act as composite sources, flows (a.k.a. pipes) and sinks of data. These
building blocks shall then be freely shareable, with the ability to combine them freely to form larger flows. The
representation of these pieces must therefore be an immutable blueprint that is materialized in an explicit step in
order to start the stream processing. The resulting stream processing engine is then also immutable in the sense of
having a fixed topology that is prescribed by the blueprint. Dynamic networks need to be modeled by explicitly
using the Reactive Streams interfaces for plugging different engines together.
The process of materialization may be parameterized, e.g. instantiating a blueprint for handling a TCP connection's data with specific information about the connection's address and port information. Additionally, materialization will often create specific objects that are useful to interact with the processing engine once it is running, for
example for shutting it down or for extracting metrics. This means that the materialization function takes a set of
parameters from the outside and it produces a set of results. Compositionality demands that these two sets cannot
interact, because that would establish a covert channel by which different pieces could communicate, leading to
problems of initialization order and inscrutable runtime failures.
Another aspect of materialization is that we want to support distributed stream processing, meaning that both the
parameters and the results need to be location transparent: either serializable immutable values or ActorRefs.
Using for example Futures would restrict materialization to the local JVM. There may be cases for which this will
typically not be a severe restriction (like opening a TCP connection), but the principle remains.

1.3.2 Interoperation with other Reactive Streams implementations


Akka Streams fully implement the Reactive Streams specification and interoperate with all other conformant
implementations. We chose to completely separate the Reactive Streams interfaces (which we regard to be an
SPI) from the user-level API. In order to obtain a Publisher or Subscriber from an Akka Stream topology,
a corresponding Sink.publisher or Source.subscriber element must be used.

All stream Processors produced by the default materialization of Akka Streams are restricted to having a single
Subscriber, additional Subscribers will be rejected. The reason for this is that the stream topologies described
using our DSL never require fan-out behavior from the Publisher sides of the elements, all fan-out is done using
explicit elements like Broadcast[T].
This means that Sink.fanoutPublisher must be used where multicast behavior is needed for interoperation
with other Reactive Streams implementations.

1.3.3 What shall users of streaming libraries expect?


We expect libraries to be built on top of Akka Streams, in fact Akka HTTP is one such example that lives within
the Akka project itself. In order to allow users to profit from the principles that are described for Akka Streams
above, the following rules are established:
• libraries shall provide their users with reusable pieces, allowing full compositionality
• libraries may optionally and additionally provide facilities that consume and materialize flow descriptions
The reasoning behind the first rule is that compositionality would be destroyed if different libraries only accepted
flow descriptions and expected to materialize them: using two of these together would be impossible because
materialization can only happen once. As a consequence, the functionality of a library must be expressed such
that materialization can be done by the user, outside of the library's control.
The second rule allows a library to additionally provide nice sugar for the common case, an example of which is
the Akka HTTP API that provides a handleWith method for convenient materialization.
Note: One important consequence of this is that a reusable flow description cannot be bound to live resources,
any connection to or allocation of such resources must be deferred until materialization time. Examples of live
resources are already existing TCP connections, a multicast Publisher, etc.; a TickSource does not fall into this
category if its timer is created only upon materialization (as is the case for our implementation).

Resulting Implementation Constraints


Akka Streams must enable a library to express any stream processing utility in terms of immutable blueprints. The
most common building blocks are
• Source: something with exactly one output stream
• Sink: something with exactly one input stream
• Flow: something with exactly one input and one output stream
• BidirectionalFlow: something with exactly two input streams and two output streams that conceptually
behave like two Flows of opposite direction
• Graph: a packaged stream processing topology that exposes a certain set of input and output ports, characterized by an object of type Shape.
Note: A source that emits a stream of streams is still just a normal Source, the kind of elements that are produced
does not play a role in the static stream topology that is being expressed.

1.3.4 The difference between Error and Failure


The starting point for this discussion is the definition given by the Reactive Manifesto. Translated to streams
this means that an error is accessible within the stream as a normal data element, while a failure means that the
stream itself has failed and is collapsing. In concrete terms, on the Reactive Streams interface level data elements
(including errors) are signaled via onNext while failures raise the onError signal.
Note: Unfortunately the method name for signaling failure to a Subscriber is called onError for historical reasons. Always keep in mind that the Reactive Streams interfaces (Publisher/Subscription/Subscriber) are modeling
the low-level infrastructure for passing streams between execution units, and errors on this level are precisely the
failures that we are talking about on the higher level that is modeled by Akka Streams.
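The difference between an error that travels as a data element and a failure that ends processing can be sketched in plain Java. This is an illustration only and does not use the Akka or Reactive Streams APIs; ErrorVsFailure, Parsed and both parse methods are hypothetical names. In the first method a bad input becomes an ordinary element, so every input yields an output; in the second an exception aborts the whole run and later inputs are never processed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ErrorVsFailure {

    /** Outcome carried as a normal data element: either a value or an error message. */
    static final class Parsed {
        final Integer value; // non-null on success
        final String error;  // non-null when parsing failed

        private Parsed(Integer value, String error) {
            this.value = value;
            this.error = error;
        }

        static Parsed ok(int value) {
            return new Parsed(value, null);
        }

        static Parsed error(String message) {
            return new Parsed(null, message);
        }

        boolean isError() {
            return error != null;
        }

        @Override
        public String toString() {
            return isError() ? "Error(" + error + ")" : "Ok(" + value + ")";
        }
    }

    /** Errors travel downstream as elements; every input yields exactly one output. */
    static List<Parsed> parseKeepingErrors(List<String> inputs) {
        List<Parsed> out = new ArrayList<>();
        for (String s : inputs) {
            try {
                out.add(Parsed.ok(Integer.parseInt(s)));
            } catch (NumberFormatException e) {
                out.add(Parsed.error("not a number: " + s));
            }
        }
        return out;
    }

    /** A failure aborts the whole run: inputs after the bad one are never processed. */
    static List<Integer> parseOrFail(List<String> inputs) {
        List<Integer> out = new ArrayList<>();
        for (String s : inputs) {
            out.add(Integer.parseInt(s)); // throws NumberFormatException on bad input
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> inputs = Arrays.asList("1", "x", "3");
        System.out.println(parseKeepingErrors(inputs));
        try {
            parseOrFail(inputs);
        } catch (NumberFormatException e) {
            System.out.println("run failed: " + e.getMessage());
        }
    }
}
```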
There is only limited support for treating onError in Akka Streams compared to the operators that are available for the transformation of data elements, which is intentional in the spirit of the previous paragraph. Since
onError signals that the stream is collapsing, its ordering semantics are not the same as for stream completion:
transformation stages of any kind will just collapse with the stream, possibly still holding elements in implicit or
explicit buffers. This means that data elements emitted before a failure can still be lost if the onError overtakes
them.
The ability for failures to propagate faster than data elements is essential for tearing down streams that are back-pressured, especially since back-pressure can be the failure mode (e.g. by tripping upstream buffers which then
abort because they cannot do anything else; or if a dead-lock occurred).
The semantics of stream recovery
A recovery element (i.e. any transformation that absorbs an onError signal and turns that into possibly more
data elements followed by normal stream completion) acts as a bulkhead that confines a stream collapse to a given
region of the flow topology. Within the collapsed region buffered elements may be lost, but the outside is not
affected by the failure.
This works in the same fashion as a trycatch expression: it marks a region in which exceptions are caught, but
the exact amount of code that was skipped within this region in case of a failure might not be known preciselythe
placement of statements matters.
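The try-catch analogy can be made concrete in plain Java. The sketch below is not Akka Streams code; RecoveryRegion, parseWithRecovery and the fallback parameter are hypothetical names. The try block plays the role of the collapsing region, and the catch block plays the role of the recovery element that emits a replacement element and then completes normally.

```java
import java.util.ArrayList;
import java.util.List;

final class RecoveryRegion {
    /**
     * Parses inputs inside a try region. On the first malformed input the rest
     * of the region is skipped and a single fallback element is emitted instead.
     */
    static List<Integer> parseWithRecovery(List<String> inputs, int fallback) {
        List<Integer> out = new ArrayList<>();
        try {
            for (String in : inputs) {
                out.add(Integer.parseInt(in)); // may throw NumberFormatException
            }
        } catch (NumberFormatException e) {
            out.add(fallback); // recovery: replacement element, then normal completion
        }
        return out; // code outside the region is unaffected by the failure
    }
}
```

Inputs that follow the malformed one are never processed, which mirrors elements lost inside the collapsed region.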

1.3.5 The finer points of stream materialization


Note: This is not yet implemented as stated here; this document illustrates intent.
It is commonly necessary to parameterize a flow so that it can be materialized for different arguments; an example
would be the handler Flow that is given to a server socket implementation and materialized for each incoming
connection with information about the peer's address. On the other hand it is frequently necessary to retrieve
specific objects that result from materialization, for example a Future[Unit] that signals the completion of a
ForeachSink.
It might be tempting to allow different pieces of a flow topology to access the materialization results of other
pieces in order to customize their behavior, but that would violate composability and reusability as argued above.
Therefore the arguments and results of materialization need to be segregated:
- The Materializer is configured with a (type-safe) mapping from keys to values, which is exposed to the
  processing stages during their materialization.
- The values in this mapping may act as channels, for example by using a Promise/Future pair to communicate
  a value; another possibility for such information-passing is of course to explicitly model it as a stream of
  configuration data elements within the graph itself.
- The materialized values obtained from the processing stages are combined as prescribed by the user, but can
  of course be dependent on the values in the argument mapping.
To avoid having to use Future values as key bindings, materialization itself may become fully asynchronous.
This would allow for example the use of the bound server port within the rest of the flow, and only if the port was
actually bound successfully. The downside is that some APIs will then return Future[MaterializedMap],
which means that others will have to accept this in turn in order to keep the usage burden as low as possible.

1.4 Basics and working with Flows


1.4.1 Core concepts
Akka Streams is a library to process and transfer a sequence of elements using bounded buffer space. This
latter property is what we refer to as boundedness and it is the defining feature of Akka Streams. Translated to
everyday terms it is possible to express a chain (or as we see later, graphs) of processing entities, each executing
independently (and possibly concurrently) from the others while only buffering a limited number of elements at
any given time. This property of bounded buffers is one of the differences from the actor model, where each actor
usually has an unbounded, or a bounded, but dropping mailbox. Akka Stream processing entities have bounded
mailboxes that do not drop.
Before we move on, let's define some basic terminology which will be used throughout the entire documentation:
Stream: An active process that involves moving and transforming data.
Element: An element is the processing unit of streams. All operations transform and transfer elements from
upstream to downstream. Buffer sizes are always expressed as a number of elements, independently from the
actual size of the elements.
Back-pressure: A means of flow-control, a way for consumers of data to notify a producer about their current
availability, effectively slowing down the upstream producer to match their consumption speeds. In the
context of Akka Streams back-pressure is always understood as non-blocking and asynchronous.
Non-Blocking: Means that a certain operation does not hinder the progress of the calling thread, even if it takes
a long time to finish the requested operation.
Processing Stage: The common name for all building blocks that build up a Flow or FlowGraph. Examples of
a processing stage would be operations like map(), filter(), stages added by transform() like
PushStage, PushPullStage, StatefulStage and graph junctions like Merge or Broadcast.
For the full list of built-in processing stages see Overview of built-in stages and their semantics.
When we talk about asynchronous, non-blocking backpressure we mean that the processing stages available in
Akka Streams will not use blocking calls but asynchronous message passing to exchange messages between each
other, and they will use asynchronous means to slow down a fast producer, without blocking its thread. This is a
thread-pool friendly design, since entities that need to wait (a fast producer waiting on a slow consumer) will not
block the thread but can hand it back for further use to an underlying thread-pool.
Defining and running streams
Linear processing pipelines can be expressed in Akka Streams using the following core abstractions:
Source: A processing stage with exactly one output, emitting data elements whenever downstream processing
stages are ready to receive them.
Sink: A processing stage with exactly one input, requesting and accepting data elements, possibly slowing down
the upstream producer of elements.
Flow: A processing stage which has exactly one input and output, which connects its up- and downstreams by
transforming the data elements flowing through it.
RunnableGraph: A Flow that has both ends attached to a Source and Sink respectively, and is ready to be
run().
It is possible to attach a Flow to a Source resulting in a composite source, and it is also possible to prepend a
Flow to a Sink to get a new sink. After a stream is properly terminated by having both a source and a sink, it
will be represented by the RunnableGraph type, indicating that it is ready to be executed.
It is important to remember that even after constructing the RunnableGraph by connecting all the source,
sink and different processing stages, no data will flow through it until it is materialized. Materialization is the
process of allocating all resources needed to run the computation described by a Flow (in Akka Streams this will
often involve starting up Actors). Thanks to Flows being simply a description of the processing pipeline they
are immutable, thread-safe, and freely shareable, which means that it is for example safe to share and send them
between actors, to have one actor prepare the work, and then have it be materialized at some completely different
place in the code.
final Source<Integer, BoxedUnit> source =
  Source.from(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10));
// note that the Future is scala.concurrent.Future
final Sink<Integer, Future<Integer>> sink =
  Sink.fold(0, (aggr, next) -> aggr + next);
// connect the Source to the Sink, obtaining a RunnableGraph
final RunnableGraph<Future<Integer>> runnable =
  source.toMat(sink, Keep.right());
// materialize the flow
final Future<Integer> sum = runnable.run(mat);

After running (materializing) the RunnableGraph we get a special container object, the MaterializedMap.
Both sources and sinks are able to put specific objects into this map. Whether they put something in or not is
implementation dependent. For example a FoldSink will make a Future available in this map which will
represent the result of the folding process over the stream. In general, a stream can expose multiple materialized
values, but it is quite common to be interested in only the value of the Source or the Sink in the stream. For
this reason there is a convenience method called runWith() available for Sink, Source or Flow requiring,
respectively, a supplied Source (in order to run a Sink), a Sink (in order to run a Source) or both a Source
and a Sink (in order to run a Flow, since it has neither attached yet).
final Source<Integer, BoxedUnit> source =
  Source.from(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10));
final Sink<Integer, Future<Integer>> sink =
  Sink.fold(0, (aggr, next) -> aggr + next);
// materialize the flow, getting the Sink's materialized value
final Future<Integer> sum = source.runWith(sink, mat);

It is worth pointing out that since processing stages are immutable, connecting them returns a new processing
stage, instead of modifying the existing instance, so while constructing long flows, remember to assign the new
value to a variable or run it:
final Source<Integer, BoxedUnit> source =
Source.from(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10));
source.map(x -> 0); // has no effect on source, since it's immutable
source.runWith(Sink.fold(0, (agg, next) -> agg + next), mat); // 55
// returns new Source<Integer>, with `map()` appended
final Source<Integer, BoxedUnit> zeroes = source.map(x -> 0);
final Sink<Integer, Future<Integer>> fold =
Sink.fold(0, (agg, next) -> agg + next);
zeroes.runWith(fold, mat); // 0

Note: By default Akka Streams elements support exactly one downstream processing stage. Making fan-out
(supporting multiple downstream processing stages) an explicit opt-in feature allows default stream elements to
be less complex and more efficient. Also it allows for greater flexibility on how exactly to handle the multicast
scenarios, by providing named fan-out elements such as broadcast (signals all down-stream elements) or balance
(signals one of available down-stream elements).
In the above example we used the runWith method, which both materializes the stream and returns the materialized value of the given sink or source.
Since a stream can be materialized multiple times, the MaterializedMap returned is different for each
materialization. In the example below we create two running materialized instances of the stream that we described
in the runnable variable, and both materializations give us a different Future from the map even though we
used the same sink to refer to the future:

// connect the Source to the Sink, obtaining a RunnableGraph
final Sink<Integer, Future<Integer>> sink =
  Sink.fold(0, (aggr, next) -> aggr + next);
final RunnableGraph<Future<Integer>> runnable =
  Source.from(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)).toMat(sink, Keep.right());
// get the materialized value of the FoldSink
final Future<Integer> sum1 = runnable.run(mat);
final Future<Integer> sum2 = runnable.run(mat);
// sum1 and sum2 are different Futures!

Defining sources, sinks and flows

The objects Source and Sink define various ways to create sources and sinks of elements. The following
examples show some of the most useful constructs (refer to the API documentation for more details):
// Create a source from an Iterable
List<Integer> list = new LinkedList<Integer>();
list.add(1);
list.add(2);
list.add(3);
Source.from(list);
// Create a source from a Future
Source.from(Futures.successful("Hello Streams!"));
// Create a source from a single element
Source.single("only one element");
// an empty source
Source.empty();
// Sink that folds over the stream and returns a Future
// of the final result in the MaterializedMap
Sink.fold(0, (Integer aggr, Integer next) -> aggr + next);
// Sink that returns a Future in the MaterializedMap,
// containing the first element of the stream
Sink.head();
// A Sink that consumes a stream without doing anything with the elements
Sink.ignore();
// A Sink that executes a side-effecting call for every element of the stream
Sink.foreach(System.out::println);

There are various ways to wire up different parts of a stream, the following examples show some of the available
options:
// Explicitly creating and wiring up a Source, Sink and Flow
Source.from(Arrays.asList(1, 2, 3, 4))
  .via(Flow.of(Integer.class).map(elem -> elem * 2))
  .to(Sink.foreach(System.out::println));
// Starting from a Source
final Source<Integer, BoxedUnit> source = Source.from(Arrays.asList(1, 2, 3, 4))
  .map(elem -> elem * 2);
source.to(Sink.foreach(System.out::println));
// Starting from a Sink
final Sink<Integer, BoxedUnit> sink = Flow.of(Integer.class)
  .map(elem -> elem * 2).to(Sink.foreach(System.out::println));
Source.from(Arrays.asList(1, 2, 3, 4)).to(sink);

Illegal stream elements

In accordance with the Reactive Streams specification (Rule 2.13) Akka Streams do not allow null to be passed
through the stream as an element. In case you want to model the concept of absence of a value we recommend
using akka.japi.Option (for Java 6 and 7) or java.util.Optional which is available since Java 8.
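As a minimal sketch of the java.util.Optional approach, the helper below turns a lookup that may yield null into a list of Optional elements, none of which is null itself. AbsentValues and lookupAll are hypothetical names and the code uses only the JDK; the resulting list is the kind of Iterable that could then be handed to a stream source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class AbsentValues {
    /** Looks up each key; a missing key becomes Optional.empty() rather than null. */
    static List<Optional<String>> lookupAll(Map<String, String> table, List<String> keys) {
        List<Optional<String>> elements = new ArrayList<>();
        for (String key : keys) {
            // Map.get returns null for a missing key; wrap it so no null enters the list
            elements.add(Optional.ofNullable(table.get(key)));
        }
        return elements;
    }
}
```

Downstream code can then distinguish the two cases with isPresent() instead of a null check.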
Back-pressure explained
Akka Streams implement an asynchronous non-blocking back-pressure protocol standardised by the Reactive
Streams specification, which Akka is a founding member of.
The user of the library does not have to write any explicit back-pressure handling code: it is built in and dealt
with automatically by all of the provided Akka Streams processing stages. It is possible however to add explicit
buffer stages with overflow strategies that can influence the behaviour of the stream. This is especially important
in complex processing graphs which may even contain loops (which must be treated with very special care, as
explained in Graph cycles, liveness and deadlocks).
The back pressure protocol is defined in terms of the number of elements a downstream Subscriber is able to
receive and buffer, referred to as demand. The source of data, referred to as Publisher in Reactive Streams
terminology and implemented as Source in Akka Streams, guarantees that it will never emit more elements than
the received total demand for any given Subscriber.
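The demand rule can be sketched with the JDK's java.util.concurrent.Flow interfaces (Java 9 or later), which mirror the Reactive Streams Publisher, Subscriber and Subscription. This is not Akka Streams code: RangePublisher and CollectingSubscriber are hypothetical names, and the publisher is synchronous and single-threaded for brevity, so it is not a spec-complete implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Flow;

final class DemandDemo {

    /** Publishes 1..max, but only while the subscriber has outstanding demand. */
    static final class RangePublisher implements Flow.Publisher<Integer> {
        private final int max;
        RangePublisher(int max) { this.max = max; }

        @Override
        public void subscribe(Flow.Subscriber<? super Integer> subscriber) {
            subscriber.onSubscribe(new Flow.Subscription() {
                private int next = 1;
                private long demand = 0;
                private boolean emitting = false;
                private boolean done = false;

                @Override
                public void request(long n) {
                    if (done) return;
                    if (n <= 0) {
                        done = true;
                        subscriber.onError(new IllegalArgumentException("non-positive request"));
                        return;
                    }
                    demand += n;
                    if (emitting) return; // re-entrant call from onNext: the loop below picks it up
                    emitting = true;
                    while (demand > 0 && next <= max && !done) {
                        demand--;
                        subscriber.onNext(next++);
                    }
                    emitting = false;
                    if (next > max && !done) {
                        done = true;
                        subscriber.onComplete();
                    }
                }

                @Override
                public void cancel() { done = true; }
            });
        }
    }

    /** Collects elements and lets the caller decide when to signal demand. */
    static final class CollectingSubscriber implements Flow.Subscriber<Integer> {
        final List<Integer> received = new ArrayList<>();
        boolean completed = false;
        private Flow.Subscription subscription;

        @Override public void onSubscribe(Flow.Subscription s) { subscription = s; }
        @Override public void onNext(Integer item) { received.add(item); }
        @Override public void onError(Throwable t) { }
        @Override public void onComplete() { completed = true; }

        void request(long n) { subscription.request(n); }
    }
}
```

Subscribing alone delivers nothing; elements arrive only after request(n), and never more than the total requested.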
Note: The Reactive Streams specification defines its protocol in terms of Publisher and Subscriber.
These types are not meant to be user facing API, instead they serve as the low level building blocks for different
Reactive Streams implementations.
Akka Streams implements these concepts as Source, Flow (referred to as Processor in Reactive Streams)
and Sink without exposing the Reactive Streams interfaces directly. If you need to integrate with other Reactive
Stream libraries read Integrating with Reactive Streams.
The mode in which Reactive Streams back-pressure works can be colloquially described as dynamic push / pull
mode, since it will switch between push and pull based back-pressure models depending on the downstream
being able to cope with the upstream production rate or not.
To illustrate this further let us consider both problem situations and how the back-pressure protocol handles them:
Slow Publisher, fast Subscriber

This is the happy case, of course: we do not need to slow down the Publisher in this case. However, signalling rates
are rarely constant and could change at any point in time, suddenly ending up in a situation where the Subscriber
is now slower than the Publisher. In order to safeguard against these situations, the back-pressure protocol must still
be enabled during such situations; however, we do not want to pay a high penalty for this safety net being enabled.
The Reactive Streams protocol solves this by having the Subscriber asynchronously send Request(int n) signals
to the Publisher. The protocol guarantees that the Publisher will never signal more elements than the
signalled demand. Since the Subscriber however is currently faster, it will be signalling these Request messages at
a higher rate (and possibly also batching together the demand, requesting multiple elements in one Request signal).
This means that the Publisher should not ever have to wait (be back-pressured) with publishing its incoming
elements.
As we can see, in this scenario we effectively operate in so-called push-mode since the Publisher can continue
producing elements as fast as it can, since the pending demand will be recovered just-in-time while it is emitting
elements.

Fast Publisher, slow Subscriber

This is the case when back-pressuring the Publisher is required, because the Subscriber is not able to cope
with the rate at which its upstream would like to emit data elements.
Since the Publisher is not allowed to signal more elements than the pending demand signalled by the
Subscriber, it will have to abide by this back-pressure by applying one of the below strategies:
- not generate elements, if it is able to control their production rate,
- try buffering the elements in a bounded manner until more demand is signalled,
- drop elements until more demand is signalled,
- tear down the stream if unable to apply any of the above strategies.
As we can see, this scenario effectively means that the Subscriber will pull the elements from the Publisher;
this mode of operation is referred to as pull-based back-pressure.
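The buffering, dropping and tear-down strategies listed above can be sketched as a small bounded buffer in plain Java. This is an illustration only, not the buffer stage that Akka Streams provides: BoundedBuffer, OnOverflow, DROP_NEW and FAIL are hypothetical names, and the class is not thread-safe.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class BoundedBuffer<T> {
    /** What to do when the producer offers an element while the buffer is full. */
    enum OnOverflow { DROP_NEW, FAIL }

    private final Deque<T> queue = new ArrayDeque<>();
    private final int capacity;
    private final OnOverflow onOverflow;
    private int dropped = 0;

    BoundedBuffer(int capacity, OnOverflow onOverflow) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        this.capacity = capacity;
        this.onOverflow = onOverflow;
    }

    /** Called by the fast producer. Returns true if the element was buffered. */
    boolean offer(T element) {
        if (queue.size() < capacity) {
            queue.addLast(element); // bounded buffering until more demand arrives
            return true;
        }
        if (onOverflow == OnOverflow.FAIL) {
            // tear-down strategy: no other option is left
            throw new IllegalStateException("buffer full");
        }
        dropped++; // drop strategy: the new element is discarded
        return false;
    }

    /** Called when downstream signals demand for one element; null if nothing is buffered. */
    T poll() { return queue.pollFirst(); }

    int dropped() { return dropped; }
}
```

Memory use stays bounded by capacity no matter how fast the producer calls offer; the chosen strategy only decides what happens to the excess.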
Stream Materialization
When constructing flows and graphs in Akka Streams think of them as preparing a blueprint, an execution plan.
Stream materialization is the process of taking a stream description (the graph) and allocating all the necessary
resources it needs in order to run. In the case of Akka Streams this often means starting up Actors which power the
processing, but is not restricted to that - it could also mean opening files or socket connections etc. depending
on what the stream needs.
Materialization is triggered at so-called terminal operations. Most notably this includes the various forms
of the run() and runWith() methods defined on flow elements as well as a small number of special
syntactic sugars for running with well-known sinks, such as runForeach(el -> ) (being an alias to
runWith(Sink.foreach(el -> ))).
Materialization is currently performed synchronously on the materializing thread. The actual stream processing
is handled by actors started up during the stream's materialization, which will be running on the thread pools they
have been configured to run on, which defaults to the dispatcher set in MaterializationSettings while
constructing the ActorMaterializer.
Note: Reusing instances of linear computation stages (Source, Sink, Flow) inside FlowGraphs is legal, yet will
materialize that stage multiple times.

Combining materialized values

Since every processing stage in Akka Streams can provide a materialized value after being materialized, it is
necessary to somehow express how these values should be composed to a final value when we plug these stages
together. For this, many combinator methods have variants that take an additional argument, a function, that will
be used to combine the resulting values. Some examples of using these combiners are illustrated in the example
below.
// An empty source that can be shut down explicitly from the outside
Source<Integer, Promise<BoxedUnit>> source = Source.<Integer>lazyEmpty();
// A flow that internally throttles elements to 1/second, and returns a Cancellable
// which can be used to shut down the stream
Flow<Integer, Integer, Cancellable> flow = throttler;
// A sink that returns the first element of a stream in the returned Future
Sink<Integer, Future<Integer>> sink = Sink.head();

// By default, the materialized value of the leftmost stage is preserved
RunnableGraph<Promise<BoxedUnit>> r1 = source.via(flow).to(sink);
// Simple selection of materialized values by using Keep.right
RunnableGraph<Cancellable> r2 = source.viaMat(flow, Keep.right()).to(sink);
RunnableGraph<Future<Integer>> r3 = source.via(flow).toMat(sink, Keep.right());
// Using runWith will always give the materialized values of the stages added
// by runWith() itself
Future<Integer> r4 = source.via(flow).runWith(sink, mat);
Promise<BoxedUnit> r5 = flow.to(sink).runWith(source, mat);
Pair<Promise<BoxedUnit>, Future<Integer>> r6 = flow.runWith(source, sink, mat);
// Using more complex combinations
RunnableGraph<Pair<Promise<BoxedUnit>, Cancellable>> r7 =
  source.viaMat(flow, Keep.both()).to(sink);
RunnableGraph<Pair<Promise<BoxedUnit>, Future<Integer>>> r8 =
  source.via(flow).toMat(sink, Keep.both());
RunnableGraph<Pair<Pair<Promise<BoxedUnit>, Cancellable>, Future<Integer>>> r9 =
  source.viaMat(flow, Keep.both()).toMat(sink, Keep.both());
RunnableGraph<Pair<Cancellable, Future<Integer>>> r10 =
  source.viaMat(flow, Keep.right()).toMat(sink, Keep.both());
// It is also possible to map over the materialized values. In r9 we had a
// doubly nested pair, but we want to flatten it out
RunnableGraph<Cancellable> r11 =
  r9.mapMaterializedValue( (nestedTuple) -> {
    Promise<BoxedUnit> p = nestedTuple.first().first();
    Cancellable c = nestedTuple.first().second();
    Future<Integer> f = nestedTuple.second();
    // Picking the Cancellable, but we could also construct a domain class here
    return c;
  });

Note: In Graphs it is possible to access the materialized value from inside the stream processing graph. For
details see Accessing the materialized value inside the Graph

1.4.2 Stream ordering


In Akka Streams almost all computation stages preserve input order of elements. This means that if inputs
{IA1,IA2,...,IAn} cause outputs {OA1,OA2,...,OAk} and inputs {IB1,IB2,...,IBm} cause
outputs {OB1,OB2,...,OBl} and all of IAi happened before all IBi then OAi happens before OBi.
This property is even upheld by async operations such as mapAsync; however, an unordered version exists called
mapAsyncUnordered which does not preserve this ordering.
However, in the case of Junctions which handle multiple input streams (e.g. Merge) the output order is, in
general, not defined for elements arriving on different input ports. That is, a merge-like operation may emit Ai
before emitting Bi, and it is up to its internal logic to decide the order of emitted elements. Specialized elements
such as Zip however do guarantee their output order, as each output element depends on all upstream elements
having been signalled already; thus the ordering in the case of zipping is defined by this property.
If you find yourself in need of fine-grained control over the order of emitted elements in fan-in scenarios consider
using MergePreferred or FlexiMerge, which gives you full control over how the merge is performed.
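The difference between emitting asynchronous results in input order and in completion order can be sketched with plain CompletableFuture values. This only illustrates the ordering idea and is not how mapAsync or mapAsyncUnordered are implemented; AsyncOrdering and demo are hypothetical names, and the futures are completed by hand so that the outcome is deterministic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

final class AsyncOrdering {
    /** Returns two lists: results in input order, then results in completion order. */
    static List<List<String>> demo() {
        CompletableFuture<String> a = new CompletableFuture<>();
        CompletableFuture<String> b = new CompletableFuture<>();
        CompletableFuture<String> c = new CompletableFuture<>();
        List<CompletableFuture<String>> inFlight = Arrays.asList(a, b, c);

        // "Unordered" view: record each result at the moment its future completes
        List<String> completionOrder = new ArrayList<>();
        for (CompletableFuture<String> f : inFlight) {
            f.thenAccept(completionOrder::add);
        }

        // The computations finish out of order: c first, then a, then b
        c.complete("C");
        a.complete("A");
        b.complete("B");

        // "Ordered" view: wait for each future in the order the inputs arrived
        List<String> inputOrder = new ArrayList<>();
        for (CompletableFuture<String> f : inFlight) {
            inputOrder.add(f.join());
        }
        return Arrays.asList(inputOrder, completionOrder);
    }
}
```

The ordered variant may have to hold back an already finished result until its predecessors are done; that waiting is the price of preserving order.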

1.5 Working with Graphs


In Akka Streams computation graphs are not expressed using a fluent DSL like linear computations are, instead
they are written in a more graph-resembling DSL which aims to make translating graph drawings (e.g. from notes
taken from design discussions, or illustrations in protocol specifications) to and from code simpler. In this section
we'll dive into the multiple ways of constructing and re-using graphs, as well as explain common pitfalls and how
to avoid them.
Graphs are needed whenever you want to perform any kind of fan-in (multiple inputs) or fan-out (multiple
outputs) operations. Considering linear Flows to be like roads, we can picture graph operations as junctions:
multiple flows being connected at a single point. Some graph operations which are common enough and fit the
linear style of Flows, such as concat (which concatenates two streams, such that the second one is consumed
after the first one has completed), may have shorthand methods defined on Flow or Source themselves, however
you should keep in mind that those are also implemented as graph junctions.

1.5.1 Constructing Flow Graphs


Flow graphs are built from simple Flows which serve as the linear connections within the graphs as well as
junctions which serve as fan-in and fan-out points for Flows. Thanks to the junctions having meaningful types
based on their behaviour and making them explicit elements these elements should be rather straightforward to
use.
Akka Streams currently provide these junctions (for a detailed list see Overview of built-in stages and their semantics):
Fan-out
- Broadcast<T>: (1 input, N outputs) given an input element emits to each output
- Balance<T>: (1 input, N outputs) given an input element emits to one of its output ports
- UnzipWith<In,A,B,...>: (1 input, N outputs) takes a function of 1 input that given a value for each
  input emits N output elements (where N <= 20)
- Unzip<A,B>: (1 input, 2 outputs) splits a stream of Pair<A,B> tuples into two streams, one of type A
  and one of type B
- FlexiRoute<In>: (1 input, N outputs) enables writing custom fan-out elements using a simple DSL
Fan-in
- Merge<In>: (N inputs, 1 output) picks randomly from inputs pushing them one by one to its output
- MergePreferred<In>: like Merge but if elements are available on the preferred port, it picks from
  it, otherwise randomly from others
- ZipWith<A,B,...,Out>: (N inputs, 1 output) takes a function of N inputs that given a value
  for each input emits 1 output element
- Zip<A,B>: (2 inputs, 1 output) is a ZipWith specialised to zipping input streams of A and B into a
  Pair(A,B) tuple stream
- Concat<A>: (2 inputs, 1 output) concatenates two streams (first consume one, then the second one)
- FlexiMerge<Out>: (N inputs, 1 output) enables writing custom fan-in elements using a simple DSL
One of the goals of the FlowGraph DSL is to look similar to how one would draw a graph on a whiteboard, so that
it is simple to translate a design from whiteboard to code and be able to relate those two. Let's illustrate this by
translating the below hand-drawn graph into Akka Streams:

Such a graph is simple to translate to the Graph DSL since each linear element corresponds to a Flow, and each
circle corresponds to either a Junction or a Source or Sink if it is beginning or ending a Flow.
final Source<Integer, BoxedUnit> in = Source.from(Arrays.asList(1, 2, 3, 4, 5));
final Sink<List<String>, Future<List<String>>> sink = Sink.head();
final Sink<List<Integer>, Future<List<Integer>>> sink2 = Sink.head();
final Flow<Integer, Integer, BoxedUnit> f1 = Flow.of(Integer.class).map(elem -> elem + 10);
final Flow<Integer, Integer, BoxedUnit> f2 = Flow.of(Integer.class).map(elem -> elem + 20);
final Flow<Integer, String, BoxedUnit> f3 = Flow.of(Integer.class).map(elem -> elem.toString());
final Flow<Integer, Integer, BoxedUnit> f4 = Flow.of(Integer.class).map(elem -> elem + 30);

final RunnableGraph<Future<List<String>>> result = FlowGraph.factory()
  .closed(
    sink,
    (builder, out) -> {
      final UniformFanOutShape<Integer, Integer> bcast = builder.graph(Broadcast.create(2));
      final UniformFanInShape<Integer, Integer> merge = builder.graph(Merge.create(2));
      builder.from(in).via(f1).via(bcast).via(f2).via(merge)
        .via(f3.grouped(1000)).to(out);
      builder.from(bcast).via(f4).to(merge);
    });

Note: Junction reference equality defines graph node equality (i.e. the same merge instance used in a FlowGraph
refers to the same location in the resulting graph).
By looking at the snippets above, it should be apparent that the builder object is mutable. The reason for
this design choice is to enable simpler creation of complex graphs, which may even contain cycles. Once the
FlowGraph has been constructed though, the RunnableGraph instance is immutable, thread-safe, and freely
shareable. The same is true of all flow pieces (sources, sinks, and flows) once they are constructed. This means
that you can safely re-use one given Flow in multiple places in a processing graph.
We have seen examples of such re-use already above: the merge and broadcast junctions were imported into the
graph using builder.graph(...), an operation that will make a copy of the blueprint that is passed to it
and return the inlets and outlets of the resulting copy so that they can be wired up. Another alternative is to pass
existing graphs, of any shape, into the factory method that produces a new graph. The difference between these
approaches is that importing using b.graph(...) ignores the materialized value of the imported graph while
importing via the factory method allows its inclusion; for more details see stream-materialization-scala.
In the example below we prepare a graph that consists of two parallel streams, in which we re-use the same
instance of Flow, yet it will properly be materialized as two connections between the corresponding Sources and
Sinks:
final Sink<Integer, Future<Integer>> topHeadSink = Sink.head();
final Sink<Integer, Future<Integer>> bottomHeadSink = Sink.head();
final Flow<Integer, Integer, BoxedUnit> sharedDoubler = Flow.of(Integer.class).map(elem -> elem *
final RunnableGraph<Pair<Future<Integer>, Future<Integer>>> g = FlowGraph
.factory().closed(
topHeadSink, // import this sink into the graph
bottomHeadSink, // and this as well
Keep.both(),

(b, top, bottom) -> {


final UniformFanOutShape<Integer, Integer> bcast = b
.graph(Broadcast.create(2));
b.from(Source.single(1)).via(bcast).via(sharedDoubler).to(top);
b.from(bcast).via(sharedDoubler).to(bottom);
});

1.5.2 Constructing and combining Partial Flow Graphs


Sometimes it is not possible (or needed) to construct the entire computation graph in one place; instead, its
different phases are constructed in different places and in the end connected into a complete graph that is then run.
This can be achieved using FlowGraph.factory().partial() instead of
FlowGraph.factory().closed(), which will return a Graph instead of a RunnableGraph.
The reason for representing it as a different type is that a RunnableGraph requires all ports to be connected,
and if they are not it will throw an exception at construction time, which helps to avoid simple wiring errors while
working with graphs. A partial flow graph however allows you to return the set of yet to be connected ports from
the code block that performs the internal wiring.
Let's imagine we want to provide users with a specialized element that given 3 inputs will pick the greatest
int value of each zipped triple. We'll want to expose 3 input ports (unconnected sources) and one output port
(unconnected sink).
final Graph<FanInShape2<Integer, Integer, Integer>, BoxedUnit> zip =
  ZipWith.create((Integer left, Integer right) -> Math.max(left, right));
final Graph<UniformFanInShape<Integer, Integer>, BoxedUnit> pickMaxOfThree =
  FlowGraph.factory().partial(builder -> {
    final FanInShape2<Integer, Integer, Integer> zip1 = builder.graph(zip);
    final FanInShape2<Integer, Integer, Integer> zip2 = builder.graph(zip);
    builder.edge(zip1.out(), zip2.in0());
    // return the shape, which has three inputs and one output
    return new UniformFanInShape<Integer, Integer>(zip2.out(),
      new Inlet[] {zip1.in0(), zip1.in1(), zip2.in1()});
  });
final Sink<Integer, Future<Integer>> resultSink = Sink.<Integer>head();
final RunnableGraph<Future<Integer>> g = FlowGraph.factory()
  .closed(resultSink, (builder, sink) -> {
    // import the partial flow graph explicitly
    final UniformFanInShape<Integer, Integer> pm = builder.graph(pickMaxOfThree);
    builder.from(Source.single(1)).to(pm.in(0));
    builder.from(Source.single(2)).to(pm.in(1));
    builder.from(Source.single(3)).to(pm.in(2));
    builder.from(pm.out()).to(sink);
  });
final Future<Integer> max = g.run(mat);

As you can see, first we construct the partial graph that describes how to compute the maximum of two input
streams, then we reuse that twice while constructing the partial graph that extends this to three input streams, then
we import it (all of its nodes and connections) explicitly to the FlowGraph instance in which all the undefined
elements are rewired to real sources and sinks. The graph can then be run and yields the expected result.

Warning: Please note that a FlowGraph is not able to provide compile time type-safety about whether or
not all elements have been properly connected; this validation is performed as a runtime check during the
graph's instantiation.
A partial flow graph also verifies that all ports are either connected or part of the returned Shape.
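The runtime check described in the warning can be pictured with a small plain-Java sketch. It is not the library's implementation: PortCheckSketch, its method names and the string port labels are all hypothetical, and the sketch only tracks which ports are still open.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the connectivity check; real graphs use typed
// Inlet/Outlet objects rather than string labels.
class PortCheckSketch {
    private final Set<String> openPorts = new HashSet<>();

    // Register a port that still needs a connection.
    void addPort(String name) {
        openPorts.add(name);
    }

    // Connecting an outlet to an inlet closes both ports.
    void connect(String outlet, String inlet) {
        if (!openPorts.contains(outlet) || !openPorts.contains(inlet)) {
            throw new IllegalArgumentException("unknown or already connected port");
        }
        openPorts.remove(outlet);
        openPorts.remove(inlet);
    }

    // A closed graph passes an empty set; a partial graph passes the ports
    // of the shape it returns. Any other open port is an error.
    void validate(Set<String> exposed) {
        if (!openPorts.equals(exposed)) {
            throw new IllegalStateException(
                "open ports " + openPorts + " do not match exposed ports " + exposed);
        }
    }

    // Wires source.out -> flow.in and, optionally, flow.out -> sink.in,
    // then validates the result as a closed graph.
    static boolean closedGraphIsValid(boolean connectSink) {
        PortCheckSketch g = new PortCheckSketch();
        for (String p : Arrays.asList("source.out", "flow.in", "flow.out", "sink.in")) {
            g.addPort(p);
        }
        g.connect("source.out", "flow.in");
        if (connectSink) {
            g.connect("flow.out", "sink.in");
        }
        try {
            g.validate(Collections.<String>emptySet());
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(closedGraphIsValid(true));
        System.out.println(closedGraphIsValid(false));
    }
}
```

The point of the sketch is that the error surfaces only when validate runs, which mirrors why a forgotten connection is reported at instantiation time rather than by the compiler.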

1.5.3 Constructing Sources, Sinks and Flows from Partial Graphs


Instead of treating a PartialFlowGraph as simply a collection of flows and junctions which may not yet all
be connected, it is sometimes useful to expose such a complex graph as a simpler structure, such as a Source,
Sink or Flow.
In fact, these concepts can be easily expressed as special cases of a partially connected graph:
• Source is a partial flow graph with exactly one output, that is, it returns a SourceShape.
• Sink is a partial flow graph with exactly one input, that is, it returns a SinkShape.
• Flow is a partial flow graph with exactly one input and exactly one output, that is, it returns a FlowShape.
Being able to hide complex graphs inside of simple elements such as Sink / Source / Flow enables you to easily
create one complex element and from there on treat it as a simple compound stage for linear computations.
In order to create a Source from a partial flow graph, Source provides a special factory method
(Source.factory().create in the example below) that takes a function that must return an Outlet. This
unconnected outlet will become the output to which a sink must be attached before this Source
can run. Refer to the example below, in which we create a Source that zips together two numbers, to see this
graph construction in action:
// first create an indefinite source of integer numbers
class Ints implements Iterator<Integer> {
  private int next = 0;
  @Override
  public boolean hasNext() {
    return true;
  }
  @Override
  public Integer next() {
    return next++;
  }
}
final Source<Integer, BoxedUnit> ints = Source.fromIterator(() -> new Ints());
final Source<Pair<Integer, Integer>, BoxedUnit> pairs = Source.factory().create(
  builder -> {
    final FanInShape2<Integer, Integer, Pair<Integer, Integer>> zip =
      builder.graph(Zip.create());
    builder.from(ints.filter(i -> i % 2 == 0)).to(zip.in0());
    builder.from(ints.filter(i -> i % 2 == 1)).to(zip.in1());
    return zip.out();
  });
final Future<Pair<Integer, Integer>> firstPair =
  pairs.runWith(Sink.<Pair<Integer, Integer>>head(), mat);
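As a rough cross-check of what the zipped source produces, the same data flow can be written with java.util.stream. This is a sketch under the assumption that each filtered branch reads from its own fresh iterator of integers; ZipEvenOddSketch and firstPair are illustrative names and no Akka types are involved.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Iterator;
import java.util.Map;
import java.util.stream.Stream;

// Plain-Java analogy: two filtered views of the integers, zipped pairwise.
class ZipEvenOddSketch {
    static Map.Entry<Integer, Integer> firstPair() {
        // each branch gets its own lazily evaluated, unbounded sequence
        Iterator<Integer> evens =
            Stream.iterate(0, i -> i + 1).filter(i -> i % 2 == 0).iterator();
        Iterator<Integer> odds =
            Stream.iterate(0, i -> i + 1).filter(i -> i % 2 == 1).iterator();
        // zipping takes exactly one element from each side
        return new SimpleEntry<>(evens.next(), odds.next());
    }

    public static void main(String[] args) {
        Map.Entry<Integer, Integer> pair = firstPair();
        System.out.println(pair.getKey() + ", " + pair.getValue());
    }
}
```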

Similarly the same can be done for a Sink<T>, in which case the returned value must be an Inlet<T>. For
defining a Flow<T> we need to expose both an undefined source and sink:
final Flow<Integer, Pair<Integer, String>, BoxedUnit> pairs = Flow.factory().create(
  b -> {
    final UniformFanOutShape<Integer, Integer> bcast = b.graph(Broadcast.create(2));
    final FanInShape2<Integer, String, Pair<Integer, String>> zip =
      b.graph(Zip.create());
    b.from(bcast).to(zip.in0());
    b.from(bcast).via(Flow.of(Integer.class).map(i -> i.toString())).to(zip.in1());
    return new Pair<>(bcast.in(), zip.out());
  });
Source.single(1).via(pairs).runWith(Sink.<Pair<Integer, String>>head(), mat);

1.5.4 Bidirectional Flows


A graph topology that is often useful is that of two flows going in opposite directions. Take for example a codec
stage that serializes outgoing messages and deserializes incoming octet streams. Another such stage could add a
framing protocol that attaches a length header to outgoing data and parses incoming frames back into the original
octet stream chunks. These two stages are meant to be composed, applying one atop the other as part of a protocol
stack. For this purpose there is the special type BidiFlow, which is a graph that has exactly two open inlets and
two open outlets. The corresponding shape is called BidiShape and is defined like this:
/**
 * A bidirectional flow of elements that consequently has two inputs and two
 * outputs, arranged like this:
 *
 * {{{
 *        +------+
 *  In1 ~>|      |~> Out1
 *        | bidi |
 * Out2 <~|      |<~ In2
 *        +------+
 * }}}
 */
final case class BidiShape[-In1, +Out1, -In2, +Out2](in1: Inlet[In1],
                                                     out1: Outlet[Out1],
                                                     in2: Inlet[In2],
                                                     out2: Outlet[Out2]) extends Shape {
  // implementation details elided ...
}

A bidirectional flow is defined just like a unidirectional Flow as demonstrated for the codec mentioned above:
static interface Message {}
static class Ping implements Message {
  final int id;
  public Ping(int id) { this.id = id; }
  @Override
  public boolean equals(Object o) {
    if (o instanceof Ping) {
      return ((Ping) o).id == id;
    } else return false;
  }
  @Override
  public int hashCode() {
    return id;
  }
}
static class Pong implements Message {
  final int id;
  public Pong(int id) { this.id = id; }
  @Override
  public boolean equals(Object o) {
    if (o instanceof Pong) {
      return ((Pong) o).id == id;
    } else return false;
  }
  @Override
  public int hashCode() {
    return id;
  }
}
public static ByteString toBytes(Message msg) {
  // implementation details elided ...
}
public static Message fromBytes(ByteString bytes) {
  // implementation details elided ...
}
public final BidiFlow<Message, ByteString, ByteString, Message, BoxedUnit> codecVerbose =
  BidiFlow.factory().create(b -> {
    final FlowShape<Message, ByteString> top =
      b.graph(Flow.<Message> empty().map(BidiFlowDocTest::toBytes));
    final FlowShape<ByteString, Message> bottom =
      b.graph(Flow.<ByteString> empty().map(BidiFlowDocTest::fromBytes));
    return new BidiShape<>(top, bottom);
  });
public final BidiFlow<Message, ByteString, ByteString, Message, BoxedUnit> codec =
  BidiFlow.fromFunctions(BidiFlowDocTest::toBytes, BidiFlowDocTest::fromBytes);

The first version resembles the partial graph constructor, while for the simple case of a functional 1:1 transformation there is a concise convenience method as shown on the last line. The implementation of the two functions is
not difficult either:
public static ByteString toBytes(Message msg) {
  if (msg instanceof Ping) {
    final int id = ((Ping) msg).id;
    return new ByteStringBuilder().putByte((byte) 1)
      .putInt(id, ByteOrder.LITTLE_ENDIAN).result();
  } else {
    final int id = ((Pong) msg).id;
    return new ByteStringBuilder().putByte((byte) 2)
      .putInt(id, ByteOrder.LITTLE_ENDIAN).result();
  }
}
public static Message fromBytes(ByteString bytes) {
  final ByteIterator it = bytes.iterator();
  switch(it.getByte()) {
    case 1:
      return new Ping(it.getInt(ByteOrder.LITTLE_ENDIAN));
    case 2:
      return new Pong(it.getInt(ByteOrder.LITTLE_ENDIAN));
    default:
      throw new RuntimeException("message format error");
  }
}
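The wire layout used by the two functions above (one tag byte followed by a little-endian int) can be checked in isolation with java.nio.ByteBuffer. This is a sketch that swaps Akka's ByteString for plain byte arrays; CodecSketch and its method names are illustrative and not part of the documented API.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Plain-Java analogy of the codec: tag byte 1 = Ping, tag byte 2 = Pong,
// followed by the id as a 4-byte little-endian integer.
class CodecSketch {
    static final byte PING_TAG = 1;
    static final byte PONG_TAG = 2;

    static byte[] encode(byte tag, int id) {
        return ByteBuffer.allocate(5)
            .order(ByteOrder.LITTLE_ENDIAN)
            .put(tag)
            .putInt(id)
            .array();
    }

    static byte decodeTag(byte[] bytes) {
        byte tag = bytes[0];
        if (tag != PING_TAG && tag != PONG_TAG) {
            throw new RuntimeException("message format error");
        }
        return tag;
    }

    static int decodeId(byte[] bytes) {
        // absolute read of the int that starts right after the tag byte
        return ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getInt(1);
    }

    public static void main(String[] args) {
        byte[] wire = encode(PING_TAG, 258);
        System.out.println(decodeTag(wire) + " " + decodeId(wire));
    }
}
```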

In this way you could easily integrate any other serialization library that turns an object into a sequence of bytes.
The other stage that we talked about is a little more involved since reversing a framing protocol means that
any received chunk of bytes may correspond to zero or more messages. This is best implemented using a
PushPullStage (see also Using PushPullStage).

public static ByteString addLengthHeader(ByteString bytes) {
  final int len = bytes.size();
  return new ByteStringBuilder()
    .putInt(len, ByteOrder.LITTLE_ENDIAN)
    .append(bytes)
    .result();
}
public static class FrameParser extends PushPullStage<ByteString, ByteString> {
  // this holds the received but not yet parsed bytes
  private ByteString stash = ByteString.empty();
  // this holds the current message length or -1 if at a boundary
  private int needed = -1;
  @Override
  public SyncDirective onPull(Context<ByteString> ctx) {
    return run(ctx);
  }
  @Override
  public SyncDirective onPush(ByteString bytes, Context<ByteString> ctx) {
    stash = stash.concat(bytes);
    return run(ctx);
  }
  @Override
  public TerminationDirective onUpstreamFinish(Context<ByteString> ctx) {
    if (stash.isEmpty()) return ctx.finish();
    else return ctx.absorbTermination(); // we still have bytes to emit
  }
  private SyncDirective run(Context<ByteString> ctx) {
    if (needed == -1) {
      // are we at a boundary? then figure out next length
      if (stash.size() < 4) return pullOrFinish(ctx);
      else {
        needed = stash.iterator().getInt(ByteOrder.LITTLE_ENDIAN);
        stash = stash.drop(4);
        return run(ctx); // cycle back to possibly already emit the next chunk
      }
    } else if (stash.size() < needed) {
      // we are in the middle of a message, need more bytes
      return pullOrFinish(ctx);
    } else {
      // we have enough to emit at least one message, so do it
      final ByteString emit = stash.take(needed);
      stash = stash.drop(needed);
      needed = -1;
      return ctx.push(emit);
    }
  }
  /*
   * After having called absorbTermination() we cannot pull any more, so if we need
   * more data we will just have to give up.
   */
  private SyncDirective pullOrFinish(Context<ByteString> ctx) {
    if (ctx.isFinishing()) return ctx.finish();
    else return ctx.pull();
  }
}
public final BidiFlow<ByteString, ByteString, ByteString, ByteString, BoxedUnit> framing =
  BidiFlow.factory().create(b -> {
    final FlowShape<ByteString, ByteString> top =
      b.graph(Flow.<ByteString> empty().map(BidiFlowDocTest::addLengthHeader));
    final FlowShape<ByteString, ByteString> bottom =
      b.graph(Flow.<ByteString> empty().transform(() -> new FrameParser()));
    return new BidiShape<>(top, bottom);
  });
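The stash/needed bookkeeping of the frame parser can be exercised outside of a stream with plain byte arrays. The following is a simplified sketch, not the PushPullStage itself: FramingSketch, onChunk and demo are hypothetical names, and unlike the stage above it returns every complete frame found in a chunk at once instead of emitting one frame per pull.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java analogy of length-prefixed framing (4-byte little-endian header).
class FramingSketch {
    private byte[] stash = new byte[0]; // received but not yet parsed bytes
    private int needed = -1;            // current frame length, -1 at a boundary

    static byte[] addLengthHeader(byte[] payload) {
        return ByteBuffer.allocate(4 + payload.length)
            .order(ByteOrder.LITTLE_ENDIAN)
            .putInt(payload.length)
            .put(payload)
            .array();
    }

    // Appends a chunk to the stash and returns all frames that are now complete.
    List<byte[]> onChunk(byte[] chunk) {
        byte[] joined = Arrays.copyOf(stash, stash.length + chunk.length);
        System.arraycopy(chunk, 0, joined, stash.length, chunk.length);
        stash = joined;
        List<byte[]> frames = new ArrayList<>();
        while (true) {
            if (needed == -1) {
                if (stash.length < 4) break; // header incomplete, wait for more
                needed = ByteBuffer.wrap(stash).order(ByteOrder.LITTLE_ENDIAN).getInt();
                stash = Arrays.copyOfRange(stash, 4, stash.length);
            }
            if (stash.length < needed) break; // in the middle of a frame
            frames.add(Arrays.copyOfRange(stash, 0, needed));
            stash = Arrays.copyOfRange(stash, needed, stash.length);
            needed = -1;
        }
        return frames;
    }

    // Frames two payloads, splits the wire bytes inside the first frame,
    // and parses the two chunks back into the original payloads.
    static List<String> demo() {
        byte[] first = addLengthHeader("ab".getBytes(StandardCharsets.US_ASCII));
        byte[] second = addLengthHeader("cde".getBytes(StandardCharsets.US_ASCII));
        byte[] wire = Arrays.copyOf(first, first.length + second.length);
        System.arraycopy(second, 0, wire, first.length, second.length);
        FramingSketch parser = new FramingSketch();
        List<String> result = new ArrayList<>();
        for (byte[] chunk : Arrays.asList(
                Arrays.copyOfRange(wire, 0, 5), Arrays.copyOfRange(wire, 5, wire.length))) {
            for (byte[] frame : parser.onChunk(chunk)) {
                result.add(new String(frame, StandardCharsets.US_ASCII));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The demo shows the property the text calls out: a received chunk may contain no complete message (the first five bytes) or more than one (the remainder).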

With these implementations we can build a protocol stack and test it:
/* construct protocol stack
 *         +------------------------------------+
 *         | stack                              |
 *         |                                    |
 *         |  +-------+            +---------+  |
 *    ~>   O~~o       |     ~>     |         o~~O    ~>
 * Message |  | codec | ByteString | framing |  | ByteString
 *    <~   O~~o       |     <~     |         o~~O    <~
 *         |  +-------+            +---------+  |
 *         +------------------------------------+
 */
final BidiFlow<Message, ByteString, ByteString, Message, BoxedUnit> stack =
  codec.atop(framing);
// test it by plugging it into its own inverse and closing the right end
final Flow<Message, Message, BoxedUnit> pingpong =
  Flow.<Message> empty().collect(new PFBuilder<Message, Message>()
    .match(Ping.class, p -> new Pong(p.id))
    .build()
  );
final Flow<Message, Message, BoxedUnit> flow =
  stack.atop(stack.reversed()).join(pingpong);
final Future<List<Message>> result = Source
  .from(Arrays.asList(0, 1, 2))
  .<Message> map(id -> new Ping(id))
  .via(flow)
  .grouped(10)
  .runWith(Sink.<List<Message>> head(), mat);
final FiniteDuration oneSec = Duration.create(1, TimeUnit.SECONDS);
assertArrayEquals(
  new Message[] { new Pong(0), new Pong(1), new Pong(2) },
  Await.result(result, oneSec).toArray(new Message[0]));

This example demonstrates how BidiFlow subgraphs can be hooked together and also turned around with the
.reversed() method. The test simulates both parties of a network communication protocol without actually
having to open a network connection; the flows can just be connected directly.
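How atop and reversed line up the four type parameters can be seen in a reduced model in which each direction is just a function. This is a sketch under that simplifying assumption (1:1 transformations only, no streaming, no materialized values); BidiSketch and its string-based "codec" and "framing" are illustrative stand-ins, not the library's classes.

```java
import java.util.function.Function;

// Reduced model of a bidirectional stage: one function per direction.
class BidiSketch<I1, O1, I2, O2> {
    final Function<I1, O1> outbound; // top:    I1 ~> O1
    final Function<I2, O2> inbound;  // bottom: O2 <~ I2

    BidiSketch(Function<I1, O1> outbound, Function<I2, O2> inbound) {
        this.outbound = outbound;
        this.inbound = inbound;
    }

    // Stack this stage on top of `lower`: outbound data passes through this
    // stage first, inbound data passes through `lower` first.
    <OO1, II2> BidiSketch<I1, OO1, II2, O2> atop(BidiSketch<O1, OO1, II2, I2> lower) {
        return new BidiSketch<I1, OO1, II2, O2>(
            outbound.andThen(lower.outbound),
            lower.inbound.andThen(inbound));
    }

    // Turn the stage around by swapping the two directions.
    BidiSketch<I2, O2, I1, O1> reversed() {
        return new BidiSketch<I2, O2, I1, O1>(inbound, outbound);
    }

    // A toy stack: ints become decimal strings, which get a "length:" prefix.
    static BidiSketch<Integer, String, String, Integer> stack() {
        BidiSketch<Integer, String, String, Integer> codec =
            new BidiSketch<Integer, String, String, Integer>(
                i -> Integer.toString(i), s -> Integer.parseInt(s));
        BidiSketch<String, String, String, String> framing =
            new BidiSketch<String, String, String, String>(
                s -> s.length() + ":" + s, s -> s.substring(s.indexOf(':') + 1));
        return codec.atop(framing);
    }

    static String encode(int id) {
        return stack().outbound.apply(id);
    }

    // Plug the stack into its own inverse, as the test above does.
    static int roundTrip(int id) {
        BidiSketch<Integer, Integer, Integer, Integer> both =
            stack().atop(stack().reversed());
        return both.outbound.apply(id);
    }

    public static void main(String[] args) {
        System.out.println(encode(42) + " -> " + roundTrip(42));
    }
}
```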

1.5.5 Accessing the materialized value inside the Graph


In certain cases it might be necessary to feed back the materialized value of a Graph (partial, closed or backing a
Source, Sink, Flow or BidiFlow). This is possible by using builder.materializedValue which gives an
Outlet that can be used in the graph as an ordinary source or outlet, and which will eventually emit the materialized value. If the materialized value is needed at more than one place, it is possible to call materializedValue
any number of times to acquire the necessary number of outlets.
final Sink<Integer, Future<Integer>> foldSink = Sink.<Integer, Integer> fold(0, (a, b) -> {
  return a + b;
});
final Flow<Future<Integer>, Integer, BoxedUnit> flatten = Flow.<Future<Integer>> empty()
  .mapAsync(4, x -> {
    return x;
  });
final Flow<Integer, Integer, Future<Integer>> foldingFlow = Flow.factory().create(foldSink,
  (b, fold) -> {
    return new Pair<>(
      fold.inlet(),
      b.from(b.materializedValue()).via(flatten).out());
  });

Be careful not to introduce a cycle where the materialized value actually contributes to the materialized value. The
following example demonstrates a case where the materialized Future of a fold is fed back to the fold itself.
// This cannot produce any value:
final Source<Integer, Future<Integer>> cyclicSource = Source.factory().create(foldSink,
  (b, fold) -> {
    // - Fold cannot complete until its upstream mapAsync completes
    // - mapAsync cannot complete until the materialized Future produced by
    //   fold completes
    // As a result this Source will never emit anything, and its materialized
    // Future will never complete
    b.from(b.materializedValue()).via(flatten).to(fold);
    return b.from(b.materializedValue()).via(flatten).out();
  });
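The circular dependency in the comments above has a close analogue in plain java.util.concurrent: a future whose only completion path runs after the future itself has completed. The sketch uses CompletableFuture instead of the Scala Future of the example, and SelfDependentFutureSketch and completesWithin are illustrative names.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Analogy for a materialized value that feeds its own computation.
class SelfDependentFutureSketch {
    static boolean completesWithin(long millis) {
        CompletableFuture<Integer> result = new CompletableFuture<>();
        // The only code that would complete `result` is a callback that
        // runs after `result` has completed, so it never fires.
        result.thenAccept(value -> result.complete(value + 1));
        try {
            result.get(millis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false; // still waiting on itself
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(completesWithin(100));
    }
}
```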

1.5.6 Graph cycles, liveness and deadlocks


Cycles in bounded flow graphs need special considerations to avoid potential deadlocks and other liveness issues.
This section shows several examples of problems that can arise from the presence of feedback arcs in stream
processing graphs.
The first example demonstrates a graph that contains a naive cycle. The graph takes elements from the source,
prints them, then broadcasts those elements to a consumer (we just used Sink.ignore for now) and to a
feedback arc that is merged back into the main stream via a Merge junction.
// WARNING! The graph below deadlocks!
final Flow<Integer, Integer, BoxedUnit> printFlow =
  Flow.of(Integer.class).map(s -> {
    System.out.println(s);
    return s;
  });
FlowGraph.factory().closed(b -> {
  final UniformFanInShape<Integer, Integer> merge = b.graph(Merge.create(2));
  final UniformFanOutShape<Integer, Integer> bcast = b.graph(Broadcast.create(2));
  b.from(source).via(merge).via(printFlow).via(bcast).to(Sink.ignore());
  b.to(merge).from(bcast);
});

Running this we observe that after a few numbers have been printed, no more elements are logged to the console;
all processing stops after some time. After some investigation we observe that:
• through merging from source we increase the number of elements flowing in the cycle
• by broadcasting back to the cycle we do not decrease the number of elements in the cycle
Since Akka Streams (and Reactive Streams in general) guarantee bounded processing (see the Buffering section
for more details) it means that only a bounded number of elements are buffered over any time span. Since our
cycle gains more and more elements, eventually all of its internal buffers become full, backpressuring source
forever. To be able to process more elements from source elements would need to leave the cycle somehow.
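The argument can be replayed with a deliberately small toy model. It assumes a single bounded buffer on the feedback arc, a merge that strictly alternates between its two inputs, and a broadcast that can only emit when the feedback buffer has room; the real stages have their own internal buffers and scheduling, so the numbers are illustrative only. NaiveCycleSketch and printedBeforeStall are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of the naive cycle: source + feedback -> merge -> print -> broadcast.
class NaiveCycleSketch {
    // Returns how many elements were printed before the cycle stalled,
    // or -1 if it was still running after maxSteps.
    static int printedBeforeStall(int feedbackCapacity, int maxSteps) {
        Deque<Integer> feedback = new ArrayDeque<>();
        boolean takeFromSource = true;
        int nextFromSource = 0;
        int printed = 0;
        for (int step = 0; step < maxSteps; step++) {
            int element;
            if (takeFromSource || feedback.isEmpty()) {
                element = nextFromSource++; // adds one element to the cycle
            } else {
                element = feedback.poll();  // recirculates an existing element
            }
            takeFromSource = !takeFromSource;
            if (feedback.size() == feedbackCapacity) {
                // broadcast cannot emit because the feedback arc is full, and
                // merge cannot drain the arc because its output is blocked
                return printed;
            }
            printed++;
            feedback.add(element); // broadcast copies the element back
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(printedBeforeStall(3, 1000));
    }
}
```

In this model every element taken from the source stays in the cycle forever, so the stall arrives after a number of steps proportional to the buffer size, however large that buffer is.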
If we modify our feedback loop by replacing the Merge junction with a MergePreferred we can avoid the
deadlock. MergePreferred is unfair as it always tries to consume from a preferred input port if there are
elements available before trying the other lower priority input ports. Since we feed back through the preferred
port it is always guaranteed that the elements in the cycle can flow.
// WARNING! The graph below stops consuming from "source" after a few steps
FlowGraph.factory().closed(b -> {
  final MergePreferredShape<Integer> merge = b.graph(MergePreferred.create(1));
  final UniformFanOutShape<Integer, Integer> bcast = b.graph(Broadcast.create(2));
  b.from(source).via(merge).via(printFlow).via(bcast).to(Sink.ignore());
  b.to(merge.preferred()).from(bcast);
});

If we run the example we see that the same sequence of numbers is printed over and over again, but the processing
does not stop. Hence, we avoided the deadlock, but source is still back-pressured forever, because buffer space
is never recovered: the only action we see is the circulation of a couple of initial elements from source.
Note: What we see here is that in certain cases we need to choose between boundedness and liveness. Our first
example would not deadlock if there were an infinite buffer in the loop; conversely, if the elements in the
cycle were balanced (as many elements are removed as are injected) there would be no deadlock either.
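Why the preferred port starves the source can be shown with a minimal model of the merge decision. The sketch assumes no internal buffering at all, so only a single element ends up circulating, whereas the real graph circulates a couple of initial elements; PreferredMergeSketch and consumedFromSource are illustrative names.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of a cycle whose feedback arc enters through a preferred port.
class PreferredMergeSketch {
    // Returns how many elements were taken from the source after `steps` steps.
    static int consumedFromSource(int steps) {
        Deque<Integer> feedback = new ArrayDeque<>();
        int consumed = 0;
        for (int step = 0; step < steps; step++) {
            Integer element = feedback.poll(); // preferred port is tried first
            if (element == null) {
                element = consumed++;          // only then fall back to the source
            }
            feedback.add(element);             // broadcast copies the element back
        }
        return consumed;
    }

    public static void main(String[] args) {
        System.out.println(consumedFromSource(100));
    }
}
```

The loop never stalls, which corresponds to the liveness gained, but after the first step the feedback queue is never empty again, so the source is not consulted any more.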
To make our cycle both live (not deadlocking) and fair we can introduce a dropping element on the feedback arc. In
this case we chose the buffer() operation giving it a dropping strategy OverflowStrategy.dropHead.
FlowGraph.factory().closed(b -> {
  final UniformFanInShape<Integer, Integer> merge = b.graph(Merge.create(2));
  final UniformFanOutShape<Integer, Integer> bcast = b.graph(Broadcast.create(2));
  final FlowShape<Integer, Integer> droppyFlow = b.graph(
    Flow.of(Integer.class).buffer(10, OverflowStrategy.dropHead()));
  b.from(source).via(merge).via(printFlow).via(bcast).to(Sink.ignore());
  b.to(merge).via(droppyFlow).from(bcast);
});

If we run this example we see that:
• The flow of elements does not stop, there are always elements printed
• We see that some of the numbers are printed several times over time (due to the feedback loop) but on
average the numbers are increasing in the long term
This example highlights that one solution to avoid deadlocks in the presence of potentially unbalanced cycles
(cycles where the number of circulating elements is unbounded) is to drop elements. An alternative would be to
define a larger buffer with OverflowStrategy.fail which would fail the stream instead of deadlocking it
after all buffer space has been consumed.
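The effect of a drop-head strategy on a bounded buffer can be illustrated with an ArrayDeque. This is a sketch of the general idea (when the buffer is full, discard the oldest element to make room for the new one), not the implementation behind buffer(); DropHeadBufferSketch and contentsAfter are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Bounded buffer that never blocks the producer: the oldest element is dropped.
class DropHeadBufferSketch<T> {
    private final Deque<T> elements = new ArrayDeque<>();
    private final int capacity;

    DropHeadBufferSketch(int capacity) {
        this.capacity = capacity;
    }

    void offer(T element) {
        if (elements.size() == capacity) {
            elements.pollFirst(); // drop the head (oldest element)
        }
        elements.addLast(element);
    }

    T poll() {
        return elements.pollFirst();
    }

    // Offers 1..count to a buffer of the given capacity and returns what is left.
    static List<Integer> contentsAfter(int capacity, int count) {
        DropHeadBufferSketch<Integer> buffer = new DropHeadBufferSketch<>(capacity);
        for (int i = 1; i <= count; i++) {
            buffer.offer(i);
        }
        List<Integer> remaining = new ArrayList<>();
        for (Integer e = buffer.poll(); e != null; e = buffer.poll()) {
            remaining.add(e);
        }
        return remaining;
    }

    public static void main(String[] args) {
        System.out.println(contentsAfter(3, 5));
    }
}
```

Because offer always succeeds, the stage upstream of such a buffer is never back-pressured by it, which is what breaks the deadlock on the feedback arc at the price of losing elements.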
As we discovered in the previous examples, the core problem was the unbalanced nature of the feedback loop. We
circumvented this issue by adding a dropping element, but now we want to build a cycle that is balanced from the
beginning instead. To achieve this we modify our first graph by replacing the Merge junction with a ZipWith.
Since ZipWith takes one element from source and from the feedback arc to inject one element into the cycle,
we maintain the balance of elements.
// WARNING! The graph below never processes any elements
FlowGraph.factory().closed(b -> {
  final FanInShape2<Integer, Integer, Integer>
    zip = b.graph(ZipWith.create((Integer left, Integer right) -> left));
  final UniformFanOutShape<Integer, Integer> bcast = b.graph(Broadcast.create(2));
  b.from(source).to(zip.in0());
  b.from(zip.out()).via(printFlow).via(bcast).to(Sink.ignore());
  b.to(zip.in1()).from(bcast);
});

Still, when we try to run the example it turns out that no element is printed at all! After some investigation we
realize that:
• In order to get the first element from source into the cycle we need an already existing element in the
cycle
• In order to get an initial element in the cycle we need an element from source
These two conditions are a typical chicken-and-egg problem. The solution is to inject an initial element into the
cycle that is independent from source. We do this by using a Concat junction on the backwards arc that injects
a single element using Source.single.
FlowGraph.factory().closed(b -> {
  final FanInShape2<Integer, Integer, Integer>
    zip = b.graph(ZipWith.create((Integer left, Integer right) -> left));
  final UniformFanOutShape<Integer, Integer> bcast = b.graph(Broadcast.create(2));
  final UniformFanInShape<Integer, Integer> concat = b.graph(Concat.create());
  b.from(source).to(zip.in0());
  b.from(zip.out()).via(printFlow).via(bcast).to(Sink.ignore());
  b.to(zip.in1()).via(concat).from(Source.single(1));
  b.to(concat).from(bcast);
});

When we run the above example we see that processing starts and never stops. The important takeaway from this
example is that balanced cycles often need an initial kick-off element to be injected into the cycle.
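The chicken-and-egg situation and its fix can be reproduced with a queue-based toy model of the balanced cycle. It assumes a finite source and ignores buffering and asynchrony; BalancedCycleSketch and run are illustrative names, and the kick-off value 1 mirrors the Source.single(1) of the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Toy model of source + feedback -> zip -> print -> broadcast -> feedback.
class BalancedCycleSketch {
    // Returns the elements that get printed for the given source.
    static List<Integer> run(List<Integer> source, boolean kickOff) {
        Deque<Integer> feedback = new ArrayDeque<>();
        if (kickOff) {
            feedback.add(1); // initial element injected independently of source
        }
        List<Integer> printed = new ArrayList<>();
        for (Integer left : source) {
            Integer right = feedback.poll();
            if (right == null) {
                break;          // zip needs one element from each side: stuck
            }
            printed.add(left);  // the zip function keeps the left element
            feedback.add(left); // broadcast feeds a copy back into the cycle
        }
        return printed;
    }

    public static void main(String[] args) {
        List<Integer> source = Arrays.asList(10, 20, 30);
        System.out.println(run(source, false));
        System.out.println(run(source, true));
    }
}
```

Each zip step removes one element from the feedback queue and the broadcast puts one back, so the number of circulating elements stays constant once the kick-off element is present.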

1.6 Modularity, Composition and Hierarchy


Akka Streams provide a uniform model of stream processing graphs, which allows flexible composition of reusable
components. In this chapter we show what these look like from the conceptual and API perspective, demonstrating
the modularity aspects of the library.

1.6.1 Basics of composition and modularity


Every processing stage used in Akka Streams can be imagined as a box with input and output ports where
elements to be processed arrive and leave the stage. In this view, a Source is nothing else than a box with a
single output port, or, a BidiFlow is a box with exactly two input and two output ports. In the figure below
we illustrate the most commonly used stages viewed as boxes.

The linear stages are Source, Sink and Flow, as these can be used to compose strict chains of processing
stages. Fan-in and Fan-out stages usually have multiple input or multiple output ports, therefore they make it
possible to build more complex graph layouts, not just chains. BidiFlow stages are usually useful in IO related tasks,
where there are input and output channels to be handled. Due to the specific shape of BidiFlow it is easy to
stack them on top of each other to build a layered protocol for example. The TLS support in Akka is for example
implemented as a BidiFlow.
These reusable components already allow the creation of complex processing networks. What we have seen
so far does not implement modularity though. It is desirable for example to package up a larger graph entity
into a reusable component which hides its internals, exposing only the ports that users of the
module are meant to interact with. One good example is the Http server component, which is encoded internally as a
BidiFlow which interfaces with the client TCP connection using an input-output port pair accepting and sending
ByteStrings, while its upper ports emit and receive HttpRequest and HttpResponse instances.
The following figure demonstrates various composite stages that contain various other types of stages internally,
but hide them behind a shape that looks like a Source, Flow, etc.

One interesting example above is a Flow which is composed of a disconnected Sink and Source. This can be
achieved by using the wrap() constructor method on Flow which takes the two parts as parameters.
The example BidiFlow demonstrates that internally a module can be of arbitrary complexity, and the exposed
ports can be wired in flexible ways. The only constraint is that all the ports of enclosed modules must be either
connected to each other, or exposed as interface ports, and the number of such ports needs to match the requirement
of the shape, for example a Source allows only one exposed output port, the rest of the internal ports must be
properly connected.


These mechanics allow arbitrary nesting of modules. For example the following figure demonstrates a
RunnableGraph that is built from a composite Source and a composite Sink (which in turn contains a
composite Flow).

The above diagram contains one more shape that we have not seen yet, which is called RunnableGraph. It
turns out, that if we wire all exposed ports together, so that no more open ports remain, we get a module that
is closed. This is what the RunnableGraph class represents. This is the shape that a Materializer can
take and turn into a network of running entities that perform the task described. In fact, a RunnableGraph is
a module itself, and (maybe somewhat surprisingly) it can be used as part of larger graphs. It is rarely useful to
embed a closed graph shape in a larger graph (since it becomes an isolated island as there are no open ports for
communication with the rest of the graph), but this demonstrates the uniform underlying model.
If we try to build a code snippet that corresponds to the above diagram, our first try might look like this:
Source.single(0)
  .map(i -> i + 1)
  .filter(i -> i != 0)
  .map(i -> i - 2)
  .to(Sink.fold(0, (acc, i) -> acc + i));
// ... where is the nesting?

It is clear however that there is no nesting present in our first attempt. Since the library cannot figure out where
we intended to put composite module boundaries, it is our responsibility to do that. If we are using the DSL
provided by the Flow, Source, Sink classes then nesting can be achieved by calling one of the methods
withAttributes() or named() (where the latter is just a shorthand for adding a name attribute).
The following code demonstrates how to achieve the desired nesting:
final Source<Integer, BoxedUnit> nestedSource =
  Source.single(0) // An atomic source
    .map(i -> i + 1) // an atomic processing stage
    .named("nestedSource"); // wraps up the current Source and gives it a name
final Flow<Integer, Integer, BoxedUnit> nestedFlow =
  Flow.of(Integer.class).filter(i -> i != 0) // an atomic processing stage
    .map(i -> i - 2) // another atomic processing stage
    .named("nestedFlow"); // wraps up the Flow, and gives it a name
final Sink<Integer, BoxedUnit> nestedSink =
  nestedFlow.to(Sink.fold(0, (acc, i) -> acc + i)) // wire an atomic sink to the nestedFlow
    .named("nestedSink"); // wrap it up
// Create a RunnableGraph
final RunnableGraph<BoxedUnit> runnableGraph = nestedSource.to(nestedSink);

Once we have hidden the internals of our components, they act like any other built-in component of similar shape.
If we hide some of the internals of our composites, the result looks just as if any other predefined component had
been used:

If we look at usage of built-in components, and our custom components, there is no difference in usage as the code
snippet below demonstrates.
// Create a RunnableGraph from our components
final RunnableGraph<BoxedUnit> runnableGraph = nestedSource.to(nestedSink);
// Usage is uniform, no matter if modules are composite or atomic
final RunnableGraph<BoxedUnit> runnableGraph2 =
  Source.single(0).to(Sink.fold(0, (acc, i) -> acc + i));

1.6.2 Composing complex systems


In the previous section we explored the possibility of composition, and hierarchy, but we stayed away from non-linear,
generalized graph components. There is nothing in Akka Streams though that enforces that stream processing layouts
can only be linear. The DSL for Source and friends is optimized for creating such linear chains, as
they are the most common in practice. There is a more advanced DSL for building complex graphs, that can be
used if more flexibility is needed. We will see that the difference between the two DSLs is only on the surface:
the concepts they operate on are uniform across all DSLs and fit together nicely.
As a first example, let's look at a more complex layout:


The diagram shows a RunnableGraph (remember, if there are no unwired ports, the graph is closed, and
therefore can be materialized) that encapsulates a non-trivial stream processing network. It contains fan-in and
fan-out stages, and directed and non-directed cycles. The closed() method of the FlowGraph factory object allows
the creation of a general closed graph. For example the network on the diagram can be realized like this:
FlowGraph.factory().closed(builder -> {
  final Outlet<Integer> A = builder.source(Source.single(0));
  final UniformFanOutShape<Integer, Integer> B = builder.graph(Broadcast.create(2));
  final UniformFanInShape<Integer, Integer> C = builder.graph(Merge.create(2));
  final FlowShape<Integer, Integer> D =
    builder.graph(Flow.of(Integer.class).map(i -> i + 1));
  final UniformFanOutShape<Integer, Integer> E = builder.graph(Balance.create(2));
  final UniformFanInShape<Integer, Integer> F = builder.graph(Merge.create(2));
  final Inlet<Integer> G = builder.sink(Sink.foreach(System.out::println));
  builder.from(F).to(C);
  builder.from(A).via(B).via(C).to(F);
  builder.from(B).via(D).via(E).to(F);
  builder.from(E).to(G);
});

In the code above we used the implicit port numbering feature to make the graph more readable and similar to the
diagram. It is also possible to refer to the ports explicitly, so another version might look like this:
FlowGraph.factory().closed(builder -> {
  final Outlet<Integer> A = builder.source(Source.single(0));
  final UniformFanOutShape<Integer, Integer> B = builder.graph(Broadcast.create(2));
  final UniformFanInShape<Integer, Integer> C = builder.graph(Merge.create(2));
  final FlowShape<Integer, Integer> D =
    builder.graph(Flow.of(Integer.class).map(i -> i + 1));
  final UniformFanOutShape<Integer, Integer> E = builder.graph(Balance.create(2));
  final UniformFanInShape<Integer, Integer> F = builder.graph(Merge.create(2));
  final Inlet<Integer> G = builder.sink(Sink.foreach(System.out::println));
  builder.from(F.out()).to(C.in(0));
  builder.from(A).to(B.in());
  builder.from(B.out(0)).to(C.in(1));
  builder.from(C.out()).to(F.in(0));
  builder.from(B.out(1)).via(D).to(E.in());
  builder.from(E.out(0)).to(F.in(1));
  builder.from(E.out(1)).to(G);
});

Similar to the case in the first section, so far we have not considered modularity. We created a complex graph, but
the layout is flat, not modularized. We will modify our example, and create a reusable component with the graph
DSL. The way to do it is to use the partial() method on FlowGraph factory. If we remove the sources and
sinks from the previous example, what remains is a partial graph:

We can recreate a similar graph in code, using the DSL in a similar way as before:
final Graph<FlowShape<Integer, Integer>, BoxedUnit> partial =
  FlowGraph.factory().partial(builder -> {
    final UniformFanOutShape<Integer, Integer> B = builder.graph(Broadcast.create(2));
    final UniformFanInShape<Integer, Integer> C = builder.graph(Merge.create(2));
    final UniformFanOutShape<Integer, Integer> E = builder.graph(Balance.create(2));
    final UniformFanInShape<Integer, Integer> F = builder.graph(Merge.create(2));
    builder.from(F.out()).to(C.in(0));
    builder.from(B).via(C).to(F);
    builder.from(B).via(builder.graph(Flow.of(Integer.class).map(i -> i + 1))).via(E).to(F);
    return new FlowShape(B.in(), E.out(1));
  });

The only new addition is the return value of the builder block, which is a Shape. All graphs (including Source,
BidiFlow, etc) have a shape, which encodes the typed ports of the module. In our example there is exactly one
input and output port left, so we can declare it to have a FlowShape by returning an instance of it. While it is
possible to create new Shape types, it is usually recommended to use one of the matching built-in ones.


The resulting graph is already a properly wrapped module, so there is no need to call named() to encapsulate the
graph, but it is a good practice to give names to modules to help debugging.

Since our partial graph has the right shape, it can already be used in the simpler, linear DSL:
Source.single(0).via(partial).to(Sink.ignore());

It is not possible to use it as a Flow yet, though (i.e. we cannot call .filter() on it), but Flow has a wrap()
method that just adds the DSL to a FlowShape. There are similar methods on Source, Sink and BidiFlow,
so it is easy to get back to the simpler DSL if a graph has the right shape. For convenience, it is also possible
to skip the partial graph creation, and use one of the convenience creator methods. To demonstrate this, we will
create the following graph:

The code version of the above closed graph might look like this:

// Convert the partial graph of FlowShape to a Flow to get
// access to the fluid DSL (for example to be able to call .filter())
final Flow<Integer, Integer, BoxedUnit> flow = Flow.wrap(partial);
// Simple way to create a graph backed Source
final Source<Integer, BoxedUnit> source = Source.factory().create(builder -> {
  final UniformFanInShape<Integer, Integer> merge = builder.graph(Merge.create(2));
  builder.from(builder.source(Source.single(0))).to(merge);
  builder.from(builder.source(Source.from(Arrays.asList(2, 3, 4)))).to(merge);
  // Exposing exactly one output port
  return merge.out();
});
// Building a Sink with a nested Flow, using the fluid DSL
final Sink<Integer, BoxedUnit> sink = Flow.of(Integer.class)
  .map(i -> i * 2)
  .drop(10)
  .named("nestedFlow")
  .to(Sink.head());
// Putting all together
final RunnableGraph<BoxedUnit> closed = source.via(flow.filter(i -> i > 1)).to(sink);

Note: All graph builder sections check if the resulting graph has all ports connected except the exposed ones and
will throw an exception if this is violated.
We still owe a demonstration that RunnableGraph is a component just like any other, which can be
embedded in graphs. In the following snippet we embed one closed graph in another:
final RunnableGraph<BoxedUnit> closed1 =
  Source.single(0).to(Sink.foreach(System.out::println));
final RunnableGraph<BoxedUnit> closed2 = FlowGraph.factory().closed(builder -> {
  final ClosedShape embeddedClosed = builder.graph(closed1);
});

The type of the imported module indicates that the imported module has a ClosedShape, and so we are not able
to wire it to anything else inside the enclosing closed graph. Nevertheless, this island is embedded properly, and
will be materialized just like any other module that is part of the graph.
As we have demonstrated, the two DSLs are fully interoperable, as they encode a similar nested structure of boxes
with ports; only the DSLs differ, each being as powerful as possible on its given abstraction level. It is
possible to embed complex graphs in the fluid DSL, and it is just as easy to import and embed a Flow, etc, in a
larger, complex structure.
We have also seen that every module has a Shape (for example a Sink has a SinkShape) independently
of which DSL was used to create it. This uniform representation enables the rich composability of various stream
processing entities in a convenient way.

1.6.3 Materialized values


After realizing that RunnableGraph is nothing more than a module with no unused ports (it is an island), it
becomes clear that after materialization the only way to communicate with the running stream processing logic is
via some side-channel. This side channel is represented as a materialized value. The situation is similar to Actors,
where the Props instance describes the actor logic, but it is the call to actorOf() that creates an actually
running actor, and returns an ActorRef that can be used to communicate with the running actor itself. Since the
Props can be reused, each call will return a different reference.
When it comes to streams, each materialization creates a new running network corresponding to the blueprint
that was encoded in the provided RunnableGraph. To be able to interact with the running network, each
materialization needs to return a different object that provides the necessary interaction capabilities. In other
words, the RunnableGraph can be seen as a factory, which creates:
• a network of running processing entities, inaccessible from the outside
• a materialized value, optionally providing a controlled interaction capability with the network
Unlike actors though, each of the processing stages might provide a materialized value, so when we compose
multiple stages or modules, we need to combine the materialized values as well (there are default rules which make
this easier, for example to() and via() take care of the most common case of keeping the materialized value of the
left side. See Combining materialized values for details). We demonstrate how this works with a code example and a
diagram which graphically shows what is happening.
The propagation of the individual materialized values from the enclosed modules towards the top will look like
this:

To implement the above, first, we create a composite Source, where the enclosed Source has a materialized
type of Promise. By using the combiner function Keep.left(), the resulting materialized type is that of the
nested module (indicated by the color red on the diagram):
// Materializes to Promise<BoxedUnit>                                     (red)
final Source<Integer, Promise<BoxedUnit>> source = Source.<Integer> lazyEmpty();

// Materializes to BoxedUnit                                              (black)
final Flow<Integer, Integer, BoxedUnit> flow1 = Flow.of(Integer.class).take(100);

// Materializes to Promise<BoxedUnit>                                     (red)
final Source<Integer, Promise<BoxedUnit>> nestedSource =
  source.viaMat(flow1, Keep.left()).named("nestedSource");

Next, we create a composite Flow from two smaller components. Here, the second enclosed Flow has a materialized type of Future, and we propagate this to the parent by using Keep.right() as the combiner function
(indicated by the color yellow on the diagram):
// Materializes to BoxedUnit                                              (orange)
final Flow<Integer, ByteString, BoxedUnit> flow2 = Flow.of(Integer.class)
  .map(i -> ByteString.fromString(i.toString()));

// Materializes to Future<OutgoingConnection>                             (yellow)
final Flow<ByteString, ByteString, Future<OutgoingConnection>> flow3 =
  Tcp.get(system).outgoingConnection("localhost", 8080);

// Materializes to Future<OutgoingConnection>                             (yellow)
final Flow<Integer, ByteString, Future<OutgoingConnection>> nestedFlow =
  flow2.viaMat(flow3, Keep.right()).named("nestedFlow");

As a third step, we create a composite Sink, using our nestedFlow as a building block. In this snippet, both the
enclosed Flow and the folding Sink have a materialized value that is interesting for us, so we use Keep.both()
to get a Pair of them as the materialized type of nestedSink (indicated by the color blue on the diagram):
// Materializes to Future<String>                                         (green)
final Sink<ByteString, Future<String>> sink = Sink
  .fold("", (acc, i) -> acc + i.utf8String());

// Materializes to Pair<Future<OutgoingConnection>, Future<String>>       (blue)
final Sink<Integer, Pair<Future<OutgoingConnection>, Future<String>>> nestedSink =
  nestedFlow.toMat(sink, Keep.both());

As the last example, we wire together nestedSource and nestedSink and we use a custom combiner
function to create yet another materialized type for the resulting RunnableGraph. This combiner function just
ignores the Future<String> part, and wraps the other two values in a custom class MyClass (indicated by the color
purple on the diagram):
static class MyClass {
  private Promise<BoxedUnit> p;
  private OutgoingConnection conn;

  public MyClass(Promise<BoxedUnit> p, OutgoingConnection conn) {
    this.p = p;
    this.conn = conn;
  }

  public void close() {
    p.success(scala.runtime.BoxedUnit.UNIT);
  }
}

static class Combiner {
  static Future<MyClass> f(Promise<BoxedUnit> p,
      Pair<Future<OutgoingConnection>, Future<String>> rest) {
    return rest.first().map(new Mapper<OutgoingConnection, MyClass>() {
      public MyClass apply(OutgoingConnection c) {
        return new MyClass(p, c);
      }
    }, system.dispatcher());
  }
}

// Materializes to Future<MyClass>                                        (purple)
final RunnableGraph<Future<MyClass>> runnableGraph =
  nestedSource.toMat(nestedSink, Combiner::f);

Note: The nested structure in the above example is not necessary for combining the materialized values, it just
demonstrates how the two features work together. See Combining materialized values for further examples of
combining materialized values without nesting and hierarchy involved.
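
The combiner idea can also be illustrated without any Akka types. The following is a minimal plain-Java sketch, not the library's implementation: MatBlueprint and its combine() method are hypothetical names invented here. It shows that every materialization produces a fresh value and that a two-argument function decides which side's value survives, in the spirit of Keep.left(), Keep.right() and a custom combiner.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiFunction;
import java.util.function.Supplier;

/** Hypothetical blueprint: every call to materialize() yields a fresh materialized value. */
final class MatBlueprint<M> {
    private final Supplier<M> factory;

    MatBlueprint(Supplier<M> factory) { this.factory = factory; }

    M materialize() { return factory.get(); }

    /** Compose two blueprints; the combiner picks or builds the resulting materialized value. */
    <M2, R> MatBlueprint<R> combine(MatBlueprint<M2> other, BiFunction<M, M2, R> combiner) {
        return new MatBlueprint<>(() -> combiner.apply(materialize(), other.materialize()));
    }
}

final class MatCombineSketch {
    public static void main(String[] args) {
        AtomicInteger ids = new AtomicInteger();
        // The left side materializes to a new id on every run, the right side to a constant.
        MatBlueprint<Integer> left = new MatBlueprint<>(ids::incrementAndGet);
        MatBlueprint<String> right = new MatBlueprint<>(() -> "conn");

        // Counterparts of Keep.left(), Keep.right() and a custom combiner function.
        MatBlueprint<Integer> keepLeft = left.combine(right, (l, r) -> l);
        MatBlueprint<String> keepRight = left.combine(right, (l, r) -> r);
        MatBlueprint<String> custom = left.combine(right, (l, r) -> r + "#" + l);

        System.out.println(keepLeft.materialize());
        System.out.println(keepRight.materialize());
        System.out.println(custom.materialize());
    }
}
```

Note that both sides are materialized on every run, even when the combiner discards one of the values; only the value handed back to the caller differs.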

1.6.4 Attributes
We have seen that we can use named() to introduce a nesting level in the fluid DSL (and also explicit nesting by
using partial() from FlowGraph). Apart from having the effect of adding a nesting level, named() is actually
a shorthand for calling withAttributes(Attributes.name("someName")). Attributes provide
a way to fine-tune certain aspects of the materialized running entity. For example buffer sizes can be controlled via
attributes (see Buffers and working with rate). When it comes to hierarchic composition, attributes are inherited by
nested modules, unless they override them with a custom value.
The code below, a modification of an earlier example, sets the inputBuffer attribute on certain modules, but
not on others:
final Source<Integer, BoxedUnit> nestedSource =
  Source.single(0)
    .map(i -> i + 1)
    .named("nestedSource"); // Wrap, no inputBuffer set

final Flow<Integer, Integer, BoxedUnit> nestedFlow =
  Flow.of(Integer.class).filter(i -> i != 0)
    .via(Flow.of(Integer.class)
      .map(i -> i - 2)
      .withAttributes(Attributes.inputBuffer(4, 4))) // override
    .named("nestedFlow"); // Wrap, no inputBuffer set

final Sink<Integer, BoxedUnit> nestedSink =
  nestedFlow.to(Sink.fold(0, (acc, i) -> acc + i)) // wire an atomic sink to the nestedFlow
    .withAttributes(Attributes.name("nestedSink")
      .and(Attributes.inputBuffer(3, 3))); // override

The effect is that each module inherits the inputBuffer attribute from its enclosing parent, unless it has
the same attribute explicitly set. nestedSource gets the default attributes from the materializer itself.
nestedSink on the other hand has this attribute set, so it will be used by all nested modules. nestedFlow
will inherit from nestedSink except the map stage which has again an explicitly provided attribute overriding
the inherited one.
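
The lookup rule described above can be sketched in plain Java as a walk from a module towards its enclosing parents. This is only an illustration under stated assumptions: AttrModule is a hypothetical class, the tree mirrors the example code, and the materializer default of 16 is taken from the default buffer size mentioned in this documentation.

```java
import java.util.Optional;

/** Hypothetical module node: an optional own setting plus a link to the enclosing module. */
final class AttrModule {
    final String name;
    final Optional<Integer> inputBuffer; // empty = attribute not set on this module
    final AttrModule parent;             // null = top level

    AttrModule(String name, Integer inputBuffer, AttrModule parent) {
        this.name = name;
        this.inputBuffer = Optional.ofNullable(inputBuffer);
        this.parent = parent;
    }

    /** Walk outwards until some enclosing module defines the attribute. */
    int effectiveInputBuffer(int materializerDefault) {
        for (AttrModule m = this; m != null; m = m.parent) {
            if (m.inputBuffer.isPresent()) return m.inputBuffer.get();
        }
        return materializerDefault;
    }
}

final class AttrInheritanceSketch {
    public static void main(String[] args) {
        int materializerDefault = 16; // assumed default, see lead-in
        AttrModule nestedSource = new AttrModule("nestedSource", null, null);
        AttrModule nestedSink = new AttrModule("nestedSink", 3, null);
        AttrModule nestedFlow = new AttrModule("nestedFlow", null, nestedSink);
        AttrModule filter = new AttrModule("filter", null, nestedFlow);
        AttrModule map = new AttrModule("map", 4, nestedFlow);
        for (AttrModule m : new AttrModule[] {nestedSource, nestedFlow, filter, map}) {
            System.out.println(m.name + " -> " + m.effectiveInputBuffer(materializerDefault));
        }
    }
}
```

In this model nestedSource falls back to the materializer default, nestedFlow and its filter stage pick up the value set on nestedSink, and the map stage keeps its own override.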

This diagram illustrates the inheritance process for the example code (representing the materializer default attributes as the color red, the attributes set on nestedSink as blue and the attributes set on nestedFlow as
green).

1.7 Buffers and working with rate


Akka Streams processing stages are asynchronous and pipelined by default, which means that a stage, after handing
out an element to its downstream consumer, is able to immediately process the next message. To demonstrate what
we mean by this, let's take a look at the following example:
Source.from(Arrays.asList(1, 2, 3))
.map(i -> {System.out.println("A: " + i); return i;})
.map(i -> {System.out.println("B: " + i); return i;})
.map(i -> {System.out.println("C: " + i); return i;})
.runWith(Sink.ignore(), mat);

Running the above example, one of the possible outputs looks like this:
A: 1
A: 2
B: 1
A: 3
B: 2
C: 1
B: 3
C: 2
C: 3

Note that the order is not A:1, B:1, C:1, A:2, B:2, C:2, which would correspond to a synchronous
execution model where an element completely flows through the processing pipeline before the next element
enters the flow. Instead, a stage processes the next element as soon as it has emitted the previous one.
While pipelining in general increases throughput, in practice there is a significant cost to passing an element
through the asynchronous (and therefore thread crossing) boundary. To amortize this cost Akka Streams
uses a windowed, batching backpressure strategy internally. It is windowed because, as opposed to a Stop-And-Wait
protocol, multiple elements might be in-flight concurrently with requests for elements. It is also batching
because a new element is not immediately requested once an element has been drained from the window-buffer
but multiple elements are requested after multiple elements have been drained. This batching strategy reduces the
communication cost of propagating the backpressure signal through the asynchronous boundary.
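
To get a feeling for the saving, the following plain-Java sketch counts how many demand signals a consumer sends upstream under a simplified model. It is not the actual Akka Streams protocol: demandSignals is a hypothetical helper, and the policy (one initial request for the whole window, then one request of batch elements after every batch elements drained) is an assumption made for illustration.

```java
final class DemandBatchingSketch {
    /** Number of demand signals sent upstream in order to receive the given number of elements. */
    static int demandSignals(int elements, int window, int batch) {
        int signals = 1;             // initial request that fills the whole window
        int requested = window;      // total number of elements requested so far
        int drainedSinceRequest = 0; // elements consumed since the last request
        for (int received = 1; received <= elements; received++) {
            drainedSinceRequest++;
            // Ask for another batch only when a full batch was drained and more is needed.
            if (drainedSinceRequest == batch && requested < elements) {
                signals++;
                requested += batch;
                drainedSinceRequest = 0;
            }
        }
        return signals;
    }

    public static void main(String[] args) {
        System.out.println("stop-and-wait: " + demandSignals(1000, 1, 1));
        System.out.println("window 16, batch 8: " + demandSignals(1000, 16, 8));
    }
}
```

Under these assumptions, transferring 1000 elements takes 1000 demand signals with a Stop-And-Wait style window of one, but only 124 with a window of 16 refilled in batches of 8.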
While this internal protocol is mostly invisible to the user (apart from its throughput increasing effects) there are
situations when these details get exposed. In all of our previous examples we always assumed that the rate of
the processing chain is strictly coordinated through the backpressure signal causing all stages to process no faster
than the throughput of the connected chain. There are tools in Akka Streams however that enable the rates of
different segments of a processing chain to be detached or to define the maximum throughput of the stream
through external timing sources. These situations are exactly those where the internal batching buffering strategy
suddenly becomes non-transparent.

1.7.1 Buffers in Akka Streams


Internal buffers and their effect
As we have explained, for performance reasons Akka Streams introduces a buffer for every processing stage. The
purpose of these buffers is solely optimization; in fact a size of 1 would be the most natural choice if there
were no need for throughput improvements. Therefore it is recommended to keep these buffer sizes small,
and increase them only to a level suitable for the throughput requirements of the application. Default buffer sizes
can be set through configuration:

akka.stream.materializer.max-input-buffer-size = 16

Alternatively they can be set by passing an ActorMaterializerSettings to the materializer:


final Materializer materializer = ActorMaterializer.create(
ActorMaterializerSettings.create(system)
.withInputBuffer(64, 64), system);

If the buffer size needs to be set for segments of a Flow only, it is possible by defining a separate Flow with these
attributes:
final Flow<Integer, Integer, BoxedUnit> flow1 =
Flow.of(Integer.class)
.map(elem -> elem * 2) // the buffer size of this map is 1
.withAttributes(Attributes.inputBuffer(1, 1));
final Flow<Integer, Integer, BoxedUnit> flow2 =
flow1.via(
Flow.of(Integer.class)
.map(elem -> elem / 2)); // the buffer size of this map is the default

Here is an example of code that demonstrates some of the issues caused by internal buffers:
final FiniteDuration oneSecond =
  FiniteDuration.create(1, TimeUnit.SECONDS);
final Source<String, Cancellable> msgSource =
  Source.from(oneSecond, oneSecond, "message!");
final Source<String, Cancellable> tickSource =
  Source.from(oneSecond.mul(3), oneSecond.mul(3), "tick");

final Flow<String, Integer, BoxedUnit> conflate =
  Flow.of(String.class).conflate(
    first -> 1, (count, elem) -> count + 1);

FlowGraph.factory().closed(b -> {
  final FanInShape2<String, Integer, Integer> zipper =
    b.graph(ZipWith.create((String tick, Integer count) -> count));
  b.from(msgSource).via(conflate).to(zipper.in1());
  b.from(tickSource).to(zipper.in0());
  b.from(zipper.out()).to(Sink.foreach(elem -> System.out.println(elem)));
}).run(mat);

Running the above example one would expect the number 3 to be printed every 3 seconds (the conflate step
here is configured so that it counts the number of elements received before the downstream ZipWith consumes
them). What is being printed is different though: we will see the number 1. The reason for this is the internal buffer
which is by default 16 elements large, and prefetches elements before the ZipWith starts consuming them. It is
possible to fix this issue by changing the buffer size of ZipWith (or the whole graph) to 1. We will still see a
leading 1 though, which is caused by an initial prefetch of the ZipWith element.
Note: In general, when time or rate driven processing stages exhibit strange behavior, one of the first solutions to
try should be to decrease the input buffer of the affected elements to 1.

Explicit user defined buffers


The previous section explained the internal buffers of Akka Streams used to reduce the cost of passing elements
across the asynchronous boundary. These are internal buffers which will very likely be tuned automatically in
future versions. In this section we will discuss explicit user defined buffers that are part of the domain logic of the
stream processing pipeline of an application.
The example below will ensure that 1000 jobs (but not more) are dequeued from an external (imaginary) system
and stored locally in memory - relieving the external system:

// Getting a stream of jobs from an imaginary external system as a Source
final Source<Job, BoxedUnit> jobs = inboundJobsConnector;
jobs.buffer(1000, OverflowStrategy.backpressure());

The next example will also queue up 1000 jobs locally, but if there are more jobs waiting in the imaginary external
system, it makes space for the new element by dropping one element from the tail of the buffer. Dropping from
the tail is a very common strategy but it must be noted that this will drop the youngest waiting job. If some
fairness is desired in the sense that we want to be nice to jobs that have been waiting for a long time, then this
option can be useful.
jobs.buffer(1000, OverflowStrategy.dropTail());

Instead of dropping the youngest element from the tail of the buffer a new element can be dropped without
enqueueing it to the buffer at all.
jobs.buffer(1000, OverflowStrategy.dropNew());

Here is another example with a queue of 1000 jobs, but it makes space for the new element by dropping one
element from the head of the buffer. This is the oldest waiting job. This is the preferred strategy if jobs are
expected to be resent if not processed in a certain period. The oldest element will be retransmitted soon, (in fact a
retransmitted duplicate might be already in the queue!) so it makes sense to drop it first.
jobs.buffer(1000, OverflowStrategy.dropHead());

Compared to the dropping strategies above, dropBuffer drops all 1000 jobs it has enqueued once the buffer
gets full. This aggressive strategy is useful when dropping jobs is preferred to delaying jobs.
jobs.buffer(1000, OverflowStrategy.dropBuffer());

If our imaginary external job provider is a client using our API, we might want to enforce that the client cannot
have more than 1000 queued jobs otherwise we consider it flooding and terminate the connection. This is easily
achievable by the error strategy which simply fails the stream once the buffer gets full.
jobs.buffer(1000, OverflowStrategy.fail());
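
The dropping and failing strategies described above can be compared side by side with a small plain-Java model of a bounded buffer. This is a sketch for illustration, not Akka's implementation: OverflowSketch, its Strategy enum and the offer() helper are hypothetical names, and the backpressure strategy is left out because it needs a cooperating upstream rather than a local decision.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class OverflowSketch {
    enum Strategy { DROP_HEAD, DROP_TAIL, DROP_NEW, DROP_BUFFER, FAIL }

    /** Offer one element to a bounded buffer; the head is the oldest, the tail the youngest. */
    static <T> void offer(Deque<T> buffer, int capacity, T elem, Strategy strategy) {
        if (buffer.size() < capacity) {
            buffer.addLast(elem);
            return;
        }
        switch (strategy) {
            case DROP_HEAD:   buffer.removeFirst(); buffer.addLast(elem); break; // oldest goes
            case DROP_TAIL:   buffer.removeLast();  buffer.addLast(elem); break; // youngest goes
            case DROP_NEW:    break;                                             // incoming goes
            case DROP_BUFFER: buffer.clear();       buffer.addLast(elem); break; // everything goes
            case FAIL:        throw new IllegalStateException("buffer overflow");
        }
    }

    /** Push jobs 1..4 into a buffer of capacity 3 and return what is left in it. */
    static Deque<Integer> fill(Strategy strategy) {
        Deque<Integer> buffer = new ArrayDeque<>();
        for (int job = 1; job <= 4; job++) offer(buffer, 3, job, strategy);
        return buffer;
    }

    public static void main(String[] args) {
        for (Strategy s : new Strategy[] {
                Strategy.DROP_HEAD, Strategy.DROP_TAIL, Strategy.DROP_NEW, Strategy.DROP_BUFFER}) {
            System.out.println(s + " -> " + fill(s));
        }
    }
}
```

With four jobs and a capacity of three, the model keeps jobs 2, 3, 4 for DROP_HEAD, jobs 1, 2, 4 for DROP_TAIL, jobs 1, 2, 3 for DROP_NEW and only job 4 for DROP_BUFFER, while FAIL throws on the fourth job.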

1.7.2 Rate transformation


Understanding conflate
When a fast producer cannot be informed to slow down by backpressure or some other signal, conflate might be
useful to combine elements from a producer until a demand signal comes from a consumer.
Below is an example snippet that summarizes a fast stream of elements to the standard deviation, mean and count of
the elements that have arrived while the stats were being calculated.
final Flow<Double, Tuple3<Double, Double, Integer>, BoxedUnit> statsFlow =
  Flow.of(Double.class)
    // collect all elements that arrive while downstream is busy
    .conflate(elem -> Collections.singletonList(elem), (acc, elem) -> {
      return Stream
        .concat(acc.stream(), Collections.singletonList(elem).stream())
        .collect(Collectors.toList());
    })
    // summarize the collected elements
    .map(s -> {
      final Double mean = s.stream().mapToDouble(d -> d).sum() / s.size();
      final DoubleStream se = s.stream().mapToDouble(x -> Math.pow(x - mean, 2));
      final Double stdDev = Math.sqrt(se.sum() / s.size());
      return new Tuple3<>(stdDev, mean, s.size());
    });

This example demonstrates that the rates on the two sides of such a flow are decoupled. The element rate at the start
of the flow can be much higher than the element rate at the end of the flow.
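
The decoupling can be reproduced with a few lines of plain Java. The class below is a hypothetical stand-in for conflate, not the Akka stage itself: ConflateSketch, onElement() and onDemand() are names invented for this sketch. The producer side folds elements into a pending summary, and the consumer side takes whatever summary has accumulated when it asks.

```java
import java.util.function.BiFunction;
import java.util.function.Function;

/** Hypothetical stand-in for conflate: aggregates while there is no downstream demand. */
final class ConflateSketch<In, S> {
    private final Function<In, S> seed;
    private final BiFunction<S, In, S> aggregate;
    private S pending; // null = nothing is waiting for the consumer

    ConflateSketch(Function<In, S> seed, BiFunction<S, In, S> aggregate) {
        this.seed = seed;
        this.aggregate = aggregate;
    }

    /** Called by the fast producer; it is never slowed down and nothing is dropped. */
    void onElement(In elem) {
        pending = (pending == null) ? seed.apply(elem) : aggregate.apply(pending, elem);
    }

    /** Called by the slow consumer; returns null if nothing arrived since the last call. */
    S onDemand() {
        S result = pending;
        pending = null;
        return result;
    }

    public static void main(String[] args) {
        // Same seed and aggregate functions as the counting conflate shown earlier.
        ConflateSketch<String, Integer> counter =
            new ConflateSketch<>(first -> 1, (count, elem) -> count + 1);
        counter.onElement("message!");
        counter.onElement("message!");
        counter.onElement("message!");
        System.out.println(counter.onDemand()); // three elements were summarized into one
    }
}
```

A real conflate stage additionally waits for upstream when the consumer asks and nothing is pending; the sketch returns null in that case to stay single-threaded.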

Another possible use of conflate is to not consider all elements for the summary when the producer starts getting
too fast. The example below demonstrates how conflate can be used to randomly drop elements when the consumer is
not able to keep up with the producer.
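// java.util.Random; assumed here, the original snippet uses r without defining it
final Random r = new Random();
final Double p = 0.01;
final Flow<Double, Double, BoxedUnit> sampleFlow = Flow.of(Double.class)
  .conflate(elem -> Collections.singletonList(elem), (acc, elem) -> {
    // keep a newly arrived element only with probability p
    if (r.nextDouble() < p) {
      return Stream
        .concat(acc.stream(), Collections.singletonList(elem).stream())
        .collect(Collectors.toList());
    }
    return acc;
  })
  .mapConcat(d -> d);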

Understanding expand
Expand helps to deal with slow producers which are unable to keep up with the demand coming from consumers.
Expand allows extrapolating a value to be sent as an element to a consumer.
As a simple use of expand, here is a flow that sends the same element to the consumer again when the producer does
not send any new elements.
final Flow<Double, Double, BoxedUnit> lastFlow = Flow.of(Double.class)
  .expand(d -> d, s -> new Pair<>(s, s));

Expand also allows keeping some state between demand requests from the downstream. Leveraging this, here is a
flow that tracks and reports the drift between a fast consumer and a slow producer.
final Flow<Double, Pair<Double, Integer>, BoxedUnit> driftFlow = Flow.of(Double.class)
  .expand(d -> new Pair<Double, Integer>(d, 0), t -> {
    // emit the current (value, drift) pair and increment the drift for the next demand
    return new Pair<>(t, new Pair<>(t.first(), t.second() + 1));
  });

Note that all of the elements coming from upstream will go through expand at least once. This means that the
output of this flow is going to report a drift of zero if producer is fast enough, or a larger drift otherwise.
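
The drift behaviour can be traced with a plain-Java model of expand. This is a sketch under stated assumptions, not the Akka stage: ExpandSketch is a hypothetical class, and it splits the single extrapolation function of expand (which returns a pair of output and next state) into two separate functions, output and nextState, purely to keep the Java types short.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map.Entry;
import java.util.function.Function;
import java.util.function.UnaryOperator;

/** Hypothetical stand-in for expand: re-emits from saved state whenever there is demand. */
final class ExpandSketch<In, S, Out> {
    private final Function<In, S> seed;
    private final Function<S, Out> output;
    private final UnaryOperator<S> nextState;
    private S state; // null until the first upstream element arrives

    ExpandSketch(Function<In, S> seed, Function<S, Out> output, UnaryOperator<S> nextState) {
        this.seed = seed;
        this.output = output;
        this.nextState = nextState;
    }

    /** Called by the slow producer; a new element resets the extrapolation state. */
    void onElement(In elem) { state = seed.apply(elem); }

    /** Called by the fast consumer; emits from the current state and advances it. */
    Out onDemand() {
        if (state == null) throw new IllegalStateException("no element received yet");
        Out out = output.apply(state);
        state = nextState.apply(state);
        return out;
    }

    /** Counterpart of the drift flow above: state and output are (value, drift) pairs. */
    static ExpandSketch<Double, Entry<Double, Integer>, Entry<Double, Integer>> drift() {
        return new ExpandSketch<>(
            d -> new SimpleImmutableEntry<>(d, 0),
            s -> s,
            s -> new SimpleImmutableEntry<>(s.getKey(), s.getValue() + 1));
    }

    public static void main(String[] args) {
        ExpandSketch<Double, Entry<Double, Integer>, Entry<Double, Integer>> flow = drift();
        flow.onElement(1.5);
        for (int i = 0; i < 3; i++) System.out.println(flow.onDemand());
        flow.onElement(2.5); // a fresh element resets the drift to zero
        System.out.println(flow.onDemand());
    }
}
```

Each demand that is served without a new upstream element increases the reported drift by one, and the arrival of a new element brings it back to zero.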

1.8 Custom stream processing


While the processing vocabulary of Akka Streams is quite rich (see the Streams Cookbook for examples) it is
sometimes necessary to define new transformation stages either because some functionality is missing from the
stock operations, or for performance reasons. In this part we show how to build custom processing stages and
graph junctions of various kinds.

1.8.1 Custom linear processing stages


To extend the available transformations on a Flow or Source one can use the transform() method which
takes a factory function returning a Stage. Stages come in different flavors which we will introduce on this page.
Using PushPullStage
The most elementary transformation stage is the PushPullStage which can express a large class of algorithms
working on streams. A PushPullStage can be illustrated as a box with two input and two output ports, as
seen in the illustration below.

The input ports are implemented as event handlers onPush(elem,ctx) and onPull(ctx) while output
ports correspond to methods on the Context object that is handed as a parameter to the event handlers. By
calling exactly one output port method we wire up these four ports in various ways which we demonstrate
shortly.
Warning: There is one very important rule to remember when working with a Stage. Exactly one method
should be called on the currently passed Context exactly once and as the last statement of the handler
where the return type of the called method matches the expected return type of the handler. Any violation
of this rule will almost certainly result in unspecified behavior (in other words, it will break in spectacular
ways). Exceptions to this rule are the query methods isHolding() and isFinishing().
To illustrate these concepts we create a small PushPullStage that implements the map transformation.

Map calls ctx.push() from the onPush() handler and it also calls ctx.pull() from the onPull() handler,
resulting in the conceptual wiring above, which is fully expressed in code below:
public class Map<A, B> extends PushPullStage<A, B> {
  private final Function<A, B> f;

  public Map(Function<A, B> f) {
    this.f = f;
  }

  @Override public SyncDirective onPush(A elem, Context<B> ctx) {
    return ctx.push(f.apply(elem));
  }

  @Override public SyncDirective onPull(Context<B> ctx) {
    return ctx.pull();
  }
}

Map is a typical example of a one-to-one transformation of a stream. To demonstrate a many-to-one stage we will
implement filter. The conceptual wiring of Filter looks like this:

As we see above, if the given predicate matches the current element we are propagating it downwards, otherwise
we return the ball to our upstream so that we get the new element. This is achieved by modifying the map
example by adding a conditional in the onPush handler that decides between a ctx.pull() or ctx.push()
call (and of course not having a mapping function f).
public class Filter<A> extends PushPullStage<A, A> {
  private final Predicate<A> p;

  public Filter(Predicate<A> p) {
    this.p = p;
  }

  @Override public SyncDirective onPush(A elem, Context<A> ctx) {
    if (p.test(elem)) return ctx.push(elem);
    else return ctx.pull();
  }

  @Override public SyncDirective onPull(Context<A> ctx) {
    return ctx.pull();
  }
}

To complete the picture we define a one-to-many transformation as the next step. We chose a straightforward
example stage that emits every upstream element twice downstream. The conceptual wiring of this stage looks
like this:

This is a stage that has state: the last element it has seen, and a flag oneLeft that indicates if we have duplicated
this last element already or not. Looking at the code below, the reader might notice that our onPull method
is more complex than it is demonstrated by the figure above. The reason for this is completion handling, which
we will explain a little bit later. For now it is enough to look at the if (!ctx.isFinishing()) block which
corresponds to the logic we expect by looking at the conceptual picture.
class Duplicator<A> extends PushPullStage<A, A> {
  private A lastElem = null;
  private boolean oneLeft = false;

  @Override public SyncDirective onPush(A elem, Context<A> ctx) {
    lastElem = elem;
    oneLeft = true;
    return ctx.push(elem);
  }

  @Override public SyncDirective onPull(Context<A> ctx) {
    if (!ctx.isFinishing()) {
      // the main pulling logic is below as it is demonstrated on the illustration
      if (oneLeft) {
        oneLeft = false;
        return ctx.push(lastElem);
      } else
        return ctx.pull();
    } else {
      // If we need to emit a final element after the upstream
      // finished
      if (oneLeft) return ctx.pushAndFinish(lastElem);
      else return ctx.finish();
    }
  }

  @Override public TerminationDirective onUpstreamFinish(Context<A> ctx) {
    return ctx.absorbTermination();
  }
}

Finally, to demonstrate all of the stages above, we put them together into a processing chain, which conceptually
would correspond to the following structure:

In code this is only a few lines, using the transform method to inject our custom processing into a stream:
final RunnableGraph<Future<List<Integer>>> runnable =
  Source
    .from(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10))
    .transform(() -> new Filter<Integer>(elem -> elem % 2 == 0))
    .transform(() -> new Duplicator<Integer>())
    .transform(() -> new Map<Integer, Integer>(elem -> elem / 2))
    .toMat(sink, Keep.right());

If we attempt to draw the sequence of events, it shows that there is one event token in circulation in a potential
chain of stages, just like our conceptual railroad tracks representation predicts.
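
The single-token model can be replayed in plain Java without Akka. The interpreter below is a simplified sketch, not the real execution engine: TokenStage and TokenModelSketch are hypothetical names, a handler returns a present Optional to push an element downstream and an empty one to pull from upstream, and completion handling is left out because this synchronous loop only notices the end of the source when a pull reaches it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Optional;
import java.util.function.IntPredicate;
import java.util.function.IntUnaryOperator;

/** Hypothetical stage: each handler either pushes one element down or pulls from upstream. */
interface TokenStage {
    Optional<Integer> onPush(int elem);                             // present = push, empty = pull
    default Optional<Integer> onPull() { return Optional.empty(); } // default: forward the pull
}

final class TokenModelSketch {
    static TokenStage map(IntUnaryOperator f) {
        return elem -> Optional.of(f.applyAsInt(elem));
    }

    static TokenStage filter(IntPredicate p) {
        return elem -> p.test(elem) ? Optional.of(elem) : Optional.<Integer>empty();
    }

    static TokenStage duplicator() {
        return new TokenStage() {
            private int last;
            private boolean oneLeft;
            @Override public Optional<Integer> onPush(int elem) {
                last = elem;
                oneLeft = true;
                return Optional.of(elem);
            }
            @Override public Optional<Integer> onPull() {
                if (!oneLeft) return Optional.empty();
                oneLeft = false;
                return Optional.of(last);
            }
        };
    }

    /** Bounce one token through the chain until the source is exhausted. */
    static List<Integer> run(Iterator<Integer> source, List<TokenStage> stages) {
        List<Integer> sink = new ArrayList<>();
        int pos = stages.size() - 1;                // the sink starts by pulling the last stage
        Optional<Integer> token = Optional.empty(); // empty = pull going up, present = push going down
        while (true) {
            if (token.isPresent()) {
                if (pos == stages.size()) {         // the push reached the sink, which pulls again
                    sink.add(token.get());
                    token = Optional.empty();
                    pos--;
                } else {
                    token = stages.get(pos).onPush(token.get());
                    pos += token.isPresent() ? 1 : -1;
                }
            } else if (pos < 0) {                   // the pull reached the source
                if (!source.hasNext()) return sink;
                token = Optional.of(source.next());
                pos = 0;
            } else {
                token = stages.get(pos).onPull();
                pos += token.isPresent() ? 1 : -1;
            }
        }
    }

    public static void main(String[] args) {
        List<TokenStage> chain = Arrays.asList(filter(e -> e % 2 == 0), duplicator(), map(e -> e / 2));
        System.out.println(run(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10).iterator(), chain));
    }
}
```

For the same filter, duplicator and map chain as above, the model yields 1, 1, 2, 2, 3, 3, 4, 4, 5, 5. At any moment the loop holds exactly one token, either a pull moving up or a push moving down.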

Completion handling

Completion handling usually (but not exclusively) comes into the picture when processing stages need
to emit a few more elements after their upstream source has been completed. We have seen an example of this
in our Duplicator class where the last element needs to be doubled even after the
upstream neighbor stage has been completed. Since the onUpstreamFinish() handler expects a
TerminationDirective as the return type we are only allowed to call ctx.finish(), ctx.fail()
or ctx.absorbTermination(). Since the first two of these available methods will immediately terminate,
our only option is absorbTermination(). It is also clear from the return type of onUpstreamFinish()
that we cannot call ctx.push() but we need to emit elements somehow! The trick is that after calling
absorbTermination() the onPull() handler will be called eventually, and at the same time
ctx.isFinishing() will return true, indicating that ctx.pull() cannot be called anymore. Now we are
free to emit additional elements and call ctx.finish() or ctx.pushAndFinish() eventually to finish
processing.
The reason for this slightly complex termination sequence is that the underlying onComplete signal of Reactive
Streams may arrive without any pending demand, i.e. without respecting backpressure. This means that our
push/pull structure that was illustrated in the figure of our custom processing chain does not apply to termination.
Our neat model that is analogous to a ball that bounces back-and-forth in a pipe (it bounces back on Filter,
Duplicator for example) cannot describe the termination signals. By calling absorbTermination() the
execution environment checks if the conceptual token was above the current stage at that time (which means that
it will never come back, so the environment immediately calls onPull) or it was below (which means that it will
come back eventually, so the environment does not need to call anything yet).
The first of the two scenarios is when a termination signal arrives after a stage passed the event to its downstream.
As we can see in the following diagram, there is no need to do anything by absorbTermination() since the
black arrows representing the movement of the event token are uninterrupted.

In the second scenario the event token is somewhere upstream when the termination signal arrives. In this
case absorbTermination needs to ensure that a new event token is generated replacing the old one that is
forever gone (since the upstream finished). This is done by calling the onPull() event handler of the stage.

Observe that in both scenarios onPull() kicks off the continuation of the processing logic; the only difference
is whether it is the downstream or the absorbTermination() call that calls the event handler.
Using PushStage
Many one-to-one and many-to-one transformations do not need to override the onPull() handler at all since all
they do is just propagate the pull upwards. For such transformations it is better to extend PushStage directly. For
example our Map and Filter would look like this:
public class Map2<A, B> extends PushStage<A, B> {
  private final Function<A, B> f;

  public Map2(Function<A, B> f) {
    this.f = f;
  }

  @Override public SyncDirective onPush(A elem, Context<B> ctx) {
    return ctx.push(f.apply(elem));
  }
}

public class Filter2<A> extends PushStage<A, A> {
  private final Predicate<A> p;

  public Filter2(Predicate<A> p) {
    this.p = p;
  }

  @Override public SyncDirective onPush(A elem, Context<A> ctx) {
    if (p.test(elem)) return ctx.push(elem);
    else return ctx.pull();
  }
}

The reason to use PushStage is not just cosmetic: internal optimizations rely on the fact that the onPull method
only calls ctx.pull() and allow the environment to process elements faster than without this knowledge. By
extending PushStage the environment can be sure that onPull() was not overridden since it is final on
PushStage.

Using StatefulStage
On top of PushPullStage, which is the most elementary and low-level abstraction, and PushStage, which is
a convenience class that also informs the environment about possible optimizations, StatefulStage is a new
tool that builds on PushPullStage directly, adding various convenience methods on top of it. It is possible to
explicitly maintain state-machine like states using its become() method to encapsulate states explicitly. There
is also a handy emit() method that simplifies emitting multiple values given as an iterator. To demonstrate this
feature we reimplemented Duplicator in terms of a StatefulStage:
public class Duplicator2<A> extends StatefulStage<A, A> {
  @Override public StageState<A, A> initial() {
    return new StageState<A, A>() {
      @Override public SyncDirective onPush(A elem, Context<A> ctx) {
        return emit(Arrays.asList(elem, elem).iterator(), ctx);
      }
    };
  }
}

Using DetachedStage
The model described in previous sections, while conceptually simple, cannot describe all desired stages. The main
limitation is the single-ball (single event token) model which prevents independent progress of an upstream
and downstream of a stage. Sometimes it is desirable to detach the progress (and therefore, rate) of the upstream
and downstream of a stage, synchronizing only when needed.
This is achieved in the model by representing a DetachedStage as a boundary between two single-ball
regions. One immediate consequence of this difference is that it is not allowed to call ctx.pull() from
onPull() and it is not allowed to call ctx.push() from onPush() as such combinations would steal
a token from one region (resulting in zero tokens left) and would inject an unexpected second token to the other
region. This is enforced by the expected return types of these callback functions.
One of the important use-cases for DetachedStage is to build buffer-like entities, that allow independent
progress of upstream and downstream stages when the buffer is not full or empty, and slowing down the appropriate
side if the buffer becomes empty or full. The next diagram illustrates the event sequence for a buffer with capacity
of two elements.


The very first difference we can notice is that our Buffer stage automatically pulls its upstream on initialization. Remember that it is forbidden to call ctx.pull() from onPull(), therefore it is the task of the framework
to kick off the first event token in the upstream region, which will remain there until the upstream stages stop.
The diagram distinguishes between the actions of the two regions by colors: purple arrows indicate the actions
involving the upstream event token, while red arrows show the downstream region actions. This demonstrates
the clear separation of these regions, and the invariant that the number of tokens in each of the two regions is kept
unchanged.
For a buffer it is necessary to detach the two regions, but it is also necessary to sometimes hold back the upstream
or downstream. The new API calls that are available for DetachedStages are the various ctx.holdXXX()
methods, ctx.pushAndPull() and variants, and ctx.isHoldingXXX(). Calling ctx.holdXXX()
from onPull() or onPush() suspends the corresponding region and temporarily
takes ownership of its event token. This state can be queried by ctx.isHolding(), which tells whether the
stage is currently holding a token. It is only allowed to suspend one of the regions, not both, since that
would disable all possible future events, resulting in a deadlock. Releasing the held token is only possible by
calling ctx.pushAndPull(). This ensures that both the held token is released and the triggering region
gets its token back (one inbound token + one held token = two released tokens).
The following code example demonstrates the buffer class corresponding to the message sequence chart we discussed.
class Buffer2<T> extends DetachedStage<T, T> {
private static final int SIZE = 2; // buffer capacity
private final List<T> buf = new ArrayList<>(SIZE);
private int capacity = SIZE; // free slots left; primitive int so that == compares values
private boolean isFull() {
return capacity == 0;


}
private boolean isEmpty() {
return capacity == SIZE;
}
private T dequeue() {
capacity += 1;
return buf.remove(0);
}
private void enqueue(T elem) {
capacity -= 1;
buf.add(elem);
}
public DownstreamDirective onPull(DetachedContext<T> ctx) {
if (isEmpty()) {
if (ctx.isFinishing()) return ctx.finish(); // No more elements will arrive
else return ctx.holdDownstream(); // waiting until new elements
} else {
final T next = dequeue();
if (ctx.isHoldingUpstream()) return ctx.pushAndPull(next); // release upstream
else return ctx.push(next);
}
}
public UpstreamDirective onPush(T elem, DetachedContext<T> ctx) {
enqueue(elem);
if (isFull()) return ctx.holdUpstream(); // Queue is now full, wait until new empty slot
else {
if (ctx.isHoldingDownstream()) return ctx.pushAndPull(dequeue()); // Release downstream
else return ctx.pull();
}
}
public TerminationDirective onUpstreamFinish(DetachedContext<T> ctx) {
if (!isEmpty()) return ctx.absorbTermination(); // still need to flush from buffer
else return ctx.finish(); // already empty, finishing
}
}
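The hold and release decisions of the buffer can be traced without running a stream. The following is a plain-Java model, not Akka API: BufferTrace and the strings it returns are hypothetical and only name the directive that the Buffer2 stage above would return for the same event.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Plain-Java model of the Buffer2 decisions. It returns the *name* of the
// directive Buffer2 would return; it does not drive a real stream.
final class BufferTrace<T> {
    private final int size;
    private final Deque<T> buf = new ArrayDeque<>();
    private boolean holdingUpstream = false;
    private boolean holdingDownstream = false;

    BufferTrace(int size) { this.size = size; }

    // Models onPush: an element arrives from upstream.
    String onPush(T elem) {
        buf.addLast(elem);
        if (buf.size() == size) {      // full: keep the upstream token
            holdingUpstream = true;
            return "holdUpstream";
        }
        if (holdingDownstream) {       // downstream was waiting: release both tokens
            holdingDownstream = false;
            return "pushAndPull(" + buf.removeFirst() + ")";
        }
        return "pull";
    }

    // Models onPull: downstream asks for an element.
    String onPull() {
        if (buf.isEmpty()) {           // nothing to hand out: keep the downstream token
            holdingDownstream = true;
            return "holdDownstream";
        }
        T next = buf.removeFirst();
        if (holdingUpstream) {         // a slot is free again: release both tokens
            holdingUpstream = false;
            return "pushAndPull(" + next + ")";
        }
        return "push(" + next + ")";
    }
}
```

With a capacity of two, the event sequence onPull, onPush(a), onPush(b), onPush(c), onPull, onPull yields holdDownstream, pushAndPull(a), pull, holdUpstream, pushAndPull(b), push(c). At no point are both regions held at the same time.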

1.8.2 Custom graph processing junctions


To extend the available fan-in and fan-out structures (graph stages) Akka Streams includes FlexiMerge and
FlexiRoute, which provide an intuitive DSL for describing which upstream or downstream stream
elements should be pulled from or emitted to.
Using FlexiMerge
FlexiMerge can be used to describe a fan-in element which contains some logic about which upstream stage
the merge should consume elements from. It is recommended to create your custom fan-in stage as a separate class,
name it after the behavior it exposes, and reuse it the same way as you would use the built-in
fan-in stages.
The first flexi merge example we are going to implement is a so-called preferring merge, in which one of the
input ports is preferred, i.e. if the merge could pull from the preferred or another secondary input port, it will
pull from the preferred port, only pulling from the secondary ports once the preferred one does not have elements
available.


Implementing a custom merge stage is done by extending the FlexiMerge class, exposing its input ports and
finally defining the logic which will decide how this merge should behave. First we need to create the ports which
are used to wire up the fan-in element in a FlowGraph. These input ports must be properly typed and their names
should indicate what kind of port each one is.
public class PreferringMerge
extends FlexiMerge<Integer, Integer, FanInShape3<Integer, Integer, Integer, Integer>> {
public PreferringMerge() {
super(
new FanInShape3<Integer, Integer, Integer, Integer>("PreferringMerge"),
Attributes.name("PreferringMerge")
);
}
@Override
public MergeLogic<Integer, Integer> createMergeLogic(
FanInShape3<Integer, Integer, Integer, Integer> s) {
return new MergeLogic<Integer, Integer>() {
@Override
public State<Integer, Integer> initialState() {
return new State<Integer, Integer>(readPreferred(s.in0(), s.in1(), s.in2())) {
@Override
public State<Integer, Integer> onInput(MergeLogicContext<Integer> ctx,
InPort inputHandle, Integer element) {
ctx.emit(element);
return sameState();
}
};
}
};
}
}

Next we implement the createMergeLogic method, which serves as the factory of the merge's MergeLogic.
A new MergeLogic object will be created for each materialized stream, so it is allowed to be stateful.
The MergeLogic defines the behaviour of our merge stage (for example, it may buffer some
elements internally).
Warning: While a MergeLogic instance may be stateful, the FlexiMerge instance must not hold any
mutable state, since it may be shared across several materialized FlowGraph instances.
Next we implement the initialState method, which returns the behaviour of the merge stage. A
MergeLogic#State defines the behaviour of the merge by signalling which input ports it is interested in consuming, and how to handle the element once it has been pulled from its upstream. Signalling which input port
we are interested in pulling data from is done by using an appropriate read condition (in the Java examples of this section these
appear as read, readAny, readPreferred and readAll). Available read conditions include:
Read(input) - reads from only the given input,
ReadAny(inputs) - reads from any of the given inputs,
ReadPreferred(preferred)(secondaries) - reads from the preferred input if elements are available, otherwise from one of the secondaries,
ReadAll(inputs) - reads from all given inputs (like Zip), and offers a ReadAllInputs as the
element passed into the state function, which allows obtaining the pulled element values in a type-safe
way.
In our case we use the ReadPreferred read condition, which has exactly the semantics we need to implement our preferring merge: it pulls elements from the preferred input port if there are any available, otherwise
reverting to pulling from the secondary inputs. The context object passed into the state function allows us to
interact with the connected streams, for example by emitting an element which was just pulled from the given
input, or signalling completion or failure to the merge's downstream stage.
The state function must always return the next behaviour to be used when an element should be pulled from its
upstreams. We use the special SameState object, which signals to FlexiMerge that no state transition is needed.
Note: In response to an input element it is allowed to emit at most one output element.
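The selection rule of a preferring read can be illustrated with ordinary queues standing in for the input ports. This is a minimal sketch in plain Java, not Akka API: PreferredPick is a hypothetical helper, and choosing the first non-empty secondary in list order is an assumption made for the illustration, since the text above only says "one of the secondaries".

```java
import java.util.List;
import java.util.Optional;
import java.util.Queue;

// Plain-Java model (not Akka API) of "read preferred" selection.
final class PreferredPick {
    // Takes the next element from the preferred input if it has one,
    // otherwise from the first secondary that has one (assumed order),
    // otherwise returns empty.
    static <T> Optional<T> next(Queue<T> preferred, List<Queue<T>> secondaries) {
        if (!preferred.isEmpty()) {
            return Optional.of(preferred.remove());
        }
        for (Queue<T> secondary : secondaries) {
            if (!secondary.isEmpty()) {
                return Optional.of(secondary.remove());
            }
        }
        return Optional.empty();
    }
}
```

With a preferred queue holding 1, 2 and two secondaries holding 10 and 20, repeated calls return 1, 2, 10, 20 and then an empty Optional: secondaries are only consulted once the preferred input has run dry.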

Implementing Zip-like merges

More complex fan-in junctions may require not only multiple States but also sharing state between those states.
As MergeLogic is allowed to be stateful, it can easily be used to hold the state of the merge junction.
We now implement the equivalent of the built-in Zip junction by using the property that the MergeLogic can be
stateful and that each read is followed by a state transition (much like in Akka FSM or Actor#become).
public class Zip2<A, B> extends FlexiMerge<A, Pair<A, B>, FanInShape2<A, B, Pair<A, B>>> {
public Zip2() {
super(new FanInShape2<A, B, Pair<A, B>>("Zip2"), Attributes.name("Zip2"));
}
@Override
public MergeLogic<A, Pair<A, B>> createMergeLogic(final FanInShape2<A, B, Pair<A, B>> s) {
return new MergeLogic<A, Pair<A, B>>() {
private A lastInA = null;
private final State<A, Pair<A, B>> readA = new State<A, Pair<A, B>>(read(s.in0())) {
@Override
public State<B, Pair<A, B>> onInput(
MergeLogicContext<Pair<A, B>> ctx, InPort inputHandle, A element) {
lastInA = element;
return readB;
}
};
private final State<B, Pair<A, B>> readB = new State<B, Pair<A, B>>(read(s.in1())) {
@Override
public State<A, Pair<A, B>> onInput(
MergeLogicContext<Pair<A, B>> ctx, InPort inputHandle, B element) {
ctx.emit(new Pair<A, B>(lastInA, element));
return readA;
}
};
@Override
public State<A, Pair<A, B>> initialState() {
return readA;
}
@Override
public CompletionHandling<Pair<A, B>> initialCompletionHandling() {
return eagerClose();
}
};
}
}

The above style of implementing complex flexi merges is useful when we need fine-grained control over consuming
from certain input ports. Sometimes however it is simpler to strictly consume all of a given set of inputs. In the
Zip rewrite below we use the ReadAll read condition, which behaves slightly differently from the other read
conditions, as the element it hands to the state function is of the type ReadAllInputs instead of being the
pulled element itself:
public class Zip<A, B>
extends FlexiMerge<FlexiMerge.ReadAllInputs, Pair<A, B>, FanInShape2<A, B, Pair<A, B>>> {
public Zip() {
super(new FanInShape2<A, B, Pair<A, B>>("Zip"), Attributes.name("Zip"));
}
@Override
public MergeLogic<ReadAllInputs, Pair<A, B>> createMergeLogic(
final FanInShape2<A, B, Pair<A, B>> s) {
return new MergeLogic<ReadAllInputs, Pair<A, B>>() {
@Override
public State<ReadAllInputs, Pair<A, B>> initialState() {
return new State<ReadAllInputs, Pair<A, B>>(readAll(s.in0(), s.in1())) {
@Override
public State<ReadAllInputs, Pair<A, B>> onInput(
MergeLogicContext<Pair<A, B>> ctx,
InPort input,
ReadAllInputs inputs) {
final A a = inputs.get(s.in0());
final B b = inputs.get(s.in1());
ctx.emit(new Pair<A, B>(a, b));
return this;
}
};
}
@Override
public CompletionHandling<Pair<A, B>> initialCompletionHandling() {
return eagerClose();
}
};
}
}

Thanks to being handed a ReadAllInputs instance instead of the elements directly it is possible to pick elements in a type-safe way based on their input port.
Connecting your custom junction is as simple as creating an instance and connecting Sources and Sinks to its ports
(notice that the merged output port is named out):
final Sink<Pair<Integer, String>, Future<Pair<Integer, String>>> head =
Sink.<Pair<Integer, String>>head();
final Future<Pair<Integer, String>> future = FlowGraph.factory().closed(head,
(builder, headSink) -> {
final FanInShape2<Integer, String, Pair<Integer, String>> zip =
builder.graph(new Zip<Integer, String>());
builder.from(Source.single(1)).to(zip.in0());
builder.from(Source.single("A")).to(zip.in1());
builder.from(zip.out()).to(headSink);
}).run(mat);


Completion handling

Completion handling in FlexiMerge is defined by a CompletionHandling object which can react to
completion and failure signals from its upstream input ports. The default strategy is to remain running while at least one of the upstream input ports that are declared to be consumed in the current state is still running (i.e. has not
signalled completion or failure).
Customising completion can be done by overriding the MergeLogic#initialCompletionHandling
method, or from within a State by calling ctx.changeCompletionHandling(handling). Other than
the default completion handling (as late as possible) FlexiMerge also provides an eagerClose completion
handling which completes (or fails) its downstream as soon as at least one of its upstream inputs completes (or
fails).
In the example below we implement an ImportantWithBackups fan-in stage which can only keep operating while the important input and at least one of the replica inputs are active. Therefore in our custom
completion strategy we have to investigate which input has completed or failed and act accordingly. If the important input completed or failed we propagate this downstream, completing the stream. If on the other hand the first
replicated input fails, we log the exception and, instead of failing the downstream, swallow this exception (as one
failed replica is still acceptable). Then we change the completion strategy to eagerClose, which will propagate
any future completion or failure event right to this stage's downstream, effectively shutting down the stream.
public class ImportantWithBackups<T> extends FlexiMerge<T, T, FanInShape3<T, T, T, T>> {
public ImportantWithBackups() {
super(
new FanInShape3<T, T, T, T>("ImportantWithBackup"),
Attributes.name("ImportantWithBackup")
);
}

@Override
public MergeLogic<T, T> createMergeLogic(final FanInShape3<T, T, T, T> s) {
return new MergeLogic<T, T>() {
@Override
public CompletionHandling<T> initialCompletionHandling() {
return new CompletionHandling<T>() {
@Override
public State<T, T> onUpstreamFinish(MergeLogicContextBase<T> ctx,
InPort input) {
if (input == s.in0()) {
System.out.println("Important input completed, shutting down.");
ctx.finish();
return sameState();
} else {
System.out.printf("Replica %s completed, " +
"no more replicas available, " +
"applying eagerClose completion handling.\n", input);
ctx.changeCompletionHandling(eagerClose());
return sameState();
}
}
@Override
public State<T, T> onUpstreamFailure(MergeLogicContextBase<T> ctx,
InPort input, Throwable cause) {
if (input == s.in0()) {
ctx.fail(cause);
return sameState();
} else {
System.out.printf("Replica %s failed, " +


"no more replicas available, " +


"applying eagerClose completion handling.\n", input);
ctx.changeCompletionHandling(eagerClose());
return sameState();
}
}
};
}
@Override
public State<T, T> initialState() {
return new State<T, T>(readAny(s.in0(), s.in1(), s.in2())) {
@Override
public State<T, T> onInput(MergeLogicContext<T> ctx,
InPort input, T element) {
ctx.emit(element);
return sameState();
}
};
}
};
}
}

In case you want to change back to the default completion handling, it is available as
MergeLogic#defaultCompletionHandling.

It is not possible to emit elements from the completion handling, since completion handlers may be invoked at any
time (without regard to downstream demand being available).
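The difference between the default ("as late as possible") and the eagerClose completion handling can be summarised with a small plain-Java model. It is only a sketch and not Akka API: CompletionModel, its Strategy enum and the string port names are hypothetical, and failure handling is left out for brevity.

```java
import java.util.HashSet;
import java.util.Set;

// Plain-Java model (not Akka API) of when a merge completes its downstream.
final class CompletionModel {
    enum Strategy { DEFAULT, EAGER_CLOSE }

    private final Set<String> running;   // inputs that have not finished yet
    private Strategy strategy;

    CompletionModel(Set<String> inputs, Strategy initial) {
        this.running = new HashSet<>(inputs);
        this.strategy = initial;
    }

    // Models ctx.changeCompletionHandling(handling).
    void changeCompletionHandling(Strategy next) {
        strategy = next;
    }

    // Returns true if the downstream should be completed now that the
    // given input has finished.
    boolean onUpstreamFinish(String input) {
        running.remove(input);
        // EAGER_CLOSE: any finished input completes the merge.
        // DEFAULT: keep going until no input is left running.
        return strategy == Strategy.EAGER_CLOSE || running.isEmpty();
    }
}
```

Starting with three inputs under DEFAULT, the first two onUpstreamFinish calls return false and only the third returns true. Switching to EAGER_CLOSE after the first finished input makes the very next completion return true, which mirrors what ImportantWithBackups does once a replica is gone.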
Using FlexiRoute
Similarly to using FlexiMerge, implementing custom fan-out stages requires extending the FlexiRoute
class and providing a RouteLogic object which determines how the route should behave.
The first flexi route stage that we are going to implement is Unzip, which consumes a stream of pairs and splits
it into two streams of the first and second elements of each pair.
A FlexiRoute has exactly one input port (in our example, type parameterized as Pair<A,B>), and may have
multiple output ports, all of which must be created beforehand (they cannot be added dynamically). First we need
to create the ports which are used to wire up the fan-out element in a FlowGraph.
public class Unzip<A, B> extends FlexiRoute<Pair<A, B>, FanOutShape2<Pair<A, B>, A, B>> {
public Unzip() {
super(new FanOutShape2<Pair<A, B>, A, B>("Unzip"), Attributes.name("Unzip"));
}
@Override
public RouteLogic<Pair<A, B>> createRouteLogic(final FanOutShape2<Pair<A, B>, A, B> s) {
return new RouteLogic<Pair<A, B>>() {
@Override
public State<BoxedUnit, Pair<A, B>> initialState() {
return new State<BoxedUnit, Pair<A, B>>(demandFromAll(s.out0(), s.out1())) {
@Override
public State<BoxedUnit, Pair<A, B>> onInput(
RouteLogicContext<Pair<A, B>> ctx, BoxedUnit x, Pair<A, B> element) {
ctx.emit(s.out0(), element.first());
ctx.emit(s.out1(), element.second());
return sameState();
}


};
}
@Override
public CompletionHandling<Pair<A, B>> initialCompletionHandling() {
return eagerClose();
}
};
}
}

Next we implement RouteLogic#initialState by providing a State that uses the DemandFromAll demand condition to signal to flexi route that elements can only be emitted from this stage when demand is available
from all given downstream output ports. The available demand conditions are:
DemandFrom(output) - triggers when the given output port has pending demand,
DemandFromAny(outputs) - triggers when any of the given output ports has pending demand,
DemandFromAll(outputs) - triggers when all of the given output ports have pending demand.
Since the Unzip junction we're implementing signals both downstream stages at the same time, we use
DemandFromAll, unpack the incoming pair in the state function and signal its first element to the left stream,
and the second element of the pair to the right stream. Notice that since we are emitting values of different types
(A and B), the output type parameter of this State must be set to Any. This type can be utilised more efficiently
when a junction is emitting the same type of element to its downstreams, e.g. in all strictly routing stages.
The state function must always return the next behaviour to be used when an element should be emitted. We use
the special SameState object, which signals to FlexiRoute that no state transition is needed.
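The three demand conditions differ only in how they combine the pending demand of the individual outputs. The following plain-Java sketch expresses them as predicates; it is not Akka API. DemandConditions, the string port names and the map of pending demand are hypothetical stand-ins for the bookkeeping done by the framework.

```java
import java.util.Collection;
import java.util.Map;

// Plain-Java model (not Akka API) of the demand conditions as predicates.
// "pending" maps an output name to the number of elements it has
// requested but not yet received.
final class DemandConditions {
    // True when the given output has pending demand.
    static boolean demandFrom(Map<String, Integer> pending, String output) {
        return pending.getOrDefault(output, 0) > 0;
    }

    // True when at least one of the outputs has pending demand.
    static boolean demandFromAny(Map<String, Integer> pending, Collection<String> outputs) {
        return outputs.stream().anyMatch(o -> demandFrom(pending, o));
    }

    // True only when every one of the outputs has pending demand.
    static boolean demandFromAll(Map<String, Integer> pending, Collection<String> outputs) {
        return outputs.stream().allMatch(o -> demandFrom(pending, o));
    }
}
```

For an Unzip-like stage with pending demand of 1 on out0 and 0 on out1, demandFromAny holds but demandFromAll does not, so no pair would be consumed until out1 also signals demand.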
Warning: While a RouteLogic instance may be stateful, the FlexiRoute instance must not hold any
mutable state, since it may be shared across several materialized FlowGraph instances.

Note: It is only allowed to emit at most one element to each output in response to onInput; otherwise an IllegalStateException
is thrown.

Completion handling

Completion handling in FlexiRoute is handled similarly to FlexiMerge (which is explained in depth in
Completion handling), however in addition to reacting to its upstream's completion or failure it can also react to its
downstream stages cancelling their subscriptions. The default completion handling for FlexiRoute (defined
in RouteLogic#defaultCompletionHandling) is to continue running until all of its downstreams have
cancelled their subscriptions, or the upstream has completed / failed.
In order to customise completion handling we can override the RouteLogic#initialCompletionHandling
method, or call ctx.changeCompletionHandling(handling) from within a State. Other than the default
completion handling (as late as possible) FlexiRoute also provides an eagerClose completion handling
which completes all its downstream streams as well as cancels its upstream as soon as any of its downstream
stages cancels its subscription.
In the example below we implement a custom completion handler which completes the entire stream eagerly
if the important downstream cancels, otherwise (if any other downstream cancels their subscription) the
ImportantRoute keeps running.
public class ImportantRoute<T> extends FlexiRoute<T, FanOutShape3<T, T, T, T>> {
public ImportantRoute() {
super(new FanOutShape3<T, T, T, T>("ImportantRoute"), Attributes.name("ImportantRoute"));
}


@Override
public RouteLogic<T> createRouteLogic(FanOutShape3<T, T, T, T> s) {
return new RouteLogic<T>() {
@Override
public CompletionHandling<T> initialCompletionHandling() {
return new CompletionHandling<T>() {
@Override
public State<T, T> onDownstreamFinish(RouteLogicContextBase<T> ctx,
OutPort output) {
if (output == s.out0()) {
// finish all downstreams, and cancel the upstream
ctx.finish();
return sameState();
} else {
return sameState();
}
}
@Override
public void onUpstreamFinish(RouteLogicContextBase<T> ctx) {
}
@Override
public void onUpstreamFailure(RouteLogicContextBase<T> ctx, Throwable t) {
}
};
}
@Override
public State<OutPort, T> initialState() {
return new State<OutPort, T>(demandFromAny(s.out0(), s.out1(), s.out2())) {
@SuppressWarnings("unchecked")
@Override
public State<T, T> onInput(
RouteLogicContext<T> ctx, OutPort preferred, T element) {
ctx.emit((Outlet<T>) preferred, element);
return sameState();
}
};
}
};
}
}

Notice that State changes are only allowed in reaction to downstream cancellations, and not in the upstream
completion/failure cases. This is because there is only one upstream, so there is nothing else to do than possibly
flush buffered elements and continue with shutting down the entire stream.
It is not possible to emit elements from the completion handling, since completion handlers may be invoked at any
time (without regard to downstream demand being available).

1.8.3 Thread safety of custom processing stages


All of the above custom stages (linear or graph) provide a few simple guarantees that implementors can rely on.
The callbacks exposed by all of these classes are never called concurrently.
The state encapsulated by these classes can be safely modified from the provided callbacks, without
any further synchronization.

In essence, the above guarantees are similar to what Actors provide, if one thinks of the state of a custom stage
as the state of an actor, and the callbacks as the receive block of the actor.
Warning: It is not safe to access the state of any custom stage outside of the callbacks that it provides, just
like it is unsafe to access the state of an actor from the outside. This means that Future callbacks should not
close over internal state of custom stages because such access can be concurrent with the provided callbacks,
leading to undefined behavior.

1.9 Integration
1.9.1 Integrating with Actors
For piping the elements of a stream as messages to an ordinary actor you can use Sink.actorRef. Messages
can be sent to a stream via the ActorRef that is materialized by Source.actorRef.
For more advanced use cases the ActorPublisher and ActorSubscriber traits are provided to support
implementing Reactive Streams Publisher and Subscriber with an Actor.
These can be consumed by other Reactive Streams libraries or used as an Akka Streams Source or Sink.
Warning: AbstractActorPublisher and AbstractActorSubscriber cannot be used with remote actors, because if signals of the Reactive Streams protocol (e.g. request) are lost the stream may
deadlock.
Note: These Actors are designed to be implemented using Java 8 lambda expressions. In case you need to stay
on a JVM prior to 8, Akka provides UntypedActorPublisher and UntypedActorSubscriber which
can be used easily from any language level.

Source.actorRef
Messages sent to the actor that is materialized by Source.actorRef will be emitted to the stream if there is
demand from downstream, otherwise they will be buffered until request for demand is received.
Depending on the defined OverflowStrategy it might drop elements if there is no space available in the
buffer. The strategy OverflowStrategy.backpressure() is not supported for this Source type, you
should consider using ActorPublisher if you want a backpressured actor interface.
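What "dropping elements when there is no space in the buffer" means depends on the chosen strategy. The sketch below models two of them in plain Java and is not Akka API: OverflowModel is hypothetical, and it assumes that OverflowStrategy.dropHead() discards the oldest buffered element while OverflowStrategy.dropTail() discards the youngest one, in both cases to make room for the newly arrived element.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Plain-Java model (not Akka API) of a bounded buffer with two overflow rules.
final class OverflowModel<T> {
    enum Strategy { DROP_HEAD, DROP_TAIL }

    private final Deque<T> buf = new ArrayDeque<>();
    private final int capacity;
    private final Strategy strategy;

    OverflowModel(int capacity, Strategy strategy) {
        if (capacity < 1) throw new IllegalArgumentException("capacity must be >= 1");
        this.capacity = capacity;
        this.strategy = strategy;
    }

    // A message arrives while downstream has no demand.
    void offer(T elem) {
        if (buf.size() == capacity) {
            if (strategy == Strategy.DROP_HEAD) {
                buf.removeFirst();   // assumed dropHead: discard the oldest element
            } else {
                buf.removeLast();    // assumed dropTail: discard the youngest element
            }
        }
        buf.addLast(elem);
    }

    // Snapshot of the buffered elements, oldest first.
    List<T> contents() {
        return new ArrayList<>(buf);
    }
}
```

Offering 1, 2, 3 to a buffer of capacity two leaves [2, 3] under DROP_HEAD and [1, 3] under DROP_TAIL. Neither variant slows the sender down, which is why a backpressured interface needs a different mechanism.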
The stream can be completed successfully by sending akka.actor.PoisonPill or
akka.actor.Status.Success to the actor reference.

The stream can be completed with failure by sending akka.actor.Status.Failure to the actor reference.
The actor will be stopped when the stream is completed, failed or cancelled from downstream, i.e. you can watch
it to get notified when that happens.
Sink.actorRef
The sink sends the elements of the stream to the given ActorRef. If the target actor terminates the stream will
be cancelled. When the stream is completed successfully the given onCompleteMessage will be sent to the
destination actor. When the stream is completed with failure an akka.actor.Status.Failure message will
be sent to the destination actor.


Warning: There is no back-pressure signal from the destination actor, i.e. if the actor is not consuming the
messages fast enough the mailbox of the actor will grow. For potentially slow consumer actors it is recommended to use a bounded mailbox with zero mailbox-push-timeout-time or use a rate limiting stage in front of
this stage.

ActorPublisher
Extend akka.stream.actor.AbstractActorPublisher to implement a stream publisher that keeps
track of the subscription life cycle and requested elements.
Here is an example of such an actor. It dispatches incoming jobs to the attached subscriber:
public static class JobManagerProtocol {
final public static class Job {
public final String payload;
public Job(String payload) {
this.payload = payload;
}
}
public static class JobAcceptedMessage {
@Override
public String toString() {
return "JobAccepted";
}
}
public static final JobAcceptedMessage JobAccepted = new JobAcceptedMessage();
public static class JobDeniedMessage {
@Override
public String toString() {
return "JobDenied";
}
}
public static final JobDeniedMessage JobDenied = new JobDeniedMessage();
}
public static class JobManager extends AbstractActorPublisher<JobManagerProtocol.Job> {
public static Props props() { return Props.create(JobManager.class); }
private final int MAX_BUFFER_SIZE = 100;
private final List<JobManagerProtocol.Job> buf = new ArrayList<>();
public JobManager() {
receive(ReceiveBuilder.
match(JobManagerProtocol.Job.class, job -> buf.size() == MAX_BUFFER_SIZE, job -> {
sender().tell(JobManagerProtocol.JobDenied, self());
}).
match(JobManagerProtocol.Job.class, job -> {
sender().tell(JobManagerProtocol.JobAccepted, self());
if (buf.isEmpty() && totalDemand() > 0)
onNext(job);
else {
buf.add(job);
deliverBuf();
}
}).
match(ActorPublisherMessage.Request.class, request -> deliverBuf()).


match(ActorPublisherMessage.Cancel.class, cancel -> context().stop(self())).


build());
}
void deliverBuf() {
while (totalDemand() > 0 && !buf.isEmpty()) { // stop once the buffer is drained
/*
* totalDemand is a long and could be larger than
* what buf.subList can accept
*/
if (totalDemand() <= Integer.MAX_VALUE) {
final List<JobManagerProtocol.Job> took =
buf.subList(0, Math.min(buf.size(), (int) totalDemand()));
took.forEach(this::onNext);
buf.removeAll(took);
break;
} else {
final List<JobManagerProtocol.Job> took =
buf.subList(0, Math.min(buf.size(), Integer.MAX_VALUE));
took.forEach(this::onNext);
buf.removeAll(took);
}
}
}
}

You send elements to the stream by calling onNext. You are allowed to send as many elements as
have been requested by the stream subscriber. This amount can be inquired with totalDemand. It
is only allowed to use onNext when isActive and totalDemand>0, otherwise onNext will throw
IllegalStateException.
When the stream subscriber requests more elements the ActorPublisherMessage.Request message is
delivered to this actor, and you can act on that event. The totalDemand is updated automatically.
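The demand bookkeeping described above can be pictured with a small counter. This is a plain-Java model and not Akka API: DemandModel is hypothetical, and it only covers the rule that each onNext consumes one unit of previously requested demand and that onNext without demand fails with an IllegalStateException.

```java
// Plain-Java model (not Akka API) of the publisher's demand accounting.
final class DemandModel {
    private long totalDemand = 0;

    // Models the arrival of a Request message for n more elements.
    void request(long n) {
        if (n <= 0) throw new IllegalArgumentException("n must be positive");
        totalDemand += n;
    }

    // Outstanding demand that may still be satisfied with onNext.
    long totalDemand() {
        return totalDemand;
    }

    // Models onNext: only legal while there is outstanding demand.
    void onNext(Object elem) {
        if (totalDemand == 0) {
            throw new IllegalStateException("onNext called without demand");
        }
        totalDemand -= 1;
    }
}
```

After request(2) exactly two onNext calls succeed and totalDemand() drops back to zero; a third call throws. This is why the JobManager example checks totalDemand() before calling onNext and buffers jobs otherwise.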
When the stream subscriber cancels the subscription the ActorPublisherMessage.Cancel message is
delivered to this actor. After that subsequent calls to onNext will be ignored.
You can complete the stream by calling onComplete. After that you are not allowed to call onNext, onError
or onComplete.
You can terminate the stream with failure by calling onError. After that you are not allowed to call onNext,
onError or onComplete.
If you suspect that this AbstractActorPublisher may never get subscribed to, you can
override the subscriptionTimeout method to provide a timeout after which this Publisher
should be considered canceled.
The actor will be notified when the timeout triggers via an
ActorPublisherMessage.SubscriptionTimeoutExceeded message and MUST then perform
cleanup and stop itself.
If the actor is stopped the stream will be completed, unless it was already terminated with failure, completed
or canceled.
More detailed information can be found in the API documentation.
This is how it can be used as input Source to a Flow:
final Source<JobManagerProtocol.Job, ActorRef> jobManagerSource =
Source.actorPublisher(JobManager.props());
final ActorRef ref = jobManagerSource
.map(job -> job.payload.toUpperCase())
.map(elem -> {
System.out.println(elem);
return elem;
})


.to(Sink.ignore())
.run(mat);
ref.tell(new JobManagerProtocol.Job("a"), ActorRef.noSender());
ref.tell(new JobManagerProtocol.Job("b"), ActorRef.noSender());
ref.tell(new JobManagerProtocol.Job("c"), ActorRef.noSender());

You can only attach one subscriber to this publisher. Use a Broadcast element or attach a
Sink.fanoutPublisher to enable multiple subscribers.

ActorSubscriber
Extend akka.stream.actor.AbstractActorSubscriber to make your class a stream subscriber
with full control of stream back pressure.
It will receive ActorSubscriberMessage.OnNext,
ActorSubscriberMessage.OnComplete and ActorSubscriberMessage.OnError messages
from the stream. It can also receive other, non-stream messages, in the same way as any actor.
Here is an example of such an actor. It dispatches incoming jobs to child worker actors:
public static class WorkerPoolProtocol {
public static class Msg {
public final int id;
public final ActorRef replyTo;
public Msg(int id, ActorRef replyTo) {
this.id = id;
this.replyTo = replyTo;
}
@Override
public String toString() {
return String.format("Msg(%s, %s)", id, replyTo);
}
}
public static Msg msg(int id, ActorRef replyTo) {
return new Msg(id, replyTo);
}

public static class Work {


public final int id;
public Work(int id) { this.id = id; }
@Override
public String toString() {
return String.format("Work(%s)", id);
}
}
public static Work work(int id) {
return new Work(id);
}

public static class Reply {


public final int id;
public Reply(int id) { this.id = id; }
@Override
public String toString() {
return String.format("Reply(%s)", id);
}


}
public static Reply reply(int id) {
return new Reply(id);
}

public static class Done {


public final int id;
public Done(int id) { this.id = id; }
@Override
public String toString() {
return String.format("Done(%s)", id);
}
@Override
public boolean equals(Object o) {
if (this == o) {
return true;
}
if (o == null || getClass() != o.getClass()) {
return false;
}
Done done = (Done) o;
if (id != done.id) {
return false;
}
return true;
}
@Override
public int hashCode() {
return id;
}
}
public static Done done(int id) {
return new Done(id);
}
}
public static class WorkerPool extends AbstractActorSubscriber {
public static Props props() { return Props.create(WorkerPool.class); }
final int MAX_QUEUE_SIZE = 10;
final Map<Integer, ActorRef> queue = new HashMap<>();
final Router router;
@Override
public RequestStrategy requestStrategy() {
return new MaxInFlightRequestStrategy(MAX_QUEUE_SIZE) {
@Override
public int inFlightInternally() {
return queue.size();
}
};
}


public WorkerPool() {
final List<Routee> routees = new ArrayList<>();
for (int i = 0; i < 3; i++)
routees.add(new ActorRefRoutee(context().actorOf(Props.create(Worker.class))));
router = new Router(new RoundRobinRoutingLogic(), routees);

receive(ReceiveBuilder.
match(ActorSubscriberMessage.OnNext.class, on -> on.element() instanceof WorkerPoolProtocol.Ms
onNext -> {
WorkerPoolProtocol.Msg msg = (WorkerPoolProtocol.Msg) onNext.element();
queue.put(msg.id, msg.replyTo);
if (queue.size() > MAX_QUEUE_SIZE)
throw new RuntimeException("queued too many: " + queue.size());
router.route(WorkerPoolProtocol.work(msg.id), self());
}).
match(WorkerPoolProtocol.Reply.class, reply -> {
int id = reply.id;
queue.get(id).tell(WorkerPoolProtocol.done(id), self());
queue.remove(id);
}).
build());
}
}
static class Worker extends AbstractActor {
public Worker() {
receive(ReceiveBuilder.
match(WorkerPoolProtocol.Work.class, work -> {
// ...
sender().tell(WorkerPoolProtocol.reply(work.id), self());
}).build());
}
}

A subclass must define the RequestStrategy to control stream back pressure. After each incoming message the
AbstractActorSubscriber will automatically invoke RequestStrategy.requestDemand and
propagate the returned demand to the stream.
The provided WatermarkRequestStrategy is a good strategy if the actor performs work itself.
The provided MaxInFlightRequestStrategy is useful if messages are queued internally or delegated to other actors.
You can also implement a custom RequestStrategy or call request manually together with
ZeroRequestStrategy or some other strategy. In that case you must also call request when the
actor is started or when it is ready, otherwise it will not receive any elements.
More detailed information can be found in the API documentation.
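The idea behind a "max in flight" strategy can be sketched without Akka as a small calculation. This is a simplified model and not the library's implementation: the method name requestDemand is borrowed from the text above, while the parameter names and the class DemandSketch are assumptions made for illustration. The real strategy may additionally batch its requests, which this sketch does not do.

```java
final class DemandSketch {
    /**
     * Simplified model of a "max in flight" request strategy (an assumption,
     * not Akka's code): ask upstream for just enough elements to fill the
     * capacity that is not yet used by queued or already requested elements.
     */
    static int requestDemand(int max, int inFlight, int remainingRequested) {
        return Math.max(0, max - inFlight - remainingRequested);
    }

    public static void main(String[] args) {
        System.out.println(requestDemand(10, 0, 0));  // empty queue: full capacity
        System.out.println(requestDemand(10, 7, 2));  // one free slot left
        System.out.println(requestDemand(10, 10, 0)); // queue full: no new demand
    }
}
```

With a full queue the demand drops to zero, which is what stops the upstream from sending more elements until a worker replies.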
This is how it can be used as an output Sink to a Flow:
final int N = 117;
final List<Integer> data = new ArrayList<>(N);
for (int i = 0; i < N; i++) {
data.add(i);
}
Source.from(data)
.map(i -> WorkerPoolProtocol.msg(i, replyTo))
.runWith(Sink.<WorkerPoolProtocol.Msg>actorSubscriber(WorkerPool.props()), mat);


1.9.2 Integrating with External Services


Stream transformations and side effects involving external non-stream based services can be performed with
mapAsync or mapAsyncUnordered.
For example, sending emails to the authors of selected tweets using an external email service:
public Future<Email> send(Email email) {
// ...
}

We start with the tweet stream of authors:


final Source<Author, BoxedUnit> authors = tweets
.filter(t -> t.hashtags().contains(AKKA))
.map(t -> t.author);

Assume that we can look up their email address using:


public Future<Optional<String>> lookupEmail(String handle)

Transforming the stream of authors to a stream of email addresses by using the lookupEmail service can be
done with mapAsync:
final Source<String, BoxedUnit> emailAddresses = authors
.mapAsync(4, author -> addressSystem.lookupEmail(author.handle))
.filter(o -> o.isPresent())
.map(o -> o.get());

Finally, sending the emails:


final RunnableGraph<BoxedUnit> sendEmails = emailAddresses
.mapAsync(4, address ->
emailServer.send(new Email(address, "Akka", "I like your tweet")))
.to(Sink.ignore());
sendEmails.run(mat);

mapAsync applies the given function, which calls out to the external service, to each element as it
passes through this processing step. The function returns a Future, and the value of that future is emitted
downstream. The number of Futures that may run in parallel is given as the first argument to mapAsync. These
Futures may complete in any order, but the elements are emitted downstream in the same order as they were received
from upstream.
That means that back-pressure works as expected. For example if the emailServer.send is the bottleneck it
will limit the rate at which incoming tweets are retrieved and email addresses looked up.
The final piece of this pipeline is to generate the demand that pulls the tweet author information through the
emailing pipeline: we attach a Sink.ignore which makes it all run. If our email process returned some
interesting data for further transformation, we would of course not ignore it but send that result stream onwards
for further processing or storage.
Note that mapAsync preserves the order of the stream elements. In this example the order is not important, so
we can use the more efficient mapAsyncUnordered:
final Source<Author, BoxedUnit> authors =
tweets
.filter(t -> t.hashtags().contains(AKKA))
.map(t -> t.author);
final Source<String, BoxedUnit> emailAddresses =
authors
.mapAsyncUnordered(4, author -> addressSystem.lookupEmail(author.handle))
.filter(o -> o.isPresent())
.map(o -> o.get());


final RunnableGraph<BoxedUnit> sendEmails =


emailAddresses
.mapAsyncUnordered(4, address ->
emailServer.send(new Email(address, "Akka", "I like your tweet")))
.to(Sink.ignore());
sendEmails.run(mat);

In the above example the services conveniently returned a Future of the result. If that is not the case you need
to wrap the call in a Future. If the service call involves blocking you must also make sure that you run it on a
dedicated execution context, to avoid starvation and disturbance of other tasks in the system.
final MessageDispatcher blockingEc = system.dispatchers().lookup("blocking-dispatcher");
final RunnableGraph sendTextMessages =
phoneNumbers
.mapAsync(4, phoneNo ->
Futures.future(() ->
smsServer.send(new TextMessage(phoneNo, "I like your tweet")),
blockingEc)
)
.to(Sink.ignore());
sendTextMessages.run(mat);

The configuration of the "blocking-dispatcher" may look something like:


blocking-dispatcher {
executor = "thread-pool-executor"
thread-pool-executor {
core-pool-size-min
= 10
core-pool-size-max
= 10
}
}

An alternative for blocking calls is to perform them in a map operation, still using a dedicated dispatcher for that
operation.
final Flow<String, Boolean, BoxedUnit> send =
Flow.of(String.class)
.map(phoneNo -> smsServer.send(new TextMessage(phoneNo, "I like your tweet")))
.withAttributes(ActorAttributes.dispatcher("blocking-dispatcher"));
final RunnableGraph<?> sendTextMessages =
phoneNumbers.via(send).to(Sink.ignore());
sendTextMessages.run(mat);

However, that is not exactly the same as mapAsync, since the mapAsync may run several calls concurrently,
but map performs them one at a time.
For a service that is exposed as an actor, or if an actor is used as a gateway in front of an external service, you can
use ask:
final Source<Tweet, BoxedUnit> akkaTweets = tweets.filter(t -> t.hashtags().contains(AKKA));
final RunnableGraph saveTweets =
akkaTweets
.mapAsync(4, tweet -> ask(database, new Save(tweet), 300))
.to(Sink.ignore());

Note that if the ask is not completed within the given timeout the stream is completed with failure. If that is not
the desired outcome you can use recover on the ask Future.
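The effect of recovering a failed reply can be sketched with the JDK's CompletableFuture as a stand-in for the Scala Future returned by ask. This is only an analogy under stated assumptions: the class RecoverSketch, the helper timedOutReply and the fallback value "not-saved" are hypothetical, and exceptionally plays the role that recover plays in the text above.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeoutException;

final class RecoverSketch {
    /** Turns a failed reply into a fallback value instead of letting the failure propagate. */
    static CompletableFuture<String> withFallback(CompletableFuture<String> reply) {
        return reply.exceptionally(failure -> "not-saved");
    }

    /** Hypothetical helper: a reply that has already failed with a timeout. */
    static CompletableFuture<String> timedOutReply() {
        CompletableFuture<String> reply = new CompletableFuture<>();
        reply.completeExceptionally(new TimeoutException("ask timed out"));
        return reply;
    }

    public static void main(String[] args) {
        System.out.println(withFallback(timedOutReply()).join());
        System.out.println(withFallback(CompletableFuture.completedFuture("saved")).join());
    }
}
```

Because the recovered future always completes successfully, a stage consuming it would see an ordinary element rather than a failure.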


Illustrating ordering and parallelism


Let us look at another example to get a better understanding of the ordering and parallelism characteristics of
mapAsync and mapAsyncUnordered.
Several mapAsync and mapAsyncUnordered futures may run concurrently. The number of concurrent futures is limited by the downstream demand. For example, if 5 elements have been requested by downstream
there will be at most 5 futures in progress.
mapAsync emits the future results in the same order as the input elements were received. That means that
completed results are only emitted downstream when earlier results have been completed and emitted. One slow
call will thereby delay the results of all successive calls, even though they are completed before the slow call.
mapAsyncUnordered emits the future results as soon as they are completed, i.e. it is possible that the elements
are not emitted downstream in the same order as received from upstream. One slow call will thereby not delay the
results of faster successive calls as long as there is downstream demand of several elements.
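The difference between the two emission rules can be sketched with plain JDK futures, independent of Akka. In this sketch the futures are completed by hand in a fixed order (B, C, then the slow a) so that the outcome is deterministic. The class OrderingSketch and its method emit are assumptions made for illustration and do not model demand or buffering.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

final class OrderingSketch {
    /** Completes three futures in the order B, C, a and returns the order of emission. */
    static List<String> emit(boolean ordered) {
        List<CompletableFuture<String>> futures = new ArrayList<>();
        List<String> emitted = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            CompletableFuture<String> f = new CompletableFuture<>();
            // Unordered rule: emit from the completion callback, as soon as a result exists.
            if (!ordered) f.thenAccept(emitted::add);
            futures.add(f);
        }
        // Simulated completion order: the slow lower-case element "a" finishes last.
        futures.get(1).complete("B");
        futures.get(2).complete("C");
        futures.get(0).complete("A");
        // Ordered rule: walk the futures in input order, so "A" gates "B" and "C".
        if (ordered) {
            for (CompletableFuture<String> f : futures) emitted.add(f.join());
        }
        return emitted;
    }

    public static void main(String[] args) {
        System.out.println("ordered:   " + emit(true));
        System.out.println("unordered: " + emit(false));
    }
}
```

The ordered variant yields A, B, C although A completed last, while the unordered variant yields B, C, A.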
Here is a fictitious service that we can use to illustrate these aspects.
static class SometimesSlowService {
private final ExecutionContext ec;
public SometimesSlowService(ExecutionContext ec) {
this.ec = ec;
}
private final AtomicInteger runningCount = new AtomicInteger();
public Future<String> convert(String s) {
System.out.println("running: " + s + "(" + runningCount.incrementAndGet() + ")");
return Futures.future(() -> {
if (!s.isEmpty() && Character.isLowerCase(s.charAt(0)))
Thread.sleep(500);
else
Thread.sleep(20);
System.out.println("completed: " + s + "(" + runningCount.decrementAndGet() + ")");
return s.toUpperCase();
}, ec);
}
}

Elements starting with a lower case character are simulated to take longer to process.
Here is how we can use it with mapAsync:
final MessageDispatcher blockingEc = system.dispatchers().lookup("blocking-dispatcher");
final SometimesSlowService service = new SometimesSlowService(blockingEc);
final ActorMaterializer mat = ActorMaterializer.create(
ActorMaterializerSettings.create(system).withInputBuffer(4, 4), system);
Source.from(Arrays.asList("a", "B", "C", "D", "e", "F", "g", "H", "i", "J"))
.map(elem -> { System.out.println("before: " + elem); return elem; })
.mapAsync(4, service::convert)
.runForeach(elem -> System.out.println("after: " + elem), mat);

The output may look like this:


before: a
before: B
before: C
before: D
running: a (1)
running: B (2)
before: e
running: C (3)
before: F
running: D (4)
before: g
before: H
completed: C (3)
completed: B (2)
completed: D (1)
completed: a (0)
after: A
after: B
running: e (1)
after: C
after: D
running: F (2)
before: i
before: J
running: g (3)
running: H (4)
completed: H (2)
completed: F (3)
completed: e (1)
completed: g (0)
after: E
after: F
running: i (1)
after: G
after: H
running: J (2)
completed: J (1)
completed: i (0)
after: I
after: J


Note that the after lines are in the same order as the before lines even though elements are completed in a
different order. For example H is completed before g, but still emitted afterwards.
The numbers in parentheses illustrate how many calls are in progress at the same time. Here the
downstream demand, and thereby the number of concurrent calls, is limited by the buffer size (4) of the
ActorMaterializerSettings.
Here is how we can use the same service with mapAsyncUnordered:
final MessageDispatcher blockingEc = system.dispatchers().lookup("blocking-dispatcher");
final SometimesSlowService service = new SometimesSlowService(blockingEc);
final ActorMaterializer mat = ActorMaterializer.create(
ActorMaterializerSettings.create(system).withInputBuffer(4, 4), system);
Source.from(Arrays.asList("a", "B", "C", "D", "e", "F", "g", "H", "i", "J"))
.map(elem -> { System.out.println("before: " + elem); return elem; })
.mapAsyncUnordered(4, service::convert)
.runForeach(elem -> System.out.println("after: " + elem), mat);

The output may look like this:


before: a
before: B
before: C
before: D
running: a (1)
running: B (2)
before: e
running: C (3)
before: F
running: D (4)
before: g
before: H
completed: B (3)
completed: C (1)
completed: D (2)
after: B
after: D
running: e (2)
after: C
running: F (3)
before: i
before: J
completed: F (2)
after: F
running: g (3)
running: H (4)
completed: H (3)
after: H
completed: a (2)
after: A
running: i (3)
running: J (4)
completed: J (3)
after: J
completed: e (2)
after: E
completed: g (1)
after: G
completed: i (0)
after: I


Note that the after lines are not in the same order as the before lines. For example H overtakes the slow G.
The numbers in parentheses illustrate how many calls are in progress at the same time. Here the
downstream demand, and thereby the number of concurrent calls, is limited by the buffer size (4) of the
ActorMaterializerSettings.

1.9.3 Integrating with Reactive Streams


Reactive Streams defines a standard for asynchronous stream processing with non-blocking back pressure. It
makes it possible to plug together stream libraries that adhere to the standard. Akka Streams is one such library.
An incomplete list of other implementations:
Reactor (1.1+)
RxJava
Ratpack
Slick
The two most important interfaces in Reactive Streams are the Publisher and Subscriber.
import org.reactivestreams.Publisher;
import org.reactivestreams.Subscriber;
import org.reactivestreams.Processor;
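The demand-driven contract between these two interfaces can be sketched with the java.util.concurrent.Flow types of JDK 9 and later, which mirror the Reactive Streams interfaces. This is a toy under stated assumptions: RangePublisher, CollectingSubscriber and receivedAfterRequesting are hypothetical names, and the publisher is synchronous and ignores the thread-safety, reentrancy and argument-validation rules that a compliant implementation must follow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Flow;

final class DemandProtocolSketch {
    /** Publishes 0..count-1, but never more elements than the subscriber has requested. */
    static final class RangePublisher implements Flow.Publisher<Integer> {
        private final int count;
        RangePublisher(int count) { this.count = count; }

        @Override
        public void subscribe(Flow.Subscriber<? super Integer> subscriber) {
            subscriber.onSubscribe(new Flow.Subscription() {
                private int next = 0;
                private boolean done = false;

                @Override
                public void request(long n) {
                    // Emit at most n elements: this is the back-pressure signal at work.
                    for (long i = 0; i < n && !done && next < count; i++) {
                        subscriber.onNext(next++);
                    }
                    if (!done && next == count) {
                        done = true;
                        subscriber.onComplete();
                    }
                }

                @Override
                public void cancel() { done = true; }
            });
        }
    }

    /** Records what it receives and exposes the subscription so demand can be signalled by hand. */
    static final class CollectingSubscriber implements Flow.Subscriber<Integer> {
        final List<Integer> received = new ArrayList<>();
        Flow.Subscription subscription;

        @Override public void onSubscribe(Flow.Subscription s) { subscription = s; }
        @Override public void onNext(Integer item) { received.add(item); }
        @Override public void onError(Throwable t) { throw new IllegalStateException(t); }
        @Override public void onComplete() { }
    }

    /** Subscribes to a 5-element publisher and signals the given demands one after another. */
    static List<Integer> receivedAfterRequesting(long... demands) {
        CollectingSubscriber subscriber = new CollectingSubscriber();
        new RangePublisher(5).subscribe(subscriber);
        for (long demand : demands) subscriber.subscription.request(demand);
        return subscriber.received;
    }

    public static void main(String[] args) {
        System.out.println(receivedAfterRequesting());      // no demand, no elements
        System.out.println(receivedAfterRequesting(2));     // only two elements
        System.out.println(receivedAfterRequesting(2, 10)); // the rest, then completion
    }
}
```

Nothing is delivered until request is called, which is how a slow subscriber limits the rate of a fast publisher.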

Let us assume that a library provides a publisher of tweets:


Publisher<Tweet> tweets();


and another library knows how to store author handles in a database:


Subscriber<Author> storage();

Using an Akka Streams Flow we can transform the stream and connect those:
final Flow<Tweet, Author, BoxedUnit> authors = Flow.of(Tweet.class)
.filter(t -> t.hashtags().contains(AKKA))
.map(t -> t.author);
Source.from(rs.tweets())
.via(authors)
.to(Sink.create(rs.storage()));

The Publisher is used as an input Source to the flow and the Subscriber is used as an output Sink.
A Flow can also be converted to a RunnableGraph<Processor<In, Out>> which materializes to a
Processor when run() is called. run() itself can be called multiple times, resulting in a new Processor
instance each time.
final Processor<Tweet, Author> processor =
authors.toProcessor().run(mat);

rs.tweets().subscribe(processor);
processor.subscribe(rs.storage());

A publisher can be connected to a subscriber with the subscribe method.


It is also possible to expose a Source as a Publisher by using the Publisher-Sink:
final Publisher<Author> authorPublisher =
Source.from(rs.tweets()).via(authors).runWith(Sink.publisher(), mat);
authorPublisher.subscribe(rs.storage());

A publisher that is created with Sink.publisher only supports one subscriber. A second subscription attempt
will be rejected with an IllegalStateException.
A publisher that supports multiple subscribers can be created with Sink.fanoutPublisher instead:
Subscriber<Author> storage();
Subscriber<Author> alert();
final Publisher<Author> authorPublisher =
Source.from(rs.tweets())
.via(authors)
.runWith(Sink.fanoutPublisher(8, 16), mat);
authorPublisher.subscribe(rs.storage());
authorPublisher.subscribe(rs.alert());

The buffer size controls how far apart the slowest subscriber can be from the fastest subscriber before slowing
down the stream.
To make the picture complete, it is also possible to expose a Sink as a Subscriber by using the Subscriber-Source:
final Subscriber<Author> storage = rs.storage();
final Subscriber<Tweet> tweetSubscriber =
authors
.to(Sink.create(storage))
.runWith(Source.subscriber(), mat);
rs.tweets().subscribe(tweetSubscriber);


It is also possible to re-wrap Processor instances as a Flow by passing a factory function that will create
the Processor instances:
// An example Processor factory
final Creator<Processor<Integer, Integer>> factory =
new Creator<Processor<Integer, Integer>>() {
public Processor<Integer, Integer> create() {
return Flow.of(Integer.class).toProcessor().run(mat);
}
};
final Flow<Integer, Integer, BoxedUnit> flow = Flow.create(factory);

Please note that a factory is necessary to achieve reusability of the resulting Flow.

1.10 Error Handling


Strategies for how to handle exceptions from processing stream elements can be defined when materializing the
stream. The error handling strategies are inspired by actor supervision strategies, but the semantics have been
adapted to the domain of stream processing.
Warning: ZipWith, FlexiMerge, FlexiRoute junction, ActorPublisher source and ActorSubscriber sink components do not honour the supervision strategy attribute yet.

1.10.1 Supervision Strategies


There are three ways to handle exceptions from application code:
Stop - The stream is completed with failure.
Resume - The element is dropped and the stream continues.
Restart - The element is dropped and the stream continues after restarting the stage. Restarting a stage
means that any accumulated state is cleared. This is typically performed by creating a new instance of the
stage.
By default the stopping strategy is used for all exceptions, i.e. the stream will be completed with failure when an
exception is thrown.
final Materializer mat = ActorMaterializer.create(system);
final Source<Integer, BoxedUnit> source = Source.from(Arrays.asList(0, 1, 2, 3, 4, 5))
.map(elem -> 100 / elem);
final Sink<Integer, Future<Integer>> fold =
Sink.fold(0, (acc, elem) -> acc + elem);
final Future<Integer> result = source.runWith(fold, mat);
// division by zero will fail the stream and the
// result here will be a Future completed with Failure(ArithmeticException)

The default supervision strategy for a stream can be defined on the settings of the materializer.
final Function<Throwable, Supervision.Directive> decider = exc -> {
if (exc instanceof ArithmeticException)
return Supervision.resume();
else
return Supervision.stop();
};
final Materializer mat = ActorMaterializer.create(
ActorMaterializerSettings.create(system).withSupervisionStrategy(decider),
system);
final Source<Integer, BoxedUnit> source = Source.from(Arrays.asList(0, 1, 2, 3, 4, 5))


.map(elem -> 100 / elem);


final Sink<Integer, Future<Integer>> fold =
Sink.fold(0, (acc, elem) -> acc + elem);
final Future<Integer> result = source.runWith(fold, mat);
// the element causing division by zero will be dropped
// result here will be a Future completed with Success(228)

Here you can see that every ArithmeticException resumes the processing, i.e. the elements that cause the
division by zero are effectively dropped.
Note: Be aware that dropping elements may result in deadlocks in graphs with cycles, as explained in Graph
cycles, liveness and deadlocks.
The supervision strategy can also be defined for all operators of a flow.
final Materializer mat = ActorMaterializer.create(system);
final Function<Throwable, Supervision.Directive> decider = exc -> {
if (exc instanceof ArithmeticException)
return Supervision.resume();
else
return Supervision.stop();
};
final Flow<Integer, Integer, BoxedUnit> flow =
Flow.of(Integer.class).filter(elem -> 100 / elem < 50).map(elem -> 100 / (5 - elem))
.withAttributes(ActorAttributes.withSupervisionStrategy(decider));
final Source<Integer, BoxedUnit> source = Source.from(Arrays.asList(0, 1, 2, 3, 4, 5))
.via(flow);
final Sink<Integer, Future<Integer>> fold =
Sink.fold(0, (acc, elem) -> acc + elem);
final Future<Integer> result = source.runWith(fold, mat);
// the elements causing division by zero will be dropped
// result here will be a Future completed with Success(150)

Restart works in a similar way as Resume with the addition that accumulated state, if any, of the failing
processing stage will be reset.
final Materializer mat = ActorMaterializer.create(system);
final Function<Throwable, Supervision.Directive> decider = exc -> {
if (exc instanceof IllegalArgumentException)
return Supervision.restart();
else
return Supervision.stop();
};
final Flow<Integer, Integer, BoxedUnit> flow =
Flow.of(Integer.class).scan(0, (acc, elem) -> {
if (elem < 0) throw new IllegalArgumentException("negative not allowed");
else return acc + elem;
})
.withAttributes(ActorAttributes.withSupervisionStrategy(decider));
final Source<Integer, BoxedUnit> source = Source.from(Arrays.asList(1, 3, -1, 5, 7))
.via(flow);
final Future<List<Integer>> result = source.grouped(1000)
.runWith(Sink.<List<Integer>>head(), mat);
// the negative element cause the scan stage to be restarted,
// i.e. start from 0 again
// result here will be a Future completed with Success(List(0, 1, 4, 0, 5, 12))
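
The three directives can be compared side by side in a toy model of the running sum above, written in plain Java without Akka. The enum Directive and the method scanSum are assumptions made for illustration. The sketch only models the semantics described in this section (drop the element, and for Restart also reset the state) and says nothing about how the library implements them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SupervisionSketch {
    enum Directive { STOP, RESUME, RESTART }

    /** Running sum that rejects negative input; failures are handled per the given directive. */
    static List<Integer> scanSum(List<Integer> input, Directive directive) {
        List<Integer> out = new ArrayList<>();
        int acc = 0;
        out.add(acc); // like scan, emit the initial value first
        for (int elem : input) {
            try {
                if (elem < 0) throw new IllegalArgumentException("negative not allowed");
                acc += elem;
                out.add(acc);
            } catch (IllegalArgumentException e) {
                switch (directive) {
                    case STOP:
                        throw e;      // the whole computation fails
                    case RESUME:
                        break;        // drop the element, keep the accumulated state
                    case RESTART:
                        acc = 0;      // drop the element and reset the state,
                        out.add(acc); // so the initial value is emitted again
                        break;
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 3, -1, 5, 7);
        System.out.println(scanSum(input, Directive.RESTART)); // [0, 1, 4, 0, 5, 12]
        System.out.println(scanSum(input, Directive.RESUME));  // [0, 1, 4, 9, 16]
    }
}
```

The Restart result matches the list shown in the example above, while Resume keeps the sum of 4 and continues from there.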

1.10.2 Errors from mapAsync


Stream supervision can also be applied to the futures of mapAsync.


Let's say that we use an external service to look up email addresses and we would like to discard those that cannot
be found.
We start with the tweet stream of authors:
final Source<Author, BoxedUnit> authors = tweets
.filter(t -> t.hashtags().contains(AKKA))
.map(t -> t.author);

Assume that we can look up their email address using:


public Future<String> lookupEmail(String handle)

The Future is completed with Failure if the email is not found.


Transforming the stream of authors to a stream of email addresses by using the lookupEmail service can be
done with mapAsync and we use Supervision.getResumingDecider to drop unknown email addresses:
final Attributes resumeAttrib =
ActorAttributes.withSupervisionStrategy(Supervision.getResumingDecider());
final Flow<Author, String, BoxedUnit> lookupEmail =
Flow.of(Author.class)
.mapAsync(4, author -> addressSystem.lookupEmail(author.handle))
.withAttributes(resumeAttrib);
final Source<String, BoxedUnit> emailAddresses = authors.via(lookupEmail);

If we did not use Resume, the default stopping strategy would complete the stream with failure on the first
Future that was completed with Failure.

1.11 Working with streaming IO


Akka Streams provides a way of handling File IO and TCP connections with Streams. While the general approach
is very similar to the Actor based TCP handling using Akka IO, by using Akka Streams you are freed of having
to manually react to back-pressure signals, as the library does it transparently for you.

1.11.1 Streaming TCP


Accepting connections: Echo Server
In order to implement a simple EchoServer we bind to a given address, which returns a
Source[IncomingConnection], which will emit an IncomingConnection element for each new connection that the Server should handle:
// IncomingConnection and ServerBinding imported from Tcp
final Source<IncomingConnection, Future<ServerBinding>> connections =
Tcp.get(system).bind("127.0.0.1", 8889);

Next, we simply handle each incoming connection using a Flow which will be used as the processing stage to
handle and emit ByteStrings from and to the TCP Socket. Since one ByteString does not have to necessarily
correspond to exactly one line of text (the client might be sending the line in chunks) we use the delimiter
helper Flow from akka.stream.io.Framing to chunk the inputs up into actual lines of text. The last boolean
argument indicates that we require an explicit line ending even for the last message before the connection is closed.
In this example we simply add exclamation marks to each incoming text message and push it through the flow:
connections.runForeach(connection -> {
System.out.println("New connection from: " + connection.remoteAddress());
final Flow<ByteString, ByteString, BoxedUnit> echo = Flow.of(ByteString.class)
.via(Framing.delimiter(ByteString.fromString("\n"), 256, false))
.map(bytes -> bytes.utf8String())


.map(s -> s + "!!!\n")


.map(s -> ByteString.fromString(s));
connection.handleWith(echo, mat);
}, mat);
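
Why framing is needed can be sketched in plain Java: chunks arrive with arbitrary boundaries, so bytes have to be buffered until a delimiter is seen. The class LineFramerSketch and the parameter names maxFrameLength and allowTruncation are assumptions made for illustration. The sketch throws exceptions where a stream stage would fail the stream, and it is not the implementation behind Framing.delimiter.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class LineFramerSketch {
    private final int maxFrameLength;
    private byte[] buffer = new byte[0]; // bytes received after the last delimiter

    LineFramerSketch(int maxFrameLength) { this.maxFrameLength = maxFrameLength; }

    /** Accepts one chunk and returns every complete line now available, without the delimiter. */
    List<String> feed(byte[] chunk) {
        byte[] joined = Arrays.copyOf(buffer, buffer.length + chunk.length);
        System.arraycopy(chunk, 0, joined, buffer.length, chunk.length);
        List<String> lines = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < joined.length; i++) {
            if (joined[i] == '\n') {
                if (i - start > maxFrameLength) throw new IllegalStateException("frame too long");
                lines.add(new String(joined, start, i - start, StandardCharsets.UTF_8));
                start = i + 1;
            }
        }
        buffer = Arrays.copyOfRange(joined, start, joined.length);
        if (buffer.length > maxFrameLength) throw new IllegalStateException("frame too long");
        return lines;
    }

    /** Called at end of input: a trailing partial line is only accepted if truncation is allowed. */
    List<String> finish(boolean allowTruncation) {
        if (buffer.length == 0) return Collections.emptyList();
        if (!allowTruncation) throw new IllegalStateException("input ended without a final delimiter");
        String rest = new String(buffer, StandardCharsets.UTF_8);
        buffer = new byte[0];
        return Collections.singletonList(rest);
    }

    /** Convenience driver: feeds all chunks, then finishes. */
    static List<String> frameAll(int maxFrameLength, boolean allowTruncation, String... chunks) {
        LineFramerSketch framer = new LineFramerSketch(maxFrameLength);
        List<String> lines = new ArrayList<>();
        for (String chunk : chunks) lines.addAll(framer.feed(chunk.getBytes(StandardCharsets.UTF_8)));
        lines.addAll(framer.finish(allowTruncation));
        return lines;
    }

    public static void main(String[] args) {
        // One line split across chunks, and two lines sharing a chunk.
        System.out.println(frameAll(256, false, "Hel", "lo\nWor", "ld\n"));
    }
}
```

With the chunks "Hel", "lo\nWor" and "ld\n" the framer yields the two lines Hello and World, regardless of where the chunk boundaries fall.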

Notice that while most building blocks in Akka Streams are reusable and freely shareable, this is not the case
for the incoming connection Flow: since it directly corresponds to an existing, already accepted connection, its
handling can only ever be materialized once.
Closing connections is possible by cancelling the incoming connection Flow from your server logic (e.g. by
connecting its downstream to a CancelledSink and its upstream to a completed Source). It is also possible
to shut down the server's socket by cancelling the connections Source (the Source of IncomingConnection elements returned by bind).
We can then test the TCP server by sending data to the TCP Socket using netcat:
$ echo -n "Hello World" | netcat 127.0.0.1 8889
Hello World!!!

Connecting: REPL Client


In this example we implement a rather naive Read Evaluate Print Loop client over TCP. Let's say we know a server
has exposed a simple command line interface over TCP, and we would like to interact with it using Akka Streams
over TCP. To open an outgoing connection socket we use the outgoingConnection method:
final Flow<ByteString, ByteString, Future<OutgoingConnection>> connection =
Tcp.get(system).outgoingConnection("127.0.0.1", 8889);
final PushStage<String, ByteString> replParser = new PushStage<String, ByteString>() {
@Override public SyncDirective onPush(String elem, Context<ByteString> ctx) {
if (elem.equals("q"))
return ctx.pushAndFinish(ByteString.fromString("BYE\n"));
else
return ctx.push(ByteString.fromString(elem + "\n"));
}
};
final Flow<ByteString, ByteString, BoxedUnit> repl = Flow.of(ByteString.class)
.via(Framing.delimiter(ByteString.fromString("\n"), 256, false))
.map(bytes -> bytes.utf8String())
.map(text -> {System.out.println("Server: " + text); return "next";})
.map(elem -> readLine("> "))
.transform(() -> replParser);
connection.join(repl).run(mat);

The repl flow we use to handle the server interaction first prints the server's response, then waits for input from
the command line (this blocking call is used here just for the sake of simplicity) and converts it to a ByteString
which is then sent over the wire to the server. Then we simply connect the TCP pipeline to this processing stage. At
this point it will be materialized and start processing data once the server responds with an initial message.
A resilient REPL client would be more sophisticated than this. For example, it should split out the input reading
into a separate mapAsync step and have a way to let the server write more data than one ByteString chunk at any
given time. These improvements, however, are left as an exercise for the reader.
Avoiding deadlocks and liveness issues in back-pressured cycles
When writing such end-to-end back-pressured systems you may sometimes end up in a situation of a loop, in
which either side is waiting for the other one to start the conversation. One does not need to look far to find
examples of such back-pressure loops. In the two examples shown previously, we always assumed that the side
we are connecting to would start the conversation, which effectively means both sides are back-pressured and
cannot get the conversation started. There are multiple ways of dealing with this, which are explained in depth in
Graph cycles, liveness and deadlocks. However, in client-server scenarios it is often simplest to make either
side send an initial message.
Note: In case of back-pressured cycles (which can occur even between different systems) sometimes you have to
decide which of the sides has to start the conversation in order to kick it off. This can often be done by injecting an
initial message from one of the sides: a conversation starter.
To break this back-pressure cycle we need to inject some initial message, a conversation starter. First, we need
to decide which side of the connection should remain passive and which active. Thankfully in most situations
finding the right spot to start the conversation is rather simple, as it often is inherent to the protocol we are trying
to implement using Streams. In chat-like applications, which our examples resemble, it makes sense to make the
Server initiate the conversation by emitting a hello message:
connections.runForeach(connection -> {
// server logic, parses incoming commands
final PushStage<String, String> commandParser = new PushStage<String, String>() {
@Override public SyncDirective onPush(String elem, Context<String> ctx) {
if (elem.equals("BYE"))
return ctx.finish();
else
return ctx.push(elem + "!");
}
};
final String welcomeMsg = "Welcome to: " + connection.localAddress() +
" you are: " + connection.remoteAddress() + "!\n";
final Source<ByteString, BoxedUnit> welcome =
Source.single(ByteString.fromString(welcomeMsg));
final Flow<ByteString, ByteString, BoxedUnit> echoFlow =
Flow.of(ByteString.class)
.via(Framing.delimiter(ByteString.fromString("\n"), 256, false))
.map(bytes -> bytes.utf8String())
.transform(() -> commandParser)
.map(s -> s + "\n")
.map(s -> ByteString.fromString(s));
final Flow<ByteString, ByteString, BoxedUnit> serverLogic =
Flow.factory().create(builder -> {
final UniformFanInShape<ByteString, ByteString> concat =
builder.graph(Concat.create());
final FlowShape<ByteString, ByteString> echo = builder.graph(echoFlow);
builder
.from(welcome).to(concat)
.from(echo).to(concat);
return new Pair<>(echo.inlet(), concat.out());
});
connection.handleWith(serverLogic, mat);
}, mat);

The way we constructed a Flow using a PartialFlowGraph is explained in detail in Constructing Sources,
Sinks and Flows from Partial Graphs. The basic concept is rather simple, however: we can encapsulate arbitrarily complex logic within a Flow as long as it exposes the same interface, which means exposing exactly one
UndefinedSink and exactly one UndefinedSource which will be connected to the TCP pipeline. In this
example we use a Concat graph processing stage to inject the initial message, and then continue with handling
all incoming data using the echo handler. You should use this pattern of encapsulating complex logic in Flows and
attaching those to StreamIO in order to implement your custom and possibly sophisticated TCP servers.


In this example both client and server may need to close the stream based on a parsed command - BYE in the case
of the server, and q in the case of the client. This is implemented by using a custom PushStage (see Using
PushPullStage) which completes the stream once it encounters such command.

1.11.2 Streaming File IO


Akka Streams provide simple Sources and Sinks that can work with ByteString instances to perform IO
operations on files.
Note: Since the current version of Akka (2.3.x) needs to support JDK6, the currently provided File IO implementations are not able to utilise Asynchronous File IO operations, as these were introduced in JDK7 (and newer).
Once Akka is free to require JDK8 (from 2.4.x) these implementations will be updated to make use of the new
NIO APIs (i.e. AsynchronousFileChannel).
Streaming data from a file is as easy as defining a SynchronousFileSource given a target file, and an optional
chunkSize which determines the size of the buffer that is emitted as one element in such a stream:
final File file = new File("example.csv");
Sink<ByteString, Future<BoxedUnit>> printlnSink =
Sink.foreach(chunk -> System.out.println(chunk.utf8String()));
Future<Long> bytesWritten =
SynchronousFileSource.create(file)
.to(printlnSink)
.run(mat);
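
What reading a file in chunks means can be sketched with blocking JDK IO alone. The names ChunkedReadSketch, readChunks and chunkSizes are assumptions made for illustration. The sketch collects all chunks in a list, whereas a stream source would emit them one at a time as downstream demand arrives.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ChunkedReadSketch {
    /** Reads the file as a sequence of chunks; every chunk except possibly the last is full. */
    static List<byte[]> readChunks(Path file, int chunkSize) throws IOException {
        List<byte[]> chunks = new ArrayList<>();
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[chunkSize];
            while (true) {
                int filled = 0;
                // A single read may return fewer bytes than asked for, so fill the buffer in a loop.
                while (filled < chunkSize) {
                    int n = in.read(buf, filled, chunkSize - filled);
                    if (n == -1) break;
                    filled += n;
                }
                if (filled == 0) break;             // end of file, nothing left to emit
                chunks.add(Arrays.copyOf(buf, filled));
                if (filled < chunkSize) break;      // short chunk means end of file
            }
        }
        return chunks;
    }

    /** Hypothetical helper: writes a temporary file of the given size and reports the chunk lengths. */
    static List<Integer> chunkSizes(int totalBytes, int chunkSize) {
        try {
            Path tmp = Files.createTempFile("chunks", ".bin");
            try {
                Files.write(tmp, new byte[totalBytes]);
                List<Integer> sizes = new ArrayList<>();
                for (byte[] chunk : readChunks(tmp, chunkSize)) sizes.add(chunk.length);
                return sizes;
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(chunkSizes(10, 4)); // [4, 4, 2]
    }
}
```

The reads in this sketch block the calling thread, which is the reason the text below recommends a dedicated dispatcher for file IO.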

Please note that these processing stages are backed by Actors and by default are configured to run on a preconfigured threadpool-backed dispatcher dedicated for File IO. This is very important as it isolates the blocking
file IO operations from the rest of the ActorSystem allowing each dispatcher to be utilised in the most efficient
way. If you want to configure a custom dispatcher for file IO operations globally, you can do so by changing the
akka.stream.file-io-dispatcher, or for a specific stage by specifying a custom Dispatcher in code,
like this:
SynchronousFileSink.create(file)
.withAttributes(ActorAttributes.dispatcher("custom-file-io-dispatcher"));

1.12 Pipelining and Parallelism


Akka Streams processing stages (be it simple operators on Flows and Sources or graph junctions) are executed
concurrently by default. This is realized by mapping each of the processing stages to a dedicated actor internally.
We will illustrate through the example of pancake cooking how streams can be used for various processing patterns, exploiting the available parallelism on modern computers. The setting is the following: both Patrik and
Roland like to make pancakes, but they need to produce a sufficient amount in a cooking session to make all of the
children happy. To increase their pancake production throughput they use two frying pans. How they organize
their pancake processing is markedly different.

1.12.1 Pipelining
Roland uses the two frying pans in an asymmetric fashion. The first pan is only used to fry one side of the pancake,
then the half-finished pancake is flipped into the second pan for the finishing fry on the other side. Once the first
frying pan becomes available it gets a new scoop of batter. As an effect, most of the time there are two pancakes
being cooked at the same time, one being cooked on its first side and the second being cooked to completion. This
is what this setup looks like when implemented as a stream:

Flow<ScoopOfBatter, HalfCookedPancake, BoxedUnit> fryingPan1 =
  Flow.of(ScoopOfBatter.class).map(batter -> new HalfCookedPancake());
Flow<HalfCookedPancake, Pancake, BoxedUnit> fryingPan2 =
  Flow.of(HalfCookedPancake.class).map(halfCooked -> new Pancake());
// With the two frying pans we can fully cook pancakes
Flow<ScoopOfBatter, Pancake, BoxedUnit> pancakeChef = fryingPan1.via(fryingPan2);

The two map stages in sequence (encapsulated in the frying pan flows) will be executed in a pipelined way,
basically doing the same as Roland with his frying pans:
1. A ScoopOfBatter enters fryingPan1
2. fryingPan1 emits a HalfCookedPancake once fryingPan2 becomes available
3. fryingPan2 takes the HalfCookedPancake
4. at this point fryingPan1 already takes the next scoop, without waiting for fryingPan2 to finish
The benefit of pipelining is that it can be applied to any sequence of processing steps that are otherwise not
parallelisable (for example because the result of a processing step depends on all the information from the previous
step). One drawback is that if the processing times of the stages are very different then some of the stages will
not be able to operate at full throughput because they will wait on a previous or subsequent stage most of the
time. In the pancake example frying the second half of the pancake is usually faster than frying the first half,
so fryingPan2 will not be able to operate at full capacity [1].
Stream processing stages have internal buffers to make communication between them more efficient. For more
details about the behavior of these and how to add additional buffers refer to Buffers and working with rate.
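The hand-over between the two pans can be sketched without Akka, using two plain Java threads and a bounded queue. This is only an illustration of the pipelining idea under simplifying assumptions, not a description of how Akka Streams is implemented; PipelineSketch, cook and the DONE marker are hypothetical names introduced for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class PipelineSketch {
    // Hypothetical end-of-input marker used only by this sketch.
    private static final String DONE = "<done>";

    // Runs two "frying pan" stages on separate threads. The bounded queue
    // between them makes the first stage wait when the second falls behind.
    public static List<String> cook(List<String> scoops) {
        BlockingQueue<String> handOver = new ArrayBlockingQueue<>(1);
        List<String> plate = new ArrayList<>();

        Thread pan1 = new Thread(() -> {
            try {
                for (String scoop : scoops) {
                    // Blocks while the hand-over slot is still occupied.
                    handOver.put("half-cooked " + scoop);
                }
                handOver.put(DONE);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread pan2 = new Thread(() -> {
            try {
                String half = handOver.take();
                while (!half.equals(DONE)) {
                    plate.add(half.replace("half-cooked", "pancake from"));
                    half = handOver.take();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        pan1.start();
        pan2.start();
        try {
            // Joining both threads makes the contents of the plate visible here.
            pan1.join();
            pan2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while cooking", e);
        }
        return plate;
    }
}
```

While the second thread finishes one pancake, the first thread can already prepare the next one, but it can never run more than one element ahead because the queue holds a single item.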

1.12.2 Parallel processing


Patrik uses the two frying pans symmetrically. He uses both pans to fully fry a pancake on both sides, then puts
the results on a shared plate. Whenever a pan becomes empty, he takes the next scoop from the shared bowl of
batter. In essence he parallelizes the same process over multiple pans. This is what this setup looks like when
implemented using streams:
Flow<ScoopOfBatter, Pancake, BoxedUnit> fryingPan =
  Flow.of(ScoopOfBatter.class).map(batter -> new Pancake());
Flow<ScoopOfBatter, Pancake, BoxedUnit> pancakeChef =
  Flow.factory().create(b -> {
    final UniformFanInShape<Pancake, Pancake> mergePancakes =
      b.graph(Merge.create(2));
    final UniformFanOutShape<ScoopOfBatter, ScoopOfBatter> dispatchBatter =
      b.graph(Balance.create(2));
    // Using two frying pans in parallel, both fully cooking a pancake from the batter.
    // We always put the next scoop of batter to the first frying pan that becomes available.
    b.from(dispatchBatter.out(0)).via(fryingPan).to(mergePancakes.in(0));
    // Notice that we used the "fryingPan" flow without importing it via b.graph().
    // Flows used this way are auto-imported, which in this case means that the two
    // uses of "fryingPan" are actually different stages in the graph.
    b.from(dispatchBatter.out(1)).via(fryingPan).to(mergePancakes.in(1));
    return new Pair<>(dispatchBatter.in(), mergePancakes.out());
  });

The benefit of parallelizing is that it is easy to scale. In the pancake example it is easy to add a third frying pan
with Patrik's method, but Roland cannot add a third frying pan, since that would require a third processing step,
which is not practically possible in the case of frying pancakes.
[1] Roland's reason for this seemingly suboptimal procedure is that he prefers the temperature of the second pan to be slightly lower than the
first in order to achieve a more homogeneous result.

One drawback of the example code above is that it does not preserve the ordering of pancakes. This might be a
problem if children like to track their own pancakes. In those cases the Balance and Merge stages should be
replaced by strict round-robin balancing and merging stages that put in and take out pancakes in a strict order.
A more detailed example of creating a worker pool can be found in the Streams Cookbook.
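The bookkeeping behind strict round-robin balancing and merging can be sketched in plain Java. This is a sequential illustration of the ordering argument only, with no concurrency and no Akka API involved; RoundRobinSketch and process are hypothetical names, and the sketch assumes that the work function never returns null.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Function;

public class RoundRobinSketch {
    // Deals elements to `workers` lanes in strict rotation, applies `work`
    // in each lane, then collects the results in the same rotation.
    public static <A, B> List<B> process(List<A> input, int workers, Function<A, B> work) {
        List<Queue<B>> lanes = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            lanes.add(new ArrayDeque<>());
        }
        // Strict round-robin balancing: element k goes to lane k % workers.
        for (int k = 0; k < input.size(); k++) {
            lanes.get(k % workers).add(work.apply(input.get(k)));
        }
        // Strict round-robin merging: take from the lanes in the same order.
        List<B> output = new ArrayList<>();
        for (int k = 0; k < input.size(); k++) {
            output.add(lanes.get(k % workers).remove());
        }
        return output;
    }
}
```

Because the merge side visits the lanes in exactly the order the balance side filled them, the output order matches the input order regardless of how many lanes are used. The price in a real concurrent setting is that a slow lane holds back results that are already finished in the other lanes.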

1.12.3 Combining pipelining and parallel processing


The two concurrency patterns that we demonstrated as means to increase throughput are not exclusive. In fact,
it is rather simple to combine the two approaches and streams provide a nice unifying language to express and
compose them.
First, let's look at how we can parallelize pipelined processing stages. In the case of pancakes this means that we
will employ two chefs, each working using Roland's pipelining method, but we use the two chefs in parallel, just
like Patrik used the two frying pans. This is what it looks like when expressed as streams:
Flow<ScoopOfBatter, Pancake, BoxedUnit> pancakeChef =
  Flow.factory().create(b -> {
    final UniformFanInShape<Pancake, Pancake> mergePancakes =
      b.graph(Merge.create(2));
    final UniformFanOutShape<ScoopOfBatter, ScoopOfBatter> dispatchBatter =
      b.graph(Balance.create(2));
    // Using two pipelines, having two frying pans each, in total using
    // four frying pans
    b.from(dispatchBatter.out(0))
      .via(fryingPan1)
      .via(fryingPan2)
      .to(mergePancakes.in(0));
    b.from(dispatchBatter.out(1))
      .via(fryingPan1)
      .via(fryingPan2)
      .to(mergePancakes.in(1));
    return new Pair<>(dispatchBatter.in(), mergePancakes.out());
  });

The above pattern works well if there are many independent jobs that do not depend on the results of each other,
but the jobs themselves need multiple processing steps where each step builds on the result of the previous one. In
our case individual pancakes do not depend on each other, so they can be cooked in parallel. On the other hand, it is
not possible to fry both sides of the same pancake at the same time, so the two sides have to be fried in sequence.
It is also possible to organize parallelized stages into pipelines. This would mean employing four chefs:
• the first two chefs prepare half-cooked pancakes from batter, in parallel, then put those on a large enough
flat surface;
• the second two chefs take these and fry their other side in their own pans, then put the pancakes on a
shared plate.
This is again straightforward to implement with the streams API:
Flow<ScoopOfBatter, HalfCookedPancake, BoxedUnit> pancakeChefs1 =
  Flow.factory().create(b -> {
    final UniformFanInShape<HalfCookedPancake, HalfCookedPancake> mergeHalfCooked =
      b.graph(Merge.create(2));
    final UniformFanOutShape<ScoopOfBatter, ScoopOfBatter> dispatchBatter =
      b.graph(Balance.create(2));
    // Two chefs work with one frying pan for each, half-frying the pancakes then putting
    // them into a common pool
    b.from(dispatchBatter.out(0)).via(fryingPan1).to(mergeHalfCooked.in(0));
    b.from(dispatchBatter.out(1)).via(fryingPan1).to(mergeHalfCooked.in(1));
    return new Pair<>(dispatchBatter.in(), mergeHalfCooked.out());
  });
Flow<HalfCookedPancake, Pancake, BoxedUnit> pancakeChefs2 =
  Flow.factory().create(b -> {
    final UniformFanInShape<Pancake, Pancake> mergePancakes =
      b.graph(Merge.create(2));
    final UniformFanOutShape<HalfCookedPancake, HalfCookedPancake> dispatchHalfCooked =
      b.graph(Balance.create(2));
    // Two chefs work with one frying pan for each, finishing the pancakes then putting
    // them into a common pool
    b.from(dispatchHalfCooked.out(0)).via(fryingPan2).to(mergePancakes.in(0));
    b.from(dispatchHalfCooked.out(1)).via(fryingPan2).to(mergePancakes.in(1));
    return new Pair<>(dispatchHalfCooked.in(), mergePancakes.out());
  });
Flow<ScoopOfBatter, Pancake, BoxedUnit> kitchen =
  pancakeChefs1.via(pancakeChefs2);

This usage pattern is less common but can be useful if a certain step in the pipeline takes wildly different
times to finish different jobs. The reason is that this pattern has more balance-merge steps than the parallel
pipelines: it rebalances after each step, while the previous pattern only balances at the entry
point of the pipeline. This only matters, however, if the processing time distribution has a large deviation.

1.13 Testing streams


Verifying behaviour of Akka Stream sources, flows and sinks can be done using various code patterns and libraries.
Here we will discuss testing these elements using:
• simple sources, sinks and flows;
• sources and sinks in combination with TestProbe from the akka-testkit module;
• sources and sinks specifically crafted for writing tests from the akka-stream-testkit module.
It is important to keep your data processing pipeline as separate sources, flows and sinks. This makes them
easily testable by wiring them up to other sources or sinks, or some test harnesses that akka-testkit or
akka-stream-testkit provide.

1.13.1 Built in sources, sinks and combinators


Testing a custom sink can be as simple as attaching a source that emits elements from a predefined collection,
running the constructed test flow and asserting on the results that the sink produced. Here is an example of a test
for a sink:
final Sink<Integer, Future<Integer>> sinkUnderTest = Flow.of(Integer.class)
  .map(i -> i * 2)
  .toMat(Sink.fold(0, (agg, next) -> agg + next), Keep.right());
final Future<Integer> future = Source.from(Arrays.asList(1, 2, 3, 4))
  .runWith(sinkUnderTest, mat);
final Integer result = Await.result(future, Duration.create(1, TimeUnit.SECONDS));
assert(result == 20);

The same strategy can be applied for sources as well. In the next example we have a source that produces an
infinite stream of elements. Such a source can be tested by asserting that an arbitrary number of its first elements
satisfy some condition. Here the grouped combinator and Sink.head are very useful.

final Source<Integer, BoxedUnit> sourceUnderTest = Source.repeat(1)
  .map(i -> i * 2);
final Future<List<Integer>> future = sourceUnderTest
  .grouped(10)
  .runWith(Sink.head(), mat);
final List<Integer> result =
  Await.result(future, Duration.create(1, TimeUnit.SECONDS));
assertEquals(result, Collections.nCopies(10, 2));

When testing a flow we need to attach a source and a sink. As both stream ends are under our control, we can
choose sources that test various edge cases of the flow and sinks that ease assertions.
final Flow<Integer, Integer, BoxedUnit> flowUnderTest = Flow.of(Integer.class)
  .takeWhile(i -> i < 5);
final Future<Integer> future = Source.from(Arrays.asList(1, 2, 3, 4, 5, 6))
  .via(flowUnderTest).runWith(Sink.fold(0, (agg, next) -> agg + next), mat);
final Integer result = Await.result(future, Duration.create(1, TimeUnit.SECONDS));
assert(result == 10);

1.13.2 TestKit
Akka Stream offers integration with Actors out of the box. This support can be used for writing stream tests that
use the familiar TestProbe from the akka-testkit API.
One of the more straightforward tests would be to materialize the stream to a Future and then use the pipe pattern
to pipe the result of that future to the probe.
final Source<List<Integer>, BoxedUnit> sourceUnderTest = Source
  .from(Arrays.asList(1, 2, 3, 4))
  .grouped(2);
final TestProbe probe = new TestProbe(system);
final Future<List<List<Integer>>> future = sourceUnderTest
  .grouped(2)
  .runWith(Sink.head(), mat);
akka.pattern.Patterns.pipe(future, system.dispatcher()).to(probe.ref());
probe.expectMsg(Duration.create(1, TimeUnit.SECONDS),
  Arrays.asList(Arrays.asList(1, 2), Arrays.asList(3, 4))
);

Instead of materializing to a future, we can use a Sink.actorRef that sends all incoming elements to the
given ActorRef. Now we can use assertion methods on TestProbe and expect elements one by one as
they arrive. We can also assert stream completion by expecting the onCompleteMessage which was given to
Sink.actorRef.
final Source<Tick, Cancellable> sourceUnderTest = Source.from(
  FiniteDuration.create(0, TimeUnit.MILLISECONDS),
  FiniteDuration.create(200, TimeUnit.MILLISECONDS),
  Tick.TOCK);
final TestProbe probe = new TestProbe(system);
final Cancellable cancellable = sourceUnderTest
  .to(Sink.actorRef(probe.ref(), Tick.COMPLETED)).run(mat);
probe.expectMsg(Duration.create(1, TimeUnit.SECONDS), Tick.TOCK);
probe.expectNoMsg(Duration.create(100, TimeUnit.MILLISECONDS));
probe.expectMsg(Duration.create(1, TimeUnit.SECONDS), Tick.TOCK);
cancellable.cancel();
probe.expectMsg(Duration.create(1, TimeUnit.SECONDS), Tick.COMPLETED);

Similarly to Sink.actorRef that provides control over received elements, we can use Source.actorRef
and have full control over elements to be sent.
final Sink<Integer, Future<String>> sinkUnderTest = Flow.of(Integer.class)
  .map(i -> i.toString())
  .toMat(Sink.fold("", (agg, next) -> agg + next), Keep.right());
final Pair<ActorRef, Future<String>> refAndFuture =
  Source.<Integer>actorRef(8, OverflowStrategy.fail())
    .toMat(sinkUnderTest, Keep.both())
    .run(mat);
final ActorRef ref = refAndFuture.first();
final Future<String> future = refAndFuture.second();
ref.tell(1, ActorRef.noSender());
ref.tell(2, ActorRef.noSender());
ref.tell(3, ActorRef.noSender());
ref.tell(new akka.actor.Status.Success("done"), ActorRef.noSender());
final String result = Await.result(future, Duration.create(1, TimeUnit.SECONDS));
assertEquals(result, "123");

1.13.3 Streams TestKit


You may have noticed various code patterns that emerge when testing stream pipelines. Akka Stream has a
separate akka-stream-testkit module that provides tools specifically for writing stream tests. This module
comes with two main components, TestSource and TestSink, which provide sources and sinks that
materialize to probes offering a fluent API.
Note: Be sure to add the module akka-stream-testkit to your dependencies.
A sink returned by TestSink.probe allows manual control over demand and assertions over the elements coming
downstream.
final Source<Integer, BoxedUnit> sourceUnderTest = Source.from(Arrays.asList(1, 2, 3, 4))
  .filter(elem -> elem % 2 == 0)
  .map(elem -> elem * 2);
sourceUnderTest
  .runWith(TestSink.probe(system), mat)
  .request(2)
  .expectNext(4, 8)
  .expectComplete();

A source returned by TestSource.probe can be used for asserting demand or controlling when the stream is
completed or ended with an error.
final Sink<Integer, BoxedUnit> sinkUnderTest = Sink.cancelled();
TestSource.<Integer>probe(system)
  .toMat(sinkUnderTest, Keep.left())
  .run(mat)
  .expectCancellation();

You can also inject exceptions and test sink behaviour on error conditions.
final Sink<Integer, Future<Integer>> sinkUnderTest = Sink.head();
final Pair<TestPublisher.Probe<Integer>, Future<Integer>> probeAndFuture =
  TestSource.<Integer>probe(system)
    .toMat(sinkUnderTest, Keep.both())
    .run(mat);
final TestPublisher.Probe<Integer> probe = probeAndFuture.first();
final Future<Integer> future = probeAndFuture.second();
probe.sendError(new Exception("boom"));
Await.ready(future, Duration.create(1, TimeUnit.SECONDS));
final Throwable exception = ((Failure)future.value().get()).exception();
assertEquals(exception.getMessage(), "boom");

Test sources and sinks can be used in combination when testing flows.
final Flow<Integer, Integer, BoxedUnit> flowUnderTest = Flow.of(Integer.class)
  .mapAsyncUnordered(2, sleep -> akka.pattern.Patterns.after(
    Duration.create(10, TimeUnit.MILLISECONDS),
    system.scheduler(),
    system.dispatcher(),
    Futures.successful(sleep)
  ));
final Pair<TestPublisher.Probe<Integer>, TestSubscriber.Probe<Integer>> pubAndSub =
  TestSource.<Integer>probe(system)
    .via(flowUnderTest)
    .toMat(TestSink.<Integer>probe(system), Keep.both())
    .run(mat);
final TestPublisher.Probe<Integer> pub = pubAndSub.first();
final TestSubscriber.Probe<Integer> sub = pubAndSub.second();
sub.request(3);
pub.sendNext(3);
pub.sendNext(2);
pub.sendNext(1);
sub.expectNextUnordered(1, 2, 3);
pub.sendError(new Exception("Power surge in the linear subroutine C-47!"));
final Throwable ex = sub.expectError();
assert(ex.getMessage().contains("C-47"));

1.14 Overview of built-in stages and their semantics


All stages by default backpressure if the computation they encapsulate is not fast enough to keep up with the rate
of incoming elements from the preceding stage. There are differences, though, in how the different stages behave
when some of their downstream stages backpressure them. The tables below provide a summary of all built-in stages
and their semantics.
All stages stop and propagate the failure downstream as soon as any of their upstreams emits a failure, unless
supervision is used. This ensures reliable teardown of streams and cleanup when failures happen.
Failures are meant to model unrecoverable conditions, therefore they are always eagerly propagated. For in-band
error handling of normal errors (dropping elements if a map fails, for example) you should use the supervision
support, or explicitly wrap your element types in a proper container that can express error or success states (for
example Try in Scala).
Custom components are not covered by this table since their semantics are defined by the user.

1.14.1 Simple processing stages


These stages are all expressible as a PushPullStage. These stages can transform the rate of incoming elements
since there are stages that emit multiple elements for a single input (e.g. mapConcat) or consume multiple
elements before emitting one output (e.g. filter). However, these rate transformations are data-driven, i.e. it is
the incoming elements that define how the rate is affected. This is in contrast with Backpressure aware stages
which can change their processing behavior depending on being backpressured by downstream or not.

Stage | Emits when | Backpressures when | Completes when
map | the mapping function returns an element | downstream backpressures | upstream completes
mapConcat | the mapping function returns an element or there are still remaining elements from the previously calculated collection | downstream backpressures or there are still available elements from the previously calculated collection | upstream completes and all remaining elements have been emitted
filter | the given predicate returns true for the element | the given predicate returns true for the element and downstream backpressures | upstream completes
collect | the provided partial function is defined for the element | the partial function is defined for the element and downstream backpressures | upstream completes
grouped | the specified number of elements has been accumulated or upstream completed | a group has been assembled and downstream backpressures | upstream completes
scan | the function scanning the element returns a new element | downstream backpressures | upstream completes
fold | upstream completes | downstream backpressures | upstream completes
drop | the specified number of elements has been dropped already | the specified number of elements has been dropped and downstream backpressures | upstream completes
take | the specified number of elements to take has not yet been reached | downstream backpressures | the defined number of elements has been taken or upstream completes
takeWhile | the predicate is true and until the first false result | downstream backpressures | predicate returned false or upstream completes
dropWhile | the predicate returned false and for all following stream elements | predicate returned false and downstream backpressures | upstream completes
recover | the element is available from the upstream or upstream is failed and pf returns an element | downstream backpressures, not when failure happened | upstream completes or upstream failed with exception pf can handle

1.14.2 Asynchronous processing stages


These stages encapsulate an asynchronous computation, properly handling backpressure while taking care of the
asynchronous operation at the same time (usually handling the completion of a Future).
It is currently not possible to build custom asynchronous processing stages.
Stage | Emits when | Backpressures when | Completes when
mapAsync | the Future returned by the provided function finishes for the next element in sequence | the number of futures reaches the configured parallelism and the downstream backpressures | upstream completes and all futures have been completed and all elements have been emitted [2]
mapAsyncUnordered | any of the Futures returned by the provided function complete | the number of futures reaches the configured parallelism and the downstream backpressures | upstream completes and all futures have been completed and all elements have been emitted [2]
[2] If a Future fails, the stream also fails (unless a different supervision strategy is applied).

1.14.3 Timer driven stages


These stages process elements using timers, delaying, dropping or grouping elements for certain time durations.
Stage | Emits when | Backpressures when | Completes when
takeWithin | an upstream element arrives | downstream backpressures | upstream completes or timer fires
dropWithin | after the timer fired and a new upstream element arrives | downstream backpressures | upstream completes
groupedWithin | the configured time elapses since the last group has been emitted | the group has been assembled (the duration elapsed) and downstream backpressures | upstream completes

It is currently not possible to build custom timer driven stages.

1.14.4 Backpressure aware stages


These stages are all expressible as a DetachedStage. These stages are aware of the backpressure provided by
their downstreams and able to adapt their behavior to that signal.
Stage | Emits when | Backpressures when | Completes when
conflate | downstream stops backpressuring and there is a conflated element available | never [3] | upstream completes
expand | downstream stops backpressuring | downstream backpressures | upstream completes
buffer (Backpressure) | downstream stops backpressuring and there is a pending element in the buffer | buffer is full | upstream completes and buffered elements have been drained
buffer (DropX) | downstream stops backpressuring and there is a pending element in the buffer | never [3] | upstream completes and buffered elements have been drained
buffer (Fail) | downstream stops backpressuring and there is a pending element in the buffer | fails the stream instead of backpressuring when buffer is full | upstream completes and buffered elements have been drained
[3] Except if the encapsulated computation is not fast enough.

1.14.5 Nesting and flattening stages


These stages either take a stream and turn it into a stream of streams (nesting) or they take a stream that contains
nested streams and turn them into a stream of elements instead (flattening).
It is currently not possible to build custom nesting or flattening stages.
Stage | Emits when | Backpressures when | Completes when
prefixAndTail | the configured number of prefix elements are available. Emits this prefix, and the rest as a substream | downstream backpressures or substream backpressures | prefix elements have been consumed and substream has been consumed
groupBy | an element for which the grouping function returns a group that has not yet been created. Emits the new group | there is an element pending for a group whose substream backpressures | upstream completes [4]
splitWhen | an element for which the provided predicate is true, opening and emitting a new substream for subsequent elements | there is an element pending for the next substream, but the previous is not fully consumed yet, or the substream backpressures | upstream completes [4]
splitAfter | an element passes through. When the provided predicate is true it emits the element and opens a new substream for subsequent elements | there is an element pending for the next substream, but the previous is not fully consumed yet, or the substream backpressures | upstream completes [4]
flatten (Concat) | the current consumed substream has an element available | downstream backpressures | upstream completes and all consumed substreams complete
[4] Until the end of stream it is not possible to know whether new substreams will be needed or not.

1.14.6 Fan-in stages


Most of these stages can be expressed as a FlexiMerge. These stages take multiple streams as their input and
provide a single output combining the elements from all of the inputs in different ways.
The custom fan-in stages that can be built currently are limited.
Stage | Emits when | Backpressures when | Completes when
merge | one of the inputs has an element available | downstream backpressures | all upstreams complete
mergePreferred | one of the inputs has an element available, preferring a defined input if multiple have elements available | downstream backpressures | all upstreams complete
zip | all of the inputs have an element available | downstream backpressures | any upstream completes
zipWith | all of the inputs have an element available | downstream backpressures | any upstream completes
concat | the current stream has an element available; if the current input completes, it tries the next one | downstream backpressures | all upstreams complete

1.14.7 Fan-out stages


Most of these stages can be expressed as a FlexiRoute. These have one input and multiple outputs. They
might route the elements between different outputs, or emit elements on multiple outputs at the same time.
The custom fan-out stages that can be built currently are limited.
Stage | Emits when | Backpressures when | Completes when
unzip | all of the outputs stop backpressuring and there is an input element available | any of the outputs backpressures | upstream completes
unzipWith | all of the outputs stop backpressuring and there is an input element available | any of the outputs backpressures | upstream completes
broadcast | all of the outputs stop backpressuring and there is an input element available | any of the outputs backpressures | upstream completes
balance | any of the outputs stops backpressuring; emits the element to the first available output | all of the outputs backpressure | upstream completes

1.15 Streams Cookbook


1.15.1 Introduction
This is a collection of patterns to demonstrate various usage of the Akka Streams API by solving small targeted
problems in the format of recipes. The purpose of this page is to give inspiration and ideas how to approach
various small tasks involving streams. The recipes in this page can be used directly as-is, but they are most
powerful as starting points: customization of the code snippets is warmly encouraged.
This part also serves as supplementary material for the main body of documentation. It is a good idea to have
this page open while reading the manual and look for examples demonstrating various streaming concepts as they
appear in the main body of documentation.
If you need a quick reference of the available processing stages used in the recipes see Overview of built-in stages
and their semantics.

1.15.2 Working with Flows


In this collection we show simple recipes that involve linear flows. The recipes in this section are rather general;
more targeted recipes are available as separate sections (Buffers and working with rate, Working with streaming
IO).
Logging elements of a stream
Situation: During development it is sometimes helpful to see what happens in a particular section of a stream.
The simplest solution is to use a map operation and println to print the elements received to the
console. While this recipe is rather simplistic, it is often suitable for a quick debug session.
mySource.map(elem -> {
  System.out.println(elem);
  return elem;
});

Another approach to logging is to use the log() operation, which allows configuring logging for elements flowing
through the stream as well as for completion and failure.
// customise log levels
mySource.log("before-map")
  .withAttributes(Attributes.createLogLevels(onElement, onFinish, onFailure))
  .map(i -> analyse(i));
// or provide custom logging adapter
final LoggingAdapter adapter = Logging.getLogger(system, "customLogger");
mySource.log("custom", adapter);

Flattening a stream of sequences


Situation: A stream is given as a stream of sequences of elements, but a stream of elements is needed instead,
streaming all the nested elements inside the sequences separately.
The mapConcat operation can be used to implement a one-to-many transformation of elements using a mapper
function in the form of In -> List<Out>. In this case we want to map a List of elements to the elements in
the collection itself, so we can just call mapConcat(l -> l).
Source<List<Message>, BoxedUnit> myData = someDataSource;
Source<Message, BoxedUnit> flattened = myData.mapConcat(i -> i);
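
For readers more familiar with java.util.stream, the same one-to-many idea corresponds roughly to flatMap with the identity mapping. The sketch below is only an analogy on strict collections, not Akka code; FlattenSketch and flatten are hypothetical names.

```java
import java.util.List;
import java.util.stream.Collectors;

public class FlattenSketch {
    // One-to-many mapping with the identity function: every inner list
    // contributes its elements, in order, to the flat result.
    public static <T> List<T> flatten(List<List<T>> nested) {
        return nested.stream()
            .flatMap(List::stream)
            .collect(Collectors.toList());
    }
}
```

Empty inner lists simply contribute nothing to the result, and the relative order of the remaining elements is kept.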

Draining a stream to a strict collection


Situation: A finite sequence of elements is given as a stream, but a strict Java collection is needed instead.
In this recipe we will use the grouped stream operation that groups incoming elements into a stream of limited
size collections (it can be seen as the almost opposite version of the Flattening a stream of sequences recipe we
showed before). By using grouped(MAX_ALLOWED_SIZE) we create a stream of groups with a maximum
size of MAX_ALLOWED_SIZE and then we take the first element of this stream by attaching a Sink.head().
What we get is a Future containing a sequence with all the elements of the original stream up to MAX_ALLOWED_SIZE
elements (further elements are dropped).
final Future<List<String>> strings = myData
  .grouped(MAX_ALLOWED_SIZE).runWith(Sink.head(), mat);
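
The effect of taking only the first group can be sketched on a plain Iterator. This is an illustration of the "up to the maximum size, the rest is dropped" behaviour only; FirstGroupSketch and firstGroup are hypothetical names, and the case of an empty stream is not modelled faithfully here.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class FirstGroupSketch {
    // Collects at most maxSize elements and ignores everything after them.
    public static <T> List<T> firstGroup(Iterator<T> elements, int maxSize) {
        List<T> group = new ArrayList<>();
        while (group.size() < maxSize && elements.hasNext()) {
            group.add(elements.next());
        }
        return group;
    }
}
```

If the input has fewer elements than the limit, all of them end up in the result; otherwise the result is cut off at the limit, which is why the limit should be chosen generously for this recipe.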

Calculating the digest of a ByteString stream


Situation: A stream of bytes is given as a stream of ByteStrings and we want to calculate the cryptographic
digest of the stream.
This recipe uses a PushPullStage to host a mutable MessageDigest class (part of the Java Cryptography
API) and update it with the bytes arriving from the stream. When the stream starts, the onPull handler of the
stage is called, which just bubbles up the pull event to its upstream. As a response to this pull, a ByteString
chunk will arrive (onPush) which we use to update the digest, then it will pull for the next chunk.
Eventually the stream of ByteStrings depletes and we get a notification about this event via
onUpstreamFinish. At this point we want to emit the digest value, but we cannot do it in this handler
directly. Instead we call ctx.absorbTermination() signalling to our context that we do not yet want to
finish. When the environment decides that we can emit further elements onPull is called again, and we see
ctx.isFinishing() returning true (since the upstream source has been depleted already). Since we only
want to emit a final element it is enough to call ctx.pushAndFinish passing the digest ByteString to be
emitted.
public PushPullStage<ByteString, ByteString> digestCalculator(String algorithm)
throws NoSuchAlgorithmException {
return new PushPullStage<ByteString, ByteString>() {
final MessageDigest digest = MessageDigest.getInstance(algorithm);
@Override
public SyncDirective onPush(ByteString chunk, Context<ByteString> ctx) {
digest.update(chunk.toArray());
return ctx.pull();
}
@Override
public SyncDirective onPull(Context<ByteString> ctx) {
if (ctx.isFinishing()) {
return ctx.pushAndFinish(ByteString.fromArray(digest.digest()));
} else {
return ctx.pull();


}
}
@Override
public TerminationDirective onUpstreamFinish(Context<ByteString> ctx) {
// If the stream is finished, we need to emit the last element in the onPull block.
// It is not allowed to directly emit elements from a termination block
// (onUpstreamFinish or onUpstreamFailure)
return ctx.absorbTermination();
}
};
}
final Source<ByteString, BoxedUnit> digest = data
.transform(() -> digestCalculator("SHA-256"));
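The stage relies on MessageDigest being incremental: updating it chunk by chunk yields the same digest as hashing all bytes at once. The following plain Java sketch shows just that property, without any Akka types; ChunkedDigest, digestOf and hex are hypothetical helper names introduced for the illustration.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;

final class ChunkedDigest {
  // Feeds every chunk into one MessageDigest (the onPush step of the stage)
  // and produces the digest once all chunks are consumed (the pushAndFinish step).
  static byte[] digestOf(List<byte[]> chunks, String algorithm) {
    final MessageDigest digest;
    try {
      digest = MessageDigest.getInstance(algorithm);
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalArgumentException("unknown algorithm: " + algorithm, e);
    }
    for (byte[] chunk : chunks) {
      digest.update(chunk);
    }
    return digest.digest();
  }

  // Renders bytes as lowercase hex for easy comparison.
  static String hex(byte[] bytes) {
    StringBuilder sb = new StringBuilder();
    for (byte b : bytes) {
      sb.append(String.format("%02x", b));
    }
    return sb.toString();
  }
}
```

Because the digest does not depend on where the chunk boundaries fall, the stream stage can process arbitrarily sized ByteStrings.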

Parsing lines from a stream of ByteStrings


Situation: A stream of bytes is given as a stream of ByteStrings containing lines terminated by line ending
characters (or, alternatively, containing binary frames delimited by a special delimiter byte sequence) which needs
to be parsed.
The Framing helper class contains a convenience method to parse messages from a stream of ByteStrings:
final Source<String, BoxedUnit> lines = rawData
.via(Framing.delimiter(ByteString.fromString("\r\n"), 100, true))
.map(b -> b.utf8String());
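To make the framing step less of a black box, here is a plain Java sketch of delimiter-based framing over arbitrarily split chunks. It does not claim to reproduce Framing.delimiter exactly: DelimiterFramer is a hypothetical class, and we assume that the third argument in the snippet above permits a final frame without a trailing delimiter.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class DelimiterFramer {
  private final byte[] delimiter;
  private final int maximumFrameLength;
  private byte[] buffer = new byte[0];

  DelimiterFramer(String delimiter, int maximumFrameLength) {
    if (delimiter.isEmpty()) {
      throw new IllegalArgumentException("delimiter must not be empty");
    }
    this.delimiter = delimiter.getBytes(StandardCharsets.UTF_8);
    this.maximumFrameLength = maximumFrameLength;
  }

  // Appends a chunk and returns every frame that is now complete.
  List<String> onChunk(byte[] chunk) {
    byte[] joined = Arrays.copyOf(buffer, buffer.length + chunk.length);
    System.arraycopy(chunk, 0, joined, buffer.length, chunk.length);
    List<String> frames = new ArrayList<>();
    int start = 0;
    int end;
    while ((end = indexOf(joined, delimiter, start)) >= 0) {
      frames.add(decode(joined, start, end));
      start = end + delimiter.length;
    }
    buffer = Arrays.copyOfRange(joined, start, joined.length);
    // The unfinished frame may end with a partial delimiter, so allow for it.
    if (buffer.length - (delimiter.length - 1) > maximumFrameLength) {
      throw new IllegalStateException("frame exceeds " + maximumFrameLength + " bytes");
    }
    return frames;
  }

  // Upstream finished: emit trailing, undelimited bytes as a last frame.
  List<String> onFinish() {
    if (buffer.length == 0) {
      return Collections.emptyList();
    }
    String last = decode(buffer, 0, buffer.length);
    buffer = new byte[0];
    return Collections.singletonList(last);
  }

  private String decode(byte[] bytes, int from, int to) {
    if (to - from > maximumFrameLength) {
      throw new IllegalStateException("frame exceeds " + maximumFrameLength + " bytes");
    }
    return new String(bytes, from, to - from, StandardCharsets.UTF_8);
  }

  private static int indexOf(byte[] haystack, byte[] needle, int from) {
    outer:
    for (int i = from; i + needle.length <= haystack.length; i++) {
      for (int j = 0; j < needle.length; j++) {
        if (haystack[i + j] != needle[j]) {
          continue outer;
        }
      }
      return i;
    }
    return -1;
  }
}
```

Frames are decoded only once they are complete, so a multi-byte UTF-8 character that is split across two chunks is still decoded correctly.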

Implementing reduce-by-key
Situation: Given a stream of elements, we want to calculate some aggregated value on different subgroups of the
elements.
The hello world of reduce-by-key style operations is wordcount which we demonstrate below. Given a stream
of words we first create a new stream wordStreams that groups the words according to the i -> i function,
i.e. now we have a stream of streams, where every substream will serve identical words.
To count the words, we need to process the stream of streams (the actual groups containing identical words). By
mapping over the groups and using fold (remember that fold automatically materializes and runs the stream it
is used on) we get a stream with elements of Future<Pair<String, Integer>>. Now all we need is to flatten this stream,
which can be achieved by calling mapAsync with i -> i identity function.
There is one tricky issue to be noted here. The careful reader probably noticed that we put a buffer between the
mapAsync() operation that flattens the stream of futures and the actual stream of futures. The reason for this is
that the substreams produced by groupBy() can only complete when the original upstream source completes.
This means that mapAsync() cannot pull for more substreams because it still waits on folding futures to finish,
but these futures never finish if the additional group streams are not consumed. This typical deadlock situation
is resolved by this buffer, which is either able to contain all the group streams (which ensures that they are already
running and folding) or fails with an explicit failure instead of a silent deadlock.
final int MAXIMUM_DISTINCT_WORDS = 1000;
// split the words into separate streams first
final Source<Pair<String, Source<String, BoxedUnit>>, BoxedUnit> wordStreams = words
.groupBy(i -> i);
// add counting logic to the streams
Source<Future<Pair<String, Integer>>, BoxedUnit> countedWords = wordStreams.map(pair -> {
final String word = pair.first();
final Source<String, BoxedUnit> wordStream = pair.second();


return wordStream.runFold(
new Pair<>(word, 0),
(acc, w) -> new Pair<>(word, acc.second() + 1), mat);
});
// get a stream of word counts
final Source<Pair<String, Integer>, BoxedUnit> counts = countedWords
.buffer(MAXIMUM_DISTINCT_WORDS, OverflowStrategy.fail())
.mapAsync(4, i -> i);

By extracting the parts specific to wordcount into


a groupKey function that defines the groups
a foldZero that defines the zero element used by the fold on the substream given the group key
a fold function that does the actual reduction
we get a generalized version below:
static public <In, K, Out> Flow<In, Pair<K, Out>, BoxedUnit> reduceByKey(
int maximumGroupSize,
Function<In, K> groupKey,
Function<K, Out> foldZero,
Function2<Out, In, Out> fold,
Materializer mat) {
Flow<In, Pair<K, Source<In, BoxedUnit>>, BoxedUnit> groupStreams = Flow.<In> create()
.groupBy(groupKey);
Flow<In, Future<Pair<K, Out>>, BoxedUnit> reducedValues = groupStreams.map(pair -> {
K key = pair.first();
Source<In, BoxedUnit> groupStream = pair.second();
return groupStream.runFold(new Pair<>(key, foldZero.apply(key)), (acc, elem) -> {
Out aggregated = acc.second();
return new Pair<>(key, fold.apply(aggregated, elem));
} , mat);
});
return reducedValues.buffer(maximumGroupSize, OverflowStrategy.fail()).mapAsync(4, i -> i);
}
final int MAXIMUM_DISTINCT_WORDS = 1000;
Source<Pair<String, Integer>, BoxedUnit> counts = words.via(reduceByKey(
MAXIMUM_DISTINCT_WORDS,
word -> word, // group by the word itself
key -> 0,
(count, elem) -> count + 1,
mat));

Note: Please note that the reduce-by-key version we discussed above is sequential, in other words it is NOT a
parallelization pattern like mapReduce and similar frameworks.
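Since the recipe is sequential, its result can be cross-checked against a plain loop over a strict collection. The sketch below keeps the groupKey, foldZero and fold roles from the text, but it is an Akka-free illustration with a hypothetical class name; the exception thrown for too many distinct keys only loosely mirrors the failing buffer.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;

final class ReduceByKey {
  static <In, K, Out> Map<K, Out> reduceByKey(
      Iterable<In> elements,
      int maximumGroupSize,
      Function<In, K> groupKey,
      Function<K, Out> foldZero,
      BiFunction<Out, In, Out> fold) {
    Map<K, Out> aggregates = new LinkedHashMap<>();
    for (In elem : elements) {
      K key = groupKey.apply(elem);
      if (!aggregates.containsKey(key)) {
        // A new group: fail loudly rather than grow without bound.
        if (aggregates.size() >= maximumGroupSize) {
          throw new IllegalStateException(
              "more than " + maximumGroupSize + " distinct keys");
        }
        aggregates.put(key, foldZero.apply(key));
      }
      aggregates.put(key, fold.apply(aggregates.get(key), elem));
    }
    return aggregates;
  }
}
```

With word -> word as the key, key -> 0 as the zero and (count, elem) -> count + 1 as the fold, the returned map holds the word counts in order of first appearance.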

Sorting elements to multiple groups with groupBy


Situation: The groupBy operation strictly partitions incoming elements, each element belongs to exactly one
group. Sometimes we want to map elements into multiple groups simultaneously.
To achieve the desired result, we attack the problem in two steps:


First, using a function topicMapper that gives a list of topics (groups) a message belongs to, we transform
our stream of Message to a stream of (Message, Topic) where for each topic the message belongs
to a separate pair will be emitted. This is achieved by using mapConcat.
Then we take this new stream of message topic pairs (containing a separate pair for each topic a given
message belongs to) and feed it into groupBy, using the topic as the group key.
final Function<Message, List<Topic>> topicMapper = m -> extractTopics(m);
final Source<Pair<Message, Topic>, BoxedUnit> messageAndTopic = elems
.mapConcat((Message msg) -> {
List<Topic> topicsForMessage = topicMapper.apply(msg);
// Create a (Msg, Topic) pair for each of the topics
// the message belongs to
return topicsForMessage
.stream()
.map(topic -> new Pair<Message, Topic>(msg, topic))
.collect(toList());
});
Source<Pair<Topic, Source<Message, BoxedUnit>>, BoxedUnit> multiGroups = messageAndTopic
.groupBy(pair -> pair.second())
.map(pair -> {
Topic topic = pair.first();
Source<Pair<Message, Topic>, BoxedUnit> topicStream = pair.second();
// chopping off the topic from the (Message, Topic) pairs
return new Pair<Topic, Source<Message, BoxedUnit>>(
topic,
topicStream.<Message> map(p -> p.first()));
});
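The two steps can be tried out on strict collections first. The following sketch is plain Java with hypothetical names (MultiGroup, groupByTopics); it only illustrates the expand-then-group idea and has none of the streaming behaviour of groupBy.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class MultiGroup {
  static <M, T> Map<T, List<M>> groupByTopics(
      List<M> messages, Function<M, List<T>> topicMapper) {
    Map<T, List<M>> groups = new LinkedHashMap<>();
    for (M msg : messages) {
      // Step 1 (mapConcat): one (message, topic) pair per topic of the message.
      for (T topic : topicMapper.apply(msg)) {
        // Step 2 (groupBy): route the pair by topic and keep only the message.
        groups.computeIfAbsent(topic, t -> new ArrayList<>()).add(msg);
      }
    }
    return groups;
  }
}
```

A message with two topics ends up in two groups, which is exactly what a single groupBy on the original stream cannot express.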

1.15.3 Working with Graphs


In this collection we show recipes that use stream graph elements to achieve various goals.
Triggering the flow of elements programmatically
Situation: Given a stream of elements we want to control the emission of those elements according to a trigger
signal. In other words, even if the stream would be able to flow (not being backpressured) we want to hold back
elements until a trigger signal arrives.
This recipe solves the problem by simply zipping the stream of Message elements with the stream of Trigger
signals. Since Zip produces pairs, we simply map the output stream selecting the first element of the pair.
Flow<Pair<Message, Trigger>, Message, BoxedUnit> takeMessage =
Flow.<Pair<Message, Trigger>> create().map(p -> p.first());
final RunnableGraph<Pair<TestPublisher.Probe<Trigger>, TestSubscriber.Probe<Message>>> g =
FlowGraph.factory().closed(triggerSource, messageSink,
(p, s) -> new Pair<TestPublisher.Probe<Trigger>, TestSubscriber.Probe<Message>>(p, s),
(builder, source, sink) -> {
final FanInShape2<Message, Trigger, Pair<Message, Trigger>> zip =
builder.graph(Zip.create());
builder.from(elements).to(zip.in0());
builder.from(source).to(zip.in1());
builder.from(zip.out()).via(takeMessage).to(sink);
});


Alternatively, instead of using a Zip, and then using map to get the first element of the pairs, we can avoid creating
the pairs in the first place by using ZipWith which takes a two argument function to produce the output element.
If this function returned a pair of its two arguments, it would behave exactly like Zip, so ZipWith is a
generalization of zipping.
final RunnableGraph<Pair<TestPublisher.Probe<Trigger>, TestSubscriber.Probe<Message>>> g =
FlowGraph.factory().closed(triggerSource, messageSink,
(p, s) -> new Pair<TestPublisher.Probe<Trigger>, TestSubscriber.Probe<Message>>(p, s),
(builder, source, sink) -> {
final FanInShape2<Message, Trigger, Message> zipWith =
builder.graph(ZipWith.create((msg, trigger) -> msg));
builder.from(elements).to(zipWith.in0());
builder.from(source).to(zipWith.in1());
builder.from(zipWith.out()).to(sink);
});

Balancing jobs to a fixed pool of workers


Situation: Given a stream of jobs and a worker process expressed as a Flow, create a pool of workers that
automatically balances incoming jobs to available workers, then merges the results.
We will express our solution as a function that takes a worker flow and the number of workers to be allocated and
gives a flow that internally contains a pool of these workers. To achieve the desired result we will create a Flow
from a graph.
The graph consists of a Balance node which is a special fan-out operation that tries to route elements to available
downstream consumers. In a for loop we wire all of our desired workers as outputs of this balancer element,
then we wire the outputs of these workers to a Merge element that will collect the results from the workers.
public static <In, Out> Flow<In, Out, BoxedUnit> balancer(
Flow<In, Out, BoxedUnit> worker, int workerCount) {
return Flow.factory().create(b -> {
boolean waitForAllDownstreams = true;
final UniformFanOutShape<In, In> balance =
b.graph(Balance.<In> create(workerCount, waitForAllDownstreams));
final UniformFanInShape<Out, Out> merge =
b.graph(Merge.<Out> create(workerCount));
for (int i = 0; i < workerCount; i++) {
b.flow(balance.out(i), worker, merge.in(i));
}
return new Pair(balance.in(), merge.out());
});
}
Flow<Message, Message, BoxedUnit> balancer = balancer(worker, 3);
Source<Message, BoxedUnit> processedJobs = data.via(balancer);
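For readers more familiar with java.util.concurrent, the shape of this graph resembles a fixed thread pool whose results are collected in completion order. The sketch below is only an analogy under that assumption: WorkerPool is a hypothetical class, and unlike Balance the executor queues all jobs up front instead of backpressuring the producer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

final class WorkerPool {
  static <In, Out> List<Out> process(
      List<In> jobs, Function<In, Out> worker, int workerCount) {
    ExecutorService pool = Executors.newFixedThreadPool(workerCount);
    try {
      // Jobs go to whichever thread is free (the Balance role).
      CompletionService<Out> merged = new ExecutorCompletionService<>(pool);
      for (In job : jobs) {
        merged.submit(() -> worker.apply(job));
      }
      // Results are taken in completion order (the Merge role).
      List<Out> results = new ArrayList<>();
      for (int i = 0; i < jobs.size(); i++) {
        results.add(merged.take().get());
      }
      return results;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e.getCause());
    } finally {
      pool.shutdown();
    }
  }
}
```

As with Balance and Merge, the order of the results is not guaranteed to match the order of the jobs.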

1.15.4 Working with rate


This collection of recipes demonstrates various patterns where rate differences between upstream and downstream
need to be handled by strategies other than simple backpressure.
Dropping elements
Situation: Given a fast producer and a slow consumer, we want to drop elements if necessary to not slow down
the producer too much.


This can be solved by using the most versatile rate-transforming operation, conflate. Conflate can be thought of
as a special fold operation that collapses multiple upstream elements into one aggregate element if needed to
keep the speed of the upstream unaffected by the downstream.
When the upstream is faster, the fold process of the conflate starts. This folding needs a zero element, which
is given by a seed function that takes the current element and produces a zero for the folding process. In our case
this is i -> i so our folding state starts from the message itself. The folder function is also special: given the
aggregate value (the last message) and the new element (the freshest element) our aggregate state becomes simply
the freshest element. This choice of functions results in a simple dropping operation.
final Flow<Message, Message, BoxedUnit> droppyStream =
Flow.of(Message.class).conflate(i -> i, (lastMessage, newMessage) -> newMessage);
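The interplay of the seed and the fold function may be easier to follow in a small state machine. The class below is a simplified, single-threaded model of the behaviour described above, not the Akka implementation; Conflater, onPush and onPull are names chosen for this sketch.

```java
import java.util.Optional;
import java.util.function.BiFunction;
import java.util.function.Function;

final class Conflater<In, S> {
  private final Function<In, S> seed;
  private final BiFunction<S, In, S> aggregate;
  private S state;
  private boolean hasState = false;

  Conflater(Function<In, S> seed, BiFunction<S, In, S> aggregate) {
    this.seed = seed;
    this.aggregate = aggregate;
  }

  // An upstream element arrives: it is never refused, only folded in.
  void onPush(In elem) {
    state = hasState ? aggregate.apply(state, elem) : seed.apply(elem);
    hasState = true;
  }

  // Downstream demand: hand over the pending aggregate, if any, and reset.
  Optional<S> onPull() {
    if (!hasState) {
      return Optional.empty();
    }
    S out = state;
    state = null;
    hasState = false;
    return Optional.of(out);
  }
}
```

With i -> i as the seed and (lastMessage, newMessage) -> newMessage as the fold, three pushes followed by one pull deliver only the third element; the first two are dropped.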

Dropping broadcast
Situation: The default Broadcast graph element is properly backpressured, but that means that a slow downstream consumer can hold back the other downstream consumers resulting in lowered throughput. In other words
the rate of Broadcast is the rate of its slowest downstream consumer. In certain cases it is desirable to allow
faster consumers to progress independently of their slower siblings by dropping elements if necessary.
One solution to this problem is to append a buffer element in front of all of the downstream consumers defining
a dropping strategy instead of the default Backpressure. This allows small temporary rate differences between
the different consumers (the buffer smooths out small rate variances), but also allows faster consumers to progress
by dropping from the buffer of the slow consumers if necessary.
// Makes a sink drop elements if too slow
public <T> Sink<T, Future<BoxedUnit>> droppySink(Sink<T, Future<BoxedUnit>> sink, int size) {
return Flow.<T> create()
.buffer(size, OverflowStrategy.dropHead())
.toMat(sink, Keep.right());
}
FlowGraph.factory().closed(builder -> {
final int outputCount = 3;
final UniformFanOutShape<Integer, Integer> bcast =
builder.graph(Broadcast.create(outputCount));
builder.from(builder.source(myData)).to(bcast);
builder.from(bcast).to(builder.sink(droppySink(mySink1, 10)));
builder.from(bcast).to(builder.sink(droppySink(mySink2, 10)));
builder.from(bcast).to(builder.sink(droppySink(mySink3, 10)));
});
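The dropHead strategy used by droppySink keeps the newest elements and discards the oldest one when the buffer is full. A minimal plain Java model of that policy, with the hypothetical class name DropHeadBuffer, could look like this:

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class DropHeadBuffer<T> {
  private final int capacity;
  private final Deque<T> queue = new ArrayDeque<>();

  DropHeadBuffer(int capacity) {
    if (capacity <= 0) {
      throw new IllegalArgumentException("capacity must be positive");
    }
    this.capacity = capacity;
  }

  // Producer side: never blocks; a full buffer loses its oldest element.
  void offer(T elem) {
    if (queue.size() == capacity) {
      queue.removeFirst();
    }
    queue.addLast(elem);
  }

  // Consumer side: returns the oldest surviving element, or null if empty.
  T poll() {
    return queue.pollFirst();
  }
}
```

A slow consumer therefore sees a gap in the sequence rather than slowing down its siblings.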

Collecting missed ticks


Situation: Given a regular (stream) source of ticks, instead of trying to backpressure the producer of the ticks we
want to keep a counter of the missed ticks instead and pass it down when possible.
We will use conflate to solve the problem. Conflate takes two functions:
A seed function that produces the zero element for the folding process that happens when the upstream is
faster than the downstream. In our case the seed function is a constant function that returns 0 since there
were no missed ticks at that point.
A fold function that is invoked when multiple upstream messages need to be collapsed to an aggregate
value due to the insufficient processing rate of the downstream. Our folding function simply increments the
currently stored count of the missed ticks so far.
As a result, we have a flow of Integer where the number represents the missed ticks. A number 0 means that we were
able to consume the tick fast enough (i.e. zero means: 1 non-missed tick + 0 missed ticks).


final Flow<Tick, Integer, BoxedUnit> missedTicks =
Flow.of(Tick.class).conflate(tick -> 0, (missed, tick) -> missed + 1);
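The counting can be replayed by hand with a tiny model. The class below is a single-threaded sketch of the seed and fold behaviour described above, with hypothetical names; it is not how Akka implements conflate.

```java
import java.util.OptionalInt;

final class MissedTickCounter {
  private int missed;
  private boolean pending = false;

  // A tick arrives: the first pending tick seeds the count with 0,
  // every further tick before the next pull counts as missed.
  void onTick() {
    missed = pending ? missed + 1 : 0;
    pending = true;
  }

  // Downstream demand: emit the count for the pending tick, if there is one.
  OptionalInt onPull() {
    if (!pending) {
      return OptionalInt.empty();
    }
    pending = false;
    return OptionalInt.of(missed);
  }
}
```

Three ticks followed by one pull yield 2 (one delivered tick plus two missed ones), while a tick that is pulled immediately yields 0.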

Create a stream processor that repeats the last element seen


Situation: Given a producer and consumer, where the rate of neither is known in advance, we want to ensure that
neither slows down the other: earlier unconsumed elements from the upstream are dropped if necessary,
and the last value is repeated for the downstream if necessary.
We have two options to implement this feature. In both cases we will use DetachedStage to build our
custom element (DetachedStage is specifically designed for rate translating elements just like conflate,
expand or buffer). In the first version we will use a provided initial value initial that will be used
to feed the downstream if no upstream element is ready yet. In the onPush() handler we just overwrite the
currentValue variable and immediately relieve the upstream by calling pull() (remember, implementations
of DetachedStage are not allowed to call push() as a response to onPush() or call pull() as a response
of onPull()). The downstream onPull handler is very similar, we immediately relieve the downstream by
emitting currentValue.
class HoldWithInitial<T> extends DetachedStage<T, T> {
private T currentValue;
public HoldWithInitial(T initial) {
currentValue = initial;
}
@Override
public UpstreamDirective onPush(T elem, DetachedContext<T> ctx) {
currentValue = elem;
return ctx.pull();
}
@Override
public DownstreamDirective onPull(DetachedContext<T> ctx) {
return ctx.push(currentValue);
}
}

While it is relatively simple, the drawback of the first version is that it needs an arbitrary initial element which is
not always possible to provide. Hence, we create a second version where the downstream might need to wait in
one single case: if the very first element is not yet available.
We introduce a boolean variable waitingFirstValue to denote whether the first element has been provided
or not (alternatively an Optional can be used for currentValue or if the element type is a subclass of
Object a null can be used for the same purpose). In the downstream onPull() handler the difference from the
previous version is that we call holdDownstream() if the first element is not yet available, thus blocking
our downstream. The upstream onPush() handler sets waitingFirstValue to false, and after checking whether
holdDownstream() has been called it either relieves the upstream producer, or both the upstream producer
and downstream consumer by calling pushAndPull().
class HoldWithWait<T> extends DetachedStage<T, T> {
private T currentValue = null;
private boolean waitingFirstValue = true;
@Override
public UpstreamDirective onPush(T elem, DetachedContext<T> ctx) {
currentValue = elem;
waitingFirstValue = false;
if (ctx.isHoldingDownstream()) {
return ctx.pushAndPull(currentValue);
} else {
return ctx.pull();


}
}
@Override
public DownstreamDirective onPull(DetachedContext<T> ctx) {
if (waitingFirstValue) {
return ctx.holdDownstream();
} else {
return ctx.push(currentValue);
}
}
}

Globally limiting the rate of a set of streams


Situation: Given a set of independent streams that we cannot merge, we want to globally limit the aggregate
throughput of the set of streams.
One possible solution uses a shared actor as the global limiter combined with mapAsync to create a reusable Flow
that can be plugged into a stream to limit its rate.
As the first step we define an actor that will do the accounting for the global rate limit. The actor maintains a
timer, a counter for pending permit tokens and a queue for possibly waiting participants. The actor has an open
and closed state. The actor is in the open state while it still has pending permits. Whenever a request for
permit arrives as a WantToPass message to the actor the number of available permits is decremented and we
notify the sender that it can pass by answering with a MayPass message. If the number of permits reaches
zero, the actor transitions to the closed state. In this state requests are not immediately answered, instead the
reference of the sender is added to a queue. Once the timer for replenishing the pending permits fires by sending a
ReplenishTokens message, we increment the pending permits counter and send a reply to each of the waiting
senders. If there are more waiting senders than permits available we will stay in the closed state.
public class Limiter extends AbstractActor {
public static class WantToPass {}
public static final WantToPass WANT_TO_PASS = new WantToPass();
public static class MayPass {}
public static final MayPass MAY_PASS = new MayPass();
public static class ReplenishTokens {}
public static final ReplenishTokens REPLENISH_TOKENS = new ReplenishTokens();
private final int maxAvailableTokens;
private final FiniteDuration tokenRefreshPeriod;
private final int tokenRefreshAmount;
private final List<ActorRef> waitQueue = new ArrayList<>();
private final Cancellable replenishTimer;
private int permitTokens;
public static Props props(int maxAvailableTokens, FiniteDuration tokenRefreshPeriod,
int tokenRefreshAmount) {
return Props.create(Limiter.class, maxAvailableTokens, tokenRefreshPeriod,
tokenRefreshAmount);
}
private Limiter(int maxAvailableTokens, FiniteDuration tokenRefreshPeriod,
int tokenRefreshAmount) {
this.maxAvailableTokens = maxAvailableTokens;
this.tokenRefreshPeriod = tokenRefreshPeriod;


this.tokenRefreshAmount = tokenRefreshAmount;
this.permitTokens = maxAvailableTokens;
this.replenishTimer = context().system().scheduler().schedule(
this.tokenRefreshPeriod,
this.tokenRefreshPeriod,
self(),
REPLENISH_TOKENS,
context().system().dispatcher(),
self());
receive(open());
}
PartialFunction<Object, BoxedUnit> open() {
return ReceiveBuilder
.match(ReplenishTokens.class, rt -> {
permitTokens = Math.min(permitTokens + tokenRefreshAmount, maxAvailableTokens);
})
.match(WantToPass.class, wtp -> {
permitTokens -= 1;
sender().tell(MAY_PASS, self());
if (permitTokens == 0) {
context().become(closed());
}
}).build();
}
PartialFunction<Object, BoxedUnit> closed() {
return ReceiveBuilder
.match(ReplenishTokens.class, rt -> {
permitTokens = Math.min(permitTokens + tokenRefreshAmount, maxAvailableTokens);
releaseWaiting();
})
.match(WantToPass.class, wtp -> {
waitQueue.add(sender());
})
.build();
}
private void releaseWaiting() {
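final List<ActorRef> toBeReleased = new ArrayList<>(permitTokens);
// Always take from the head of the queue: removing by an increasing index
// would skip every second waiter because the list shifts after each removal.
while (toBeReleased.size() < permitTokens && !waitQueue.isEmpty()) {
toBeReleased.add(waitQueue.remove(0));
}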
permitTokens -= toBeReleased.size();
toBeReleased.stream().forEach(ref -> ref.tell(MAY_PASS, self()));
if (permitTokens > 0) {
context().become(open());
}
}
@Override
public void postStop() {
replenishTimer.cancel();
waitQueue.stream().forEach(ref -> {
ref.tell(new Status.Failure(new IllegalStateException("limiter stopped")), self());
});
}
}

To create a Flow that uses this global limiter actor we use the mapAsync function with the combination of the
ask pattern. We also define a timeout, so if a reply is not received during the configured maximum wait period
the returned future from ask will fail, which will fail the corresponding stream as well.
public <T> Flow<T, T, BoxedUnit> limitGlobal(ActorRef limiter, FiniteDuration maxAllowedWait) {
final int parallelism = 4;
final Flow<T, T, BoxedUnit> f = Flow.create();
return f.mapAsync(parallelism, element -> {
final Timeout triggerTimeout = new Timeout(maxAllowedWait);
final Future<Object> limiterTriggerFuture =
Patterns.ask(limiter, Limiter.WANT_TO_PASS, triggerTimeout);
return limiterTriggerFuture.map(new Mapper<Object, T>() {
@Override
public T apply(Object parameter) {
return element;
}
}, system.dispatcher());
});
}

Note: The global actor used for limiting introduces a global bottleneck. You might want to assign a dedicated
dispatcher for this actor.
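The accounting done by the limiter actor can be studied in isolation from timers and messaging. The class below is a plain Java sketch of the open and closed behaviour described earlier, where the caller decides when a replenish happens; TokenBucket and its method names are hypothetical and stand in for the WantToPass and ReplenishTokens messages.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

final class TokenBucket<W> {
  private final int maxAvailableTokens;
  private final int tokenRefreshAmount;
  private final Queue<W> waitQueue = new ArrayDeque<>();
  private int permitTokens;

  TokenBucket(int maxAvailableTokens, int tokenRefreshAmount) {
    this.maxAvailableTokens = maxAvailableTokens;
    this.tokenRefreshAmount = tokenRefreshAmount;
    this.permitTokens = maxAvailableTokens;
  }

  // WantToPass: true means "may pass now"; otherwise the waiter is queued.
  boolean wantToPass(W waiter) {
    if (permitTokens > 0) {
      permitTokens--;
      return true;
    }
    waitQueue.add(waiter);
    return false;
  }

  // ReplenishTokens: refill up to the cap, then release waiters in FIFO order.
  List<W> replenish() {
    permitTokens = Math.min(permitTokens + tokenRefreshAmount, maxAvailableTokens);
    List<W> released = new ArrayList<>();
    while (permitTokens > 0 && !waitQueue.isEmpty()) {
      released.add(waitQueue.remove());
      permitTokens--;
    }
    return released;
  }
}
```

Keeping this logic free of actor concerns also makes it straightforward to unit test the fairness of the wait queue.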

1.15.5 Working with IO


Chunking up a stream of ByteStrings into limited size ByteStrings
Situation: Given a stream of ByteStrings we want to produce a stream of ByteStrings containing the same bytes
in the same sequence, but capping the size of ByteStrings. In other words we want to slice up ByteStrings into
smaller chunks if they exceed a size threshold.
This can be achieved with a single PushPullStage. The main logic of our stage is in emitChunkOrPull()
which implements the following logic:
if the buffer is empty, we pull for more bytes
if the buffer is non-empty, we split it according to the chunkSize. This gives the next chunk that we will
emit, and an empty or non-empty remaining buffer.
Both onPush() and onPull() call emitChunkOrPull(); the only difference is that the push handler also
stores the incoming chunk by appending to the end of the buffer.
class Chunker extends PushPullStage<ByteString, ByteString> {
private final int chunkSize;
private ByteString buffer = ByteString.empty();
public Chunker(int chunkSize) {
this.chunkSize = chunkSize;
}
@Override
public SyncDirective onPush(ByteString elem, Context<ByteString> ctx) {
buffer = buffer.concat(elem);
return emitChunkOrPull(ctx);
}
@Override
public SyncDirective onPull(Context<ByteString> ctx) {
return emitChunkOrPull(ctx);
}
public SyncDirective emitChunkOrPull(Context<ByteString> ctx) {


if (buffer.isEmpty()) {
return ctx.pull();
} else {
Tuple2<ByteString, ByteString> split = buffer.splitAt(chunkSize);
ByteString emit = split._1();
buffer = split._2();
return ctx.push(emit);
}
}
}
Source<ByteString, BoxedUnit> chunksStream =
rawBytes.transform(() -> new Chunker(CHUNK_LIMIT));
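Because the stage only pulls from upstream when its buffer is empty, every incoming ByteString is sliced on its own and chunks never span two incoming elements. Under that reading, the output of the stage can be predicted with the following plain Java sketch on byte arrays; ByteChunker and rechunk are hypothetical names used for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ByteChunker {
  // Splits each input element into slices of at most chunkSize bytes,
  // keeping the overall byte order unchanged.
  static List<byte[]> rechunk(List<byte[]> input, int chunkSize) {
    if (chunkSize <= 0) {
      throw new IllegalArgumentException("chunkSize must be positive");
    }
    List<byte[]> out = new ArrayList<>();
    for (byte[] elem : input) {
      for (int from = 0; from < elem.length; from += chunkSize) {
        int to = Math.min(from + chunkSize, elem.length);
        out.add(Arrays.copyOfRange(elem, from, to));
      }
    }
    return out;
  }
}
```

Note that the result only caps the chunk size; small incoming elements are passed on as they are and are not merged into larger chunks.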

Limit the number of bytes passing through a stream of ByteStrings


Situation: Given a stream of ByteStrings we want to fail the stream if more than a given maximum of bytes has
been consumed.
This recipe uses a PushStage to implement the desired feature. In the only handler we override, onPush()
we just update a counter and see if it gets larger than maximumBytes. If a violation happens we signal failure,
otherwise we forward the chunk we have received.
class ByteLimiter extends PushStage<ByteString, ByteString> {
final long maximumBytes;
private long count = 0;
public ByteLimiter(long maximumBytes) {
this.maximumBytes = maximumBytes;
}
@Override
public SyncDirective onPush(ByteString chunk, Context<ByteString> ctx) {
count += chunk.size();
if (count > maximumBytes) {
return ctx.fail(new IllegalStateException("Too many bytes"));
} else {
return ctx.push(chunk);
}
}
}
Flow<ByteString, ByteString, BoxedUnit> limiter =
Flow.of(ByteString.class).transform(() -> new ByteLimiter(SIZE_LIMIT));

Compact ByteStrings in a stream of ByteStrings


Situation: After a long stream of transformations, due to their immutable, structural sharing nature ByteStrings
may refer to multiple original ByteString instances unnecessarily retaining memory. As the final step of a transformation chain we want to have clean copies that are no longer referencing the original ByteStrings.
The recipe is a simple use of map, calling the compact() method of the ByteString elements. This does
copying of the underlying arrays, so it should be used only as the last element of a long chain.
Source<ByteString, BoxedUnit> compacted = rawBytes.map(bs -> bs.compact());

Injecting keep-alive messages into a stream of ByteStrings


Situation: Given a communication channel expressed as a stream of ByteStrings we want to inject keep-alive
messages but only if this does not interfere with normal traffic.

All this recipe needs is the MergePreferred element which is a version of a merge that is not fair. In other
words, whenever the merge can choose because multiple upstream producers have elements to produce it will
always choose the preferred upstream effectively giving it an absolute priority.
Flow<Tick, ByteString, BoxedUnit> tickToKeepAlivePacket =
Flow.of(Tick.class).conflate(tick -> keepAliveMessage, (msg, newTick) -> msg);
final Tuple3<
TestPublisher.Probe<Tick>,
TestPublisher.Probe<ByteString>,
TestSubscriber.Probe<ByteString>
> ticksDataRes =
FlowGraph.factory().closed3(ticks, data, sink,
(t, d, s) -> new Tuple3(t, d, s),
(builder, t, d, s) -> {
final int secondaryPorts = 1;
final MergePreferredShape<ByteString> unfairMerge =
builder.graph(MergePreferred.create(secondaryPorts));
// If data is available then no keepalive is injected
builder.from(d).to(unfairMerge.preferred());
builder.from(t).via(tickToKeepAlivePacket).to(unfairMerge.in(0));
builder.from(unfairMerge.out()).to(s);
}
).run(mat);
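The combination of conflate on the tick side and an unfair merge can be modelled with one queue and one flag. The class below is a simplified, single-threaded sketch of that behaviour and uses String payloads for brevity; KeepAliveInjector and its methods are hypothetical names, not Akka API.

```java
import java.util.ArrayDeque;
import java.util.Queue;

final class KeepAliveInjector {
  private final Queue<String> data = new ArrayDeque<>();
  private final String keepAliveMessage;
  private boolean keepAlivePending = false;

  KeepAliveInjector(String keepAliveMessage) {
    this.keepAliveMessage = keepAliveMessage;
  }

  // Normal traffic arrives on the preferred input.
  void onData(String msg) {
    data.add(msg);
  }

  // Ticks are conflated: any number of ticks leaves one pending keep-alive.
  void onTick() {
    keepAlivePending = true;
  }

  // Downstream demand: data always wins; a keep-alive fills the gaps.
  String poll() {
    String next = data.poll();
    if (next != null) {
      return next;
    }
    if (keepAlivePending) {
      keepAlivePending = false;
      return keepAliveMessage;
    }
    return null; // nothing to send right now
  }
}
```

Two ticks and one data element thus produce the data element first and a single keep-alive afterwards.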

1.16 Configuration
#####################################
# Akka Stream Reference Config File #
#####################################
akka {
stream {
# Default flow materializer settings
materializer {
# Initial size of buffers used in stream elements
initial-input-buffer-size = 4
# Maximum size of buffers used in stream elements
max-input-buffer-size = 16
# Fully qualified config path which holds the dispatcher configuration
# to be used by FlowMaterialiser when creating Actors.
# When this value is left empty, the default-dispatcher will be used.
dispatcher = ""
# Cleanup leaked publishers and subscribers when they are not used within a given
# deadline
subscription-timeout {
# when the subscription timeout is reached one of the following strategies on
# the "stale" publisher:
# cancel - cancel it (via `onError` or subscribing to the publisher and
#          `cancel()`ing the subscription right away
# warn   - log a warning statement about the stale element (then drop the
#          reference to it)
# noop   - do nothing (not recommended)
mode = cancel
# time after which a subscriber / publisher is considered stale and eligible
# for cancelation (see `akka.stream.subscription-timeout.mode`)


timeout = 5s
}
# Enable additional troubleshooting logging at DEBUG log level
debug-logging = off
# Maximum number of elements emitted in batch if downstream signals large demand
output-burst-limit = 1000
}
# Fully qualified config path which holds the dispatcher configuration
# to be used by FlowMaterialiser when creating Actors for IO operations,
# such as FileSource, FileSink and others.
file-io-dispatcher = "akka.stream.default-file-io-dispatcher"
default-file-io-dispatcher {
type = "Dispatcher"
executor = "thread-pool-executor"
throughput = 1
thread-pool-executor {
core-pool-size-min = 2
core-pool-size-factor = 2.0
core-pool-size-max = 16
}
}
}
}


CHAPTER TWO

AKKA HTTP

The Akka HTTP modules implement a full server- and client-side HTTP stack on top of akka-actor and akka-stream. It's not a web-framework but rather a more general toolkit for providing and consuming HTTP-based
services. While interaction with a browser is of course also in scope it is not the primary focus of Akka HTTP.
Akka HTTP follows a rather open design and many times offers several different API levels for doing the same
thing. You get to pick the API level of abstraction that is most suitable for your application. This means that, if
you have trouble achieving something using a high-level API, there's a good chance that you can get it done with
a low-level API, which offers more flexibility but might require you to write more application code.
Akka HTTP is structured into several modules:
akka-http-core A complete, mostly low-level, server- and client-side implementation of HTTP (incl. WebSockets). Includes a model of all things HTTP.
akka-http Higher-level functionality, like (un)marshalling, (de)compression as well as a powerful DSL for defining HTTP-based APIs on the server-side
akka-http-testkit A test harness and set of utilities for verifying server-side service implementations
akka-http-jackson Predefined glue-code for (de)serializing custom types from/to JSON with jackson

2.1 Configuration
Just like any other Akka module Akka HTTP is configured via Typesafe Config. Usually this means that you
provide an application.conf which contains all the application-specific settings that differ from the default
ones provided by the reference configuration files from the individual Akka modules.
These are the relevant default configuration values for the Akka HTTP modules.

2.1.1 akka-http-core
########################################
# akka-http-core Reference Config File #
########################################
# This is the reference config file that contains all the default settings.
# Make your edits/overrides in your application.conf.
akka.http {
server {
# The default value of the `Server` header to produce if no
# explicit `Server`-header was included in a response.
# If this value is the empty string and no header was included in
# the response, no `Server` header will be rendered at all.
server-header = akka-http/${akka.version}
# The time after which an idle connection will be automatically closed.
# Set to `infinite` to completely disable idle connection timeouts.
idle-timeout = 60 s
# The time period within which the TCP binding process must be completed.
# Set to `infinite` to disable.
bind-timeout = 1s
# The maximum number of concurrently accepted connections when using the
# `Http().bindAndHandle` methods.
#
# This setting doesn't apply to the `Http().bind` method which will still
# deliver an unlimited backpressured stream of incoming connections.
max-connections = 1024
# Enables/disables the addition of a `Remote-Address` header
# holding the clients (remote) IP address.
remote-address-header = off
# Enables/disables the addition of a `Raw-Request-URI` header holding the
# original raw request URI as the client has sent it.
raw-request-uri-header = off
# Enables/disables automatic handling of HEAD requests.
# If this setting is enabled the server dispatches HEAD requests as GET
# requests to the application and automatically strips off all message
# bodies from outgoing responses.
# Note that, even when this setting is off the server will never send
# out message bodies on responses to HEAD requests.
transparent-head-requests = on
# Enables/disables the returning of more detailed error messages to
# the client in the error response.
# Should be disabled for browser-facing APIs due to the risk of XSS attacks
# and (probably) enabled for internal or non-browser APIs.
# Note that akka-http will always produce log messages containing the full
# error details.
verbose-error-messages = off
# The initial size of the buffer to render the response headers in.
# Can be used for fine-tuning response rendering performance but probably
# doesn't have to be fiddled with in most applications.
response-header-size-hint = 512
# The requested maximum length of the queue of incoming connections.
# If the server is busy and the backlog is full the OS will start dropping
# SYN-packets and connection attempts may fail. Note, that the backlog
# size is usually only a maximum size hint for the OS and the OS can
# restrict the number further based on global limits.
backlog = 100
# If this setting is empty the server only accepts requests that carry a
# non-empty `Host` header. Otherwise it responds with `400 Bad Request`.
# Set to a non-empty value to be used in lieu of a missing or empty `Host`
# header to make the server accept such requests.
# Note that the server will never accept HTTP/1.1 request without a `Host`
# header, i.e. this setting only affects HTTP/1.1 requests with an empty
# `Host` header as well as HTTP/1.0 requests.
# Examples: `www.spray.io` or `example.com:8080`
default-host-header = ""
# Socket options to set for the listening socket. If a setting is left
# undefined, it will use whatever the default on the system is.
socket-options {
so-receive-buffer-size = undefined
so-send-buffer-size = undefined
so-reuse-address = undefined
so-traffic-class = undefined
tcp-keep-alive = undefined
tcp-oob-inline = undefined
tcp-no-delay = undefined
}
# Modify to tweak parsing settings on the server-side only.
parsing = ${akka.http.parsing}
}
client {
# The default value of the `User-Agent` header to produce if no
# explicit `User-Agent`-header was included in a request.
# If this value is the empty string and no header was included in
# the request, no `User-Agent` header will be rendered at all.
user-agent-header = akka-http/${akka.version}
# The time period within which the TCP connecting process must be completed.
connecting-timeout = 10s
# The time after which an idle connection will be automatically closed.
# Set to `infinite` to completely disable idle timeouts.
idle-timeout = 60 s
# The initial size of the buffer to render the request headers in.
# Can be used for fine-tuning request rendering performance but probably
# doesn't have to be fiddled with in most applications.
request-header-size-hint = 512

# The proxy configurations to be used for requests with the specified
# scheme.
proxy {
# Proxy settings for unencrypted HTTP requests
# Set to 'none' to always connect directly, 'default' to use the system
# settings as described in http://docs.oracle.com/javase/6/docs/technotes/guides/net/proxies
# or specify the proxy host, port and non proxy hosts as demonstrated
# in the following example:
# http {
#   host = myproxy.com
#   port = 8080
#   non-proxy-hosts = ["*.direct-access.net"]
# }
http = default
# Proxy settings for HTTPS requests (currently unsupported)
https = default
}
# Socket options to set for the listening socket. If a setting is left
# undefined, it will use whatever the default on the system is.
socket-options {
so-receive-buffer-size = undefined
so-send-buffer-size = undefined
so-reuse-address = undefined
so-traffic-class = undefined
tcp-keep-alive = undefined
tcp-oob-inline = undefined
tcp-no-delay = undefined
}

# Modify to tweak parsing settings on the client-side only.
parsing = ${akka.http.parsing}
}
host-connection-pool {
# The maximum number of parallel connections that a connection pool to a
# single host endpoint is allowed to establish. Must be greater than zero.
max-connections = 4
# The maximum number of times failed requests are attempted again,
# (if the request can be safely retried) before giving up and returning an error.
# Set to zero to completely disable request retries.
max-retries = 5

# The maximum number of open requests accepted into the pool across all
# materializations of any of its client flows.
# Protects against (accidentally) overloading a single pool with too many client flow materializations.
# Note that with N concurrent materializations the max number of open requests in the pool
# will never exceed N * max-connections * pipelining-limit.
# Must be a power of 2 and > 0!
max-open-requests = 32
# The maximum number of requests that are dispatched to the target host in
# batch-mode across a single connection (HTTP pipelining).
# A setting of 1 disables HTTP pipelining, since only one request per
# connection can be "in flight" at any time.
# Set to higher values to enable HTTP pipelining.
# This value must be > 0.
# (Note that, independently of this setting, pipelining will never be done
# on a connection that still has a non-idempotent request in flight.
# See http://tools.ietf.org/html/rfc7230#section-6.3.2 for more info.)
pipelining-limit = 1
# The time after which an idle connection pool (without pending requests)
# will automatically terminate itself. Set to `infinite` to completely disable idle timeouts.
idle-timeout = 30 s
# Modify to tweak client settings for host connection pools only.
client = ${akka.http.client}
}
# The (default) configuration of the HTTP message parser for the server and the client.
# IMPORTANT: These settings (i.e. children of `akka.http.parsing`) can't be directly
# overridden in `application.conf` to change the parser settings for client and server
# at the same time. Instead, override the concrete settings beneath
# `akka.http.server.parsing` and `akka.http.client.parsing`
# where these settings are copied to.
parsing {
# The limits for the various parts of the HTTP message parser.
max-uri-length             = 2k
max-method-length          = 16
max-response-reason-length = 64
max-header-name-length     = 64
max-header-value-length    = 8k
max-header-count           = 64
max-content-length         = 8m
max-chunk-ext-length       = 256
max-chunk-size             = 1m
# Sets the strictness mode for parsing request target URIs.
# The following values are defined:
#
# `strict`: RFC3986-compliant URIs are required,
#     a 400 response is triggered on violations
#
# `relaxed`: all visible 7-Bit ASCII chars are allowed
#
# `relaxed-with-raw-query`: like `relaxed` but additionally
#     the URI query is not parsed, but delivered as one raw string
#     as the `key` value of a single Query structure element.
#
uri-parsing-mode = strict
# Enables/disables the logging of warning messages in case an incoming
# message (request or response) contains an HTTP header which cannot be
# parsed into its high-level model class due to incompatible syntax.
# Note that, independently of this setting, akka-http will accept messages
# with such headers as long as the message as a whole would still be legal
# under the HTTP specification even without this header.
# If a header cannot be parsed into a high-level model instance it will be
# provided as a `RawHeader`.
# If logging is enabled it is performed with the configured
# `error-logging-verbosity`.
illegal-header-warnings = on
# Configures the verbosity with which message (request or response) parsing
# errors are written to the application log.
#
# Supported settings:
# `off`   : no log messages are produced
# `simple`: a condensed single-line message is logged
# `full`  : the full error details (potentially spanning several lines) are logged
error-logging-verbosity = full
# limits for the number of different values per header type that the
# header cache will hold
header-cache {
default = 12
Content-MD5 = 0
Date = 0
If-Match = 0
If-Modified-Since = 0
If-None-Match = 0
If-Range = 0
If-Unmodified-Since = 0
User-Agent = 32
}
}
}
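The two numeric constraints stated for `host-connection-pool` above (the power-of-two rule for `max-open-requests` and the `N * max-connections * pipelining-limit` bound) can be checked with a few lines of plain Java. This is only an illustrative sketch using the JDK; the class and method names (`PoolLimits`, `isValidMaxOpenRequests`, `openRequestBound`) are hypothetical and not part of Akka HTTP.

```java
// Hypothetical helper, not part of Akka HTTP: restates the documented
// constraints of the host-connection-pool settings as plain Java.
final class PoolLimits {
    private PoolLimits() {}

    // max-open-requests "must be a power of 2 and > 0"
    static boolean isValidMaxOpenRequests(int value) {
        return value > 0 && (value & (value - 1)) == 0;
    }

    // Bound from the reference config comment:
    // N * max-connections * pipelining-limit
    static long openRequestBound(int materializations, int maxConnections, int pipeliningLimit) {
        if (materializations < 0 || maxConnections <= 0 || pipeliningLimit <= 0) {
            throw new IllegalArgumentException("arguments out of range");
        }
        return (long) materializations * maxConnections * pipeliningLimit;
    }

    public static void main(String[] args) {
        System.out.println(isValidMaxOpenRequests(32)); // default value, prints true
        System.out.println(isValidMaxOpenRequests(48)); // not a power of two, prints false
        System.out.println(openRequestBound(2, 4, 1));  // 2 materializations with defaults, prints 8
    }
}
```

With the default values (`max-connections = 4`, `pipelining-limit = 1`), two concurrent materializations can therefore never have more than 8 requests open at once, well below the default `max-open-requests` of 32.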

2.1.2 akka-http
#######################################
# akka-http Reference Config File #
#######################################
# This is the reference config file that contains all the default settings.
# Make your edits/overrides in your application.conf.
akka.http.routing {
# Enables/disables the returning of more detailed error messages to the
# client in the error response
# Should be disabled for browser-facing APIs due to the risk of XSS attacks
# and (probably) enabled for internal or non-browser APIs
# (Note that akka-http will always produce log messages containing the full error details)
verbose-error-messages = off
# Enables/disables ETag and `If-Modified-Since` support for FileAndResourceDirectives
file-get-conditional = on
# Enables/disables the rendering of the "rendered by" footer in directory listings
render-vanity-footer = yes

# The maximum size between two requested ranges. Ranges with less space in between will be coalesced.
#
# When multiple ranges are requested, a server may coalesce any of the ranges that overlap or that are
# separated by a gap that is smaller than the overhead of sending multiple parts, regardless of the order
# in which the corresponding byte-range-spec appeared in the received Range header field. Since the typical
# overhead between parts of a multipart/byteranges payload is around 80 bytes, depending on the selected
# representation's media type and the chosen boundary parameter length, it can be less efficient to transfer
# many small disjoint parts than it is to transfer the entire selected representation.
range-coalescing-threshold = 80
# The maximum number of allowed ranges per request.
# Requests with more ranges will be rejected due to DOS suspicion.
range-count-limit = 16
# The maximum number of bytes per ByteString a decoding directive will produce
# for an entity data stream.
decode-max-bytes-per-chunk = 1m
# Fully qualified config path which holds the dispatcher configuration
# to be used by FlowMaterialiser when creating Actors for IO operations.
file-io-dispatcher = ${akka.stream.file-io-dispatcher}
}
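To make the effect of `range-coalescing-threshold` more concrete, here is a minimal, JDK-only sketch of range coalescing as described in the comment above: ranges that overlap, or that are separated by a gap smaller than the threshold, are merged regardless of the order in which they were requested. The class `RangeCoalescer` and its nested `ByteRange` type are hypothetical names for illustration; this is not the code Akka HTTP uses internally, and the exact comparison (`gap < threshold`) is an assumption based on the wording "ranges with less space in between will be coalesced".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of byte-range coalescing; not Akka HTTP code.
final class RangeCoalescer {

    /** An inclusive byte range, as in "bytes=first-last". */
    static final class ByteRange {
        final long first;
        final long last;

        ByteRange(long first, long last) {
            if (first < 0 || last < first) throw new IllegalArgumentException("invalid range");
            this.first = first;
            this.last = last;
        }

        @Override
        public String toString() {
            return first + "-" + last;
        }
    }

    /** Merges ranges that overlap or whose gap is smaller than the threshold (in bytes). */
    static List<ByteRange> coalesce(List<ByteRange> requested, long threshold) {
        // The order in the Range header does not matter, so sort by start offset first.
        List<ByteRange> sorted = new ArrayList<>(requested);
        sorted.sort(Comparator.comparingLong((ByteRange r) -> r.first));

        List<ByteRange> result = new ArrayList<>();
        for (ByteRange next : sorted) {
            if (result.isEmpty()) {
                result.add(next);
                continue;
            }
            ByteRange current = result.get(result.size() - 1);
            long gap = next.first - current.last - 1; // negative when the ranges overlap
            if (gap < threshold) {
                long mergedLast = Math.max(current.last, next.last);
                result.set(result.size() - 1, new ByteRange(current.first, mergedLast));
            } else {
                result.add(next);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<ByteRange> ranges = Arrays.asList(
            new ByteRange(500, 599), new ByteRange(0, 99), new ByteRange(150, 199));
        // 0-99 and 150-199 are 50 bytes apart (< 80) and get merged; 500-599 stays separate.
        System.out.println(coalesce(ranges, 80)); // prints [0-199, 500-599]
    }
}
```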

The other Akka HTTP modules do not offer any configuration via Typesafe Config.
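As a sketch of how overrides of the defaults listed above might look in an `application.conf`, consider the following fragment. The concrete values are arbitrary examples chosen for illustration, not recommendations. Note how the parser limit is overridden separately under `akka.http.server.parsing` and `akka.http.client.parsing`, as required by the comment on the `parsing` section.

```
# application.conf (illustrative overrides only; values are examples)
akka.http {
  server {
    idle-timeout = 120 s
    remote-address-header = on
    # parser settings must be overridden per side,
    # not directly under akka.http.parsing
    parsing {
      max-content-length = 16m
    }
  }
  client {
    parsing {
      max-content-length = 16m
    }
  }
  host-connection-pool {
    max-connections = 8
    # must be a power of 2 and > 0
    max-open-requests = 64
  }
}
```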

2.2 HTTP Model


The Akka HTTP model contains a deeply structured, fully immutable, case-class based model of all the major HTTP
data structures, like HTTP requests, responses and common headers. It lives in the akka-http-core module and
forms the basis for most of Akka HTTP's APIs.

2.2.1 Overview
Since akka-http-core provides the central HTTP data structures you will find the following import in quite a few
places around the code base (and probably your own code as well):
import akka.http.javadsl.model.*;
import akka.http.javadsl.model.headers.*;

This brings all of the most relevant types in scope, mainly:


HttpRequest and HttpResponse, the central message model
headers, the package containing all the predefined HTTP header models and supporting types
Supporting types like Uri, HttpMethods, MediaTypes, StatusCodes, etc.
A common pattern is that the model of a certain entity is represented by an immutable type (class or interface), while
the actual instances of the entity defined by the HTTP spec live in an accompanying class carrying the name of
the type plus a trailing plural s.

For example:
Predefined HttpMethod instances are available as static fields of the HttpMethods class.
Predefined HttpCharset instances are available as static fields of the HttpCharsets class.
Predefined HttpEncoding instances are available as static fields of the HttpEncodings class.
Predefined HttpProtocol instances are available as static fields of the HttpProtocols class.
Predefined MediaType instances are available as static fields of the MediaTypes class.
Predefined StatusCode instances are available as static fields of the StatusCodes class.

2.2.2 HttpRequest
HttpRequest and HttpResponse are the basic immutable classes representing HTTP messages.
An HttpRequest consists of
a method (GET, POST, etc.)
a URI
a list of headers
an entity (body data)
a protocol
Here are some examples of how to construct an HttpRequest:
// construct a simple GET request to `homeUri`
Uri homeUri = Uri.create("/home");
HttpRequest request1 = HttpRequest.create().withUri(homeUri);
// construct simple GET request to "/index" using helper methods
HttpRequest request2 = HttpRequest.GET("/index");
// construct simple POST request containing entity
ByteString data = ByteString.fromString("abc");
HttpRequest postRequest1 = HttpRequest.POST("/receive").withEntity(data);
// customize every detail of HTTP request
Authorization authorization = Authorization.basic("user", "pass");
HttpRequest complexRequest =
  HttpRequest.PUT("/user")
    .withEntity(HttpEntities.create(MediaTypes.TEXT_PLAIN.toContentType(), "abc"))
    .addHeader(authorization)
    .withProtocol(HttpProtocols.HTTP_1_0);

In its basic form HttpRequest.create creates an empty default GET request without headers which can then
be transformed using one of the withX methods, addHeader, or addHeaders. Each of those will create a new
immutable instance, so instances can be shared freely. There exist some overloads for HttpRequest.create
that simplify creating requests for common cases. Also, to aid readability, there are predefined alternatives for
create named after HTTP methods to create a request with a given method and uri directly.

2.2.3 HttpResponse
An HttpResponse consists of
a status code
a list of headers
an entity (body data)
a protocol
Here are some examples of how to construct an HttpResponse:
// simple OK response without data created using the integer status code
HttpResponse ok = HttpResponse.create().withStatus(200);
// 404 response created using the named StatusCode constant
HttpResponse notFound = HttpResponse.create().withStatus(StatusCodes.NOT_FOUND);
// 404 response with a body explaining the error
HttpResponse notFoundCustom =
  HttpResponse.create()
    .withStatus(404)
    .withEntity("Unfortunately, the resource couldn't be found.");
// A redirecting response containing an extra header
Location locationHeader = Location.create("http://example.com/other");
HttpResponse redirectResponse =
  HttpResponse.create()
    .withStatus(StatusCodes.FOUND)
    .addHeader(locationHeader);

In addition to the simple HttpEntities.create methods, which create an entity from a fixed String or
ByteString as shown here, the Akka HTTP model defines a number of subclasses of HttpEntity which
allow body data to be specified as a stream of bytes. All of these types can be created using the methods on
HttpEntities.

2.2.4 HttpEntity
An HttpEntity carries the data bytes of a message together with its Content-Type and, if known, its Content-Length. In Akka HTTP there are five different kinds of entities which model the various ways that message content
can be received or sent:
HttpEntityStrict The simplest entity, which is used when the entire entity is already available in memory. It wraps
a plain ByteString and represents a standard, unchunked entity with a known Content-Length.
HttpEntityDefault The general, unchunked HTTP/1.1 message entity. It has a known length and presents its
data as a Source[ByteString] which can be materialized only once. It is an error if the provided
source doesn't produce exactly as many bytes as specified. The distinction between HttpEntityStrict and
HttpEntityDefault is an API-only one. On the wire, both kinds of entities look the same.
HttpEntityChunked The model for HTTP/1.1 chunked content (i.e. sent with Transfer-Encoding: chunked).
The content length is unknown and the individual chunks are presented as a
Source[ChunkStreamPart]. A ChunkStreamPart is either a non-empty chunk or the empty
last chunk containing optional trailer headers. The stream consists of zero or more non-empty chunk parts
and can be terminated by an optional last chunk.
HttpEntityCloseDelimited An unchunked entity of unknown length that is implicitly delimited by closing the
connection (Connection: close). Content data is presented as a Source[ByteString]. Since
the connection must be closed after sending an entity of this type, it can only be used on the server-side for
sending a response. Also, the main purpose of CloseDelimited entities is compatibility with HTTP/1.0
peers, which do not support chunked transfer encoding. If you are building a new application and are
not constrained by legacy requirements, you shouldn't rely on CloseDelimited entities, since implicit
terminate-by-connection-close is not a robust way of signaling response end, especially in the presence of
proxies. Additionally, this type of entity prevents connection reuse, which can seriously degrade performance. Use HttpEntityChunked instead!
HttpEntityIndefiniteLength A streaming entity of unspecified length for use in a Multipart.BodyPart.

Entity types HttpEntityStrict, HttpEntityDefault, and HttpEntityChunked are subtypes of RequestEntity, which allows them to be used for requests and responses.
In contrast, HttpEntityCloseDelimited can only be used for responses.
Streaming entity types (i.e. all but HttpEntityStrict) cannot be shared or serialized. To create a strict,
sharable copy of an entity or message use HttpEntity.toStrict or HttpMessage.toStrict, which
return a Future of the object with the body data collected into a ByteString.
The class HttpEntities contains static methods to create entities from common types easily.
You can use the isX methods of HttpEntity to find out which subclass an entity belongs to if you want to
provide special handling for each of the subtypes. However, in many cases a recipient of an HttpEntity doesn't
care which subtype an entity is (and how data is transported exactly on the HTTP layer). Therefore, the
general method HttpEntity.getDataBytes() is provided which returns a Source<ByteString, ?>
that allows access to the data of an entity regardless of its concrete subtype.
Note:
When to use which subtype?
Use HttpEntityStrict if the amount of data is small and already available in memory (e.g. as
a String or ByteString)
Use HttpEntityDefault if the data is generated by a streaming data source and the size of the
data is known
Use HttpEntityChunked for an entity of unknown length
Use HttpEntityCloseDelimited for a response as a legacy alternative to
HttpEntityChunked if the client doesn't support chunked transfer encoding. Otherwise
use HttpEntityChunked!
In a Multipart.BodyPart use HttpEntityIndefiniteLength for content of unknown
length.
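The selection rules in the note above can be summarized as a small decision function. The following JDK-only sketch is purely illustrative: the class `EntityChoice`, its `Kind` enum and the boolean parameters are hypothetical names, and the enum constants merely stand in for the corresponding HttpEntity subtypes.

```java
// Hypothetical decision helper mirroring the "When to use which subtype?" note.
// The enum constants stand in for HttpEntityStrict, HttpEntityDefault,
// HttpEntityChunked, HttpEntityCloseDelimited and HttpEntityIndefiniteLength.
final class EntityChoice {
    enum Kind { STRICT, DEFAULT, CHUNKED, CLOSE_DELIMITED, INDEFINITE_LENGTH }

    private EntityChoice() {}

    static Kind choose(boolean smallAndInMemory,
                       boolean lengthKnown,
                       boolean multipartBodyPart,
                       boolean peerSupportsChunked) {
        if (smallAndInMemory) return Kind.STRICT;            // data already available as String/ByteString
        if (lengthKnown) return Kind.DEFAULT;                // streamed data with a known size
        if (multipartBodyPart) return Kind.INDEFINITE_LENGTH; // unknown length inside a multipart body part
        // Unknown length: prefer chunked; close-delimited is a legacy fallback
        // that is only valid for responses.
        return peerSupportsChunked ? Kind.CHUNKED : Kind.CLOSE_DELIMITED;
    }

    public static void main(String[] args) {
        System.out.println(choose(true, true, false, true));    // prints STRICT
        System.out.println(choose(false, false, false, true));  // prints CHUNKED
        System.out.println(choose(false, false, false, false)); // prints CLOSE_DELIMITED
    }
}
```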
Caution: When you receive a non-strict message from a connection, additional data is only read from
the network when you request it by consuming the entity data stream. This means that, if you don't consume
the entity stream, the connection will effectively be stalled. In particular, no subsequent message (request
or response) will be read from the connection as the entity of the current message blocks the stream. Therefore you must make sure that you always consume the entity data, even if you are not actually
interested in it!

2.2.5 Header Model


Akka HTTP contains a rich model of the most common HTTP headers. Parsing and rendering are done automatically so that applications don't need to care about the actual syntax of headers. Headers not modelled explicitly are
represented as a RawHeader (which is essentially a String/String name/value pair).
See these examples of how to deal with headers:
// create a `Location` header
Location locationHeader = Location.create("http://example.com/other");
// create an `Authorization` header with HTTP Basic authentication data
Authorization authorization = Authorization.basic("user", "pass");
// a method that extracts basic HTTP credentials from a request
private Option<BasicHttpCredentials> getCredentialsOfRequest(HttpRequest request) {
  Option<Authorization> auth = request.getHeader(Authorization.class);
  if (auth.isDefined() && auth.get().credentials() instanceof BasicHttpCredentials)
    return Option.some((BasicHttpCredentials) auth.get().credentials());
  else
    return Option.none();
}

2.2.6 HTTP Headers


When the Akka HTTP server receives an HTTP request it tries to parse all its headers into their respective model
classes. Independently of whether this succeeds or not, the HTTP layer will always pass on all received headers
to the application. Unknown headers as well as ones with invalid syntax (according to the header parser) will be
made available as RawHeader instances. For the ones exhibiting parsing errors a warning message is logged
depending on the value of the illegal-header-warnings config setting.
Some headers have special status in HTTP and are therefore treated differently from regular headers:
Content-Type The Content-Type of an HTTP message is modeled as the contentType field of the
HttpEntity. The Content-Type header therefore doesn't appear in the headers sequence of a
message. Also, a Content-Type header instance that is explicitly added to the headers of a request or
response will not be rendered onto the wire; a warning is logged instead!
Transfer-Encoding Messages with Transfer-Encoding: chunked are represented as an
HttpEntityChunked entity. As such, chunked messages that do not have another deeper nested
transfer encoding will not have a Transfer-Encoding header in their headers list. Similarly, a
Transfer-Encoding header instance that is explicitly added to the headers of a request or response
will not be rendered onto the wire; a warning is logged instead!
Content-Length The content length of a message is modelled via its HttpEntity. As such, no Content-Length
header will ever be part of a message's header sequence. Similarly, a Content-Length header instance
that is explicitly added to the headers of a request or response will not be rendered onto the wire;
a warning is logged instead!
Server A Server header is usually added automatically to any response and its value can be configured via the
akka.http.server.server-header setting. Additionally, an application can override the configured header with a custom one by adding it to the response's header sequence.
User-Agent A User-Agent header is usually added automatically to any request and its value can be configured
via the akka.http.client.user-agent-header setting. Additionally, an application can override
the configured header with a custom one by adding it to the request's header sequence.
Date The Date response header is added automatically but can be overridden by supplying it manually.
Connection On the server-side Akka HTTP watches for explicitly added Connection: close response
headers and as such honors the potential wish of the application to close the connection after the respective
response has been sent out. The actual logic for determining whether to close the connection is quite
involved. It takes into account the request's method, protocol and potential Connection header as well
as the response's protocol, entity and potential Connection header. See this test for a full table of what
happens when.
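The treatment of Content-Type, Content-Length and Transfer-Encoding described above can be pictured as a filter over the explicitly added headers. The following JDK-only sketch only models that behaviour for illustration; the class `SpecialHeaders`, the method `renderable` and the warning text are hypothetical and do not reflect Akka HTTP's actual renderer or log output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical model of the rule that entity-related headers are taken from
// the HttpEntity and never rendered from the explicit header list.
final class SpecialHeaders {
    private static final Set<String> ENTITY_MANAGED = new HashSet<>(
        Arrays.asList("content-type", "content-length", "transfer-encoding"));

    private SpecialHeaders() {}

    /** Returns the header names that would be rendered; collects a warning for each ignored one. */
    static List<String> renderable(List<String> explicitHeaderNames, List<String> warnings) {
        List<String> result = new ArrayList<>();
        for (String name : explicitHeaderNames) {
            // Header names are case-insensitive in HTTP.
            if (ENTITY_MANAGED.contains(name.toLowerCase(Locale.ROOT))) {
                warnings.add("Explicitly set header '" + name + "' is ignored");
            } else {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> warnings = new ArrayList<>();
        List<String> rendered =
            renderable(Arrays.asList("Content-Type", "Location", "Server"), warnings);
        System.out.println(rendered);        // prints [Location, Server]
        System.out.println(warnings.size()); // prints 1
    }
}
```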

2.2.7 Parsing / Rendering


Parsing and rendering of HTTP data structures is heavily optimized and for most types there's currently no public
API provided to parse (or render to) Strings or byte arrays.

2.3 Low-Level Server-Side API


Apart from the HTTP client, Akka HTTP also provides an embedded, Reactive-Streams-based, fully asynchronous
HTTP/1.1 server implemented on top of Akka Stream.
It sports the following features:
Full support for HTTP persistent connections
Full support for HTTP pipelining
Full support for asynchronous HTTP streaming including chunked transfer encoding accessible through
an idiomatic API
Optional SSL/TLS encryption
WebSocket support
The server-side components of Akka HTTP are split into two layers:
1. The basic low-level server implementation in the akka-http-core module
2. Higher-level functionality in the akka-http module
The low-level server (1) is scoped with a clear focus on the essential functionality of an HTTP/1.1 server:
Connection management
Parsing and rendering of messages and headers
Timeout management (for requests and connections)
Response ordering (for transparent pipelining support)
All non-core features of typical HTTP servers (like request routing, file serving, compression, etc.) are left to the
higher layers; they are not implemented by the akka-http-core-level server itself. Apart from providing a clear focus,
this design keeps the server core small and light-weight as well as easy to understand and maintain.
Depending on your needs you can either use the low-level API directly or rely on the high-level Routing DSL
which can make the definition of more complex service logic much easier.

2.3.1 Streams and HTTP


The Akka HTTP server is implemented on top of Akka Stream and makes heavy use of it - in its implementation
as well as on all levels of its API.
On the connection level Akka HTTP offers basically the same kind of interface as Akka Stream IO: A socket
binding is represented as a stream of incoming connections. The application pulls connections from this stream
source and, for each of them, provides a Flow<HttpRequest, HttpResponse, ?> to translate requests
into responses.
Apart from regarding a socket bound on the server-side as a Source<IncomingConnection> and each
connection as a Source<HttpRequest> with a Sink<HttpResponse> the stream abstraction is also
present inside a single HTTP message: The entities of HTTP requests and responses are generally modeled as
a Source<ByteString>. See also the HTTP Model for more information on how HTTP messages are represented in Akka HTTP.

2.3.2 Starting and Stopping


On the most basic level an Akka HTTP server is bound by invoking the bind method of the akka.http.javadsl.Http
extension:
ActorSystem system = ActorSystem.create();
Materializer materializer = ActorMaterializer.create(system);

Source<IncomingConnection, Future<ServerBinding>> serverSource =
  Http.get(system).bind("localhost", 8080, materializer);

Future<ServerBinding> serverBindingFuture =
  serverSource.to(Sink.foreach(
    new Procedure<IncomingConnection>() {
      @Override
      public void apply(IncomingConnection connection) throws Exception {
        System.out.println("Accepted new connection from " + connection.remoteAddress());
        // ... and then actually handle the connection
      }
    })).run(materializer);

Arguments to the Http().bind method specify the interface and port to bind to and register interest in handling
incoming HTTP connections. Additionally, the method also allows for the definition of socket options as well as
a larger number of settings for configuring the server according to your needs.
The result of the bind method is a Source<Http.IncomingConnection> which must be drained by the
application in order to accept incoming connections. The actual binding is not performed before this source is
materialized as part of a processing pipeline. In case the bind fails (e.g. because the port is already busy) the
materialized stream will immediately be terminated with a respective exception. The binding is released (i.e. the
underlying socket unbound) when the subscriber of the incoming connection source has cancelled its subscription.
Alternatively one can use the unbind() method of the Http.ServerBinding instance that is created as part
of the connection source's materialization process. The Http.ServerBinding also provides a way to get a
hold of the actual local address of the bound socket, which is useful for example when binding to port zero (and
thus letting the OS pick an available port).

2.3.3 Request-Response Cycle


When a new connection has been accepted it will be published as an Http.IncomingConnection which
consists of the remote address and methods to provide a Flow<HttpRequest, HttpResponse, ?> to
handle requests coming in over this connection.
Requests are handled by calling one of the handleWithXXX methods with a handler, which can either be
a Flow<HttpRequest, HttpResponse, ?> for handleWith,
a function Function<HttpRequest, HttpResponse> for handleWithSyncHandler,
a function Function<HttpRequest, Future<HttpResponse>> for handleWithAsyncHandler.

Here is a complete example:


ActorSystem system = ActorSystem.create();
final Materializer materializer = ActorMaterializer.create(system);

Source<IncomingConnection, Future<ServerBinding>> serverSource =
  Http.get(system).bind("localhost", 8080, materializer);

final Function<HttpRequest, HttpResponse> requestHandler =
  new Function<HttpRequest, HttpResponse>() {
    private final HttpResponse NOT_FOUND =
      HttpResponse.create()
        .withStatus(404)
        .withEntity("Unknown resource!");

    @Override
    public HttpResponse apply(HttpRequest request) throws Exception {
      Uri uri = request.getUri();
      if (request.method() == HttpMethods.GET) {
        if (uri.path().equals("/"))
          return
            HttpResponse.create()
              .withEntity(MediaTypes.TEXT_HTML.toContentType(),
                "<html><body>Hello world!</body></html>");
        else if (uri.path().equals("/hello")) {
          String name = Util.getOrElse(uri.parameter("name"), "Mister X");
          return
            HttpResponse.create()
              .withEntity("Hello " + name + "!");
        }
        else if (uri.path().equals("/ping"))
          return HttpResponse.create().withEntity("PONG!");
        else
          return NOT_FOUND;
      }
      else return NOT_FOUND;
    }
  };

Future<ServerBinding> serverBindingFuture =
  serverSource.to(Sink.foreach(
    new Procedure<IncomingConnection>() {
      @Override
      public void apply(IncomingConnection connection) throws Exception {
        System.out.println("Accepted new connection from " + connection.remoteAddress());

        connection.handleWithSyncHandler(requestHandler, materializer);
        // this is equivalent to
        //connection.handleWith(Flow.of(HttpRequest.class).map(requestHandler), materializer);
      }
    })).run(materializer);

In this example, a request is handled by transforming the request stream with a function
Function<HttpRequest, HttpResponse> using handleWithSyncHandler (or equivalently, the
Akka Streams map operator). Depending on the use case many other ways of providing a request handler are
conceivable using Akka Streams combinators.
If the application provides a Flow it is also the responsibility of the application to generate exactly one response
for every request and that the ordering of responses matches the ordering of the associated requests (which is
relevant if HTTP pipelining is enabled where processing of multiple incoming requests may overlap). When
relying on handleWithSyncHandler or handleWithAsyncHandler, or the map or mapAsync stream
operators, this requirement will be automatically fulfilled.
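The ordering requirement can be illustrated without Akka. The following is a minimal plain-JDK sketch, not Akka HTTP code: the class OrderedResponses and its methods are hypothetical names, and CompletableFuture merely stands in for an asynchronous handler. All handlers are started up front, so processing may overlap as with pipelining, but responses are collected strictly in request order, which is the property the text attributes to map and mapAsync.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical model, not part of Akka HTTP: exactly one response per request, in request order.
class OrderedResponses {
    static <Req, Res> List<Res> handleAll(
            List<Req> requests, Function<Req, CompletableFuture<Res>> handler) {
        // Start every handler first so that processing of several requests may overlap.
        List<CompletableFuture<Res>> inFlight = new ArrayList<>();
        for (Req request : requests) {
            inFlight.add(handler.apply(request));
        }
        // Collect results in the order of the requests, not in completion order.
        List<Res> responses = new ArrayList<>();
        for (CompletableFuture<Res> future : inFlight) {
            responses.add(future.join());
        }
        return responses;
    }

    // Example handler in which the first request takes longer than the later ones.
    static CompletableFuture<String> slowFirst(String path) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(path.equals("/a") ? 50 : 5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "response for " + path;
        });
    }
}
```

A hand-written Flow that emitted responses in completion order instead would break this correspondence between requests and responses.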
See Routing DSL Overview for a more convenient high-level DSL to create request handlers.
Streaming Request/Response Entities
Streaming of HTTP message entities is supported through subclasses of HttpEntity. The application needs
to be able to deal with streamed entities when receiving a request as well as, in many cases, when constructing
responses. See HttpEntity for a description of the alternatives.
Closing a connection
The HTTP connection will be closed when the handling Flow cancels its upstream subscription or the peer closes
the connection. An often more convenient alternative is to explicitly add a Connection: close header
to an HttpResponse. This response will then be the last one on the connection and the server will actively close
the connection when it has been sent out.

2.3.4 Server-Side HTTPS Support


Akka HTTP supports TLS encryption on the server-side as well as on the client-side.
The central vehicle for configuring encryption is the HttpsContext, which can be created using the static
method HttpsContext.create which is defined like this:
public static HttpsContext create(SSLContext sslContext,
Option<Collection<String>> enabledCipherSuites,


Option<Collection<String>> enabledProtocols,
Option<ClientAuth> clientAuth,
Option<SSLParameters> sslParameters)

On the server-side the bind and bindAndHandleXXX methods of the akka.http.javadsl.Http extension define an optional httpsContext parameter, which can receive the HTTPS configuration in the form of an
HttpsContext instance. If it is defined, encryption is enabled on all accepted connections. Otherwise it is disabled (which is the default).
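The first and last arguments of HttpsContext.create are plain JDK types. As a rough sketch, assuming default JDK key and trust material is acceptable for experimentation, they could be prepared as follows. The class TlsSetup and its method names are hypothetical, and the call to HttpsContext.create itself is left out because it needs Akka HTTP on the classpath.

```java
import java.security.GeneralSecurityException;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLParameters;

// Prepares only the JDK-side objects; handing them to HttpsContext.create is not shown.
class TlsSetup {
    static SSLContext defaultTlsContext() {
        try {
            SSLContext context = SSLContext.getInstance("TLS");
            // null arguments select the JDK's default key managers, trust managers and RNG.
            // A real server would pass key managers loaded from its own key store instead.
            context.init(null, null, null);
            return context;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("TLS is not available in this JDK", e);
        }
    }

    // Derives SSLParameters that demand a client certificate during the handshake.
    static SSLParameters requireClientAuth(SSLContext context) {
        SSLParameters parameters = context.getDefaultSSLParameters();
        parameters.setNeedClientAuth(true);
        return parameters;
    }
}
```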

2.3.5 Stand-Alone HTTP Layer Usage


It is currently only possible to use the HTTP server layer with Scala in a stand-alone fashion. See http-serverlayer-scala and #18027 for the plan to add Java support.

2.4 Server-Side WebSocket Support


WebSocket is a protocol that provides a bi-directional channel between a browser and a web server, usually run over
an upgraded HTTP(S) connection. Data is exchanged in messages, where a message can either be binary data
or Unicode text.
Akka HTTP provides a stream-based implementation of the WebSocket protocol that hides the low-level details of
the underlying binary framing wire-protocol and provides a simple API to implement services using WebSocket.

2.4.1 Model
The basic unit of data exchange in the WebSocket protocol is a message. A message can either be a binary message,
i.e. a sequence of octets, or a text message, i.e. a sequence of Unicode code points.
In the data model the two kinds of messages, binary and text messages, are represented by the two classes
BinaryMessage and TextMessage deriving from a common superclass Message. The superclass
Message contains isText and isBinary methods to distinguish a message and asBinaryMessage and
asTextMessage methods to cast a message.
The subclasses BinaryMessage and TextMessage contain methods to access the data. Take the API of
TextMessage as an example (BinaryMessage is very similar with String replaced by ByteString):
abstract class TextMessage extends Message {
/**
* Returns a source of the text message data.
*/
def getStreamedText: Source[String, _]
/** Is this message a strict one? */
def isStrict: Boolean
/**
* Returns the strict message text if this message is strict, throws otherwise.
*/
def getStrictText: String
}

The data of a message is provided as a stream because WebSocket messages do not have a predefined size and
could (in theory) be infinitely long. However, only one message can be open per direction of the WebSocket
connection, so that many application level protocols will want to make use of the delineation into (small) messages
to transport single application-level data units like one event or one chat message.
Many messages are small enough to be sent or received in one go. As an opportunity for optimization, the
model provides the notion of a strict message to represent cases where a whole message was received in one
go. If TextMessage.isStrict returns true, the complete data is already available and can be accessed with
TextMessage.getStrictText (analogously for BinaryMessage).
When receiving data from the network connection the WebSocket implementation tries to create a strict message
whenever possible, i.e. when the complete data was received in one chunk. However, the actual chunking of
messages over a network connection and through the various streaming abstraction layers is not deterministic
from the perspective of the application. Therefore, application code must be able to handle both streaming and
strict messages and not expect certain messages to be strict. (Particularly, note that tests against localhost will
behave differently than tests against remote peers where data is received over a physical network connection.)
For sending data, you can use the static TextMessage.create(String) method to create a strict message if the complete message has already been assembled.
Otherwise, use TextMessage.create(Source<String, ?>) to create a streaming message from an Akka Stream
source.

2.4.2 Server API


The entry point for the WebSocket API is the synthetic UpgradeToWebsocket header, which is added to a
request if Akka HTTP encounters a WebSocket upgrade request.
The WebSocket specification mandates that details of the WebSocket connection are negotiated by placing special-purpose HTTP headers into the request and response of the HTTP upgrade. In Akka HTTP these HTTP-level details
of the WebSocket handshake are hidden from the application and don't need to be managed manually.
Instead, the synthetic UpgradeToWebsocket header represents a valid WebSocket upgrade request. An application can detect a WebSocket upgrade request by looking for the UpgradeToWebsocket header. It
can choose to accept the upgrade and start a WebSocket connection by responding to that request with
an HttpResponse generated by one of the UpgradeToWebsocket.handleMessagesWith methods. In its most general form this method expects two arguments: first, a handler Flow<Message,
Message, ?> that will be used to handle WebSocket messages on this connection. Second, the application can optionally choose one of the proposed application-level sub-protocols by inspecting the values of UpgradeToWebsocket.getRequestedProtocols and passing the chosen protocol value to
handleMessagesWith.
Handling Messages
A message handler is expected to be implemented as a Flow<Message, Message, ?>. For typical request-response scenarios this fits very well and such a Flow can be constructed from a simple function by using
Flow.<Message>create().map or Flow.<Message>create().mapAsync.
There are other use cases, e.g. a server-push model, where a server message is sent spontaneously, or a true bidirectional scenario where input and output aren't logically connected. Providing the handler as a Flow in these
cases may not fit. Instead, an overload of UpgradeToWebsocket.handleMessagesWith is provided
which allows passing an output-generating Source<Message, ?> and an input-receiving Sink<Message,
?> independently.
Note that a handler is required to consume the data stream of each message to make room for new messages.
Otherwise, subsequent messages may get stuck and message traffic in this direction will stall.
Example
Let's look at an example.
WebSocket requests come in like any other requests. In the example, requests to /greeter are expected to be
WebSocket requests:
public static HttpResponse handleRequest(HttpRequest request) {
System.out.println("Handling request to " + request.getUri());
if (request.getUri().path().equals("/greeter"))


return Websocket.handleWebsocketRequestWith(request, greeter());


else
return HttpResponse.create().withStatus(404);
}

It uses the helper method akka.http.javadsl.model.ws.Websocket.handleWebsocketRequestWith,
which can be used if only WebSocket requests are expected. The method looks for the UpgradeToWebsocket
header and returns a response that will install the passed WebSocket handler if the header is found. If the request
is not a WebSocket request it will return a 400 Bad Request error response.
In the example, the passed handler expects text messages where each message is expected to contain a (person's)
name and then responds with another text message that contains a greeting:
/**
* A handler that treats incoming messages as a name,
* and responds with a greeting to that name
*/
public static Flow<Message, Message, BoxedUnit> greeter() {
return
Flow.<Message>create()
.collect(new JavaPartialFunction<Message, Message>() {
@Override
public Message apply(Message msg, boolean isCheck) throws Exception {
if (isCheck)
if (msg.isText()) return null;
else throw noMatch();
else
return handleTextMessage(msg.asTextMessage());
}
});
}
public static TextMessage handleTextMessage(TextMessage msg) {
if (msg.isStrict()) // optimization that directly creates a simple response...
return TextMessage.create("Hello "+msg.getStrictText());
else // ... this would suffice to handle all text messages in a streaming fashion
return TextMessage.create(Source.single("Hello ").concat(msg.getStreamedText()));
}

2.4.3 Routing support


The routing DSL provides the handleWebsocketMessages directive to install a WebSocket handler if a
request is a WebSocket request. Otherwise, the directive rejects the request.
Let's look at how the above example can be rewritten using the high-level routing DSL.
Instead of writing the request handler manually, the routing behavior of the app is defined by a route that uses the
handleWebsocketMessages directive in place of Websocket.handleWebsocketRequestWith:
@Override
public Route createRoute() {
return
path("greeter").route(
handleWebsocketMessages(greeter())
);
}

The handling code itself will be the same as with using the low-level API.
See the full routing example.


2.5 High-level Server-Side API


To use the high-level API you need to add a dependency on the akka-http-experimental module.

2.5.1 Routing DSL Overview


The Akka HTTP Low-Level Server-Side API provides a Flow- or Function-level interface that allows an application to respond to incoming HTTP requests by simply mapping requests to responses (excerpt from Low-level
server side example):
final Function<HttpRequest, HttpResponse> requestHandler =
new Function<HttpRequest, HttpResponse>() {
private final HttpResponse NOT_FOUND =
HttpResponse.create()
.withStatus(404)
.withEntity("Unknown resource!");

@Override
public HttpResponse apply(HttpRequest request) throws Exception {
Uri uri = request.getUri();
if (request.method() == HttpMethods.GET) {
if (uri.path().equals("/"))
return
HttpResponse.create()
.withEntity(MediaTypes.TEXT_HTML.toContentType(),
"<html><body>Hello world!</body></html>");
else if (uri.path().equals("/hello")) {
String name = Util.getOrElse(uri.parameter("name"), "Mister X");
return
HttpResponse.create()
.withEntity("Hello " + name + "!");
}
else if (uri.path().equals("/ping"))
return HttpResponse.create().withEntity("PONG!");
else
return NOT_FOUND;
}
else return NOT_FOUND;
}
};

While it'd be perfectly possible to define a complete REST API service purely by inspecting the incoming
HttpRequest, this approach becomes somewhat unwieldy for larger services due to the amount of syntax ceremony required. Also, it doesn't help in keeping your service definition as DRY as you might like.
As an alternative Akka HTTP provides a flexible DSL for expressing your service behavior as a structure of
composable elements (called Directives) in a concise and readable way. Directives are assembled into a so-called
route structure which, at its top-level, can be used to create a handler Flow (or, alternatively, an async handler
function) that can be directly supplied to a bind call.
Here's the complete example rewritten using the composable high-level API:
import akka.actor.ActorSystem;
import akka.http.javadsl.model.MediaTypes;
import akka.http.javadsl.server.*;
import akka.http.javadsl.server.values.Parameters;

import java.io.IOException;

public class HighLevelServerExample extends HttpApp {


public static void main(String[] args) throws IOException {


// boot up server using the route as defined below
ActorSystem system = ActorSystem.create();
// HttpApp.bindRoute expects a route being provided by HttpApp.createRoute
new HighLevelServerExample().bindRoute("localhost", 8080, system);
System.out.println("Type RETURN to exit");
System.in.read();
system.shutdown();
}
// A RequestVal is a type-safe representation of some aspect of the request.
// In this case it represents the `name` URI parameter of type String.
private RequestVal<String> name = Parameters.stringValue("name").withDefault("Mister X");
@Override
public Route createRoute() {
// This handler generates responses to `/hello?name=XXX` requests
Route helloRoute =
handleWith1(name,
// in Java 8 the following becomes simply
// (ctx, name) -> ctx.complete("Hello " + name + "!")
new Handler1<String>() {
@Override
public RouteResult apply(RequestContext ctx, String name) {
return ctx.complete("Hello " + name + "!");
}
});
return
// here the complete behavior for this server is defined
route(
// only handle GET requests
get(
// matches the empty path
pathSingleSlash().route(
// return a constant string with a certain content type
complete(MediaTypes.TEXT_HTML.toContentType(),
"<html><body>Hello world!</body></html>")
),
path("ping").route(
// return a simple `text/plain` response
complete("PONG!")
),
path("hello").route(
// uses the route defined above
helloRoute
)
)
);
}
}

The heart of the high-level architecture is the route tree. It is a big expression of type Route that is evaluated only
once during startup time of your service. It completely describes how your service should react to any request.
The type Route is the basic building block of the route tree. It defines if and how a request should be handled.
Routes are composed to form the route tree in the following two ways.
A route can be wrapped by a Directive which adds some behavioral aspect to its wrapped inner route.
path("ping") is such a directive that implements a path filter, i.e. it only passes control to its inner route
when the unmatched path matches "ping". Directives can be more versatile than this: A directive can also transform the request before passing it into its inner route or transform a response that comes out of its inner route. It's
a general and powerful abstraction that allows packaging any kind of HTTP processing into well-defined blocks
that can be freely combined. akka-http defines a library of predefined directives and routes for all the various
aspects of dealing with HTTP requests and responses.
Read more about Directives.
The other way of composition is defining a list of Route alternatives. Alternative routes are tried one after the
other until one route accepts the request and provides a response. Otherwise, a route can also reject a request,
in which case further alternatives are explored. Alternatives are specified by passing a list of routes either to
Directive.route() as in pathSingleSlash().route() or to directives that directly take a variable
number of inner routes as argument like get() here.
Read more about Routes.
Another important building block is a RequestVal<T>. It represents a value that can be extracted from a
request (like the URI parameter Parameters.stringValue("name") in the example) and which is then
interpreted as a value of type T. Examples of HTTP aspects represented by a RequestVal are URI parameters,
HTTP form fields, details of the request like headers, URI, the entity, or authentication data.
Read more about Request values.
The actual application-defined processing of a request is defined with a Handler instance or by specifying a
handling method with reflection. A handler can receive the value of any request values and is converted into a
Route by using one of the BasicDirectives.handleWith directives.
Read more about Handlers.
Requests or responses often contain data that needs to be interpreted or rendered in some way. Akka-http provides
the abstraction of Marshaller and Unmarshaller that define how domain model objects map to HTTP
entities.
Read more about Marshalling & Unmarshalling.
akka-http contains a testkit that simplifies testing routes. It allows running test requests against (sub-)routes quickly
without running them over the network and helps with writing assertions on HTTP response properties.
Read more about Route Testkit.

2.5.2 Routes
A Route itself is a function that operates on a RequestContext and returns a RouteResult. The
RequestContext is a data structure that contains the current request and auxiliary data like the so far unmatched path of the request URI that gets passed through the route structure. It also contains the current
ExecutionContext and akka.stream.Materializer, so that these don't have to be passed around
manually.
RequestContext
The RequestContext achieves two goals: it allows access to request data and it is a factory for creating a
RouteResult. A user-defined handler (see Handlers) that is usually used at the leaf position of the route tree
receives a RequestContext, evaluates its content and then returns a result generated by one of the methods of
the context.
RouteResult
The RouteResult is an opaque structure that represents possible results of evaluating a route. A
RouteResult can only be created by using one of the methods of the RequestContext. A result can
either be a response, if it was generated by one of the completeX methods, an eventual result, i.e. a
Future<RouteResult>, if completeWith was used, or a rejection that contains information about why the
route could not handle the request.


Composing Routes
Routes are composed to form the route tree in two principal ways.
A route can be wrapped by a Directive which adds some behavioral aspect to its wrapped inner route. Such
an aspect can be
filtering requests to decide which requests will get to the inner route
transforming the request before passing it to the inner route
transforming the response (or more generally the route result) received from the inner route
applying side-effects around inner route processing, such as measuring the time taken to run the inner route
akka-http defines a library of predefined Directives and routes for all the various aspects of dealing with HTTP
requests and responses.
The other way of composition is defining a list of Route alternatives. Alternative routes are tried one after the
other until one route accepts the request and provides a response. Otherwise, a route can also reject a request,
in which case further alternatives are explored. Alternatives are specified by passing a list of routes either to
Directive.route() as in path("xyz").route() or to directives that directly take a variable number
of inner routes as argument like get().
The Routing Tree
Essentially, when you combine routes via nesting and alternative, you build a routing structure that forms a tree.
When a request comes in it is injected into this tree at the root and flows down through all the branches in a
depth-first manner until either some node completes it or it is fully rejected.
Consider this schematic example:
val route =
a.route(
b.route(
c.route(
... // route 1
),
d.route(
... // route 2
),
... // route 3
),
e.route(
... // route 4
)
)

Here five directives form a routing tree.


Route 1 will only be reached if directives a, b and c all let the request pass through.
Route 2 will run if a and b pass, c rejects and d passes.
Route 3 will run if a and b pass, but c and d reject.
Route 3 can therefore be seen as a catch-all route that only kicks in, if routes chained into preceding positions
reject. This mechanism can make complex filtering logic quite easy to implement: simply put the most specific
cases up front and the most general cases in the back.
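The evaluation order of the schematic tree can be reproduced with a small plain-JDK model. This is a sketch only: Route and directive below are hypothetical stand-ins and not the akka-http types, a request is reduced to a string, and each directive a to e is assumed to let a request pass when the string contains its letter.

```java
import java.util.Optional;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical stand-ins for Route and Directive: a route either completes a request
// with a response (Optional.of) or rejects it (Optional.empty).
class RoutingTreeModel {
    interface Route extends Function<String, Optional<String>> {}

    // Alternatives: try each route in order until one completes the request.
    static Route route(Route... alternatives) {
        return request -> {
            for (Route alternative : alternatives) {
                Optional<String> result = alternative.apply(request);
                if (result.isPresent()) {
                    return result;
                }
            }
            return Optional.empty();
        };
    }

    // A filtering directive: hands the request to its inner routes only if the test holds.
    static Route directive(Predicate<String> passes, Route... inner) {
        Route alternatives = route(inner);
        return request -> {
            if (!passes.test(request)) {
                return Optional.empty();
            }
            return alternatives.apply(request);
        };
    }

    static Route complete(String response) {
        return request -> Optional.of(response);
    }

    // Same shape as the schematic example above.
    static final Route TREE =
        directive(r -> r.contains("a"),
            directive(r -> r.contains("b"),
                directive(r -> r.contains("c"), complete("route 1")),
                directive(r -> r.contains("d"), complete("route 2")),
                complete("route 3")),
            directive(r -> r.contains("e"), complete("route 4")));
}
```

In this model a request that passes a and b but neither c nor d ends up in route 3, while a request rejected by a is rejected by the whole tree.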

2.5.3 Directives
A directive is a wrapper for a route or a list of alternative routes that adds one or more of the following pieces of functionality
to its nested route(s):
it filters the request and lets only matching requests pass (e.g. the get directive lets only GET-requests pass)
it modifies the request or the RequestContext (e.g. the path directive filters on the unmatched path
and then passes on a RequestContext with an updated unmatched path)
it modifies the response coming out of the nested route
akka-http provides a set of predefined directives for various tasks. You can access them by either extending from
akka.http.javadsl.server.AllDirectives or by importing them statically with import static
akka.http.javadsl.server.Directives.*;.
These classes of directives are currently defined:
BasicDirectives Contains methods to create routes that complete with static values or allow specifying Handlers
to process a request.
CacheConditionDirectives Contains a single directive conditional that wraps its inner route with support
for Conditional Requests as defined by RFC 7232.
CodingDirectives Contains directives to decode compressed requests and encode responses.
CookieDirectives Contains a single directive setCookie to aid adding a cookie to a response.
ExecutionDirectives Contains directives to deal with exceptions that occurred during routing.
FileAndResourceDirectives Contains directives to serve resources from files on the file system or from the classpath.
HostDirectives Contains directives to filter on the Host header of the incoming request.
MethodDirectives Contains directives to filter on the HTTP method of the incoming request.
MiscDirectives Contains directives that validate a request by user-defined logic.
PathDirectives Contains directives to match and filter on the URI path of the incoming request.
RangeDirectives Contains a single directive withRangeSupport that adds support for retrieving partial responses.
SchemeDirectives Contains a single directive scheme to filter requests based on the URI scheme (http vs. https).
WebsocketDirectives Contains directives to support answering Websocket requests.
PathDirectives
Path directives are the most basic building blocks for routing requests depending on the URI path.
When a request (or rather the respective RequestContext instance) enters the route structure it has an unmatched path that is identical to the request.uri.path. As it descends the routing tree and passes through
one or more pathPrefix or path directives the unmatched path progressively gets eaten into from the left
until, in most cases, it eventually has been consumed completely.
The two main directives are path and pathPrefix. The path directive tries to match the complete remaining
unmatched path against the specified path matchers, the pathPrefix directive only matches a prefix and
passes the remaining unmatched path to nested directives. Both directives automatically match a slash from the
beginning, so that matching slashes in a hierarchy of nested pathPrefix and path directives is usually not
needed.
Path directives take a variable amount of arguments. Each argument must be a PathMatcher or a string (which
is automatically converted to a path matcher using PathMatchers.segment). In the case of path and
pathPrefix, if multiple arguments are supplied, a slash is assumed between any of the supplied path matchers.
The rawPathX variants of those directives, on the other hand, do no such preprocessing, so that slashes must be
matched manually.
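How the unmatched path is consumed can be sketched with two plain string functions. The class UnmatchedPathModel is a hypothetical, simplified model and not the akka-http implementation: it handles a single constant segment only, and it assumes that a segment must end at a segment boundary.

```java
import java.util.Optional;

// Hypothetical model of how path directives consume the unmatched path.
class UnmatchedPathModel {
    // pathPrefix: strips a leading slash plus the segment and returns the remaining
    // unmatched path, or an empty Optional if the prefix does not match.
    static Optional<String> pathPrefix(String unmatched, String segment) {
        String prefix = "/" + segment;
        if (!unmatched.startsWith(prefix)) {
            return Optional.empty();
        }
        String rest = unmatched.substring(prefix.length());
        // Assumption of this model: "/admin" must not match "/administrator".
        if (rest.isEmpty() || rest.startsWith("/")) {
            return Optional.of(rest);
        }
        return Optional.empty();
    }

    // path: like pathPrefix, but additionally requires that nothing remains unmatched.
    static boolean path(String unmatched, String segment) {
        return pathPrefix(unmatched, segment).map(String::isEmpty).orElse(false);
    }
}
```

With this model, pathPrefix applied to "/admin/user" with the segment "admin" leaves "/user" for nested directives, whereas path with the same arguments does not match because "/user" remains unmatched.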


Path Matchers

A path matcher is a description of a part of a path to match. The simplest path matcher is
PathMatchers.segment which matches exactly one path segment against the supplied constant string.
Other path matchers defined in PathMatchers match the end of the path (PathMatchers.END), a single
slash (PathMatchers.SLASH), or nothing at all (PathMatchers.NEUTRAL).
Many path matchers are hybrids that can both match (by using them with one of the PathDirectives) and extract
values, i.e. they are Request values. Extracting a path matcher value (i.e. using it with handleWithX) is only
allowed if it is nested inside a path directive that uses that path matcher and so specifies at which position the value
should be extracted from the path.
Predefined path matchers allow extraction of various types of values:
PathMatchers.segment(String) Strings simply match themselves and extract no value. Note that
strings are interpreted as the decoded representation of the path, so if they include a / character this character will match %2F in the encoded raw URI!
PathMatchers.regex You can use a regular expression instance as a path matcher, which matches whatever
the regex matches and extracts one String value. A PathMatcher created from a regular expression extracts either the complete match (if the regex doesn't contain a capture group) or the capture group
(if the regex contains exactly one capture group). If the regex contains more than one capture group an
IllegalArgumentException will be thrown.
PathMatchers.SLASH Matches exactly one path-separating slash (/) character.
PathMatchers.END Matches the very end of the path, similar to $ in regular expressions.
PathMatchers.Segment Matches if the unmatched path starts with a path segment (i.e. not a slash). If so
the path segment is extracted as a String instance.
PathMatchers.Rest Matches and extracts the complete remaining unmatched part of the request's URI path
as an (encoded!) String. If you need access to the remaining decoded elements of the path use RestPath
instead.
PathMatchers.intValue Efficiently matches a number of decimal digits (unsigned) and extracts their (non-negative) Int value. The matcher will not match zero digits or a sequence of digits that would represent an
Int value larger than Integer.MAX_VALUE.
PathMatchers.longValue Efficiently matches a number of decimal digits (unsigned) and extracts their
(non-negative) Long value. The matcher will not match zero digits or a sequence of digits that would
represent a Long value larger than Long.MAX_VALUE.
PathMatchers.hexIntValue Efficiently matches a number of hex digits and extracts their (non-negative)
Int value. The matcher will not match zero digits or a sequence of digits that would represent an Int
value larger than Integer.MAX_VALUE.
PathMatchers.hexLongValue Efficiently matches a number of hex digits and extracts their (non-negative)
Long value. The matcher will not match zero digits or a sequence of digits that would represent a Long
value larger than Long.MAX_VALUE.
PathMatchers.uuid Matches and extracts a java.util.UUID instance.
PathMatchers.NEUTRAL A matcher that always matches, doesn't consume anything and extracts nothing.
Serves mainly as a neutral element in PathMatcher composition.
PathMatchers.segments Matches all remaining segments as a list of strings. Note that this can also be
no segments resulting in the empty list. If the path has a trailing slash this slash will not be matched, i.e.
remain unmatched and to be consumed by potentially nested directives.
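The capture-group rule given for PathMatchers.regex in the list above can be expressed with java.util.regex alone. The helper below is a hypothetical illustration of that rule and not the library's implementation. Anchoring the match at the start of the unmatched path is an assumption of this sketch.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper modelling the extraction rule described for PathMatchers.regex.
class RegexExtraction {
    static Optional<String> extract(Pattern regex, String unmatchedPath) {
        Matcher matcher = regex.matcher(unmatchedPath);
        int groups = matcher.groupCount();
        if (groups > 1) {
            throw new IllegalArgumentException(
                "Regex must not contain more than one capture group: " + regex);
        }
        // lookingAt() requires the match to begin at the start of the unmatched path.
        if (!matcher.lookingAt()) {
            return Optional.empty();
        }
        // Group 0 is the complete match, group 1 the single capture group.
        return Optional.ofNullable(matcher.group(groups));
    }
}
```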
Here's a collection of path matching examples:
// matches "/test"
path("test").route(
completeWithStatus(StatusCodes.OK)
);


// matches "/test", as well


path(PathMatchers.segment("test")).route(
completeWithStatus(StatusCodes.OK)
);
// matches "/admin/user"
path("admin", "user").route(
completeWithStatus(StatusCodes.OK)
);
// matches "/admin/user", as well
pathPrefix("admin").route(
path("user").route(
completeWithStatus(StatusCodes.OK)
)
);
// matches "/admin/user/<user-id>"
Handler1<Integer> completeWithUserId =
new Handler1<Integer>() {
@Override
public RouteResult apply(RequestContext ctx, Integer userId) {
return ctx.complete("Hello user " + userId);
}
};
PathMatcher<Integer> userId = PathMatchers.intValue();
pathPrefix("admin", "user").route(
path(userId).route(
handleWith1(userId, completeWithUserId)
)
);
// matches "/admin/user/<user-id>", as well
path("admin", "user", userId).route(
handleWith1(userId, completeWithUserId)
);
// never matches
path("admin").route( // oops this only matches "/admin"
path("user").route(
completeWithStatus(StatusCodes.OK)
)
);
// matches "/user/" with the first subroute, "/user" (without a trailing slash)
// with the second subroute, and "/user/<user-id>" with the last one.
pathPrefix("user").route(
pathSingleSlash().route(
completeWithStatus(StatusCodes.OK)
),
pathEnd().route(
completeWithStatus(StatusCodes.OK)
),
path(userId).route(
handleWith1(userId, completeWithUserId)
)
);


2.5.4 Request values


A request value of type RequestVal<T> is a typed structure that represents some aspect of the request that
can be interpreted as a value of type T. A RequestVal instance abstracts the knowledge about how to extract a
certain value from the request and interpret it as a T. It is used in combination with Handlers.
The advantages of representing a request detail as a RequestVal instead of performing ad-hoc analysis of a
request are:
you can define an inventory of HTTP primitives for your application that you can reuse in many places of
your application
automatic handling of errors when an expected value was not found in a request or if it could not be interpreted as the expected Java type
Note that the Scala version of the routing DSL has no direct counterpart to RequestVals. Instead, a Scala-side
Directive can have extractions that are reflected in the type of the Directive.
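The idea of a reusable, typed request value can be sketched in plain Java. The interface RequestValue and the factory methods below are hypothetical and much simpler than the real RequestVal: a request is reduced to a map of query parameters, and a failed extraction is modelled as an empty Optional rather than as a rejection.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical, simplified model of a request value: it knows how to extract a T
// from the query parameters of a request (represented here as a plain Map).
class RequestValueModel {
    interface RequestValue<T> {
        Optional<T> extract(Map<String, String> queryParameters);

        // Returns a request value that falls back to the given default when extraction fails.
        default RequestValue<T> withDefault(T fallback) {
            return parameters -> Optional.of(extract(parameters).orElse(fallback));
        }
    }

    static RequestValue<String> stringValue(String name) {
        return parameters -> Optional.ofNullable(parameters.get(name));
    }

    static RequestValue<Integer> intValue(String name) {
        return parameters -> {
            try {
                return Optional.ofNullable(parameters.get(name)).map(Integer::parseInt);
            } catch (NumberFormatException e) {
                // The parameter is present but cannot be interpreted as an int.
                return Optional.empty();
            }
        };
    }
}
```

A value such as stringValue("name").withDefault("Mister X") is defined once and can then be shared by any number of handlers, which corresponds to the first advantage listed above.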
Predefined Request values
akka-http provides a set of predefined request values for request data commonly accessed in a web service.
These request values are defined:
RequestVals Contains request values for basic data like URI components, request method, peer address, or the
entity data.
Cookies Contains request values representing cookies.
FormFields Contains request values to access form fields unmarshalled to various primitive Java types.
Headers Contains request values to access request headers or header values.
HttpBasicAuthenticator An abstract class to implement to create a request value representing an HTTP basic
authenticated principal.
Parameters Contains request values to access URI parameters unmarshalled to various primitive Java types.
PathMatchers Contains request values to match and access URI path segments.
CustomRequestVal An abstract class to implement arbitrary custom request values.

2.5.5 Handlers
Handlers implement the actual application-defined logic for a certain trace in the routing tree. Most of the leaves
of the routing tree will be routes created from handlers. Creating a Route from a handler is achieved using the
BasicDirectives.handleWith overloads. They come in several forms:
- with a single Handler argument and a variable number of RequestVal<?> (may be 0)
- with a number n of RequestVal<T1> arguments and a HandlerN<T1, .., TN> argument
- with a Class<?> and/or instance and a method name String argument and a variable number of
  RequestVal<?> (may be 0) arguments
Simple Handler
In its simplest form a Handler is a SAM class that defines application behavior by inspecting the
RequestContext and returning a RouteResult:
trait Handler extends akka.japi.function.Function[RequestContext, RouteResult] {
override def apply(ctx: RequestContext): RouteResult
}


Such a handler inspects the RequestContext it receives and uses the RequestContext's methods to create
a response:
Handler handler = new Handler() {
@Override
public RouteResult apply(RequestContext ctx) {
return ctx.complete("This was a " + ctx.request().method().value() +
" request to " + ctx.request().getUri());
}
};

The handler can include any kind of logic but must return a RouteResult in the end which can only be created
by using one of the RequestContext methods.
A handler instance can be used once or several times as shown in the full example:
class TestHandler extends akka.http.javadsl.server.AllDirectives {
Handler handler = new Handler() {
@Override
public RouteResult apply(RequestContext ctx) {
return ctx.complete("This was a " + ctx.request().method().value() +
" request to " + ctx.request().getUri());
}
};

Route createRoute() {
return route(
get(
handleWith(handler)
),
post(
path("abc").route(
handleWith(handler)
)
)
);
}
}
// actual testing code
TestRoute r = testRoute(new TestHandler().createRoute());
r.run(HttpRequest.GET("/test"))
.assertStatusCode(200)
.assertEntity("This was a GET request to /test");
r.run(HttpRequest.POST("/test"))
.assertStatusCode(404);
r.run(HttpRequest.POST("/abc"))
.assertStatusCode(200)
.assertEntity("This was a POST request to /abc");

Handlers and Request Values


In many cases, instead of manually inspecting the request, a handler will make use of Request values to extract
details from the request. This is possible using one of the other handleWith overloads that bind the values of
one or more request values with a HandlerN instance to produce a Route:
final Handler2<Integer, Integer> multiply =
new Handler2<Integer, Integer>() {
@Override
public RouteResult apply(RequestContext ctx, Integer x, Integer y) {
int result = x * y;
return ctx.complete("x * y = " + result);
}
};
final Route multiplyXAndYParam = handleWith2(xParam, yParam, multiply);

The handler here implements multiplication of two integers. However, it doesn't need to specify where these
parameters come from. In handleWith, as many request values of the matching type have to be specified as the
handler needs. This can be seen in the full example:
class TestHandler extends akka.http.javadsl.server.AllDirectives {
RequestVal<Integer> xParam = Parameters.intValue("x");
RequestVal<Integer> yParam = Parameters.intValue("y");
RequestVal<Integer> xSegment = PathMatchers.intValue();
RequestVal<Integer> ySegment = PathMatchers.intValue();
final Handler2<Integer, Integer> multiply =
new Handler2<Integer, Integer>() {
@Override
public RouteResult apply(RequestContext ctx, Integer x, Integer y) {
int result = x * y;
return ctx.complete("x * y = " + result);
}
};
final Route multiplyXAndYParam = handleWith2(xParam, yParam, multiply);
Route createRoute() {
return route(
get(
pathPrefix("calculator").route(
path("multiply").route(
multiplyXAndYParam
),
path("path-multiply", xSegment, ySegment).route(
handleWith2(xSegment, ySegment, multiply)
)
)
)
);
}
}
// actual testing code
TestRoute r = testRoute(new TestHandler().createRoute());
r.run(HttpRequest.GET("/calculator/multiply?x=12&y=42"))
.assertStatusCode(200)
.assertEntity("x * y = 504");
r.run(HttpRequest.GET("/calculator/path-multiply/23/5"))
.assertStatusCode(200)
.assertEntity("x * y = 115");

Here, the handler is reused twice. First, it is used to create a route that expects the URI parameters x and y,
and this route is then placed in the route structure. Second, the handler is bound to another set of RequestVals
directly in the route structure, this time representing segments from the URI path.
Handlers in Java 8
Handlers are in fact simply classes which extend akka.japi.function.FunctionN in order to make
reasoning about the number of handled arguments easier. For example, a Handler1[String] is simply
a Function2[RequestContext, String, RouteResult]. You can think of handlers as hotdogs, where each
T type represents a sausage, put between the buns which are RequestContext and RouteResult.
In Java 8 handlers can be provided as function literals or method references. The example from before then looks
like this:
class TestHandler extends akka.http.javadsl.server.AllDirectives {
final RequestVal<Integer> xParam = Parameters.intValue("x");
final RequestVal<Integer> yParam = Parameters.intValue("y");
final Handler2<Integer, Integer> multiply =
(ctx, x, y) -> ctx.complete("x * y = " + (x * y));
final Route multiplyXAndYParam = handleWith2(xParam, yParam, multiply);
RouteResult subtract(RequestContext ctx, int x, int y) {
return ctx.complete("x - y = " + (x - y));
}
Route createRoute() {
return route(
get(
pathPrefix("calculator").route(
path("multiply").route(
// use Handler explicitly
multiplyXAndYParam
),
path("add").route(
// create Handler as lambda expression
handleWith2(xParam, yParam,
(ctx, x, y) -> ctx.complete("x + y = " + (x + y)))
),
path("subtract").route(
// create handler by lifting method
handleWith2(xParam, yParam, this::subtract)
)
)
)
);
}
}
// actual testing code
TestRoute r = testRoute(new TestHandler().createRoute());
r.run(HttpRequest.GET("/calculator/multiply?x=12&y=42"))
.assertStatusCode(200)
.assertEntity("x * y = 504");
r.run(HttpRequest.GET("/calculator/add?x=12&y=42"))
.assertStatusCode(200)
.assertEntity("x + y = 54");
r.run(HttpRequest.GET("/calculator/subtract?x=42&y=12"))
.assertStatusCode(200)
.assertEntity("x - y = 30");

Note: The handleWith## methods include the number of handled values in their name because otherwise (if
overloading were used for all 22 methods) the error messages generated by javac would be very long and
unreadable: if one type of a handler does not match the given values, all 22 possible candidates would be
printed in the error message, instead of just the one arity-matching method, pointing out that the type does
not match.
We opted for better error messages as we feel this is more helpful when developing applications,
instead of having one overloaded method which looks nice when everything works, but produces
hard to read error messages if something does not match up.

Providing Handlers by Reflection


With Java versions before Java 8, writing out handlers as (anonymous) classes can be unwieldy. Therefore,
handleReflectively overloads are provided that allow writing handlers as simple methods and specifying
them by name:
public RouteResult multiply(RequestContext ctx, Integer x, Integer y) {
int result = x * y;
return ctx.complete("x * y = " + result);
}
Route multiplyXAndYParam = handleReflectively(this, "multiply", xParam, yParam);

The complete calculator example can then be written like this:


class TestHandler extends akka.http.javadsl.server.AllDirectives {
RequestVal<Integer> xParam = Parameters.intValue("x");
RequestVal<Integer> yParam = Parameters.intValue("y");
RequestVal<Integer> xSegment = PathMatchers.intValue();
RequestVal<Integer> ySegment = PathMatchers.intValue();

public RouteResult multiply(RequestContext ctx, Integer x, Integer y) {
int result = x * y;
return ctx.complete("x * y = " + result);
}
Route multiplyXAndYParam = handleReflectively(this, "multiply", xParam, yParam);
Route createRoute() {
return route(
get(
pathPrefix("calculator").route(
path("multiply").route(
multiplyXAndYParam
),
path("path-multiply", xSegment, ySegment).route(
handleWith2(xSegment, ySegment, this::multiply)
)
)
)
);
}
}
// actual testing code
TestRoute r = testRoute(new TestHandler().createRoute());
r.run(HttpRequest.GET("/calculator/multiply?x=12&y=42"))
.assertStatusCode(200)
.assertEntity("x * y = 504");
r.run(HttpRequest.GET("/calculator/path-multiply/23/5"))
.assertStatusCode(200)
.assertEntity("x * y = 115");

There are alternative overloads for handleReflectively that take a Class instead of an object instance to
refer to static methods. The referenced method must be publicly accessible.

Deferring Result Creation


Sometimes a handler cannot directly complete the request but needs to do some processing asynchronously. In
this case the completion of a request needs to be deferred until the result has been generated. This is supported
by the routing DSL in two ways: either you can use one of the handleWithAsyncN methods passing an
AsyncHandlerN which returns a Future<RouteResult>, i.e. an eventual RouteResult, or you can
use a regular handler as shown above and use RequestContext.completeWith for completion, which
takes a Future<RouteResult> as an argument.
This is demonstrated in the following example. Consider an asynchronous service defined like this (making use of
Java 8 lambdas):
class CalculatorService {
public Future<Integer> multiply(final int x, final int y, ExecutionContext ec) {
return akka.dispatch.Futures.future(() -> x * y, ec);
}
public Future<Integer> add(final int x, final int y, ExecutionContext ec) {
return akka.dispatch.Futures.future(() -> x + y, ec);
}
}

Here the calculator runs the actual calculation in the background and only eventually returns the result. The
HTTP service should provide a front-end to that service without having to block while waiting for the results. As
explained above this can be done in two ways.
First, you can use handleWithAsyncN to be able to return a Future<RouteResult>:
// would probably be injected or passed at construction time in real code
CalculatorService calculatorService = new CalculatorService();
public Future<RouteResult> multiplyAsync(final RequestContext ctx, int x, int y) {
Future<Integer> result = calculatorService.multiply(x, y, ctx.executionContext());
Mapper<Integer, RouteResult> func = new Mapper<Integer, RouteResult>() {
@Override
public RouteResult apply(Integer product) {
return ctx.complete("x * y = " + product);
}
}; // cannot be written as lambda, unfortunately
return result.map(func, ctx.executionContext());
}
Route multiplyAsyncRoute =
path("multiply").route(
handleWithAsync2(xParam, yParam, this::multiplyAsync)
);

The handler invokes the service and then maps the calculation result to a RouteResult using Future.map
and returns the resulting Future<RouteResult>.
Otherwise, you can also still use handleWithN and use RequestContext.completeWith to convert a
Future<RouteResult> into a RouteResult as shown here:
public RouteResult addAsync(final RequestContext ctx, int x, int y) {
Future<Integer> result = calculatorService.add(x, y, ctx.executionContext());
Mapper<Integer, RouteResult> func = new Mapper<Integer, RouteResult>() {
@Override
public RouteResult apply(Integer sum) {
return ctx.complete("x + y = " + sum);
}
}; // cannot be written as lambda, unfortunately
return ctx.completeWith(result.map(func, ctx.executionContext()));
}
Route addAsyncRoute =
path("add").route(
handleWith2(xParam, yParam, this::addAsync)
);

Using this style, you can decide in your handler if you want to return a direct synchronous result or if you need to
defer completion.
Both alternatives will not block and show the same runtime behavior.
Here's the complete example:
class CalculatorService {
public Future<Integer> multiply(final int x, final int y, ExecutionContext ec) {
return akka.dispatch.Futures.future(() -> x * y, ec);
}
public Future<Integer> add(final int x, final int y, ExecutionContext ec) {
return akka.dispatch.Futures.future(() -> x + y, ec);
}
}
class TestHandler extends akka.http.javadsl.server.AllDirectives {
RequestVal<Integer> xParam = Parameters.intValue("x");
RequestVal<Integer> yParam = Parameters.intValue("y");
// would probably be injected or passed at construction time in real code
CalculatorService calculatorService = new CalculatorService();
public Future<RouteResult> multiplyAsync(final RequestContext ctx, int x, int y) {
Future<Integer> result = calculatorService.multiply(x, y, ctx.executionContext());
Mapper<Integer, RouteResult> func = new Mapper<Integer, RouteResult>() {
@Override
public RouteResult apply(Integer product) {
return ctx.complete("x * y = " + product);
}
}; // cannot be written as lambda, unfortunately
return result.map(func, ctx.executionContext());
}
Route multiplyAsyncRoute =
path("multiply").route(
handleWithAsync2(xParam, yParam, this::multiplyAsync)
);
public RouteResult addAsync(final RequestContext ctx, int x, int y) {
Future<Integer> result = calculatorService.add(x, y, ctx.executionContext());
Mapper<Integer, RouteResult> func = new Mapper<Integer, RouteResult>() {
@Override
public RouteResult apply(Integer sum) {
return ctx.complete("x + y = " + sum);
}
}; // cannot be written as lambda, unfortunately
return ctx.completeWith(result.map(func, ctx.executionContext()));
}
Route addAsyncRoute =
path("add").route(
handleWith2(xParam, yParam, this::addAsync)
);
Route createRoute() {
return route(
get(
pathPrefix("calculator").route(
multiplyAsyncRoute,
addAsyncRoute
)
)
);
}
}
// testing code
TestRoute r = testRoute(new TestHandler().createRoute());
r.run(HttpRequest.GET("/calculator/multiply?x=12&y=42"))
.assertStatusCode(200)
.assertEntity("x * y = 504");
r.run(HttpRequest.GET("/calculator/add?x=23&y=5"))
.assertStatusCode(200)
.assertEntity("x + y = 28");

2.5.6 Marshalling & Unmarshalling


Marshalling is the process of converting a higher-level (object) structure into some kind of lower-level representation (and vice versa), often a binary wire format. Other popular names for it are Serialization or Pickling.
In akka-http Marshalling means the conversion of an object of type T into an HttpEntity, which forms the entity
body of an HTTP request or response (depending on whether used on the client or server side).
Marshalling
On the server side, marshalling is used to convert an application-domain object to a response (entity). Requests can
contain an Accept header that lists acceptable content types for the client. A marshaller contains the logic to
negotiate the result content types based on the Accept and the Accept-Charset headers.
Marshallers can be specified when completing a request with RequestContext.completeAs or by using
the BasicDirectives.completeAs directives.
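To illustrate what negotiating a content type against an Accept header involves, here is a simplified, self-contained sketch in plain Java. It is not how akka-http implements marshallers: the class and method names are hypothetical, media types are compared case-sensitively, parameters other than q are ignored, and the Accept-Charset header is not considered.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class AcceptNegotiationSketch {
    // Returns the quality (0.0 to 1.0) that the Accept header assigns to a media type,
    // or 0.0 if no media range in the header matches it. The most specific range wins.
    static double qualityFor(String acceptHeader, String mediaType) {
        String mainType = mediaType.substring(0, mediaType.indexOf('/'));
        double best = 0.0;
        int bestSpecificity = -1;
        for (String part : acceptHeader.split(",")) {
            String[] pieces = part.trim().split(";");
            String range = pieces[0].trim();
            double q = 1.0; // default quality when no q parameter is given
            for (int i = 1; i < pieces.length; i++) {
                String param = pieces[i].trim();
                if (param.startsWith("q=")) {
                    q = Double.parseDouble(param.substring(2));
                }
            }
            int specificity;
            if (range.equals(mediaType)) {
                specificity = 2; // exact match such as text/plain
            } else if (range.equals(mainType + "/*")) {
                specificity = 1; // type wildcard such as text/*
            } else if (range.equals("*/*")) {
                specificity = 0; // full wildcard
            } else {
                continue; // this range does not cover the media type
            }
            if (specificity > bestSpecificity) {
                bestSpecificity = specificity;
                best = q;
            }
        }
        return best;
    }

    // Picks the offered media type with the highest quality, if any is acceptable.
    static Optional<String> negotiate(String acceptHeader, List<String> offered) {
        String chosen = null;
        double bestQ = 0.0;
        for (String candidate : offered) {
            double q = qualityFor(acceptHeader, candidate);
            if (q > bestQ) {
                bestQ = q;
                chosen = candidate;
            }
        }
        return Optional.ofNullable(chosen);
    }

    public static void main(String[] args) {
        List<String> offered = Arrays.asList("text/plain", "application/json");
        // application/json has the default quality 1.0, text/plain only 0.5
        System.out.println(negotiate("text/plain;q=0.5, application/json", offered).get());
        // nothing the marshaller offers is acceptable to this client
        System.out.println(negotiate("image/png", offered).isPresent());
    }
}
```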
These marshallers are provided by akka-http:
- Use Json Support via Jackson to create a marshaller that can convert a POJO to an application/json
  response using jackson.
- Use Marshallers.toEntityString, Marshallers.toEntityBytes, Marshallers.toEntityByteString,
  Marshallers.toEntity, and Marshallers.toResponse to create custom marshallers.
Unmarshalling
On the server side, unmarshalling is used to convert a request (entity) to an application-domain object. This means
unmarshalling to a certain type is represented by a RequestVal. Currently, several options are provided to
create an unmarshalling RequestVal:
- Use Json Support via Jackson to create an unmarshaller that can convert an application/json request
  to a POJO using jackson.
- Use the predefined Unmarshallers.String, Unmarshallers.ByteString, Unmarshallers.ByteArray,
  Unmarshallers.CharArray to convert to those basic types.
- Use Unmarshallers.fromMessage or Unmarshaller.fromEntity to create a custom unmarshaller.

2.5.7 Route Testkit


akka-http has a testkit that provides a convenient way of testing your routes with JUnit. It allows running requests
against a route (without hitting the network) and provides means to assert against response properties in a compact
way.

To use the testkit you need to take these steps:
- add a dependency to the akka-http-testkit-experimental module
- derive the test class from JUnitRouteTest
- wrap the route under test with RouteTest.testRoute to create a TestRoute
- run requests against the route using TestRoute.run(request) which will return a TestResponse
- use the methods of TestResponse to assert on properties of the response
Example
To see the testkit in action consider the following simple calculator app service:
import akka.http.javadsl.server.*;
import akka.http.javadsl.server.values.Parameters;
public class MyAppService extends HttpApp {
RequestVal<Double> x = Parameters.doubleValue("x");
RequestVal<Double> y = Parameters.doubleValue("y");
public RouteResult add(RequestContext ctx, double x, double y) {
return ctx.complete("x + y = " + (x + y));
}
@Override
public Route createRoute() {
return
route(
get(
pathPrefix("calculator").route(
path("add").route(
handleReflectively(this, "add", x, y)
)
)
)
);
}
}

The app extends from HttpApp which brings all of the directives into scope. Method createRoute needs to
be implemented to return the complete route of the app.
Here's how you would test that service:
import akka.http.javadsl.model.HttpRequest;
import akka.http.javadsl.model.StatusCodes;
import akka.http.javadsl.testkit.JUnitRouteTest;
import akka.http.javadsl.testkit.TestRoute;
import org.junit.Test;

public class TestkitExampleTest extends JUnitRouteTest {


TestRoute appRoute = testRoute(new MyAppService().createRoute());
@Test
public void testCalculatorAdd() {
// test happy path
appRoute.run(HttpRequest.GET("/calculator/add?x=4.2&y=2.3"))
.assertStatusCode(200)
.assertEntity("x + y = 6.5");
// test responses to potential errors
appRoute.run(HttpRequest.GET("/calculator/add?x=3.2"))
.assertStatusCode(StatusCodes.NOT_FOUND) // 404
.assertEntity("Request is missing required query parameter 'y'");
// test responses to potential errors
appRoute.run(HttpRequest.GET("/calculator/add?x=3.2&y=three"))
.assertStatusCode(StatusCodes.BAD_REQUEST)
.assertEntity("The query parameter 'y' was malformed:\n" +
"'three' is not a valid 64-bit floating point value");
}
}

Writing Assertions against the HttpResponse


The testkit supports a fluent DSL to write compact assertions on the response by chaining assertions using
dot-syntax. To simplify working with streamed responses the entity of the response is first "strictified", i.e. the
entity data is collected into a single ByteString and the entity is provided as an HttpEntityStrict.
This allows writing several assertions against the same entity data, which wouldn't (necessarily) be possible for
the streamed version.
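Why strictification helps can be shown with a small plain-Java sketch. It does not use the testkit or ByteString: a one-shot Iterator of byte arrays stands in for a streamed entity, and the hypothetical toStrict helper plays the role of collecting the data into a single buffer.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class StrictEntitySketch {
    // Drains a one-shot stream of chunks into a single byte array.
    static byte[] toStrict(Iterator<byte[]> chunks) {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        while (chunks.hasNext()) {
            byte[] chunk = chunks.next();
            collected.write(chunk, 0, chunk.length);
        }
        return collected.toByteArray();
    }

    public static void main(String[] args) {
        List<byte[]> parts = Arrays.asList(
            "x + y".getBytes(StandardCharsets.UTF_8),
            " = 6.5".getBytes(StandardCharsets.UTF_8));
        // The "streamed" entity: once iterated, its data is gone.
        Iterator<byte[]> streamed = parts.iterator();

        byte[] strict = toStrict(streamed);

        // Several independent checks can now run against the same collected data.
        System.out.println(new String(strict, StandardCharsets.UTF_8)); // prints: x + y = 6.5
        System.out.println(strict.length); // prints 11
        System.out.println(streamed.hasNext()); // prints false, the stream is exhausted
    }
}
```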
All of the defined assertions provide HTTP specific error messages aiding in diagnosing problems.
Currently, these methods are defined on TestResponse to assert on the response:
Assertion                                                      Description
assertStatusCode(int expectedCode)                             Asserts that the numeric response status code equals the expected one
assertStatusCode(StatusCode expectedCode)                      Asserts that the response StatusCode equals the expected one
assertMediaType(String expectedType)                           Asserts that the media type part of the response's content type matches the given String
assertMediaType(MediaType expectedType)                        Asserts that the media type part of the response's content type matches the given MediaType
assertEntity(String expectedStringContent)                     Asserts that the entity data interpreted as UTF-8 equals the expected String
assertEntityBytes(ByteString expectedBytes)                    Asserts that the entity data bytes equal the expected ones
assertEntityAs(Unmarshaller<T> unmarshaller, T expectedValue)  Asserts that the entity data if unmarshalled with the given unmarshaller equals the given value
assertHeaderExists(HttpHeader expectedHeader)                  Asserts that the response contains an HttpHeader instance equal to the expected one
assertHeaderKindExists(String expectedHeaderName)              Asserts that the response contains a header with the expected name
assertHeader(String name, String expectedValue)                Asserts that the response contains a header with the given name and value

It is, of course, possible to use any other means of writing assertions by inspecting the properties of the response
manually. As written above, TestResponse.entity and TestResponse.response return strict versions of
the entity data.
Supporting Custom Test Frameworks
Adding support for a custom test framework is achieved by creating a new superclass analogous to
JUnitRouteTest for writing tests with the custom test framework, deriving from
akka.http.javadsl.testkit.RouteTest and implementing its abstract methods. This will allow users of the
test framework to use testRoute and to write assertions using the assertion methods defined
on TestResponse.


2.5.8 Json Support via Jackson


akka-http provides support to convert application-domain objects from and to JSON using jackson. To make use
of the support module, you need to add a dependency on akka-http-jackson-experimental.
Use akka.http.javadsl.marshallers.jackson.Jackson.jsonAs[T] to create a RequestVal<T> which expects
the request body to be of type application/json and converts it to T using Jackson.
See this example in the sources for an example.
Use akka.http.javadsl.marshallers.jackson.Jackson.json[T] to create a Marshaller<T> which can be used
with RequestContext.completeAs to convert a POJO to an HttpResponse.

2.6 Consuming HTTP-based Services (Client-Side)


All client-side functionality of Akka HTTP, for consuming HTTP-based services offered by other endpoints, is
currently provided by the akka-http-core module.
Depending on your application's specific needs you can choose from three different API levels:
- Connection-Level Client-Side API: for full control over when HTTP connections are opened/closed and how
  requests are scheduled across them
- Host-Level Client-Side API: for letting Akka HTTP manage a connection pool to one specific host/port endpoint
- Request-Level Client-Side API: for letting Akka HTTP perform all connection management
You can interact with different API levels at the same time and, independently of which API level you choose,
Akka HTTP will happily handle many thousand concurrent connections to a single or many different hosts.

2.6.1 Connection-Level Client-Side API


The connection-level API is the lowest-level client-side API Akka HTTP provides. It gives you full control over
when HTTP connections are opened and closed and how requests are to be sent across which connection. As such
it offers the highest flexibility at the cost of providing the least convenience.
Opening HTTP Connections
With the connection-level API you open a new HTTP connection to a target endpoint by materializing a Flow
returned by the Http.get(system).outgoingConnection(...) method. Here is an example:
final ActorSystem system = ActorSystem.create();
final ActorMaterializer materializer = ActorMaterializer.create(system);
final Flow<HttpRequest, HttpResponse, Future<OutgoingConnection>> connectionFlow =
Http.get(system).outgoingConnection("akka.io", 80);
final Future<HttpResponse> responseFuture =
Source.single(HttpRequest.create("/"))
.via(connectionFlow)
.runWith(Sink.<HttpResponse>head(), materializer);

Apart from the host name and port the Http.get(system).outgoingConnection(...) method also
allows you to specify socket options and a number of configuration settings for the connection.
Note that no connection is attempted until the returned flow is actually materialized! If the flow is materialized
several times then several independent connections will be opened (one per materialization). If the connection attempt fails, for whatever reason, the materialized flow will be immediately terminated with a respective exception.


Request-Response Cycle
Once the connection flow has been materialized it is ready to consume HttpRequest instances from the source
it is attached to. Each request is sent across the connection and incoming responses dispatched to the downstream
pipeline. Of course and as always, back-pressure is adequately maintained across all parts of the connection. This
means that, if the downstream pipeline consuming the HTTP responses is slow, the request source will eventually
be slowed down in sending requests.
Any errors occurring on the underlying connection are surfaced as exceptions terminating the response stream
(and canceling the request source).
Note that, if the source produces subsequent requests before the prior responses have arrived, these requests will
be pipelined across the connection, which is something that is not supported by all HTTP servers. Also, if the
server closes the connection before responses to all requests have been received this will result in the response
stream being terminated with a truncation error.
Closing Connections
Akka HTTP actively closes an established connection upon reception of a response containing a Connection:
close header. The connection can also be closed by the server.
An application can actively trigger the closing of the connection by completing the request stream. In this case
the underlying TCP connection will be closed when the last pending response has been received.
Timeouts
Currently Akka HTTP doesn't implement client-side request timeout checking itself as this functionality can be
regarded as a more general purpose streaming infrastructure feature. However, akka-stream should soon provide
such a feature.
Stand-Alone HTTP Layer Usage
// TODO

2.6.2 Host-Level Client-Side API


As opposed to the connection-level API the host-level API relieves you from manually managing individual HTTP
connections. It autonomously manages a configurable pool of connections to one particular target endpoint (i.e.
host/port combination).
Requesting a Host Connection Pool
The best way to get a hold of a connection pool to a given target endpoint is the
Http.get(system).cachedHostConnectionPool(...) method, which returns a Flow that
can be baked into an application-level stream setup. This flow is also called a pool client flow.
The connection pool underlying a pool client flow is cached. For every ActorSystem, target endpoint and pool
configuration there will never be more than a single pool live at any time.
Also, the HTTP layer transparently manages idle shutdown and restarting of connection pools as configured. The
client flow instances therefore remain valid throughout the lifetime of the application, i.e. they can be materialized
as often as required and the time between individual materialization is of no importance.
When you request a pool client flow with Http.get(system).cachedHostConnectionPool(...)
Akka HTTP will immediately start the pool, even before the first client flow materialization. However, this running
pool will not actually open the first connection to the target endpoint until the first request has arrived.


Configuring a Host Connection Pool


Apart from the connection-level config settings and socket options there are a number of settings
that allow you to influence the behavior of the connection pool logic itself. Check out the
akka.http.client.host-connection-pool section of the Akka HTTP Configuration for more information
about which settings are available and what they mean.
Note that, if you request pools with different configurations for the same target host you will get independent
pools. This means that, in total, your application might open more concurrent HTTP connections to the target
endpoint than any of the individual pools' max-connections settings allow!
There is one setting that likely deserves a bit deeper explanation: max-open-requests. This setting limits
the maximum number of requests that can be in-flight at any time for a single connection pool. If an application
calls Http.get(system).cachedHostConnectionPool(...) 3 times (with the same endpoint and
settings) it will get back 3 different client flow instances for the same pool. If each of these client flows is then
materialized 4 times (concurrently) the application will have 12 concurrently running client flow materializations.
All of these share the resources of the single pool.
This means that, if the pool's pipelining-limit is left at 1 (effectively disabling pipelining), no more
than 12 requests can be open at any time. With a pipelining-limit of 8 and 12 concurrent client flow
materializations the theoretical open requests maximum is 96.
The max-open-requests config setting allows for applying a hard limit which serves mainly as a protection
against erroneous connection pool use, e.g. because the application is materializing too many client flows that all
compete for the same pooled connections.
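The arithmetic behind these numbers can be written down as a short plain-Java sketch. The class and method names are hypothetical, and the cap of 32 used in main is just an example value for max-open-requests, not a statement about the configured default.

```java
class OpenRequestsSketch {
    // Upper bound on in-flight requests when every materialization fills its pipeline.
    static int theoreticalMaxOpenRequests(int clientFlows,
                                          int materializationsPerFlow,
                                          int pipeliningLimit) {
        int concurrentMaterializations = clientFlows * materializationsPerFlow;
        return concurrentMaterializations * pipeliningLimit;
    }

    // Applies a hard cap in the spirit of the max-open-requests setting.
    static int cappedOpenRequests(int theoreticalMax, int maxOpenRequests) {
        return Math.min(theoreticalMax, maxOpenRequests);
    }

    public static void main(String[] args) {
        // 3 client flows, each materialized 4 times, pipelining disabled
        System.out.println(theoreticalMaxOpenRequests(3, 4, 1)); // prints 12
        // same setup with a pipelining-limit of 8
        System.out.println(theoreticalMaxOpenRequests(3, 4, 8)); // prints 96
        // an example hard limit of 32 bounds the 96 theoretical requests
        System.out.println(cappedOpenRequests(96, 32)); // prints 32
    }
}
```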
Using a Host Connection Pool
The pool client flow returned by Http.get(system).cachedHostConnectionPool(...) has the
following type:
// TODO Tuple2 will be changed to be `akka.japi.Pair`
Flow[Tuple2[HttpRequest, T], Tuple2[Try[HttpResponse], T], HostConnectionPool]

This means it consumes tuples of type (HttpRequest, T) and produces tuples of type
(Try[HttpResponse], T) which might appear more complicated than necessary at first sight. The
reason why the pool API includes objects of custom type T on both ends lies in the fact that the underlying
transport usually comprises more than a single connection and as such the pool client flow often generates
responses in an order that doesn't directly match the consumed requests. We could have built the pool logic in
a way that reorders responses according to their requests before dispatching them to the application, but this
would have meant that a single slow response could block the delivery of potentially many responses that would
otherwise be ready for consumption by the application.
In order to prevent unnecessary head-of-line blocking the pool client-flow is allowed to dispatch responses as
soon as they arrive, independently of the request order. Of course this means that there needs to be another way
to associate a response with its respective request. The way that this is done is by allowing the application to pass
along a custom context object with the request, which is then passed back to the application with the respective
response. This context object of type T is completely opaque to Akka HTTP, i.e. you can pick whatever works
best for your particular application scenario.
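The correlation technique described above can be sketched in plain Java without any Akka types. SketchPair is a local stand-in for the tuple type (it is neither scala.Tuple2 nor akka.japi.Pair), requests and responses are modelled as plain strings, and respondOutOfOrder is a hypothetical simulation of a pool that delivers responses in a different order than the requests were sent.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Local stand-in for a tuple type; not akka.japi.Pair or scala.Tuple2.
final class SketchPair<A, B> {
    final A first;
    final B second;

    SketchPair(A first, B second) {
        this.first = first;
        this.second = second;
    }
}

class ContextCorrelationSketch {
    // Simulates a pool that answers every (request, context) pair with a
    // (response, context) pair, but emits the responses in reversed order.
    static <T> List<SketchPair<String, T>> respondOutOfOrder(List<SketchPair<String, T>> requests) {
        List<SketchPair<String, T>> responses = new ArrayList<>();
        for (SketchPair<String, T> request : requests) {
            responses.add(new SketchPair<>("response to " + request.first, request.second));
        }
        Collections.reverse(responses);
        return responses;
    }

    // Re-associates responses with their requests purely via the context object.
    static <T> Map<T, String> indexByContext(List<SketchPair<String, T>> responses) {
        Map<T, String> byContext = new HashMap<>();
        for (SketchPair<String, T> response : responses) {
            byContext.put(response.second, response.first);
        }
        return byContext;
    }

    public static void main(String[] args) {
        // Here the context is a simple request id; any application-defined type works.
        List<SketchPair<String, Integer>> requests = new ArrayList<>();
        requests.add(new SketchPair<String, Integer>("GET /a", 1));
        requests.add(new SketchPair<String, Integer>("GET /b", 2));
        requests.add(new SketchPair<String, Integer>("GET /c", 3));

        List<SketchPair<String, Integer>> responses = respondOutOfOrder(requests);
        System.out.println(responses.get(0).first); // prints: response to GET /c
        System.out.println(indexByContext(responses).get(1)); // prints: response to GET /a
    }
}
```

Because the context travels with each element, the application never has to rely on response order.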
Connection Allocation Logic
This is how Akka HTTP allocates incoming requests to the available connection slots:
1. If there is a connection alive and currently idle then schedule the request across this connection.
2. If no connection is idle and there is still an unconnected slot then establish a new connection.
3. If all connections are already established and loaded with other requests then pick the connection with
the least open requests (< the configured pipelining-limit) that only has requests with idempotent
methods scheduled to it, if there is one.

4. Otherwise apply back-pressure to the request source, i.e. stop accepting new requests.
For more information about scheduling more than one request at a time across a single connection see the
Wikipedia entry on HTTP pipelining.
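The four allocation rules above can be written down as a small decision function. The following plain-Java sketch only models the rules as stated in the text; it is not the pool's actual implementation, and the Slot class, the chooseSlot method and the BACK_PRESSURE marker are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java model of the allocation rules; hypothetical names, not Akka HTTP API.
class SlotAllocationSketch {

    static final int BACK_PRESSURE = -1;

    static final class Slot {
        final boolean connected;
        final int openRequests;
        final boolean onlyIdempotent; // true if all scheduled requests use idempotent methods

        Slot(boolean connected, int openRequests, boolean onlyIdempotent) {
            this.connected = connected;
            this.openRequests = openRequests;
            this.onlyIdempotent = onlyIdempotent;
        }
    }

    // Returns the index of the slot that should take the next request, or BACK_PRESSURE.
    static int chooseSlot(List<Slot> slots, int pipeliningLimit) {
        // Rule 1: prefer a connection that is alive and currently idle.
        for (int i = 0; i < slots.size(); i++) {
            Slot s = slots.get(i);
            if (s.connected && s.openRequests == 0) return i;
        }
        // Rule 2: otherwise use a slot that has no connection yet.
        for (int i = 0; i < slots.size(); i++) {
            if (!slots.get(i).connected) return i;
        }
        // Rule 3: otherwise pipeline onto the least loaded connection below the limit
        // that only carries requests with idempotent methods.
        int best = BACK_PRESSURE;
        for (int i = 0; i < slots.size(); i++) {
            Slot s = slots.get(i);
            boolean eligible = s.openRequests < pipeliningLimit && s.onlyIdempotent;
            if (eligible && (best == BACK_PRESSURE
                    || s.openRequests < slots.get(best).openRequests)) {
                best = i;
            }
        }
        // Rule 4: if no slot was eligible, signal back-pressure.
        return best;
    }

    public static void main(String[] args) {
        List<Slot> slots = Arrays.asList(
            new Slot(true, 3, true), new Slot(true, 1, true), new Slot(true, 1, false));
        System.out.println(chooseSlot(slots, 4));
    }
}
```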
Retrying a Request
If the max-retries pool config setting is greater than zero the pool retries idempotent requests for which a
response could not be successfully retrieved. Idempotent requests are those whose HTTP method is defined to be
idempotent by the HTTP spec, which are all the ones currently modelled by Akka HTTP except for the POST,
PATCH and CONNECT methods.
When a response could not be received for a certain request there are essentially three possible error scenarios:
1. The request got lost on the way to the server.
2. The server experiences a problem while processing the request.
3. The response from the server got lost on the way back.
The host connector cannot know which of these scenarios caused the problem, so a PATCH or POST request
may already have triggered a non-idempotent action on the server. For this reason these requests
cannot be retried.
In these cases, as well as when all retries have not yielded a proper response, the pool produces a failed Try (i.e.
a scala.util.Failure) together with the custom request context.
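The retry rule can be modelled in a few lines of plain Java. This is a sketch under stated assumptions rather than the pool's implementation: RetrySketch, isIdempotent and execute are hypothetical names, a request attempt is represented by a Supplier, and a final failure is rethrown as an exception, whereas the real pool emits a failed Try together with the request context.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Supplier;

// Plain-Java model of the retry rule; hypothetical names, not Akka HTTP API.
class RetrySketch {

    // Methods named in the text as the non-idempotent ones.
    static final Set<String> NON_IDEMPOTENT =
        new HashSet<>(Arrays.asList("POST", "PATCH", "CONNECT"));

    static boolean isIdempotent(String method) {
        return !NON_IDEMPOTENT.contains(method);
    }

    // Runs the attempt once and, for idempotent methods only, up to maxRetries more times.
    // Rethrows the last failure if no attempt succeeds.
    static String execute(String method, int maxRetries, Supplier<String> attempt) {
        int retries = isIdempotent(method) ? Math.max(0, maxRetries) : 0;
        RuntimeException last = null;
        for (int i = 0; i <= retries; i++) {
            try {
                return attempt.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again if retries remain
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Fails twice, then succeeds on the third attempt.
        String result = execute("GET", 2, () -> {
            if (++calls[0] < 3) throw new IllegalStateException("no response");
            return "ok";
        });
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```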
Pool Shutdown
Completing a pool client flow will simply detach the flow from the pool. The connection pool itself will continue to run as it may be serving other client flows concurrently or in the future. Only after the configured
idle-timeout for the pool has expired will Akka HTTP automatically terminate the pool and free all its resources.
If a new client flow is requested with Http.get(system).cachedHostConnectionPool(...) or if
an already existing client flow is re-materialized the respective pool is automatically and transparently restarted.
In addition to the automatic shutdown via the configured idle timeouts it's also possible to trigger the immediate
shutdown of a specific pool by calling shutdown() on the HostConnectionPool instance that the pool
client flow materializes into. This shutdown() call produces a Future[Unit] which is fulfilled when the
pool termination has been completed.
It's also possible to trigger the immediate termination of all connection pools in the ActorSystem at the
same time by calling Http.get(system).shutdownAllConnectionPools(). This call too produces
a Future[Unit] which is fulfilled when all pools have terminated.
Example
final ActorSystem system = ActorSystem.create();
final ActorMaterializer materializer = ActorMaterializer.create(system);

// construct a pool client flow with context type `Integer`
// TODO these Tuple2 will be changed to akka.japi.Pair
final Flow<
    Tuple2<HttpRequest, Integer>,
    Tuple2<Try<HttpResponse>, Integer>,
    HostConnectionPool> poolClientFlow =
  Http.get(system).<Integer>cachedHostConnectionPool("akka.io", 80, materializer);

// send a single request with context 42 through the pool and take the first response
final Future<Tuple2<Try<HttpResponse>, Integer>> responseFuture =
  Source
    .single(Pair.create(HttpRequest.create("/"), 42).toScala())
    .via(poolClientFlow)
    .runWith(Sink.<Tuple2<Try<HttpResponse>, Integer>>head(), materializer);

2.6.3 Request-Level Client-Side API


The request-level API is the most convenient way of using Akka HTTP's client-side functionality. It internally
builds upon the Host-Level Client-Side API to provide you with a simple and easy-to-use way of retrieving HTTP
responses from remote servers. Depending on your preference you can pick the flow-based or the future-based
variant.
Flow-Based Variant
The flow-based variant of the request-level client-side API is provided by the Http.get(system).superPool(...)
method. It creates a new super connection pool flow, which routes incoming requests to a (cached) host connection pool depending on their respective effective URIs.
The Flow returned by Http.get(system).superPool(...) is very similar to the one from the Host-Level Client-Side
API, so the Using a Host Connection Pool section also applies here.
However, there is one notable difference between a host connection pool client flow for the host-level API and a
super-pool flow: since in the former case the flow has an implicit target host context, the requests it takes don't
need to have absolute URIs or a valid Host header. The host connection pool will automatically add a Host
header if required.
For a super-pool flow this is not the case. All requests to a super-pool must either have an absolute URI or a valid
Host header, because otherwise it would be impossible to find out which target endpoint to direct the request to.
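Why a super-pool needs an absolute URI or a Host header can be seen in a small plain-Java sketch that tries to determine the target host of a request. This only models the requirement stated above and is not how Akka HTTP computes the effective URI; TargetEndpointSketch and resolveHost are hypothetical names, and only java.net.URI from the JDK is used.

```java
import java.net.URI;
import java.util.Optional;

// Plain-Java model of target host resolution; hypothetical names, not Akka HTTP API.
class TargetEndpointSketch {

    // Returns the target host if it can be determined from an absolute URI
    // or, failing that, from a non-empty Host header value.
    static Optional<String> resolveHost(String uri, Optional<String> hostHeader) {
        URI parsed = URI.create(uri);
        if (parsed.isAbsolute() && parsed.getHost() != null) {
            return Optional.of(parsed.getHost());
        }
        return hostHeader.filter(h -> !h.isEmpty());
    }

    public static void main(String[] args) {
        // Absolute URI: the host is part of the request itself.
        System.out.println(resolveHost("http://akka.io/docs", Optional.empty()));
        // Relative URI with a Host header: the header supplies the host.
        System.out.println(resolveHost("/docs", Optional.of("akka.io")));
        // Relative URI without a Host header: no target can be determined.
        System.out.println(resolveHost("/docs", Optional.empty()));
    }
}
```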
Future-Based Variant
Sometimes your HTTP client needs are very basic. You simply need the HTTP response for a certain request and
don't want to bother with setting up a full-blown streaming infrastructure.
For these cases Akka HTTP offers the Http.get(system).singleRequest(...) method, which simply turns an
HttpRequest instance into Future<HttpResponse>. Internally the request is dispatched across the
(cached) host connection pool for the request's effective URI.
Just like in the case of the super-pool flow described above the request must have either an absolute URI or a valid
Host header, otherwise the returned future will be completed with an error.
Example
final ActorSystem system = ActorSystem.create();
final ActorMaterializer materializer = ActorMaterializer.create(system);
final Future<HttpResponse> responseFuture =
Http.get(system)
.singleRequest(HttpRequest.create("http://akka.io"), materializer);

2.6.4 Client-Side HTTPS Support


Akka HTTP supports TLS encryption on the client-side as well as on the server-side.
The central vehicle for configuring encryption is the HttpsContext, which can be created using the static
method HttpsContext.create which is defined like this:

public static HttpsContext create(SSLContext sslContext,
                                  Option<Collection<String>> enabledCipherSuites,
                                  Option<Collection<String>> enabledProtocols,
                                  Option<ClientAuth> clientAuth,
                                  Option<SSLParameters> sslParameters)
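The first and last parameters of this method are standard JSSE types from the JDK. As a minimal sketch of how such values might be prepared, the following uses only javax.net.ssl classes and deliberately stops short of calling HttpsContext.create itself; SslContextSketch, defaultTlsContext and tls12Only are hypothetical names, and restricting the protocols to TLSv1.2 is only an illustrative assumption.

```java
import java.security.GeneralSecurityException;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLParameters;

// JDK-only preparation of TLS settings; hypothetical helper names.
class SslContextSketch {

    // Creates a TLS context backed by the JDK's default key and trust managers.
    static SSLContext defaultTlsContext() {
        try {
            SSLContext context = SSLContext.getInstance("TLS");
            // null arguments select the default key managers, trust managers and randomness
            context.init(null, null, null);
            return context;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("TLS is not available on this JVM", e);
        }
    }

    // Starts from the context's default parameters and restricts the protocol list.
    static SSLParameters tls12Only(SSLContext context) {
        SSLParameters parameters = context.getDefaultSSLParameters();
        parameters.setProtocols(new String[] {"TLSv1.2"});
        return parameters;
    }

    public static void main(String[] args) {
        SSLContext context = defaultTlsContext();
        System.out.println(context.getProtocol());
        System.out.println(String.join(",", tls12Only(context).getProtocols()));
    }
}
```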

In addition to the outgoingConnection, newHostConnectionPool and cachedHostConnectionPool methods, the
akka.http.javadsl.Http extension also defines outgoingConnectionTls, newHostConnectionPoolTls and
cachedHostConnectionPoolTls. These methods work identically to their counterparts without the
-Tls suffix, with the exception that all connections will always be encrypted.
The singleRequest and superPool methods determine the encryption state via the scheme of the incoming
request, i.e. requests to an https URI will be encrypted, while requests to an http URI won't.
The encryption configuration for all HTTPS connections, i.e. the HttpsContext, is determined according to
the following logic:
1. If the optional httpsContext method parameter is defined it contains the configuration to be used (and
thus takes precedence over any potentially set default client-side HttpsContext).
2. If the optional httpsContext method parameter is undefined (which is the default) the default client-side
HttpsContext is used, which can be set via the setDefaultClientHttpsContext on the Http
extension.
3. If no default client-side HttpsContext has been set via the setDefaultClientHttpsContext on
the Http extension the default system configuration is used.
Usually, if the default system TLS configuration is not good enough for your application's needs, you
configure a custom HttpsContext instance and set it via
Http.get(system).setDefaultClientHttpsContext. Afterwards you simply use
outgoingConnectionTls, newHostConnectionPoolTls, cachedHostConnectionPoolTls,
superPool or singleRequest without a specific httpsContext argument, which causes encrypted
connections to rely on the configured default client-side HttpsContext.
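The three-step precedence described in this section (explicit method parameter, then the default client-side HttpsContext, then the system configuration) amounts to a simple fallback chain. The sketch below models it in plain Java with strings standing in for HttpsContext instances; HttpsContextResolutionSketch, resolve and SYSTEM_DEFAULT are hypothetical names and not part of Akka HTTP.

```java
import java.util.Optional;

// Plain-Java model of the precedence rules; hypothetical names, not Akka HTTP API.
class HttpsContextResolutionSketch {

    static final String SYSTEM_DEFAULT = "system-default";

    // Picks the configuration to use for an encrypted connection.
    static String resolve(Optional<String> methodParameter, Optional<String> clientDefault) {
        // Step 1: an explicitly passed context always wins.
        if (methodParameter.isPresent()) {
            return methodParameter.get();
        }
        // Step 2: otherwise use the default client-side context if one has been set.
        // Step 3: otherwise fall back to the default system configuration.
        return clientDefault.orElse(SYSTEM_DEFAULT);
    }

    public static void main(String[] args) {
        System.out.println(resolve(Optional.of("per-call"), Optional.of("client-default")));
        System.out.println(resolve(Optional.empty(), Optional.of("client-default")));
        System.out.println(resolve(Optional.empty(), Optional.empty()));
    }
}
```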

2.6.5 Client-Side WebSocket Support


Not yet implemented, see 17275.
