Java NIO Framework: Introducing a High-Performance I/O Framework for Java
Abstract: A new input/output (NIO) library that provides block-oriented I/O was introduced with Java v1.4. Because
of its complexity, creating network applications with the Java NIO library has been very difficult, and built-in
support for high-performance, distributed and parallel systems was missing. Parallel architectures are now
becoming the standard in computing and Java network application programmers need a framework to build
upon. In this paper, we introduce the Java NIO Framework, an extensible programming library that hides most
of the NIO library details and at the same time provides support for secure and high-performance network
applications. The Java NIO Framework is already used by well-known organizations, e.g. the U.S. National
Institute of Standards and Technology, and is running successfully in a distributed computing framework that
has more than 1000 nodes.
2 MULTIPLEXING STRATEGIES
Two multiplexing strategies were mentioned in the introduction: one thread per socket and readiness selection. In the next sections both strategies are briefly introduced and analyzed.

2.1 One Thread Per Socket

Threads are a mechanism to split a process into several simultaneously running tasks. Threads differ from normal processes by sharing memory and other resources; therefore they are often called lightweight processes. Switching between threads is typically faster than switching between processes.

When a server uses the one thread per socket multiplexing strategy, it creates one thread for every client connection. When executing blocking I/O operations, the thread is blocked until the operation completes (e.g. when reading data from a socket, the thread blocks until new data is available to read from the socket).

This strategy is very simple to implement because every thread just continues its operation after returning from a blocking operation and all internal states of the thread are automatically restored. A programmer can implement the thread (more or less) as if the server handled only one client connection.

The drawback of this multiplexing strategy is that it does not scale well. Each blocked thread acts as a socket monitor and the thread scheduler is the notification mechanism. Neither of them was designed for such a purpose.

A remaining problem of this strategy is that a design with massively parallel threads is naturally prone to typical threading problems, e.g. deadlocks, livelocks and starvation.

2.2 Readiness Selection

Readiness selection is a multiplexing strategy that enables a server to handle many client connections simultaneously with a single thread. An overview of readiness selection is given in (James O. Coplien, 1995) when presenting the reactor design pattern. The reactor design pattern proposes the software architecture presented in Figure 1.

Figure 1: Reactor design pattern

• The class Handle identifies resources that are managed by an operating system, e.g. sockets.

• The class Demultiplexer blocks awaiting events to occur on a set of Handles. It returns when it is possible to initiate an operation on a Handle without blocking. The method select() returns which Handles can have operations invoked on them synchronously without blocking the application process.

• The class Dispatcher defines an interface for registering, removing, and dispatching EventHandlers. Ultimately, the Demultiplexer is responsible for waiting until new events occur. When it detects new events, it informs the Dispatcher to call back application-specific event handlers.

• The interface EventHandler specifies a hook method that abstractly represents the dispatching operation for service-specific events.

• The class ConcreteEventHandler implements the hook method as well as the methods to process these events in an application-specific manner. Applications register ConcreteEventHandlers with the Dispatcher to process certain types of events. When these events arrive, the Dispatcher calls back the hook method of the appropriate ConcreteEventHandler.

Readiness selection scales much better, but it is not as easy to implement as the one thread per socket strategy.
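The collaboration described above maps directly onto Java's NIO primitives, where Selector takes the Demultiplexer role and the SelectionKey attachment carries the registered handler. The following minimal sketch is illustrative only; the Reactor and EventHandler names are ours, not JDK or framework classes:

```java
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Hook interface: the reactor's EventHandler (illustrative name).
interface EventHandler {
    void handleEvent(SelectionKey key) throws IOException;
}

// Minimal Dispatcher sketch: the Selector plays the Demultiplexer role,
// and each SelectionKey's attachment is the registered ConcreteEventHandler.
final class Reactor implements Runnable {
    private final Selector selector;

    Reactor(Selector selector) {
        this.selector = selector;
    }

    // One demultiplex/dispatch round: block until some Handles are ready,
    // then call back the handler attached to every ready key.
    void dispatchOnce() throws IOException {
        selector.select();
        for (SelectionKey key : selector.selectedKeys()) {
            ((EventHandler) key.attachment()).handleEvent(key);
        }
        selector.selectedKeys().clear(); // consume the processed events
    }

    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                dispatchOnce();
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```

A single Reactor thread can serve many channels this way, which is exactly the scalability advantage over the one thread per socket strategy.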
3 NIO FRAMEWORK DESIGN

Because the NIO Framework should be scalable to handle thousands of network connections simultaneously, the decision was made to use readiness selection as the multiplexing strategy, which is much more appropriate for high-performance I/O than the one thread per socket strategy.

3.1 Mapping the Reactor Design Pattern

If the reactor design pattern presented above had been used for the NIO Framework without modification, every application-specific ConcreteEventHandler would still have to take care of many NIO-specific details. These include buffers, queues, incomplete write operations, encryption of data streams and much more. To provide a simple API to Java network application programmers, the NIO Framework was complemented with several additional helper classes and interfaces that will be introduced in the following sections.

The concepts and techniques used to design and implement a safe and scalable framework that effectively exploits multiple processors are presented in (Peierls et al., 2005).

A simplified model of the NIO Framework core is shown in Figure 2.

The blue UML elements (Runnable, Thread, Selector, SelectionKey and Executor) are part of the Java Development Kit (JDK). The interface Runnable and the class Thread have been part of the JDK from the very beginning, Selector and SelectionKey were added with the NIO package in JDK v1.4, and the interface Executor was added with the concurrency package in JDK v1.5. The yellow UML elements (ChannelHandler, HandlerAdapter and Dispatcher) are the essential core classes of the NIO Framework.

The Dispatcher is a Thread that runs in an endless loop, processes registrations of ChannelHandlers with a channel (a nexus for I/O operations that represents an open connection to an entity such as a network socket) and uses an Executor to offload the execution of selected HandlerAdapters. The Executor interface hides the mechanics of how each task will be executed, including details of thread use, scheduling, etc. This abstraction is necessary because the NIO Framework may be used on a wide range of systems, from low-cost embedded devices up to high-performance multi-core servers.

The class Selector determines which registered channels are ready. The class SelectionKey associates a channel with a Selector, tells the Selector which events to monitor for the channel and holds a reference to an arbitrary object, called "attachment". In the current architecture the attachment is a HandlerAdapter.

The EventHandler from the reactor design pattern is split up into several components. The first component is the class HandlerAdapter. It manages all the operations on a channel (connect, read, write, close) and its queues, interacts with the Selector and SelectionKey classes and, most importantly, hides and encapsulates most NIO details from higher-level classes and interfaces.

The second EventHandler component in the NIO Framework is the interface ChannelHandler. It defines the methods that any application-specific channel handler class has to implement so that it can be used in the NIO Framework. These include:

public void channelRegistered(HandlerAdapter handlerAdapter)
This method gets called when a channel has been registered at the Dispatcher. It is mostly used in server type applications to send a welcome message to clients that have just connected.

public InputQueue getInputQueue()
This method returns the InputQueue that will be used by the HandlerAdapter if there is data to be read from the channel.

public OutputQueue getOutputQueue()
This method returns the OutputQueue that will be used by the HandlerAdapter if there is data to be written to the channel.

public void handleInput()
The HandlerAdapter calls this method if the InputQueue has new data to be read from the channel.

public void inputClosed()
This method gets called by the HandlerAdapter if no more data can be read from the InputQueue.

public void channelException(Exception exception)
The HandlerAdapter calls this method if an exception occurred while reading from or writing to the channel.

Not shown in Figure 2 are all the application-specific channel handlers that implement the interface ChannelHandler. They represent the ConcreteEventHandler of the reactor design pattern. Because the details of the method handleInput() may vary with every specific handler, they are outside the scope of the NIO Framework.

Table 1 shows the mappings from the reactor design pattern to the NIO Framework.
Figure 2: NIO Framework Core
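Collecting the method signatures above, the ChannelHandler contract can be summarized as a plain Java interface. The queue and adapter types are reduced here to empty placeholder interfaces, since only their roles matter for this sketch; the framework's real classes additionally handle buffering, transformations and channel state:

```java
// Simplified placeholders for the framework's collaborating types.
interface InputQueue { /* read-side data queue of a channel */ }
interface OutputQueue { /* write-side data queue of a channel */ }
interface HandlerAdapter { /* manages connect/read/write/close on a channel */ }

// The callback contract every application-specific handler implements.
interface ChannelHandler {
    // Called after the channel was registered at the Dispatcher,
    // e.g. to send a welcome message to a client that just connected.
    void channelRegistered(HandlerAdapter handlerAdapter);

    // Queue the HandlerAdapter fills when data arrives on the channel.
    InputQueue getInputQueue();

    // Queue the HandlerAdapter drains when data must be written.
    OutputQueue getOutputQueue();

    // Called when the InputQueue holds new data read from the channel.
    void handleInput();

    // Called when no more data can be read from the InputQueue.
    void inputClosed();

    // Called when reading from or writing to the channel failed.
    void channelException(Exception exception);
}
```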
Table 1: Mappings from reactor design pattern to the Java NIO Framework

Reactor Design Pattern    Java NIO Framework
Dispatcher                Dispatcher
Demultiplexer             Selector
Handle                    SelectionKey
EventHandler              HandlerAdapter, ChannelHandler, Executor
ConcreteEventHandler      n.a.

3.2 Parallelization

Some parts of the Java NIO Framework are parallelized by default; other parts can be customized to be parallelized.

3.2.1 Execution

The execution of all HandlerAdapters is off-loaded from the Dispatcher thread to an Executor. Because I/O operations are typically short-lived asynchronous tasks, the default Executor of the Java NIO Framework uses a thread pool that creates new threads as needed, but will reuse previously constructed threads when they are available. Threads that have not been used for a while are terminated and removed from the pool. Therefore, if the Executor remains idle for long enough, it will not consume any resources.

Not every I/O operation meets these typical criteria, e.g. SSL operations are comparatively long-lived. If the actual requirements (e.g. a certain thread usage or scheduling) are not met by the default Java NIO Framework Executor, it can be customized with the method setExecutor() of the Dispatcher. Because this method is thread-safe, the Executor can even be hot-swapped at runtime.

3.2.2 Selection

By default there is only one Dispatcher running in the Java NIO Framework, waiting until new events occur on channels represented by SelectionKeys. If the Dispatcher ever became the bottleneck of the framework, it could simply be parallelized by starting several Dispatcher instances.

Load-balancing could be done by distributing channel registrations between the parallel Dispatcher instances. Some of the simplest scheduling algorithms that could be applied are round-robin distribution or random scheduling.

If connection lifetimes have a high degree of variation, both algorithms could lead to a very unequal distribution of channels to Dispatchers. To prevent this scenario, an active channel counter could be integrated into every Dispatcher and a lowest-channel-counter-first scheduling algorithm could be used.

If connections have a high degree of "activity" variation, i.e. on some channels there is always something to read or write while other channels are mostly idle, the scheduling algorithm should be based on a select()-counter in the Dispatcher.

3.2.3 Accepting

Another thread, the Acceptor, is running in server type applications. It listens on a server socket for incoming connection requests from clients over the network. Every time a request comes in, the Acceptor creates a new channel and an appropriate handler and registers them both at the Dispatcher of the server type application (or Dispatchers, if selection was parallelized as described in Section 3.2.2).

Currently the Java NIO Framework does not support parallelization of Acceptors.

3.3 I/O Transformations

When application data units (objects, messages, etc.) have to be transmitted over a TCP network connection, they have to be transformed into a serialized representation of bytes.

There are many ways to represent application data and there are also many ways to serialize data into a byte stream. Therefore, countless transformations between application space and network space are imaginable.

The first approach to this problem in the NIO Framework was to provide an extensible hierarchy of classes, where every class dealt with a certain transformation (e.g. string serialization, SSL encryption). This architecture turned out to be very simple and efficient. The downside of this approach was that every combination of transformations required its own implementing class. Changing the order or composition of transformations was very difficult and much too inflexible for a generic framework.

The second and current approach to message transformation is to implement a set of transformer classes where each class offers just a certain transformation. An application programmer can put these transformer classes together into a hierarchy of almost arbitrary order. Almost no programming effort is required besides assembling the needed classes of the transformation hierarchy in the desired order.

A diagrammatic example of the I/O transformation architecture is shown in Figure 3. The shapes Tx are the transformation classes. When writing to a channel, the ChannelHandler hands
ment prototypes for all models above, it became clear that the simplest API was provided by using Java Generics and the 1:1 model. Another advantage of the 1:1 model is the encouragement of code reuse, because every transformation is implemented in a separate class.

The elegance and simplicity come at the small price of an almost immeasurable performance loss. Currently, Java Generics are implemented by type erasure: generic type information is present only at compile time, after which it is erased by the compiler. The compiler automatically inserts cast operations into the byte code at the necessary places, which may cause a tiny performance loss. Using the 1:1 model results in slightly longer transformation chains, more involved objects and more locking and unlocking when passing data through a transformation hierarchy.
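The 1:1 model with Java Generics can be illustrated with a small sketch. The Transformer class, its forward() composition and the two example stages below are ours, not the framework's actual API; the XorCipher is only a toy stand-in for a real encryption stage such as SSL:

```java
// Illustrative 1:1 transformer: each class offers exactly one typed
// transformation from I to O and hands its result to the next stage,
// so stages can be assembled in almost arbitrary order.
abstract class Transformer<I, O> {
    private Transformer<O, ?> next;

    // Assemble the hierarchy; generics ensure the stages fit together.
    void setNextTransformer(Transformer<O, ?> next) {
        this.next = next;
    }

    abstract O transform(I input);

    // Transform the data and pass it down the hierarchy.
    Object forward(I input) {
        O output = transform(input);
        return next == null ? output : next.forward(output);
    }
}

// One concrete transformation per class, e.g. string serialization ...
class StringSerializer extends Transformer<String, byte[]> {
    @Override
    byte[] transform(String s) {
        return s.getBytes(java.nio.charset.StandardCharsets.UTF_8);
    }
}

// ... and a toy "encryption" stage (XOR with a constant key byte).
class XorCipher extends Transformer<byte[], byte[]> {
    @Override
    byte[] transform(byte[] data) {
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = (byte) (data[i] ^ 0x5A);
        }
        return out;
    }
}
```

Assembling a hierarchy then takes no programming effort beyond wiring the stages, e.g. serializer.setNextTransformer(cipher); the compile-time-only generic parameters reflect exactly the type-erasure trade-off discussed above.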