Advanced Concurrency in Java
1. Overview
In this short article, we'll have a look at daemon threads in Java and see what they can be
used for. We'll also explain the difference between daemon threads and user threads.
Java offers two types of threads: user threads and daemon threads.
User threads are high-priority threads. The JVM will wait for every user thread to complete
its task before it terminates.
On the other hand, daemon threads are low-priority threads whose only role is to provide
services to user threads.
Since daemon threads are meant to serve user threads and are only needed while user threads
are running, they won't prevent the JVM from exiting once all user threads have finished their
execution.
That's why infinite loops, which typically exist in daemon threads, will not cause problems:
no code in a daemon thread, including finally blocks, is executed once all user threads have
finished their execution. For this reason, daemon threads are not recommended for I/O
tasks.
However, there are exceptions to this rule. Poorly designed code in daemon threads can
prevent the JVM from exiting. For example, calling Thread.join() on a running daemon
thread can block the shutdown of the application.
Daemon threads are useful for background supporting tasks such as garbage collection,
releasing memory of unused objects and removing unwanted entries from the cache. Most of
the JVM threads are daemon threads.
Any thread inherits the daemon status of the thread that created it. Since the main thread
is a user thread, any thread that is created inside the main method is by default a user thread.
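The inheritance rule can be checked directly. The following is a minimal sketch; the class DaemonInheritanceDemo and its method childIsDaemon() are hypothetical names introduced here for illustration:

```java
class DaemonInheritanceDemo {

    // Starts a parent thread with the given daemon flag and reports whether
    // a thread created inside it is a daemon thread by default.
    static boolean childIsDaemon(boolean parentIsDaemon) throws InterruptedException {
        boolean[] result = new boolean[1];
        Thread parent = new Thread(() -> {
            Thread child = new Thread(() -> { });
            result[0] = child.isDaemon(); // status inherited from the creating thread
        });
        parent.setDaemon(parentIsDaemon); // must happen before start()
        parent.start();
        parent.join(); // the parent finishes immediately, so this wait is short
        return result[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("child of user thread is daemon: " + childIsDaemon(false));
        System.out.println("child of daemon thread is daemon: " + childIsDaemon(true));
    }
}
```

A child created by a user thread reports false, and a child created by a daemon thread reports true, unless setDaemon() is called explicitly before the child starts.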
The method setDaemon() can only be called after the Thread object has been created and the
thread has not been started. An attempt to call setDaemon() while a thread is running will
throw an IllegalThreadStateException:
@Test(expected = IllegalThreadStateException.class)
public void whenSetDaemonWhileRunning_thenIllegalThreadStateException() {
    NewThread daemonThread = new NewThread();
    daemonThread.start();
    daemonThread.setDaemon(true);
}
Finally, to check if a thread is a daemon thread, we can simply call the method isDaemon():
@Test
public void whenCallIsDaemon_thenCorrect() {
    NewThread daemonThread = new NewThread();
    NewThread userThread = new NewThread();
    daemonThread.setDaemon(true);
    daemonThread.start();
    userThread.start();
    assertTrue(daemonThread.isDaemon());
    assertFalse(userThread.isDaemon());
}
6. Conclusion
In this quick tutorial, we've seen what daemon threads are and what they can be used for in a
few practical scenarios.
1. Overview
ExecutorService is a framework provided by the JDK which simplifies the execution of tasks
in asynchronous mode. Generally speaking, ExecutorService automatically provides a pool of
threads and API for assigning tasks to it.
2. Instantiating ExecutorService
There are several other factory methods for creating a predefined ExecutorService that meets a
specific use case. To find the best method for your needs, consult Oracle's official documentation.
callableTasks.add(callableTask);
callableTasks.add(callableTask);
Tasks can be assigned to the ExecutorService using several methods, including execute(),
which is inherited from the Executor interface, as well as submit(), invokeAny() and invokeAll().
The execute() method is void, so it gives no way to get the result of a task's
execution or to check the task's status (whether it is running or has completed).
executorService.execute(runnableTask);
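To contrast execute() with submit(), here is a small self-contained sketch. The class name ExecuteVsSubmitDemo, the pool size of 2 and the task bodies are assumptions made for illustration:

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ExecuteVsSubmitDemo {

    static String runBoth() throws InterruptedException, ExecutionException {
        // A pool size of 2 is an arbitrary choice for this sketch
        ExecutorService executorService = Executors.newFixedThreadPool(2);
        try {
            // execute() returns nothing: the task runs, but we get no handle on it
            executorService.execute(() -> System.out.println("Runnable task executed"));

            // submit() returns a Future that gives access to the result
            Future<String> future = executorService.submit(() -> "Callable task result");
            return future.get(); // blocks until the task has finished
        } finally {
            executorService.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runBoth());
    }
}
```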
Now, before going any further, two more things must be discussed: shutting down
an ExecutorService and dealing with Future return types.
5. The Future Interface
If the execution period is longer than specified (in this case 200 milliseconds),
a TimeoutException will be thrown.
The isDone() method can be used to check if the assigned task is already processed or not.
The Future interface also lets us cancel task execution with the cancel() method
and check for cancellation with the isCancelled() method:
boolean canceled = future.cancel(true);
boolean isCancelled = future.isCancelled();
6. The ScheduledExecutorService Interface
7. ExecutorService vs. Fork/Join
8. Conclusion
Despite the relative simplicity of ExecutorService, there are a few common pitfalls. Let's
summarize them:
Keeping an unused ExecutorService alive: There is a detailed explanation in section 4 of
this article about how to shut down an ExecutorService.
Wrong thread-pool capacity while using a fixed-length thread pool: It is very important to
determine how many threads the application will need to execute tasks efficiently. A thread
pool that is too large causes unnecessary overhead just to create threads that will mostly
sit in waiting mode. One that is too small can make an application seem unresponsive because
of long waiting periods for tasks in the queue.
Calling a Future's get() method after task cancellation: An attempt to get the result of an
already canceled task will trigger a CancellationException.
Unexpectedly long blocking with a Future's get() method: Timeouts should be used to avoid
unexpected waits.
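For the first pitfall, one common way to shut a pool down is sketched below. It follows the two-step pattern of shutdown() followed by shutdownNow(), and is not necessarily the exact approach from section 4. The class ExecutorShutdownSketch, the method shutdownAndAwait() and the timeout values are assumptions:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ExecutorShutdownSketch {

    // Returns true if the executor terminated within the allowed time
    static boolean shutdownAndAwait(ExecutorService executor, long timeoutMillis) {
        executor.shutdown(); // stop accepting new tasks, let queued ones finish
        try {
            if (executor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return true;
            }
            executor.shutdownNow(); // interrupt tasks that are still running
            return executor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            executor.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        executor.execute(() -> System.out.println("working"));
        System.out.println("terminated: " + shutdownAndAwait(executor, 800));
    }
}
```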
1. Overview
The fork/join framework was introduced in Java 7. It provides tools to help speed up parallel
processing by attempting to use all available processor cores, which is
accomplished through a divide-and-conquer approach.
In practice, this means that the framework first “forks”, recursively breaking the task into
smaller independent subtasks until they are simple enough to be executed asynchronously.
After that, the “join” part begins, in which results of all subtasks are recursively joined into
a single result, or in the case of a task which returns void, the program simply waits until
every subtask is executed.
To provide effective parallel execution, the fork/join framework uses a pool of threads called
the ForkJoinPool, which manages worker threads of type ForkJoinWorkerThread.
2. ForkJoinPool
Simply put – free threads try to “steal” work from deques of busy threads.
By default, a worker thread gets tasks from the head of its own deque. When it is empty, the
thread takes a task from the tail of the deque of another busy thread or from the global entry
queue, since this is where the biggest pieces of work are likely to be located.
This approach minimizes the possibility that threads will compete for tasks. It also reduces
the number of times the thread will have to go looking for work, as it works on the biggest
available chunks of work first.
2.2. ForkJoinPool Instantiation
In Java 8, the most convenient way to get access to the instance of the ForkJoinPool is to use
its static method commonPool(). As its name suggests, this will provide a reference to the
common pool, which is a default thread pool for every ForkJoinTask.
According to Oracle’s documentation, using the predefined common pool reduces resource
consumption, since this discourages the creation of a separate thread pool per task.
ForkJoinPool commonPool = ForkJoinPool.commonPool();
The same behavior can be achieved in Java 7 by creating a ForkJoinPool and assigning it to
a public static field of a utility class:
public static ForkJoinPool forkJoinPool = new ForkJoinPool(2);
Now it can be easily accessed:
ForkJoinPool forkJoinPool = PoolUtil.forkJoinPool;
With ForkJoinPool's constructors, it is possible to create a custom thread pool with a
specific level of parallelism, thread factory, and exception handler. In the example above, the
pool has a parallelism level of 2. This means that the pool aims to keep 2 worker threads
active, and so uses at most about 2 processor cores at a time.
3. ForkJoinTask<V>
ForkJoinTask is the base type for tasks executed inside ForkJoinPool. In practice, one of its
two subclasses should be extended: the RecursiveAction for void tasks and
the RecursiveTask<V> for tasks that return a value.
3.1. RecursiveAction – an Example
@Override
protected void compute() {
    if (workload.length() > THRESHOLD) {
        ForkJoinTask.invokeAll(createSubtasks());
    } else {
        processing(workload);
    }
}
subtasks.add(new CustomRecursiveAction(partOne));
subtasks.add(new CustomRecursiveAction(partTwo));
return subtasks;
}
3.2. RecursiveTask<V>
For tasks that return a value, the logic here is similar, except that the result for each subtask is
united in a single result:
public class CustomRecursiveTask extends RecursiveTask<Integer> {
    private int[] arr;
    @Override
    protected Integer compute() {
        if (arr.length > THRESHOLD) {
            return ForkJoinTask.invokeAll(createSubtasks())
              .stream()
              .mapToInt(ForkJoinTask::join)
              .sum();
        } else {
            return processing(arr);
        }
    }
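As a complete, runnable illustration of the fork and join steps, the following sketch sums an int array. It is not the article's CustomRecursiveTask: the class ArraySumTask, its index-range fields and the THRESHOLD value of 1,000 are assumptions, and it splits by index range rather than by copying sub-arrays:

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class ArraySumTask extends RecursiveTask<Long> {

    // Hypothetical threshold: ranges of this size or smaller are summed directly
    private static final int THRESHOLD = 1_000;

    private final int[] arr;
    private final int from; // inclusive
    private final int to;   // exclusive

    ArraySumTask(int[] arr, int from, int to) {
        this.arr = arr;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from > THRESHOLD) {
            // Fork: split the range in half and run both halves as subtasks
            int mid = from + (to - from) / 2;
            ArraySumTask left = new ArraySumTask(arr, from, mid);
            ArraySumTask right = new ArraySumTask(arr, mid, to);
            invokeAll(left, right);
            // Join: combine the partial results
            return left.join() + right.join();
        }
        long sum = 0;
        for (int i = from; i < to; i++) {
            sum += arr[i];
        }
        return sum;
    }

    // Runs the task in the common pool and waits for the result
    static long sum(int[] arr) {
        return ForkJoinPool.commonPool().invoke(new ArraySumTask(arr, 0, arr.length));
    }
}
```

Calling ArraySumTask.sum(data) from ordinary code blocks until the whole task tree has been joined.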
5. Conclusions
Using the fork/join framework can speed up processing of large tasks, but to achieve this
outcome, some guidelines should be followed:
Use as few thread pools as possible – in most cases, the best decision is to use one
thread pool per application or system
Use the default common thread pool, if no specific tuning is needed
Use a reasonable threshold for splitting ForkJoinTask into subtasks
Avoid any blocking in your ForkJoinTasks
1. Overview
Java 8 introduced the concept of Streams as an efficient way of carrying out bulk operations
on data. Parallel Streams can be obtained in environments that support concurrency.
These streams can come with improved performance, at the cost of multi-threading
overhead.
In this quick tutorial, we'll look at one of the biggest limitations of the Stream API and see
how to make a parallel stream work with a custom thread pool instance. Alternatively, there's
a library that handles this.
2. Parallel Stream
assertTrue(parallelStream.isParallel());
}
The default processing that occurs in such a Stream uses
the ForkJoinPool.commonPool(), a Thread Pool shared by the entire application.
3. Custom Thread Pool
long firstNum = 1;
long lastNum = 1_000_000;
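One widely used way to run a parallel stream in a custom pool is to submit the terminal operation as a task to a ForkJoinPool. The sketch below assumes that approach. Note that it relies on how the stream implementation behaves when invoked from a fork/join worker thread, which the Stream API documentation does not guarantee. The class CustomPoolStreamSketch and the method sumInCustomPool() are hypothetical names:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.LongStream;

class CustomPoolStreamSketch {

    static long sumInCustomPool(long firstNum, long lastNum, int parallelism)
      throws InterruptedException, ExecutionException {
        ForkJoinPool customThreadPool = new ForkJoinPool(parallelism);
        try {
            // The stream pipeline is started from inside the custom pool
            Callable<Long> task =
              () -> LongStream.rangeClosed(firstNum, lastNum).parallel().sum();
            return customThreadPool.submit(task).get();
        } finally {
            // Shut the pool down so its worker threads do not linger
            customThreadPool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sumInCustomPool(1, 1_000_000, 4));
    }
}
```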
4. Conclusion
1. Introduction
In this article, we'll give a guide to the CountDownLatch class and demonstrate how it can be
used in a few practical examples.
Essentially, by using a CountDownLatch we can cause a thread to block until other threads
have completed a given task.
Let's try out this pattern by creating a Worker and using a CountDownLatch field to signal
when it has completed:
public class Worker implements Runnable {
    private List<String> outputScraper;
    private CountDownLatch countDownLatch;
    @Override
    public void run() {
        doSomeWork();
        outputScraper.add("Counted down");
        countDownLatch.countDown();
    }
}
Then, let's create a test in order to prove that we can get a CountDownLatch to wait for
the Worker instances to complete:
@Test
public void whenParallelProcessing_thenMainThreadWillBlockUntilCompletion()
  throws InterruptedException {
    workers.forEach(Thread::start);
    countDownLatch.await();
    outputScraper.add("Latch released");
    assertThat(outputScraper)
      .containsExactly(
        "Counted down",
        "Counted down",
        "Counted down",
        "Counted down",
        "Counted down",
        "Latch released"
      );
}
Naturally, “Latch released” will always be the last output, as it's dependent on
the CountDownLatch releasing.
Note that if we didn't call await(), we wouldn't be able to guarantee the ordering of the
execution of the threads, so the test would randomly fail.
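A self-contained version of this pattern, without the test framework, might look as follows. The class LatchOrderingSketch and the method runWorkers() are assumptions, and the workers do no real work:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

class LatchOrderingSketch {

    static List<String> runWorkers(int workerCount) throws InterruptedException {
        // Synchronized because several worker threads append concurrently
        List<String> outputScraper = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch countDownLatch = new CountDownLatch(workerCount);

        for (int i = 0; i < workerCount; i++) {
            new Thread(() -> {
                outputScraper.add("Counted down");
                countDownLatch.countDown();
            }).start();
        }

        countDownLatch.await(); // blocks until every worker has counted down
        outputScraper.add("Latch released");
        return outputScraper;
    }

    public static void main(String[] args) throws InterruptedException {
        runWorkers(5).forEach(System.out::println);
    }
}
```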
If we took the previous example, but this time started thousands of threads instead of five, it's
likely that many of the earlier ones will have finished processing before we have even
called start() on the later ones. This could make it difficult to try and reproduce a
concurrency problem, as we wouldn't be able to get all our threads to run in parallel.
To get around this, let's get the CountDownLatch to work differently than in the previous
example. Instead of blocking a parent thread until some child threads have finished, we can
block each child thread until all the others have started.
Let's modify our run() method so it blocks before processing:
public class WaitingWorker implements Runnable {
    public WaitingWorker(
      List<String> outputScraper,
      CountDownLatch readyThreadCounter,
      CountDownLatch callingThreadBlocker,
      CountDownLatch completedThreadCounter) {
        this.outputScraper = outputScraper;
        this.readyThreadCounter = readyThreadCounter;
        this.callingThreadBlocker = callingThreadBlocker;
        this.completedThreadCounter = completedThreadCounter;
    }
    @Override
    public void run() {
        readyThreadCounter.countDown();
        try {
            callingThreadBlocker.await();
            doSomeWork();
            outputScraper.add("Counted down");
        } catch (InterruptedException e) {
            e.printStackTrace();
        } finally {
            completedThreadCounter.countDown();
        }
    }
}
Now, let's modify our test so it blocks until all the Workers have started, unblocks
the Workers, and then blocks until the Workers have finished:
@Test
public void whenDoingLotsOfThreadsInParallel_thenStartThemAtTheSameTime()
  throws InterruptedException {
    workers.forEach(Thread::start);
    readyThreadCounter.await();
    outputScraper.add("Workers ready");
    callingThreadBlocker.countDown();
    completedThreadCounter.await();
    outputScraper.add("Workers complete");
    assertThat(outputScraper)
      .containsExactly(
        "Workers ready",
        "Counted down",
        "Counted down",
        "Counted down",
        "Counted down",
        "Counted down",
        "Workers complete"
      );
}
This pattern is really useful for trying to reproduce concurrency bugs, as it can be used to
force thousands of threads to try and perform some logic in parallel.
5. Terminating a CountDownLatch Early
Sometimes, we may run into a situation where the Workers terminate in error before counting
down the CountDownLatch. This could result in it never reaching zero and await() never
terminating:
@Override
public void run() {
    if (true) {
        throw new RuntimeException("Oh dear, I'm a BrokenWorker");
    }
    countDownLatch.countDown();
    outputScraper.add("Counted down");
}
Let's modify our earlier test to use a BrokenWorker, in order to show how await() will block
forever:
@Test
public void whenFailingToParallelProcess_thenMainThreadShouldGetNotGetStuck()
  throws InterruptedException {
      .limit(5)
      .collect(toList());
    workers.forEach(Thread::start);
    countDownLatch.await();
}
Clearly, this is not the behavior we want. It would be much better for the application to
continue than to block indefinitely.
To get around this, let's add a timeout argument to our call to await().
boolean completed = countDownLatch.await(3L, TimeUnit.SECONDS);
assertThat(completed).isFalse();
As we can see, the test will eventually time out and await() will return false.
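The sketch below shows both outcomes of a timed await() when every worker fails. It also shows a complementary safeguard, counting down in a finally block, as the WaitingWorker above does. The class LatchTimeoutSketch, the 500 ms timeout and the silenced uncaught-exception handler are assumptions made to keep the example quiet and short:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class LatchTimeoutSketch {

    // Starts three workers that always fail; returns what the timed await() reports
    static boolean awaitFailingWorkers(boolean countDownInFinally)
      throws InterruptedException {
        CountDownLatch latch = new CountDownLatch(3);
        for (int i = 0; i < 3; i++) {
            Thread worker = new Thread(() -> {
                try {
                    throw new IllegalStateException("simulated worker failure");
                } finally {
                    if (countDownInFinally) {
                        latch.countDown(); // still runs although the worker failed
                    }
                }
            });
            // Silence the default stack trace for the simulated failure
            worker.setUncaughtExceptionHandler((t, e) -> { });
            worker.start();
        }
        // false means the timeout elapsed before the count reached zero
        return latch.await(500, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("without finally: " + awaitFailingWorkers(false));
        System.out.println("with finally: " + awaitFailingWorkers(true));
    }
}
```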
6. Conclusion
In this quick guide, we've demonstrated how we can use a CountDownLatch in order to block
a thread until other threads have finished some processing.
We've also shown how it can be used to help debug concurrency issues by making sure
threads run in parallel.
Guide to java.util.concurrent.Locks
1. Overview
Simply put, a lock is a more flexible and sophisticated thread synchronization mechanism
than the standard synchronized block.
The Lock interface has been around since Java 1.5. It's defined inside
the java.util.concurrent.locks package, and it provides extensive operations for locking.
In this article, we'll explore different implementations of the Lock interface and their
applications.
3. Lock API
void lock() – acquires the lock if it's available; if the lock isn't available, the thread is
blocked until the lock is released
void lockInterruptibly() – similar to lock(), but it allows the blocked thread
to be interrupted and to resume execution through a
thrown java.lang.InterruptedException
boolean tryLock() – a non-blocking version of the lock() method; it attempts to
acquire the lock immediately and returns true if locking succeeds
boolean tryLock(long timeout, TimeUnit timeUnit) – similar
to tryLock(), except that it waits up to the given timeout before giving up trying to acquire
the Lock
void unlock() – unlocks the Lock instance
4. Lock Implementations
4.1. ReentrantLock
        if(isLockAcquired) {
            try {
                //Critical section here
            } finally {
                lock.unlock();
            }
        }
        //...
    }
In this case, the thread calling tryLock() will wait for one second and will give up waiting if
the lock isn't available.
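To see both the successful and the unsuccessful path of a timed tryLock(), consider this sketch. The class TryLockCounter, its counter field and the 100 ms timeout are assumptions made for illustration:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

class TryLockCounter {

    private final ReentrantLock lock = new ReentrantLock();
    private int counter;

    // Increments the counter only if the lock can be acquired in time
    boolean tryIncrement(long timeout, TimeUnit unit) throws InterruptedException {
        boolean isLockAcquired = lock.tryLock(timeout, unit);
        if (isLockAcquired) {
            try {
                counter++; // critical section
            } finally {
                lock.unlock();
            }
        }
        return isLockAcquired;
    }

    // Lets a second thread attempt the increment while this thread holds the lock
    static boolean attemptWhileLockIsHeld() throws InterruptedException {
        TryLockCounter shared = new TryLockCounter();
        AtomicBoolean acquired = new AtomicBoolean(true);
        shared.lock.lock();
        try {
            Thread other = new Thread(() -> {
                try {
                    acquired.set(shared.tryIncrement(100, TimeUnit.MILLISECONDS));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            other.start();
            other.join(); // the other thread gives up after about 100 ms
        } finally {
            shared.lock.unlock();
        }
        return acquired.get();
    }
}
```

Because the lock is held for the whole duration of the attempt, attemptWhileLockIsHeld() reports that the second thread did not acquire it.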
4.2. ReentrantReadWriteLock
Read Lock – if no thread has acquired the write lock or requested it, then multiple
threads can acquire the read lock
Write Lock – if no threads are reading or writing, then only one thread can acquire the
write lock
        // Lock before the try block, so finally only unlocks a lock we actually hold
        writeLock.lock();
        try {
            return syncHashMap.remove(key);
        } finally {
            writeLock.unlock();
        }
    }
    //...
}
For both write methods, we need to surround the critical section with the write lock, so that
only one thread can get access to it at a time. The read method is guarded by the read lock in
the same way:
Lock readLock = lock.readLock();
//...
public String get(String key){
    // Lock before the try block, so finally only unlocks a lock we actually hold
    readLock.lock();
    try {
        return syncHashMap.get(key);
    } finally {
        readLock.unlock();
    }
}
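Putting the read and write paths together, a compact, compilable version of such a map wrapper could look like this. The class name ReadWriteLockMapSketch is an assumption; the field names follow the fragments above:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class ReadWriteLockMapSketch {

    private final Map<String, String> syncHashMap = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final Lock writeLock = lock.writeLock();
    private final Lock readLock = lock.readLock();

    // Write path: exclusive access
    public void put(String key, String value) {
        writeLock.lock();
        try {
            syncHashMap.put(key, value);
        } finally {
            writeLock.unlock();
        }
    }

    // Write path: exclusive access
    public String remove(String key) {
        writeLock.lock();
        try {
            return syncHashMap.remove(key);
        } finally {
            writeLock.unlock();
        }
    }

    // Read path: shared with other readers while no writer holds the lock
    public String get(String key) {
        readLock.lock();
        try {
            return syncHashMap.get(key);
        } finally {
            readLock.unlock();
        }
    }
}
```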
4.3. StampedLock
StampedLock was introduced in Java 8. It also supports both read and write locks. However,
lock acquisition methods return a stamp that is used to release a lock or to check if the lock is
still valid:
public class StampedLockDemo {
    Map<String,String> map = new HashMap<>();
    private StampedLock lock = new StampedLock();
            map.put(key, value);
        } finally {
            lock.unlockWrite(stamp);
        }
    }
        if(!lock.validate(stamp)) {
            stamp = lock.readLock();
            try {
                return map.get(key);
            } finally {
                lock.unlock(stamp);
            }
        }
        return value;
    }
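The optimistic-read idiom (read first, validate the stamp, fall back to a real read lock) is sketched below on a different data structure than the map above. Two primitive fields are used because an unlocked read of a plain HashMap during a concurrent write is risky. The class StampedPoint and its methods are assumptions, modeled on the usual point example for this lock:

```java
import java.util.concurrent.locks.StampedLock;

class StampedPoint {

    private final StampedLock lock = new StampedLock();
    private double x;
    private double y;

    void move(double deltaX, double deltaY) {
        long stamp = lock.writeLock(); // exclusive access
        try {
            x += deltaX;
            y += deltaY;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    double distanceFromOrigin() {
        long stamp = lock.tryOptimisticRead(); // does not block writers
        double currentX = x;
        double currentY = y;
        if (!lock.validate(stamp)) {
            // A write happened in between: fall back to a regular read lock
            stamp = lock.readLock();
            try {
                currentX = x;
                currentY = y;
            } finally {
                lock.unlockRead(stamp);
            }
        }
        return Math.hypot(currentX, currentY);
    }
}
```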
5. Working With Conditions
The Condition interface provides the ability for a thread to wait for some condition to occur
while executing the critical section.
This can occur when a thread acquires access to the critical section but doesn't have the
necessary condition to perform its operation. For example, a reader thread can get access to
the lock of a shared queue that still doesn't have any data to consume.
Traditionally, Java provides the wait(), notify() and notifyAll() methods for thread
intercommunication. Conditions have similar mechanisms, but in addition, we can specify
multiple conditions:
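As one possible illustration, the following sketch guards a bounded stack with two separate conditions, one for "not empty" and one for "not full". The class ConditionStackSketch, its capacity parameter and the condition names are assumptions made for this example:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class ConditionStackSketch {

    private final Deque<String> stack = new ArrayDeque<>();
    private final int capacity;
    private final ReentrantLock lock = new ReentrantLock();
    // Two conditions on the same lock: one per reason to wait
    private final Condition stackEmptyCondition = lock.newCondition();
    private final Condition stackFullCondition = lock.newCondition();

    ConditionStackSketch(int capacity) {
        this.capacity = capacity;
    }

    void push(String item) throws InterruptedException {
        lock.lock();
        try {
            while (stack.size() == capacity) {
                stackFullCondition.await(); // releases the lock while waiting
            }
            stack.push(item);
            stackEmptyCondition.signalAll(); // wake up threads waiting to pop
        } finally {
            lock.unlock();
        }
    }

    String pop() throws InterruptedException {
        lock.lock();
        try {
            while (stack.isEmpty()) {
                stackEmptyCondition.await();
            }
            String item = stack.pop();
            stackFullCondition.signalAll(); // wake up threads waiting to push
            return item;
        } finally {
            lock.unlock();
        }
    }
}
```

The waits sit inside while loops so that a thread rechecks its condition after every wake-up.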
6. Conclusion
In this article, we have seen different implementations of the Lock interface and the newly
introduced StampedLock class. We also explored how we can make use of
the Condition interface to work with multiple conditions.
1. Overview
2. After Executor's Shutdown
3. Using CountDownLatch
Next, let's look at another approach to solving this problem – using a CountDownLatch to
signal the completion of a task.
We can initialize it with a value that represents the number of times it can be decremented
before all threads that have called the await() method are notified.
For example, if we need the current thread to wait for another N threads to finish their
execution, we can initialize the latch using N:
ExecutorService WORKER_THREAD_POOL
= Executors.newFixedThreadPool(10);
4. Using invokeAll()
The first approach that we can use to run threads is the invokeAll() method. The method
returns a list of Future objects after all tasks finish or the timeout expires.
Also, we must note that the order of the returned Future objects is the same as the list of the
provided Callable objects:
ExecutorService WORKER_THREAD_POOL =
Executors.newFixedThreadPool(10);
awaitTerminationAfterShutdown(WORKER_THREAD_POOL);
assertTrue("fast thread".equals(firstThreadResponse));
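The ordering guarantee of invokeAll() is easiest to see when the slower task is listed first. In this sketch the class InvokeAllOrderSketch, the helper delayed() and the delays of 300 ms and 50 ms are assumptions:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class InvokeAllOrderSketch {

    // Hypothetical helper: a task that sleeps, then returns its own name
    static Callable<String> delayed(String name, long millis) {
        return () -> {
            Thread.sleep(millis);
            return name;
        };
    }

    static List<String> resultsInSubmissionOrder()
      throws InterruptedException, ExecutionException {
        ExecutorService workerThreadPool = Executors.newFixedThreadPool(10);
        try {
            List<Callable<String>> callables = Arrays.asList(
              delayed("slow thread", 300),
              delayed("fast thread", 50));

            // Blocks until all tasks have finished
            List<Future<String>> futures = workerThreadPool.invokeAll(callables);

            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());
            }
            return results;
        } finally {
            workerThreadPool.shutdown();
        }
    }
}
```

Although the fast task completes first, the returned list starts with the slow task's result, because the futures follow the order of the submitted Callable objects.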
5. Using ExecutorCompletionService
future = service.take();
String secondThreadResponse = future.get();
totalProcessingTime
= System.currentTimeMillis() - startProcessingTime;
assertTrue(
"Last response should be from the slow thread",
"slow thread".equals(secondThreadResponse));
assertTrue(
totalProcessingTime >= 3000
&& totalProcessingTime < 4000);
LOG.debug("Thread finished after: " + totalProcessingTime
+ " milliseconds");
awaitTerminationAfterShutdown(WORKER_THREAD_POOL);
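By contrast, ExecutorCompletionService hands results back in completion order. The sketch below mirrors the fragment above with shorter, assumed delays; the class CompletionOrderSketch and the helper delayed() are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class CompletionOrderSketch {

    // Hypothetical helper: a task that sleeps, then returns its own name
    static Callable<String> delayed(String name, long millis) {
        return () -> {
            Thread.sleep(millis);
            return name;
        };
    }

    static List<String> resultsInCompletionOrder()
      throws InterruptedException, ExecutionException {
        ExecutorService workerThreadPool = Executors.newFixedThreadPool(10);
        CompletionService<String> service
          = new ExecutorCompletionService<>(workerThreadPool);
        try {
            service.submit(delayed("slow thread", 300));
            service.submit(delayed("fast thread", 50));

            List<String> results = new ArrayList<>();
            for (int i = 0; i < 2; i++) {
                // take() blocks until the next task, in completion order, is done
                Future<String> future = service.take();
                results.add(future.get());
            }
            return results;
        } finally {
            workerThreadPool.shutdown();
        }
    }
}
```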
6. Conclusion
Depending on the use case, we have various options to wait for threads to finish their
execution.
A CountDownLatch is useful when we need a mechanism to notify one or more threads
that a set of operations performed by other threads has finished.
ExecutorCompletionService is useful when we need to access each task result as soon as
possible, while the other approaches fit when we want to wait for all of the running tasks to
finish.
1. Overview
2. Phaser API
Let's say that we want to coordinate multiple phases of actions. Three threads will process the
first phase, and two threads will process the second phase.
We'll create a LongRunningAction class that implements the Runnable interface:
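class LongRunningAction implements Runnable {
    private String threadName;
    private Phaser ph;

    // Signature matches the calls used later: new LongRunningAction("thread-1", ph)
    LongRunningAction(String threadName, Phaser ph) {
        this.threadName = threadName;
        this.ph = ph;
        ph.register();
    }

    @Override
    public void run() {
        ph.arriveAndAwaitAdvance();
        try {
            Thread.sleep(20);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        ph.arriveAndDeregister();
    }
}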
When our action class is instantiated, we're registering to the Phaser instance using
the register() method. This will increment the number of threads using that specific Phaser.
The call to the arriveAndAwaitAdvance() will cause the current thread to wait on the barrier.
As already mentioned, when the number of arrived parties becomes the same as the number
of registered parties, the execution will continue.
After the processing is done, the current thread is deregistering itself by calling
the arriveAndDeregister() method.
Let's create a test case in which we will start three LongRunningAction threads and block on
the barrier. Next, after the action is finished, we will create two
additional LongRunningAction threads that will perform processing of the next phase.
When creating Phaser instance from the main thread, we're passing 1 as an argument. This is
equivalent to calling the register() method from the current thread. We're doing this because,
when we're creating three worker threads, the main thread is a coordinator, and therefore
the Phaser needs to have four threads registered to it:
ExecutorService executorService = Executors.newCachedThreadPool();
Phaser ph = new Phaser(1);
assertEquals(0, ph.getPhase());
The phase after the initialization is equal to zero.
The Phaser class has a constructor in which we can pass a parent instance to it. It is useful in
cases where we have large numbers of parties that would experience massive synchronization
contention costs. In such situations, instances of Phasers may be set up so that groups of sub-
phasers share a common parent.
Next, let's start three LongRunningAction threads, which will wait on the barrier
until we call the arriveAndAwaitAdvance() method from the main thread.
Keep in mind we've initialized our Phaser with 1 and called register() three more times.
Now, three action threads have announced that they've arrived at the barrier, so one more call
of arriveAndAwaitAdvance() is needed – the one from the main thread:
executorService.submit(new LongRunningAction("thread-1", ph));
ph.arriveAndAwaitAdvance();
assertEquals(1, ph.getPhase());
After the completion of that phase, the getPhase() method will return one because the
program finished processing the first step of execution.
Let's say that two threads should conduct the next phase of processing. We can
leverage Phaser to achieve that because it allows us to configure dynamically the number of
threads that should wait on the barrier. We're starting two new threads, but these will not
proceed to execute until the call to the arriveAndAwaitAdvance() from the main thread (same
as in the previous case):
executorService.submit(new LongRunningAction("thread-4", ph));
executorService.submit(new LongRunningAction("thread-5", ph));
ph.arriveAndAwaitAdvance();
assertEquals(2, ph.getPhase());
ph.arriveAndDeregister();
After this, the getPhase() method will return phase number equal to two. When we want to
finish our program, we need to call the arriveAndDeregister() method as the main thread is
still registered in the Phaser. When the deregistration causes the number of registered parties
to become zero, the Phaser is terminated. All calls to synchronization methods will not block
anymore and will return immediately.
Running the program will produce the following output (full source code with the print line
statements can be found in the code repository):
This is phase 0
This is phase 0
This is phase 0
Thread thread-2 before long running action
Thread thread-1 before long running action
Thread thread-3 before long running action
This is phase 1
This is phase 1
Thread thread-4 before long running action
Thread thread-5 before long running action
We see that all threads are waiting for execution until the barrier opens. The next phase of the
execution is performed only when the previous one has finished successfully.
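The whole flow can be condensed into one runnable sketch that records the phase numbers instead of printing them. The class PhaserSketch and its nested Action class, a simplified stand-in for LongRunningAction without the sleep and print statements, are assumptions:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Phaser;
import java.util.concurrent.TimeUnit;

class PhaserSketch {

    static class Action implements Runnable {
        private final Phaser ph;

        Action(Phaser ph) {
            this.ph = ph;
            ph.register(); // registered on the creating thread, before submission
        }

        @Override
        public void run() {
            ph.arriveAndAwaitAdvance(); // wait on the barrier
            ph.arriveAndDeregister();   // done: leave the Phaser
        }
    }

    static List<Integer> observedPhases() throws InterruptedException {
        ExecutorService executorService = Executors.newCachedThreadPool();
        Phaser ph = new Phaser(1); // the main thread is the first party
        List<Integer> phases = new ArrayList<>();
        phases.add(ph.getPhase());

        for (int i = 0; i < 3; i++) {
            executorService.submit(new Action(ph));
        }
        ph.arriveAndAwaitAdvance(); // the fourth arrival opens the barrier
        phases.add(ph.getPhase());

        for (int i = 0; i < 2; i++) {
            executorService.submit(new Action(ph));
        }
        ph.arriveAndAwaitAdvance();
        phases.add(ph.getPhase());

        ph.arriveAndDeregister(); // the main thread leaves as well
        executorService.shutdown();
        executorService.awaitTermination(5, TimeUnit.SECONDS);
        return phases;
    }
}
```

The recorded phases are 0, 1 and 2, because each phase can only advance once the main thread has arrived too.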
4. Conclusion
Guide To CompletableFuture
1. Introduction
This tutorial is a guide to the functionality and use cases of the CompletableFuture class that
was introduced as a Java 8 Concurrency API improvement.
3. Using CompletableFuture as a Simple Future
    Executors.newCachedThreadPool().submit(() -> {
        Thread.sleep(500);
        completableFuture.complete("Hello");
        return null;
    });
    return completableFuture;
}
To spin off the computation, we use the Executor API. This method of creating and
completing a CompletableFuture can be used together with any concurrency mechanism or
API, including raw threads.
Notice that the calculateAsync method returns a Future instance.
We simply call the method, receive the Future instance, and call the get method on it when
we're ready to block for the result.
Also observe that the get method throws some checked exceptions,
namely ExecutionException (encapsulating an exception that occurred during a computation)
and InterruptedException (an exception signifying that a thread executing a method was
interrupted):
Future<String> completableFuture = calculateAsync();
// ...
// ...
The code above allows us to pick any mechanism of concurrent execution, but what if we
want to skip this boilerplate and simply execute some code asynchronously?
Static methods runAsync and supplyAsync allow us to create a CompletableFuture instance
out of Runnable and Supplier functional types correspondingly.
// ...
assertEquals("Hello", future.get());
The most generic way to process the result of a computation is to feed it to a function.
The thenApply method does exactly that; it accepts a Function instance, uses it to process the
result, and returns a Future that holds a value returned by a function:
CompletableFuture<String> completableFuture
= CompletableFuture.supplyAsync(() -> "Hello");
future.get();
Finally, if we neither need the value of the computation, nor want to return some value at the
end of the chain, then we can pass a Runnable lambda to the thenRun method. In the
following example, we simply print a line in the console after calling the future.get():
CompletableFuture<String> completableFuture
future.get();
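The result-processing methods can be compared side by side in one sketch. It uses thenApply and thenRun as described above, plus thenAccept, which consumes the value without passing one on. The class ResultProcessingSketch and the logged strings are assumptions:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class ResultProcessingSketch {

    static List<String> run() {
        List<String> log = Collections.synchronizedList(new ArrayList<>());

        CompletableFuture<String> completableFuture
          = CompletableFuture.supplyAsync(() -> "Hello");

        // thenApply: transform the value and pass the new value on
        CompletableFuture<String> applied
          = completableFuture.thenApply(s -> s + " World");
        log.add("thenApply: " + applied.join());

        // thenAccept: consume the value, nothing is passed on
        CompletableFuture<Void> accepted
          = completableFuture.thenAccept(s -> log.add("thenAccept: " + s));
        accepted.join();

        // thenRun: neither uses the value nor returns one
        CompletableFuture<Void> ran
          = completableFuture.thenRun(() -> log.add("thenRun: finished"));
        ran.join();

        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```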
6. Combining Futures
7. Difference Between thenApply() and thenCompose()
7.1. thenApply()
We can use this method to work with the result of the previous call. However, a key point
to remember is that the returned future wraps whatever the function returns, so a function
that itself returns a CompletableFuture produces a nested future.
So this method is useful when we want to transform the result of a CompletableFuture call:
CompletableFuture<Integer> finalResult = compute().thenApply(s -> s + 1);
7.2. thenCompose()
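As a rough sketch of how the two methods differ, the example below chains the same function with thenApply() and with thenCompose(). The helpers compute() and computeAnother() and the class ThenApplyVsThenComposeSketch are hypothetical and stand in for any asynchronous computations:

```java
import java.util.concurrent.CompletableFuture;

class ThenApplyVsThenComposeSketch {

    // compute() and computeAnother() are hypothetical helpers for this sketch
    static CompletableFuture<Integer> compute() {
        return CompletableFuture.supplyAsync(() -> 10);
    }

    static CompletableFuture<Integer> computeAnother(Integer i) {
        return CompletableFuture.supplyAsync(() -> 10 + i);
    }

    static int applied() {
        // The function returns a plain value, so the result stays flat
        CompletableFuture<Integer> finalResult = compute().thenApply(s -> s + 1);
        return finalResult.join();
    }

    static int nested() {
        // The function returns a future, so thenApply produces a nested future
        CompletableFuture<CompletableFuture<Integer>> nestedResult
          = compute().thenApply(ThenApplyVsThenComposeSketch::computeAnother);
        return nestedResult.join().join();
    }

    static int composed() {
        // thenCompose flattens the same chain into a single future
        CompletableFuture<Integer> finalResult
          = compute().thenCompose(ThenApplyVsThenComposeSketch::computeAnother);
        return finalResult.join();
    }
}
```

Here applied() yields 11, while nested() and composed() both yield 20; only the shape of the intermediate type differs.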
When we need to execute multiple Futures in parallel, we usually want to wait for all of them
to execute and then process their combined results.
The CompletableFuture.allOf static method allows us to wait for the completion of all of
the Futures provided as a var-arg:
CompletableFuture<String> future1
= CompletableFuture.supplyAsync(() -> "Hello");
CompletableFuture<String> future2
= CompletableFuture.supplyAsync(() -> "Beautiful");
CompletableFuture<String> future3
= CompletableFuture.supplyAsync(() -> "World");
CompletableFuture<Void> combinedFuture
= CompletableFuture.allOf(future1, future2, future3);
// ...
combinedFuture.get();
assertTrue(future1.isDone());
assertTrue(future2.isDone());
assertTrue(future3.isDone());
Notice that the return type of the CompletableFuture.allOf() is a CompletableFuture<Void>.
The limitation of this method is that it does not return the combined results of all Futures.
Instead, we have to manually get results from Futures.
Fortunately, the CompletableFuture.join() method and the Java 8 Streams API make it simple:
String combined = Stream.of(future1, future2, future3)
.map(CompletableFuture::join)
.collect(Collectors.joining(" "));
9. Handling Errors
// ...
CompletableFuture<String> completableFuture
  = CompletableFuture.supplyAsync(() -> {
      if (name == null) {
          throw new RuntimeException("Computation error!");
      }
      return "Hello, " + name;
  }).handle((s, t) -> s != null ? s : "Hello, Stranger!");
// ...
completableFuture.completeExceptionally(
new RuntimeException("Calculation failed!"));
// ...
completableFuture.get(); // ExecutionException
In the example above, we could have handled the exception with the handle method
asynchronously, but with the get method we can use the more typical approach of a
synchronous exception processing.
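Both styles of error handling can be tried out with the sketch below. The class ErrorHandlingSketch and its two methods are assumptions that wrap the fragments above into callable units:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

class ErrorHandlingSketch {

    // handle() receives either a result or an exception and maps both to a value
    static String greet(String name) {
        CompletableFuture<String> completableFuture
          = CompletableFuture.supplyAsync(() -> {
              if (name == null) {
                  throw new RuntimeException("Computation error!");
              }
              return "Hello, " + name;
          }).handle((s, t) -> s != null ? s : "Hello, Stranger!");
        return completableFuture.join();
    }

    // Without handle(), the failure surfaces synchronously from get()
    static String failureMessage() throws InterruptedException {
        CompletableFuture<String> completableFuture = new CompletableFuture<>();
        completableFuture.completeExceptionally(
          new RuntimeException("Calculation failed!"));
        try {
            completableFuture.get();
            return "no failure";
        } catch (ExecutionException e) {
            // The original exception is available as the cause
            return e.getCause().getMessage();
        }
    }
}
```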
Most methods of the fluent API in CompletableFuture class have two additional variants with
the Async postfix. These methods are usually intended for running a corresponding step of
execution in another thread.
The methods without the Async postfix run the next execution stage on the thread that
completes the previous stage, or on the calling thread if that stage has already completed. In
contrast, the Async method without the Executor argument runs a step using the
common fork/join pool implementation of Executor that is accessed with
the ForkJoinPool.commonPool() method. Finally, the Async method with
an Executor argument runs a step using the passed Executor.
Here's a modified example that processes the result of a computation with
a Function instance. The only visible difference is the thenApplyAsync method, but under the
hood the application of a function is wrapped into a ForkJoinTask instance (for more
information on the fork/join framework, see the article “Guide to the Fork/Join Framework in
Java”). This allows us to parallelize our computation even more and use system resources
more efficiently:
CompletableFuture<String> completableFuture
= CompletableFuture.supplyAsync(() -> "Hello");
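To make the choice of thread visible, the following sketch runs a stage with thenApplyAsync and an explicit Executor, then reports which thread executed it. The class AsyncVariantSketch and the thread name custom-stage-thread are assumptions:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class AsyncVariantSketch {

    // Returns { result, name of the thread that ran the thenApplyAsync stage }
    static String[] run() {
        // Single-threaded executor with a recognizable thread name
        ExecutorService executor = Executors.newSingleThreadExecutor(
          r -> new Thread(r, "custom-stage-thread"));
        try {
            String[] stageThread = new String[1];
            String result = CompletableFuture.supplyAsync(() -> "Hello")
              .thenApplyAsync(s -> {
                  stageThread[0] = Thread.currentThread().getName();
                  return s + " World";
              }, executor)
              .join();
            return new String[] { result, stageThread[0] };
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] outcome = run();
        System.out.println(outcome[0] + " (stage ran on " + outcome[1] + ")");
    }
}
```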
Executor defaultExecutor()
CompletableFuture<U> newIncompleteFuture()
CompletableFuture<T> copy()
CompletionStage<T> minimalCompletionStage()
CompletableFuture<T> completeAsync(Supplier<? extends T> supplier, Executor
executor)
CompletableFuture<T> completeAsync(Supplier<? extends T> supplier)
CompletableFuture<T> orTimeout(long timeout, TimeUnit unit)
CompletableFuture<T> completeOnTimeout(T value, long timeout, TimeUnit unit)
Finally, to address timeout, Java 9 has introduced two more new functions:
orTimeout()
completeOnTimeout()
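A brief sketch of the two timeout methods follows. It requires Java 9 or later. The class TimeoutSketch, the 100 ms timeout and the fallback value are assumptions, and the futures are deliberately never completed by anyone else:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class TimeoutSketch {

    // completeOnTimeout(): completes normally with a default value
    static String withFallback() {
        return new CompletableFuture<String>()
          .completeOnTimeout("fallback", 100, TimeUnit.MILLISECONDS)
          .join();
    }

    // orTimeout(): completes exceptionally with a TimeoutException
    static boolean failsWithTimeout() {
        try {
            new CompletableFuture<String>()
              .orTimeout(100, TimeUnit.MILLISECONDS)
              .join();
            return false;
        } catch (CompletionException e) {
            // join() wraps the TimeoutException in a CompletionException
            return e.getCause() instanceof TimeoutException;
        }
    }

    public static void main(String[] args) {
        System.out.println(withFallback());
        System.out.println(failsWithTimeout());
    }
}
```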
Here's the detailed article for further reading: Java 9 CompletableFuture API Improvements.
CyclicBarrier in Java
1. Introduction
CyclicBarrier
Phaser
CountDownLatch
Exchanger
Semaphore
SynchronousQueue
These classes offer out-of-the-box functionality for common interaction patterns between
threads.
If we have a set of threads that communicate with each other and resemble one of the
common patterns, we can simply reuse the appropriate library classes (also
called Synchronizers) instead of trying to come up with a custom scheme using a set of
locks, condition objects and the synchronized keyword.
Let's focus on the CyclicBarrier going forward.
3. CyclicBarrier
A CyclicBarrier is a synchronizer that allows a set of threads to wait for each other to reach a
common execution point, also called a barrier.
CyclicBarriers are used in programs in which we have a fixed number of threads that must
wait for each other to reach a common point before continuing execution.
The barrier is called cyclic because it can be re-used after the waiting threads are
released.
4. Usage
The constructor for a CyclicBarrier is simple. It takes a single integer that denotes the
number of threads that need to call the await() method on the barrier instance to signify
reaching the common execution point:
public CyclicBarrier(int parties)
The threads that need to synchronize their execution are also called parties, and calling
the await() method is how a thread registers that it has reached the barrier point.
This call is synchronous: the calling thread suspends execution until the specified
number of threads have called the same method on the barrier. The situation in which the
required number of threads have called await() is called tripping the barrier.
Optionally, we can pass a second argument to the constructor, a Runnable instance. It
holds logic that is run by the last thread to trip the barrier:
public CyclicBarrier(int parties, Runnable barrierAction)
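To make the two-argument constructor concrete, here is a small sketch in which every party calls await() and the barrier action simply counts how often it runs. The class name BarrierActionSketch and the method actionRuns() are assumptions made for illustration:

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

final class BarrierActionSketch {

    // Starts `parties` threads that all wait on one barrier and returns
    // how many times the barrier action was executed.
    static int actionRuns(int parties) {
        AtomicInteger runs = new AtomicInteger();
        CyclicBarrier barrier = new CyclicBarrier(parties, runs::incrementAndGet);

        Thread[] threads = new Thread[parties];
        for (int i = 0; i < parties; i++) {
            threads[i] = new Thread(() -> {
                try {
                    barrier.await(); // blocks until all parties have arrived
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } catch (BrokenBarrierException e) {
                    throw new IllegalStateException(e);
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println("Barrier action ran " + actionRuns(3) + " time(s)");
    }
}
```

The action runs once per trip, no matter how many parties were waiting.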
5. Implementation
// ...
}
This class is pretty straightforward: NUM_WORKERS is the number of threads that are
going to execute and NUM_PARTIAL_RESULTS is the number of results that each of the
worker threads is going to produce.
Finally, we have partialResults, a list that stores the results of each of these
worker threads. Do note that this list is a SynchronizedList because multiple threads will be
writing to it at the same time, and the add() method isn't thread-safe on a plain ArrayList.
Now let's implement the logic of each of the worker threads:
public class CyclicBarrierDemo {
    // ...
        @Override
        public void run() {
            String thisThreadName = Thread.currentThread().getName();
            List<Integer> partialResult = new ArrayList<>();
            partialResults.add(partialResult);
            try {
                System.out.println(thisThreadName
                  + " waiting for others to reach barrier.");
                cyclicBarrier.await();
            } catch (InterruptedException e) {
                // ...
            } catch (BrokenBarrierException e) {
                // ...
            }
        }
    }
}
We'll now implement the logic that runs when the barrier has been tripped.
To keep things simple, let's just add all the numbers in the partial results list:
public class CyclicBarrierDemo {
// ...
@Override
public void run() {
System.out.println(
thisThreadName + ": Computing sum of " + NUM_WORKERS
// Previous code
Once the barrier is tripped, the last thread that tripped the barrier executes the logic specified
in the AggregatorThread, namely adding all the numbers produced by the threads.
6. Results
Here is the output from one execution of the above program – each execution might create
different results as the threads can be spawned in a different order:
Spawning 5 worker threads to compute 3 partial results each
Thread 0: Crunching some numbers! Final result - 6
Thread 0: Crunching some numbers! Final result - 2
Thread 0: Crunching some numbers! Final result - 2
Thread 0 waiting for others to reach barrier.
Thread 1: Crunching some numbers! Final result - 2
Thread 1: Crunching some numbers! Final result - 0
Thread 1: Crunching some numbers! Final result - 5
Thread 1 waiting for others to reach barrier.
Thread 3: Crunching some numbers! Final result - 6
Thread 3: Crunching some numbers! Final result - 4
Thread 3: Crunching some numbers! Final result - 0
Thread 3 waiting for others to reach barrier.
Thread 2: Crunching some numbers! Final result - 1
Thread 2: Crunching some numbers! Final result - 1
Thread 2: Crunching some numbers! Final result - 0
Thread 2 waiting for others to reach barrier.
Thread 4: Crunching some numbers! Final result - 9
Thread 4: Crunching some numbers! Final result - 3
Thread 4: Crunching some numbers! Final result - 5
Thread 4 waiting for others to reach barrier.
Thread 4: Computing final sum of 5 workers, having 3 results each.
Adding 6 2 2
Adding 2 0 5
Adding 6 4 0
Adding 1 1 0
Adding 9 3 5
Thread 4: Final result = 46
As the above output shows, Thread 4 is the one that trips the barrier and also executes the
final aggregation logic. The output also shows that threads don't necessarily run in the
order in which they were started.
7. Conclusion
In this article, we saw what a CyclicBarrier is, and what kind of situations it is helpful in.
We also implemented a scenario where we needed a fixed number of threads to reach a fixed
execution point, before continuing with other program logic.
1. Overview
Generating random values is a very common task. This is why Java provides
the java.util.Random class.
However, this class doesn't perform well in a multi-threaded environment.
Simply put, Random performs poorly in a multi-threaded environment because of
contention: multiple threads share the same Random instance.
To address that limitation, Java introduced
the java.util.concurrent.ThreadLocalRandom class in JDK 7 for generating random
numbers in a multi-threaded environment.
Let's see how ThreadLocalRandom performs and how to use it in real-world applications.
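As a first, hedged taste of the API, the sketch below obtains the current thread's generator with ThreadLocalRandom.current() and draws a bounded value from it. The class name ThreadLocalRandomSketch, the helper between() and the thread names are assumptions for illustration only:

```java
import java.util.concurrent.ThreadLocalRandom;

final class ThreadLocalRandomSketch {

    // Returns a pseudo-random int in [min, max) from the calling thread's generator.
    static int between(int min, int max) {
        return ThreadLocalRandom.current().nextInt(min, max);
    }

    public static void main(String[] args) throws InterruptedException {
        // Each worker calls current() itself, so no generator is shared between threads.
        Runnable task = () -> System.out.println(
          Thread.currentThread().getName() + " -> " + between(1, 7));
        Thread a = new Thread(task, "worker-a");
        Thread b = new Thread(task, "worker-b");
        a.start();
        b.start();
        a.join();
        b.join();
    }
}
```

Note that current() is called inside the task rather than stored in a shared field, which is what keeps the generator local to each thread.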
2. ThreadLocalRandom Over Random
4. Comparing ThreadLocalRandom and Random Using JMH
Let's see how we can generate random values in a multi-threaded environment, by using the
two classes, then compare their performance using JMH.
First, let's create an example where all the threads are sharing a single instance
of Random. Here, we're submitting the task of generating a random value using
the Random instance to an ExecutorService:
ExecutorService executor = Executors.newWorkStealingPool();
List<Callable<Integer>> callables = new ArrayList<>();
Random random = new Random();
for (int i = 0; i < 1000; i++) {
    callables.add(() -> {
        return random.nextInt();
    });
}
executor.invokeAll(callables);
Let's check the performance of the code above using JMH benchmarking:
# Run complete. Total time: 00:00:36
Benchmark                                             Mode  Cnt    Score     Error  Units
ThreadLocalRandomBenchMarker.randomValuesUsingRandom  avgt   20  771.613 ± 222.220  us/op
Similarly, let's now use ThreadLocalRandom instead of the Random instance, which uses one
instance of ThreadLocalRandom for each thread in the pool:
ExecutorService executor = Executors.newWorkStealingPool();
List<Callable<Integer>> callables = new ArrayList<>();
for (int i = 0; i < 1000; i++) {
    callables.add(() -> {
        return ThreadLocalRandom.current().nextInt();
    });
}
executor.invokeAll(callables);
Here's the result of using ThreadLocalRandom:
# Run complete. Total time: 00:00:36
Benchmark                                                        Mode  Cnt    Score     Error  Units
ThreadLocalRandomBenchMarker.randomValuesUsingThreadLocalRandom  avgt   20  624.911 ± 113.268  us/op
Finally, comparing the JMH results above for Random and ThreadLocalRandom, the
average time taken to generate 1000 random values using Random is about
772 microseconds, whereas using ThreadLocalRandom it's around 625 microseconds.
The error margins are wide (± 222 and ± 113 microseconds), so the exact gap is uncertain,
but the results suggest that ThreadLocalRandom is more efficient in a highly concurrent
environment.
To learn more about JMH, check out our previous article here.
5. Implementation Details
return instance;
}
It's true that sharing one global Random instance leads to sub-optimal performance under high
contention. However, using one dedicated instance per thread is also overkill.
Instead of a dedicated instance of Random per thread, each thread only needs to
maintain its own seed value. As of Java 8, the Thread class itself has been retrofitted to
maintain the seed value:
public class Thread implements Runnable {
    // omitted
    @jdk.internal.vm.annotation.Contended("tlr")
    long threadLocalRandomSeed;
    @jdk.internal.vm.annotation.Contended("tlr")
    int threadLocalRandomProbe;
    @jdk.internal.vm.annotation.Contended("tlr")
    int threadLocalRandomSecondarySeed;
}
The threadLocalRandomSeed variable is responsible for maintaining the current seed value
for ThreadLocalRandom. Moreover, the secondary seed, threadLocalRandomSecondarySeed,
is usually used internally by the likes of ForkJoinPool.
This implementation incorporates a few optimizations to make ThreadLocalRandom even
more performant:
6. Conclusion
1. Introduction
2.1. CountDownLatch
2.2. CyclicBarrier
And for a lot more detail on each of these individually, refer to our previous tutorials
on CountDownLatch and CyclicBarrier respectively.
Let's take a deeper dive into some of the semantic differences between these two classes.
As stated in the definitions, CyclicBarrier allows a number of threads to wait on each other,
whereas CountDownLatch allows one or more threads to wait for a number of tasks to
complete.
In short, CyclicBarrier maintains a count of threads whereas CountDownLatch maintains
a count of tasks.
In the following code, we define a CountDownLatch with a count of two. Next, we
call countDown() twice from a single thread:
CountDownLatch countDownLatch = new CountDownLatch(2);
Thread t = new Thread(() -> {
    countDownLatch.countDown();
    countDownLatch.countDown();
});
t.start();
countDownLatch.await();
assertEquals(0, countDownLatch.getCount());
Once the latch reaches zero, the call to await returns.
Note that in this case, we were able to have the same thread decrease the count twice.
CyclicBarrier, though, is different on this point.
Similar to the above example, we create a CyclicBarrier, again with a count of two and
call await() on it, this time from the same thread:
CyclicBarrier cyclicBarrier = new CyclicBarrier(2);
Thread t = new Thread(() -> {
    try {
        cyclicBarrier.await();
        cyclicBarrier.await();
    } catch (InterruptedException | BrokenBarrierException e) {
        // error handling
    }
});
t.start();
assertEquals(1, cyclicBarrier.getNumberWaiting());
assertFalse(cyclicBarrier.isBroken());
The first difference here is that the threads that are waiting are themselves the barrier.
Second, and more importantly, the second await() is useless. A single thread can't count
down a barrier twice.
Indeed, because t must wait for another thread to call await() to bring the count to two,
t's second call to await() won't actually be invoked until the barrier has already been tripped!
In our test, the barrier hasn't been crossed because we only have one thread waiting and
not the two threads that would be required for the barrier to be tripped. The
cyclicBarrier.isBroken() method returns false because the barrier is still intact: no
waiting thread has been interrupted or timed out.
4. Reusability
The second most evident difference between these two classes is reusability. To
elaborate, when the barrier trips in CyclicBarrier, the count resets to its original
value. CountDownLatch is different because the count never resets.
In the given code, we define a CountDownLatch with a count of 7 and count it down through 20
different calls:
CountDownLatch countDownLatch = new CountDownLatch(7);
ExecutorService es = Executors.newFixedThreadPool(20);
for (int i = 0; i < 20; i++) {
    es.execute(() -> {
        long prevValue = countDownLatch.getCount();
        countDownLatch.countDown();
        if (countDownLatch.getCount() != prevValue) {
            outputScraper.add("Count Updated");
        }
    });
}
es.shutdown();
ExecutorService es = Executors.newFixedThreadPool(20);
for (int i = 0; i < 20; i++) {
    es.execute(() -> {
        try {
            if (cyclicBarrier.getNumberWaiting() <= 0) {
                outputScraper.add("Count Updated");
            }
            cyclicBarrier.await();
        } catch (InterruptedException | BrokenBarrierException e) {
            // error handling
        }
    });
}
es.shutdown();
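The reusability difference can also be seen in a smaller, self-contained sketch. It is not the article's test: the class name ReuseSketch, its two helper methods and the two-party barrier are assumptions chosen to keep the example short:

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

final class ReuseSketch {

    // Counts a latch down more often than its initial count. The count
    // stops at zero and never resets.
    static long latchCountAfter(int initial, int calls) {
        CountDownLatch latch = new CountDownLatch(initial);
        for (int i = 0; i < calls; i++) {
            latch.countDown();
        }
        return latch.getCount();
    }

    // Two parties cross the same barrier `rounds` times. The barrier action
    // counts the trips, showing that the barrier resets after each one.
    static int barrierTrips(int rounds) {
        AtomicInteger trips = new AtomicInteger();
        CyclicBarrier barrier = new CyclicBarrier(2, trips::incrementAndGet);
        Runnable party = () -> {
            try {
                for (int i = 0; i < rounds; i++) {
                    barrier.await();
                }
            } catch (InterruptedException | BrokenBarrierException e) {
                throw new IllegalStateException(e);
            }
        };
        Thread other = new Thread(party);
        other.start();
        party.run(); // the calling thread acts as the second party
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return trips.get();
    }

    public static void main(String[] args) {
        System.out.println("Latch count: " + latchCountAfter(7, 20));
        System.out.println("Barrier trips: " + barrierTrips(3));
    }
}
```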
5. Conclusion
1. Overview
Java supports multithreading out of the box. This means that by running bytecode
concurrently in separate worker threads, the JVM is capable of improving application
performance.
Although multithreading is a powerful feature, it comes at a price. In multithreaded
environments, we need to write implementations in a thread-safe way. This means that
different threads can access the same resources without exposing erroneous behavior or
producing unpredictable results. This programming methodology is known as “thread-safety”.
In this tutorial, we'll look at different approaches to achieve it.
2. Stateless Implementations
In most cases, errors in multithreaded applications are the result of incorrectly sharing state
between several threads.
Therefore, the first approach that we'll look at is to achieve thread-safety using stateless
implementations.
To better understand this approach, let's consider a simple utility class with a static method
that calculates the factorial of a number:
public class MathUtils {
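A minimal sketch of such a stateless utility might look as follows. The class name MathUtilsSketch and the use of BigInteger are assumptions; the method keeps no fields and depends only on its argument, which is what makes it safe to call from many threads at once:

```java
import java.math.BigInteger;

final class MathUtilsSketch {

    private MathUtilsSketch() {
        // utility class, no instances and no state
    }

    // Computes number! using only local variables, so concurrent calls
    // cannot interfere with each other.
    static BigInteger factorial(int number) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= number; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(factorial(5));
    }
}
```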
3. Immutable Implementations
If we need to share state between different threads, we can create thread-safe classes by
making them immutable.
Immutability is a powerful, language-agnostic concept and it's fairly easy to achieve in Java.
To put it simply, a class instance is immutable when its internal state can't be modified
after it has been constructed.
The easiest way to create an immutable class in Java is by declaring all the
fields private and final and not providing setters:
public class MessageService {
// standard getter
}
A MessageService object is effectively immutable since its state can't change after its
construction. Hence, it's thread-safe.
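As an illustration of the rule above (private final fields, no setters), here is a hedged sketch of an immutable class. The name ImmutableMessage and its single String field are assumptions and not necessarily the shape of the MessageService class:

```java
final class ImmutableMessage {

    // private and final: assigned once in the constructor, never changed afterwards
    private final String message;

    ImmutableMessage(String message) {
        this.message = message;
    }

    // Only a getter is exposed, so no caller can modify the state.
    String getMessage() {
        return message;
    }

    public static void main(String[] args) {
        ImmutableMessage shared = new ImmutableMessage("hello");
        // Any number of threads may read the same instance safely.
        Runnable reader = () -> System.out.println(shared.getMessage());
        new Thread(reader).start();
        new Thread(reader).start();
    }
}
```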
Moreover, if MessageService were actually mutable, but multiple threads only had read-only
access to it, it would be thread-safe as well.
Thus, immutability is just another way to achieve thread-safety.
4. Thread-Local Fields
In object-oriented programming (OOP), objects actually need to maintain state through fields
and implement behavior through one or more methods.
If we actually need to maintain state, we can create thread-safe classes that don't share
state between threads by making their fields thread-local.
We can easily create classes whose fields are thread-local by simply defining private fields
in Thread classes.
We could define, for instance, a Thread class that stores a list of integers:
public class ThreadA extends Thread {
    @Override
    public void run() {
        numbers.forEach(System.out::println);
    }
}
While another one might hold a list of strings:
public class ThreadB extends Thread {
    private final List<String> letters = Arrays.asList("a", "b", "c", "d", "e", "f");
    @Override
    public void run() {
        letters.forEach(System.out::println);
    }
}
In both implementations, the classes have their own state, but it's not shared with other
threads. Thus, the classes are thread-safe.
Similarly, we can create thread-local fields by assigning ThreadLocal instances to a field.
Let's consider, for example, the following StateHolder class:
public class StateHolder {
@Override
protected StateHolder initialValue() {
return new StateHolder("active");
}
};
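To show the effect of a ThreadLocal field in isolation, the following sketch gives every thread its own StringBuilder. It uses ThreadLocal.withInitial() rather than an anonymous subclass, and the names ThreadLocalSketch, BUFFER and appendInNewThread() are assumptions for illustration:

```java
final class ThreadLocalSketch {

    // Each thread lazily receives its own StringBuilder on its first get().
    private static final ThreadLocal<StringBuilder> BUFFER =
      ThreadLocal.withInitial(StringBuilder::new);

    // Appends text to the buffer of a freshly started thread and returns
    // that thread's view of the buffer.
    static String appendInNewThread(String text) {
        String[] result = new String[1];
        Thread t = new Thread(() -> {
            result[0] = BUFFER.get().append(text).toString();
        });
        t.start();
        try {
            t.join(); // join() makes the worker's write to result visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(appendInNewThread("a"));
        System.out.println(appendInNewThread("b"));
    }
}
```

Because the second thread starts with a fresh buffer, it doesn't see the text appended by the first one.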
5. Synchronized Collections
We can easily create thread-safe collections by using the set of synchronization wrappers
included within the collections framework.
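For example, Collections.synchronizedCollection() wraps a plain ArrayList so that each method call is guarded by a lock. The sketch below is a minimal illustration; the class name SyncCollectionSketch and the two-thread setup are assumptions:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;

final class SyncCollectionSketch {

    // Two threads add six elements each to the same wrapped list and the
    // final size is returned.
    static int fillFromTwoThreads() {
        Collection<Integer> numbers =
          Collections.synchronizedCollection(new ArrayList<>());
        Runnable adder = () -> numbers.addAll(Arrays.asList(1, 2, 3, 4, 5, 6));

        Thread first = new Thread(adder);
        Thread second = new Thread(adder);
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return numbers.size();
    }

    public static void main(String[] args) {
        System.out.println(fillFromTwoThreads());
    }
}
```

Only one thread at a time can execute a method of the wrapped collection, which is safe but can become a bottleneck under heavy contention.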
6. Concurrent Collections
7. Atomic Objects
It's also possible to achieve thread-safety using the set of atomic classes that Java provides,
including AtomicInteger, AtomicLong, AtomicBoolean, and AtomicReference.
Atomic classes allow us to perform atomic operations, which are thread-safe, without
using synchronization. An atomic operation executes as a single, indivisible
machine-level operation.
To understand the problem this solves, let's look at the following Counter class:
public class Counter {
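A hedged sketch of a counter built on AtomicInteger is shown below. The class name AtomicCounterSketch and the helper countWith() are assumptions; the point is that incrementAndGet() performs the read, the addition and the write as one atomic step:

```java
import java.util.concurrent.atomic.AtomicInteger;

final class AtomicCounterSketch {

    private final AtomicInteger counter = new AtomicInteger();

    void increment() {
        counter.incrementAndGet(); // atomic read-modify-write, no lock needed
    }

    int get() {
        return counter.get();
    }

    // Increments one shared counter from several threads and returns the total.
    static int countWith(int threads, int perThread) {
        AtomicCounterSketch c = new AtomicCounterSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    c.increment();
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return c.get();
    }

    public static void main(String[] args) {
        System.out.println(countWith(4, 10_000));
    }
}
```

With a plain int field and counter++ instead, some increments could be lost when two threads interleave.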
8. Synchronized Methods
While the earlier approaches are very good for collections and primitives, we will at times
need greater control than that.
So, another common approach that we can use for achieving thread-safety is implementing
synchronized methods.
Simply put, only one thread can access a synchronized method at a time while blocking
access to this method from other threads. Other threads will remain blocked until the first
thread finishes or the method throws an exception.
We can create a thread-safe version of incrementCounter() in another way by making it a
synchronized method:
public synchronized void incrementCounter() {
    counter += 1;
}
We've created a synchronized method by prefixing the method signature with
the synchronized keyword.
Since only one thread at a time can access a synchronized method, one thread will execute
the incrementCounter() method, and in turn, others will do the same. No overlapping
execution will occur whatsoever.
Synchronized methods rely on the use of “intrinsic locks” or “monitor locks”. An
intrinsic lock is an implicit internal entity associated with a particular class instance.
In a multithreaded context, the term monitor is just a reference to the role that the lock
performs on the associated object, as it enforces exclusive access to a set of specified
methods or statements.
When a thread calls a synchronized method, it acquires the intrinsic lock. After the
thread finishes executing the method, it releases the lock, hence allowing other threads to
acquire the lock and get access to the method.
We can implement synchronization in instance methods, static methods, and statements
(synchronized statements).
9. Synchronized Statements
Synchronization is expensive, so with this option, we are able to only synchronize the
relevant parts of a method.
// standard getter
}
We use a plain Object instance to enforce mutual exclusion. This implementation is slightly
better, as it promotes security at the lock level.
When using this for intrinsic locking, an attacker could cause a deadlock by acquiring the
intrinsic lock and triggering a denial of service (DoS) condition.
On the contrary, when using other objects, that private entity is not accessible from the
outside. This makes it harder for an attacker to acquire the lock and cause a deadlock.
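The idea of locking on a private object can be sketched as follows. The class name PrivateLockCounter and the helper countWith() are assumptions; what matters is that the lock field is private and final, so no outside code can synchronize on it:

```java
final class PrivateLockCounter {

    // Private lock object: code outside this class cannot acquire it.
    private final Object lock = new Object();
    private int counter;

    void increment() {
        synchronized (lock) {
            counter += 1;
        }
    }

    int get() {
        synchronized (lock) {
            return counter;
        }
    }

    // Increments one shared counter from several threads and returns the total.
    static int countWith(int threads, int perThread) {
        PrivateLockCounter c = new PrivateLockCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    c.increment();
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return c.get();
    }

    public static void main(String[] args) {
        System.out.println(countWith(4, 10_000));
    }
}
```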
9.2. Caveats
Even though we can use any Java object as an intrinsic lock, we should avoid
using Strings for locking purposes:
public class Class1 {
private static final String LOCK = "Lock";
}
At first glance, it seems that these two classes are using two different objects as their lock.
However, because of string interning, these two “Lock” values may actually refer to the
same object on the string pool. That is, the Class1 and Class2 are sharing the same lock!
This, in turn, may cause some unexpected behaviors in concurrent contexts.
In addition to Strings, we should avoid using any cacheable or reusable objects
as intrinsic locks. For example, the Integer.valueOf() method caches small numbers.
Therefore, calling Integer.valueOf(1) returns the same object even in different classes.
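Both pitfalls can be checked with a reference comparison. In this sketch, Class1 and Class2 are nested stand-ins for the two classes discussed above, and the wrapper name SharedLockPitfall is an assumption:

```java
final class SharedLockPitfall {

    // Two unrelated classes that each believe they own a private lock.
    static final class Class1 {
        static final String LOCK = "Lock";
    }

    static final class Class2 {
        static final String LOCK = "Lock";
    }

    // String literals are interned, so both constants refer to one object.
    static boolean sameStringLock() {
        return Class1.LOCK == Class2.LOCK;
    }

    // Integer.valueOf() caches small values, so these are one object too.
    static boolean sameIntegerLock() {
        return Integer.valueOf(1) == Integer.valueOf(1);
    }

    public static void main(String[] args) {
        System.out.println(sameStringLock());
        System.out.println(sameIntegerLock());
    }
}
```

A lock created with new Object() avoids both problems, because every call to new yields a distinct instance.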
Synchronized methods and blocks are handy for addressing variable visibility problems
among threads. Even so, the values of regular class fields might be cached by the CPU.
Hence, subsequent updates to a particular field, even if they're synchronized, might not be
visible to other threads that read the field without synchronization.
To prevent this situation, we can use volatile class fields:
public class Counter {
}
With the volatile keyword, we instruct the JVM and the compiler to store
the counter variable in the main memory. That way, we make sure that every time the JVM
reads the value of the counter variable, it will actually read it from the main memory, instead
of from the CPU cache. Likewise, every time the JVM writes to the counter variable, the
value will be written to the main memory.
Moreover, the use of a volatile variable ensures that all variables that are visible to a
given thread will be read from the main memory as well.
Let's consider the following example:
public class User {
}
In this case, each time the JVM writes the age volatile variable to the main memory, it will
write the non-volatile name variable to the main memory as well. This assures that the latest
values of both variables are stored in the main memory, so subsequent updates to the
variables will automatically be visible to other threads.
Similarly, if a thread reads the value of a volatile variable, all the variables visible to the
thread will be read from the main memory too.
This extended guarantee that volatile variables provide is known as the full volatile
visibility guarantee.
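This guarantee is what makes the common "write the data, then set a volatile flag" idiom work. The sketch below is an illustration under assumed names (VolatilePublishSketch, payload, ready); the reader is guaranteed to see the plain payload field once it observes the volatile flag as true:

```java
final class VolatilePublishSketch {

    private int payload;            // plain, non-volatile field
    private volatile boolean ready; // volatile flag

    void publish(int value) {
        payload = value; // ordinary write...
        ready = true;    // ...published by the volatile write that follows it
    }

    int awaitPayload() {
        while (!ready) {
            Thread.yield(); // spin until the volatile flag becomes visible
        }
        return payload; // reads the value written before the flag was set
    }

    // Publishes 42 from a second thread and reads it in the calling thread.
    static int demo() {
        VolatilePublishSketch sketch = new VolatilePublishSketch();
        Thread writer = new Thread(() -> sketch.publish(42));
        writer.start();
        return sketch.awaitPayload();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

If ready were not volatile, the reader could loop forever or observe a stale payload.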
}
The ReentrantLock constructor takes an optional fairness boolean parameter. When it is set
to true and multiple threads are trying to acquire the lock, the lock favors the
longest-waiting thread and grants it access.
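A minimal sketch of a counter guarded by a fair ReentrantLock could look like this. The class name FairLockCounter is an assumption; the lock()/try/finally/unlock() pattern is the usual way to make sure the lock is released even if the guarded code throws:

```java
import java.util.concurrent.locks.ReentrantLock;

final class FairLockCounter {

    // true requests the fair ordering policy described above
    private final ReentrantLock lock = new ReentrantLock(true);
    private int counter;

    void increment() {
        lock.lock();
        try {
            counter += 1;
        } finally {
            lock.unlock(); // always release, even on exceptions
        }
    }

    int get() {
        lock.lock();
        try {
            return counter;
        } finally {
            lock.unlock();
        }
    }

    boolean isFair() {
        return lock.isFair();
    }

    public static void main(String[] args) {
        FairLockCounter c = new FairLockCounter();
        c.increment();
        System.out.println(c.get() + " (fair: " + c.isFair() + ")");
    }
}
```

Fair locks tend to have lower throughput than the default non-fair mode, so fairness is worth enabling only when starvation is a real concern.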
Another powerful mechanism that we can use for achieving thread-safety is the use
of ReadWriteLock implementations.
A ReadWriteLock actually uses a pair of associated locks, one for read-only operations
and the other for write operations.
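The pair of locks can be sketched with ReentrantReadWriteLock, the standard implementation of the interface. The class name ReadWriteCounter is an assumption; readers share the read lock, while a writer needs the exclusive write lock:

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class ReadWriteCounter {

    private final ReadWriteLock rwLock = new ReentrantReadWriteLock();
    private int counter;

    // Writers take the exclusive write lock.
    void increment() {
        rwLock.writeLock().lock();
        try {
            counter += 1;
        } finally {
            rwLock.writeLock().unlock();
        }
    }

    // Many readers may hold the read lock at once, as long as nobody writes.
    int get() {
        rwLock.readLock().lock();
        try {
            return counter;
        } finally {
            rwLock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        ReadWriteCounter c = new ReadWriteCounter();
        c.increment();
        System.out.println(c.get());
    }
}
```

This split pays off mainly when reads greatly outnumber writes.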
// standard constructors
13. Conclusion
In this article, we learned what thread-safety is in Java, and took an in-depth look at
different approaches for achieving it.
1. Introduction
It is relatively common for Java programs to add a delay or pause in their operation. This can
be useful for task pacing or to pause execution until another task completes.
This tutorial will describe two ways to implement delays in Java.
2. A Thread-Based Approach
When a Java program runs, it spawns a process that runs on the host machine. This
process contains at least one thread, the main thread, in which the program runs.
Java also supports multithreading, which lets applications create new threads
that run in parallel, or asynchronously, to the main thread.
2.1. Using Thread.sleep
A quick and dirty way to pause in Java is to tell the current thread to sleep for a specified
amount of time. This can be done using Thread.sleep(milliseconds):
try {
    Thread.sleep(secondsToSleep * 1000);
} catch (InterruptedException ie) {
    Thread.currentThread().interrupt();
}
It is good practice to wrap the sleep method in a try/catch block in case another thread
interrupts the sleeping thread. In this case, we catch the InterruptedException and
explicitly re-interrupt the current thread, which restores its interrupt status so that code
further up the call stack can detect and handle it. This is more
important in a multi-threaded program, but still good practice in a single-threaded program in
case we add other threads later.
2.2. Using TimeUnit.sleep
The sleep times are not exactly precise, especially when using smaller time
increments like milliseconds and nanoseconds.
When used inside loops, sleep will drift slightly between loop iterations due to
other code execution, so the execution time could become imprecise after many iterations.
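For reference, a sleep expressed through TimeUnit, as named in this section's heading, can be sketched like this. The class name TimeUnitSleepSketch and the helper pause() are assumptions; the helper also measures the elapsed time, which makes the imprecision described above observable:

```java
import java.util.concurrent.TimeUnit;

final class TimeUnitSleepSketch {

    // Sleeps for the given duration and returns the measured elapsed
    // time in milliseconds.
    static long pause(long amount, TimeUnit unit) {
        long start = System.nanoTime();
        try {
            unit.sleep(amount); // e.g. TimeUnit.SECONDS.sleep(1)
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        System.out.println("Slept for about "
          + pause(50, TimeUnit.MILLISECONDS) + " ms");
    }
}
```

The measured value usually differs a little from the requested duration and varies between runs.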
3. An ExecutorService-Based Approach
executorService.schedule(Classname::someTask, delayInSeconds,
TimeUnit.SECONDS);
The Classname::someTask part is where we specify the method that will run after the delay:
executorService.scheduleAtFixedRate(Classname::someTask, 0, delayInSeconds,
TimeUnit.SECONDS);
This will repeatedly call the someTask method, starting a new run every delayInSeconds
seconds.
Besides allowing more timing options, the ScheduledExecutorService approach yields more
precise time intervals, since it prevents issues with drift.
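Putting the pieces together, a self-contained sketch of a fixed-rate schedule might look as follows. The class name ScheduledDelaySketch, the helper runPeriodically() and the use of a CountDownLatch to wait for a number of runs are assumptions made for illustration:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class ScheduledDelaySketch {

    // Schedules a task at a fixed rate and waits until it has run `runs`
    // times. Returns true if that happened within a generous time limit.
    static boolean runPeriodically(int runs, long periodMillis) {
        ScheduledExecutorService executorService =
          Executors.newSingleThreadScheduledExecutor();
        CountDownLatch remaining = new CountDownLatch(runs);
        try {
            executorService.scheduleAtFixedRate(
              remaining::countDown, 0, periodMillis, TimeUnit.MILLISECONDS);
            return remaining.await(
              runs * periodMillis + 1000, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executorService.shutdownNow(); // stop the periodic task
        }
    }

    public static void main(String[] args) {
        System.out.println(runPeriodically(3, 100));
    }
}
```

Shutting the executor down matters here: its worker thread is a user thread, so a forgotten scheduler keeps the JVM alive.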
4. Conclusion
In this article, we discussed two methods for creating delays in Java programs.