Part 4 - Easy Data Parallelism
Richard Warburton
Raoul-Gabriel Urma
Overview
● Why is parallelism important?
● Parallelism
○ At least two threads are executing simultaneously
○ A specific case of concurrency
○ Eg: servlet container dealing with two users at
once on a multicore machine
Parallelism
● Task
○ Distribute the execution of tasks over different processes
○ Threads and Executors in Java
○ Eg: each thread services a user in JEE App
● Data
○ Distribute data over different processes
○ Support built on top of Streams
○ Eg: process a payroll and give each core 100
employees' salaries
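The two styles can be contrasted in a minimal sketch. The class name, method names, and sample values are hypothetical and chosen only for illustration. The first method hands independent tasks to an executor, the second applies one operation across a data set with a parallel stream.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class TaskVsDataParallelism {

    // Task parallelism: each independent task runs on a thread from the pool.
    static String serveTwoUsers() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            CompletableFuture<String> alice =
                CompletableFuture.supplyAsync(() -> "page for alice", pool);
            CompletableFuture<String> bob =
                CompletableFuture.supplyAsync(() -> "page for bob", pool);
            return alice.join() + ", " + bob.join();
        } finally {
            pool.shutdown();
        }
    }

    // Data parallelism: the same operation is applied to chunks of one data set.
    static long totalPayroll(List<Long> salaries) {
        return salaries.parallelStream().mapToLong(Long::longValue).sum();
    }

    public static void main(String[] args) {
        System.out.println(serveTwoUsers());
        System.out.println(totalPayroll(Arrays.asList(1000L, 2000L, 3000L))); // prints 6000
    }
}
```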
What are good data parallel
problems?
● Big Batch Jobs
○ Transaction Processing
○ Analytics/Reporting
● Maths
○ Linear Algebra
What’s a good data parallel problem from your
workplace?
Parallelising your Streams
Data Parallelism
● Useful
○ a lot of data
○ want to process in a similar way
// Broken: modifies the source list while the stream is reading it
numbers.parallelStream()
       .forEach(i -> numbers.add(i * 2));
Referring to data sources fixed
numbers = numbers.parallelStream()
.flatMap(i -> Stream.of(i, i * 2))
.collect(toList());
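A self-contained version of the non-interfering approach might look like the following sketch. The class and method names are hypothetical. The source list is only read, and the result is built by the collector.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class NonInterferenceExample {

    // Returns a new list holding each number followed by its double.
    // The source list is never modified during the traversal.
    static List<Integer> withDoubles(List<Integer> numbers) {
        return numbers.parallelStream()
                      .flatMap(i -> Stream.of(i, i * 2))
                      .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Encounter order is preserved for an ordered source such as a List.
        System.out.println(withDoubles(Arrays.asList(1, 2, 3))); // prints [1, 2, 2, 4, 3, 6]
    }
}
```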
DON’T misuse reduce
(4 + 2) + 1 = 4 + (2 + 1) = 7
(4 * 2) * 1 = 4 * (2 * 1) = 8
Identity
0 + 5 = 5
1 * 5 = 5
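The two rules above (associative operation, true identity value) can be shown in a short sketch. The class and method names are hypothetical. Addition pairs with identity 0 and multiplication with identity 1, so the parallel result matches the sequential one however the data is split.

```java
import java.util.Arrays;
import java.util.List;

class ReduceExample {

    // Addition is associative and 0 is its identity: 0 + x = x.
    static int sum(List<Integer> values) {
        return values.parallelStream().reduce(0, Integer::sum);
    }

    // Multiplication is associative and 1 is its identity: 1 * x = x.
    static int product(List<Integer> values) {
        return values.parallelStream().reduce(1, (a, b) -> a * b);
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(4, 2, 1);
        System.out.println(sum(values));     // prints 7
        System.out.println(product(values)); // prints 8
    }
}
```

An identity that is not a true identity for the operation (for example, starting a sum at 5) may be applied once per chunk in a parallel run, so the result can depend on how the data was split.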
How to fix reduce
values.parallelStream()
      .forEach(i -> {
          try {
              doSomething(i);
              // Potential Deadlock
              latch.countDown();
          } catch (Exception e) {
              e.printStackTrace();
          }});
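One reading of the hazard above is that an exception in doSomething skips the count-down, so a thread waiting on the latch never wakes up. Under that assumption, a minimal sketch of a safer shape moves the count-down into a finally block. The class name and the body of doSomething are hypothetical stand-ins.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;

class LatchExample {

    // Hypothetical stand-in for the slide's doSomething: fails on negative input.
    static void doSomething(int i) {
        if (i < 0) {
            throw new IllegalArgumentException("negative: " + i);
        }
    }

    // Returns the latch count left after processing; 0 means no waiter can hang.
    static long remainingAfterProcessing(List<Integer> values) {
        CountDownLatch latch = new CountDownLatch(values.size());
        values.parallelStream().forEach(i -> {
            try {
                doSomething(i);
            } catch (Exception e) {
                System.err.println("failed: " + e.getMessage());
            } finally {
                latch.countDown(); // runs whether or not doSomething threw
            }
        });
        return latch.getCount();
    }

    public static void main(String[] args) {
        System.out.println(remainingAfterProcessing(Arrays.asList(1, -2, 3))); // prints 0
    }
}
```

Since forEach on a parallel stream only returns once every element has been processed, the latch is often unnecessary in the first place.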
No mutable state!
public static long sideEffectParallelSum(long n) {
    Accumulator accumulator = new Accumulator();
    // Shared mutable state: every worker thread calls add on the same accumulator
    LongStream.rangeClosed(1, n).parallel()
              .forEach(accumulator::add);
    return accumulator.total;
}
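A version without shared mutable state could be sketched as follows. The class and method names are hypothetical. Each worker sums its own chunk and the partial results are combined by the stream itself.

```java
import java.util.stream.LongStream;

class ParallelSum {

    // No shared accumulator: the stream combines per-chunk partial sums.
    static long parallelSum(long n) {
        return LongStream.rangeClosed(1, n).parallel().sum();
    }

    // Equivalent form using reduce with an associative operation and identity 0.
    static long parallelReduceSum(long n) {
        return LongStream.rangeClosed(1, n).parallel().reduce(0L, Long::sum);
    }

    public static void main(String[] args) {
        System.out.println(parallelSum(100));       // prints 5050
        System.out.println(parallelReduceSum(100)); // prints 5050
    }
}
```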
but …
● Distributed by data
int sum =
values.parallelStream()
.mapToInt(i -> i)
.sum();
Spliterator
public interface Spliterator<T> {
    /** Carve off a portion of the data
        into a separate Spliterator */
    Spliterator<T> trySplit();
    // remaining methods omitted
}
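To see trySplit at work, the sketch below splits the spliterator of an ArrayList once and reports how many elements each part covers. The class and method names are hypothetical, and the exact split point is up to the source's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

class SplitExample {

    // Splits once and returns { size of carved-off part, size of remainder }.
    static long[] splitSizes(List<Integer> data) {
        Spliterator<Integer> rest = data.spliterator();
        Spliterator<Integer> carved = rest.trySplit(); // null if it cannot split
        long carvedSize = (carved == null) ? 0 : carved.estimateSize();
        return new long[] { carvedSize, rest.estimateSize() };
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8));
        long[] sizes = splitSizes(data);
        System.out.println(sizes[0] + " + " + sizes[1] + " elements");
    }
}
```

A parallel stream applies this splitting repeatedly, handing each part to a worker thread, which is why sources that split cheaply and evenly tend to parallelise well.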
● Packing
● Number of Cores
● Stateful
○ accumulate state during evaluation
○ eg: sorted
○ unbounded caching of data
Benchmarking and Testing
● Don’t assume parallel = faster, measure it
● Use JMH:
http://openjdk.java.net/projects/code-tools/jmh/
● Best Practices
○ Warmup
○ Repeatability
○ Evade the JIT
Summary
Lesson Summary
1. Look at OptimisationExample
2. Try to improve the performance of this code
3. Measure performance using the benchmark harness
4. Don’t make the code uglier!
Exercise
In: com.java_8_training.problems.data_parallelism
● Example
○ 1024 cores, 50% serial
○ 1 / (0.5 + 1/1024 * (1 - 0.5)) ~= 2x speedup
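The arithmetic above has the form of Amdahl's Law: speedup = 1 / (serial + (1 - serial) / cores). A small sketch, with a hypothetical class and method name, reproduces the figure and shows how little extra cores help once half the work is serial.

```java
class AmdahlExample {

    // Upper bound on speedup when a fixed fraction of the work stays serial.
    static double speedup(double serialFraction, int cores) {
        return 1.0 / (serialFraction + (1.0 - serialFraction) / cores);
    }

    public static void main(String[] args) {
        // 50% serial: the bound approaches 2x but never reaches it.
        System.out.printf("%.3f%n", speedup(0.5, 4));
        System.out.printf("%.3f%n", speedup(0.5, 1024));
    }
}
```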