At a time when Herb Sutter announced that the free lunch is over ("The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software"), concurrency became part of our everyday life. A big change is coming to Java: Project Loom, and with it new terms such as "virtual threads", "continuations" and "structured concurrency". If you've been wondering what they will change in our daily work, whether it's worth rewriting your Tomcat-based application on top of super-efficient reactive Netty, or whether to wait for Project Loom, this presentation is for you.
I will talk about Project Loom and the new possibilities related to virtual threads and structured concurrency: how it works, what can be achieved, and the impact on performance.
14. Concurrency: throughput (tasks / time unit)
Schedule multiple largely independent tasks onto a set of computational resources
Parallelism: latency (time unit)
Speed up a task by splitting it into sub-tasks and exploiting multiple processing units
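As a minimal illustration of the parallelism side (one task split into sub-tasks across processing units), a parallel stream in Java divides a single summation across cores via the common Fork-Join pool. The class name and numbers here are my own, not from the deck:

```java
import java.util.stream.LongStream;

public class SumDemo {
    // Parallelism: speed up ONE task (summing 1..n) by splitting it into
    // sub-tasks that run on multiple cores via the common Fork-Join pool.
    static long parallelSum(long n) {
        return LongStream.rangeClosed(1, n).parallel().sum();
    }

    public static void main(String[] args) {
        System.out.println(parallelSum(1_000_000)); // prints 500000500000
    }
}
```

Concurrency, by contrast, would schedule many such independent tasks; parallelism here shortens the latency of one.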
21. Thread
• Unit of work
• Too few of them available
• Requires a lot of resources
• Hard to manage (Reactive toys 😎)
• Not enough knowledge
• Lazy programmers
25. Platform Thread
• ~1 ms to schedule a thread
• Big memory consumption: 2 MB of stack
• Expensive
• OS thread
• Task switching requires a switch to the kernel: ~100 µs (depends on the OS)
• Scheduling is a compromise for all usages. Bad cache locality
29. Virtual Thread
• Lighter threads
• Less memory usage
• Fastest blocking code*
• No more platform threads
• Is not a GC root
• CPU cache misses are possible
• Pay-as-you-go stacks (size 200-300 bytes) stored in the heap
• Scales to 1M+ on commodity hardware
• Clean stack traces
• Your old code just works
• Readable sequential code
• The natural unit of scheduling for operating systems
30. Virtual Thread
• Cheap to create
• Cheap to destroy
• Cheap to block
https://cojestgrane24.wyborcza.pl/cjg24/Warszawa/1,30,33618,Smerfy-live-on-stage-na-Torwarze.html
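The three "cheap" claims can be sanity-checked with a short sketch (Java 21+; the class name and task count are my own, not from the deck): one virtual thread per task, a hundred thousand of them, with no pool sizing to think about.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class CheapThreads {
    // Submit n trivial tasks, one virtual thread per task, and report
    // how many ran. close() on the executor waits for all tasks to finish.
    static int run(int n) {
        AtomicInteger done = new AtomicInteger();
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < n; i++) {
                executor.submit(done::incrementAndGet);
            }
        }
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println(run(100_000)); // prints 100000
    }
}
```

Doing the same with platform threads would allocate ~2 MB of stack per thread; virtual threads make this pattern routine.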
31. Purpose?
• Mostly intended for writing I/O applications
• servers
• message brokers
• Higher concurrency, if the system has the additional resources concurrency needs:
• available connections in a connection pool
• sufficient memory to serve the increased load
• Increased efficiency for short-cycle tasks
32. Virtual Thread isn’t for
• Non-realtime kernels primarily employ time-sharing when the CPU is at 100%
• Tasks that run for a long time
• CPU-bound tasks*
33. “Virtual threads are not an execution resource, but a business logic object, like a string.”
34.
final Thread thread1 = Thread
    .ofPlatform()
    .unstarted(() -> System.out.println("Hello from " + Thread.currentThread()));
final Thread thread2 = Thread
    .ofVirtual()
    .unstarted(() -> System.out.println("Hello from " + Thread.currentThread()));
thread1.start();
thread2.start();

Hello from Thread[#22,Thread-0,5,main]
Hello from VirtualThread[#23]/runnable@ForkJoinPool-1-worker-1
35. Fast forward to today
• Virtual thread = user-mode thread
• Scheduled by the JVM, not the OS
• A virtual thread is an instance of java.lang.Thread
• A platform thread is also an instance of java.lang.Thread, but implemented the “traditional” way: a thin wrapper around an OS thread
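Since both kinds are instances of java.lang.Thread, the API-level distinction is Thread.isVirtual() (Java 21+). A tiny check, with class and variable names of my own choosing:

```java
public class KindCheck {
    // Build one thread of each kind and report whether each is virtual.
    static boolean[] kinds() {
        Thread platform = Thread.ofPlatform().unstarted(() -> {});
        Thread virtual = Thread.ofVirtual().unstarted(() -> {});
        // Both are java.lang.Thread; only isVirtual() tells them apart.
        return new boolean[] { platform.isVirtual(), virtual.isVirtual() };
    }

    public static void main(String[] args) {
        boolean[] k = kinds();
        System.out.println(k[0]); // prints false (platform thread)
        System.out.println(k[1]); // prints true (virtual thread)
    }
}
```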
37. How are virtual threads implemented?
• Built on continuations, a lower-level construct of the JVM
• A virtual thread wraps a task in a continuation
• FIFO mode
• M:N threading model
43. A scheduler assigns continuations to CPU cores, replacing a paused one with another that's ready to run, and ensuring that a continuation that is ready to resume will eventually be assigned to a CPU core.
47. Copy Terminology
• Freeze: suspend a continuation and unmount it by copying frames from the OS thread stack → the continuation object
• Thaw: mount a suspended continuation by copying frames from the continuation object → the OS thread stack
50.
private static void enter(Continuation c, boolean isContinue) {
    // This method runs in the "entry frame".
    // A yield jumps to this method's caller as if returning from this method.
    try {
        c.enter0();
    } finally {
        c.finish();
    }
}

private void enter0() {
    target.run();
}
67. I/O
• The java.nio.channels classes — SocketChannel, ServerSocketChannel and DatagramChannel — were retrofitted to become virtual-thread-friendly. When their synchronous operations, such as read and write, are performed on a virtual thread, only non-blocking I/O is used under the covers.
• “Old” I/O networking — java.net.Socket, ServerSocket and DatagramSocket — has been reimplemented in Java on top of NIO, so it immediately benefits from NIO’s virtual-thread-friendliness.
• DNS lookups by the getHostName, getCanonicalHostName and getByName methods of java.net.InetAddress (and other classes that use them) are still delegated to the operating system, which only provides an OS-thread-blocking API. Alternatives are being explored.
• Process pipes will similarly be made virtual-thread-friendly, except maybe on Windows, where this requires a greater effort.
• Console I/O has also been retrofitted.
• Http(s)URLConnection and the implementation of TLS/SSL were changed to rely on j.u.c locks and avoid pinning.
• File I/O is problematic. Internally, the JDK uses buffered I/O for files, which always reports available bytes even when a read will block. On Linux, we plan to use io_uring for asynchronous file I/O, and in the meantime we’re using the ForkJoinPool.ManagedBlocker mechanism to smooth over blocking file I/O operations by adding more OS threads to the worker pool when a worker is blocked.
80. Project Synergies
• Data more local than ever
• Less reason to manually share data across thread pools
• The same data is now private in a per-request model
• GC when the thread terminates
• The virtual thread stack object itself is thread-local
98. ConcurrentHashMap#computeIfAbsent
“Some attempted update operations on this map by other threads may be blocked while computation is in progress, so the computation should be short and simple, and must not attempt to update any other mappings of this map.”
99.
import java.util.Map;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ConcurrentHashMap;

public class CHMPinning {
    public static void main(String... args) throws InterruptedException {
        Map<Integer, Integer> map = new ConcurrentHashMap<>();
        for (int i = 0; i < 1_000; i++) {
            int finalI = i;
            Thread.startVirtualThread(() -> map.computeIfAbsent(finalI % 3, key -> {
                try {
                    Thread.sleep(2_000);
                } catch (InterruptedException e) {
                    throw new CancellationException("interrupted");
                }
                return finalI;
            }));
        }
        long time = System.nanoTime();
        try {
            Thread.startVirtualThread(() -> System.out.println("Hi, I'm an innocent virtual thread")).join();
        } finally {
            time = System.nanoTime() - time;
            System.out.printf("time = %dms%n", (time / 1_000_000));
        }
        System.out.println("map = " + map);
    }
}
100.
private static final ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();

private static String refresh(String key) {
    try (var scope = new StructuredTaskScope.ShutdownOnSuccess<String>()) {
        scope.fork(() -> UUID.randomUUID().toString());
        scope.join();
        return scope.result();
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}

public static void main(String[] args) throws Exception {
    var cpus = Runtime.getRuntime().availableProcessors();
    List<Future<String>> fl = new ArrayList<>();
    try (var es = Executors.newVirtualThreadPerTaskExecutor()) {
        for (int i = 0; i < cpus; ++i)
            fl.add(es.submit(() -> cache.computeIfAbsent("foo", k -> refresh(k))));
    }
    for (var f : fl)
        System.out.println(f.get());
}
108. Future work
• BlockingQueue
• Structured Concurrency
• Use io_uring for asynchronous file I/O
• Object.wait()
• Concurrent collection review
109. Takeaways
• Nothing is changed 😃
• A virtual thread is a java.lang.Thread — in code, at runtime, in the debugger and in the profiler
• Lighter threads
• Pay-as-you-go stacks (size 200-300 bytes) stored in the heap
• Scales to 1M+ on commodity hardware
• Clean stack traces
• Your old code just works
• Readable sequential code
• The natural unit of scheduling for operating systems
• A virtual thread is not a wrapper around an OS thread, but a Java entity.
• Creating a virtual thread is cheap — have millions, and don’t pool them!
• Blocking a virtual thread is cheap — be synchronous!
• No language changes are needed.
• Pluggable schedulers offer the flexibility of asynchronous programming.
110. Takeaways
• Move to simpler blocking/synchronous code
• Migrate tasks to virtual threads, not platform threads to virtual threads
• Use Semaphores or similar to limit concurrency
• Try not to cache expensive objects in ThreadLocals
• Avoid pinning
• Avoid reusing
• Avoid pooling
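The "use Semaphores to limit concurrency" advice can be sketched as follows (Java 21+; the class name, task counts and sleep duration are my own, not from the deck): permits bound how many tasks do work at once, while the virtual threads themselves remain unpooled and unlimited.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class LimitedConcurrency {
    // Run `tasks` virtual threads, but allow at most `limit` of them inside
    // the guarded section at once; returns the peak concurrency observed.
    static int maxObserved(int tasks, int limit) {
        Semaphore permits = new Semaphore(limit);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    permits.acquireUninterruptibly();
                    try {
                        int now = inFlight.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        Thread.sleep(5); // stand-in for blocking I/O
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        inFlight.decrementAndGet();
                        permits.release();
                    }
                });
            }
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println(maxObserved(200, 10) <= 10); // prints true
    }
}
```

This replaces the role a fixed-size thread pool used to play: the pool capped concurrency as a side effect, whereas here the Semaphore states the limit explicitly.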