Threads in NodeJS_ Going beyond eventloop using Rust - DEV Community
Rahul Ramteke
Posted on Apr 10, 2022
Threads in NodeJS: Going beyond eventloop using Rust
#node #eventloop #rust #javascript
Circumventing the single thread bottleneck
Index:
NodeJS Refresher
A brief overview of how eventloop works internally
Let's block the main thread
How a simple code can bring down the performance of NodeJS
A qr generator service
A realistic example and the results of load testing
How to improve?
https://dev.to/iostreamer/threads-in-nodejs-going-beyond-eventloop-using-rust-3ch7 1/19
12/21/24, 7:44 PM Threads in NodeJS: Going beyond eventloop using Rust - DEV Community
NodeJS refresher
At this point, we have all heard and read how nodejs is singlethreaded but not really. But
just in case, here's a refresher:
NodeJS relies on the concept of event loop. The idea is to ask the os/kernel to do
the heavylifting and expect a signal saying 'hey, this is done'.
Each OS has its own thing going on: linux has epoll_wait, osx has kqueue
and windows has something weird (I/O completion ports).
These kernel api calls are the ones doing the actual job. It kinda looks like this
// pseudocode
while (event = epoll_wait()) {
    if (event.type === 'socket') {
        // do something
        // or in our case, execute the relevant callback
    }
}
NodeJS doesn't have a one-size-fits-all event loop; rather, it has a phased setup.
For example, it checks timers (setTimeout etc.) first.
Here it's the OS's show again, and it uses epoll or equivalent to know if it
needs to execute a callback or not.
Then we have the microtask queue, which handles promises, and the nextTick queue
for process.nextTick callbacks. Both are drained between callbacks rather than in a
phase of their own.
...And more, check out this video for the full picture
At the end of the phased setup, it checks if there are still any more events that
it needs to handle or wait for. If yes, the loop continues, if not the loop and the
program exits.
After getting a signal saying 'hey, this is done', the associated callback that you
provided is executed.
Now mind you, the loop itself is what's single threaded. The tasks that node
does in the loop, all on one thread.
And the associated callback that it needs to run? Well, you guessed it, the
same event loop thread.
And now you can see why there might be some confusion around the execution.
After all, it's singlethreaded but not really.
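To make the phases a bit more concrete, here is a minimal sketch (not from the original post) that records the order in which differently scheduled callbacks run. The function name observeOrder is made up for illustration; everything is scheduled from inside a timer callback so the result doesn't depend on how the script was started.

```javascript
// Records the order in which differently scheduled callbacks run.
function observeOrder() {
  return new Promise((resolve) => {
    const order = [];
    setTimeout(() => {
      // A new timer: handled in the timers phase of a later loop iteration.
      setTimeout(() => {
        order.push('timer');
        resolve(order);
      }, 0);
      // Promise microtask: runs once the current callback has returned.
      Promise.resolve().then(() => order.push('promise'));
      // nextTick queue: drained before promise microtasks.
      process.nextTick(() => order.push('nextTick'));
      // Plain synchronous code always finishes first.
      order.push('sync');
    }, 0);
  });
}

observeOrder().then((order) => console.log(order.join(' -> ')));
// prints: sync -> nextTick -> promise -> timer
```

All four callbacks run on the same event loop thread, just at different points of the loop.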
Also, what happens if the callback you provided is trying to calculate the meaning of
life? That's when we have a problem, because now our eventloop isn't going to do
anything until the callback function's execution is complete.
That's what we mean by blocking the main thread in NodeJS.
Let's block the main thread
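const express = require('express')
const app = express()
const port = 3000

function getHash(text) {
    let hashedString = text;
    // `let`, not `const`: the counter is reassigned on every iteration
    for (let i = 0; i < 500000; i++) {
        // do fancy hashing
    }
    return hashedString;
}

// Assumed route: hashes the `text` query param on the event loop thread
app.get('/', (req, res) => {
    res.send(getHash(req.query.text || ''))
})

app.listen(port, () => {
    console.log(`App listening on port ${port}`)
})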
Based on what we discussed in previous section, we can see how this setup can
backfire and undermine the performance of NodeJS. But to show again:
1. NodeJS starts up, and starts executing our script
2. It asks OS to tell when the server starts
3. It asks OS to also tell when that server receives a connection request
4. And now the grand loop runs in phased manner, checking timer first, then i/o and so
on
5. Since NodeJS still has some events that it's waiting for(server connection requests),
the loop doesn't quit
6. Let's say someone hits our api, then the os tells NodeJS of that event
7. In the next iteration/tick of the grand phased loop, it checks timers first, finds
nothing and then it checks i/o
8. It finds that there's a request, and promptly starts executing associated callback
9. Once the execution of callback is finished, the grand phased loop is iterated again
and the queues are checked for more connection requests.
Now, our callback isn't very easy breezy, it can take a good amount of time to execute,
relatively speaking.
And that will delay the next iteration of the grand phased loop, which will delay knowing
if there's a new connection or not. And that's one very good way of losing i/o
performance in NodeJS.
If you look at the code, it's quite innocent looking, nothing weird about it. But one
nefarious loop or thread blocking operation is all it takes.
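The effect is easy to reproduce without a server. Below is a small sketch, with made-up helper names blockFor and measureTimerDelay, where a busy loop stands in for the heavy callback and a timer stands in for the next connection waiting to be noticed. The timer is used here only because it is simple to measure; the loop being stuck is what delays it.

```javascript
// Busy-waits on the main thread, standing in for a CPU-heavy callback.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // spin
  }
}

// Schedules a timer for `timerMs`, then blocks the thread for `blockMs`.
// Resolves with how long the timer actually took to fire.
function measureTimerDelay(timerMs, blockMs) {
  return new Promise((resolve) => {
    const start = Date.now();
    setTimeout(() => resolve(Date.now() - start), timerMs);
    blockFor(blockMs);
  });
}

measureTimerDelay(10, 200).then((elapsed) => {
  console.log(`asked for 10 ms, fired after ${elapsed} ms`);
});
```

The timer asked for 10 ms, but it can't fire before the 200 ms busy loop gives the thread back.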
A qr generator service
The previous example of hash calculation isn't very realistic. So let's say we have to
build a service which can create a qr image of any given text.
This service will have a simple GET api which will take text in query params. After that it
will return a base64 string representing the QR version of given text.
Let's use NodeJS and commonly used libraries for this service. Here's how it looks in
code:
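const QRCode = require('qrcode')
const express = require('express')
const app = express()
const port = 3000

// Assumed route and handler: encode ?text=... as a QR code
// and respond with it as a base64 data URL
app.get('/qr', async (req, res) => {
    const text = req.query.text || 'QR TEST'
    res.send(await QRCode.toDataURL(text))
})

app.listen(port, () => {
    console.log(`App listening on port ${port}`)
})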
Voilà! We have what we needed. A very simple script which does what we planned to
do. But here's the catch, if you look at the source code of the qrcode library, you'll find
there are no async calls. It's all done in one synchronous function.
And now our code looks a lot like the 500k hashing one. But how bad can it really be?
To answer that, I set up pm2 for some advanced monitoring and artillery for load
testing. Here's how it went:
┌─ Custom Metrics ───────────────────────────────────────────┐
│ Used Heap Size                                   23.74 MiB │
--------------------------------
Summary report @ 16:49:34(+0530)
--------------------------------
http.codes.200: .............................49994
http.request_rate: ..........................356/sec
http.requests: ..............................49994
http.response_time:
min: ......................................1
max: ......................................97
median: ...................................15
p95: ......................................29.1
p99: ......................................47
http.response_time:
min: ................ 1 ms
max: ................ 97 ms
median: ............. 15 ms
p95: ................ 29.1 ms
p99: ................ 47 ms
The response times that we are seeing, a median of 15ms and p95, p99 of ~30ms and
~50ms respectively, seem like a lot. It's a fairly simple service, it makes sense to expect
better.
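If median, p95 and p99 are unfamiliar, here is a rough sketch of how such numbers can be derived from raw latencies using the nearest-rank method. This is only an illustration: artillery may compute its percentiles differently, and the sample latencies below are made up.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Made-up latencies in ms: mostly fast, with a slow tail.
const latencies = [1, 2, 2, 3, 3, 4, 5, 8, 30, 90];
console.log(percentile(latencies, 50)); // 3
console.log(percentile(latencies, 95)); // 90
```

The median hides the slow tail completely, which is why p95 and p99 are the numbers to watch when a few requests get stuck behind a blocked loop.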
We know that we have a performance bottleneck, and apparently this is how it crops
up. But we still don't know if this is really bad or not, or if we can do better, and if so,
by how much.
How to improve?
We know the bottleneck is that we only have one thread, and if we block it, we are
doomed. We need more threads for this. What if we tried worker_threads?
Introduced in node 10.5 (experimental at first), these are separate threads with their own
eventloops. Each worker gets its own v8 isolate, but unlike child processes they all live in
the same process and can share memory. This is what makes
them analogous to standard threads in other runtimes.
Well, we probably can use them and it might even work, but I wanted to go all in and
have a much leaner solution.
That's why I went with Rust, to get some near native performance.
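For comparison, the worker_threads route that I didn't take could look roughly like the sketch below. It is not part of the service in this post: the function name sumInWorker and the toy loop are made up, and the worker's code is passed as a string with eval: true only to keep the example in one file.

```javascript
const { Worker } = require('worker_threads');

// Code that runs inside the worker thread. A toy CPU-bound loop stands in
// for the real work (hashing, QR generation, ...).
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  let acc = 0;
  for (let i = 0; i < workerData.iterations; i++) {
    acc = (acc + i) % 1000003;
  }
  parentPort.postMessage(acc);
`;

// Runs the loop on a separate thread and resolves with its result.
// The main event loop stays free while the worker is busy.
function sumInWorker(iterations) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, {
      eval: true,
      workerData: { iterations },
    });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

sumInWorker(1000000).then((result) => console.log('worker result:', result));
```

Spawning a worker per request is not free, so a real service would more likely keep a small pool of workers around and hand jobs to them.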
Architecture
The idea is to use NodeJS for what it's known for, i.e. brilliant i/o and async
performance, and Rust for managing threads. This way we get the best of both
worlds.
NodeJS has n-api/node-api as a layer which enables FFI (Foreign Function Interface).
Essentially, it allows node to call functions running in an entirely different runtime, written
in another language.
Rust solution
We are using neon to help us with creating a Rust binding for NodeJS. They have pretty
good docs and examples for you to start tinkering with.
I started with their hello-world example and then used that as a template.
Neon creates a node compatible binary, which our NodeJS program then loads as a
library and runs.
Here's the rust code:
use neon::prelude::*;
use image::{DynamicImage, ImageOutputFormat, Luma};
use base64::{encode as b64encode};
use qrcode::QrCode;
use neon::event::Channel;

fn create_qr(
    text: String,
) -> Result<String, String> {
    let width = 128;
    let height = 128;
    // ...
            Ok(())
        });
    }
    Ok(cx.undefined())
}

#[neon::main]
fn main(mut cx: ModuleContext) -> NeonResult<()> {
    cx.export_function("createQR", parse_js_and_get_qr)?;
    Ok(())
}
app.listen(port, () => {
console.log(`App listening on port ${port}`)
})
And it works! If we run this code, we will get our base64 representation of a qr code.
But is it any good? Does this perform better than our main thread blocking version?
┌─ Custom Metrics ───────────────────────────────────────────┐
│ Used Heap Size                                   22.00 MiB │
│ Heap Usage                                         36.74 % │
│ Heap Size                                        59.87 MiB │
│ Event Loop Latency p95                             2.29 ms │
│ Event Loop Latency                                 0.17 ms │
│ Active handles                                        1604 │
│ Active requests                                          0 │
│ HTTP                                        240.11 req/min │
│ HTTP P95 Latency                      9.549999999999955 ms │
│ HTTP Mean Latency                                     1 ms │
└────────────────────────────────────────────────────────────┘
--------------------------------
Summary report @ 16:55:55(+0530)
--------------------------------
http.codes.200: .............................50005
http.request_rate: ..........................356/sec
http.requests: ..............................50005
http.response_time:
min: ......................................0
max: ......................................58
median: ...................................1
p95: ......................................12.1
p99: ......................................22
Important stats:
event-loop-latency:
p95 2.29 ms
current 0.17 ms
http.response_time:
min: ................ 0 ms
max: ................ 58 ms
median: ............. 1 ms
p95: ................ 12.1 ms
p99: ................ 22 ms
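If you want to watch event loop latency from inside the process rather than through pm2, Node's perf_hooks module has a built-in histogram for it. The sketch below is an approximation of the idea, not how pm2 itself measures it (I haven't verified that); the helper names spin and sampleLoopDelay are made up.

```javascript
const { monitorEventLoopDelay } = require('perf_hooks');

// Busy-waits on the main thread, standing in for a CPU-heavy callback.
function spin(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // spin
  }
}

// Samples event loop delay while the loop is blocked for `blockMs`.
function sampleLoopDelay(blockMs) {
  return new Promise((resolve) => {
    const histogram = monitorEventLoopDelay({ resolution: 10 });
    histogram.enable();
    // Let the sampler start ticking, then block, then give it time to notice.
    setTimeout(() => {
      spin(blockMs);
      setTimeout(() => {
        histogram.disable();
        // The histogram reports nanoseconds; convert to milliseconds.
        resolve({
          p95: histogram.percentile(95) / 1e6,
          max: histogram.max / 1e6,
        });
      }, 50);
    }, 50);
  });
}

sampleLoopDelay(200).then(({ p95, max }) => {
  console.log(`event loop delay p95=${p95.toFixed(2)} ms max=${max.toFixed(2)} ms`);
});
```

A single 200 ms block shows up as a maximum delay in the same ballpark, which is exactly the kind of spike the Rust version avoids.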
Comparison
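HTTP performance: Latency in ms (from the two artillery reports above)
            NodeJS only    NodeJS + Rust
median      15             1
p95         29.1           12.1
p99         47             22
max         97             58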
Conclusion
We see a tremendous performance increase, especially in the p95 and p99 cases: p95 drops
from 29.1 ms to 12.1 ms and p99 from 47 ms to 22 ms. We successfully modified our app such
that not only is it faster on average, but the users who hit the slow tail are no longer far
behind everyone else. This ~2-3x improvement in tail latency says a
lot about where node shines and where it shouldn't be used.
This ability to create native addons has huge implications for JS projects. Imagine you
have your entire stack in typescript and all the engineers well versed with TS/JS
ecosystem, but you finally hit the limit. Now you can rewrite and retrain, or you can
simply create a fast, low surface area library which anyone can plug and play as easily
as downloading it from npm.