
Why WebAssembly came to the Backend (Wasm in the wild part 3)

Why and how are people running WebAssembly on the backend? I make the case that it's simply the best way to run sandboxed code written in systems languages, and explain why you should consider using it in your applications. Examples will be in Rust, but the lessons apply to many languages.

This is part 3 of the Wasm in the Wild series. Each post can be read in isolation. But for the best narrative from start to finish, start with part 1.

Introduction

WebAssembly sounds like it was made for the Web. For good reasons, as I explained in the previous post. For a long time, the browser wanted a general-purpose sandbox, not tied to a specific language or ecosystem. Wasm brought that. But Wasm in the cloud is also growing fast.

Why do we need a sandbox like Wasm on the server? Isolating different applications from each other is the obvious use case, and I will show you in a moment how Wasm is particularly useful for that. But this is not all.

Wasm is also good for easy and portable deployment without the Docker overhead. In fact, it’s so good at it that Docker, Inc. now allows running Wasm modules side-by-side with traditional Docker containers. Code examples for that follow in this article, too.

The article roughly follows the route shown on the map below.

Image: A map showing a path to "Backend Sandbox." Wasm in the Wild map with route of today.

Image: An uneven grass terrain with two trees in the front and more forests in the back. The start of today’s journey.

Alternatives before Wasm

Quick refresher. The purest form of running code on your CPU would be to run it all in privileged mode with full system access. That’s how it used to be. But eventually, humanity decided it would be better to have an operating system in place that isolates different processes from each other. Only a part of the OS should run with full system access, which we nowadays call the kernel, while all other code runs in a restricted environment, the user space.

While the reason for sandboxing like this is to restrict access, the more interesting engineering work is usually to put the right API in place to overcome those restrictions. We want a roadblock that controls who passes the road, not one that makes it entirely impassable.

With the split between user space and kernel, system calls are used to kindly ask the kernel to do what we cannot do inside user space. How system calls work is defined at the hardware level by the CPU. What each system call does is handled by the operating system. POSIX defines a standard set of system calls for portability between compatible OSs. This makes file manipulation through open, read, write, and close portable across Linux and macOS, for example. WinAPI defines them differently, making Windows compatible only with Windows. Luckily, we have standard libraries that hide the actual system calls and their differences between operating systems from us.
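To make the portability point concrete, here is a tiny Rust sketch. The same code compiles unchanged on Linux, macOS, and Windows, because the standard library translates it into the right system calls underneath.

use std::fs;

fn main() -> std::io::Result<()> {
    // std maps these calls to open/read/write/close on POSIX systems
    // and to the equivalent WinAPI functions on Windows.
    fs::write("note.txt", "portable!")?;
    let content = fs::read_to_string("note.txt")?;
    println!("{content}");
    fs::remove_file("note.txt")
}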

So far so good: we have isolated processes and an API for accessing what’s outside the sandbox. But the age of a computer system can be measured by the number of layers it has, similar to counting the rings in a tree trunk.

The next layer would be full VMs, grouping multiple processes into a shared sandbox that’s isolated from other VMs on the same physical machine. It gives a full OS to each sandbox, including kernel and user space.

Another layer is created by containers, like Docker for a POSIX-compatible sandbox. Compared to VMs, containers can share the OS kernel with the host. This can reduce startup time and memory overhead substantially. As an example, an empty Debian Docker image is only around 50 MB and starts in less than a second. Often a negligible overhead compared to the applications running inside.

The Docker sandbox restricts containers to only access the private filesystem and network by default. Developers can explicitly set up permissions to break these restrictions when starting a container. Tools like docker compose help automate this configuration step.

A Docker-based setup is pretty good for long-running applications that need a full POSIX interface. But for finer granularity, it is still too heavy. For cloud computing paradigms closer to Function-as-a-Service, a container per function call is usually not viable.

For an extra layer of isolation with finer granularity, a Wasm runtime is a great solution. Starting it usually has a sub-millisecond overhead. Spawning a new process would be slower than starting a Wasm runtime inside the existing process.

Wasm modules are also compact in size by design, since they must be fast to download even over a bad cellular network connection. The code for a FaaS module would likely be less than 100 kB if optimized properly.
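What does “optimized properly” mean? A size-focused release profile in Cargo already gets you most of the way. The settings below are one common starting point, not a guarantee; actual savings vary by project, and tools like wasm-opt can shave off more.

# Cargo.toml
# A common size-focused release profile.
[profile.release]
opt-level = "z"   # optimize for size instead of speed
lto = true        # cross-crate inlining and dead-code elimination
codegen-units = 1 # better optimization, slower compilation
strip = true      # drop symbols from the output
panic = "abort"   # skip the unwinding machinery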

But we can go beyond FaaS with the isolation granularity of Wasm. Why not run every endpoint invocation of your public API in a separate Wasm runtime? You would keep your web server with session management but hand off all business logic and computations to run as Wasm. This approach has a couple of benefits.

First, you can easily limit the compute time and memory consumption of each request, as a basic DoS mitigation. Second, you can hot-swap the Wasm code for an updated endpoint with zero downtime.

A third benefit is complete memory isolation for each request. Even if you write buggy code and have something like a memory buffer over-read, the request cannot access secret data from other requests or even the host.

You could achieve the same architecture with a separate process per endpoint. But then you have to deal with the overhead of starting a process and inter-process communication. In the Wasm approach, multiple Wasm runtimes can share the same host process. It’s much easier to share something like a pool of DB connections in this way.
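Embedding a runtime like this in your own server is a topic for later in this series. But as a small taste, here is a minimal sketch of a per-request sandbox with the wasmtime crate. The exported function name handle_request is a hypothetical placeholder, and exact method names vary between wasmtime versions (fuel, for example, used to be added with add_fuel instead of set_fuel).

use wasmtime::{Engine, Instance, Module, Store, StoreLimits, StoreLimitsBuilder};

fn run_request(engine: &Engine, module: &Module, input: i32) -> anyhow::Result<i32> {
    // Cap this request's linear memory at 16 MiB.
    let limits: StoreLimits = StoreLimitsBuilder::new().memory_size(16 << 20).build();
    let mut store = Store::new(engine, limits);
    store.limiter(|limits| limits);

    // Cap compute time via fuel metering.
    // (Requires Config::consume_fuel(true) when creating the Engine.)
    store.set_fuel(10_000_000)?;

    let instance = Instance::new(&mut store, module, &[])?;
    let handler = instance.get_typed_func::<i32, i32>(&mut store, "handle_request")?;
    // A misbehaving guest traps instead of taking down the server.
    Ok(handler.call(&mut store, input)?)
}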

In short, Wasm gives us a blazing-fast sandbox that can run in the same process. It’s a powerful tool to have in your repertoire as a developer in 2024.

Image: A path up a grassy hill. Ready to walk up that hill?

Next, I will show 4 ways of running Rust code in a Wasm sandbox on the backend, starting with the lowest level and adding one layer at a time.

  1. Run it as pure Wasm.
  2. Run it through the WebAssembly System Interface (WASI).
  3. Run it through one of many Wasm backend frameworks.
  4. Run it as a Docker container.

Pure Wasm on the Server

The purest form of running Rust in a Wasm sandbox is to compile it for wasm32-unknown-unknown, which makes no assumptions about the runtime. Here is a complete example.

# Cargo.toml
[package]
name = "is-thirteen"
version = "0.1.0"
edition = "2021"

[lib]
crate-type = ["cdylib"]

[dependencies]
// src/lib.rs
#[no_mangle]
pub fn is_thirteen(num: u32) -> bool {
    num == 13
}

As you can see, the example code is extremely simple. We take a number as input and return a boolean indicating whether it is equal to 13 or not.

Cargo can compile this into a portable and runtime-agnostic Wasm module.

$ cargo build --release --target wasm32-unknown-unknown

Now, using wasmtime as the runtime in this example, we can execute this module.

$ wasmtime run is_thirteen.wasm --invoke is_thirteen 1
0
$ wasmtime run is_thirteen.wasm --invoke is_thirteen 12
0
$ wasmtime run is_thirteen.wasm --invoke is_thirteen 13
1
$ wasmtime run is_thirteen.wasm --invoke is_thirteen 14
0

Now, to quickly demonstrate one of the benefits of sandboxing: we can, for example, limit the memory of the Wasm module to 2 MB like so.

$ wasmtime run -W max-memory-size=2000000 is_thirteen.wasm --invoke is_thirteen 13

I hope this makes sense so far, at least on a high level. But there is already a bunch of magic happening that I want to explain.

As mentioned in the intro, sandboxes restrict access but also have to define a way to overcome restrictions and communicate with what’s outside the box. In this example, the input comes from outside the sandbox and the result is returned to the outside world.

When running the example, the wasmtime CLI made a few assumptions for us. It took the argument we gave it on the command line, which is a string, and converted it to a number. Then it pushed that on the virtual Wasm stack before calling is_thirteen. After the function finished, it popped the return value, which we defined to be a bool in Rust. This is then represented as 0 or 1 in a string output we see on the command line.

These steps are not part of the Wasm standard. They are choices of the CLI on how to make running pure Wasm modules somewhat useful. To understand better what is really going on under the hood, we can inspect the Wasm module with wasm2wat from The WebAssembly Binary Toolkit.

$ wasm2wat ./target/wasm32-unknown-unknown/release/is_thirteen.wasm
(module
  (type (;0;) (func (param i32) (result i32)))
  (func $is_thirteen (type 0) (param i32) (result i32)
    local.get 0
    i32.const 13
    i32.eq)
  (table (;0;) 1 1 funcref)
  (memory (;0;) 16)
  (global $__stack_pointer (mut i32) (i32.const 1048576))
  (global (;1;) i32 (i32.const 1048576))
  (global (;2;) i32 (i32.const 1048576))
  (export "memory" (memory 0))
  (export "is_thirteen" (func $is_thirteen))
  (export "__data_end" (global 1))
  (export "__heap_base" (global 2)))

This is the Wasm code produced by cargo, without knowing anything about the runtime we are going to use to run it.

The module is only 15 lines long. Four of them are the actual function, which reads parameter 0 and compares it to the constant 13. The return is implicit by keeping the result on the stack.

(type (;0;) (func (param i32) (result i32)))
(func $is_thirteen (type 0) (param i32) (result i32)
   local.get 0
   i32.const 13
   i32.eq)

Before the function, the type is defined as a function that takes one parameter and returns a result. Since this is the first definition in the module, it is implicitly type 0 for references further down in the code. For convenience, the wasm2wat command also puts that into a comment ((;0;)). (Yes, WebAssembly text format inline comments start with (; and end with ;)).

Did you notice how i32 is used in the binary despite the u32 and the bool in the code? Well, Wasm doesn’t have a native u32 or bool type. So on the binary level, it uses i32. Any u32 value can be encoded as an i32, and thanks to two’s complement encoding, arithmetic operations are equivalent. That’s good enough if the caller knows how to encode values. But does wasmtime know it? Let’s try.

$ wasmtime run is_thirteen.wasm --invoke is_thirteen 2147483647
0
$ wasmtime run is_thirteen.wasm --invoke is_thirteen 2147483648
Error: failed to run main module `is_thirteen.wasm`
Caused by:
    number too large to fit in target type

Uh-oh. If we try to pass a number higher than i32::MAX = 2147483647, wasmtime fails to encode it, even though it would fit in a u32. This makes sense: the Wasm module defines the parameter as i32, and there is no way for wasmtime to know that our original code wanted a u32. It shows how important the glue between a portable app and a runtime is. The CLI can only do so much.
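For the curious, the bit-level equivalence is easy to see from the Rust side. There is nothing Wasm-specific in this snippet; it just shows that reinterpreting the bits between u32 and i32 is lossless.

fn main() {
    // u32 values above i32::MAX reinterpret to negative i32 values,
    // but the bit pattern (what Wasm actually stores) stays the same.
    let big: u32 = 2_147_483_648; // i32::MAX + 1
    let reinterpreted = big as i32; // -2147483648, identical bits
    assert_eq!(reinterpreted as u32, big); // round-trips losslessly

    // Comparisons like `num == 13` agree under either interpretation,
    // which is why the module works fine for small inputs.
    assert_eq!((13u32 as i32) == 13i32, 13u32 == 13u32);
}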

What else is exposed by the module for the runtime?

(export "memory" (memory 0))
(export "is_thirteen" (func $is_thirteen))
(export "__data_end" (global 1))
(export "__heap_base" (global 2))

The module has 4 export entries. One of them is our function we just looked at, exported with the name “is_thirteen”. Before that is a memory region named “memory”, which internally references memory number 0. This was previously defined to hold 16 pages of 64 KiB each, hence 1 MiB in total.

(memory (;0;) 16)

This tells the runtime any code inside will only access memory addresses 0 to 1 MiB. The runtime can allocate it accordingly.

The last two exports are pointers into the memory called __data_end and __heap_base. This helps the runtime understand how the module organizes its memory internally.

(global (;1;) i32 (i32.const 1048576))
(global (;2;) i32 (i32.const 1048576))

Both are set to 1048576, exactly at 1 MiB. __data_end marks where the module’s static data ends. If we had constant strings in our code, they would show up in a data segment ending there.

At the exact same address, 1048576, the heap starts, meaning everything afterward is dynamically managed memory. Since the entire memory region is only 1 MiB, we can conclude the module has no heap in this case.

So what fills the 1 MiB below __data_end, given that our module has no static data? That region is reserved for the program stack. Not the stack of the Wasm virtual stack machine, which I explained in Fundamentals of WebAssembly, but the stack where your programming language can put data temporarily, without a dynamic memory allocator. Rust links Wasm modules with this stack placed first in memory, which is why $__stack_pointer is initialized to 1048576 and grows downward from there.
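To see the data section in action, we could add some static data to the module. Here is a hypothetical variant; the exact addresses in the wasm2wat output depend on your toolchain version.

// src/lib.rs
// Variant with static data, to make the data section visible.
#[no_mangle]
pub fn greeting() -> *const u8 {
    // This string literal lands in the data section. After recompiling,
    // wasm2wat shows a (data ...) segment, and __data_end moves past
    // the string's bytes.
    b"hello from the data section\0".as_ptr()
}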

Okay, the module exposes its internal memory details. But why? You might have guessed it, it has to do with sandboxing and communicating across its boundaries.

First, defining the memory region allows the runtime to map Wasm addresses to host virtual addresses efficiently. In Wasm, if a pointer has a value of 8, for example, this is relative to the Wasm module. Another module in the same runtime could also have an address of 8 that references something completely different. The runtime has to translate the Wasm addresses of a module to valid native pointers. Ideally, without allowing sandbox escapes.

So for our sample code, the runtime allocates a 1 MiB memory region as the linear memory of the Wasm module. Let’s say that region has a native address of 0xffff_0000_0000_0000. An interpreter would now add 0xffff_0000_0000_0000 to each memory access to do the address mapping.
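Conceptually, that per-access translation looks something like the sketch below. This is not wasmtime’s actual code; real runtimes compile the check away or rely on tricks like guard pages.

/// Conceptual sketch of guest-to-host address translation.
struct LinearMemory {
    base: Vec<u8>, // host allocation backing the guest's linear memory
}

impl LinearMemory {
    fn load_u8(&self, guest_addr: u32) -> Result<u8, &'static str> {
        // The bounds check keeps the guest inside its sandbox, and
        // indexing into `base` implicitly adds the host base address.
        self.base
            .get(guest_addr as usize)
            .copied()
            .ok_or("trap: out-of-bounds memory access")
    }
}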

A more efficient runtime can avoid doing this work on each memory access. It can translate all addresses during the compilation of the Wasm code to native code. Hence, this addition happens during module loading/compilation, not during execution.

For the data section, which is a static size during execution, this works well enough. The stack and the heap, however, can usually grow dynamically. Ideally, the runtime should start with a small allocation and be ready to dynamically increase it as necessary. That’s one of the reasons why it can be good for the runtime to know which addresses are considered to be on the heap by the program.

You don’t have to understand these intricate details of memory management to use Wasm. But they are important for understanding the architecture and design of Wasm on a deeper level. Other runtimes, like Java’s, decided to put memory management in the runtime and dictate the usage of a garbage collector. Wasm tries to dictate as little as possible and allows maximum flexibility in how the host and the guest want to split the work of memory management. That’s why it works well with Rust and C but can also be used with Go and C#.

So much for the pure Wasm approach. It has very tight sandboxing. So tight, in fact, that it’s hard to show a more useful example of it. If we wanted to do something interesting, even just printing something to the console, we would need to switch gears and invoke wasmtime as a library from code, so that we can define imports like we did in the JS example. You might also recall from the previous post how much work was necessary to pass a string from Wasm to JS without wasm-bindgen. It would be the same work here.

But instantiating a Wasm runtime inside your native code is for another day. For now, I want to finish the overview of how to compile Rust to Wasm and run it as a standalone module. Let’s add another layer; it will make things easier.

Image: A pretty view from atop a pretty barren hill. Let’s leave this barren hill.

WASI on the Server

The most vanilla layer of abstraction on top of pure Wasm, for the backend, would be the WebAssembly System Interface (WASI). Like POSIX system calls, it is an API that gives the program access to system resources. With the wasm32-wasi target for Rust, the standard library uses this interface instead of system calls. For other programming languages, you want to look at something like the standalone Wasm option of Emscripten.

Through WASI, you may access system resources like files or the network. Like Wasm, WASI is a standard developed by the W3C. Wasm had its 1.0 release a while ago and all major browsers support it. WASI builds on top of it and naturally lags behind a bit regarding stability.

The first WASI standard preview floated around for a couple of years and has been implemented by multiple Wasm runtimes. Today, what came out of it is known as WASI 0.1. Despite the version number indicating instability, it works well enough to be used in production.

There is also WASI 0.2, which I believe is supposed to replace WASI 0.1. But it was only released in February 2024, so I wouldn’t quite call WASI 0.1 deprecated, yet.

With WASI, we can create a more interesting example. Let’s create a Wasm module that takes a message and a signature as input from the user and then validates the signature using a certificate stored in a file.

# Cargo.toml
[package]
name = "wasm-signature-checker"
version = "0.1.0"
edition = "2021"

[dependencies]
ed25519-compact = "2"
hex = "0.4"
anyhow = "1"
// main.rs
use ed25519_compact::{PublicKey, Signature};

fn main() -> anyhow::Result<()> {
    println!("> Wasm signature verifier, featuring WASI.");
    println!("> Please enter the message text.");
    let message = read_line()?;

    println!("> Please enter the signature in hex encoding.");
    let hex_input = read_line()?;
    let binary_input = hex::decode(hex_input.clone())?;
    let signature = Signature::from_slice(&binary_input)?;

    println!("> Please enter the path to the certificate.");
    let path = read_line()?;
    let pk_vec = std::fs::read(&path)?;
    let pk_array: [u8; 32] = pk_vec.try_into().unwrap();
    let pk = PublicKey::new(pk_array);

    println!(">");
    println!("> Verifying ");
    println!(">           {message}");
    println!(">           {hex_input}");
    println!(">           {path}");
    match pk.verify(message, &signature) {
        Ok(()) => println!("valid"),
        Err(_) => println!("invalid"),
    }
    Ok(())
}

/// Read a line from stdin and remove trailing whitespace, such as \n
fn read_line() -> Result<String, anyhow::Error> {
    let mut line = String::new();
    std::io::stdin().read_line(&mut line)?;
    line.truncate(line.trim_end().len());
    Ok(line)
}
# set up certificate in a new directory
mkdir guest_dir
echo "D0A6947EF7BDA615D38B43A896AE6EA6FDE87F98729C18B6D9AB447051B0E7A0" | xxd -r -p > ./guest_dir/my_pk.cert

Note how I’m using the Rust standard library to read from standard input and read a file from a path. Also, println! prints to the standard output. None of this would work in pure Wasm. But when I compile it for --target wasm32-wasi, it all gets mapped to the WASI interface.

cargo build --release --target wasm32-wasi
cp ./target/wasm32-wasi/release/wasm-signature-checker.wasm ./

Finally, we can run the code with wasmtime again, this time interactively.

$ wasmtime wasm-signature-checker.wasm --dir ./guest_dir::/
> Wasm signature verifier, featuring WASI.
> Please enter the message text.
Hello WASI!
> Please enter the signature in hex encoding.
96D7065598D79299AD539B1299E83196EA19A2660342ED480FD1500E813E636DCA4112D3A49B783B207219B432E7F0882D8F3840D6D7591ECA271CB028A3D608
> Please enter the path to the certificate.
my_pk.cert
>
> Verifying 
>           Hello WASI!
>           96D7065598D79299AD539B1299E83196EA19A2660342ED480FD1500E813E636DCA4112D3A49B783B207219B432E7F0882D8F3840D6D7591ECA271CB028A3D608
>           my_pk.cert
valid
$

Probably the most important part here is how we have to give explicit permission to read the file ./guest_dir/my_pk.cert. By adding --dir ./guest_dir::/ as an argument, I mapped ./guest_dir to / inside the filesystem the Wasm runtime exposes. Anything outside the directory is inaccessible, as I can demonstrate by trying to access, for example, ../secret_key.

# Demonstrate sandboxing in WASI
# ...
$ wasmtime wasm-signature-checker.wasm --dir ./guest_dir::/
> Please enter the path to the certificate.
../secret_key
Error: Operation not permitted (os error 63)

As you can see, the sandboxing model of WASI is similar to Docker. By default, no access. But all the tools are available to quickly give access to specific parts of the system.

If instead I run the example as a native binary and not as a Wasm module, the program has access to anything my current user has access to…

# Demonstrate non-sandboxing in native
$ cargo run
# ...
> Please enter the path to the certificate.
/home/jakmeier/.ssh/id_ed25519.pub
thread 'main' panicked at src/main.rs:16:48:
called `Result::unwrap()` on an `Err` value: [ REDACTED ]

In the example above, running on my dev machine, my real ed25519 public key was printed out. The program panicked because the format was unexpected, but it did read the file content and printed it as part of the panic message.

It wouldn’t work with my private key, since I have the usual file permission setup where only root can read private keys. But here we are again, relying on specific configurations on the OS level for the isolation of resources. Running it in a Wasm sandbox seems more foolproof to me.

Before moving on from WASI, let me briefly mention the component model. It’s part of WASI 0.2 and allows the composition of Wasm modules, even across programming language boundaries. Traditionally, we would call this concept dynamic linking of shared libraries in native binaries, or maybe dynamic class loading in the JVM or CLR.

The sandboxing properties of it are quite interesting: even combined components remain strongly isolated from each other. Communication again works over imports and exports, plus there is an interface definition layer to define how components can be combined.

The component model is really cool. So cool, in fact, that I want to show it off in a dedicated post. Right now, we need to move on if we want a chance at reaching the planned destination before it gets dark.

Image: A house with many rooftop bricks lined up on the side. Somehow this house we pass feels right for the next section.

Wasm frameworks on the Server

The list of Wasm runtimes is longer than I can keep track of. And as a layer on top of the runtimes, there are Wasm platforms, of which I have also lost the overview. But the platforms I have tested mostly follow a similar pattern.

The Wasmer runtime is an example of a well-documented and widely used project that also comes with a platform built on top. Their platform is called “Wasmer Edge” and has been available as a public Beta version since October 2023. (General Wasmer Edge Announcement, Announcement of Public Beta)

The idea is that you can build Wasm modules and deploy them directly to a batteries-included, scalable cloud. No infrastructure for you to maintain, just a bunch of Wasm modules accessible through your domain.

It’s a bit like FaaS again, but it promises to be cheaper due to the tiny resource requirements of Wasm runtimes. Fundamentally, that is what fuels this direct attack on the established model of today’s hyperscalers.

Wasm platforms also promise to be more flexible than FaaS platforms like AWS Lambda since the platform has WASI support. In the case of Wasmer Edge, they even defined a superset of WASI called WASIX, which gives full POSIX compatibility. Essentially, WASIX is for the impatient who want multi-threading, subprocesses, full networking support, and other cool features before they get standardized in a formal WASI preview. Even setjmp and longjmp work in WASIX, which is kind of crazy.

WasmEdge, despite its very similar name, is a completely separate platform with a similar approach. They also have their own Wasm runtime and implement WASI. But without the X, so you get access to most things you need, just not quite full POSIX compliance. Using their SDK, you can easily access resources on a higher level, including networking at the HTTP level, database driver bindings, and of course AI integrations.

There are also Spin by Fermyon and wasmCloud by wasmCloud LLC, two further platforms that will cheaply host your Wasm modules. I lump these two together because both use wasmtime, the runtime implemented by the Bytecode Alliance nonprofit. This makes them WASI-compliant. And again, they have SDKs for all the things I mentioned for WasmEdge.
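As a taste of what these SDKs look like, here is roughly what a minimal HTTP handler looks like with Spin’s Rust SDK. This sketch is based on the spin-sdk 2.x API; check the current documentation, since the SDK has changed between versions.

use spin_sdk::http::{IntoResponse, Request, Response};
use spin_sdk::http_component;

/// The macro wires this function up as the component's HTTP entry point.
#[http_component]
fn handle(_req: Request) -> anyhow::Result<impl IntoResponse> {
    Ok(Response::builder()
        .status(200)
        .header("content-type", "text/plain")
        .body("Hello from a Wasm module!")
        .build())
}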

As you can see, the namespace of “Wasm” plus something about cloud or edge is already getting scarce. Probably, we have enough of these projects already but if you must create another one, please don’t call it “Cloud Wasm”.

Seriously though, you don’t have to create another one of these. As far as I can see, all those projects have their code on GitHub under open-source licenses. You can even self-host these platforms, so there is no danger of vendor lock-in. Just use their code and contribute potential changes upstream.

But I have a feeling most of you won’t abandon the existing infrastructure and port everything to Wasm modules. Luckily, there is a way to integrate Wasm modules natively into Docker and thus adopt the technology incrementally.

Image: A straight path with cut grass on the left and high grass on the right, leading towards 3 large trees in the center of the image and a blue sky with clouds in the top half of the image. Finally a paved path to follow!

Docker integration of Wasm

To round off the Wasm-on-the-backend part, let’s talk about Wasm and Docker.

Of course, you could always put a Wasm runtime inside a Docker container, copy a Wasm module inside, and execute it like that. I wouldn’t blame you, it gives you the benefit of being integrated with existing tooling around Docker containers.

But this way, you give up the fast startup times of Wasm and its small memory footprint, and even portability is dampened (Dockerfiles differ by CPU architecture).

A better way to integrate Wasm modules inside Docker is with “Wasm workloads” for Docker Desktop as they are currently called in the official documentation.

You can define a Wasm workload in a Dockerfile by starting with a completely empty image and only copying in the Wasm module to execute.

FROM scratch
COPY --from=build /build/hello_world.wasm /hello_world.wasm
ENTRYPOINT [ "/hello_world.wasm" ]
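Note that COPY --from=build assumes an earlier build stage that produced the Wasm module. A complete multi-stage Dockerfile might look like the following sketch; the base image tag and paths are assumptions you would adjust to your project.

# Build stage: compile the Rust code to a WASI module.
FROM rust:1.79 AS build
WORKDIR /build
COPY . .
RUN rustup target add wasm32-wasi && \
    cargo build --release --target wasm32-wasi && \
    cp target/wasm32-wasi/release/hello_world.wasm hello_world.wasm

# Final stage: an empty image containing only the Wasm module.
FROM scratch
COPY --from=build /build/hello_world.wasm /hello_world.wasm
ENTRYPOINT [ "/hello_world.wasm" ]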

Like any other Docker image, you can push this to an image repository and run it with the usual commands. But when you run it, you have to specify which runtime you want. All the runtimes I mentioned previously are available, and more. Check out the full list here.

For example, to pick wasmtime:

docker run \
  --runtime=io.containerd.wasmtime.v1 \
  --platform=wasi/wasm \
  secondstate/rust-example-hello

In the background, containerd (managed through Docker in this example) will spawn a special container. Inside the container, you will find containerd-wasm-shim instead of the normal containerd-shim acting as a bridge between the container and containerd. This shim then talks directly to wasmtime, which executes the Wasm module specified in the Dockerfile. Here is the diagram from the official announcement.

Image: A diagram that compares a normal container with a Wasm container. The former has a shim, runc and a container process inside. The latter has a wasm-shim, wasmedge, and a wasm module. Source: Introducing the Docker+Wasm Technical Preview, Docker Inc

That’s pretty cool, right? We get the familiar Docker tooling and its well-tested sandboxing properties. But your applications can be really small, easily below 1 MB for simple programs, and startup times can be a few milliseconds.

But be aware that Wasm workloads in Docker are still in beta and not recommended for use in production. I really hope this changes soon, since I believe it would really accelerate Wasm adoption.

Image: The mountain "Peak WebAssembly" in the background with a sparsely populated valley before it. Look, a clear view of Peak WebAssembly!

Conclusion

That’s all for now. What do you think? Do you see the benefit of running your code in a Wasm sandbox? Will it eliminate another class of security bugs for future software, comparable to how Rust eliminates memory corruption bugs? Or is it just a niche tool and a bunch of hype? I’m curious to read where all of you stand on this, please let me know on the Reddit discussion!

Image: A map marking the current position at "Backend Sandbox" and the next stop at "Blockchain Sandbox". Blockchain sandbox next, with a skipped-for-now mystery to the right.

Next post in series: Why WebAssembly came to Web3 (Wasm in the Wild, Part 4)