Schemescape

Development log of a life-long coder

A trivial WebAssembly example

In the last post, I provided an overview of WebAssembly. In this post, I'm going to build and run a complete (but trivial) WebAssembly module in C using LLVM and Clang.

All of the code is here: webassembly-trivial-example.

Aside

It looks like someone else was frustrated with Emscripten in the past, so they wrote a post about WebAssembly without Emscripten. Their guide was helpful, but I'm not sure if it's up to date. I also found their Makefiles to be excessively complex.

Setup

First, download and install LLVM (I used LLVM 12.0.1) from the LLVM GitHub releases page (note: 180 MB download that expands to 1.8 GB installed). The programs I'm actually planning to use are clang and wasm-ld.

I also wanted to inspect the output WebAssembly, so I needed the WebAssembly Binary Toolkit, which was a much more reasonable ~2 MB download (note: the Windows version is a gzipped tarball instead of a zip file). I'm going to use wasm2wat for disassembly.

Some notes on C

C compilation is usually done as follows:

  1. Run the preprocessor (cpp) to expand macros and includes (often on many source files)
  2. Compile the preprocessed code into object files
  3. Link everything into a final binary

A decent overview of the most common command line arguments for a different compiler is here. Many of the options are identical for most C compilers.
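
To make the three stages concrete, here is a small sketch. The file name stages.c, the SQUARE macro, and main.o are all hypothetical and not part of the example repository. The comments show how each stage could be run separately with Clang.

```c
/* stages.c: hypothetical file for illustrating the compilation stages.
 *
 *   1. Preprocess only:  clang -E stages.c -o stages.i
 *   2. Compile only:     clang -c stages.c -o stages.o
 *   3. Link:             clang stages.o main.o -o program
 *                        (main.o is a hypothetical object file providing main)
 */

/* The preprocessor replaces SQUARE(x) textually before compilation. */
#define SQUARE(x) ((x) * (x))

int square_plus_one(int value) {
    /* After step 1 this line reads: return ((value) * (value)) + 1; */
    return SQUARE(value) + 1;
}
```

Looking at the output of the first step is a handy way to debug macros, which comes up again later in this post.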

Implementing a trivial function

I'm going to start with a very simple example (just to reduce the number of things that could go wrong).

Source code

File name: "add.c":

int add(int a, int b) {
    return a + b;
}

Compiling the code

First, I'm just going to compile (but not link) the code, to see what happens.

clang -target wasm32 -Os -c add.c

Since I'm compiling and not linking, this command generates an object file named "add.o".

Disassembling the object file

Run the disassembler:

wasm2wat add.o

And it produces the following surprisingly readable output:

(module
  (type (;0;) (func (param i32 i32) (result i32)))
  (import "env" "__linear_memory" (memory (;0;) 0))
  (func $add (type 0) (param i32 i32) (result i32)
    local.get 1
    local.get 0
    i32.add))

MDN has a great explanation of the WebAssembly text format (".wat" files). The syntax is based on S-expressions (similar to Lisp). Note that block comments are delimited by parentheses and semicolons (e.g. (; comment goes here ;)), as noted in this more detailed look at WebAssembly text format syntax.

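Breaking down the first two lines (the module declaration and the type definition), we have a module and an unnamed type (which can be referenced by index 0) for a function taking two 32-bit integers and returning one 32-bit integer.

Moving on to the import on the third line: the host code would need to pass in a memory buffer named env.__linear_memory (with a minimum size of zero pages).

On to the actual code, which is the $add function on the remaining lines. Looking at the function body, note that local values are referenced by a zero-based index that starts with the function arguments and then continues on to any local variables. Here, local.get 1 pushes b onto the stack, local.get 0 pushes a, and i32.add pops both values and pushes their sum, which becomes the return value.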

It looks like the C code compiled correctly and the output WAT seems reasonable. So far, so good.
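
To double-check my reading of the function body, here is a rough C model of how a stack machine would evaluate those three instructions. The function name simulate_add and the explicit stack array are purely illustrative; a real WebAssembly engine is far more sophisticated than this.

```c
/* Illustrative model of the three instructions in $add.
 * simulate_add and the fixed-size stack are hypothetical.
 * Note: i32.add wraps on overflow; this sketch ignores that detail. */
int simulate_add(int param0, int param1) {
    int locals[2] = { param0, param1 }; /* locals 0 and 1 are the arguments */
    int stack[2];
    int top = 0;                        /* number of values on the stack */

    stack[top++] = locals[1];           /* local.get 1 */
    stack[top++] = locals[0];           /* local.get 0 */

    /* i32.add: pop two operands, push their sum */
    int right = stack[--top];
    int left = stack[--top];
    stack[top++] = left + right;

    return stack[--top];                /* the remaining value is the result */
}
```

Since addition is commutative, the order in which the two arguments are pushed doesn't affect the result.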

Linking

Note that my original Clang command specified -c, so it only compiled the code and never ran the linker. Let's go all the way this time:

clang -target wasm32 -Os -nostdlib -Wl,--no-entry add.c -o add.wasm

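I removed -c and added some new arguments: -nostdlib tells Clang not to link against the C standard library (which I haven't set up for this target), -Wl,--no-entry passes --no-entry through to the linker (wasm-ld) so that it doesn't require an entry point, and -o add.wasm names the output file.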

Disassembling "add.wasm" yields the following:

(module
  (memory (;0;) 2)
  (global $__stack_pointer (mut i32) (i32.const 66560))
  (export "memory" (memory 0)))

My code disappeared! Of course, this isn't surprising because my code has no entry point and doesn't export anything.

Memory and a stack

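Interestingly, this most recent disassembly shows some other changes: the module now defines its own memory (with an initial size of 2 pages) and exports it as "memory" instead of importing it from the host, and there is a new mutable global named $__stack_pointer that is initialized to 66560.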
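
The numbers in that disassembly can be sanity-checked with a bit of arithmetic. WebAssembly memory is measured in 64 KiB pages, so a memory of 2 is 128 KiB. My guess is that the initial stack pointer is the sum of wasm-ld's default global base and default stack size. The values 1024 and 64 KiB below are what I believe those defaults to be; treat them as an assumption rather than something the disassembly confirms.

```c
/* The page size is defined by WebAssembly; the other two constants are
 * assumed wasm-ld defaults, not values taken from the disassembly. */
enum {
    WASM_PAGE_SIZE = 65536,      /* 64 KiB per page */
    ASSUMED_GLOBAL_BASE = 1024,  /* assumption: where static data starts */
    ASSUMED_STACK_SIZE = 65536   /* assumption: default stack size */
};

/* Size in bytes of a memory declared as "(memory N)". */
unsigned long memory_bytes(unsigned long pages) {
    return pages * WASM_PAGE_SIZE;
}

/* With no static data, the stack would sit directly above the global base
 * and grow downward from this address. */
unsigned long assumed_initial_stack_pointer(void) {
    return ASSUMED_GLOBAL_BASE + ASSUMED_STACK_SIZE;
}
```

Under those assumptions, memory_bytes(2) is 131072 and the computed stack pointer matches the 66560 shown in the disassembly.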

I have some questions about this arrangement:

Aside: a WebAssembly critique

As an aside: while trying to find answers to some of my questions, I ran across an incredibly insightful series of posts that retrospectively critique some of WebAssembly's design decisions.

Exports

Back to my trivial experiment.

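How do I tell Clang/LLVM that I want to export a function? Consulting the linker documentation, it looks like I can export everything (not my preferred approach) or specify exports either on the command line or with attributes in the code. In code, the two options appear to be the export_name attribute (i.e. __attribute__((export_name("name"))), which exports a symbol under the given name) and default visibility (i.e. __attribute__((visibility("default"))), which only results in an export when --export-dynamic is passed to the linker).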
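
As a sketch of what I understand the two in-code approaches to look like: the export_name attribute exports a function under an explicit name, while default visibility only results in an export if --export-dynamic is also passed to the linker (e.g. via -Wl,--export-dynamic). The subtract function is a placeholder, and the #ifdef guard is my own addition so the file still compiles for non-WebAssembly targets.

```c
/* Option 1: export under an explicit name.
 * The guard is an addition for portability; __wasm__ is defined by Clang
 * when targeting WebAssembly. */
#ifdef __wasm__
__attribute__((export_name("add")))
#endif
int add(int a, int b) {
    return a + b;
}

/* Option 2: default visibility. As I understand it, this is only exported
 * when the linker is given --export-dynamic. subtract is hypothetical. */
__attribute__((visibility("default")))
int subtract(int a, int b) {
    return a - b;
}
```

Note how the first option requires the name to be written twice, once in the attribute and once in the declaration.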

I kind of wish there was an "always export this symbol by name" option that didn't require duplicating the name. C preprocessor to the rescue!

#define WASM_EXPORT_AS(name) __attribute__((export_name(name)))
#define WASM_EXPORT(symbol) WASM_EXPORT_AS(#symbol) symbol

int WASM_EXPORT(add)(int a, int b) {
    return a + b;
}

Output:

(module
  (type (;0;) (func (param i32 i32) (result i32)))
  (func $add (type 0) (param i32 i32) (result i32)
    local.get 1
    local.get 0
    i32.add)
  (memory (;0;) 2)
  (global $__stack_pointer (mut i32) (i32.const 66560))
  (export "memory" (memory 0))
  (export "add" (func $add)))

This looks like what I want. I've got my function and it's being exported (along with a memory region that I'm not actually using in my code).
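
One detail worth spelling out is how the macro expands: WASM_EXPORT(add) turns into the attribute (with the stringified name) followed by the bare symbol name, so the same pattern works for any function. The sketch below applies it to a second, hypothetical function (multiply) and adds an #ifdef guard of my own so that the code also compiles with a native compiler for testing. Neither is part of the original example.

```c
#ifdef __wasm__
#define WASM_EXPORT_AS(name) __attribute__((export_name(name)))
#else
#define WASM_EXPORT_AS(name) /* no-op for non-WebAssembly targets */
#endif
#define WASM_EXPORT(symbol) WASM_EXPORT_AS(#symbol) symbol

/* For a WebAssembly target, the next line expands to:
 *   int __attribute__((export_name("multiply"))) multiply(int a, int b) */
int WASM_EXPORT(multiply)(int a, int b) {
    return a * b;
}
```

Running only the preprocessor (clang -E) on a file like this is an easy way to confirm the expansion.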

Using the module

Now that I've got my finished module (add.wasm), I need to host it somewhere.

Using the module in Node

Here's an example of loading the module and calling add in Node:

const fs = require('fs');
(async () => {
    const module = await WebAssembly.instantiate(await fs.promises.readFile("./add.wasm"));
    const add = module.instance.exports.add;
    console.log(add(2, 2));
})();

Using the module in a web page

Here's a web page for my trivial example:

<html>
    <body>
        <p>The value of 2 + 2 is <span id="result">?</span></p>

        <script>
            (async () => {
                const module = await WebAssembly.instantiateStreaming(fetch("./add.wasm"));
                const add = module.instance.exports.add;
                document.getElementById("result").innerText = add(2, 2);
            })();
        </script>
    </body>
</html>

Note that fetch doesn't work for pages loaded directly from the file system, so I used a trivial HTTP server for local testing.

To my surprise, everything worked on the first try.

I was also able to confirm that the module's memory was exported (module.instance.exports.memory) and could be read from within my browser's dev tools window. I'm still not clear on why LLVM decided to export the memory by default.
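
To make the exported memory a little less mysterious, here is a hypothetical addition (not part of the example repository) that actually puts something in linear memory. On wasm32, a C pointer is just a 32-bit offset into linear memory, so a host could presumably take the number returned by get_message and use it as an index into a Uint8Array view of the exported memory's buffer. The function names and the message are made up for illustration.

```c
#ifdef __wasm__
#define WASM_EXPORT_AS(name) __attribute__((export_name(name)))
#else
#define WASM_EXPORT_AS(name) /* no-op for non-WebAssembly targets */
#endif

/* Static data lives in the module's linear memory. */
static const char message[] = "hello";

/* Returns the address of the message (an offset into linear memory
 * when compiled for wasm32). */
WASM_EXPORT_AS("get_message")
const char *get_message(void) {
    return message;
}

/* Returns the message length, excluding the terminating NUL. */
WASM_EXPORT_AS("get_message_length")
int get_message_length(void) {
    return (int)(sizeof(message) - 1);
}
```

Returning the length separately saves the host from scanning memory for the terminating zero byte.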

That's it!

The end result of all this was actually pretty simple. Here are some links for reference:

Remembering how to use a C compiler on the command line and deciphering LLVM's export semantics took a bit more time than I would have liked, but I learned a lot about WebAssembly in the process.

Next up, I'll see if I can get the C standard library working.