notes

Bringing the Web up to Speed with WebAssembly

Andreas Haas, Andreas Rossberg, Derek L. Schuff, Ben L. Titzer (Google); Michael Holman (Microsoft); Dan Gohman, Luke Wagner, Alon Zakai (Mozilla); JF Bastien (Apple)

Proc. of the 38th Conference on Programming Language Design and Implementation (PLDI), Barcelona, Spain, Jun. 2017.

[paper], [PDF], [video], [conference]

Semantics

To their knowledge, Wasm is the first industrial-strength language or VM designed with a formal semantics from the start.

§2.1 Basics

§2.2 Linear memory

§2.3 Structured control flow

For example, a switch with fallthrough:

switch (x) {
case 0: ...A...
case 1: ...B... break;
default: ...C...
}

is translated to:

block block block block
  br_table 0 1 2
  end ...A...
  end ...B... br 1
  end ...C...
end

§2.4 Function calls and tables

§2.5 Determinism

§2.6 Omissions and restrictions

These restrictions have since been lifted (see “Multi-Value All the Wasm!”):

§4.1 Validation - typing rules

§5 Binary format

§6 Interoperability

Wasm is an abstraction over hardware, not over a programming language, so it doesn’t provide a built-in object model. Producers map their data types to numbers or the memory and can define an ABI so that modules can interoperate in heterogeneous applications.

§7 Implementation

V8 (Chrome), SpiderMonkey (Firefox), and JavaScriptCode (WebKit) compile modules AOT before instantiation. Chakra (Edge) interprets modules on first execution and later JIT compiles the hottest functions.

The SpiderMonkey fast baseline JIT emits machine code and validates in a single pass with greedy register allocation. The IonMonkey optimizing JIT compiles it in parallel. V8 and SpiderMonkey achieve 5-6x compilation speed improvement with 8 compilation threads.

Wasm can be decoded to SSA form in a single linear pass. This is done by V8 and SpiderMonkey. V8’s TurboFan compiler uses a sea of nodes graph-based IR (like Graal!).

Reference interpreter is written in OCaml.

§7.1 Bounds checks

The memory size changes so infrequently that the JIT can constant fold it and embed it into the machine code, then patch it later if it changes. Deoptimization of the code is not necessary for that.

§7.2 Caching

JavaScript API can store compiled Wasm modules as blob in IndexedDB to later query for cached module. This seems like it should be a built-in, automatic capability in the browser, including expanding this to include the compiled machine code for JavaScript files, since I think that most sites would benefit from this caching. I think this could only be reliably-implemented for resources that include a cache key, such as the ETag HTTP header.

§7.3 Measurements

Other memory safety models:

Wasm, on the other hand, enforces memory safety with simple bounds checks or virtual memory techniques.

Wasm bytecode speed and simplicity of validation informed by JVM and CIL stack machine experience. The description of Wasm verification takes one page of spec, but 150 pages for the JVM.

JVM has a potentially O(n^3) worst-case for the iterative dataflow approach that is a consequence of unrestricted gotos and other idiosyncrasies that had to be fixed by adding stack maps.

JVM, CIL, and Android Dalvik allow irreducible loops and unbalanced locking structures in bytecode and just relegate those constructs to an interpreter.

§9 Future directions