WebAssembly in Action


Friday, November 10, 2017

An Introduction to WebAssembly


The concept behind WebAssembly isn't new and is based on work that was pioneered by Mozilla (asm.js) and Google (Native Client – NaCl and Portable Native Client – PNaCl).

One of WebAssembly's main goals is parity with asm.js so I'll give you a little background on that before we start digging into WebAssembly.

When I first heard of asm.js a few years ago I was excited by the idea: you could write code in C/C++ and have it compiled down into a special form of JavaScript that a browser supporting asm.js could then run at near-native speeds.

You don't typically write asm.js code by hand; it's usually created by a compiler.

The code starts off with the asm pragma statement ("use asm";) and then includes type hints that let JavaScript engines that support asm.js know they can use low-level CPU operations rather than the more expensive JavaScript operations.

For example, a | 0 is used to hint that the variable 'a' is a 32-bit integer. This works because a bitwise OR with zero doesn't change an integer value, so there are no side effects to doing this. As a result, it can be used wherever it's required in the code to hint the type of either the return value or the parameters passed into a function, as in the following example:

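function AsmModule() {
    "use asm";

    // asm.js expects named function declarations that are then exported
    function add(a, b) {
        a = a | 0; // hint: a is a 32-bit integer
        b = b | 0; // hint: b is a 32-bit integer
        return (a + b) | 0; // hint: the return value is a 32-bit integer
    }

    return { add: add };
}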

The nice thing about asm.js is that, even if the browser doesn't support it, the code will still run and you will get identical results since it is still valid JavaScript. The only difference is that it will run slower than it would in a browser that does support asm.js.
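To make the type hints a little more concrete, here is a small sketch that exercises an add function like the one above as ordinary JavaScript. The variable names (asmExports, sum, wrapped, truncated) are mine and purely illustrative. The results should be the same whether or not the engine recognizes the "use asm" pragma:

```javascript
// A module in the asm.js style; it is also plain, valid JavaScript.
function AsmModule() {
    "use asm";

    function add(a, b) {
        a = a | 0; // coerce to a 32-bit integer
        b = b | 0;
        return (a + b) | 0;
    }

    return { add: add };
}

var asmExports = AsmModule();

var sum = asmExports.add(2, 3);               // ordinary integers
var wrapped = asmExports.add(2147483647, 1);  // overflows and wraps like a 32-bit int
var truncated = asmExports.add(3.7, 1.2);     // fractional parts are dropped: 3 + 1

console.log(sum, wrapped, truncated); // 5 -2147483648 4
```

The last two calls show that the | 0 hint does more than annotate: it pins every value to the 32-bit integer range, which is what lets a supporting engine use integer CPU instructions.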

Being able to write code in C/C++ and have it compiled into a special form of JavaScript that is significantly faster than standard JavaScript is pretty cool but asm.js did have some disadvantages:
  • The extra type hints could result in very large asm.js files
  • The JavaScript still needs to be parsed, so a large file could be expensive on lower-end devices like phones
  • asm.js needs to be valid JavaScript so adding new features is complex and would affect the JavaScript language itself as well


WebAssembly

To solve the issues of asm.js, the major browser vendors got together and started working on a W3C standard and an MVP of WebAssembly that is already live in most browsers! The browsers that currently support WebAssembly are:

Firefox, Chrome, Opera, Edge 16, Safari 11, iOS Safari 11, Android browsers 56, Chrome for Android, and Firefox for Android!

It can be turned on in Edge 15 by turning on the browser's Experimental JavaScript Features flag.

WebAssembly, or wasm for short, is intended to be a portable bytecode that will be efficient for browsers to download and load. The bytecode is transmitted in a binary format and, due to how the modules are structured, the browser can compile parts of a module in parallel, speeding things up even further.

Once the binary has been compiled into executable machine code, the module is stateless and as a result can be explicitly cached in IndexedDB or even shared between windows and workers via postMessage calls.
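As a rough illustration of the binary format and of a compiled module being reusable, the sketch below hand-assembles a tiny module that exports a single add function, compiles it once, and creates two independent instances from it. The byte array and variable names are my own for illustration (a compiler would normally produce the bytes), and the synchronous WebAssembly.Module and WebAssembly.Instance constructors are used only to keep the sketch short. In a browser you would more likely use the promise-based WebAssembly.instantiate.

```javascript
// Hand-assembled bytes for a module exporting add(i32, i32) -> i32.
var wasmBytes = new Uint8Array([
    0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,                   // "\0asm" magic, version 1
    0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,             // type section: (i32, i32) -> i32
    0x03, 0x02, 0x01, 0x00,                                           // function section: one function of type 0
    0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,             // export section: "add" = function 0
    0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b  // code: local.get 0, local.get 1, i32.add
]);

// Check the bytes, then compile them once.
var isValid = WebAssembly.validate(wasmBytes);
var compiledModule = new WebAssembly.Module(wasmBytes);

// The compiled module holds no state, so it can back several instances.
var instanceA = new WebAssembly.Instance(compiledModule);
var instanceB = new WebAssembly.Instance(compiledModule);

var resultA = instanceA.exports.add(2, 3);
var resultB = instanceB.exports.add(40, 2);

console.log(isValid, resultA, resultB); // true 5 42
```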

WebAssembly is currently an MVP so not everything is there yet but it will eventually include both a binary notation, that compilers will produce, and a corresponding text notation for display in debuggers or development environments.


Emscripten

For the examples that follow, I'll be using Emscripten to compile C code into a wasm module. You can download Emscripten from the following location: https://kripken.github.io/emscripten-site/docs/getting_started/downloads.html

On Windows it was simply a matter of unzipping the contents and then running a few commands at the command line.

I was getting errors when I first tried running a wasm module and it turned out that the version of Emscripten that came with the zip files wasn't recent enough.

You'll need git on your system so that the emsdk commands can build the latest version of the Emscripten compiler for you. The following command downloaded git for me:

emsdk install git-1.9.4

Once you have git on your system, run the commands indicated on the download page to update, install, and activate Emscripten.

Note: When you run the 'activate latest' command, you will probably want to include the --global flag.

Otherwise, the path variables are only remembered for the current command line window and you'll have to run the emsdk_env.bat file each time you open a command line window:

emsdk activate latest --global



Hello World

For the examples that follow, I simply create a text file named test.c and use Notepad to adjust the text (you don't need an IDE for the examples I give you here).

As with any new programming technology, it's almost a requirement to start off with a hello world program, so let's create a very basic hello world app using C. The following will simply print a string as soon as the module gets loaded:

#include <stdio.h>

int main()
{
    printf("Hello World from C\n");
    return 0;
}

To compile the code into a WebAssembly module, we need to run the following command line:

emcc test.c -s WASM=1 -o hello.html

The -s WASM=1 flag specifies that we want a wasm module as the output.

The nice thing about this tool is that it gives you all the JavaScript 'glue' needed allowing you to play with WebAssembly modules right away. As you get more experienced with the technology, you can customize it.

If you open the generated hello.html file in a browser, you'll see the Hello World from C text displayed in the textbox.
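The text ends up in the textbox because the generated page defines a Module object with a print function, and the JavaScript glue calls that function for each line the C code writes to standard output. The following is only a rough sketch of that idea, not the template's actual code. The outputLines array is a hypothetical stand-in for the page's textbox, and the final call simulates what the glue code would do on our behalf:

```javascript
// Hypothetical stand-in for the textbox on the generated page.
var outputLines = [];

// The glue code looks for hooks like these on the Module object.
var Module = {
    // Called for each line written to stdout (for example, by printf)
    print: function (text) {
        outputLines.push(text);
    },
    // Called for each line written to stderr
    printErr: function (text) {
        outputLines.push('ERROR: ' + text);
    }
};

// Simulate the glue code forwarding the output of printf("Hello World from C\n").
Module.print('Hello World from C');

console.log(outputLines.join('\n'));
```

Customizing the output later on is mostly a matter of supplying your own print function.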


For the rest of the examples, I'm going to adjust the command line to generate a more minimal html template just to make things easier (less to scroll through when we edit the html file).

I've copied the emscripten.h and shell_minimal.html files into folders in the root directory where I'm creating my code files. For each code example, I've created a subfolder for the .c file I create and for the generated files just to keep each example separate which is why you'll see a relative path stepping back a folder in the following examples.

You can find the shell_minimal.html file in the emscripten src folder.

You can find the emscripten.h file in the emscripten system\include\emscripten folder.


Calling into a module from JavaScript

Let's take our code a step further and build a function that you can call from JavaScript.

First, let's add a function to our code called TestFunction that accepts an integer:

#include <stdio.h>
#include "../emscripten/emscripten.h"

int main()
{
    printf("Hello World from C\n");
    return 0;
}

void EMSCRIPTEN_KEEPALIVE TestFunction(int iVal) {
    printf("TestFunction called...value passed in was: %i\n", iVal);
}

The EMSCRIPTEN_KEEPALIVE declaration adds our functions to the exported functions list so that they're seen by the JavaScript code.

Because we have a function we want to call in the WebAssembly module, we don't want the runtime to shut down after the main method finishes executing, so we're also going to include the NO_EXIT_RUNTIME flag.

Run the following to generate the new files:

emcc test.c -s WASM=1 -o hello2.html --shell-file ../html_template/shell_minimal.html -s NO_EXIT_RUNTIME=1

If you open the html file in a browser at this point, you won't see anything other than the Hello World output because the JavaScript doesn't yet have the code to call TestFunction.

Open up the generated hello2.html file using a tool like notepad, scroll down to just before the opening Script tag, and add the following HTML:

<input type="text" id="txtValue" />
<input type="button" value="Pass Value" onclick="PassValue();" />

Scroll further down the html file and add the following just before the closing script tag (after the window.onerror method's code):

function PassValue() {
    Module.ccall('TestFunction', // name of C function
        null, // return type
        ['number'], // argument types
        [ parseInt(document.getElementById("txtValue").value, 10) ]);
}

Save the html file and then open it up in a browser.

If you type a number into the textbox next to the Pass Value button and then press the button you should see that value echoed back into the textbox.
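One thing the PassValue function doesn't handle is text that isn't a number: parseInt returns NaN in that case, and NaN is converted to 0 by the time it reaches the C function. A minimal sketch of one way to guard against that follows. The helper names parseIntegerInput and passValueSafely are hypothetical, and stubModule is a stand-in for the real Module object so the sketch can run on its own:

```javascript
// Returns the parsed integer, or null if the text doesn't start with a number.
function parseIntegerInput(text) {
    var value = parseInt(text, 10);
    return isNaN(value) ? null : value;
}

// Calls TestFunction only when the input parsed cleanly.
function passValueSafely(module, text) {
    var value = parseIntegerInput(text);
    if (value === null) {
        return false; // nothing was passed to the module
    }
    module.ccall('TestFunction', null, ['number'], [value]);
    return true;
}

// Stand-in for the Emscripten Module object, recording what it receives.
var received = [];
var stubModule = {
    ccall: function (name, returnType, argTypes, args) {
        received.push(args[0]);
    }
};

var accepted = passValueSafely(stubModule, '42');  // forwarded to ccall
var rejected = passValueSafely(stubModule, 'abc'); // ignored

console.log(accepted, rejected, received); // true false [ 42 ]
```

In the page itself you would pass Module and the textbox's value instead of the stub and the hard-coded strings.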


In the example above, we used the ccall method which calls the module right away.

Another approach is to use the cwrap method, which returns a JavaScript function that wraps the C function and can then be called multiple times. The JavaScript would look like this:

var fncTestFunction = Module.cwrap('TestFunction', null, ['number']);
fncTestFunction(1); //passing in an integer directly for this test
fncTestFunction(2); //passing in an integer directly for this test
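A simplified way to think about the relationship between the two: cwrap captures the function name and types once and hands back a function that does the equivalent of a ccall each time it is invoked. The sketch below models that idea only. simpleCwrap and fakeModule are hypothetical names of mine, and this is not how Emscripten actually implements cwrap:

```javascript
// Hypothetical stand-in for the Emscripten Module object; only ccall is modeled.
var callLog = [];
var fakeModule = {
    ccall: function (name, returnType, argTypes, args) {
        callLog.push(name + '(' + args.join(', ') + ')');
    }
};

// Simplified model of cwrap: remember the name and types, return a reusable function.
function simpleCwrap(module, name, returnType, argTypes) {
    return function () {
        var args = Array.prototype.slice.call(arguments);
        return module.ccall(name, returnType, argTypes, args);
    };
}

var fncTestFunction = simpleCwrap(fakeModule, 'TestFunction', null, ['number']);
fncTestFunction(1);
fncTestFunction(2);

console.log(callLog); // [ 'TestFunction(1)', 'TestFunction(2)' ]
```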


Calling into JavaScript from a module using macros

Being able to talk to the WebAssembly module from JavaScript is nice, but what if you want to talk to JavaScript from the module?

There are different ways that this can be achieved.

The simplest way is through the emscripten_run_script() function or the EM_ASM() macro, which basically run a snippet of JavaScript much like an eval statement would.

Macros are not my recommended approach for production code especially if you're dealing with user supplied data but they could come in handy if you needed to do some quick debugging.
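To show why user-supplied data is the concern here, the following plain JavaScript sketch builds a script string by splicing input into it and then evaluates the result, which is similar in spirit to assembling a string for an eval-style call. The greet and unsafeGreet functions and the compromised flag are hypothetical names used only for this illustration:

```javascript
var greetings = [];
var compromised = false;

function greet(name) {
    greetings.push('Hello ' + name);
}

// Unsafe: the input becomes part of the script text that gets evaluated.
function unsafeGreet(userInput) {
    eval("greet('" + userInput + "');");
}

// Well-behaved input does what you'd expect.
unsafeGreet('Bob');

// Crafted input closes the string early and smuggles in its own statement.
unsafeGreet("x'); compromised = true; greet('y");

console.log(greetings, compromised);
// [ 'Hello Bob', 'Hello x', 'Hello y' ] true
```

Passing values as real arguments, rather than as part of the script text, avoids this class of problem.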

Note: You need to use single quotes in the macros. Double quotes will cause a syntax error that is not detected by the compiler.

To test out the EM_ASM macro, let's adjust TestFunction to simply call into JavaScript and display an alert:

#include <stdio.h>
#include "../emscripten/emscripten.h"

int main()
{
    printf("Hello World from C\n");
    return 0;
}

void EMSCRIPTEN_KEEPALIVE TestFunction(int iVal) {
    printf("TestFunction called...value passed in was: %i\n", iVal);

    EM_ASM(
        alert('Test call from C to JS');
        throw 'all done';
    );
}

Run the following to generate the new files:

emcc test.c -s WASM=1 -o hello3.html --shell-file ../html_template/shell_minimal.html -s NO_EXIT_RUNTIME=1

Open up the hello3.html file that was generated, scroll down to just before the opening Script tag, and add the following HTML:

<input type="text" id="txtValue" />
<input type="button" value="Pass Value" onclick="PassValue();" />

Scroll further down the html file and add the following just before the closing script tag (after the window.onerror method's code):

function PassValue() {
    Module.ccall('TestFunction', // name of C function
        null, // return type
        ['number'], // argument types
        [ parseInt(document.getElementById("txtValue").value, 10) ]);
}

Save the html file and then open it up in a browser. If you type a number into the textbox next to the Pass Value button and then press the button you will see that value echoed back into the textbox but you'll also see an alert displayed.


Calling into JavaScript from a module using function pointers

In the previous example we used a macro to call into the JavaScript code but that's generally taboo, especially if you have user supplied data, since it uses eval in the background.

The better approach, in my opinion, is to pass a function pointer to the WebAssembly module and have the C code call into that.

In the JavaScript code, you can use Runtime.addFunction to return an integer value that represents a function pointer. You can then pass that integer to the C code which can be used as a function pointer.

When using Runtime.addFunction, there is a backing array where these functions are stored. The array size must be explicitly set, which can be done via a compile-time setting: RESERVED_FUNCTION_POINTERS.
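The fixed-size backing array can be pictured with a small model. The sketch below is a simplified, hypothetical stand-in written for illustration (createFunctionTable is my own name, and the pointer values it returns are not the ones Emscripten would produce). It only shows why the number of reserved slots matters and why removing a function frees a slot for reuse:

```javascript
// Simplified, hypothetical model of a fixed-size function pointer table.
function createFunctionTable(reservedSlots) {
    var slots = new Array(reservedSlots).fill(null);

    return {
        // Store the function in the first free slot and return its "pointer".
        addFunction: function (fn) {
            var index = slots.indexOf(null);
            if (index === -1) {
                throw new Error('No free function pointer slots; reserve more');
            }
            slots[index] = fn;
            return index + 1; // offset so 0 can act as a null pointer in this model
        },
        // Free the slot so that it can be reused.
        removeFunction: function (pointer) {
            slots[pointer - 1] = null;
        },
        // What the C side conceptually does when it calls through the pointer.
        call: function (pointer) {
            return slots[pointer - 1]();
        }
    };
}

var table = createFunctionTable(1); // comparable to reserving 1 function pointer
var firstPointer = table.addFunction(function () { return 'first'; });
var firstResult = table.call(firstPointer);

var overflowed = false;
try {
    table.addFunction(function () { return 'second'; }); // no slot left
} catch (e) {
    overflowed = true;
}

table.removeFunction(firstPointer);
var reusedPointer = table.addFunction(function () { return 'third'; });

console.log(firstResult, overflowed, reusedPointer === firstPointer); // first true true
```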

Let's adjust our code by getting rid of TestFunction and adding in a new function called CallFunctionPointer that simply calls the function pointer that was specified:

#include <stdio.h>
#include "../emscripten/emscripten.h"

int main() {
    printf("Hello World from C\n");
    return 0;
}

void EMSCRIPTEN_KEEPALIVE CallFunctionPointer(void (*f)(void)) {
    f();
}

Run the following to generate the new files and indicate that we will have 1 function pointer:

emcc test.c -s WASM=1 -o hello4.html --shell-file ../html_template/shell_minimal.html -s NO_EXIT_RUNTIME=1 -s RESERVED_FUNCTION_POINTERS=1

Open up the hello4.html file that was generated, scroll down to just before the opening Script tag, and add the following HTML:

<input type="button" value="Test Pointer" onclick="TestPointer();" />

Scroll further down the html file and add the following just before the closing script tag (after the window.onerror method's code):

function TestPointer() {
    // Create an anonymous function that will be called by the C code
    var pointer = Runtime.addFunction(function() { alert('I was called from C!'); });

    // Call the C code passing in the pointer reference
    Module.ccall('CallFunctionPointer', null, ['number'], [pointer]);

    // Remove the function pointer from the array
    Runtime.removeFunction(pointer);
}

Save the html file and then open it up in a browser. If you press the Test Pointer button you will see an alert displayed.


Current Limitations

As with JavaScript, WebAssembly is specified to run in a safe, sandboxed execution environment, which means it will enforce the browser's same-origin and permission policies too.

WebAssembly is currently an MVP, which means there are a number of things still missing. For example, at the moment there is no direct DOM access from a module, which means you will have to call into the JavaScript code if you need to update the UI.
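Underneath the Emscripten helpers, calling into JavaScript from a module comes down to the module importing a JavaScript function. The sketch below hand-assembles a tiny module that imports a function named notify from an import object named env and exports a run function that calls notify(42). The byte array, the names, and the uiUpdates array (a stand-in for a real DOM update) are all my own choices for illustration, and the synchronous constructors are used only to keep it short:

```javascript
// Hand-assembled module: imports env.notify(i32), exports run() which calls notify(42).
var wasmBytes = new Uint8Array([
    0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,              // "\0asm" magic, version 1
    0x01, 0x08, 0x02, 0x60, 0x01, 0x7f, 0x00, 0x60, 0x00, 0x00,  // types: (i32) -> (), () -> ()
    0x02, 0x0e, 0x01, 0x03, 0x65, 0x6e, 0x76,                    // import from module "env"
    0x06, 0x6e, 0x6f, 0x74, 0x69, 0x66, 0x79, 0x00, 0x00,        // field "notify", function of type 0
    0x03, 0x02, 0x01, 0x01,                                      // one local function of type 1
    0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01,        // export "run" = function 1
    0x0a, 0x08, 0x01, 0x06, 0x00, 0x41, 0x2a, 0x10, 0x00, 0x0b   // run: i32.const 42, call 0
]);

// Stand-in for the UI; in a page this is where the DOM would be updated.
var uiUpdates = [];

var importObject = {
    env: {
        notify: function (value) {
            uiUpdates.push('value is ' + value);
        }
    }
};

var compiled = new WebAssembly.Module(wasmBytes);
var instance = new WebAssembly.Instance(compiled, importObject);

instance.exports.run(); // the module calls back into JavaScript

console.log(uiUpdates); // [ 'value is 42' ]
```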


Future Improvements

The browser makers are pushing ahead with this technology (for example, Google's Native Client has been deprecated in favor of this) so I expect to see improvements start showing up.

The following are some of the improvements that browser makers have identified:
  • Faster function calls between JavaScript and WebAssembly. You won't notice this overhead if you're passing a single large task off to the WebAssembly module, but if you have lots of back-and-forth between the module and JavaScript, the overhead apparently becomes noticeable.
  • Faster load times. The browser makers had to make trade-offs between fast load times and optimized code, so there are more improvements that can be made here.
  • Working directly with the DOM
  • Exception handling
  • Better tooling support for things like debugging
  • Garbage collection which will make it easier for some languages like C# to target WebAssembly too



A New Book

I’ve been honored with an opportunity to work with Manning Publications and write a book about WebAssembly.

If you enjoyed this article and would like to know more about WebAssembly, I welcome you to check out my book: WebAssembly in Action
