
Insights into WebAssembly: Compilation Performance and Shared Code Caching in Node.js

Tobias Nießen (Faculty of Computer Science, University of New Brunswick), [email protected]
Michael Dawson (IBM Runtime Technologies, IBM Canada), [email protected]
Panos Patros (Software Engineering, University of Waikato), [email protected]
Kenneth B. Kent (Faculty of Computer Science, University of New Brunswick), [email protected]

ABSTRACT

Alongside JavaScript, V8 and Node.js have become essential components of contemporary web and cloud applications. With the addition of WebAssembly to the web, developers finally have a fast platform for performance-critical code. However, this addition also introduces new challenges to client and server applications. New application architectures, such as serverless computing, require instantaneous performance without long startup times. In this paper, we investigate the performance of WebAssembly compilation in V8 and Node.js, and present the design and implementation of a multi-process shared code cache for Node.js applications. We demonstrate how such a cache can significantly increase application performance, and reduce application startup time, CPU usage, and memory footprint.

CCS CONCEPTS

• Software and its engineering → Software performance; • Computer systems organization → Cloud computing.

KEYWORDS

WebAssembly, code cache, Node.js, V8, JavaScript

ACM Reference Format:
Tobias Nießen, Michael Dawson, Panos Patros, and Kenneth B. Kent. 2020. Insights into WebAssembly: Compilation Performance and Shared Code Caching in Node.js. In Proceedings of 30th Annual International Conference on Computer Science and Software Engineering (CASCON'20). ACM, New York, NY, USA, 10 pages.

1 INTRODUCTION

WebAssembly is a new hardware abstraction that aims to be faster than interpreted languages without sacrificing portability or security. Conceptually, WebAssembly is a virtual machine and binary code format specification for a stack machine with separate, linearly addressable memory. However, unlike many virtual machines for high-level languages, the WebAssembly instruction set is closely related to actual instruction sets of modern processors, since the initial WebAssembly specification does not contain high-level concepts such as objects or garbage collection [8]. Because of the similarity of the WebAssembly instruction set to physical processor instruction sets, many existing "low-level" languages can already be compiled to WebAssembly, including C, C++, and Rust.

WebAssembly also features an interesting combination of security properties. By design, WebAssembly can only interact with its host environment through an application-specific interface. There is no built-in concept of system calls, but they can be implemented through explicitly imported functions. This allows the host to monitor and restrict all interaction between WebAssembly code and the host environment. Another important aspect is the concept of linear memory: each WebAssembly instance can access memory through linear memory, a consecutive virtual address range that always begins at address zero. The host environment needs to translate these virtual addresses into physical addresses on the host system, and ensure that virtual addresses do not exceed the allowed address range. On modern hardware, this can be implemented using the Memory Management Unit (MMU) and hardware memory protection features, leading to minimal overhead while allowing direct memory access to linear memory and preventing access to other memory segments of the host system [8]. Combined, these properties allow running WebAssembly code both with full access to the real system and in a completely isolated sandbox, without any changes to the code itself.

These properties make WebAssembly an attractive platform for performance-critical portable code, especially in web applications. However, WebAssembly makes no inherent assumptions about its host environment, and can be embedded in other contexts such as Node.js, a framework built on top of the JavaScript engine V8. Not only does Node.js share many technological aspects, such as the JavaScript language, with web applications, but its performance and portability goals align well with those of WebAssembly. Sandboxing, platform-independence, and high performance are especially relevant in cloud-based application backends, the primary domain of Node.js.

However, one hindrance remains: because WebAssembly code is tailored towards a conceptual machine, it can either be interpreted on the host machine, or first be compiled into code for the actual host architecture. As we will see below, interpretation leads to inadequate performance. At the same time, compilation of large WebAssembly modules can lead to considerable delays during the application's startup phase.

Node.js is often used in "serverless computing": instead of keeping a number of servers running at all times, the provider allocates resources dynamically with a minimal scope. For example, Function-as-a-Service (FaaS), sometimes referred to as "serverless functions," is a deployment model in which the provider only allocates enough resources for a single function to be executed for each request, and no internal state is preserved between requests [2]. In this scenario, it is crucial for the function to be executed quickly in order to produce the desired response to an incoming request with minimal latency, making it unfeasible to compile large WebAssembly modules on each request.

In this paper, we analyze the performance of V8's WebAssembly compilers. Sections 2 and 3 provide an overview of background and related work. We analyze the performance of code generated by V8 in Section 4. In Section 5, we investigate the performance of V8's WebAssembly compilers themselves, and the performance benefits of caching compiled code. Finally, in Section 6, we present and evaluate the design and implementation of a multi-process shared code cache for WebAssembly code in Node.js applications.
2 BACKGROUND

With the addition of WebAssembly to Node.js, there are three kinds of code that are supported: JavaScript, native addons, and WebAssembly. While JavaScript code and WebAssembly are interpreted and/or compiled by V8 (see Section 4), native addons behave like shared libraries and allow embedding "native" code, e.g., compiled C or C++ code. Unlike WebAssembly, native addons have direct access to the underlying system and its resources (file systems, network interfaces, etc.). While this might be a desired or even required feature for some use cases, it might be a security risk in others [7]. Additionally, native addons usually need to be compiled on the target platform, while WebAssembly is portable and agnostic of the system architecture. Most existing research around WebAssembly focuses on performance comparisons between JavaScript, WebAssembly, and native code.

Haas et al. described the motivation behind WebAssembly, its design goals, code execution, and validation [8]. The authors used the PolyBench/C benchmark [23] to compare the performance of WebAssembly to that of native code and asm.js [12], a subset of JavaScript designed to be used as a compilation target for C code, which could then be executed by a JavaScript runtime. They found that WebAssembly was 33.7% faster on average than asm.js, and that the execution time of WebAssembly was less than 150% of the native execution time for 20 out of 24 benchmarks. It is important to note that V8 only implemented the TurboFan compiler [25] at that time, and did not use the Liftoff compiler [10].

Herrera et al. used the Ostrich benchmark suite [16] to compare the performance of native code, WebAssembly, and JavaScript. The benchmark performs numerical computations that are deemed relevant for scientific computations, such as machine learning. While they also found WebAssembly in Node.js to be slower than native code, WebAssembly consistently outperformed JavaScript in all tested web browsers and Node.js [13, 14].

Malle et al. conducted experiments comparing WebAssembly to JavaScript, asm.js, and native code in the context of artificial intelligence algorithms [18]. The results are in line with the results reported by Haas et al. [8] and Herrera et al. [13, 14], and again show that WebAssembly is faster than JavaScript, but slower than native code. The authors suggest that future additions to WebAssembly, such as SIMD instructions, will likely reduce the difference between WebAssembly and native code.

Hall et al. investigated WebAssembly as a platform for serverless applications. They came to the conclusion that the security properties of WebAssembly allow isolation similar to virtualization via containers, and that, while WebAssembly generally did not outperform native code, containers often took longer to start than the respective WebAssembly implementations [9].

Matsuo et al. suggested using WebAssembly in browser-based volunteer computing. They found WebAssembly outperformed JavaScript for long-running tasks, but the overhead of compiling and optimizing WebAssembly before being able to run it caused performance gains to disappear for tasks with short durations [19].

Jangda et al. exposed performance flaws of WebAssembly implementations in web browsers [15]. They found that, on average, WebAssembly code in the V8-based Chrome browser is 55% slower than native code. While comparing code generated by V8 to native code generated by a C compiler, the authors observed that V8 produces more instructions. This leads to more CPU cycles required to execute the code and more cache misses in the processor's L1 instruction cache. Code generated by V8 also suffers from increased register pressure due to sub-optimal register allocations and the fact that V8 reserves some registers for memory management. Finally, the WebAssembly specification mandates certain safety checks at runtime, which also incur a performance cost. However, despite these problems, the authors also showed that WebAssembly was 54% faster than asm.js in the same browser.

The multitude of publications highlighting the performance benefits of WebAssembly over JavaScript has inspired efforts to simplify the integration of WebAssembly into existing JavaScript applications. For example, Reiser et al. proposed a cross-compilation method from JavaScript to WebAssembly that resulted in significant speedups of computationally intensive algorithms [24].

3 RELATED WORK

The idea of caching compiled code beyond single processes is not new, and has been implemented for other languages.

Bhattacharya et al. discussed improvements for the shared class cache (SCC) used by the J9 Java Virtual Machine. While its primary purpose is to reduce memory usage by sharing required class files, the SCC also contains compiled code, reducing application startup times significantly [3, 11].

Patros et al. invented a mechanism to reuse compilation results for Platform as a Service (PaaS) clouds via Dynamically Compiled Artifact Sharing (DCAS), with a focus on the Java SCC [6, 21].

Park et al. proposed a method to reuse code generated by an optimizing JavaScript just-in-time (JIT) compiler, allowing ahead-of-time (AOT) compilation based on previous compilations of the same code. Their benchmarks demonstrated significant speedups in JavaScript application performance [20]. The authors also highlighted the need for such technologies due to the increasing code size of web applications.

Haas et al. discuss two ways to improve compilation and startup times for WebAssembly in web browsers [8]. According to their research, parallel compilation using multiple threads leads to compilation times that are 80% lower than those of single-threaded compilation. V8 already implements parallel compilation by assigning individual WebAssembly functions to separate compilation threads. The authors also suggest that developers cache compiled WebAssembly modules in browsers using client-side IndexedDB databases [1]. However, IndexedDB is not available in Node.js, and as of 2020, support for WebAssembly modules in IndexedDB databases has been removed from V8, making it impossible for developers to explicitly cache compiled WebAssembly modules. Instead, web browsers are encouraged to implement implicit code caching as part of streaming WebAssembly compilation [4], which is not available in Node.js.

4 COMPILATION AT RUNTIME

The compiler infrastructure within V8 has changed significantly in the last few years. Even for the relatively new WebAssembly language, V8 implements a complex combination of compilation procedures for WebAssembly. The components are a WebAssembly interpreter, the baseline compiler Liftoff [10], and the optimizing compiler TurboFan, which V8 also uses to compile JavaScript [25, 26]. Since JavaScript code itself does not contain static type information, it is difficult to compile it directly [20]. Due to this difficulty, V8 begins JavaScript execution using the Ignition interpreter, and only when the interpreter has identified "hot" code sections is the TurboFan compiler used to optimize and compile these JavaScript functions, using type information gathered by Ignition. WebAssembly, on the other hand, is not a high-level language and is not dynamically typed, and it is, therefore, not necessary to collect dynamic type information before compiling WebAssembly code [10].

The TurboFan compiler optimizes and compiles WebAssembly through a complicated pipeline that first decodes WebAssembly function bodies and constructs graph representations. These graph representations are in Static Single Assignment (SSA) form and use the "Sea of Nodes" concept introduced by Click in his dissertation [5]. TurboFan then applies optimizations to the SSA, selects appropriate instructions for the target architecture, allocates CPU registers, and finally generates code.

The Liftoff compiler, on the other hand, was designed to be fast at the cost of generating less optimized code. Even though it is newer than the TurboFan compiler, it is not meant as a replacement, but as the initial compilation stage to quickly produce a usable module. Like TurboFan, Liftoff begins by decoding WebAssembly function bodies, but then immediately begins code generation in a single pass, without constructing an SSA graph representation or optimizing the code [10].

4.1 WebAssembly JavaScript Interface

From an application developer's perspective, JavaScript applications can compile WebAssembly modules in two ways. The first is to call the constructor of the WebAssembly.Module class, which will synchronously compile the code, meaning that it will block the calling thread for the duration of the compilation. The second is the asynchronous function WebAssembly.compile, which will not block the calling thread.

By default, WebAssembly.Module uses Liftoff to compile the code, which is the faster compiler, and thus causes the smallest delay in the calling thread. V8 compiles the same module again, in a set of background threads, using the optimizing TurboFan compiler. When the optimized compilation result for a WebAssembly function is ready, the next invocation of the function uses the code produced by the TurboFan compiler instead of the output of Liftoff. This process is called "tiering up", and is a tradeoff between startup time and code generation quality [10].

WebAssembly.compile, on the other hand, is an asynchronous function and, therefore, not as concerned with blocking the calling thread. Its default behavior is to use the TurboFan compiler, skipping the baseline compilation step. This causes the compilation to generally take longer than synchronous compilation would, but produces the optimized result directly.
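For illustration, the following minimal Node.js snippet exercises both compilation paths described above; the file name benchmark.wasm is only a placeholder for an arbitrary WebAssembly module:

const fs = require('fs');

// Raw module bytes ("wire bytes"); the file name is a placeholder.
const wireBytes = fs.readFileSync('benchmark.wasm');

// Synchronous path: blocks the calling thread while V8 (by default) runs Liftoff
// and tiers up to TurboFan in background threads.
const syncModule = new WebAssembly.Module(wireBytes);

// Asynchronous path: does not block the calling thread; by default, V8 produces
// the optimized TurboFan result directly.
WebAssembly.compile(wireBytes).then((asyncModule) => {
  // Both objects can now be instantiated with WebAssembly.Instance.
  console.log(syncModule instanceof WebAssembly.Module,
              asyncModule instanceof WebAssembly.Module);
});
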
4.2 Performance of generated code

In order to compare the performance of code generated by Liftoff to code generated by TurboFan, we compiled the PolyBench/C 4.2 benchmarks [23] to WebAssembly with compiler optimization and Link Time Optimization (LTO) enabled. These benchmarks are scientific computing kernels and were already used by Haas et al. [8] and Jangda et al. [15] to compare the performance of WebAssembly to the performance of native code execution. Instead, we use the benchmarks to compare the performance of code generated by Liftoff to the performance of code generated by TurboFan.

We conducted all experiments on Ubuntu 19.04 running on an Intel® Core™ i7-8700 processor (base frequency 3.20 GHz, turbo frequency 4.60 GHz, 6 cores, 12 threads) with 32 GB of memory (2666 MHz). We used Node.js v14.2.0, the most recent Node.js version at the time of writing, which is based on V8 version 8.1.307.31-node.33.

We compiled and ran each of the 30 PolyBench/C benchmarks one hundred times with only Liftoff enabled, and another one hundred times with only TurboFan enabled. We measured the time it took for the benchmarks to complete, which does not include their respective compilation times. Figure 1 shows the average speedup of the code generated by TurboFan with respect to the code generated by Liftoff for each benchmark, with error bars indicating the standard deviation. All benchmarks were faster when compiled with TurboFan, the average speedup across all benchmarks is 2.0, and the maximum speedup is 3.2.
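One way to obtain such single-compiler configurations is to pass V8 flags to Node.js. The flag names shown below (--liftoff, --no-liftoff, --no-wasm-tier-up) are the ones documented for V8 versions of this era [10]; they are an assumption for illustration and not necessarily the exact invocation used for these measurements:

# Baseline compiler only (Liftoff, no tier-up to TurboFan):
node --liftoff --no-wasm-tier-up benchmark.js

# Optimizing compiler only (TurboFan):
node --no-liftoff benchmark.js

Here, benchmark.js stands for a script that compiles and runs one of the PolyBench/C modules.
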
We also ran all benchmarks using V8's WebAssembly interpreter. On average, the PolyBench/C benchmarks were 247 times slower when interpreted than when compiled using TurboFan, and 115 times slower than when compiled using Liftoff. Sixteen of the 30 benchmarks were at least 200 times faster when compiled with TurboFan than when interpreted. While interpretation allows running WebAssembly code without prior compilation, its code execution is too slow for use in real applications.

We can conclude that the optimized code generated by TurboFan is indeed significantly faster than code generated by the baseline compiler Liftoff, and that the code produced by both compilers is much faster than V8's WebAssembly interpreter. However, PolyBench/C is not a good basis for testing the performance of the compilers themselves, since the benchmarks are small and the WebAssembly modules are structurally very similar. In the following, we compare compilation times as well as CPU and memory footprint of both compilers.


Figure 1: Speedup of code generated by TurboFan with respect to Liftoff

5 CODE CACHING

Due to portability and security concerns, WebAssembly was designed to be compiled to the target architecture's instruction set at runtime. However, when running code from a trusted source on a single architecture, or untrusted code within a container or sandbox, these concerns become less relevant. Especially in scenarios where a Node.js application is expected to be initialized quickly, for example, when used as a command-line tool, as a desktop application, or in serverless computing, performance might be a more crucial factor. Here, using WebAssembly modules by first compiling them can cause visible delays.

5.1 Code extraction and insertion

Prior to designing a shared code cache, we need to find a way to efficiently retrieve compiled code from V8, and later inject the same code in a different V8 process.

While current versions of V8 provide such features for streaming WebAssembly compilation, no usable interface exists for Node.js, which only supports non-streaming WebAssembly compilation. However, V8 has internal functions that allow serializing compiled WebAssembly modules into byte sequences, and deserializing byte sequences into compiled WebAssembly modules. We developed an add-on for Node.js that exposes these internal V8 features to Node.js applications: serialize returns a JavaScript ArrayBuffer based on a given WebAssembly.Module, and deserialize creates a WebAssembly.Module based on the WebAssembly module bytes (referred to as "wire bytes" within V8) and the byte sequence generated by the serialize function.

This pair of functions is sufficient to extract code from a compiled module, store it in a cache entry, and later use the cache entry to obtain a usable module. This data flow is depicted in Figure 2.

Figure 2: WebAssembly cache data flow: cache entry creation (left) and cache entry retrieval (right)
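The following sketch illustrates how an application can use this pair of functions; the path to the add-on and the file name module.wasm are placeholders, and the binding names follow the description above:

const fs = require('fs');
// Hypothetical path to the native add-on described in this section.
const { serialize, deserialize } = require('./build/Release/wasm_cache_addon');

const wireBytes = fs.readFileSync('module.wasm');      // placeholder file name

// Cache entry creation: compile, then extract the generated code.
const wasmModule = new WebAssembly.Module(wireBytes);
const cacheEntry = serialize(wasmModule);              // ArrayBuffer with compiled code

// Cache entry retrieval: rebuild a usable module from wire bytes + cached code.
const restored = deserialize(wireBytes, cacheEntry);
console.log(restored instanceof WebAssembly.Module);   // true
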
V8 allows selectively disabling Liftoff and TurboFan. If a process is started with only Liftoff enabled, V8 prevents inserting code generated by TurboFan (and vice versa). A proper cache lookup therefore requires knowledge about the current process's V8 configuration. To achieve this, our Node.js add-on allows applications to check relevant V8 flags.

In order to create realistic benchmarks, we extracted 115 WebAssembly modules from existing JavaScript applications, with module sizes ranging from as little as 1068 bytes to 37.3 MiB. It would be difficult to run the code represented by the WebAssembly modules in the way intended by their creators, since each module performs application-specific tasks and has certain requirements towards its host environment. However, our experiment is focused on compiling WebAssembly modules, which does not require running the compiled modules.

The approach Park et al. [20] used to cache compiled JavaScript code used cache entries that were much larger than the original JavaScript files. Similarly, we observe that serialized compiled WebAssembly modules are often considerably larger than the original WebAssembly files. Figure 3 shows the ratio of the serialized size to the original size based on the WebAssembly modules we extracted from existing applications, depending on which compiler was used by V8. In the case of JavaScript, better performance was achieved by caching optimized code in addition to intermediate bytecode, effectively increasing the size of cache entries [20]. For WebAssembly, it is sufficient to store the optimized compilation result (left side), which is significantly smaller than the result of the non-optimizing compiler Liftoff (right side), but still up to five times larger than the original WebAssembly module.

Figure 3: Ratio of serialized compilation result size to WebAssembly module size

5.2 Compiler performance

Most importantly, we need to compare the performance of both compilers to the previously mentioned deserialization method. The hypothesis is that deserializing cached code uses fewer resources than compiling a WebAssembly module.

In order to test the hypothesis, we measured the real elapsed time it takes to obtain a compiled, usable module from the WebAssembly module bytes ("wire bytes"). To achieve comparable results, we disabled tiering up (see Section 4.1), only enabled one compiler at a time, and used the synchronous WebAssembly.Module constructor for Liftoff and TurboFan. Additionally, we took the same measurements for the previously described deserialization method. In this case, the module had already been compiled and optimized by the TurboFan compiler, and the serialized compiled module is available in memory (in addition to the wire bytes).

Each measurement is taken in a separate process. For each such process, we also record the CPU time of the process, that is, the total duration that threads of the process were scheduled on any of the CPU cores in user or system mode, and the peak physical memory usage through the VmHWM statistic provided by the Linux kernel. These metrics are equally if not more important than the elapsed real time required to compile a module. Even under the simplified assumption that CPU time and memory are the only resource constraints of a process, both of these resources are finite, and must be shared among all processes on the same system. While a high CPU time to real elapsed time ratio is an indication of well-designed parallelism, it also means that a few concurrent instances of the same process might already use all available CPU time, and any additional instances could cause the performance of all processes to degrade. For cloud applications, it is realistic to assume that more than one process will be active on the same hardware at a time.

We recorded each measurement for each of the 115 WebAssembly modules 100 times. The mean values of elapsed real time, CPU time, and memory usage for each module are depicted in Figures 4, 5, and 6, respectively.
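For reference, the sketch below shows how these three metrics can be collected from within a Node.js process on Linux; it is a simplified stand-in for the actual measurement harness, which additionally controls V8 flags and runs every measurement in a fresh process:

const fs = require('fs');

const wireBytes = fs.readFileSync(process.argv[2]);     // path to a WebAssembly module

const start = process.hrtime.bigint();
new WebAssembly.Module(wireBytes);                      // synchronous compilation
const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;

const cpu = process.cpuUsage();                         // user and system CPU time, microseconds
const vmHwm = fs.readFileSync('/proc/self/status', 'utf8')
  .split('\n')
  .find((line) => line.startsWith('VmHWM'));            // peak physical memory usage (Linux)

console.log({ elapsedMs, cpuMs: (cpu.user + cpu.system) / 1e3, vmHwm });
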
As shown in Figure 4, all three methods generally take longer for larger modules than for smaller ones. For legibility, Figures 4, 5, and 6 do not include error bars. Instead, Figures 7, 8, and 9 show the significance of improvements. For two variables with mean values μ_1, μ_2 and standard deviations σ_1, σ_2, we define the significance of the change as (μ_1 − μ_2) / (σ_1 + σ_2). By convention, we call the difference statistically significant if the significance is at least one.

By this definition, Liftoff was significantly faster than TurboFan for 111 modules (96.5%), used significantly less CPU time for 98 modules (85.2%), and had a significantly smaller memory footprint for 110 modules (95.7%).

When comparing deserialization to compilation using TurboFan, we measured statistically significant compilation time improvements for 112 modules (97.4%), CPU time improvements for 102 modules (88.7%), and memory usage improvements for 101 modules (87.8%). For 111 modules (96.5%), the speedup was at least 2, and for 77 modules (67.0%), the speedup was at least 20. Similarly, for 98 modules (85.2%), the CPU time was reduced by at least 50%, and for 54 modules (47.0%), it was reduced by at least 90%.

Compared to compilation using Liftoff, we observed significant compilation time improvements for 99 modules (86.1%), significant CPU time improvements for 77 modules (67.0%), and significant memory usage improvements for 41 modules (35.7%). For 69 modules (60.0%), the speedup was at least 2. The CPU time was reduced by at least 50% for 64 modules (55.7%).

The only statistically significant regression is an increase in memory usage for 39 modules (33.9%) when compared to Liftoff, and for 6 modules (5.2%) when compared to TurboFan. In these cases, however, the difference is small (less than 20%, see Figure 6).

Since the deserializer is synchronous, it is consistent to compare it to synchronous WebAssembly compilation. However, we also repeated this experiment with asynchronous WebAssembly compilation, and found that asynchronous compilation was significantly slower than synchronous compilation for 85 modules (73.9%) in the case of TurboFan, and for 91 modules (79.1%) in the case of Liftoff. For both compilers, this results in even larger differences when compared to deserialization, and we therefore decided not to present these results in detail.

We also repeated the experiment with deserialization of code generated by Liftoff instead of optimized code produced by TurboFan, which leads to a larger serialized format (see Section 5.1). We found that it also causes longer deserialization times, and no significant improvements of real elapsed time, CPU time, or memory usage.

Figure 4: Compilation times by approach (elapsed real time [ms] vs. WebAssembly module size [bytes], for TurboFan, Liftoff, and deserialization)

Figure 5: Compilation CPU times by approach (CPU time [ms] vs. WebAssembly module size [bytes], for TurboFan, Liftoff, and deserialization)

Figure 6: Compilation memory usage by approach (memory usage [bytes] vs. WebAssembly module size [bytes], for TurboFan, Liftoff, and deserialization)

Figure 7: Significance of compilation time improvements (difference divided by σ_1 + σ_2, vs. WebAssembly module size [bytes], for improvements over TurboFan and over Liftoff)

Figure 8: Significance of compilation CPU time improvements (difference divided by σ_1 + σ_2, vs. WebAssembly module size [bytes], for improvements over TurboFan and over Liftoff)

Figure 9: Significance of memory usage improvements (difference divided by σ_1 + σ_2, vs. WebAssembly module size [bytes], for improvements over TurboFan and over Liftoff)

5.3 Module identification

The client needs to be able to identify each WebAssembly module in order to look it up in a cache. Since the JavaScript WebAssembly API is agnostic to the source of the WebAssembly module code, a module can only be identified by its code, and no file path or URL is available. Traditional information-theoretic algorithms, such as CRC32C, and cryptographic hash functions, such as SHA-1, provide reliable and, in the case of hash functions, collision-resistant identification methods. In scenarios where a hash collision could result in a security problem, cryptographic hash functions must be used for identification. However, these functions generally run in Θ(n), which is undesirable for large WebAssembly modules in situations where a collision would not result in a security problem, e.g., because access to the cache is restricted to a single client. For such cases, we define the function family fp_r for r ≥ 2 as follows: Let b_0, ..., b_{n−1} be the input byte sequence, with b_i ∈ {0, ..., 2^8 − 1} for 0 ≤ i < n. Let p be a linear congruential generator with p(0) = b_{⌊n/2⌋} and p(i + 1) = (a · p(i) + c) mod 2^32. We chose a = 1664525 and c = 1013904223 as suggested by Knuth [17]. Let the result be the r-byte vector f_0, ..., f_{r−1} with f_0 = ⌊n/2^8⌋ mod 2^8, f_1 = n mod 2^8, and f_{2+i} = b_{p(i+1) mod n}. In other words, the result consists of the length n modulo 2^16 and the module bytes at r − 2 pseudo-random locations. Each such function fp_r only requires 32-bit integer arithmetic, can be implemented efficiently in JavaScript, and runs in O(1). While the resulting "fingerprint" is neither unique nor collision-resistant, it is sufficiently unlikely to collide with another module's fingerprint. Figure 10 shows a performance comparison between fp16, CRC32C, and SHA-1 in Node.js, based on running each function 100,000 times on each of the 115 WebAssembly modules. With a constant runtime of 4.0 μs, fp16 is significantly faster than other identification methods.

Figure 10: Performance comparison between fp16, CRC32C, and SHA-1 in Node.js (time [ns] vs. WebAssembly module size [bytes])
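A direct JavaScript transcription of this definition is shown below as a sketch; the code in our add-on may differ in detail:

// Compute fp_r over a Uint8Array (or Buffer) of module bytes.
function fp(bytes, r = 16) {
  const n = bytes.length;
  const a = 1664525, c = 1013904223;         // constants suggested by Knuth [17]
  const result = new Uint8Array(r);
  result[0] = Math.floor(n / 256) % 256;     // f_0: high byte of (n mod 2^16)
  result[1] = n % 256;                       // f_1: low byte of (n mod 2^16)
  let p = bytes[Math.floor(n / 2)];          // p(0) = b_{⌊n/2⌋}
  for (let i = 0; i < r - 2; i++) {
    p = (Math.imul(a, p) + c) >>> 0;         // p(i+1) = (a * p(i) + c) mod 2^32
    result[2 + i] = bytes[p % n];            // f_{2+i}: byte at a pseudo-random offset
  }
  return result;
}

// Example: const id = fp(new Uint8Array(wireBytes));
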
6 SHARED CODE CACHE

With the previously discussed cache creation and retrieval method from Section 5.1 and the performance benefits of code caching presented in Section 5.2, it is viable to construct a disk-based cache that can be populated and used by individual processes. However, this approach is troublesome for large application clusters: First, write access to the code cache should be controlled strictly to prevent malicious code injections, and it might be undesirable for all processes that use WebAssembly to have permission to write compiled code to the cache. Second, if a process uses Liftoff or tiering up to improve its own startup time (see Sections 4 and 5.2), it might not insert optimized code into the cache, but instead the output of the baseline compiler. Third, a disk-based cache might reduce expected speedups due to disk access times associated with potentially large cache entries (see Section 5.1), which each new process might copy from disk into memory.

6.1 Design and Implementation

To circumvent these problems, we designed and implemented a novel approach to share compiled WebAssembly code between Node.js processes. In the following, we will refer to the processes of one or more Node.js applications as client processes. In this context, a V8 configuration is a set of flags that affect V8's internal behavior. The cache implementation prevents loading incompatible cache entries, and potentially maintains multiple cache entries for the same WebAssembly module, but for different V8 configurations. For example, Figure 11 includes a matrix of four configurations Lx, Lt, Tx, and Tt, where the first letter indicates which is the fastest enabled compiler (Liftoff or TurboFan), and the second indicates whether the compiler is used exclusively, or if tiering up is enabled.

Figure 11: Shared cache server architecture (server with message queue mq, shared memory segment wasm, and a matrix of compiler configurations Lx, Lt, Tx, and Tt)

Client processes compile WebAssembly modules through a modified WebAssembly JavaScript Interface, which is compatible with the one described in Section 4.1. The compilation procedure, given a module represented by bytes b_0, ..., b_{n−1}, computes the module identifier fp16(b_0, ..., b_{n−1}), and attempts to locate a cache entry in a shared memory segment based on the computed module identifier and the current V8 configuration. If such an entry exists, the compilation procedure deserializes the cache entry to obtain a WebAssembly module instance without having to compile the module bytes. If no such entry exists, the client process copies b_0, ..., b_{n−1} into a new shared memory segment (wasm in Figure 11), and sends a pointer to the new shared memory segment and the current V8 configuration to an existing message queue (mq in Figure 11). Finally, the client process falls back to V8's original compilation procedure, which, depending on the current configuration, uses Liftoff and/or TurboFan to compile the code.
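The client-side procedure can be summarized by the following sketch. The helpers lookupSharedCache, createSharedSegment, and postToMessageQueue stand for the shared-memory and message-queue bindings of the add-on and are hypothetical names, not standard Node.js APIs; fp and deserialize refer to the functions introduced in Sections 5.3 and 5.1:

function compileWithSharedCache(wireBytes, v8Config) {
  const id = fp(wireBytes);                          // module identifier fp16(b_0, ..., b_{n-1})
  const entry = lookupSharedCache(id, v8Config);     // shared memory lookup (hypothetical binding)
  if (entry !== null) {
    // Cache hit: rebuild the module from the serialized, optimized code.
    return deserialize(wireBytes, entry);
  }
  // Cache miss: hand the module bytes to the server via shared memory and the
  // message queue, then fall back to V8's regular compilation procedure.
  const segment = createSharedSegment(wireBytes);    // "wasm" segment in Figure 11
  postToMessageQueue(segment, v8Config);             // "mq" in Figure 11
  return new WebAssembly.Module(wireBytes);
}
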

A separate server process is responsible for creating the message queue mq. Upon receiving a pointer to a shared memory segment wasm along with a valid V8 configuration, the server process starts a new compiler process with parameters matching the received V8 configuration. The compiler process loads the module bytes b_0, ..., b_{n−1} from the shared memory segment, unlinks the segment, and computes the module's fingerprint fp16(b_0, ..., b_{n−1}). After ensuring that no other compiler process is already compiling the same module with the same V8 configuration, the process compiles the module. If TurboFan has not been disabled in the given V8 configuration, the compiler process uses it to produce optimized code. Only if TurboFan has been disabled does the compiler process use Liftoff, and therefore generate unoptimized code. Once compilation finishes, the compiler process serializes the compiled WebAssembly module, and writes the result to shared memory.

The modified WebAssembly JavaScript Interface was implemented in JavaScript, except for the deserialization logic, which was implemented in C++ due to the necessity to access internal V8 features, and communication with the message queue, which was implemented in C++ and uses POSIX functions. The server process code is written in C using POSIX functions to access the message queue. The compiler processes execute JavaScript code, and serialization logic written in C++. The actual compilation procedures are an existing part of V8.

6.2 Evaluation

To evaluate our design and implementation, we consider two cases:

Cache miss: When a client process fails to locate a compiled WebAssembly module in the shared cache, it not only needs to compile the module itself, but also suffers from two additional performance impairments. First, the client process needs to copy the WebAssembly module bytes to a shared memory segment, and notify the server process about the cache miss. Depending on the module size, this can cause a short delay before compilation begins. Second, while the client process compiles the module itself, the server process will spawn a compiler process, which also compiles the module, effectively increasing the system load, and potentially increasing compilation times.

Cache hit: Upon successfully locating a compiled WebAssembly module in the shared cache, the client process benefits from two performance aspects. First, it does not need to compile the module, which, on average, improves the time until the module is available, and likely reduces CPU load and memory footprint (see Section 5.2). According to the model proposed by Patros et al. [22], this also reduces performance interference on co-located cloud tenants. Second, if not forbidden by the process's V8 configuration, the obtained compiled code is already optimized, which would not be the case with V8's default tiering-up behavior, or when using Liftoff. As we have seen in Section 4.2, this can lead to improved execution times.

While we already know the impact of deserialization as compared to compilation based on Section 5.2, we used PolyBench/C (see Section 4.2) to create a set of artificial Node.js applications to evaluate the performance impact of the shared cache. We measured the real elapsed time it takes for each application to compile and then execute its associated PolyBench/C benchmark. For this experiment, we used the default V8 configuration, which enables both Liftoff and TurboFan and uses tiering up (see Section 4), and ran each application 45 times.

Figure 12 shows the mean execution times for cache misses and cache hits, with error bars corresponding to the standard deviation. Since execution times between benchmarks vary tremendously, all execution times were divided by the same measurement taken without a cache in place. Similarly, the ratio between execution time and compilation time varies greatly; therefore, we do not display compilation and execution times in a stacked manner.

As expected, we see a performance regression for cache misses. It is worth noting that the benchmarks with the largest (by percentage) performance regressions, such as jacobi-1d, are particularly short-running, which means that the delay caused by copying the module bytes to shared memory has a larger (by percentage) impact on the total elapsed time.

We also observe performance improvements for almost all benchmarks when a cache entry is found. The average speedup is 1.8, and the maximum speedup is 3.0. We expect that different Node.js applications would see vastly different performance benefits, depending on the WebAssembly modules in use.

Finally, most operating systems allow protecting shared memory segments from unintended write access. It appears that such measures allow controlling read and write access to the shared code cache sufficiently to prevent malicious code injections, for example, by only giving write access to the compiler processes, and not to client processes.

7 FUTURE WORK

A future direction for a shared code cache could be an extension to a disk-based cache. While the system kernel might keep frequently accessed cache entries in memory up to a certain size, large cache entries might still have to be loaded from the disk, and could negatively affect the cache performance. A balanced strategy might be to only move modules from shared memory to disk when most of the available memory is in use, and to prioritize frequently accessed modules in memory.

While our shared cache implementation prevents duplicate compilation on the server side, it does not prevent duplicate work among client processes. The primary reason is that client processes benefit from the shorter compilation times of tiering up, whereas the server process is focused on producing optimized code at the cost of longer compilation times. A future implementation could reduce the amount of duplicate work between processes further.

It might also be worth considering data compression for cache entries. Park et al. compressed cache entries in their JavaScript code cache implementation, and successfully reduced the size of the code cache with only minimal performance sacrifices [20]. However, as long as cache entries are stored in shared memory, decompression would require copying the decompressed data to a new memory area on each invocation, which makes it unlikely to result in large performance improvements. A disk-based cache solution could potentially benefit from compression to reduce cache entry sizes and therefore disk access times.

8 CONCLUSION

As we have seen in Section 5.2, compiling WebAssembly modules at runtime can lead to a delay of multiple seconds during an application's startup phase, and can require vast amounts of CPU time and physical memory. While Liftoff is much faster than the optimizing compiler TurboFan, its generated code is significantly slower than the code produced by TurboFan, but still much faster than interpreting WebAssembly code without compiling it first.

We reduced module load times by caching compiled and optimized code for the target architecture, and observed large performance benefits for many WebAssembly modules. Finally, in Section 6, we extended the idea to a scalable multi-process shared code cache, which provides an efficient way to load WebAssembly modules in Node.js applications, without having to compile and optimize each module in each process. The smaller CPU and memory footprint can reduce interference on co-located cloud tenants, and, therefore, improve scalability [22].

While WebAssembly is still an emerging technology, we expect growing adoption over the next few years. The performance improvements and reduced startup times presented in this paper might allow widespread use of WebAssembly in serverless computing and other cloud configurations, without sacrificing the speed, portability, and security of WebAssembly.


Figure 12: Shared cache performance impact on PolyBench/C benchmarks

9 ACKNOWLEDGMENTS

This research was conducted within the Centre for Advanced Studies — Atlantic, Faculty of Computer Science, University of New Brunswick. The authors are grateful for the colleagues and facilities of CAS Atlantic in supporting our research. The authors would like to acknowledge the funding support of the Natural Sciences and Engineering Research Council of Canada (NSERC), 501197-16. Furthermore, we would also like to thank the New Brunswick Innovation Foundation for contributing to this project.

REFERENCES

[1] 2018. Indexed Database API 2.0. https://www.w3.org/TR/IndexedDB/
[2] Ioana Baldini, Paul Castro, Kerry Chang, Perry Cheng, Stephen Fink, Vatche Ishakian, Nick Mitchell, Vinod Muthusamy, Rodric Rabbah, Aleksander Slominski, and Philippe Suter. 2017. Serverless Computing: Current Trends and Open Problems. In Research Advances in Cloud Computing, Sanjay Chaudhary, Gaurav Somani, and Rajkumar Buyya (Eds.). Springer Singapore, Singapore, 1–20.
[3] Devarghya Bhattacharya, Kenneth B. Kent, Eric Aubanel, Daniel Heidinga, Peter Shipton, and Aleksandar Micic. 2017. Improving the performance of JVM startup using the shared class cache. In 2017 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM). IEEE, Victoria, BC.
[4] Bill Budge. 2019. Code caching for WebAssembly developers. https://v8.dev/blog/wasm-code-caching
[5] Cliff Click and Keith D. Cooper. 1995. Combining analyses, combining optimizations. ACM Transactions on Programming Languages and Systems (TOPLAS) 17, 2 (March 1995), 181–196.
[6] Michael H. Dawson, Dayal D. Dilli, Kenneth B. Kent, Panagiotis Patros, and Peter D. Shipton. 2019. Dynamically compiled artifact sharing on PaaS clouds. Patent US 10,338,899 B2.
[7] Node.js Foundation. [n.d.]. C++ Addons. https://nodejs.org/api/addons.html
[8] Andreas Haas, Andreas Rossberg, Derek L. Schuff, Ben L. Titzer, Michael Holman, Dan Gohman, Luke Wagner, Alon Zakai, and JF Bastien. 2017. Bringing the web up to speed with WebAssembly. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation - PLDI 2017. ACM Press, Barcelona, Spain, 185–200.
[9] Adam Hall and Umakishore Ramachandran. 2019. An execution model for serverless functions at the edge. In Proceedings of the International Conference on Internet of Things Design and Implementation - IoTDI '19. ACM Press, Montreal, Quebec, Canada, 225–236.
[10] Clemens Hammacher. 2018. Liftoff: a new baseline compiler for WebAssembly in V8. https://v8.dev/blog/liftoff
[11] Daniel Heidinga, Peter D. Shipton, Aleksandar Micic, Devarghya Bhattacharya, and Kenneth B. Kent. 2020. Enhancing Virtual Machine Performance Using Autonomics. Patent US 10,606,629 B2.
[12] David Herman, Luke Wagner, and Alon Zakai. 2014. asm.js. http://asmjs.org/
[13] David Herrera, Hanfeng Chen, Erick Lavoie, and Laurie Hendren. 2018. Numerical computing on the web: benchmarking for the future. In Proceedings of the 14th ACM SIGPLAN International Symposium on Dynamic Languages - DLS 2018. ACM Press, Boston, MA, USA, 88–100.
[14] David Herrera, Laurie Hendren, Hangfen Chen, and Erick Lavoie. 2018. WebAssembly and JavaScript Challenge: Numerical program performance using modern browser technologies and devices. Technical Report SABLE-TR-2018-2. Sable Research Group, School of Computer Science, McGill University, Montréal, Canada.
[15] Abhinav Jangda, Bobby Powers, Emery D. Berger, and Arjun Guha. 2019. Not So Fast: Analyzing the Performance of WebAssembly vs. Native Code. In 2019 USENIX Annual Technical Conference (USENIX ATC 19). USENIX Association, Renton, WA, 107–120.
[16] Faiz Khan, Vincent Foley-Bourgon, Sujay Kathrotia, and Erick Lavoie. 2014. Ostrich Benchmark Suite. https://github.com/Sable/Ostrich
[17] Donald E. Knuth. 1981. The Art of Computer Programming, Volume II: Seminumerical Algorithms, 2nd Edition. Addison-Wesley.
[18] Bernd Malle, Nicola Giuliani, Peter Kieseberg, and Andreas Holzinger. 2018. The Need for Speed of AI Applications: Performance Comparison of Native vs. Browser-based Algorithm Implementations. arXiv:1802.03707 (Feb. 2018).
[19] Hiroyuki Matsuo, Shinsuke Matsumoto, Yoshiki Higo, and Shinji Kusumoto. 2019. Madoop: Improving Browser-Based Volunteer Computing Based on Modern Web Technologies. In 2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER). IEEE, Hangzhou, China, 634–638.
[20] Hyukwoo Park, Sungkook Kim, Jung-Geun Park, and Soo-Mook Moon. 2018. Reusing the Optimized Code for JavaScript Ahead-of-Time Compilation. ACM Transactions on Architecture and Code Optimization 15, 4 (Dec. 2018), 1–20.
[21] Panagiotis Patros, Dayal Dilli, Kenneth B. Kent, and Michael Dawson. 2017. Dynamically Compiled Artifact Sharing for Clouds. In 2017 IEEE International Conference on Cluster Computing (CLUSTER). IEEE, Honolulu, HI, USA, 290–300.
[22] Panagiotis Patros, Stephen A. MacKay, Kenneth B. Kent, and Michael Dawson. 2016. Investigating Resource Interference and Scaling on Multitenant PaaS Clouds. In Proceedings of the 26th Annual International Conference on Computer Science and Software Engineering (CASCON '16). IBM Corp., USA, 166–177.
[23] Louis-Noel Pouchet and Tomofumi Yuki. 2016. PolyBench/C. https://web.cse.ohio-state.edu/~pouchet.2/software/polybench/
[24] Micha Reiser and Luc Bläser. 2017. Accelerate JavaScript applications by cross-compiling to WebAssembly. In Proceedings of the 9th ACM SIGPLAN International Workshop on Virtual Machines and Intermediate Languages - VMIL 2017. ACM Press, Vancouver, BC, Canada, 10–17.
[25] V8 Team. 2017. Launching Ignition and TurboFan. https://v8.dev/blog/launching-ignition-and-turbofan
[26] Seth Thompson. 2016. Experimental support for WebAssembly in V8. https://v8.dev/blog/webassembly-experimental