2015 SMC Paper

Web Audio Modules Jari Kleimola Oliver Larkin Dept. of Computer Science Music Department Aalto University University of York Espoo, Finland York, UK [email protected] [email protected] ABSTRACT scope of applicability. For instance, the recent Web Au- dio API extends the browser sandbox to fit in musical This paper introduces Web Audio Modules (WAMs), applications such as those presented in this work (see which are high-level audio processing/synthesis units Figure 1). Browser functionality may be further in- that represent the equivalent of Digital Audio Work- creased with novel secure extension formats such as Em- station (DAW) plug-ins in the browser. Unlike traditional scripten and Portable Native Client (PNaCl) that are run- browser plugins WAMs load from the open web with the ning close to native speeds, and without manual installa- rest of the page content without manual installation. We tion. propose the WAM API – which integrates into the existing Web Audio API – and provide its implementation for JavaScript and C++ bindings. Two proof-of-concept WAM virtual instruments were implemented in Emscrip- ten, and evaluated in terms of latency and performance. We found that the performance is sufficient for reasona- ble polyphony, depending on the complexity of the processing algorithms. Latency is higher than in native DAW environments, but we expect that the forthcoming W3C standard AudioWorkerNode as well as browser developments will reduce it. 1. INTRODUCTION Digital Audio Workstations (DAWs) have evolved from Figure 1. Detail of webCZ-101 user interface. simple MIDI sequencers into professional quality music production environments. Contemporary DAWs equip With these things in mind, we argue that enabling home studios with multi-track audio and MIDI record- DAW-style virtual instruments and effects processors in ing/editing capabilities, which enable musicians to turn their compositions into publication-ready master tracks. web browsers – and integrating them with existing web DAWs are standalone native applications, whose func- APIs – introduces novel use cases that go beyond tionality may be extended using plug-ins. Plug-ins im- standalone DAW host scenarios. We give examples of plement custom virtual instruments and effects pro- four categories. cessing devices, which co-operate inside the DAW host First, the Internet infrastructure may be utilized in di- environment through host-specific application program- rect distribution of software synthesizers, effects devices ming interfaces (APIs). and their presets. Online DAW plug-ins may also be used The primary application scope of a DAW is limited to in demoing native plug-ins without installation. More music making in a local single user environment. In this elaborate use cases include collaborative music making sense, web browsers may be regarded as functional op- and live coding performances. Second, seamless integra- posites of DAWs: browsers target a wide range of use tion with online web pages and strong support for multi- cases, focus on networked connectivity between many media suggests use cases for plug-in tutorials, interactive users, and provide remote resource access in a global documentation, music theory lessons, online musical scope. Like DAWs, browsers have also matured in time score rendering, audiovisual installations, and parameter- from simple document viewing applications into rich ized audio assets for online games. Third, wireless net- interactive multimedia platforms. Moreover, the ever- working and support for various local communication increasing number of standardized web APIs and open protocols afford new interaction paradigms for software source third party libraries continues to expand their synthesizer control. Browsers also provide versatile tools for traditional graphical user interface (GUI) implemen- Copyright: © 2015 Jari Kleimola and Oliver Larkin. This is an open- tations. Fourth, the direct development approach – based access article dis- tributed under the terms of the Creative Commons on JavaScript (JS), HTML and CSS – encourages proto- Attribution License 3.0 Unported, which permits unrestricted use, distri- typing and exploration of novel audio synthesis and pro- bution, and reproduction in any medium, provided the original author cessing algorithms. Finally, when conforming to existing and source are credited. web standards, these four categories are available in ble C++ SDK2 allowed developers to create their prod- cross-platform and cross-device manner without plug-in ucts as plug-ins – dynamic libraries conforming to a spe- host vendor control. We envision that many more scenar- cific API, to be loaded by a “Host” application, which ios will emerge. would typically be a DAW. A small industry had devel- This work introduces Web Audio Modules (WAMs), oped around the technology by the early-2000s with which are DAW-style plug-ins optimized for web brows- companies adopting the format along with a hobbyist ers. Unlike most existing online synthesizer and audio community. Some host vendors, such as Apple and AV- effects implementations that build on top of Web Audio ID created competing APIs allowing them a tighter con- API, WAMs integrate into the Web Audio API via its trol of the market and better integration with their platform. Although it represents a small fraction of musical script-based backend node: each WAM thus implements instrument retail sales, the industry is still growing (at a full-blown software synthesizer or effects device inside least in the USA) as can be seen in the NAMM 2014 a single Web Audio API node. The proposed WAM API global report3. strives to make these nodes reusable a) by standardizing Several plug-in APIs have prevailed and are used wide- how the enclosing web page loads and controls them, and ly at the time of writing, including Steinberg’s own b) by standardizing their DSP interface. The former ena- VST2.4 and VST3, Apple’s AudioUnit and AVID’s bles development of novel application scenarios enlisted AAX. In the open source community LADSPA and LV2 in the previous paragraph, while the latter enables audio (LADSPA version 2) have been widely adopted. In the algorithm development in JavaScript or cross-compiled commercial arena the success and adoption of a particu- C/C++. This, in turn, affords single code base for native lar API is often dictated by the host vendors and the mar- and web audio plug-in implementation. However, in con- ket share they control, more than the merits of the API trast to traditional web plugins, the cross-compiled itself. VST plug-ins are supported by many hosts on the WAMs load directly from the open web without manual Mac and PC platforms, although the VST3 format, which installation (hence the term “module” instead of “plug- is substantially different from VST2.4, has not yet seen in”). The primary contribution of this work consists of: widespread support outside of Steinberg’s DAWs. The AudioUnit format only runs on the Mac platform and is • a proposal for a streamlined API that enables DAW- the only format supported by Apple’s popular Logic style virtual instruments and effects devices in web DAW. Access to the AAX SDK is controlled by AVID browsers. and only AVID can produce AAX hosting applications • an implementation of the API in JS and C/C++. (such as ProTools). LV2 is platform agnostic and entirely • a minimal WAM example and two proof-of-concept open source with the most liberal license, but to date up- WAM virtual instruments conforming to the API. take has been mainly on Linux. • a web service to aggregate WAMs and their presets. In 2003 a working group of the MIDI Manufacturer’s Association (MMA) was set up to develop a non vendor This work also explores how WAMs integrate with ex- controlled plug-in API Generalized Music Plug-in Inter- isting and emerging APIs such as Web MIDI, WebGL, face (GMPI). Although this API never materialized, a and Web Components. draft of a list of requirements was produced based on the The remainder of this paper is structured as follows. members’ discussions4. This has informed the LV2 plug- Section 2 reviews related work in native and web plat- in specification5 and serves as a useful reference for the forms. Section 3 details the API and its architecture, design of audio plug-in APIs. while section 4 explains how web pages and applications Audio plug-in developers who want to make their soft- may embed WAMs, and describes their proof-of-concept ware compatible with a variety of hosts and platforms implementations. Section 5 evaluates the implementa- and reach a wide market have to support multiple APIs tions in terms of latency and performance, and finally, and the complexity is increased by the need to provide Section 6 concludes. cross platform GUI and file system features. For this reason many developers use an intermediate C++ frame- Source code, documentation, demos, and the web ser- 6 7 1 work such as JUCE or IPlug (or a proprietary solution) vice are available via links at the accompanying website in order to develop an abstracted version of the plug-in https://mediatech.aalto.fi/publications/webservices/wams which can then be compiled to multiple formats, platforms and architectures, saving development time. 2. RELATED WORK Although we aim to introduce the functionality offered by the concept of native audio plug-ins to the web, the 2.1 Native Plugin APIs differences of the environment require a different approach to the API design, and the development of a new In 1996 Steinberg introduced Virtual Studio Technology (VST) and started a trend towards “in the box” audio 2 http://www.steinberg.net/en/company/developers.html production, where hardware effects and instruments 3 https://www.namm.org/files/ihdp-viewer/global-report-2014/ could be replaced by native software equivalents running 4 http://retropaganda.info/archives/gmpi/gmpi-requirements-2005-04- inside a DAW on personal computers.

2015 SMC Paper

Synchronous Programming in Audio Processing Karim Barkati, Pierre Jouvelot

Chuck: a Strongly Timed Computer Music Language

Audio Signal Processing in Faust

An Agent Based Approach to Interaction and Composition

An Enabling Technology for Musical Networking

DAW Plugins for Web Browsers Jari Kleimola Dept

Using OSC to Communicate with a Raspberry Pi Created by Todd Treece

To Graphic Notation Today from Xenakis’S Upic to Graphic Notation Today

Chuck: a Strongly Timed Computer Music Language

Software Documentation V

AAC, 18 Ableton LIVE, 84 Absorption, 40 Absorption Coefficient, 40

Ten Years of Tablet Musical Interfaces at CNMAT