CIEREC

Linux Audio Conference 2017

Conferences - Workshops - Concerts - Installations
May 18-21, 2017, Saint-Etienne University
Proceedings

Foreword

Welcome to the Linux Audio Conference 2017 in Saint-Etienne!

The field of computer music and digital audio is rich in well-known scientific conferences, but the Linux Audio Conference is unique in this landscape! Its focus on Linux-based (but not only) free/open-source software development and its friendly atmosphere are perfect for talking code from breakfast until late at night, for demonstrating early prototypes of software that still crash, and more generally for exchanging audio- and music-related technical ideas without fear. LAC also offers a unique opportunity for users and developers to meet, to discuss features, to provide feedback, to suggest improvements, etc.

LAC 2017 is the first edition to take place in France. It is co-organized by the University Jean Monnet (UJM) in Saint-Etienne and GRAME in Lyon.

GRAME is a National Center for Music Creation, an institution devoted to contemporary music and digital art, scientific research, and technological innovation. In 2016 the center hosted 26 guest composers and artists, and produced 88 musical events and 25 exhibitions in 20 countries. GRAME organizes the Biennale Musiques en Scène festival, one of France's largest international festivals of contemporary and new music, with guest artists including Peter Eötvös, Kaija Saariaho, Michael Jarrell, Heiner Goebbels, and Michel van der Aa. GRAME develops research activities in the fields of real-time systems, music representation, and programming languages. Since 1999, all software developed by GRAME has been open source and in most cases multiplatform (Linux, macOS, Windows, Web, Android, iOS, ...).

UJM is part of the University of Lyon, a consortium of higher-education and research institutions located in the two neighboring cities of Lyon and Saint-Etienne. The University of Lyon-Saint-Etienne is the main French higher-education and scientific center outside the Paris metropolitan area, composed of 4 public universities, 7 grandes écoles, and the CNRS (the French National Center for Scientific Research), forming a group of 12 member institutions. The University of Lyon also assembles 19 associated institutions offering specific disciplinary training programs. Altogether, the Université de Lyon regroups 137,600 students and 168 public laboratories.

The Université Jean Monnet (UJM), founded in 1969, is a comprehensive university enrolling some 20,000 students, of whom about 15% are international students from 111 countries, a high proportion of them coming from Africa and Asia.

The university is composed of five faculties (arts, letters and languages; humanities and social sciences; law; sciences and technology; and medicine) and four institutes: the Institut Supérieur d'Economie, d'Administration et de Gestion, Telecom Saint-Etienne, and the University Institutes of Technology (IUTs) of Saint-Etienne and Roanne.

The CIEREC is a research center devoted to the field of contemporary expression, which brings together professors, researchers and PhD students in aesthetics and sciences of art, plastic arts, design, digital arts, literature, linguistics and musicology. Its main field is the arts and literature of the twentieth and twenty-first centuries.

The Music Department offers graduate, post-graduate and doctoral training in music and musicology. On the technology side, it has offered since 2011 a Professional Master's Degree in Computer Music (RIM) that is unique in France, run in collaboration with GRAME, with the main audio production studio of Saint-Etienne (Le FIL) and with the National Superior Conservatory of Music in Lyon. In 2016 it created a second Professional Master's for Digital Arts (RAN). The RIM & RAN Professional Masters aim to develop students' applied knowledge and understanding of electronic and digital technologies for creation, and they prepare students for the professions of Producer in Computer Music (RIM - Réalisateur en Informatique Musicale) and Producer in Digital Arts (RAN - Réalisateur en Arts Numériques). These producers are direct actors in musical and artistic productions, working at the interface between software developers, applied computer scientists, composers, artists, and anyone likely to integrate video, image and sound into their activities. Most courses are available in English (see http://musinf.univ-st-etienne.fr/indexGB.).

Thanks to all the contributors who submitted papers and proposed workshops, installations and music, we have a very interesting and varied program at the conference. Over a period of four days, we are pleased to welcome some twenty talks on various subjects, with speakers from different backgrounds and countries.

We would like to thank all those who contributed to the realization of this edition, and especially Albert Gräf, Philippe Ezequel, Jean-François Minjard, Stéphane Letz, Lionel Rascle, Thomas Cipierre, Sébastien Clara, Philippe Landrivon, David-Olivier Lartigaud, Jean-Jacques Girardot, Martine Patsalis, Nadine Leveque-Lair, and all the reviewers and members of the scientific and artistic committees.

Thanks to our partners who helped finance this conference: the CIEREC, GRAME, the RIM & RAN Masters, the UJM Music and Arts Departments, the Arts, Lettres, Langues Faculty, the Random-Lab at ESADSE (Art School of Saint-Etienne), Le Son des Choses, Electro-M, IDjeune, and the Commission Sociale et Vie Etudiante at UJM.

LAC 2017 has also been partially funded by the FEEVER project [ANR-13-BS02-0008], supported by the Agence Nationale pour la Recherche.

We hope that you will enjoy the conference and have a pleasant stay in Saint-Etienne!

Vincent Ciciliato, Yann Orlarey and Laurent Pottier
The LAC 2017 teams

Organizers

• CIEREC (Centre Interdisciplinaire d'Étude et de Recherche sur l'Expression Contemporaine), director: Danièle Méaux; directors of the Electronic Team: Laurent Pottier & Vincent Ciciliato
• Music Department of Jean Monnet University (UJM), director: Anne Damon-Guillot
• GRAME (National Center for Musical Creation), director: James Giroudon; scientific director: Yann Orlarey
• Random-Lab, Center for Open Researches in Art, Design and New Media at ESADSE (Art School of Saint-Etienne), director: David-Olivier Lartigaud
• Association « Le son des choses » (Acousmatic Music)
• Association « Electro-M » (RIM & RAN Masters students)

Organizing committee

• Vincent Ciciliato, Lecturer (Digital Arts) at CIEREC (UJM)
• Thomas Cipierre, PhD student (Musicology) at CIEREC (UJM)
• Sébastien Clara, PhD student (Musicology) at CIEREC (UJM)
• Philippe Ezequel, Lecturer (Computer Science) at CIEREC (UJM)
• Jean-Jacques Girardot, Programmer (Computer Science) at Le Son des Choses
• Stéphane Letz, Researcher at GRAME
• Philippe Landrivon, Audiovisual technician, ALL Faculty (UJM)
• David-Olivier Lartigaud, Director of Random-Lab (ESADSE)
• Jean-François Minjard, Composer at Le Son des Choses
• Yann Orlarey, Scientific director of GRAME
• Laurent Pottier, Lecturer (Musicology) at CIEREC (UJM)
• Lionel Rascle, Professor (Music School - St Chamond & Rive de Gier)

Scientific committee

• Fons Adriaensen
• Marije Baalman
• Tim Blechmann
• Alain Bonardi
• Ivica Ico Bukvic
• Guilherme Carvalho
• Vincent Ciciliato
• Thierry Coduys
• Myriam Desainte-Catherine
• Goetz Dipper
• Catinca Dumitrascu
• Philippe Ezequel
• John Ffitch
• Dominique Fober
• Robin Gareus
• Albert Gräf
• Marc Groenewegen
• Florian Hollerweger
• Madeline Huberth
• Jeremy Jongepier
• Pierre Jouvelot
• David-Olivier Lartigaud
• Victor Lazzarini
• Stéphane Letz
• Fernando Lopez-Lezcano
• Kjetil Matheussen
• Romain Michon
• Frank Neumann
• Yann Orlarey
• Dave Phillips
• Peter Plessas
• Laurent Pottier
• Miller Puckette
• Elodie Rabibisoa
• Lionel Rascle
• David Robillard
• Martin Rumori
• Bruno Ruviaro
• Funs Seelen
• Julius Smith
• Pieter Suurmond
• Harry Van Haaren
• Steven Yi
• Johannes Zmölnig

SUMMARY

Special Guests

Paul Davis ------ 1
Thierry Coduys ------ 1

Conferences

1. OpenAV Ctlra: A Library for Tight Integration of Controllers, by Harry Van Haaren ------ 5
2. Binaural Floss - Exploring Media, Immersion, Technology, by Martin Rumori ------ 13
3. A versatile workstation for the diffusion, mixing and post-production of spatial audio, by Thibaut Carpentier ------ 21
4. Teaching Sound Synthesis in C/C++ on the Raspberry Pi, by Henrik von Coler, David Runge ------ 29
5. Open Signal Processing Software Platform for Hearing Aid Research (openMHA), by Tobias Herzke, Hendrik Kayser, Frasher Loshaj, Giso Grimm, Volker Hohmann ------ 35
6. Towards dynamic and animated music notation using INScore, by Dominique Fober, Yann Orlarey, Stéphane Letz ------ 43
7. PlayGuru, a music tutor, by Marc Groenewegen ------ 53
8. Faust audio DSP language for JUCE, by Adrien Albouy, Stéphane Letz ------ 61
9. Polyphony, sample-accurate control and MIDI support for FAUST DSP using combinable architecture files, by Stéphane Letz, Yann Orlarey, Dominique Fober, Romain Michon ------ 69
10. faust2api: a Comprehensive API Generator for Android and iOS, by Romain Michon, Julius Smith, Stéphane Letz, Chris Chafe, Yann Orlarey ------ 77
11. New Signal Processing Libraries for Faust, by Romain Michon, Julius Smith, Yann Orlarey ------ 83
12. Heterogeneous data orchestration - Interactive fantasia under SuperCollider, by Sébastien Clara ------ 89
13. Higher Order Ambisonics for SuperCollider, by Florian Grond, Pierre Lecomte ------ 95
14. STatic (LLVM) Object Analysis Tool: Stoat, by Mark McCurry ------ 105
15. AVE Absurdum, by Winfried Ritsch ------ 111
16. Multi-user posture and gesture classification for "subject-in-the-loop" applications, by Giso Grimm, Joanna Luberadzka, Volker Hohmann ------ 119
17. VoiceOfFaust, by Bart Brouns ------ 127
18. On the Development of C++ Instruments, by Victor Lazzarini ------ 133
19. Meet the Cat: Pd-L2Ork and its New Cross-Platform Version "Purr Data", by Ivica Bukvic, Albert Gräf, Jonathan Wilkes ------ 141

Posters / Speed-Geeking

Impulse-Response- and CAD-Model-Based Physical Modeling in Faust (poster), by Pierre-Amaury Grumiaux, Romain Michon, Emilio Gallego Arias, Pierre Jouvelot ------ 151
Fundamental Frequency Estimation for Non-Interactive Audio-Visual Simulations (poster), by Rahul Agnihotri, Romain Michon, Timothy O'Brian ------ 155
Porting WDL-OL to LADSPA (speed-geeking), by Jean-Jacques Girardot ------ 159

Workshops ------ 161

Concerts ------ 169

Installations ------ 177


Special Guests

PAUL DAVIS

Paul Davis is the lead developer of the Ardour open source digital audio workstation, as well as of the JACK Audio Connection Kit. Before his 18 years of involvement with audio software, Paul moved between academia and the corporate computing worlds, including four and a half years at the University of Washington's Computer Science & Engineering department, before becoming the 2nd employee at Amazon.com. In 2008/2009 he taught at the Technische Universität Berlin as the Edgard Varèse Visiting Professor. Paul normally lives near Philadelphia, PA, but can also be found living and working in a solar-powered van. To his regret, Paul does not play any musical instruments.

Talk:

20 years of Open Source Audio: Success, Failure and The In-Between

I will talk about the 20-year history of open source audio development (focused on Linux but including other platforms when appropriate). It is a story that includes successes, failures and a lot of more ambiguous elements. I will discuss the ways in which the open source model does and does not help with software development, and also the sometimes surprising ways that "open source" might be pushing audio and music technology in the near future.

THIERRY CODUYS

Artist, musician and new-technology expert, Thierry Coduys specializes in collaborative and multidisciplinary projects where interactivity meets the contemporary arts. Since 1986, he has worked closely with the avant-garde of contemporary music (e.g. Karlheinz Stockhausen, Steve Reich, ...) to realize electroacoustic and computer systems for live performance. After a few years spent at IRCAM in Paris, he became the assistant to Luciano Berio. Building on his experience of the contemporary art scene, in 1999 he created his own company, an artistic research and technology laboratory called 'La kitchen', where artists from various horizons (e.g. music, dance, theatre, video, network) came to develop projects in collaboration with the team, and where artists were encouraged to use open source software. Among other things, Thierry has been for more than 15 years the project manager of 'IanniX' (a GNU GPL3 application), an interactive software interface inspired by the UPIC of Iannis Xenakis, and senior consultant for the development of 'Rekall' (a GNU GPL3 application), a video-annotation software to document digital performances.

Talk:

Why could open source software change the way of writing for contemporary creation?

I will try to explain why it is important to propose open source tools to artists. Having been obliged for so long to play the game of commerce and industry, and to use dedicated platforms in research and creation centres, artists are now waiting for new concepts and not only for new production tools. The open source community attracts brilliant developers, highly motivated and often not very well paid. Many reasons can explain this interest: clean design, reliability, easy maintenance in the respect of the rules and values shared by the community, but also and mostly a freedom without constrictions that leaves space for new concepts. I will show some examples of creations made in collaboration with artists through all phases of development, and their impact on the functions of the software used in the creation process.



Conferences


OpenAV Ctlra: A Library for Tight Integration of Controllers

Harry VAN HAAREN
OpenAV
Bohatch, Mountshannon, Co. Clare, Ireland
[email protected]

Abstract

Ctlra is a library designed to encourage integration of hardware and software. The library abstracts events from the hardware controller, emitting generic events which can be mapped to functionality exposed by the software.

The generic events provide a powerful method to allow developers and users to integrate hardware and software; however, a good development workflow is vital to users while tailoring mappings to their unique needs.

This paper proposes an implementation to enable a fast scripting-like development workflow utilizing on-the-fly recompilation of C code for integrating hardware and software in the Ctlra environment.

Keywords

Controllers, Hardware, Software, Integration.

1 Introduction

Ctlra aims to enable easy integration between DAWs and controllers. At OpenAV we believe that enabling hardware controllers to be 1st class citizens in controlling music software will provide the best on-stage workflow possible.

Ctlra has been developed due to the lack of a simple C library that affords interacting with a range of controllers in a generic but direct way that enables tight integration.

1.1 Existing Projects

Although many projects exist to enable hardware access, very few aim to provide a generic interface for applications to use.

Projects such as maschine.rs [Light, 2016], HDJD [Pickett, 2017], OpenKinect [OpenKinect-Community, 2017] and CWiid [Smith, 2007] all enable hardware access, however they each expose a unique API to the application, resulting in the need to explicitly support each controller.

The o.io [Freed, 2014] project aims to unify communications for various types of interaction using an OSC API, which is similar to the generic events concept. Discoverability and familiarity with the implementation presented possible issues, so Ctlra is designed as a simple C API that will be instantly familiar to seasoned developers.

Hence, Ctlra is implemented as a C library that provides generic events to the application, regardless of the hardware in use.

1.2 Modern Controllers

Each year there are new, more powerful and complex hardware controllers, often with large numbers of input controls and lots of feedback using LEDs etc. The latest generations have seen an uptake in high-resolution screens built into the hardware.

The capabilities of these devices require an equally powerful method to control the hardware, or risk not utilizing them to their full potential. As such, any library to interface with these controllers should afford handling these complex and powerful controller devices easily.

1.3 Why a Controller library?

Although every application could implement its own device-handling mechanism, there are significant downsides to this approach.

Firstly, a developer will not have access to all controllers that are available, so only a subset of the controllers will have tight integration with their software. As an end result, the user's controller may not be directly supported by the application.

Secondly, duplication of effort is significant, both in the development and testing of the controller support. This is particularly true if a device supports multiple layers of controls.

Thirdly, advanced controller support features like hotplug and supporting multiple devices of the same type must also be tested - requiring both access to multiple hardware units and time.

The Ctlra library shares the effort required to develop support for these powerful devices, providing users and developers with an easy API to communicate with the hardware.

1.4 Tight integration

The terms "tight integration" or "deep integration" are often used to describe hardware and software that collaborate closely together; perhaps they are even specifically designed to suit one another.

Tight integration leads to better workflows for on-stage usage of software, as it allows operations inside the software to be controlled by the hardware device and appropriate feedback to be returned to the user.

The advantage of tight integration is providing a more powerful way of integrating the physical device and the software. As an example, many DAWs support MIDI Control Change (CC) messages and allow changing a parameter with them. Although such a 1:1 mapping is useful, most workflows require more flexibility. For example, each physical control could affect a number of parameters, with weighting applied to provide a more dynamic performance.

1.5 Controller Mapping

The Ctlra library allows mappings to be created between physical controls and the target software. DAWs could expose this functionality for technical users - giving them full control over the software.

Given the variation in live performances and on-stage workflows, there is no ideal mapping from a device to the application - it depends on the user. As a result, OpenAV is of the opinion that enabling users to create custom mappings from controllers to software, using a generic event as the medium to do so, is the best approach.

1.6 Scripting APIs

Various audio applications provide APIs to allow users to script functionality for their controller. Enabling users to script themselves requires technical skill from the user, however it seems like there is no viable alternative.

The solution proposed in section 4 also proposes "crowd-sourcing" the effort of writing controller mappings to the users themselves, as they have access to the physical device and have knowledge of their ideal workflow.

Examples of audio applications that provide scripting APIs are Ardour [Davis, 2017], Mixxx [Mixxx, 2017] and Bitwig Studio [Bitwig, 2017]. Although Ableton Live [Ableton, 2017] doesn't officially expose a scripting API, there are members of the community that have investigated and successfully written scripts to control it [Petrov, 2017].

A brief review shows high-level scripting languages are favoured over compiled languages. Mixxx and Bitwig are both using JavaScript, while Ableton Live uses Python, and Ardour uses the Lua language.

These solutions are all valid and workable, however they do require that the application developer exposes a binding API to glue the scripting API to the core of the application.

With the exception of Lua, none of the above scripting languages provide real-time safety unless very carefully programmed - which should not be expected of users' scripts.

OpenAV feels that providing controller support in the native language of the application ensures that all operations that the application is capable of are also mappable to a controller. Another advantage of having the controller mappings in the native language of the application is that they can be compiled into the application itself.
2 Ctlra Implementation

This section details the design decisions made during the implementation of the Ctlra library. The core concepts like the context, device and events are introduced.

2.1 Ctlra Context

The main part of the Ctlra library is the context: it contains all the state of that particular instance of the Ctlra library. This state is represented by a ctlra_t in the code. Using a state structure ensures that Ctlra is usable from inside a plugin, for example an LV2 plugin.

Devices and metadata used by Ctlra are stored internally in the ctlra_t. The end goal is to enable multiple ctlra_t instances to exist in the same process without interfering with one another. This is more difficult than it sounds, as not all backends provide support for context style usage.

2.2 Generic Events

Ctlra is built around the concept of a generic event. The generic event is a C structure ctlra_event_t which may contain any of the available event types. The available event types include all common hardware controller interaction types, such as BUTTON, ENCODER, SLIDER and GRID. The events are prefixed by CTLRA_EVENT_, so BUTTON becomes CTLRA_EVENT_BUTTON.

Once the type of the event is established, the contents of the event can be decoded. The generic event has a union around all events, so an event must represent one and only one type of event. It is expected that the application will use a switch() statement to decode the event types and process them further.

The power of generic events is shown by the examples/daemon sample application, which translates any Ctlra supported device into an ALSA MIDI transmitting device.

2.2.1 Button

The button event represents physical buttons on a hardware device. It contains two variables, id and pressed. The button id is guaranteed to be a unique identifier for this device, based from 0 and counting to the maximum number of buttons. The pressed variable is a boolean value set high when the button is pressed by the user.

2.2.2 Slider

The slider event represents physical controls that have a range of values, but where the interaction is of limited range, e.g. faders on a mixing desk. The slider has an id as a unique identifier for the slider, and a floating-point value that represents the position of the control. The value variable range is normalized as a linear value from 0.f to 1.f to allow generic usage of the event.

2.2.3 Encoder

The encoder represents an endless rotary control on a hardware device. There are two types of encoders, which we will refer to as "stepped" and "continuous". Stepped controls have notches providing distinct steps of movement, while the continuous type is smooth and provides no physical feedback during rotation.

The stepped controls notify the application for each notch moved by setting the ENCODER_FLAG_INT, and the delta change is available from delta. Similarly, the ENCODER_FLAG_FLOAT tells the application to read the delta_float value and interpret the value as a continuous control.

2.2.4 Grid

The grid represents a set of controls that are logically grouped together, e.g. the squares of the Push2 controller. The grid event type contains multiple variables: id, flags, pos, pressure and pressed.

The id identifies the grid number, allowing controllers with more than one grid to distinguish between them. The flags allow the event to identify which values are valid in this event. Currently two flags are defined, BUTTON and PRESSURE; there are 14 bits remaining for future expansion.

If the flag GRID_FLAG_BUTTON is set, the pressed variable is valid to read, and represents whether the button is currently pressed or not. The BUTTON flag should only be set in the device backend if the state of the grid square has changed; this eases handling events in the application. When GRID_FLAG_PRESSURE is set, the floating-point pressure variable may be read. The pressure value is normalized to the range 0.f to 1.0f.
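The following hedged sketch illustrates the switch()-based decode pattern described above. The struct layout and field names are simplified assumptions drawn from this section's descriptions, not the actual Ctlra header:

    /* Sketch of an application callback decoding generic events.
     * Layout and names are assumptions based on the text above. */
    #include <stdint.h>
    #include <stdio.h>

    enum ctlra_event_type { CTLRA_EVENT_BUTTON, CTLRA_EVENT_SLIDER,
                            CTLRA_EVENT_ENCODER, CTLRA_EVENT_GRID };

    struct ctlra_event {
        enum ctlra_event_type type;
        union {   /* exactly one member is valid, selected by type */
            struct { uint32_t id; uint8_t pressed; } button;
            struct { uint32_t id; float value; } slider;   /* 0.f..1.f */
            struct { uint32_t id; int32_t delta;
                     float delta_float; } encoder;
            struct { uint32_t id; uint16_t flags; uint32_t pos;
                     float pressure; uint8_t pressed; } grid;
        };
    };

    static void event_handle(const struct ctlra_event *e)
    {
        switch (e->type) {
        case CTLRA_EVENT_BUTTON:
            printf("button %u %s\n", e->button.id,
                   e->button.pressed ? "down" : "up");
            break;
        case CTLRA_EVENT_SLIDER:
            printf("slider %u = %f\n", e->slider.id, e->slider.value);
            break;
        case CTLRA_EVENT_ENCODER:
            printf("encoder %u moved %d\n", e->encoder.id,
                   e->encoder.delta);
            break;
        case CTLRA_EVENT_GRID:
            printf("grid %u square %u pressure %f\n",
                   e->grid.id, e->grid.pos, e->grid.pressure);
            break;
        }
    }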
2.3 Devices

In Ctlra, any physical controller is represented internally by a ctlra_dev_t. Devices do not appear available to the application directly; instead, operations on the device are performed through the ctlra_t context. There is an abstracted representation of a device at the API level, which the application has access to in the event_handle() callback.

The reason that the device is not exposed to the application directly is that ownership and cleanup of resources become blurred when hotplug functionality is introduced. Using the ctlra_t context as a proxy for multiple devices not only simplifies the application handling of controllers, but actually helps define stronger memory ownership rules too. See section 2.4 for hotplug implementation details.

2.3.1 Device Backends

A device backend is how the software driver connects to the physical device.

The implementation of the driver calls a read() function, which indicates the driver wishes to receive data. The backend library will send an async read to the physical device and return immediately. Upon completion of the transaction a callback in the driver is called, which decodes the newly received data and can emit events to the application if required. To write data to the device, a write() is provided.

Note that a single device driver may open multiple backends, or utilize multiple connections of the same backend, in order to fully support the capabilities of the hardware. An example could be a USB controller that exposes both a USB interrupt endpoint for buttons and a USB bulk endpoint for sending data to a high-resolution screen.

Note that more backends can be added to support more devices if required in future.
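To make the asynchronous read flow concrete, here is a small hypothetical sketch of a driver completion callback. Every name in it (my_driver, emit_button_event, usb_read_complete) is invented for illustration and is not the Ctlra backend API:

    #include <stdint.h>

    struct my_driver { uint8_t prev[64]; };    /* hypothetical state */

    static void emit_button_event(struct my_driver *d, int id, int on)
    {
        (void)d; (void)id; (void)on;   /* would build a generic event */
    }

    /* Called by the backend when an async USB interrupt read finishes;
     * decodes raw bytes and emits events only when state has changed. */
    static void usb_read_complete(struct my_driver *d,
                                  const uint8_t *buf, int len)
    {
        for (int i = 0; i < len && i < 64; i++) {
            if (buf[i] != d->prev[i]) {
                emit_button_event(d, i, buf[i]);
                d->prev[i] = buf[i];
            }
        }
        /* the driver would now call read() again to re-arm the transfer */
    }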

2.4 Hotplug Implementation

Implementing a hotplug feature is difficult; it requires handling device additions and removals in the library itself, as well as a method to communicate any changes of environment to the application.

As Ctlra is a new library built from the ground up, hotplug was a consideration from the start as a required feature. As such, the API has been influenced by and designed for hotplug capabilities. The concept of a ctlra_t context that contains devices was introduced to allow transparent adding of devices without blurring memory ownership rules.

Hotplug of USB devices is enabled by LibUSB, which provides a hotplug callback when a hotplug callback is registered and hotplug is supported on the platform. The USB hotplug callback is utilized to call the accept_device() callback in the application, providing details of the controller. The info provided allows the application to present the user with a choice of accepting or rejecting the controller; if accepted, it will be added to the ctlra_t context.

3 Application Usage of Ctlra

This section will introduce the reader to the steps required to integrate Ctlra into an application. Refer to the examples/simple/simple.c sample to see a minimal program in action. The following steps summarize Ctlra usage (a sketch follows below):

1. ctlra_create()
2. ctlra_probe()
   • Accept controller in callback
3. ctlra_iter()
   • Handle events in callback
4. ctlra_exit()

This creates a single ctlra_t context, then probes for and accepts any supported controller. The accepted controllers are connected to the particular context from which they were probed.

Calling ctlra_iter() causes the events to be polled, and the application is given a chance to send feedback to the device. Finally, ctlra_exit() releases any resources and gracefully closes the context.
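A minimal hedged sketch of these four steps follows. The exact prototypes (option arguments, callback signatures) are assumptions for illustration; examples/simple/simple.c in the source shows the real usage:

    #include <stdbool.h>
    #include <stddef.h>

    struct ctlra_t;   /* opaque context, as described in section 2.1 */

    /* Assumed prototypes -- the real header may differ: */
    struct ctlra_t *ctlra_create(void *opts);
    int  ctlra_probe(struct ctlra_t *c,
                     int (*accept)(const void *info, void *ud), void *ud);
    void ctlra_iter(struct ctlra_t *c);
    void ctlra_exit(struct ctlra_t *c);

    static int accept_device(const void *info, void *ud)
    {
        (void)info; (void)ud;
        return 1;                 /* accept every supported controller */
    }

    static volatile bool done = false;   /* set from a signal handler */

    int main(void)
    {
        struct ctlra_t *c = ctlra_create(NULL);      /* 1. create */
        ctlra_probe(c, accept_device, NULL);         /* 2. probe  */
        while (!done)
            ctlra_iter(c);    /* 3. handle events / send feedback  */
        ctlra_exit(c);        /* 4. clean up and close the context */
        return 0;
    }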
3.1 Interaction

The main interaction between Ctlra and the application happens in two functions. Events from the device are handled in the application-provided event_handle() function, while feedback can be sent to a device from the feedback_func().

These functions are callback functions, and they are invoked for each device when the application calls ctlra_iter().

To understand the events passed between the device and the application, please review the generic events (Section 2.2) and browse the examples/ directory.

3.2 Controller's View of State

Each application has its own way of representing its state. Similarly, each controller has its own capabilities in terms of controls and feedback to the user. Given the specific application state and the capabilities of the hardware, it is useful to create a struct specifically for storing the view that the controller has of the application.

Note that the controller view should be tracked per instance of the controller, as users may have multiple identical controllers. This per-instance struct is very useful for remapping the controls to provide an alternate map when a "shift" key is held down. As the struct depends on the application and device, this problem cannot be solved elegantly at the library layer.

Ctlra provides a userdata pointer for each instance which can be repurposed to point to the state struct. If the application's state must be accessed from the state struct, a "back-pointer" to the application elegantly provides that.

The memory for the state struct can be allocated in the accept_device() callback from Ctlra, and the memory can be released when the device is disconnected, using the remove_device() callback.
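As a hedged illustration of such a view struct, incorporating the back-pointer and shift-remap ideas from the text (all field names are invented):

    #include <stdbool.h>
    #include <stdint.h>

    struct app;                      /* the application's own state      */

    struct controller_view {
        struct app *app;             /* back-pointer to the application  */
        bool shift_held;             /* remap controls while shift down  */
        uint8_t led_state[64];       /* last feedback written, per LED   */
    };

Such a struct would be allocated in the accept_device() callback, stored in the instance's userdata pointer, and freed again in remove_device() when the controller is unplugged.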

4 Device Scripting in C

This section describes a solution for providing a fast and interactive development workflow for scripting mappings between software and device using the C language.

C is typically a compiled and static language, not one that comes to mind when discussing dynamic and scripting type workflows. Although that is generally accurate, C can be used as a dynamic language with certain compromises. The following section details how applications can implement a C scripting workflow for users to quickly develop "Ctlra scripts".

4.1 Dynamic Compilation

Dynamically compiling C at runtime can be achieved by bundling a small, lightweight C compiler with your application. This may sound a little crazy, but there are very small and lightweight C compilers available designed for this type of usage. The "Tiny C Compiler", or TCC [Bellard, 2017] project, is used to enable compiling C code at runtime of the application.

Please note that the security of dynamically compiling code is not being considered here, as the goal is to enable user-scripted controller mappings for musical performance. If security is a concern, the reader is encouraged to find a different solution.

4.2 TCC and Function Pointers

The TCC API has various functions to create a compilation context, set includes, and add files for compilation. Once initialized, TCC takes an ordinary .c source file and compiles it.

When compilation completes successfully, TCC allows requesting functions from the script by name, returning a function pointer. The returned function pointer may be called by the host application, forming the method of communicating with the compiled script.

4.3 The Illusion of Scripting

To provide the illusion that the code is a script, the application can check the modified time of a script file, and recompile the file if needed. By swapping in the new function pointers, the updated code runs. The old program can then be freed, cleaning up the resources that were consumed by the now outdated script.

The examples/tcc_scripting/ directory contains a minimal example showing how the event handling for any Ctlra supported device can be dynamically scripted.

Providing this workflow requires some extra integration from the application, however it pays off easily when the developer time saved in scripting support for each controller is considered.

4.4 C and C++ APIs

Note that TCC is a C compiler only - explicitly not a C++ compiler. This has some impact on how scripts can interact with applications, as many large open-source audio projects are written in C++. The solution is to provide wrapper functions to C, if the host's language is C++.

Often real-time software uses message-passing of plain C structs through ringbuffers. This is a good way to communicate between dynamically compiled scripts and the host, as it provides a native C API, as well as a method to achieve thread-safe message passing.
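As a sketch of this compile-and-swap loop, the following hedged example uses the libtcc API (tcc_new(), tcc_add_file(), tcc_relocate(), tcc_get_symbol()). The script entry point name script_event_handle and its signature are this example's invention, not part of Ctlra or of the examples mentioned above:

    #include <libtcc.h>
    #include <sys/stat.h>
    #include <time.h>
    #include <stddef.h>

    typedef void (*handler_t)(void *event);

    static time_t    last_mtime;
    static TCCState *state;      /* kept alive while its code is in use */
    static handler_t handler;

    void script_reload_if_changed(const char *path)
    {
        struct stat st;
        if (stat(path, &st) || st.st_mtime == last_mtime)
            return;                        /* unchanged: keep old code  */

        TCCState *s = tcc_new();
        tcc_set_output_type(s, TCC_OUTPUT_MEMORY);
        if (tcc_add_file(s, path) < 0 ||
            tcc_relocate(s, TCC_RELOCATE_AUTO) < 0) {
            tcc_delete(s);                 /* compile error: keep old   */
            return;
        }
        handler_t h = (handler_t)tcc_get_symbol(s, "script_event_handle");
        if (!h) { tcc_delete(s); return; }

        if (state)
            tcc_delete(state);    /* free the now-outdated program      */
        state      = s;
        handler    = h;           /* swap in the new function pointer   */
        last_mtime = st.st_mtime;
    }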
5 Case Study: Ctlra and Mixxx

This section briefly describes the work performed to integrate Ctlra with the open-source Mixxx DJ software. It is presented here to showcase how to integrate the Ctlra library in an existing project.

5.1 Implementation

This section details the steps taken to integrate the Ctlra library in Mixxx to test Ctlra in the real world.

5.1.1 Class Structure

Mixxx has a very object-oriented design, utilizing C++ classes to abstract the behaviour of control devices and the managers of those control devices. The ControllerManager class aggregates the different types of ControllerEnumerator classes, which in turn add Controller class instances to the list of active controllers. Ctlra has been integrated as a ControllerEnumerator sub-class for this proof-of-concept implementation; really it should be integrated at the ControllerManager level.

5.1.2 Threading in the Mixxx Engine

The Mixxx engine currently creates many threads. This design is supported by the use of an "atomic database" of values (see next Section 5.1.3). Given this design, the Ctlra integration is done by spawning a Ctlra handling thread, which performs any polling and interaction with Ctlra supported devices.

5.1.3 Communicating with the Engine

The Mixxx engine is composed of values, which can be controlled from any thread anywhere in the code. These values are represented in the code by the ControlObject and ControlProxy classes. A ControlObject is the equivalent of owning a value, while the ControlProxy allows atomic access to update the value. Lookup of these values is performed using "group" and "key" strings. The strings are constant, allowing Ctlra and the Mixxx engine to understand the meaning of each value represented by a particular ControlProxy.

5.1.4 Mixxx's C++ API

An issue arises due to Mixxx's ControlProxy being a C++ class, which is not possible to access from a TCC compiled script (refer to C and C++ APIs, Section 4.4).

The solution is to create a C wrapper function, which simply provides a C API to the desired C++ function to be called on a ControlProxy instance. This provides the power of the Mixxx engine to the dynamically compiled script code:

    void mixxx_config_key_set(const char *group,
                              const char *key,
                              float value);

5.2 Mixxx and Hotplug

Since Ctlra hides the hotplug functionality from the application due to the design of the accept_device() callback, Mixxx supports on-the-fly plug-in and plug-out transparently.

This is achieved by the Ctlra library having its own thread to poll events (see Section 5.1.2) and handling the connect or disconnect events. The Mixxx application code did not have to be modified to support hotplugging of controllers in any way (beyond adding basic Ctlra support).

5.3 Scripting Controller Support

With the Ctlra library integrated in Mixxx, users are now able to script the tight integration of the Ctlra supported hardware and Mixxx. The next sections demonstrate simple mappings from a device to Mixxx and vice versa.

5.3.1 Event Input to Mixxx

When a user presses a physical control on a device, the action is presented to the application as an event. The user can map these events to the application in a variety of ways, in order to suit their own requirements on how they wish to control the software application.

For example, the following snippet shows how we can bind slider ID 10 to channel 1 volume in Mixxx (note the usage of the C function from Section 5.1.4):

    case CTLRA_EVENT_SLIDER:
        switch (e->slider.id) {
        case 10:
            mixxx_config_key_set("[Channel1]",
                                 "volume",
                                 e->slider.value);
            break;
        }
        break;
5.3.2 Mixxx Feedback to Device

The reverse of the previous paragraph is to send Mixxx state to the physical device, providing feedback to the user. Each parameter that Mixxx exposes via the ControlProxy is available for reading as well as writing. This allows the script to query the state of a particular variable from Mixxx and update the state of an LED on the device, using the Ctlra encoding for colour and brightness:

    int play;
    play = mixxx_config_key_get("[Channel1]",
                                "play_indicator");
    led = play > 0 ? 0xffffffff : 0;

    ctlra_dev_light_set(dev,
                        DEVICE_LED_PLAY,
                        led);

6 Future Work

Making Ctlra a ubiquitous library for event I/O is a huge task; however, the benefit to all applications if such a library existed would be huge too.

Imagine easily scripting your DIY controller to control any aspect of any software - huge potential for a customized, powerful user experience. OpenAV intends to use the Ctlra library and integrate it with any projects that would benefit from a powerful customizable workflow.

6.1 Device Support

At the time of writing, the Ctlra library supports 6 advanced USB HID devices, one USB DMX device and a generic MIDI backend, and plans are in place to support a common bluetooth console controller - but more must be added to make the Ctlra library really useful!

An interesting angle may be for DIY platforms like Arduino to be usable for building controllers that use a generic Ctlra backend, allowing such controllers to be auto-supported.

The previously mentioned hardware enabling projects that provide access to specific hardware devices could be integrated with Ctlra, transparently benefiting applications that use Ctlra.

The number of supported hardware devices is paramount to the success of the Ctlra library, so OpenAV welcomes patches or pull-requests that add support for a device.

6.2 Software Environments

From the software point of view there is huge potential for integrating into existing software. For example, mapping Ctlra events to LV2 Atoms would expose the Ctlra backends to any LV2 Atom capable host.

Integration with DSP languages like FAUST or PD may prove interesting and allow for faster prototyping and more powerful control over performance using those tools.

Hardware platforms like the MOD Duo [MOD, 2017] could use the Ctlra library to enable musicians to use a wider variety of controllers in their on-stage setups, in conjunction with the DSP on the Duo.

7 Conclusion

This paper presents Ctlra, a library that allows an application to interface with a range of controllers in a powerful and customizable way. It shows how applications and devices can interact by using generic events. A case study showcases integrating Ctlra with the open-source Mixxx project as a proof of concept.

To enable a fast development workflow for creating mappings between applications and devices, a method to dynamically compile C code is introduced. This enables developers and users to write mappings between devices and applications as if C were a scripting language, but provides native access to the application's data structures.

Ctlra is available from GitHub [OpenAV, 2017]; please run the sample programs in the examples/ directory of the source to experience the power of Ctlra yourself.

8 Acknowledgements

OpenAV would like to acknowledge the linux-audio community and the open-source ecosystem as a whole for providing novel solutions to various problems and being a great place to collaborate and innovate. For the work on Ctlra, certain people and projects provided lots of inspiration and support - thanks!

Thanks to the TCC project, which allows dynamically compiling Ctlra scripts; it is awesome to script in C!

Thanks to William Light for writing maschine.rs, to David Robillard for the creation of PUGL [Robillard, 2017], and to the Mixxx project devs (particular shout-outs to be_, Pegasus_RPG and rryan on #mixxx on irc.freenode.net).
References

Ableton. 2017. Music production with Live and Push | Ableton. https://www.ableton.com.

Fabrice Bellard. 2017. TCC: Tiny C Compiler. http://www.bellard.org/tcc/.

Bitwig. 2017. Bitwig: music production and performance system for Windows, macOS and Linux. https://www.bitwig.com.

Paul Davis. 2017. Ardour: record, edit, and mix on Linux, OS X and Windows. http://ardour.org/.

Adrian Freed. 2014. o.io: a unified communications framework for music, intermedia and cloud interaction. International Computer Music Conference (ICMC) 2014.

William Light. 2016. maschine.rs: open-source NI Maschine device handling. https://github.com/wrl/maschine.rs.

Mixxx. 2017. Mixxx DJ software: DJ your way, for free. https://mixxx.org/.

MOD. 2017. MOD Duo: the definitive stompbox. https://moddevices.com/pages/mod-duo.

OpenAV. 2017. Ctlra: a library providing support for controllers, designed to integrate hardware and software. https://github.com/openAVproductions/openAV-Ctlra.

OpenKinect Community. 2017. Open source libraries that enable the Kinect to be used with Windows, Linux, and Mac. https://openkinect.org.

Hanz Petrov. 2017. Introduction to the Ableton Framework classes. http://remotescripts.blogspot.com/2010/03/introduction-to-framework-classes.html.

Neale Pickett. 2017. HDJD: Hercules DJ controller driver for Linux. https://github.com/nealey/hdjd.

David Robillard. 2017. Pugl: a minimal portable API for OpenGL GUIs. https://drobilla.net/software/pugl.

Donnie Smith. 2007. CWiid: a collection of Linux tools written in C for interfacing to the Nintendo Wiimote. http://abstrakraft.org/cwiid/.


Binaural Floss - Exploring Media, Immersion, Technology

Martin RUMORI
Institute of Electronic Music and Acoustics (IEM)
University of Music and Performing Arts Graz
Inffeldgasse 10/3, 8010 Graz, Austria
rumori@iem.at

Abstract

Technology for binaural audio, that is, relating two audio signals to the psychophysical properties of the human hearing apparatus, is capable of recording, synthesising and reproducing the spatial information of an auditory environment comprising an immersive quality. While current scholarly research on binaural rendering and reproduction techniques for personal, mobile and interactive audio augmented environments is well advanced, their grounding with respect to the aesthetic experience in an integral listening act is not. Based on the case study of an intermedia installation, Parisflâneur, an attempt towards the exploration and reflection of binaural media properties is made. Here, a special emphasis is put on the role of FLOSS tools in an arts-based research context.

Keywords

binaural audio, immersion, floss tools, intermedia art, field recordings

1 Introduction

Binaural audio means to relate a pair of audio signals to the psychophysical properties of the human hearing apparatus; that is, the signals are regarded as so-called ear signals. Binaural audio is among the earliest attempts at recording, reproducing and synthesising the spatial information of an auditory scene: by dummy head microphones, appropriate signal processing, and by presenting the binaural signal pair isolated from each other to the left and right ear, respectively, usually via headphones. Nowadays, in view of ubiquitous headphone use and the advent of widespread three-dimensional video projection, binaural technology constantly gains significance, and so does research on the optimal rendering and projection of personal, mobile and interactive audio augmented environments.

When it comes to the creation of such environments, optimisation targets become much less clear. Questions of immersion, perception and cognition arise as components of an integral aesthetic experience. Methods in scholarly research usually segment complex processes such that, for instance, certain psychoacoustic parameters are isolated for separate investigation. The results of listening tests according to such methods often cannot be generalised to a complex listening process that involves musical or anecdotal aspects of the sound material, cognitive contributions, or previous experiences of the listener, to name just a few factors.

Obviously, this paper cannot provide solutions or answers. What I am going to present is a personal attempt at approaching theoretical, aesthetic and engineering reflections along the development of an artistic case study, Parisflâneur, which is work in progress.

In the next section, I will describe the case study from a phenomenological point of view, that is, how it appears to the visitor of an imagined exhibition. The description will be followed by a detailed discussion of technical implementation decisions in close relation to aesthetic reflections on conditions of the media involved. A special emphasis will be put on the role of Free and Libre Open Source Software (FLOSS) in the described process.

2 Parisflâneur: the visitor's experience

Parisflâneur is a sound installation that explores the relations of binaural recording and binaural rendering of a virtual scene by providing a reactive, playful environment.

From the outside, the appearance of Parisflâneur is quite reduced: it does not consist of much more than a pair of headphones and an empty area in space of about twenty to fourty square metres. The visitor is invited to put on the headphones and explore the installation solely by listening and freely moving in the area, whose boundaries are usually marked on the floor.

Both the position and orientation of the headphones are tracked, which to date requires optical multi-camera tracking, given the required latency limits and the relatively large tracking volume. That means that a tracking target, a rigid body of four or five reflective balls, is a quite noticeable part mounted on top of the headphones.¹ Additionally, in most practical installations of Parisflâneur the headphones are cabled, as no satisfying wireless solution with respect to transmission quality and robustness, low latency and signal dynamics (i. e., no audio compression) was available so far. This fact is mentioned as it potentially interferes with the visitor's mobility (see [Rumori, 2017]).

[Figure 1: Visitor exploring Parisflâneur.]

When the listener enters the installation, he is presented a virtual auditory scene, which can be navigated. Urban and rural situations such as a street, pedestrian area, or park are recognisable by typical sounds like cars, footsteps, voices, crickets, an aeroplane or trains. They appear to come from different directions around the listener. When walking around guided by listening, it turns out that each of the sound situations is fixed at a certain location in space. Their positions may be found by bodily movement: approaching, turning towards and away from the sounds. They react by loudness attenuation and filtering on increasing distance and directional changes relative to the listener's head, compensating this movement and thus resulting in a perceived steady configuration inscribed into the surrounding space. When the listener reaches exactly the same location as a sound situation, it appears to reside inside his head. This auditory effect is a common experience when listening to speaker-based stereophonic signals on headphones. In total, there are seven such sound spots representing different everyday situations in Parisflâneur.

When the location of a sound situation has been found, the listener may "enter" it by performing a ducking gesture, that is, by bending down such that the head goes well below the usual standing or walking height and subsequently raising the head again at the found location. This procedure is communicated to the visitor beforehand using the metaphor of tracing "sonic hats" in space which can be "put on" and "taken off."

Entering a sound situation yields a substantial change in the audio listened to. The virtual sound scenery composed of multiple anecdotal situations gradually disappears, except for the single sound being entered. The remaining one is no longer represented by a single spot but opens towards a rich, expanded auditory scene of its own that immerses the listener. Technically, the rendered binaural signal is replaced by a static binaural recording, which also serves as the basis for the sound source in the virtual scene. As the recording is static, it does not any longer respond to the listener's movements but is attached to his head, as known from the common listening experience with headphones. In terms of the above-mentioned metaphor, the "sonic hat" that has been "put on" is now "carried around."

The entered sound situation may be left by performing the ducking gesture once more: by bending down and coming up again from underneath the sound spot, thus "taking off" the "sonic hat" and leaving it in place.

¹A promising alternative is presented by the Lighthouse system developed for the HTC Vive goggles, to be released soon as an independent tracking solution. It shall provide a nearly comparable performance to camera-based systems by OptiTrack or Vicon at a much lower cost and setup complexity, cf. http://www.roadtovr.com/valve-sells-base-stations-directly-lower-barrier-steamvr-tracking-development/ (last retrieved February 27, 2017).

The binaural recording crossfades into the virtual scene again, comprising all of the seven sound situations, each represented by a single point in space.

When a sound situation has been left, it remains at the location in space where it was dropped. That means, when the listener moves with a "sonic hat" currently "put on", the spatial configuration of the virtual scene is rearranged. There is no immediate audible feedback hinting at this change, as at this moment solely the entered situation is presented, in its non-reactive form. Only after having re-entered the virtual scene does the reconfiguration become audible.

Tracing, entering, leaving and rearranging everyday sound situations shall allow for a playful exploration of the sound material and an associative recombination of narratives in the sense of anecdotal music as coined by Luc Ferrari. Along with the perceptual differences of rendered and reactive audio on the one hand and recorded and static material on the other, Parisflâneur is at the same time a study of the properties and conditions of so-called immersive spatial media.

3 Media, software, technology

Several kinds of media technology are involved in the realisation of Parisflâneur, among them audio recording and reproduction over headphones, optical tracking of position and rotation, and the application logic that evaluates the tracking data and finally controls different levels of signal processing for creating the presented output. The focus in this paper is on the last-mentioned building blocks, those represented by software, which form a major part of the artistic development.

[Figure 2: Schematic setup of Parisflâneur.]

I am going to discuss the implementation with a special emphasis on the application of FLOSS tools and their relation to the artistic creation process and aesthetic aims. Like in most other intermedia artefacts, the general-purpose computer acts as a kind of meta-medium that, owing to universal digital data representation, allows for the actualisation of more specific media machines by means of software [Manovich, 2013]. It shall be stressed, though, that all the other media involved, including and especially non-digital ones, have an equally significant influence on the aesthetics of the work (for a discussion, see [Rumori, 2017]).

In the following, I will present technical considerations alongside reflections on the artistic process and on the properties of media.

3.1 Software involved

Parisflâneur is realised by combining a few software building blocks, all of them being FLOSS. The processing of the tracking data, most of the signal processing and the application logic is implemented in Supercollider². Supercollider allows for constructing modular multichannel realtime signal processing networks controlled by a general-purpose object-oriented language. Details on the binaural rendering will be presented in the following section, which will also make clear why an open and flexible framework like Supercollider is necessary for developing this installation, rather than a monolithic, optimised software package (cf. [Magnusson, 2008]).

Most binaural synthesis techniques involve a matrix of realtime convolutions, sometimes using room impulse responses of several seconds duration. In earlier versions, Parisflâneur used 24 recorded binaural room impulse responses (BRIR) of 64 k samples each; in a later version, 12 of those BRIR plus 36 free-field two-channel responses of 512 samples. For performing the convolution, Jconvolver by Fons Adriaensen is used [Adriaensen, 2006b]. It provides very efficient, low-latency, multi-threaded convolution, while matrices of any layout and of large size may be configured. Supercollider and Jconvolver are connected via the Jack Audio Connection Kit³.

²http://supercollider.github.io (last retrieved February 28, 2017)
³http://www.jackaudio.org (last retrieved February 28, 2017)
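As an illustration of this routing, here is a hedged C sketch using the JACK client API; the client and port names are assumptions, and the installation's actual connections are site-specific:

    #include <jack/jack.h>
    #include <stdio.h>

    int main(void)
    {
        jack_client_t *c = jack_client_open("patcher", JackNullOption, NULL);
        if (!c) return 1;
        /* route the rendered signal pair into the convolver */
        for (int i = 1; i <= 2; i++) {
            char src[64], dst[64];
            snprintf(src, sizeof(src), "SuperCollider:out_%d", i);
            snprintf(dst, sizeof(dst), "jconvolver:in_%d", i);
            jack_connect(c, src, dst);
        }
        jack_client_close(c);
        return 0;
    }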

The binaural room impulse responses (but not the free-field ones) used for the convolution were measured in the Cube laboratory at the Institute of Electronic Music and Acoustics (IEM) Graz [Rumori et al., 2010]. For the measurements, a customised version of Aliki, again by Fons Adriaensen, was used [Adriaensen, 2006a]. Customisations include a higher number of supported channels and some automation facilities, which were used in conjunction with Supercollider [Hollerweger and Rumori, 2013].

Editing and processing of the field recordings was performed using Ardour⁴.

3.2 Binaural rendering

At first glance, the rendering of the virtual auditory scene in Parisflâneur seems to be a standard engineering problem of moderate complexity, which is a correct assumption to a large extent. There are seven monaural point sources, not too many, each with the same trivial, that is, omnidirectional radiation pattern, to be rendered in a so far not further specified virtual space, probably not requiring a too complex underlying model. The scene should be rendered for one dynamically moving listener according to tracking data input. The sound sources are not dynamically moving, and if so, their movements are not audible at the same time, which might allow for non-realtime optimisations. Furthermore, only one source is moving at a time.

Despite its moderate technical demands, Parisflâneur is not about developing or using an optimal binaural rendering technique. In fact, the artistic reflection is targeted at the question of what "optimal" could actually mean in this context. Does it mean to model as accurately as possible the physical sound propagation starting from the emitters, the contributions of the surrounding space to the radiated sound waves, their arrival at the human head, and finally the effect of a two-channel, spaced and individually filtered pressure receiver array, our hearing apparatus? In other words: does it mean to capture the physics of an existing or imagined real-world situation and simulate it?

    This basic assumption [...], that a subject will always hear the same sound when exposed to identical sound signals, is obviously not true [...]. Yet [...] authentic reproduction is rarely required. [... S]ound material on the radio and on disk is processed in such a way as to achieve the optimal auditory effect, for instance, from an artistic point of view. [Blauert, 1997, 374]

Blauert does not elaborate on how "the optimal auditory effect" would be approached and when it is reached. For a reason: processed sound material is only one part of an integral aesthetic experience; individual perception, various levels of familiarity with certain technologies, subjective cognitive contributions and cultural differences are others. From the "artistic point of view" there is no clear optimum either: artworks open perceptual spaces for individual exploration and for interpretation. Of course, a kind of "aesthetic nucleus" can be assumed that is central to both the artist's and the recipient's reflection. There may be more or less appropriate ways of grasping and conveying it using media, but a single optimal one is unlikely to exist.

Obviously, there is no corresponding real-world situation to seven spatial field recordings reduced to monaural signals and put into a navigable virtual space. Potentially, an installation of seven loudspeakers distributed in space, each playing back one of the recordings, could come close physically, but the mere thought experiment makes evident that the artistic point would be entirely missed. Although the navigation aspect may be retained in principle, each of the sound spots would be represented by a physical object, being both an obstacle for moving in space and a hindrance for the orientation by listening due to its virtual presence. Apart from that, such an installation would lack the reactive capability of entering one of the recordings. I wrote that the envisioned "hardware" replication of the virtual scene "could come close physically" and "may be retained in principle" to indicate that the rendered scene and its physical counterpart have nothing in common in terms of sound propagation properties and reactive behaviour. There is no evidence whatsoever why the virtual scene should be designed such that its acoustical properties match those of reality. Rather, its perception and cognition, that is, its integral aesthetic experience, shall provoke an imagination that supports the further engagement with the artwork. Aesthetic experiences depend on previously made experiences. In the case of navigating an auditory environment, this relates to our spatial awareness, which to date is mostly trained by orientation in reality. Again, this does not mean that matching physical stimuli are sufficient, or the right way at all, to evoke matching auditory impressions. In Parisflâneur, probably among many other examples, it is not even desired [Rumori, 2016].

⁴http://ardour.org (last retrieved April 3, 2017)

Due to the absence of compulsory realisation schemes in an artistic context, the rendering techniques adapted for airflâneur, the sound propagation laws modelled in, and the rules of reactive behaviour applied to the virtual environment are found by experimentation and intuition. The references are not ear signals in reality but the conditions and implications of media. Here, this includes limitations of space and tracking capability, computational power and implementation feasibility, and, most importantly, the cultural technique of headphone listening and its heritage (for a discussion on the latter, see [Rumori, 2017]).

3.2.1 Virtual Ambisonics

Earlier implementations of airflâneur use a modified virtual Ambisonics approach for rendering the binaural scene [Noisternig et al., 2003]. Instead of synthesised room acoustics and free-field (i.e., anechoic) impulse responses, measured binaural room impulse responses (BRIR) are used. This way, the virtual scene is embedded in captured real-room acoustic properties rather than a simplified model. Furthermore, the BRIR were measured in the location of the work's first presentation, the Cube laboratory at IEM Graz, such that the virtual acoustics presented via headphones matched that of the surrounding real space. The idea was to provoke the notion of an overlay inscribed into the existing aural space rather than replacing it by a different one.

One disadvantage of combining the virtual Ambisonics approach with BRIR is that the proposed rendering optimisations cannot be applied unless the measured room acoustics is assumed to be fully symmetric (cf. [Noisternig et al., 2003]). More significantly, the implementation is "incorrect" in terms of communication engineering: As the BRIR were only measured for a single orientation of the dummy head, rotation in the Ambisonics domain upon tracking input results in the room acoustics being turned along with the listener while the relative source positions are correctly adjusted. The resulting misleading spatial cues may degrade localisation accuracy and externalisation (cf. [Rumori, 2017]).

The virtual Ambisonics approach has been incorporated in airflâneur using modified classes of the AmbIEM SuperCollider quark.⁵

Figure 3: Block diagram of airflâneur using the virtual Ambisonics approach. [The diagram shows sound sources (file player, live input, etc.) passing through crossfade and source-treatment stages and a distance model into Ambisonic encoders; the Ambisonic bus is rotated according to tracking data, order-weighted, decoded and convolved with HRIRs towards the binaural output; tracking data and application control data drive the rotator and the application control.]

3.2.2 Distance model

Classical Ambisonics does not encode distance information of sound sources, only direction, that is, sources are plane waves. Extensions exist to take into account the near field effect of the projection system by appropriate filters [Daniel, 2003] or to use an additional Ambisonics channel for encoding source distance [Penha, 2008]. Still, these do not include models for translating a distance vector into processing parameters like amplitude attenuation, low-pass filtering, or the ratio of direct signal to reverb energy.

Scholarly research shows that auditory distance estimation is highly dependent on the source material and cannot be reliably performed even in reality [Zahorik, 2002]. In the light of the reflections above (see section 3.2), modelling the source distance in a rendered scene is not a means of referring to reality but to the aesthetic framework of the installation.

Amplitude attenuation in airflâneur is much stronger than in reality as described by the inverse squared law. Otherwise, the seven sound situations would not be distinguishable at all by approaching one or the other, as their levels would differ too little, given the limited tracking volume and the relatively low maximum distance of sources. Similarly, low-pass filtering by air absorption would be hardly noticeable at such short distances, while in airflâneur it is used as an acoustical "magnifier" for the closer surroundings of the listener.

⁵ https://github.com/supercollider-quarks/AmbIEM (last retrieved February 28, 2017)
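A deliberately exaggerated attenuation law of this kind is easy to sketch. The parameters below are made up for illustration and are not those of the installation; physically, amplitude falls off as 1/d (about -6 dB per doubling of distance), and a steeper exponent makes nearby spots dominate within a small tracking volume:

#include <algorithm>
#include <cmath>

// Exaggerated distance attenuation: exponent 1.0 reproduces the physical
// amplitude law, larger exponents compress the audible range of a spot.
double distanceGain(double distance_m,
                    double reference_m = 1.0,  // distance with unity gain
                    double exponent   = 2.0)   // > 1.0 exaggerates rolloff
{
    const double d = std::max(distance_m, reference_m); // clamp the near field
    return std::pow(reference_m / d, exponent);
}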

In implementations of airflâneur using the virtual Ambisonics approach (see section 3.2.1), the ratio of direct and reverb signal energy is fixed by the impulse responses. It could be made variable by an implementation using two Ambisonics domains, a "dry" and a "wet" one.

3.2.3 Circular panning

More recent implementations of airflâneur drop the virtual Ambisonics approach as most of its advantages do not apply here. Different to the three-dimensional Ambisonics approach, the rendering was additionally reduced to two dimensions. Listeners in airflâneur mostly move in a plane only, while the third dimension has no orienting function. The sound spots are meant to be at ear level all the time, independent of the height of the listener. When applying the "sonic hat" metaphor (see section 2), elevation information could have a certain value, but this interaction scheme was also replaced by a different one in later versions of the installation (see [Rumori, 2017]).

Currently, airflâneur incorporates two domains of simple circular panning, implemented using SuperCollider's PanAz unit generator. One domain uses 12 channels of a measured circular loudspeaker array in a fairly reverberant room, while the second one has 36 output channels representing a ten-degree resolution of anechoic impulse responses taken from the SoundScapeRenderer project.⁶ Sources in the far field are projected using the first panning domain while the energy contribution is gradually shifted toward the second domain for closer sources. Obviously, the latter represents a stronger direct portion of the source signal.

For sources very close to the listener's head, the binaural domain in a classic understanding is left, that is, no head-related impulse responses are involved anymore. Instead, the usually undesired effects of intensity panning on headphones are exploited for provoking near-field and in-head experiences (see section 3.3.3).

3.3 Application of binaural recordings

What does it mean to represent a spatial, head-related field recording by a monaural single-point object in a virtual auditory scene? Similar to sounds stored on tape, a vinyl record or a compact disc, the recording becomes an object in terms of the environment, be it a physical carrier medium or a sound source rendered in virtual space. This is different from simply playing it back, which rarely focuses the recording media itself; rather, its properties shall be hidden behind the recorded. In airflâneur, the relation of the recording in its head-related form and its appearance as a virtual object is a central point of reflection, plus the anecdotal, that is, musical relation of several of such objects to each other by providing them for rearrangement by the listener.

While the recordings are left widely unprocessed for their binaural presentation when a sound situation is "entered," their monaural counterparts as objects in the virtual scene have to be derived from the recordings with some treatment.

3.3.1 Monaural representation

An important point is to achieve some degree of monaural compatibility in order to reduce comb filter effects, especially in the lower frequencies, when mixing both channels of a binaural recording to a single one.

A simple monaural representation would only use one channel of the binaural recording, and in fact that has been done in preliminary versions of airflâneur. Of course this results in an unbalanced spatial interpretation of the signals, as the higher frequency portions at the far side that are attenuated by the head are omitted. Nevertheless, for providing an overall impression of a field recording and its recognition in a virtual scene this solution may suffice.

A more advanced approach to monaural compatibility would be to turn the phase differences in the low frequencies into level differences. This is exactly the purpose of the so-called Blumlein Shuffler [Gerzon, 1994], "the greatest forgotten invention in audio engineering".⁷ It was patented by Alan Blumlein in 1933 for the loudspeaker reproduction of time-of-arrival stereophonic signals.

In airflâneur, the Blumlein Shuffler implementation bls1 by Fons Adriaensen is used.⁸ It provides one of the few accessible implementations, the most advanced one due to its use of carefully designed FIR filters and, to my knowledge, the only free and libre implementation.

⁶ http://spatialaudio.net/ssr/ (last retrieved February 27, 2017)
⁷ http://www.pspatialaudio.com/blumlein_delta.htm (last retrieved February 27, 2017)
⁸ http://kokkinizita.linuxaudio.org/linuxaudio/zita-bls1-doc/quickguide.html (last retrieved February 27, 2017)
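The shuffling principle can be illustrated with a deliberately crude sketch. This is not the bls1 algorithm: bls1 uses carefully designed FIR filters and a proper crossover, whereas the first-order leaky integrator and all parameter values below are invented for demonstration only. The idea is that below a crossover frequency, interchannel differences of a spaced or binaural recording are mostly time/phase differences; feeding the side (difference) channel through an integrator-like filter (-90 degrees, gain proportional to 1/f) turns those phase differences into level differences:

#include <cmath>
#include <vector>

// Toy mid/side shuffler: leaky integration of the side channel converts
// low-frequency phase differences into level differences.
void shuffleToy(std::vector<float>& left, std::vector<float>& right,
                float sampleRate,
                float crossoverHz = 700.0f,  // unity side gain around here
                float leakHz = 20.0f)        // keeps the integrator stable
{
    const float a = std::exp(-2.0f * float(M_PI) * leakHz / sampleRate);
    const float g = 2.0f * float(M_PI) * crossoverHz / sampleRate;
    float acc = 0.0f;
    for (std::size_t n = 0; n < left.size(); ++n) {
        const float mid  = 0.5f * (left[n] + right[n]);
        const float side = 0.5f * (left[n] - right[n]);
        acc = a * acc + g * side;   // ~1/f gain and -90 degrees above leakHz
        left[n]  = mid + acc;       // a real shuffler crosses back to the
        right[n] = mid - acc;       // unprocessed side channel above ~700 Hz
    }
}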

3.3.2 Frequency response

A virtual source's spectrum is likely to be distorted by rendering compared to that of the underlying binaural recording. As both instances are related to each other in the installation, the signals used as virtual sources are filtered according to experimental exploration of different spatial constellations, that is, different rendered directions and distances from the listener.

3.3.3 Transition design

The moment of transition from the virtual scene to the binaural recording and back is one of the central aesthetic experiences in airflâneur, hence the importance of its design. In the course of refining the work, transition design evolved from a simple cross-fade between the two domains, nevertheless using special overlapping curves, toward a more complex multi-stage process.

In the phenomenological description (see section 2) I stated that in-head localisation in the virtual scene is desired in order to indicate the exact position of a sound spot. Early implementations of airflâneur used the virtual Ambisonics approach for the binaural rendering of the scene (see section 3.2.1). For close sources, the energy contribution of higher Ambisonics orders is gradually reduced after encoding, which achieves a spatial widening until only the zeroth order remains when the position of the source is reached. This corresponds to an omnidirectional receiver pattern at the listener's position, hence the source's signal is projected equally from all directions in the virtual Ambisonics speaker setup. In a certain understanding, this might represent the notion of being "inside" a sound source, especially in the case of real loudspeaker reproduction and when the source is attributed a certain extension, for instance, that of the reproduction space.

For the binaural projection of airflâneur and its narrative, another approach to conveying the "inside" notion appears to be much more appropriate: the often undesired in-head localisation of loudspeaker-based stereophony or monaural signals presented on headphones. Its application means leaving the integrity of both binaural playback and binaural rendering in a strict sense of communication engineering. Rather, signals usually not considered binaural are interpreted as ear signals in order to exploit the resulting, yet uniquely binaural effects. For this reason, I do not attribute the quality "binaural" to a signal pair because of its technical properties such as the presence of interaural time or level differences but rather due to its intentional interpretation as ear signals. Furthermore, this example is a strong indication why open software systems are a precondition for pursuing the artistically motivated approach to binaural technology as described here. Most monolithic implementations, even if advanced and optimised with respect to latest research, do not allow for modelling and accessing the signal path at every level.

When experimenting with the above-mentioned Blumlein Shuffler (see section 3.3.1), I noticed that its output provides a perceptual bridge between monaural in-head localisation and binaural externalisation. Some features are retained from the originating binaural signal, allowing for a partial externalisation, while others, due to their monaural compatibility, enable panpot-like processing for achieving a variable in-head stereo width. In later implementations of airflâneur, such Blumlein-shuffled stereo signals are used as an intermediate transition phase for gradually opening the monaural in-head spot, until the listener's head is "left" by fading into the immersive binaural recording.
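As a structural illustration of such a staged transition (the installation's actual overlapping curves are not documented here, so stage boundaries and curve shapes are assumptions), equal-power sine/cosine gains for a two-stage fade — scene to shuffled in-head signal, then shuffled signal to the untouched binaural recording — could be computed as follows:

#include <cmath>

// Gains for the three layers of a two-stage, equal-power transition.
struct TransitionGains { float scene, shuffled, binaural; };

TransitionGains transitionGains(float t)  // t in [0,1]; 0 = scene, 1 = entered
{
    const float half = float(M_PI) * 0.5f;
    if (t < 0.5f) {                       // stage 1: scene -> shuffled signal
        const float x = t * 2.0f;
        return { std::cos(x * half), std::sin(x * half), 0.0f };
    }
    const float x = (t - 0.5f) * 2.0f;    // stage 2: shuffled -> binaural
    return { 0.0f, std::cos(x * half), std::sin(x * half) };
}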
4 Conclusion

In this paper, I presented an intermedia installation of mine called airflâneur. It takes place in auditory space which is presented binaurally via headphones. The work incorporates seven urban and rural sound situations arranged in a virtual scene that is navigable by bodily motion and orientation by listening. Upon interaction, each of the sound situations can be entered, that is, the virtual scene can be left in favour of the original artistic, binaural recording of that situation. Subsequent movements do not allow for a further navigation within the situation; instead, the virtual scene will be rearranged, which becomes audible only after having left again the artistic recording.

I described in detail the visitor's experience of the installation and realisation alternatives using FLOSS tools. By doing so, I tried to relate technical implementation details to both common approaches as suggested by scholarly research and to alternative findings driven by aesthetic reflection and artistic experimentation. One of my central arguments is that the design of virtual audio environments always has to reference the aesthetic experience and the conditions of their reception rather than explicit or implicit real-world situations.

The presentation of space by transforming media such as binaural audio technology is not a real-world experience in the sense of sound propagation directly and solely through air.

I aimed at pointing out that FLOSS tools are a precondition for artistic engineering as performed in the presented project. As any given approach or process is subject to critical reflection and potential modification, the implementations involved have to be accessible anywhere in the signal path and at any level that turns out to be appropriate. Neither would it be possible for me (and probably for any artist) to implement all the building blocks myself that require a deep access to their inner mechanisms, nor would monolithic and closed software allow for entangling artistic research, aesthetic reflection and engineering ambitions as attempted to exemplify in this paper.

5 Acknowledgements

airflâneur was initially conceived in 2008 during two short-term scientific missions at the Institute of Electronic Music and Acoustics Graz (IEM), funded by the Sonic Interaction Design European COST action (COST IC 0601, [Rocchesso, 2011]). The installation has been further developed within the Klangräume research project (2013–2015), supported by Zukunftsfonds Steiermark (fund for the future development of the region of Styria, Austria) as part of the programme Exciting Science and Social Innovation.

References

Fons Adriaensen. 2006a. Acoustical impulse response measurement with ALIKI. In Proceedings of the Linux Audio Conference, pages 9–14. ZKM.

Fons Adriaensen. 2006b. Design of a convolution engine optimised for reverb. In Proceedings of the Linux Audio Conference, pages 49–53. ZKM.

Jens Blauert. 1997. Spatial Hearing. MIT Press.

Jérôme Daniel. 2003. Spatial sound encoding including near field effect: Introducing distance coding filters and a viable, new ambisonic format. In Proceedings of the 23rd AES International Conference.

Michael Gerzon. 1994. Applications of Blumlein shuffling to stereo microphone techniques. Journal of the Audio Engineering Society, 42(6):435–453.

Florian Hollerweger and Martin Rumori. 2013. Production and application of room impulse responses for multichannel setups using FLOSS tools. In Proceedings of the Linux Audio Conference, pages 125–132. IEM.

Thor Magnusson. 2008. Expression and time: The question of strata and time management in creative practices using technology. In Aymeric Mansoux and Marloes de Valk, editors, FLOSS + Art, pages 232–247. GOTO10/OpenMute.

Lev Manovich. 2013. Software Takes Command. Number 5 in International Texts in Critical Media Aesthetics. Bloomsbury.

Markus Noisternig, Thomas Musil, Alois Sontacchi, and Robert Höldrich. 2003. 3D binaural sound reproduction using a virtual Ambisonic approach. In IEEE International Symposium on Virtual Environments, pages 174–178.

Rui Penha. 2008. Distance encoding in Ambisonics using three angular coordinates. In Proceedings of the Sound and Music Computing Conference.

Davide Rocchesso. 2011. Explorations in Sonic Interaction Design. Logos.

Martin Rumori, Florian Hollerweger, and Andrés Cabrera. 2010. Binaural room impulse responses for composition, documentation, virtual acoustics and audio augmented environments. In Proceedings of the 26th VDT International Convention, pages 670–679. VDT.

Martin Rumori. 2016. Konstruierte Räume – ästhetische Implikationen von Verfahren und Werkzeugen der Binauraltechnik. In Proceedings of the 29th VDT International Convention, pages 210–216. VDT. [German].

Martin Rumori. 2017. Space and body in sound art: Artistic explorations in binaural audio augmented environments. In Clemens Wöllner, editor, Body, Sound and Space in Music and Beyond: Multimodal Explorations, pages 235–256. Routledge. Forthcoming.

Pavel Zahorik. 2002. Auditory display of sound source distance. In Proceedings of the International Conference on Auditory Display. ICAD.

A versatile workstation for the diffusion, mixing, and post-production of spatial audio

Thibaut CARPENTIER
UMR 9912 STMS IRCAM–CNRS–UPMC
1, place Igor Stravinsky, 75004 Paris, France
[email protected]

Abstract

This paper presents a versatile workstation for the diffusion, mixing, and post-production of spatial sound. Designed as a virtual console, the tool provides a comprehensive environment for combining channel-, scene-, and object-based audio. The incoming streams are mixed in a flexible bus architecture which tightly couples sound spatialization with reverberation effects. The application supports a broad range of rendering techniques (VBAP, HOA, binaural, etc.) and it is remotely controllable via the Open Sound Control protocol.

Keywords

sound spatialization, mixing, post-production, object-based audio, Ambisonic

1 Introduction

This paper presents a port of the panoramix workstation to Linux. First, we give a brief presentation of panoramix and typical use-cases of this environment. Then, we present some recently added features and discuss the challenges involved with porting the application to the Linux OS.

Panoramix is an audio workstation that was primarily designed for the post-production of 3D audio materials. The needs and motivations for such a tool have been discussed in previous publications [Carpentier, 2016; Carpentier and Cornuau, 2016]: panoramix typically addresses the post-production of mixed music concerts¹ where the sound recording involves a large set of heterogeneous elements (close microphones, ambient miking, surround or Ambisonic microphone arrays, electronic tracks, etc). During the post-production stage, the sound engineers need tools for spatializing sonic sources (e.g., spot microphones or electronic tracks), encoding and decoding Ambisonic materials, adding artificial reverberation, combining and mixing the heterogeneous sound layers, as well as rendering, monitoring and exporting the final mix in multiple formats. Panoramix provides a unified framework covering all the required operations, and it allows to seamlessly integrate all spatialization paradigms: channel-based, scene-based, and object-based audio.

Besides post-production purposes, panoramix is also suitable for the diffusion of sound in live events since the audio engine operates in realtime and without latency.² Indeed, it has recently been used by sound engineers and computer musicians in order to control the sound spatialization for live productions at Ircam.

2 Architecture

The general architecture of the workstation has been presented in previous work [Carpentier, 2016]. In a nutshell, the panoramix signal flow consists of input tracks which are sent to busses dedicated to spatialization and reverberation effects. All busses are ultimately collected into the Master strip, which delivers the signals to the output audio driver. Each channel strip in the workstation comes with a set of specific DSP features.

One major improvement of the new version herein presented is the introduction of "parallel bussing". Namely, this means that each track can be sent to multiple busses in parallel.³ The benefit of such a parallel bussing architecture is twofold; it allows:

1) to simultaneously produce a mix in multiple formats: tracks can for instance be sent to a VBAP bus and to an Ambisonic bus; both busses are rendered in parallel, with shared settings, and it is fast and easy to switch from one to another, e.g., for A/B comparison.

2) to "hybridize" spatialization techniques: for instance, when producing binaural mixes, it is sometimes useful to combine "true" binaural synthesis (or recordings) with conventional stereophony. Adjusting the level of the two parallel busses, the sound engineer can balance between the 3D layer (with well-known binaural artifacts such as timbral coloration, front-back confusions, in-head localization, etc.) and the stereo layer (often considered as more robust and spectrally transparent). Such hybridization appeared especially useful and convincing when producing content intended for non-individual HRTF listening conditions.

¹ The practical use of the software in such a context has also been demonstrated in the above-mentioned publications, through the case study of an electro-acoustic piece by composer Olga Neuwirth.
² Only a few specific DSP treatments may induce a latency, e.g., the encoding of Eigenmike signals (discussed later in this paper). Also there is the irreducible latency of the audio I/O device.
³ The number of parallel sends is currently restricted to three busses, referred to as A/B/C. In practical mixing situations, it appeared useless to provide more than three sends although there is no technical constraint to increase this limit.
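The parallel bussing can be pictured with a small structural sketch. The class names and the mono-block simplification below are invented for illustration and do not reflect panoramix's internal API; the point is only that each track carries one independent send gain per bus:

#include <array>
#include <vector>

// One processing block of a track, with per-bus send gains (A/B/C).
struct Track {
    std::vector<float> input;
    std::array<float, 3> sendGain{};   // linear gains for busses A, B, C
};

// A bus accumulates the sends of all tracks before spatializing them.
struct Bus {
    std::vector<float> mix;

    void accumulate(const Track& t, std::size_t busIndex) {
        mix.resize(t.input.size(), 0.0f);
        const float g = t.sendGain[busIndex];
        for (std::size_t n = 0; n < t.input.size(); ++n)
            mix[n] += g * t.input[n];  // spatialization/reverb follow here
    }
};

Because the same track feeds several such busses concurrently, a VBAP bus and an Ambisonic bus, for example, can render the identical material with shared settings.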

The overall processing architecture is inspired from the Spat design [Jot and Warusfel, 1995; Jot, 1999; Carpentier et al., 2015] which tightly combines an artificial reverberation engine with a panning module. This framework relies on a simplified space-time-frequency model of room acoustics wherein the generated room effect is divided in four temporal segments (direct sound, early reflections, late reflections, and reverb tail); each segment is individually filtered and then spatialized (direct sound and early reflections are localized as point sources while the late segments are spatially diffuse).

In the first release of panoramix, only the filtering of direct sound was proposed. In the presented version, we have introduced additional filters for the early and late reflection sections, therefore extending the range of possible effects.

Figures 1 and 2 present the signal-flow graph of the tracks and busses respectively. They also exhibit how the signal processing blocks relate to the controllers exposed in the user interface (see also Figure 4 for a general view of this interface). When parallel bussing is involved, some elements of the depicted audio graph are replicated and run concurrently.

Figure 2: Anatomy of a bus: the purpose of a bus is twofold: it generates a late/diffuse reverberation tail (shared amongst multiple tracks for efficiency) and it provides control over the spatialization rendering. The lefthand side (violet frame) depicts the panning bus; the righthand side (red frame) represents the late reverb bus.

Note that the number of tracks, busses and channels per strip is not limited by the software; it is only restricted by the available computing power.

3 Main features

This section presents the main functionalities of the software, with an emphasis on newly added features. The interested reader may also refer to [Carpentier, 2016].

Figure 1: Anatomy of a track: a track is essentially used for pre-processing the incoming audio source (compression, equalization, delay, etc.) and for generating a set of early reflections that will later 1) feed the late reverb FDN and 2) be spatialized.

3.1 2D panpot

The first version of panoramix was focusing exclusively on 3D rendering approaches, namely VBAP [Pulkki, 1997], Higher Order Ambisonics (HOA) [Daniel, 2001], and binaural [Møller, 1992]. It rapidly appeared convenient to also integrate 2D techniques, as it is common practice to add horizontal-only layers even when mixing for 3D formats. A number of traditional 2D techniques have therefore been implemented (time and/or intensity panning laws such as 2D-VBAP or VBIP [Pernaux et al., 1998], etc). The workstation now offers a broad range of algorithms, being able to address arbitrary loudspeaker layouts.

3.2 Ambisonic processing

Higher Order Ambisonics (HOA) is a recording and reproduction technique that can be used to create spatial audio for circular or spherical loudspeaker arrangements. It has been supported in the workstation since its origin, and further improvements have been made, especially in the encoding and transformation modules.

3.2.1 HOA encoding

Compact spherical microphone arrays such as the Eigenmike⁴ are sometimes used for music recordings as they are able to capture natural sound fields with high spatial resolution. The signals captured by such pickup systems do not directly correspond to HOA components; an encoding stage is required. Such encoding usually necessitates to regularize the modal radial filters as they are ill-conditioned for certain frequencies. Various equalization approaches have been proposed in the literature, in particular: Tikhonov regularization [Moreau, 2006; Daniel and Moreau, 2004], soft-limiting [Bernschütz et al., 2011], and filter banks applied in the modal domain [Baumgartner et al., 2011]. There is yet no consensus about which method is the most appropriate; consequently they have all been implemented in panoramix. An adjustable maximum amplification factor is also controllable by the user.

Besides HOA recordings, it is also possible to synthesize Ambisonic virtual sources, and there is no restriction on the maximum encoding order. Note finally that panoramix supports all usual HOA normalization (N3D, N2D, SN3D, SN2D, FuMa, MaxN) and sorting (ACN, SID, Furse-Malham) schemes.

⁴ http://www.mhacoustics.com
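For illustration, a plane-wave encoder for the first-order subset of such a scheme can be written in a few lines. This sketch — not panoramix's actual implementation, which imposes no order limit — uses ACN ordering (W, Y, Z, X) with SN3D normalization; azimuth and elevation are in radians, azimuth counter-clockwise from the front:

#include <array>
#include <cmath>

// First-order Ambisonic encoding of one sample of a plane-wave source.
std::array<float, 4> encodeFirstOrder(float sample,
                                      float azimuth, float elevation)
{
    const float cosEl = std::cos(elevation);
    return {
        sample,                              // W (ACN 0)
        sample * std::sin(azimuth) * cosEl,  // Y (ACN 1)
        sample * std::sin(elevation),        // Z (ACN 2)
        sample * std::cos(azimuth) * cosEl   // X (ACN 3)
    };
}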
3.2.2 HOA manipulations

One benefit of the Ambisonic formalism is that a HOA stream can be flexibly manipulated so as to alter the spatial properties of the sound field. In addition to 3D rotations of the sound field [Daniel, 2001; Daniel, 2009], two new transformation operators have been recently integrated to the workstation:

1) a directional loudness processor [Kronlachner and Zotter, 2014] which allows to spatially emphasize certain regions of the sound field (Figure 3), and

2) a spatial blur effect [Carpentier, 2017] which reduces the resolution of an Ambisonic stream, indeed simulating fractional order representations and varying the "bluriness" of the spatial image.

These transformation operators are achieved by applying a (time and frequency independent) transformation matrix in the Ambisonic domain. The implementation is therefore very efficient, making them suitable for realtime automation.

Figure 3: HOA focalization interface: the simple user interface allows to steer one or multiple virtual beams in space; the radial axis is used to control the "selectivity" of the virtual beam (from omnidirectional to highly directional). This is especially useful in post-production contexts, either to emphasize the sound from certain directions (e.g., instruments) or to attenuate undesired regions.
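Since both operators boil down to one constant matrix, the per-sample cost is a single matrix-vector product over the HOA channels, as the following sketch (not panoramix code; the matrix contents — beam weights, order-damping factors, rotations — are computed elsewhere) shows:

#include <vector>

// Apply a time- and frequency-independent transformation matrix T (N x N)
// to one frame of N Ambisonic signals.
void transformHOA(const std::vector<std::vector<float>>& T,
                  const std::vector<float>& in,
                  std::vector<float>& out)
{
    const std::size_t N = in.size();
    out.assign(N, 0.0f);
    for (std::size_t row = 0; row < N; ++row)
        for (std::size_t col = 0; col < N; ++col)
            out[row] += T[row][col] * in[col];
}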

3.2.3 HOA decoding

A HOA bus serves as a decoder (with respect to a given loudspeaker layout) and it comes with a comprehensive set of decoding flavors including: sampling Ambisonic decoder [Daniel, 2001], mode-matching [Daniel, 2001], energy-preserving [Zotter et al., 2012], and all-round decoding [Zotter and Frank, 2012]. In addition, dual-band decoding is possible, with adjustable crossover frequency, and in-phase or max-rE optimizations can be applied in each band [Daniel, 2001].

3.3 Binaural rendering

Panoramix implements binaural synthesis for 3D rendering over headphones. It is possible to load HRTFs in the SOFA/AES-69 format [Majdak et al., 2013]. Two SOFA conventions are currently supported: "SimpleFreeFieldHRIR" for convolution with HRIRs, and "SimpleFreeFieldSOS" for filtering with HRTFs represented as second-order sections and interaural time delay.⁵

SOFA data can be either loaded from a local file or remotely accessed through the OpenDAP protocol [Carpentier, 2015a; Carpentier et al., 2014a]. The binaural bus features a user interface for rapid navigation/search through the available SOFA files (Figure 5).

Figure 5: UI for loading or downloading SOFA files. ➀ Filters for quick search. ➁ Text search field. ➂ Results matching query.

3.4 Reverberation

As mentioned in previous sections, panoramix embeds a reverberation engine that allows to generate artificial room effects during the mixing process. The reverb processor currently used is a feedback delay network (FDN) originally designed by [Jot and Chaigne, 1991]. This FDN is particularly flexible and scalable; in typical use-cases, it involves eight feedback channels and provides decay control in three frequency bands. In addition to that, there is an on-going work to further integrate convolution-based or hybrid reverberators [Carpentier et al., 2014b] in the bus architecture.
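A toy FDN in the spirit of [Jot and Chaigne, 1991] — reduced to four delay lines and a single broadband decay gain, instead of the eight channels and three-band decay control described above — could look like the following sketch; delay lengths and gains are illustrative only:

#include <array>
#include <vector>

class ToyFDN {
public:
    ToyFDN() {
        const std::array<std::size_t, 4> lengths{ 1031, 1327, 1523, 1871 };
        for (std::size_t i = 0; i < 4; ++i)
            delay_[i].assign(lengths[i], 0.0f);
    }

    float process(float in) {
        std::array<float, 4> taps{};
        for (std::size_t i = 0; i < 4; ++i)
            taps[i] = delay_[i][pos_[i] % delay_[i].size()];

        // Householder feedback matrix H = I - (2/N)*ones: mixes all lines.
        float sum = 0.0f;
        for (float t : taps) sum += t;
        const float h = 0.5f * sum;       // (2/N)*sum with N = 4

        float out = 0.0f;
        for (std::size_t i = 0; i < 4; ++i) {
            const float fb = decay_ * (taps[i] - h) + in;
            delay_[i][pos_[i] % delay_[i].size()] = fb;
            pos_[i]++;
            out += 0.25f * taps[i];
        }
        return out;
    }

private:
    std::array<std::vector<float>, 4> delay_;
    std::array<std::size_t, 4> pos_{};
    float decay_ = 0.85f;  // broadband feedback gain < 1 keeps it stable
};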
3.5 OSC communication

All parameters of the panoramix application can be remotely accessed via the Open Sound Control (OSC) protocol [Wright, 2005]. This fosters easy and efficient communication with other applications (e.g., Pd) or external devices (e.g., a head-tracker for realtime binaural rendering).

OSC communication may also be used for remote automation with a digital audio workstation (DAW) through the ToscA plugin [Carpentier, 2015b]. Note, however, that the latter has not yet been ported to the Linux platform.

A dedicated window allows to monitor the current OSC state of the panoramix engine (see ➆ in Figure 4). Also, the mixing session itself is stored to disk as a "stringified" OSC bundle (human readable and editable).

3.6 Enhanced productivity

A number of other features have been added for enhanced productivity, compared to previous versions. This includes: a large set of keyboard shortcuts (the key mapping can further be customized and stored – see ➇ in Figure 4) for handling most common tasks (create new tracks, enable/disable groups, etc.), tooltip pop-ups that present inline help tips, the possibility to split the console window in multiple windows (especially useful when using multiple screens and dealing with a high number of tracks), etc.
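Returning to the OSC interface of section 3.5: controlling the engine from another program only requires an OSC sender such as liblo. The port number and the parameter addresses below are hypothetical placeholders; the engine's real address space can be inspected in its OSC status window (➆ in Figure 4):

#include <lo/lo.h>

int main()
{
    // Hypothetical host/port of a running panoramix instance.
    lo_address target = lo_address_new("localhost", "7777");

    // Move a source on track 1 (made-up address and argument layout).
    lo_send(target, "/track/1/xyz", "fff", 1.0f, 2.0f, 0.0f);

    // Set a bus gain in dB (made-up address).
    lo_send(target, "/bus/1/gain", "f", -6.0f);

    lo_address_free(target);
    return 0;
}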

4 Software aspects and Linux port

Panoramix was originally developed as a set of two Max/MSP⁶ externals (panoramix~ for the DSP rendering and panoramix for the GUI controller) and released in the form of a Max standalone application for macOS and Windows.

The DSP code is written in C++. It is OS-independent, host-independent (i.e. it does not rely on Max/MSP) and highly optimized, extensively using vectorized SIMD instructions and high-performance functions from the Intel Integrated Performance Primitives.⁷ The application can easily handle dozens or even hundreds of tracks on a modern computer. The GUI component, also written in C++, is built with the Juce⁸ framework which facilitates cross-platform development and provides a large set of useful widgets.

For the Linux environment, it was first envisioned to port the Max externals to Pure Data (Pd) [Puckette, 1997]. Porting the DSP engine is straightforward as the Max and Pd APIs are relatively similar in this regard. Porting the GUI object, however, was problematic: Pd uses Tcl/Tk as its windowing system, and to the best of the author's knowledge there is no easy way to embed GUI components developed with other frameworks (such as Juce or Qt) in the Tk engine. As an alternative, it was decided to create an autonomous application, handling both the GUI and the audio engine (i.e. an "AudioAppComponent" in Juce's dialect). The application thus operates independently of any host engine (Pd or Max) and it processes the audio directly to/from the audio devices. Furthermore, it is compatible with the Jack Audio Connection Kit,⁹ which makes it pluggable with potentially any audio application. In typical use-cases, a digital audio workstation such as Ardour¹⁰ is used to send audio streams to the panoramix processor. The processed buffers may be re-routed to the DAW, e.g., for bouncing, or directly played back through the output device (see Figure 6).

Figure 4: panoramix running in Ubuntu with the Jack server (QJackCtl) as audio pilot. ➀ Input tracks in the mixing console. ➁ Spatialization and reverberation busses. ➂ Geometrical representation of the sound scene. ➃ Parametric equalizer. ➄ Group management. ➅ Jack server and inter-application connections. ➆ Status window: allows to inspect the current state of the engine and all parameters exposed to OSC messaging. ➇ Shortcut window: allows to edit the key mappings. ➈ OSC setup window: configure input and output ports for remote communication.

Figure 6: Typical workflow.

⁵ See www.sofaconventions.org for further details on SOFA conventions.
⁶ http://www.cycling74.com
⁷ http://software.intel.com
⁸ http://juce.com/
⁹ http://www.jackaudio.org
¹⁰ http://www.ardour.org

5 Conclusions

This paper discussed the Linux port of an audio engine designed for the diffusion, mixing, and post-production of spatial audio. We highlighted several new features that extend the possibilities of the tool and improve productivity and user experience. Future work will mainly focus on the integration of convolution-based reverberation into the framework herein presented.

6 Acknowledgements

The author is very grateful to Clément Cornuau, Olivier Warusfel, and the sound engineering team at Ircam for their invaluable help in the conception of this tool. Thanks also to Anders Vinjar for his insightful advice and beta testing on the Linux platform.

References

Robert Baumgartner, Hannes Pomberger, and Matthias Frank. 2011. Practical implementation of radial filters for ambisonic recordings. In Proc. of the 1st International Conference on Spatial Audio (ICSA), Detmold, Germany, Nov.

Benjamin Bernschütz, Christoph Pörschmann, Sascha Spors, and Stefan Weinzierl. 2011. Soft-limiting der modalen Amplitudenverstärkung bei sphärischen Mikrofonarrays im Plane Wave Decomposition Verfahren. In Proc. of the 37th German Annual Convention on Acoustics (DAGA), Düsseldorf, Germany, March.

Thibaut Carpentier and Clément Cornuau. 2016. panoramix: station de mixage et post-production 3D. In Proc. of the Journées d'Informatique Musicale (JIM), pages 162–169, Albi, France, April.

Thibaut Carpentier, Hélène Bahu, Markus Noisternig, and Olivier Warusfel. 2014a. Measurement of a head-related transfer function database with high spatial resolution. In Proc. of the 7th EAA Forum Acusticum, Kraków, Poland, Sept.

Thibaut Carpentier, Markus Noisternig, and Olivier Warusfel. 2014b. Hybrid Reverberation Processor with Perceptual Control. In Proc. of the 17th Int. Conference on Digital Audio Effects (DAFx), pages 93–100, Erlangen, Germany, Sept.

Thibaut Carpentier, Markus Noisternig, and Olivier Warusfel. 2015. Twenty Years of Ircam Spat: Looking Back, Looking Forward. In Proc. of the 41st International Computer Music Conference (ICMC), pages 270–277, Denton, TX, USA, Sept.

Thibaut Carpentier. 2015a. Binaural synthesis with the Web Audio API. In Proc. of the 1st Web Audio Conference (WAC), Paris, France, Jan.

Thibaut Carpentier. 2015b. ToscA: An OSC Communication Plugin for Object-Oriented Spatialization Authoring. In Proc. of the 41st International Computer Music Conference (ICMC), pages 368–371, Denton, TX, USA, Sept.

Thibaut Carpentier. 2016. Panoramix: 3D mixing and post-production workstation. In Proc. of the 42nd International Computer Music Conference (ICMC), pages 122–127, Utrecht, Netherlands, Sept.

Thibaut Carpentier. 2017. Ambisonic spatial blur. In Proc. of the 142nd Convention of the Audio Engineering Society (AES), Berlin, Germany, May.

Jérôme Daniel and Sébastien Moreau. 2004. Further Study of Sound Field Coding with Higher Order Ambisonics. In Proc. of the 116th Convention of the Audio Engineering Society (AES), Berlin, Germany, May.

Jérôme Daniel. 2001. Représentation de champs acoustiques, application à la transmission et à la reproduction de scènes sonores complexes dans un contexte multimédia. Ph.D. thesis, Université de Paris VI.

Jérôme Daniel. 2009. Evolving Views on HOA: From Technological to Pragmatic Concerns. In Proc. of the 1st Ambisonics Symposium, Graz, Austria, June.

Jean-Marc Jot and Antoine Chaigne. 1991. Digital delay networks for designing artificial reverberators. In Proc. of the 90th Convention of the Audio Engineering Society (AES), Paris, France, Feb.

Jean-Marc Jot and Olivier Warusfel. 1995. A Real-Time Spatial Sound Processor for Music and Virtual Reality Applications. In Proc. of the International Computer Music Conference (ICMC), Banff, Canada.

Jean-Marc Jot. 1999. Real-time spatial processing of sounds for music, multimedia and interactive human-computer interfaces. ACM Multimedia Systems Journal (Special issue on Audio and Multimedia), 7(1):55–69.

Matthias Kronlachner and Franz Zotter. 2014. Spatial transformations for the enhancement of Ambisonic recordings. In Proc. of the 2nd International Conference on Spatial Audio (ICSA), Erlangen, Germany, Feb.

Piotr Majdak, Yukio Iwaya, Thibaut Carpentier, Rozenn Nicol, Matthieu Parmentier, Agnieszka Roginska, Yôiti Suzuki, Kanji Watanabe, Hagen Wierstorf, Harald Ziegelwanger, and Markus Noisternig. 2013. Spatially Oriented Format for Acoustics: A Data Exchange Format Representing Head-Related Transfer Functions. In Proc. of the 134th Convention of the Audio Engineering Society (AES), Roma, Italy, May 4–7.

Henrik Møller. 1992. Fundamentals of binaural technology. Applied Acoustics, 36:171–218.

Sébastien Moreau. 2006. Étude et réalisation d'outils avancés d'encodage spatial pour la technique de spatialisation sonore Higher Order Ambisonics : microphone 3D et contrôle de distance. Ph.D. thesis, Université du Maine.

Jean-Marie Pernaux, Patrick Boussard, and Jean-Marc Jot. 1998. Virtual Sound Source Positioning and Mixing in 5.1: Implementation on the Real-Time System Genesis. In Proc. of the Digital Audio Effects Conference (DAFx), Barcelona, Spain, Nov.

Miller Puckette. 1997. Pure Data. In Proc. of the International Computer Music Conference (ICMC), pages 224–227, Thessaloniki, Greece.

Ville Pulkki. 1997. Virtual Sound Source Positioning Using Vector Base Amplitude Panning. Journal of the Audio Engineering Society, 45(6):456–466, June.

Matthew Wright. 2005. Open Sound Control: an enabling technology for musical networking. Organised Sound, 10(3):193–200, Dec.

Franz Zotter and Matthias Frank. 2012. All-round ambisonic panning and decoding. Journal of the Audio Engineering Society, 60(10):807–820.

Franz Zotter, Hannes Pomberger, and Markus Noisternig. 2012. Energy-preserving ambisonic decoding. Acta Acustica united with Acustica, 98:37–47.


Teaching Sound Synthesis in C/C++ on the Raspberry Pi

Henrik von Coler and David Runge
Audio Communication Group, TU Berlin
[email protected]
[email protected]

Abstract

For a sound synthesis programming class in C/C++, a Raspberry Pi 3 was used as runtime and development system. The embedded system was equipped with an Arch Linux ARM, a collection of libraries for sound processing and interfacing, as well as with basic examples. All material used in and created for the class is freely available in public git repositories. After a unit of theory on sound synthesis and Linux system basics, students worked on projects in groups. This paper is a progress report, pointing out benefits and drawbacks after the first run of the seminar. The concept delivered a system with acceptable capabilities and latencies at a low price. Usability and robustness, however, need to be improved in future attempts.

Keywords

Jack, Raspberry Pi, C/C++ Programming, Education, Sound Synthesis, MIDI, OSC, Arch Linux

1 Introduction

The goal of the seminar outlined in this paper was to enable students with different backgrounds and programming skills the development of standalone real-time sound synthesis projects. Such a class on the programming of sound synthesis algorithms is, among other things, defined by the desired level of depth in signal processing. It may convey an application-oriented overview or a closer look at algorithms on a sample-wise signal processing level, as in this case. Based on this decision, the choice of tools, respectively the programming environment, should be made.

Script languages like Matlab or Python are widely used among students and offer a comfortable environment, especially for students without a background in computer science. They are well suited for teaching fundamentals and theoretical aspects of signal processing due to advanced possibilities of debugging and visualisation. Although real-time capabilities can be added, they are not considered for exploring applied sound synthesis algorithms in this class.

In a more application-based context, graphical programming environments like Pure Data (Pd), MAX MSP or others would be the first choice. They allow rapid progress and intermediate results. However, they are by nature not the best platform to enhance the knowledge of sound synthesis algorithms on a sample-wise level.

C/C++ delivers a reasonable compromise between low level access and the comfort of using available libraries for easy interfacing with hardware. For using C/C++ in the development of sound synthesis software, a software development kit (SDK) or application programming interface (API) is needed in order to offer access to the audio hardware.

Digital audio workstations (DAW) use plugins programmed with SDKs and APIs like Steinberg's VST, Apple's Audio Units, Digidesign's RTAS, the Linux Audio Developer's Simple Plugin API (LADSPA) and its successor LV2, which offer quick access to developing audio synthesis and processing units. A plugin-host is necessary to run the resulting programs and the structure is predefined by the chosen platform. The JUCE framework [34] offers the possibility of developing cross platform applications with many builtin features. Build targets can be audio-plugins for different systems and standalone applications, including Jack clients, which would present an alternative to the chosen approach.

Another possibility is the programming of Pd externals in the C programming language with the advantage of quick merging of the self-programmed components with existing Pd internals. FAUST [6] also provides means for creating various types of audio plugins and standalone applications. Due to a lack in functional programming background it was not chosen.

For various reasons we settled for the Jack API [18] to develop command line programs on a Linux system.

• Jack clients are used in the research on binaural synthesis and sound field synthesis, for example by the WONDER interface [15] or the SoundScape Renderer [14]. Results of the projects might thus be potentially integrated into existing contexts.

• Jack clients offer quick connection to other clients, making them as modular as audio-plugins in a DAW.

• The Jack API is easy to use, even for beginners. Once the main processing function is understood, students can immediately start inserting their own code.

• The omission of a graphical user interface for the application leaves more space for focusing on the audio-related problems.

• The proposed environment is independent of proprietary components.

A main objective of the seminar was to equip the students with completely identical systems for development. This avoids troubles in handling different operating systems and hardware configurations. Since no suitable computer pool was available, we aimed at providing a set of machines as cheap as possible for developing, compiling and running the applications. The students were thus provided with a Raspberry Pi 3 in groups of two. Besides being one of the cheapest development systems, it offers the advantage of resulting in a highly portable, quasi embedded system for the actual use in live applications.

The remainder of this paper is organized as follows: Section 2 introduces the used hard- and software, as well as the infrastructure. Section 3 presents the concept of the seminar. Section 4 briefly summarizes the experiences and evaluates the concept.

2 Technical Outline

2.1 Hardware

A Raspberry Pi 3 was used as development and runtime system. The most recent version at that time was equipped with a 1.2 GHz 64-bit quad-core ARMv8 CPU, 802.11n Wireless LAN, a Bluetooth adapter, 1 GB RAM, 4 USB ports, 40 GPIO pins and various other features [11].

Unfortunately the on-board audio interface of the Raspberry Pi could not be configured for real-time audio applications. It does not feature an input, and the Jack server could only be started with high latencies. After trying several interfaces, a Renkforce USB audio adapter was chosen, since it delivered an acceptable performance at a price of 10 €. The Jack server did perform better with other interfaces, yet at a higher price and with a larger housing.

Students were equipped with MIDI interfaces from the stock of the research group and private devices. The complete cost for one system, including the Raspberry Pi 3 with housing, SD card, power adapter and the audio interface, was thus kept at about 70 € (vs. the integrated low-latency platform Bela [5], which ranks at around 120 € per unit).

2.2 Operating System

First tests on the embedded system involved Raspbian [12], as it is highly integrated and has a graphical environment preinstalled, which eases the use for beginners. Integration with the libraries used for the course proved to be more complicated however, as they needed to be added to the software repository for the examples.

A more holistic approach, integrating the operating system, was aimed at, in order to leave as few dependencies on the students' side as possible and not having to deal with the hosting of a custom repository of packages for Raspbian. Due to previous experience with low latency setups using Arch Linux [7], Arch Linux ARM [2] was chosen. With its package manager pacman [9] and the Arch Build System [1], an easy system-wide integration of all used libraries and software was achieved by providing them as preinstalled packages (with them either being available in the standard repositories, or the Arch User Repository [3]).
Using Arch Linux ARM, it was also possible to guarantee a systemd [13] based startup of the needed components, which is further described in Section 2.5. At the time of preparation for the course, the 64-bit variant (AArch64), using the mainline kernel, was not available yet. At the time of writing it is still considered experimental, as some vendor libraries are not yet available for it. Instead the ARMv7 variant of the installation image was used, which is the default for the Raspberry Pi 2.

2.3 Libraries

All libraries installed on the system are listed in Tab. 1. For communicating with the audio hardware, the Jack API was installed. The jackcpp framework adds a C++ interface to Jack. Additional libraries allowed the handling of audio file formats, ALSA MIDI interfacing, Open Sound Control, configuration files and Fast Fourier Transforms.

Table 1: Libraries installed on the development system

  Library    Ref.   Purpose
  jack2      [18]   Jack audio API
  jackcpp    [26]   C++ wrapper for jack2
  sndfile    [19]   Read and write audio files
  rtmidi     [32]   Connect to MIDI devices
  liblo      [23]   OSC support
  yaml-cpp   [20]   Configuration files
  fftw3      [22]   Fourier transform
  boost      [4]    Various applications

2.4 Image

The image for the system is available for download¹ and can be asked for by mailing to the authors in case of future unavailability. Installation of the image follows the standard procedure of an Arch Linux ARM installation for the Raspberry Pi 3, using the blockwise copy tool dd, which is documented in the course's git repository [31].

2.5 System Settings

The most important goal was to achieve a round-trip latency below 10 ms. This would not be sufficient for demanding real-time audio applications in general, but it is for teaching purposes. With the hardware described in Section 2.1, stable Jack server command line options were evaluated, leading to a round-trip latency of 2.9 ms:

/usr/bin/jackd -R \
  -p 512 \
  -d alsa \
  -d hw:Device \
  -n 2 \
  -p 64 \
  -r 44100

As these settings were not realizable with the internal audio card, the snd-bcm2835 module - the driver in use for it - was blacklisted using /etc/modprobe.d/*, to not use the sound device at all.

For automatic start of the low-latency audio server and its clients, systemd user services were introduced that follow a user-session based setup. The session of the main system user is started automatically as per systemd's user@1000.service. This is achieved by enabling the linger status of said user with the help of loginctl [8], which activates its session during boot and starts its enabled services.

A specialized systemd user service [30] allows Jack's startup with an elevated CPU scheduler without the use of dbus [21]. The students' projects could be automatically started as user services that rely on the audio server being started, by enabling services such as this example:

[Unit]
Description=Example project
After=jack@rpi-usb-44100.service
[Service]
ExecStart=/path/to/executable \
  parameter1 \
  parameter2
Restart=on-failure
[Install]
WantedBy=default.target

2.6 Infrastructure

For allowing all students in the class access via SSH, a WiFi network was set up, providing fixed IP addresses for all Raspberry Pis. Network performance proved to be insufficient to handle all participants simultaneously, though. Thus, additional parallel networks were installed.

For home use, students were instructed to provide a local network with their laptops, or to use their home network over WiFi or cable. Depending on their operating system, this was more or less complicated.

¹ https://www2.ak.tu-berlin.de/~drunge/klangsynthese
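To make the outlined setup concrete, the following is a minimal pass-through Jack client of the kind the course examples start from — a sketch, not the actual course material. Students insert their sample-wise synthesis or processing code into the process callback:

#include <jack/jack.h>
#include <cstdlib>
#include <unistd.h>

jack_port_t *in_port, *out_port;

// Called by the Jack server for every block of nframes samples.
int process(jack_nframes_t nframes, void*)
{
    auto* in  = static_cast<jack_default_audio_sample_t*>(
                    jack_port_get_buffer(in_port, nframes));
    auto* out = static_cast<jack_default_audio_sample_t*>(
                    jack_port_get_buffer(out_port, nframes));
    for (jack_nframes_t n = 0; n < nframes; ++n)
        out[n] = in[n];              // <- replace with your own DSP
    return 0;
}

int main()
{
    jack_client_t* client =
        jack_client_open("pass_through", JackNullOption, nullptr);
    if (!client) return EXIT_FAILURE;

    jack_set_process_callback(client, process, nullptr);
    in_port  = jack_port_register(client, "in",  JACK_DEFAULT_AUDIO_TYPE,
                                  JackPortIsInput, 0);
    out_port = jack_port_register(client, "out", JACK_DEFAULT_AUDIO_TYPE,
                                  JackPortIsOutput, 0);
    jack_activate(client);

    while (true) sleep(1);           // audio runs in the Jack thread
}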
3 The Seminar

The seminar was addressed to graduate students with different backgrounds, such as computer science, electrical engineering, acoustics and others. One teacher and one teaching assistant were involved in the planning, preparation and execution of the classes.

The course was divided into a theoretical and a practical part. In the beginning, theory and basics were taught in mixed sessions. The final third of the semester was dedicated to supervised project work.

3.1 System Introduction

Students were introduced to the system by giving an overview of the tools needed and presenting the libraries with their interfaces. Users of Linux, Mac and Windows operating systems took part in the class. Since many students lacked basic Linux skills, first steps included using Secure Shell (SSH) to access the devices, which proved to be hard for attendees without any knowledge of the command-line interface. Windows users were aided to install and use PuTTY [10] for the purpose of connecting, as there is no native SSH client. After two sessions, each group was able to reach the Raspberry Pi in class, as well as at home.

3.2 Sound Synthesis Theory

Sound synthesis approaches were introduced from an algorithmic point of view, showing examples of commercially available implementations, also regarding their impact on music production and popular culture. Students were provided with ready-to-run examples from the course repository for some synthesis approaches, as well as with tasks for extending these.

Important fundamental literature was provided by Zölzer [35], Pirkle [27] and Roads [29]. The taxonomy of synthesis algorithms proposed by Smith [33] was used to structure the outline of the class, as follows.

A section on Processed Recording dealt with sampling and sample-based approaches, like wave-table synthesis, granular synthesis, vector synthesis and concatenative synthesis.

Subtractive synthesis and analog modeling were treated as the combination of the basic units oscillators, filters and envelopes. Filters were studied more closely, considering IIR and FIR filters and design methods like the bilinear transform. A ready-to-run biquad example was included and a first-order low-pass was programmed in class.

Additive synthesis and Spectral Modeling were introduced by an analysis-resynthesis process of a violin tone in the SMS model [25], considering only the harmonic part.

Physical Modeling was treated for plucked strings, starting with the Karplus-Strong algorithm [17] and advancing to bidirectional waveguides [24].

FM Synthesis [16] was treated as a representative of abstract algorithms in the class. The concept was mainly taught by a closer look at the architecture and programming of the Yamaha DX7 synthesizer.
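As an illustration of the sample-wise level addressed in class, a minimal Karplus-Strong voice — a sketch, not the course's actual example code — can be written as follows; a noise burst circulates in a delay line whose length sets the pitch, and the two-point average acts as the loop's low-pass damping filter:

#include <cstdlib>
#include <vector>

class KarplusStrong {
public:
    KarplusStrong(float sampleRate, float freqHz)
        : buffer_(static_cast<std::size_t>(sampleRate / freqHz)), pos_(0)
    {
        for (auto& s : buffer_)   // excite the string with white noise
            s = 2.0f * std::rand() / float(RAND_MAX) - 1.0f;
    }

    // Produce one output sample and update the delay line.
    float tick()
    {
        const std::size_t next = (pos_ + 1) % buffer_.size();
        const float out = buffer_[pos_];
        buffer_[pos_] = 0.996f * 0.5f * (buffer_[pos_] + buffer_[next]);
        pos_ = next;
        return out;
    }

private:
    std::vector<float> buffer_;
    std::size_t pos_;
};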
3.3 Projects

Out of the 35 students who appeared at the first meetings, 18 worked on projects throughout the whole semester. It should be noted that (with one exception) only Linux and Mac users continued.

No restrictions were made regarding the choice of the topic, except that it should result in an executable program on the Raspberry Pi. The student projects included:

• A vector synthesis engine, allowing the mixture of different waveforms with a succeeding filter section

• A subtractive modeling synth with modular capabilities

• A physical string model, based on the Karplus-Strong algorithm, extended with a dispersion filter

• A sine-wave Speech Synthesis [28] effect, which includes a real-time FFT

• A guitar-controlled subtractive synthesizer, using the zero-crossing rate for pitch detection

• A wave-digital-filter implementation with sensor input from the GPIOs

In order to provide a more suitable platform for running embedded audio applications, one group used buildroot² to create a custom operating system.

4 Conclusions

The use of the Raspberry Pi 3 for the programming of Jack audio applications showed to be a promising approach. A system with acceptable capabilities and latencies could be provided at a low price. Accessibility and stability, however, need to be improved in future versions: a considerable amount of time was spent working on these issues in class and the progress in the projects was therefore delayed considerably. The overhead in handling Linux showed to be a major problem for some students and probably caused some people to drop out. A possible step would be to provide a set with monitor, keyboard and mouse, as this would increase accessibility. The stability of the Jack server needs to be worked on, as sometimes the hardware would not be started properly, leading to crashing Jack clients.

² https://buildroot.org

In future seminars, which are likely to be conducted after these principally positive experiences, most of the issues should be worked out, and the image, as well as the repository, will have been improved.

References

[1] Arch Build System - ArchWiki. https://wiki.archlinux.org/index.php/Arch_Build_System.
[2] Arch Linux ARM homepage. https://www.archlinuxarm.org/.
[3] Arch User Repository. https://aur.archlinux.org/.
[4] Boost C++ Libraries - Homepage. http://www.boost.org.
[5] Bela homepage. https://bela.io.
[6] FAUST - Homepage. http://faust.grame.fr/.
[7] Linux Audio Conference 2015 - Workshop: Arch Linux as a lightweight audio platform - Slides. http://lac.linuxaudio.org/2015/download/lac2015_arch_slides.pdf.
[8] loginctl man page. https://www.freedesktop.org/software/systemd/man/loginctl.
[9] Pacman homepage. https://www.archlinux.org/pacman/.
[10] PuTTY Homepage. http://www.putty.org/.
[11] Raspberry Pi Homepage. https://www.raspberrypi.org/products/raspberry-pi-3-model-b/.
[12] Raspbian homepage. https://www.raspbian.org/.
[13] systemd homepage. https://www.freedesktop.org/wiki/Software/systemd/.
[14] Jens Ahrens, Matthias Geier, and Sascha Spors. The SoundScape Renderer: A unified spatial audio reproduction framework for arbitrary rendering methods. In Audio Engineering Society Convention 124. Audio Engineering Society, 2008.
[15] Marije A.J. Baalman. Updates of the WONDER software interface for using wave field synthesis. LAC2005 Proceedings, page 69, 2005.
[16] John M. Chowning. The synthesis of complex audio spectra by means of frequency modulation. Journal of the Audio Engineering Society, 21(7):526–534, 1973.
[17] David A. Jaffe and Julius O. Smith. Extensions of the Karplus-Strong plucked-string algorithm. Computer Music Journal, 7(2):56–69, 1983.
[18] Paul Davis. JACK API. http://www.jackaudio.org/.
[19] Erik de Castro Lopo. Libsndfile. http://www.mega-nerd.com/libsndfile/.
[20] Clark C. Evans. YAML: YAML Ain't Markup Language. http://yaml.org/.
[21] Free Desktop Foundation. dbus homepage. https://wiki.freedesktop.org/www/Software/dbus/.
[22] Matteo Frigo and Steven G. Johnson. FFTW: Fastest Fourier Transform in the West. http://www.fftw.org/.
[23] Steve Harris and Stephen Sinclair. liblo Homepage: Lightweight OSC implementation. http://liblo.sourceforge.net/.
[24] Matti Karjalainen, Vesa Välimäki, and Tero Tolonen. Plucked-string models: From the Karplus-Strong algorithm to digital waveguides and beyond. Computer Music Journal, 22(3):17–32, 1998.
[25] Scott N. Levine and Julius O. Smith. A Sines+Transients+Noise audio representation for data compression and time/pitch scale modifications. Proceedings of the 105th Audio Engineering Society Convention, 1998.
[26] Alex Norman. JackCpp. http://www.x37v.info/projects/jackcpp/.
[27] Will Pirkle. Designing Plug-Ins in C++. Focal Press, 2014.
[28] Robert E. Remez, Philip E. Rubin, David B. Pisoni, Thomas D. Carrell, et al. Speech perception without traditional speech cues. Science, 212(4497):947–949, 1981.
[29] Curtis Roads. The Computer Music Tutorial. MIT Press, 1996.
[30] David Runge. uenv homepage. https://git.sleepmap.de/software/uenv.git/about/.
[31] David Runge and Henrik von Coler. AK-Klangsynthese Repository. https://gitlab.tubit.tu-berlin.de/henrikvoncoler/Klangsynthese_PI.
[32] Gary P. Scavone. RtMidi. http://www.music.mcgill.ca/~gary/rtmidi/.
[33] Julius O. Smith. Viewpoints on the History of Digital Synthesis. In Proceedings of the International Computer Music Conference, pages 1–10, 1991.
[34] Jules Storer. JUCE. https://www.juce.com/.
[35] Udo Zölzer, editor. DAFX: Digital Audio Effects. John Wiley & Sons, Inc., New York, NY, USA, 2nd edition, 2011.


Open signal processing software platform for hearing aid research (openMHA)

Tobias Herzke¹, Hendrik Kayser², Frasher Loshaj¹, Giso Grimm¹,² and Volker Hohmann¹,²

¹ HörTech gGmbH and Cluster of Excellence "Hearing4all", Marie-Curie-Str. 2, D-26129 Oldenburg, Germany
² Medizinische Physik and Cluster of Excellence "Hearing4all", Universität Oldenburg, D-26111 Oldenburg, Germany

[email protected]

Abstract

Hearing aids help hearing impaired users participate in the communication society. Development and improvement of hearing aid signal processing algorithms takes place in industry and in academic research. With openMHA, we present a development and evaluation platform that is able to execute hearing aid signal processing in real-time on standard computing hardware, with a low delay between sound input and output. We lay out the application-specific requirements and present how openMHA meets these and will be helpful in future research in the field of signal processing for hearing aids.

Keywords

Hearing aids, audio signal processing, plugin host

1 Introduction

Development of hearing aid signal processing is widely conducted by hearing aid manufacturers on proprietary systems that are not accessible to the research community and that underlie commercial constraints. Providing open tools to the hearing aid research community lowers barriers, accelerates studies with novel acoustic processing algorithms, and facilitates the translation of these advances into widespread use with hearing aids, cochlear implants, and consumer electronics devices for sub-clinical hearing support.

A software platform for the development and evaluation of hearing aid algorithms should

• offer a complete set of hearing aid signal processing reference algorithms that can be combined with newly developed algorithms to form a complete hearing aid signal processing chain,

• enable researchers to perform offline processing as well as real-time signal processing with a reliable low delay between acoustic input and output of less than 10 milliseconds, even when algorithms need significant processing power,

• provide a library for common signal processing tasks and commonly needed services in hearing aid signal processing, like support for acoustic calibration and filterbanks,

• be able to run on a wide range of hardware, from high-performance PCs, to execute bleeding-edge algorithms in real-time, to portable, power-efficient, headless, battery-powered devices, for improved testing capabilities in realistic usage scenarios and field tests.

Several open-source tools for audio signal processing that can also be used in hearing aid research exist:

Octave. Octave is actively used in hearing aid research for the development of signal processing algorithms for hearing aids. It is a suitable tool to quickly develop, change and evolve isolated algorithms as long as no real-time audio processing is required. However, Octave is unsuitable for executing hearing aid algorithms in real-time with live input and output sound signals with low delay. [Eaton et al., 2015]

NumPy/SciPy. The Scientific Computing Tools for Python enable researchers to develop signal processing algorithms. Technically, this software platform is equivalent to Octave, but to our knowledge it is currently not actively used in hearing aid research. [Jones et al., 2001]

Pure Data. Pd is a real-time signal processing platform. It features a graphical programming interface. Pd is actively used, mainly by artists, to perform signal processing of music and other data. Pd can achieve a low delay in real-time processing. In principle it would be possible to develop hearing aid signal processing algorithms in Pd and have these algorithms process audio signals in real-time. We are not aware of any hearing aid research being performed on the Pd platform and would consider it too laborious to implement modern hearing aid algorithms in the graphical programming environment. Pd can be extended with C; therefore, hearing aid algorithms could be implemented for Pd in C or C++. [Puckette, 1996]

Plugin hosts. Various plugin hosts for different plugin architectures (VST, LADSPA, LV2) exist that can load and combine algorithms in plugins to form complex signal processing chains. Most hosts can achieve a low delay in real-time audio processing. Plugins can be written in C or C++ using the plugin-architecture-specific SDK. (Using the VST SDK requires signing a license agreement.) Plugin hosts are mainly used by sound engineers and also by artists to process recorded or live music and other sounds.

Signal processing toolboxes and languages. A signal processing toolbox like the Synthesis ToolKit (STK) [Cook and Scavone, 1999] and domain-specific languages (DSL) like SuperCollider [McCartney, 2002] and Faust [Orlarey et al., 2009] provide useful signal processing primitives to ease the development of audio signal processing algorithms. We are not aware of any hearing aid research being performed using these toolboxes and DSLs.

While the dynamic programming languages Octave and Python are suitable to develop algorithms and execute them offline, their runtime environment is not suitable for real-time processing when low delay is required at high processing loads. Octave and Python do not give algorithm implementers the necessary control to prevent heap memory allocation in the signal processing path, which can cause unpredictable interruptions in the real-time processing due to priority inversion situations. Pd and plugin hosts are real-time safe themselves and allow algorithms to be implemented in C or C++. The C and C++ programming languages allow developers sufficient control to implement algorithms in a real-time safe way. However, Pd and plugin hosts do not provide services commonly needed by hearing aid signal processing developers, like calibration or an existing set of hearing aid algorithms.

[Figure 1: Structure of the openMHA. Control applications (e.g., Octave) connect to the openMHA, which contains a toolbox library "libMHAToolbox", a command line host application ("MHAhost") that acts as an openMHA plugin host and provides the configuration interface, and openMHA plugins (signal processing and IO), on top of an audio backend (Jack, File, TCP).]

The HörTech Master Hearing Aid (MHA) [Grimm et al., 2006; Grimm et al., 2009a] is an existing software platform for hearing aid algorithm development and evaluation that meets all the requirements and has been used by the hearing aid industry as well as in academic research. Until recently, it was only available as a closed-source commercial product. To enable and facilitate collaborative research efforts and comparative studies in the research community, an open-source version of the MHA software platform for real-time audio signal processing is now being developed and made available: the open Master Hearing Aid (openMHA). In February 2017, a pre-release of the openMHA has been published on GitHub under an open-source license (AGPL3) by [HörTech gGmbH and Universität Oldenburg, 2017]. This pre-release features an initial set of reference algorithms for hearing aid processing, which will be expanded in subsequent releases. Thereby, openMHA provides a growing benchmark for the development and investigation of novel algorithms on this platform in the future. With the openMHA we provide an open-source tool that is tailored to the needs of hearing aid algorithm research and that was not available before as a specialized tool in the open-source domain.

2 Structure

The openMHA can be split into four major components (see Figure 1 for an overview):

1. The openMHA command line application
2. Signal processing plugins
3. Audio input-output (IO) plugins
4. The openMHA toolbox library

The openMHA command line application acts as a plugin host. It can load signal processing plugins as well as audio input-output (IO) plugins. Additionally, it provides the command line configuration interface and a TCP/IP based configuration interface. Several IO plugins exist: for real-time signal processing, commonly the "MHAIOJack" plugin is used, which provides an interface to the Jack Audio Connection Kit (JACK) [Davis, 2003]. Other IO plugins provide audio file access or TCP/IP-based processing.

openMHA plugins provide the audio signal processing capabilities and audio signal handling. Typically, one openMHA plugin implements one specific algorithm. The complete virtual hearing aid signal processing can be achieved by a combination of several openMHA plugins.

The openMHA toolbox library "libMHAToolbox" provides reusable data structures and signal processing classes. Examples are class templates for the implementation of openMHA plugins, and container classes for audio data. Furthermore, several filter classes in the temporal or spectral domain, filter banks, and hearing aid specific classes are provided in this library.

3 openMHA Platform Services and Conventions

The openMHA platform offers some services and conventions to algorithms implemented in plugins that make it especially well suited to develop hearing aid algorithms, while still supporting general-purpose signal processing.

3.1 Audio Signal Domains

As in most other plugin hosts, the audio signal in the openMHA is processed in audio chunks. However, plugins are not restricted to propagating the audio signal as blocks of audio samples in the time domain – another option is to propagate the audio signal in the short time Fourier transform (STFT) domain, i.e. as spectra of blocks of audio signal, so that not every plugin has to perform its own STFT analysis and synthesis. Since STFT analysis and re-synthesis of acceptable audio quality always introduces an algorithmic delay, sharing STFT data is a necessity for a hearing aid signal processing platform, because the overall delay of the complete processing has to be as short as possible.

Similar to some other platforms, the openMHA also allows arbitrary data to be exchanged between plugins through a mechanism called "algorithm communication variables", or "AC vars" for short. This mechanism is commonly used to share data such as filter coefficients or filter states.

3.2 Real-Time Safe Complex Configuration Changes

Hearing aid algorithms in the openMHA can export configuration settings that may be changed by the user at run time. To ensure real-time safe signal processing, the audio processing will normally be done in a signal processing thread with real-time priority, while user interaction with configuration parameters would be performed in a configuration thread with normal priority, so that the audio processing does not get interrupted by configuration tasks. Two types of problems may occur when the user is changing parameters in such a setup:

1. The change of a simple parameter exposed to the user may cause an involved recalculation of internal runtime parameters that the algorithm actually uses in processing. The duration required to perform this recalculation may be a significant portion of (or take even longer than) the time available to process one block of audio signal. In hearing aid usage, it is not acceptable to halt audio processing for the duration that the recalculation may require.

2. If the user needs to change multiple parameters to reach a desired configuration state of an algorithm from the original configuration state, then it may not be acceptable that processing is performed while some of the parameters have already been changed while others still retain their original values. It is also not acceptable to interrupt signal processing until all pending configuration changes have been performed.

The openMHA provides a mechanism in its toolbox library to enable real-time safe configuration changes in openMHA plugins: basically, existing runtime configurations are used in the processing thread until the work of creating an updated runtime configuration has been completed in the configuration thread. In hearing aids, it is more acceptable to continue to use an outdated configuration for a few more milliseconds than to block all processing. The openMHA toolbox library provides an easy-to-use mechanism to integrate real-time safe runtime configuration updates into every plugin.
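The general pattern behind such a mechanism can be sketched as follows. This is a simplified illustration, not the openMHA API (class and member names here are hypothetical): the configuration thread builds a complete new runtime configuration object and publishes it with an atomic pointer swap, while the processing thread only ever dereferences the most recently published object.

    // Simplified sketch of real-time safe configuration updates (not the
    // openMHA API; names are hypothetical). The configuration thread
    // prepares a complete runtime configuration and publishes it
    // atomically; the real-time thread never blocks and never allocates.
    #include <atomic>
    #include <memory>
    #include <vector>

    struct RuntimeConfig {
        std::vector<float> gains;  // precomputed values used by the DSP code
    };

    class GainPlugin {
        std::atomic<const RuntimeConfig*> current_{nullptr};
        // All published configurations are kept alive here; a real
        // implementation would garbage-collect entries once the
        // processing thread can no longer reference them.
        std::vector<std::unique_ptr<RuntimeConfig>> owned_;
        void publish(std::vector<float> gains) {  // single config thread assumed
            owned_.emplace_back(new RuntimeConfig{std::move(gains)});
            current_.store(owned_.back().get(), std::memory_order_release);
        }
    public:
        explicit GainPlugin(std::vector<float> gains) { publish(std::move(gains)); }
        // Called from the configuration thread; may allocate and take time.
        void update(std::vector<float> gains) { publish(std::move(gains)); }
        // Called from the real-time thread: lock- and allocation-free.
        // The gain table is assumed non-empty.
        void process(float* buf, std::size_t n) {
            const RuntimeConfig* cfg = current_.load(std::memory_order_acquire);
            for (std::size_t i = 0; i < n; ++i)
                buf[i] *= cfg->gains[i % cfg->gains.size()];
        }
    };

If an update arrives while a block is being processed, the block simply finishes with the previous configuration, which matches the trade-off described above: a few milliseconds of outdated settings instead of an audible dropout.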

3.3 Plugins can Themselves Host Other Plugins

An openMHA plugin can itself act as a plugin host. This allows combining analysis and re-synthesis methods in a single plugin. We call plugins that can themselves load other plugins "bridge plugins" in the openMHA. When such a bridge plugin is called by the openMHA to process one block of signal, it will first perform its analysis, then invoke (as a function call) the signal processing in the loaded plugin to process the block of signal in the analysis domain, wait to receive a processed block of signal in the analysis domain back from the loaded plugin when the signal processing function call to that plugin returns, then perform the re-synthesis transform, and finally return the block of processed signal in the original domain back to the caller of the bridge plugin.

3.4 Central Calibration

The purpose of hearing aid signal processing is to enhance the sound for hearing impaired listeners. Hearing impairment generally means that people suffering from it have increased hearing thresholds, i.e. soft sounds that are audible for normal hearing listeners may be imperceptible for hearing impaired listeners. To provide accurate signal enhancement for hearing impaired people, hearing aid signal processing algorithms have to be able to determine the absolute physical sound pressure level corresponding to a digital signal given to any openMHA plugin for processing. Inside the openMHA, we achieve this with the following convention: the single-precision floating point time-domain sound signal samples, which are processed inside the openMHA plugins in blocks of short durations, have the physical pressure unit Pascal (1 Pa = 1 N/m²). With this convention in place, all plugins can determine the absolute physical sound pressure level from the sound samples that they process. A derived convention is employed in the spectral domain for STFT signals. Due to the dependency of the calibration on the hardware used, it is the responsibility of the user of the openMHA to perform calibration measurements and adapt the openMHA settings to make sure that this calibration convention is met. We provide the plugin transducers (cf. section 4.1), which can be configured to perform the necessary signal adjustments in most situations.
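Under this convention, the arithmetic needed to derive a level from a block of samples is straightforward. The following sketch (an illustration of the convention, not openMHA code) computes the sound pressure level from time-domain samples in Pascal, using the standard reference pressure of 20 µPa:

    // Sketch: sound pressure level of a block of samples that follow the
    // openMHA convention (time-domain samples in Pascal). Illustration
    // only, not openMHA code; n > 0 is assumed.
    #include <cmath>
    #include <cstddef>

    double spl_dB(const float* pa, std::size_t n) {
        double sumsq = 0.0;
        for (std::size_t i = 0; i < n; ++i)
            sumsq += double(pa[i]) * pa[i];
        double rms = std::sqrt(sumsq / n);      // RMS pressure in Pascal
        return 20.0 * std::log10(rms / 20e-6);  // dB SPL re 20 uPa
    }

As a sanity check, a signal with an RMS pressure of 1 Pa comes out at about 94 dB SPL, which is consistent with the output calibration values around 110 dB SPL full scale mentioned in section 4.1.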

4 February 2017 Pre-Release

In February 2017, HörTech and Universität Oldenburg published a pre-release of the openMHA on GitHub under an open-source license (AGPL3). This pre-release contains the openMHA command line application, the toolbox library "libMHAToolbox", an initial set of openMHA plugins and openMHA sound input/output (IO) libraries, and example configurations. The initial set of plugins and sound IO libraries was selected so that a basic research hearing aid configuration can be realized with the contained plugins, and users can process both live sounds via JACK and sound from and to files. The basic hearing aid algorithms present in the pre-release include

• an adaptive differential microphone algorithm that suppresses interfering noise from the rear hemisphere (cf. section 4.3),

• a binaural coherence filter that provides feedback suppression and dereverberation (cf. section 4.5), and

• a multi-band dynamic range compression algorithm that restores audibility of sounds for the hearing impaired user (cf. section 4.7).

Apart from the plugins that implement just these algorithms, additional supporting plugins are contained in the pre-release that are required to form a complete hearing aid implementation. The contained plugins are briefly described in the following subsections.

For real-time hearing aid processing, an input-output delay below 10 ms is required. This ensures that

• the hearing-impaired user is not confused by asynchrony between the lip movements of a conversation partner and the perceived sound,

• no echo effects are audible if the direct sound can also be perceived by the hearing aid user, and

• fewer frequencies are available for possibly annoying acoustic feedback loops [Grimm et al., 2009a].

The example configuration that combines all three example algorithms mentioned here shows an algorithmic delay of 4.4 ms. On top of this algorithmic delay, input and output of the sound through a sound card causes additional delay in the range of two to three block durations, depending on the hardware in use. The example configuration uses a block size of 64 samples at a 44100 Hz sampling rate. We have found that, e.g., with the RME Multiface II sound card and the snd-hdsp ALSA driver used by JACK, this will add 4.4 ms delay between acoustic input and output on a Linux system with a low-latency kernel and real-time priorities set up for JACK and the ALSA sound driver. This results in an overall delay of 8.8 ms for the example configuration containing the plugins described in the following, in the order of their processing.

4.1 The transducers Plugin

A device-dependent calibration is required for plugins to be able to deduce the physical signal level that is present at the hearing aid input. When connecting a microphone to a sound card and using that sound card to feed sound samples to the openMHA, these sound samples do not automatically follow the openMHA level convention outlined in section 3.4. The same is true when using sound files instead of sound cards for input and output. Different microphones have different sensitivities. Sound cards have adjustable amplification settings. Sound files may have been normalized before they were saved to disk. To be able to implement the openMHA level convention, i.e., that the numeric value of time-domain sound samples in the openMHA should reflect their sound pressure amplitude in Pascal, we need to be able to adjust for arbitrary physical-level-to-digital-level mappings in the openMHA. This is done with the help of the plugin transducers, which is the only plugin that must not rely on this convention, because it is the one plugin that has to make sure that all other plugins can rely on it. For this reason, transducers is usually loaded as the first plugin into the openMHA and will itself (i.e. as a bridge plugin, cf. section 3.3) load another openMHA plugin into the openMHA process. This other plugin receives the calibrated input signal from transducers, and it sends its processed but still calibrated output signal back to the transducers plugin to adjust for the physical outputs. transducers provides filters and gain adjustments to ensure calibration of inputs and outputs. Typical output calibration values are in the order of 110 dB SPL for a full-scale signal.

4.2 The mhachain Plugin

An mhachain plugin can itself load several other plugins in a configurable order, where each plugin processes the output signal of the previous plugin.

4.3 The Adaptive Differential Microphone (adm) Plugin

Reduced audibility of soft sounds is not the only problem that hearing impaired listeners face when communicating. Another commonly experienced problem is a reduced intelligibility of speech in noisy environments, even if the speech is loud enough to be perceived. Hearing aids therefore regularly employ signal processing algorithms to enhance the signal-to-noise ratio of speech in noisy environments. In this context, assumptions about target and noise sources play an important role, as do robustness and generalization capabilities of the method used. Adaptive differential microphones (ADM, [Elko and Pong, 1995]) aim at the preservation of a target signal while suppressing background noise. For this purpose, two general assumptions are made: the target is assumed to be present in the frontal hemisphere of the listener, while noise occurs in the rear hemisphere. ADMs work for pairs of omnidirectional microphones separated by a small distance, and combine a two-channel input into a single-channel output signal by adding up delayed and weighted versions of the input, as shown in Figure 2. In a binaural setting, two independent, bilateral ADMs are realized, each using a two-microphone pair located in a hearing aid device on one ear.

[Figure 2: Adaptive differential microphone signal flowchart. The input of the front and back microphones is combined into a single-channel output after applying a delay T and a weighting.]
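To make the delay-and-weight structure of Fig. 2 concrete, here is a schematic per-sample sketch (an illustration following the figure, not the adm plugin's actual code; the delay T is simplified to one sample, and the adaptive part, which steers the weight to minimize output power, is omitted):

    // Schematic first-order differential microphone (after Fig. 2; not
    // the adm plugin's code). Front and back omnidirectional inputs are
    // combined into forward- and backward-facing differential signals;
    // the backward signal is weighted and subtracted.
    class DifferentialMic {
        float zFront_ = 0.0f, zBack_ = 0.0f;  // one-sample delay state (delay T)
        float weight_;                         // rear-null steering weight
    public:
        explicit DifferentialMic(float weight) : weight_(weight) {}
        float process(float front, float back) {
            float fwd = front - zBack_;        // forward-facing difference
            float bwd = back - zFront_;        // backward-facing difference
            zFront_ = front;
            zBack_ = back;
            return fwd - weight_ * bwd;        // suppress the rear hemisphere
        }
    };

In the adaptive case, the weight is updated continuously so that the spatial null tracks the dominant rear-hemisphere noise source while the frontal target remains preserved.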

4.4 The overlapadd Plugin

overlapadd is one of the openMHA plugins that perform conversion between the time domain and the spectral domain as a service for algorithms that process a series of short time Fourier transform (STFT) signals. Thereby, not every openMHA plugin that processes spectral signals has to perform its own spectral analysis.

overlapadd is a bridge plugin (cf. section 3.3) and performs both the forward and the backward transform, and can load another openMHA plugin which analyses and modifies the signal while in the spectral domain. The plugin performs the standard process of collecting the input signal, windowing, zero-padding, fast Fourier transform, inverse fast Fourier transform, additional windowing, and overlap-add time signal output. It can be used in standard overlap-add (OLA) and weighted overlap-add (WOLA) contexts.

4.5 The Binaural coherence Filter Plugin

An important issue in hearing aid processing is the reduction of feedback that can occur between the hearing aid receivers (outputs) and the closely located inputs (microphones). At high output levels a sound loop can emerge, causing annoying, self-sustaining beep tones. Binaural coherence filtering, i.e., coherence-based gain control, is applied to reduce this effect and enable higher gain levels of the hearing device [Grimm et al., 2009b].

Figure 3 shows that the binaural coherence is measured between the left and the right input signals to the hearing aids and used to derive frequency-dependent gains.

[Figure 3: Coherence filter signal flowchart. Binaural coherence-based gain control is applied to the left and the right input channel in different frequency bands in the STFT domain.]

Coherence filtering also contributes to noise and reverberation reduction, as diffuse, incoherent background sounds are also reduced. A combination of the binaural coherence filtering with preceding bilateral ADMs was shown to be beneficial, i.e., it increased speech intelligibility with a binaural hearing aid setup [Baumgärtel et al., 2015].

4.6 The fftfilterbank Plugin

In the hearing impaired, the hearing loss generally varies with frequency. To restore audibility in hearing impaired listeners with amplification and compression in hearing aid signal processing, it is therefore common practice to amplify and compress the signal differently in different frequency bands, and to let the time-varying input level in the different frequency bands control the gain selection. The fftfilterbank plugin receives broadband spectra for each audio channel and divides the incoming spectra into multiple narrower frequency bands for processing by the following openMHA algorithms. The fftfilterbank provides flexibility for filter-bank design. The output frequency bands may overlap or not, with variable degrees of overlap, with customizable filter shapes, and with different frequency scales to specify the edge or center frequencies of the filters.

4.7 Hearing Loss Compensation (dc)

The dc plugin applies multi-band dynamic range compression [Grimm et al., 2015] to the signal. This operation serves two important aspects in a hearing aid: the hearing loss is compensated by defining gain rules between input and output level, and specific gain rules are also used to compensate recruitment effects that often come along with a hearing loss, i.e., a decreased range between the percept of a soft sound and the loudest sound with a still comfortable level. To compensate for this effect, soft input sounds are usually amplified with higher gains than loud sounds. The dc plugin allows specifying a gain matrix with different gains for different frequencies and input sound levels. Input sound levels in hearing aid frequency bands are commonly measured with attack-release level filters, the time constants of which can be freely configured in the dc plugin. Figure 4 shows the signal flow for dynamic compression with the dc plugin. The dc plugin also allows configuring binaural and inter-frequency interactions of the gain derivation.

[Figure 4: Dynamic compression signal flowchart. The input is split into frequency bands by a filter-bank. Before re-synthesis, an input-level dependent gain rule is applied.]
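The gain lookup at the heart of such a rule can be sketched as follows (an illustration of the gain-matrix idea described above, not the dc plugin's code): for one frequency band, a tracked input level in dB is mapped to a gain in dB by linear interpolation in a level-to-gain table.

    // Sketch of a single-band compression gain rule (illustration only,
    // not the dc plugin's code). The table is assumed to have at least
    // two entries with strictly ascending input levels.
    #include <algorithm>
    #include <cstddef>
    #include <vector>

    struct GainRule {
        std::vector<float> levels_dB;  // ascending input levels
        std::vector<float> gains_dB;   // gain applied at each level
        float gain(float level_dB) const {
            if (level_dB <= levels_dB.front()) return gains_dB.front();
            if (level_dB >= levels_dB.back())  return gains_dB.back();
            auto it = std::upper_bound(levels_dB.begin(), levels_dB.end(), level_dB);
            std::size_t i = it - levels_dB.begin();
            float w = (level_dB - levels_dB[i - 1])
                    / (levels_dB[i] - levels_dB[i - 1]);
            return gains_dB[i - 1] + w * (gains_dB[i] - gains_dB[i - 1]);
        }
    };

With a table such as {40 dB in, 30 dB gain} and {80 dB in, 10 dB gain}, soft input sounds receive more amplification than loud ones, which is exactly the recruitment compensation described above; in the dc plugin this lookup exists per frequency band, driven by the attack-release level filters.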

4.8 The combinechannels Plugin

Because the fftfilterbank splits broadband signals into frequency bands for processing by the dc plugin, these frequency bands have to be recombined into broadband channels again after dc has processed them. This is done in the combinechannels plugin. Of course, the fftfilterbank and combinechannels plugins could be combined into a single bridge plugin (cf. section 3.3), which would generally be a better implementation choice. It is not done here in order to showcase the flexibility of the openMHA platform: it is also possible to have the analysis and re-synthesis of some transform as separate plugins, and to propagate the signal from one plugin to the next inside a single mhachain plugin while the domain changes from one plugin to the next (here: few broadband channels vs. many narrow-band channels).

5 Software

openMHA is a command line application with no graphical user interface (GUI) of its own. openMHA can be configured with command line parameters, configuration files, interactively over a network connection, or by a combination of all three methods. The same text-based configuration language is used in all three methods. Special-purpose GUIs can be produced to control the openMHA over the network connection. Such GUIs can be produced in any programming language or framework that is able to connect to the openMHA over a TCP network connection. Some special-purpose GUIs exist for the closed-source MHA that also work with the openMHA, but they are not yet part of the first open-source pre-release. GUIs will be added in later releases of the openMHA.

5.1 Configuration Interface

The openMHA application itself and also its plugins are controlled through a simple, text-based configuration language. The language allows hierarchical configuration, similar to the concept of Octave and Matlab structures. The configuration language enables variable assignments, queries, and the loading and saving of configuration files. Variables of different types (integers, floating point and complex numbers, strings) and dimensions (scalars, vectors, matrices) are supported. For more details, please refer to [Grimm et al., 2006].
5.2 Plugin Development

New plugins can be developed for the openMHA by implementing a C++ class derived from a generic base class, implementing its methods, and compiling it into a shared object. Together with other helper classes provided by the MHAToolbox library, out-of-the-box support for exporting variables to the configuration interface (cf. section 5.1) and for thread-safe configuration updates (cf. section 3.2) is available.

Simple plugins will usually output the signal in the same domain (spectrum or waveform) as the input domain. It is also possible to implement domain transformations (from the time domain to spectrum or vice versa) inside a plugin, as well as to change the number of audio channels, and even the number of audio samples per block and the sampling rate (e.g. for resampling).

A detailed manual for plugin development and implementation will be provided with a near-future release.
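As an illustration of the life cycle such a plugin class goes through (the names below are hypothetical and do not reproduce the openMHA plugin API; refer to the plugin development manual mentioned above for the real interface), a minimal waveform-to-waveform plugin could look like this:

    // Illustration only: a plugin skeleton in the style described above.
    // Class and method names are hypothetical, not the openMHA API.
    #include <cstddef>

    struct SignalDims { std::size_t channels; std::size_t frames; double srate; };

    class ExampleGainPlugin /* : would derive from the host's base class */ {
        float gain_ = 1.0f;
    public:
        // called once before processing starts; domain and dimension
        // checks would go here
        void prepare(const SignalDims&) {}
        // called once per block from the real-time thread
        void process(float* interleaved, const SignalDims& d) {
            for (std::size_t i = 0; i < d.channels * d.frames; ++i)
                interleaved[i] *= gain_;
        }
        void release() {}  // called after processing stops
    };

    // The shared object would additionally export a C entry point
    // through which the host instantiates the plugin class.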

6 Conclusions

The openMHA provides the means for sustainable research on and development of hearing aid processing algorithms and assistive hearing systems. The software is further developed in the project "Open community platform for hearing aid algorithm research"; additionally, updates based on the feedback of the research community will be conducted. Future work will extend the openMHA in several directions: the set of reference algorithms will be expanded and experimental algorithms will be included. Additional hardware and operating systems will be supported, i.e., real-time runtime support for the Beaglebone Black ARM and similar platforms, as well as support for Windows operating systems. Increased usability on different user levels will be achieved by the preparation of a GUI for the pure application of the openMHA, e.g., in the context of audiological measurements, by the availability of reference manuals for the configuration, and by documentation of the implementation of plugins for the realization of own algorithms and methods and their evaluation.

The openMHA is intended to serve as a platform for extensive research and evaluations by the community. A pre-release of the software in its current version, including example configuration files as described here, can be downloaded via http://www.openmha.org.

7 Acknowledgments

The project "Open community platform for hearing aid algorithm research" is funded by the National Institutes of Health (NIH Grant 1R01DC015429-01).

References

Regina M. Baumgärtel, Martin Krawczyk-Becker, Daniel Marquardt, Christoph Völker, Hongmei Hu, Tobias Herzke, Graham Coleman, Kamil Adiloglu, Stephan M. A. Ernst, Timo Gerkmann, Simon Doclo, Birger Kollmeier, Volker Hohmann, and Mathias Dietz. 2015. Comparing Binaural Pre-processing Strategies I: Instrumental Evaluation. Trends in Hearing, 19: article No. 2331216515617916.

Perry R. Cook and Gary P. Scavone. 1999. The Synthesis ToolKit (STK). In ICMC.

Paul Davis. 2003. Jack Audio Connection Kit. http://jackaudio.org/.

John W. Eaton, David Bateman, Søren Hauberg, and Rik Wehbring. 2015. GNU Octave version 4.0.0 manual: a high-level interactive language for numerical computations. http://www.gnu.org/software/octave/doc/interpreter.

G. W. Elko and Anh-Tho Nguyen Pong. 1995. A Simple Adaptive First-order Differential Microphone. In Proceedings of the 1995 Workshop on Applications of Signal Processing to Audio and Acoustics, pages 169–172.

Giso Grimm, Tobias Herzke, Daniel Berg, and Volker Hohmann. 2006. The Master Hearing Aid: a PC-based Platform for Algorithm Development and Evaluation. Acta Acustica united with Acustica, 92:618–628.

Giso Grimm, Tobias Herzke, and Volker Hohmann. 2009a. Application of Linux Audio in Hearing Aid Research. In Linux Audio Conference 2009.

Giso Grimm, Volker Hohmann, and Birger Kollmeier. 2009b. Increase and Subjective Evaluation of Feedback Stability in Hearing Aids by a Binaural Coherence-based Noise Reduction Scheme. IEEE Transactions on Audio, Speech, and Language Processing, 17(7):1408–1419.

Giso Grimm, Tobias Herzke, Stephan Ewert, and Volker Hohmann. 2015. Implementation and Evaluation of an Experimental Hearing Aid Dynamic Range Compressor Gain Prescription. In DAGA 2015, pages 996–999.

HörTech gGmbH and Universität Oldenburg. 2017. openMHA web site on GitHub. http://www.openmha.org/.

Eric Jones, Travis Oliphant, Pearu Peterson, et al. 2001–. SciPy: Open source scientific tools for Python. http://www.scipy.org/.

James McCartney. 2002. Rethinking the computer music language: SuperCollider. Computer Music Journal, 26(4):61–68.

Yann Orlarey, Dominique Fober, and Stéphane Letz. 2009. Faust: an efficient functional approach to DSP programming. New Computational Paradigms for Computer Music, 290.

Miller Puckette. 1996–. Pure Data. https://puredata.info/.

Towards dynamic and animated music notation using INScore

Dominique Fober, Yann Orlarey and Stéphane Letz
GRAME - Centre national de création musicale
11 cours de Verdun Gensoul, 69002 Lyon, France
{fober, orlarey, letz}@grame.fr

Abstract

INScore is an environment for the design of augmented interactive music scores opened to conventional and non-conventional use of music notation. The system has been presented at LAC 2012 and has significantly evolved since, with improvements turned to dynamic and animated notation. This paper presents the latest features, notably the dynamic time model, the events system, the scripting language, the symbolic scores composition engine, the network and Web extensions, the interaction processes representation system and the set of sensor objects.

Keywords

INScore, music score, dynamic score, interaction.

1 Introduction

Contemporary music creation poses numerous challenges to music notation. Spatialized music, new instruments, gesture-based interactions, real-time and interactive scores are among the new domains now commonly explored by artists. Common music notation doesn't cover the needs of these new musical forms, and numerous research efforts and approaches have recently emerged, testifying to the maturity of the music notation domain in the light of computer tools for music notation and representation. Issues like writing spatialized music [Ellberger et al., 2015], addressing new instruments [Mays and Faber, 2014] or new interfaces [Enström et al., 2015] (to cite just a few) are now the subject of active research and proposals.

Interactive music and real-time scores are also representative of an expanding domain in the music creation field. The advent of the digital score and the maturation of computer tools for music notation and representation constitute the foundation for the development of this musical form, which is often grounded on non-traditional music representation [Smith, 2015; Hope et al., 2015] but may also use common music notation [Hoadley, 2012; Hoadley, 2014].

In order to address the notation challenges mentioned above, INScore [Fober et al., 2010] has been designed as an environment opened to non-conventional music representation (although it supports symbolic notation) and turned to real-time and interactive use [Fober et al., 2013]. It is clearly focused on music representation only and, in this way, differs from tools integrated into programming environments like Bach [Agostini and Ghisi, 2012] or MaxScore [Didkovsky and Hajdu, 2008].

INScore has already been presented at LAC 2012 [Fober et al., 2012a]. It has significantly evolved since, and this paper introduces the set of issues that have been addressed more recently. After a brief recall of the system and of the programming environment, we present the scripting language extensions and the symbolic scores composition engine that provides high-level operations to describe real-time and interactive symbolic scores composition. Next we describe how interaction processes representations can be integrated into the music score and how remote access is supported using the network and/or Web extensions. Tablet and smartphone support have led to integrating gestural interaction with a set of sensor objects that will be presented. Finally, the time model, recently extended, will be described.

2 The INScore environment

INScore is an environment to design interactive augmented music scores. It extends the music representation to arbitrary graphic objects (symbolic notation but also images, text, vectorial graphics, video, signal representations) and provides a homogeneous approach to manipulate the score components both in the graphic and time spaces.

It supports time synchronization in the graphic space, which refers to the graphic representation of the temporal relations between components of a score, via a synchronization mechanism and using mappings that express relations between time and graphic space segmentations (Fig. 1).

[Figure 1: A graphic score (Mark Applebaum's graphic score Metaphysics of Notation) is synchronized to a symbolic score. The picture in the middle is the result of the synchronization. The vertical lines express the graphic-to-graphic relationship, computed by composing the objects' common relations with the time space.]

INScore has been primarily designed to be controlled via OSC (http://opensoundcontrol.org/) messages. The format of the messages consists in an OSC address followed by a message string and 0 to n parameters (Fig. 2).

[Figure 2: INScore messages general format: an OSCAddress, followed by a message and parameters.]

Compared to object-oriented programming, the address may be viewed as an object pointer, the message string as a method name and the parameters as the method parameters. For example, the message:

/ITL/scene/score color 255 128 40 150

may be viewed as the following method call:

ITL[scene[score]]->color(255 128 40 150)

The system provides a set of messages for the graphic space control (x, y, color, space, etc.), for the time space control (date, duration, etc.), and to manage the environment. It includes two special messages:

• the set message, which operates like a constructor and takes the object type as parameter, followed by type-specific parameters (Fig. 3).

/ITL/scene/obj set txt "Hello world!";

Figure 3: A message that creates a textual object, whose type is txt, with a text as specific data.

• the get message, provided to query the system state (Fig. 4).

/ITL/scene/obj get;
-> /ITL/scene/obj set txt "Hello world!";

/ITL/scene/obj get x y;
-> /ITL/scene/obj x 0;
-> /ITL/scene/obj y 0.5;

Figure 4: Querying an object with a get message gives messages on output (prefixed with ->). These messages can be used to restore the corresponding object state.

The address space is dynamic and not limited in depth. It is hierarchically organized: the first level /ITL is used to address the application, the second one /ITL/scene to address the score, and the next ones to address the components of a score (note that scene is a default name that can be user-defined). Arbitrary hierarchies of objects are supported.
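Since the interface is plain OSC, INScore can be driven from any OSC-capable language. As a sketch, the two example messages above could be sent from C++ with the liblo library, assuming INScore listens on UDP port 7000 (the port used in the forwarding example of Fig. 18); error handling is omitted:

    // Minimal sketch: driving INScore from C++ with liblo. The port
    // number 7000 is an assumption based on the forwarding example.
    #include <lo/lo.h>

    int main() {
        lo_address inscore = lo_address_new("localhost", "7000");

        // Equivalent of Fig. 3: create a textual object.
        lo_send(inscore, "/ITL/scene/obj", "sss", "set", "txt", "Hello world!");

        // Equivalent of the color example above (RGBA values).
        lo_send(inscore, "/ITL/scene/score", "siiii", "color", 255, 128, 40, 150);

        lo_address_free(inscore);
        return 0;
    }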

3 The scripting language

The OSC messages described above have been turned into a textual version to constitute the INScore scripting language. This language has been rapidly extended to support:

• variables, which may be used to share parameters between messages (Fig. 5).

greylevel = 140;
color = $greylevel $greylevel $greylevel;
/ITL/scene/obj1 color $color;
/ITL/scene/obj2 color $color;

Figure 5: Variables may be used to share values between messages.

• message-based variables and/or parameters, which consist in querying an object to retrieve one of its attribute values (Fig. 6).

ox = $(/ITL/scene/obj get x);
/ITL/scene/obj2 x $(/ITL/scene/obj get x);

Figure 6: The output of get messages can be used by variables or as another message parameter.

• an extended OSC addressing scheme that allows sending OSC messages to an external application, for initialization or control purposes (Fig. 7).

/ITL/scene/obj set txt "Hello world!";
localhost:8000/start;

Figure 7: This script initialises a textual object and sends the /start message to an external application listening on UDP port 8000.

• JavaScript sections that can be evaluated at initialization or at any time (Fig. 8).

/ITL/scene/javascript run 'randpos("/ITL/scene/obj")';

Figure 8: The JavaScript section defines a randpos function that computes an x message with a random value, addressed to the object given as parameter. This function may be next called at initialization or at any time using the static JavaScript node embedded into each score.

• mathematical expressions (arithmetic, conditionals, etc.) that can be used for arguments computation (Fig. 9).

/ITL/scene/o x ($shift ? $x + 0.5 : $x);

Figure 9: A mathematical expression is used to compute the position of an object depending on 2 previously defined variables.

• symbolic scores composition expressions, which are described in section 4.

4 Symbolic scores composition

Rendering of symbolic music notation makes use of the Guido engine [Daudin et al., 2009]. Thus the primary music score description format is the Guido Music Notation format (GMN) [Hoos et al., 1998]. The MusicXML format [Good, 2001] is also supported, via conversion to the GMN format.

The Guido engine provides a set of operators for score-level composition [Fober et al., 2012b]. These operators consistently take two scores as arguments to produce a new score as output. They allow putting scores in sequence (seq) or in parallel (par), cutting a score in the time dimension (head, tail) or in the polyphonic dimension (top, bottom), transposing (transpose) or stretching (duration) a score, and applying the rhythm or the pitch of a score to another one (rhythm, pitch).

The INScore scripting language includes score expressions, a simple language providing score composition operations. The novelty of the proposed approach relies on the dynamic aspects of the operations, as well as on the persistence of the score expressions. A score may be composed as an arbitrary graph of score expressions and equipped with a fine control over the propagation of changes.

4.1 Score expressions

A score expression is defined as an operator followed by two scores (Fig. 10). The leading expr token is present to disambiguate parentheses in the context of INScore scripts.

score expression: expr ( operator score score )

Figure 10: Score expressions syntax.

The score arguments may be:

• a literal score description string (GMN or MusicXML formats),

• a file (GMN or MusicXML formats),

• an existing score object,

• a score expression.

An example is presented in Fig. 11.

expr( par score.gmn (seq "[c]" score))

Figure 11: A score expression that puts a score file (score.gmn) in parallel with the sequence of a literal score and an existing object (score). Note that the leading expr token can be omitted inside an expression.

4.2 Dynamic score expression trees

The score expressions language is first transformed into an internal tree representation. In a second step, this representation is evaluated to produce GMN strings as output, which are finally passed to the INScore object as specific data.

Basically, the tree is reduced using a depth-first post-order traversal and the result is stored in a cache. However, the score expressions language provides a mechanism to make arbitrary parts of a tree variable, using an ampersand (&) as prefix of an argument, which prevents the corresponding nodes from being reduced at cache level (Fig. 12).

expr( par score.gmn (seq "[c]" &score))

Figure 12: A score expression that includes a reference to a score object. Successive evaluations of the expression may produce different results, provided that the score object has changed.

The INScore events system (described in section 8.2) provides a way to automatically trigger the re-evaluation of an expression when one of its variable parts has changed. These mechanisms open the door to dynamic score composition within the INScore environment. More details about the score expressions language can be found in [Lepetit-Aimon et al., 2016].

5 Musical processes representation

INScore includes tools for the representation of musical processes within the music score. In the context of interactive music, and/or when a computer is involved in a music performance, it may provide useful information regarding the state of the musical processes running on the computer. This feedback can notably be used to guide the interaction choices of the performer.

On the INScore side, a process state is viewed as a signal. Signals are part of a score's components and can be combined into graphic signals to become first-order objects of a score. They may notably be used for the representation of a performance [Fober et al., 2012a].

Regarding musical processes representation, a signal can be connected to any attribute of an object (Fig. 13), which makes the signal variations visible (and thus the process activity) through the changes of the corresponding attributes.

/ITL/scene/signal/sig size 100;
/ITL/scene/obj set rect 0.5 0.5;
/ITL/scene/img set img 'file.png';
/ITL/scene/signal connect sig "obj:scale" "img:rotatez[0,360]";

Figure 13: A signal sig is connected to the scale attribute of an object and to the rotatez attribute of an image. Note that for the latter, the signal values (expected to be in [-1,1]) are scaled to the interval [0,360].

6 Network and Web dimensions

INScore supports the aggregation of distributed resources over the Internet, as well as the publication of a score via the HTTP and/or WebSocket protocols. In addition, a score can also be used to control a set of remote scores on the local network, using a forwarding mechanism.

6.1 Distributed score components

Most of the components of a score can be defined in a literal way or using a file. All the file-based resources can be specified as a simple file path, using an absolute or relative path, or as an HTTP url (Fig. 14).

/ITL/scene/obj1 set img 'file.png';
/ITL/scene/obj2 set img 'http://www.adomain.org/file.png';

Figure 14: File-based resources can refer to local or to remote files.

When using a relative path, an absolute path is built using the current path of the score, which may be set to an arbitrary location using the rootPath attribute of the score (Fig. 15). The current rootPath can also be set to an arbitrary HTTP url, so that further use of a relative path will result in an url (Fig. 16).

/ITL/scene rootPath '/users/me/inscore';
/ITL/scene/obj set img 'file.png';

Figure 15: The rootPath of a score is equivalent to the current directory in a shell. With this example, the system will look for the file at '/users/me/inscore/file.png'.

/ITL/scene rootPath 'http://www.adomain.org';
/ITL/scene/obj set img 'file.png';

Figure 16: The rootPath supports urls. With this example, the system will look for the file at 'http://www.adomain.org/file.png'.

This mechanism allows mixing local and remote resources in the same music score, but also expressing local and remote scores in a similar way, with just a rootPath change.

6.2 HTTPd and WebSocket objects

A music score can be published on the Internet using the HTTP or the WebSocket protocols. Specific objects can be embedded in a score in order to make this score available to remote clients (Fig. 17).

/ITL/scene/http set httpd 8000;
/ITL/scene/ws set websocket 8100 200;

Figure 17: This example creates an httpd server listening on the port 8000 and a WebSocket server listening on the port 8100 with a maximum notification rate of 200 ms.

The WebSocket server allows bi-directional communication between the server and the client. It sends notifications of score changes each time the graphic appearance of the score is modified, provided that the notification rate is lower than the maximum rate set at server creation time.

The communication scheme between a client and an INScore Web server relies on a reduced set of messages. These messages are protocol independent and are equally supported over HTTP or WebSocket:

• get: requests an image of the score.

• version: requests the current version of the score. The server answers with an integer value that is increased each time the score is modified.

• post: intended to send an INScore script to the server.

• click: intended to allow remote mouse interaction with the score.

More details are available from [Fober et al., 2015].

6.3 Messages forwarding

Message forwarding is another mechanism provided to distribute scores over a network. It operates at application and/or score level, when the forward message is sent to the application (/ITL) or to a score (/ITL/scene). The message takes a list of destination hosts, specified using a host name or an IP number suffixed with a port number. All the OSC messages may be forwarded, provided they are not filtered out (Fig. 18). The filtering strategy is based on OSC addresses and/or on INScore methods (i.e. messages addressing specific object attributes).

/ITL forward 192.168.1.255:7000;
/ITL/filter reject '/ITL/scene/javascript';

Figure 18: The application is requested to forward all messages on the INScore port (7000) to the local network, using a broadcast address. Messages addressed to the JavaScript engine are filtered out in order to only forward the result of their evaluation.

7 The sensor objects

INScore runs on the major operating systems, including Android and iOS. Tablet and smartphone support have led to integrating gestural interaction with a set of sensors (Table 1).

name            values
accelerometer   x, y, z
ambient light   light level
compass         azimuth
gyroscope       x, y, z
light           a level in lux
magnetometer    x, y, z
orientation     device orientation
proximity       a boolean value
rotation        x, y, z
tilt            x, y

Table 1: The set of sensors and their associated values.

Sensors can be viewed as objects or as signals. When created as a signal node, a sensor behaves like any signal but may provide some additional features (like calibration). When created as a score element, a sensor has no graphical appearance but provides specific sensor events and features.

All the sensors won't likely be available on a given device. In case a sensor is not supported, an error message is generated at creation request and the creation process fails.

8 The time model

The INScore time model has been recently extended to support dynamic time. Indeed, with the initial design, the time attributes of an object are fixed and don't change unless a time message (date, duration) is received, which can only be emitted from an external application or using the events mechanism. The latter (defined very early) introduced another dimension of time: the events time, which takes place when an event occurs. The events system has also been extended for more flexibility.

8.1 The musical time

Regarding the time domain, any object of a score has a date and a duration. A new tempo attribute has been added, which has the effect of moving the object in the time dimension when non-null, according to the tempo value and the absolute time flow. Let t0 be the time of the last tempo change of an object and let v be the tempo value; the object date d_t at a time t is given by a time function f:

    f(t) → d_t = d_t0 + (t − t0) × v × k,   t ≥ t0    (1)

where d_ti is the object date at time ti and k is a constant that converts absolute time into musical time. In fact, absolute time is expressed in milliseconds and the musical time unit is the whole note; with the tempo expressed in quarter notes per minute, the value of k is therefore 1/(1000 × 60 × 4).

Each object of a score has an independent tempo. The tempo value is a signed integer, which means that an object can move forward in time but backward as well.

From the implementation viewpoint, when its tempo is not null, an object sends ddate (a relative displacement in time) to itself at periodic intervals (Fig. 19).

/ITL/scene/obj tempo 60
-> /ITL/scene/obj ddate f(ri)
-> /ITL/scene/obj ddate f(ri+1)
-> /ITL/scene/obj ddate f(ri+2)
-> ...

Figure 19: A sequence of messages that activates the time of an object obj. Messages prefixed by -> are generated by the object itself. ri is the value of the absolute time elapsed between the tasks i and i − 1.

This design is consistent with the overall system design since it is entirely message based. It is thus compatible with all the INScore mechanisms, such as the forwarding system.
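Equation (1) is easy to check numerically. The following sketch (an illustration with names of our own, not INScore's internals) computes an object's date in whole notes from its tempo, assuming the units stated above:

    // Sketch of equation (1): dates in whole notes, times in
    // milliseconds, tempo in quarter notes per minute.
    #include <cstdio>

    // Conversion constant k: milliseconds to whole notes at tempo v.
    constexpr double k = 1.0 / (1000.0 * 60.0 * 4.0);

    // Date at absolute time t, given the date d0 at the time t0 of the
    // last tempo change and the tempo value v.
    double date(double d0, double t0, double t, int v) {
        return d0 + (t - t0) * v * k;
    }

    int main() {
        // At tempo 60, one second of absolute time moves the date by one
        // quarter note, i.e. 0.25 whole note.
        std::printf("%f\n", date(0.0, 0.0, 1000.0, 60));   // 0.25
        // A negative tempo moves the object backward in time.
        std::printf("%f\n", date(1.0, 0.0, 1000.0, -60));  // 0.75
        return 0;
    }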

8.2 The events system

The event-driven approach of time in INScore preceded the musical time model and has been presented in [Fober et al., 2013]. The event-based interaction process relies on messages that are associated to events and sent when the corresponding event occurs. The general format of an interaction message is described in Fig. 20.

address watch event (messages)

Figure 20: Format of an interaction message: the watch request installs a messages list associated to the event event.

Initially, the events typology was limited to classical user interface events (e.g. mouse events), extended in the time domain (see Table 2).

Graphic domain   Time domain
mouseDown        timeEnter
mouseUp          timeLeave
mouseEnter       durEnter
mouseLeave       durLeave
mouseMove

Table 2: Main INScore events in the initial versions.

This typology has been significantly extended to include:

• touch events (touchBegin, touchEnd, touchUpdate), available on touch screens and supporting multi-touch.

• any attribute of an object: modifying an object attribute may trigger the corresponding event, which carries the name of the attribute (e.g. x, y, date, etc.).

• object-specific data, i.e. data defined with a set message. The event name is newData and has been introduced for the purpose of the symbolic score composition system.

• user-defined events, which have to comply with a special naming scheme.

Any event can be triggered using the event message, followed by the event name and the event's dependent parameters. The event message may be viewed as a function call that generates OSC messages on output. This approach is particularly consistent for user events, which can take an arbitrary number of parameters; these are next available to the associated messages under the form of variables named $1...$n (Fig. 21).

/ITL/scene/obj watch MYEVENT (
    /ITL/scene/t1 set txt $1,
    /ITL/scene/t2 set txt $2 );
/ITL/scene/obj event MYEVENT
    "This text is for t1"
    "This one is for t2";

Figure 21: Definition of a user event named MYEVENT that expects 2 arguments, referenced as $1 and $2 in the body of the definition. This event is next triggered with 2 different strings as arguments.

The time dimension of the events system allows putting functions in the time space, under the form of events that trigger messages that can modify the score state and/or be addressed to external applications using the extended OSC addressing scheme (Fig. 22).

[Figure 22: Example of events placed in the time space. These events are associated to time intervals (timeEnter and timeLeave) and are triggered when entering (in red) or leaving (in blue) these intervals. The last event (e6) emits a date message that creates a loop by putting the object back at the beginning of the first interval.]

Combined with the dynamic musical time, this events system allows describing autonomous animated scores. The example in Fig. 23 shows how to describe a cursor that moves forward and backward over a score, by watching the time intervals that precede and follow a symbolic score and by inverting the tempo value.

# first clear the scene
/ITL/scene/* del;

# add a simple symbolic score
/ITL/scene/score set gmn '[c d e f g a h c2 ]';

# add a cursor synchronized to the score
/ITL/scene/cursor set ellipse 0.1 0.1;
/ITL/scene/cursor color 0 0 250;
/ITL/scene/sync cursor score syncTop;

# watch different time zones
/ITL/scene/cursor watch timeEnter 2 3
    ( /ITL/scene/cursor tempo -60 );
/ITL/scene/cursor watch timeEnter -1 0
    ( /ITL/scene/cursor tempo 60 );

# and finally start the cursor time
/ITL/scene/cursor tempo 60;

Figure 23: A cursor that moves forward and backward over a symbolic score.

9 Conclusion

INScore is an ongoing open source project (http://inscore.sf.net) that crystallizes a significant amount of research addressing the problematics of music notation and representation in regard of contemporary music creation. It is used in artistic projects, and many of the concrete experiences raised new issues that are reflected in some of the system extensions. The domain is quite recent and there are still a lot of open questions that we plan to address in future work, in particular:

• turning the scripting language into a real programming language would provide a more powerful approach to music score description. The embedded JavaScript engine may already be used for an algorithmic description of a score, but switching from one environment (INScore script) to another one (JavaScript) proved to be a bit tedious.
• extending the score components to give a time dimension to any of their attributes could open a set of new possibilities, including arbitrary representations of the passage of time.

Finally, migrating the INScore native environment to the Web is part of the current plans and should also open new perspectives, notably due to the intrinsic connectivity of Web applications.

References

Andrea Agostini and Daniele Ghisi. 2012. Bach: an environment for computer-aided composition in Max. In ICMA, editor, Proceedings of the International Computer Music Conference, pages 373–378.

C. Daudin, Dominique Fober, Stéphane Letz, and Yann Orlarey. 2009. The Guido engine – a toolbox for music scores rendering. In LAC, editor, Proceedings of the Linux Audio Conference 2009, pages 105–111.

Nick Didkovsky and Georg Hajdu. 2008. MaxScore: music notation in Max/MSP. In ICMA, editor, Proceedings of the International Computer Music Conference.

Emile Ellberger, Germán Toro-Perez, Johannes Schuett, Linda Cavaliero, and Giorgio Zoia. 2015. A paradigm for scoring spatialization notation. In Marc Battier, Jean Bresson, Pierre Couprie, Cécile Davy-Rigaux, Dominique Fober, Yann Geslin, Hugues Genevois, François Picard, and Alice Tacaille, editors, Proceedings of the First International Conference on Technologies for Music Notation and Representation – TENOR2015, pages 98–102, Paris, France. Institut de Recherche en Musicologie.

Warren Enström, Josh Dennis, Brian Lynch, and Kevin Schlei. 2015. Musical notation for multi-touch interfaces. In Edgar Berdahl and Jesse Allison, editors, Proceedings of the International Conference on New Interfaces for Musical Expression, pages 83–86, Baton Rouge, Louisiana, USA, May 31 – June 3. Louisiana State University.

D. Fober, C. Daudin, Y. Orlarey, and S. Letz. 2010. Interlude – a framework for augmented music scores. In Proceedings of the Sound and Music Computing Conference – SMC'10, pages 233–240.

Dominique Fober, Yann Orlarey, and Stéphane Letz. 2012a. INScore – an environment for the design of live music scores. In Proceedings of the Linux Audio Conference – LAC 2012, pages 47–54.

Dominique Fober, Yann Orlarey, and Stéphane Letz. 2012b. Scores level composition based on the Guido music notation. In ICMA, editor, Proceedings of the International Computer Music Conference, pages 383–386.

Dominique Fober, Stéphane Letz, Yann Orlarey, and Frederic Bevilacqua. 2013. Programming interactive music scores with INScore. In Proceedings of the Sound and Music Computing Conference – SMC'13, pages 185–190.

Dominique Fober, Guillaume Gouilloux, Yann Orlarey, and Stéphane Letz. 2015. Distributing music scores to mobile platforms and to the internet using INScore. In Proceedings of the Sound and Music Computing Conference – SMC'15, pages 229–233.

M. Good. 2001. MusicXML for Notation and Analysis. In W. B. Hewlett and E. Selfridge-Field, editors, The Virtual Score, pages 113–124. MIT Press.

Richard Hoadley. 2012. Calder's violin: real-time notation and performance through musically expressive algorithms. In ICMA, editor, Proceedings of the International Computer Music Conference, pages 188–193.

Richard Hoadley. 2014. December variation (on a theme by Earle Brown). In Proceedings of the ICMC/SMC 2014, pages 115–120.

H. Hoos, K. Hamel, K. Renz, and J. Kilian. 1998. The GUIDO Music Notation Format – a novel approach for adequately representing score-level music. In Proceedings of the International Computer Music Conference, pages 451–454. ICMA.

Cat Hope, Lindsay Vickery, Aaron Wyatt, and Stuart James. 2015. The Decibel ScorePlayer – a digital tool for reading graphic notation. In Marc Battier, Jean Bresson, Pierre Couprie, Cécile Davy-Rigaux, Dominique Fober, Yann Geslin, Hugues Genevois, François Picard, and Alice Tacaille, editors, Proceedings of the First International Conference on Technologies for Music Notation and Representation – TENOR2015, pages 58–69, Paris, France. Institut de Recherche en Musicologie.

Gabriel Lepetit-Aimon, Dominique Fober, Yann Orlarey, and Stéphane Letz. 2016. INScore expressions to compose symbolic scores. In Richard Hoadley, Chris Nash, and Dominique Fober, editors, Proceedings of the International Conference on Technologies for Music Notation and Representation – TENOR2016, pages 137–143, Cambridge, UK. Anglia Ruskin University.

Tom Mays and Francis Faber. 2014. A notation system for the Karlax controller. In Proceedings of the International Conference on New Interfaces for Musical Expression, pages 553–556, London, United Kingdom, June. Goldsmiths, University of London.

Ryan Ross Smith. 2015. An atomic approach to animated music notation. In Marc Battier, Jean Bresson, Pierre Couprie, Cécile Davy-Rigaux, Dominique Fober, Yann Geslin, Hugues Genevois, François Picard, and Alice Tacaille, editors, Proceedings of the First International Conference on Technologies for Music Notation and Representation – TENOR2015, pages 39–47, Paris, France. Institut de Recherche en Musicologie.


PlayGuru, a music tutor

Marc Groenewegen Hogeschool voor de Kunsten Utrecht Ina Boudier-Bakkerlaan 50 3582 VA Utrecht The Netherlands [email protected]

Abstract
PlayGuru is a practice tool being developed for beginning and intermediate musicians. With exercises that adapt to the musician, the intention is to help a music student to develop several playing skills and motivate them to practice in-between classes. Because all exercise interaction is entirely based on sound, the author believes PlayGuru is particularly useful for blind and visually impaired musicians. Research currently focuses on monophonic exercises. This paper is a report of the current status and ultimate goals.

Keywords
Computer-assisted music education, DSP, machine learning

1 Project objectives
The main reason for writing this paper is to bring the project to the attention of others so they can use, improve and benefit from the ideas, technology and the intended end product. Even though few user experiences can be reported at the time of writing, some intermediate results and plans for the near future are given towards the end of this paper. PlayGuru is a music tutor that operates exclusively in the sound domain. Because the focus is only on music, the author believes it is a very useful tool for beginning and intermediate musicians and particularly useful for blind and visually impaired musicians.

1.1 Practice motivation
How does an amateur musician find the motivation to pick up their instrument and play? What motivates children to practice? These questions probably have a large spectrum of answers. Let us focus on two motivational forces: personal growth and affirmation.

PlayGuru is a set of music exercises based on an example and response approach. Dialogs usually start with an example being played, and the next action is based upon the response of the musician. The example and response can also be played simultaneously, creating a sense of playing together. Those synchronised exercises give room to improvisation and exploration. The way in which PlayGuru aims to keep the user motivated is based on affirmation when the exercise is performed according to predefined objectives. It will never flag a "wrong note", as this is considered highly demotivating. Instead, it responds by making the exercise slightly easier when it finds you are struggling with the current level, until it gets to a level from which you can continue growing again. The most important affirmative motivators used are:
• increasing playing speed
• extending the phrase
• increasing complexity of the exercise

The current version contains a very basic user model, which is a starting point for a module that monitors a user's achievements and keeps track of their progress, thus supporting their personal growth. To perform user tests supporting the research, several exercises are being implemented. At the time of writing, a versatile sound-domain guitar tuner with arbitrary tuning is available, as is an exercise for remembering a melodic phrase and a riff trainer for practicing licks at high speed.

Most exercises are developed for the guitar. A great source of inspiration for guitar practice is the book by Scott Tennant [Tennant, 1995], with a focus on motor skills and automation. Because all exercises are based on pitch- and onset-detection in sound signals, adaptation for other instruments with discrete pitch and a clear onset should be reasonably straightforward. For instruments with arbitrary pitch ranges and smooth transitions, as well as the human voice, some additional provisions may be necessary.
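The "never a wrong note" adaptation described above can be sketched as a simple feedback rule. All names below are hypothetical and do not show PlayGuru's actual code; the point is only that a poor attempt eases the exercise instead of reporting an error.

#include <algorithm>

// Hypothetical sketch of the affirmative difficulty adaptation:
// a good attempt raises the challenge, a poor one gently relaxes it.
struct ExerciseLevel {
    int bpm       = 60;  // playing speed
    int phraseLen = 4;   // number of notes in the phrase

    void update(double score) {  // score in [0..1] from the assessment
        if (score > 0.8) {        // affirmation: raise the challenge
            bpm += 4;
            phraseLen += 1;
        } else if (score < 0.5) { // struggling: ease off, never blame
            bpm = std::max(40, bpm - 4);
            phraseLen = std::max(2, phraseLen - 1);
        }                         // otherwise: stay at the current level
    }
};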

1.2 Origin of the project
The PlayGuru project started as part of the author's Master's course. While trying to find ways to improve the effectiveness of practice routines for the guitar, some research was done into existing solutions. Several systems for computer assisted music training were found and some have been put to the test. A shortlist is included at the end. One thing all the encountered solutions have in common is that they rely heavily on visual interaction. In many cases this implies written score or tablature, in other cases a game-like environment in which the user has to act upon events happening in a graphic scene. In several cases the author found the visual information distracting from the music. Thus the idea arose for a practice tool exclusively working with sound.

Shortly after that, the foundation Connect2Music1 came into view. Connect2Music, founded in 2013, provides information with respect to music practice by visually impaired musicians. According to [Mak, 2015], the facilities for blind music students in The Netherlands are limited. Even though the situation is improving, a practice tool which focuses only on the music itself would be a much wanted addition. Thus a project was born: to find ways to improve the learning path for beginning and intermediate musicians, with music as the key element and primarily addressing blind and visually impaired people.

The prototype being developed for performing this research is called PlayGuru. The envisioned end product aims to help and encourage a music student to perform certain exercises in-between music classes and is meant to complement rather than replace regular classes from a human teacher. Through the contacts of Connect2Music with the community, several blind and visually impaired musicians and software developers in The Netherlands and Belgium expressed their interest in this project and offered help to assess and assist.

2 Research
To support the research with experiences of end users, some application prototypes are being developed. The adaptive exercises used in these prototypes will briefly be introduced separately. This chapter discusses the software and the chosen methods for interacting with the user.

2.1 Software
The framework and all exercises are currently implemented in C++11. For audio signal analysis, the Aubio2 library is used. Exercises are composed in real time according to musical rules or taken from existing material like MIDI files and guitar tablature.

2.2 Dependencies
Development is done on Linux and Raspbian. Porting to Apple OSX should be relatively easy but has not been done yet. The most prominent dependencies, as in libraries, are jackd, aubio, fftw3 and portmidi. For generating sound, Fluidsynth is used. Stand-alone versions use a Python script to connect the hardware user-interface to the exercises.

2.3 Practice companion
When playing along with a song on the radio you will need to adjust to the music you hear, as it will not wait for you. Playing with other musicians has entirely different dynamics. People influence each other, try to synchronise, tune in and reach a common goal: to make the music sound nice and feel good about it. When practicing music with a tool like PlayGuru it would be nice to have a dialog with the tool, instead of just obeying its rules. This is exactly what makes PlayGuru interact so nicely: it listens to you and adapts, thus behaving like a practice companion. How this is achieved is shown with reference to the software architecture, with indications which parts have been realised and which are being developed.

2.4 Architecture
The modular design of PlayGuru is shown in figure 1, with the Exercise Governor as the module from which every exercise starts. The Exercise Governor reads a configuration file containing information about the user, the type of exercise, parameters defining the course of the exercise, various composition settings and possibly other sources like MIDI files.

1 https://www.connect2music.nl
2 https://aubio.org

Figure 1: PlayGuru’s software architecture

When the exercise starts, the Composer will generate a MIDI track, or read it from the specified file. During the exercise, this generating may take place again, depending on the type of exercise and the progress of the musician. The MIDI track is played by the sound module, which also captures the sound that comes back from the musician or their instrument. The Assessor contains all the sound processing and assessment logic and reports back to the Exercise Governor, which calls in the help of the User Model to decide how to interpret the data and what to do next.

2.5 Playback and analysis
Playback and analysis run in separate threads, but share the same time base for relating the output to the input. Incoming audio is analysed in real time to detect pitch(es) and onsets, which are used to assess the musician's play in relation to the given stimuli. Pitch and onset detection are done using the Aubio library.

2.6 The Exercise Governor
Every exercise type is currently implemented as a separate program. The Exercise Governor is essentially a descriptive name for the main program of each exercise, which uses those parts from the other modules that it needs for a certain exercise. Currently these are compiled and linked into the program. With these building blocks it determines the nature of the exercise.

As an example: to let the musician work on accurate reproduction of a pre-composed phrase, the Exercise Governor will ask the Composer to read a MIDI file, call the MIDI play routine from the Sound module, then let the Assessor assess the user's response and consult the User Model, given the Assessor's data, for determining the next step. For a play-along exercise using generated melodies, the Exercise Governor uses routines from the same modules, but with a different intention. In this case it would let the Composer create a new phrase when needed, ask the Assessor to run a different type of analysis, and perform concurrent scheduling of playback and analysis.

2.7 The Assessor
PlayGuru's exercises generally consist of a play-and-evaluation loop. For some exercises, the evaluation process is run after the playback iteration, while for others they run simultaneously.

Figure 2: Note onset absolute time difference

Figure 2 shows the measured timing of a musician playing along with an example melody. The time of each matched note is compared to the time when that note was played in the example. In this chart we see that the musician played slightly "before the beat". This absolute timing indicates whether the musician is able to exactly copy the example, which can be seen clearly for a melody consisting of only equidistant notes.

More interesting however is relative timing. This indicates whether the musician keeps the timing structure of the example intact. In this case we calculate the differences in spacing of the onsets of successive notes, either numerically or as an indication of "smaller/equal/larger", and compare the result to the structure of the example. This is shown in figure 3. Here we can see that the musician started out with confidence and needed more time to find the last notes of the phrase. The example consisted of equidistant notes, which would result in a chart of zeros and is therefore omitted.

Figure 3: Note onset relative time difference
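The relative-timing measure lends itself to a compact sketch. This is a hypothetical helper, not PlayGuru's actual code, assuming onset times are given in milliseconds: it compares the inter-onset intervals of the response with those of the example.

#include <algorithm>
#include <vector>

// Hypothetical sketch: relative timing as inter-onset interval (IOI)
// differences between example and response, onsets in milliseconds.
std::vector<double> relativeTiming(const std::vector<double>& example,
                                   const std::vector<double>& response)
{
    std::vector<double> deltas;
    size_t n = std::min(example.size(), response.size());
    for (size_t i = 1; i < n; i++) {
        double exampleIOI  = example[i]  - example[i - 1];
        double responseIOI = response[i] - response[i - 1];
        // 0 means the timing structure is kept intact; the sign tells
        // whether the musician rushed or dragged between two notes.
        deltas.push_back(responseIOI - exampleIOI);
    }
    return deltas;
}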

2.8 The Composer
The Composer generates musical phrases based on given rules. The current implementation uses melodic intervals in a specified range, with an allowed set of intervals and within a given scale. The scale is listed as a combination of tones (T) and semitones (S), as in the examples in table 1, and can be specified as needed.

T,T,S,T,T,T,S            major
T,S,T,T,T,S,T            minor
S,S,S,S,S,S,S,S,S,S,S    12-tone

Table 1: Scale examples

2.9 The User Model
At the time of writing, the User Model is partly implemented. The part that has been implemented and is currently tested by end users is the mapping from analysed properties of the user's playing to parameters that are musically meaningful or significant for the user's ambitions. In general this mapping is a linear combination of those properties. For example, the user's melodic accuracy can be expressed as a combination of hitting the correct notes and the lack of spurious or unwanted notes. The weight factors are empirically determined, as are several parameters in the exercises, such as the number of repetitions before moving on or the required proficiency for increasing the level of an exercise. It is here where the assistance of a Machine Learning algorithm is wanted: to learn which weight factors and other parameters contribute to the user's goals and to optimise these. This has not been implemented yet and is currently being studied.

2.10 Melodic similarity
There are several ways to find out the similarity between the given example and the musician's response. In the current research, only the onset (i.e. start) and pitch of notes are taken into account. Although timbre, loudness and various other features are extremely useful, these are ignored for the time being. In the exercises where the user is asked to memorise and copy an example melody, the accomplishment of this task is purely based on hitting the correct notes in the correct order. The similarity however is also reflected in the timing: the extent to which the musician keeps the rhythm of the example intact is a property that is measured and evaluated.

In the exercises where the musician plays along with a piece of music, we have much more freedom in the assessment. In this case, playing the exact same notes as in the example is not always necessary. For some exercises it would suffice to improvise within the scale or play certain melodic intervals. In these situations, similarity measures also allow for more freedom. A method that is used in an exercise called "riff trainer", focused on automation of and creating variations on a looped phrase, observes notes in the proximity of example notes and draws conclusions based on the objectives of the exercise. This allows for both very strict adherence to the original melody as well as melodic interval-based variations, depending on the assumed objectives. Another method compares the Markov chain of the example with that of the musician's response. Some inspiration is gained from a book about melodic similarity: [Walther B. Hewlett, 1998].
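As a minimal illustration of the strict copy-the-melody case (a hypothetical sketch, not the actual PlayGuru implementation): correct pitches in the correct order raise the score, spurious notes count against it, and the weights are placeholders of the empirically determined factors mentioned in section 2.9.

#include <vector>

// Hypothetical sketch: similarity of a response to an example melody,
// based only on hitting the correct MIDI pitches in the correct order.
double melodicSimilarity(const std::vector<int>& example,
                         const std::vector<int>& response)
{
    if (example.empty()) return 0.0;
    size_t matched = 0;   // example notes found in order
    size_t spurious = 0;  // played notes that match nothing
    for (int note : response) {
        if (matched < example.size() && note == example[matched])
            matched++;    // the next expected note was hit
        else
            spurious++;   // unwanted or out-of-order note
    }
    // placeholder weights; empirically determined in practice
    double wHit = 1.0, wSpur = 0.5;
    double score = wHit * matched - wSpur * spurious;
    return score / (wHit * example.size()); // 1.0 = perfect copy
}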

3 Personal objectives
Figure 4 shows that an exercise is 'composed' and played. The response of the musician is assessed and mapped to temporary skills. Long-term skills are accumulated in a user model, which tries to construct an accurate profile of the musician and their personal objectives.

Figure 4: Machine assisted learning supported by machine learning

Examples of these objectives are to strive for faster playing, memorise long melodies or improve timing accuracy. These need to be expressed as quantifiable properties. Faster playing can be expressed as playing a phrase faster than before or playing it faster than the given example, which can be measured. Likewise, memorising a melody involves the length of the melody that can accurately be played at reasonable speed. Obviously, the concepts 'accurately' and 'reasonable' have to be quantified. An indication of accuracy in playing is obtained by measuring the number of spurious notes, missed notes and timing.

Some approaches for machine learning (ML) will be investigated. The term "machine learning" is used here to express that the machine itself is learning; it does not refer to the "machine assisted learning" which is the main topic of this paper. A machine learning algorithm is thought to be able to achieve the user's objectives by adjusting the mapping parameters that translate measured quantities to short-time skills, and several properties of the exercises. Depending on the exercise, various factors are measured, such as matched notes, missed notes, spurious notes, adherence to the scale, speed and timing accuracy. These quantities are mapped to short-term skills according to table 2 (a code sketch follows below).

Measured quantity    Mapped to
timing deltas        timing accuracy
missed notes         melodic accuracy
spurious notes       clutter

Table 2: Mapping measured data to skills

Apart from these measured data, the exercises also contain configuration parameters that can be optimised for each user. These are found in the composer settings and in the curves used to control the playing speed and exercise complexity.
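A sketch of the mapping of table 2 as a weighted linear combination; all names and weight values are hypothetical placeholders for the empirically determined factors, and a machine learning algorithm could later optimise them per user.

struct Measured {          // quantities from the Assessor
    double timingDeltas;   // average absolute IOI difference
    double missedNotes;    // fraction of missed notes
    double spuriousNotes;  // fraction of spurious notes
};

struct Skills {            // short-term skills of table 2
    double timingAccuracy;
    double melodicAccuracy;
    double clutter;
};

// Hypothetical weights; in PlayGuru these are empirically determined.
Skills mapToSkills(const Measured& m)
{
    Skills s;
    s.timingAccuracy  = 1.0 - 0.8 * m.timingDeltas;
    s.melodicAccuracy = 1.0 - 0.9 * m.missedNotes;
    s.clutter         = m.spuriousNotes;
    return s;
}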

4 Hardware

The starting point for this project is to assess the sound of an unmodified instrument. In this section, the current choice of hardware is discussed.

A brief side-project was undertaken to equip an acoustic guitar with resistive sensors for detecting the point where strings are pressed against the fretboard; but because this doesn't look and feel natural, would imply that all users would need to install a similar modification, and would exclude all instruments other than guitar, it was discarded. Because the nylon-string acoustic guitar is the primary instrument for the author as well as for lots of beginning music students, the decision was to analyse the sound of the instrument with a microphone or some kind of transducer.

Using a microphone raises the problem that the sound produced by PlayGuru interferes with the sound of the instrument. Source separation techniques are not considered viable for this project due to the added complexity, and because we want to be able to play and listen simultaneously, often to the exact same notes; this would justify a study of itself. Requiring the musician to use a headset is also considered undesirable. So the only option left seems to be a transducer attached to the instrument.

After some experimenting with various combinations of guitar pickups, sensors, preamps and audio interfaces, it turned out that a combination of a simple piezo pickup and a cheap USB audio interface does the job very well. Successful measurements were done with the piezo pickup attached to the far end of the neck of a guitar. It is advised to embed the pickup into a protective cover, to prevent the element itself and the cable from being exposed to mechanical strain, and to mount it with the piezo's metal surface touching the wood of the guitar using a rubber-padded clamp from the DIY store.

Figure 5: Embedded piezo pickup

5 Dissemination
Users' experiences and feedback are of crucial importance for the development of this tool. While a browser application or mobile app seem obvious ways to reach thousands of musicians, development is currently done on Linux and Raspbian. This is a deliberate choice, partly inspired by the author's lack of experience with Webaudio intricacies and the acclaimed large round-trip audio latency of Android devices, for which a separate study may be justified. For a large part however, this choice is supported by the wish to have an inexpensive stand-alone, self-reliant, single-purpose device. A series of stand-alone PlayGuru test devices are being developed, based on a Raspberry Pi with a tactile interface meant to be intuitive to blind people. The idea is to attach a piezo pickup to the instrument, plug in, select the exercise and start practicing.

Figure 6: Stand-alone device for user tests

6 Conclusions
Several beginners, intermediate guitar players and some people with no previous experience have used PlayGuru's exercises in various stages of development. The two most mature exercises used are copying a melody and playing along with a melody. In most cases the melodies were generated in real time based on the aforementioned interval-based rules. In some cases a pre-composed MIDI file was used. From the start it was clear that users enjoy the fact that PlayGuru listens to them and rewards "well played" responses with an increase in speed or by making the assignment slightly more challenging. Several users mentioned a heightened focus, meaning that they were very concentrated for a longer time to keep the interaction going and the level rising. Mistakes bring down the speed or level in a subtle way and do lead to a slight disappointment, which in many cases proved to be an incentive to get back into the 'flow' of the exercise. The project is work in progress. For a well-founded opinion on the practical use, a lot more user tests need to be done, but the author's conclusion, based on results so far, is that the approach is promising.

7 Future Work
At the time of writing, research concentrates on monophonic exercises with a basic machine learning algorithm. For the near future the author has plans to perform a study with a larger group of users who can use the system by themselves for a longer time. This requires creating test setups and/or porting the software to other platforms. Monophonic exercises may be the preferred way to develop several skills, but being able to play, or play along with, your favourite music is much more motivating, particularly for children. An approach resembling Band-in-a-Box® is taken, using a multi-channel MIDI file as input, with the possibility of indicating which channels will be heard and which channel will be 'observed'. This requires both polyphonic play and analysis, which are largely implemented but currently belong in the Future Work section. On the analysis side, a technique based on chroma vectors [Tzanetakis, 2003] is being tested.
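For readers unfamiliar with the technique, a chroma vector folds spectral energy onto the 12 pitch classes. The following is a generic sketch under the assumption that an FFT magnitude spectrum (N/2+1 bins) is already available; it is not the analysis code being tested in PlayGuru.

#include <array>
#include <cmath>
#include <vector>

// Sketch: fold an FFT magnitude spectrum into a 12-bin chroma vector,
// mapping each bin's frequency to a pitch class (A4 = 440 Hz = MIDI 69).
std::array<double, 12> chroma(const std::vector<double>& magnitude,
                              double sampleRate)
{
    std::array<double, 12> c{};
    size_t fftSize = (magnitude.size() - 1) * 2; // assumes N/2+1 bins
    for (size_t bin = 1; bin < magnitude.size(); bin++) {
        double freq = bin * sampleRate / fftSize;
        double midi = 69.0 + 12.0 * std::log2(freq / 440.0);
        int pitchClass = ((int)std::lround(midi) % 12 + 12) % 12;
        c[pitchClass] += magnitude[bin] * magnitude[bin]; // bin energy
    }
    return c;
}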

Machine learning strategies are being studied but have not yet been implemented. The assistance of co-developers would be much appreciated.

8 Acknowledgements
I want to thank the HKU for supporting and facilitating my work, and in particular Gerard van Wolferen, Ciska Vriezenga and Meindert Mak for their insights and ideas. Gerard van Wolferen and Pieter Suurmond were of great help proof-reading and correcting this paper. Lastly I would like to thank the LAC review committee for their excellent observations.

9 Other solutions for computer-assisted music education
• i-maestro
• Bart's Virtual Music School
• Rocksmith™
• yousician.com
• bestmusicteacher.com
• onlinemuziekschool.nl
• gitaartabs.nl

References
Meindert Mak. 2015. Connecting music in the key of life.
Scott Tennant. 1995. Pumping Nylon: The Classical Guitarist's Technique Handbook. Alfred Music.
Ning Hu, Roger B. Dannenberg, and George Tzanetakis. 2003. Polyphonic audio matching and alignment for music retrieval. In 2003 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.
Walther B. Hewlett and Eleanor Selfridge-Field, editors. 1998. Melodic Similarity. The MIT Press.


Faust audio DSP language for JUCE

Adrien ALBOUY and Stéphane Letz GRAME 11, cours de Verdun (GENSOUL) 69002 LYON, FRANCE, {adrien.albouy, letz}@grame.fr

Abstract
Faust [Functional Audio Stream] is a functional programming language specifically designed for real-time signal processing and synthesis [1]. It consists of a compiler that translates a Faust program into an equivalent C++ program, taking care of generating the most efficient code. JUCE is an open-source cross-platform C++ application framework developed since 2004, and bought by ROLI1 in November 2014, used for the development of desktop and mobile applications. A new feature of the Faust environment is the addition of architecture files to provide the glue between the Faust C++ output and the JUCE framework. This article presents the overall design of the architecture files for JUCE.

Keywords
JUCE, Faust, Domain Specific Language, DSP, real-time, audio

1 Introduction
From a technical point of view Faust2 (Functional Audio Stream) is a functional, synchronous, domain specific language designed for real-time signal processing and synthesis. A unique feature of Faust, compared to other existing languages like Max, PD, Supercollider, etc., is that programs are not interpreted, but fully compiled.

One can think of Faust as a specification language. It aims at providing the user with an adequate notation to describe signal processors from a mathematical point of view. This specification is free, as much as possible, from implementation details. It is the role of the Faust compiler to provide automatically the best possible implementation. The compiler translates Faust programs into equivalent C++ programs, taking care of generating the most efficient code. The compiler offers various options to control the generated code, including options to do fully automatic parallelization and take advantage of multicore machines.

The generated code can generally compete with, and sometimes even outperform, C++ code written by seasoned programmers. It works at the sample level; it is therefore suited to implement low-level DSP functions like recursive filters up to full-scale audio applications. It can be easily embedded as it is self-contained and does not depend on any DSP library or runtime system. Moreover it has a very deterministic behavior and a constant memory footprint.

Being a specification language, the Faust code says nothing about the audio drivers or the GUI toolkit to be used. It is the role of the architecture file to describe how to relate the DSP code to the external world [2]. This approach allows a single Faust program to be easily deployed to a large variety of audio standards (Max-MSP externals, PD externals, VST plugins, CoreAudio applications, JACK applications, etc.), and JUCE is now supported.

The aim of JUCE [3] is to allow software to be written such that the same source code will compile and run identically on Windows, Mac OS X and Linux platforms for the desktop devices, and on Android and iOS for the mobile ones. A notable feature of JUCE when compared to other similar frameworks is its large set of audio functionality. Those services, the user-interface possibilities and the multi-platform exportability position JUCE as a great framework for Faust to be exported on, to have in the future less code to maintain up-to-date, and simpler utilization.

In section 2, the idea and the use of the GUI architecture file will be introduced. In section 3, the JUCE Component hierarchy will be presented without going into many details. Section 4 is the main one, explaining in detail the graphical architecture file for JUCE. MIDI and OSC architecture files are introduced in Section 5. Section 6 will treat of the "glue" between JUCE audio layers and Faust ones. Section 7 presents the faust2juce script. Section 8 is a quick tutorial on how to use JUCE for Faust.

1 https://roli.com/
2 http://faust.grame.fr

2 Faust GUI architecture files
A Faust UI architecture is a glue between a host control layer and a Faust module. It is responsible to associate a Faust module parameter to a user interface element and to update the parameter value according to the user actions. This association is triggered by the dsp::buildUserInterface call, where the DSP asks a UI object to build the module controllers. Since the interface is basically graphic oriented, the main concepts are widget based: a UI architecture is semantically oriented to handle active widgets, passive widgets and widgets layout. A Faust UI architecture derives a UI class, containing active widgets, passive widgets, layout widgets, and metadata.

2.1 Active widgets
Active widgets are graphical elements that control a parameter value. They are initialized with the widget name and a pointer to the linked value. The widgets currently considered are Button, ToggleButton, CheckButton, RadioButton, Menu, VerticalSlider, HorizontalSlider, Knob and NumEntry. A UI architecture must implement a method addxxx(const char* name, float* zone, ...) for each active widget. Additional parameters are available for Slider, Knob, NumEntry, RadioButton and Menu: the init value, the min and max values and the step (RadioButton, Menu and Knob being special kinds of Sliders, cf. subsection 2.4, Metadata).

2.2 Passive widgets
Passive widgets are graphical elements that reflect values. Similarly to active widgets, they are initialized with the widget name and a pointer to the linked value. The widgets currently considered are NumDisplay, Led, HorizontalBarGraph and VerticalBarGraph. A UI architecture must implement a method addxxx(const char* name, float* zone, ...) for each passive widget. Additional parameters are available, depending on the passive widget type (NumDisplay and Led are a special kind of BarGraph, cf. subsection 2.4).

2.3 Widget layout
Generally, a UI is hierarchically organized into boxes and/or tab boxes. A UI architecture must support the following methods to setup this hierarchy:

openTabBox (const char* label)
openHorizontalBox (const char* label)
openVerticalBox (const char* label)
closeBox (const char* label)

Note that all the widgets are added to the current box.

2.4 Metadata
The Faust language allows widget labels to contain metadata enclosed in square brackets. These metadata are handled at UI level by a declare method taking as arguments a pointer to the widget associated value, the metadata key and value: declare(float*, const char*, const char*). Metadata can also declare a DSP as polyphonic, with a line looking like declare nvoices "8" for 8 voices. This will always output a polyphonic DSP, whether you use the polyphonic option of the compiler or not. This number of voices can be changed with the compiler (cf. Section 7). For instance, if the program needs a Slider to be a Knob, those lines are written:

declare(&fVslider0, "style", "knob");
addVerticalSlider("Vol", &fVslider0,...);

The style can be a knob, a menu, etc., depending on the program. Multiple aspects of the items can be described with the metadata, such as the type of the item just as seen before, the tooltip of the item, the unit, etc.
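To make this protocol concrete, here is a toy sketch of a non-graphical UI that simply prints the widget tree. It follows the method set listed in section 2, but it is only an illustration: it is not one of the distributed architecture files, and the real base class signatures may differ slightly.

#include <cstdio>

// Toy UI "architecture": prints the widget hierarchy instead of drawing it.
struct PrintUI {
    int fIndent = 0;
    void indent() { for (int i = 0; i < fIndent; i++) std::printf("  "); }

    // layout widgets
    void openHorizontalBox(const char* label) { indent(); std::printf("hbox %s\n", label); fIndent++; }
    void openVerticalBox(const char* label)   { indent(); std::printf("vbox %s\n", label); fIndent++; }
    void closeBox()                           { fIndent--; }

    // an active widget: keeps the label/zone association
    void addVerticalSlider(const char* label, float* zone,
                           float init, float min, float max, float step) {
        indent(); std::printf("vslider %s = %f in [%f..%f]\n", label, init, min, max);
        *zone = init; // a real UI would keep 'zone' to reflect user actions
    }

    // metadata, e.g. declare(zone, "style", "knob")
    void declare(float* zone, const char* key, const char* value) {
        indent(); std::printf("metadata %s = %s\n", key, value);
    }
};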

3 JUCE Component class
To implement a complete program, the graphical elements described in the previous section need to be combined with JUCE classes. In the JUCE framework, the Component class is the base-class for all JUCE user-interface objects. The following section explains the relationship between Faust GUI architecture files and the JUCE mechanics.

3.1 Parent and child mechanics
As most frameworks have, JUCE has a hierarchy of Component objects, organized in a tree structure. The common way to set a Component as child of another component is to do parent->addAndMakeVisible(child);. This function sets the child component as visible too, because it's not by default. Multiple functionalities are accessible to run through this Component tree, with methods that give the child Component at index i, or give the parent. There's even a function allowing to get the parent of a Component with a specific type, this type being a derived class of Juce::Component. However, this function does not exist for the child, and implies that a dynamic_cast has to be done if you want to get a child of a certain type.

3.2 Component setup mechanics
First of all, a Component is drawn if it's visible, and its parent too. If a Component is not visible, its children and all of their children, etc., will not be visible; but as the addAndMakeVisible function is used most of the time, this should not be a problem. A Component has a Rectangle boundsRelativeToParent, containing its x and y coordinates, and its width and height. As the variable name implies, the bounds of a Component are relative to its parent, and not absolute in the window; this is very important in the architecture files for Faust, as will be demonstrated in subsection 4.4.

3.3 Drawing mechanics
A Component has two virtual functions3 that are the main tools to handle a dynamic layout, the void resized() and void paint(Graphics& g) functions. The resized one is called each time a Component's bounds are changed, and the paint one when the Component flag indicates that it needs to be repainted. The mouse cursor being on top of it, a mouse click, the Component bounds being changed, or one or multiple of its children needing to be repainted indicate that it needs to be repainted, for example. There is a design class called LookAndFeel that allows customization of the interface. The LookAndFeel object defines the appearance of all the JUCE widgets, and subclasses can be used to apply different 'skins' to the application. There is obviously a lot more to the Juce::Component class, but that's the basics, or at least what the architecture files need.

3 placeholder functions which the programmer must implement

4 JuceGUI architecture file
To summarize what has been seen before, the system of widgets and boxes of Faust needs to be adapted to the Juce::Component mechanics in an architecture file called JuceGUI.h. The following section discusses annotated examples.

4.1 Two different kinds of objects
There are two kinds of object used in the adaptation:
• uiComponent, which are basically any items of the Faust program, like sliders or buttons.
• uiBox, which is a container component, and so can contain a uiComponent or some other uiBoxes.

Both are derived classes of a uiBaseComponent class, which is itself a derived class of Juce::Component. The uiBaseComponent class regroups methods shared by both uiBox and uiComponent, like void setRatio(), int getTotalWidth(), etc. This way, too many dynamic_casts in our code are avoided. Here's what the uiBaseComponent class contains:

float fHRatio, fVRatio;
int fTotalWidth, fTotalHeight;
int fDisplayRectHeight, fDisplayRectWidth;
String fName;

uiBaseComponent(int totWidth, int totHeight, String name);

int getTotalHeight();
int getTotalWidth();
virtual void setRatio();
float getHRatio();
float getVRatio();
String getName();
void setHRatio();
void setVRatio();
void setBaseComponentSize(Rectangle<int> r);
void mouseDoubleClick(const MouseEvent &event) override;

virtual void writeDebug() = 0;
virtual void setCompLookAndFeel(LookAndFeel* laf) = 0;

The mouseDoubleClick function is a JUCE overridable function, which is called every time a Component is double-clicked. Here it is used to call the writeDebug function, showing different characteristics of the double-clicked uiBox or uiComponent. The two pure virtual functions are defined to have their own behavior for both uiBox and uiComponent, not being the same obviously. The setRatio() function is virtual because there is a special case with the uiBox, which sets its own ratio and needs to ask its children to set their ratios too, in a recursive way.

As said before, uiComponent inherits from those uiBaseComponent functions, and is itself a mother class for plenty of different widgets. Here's the inheritance diagram:

Figure 1: Inheritance diagram

A uiComponent subclass can handle multiple "types" of items. For instance, uiSlider groups every kind of sliders: HorizontalSlider, VerticalSlider, NumEntry and Knob.

4.2 The main window
The user interface cannot be shrunk infinitely in order to be always legible and clear, so a minimal window size is defined. That implies that instead of a basic Component in a DocumentWindow (a resizable window with a title bar and maximise, minimise and close buttons), a Viewport in a DocumentWindow is used, which displays scrollbars when the window gets lower dimensions than the minimal size of the Faust program, allowing to have full access to the user interface even in the lower dimensions. This Viewport can either contain a uiBox as presented before, or a uiTabs if the program requires tabs.

4.3 uiTabs class
The uiTabs class inherits from Juce::TabbedComponent, which is a Juce::Component with a TabbedButtonBar on one of its sides. It just needs a Juce::Component for each tab, and a tab name, and it will display them. A tab layout is needed when the buildUserInterface starts with an openTabBox call. In this case, a boolean tabLayout is set to true, to know that it's a tab layout. While parsing the buildUserInterface, a uiBox is given to the uiTabs every time the current tab is "closed". To do that, a variable called order keeps track of the "level" of the current box. The order starts at 0, is incremented when a new box is opened, and decremented when a box is closed. If the order is 0 in a closeBox() call, then a tab is being closed, and so the current box is added to the uiTabs, using the TabbedComponent::addTab function. Once all the tabs are closed, the tabBox is closed too, the order is now at -1, and it triggers the initialization function of uiTabs, uiTabs::init(). It will be described in the next subsection.

4.4 Initialization of the layout
First of all, while parsing the buildUserInterface lines, which are listing the different boxes and items that need to be displayed, the tree is getting built. It's done using the Juce::Component mechanics of addAndMakeVisible. The different uiBaseComponents are added as children of different uiBoxes, and the uiBox display rectangle size and total size are calculated every time a box is closed in the buildUserInterface (i.e. when closeBox() is called). The uiBox display rectangle size is the sum of its children's widths and the maximum of its children's heights, or the contrary, depending on its orientation. But margins are added to our display rectangle width and height, 4 pixels per child, for a margin of 2 pixels on the top, left, bottom and right, and the uiBox total size is obtained. This is to avoid an overlapping effect, having two items touching each other. Following the same spirit, 12 pixels are added to the height of the box if its name needs to be displayed, 12 pixels being the space needed to display its name. Here's the buildUserInterface that displays this program:

ui_interface->openHorizontalBox("TITLE1");
ui_interface->addVerticalSlider("Slider1",
    &fVslider0, 0.0f, 0.0f, 6.0f, 1.0f);
ui_interface->addVerticalSlider("Slider2",
    &fVslider1, 0.0f, 0.0f, 6.0f, 1.0f);
ui_interface->addVerticalSlider("Slider3",
    &fVslider2, 0.0f, 0.0f, 6.0f, 1.0f);
ui_interface->addVerticalSlider("Slider4",
    &fVslider3, 0.0f, 0.0f, 6.0f, 1.0f);
ui_interface->closeBox();

In Figure 2, the difference between the display rectangle size and the total size can easily be seen. The total size of the box here named "TITLE1" is the lighter gray, and the display rectangle size would be the four darker gray rectangles stuck together. The layout is not aligned seamlessly because of the margin that is implemented to avoid the overlapping of the components. The space left on the top of the box is for its title, and this margin is included in the total size.

Figure 2: Representation of the display rectangle size and the total size of a box with four children

For a vertical box with n children (the horizontal case swaps the roles of width and height), the sizes are given by:

    h = Σ (i = 0 .. n−1) ci.H          (1)
    w = max (i ∈ [0, n−1]) ci.W        (2)
    H = h + 4 × n                      (3)
    W = w + 4                          (4)

In these equations, H is the total height, W the total width, h the display rectangle height, and w the display rectangle width; ci is the i-th child component of the current box. H might get incremented by 12 pixels, depending on the need to display the box name. 4 pixels for each child component are added on one dimension to have margins between each of them, because they will be placed aside of each other in this dimension, and simply 4 pixels are added to the other dimension to have 2 pixels separating parent and child box on each side.

Once buildUserInterface is done, the last box is closed, and the user interface is initialized. This last box, that will be called the "main box", is initiated with ratios of 1 and 1, even if they are not really needed, because it will take the window size. Here's how the UI is initialized:
• Setting the actual rendering size for the main box, because the total size is set here, but not the Juce::Component bounds. That's done through the void setBaseComponentSize(Rectangle<int> r) method, which sets the size of the components, and especially positions them right. Concretely, a 30 pixels offset is needed on the height for a tab layout, 30 pixels being the height taken by the tab bar. Only the main box needs to be set with an offset, because other boxes will be positioned depending on their parent's coordinates.
• After that, the ratios are calculated for the whole tree, from root to leaves. The horizontal ratio is the component total width divided by its parent display rectangle width, same for the height. This way, it avoids having the margins mess with our ratios, and having a sum of ratios that only approaches 1 instead of being 1 exactly.
• Last step is to set the LookAndFeel for all uiComponents, which are for all of them the leaves of the tree. So the tree is fully parsed there, root to leaves.

The only possible change in the initialization of the program is in the case of a tab layout. The uiTabs::init() method just calls uiBox::setRatio() and uiBox::setCompLookAndFeel(LookAndFeel*) for every one of its tab components. While going through all the tabs, the algorithm keeps track of the minimal size of the uiTabs component to be displayed, its minimal dimensions being the maximum width and the maximum height of all its tabs.

There, the tree is built; the total size, display rectangle size and the ratios have been initiated for all components, all the uiBox and uiComponent.

4.5 Dynamic Layout
At that point, the user interface is displayed at its original size, but it needs to adapt to the potential resizing of the window. To do that, the uiBoxes are used to layout all the items. A uiBox item has a void arrangeComponents(Rectangle<int> functionRect) function, which is the main tool to organize the layout. It's called whenever the resized() function of the main box is called. In this function, the initial rectangle given as argument, that is basically the window size, will propagate through all the child uiBox and uiComponent, in a recursive way [4]. At the beginning, it checks if the name needs to be displayed, and as no child components should be displayed there, it cuts 12 pixels from the top of the functionRect given as argument. After that, the margins are set, so 2 pixels are cut on the left, top, right and bottom sides. This way, overlapping components are avoided. Once it's done, it goes through all the children, to give them the right space to occupy and the right position of course.

The algorithm works that way: if the current box is vertical, then it needs to give its children a vertical part of functionRect, and a horizontal one for a horizontal box of course. The amount of vertical or horizontal size of the child is calculated, still depending on the vertical nature of the current box. This size is the box's current height or width, minus the margins, multiplied by the vertical or horizontal ratio. Concrete example: the current box is a horizontal display, and has 2 child components, one having a horizontal ratio of 0.7 and the other one of 0.3. The box display size is here 1000x500 pixels, and its total size 1008x504 (2 items and it's a horizontal box, so 2 × (2 × margin) = 8 on the width, and 2 × margin = 4 on the height).

Let's say the size of the window almost doubled, and it's now 2008x1004 (arbitrary simple values). It will calculate that the first item gets a 0.7 × (2008 − 2 × 4) = 0.7 × 2000 = 1400 pixels wide space and the second one 0.3 × (2008 − 2 × 4) = 600 pixels. First item bounds will be 1400x1000 and the second one 600x1000, height being kept the same, without the margins of course.

On top of that, to keep track of where to place our components, the functionRect gets cut off little by little every time a uiBaseComponent is given a rectangle to be displayed in [5]. Basically, every rectangle that is given to a child is removed from the original functionRect, and this allows us to keep track of the right x and y coordinates to give to the child component, with the margin added. It's done over and over again for each child component, cutting from the left or the top of the box rectangle depending on its orientation.

Figure 3: Representation of the layout algorithm
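A minimal sketch of this recursive slicing, using JUCE's Rectangle facilities (removeFromTop, removeFromLeft, reduce are real JUCE methods). The state variables (vertical, named, ratios) stand for the uiBox members; this is an illustration, not the actual JuceGUI.h code.

#include <juce_gui_basics/juce_gui_basics.h>
#include <vector>

// Illustration of the recursive layout pass described above.
void arrangeChildren(juce::Rectangle<int> rect, bool vertical, bool named,
                     std::vector<juce::Component*>& children,
                     const std::vector<float>& ratios)
{
    if (named) rect.removeFromTop(12); // room for the box name
    rect.reduce(2, 2);                 // 2 px margin on each side
    int total = vertical ? rect.getHeight() : rect.getWidth();

    for (size_t i = 0; i < children.size(); i++) {
        int size = (int)(total * ratios[i]); // the child's share
        auto r = vertical ? rect.removeFromTop(size)
                          : rect.removeFromLeft(size);
        children[i]->setBounds(r);     // a uiBox child recurses here
    }
}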
4.6 The MainContentComponent class
The adapted MainContentComponent class includes the Faust headers that are indispensable for the program. There are some optional includes, for OSC, MIDI and polyphonic mode, that depend on the compilation options that the user sets. The MainContentComponent class is the Juce::Component contained in our Viewport, and itself contains a JuceGUI object, that is a subclass of Juce::Component, of the Faust GUI class and of MetaDataUI. The minimal things to do are:

addAndMakeVisible(juceGUI);
fDSP = new mydsp();
fDSP->buildUserInterface(&juceGUI);
recommendedSize = juceGUI.getSize();
setSize (recommendedSize.getWidth(),
         recommendedSize.getHeight());
setAudioChannels (fDSP->getNumInputs(),
                  fDSP->getNumOutputs());
[...]

private:
    JuceGUI juceGUI;

A simple buildUserInterface call is needed; then set the size of the MainContentComponent, and set the amount of audio channels. Following the same spirit, there is optional code in case of MIDI, OSC or polyphonic mode.

5 Other Faust architecture files
Just a GUI architecture file isn't enough to run a Faust program on JUCE; adaptations for different kinds of control are also needed, such as OSC and MIDI.

5.1 OSC
OSC integration has been done by developing a new JuceOSCUI class, subclass of the base UI class. Two send and receive ports are defined. Input OSC messages are decoded by subclassing the JUCE OSCReceiver class, and implementing its OSCReceiver::oscMessageReceived method. Output OSC messages are sent by using the OSCSender::send method.

The special "hello" message allows to retrieve several parameters of the Faust application: its root OSC port, IP address, input and output ports. The "get" message allows to retrieve the current, min and max values for a given parameter. Finally, a float value received on a given path will allow to change the parameter value in real-time.

An application wanting to be controlled by OSC messages has to use an instance of the JuceOSCUI class, to be given to the DSP buildUserInterface method.

5.2 MIDI
MIDI messages handling is done by using the JUCE MidiInput and MidiOutput classes. A new juce_midi class subclassing MidiInputCallback and implementing the required MidiInputCallback::handleIncomingMidiMessage method has been defined. MIDI messages coming from the JUCE layer are decoded and sent to the corresponding application controllers. MIDI messages produced by the application controllers are encoded and sent using a MidiOutput object.

An application wanting to be controlled by MIDI messages has to use an instance of the MidiUI class, created with a juce_midi handler, to be given to the DSP buildUserInterface method.

6 Audio integration
To be connected to the external world, a given Faust DSP has to be connected to an audio driver and a User Interface definition. The JUCE framework already contains an abstract audio layer connected to a set of native audio drivers on all development platforms. JUCE developers can choose to deploy their code as standalone audio applications or audio plugins. A standalone application has to subclass the abstract AudioAppComponent class and implement the prepareToPlay, getNextAudioBlock and releaseResources methods:
• prepareToPlay is called just before audio processing starts, with a sample rate parameter. The Faust DSP is initiated with this sample rate value, and the input/output channels number is possibly adapted to match the capabilities of the used native layer (that can have a different number of input/output channels than the DSP).
• getNextAudioBlock is called every time the audio hardware needs a new block of audio data. Audio buffers presented as an AudioSourceChannelInfo data type are retrieved and adapted to be given to the Faust DSP compute method (see the sketch below).
• releaseResources is called when audio processing has finished. Nothing special has to be done at the Faust level.
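As an illustration of that adaptation, here is a sketch of the getNextAudioBlock glue, assuming the Faust dsp API void compute(int count, float** inputs, float** outputs); MainContentComponent and fDSP are as in section 4.6, and the body is a simplified illustration rather than the distributed architecture code.

#include <vector>

// Sketch: adapting the JUCE audio callback to the Faust 'compute' method.
void MainContentComponent::getNextAudioBlock(
        const juce::AudioSourceChannelInfo& info)
{
    std::vector<float*> channels(fDSP->getNumOutputs());
    for (size_t ch = 0; ch < channels.size(); ch++)
        channels[ch] = info.buffer->getWritePointer((int)ch, info.startSample);

    // input and output channels share the JUCE buffer in this sketch
    fDSP->compute(info.numSamples, channels.data(), channels.data());
}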

7 The faust2juce script
There are many scripts available in the Faust ecosystem allowing to generate a ready to use binary, project file, or compiled file from a simple DSP file. They are labeled faust2xxx. Following the same spirit, a faust2juce script has been implemented, that allows to create a JUCE project directory from a simple DSP file. The command is used as follows:

faust2juce [-options] dspFile.dsp

This will create a folder containing a .jucer file, and a "Source" folder containing the Main.cpp and the MainComponent.h. This folder is self-contained: all needed Faust includes are in the MainComponent.h, including the compiled DSP. These are the options available at this moment for faust2juce:
• -nvoices x: produces a polyphonic self-contained DSP with x voices, ready to be used with MIDI events
• -midi: activates MIDI control
• -osc: activates OSC control
• -help: shows the different options available

As described in subsection 2.4, a number of voices can be hardcoded for a polyphonic DSP, but you can change it with the nvoices option; it has the priority over the metadata declaration. In the case of a non-hardcoded polyphonic DSP, it will just make it a polyphonic one with this compiler option. Some other options will be added later; the script is still in development.

8 How to use JUCE architecture files
Using JUCE to export a Faust DSP program file is easy: create the project folder with faust2juce [-options] dspFile.dsp and drag & drop the created folder, named after the DSP, to the "example" folder contained in the JUCE git folder. Simply execute the .jucer file, and select "Save Project and Open in IDE...", the first time at least, to generate the JUCE header files, etc. And it's ready to execute your program on whatever export platform you chose.

9 Conclusion
The Faust audio DSP language implementation is now possible with JUCE, and can theoretically be exported to every platform that JUCE supports. It has been tested on OS X and iOS; both work correctly, with performance close to already available options, such as faust2caqt for OS X and faust2ios for iOS. MIDI control, polyphonic mode, and OSC control are implemented; more features are in progress of development, to permit a full compatibility with the whole Faust library. JUCE offers two types of "audio project": standalone applications or plug-ins. Currently the Faust architecture files are limited to describing standalone applications, but we are looking forward to adapting our code for plug-ins.

References

[1] Orlarey, Y., Fober, D., and Letz, S. (2009), "FAUST: an efficient functional approach to DSP programming." New Computational Paradigms for Computer Music, 290.

[2] D. Fober, Y. Orlarey, and S. Letz, "Faust Architectures Design and OSC Support", IRCAM, (Ed.): Proc. of the 14th Int. Conference on Digital Audio Effects (DAFx-11), pp. 231-216, 2011.

[3] JUCE online documentation https://www.juce.com/doc/classes

[4] JUCE "Tutorial: Advanced Rectangle techniques" https://www.juce.com/doc/tutorial_rectangle_advanced

[5] J. Storer, "Developing Graphical User Interfaces with JUCE", JUCE Summit 2015 https://www.youtube.com/watch?v=xsCZoE1s_uw

Polyphony, sample-accurate control and MIDI support for FAUST DSP using combinable architecture files

Stéphane LETZ, Yann ORLAREY, Dominique FOBER
GRAME, Centre National de Création Musicale
11 Cours de Verdun (Gensoul), 69002 Lyon, France
{letz,orlarey,fober}@grame.fr

Romain MICHON
CCRMA, Stanford University
Stanford, CA 94305-8180, USA
[email protected]

Abstract

The Faust architecture files ecosystem is regularly enriched with new targets to deploy Digital Signal Processing (DSP) programs. This paper presents recently developed techniques to expand the standard "one DSP source, one program or plugin" model, and to better control parameter changes during the audio computation. Sample accurate control and polyphonic instrument definition have been introduced, and will be explained particularly in the context of MIDI control.

Keywords

Faust, DSP programming, audio, MIDI

1 Introduction

Faust is a functional programming language specifically designed for real-time signal processing and synthesis. From a high-level specification, its compiler typically generates the DSP computation as a C++ class1 to be wrapped by so-called architecture files and connected to the external world.

1.1 Audio and UI Architecture Files

Native audio drivers are developed as subclasses of a base audio class, controllers as subclasses of a base UI class. Typical Graphical User Interface architectures are based on well established frameworks like QT2 or JUCE3, and allow to display a ready to use window with sliders, text zones and buttons. Audio and UI parts are finally combined with the actual DSP computation to produce the final audio application or plugin (see Figure 1).

[Figure 1: DSP code is generated by the compiler, and UI code is added from the generic architecture files.]

Non graphical controllers can also be defined as subclasses of UI, simply by ignoring the layout description4, and just keeping the actual controls definition (with their name, default value, value range etc.). The OSCUI and httpdUI classes [1] typically follow this strategy. New architecture files have been regularly added to the already rich Faust ecosystem, to expand the variety of possible targets for the DSP code.

1.2 Macro Construction of DSP Components

The Faust program specification is usually entirely done in the language itself. But in some specific cases it may be useful to develop separated DSP components and combine them in a more complex setup. Since taking advantage of the huge number of already available UI and audio architecture files is important, keeping the same dsp API is preferable5, so that more complex DSP can be controlled and audio rendered the usual way:

1 The faust2 development branch can also generate C, LLVM IR, WebAssembly etc. target languages.
2 http://doc.qt.io
3 https://www.juce.com/doc/classes
4 Typically done using hgroup, vgroup or tgroup in the DSP source code.
5 Only part of the complete DSP API is presented here.

class dsp {
  public:
    .....
    virtual int getNumInputs() {}
    virtual int getNumOutputs() {}
    virtual void buildUserInterface(UI* ui) {}
    virtual void init(int samplingRate) {}

    virtual void compute(int count,
                         FAUSTFLOAT** inputs,
                         FAUSTFLOAT** outputs) {}
    .....
};

Extended DSP classes will typically subclass the dsp root class and override part of its API. This paper shows how this approach can be used to define new extended and combinable dsp classes. Section 2 describes tools to combine separately developed DSP. Section 3 explains how sample accurate parameter control of a given DSP can be done using the new timed_dsp class, and when it needs to be used. Section 4 presents the model used to deploy polyphonic instruments, section 5 presents how the previously presented components can be used together in the context of MIDI control, and finally the conclusion places this work in a more general analysis of the Faust compiler generated code.

2 Combining DSP

2.1 Dsp Decorator Pattern

A dsp_decorator class, subclass of the root dsp class, has first been defined. Following the decorator design pattern6, it allows behavior to be added to an individual object, either statically or dynamically. The extended DSP class hierarchy is shown in Figure 2. As an example of the decorator pattern, the timed_dsp class allows to decorate a given DSP with sample accurate control capability, as explained in section 3.

[Figure 2: DSP classes diagram]

2.2 Combining DSP Components

A few additional macro construction classes, subclasses of the root dsp class, have been defined in the public faust/dsp/dsp-combiner.h header file (a simplified sketch of the sequencing case is shown after this list):

• the dsp_sequencer class combines two DSP in sequence, assuming that the number of outputs of the first DSP equals the number of inputs of the second one. Its buildUserInterface method is overloaded to group the two DSP in a tabgroup, so that control parameters of both DSPs can be individually controlled7. Its compute method is overloaded to call each DSP compute in sequence, using an intermediate output buffer produced by the first DSP as the input given to the second DSP.

• the dsp_parallelizer class combines two DSP in parallel. Its getNumInputs/getNumOutputs methods are overloaded to correctly reflect the inputs/outputs of the resulting DSP as the sum of the two combined ones. Its buildUserInterface method is overloaded to group the two DSP in a tabgroup, so that control parameters of both DSP can be individually controlled. Its compute method is overloaded to call each DSP compute, where each DSP consumes and produces its own number of input/output audio buffers taken from the method parameters.
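To make the combiner idea concrete, here is a simplified sketch in the spirit of the dsp_sequencer class. It is not the actual faust/dsp/dsp-combiner.h source: in particular, the real class groups the two DSPs in a tabgroup, while this sketch simply builds both user interfaces one after the other, and it assumes std::vector is available.

// Simplified sketch of a sequencing combiner (illustration only).
class dsp_sequencer_sketch : public dsp {
  public:
    dsp_sequencer_sketch(dsp* d1, dsp* d2, int bufferSize)
        : fDSP1(d1), fDSP2(d2) {
        // One intermediate buffer per output channel of the first DSP
        for (int c = 0; c < fDSP1->getNumOutputs(); c++)
            fTmp.push_back(std::vector<FAUSTFLOAT>(bufferSize));
    }
    int getNumInputs() { return fDSP1->getNumInputs(); }
    int getNumOutputs() { return fDSP2->getNumOutputs(); }
    void init(int sr) { fDSP1->init(sr); fDSP2->init(sr); }
    void buildUserInterface(UI* ui) {
        // Keep the parameters of both DSPs individually visible
        fDSP1->buildUserInterface(ui);
        fDSP2->buildUserInterface(ui);
    }
    void compute(int count, FAUSTFLOAT** in, FAUSTFLOAT** out) {
        std::vector<FAUSTFLOAT*> tmp;
        for (auto& ch : fTmp) tmp.push_back(ch.data());
        fDSP1->compute(count, in, tmp.data());   // first DSP fills the buffer
        fDSP2->compute(count, tmp.data(), out);  // second DSP reads it
    }
  private:
    dsp* fDSP1;
    dsp* fDSP2;
    std::vector<std::vector<FAUSTFLOAT>> fTmp;  // intermediate channels
};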
6 https://en.wikipedia.org/wiki/Decorator_pattern
7 Typically using any UI object.

3 Sample Accurate Control

DSP audio languages usually deal with several timing dimensions when treating control events and generating audio samples. For performance reasons, systems maintain a separated audio rate for sample generation and a control rate for asynchronous message handling. The audio stream is most often computed by blocks, and control is updated between blocks. To smooth control parameter changes, some languages choose to interpolate parameter values between blocks [7].

In some cases control may be more finely interleaved with audio rendering [8], and some languages [9] simply choose to interleave control and sample computation at sample level. Although the Faust language permits the description of sample level algorithms (like recursive filters etc.), Faust generated DSP are usually computed by blocks. Underlying audio architectures usually give a fixed size buffer over and over to the DSP compute method, which consumes and produces audio samples.

3.1 Control to DSP Link

In the current version of the Faust generated code, the primary connection point between the control interface and the DSP code is simply a memory zone. For control inputs, the architecture layer continuously writes values in this zone, which is then sampled by the DSP code at the beginning of the compute method, and used with the same values during the entire call. Because of this simple control/DSP connection mechanism, only the most recent value is seen by the DSP code.

Similarly for control outputs8, the DSP code inside the compute method possibly writes several values to the same memory zone, and only the last value will be seen by the control architecture layer when the method finishes.

Although this behaviour is satisfactory for most use-cases, some specific usages need to handle the complete stream of control values with sample accurate timing. For instance, keeping all control messages and handling them at their exact position in time is critical for proper MIDI clock synchronisation.

3.2 Time-Stamped Control

The first step consists in extending the architecture control mechanism to deal with time-stamped control events. Note that this requires the underlying event control layer to support this capability; the native MIDI API, for instance, is usually able to deliver time-stamped MIDI messages.

The next step is to keep all time-stamped events in a time-ordered data structure, continuously written by the control side and read by the audio side. Finally, the sample computation has to take account of all queued control events, and correctly change the DSP control state at successive points in time.

3.3 Slices Based DSP Computation

With time-stamped control messages, changing control values at precise sample indexes on the audio stream becomes possible. A generic slices based DSP rendering strategy has been implemented in the timed_dsp class.

A ring-buffer is used to transmit the stream of time-stamped events from the control layer to the DSP one. In the MIDI control case for instance, the ring-buffer is written with a pair containing the time-stamp expressed in samples and the actual MIDI message, each time one is received. In the DSP compute method, the ring-buffer will be read to handle all messages received during the previous audio block.

Since control values can change several times inside the same audio block, the DSP compute cannot be called only once with the total number of frames and the complete input/output audio buffers. The following strategy has to be used:

• several slices are defined, with control values changing between consecutive slices.

• all control values having the same time-stamp are handled together and change the DSP control internal state. The slice is computed up to the next control parameter time-stamp, until the end of the given audio block is reached.

• in the Figure 3 example, four slices with the sequence of c1, c2, c3, c4 frames are successively given to the DSP compute method, with the appropriate part of the audio input/output buffers. Control values (appearing here as [v1,v2,v3], then [v1,v3], then [v1], then [v1,v2,v3] sets) are changed between slices.

[Figure 3: Audio block slice-based computation]

Since time-stamped control messages from the previous audio block are used in the current block, control messages are always handled with one audio buffer latency.

8 Using bargraph kind of UI elements.
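A minimal sketch of this slice strategy is given below. The TimedEvent type and the adjustBuffers() helper are hypothetical placeholders for the ring-buffer reading and the channel-pointer arithmetic; the actual timed_dsp implementation differs in its details.

// Simplified sketch of slice-based rendering, assuming a sorted queue of
// events (frame offset + handler) already collected for this block.
// TimedEvent and adjustBuffers() are hypothetical helpers.
void computeSlices(dsp* d, int count,
                   FAUSTFLOAT** inputs, FAUSTFLOAT** outputs,
                   std::deque<TimedEvent>& events) {
    int offset = 0;
    while (offset < count) {
        // Apply all control changes stamped at the current offset
        while (!events.empty() && events.front().frame == offset) {
            events.front().apply(d);  // updates the DSP control state
            events.pop_front();
        }
        // Compute up to the next event (or the end of the block)
        int next = events.empty() ? count : std::min(events.front().frame, count);
        // adjustBuffers() shifts each channel pointer by 'offset' samples
        d->compute(next - offset,
                   adjustBuffers(inputs, offset),
                   adjustBuffers(outputs, offset));
        offset = next;
    }
}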
4 Polyphonic Instruments

Directly programming polyphonic instruments in Faust is perfectly possible. It is also needed if very complex signal interactions between the different voices have to be described9.

9 Like sympathetic strings resonance in a physical model of a piano for instance.

But since all voices would then always be computed, this approach may be too CPU costly for simpler or more limited needs. In this case,

describing a single voice in a Faust DSP program and externally combining several of them with a special polyphonic-instrument-aware architecture file is a better solution. Moreover, this special architecture file takes care of dynamic voice allocation and of MIDI control message decoding and mapping.

4.1 Polyphonic Ready DSP Code

By convention, Faust architecture files with polyphonic capabilities expect to find control parameters named freq, gain and gate. A metadata line of the kind declare nvoices "8"; with the desired number of voices can be added in the source code.

In the case of MIDI control, the freq parameter (which should be a frequency) will be automatically computed from MIDI note numbers, gain (which should be a value between 0 and 1) from velocity, and gate from keyon/keyoff events. Thus, gate can be used as a trigger signal for any envelope generator, etc.

4.2 Using the mydsp_poly class

The single voice has to be described by a Faust DSP program; the mydsp_poly class is then used to combine several voices and create a polyphonic ready DSP:

• the faust/dsp/poly-dsp.h file contains the definition of the mydsp_poly class used to wrap the DSP voice into the polyphonic architecture. This class maintains an array of dsp type objects, manages dynamic voice allocation, controls MIDI message decoding and mapping, mixes all running voices, and stops a voice when its output level decreases below a given threshold.

• as a subclass of dsp, the mydsp_poly class redefines the buildUserInterface method. By convention, all allocated voices are grouped in a global Polyphonic tabgroup. The first tab contains a Voices group, a master-like component used to change parameters on all voices at the same time10, with a Panic button to be used to stop running voices, followed by one tab for each voice. Graphical User Interface components will then reflect the multi-voices structure of the new polyphonic DSP (Figure 4).

[Figure 4: Extended multi-voices GUI interface]

The resulting polyphonic DSP object can be used as usual, connected with the needed audio driver, and possibly other UI control objects like OSCUI, httpdUI, etc. Having this new UI hierarchical view allows complete OSC control of each single voice and its control parameters, but also of all voices using the master component. The following OSC messages reflect the same DSP code either compiled normally, or in polyphonic mode (only part of the OSC hierarchies are displayed here):

// Mono mode
/0x00/0x00/vol f -10.0
/0x00/0x00/pan f 0.0

// Polyphonic mode
/Polyphonic/Voices/0x00/0x00/pan f 0.0
/Polyphonic/Voices/0x00/0x00/vol f -10.0
...
/Polyphonic/Voice1/0x00/0x00/vol f -10.0
/Polyphonic/Voice1/0x00/0x00/pan f 0.0
...
/Polyphonic/Voice2/0x00/0x00/vol f -10.0
/Polyphonic/Voice2/0x00/0x00/pan f 0.0
...

The polyphonic instrument allocation takes the DSP to be used for one voice11, the desired number of voices, the dynamic voice allocation state12, and the group state which controls whether separated voices are displayed or not (Figure 4):

DSP = new mydsp_poly(dsp, 2, true, true);

10 An internal control grouping mechanism has been defined to automatically dispatch a user interface action done on the master component to all linked voices.
11 The DSP object will be automatically cloned in the mydsp_poly class to create all needed voices.
12 Voices may be always running, or dynamically started/stopped in the case of MIDI control.
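As an illustration of the freq/gain/gate convention described in section 4.1, a minimal polyphonic-ready voice could be written as follows (the oscillator and envelope choices are arbitrary, relying on the standard Faust libraries):

declare nvoices "8";
import("stdfaust.lib");

// Control parameters expected by the polyphonic architecture files
freq = nentry("freq", 440, 20, 3000, 0.01);   // from MIDI note number
gain = nentry("gain", 0.5, 0, 1, 0.01);       // from MIDI velocity
gate = button("gate");                        // 1 on keyon, 0 on keyoff

// gate triggers the envelope; the voice is stopped once the tail fades out
process = os.sawtooth(freq) * gain * en.adsr(0.01, 0.1, 0.8, 0.3, gate);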

Note that with the following code, a polyphonic instrument may also be used outside of a MIDI control context, so that all voices will be always running and possibly controlled with OSC messages for instance:

DSP = new mydsp_poly(dsp, 8, false, true);

4.3 Controlling the Polyphonic Instrument

The mydsp_poly class is also ready for MIDI control and can react to keyon/keyoff and pitch-wheel messages. Other MIDI control parameters can directly be added in the DSP source code.

4.4 Deploying the Polyphonic Instrument

Several architecture files and associated scripts have been updated to handle polyphonic instruments. As an example on OSX, the script faust2caqt foo.dsp can be used to create a polyphonic CoreAudio/QT application. The desired number of voices is either declared in an nvoices metadata or set with the -nvoices num additional parameter13. MIDI control is activated using the -midi parameter. The number of allocated voices can possibly be changed at runtime using the -nvoices parameter to change the default value (so using ./foo -nvoices 16 for instance). Several other scripts have been adapted using the same conventions.

13 The -nvoices parameter takes precedence over the metadata value.

4.5 Polyphonic Instrument with a Global Output Effect

Polyphonic instruments may be used with an output effect. Putting that effect in the main Faust code is not a good idea, since it would be instantiated for each voice, which would be very inefficient. This is a typical use case for the dsp_sequencer class previously presented, with the polyphonic DSP connected in sequence with a unique global effect (Figure 5):

faust2caqt inst.dsp -effect effect.dsp

with inst.dsp and effect.dsp in the same folder, and the number of outputs of the instrument matching the number of inputs of the effect. A dsp_sequencer object will be created to combine the polyphonic instrument in sequence with the single output effect. Polyphonic-ready faust2xx scripts will then compile the polyphonic instrument and the effect, combine them in sequence, and create a ready to use DSP.

[Figure 5: Polyphonic instrument with output effect GUI interface: the left tab window shows the polyphonic instrument with its Voices group only, the right tab window shows the output effect.]

5 MIDI Control

MIDI control connects DSP parameters with MIDI messages (in both directions), and can be used to trigger polyphonic instruments.

5.1 MIDI Messages Description in the DSP Source Code

MIDI control messages are described as metadata in UI elements. They are decoded by a new MidiUI class, subclass of UI, which parses incoming MIDI messages and updates the appropriate control parameters, or sends MIDI messages when the UI elements (sliders, buttons...) are moved.

5.2 Defined Standard MIDI Messages

A special [midi:xxx yyy...] metadata needs to be added in the UI element. Here is the description of three common MIDI messages:

• [midi:keyon pitch] in a slider or bargraph will map the UI element value to keyon velocity in the (0, 127) range. When used with a button or checkbox, 1 will be mapped to 127, 0 will be mapped to 0,

• [midi:keyoff pitch] in a slider or bargraph will map the UI element value to keyoff velocity in the (0, 127) range. When used with a button or checkbox, 1 will be mapped to 127, 0 will be mapped to 0,

• [midi:ctrl num] in a slider or bargraph will map the UI element value to (or from) the (0, 127) range. When used with a button or checkbox, 1 will be mapped to 127, 0 will be mapped to 0.

The full description of supported MIDI messages is now part of the Faust documentation.

5.3 MIDI Clock Synchronization

MIDI clock based synchronization can be used to slave a given Faust program, using the sample accurate control mechanism described in section 3. The following three messages have to be used:

• [midi:start] in a button or checkbox will trigger a value of 1 when a start MIDI message is received

• [midi:stop] in a button or checkbox will trigger a value of 0 when a stop MIDI message is received

• [midi:clock] in a button or checkbox will deliver a sequence of successive 1 and 0 values each time a clock MIDI message is received, seen by Faust code as a square command signal, to be used to compute higher level information.

A typical Faust program will then use the MIDI clock command signal to compute Beats Per Minute (BPM) information, or for any synchronization need it may have. Here is a simple example of a sinusoid generated with a frequency controlled by the MIDI clock stream14, and starting/stopping when receiving the MIDI start/stop messages:

import("stdfaust.lib");

// square signal (1/0), changing state
// at each received clock
clocker = checkbox("MIDI clock[midi:clock]");

// ON/OFF button controlled
// with MIDI start/stop messages
play = checkbox("On/Off [midi:start][midi:stop]");

// detect front
front(x) = (x-x') != 0.0;

// count number of peaks during one second
freq(x) = (x-x@ma.SR) : + ~ _;

process = os.osc(8*freq(front(clocker))) * play;

14 Using an external MIDI clock generator and changing its tempo allows to precisely control the sinusoid frequency.

Note that the described sample accurate MIDI clock synchronization model can currently only be used at input level. Because of the simple memory zone based connection point between the control interface and the DSP code, output controls (like bargraph) cannot generate a stream of control values. Thus a reliable MIDI clock generator cannot be implemented with the current approach.

5.4 MIDI Classes

A midi base class defining MIDI message decoding/encoding methods has been developed. A midi_handler subclass implements the actual decoding. Several concrete implementations based on native APIs have been written (Figure 6) and can be found in the faust/midi folder.

[Figure 6: MIDI classes diagram]

Depending on the used native MIDI API, event time-stamps are either expressed in absolute time or in frames. They are converted to offsets expressed in samples relative to the beginning of the audio buffer.

Connected with the new MidiUI class, subclass of UI, they allow a given DSP to be controlled with incoming MIDI messages, or possibly to send MIDI messages when its internal control state changes. In the following piece of code, a MidiUI object is created and connected to an rt_midi [5] MIDI message handler, then given as parameter to the standard buildUserInterface to control the DSP parameters:

rt_midi midi_handler("MIDI");
MidiUI midiinterface(&midi_handler);
DSP->buildUserInterface(&midiinterface);

6 Deployment

The extended architecture files have been presented and used in the context of statically generated and compiled DSP, that is, generating C++ code from Faust, then compiling the resulting code into executable applications or plug-ins. They have been deployed in several faust2xx

scripts, and especially in faust2api, presented in [6]. Note that they can also be used with dynamically libfaust-generated DSP15, in particular in FaustLive [3], a standalone just-in-time Faust compiler, or in the faustgen~ Max/MSP external object.

7 Conclusion

The sample accurate control model could easily be adapted to work with MIDI controllable plugins like LV2 instruments16, so that MIDI clock synchronization could be used. Expanding the polyphonic and sample accurate control model over the network in the libfaustremote [4] library is still in progress.

As a general concluding remark, a deeper rethinking of the control/DSP connection model in the Faust compiled code will have to be done. As explained in section 3, control and DSP computation interaction is somewhat limited in the current model of the generated code. The described solution stays at the architecture layer level, with some limitations. Although sample accurate control for inputs can be done using the presented slices based DSP computation, this strategy does not help to properly retrieve the stream of control output values.

A cleaner approach would be to extend the model of control signals to be a list of time-stamped values, so that compute would handle a slice of time-stamped input controls (kept from the previous block), and possibly produce a slice of time-stamped output controls. Having this more general strategy at the code generation level still has to be developed.

Acknowledgments

This work has been done under the FEEVER project [ANR-13-BS02-0008] supported by the "Agence Nationale pour la Recherche".

References

[1] D. Fober, Y. Orlarey, and S. Letz, "Faust Architectures Design and OSC Support", IRCAM, (Ed.): Proc. of the 14th Int. Conference on Digital Audio Effects (DAFx-11), pp. 231-216, 2011.

[2] Y. Orlarey, D. Fober, and S. Letz (2009), "FAUST: an efficient functional approach to DSP programming." New Computational Paradigms for Computer Music, 290.

[3] S. Denoux, S. Letz, Y. Orlarey and D. Fober, "FAUSTLIVE Just-In-Time Faust Compiler... and much more." Linux Audio Conference, 2014.

[4] S. Letz, S. Denoux and Y. Orlarey, "Audio Rendering/Processing and Control Ubiquity? a Solution Built Using the Faust Dynamic Compiler and JACK/NetJack." ICMC/SMC, Athens 2014.

[5] "RtMidi framework online documentation" http://www.music.mcgill.ca/~gary/rtmidi/

[6] R. Michon, J. Smith, C. Chafe, S. Letz and Y. Orlarey, "faust2api: a Comprehensive API Generator for Android and iOS." Linux Audio Conference, 2017.

[7] J. McCartney, "Rethinking the Computer Music Language: SuperCollider." Computer Music Journal, Winter 2002.

[8] P. Donat-Bouillud, J.-L. Giavitto, A. Cont, N. Schmidt and Y. Orlarey, "Embedding native audio-processing in a score following system with quasi sample accuracy." ICMC, Utrecht 2016.

[9] G. Wang, P. R. Cook, and S. Salazar, "Chuck: A strongly timed computer music language." Computer Music Journal, 2016.

15 Dynamically libfaust-generated DSP are objects of llvm_dsp or interpreter_dsp types, subclasses of the dsp root class with the same API.
16 http://lv2plug.in/doc/html/


faust2api: a Comprehensive API Generator for Android and iOS

Romain Michon, Julius Smith, Chris Chafe
CCRMA, Stanford University
Stanford, CA 94305-8180, USA
{rmichon,jos,cc}@ccrma.stanford.edu

Stéphane Letz, Yann Orlarey
GRAME, Centre National de Création Musicale
11 Cours de Verdun (Gensoul), 69002 Lyon, France
{letz,orlarey}@grame.fr

Abstract

We introduce faust2api, a tool to generate custom DSP engines for Android and iOS using the Faust programming language. Faust DSP objects can easily be turned into MIDI-controllable polyphonic synthesizers or audio effects with built-in sensors support, etc. The various elements of the DSP engine can be accessed through a high-level API, made uniform across platforms and languages. This paper provides technical details on the implementation of this system as well as an evaluation of its various features.

Keywords

Faust, iOS, Android, Mobile Instruments

1 Introduction

Mobile devices (smart-phones, tablets, etc.) have been used as musical instruments for the past ten years, both in the industry (e.g., GarageBand1 for iPad, Smule's apps,2 moForte's GeoShred,3 etc.) and in the academic community ([Tanaka, 2004], [Geiger, 2006], [Gaye et al., 2006], [Essl and Rohs, 2009] and [Wang, 2014]).

Implementing real-time Digital Signal Processing (DSP) engines from scratch on mobile platforms can be hard using the standard audio APIs provided with common operating systems (we'll only cover iOS and Android here). Indeed, CoreAudio on iOS and OpenSL ES on Android are relatively low-level APIs offering customization possibilities not needed by most audio app developers. Fortunately, there exist several third party cross-platform APIs to work with real-time audio on mobile devices at a higher level (e.g., SuperPowered,4 JUCE,5 etc.). Additionally, several open-source tools allow to use objects written in common computer music languages such as PureData (libpd6 [Brinkmann et al., 2011]) and Csound (the Mobile Csound Platform, MCP7 [Lazzarini et al., 2012]) on mobile platforms.

Similarly, we introduced faust2android in a previous publication [Michon, 2013]: a tool allowing to turn Faust8 [Orlarey et al., 2009] code into a fully operational Android application. faust2android is based on faust2api [Michon et al., 2015], which allows to turn a Faust program into a cross-platform API usable on Android and iOS to carry out various kinds of real-time audio processing tasks.

In this paper, we present a completely re-designed version of faust2api offering the same features on Android and iOS:

• polyphony and MIDI support,

• audio effects chains,

• built-in sensors support,

• low latency audio,

• etc.

First, we'll give an overview of how faust2api works. Then, technical details on the implementation of this system will be provided. Finally, we'll evaluate it and present future directions for this project.

2 Overview

2.1 Basics

At its highest level, faust2api is a command line program taking a Faust code as its main argument and generating a package containing a series of files implementing the DSP engine. Various flags can be used to customize the API. The only required flag is the target platform:

1 http://www.apple.com/ios/garageband. All the URLs in this paper were verified on 01/26/17.
2 https://www.smule.com
3 http://www.moforte.com/geoshredapp
4 http://superpowered.com
5 https://www.juce.com
6 https://puredata.info
7 http://www.csounds.com
8 http://faust.grame.fr

faust2api -ios myCode.dsp

will generate a DSP engine for iOS, and

faust2api -android myCode.dsp

will generate a DSP engine for Android.

The content of each package is quite different between these two platforms (see §3), but the format of the API itself remains very similar (see Figure 1). The iOS DSP engines generated with faust2api consist of a large C++ object (DspFaust)9, accessible through a separate header file. This object can be conveniently instantiated and used in any C++ or Objective-C code in an app project. A typical "life cycle" for a DspFaust object can be

DspFaust *dspFaust = new DspFaust(SR, blockSize);
dspFaust->start();
dspFaust->stop();
delete dspFaust;

start() launches the computation of the audio blocks and stop() stops (pauses) the computation. These two methods can be repeated as many times as needed. The constructor allows to specify the sampling rate and the block size, and is used to instantiate the audio engine. While the configuration of the audio engine is very limited at the API level (only these two parameters can be configured through it), lots of flexibility is given to the programmer within the Faust code. For example, if the Faust object doesn't have any input, then no audio input will be instantiated in the audio engine, etc.

The value of the different parameters of a Faust object can be easily modified once the DspFaust object is created and running. For example, the freq parameter of the simple Faust code

f = nentry("freq",440,50,1000,0.01);
process = osc(f);

can be modified simply by calling

dspFaust->setParamValue("freq",440);

Faust user-interface elements (nentry here) are ignored by faust2api and simply used as a way to declare parameters controllable in the API.

2.2 MIDI Support

MIDI support can easily be added to a DspFaust object simply by providing the -midi flag when calling faust2api. MIDI support works the same way on Android and iOS: all MIDI devices connected to the mobile device before the app is launched can control the Faust object, and any new device connected while the app is running will also be able to control it.

Standard Faust MIDI meta-data can be used to assign MIDI CCs to specific parameters. For example, the freq parameter of the previous code could be controlled by MIDI CC 52 simply by writing

f = nentry("freq[midi: ctrl 52]",440,50,1000,0.01);

2.3 Polyphony

Faust objects can be conveniently turned into polyphonic synthesizers simply by specifying the maximum number of voices of polyphony when calling faust2api using the -nvoices flag. In practice, only active voices are allocated and computed, so this number is just used as a safeguard.

As used for many years by the various tools for making Faust synthesizers, such as faust2pd, compatibility with the -nvoices option requires the freq, gain and gate parameters to be defined. faust2api automatically takes care of converting MIDI note numbers to frequency values in Hz for freq, MIDI velocity to linear amplitude gain for gain, and note-on (1) and note-off (0) for gate:

f = nentry("freq",440,50,1000,0.01);
g = nentry("gain",1,0,1,0.01);
t = button("gate");
process = osc(f)*g*t;

Here, t could be used to trigger an envelope generator, for example. In such a case, the voice would stop being computed only after t is set to 0 and the tail-off amplitude becomes smaller than -60dB (configurable using macros in the application code).
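For instance, a voice using gate to drive an envelope generator, in line with the tail-off behavior just described, could look like this minimal sketch (the oscillator and envelope functions are taken from the standard Faust libraries and are illustrative assumptions):

import("stdfaust.lib");

f = nentry("freq",440,50,1000,0.01);   // set from MIDI note numbers
g = nentry("gain",1,0,1,0.01);         // set from MIDI velocity
t = button("gate");                    // 1 on note-on, 0 on note-off

// The release tail keeps the voice sounding after note-off; the engine
// deallocates the voice once the tail falls below the -60dB threshold.
process = os.osc(f) * g * en.adsr(0.005, 0.05, 0.9, 0.5, t);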
API packages generated by faust2api also contain a documentation providing information on how to use the API, as well as a list of all the parameters controllable with setParamValue().

The structure of the DSP engine package is quite different for Android, since it contains both C++ and JAVA files (see §3). Otherwise, the same steps can be used to work with the DspFaust object.

A wide range of methods is accessible to work with voices. A "typical" life cycle for a MIDI note can be

long voiceAddress = dspFaust->keyOn(note,velocity);
dspFaust->setVoiceParamValue("param",voiceAddress,paramValue);

9 http://faust.grame.fr/images/faust-quick-reference.pdf

dspFaust->keyOff(note);

setVoiceParamValue() can be used to change the value of a parameter for a specific voice. Alternatively, voices can be allocated without specifying a note number and a velocity:

long voiceAddress = dspFaust->newVoice();
dspFaust->setVoiceParamValue("param",voiceAddress,paramValue);
dspFaust->deleteVoice(voiceAddress);

For example, this can be very convenient to associate voices with specific fingers on a touchscreen.

When MIDI support is enabled in faust2api, MIDI events will automatically interact with voices. Thus, if a MIDI keyboard is connected to the mobile device, it will be able to control the Faust object without additional configuration steps.

2.4 Adding Audio Effects

In most cases, effects don't need to be re-implemented for each voice of polyphony and can be placed at the end of the DSP chain. faust2api allows to provide a Faust object implementing the effects chain to be connected to the output of the polyphonic synthesizer. This can be done simply by giving the -effect flag followed by a Faust effects chain file name (e.g., effect.dsp) when calling faust2api:

faust2api -android -nvoices 12 -effect effect.dsp synth.dsp

The parameters of the effect automatically become available in the DspFaust object and can be controlled using the setParamValue() method.

2.5 Working With Sensors

The built-in accelerometer and gyroscope of a mobile device can be easily assigned to any of the parameters of a Faust object using the acc or gyr meta-data:

g = nentry("gain[acc: 0 0 -10 0 10]",1,0,1,0.01);

Complex mappings can be implemented using this system.
This feature is not documented here, but more information about it is available in [Michon, 2017]. This reference also provides a series of tutorials on how to use faust2api.

3 Implementation

faust2api takes advantage of the modularity of the Faust architecture system [Letz et al., 2017] to generate its custom DSP engines. For example, turning a monophonic Faust synthesizer into a polyphonic one can be done in a simple generic way.

Both on Android and iOS, faust2api generates a large C++ file implementing all the features used by the high level API. On iOS, this API is accessed through a C++ header file that can be conveniently included in any C++ or Objective-C code. On Android, a JAVA interface allows to interact with the native (C++) block. The C++ DSP code is the same for all platforms (see Figure 2) and is wrapped into an object implementing the polyphonic synthesizer followed by the effects chain (assuming that the -nvoices and -poly2 options were used during compilation).

In this section, we provide more information on the architecture of DSP engines generated by faust2api for Android and iOS.

3.1 iOS

The global architecture of API packages generated by faust2api is relatively simple on iOS, since C++ code can be used directly in Objective-C (which is one of the two languages used to make iOS applications, along with Swift). The Faust synthesizer object gets automatically connected to the audio engine implemented using CoreAudio. As explained in the previous section, the sampling rate and the buffer length are defined by the programmer when the DspFaust object is created. The number of instantiated inputs and outputs is determined by the Faust code. By default, the system deactivates gain correction on the input, but this can be changed using a macro in the including source code.

MIDI support is implemented using RtMidi [Scavone and Cook, 2005], which is automatically added to the API if the -midi option was used for compilation. Alternatively, programmers might choose to use the propagateMidi() method to send raw MIDI events to the DspFaust object in case they would like to implement their own MIDI receiver. The same approach can be used for built-in sensors using the propagateAcc() and propagateGyr() methods.

3.2 Android

Android applications are primarily written in JAVA. However, despite the fact that the Faust compiler can generate JAVA code, it is not a

Basic Elements
  DspFaust: Constructor
  ~DspFaust: Destructor
  start: Start audio processing
  stop: Stop audio processing
  isRunning: True if processing is on
  getJSONUI: Get UI JSON description
  getJSONMeta: Get Metadata JSON

Parameters Control
  getParamsCount: Get number of params
  setParamValue: Set param value
  getParamValue: Get param value
  getParamAddress: Get param address
  getParamMin: Get param min value
  getParamMax: Get param max value
  getParamInit: Get param init value
  getParamTooltip: Get param description

Polyphony
  keyOn: Start a new note
  keyOff: Stop a note
  newVoice: Start a new voice
  deleteVoice: Delete a voice
  allNotesOff: Terminate all active voices
  setVoiceParamValue: Set param value for a specific voice
  getVoiceParamValue: Get param value for a specific voice

Other Functions
  propagateMidi: Propagate raw MIDI messages
  propagateAcc: Propagate raw accel data
  setAccConverter: Set accel mapping
  propagateGyr: Propagate raw gyro data
  setGyrConverter: Set gyro mapping
  getCPULoad: Get CPU load

Figure 1: Overview of the API functions. good choice for real-time audio signal processing 4 Evaluation [Michon, 2013]. Thus, DSP packages generated 4.1 Use in Other Frameworks by faust2api contain elements implemented faust2api both in JAVA and C++. is now used at the core of faust2android [Michon, 2013] and The native portion of the package (C++) im- faust2ios. It is also used as the basis 12 plements the DSP elements as well as the au- for our new SmartKeyboard tool (currently dio engine (see Figure 2) which is based on under development), allowing to generate mu- OpenSL ES.10 The audio engine is configured sical applications with advanced user interfaces to have the same behavior as on iOS. Native on Android and iOS. Figure 3 presents Nuance, elements are wrapped into a shared library ac- [Michon et al., 2016] a musical instrument cessible in JAVA through a Java Native Inter- based on faust2api and SmartKeyboard. face (JNI) using the Android Native Develop- 11 4.2 Audio Latency ment Kit (NDK). We measured the “touch-to-sound” and the MIDI receivers can only be created in JAVA “round-trip” audio latency of apps based on on Android (and only since Android API 23), faust2api for various devices using the tech- thus MIDI support is implemented in the JAVA niques described by Google on their website.13 portion. Like on iOS, the propagateMidi() The “touch-to-sound” latency is the time it method can be used to implement custom MIDI takes to generate a sound after a touch event receivers. was registered on the touch screen of the de- vice. The “round-trip” latency is the time it While raw sensor data can be retrieved in C++ takes to process an analog signal recorded by on Android, we decided to implement a system the built-in microphone or acquired by the line similar to the one used for MIDI, where raw input. sensor data are pushed from the JAVA layer to Latency performance hasn’t improved on iOS the native one. (see Table 1) compared to our previous study [Michon et al., 2015], except for newer devices
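As a small usage illustration, the parameter-related functions of Figure 1 can be combined to enumerate all controllable parameters of a running engine. This is a sketch: the header name and the exact signatures (index-based versus address-based accessors) are assumptions to be checked against the generated API documentation.

// Minimal sketch using only functions listed in Figure 1.
#include <iostream>
#include "DspFaust.h"  // header generated by faust2api (name assumed)

int main() {
    DspFaust dspFaust(44100, 512);
    dspFaust.start();
    // Print every controllable parameter with its current value and range
    for (int i = 0; i < dspFaust.getParamsCount(); i++) {
        const char* address = dspFaust.getParamAddress(i);
        std::cout << address
                  << " value=" << dspFaust.getParamValue(address)
                  << " min=" << dspFaust.getParamMin(address)
                  << " max=" << dspFaust.getParamMax(address) << "\n";
    }
    dspFaust.stop();
    return 0;
}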

12 https://ccrma.stanford.edu/˜rmichon/ 10https://www.khronos.org/opensles smartKeyboard 11https://developer.android.com/ndk/ 13https://source.android.com/devices/ index.html audio/latency_measurements.html LAC2017 - CIEREC - GRAME - Université Jean Monnet - Saint-Etienne - France 81

Android API

Faust Code JAVA Android Faust API

faust2api MIDI Support Built-In Sensors (AndroidMidi) Control iOS API C++ iOS Faust API JNI Interface

MIDI Support Built-In Sensors C++ (RtMidi -> CoreMidi) Control (Native Library)

Synth Object Synth Object Polyphony Polyphony Faust DSP (Synth) Faust DSP (Effect) Faust DSP (Synth) Faust DSP (Effect)

Audio Engine (CoreAudio) Audio Engine (OpenSL ES)

Audio In/Out Audio In/Out

Figure 2: Overview of DSP engines generated with faust2api.

Device      Touch to Sound   Round Trip
iPhone6     30 ms            13 ms
iPhone5     36 ms            13 ms
iPodTouch   36 ms            13 ms
iPadPro     28 ms            12 ms
iPadAir2    35 ms            13 ms
iPad2       45 ms            15 ms

Table 1: Audio latency for different iOS devices using faust2api.

such as the iPad Pro. On the other hand, Android made huge progress (see Table 2), thanks to tremendous work carried out by Google, as well as our completely rewritten audio engine.

[Figure 3: Nuance: a musical instrument using faust2api.]

Device            Touch to Sound   Round Trip   OS
HTC Nexus 9       29 ms            15 ms        7.0
Huawei Nexus 6p   31 ms            17 ms        7.0
Asus Nexus 7      37 ms            48 ms        7.0
Samsung Gal. S5   37 ms            48 ms        5.0

Table 2: Audio latency for different Android devices using faust2api.

Table 2 shows that a "reasonable" latency can only be achieved with the latest version of Android, which confirms the measurements made by Google.14 Unfortunately, such performances can only be attained on a few devices supported by Google, and configured with a specific sampling rate and buffer length.

5 Future Directions

We believe that faust2api has reached a mature and stable state. However, many elements can be improved. First, while basic MIDI support is provided, we haven't tested it with complex MIDI interfaces such as the ones using the Multidimensional Polyphonic Expression (MPE) standard (e.g. LinnStrument,15 ROLI Seaboard,16 etc.).

14 https://source.android.com/devices/audio/latency_measurements.html#measurements
15 http://www.rogerlinndesign.com/linnstrument.html
16 https://roli.com/products/seaboard-grand

Currently, specific parameters of the various elements of the API (such as the audio engine, MIDI behavior, etc.) can only be configured using source-code macros. We would like to provide a more systematic and, in some cases, dynamic way of controlling them.

Finally, we plan to add more targets to faust2api for various kinds of platforms, to help design elements such as audio plug-ins, standalone applications, and embedded systems.

6 Conclusions

Faust gives access to dozens of high quality open source sound processors and generators, ranging from specialized types of filters to virtual analog oscillators, etc. Thanks to faust2api, all these elements can be easily embedded and controlled in any Android or iOS app in a very simple manner.

One of the new experimental features of the Faust compiler allows to select at run time the portions of a Faust object that are computed. This makes it possible to create very large objects embedding multiple synthesizers and effects. We believe that this feature, in combination with faust2api, will allow to design complex Faust-based DSP engines for a wide range of platforms.

References

Peter Brinkmann, Peter Kirn, Richard Lawler, Chris McCormick, Martin Roth, and Hans-Christoph Steiner. 2011. Embedding PureData with libpd. In Proceedings of the Pure Data Convention, Weimar, Germany.

Georg Essl and Michael Rohs. 2009. Interactivity for mobile music-making. Organised Sound, 14(2):197-207.

Lalya Gaye, Lars Erik Holmquist, Frauke Behrendt, and Atau Tanaka. 2006. Mobile music technology: Report on an emerging community. In Proceedings of the International Conference on New Interfaces for Musical Expression (NIME-06), Paris, France, June.

Günter Geiger. 2006. Using the touch screen as a controller for portable computer music instruments. In Proceedings of the 2006 International Conference on New Interfaces for Musical Expression (NIME-06), Paris, France.

Victor Lazzarini, Steven Yi, Joseph Timoney, Damian Keller, and Marco Pimenta. 2012. The mobile Csound platform. In Proceedings of the International Conference on Computer Music (ICMC-12), Ljubljana, Slovenia, September.

Stéphane Letz, Yann Orlarey, Dominique Fober, and Romain Michon. 2017. Polyphony, sample-accurate control and MIDI support for FAUST DSP using combinable architecture files. In Proceedings of the Linux Audio Conference (LAC-17), Saint-Etienne, France.

Romain Michon, Julius Orion Smith, and Yann Orlarey. 2015. MobileFaust: a set of tools to make musical mobile applications with the Faust programming language. In Proceedings of the Linux Audio Conference (LAC-15), Mainz, Germany, April.

Romain Michon, Julius O. Smith, Chris Chafe, Matthew Wright, and Ge Wang. 2016. Nuance: Adding multi-touch force detection to the iPad. In Proceedings of the Sound and Music Computing Conference (SMC-16), Hamburg, Germany.

Romain Michon. 2013. faust2android: a Faust architecture for Android. In Proceedings of the 16th International Conference on Digital Audio Effects (DAFx-13), Maynooth, Ireland, September.

Romain Michon. 2017. Faust tutorials. Webpage. https://ccrma.stanford.edu/~rmichon/faustTutorials.

Yann Orlarey, Stéphane Letz, and Dominique Fober. 2009. New Computational Paradigms for Computer Music, chapter "Faust: an Efficient Functional Approach to DSP Programming". Delatour, Paris, France.

Gary Scavone and Perry Cook. 2005. RtMidi, RtAudio, and a synthesis toolkit (STK) update. In Proceedings of the 2005 International Computer Music Conference, Barcelona, Spain.

Atau Tanaka. 2004. Mobile music making. In Proceedings of the 2004 conference on New interfaces for musical expression (NIME04), National University of Singapore.

Ge Wang. 2014. Ocarina: Designing the iPhone's Magic Flute. Computer Music Journal, 38(2):8-21, Summer.

New Signal Processing Libraries for Faust

Romain Michon, Julius Smith
CCRMA, Stanford University
Stanford, CA 94305-8180, USA
{rmichon,jos}@ccrma.stanford.edu

Yann Orlarey
GRAME, Centre National de Création Musicale
11 Cours de Verdun (Gensoul), 69002 Lyon, France
[email protected]

Abstract

We present a completely re-organized set of signal processing libraries for the Faust programming language. They aim at providing a clearer classification of the different Faust DSP functions, as well as better documentation. After giving an overview of this new system, we provide technical details about its implementation. Finally, we evaluate it and give ideas for future directions.

Keywords

Faust, Digital Signal Processing, Computer Music Programming Language

1 Introduction

Faust is a functional programming language for real time Digital Signal Processing (DSP) targeting high-performance audio applications and plug-ins for a wide range of platforms and standards [Orlarey et al., 2009].

One of Faust's strengths lies in its DSP libraries implementing a large collection of reference implementations, ranging from filters to audio effects and sound generators. When Faust was created, it had a limited number of DSP libraries that were organized in a "somewhat" coherent way: math.lib contained mathematical functions, and music.lib everything else (filters, effects, generators, etc.). Later, the libraries filter.lib, oscillator.lib, and effect.lib were developed [Smith, 2008], [Smith, 2012], which had significant overlap in scope with music.lib.

A year ago, we decided to fully reorganize the Faust libraries to

• provide more clarity,

• organize functions by category,

• standardize function names,

• create a dynamic documentation of their content.

In this paper, we give an overview of the organization of the new Faust libraries, as well as technical details about their implementation. We then evaluate them through the results of a workshop on Faust that was taught at the Center for Computer Research in Music and Acoustics (CCRMA) at Stanford University in 2016, and we provide ideas for future directions.

2 Global Organization and Standards

2.1 Overview

The new Faust libraries1 are organized in the different files presented in Figure 1. Each file contains several subcategories allowing to easily find functions for specific uses. While some libraries host fewer functions than others, they were created to be easily updated with new elements. The content of the old (and now deprecated) libraries was spread across these new files, making backward compatibility a bit hard to implement (see §2.4).

More specifically, the old music.lib was removed since it contained much overlap in scope with oscillator.lib, effect.lib, and filter.lib. effect.lib was divided into several "specialized" libraries: compressors.lib, misceffects.lib, phaflangers.lib, reverbs.lib, and vaeffects.lib. Similarly, the content of oscillator.lib is now spread between noises.lib and oscillators.lib. Finally, demos.lib hosts demo functions, typically adding user-interface elements with illustrative parameter defaults.

2.2 Prefixes

Each Faust library has a recommended two-letter namespace prefix defined in the "meta library" stdfaust.lib. For example, stdfaust.lib contains the lines

1 http://faust.grame.fr/library.html. All the URLs in this paper were verified on 01/30/17.

analyzer.lib
- Amplitude Tracking
- Spectrum-Analyzers
- Mth-Octave Spectral Level
- Arbitrary-Crossover Filter Banks and Spectrum Analyzers

basics.lib
- Conversion Tools
- Counters and Time/Tempo Tools
- Array Processing and Pattern Matching
- Selectors (Conditions)
- Other Misc Functions

maths.lib
- Constants
- Functions

misceffects.lib
- Dynamic
- Filtering
- Time Based
- Pitch Shifting
- Meshes

noises.lib
Noise generators library.

compressors.lib
Compressors and limiters library.

delays.lib
- Basic Delay Functions
- Lagrange Interpolation
- Thiran Allpass Interpolation

demos.lib
- Analyzers
- Filters
- Effects
- Generators

envelopes.lib
Envelope generators library.

filters.lib
- Basic Filters
- Comb Filters
- Direct-Form Sections
- Direct-Form Second-Order Biquad Sections
- Ladder/Lattice
- Virtual Analog Filters
- Simple Resonator
- Butterworth Filters
- Elliptic (Cauer) Filters
- Filters for Parametric Equalizers (Shelf, Peaking)
- Arbitrary-Crossover Filter-Banks

oscillators.lib
- Wave-Table-Based Oscillators
- LFOs
- Low Frequency Sawtooths
- Bandlimited Sawtooth
- Bandlimited Pulse, Square, and Impulse Trains
- Filter-Based Oscillators
- Waveguide-Resonators

phaflangers.lib
Phasers and flangers library.

reverbs.lib
Reverbs library.

routes.lib
Signal routing library.

signals.lib
Misc signal tools library.

spats.lib
Spatialization tools library.

synths.lib
Misc synthesizers library.

vaeffects.lib
Virtual analog effects library.

Figure 1: Overview of the organization of the new Faust libraries.

fi = library("filters.lib");
os = library("oscillators.lib");

so that functions from oscillators.lib can be invoked using the os prefix, and functions from filters.lib through fi:

import("stdfaust.lib");
process = os.sawtooth(440) : fi.lowpass(2,2000);

It is of course possible to avoid prefixes using the import directive:

import("filters.lib");
import("oscillators.lib");
process = sawtooth(440) : lowpass(2,2000);

The libraries presently avoid name collisions, so it is possible to load all functions from all libraries into one giant namespace soup:

import("all.lib");
process = sawtooth(440) : lowpass(2,2000);

Alternatively, all Faust-defined functions can be loaded into a single namespace separate from the user's namespace:

sf = library("all.lib"); // standard faust namespace
process = sf.sawtooth(440) : sf.lowpass(2,2000);

Further details can be found in the documentation for the libraries.2

2.3 Standard Functions

The Faust libraries implement dozens of functions, and it can be hard for new users to find standard elements for basic uses. For example, filters.lib contains seven different lowpass filters, and it's probably not obvious to someone with little experience in signal processing which one should be used.

To address this problem, the new Faust libraries declare "standard" functions (see Figure 2) that are automatically added to the library documentation.3 Standard functions are organized by categories, independently from the library where they are declared (see §3). They should cover the needs of most users accustomed to computer music programming environments such as PureData4 or SuperCollider5.

2.4 Backward Compatibility

With such major changes, providing a decent level of backward compatibility proved to be quite complicated. The old Faust libraries (effect.lib, filter.lib, math.lib, music.lib and oscillator.lib) can still be used and will remain accessible for about one year.

In order to make this possible, we had to find a way to make them cohabit with the new libraries without creating conflicts. Thus, we decided to use plurals for the names of the new libraries, allowing to concurrently use our new filters.lib with the old filter.lib, for example. If one of the old libraries is imported in a Faust program, the Faust compiler now throws a warning indicating the use of a deprecated library.

2.5 Other "Non-Standard" Libraries

A few "non-standard" libraries for very specific applications remain accessible but are not documented (see §3):

• hoa.lib: high order ambisonics library

• instruments.lib: library used by the Faust-STK [Michon and Smith, 2011]

• maxmsp.lib: compatibility library for Max/MSP

• tonestacks.lib: tonestack emulation library used by Guitarix6

• tubes.lib: guitar tube emulation library used by Guitarix

3 Automatic Documentation

The new Faust libraries use a new automatic documentation system based on the faust2md (Faust to MarkDown) script, which is now part of the Faust distribution. It allows to easily write MarkDown comments within the code of the libraries by respecting the standards described below.

Library headers and descriptions can be created with

//##### Library Name #####
// Some Markdown text.
//########################

Libraries can be organized into sections using the following syntax:

//===== Section Name =====
// Some Markdown text.
//========================

Each function in a library should be documented as such:

//---- Function Name ----
// Some Markdown text.
//-----------------------

The libraries documentation can be conveniently generated by running

make doclib

2 http://faust.grame.fr/library.html
3 http://faust.grame.fr/library.html#standard-functions
4 https://puredata.info
5 http://supercollider.github.io
6 http://guitarix.org
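Putting the three comment levels together, a minimal library skeleton using these conventions could look as follows (the library, section, and function names are purely illustrative):

ma = library("maths.lib");

//######### mylib.lib ##########
// A small example library documented with the faust2md conventions.
//##############################

//========= Oscillators ========
// Simple test oscillators.
//==============================

//--------- myphasor -----------
// Unit-amplitude phasor.
// Usage: myphasor(freq) : _
//------------------------------
myphasor(freq) = freq/ma.SR : (+ : ma.frac) ~ _;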

Analysis Tools
  an.amp_follower: Amplitude follower
  an.mth_oct[...]: Octave analyzers

Basic Elements
  ba.beat: Pulse generator
  ba.bpf: Break Point Function
  si.block: Block a signal
  si.bus: Bus of n signals
  ba.bypass1: Bypass (mono)
  ba.bypass2: Bypass (stereo)
  ba.count: Counts in a list
  ba.countdown: Samples count down
  ba.countup: Samples count up
  de.delay: Integer delay
  de.fdelay: Fractional delay
  ba.impulsify: Signal to impulse
  ba.sAndH: Sample and hold
  ro.cross: Cross n signals
  si.smoo: Smoothing
  si.smooth: Controllable smoothing
  ba.take: Element from a list
  ba.time: Timer

Conversion
  ba.db2linear: dB to linear
  ba.linear2db: Linear to dB
  ba.midikey2hz: MIDI key to Hz
  ba.pole2tau: Pole to t60
  ba.samp2sec: Samples to seconds
  ba.sec2samp: Seconds to samples
  ba.tau2pole: t60 to pole

Effects
  ve.autowah: Auto-wah
  co.compressor: Compressor
  ef.cubicnl: Distortion
  ve.crybaby: Crybaby
  ef.echo: Echo
  pf.flanger: Flanger
  ef.gate_mono: Signal gate
  co.limiter: Limiter
  pf.phaser2: Phaser
  re.fdnrev0: Reverb (FDN)
  re.freeverb: Reverb (Freeverb)
  re.jcrev: Reverb (simple)
  re.zita_rev1: Reverb (Zita)
  sp.panner: Panner
  ef.transpose: Pitch shift
  sp.spat: Panner
  ef.speakerbp: Speaker simulator
  ef.stereo_width: Stereo width
  ve.vocoder: Vocoder
  ve.wah4: Wah

Envelopes
  en.adsr: ADSR envelope
  en.ar: AR envelope
  en.asr: ASR envelope
  en.smoothEnv: Exponential envelope

Filters
  fi.bandpass: Bandpass (Butterworth)
  fi.resonbp: Bandpass (resonant)
  fi.bandstop: Bandstop (Butterworth)
  fi.tf2: Biquad
  fi.allpass_fcomb: Comb (allpass)
  fi.fb_fcomb: Comb (feedback)
  fi.ff_fcomb: Comb (feedforward)
  fi.dcblocker: DC blocker
  fi.filterbank: Filterbank
  fi.fir: FIR (arbitrary order)
  fi.high_shelf: High shelf
  fi.highpass: Highpass (Butterworth)
  fi.resonhp: Highpass (resonant)
  fi.iir: IIR (arbitrary order)
  fi.levelfilter: Level filter
  fi.low_shelf: Low shelf
  fi.lowpass: Lowpass (Butterworth)
  fi.resonlp: Lowpass (resonant)
  fi.notchw: Notch filter
  fi.peak_eq: Peak equalizer

Generators
  os.impulse: Impulse
  os.imptrain: Impulse train
  os.phasor: Phasor
  no.pink_noise: Pink noise
  os.pulsetrain: Pulse train
  os.lf_imptrain: Low-freq pulse train
  os.sawtooth: Sawtooth wave
  os.lf_saw: Low-freq sawtooth
  os.osc: Sine (filter-based)
  os.oscsin: Sine (table-based)
  os.square: Square wave
  os.lf_square: Low-freq square
  os.triangle: Triangle
  os.lf_triangle: Low-freq triangle
  no.noise: White noise

Synths
  sy.additiveDrum: Additive drum
  sy.dubDub: Filtered sawtooth
  sy.combString: Comb string
  sy.fm: FM
  sy.sawTrombone: Lowpassed sawtooth
  sy.popFiltPerc: Popping filter

Figure 2: Standard Faust functions with their corresponding prefix when used with stdfaust.lib.

at the root of the Faust distribution. This will generate an html and a pdf file in the /documentation folder using pandoc.7

4 Evaluation and Future Directions
The new Faust libraries were beta tested during the CCRMA Faust Summer Workshop at Stanford University.8 In previous editions of the workshop, students had to go through the library files to get the documentation of specific functions. During last year's workshop, thanks to the new libraries documentation, students were able to find information about functions simply by doing a search in the documentation file. Additionally, none of them encountered problems while using the new libraries, which was very satisfying.
The Faust libraries are meant to grow with time, and we hope that this new format will facilitate the integration of new contributions. Eventually, we plan to divide filters.lib into more subcategories, like we did for the old oscillator.lib. Finally, physmodels.lib, a new library for the physical modeling of musical instruments, is currently under development.

5 Conclusions
The new Faust libraries provide a platform to easily prototype DSP algorithms using the Faust programming language. Their new organization, in combination with their automatically generated documentation, simplifies the search for specific elements covering a wide range of uses. New "standard functions" help to point new users to useful elements to implement various kinds of synthesizers, audio effects, etc. Finally, we hope that this new format will encourage new contributions.

6 Acknowledgments
Thanks to Albert Gräf for his contributions to the design of the new libraries, and for single-handedly implementing a solid backward-compatibility scheme!

References
Romain Michon and Julius O. Smith. 2011. Faust-STK: a set of linear and nonlinear physical models for the Faust programming language. In Proceedings of the 14th International Conference on Digital Audio Effects (DAFx-11), Paris, France, September.
Yann Orlarey, Stéphane Letz, and Dominique Fober. 2009. New Computational Paradigms for Computer Music, chapter "Faust: an Efficient Functional Approach to DSP Programming". Delatour, Paris, France.
Julius Orion Smith. 2008. Virtual electric guitars and effects using Faust and Octave. In Proceedings of the Linux Audio Conference (LAC-08), pages 123-127, KHM, Cologne, Germany.
Julius O. Smith. 2012. Signal processing libraries for Faust. In Proceedings of the Linux Audio Conference (LAC-12), Stanford, USA, May.

7http://pandoc.org.
8https://ccrma.stanford.edu/~rmichon/faustWorkshops/2016.


Heterogeneous data orchestration
Interactive fantasy under SuperCollider

Sébastien Clara
CIEREC, UJM – PhD student
35 rue du 11 novembre
42000 Saint-Étienne, France
[email protected]

Abstract
For L'Imaginaire music ensemble, I composed a piece of interactive music involving acoustic instruments and surround electronic music. The interaction between the live musicians and the electronics is based on data collected in real time from the acoustic instruments. This data is further used to adjust timbre and to synchronize the electronics with the rest of the music.

Keywords
SuperCollider, interactivity, mixed music

1 Introduction
For L'Imaginaire1 music ensemble, I composed an interactive piece of mixed music. Mixed music is a term used in the musicological literature to refer to a musical genre, defined by the alloy of instrumental music and electronic music. For this paper, I wish to focus on an interactive property of my piece.
Historically, the electronic part of mixed music is composed in a studio and fixed on magnetic tape2. This technique makes any interaction between the interpreters and the electronic sound impossible; feedback only helps to improve the compositional process of the tape. The other technique is to perform audio processing of the acoustic instruments in real time3. In this case, we are talking about a device that increases the sonic possibilities of the acoustic instrument, and feedback is used to regulate the electronic sound, by a human or an automaton.
I use these two techniques of electronic sound accompaniment in my work, but the increase in computing power and the versatility of the tools have allowed a median way. In the next chapter, I present my problematic. Thereafter, I will illustrate the issue with an example which could be extrapolated to other devices.

2 Problematic
According to Robert Rowe, "Interactive computer music systems are those whose behavior changes in response to musical input. Such responsiveness allows these systems to participate in live performances, of both notated and improvised music" [1]. How can a system listen to a musician and make an appropriate decision to generate a sound response? In his book, Rowe analyzes systems that use the MIDI standard to communicate between instruments and computers. But how can we use traditional instruments?
The audio descriptors4 correspond to parameters that describe an analyzed sound. A set of descriptors is used to construct a data set and create spaces for representing the sound. The parameters that are extracted can be described in different ways, depending on what is expected from the information conveyed by the parameter.
The sound acquisition sensor for the processing unit is a microphone. Therefore, the use of audio descriptors in interactive computer music systems allows the use of standard acoustic instruments.

1Musical ensemble composed of Keiko Murakami (flutes), Philippe Koerper (saxophones) and Maxime Springer (piano). http://www.limaginaire.org/.
2The first pieces of music using this technique: Orphée 51 and Orphée 53 by Pierre Schaeffer and Pierre Henry (1951 and 1953) and Musica su due dimensioni by Bruno Maderna (1952 and 1958).
3The first pieces of music using this technique: Mixtur (1964) and Mikrophonie I (1964) by Karlheinz Stockhausen.
4With SuperCollider, it is necessary to install its extension package to benefit from a wide choice of descriptors: https://github.com/supercollider/sc3-plugins.

To control the electronic sound with the sound of an acoustic instrument, many descriptors can be used. To do this, it is necessary to match a sound characteristic with the values of the parameters that describe it. Establishing this relationship will allow us to set a particular threshold; once it is crossed, the system can trigger a response. However, the interval between the extreme values of a parameter depends on its nature and on the analyzed sound source. To use this technique of interaction between an instrumentalist and electronic sound, a first difficulty is to negotiate with the heterogeneity of the data produced by the different audio descriptors.
For example, I want to use the nuance and the pitch of a sound to build a particular accompaniment. The amplitude descriptor returns a number that can range from 0 to 1, and the pitch descriptor returns a frequency. The range of extreme values returned by the pitch descriptor depends on the ambit of the analyzed sound source. Determining a trigger threshold to activate a response of the system therefore involves crossing values of different magnitudes, sometimes from sound sources of different natures. Moreover, this technique uses sound capture, and this information varies with the environment. Consequently, settings elaborated in a studio with a particular electro-acoustic chain would be less effective in another location with a different electro-acoustic chain. So, how can we treat the heterogeneity of the data produced by different audio descriptors, different sound sources, different electro-acoustic chains and different concert locations, to achieve a homogeneous electronic music accompaniment for each interpretation of the same piece?

3 Creating a repository
The first step is to establish a reference. This will serve as a standard for a single piece, a specific electro-acoustic chain and a particular place; we will be obliged to renew it each time one of these three parameters changes. Furthermore, this repository will allow us to characterize our data and to determine thresholds for performing a particular electronic sound accompaniment.
In the following example, we use the amplitude descriptor (Amplitude.kr) and the pitch descriptor (Pitch.kr). The collected values are transmitted by the OSC protocol (SendReply.kr) to the SuperCollider clients at the /dataTrigger address. The data is sent whenever an onset is detected (Onsets.kr). When instantiating the synthesizer, two arguments are available: the first determines the input of the audio interface on which the signal to be analyzed arrives (SoundIn.ar), and the second determines the onset detection threshold.
Executing figure 1 will only give the definition of the synthesizer to the SuperCollider audio server5. The synthesizer is not instantiated, so it does not run and it does not ask for resources from the hardware6. We will run it only when we want to receive data (figure 2).
We now have to build a data collector to constitute our repository. To do this, we use the OSCFunc object, a fast responder for incoming OSC messages. We configure it with the previously defined OSC address. When a new message arrives from the analyzer, it executes a function; in this case, this function saves the amplitude and the pitch of the signal in global array variables.
We set the onset detection threshold to -9 dB. We can change this parameter to change the density of the data reception. The analyzer listens to the first input of our audio interface (default setting). In figure 2, running the first block starts the acquisition of the data. Running the last line kills the synthesizer and responder instances, freeing memory and processor usage. The collected information is stored in the array variables, and we can then examine it graphically.

5http://doc.sccode.org/Classes/SynthDef.html.
6http://doc.sccode.org/Classes/Synth.html.
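The code of figures 1 and 2 did not survive the extraction of this paper. As a rough reconstruction of what the text describes (the names \analyzer, ~analyzer, ~collector, ~amps and ~pitches are our own assumptions; only SoundIn.ar, Amplitude.kr, Pitch.kr, Onsets.kr, SendReply.kr, OSCFunc and the /dataTrigger address come from the paper), such an analyzer and collector could look like this:

// figure 1 (sketch): the analyzer, sending data on each detected onset
SynthDef(\analyzer, { |in = 0, thresh = 0.35|
    var sig, amp, pitch, onsets;
    sig = SoundIn.ar(in);                // input of the audio interface
    amp = Amplitude.kr(sig);             // nuance, between 0 and 1
    pitch = Pitch.kr(sig)[0];            // frequency in Hz
    onsets = Onsets.kr(FFT(LocalBuf(512), sig), thresh);
    SendReply.kr(onsets, '/dataTrigger', [amp, pitch]);
}).add;

// figure 2 (sketch): start the acquisition and collect the values
~amps = []; ~pitches = [];
// mapping the -9 dB threshold to Onsets' 0..1 threshold is our assumption
~analyzer = Synth(\analyzer, [\in, 0, \thresh, (-9).dbamp]);
~collector = OSCFunc({ |msg|
    ~amps = ~amps.add(msg[3]);           // amplitude
    ~pitches = ~pitches.add(msg[4]);     // pitch
}, '/dataTrigger');

// last line of figure 2: free the instances
// [~analyzer, ~collector].do(_.free);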

Plotting this data helps to interpret it. The graph shows the evolution of the parameter over time and allows us to make a correspondence between a sound characteristic and particular values. The histogram provides a representation of the distribution of the values of the analyzed parameter. This observation allows us to characterize the distribution produced by a descriptor.

4 Using the repository
In our system, we determine the value of a threshold to trigger a response, in order to accomplish a dynamic electronic music accompaniment. This choice may be arbitrary, or be determined in response to a specificity of the analyzed sound source: above a certain value, our program triggers a response, for example. We can also choose this value according to its frequency of appearance in the distribution; according to the sound result produced by this choice, we can increase or decrease the density of our electronic accompaniment by modifying the value of our threshold. However, how do we handle values of different magnitudes?
We manipulate the repository by the requested percentile. To use this method, it is necessary to install an additional library: with the SuperCollider package manager7 you can install MathLib, which gives us access to additional statistics methods for arrays. The percentile rank corresponds to the proportion of the values of a distribution less than or equal to a determined value. We manipulate our data with float values from 0 to 1; for example, if we want to know the value equal to 90% of our data, we use 0.9.
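As a sketch of this percentile manipulation (computed by hand here, so as not to guess MathLib's exact method names), the threshold corresponding to a percentile rank can be read from the sorted repository:

// value at percentile rank 0.9 of the collected data (sketch)
~percentile = { |data, rank = 0.9|
    var sorted = data.copy.sort;
    sorted[(rank * (sorted.size - 1)).round.asInteger];
};
~ampThresh = ~percentile.value(~amps, 0.9);
~pitchThresh = ~percentile.value(~pitches, 0.9);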

During the adjustment phase of our system, we can tune several parameters of different magnitudes transparently with a single scale.

7http://doc.sccode.org/Classes/Quarks.html.

5 System Response
In order for our system to respond to certain stimuli, we must attribute to it a means of sound production. To do this, we define an arbitrary synthesizer.
To implement a concrete example, we assume that our device listens to two types of percussion, one of which emits higher-pitched sounds than the other. We decide that our system will respond only to the percussion which emits the most high-pitched sounds, and only to the loudest sounds. Together with the onset detection threshold, our condition for triggering a response therefore depends on two other parameters: a pitch threshold and an amplitude threshold. If the frequency of the answers does not suit us, we can return to the choice of the values of these variables to modify the sensitivity of our system.
The implementation of a response for our system follows the same structure as figure 2: a responder waits for incoming OSC messages from the analyzer, and when a new message arrives, if it meets the previously formulated conditions (high-pitched and loud), a response is issued.
This response is customized according to the analysis data. This connection is made by a sonification process, a "technique of rendering sound in response to data and interactions" [2]. I do not deal with the mapping technique in this paper, but its mastery is a source of variation and expressiveness for electronic music. To make this relation, we map the synthesizer parameters from an input range to an output range. We can set the input range with the repository information and adjust the output according to the desired sound quality. Figure 7 is the implementation of the response of our system; a sketch follows below. At the end, we must free the objects to release memory and processor usage.
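The code of figure 7 is also missing from the extracted text; the following sketch reproduces the described logic (the \grain synth stands in for the granular FM synthesis of the piece and its parameters are assumptions):

// sketch of the response: trigger a sound on loud, high-pitched onsets
~responder = OSCFunc({ |msg|
    var amp = msg[3], pitch = msg[4];
    if((pitch > ~pitchThresh) and: { amp > ~ampThresh }) {
        Synth(\grain, [
            \freq, pitch,                              // response pitch follows the analysis
            \amp, amp.linlin(~ampThresh, 1, 0.1, 0.5)  // input range taken from the repository
        ]);
    };
}, '/dataTrigger');

// at the end, free the objects to release memory and CPU
// ~responder.free;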
6 For further
For this paper, we reduced our system to its simplest expression. In this chapter, we wish to develop its design. Some of these ideas were conceived during the development of our piece and others afterwards.
Robert Rowe divides his interactive computer music system (Cypher) into two sections [1]: the listener analyzes the data produced by a musician and the player delivers a musical response. The structure of the SuperCollider source code implies this organization: our analysis synthesizer is the listener, and the function of the responder object for incoming OSC messages is the player. We keep these terms to locate the following points.
Our listener can also have an implicit function of time master. Indeed, we transmit the data of the analysis every time an onset is detected, but we could transmit them at a given frequency; the responses delivered by the system would then have a beat. We can also use our listener to produce an automation (with the Env and EnvGen.kr objects). An automation allows us to control and automate the variation of a parameter over a given time. In this way, we determine the evolution of any threshold or parameter, as sketched below.
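A minimal sketch of such an automation, assuming the threshold lives on a control bus that the analyzer reads (the bus routing is our own choice, not taken from the paper):

// evolve the onset threshold over one minute (sketch)
~threshBus = Bus.control(s, 1);
{ EnvGen.kr(Env([0.35, 0.1], [60])) }.play(outbus: ~threshBus);
~analyzer.map(\thresh, ~threshBus); // the analyzer now follows the automation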
We can parallelize other listeners that analyze other sound qualities (brightness, noise, dissonance, etc.) to obtain other trigger thresholds and make the electronic music accompaniment executed by our system more complex.
For our player, we can define a maximum number of synthesizers executed in parallel, in order to preserve the resources of the system and/or to control the acoustic density so as not to saturate our perception.
In addition, we can perform a certain musical process8 that feeds on our repository, instead of running a simple synthesizer. The interactive system developed by Jean-Claude Risset for his duets for one pianist [3] is very interesting in this respect: he uses MIDI data (pitch, velocity and duration) which he transforms according to traditional compositional operations: transposition, reversal, canon, etc.

8http://doc.sccode.org/Tutorials/A-Practical-Guide/PG_01_Introduction.html.

7 Implementation
I control my audio processor by a graphical interface (figure 9) and a MIDI controller. The different audio tracks allow me to adjust the intensity of the electronic sound layers. The flute, sax and piano tracks deal with the interactive electronic sound accompaniment. The synth track manages my non-real-time composite electronic music. Finally, the live track amplifies the acoustic instruments.
The creation of the repository is realized directly in the interface, which makes this action transparent. The graphical interface allows a sound engineer to play my work, and it makes rehearsals and concerts easier.
A MIDI pedal assigned to a performer manages the overall setting. Fourteen key moments articulate the electronics over a duration of fifteen minutes.
To create my electronic accompaniment, I use the descriptors of amplitude, pitch, centroid and noise. The operation of coupling between the data of the analysis and the parameters of the synthesizers [2] makes it possible to control the form of the electronic music accompaniment by scaling, transposition or reversal. This relation can be fixed or variable.
Generally, the sound source of my electronic music accompaniment does not come from sound synthesis, but from the acoustic instruments. When the density of the responses of the system is not limited, the responses are superimposed to continuously transform the timbre of the electronic sound. When the density of the responses of the system is limited, the responses can arrive in successive waves and produce a dynamic accompaniment.

8 Epilogue
To design our system, our initial motivation was to simplify the use of different descriptors, to simplify our system settings, to customize the responses of the system according to the sensitivity of the interpreter, and to produce a homogeneous electronic accompaniment for each interpretation of the same notated music under different conditions. But in addition, we obtain an open interactive system that can be adapted from a specific model to the intuition of a musician. We identified four steps to explore in order to implement our practical solution and develop an interactive scenario [4].
The first step focuses on the sound of the instrumentalist. What particularities of the sound do we want to relate to our system? What filter do we want to use to trigger an answer? In the example developed for this paper, we make our filter with the parameters of onset detection, pitch and amplitude. The thresholds established to constitute this filter allow us to play on the sensitivity or the particularity of the answers delivered by our system.
The second question is to choose the type of response to trigger. Should the answer be monophonic, polyphonic, contrapuntal, etc.? In other words, what organizational model should we use to develop our response? In our example, we produce one item per answer. This element is strongly correlated with the analyzed sound, and the operations applied to determine the sound characteristics of our response are conceived during the last step of this practical solution.
The third step is to determine which synthesizer we want to assign to our system. Controls can be implemented in the synthesizers; this possibility can give us solutions to build a previously chosen model. For example, a synthesizer can perform a glissando. In the example developed for this paper, we use a simple granular FM synthesis.
The final step in implementing our solution is to determine the type of relationship between the analysis data and the parameters of our synthesizer. How do we get expressive sounds with the sonification process [2]? Should our relationship be static or dynamic? How should the plan of connections between these various elements be established? For the example developed in this paper, we have established a static connection plan with one-to-many and many-to-one relations: the amplitude determined by the analysis is correlated with the granular density of the synthesis, the modulation index and the duration of the response; the pitch determined by the analysis is correlated with the pitch of our response, where a transposition is performed by a scaling operation; finally, the analyzed amplitude and pitch together determine the modulation frequency of the FM synthesis of the response delivered by our system. A sketch of this plan follows below.
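A sketch of this connection plan (again with the hypothetical \grain synth; all parameter names and ranges are invented for the illustration):

// one-to-many and many-to-one static mapping (sketch)
~respond = { |amp, pitch|
    Synth(\grain, [
        \density, amp.linlin(0, 1, 5, 50),           // amplitude -> granular density
        \index, amp.linlin(0, 1, 1, 8),              // amplitude -> modulation index
        \dur, amp.linlin(0, 1, 0.1, 2),              // amplitude -> response duration
        \freq, pitch * 0.5,                          // pitch -> pitch, transposed by scaling
        \modfreq, pitch * amp.linlin(0, 1, 0.5, 2)   // amplitude and pitch -> FM modulation frequency
    ]);
};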

9 Conclusion
For this paper, we have implemented our open interactive system under SuperCollider, a platform for audio synthesis and algorithmic composition. We could have implemented this system in other software. Moreover, the use of free software increases the durability of our work. Laurent Pottier recalls the history of the precariousness of technologies in electronic music [5], and free software answers this problem.
An example concerns the porting of Pluton by Philippe Manoury from the 4X9 to Max [6]. The piece did not sound exactly the same way on both platforms. After a thorough study of the 4X, the engineers discovered that a 4X hardware limitation influenced the sound result. This limitation was implemented in the Max patch to find an electronic music equivalent [7].

9The 4X is a real-time effects processor designed by Giuseppe di Giugno at IRCAM in the 1970s.

Free software is an important factor of durability and reproducibility in digital art. The ubiquity [8] of free software allows more flexibility to imagine original devices [9]. In the end, researchers face no locks when studying and extending the common good.

10 Acknowledgements
Thanks to Laurent Pottier for his encouragement and his pertinent advice.

References
[1] R. Rowe. 1993. Interactive Music Systems: Machine Listening and Composing. MIT Press, Cambridge.
[2] T. Hermann, A. Hunt, and J. G. Neuhoff. 2011. The Sonification Handbook. Logos Verlag, Berlin.
[3] J.-C. Risset and S. Van Duyne. 1996. Real-Time Performance Interaction with a Computer-Controlled Acoustic Piano. Computer Music Journal 20/1, pp. 62-75, MIT Press, Cambridge.
[4] B. Laurel. 2014, Second Edition. Computers as Theatre. Addison-Wesley, Crawfordsville.
[5] L. Pottier. 2015. L'évolution des outils technologiques pour la musique électronique, en rapport avec la pérennité des œuvres. Constat, propositions. In A. Saemmer, editor, E-Formes 3, Les frontières de l'œuvre numérique, pp. 245-261, PUSE, Saint-Étienne.
[6] M. Puckette. 2002. Max at Seventeen. Computer Music Journal 26/4, pp. 31-43, MIT Press, Cambridge.
[7] J. Szpirglas. 2012. Composer, même avec trois bouts de ficelle… Entretien avec Philippe Manoury. http://etincelle.ircam.fr/1077.10.html.
[8] S. Letz, S. Denoux, and Y. Orlarey. 2014. Audio Rendering/Processing and Control Ubiquity? A Solution Built Using the Faust Dynamic Compiler and JACK/NetJack. In Proceedings of ICMC|SMC, Athens.
[9] M. Lallement. 2015. L'âge du faire. Hacking, travail, anarchie. Le Seuil, Paris.

Higher Order Ambisonics for SuperCollider

Florian Grond
Input Device and Music Interaction Laboratory, McGill
555 Sherbrooke W
H3A 1E3 Montreal, CANADA
fl[email protected]

Pierre Lecomte
LMSSC
2 Rue Conté
75003 Paris, FRANCE
[email protected]

Abstract
In this paper we present a library for 3D Higher Order Ambisonics (HOA) for the SuperCollider (SC) sound programming environment. The library contains plugins for all standard operations in a typical Ambisonics signal-flow: encoding, transforming and decoding up to the 5th order. Carefully designed PseudoUgens are the interface to those plugins, aiming for the best possible code flexibility and code reusability. As a key feature, the implementation is designed to handle the higher order B-format as a channel array and to obey the channel expansion paradigm in order to take advantage of the powerful scripting possibilities of SC. The design of the library and its components is described in detail. Moreover, some examples are given for how to build flexible HOA processing chains with the use of node proxies.

Keywords
SuperCollider, Higher Order Ambisonics.

1 Introduction
Ambisonics, i.e. the description of sound pressure fields through spherical harmonics decomposition, has been around for quite a while since its invention by [Gerzon, 1973]. Back in the days, the harmonics decomposition was up to the first order (First Order Ambisonics, FOA), using the 4-channel B-format1. The playback of FOA audio content depended on special hardware and did not make it into mainstream audio in the first decade of its existence for various reasons, one of which being that FOA offers only limited spatial resolution.
Ambisonics research made significant advances in the 2000s through the work of Bamford [Bamford, 1995], Malham [Malham, 1999] and Daniel [Daniel, 2000], who extended the sound pressure field decomposition to higher orders, hence the term HOA. HOA increases the spatial resolution and thereby reduces the limitation of low spatial definition when compared with other spatialization techniques.
For streamlining and standardising content production, one hurdle that HOA was facing in the past was the coexistence of various channel ordering and normalization conventions. In order to address this issue, the Ambix standard was proposed by [Nachbar et al., 2011] and has ever since been increasingly adopted by recent HOA implementations.
Today, processors can handle with ease multiple instances of multi-channel sound processes. Further, the rise of video games and Virtual Reality (VR) applications has elicited new interest in Ambisonics amongst audio researchers and content creators. This is mostly due to its inherent property to yield easy-to-manipulate isotropic 360 degree sound pressure fields, which can be rendered either through multi loudspeaker arrays or headphones. In the case of VR applications, head-tracking is already available and the listener is always in the sweet spot. For the capturing of HOA 3D sound pressure fields, various microphone array prototypes have been developed, some of them being available as commercial products, like mh acoustics' Eigenmike® for instance. As far as multi loudspeaker reproduction is concerned, the number of loudspeaker domes with semi spherical configurations is growing, and electroacoustic composers have also shown increasing interest in HOA as a spatialisation technique, notably for composition [Barrett, 2010] and sonification [Barrett, 2016].

1.1 Ambisonics in various platforms
Over the last few years HOA has seen various implementations in diverse sound software environments, mostly as plugins in DAWs. The Ambisonics Studio plugins by Daniel Courville, for instance, have been around for some time2.
1The term B-format is often used for FOA signals; in this paper we use the term also for higher orders.
2http://www.radio.uqam.ca/ambisonic

Another recent and very comprehensive example is the Ambix plugin suite from [Kronlachner, 2013], see also [Kronlachner, 2014a]. For Pure Data and MaxMSP, HOA libraries have been made available by the Centre de recherche Informatique et Création Musicale3; an early implementation can also be found in the ICST Ambisonic tools [Schacher, 2010]. An early version for Pure Data can be found in the collection of abstractions called CubeMixer by [Musil et al., 2003]. Recently, the HOA library Ambitools4, developed mainly in Faust, has been made available [Lecomte and Gauthier, 2015].

1.2 SuperCollider
The audio synthesis environment SuperCollider (SC) by [McCartney, 2002] is particularly well suited for the creation of dynamic audio scenes. SC is split into two parts: the server scsynth for efficient sound synthesis, and sclang, an object oriented programming language for the flexible configuration and re-patching of DSP trees on the server. Similar to most sound programming environments, synthesis in SC is based on unit generators called Ugens. Third party Ugens are collected separately in SC3plugins. Extensions to sclang are managed through Quarks. Ugens can be assembled into more complex arrangements through synthesis definitions, known as SynthDefs, which are executable binaries for synthesis in scsynth. In sclang, PseudoUgens can be created, which are another way of handling complex arrangements of Ugens in sclang, compiled for scsynth when needed. For a detailed introduction to SC see [Valle, 2016] and the SuperCollider book5 by [Wilson et al., 2011].

1.3 Ambisonics in SuperCollider
In 2005, Frauenberger et al. implemented HOA in SC as the AmbIEM Quark6. This implementation goes up to the 3rd order and follows the old Furse-Malham channel ordering and normalization. All unit generators (Ugens) like encoding, rotation, and simple decoding are implemented in sclang as PseudoUgens. AmbIEM comes with a simulation of early reflections in a virtual room, but lacks functionality such as beamforming. The Ambisonics Toolkit (Atk) for SC by [Anderson and Parmenter, 2012] is a more recent and very comprehensive set of tools. The Atk includes for instance various transformations to manipulate the directivity of the sound field, such as pushing and zooming. It is however only a first order implementation of Ambisonics.

1.4 Library design in SuperCollider
In this context the paper presents a modern HOA implementation for SC, which is modular and adopts all established standards in terms of channel ordering and normalization. Inspired by the approach found in the Atk, and typical for the general design of SC, computationally intensive parts like Ugens are split from the PseudoUgens, convenience wrapper classes of sclang. The HOA library hence comes in three parts: SC3plugins, PseudoUgens, and audio content plus HRTFs for binaural rendering in a support directory.

1.4.1 SC3plugins
The first part of the library is a collection of Ugens, which is part of the SC3plugins collection. Each Ugen is compiled from C++ code. It consists of a SC language side representation of the Ugen as a .sc class file, and a compiled dynamic link library (.scx, .so or .dll for the platforms OSX, Linux or Windows respectively). For each Ambisonics order (so far up to order 5), there are individual Ugens for the encoding, transforming and decoding processes in a typical Ambisonics signal flow. The C++ code for these Ugens is generated from the HOA library Ambitools [Lecomte and Gauthier, 2015] with the compilation tool faust2supercollider. This approach was taken for two reasons:
First, to leverage the work already accomplished in Faust. Indeed, the Faust compiler generates very efficient DSP code, and the Faust code base allows to efficiently combine existing functionality.
The meta approach through Faust will lead to future additions of functionality, which can then be easily integrated in the HOA library for SC.
Second, each Ambisonics order comes with a defined multichannel B-format; this in turn defines the number of input and output arguments for the Ugens. For instance, a Ugen rotating an Ambisonic signal of order 3 has 16 input arguments plus the rotation angles, and 16 output channels. While it is of interest to expose the Ambisonics order as an argument for the flexibility and reusability of code on the side of sclang, it is an argument unlikely to be changed while an instance of the Ugen is running as a node in the DSP tree on scsynth. This is why for every order there is a unique Ugen for each function (encoding, transforming, decoding).

3http://www.mshparisnord.fr/hoalibrary/en
4http://faust.grame.fr/news/2016/10/17/Faust-Awards-2016.html
5http://supercolliderbook.net/
6https://github.com/supercollider-quarks/AmbIEM

The Ugens follow these conventions, some of which are explained in subsequent sections:
• The Ambisonics channels are ordered according to the ACN convention.
• The default normalisation of the B-format is N3D.
• All azimuth and elevation arguments follow the spherical coordinates convention from SC.
• Operations of resource intensive Ugens can be bypassed.
Based on the implementation in Faust, the main functionalities of the HOA Ugens provided as SC3plugins are so far:
• Encoding and decoding of plane waves and spherical waves using near field filters.
• Mirroring and rotation (around the azimuth and full 3D).
• Various Ugens for beamforming, returning mono as well as B-format signals.
• Various decoders in conjunction with Head-Related Impulse Responses (HRIRs) for binaural monitoring.

1.4.2 PseudoUgens
The second part of the library is available as the sclang extension HOA Quark. While the SC3plugins are designed for computational efficiency of the sound synthesis processes, the HOA Quark is conceived to unlock the flexibility of making sound in SC with respect to code reusability and the scaling of synthesis scripts. Each typical operation in Ambisonics (encoding, transforming, decoding) is here provided as a PseudoUgen. Depending on the Ambisonics order provided as an argument, the PseudoUgen returns and instantiates the correct Ugen from the SC3plugins collection on the sound server.
Since the Ambisonics order is an argument of the PseudoUgen, the number of channels in the B-format varies, and so does the number of input arguments of the Ugens. This is why the B-format is handled as a channel array. This makes the SC code flexible for experimentation with different orders depending on computational resources. All arguments of the PseudoUgens obey the channel expansion paradigm. This means that if any of the arguments is an array (or an array of arrays), the PseudoUgen returns an array (or an array of arrays) of Ugens.
Figure 1 shows the relation between the SC language side (PseudoUgens in light grey) and the SC3plugins (Ugens in dark grey). If the Ambisonics order is set to 3 and passed as an argument to the PseudoUgen, the corresponding Ugen with 16 channels is returned and a typical processing chain (encoding, transforming, decoding, encircled in red) can be established.

Figure 1: The Ambisonics processing chain in the HOA library for the selected order 3 with 16 channels.

The main features of the design of the HOA library implementation on the language side are:
• B-format is handled as a channel array.
• All arguments obey the channel expansion paradigm.
This leads to the following advantage when scripting HOA sound scenes in SC: compared with graphical data flow programs like for instance Pure Data, changing the order means reconnecting all channels between objects interfacing with the B-format. In SC, changing the Ambisonics order in a single global variable changes the order of the whole HOA processing chain, as sketched below.
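A small sketch of this advantage (the argument order follows the listings later in this paper; the decoder additionally requires its binaural filters to be loaded first, see section 6.1):

// changing one global variable rescales the whole chain (sketch)
~order = 3;
~chn = (~order + 1).pow(2); // 16 B-format channels at order 3
{
    var bf = HOAEncoder.ar(~order, PinkNoise.ar(0.1), 0, 0);
    HOADecLebedev26.ar(~order, bf, hrir_Filters: 1);
}.play;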

1.4.3 Support directory
The third part of the library is a platform independent support directory for various HOA sound file recordings and convolution kernels from HRIRs for the binaural rendering of HOA sound scenes. The support directory approach is similar to the Atk implementation, from which we adapted the corresponding class. The reason to keep these resources separate from the Quark directory is mostly due to the size of the included sound files. We provide some 4th order HOA sound files that have been recorded with the Eigenmike® with the support of Romain Dumoulin from CIRMMT. The HRIRs provided are either measured from a KU-100 dummy head [Bernschuetz, 2013] or computed from a 3D mesh scan of several people's faces. The directions of the HRIRs follow a 50-node Lebedev grid, allowing an Ambisonic binaural rendering up to order 5 [Lecomte et al., 2016b].

2 Encoding
As the first step in an Ambisonics rendering chain, the library provides PseudoUgens for encoding into the B-format: one for the encoding of mono sound signals, one for microphone array prototypes, and one for the commercially available Eigenmike® microphone array.

2.1 HOAEncoder
This PseudoUgen creates an HOA scene from mono inputs encoded as a (possibly moving) sound source in space. The source can be encoded 1) as a plane wave with azimuth and elevation (θp, δp) respectively, or 2) as a spherical wave with position (rs, θs, δs), where rs is the distance of the source to the origin. The spherical wave is encoded using near-field filters [Daniel, 2003]. In the current implementation, those filters are stabilized with near-field compensation filters. Thus, in this case, the radius rspk of the loudspeaker layout used for decoding is needed. Note that if the spherical source radius is such that the source is focused inside the loudspeaker enclosure (rs ≤ rspk), a "bass-boost" effect may occur with potentially excessive loudspeaker gain. This effect increases as the source gets closer to the origin [Daniel, 2003] [Lecomte and Gauthier, 2015].

HOAEncoder.ar(1, SinOsc.ar(f), a, e);
// returns
[ OutputProxy, .., .., OutputProxy ]

HOAEncoder.ar(1,
    SinOsc.ar([f1, f2]),
    a, e);
// returns
[ [ OutputProxy, .., .., OutputProxy ],
  [ OutputProxy, .., .., OutputProxy ] ]

If an array of azimuth and elevation arguments is given, matching in size those of the source SinOsc.ar([f1, f2]), flexible and scalable code for multi source encoding can be created:

HOAEncoder.ar(1,
    SinOsc.ar([f1, f2]),
    a, e).sum;
// returns
[ OutputProxy, .., .., OutputProxy ]

2.2 HOAEncLebedev06 / 26 / 50, HOAEncEigenmike
This collection of PseudoUgens offers at first the Discrete Spherical Fourier Transform (DSFT) for various spherical layouts of rigid spherical microphones. In the current implementation the proposed geometries are the 06-, 26- or 50-node Lebedev grids [Lecomte et al., 2016b] and the EigenMike grid [Elko et al., 2009]. The components of the DSFT are then filtered to take into account the diffraction by the rigid sphere and retrieve the Ambisonic components [Moreau et al., 2006] [Lecomte et al., 2015]. The filters are applied by setting the filter flag to 1, as shown in the next code listing:

// Encode the signals from the
// Lebedev26 grid microphone
HOAEncLebedev26.loadRadialFilters(s);
{ HOAEncLebedev26.ar(4, SoundIn.ar(0!26), filters: 1) }.play;
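By analogy with the Lebedev listing above, encoding from the 32 capsules of an Eigenmike® might look as follows (a sketch only; the argument names of HOAEncEigenmike are assumed to mirror those of HOAEncLebedev26):

// Encode the signals from the Eigenmike (sketch)
HOAEncEigenmike.loadRadialFilters(s);
{ HOAEncEigenmike.ar(4, SoundIn.ar(0!32), filters: 1) }.play;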
3 Converting
In order to correctly reconstruct a sound field from the channels of the B-format, it is important to know about the standard normalization methods for the spherical harmonic components, as well as the channel ordering conventions. Two main channel ordering conventions exist. The original Furse-Malham (FuMa) [Malham, 1999] higher-order format is an extension of the traditional first order B-format up to third order (16 channels); FuMa channel ordering comes with maxN normalization, which guarantees a maximum amplitude of 1. The FuMa format has been widely used and is still in use, but it is increasingly replaced by the Ambisonic Channel Number (ACN) ordering [Nachbar et al., 2011]. ACN typically comes with N3D (the full three-D normalisation), where all signals are orthonormal, or with SN3D (Semi-Normalized 3D) spherical harmonics.

The SN3D normalization has the advantage that none of the higher order signals exceeds the level of the first Ambisonic channel, W (ACN 0). However, this normalization does not provide an orthonormal basis of spherical harmonics, and the orthonormal case is recommended for transformations which rely on the orthonormality property of spherical harmonics. Therefore, the library internally uses N3D (full 3D normalization) with the ACN convention.

3.1 HOAConvert
The HOAConvert PseudoUgen accepts a B-format array as input and converts from and to ACN_N3D, ACN_SN3D and FuMa_MaxN. It is mostly meant to convert existing B-format recordings into ACN N3D for use within the library. The other use case is to render B-format mixes to other conventions for other production contexts.

4 Transforming
In its current implementation, the HOA library provides 3 standard operations, like rotation and mirroring, to transform the B-format.

4.1 HOAAzimuthRotator
This PseudoUgen rotates the HOA scene around the z-axis, which is accomplished with a rotation matrix in x and y due to the symmetry in z of the spherical harmonics. For the matrix definition see [Kronlachner, 2014b]. In combination with horizontal head tracking, this transformation can stabilise horizontal auditory cues for left-right movements when the rendering is made over headphones in VR contexts.
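A usage sketch for this rotation (the signature is assumed by analogy with the other listings in this paper; the azimuth could equally come from a head-tracker):

// rotate a third-order scene around the z-axis (sketch)
{
    var bf = HOAEncoder.ar(3, PinkNoise.ar(0.1), 0.5pi, 0);
    HOAAzimuthRotator.ar(3, bf, MouseX.kr(-pi, pi)); // azimuth angle in radians
}.play;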

4.2 HOAMirror
This PseudoUgen mirrors an HOA scene at the origin in the directions along the axes left-right (y), front-back (x) and up-down (z). According to [Kronlachner, 2014b], this can be accomplished by changing the sign of selected spherical harmonics.

4.3 HOARotatorXYZ
This PseudoUgen rotates a HOA scene by any given angle around x, y, z. The rotation matrix is computed in the spherical harmonic domain using recurrence formulas [Ivanic and Ruedenberg, 1996].

5 Beamforming
5.1 HOAHCard2Mono
This PseudoUgen extracts a mono signal from the HOA scene according to a beam-pattern: the channels of the B-format inputs are combined to produce a monophonic output, as if a directional microphone was used to listen into a specific direction in the sound field. In the current implementation, the beam-patterns provided are regular hypercardioids up to order 5, see [Meyer and Elko, 2002].

5.2 HOAHCard2HOA
This PseudoUgen applies a hyper-cardioid beam-pattern to the HOA scene to enhance some directions, and outputs a directionally filtered HOA scene [Lecomte et al., 2016a]. The proposed beam-patterns are regular hypercardioids as described in [Meyer and Elko, 2002]. The selectivity of the directional filtering increases with the order of the beam-pattern. This transformation requires an order re-expansion, such that the output HOA scene should be of the order of the input HOA scene plus the beam-pattern order [Lecomte et al., 2016a].

5.3 HOADirac2HOA
As in the previous section, this PseudoUgen performs a directional filtering on the HOA scene, but this time the beam-pattern is a directional Dirac, that is to say a function which is zero everywhere except in the chosen direction. As a result the output HOA scene contains only the sound from the chosen direction. Thus, this tool helps to explore the HOA scene with a "laser beam". For more details see [Lecomte et al., 2016a].

6 Decoding
For the decoding of HOA signals, two different ways of rendering the sound field are possible: first via headphones, or second through a setup of multiple loudspeakers.
For the headphone option, the HOA signal is decoded to spherically distributed virtual speakers. For the best possible spatial resolution, more speakers are needed than there are channels in the B-format. Each speaker signal is then convolved with HRTFs and the resulting left and right channels are summed respectively. For the distribution of the virtual speakers, a regular distribution on the sphere is desirable, so that the decoding matrix is well behaved. This is why, according to [Lecomte et al., 2015] and similar to the microphone array prototypes from above, a Lebedev grid is chosen.

6.1 HOADecLebedev06 / 26 / 50
This collection of PseudoUgens decodes an Ambisonics signal to up to 50 virtual speakers positioned as nodes on a Lebedev grid. The decoding on the 50-node Lebedev grid works up to order 5. This grid contains several nested sub-grids which work up to lower orders with fewer nodes [Lecomte et al., 2016b]: the first 6 nodes are sufficient for first order, and the first 26 nodes are sufficient up to the third order. If the HRTF filter flag is set to 1, the signals are convolved with the kernels and summed up to yield a left and right headphone speaker signal. Prior to this, the convolution kernels need to be loaded to the sound server, as shown in the next code listing:

// load hoa sound file
~file = Buffer.read(s, "hoa3O.wav");
// prepare binaural filters
HOADecLebedev26.loadHrirFilters();
{ HOADecLebedev26.ar(3, // order 3
    PlayBuf.ar(16, ~file, 1, loop: 1),
    hrir_Filters: 1)
}.play;

6.2 HOADec
For the case of decoding for speaker arrays, [Heller et al., 2008] distinguish 3 cases:
1. regular polygons (square, octagon) and polyhedra (cube, octahedron)
2. semiregular arrays (non equidistant but opposing speakers, like in a shoebox)
3. general irregular arrays (e.g. ITU 5.1, 7.1 ... semispherical speaker domes)
For the cases 1 and 2, decoder matrices can be obtained by matrix inversion. If, depending on the positions of the speakers, the elements of the resulting decoder matrix are of similar magnitudes, it is suitable for signal processing. For case 3, which covers arguably the more realistic cases, a variety of state of the art techniques exists, see for instance [Zotter et al., 2012], [Zotter et al., 2010], and [Zotter and Frank, 2012]. An implementation of these techniques exceeds the scope of this library. However, for the construction of decoders for specific irregular speaker arrays, we refer the user to the Ambisonic Decoder Toolbox by Aaron J. Heller7.
This toolbox produces decoders as Faust files, which can be compiled online8 as Ugens and in turn can then be integrated into the HOADec PseudoUgen class template.

7https://bitbucket.org/ambidecodertoolbox/adt.git
8http://faust.grame.fr/onlinecompiler/

7 The distance of sound sources
One novel aspect of the underlying Faust implementation of Ambitools is the spherical encoding of sound sources using near field filters. For the correct reproduction of the HOA scene, the distance of the sound source and the radius of the reproducing (virtual) speaker array need to be set. The corresponding near field filters are applied by setting them either in the encoding or in the decoding step:

// load the binaural filters
HOADecLebedev26.loadHrirFilters();
{
    var src;
    src = HOAEncoder.ar(
        3,                  // order
        PinkNoise.ar(0.1),  // source
        az,                 // azimuth (defined elsewhere)
        ele,                // elevation (defined elsewhere)
        plane_spherical: 1,
        radius: 2,
        // set the speaker radius here
        speaker_radius: 1);
    HOADecLebedev26.ar(
        3,    // order
        src,  // source
        // or set the speaker radius here
        // speaker_radius: 1,
        hrir_Filters: 1)
}.play;

8 HOA and SynthDefs
The use of PseudoUgens leads to one important caveat when working with SynthDefs. The Ambisonics order is an argument pertaining to the PseudoUgen; it can hence not be an argument of a SynthDef. The reason is that at compile time the Ambisonics order would remain undefined, and the PseudoUgen would hence not know which Ugen to return. When working with SynthDefs, code reusability can still be achieved as shown in the next code listing:

// set the max order:
~order = 5;
~order.do({ |i| // iterate
    SynthDef( // create unique names
        "hoaSin" ++ (i + 1).asString,
        { Out.ar(0,
            HOAEncoder.ar(
                i + 1, // increase the order
                SinOsc.ar())) }
    ).add;
});
// play the Synths
Synth(\hoaSin1);
...
Synth(\hoaSin5);

9 HOA and Node Proxies
For the flexible creation of typical Ambisonics render chains, Node Proxies [Rohrhuber and deCampo, 2011] provide an excellent tool in SC. Node Proxies autonomously handle audio busses and conveniently allow crossfading between the audio processes of a selected node, freeing the silent process when the crossfade is completed. This allows to dynamically change sources in the encoding, transforming and decoding steps of the rendering chain. The following code example shows a flexible scenario, changing seamlessly from an XYZ rotation to beamforming.

~o = 3;
~chn = (~o + 1).pow(2);
// load hoa sound file:
~bf = Buffer.read(s, "file.wav");

// b-format file player:
~player = NodeProxy(s, \audio, ~chn);
~player.source = { PlayBuf.ar(~chn, ~bf) };

// Node for xyz rotation:
~trans = NodeProxy(s, \audio, ~chn);
~trans.source = { var in; in = \in.ar(0!16);
    HOATransRotateXYZ.ar(~o, in,
        \yaw.kr(0), \pitch.kr(0), \roll.kr(0)) };

// rotate the scene
~trans.set(\yaw, angle);

// decoding
~dec = NodeProxy(s, \audio, ~chn);
~dec.source = { var in; in = \in.ar(0!16);
    HOADec.ar(~o, in) };

// chain the proxies together
~player <>> ~trans <>> ~dec;

// change rotation to beamforming
~trans.source = { var in; in = \in.ar(0!16);
    HOABeamDirac2Hoa.ar(~o, in,
        \az.kr(0), \ele.kr(0)) };

// direct the beam
~trans.set(\az, angle);

10 Conclusions
We have presented a HOA library for SC, the design of which resulted in great flexibility and makes it a valuable addition for experimenting with HOA in various contexts. Due to the meta approach through Faust, future additions to the library are feasible, and we look forward to experimenting with it in the context of VR and video gaming platforms, but also for the creation of sound material for electroacoustic compositions. We believe that the flexibility and live coding capacity of SC is particularly useful in the context of HOA, where repeated listening is essential to assess the perceptually complex mutual interdependence of temporal and spatial sound characteristics.

11 Acknowledgements
This work has been supported through the research-creation funding program of the Fonds de Recherche du Québec - Société et Culture (FRQSC).

References
J. Anderson and J. Parmenter. 2012. 3D sound with the Ambisonic Toolkit. Presented at the Audio Engineering Society 25th UK Conference / 4th International Symposium on Ambisonics and Spherical Acoustics, York.
Jeffrey S. Bamford. 1995. An Analysis of Ambisonic Sound Systems of First and Second Order. Ph.D. thesis, University of Waterloo, Waterloo.
Natasha Barrett. 2010. Ambisonics and acousmatic space: a composer's framework for investigating spatial ontology. In Proceedings of the Sixth Electroacoustic Music Studies Network Conference, Shanghai, 21-24 June.

Natasha Barrett. 2016. Interactive Spatial Sonification of Multidimensional Data for Composition and Auditory Display. Computer Music Journal, 40(2):47-69.
B. Bernschuetz. 2013. A spherical far field hrir/hrtf compilation of the Neumann KU 100. In Proceedings of the 40th Italian (AIA) Annual Conference on Acoustics and the 39th German Annual Conference on Acoustics (DAGA), page 29.
J. Daniel. 2000. Représentation de Champs Acoustiques, Application à la Transmission et à la Reproduction de Scènes Sonores Complexes dans un Contexte Multimédia. Ph.D. thesis, University of Paris 6, Paris, France.
Jerome Daniel. 2003. Spatial sound encoding including near field effect: Introducing distance coding filters and a viable, new ambisonic format. In Audio Engineering Society Conference: 23rd International Conference: Signal Processing in Audio Recording and Reproduction, pages 1-15, Helsingor. AES.
G. Elko, R. A. Kubli, and J. Meyer. 2009. Audio system based on at least second-order eigenbeams.
Michael A. Gerzon. 1973. Periphony: With-height sound reproduction. Journal of the Audio Engineering Society, 21(1):2-10.
Aaron J. Heller, Richard Lee, and Eric M. Benjamin. 2008. Is My Decoder Ambisonic? In the 125th Convention of the Audio Engineering Society, San Francisco, oct. 1-5.
J. Ivanic and K. Ruedenberg. 1996. Rotation matrices for real spherical harmonics. Direct determination by recursion. J. Phys. Chem., 100(15):6342-6347.
M. Kronlachner. 2013. Ambisonics plug-in suite for production and performance usage. In LAC, Graz, Austria, May.
M. Kronlachner. 2014a. Plug-in Suite for Mastering the Production and Playback in Surround Sound and Ambisonics. In AES Student Design Competition (Gold Award), Berlin, April.
M. Kronlachner. 2014b. Spatial transformations for the alteration of ambisonic recordings. Graz University of Technology, Austria.
P. Lecomte and P.-A. Gauthier. 2015. Real-Time 3D Ambisonics using Faust, Processing, Pure Data, and OSC. In 15th International Conference on Digital Audio Effects (DAFx-15), Trondheim, Norway.
P. Lecomte, P.-A. Gauthier, C. Langrenne, A. Garcia, and A. Berry. 2015. On the use of a Lebedev grid for Ambisonics. In Audio Engineering Society Convention 139.
P. Lecomte, P.-A. Gauthier, C. Langrenne, A. Berry, and A. Garcia. 2016a. Filtrage directionnel dans une scène sonore 3D par une utilisation conjointe de Beamforming et d'Ambisonie d'ordres élevés. In CFA / VISHNO 2016, pages 169-175.
Pierre Lecomte, Philippe-Aubert Gauthier, Christophe Langrenne, Alain Berry, and Alexandre Garcia. 2016b. A fifty-node Lebedev grid and its applications to ambisonics. Journal of the Audio Engineering Society, 64(11):868-881.
D.G. Malham. 1999. Higher order ambisonic systems for the spatialisation of sound. In ICMC99, pages 1-4, Beijing. ICMC.
J. McCartney. 2002. Rethinking the Computer Music Language: SuperCollider. Computer Music Journal, 26(4):61-68.
J. Meyer and G. Elko. 2002. A highly scalable spherical microphone array based on an orthonormal decomposition of the soundfield. In IEEE International Conference on Acoustics, Speech, and Signal Processing, 2:1781-1784.
S. Moreau, J. Daniel, and S. Bertet. 2006. 3D sound field recording with higher order ambisonics - objective measurements and validation of spherical microphone. In Audio Engineering Society Convention 120, pages 1-24.
T. Musil, J. Zmoelnig, M. Noisternig, A. Sontacchi, and R. Hoeldrich. 2003. AMBISONIC 3D Beschallungssystem 5. Ordnung fuer PD. Report 15, Institute for Electronic Music and Acoustics.
Christian Nachbar, Franz Zotter, Etienne Deleflie, and Alois Sontacchi. 2011. AMBIX - a suggested ambisonics format. In Ambisonics Symposium, Lexington, KY, June 2-3.
Julian Rohrhuber and Alberto deCampo. 2011. The SuperCollider Book, chapter Just in Time Programming (ch 7). MIT Press, MA: Cambridge.

Jan C. Schacher. 2010. Seven Years of ICST Ambisonics Tools for MaxMSP - a Brief Report. In Proc. of the 2nd International Symposium on Ambisonics and Spherical Acoustics, Paris, France, May 6-7.
Andrea Valle. 2016. Introduction to SuperCollider. Logos Publishing House.
S. Wilson, D. Cottle, and N. Collins, editors. 2011. The SuperCollider Book. MIT Press, MA: Cambridge.
F. Zotter and M. Frank. 2012. All-Round Ambisonic Panning and Decoding. J. Audio Eng. Soc, 60(10):807-820, Nov.
F. Zotter, M. Frank, and A. Sontacchi. 2010. The virtual t-design ambisonics-rig using VBAP. Presented at the 1st EAA-EuroRegio 2010 Congress on Sound and Vibration, pages 1-4, Ljubljana, Slovenia.
F. Zotter, H. Pomberger, and M. Noisternig. 2012. Energy-Preserving Ambisonic Decoding. Acta Acustica united with Acustica, 98:37-47.


STatic (LLVM) Object Analysis Tool: Stoat

Mark McCurry
Georgia Tech
United States of America
[email protected]

Abstract

Stoat is a tool which identifies realtime safety hazards. Its primary use is to analyze programs which need to perform hard realtime operations in a portion of a mixed codebase. Stoat traverses the call graph of a program to identify which functions can be called from a root set of functions which are expected to be realtime. If any unsafe function which could block for an unacceptable amount of time is found in the set of functions called by a realtime function, then an error is emitted to indicate where the improper behavior can be found and what backtrace is responsible for its call.

Keywords

Realtime safety, static analysis, LLVM

1 Motivation

When using low-latency audio tools, an all-too-common problem encountered by users is audio dropout caused by an excessive run time of the audio generation or processing routine. This artifact is commonly known as an xrun. Xruns can occur when there is simply too much to calculate during the allocated time, but they can also easily be caused by any function which takes an unreasonable amount of real time (as opposed to CPU time) to execute. The latter category of functions typically includes operations involving dynamic memory, inter-process communication, file IO, and threading locks.

For low-latency audio to work reliably, a frame of audio and MIDI data must be processed within a short, fixed time window. Audio callbacks are therefore functions bound by a real-time constraint, or realtime for short. A large portion of code can have its total execution time bounded when the size of the data is known in advance. Some code, however, cannot be so simply bounded. As a simple example, consider prompting the user for synthesis parameters and waiting for a response. The user could enter a response quickly, or they could never provide one. The class of functions which are not bounded by the realtime constraint that a realtime program operates under is known as non-realtime.

To avoid xruns, realtime programs should be composed of functions with reasonable realtime bounds, and thus non-realtime functions are unsafe for a reliable program. Typically, realtime system programmers acknowledge the timing constraint and design systems with this limitation in mind. Simple tests may be used to identify typical execution times as well as their variance, though it is easy for bugs to creep in. In particular, the C and C++ open source projects in Linux audio frequently have architectural issues making realtime use unreliable (see appendix A).

Maintaining a large codebase in C or C++ can make it very difficult to know both what a given function can end up calling and when a particular function could be called. Typically, problems start with a mixed realtime/non-realtime system, such as the UI and DSP sections of code; the segregation within one codebase may not be at all clear in the implementation. This is further complicated by the opaqueness of some C++ techniques, such as virtual overloading, operator overloading, multiple inheritance, and implicit conversions. Overall, these complications make manual verification of large-scale realtime programs difficult.

Stoat offers a solution for identifying realtime hazards through an easy-to-use static analysis approach. Static analysis makes it possible to identify when functions claimed to be realtime can call unsafe non-realtime functions, even when complex C or C++ call graphs are involved. The approach offered by Stoat makes it easy to identify these programming errors, which can be used to greatly improve the reliability of low-latency tools.

1.1 Prior Art

Stoat isn't the first tool to address the problem of identifying these realtime safety hazards. Several years prior to the creation of Stoat, Arnout Engelen created jack-interposer, a runtime realtime safety checker [Engelen, 2012]. Jack-interposer works by causing a program to abort if any known unsafe non-realtime function is called within the JACK process callback. The functions which jack-interposer identifies as unsafe include IO functions (vprintf() and vfprintf()), polling functions (select(), poll()), inter-process communication (wait()), dynamic memory functions (malloc(), realloc(), free()), threading functions (pthread_mutex_lock(), pthread_join()), and sleep().

As a runtime analysis tool, jack-interposer requires the program to be executed to identify errors, and each error is reported as it is encountered. Individual errors are presented by a message without a backtrace, or by halting the program and allowing a developer to use a debugger. Jack-interposer has the same issue as other runtime tools when compared to static analysis: exhaustive testing requires the user or a testing script to run the program through all states which involve different logic. Doing so is a difficult, error-prone, and tedious task. Additionally, jack-interposer was designed to be used only with JACK clients, while Stoat works with any program, JACK based or not.

Stoat is based on an earlier attempt at creating a more general static analysis tool. The predecessor project, Static Function Property Verifier, or SFPV, attempted to address the more general problem of tracking described properties through a program's feasible call graph [McCurry, 2014]. SFPV used the Clang compiler's API to record precise source-level information [Lattner, 2008]. Unfortunately, the Clang API was subject to rapid breaking changes, slow to compile, and vastly underdocumented, so SFPV was rewritten to create Stoat. Stoat, in comparison, uses a limited subset of the LLVM API without interfacing directly with Clang.

2 Examples

Both runtime and static analysis tools, including jack-interposer and Stoat, attempt to address the same overall problem. Both aim at detecting when a function which can be executed in a realtime thread can call a function which may block for an unacceptable amount of time. In C, an example of this is shown in listing 1. root_fn() can call malloc() through two intermediate functions, unannotated_fn() and unsafe_fn(). When Stoat is provided with an out-of-source annotation on root_fn(), it can then use the call graph to deduce that an unsafe function can be called.

Listing 1: Example C Program

    void root_fn(void) {
        unannotated_fn();
    }

    void unannotated_fn(void) {
        unsafe_fn();
    }

    void unsafe_fn(void) {
        malloc(10);
    }

For a C program, many call graphs are relatively simple and no complex type information is needed. C++ call graphs, however, make extensive use of operator overloading, templates, and class-based inheritance. Listing 2 shows an example of root_fn() calling a method which may or may not be safe based upon which implementation of method() is called. As the class hierarchy is available to Stoat, the root function can be conservatively marked as unsafe, as method() would call malloc() if Obj were an instance of the Unsafe class. Depending upon the workload of a particular program, this data dependency might be satisfied very rarely, so a purely runtime based approach may not identify the error.

Listing 2: Example C++ Program

    void root_fn(Obj *o) {
        o->method();
    }

    class Unsafe : public Obj {
        virtual void method(void) {
            malloc(10);
        }
    };

3 Stoat Implementation

Stoat consists of several components. First, there is a compiler shim to dump LLVM-based metadata through LLVM IR files. Second, there is a series of LLVM compiler passes to extract inline annotations and call graph information.

Last, there is a Ruby frontend to perform deductions on the extracted call graph and to produce diagnostic messages and diagrams.

Stoat uses information present in LLVM bitcode to capture the program's call structure. Generating bitcode for individual files can be difficult to integrate with complex software projects' build systems. A similar issue was faced by Clang's official static analysis tools [Kremenek, 2008]. Their solution was to have a stand-in which replaces the normal C/C++ compiler. Stoat offers two compiler proxy binaries, stoat-compile and stoat-compile++, which provide a way to simplify generating LLVM bitcode, similar to Clang's scan-build toolchain (http://clang-analyzer.llvm.org/scan-build.html). For an autotools-based project, analysing the source code is as simple as running CC=stoat-compile CXX=stoat-compile++ ./configure && make and then running stoat -r .

For each LLVM bitcode file, Stoat runs four custom LLVM passes. These passes respectively identify: the function calls, or call graph, within the program; the C++ virtual methods associated with each class; the C++ class hierarchy; and in-source realtime safety annotations.

First the call graph is constructed. Within the LLVM IR, the Call and Invoke operations call another function, and they contain metadata about what function is being called. For C functions this is relatively simple. Consider the IR associated with void foo(){bar();} in listing 3.

Listing 3: LLVM IR For C Call

    define void @foo() #0 {
    entry:
      call void @bar()
      ret void
    }

For C++, extracting the call graph is somewhat more complex due to the introduction of virtual methods. Virtual methods are a structured version of function pointer calls, and they can be identified by the two-step process used to obtain the function pointer. First, a class instance is converted to the virtual function table, or vtable. Then, the method's ID is used to extract the method from the vtable, and the resulting function pointer is called. The LLVM IR for a virtual call is shown in listing 5, and it corresponds to the source shown in listing 4.

Listing 4: C++ Call

    void foo(void) {
        Baz *baz;
        baz->bar();
    }

Listing 5: LLVM IR For C++ Call

    define void @_Z3foov() #0 {
    entry:
      %baz = alloca %class.Baz*, align 4
      %0 = load %class.Baz** %baz, align 4
      %1 = bitcast %class.Baz* %0 to void (%class.Baz*)***
      %vtable = load void (%class.Baz*)*** %1
      %vfn = getelementptr inbounds void (%class.Baz*)** %vtable, i64 0
      %2 = load void (%class.Baz*)** %vfn
      call void %2(%class.Baz* %0)
      ret void
    }

Next, the vtable calls need to be mapped back to real functions. Vtables are stored as global symbols and can be identified by the "_ZTV" prefix used in normal C++ symbol mangling procedures. The class hierarchy can be reconstructed by identifying chained constructors from class to class.

With the information presented by the normal call graph and the C++ virtual methods, an augmented call graph can be constructed. First, any vtable method is assumed to call any method implementation of the base class or any child class. Then, suppression file entries are used to remove edges from the augmented call graph to avoid false errors typically seen in error handling.

The last LLVM pass looks for in-source safety annotations in the form of __attribute__((annotate("realtime"))) and __attribute__((annotate("non-realtime"))). These annotations can be added to the end of a function declaration to add metadata to the function within the C or C++ source. The in-source annotations are augmented with out-of-source annotations in the form of whitelist and blacklist files.

Once the augmented call graph is constructed and a subset of the functions in the program are annotated, a series of deductions can be made. Any function which is called but never implemented is assumed to be non-realtime if not specified otherwise. Any function which is unannotated and called by a realtime function is assumed to be realtime. Any function which is realtime or assumed realtime and calls a non-realtime function produces an error with an associated deduction chain.

The errors can be presented in either textual or graphical form. The textual format includes the function that calls the unsafe function along with the deduced path. An example of an error flagged by Stoat is dynamic memory use within jalv when it is in a debug mode:

    Error #514:
    serd_stack_new
    ##The Deduction Chain:
    - serd_writer_new  : Deduced Realtime
    - sratom_to_turtle : Deduced Realtime
    - jack_process_cb  : Realtime (Annotation)
    ##The Contradiction Reasons:
    - malloc : NonRealtime (Blacklist)

Alternatively, figure 1 shows a partial view of a graphical representation of the call graph nodes involved in errors. When dealing with a legacy codebase, the graphical representation tends to be preferable, as it visually shows which routines contain the most errors and which errors are the most common. Additionally, for C++ codebases whose errors involve long template expansions, the graphical representation shortens the displayed names, resulting in a still large, but more manageable, view of the software's architecture.

3.1 Limitations

Stoat offers a number of improvements over prior art, though it does have its limitations. Namely, Stoat doesn't track data dependencies affecting realtime safety (this includes any conditional code execution based upon constant or non-constant data). This is a task where runtime analysis tools, such as jack-interposer, can identify errors which Stoat isn't able to find, or can avoid false positives which Stoat emits.

Two primary data-dependent issues which produce misleading results are the use of unsafe function pointers and the use of unsafe error handling code. A short example of the former would be:

Listing 6: Function pointer call

    void function(void (*fn)(void)) {
        fn();
    }

If and only if function() is only ever passed realtime functions, then it is a realtime-safe function; but the data passed into the function isn't analysed by Stoat, so function pointer calls are typically overlooked.

Debug and error handling code is a common source of false positives, and the error from jalv above shows one such example. In listing 7, function() would be marked unsafe. The unsafe function should never be called in practical use, and a runtime checker would not flag this case. A similar class of issues can occur if a function has different realtime-safe behavior depending upon a flag passed to it, as may be the case in codebases which do not have separate functions for realtime and non-realtime tasks.

Listing 7: Example error handling

    void function(void) {
        if(fatal_error)
            call_unsafe_function();
    }
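Stoat's actual deduction engine lives in the Ruby frontend; purely as an illustration of the deduction rules described above, a minimal C++ sketch (with invented data structures, and cycle handling omitted for brevity) could look as follows:

    #include <cstdio>
    #include <map>
    #include <set>
    #include <string>
    #include <vector>

    // Sketch of Stoat-style deduction: starting from an annotated realtime
    // root, walk the call graph; every unannotated callee is deduced
    // realtime, and any path reaching a blacklisted function is an error.
    using Graph = std::map<std::string, std::vector<std::string>>;

    static void walk(const Graph &calls, const std::set<std::string> &blacklist,
                     const std::string &fn, std::vector<std::string> &chain)
    {
        chain.push_back(fn);
        if (blacklist.count(fn)) {          // contradiction found
            std::printf("error: realtime chain reaches '%s':", fn.c_str());
            for (const std::string &f : chain)
                std::printf(" %s", f.c_str());
            std::printf("\n");
        } else {
            auto it = calls.find(fn);
            if (it != calls.end())
                for (const std::string &callee : it->second)
                    walk(calls, blacklist, callee, chain);
        }
        chain.pop_back();
    }

    int main()
    {
        Graph calls = {{"root_fn", {"unannotated_fn"}},
                       {"unannotated_fn", {"unsafe_fn"}},
                       {"unsafe_fn", {"malloc"}}};
        std::set<std::string> blacklist = {"malloc"};  // out-of-source list
        std::vector<std::string> chain;
        walk(calls, blacklist, "root_fn", chain);      // annotated root
    }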

3.2 Discussion

Stoat and its predecessor, SFPV, were originally created to assist with finding issues within the ZynAddSubFX synthesizer (http://zynaddsubfx.sf.net/) and bringing it into compliance with realtime safety constraints. While minor issues still exist, several users have reported improved reliability at lower latencies compared to earlier versions.

Figure 1: Partial output from Stoat applied to ZynAddSubFX 2.5.0 (http://fundamental-code.com/2.5.0-realtime-issues.png)

Stoat has since been used as a verification tool in librtosc (http://github.com/fundamental/rtosc), carla (http://kxstudio.linuxaudio.org/Applications:Carla), ingen (https://drobilla.net/software/ingen), and jalv (http://drobilla.net/software/jalv). Ideally it will be used on more projects within Linux audio to identify realtime hazards in the future. The use of Stoat or jack-interposer would assist in correcting the poor user experience, and possibly the negative reputation for stability, that realtime hazards have created in a variety of realtime projects.

When Stoat doesn't understand a regular structure within a program, it is relatively easy to extend. ZynAddSubFX uses roughly 500 callbacks through librtosc, and Stoat has already been extended to automatically annotate these callbacks. As mentioned in the limitations, function pointers are difficult to reliably track with static analysis; librtosc callbacks, however, have per-callback metadata which can be used to associate a statically known function pointer with information identifying which ones are expected to be executed in the realtime environment. This process was tested in ZynAddSubFX and used to resolve several bugs.

4 Conclusion

Stoat offers a new method to inspect existing software projects and direct attention towards code which may be responsible for realtime hazards. Addressing these realtime hazards can improve the experience within a variety of Linux audio applications and plugins. Through the use of automated tools such as Stoat, realtime hazards can be identified and corrected quickly. Additionally, the static analysis approach of Stoat complements the prior art of runtime analysis that projects like jack-interposer provide. Stoat is available at https://github.com/fundamental/stoat under the GPLv3 license.

References

Arnout Engelen. 2012. Jack interposer. https://github.com/raboof/jack_interposer.

Ted Kremenek. 2008. Finding software bugs with the Clang static analyzer. Apple Inc.

Chris Lattner and Vikram Adve. 2004. LLVM: A compilation framework for lifelong program analysis and transformation. In CGO, pages 75-88, San Jose, CA, USA, Mar.

Chris Lattner. 2008. LLVM and Clang: Next generation compiler technology. In The BSD Conference, pages 1-2.

Mark McCurry. 2014. Static function property verification: sfpv. https://github.com/fundamental/sfpv.

A Brief Survey of Realtime Safety

In order to validate the claim that "projects in Linux audio frequently have architectural issues making realtime use unreliable", a survey was conducted on a sampling of Linux synthesizers. Each synthesizer listed by http://www.linuxsynths.com/ was given a brief manual code review (typically < 15 minutes per project) looking for common realtime safety violations. If source code was not available or could not be located for a code review, the project was excluded. Projects marked with a '*' have had an in-depth code review prior to the writing of this paper. The results shown in table 1 indicate that 18 of the 40 projects (or 45%) have some easy-to-identify realtime safety issue. Outside of LMMS and ZynAddSubFX, the realtime hazards within each project have not received additional verification. Based upon experience working with projects not included in this list, additional realtime hazards are expected to be found when tools like jack-interposer or Stoat are applied.

Table 1: Linux Synthesizer Realtime Safety Observations

Software Name            | Observed Status | Notes
6PM                      | likely unsafe   | appears to launch threads within rt-thread
Add64                    | likely unsafe   | blocking gui communication in rt-thread
Alsa Modular Synth       | likely unsafe   | unsafe data mutex
amSynth                  | likely unsafe   | unsafe memory allocation in rt-thread
Borderlands              | likely unsafe   | unsafe locks/memory allocation in rt-thread
Bristol                  | likely safe     | appears safe
Calf tools               | likely safe     | appears safe
Cellular Automaton Synth | likely safe     | appears safe
Dexed                    | likely unsafe   | unsafe memory allocations in rt-thread
DX-10                    | likely safe     | appears safe
Helm                     | likely unsafe   | memory allocation in rt-thread/addProcessor()
Hexter                   | likely safe     | appears safe
JX-10                    | likely safe     | appears safe
LB-302                   | likely unsafe   | see LMMS
LMMS*                    | unsafe          | unsafe locks in rt-thread, unsafe memory allocation in rt-thread, creation of threads in rt-thread, blocking communication to user interface in rt-thread, etc.
Monstro                  | likely unsafe   | see LMMS
Mr. Alias 2              | likely safe     | appears safe
Mx44                     | likely safe     | appears safe
Nekobee                  | likely safe     | appears safe
Newtonator               | likely safe     | appears safe
OBXD                     | likely safe     | appears safe
Organic                  | likely unsafe   | see LMMS
Oxe FM Synth             | likely safe     | appears safe
Peggy2000                | likely safe     | appears safe
Petri-Foo                | likely safe     | appears safe
Phasex                   | likely safe     | appears safe
Samplev1                 | likely unsafe   | possible memory allocation in rt-thread
SetBFree                 | likely safe     | appears safe
Sineshaper               | likely safe     | appears safe
Sorcer                   | likely safe     | appears safe
Synthv1                  | likely unsafe   | possible memory allocation in rt-thread
Triceratops              | likely safe     | appears safe
Triple Oscillator        | likely unsafe   | see LMMS
Tunefish 4               | likely safe     | appears safe
Vex                      | likely safe     | appears safe
Watsyn                   | likely unsafe   | see LMMS
WhySynth                 | likely safe     | appears safe
Wolpertinger             | likely unsafe   | unsafe memory allocation in setParameter()
Xsynth                   | likely unsafe   | variety of mutexes used in the rt-thread
ZynAddSubFX*             | unsafe          | unsafe memory allocation in oscillator wavetable generation (the total number of realtime hazards was greatly decreased with the use of Stoat)

AVE Absurdum

Winfried Ritsch, Institute for Electronic Music and Acoustics, Inffeldgasse 10/3, 8010 Graz, Austria, [email protected]

Abstract

Auditory Virtual Environments (AVEs) are used to simulate audio environments in real spaces. As room-in-room reverberation systems (RRR) they augment the acoustics of spaces, e.g. in concert halls and music theaters. Why not utilize them for theater music, as acoustic stage design and therefore as a playable instrument? Even more, why not tune them to extreme configurations, so that absurd acoustic situations can be realized, absurd in the sense of not being normal or possible in real physics, using distortions in the time, space, frequency and signal domains?

This paper discusses the conceptualization and design of an artistic research project using AVEs for a theater, and some of the new aspects of these ideas are discussed. For the multi-space theater production "The Trial" by Franz Kafka, for actors, singer, choir and stage design at the Art University in Graz, networked AVEs have been realized, utilizing Ambisonics systems in concert halls and movable acoustics instruments in open spaces.

Keywords

Auditory Virtual Environment, acoustics, stage design, computer music, Ambisonics

1 Introduction

An auditory virtual environment (AVE) is a virtual environment (VE) that focuses on the auditory domain only. It sees itself as independent from other modalities like vision; nevertheless, an AVE can also be combined with the visual domain. Depending on the application, the user may be either a passive receiver or able to interact with the environment. Three different approaches to the implementation of AVEs are listed by Novo in Blauert's book "Communication Acoustics" [Novo, 2005]:

1. Authentic reproduction of real existing environments. The virtual room should evoke in the listener the same percepts that would have been evoked by the corresponding real environment. The listener should have the same spatial impression moving through it, and should perceive his own movement inside the environment as well as the movements of the sound sources.

2. Reproduction of plausible auditory events. This approach tries to evoke auditory events which the listener perceives as having occurred in a real environment. Here only those features are implemented which are needed for a specific simulation situation.

3. Creation of non-authentic plausible auditory events or environments. The virtual room doesn't evoke percepts in the listener which are related to a real acoustic environment; auditory events are evoked where no authenticity or plausibility restraints are imposed, targeting purely virtual environments like computer games.

Figure 1: Physically adjustable acoustics for Beat Furrer's music theater FAMA

1.1 The setting

For the music theater production based on the novel "Der Process" (Engl. "The Trial"), written by Franz Kafka from 1914 to 1915, for actors, singer, choir and stage design, an experimental theater music composition was to be made:

    The hopeless search of the main character "Josef K" for the reason of his arrest is the grandiose template for an inter-institutional project: a play with perception, a sensual journey through the existential abysses and absurdities of bureaucracy, with students and lecturers of the institutes for stage design / acting / singing, song, oratorio / electronic music and acoustics, and the choir of the Kunstuniversität Graz.

The production was roughly spread over three parallel played scenes at three subsequent places, with a collective intro at the foyer and a collective finale at the concert hall:

Gyorgi Ligeti hall: a 400 m² concert hall constructed for virtual acoustics.

Theatre in the Palais (TIP): a 200 m² theatre hall constructed for traditional acoustics.

Courtyard between these houses, with a peepshow construction as stage design.

Figure 2: Ligeti hall stage design with big moveable blocks

Figure 3: TIP stage with a big hole in the middle, traditional chariot-and-pole system

After the "Intro" in the foyer, the audience was divided into 3 groups, each group attending a 30-minute performance in one of the places, and was guided from one place to the other within the intermissions.

1.2 The compositional approach

Additional constraints on the musical acoustic composition were made to concentrate and enforce the ideas of the piece: One main idea was to use signal processing for the experimental theater music composition for the play "Der Process" on live signals only, and not to use any pre-produced sound material. All material should be based on live recorded sound signals, using different microphones at the places, to construct virtual soundscapes for different audio reproduction systems, with virtual acoustics as a main instrument within these sceneries. Another constraint was that all places should be treated as networked AVEs.

1.3 The experimental approach

The artistic research question has been: can a non-plausible, non-authentic AVE, applied as a complex music instrument for theater music, produce varying plausible acoustic sceneries? As an extension, the AVE should use distortions in the space, time, spectrum and signal domains and should therewith produce a distorted AVE which is still perceived as acoustics: an absurd acoustics. Therefore the production was titled "AVE-Absurdum". With this concept, the categories of AVEs should be extended by a fourth category, let us name it "absurd AVE", which is non-authentic but plausible in an absurd way of reception and with respect to the visual domain. This AVE does evoke percepts in the listener which are related to a real acoustic environment and to the live sound produced by the actors and other real sound sources.

As a common 3D audio representation, Ambisonics should be used, also to allow simulations of these AVEs in the development phase, prior to the first rehearsals. Ambisonics was chosen not only because of the already implemented Ambisonics system at the concert hall, which had been well tested in previous productions like "Pure Ambisonics", but also for streaming the acoustical impact of one room to another. Therefore spatial recordings and mixes as 3D audio streams were used, so the spatial information of the audio signals can be used in other spaces.


Figure 4: Loudspeakers in the Ligeti hall: Ambisonics L, subwoofers S, extras E, and microphones M. The heights of the speakers increase towards the middle, from 2 m to 8 m.

Ambisonics can also be used in the directional speaker systems used for the movable acoustics in between the spaces.

As a stage design using processes as a backdrop, like an additional layer on the theater music itself, as a big invisible ensemble of signal processing algorithms, the generated sound environment represents a complex machine. So, from another perspective, these AVEs can be seen as part of the "theater machine" in the meaning of Gilles Deleuze's concept of machines [Raunig, 2004; Deleuze and Guattari, 1977].

2 AVE Absurdi

For the three spaces, three different implementations of Ambisonics have been designed:

Ligeti-Saal: 4th-order Ambisonics with 32 Ambisonics speakers and 2 subwoofers, 7 directional microphones hanging from the ceiling, 2 headsets for the main actors, 2 pickups on the floor and 2 on the blocks of the stage design.

Theater im Palais (TIP): a 2D Ambisonics ring on the ceiling, a subwoofer, 2 microphones for reverberation, 2 microphones for the enhancement of special places, 1 headset for triggering only.

Courtyard between the two houses: movable spherical directional loudspeakers driven by embedded Linux computers connected to a multichannel amplifier, and a directional microphone, powered by batteries and played by actors.

The Ambisonics order used in the different places is normally defined by the amplification system, but since we also work with Ambisonics streams and virtual microphones detecting different signals, the maximum is limited by the encoding system. The Ambisonics system at the TIP, since the stage was designed as a proscenium stage with the audience at one wall, was not satisfying and was canceled by the director there, who wanted a purely frontal stereo speaker system. Nevertheless, streams from the other places have been used for the sound environment there. In the following, the space in the Ligeti hall and the movable acoustics will be discussed.

2.1 AVE in the Ligeti concert hall

Varying playable acoustics had been developed before by the author, as acoustician for Beat Furrer's music theater FAMA: as a stage design, a real room in a room with rotatable wall elements, absorbers on one side and reflectors on the other, for 200 listeners, was built like a huge machine. One restraint there was to use no electro-acoustic elements. Unlike this physically adjustable acoustics, electro-acoustic AVEs should now be implemented. The already installed Constellation acoustic system from Meyer Sound [Sound, 2010], with circa eighty small speakers and about twenty microphones at 5 meter height, is a closed system and was not in any way flexible enough to fulfill the requirements of the idea of an "AVE absurdum".

The speakers used for the 3D Ambisonics system are shown in figure 4. 31 active Klingt&Freytag speakers have been used, of which the first 29 are mounted on pantographs, which can adjust the height and direction of each speaker individually as presets. The L22 speaker had to be adjusted higher because of the blocks of the stage design, and was seconded by two others, L30 and L31, on stands on the floor to lower the acoustical horizon. Additionally, two subwoofers have been placed left and right in the front corners, enhancing the Ambisonics sound and used for special subsonic effects.

The hemisphere was slightly expanded into an ellipse and stretched to the front to increase the "sweet spot". With this number of speakers, a 5th-order Ambisonics system could have been realized. But given all the obstacles and the additional movable blocks, using 3rd-order Ambisonics gave smoother results on moving sources, increased spatial continuity and avoided spatial aliasing errors across the room, which resulted in a bigger "sweet spot".

As decoder, the standalone decoder of the AmbiX plugin suite [Kronlachner, 2013] was used, for which Matthias Frank from the IEM calculated a suitable AllRAD decoder [Zotter and Frank, 2012]. For the preproduction of effects and for the development of the AVE in a studio or over headphones, a binaural decoder with a special set of impulse responses, measured in the Ligeti hall, was provided.

The decoder was fed with Ambisonics signals from applications within the Linux computer, and from other computers over a MADI audio interface input routed through the Lawo mixing console, using "jackd". Therefore three computer musicians were able to play in parallel, using the same AVE system over one central decoder feeding the speakers. The subwoofer management was done in the mixer, using the Ambisonics signals and an additional subsonic effect channel for special effects.

The AVE-Absurdum has been implemented with Puredata [Puckette, 1996], running patches on different computers connected over MADI audio interfaces. The main computer implemented an Ambisonics mixer with the room-in-room reverberation system (RRR), derived from the CUBEmixer [Ritsch et al., 2008] development of previous years and the "acre" Pd extension library, together with the Ambisonics toolbox module for Pd developed for this purpose: "acre-amb" [Ritsch, 2016], using the "iem-ambi" external library.

"acre-amb" is a collection of high-level Pd abstractions implementing Ambisonics functionality for the mixing and processing of multichannel signals and controls, to be used in compositions and effects. A further goal was to easily integrate Ambisonics encoders and decoders with calibration and speaker distribution, also providing connection and processing facilities targeting the fast prototyping of new Ambisonics algorithms.

Figure 5: Consoles (from left): Lawo mixing desk, controller for effects, computer console with Pd patch and AmbiX decoder, controller for time machine and memory player, spectral Ambisonics, with notebook as controller

2.1.1 Room-in-Room Reverberation for acoustics

The core of the RRR system is a multichannel reverberation system with 6 inputs and with 12 early-reflection and 6 late-reverb channels to be spatialized in the 3D space of the AVE.
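As a sketch only (the production RRR is a Pd patch, not this code), one early-reflection path of such a system can be modelled as a tapped delay line whose taps are then spatialized individually on the Ambisonics bus; all structure and tap values here are illustrative assumptions:

    #include <array>
    #include <vector>

    // Sketch of an RRR-style early-reflection stage: a circular delay
    // buffer with 12 taps (as in the RRR core), each tap to be encoded
    // at its own direction in the 3D AVE space.
    struct EarlyReflections {
        std::vector<float> buf;   // circular delay buffer
        std::size_t w = 0;        // write position
        explicit EarlyReflections(std::size_t maxDelay) : buf(maxDelay, 0.0f) {}

        // One input sample in, 12 tap outputs back (delays in samples > 0).
        std::array<float, 12> process(float x,
                                      const std::array<std::size_t, 12> &delay,
                                      const std::array<float, 12> &gain)
        {
            buf[w] = x;
            std::array<float, 12> taps{};
            for (std::size_t i = 0; i < 12; ++i)
                taps[i] = gain[i] * buf[(w + buf.size() - delay[i]) % buf.size()];
            w = (w + 1) % buf.size();
            return taps;
        }
    };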

It was not possible within this production time to mike a choir of 110 singers, especially because they sometimes move erratically in the room. Adjusting to the limited rehearsal time, we had to find a solution where the actors and choir could play with different absurd acoustics, utilizing the conductor and movement director to explore and fixate these effects within their rehearsals, starting from the very first rehearsals, and eliminating the need to track the movement of the choir and actors for the composition.

The solution chosen was enabling "active zones", areas under microphones where choir, actors and audience are enhanced and encounter different acoustic feedback. These effects can be switched or cross-faded for each scene. Also, actors can be in different acoustical spaces parallel to the choir or audience.

Therefore the RRR was driven directly by microphones at 3-4 m height, enabling different playable acoustics, from a small garage reverb, through long tunnels with scattered echoes, to big halls, and even further to echoes as if from surrounding buildings or mountains, allowing early reflections of > 200 ms. This allows building non-plausible acoustics, like increased energy on reflections and/or different acoustical rooms in one room: an absurd AVE.

In addition to limiting the output, especially to decrease feedback of the reverb, each microphone got a feedback suppression EQ for the 3 most resonant frequencies of the room.

2.1.2 Distortion in Space

Within the RRR, by spatializing early reflections from only one direction, or by placing all the late reverb at the other side, the acoustical space can be shaped: e.g., imaging a big room in one direction and a wall in the other. This can be done dynamically, with a sudden appearance of a late reverb from one side. A changing acoustics is perceived better than a static one: since changes of room size normally do not happen, they draw more attention from listeners than static acoustics.

An overlap of one acoustic space over another seems more unnatural, but since most singers and actors use reverb on stages, the audience is used to this effect.

2.1.3 Distortion in Signal

Another effect was inserting processing into the microphone signal path. Within this project three types have been tested:

spectral: resonances and filters.
dynamics: limiter, compressor and expander.
shaping: waveshaping (tubes, metal strings, noise (cut)), and also a string simulator and a metal plate.

Spectral filters have the same effect as different spectral properties of the reflection material, and are therefore only spectacular if applied really strongly. Changing this dynamically changes the whole "sound color" of the scenes.

Dynamics have mostly been applied to singers and actors. Nowadays the audience is widely familiarized with these effects for solo performers, but doing it in the extreme, which means silent passages become loud and loud voices decrease in volume, is a strong effect, and something singers do not like. Therefore it was used to increase the struggle of the actor against the environment, here the acoustics. The drawbacks are that it was really hard to control without feedback during silent phases, and that it can very easily be perceived as a mistake in the performance.

Figure 6: Choir surrounding the audience and actors behind a transparent curtain

A really strong effect is distortion, especially of the early reflections in the reverb: a tube shape gives the room a warm sound, while using nonlinear shapes introduces noise. Additionally, metal distortion, like a ring modulator with 2 input signals within a parallel dialog, can produce really scary rooms. As a drawback, feedback is again an issue, so mostly limiters had to be used.
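For illustration, a tube-style waveshaper of the kind used on the early-reflection paths can be sketched in a few lines; the drive parameter is an assumption for this sketch, not a documented control of the production patch:

    #include <cmath>
    #include <vector>

    // Minimal sketch of a "tube"-style waveshaper: soft tanh saturation.
    // Low drive gives a warm colouration; high drive moves towards
    // noise-like distortion, as described in the text.
    void tube_shape(std::vector<float> &block, float drive)
    {
        const float norm = std::tanh(drive); // keep roughly unity gain
        for (float &x : block)
            x = std::tanh(drive * x) / norm;
    }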

2.2 Distortion in Time

We called it "artistic time-stretching"; it is an ongoing research project by Manuel Planton at the IEM, where time-stretching should be applied in live situations. Time-stretching and realtime are clearly a contradiction, since stretching lags into the past, which means the signal is no longer within the realtime constraints.

Three different phases in perception have been experienced:

• time-stretched signal within the early-reflection delay, < 80 ms
• echoes, up to 300 ms
• a playback of a detached recording of the signal, > 1 s (up to infinity)

Playing within these phases is the artistic approach, where voices have to be slowed down first and then sped up again. Doing this, the sound is amplified first, then scattered, and then becomes a dialog with the live signal, which is a very thrilling effect; even more, it seems like a replay, like a "déjà-vu" experience. It turned out to be an effect which the actors liked to play with. It was used on solo pieces of the actors and on repetition phases of the choir. Introducing feedback loops of the signal into the time-stretcher, optionally combined with pitch shifts, expands the possibilities of this effect even further. So it was used solo for some scenes, with the effect signal spatialized independently from the position of the source.

For the implementation, an own Pd external was written, using the rubberband library and additionally an overlap-and-add (OLA) algorithm. Considering the limited rehearsal time and the big parameter space, the time-stretcher had to be played interactively, observing the actors, by an additional electronic musician. As an own instrument, the Pd patch was run on a separate computer with its own controllers, mixed in via an Ambisonics bus signal.
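The authors' time-stretcher is a custom Pd external; purely as a sketch of how the underlying rubberband API is typically driven in realtime mode (the sample rate, block size and time ratio here are illustrative, and the audio I/O is stubbed out):

    #include <rubberband/RubberBandStretcher.h>
    #include <algorithm>
    #include <vector>

    // Sketch: feed blocks of a live mono signal through rubberband.
    // A time ratio > 1 stretches, so the output lags into the past, as
    // described in the text; ramping it back below 1 lets the stretched
    // voice catch up with the live signal again.
    int main()
    {
        using RubberBand::RubberBandStretcher;
        const std::size_t rate = 48000, block = 256;
        RubberBandStretcher st(rate, 1,
                               RubberBandStretcher::OptionProcessRealTime);
        st.setTimeRatio(2.0);                  // slow the live voice down 2x

        std::vector<float> in(block, 0.0f), out(block, 0.0f);
        float *inp = in.data(), *outp = out.data();
        // One iteration of the per-block audio callback (illustrative):
        // ... fill 'in' from the live microphone signal ...
        st.process(&inp, block, false);
        if (st.available() > 0)
            st.retrieve(&outp, std::min<std::size_t>(st.available(), block));
        // ... send 'out' to the RRR / Ambisonics bus ...
        return 0;
    }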

2.3 Spectrum processing

Figure 7: "Spectre" Pd GUI

Another special effect has been spectral distortion over space; we named it "spectre", developed and played by Christoph Ressi. With a special patch using small FFTs/IFFTs, the spectral information was split into several channels, which were spatialized in the 3D space. High frequencies could be played from another direction than low ones and spread over the hemisphere. Drawing tables control the movements and the spreading. Also, a feedback loop within the effect was introduced, so it can do a kind of spectral freezing. This development was used to audio-process the choir input signals and distribute them in the space. The choir chant tends to be an acoustical environment in itself, especially if the choir is surrounding the audience. On transient signals with fast glissando elements, like shouting, clapping and stamping, the effect is audible like a rapid movement of the sound in the room. On long notes, especially chords, the room begins to feel like stretched and softened walls, because it is hard to hear any dimension, since reflections are masked by the direct sound. The effect was used in one scene as a solo acoustic performance and frozen during the conversations.
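A toy sketch of the band-splitting idea behind "spectre" (not Ressi's actual patch): distribute the bins of one small FFT frame across several output channels, each of which would then be encoded at its own direction; the per-frame inverse FFTs are omitted:

    #include <complex>
    #include <cstddef>
    #include <vector>

    // Split a spectrum into N channels so that each channel carries a
    // contiguous frequency band, to be spatialized separately.
    std::vector<std::vector<std::complex<float>>>
    split_bands(const std::vector<std::complex<float>> &spectrum,
                std::size_t channels)
    {
        std::vector<std::vector<std::complex<float>>> out(
            channels,
            std::vector<std::complex<float>>(spectrum.size(),
                                             std::complex<float>(0.0f, 0.0f)));
        const std::size_t per = (spectrum.size() + channels - 1) / channels;
        for (std::size_t bin = 0; bin < spectrum.size(); ++bin)
            out[bin / per][bin] = spectrum[bin];  // each bin in one channel
        return out;
    }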

3 Movable Virtual Acoustics

Figure 8: Development prototype of the Linux player with 8x100 W for 2 tetrahedron speakers: decoupled 2/8-channel USB interface, DC/DC converter, Olimex A20, 2 class-D amplifiers, to be powered by a 12 V battery

Figure 9: Agent with microphone playing 2 directional speakers

For virtual acoustics in the courtyard, where no speakers and no fixed installation were available, a special concept of moving acoustics was conceptualized. Operated by electronic musicians as "agents" in the play, the movable acoustics instruments, with AVE function and stream rendering features, were integrated into the play by the director.

Figure 10: Tetraeder drive for AVEs: channel strips (volume, EQ, limiter) feed Ambisonics encoders (theta/phi/width) into a first-order Ambisonics bus; a multiband Ambisonics decoder (3-band splitter and equalizer with low-, mid- and high-range decoders), crosstalk cancellation and per-speaker calibration and equalization drive the 4 speakers.
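As a minimal sketch of the encoding stage in this chain (the actual drive is a Pd patch; the WXYZ gains shown follow the standard first-order B-format convention, and the function name is invented):

    #include <array>
    #include <cmath>

    // First-order Ambisonics (B-format) encoding of a mono sample s at
    // azimuth theta and elevation phi, feeding the 1st-order bus of the
    // tetraeder drive (cf. Figure 10).
    std::array<float, 4> encode1st(float s, float theta, float phi)
    {
        return {
            s * 0.7071068f,                       // W (omni, -3 dB)
            s * std::cos(theta) * std::cos(phi),  // X (front)
            s * std::sin(theta) * std::cos(phi),  // Y (left)
            s * std::sin(phi)                     // Z (up)
        };
    }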

Using spherical loudspeaker arrays allows us to beam sound in many directions utilizing Ambisonics signals. With the walls of the buildings and rooms around the courtyard, reflections can be induced, which triggers a kind of surrounding sound. The simplest of the spherical geometries is the tetrahedron, which had been used before in a performance enhancing the room acoustics of a church [Ritsch and Lepenik, 2014]. The tetrahedron loudspeakers have 4 wideband speakers, one mounted on each face, and can be placed on a portable stand. The electronics consist of a 4x100 W class-D amplifier, supplied by a 12 V / 12 Ah rechargeable battery, driven by an "Olimex ARM-A20" embedded computer with a hacked multichannel USB audio interface, a phantom-powered microphone preamp, speaker cables and a microphone over an XLR cable. The agents can carry the whole electronics in their bags and hold the microphone and speaker.

A directional microphone has been chosen for interaction with the surroundings, so the agents can focus on and play with the sound input of the environment using a kind of AVE patch. Receiving the Ambisonics streams from the other spaces, using an additional virtual microphone, they can select signals from other performances to be combined in the audio scene.

The whole signal processing was done by a Pd patch including different effects like feedback with reverb, pitch shifting, delays etc., to realize a movable AVE. This work was named "AVE-tetrahedron" and was explored experimentally on the campus beforehand.

To play this instrument, small controllers mounted on the arms have been used.

4 Ambisonics network

Streaming Ambisonics was developed for the COMEDIA project [Ritsch, 2010]. Using this technique, the 3D acoustic signal of a room can be delivered to other spaces, broadcasting e.g. the 25-channel Ambisonics signal from the Ligeti hall to the others. The receiver can choose the AVE and place virtual microphones inside it, using a controllable Ambisonics decoder.

For streaming, scripts for "gstreamer" have been written as transmitter and receiver, connected via "jackd" to Pd. This allows an adjustable and acceptable latency with sufficient buffering for different situations.

Figure 11: Network of AVEs between the spaces (Ligeti-Saal, TIP, foyer and forecourt), connecting choir, singers, actors and room objects via MADI, Ethernet and wifi streams

5 Conclusions

The whole production was a big success, judging from the reactions of the audience and the participants. The AVE concept was accepted, after some persuasion, by explaining the concept to all participants. Because of the limited time, there was not much room to criticize and overthrow the concept, so even though it was very tight, we tried to stay as close to the concept as possible, or to drop it, as within the TIP.

To focus on transformations, and not so much on sound effects, was a very wise decision, since effects bring too much additional parallel content and are not so invasive as to support the idea of Kafka's absurd world.

The concept of the AVE absurdum as a playable instrument has been proven in the Ligeti hall. The movable acoustics instruments work nicely in small areas. Simple effects like distortion in time work surprisingly well. Imprinting the acoustics of one room onto another also works fine in most situations, but since listeners are used to it from media perception, it is not spectacular.

6 Acknowledgements

Thanks go to all the participants of the play, more than 80, for their support and for being open-minded enough to accept new concepts. Also to the university, which greatly supported this production beyond the normal effort for workshops and lectures. Especially thanks to Sabine Pinsker from the stage design department for organizing the whole project, and to the general director Horner for his encouragement to enter new paths for theater music. For the development, thanks to Atelier Algorythmics, which provided the tetrahedron player with embedded Linux systems and other hardware.

Figure 12: Winfried Ritsch conducting pantographs in the Ligeti hall

References

G. Deleuze and F. Guattari. 1977. Programmatische Bilanz für Wunschmaschinen, page 489. In: Deleuze, Gilles: Kapitalismus und Schizophrenie. Suhrkamp. Die Neu-Erfindung des Maschinenbegriffs, referenced by Raunig, Gerald.

M. Kronlachner. 2013. Plug-in suite for mastering the production and playback in surround sound and ambisonics. In Linux Audio Conference 2013, University of Music and Performing Arts, Graz, May. Institute for Electronic Music and Acoustics, Linux Audio Group.

Pedro Novo. 2005. Auditory Virtual Environment, chapter 11, pages 277ff. Signals and Communication Technology. Springer.

M. Puckette. 1996. Pure Data. In Proceedings, International Computer Music Conference, pages 224-227, San Francisco.

G. Raunig. 2004. Einige Fragmente über Maschinen. Grundrisse 17. Die Neu-Erfindung des Maschinenbegriffs.

W. Ritsch, J. Zmölnig, and T. Musil. 2008. The CUBEmixer, a performance-, mixing- and masteringtool. In Proceedings of the LAC 2008, Cologne, Germany. ZKM, Linux Audio Users Group.

Winfried Ritsch. 2010. Nettrike. CO-ME-DI-A, EACEA Culture Project on Network Performance in Music.

Winfried Ritsch. 2016. Ambisonics toolbox for puredata. Internet. Puredata abstraction library.

Winfried Ritsch and Robert Lepenik. 2014. Ex Machina Dei. Commissioned work by Musikprotokoll Graz 2014 for robot organ player, robot piano player and 6 tetrahedron speakers at the church St. Andrei.

Meyer Sound. 2010. Constellation acoustic system. Internet.

F. Zotter and M. Frank. 2012. All-round ambisonic panning and decoding. Journal of the Audio Engineering Society, 60(10):807-820.

Multi-user posture and gesture classification for 'subject-in-the-loop' applications

Giso Grimm¹,², Joanna Luberadzka² and Volker Hohmann¹,²
¹ HörTech gGmbH, Marie-Curie-Str. 2, D-26129 Oldenburg, Germany
² Research group "digital hearing devices", Department of Medical Physics and Acoustics, Medizinische Physik, and Cluster of Excellence Hearing4all
[email protected]

Abstract

This study describes a posture classification method for a marker-free depth camera. The method consists of an object identification procedure, feature extraction, and a naïve Bayesian classification approach with supervised training. Point clouds obtained from the depth camera are split into objects. For each object a set of features is extracted. A method of feature pre-processing is proposed and compared against a statistical orthogonalisation method. Using a manually labelled training data set, the probability distributions for the Bayesian classification are obtained. As a result of the classification, the most likely gesture is assigned to each object in real time. Classification performance was tested on a separate data set and reached about 80%.

Three different applications are described: automatic estimation of user postures to estimate the influence of hearing devices on user behaviour in communication situations, the control of an interactive audio-visual art installation, and interactive light control in a dance-floor setup with multiple dancers. Classification performance in these applications was measured and discussed.

Keywords

gesture classification, behaviour analysis, hearing devices, interactive art, subject-in-the-loop

1 Introduction

With the development of assistive technologies, there is a growing need for robust automatic identification of human postures and gestures. Gesture recognition is used for improving human-machine communication, e.g., in hand gesture-based device control [Freeman and Weissman, 1997; Richarz et al., 2008]. Another use case is the classification of gestures and postures that describe the subject's behaviour or provide information on the current state of the subject [Busso et al., 2008; Melo et al., 2015]. Automatic recognition of various postures has potential applications in research areas where the test subject's behaviour is analysed. As an example from hearing research, in typical communication situations, leaning forward while listening is associated with a high listening effort, whereas sitting more relaxed indicates a lower effort [Paluch et al., 2015]. Manual labelling of user behaviour in similar tasks is usually time consuming and is not sufficient in the case of 'subject-in-the-loop' experiments, where the measurement is controlled by the responses of the test subject. Interaction between the subject's reactions and the measurement procedure is desired when aiming at more realistic experimental conditions, but it can also provide additional performance measures from the experimental feedback loop. 'Subject-in-the-loop' experiments require a real-time classification of gestures and postures. This differs from conventional behavioural experiments, where a post-hoc analysis of the data is possible. Besides research applications, machine control functions based on natural postures are possible; e.g., a hearing device could increase its noise reduction efficiency when the user's change in posture indicates a higher listening effort. Such an application would require a body-worn motion tracking sensor, e.g., an accelerometer and gyroscope embedded into a hearing device.

Gesture and posture recognition tools are also applied in music and the arts [Ciglar, 2008; Donnarumma, 2011]. Typically, an artist controls music generation and modification tools with gestures, resulting in a mixture of dance and music performance. The classification system proposed here is designed to be useful for music and art applications with multiple users. One application is an audio-visual installation, where the postures of the audience influence the sound and vision. Another potential real-time application is a live interactive light and music control system for a dance floor.

Real-time analysis of postures and gestures from depth images is commonly achieved via skeleton modelling [Shotton et al., 2013].

In the applications of this study, such a high-level posture model is not required, because only a limited number of posture and gesture classes need to be discriminated. Furthermore, these applications require a computationally fast method of classification. For this, a naïve Bayesian classifier was used in this study. This simple classification method can deal with low-dimensional data and requires only a limited amount of training data [Ashari et al., 2013; Gupte et al., 2014]. For discriminating only a small set of classes, low-level features describing the coarse point cloud distributions and the velocities of certain point cloud areas can be used. However, to fulfil the implicit statistical assumptions of the naïve Bayesian classifier, and to identify the most relevant application-specific feature sets, a pre-processing of the features may be beneficial.

In this paper, methods of point cloud processing (sections 2.1 to 2.3), feature pre-processing (section 2.4) and classification (section 2.5) are described. In section 2.6, the training conditions in three different applications – posture classification for hearing research, multi-user control of an audio-visual art installation, and individualised light control for a dance floor – are explained. The classification performance in the different applications with the proposed pre-processing methods is given in section 3 and discussed in section 4.

2 Methods and apparatus

For this study, one or more subjects were tracked using a Microsoft Kinect depth camera. Although the final applications of this gesture and posture classification approach differ significantly, they all have the same structure, which is depicted in Fig. 1. First, the camera data was filtered for a more robust point cloud estimation, and the background was removed. In a second step, the point cloud was split into multiple objects. For each object, a set of features was extracted, and based on this feature set, the posture or gesture of each object was classified. The point cloud processing and classification was implemented in the openMHA hearing device signal processing platform [Herzke et al., 2017; Grimm et al., 2009; Grimm et al., 2006]. Training and data analysis were implemented in Matlab. These processing blocks are described below.

Figure 1: Structure of the proposed gesture and posture classification framework: the point cloud of the depth camera (Kinect) passes through noise reduction and background removal, edge detection and object grouping, coordinate transformation and feature extraction; after feature pre-processing and selection, classification yields per-object probabilities used for data logging (application 1) or audio/video/light mixing (applications 2 and 3).

2.1 Noise reduction and background removal

The Microsoft Kinect depth camera is an optical sensor which measures depth through the parallax of an infrared laser grid. It provides a depth value d for each pixel position (k,l). Invalid values (e.g., due to occlusion or absorption) get the depth value d = 0. In this study, the depth was scanned with a frame rate of 10 Hz.

Absorbing objects and objects with a very uneven surface, e.g., curly hair, typically result in invalid data points for many frames. To increase robustness in such conditions, invalid values were replaced by the last available valid value, if a value had been measured within the last second.

For the classification of objects it was essential to separate them from the background. Therefore, in an initial phase without subjects in the sensing area, the background depth was measured, and all depth values close to the background were removed. After this step, only those data points remain which were assumed to belong to a relevant object.

2.2 Edge detection and object grouping

An assumption for the object grouping was that all objects have a spatial separation, i.e., either the depth is not continuous or the objects are separated by background pixels. This allows the use of a simple first-order gradient edge detection algorithm on the depth data.

A pixel (k,l) was an object boundary if the depth gradient was above a threshold d_t:

    (d_{k,l} - d_{k+1,l})^2 + (d_{k,l} - d_{k,l+1})^2 > d_t^2        (1)

To construct objects, a generic flood fill algorithm [Torbert, 2012] was applied to identify all pixels within a closed boundary. These pixels were marked on an object map with their object number. The set of pixels (k,l) belonging to one object was P, which was then used for further object-specific processing if the number of its elements, p, was sufficiently large.
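As an illustration of this grouping stage (not the openMHA implementation; data structures are invented, and the recursion is kept simple for brevity), the boundary test of Eq. (1) combined with a flood fill could be sketched as:

    #include <vector>

    // depth[k][l]; 0 marks invalid/removed background pixels.
    using Depth = std::vector<std::vector<float>>;

    // Recursively assign object number 'id' to all connected pixels that
    // are not separated by a boundary in the sense of Eq. (1).
    void flood(const Depth &d, std::vector<std::vector<int>> &obj,
               int k, int l, float dt2, int id)
    {
        const int K = (int)d.size(), L = (int)d[0].size();
        if (k < 0 || l < 0 || k >= K || l >= L) return;
        if (obj[k][l] != 0 || d[k][l] == 0.0f) return;  // visited/background
        if (k + 1 < K && l + 1 < L) {
            const float gk = d[k][l] - d[k + 1][l];
            const float gl = d[k][l] - d[k][l + 1];
            if (gk * gk + gl * gl > dt2) return;        // boundary, Eq. (1)
        }
        obj[k][l] = id;
        flood(d, obj, k + 1, l, dt2, id);
        flood(d, obj, k - 1, l, dt2, id);
        flood(d, obj, k, l + 1, dt2, id);
        flood(d, obj, k, l - 1, dt2, id);
    }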
ing of the classifier was done based on the pre- The distance between the objects o and q at processed training data set, whereas the classifi- the time indices t and t − 1 was defined as cation performance to which this pre-processing d (t) = ||x (t) − x (t − 1)||. The size ratio o,q o q method P led, was computed using the test |ln(po(t))−ln(pq(t−1))| m was ro,q(t)= e . Then the co- data set, pre-processed in the same way as the herence matrix C(t) between two objects was training data. The pre-processing method Pm, defined by its elements which gave the best classification performance was chosen as the final pre-processing method c (t)= r (t)e−γdo,q(t) (3) o,q o,q for classification. with a weighting coefficient γ = 10. For a re- Orthogonalisation. The na¨ıve Bayesian sorting of objects, the columns of C were or- classifier used in the current work assumes LAC2017 - CIEREC - GRAME - Université Jean Monnet - Saint-Etienne - France 122

name                             label
global coordinates:
  number of pixels p             n.n
  mean position x̄                n.x, n.y, n.z
  median position                n.xmed, n.ymed, n.zmed
  rotation                       n.rot
local coordinates:
  size                           n.sx, n.sy, n.sz
  thickness                      n.r
  segment positions              o.lx, o.lz, o.rx, o.rz, o.lby, o.rby
  segment thickness              n.r1, n.r2, n.r3
  z-quantiles                    n.z25, n.z50, n.z75
velocities:
  object velocity                o.vz
  size changes                   o.vsy, o.vsz, o.vsx
  vertical segment velocities    o.vlz, o.vrz, n.vz1, n.vz2, n.vz3
  horizontal segment velocities  n.vxy1, n.vxy2, n.vxy3
  angular velocity               n.vrot

Table 1: List of features per identified object. The features were calculated by two different implementations, as indicated by the prefixes o and n.

Orthogonalisation. The naïve Bayesian classifier used in the current work assumes conditional independence of all the features. This means that adding features which are highly correlated with other features might degrade the performance of the classification. Therefore, an orthogonalisation of the feature space is required. In this study, two orthogonalisation methods were tested.

A principal component analysis (PCA) is a generic orthogonalisation method. A transformation matrix is estimated, which is then applied to the feature vector before classification. To avoid a dominance of large-scale features, all features were scaled to ensure a standard deviation of one before calculating the PCA coefficients.

As an alternative method, a feature selection method is proposed. First, the individual classification performance of each feature from the full feature set was computed, by training the classifier only on the given feature. Classification performance was then measured on the test data set. The features were then sorted by their individual classification performance. Starting with the best performing feature, features from the sorted feature set were added successively to the optimal feature set. This procedure was repeated until no further increase of classification performance was observed. Although this feature set is optimised for classification performance, it does not guarantee that it is orthogonal in a statistical sense.

Low-pass filtering. Low-pass filtering of the features across time results in a smaller feature variance within a class and thus a better class separation, which as a consequence leads to a better classification performance. On the other hand, with long time constants the classifier is not able to track transitions between the classes. The time constant τ can be adapted to the expected frequency of class transitions in the test data, or to increase classification performance and stability. The optimal τ was determined by a one-dimensional grid search, with and without PCA and feature selection.

2.5 Classification

To accomplish the gesture classification task, a Gaussian naïve Bayesian classifier was implemented. This approach assumes a set of conditionally independent and normally distributed features. Each class c_h, where h = 1,...,N_c is the class index and N_c is the total number of classes, represented a different gesture or posture. f̂ is a data vector with extracted features f̂_j, where j = 1,...,N_f̂ is the feature index and N_f̂ is the number of features. Considering the independence assumption, Bayes' formula can be written in the following form:

    p(c_h | f̂) = p(f̂ | c_h) p(c_h) / p(f̂)
               = ( ∏_{j=1}^{N_f̂} p(f̂_j | c_h) ) p(c_h) / p(f̂),        (6)

which means that the overall class conditional probability p(f̂ | c_h) can be computed by multiplying the conditional probabilities for each feature, p(f̂_j | c_h).
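A minimal C++ sketch of the posterior computation of eq. (6) follows. The data layout is an assumption; the log-domain evaluation is a standard numerical device and not necessarily what the study used.

    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Gaussian parameters (mu_jh, sigma_jh) per class h and feature j.
    struct ClassModel {
      std::vector<double> mu, sigma;
    };

    // Posterior p(c_h | f) following eq. (6) with a flat prior
    // p(c_h) = 1/N_c; computed in the log domain and normalised, so the
    // evidence p(f) never has to be evaluated explicitly.
    std::vector<double> posterior(const std::vector<ClassModel> &classes,
                                  const std::vector<double> &f) {
      std::vector<double> logp(classes.size(), 0.0);
      for (std::size_t h = 0; h < classes.size(); ++h)
        for (std::size_t j = 0; j < f.size(); ++j) {
          double z = (f[j] - classes[h].mu[j]) / classes[h].sigma[j];
          logp[h] += -0.5 * z * z - std::log(classes[h].sigma[j]);
        }
      double mx = logp[0];
      for (double v : logp)
        if (v > mx) mx = v;
      std::vector<double> post(classes.size());
      double sum = 0.0;
      for (std::size_t h = 0; h < classes.size(); ++h)
        sum += (post[h] = std::exp(logp[h] - mx));
      for (double &v : post)
        v /= sum;
      return post;
    }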

Figure 2: Labels of the project 1 (research application). Panels: relaxed, straight, forward, gestures.

Figure 3: Labels of the project 2 (audio-visual art installation). Panels: lake, ice, boil, steam, waves, thunder, ocean, rain.

Since the elements of f̂ are assumed to be normally distributed, p(f̂_j | c_h) = N(µ_{jh}, σ_{jh}), the probability density function (PDF) of a feature j for a class h can be modelled by the mean µ_{jh} and standard deviation σ_{jh}. These parameters were estimated from the manually labelled training data. Also a flat prior probability was assumed, p(c_h) = 1/N_c.

In the current study, probabilities p(c_h | f̂(t)) were calculated for each object in each time frame. For estimating the classification performance, the confusion matrix was computed as an average posterior probability for each class. The classification performance was the geometric average across the diagonal of the confusion matrix.

2.6 Classification tasks and class labels

The training was executed for three different classification tasks, corresponding to the use cases in hearing research, art and entertainment. In each training data set, data from nine test subjects (aged 23 to 44 years) was used. The recording of each gesture or posture lasted approximately 90 seconds for each subject.

In the first task ('project 1'), four classes with typical communication states were defined, with the intention to track the subject's behaviour during the hearing experiment. There were three sitting postures with labels relaxed, straight and forward, and a class corresponding to gesticulation while talking, gestures.

The second task ('project 2') consisted of eight classes, either body movements or postures, which were used for controlling and mixing a sound and video art installation concerning different manifestations of water. The 'water' classes had the following labels: lake, rain, ice, waves, ocean, boil, steam and thunder.

The third classification task ('project 3') contained five classes related to typical actions on a dance-floor at parties, to control the light according to the individual behaviour of the dancers. The labels stand (standing or slowly walking), beer (drinking from a bottle), dance (dancing), xdance (excessive dancing) and windmill (rotating head) were used.

Figure 4: Labels of the project 3 (dance-floor light control). Panels: stand, beer, dance, xdance, windmill.

The images in Figures 2, 3 and 4 present the selection of classes for each project.

Figure 5: Classification performance as a function of feature low-pass filter time constant τ for the orthogonalisation methods 'none', 'proposed', 'PCA' and the combination of 'proposed' with 'PCA', in the three tested projects. The shaded area denotes the chance level. (Three panels: project1, project2, project3; vertical axis: classification performance; horizontal axes: low-pass filter time constant / s, from 0.1 to 10.)
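The temporal filtering swept in Figure 5 can be pictured as a one-pole smoother applied to each feature; this is a minimal sketch with time constant tau in seconds at frame rate fps, since the exact filter used in the study is not specified beyond its time constant.

    #include <cmath>
    #include <cstddef>
    #include <vector>

    // One-pole low-pass smoothing of the feature vector over time.
    struct FeatureSmoother {
      double a;                    // pole coefficient derived from tau
      std::vector<double> state;
      FeatureSmoother(double tau, double fps, std::size_t nfeat)
          : a(std::exp(-1.0 / (tau * fps))), state(nfeat, 0.0) {}
      const std::vector<double> &operator()(const std::vector<double> &f) {
        for (std::size_t j = 0; j < f.size(); ++j)
          state[j] = a * state[j] + (1.0 - a) * f[j];
        return state;
      }
    };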

3 Results

3.1 Influence of feature pre-processing on classification performance

Time constant optimisation. Figure 5 shows the classification performance as a function of the feature low-pass filter time constant τ in all tested projects. The optimal value for project 1 was 297 ms, resulting in a classification performance of 81.8%. In project 2, the optimal time constant was 250 ms with a performance of 82.9%. In the third project, the time constant τ was 8 s, leading to a classification performance of 78.4%.

In all cases, the feature orthogonalisation improved the performance. The maximum performance was always achieved with the proposed method for feature selection. Using the PCA alone increased the performance only marginally. Both methods in combination did not give better performance results than the proposed method alone.

Feature selection. Figure 6 shows the performance of individual features in the three different projects. In project 1, the proposed feature selection method reduced the dimensionality to 12 features. The performance of individual features ranged from 19.1% to 44.4%; 42.7% of the selected features were velocity-related. In project 2, a set of 17 features was found to be optimal; individual performance ranged from 13% to 28.4%, and 35.3% of the features were velocity-related. In the last project, only 9 features were sufficient for optimal classification, with individual performance between 26.3% and 48.2%. In this case, 66.7% of the features were related to motion.

Figure 7: Posterior class probability as a function of time for the test data, with the labelled classes indicated by red lines. (Three panels: project1 with classes straight, relaxed, forward, gestures; project2 with lake, steam, rain, waves, ice, ocean, boil, thunder; project3 with stand, beer, dance, xdance, windmill; horizontal axes: time / s.)

3.2 Classification performance with optimised parameter sets

Figure 7 shows the posterior probabilities as a function of time. It can be noticed that classification errors mostly occurred at class transitions. With the longer time constants of project 3, a lag of classification at each transition can be seen.

The confusion matrix is shown in Figure 8. In project 1, the least confusions were achieved for the forward class. Typical confusions were between the classes straight and relaxed, as well as between gestures and straight.
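The performance figures quoted here are, as defined in Sec. 2.5, the geometric average across the diagonal of the confusion matrix. A minimal sketch of that metric:

    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Geometric average across the diagonal of a confusion matrix whose
    // entries are average posterior probabilities (rows: labelled class,
    // columns: classified class).
    double performance(const std::vector<std::vector<double>> &confusion) {
      double log_sum = 0.0;
      for (std::size_t h = 0; h < confusion.size(); ++h)
        log_sum += std::log(confusion[h][h]);
      return std::exp(log_sum / confusion.size());
    }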


Figure 6: Classification performance of single features (thick bars) and the cumulative classification performance (thin bars). Stars denote the features which were selected by the proposed orthogonalisation method. Blue colours denote velocity-related features.
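The selection behind the starred features in Figure 6 is the greedy forward procedure of Sec. 2.4; a sketch, where the scoring callback stands in for training and testing the classifier on a feature subset:

    #include <algorithm>
    #include <utility>
    #include <vector>

    // Rank features by individual performance, then add them one by one
    // as long as the combined test-set performance keeps increasing.
    template <class Score>
    std::vector<int> select_features(
        std::vector<std::pair<double, int>> ranked,  // (performance, index)
        Score score) {
      std::sort(ranked.begin(), ranked.end(),
                [](const std::pair<double, int> &a,
                   const std::pair<double, int> &b) {
                  return a.first > b.first;  // best individual feature first
                });
      std::vector<int> chosen;
      double best = 0.0;
      for (const auto &r : ranked) {
        chosen.push_back(r.second);
        double s = score(chosen);
        if (s > best)
          best = s;
        else {
          chosen.pop_back();  // no further increase: stop
          break;
        }
      }
      return chosen;
    }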

Figure 8: Confusion matrices (average posterior probability for each class in the test data set) of the three different projects.

In project 2, the least confusions were found for the classes boil, ocean and steam. The class thunder was often confused with the classes rain or steam. In the third project, more confusions can be noticed. Most confusions can be found for the classes xdance and windmill, and between the classes beer and dance.

4 Discussion

The results show that a robust classification of gestures and postures based on a low-level feature set is possible, even with a naïve Bayesian classifier and a small feature space. The pre-processing of features indicated that an orthogonalisation of the feature space in a statistical sense is less important than the selection of features with an optimal class separation. However, it is still unclear whether another order or combination of orthogonalisation methods, or a dimension reduction in the PCA, would further increase performance.

In this study, only a number of low-level features was used. A high-level feature space, e.g., skeleton modelling, might be beneficial for robust classification of complex and high-level gestures. On the other hand, using such low-level features does not require any model assumptions. An intermediate solution could be an advanced segmentation of the point cloud.

5 Conclusions

In this study it was shown that even with a small and low-level point-cloud based feature space a robust classification of gestures and postures is possible. The tested applications covered research, art and entertainment, with four to eight classes in each application. The proposed method of feature-space optimisation by selecting a subset of the features was shown to result in better classification performance than a statistical orthogonalisation method. Low-pass filtering of features with application-specific time constants allowed for a trade-off between stable classification and fast reactions at class transitions. Classification performance of approximately 80% was achieved in all applications. Automatic classification of gestures and postures for hearing research applications with the 'subject-in-the-loop', i.e., with a behavioural feedback loop, seems feasible.

6 Acknowledgements

This study was funded by DFG research grant 1732 "Individualisierte Hörakustik" and by "klangpol Netzwerk Neue Musik Nordwest". We would like to thank all test subjects who participated in the training phase.

References

Ahmad Ashari, Iman Paryudi, and A Min Tjoa. 2013. Performance comparison between naïve bayes, decision tree and k-nearest neighbor in searching alternative design in an energy simulation tool. Int. J. Adv. Comput. Sci. Appl., 4(11).

Carlos Busso, Murtaza Bulut, Chi-Chun Lee, Abe Kazemzadeh, Emily Mower, Samuel Kim, Jeannette N Chang, Sungbok Lee, and Shrikanth S Narayanan. 2008. IEMOCAP: Interactive emotional dyadic motion capture database. Language Resources and Evaluation, 42(4):335.

Miha Ciglar. 2008. "3rd. pole" – a composition performed via gestural cues. In NIME, pages 203–206.

Marco Donnarumma. 2011. Xth Sense: a study of muscle sounds for an experimental paradigm of musical performance. In ICMC.

William T Freeman and Craig D Weissman. 1997. Hand gesture machine control system, January 14. US Patent 5,594,469.

Giso Grimm, Tobias Herzke, Daniel Berg, and Volker Hohmann. 2006. The Master Hearing Aid – a PC-based platform for algorithm development and evaluation. Acta Acustica united with Acustica, 92:618–628.

Giso Grimm, Tobias Herzke, and Volker Hohmann. 2009. Application of Linux audio in hearing aid research. In Frank Neumann, editor, Proceedings of the Linux Audio Conference, pages 61–66, Parma, Italy. Istituzione Casa della Musica.

Amit Gupte, Sourabh Joshi, Pratik Gadgul, Akshay Kadam, and A Gupte. 2014. Comparative study of classification algorithms used in sentiment analysis. International Journal of Computer Science and Information Technologies, 5(5):6261–6264.

Tobias Herzke, Hendrik Kayser, Frasher Loshaj, Giso Grimm, and Volker Hohmann. 2017. Open signal processing platform for hearing aid research (openMHA). In Proceedings of the Linux Audio Conference.

Renato de Souza Melo, Andrea Lemos, Carla Fabiana da Silva Toscano Macky, Maria Cristina Falcão Raposo, and Karla Mônica Ferraz. 2015. Postural control assessment in students with normal hearing and sensorineural hearing loss. Brazilian Journal of Otorhinolaryngology, 81(4):431–438.

Richard Paluch, Matthias Latzel, and Markus Meis. 2015. A new tool for subjective assessment of hearing aid performance: Analyses of interpersonal communication. In Proceedings of the International Symposium on Auditory and Audiological Research, volume 5, pages 453–460.

Jan Richarz, Thomas Plotz, and Gernot A Fink. 2008. Real-time detection and interpretation of 3d deictic gestures for interaction with an intelligent environment. In Pattern Recognition, 2008. ICPR 2008. 19th International Conference on, pages 1–4. IEEE.

Jamie Shotton, Toby Sharp, Alex Kipman, Andrew Fitzgibbon, Mark Finocchio, Andrew Blake, Mat Cook, and Richard Moore. 2013. Real-time human pose recognition in parts from single depth images. Communications of the ACM, 56(1):116–124.

Shane Torbert. 2012. Applied Computer Science. Springer.

VoiceOfFaust

Bart Brouns
studio magnetophon
Biesenwal 3
Maastricht, Netherlands, 6211 AD
[email protected]

Abstract

VoiceOfFaust turns any monophonic sound into a synthesizer, preserving the pitch and spectral dynamics of the input. There are 7 synthesizer and two effect algorithms:
• a classic channel vocoder
• a couple of vocoders based on oscillators with controllable formants:
  ◦ CZ resonant oscillators
  ◦ PAF oscillators
  ◦ FM oscillators
  ◦ FOF oscillators
• FM with modulation by the voice
• ring-modulation
• Karplus-Strong used as an effect
• Phase modulation used as an effect

Keywords

Synthesis, Signal Processing, Audio Plugins.

1 Introduction

VoiceOfFaust turns any monophonic sound into a synthesizer, preserving the pitch and spectral dynamics of the input. It is written in Faust [1], and uses a pitch tracker in Pure Data [2]. It consists of:
• an external pitch tracker: helmholtz~ [3] by Katja Vetter
• a compressor/expander, called qompander [4], ported to Faust

There are 7 synthesizer and two effect algorithms:
• a classic channel vocoder
• a couple of vocoders based on oscillators with controllable formants:
  ◦ CZ resonant oscillators
  ◦ PAF oscillators
  ◦ FM oscillators
  ◦ FOF oscillators
• FM with modulation by the voice
• ring-modulation
• Karplus-Strong used as an effect
• Phase modulation used as an effect

The features include:
• all oscillators are synchronized to a single saw-wave, so they stay in phase, unless you don't want them to
• a powerful parameter mapping system lets you set different parameter values for each band, without having to set them all separately
• formant compression/expansion: make the output spectrum more flat or more resonant, at the twist of a knob
• flexible in- and output routing: change the character of the synth
• all parameters, including routing, but except the octave, are step-less, meaning any 'preset' can morph into any other
• multi-band deEsser and reEsser
• optional use as a master-slave pair: the master is a saw-oscillator driven by the (external) pitchtracker, and the slaves contain everything else, synced to the master. This makes it possible to run the slaves as plugins.
• a configuration file: through this file, lots of options can be set at compile time, allowing you to adapt the synth to the amount of CPU power and screen real-estate available. Some of the highlights:
  ◦ number of bands of the vocoders
  ◦ number of output channels
  ◦ whether we want ambisonics output
  ◦ whether a vocoder has one set of oscillators, or a separate set of oscillators per output

2 Vocoders

2.1 Common features of all vocoders

2.1.1 Parameter mapping system

The parameters for the vocoders use a very flexible control system: each parameter has a bottom and a top knob, where the bottom changes the value at the lowest formant band, and the top the value at the highest formant band. The rest of the formant bands get values that are evenly spaced in between. For some of them that means linear spacing, for others logarithmic spacing.

For even more flexibility there is a parametric mid: you set its value and band number and the parameter values are now (see the sketch at the end of this section):
• 'bottom' at the lowest band, going to:
• 'mid value' at band nr 'mid band', going to:
• 'top value' at the highest band.
Kind of like a parametric mid in equalizers. If that's all a bit too much, just set ``para`` to 0 in the configuration file, and you'll have just the top and bottom settings.

2.1.2 Formant compression/expansion

Scale the volume of each band relative to the others:
• 0 = all bands at average volume
• 1 = normal
• 2 = expansion
Expansion here means:
• the loudest band stays the same
• soft bands get softer
Because low frequencies contain more energy than high ones, a lot of expansion will make your sound duller. To counteract that, you can apply a weighting filter, settable from:
• 0 = no weighting
• 1 = A-weighting
• 2 = ITU-R 468 weighting

2.1.3 DeEsser

To tame harsh esses, especially when using some formant compression/expansion, there is a deEsser. It has all the usual controls, but since we are already working with signals that are split up in bands, with known volumes, it was implemented rather differently:
• multiband, yet much cheaper,
• without additional filters, even for the sidechain,
• and with a dB per octave knob for the sidechain, from 0dB/oct (bypass) to 60dB/oct (fully ignore the lows).
It also has a (badly named) noise strength parameter: it uses the fidelity parameter from the external pitchtracker to judge if a sound is an S. When you turn it up, the deEsser gets disabled when the pitchtracker claims a sound is pitched. See [3] for more info.

2.1.4 ReEsser

Disabled by default, but can be enabled in the configuration file. It replaces or augments the reduced highs caused by the deEsser.

2.1.5 DoubleOscs

This is a compile option, with two settings:
• 0 = have one oscillator for each formant frequency
• 1 = create a separate set of oscillators for each output channel, with their phase modulations reversed.

2.1.6 In and output routing

The vocoders can mix their bands together in various ways: we can send all the low bands left and the high ones right, we can alternate the bands between left and right, we can do various mid-side variations, we can even do a full Hadamard matrix. All of these, and more, can be cross-faded between. In the classicVocoder, a similar routing matrix sits between the oscillators and the filters.

2.1.7 Phase parameters

Since all* formants are made by separate oscillators that are synced to a single master oscillator, you can set their phases relative to each other. This allows them to sound like one oscillator when they have static phase relationships, and to sound like many detuned oscillators when their phases are moving.

* except for the classicVocoder.
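To illustrate the parameter mapping of Sec. 2.1.1, here is a small C++ sketch of the bottom/mid/top interpolation across bands (linear case). The function and parameter names are illustrative only, not VoiceOfFaust's actual Faust code.

    #include <vector>

    // Piecewise-linear spread of one parameter across nBands formant
    // bands: 'bottom' at band 0, 'mid' at band midBand, 'top' at the
    // highest band. A logarithmic variant would interpolate exponents.
    std::vector<double> mapParam(double bottom, double mid, double top,
                                 int midBand, int nBands) {
      std::vector<double> v(nBands);
      for (int i = 0; i < nBands; ++i) {
        if (i <= midBand)
          v[i] = (midBand == 0)
                     ? mid
                     : bottom + (mid - bottom) * i / double(midBand);
        else
          v[i] = mid + (top - mid) * (i - midBand)
                           / double(nBands - 1 - midBand);
      }
      return v;
    }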


Together with the output routing, it can also create interesting cancellation effects. For example, with the default settings for the FMvocoder, the formants are one octave up from where you'd expect them to be. When you change the phase or the output routing, they drop down.

These settings are available:
• static phases
• amount of modulation by low pass filtered noise
• the cutoff frequency of the noise filters

2.2 Features of individual vocoders

2.2.1 ClassicVocoder

Block-diagram: https://magnetophon.github.io/VoiceOfFaust/images/classicVocoder-svg/process.svg

A classic channel vocoder, with:
• a "super-saw" that can be cross-faded to a "super-pulse", free after Adam Szabo [5]
• flexible Q and frequency setting for the filters
• an elaborate feedback and distortion matrix around the filters

The gui of the classicVocoder has two sections. First oscillators, containing the parameters for the carrier oscillators. These are regular virtual analog oscillators, with the following parameters:
• cross-fade between oscillators and noise
• cross-fade between sawtooth and pulse wave
• width of the pulse wave
• mix between a single oscillator and multiple detuned ones
• detuning amount

Second filters, containing the parameters for the synthesis filters:
• bottom, mid and top set the resonant frequencies
• Q for bandwidth
• a feedback matrix; each filter gets fed back a variable amount of (a sketch of such a feedback mix follows below):
  ◦ itself
  ◦ its higher neighbor
  ◦ its lower neighbor
  ◦ all other filters
  ◦ distortion amount
  ◦ DC offset
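As an illustration of the feedback matrix idea, not the actual VoiceOfFaust implementation, a sketch of one feedback step:

    #include <vector>

    // Per-filter feedback mix: each filter receives a weighted sum of
    // its own output, its lower and higher neighbours, and all other
    // filters. The wrap-around behaviour at the edges is an assumption.
    std::vector<double> feedbackMix(const std::vector<double> &y, double self,
                                    double up, double down, double all) {
      const int n = (int)y.size();
      double total = 0.0;
      for (double v : y) total += v;
      std::vector<double> fb(n);
      for (int i = 0; i < n; ++i)
        fb[i] = self * y[i]
              + up * y[(i + 1) % n]
              + down * y[(i - 1 + n) % n]
              + all * (total - y[i]);
      return fb;
    }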

2.2.2 CZvocoder

Block-diagram: https://magnetophon.github.io/VoiceOfFaust/images/czVocoder-svg/process.svg

This is the simplest of the vocoders made out of formant oscillators. The oscillators were ported from a pd patch by Mike Moser-Booth [6]. You can adjust:
• the formant frequencies
• the phase parameters

2.2.3 PAFvocoder

Block-diagram: https://magnetophon.github.io/VoiceOfFaust/images/PAFvocoder-svg/process.svg

The oscillators were ported from a pd patch by Miller Puckette [7]. It also has frequencies and phases, but adds index for brightness.

2.2.4 FMvocoder

Block-diagram: https://magnetophon.github.io/VoiceOfFaust/images/FMvocoder-svg/process.svg

The oscillators were based on code by Chris Chafe [8]. Same parameters, different sound.

2.2.5 FOFvocoder

Block-diagram: https://magnetophon.github.io/VoiceOfFaust/images/FOFvocoder-svg/process.svg

Original idea by Xavier Rodet [9], based on code by Michael Jørgen Olsen [10]. Also has frequencies and phases, but adds:
• skirt and decay: two settings that influence the brightness of each band
• Octavation index: normally zero. If greater than zero, lowers the effective frequency by attenuating odd-numbered sinebursts. Whole numbers are full octaves, fractions transitional. Inspired by an algorithm in Csound [11].

3 Other synthesizers

These are all synths that are not based on vocoders.

3.1 Features of individual synths

3.1.1 FMsinger

Block-diagram: https://magnetophon.github.io/VoiceOfFaust/images/FMsinger-svg/process.svg

A sine wave that modulates its frequency with the input signal. There are five of these, one per octave, and each one has:
• volume
• modulation index
• modulation dynamics. This fades between 3 settings (a sketch of this fade follows below):
  ◦ no dynamics: the amount of modulation stays constant with varying input signal
  ◦ normal dynamics: more input volume equals more modulation
  ◦ inverted dynamics: more input equals less modulation.

3.1.2 CZringmod

Block-diagram: https://magnetophon.github.io/VoiceOfFaust/images/CZringmod-svg/process.svg

Ringmodulates the input audio with emulations of Casio CZ oscillators. Again five octaves, with each octave containing three different oscillators:
• square and pulse, each having volume and index (brightness) controls
• reso, having a volume and a resonance multiplier: this is a formant oscillator, and its resonant frequency is multiplied by the formant setting top right. It is intended to be used with an external formant tracker.
• There is a global width parameter that controls a delay on the oscillators for one output. The delay time is relative to the frequency. Because this delay is applied to just the oscillators, and before the ringmodulation, the sound of both output channels arrives simultaneously. This creates a mono-compatible widening of the stereo image.
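A sketch of how the three-way dynamics fade of the FMsinger could be computed; the 0..1 'dyn' control and the crossfade order are assumptions based on the description above, not the actual Faust code.

    // Modulation amount for one FMsinger band: crossfade from 'no
    // dynamics' (dyn = 0) through 'normal' (dyn = 0.5) to 'inverted'
    // (dyn = 1). 'level' is the tracked input level in 0..1.
    double modAmount(double index, double level, double dyn) {
      double none = index;                   // constant modulation
      double normal = index * level;         // follows the input level
      double inverted = index * (1.0 - level);
      if (dyn < 0.5)
        return none + (normal - none) * (dyn * 2.0);
      return normal + (inverted - normal) * ((dyn - 0.5) * 2.0);
    }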

3.1.3 KarplusStrongSinger

Block-diagram: https://magnetophon.github.io/VoiceOfFaust/images/KarplusStrongSinger-svg/process.svg

This takes the idea of a Karplus-Strong algorithm [12], but instead of noise, it uses the input signal. The feedback is run through an allpass filter, modulated with an LFO; adapted from the nonLinearModulator in instrument.lib. To keep the level from going out of control, there is a limiter in the feedback path. Parallel to the delay is a separate nonLinearModulator.

Globally you can set:
• octave
• output volume
• threshold of the limiter
For the allpass filters you can set:
• amount of phase shift
• difference in phase shift between left and right (yeah, I lied, there are two of everything)
• amount of modulation by the LFO
• frequency of the LFO, relative to the main pitch
• phase offset between the left and right LFO's.
To round things off there is a volume for the dry path and a feedback amount for the delayed one.

3.1.4 KarplusStrongSingerMaxi

Block-diagram: https://magnetophon.github.io/VoiceOfFaust/images/KarplusStrongSingerMaxi-svg/process.svg

To have more voice control of the spectrum, this one has a kind of vocoder in the feedback path. Since we don't want the average volume of the feedback path changing much, only the volumes relative to the other bands, the vocoder is made out of equalizers, not bandpass filters. You can adjust its:
• strength: from bypass to 'fully equalized'
• cut/boost: steplessly vary between
  ◦ -1 = all bands have negative gain, except the strongest, which is at 0
  ◦ 0 = the average gain of the bands is 0
  ◦ +1 = all bands have positive gain, except the weakest, which is at 0
• top and bottom frequencies
• Q factor

4 Master-slave

This is a workaround for the need for an external pitchtracker, making it possible to use the synths and effects as plugins. It has the nice side effect that your sounds become fully deterministic: because a pitchtracker will always output slightly different data, or at least at slightly different moments relative to the audio, the output audio can sometimes change quite a bit from run to run.

The master is a small program that receives the audio and the OSC messages from the external pitch tracker, and outputs:
• a copy of the input audio
• a saw wave defining the pitch and phase
• the value of fidelity, from the pitch tracker, as audio.
The slaves are synths and effects that input the above three signals. The outputs of the master can be recorded into a looper or DAW, and be used as song building blocks, without needing the pitch tracker. This makes it possible to switch synths, automate parameters, etc.

5 Strengths and weaknesses of Faust

The Faust language has some big advantages. The common perks of the language apply. For me, the biggest ones are:
• Quick implementation of ideas.
• If it sounds right, it is right. There won't be any crashes, memory leaks or other bugs.
• Write once, deploy everywhere.
• The block diagrams help with debugging and documentation.
• Fast running code.
• Automatic parallelisation.
In this project it was also very helpful to be able to easily parameterize things like the number of bands. Related: the input and output routing wouldn't be nearly as easy and fun to implement in most languages, as they lean heavily on Faust's splitting and combinatory operators.

Since this idea has been implemented in PureData earlier, it makes sense to mention two big advantages over that:
1. Text-interface, enabling quicker notation of ideas, version-control and a mouseless workflow.
2. Single sample feedback loops, as used in the classicVocoder.
The downsides of Faust to me are a steep learning curve and error messages that are often very verbose and unclear.

6 Use cases

The author has used VoiceOfFaust mostly for voice transformation in a musical context, but it has also come in handy to turn a bass-guitar into a synth [14].

7 Deployment

VoiceOfFaust heavily leans on knowing the pitch of the input signal. Since it's not yet possible to do decent pitchtracking in Faust, an external pitchtracker which sends the pitch through OSC is used. This limits the usable architectures to the ones supporting OSC.

Specifically, it would be nice to have VoiceOfFaust as a plugin within a DAW, but that is not directly possible. The master-slave architecture is a usable workaround.

To compile VoiceOfFaust, run one of the compilation scripts that support OSC, for example:

    faust2jack -osc FMvocoder.dsp

To run it, you can use one of the scripts in the launchers directory, for example:

    ./FMvocoder_PT

This will start puredata with the pitchtracker patch plus a synth, and connect everything through jack.

8 Acknowledgements

Many thanks to the developers of Faust [1] and Pure Data [2] for making dsp so accessible yet powerful.

References

[1] http://faust.grame.fr
[2] https://puredata.info
[3] http://www.katjaas.nl/helmholtz/helmholtz.html
[4] http://www.katjaas.nl/compander/compander.html
[5] https://www.nada.kth.se/utbildning/grukth/exjobb/rapportlistor/2010/rapporter10/szabo_adam_10131.pdf
[6] http://forum.pdpatchrepo.info/topic/5992/casio-cz-oscillators
[7] http://msp.ucsd.edu/techniques/v0.11/book-html/node96.html
[8] http://chrischafe.net/glitch-free-fm-vocal-synthesis
[9] http://anasynth.ircam.fr/home/english/media/singing-synthesis-chant-program
[10] https://ccrma.stanford.edu/~mjolsen/220a/fp/Foflet.dsp
[11] https://csound.github.io/docs/manual/fof2.html
[12] https://en.wikipedia.org/wiki/Karplus%E2%80%93Strong_string_synthesis
[13] https://github.com/magnetophon/VoiceOfFaust
[14] http://magnetophon.nl/sounds/BucketBoyz/ShiningBrightLight.mp3

On the Development of C++ Instruments

Victor LAZZARINI
Maynooth University
Maynooth, Co. Kildare
Ireland
[email protected]

Abstract

This paper brings together some ideas regarding computer music instrument development with respect to the C++ language. It looks at these from two perspectives, that of the development of self-contained instruments with the use of a class library and that of programming of plugin modules for a music programming system. Working code examples illustrate the paper throughout.

Keywords

Computer Music Instruments, C++, Music Programming

1 Introduction

Whatever we do, if we are in the business of making music solely or primarily with computers, one way or another, at some point, we will meet computer music instruments [Lazzarini, 2017a]. Whether we are making electroacoustic music, algorithmic composition, live coding, tracking, creating pop tunes, we will find ourselves manipulating these. They can present themselves through music programming systems [Lazzarini, 2013], such as Csound [Lazzarini et al., 2016] or Faust [Orlarey et al., 2004], or as software synthesizers, plugins, audio processing programs, etc. There is a wide variety of forms. In this paper, I would like to contemplate one of these that involves libraries, compilers, and the C++ language.

C++ was once described as having "the elegance, the power, and the simplicity of a hand grenade", which to me, as a die-hard pure C programmer, sounds about right. However, I must admit that its latest standards, ISO C++11 [ISO/IEC, 2011], C++14 [ISO/IEC, 2014], and the forthcoming C++17 [ISO/IEC, 2017], arriving in quick succession as they are, are making this monstrous language more attractive. Now finally we can write a nice lambda and pass it to a map to process a list, for example. The standard library, borne out of the much appreciated, much maligned, standard template library (STL), has actually become quite usable. There is still enough complexity for one to get entangled, however, but with moderation and good design, we can make it work for us.

This paper will examine two approaches of C++ instrument making. The first one is based on employing a signal processing library to write simple, straightforward programs that can be ported to various platforms. The second is to create components, plugins, for Csound using a framework that sits atop the system implementation in C. It is mostly directed at computer music practitioners who can converse in C/C++, and it will be fully illustrated by working code, which can also be found somewhere in an online repo (links will be given).

2 AuLib Instruments

Towards the end of 2016, I decided to collect a number of digital signal processing (DSP) algorithms that I had been writing or studying throughout the years into a simple, lightweight, flexible C++ library, called AuLib¹. One of my aims was to document these uniformly in efficient and readable code so that they could become somewhat of a reference for me and others. I was also rewriting some of my teaching materials and this became part of them. Following a number of refactoring steps, I settled on a design that followed modern C++ standards in employing the standard library as much as possible to handle resources and keeping the code as simple and lightweight as possible.

¹ github.com/vlazzarini/aulib/

When designing a class library, there are two distinct possibilities (amongst the various decisions we have to make) with respect to object hierarchies.

One is that we can define a base class for DSP objects that has a shared processing interface, that is, one (or more) DSP methods that are specialised in derived classes. A means of connecting objects is provided separately from this, and once connected, we can place the objects in a list of references or pointers to the base class and call the processing method of each in turn to get an output signal. This is what is at the heart of processing engines such as the ones in PD, Csound, Faust. If the aim is to create a library whose main objective is to be employed as an engine for some higher-level programming or patching system, this is the way to go. The Sound Object Library [Lazzarini, 2000] was designed this way and it really paid off when it was later wrapped up in Python.

The alternative is to relax this constraint and not provide a unified processing interface, leaving it to derived classes to define their own. The advantage of this is that each class can have different ways to handle input parameters to processing methods, depending on what they are supposed to do. So an oscillator might have amplitude and frequency as parameters, in scalar or vectorial forms, or no parameters at all (for, say, fixed values of amplitude and frequency). It can provide a bunch of overloads to handle each case. A filter will have an input signal and optional parameters, depending on the type. A frequency-domain object might take a spectral frame. This, on one hand, simplifies connections (we can define them at the processing point, rather than separately), and on the other makes it hard to use in sound engine applications where the interface needs to be shared.

Given that the objective here for this library was to provide a working context for a diverse set of algorithms, and to provide a flexible means of using them in programs, I have opted for the second approach. This would provide greater freedom to create exactly the right form to hold each DSP formulation. Now, given this context, it is still desirable to use the class structure afforded by C++ to re-use code fully.

This meant designing a base class that was a container for an audio signal, providing the typical fundamental operations we would like to perform on it. For me, this meant: scaling (multiplying by a scalar), offsetting (adding a scalar), mixing (adding a vector/signal) and ring modulating (multiplying by a vector/signal). Granted, in an audience of music and audio developers, we are likely to find multiple definitions of what fundamental operations on signals are, but I am drawing the line here (ok, maybe not quite, but let's keep at this for the moment). Attributes such as number of channels (interleaved), sampling rate and vector size are also needed, and of course the audio signal vector itself.

This makes up the AudioBase class of the library, which begins like this:

    class AudioBase {
     protected:
      uint32_t m_nchnls;
      uint32_t m_vframes;
      std::vector<double> m_vector;
      double m_sr;
      uint32_t m_error;

Having a fundamentally neutral base, with no hint of what a DSP object might want to implement, allows me to use it for absolutely everything I can think of, or almost. So of the 50-odd classes currently sitting in the library, only four are not derived from AudioBase (fig. 1). It is specialised for common time-domain operations (oscillators, filters, envelopes, etc.), for spectral processing (short-time Fourier transform, phase vocoder), for function tables, for audio input/output, and even for higher-level instrument models. Code re-use is truly maximised.

3 Developing Instruments

A detailed description of the library design is offered elsewhere [Lazzarini, 2017c]. In this paper, we will look at using it for C++ instrument development. So let's explore some cases².

² all examples available in the examples directory of the AuLib repository.

3.1 Basic examples

We begin with a trivial case: a ping instrument, written as a command-line program. This just plays a 440Hz, -6dB sine wave to the output for a couple of seconds. The code, without its safety checks etc., can be abbreviated as this seven-liner:

    int main() {
      Oscil sig(0.5, 440);
      SoundOut output("dac");
      for (int i = 0; i < def_sr * 2; i += def_vframes)
        output.write(sig.process());
      return 0;
    }

Frequency and amplitude are not changing, so I pick the process() overload with no parameters and stick its return value straight into the output write() method.
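As a small variation, a fade-out can be added by using the process(amp, freq) overload discussed above. This is a sketch only: it reuses the names from the paper's examples (Oscil, SoundOut, def_sr, def_vframes), but this exact program is not from the AuLib repository.

    // Ping with a linear fade-out, computed per vector of frames.
    int main() {
      Oscil sig;
      SoundOut output("dac");
      int dur = def_sr * 2;                          // two seconds
      for (int i = 0; i < dur; i += def_vframes) {
        double amp = 0.5 * (1. - double(i) / dur);   // linear fade-out
        output.write(sig.process(amp, 440.));
      }
      return 0;
    }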

Figure 1: The AuLib class library

The two classes share the base, but they have distinctly-named and defined processing methods.

Let's try something slightly less simplistic. A similarly-placed instrument but now with a sweeping resonant filter acting on a sawtooth wave:

    int main() {
      TableSet saw(SAW);
      BlOsc sig(0.5, 440., saw);
      ResonR fil(1000, 1.);
      Balance bal;
      SoundOut output("dac");
      for (int i = 0; i < def_sr * 10; i += def_vframes) {
        sig.process();
        fil.process(sig, 1000. + 400. * i / def_sr);
        bal.process(fil, sig);
        output.write(bal);
      }
      return 0;
    }

TableSet creates a set of tables for a band-limited oscillator.

The filter centre frequency is varied over time, and we feed its output into a balancing operator that uses a comparator to keep amplitudes under control.

To demonstrate how the base class-defined operations can be useful, we have a simple FM example:

    int main() {
      double fm = 440., fc = 220., ndx = 5.;
      Oscili mod, car;
      SoundOut output("dac");
      for (int i = 0; i < def_sr * 10; i += def_vframes) {
        mod.process(ndx * fm, fm);
        car.process(0.5, mod += fc);
        output.write(car);
      }
      return 0;
    }

Note the use of the overloaded sum-assignment operator in mod += fc to add the modulator signal to the carrier (scalar) frequency.

3.2 Instrument Models

Clearly, the examples above are more demonstrations of how instruments can be set up. These would, in a more realistic scenario, be placed in plugin or GUI application wrapping code, where they can become useful. The AuLib also provides some modelling of instruments and instances of these. We can show how these work in a straightforward application case: a polyphonic MIDI synthesiser.

The AuLib class Note provides the base for an instance of a sound object, which can be, for example, a synthesiser voice. This holds basic parameters such as amplitude, cps pitch, etc. that we can use to control a sound object. To use it, we derive our own, and specialise its dsp method, placing our sound processing code there.

    class SineSyn : public Note {
      // signal processing objects
      Adsr m_env;
      Oscili m_osc;

      // DSP override
      virtual const SineSyn &dsp() {
        if (!m_env.is_finished())
          set(m_osc(m_env(), m_cps));
        else
          clear();
        return *this;
      }
      ...

The sound synthesis is again trivial, to keep the example focused: an envelope and a sine wave oscillator. But note that we have a new convenient interface: using the classes' operator(), we connect objects more easily one into another. This syntax reinforces the connection metaphor: envelope, alongside pitch, into oscillator. Given that the class is derived from AudioBase, we set its vector to the result of the processing.

Additionally, we want to specialise two other methods, for sound onset and sound termination:

    // note off processing
    virtual void off_note() {
      m_env.release();
    }

    // note on processing
    virtual void on_note() {
      m_env.reset(m_amp, 0.01, 0.5, 0.25 * m_amp, 0.01);
    }

This plus the constructor completes our Note-derived class. Now we want to model the whole synthesiser, not just its voices. To do this, we can use the Instrument template class, instantiated with the required number of voices and our note class:

    Instrument<SineSyn> synth(8);

An important aspect of this class is that it has a dispatch() method that takes in five parameters (message type, channel, data1, data2, time stamp) and responds to two message types (NOTE ON, NOTE OFF). While these are the same as the MIDI channel messages, we are just re-using the metaphor here. The call to dispatch() does not need to originate from MIDI or be limited to the usual MIDI data ranges. Specialisations of instrument can re-implement message handling to allow for other types. Instrument also handles polyphony using last-note priority, and this can also be overridden in derived classes.

Given that the example will use MIDI input, the library supports a simple MIDI listener class that takes an Instrument object (or any type implementing dispatch() and process()) and responds to messages. The complete program becomes very straightforward (trivial signal handler implementation omitted):

"Available MIDI inputs:\n"; Class Description for(auto &devs: Csound The Csound engine midi.device_list()) Params std::cout << devs << std::endl; Opcode parameters std::cout << "choose a device: "; AudioSig Audio signals std::cin >> dev; Fsig Spectral signals Pvframe Spectral data frames if (midi.open(dev) == Pvbin Spectral data bins AULIB_NOERROR){ std::cout << Vector Array variables "running... Table Function tables (use ctrl-c to close)\n"; AuxMem Dynamic memory while (running) Thread Multithreading // listen to midi on Plugin // behalf of synth Plugin base class out(midi.listen(synth)); FPlugin Spectral plugin base class } else std::cout << Table 1: Classes provided by CPOF. "error opening device...\n"; std::cout << "...finished \n"; return 0; } 5 Plugin Examples Again, with a few lines of code, we can get The Csound language has a variety of internal a basic MIDI synthesiser instrument. Although data types that its opcodes can process. We will the synthesis is simple, it can be shown that look at each one of these with a programming the effort involved in more complex examples example. scales well. It is just a case of using other signal 5.1 Init-time opcodes processing objects in different arrangements. In Csound, code that is run only once per in- 4 Csound Plugins stantiation (or again on explicit re-initisation) employs init-time variables. These are scalar The second case of C++ instrument develop- types holding a floating-point number (the ment we will look at focuses on creating com- MYFLT type defined by the system). Plugin op- ponents (plugins) that can be employed in a mu- codes for these types are derived from Plugin sic programming language. Unit generators in and are instantiated templates taking the num- Csound are known as opcodes and the system ber of output and input arguments (respec- has a well-document C interface for the pur- tively) as parameters. The following examples pose of adding new ones of these to it. It also uses the standard library Gaussian generator to has a C++ base class that has been used for produce a random number using the normal dis- a small number of opcode plugin libraries that tribution. The first input argument is the mean, come with the system. followed by the deviation, and the seed: With the intention of enabling a more com- plete and well-integrated C++ support for plu- #include gin opcode development, I have introduced struct Gauss : 3 csnd::Plugin<1, 3>{ the Csound Plugin Opcode Framework CPOF std::normal_distribution norm; (pronounced see-pough or cip´o = vine in Por- std::mt19937 gen; tuguese4). The actual framework part of it is fairly light, consisting of two template base init init(){ csnd::constr(&norm, inargs[0], classes, but it also contains an extensive set inargs[1]); of utility classes that wrap Csound C code for csnd::constr(&gen, inargs[2]); C++ use in a very idiomatic way (table 1). outargs[0] = norm(gen); CPOF is discussed extensively in [Lazzarini, csnd::destr(&norm); 2017b]. csnd::destr(&gen); } 3 }; available as part of Csound, github.com/csound/ csound, with code examples in the examples/plugin di- Note that because Csound instantiates the rectory. 4as in: C++ gives you enough vine, or rope, for you plugin object and it does not know anything to either hoist yourself up a tree, or hang yourself fairly about C++ constructors, we need to explicitly decently. construct the objects norm and gen. When we LAC2017 - CIEREC - GRAME - Université Jean Monnet - Saint-Etienne - France 138

When we are done, we need to destruct them as they are likely to have allocated resources, which we do not want to be left dangling. The Plugin base class gives us the inargs and outargs objects, which contain the input and output arguments respectively.

In order for the plugin to be added to Csound's collection of opcodes, we need to register it. To do this, we implement the csnd::on_load() function, where we place a call to the csnd::plugin() template method, passing the argument types ("i") and the action time of the opcode (thread::i), as well as the opcode name we will use ("gaussian"):

    #include <modload.h>
    void csnd::on_load(Csound *csound) {
      csnd::plugin<Gauss>(csound, "gaussian",
                          "i", "iii", csnd::thread::i);
    }

5.2 Control-rate opcodes

The next data type we can tackle is the one used for control-rate variables (k). This is also a scalar, but now the opcode is active at performance time (as well as init). A control-rate version of the gaussian opcode would look like this:

    struct GaussP : csnd::Plugin<1, 3> {
      std::normal_distribution<MYFLT> norm;
      std::mt19937 gen;

      int init() {
        csnd::constr(&norm, inargs[0], inargs[1]);
        csnd::constr(&gen, inargs[2]);
        csound->plugin_deinit(this);
        return OK;
      }

      int deinit() {
        csnd::destr(&norm);
        csnd::destr(&gen);
        return OK;
      }

      int kperf() {
        outargs[0] = norm(gen);
        return OK;
      }
    };

We can see that we now supplied the kperf() method that will be called repeatedly during performance. Another difference is that we have to provide a deinit() to call the destructors, which will be called when performance ends. This method needs to be registered separately with Csound through the plugin_deinit() template function. We register this version of the opcode with:

    csnd::plugin<GaussP>(csound, "gaussian",
                         "k", "iii", csnd::thread::ik);

5.3 Audio-rate opcodes

For audio signals, we need to implement the aperf() method. The variable now is a vector, so we have to use an AudioSig object to hold it. The following example shows an aperf() method that can be added to GaussP to implement an audio-rate opcode:

    int aperf() {
      csnd::AudioSig out(this, outargs(0));
      for (auto &sample : out)
        sample = norm(gen);
      return OK;
    }

The same class can then be registered for an audio-rate output:

    csnd::plugin<GaussP>(csound, "gaussian",
                         "a", "iii", csnd::thread::ia);

5.4 Spectral signals

Spectral signals in Csound are carried from opcode to opcode using fsig variables. These are self-describing variables holding one frame of frequency-domain data, plus associated information about the stream. In CPOF, we manipulate these using the pv_stream class. Similarly to audio signals, we can get the fsig data off arguments into objects of these types for processing. An opcode is responsible for initialising its own output stream, which we can do at init time. Stream frames can be decomposed in separate bins held by pv_bin objects.

The example below shows a plugin that implements spectral tracing [Wishart, 1996], defined as retaining only the loudest N bins in each frame. Some important aspects to note about this code: (a) spectral processing occurs at a rate determined by the frame analysis rate, so we run it at k-rate and process frames as they become available; (b) a framecount, a member variable of the FPlugin base class, is kept for this; (c) the AuxMem is used to manage a heap-allocated block of memory to keep bin amplitudes; and (d) we add the types as a static constant member of the class, which simplifies the plugin registration call.

The basic algorithm is as follows:

1. get the amplitudes from each bin;
2. find the nth loudest;
3. use this as a threshold to filter the frame data, keeping only the bins holding amplitudes above it.

The standard library algorithms are very well suited to implementing these steps. The code becomes very compact and fairly readable.

    #include <algorithm>
    #include <plugin.h>

    struct PVTrace : csnd::FPlugin<1, 2> {
      csnd::AuxMem<float> amps;
      static constexpr char const *otypes = "f";
      static constexpr char const *itypes = "fk";

      int init() {
        if (inargs.fsig_data(0).isSliding())
          return csound->init_error(
              Str("sliding not supported"));

        if (inargs.fsig_data(0).fsig_format() !=
                csnd::fsig_format::pvs &&
            inargs.fsig_data(0).fsig_format() !=
                csnd::fsig_format::polar)
          return csound->init_error(
              Str("fsig format not supported"));

        amps.allocate(csound, inargs.fsig_data(0).nbins());
        csnd::Fsig &fout = outargs.fsig_data(0);
        fout.init(csound, inargs.fsig_data(0));
        framecount = 0;
        return OK;
      }

      int kperf() {
        csnd::pv_frame &fin = inargs.fsig_data(0);
        csnd::pv_frame &fout = outargs.fsig_data(0);

        if (framecount < fin.count()) {
          int n = fin.len() - (int)inargs[1];
          float thrsh;
          std::transform(fin.begin(), fin.end(), amps.begin(),
                         [](csnd::pv_bin f) { return f.amp(); });
          std::nth_element(amps.begin(), amps.begin() + n,
                           amps.end());
          thrsh = amps[n];
          std::transform(fin.begin(), fin.end(), fout.begin(),
                         [thrsh](csnd::pv_bin f) {
                           return f.amp() >= thrsh ? f : csnd::pv_bin();
                         });
          framecount = fout.count(fin.count());
        }
        return OK;
      }
    };

    #include <modload.h>
    void csnd::on_load(Csound *csound) {
      csnd::plugin<PVTrace>(csound, "pvstrace",
                            csnd::thread::ik);
    }

5.5 Array variables

Csound has a container type, array, which can be used to create vectors of built-in types. CPOF provides a template class Vector to wrap array arguments conveniently for manipulation. The typedef myfltvec is an instantiation of this template for real values (MYFLT). The following example combines the use of lambdas and templates to create a whole family of binary (two-operand) operators for numeric (scalar) arrays. It can be used for init- and k-rate opcodes. The processing is placed in a separate function to avoid code duplication. It is just a matter of mapping the inputs into the outputs through the application of a given function.

    template <MYFLT (*bop)(MYFLT, MYFLT)>
    struct ArrayOp2 : csnd::Plugin<1, 2> {
      int process(csnd::myfltvec &out, csnd::myfltvec &in1,
                  csnd::myfltvec &in2) {
        std::transform(in1.begin(), in1.end(), in2.begin(),
                       out.begin(), [](MYFLT f1, MYFLT f2) {
                         return bop(f1, f2);
                       });
        return OK;
      }

      int init() {
        csnd::myfltvec &out = outargs.myfltvec_data(0);
        csnd::myfltvec &in1 = inargs.myfltvec_data(0);
        csnd::myfltvec &in2 = inargs.myfltvec_data(1);

        if (in2.len() < in1.len())
          return csound->init_error(
              Str("second input array is too short\n"));

        out.init(csound, in1.len());
        return process(out, in1, in2);
      }

      int kperf() {
        return process(outargs.myfltvec_data(0),
                       inargs.myfltvec_data(0),
                       inargs.myfltvec_data(1));
      }
    };

This class template is then instantiated to create various opcodes based on different two-operand functions:

    csnd::plugin<ArrayOp2<std::atan2>>(csound, "taninv",
      "i[]", "i[]i[]", csnd::thread::i);
    csnd::plugin<ArrayOp2<std::atan2>>(csound, "taninv",
      "k[]", "k[]k[]", csnd::thread::ik);
    csnd::plugin<ArrayOp2<std::pow>>(csound, "pow",
      "i[]", "i[]i[]", csnd::thread::i);
    csnd::plugin<ArrayOp2<std::pow>>(csound, "pow",
      "k[]", "k[]k[]", csnd::thread::ik);
    csnd::plugin<ArrayOp2<std::hypot>>(csound, "hypot",
      "i[]", "i[]i[]", csnd::thread::i);
    csnd::plugin<ArrayOp2<std::hypot>>(csound, "hypot",
      "k[]", "k[]k[]", csnd::thread::ik);
    csnd::plugin<ArrayOp2<std::fmod>>(csound, "fmod",
      "i[]", "i[]i[]", csnd::thread::i);
    csnd::plugin<ArrayOp2<std::fmod>>(csound, "fmod",
      "k[]", "k[]k[]", csnd::thread::ik);
    csnd::plugin<ArrayOp2<std::fmax>>(csound, "fmax",
      "i[]", "i[]i[]", csnd::thread::i);
    csnd::plugin<ArrayOp2<std::fmax>>(csound, "fmax",
      "k[]", "k[]k[]", csnd::thread::ik);
    csnd::plugin<ArrayOp2<std::fmin>>(csound, "fmin",
      "i[]", "i[]i[]", csnd::thread::i);
    csnd::plugin<ArrayOp2<std::fmin>>(csound, "fmin",
      "k[]", "k[]k[]", csnd::thread::ik);

This is a good example of how we can apply a modern C++ idiom to create compact code for the generation of a family of related opcodes.

6 Conclusions

Perhaps one of the conclusions of this paper is that C++ is not such a terrible choice for the implementation of computer music instruments. While C is still the preeminent language for audio signal processing, the latest C++ standards have made that language somewhat more interesting, providing almost a blend of high-level scripting with a (hopefully) efficient implementation.

References

ISO/IEC. 2011. ISO international standard ISO/IEC 14882:2011, programming language C++.

ISO/IEC. 2014. ISO international standard ISO/IEC 14882:2014, programming language C++.

ISO/IEC. 2017. Working draft, standard for programming language C++.

V. Lazzarini, J. ffitch, S. Yi, J. Heintz, Ø. Brandtsegg, and I. McCurdy. 2016. Csound: A Sound and Music Computing System. Springer Verlag.

V. Lazzarini. 2000. The SndObj sound object library. Organised Sound, (5):35–49.

V. Lazzarini. 2013. The development of computer music programming systems. Journal of New Music Research, (42):97–110.

V. Lazzarini. 2017a. Computer Music Instruments. Springer Verlag.

V. Lazzarini. 2017b. The Csound plugin opcode framework. In SMC 2017 (under review), Helsinki.

V. Lazzarini. 2017c. The design of a lightweight DSP programming language. In SMC 2017 (under review), Helsinki.

Y. Orlarey, D. Fober, and S. Letz. 2004. Syntactical and semantical aspects of Faust. Soft Computing, 8(9):623–632.

T. Wishart. 1996. Audible Design. Orpheus the Pantomime.

Meet the Cat: Pd-L2Ork and its New Cross-Platform Version "Purr Data"

Ivica Ico Bukvic, Virginia Tech SOPA, ICAT, DISIS, L2Ork, Blacksburg, VA, USA 24061, [email protected]
Albert Gräf, Johannes Gutenberg University (JGU), IKM, Music-Informatics, Mainz, Germany, [email protected]
Jonathan Wilkes, [email protected]

Abstract

The paper reports on the latest developments of Pd-L2Ork, a fork of Pd-extended created by Ico Bukvic in 2010 for the Linux Laptop Orchestra (L2Ork). Pd-L2Ork offers many usability improvements and a growing set of objects designed to lower the learning curve and facilitate rapid prototyping. Started in 2015 by Jonathan Wilkes, Purr Data is a cross-platform port of Pd-L2Ork which has recently been released as Pd-L2Ork version 2. It features a complete GUI rewrite and Mac/Windows support, leveraging JavaScript and Node-Webkit as a replacement for Pd's aging Tcl/Tk-based GUI component.

Keywords

Pd-L2Ork, Purr Data, fork, usability, L2Ork

1 Introduction

Pure Data, also known as Pd [15], is arguably one of the most widespread audio and multimedia dataflow programming languages. Pd's history is deeply intertwined with that of its commercial counterpart, Cycling 74's Max [16]. A particular strength shared by the two platforms is in their modularized approach that empowers third party developers to extend the functionality without having to deal with the underlying engine. Perhaps the most profound impact of Pd is in its completely free and open source model that has enabled it to thrive in a number of environments inaccessible to its commercial counterpart. Examples include custom in-house solutions for entertainment software (e.g. EaPd [10]), Unity3D [18] and smartphone integration via libPd [1], an embeddable library (e.g. RjDj [11], PdDroidParty [12], and Mobmuplat [9]), and other embedded platforms, such as Raspberry Pi [4].

Pd's author Miller Puckette has spearheaded a steady development pace with the primary motivation being iterative improvement while preserving backwards compatibility. Puckette's work on Pd continues to be instrumental in fostering creativity and curiosity across generations, and as the library of works relying on Pd grows, so does the importance of conservation and ensuring that Pd continues to support even the oldest of patches. However, the inevitable side-effect of the increasingly conservationist focus of the core Pd is that any new addition has to be carefully thought out in order to account for all the idiosyncrasies of past versions and ensure there is a minimal chance of a regression. This vastly limits the development pace.

As a result, the Pd community sought to complement Pure Data's compelling core functionality with a level of polish that would lower the initial learning curve and improve user experience. In 2002 the community introduced the earliest builds of Pd-extended [13], the longest running Pd variant. There were other ambitious attempts, like pd-devel, Nova, and DesireData [14], and in recent years Pd has seen a resurgence in forks that aim to sidestep usability issues through alternative approaches, including embeddable solutions (e.g. libPd) and custom front ends. Pd-extended was probably the most popular alternative Pd version which continues to be used by many, even though it was abandoned in 2013 by its maintainer Hans-Christoph Steiner due to lack of contributors to the project.

Pd-L2Ork presents itself as a viable alternative which started out as a fork of Pd-extended and continues to be actively maintained. We begin with a discussion of Pd-L2Ork's history, motivation and implementation. We then look at Pd-L2Ork's most recent off-spring, nick-named "Purr Data", which has recently been released as Pd-L2Ork version 2, runs on Linux, Mac and Windows, and offers some unique new features, most notably a completely new and improved GUI component. The paper concludes with some remarks on availability and avenues for future developments.

2 History and Motivation

Introduced in 2009 by Bukvic, Pd-L2Ork [2] started as a Pd-extended 0.42.5 variant. The focus was on nimble development designed to cater to the specific needs of the Linux Laptop Orchestra (L2Ork), even if that meant suboptimal initial implementations that would be ironed out over time as the understanding of the overall code base improved and the target purpose was better understood through practice.

An important part of L2Ork's mission was educational outreach. Consequently, a majority of early additions to Pd-extended focused on usability improvements, including graphical user interface and editor functions. While some of these were incorporated upstream, a growing number of rejected patches began to build an increasing divide between the two code bases. As a result, in 2010 Bukvic introduced a separately maintained Pd-extended variant, named Pd-L2Ork after L2Ork, for which it was originally designed.

Over time, as the project grew in its scope and visibility, it attracted new users, and eventually a team of co-developers, maintainers and contributors formed around it. This is obviously important for the long-term viability of the project, so that it doesn't fall victim to Pd-extended's fate, and thus the development team continues to invite all kinds of contributions.

Pd-L2Ork's philosophy grew out of its initial goals and the early development efforts. It is defined by a nimble development process allowing both major and iterative code changes for the sake of improving usability and stability as quickly as possible. Another important aspect of this philosophy is releasing improvements early and often in order to have working iterations in the hands of dozens of students of varying educational backgrounds and experience, which ensured quick vetting of the ensuing solutions.

Despite an ostensibly lax outlook on backwards compatibility, to date Pd-L2Ork and Purr Data remain compatible with Pd (the -legacy flag can be used to disable some of the more disruptive changes). In particular, there haven't been any changes in the patch file format, so patches created in Pd still work without any ado in Pd-L2Ork and vice versa (assuming that they don't use any externals which aren't available in the target environment). Also, communication between GUI and engine still happens through sockets, so that the two can run in separate processes (running the engine with real-time priorities).

Like Pd-extended, Pd-L2Ork provides a single turnkey monolithic solution with all the libraries included in one package. This minimizes overhead in configuring the programming environment and installing supplemental libraries, and addresses the potential for binary incompatibility with Pd.

3 Implementation

Pd-L2Ork's code base increasingly diverges from Pd. It consists of many bug-fixes, additions and improvements, which can be split into engine, usability, documentation, new and improved objects and libraries, scaffolded learning, and rapid prototyping. In this section we highlight some of the most important user-visible changes and additions; more details can be found in the authors' PdCon paper [3].

3.1 Engine

Internal engine contributions have largely focused on implementing features and bug-fixes requested by past and existing Pd users. Some of these include patches that have never made it to the core Pd, such as the cord inspector (a.k.a. magic glass), improved data type handling logic, and support for outlier cases that may otherwise result in crashes and unexpected behavior. Additional checks were implemented for the Jack [6] audio backend to avoid hangs in case Jack freezes. Default sample rate settings are provided for situations where Pd-L2Ork may run headless (without GUI), thus removing the need for potentially unwieldy headless startup procedures.

The $0 placeholder in messages now automatically resolves to the patch instance, while the $@ argument can be used to pass the entire argument set inside a sub-patch or an abstraction.¹ [trigger]² logic has been expanded to allow for static allocation of values, which alleviates the need for creating bang triggers that are fed into a message with a static value.

¹ In Pd parlance, an abstraction is a Pd patch encapsulating some functionality to be used as a subpatch in other patches.
² Here and in the following we employ the usual convention to indicate Pd objects by enclosing them in brackets.

Visual improvements: The Tk-based [19] graphical engine has been replaced with TkPath [17], which offers an SVG-enabled antialiased canvas.³

³ SVG = Scalable Vector Graphics, a widely used vector image format standardized by the W3C.

A lot of effort went into streamlining "graph-on-parent" (Pd's facility to draw GUI elements in a subpatch on its parent), including proper bounding box calculation and detection, optimizing redraw, and resolving drawing issues with embedded graph-on-parent patches. Improvements also focused on sidestepping the limitations of the socket-based communication between the GUI and the engine, such as keyboard autorepeat detection. As a result, the [key] object can be instantiated with an optional argument that enables autorepeat filtering, while retaining backward compatibility.

Stacking order: Another substantial core engine overhaul pertains to consistent ordering of objects in the glist (a.k.a. canvas) stack. This has helped ensure that objects always honor the visual stacking order, even after undo and redo actions, and has paved the way towards more advanced functionality, including advanced editing techniques and a system-wide preset engine.

Presets: The preset engine consists of two new objects, [preset_hub] and [preset_node]. Nodes can be connected to various objects, including arrays, and can broadcast the current state to their designated hub for storing and retrieval. Multiple hubs can be used with varying contexts. The ensuing system is universal, efficient, unaffected by editing actions, and abstraction- and instance-agnostic (e.g., using multiple instances of the same abstraction is automatically supported). It supports anything from recording individual states to real-time automation of multiple parameters through periodic snapshots.

Data structures: Data structures are an advanced feature of Pd to produce visualizations of data collections such as interactive graphical scores. Pd-L2Ork enhances these with the addition of sprites and new ways to manipulate the data.

3.2 Usability

On the surface Pd-L2Ork builds on Pd-extended's appearance improvements. Under the hood, with the canvas being drawn as a collection of SVG shapes, the entire ecosystem lends itself to a number of new opportunities. The most obvious involve antialiased display, advanced shapes (e.g. Bézier curves that are also used for drawing patch cords), support for image formats with alpha channel, and advanced data structure drawing and manipulation using SVG-centric enhancements (Fig. 1).

Figure 1: Pd-L2Ork running on Linux.

A majority of usability improvements focus on the editor. The consistent stacking order implemented in the engine has served as a foundation for the infinite undo, as well as the to-front and to-back stacking options that are accessible via the right-click context menu. Lots of improvements and polishing went into the iemgui objects, such as improved positioning, enhanced properties dialogs and graph-on-parent behavior.

The old autotips patch was integrated (and improved upon). The tidy-up feature has been redesigned to offer a two-step realignment of objects. (The first key press aligns the objects on a single axis, while the second respaces them, so that they are equidistant from each other.) Intelligent patching was implemented to provide four variants of automatically generating multiple patch cords based on the user's selection, and to provide additional ways of creating multiple connections (e.g. SHIFT + mouse click). The canvas scrolling logic has been overhauled to minimize the use of scrollbars, provide minimal visual footprint, and ensure most of the patch is always visible.

Pd-L2Ork supports drag and drop and has support for pasting Pd code snippets (using Pd's "FUDI" format) directly onto the canvas. The copy and paste engine has been overhauled to improve buffer sharing across multiple application instances. The entire graphics engine is themeable and its settings are by default saved with the rest of the configuration files.

3.3 Object Libraries

Apart from the core Pd objects and improvements described in the Engine section above, Pd-L2Ork offers a growing number of revamped objects while also pruning redundant and unnecessary objects.

Special attention was given to supporting the Raspberry Pi (RPi) platform with a custom set of objects designed specifically to harness the full potential of the RPi GPIO and I2C interfaces, including [disis_gpio] and [disis_spi] [4]. The cyclone library has received new documentation and a growing number of bugfixes and improvements. Ggee library's [image] has received a significant overhaul and became the catchall solution for image manipulation. In addition to the standard Pd-extended libraries, Pd-L2Ork has reintroduced [disis_munger~] and an upgraded version of the [fluid~] soundfont synth external, which depend on the flext library. Other libraries include fftease, lyonpotpourri, and RTcmix~. Pd-L2Ork bundles advanced networking externals [disis_netsend] and [disis_receive], convenience externals like [patch_name], and abstractions (e.g., those of the K12 learning module [5], and a growing number of L2Ork-specific abstractions designed to foster rapid prototyping). A few libraries have been removed due to lack of support and/or GUI object implementations that utilize hardwired Tcl-specific workarounds.

3.4 Introspection

Most interpreted languages have mechanisms to do introspection. Pd-L2Ork features a collection of "info" classes for retrieving the state of the program on a number of levels, from the running Pd instance to individual objects within patches. Four classes provide the basic functionality:

• [pdinfo] reflects the state of the running Pd instance, including dsp state, available/connected audio and midi devices, platform, executable directory, etc.

• [canvasinfo] is a symbolic receiver for the canvas, abstraction arguments, patch filename, list of current objects, etc. The object takes a numeric argument to query the state of parent or ancestor canvases.

• [classinfo] offers information about the currently loaded classes in the running instance. This includes creator argument types, as well as the various methods.

• [objectinfo] returns bounding box, class type, and size for a particular object on the canvas.

While the introspection provided by these classes is relatively rudimentary, it alleviates the need for a large number of external libraries that add missing core functionality. For example, Pd-L2Ork ships with several compiled externals whose purpose is to fetch the list of abstraction arguments. These externals all have different interfaces and are spread across various libraries. Having one standard built-in interface for fetching arguments that behaves similarly to other introspection interfaces improves the usability of the system. Furthermore, opening up rudimentary introspection to the user increases the composability of Pd. Functionality that previously only existed inside the C code can now be implemented as an abstraction (i.e., in Pd itself). These don't require compilation and are more accessible to a wider number of users to test and improve.

4 Purr Data a.k.a. "The Cat"

Despite all of the improvements it brings to the table, Pd-L2Ork still employs the same old Tcl/Tk environment to implement its graphical user interface. This is both good and bad. The major advantage is compatibility with the original Pd. On the other hand, Tcl/Tk looks and feels quite dated as a GUI toolkit in this day and age. Tcl is a rather basic programming language and its libraries have been falling behind, making it hard to integrate the latest GUI, multimedia and web technologies. Last but not least, Pd-L2Ork's adoption was severely hampered by the fact that it relies on some lesser-used Tcl/Tk extensions (specifically, TkPath and the Tcl Xapian bindings) which are not well-supported on current Mac and Windows systems, and thus would have required substantial porting effort to make Pd-L2Ork work there.

Purr Data was created in 2015 by Wilkes to address these problems.⁴ The basic idea was to replace the aging Tcl/Tk GUI engine with a modern, open-source, well-supported cross-platform framework supporting programmability and the required advanced 2D graphical capabilities, without being tied into a particular GUI toolkit again.

⁴ Readers may wonder about the nick-name of this Pd-L2Ork offspring, to which the author in his original announcement at http://forum.pdpatchrepo.info only offered the explanation, "because cats." Quite obviously the name is a play on "Pure Data" on which "Purr Data" is ultimately based, but it also raises positive connotations of soothing purring sounds.

Employing modern web technologies seemed an obvious choice to achieve those goals, as they are well-supported, cross-platform and toolkit-agnostic, programmable (via JavaScript), and offer an extensive programming library and built-in SVG support (as a substitute for Pd-L2Ork's use of TkPath, which incidentally follows the SVG graphics model).

There are basically two main alternatives in this realm, nw.js⁵ a.k.a. "node-webkit" and Electron⁶. These both essentially offer a stand-alone web browser engine combined with a JavaScript runtime. nw.js was chosen because it offers some technical advantages deemed important for Purr Data (in particular, an easier interface to create multi-window applications and better support for legacy Windows systems).

⁵ https://nwjs.io/
⁶ https://electron.atom.io/

So, in a nutshell, Purr Data is Pd-L2Ork with the Tcl/Tk GUI part ripped out and replaced with nw.js. Purr Data's GUI is written entirely in JavaScript, which is a much more advanced programming language than Tcl, with an abundance of libraries and support materials. Patches are implemented as SVG documents which are generally much more responsive and offer better graphical capabilities than Tk windows. They can also be zoomed to 16 different levels and themed using CSS, improving usability. The contents of a patch window is drawn and manipulated using the HTML5 API. Thus the code to display Pd patches is very portable and will work in any modern GUI toolkit that has a webview widget.

There are also some disadvantages with this approach. First, Tcl code in Pd's core and in the externals needs to be ported to JavaScript to make it work with the new GUI; we'll touch on this in the following subsection. Second, the size of the binary packages is much larger than with Pd-L2Ork or Pd-extended since, in order to make the packages self-contained, they also include the full nw.js binary distribution. This is a valid complaint about many of the so-called "portable desktop applications" being offered these days, but in the case of Purr Data it is mitigated by the fact that plain Pd-L2Ork is not exactly a slim package either. Third, the browser engine has a much higher memory footprint than Tcl/Tk, which might be an issue on embedded platforms with very tight memory constraints.

So far, none of these issues has turned out to be a major road-block in practice. The most serious issue we're facing right now probably is that externals using Pd's Tcl/Tk facilities need to have their GUI code rewritten to make it work with Purr Data; this is a substantial undertaking and thus hasn't been done for all bundled externals yet.

4.1 Implementation

Using JavaScript in lieu of Tcl as the GUI programming language poses some challenges. Tcl commands with Tk window strings are hard-coded into the C source files of Pd. This means that any port to a different toolkit must either replace those commands with an abstract interface, or write middleware that turns the hard-coded Tcl strings into abstract commands. Given the complexity of Tcl commands in both the core and external libraries, that middleware would essentially have to re-implement a large part of the Tcl interpreter.

Consequently, Purr Data opted for the former approach of directly implementing an abstract interface. This takes the form of a JavaScript API providing the necessary GUI tie-ins to the engine and externals, which is called from the C side using a new set of functions (gui_vmess et al.) which replace the corresponding functions of Pd's C API (sys_vgui etc.). As already mentioned, this means that externals which use these facilities need to have their GUI code rewritten to make it work with the new GUI. (Affected externals will work, albeit without their GUI features.)

Adding to the porting difficulty is the fact that Pd has no formal specification, and its GUI interface follows no common design pattern for 2D graphics. For example, the graph-on-parent window appears at a glance as a viewport that clips to a specified bounding box. However, the bounding box itself behaves inconsistently: for built-in widgets like [hslider] or [bng] it clips (per widget, not per pixel), but for graphed arrays, data structure visualizations, and widget labels it does no clipping at all.

To get to grips with these problems, Purr Data's JavaScript GUI implementation draws and manipulates Pd patch windows using the HTML5 API, which is widely documented and used. The Pd canvas itself is implemented as an SVG document. SVG was chosen because it is a mature, widely-used 2D API. Also, larger canvas sizes have little to no performance impact on the responsiveness of the graphics. Since Pd patches can be large, this makes SVG a better choice for drawing a Pd canvas than the standard HTML5 canvas.
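As a rough illustration of the sys_vgui-to-gui_vmess porting pattern described above (this sketch is ours, not taken from the Purr Data sources; the message name gui_canvas_rect and the format string are invented for illustration, and both signatures are paraphrased):

/* Old style: a literal Tcl/Tk command string hard-coded in Pd's C code. */
sys_vgui(".x%lx.c create rectangle %d %d %d %d\n",
         (long)canvas, x1, y1, x2, y2);

/* New style: an abstract, toolkit-agnostic GUI message; the JavaScript
   side decides how to render it (e.g., as an SVG rect in the patch
   window). */
gui_vmess("gui_canvas_rect", "xiiii", canvas, x1, y1, x2, y2);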

4.2 Leveraging HTML5 and SVG to Improve Pd Data Structures

Purr Data employs a small subset of the SVG specification to implement quite substantial improvements to data structure visualization. Inheriting from a pre-existing standards-based 2D API has several advantages over an ad-hoc approach. First, if implemented consistently, the existing SVG documentation can be used to test and teach the system. Second, it is not necessary to immediately understand all the design choices of the entire specification in order to implement parts of it. Since those parts have been used and tested in a variety of mature applications, it makes it easier to avoid mistakes that often riddle designs made by developers who aren't graphics experts. Finally, there is less risk of a standards-based API becoming abandoned than a more esoteric API.

To improve data structure visualizations, several [draw] commands were added to support the basic shape/object types in SVG. The currently supported types are circle, ellipse, rect, line, polyline, polygon, path, image, and g.⁷ Each has a number of methods which map directly to SVG graphical attributes. Methods were also added for Document Object Model (DOM) events to trigger notifications to the outlet of each object.

⁷ The latter g element denotes a "group", which is implemented as a special kind of subpatch that allows the attributes of several [draw] commands to be changed simultaneously.

The screenshot in Fig. 2 shows the "SVG tiger" drawn from a few hundred paths found inside the [draw g] object. Even though the drawing is complex, Purr Data caches the bounding box for the tiger object to prevent the hit-testing from causing dropouts. One can mouse over the tiger and trigger real-time audio synthesis.

Figure 2: Interactive SVG data structure.

It is also possible to set parameters for most of the [draw] attributes. For instance, the message opacity z can be sent to set a shape's opacity to be whatever the value of the field z happens to be for a particular instance of the data structure. As soon as the value of z changes, Pd then automatically updates the opacity of the corresponding shape accordingly.

4.3 Custom GUI Elements

As the SVG tiger example shows, Purr Data makes it possible to bind HTML5 DOM events to SVG shapes. Reporting the events is not enabled by default, but can be switched on by simply sending the appropriate Pd message to the [draw] object, such as the mouseover 1 message in Fig. 2. Each [draw] object has an outlet which then emits messages when events like mouse-over, movements and clicks are detected. It goes without saying that this considerably expands Pd's capabilities to deal with user interactions, e.g., if the user wants to modify elements of a graphical score in real-time. But it also paves the way for enabling users to design any kind of GUI element in plain Pd, without having to learn a "real" programming language and its frameworks.

For instance, Fig. 3 shows a collection of three knobs drawn using the new SVG [draw] commands, whose values (represented by the r field in the nub data structure, which is linked to the rotation angles of the knobs) can be manipulated by dragging the mouse up or down. The values can then be read from the data structure using Pd's built-in [get] object and used for whatever purpose, just like with any of the built-in GUI elements.

Figure 3: Custom GUI elements.

Pd offers a rather limited collection of built-in GUI elements to be used in patches, and extending that collection needs a developer proficient in both C and Tcl/Tk. Purr Data's new SVG visualizations totally change the game, because any Pd user can do them without specialized programming knowledge. We thus expect the facilities sketched out above to be used a lot by Pd users who want to enrich their patches with new kinds of GUI elements. As soon as it becomes possible to conveniently package such custom GUI elements as graph-on-parent abstractions, we hope to see the proliferation of GUI element libraries which can then be used by Pd users and modified for their own purposes.

5 Getting Pd-L2Ork

The sources of Pd-L2Ork and Purr Data are currently being maintained in two separate git repositories.⁸ There are plans to merge the two repositories again at some point, so that both versions will become two branches in the same repository, but this has not happened yet.

⁸ cf. https://github.com/pd-l2ork/pd and https://git.purrdata.net/jwilkes/purr-data

For Purr Data there is also a Github mirror available at https://agraef.github.io/purr-data/. This is mainly used as a one-stop shop to make it easy for users to get their hands on the latest source and the available releases, including pre-built packages for Linux, macOS and Windows.

Because of Pd-L2Ork's addons and its comprehensive set of bundled externals, the software has a lot of dependencies and a fairly complicated (and time-consuming) build process. So, while the software can be built straight from the source, it is usually much easier to use one of the available binary packages:

• Virginia Tech's official Pd-L2Ork packages are available at http://l2ork.music.vt.edu/main/make-your-own-l2ork/software/.

• Jonathan Wilkes' Purr Data packages can be found at https://github.com/agraef/purr-data/releases.

• JGU also offers Pd-L2Ork and Purr Data packages for Ubuntu and Arch Linux. Web links and installation instructions can be found at http://l2orkubuntu.bitbucket.org/ and http://l2orkaur.bitbucket.org/, respectively.

The JGU packages can be installed alongside each other, so that you can run both "classic" Pd-L2Ork and Purr Data on the same system. (This may be useful, e.g., if you plan to use Pd-L2Ork's K12 mode which has not been ported to Purr Data yet.) We mention in passing that JGU's binary package repositories also contain Pd-L2Ork and Purr Data versions of the Faust and Pure extensions which further enhance Pd's programming capabilities.⁹

⁹ Grame's Faust and JGU's Pure are two functional programming languages geared towards signal processing and multimedia applications [7,8].

6 Future Work

After Purr Data's initial release as Pd-L2Ork 2.0 in February 2017, "classic" Pd-L2Ork has become version 1.0 and went into maintenance mode. While development will continue on the Purr Data branch, we will keep the original Pd-L2Ork available until all of Pd-L2Ork's features have been ported or have suitable replacements in Purr Data.

Purr Data has matured a lot in the past few months, but like any project of substantial size and complexity it still has a few bugs and rough edges we want to address after the initial release, in particular:

• Port the remaining missing features from Pd-L2Ork (autotips and K12 mode).

• Port legacy Tcl code that is still present in the GUI features of some of the 3rd party externals.

• Some code reorganization is in order, along with a complete overhaul of the current build system.

One interesting direction for future research is leveraging the new SVG visualizations as a means to create custom GUI elements in plain Pd, i.e., as ordinary Pd abstractions. This will make it much easier for users to create their own GUI elements, and will hopefully encourage community contributions resulting in libraries of custom GUI objects ready to be used and modified by Purr Data users.

With the expansion onto other platforms, Pd-L2Ork's key challenge is ensuring sustainable growth. As with any other open-source project of its size and scope, this can only be achieved through fostering greater community participation in its development and maintenance, so please do not hesitate to contact us if you would like to help!

7 Acknowledgements

The authors would like to thank the original Pd author Miller Puckette, and the numerous community members who have complemented the Pd ecosystem with their own creativity and contributions, including Hans-Christoph Steiner and Mathieu Bouchard. We would also like to thank the L2Ork sponsors and stakeholders without whose support Pd-L2Ork would have never been possible nor sustainable.

References

[1] P. Brinkmann, P. Kirn, R. Lawler, C. McCormick, M. Roth, and H.-C. Steiner. Embedding Pure Data with libpd. In Proceedings of the Pure Data Convention, volume 291. Citeseer, 2011.

[2] I. Bukvic, T. Martin, E. Standley, and M. Matthews. Introducing L2Ork: Linux Laptop Orchestra. In New Interfaces for Musical Expression, pages 170–173, 2010.

[3] I. Bukvic, J. Wilkes, and A. Gräf. Latest developments with Pd-L2Ork and its development branch Purr-Data. PdCon 2016, New York, NY, USA. http://ico.bukvic.net/PDF/PdCon16_paper_84.pdf, 2016.

[4] I. I. Bukvic. Pd-L2Ork Raspberry Pi Toolkit as a Comprehensive Arduino Alternative in K-12 and Production Scenarios. In NIME, pages 163–166, 2014.

[5] I. I. Bukvic, L. Baum, B. Layman, and K. Woodard. Granular Learning Objects for Instrument Design and Collaborative Performance in K-12 Education. In New Interfaces for Music Expression, pages 344–346, Ann Arbor, Michigan, 2012.

[6] P. Davis and T. Hohn. Jack audio connection kit. In Proc. Linux Audio Conference, volume 3, pages 245–256, 2003.

[7] A. Gräf. Signal Processing in the Pure Programming Language. In Proceedings of the 7th International Linux Audio Conference, pages 137–144, Parma, 2009. Casa della Musica.

[8] A. Gräf. Pd-Faust: An integrated environment for running Faust objects in Pd. In Proceedings of the 10th International Linux Audio Conference, pages 101–109, Stanford University, California, US, 2012. CCRMA.

[9] D. Iglesia. MobMuPlat (iOS application). Iglesia Intermedia, 2013.

[10] K. Jolly. Usage of Pd in Spore and Darkspore. In PureData Convention, 2011.

[11] J. Kincaid. RjDj Generates An Awesome, Trippy Soundtrack For Your Life. http://social.techcrunch.com/2008/10/13/rjdj-generates.

[12] C. McCormick, K. Muddu, and A. Rousseau. PdDroidParty: Pure Data patches on Android devices. Retrieved January 21, 2014.

[13] [PD-announce] MacOSX installers for pd 0.36 and pd 0.36 extended (CVS).

[14] pd forks WAS: Keyboard shortcuts for "nudge", "done editing". http://permalink.gmane.org/gmane.comp.multimedia.puredata.general/79646.

[15] M. Puckette. Pure Data: another integrated computer music environment. In Proceedings, International Computer Music Conference, pages 37–41, 1996.

[16] M. Puckette. Max at seventeen. Computer Music Journal, 26(4):31–43, 2002.

[17] TkPath. http://tclbitprint.sourceforge.net/.

[18] Unity - Game Engine. https://unity3d.com.

[19] B. B. Welch. Practical Programming in Tcl and Tk, volume 3. Prentice Hall, Upper Saddle River, 1995.

Posters / SpeedGeeking


Impulse-Response and CAD-Model-Based Physical Modeling in Faust

Pierre-Amaury Grumiaux¹, Romain Michon², Emilio Gallego Arias¹, and Pierre Jouvelot¹
¹ MINES ParisTech, PSL Research University, France
² CCRMA, Stanford University, USA

Abstract

We present a set of tools to quickly implement modal physical models in the Faust programming language. Models can easily be generated from any impulse response or 3D graphical representation of a physical object. This system targets users with little knowledge in physical modeling who are willing to use this type of synthesis technique in a musical context.

Keywords

Physical Modeling Synthesis, Faust Language, Digital Signal Processing

1 Introduction

The Faust programming language has proven to be well suited to implement physical models of musical instruments [Michon and Smith, 2011] using waveguides [Smith, 2010] and modal synthesis [Adrien, 1991]. In this short paper, we present two Python scripts¹ allowing us to easily generate Faust modal physical models: ir2dsp.py and mesh2dsp.py.

• ir2dsp.py takes an audio file containing an impulse response as its main argument and converts it into a Faust file implementing the corresponding modal physical model.

• mesh2dsp.py outputs the same type of model but takes an .stl² file containing the specification of any 3D object designed with a CAD³ program as its main argument.

Faust programs generated by ir2dsp.py and mesh2dsp.py are ready to use and can be compiled to any of the Faust targets (standalone applications, plug-ins, etc.). After briefly describing these two tools, we'll evaluate them and provide directions for future works.

¹ https://github.com/rmichon/pmFaust/ – All URLs in this paper were verified on 07/04/2017.
² STereoLithography.
³ Computer-Aided Design.

2 Faust Modal Physical Model

Any linear percussion instrument can be implemented using a bank of resonant bandpass filters [Smith, 2010]. Each filter implements one mode (a sine or cosine function) of the system and can be configured by providing three parameters: the frequency of the mode, its gain, and its resonance duration (T60). Such a filter can be easily implemented in Faust using a biquad filter (tf2) and by computing its poles and zeros for a given frequency (f) and T60 (t60):

modeFilter(f,t60) = tf2(b0,b1,b2,a1,a2)
with{
  b0 = 1;
  b1 = 0;
  b2 = -1;
  w = 2*PI*f/SR;
  r = pow(0.001,1/float(t60*SR));
  a1 = -2*r*cos(w);
  a2 = r^2;
};

mode(f,t60,gain) = modeFilter(f,t60)*gain;

The modeFilter function can be easily applied in parallel in Faust using the par operator to implement any modal physical model:

model = _ <: par(i,nModes,
    mode(freq(i),t60(i),gain(i))) :> _;

The Faust-generated block diagram corresponding to this code, with nModes = 4, freq(i) = 100*(i+1), and (t60(i),gain(i)) as successively (0.9,0.9), (0.8,0.9), (0.6,0.5) and (0.5,0.6), can be visualized in Figure 1.

Figure 1: Block diagram of a Faust modal physical model.
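Reading the code above in transfer-function form (this restatement is ours; the paper only gives the coefficients), the mode filter is the two-pole resonator

H(z) = \frac{1 - z^{-2}}{1 - 2r\cos(\omega)\,z^{-1} + r^{2}z^{-2}}, \qquad \omega = \frac{2\pi f}{SR}, \qquad r = 0.001^{1/(t_{60}\,SR)}

i.e., a conjugate pole pair of radius r at angles ±ω; since 0.001 corresponds to -60 dB, the impulse response decays by 60 dB after t60 seconds.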

This type of model can be easily excited by a filtered noise impulse (see Figure 2). The cutoff frequencies of the lowpass and highpass filters can be used to excite specific zones of the spectrum of the model and to choose the "excitation position." Since this system is linear, the same behavior could be achieved by scaling the gain of the different modes, but the filter approach that we use here will better integrate into our modular physical modeling synthesis toolkit, briefly presented in §6.

Figure 2: Excitation generator algorithm used to drive our modal physical models (white noise, lowpass, highpass, and envelope stages feeding the model).

3 ir2dsp.py

ir2dsp.py takes an audio file containing an impulse response as its main argument. After performing the Fast Fourier Transform (FFT) on it, mode information is extracted by carrying out peak detection. The T60 of each mode is computed by measuring its bandwidth at -3 dB. The mode information is formatted by ir2dsp.py to be plugged into a generic modal Faust physical model similar to the one described in §2. The output of the Python program is a ready-to-use Faust file implementing the model.

The goal of this tool is not to create very accurate models but rather to be able to strike any object (e.g., a glass, a metal bar, etc.), record the resulting sound, and turn it into a playable digital musical instrument.

4 mesh2dsp.py

The output of mesh2dsp.py is the same as that of ir2dsp.py (see §3), but it takes a .stl file as its input instead of an impulse response. .stl is a common format supported by most CAD programs to export the description of 3D objects. After converting the provided .stl file into a mesh, mesh2dsp.py performs a Finite Element Analysis (FEA) using Elmer.⁴ Various parameters such as the Young modulus, the Poisson coefficient, and the density of the material of the object must be provided to carry out this task. The result of the analysis is a set of eigenvalues and mass participations for each mode. Eigenvalues are then converted to mode frequencies, and mass participations to mode gains. Unfortunately, this technique doesn't allow us to calculate the T60 of the modes, which can instead be configured by the user directly from the Faust program.

⁴ https://www.csc.fi/web/elmer/
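The eigenvalue-to-frequency conversion mentioned above follows the standard relation of undamped modal analysis (our restatement; the paper does not write it out): for stiffness matrix K and mass matrix M,

K\,\phi_i = \lambda_i\,M\,\phi_i, \qquad \lambda_i = \omega_i^{2}, \qquad f_i = \frac{\sqrt{\lambda_i}}{2\pi}

so each eigenvalue reported by the FEA maps directly to a mode frequency.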

5 Evaluation

To evaluate the accuracy of ir2dsp.py, we recorded the impulse response of a can and generated its corresponding modal physical model. Figure 3 shows the spectrogram of the impulse response of the can, and Figure 4 the spectrogram of the impulse response of the physical model generated by ir2dsp.py. ir2dsp.py was configured to detect peaks at a minimum value of -20 dB and at least 100 Hz spaced from each other. We see that the synthesized sound is pretty close to the recorded version. T60s are not perfectly accurate since they were calculated by measuring the bandwidth of the mode. Tracking their evolution in the time domain would provide better results; thus we plan to use this technique in the future instead.

Figure 3: Spectrogram of an impulse response of a can.

Figure 4: Spectrogram of the output of the modal model generated with ir2dsp.py from a can IR.

mesh2dsp.py was tested with the geometric 3D model of a solid bar and provided good subjective auditory results. More objective analysis is clearly needed here.

6 Future Work

This work has been carried out as part of a larger project on designing a physical modeling toolkit for the Faust programming language. ir2dsp.py and mesh2dsp.py will be integrated into it.

We plan to improve ir2dsp.py by using a better T60 measurement algorithm. Indeed, the T60 of each mode is currently computed by measuring its bandwidth after taking the FFT of the entire impulse response. A better approach would be to extract this information from a time-frequency representation of the signal (i.e., a spectrogram), which would be more accurate.

Finally, we would like to try other open-source packages than Elmer to carry out the FEA in mesh2dsp.py to get better results and to smooth its integration in our Faust physical modeling toolkit.

7 Conclusion

We presented a series of prototype tools allowing the design, at a very high level, of ready-to-use physical models of musical instruments. Models can be generated from impulse responses or 3D graphical representations of physical objects. While the models generated by this system are far from being accurate, we believe that it provides a convenient way for composers and musicians to design expressive custom instruments usable in a musical context.

8 Acknowledgements

Our thanks go to Yann Orlarey for his help with the use of Faust.

References

Jean-Marie Adrien. 1991. The missing link: Modal synthesis. In Representations of Musical Signals, chapter The Missing Link: Modal Synthesis, pages 269–298. MIT Press, Cambridge, USA.

Romain Michon and Julius O. Smith. 2011. Faust-STK: a set of linear and nonlinear physical models for the Faust programming language. In Proceedings of the 14th International Conference on Digital Audio Effects (DAFx-11), pages 199–204, Paris, France, September.

Julius Orion Smith. 2010. Physical Audio Signal Processing for Virtual Musical Instruments and Digital Audio Effects. W3K Publishing.


Fundamental Frequency Estimation for Non-Interactive Audio-Visual Simulations

Rahul AGNIHOTRI, Romain MICHON and Timothy S. O'BRIEN
CCRMA (Center for Computer Research in Music and Acoustics), Stanford University, Stanford, CA 94305
{ragni, rmichon, tsob}@ccrma.stanford.edu

Abstract

This paper aims to demonstrate the use of fundamental frequency estimation as a gateway to create a non-interactive system where computers communicate with each other using musical notes. Frequency estimation is carried out using the YIN algorithm, implemented using the ofxAubio add-on in openFrameworks. Since we have used only open-source technologies for the implementation of this project, it can be executed on any platform: Linux, Macintosh OS or Windows OS. As a simulation, this system is used to depict a conversation between people at a round-table event and presented as an audiovisual art installation.

Keywords

Pitch Estimation, MIR, YIN algorithm, Aubio, Audio Simulation.

1 Introduction

Pitch detection algorithms have been used in various contexts in the past:

• audio editing programs (pitch correction and time scaling) such as Melodyne¹,

• analysis of complicated melodies of world music cultures (Indian Classical music),

• music notation programs like Sibelius²,

• MIDI interfaces such as the Roland GI-20 to get data from guitar MIDI pickups.

Since it lies firmly within the domain of music information retrieval (MIR), pitch estimation has many applications in recommender systems, sound source separation, genre categorization and even music generation. With the popularity of machine learning, neural networks, and data mining, audio signal processing tools are utilized more and more to create user-specific systems. When considering applications for pitch detection and source separation, we often consider the example of identifying individual speakers at a round-table event. Computers, unlike humans, have a difficult time identifying the words of a particular person if multiple people are communicating with each other simultaneously. Building on that concept, we simulate a conversation between multiple computers using musical notes. The YIN pitch detection algorithm is employed to detect trigger notes, to which other computers in the network respond. Since we desire a continuous conversation, we must use an efficient detection algorithm with the following features:

• real-time response,

• minimal latency,

• accurate identification in the presence of noise.

We need to be careful about both the latency of the attack and of the pitch detection algorithm since, if a note is played, the human ear needs at least seven periods of a waveform to identify its pitch. Hence, note onsets and note pitches are not directly related [3]. Additionally, we must ensure that the pitch recognition algorithm is reasonably robust to the sort of noise which is inevitable in a performance scenario.

After comparing different methods for pitch estimation, as in [3], we chose the YIN algorithm for its real-time tracking ability. YIN is a time-domain algorithm based on the autocorrelation method for estimation [2]. Using the common autocorrelation method, its error rates are analyzed and corrected for every new iteration to ensure the best possible accuracy.
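For reference, the core of the algorithm as defined in [2] (this summary is ours, not part of the original text) replaces raw autocorrelation with a difference function over an integration window of size W, together with its cumulative mean normalized form:

d_t(\tau) = \sum_{j=1}^{W} (x_j - x_{j+\tau})^2, \qquad
d'_t(\tau) = \begin{cases} 1 & \tau = 0 \\ d_t(\tau) \Big/ \left[ \frac{1}{\tau} \sum_{j=1}^{\tau} d_t(j) \right] & \text{otherwise} \end{cases}

The estimated period is the smallest \tau for which d'_t(\tau) falls below an absolute threshold, refined by parabolic interpolation.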

Using YIN is beneficial since it can accurately analyze higher frequencies, which we might use as trigger notes in our system. To make sure that we have the lowest possible pitch identification latency and a very small frame size for incoming audio, we use the YIN algorithm implemented in the aubio framework [1], extended further as an addon in openFrameworks³, for our computer network.

¹ http://www.celemony.com/en/melodyne/what-is-melodyne. URLs in this paper were verified on Feb. 16, 2017.
² http://www.avid.com/sibelius
³ See https://github.com/aubio/ofxAubio and http://openframeworks.cc/
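A minimal sketch of how such a trigger-note check might look inside an openFrameworks app using the ofxAubio pitch tracker (our sketch, not the installation's actual code: the setup()/audioIn() calls and the latestPitch member follow the add-on's published pitch example, but treat the exact signatures, and the triggerNote value, as assumptions):

#include "ofMain.h"
#include "ofxAubio.h"
#include <cmath>

class ofApp : public ofBaseApp {
public:
    ofxAubioPitch pitch;      // YIN-based pitch tracker from ofxAubio
    int triggerNote = 74;     // hypothetical trigger note (MIDI D5)

    void setup() {
        // "yin" method, small hop size for low identification latency;
        // the sound stream setup feeding audioIn() is elided here.
        pitch.setup("yin", 512, 256, 44100);
    }

    void audioIn(float *input, int bufferSize, int nChannels) {
        pitch.audioIn(input, bufferSize, nChannels);
        float hz = pitch.latestPitch;   // latest estimate, in Hz
        if (hz > 0) {
            // convert Hz to the nearest MIDI note number
            int midi = (int)lroundf(69 + 12 * log2f(hz / 440.0f));
            if (midi == triggerNote) {
                // react: go silent and shift our scale, as in Sec. 2.1
            }
        }
    }
};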

2 Methods and Implementation

Since fundamental frequency identification works well on monophonic sounds (e.g., a guitar solo or any wind instrument), it was decided to use this approach to reduce the problem of incorrect triggers for the other computers in the network. Individual notes are generated one after the other at randomized tempos (to mimic the prosody of human speech) in a particular musical scale. Each computer has its own "voice" which does not overlap with that of any other computer in the network. At a given point in time, multiple "people" can "speak" simultaneously.

The YIN algorithm is accurate enough to sufficiently identify the trigger notes at any given time within the chaotic yet pleasing tone of the conversation, and the computers react accordingly. Visual feedback is provided to portray whether a computer is voicing itself or is remaining silent in response to the trigger note. For the test system, three computers, each with their own voice, constituted the network.

2.1 Audio

Each voice for the "speaker" in the system is a musical scale. For our demonstration, the computers were set to the G, C and D major scales respectively; the choice of these particular scales was arbitrary. The ofxStk⁴ add-on was used to generate the sounds using the Moog synthesis class. The generated sound has a reverb effect applied to it in order to create a wider stereo image when all computers are in place.

⁴ https://github.com/Ahbee/ofxStk, which encapsulates the original Synthesis ToolKit, https://ccrma.stanford.edu/software/stk/

In order to mimic the characteristics of human speech, there is no fixed tempo for the system. When a particular trigger note is heard by a computer, it goes silent and shifts its scale by an octave higher or lower and also changes its corresponding trigger note in order to keep the whole system ambiguous. The ambiguity lies in the fact that when all computers are communicating simultaneously, it is difficult to identify what the actual trigger note is, which maintains the illusion of an improvised conversation. The exact flow of the network is represented in Figure 1; this flow remains constant for any new computer added to the system.

Figure 1: Overview of the system.

2.2 Graphics

Graphical feedback is used to convey whether a particular computer is active or silenced. We used the ofxParticles add-on to implement particle physics in order to visually represent the current state of our system.

Figure 2: Screenshot of the particle physics graphics

White particles at any given instant represent an "inactive" state of the computer, and colored particles are used when the computer is active (as shown in Figure 2). The color of the particles for each computer changes when it becomes active after being inactive for a random amount of time, as if joining the conversation again to state its opinion.

Particles are emitted from the exact center, spiral outward, and eventually fade out. There is a pseudo-gravitation effect that makes the particles orbit around the center of the screen after their spiral trajectory. We designed this visual tool to create a psychedelic effect for the audience.

2.3 Evaluation

When presenting this system to a group of observers, it was noticed that despite having an underlying pattern to the changes in scale and trigger conditions, they could not detect this, and the audience expressed that they believed the computers were having a conversation, albeit through musical notes. The audience also responded that having visual feedback gave each computer a unique personality.

3 Future Work

The future scope of this project involves the integration of machine learning in order to implement musical improvisations as a response to the trigger conditions. This would make the whole system more expressive (beyond trivial actions such as having the computer go silent or be active, etc.). Polyphonic sound identification is also a viable addition, to create a more immersive experience of musicians improvising with one another.

4 Conclusion

In this paper we presented pitch estimation as a tool for a musical performance system. Since the future scope does involve the use of machine learning and data mining techniques, as is the custom with music information retrieval, this was a relevant stepping stone. Presently, this system is being modified to include multiple triggers for the whole system. We are trying to move away from the initial condition of having only one computer respond to a single trigger note, and instead have multiple computers react to more than one trigger added into the network.

5 Acknowledgements

Many thanks to the CCRMA community for its support.

References

[1] P. Brossier. Automatic Annotation of Musical Audio for Interactive Applications. PhD thesis, Centre for Digital Music, Queen Mary University of London, 2006.

[2] Alain de Cheveigné and Hideki Kawahara. YIN, a fundamental frequency estimator for speech and music. The Journal of the Acoustical Society of America, 111(4):1917–1930, 2002.

[3] P. De La Cuadra. Efficient pitch detection techniques for interactive music. In International Computer Music Conference, pages 403–406, 2001.


Porting WDL-OL to LADSPA/LV2

Jean-Jacques Girardot
Independent programmer
25 Rue Pierre Bérard, 42000 Saint-Etienne, France
[email protected]

Abstract

WDL-OL is an open source framework that is used to develop audio plug-ins in various formats (VST, VST3, AU, RTAS, AAX) for Mac and Windows operating systems. The proposition is to add the possibility to develop plug-ins in LADSPA and LV2 formats under Linux.

Keywords

Audio Plug-ins, WDL-OL, LADSPA, LV2

1 Introduction

WDL / IPlug is a simple-to-use C++ framework for developing cross-platform audio plugins and targeting multiple plugin APIs with the same code. Originally developed by Schwa/Cockos, IPlug has been enhanced by various contributors, in particular Oliver Larkin, whose version seems to be the most used. IPlug depends on WDL, and that is why this project is called WDL-OL, although most of the differences from Cockos' WDL are to do with IPlug. The source code for the framework can be downloaded from the WDL repository [1]. There also exists a very active discussion list about WDL [2].

2 The Geek Meeting

The author has developed a few freeware plug-ins (pictured here: ClockWise Orange, which manages multi-tap multi-delays, and Strawberry Feel, which provides a graphical audio language) using WDL-OL, and he wishes to make them available to the Linux community. Some other plug-in authors may wish to do the same thing, and therefore add to the plug-in offering under Linux. For this, we need to add LADSPA/LV2 support to WDL-OL (for DAWs, but also for tools like ChucK), and to find people knowledgeable on the subject and willing to help us in this development.

References

[1] WDL Git Repository: https://github.com/olilarkin/wdl-ol

[2] WDL discussion forum: http://forums.cockos.com/forumdisplay.php?f=32


Workshops



1. Yoshimi and the reluctant developer by Will Godfrey (United Kingdom)

A workshop overview of my involvement with the Yoshimi soft-synth, discussion and current status, including demonstrations. See https://sourceforge.net/projects/yoshimi/

2. Free Software and DIY at Radio Panik by Frederic Peters, Arthur Lacomme and Suzie Suptille (Belgium)

Radio Panik is a community FM radio created in Brussels in 1983; it has been using, adapting and creating free software for much of its activity for years. This workshop aims to explore and discuss our current audio practices, from broadcast systems to creative tools. See http://www.radiopanik.org/

3. Building a local linux audio community by Daniel Appelt (Germany)

The Open Source Audio Meeting Cologne is a monthly gathering of audio and free software enthusiasts. This workshop provides insights into how such a regular event may be organized. See http://cologne.linuxaudio.org

4. Generative Music with Recurrent Neural Networks by Kosmas Giannoutakis (Institut für Elektronische Musik und Akustik, Inffeldgasse, Graz, Austria)

In this workshop the generative music capabilities of artificial recurrent neural networks will be explored, using an abstractions library for the programming environment Pure Data, called RNMN (Recurrent Neural Music Networks). The library provides the basic building blocks, neurons and synapses, which can be arbitrarily connected, easily and conveniently, creating compound topologies. The framework allows real-time signal processing, which permits direct interactions with the topologies and quick development of musical intuition. For the workshop, participants require a laptop with a built-in microphone, Pure Data installed (PD-vanilla 0.47.1 is recommended) and headphones. Experience with visual programming is a plus but not a necessary prerequisite. The basic principles of the framework will be explained and the construction of some basic topologies will be demonstrated. In the end the participants can create their own topologies, which they can demonstrate to the other participants.

5. Origin, features and roadmap of the MOD* Duo by Mauricio Dwek, Gianfranco Ceccolini & Filipe Coelho (Germany)

(* Musical Operating Devices For Experienced Musicians)

In this workshop, MOD Devices tells its story and shows its heavy use of Linux Audio technologies for the MOD Duo. The workshop will consist of:

• History on how the MOD Duo came to be
• What (Linux Audio) technologies are used inside
• Challenges and difficulties found while making the Duo
• Showing the Duo in action
• Things to come soon

The audience will be encouraged to get their hands on the device and try out its features. See https://moddevices.com/

6. Too Much Qstuff* To Handle by Rui Nuno Capela (Portugal)
Following in the tradition of LAC2013@IEM-Graz, LAC2014@ZKM-Karlsruhe, LAC2015@JGU-Mainz and miniLAC2016@c_base-Berlin, this talk/workshop is once again proposed as an informal opportunity for open debate and discussion of the so-called Qstuff* software constellation. Although it stars [4] as its main subject, all users and developers are welcome to attend, whether or not they're using any of the Qstuff*. An all-inclusive talk/workshop.
The Qstuff* are, in order of appearance:
[1] QjackCtl - A JACK Audio Connection Kit Qt GUI Interface
http://qjackctl.sourceforge.net
https://github.com/rncbc/qjackctl
[2] Qsynth - A FluidSynth Qt GUI Interface
http://qsynth.sourceforge.net
https://github.com/rncbc/qsynth
[3] Qsampler - A LinuxSampler Qt GUI Interface
http://qsampler.sourceforge.net
https://github.com/rncbc/qsampler
https://github.com/rncbc/liblscp
[4] Qtractor - An audio/MIDI multi-track sequencer
http://qtractor.org
http://qtractor.sourceforge.net
https://github.com/rncbc/qtractor
[5] QXGEdit - A Qt XG Editor
http://qxgedit.sourceforge.net
https://github.com/rncbc/qxgedit
[6] QmidiNet - A MIDI Network Gateway via UDP/IP Multicast
http://qmidinet.sourceforge.net
https://github.com/rncbc/qmidinet
[7] QmidiCtl - A MIDI Remote Controller via UDP/IP Multicast
http://qmidictl.sourceforge.net
https://github.com/rncbc/qmidictl
[8] synthv1 - an old-school polyphonic synthesizer
http://synthv1.sourceforge.net
https://github.com/rncbc/synthv1
[9] samplv1 - an old-school polyphonic sampler
http://samplv1.sourceforge.net
https://github.com/rncbc/samplv1
[10] drumkv1 - an old-school drum-kit sampler
http://drumkv1.sourceforge.net
https://github.com/rncbc/drumkv1

7. Moony – rapid prototyping of LV2 (MIDI) event filters in Lua by Hanspeter Portner (Switzerland)
In need of a specific event filter that does not yet exist for your DAW or live setup? No time or skill to write your own MIDI plugin in C/C++? Moony comes to the rescue and lets you script your filters in Lua on the fly for any LV2 host. Come and learn about the LV2 atom event system and rapid prototyping in Lua. (A generic sketch of such a filter follows workshop 8 below.)
See https://open-music-kontrollers.ch/

8. Interactive music with i-score by Jean-Michaël Celerier (Laboratoire Bordelais de Recherche en Informatique, France)
This workshop presents the i-score sequencer. It covers the challenges and the rationale that led to the creation of the software, namely providing a dedicated tool for temporal design in an interactive context. The construction of a score will be detailed through practical examples involving audio-visual features and interaction with a familiar creative coding environment such as Pure Data, openFrameworks or Processing.
See http://www.i-score.org
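On the event filters targeted by workshop 7: Moony scripts express such logic in Lua against the LV2 atom API, but the "hello world" of MIDI filtering, transposition, can be sketched generically. The C++ below is our own illustration (the MidiEvent struct is an assumption, not an LV2 or Moony type):

    #include <cstdint>
    #include <vector>

    struct MidiEvent {
      int64_t frames;    // timestamp within the audio block
      uint8_t bytes[3];  // raw MIDI message
    };

    // Transpose every note-on/note-off by `semitones`; pass everything
    // else through untouched.
    std::vector<MidiEvent> transpose(const std::vector<MidiEvent>& in,
                                     int semitones) {
      std::vector<MidiEvent> out = in;
      for (MidiEvent& ev : out) {
        uint8_t status = ev.bytes[0] & 0xF0;
        if (status == 0x90 || status == 0x80) {  // note-on / note-off
          int note = ev.bytes[1] + semitones;
          if (note >= 0 && note <= 127)
            ev.bytes[1] = static_cast<uint8_t>(note);
        }
      }
      return out;
    }

In a real LV2 plug-in the events would be read from and written to atom sequence ports rather than vectors; the appeal of Moony is that the loop body above becomes a few lines of Lua that can be edited while the host keeps running.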

9. Smartphone Passive Augmentation by John Granzow* & Romain Michon** (* University of Michigan - United States, ** Stanford University - United States)
In this four-hour workshop (two sessions) we will present Mobile3D, a library for introducing passive musical augmentations to mobile phones. The library allows participants to quickly leverage the parametric features of OpenSCAD, a functional programming language for text-based computer-aided design (CAD). Several 3D printers will be on hand to materialize designs. The workshop gives participants the tools to customize their smartphones for musical interaction.
See https://ccrma.stanford.edu/~rmichon/mobile3D/





Concerts

Thursday, May 18 – 06:30pm – 08:00pm – Auditorium de la Maison de l'Université – 10 rue Tréfilerie – Saint-Etienne

Concert N°1: Mixed Music

This first concert features works of mixed music (music in which acoustic instruments and electronic devices interact). These are pieces whose electronic parts have been developed with real-time free software (FAUST and SuperCollider). They are played by regional musicians and students from Saint-Etienne.

1. SmartMachine musicale (20' – 2017 – premiere)
With the students of La Salle secondary school (Saint-Etienne); Robert Chauchat, music teacher; Roméo Monteiro, percussionist and soloist of the Ensemble Orchestral Contemporain; and Gérard Authelain, teacher and musician (GRAME).
Guided by their teacher and the musicians, the students have explored the relationship between the technical object and musical creation, as well as the musical gesture (using the Smartfaust apps for smartphone designed by GRAME, National Center of Music Creation in Lyon). They have gradually created a new kind of sound scene, halfway between a performance and a visual installation. They invite you tonight to discover a space of research, questioning the aesthetics of the concert, the musician's attitude and his instrument!
Project conducted in partnership between GRAME and the Ensemble Orchestral Contemporain, with support from DRAC Auvergne – Rhône-Alpes and the Department of the Loire.

2. Smartbones (20' – 2016)
With brass students of the Conservatory of Valence (Léane Berthaud, Alice Chakroun, Noam Leenhardt, Rami Leenhardt, Meryem Ouannas, Antonin Vinay) and Pierre Bassery, musician and teacher.
In order to arouse curiosity and attentive listening, the students of the Conservatory of Valence, guided by the trombonist Pierre Bassery, combine repertoire, improvisation, creation and new technologies in pieces for trombone/tuba. Mounting smartphones on the instruments allows the musicians to play with the Smartfaust apps as a continuation of the instrumental gesture. These new instruments ("Smartbones" – smartphones & trombones) generate a sound material that leads the students to imagine choreographies in which dance, trombone and electronics are mixed and confronted.

3. Peyote (16' – 2016) by Sébastien Clara, mixed music for trio & electronics
With Alice Szymanski, flute; Justine Eckhaut, piano; and Florent Coutanson, saxophones.
In 1936 Antonin Artaud left Europe to travel to Mexico. This departure symbolizes his break with surrealist aesthetics. "The rationalist culture of Europe has gone bankrupt and I have come to the land of Mexico to seek the bases of a magical culture that can still spring from the forces of Indian soil." However, whether from a romantic desire to touch the bottom or from a personal inner experience to better understand his fellows, Artaud undertakes "a descent to come out of the day", a journey within his journey, a transcendental abyss. Artaud sets out to research "the ancient solar culture". To do this, he wants to be introduced to the culture of the men and women of the Sierra Tarahumara. Peyote retraces the dances of the Tarahumara to the glory of the sun, which undeniably influenced the work of Antonin Artaud.

Friday, May 19 – 08:00pm – 12:00pm – Auditorium de la Maison de l'Université – 10 rue Tréfilerie – Saint-Etienne

Concert N°2: Electronic Music

This is a concert of electronic music with musicians coming from the USA and from various European countries. They will play works, mostly experimental, realized with diverse and innovative digital devices, all involving open-source software.

1. Inaudible Harp (10') by Bruno Ruviaro & Juan-Pablo Caceres (Santa Clara University - USA)
Music for a distant harp, played via the Internet, and a computer system for real-time sound synthesis and processing.
One of the "origin tales" of ambient music has Brian Eno stuck in a hospital bed after an accident: lying immobile in bed, he would listen to records played by visiting friends. One day it was harp music, with the volume turned so low that the plucked strings were almost inaudible. "At first I thought, 'Oh God, I wish I could turn it up,'" Eno remembers. "But then I started to think how beautiful it was. It was raining heavily outside and I could just hear the loudest notes of the harp coming above the level of the rain." Our telematic duo improvisation is inspired by this image. The duo uses SuperCollider to generate and process sounds, and JackTrip to stream audio over the Internet. JackTrip is a Linux and Mac OS X-based system for multi-machine network performance over the Internet. It supports any number of channels (as many as the computer/network can handle) of bidirectional, high-quality, uncompressed audio signal streaming. The duo usually performs with one player on location and the other playing remotely from Chile or the USA.

2. Vox Voxel (12' – 2015) by Fernando Lopez-Lezcano* & John Granzow** (* Center for Computer Research in Music and Acoustics - Stanford University - USA, ** University of Michigan - USA)
Music for two interpreters, a 3D printer, a Daxophone, a Korg Nanokontrol 2 and a computer.
From an IBM 720 line printer playing Three Blind Mice in 1954 to dot-matrix printers playing love songs and Queen, the mechanical noises coming from printers were slowly tamed, domesticated and controlled, and countless unproductive hours of programming time were spent figuring out how to turn those noises into musical notes, phrases and whole pieces for the enjoyment of the IT team. From deafening antique mainframe line printers to whisper-quiet inkjets, all have been in the spotlight of a concert performance (or at least of a basement computer room). VoxVoxel is "composed" by designing a suitably useless 3D shape and capturing the sound of the working 3D printer using piezoelectric sensors. Those sounds are amplified, modified and multiplied through live processing in a computer using Ardour and LV2/LADSPA plugins, and output in full matching 3D sound. 3D pixels in space.

[Figure: one view of an object, and the score for the piece]

The piece is dedicated to our endangered wooden 3D printer, slowly declining with the rise of folded metal frames in entry-level machines. The wood, if fragile, is good for contact vibrations, amplifying the rhythms of the tool-path and the frequencies of the stepper motors. This rare 3D printer takes six minutes to warm up its extruder. For this, it has also fabricated an array of extensions for its equally endangered human performer.

3. Pointillism (20' – 2016) by Iohannes Zmolnig (Institute of Electronic Music and Acoustics - University of Music and Performing Arts - Graz – Austria)
Pointillism is a solo live-coding performance in Pd. Both the code representation and the generated audio use "points" and "dots" as building blocks: the music is generated using Morse code patterns, while the code is written in Braille (the dot-based writing system designed for blind and visually impaired people); a sketch of the Morse mapping is given below. The performance is an ironic statement on the popular "show-us-your-screens" paradigm, presenting the code in a form that does not even pretend to be readable.
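As an illustration of how Morse patterns can become rhythm (our own sketch, not the performance code), the standard Morse proportions read a dot as one time unit and a dash as three:

    #include <string>
    #include <vector>

    // Map a Morse pattern such as ".-.." to note durations in time units:
    // dot = 1 unit, dash = 3 units (standard Morse timing).
    std::vector<int> morseToDurations(const std::string& pattern) {
      std::vector<int> durations;
      for (char c : pattern) {
        if (c == '.')      durations.push_back(1);
        else if (c == '-') durations.push_back(3);
      }
      return durations;
    }

Each duration can then trigger a tone or a percussive event, with the one-unit gaps between symbols supplying the rhythmic grid.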

4. Seven Sphere Journey (12' – 2016) by Broch Vilbjorg (Anti-Delusion Mechanism - Netherlands)
Music for computer-generated sound, voice, dance and video projection.
This project explores generative sound, graphics and algorithmic composition based on the octonion algebra. The octonions form an 8-dimensional normed division algebra. The project started in 2016 and is in ongoing development. The octonions are an extremely rich subject within mathematics, with symmetry relations to Lie algebras, not least to the exceptional Lie groups. The unit-length octonions trace the seven-sphere S7, a 7-dimensional surface in 8-dimensional space. The project explores orbits in this 8-dimensional space for audio synthesis and graphics. (The defining octonion product is sketched after this program note.)

5. 1)3\/1532 (25' – 2012) by David Runge (Elektronic Studio, TU Berlin & c-base - Germany)
1)3\/1532 (deviser) is drone/noise/experimental soundscaping. The aural journey leads to places like simple repetitive guitar tunes, loops, feedback manipulation, modified samples of field recordings, DIY analog synthesizers, toys and the like. Experimentation for the greater good!

6. 5-HT_five levels to zero (28' – 2016) by Tina Mariane Krogh Madsen & Malte Steiner (Block4 - Germany)
For Linux computer with Pure Data, synthesizer, various pedals and instruments for noise.
The concert 5-HT_five levels to zero is based on the structural qualities of the neurotransmitter serotonin. The dynamics of the molecule will be improvised and performed live in a dynamic and counterbalanced noise act that deals with both the balances and the imbalances inherent in it, resulting in the chaotic states caused by disruption of this unit in the brain. For the piece, TMS has created a score that captures the dynamics of the musical composition; the audio will be accompanied by the projection of a visual score created in Blender and Pure Data.
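On the octonion algebra behind Seven Sphere Journey (piece 4 above): in one common sign convention of the Cayley–Dickson construction, an octonion is a pair of quaternions multiplied as

    \[
    (a,b)\,(c,d) \;=\; \bigl(ac - \bar{d}\,b,\;\; da + b\,\bar{c}\bigr),
    \qquad \overline{(a,b)} \;=\; (\bar{a},\,-b),
    \]

and the norm is multiplicative, \(\lVert xy \rVert = \lVert x \rVert\,\lVert y \rVert\), which is what makes the octonions a normed division algebra. The unit octonions \(\lVert x \rVert = 1\) are exactly the seven-sphere \(S^7 \subset \mathbb{R}^8\) on which the project's orbits live. (This is standard algebra, not notation taken from the piece itself.)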

7. Level 5 Alert (30' – 2016) by Frederic Peters, Arthur Lacomme and Suzie Suptille (Radio Panik - Belgium)
Live performance. The show was created after the 2016 Brussels bombings as an independent and uncontrolled means of expression, and as a recurrent collective work mixing experimental and creative writing, music and sound effects.

8. The Infinite Repeat (22') by Jeremy Jongepier (linuxaudio.org – USA & MOD Devices - Germany)
For guitar, computer and voice.
A musician with over 25 years of experience and a computer with Linux: that's what it boils down to. The result: conventional, solid song-writing with an eclectic tinge, owing to the choice not to walk the well-trodden paths, coupled with an autodidactic background, an outspoken personal taste and an open-minded world-view. This year The Infinite Repeat will be all about going back to the roots and mixing that up with the latest Linux-based technology. So expect some solid acoustic singer-songwriter material with a modern touch and a Linux device on the floor.
See http://theinfiniterepeat.com

Saturday, May 20 – 08:30pm – 10:00pm – L'Estancot – 10 Rue Henri Dunant - Saint-Étienne

Concert N°3: Acousmatic Music

1. Voce 3316 (3' – stereo), Massimo Fragalà, Italy
2. Kecapi III (10'56 – octophony), Patrick Hartono, Indonesia
3. Inuti (9' – 16 channels), Helene Hedsund, United Kingdom
4. Profon (7' – stereo), Magnus Johansson, Sweden
5. Dark Path #6 (5' – stereo), Anna Terzaroli, Italy
6. Kruchtkammer (6' – quadraphony), Lukas Tobiassen, Germany
7. Definierte Lastbedingung (11'40 – octophonic ambisonics), Clemens von Reusner, Germany
8. Suite of miniatures (10' – stereo), Bernard Bretonneau, France
9. Concret X (5' – ambisoniX), Jean-Marc Duchenne, France





Multimedia installations

1. OUPPO (Louise Harris - United Kingdom)
Ouppo is a generative audiovisual installation composed of four modular units for video mapping.

2. PROFON (Julien Ottavi - France) Profon is a performative device composed of a collection of flower pots moved in time and space.

3. SONIC CURRENT (Kosmas Giannoutakis - Austria)
Sonic Current is a site-specific sound installation which transforms architectural locations into "sonically conscious" organisms. The transformation of the site into a body, with its sense organs (microphones) and actuators (loudspeakers), enables the site to articulate and manifest itself in an open dialogue with its visitors.

4. THETA FANTOMES (Apo33 Collective - France)
Thêta Fantomes is a cross-disciplinary digital game/art project. It is an art piece developed by APO33 to realise some of our ideas about using real-time neuronal data processing in game play, in a hybrid transcendental experience.

5. ZIC STREET BOX (Lionel Rascle, Conservatoire de Musique de Saint-Chamond - France)
ZicStreetBox is an interactive installation intended for public demonstration, raising awareness of the use of technologies for sound-art creation. The songs used in the installation are designed by the pupils of the computer-music course at the Saint-Chamond music school.

6. SOUND SCULPTURES (Thomas Barbe - France)
The sound sculptures of Thomas Barbe question the link between sculptural form and sound generation. His creations situate sound in the spatial environment in a tangible way, questioning the relationship between visual form and sonic activity.


Linux Audio Conference 2017

lac2017.univ-st-etienne.fr