Network distribution in music applications with Medusa

Flávio Luiz SCHIAVONI and Marcelo QUEIROZ, Computer Science Department, University of São Paulo, São Paulo, Brazil, {fls, mqz}@ime.usp.br

Abstract

This paper introduces an extension of Medusa, a distributed music environment, that allows easy use of network music communication in common music applications. Medusa was first developed as a Jack application and is now being ported to other audio APIs in an attempt to make network music experimentation more widely accessible. The APIs chosen were LADSPA, LV2 and Pure Data (external). This new project approach required a complete review of the original Medusa architecture, which consisted of a monolithic implementation. As a result of the new modular development, some possibilities of using networked audio streaming via Medusa plugins in well-known audio processing environments will be presented and commented on.

Keywords

Network Music, Pure Data, LADSPA, LV2, Medusa.

1 Introduction

Ever since network and Internet connections became widely available, collaborative and cooperative music tools have been developed. The desire to use realtime audio streaming in live musical performances inspired the creation of various tools focused on distributed performance, where synchronous music communication is a priority. Some related network music tools that address the problem of synchronous music communication between networked computers are NetJack [?], SoundJack [?], JackTrip [?; ?], llcon [?] and LDAS [?].

The successful cases of network music performance concerts could give the wrong impression that the only use for network music is distributed performance, but several other problems in computer music can benefit from network music distribution. For instance, recordings can be made using a pool of networked computers, a resource which might be seen as a scalable distributed sound card; music spatialization could be done using a mesh network topology; digital signal processing power can be vastly improved using clusters of computers; and musical composition may explore new ground by viewing computer networks as nonstandard acoustical environments for performance and listening.

There are at least two requirements for all these scenarios to be fully explorable by potentially interested users, who are often musicians or aficionados and usually nonprogrammers: first, flexible audio network tools must be made available, and second, easy integration with popular music/sound processing applications must be guaranteed. Although most high-level music applications nowadays run over Jack and ALSA, we see time and again that end users cannot deal with the audio infrastructure lying below the application they are running. Also, the lack of automated setup and graphical user interfaces keeps ordinary users away from many existing tools, such as the network music tools mentioned above.

The work presented in this paper is part of the investigation behind the development of Medusa [?], a distributed audio environment. Medusa is a FLOSS project which is primarily focused on usability and transparency, as a means of making network music connections easier for end users. The first implementation of Medusa, presented at LAC 2011, was developed in C++ with a Qt GUI and Jack as sound API. The Jack API was extended with a series of network functionalities, such as adding and removing remote ports and remote control of the Jack Transport functionality.

Figure 1: Medusa Jack implementation with Qt GUI

Recently, the development of Medusa has been strongly guided by the attempt to deal with the two aforementioned requirements, namely flexibility and integrability. These goals may be reached by extending regular sound processing applications, such as Pure Data, allowing them to function as network music tools. The basic idea is to reach end users wherever they already are.

Most music applications can be extended by plugins: Pure Data can be extended by the creation of C externals (which might be viewed as plugins), whereas digital audio workstations such as Ardour, Rosegarden, and Traverso can be extended by LADSPA, VST and LV2 plugins. In this paper we explore the possibility of developing Medusa network music plugins using three popular audio APIs: LADSPA, LV2 and the Pure Data API (via C externals). It is worth mentioning that these APIs are all open source, developed in C and widely used in Linux music environments.

A reimplementation of Medusa was required in order to allow code reuse in the implementation of these plugins, and also to ease maintenance of the source code. All source code of the Medusa project, including the Medusa plugins, is freely available on the project site (http://sourceforge.net/projects/medusa-audionet/). This reimplementation of Medusa and the proposed architecture are presented in section 2; section 3 discusses the chosen audio APIs and presents the developed plugins, and section 4 brings some conclusions and a discussion of future work.

2 Medusa

Although some promising preliminary results had been achieved with the first version of Medusa [?], the original monolithic implementation raised several difficulties in the implementation of the proposed Medusa plugins, which eventually triggered a fundamental architectural change in the project. First, the implementation language was changed from C++ to ANSI C, which is more compatible with the chosen sound APIs. Second, the monolithic structure was replaced by a core Medusa library (libmedusa.so), comprising control and network functions, which can be reused by each new plugin implementation. Third, graphical user interfaces, text-based interfaces and plugins now occupy a separate layer that uses the core Medusa library as an API in its own right.

The new Medusa architecture is divided into three layers: Sound, Control and Network. The Control and Network layers are implemented as a single library and are used by all Medusa implementations. The Sound layer corresponds to specific applications, such as the proposed plugins, that are built using each plugin API and the Medusa library. This architecture facilitates code maintenance and bug fixing.

Figure 2: Architectural view of the implementation

2.1 Network layer

The Network layer is responsible for managing connections between network clients and servers. Its implementation is based on a fixed set of network transport protocols, which are normally provided as part of the kernel, since their development and deployment require superuser privileges.

Figure 3: Sender / Receiver communication

Medusa currently allows the user to choose among four transport protocols: UDP, TCP, SCTP and DCCP.
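To make the protocol choice concrete, the following minimal sketch shows how the four protocol names could be mapped to the arguments of the standard `socket()` call on Linux. It is an illustration only and is not taken from the Medusa sources: the names `transport_params`, `transport_lookup` and `transport_open` are hypothetical, and the sketch assumes IPv4 and the one-to-one (stream) style of SCTP.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Fallbacks for C libraries whose headers do not expose these constants.
   The values are the ones used by Linux and assigned by IANA. */
#ifndef SOCK_DCCP
#define SOCK_DCCP 6
#endif
#ifndef IPPROTO_DCCP
#define IPPROTO_DCCP 33
#endif
#ifndef IPPROTO_SCTP
#define IPPROTO_SCTP 132
#endif

/* Hypothetical descriptor; Medusa's real data structures may differ. */
struct transport_params {
    int type;      /* second argument of socket() */
    int protocol;  /* third argument of socket() */
    int reliable;  /* 1 if the protocol retransmits lost packets */
};

/* Fill `out` for "udp", "tcp", "sctp" or "dccp".
   Returns 0 on success and -1 for an unknown protocol name. */
int transport_lookup(const char *name, struct transport_params *out)
{
    assert(name != NULL && out != NULL);
    if (strcmp(name, "udp") == 0) {
        out->type = SOCK_DGRAM;  out->protocol = IPPROTO_UDP;  out->reliable = 0;
    } else if (strcmp(name, "tcp") == 0) {
        out->type = SOCK_STREAM; out->protocol = IPPROTO_TCP;  out->reliable = 1;
    } else if (strcmp(name, "sctp") == 0) {
        out->type = SOCK_STREAM; out->protocol = IPPROTO_SCTP; out->reliable = 1;
    } else if (strcmp(name, "dccp") == 0) {
        out->type = SOCK_DCCP;   out->protocol = IPPROTO_DCCP; out->reliable = 0;
    } else {
        return -1;
    }
    return 0;
}

/* Open an IPv4 socket for the named protocol. Returns the descriptor, or -1
   if the name is unknown or the kernel lacks support for the protocol.
   Further per-protocol setup (bind, connect, socket options) is omitted. */
int transport_open(const char *name)
{
    struct transport_params p;
    if (transport_lookup(name, &p) != 0)
        return -1;
    return socket(AF_INET, p.type, p.protocol);
}
```

Because SCTP and DCCP are often built as optional kernel modules, a caller of `transport_open` should be prepared for a failure on those two protocols and might fall back to TCP or UDP respectively.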

UDP [?]: User Datagram Protocol is the classical unreliable (but faster) transport protocol.

TCP [?]: Transmission Control Protocol is a reliable transport protocol, which retransmits lost packets to ensure that no data is lost.

SCTP [?]: Stream Control Transmission Protocol is a connection-oriented transport protocol that provides a reliable full-duplex association. This protocol was not originally meant as a replacement for TCP, but was developed for carrying voice over IP (VoIP).

DCCP [?; ?]: Datagram Congestion Control Protocol is a transport protocol that combines TCP-friendly congestion control with unreliable datagram semantics for applications that transfer fairly large amounts of data [?].

This multi-protocol network communication layer is intended to offer alternatives for users who may need different features in data transfer according to the application context. For instance, an interactive musical performance with strong rhythmic interactions may require the smallest possible latency while tolerating audio glitches due to packet losses, whereas a remote recording session may tolerate high latency and jitter, but still require that every packet be delivered. With a few alternatives available, users may choose the protocol that is most appropriate for their own musical use.

2.2 Control layer

The next layer in the Medusa API is the Control layer. The Control layer essentially creates senders and receivers and provides them to the Sound layer. Instead of requiring all protocols in the Network layer to support full-duplex communication, the Control layer always separates the roles of sending and receiving network data.

In order to create a sender it is necessary to specify the network protocol, the network port and the number of channels that will be sent. The sender creates the network server and one ring buffer for each channel, and provides functions that give the sound API easy access to these buffers. Each ring buffer receives sound content from applications (usually in DSP chunks), out of which the sender prepares network chunks (usually of a different size) for the Network layer.

The receiver is created in a way similar to the sender, but besides the basic parameters the corresponding server IP address is also required. The receiver creates a network client and the required number of ring buffers (one per channel). Unlike the sender, the ring buffers of the receiver are fed by a network client and consumed by the sound API.

2.3 Sound layer

The outermost Medusa layer is the Sound layer, which deals not only with audio streams but also with MIDI streams. The Sound layer lies between the Medusa API and several sound application APIs, and represents a collection of Medusa front-ends or interfaces, since each integration of Medusa with a particular sound application may have its own user interface, defined by the corresponding sound application API.

The integration of the Sound and Network layers uses the Control layer through its sender / receiver plugins. Sender and receiver roles defined by the Control layer are converted into Sound layer plugins that simply exchange audio streams between machines.

Each plugin has a host application and a particular way to communicate with it. The API defines how a plugin is initialized, how it processes data and how it is presented. Each plugin implementation generates an independent software package that can be used and distributed individually. Keeping these implementations separate avoids mixing different plugin libraries, which makes code maintenance and debugging easier.

3 Implementations

3.1 LADSPA

LADSPA [?] stands for "Linux Audio Developer's Simple Plugin API", and it is the most common audio plugin API for Linux applications. Many Linux programs, such as Ardour, Rosegarden, QTractor and Jack Rack, support LADSPA plugins. The LADSPA API is captured within a single header file and is very easy to use. Some examples are provided with an SDK and there is a lot of well-documented open source code available.

Figure 4: Medusa LADSPA implementation

Figure 4 presents the Medusa LADSPA sender and receiver interfaces. The controls of a LADSPA plugin are represented exclusively by 32-bit floating point numbers, and it was awkward to use these to represent IP addresses. LADSPA GUIs are defined in RDF files that cannot be changed on the fly, which makes them very cumbersome as interfaces for network audio connections. LADSPA does not support MIDI and the documentation suggests the use of Rosegarden DSSI to deal with it.

3.2 LV2

LV2 (LADSPA version 2) [?] is acknowledged as the official LADSPA successor. An LV2 plugin is a bundle that includes the plugin itself, an RDF descriptor and any other resource needed. The LV2 bundle may include a GTK GUI, thus offering developers a way to create better user interfaces. Like LADSPA, LV2 is an easy-to-use library, and plenty of documentation and examples are provided by the developers.

Figure 5: Medusa LV2 implementation

The possibility of creating GTK interfaces allowed a more natural integration of Medusa as an LV2 plugin. The IP mask entry allows easy input for the user, and on-the-fly changes to the interface may be used by Medusa in the future to find users and sound resources and to keep this information up to date on the interface.

3.3 Pure Data external

Pure Data [?] (aka Pd) is a widely used realtime programming environment for audio, video, and graphical processing, which can be extended by externals written in C/C++ [?]. Pure Data does have some network externals available, but none offers all the transport protocols provided by Medusa. A Pd external may also be implemented with a tcl/tk GUI, which can be useful to improve usability and also to implement the Medusa control service in the future. In addition, Pd offers a good environment for measuring latency, packet loss and jitter in the Medusa Network layer implementation.

The Medusa Pd external implementation is actually a collection of externals (a library) that includes the sender and receiver objects (see figure 6) and a network meter to measure latency and packet loss. Some usage examples were also developed and are available on the project website.
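The text does not describe how the network meter computes its figures, so the following is only a sketch of one common approach, assuming that every packet carries a sequence number and a send timestamp. Packet loss is estimated from gaps in the sequence numbers, and jitter uses the smoothed interarrival estimator in the style of RFC 3550. All names (`net_meter`, `meter_init`, `meter_update`, `meter_lost`) are hypothetical and do not come from the Medusa sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical meter state; not Medusa's actual data structure. */
struct net_meter {
    unsigned long first_seq;   /* sequence number of the first packet seen */
    unsigned long max_seq;     /* highest sequence number seen so far */
    unsigned long received;    /* number of packets received */
    double last_transit;       /* transit time of the previous packet (ms) */
    double jitter;             /* smoothed interarrival jitter (ms) */
    int started;               /* 0 until the first packet arrives */
};

void meter_init(struct net_meter *m)
{
    assert(m != NULL);
    m->first_seq = m->max_seq = m->received = 0;
    m->last_transit = m->jitter = 0.0;
    m->started = 0;
}

/* Account for one received packet. `sent_ms` is the sender's timestamp and
   `recv_ms` the local arrival time. The two clocks need not be synchronized
   for the jitter estimate, because only differences of transit times are
   used. Sequence number wraparound is not handled in this sketch. */
void meter_update(struct net_meter *m, unsigned long seq,
                  double sent_ms, double recv_ms)
{
    double transit = recv_ms - sent_ms;
    assert(m != NULL);
    if (!m->started) {
        m->first_seq = m->max_seq = seq;
        m->started = 1;
    } else {
        double d = transit - m->last_transit;
        if (d < 0.0)
            d = -d;
        /* RFC 3550 style smoothing: move 1/16 of the way to the new sample. */
        m->jitter += (d - m->jitter) / 16.0;
        if (seq > m->max_seq)
            m->max_seq = seq;
    }
    m->last_transit = transit;
    m->received++;
}

/* Packets that were expected, judging by sequence numbers, but never arrived. */
unsigned long meter_lost(const struct net_meter *m)
{
    unsigned long expected;
    assert(m != NULL);
    if (!m->started)
        return 0;
    expected = m->max_seq - m->first_seq + 1;
    return expected > m->received ? expected - m->received : 0;
}
```

Absolute one-way latency cannot be derived from these values unless the sender and receiver clocks are synchronized; a round-trip measurement is the usual alternative in that case.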
Figure 6: Medusa Pure Data external implementation

4 Conclusions and Future Works

The possibility of exchanging sound data between applications using the Jack sound server, for instance, has brought new perspectives to audio software development and usage. Extending this possibility to allow for network data exchange from within popular user applications may facilitate the emergence of new ways to compose, record or play music. It may be too early to conclude that network plugins for audio software will really expand the musical use of computer networks, but it is not too early to observe in users and musicians new expectations about what might be done in network music, and also the desire to try out new ways to do old things, maybe more easily and more transparently than before.

The effort that went into changing the Medusa architecture and reimplementing it is fully justified, in the sense that implementing other network music plugins (i.e. Medusa plugins for other sound applications) now involves a lot of code reuse and has thus become much lighter.

On the other hand, the first version of Medusa had an additional structure called Control Service, which was responsible for helping users connect to network music resources in a transparent way, in other words, without having to specify IP addresses, network ports or audio configuration, using broadcast messages to publish networked audio resources.

With that in mind, the level of usability originally intended by this project will only be reached when Medusa's control service is implemented in the presented Medusa plugins, which is the very next step in our work. This control service is going to be responsible for improving transparency and usability, by including a discovery service and a name server to publish networked music resources, thus allowing users to refer to other users and audio streams instead of IP addresses and network ports. The Medusa LADSPA plugin will probably have to be abandoned at this point, because the control service is impossible to implement without on-the-fly GUI modification (although the current version of the plugin, without the control service, should be kept).

Up to this point the plugin development has focused on audio streaming, but some other features must soon be addressed, such as handling MIDI streams and designing better GUIs. There are also other sound APIs that are being investigated and may soon be incorporated into the Medusa Sound layer.

Acknowledgements

Thanks to the countless developers of LADSPA, LV2 and Pure Data externals and their amazing open source code. Without their anonymous help this project would not have been possible.

Thanks also go to André Jucovsky Bianchi, Beraldo Leal, Santiago Davila, Antônio Goulart, Giuliano Obici, Danilo Leite and the Computer Music Group of IME/USP for their interest, feedback and support.

This work has been supported by the funding agencies CNPq (grant 141730/2010-2) and FAPESP - São Paulo Research Foundation (grant 2008/08623-8).