<<

Performance for Real-Time Audio on BeagleBone Black

James William TOPLISS Victor ZAPPI Andrew McPHERSON Centre for Digital Music, School of EECS Queen Mary University of London Mile End Road E1 4NS London, United Kingdom, [email protected] [email protected] [email protected]

Abstract with a musical experience not too far from the In this paper we present a set of tests aimed at one typical of acoustic and electric instruments evaluating the responsiveness of a BeagleBone Black [Berdahl and Ju, 2011]. This natural compari- board in real-time interactive audio applications. son with well known “devices”, such as piano The default Angstrom Linux distribution was tested and guitar, underlines qualities like reconfig- without modifying the underlying kernel. Latency urability, independence/autonomy and high re- measurements and audio quality were compared sponsiveness, which can be assured only on a across the combination of different audio interfaces dedicated system. and audio synthesis models. Data analysis shows that the board is generally characterised by a re- As designers and developers of open source markably high responsiveness; most of the tested novel DMIs, we have decided to explore the configurations are affected by less than 7ms of la- promising and evolving world of embedded tency and under-run activity proved to be contained Linux technologies, focusing as starting point on using the correct optimisation techniques. the concept of responsiveness. The work here presented shows the result of a series of tests Keywords aimed at measuring the latency of a Beagle- Embedded systems, BeagleBone Black, responsive- Bone Black4 board (BBB), used as the core of a ness, latency, real-time. self-contained, open-source musical instrument. 1 Introduction Different hardware and software configurations based on the same Linux kernel (v3.8.13) have Research in Music Technology and, in partic- been analysed under different CPU loads and ular, on Digital Musical Instruments (DMIs) levels of code optimisation. is strongly connected to the field of Human- This work is part of a larger structured Computer Interaction (HCI). Following the study, whose goal is to assess longevity, usabil- trend of many other disciplines involving HCI, ity and reconfigurability of DMIs, compared to like Ubiquitous Computing [Kranz et al., 2009] the standards of acoustic and electric musical and Augmented Reality [Langlotz et al., 2012; instruments. Ellsworth and Johnson, 2013], DMI research has recently started capitalising on portable and 2 Related Work embedded systems rather than on general pur- pose architectures. After many years of com- In 2011 Berdahl et al. presented the Satellite plete synergy, musical instruments are increas- CCRMA [Berdahl and Ju, 2011], a platform for ingly abandoning the laptop/desktop computer teaching and practicing interaction design for in favour of onboard audio processing, leaving diverse musical applications, completely based an important mark in both academia [Berdahl on embedded Linux. It runs on a BeagleBoard5 et al., 2013; Oh et al., 2010; Baclawski and coupled with an Arduino Nano6 and a bread- Jackowski, 2013] and industry (e.g., Aleph1, board, to support the use of sensors and ac- ToneCore DSP2 and OWL3). tuators. Two years later, Berdahl et al. up- This is due to the fact that DMIs require a graded the platform enabling the compatibil- specific set of design features to provide the user ity with more powerful boards, such as Rasp- (i.e., a performer, a composer, a casual player) 4http://beagleboard.org/products/beaglebone% 1http://monome.org/aleph/ 20black 2http://line6.com/tcddk/ 5http://beagleboard.org/Products/BeagleBoard 3http://hoxtonowl.com/ 6http://arduino.cc/en/Main/arduinoBoardNano berry Pi7 and BeagleBoard-xM8, and expanded cludes most of the standard components re- the range of possible applications including net- quired to synthesise and control audio in real- working capabilities and hardware-accelerated time on a self-contained instrument (i.e., with- graphics [s[Berdahl et al., 2013]. The project is out the need of laptops and any additional ex- based on a Fedora distribution with a custom ternal devices). Details on each of these com- low latency kernel. ponents are given in the following subsections. A good example of how new generation em- bedded Linux boards can be used to extend the 3.1 Board and OS capabilities of a musical instrument has been The BBB is an embedded Linux board based recently presented by MacConnell et al. [Mac- on a 1GHz ARM Cortex-A8 processor. It is Connell et al., 2013]. This work introduces shipped with an embedded Angstrom Linux dis- a BBB-based open framework for autonomous tribution (v3.8), optimised to run on embedded music computing eschewing the use of the lap- architectures. This distro is meant to run gen- top on stage. Some important features of em- eral purpose applications and it is not specif- bedded systems are here used to provide the in- ically audio-oriented. Our first intent was to struments designed using the framework with explore the capabilities of this default board high degrees of autonomy and reconfigurabil- configuration, without introducing any changes ity. Authors include also data regarding the in the underlying kernel. We believe this ap- latency of the system running Ubuntu 12.04. proach could be very useful for the community Since mainly FX-like processes are addressed, of embedded developers to have a clear outline only the audio- time is measured, of the built-in audio capabilities of the BBB, under different system load configurations. Re- thus helping choose the right board. sults vary between 10 to 15 ms, according to the Although belonging to a new generation of kind of filtering. compact and fully accessorised boards (e.g., The necessity of measuring the responsive- HDMI, uSD card slot), the BBB does not na- ness of computer-based systems is not recent tively provide any audio interfacing. To test at all, especially in the context of real-time op- the performances of this board in relation to erative systems. In 2002, Abeni et al. used high quality audio synthesis, two commercially a series of micro-benchmarks to identify major available audio interfaces were chosen for com- sources of latency in the Linux kernel [Abeni parison; these were both configured for use with et al., 2002]. They also evaluated its effects the board so as to provide real-time audio out- on a time-sensitive application, in particular an put. audio/video player. Moving towards computer- based audio systems, it is worth mentioning the work by Wright et al. [Wright et al., 2004], in which the latency of MacOS, Red Hat Linux (with real-time kernel patches) and Windows XP are compared, both in an audio-throughput configuration and in an event audio-based con- figuration. The technique used to estimate the event audio latency consists of measuring the time delay between the sound produced by pressing a button on the keyboard and a sinu- soidal audio output triggered on the computer by pressing the button itself. 3 System Configuration The tests presented throughout this work aim Figure 1: The USB interface and the Audio at the evaluation of the capabilities of a BBB- Cape attached to the BeagleBone Black. based system when used as development plat- form for DMIs. The chosen configuration in- 3.2 Audio Interfaces 7http://www.raspberrypi.org/ 8http://beagleboard.org/Products/ We tested one USB interface and one Beagle- BeagleBoard-xM Bone expansion “cape” providing audio output. This choice aimed at comparing the two most model between operating processes and output common solutions used by BBB users for gen- devices. For this reason, only a synthesizer class erating audio output. Figure 1 shows both the was designed to operate as the client , interfaces attached to the board. while a standard JACK server acts as the audio The first interface used is the Turtle Beach engine for transport. In this configuration the Amigo II USB Interface9’This device is bus- server pulls audio from the client process every powered and USB 2.0 class-compliant.. Once time it requires new output data, this is in stark attached, this device is automatically recog- contrast to the ALSA system whereby audio is nised as a new hardware interface, meaning that pushed to the output devices. it simply requires being specified as the selected Concerning audio synthesis, both the synthe- device for audio applications. This interface sizers implemented in ALSA and JACK gener- provides 2 stereo 3.5mm jack receptors, one for ate simple sine waves based on reading from input the other for output. Only the latter has a wavetable. Both systems are configured to been used for our tests. provide CD quality audio, (i.e., 16bit resolu- The second interface used for our tests is tion, 44.1KHz sample rate) and to run the au- the BeagleBone Audio Cape10; this device effec- dio thread at the maximum priority level using tively acts as an extension to the BBB, it simply a real-time FIFO . attaches to the top of the board to provide an A parallel control thread was included to audio interface. Audio data are exchanged to manage user input through the keyboard and to and from the BBB using an I2S connection. Un- have access to the general-purpose in/out pins like the USB interface, this device requires some (GPIO) read/write capabilities of the BBB. manual configuration to be recognised as a plug- in hardware interface. The on-board HDMI au- 4 Performance Test dio virtual cape must be disabled so that the The responsiveness and the audio quality of Audio Cape can be loaded by the firmware as four different specific configurations were tested, the main audio device; this can be easily done combining the use of the 2 audio backends by changing the uBoot parameters passed at (ALSA an JACK) with the 2 audio devices boot-time. As the USB interface, this cape in- (USB and Audio Cape). Responsiveness was cludes a couple of stereo 3.5mm input/output evaluated considering the latency occurring be- jack receptors. tween the triggering of an audio task produc- ing a waveform and the actual output of the 3.3 Audio Synthesis waveform through the audio interface; the as- Two different audio backend systems were de- sessment of audio quality was connected to the veloped in C++ and cross-compiled to run on incidence of under-runs. the ARM Cortex-A8 processor, one based on The performances of each configuration were ALSA, the other based on JACK. ALSA and measured running the audio task in 3 distinct JACK implementations are currently adopted test scenarios. Each of these scenarios (de- by a large number of Linux audio developers. scribed in the following subsection) involved The audio backend system based on ALSA testing different period and buffer size config- (Advanced Linux Sound Architecture11) essen- urations. As the focus of the test is concerned tially comprises an audio engine and a para- with very low latency, only the smallest possi- metric synthesizer. The synthesizer produces ble period and buffer sizes were examined. In frame data; it is connected to the audio engine, addition, each measurement was repeated en- which is responsible for collecting and trans- abling 3 different optimisation settings on the porting this frame data to the selected output C++ cross-compiler (Linaro GCC 4.7 hosted on device. a x86 64 architecture), using the O1, O2 and O3 The audio backend system based on JACK flags. (Jack Audio Connection Kit12) was similarly 4.1 Test Scenarios designed. A fundamental difference between The first scenario involved the generation of a these two APIs is that JACK uses a client-server simple monophonic tone. As mentioned in Sec- tion 3.2, a simple lookup table was used to gen- 9www.turtlebeach.com 10http://elinux.org/CircuitCo:Audio_Cape_RevA erate the frames for this tone. 11www.alsa.opensrc.org The second scenario consisted of creating the 12www.jackaudio.org same monophonic tone as used in the first sce- nario, however whilst a Top background pro- cle could begin, outputting the signal into the cess was active with a fast refresh rate (passing oscilloscope. The oscilloscope was set to trigger the command line argument “-d 0.1”) (but with a single display capture on both the channels standard priority). This scenario allowed for the upon the detection of a rising edge in the GPIO efficiency of audio synthesis to be observed and signal. The time distance between the GPIO measured whilst the system was under heavier rising edge in the display and the beginning load. of the captured audio output hence provided a The final test scenario was concerned with measurement of the operational latency (Figure the generation of a more complex polyphonic 3); each measurement was repeated 5 times [as- tone; this was achieved through the summation suming that’s right] and an average value cal- of three harmonically related monophonic oscil- culated. lators. The addition of these two extra tones was considered to be a suitably harder task to synthesise than the simple monophonic tone. It must be noted that no background process was executed during this scenario.

Figure 3: Examples of oscilloscope display cap- ture for scenarios 1 and 3. Latency is measured as the horizontal distance (time gap) between the GPIO signal rising edge (in yellow) and the start of the waveform (in light blue). Figure 2: The setup to measure latency when using the USB interface. It must be noted that the period and buffer sizes chosen were dependent upon both the sys- 4.2 Procedure tem type and the target interface; configura- tions that worked well for one pair did not nec- The BBB was connected by USB; tests were essarily run well for another. The six smallest performed over an ssh connection via BBB’s usable configurations were tested for each sys- USB network connection. One of the board’s tem and interface pair. GPIOs was attached to the first input channel of an oscilloscope. The audio output was con- In addition to measuring the output latency, nected to the second channel of the oscilloscope. the quality of the output audio was also ob- For the case of the USB interface, the complete served; this observation relied upon noting the setup is shown in Figure 2. frequency of frame dropout (under-run activity) In detail, the test procedure ran as follows. and visible distortion displayed on the oscillo- Upon starting one of the executables, the gener- scope (if any). ated system (ALSA or JACK) was programmed to initialise itself but then wait for user input 5 Results (i.e. a keystroke) before beginning to fill the output buffers with frames. Once the keystroke The reported measurements are here presented signal was received across the serial connection, and discussed, first globally and then analysing the first task of the system was to drive the more specific cases. Both latency and quality of GPIO connected to the oscilloscope from low to the output (under-runs) are taken into account high. Only immediately after this the audio cy- and the singular contributions are combined. 5.1 Latency The latency results were highly consistent across trials: when using the USB audio in- terface with both ALSA and JACK systems (Fgures 4 and 6) the maximum difference be- tween individual measurements generally only varied by one , with only few excep- tions; this was true regardless of the used opti- Figure 6: JACK latency measurements for the misation. Measurements concerning the Audio USB interface. Cape (Figures 5 and 7) were even more consis- tent than for the USB interface. Across the five measured latencies, measurements only varied by half of a millisecond, without exceptions. As expected, for all configurations, latency is directly related to buffer and period sizes. No significant, systematic latency differences were noted amongst the three optimisation settings. However, the choice of monophonic versus Figure 7: JACK latency measurements for the polyphonic synthesis and the system load in- Audio Cape. troduce unexpected variations in the latency. These variations may reflect a delay in start- configurations. Unfortunately, the set of avail- ing up the ALSA or JACK system, rather than able configurations varies between the 2 sys- a difference in steady-state latency once the au- tems, making impossible a direct performance dio rendering is running. Our test procedure comparison. For the only overlapping configu- toggles a GPIO pin and then immediately be- ration (buffer 512 - period 32), JACK performed gins filling the audio buffers; it is possible that unexpectedly badly, showing a higher latency this initial startup produces an additional tran- than configurations based on larger period sizes. sient delay compared to reacting to an event Conversely, USB interface results extend on once audio is already running. In any case, the the same set of configurations for both ALSA difference between test conditions is always less and JACK, so that a quantitative comparison than 1ms. is here possible. For the smallest period size (i.e. 64 frames) the JACK system shows bet- ter or equal performances, while increasing the size ALSA proved remarkably more responsive. In particular, the last 3 cases listed in Figure 6 shows that JACK’s latency is almost the same and quite high, regardless of all the conditions, i.e. buffer/period sizes, test scenario and opti- misation level. Figure 4: ALSA latency measurements for the USB interface. 5.2 Under-runs The test highlighted a certain incidence of under-runs, whose effects varied according to the chosen configuration. Generally, they oc- curred in particular when lowest buffer and pe- riod sizes were tested. Also the different opti- misation settings proved to strongly affect their incidence. 5.2.1 ALSA Figure 5: ALSA latency measurements for the Using ALSA on the USB interface, it was ob- Audio Cape. served the occurrence of frame dropout issue only when the buffer and period were set to It can be noted that, on both systems, the the minimum values (i.e. respectively 128 and 8 Audio Cape allows for smaller buffer and period frames). This was true across all the build qual- ities of the system and for all of the scenarios. In the case of the JACK system using the During the first scenario (pure tone) and third Audio Cape, it was noticed that the occurrence scenario (polyphonic tone) observed dropouts of frame dropout appeared more frequently for were not too severe, normally only being exhib- the first three period and buffer size config- ited once or twice at the beginning of synthesis. urations. The first scenario produced a very The second scenario seemed to generate signif- small amount of under-run activity, in regards icantly higher rates of frame dropout, leading to all the three optimisations; only during the to audible clicks. An interesting observation is first scenario were any frame dropouts observed. that the O2 optimised versions of the system The nature of these under-runs however was dif- appeared to always exhibit the least amount of ferent to those previously observed; during the under-runs. synthesis the server ran smoothly, while under- In the case of the ALSA system using the Au- runs were noted only after the client had been dio Cape, it was observed that frame dropout disconnected. It was observed during the sec- only occurred whilst using the smallest period ond scenario that the amount of frame dropout size configurations (period size of 8), regardless increased slightly; the type of under-run seen in of the buffer setting. Again this was true across the first scenario (after the termination of the all the build qualities of the system and all of the client) occurred more frequently. This behav- scenarios. The first and third scenario produced ior was exhibited in both the first and second a very small amount of under-run activity, in- period and buffer size configurations for the O1 terestingly only for the first period and buffer and O3 optimisations. In the case of the O2 op- size configuration tested. Similarly to the USB timisation, this behavior was not observed; in- interface, these observed dropouts were very mi- stead a severe under-run issue occurred during nor, normally only being exhibited once or twice the second period and buffer size configurations at the beginning of synthesis. In the second whereby the server immediately began to under- scenario the amount of frame dropout did in- run before the client had even been launched. crease slightly. Interestingly, this time the O3 In relation to the third scenario, similar types optimised versions exhibited the best improve- of under-run behavior were seen as in the previ- ments, almost completely preventing all under- ous two scenarios whereby under-runs occurred runs. after the termination of the client. No frame 5.2.2 JACK dropouts were observed for the O2 optimisation for this particular scenario. Since in JACK the stream to the audio device is not managed by the client, under-runs can occur 6 Conclusion only not he server. In relation to the JACK sys- tem using the USB audio interface, most frame Throughout this paper we presented a study dropout issues observed occurred during the aimed at evaluating the responsiveness of a first size configurations (buffer 128 - period 16), Linux embedded system. As part of a larger the second one (buffer 128 - period 32) and the study on DMIs design, we focused on test- fourth one (buffer 256 - period 32). In regards to ing latency and quality of audio output on a the first and third scenarios, the frame dropouts BeagleBone Black board running the standard noted for the first period and buffer size con- Angstrom distribution with no kernel modifi- figuration were very severe for the O1 and O2 cations. Two different audio backend systems optimisations. The amount of under-run activ- were taken into consideration, one based on ity for this configuration made it very difficult ALSA, the other on JACK, and measurements to gather any measurements for latency; some- using a USB audio interface and an Audio Cape times the JACK server would under-run con- were compared. The test monitored event-to- tinuously without the client even being active. audio latency and included the monitoring of The O3 optimisation however did not experi- under-run activity. ence this problem for this configuration; under- Data analysis showed for both ALSA and runs were noted however were nowhere near as JACK audio systems remarkably low latency severe. In the case of the second scenario, far values, especially for small buffer and period less dropouts were observed consistently across size configurations. In particular, the use of the all build qualities, a surprising result. Again, Audio Cape allows for latency values lower than O3 optimisation proved to provide the best per- 3ms. In some of the different CPU scenarios formance enhancement. taken into consideration the audio stream pre- sented dropouts and clicks, especially when us- mobile augmented reality. Personal and Ubiq- ing small buffer and period size configurations. uitous Computing, 16(6):623–630. However, the usage of the different levels of code Duncan MacConnell, Shawn Trail, George optimisation available in the chosen compiler Tzanetakis, Peter Driessen, and Wyatt Page. (cross-Gcc O1, O2 and O3) completely fixed the 2013. Reconfigurable autonomous novel gui- audio quality in most of the tested configura- tar effects (range). In Sound and Music Com- tions. puting Conference. No previous works delved into the audio capa- bilities of the BeagleBone Black while running Jieun Oh, Jorge Herrera, Nicholas Bryan, the default Linux distribution (Angstrom with and Ge Wang. 2010. Evolving the mobile kernel 3.8). Compared to other distributions, phone orchestra. In International Conference like Ubuntu or Fedora, the usage of Angstrom on New Interfaces for Musical Expression, on the BeagleBone Black proved to support very Sydney, Australia. low latency configurations without the need of Matthew Wright, Ryan J. Cassidy, and a customised kernel. In the context of digital Michael F. Zbyszynski. 2004. Audio and musical instrument design, this feature is re- gesture latency measurements on linux and markable and makes the BeagleBone Black an osx. In International Computer Music Con- appealing platform for instrument development. ference, pages 423–429, Miami, FL. Interna- tional Computer Music Association. References Luca Abeni, Ashvin Goel, Charles Krasic, Jim Snow, and Jonathan Walpole. 2002. A measurement-based analysis of the real- time performance of linux. In IEEE Real Time Technology and Applications Sympo- sium, pages 133–142. IEEE Computer Soci- ety. Krystian Baclawski and Dariusz Jackowski. 2013. Komeda: Framework for interactive al- gorithmic music on embedded systems. In Sound and Music Computing Conference. Edgar Berdahl and Wendy Ju. 2011. Satellite ccrma: A musical interaction and sound syn- thesis platform. In International Conference on New Interfaces for Musical Expression. Edgar Berdahl, Spenser Salazar, and Myles Borins. 2013. Embedded networking and hardware-accelerated graphics with satellite ccrma. In International Conference on New Interfaces for Musical Expression. Jeri Ellsworth and Rick Johnson. 2013. Castar - technical illusions. http://technicalillusions.com/?page id=197. Matthias Kranz, Albrecht Schmidt, and Paul Holleis. 2009. Embedded interaction - inter- acting with the internet of things. IEEE In- ternet Computing, 99(1). Tobias Langlotz, Stefan Mooslechner, Ste- fanie Zollmann, Claus Degendorfer, Gerhard Reitmayr, and Dieter Schmalstieg. 2012. Sketching up the world: in situ authoring for