FluidSynth real-time and thread safety challenges

David Henningsson, FluidSynth Developer Team, [email protected]

Abstract

FluidSynth takes MIDI data as input and gives rendered audio samples as output. On the surface this might sound simple, but doing it with hard real-time guarantees, perfect timing, and in a thread safe way is difficult.

This paper discusses the different approaches that have been used in FluidSynth to solve that problem, in the past as well as the present, together with suggestions for the future.

Keywords

FluidSynth, real-time, thread safety, MIDI.

1 Introduction to FluidSynth

FluidSynth is one of the more common synthesizers in Linux today. It features a high level of compliance with the SoundFont (SF2) standard, as well as good performance. The design is modular enough to suit a variety of use cases.

FluidSynth does not come bundled with a GUI, but several front-ends exist. It does however come with several drivers for MIDI and audio, e.g. JACK, ALSA, PulseAudio, OSS, CoreAudio/CoreMidi (Mac OS X), and DirectSound (Windows).

1.1 FluidSynth's use cases

FluidSynth is not only a command-line application, but also a library used by more than 15 other applications [1], all putting their own requirements on the FluidSynth engine. Requirements include:
• low-latency guarantees, e.g. when playing live on a keyboard.
• fast rendering (sometimes known as "batch processing", a mode of operation where throughput matters more than latency), e.g. when rendering a MIDI file to disk.
• configurability, such as loading and changing soundfonts on the fly.
• monitoring the current state and what is currently happening inside the engine, needed by GUI front-ends and soundfont editors.

1.2 Introduction to SoundFonts

SoundFont (SF2) files contain samples and instructions for how to play them, just like similar formats such as DLS, Gigastudio and Akai. A soundfont renderer must implement features such as cut-off and resonance filters, ADSR envelopes, LFOs (Low-Frequency Oscillators), reverb and chorus, and a flexible system for how MIDI messages affect the different parameters of these features.
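To make the envelope concept concrete, here is a toy ADSR implementation in C. It is purely illustrative: the struct, the linear segment shapes and all names are invented here, and real SF2 envelopes additionally have delay and hold stages with their own timing and curve rules.

```c
#include <assert.h>

/* Toy linear ADSR envelope; times in seconds, sustain as a 0..1 level.
 * All names and the linear shapes are invented for illustration. */
typedef struct {
    double attack, decay, sustain, release;
} adsr_t;

/* Envelope level at time t for a note released at time 'held'. */
double adsr_level(const adsr_t *e, double t, double held) {
    double level;
    if (t < e->attack)
        level = t / e->attack;                                  /* ramp 0 -> 1 */
    else if (t < e->attack + e->decay)
        level = 1.0 - (1.0 - e->sustain) * (t - e->attack) / e->decay;
    else
        level = e->sustain;                                     /* held level */
    if (t >= held) {                                            /* release ramp */
        double r = 1.0 - (t - held) / e->release;
        level *= (r > 0.0) ? r : 0.0;
    }
    return level;
}
```

For example, with a 0.1 s attack the envelope is about halfway up after 0.05 s, and any point well past the release time yields silence.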

[Figure: MIDI input passes through MIDI processing (presets, tuning, gain, etc.) to one or more voices, which audio processing (interpolation, filters, etc.) turns into rendered audio, drawing on SF2 metadata and SF2 sample data.]

Overview of FluidSynth core

1.3 More background information

1.3.1 Buffer management

FluidSynth internally processes data in blocks of 64 samples (known as the FLUID_BUFSIZE constant in the code; I have never seen anyone change it). It is between these blocks that the rendering engine can recalculate parameters, such as the current LFO values and how they affect pitch, volume, etc.

There is also the concept of the audio buffer size, which controls the latency: the audio driver uses this size parameter to determine how often the system should wake up, execute one or more internal block rendering cycles, and write the result to the sound card's buffer.

1.3.2 MIDI processing latency

To understand some of the problems discussed below, it is also important to understand the difficulty of handling all MIDI messages in a timely fashion:
• Loading soundfonts or MIDI files from disk is the worst case, and is not guaranteed to execute within an acceptable amount of time, due to disk accesses.
• MIDI program change messages are troublesome, partly because the current API allows custom soundfont and preset loaders.
• Other MIDI messages, while not calling into other libraries (and thus into code with unknown latency), still take some time to process compared to just rendering a block.

2 Architecture before 1.1.0

FluidSynth has always had a multi-threaded architecture: one or more MIDI threads produce MIDI input to the synthesizer, while the audio driver thread asks for more samples. Other threads may set and get the current gain, or load new soundfonts.

2.1 Thread safety versus low latency

When the author got involved with the FluidSynth project a few years ago, thread safety was not being actively maintained, or at least not documented properly. There were no clear directions for users of FluidSynth's API on what could be done in parallel.

Yet there seems to have been some kind of balance: unless you stress tested it, it wouldn't crash that often, even though several race conditions could be found by looking at the source code. At the same time, latency performance was acceptable – again, unless you stress tested it, it wouldn't underrun that often.

This "balance" was likely caused by carefully selecting the places where a mutex was locked – the more MIDI messages and API calls protected by this mutex, the better the thread safety, but the worse the latency performance. In several places in the code, one could see this mutex locking code commented out.

2.2 The "drunk drummer" problem

An additional problem was the timing source: the only one available was the system timer, i.e. timing based on the computer's internal clock. This had two consequences.

The first: all rendering, even rendering to disk, took as long as the playing time of the song, so if a MIDI file was three minutes long, rendering that song would take three minutes, with the computer idling most of the time.
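As a sketch of this block scheme, the loop below serves an audio driver request as a series of FLUID_BUFSIZE blocks, recalculating parameters only on the block boundaries. FLUID_BUFSIZE is the real constant name; the function names and bookkeeping are invented for illustration, and partial trailing blocks are ignored for brevity.

```c
#include <stddef.h>

#define FLUID_BUFSIZE 64   /* FluidSynth's internal block size */

static int parameter_updates;  /* counts recalculations, for illustration */

/* Placeholder for recalculating LFO values, envelope stages, etc. */
static void recalculate_parameters(void) { parameter_updates++; }

/* Placeholder for rendering one 64-sample block of audio. */
static void render_block(float *out) {
    for (int i = 0; i < FLUID_BUFSIZE; i++)
        out[i] = 0.0f;  /* silence, in this sketch */
}

/* Serve an audio driver request of 'nframes' samples as a series of
 * internal blocks; parameters can only change on block boundaries. */
void render_audio(float *out, size_t nframes) {
    for (size_t done = 0; done + FLUID_BUFSIZE <= nframes; done += FLUID_BUFSIZE) {
        recalculate_parameters();
        render_block(out + done);
    }
}
```

A request for 256 samples thus triggers four parameter recalculations, one per 64-sample block.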

[Figure: An audio driver thread (render blocks), a shell thread (load new SF2 file), a MIDI thread (input from keyboard) and a GUI thread (set reverb width) all calling into the FluidSynth core.]
Different threads calling into FluidSynth

The second: with larger audio buffer/block sizes, timing got increasingly worse. (In high-latency scenarios, such as a MIDI file player, you would typically want as large a buffer as possible, both to avoid underruns and to improve overall performance.) Since audio was rendered one audio buffer at a time, MIDI messages could only be inserted between these buffer blocks. All notes and other MIDI events therefore became quantized to the audio block size. (Note that this quantization is not at all related to the intended timing of the music!)

This problem was labelled "the drunk drummer problem", since listeners were especially sensitive to bad timing in the drum track (even though the same bad timing was applied to all channels).

3 Architecture in 1.1.0 and 1.1.1

3.1 Queuing input

To make FluidSynth thread safe, it was decided to queue MIDI messages, as well as those API calls that set parameters in the engine. This was implemented as lock-free queues: a MIDI thread would insert a message into the queue, and the audio thread would be responsible for processing all pending MIDI messages before rendering the next block.

3.2 The sample timer

To make the drunk drummer sober again, the "sample timer" was added. It uses the number of rendered samples as a timing source instead of the system timer, which also allowed features such as fast MIDI-file rendering to be added. It was implemented so that on every 64th sample, a callback was made to the MIDI player, giving it a chance to process new MIDI messages.

3.3 Problems with the overhaul

3.3.1 Worse latency

As the audio thread was now expected to process all MIDI messages, there was more pressure on MIDI message handling to return in time, and audio latency now had to take MIDI processing into account as well. The sample timer made this even worse, as all MIDI file loading and parsing now also happened in the audio thread.

3.3.2 Reordering issues

To ease the now tougher task of the audio thread, program change messages were still processed in the MIDI thread, queueing the loaded preset instead of the MIDI message. However, this also meant that bank messages had to be processed immediately, or the program change would load the wrong preset. In combination with API calls for loading soundfonts, this became tricky, and there always seemed to be some combination order that was not handled correctly.

3.3.3 Not getting out what you're putting in

Since API calls were now being queued until the next rendering, this broke API users expecting to be able to read back what they had just written. E.g. if a GUI front-end set the gain and then read it back, it would not read the value it had just set, as that value had not yet been processed by the audio thread.

This was somewhat worked around by providing a separate set of variables that were updated immediately, but since these variables could be written simultaneously by several threads, writes and reads had to be atomic, which became difficult when writes and reads spanned several variables internally.
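The lock-free queueing of section 3.1 can be sketched as a single-producer/single-consumer ring buffer built on C11 atomics. This illustrates the general technique, not FluidSynth's actual queue code; all names and sizes are invented here.

```c
#include <stdatomic.h>

#define QSIZE 256  /* illustrative capacity; must be a power of two */

typedef struct {
    int items[QSIZE];
    _Atomic unsigned head;  /* written only by the consumer */
    _Atomic unsigned tail;  /* written only by the producer */
} spsc_queue_t;

/* Producer side (a MIDI thread): returns 0 if the queue is full. */
int queue_push(spsc_queue_t *q, int item) {
    unsigned tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&q->head, memory_order_acquire);
    if (tail - head == QSIZE)
        return 0;                            /* full */
    q->items[tail & (QSIZE - 1)] = item;
    atomic_store_explicit(&q->tail, tail + 1, memory_order_release);
    return 1;
}

/* Consumer side (the audio thread): returns 0 if the queue is empty. */
int queue_pop(spsc_queue_t *q, int *item) {
    unsigned head = atomic_load_explicit(&q->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (head == tail)
        return 0;                            /* empty */
    *item = q->items[head & (QSIZE - 1)];
    atomic_store_explicit(&q->head, head + 1, memory_order_release);
    return 1;
}
```

The key property is that the producer only ever writes `tail` and the consumer only ever writes `head`, so no locks are needed as long as there is exactly one thread on each side.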

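The "drunk drummer" effect of section 2.2 is easy to quantify: with system-timer-based timing, an event can only take effect at the next buffer boundary, so its onset is rounded up to a multiple of the buffer size. The helpers below illustrate this; the sample rate and buffer sizes used are examples, not FluidSynth defaults.

```c
/* Sample position at which an event intended for 'event_time' (in samples)
 * actually sounds, given buffer-boundary quantization. */
unsigned long quantize_to_buffer(unsigned long event_time, unsigned long buffer_size) {
    return ((event_time + buffer_size - 1) / buffer_size) * buffer_size;
}

/* Worst-case timing error in milliseconds for a given buffer size. */
double worst_case_error_ms(unsigned long buffer_size, double sample_rate) {
    return 1000.0 * (double)(buffer_size - 1) / sample_rate;
}
```

At 44100 Hz, a 4096-sample buffer gives a worst-case error of almost 93 ms, clearly audible on a drum track, while quantizing only to the internal 64-sample block would keep the error below 1.5 ms.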
[Figure: Shell, MIDI and GUI threads feed MIDI/API queues into the FluidSynth core (MIDI processing and audio rendering); the audio driver thread renders blocks.]
Threading architecture in 1.1.0

4 Architecture in 1.1.2 and later

To overcome the problems introduced with 1.1.0, the thread safety architecture was once again rewritten in 1.1.2. This time, it was decided to split the engine into two parts: one for handling MIDI and one for handling audio. Hard real-time is guaranteed for the audio thread only, in order not to miss a deadline and cause underruns.

For MIDI, the synth no longer has an input queue, but is instead protected by a mutex. (The mutex can optionally be turned off for cases where the API user can guarantee serialized calls into the API, e.g. in some embedded use cases.) This means that if one thread calls into the API to do something time intensive, such as loading a new soundfont, other MIDI threads will be delayed in the meantime and will have to wait until soundfont loading is finished.

4.1 The new queue

Instead of a MIDI input queue, the queue has moved to sit between the MIDI handling and the audio thread. Instead of queuing the MIDI messages themselves, the outcome of the MIDI processing is queued to the audio thread. This relieves the audio thread of MIDI processing, so audio latency is improved. If MIDI processing is lengthy, the end result is that the effect of that event is delayed – compared to 1.1.0, where the result would have been an underrun.

4.2 Return information

A queue with return information also had to be added, with information flowing from the audio rendering thread to the MIDI threads. It is used to notify the MIDI processing when a voice has finished, so that the voice can be reallocated at the next MIDI note-on event. This return information queue is processed right after the mutex is locked.

5 Conclusion and suggestions for the future

While the architecture in 1.1.2 seems to have been more successful than the previous attempts in terms of stability, it is still not optimal. There is still work that could be done to improve thread safety and real-time performance further.

5.1 Sample timers

Given the new architecture, the sample timer mechanism needs to be rewritten to work optimally under low latency conditions: as it currently stands, the audio thread triggers the sample timer, which in turn performs potentially lengthy MIDI processing.

To solve this problem without regressing back to the "drunk drummer", one would need to add a "mini sequencer" into the event queue, so that events can be scheduled for processing by the audio thread not only before a run of 64-sample blocks, but also between them. This would further require the MIDI part of the synth to be time aware, so that it knows where to insert the new queue item. The MIDI player would also need a separate thread, monitoring the progress of the audio stream and adding more MIDI events as necessary.
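The "mini sequencer" idea of section 5.1 could be sketched as follows: queued events carry a sample timestamp, and before rendering each 64-sample block the audio thread dispatches exactly the events that fall inside that block. All names here are invented for illustration, and a real implementation would pull events from the lock-free queue rather than a plain array.

```c
#include <stddef.h>

#define BLOCK 64        /* internal block size, as in FluidSynth */
#define MAX_EVENTS 64   /* illustrative capacity */

typedef struct { unsigned long time; int data; } event_t;

static event_t pending[MAX_EVENTS];
static size_t n_pending;
static int dispatched[MAX_EVENTS];   /* records dispatch order, for the sketch */
static size_t n_dispatched;

/* MIDI side: schedule an event at an absolute sample position. */
void schedule_event(unsigned long time, int data) {
    pending[n_pending].time = time;
    pending[n_pending].data = data;
    n_pending++;
}

/* Audio thread, called before rendering the block starting at 'start':
 * dispatch every queued event whose timestamp falls inside this block. */
void dispatch_events_for_block(unsigned long start) {
    for (size_t i = 0; i < n_pending; ) {
        if (pending[i].time >= start && pending[i].time < start + BLOCK) {
            dispatched[n_dispatched++] = pending[i].data;
            pending[i] = pending[--n_pending];  /* remove; order not preserved */
        } else {
            i++;
        }
    }
}
```

An event stamped at sample 100 is then dispatched before the block starting at sample 64, i.e. with at most one block of quantization rather than a whole audio buffer.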

[Figure: GUI, MIDI and shell threads acquire a mutex before calling into the FluidSynth core for MIDI processing; a voice changes queue connects it to the FluidSynth core for audio rendering, driven by the audio driver thread (render blocks).]
Threading architecture in 1.1.2

5.2 Synchronous MIDI/Audio

Synchronous MIDI and audio can be a problem in particular when using JACK MIDI in conjunction with JACK audio, because JACK calls into both the MIDI and the audio callbacks synchronously. To try to avoid MIDI blocking audio rendering, MIDI input could be queued to a lower priority thread and processed as time permits. Care must be taken that this does not happen when JACK is running in its "freewheeling" mode (see footnote 5 below), where MIDI and audio callbacks should be processed in the exact order they arrive.

5.3 Monitoring the audio thread

A sometimes asked-for feature, in particular by soundfont editors and sometimes by other GUI front-ends, is the ability to monitor the progress of the audio rendering.

This could be e.g. to see the current sample position of a particular voice, or to receive callbacks whenever something happens in the audio thread, e.g. when a voice enters a new envelope stage. This is currently difficult, as information is optimized to flow from the MIDI part to the audio thread, not the other way around.

One solution would be for the audio thread to continuously write down relevant information into a buffer that the MIDI part could read upon request. Care must be taken that the MIDI part does not read partially updated information (and thus get a potentially inconsistent view), but at the same time an ongoing read should not block a new write. This can be done with some clever atomic pointer exchanges.

The audio thread's write-down operations could potentially be turned on and off, in order not to hurt performance for users not needing the feature.

6 Acknowledgements

Thanks go to the rest of the FluidSynth developer team (Joshua "Element" Green and Pedro Lopez-Cabanillas), as well as all other contributors to FluidSynth, including patch writers, testers, and people helping out on the FluidSynth development mailing list.

References

[1] FluidSynth wiki: Applications using FluidSynth. http://sourceforge.net/apps/trac/fluidsynth/wiki/Applications
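One way to realize the non-blocking snapshot described in section 5.3 is a seqlock rather than a literal pointer exchange: the audio thread brackets its writes with increments of a sequence counter, and a reader retries until it sees an even, unchanged counter. This is a sketch with invented names; a production version would also need to address the formal data race on the snapshot fields (e.g. by making them atomic) and choose memory orderings more carefully.

```c
#include <stdatomic.h>

/* Example monitoring data the audio thread might expose. */
typedef struct {
    unsigned long sample_pos;   /* e.g. current sample position of a voice */
    int active_voices;
} snapshot_t;

static snapshot_t snap;
static _Atomic unsigned seq;

/* Audio thread: publish new monitoring data; never blocks. */
void publish_snapshot(unsigned long pos, int voices) {
    atomic_fetch_add(&seq, 1);      /* odd: write in progress */
    snap.sample_pos = pos;
    snap.active_voices = voices;
    atomic_fetch_add(&seq, 1);      /* even again: write complete */
}

/* MIDI/GUI side: retry until a consistent snapshot has been read. */
snapshot_t read_snapshot(void) {
    snapshot_t copy;
    unsigned before, after;
    do {
        before = atomic_load(&seq);
        copy = snap;
        after = atomic_load(&seq);
    } while ((before & 1) || before != after);
    return copy;
}
```

The writer (the audio thread) never blocks or retries; only readers loop, and only if they raced with a write.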

Footnote 5: This is a JACK term indicating that the JACK server is currently rendering as fast as the CPU permits, e.g. when performing a mixdown to disk – not unlike the "fast rendering" mode of FluidSynth.