RtAudio: A Cross-Platform C++ Class for Realtime Audio Input/Output
In Proceedings of the 2002 International Computer Music Conference, Göteborg, Sweden

Gary P. Scavone
Center for Computer Research in Music and Acoustics
Department of Music, Stanford University
Stanford, California 94305-8180 USA

Abstract

This paper presents a cross-platform C++ class for realtime audio input and output streaming. RtAudio provides a flexible, easy to use application programming interface (API) which allows complete audio system control, including device capability querying, multiple concurrent streams, blocking and callback functionality. RtAudio is currently supported on Windows platforms using the DirectSound API, Linux platforms using both the OSS and ALSA APIs, and on Irix platforms. Support for OS-X and Steinberg ASIO drivers is planned for Spring 2002.

1 Introduction

While programming languages have gained standardized support across the myriad of computer platforms and operating systems in existence, a commonly supported API for audio programming is far from a reality. As a result, an attempt to provide multi-platform support for an audio application can prove difficult at best. To further complicate matters, multiple audio driver interfaces often exist for a single operating system. For example, Windows platforms have DirectSound, Windows Multimedia Library, and ASIO (Steinberg) driver options, Linux platforms have Open Sound System (OSS) and Advanced Linux Sound Architecture (ALSA) drivers, and Macintosh platforms have Sound Manager, ASIO and Core Audio drivers. RtAudio was designed to provide a common interface across a variety of these APIs in as flexible, yet simple, manner as possible.

RtAudio was originally developed to provide audio input/output support for the Synthesis ToolKit in C++ (STK) [Cook and Scavone 1999]. However, the latest release of RtAudio (version 2.0, January 2002) was designed to function independently from STK, as well as any libraries other than those necessitated by the underlying platform-specific audio interfaces.

2 Features & Design Goals

RtAudio is a C++ class which provides a common API for realtime audio input/output across Linux, Irix, and Windows operating systems. RtAudio significantly simplifies the process of interfacing with computer audio hardware. It was designed with the following goals:

• object-oriented C++ structure
• single independent header and source file for easy inclusion in programming projects
• blocking and callback functionality
• flexible, easy to use, audio device parameter control
• automatic internal conversion for data format, channel number compensation, de-interleaving, and byte-swapping
• control over multiple audio streams and devices with a single class instance
• audio device capability probing

RtAudio incorporates the concept of audio streams, which represent independent audio output (playback) and/or input (recording) “connections” to audio devices. Available audio devices and their capabilities can be enumerated and then specified when opening a stream. Multiple streams can run at the same time and, when allowed by the underlying audio API, a single device can serve multiple streams.

The RtAudio API provides both blocking (synchronous) and callback (asynchronous) functionality. Callbacks offer a simple means for achieving non-blocking audio input/output. Blocking functionality is often necessary for explicit control of multiple input/output stream synchronization or when audio must be synchronized with other system events. All public RtAudio functions are thread-safe. This allows users to safely embed blocking RtAudio functions within a multi-threaded programming structure of their own design.

RtAudio offers uniform support for 8-bit, 16-bit, 24-bit, and 32-bit signed integer data formats, as well as 32-bit and 64-bit floating point formats. When an audio device does not natively support a requested user format, RtAudio provides automatic format conversion. In addition, internal routines will automatically perform any byte-swapping, channel number compensation, and channel de-interleaving required by the underlying audio driver or hardware.

On Linux platforms, both native ALSA and OSS audio APIs are supported. Portability to other OSS supported systems, such as Solaris and HP-UX, is untested but most likely easily achieved. The ALSA driver model was recently incorporated into the Linux development kernel and will likely gain wide acceptance in the near future. The ALSA API provides a more developed level of support for professional quality audio devices than OSS. On Windows platforms, only the DirectSound API is currently supported. On SGI platforms, the newer “al” API is supported.

The RtAudio API incorporates many of the concepts developed in the PortAudio project [Bencina and Burk 2001]. RtAudio distinguishes itself from PortAudio in its object-oriented, C++ framework, single-file encapsulation, native blocking support, ALSA support, thread-safe routines, and slightly less ambitious API (which makes RtAudio less prone to bugs and easier to maintain and extend).

All source code for RtAudio is made freely available, allowing full user extensibility and customization. RtAudio is distributed with a tutorial and complete API documentation in HTML, PDF, and RTF formats.

3 The RtAudio API

All uses of RtAudio must begin with object instantiation. The default constructor RtAudio() scans the underlying audio system to verify that at least one audio input/output device is available. RtAudio uses C++ exceptions to handle critical errors, necessitating try/catch blocks around most member functions as well as constructors. Likewise, all uses of RtAudio must end with class destruction.

RtAudio uses a C++ exception handler called RtError, which is declared and defined within the RtAudio class files. An RtError can be caught by type, providing a means for error correction or at a minimum, more detailed error reporting. Almost all RtAudio methods can “throw” an RtError, most typically if an invalid stream identifier is supplied to a method or a driver error occurs. There are a number of cases within RtAudio where warning messages may be displayed but an exception is not thrown.

3.1 Device Capabilities

RtAudio provides the following functions for use in probing the number and capabilities of available audio devices:

    int getDeviceCount (void);

    void getDeviceInfo (int device,
                        RTAUDIO_DEVICE *info);

The RTAUDIO_DEVICE structure contains information commonly required in assessing the capabilities of an audio device, including its name, minimum and maximum number of input, output, and duplex channels, supported sample rates, and native data formats.

3.2 Stream Creation & Parameters

In addition to the default constructor, RtAudio provides an overloaded constructor which allows a stream to be immediately opened with a given set of device parameters. Alternately, a stream can be opened after instantiation in much the same way.

    RtAudio (int *streamId,
             int outputDevice,
             int outputChannels,
             int inputDevice,
             int inputChannels,
             RTAUDIO_FORMAT format,
             int sampleRate,
             int *bufferSize,
             int numberOfBuffers);

    int openStream (int outputDevice,
                    int outputChannels,
                    int inputDevice,
                    int inputChannels,
                    RTAUDIO_FORMAT format,
                    int sampleRate,
                    int *bufferSize,
                    int numberOfBuffers);

A stream is opened with specified output and input devices, output and input channels, data format, sample rate, and buffer parameters. When successful, a stream identifier is returned which must be used for subsequent function calls on the stream. Audio devices are identified by integer values of one and higher, as enumerated by the getDeviceInfo() function. In addition, the system default input/output devices are identified by a zero value. When a device identifier of zero is specified during stream creation, RtAudio first attempts to open the default audio device(s) with the given parameters. If that fails, an attempt is made to find a device or set of devices which will meet the given parameters. If all attempts are unsuccessful, an RtError is thrown. When a positive, non-zero device value is specified, no additional devices are probed. Example program code is provided in the appendix of this paper.

Because RtAudio can be used to simultaneously control more than a single stream, it is necessary that the returned stream identifier be provided to nearly all public methods.

The bufferSize parameter specifies the desired number of sample frames which will be written to and/or read from a device per write/read operation. Both the bufferSize and numberOfBuffers [...]

[...] the user must first get a pointer to the stream buffer, provided by RtAudio, for use in feeding data to/from the opened stream. Memory management for the stream buffer is automatically controlled by RtAudio. The bufferSize value returned during stream creation defines the length, in sample frames, of the stream buffer. Multichannel data in the stream buffer must be in interleaved order.

    char *const getStreamBuffer (int streamId);
    void tickStream (int streamId);
    int streamWillBlock (int streamId);

After starting the stream, the sequence of events then consists of filling or reading from the stream buffer between calls to tickStream(). The tickStream() function blocks until the data within the stream buffer can be com-