Game Audio via OpenAL
Summary

In the Graphics For Games module, you learnt how to use OpenGL to create complex graphical scenes, improving your programming skills along the way, and learning about data structures such as scene graphs. In this workshop, you'll see how to add sounds to your game worlds using the OpenAL sound library.

New Concepts

Sound in games, OpenAL, PCM audio, binary file formats, FourCC codes, WAV files, limited resource management

Introduction

Audio has played an important part in gaming almost as long as there have been games to play - even Pong, back in 1972, had simple sound effects. We've moved a long way since then; the 80s brought dedicated audio hardware such as the Commodore 64's SID chip, which could play 3 simultaneous synthesised sounds, and later the Amiga brought the ability to play audio samples to the home gaming market. On modern gaming hardware, we can expect to hear many simultaneous sounds and music tracks, often in surround sound. Game developers now employ dedicated sound engineers who carefully adjust the sounds in each game release to create an immersive aural experience - making sure that each individual sound is uniquely identifiable and correctly equalised, and that every music track suits the situation it will be played in.

At the heart of a game's audio experience is the code that plays back the game sounds and calculates which speakers they should use - the sound system. Although we can't hope to compete with the complex sound systems of AAA games, we should still be able to make a robust, simple system for adding sound to our 3D games, and that's what this workshop will assist you in creating.

Audio Data

Before we even begin to explore how to add in-game audio to our games, it's worthwhile having a brief overview of what sound actually is. Sound is simply the propagation of pressure waves through a medium, such as water or air.
If we were to take a look at a sound wave using an oscilloscope, it'd look something like this: a single continuous wave, with an amplitude (how large the peaks and troughs of the wave are - the larger the amplitude, the louder the sound), and a frequency (how often these peaks and troughs occur - the higher the frequency, the higher pitched the sound).

This poses a problem for computers. Sound waves are continuous, yet computers store discrete, digital values - so how do we store our game audio? The answer is Pulse-Code Modulation, or PCM for short. To digitally represent sound, the sound is regularly sampled, and its amplitude stored as a value. PCM data has two important components - the sample rate (measured in hertz, meaning cycles per second, just like a computer's CPU), which determines how often a sample of the sound is taken, and the bit depth, which determines the range of amplitude values that can be stored. As the bit depth is limited, and so only a finite number of values can be stored, the sampled values of a sound wave are quantised - rounded to the nearest value that the bit depth can hold.

Here's an example of our sound wave being transformed into PCM data:

Clockwise from top left: a section of the original analogue wave; low bit depth and sample rate; medium bit depth and sample rate; high bit depth and sample rate

You should be able to see how both bit depth and sample rate determine the final quality of the digitised sound. To give you an idea of the commonly used values for bit depth and sample rate, audio CDs encode their data at a sample rate of 44,100Hz (over 44,000 samples a second!), and a bit depth of 16 (65,536 unique values). The audio hardware in your computer can natively handle PCM data, and can turn it back into an analogue signal for output via speakers or headphones.

WAV Files

One of the most popular file formats for storing PCM data is IBM / Microsoft's WAV file.
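The sampling and quantisation process described in the Audio Data section can be sketched in code. The following is a minimal, hypothetical example (not part of the workshop framework): it samples a sine wave of a given frequency at a given sample rate, and quantises each amplitude to a signed 16-bit value, just as 16-bit PCM does.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Sample a sine wave of the given frequency and quantise it to 16-bit PCM.
// sampleRate is in hertz; seconds is the length of audio to generate.
std::vector<int16_t> GenerateSine16(float frequency, int sampleRate, float seconds) {
    const float TWO_PI = 6.28318530718f;
    size_t numSamples = (size_t)(sampleRate * seconds);
    std::vector<int16_t> samples(numSamples);
    for (size_t i = 0; i < numSamples; ++i) {
        // Continuous amplitude in the range [-1, 1]...
        float amplitude = sinf(TWO_PI * frequency * (float)i / (float)sampleRate);
        // ...quantised to one of 65,536 discrete values.
        samples[i] = (int16_t)(amplitude * 32767.0f);
    }
    return samples;
}
```

Generating one second of a 440Hz tone at 44,100Hz produces 44,100 samples - exactly the CD-quality figures mentioned above.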
Unlike the text files you'll have been reading and writing so far, WAV is a 'binary' file format, meaning it contains no human-readable data. Well, almost. WAV files use something known as FourCCs, short for Four-Character Codes, to identify the start of data structures within the file, commonly known as 'chunks'. As the name suggests, a FourCC is a simple 4-byte identifier, with the byte values in the range that represents readable characters. If you were to open a WAV file in a text editor such as Notepad++, you'd see the phrase RIFF at the start of the file, and then WAVE a few characters later - FourCCs that identify the file as type RIFF (a container file format, meaning it could contain any sort of binary data), with WAVE data in it.

Here's a more detailed example of a WAV file's contents:

After the initial file header containing RIFF and WAVE, we have a number of chunks, each of which has a FourCC, a size value, and a number of bytes matching that size value - this structure allows the file to be read in correctly, and searched through if necessary. WAV files can have many types of chunk, containing information on MIDI playback, cue indicators within the file data, and even space for text notes. The two most important chunks, and the only two we'll be processing in the code later on, are the format chunk and the data chunk.

Here's what the format chunk looks like:

Like other WAV chunks, it begins with a FourCC and a size, before defining a number of important characteristics of the PCM data contained within the WAV. We can determine the number of channels the PCM data has (i.e. whether it is mono, stereo, 5.1, and so on), the sample rate, and the bit depth. WAV files can contain either uncompressed or compressed PCM data (although uncompressed is far more common), so a compression code is also defined.
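To make the chunk layout concrete, here's a small, hypothetical sketch of how a chunk header might be represented and matched against a FourCC in C++. The real loading code comes later in the workshop; the names here are illustrative only.

```cpp
#include <cstdint>
#include <cstring>

// Every RIFF chunk starts with a 4-byte identifier and a 4-byte size,
// followed by 'size' bytes of chunk data.
struct ChunkHeader {
    char     chunkID[4]; // FourCC, e.g. "fmt " or "data" (not null-terminated!)
    uint32_t chunkSize;  // number of data bytes that follow this header
};

// Compare a chunk's FourCC against a 4-character string.
bool MatchesFourCC(const ChunkHeader& header, const char* fourCC) {
    return memcmp(header.chunkID, fourCC, 4) == 0;
}
```

A loader would read a ChunkHeader, test its FourCC, and either process the chunk or skip chunkSize bytes forward to reach the next one - which is exactly how unwanted chunks can be searched past.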
Format chunks sometimes also specify additional data to support the compression type, so another size value is defined, determining the number of bytes of this extra data.

Finally, this is what the data chunk contains:

Not much to it! It's basically just a big lump of binary data, containing either compressed or uncompressed PCM data. All that's really worth pointing out is how the number of channels is handled by the PCM sample data: for each given 'time slice' of data (remember the 44,100 samples per second for a CD?), the PCM data for each channel is stored, one after the other, then the next samples, and so on. This interleaving of sample data per channel allows multichannel sound to be easily 'streamed' from the data source, without having to read data from different parts of the file.

Audio Hardware

Much as with graphical output, sound is usually handled by dedicated hardware, which may have a wide variety of capabilities. Inexpensive 'on board' sound chips might be little more than a digital to analogue signal converter connected to some audio jacks, while more expensive hardware may have support for natively decoding compressed audio, and contain sophisticated signal processors to add effects such as reverb. Higher end sound hardware will have a number of discrete channels, or voices, that it can support (that is, the number of concurrent PCM sounds it can process into analogue data and output at one time - a value that is generally independent of whether the PCM data is mono, stereo, etc.), while lower-end hardware might perform all PCM data mixing in software, outputting to a single combined channel. Depending on the exact audio features of the hardware, there might even be some amount of dedicated RAM, used to buffer sound samples for higher performance.

OpenAL

As with graphics cards, the majority of the varying hardware features of sound devices are hidden behind an API, which will in turn target the vendor-provided drivers of the hardware.
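Before looking at OpenAL itself, the per-channel interleaving described in the data chunk section is worth a quick sketch. This hypothetical helper splits an interleaved stereo stream (L R L R ...) back into separate per-channel buffers; the function name is illustrative, not part of any real API.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Split interleaved stereo PCM (L R L R ...) into separate per-channel
// buffers. Each 'time slice' stores one sample per channel, in turn.
void DeinterleaveStereo(const std::vector<int16_t>& interleaved,
                        std::vector<int16_t>& left,
                        std::vector<int16_t>& right) {
    left.clear();
    right.clear();
    for (size_t i = 0; i + 1 < interleaved.size(); i += 2) {
        left.push_back(interleaved[i]);      // channel 0 sample
        right.push_back(interleaved[i + 1]); // channel 1 sample
    }
}
```

Notice that reading the interleaved data front-to-back visits every channel of each time slice in order, which is why streaming multichannel audio never needs to seek around the file.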
There are a few of these APIs around: DirectSound for Windows, XAudio2 for Windows and the Xbox 360, and ALSA for Linux. One of the most pervasive audio APIs is OpenAL, currently hosted and maintained by Creative Labs. As its name suggests, OpenAL aims to be for audio what OpenGL is for graphics - a cross-platform, easy to use API that supports both hardware acceleration and software fallbacks where necessary. Unlike OpenGL, which has an Architecture Review Board to discuss features and improvements to the language (currently handled by the Khronos Group), no such collaborative entity currently exists for OpenAL, with all recent development being undertaken by Creative Labs, developers of the SoundBlaster series of sound cards. Like OpenGL, OpenAL is a giant 'state machine', with its API functions adding commands to an internal stack, which are popped off one after the other as they are processed. Implementations of OpenAL exist for Windows, Apple Mac and iOS devices, Android, and the Xbox 360. Sony's PlayStation 3 has an OpenAL-like sound API called x, making OpenAL a popular choice for cross-platform programs, and a firm understanding of how to use it an advantage for a budding game developer.

OpenAL allows a number of sound sources to be played in 3D space, with all of the processing required to determine the volume of sound required in each speaker to generate the spatial effect handled automatically. It supports a number of parameters to determine how samples are played back, and how distance from a sound source affects playback volume.
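To give a flavour of what's to come, here's a minimal, hypothetical sketch of OpenAL playback - opening a device, creating a buffer and a source, and positioning that source in 3D space. It assumes the OpenAL headers and library are installed; error checking is omitted for brevity, the alBufferData call is commented out because the PCM data would come from the WAV loader built later in the workshop, and the variable names are illustrative.

```cpp
#include <AL/al.h>
#include <AL/alc.h>

int main() {
    // Open the default audio device and make a context current on it.
    ALCdevice* device = alcOpenDevice(NULL);
    if (!device) {
        return -1; // no audio hardware available
    }
    ALCcontext* context = alcCreateContext(device, NULL);
    alcMakeContextCurrent(context);

    // Create a buffer to hold PCM data. In the full workshop code, the
    // 16-bit mono samples from a WAV file's data chunk would be uploaded:
    ALuint buffer;
    alGenBuffers(1, &buffer);
    // alBufferData(buffer, AL_FORMAT_MONO16, pcmData, pcmSize, sampleRate);

    // Create a source, attach the buffer, position it in 3D space, and play.
    ALuint source;
    alGenSources(1, &source);
    alSourcei(source, AL_BUFFER, buffer);
    alSource3f(source, AL_POSITION, 0.0f, 0.0f, -5.0f);
    alSourcePlay(source);

    // ...the game loop would run here; clean up when finished.
    alDeleteSources(1, &source);
    alDeleteBuffers(1, &buffer);
    alcMakeContextCurrent(NULL);
    alcDestroyContext(context);
    alcCloseDevice(device);
    return 0;
}
```

The speaker-volume calculations mentioned above all happen behind alSource3f's AL_POSITION parameter - move the source or the listener, and OpenAL redistributes the sound across the available output channels automatically.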