Audio File Formats

An audio file format is a container format for storing audio data on a computer system. There are numerous file formats for storing audio data. The general approach is to sample the audio voltage (which, on playback, corresponds to a certain position of the membrane in a speaker) of the individual channels with a certain resolution (the number of bits per sample) at regular intervals (which determine the sample rate). This data can then be stored uncompressed, or compressed to reduce the file size.
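The sampling step described above can be sketched in a few lines of Python. This is only an illustration: the function name `sample_and_quantize`, the 1000 Hz test tone and the 8000 Hz sample rate are assumptions chosen for the example, and a sine function stands in for the measured audio voltage.

```python
import math

def sample_and_quantize(freq_hz, sample_rate, bits, n_samples):
    """Sample a unit-amplitude sine at regular intervals and quantize each
    sample to a signed integer of the given bit depth (hypothetical helper)."""
    max_level = 2 ** (bits - 1) - 1           # largest positive value, e.g. 32767 for 16 bits
    samples = []
    for n in range(n_samples):
        t = n / sample_rate                   # time of the n-th sample, in seconds
        amplitude = math.sin(2 * math.pi * freq_hz * t)   # stands in for the voltage, in [-1.0, 1.0]
        samples.append(round(amplitude * max_level))
    return samples

# The same tone at two resolutions: more bits per sample means finer steps.
samples_16 = sample_and_quantize(1000, 8000, 16, 8)
samples_8 = sample_and_quantize(1000, 8000, 8, 8)
print(samples_16)
print(samples_8)
```

A higher sample rate gives more values per second, and a higher resolution gives more possible values per sample; both increase the amount of data to be stored.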

Types of formats

It is important to distinguish between a file format and a codec. A codec performs the encoding and decoding of the raw audio data, while the data itself is stored in a file with a specific audio file format. Though most audio file formats support only one audio codec, a file format may support multiple codecs, as AVI does. There are three major groups of audio file formats:

- Uncompressed audio formats, such as WAV, AIFF and AU;

- Formats with lossless compression, such as FLAC, Monkey's Audio (APE), WavPack, TTA, and lossless Windows Media Audio (WMA); and

- Formats with lossy compression, such as MP3, lossy Windows Media Audio (WMA) and AAC.

Uncompressed audio format

There is one major uncompressed audio format, PCM, which is usually stored as a .wav file on Windows or as an .aiff file on Mac OS. WAV is a flexible file format designed to store more or less any combination of sampling rates or bitrates. This makes it an adequate file format for storing and archiving an original recording. A lossless compressed format would require more processing for the same duration of recorded audio, but would use less space.

WAV, like any other uncompressed format, encodes all sounds, whether they are complex sounds or absolute silence, with the same number of bits per unit of time. For example, suppose a file contains a minute of a symphony orchestra playing followed by a minute of silence. If the sound were stored as WAV, the same amount of data would be used for each half. If the data were encoded with TTA, the first minute would be somewhat smaller than in the WAV file, and the silent half would take almost no disk space at all. Recording in the TTA format would, however, require a lot more processing than recording in WAV.

The WAV format is based on the RIFF file format, which is similar to the IFF format.
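The orchestra-and-silence example above can be reproduced on a small scale. The sketch below uses Python's standard `wave` module to build two WAV files in memory, then uses `zlib` as a stand-in for a lossless compressor. Note the assumptions: `zlib` is a general-purpose compressor, not TTA or any audio codec, and a mix of three sine tones stands in for the orchestra, so only the qualitative behaviour carries over.

```python
import io
import math
import struct
import wave
import zlib

SAMPLE_RATE = 8000      # Hz; kept low so the example runs quickly
SECONDS = 2

def pcm_frames(generator):
    """Pack one mono channel of 16-bit signed little-endian samples."""
    n = SAMPLE_RATE * SECONDS
    return b"".join(struct.pack("<h", generator(i)) for i in range(n))

def wav_bytes(frames):
    """Wrap raw PCM frames in a WAV container, entirely in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)           # 2 bytes = 16 bits per sample
        w.setframerate(SAMPLE_RATE)
        w.writeframes(frames)
    return buf.getvalue()

def tone(i):
    """A mix of three sines, standing in for complex musical content."""
    t = i / SAMPLE_RATE
    mix = sum(math.sin(2 * math.pi * f * t) for f in (440.0, 613.7, 1234.5))
    return int(10000 * mix)

def silence(i):
    return 0

wav_tone = wav_bytes(pcm_frames(tone))
wav_silence = wav_bytes(pcm_frames(silence))

# zlib is only a stand-in for a lossless audio codec such as TTA or FLAC.
packed_tone = zlib.compress(wav_tone)
packed_silence = zlib.compress(wav_silence)

print("WAV sizes: ", len(wav_tone), len(wav_silence))
print("zlib sizes:", len(packed_tone), len(packed_silence))
```

The two WAV files come out the same size, while the compressed silence shrinks to a small fraction of the compressed tone. Decompressing either result returns the original bytes exactly, which is what "lossless" means.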

BWF (Broadcast Wave Format) is a standard audio format created by the European Broadcasting Union as a successor to WAV. BWF allows metadata to be stored in the file. See: European Broadcasting Union: Specification of the Broadcast Wave Format - A format for audio data in broadcasting. EBU Technical document 3285, July 1997. This format is the primary recording format used in many professional audio workstations in the television and film industry. Stand-alone file-based multi-track recorders from Sound Devices, Zaxcom, HHB USA, and Aaton all use BWF as their preferred file format for recording multi-track audio files with SMPTE time code reference. This standardized time stamp in the Broadcast Wave File allows for easy synchronization with a separate picture element.

Linear pulse-code modulation (LPCM) is a method of encoding audio information digitally. The term also refers collectively to formats using this method of encoding. The term pulse-code modulation (PCM), though strictly more general, is often used to describe data encoded as LPCM.

LPCM is a particular method of pulse-code modulation which represents an audio waveform as a sequence of amplitude values recorded at a sequence of times. LPCM is PCM with linear quantization.[5] LPCM represents sample amplitudes on a linear scale.[6] LPCM specifies that the values stored are proportional to the amplitudes, rather than representing, say, the logarithm of the amplitude (e.g. A-law/μ-law), or being related in some other manner (e.g. DPCM or ADPCM).[6] In practice these values will be quantized.

LPCM audio is coded using a combination of parameters: resolution/sample size (e.g. 8, 16, 20 or 24 bits), frequency/sample rate (e.g. 8000, 11025, 16000, 22050, 24000, 32000, 44100 or 48000 Hz, i.e. samples per second), sign (signed or unsigned), number of channels (monaural, stereo, quadraphonic, etc.), interleaving of channels, and byte order (little endian or big endian).[7] If the sample is 16-bit signed, the sample range is from -32768 to 32767, with a centerpoint of 0.[8] (For example, signed LPCM data is used on Audio CD, DVD, 16-bit LPCM in WAV, audio/L16, etc.) If the sample is 16-bit unsigned, the sample range is from 0 to 65535, with a centerpoint of 32768.[7]
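The sample ranges, byte order and channel interleaving mentioned above can be checked with a short Python sketch. The helper name `lpcm_range` and the sample values are illustrative assumptions, not part of any standard.

```python
import struct

def lpcm_range(bits, signed):
    """Return (minimum, maximum, centerpoint) for integer LPCM samples
    (hypothetical helper for illustration)."""
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1, 0
    return 0, 2 ** bits - 1, 2 ** (bits - 1)

signed_16 = lpcm_range(16, signed=True)      # (-32768, 32767, 0)
unsigned_16 = lpcm_range(16, signed=False)   # (0, 65535, 32768)

# Byte order: the same 16-bit sample stored little endian and big endian.
sample = 1000                                # 0x03E8
little = struct.pack("<h", sample)           # low byte first:  E8 03
big = struct.pack(">h", sample)              # high byte first: 03 E8

# Interleaving: a stereo stream alternates left and right samples.
left, right = [1, 2, 3], [-1, -2, -3]
interleaved = [s for frame in zip(left, right) for s in frame]

print(signed_16, unsigned_16)
print(little.hex(), big.hex())
print(interleaved)
```

A reader of LPCM data has to know all of these parameters in advance; the same bytes decode to different amplitudes if the sign or byte order is assumed wrongly.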

Lossless audio formats

Lossless audio formats (such as TTA and FLAC) provide a compression ratio of about 2:1.

Open File Formats

wav - the standard audio file format used mainly on Windows PCs. It is commonly used for storing uncompressed (LPCM, Linear Pulse Code Modulation), CD-quality sound files, which means that they can be large: around 10 MB per minute of audio. It is less well known that wave files can also be encoded with a variety of codecs to reduce the file size (for example the GSM codec). WAVE or WAV, short for Waveform Audio File Format (also, but rarely, named Audio for Windows[9]), is a Microsoft and IBM audio file format standard for storing an audio bitstream on PCs. It is an application of the RIFF method for storing data in “chunks”, and thus is also close to the 8SVX and AIFF formats used on Amiga and Macintosh computers, respectively. It is the main format used on Windows systems for raw and typically uncompressed audio. The usual bitstream encoding is Linear Pulse Code Modulation (LPCM).
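To make the "chunks" idea concrete, the following sketch writes a tiny CD-style WAV file in memory with Python's standard `wave` module and then walks its RIFF chunks by hand. The helper names `make_wav` and `riff_chunks` are assumptions made for this example, and the parser handles only the simple layout that `wave` produces, not every variant found in the wild.

```python
import io
import struct
import wave

def make_wav(n_frames=100, channels=2, sample_width=2, rate=44100):
    """Build a small silent PCM WAV file in memory (hypothetical helper)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(sample_width)
        w.setframerate(rate)
        w.writeframes(bytes(n_frames * channels * sample_width))
    return buf.getvalue()

def riff_chunks(data):
    """Yield (chunk_id, payload) pairs from a RIFF/WAVE byte string."""
    riff, _size, form = struct.unpack("<4sI4s", data[:12])
    if riff != b"RIFF" or form != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, chunk_size = struct.unpack("<4sI", data[pos:pos + 8])
        yield chunk_id, data[pos + 8:pos + 8 + chunk_size]
        pos += 8 + chunk_size + (chunk_size & 1)   # chunks are padded to an even length

chunks = dict(riff_chunks(make_wav()))

# The first 16 bytes of the "fmt " chunk describe the audio encoding.
format_tag, n_channels, sample_rate, byte_rate, block_align, bits = struct.unpack(
    "<HHIIHH", chunks[b"fmt "][:16]
)

print("chunks:", list(chunks))
print("format tag:", format_tag, "(1 means PCM)")
print("bytes per minute:", byte_rate * 60)
```

For 16-bit stereo at 44100 Hz the byte rate is 176400 bytes per second, or 10,584,000 bytes per minute, which is where the figure of roughly 10 MB per minute comes from.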

mp3 - the MPEG Layer-3 format is the most popular format for downloading and storing music. By eliminating portions of the audio that are essentially inaudible, mp3 files are compressed to roughly one-tenth the size of an equivalent PCM file while maintaining good audio quality. The mp3 format is recommended for music storage; it is less well suited to voice storage.
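The "roughly one-tenth" figure can be checked with simple arithmetic. The sketch below compares CD-quality PCM with an mp3 stream at 128 kbit/s; that bitrate is an assumption chosen for illustration (the text does not name one), and the function names are hypothetical.

```python
def pcm_bytes_per_minute(sample_rate, bits_per_sample, channels):
    """Size of one minute of uncompressed PCM audio, in bytes."""
    return sample_rate * (bits_per_sample // 8) * channels * 60

def compressed_bytes_per_minute(bitrate_kbps):
    """Size of one minute of a constant-bitrate stream, in bytes
    (1 kbit/s is taken as 1000 bits per second)."""
    return bitrate_kbps * 1000 // 8 * 60

pcm = pcm_bytes_per_minute(44100, 16, 2)     # CD-quality stereo
mp3 = compressed_bytes_per_minute(128)       # assumed bitrate, not from the text
ratio = pcm / mp3

print(pcm, mp3, round(ratio, 1))
```

At this assumed bitrate the PCM data is about eleven times larger than the mp3 stream; lower or higher bitrates shift the ratio accordingly.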

ogg - a free, open source container format supporting a variety of codecs, the most popular of which is the audio codec Vorbis. Vorbis files are often compared to MP3 files in terms of quality, but the fact that mp3 files are so much more broadly supported makes it difficult to recommend ogg files.

gsm - designed for telephony use in Europe, gsm is a very practical format for telephone-quality voice. It makes a good compromise between file size and quality. Note that wav files can also be encoded with the gsm codec.

dct - a variable codec format designed for dictation. It has dictation header information and can be encrypted (often required by medical confidentiality laws).

flac - a lossless compression codec. You can think of lossless compression as being like zip, but for audio. If you compress a PCM file to flac and then restore it again, it will be a perfect copy of the original. (Most of the other compression codecs discussed here are lossy, which means a small part of the quality is lost.) The cost of this losslessness is that the compression ratio is modest. Flac is recommended for archiving PCM files where quality is important (e.g. broadcast or music use).

au - the standard audio file format used by Sun, Unix and Java. The audio in au files can be PCM or compressed with the ulaw, alaw or G729 codecs.

aiff - Audio Interchange File Format (AIFF) is an audio file format standard used for storing sound data for personal computers and other electronic audio devices. The format was co-developed by Apple Computer in 1988 based on Electronic Arts' Interchange File Format (IFF, widely used on Amiga systems) and is most commonly used on Apple Macintosh computer systems. The audio data in a standard AIFF file is uncompressed pulse-code modulation (PCM). There is also a compressed variant of AIFF known as AIFF-C or AIFC, with various defined compression codecs. Standard AIFF is a leading format (along with SDII and WAV) used by professional-level audio and video applications, and unlike the better-known lossy MP3 format, it is non-compressed (which aids rapid streaming of multiple audio files from disk to the application) and lossless. Like any non-compressed, lossless format, it uses much more disk space than MP3: about 10 MB for one minute of stereo audio at a sample rate of 44.1 kHz and a sample size of 16 bits. In addition to audio data, AIFF can include loop point data and the musical note of a sample, for use by hardware samplers and musical applications. The file extension for the standard AIFF format is .aiff or .aif. For the compressed variants it is supposed to be .aifc, but .aiff or .aif are accepted as well by audio applications supporting the format.

vox - the vox format most commonly uses the Dialogic ADPCM (Adaptive Differential Pulse Code Modulation) codec. Similar to other ADPCM formats, it compresses to 4 bits per sample. Vox format files are similar to wave files except that vox files contain no information about the file itself, so the codec, sample rate and number of channels must first be specified in order to play a vox file.

raw - a raw file can contain audio in any codec but is usually used with PCM audio data. It is rarely used except for technical tests.
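Because headerless files such as vox and raw carry no description of their contents, the player has to be told how to interpret the bytes. The sketch below shows this for plain integer PCM only; it does not decode Dialogic ADPCM, and the function `decode_raw_pcm` and its parameters are assumptions made for the example.

```python
import struct

def decode_raw_pcm(data, bits=16, signed=True, little_endian=True, channels=1):
    """Decode headerless integer PCM bytes into one sample list per channel.
    Every parameter must be supplied by the caller (hypothetical helper)."""
    codes = {(8, True): "b", (8, False): "B", (16, True): "h", (16, False): "H"}
    code = codes[(bits, signed)]
    width = bits // 8
    count = len(data) // width
    order = "<" if little_endian else ">"
    flat = struct.unpack(f"{order}{count}{code}", data[:count * width])
    # De-interleave: channel c owns every `channels`-th sample starting at c.
    return [list(flat[c::channels]) for c in range(channels)]

# Eight bytes of raw data: four 16-bit signed little-endian samples.
raw = struct.pack("<4h", 100, -100, 200, -200)

mono = decode_raw_pcm(raw)                               # one channel of four samples
stereo = decode_raw_pcm(raw, channels=2)                 # two channels of two samples
wrong_order = decode_raw_pcm(raw, little_endian=False)   # same bytes, wrong assumption

print(mono)
print(stereo)
print(wrong_order)
```

The same eight bytes yield three different results, which is why formats with a header, such as WAV or AIFF, are easier to exchange between programs.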

Proprietary Formats

wma - the popular Windows Media Audio format owned by Microsoft, designed with Digital Rights Management (DRM) abilities for copy protection. Windows Media Audio (WMA) is an audio technology developed by Microsoft. The name can be used to refer to its audio file format or its audio codecs. It is a proprietary technology that forms part of the Windows Media framework. WMA consists of four distinct codecs. The original WMA codec, known simply as WMA, was conceived as a competitor to the popular MP3 and RealAudio codecs. WMA Pro, a newer and more advanced codec, supports multichannel and high-resolution audio. A lossless codec, WMA Lossless, compresses audio data without loss of audio fidelity (the regular WMA format is lossy).[3] Finally, WMA Voice, targeted at voice content, applies compression using a range of low bit rates.

aac - the Advanced Audio Coding format is based on the MPEG4 audio standard owned by Dolby. A copy-protected version of this format has been developed by Apple for use in music downloaded from their iTunes Music Store.

atrac (.wav) - the older style ATRAC format. It always has a .wav file extension. To open these files, install the ATRAC3 drivers.

ra - a Real Audio format designed for streaming audio over the Internet. The .ra format allows files to be stored in a self-contained fashion on a computer, with all of the audio data contained inside the file itself.

ram - a text file that contains a link to the Internet address where the Real Audio file is stored. The .ram file contains no audio data itself.

dss - Digital Speech Standard files are an Olympus proprietary format. It is a fairly old and poor codec. Prefer gsm or mp3 where the recorder allows.

msv - a Sony proprietary format for Memory Stick compressed voice files.

dvf - a Sony proprietary format for compressed voice files; commonly used by Sony dictation recorders.