Audio Encoder/Decoder Interface

MPEG Layer III Encoder Interface

April 5, 2018

Table of Contents

1 SCOPE
2 INTERFACE DESCRIPTION
3 USAGE
4 MEMBER FUNCTIONS
  4.1 L3_AUDIO_ENCODE_INIT
  4.2 L3_AUDIO_ENCODE
  4.3 L3_AUDIO_ENCODE_PACKET
  4.4 MP3_AUDIO_ENCODE_INIT
  4.5 MP3_AUDIO_ENCODE
  4.6 MP3_AUDIO_ENCODE_PACKET
  4.7 L3_AUDIO_ENCODE_INFO_STRING
  4.8 L3_AUDIO_ENCODE_GET_BITRATE_FLOAT
5 DATA TYPES
  5.1 E_CONTROL
  5.2 IN_OUT

1 SCOPE

This document describes the interface and usage of the RealNetworks ISO/MPEG Layer III Encoder Library.

2 INTERFACE DESCRIPTION

The ISO/MPEG Layer III Encoder Library provides a high-level interface for encoding MPEG 1 and MPEG 2 Layer III files. The following header files are provided for use in C/C++ programs:

  mp3enc.h: Layer III encoder class definition
  xhead.h: Utility functions to write a TOC header for VBR files

The encoder core resides in a statically linkable library called mp3enc.lib (Microsoft Windows) or mp3enc.a (UNIX). See tomp3.c for an example of how to use the library.

3 USAGE

To encode Layer III streams, the encoder offers two interfaces:

• L3_XXX() encodes without sample rate conversion, from “short int” input data only.
• MP3_XXX() encodes with sample rate conversion and type conversion of the input data (both optional).

For either interface, the calling sequence is as follows:

• Call XXX_audio_encode_init() to initialize the encoder. The return value indicates the minimum number of bytes you must feed the encoder per call.
• Call XXX_audio_encode() to encode data into a Layer III stream. The return value is a structure giving the number of bytes read from the input stream and the number of bytes written to the output stream. The encoder does not read input on every call, nor does it produce output on every call. On some calls it may produce more than one frame of output.
• Either write out the output data and read new input data, or update your pointers according to the result of the last call.

Again, see tomp3.c for an example of how to drive the encoder efficiently.

There is currently no mechanism to flush the internal encoder state and write out all input data. We recommend calling the encoder until all of your input data plus an additional 6912 silent samples (stereo samples) have been read.
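The calling sequence above can be sketched as a small driver loop. This is only an illustration under stated assumptions: drive_encoder(), fake_encode(), FRAME_IN_BYTES and SCRATCH_BYTES are hypothetical names that do not come from the library, the 16384-byte scratch size is an assumed bound, and fake_encode() merely stands in for the real encoder so the loop can be tried without linking the library. tomp3.c remains the reference for how the shipped example drives the encoder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mirrors IN_OUT from section 5.2; real code would include mp3enc.h. */
typedef struct {
    int in_bytes;   /* bytes consumed from the input buffer */
    int out_bytes;  /* bytes written to the output buffer   */
} IN_OUT;

/* Same shape as L3_audio_encode(), so the real function could be passed in. */
typedef IN_OUT (*encode_fn)(short *pcm, unsigned char *bs_out);

enum {
    FRAME_IN_BYTES = 1152 * 2 * (int)sizeof(short), /* one stereo frame of input  */
    SCRATCH_BYTES  = 16384                          /* ASSUMED bound on one call  */
};

/* Stand-in encoder (hypothetical): consumes one frame, emits 4 dummy bytes. */
IN_OUT fake_encode(short *pcm, unsigned char *bs_out)
{
    IN_OUT r;
    (void)pcm;
    memset(bs_out, 0xFF, 4);
    r.in_bytes  = FRAME_IN_BYTES;
    r.out_bytes = 4;
    return r;
}

/* Hypothetical helper: calls enc while at least min_bytes of input remain
   (min_bytes is the value returned by the init call). Output is appended to
   out; *consumed receives the number of input bytes used. Returns the number
   of bytes stored in out. */
size_t drive_encoder(encode_fn enc, short *pcm, size_t pcm_bytes,
                     size_t min_bytes, unsigned char *out, size_t out_cap,
                     size_t *consumed)
{
    unsigned char scratch[SCRATCH_BYTES];
    size_t in_pos = 0, out_pos = 0;

    assert(enc != NULL && pcm != NULL && out != NULL && min_bytes > 0);

    while (in_pos <= pcm_bytes && pcm_bytes - in_pos >= min_bytes) {
        IN_OUT x = enc(pcm + in_pos / sizeof(short), scratch);

        if (x.in_bytes <= 0 && x.out_bytes <= 0)
            break;  /* defensive: a call that made no progress at all */
        if (x.out_bytes > 0) {
            if ((size_t)x.out_bytes > out_cap - out_pos)
                break;  /* out is full; a real caller would write to a file */
            memcpy(out + out_pos, scratch, (size_t)x.out_bytes);
            out_pos += (size_t)x.out_bytes;
        }
        if (x.in_bytes > 0)
            in_pos += (size_t)x.in_bytes;  /* advance by what was actually read */
    }
    if (consumed != NULL)
        *consumed = in_pos;
    return out_pos;
}
```

With the real library, the caller would presumably pass L3_audio_encode and the value returned by L3_audio_encode_init(), and would append the recommended silent samples to the input before entering the loop.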

4 MEMBER FUNCTIONS

The class offers a high-level interface to the MPEG Layer III encoder. For convenience, it offers not only basic encoding functions but also a sample rate converter and the ability to read different input word formats. Finally, the encoder supports packetized delivery of MP3, which entails redistributing the MP3 dynamic data so that all packets are self-sufficient.

4.1 L3_AUDIO_ENCODE_INIT

PROTOTYPE int L3_audio_encode_init ( E_CONTROL * ec_arg );

DESCRIPTION Initialize the basic encoder (16-bit input, no rate conversion).

RETURNS The minimum number of bytes that L3_audio_encode() expects on input. Zero for failure.

4.2 L3_AUDIO_ENCODE

PROTOTYPE IN_OUT L3_audio_encode ( short *pcm, unsigned char *bs_out );

DESCRIPTION Encode one MPEG Layer III frame. The input (pcm[]) is 1152 PCM samples per channel, oldest sample at the lowest memory address. Two-channel modes are interleaved left/right for a total of 2*1152 values. Note that the input is of type short, which is not necessarily 16 bits wide. The output (bs_out) is the encoded frame. The number of output bytes returned may be zero, or may cover more than one frame.

RETURNS An IN_OUT structure holding the number of bytes removed from the pcm input and the number of bytes encoded.
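The left/right interleaving described above can be illustrated with a small helper that builds the pcm[] layout from two separate channel buffers. This is a minimal sketch: interleave_stereo() and L3_FRAME_SAMPLES are hypothetical names, not part of mp3enc.h.

```c
#include <assert.h>
#include <stddef.h>

enum { L3_FRAME_SAMPLES = 1152 };  /* samples per channel in one input frame */

/* Hypothetical helper: writes n left/right pairs into pcm, oldest sample
   first, so pcm must have room for 2*n values (2*1152 for a full frame). */
void interleave_stereo(const short *left, const short *right,
                       short *pcm, size_t n)
{
    size_t i;

    assert(left != NULL && right != NULL && pcm != NULL);

    for (i = 0; i < n; i++) {
        pcm[2 * i]     = left[i];   /* even index: left channel  */
        pcm[2 * i + 1] = right[i];  /* odd index:  right channel */
    }
}
```

A full frame would be prepared with n set to L3_FRAME_SAMPLES before the buffer is handed to the encoder.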

4.3 L3_AUDIO_ENCODE_PACKET

PROTOTYPE IN_OUT L3_audio_encode_Packet (short *pcm, unsigned char *bs_out, unsigned char *packet, int nbytes_out[2] );

DESCRIPTION Encode into a reformatted (self-contained) frame for streaming in network packets. If bs_out != NULL, bs_out returns the standard bitstream, and the number of bytes is given by IN_OUT.out_bytes. If packet != NULL, packet returns the reformatted frame(s). MPEG-1 returns one frame per call; MPEG-2 returns two frames per call. The number of bytes in each frame is given by nbytes_out[0] and nbytes_out[1]. The second frame (for MPEG-2) begins at packet+nbytes_out[0]. The total number of bytes written to packet is nbytes_out[0]+nbytes_out[1].
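The packet layout described above can be unpacked with a few lines of pointer arithmetic. The sketch below is illustrative only: PacketFrame and split_packet() are hypothetical names, and it assumes that nbytes_out[1] is zero when only one frame was produced, which this document does not state explicitly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view of one self-contained frame inside the packet buffer. */
typedef struct {
    const unsigned char *data;  /* first byte of the frame */
    int size;                   /* frame length in bytes   */
} PacketFrame;

/* Hypothetical helper: locates the frame(s) in packet using nbytes_out.
   Returns the number of non-empty frames (0, 1 or 2). */
int split_packet(const unsigned char *packet, const int nbytes_out[2],
                 PacketFrame frames[2])
{
    int count = 0;

    assert(packet != NULL && nbytes_out != NULL && frames != NULL);
    assert(nbytes_out[0] >= 0 && nbytes_out[1] >= 0);

    frames[0].data = packet;                  /* first frame starts the buffer */
    frames[0].size = nbytes_out[0];
    frames[1].data = packet + nbytes_out[0];  /* second frame follows directly */
    frames[1].size = nbytes_out[1];

    if (frames[0].size > 0) count++;
    if (frames[1].size > 0) count++;
    return count;
}
```

Each returned frame could then be sent as its own network packet, since the reformatted frames are self-contained.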

4.4 MP3_AUDIO_ENCODE_INIT

PROTOTYPE int MP3_audio_encode_init (E_CONTROL * ec_arg, int input_type, int mpeg_select, int mono_convert);

DESCRIPTION Initialize the encoder with sample rate, channel and word length conversion. input_type selects whether the input is 16-bit linear (input_type = 0) or 8-bit linear (input_type = 1).

RETURNS The minimum number of bytes that MP3_audio_encode() expects on input. Zero for failure.
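To give a feel for how input_type affects buffer sizes, the following sketch computes the size of one frame of input (1152 samples per channel) for the two word lengths. It is simple arithmetic under the assumption that no sample rate conversion takes place. It is not the value the init call returns, and pcm_frame_bytes() and FRAME_SAMPLES are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

enum { FRAME_SAMPLES = 1152 };  /* samples per channel in one frame */

/* Hypothetical helper, not part of mp3enc.h: bytes occupied by one frame of
   PCM input, ASSUMING no sample rate conversion.
   input_type: 0 = 16-bit linear, 1 = 8-bit linear (as for the init call). */
size_t pcm_frame_bytes(int input_type, int channels)
{
    size_t bytes_per_sample;

    assert(input_type == 0 || input_type == 1);
    assert(channels == 1 || channels == 2);

    bytes_per_sample = (input_type == 0) ? 2 : 1;
    return (size_t)FRAME_SAMPLES * (size_t)channels * bytes_per_sample;
}
```

In real code the return value of MP3_audio_encode_init() should be used instead, since it accounts for whatever conversions were requested.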

4.5 MP3_AUDIO_ENCODE

PROTOTYPE IN_OUT MP3_audio_encode ( unsigned char *pcm, unsigned char *bs_out );

DESCRIPTION Encode one MPEG Layer III frame. The input (pcm[]) is 1152 PCM samples per channel, oldest sample at the lowest memory address. Two-channel modes are interleaved left/right for a total of 2*1152 values. The output (bs_out) is the encoded frame. The number of output bytes returned may be zero, or may cover more than one frame.

4.6 MP3_AUDIO_ENCODE_PACKET

PROTOTYPE IN_OUT MP3_audio_encode_Packet (unsigned char *pcm, unsigned char *bs_out, unsigned char *packet, int nbytes_out[2] );

DESCRIPTION Encode with sample rate conversion and conversion from 16-bit or 8-bit input. See L3_audio_encode_Packet.

4.7 L3_AUDIO_ENCODE_INFO_STRING

PROTOTYPE L3_audio_encode_info_string ( char *s );

DESCRIPTION Return a textual description of the current encoder settings.

4.8 L3_AUDIO_ENCODE_GET_BITRATE_FLOAT

PROTOTYPE float L3_audio_encode_get_bitrate_float() ;

DESCRIPTION Return the average bitrate of the mp3 stream encoded so far.
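For comparison, an average bit rate can also be derived by the caller from its own byte and sample counts. The sketch below shows that arithmetic only. It is not the library's implementation, average_bitrate_bps() is a hypothetical name, and the unit of the value returned by L3_audio_encode_get_bitrate_float() is not stated in this document.

```c
#include <assert.h>

/* Hypothetical helper: average bit rate in bits per second, computed from
   the bytes written so far and the input samples (per channel) consumed.
   duration = samples_per_channel / samprate seconds. */
double average_bitrate_bps(unsigned long out_bytes,
                           unsigned long samples_per_channel,
                           int samprate)
{
    assert(samples_per_channel > 0 && samprate > 0);

    return (double)out_bytes * 8.0 * (double)samprate
           / (double)samples_per_channel;
}
```

For example, 16000 bytes produced from one second of 44100 Hz input correspond to 128000 bits per second.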

5 DATA TYPES

5.1 E_CONTROL

E_CONTROL is the structure used to initialize the Layer III encoder.

FIELDS
  int mode;            /* mode-0 stereo=0 mode-1 stereo=1 dual=2 mono=3 */
  int bitrate;         /* CBR PER CHANNEL bit rate, 1000's, default = -1 */
  int samprate;        /* samp rate e.g. 44100 */
  int nsbstereo;       /* mode-1 only, stereo bands, 3-32, for default set = -1 */
  int filter_select;   /* input filter selection - set to -1 for default */
  int freq_limit;      /* special purpose, set to 24000 */
  int nsb_limit;       /* special purpose, set to -1 */
  int layer;           /* 3=Layer III. Set to 3 */
  int cr_bit;          /* header copyright bit setting */
  int original;        /* header original/copy bit setting, original=1 */
  int hf_flag;         /* MPEG1 high frequency */
  int vbr_flag;        /* 1=vbr, 0=cbr */
  int vbr_mnr;         /* 0-150, suggested setting is 50 */
  int vbr_br_limit;    /* reserved for per chan vbr bitrate limit, set to 160 */
  int vbr_delta_mnr;   /* special, set 0 default */
  int chan_add_f0;     /* channel adder start freq - set 24000 default */
  int chan_add_f1;     /* channel adder final freq - set 24000 default */
  int sparse_scale;    /* reserved, set to -1 (encoder chooses) */
  int mnr_adjust[21];  /* special, set 0 default */
  int cpu_select;      /* 0=generic, 1=3DNow reserved, 2=Pentium III */
  int quick;           /* 0 = base, 1 = fast, -1 = encoder chooses */
  int test1;           /* special test, set to -1 */
  int test2;           /* special test, set to 0 */
  int test3;           /* special test, set to 0 */
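The field comments above suggest a default value for most members, so a caller might fill the structure along the following lines. This is a sketch, not library code: e_control_set_defaults() is a hypothetical helper, the structure is copied locally only so the example stands alone (real code would include mp3enc.h), and the choices cr_bit = 0, hf_flag = 0, vbr_flag = 0 (CBR) and cpu_select = 0 are assumptions where the comments give no default.

```c
#include <assert.h>
#include <string.h>

/* Local copy of E_CONTROL as listed in this section (for illustration). */
typedef struct {
    int mode, bitrate, samprate, nsbstereo, filter_select;
    int freq_limit, nsb_limit, layer, cr_bit, original;
    int hf_flag, vbr_flag, vbr_mnr, vbr_br_limit, vbr_delta_mnr;
    int chan_add_f0, chan_add_f1, sparse_scale;
    int mnr_adjust[21];
    int cpu_select, quick, test1, test2, test3;
} E_CONTROL;

/* Hypothetical helper: fills ec with the values suggested by the field
   comments; mode and samprate are supplied by the caller. */
void e_control_set_defaults(E_CONTROL *ec, int mode, int samprate)
{
    assert(ec != NULL);

    /* Zeroes mnr_adjust[], vbr_delta_mnr, test2, test3, and the ASSUMED
       zero choices cr_bit, hf_flag, vbr_flag (CBR) and cpu_select. */
    memset(ec, 0, sizeof *ec);

    ec->mode          = mode;
    ec->samprate      = samprate;
    ec->bitrate       = -1;     /* encoder selects the bit rate  */
    ec->nsbstereo     = -1;     /* encoder selects               */
    ec->filter_select = -1;     /* encoder selects               */
    ec->freq_limit    = 24000;
    ec->nsb_limit     = -1;
    ec->layer         = 3;      /* Layer III                     */
    ec->original      = 1;      /* header marks the stream as original */
    ec->vbr_mnr       = 50;     /* suggested setting             */
    ec->vbr_br_limit  = 160;
    ec->chan_add_f0   = 24000;
    ec->chan_add_f1   = 24000;
    ec->sparse_scale  = -1;     /* encoder chooses               */
    ec->quick         = -1;     /* encoder chooses               */
    ec->test1         = -1;
}
```

The filled structure would then be passed to L3_audio_encode_init() or MP3_audio_encode_init().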

REMARKS

mode: Selects the encoding mode: mode-0 stereo = 0, mode-1 stereo = 1, dual = 2, mono = 3. Two-channel modes require two-channel input.

bitrate: Per-channel bit rate in thousands of bits per second. The encoder makes a default selection if set to -1.

samprate: Sample rate in samples per second (e.g. 44100). Valid MPEG rates are 16000, 22050, 24000, 32000, 44100, and 48000. The encoder will encode at the nearest valid rate.

nsbstereo: Number of subbands to encode in non-intensity stereo. Controls the use of intensity stereo coding and therefore applies to mode-1 stereo only. Valid values are 3-32. No intensity stereo will be coded if the value is >= 30. The encoder limits choices to valid values and makes a default selection if nsbstereo = -1. For MPEG2, it should not be set lower than 8 except at very low bitrates.

filter_select: Selects input filtering: no filter = 0, DC blocking filter = 1. In general, the suggested setting is 0 for professionally produced material and 1 for PCM captured by consumer-quality sound cards. If set to -1, the encoder selects a default value.

freq_limit: Special purpose use only. Limits encoded subbands to the specified frequency. Set to 24000. Same function as nsb_limit.

nsb_limit: Limits the number of subbands encoded. Special purpose use only. Set to -1.

hf_flag: MPEG1 high frequency encoding. Allows coding above 16000 Hz. hf_flag = 1 (mode-1 granules), hf_flag = 3 (all granules).

vbr_flag: Selects variable bitrate operation. Set to 1 for VBR operation, 0 for CBR operation.

vbr_mnr: Valid values are 0-150. Higher settings give higher SNRs and higher bitrates. Suggested starting setting is 50.

vbr_br_limit: Per-channel VBR bitrate limit (the limitation is on br_index). Special purpose. Set to 160.
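The remark that the encoder encodes at the nearest valid rate can be illustrated with a short lookup over the rates listed under samprate. This is a sketch of the idea only: nearest_mpeg_rate() is a hypothetical name, and how the encoder itself resolves ties or out-of-range requests is not specified in this document (here the lower rate wins a tie).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: returns the listed MPEG sample rate closest to the
   requested one. On a tie the lower rate is kept (an assumption). */
int nearest_mpeg_rate(int samprate)
{
    static const int rates[] = { 16000, 22050, 24000, 32000, 44100, 48000 };
    size_t i;
    int best = rates[0];

    assert(samprate > 0);

    for (i = 1; i < sizeof rates / sizeof rates[0]; i++) {
        /* strictly smaller distance, so earlier (lower) rates win ties */
        if (abs(rates[i] - samprate) < abs(best - samprate))
            best = rates[i];
    }
    return best;
}
```

A caller could use such a check to warn when the requested samprate will not be encoded as given.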

5.2 IN_OUT

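FIELDS
  int in_bytes;   /* number of bytes read from the input stream */
  int out_bytes;  /* number of bytes written to the output stream */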