(19) Europäisches Patentamt
European Patent Office
Office européen des brevets

(11) EP 1 347 650 A1

(12) EUROPEAN PATENT APPLICATION

(43) Date of publication: 24.09.2003 Bulletin 2003/39

(51) Int Cl.7: H04N 7/50

(21) Application number: 02251932.6

(22) Date of filing: 18.03.2002

(84) Designated Contracting States: AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR
Designated Extension States: AL LT LV MK RO SI

(71) Applicant: STMicroelectronics, Ltd., Almondsbury, Bristol, BS32 4SQ (GB)

(72) Inventor: Bolton, Martin, Grib Lane, Blagdon BS40 7SA (GB)

(74) Representative: Driver, Virginia Rozanne et al, Page White & Farrer, 54 Doughty Street, London WC1N 2LS (GB)

(54) Compression circuitry for generating an encoded bitstream from a plurality of frames

(57) Compression circuitry for generating an encoded bitstream from a plurality of video frames. Data is DCT transformed and then streamed to a processor where quantised and inverse quantised blocks are generated. A second streaming data connection streams the inverse quantised blocks to an inverse DCT block to generate reconstructed prediction error macroblocks. An addition circuit adds each reconstructed prediction error macroblock and its corresponding predictor macroblock to generate a respective reconstructed macroblock. The quantised macroblocks are zig-zag scanned, run level coded and variable length coded to generate an encoded bitstream.

Printed by Jouve, 75001 PARIS (FR)

Description

FIELD OF INVENTION

[0001] The present invention relates to motion picture compression circuits for pictures such as television pictures, and more particularly to a compression circuit complying with H.261 and MPEG standards.

BACKGROUND OF THE INVENTION

[0002] Figures 1A-1C schematically illustrate three methods for compressing motion pictures in accordance with H.261 and MPEG standards. According to H.261 standards, pictures may be of intra or predicted type. According to MPEG standards, the pictures can also be of bidirectional type.

[0003] Intra ("I") pictures are not coded with reference to any other pictures. Predicted ("P") pictures are coded with reference to a past intra or past predicted picture. Bidirectional ("B") pictures are coded with reference to both a past picture and a following picture.

[0004] FIG. 1A illustrates the compression of an intra picture I1. Picture I1 is stored in a memory area M1 before being processed. The pictures have to be initially stored in a memory since they arrive line by line whereas they are processed square by square, the size of each square being generally 16×16 pixels. Thus, before starting to process picture I1, memory area M1 must be filled with at least 16 lines.

[0005] The pixels of a 16×16-pixel square are arranged in a so-called "macroblock". A macroblock includes four 8×8-pixel luminance blocks and two or four 8×8-pixel chrominance blocks. The processes hereinafter described are carried out by blocks of 8×8 pixels.

[0006] The blocks of each macroblock of picture I1 are submitted at 10 to a discrete cosine transform (DCT) followed at 11 by a quantisation. A DCT transforms a matrix of pixels (a block) into a matrix whose upper left corner coefficient tends to have a relatively high value. The other coefficients rapidly decrease as the position moves downwards to the right. Quantisation involves dividing the coefficients of the matrix so transformed, such that a large number of coefficients which are a distance away from the upper left corner are cancelled.

[0007] At 12, the quantified matrices are subject to zigzag scanning (ZZ) and to run/level coding (RLC). Zigzag scanning has the consequence of improving the chances of consecutive series of zero coefficients, each of which is preceded by a non-zero coefficient. The run/level coding mainly includes replacing each series from the ZZ scanning with a pair of values, one representing the number of successive zero coefficients and the other representing the first following non-zero coefficient.

[0008] At 13, the pairs of values from the RLC are subject to variable length coding (VLC) that includes replacing the more frequent pairs with short codes and replacing the less frequent pairs with long codes, with the aid of correspondence tables defined by the H.261 and MPEG standards. The quantification coefficients can be varied from one block to the next by multiplication by a quantisation coefficient. That quantisation coefficient is inserted during variable length coding in headers preceding the compressed data corresponding to macroblocks.

[0009] Macroblocks of an intra picture are used to compress macroblocks of a subsequent picture of predicted or bidirectional type. Thus, decoding of a predicted or bidirectional picture is likely to be achieved from a previously decoded intra picture. This previously decoded intra picture does not exactly correspond to the actual picture initially received by the compression circuit, since this initial picture is altered by the quantification at 11. Thus, the compression of a predicted or bidirectional picture is carried out from a reconstructed intra picture I1r rather than from the real intra picture I1, so that decoding is carried out under the same conditions as encoding.

[0010] The reconstructed intra picture I1r is stored in a memory area M2 and is obtained by subjecting the macroblocks provided by the quantification 11 to a reverse processing, that is, at 15 an inverse quantification followed at 16 by an inverse DCT.

[0011] FIG. 1B illustrates the compression of a predicted picture P4. The predicted picture P4 is stored in a memory area M1. A previously processed intra picture I1r has been reconstructed in a memory area M2.

[0012] The processing of the macroblocks of the predicted picture P4 is carried out from so-called predictor macroblocks of the reconstructed picture I1r. Each macroblock of picture P4 (reference macroblock) is subject at 17 to motion estimation (generally, the motion estimation is carried out only with the four luminance blocks of the reference macroblocks). This motion estimation includes searching in a window of picture I1r for a macroblock that is nearest, or most similar, to the reference macroblock. The nearest macroblock found in the window is the predictor macroblock. Its position is determined by a motion vector V provided by the motion estimation. The predictor macroblock is subtracted at 18 from the current reference macroblock. The resulting difference macroblock is subjected to the process described with relation to FIG. 1A.

[0013] Like the intra pictures, the predicted pictures serve to compress other predicted pictures and bidirectional pictures. For this purpose, the predicted picture P4 is reconstructed in a memory area M3 by an inverse quantification at 15, inverse DCT at 16, and addition at 19 of the predictor macroblock that was subtracted at 18.

[0014] The vector V provided by the motion estimation 17 is inserted in a header preceding the data provided by the variable length coding of the currently processed macroblock.

[0015] FIG. 1C illustrates the compression of a bidirectional picture B2. Bidirectional pictures are provided for in MPEG standards only. The processing of the bidirectional pictures differs from the processing of predicted pictures in that the motion estimation 17 consists in finding two predictor macroblocks in two pictures I1r and P4r, respectively, that were previously reconstructed in memory areas M2 and M3. Pictures I1r and P4r generally correspond, respectively, to a picture preceding the bidirectional picture that is currently processed and to a picture following the bidirectional picture.

[0016] At 20, the mean value of the two obtained predictor macroblocks is calculated and is subtracted at 18 from the currently processed macroblock.

[0017] The bidirectional picture is not reconstructed because it is not used to compress another picture.

[0018] The motion estimation 17 provides two vectors V1 and V2 indicating the respective positions of the two predictor macroblocks in pictures I1r and P4r with respect to the reference macroblock of the bidirectional picture. Vectors V1 and V2 are inserted in a header preceding the data provided by the variable length coding of the currently processed macroblock.

[0019] In a predicted picture, an attempt is made to find a predictor macroblock for each reference macroblock. However, in some cases, using the predictor macroblock that is found may provide a smaller compression rate than that obtained by using an unmoved predictor macroblock (zero motion vector), or even smaller than the simple intra processing of the reference macroblock. Thus, depending upon these cases, the reference macroblock is submitted to either predicted processing with the vector that is found, predicted processing with a zero vector, or intra processing.

[0020] In a bidirectional picture, an attempt is made to find two predictor macroblocks for each reference macroblock. For each of the two predictor macroblocks, the process providing the best compression rate is determined, as indicated above with respect to a predicted picture. Thus, depending on the result, the reference macroblock is submitted to either bidirectional processing with the two vectors, predicted processing with only one of the vectors, or intra processing.

[0021] Thus, a predicted picture and a bidirectional picture may contain macroblocks of different types. The type of a macroblock is also data inserted in a header during variable length coding. According to MPEG standards, the motion vectors can be defined with an accuracy of half a pixel. To search a predictor macroblock with a non-integer vector, first the predictor macroblock determined by the integer part of this vector is fetched, then this macroblock is submitted to so-called "half-pixel filtering", which includes averaging the macroblock and the same macroblock shifted down and/or to the right by one pixel, depending on the integer or non-integer values of the two components of the vector. According to H.261 standards, the predictor macroblocks may be subjected to low-pass filtering. For this purpose, information is provided with the vector, indicating whether filtering has to be carried out or not.

[0022] The succession of types (intra, predicted, bidirectional) is assigned to the pictures in a predetermined way, in a so-called group of pictures (GOP). A GOP generally begins with an intra picture. It is usual, in a GOP, to have a periodical series, starting from the second picture, including several successive bidirectional pictures, followed by a predicted picture, for example of the form IBBPBBPBB ... where I is an intra picture, B a bidirectional picture, and P a predicted picture. The processing of each bidirectional picture B is carried out from macroblocks of the previous intra or predicted picture and from macroblocks of the next predicted picture.

[0023] The various functional blocks that are used in a typical prior art functional implementation are shown in Figure 2. For clarity, the motion estimation engine and memory for storing macroblocks and video pictures have been omitted.

[0024] In Figure 2, a reference macroblock 200 is supplied to a subtraction circuit 201, where the predictor 202 for that macroblock is subtracted (in the case of B and P pictures only). The resultant error block (or the original macroblock, for I pictures) is passed on to a DCT block 203, then to a quantisation block 204 for quantisation.

[0025] The quantised macroblock is forwarded to an encoding block 205 and an inverse quantisation block 206. The encoding block 205 takes the quantised macroblock and zig-zag encodes it, performs run level coding on the resultant data, then variable length packs the result, outputting the now encoded bitstream.

[0026] The bitstream is monitored and can be controlled via feedback to a rate control system 207. This controls quantisation (and dequantisation) to meet certain objectives for the bitstream. A typical objective is a maximum bit-rate, although other factors can also be used.

[0027] The inverse quantisation block 206 in this Figure is the start of a reconstruction chain that is used to generate a reconstructed version of each frame, so that the frames that the motion prediction engine searches for matching macroblocks are the same as those that will be regenerated during decoding proper. After inverse quantisation, the macroblock is inverse DCT transformed in IDCT block 208 and added in an adding block 209 to the original predictor used to generate the error macroblock. This reconstructed block is stored in memory for subsequent use in the motion estimation process.

[0028] The various blocks required to generate the encoded output stream have different computational requirements, which themselves can vary according to the particular application or user selected restrictions. Throttling of the output bitstream to meet bandwidth requirements is typically handled by manipulating the quantisation step.

[0029] Pure hardware architectures, while potentially the most efficient, suffer from lack of flexibility since they can support only a restricted range of standards; moreover they have long design/verification cycles. On the
other hand, pure software solutions, while being the most flexible, require high-performance processors unsuited to low-cost consumer applications.

[0030] It would be desirable to provide an architecture that allowed for relatively flexible bitstream control whilst reducing the amount of software-based processing power required.

SUMMARY OF INVENTION

[0031] According to a first aspect of the invention, there is provided compression circuitry for generating an encoded bitstream from a plurality of video frames, the circuitry including:

discrete cosine transform (DCT) circuitry for accepting prediction error macroblocks and generating DCT transformed macroblocks;
a first streaming data connection for streaming the DCT transformed macroblocks from the DCT transformation circuitry to a processor, the processor being configured to run software for:
    (i) quantising the DCT transformed macroblocks to generate quantised macroblocks; and
    (ii) inverse quantising the quantised macroblocks to generate inverse quantised macroblocks;
a second streaming data connection for streaming the inverse quantised macroblocks from the processor;
inverse discrete cosine transform (IDCT) circuitry for accepting the streamed inverse quantised macroblocks and IDCT transforming them to generate reconstructed prediction error macroblocks;
an addition circuit for adding each reconstructed prediction error macroblock and its corresponding predictor macroblock, thereby to generate a respective reconstructed macroblock for use in encoding of other macroblocks;
means for zig-zag scanning, run level coding and variable length coding the quantised macroblocks to generate an encoded bitstream.

[0032] Preferably, the DCT and IDCT circuitry perform DCT and IDCT processing at a rate determined by the arrival of data from the relevant data connection.

[0033] Preferably, the first and second streaming data connections are handshake controlled. More preferably, the DCT and IDCT circuitry perform DCT and IDCT processing at a rate determined by the handshake control signals.

[0034] In a preferred form, the processor is configured to run software for implementing the zig-zag scanning and run length coding.

[0035] Preferably, the DCT and IDCT circuitry share hardware. It is particularly preferred that the DCT and IDCT circuitry comprise a single functional block selectively operable in a DCT or IDCT mode.

[0036] In a preferred form, the compression circuitry further includes a motion estimation engine for supplying the predictor macroblocks to the IDCT circuitry. More preferably, the motion estimation engine is configured to generate the prediction error macroblocks by subtracting predictor macroblocks from respective corresponding picture macroblocks of the picture being encoded, and to supply the prediction error macroblocks to the DCT circuitry.

[0037] In a preferred embodiment, the circuitry includes a hardware VLC packer and a third streaming data connection for streaming the run length coded data from the processor to the hardware VLC packer.

[0038] Preferably, the compression circuitry further includes macroblock memory for storing the reconstructed macroblocks.

[0039] It is particularly preferred that the compression circuitry can be configured for decoding of a compressed video stream.

[0040] In a second aspect, the present invention provides a method of generating an encoded bitstream from a plurality of video frames, the method including the steps of:

discrete cosine transforming prediction error macroblocks to generate DCT transformed macroblocks;
streaming the DCT transformed macroblocks from the DCT transformation circuitry to a processor via a first streaming data connection;
in the processor:
    (i) quantising the DCT transformed macroblocks to generate quantised macroblocks; and
    (ii) inverse quantising the quantised macroblocks to generate inverse quantised macroblocks;
streaming the inverse quantised macroblocks from the processor via a second streaming data connection;
inverse discrete cosine transforming (IDCT) the streamed inverse quantised macroblocks to generate reconstructed prediction error macroblocks;
adding each reconstructed prediction error macroblock and its corresponding predictor macroblock, thereby to generate a respective reconstructed macroblock for use in encoding of other macroblocks;
zig-zag scanning, run level coding and variable length coding the quantised macroblocks to generate an encoded bitstream.

[0041] Preferably, the DCT and IDCT processing take place at a rate determined by the arrival of data from the relevant data connection.
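The transform, quantisation and reconstruction steps of the second aspect can be sketched in software as follows. This is a minimal illustrative sketch only. The function names, the orthonormal floating-point DCT and the single uniform quantiser step are assumptions made for the example and are not taken from the specification, in which the DCT and IDCT are performed in hardware and only the quantisation and inverse quantisation run on the processor. The streaming connections are not modelled.

```python
import math

N = 8  # block size used in the description (8x8 pixels)


def _scale(k):
    # Orthonormal DCT scaling factor (an assumption of this sketch).
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def _basis(i, k):
    # Cosine basis shared by the forward and inverse transforms.
    return math.cos((2 * i + 1) * k * math.pi / (2 * N))


def dct_2d(block):
    """Forward 2-D DCT of an N x N block."""
    return [[_scale(u) * _scale(v) * sum(block[x][y] * _basis(x, u) * _basis(y, v)
                                         for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct_2d(coeffs):
    """Inverse 2-D DCT of an N x N coefficient block."""
    return [[sum(_scale(u) * _scale(v) * coeffs[u][v] * _basis(x, u) * _basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def quantise(coeffs, step):
    # Hypothetical uniform quantiser: one step size for every coefficient.
    return [[int(round(c / step)) for c in row] for row in coeffs]


def inverse_quantise(levels, step):
    return [[level * step for level in row] for row in levels]


def encode_block(picture_block, predictor_block, step):
    """Return (quantised block, reconstructed block) for one 8x8 block."""
    # Prediction error = picture block minus predictor block.
    error = [[picture_block[x][y] - predictor_block[x][y] for y in range(N)]
             for x in range(N)]
    transformed = dct_2d(error)                     # DCT circuitry
    levels = quantise(transformed, step)            # processor, step (i)
    dequantised = inverse_quantise(levels, step)    # processor, step (ii)
    recon_error = idct_2d(dequantised)              # IDCT circuitry
    # Addition circuit: reconstructed error plus the same predictor.
    reconstructed = [[recon_error[x][y] + predictor_block[x][y] for y in range(N)]
                     for x in range(N)]
    return levels, reconstructed
```

The quantised block returned first is what would go on to zig-zag scanning, run level coding and variable length coding. The reconstructed block is what would be stored for use in encoding other macroblocks, so that the encoder predicts from the same data a decoder will have.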

[0042] Preferably, the first and second streaming data connections are handshake controlled. More preferably, the method includes the step of DCT and IDCT processing at a rate determined by the handshake control signals.

[0043] Preferably, the processor is configured to run software for implementing the zig-zag scanning and run length coding.

[0044] In a preferred embodiment, the DCT and IDCT circuitry share hardware. More preferably, the DCT and IDCT circuitry comprise a single functional block selectively operable in a DCT or IDCT mode.

[0045] Preferably, the method further includes the step of receiving, in the IDCT circuitry, the predictor macroblocks from a motion estimation engine. More preferably, the method includes the step, in the motion estimation engine, of generating the prediction error macroblocks by subtracting predictor macroblocks from respective corresponding picture macroblocks of the picture being encoded, and supplying the prediction error macroblocks to the DCT circuitry.

[0046] In a preferred form, the circuitry includes a hardware VLC packer, the method including the step of streaming the run length coded data from the processor to the hardware VLC packer via a third streaming data connection.

[0047] Preferably, the reconstructed macroblocks are stored in macroblock memory.

[0048] In each aspect of the invention, it is preferred that the encoded bitstream conforms to MPEG, MPEG-2 and/or H.261 standards.

BRIEF DESCRIPTION OF DRAWINGS

[0049]

Figures 1A to 1C, described above, illustrate three picture compression processes according to H.261 and MPEG standards;

Figure 2, described above, is a simplified schematic of the functional blocks in a typical MPEG encoding scheme, in accordance with the prior art;

Figure 3 is a schematic of an encoder loop, in accordance with the invention;

Figure 4 is a schematic of compression circuitry for generating an encoded bitstream from a plurality of video frames, in accordance with the invention, in encoding mode.

DETAILED DESCRIPTION

[0050] Figure 3 shows an overview of the functional blocks of the preferred form of the invention, in which hardware functionality is represented by rectangular blocks and software functionality is represented by an oval block.

[0051] The functional blocks include a subtraction circuit 300 for subtracting each predictor macroblock, as supplied by the motion estimation engine (described later), from its corresponding picture macroblock, to generate a prediction error macroblock. For an I picture, there is no predictor, so the macroblock is passed through the subtraction circuit with no change.

[0052] The prediction error macroblock is supplied to a DCT circuit 301 where a forward discrete cosine transform is performed. Such hardware and its operation are well known in the prior art and so have not been described here in further detail.

[0053] The output of the DCT is streamed to a processor 302 (described later) which performs the quantisation, zig-zag coding and run level coding steps in the encoding process. The resultant data is variable length coded and output as an encoded bitstream. In the simplified schematic of Figure 3, the variable length coding takes place in software. However, in an alternative embodiment described later, the variable length coding and packing, or just packing, is performed in hardware, since this provides a drastic increase in performance compared to software coding running on a general purpose processor.

[0054] As well as these steps, the processor also performs inverse quantisation, and the resultant inverse quantised macroblocks are sent to an inverse DCT (IDCT) circuit 303 via a streaming interface. An inverse DCT is performed and the resultant reconstructed error macroblock is added to the original predictor macroblock (for P and B pictures only) by an addition circuit 304. The predictor macroblocks have been delayed in a delay buffer 305. For I and P pictures, the macroblock is fully reconstructed after the IDCT circuit. The resultant reconstructed macroblocks are then stored in memory (not shown) for use by the motion estimation engine in generating predictors for future macroblocks. This is necessary because it is reconstructed macroblocks that a decoder will subsequently use to reconstruct the pictures.

[0055] Turning to Figure 4, there is shown a more detailed version of the schematic of Figure 3, and like features are denoted by corresponding reference numerals. In Figure 4, the motion estimation engine 400 for use with the encoding circuitry is also shown. The motion estimation engine 400 determines the best matching macroblock (or average of two macroblocks) for each macroblock in the frame (for B and P pictures only) and subtracts it from the macroblock being considered to generate a prediction error macroblock. The method of selecting predictor macroblocks does not form part of the present invention and so is not described in greater detail herein.

[0056] The motion estimation engine 400 outputs the macroblocks, associated predictor macroblocks and vectors, and other information such as frame type and encoding modes, to DCT/IDCT circuitry via a direct link. Alternatively, this information can be transferred over a data bus. Data bus transfer principles are well known and so are not described in detail.

[0057] The DCT and IDCT steps are performed in a DCT/IDCT block 401, which includes combined DCT/IDCT circuitry 301/303 that is selectable to perform either operation on incoming data. The input is selected by way of a multiplexer 402, the operation of which will be described in greater detail below. The output of the multiplexer is supplied to the delay block 305 and the DCT/IDCT circuitry 301/303. Additional data supplied by the motion estimation engine 400, such as the motion vector(s) and encoding decisions (intra/non-intra, MC/no MC, field/frame prediction, field/frame DCT), is routed past the delay and DCT/IDCT blocks to a first streaming data interface (SDI) port 403.

[0058] The outputs of the delay block and the DCT/IDCT circuitry are supplied to an addition circuit 304, the output of which is sent to memory 450. The output of the DCT/IDCT block 301/303 is also supplied to the first SDI port 403.

[0059] The first SDI port 403 accepts data from the DCT/IDCT block 301/303 and the multiplexer 402 and converts it into a format suitable for streaming transmission to a corresponding second streaming SDI port 404. The streaming is controlled by a handshake arrangement between the respective SDI ports. The second streaming SDI port 404 takes the streaming data from the first SDI port 403 and converts it back into a format suitable for use within the processor 302.

[0060] Once the data has been transformed back into a synchronous format, the processor performs quantisation 405, inverse quantisation 406 and zig-zag/run level coding 407 as described previously. It will be appreciated that the particular implementation of these steps in software is not relevant to the present invention, and so is not described in detail.

[0061] After inverse quantisation, the macroblock is returned to a third SDI port 408, which operates in the same way as the first streaming port to convert and stream the data to a fourth SDI port 409, which converts the data for synchronous use and supplies it to the multiplexer 402.

[0062] The processor 302 outputs the run level coded data to a fifth SDI port 410, which, in a similar fashion to the first and third SDI ports, formats the data for streaming transmission to a sixth SDI port 411, which in turn reformats the data into a synchronous format. The data is then variable length coded and packed in hardware VLC circuitry 412. The particular workings of the hardware VLC packing circuitry 412 are well known in the art, are not critical to the present invention and so will not be described in detail. Indeed, as mentioned previously, the VLC operation can be performed in software by the processor, for a corresponding cost in processor cycles.

[0063] It will be appreciated that a number of control lines and ancillary details have been omitted for clarity. For example, it is clear that the multiplexer and DCT/IDCT block 301/303 need to be controlled to ensure that the correct data is being fed to the DCT/IDCT block and that the correct operation is being performed. For example, when the initial DCT operation 301 is being performed, the multiplexer 402 is controlled to provide data from the bus (supplied by the motion estimation engine) to the DCT/IDCT block 301/303, which is set to DCT mode. However, when performing the IDCT operation 303, the multiplexer 402 sends data from the fourth SDI port 409 to the DCT/IDCT block 301/303, which is set to IDCT mode.

[0064] Similarly, some support hardware that would exist in the actual implementation has been omitted. An obvious example is buffers on the various inputs and outputs. It would be usual in such circuitry to include FIFO buffers supporting the SDI ports to maximise throughput. For the purposes of clarity, such support hardware is not explicitly shown. However, it will be understood by those skilled in the art to be implicitly present in any practical application of the invention.

[0065] It will be appreciated that, in the encoding mode described above, the DCT and IDCT functions of the DCT/IDCT block 301/303 will be performed in an interleaved manner, with one or more DCT operations being interleaved with one or more IDCT operations, depending upon the order of I, P and B pictures being encoded.

[0066] With slight modifications to control software and circuitry, the encoding circuitry described above can perform decoding of an encoded MPEG stream. This is because the inverse quantisation software and IDCT hardware are common to the encoding and decoding process. There are at least three ways this can be achieved:

1. If it is only required to offload the IDCT processing from the processor, the dequantised coefficient blocks can be streamed from the processor to the DCT/IDCT block 301/303 via the third and fourth SDI ports 408 and 409. The results of the IDCT are then read back via the first and second SDI ports 403 and 404.

2. Option 1 can be extended to allow more of the decoding load to be passed to the DCT/IDCT block 401. In particular, the predictor blocks are read into the delay buffer 305. The coefficient blocks are then read in via the same route by the DCT/IDCT block 301/303 (in IDCT mode). After the IDCT has taken place, the predictor and IDCT processed macroblocks are combined by the addition circuitry 304 and written to system memory via the system data bus.

3. In an alternative to the second decoding arrangement, the motion estimation block is configured to provide the predictor blocks to the delay buffer 305 via the multiplexer 402. The coefficient blocks are provided to the DCT/IDCT block 301/303 (in IDCT
mode), and the remainder of the procedure is as per the second decoding arrangement.

[0067] Although the invention has been described with reference to a number of specific examples, it will be appreciated by those skilled in the art that the invention can be embodied in many other forms.

Claims

1. Compression circuitry for generating an encoded bitstream from a plurality of video frames, the circuitry including:

discrete cosine transform (DCT) circuitry for accepting prediction error macroblocks and generating DCT transformed macroblocks;
a first streaming data connection for streaming the DCT transformed macroblocks from the DCT transformation circuitry to a processor, the processor being configured to run software for:
    (i) quantising the DCT transformed macroblocks to generate quantised macroblocks; and
    (ii) inverse quantising the quantised macroblocks to generate inverse quantised macroblocks;
a second streaming data connection for streaming the inverse quantised macroblocks from the processor;
inverse discrete cosine transform (IDCT) circuitry for accepting the streamed inverse quantised macroblocks and IDCT transforming them to generate reconstructed prediction error macroblocks;
an addition circuit for adding each reconstructed prediction error macroblock and its corresponding predictor macroblock, thereby to generate a respective reconstructed macroblock for use in encoding of other macroblocks;
means for zig-zag scanning, run level coding and variable length coding the quantised macroblocks to generate an encoded bitstream.

2. Compression circuitry according to claim 1, wherein the DCT and IDCT circuitry perform DCT and IDCT processing at a rate determined by the arrival of data from the relevant data connection.

3. Compression circuitry according to claim 1 or 2, wherein the first and second streaming data connections are handshake controlled.

4. Compression circuitry according to claim 3, wherein the DCT and IDCT circuitry perform DCT and IDCT processing at a rate determined by the handshake control signals.

5. Compression circuitry according to any one of the preceding claims, wherein the processor is configured to run software for implementing the zig-zag scanning and run length coding.

6. Compression circuitry according to any one of the preceding claims, wherein the DCT and IDCT circuitry share hardware.

7. Compression circuitry according to claim 6, wherein the DCT and IDCT circuitry comprise a single functional block selectively operable in a DCT or IDCT mode.

8. Compression circuitry according to any one of the preceding claims, further including a motion estimation engine for supplying the predictor macroblocks to the IDCT circuitry.

9. Compression circuitry according to claim 8, wherein the motion estimation engine is configured to generate the prediction error macroblocks by subtracting predictor macroblocks from respective corresponding picture macroblocks of the picture being encoded, and to supply the prediction error macroblocks to the DCT circuitry.

10. Compression circuitry according to any one of the preceding claims, wherein the circuitry includes a hardware VLC packer and a third streaming data connection for streaming the run length coded data from the processor to the hardware VLC packer.

11. Compression circuitry according to any one of the preceding claims, further including macroblock memory for storing the reconstructed macroblocks.

12. Compression circuitry according to any one of the preceding claims, configured for decoding of a compressed video stream.

13. Compression circuitry according to any one of the preceding claims, configured to generate an encoded bitstream in accordance with MPEG, MPEG-2 and/or H.261 standards.

14. A method of generating an encoded bitstream from a plurality of video frames, the method including the steps of:

discrete cosine transforming prediction error macroblocks to generate DCT transformed macroblocks;
streaming the DCT transformed macroblocks from the DCT transformation circuitry to a processor via a first streaming data connection;
in the processor:
    (i) quantising the DCT transformed macroblocks to generate quantised macroblocks; and
    (ii) inverse quantising the quantised macroblocks to generate inverse quantised macroblocks;
streaming the inverse quantised macroblocks from the processor via a second streaming data connection;
inverse discrete cosine transforming (IDCT) the streamed inverse quantised macroblocks to generate reconstructed prediction error macroblocks;
adding each reconstructed prediction error macroblock and its corresponding predictor macroblock, thereby to generate a respective reconstructed macroblock for use in encoding of other macroblocks;
zig-zag scanning, run level coding and variable length coding the quantised macroblocks to generate an encoded bitstream.

15. A method according to claim 14, wherein the DCT and IDCT processing take place at a rate determined by the arrival of data from the relevant data connection.

16. A method according to claim 14 or 15, wherein the first and second streaming data connections are handshake controlled.

17. A method according to claim 16, including the step of DCT and IDCT processing at a rate determined by the handshake control signals.

in the motion estimation engine, of generating the prediction error macroblocks by subtracting predictor macroblocks from respective corresponding picture macroblocks of the picture being encoded, and supplying the prediction error macroblocks to the DCT circuitry.

23. A method according to any one of claims 14 to 22, wherein the circuitry includes a hardware VLC packer, the method including the step of streaming the run length coded data from the processor to the hardware VLC packer via a third streaming data connection.

24. A method according to any one of claims 14 to 23, wherein the reconstructed macroblocks are stored in macroblock memory.

25. A method according to any one of claims 14 to 24, wherein the encoded bitstream conforms to MPEG, MPEG-2 and/or H.261 standards.

18. A method according to any one of claims 14 to 17, 40 wherein the processor is configured to run software for implementing the zig-zag scanning and run length coding.

19. A method according to any one of claims 14 to 18, 45 wherein the DCT and IDCT circuitry share hard- ware.

20. A method according to claim 19, wherein the DCT and IDCT circuitry comprise a single functional 50 block selectively operable in a DCT or IDCT mode.

21. A method according to any one of claims 14 to 20, further including the step of receiving, in the IDCT circuitry, the predictor macroblocks from a motion 55 estimation engine.

22. A method according to claim 21, including the step,

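The processor-side steps recited in the method claim (quantising, inverse quantising, zig-zag scanning and run level coding) can be illustrated with a minimal Python sketch. It is not the claimed implementation: the single uniform step size qscale, the truncation toward zero, the uniform treatment of the DC coefficient and all function names are assumptions made for illustration, and a 4 x 4 block is used only to keep the example short.

```python
# Illustrative sketch only. Assumptions not taken from the claims: a single
# uniform quantiser step (qscale), truncation toward zero, and no separate
# handling of the DC coefficient. All names are hypothetical.

def quantise(block, qscale):
    """Step (i): divide each DCT coefficient by qscale, truncating toward zero."""
    return [[int(v / qscale) for v in row] for row in block]

def inverse_quantise(qblock, qscale):
    """Step (ii): scale quantised levels back up, ready for the IDCT stage."""
    return [[level * qscale for level in row] for row in qblock]

def zigzag_order(n):
    """Return the (row, col) positions of an n x n block in zig-zag scan order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Walk anti-diagonals in turn; odd diagonals run top to bottom,
        # even diagonals run bottom to top.
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )

def run_level_code(qblock):
    """Emit (run, level) pairs: run = number of zeros before each non-zero level."""
    pairs, run = [], 0
    for r, c in zigzag_order(len(qblock)):
        level = qblock[r][c]
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    return pairs  # trailing zeros are implied by an end-of-block code

# Hypothetical DCT transformed block, as it might arrive from the DCT stage.
coeffs = [
    [80, -21, 16, 0],
    [14,   0,  0, 0],
    [-9,   0,  0, 0],
    [ 0,   0,  0, 0],
]
levels = quantise(coeffs, 8)
dequantised = inverse_quantise(levels, 8)
pairs = run_level_code(levels)
print(pairs)  # [(0, 10), (0, -2), (0, 1), (0, -1), (1, 2)]
```

In this sketch, `dequantised` stands for the data streamed back towards the IDCT circuitry over the second streaming data connection, while `pairs` stands for the run level coded data passed on for variable length coding.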