I NASA Technical Memorandum 102325

Digital for Real-Time Processing of Broadcast Quality Signals at 1 8 Bits/Pixel

Mary Jo Shalkhauser and Wayne A. Whyte, Jr. Lewis Research Center Cleveland, Ohio

(NASA-Tll-102325) DIGITAL CODEC FOB lV8 9 -279 2 7 REAL-TIHE PBOCESSIIG OF BRO&DCAST QUALITY VIDgO SIGWALS AT 1.8 BITS/PIXBL IMASa- Lewis Research Center) 16 p CSCL 17B Dnclas G3/32 0225953

Prepared for the * Global Telecommunications Conference sponsored by the Institute of Electrical and Electronics Engineers Dallas, Texas, November 27-30, 1989 i .. ,_,

DIGITAL CODEC FOR REAL-TIME PROCESSING OF BROADCAST QUALITY VIDEO SIGNALS AT 1.8 BITSIPIXEL

Mary Jo Shalkhauser and Wayne A. Whyte. Jr.

National Aeronaut cs and Space Administration Lewi s Research Center C1 eve and, Ohio 44135

11. Abstract signals in the same bandwidth occupied by a single frequency modulated television signal. This paper Advances in very large-scale integration presents the hardware implementation of a digital and recent work in the field of bandwidth television bandwidth compression algori thm which efficient digital modulation techniques have processes standard NTSC (National Television Systems combined to make digital video processing Committee) composite color television signals and technically feasible and potentially cost com- produces broadcast quality video in real time at an petitive for broadcast quality television average of 1.8 bits/pixel. (A pixel, or picture transmission. A hardware implementation has element, represents each piece of sampled data. been developed for a DPCM-based digital televi- The sampling rate used with this results sion bandwidth compression algorithm which in 768 samples over the active portion of each processes standard NTSC composite color video line by 512 active video lines per video television signals and produces broadcast frame.) The algorithm is based on differential quality video in real time at an average of pulse modulation (DPCM), but additionally 1.8 bitslpixel. This paper describes the data utilizes a nonadaptive predictor, nonuniform quan- compression algorithm and the hardware imple- tizer and multilevel Huffman coder to reduce the mentation of the codec, and provides perform- data rate substantially below that achievable with ance results. straight DPCM. The nonadaptive predictor and multi- level Huffman coder combine to set this technique apart from prior-art DPCM encoding . 11. Introduction Section I11 below will provide the details of the compression algorithm while sections IV and V dis- Transmission of television signals in a digital cuss the hardware implementation and performance format has been looked upon with promise for a results, respectively. number of years. Digital systems providing tele- conferencing quality video have become common place 111. Algorithm in both government and industry. However, digital transmission of high-quality (toll grade or broad- Differential pulse code modulation has histor- cast quality) television signals has yet to achieve ically been one of the most popular predictive anything close to the same kind of acceptance. image coding methods studied, due to its simplicity This has been due in part to the broadcasters' of implementation and overall subjective perform- reluctance to have processing of any kind performed ance characteristics. The fault of DPCM schemes in on the transmitted signals. But to a greater the past have been that 3 to 4 bitslpixel were extent, digital transmission of broadcast quality required to achieve acceptable image quality, with video has failed to gain acceptance because it has 4 bits/pixel generally preferred to maintain a not been cost effective to do so. The lack of broadcast qual i ty picture representation. The sys- available wideband digital links as well as the tem presented here combines the simplicity of the complexity of implementation of bandwidth effi- basic DPCM approach with several performance cient digital 's (encoder/decoder) has enhancements to achieve broadcast quality images worked to keep the cost of digital television at an average 1.8 bits/pixel. transmission too high to compete with analog , methods. A block diagram of the compression scheme is presented in Fig. 1. The DPCM portion utilizes an Advances in vsiry large-scale integration intrafield approach with a two-dimensional predic- (VLSI) as well as recent work in the field of tio? based on averaging neighboring pixel values advanced digi tal modul$.rlon techniques have com- having the same color subcarrier phase relationship bined to make dtqqtal video processing technically as the current pixel . Sampling of the compos1 te feasible and potqntially cost competitive for analog video signal is done at four times the color broadcast quality television transmission. The subcarrier frequency rate (4 x 3.579545 MHz). Fig- coupling 'of a transparent, bandwidth efficient data ure 2 shows the spacial relationship of the current compression technique with a bandwidth efficient pixel and the two pixels used to generate the pre- modulation technique offer the potential for trans- dicted value, PV, in Fig. 1. The pixels used are mission of two (or more) high-quality television the fourth previous pixel from the same line and

1 I I I I I

DIGITIZED VIDEO IN I QUANTIZER I VALUE I I I I I I I I NON-ADAPTIVE I I ~~~~ RP I PREDICTOR I I I I o;p I I I I I LINE LINE I +2 I + I DELAY DELAY I ENCODER I

I I I I NON-ADAPTIVE I I PREDICTOR I I I I I I I RECONSTRUCTED I DlGlTlZEDVlDEO 4 1 OUT I I I I I I I I I I I I I I I I I I I I DECODER I

SAMPLES

r2LINES PREVIOUS PIXEL

CONSECUTIVE LINESOFA -0 0 0 0 0 0 0 0 FIELD -o~oooqoo 4TH CURRENT PIXEL PREVIOUS PIXEL

Figure 2. - Anabg video signal sampling and DPCM pixel relationships. the same pixel from two lines previous in the same current pixel value to obtain a difference value field. These neighboring pixels have the same to be quantized. color subcarrier phasing as the current pixel and will therefore be highly correlated.. The two pixel Figure 1 shows a "nonadaptive predictor" values are averaged to produce the prediction of (NAP) value being subtracted from the current pixel the current pixel value. At this point the algo- value along with the predicted value, PV. The rithm differs from standard DPCM. vlhere the pre- function of the NAP is to further improve the dicted value would simply be subtracted from the prediction of the current pixel. The nonadaptive

2 predictor estimates the difference value obtained The quantizer shown in Fig. 1 has 13 levels. when the DPCM prediction is subtracted from the Each level has a quantization value associated current pixel value (PIX - PV). The subtraction with a range of difference values as indicated in of the NAP value from PIX - PV causes the result- Table I. The quantizer is nonuniform so that more ing difference (DIF) value to be close to zero. levels are provided for small magnitude differ- The smaller the DIF, the more efficiently the quan- ences which would result from subtle changes in tized pixel information can be rransnitted aue to picture content. The human eye is sensitive to the use of Huffman coding prior to rransmission small variations in smooth regions of an image and over the channel. (Huffman coding assigns varia- can tolerate larger variations near transition ble length codewords based upon probability of boundaries where large difference values are occurrence. The application of Huffman coding to more likely to occur. The nonadaptive predictor this algorithm will be discussed later.) discussed previously, acts to reduce the difference values thus improving image quality by reducing the The development of the nonadaptive predictor quantization error. This is because the nonuniform was predicated on the likelihood that the differ- quantizer results in lower quantization error for ence values of adjacent pixels are similar. The small magnitude differences than for large magnitude difference between the current pixel value and its differences. The number of quantization levels, prediction, PV, is estimated and subtracted off by the corresponding difference value ranges, and the way of the NAP prior to quantiztion. The estimate specific quantization values shown in Table I were is simply based on the value of DIF for the previ- experimentally derived through subjective evalua- ous pixel. The NAP is nonadaptive because the tion of sample images processed by computer simula- estimates are prestored and do not change with dif- tion of the encoding algorithm. fering picture content. These prestored values were generated from statistics of numerous televi- The final major aspect of the encoding algo- sion images covering a wide range of picture rithm is the multilevel Huffman coding process. content. The NAP values represent the average dif- Huffman coding of the quantized data allows shorter ference values calculated within the boundaries of codewords to be assigned to quantized pixels having the difference values for each quantization level. the highest probability of occurrence. A separate Table I shows the NAP values corresponding to each set of Huffman has been generated for each of quantization level. To give an example using the the 13 quantization levels. The matrix of code values in Table I; if the difference value (DIF) sets is used to reduce the number of data bits for the previous pixel was 40, corresponding to required to transmit a given pixel. The particular quantization level 11, the value of NAP to be sub- Huffman code set used for a given quantized pixel tracted off from the current pixel difference would is determined by the quantization level of the pre- be 38. To reconstruct the pixel, the decoder uses vious pixel (i.e.. if the difference value for the a lookup table to add back in the appropriate NAP previous pixel resulted in quantization level 4 value based upon knowledge of the quantization being selected for that pixel, then the Huffman level from the previously decoded pixel. The use code set selected for the current pixel would be of the NAP results in faster convergence at transi- code set 4, corresponding to the probability of tion points in the image, thereby improving edge occurrence of pixels falllng into the fourth quan- detection performance. The rapid convergence also tization level). reduces the total data requirements by increasing the percentage of pixels in quantization level 7, As with the NAP, the Huffman code trees were which is assigned the shortest codeword length by generated by compiling statistical data from numer- the Huffman coding process. ous images covering a broad range of picture con- tent. Probability of occurrence data was compiled for each of the 13 quantization levels as a func- TABLE I. - QUANTIZATION AND NONADAPTIVE tion of the quantization level of the previous PREDICTION pixel. A separate Huffman code set was then gener- ated based on the probabil i ty data of "current" Difference Quanti- Quanti- Nonadaptive pixels falling into each of the 13 quantization (DIF) zation zat i on prediction levels of the "previous" pixels. There is a tend- value level value (NAP) ency for neighboring pixels to fall into the same (QL) CQV) value or close to the same quantization level. By recog- nizing and taking advantage of this fact, the use -255 to -86 1 -100 -85 of the multilevel Huffman code sets provides sig- -85 to -60 2 -66 -61 nificant reductions in bits per pixel over a -59 to -34 3 -42 -38 single Huffman code because they allow nearly -33 to -19 4 -25 -22 all pixels to be represented by short length -18 to -9 5 -14 -1 1 codewords. -a to -4 6 -6 -4 -3 to 3 7 0 0 IV. Hardware Implementation 4 to 8 8 6 4 9 to 18 9 14 11 The configuration of the data compression 19 to 33 10 25 21 hardware is shown in Fig. 1. The NTSC format video 34 to 59 11 42 38 signal is digitized by sampling with an analog-to- 60 to 85 12 66 61 digital (A/D) converter at a rate of four times the 86 to 255 13 100 84 color subcarrier frequency (approximately 14.32 MHz). The AID converter has an 8-bit resolution allowing

3 each sample, called a pixel, to be represent d by lines of each field, the RAM is loaded with the one of 256 digital levels. The 8-bit pixels are reconstructed values of the original pixels, while input to the enccder at the 14.32 MHz rate. The the output register of the 2-line delay is zeroed. encoder Compresses the video data and serial Y For every line thereafter, the pixel value of two transmits a comoressed representation of the data lines previous is read out of the RAM and then the over a channel at a rate of approximately new reconstructed pixel (RP) value is written into 25 Mbps :megabits/sec) to the decoder. the same memory location. Then the address counter is incremented to the next memory location for the The decoder receives the compressed ser a1 next pixel prediction. data ana reconstructs a facsimile of the original viaeo data. The reconstructed &bit pixels are The outputs of the 2-line delay and the 4-pixel converted back to an analog video signal using a delay are added together using two cascaded 4-bit digital-to-analog (D/A) converter. For this imple- full adders. The divide-by-two function is per- mentation, the entire video signal including the formed by dropping the adder's least significant horizontal and vertical synchronization pulses, output bit and using the carry-out signal as the the color burst, and the active video is sampled most significant bit. This is the same as a "shift and compressed. In future implementations, all right with carry" of the adder outputs. During the but the active video part of the signal will be first two lines of each field, the DPCM prediction removea by the encoder and the decoder will recon- circuit uses only the 4th previous pixel on the struct these portions and reinsert them into the current line to predict the current pixel value. signal, thereby increasing the overall compression In this case, the 2-line delay input to the adder of the picture. is zeroed and the divide-by-two circuit is by-passed using a multiplexer. For this breadboard version of the compression hardware, TTL (transistor-transistor logic) was The DPCM predicted value (PV) and the nonadap- chosen as the implementation technology due to ease tive prediction (NAP) value are subtracted (sub- of usage and wide availability of devices. The tractions are performed as two's complement hardware designs were constructed on wire-wrap additions) from the original pixel value resulting boards and mounted in a 5-slot 19-in. chassis. in a difference value (DIF = PIX - PV - NAP). These difference values are then grouped into quan- En coder tization levels (QL), created from a look-up table Implemented In programmable read-only memory (PROM) The encoder portion of the compression hard- using the difference (DIF) value as the address. ware digitizes the analog video signal into 8-bit The quantization levels are delayed by one pixel- pixels, compresses the image, and converts the time and used to address another PROM look-up table resultant data to a serial bit stream. A detailed to create the nonadaptive prediction (NAP). The description of each of the encoder blocks (shown nonadaptive predictor estimates the current DPCM in Fig. 1) follows. difference value (PIX - PV) from the difference value of the immediately previous pixel. The differential pulse code modulation circuit is shown in Fig. 3. This DPCM circuit averages The quantization value CQV), an estimation of previous neighboring pixel values to predict the the DIF, is created from another PROM look-up current pixel value. The previous pixels of the table. In this case, the quantization level is same color subcarrier phase as the current pixel used to address the memory locations which contain are obtained using a 4-pixel delay and a 2-line the quantization values. I delay. The 4-pixel delay is implemented using four 8-bit registers in a shift register .configuration. The two-dimensional Huffman codes are created by yet another PROM look-up table. The current The 2-line delay is implemented using a random quantization level (QL) and the immediately previ- access memory (RAM) which is addressed by a counter ous quantization level (QLn-l) address a PROM which that recycles every two lines. For the first two contains, at each location, a 1 to 12-bit Huffman

DIVIM-BY 2 ADDER MUX -pv NO DNIDE-BY 2

RP RECONSTRUCTED PIXEL PIX PIXEL PV PREDICTED VALUE

Figure 3. - Differential pulse code modulation circuit.

4 code and a 4-bit code which specify the length of shifts stop and a new code is read from the FIFO the Huffman code. memory. Then the shift register and counter are loaded with new values and the shifting process The outputs of the Huffman encoder are multi- reDeats. plexed with the first four pixels of every line so that the DPCM circuit has a valid starting point. Decoder The output of this multiplexer is input into a bank of first-in, first-out (FIFO) memories. Forty FIFO The decoder circuit receives the serial data integrated circuits are configured with expanded that the encoder transmitted and reconstructs a width and depth to achieve a Sank of FIFO memory representation of the original 8-bit pixels, and 18 bits wide and 72 K deep. The FIFOs are neces- using a dlgital-to-analog (DIA) converter, creates sary to compensate for the variable lengths of the an analog video signal. A detailed description of Huffman codes and the differences between the FIFO each of the decoder blocks (Fig. 1) follows. input frequency and the FIFO output frequency. On the input-side of the FIFO's, the data is written The input to the decoder contains three paral- periodically at the pixel rate of 14.32 MHz. On lel circuits: line unique word detect, field the output side of the FIFO's, data is read out at unique word detect, and Huffman decoder. The a variable rate depending on the length of the unique word detect circuits allow detection of Huffman codes and the frequency of the serial data. unique words with bit errors by selection of an error threshold of up to 3-bit errors. A block Sixteen of the FIFO's bits are data (either diagram of the unique word detect circuit is con- actual pixel values for the first four pixels of tained in Fig. 5. The serial data is first shifted each line or Huffman codes) and length of data. into a 16-bit shift register. The 16-bit parallel The other two bits are used to pass line and field output of the shift register is exclusive-or'd flags, indicating the start of each line and each (XOR) with the correct unique word value set in dip field. The line and field flags are used for switches. Next, the number of ones contained in insertion of unique words into the data. the XOR outputs are summed using adders. A circuit at the output of the adders allows selection of the Unique words are necessary to maintain proper error threshold and creates a pulse if a unique fie d and line timing at the decoder. Due the var- word within that error threshold is detected. The iab e length nature of the Huffman codes, channel unique word detect pulse is ANDed with a unique bit errors can often result in improper detection word window signal which disallows unique word of he codes by the decoder. Unique words allow detects until close to the expected location of the line and field timing to get back on track in valid unique words. The windowing technique lowers the event of bit errors, to minimize the impact to the probabi 1 i ty of false detects. the quality of the reconstructed video images. Different unique word values are used for lines The Huffman decoder receives the data in a and fields so they can be detected separately. In serial format from the output of a 16-bit shift both cases, unique words were chosen to avoid dup- register that parallels the unique word detect cir- lication by valid Huffman codes. Sixteen-bit cuits. When unique words are detected, the Huffman unique words are currently used; however, the hard- decoder is disabled while the unique word is purged ware was flexibly designed so that the unique word from the shift register to avoid Huffman decoding content and length can be changed If necessary. of unique words. The line and field flags at the FIFO outputs The Huffman decoder (Fig. 6) is implemented are monitored to allow insertion of the unique as a tree search in memory. The address to the words at the proper position within the data. When Huffman decoder PROM is initially set to zero (top a line or field flag is detected, FIFO reads are node of the Huffman code tree). The content of stopped to allow time for the unique words to be each memory location consists of the next two pos- multiplexed within the data (see Fig. 4). Like the sible addresses to the memory denoting the next Huffman codes, the unique words must contain a two tree branches. As each serial bit is received, 4-bit code indicating the length of the unique it is used by a multiplexer to select the next mem- words. The unique words are divided into two 8-bit ory address. A serial "one" selects one address sections each accompanied by a length code. After (branch) and a serial "zero" selects the other insertion of the unique word, the FIFO reads are address (branch). The new address (new tree node) reactivated. also contains the next two possible tree branches based upon the next received serial bit. The tree Next, the data must be converted from the par- search continues in this manner until the least allel format to a serial format for transmission significant output bit of the memory is high, indi- over a channel. Because lengths of the Huffman cating the end of a valid Huffman code. At this codes vary, a variable length parallel-to-serial point, the other memory output bits contain the converter must be used. The parallel-to-serial correct quantization value (QV) and quantization converter (shown in Fig. 4) consists of a 12-bit level (QL) for the received Huffman code. The parallel load shift register and a counter. The PROM address is then reset to zero (the top node Huffman codes are loaded into the shift register of the tree) and the decoding process continues. and the 4-bit length of the Huffman code is loaded into the counter. The counter counts down as the As the Huffman codes are detected, the result- shift register shifts out the data into a serial ant quantization levels and values are written into bit stream. When the counter reaches zero the FIFO's. These FIFO's, like in the encoder, are

5 7 FIFO'OUTPUT - COUNTER MUX - 4- WORD LENGTH FIELD 4 MUX UNIQUE WORD DATA WORD LINE ----C SERIAL DATA DIP UNIQUE CLOCK SWITCH - WORD

16-BITSHIFT UNIQUE WORD DIP SWITCHES DATA

ADDERS

THRESHOLD LOGIC

WINDOW 2 Figure 5. - Unique word detect circuit.

A ROM D INITIAL ----C OLn-' w A MUX LATCH A NON-ADAPTIVE VALUE + T PREDICTION 0 A -

SERIAL DATA 1 1 NEXT QL t PROM - SEL ADDRESS A D - A D A- 'cA MUX *D 1 MUX ZERO -B A A AB LOOK-UP TABLE m QUANTIZATION t e VALUE

FLAG

Figure 6. - Huffman decoder circuit.

required to absorb the differences in the variable delayed by one pixel-time and used by a PROM length Huffman codes and the pixel rate at the out- look-up table to create the nonadaptive prediction put of the decoder circuit. In conjunction with (NAP). The quantization value CQV) is added to the the unique word detects and the line and field nonadaptive prediction (NAP) value and the OPCM counters, the FIFO writes and reads are controlled prediction value (PV) to create the reconstructed to compensate for synchronization problems created pixel values (RP). The decoder DPCM circuit imple- by improper Huffman decoding due to bit errors. mentation is identical to the encoder DPCM circuit. The RP values are input to a D/A converter which The FIFO outputs, quantization levels and converts the reconstruct pixel values to an analog quantization values, are used to reconstruct the video signal. original video image. The quantization level is

G ONGINAL PAGE j3lJ$W AUQ WHITE PHOTOGRAPH

(a) Original image (8 bpp). (b)Processed image (1.949 bpp) TABLE 11. - ENCODING ALGORITHM PERFORMANCE RESULTS

Scene Bi ts/ pixel

Beach 2.228 Make up 1.738 Lawyer 1.976 Star Trek 1.682 Bees and flowers 1.996 IC) Onginal inage (8 bw) Id) Processed image (1.815 bpp! Game/girl 1.689 Woman/couch 1.813 Hospital 1.979 Woman /man 1.802 Woman/gray sui t 1.872 Two men 1.949 Manlphone 1.815 Fuller brush 2.031 P 1 ane / b ip 1 ane 1.711 Womanlhand 1.671 News woman 1.823 Color bars 1.347 Two women 1.659 (e) Original image (8 bpp) (f)Processedimage(l.711 bpp) Average 1.822

(9) Onginal image (8 bpp). (h) Processed image(l.671 bpp) Figure 7. - Results 01 mmputer sirnulalion processing of NTSC video inages (bpp -bits per pixel).

7 V. Performance Results ical comparative testing, since side-by-side as well as switched comparisons were possible. Inter- Table I1 provides results of computer simula- frame motion is not degraded by the processing tion of the encoding algorithm described in this since the algorithm is an intrafield coding proc- paper. The picture content across the images is ess. Motion sequences of "real-time" video were representative of the broad range of material which processed on a frame-by-frame basis and pieced makes up typical television viewing. The results together using a type "C" 1-in. broadcast video show the variability of compression with ComDlexity tape recorder with animation editing capabilities. of oicture content. Standard color bars, contain- No motion artifacts were present in the reconstruc- ing considerable redundancy, can be processed very ted sequence as expected. efficiently at 1.347 bitslpixel. The Beach scene, one of the SMPTE color reference subjective testing The hardware design of this video data com- slides, requires 2.228 bits/pixel; an indication pression algorithm is relatively straight forward of the complex nature of the scene. fhe other and can be easily implemented in VLSI for perform- images fall somewhere between these bounds, with ance improvement, size reduction and cost effec- an average across all scenes of 1.822 bitslpixel. tiveness. Combining the data compression hardware Figure 7 shows original and reconstructed images. with a bandwidth efficient modulation technique such as 8-PSK (practical bandwidth efficiency of The reconstructed image quality in all cases greater than 2 bits/sec/Hz in a power limited sys- is excellent - reconstructed images are indlstln- tem) will enable two or more broadcast quality guishable from the 8 bits/pixel digitized origi- television signals to be transmitted through a sin- nals. A two channel video frame storage unit was gle C-band transponder, creating a substantial used in the comparison of original versus proc- bandwidth improvement over analog television essed images. This provided a means for very crit- transmissions.

8 cw\s/\National Aeronautics and Report Documentation Page 1. Report No. 2. Government Accession No. 3. Recipient's Catalog NO. NASA TM-102325 5. Report Date

6. Performing Organization Code

7. Author@) 8. Performing Organization Report No. Mary Jo Shalkhauser and Wayne A. Whyte, Jr. E-5026 IO. Work Unit No. 643- 10-01 9. Performing Organization Name and Address 11. Contract or Grant No. National Aeronautics and Space Administration Lewis Research Center Cleveland, Ohio 44135-3191 13. Type of Report and Period Covered

2. Sponsoring Agency Name and Address Technical Memorandum

National Aeronautics and Space Administration 14. Sponsoring Agency Code Washington, D.C. 20546-0001

5. Supplementary Notes Prepared for the Global Telecommunications Conference sponsored by the Institute of Electrical and Electronics Engineers, Dallas, Texas, November 27-30, 1989.

6. Abstract Advances in very large-scale integration and recent work in the field of bandwidth efficient digital modulation techniques have combined to make digital video processing technically feasible and potentially cost competitive for broadcast quality television transmission. A hardware implementation has been developed for a DPCM-based digital television bandwidth compression algorithm which processes standard NTSC composite color television signals and produces broadcast quality video in real time at an average of 1.8 bitdpixel. This paper describes the data compression algorithm and the hardware implementation of the CODEC, and provides performance results.

17. Key Words (Suggested by Author(@) 18. Distribution Statement Data compression; Video; DPCM; Huffman coding; Unclassified -Unlimited Digital television Subject Category 32

19. Security Classif. (of this report) 20. Security Classif. (of this page) 21. No of pages 22. Price' Unclassified Unclassified 10 A02 I NASA FORM 1826 OCT 86 'For sale by the National Technical Information Service, Springfield, Virginia 22161