Hybrid Digital-Analog Scheme for Video Transmission over Wireless

Lei Yu, Houqiang Li, Senior Member, IEEE, and Weiping Li, Fellow, IEEE
University of Science and Technology of China, Hefei, Anhui, China

Abstract—In this paper, we propose a novel wireless video transmission scheme named HDA-Cast, a hybrid digital-analog (HDA) coding scheme that integrates the advantages of digital coding and analog coding. Relative to most state-of-the-art video transmission methods, it avoids the “cliff effect” provided that the channel quality is within the expected range, gives better fairness among all receivers for multicast, and adapts strongly to channel variation. The evaluation results show that HDA-Cast is 3.5-9.6 dB better than SoftCast, an up-to-date analog scheme. Owing to its strong adaptation to channel variation, HDA-Cast can be regarded as a kind of wireless scalable video coding (WSVC).

I. INTRODUCTION

With the rapid development of wireless networks and mobile terminals, wireless video services are becoming increasingly important and popular, with diverse applications such as mobile TV, wireless video surveillance, mobile video telephony, and mobile video conferencing. A conventional wireless video transmission scheme typically consists of separate digital video compression coding (e.g., H.264/AVC) and digital channel coding. In such a “digitally-coded” framework, the quality of the decoded video is inevitably sensitive to channel variation. In particular, as the channel deteriorates, the video quality drops sharply when a certain critical point is approached. This phenomenon is called the “cliff effect” [6]. For mobile terminals, the wireless channel varies almost all the time, so the video cannot be reconstructed with reliable quality. This is one reason why the conventional wireless video transmission scheme is not suitable for present wireless video services. The other reason is that, for multicast/broadcast video services, the bitrate selected by a conventional scheme cannot fit all receivers at the same time. If the video is transmitted at a high bitrate, it can be decoded only by receivers with better-quality channels, which is unfair to receivers with worse channels; if it is transmitted at a low bitrate supported by all receivers, the performance of the receivers with better channels is reduced, which is unfair to them.

To overcome the shortcomings of the conventional scheme, scalable digital video transmission schemes have been proposed. Such a scheme encodes the video with multiple levels of quality or resolution and transmits it with unequal error protection (UEP), such as hierarchical modulation (HM) [1]-[4]. It usually uses scalable video coding (SVC) at the sender, which fragments a video stream into a base layer and several enhancement layers [5], so we refer to this method as SVC+HM for short.

Some analog wireless video transmission schemes, in which neither digital source coding nor digital channel coding is adopted, have also been proposed to avoid the cliff effect and remove the unfairness in multicast, such as SoftCast [6], [7]. Because SoftCast does not employ prediction coding, and thereby avoids error propagation, it degrades gracefully as the channel condition worsens. However, prediction coding is one of the most efficient compression tools in video coding, so the coding efficiency of SoftCast is relatively low.

Besides digital and analog schemes, some hybrid digital-analog (HDA) transmission schemes have been proposed in recent years [10], [11]. These schemes integrate digital coding techniques with analog coding techniques by transmitting the superposition of a digital modulation signal and an analog modulation signal, or by transmitting the two signals by time-sharing or bandwidth-sharing. So far, however, most of the existing HDA schemes focus on theoretical study [8], [9]; except for the VQHDA image transmission scheme in [10] and the D-Cast video transmission scheme in [11], there is hardly any practical HDA scheme for image or video transmission. D-Cast [11] is an HDA scheme for video transmission based on SoftCast. It combines separate digital coding and SoftCast by time-sharing or bandwidth-sharing. Considering that the low coding efficiency of SoftCast results from using a purely analog scheme to code the DCT coefficients, D-Cast employs coset coding and syndrome coding (two typical techniques used in distributed source coding) to code the coefficients so as to effectively reduce their amplitudes (variances).

From the analysis above, a wireless video transmission scheme should perform well in three aspects: (1) Unicast Efficiency: it measures the quality of the video transmitted from the sender to a stationary receiver. The better the received video quality is (at the same channel quality), the higher the Unicast Efficiency is; (2) Multicast (or Broadcast) Efficiency: it measures the overall quality of the video transmitted from the sender to multiple stationary receivers. It also reflects the fairness among all receivers; (3) Mobility: it measures the quality of the video transmitted from the sender to one or multiple mobile receivers. The more gracefully the video quality varies as the channel varies, the higher the Mobility is.

In this paper, we propose a novel wireless video transmission scheme named HDA-Cast, which adopts the HDA transmission technique, integrates the advantages of digital and analog transmissions, and overcomes their disadvantages. Relative to most state-of-the-art wireless video transmission schemes, HDA-Cast has better overall performance in Unicast Efficiency, Multicast Efficiency and Mobility.

978-1-4673-5762-3/13/$31.00 ©2013 IEEE

II. HDA-CAST

In order to achieve the advantages of both digital and analog schemes, HDA-Cast combines a low-bitrate digital scheme with a modified SoftCast, which avoids the cliff effect on condition that the channel quality is within the expected range and makes full use of the channel.

As shown in Fig. 1, HDA-Cast consists of digital and analog codec parts. At the sender side, the video is first encoded by H.264/AVC, and the residual is processed by the modified SoftCast (MSoftCast); then the H.264/AVC bitstream is channel coded, modulated and allocated power by the sender; finally, the power-allocated modulation signals and the output signals of MSoftCast are superposed and transmitted. At the receiver side, the decoder first decodes the digital signal accurately; then it obtains the analog signals by subtracting the digital signals from the received signals; finally, it reconstructs the video from the part decoded by H.264/AVC and the part decoded by MSoftCast.

A. Digital Encoder and Decoder

For the digital encoder and decoder part, HDA-Cast combines H.264/AVC with channel coding and modulation. In order to adapt better to the channel, HDA-Cast employs BPSK with low-rate convolutional codes [12].

B. Analog Encoder and Decoder

For the analog part, HDA-Cast adopts MSoftCast, which we propose. MSoftCast differs from SoftCast in two aspects.

First, each GOP is divided into blocks of size $h_{\mathrm{TU}} \times w_{\mathrm{TU}} \times l$, called transforming units (TUs), and every TU is divided into blocks of size $h_{\mathrm{PAU}} \times w_{\mathrm{PAU}} \times 1$, called power allocation units (PAUs, named Chunks in SoftCast). In order to reduce the complexity, DCT and IDCT are performed on each TU instead of on the whole GOP (as in SoftCast). Then, power allocation is performed on all PAUs (similar to SoftCast).

Second, in order to guarantee the decoding performance of the digital coding, the output signals of the MSoftCast encoder need to be whitened. Therefore, MSoftCast adopts random reordering instead of the Hadamard transform of SoftCast. Before the random reordering, the GOP needs to be partitioned (the reasons will be clarified in the next subsection). Assume that each GOP consists of $l$ frames and that $l$ is even. After DCT and power allocation, the MSoftCast encoder divides each GOP into two parts: the first $l/2$ frames and the last $l/2$ frames. Random reordering is applied to each part separately. The resulting signals are denoted as $x_1$ and $x_2$, and they are mapped to the Q (quadrature) and I (in-phase) components of the analog vector signal $x_a$, respectively. Obviously, the variance of $x_1$ is larger than that of $x_2$. Assuming that the seeds of the random reordering at the sender are known by all the receivers, all the receivers can restore the initial order of the signals by the inverse operation of the random reordering.

For completeness, the power allocation and the Linear Least Square Estimator (LLSE) in [6] employed by MSoftCast are listed as follows:

$$x_0(k) = g(k) \cdot s(k), \qquad g(k) = \sqrt{\frac{N \cdot P_a/2}{\sigma_s(k) \sum_{i=1}^{N} \sigma_s(i)}}, \tag{1}$$

and

$$y_0(k) = x_0(k) + n(k), \qquad \hat{s}(k) = \frac{g(k)\,\sigma_s^2(k)}{g^2(k)\,\sigma_s^2(k) + \sigma_n^2}\, y_0(k), \tag{2}$$

where $s(k)$ denotes the DCT coefficients in the $k$-th PAU, $g(k)$ is the power allocation scaling value for $s(k)$, $x_0(k)$ is the output of power allocation for $s(k)$, $\sigma_s(k)$ is the standard deviation of $s(k)$, $N$ is the number of PAUs in one GOP, $P_a$ is the average power allocated to the analog vector signal $x_a$, $y_0(k)$ is the received signal for $x_0(k)$, $\sigma_n^2$ is the variance of the noise $n(k)$ added to $x_0(k)$ during transmission, and $\hat{s}(k)$ is the output of the LLSE for $s(k)$.

It should be noted that the standard deviations of the DCT coefficients, which are the side information of MSoftCast, are transmitted in the digital part because of their importance to MSoftCast decoding.

C. HDA Modulation and Power Allocation

The transmitted signal is the superposition of the outputs of the digital and analog encoders.
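The power allocation and LLSE of formulas (1) and (2), together with the seeded random reordering of Subsection B, can be sketched as follows. This is only a minimal illustration in Python under stated assumptions: the function names, the PAU standard deviations, the power budget and the seed are hypothetical and do not come from the paper, and the gain is taken as $g(k) = \sqrt{N (P_a/2) / (\sigma_s(k) \sum_i \sigma_s(i))}$, which is how formula (1) is read here.

```python
import math
import random


def allocation_gains(sigmas, p_a):
    """Scaling values g(k) of formula (1), one per PAU (assumed reading)."""
    n = len(sigmas)
    total_sigma = sum(sigmas)
    return [math.sqrt(n * p_a / 2.0 / (s * total_sigma)) for s in sigmas]


def llse(y, g, sigma_s, noise_var):
    """LLSE estimate of formula (2) for one received coefficient y."""
    return g * sigma_s ** 2 / (g ** 2 * sigma_s ** 2 + noise_var) * y


def permutation(length, seed):
    """Pseudo-random order derived from a seed shared by sender and receivers."""
    order = list(range(length))
    random.Random(seed).shuffle(order)
    return order


def reorder(values, seed):
    """Sender side: emit the values in the seeded pseudo-random order."""
    return [values[i] for i in permutation(len(values), seed)]


def restore(shuffled, seed):
    """Receiver side: regenerate the order from the seed and invert it."""
    restored = [None] * len(shuffled)
    for position, original_index in enumerate(permutation(len(shuffled), seed)):
        restored[original_index] = shuffled[position]
    return restored


# Hypothetical inputs, chosen only for illustration.
sigmas = [8.0, 4.0, 2.0, 1.0]   # standard deviations of four PAUs
p_a = 1.0                       # average analog power P_a
gains = allocation_gains(sigmas, p_a)

# Total power spent on the scaled coefficients; equals N * P_a / 2 by construction.
allocated_power = sum((g * s) ** 2 for g, s in zip(gains, sigmas))
```

With these gains the allocated power sums to $N \cdot P_a/2$, PAUs with a larger standard deviation receive a smaller gain, and with zero noise variance the LLSE simply inverts the scaling. `restore(reorder(v, seed), seed)` returns `v`, which mirrors the requirement that the receivers know the sender's seeds.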

(a) The encoder of HDA-Cast

(b) The decoder of HDA-Cast

Fig. 1. Framework of our HDA-Cast.

As shown in Fig. 2, the transmitted signal $x$ is the sum of the BPSK-modulated vector signal $x_d$ and the output vector signal $x_a$ of MSoftCast, i.e.,

$$x = x_d + x_a. \tag{3}$$

Fig. 2. Mapping the outputs of the digital and analog encoders to the I/Q components of the transmitted signal.

A point worth noting is that the digital decoder treats the analog signal $x_a$ as noise, and the demodulation performance of BPSK is affected only by the variance of the I component of $x_a$. Therefore, in order to achieve better decoding performance, the I component of $x_a$ should be kept as small as possible. This is why each GOP is divided into two parts $x_1$ and $x_2$ with different variances in the previous subsection.

In order to decode the H.264/AVC bitstream and the side information “correctly”, an overall power allocation is needed. The two power allocation operations (operations 1 and 2) should be considered jointly because $x_a$ is treated as noise when decoding the digital part. Therefore the following inequality should hold:

$$\frac{P_d}{P_a + N_0} \ge \gamma_0, \tag{4}$$

where $P_d$ and $P_a$ are the average powers allocated to $x_d$ and $x_a$ respectively, $N_0$ is the average power of the channel noise, and $\gamma_0$ is the lowest SNR needed to guarantee that the decoding bit error rate (BER) is not larger than the target BER. In addition, the average transmitting power is usually constrained, i.e.,

$$P_d + P_a \le P_t, \tag{5}$$

where $P_t$ is the constrained average power of the transmitted signal $x$. Combining formulas (4) and (5) gives

$$P_a \le \frac{P_t - \gamma_0 N_0}{1 + \gamma_0}. \tag{6}$$

In order to achieve the best video quality, the power should be fully used. Therefore it is reasonable to set $P_a$ to its maximum. In addition, considering the varying channel, the above formula should hold when the maximum noise $N_m$ occurs, i.e.,

$$P_a = \frac{P_t - \gamma_0 N_m}{1 + \gamma_0}, \qquad P_d = P_t - P_a. \tag{7}$$

III. COMPLEXITY ANALYSIS

HDA-Cast adds MSoftCast to the conventional method of H.264/AVC with fixed modulation and convolutional code. Therefore the main additional complexity comes from the DCT and IDCT operations of MSoftCast. For an n-point DCT or IDCT, the computational complexity is $O(n \log n)$, so the additional computational complexity of HDA-Cast for one GOP is $O(V \log V_{\mathrm{TU}})$, where $V$ is the number of pixels in one GOP and $V_{\mathrm{TU}}$ is the number of pixels in one TU. The complexity can be reduced by reducing the size of the TU. In the extreme, the DCT and IDCT operations can be removed altogether at the cost of a small loss in video quality.

IV. EVALUATION AND RESULTS

A. Reference Baselines and Testing Setup

We have implemented a simulation of HDA-Cast, which we compare with three baselines to evaluate its performance: (1) H.264/AVC with convolutional codes and QAM, denoted as AVC+QAM; (2) H.264/SVC [4], [5] with convolutional codes and hierarchical modulation [1]-[3], denoted as SVC+HM; (3) the latest version of SoftCast [7], which uses the 3D-DCT.

All schemes are implemented using MATLAB’s communications toolbox. All convolutional codes used in our experiments are taken from [12] and have the best error correction capability, and all decoders of convolutional codes apply 3-bit soft-decision decoding.

We select the “Foreman” sequence with CIF resolution (352x288) to conduct all experiments, and test the performance of all the schemes in three situations: unicast with a stationary receiver, multicast with stationary receivers, and unicast with a mobile receiver. We test all the schemes over an AWGN channel and set the target SNR range to -4 dB to 25 dB, which is a reasonable SNR range for WLAN. In addition, in order to show the performance degradation outside the target SNR range, the measurements span SNRs from -5 dB to 25 dB. All the schemes transmit with the same power and use the same wireless bandwidth of 2.2 MHz, which is just enough for the analog transmission part to send all of the pixels.

In our experiments, the sizes of the TU and the PAU in HDA-Cast are 176 × 144 × 8 and 44 × 36 × 1, respectively. This means that each TU is divided into 128 PAUs. We apply rate 1/16 convolutional codes in HDA-Cast and set $\gamma_0 = -6.3$ dB and $P_t/N_m = -4$ dB in formula (7). This guarantees that the decoding BER at an SNR of -4 dB is less than $6.67 \cdot 10^{-8}$.

B. Unicast Efficiency

Method: We run a group of simple unicast experiments with a stationary receiver, whose SNR is fixed, for the different schemes: AVC+QAM, SVC+HM, SoftCast and HDA-Cast. In each experiment, all schemes are tested at the same SNR, and we conduct a group of such experiments at different SNRs.

In detail, we run H.264/AVC with different modulations and convolutional code rates, from BPSK with code rate 1/10 to 64QAM with code rate 3/4. We also run SVC+HM with different numbers of layers and different modulations and convolutional code rates for each layer. For the 2-layer SVC case, we transmit the base layer with QPSK and code rate 1/20, and the enhancement layer with QPSK and code rates 1/8, 1/4 and 1/2 respectively. For the 3-layer SVC case, we transmit the base layer with QPSK and code rate 1/20, the first enhancement layer with QPSK and code rate 1/8, and the second enhancement layer with QPSK and code rate 1/4.

Results: Fig. 3 shows the PSNR curves of the different schemes. From the figure, it can be concluded that: (1) HDA-Cast is 3.5-9.6 dB better than SoftCast when the SNR is between -4 dB and 25 dB; (2) the PSNR of HDA-Cast varies gracefully with the SNR, whereas the PSNR curves of AVC+QAM and SVC+HM exhibit the cliff effect; (3) the PSNR of HDA-Cast is higher than that of SVC+HM at almost every SNR, except for SNRs below 0 dB. In addition, the figure shows that for HDA-Cast the cliff effect appears at SNRs below -4 dB, which is outside our target SNR range.

Fig. 3. PSNR comparison for unicast with a stationary receiver.

C. Multicast Efficiency

Method: In order to test how much better HDA-Cast is than SVC+HM in the worst case, we perform an experiment with a single sender and three multicast receivers with SNRs of -4 dB, 10 dB, and 23 dB respectively, which is the best case for the 3-layer SVC+HM mentioned in the previous subsection. We test AVC+QAM, SVC+HM, SoftCast and HDA-Cast transmitting video to the multicast receivers. For AVC+QAM, AVC with BPSK and code rate 1/10 is chosen so that the worst receiver achieves better video quality. For SVC+HM, as shown in Fig. 3, the three-layer configuration QPSK1/20+QPSK1/8+QPSK1/4 is the best and is therefore chosen.

Results: The comparison of PSNRs for the different schemes is shown in Fig. 4. From the figure, it can be concluded that HDA-Cast is clearly better than the other schemes, except for a small PSNR loss at receiver 1.

Fig. 4. PSNR comparison for multicast with stationary receivers 1-3 at SNR -4 dB, 10 dB, and 23 dB respectively.

D. Mobility

Method: We assume that, for all the schemes, the sender does not know the instantaneous channel quality. We run a simple unicast experiment with a mobile receiver whose SNR varies uniformly from 4 dB to 0 dB. For AVC+QAM, we assume that BPSK with a rate 1/2 convolutional code is selected by the sender; for SVC+HM, any of the configurations mentioned in Subsection B can be selected because, as shown in Fig. 3, the decoded video quality is the same when the SNR varies from 4 dB to 0 dB.

Results: Fig. 5 shows the PSNR variation with frame number. From the figure, HDA-Cast has a higher video PSNR and varies more gracefully, which verifies that it adapts strongly to a varying channel.

Fig. 5. PSNR variation with frame number for mobility with a mobile receiver whose SNR varies uniformly from 4 dB to 0 dB.

The performance comparisons among all the schemes above are summarized in Table I.

TABLE I
PERFORMANCE COMPARISON OF DIFFERENT SCHEMES

Schemes     Unicast Efficiency    Multicast Efficiency    Mobility
AVC+QAM     High                  Low                     Low
SVC+HM      Middle                Middle                  Middle
SoftCast    Low                   Middle                  High
HDA-Cast    Middle                High                    High

V. CONCLUSION

In this paper, we propose a novel wireless video transmission scheme named HDA-Cast, which integrates the advantages of digital and analog transmissions. The performance evaluation in terms of Unicast Efficiency, Multicast Efficiency and Mobility shows that HDA-Cast outperforms most state-of-the-art methods. In other words, HDA-Cast avoids the cliff effect on the condition that the channel quality is within the expected range, gives better fairness among all receivers for multicast, and adapts strongly to channel variation. HDA-Cast can be applied in most wireless video communication scenarios, such as WLAN media sharing, video telephony, satellite broadcasting, and mobile TV.

ACKNOWLEDGMENT

This work was supported by the Natural Science Foundation of China (NSFC) under contract No. 61272316 and 61233003, and the 973 Program under contract No. 2013CB329004.

REFERENCES

[1] ETSI. Digital Video Broadcasting; Framing structure, channel coding and modulation for digital terrestrial television, Jan 2009. EN 300 744.
[2] ETSI. Digital Video Broadcasting; DVB-H Implementation Guidelines, Jun 2009. TR 102 377.
[3] T. Kratochvíl, “Hierarchical Modulation in DVB-T/H Mobile TV Transmission,” in Lecture Notes in Electrical Engineering, vol. 41, Multi-Carrier Systems & Solutions, S. Plass, A. Dammann, S. Kaiser, K. Fazel, Ed., Springer Netherlands, pp. 333-341, 2009.
[4] M. M. Ghandi and M. Ghanbari, “Layered H.264 video transmission with hierarchical QAM,” Elsevier Journal of Visual Communication and Image Representation, vol. 17, no. 2, pp. 451-466, Apr. 2006.
[5] H. Schwarz, D. Marpe, and T. Wiegand, “Overview of the scalable video coding extension of the H.264/AVC standard,” IEEE Trans. Circuits Syst. Video Technol., vol. 17, no. 9, pp. 1103-1120, Sep. 2007.
[6] S. Jakubczak, H. Rahul, and D. Katabi, “One-size-fits-all wireless video,” in Proc. Eighth ACM SIGCOMM HotNets Workshop, New York City, NY, Oct. 2009.
[7] S. Jakubczak and D. Katabi, “A cross-layer design for scalable mobile video,” in ACM Proc. of the 17th Annu. Int. Conf. Mobile Computing and Networking, pp. 289-300, New York, NY, USA, 2011.
[8] V. Prabhakaran, R. Puri, and K. Ramchandran, “Hybrid digital-analog codes for source-channel broadcast of Gaussian sources over Gaussian channels,” IEEE Trans. Inform. Theory, vol. 57, no. 7, pp. 4573-4588, Aug. 2011.
[9] Y. Wang, F. Alajaji, and T. Linder, “Hybrid digital-analog coding with bandwidth compression for Gaussian source-channel pairs,” IEEE Trans. Commun., vol. 57, no. 4, pp. 997-1012, Apr. 2009.
[10] Y. Wang, F. Alajaji, and T. Linder, “Design of VQ-based hybrid digital-analog joint source-channel codes for image communication,” in Proc. IEEE Data Compression Conf., Snowbird, Utah, USA, pp. 193-202, Mar. 2005.
[11] X. Fan, F. Wu, and D. Zhao, “D-Cast: DSC based soft mobile video broadcast,” in ACM Int. Conf. Mobile and Ubiquitous Multimedia (MUM), Beijing, China, Dec. 2011.
[12] P. Frenger, P. Orten, and T. Ottosson, “Code-spread CDMA using maximum free distance low-rate convolutional codes,” IEEE Trans. Commun., vol. 48, no. 1, pp. 135-144, Jan. 2000.