1 A Tutorial and Review on Inter-Layer FEC Coded Layered Video Streaming
Yongkai Huo∗, Cornelius Hellge†, Thomas Wiegand†, Fellow, IEEE, and Lajos Hanzo∗, Fellow, IEEE ∗School of ECS, University of Southampton, UK. †Fraunhofer HHI, Berlin, Germany. Email: {yh3g09∗, lh∗}@ecs.soton.ac.uk, {cornelius.hellge†, thomas.wiegand†}@hhi.fraunhofer.de
Abstract—Layered video coding creates multiple layers of and large-screen televisions (TVs) may be used in various unequal importance, which enables us to progressively refine scenarios. In order to eliminate this problem, layered video the reconstructed video quality. When the base layer (BL) is coding became the center of research efforts and a number of corrupted or lost during transmission, the enhancement layers (ELs) must be dropped, regardless whether they are perfectly standards emerged, such as the scalable video coding (SVC) decoded or not, which implies that the transmission power scheme standardized in H.264. The state-of-the-art in layered assigned to the ELs is wasted. For the sake of combating this video compression techniques will be portrayed in Section problem, the class of inter-layer forward error correction (IL- II-B, while the above-mentioned question will be answered 1 FEC) solutions, also referred to as layer-aware FEC (LA-FEC ), in more detail in Section II-C. has been proposed for layered video transmissions, which jointly encode the BL and the ELs, thereby protecting the BL using Layered video coding generates a video stream containing the ELs. This tutorial aims for inspiring further research on IL- multiple inter-dependent layers. Specifically, a less important FEC/LA-FEC techniques, with special emphasis on the family of layer may depend on a more important layer for decoding. soft-decoded bit-level IL-FEC schemes. Numerous techniques have been conceived for the sake of improving the attainable video quality by allocating different error protection, different power, etc to different layers. More I.PROLOGUE details found in the literature will be reviewed in Section III. Sophisticated video processing is required for recording, During the early days of layered video communication, for production, or re-broadcasting of moving images, which has example different code-rates were invoked for the different been changing our everyday life ever since the first clip was layers. When the most important base layer is corrupted or lost, captured by a video tape recorder in 1951. Since then a the less important layers also have to be dropped, regardless number of video compression standards have been conceived whether they are perfectly received or not, which implies for the sake of reducing the number of bits required, while that both the power and the bandwidth assigned to the less retaining the fidelity of the visual information conveyed by important layers was wasted. a video clip. The achievable compression ratio has been The second question that arises, is “can we exploit the substantially improved over the past years both as a benefit valuable resources allocated to the less important layers for of advances in video signal processing and microelectronics. recovering the more important layer?” Indeed, the answer This was achieved at the cost of more complex algorithms, but is yes. Hence in this tutorial we design so-called inter- fortunately the spectacular developments in low-power chip layer scheme. In this scheme, the information of the more design were still capable of mitigating the power-dissipation important layer is implanted into the less important layer imposed. We will briefly review the history of video compres- at the transmitter. Additionally, the information of the more sion standards in Section II-A of this treatise. important layer incorporated into the less important layers may In Section II-A, we will see that these existing or emerg- be extracted and utilized for improving the error-resilience of ing single-layer video compression standards are capable of the more important layer. A pair of different solutions will achieving a high compression ratio in numerous applications be detailed, namely the packet level layer-aware (LA) forward scenarios. The first question that arises, is “why do we need error correction (FEC) solution of Section IV and the bit- layered video coding?” Typically, a single stream is generated level inter-layer (IL) FEC solution of Section V. Both their by the classic single-layer video encoders, which as a whole differences and their relationship will be discussed in Section will be forwarded to the distant receiver. However, this single III. layer video is unable to meet the complex multi-platform Given that the information of the important layer is im- requirements of the multimedia era, where for example various planted into the less important layers, the third question receiver terminals, such as tablet computers, smart phones that arises, “whether there is a way to measure how much extra information is gained from the implanted information?” The financial support of the EU’s Concerto project, of the EPSRC under Fortunately, the question may be answered with the aid of the the auspices of the India-UK Advanced Technology Centre (IU-ATC) and that of the ERC’s Advanced Fellow Grant is gratefully acknowledged. powerful tool of extrinsic information transfer charts (EXIT). 1The philosophy of LA-FEC and IL-FEC was used in [1], [2] for referring This tool will be employed in Section VI for analyzing the to packet-level and bit-level techniques, respectively. Consequently, we will benefits of bit-level IL-FEC arrangements, followed by char- employ LA- /IL-FEC for referring to packet- /bit-level schemes, respectively. The definitions of packet-level and bit-level notations will be given in Section acterizing its performance in the context of both partitioned III. and scalable H.264/AVC video coding in Sections VII and 2
VIII, respectively. Our conclusions are provided in Section IX, where the pertinent design guidelines of layered video coding schemes will be summarized.
II.INTRODUCTION -FROM UNCOMPRESSED VIDEOTO SCALABLE VIDEO COMPRESSION
Below, we firstly review the history of video compression (a) Two consecutive frames standards followed by portraying the state-of-the-art in layered video compression techniques. Finally, we answer the question as to “why we need layered video coding?”.
A. Video Compression Uncompressed video clips consist of a sequence of frames captured from a real-world scene, which exhibit high intra- and inter-frame correlation [3], [4] among the pixels. A pair (b) Transform domain, DCT of uncompressed consecutive frames of the Football sequence are exemplified in Fig. 1a. Observe from Fig. 1a that there is substantial intra-frame correlation. Furthermore, the similarity between the two frames seen in Fig. 1a indicates the inter- frame correlation. In Fig. 1b, the transform domain repre- sentation of Fig. 1a is displayed after applying the discrete cosine transform (DCT), while the frame difference of Fig. 1a is displayed in Fig. 1c. Intuitively, the correlation residing in an uncompressed video sequence should be removed in (c) Frame difference order to represent the original video with the aid of less bits, yet without any substantial reduction of the perceived visual Figure 1: Two consecutive frames of the Football sequence. quality. Again, video compression will reduce the storage required in a hard-drive for example, or the transmission bandwidth This reduced-variance difference signal of Fig. 2a, namely and the transmission power required for distributing the video. e = f f − , is often referred to as Motion Compen- n n − n 1 A simple frame-differencing based video codec architecture sated Error Residual (MCER) since the associated frame- [5] is displayed in Fig. 2 for illustrating the basic video differencing effectively attempts to compensate for the motion compression framework. Furthermore, a number of advanced of the objects between consecutive video frames, yielding a video compression standards [5] have been designed during reduced-variance MCER. Thus, en requires a reduced coding the past decades for the sake of achieving a high compression rate, that is, a lower number of bits than fn in order to repre- ratio, such as the H.120 [6], H.261 [7], H.263 [8], H.264/AVC sent it with a certain distortion. The MCER en = fn fn−1 [9], MPEG-2 [10] and MPEG-4 [5] schemes. The timeline of − can then be encoded as e¯n, at a given distortion using a the video coding standards is shown in Fig. 3. The first video variety of techniques [5]. The quantized or encoded MCER recorder was invented as early as in the 1950s. signal e¯n is then transmitted over the communications channel. 1) A Simple Frame-Differencing Video Codec: Assuming In order to reproduce the original image fn = en + fn−1 that the Football video sequence is encoded by the encoder of exactly, knowledge of en would be necessary, but only its Fig. 2a, the consecutive image frames fn and fn−1 typically do quantized version, e¯n is available at the decoder of Fig. 2b, not exhibit dramatic scene changes as exemplified in Fig. 1a. which is contaminated by the quantization distortion intro- Hence, the consecutive frames are similar, a property that we duced by the MCER encoder, inflicting the reconstruction error refer to as being correlated. This implies that the current frame ∆e = e eˆ . Since the previous undistorted image frame n n − n fn can be approximated or predicted by the previous frame, fn−1 is not available at the decoder, image reconstruction has which we express as f fˆ = f − , where fˆ denotes n ≈ n n 1 n to take place using their available approximate values, namely, the nth predicted frame. When the previous frame fn−1 is e¯n and f˜n−1, giving the reconstructed image frame as follows: subtracted from the current one, namely fn, this prediction typically results in a “line-drawing-like” difference frame, as f˜ =e ˜ + f˜ − , (1) shown in Fig. 1c. Most areas of this difference frame are n n n 1 “flat”, having values close to zero, and the variance or second where in Fig. 2a we found that the locally decoded MCER moment of it is significantly lower than that of the original residual e˜n is an equal-valued, noiseless equivalent represen- frame. We have removed some of the predictable components tation of e¯n. The above operations are portrayed in Fig. 2a, of the video frame. where the current video frame fn is predicted by fˆn = f˜n−1, 3
(a) Encoder
(b) Decoder Figure 2: Basic video codec schematic using frame-differencing [5], where the corresponding video encoder and video decoder are displayed in (a) and (b) respectively. which is an estimate based on the previous reconstructed decoding operation the distant decoder would have to use the frame f˜n−1. Observe that (ˆ) indicates the predicted value and reconstructed version of the previous frame in its attempt to ( ) the reconstructed value, which is contaminated by some reconstruct the current frame. ∼ coding distortion. Note furthermore that the encoder in Fig. 2a contains a so-called local decoder, which is identical to 2) H.120 (1984-1988): The International Telecommunica- the remote decoder of Fig. 2b. This measure ensures that the tion Union (ITU) standardized the recommendation H.120 [6], which is the first digital video coding standard. The techniques decoder of Fig. 2b uses the same reconstructed frame f˜n−1 in order to reconstruct the image, as the one used by the employed in H.120 include differential pulse code modulation encoder of Fig. 2a to generate the MCER. Note, however, (DPCM), scalar quantization of pixels and variable-length coding (VLC) of the transform domain coefficients. Although that in case of transmission errors in e˜n, the local and the remote reconstructed frames become different, which we often H.120 video failed to achieve adequate quality for practical refer to as the encoder and decoder being “misaligned”. This video compression, it provided important knowledge for its phenomenon leads to transmission error propagation through successors. the reconstructed frame buffer, unless counter-measures are 3) H.261 (1988-1993): The ITU H.261 standard [7] pio- employed. To be more explicit, instead of using the previous neered the era of practical digital video compression tech- original frame, the so-called locally decoded frame is used in niques during the 1990s, which was designed for transmitting the motion compensation, where the phrase “locally decoded” 352 288-pixel Common Intermediate Format [7] (CIF) and × implies decoding it at the encoder (i.e., where it was encoded). for 176 144-pixel Quarter Common Intermediate Format × This local decoding yields an exact replica of the video [7] (QCIF) video clips over Integrated Services Digital Net- frame at the distant decoder’s output. This local decoding works (ISDN). This standard employed a hybrid video coding operation is necessary, because the previous original frame is scheme, which formed the basis of all state-of-the-art video not available at the distant decoder, and so without the local coding standards. 4
AVS AVS Workgroup
H.120 H.263
ITU-T H.261 H.262 MPEG-2 H.265 H.264/MPEG-4 part 10/AVC
HEVC MPEG-4 SP/ASP ISO/ICE
MPEG-1
JPEG JPEG
Microsoft DV VC-1 IEC
RealNetworks RealVideo
1984 1990 1995 2000 2005 2010 2015 Year Figure 3: Development timeline of the video compression standards. The timing box indicates the major development period of the corresponding standard.
4) Joint Photographic Experts Group (1992-1998+): or (720 480)-pixel resolutions, and for high-definition (HD) × Motion-JPEG employs the image compression standard of the video with a pixel resolution of 1920 1080. MPEG-2 is × Joint Photographic Experts Group [11] (JPEG), which was widely used as the format of digital TeleVision (TV) signals, standardized in 1992 for digital still images. Based on JPEG, which are broadcast by terrestrial, cable and direct satellite Motion-JPEG treats a video clip as a sequence of independent TV systems. images, which are encoded by JPEG separately. Since the temporal correlation is not removed, Motion-JPEG results in a 7) DV (1994-1995+): The Digital Video (DV) coding higher bitrate at a lower computational complexity than that of specification [14] was standardized by the International Elec- its motion-compensated counterparts. Although Motion-JPEG trotechnical Commission (IEC) in 1994, mainly targeting may be used for digital cameras and video processing systems, camera recorders, which encodes a video clip on a frame- it has not been standardized by any organization. by-frame basis, i.e. by dispensing with motion compensation. The DV efficiency is comparable to that of the Intra-frame 5) MPEG-1 (1993-1998+): Then the Moving Picture Ex- coding mode of MPEG-2/H.262 and it is better than that of perts Group (MPEG) proposed the MPEG-1 standard in 1992 Motion-JPEG. [12] for CIF- or 352 240-pixel videos. MPEG-1 may be × used by almost all Personal Computers (PCs), Video Compact 8) H.263 (1995-2005): The H.263 specification was stan- Disc (VCD) players and Digital Versatile Disc (DVD) players. dardized in 1995 by the ITU Telecommunication Standardiza- However, MPEG-1 only supports progressively scanned im- tion Sector [8], [15], [16] (ITU-T), which targeted video con- ages, which cannot be used for the interlaced video frames of ferencing at low bitrates for mobile wireless communications the National Television System Committee’s (NTSC) standard and it is superior to all prior standards in terms of its com- or for the so-called Phase Alternating Line (PAL) video pression efficiency. The H.263v2 standard, namely H.263+, formats [13]. The lack of this compatibility correspondingly was completed in 1998, which is the informal acronym for the prompted the development of the MPEG-2/H.262 standard second edition of the ITU-T H.263 standard. In this edition, [10]. the capabilities of the H.263 scheme were enhanced by adding 6) MPEG-2/H.262 (1994-1998+): In 1993, the Interna- several annexes for detailing, how to achieve a substantially tional Organization for Standardization (ISO) and ITU jointly improved encoding efficiency. Later, the definition of H.263v3, developed the MPEG-2/H.262 [10] standard for (720 576)- also known as H.263++, added three further annexes in 2000. × 5
9) RealVideo (1997-2008+): RealVideo [17], [18] is a Definition (UHD) videos at a resolution of 8192 4320. The × successful proprietary video compression format developed by emerging H.265 scheme will continue to support scalability the RealNetworks company, which was first released in 1997. by the scalable HEVC (SHVC) profile, which enables the RealVideo is supported by numerous computing platforms, transmitter to meet multiple users’ preferences. including Windows, Mac, Linux, Solaris and several mobile phones. B. Layered Video Coding Standards 10) MPEG-4 SP/ASP (1998-2004+): MPEG-4 standard- ization [5] was initiated in 1995 and has been continually Layered video compression [9], [20], [26], [27] encodes enhanced by a number of new profiles, including wavelet- a video sequence into multiple layers, which enables us to based still image coding, scalable coding, 3D images etc. The progressively refine the reconstructed video quality at the MPEG-4 video standard strikes adjustable compression quality receiver. Generally, the most important layer is referred to as versus bitrate trade-off, where the so-called Simple Profile the base layer (BL) and the less important layers are termed as (SP) is very similar to H.263, while the Advanced Simple enhancement layers (ELs), which rely on the BL. Furthermore, Profile (ASP) further increased the compression efficiency an EL may be further relied upon by less important ELs. attained. Again, when the BL or an EL is lost or corrupted during 11) WMV9/VC-1/SMPTE 421M (2005-2006+): The soci- its transmission, the dependent layers cannot be utilized by ety of Motion Picture and Television Engineers (SMPTE) the decoder and must be dropped. A layered video scheme 421M [19] developed a scheme, which is, also known as is displayed in Fig. 4, where the video sequence captured Windows Media Video version 9 (WMV9) or VC-1. This was from the scene is encoded into four layers by the layered initially developed as a proprietary video format by Microsoft, video encoder, namely L0 L3, where layer Li (0 < i 3) ∼ ≤ but it was then released as a SMPTE video codec standard depends on layer Li−1 for decoding, while layer Li improves in 2006. VC-1 was designed as an alternative to the latest the video quality of layer Li−1. In other words, layer L0 is ITU-T and MPEG video codec H.264/MPEG-4 AVC. It was the BL and layers L1 L3 are ELs depending on the BL. ∼ shown that the VC-1 compression efficiency is lower than Furthermore, as shown in Fig. 4, the ELs L2 and L3 rely that of H.264/AVC, while imposing a reduced computational on the EL L1. In other words, if layer L1 is corrupted, then complexity. layers L2 and L3 are dropped by the decoder. A number of 12) H.264/MPEG-4 part 10/AVC (2003-2014+): The layered video coding techniques have been investigated and/or H.264/AVC standard [9], also known as MPEG-4 part 10 standardized, which will be introduced below. or Advanced Video Coding (AVC), was completed in May 1) Partition Mode of H.264: A number of layered video 2003, albeit its research continued by adding more extensions. coding schemes [28] have been developed and some of them H.264/MPEG-4 part 10/AVC was jointly developed by the are adopted by recent video coding standards, for example ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC the scalable video coding (SVC) [20] and data partitioned JTC1 MPEG, which became one of the most commonly used mode (DP) [9], [29], [30]. In the data partitioning mode, the formats of video recording, compression and distribution. The data streams representing different semantic importance are design goal was to halve the bitrate required by the previous categorized into a maximum of three bitstreams/partitions [31] video standards, while retaining the same video quality. A per video slice, namely type A, type B and type C partition. major recent extension of H.264/AVC was the Scalable Video The header information, such as macroblock (MB) types, Coding [20] (SVC) scheme completed in 2007, which is quantization parameters and motion vectors are carried by the specified in Annex G, allowing the construction of bitstreams type A partition. The type B partition is also referred to as the that contain sub-bitstreams of the H.264/AVC standard. An- intra-partition, which contains intra-frame-coded information, other major extension of H.264/AVC was the Multiview Video including the coded block patterns (CBPs) and intra-coded Coding (MVC) scheme completed in 2009 [21], which is coefficients. The type B partition is capable of prohibiting specified in Annex H, enabling the construction of bitstreams error propagation in the scenario, when the reference frame that represent more than one view of a video scene. The MVC of the current frame is corrupted. In contrast to the type extension contains two profiles, which are the Multiview High B partition, the type C partition is the inter-partition, which Profile [9] representing an arbitrary number of views and the carries the inter-CBPs and the inter-frame coded coefficients. Stereo High Profile [9] for stereoscopic video. The type C partition has to rely on the reference frame for 13) Audio and Video Coding Standard (2003-2005): The reconstructing the current picture. Hence, if the reference Advanced Video Standard (AVS) [22] was initiated by the picture is corrupted, errors may be propagated to the current government of China for replacing MPEG-2, which has been frame. Amongst these three partitions, the type A partition standardized in 2005. may be deemed to be the most important one, which may be 14) HEVC/MPEG-H Part 2/H.265 (2012-2014+): High treated as the BL. Correspondingly, the type B and C partitions Efficiency Video Coding (HEVC) [23] is undergoing devel- may be interpreted as a pair of enhancement layers, since they opment as a successor to H.264/AVC, which is being jointly are dependent on the type A partition for decoding. Albeit developed by the ISO/IEC MPEG and ITU-T VCEG as the information in partition B and C cannot be used in the ISO/IEC 23008-2 MPEG-H Part 2 and ITU-T H.265. HEVC absence of partition A, partition B and partition C can be used aims for halving the bitrate of the H.264/AVC standard at independently of each other, given the availability of partition the same video quality and it aims for supporting Ultra High A. The dependency of the layers in the partitioned mode of 6
Layered Layered L 128kbps QCIF@ Video 0 Video Encoder Decoder 7.5 Hz
Layered L1 384 kbps CIF@ scene Video 15 Hz Decoder
MUX Layered L2 896 kbps CIF@ Video 30 Hz Decoder
Layered L 1920 kbps 3 Video TV@ Decoder 60 Hz
Figure 4: Architecture of a layered video scheme [24], [25], where the video quality is refined gradually.
C C C C C
A A A A A
B B B B B
I B P B P Figure 5: Dependencies associated with the GOP=2 partition mode of the H.264/AVC video stream, where A, B and C represent the partition A, B and C, respectively. “B A” indicates B depends on A, while “B A” indicates B is predicted → 99K from A. the H.264/AVC video codec is exemplified in Fig. 52, where scalability feature of the bitstream. Specifically, DID, TID the group of pictures (GOP) parameter is GOP=2. and QID represent Coarse Grain Scalability (CGS), Temporal 2) Scalable Video Coding: The subject of scalable video Scalability (TS) and Medium Grain Scalability (MGS) [20], coding [20], [32] has been an active research field for over respectively. The CGS feature facilitates the coarse adaption two decades. This terminology is also used in the Annex of video properties, such as the spatial resolution of the video, G extension of the H.264/AVC video compression standard reconfiguring from QCIF to CIF, where the video can be [9]. Indeed, SVC is capable of generating several bitstreams encoded into a set of coarse enhanced sub-streams referred that may be decoded at a similar quality and compression to as dependency-layers. The DID parameter represents the ratio to that of the existing H.264/AVC codec. When for dependency-layer the current NALU belongs to. The decoding example low-cost, low-quality streaming is required by the of a NALU with DID > 0 depends on the NALUs associated users, some of the ELs may be removed from the compressed with (DID 1), but with the same TID and QID values. Based − video stream, which facilitates flexible bitrate-control based on on this dependency rule, the video quality may be readily the specific preferences of the users. A H.264/AVC scalable reduced by removing the NALUs with a DID larger than a video stream contains a sequence of network abstraction layer specific DID parameter. Similar dependency rules exist for units (NALUs) [9], which consist of a header and a payload. the temporal scalability and MGS features. The dependency The header contains the information about the type of NALU of the layers in the SVC stream is exemplified in Fig. 6. and its function in the video reconstruction process, while the payload carries the compressed signals of a video frame. 3) Multiview Video Coding: Recently, the Joint Video Team The parameters dependency_id (DID), temporal_id (TID) and (JVT) proposed MVC as an amendment to the H.264/AVC quality_id (QID) contained in the NALU header describe the standard [9]. Apart from the classic techniques employed in single-view coding, multi-view video coding invokes the so- 2Here the phrase x “depends on” y implies that the layer x will be discarded called inter-view correction technique by jointly processing by the video decoder, if layer y is lost. The phrase x “is predicted from” y the different views for the sake of reducing the bitrate. Hence, means that the layer x may still be usefully utilized by the video decoder, if layer y is lost. Hence the relationship “depends on” is stronger than “is the first encoded view may be termed as the BL, while the predicted from”. remaining views may be treated as the ELs. The dependency 7
L1 L2 L1 L2 L1 L2 L1 L2 L1
L0 L0 L0 L0 L0
I BBBPBBBP
Figure 6: Dependencies with an GOP=4 SVC stream, where L0, L1 and L2 represent the BL, spatial EL and temporal EL, respectively. “ ” indicates “depends on”, while “ ” indicates “is predicted from”. → 99K
View 1 P B B B P B B B P
View 0 I B B B P B B B P
Figure 7: Dependencies with an GOP=4 MVC encoded stereoscopic video stream. “ ” indicates “depends on”, while “ ” → 99K indicates “is predicted from”. of the layers in the MVC stream is exemplified in Fig. 7. devices.
4) Others: Set-partitioning in hierarchical trees (SPIHT) Given only the layer L0 having a bitrate of 128 kbits per [33], [34] was originally proposed as an image compression al- second (kbps), the corresponding layered video decoder of Fig. gorithm, which encodes the most important wavelet transform 4 reconstructs the video with a resolution of QCIF at 7.5 frame coefficients first and allows an increasingly refined reproduc- per second (FPS), while a CIF based video sequence at 30 tion of the original image. A multiview profile (MVP) [26] FPS can be reconstructed with the aid of layers L0, L1 and was developed by the moving picture expert group (MPEG)’s L2, which require bitrates of 128 kbps, 256 kbps and 512 [5] video coding standard, where the left view and right view kbps, respectively. If the TV screen of Fig. 4 is utilized by were encoded into a BL and an EL, respectively. Again, the the user, all four layers L L may also be streamed for 0 ∼ 3 emerging H.265 scheme will continue to include a scalability achieving the highest video quality. In practice, the different profile. video streaming scenarios of Fig. 4 require different bandwidth and hence achieve different visual quality. The users may rely C. Why Layered Video Coding on different video screens, such as those of mobile phones, tablet PCs, PC and TV screen, as seen in Fig. 4 for example. The MPEG-2 based digital video systems were designed for broadcasting services over satellite and terrestrial channels, as 2) Unreliable Networks: Internet-based communication re- well as for storage. Generally, these applications require fixed lies on the Transmission Control Protocol (TCP)-Internet quality video representations [20]. However, these traditional Protocol (IP), which aims for providing best-effort services for video representations fail to meet the requirements of the the application layer. However, TCP-IP is unable to guarantee various transmission scenarios in the era of the Internet and of the reliability of Internet links, since unpredictable network mobile networks [35], [36], which have to support heteroge- congestions, breakdown of routers, jitter, cache overflow etc. neous devices, operating under unreliable network conditions, [38], [39] may occur. In comparison to wired links, wireless bandwidth fluctuations etc. Layered video coding, such as channels are more hostile due to both small-scale and large- SVC, emerged in order to support operation in these new scale fading [40]. In the scenario of transmitting layered transmission scenarios. video over unreliable networks [41], the BL and ELs may be 1) Heterogeneous Devices: Nowadays, contemporary video assigned different protection strengths [2], channels [42], or transmission and storage scenarios require diverse receiving different packet-dropping priority [43], [44] etc, for the sake devices with more flexible communication qualities [37], as of combating the network-induced impairments. For example, exemplified by the mobile phones, tablet PCs, laptops and TV retransmissions may be used for the BL to limit the burden sets of Fig. 4. Satisfying these diverse requirements is facili- imposed on congested networks [43]. tated by layered video compression, for video-streaming, tele- 3) Bandwidth Fluctuations: In wired networks, bandwidth conferencing, surveillance, broadcast and storage for diverse fluctuations are mainly caused by the effects of network con- 8 gestion. In wireless links, the bandwidth fluctuates for several techniques have proposed. A novel UEP modulation concept reasons [35]. Firstly, the bandwidth may vary dramatically was investigated in [52] for the specific scenarios, where when a mobile terminal switches between different networks, channel coding cannot be employed. Hence UEP was achieved e.g., from a wireless local area network (LAN) to a wireless by allocating different transmission power to individual bits wide area network (WAN). Secondly, the throughput of a according to their bit error sensitivity albeit in practice this wireless channel may be reduced due to interference imposed remains a challenge. Additionally, the UEP capabilities of both by reflecting physical objects and by other wireless convolutional codes (CC) were studied in [53]. Furthermore, as devices. Thirdly, the bandwidth of a wireless link may fluctuate a benefit of the outstanding performance of low-density parity- due to the changing distance between the base station and the check (LDPC) codes, a number of UEP design methodologies mobile terminal. [54]–[57] have been investigated using LDPC codes. The so- In traditional transmission of single layer video, the receiver called UEP density evolution (UDE) technique of [54], [57] may experience long delays due to the limited bandwidth, was proposed for transmission of video streams over binary which is unacceptable for real-time video streaming. Hence erasure channels (BEC). The authors of [55] proposed a new layered video coding may be utilized for the sake of combating family of UEP codes, based on LDPC component codes, the bandwidth fluctuations, which allows us to dynamically where the component codes are decoded iteratively in multiple adapt the bandwidth requirements [45]. We consider the sce- stages, while the order of decoding and the choice of the nario of Fig. 4 as an example. When the bandwidth is low, LDPC component codes jointly determine the level of error we may transmit L0 and L1 of the bitstream, resulting in a protection. A practical UEP scheme using LDPC codes was bandwidth requirement of 384 kbps, while the receiver can still proposed in [56], where the high-significance bits were more reconstruct a CIF-sized video at 15 FPS. When the bandwidth strongly protected than low-significance bits. becomes higher, L0~L3 of the bitstream may be transmitted, As mentioned in Section II-B, when the BL is corrupted or resulting in a bandwidth requirement of 1920 kbps. As a result, lost due to channel impairments, the ELs must also be dropped the receiver can reconstruct a HD video at 60 FPS. by the video decoder, even if they are perfectly received. Moreover, the less important layers have lower priority and III.LAYERED VIDEO COMMUNICATIONS hence may be dropped in the transmission scenario of network congestion or buffer overflow [58]. Hence, it is intuitive to From the perspective of the source codec, there are two perform UEP for both the BL and the ELs for the sake of ways of correcting or concealing the transmission errors. improving the system’s performance. However, most of the Popular class of solutions relies on the so-called error- above UEP studies considered artificially generated signals concealment/error-resilient techniques. Numerous error re- of unequal significance, rather than realistic video signals. silient techniques have been investigated [46]–[49], such as Naturally, the significance differentiation of practical video intra-MB refreshing, motion vector recovery, frame-copy, dis- signals is more challenging. In compressed video streams, as in tributed source coding etc. Generally, these techniques aim for layered video coding, different bits may have different signif- reducing the channel-effects imposed on the received video icance. Therefore, UEP became the most important technique is, thereby improving the reconstructed video quality. For in the context of layered video communication invoked for example, the authors of [47] aimed for reducing the error achieving an improved reconstructed video quality. Generally, propagation effects using the so-called “peg” frames instead of UEP may be performed by carefully and judiciously allocat- the traditional intra-coded frames. Indeed, it is intuitively beni- ing resources such as the coding rate, the transmit power, ficial to perform error concealment, since the source codecs modulation mode, etc. Nonetheless, a number of contributions have the most intricate knowledge of the source stream. How- have been made in the field of layered video communications. ever, the quality of video cannot be guaranteed in a vulnerable Generally, these scalable video streaming techniques [1], [2], network environment, even when the error effects may be [30], [41], [42], [59]–[103] may be classified into the following reduced significantly. The other class of correcting/concealing four categories: errors includes the techniques that are performed after/before the source encoding/decoding stage, which include unequal • Transceivers based on UEP schemes: as in [42], [68], error protection [50], inter-layer FEC [2], priority-encoding [72], [86], [90], [95]–[97], allocate an unequal amount based transmission [51] etc. The benefit of the these methods of transmit power to the different layers. These schemes is that a high reconstructed video quality may be achieved are employed in the physical layer of the Open Systems in a vulnerable error-prone network environment for example Interconnection (OSI) model. by reducing the FEC coding rate. However, the worst-case • FEC based UEP Schemes network conditions have to be considered, when allocating – Packet-level Schemes: as in [1], [59], [60], [62], [63], error protection coding for achieving near-error-free real-time [67], [69], [71], [74], [78], [85], [87]–[89], [92], [93], video, which seems wasteful of resources. In this treatise, we [98], [99], [101], mitigate the packet-loss events, as mainly consider the latter class of techniques. exemplified by the packets lost in Internet-routers Unequal error protection (UEP)was firstly proposed by [39]. These schemes are typically hard-decoded at Masnick and Wolf in [50], which allocates stronger FEC to the higher layers. the more important data, while dedicating weaker FEC to the – Bit-level Schemes: as in [2], [30], [65], [66], [77], less important video parameters. Since then numerous UEP [79], [100], [102], are devoted to eliminating bit- 9
errors in wireless scenarios [35], [104]. These were investigated. Specifically, a scalable resource-allocation schemes may be soft- or hard-decoded at the lower framework was proposed for achieving different quality of ser- layers of the OSI stack. vice objectives for different scalable video layers. A beneficial • Cross-layer operation aided schemes: as in [41], [61], [70], scheme was designed [96] for guaranteeing that each user was [73], [75], [76], [81], [91], [94], [103], are typically invoked entitled to maintain a specific MAXMIN fairness for ensuring for optimizing the scalable video streaming systems by that their BL video packets are indeed received. considering multiple signal processing stages, such as Generally speaking, these transceiver-based UEP techniques the source-compression, FEC-encoding, modulation etc. are mainly employed in the physical layer, as shown in Fig. 8, These schemes tend to collaborate across multiple layers since they rely on adapting controlling the modulation mode or of the OSI stack. the channel selection for the sake of improving the attainable Below, we will introduce these four categories in the Sections performance. ranging from III-A to III-D. B. Packet-Level FEC A. Transceivers Based UEP The major contributions on packet-level FEC based layered The major contributions on UEP aided layered video video streaming techniques are summarized in Table II. In streaming techniques are summarized in Table I. In [68], [72], [106], the source of unequal importance was firstly trans- [80], [105], UEP schemes conceived for video transmission formed to and encoded with the aid of multiple descriptions using hierarchical quadrature amplitude modulation (HQAM) using Reed-Solomon codes. Then the FEC coding rate assign- were investigated, which considered the unequal importance of ment of different layers was solved by the classic Lagrange the different layers in SVC, as well as the unequal importance Multiplier method for the sake of robust transmission. In of both the intra-coded frames (I-frame) and of the predicted [59], the authors proposed a sophisticated framework for frames (P-frame). Specifically, the video bits of different optimizing the rate-distortion relationship [109] in the context importance were mapped to the different-integrity bits of of video communications over error-prone packet-switched the modulation constellation points of HQAM. An adaptive networks, which was based on the principle of layered coding channel selection (ACS) based layered video transmission relying on transport prioritization. Su et al. in [62] aimed for scheme was proposed for transmission over multiple input finding an optimal FEC assignment scheme for scalable video multiple output (MIMO) schemes in [42], which periodically transmission over bursty channels invoking packet-loss rate switched each bit stream among multiple antennas. Specifi- feedback. An iterative algorithm was derived for calculating a cally, the ordering of each sub-channel’s signal-to-noise ratio new packet-loss probability function conditioned on the past (SNR) was exploited as partial channel quality information loss rates. Diverse UEP algorithms were designed in [60], (CQI) at the receiver. The bitstream was carefully mapped by [63], [64], [110] for transmitting 3D SPIHT coded images exploiting the SNR-based CQI of the channels. Essentially, the and videos over networks imposing packet-loss events, where higher-priority layer’s bitstream has to be mapped to higher- Reed-Solomon (RS) codes were utilized as FEC codes. An SNR channel by the proposed algorithm. A HQAM based UEP scheme was conceived for object-based video commu- UEP scheme was proposed in [86] for H.264/AVC video nications in [67] for achieving the best attainable system transmission over frequency selective fading channels. For the performance under specific bitrate and delay constraints in an sake of preventing the more important data to be mapped onto error-prone network environment. In [69], UEP was allocated sub-carriers subjected to deep fading, an orthogonal frequency- to the different frames (I- or P-frame) of a GOP, and in division multiplexing (OFDM) sub-carrier classification strat- each frame, unequal protection was allocated to the progres- egy relying on a pair of SNR thresholds was proposed. Coop- sive bitstream of scalable video for the sake of providing erative multicasting of scalable video was investigated in [97]. a graceful degradation of the video quality, as the packet The so-called opportunistic cooperative multicast (OppCM) loss rate increased. A genetic algorithm (GA) was utilized and coded cooperative multicast (CodedCM) were proposed, to for efficiently identifying the UEP patterns, which was where opportunistic listening, conditional demodulation and inefficient with the aid of conventional methods owing to the multi-resolution modulation were utilized to enhancing the excessive search-space. A packetization scheme was proposed system’s performance. in [71], which efficiently combated packet loss events by A cooperative MIMO framework was proposed in [90] for combining UEP, retransmission and GOP-level interleaving. scalable video transmission, which relies on a sophisticated Intra-GOP rate allocation was invoked for minimizing the power control strategy invoked for controlling the on/off mode distortion of individual GOPs, while inter-GOP rate allocation of the relays and the specific power allocation among them. was proposed for reducing the video quality fluctuations by In [95], UEP was achieved by striking a tradeoff between the adaptively allocating bandwidth according to both the video achievable transmission integrity and data rates, which were signal characteristics and to the buffer-fullness. An adaptive controlled both by the FEC and the MIMO mode selection UEP scheme was proposed in [74] for robust and scalable for the sake of minimizing the average distortion. A scal- wireless video streaming. By jointly exploiting the temporal able resource allocation framework was proposed in [96] for inter-frame dependency and the quality dependency between streaming scalable videos over multiuser MIMO-OFDM net- the scalable layers, the proposed scheme adaptively assigned works, where the attainable multidimensional diversity gains unequal-protection FEC codes to the packets of each layer and 10
Table I: Major contributions on transceivers based UEP aided layered video streaming.
Year Author(s) Contribution proposed HQAM based UEP for H.264/AVC coded video, which considered the different importance Chang et al. [68] 2006 of the I-frames and P-frames. Ghandi and Ghan- benchmarked SVC against the partition mode of H.264/AVC video codec, where UEP is achieved by bari [72] employing HQAM. proposed adaptive channel selection (ACS) based layered video streaming over MIMO system, which Song and Chen 2007 periodically switched each bit stream among multiple antennas and allocated the higher-priority layer’s [42] bitstream to higher-SNR channels. proposed an OFDM sub-carrier classification strategy for a HQAM based UEP scheme, thereby avoiding Li et al. [86] 2010 the more important data to be mapped onto sub-carriers in deep fading. proposed a cooperative MIMO framework for scalable video communications, which employs a Xiao et al. [90] sophisticated power control strategy for controlling the on/off mode of relays and the specific power allocation among them. considered the tradeoff between the achievable transmission integrity and data rates, which were Kim et al. [95] controlled by the FEC and the MIMO mode selection to minimize the average distortion. 2013 proposed a scalable resource allocation framework for streaming scalable videos over multiuser MIMO- Li et al. [96] OFDM networks, where the achievable multidimensional diversity gains were investigated. Wang and Liao proposed opportunistic cooperative multicast (OppCM) and coded cooperative multicast (CodedCM) [97] for scalable video communications. adjusted the transmission rate by dynamically selecting the all received source symbols are decoded. The priority encoding number of layers to transmit according to both the available transmission (PET) [51] protection philosophy was introduced network bandwidth and to the packet loss rate. A novel in [98] for streaming scalable video streams over erasure chan- UEP method using RS codes was proposed in [78] for SVC nels, where a small number of retransmissions were allowed. video transmission over networks inflicting packet-loss events. In principle, the choice of the most appropriate protection Firstly, the layer-weighted expected zone of error propagation depended not only on the importance of each stream element, (LW-EZEP) was defined as an efficient performance metric for but also on the expected channel behavior. By formulating a quantifying the error propagation effects imposed by packet collection of hypotheses concerning a specific stream’s own loss events. Then, the corresponding RS coding rates were behavior in future transmissions, the limited-retransmission assigned based on LW-EZEP for minimizing the expected aided PET (LR-PET) regime was capable of effectively con- video distortion. A two-dimensional (2D) layered multiple de- structing channel codes spanning multiple transmission slots scription coding [107] arrangement was proposed for scalable and thus offered a better protection than the original PET of video transmission over unreliable networks, which allocates [51]. Based on packet-level transmission distortion modeling, multiple description based sub-bitstreams of a scalable 2D in [99] the significance of each video packet was estimated bitstream to two different network paths exhibiting unequal in terms of the reconstructed video quality, which defined the packet loss rates. Furthermore, the video distortion was mini- priority level of each packet. UEP was then allocated to the mized, conditioned both on the given total rate budget and on video packets according to both priority levels as well as to the packet loss probabilities. the prevalent channel conditions. The proposed RD-based UEP In recent years, cross-layer operation aided scalable video resource allocation problem of [99] was formulated as a con- streaming designed for error-prone channels was investigated strained nonlinear optimization problem. Then an algorithm in [87], where the RS coded UEP was optimized for robust based on Particle Swarm Optimization (PSO) was developed video delivery. The expected video quality was evaluated for solving the optimal resource allocation problem. In [101], based on both the available bandwidth and the packet loss UEP was conceived for packets containing scene-transition ratio (PLR) encountered, which was then further improved frames, which have to be better protected for the sake of by employing content-aware bitrate allocation. The authors of attaining an improved video quality. A sophisticated FEC code [88] studied an UEP scheme using Luby Transform (LT) codes allocation strategy was adopted based on the minimization of [108], [111] for recovering the video packets lost owing to the end-to-end distortion, assuming that error concealment was network congestions. The multistream UEP (M-UEP) concept adopted at the decoder. Two different FEC allocation strategies was proposed in [92], which allocated separate streams to were proposed [101] for the Block of Packets (BOP) structure, separate sets of packets for maintaining their independence. namely an iterative modified hill climbing approach and a The concept of permuted systematic RS codes defined in [92] reduced-complexity heuristic approach. was employed for beneficially dispensing the message symbols In the above UEP schemes, variable-rate FEC codes were across the packets, which interleaved the source symbols with assigned to the different-sensitivity layers for improving the re- the redundancy symbols in order to form the codewords. M- constructed video quality. However, when the BL is corrupted UEP improved the efficiency of classic UEP by ensuring that or lost, the ELs have to be discarded by the video decoder, 11
Table II: Major contributions on packet-level FEC based layered video streaming.
Year Author(s) Contribution transformed and encoded scalable sources into multiple descriptions using the family of Reed-Solomon 1999 Puri et al. [106] codes for robust transmission. proposed a framework for optimizing the rate-distortion relationship of video communications over 2001 Gallant et al. [59] error-prone packet-switched networks based on the principle of layered coding relying on with transport prioritization. Thornton et al. designed UEP algorithms for transmitting 3D-SPIHT coded videos over networks subject to packet-loss 2002 [60] events, where packet-level FEC codes were employed. found an optimal FEC assignment scheme for scalable video transmission over channels inflicting bursty Su et al. [62] 2003 errors and relying on loss rate feedback. proposed a bit-plane-wise UEP algorithm for 3D-SPIHT coded progressive bitstream transmission over Kim et al. [63] lossy networks. conceived an UEP scheme for object-based video communications in order to achieve the best attainable 2005 Wang et al. [67] system performance under specific bitrate- and delay-constraints in an error-prone network. Fang and Chau performed UEP of the different frames within a GOP and of the different layers of each frame. 2006 [69] combated packet loss events by combining UEP, retransmission and GOP-level interleaving. Intra-GOP Gan et al. [71] and inter-GOP rate allocation were also investigated. proposed an adaptive UEP scheme for robust scalable wireless video streaming, which adaptively 2007 Shi et al. [74] assigned different rate FEC codes to each layer and adjusted the transmission rate by dynamically selecting the number of layers to transmit. proposed a technique referred to as layer-weighted expected zone of error propagation (LW-EZEP) for 2008 Ha and Yim [78] quantifying the error propagation effects imposed by packet loss events. allocated multiple description sub-bitstreams of a scalable 2D bitstream to two different network paths Xiang et al. [107] exhibiting unequal packet loss rates and minimized the end-to-end distortion. 2009 Sejdinovic´ et al. proposed the so-called packet-level expanded window fountain (EWF) codes for UEP. [82] Vukobratovic´ et al. applied the EWF codes for scalable video multicast over networks inflicting packet loss events, where [85] ELs conveyed parity information protecting the more important BL. Maani and proposed cross-layer operation aided scalable video streaming, which estimated the expected video Katsaggelos [87] distortion according to the prevalent channel conditions. 2010 developed Luby Transform (LT) [108] coded UEP for combating the packet loss events imposed by the Ahmad et al. [88] network. Dumitrescu et al. proposed the multistream UEP (M-UEP) philosophy, which utilized permuted systematic Reed-Solomon [92] (RS) codes for enhancing the distribution of message symbols amongst the packets. designed hierarchical network coding (HNC), where the less important bits are used for protecting the Nguyen et al. [89] more important bits. Halloush and proposed the so-called Multi-Generation Mixing (MGM) technique, which allows us to recover packets 2011 Radha [93] using data encoded into other packets. proposed the layer-aware FEC (LA-FEC) philosophy using a Raptor codec for video transmission over Hellge et al. [1] the BEC. advocated a so-called priority encoding transmission (PET) [51] protection for streaming scalable video Xiong et al. [98] 2013 over erasure channels, where a small number of retransmissions were allowed. estimated the importance of each video packet according to their impact on the reconstructed video Zhang et al. [99] quality. UEP was then allocated to different video packets according to their priority levels and the dynamic channel conditions. identified the more important packets containing scene-transition frames and invoked UEP for the sake 2014 Midya et al. [101] of improved video quality. 12 regardless whether they are perfectly decoded or not, which over wireless channels, where a prompt and efficient fast rate implies that both the transmission power and the bandwidth allocation scheme was also investigated. However, only three assigned to the ELs is wasted. Hence it is beneficial improve protection classes were discussed in [79], which limits the the protection of the more important BL with the aid of attainable system performance. The authors of [100] show that the ELs. Hence, the authors of [85] applied the packet-level the side information (SI) values within different positions of expanded window fountain (EWF) codes of [82] for scalable the Wyner-Ziv (WZ) frames have different error probability. video multicast over networks inflicting packet loss events, Hence UEP of these non-uniformly distributed SI values was where the ELs conveyed parity information protecting the employed for the sake of reducing the required bitrate in the more important BL. By contrast, hierarchical network coding application of distributed video coding (DVC) [120], [121]. (HNC) was proposed in [89], where the encoded ELs carry Similar to the packet-level schemes of [1], [85], [89], [93], some of the information of the BL. Hence, when only a where the information of the ELs was utilized for protect- low number of coded packets is received, the most important ing the BL, the authors of [25] developed bit-level inter- data can be recovered with a high probability. As a further layer coded FEC (IL-FEC) arrangements for layered video advance, Multi-Generation Mixing (MGM) was developed telephony over wireless fading channels in [2], [115], [116] in [93] for combating the deleterious effects of packet-loss relying on soft-decoded RSC, turbo and self-concatenated events. When an insufficient number of packets is received convolutional codes, respectively, where the systematic bits for a specific chunk of packets, it is still possible to recover of the BL are implanted into the ELs at the transmitter. At the the chunk using the data embedded into other chunks. The receiver, the BL’s bits implanted into the ELs may be utilized packet-level layer-aware FEC (LA-FEC) philosophy using a for correcting the BL. The above-mentioned IL-FEC technique hard-decoded Raptor code was designed for scalable video of [2] was also combined with the UEP philosophy for the sake transmission over the BECs in [1], [112]. The Raptor encoder of further improving the attainable system performance. In [2], generated the parity bits right across the BL and the ELs at a number of coding rates were tested for the sake of proving the transmitter. As a benefit, the parity bits of the ELs may the benefits of the proposed IL technique, where the code rates be utilized for assisting in correcting the errors residing in the arrangements were determined empirically. However, in prac- BL at the receiver. Similar techniques were also investigated tical scenarios, different configurations of video codecs and in [54], [113], [114]. The common characteristic within the different video sequences may have different characteristics, contributions of [1], [85], [89], [93] is that they all facilitate which may require different channel coding rates for achieving the enhanced protection of the BL using the information of the best system performance. In [102], the authors proposed the ELs. In the rest of the paper, we refer to these schemes as a technique for finding the optimized coding rates for coded LA-FEC. bitstreams “on-the-fly” at the transmitter, which optimizes the Generally speaking, these packet-level FEC schemes are IL-FEC coded system performance. Specifically, they found typically employed in the higher layers of the OSI protocol the coding rates achieving the minimum video quality distor- stack, as shown in Fig. 8, since they are designed for combat- tion with the aid of the mutual information (MI) between the ing the packet loss events. log-likelihood ratios (LLRs) and the corresponding video bits. In this context, the soft-decoding metric of the FEC codec and C. Bit-Level FEC of the demodulator are characterized by lookup tables (LUTs), since these cannot be characterized theoretically. In a nutshell, the authors of [102] focused their effects on the optimization The major contributions on bit-level FEC based layered of bit-level IL-FEC encoded scalable video communications video streaming techniques are summarized in Table III. These over wireless fading channels. The UEP based low-density bit-level schemes [2], [30], [65], [66], [77], [79], [100], [102], parity-check codes (UEP-LDPC) of [54] may also be used for [115], [116] tended to employ physical layer FEC codes bit-level protection, which may rely on both soft-decoding and [117] and performed soft decoding [118] for wireless video hard-decoding. communications. These bit-level FEC schemes may be further divided into The authors of [65] minimized the mean distortion by non- soft decoded and hard-decoded categories. The soft-decoded uniformly distributing the redundancy imposed by the turbo bit-level FEC schemes are typically employed in the physical code between the successive video frames, where the H.263 layer, since they are capable of combating the effects of the video codec was employed. In [66], an UEP scheme using a wireless channel. By contrast, as shown in Fig. 8, the hard- turbo transceiver was optimized for wireless video telephony. decoded schemes may be invoked in the data link or network LDPC code based UEP was investigated in [57]. The UEP layer for mitigating the bit-erasure effects. performance of data-partitioned [9] H.264/AVC video stream- ing systems using recursive systematic convolutional (RSC) D. Cross Layer codes was evaluated in [30], while turbo coded modulation [119] based UEP was investigated in [77], where both the cutoff rates and the channel capacity of each of the UEP levels The major contributions on cross-layer operation aided was determined. The authors of [79] considered the unequal layered video streaming techniques are summarized in Table importance of both the video-frames in a GOP and the signif- IV. The authors of [61] proposed an adaptive cross-layer icance of the diverse MBs in a video frame for transmission protection strategy for enhancing the robustness of scalable 13
Table III: Major contributions on bit-level FEC based layered video streaming.
Year Author(s) Contribution Marx and Farah minimized the mean distortion by non-uniformly distributing the redundancy imposed by a turbo code 2004 [65] between the successive video frames. 2005 Ng et al. [66] optimized an UEP scheme using a turbo transceiver for wireless video telephony. Rahnavard et al. 2007 investigated an UEP scheme using partially regular LDPC codes. [57] Aydinlik and investigated UEP based turbo coded modulation, where both the channel capacity and the cutoff rates 2008 Salehi [77] of the UEP levels were determined. considered the unequal importance of the frames in a , as well as that of the macroblocks in a video Chang et al. [79] frame. Nasruminallah and evaluated the UEP performance of data-partitioned H.264/AVC video streaming systems using RSC 2011 Hanzo [30] codes. developed a bit-level inter-layer coded FEC (IL-FEC) scheme for layered wireless video steaming Huo et al. [2] relying on a soft-decoded FEC code, where the systematic bits of the BL are implanted into the ELs 2013 at the transmitter. Micallef et al. conceived UEP for side information (SI) values at different position of the Wyner-Ziv (WZ) frames for [100] the sake of reducing the bitrate in DVC. proposed a technique for finding the optimized coding rates for coded bitstreams “on-the-fly” at the 2014 Huo et al. [102] transmitter, which optimizes the IL-FEC coded system performance.
Table IV: Major contributions on cross-layer operation based layered video streaming.
Year Author(s) Contribution proposed a cross-layer protection for robust scalable video transmission by striking tradeoff amongst 2003 Schaar et al. [61] the attainable throughput, reliability and delay. judiciously allocated the resource between the source and channel encoders based on either the 2004 Zhang et al. [41] minimum-distortion or minimum-power consumption criterion. adaptively adjusted the rate of the video encoder and the channel encoder to maximize the video quality Qu et al. [73] 2006 delivered based upon both application-layer video motion estimates and link-layer channel estimates. Barmada et al. combined turbo coding and HQAM to provide a high protection for the BL under the constraint of the [70] affordable channel coding redundancy. Ha et al. [76] optimized the source and channel coding rates for minimizing the overall distortion. 2007 proposed a cross-layer operation method and a protocol architecture for the sake of transmitting the Huusko et al. [75] required control information and for optimizing the overall multimedia transmission quality over wireless and wired IP networks. amalgamated a two-stage FEC scheme with an enhanced link-layer protocol conceived for multimedia Shan et al. [81] 2009 transmission over wireless LANs. Both packet-level FEC and bit-level FEC were employed. estimated the video distortion for specific channel conditions, which was then exploited for optimally Jubran et al. [83] selecting both the application layer parameters and the physical layer parameters. investigated the efficient streaming of scalable video from a base station to multiple clients over a 2010 Zhang et al. [91] shared fading wireless network by considering both the application layer information and the wireless channel conditions. proposed an APP/MAC/PHY cross-layer architecture for optimizing the perceptual quality of delay- 2012 Khalek et al. [94] constrained scalable video transmission. Cicalo and Tralli designed a cross-layer optimization framework for scalable video delivery over OFDMA wireless 2014 [103] networks, which jointly addressed both rate adaptation and resource allocation. 14 video transmission by striking tradeoff amongst the attainable and quality-layers to be transmitted based on a combination throughput, reliability and delay by taking into account the of the prevalent channel statistics, on the source rates and near-instantaneous channel conditions and application require- on the playback buffer-fullness. A cross-layer optimization ments. By carefully considering both the wireless channel framework was proposed in [103] for scalable video delivery conditions and the SVC characteristics, the authors of [41] over orthogonal frequency division multiple access (OFDMA) judiciously allocated the resources between the source and wireless networks, which jointly addressed rate adaptation and channel encoders based either on the minimum-distortion or resource allocation with the aim of maximizing the sum of the the minimum-power consumption criterion. By exploiting the achievable rates, while minimizing the distortion difference knowledge of the source and the channel conditions, the among multiple video streams. These cross-layer operation authors of [73] adaptively adjusted the operating parameters based schemes may over-arch the entire OSI stack, as shown of both the video source encoder and of the FEC channel in Fig. 8. encoder for maximizing the video quality delivered based upon both the application-layer video motion estimates and E. Summary the link-layer channel estimates. The authors of [70] combined turbo coding and HQAM to provide a high protection for the Below, Section III-E1 summarizes the contributions re- BL under the constraint of the maximum affordable channel viewed above, followed by the illustration of the main focus coding redundancy. Both the source and channel coding rates of this tutorial. were optimized in [76] for the sake of minimizing the overall 1) Resources-Solutions-Objective: The main objective of distortion. An efficient cross-layer operation aided protocol layered video streaming schemes is to minimize the video architecture was proposed in [75] for the sake of transmit- distortion, which is indicated by the innermost kernel of Fig. ting the required control information and for optimizing the 8. Furthermore, the available resources and the four solution multimedia transmission quality both over wireless and wired categories are also displayed in Fig. 8. Generally, a specific Internet Protocol (IP) networks. A two-stage FEC scheme was solution may change the configuration or parameters of several investigated in [81] in conjunction with an enhanced link-layer resources for the sake of approaching the minimum video protocol specifically designed for multimedia transmission distortion. The available resources are listed as follows: over wireless LANs. At the application layer, the stage-one • Carrier: represents an electro-magnetic wave of fixed packet-level FEC scheme protected the packets against packet amplitude and frequency that is modulated in amplitude, losses due to congestion and route disruption. The stage-two frequency, or phase in order to carry a modulating sig- bit-level FEC was then added to both application packets and nal with the aid of radio-frequency (RF) transmission. stage-one FEC coded packets to recover bit errors imposed Multiple carriers may exist in a wireless communication by the link layer. Then at the link layer, a combined header system, which may experience different fading effects. cyclic redundancy check (CRC)/FEC scheme was used to Hence a less faded carrier may be selected for conveying enhance the attainable protection and to act in unison with the the more important BL as exemplified in [86]. two-stage FEC scheme. The proposed scheme thus provided • Bandwidth: In wireless communications, the bandwidth joint protection across the entire protocol stack. The authors represents the frequency range occupied by a carrier of [83] estimated the distortion of the received video and wave. For wireless transmission of the BL a wider proposed different error concealment schemes for diverse bandwidth may be allocated for the sake of improving its channel conditions. This estimated end-to-end video distortion quality. In wired networks, the bandwidth predetermines was then utilized for optimally selecting the application layer the achievable data transfer rate. In congested networks, parameters, such as the quantization parameter (QP), the GOP adaptive bandwidth allocation may be beneficial for lay- size and the physical layer parameters, such as the coding ered video streaming, as exemplified in [71]. rate, the symbol constellation in a constrained-bandwidth • Channel: Layered video transmission may benefit from framework. the diversity gain of MIMO systems, where different The authors of [91] investigated the streaming of scalable transmitter and receiver antenna pairs experience different video from a base station to multiple clients over a shared channel fading effects. Hence it is intuitive to optimize the fading wireless network by considering both the available channel assignment algorithm for transmitting different application layer information as well as the wireless channel layers, as exemplified in [42]. conditions. An APP/MAC/PHY cross-layer architecture was • Power: In wireless streaming of layered video, the em- proposed in [94] that optimizes the perceptual quality of ployment of adaptive power control may be beneficial, delay-constrained scalable video transmission. Specifically, an which increases/reduces the transmit power depending online Quality of Service-to-Quality of Experience (QoS-to- on the importance of the data, as exemplified in [90]. QoE) mapping technique was employed for quantifying the Furthermore, the hierarchical constellation point mapping effects of packet loss events endured by each video layer using of [72] may be interpreted as unequal power allocation. the Acknowledgement (ACK) history and perceptual metrics. • FEC Redundancy: FEC encodes a packet by imposing At the PHY layer, the authors developed a link adaptation redundant bits for the sake of providing an error cor- technique relying on the QoS-to-QoE mapping for providing rection capability. Generally, more redundancy results in perceptually-optimized UEP. At the APP layer, the source rate lower FEC coding rates and stronger protection. Hence was adapted by carefully selecting the specific set of temporal- different importance layers may be encoded differently 15
FEC Priority Redundancy
Available Resources Packet-level FEC Channel Retransmission
Solutions
Transceivers Minimum Cross-layer Based UEP Distortion Operation Objective
Power Source Bitrate Bit-level FEC
Carrier Bandwidth
Figure 8: The objective, solutions and available resources in existing layered video streaming schemes.
for the sake of improving the overall video quality, as
exemplified in [2]. Layered Video Communications • Retransmission: Retransmissions are widely utilized in congested networks for improving the QoS, provided that Transceivers Based UEP the delay imposed may be acceptable in non-interactive FEC Based UEP broadcast-scenarios. Layers of different importance may Layer-Aware FEC/Inter-Layer FEC be retransmitted for the sake of improving the perfor- Packet level (hard-decoded) mance, as exemplified in [71]. Fountain codes • Priority: The priority classes affect both the routing, as well as the congestion control of a packet. Hence in EWF [57] vulnerable networks it is intuitive to assign a higher Raptor [1] priority for the BL than for the ELs, as exemplified in RaptorQ [127, 129] [43], [44]. Pro-MPEG COP3 [114] • Source Bitrate: In vulnerable and throughput-limited net- RE-RS [113] works, streaming of a high-bitrate source bitstream may UEP-LDPC [51] result in unacceptable reconstructed video quality at the Multi-Generation Mixing (MGM) [89] receiver owing to the preponderance of errors. Hence we Hierarchical Network Coding (HNC) [85]
may have to dynamically adapt the bitrate allocation. For Bit-level example, the source bitrate may be reduced for the sake UEP-LDPC [51] (hard-decoded or soft-decoded) of increasing the FEC coding rate within a fixed total RSC [2, 93] (soft-decoded) bandwidth, as exemplified in [41]. SECCC [116] (soft-decoded) Turbo [115] (soft-decoded) 2) Our Focus: According to the state-of-the-art review provided in the Sections ranging from III-A to III-D, the infor- Priority Encoding Transmission (PET) [102] mation of the ELs was exploited in the LA-FEC and IL-FEC Partially regular LDPC [54] based UEP schemes of [1], [2], [85], [89], [93], [102] for the Cross Layer Operation sake of protecting the BL, as seen in Fig. 9. Hence, they were capable of substantially improving the system’s performance. Figure 9: The schemes of [1], [2], [51], [54], [57], [85], [89], For the sake of inspiring further research on LA- / IL-FEC [93], [102], [113]–[116], [122] employ the information of the solutions, we will review the family of packet-level LA-FEC ELs for protecting the BL. The techniques considered in this and bit-level IL-FEC solutions in the rest of this tutorial, as treatise are shown in bold. indicated in Fig. 9 in bold fonts. Moreover, there is a paucity of research on the optimization of soft-decoded bit-level IL- 16
FEC based schemes. Since the soft-decoded FEC codes are Coded L0 Coded L0&L1 more beneficial than their hard-decoded counterparts, we will mainly focus our attention on the class of soft-decoded IL-FEC schemes.
IV. PACKET-LEVEL PROTECTION USING:LAYER-AWARE FEC BL L EL L Layer-Aware FEC schemes are characterized by encoding 0 1 BL and EL separately while allowing a combined decoding of 1st window 2ed window BL and EL FEC data on the receiver side. This is achieved by incorporating the BL source data in the FEC generation of Figure 11: Example of expanded window fountain codes the EL so that EL data also protects BL data. The key for an invoked for scalable video streaming. efficient design of LA-FEC is the integration in today’s state- of-the-art FEC codes while not deteriorating the original cod- ing performance and functionalities of those highly optimized defined over the source symbols generate r classes of different- FEC codes. This section reviews the integration of LA-FEC rate protection. Specifically, the symbols falling in the index in today’s packet-level FEC and both, packet- and bit-level, range of 1 k , , (k − + 1) k , ,(k − + 1) k ∼ 1 ··· i 1 ∼ i ··· r 1 ∼ r algorithms. define the 1st, , ith, ,rth windows, which represent the ··· ··· importance order of 1st > >ith> >rth. A EWF code may ··· ··· A. Layer-Aware FEC Using Expanded Window Fountain be defined by the set: (EWF) Codes (Π, Γ, Ω(1), , Ω(j)), (2) Fountain codes [108], [117], [123] provide a universal F ··· solution for combating packet loss events, when multicasting where the protection-classes are defined by the generating data over unreliable networks [124], since they are capable of polynomial r s generating an arbitrary number of encoded packets given the Π(x) = i xi, (3) k input source packets. LT codes [108] were the first practical i=1 fountain codes. LT codes are equal error protection (EEP) X with s = k k − indicating the number of symbols within i i − i 1 FEC codes, since they treat all source symbols as equally the ith protection class. The encoding process of this EWF important. Hence as a further development, fountain codes code is formulated as follows [82], [85]: capable of providing UEP have emerged in [82], [85], [125]. 1) Randomly choose a window from the window-selection The family of EWF codes was proposed in [82] as a class of r i UEP fountain codes, which may be viewed as a generalized distribution of Γ(x) = Γix , where Γi is the proba- version of standard LT codes. i=1 bility of picking the ithPwindow. 1) LT Encoding: The LT decoding process is detailed in 2) Encode the symbols of the ith window using the LT code i [108]. Suffice to say that LT codes are capable of may recov- (i) (i) i ering the transmitted source packets with a high probability, Ω (x) = Ωl x , generating the output symbols of l=1 when the number of received packets is sufficiently higher the ith protection-classP as exemplified in Fig. 10. than the number of input packets. The LT encoding process is 3) Repeat the above process, until a sufficient high number listed as follows: of symbols was generated. 1) Divide the source sequence S into K blocks of equal 3) Layered Video Multicast Using EWF Codes: Based on length. the above discussions, we note that the EWF code defined by 2) A random degree d is generated according to a specific the set (Π, Γ, Ω(1), , Ω(j)) constitutes a fountain code, F ··· degree distribution, where d is defined as the number which assigns an output symbol to the ith window using of source-packets combined by the modulo-2 operation the probability distribution Γi. Then this protection class is to generate an output packet, for example the so-called encoded using a standard LT code based on the distribution robust soliton distribution was proposed in [108], which Ω(i)(x), which treats the source symbols within the the ith determines the attainable performance of the LT codes. protection class as of equal importance. Hence the EWF code 3) For generating a new LT-coded packet, d of the K may be considered as a specific class of UEP LT codes. The blocks are randomly selected. Then the binary exclusive classic belief propagation (BP) [126] algorithm or alternatively OR (XOR) operation is performed on these d blocks, low-complexity XOR operations [82], [108] may then be generating the new LT-encoded packet. applied for decoding both the conventional LT codes and the 4) Repeat the Steps 2 and 3, until a sufficiently high number EWF codes. of packets was generated for frame S. From Section IV-A2, we observe that the ith window is 2) EWF Codes: The EWF encoding process is illustrated protected by the (i + 1)st, ,rth windows. This property ··· in Fig. 10, where we consider a total of k consecutive source enables UEP for the different layers of scalable video. More symbols. As shown in Fig. 10, the r expanded windows specifically, let us consider the scenario of two layers, namely 17
ith class symbols
...... 1 2 k1 k2 ki kr = k 1st window ith window rth window
:source symbol :output symbol Figure 10: LT code with Expanded window fountain codes [82].
BL L0 and an EL L1. We may construct the EWF codes with eliminate this impediment of the LT codes by concatenating two window as displayed in Fig. 11. The first window will be the LT code with a precoding operation, which relieves the LT allocated to the BL L0, while the second window will contain code from the need to decode all source symbols and leads to both BL L0 and EL L1. Hence, the BL L0 may be protected a significantly lower coding overhead. by the encoded results of both 1st window and 2ed window, Fig. 12 illustrates the coding process of the Raptor code as illustrated in Fig. 11. Since the encoding symbols of an as specified in RFC5053 [128] and RFC6330 [129]. As men- LT code are computed by linear combinations of the source tioned above, the encoding process consists of two encoding symbols, the encoding results of both windows can also be steps, namely the precoding and the LT coding. In the first used in an additive way. step, the precoded symbols are generated by the precoding matrix, which contains the first k rows of the LT encoding B. Layer-Aware FEC Using Raptor Codes matrix of the second encoding step. The precoded symbols are then used as input to the second encoding stage, which Raptor codes were proposed in [127] as improved LT codes, invokes an LT encoding process for generating a ’fountain’ which imposed linearly increasing encoding and decoding of encoded symbols. Since the precoding matrix contains the complexity as a function of the number of source packets. first k rows of the LT coding matrix, the final LT encoding Raptor codes are furthermore specified under the auspices of step is an inverse process and the first k symbols of the the IETF. As a first version, RFC5053 [128] was ratified, encoded symbol stream are identical to the k source symbols. which was then followed by an evolved version in RFC6330 This makes the code systematic, which is important for real- [129]. The integration of the LA-FEC concept into RFC5053 world applications for the sake of avoiding the need for and RFC6330 is detailed both in [1] and in [122]. The basic FEC decoding under error-free conditions, because in this principle is similar to the EWF approach of Section IV-A, but case simply the systematic information part is retained as the requires additional modifications due to the specific encoding decoded codeword. process of the Raptor FEC. 2) Layer-Aware Raptor Codes: In order to preserve the 1) Raptor Coding: The basic idea behind Raptor codes original coding performance and functionality of the Raptor relies on precoding the input symbols before applying to code, its integration with the LA process has to satisfy the them an LT code. Alternatively, Raptor codes are typically following requirements: constructed by concatenating from two codes, namely the outer code/precode and the inner LT code, where the inner LT code 1) Preserve the original systematic structure of the BL’s encodes the precoded symbols its input. and EL’s symbols, which guarantees their separability; Again, Raptor codes may be viewed as an evolution of the 2) Facilitate the superimposed, enhanced-quality additive LT coding philosophy. The disadvantage of the LT code is that decoding of the BL’s and EL’s symbols; for facilitating decoding, all original uncoded source symbols 3) Preserve the original error-resilience of both the BL and have to be covered by the received symbols. This leads to EL. a relatively high decoding overhead, since the decoder has to In the following, we consider the example of the EWF code wait in some situations until the reception of a ’repair’ symbol, of Fig. 11 with the aid of two SVC layers. The encoded which covers a specific missing source symbol. Raptor codes symbols L0 of the 1st window are generated with the aid of the 18
precoded symbols . = 0
Input data Output data
source symbols encoded symbols encoded symbols source symbols precoded symbols
1. Precoding step: 2. LT Encoding step: Generates precoded Generates encoded symbols symbols from precoded symbols
Figure 12: Raptor encoding process as specified in RFC5053 [128] and RFC6330 [1], [129].
standard Raptor encoding process discussed in Section IV-B1. context, the BL can be decoded iff at least k0 BL symbols Similarly, following the EWF approach of Fig. 11, the encoded have already been received. By contrast, for the LA-Raptor symbols L1 of the 2nd window are also generated with the aid code, the BL can either be decoded in isolation, provided that of another standard Raptor code that covers all source symbols more than k0 BL symbols have been received or alternatively of the 2nd window. However, there are two major problems, by, using the already available BL symbols in combination when using the standard Raptor encoding process, which may with the EL’s symbols. In the latter case, both layers can be be circumvented by the scheme of Fig. 13, as detailed below: decoded iff at least (k0 + k1) symbols have been received. For 1) Generally, the EL’s encoding process is non-systematic, the LA-Raptor scheme, we assume having the same size for since the EL is only involved in the encoding process the BL and EL in conjunction with k0=k1. Table V shows of the 2ed window, which contains the symbols of both the failure probability of the Raptor decoding process for a the BL and EL. given source block of k source symbols in conjunction with 2) The BL’s and EL’s encoded symbols cannot be readily different number of received symbols. Specifically, k0 and decoded to provide an enhanced video quality, since the (k0 + 1) symbols are received for the BL, while (k0 + k1) BL’s precoded symbols differ in the original BL coding and (k0 + k1 + 1) for the EL. process and in the EL coding process. The results of Table V demonstrate that the above- mentioned LA-Raptor modifications achieve a comparable Fortunately, the 2nd problem can be solved by incorporating decoding performance to that of the original Raptor code. the BL’s precoded symbols in the 2nd encoding step, similar The reception of a single additional symbol is sufficient in to the EWF process, but using a concatenation of the LT both codes for attaining a near-unity probability of successful coding matrices of both the BL and of the EL. This ensures decoding. that the BL’s precoded symbols become part of the EL’s precoded symbols. Furthermore, this approach also preserves the original degree distribution of the LT code matrix of both C. Layer-Aware FEC Using Randomized Expanded Window the BL as well as of the EL and thereby preserves the original Based RS Codes (RE-RS) coding performance. Serendipitously, the 1st problem can also The integration of the LA-FEC approach with Reed- be solved, which is achieved by integrating the LT matrix of Solomon (RS) codes is described in [113]. The authors con- the second encoding step into the precoding step and thereby ceived the randomized expanded window based RS coding ensures that the first k encoded symbols become identical 1 (RE-RS) philosophy for delay sensitive real-time video stream- to the k original source symbols. This coding process is 1 ing. In this context, the RS coding is expanded over a number graphically illustrated in Fig. 13. of previously received pictures in order to recover the reference The above-mentioned modifications of the Raptor code pictures in the case of transmission problems, albeit this is specification satisfy both the 1st and 2nd requirements. Ad- achieved at the cost of a moderate single-codeword FEC delay. ditionally, the results seen in Table V show that the above- The general approach is illustrated in Fig. 14. Again, in order mentioned approach is also capable of satisfying the 3rd to limit the delay introduced by the FEC to a single video requirement. The results are based on the LA RaptorQ3 frame, the FEC source block is expanded only to video frames scheme of the RFC6330 standard as specified in [122]. The received in the past, thereby each video frame may be decoded performance of Raptor codes is typically characterized in and displayed at the appropriate time instant. terms of the number of symbols required for successful decod- The authors designed the RE-RS encoding scheme ac- ing, i.e. by the parity-overhead. In the specific Raptor code’s cording to the LA-FEC approach, constructing the RE-RS in a way so that all the RS-decoded information of the 3RaptorQ code [129] is a variant of Raptor codes [128], which offers a better coding performance, i.e. its reduced symbol overhead for guaranteeing current video frame and that of all the previous video frames successful decoding and supports larger source symbol block sizes. can be conveniently combined at the decoder and that the 19
1828 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 23, NO. 11, NOVEMBER 2013
and at the end of each sub-GOP, if the lost packets could be recovered by the RS code, the reference buffer will be updated using the recovered information to stop the error propagations. In this scheme, although the overall performance could be Base layer higherLA-FEC than that of the evenly FEC, the video quality fluctuates
LA-FEC precoded symbols (BL) frameprecoded by frame. symbols (BL) precoded symbols (EL) . = 0 B. Proposed RE-RS Scheme Input data Another solution which we propose in this paper isOutput to use data source symbols encoded symbols expanding RS code. This scheme is described in Fig. 1(c), the RS parity packets are generated using the video packets of the encoding symbols source symbols
Precodedsymbols (EL) current frame and all the previous frames of the current GOP. At the decoder side, the parity-check equations of the current 1. Precoding step: frame will be combined with2. LT those encoding of step: all the previous frames. Generates precoded The lost packets will be recoveredGenerates encoded if the symbols combination of the symbols from precoded symbols parity-check equations can be jointly solved. Thereby, these Figure 13: Modified Layer-AwareRS parity Raptor packets coding can not process only help [1]. to recover the lost packets of the current frame, but also the lost packets of the previous Table V: Decoding failure probabability of both the Raptorframes. and of In the this LA-Raptor scheme, no scheme video of packets the RFC6330 of the following standard [122]. frames will be used in the current FEC coding block, thereby each frame could be decoded and displayed at its proper time, Raptor and no extra delay willRaptorLA be caused. Received k0 k0+1To better illustratek0+k1 the expandingk0+k RS1+1 scheme, one simplified k=50 4.68 10−3 3.82 example10−16 is4 drawn..26 Let10− us3 use1. the42 first10 two−16 frames in Fig. 1(c), × ×where each frame× has 4 video packets× and 2 RS parity packets. k=100 4.47 10−3 2.71 10−16 3.95 10−3 1.43 10−16 × −3 ×We− assume16 packets× 1,− 2,3 and 3 in× the first−16 frame and packet 5 k=200 4.98 10 3.77 in10 the second2.91 frame10 are lost,3 whereas.12 10 other video packets and × −3 × −16 × −3 × −16 k=400 5.19 10 1.07 parity10 packets1.98 of the10 two frames2.31 are10 received. In this case, as × − × − × − × − k=800 3.97 10 3 2.75 described10 16 in2. (2),59 the10 parity-check3 3.48 equations10 16 for the first frame × × × × k=1600 2.82 10−3 1.95 could10−16 be simplified3.86 10 as− follows:3 3.52 10−16 × × × 2 × X1 + αX2 + α X3 + C1 =0 2 2 2 (5) X1 +(α )X2 +(α ) X3 + C2 =0 ! C = (c , c , ..., c ) having N = 2m 1 encoded symbols and and the parity-check1 2 equationsN for the second− frame combined withm thatbits of per the firstRS-encoded frame are symbol, as follows: where 2t symbols are lost. A decoder will attempt2 to recover the 2t = (N K) packets by X1 + αX2 + α X3 + C1 =0 − solving the 22t parity-check2 2 equations of X1 +(α )X2 +(α ) X3 + C2 =0 ⎧ 2 4 (6) ⎪ X1 + αX2 + α X3 + α CHX5 +tC=3 =0 0, ⎪ 2 2 2 2 4 (4) ⎨ X1 +(α )X2 +(α ) X3 +(α ) X5 + C4 =0 where⎪ H is the parity-check matrix, formulated as follows: where X⎩⎪1, X2, X3, and X5 denote the four lost packets; C1, 2m−3 2m−2 C2, C3, and C41 are constantα ... values, which α are determinedα by m m the received packets.1 α It2 should... be mentioned(α2)2 −3 that, in order(α2)2 to −2 recoverH the= lost. packets,. the above equations. should be solved. , in Galois Field. of GF. (2m). Here we should. note that in the. N−K N−K 2m−3 N−K 2m−2 Figure 14: Example of the RE-RS scheme, where eachfirst and third equations1 α in (6),... the(α coefficients) for X(1α, X2, X)3 are the same, and this also happens for the second and fourth transmission frame contains 4 original m-bit video strings, (5) Fig. 1. Examples of different FEC schemes, where each frame has 4 video equations. Therefore, for (5) and (6), the rank of coefficient eachpackets representing and redundant an packetm-bit rate is RS-symbol,µ =0.5. (a) Evenly and FEC. 2 redundant (b) Sub-GOP- RS- with α being the so-called primitive element of the Galois based FEC. (c) Proposed RE-RS scheme. matrixField is 2 GF( and2m 3,). respectively. Eq. (4) can Hence, be solved, (5) and when (6) cannot we have 2t symbols [113]. be solved with three and four variables, respectively. In other ≤ N K, since the rank of the matrix H is (N K). code. However, for this solution, one sub-GOP of delay will be words,− all the four lost packets, in this example, cannot− be caused if the video frames will be decoded and displayed at the recovered.To elaborate a little further, t physically represents the num- codingend of performance its sub-GOP. of On the the combined other hand, RS if the code sub-GOP-based was preserved Inber order of tom tackle-bit symbols this problem, that we can need be tocorrected increase theby rankan (N,K, 2t) comparedapproach to is a to conventional be used for the RS real-time coded scheme. applications where it of (6)code. to 4, Explicitly, for this reasont equations we propose are to solved randomly for finding reorder the positions 1)is not Reed-Solomon possible to wait Codes: for theRS video codes or parity are packets widely of used the invideoof packets the erroneous and the zerom-bit padded symbols, packets and (ift equations any) before for the finding the numerousfollowing operational frames, the applications. average video An qualityRS might(N,K deteriorate) code takesRS encodingspecific stage. error-magnitudes This is a key step of the to ensuret erroneous that, withm high-bit symbols. K withsymbols respect as its to theinput approach and generates that tolerates a codeword delay. In having [25], wea totalprobabilityBy contrast, the coefficients in the scenario of the parity-check considered equations the position for of the proposed to decode and display the frames in real-time fashion, different frames are independent, thereby ensure the high error length of N symbols. They are maximum distance separable unrecovered symbols is known, hence the 2t equations can be directly solved for finding the missing 2t = (N K) symbols. (MDS) codes, which have the beneficial characteristic that − receiving any K out of the N RS-encoded symbols is sufficient 2) RE-RS Scheme: When considering the example seen in for the successful decoding of the K original source symbols. Fig. 14, the basic approach of the RE-RS code is to use a Let us assume for example using a systematic RS codeword of standard RS code, such as an RS(16,14) code defined over the 1832 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 23, NO. 11, NOVEMBER 2013
Fig. 3. Average PSNR versus bitrate for various redundant packet rate µ; CIF Foreman sequence is used; i.i.d. average packet loss rate is 10%; RS redundant packet rate µ includes 200.3, 0.4, 0.5, 0.6 . { } Galois Field GF(24) and to encode the first frame with the aid of a shortened RS(6, 4) code, where the 10 remaining unused symbols are zero-valued padding symbols, which are removed after encoding. The second video frame of Fig. 14 is encoded with the aid of the same RS(16, 14) code but using only 6 zero- valued padding symbols, hence yielding an RS(10, 8) code and so forth. As a benefit, the different RS-encoded data can be combined at the receiver, since they belong to the same RS codeword. However, the authors of [113] demonstrated that Fig. 5. Average PSNR versus bitrate curves; i.i.d. average packet loss rate with the aid of the standard RS code, a combination of the RS-encoded data of the different video frames would fail to is 10% and the redundant packet rate µ =0.4. (a) Foreman sequence. (b) Bus achieve the performance of an RS code. This is illustrated by sequence. (c) Stefan sequence. (d) Mobile sequence. a simple example, which considers the first two frames of Fig. 14. The underlying assumption is that the source symbol 1, 2, and 3 of the first video frame and 5 of the second video frame In the first set of simulations, we study the effects of are lost, where the total number of lost symbols is given by 2t = 4. According to Eq. (4), the parity-check equations of allocating different redundant packet rates for RS code. The the first frame can be formulated as follows: Figure 15: RE-RS: Average PSNR versus bitrate curves; i.i.d. 2 average packet loss rate is 5% and the redundant overhead was X1 + αX2 + α X3 + C1 = 0 network packet loss is i.i.d. random packet loss model; for 2 2 (6) 20% for the Foreman sequence [113]. X1 + (α )X2 + (α )X3 + C2 = 0, the same average packet loss rate p = 10%, we try different where X1, X2, and X3 denote the three lost m-bit RS- symbols, while C1 and C2 are constant values, which are de- parity-check equations for the second frame become: RS redundant packet rates, µ, including 0.3, 0.4, 0.5, 0.6 . termined by the RS-symbols received for the first transmission 5 2 1 0 frame. It is obvious that this pair of equations is insufficient α X1 + α X2 + α 0X3 + C1 = 0 { } 2 5 2 2 2 1 0 for finding the three unknowns. However, as an explicit benefit (α ) X1 + (α ) X2 + (α ) 0X3 + C = 0 We do simulations with various quantization parameters (QP) 2 (8) of the RE-RS approach, we can now combine the RS-decoded 6 3 1 0 α X1 + X2 + (α )X3 + (α 1)X5 + C3 = 0 data of the two transmission frames, which leads now to the 0 to span a considerable bitrate range. Fig. 3 shows the average (α2)6X + X + (α2)3X + (α2)11X + C = 0. following four parity-check equations: 1 2 3 5 4 In the example considered, the rank of the coefficient matrix PSNR versus bitrate curves with different RS redundant packet X + αX + α2X + C = 0 1 2 3 1 defined over the Galois Field GF(24) of the RS(16,14) code 2 2 X1 + (α )X2 + (α )X3 + C2 = 0 becomes 4, which makes it a full rank matrix and hence the rates µ. In general, the PSNR curve for redundant packet 2 4 (7) X1 + αX2 + (α )X3 + (α )X5 + C3 = 0 equations can be readily solved. It should be noted that the 2 2 2 2 4 randomization approach relies on an intelligent algorithms that rate 0.3 is much lower than other cases. The PSNR curves X1 + (α )X2 + (α ) X3 + (α ) X5 + C4 = 0, minimizes the probability of having the same coefficients in where X5 denotes the lost RS-symbol of the second transmis- the different windows. for µ = 0.4, 0.5 are very close; while in low bitrate, sion frame, while C3 and C4 are constant values determined The performance of the proposed scheme is characterized { } by the RS-coded transmission received for the second trans- in Fig. 15. Here the authors of [113] compared their RE-RS higher redundant rate, µ =0.5, can provide slightly better mission frame. This equation has to be solved over the Galois scheme to an evenly distributed FEC (Evenly-FEC), to the so- m field GF(2 ). However, since the coefficients of X1, X2 and called dynamic sub-GOP FEC (DSGF) scheme [130] and to performance than that of µ =0.4, and vice versa, in high X3 in the first and the third equation are the same, the rank an error-freeFig. 4. curve. Average An H.264/AVC PSNR video stream versus and an 10-bit bitrate curves; i.i.d. average packet loss rate of the combined matrix becomes 3, which is insufficient for peris symbol 5% RS and code defined the redundant over GF(210) is packet used. The figure rate µ =0.2. (a) Foreman sequence. (b) Bus bitrate, lower redundant rate, µ =0.4, is slightly better. This determining four variables. Hence upon using the standard RS shows the peak signal-to-noise ratio (PSNR) of the decoded code, the four lost symbols cannot be recovered. videosequence. clip averaged over(c) 200Stefan trials. Forsequence. the three approaches, (d) Mobile sequence. is because in low bitrate, the slice number in each frame is the same quantization parameter (QP) is used for encoding the video sequences, and the same number of redundant symbols small, which makes the performance of RS code low, and was inserted for ensuring a fair comparison. An average packet The authors of [113] solved this problem of combining two losslength ratio of 5% was as used. 400 The results bytes of Fig.[36] 15 show the unless clear otherwise noted. Since at high high RS redundant packet rate is required to compensate for different RS codes by a randomized reordering of the m-bit benefit of the RE-RS approach compared to DSGF and to the video strings representing the RS-symbols of a transmission evenlybitrate, distributed FECthe scenario. slice number of one GOP could be large, 10-bit per this. For the PSNR curve of µ =0.6, although at low bitrate frame before the actual RS encoding step. This approach en- sures with a high probability that the coefficients of the parity- 10 check equations derived for different transmission frames be- D.RS Layer-Aware symbol FEC Using is the Pro-MPEG used, COP3 namely Code Galois Field of GF(2 ). So the its performance is similar as that of µ = 0.4, 0.5 , it is less come independent of each other. This is shown with the aid of The authors of [114] demonstrated the integration of the { } an example in [113], where the randomly reordered positions LA-FECvalue technique of withN thefor Pro-MPEG the COP3 RS code code of [131], (N, K) could be up to 1024, and active in high bitrate. It is worth indicating that for a fixed total of the lost RS-coded symbols {1,2,3} become {6,3,11}. By the which is widely used in IP-based video delivery networks. same token, for the second RS-encoded window, the positions Althoughthe thisvalue code does of notK achievedepends the error correction on N and the per frame parity packet bitrate, higher redundancy means less bitrate could be used for {1,2,3,5} become {7,1,4,12}. In this case, the combined RS performancenumber of the evaluated other codes described using earlier (4) in this in Section III. We use the average the video date and vice versa; that is why having too high a luminance peak signal-to-noise ratio (PSNR) to assess the redundant packet rate cannot provide the best performance. In objective video quality, which is denoted as PSNR(mse), this general, the PSNR curves for a redundant rate 0.3, 0.4 are is obtained by evaluating the mean squared error (mse) over all similar; consequently, in the following simulations,{ we} use the frames and over 200 trials, then the value of PSNR(mse) is RS redundant packet rate µ =0.4 for the 10% packet loss calculated based on the averaged mse. It is worth mentioning rate. Using the same methods, it is found that for packet loss that in all the following reported results, the bitrate includes rate p = 5, 10, 15, 20 %, the proper RS redundant packet both the video and parity packets. In our previous work [25], rate is µ ={ 0.2, 0.4, 0.}55, 0.7 , which means that µ should it was shown that the performance of DSGF approach is increase almost{ linearly with p.} Therefore, in later simulations, higher than many state-of-the-art approaches. Therefore, to RS redundant packet rate µ =4p will be used. The precise have fair comparison we compare our results with DSGF [25] relationship between µ and p is left for future investigation. and Evenly FEC, both of which meet the real-time constraint Figs. 4 and 5 compare the performance of the three and cause no additional delay. approaches in term of PSNR versus bitrate and for i.i.d.
1 1 ideal code; REIN ideal code; REIN standard COP3; REIN standard COP3; REIN enhanced COP3; REIN enhanced COP3; REIN 0.8 0.8 ideal code; random ideal code; random standard COP3; random standard COP3; random enhanced COP3; random enhanced COP3; random 0.6 0.6
0.4 0.4 Minimum FEC overhead Minimum FEC overhead 0.2 0.2
0 0 −6 −5 −4 −3 −6 −5 −4 −3 10 10 10 10 10 10 10 10 Packet Loss Rate Packet Loss Rate (a) Bitrate = 2 Mbps; FEC latency = 100 ms (b) Bitrate = 2 Mbps; FEC latency = 400 ms
1 1 ideal code; REIN ideal code; REIN 21 standard COP3; REIN standard COP3; REIN enhanced COP3; REIN enhanced COP3; REIN 0.8 0 0.8 " 0 " ideal code; random 10 ideal code; random 10 standard COP3; random standard! COP3; random !# !# !# !# !# enhanced COP3; random enhanced COP3; random !# !# !# !# !# $%& 0.6 0.6!# !# !# !# !# $%& !# !# !# !# !# !# !# !# !# !# $%& −2 ! −2 10 !# !# !# !# !# $%& 10 ! 0.4 " 0.4 !# !# !# !# !# !# !# !# !# !# $%& !# !# !# !# !# $%& !# !# !# !# !# !# !# !# !# !# $%&
Minimum FEC overhead Minimum FEC overhead !# !# !# !# !# $%& 0.2 0.2 −4 enh COP3; base layer; indep −4 enh COP3; base layer; indep 10 10 enh COP3; base layer; LA$%& $%& $%& $%& $%& enh COP3; base layer; LA (a) Row-wiseenh COP3; enh layer; indep Fig. 2.EnhancedPro-MPEGCOP3codesparitypacketgeneration.enh COP3; enh layer; indep 0 0 −6 −5 −4 −3 −6 enh−5 COP3; enh layer;− LA4 (b) Column-wise−3 enh COP3; enh layer; LA 10 10 10 10 10 10 10 10 Example with N3 =5and G3 =4 Packet Loss Rate st COP3;Packet baseLoss Ratelayer; indep st COP3; base layer; indep −6 Fig. 1.StandardPro-MPEGCOP3codesparitypacketgeneration. −6 10 Figure 16: The standardst COP3; Pro-MPEG base layer; COP3 LA code’s parity gen- 10 looking at it from a differentst perspective, COP3; base layer; to be LA able to reachthe (c) Bitrate = 6 Mbps; FEC latency = 100 ms % blocks with uncorrectable errors Exampleeration [114].with(d) BitrateD =4 =and 6st Mbps; COP3;L =5 FECenh layer; latency indep = 400 ms % blocks with uncorrectable errors st COP3; enh layer; indep st COP3; enh layer; LA same quality level as the standardst COP3; ones, enh employinglayer; LA less FEC over-
Fig. 3.EvaluationoftheperformanceofstandardandenhancedPro0.75 trate,0.8 which determine0.85 -MPEG the0.9 COP3maximum0.95 codes value of1 k,and(ii)themini- 0.75head than0.8 the most suitable0.85 configuration0.9 0.95 that can be1 selectedwhen mum allowedMinimum FEC FEC code code rate, rate which defines the available FEC band- working with onlyMinimum one or FEC two code dimensions. rate model, the gap between the two schemes is smaller. On the other 9:; < (a) (Bitratewidth, enh. that layer is, the / bitrate maximum base layer) number = 0.45 of parity packets that can be (b) (BitrateProvided enh. the numberlayer / bitrate of data base packets layer) per = FEC 0.35 block and the min- hand, the performance of the enhanced Pro-MPEG COP3 codes 57 58 59 5: Figure 18: Performance results for Layer-Aware extension of generated.567 The8 actual9 : !"# values that D and L take, and the decision imum FEC code rate, the extra parameters raise the number of pos- 0 Pro-MPEG0 COP3 code [114]. improves, compared to the standard ones, as the number of dat10a 65 66 67 68 10 on which; <= dimensions54 55 56 to!"# employ, are selected after an optimization sible combinations of data packets within a FEC block that canbe packets per FEC block grows. The reason is that the number and the process dependent on the channel behavior.5 = 5; "#$5 %!$ 57 58 59 5: 5; 5< !"# employed to compute the repair packets. This action increases the quality of possible configurations also rises. Pro-MPEG COP3 codes are characterized6545< ! by%& its extremely4 %!$ low capability of Pro-MPEG COP3 codes to adapt to a specific situa- 5= 64 65 66 67 68 !"# the EL source data of Fig. 17 (b), firstly the EL parity matrix −2 −2 10 complexity with respect to both the redundancy557 5= 0 generation12 6 algorithm%!$ 10 tion, whereas respecting the principles of the redundancy generation !"# !"# !"# !"# !"# !"# is generated by using the standard Pro-MPEG COP3 code. and the number of operations required568 to64 compute7 89 repair: %!$ packets algorithm of the standard codes. 4. TWO-LAYERED VIDEO PROTECTION In a second step, the BL data is uniformly distributed across and recover(a) Base data layer packets (computational complexity). Unlike other In regard to the encoding computational complexity of the pro- state-of-the-art codes, like Raptor or RaptorQ%!$ codes%!$ %!$ [15],%!$ redun-the two-dimensions of the EL with the aid of their XOR 4.1. Scheme −4 enh COP3; base layer; indep −4posed enhanced codes, Nd 1 XORenh COP3; operations base layer; are indep carried out per re- 10 connection,10 whilst avoidingth XORing− the same source symbols, dancy is generated throughenh COP3; the straightforward base(b) layer; Enhancement LA one-step + base process layer de- pair packet in the d dimension.enh The COP3; decoding base layer; process LA is a bit more In this section, we describe the proposed LA-FEC scheme for the scribed above, which veryenh COP3; much enh eases layer; their indep implementation and de-as in the BL FEC generation. The performanceenh COP3; enh difference layer; indep of a Fig. 4.ProposedLA-FECschemefortwo-layeredmultimediapro-enh COP3; enh layer; LA complicated to evaluate than inenh the COP3; standard enh layer; case, LA as many possible protection of two-layered multimedia transmission, which is based ployment.Figure 17: Concerning Layer-Aware the extensionlatter, the encodingof the Pro-MPEG computation COP3al com-variety of schemes is shown in Fig. 18. Specifically, the curves tection. Purple is for basest layerCOP3; packetsbase layer; and indep pale blue for enhance- realizations can take place. However,st COP3; the base number layer; indep of XOR operations on the AL-FEC enhanced Pro-MPEG COP3 codes introduced above.−6 code [114]. st COP3; base layer; LA with the−6 "enh" label refer to anotherst enhancement COP3; base layer; ofthe LA Pro- 10 plexity equals L 1 XOR operations per row-wise repair packet 10 is always lower than N1 (N2 1)+N2 (N1 1)+(k/N3) (N3 1). % blocks with uncorrectable errors ment layer packets % blocks with uncorrectable errors LA-FEC schemes follow the dependency between the different video and D 1 XOR− operationsst COP3; per enh column-wise layer; indep repair packet. Thus,MPEG COP3 code proposed· by the−st authors COP3; enh· of layer; [114],− indep which · − − st COP3; enh layer; LA Therefore, the computation complexityst COP3; enh of layer; both LA stages is linear with layers to improve decoding capabilities, in contrast to traditional Un- the encoding complexity deeply depends on the shape of the matrix,increases the attainable protection capability by incorporating 0.75 among0.8 the different0.85 dimensions0.9 of the0.95 enhancement1 layer. Finally, 0.75k in the worst0.8 case. 0.85 0.9 0.95 1 equal Error Protection (UEP) schemes, where protection packets are growingsection, linearly itMinimum is particularly withFEC codek in rate the beneficial worst case. for employment In the decoding in high- process,additional FEC protectionMinimum across FEC the code diagonal rate of the parity repair packets are generated XORing not only the corresponding en- generated independently for each layer [14]. Dthroughput(L 1)+( networksL 1) ( owingD 1) toXOR its extremely operations low are computational needed to recovermatrix. The curves with the "st" label refer to the standard (c) (Bitratehancement· enh.− layer layer packets, /− bitrate· but base− also layer) the = associated 0.25 base layer packets. (d)3.2. (Bitrate Experiments enh. layer and / results bitrate base layer) = 0.15 In the considered scenario, users are divided into two different alland the memory lost packets demand, in a protection while providing block in a sufficiently the worst case. high T errorherefore,Pro-MPEG COP3 code of [131]. For both cases, the figure subsets, regarding network and device capabilities. UsersFig. 5in.EvaluationoftheperformanceoftheproposedLA-FECschem the first So, the decoding capabilities of both layers are increased andewithstandardandenhancedPro-MPEGCOP3codes. a bettershows a comparison of the BL and EL block error rate using thecorrection decoding capability. computational complexity is2 also linear with k. In this section, we follow the criteria specified in the DVB-IPPhase1 quality, Q2,canbeoffered. both an independent encoding and the LA-FEC extension of group only receive the base layer of the layered videoParameters: with an as- FEC latency1) = Pro-MPEG 400 ms; Channel COP3 Code: PLRThe = 10 Pro-MPEG− ;Errordistribution=REINmodel(8ms);Bitratebaselayer= COP3 code is Handbook (ETSI TS 102 542-3-2) [3] to compare4Mbps the performance Figure 4 shows an example of the application of the proposedthis section. sociated quality Q1.Thesecondgroupobtainsbothlayerswithan based on a simple XOR-aided combination of the video source of the standard Pro-MPEG COP3 codes with that of the proposed Moreover, unlikeLA-FEC for3. thePROPOSED scheme protection with ENHANCED the of standard non-layered Pro-MPEG PRO-MPEG multime COP3 COP3dia, codes. CODESsumed The a constant maximum FEC latency). However, there exist cer- associated quality Q2 > Q1.Inordertoensurethetargetquality,the symbols. Before encoding, the source symbols are arranged enhanced ones. where the numberdistribution of data packets of the base within layer a FEC packets block is realized is not sequenti impor-ally, astain it al- points that do not fully follow this expected tendency. The rea- layered video is protected by means of an LA-FEC scheme. 3.1.in a Scheme matrix having D rows and L columns, as shown in Fig. E. Layer-AwareBy means FEC of Using a series UEP-LDPC of simulations, Code in which the conditions and Our proposed straightforward scheme operates astant follows. as long First, as it fulfillslows16. simple the The imposed signaling.FEC-encoded constraints, However, redundancy different ancan extrathen distribution restriction be generated schemes row-son (e.g. is that,limitations for certain of configurations, the scenario are provided, the selection the resulting of the poibasents layerindicate the base layer is protected on its own applying theappears enhanced in P thero- layeredrandom)Inwise videoessence, will (a), case: which be as considered indicated the is suited need in asfor the that a correcting future standard, thenumber work. independent the set of ofdata data packet pack losspacketsets that A used numberthe for FEC the of overhead application generation corresponding layer of a FEC protection codes to the were code packet mentioned configuration is not the (dimostmen- are used to generate a certain parity packet belonging to the dimen- MPEG COP3 codes. The repair packets created cannotpackets depend in a on base layerAsevents, protection for the or column-wiseone-layer block and video (b), the case, whichnumber an is overall ofintended dat minimumapack- for correcting FECsuitable, codeabove. andsions By so, contrast, to results use, andLDPC are values notcodes better of areN1, widely thanN2 and those usedN3)thatrequiresthelowest asfor physical other scenarios d enhancement layer data, as this information is distributedets into bothan enhancement sets ratesionbursts is layer imposed.in of protection the consecutive Pro-MPEG With the block errors. purpose COP3 need The codes ofL toand fulfilling meet canD beparameters the identifiedthesame quality are by usedrein justquire- which de-layer conditions codesFEC bandwidthin broadcast are more among systems severe. those as a that Abenefit different fulfill of theirthe distributi target capacity qualityon of req theuire- of users. The redundancy generated ensures the aimed Q .Next,en- mentstecting:for of adjusting the (i) users the the minimum that code receive rate sequence between just the number thebase column-wise layer, among it is all assu and ofmed them; row- that (ii)approachingment (mean performance. time between In [54], blocks the authors with errorsproposed equal a UEP to 4 hours). The relation1 as their bitratesthe number do, in order of packets not to in lose this set,synchroniN ;and(iii)thesequencenum-zation. base layer packets might help avoid this performance irregularities. hancement layer protection is computed. To this end, two steps take the requiredwise FEC code codes. rate of the AL-FEC coded used for the protection ofschemeactual based parameter on LDPC values codes (UEP-LDPC),of the most suitable which configurationrelies on ateach Note that this schemeber gap is between also valid any for two the consecutive standard data codes, packets, as theyGd,asitremains place. In the first one, the code parameters are optimized resort- this layer2)Layer-Aware meets the overall Pro-MPEG one. Consequently, COP3 Code: the minimum FECan EWF-likepoint are approach not shown used to avoid for LT confusion. codes in The Section features IV-A. and constraints really constitute a subsetconstant. of the For enhanced the row-wise-generated ones, when N parity=0 packets,. For the the design two latter ing only to the packets belonging to the enhancement layer. Inthe codeof rate the that Layer-Aware can be used Pro-MPEG for the protection COP33 code,of the the enhanceme simple con-nt layerNote thatof the this scenarios UEP-LDPC5. are CONCLUSIONS specified code may next: be soft-decoded or hard- variables respectively are N1 = L and G1 =1,and,forthecolumn- second one, the base layer video packets are uniformly distributed alsocatenation equals the ofimposed the BL overall and EL value. parity matrices shown for the decoded for employment in the physical layer or application wise ones, respectively N2 = D and G2 = L. Channel error distributions: independent losses and burst layer, respectively.• 3 6 4.2. Experiments andEWFBased results code on in this Fig. straightforward 10 does not observation, provide a suitable and with solution. theIn aim thisof paper welosses have presented (REIN model, an 8 enhancement ms) with PLRs of from the10 well-kn− to 10own− This is because in the situation that a row of the matrix 1) LDPC Code: LDPC codes belong to the family of linear increasing the packet recovery capability of standard Pro-andMPEG widely-utilizederror correctingAdditional codes AL-FEC defined latency Pro-MPEG by due a sparse to FEC: COP3 parity-check 100 codes, ms and matrix consistin 400 ms gon Figure 5 presents theCOP3 resultscannot codes, derived be corrected we frompropose by applying the the single enhancement the BL-protection proposed of these LA- FEC codes symbol througallowingh the addition• of a third dimension of parity packets.The available, the single FEC symbol provided for the EL will have H, where theVideo legitimate bitrates: codeword 2 Mbpsc andhas 6to Mbps satisfy the parity- FEC scheme to the protectionallowing the of addition a set of oftwo-layered an extra dimension, videos. that T is,he generating a • T the same problem and hence a combined decoding becomesintroductioncheck equations of extra of parametersHc = 0. The raise parity-check the number matrix ofH possibcan le com- difference between thethird videos set ofemployed parity packets in withthe simulations parameters N r3eliesand G on3.Anexamplebinationsbe represented of dataPacket packets by a sending Tanner within grapharrangement: a FEC [132] block4, constant constituted for repair by packe the tcom- isimpossible. shown in Fig.2. To circumvent this and to provide further benefits, • the bitrate of the enhancementwhen the layer,BL and which EL are respectively jointly decoded, is 0.45, the authors 0.35, of [114]putation. ThisThe action results increases obtained thefor the capability different configurations of Pro-MPEG areCOP3 depicted For a certain k,thenewvariablesneedtofulfillthefollowing4 0.25 and 0.15 times the bitrate of the base layer, which is of 4 Mbps. Tanneringraphs Fig. 3. were First, proposed it can by be Michael observed Tanner that forconstructing at low PLRs long the proposed constraintproposed to a fit reordering the matrix: of the BL source bits for the EL FECcodeserror to adapt correcting to codes the specificfrom multiple conditions shorter ones ofusing the recursive communicatio techniques. nchan- Results are shown in termsdata generation, of the percentage as shown of in blocks Fig. 17. that contain at nel, and,Tanner thus, graphsscheme increase contain does two nottheir types present ofrecovery nodes, a namely better capability. check performance nodes and variable than the standard nodes [133].codes, For as linear the block latter codes, are capable such as LDPC, of satisfying the check the and target variable criterion with least one uncorrectable error.The scheme What illustrated is depictedN3 inG3 Fig. is= thek 17= value operatesD L obtained as follows. The Additionally,(1) we have described a Layer-Aware FEC scheme in · · nodes representvery low the extrarows and bandwidth columns of for the any parity-check scenario matrix (theyH, perform respec- very close BL source data of Fig. 17 (a) is protected by its own FEC code, i j when applying the best codeThat means that the that, imposed once D limitationsand L are specified, allow, i.e., just a numberwhich oftively. the Anto enhanced edge or meet connects thePro-MPEG the ideal check nodecode). COP3to the However, variable codes node as are the, if used PLRthe entry to increases, increase our amaximumFEClatencyof400ms,theminimumFECcoderatefollowing the standard Pro-MPEG COP3 procedure [131]. For (i, j) of matrix H is nonzero. combinations of N3 and G3 are allowed. the QoS ofproposal two-layered gets more video efficient distribution. than the standard To that Pro-MPEG end, base codelayersfor presented in the graphs,The and introduction the bitrate of athe third respective dimension la doesyer. not imply thepackets addi- aremost distributed of the configurations, among the repair specially, packets as expected, of the enha whenncement the REIN Unlike in the firsttion part of extra of the redundancy, paper, the but FEC it increments overhead the is number now of avalayer.ilable Witherror this distribution straightforward applies, step, as new the configurations recovery capabi can helplity boost of the error fixed in order to showconfigurations the better to handling better utilize of the the FECavailable bandwidth FEC bandwidtprotectionh. Or, schemedecorrelation. is significantly In the scenarios raised. with the independent randomerror performed by the LA-FEC scheme when compared to the one in We have carried out a series of experiments concerning both which the protection of the two layers is completed independently. parts of the work presented in the paper. First, we have testedtheper- As can be observed, the results obtained after applying the LA- formance of the proposed enhanced Pro-MPEG COP3 codes when FEC scheme are significantly better compared to the ones in which compared with the standard ones. Next, we have analyzed the appli- the two layers are protected independently, especially for the base cation of the presented LA-FEC scheme, comparing it to a standard layer. In addition, results are generally better if the conditions im- scheme, in which no layer dependency is considered. Simulation re- posed become less restrictive, that is, if the minimum FEC code rate sults show a benefit in the introduction of the proposed protection decreases and the bitrate of the enhancement layer gets greater (as- mechanism.