electronics

Article

A Trellis Based Temporal Rate Allocation and Virtual Reference Frames for High Efficiency Video Coding

Xiem HoangVan 1,*, Le Dao Thi Hue 1 and Thuong Nguyen Canh 2

1 Faculty of Electronics and Telecommunications, University of Engineering and Technology, Vietnam National University, 144 Xuan Thuy Rd, Dich Vong Hau, Cau Giay Dist, Hanoi 10000, Vietnam; [email protected]
2 Institute for Datability Science, Osaka University, 2-8 Yamadaoka, Suita, Osaka 565-0871, Japan; [email protected]
* Correspondence: [email protected]

Abstract: The High Efficiency Video Coding (HEVC) standard has now become the most popular video coding solution for video conferencing, broadcasting, and streaming. However, its compression performance is still a critical issue for adopting a large number of emerging video applications with higher spatial and temporal resolutions. To advance the current HEVC performance, we propose an efficient temporal rate allocation solution. The proposed method adaptively allocates the compression bitrate for each coded picture in a group of pictures by using a trellis-based dynamic programming approach. To achieve this task, we trained the trellis-based quantization parameter for each frame in a group of pictures considering the temporal layer position. We further improved coding efficiency by incorporating our proposed framework with other inter prediction methods such as a virtual reference frame. Experiments showed around 2% and 5% bitrate savings with our trellis-based rate allocation method without and with a virtual reference frame, respectively, compared to the conventional HEVC standard.

Keywords: inter coding; HEVC standard; virtual reference; dynamic programming; adaptive quantization

Citation: HoangVan, X.; Dao Thi Hue, L.; Nguyen Canh, T. A Trellis Based Temporal Rate Allocation and Virtual Reference Frames for High Efficiency Video Coding. Electronics 2021, 10, 1384. https://doi.org/10.3390/electronics10121384

Academic Editor: Otoniel Mario López Granado

Received: 10 May 2021; Accepted: 7 June 2021; Published: 9 June 2021

1. Introduction

1.1. Context and Motivations

The wide growth of multimedia applications always demands more powerful video transmission over the internet with high compression performance as well as low complexity. Therefore, the Joint Collaborative Team on Video Coding (JCT-VC) announced the High Efficiency Video Coding (HEVC) standard [1] with a 50% bitrate reduction at the same perceptual quality as the previous H.264/AVC standard [2]. Despite its success, there is still a demand for beyond-HEVC coding to meet emerging requirements of higher resolutions (i.e., 4K, 8K) and diverse contents (screen content, drone footage, etc.). In this context, in the newest video compression standard, H.266/Versatile Video Coding (VVC), many coding tools have been proposed, such as partitioning with quad-tree plus binary tree, adaptive transforms, new intra modes, affine motion estimation, dependent quantization, etc. [2]. Among all, inter coding tools always bring the highest bitrate reduction due to the high redundancy between consecutive temporal frames. Besides affine motion estimation, virtual reference frames and temporal rate control are active research topics for improving inter coding performance [3].

Rate control is one of the most powerful encoding tools to advance the rate–distortion (RD) performance of any video codec. In general, at a given quality, rates are allocated from coarse to fine: (i) at the group of pictures (GOP), or temporal, level; then (ii) distributed among coding units at the spatial level; or (iii) allocated adaptively to reduce distortion propagation. Adaptive approaches are complicated and demand computational complexity and memory for either pre-analysis or online analysis. As a result, the common test condition of the most recent video coding standards, including H.266/VVC, still favors a


simple hierarchical rate allocation at the frame level [2]. For each GOP, HEVC and VVC classify frames by a temporal index (Tid), and depending on their hierarchical location and frame type (i.e., intra or inter), the quantization parameter (QP) is adjusted by a fixed value. This framework is very simple and makes it easy to validate the performance of coding tools during the standardization process. However, with increasing competition from AOMedia Video 1 (AV1) [4] and its adaptive GOP, an effective temporal rate allocation for MPEG codecs is in demand.

HEVC adopts a simple temporal rate control via its hierarchical coding structure [1]. It controls the bits for each frame in a GOP, with a temporal index associated with its hierarchical reference model, by adjusting the quantization parameter for each frame so that high-Tid frames are encoded at a higher QP. However, this simple method is not effective in preventing the propagation of distortion. Researchers have introduced methods that adaptively allocate bitrates to each frame via a rate–distortion optimized scheme [5–7]. These methods, however, required modifications in the decoder, which are non-compliant with the HEVC standard. An adaptive quantization parameter at the frame level, by contrast, is an encoder-only tool that is easy to modify in the configuration, and adaptive quantization at the frame level would significantly improve the overall compression performance. Nevertheless, being an encoder-only tool, it is often left out of the MPEG (Moving Picture Experts Group) standardization process; thus, there are few works on this research topic.

The virtual reference frame (VRF) is another inter coding improvement approach based on (i) generating virtual frames and (ii) utilizing the virtual frames as additional references or for guiding reconstructed frames [8]. Several works utilized motion estimation or advanced deep learning frameworks to interpolate frames from decoded frames [9–13].
The deep learning approach shows higher gains but also introduces tremendous complexity in both the encoder and the decoder [11–13]. On the other hand, virtual frames can be utilized as additional reference frames for motion estimation or for reducing the syntax transmission overhead as a special merge mode. Although there are many works on VRF, investigations of the impact of virtual frames on coding performance are lacking.

1.2. Contributions and Paper Organization

Although the virtual referencing-based method has provided important compression improvements for HEVC, there is still room for further improving HEVC performance. Previous works researched either the virtual reference frame or temporal rate control separately. Firstly, prior VR creation mainly relied on the motion estimation and interpolation-based approach, which may be ineffective for fast-motion content or content compressed with high QPs (i.e., at low rates). Secondly, prior VR frames were used for all B-slices, which may not always be effective in a rate–distortion optimization (RDO) sense, especially for pictures at low temporal layer positions. Finally, since the quality of the VR frames highly depends on the quality of the existing decoded references, adaptively quantized frames at different layers would impact the performance of inter coding. In this context, this paper proposes:

(i) a novel virtual reference frame creation in which a multiple hypothesis motion estimation method is used for frame generation;
(ii) an efficient rate allocation algorithm in which the trellis coding method is used to learn the temporal rate allocation.

We conducted a rich set of experiments with numerous training and test videos and showed around 5% BD-rate gains on top of the most recent HEVC reference software.

The rest of this paper is organized as follows. Section 2 briefly describes the background work on HEVC inter coding and the related works on rate control. Section 3 presents the proposed VRF creation and temporal rate allocation, while Section 4 assesses the coding performance of the proposed methods. Finally, we summarize the contributions of this paper in Section 5.


2. Background and Related Works

HEVC adopted inter-coding tools to take advantage of the temporal correlation among frames in a GOP. This paper introduces novel temporal rate allocation and virtual reference frame exploitation for HEVC inter coding; hence, this section briefly describes the background work on HEVC inter coding and then analyzes the related works on HEVC rate control.

2.1. HEVC Inter Coding

Compared to the prior inter-coding tools of the H.264/AVC standard [2], HEVC introduces several improvements in coding tools and coding structure, notably (i) block partitioning and (ii) motion estimation and reference picture management.

Block partition: HEVC inter coding allows compressing a picture with more block partition shapes than intra coding. Notably, the partition modes PART_2N × 2N, PART_2N × N, and PART_N × 2N indicate the cases when the coding block is not split, or is split into two equal-size prediction blocks horizontally or vertically, respectively [1], while PART_N × N refers to the case when the coding block is split into four equal-size blocks. In addition, there are four asymmetric motion partitions, PART_2N × nU, PART_2N × nD, PART_nL × 2N, and PART_nR × 2N (see Figure 1). For each coding tree unit (CTU), the rate–distortion optimization (RDO) process is recursively computed to find the optimal coding and prediction unit, notably from 64 × 64 down to 8 × 8.

Figure 1. Illustration of block partitions in HEVC inter coding.
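The recursive RDO split decision described above can be sketched as follows. This is a minimal illustration only: a real HEVC encoder evaluates all prediction modes and partition shapes per coding unit, while here a single hypothetical `rd_cost(size)` callback stands in for that per-CU mode search, and only the quad-split decision is modeled.

```python
def best_rd_cost(size, rd_cost, min_size=8):
    """Return (cost, tree) for a square CU of `size`, choosing between
    no-split and a quad-split into four size/2 sub-CUs (illustrative)."""
    no_split = rd_cost(size)                  # J = D + lambda * R for this CU
    if size <= min_size:
        return no_split, size
    sub_cost, sub_tree = best_rd_cost(size // 2, rd_cost, min_size)
    split = 4 * sub_cost                      # four equal-size sub-CUs
    if split < no_split:
        return split, [sub_tree] * 4
    return no_split, size

# Toy cost model: the 64x64 block is assumed "complex" and costs extra.
cost = lambda s: s * s * (1.5 if s == 64 else 1.0)
j, tree = best_rd_cost(64, cost)              # splits 64x64 into four 32x32
```

The recursion mirrors the CTU search from 64 × 64 down to 8 × 8: a subtree is kept only when the sum of its children's RD costs beats coding the block whole.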

Motion estimation and reference picture management: Similar to the prior video coding standard, HEVC inter coding employs motion estimation to find the best-matched block in reference frames to reduce the redundancy between successive frames. For reference picture management, HEVC stores the previously decoded pictures in a decoded picture buffer (DPB). To identify these pictures, a list of picture order count (POC) identifiers is transmitted in each slice header, and the set of reference pictures is called the reference picture set (RPS), as shown in Figure 2. The HEVC DPB also contains two lists, list 0 and list 1, which are referred to through the reference picture index. For biprediction, i.e., in the random access coding configuration, two pictures are selected (one from each list).
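The DPB/RPS bookkeeping above can be sketched as follows. This is a simplified model under stated assumptions: the real HEVC signals the RPS per slice and derives the reference lists with additional rules (long-term references, list modification), none of which are modeled here.

```python
class DecodedPictureBuffer:
    """Minimal DPB sketch: store pictures by POC and build the two
    reference lists for the current picture from a given RPS."""

    def __init__(self):
        self.pictures = {}                # POC -> decoded picture (placeholder)

    def store(self, poc, picture):
        self.pictures[poc] = picture

    def reference_lists(self, current_poc, rps):
        # Keep only RPS members actually present in the buffer.
        kept = [poc for poc in rps if poc in self.pictures]
        # List 0: past pictures, nearest first; list 1: future, nearest first.
        list0 = sorted([p for p in kept if p < current_poc], reverse=True)
        list1 = sorted([p for p in kept if p > current_poc])
        return list0, list1

dpb = DecodedPictureBuffer()
for poc in (0, 8, 4):
    dpb.store(poc, f"frame{poc}")
l0, l1 = dpb.reference_lists(current_poc=2, rps=[0, 4, 8])
# Biprediction would pick one reference from each list, e.g. (l0[0], l1[0]).
```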

2.2. HEVC Rate Control

Rate control has been an important research topic in video coding over the years. The temporal domain is an effective place to reduce the transmission rate. HEVC uses a fixed hierarchical bit rate allocation: it controls the bits for each frame in a GOP with a temporal index (Tid) associated with its hierarchical reference model, depicted in Figure 2. By simply adjusting the QP for each frame, high-Tid frames are encoded at a higher quantization parameter (QP). However, this simple method is unable to prevent distortion propagation. Therefore, various rate control algorithms were developed for video coding standards.

For HEVC, an early rate control algorithm was proposed using a pixel-wise unified rate-quantization (R-Q) model [14]. This R-Q model is almost the same as the conventional quadratic R-D model in [15]. Afterward, Li et al. [16] proposed an R-λ model, where λ is the Lagrange multiplier based on a frame's complexity. This rate control algorithm allocates target bits to a GOP, a frame, or a CTU. Further extending this work, the authors in [17] proposed a gradient-based R-λ model and inter-frame rate control for HEVC. In order to enhance facial detail in video conferencing, the work in [18] developed a rate control algorithm based on a weighted R-λ model, which can allocate more target bits to important regions in a video frame. Considering the coding performance of intra coding, a structural similarity (SSIM) based game theory approach was introduced in [19] to optimize the CTU-level bit allocation. Similarly, a ρ-domain bit allocation and rate control algorithm was proposed in [20] to optimize bit allocation for key frames. Recently, the authors in [21] presented a highly parallel hardware architecture for rate estimation in HEVC intra coding. Although the aforementioned rate control algorithms [14–20] show promising performance for HEVC, their high computational complexity may not be suitable for a number of emerging video applications.

Figure 2. Example of a temporal prediction structure in HEVC inter coding.
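The fixed hierarchical allocation described in Section 2.2 can be sketched as follows. The GOP-8 Tid pattern and the per-Tid offsets below follow the usual random-access layout but are illustrative placeholders, not the normative configuration values, which depend on the encoder configuration file.

```python
# Map a POC position inside a GOP of 8 to its temporal layer (Tid),
# following the common random-access hierarchy (illustrative).
TID_OF_POC_IN_GOP8 = {0: 0, 4: 1, 2: 2, 6: 2, 1: 3, 3: 3, 5: 3, 7: 3}

# Fixed QP offset per temporal layer (illustrative values): higher-Tid
# frames, referenced by fewer pictures, are quantized more coarsely.
QP_OFFSET = {0: 0, 1: 1, 2: 2, 3: 3}

def frame_qp(poc, base_qp):
    """QP = base QP + fixed offset determined by the frame's Tid."""
    tid = TID_OF_POC_IN_GOP8[poc % 8]
    return base_qp + QP_OFFSET[tid]

qps = [frame_qp(poc, 32) for poc in range(8)]   # QPs across one GOP
```

This is exactly the scheme the proposed TRA later refines: instead of the fixed `QP_OFFSET` table, the per-Tid offsets are learned.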

In this work, we focused on developing a frame-level rate control algorithm. In order to do so, the learning process requires training data with raw video contents at various sizes and resolutions. Using recently advanced learning frameworks such as deep learning [13] would therefore require frame-level features for multiple frames in each GOP, and thus demands a tremendous amount of computation, not only at training but also at deployment. Additionally, learning frameworks often rely on a quality metric such as MSE, which does not well reflect the commonly used BDBR score in codec development. As a result, with our limited computing resources, we were not able to use a more expensive learning algorithm. To address this problem, and motivated by RDOQ, we propose a simple but efficient temporal rate allocation solution for HEVC following a greedy method based on the trellis coding approach.
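The BDBR (Bjontegaard delta rate) score mentioned above can be computed from four rate–PSNR points per codec. The sketch below is a simplification, assuming exact Lagrange interpolation of log-rate versus PSNR instead of the cubic least-squares fit used in the standard definition; the test data are hypothetical.

```python
import math

def lagrange(points, x):
    """Evaluate the polynomial interpolating `points` [(x_i, y_i)] at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        term = yi
        for j, (xj, _) in enumerate(points):
            if i != j:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def bd_rate(anchor, test, samples=256):
    """anchor/test: four (bitrate, psnr) pairs each. Returns the average
    rate change of `test` vs. `anchor` at equal quality, in percent."""
    a = [(p, math.log10(r)) for r, p in anchor]
    t = [(p, math.log10(r)) for r, p in test]
    lo = max(min(p for p, _ in a), min(p for p, _ in t))
    hi = min(max(p for p, _ in a), max(p for p, _ in t))
    xs = [lo + (hi - lo) * k / (samples - 1) for k in range(samples)]
    diff = sum(lagrange(t, x) - lagrange(a, x) for x in xs) / samples
    return (10.0 ** diff - 1.0) * 100.0

anchor = [(1000, 34.0), (1800, 36.0), (3200, 38.0), (6000, 40.0)]
test = [(r * 0.95, p) for r, p in anchor]   # 5% cheaper at equal quality
print(round(bd_rate(anchor, test), 2))      # → -5.0
```

A negative BD-rate means the test codec needs fewer bits for the same PSNR, which is why the greedy search below can use it directly as a loss.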


3. Proposed HEVC Improvement Tools

In this work, we aimed to develop a simple, non-normative MPEG video coding tool by integrating a VR creation solution into HEVC and refining the existing temporal bitrate allocation scheme. We avoided the pre-analysis approach, which requires looking ahead several frames or GOPs and thus introduces significant complexity, making it unsuitable for real-time applications. To achieve this target, we readjusted the predefined heuristic quantization levels with a learned method. Unfortunately, the learning approach usually faces the challenges of the huge computational complexity of video coding, the variety of datasets, and the nonlinearity of the encoding process [12]. We addressed these challenges by using a dynamic programming approach: trellis-based rate allocation (TRA). The proposed TRA algorithm was integrated into both the original HEVC and HEVC with virtual reference frames (VRF). Hence, this section introduces a novel HEVC framework with a virtual reference frame, followed by the proposed TRA solution.

3.1. Proposed HEVC Architecture

The proposed HEVC encoding architecture is described in Figure 3, with the modified modules highlighted. Virtual reference frames are generated from the reference decoded frames based on multiple hypothesis motion estimation and then put into the DPB. Both the previously decoded references and the virtual references are used as references for motion estimation of the current frame.

Figure 3. Proposed HEVC architecture.

3.1.1. Virtual Reference Frame Creation

To achieve a highly efficient encoder, we created a new virtual reference frame for each inter coding frame based on a hierarchical motion estimation (HME) and compensation technique. To fully exploit the statistical information from decoded data and the texture correlation between two consecutive references, we proposed an advanced motion-compensated temporal interpolation (MCTI) based VR frame creation, in which the motion vector field is adaptively generated using the hierarchical motion estimation (ME) solution, with a coarse block size of 16 × 16 or 32 × 32 initialized in the backward ME. In this stage, the minimization of the regularized mean absolute difference (RMAD) was adopted [22]. Figure 3 highlights the proposed MCTI structure. Since a large block size can hardly cover the motion of tiny activity or of video taken from a far camera, a finer block size may be chosen. Afterward, motion vector refinement was adopted to refine the motion field. Finally, motion compensation was used to create the VR frame.
The creation of the VR frame is performed as follows:
• Hierarchical ME: First, the decoded frames obtained from the two reference lists, list 0 and list 1, are low-pass filtered and used as references in a motion estimation process. Our proposed VR frame creation uses both forward and backward ME to generate the forward interpolated frame and the backward interpolated frame. In these modules, a block matching algorithm is used to estimate the motion between the next and previous decoded frames.

To achieve more accurate motion vector information, an up-sampled motion vector field (MVF) was computed. The motion vector values obtained from the previous step are integer numbers, so jagged edges would appear in the interpolated frame. To alleviate this problem, we up-sampled the reference frames by a factor of 2 in both the horizontal and vertical directions, as shown in Figure 4. After up-sampling the MVF, each MV is coarsely refined by considering its 8 neighboring MVs and choosing the MV with the smallest error cost.
• MV refinement: To achieve a better motion field, a motion vector refinement (MVR) process is employed. In MVR, the temporal bidirectional ME (BiME) and the spatial weighted vector median filtering (WVMF) are chosen to refine the motion information derived from the hierarchical ME stage [22,23]. In BiME, the motion vectors of each interpolated block are refined in a small search area, following the assumption that the motion trajectory between consecutive frames is linear. The spatial WVMF improves the spatial coherence of the motion field by looking, for each interpolated block, for a candidate motion vector at neighboring blocks that better represents the motion trajectory. This filter is also adjustable by a set of weights controlling the filter strength, depending on the block distortion for each candidate motion vector. Since the quality of the decoded references and the video content highly affect the final VR frame quality, we adopted a statistical learning-based parameter optimization solution to initialize the block size, search range, and search refinement areas for the proposed MCTI method [24].
• Motion compensation: Finally, the motion compensation process is applied to the decoded frames and the obtained MVs to create the VR frames.


Figure 4. Hierarchical ME: (a) Upsample MVF; (b) MV selection.
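The coarse 8-neighbor MV selection step above can be sketched as follows. The cost function and grid layout are hypothetical stand-ins: the paper minimizes a regularized mean absolute difference (RMAD), while the sketch accepts any per-block matching cost.

```python
def refine_mvf(mvf, cost):
    """Coarse MV selection: replace each block's MV by the candidate among
    itself and its 8 spatial neighbours with the smallest matching cost.
    mvf: 2-D grid of (dx, dy) tuples; cost(bx, by, mv) -> matching error."""
    h, w = len(mvf), len(mvf[0])
    out = [row[:] for row in mvf]
    for y in range(h):
        for x in range(w):
            cands = {mvf[y][x]}
            for dy in (-1, 0, 1):             # gather 8-neighbour candidates
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        cands.add(mvf[ny][nx])
            out[y][x] = min(cands, key=lambda mv: cost(x, y, mv))
    return out

# Toy setup: the true motion is (1, 0) everywhere; one block is an outlier.
cost = lambda x, y, mv: abs(mv[0] - 1) + abs(mv[1])
field = [[(1, 0), (5, 5)], [(1, 0), (1, 0)]]
refined = refine_mvf(field, cost)             # outlier (5, 5) snaps to (1, 0)
```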



3.1.2. Virtual Reference Frame Exploitation

To better use this new reference, we conducted exhaustive experiments to identify which frame position should be employed to generate VR frames. The generation of the VR frame benefits from the low-complexity motion estimation and compensation approach combined with a data-driven optimal parameter configuration. The proposed VR is applied to B-frames with bi-directional references as in [25].

3.2. Trellis-Based Rate Allocation (TRA)

Trellis search is a type of dynamic programming algorithm for finding an optimal encoded sequence [26]. In video coding, trellis search is used for optimal quantization in the rate–distortion optimized quantization (RDOQ) [27] of HEVC. Starting with an initial scalar quantization level (i.e., x), RDOQ recursively decides the optimal quantization level at each coefficient location among the candidates {x, x − 1} (an additional candidate of 0 can be added) toward minimizing the rate and distortion cost.

The overall steps of our TRA are given in Algorithm 1. We modeled the bitrate allocation as an optimal search path of QP adjustments across temporal IDs (Tid; illustrated in Figure 5). The details are explained as follows. To avoid the worst case of the greedy algorithm, we: (1) initialized the search with ∆QP^0 = [0, 0, 0, 0], which is the same as the HEVC configuration; and (2) used BDBR as the loss function to pick the path with the highest BDBR improvement.
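The RDOQ decision that motivates the TRA can be sketched as follows. The rate term here is a crude stand-in (roughly the number of bits in the level's binary representation), not HEVC's context-based rate estimate, so the sketch only illustrates the candidate structure {x, x − 1, 0} and the J = D + λR criterion.

```python
def rdoq(coeffs, step, lam):
    """Per-coefficient level decision among {x, x-1, 0}, minimizing
    J = distortion + lam * (approximate rate). Illustrative only."""
    levels = []
    for c in coeffs:
        x = round(abs(c) / step)                       # hard-quantized level
        best = min(
            {x, max(x - 1, 0), 0},
            key=lambda l: (abs(c) - l * step) ** 2     # distortion
                          + lam * ((l.bit_length() + 1) if l else 0),  # ~rate
        )
        levels.append(best)
    return levels

print(rdoq([12.0, 5.2, 0.9], step=4.0, lam=2.0))       # → [3, 1, 0]
```

Note how the small coefficient 0.9 is zeroed outright: spending any rate on it costs more than the distortion it would remove. The proposed TRA transfers exactly this greedy candidate-pruning idea from coefficient levels to per-Tid QP offsets.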

Algorithm 1 Proposed trellis-based bitrate allocation algorithm.

#  Detail Descriptions
1  Initialize offset ∆QP^0 = [0, 0, 0, 0], max range r, no. of iterations K, excluded candidates c_ex^Tid = {0}, Tid = 0, 1, 2, 3
2  For k in {1, ..., K − 1}               # Phase loop
3      For i in {0, 1, ...}               # Iter loop
4          For Tid in {0, 1, 2, 3}        # Tid loop
5              Find excluded candidates c_ex^Tid ∈ {c_ex^Tid, ∆QP_Tid^k ± k > 1}
6              Current candidates c = {±k} \ {c_ex^Tid}
7              Find the best ∆QP_Tid^k ∈ c at Tid w.r.t. BDBR
8              Add excluded candidates: c_ex^Tid ∈ {c_ex^Tid, c}
9  Output ∆QP^K
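Algorithm 1 can be sketched as below. This is a simplified view: `bdbr` is a hypothetical stand-in for encoding the training set and measuring BD-rate against the HEVC anchor, and the exclusion bookkeeping of lines 5–8 is reduced to re-testing the currently kept offset alongside the ±k candidates of the phase.

```python
def tra_search(bdbr, num_tids=4, phases=2):
    """Greedy per-Tid search for QP offsets minimizing a BDBR-style loss."""
    dqp = [0] * num_tids                       # ∆QP^0 = [0, 0, 0, 0]
    for k in range(1, phases + 1):             # phase loop (K = phases)
        for tid in range(num_tids):            # Tid loop, most important first
            candidates = [dqp[tid], -k, +k]    # keep current value or move ±k

            def loss(v):
                trial = dqp[:]
                trial[tid] = v
                return bdbr(trial)             # lower BDBR = better

            dqp[tid] = min(candidates, key=loss)
    return dqp

# Toy loss with a unique optimum at the hypothetical target [0, 1, 2, 2]:
target = [0, 1, 2, 2]
loss = lambda d: sum((a - b) ** 2 for a, b in zip(d, target))
print(tra_search(loss))   # → [0, 1, 2, 2]
```

With two phases and four Tids, each phase costs at most 2 extra encodes per Tid, which matches the paper's emphasis on keeping the search budget small.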

Figure 5. Tid illustration with the HEVC Random Access configuration.


We then performed an iterative greedy search with a maximum of K phases. In each phase, we searched for the optimal QP offsets starting from Tid 0 to Tid 3 with respect to the best BDBR reduction. By starting from the most important picture (Tid 0) and moving to the least important one (Tid 3), we avoided large fluctuations in the algorithm (i.e., the BDBR gain tends to decrease after each step). An example of the Tid structure in HEVC is shown in Figure 5. The lower the Tid is, the more important the picture is, as it is referenced by more pictures at higher Tids. In our practice, we used only two phases, K = 2.

The best QP offset at iteration k and Tid i (denoted as ∆QP_i^k) is added to the optimal list ∆QP^K from the full candidate list {0, ±1, ..., ±k}, which already excludes the candidates removed in the previous iteration (i.e., ∆QP_i^(k−1)). The maximum offset range was set to k in this work to limit the complexity. Based on the bitrate reduction (BDBR) over a given video dataset, we selected the best offset for the high and low bitrate scenarios. Given four Tids, a maximum range of ±2 (thus five available options), and six QPs covering the low and high rates, the number of tests for each sequence in an exhaustive greedy search is
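The per-Tid greedy search of Algorithm 1 can be sketched as follows. The loss function here is a hypothetical quadratic stand-in for the real BDBR measurement, which would require encoding the training set for every candidate offset.

```python
# Sketch of the per-Tid greedy offset search in Algorithm 1, with a toy
# quadratic loss standing in for the real BDBR measurement.
def greedy_tid_search(loss, K=2, num_tids=4):
    dqp = [0] * num_tids                       # Phase 0: HEVC default offsets
    excluded = [{0} for _ in range(num_tids)]  # offsets already tested per Tid
    for k in range(1, K + 1):                  # phase loop: candidates {+k, -k}
        for tid in range(num_tids):            # Tid 0 first (most important)
            for c in {k, -k} - excluded[tid]:
                trial = dqp.copy()
                trial[tid] = c
                if loss(trial) < loss(dqp):    # keep offset only if it helps
                    dqp[tid] = c
                excluded[tid].add(c)
    return dqp

# Toy loss whose optimum is dqp = [1, 0, -1, 2] (a hypothetical target).
target = [1, 0, -1, 2]
best = greedy_tid_search(lambda d: sum((a - b) ** 2 for a, b in zip(d, target)))
print(best)
```

In the real pipeline, loss(trial) would run the HM encoder with the trial ∆QP offsets applied per temporal layer and return the measured BDBR against the anchor.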

nQP × r^4 = 6 × 5^4 = 3750. (1)

This still requires a tremendous amount of computation. Therefore, we used trellis-based dynamic programming to recursively search for the best offset of each Tid. One round of search (i.e., one phase) requires

nQP × (5 + 4 + 4 + 4) = 6 × 17 = 102, (2)

for each sequence that significantly reduces simulations. Even though the proposed TRA could significantly reduce the searching space, it still demands a huge computation given multiple test/train sequences and several iterations. Therefore, we proposed a simplified version of TRA with a two-phase iterative approach based on the reference candidates. In the first phase, we limited the candidate list to {0, ±1} which requires

nQP × (3 + 2 + 2 + 2) = 6 × 9 = 54, (3)

which is about half of the experiments in TRA. In the second phase, we only tested additional configurations that differ by ±1 from Phase 1. If the optimal offset is +1, then only +2 is evaluated (0 was already evaluated in Phase 1). We skipped the case where the optimal offset in Phase 1 is 0, as the difference to ±2 is larger than 1. Therefore, the maximum number of additional experiments per iteration is

nQP × (1 + 1 + 1 + 1) = 6 × 4 = 24. (4)

The combined number of experiments in (3) and (4) is significantly less than that of the original TRA in (2). To train TRA, a decision is made to adjust the quantization level, or ∆QP, at each Tid with respect to the RD performance of the whole sequence. Fortunately, with many sequences, we adopted a simple offline training scheme in which the decision is made based on the final BDBR performance. Then, we applied the learned TRA offsets to the test sequences without fine-tuning to avoid the complexity overhead. Therefore, we learned a data-dependent group of quantization offsets ∆QP_i, i = 0, 1, 2, 3, while previous works independently learn to adjust ∆QP at a given QP setting.
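As a quick sanity check, the experiment counts in (1)-(4) can be reproduced in a few lines (note that 6 × 9 = 54 and 6 × (1 + 1 + 1 + 1) = 24):

```python
# Quick check of the search-space sizes in Equations (1)-(4), assuming
# n_qp = 6 rate points, r = 5 offset candidates {0, +-1, +-2}, and 4 Tids.
n_qp, r, tids = 6, 5, 4

exhaustive = n_qp * r ** tids        # Eq. (1): every offset combination
one_phase = n_qp * (5 + 4 + 4 + 4)   # Eq. (2): greedy, later Tids skip one repeat
phase1 = n_qp * (3 + 2 + 2 + 2)      # Eq. (3): candidates {0, +-1} only
phase2 = n_qp * (1 + 1 + 1 + 1)      # Eq. (4): at most one extra offset per Tid

print(exhaustive, one_phase, phase1, phase2)
```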

4. Performance Evaluation
4.1. Training and Testing Conditions
To evaluate the coding performance of the proposed methods, Y-BDBR [28] is calculated under the Random Access (RA) configuration at the common test conditions [29]. We used HM 16.20 [30] with a GOP of 8, an intra period of 32, and various quantization parameters for the high rate (QP: 22, 27, 32, 37), which corresponds to the common test condition in HEVC, and the low rate (QP: 32, 37, 42, 45). For trellis-based rate allocation, we chose 16 training sequences at various resolutions and frame rates, while the test sequences are other common sequences in HEVC [31]; additional sequences can be found in [32,33]. Only the first 64 frames are used to reduce the training time while still producing high performance. The training and test sequences are listed in Table 1, while Figures 6 and 7 illustrate the first frame of each video sequence.

To illustrate the effectiveness of the TRA method, its performance on the training dataset in BDBR is shown in Figure 8. It is easy to observe that TRA already achieves good performance even with Phase 1, which uses the ∆QP set {±1, 0}. Moreover, we observed that one iteration was enough for TRA to converge in both phases. For the 1080p training set, over Phase 1, the additional ∆QP set {±2, ±1, 0} in Phase 2 showed no BDBR improvement for TRA. Interestingly, the TRA and VRF combination slightly (i.e., −0.28%) and greatly (i.e., −1.54%) reduced the rate at a high and a low rate, respectively. The reason for no improvement of TRA at Phase 2 is that changing ∆QP at the frame level greatly impacts the overall performance; an offset of ±2 might therefore depart too much from the conventional rate. However, when combined with the VRF, a ∆QP of ±2 showed a significant gain of 1% BDBR on the training set on average. The loss in quality at a low rate is compensated by the additional virtual frames generated by VRF. Please refer to Table 2 for more results on the different training classes.
We observed that using the same ∆QP set for all training data shows only a minor quality improvement at a high rate. This is due to the local-optima nature of dynamic programming. At a low rate, an additional 0.31% Y-BDBR is achieved, while the results are almost identical at a high rate. Therefore, we adopted a simple resolution-based adaptive ∆QP selection. As the resolution is available as an input, the proposed method requires no additional complexity compared to other frameworks. More complex adaptive methods (e.g., based on fast/slow motion or simple/detailed scenes) could be an interesting topic for our future work.
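For reference, the Y-BDBR metric used throughout this evaluation can be computed in the common Bjøntegaard cubic log-rate fit form, sketched below. The exact tool of [28] may differ in interpolation details, and the RD points here are hypothetical.

```python
# Sketch of a Bjontegaard-delta-bitrate (BDBR) computation in the usual
# cubic-polynomial-fit style; negative values mean bitrate savings.
import numpy as np

def bd_rate(rate_ref, psnr_ref, rate_test, psnr_test):
    """Average bitrate difference (%) of 'test' vs. 'ref' at equal PSNR."""
    lr_ref, lr_test = np.log10(rate_ref), np.log10(rate_test)
    p_ref = np.polyfit(psnr_ref, lr_ref, 3)    # log-rate as cubic in PSNR
    p_test = np.polyfit(psnr_test, lr_test, 3)
    lo = max(min(psnr_ref), min(psnr_test))    # overlapping PSNR interval
    hi = min(max(psnr_ref), max(psnr_test))
    int_ref = np.polyval(np.polyint(p_ref), hi) - np.polyval(np.polyint(p_ref), lo)
    int_test = np.polyval(np.polyint(p_test), hi) - np.polyval(np.polyint(p_test), lo)
    avg_diff = (int_test - int_ref) / (hi - lo)
    return (10 ** avg_diff - 1) * 100

rates = [1000, 2000, 4000, 8000]   # hypothetical RD points (kbps)
psnrs = [32.0, 34.5, 36.5, 38.0]
better = [r * 0.95 for r in rates]  # 5% cheaper at equal quality
print(round(bd_rate(rates, psnrs, better, psnrs), 2))  # about -5.0
```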

Table 1. Training and testing sequences at various resolutions and frame rates.

Resolutions         Training Sequences                               Testing Sequences
1080p and Class A   ParkScene, BQTerrace, SnakeNDry, ReadySteadyGo   BasketballDrive, Kimono, Cactus, PeopleOnStreet, Traffic
720p and Class F    Parkjoy, InToTree, DucksTakeOff, Johnny          FourPeople, KristenAndSara, Vidyo1, Vidyo3, Vidyo4, ChinaSpeed, SlideEditing
480p                RaceHorses, BasketballDrill                      BQMall, PartyScene, BasketballDrillText
4SIF                Crew, Harbour                                    Ice
240p                RaceHorses, BQSquare                             BasketballPass, BlowingBubbles
CIF                 Akiyo, City                                      Hall, Foreman, Football

Figure 6. Illustration of the first frame of training videos. From left to right, top to bottom: ParkScene, BQTerrace, SnakeNDry, ReadySteadyGo, Parkjoy, InToTree, DucksTakeOff, Johnny, RaceHorses, BasketballDrill, Crew, Harbour, RaceHorses, BQSquare, Akiyo, City.

The learned ∆QP set is given in Table 2. Phase 2 only changes the Tid 0 offset in the 1080p, 720p, and 240p classes. The main reason is related to the GOP of 8, which has more Tid 0 frames and thus consumes a high rate. We also observed very similar results for the learned ∆QP with and without VRF, with the exception of 1080p. A more detailed rate distribution analysis can be found in Section 4.3.



Figure 7. Illustration of the first frame of testing videos. From left to right, top to bottom: BasketballDrive, Kimono, Cactus, FourPeople, KristenAndSara, Vidyo1, Vidyo3, Vidyo4, BQMall, PartyScene, Ice, BasketballPass, BlowingBubbles, Hall, Foreman, Football, PeopleOnStreet, Traffic, ChinaSpeed, SlideEditing.

Figure 8. BDBR performance of TRA on the training dataset for class B (1080p) at Phase 1 with ∆QP ∈ {±1, 0} and Phase 2 with ∆QP ∈ {±2, ±1, 0}; only one iteration is used. (Curves: TRA and TRA + VRF at the low and high rates; the BDBR axis spans 0% to −6%.)

4.2. Compression Performance Assessment

The coding performance on test sequences is evaluated in Table 3 for BDBR and Table 4 for BDPSNR comparisons. We compared the proposed TRA without (TRA) and with VRF (TRA + VRF) against the standard HEVC and the most relevant adaptive quantization method [3]. Similar to the training set, low-rate and high-rate BDBR are used for comparison. Overall, both TRA and VRF provide a consistent improvement over the baseline HEVC.

Table 2. Training performance (Y-BDBR % at low rate/high rate).

               Best ∆QP for Each Class Sequences                         Best ∆QP for All Training Sequences
Resolutions    Phase 1: ±1, 0              Phase 2: ±2, ±1, 0            Phase 1: ±1, 0              Phase 2: ±2, ±1, 0
               TRA           TRA + VRF     TRA           TRA + VRF       TRA           TRA + VRF     TRA           TRA + VRF
1080p          −2.49/−1.86   −4.00/−2.68   −2.53/−2.03   −4.05/−2.85     −2.49/−1.86   −4.00/−2.68   −3.77/−2.25   −5.27/−3.12
720p           −1.48/−0.84   −2.18/−0.84   −1.72/−1.53   −2.43/−1.50     −0.36/−0.72   −1.12/−0.84   −2.53/−1.50   −2.43/−1.50
480p           −1.44/−1.13   −3.38/−2.02   −1.49/−1.50   −3.38/−2.43     −1.44/−1.13   −3.38/−2.02   −2.03/−1.41   −4.08/−1.93
4SIF           −1.25/−3.40   −3.40/−1.00   −1.56/−0.60   −3.78/−1.56     −1.07/−0.55   −3.14/−1.48   −2.03/−0.41   −4.08/−1.93
240p           −0.76/−0.33   −3.49/−1.49   −1.05/−0.84   −3.54/−1.59     −0.83/−0.76   −3.40/−1.69   −1.63/−0.96   −4.10/−1.72
CIF            −1.21/−0.84   −2.67/−1.23   −1.26/−1.16   −2.71/−1.43     −1.07/−0.95   −2.67/−1.23   −1.68/−1.35   −1.53/−1.45
Average        −1.58/−1.01   −3.17/−1.61   −1.73/−1.40   −3.30/−1.96     −1.26/−1.07   −2.85/−1.68   −2.50/−1.45   −3.80/−1.96

Table 3. Y-BDBR (%) coding performance of various methods for test sequences, full frames.

                        Low Rate (QP = 32, 37, 42, 45)          High Rate (QP = 22, 27, 32, 37)
                        QPA [3]  TRA     VRF     TRA + VRF      QPA [3]  TRA     VRF     TRA + VRF
Class A and Class B
  BasketballDrive       −0.44    −1.38   −1.84   −3.64          −0.43    −1.51   −0.91   −2.21
  Kimono                −0.49    −0.99   −3.09   −4.82          −1.69    −1.14   −1.75   −2.65
  Cactus                −0.17    −1.69   −3.33   −6.06          −0.09    −1.20   −1.87   −2.96
  PeopleOnStreet        N/A      −1.11   −8.35   −9.67          N/A      −1.28   −4.56   −5.69
  Traffic               N/A      −1.89   −4.14   −7.98          N/A      −1.95   −4.05   −4.68
HD 720p
  FourPeople            −0.71    −1.67   −2.71   −4.53          −0.68    −0.40   −2.38   −3.02
  KristenAndSara        −0.66    −1.75   −2.56   −4.37          −0.56    −0.68   −1.92   −2.77
  Vidyo1                −0.70    −2.03   −4.45   −6.43          −0.88    −0.55   −3.26   −4.03
  Vidyo3                −0.73    −2.24   −3.35   −5.83          −0.61    −0.63   −1.85   −2.70
  Vidyo4                −0.90    −1.48   −3.23   −4.92          −0.64    −0.29   −1.84   −2.35
  ChinaSpeed            N/A      −1.98   −1.60   −3.65          N/A      −0.81   −0.64   −1.44
  SlideEditing          N/A      −0.01   0.02    0.07           N/A      0.27    −0.06   0.26
Class C 480p
  BQMall                −0.77    −1.39   −3.96   −5.33          −0.88    −0.94   −1.86   −2.78
  PartyScene            −0.57    −2.80   −1.70   −4.44          −0.61    −1.18   −0.65   −1.83
  BasketballDrillText   N/A      −2.24   −2.22   −4.37          N/A      −1.61   −1.11   −2.69
4SIF
  Ice                   −0.68    −0.80   −9.33   −9.90          −0.97    −0.77   −4.23   −4.60
Class D
  BasketballPass        −0.43    −0.81   −4.52   −5.99          0.84     0.27    −1.80   −2.33
  BlowingBubbles        −0.57    −2.42   −1.49   −4.63          0.26     0.71    −0.65   −1.00
CIF
  Hall                  −0.68    −2.07   −2.04   −4.30          −0.88    −1.92   −0.92   −3.21
  Foreman               −0.41    −1.48   −4.03   −5.50          −0.49    −1.25   −2.12   −3.49
  Football              −0.82    −0.41   −1.89   −2.99          −0.72    −0.45   −0.56   −1.07
Average                 −0.61    −1.55   −3.32   −5.20          −0.56    −0.82   −1.86   −2.73

In addition, to explore the effectiveness of the proposed TRA method, we conducted further experiments with relevant frame-based QP adaptation methods, notably the QP-λ model proposed in [34], where QP and λ are modeled as a linear function, and our recent work in [35], where QP and the RD cost are modeled as a polynomial function. The model parameters were selected as in [34,35]. We also slightly modified the TRA to change the QP of only Tid 0, named TRA_0. The BDBR comparison is given in Table 5.

Table 4. Y-BDPSNR (dB) coding performance of various methods for test sequences, full frames.

                        Low Rate (QP = 32, 37, 42, 45)     High Rate (QP = 22, 27, 32, 37)
                        TRA     VRF     TRA + VRF          TRA     VRF     TRA + VRF
Class A and Class B
  BasketballDrive       0.04    0.06    0.12               0.03    0.02    0.05
  Kimono                0.03    0.11    0.17               0.03    0.05    0.08
  Cactus                0.06    0.11    0.21               0.02    0.04    0.07
  PeopleOnStreet        0.05    0.40    0.46               0.06    0.20    0.25
  Traffic               0.08    0.16    0.34               0.07    0.31    0.16
HD 720p
  FourPeople            0.09    0.14    0.24               0.02    0.09    0.11
  KristenAndSara        0.09    0.13    0.22               0.02    0.06    0.09
  Vidyo1                0.10    0.22    0.33               0.02    0.11    0.13
  Vidyo3                0.11    0.17    0.30               0.02    0.06    0.09
  Vidyo4                0.06    0.14    0.22               0.01    0.06    0.07
  ChinaSpeed            0.09    0.07    0.16               0.04    0.03    0.08
  SlideEditing          0.00    0.00    −0.01              −0.04   0.01    −0.04
Class C 480p
  BQMall                0.06    0.17    0.23               0.04    0.07    0.11
  PartyScene            0.10    0.06    0.16               0.05    0.03    0.08
  BasketballDrillText   0.10    0.10    0.19               0.07    0.05    0.12
4SIF
  Ice                   0.04    0.44    0.46               0.03    0.14    0.16
Class D
  BasketballPass        0.03    0.18    0.24               −0.01   0.09    0.11
  BlowingBubbles        0.08    0.06    0.16               −0.03   0.03    0.04
CIF
  Hall                  0.11    0.11    0.23               0.06    0.03    0.10
  Foreman               0.06    0.17    0.23               0.05    0.09    0.15
  Football              0.01    0.07    0.11               0.02    0.03    0.06
Average                 0.07    0.15    0.23               0.03    0.08    0.10

Table 5. Y-BDBR (%) coding performance for TRA and other frame-based methods.

Sequence          QP-λ [34]                 QP-λ [34]                QPA [35]   QPA [3]   TRA_0    Proposed TRA
                  (a = 4.2005, b = 13.7112) (a = 4.2005, b = 9.7112)
BasketballPass    3.01                      11.89                    3.58       −0.43     −0.60    −0.81
BlowingBubbles    5.16                      8.39                     3.37       −0.57     −1.48    −2.42
Football          2.37                      15.65                    3.72       −0.82     −0.22    −0.41
Foreman           3.90                      10.59                    2.63       −0.41     −1.35    −1.48
Hall              5.39                      7.27                     2.91       −0.68     −1.62    −2.07
Average           3.97                      10.76                    3.24       −0.58     −1.05    −1.44

From the testing performance shown in Tables 3–5, some conclusions can be drawn:
• Overall, the compression performance of HEVC with the proposed TRA and VRF methods outperforms both HEVC with and without the QPA benchmark;
• The HEVC with TRA and VRF methods achieved a significant coding improvement for all test sequences, notably 5.2% and 2.7% BDBR on average for the low and high rate regions, respectively;
• The TRA method provides around 1.55% and 0.82% BDBR saving for test sequences at the low and high rate regions, respectively, while the VRF method provides around 3.32% and 1.86% BDBR saving;
• The proposed methods, both TRA and VRF, achieve better compression performance for the low rate region than for the high rate region;
• In most cases, the combination of TRA and VRF achieves even better compression performance than a simple addition where the compression gain of TRA is added to that of VRF. This composed effect motivates the use of both TRA and VRF in improving HEVC performance;
• Experimental results also show that no compression gain is achieved for screen content videos such as SlideEditing. This may come from the fact that the training set does not include any video with this content, and the VRF creation may also not work well for screen-captured videos;
• Compared to other frame-based QPA algorithms, the proposed method achieved better BDBR (see Table 5). It should be noted that the QPA proposed in [35] was mainly designed for surveillance video content and its high-order polynomial model is highly sensitive to the selected parameters. Similarly, the QP-λ linear model proposed in [34] is also unable to achieve good BDBR performance even with the original model parameters used in [34] or our new parameters;
• A similar compression achievement is also observed for the BDPSNR comparison.

In addition, similar to other temporal rate allocation methods, TRA and TRA + VRF show better performance for slow-motion sequences while still greatly improving performance for fast-moving sequences such as BasketballDrive. It should be noted that the testing sequences are different from the training sequences, thus validating the generalization of our TRA scheme. To further visualize the performance of our proposed method, Figure 9 shows the RD comparison between the original HEVC and the proposed TRA + VRF.

Figure 9. RD performance comparison between the original HEVC and the proposed methods for various sequences, full frames, low rate. (Panels: BasketballPass, Cactus, Ice, Vidyo1; Y-PSNR (dB) versus bitrate (Mbps); curves: HEVC and TRA + VRF.)


4.3. Rate Allocation Assessment
To explore the rate allocation performance of the proposed method, the rate distribution across the different temporal layers is measured and shown in Table 6 and Figure 10 for both the low rate and the high rate.

Table 6. Rate distribution (%) between temporal indexes.

                    HEVC + Proposed Methods
     Tid    HEVC    VRF      TRA      TRA + VRF
HR   0–I    23.95   24.07    30.06    30.23
     0–P    32.03   32.19    33.43    33.61
     1–B    17.27   17.27    16.43    16.41
     2–B    20.27   20.16    11.69    11.57
     3–B    6.50    6.30     8.39     8.19
LR   0–I    45.78   46.27    51.44    52.04
     0–P    34.57   34.94    31.24    31.58
     1–B    8.20    8.17     7.42     7.36
     2–B    7.82    7.51     5.81     5.48
     3–B    3.63    3.11     4.09     3.54


Figure 10. Rate allocation among coded pictures in a GOP for several codecs.

The proposed TRA redistributed the rate among the temporal layers. The proposed method tends to give more bits to the pictures at the lower temporal layer indexes, i.e., Tid 0. In fact, the pictures at these positions have a higher impact than the remaining ones, as they can be referred to when coding the pictures at higher Tids. In general, our proposed TRA tends to reduce the rate while maintaining a similar level of quality.

On the other hand, VRF shows a virtually similar rate distribution compared to the baseline HEVC. Therefore, the rate redistribution is only impacted by our TRA. Details on the rate distribution can be seen in Table 6.

4.4. Complexity Assessment
To complete the performance evaluation, we evaluated the encoding time of our proposed TRA with and without VRF. The time increase with the proposed tools was assessed and is shown in Table 7. As observed, the TRA method does not affect the computational complexity of the encoder.


Table 7. Encoding time (%) compared to HEVC.

          Low Rate                High Rate
Class     TRA     TRA + VRF       TRA     TRA + VRF
1080p     99.7    675.90          99.1    561.5
720p      100.8   447.10          101.9   463.2
480p      87.0    231.30          85.6    281.7
4SIF      105.9   231.19          96.9    243.4
240p      109.3   148.55          106.9   166.7
CIF       96.1    165.50          96.9    180.8
Average   99.8    316.60          97.9    316.2

On the other hand, TRA can even reduce the encoding time, especially at a high rate. The main reason is the reduction in the total rate, which greatly impacts other coding tools. However, the VRF method introduces nearly a three-fold increase in encoding time over the proposed HEVC. This mainly comes from the creation of additional references, and more reference frames also lead to more encoding time for motion estimation. This complexity–compression performance trade-off may prevent the wide deployment of VRF in some video coding applications where the computational complexity is constrained. However, as shown in Table 4, the BDBR of the combined TRA and VRF methods is mostly higher than the sum of the BDBRs of TRA and VRF applied individually. This effect motivates including in HEVC not only the rate allocation solution but also the VRF creation approach in future video coding development. Additionally, the complexity issue of VRF can be addressed by looking for low complexity VRF creation methods.

4.5. VRF Assessment
Finally, although VRF has shown an impressive compression performance when adopted in the HEVC architecture, it also introduced a remarkable complexity overhead. To further explore the creation of VRF in the proposed HEVC, this sub-section assesses the visual quality and the computational complexity of VRF.

As presented in Section 3.1.1, the VRF creation adopted in this paper includes three main steps: (i) hierarchical motion estimation (HME) to initialize the motion vector field, (ii) motion vector refinement (MVR) to mitigate the quantization errors that frequently occur in decoded references, and (iii) motion compensation (MC) to interpolate the VRF. To reveal the contribution of these modules, we compute and report in Table 8 the percentage of computation time (%) measured for VRF over the overall HEVC (%VRF_HEVC), and for the HME, MVR, and MC over the VRF (%HME_VRF, %MVR_VRF, %MC_VRF), while Figure 11 illustrates the visual quality comparison of the proposed VRF with and without MVR and the related work in [25].

%VRF_HEVC = Time_VRF × 100 / Time_HEVC (5)

%HME_VRF = Time_HME × 100 / Time_VRF (6)

%MVR_VRF = Time_MVR × 100 / Time_VRF (7)

%MC_VRF = Time_MC × 100 / Time_VRF (8)
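The four ratios can be computed directly from raw timings; the timing values below are made-up placeholders, chosen only to be near the reported averages.

```python
# Sketch computing the complexity shares of Equations (5)-(8) from raw
# timings (e.g., seconds); the numbers passed in are illustrative only.
def shares(t_hevc, t_vrf, t_hme, t_mvr, t_mc):
    return {
        "%VRF/HEVC": 100 * t_vrf / t_hevc,  # Eq. (5): VRF share of encoding
        "%HME/VRF": 100 * t_hme / t_vrf,    # Eq. (6): HME share of VRF
        "%MVR/VRF": 100 * t_mvr / t_vrf,    # Eq. (7): MVR share of VRF
        "%MC/VRF": 100 * t_mc / t_vrf,      # Eq. (8): MC share of VRF
    }

s = shares(t_hevc=1000.0, t_vrf=117.0, t_hme=111.4, t_mvr=3.3, t_mc=2.3)
print({k: round(v, 2) for k, v in s.items()})
```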

The complexity results in Table 8 show that the VRF consumes around 11.70% of the overall HEVC encoding complexity. Within VRF, the HME requires the largest share of computation, i.e., 95.24%. In this case, reducing the time of the HME is critical to making the VRF more suitable for widespread adoption in HEVC.

As observed in Figure 11, both HME and MVR contribute to the final quality of VRF. Naturally, HME exploits the video content to initialize the motion vector information; a good MV can mitigate the hole and occlusion problems in the interpolated frame. Meanwhile, the MVR can partly reduce the quantization noise and blocking artifacts in interpolated pictures by adaptively refining the MVF obtained from the HME. It should be noted that blocking artifacts typically occur in HEVC compressed videos due to the block-based coding approach, while the translational motion model adopted in BiME may also introduce further blocking artifacts (see the visual quality of VRF w/o MVR in Figure 11). In this case, the WVMF can eliminate "outlier" motion information, resulting in a smoother MVF for the final VRF creation (see the visual quality of the proposed VRF in Figure 11).
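The vector-median idea behind the WVMF can be sketched as follows. The actual weighting used in the paper's WVMF is not specified here, so uniform weights are assumed; the window contents are hypothetical.

```python
# Sketch of a (weighted) vector median filter for smoothing a motion vector
# field; uniform weights are assumed for illustration.
def vector_median(mvs, weights=None):
    """Return the MV from 'mvs' minimizing the weighted sum of Euclidean
    distances to all candidates (the vector median of the window)."""
    if weights is None:
        weights = [1.0] * len(mvs)
    def cost(v):
        return sum(w * ((v[0] - u[0]) ** 2 + (v[1] - u[1]) ** 2) ** 0.5
                   for w, u in zip(weights, mvs))
    return min(mvs, key=cost)

# A 3x3 neighborhood of MVs with one outlier: the filter suppresses it,
# which is how outlier motion is removed before final VRF interpolation.
window = [(2, 1), (2, 1), (3, 1), (2, 2), (40, -25), (2, 1), (3, 2), (2, 1), (2, 1)]
print(vector_median(window))
```

Unlike componentwise median filtering, the vector median always returns one of the observed MVs, so the smoothed field never contains motion vectors that were not present in the neighborhood.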

Table 8. VRF Complexity assessment.

Video                  %VRF_HEVC   %HME_VRF   %MVR_VRF   %MC_VRF
BasketballDrive        12.71       96.06      2.37       1.56
Kimono                 12.41       96.09      2.38       1.53
Cactus                 14.12       96.03      2.37       1.59
PeopleOnStreet         14.17       96.85      1.87       1.28
Traffic                18.47       96.81      1.86       1.33
FourPeople             13.70       95.21      2.85       1.94
KristenAndSara         12.71       96.06      2.37       1.56
Vidyo1                 13.72       95.22      2.82       1.96
Vidyo3                 13.15       95.26      2.82       1.93
Vidyo4                 13.24       95.24      2.85       1.91
ChinaSpeed             10.24       95.24      2.87       1.89
SlideEditing           13.73       95.22      2.83       1.95
BQMall                 10.12       94.90      3.10       2.00
PartyScene             8.68        94.83      3.15       2.02
BasketballDrillText    9.75        94.84      3.13       2.02
Ice                    10.71       94.76      3.11       2.12
BasketballPass         10.26       94.24      3.50       2.26
BlowingBubbles         8.13        94.33      3.46       2.21
Hall                   10.85       94.25      3.28       2.46
Foreman                9.00        94.36      3.53       2.12
Football               5.85        94.30      3.42       2.28
Average                11.70       95.24      2.85       1.90


Figure 11 shows, for each crop, the result of [25], the VRF without MVR, and the proposed VRF, with the corresponding Y-PSNR values: (a) 29.04 dB, 27.64 dB, 29.61 dB; (b) 35.69 dB, 34.79 dB, 35.78 dB; (c) 22.08 dB, 21.85 dB, 22.39 dB; (d) 28.52 dB, 28.00 dB, 28.77 dB.

Figure 11. VRF visual quality illustration (highlighting the improvement in the red circle): (a) Ice; (b) Vidyo1; (c) Football; (d) BQMall.

5. Conclusions

This paper presented a novel trellis-based bitrate allocation algorithm for HEVC inter coding in which a virtual reference frame is considered. The proposed VRF-HEVC relies on the decoded information available at both the encoder and decoder; thus, no syntax elements need to change in the standard specification, and no overhead bitrate needs to be transmitted. To optimize the VRF quality, a statistical learning-based VRF creation is adopted. This paper is the first work that considers the impact of quantization error and the video content on the VRF quality. In addition, to achieve higher HEVC compression performance, a novel set of quantization parameters is introduced for the VRF-HEVC framework based on a dynamic programming based bitrate allocation approach. Experimental results obtained for a rich set of test sequences revealed that the proposed TRA and VRF based HEVC solutions significantly outperform relevant HEVC improvement methods and the standard HEVC.
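The dynamic programming idea behind the trellis-based rate allocation can be illustrated with a small Viterbi-style sketch: the trellis stages are the frames of a GOP, the states are candidate QP values, and the search returns the minimum-cost QP path. The `cost` and `trans_cost` functions below are illustrative placeholders, not the trained per-temporal-layer models described in the paper.

```python
def trellis_qp_allocation(num_frames, qp_candidates, cost, trans_cost):
    """Viterbi-style search over a trellis whose stages are the frames of a
    GOP and whose states are candidate QP values. cost(t, qp) models the
    rate-distortion cost of coding frame t with that QP; trans_cost(p, q)
    penalizes QP jumps between consecutive frames. Returns the minimum-cost
    sequence of QP values."""
    # best[t][i]: minimal accumulated cost ending in state i at stage t
    best = [[cost(0, q) for q in qp_candidates]]
    back = []  # back-pointers for path recovery
    for t in range(1, num_frames):
        row, ptr = [], []
        for q in qp_candidates:
            cands = [best[-1][j] + trans_cost(qp_candidates[j], q)
                     for j in range(len(qp_candidates))]
            j = min(range(len(cands)), key=cands.__getitem__)
            row.append(cands[j] + cost(t, q))
            ptr.append(j)
        best.append(row)
        back.append(ptr)
    # backtrack from the cheapest final state
    i = min(range(len(qp_candidates)), key=best[-1].__getitem__)
    path = [qp_candidates[i]]
    for ptr in reversed(back):
        i = ptr[i]
        path.append(qp_candidates[i])
    return list(reversed(path))
```

With F frames and Q candidate QPs per frame, the search costs O(F·Q²) instead of the O(Q^F) of exhaustive enumeration, which is what makes a per-frame allocation over a whole GOP tractable.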

Author Contributions: Conceptualization, X.H. and T.N.C.; methodology, X.H.; software, L.D.T.H.; validation, X.H. and T.N.C.; formal analysis, X.H.; investigation, L.D.T.H.; resources, T.N.C.; data curation, X.H.; writing—original draft preparation, X.H.; writing—review and editing, T.N.C. and X.H.; visualization, L.D.T.H.; supervision, X.H. All authors have read and agreed to the published version of the manuscript.

Funding: This research is funded by Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant number 102.01-2020.15.

Conflicts of Interest: The authors declare no conflict of interest.


References

1. Sullivan, G.J.; Ohm, J.-R.; Han, W.-J.; Wiegand, T. Overview of the High Efficiency Video Coding (HEVC) Standard. IEEE Trans. Circuits Syst. Video Technol. 2012, 22, 1649–1668. [CrossRef]
2. Bross, B.; Chen, J.; Ohm, J.-R.; Sullivan, G.J.; Wang, Y.-K. Developments in International Video Coding Standardization After AVC, with an Overview of Versatile Video Coding (VVC). Proc. IEEE 2020. [CrossRef]
3. Papadopoulos, M.A.; Zhang, F.; Agrafiotis, D.; Bull, D. An adaptive QP offset determination method for HEVC. In Proceedings of the IEEE Conference on Image Processing, Phoenix, AZ, USA, 25–28 September 2016; pp. 4220–4224.
4. Chen, Y.; Mukherjee, D.; Han, J.; Grange, A.; Xu, Y.; Liu, Z.; Parker, S.; Chen, C.; Su, H.; Joshi, U.; et al. An Overview of Core Coding Tools in the AV1 Video Codec. In Proceedings of the Picture Coding Symposium (PCS), San Francisco, CA, USA, 24–27 June 2018; pp. 41–45.
5. Xu, M.; Canh, T.N.; Jeon, B. Simplified Rate-Distortion Optimized Quantization for HEVC. In Proceedings of the 2018 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB), Valencia, Spain, 6–8 June 2018; pp. 1–6.
6. Zhang, Y.; Tian, R.; Liu, J.; Wang, N. Fast rate distortion optimized quantization for HEVC. In Proceedings of the 2015 Visual Communications and Image Processing (VCIP), Singapore, 13–16 December 2015; pp. 1–4.
7. Sun, L.; Au, O.C.; Zhao, C.; Huang, F.H. Rate distortion modeling and adaptive rate control scheme for high efficiency video coding (HEVC). In Proceedings of the 2014 IEEE International Symposium on Circuits and Systems (ISCAS), Melbourne, VIC, Australia, 1–5 June 2014; pp. 1933–1936.
8. Zhao, L.; Wang, S.; Zhang, X.; Wang, S.; Ma, S.; Gao, W. Enhanced Motion-Compensated Video Coding with Deep Virtual Reference Frame Generation. IEEE Trans. Image Process. 2019, 28, 4832–4844. [CrossRef] [PubMed]
9. Zhang, Y.; Zhang, C.; Fan, R. Fast Motion Estimation in HEVC Inter Coding: An Overview of Recent Advances. In Proceedings of the 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Honolulu, HI, USA, 12–15 November 2018; pp. 49–56.
10. Djelouah, A.; Campos, J.; Schaub-Meyer, S.; Schroers, C. Neural Inter-Frame Compression for Video Coding. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Seoul, Korea, 27 October–2 November 2019; pp. 6420–6428.
11. Zhao, L.; Wang, S.; Zhang, X.; Wang, S.; Ma, S.; Gao, W. Enhanced CTU-Level Inter Prediction with Deep Frame Rate Up-Conversion for High Efficiency Video Coding. In Proceedings of the 25th IEEE International Conference on Image Processing (ICIP), Athens, Greece, 7–10 October 2018; pp. 206–210.
12. Xu, M.; Canh, T.N.; Jeon, B. Simplified Level Estimation for Rate-Distortion Optimized Quantization of HEVC. IEEE Trans. Broadcast. 2019, 66, 88–99. [CrossRef]
13. Xu, M.; Canh, T.N.; Jeon, B. Rate-Distortion Optimized Quantization: A Deep Learning Approach. In Proceedings of the IEEE High Performance Extreme Computing Conference, Waltham, MA, USA, 25–27 September 2018.
14. Choi, H.; Yoo, J.; Nam, J.; Sim, D.-G. Pixel-wise unified rate-quantization model for multi-level rate control. IEEE J. Sel. Top. Signal Process. 2013, 7, 1112–1123. [CrossRef]
15. Wang, S.; Ma, S.; Wang, S.; Zhao, D.; Gao, W. Quadratic Rho-domain based rate control algorithm for HEVC. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, Canada, 26–31 May 2013.
16. Li, B.; Li, H.; Li, L.; Zhang, J. λ domain rate control algorithm for high efficiency video coding. IEEE Trans. Image Process. 2014, 23, 3841–3854. [CrossRef] [PubMed]
17. Wang, M.; Ngan, K.N.; Li, H. An efficient frame-content based intra frame rate control for high efficiency video coding. IEEE Signal Process. Lett. 2014, 22, 896–900. [CrossRef]
18. Li, S.; Xu, M.; Deng, X.; Wang, Z. Weight-based R-λ rate control for perceptual HEVC coding on conversational videos. Signal Process. Image Commun. 2015, 38, 127–140. [CrossRef]
19. Gao, W.; Kwong, S.; Zhou, Y.; Yuan, H. SSIM-based game theory approach for rate-distortion optimized intra frame CTU-level bit allocation. IEEE Trans. Multimed. 2016, 18, 988–999. [CrossRef]
20. Yan, T.; Ra, I.-H.; Zhang, Q.; Xu, H.; Huang, L. A Novel Rate Control Algorithm Based on ρ Model for Multiview High Efficiency Video Coding. Electronics 2020, 9, 166. [CrossRef]
21. Yuanzhi, Z.; Chao, L. A Highly Parallel Hardware Architecture of Table-Based CABAC Bitrate Estimator in an HEVC Intra Encoder. IEEE Trans. Circuits Syst. Video Technol. 2019, 29, 1544–1558.
22. Ascenso, J.; Brites, C.; Pereira, F. Improving frame interpolation with spatial motion smoothing for pixel domain distributed video coding. In Proceedings of the Conference on Speech and Image Processing, Multimedia Communications and Services, Smolenice, Slovakia, 29 June–2 July 2005.
23. Jeong, S.-G.; Lee, C.; Kim, C.-S. Motion-Compensated Frame Interpolation Based on Multihypothesis Motion Estimation and Texture Optimization. IEEE Trans. Image Process. 2013, 22, 4497–4505. [CrossRef] [PubMed]
24. HoangVan, X. Statistical search range adaptation solution for effective frame rate up-conversion. IET Image Process. 2018, 12, 113–120. [CrossRef]
25. HoangVan, X.; Ascenso, J.; Pereira, F. Improving predictive video coding performance with decoder side information. In Proceedings of the IEEE International Conference on Image Processing, Orlando, FL, USA, 30 September–3 October 2012.
26. Fischer, T.R. Joint trellis coded quantization/modulation. IEEE Trans. Commun. 1991, 39, 17–176. [CrossRef]
27. Karczewicz, M.; Ye, Y.; Chong, I. Rate Distortion Optimized Quantization. In Proceedings of the ITU-T VCEG Meeting, Antalya, Turkey, 12–13 January 2008.
28. Bjontegaard, G. Calculation of average PSNR differences between RD curves. In Proceedings of the 13th ITU-T VCEG Meeting, Austin, TX, USA, 2–4 April 2001.
29. Bossen, F. Common HM Test Conditions and Software Reference Configuration. In Proceedings of the 14th Meeting of the Joint Collaborative Team on Video Coding (JCT-VC), Vienna, Austria, 25 July–2 August 2013.
30. HEVC Reference Software. Available online: https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/ (accessed on 8 June 2021).
31. Video Test Sequences. Available online: ftp://[email protected]/testsequences/ (accessed on 20 November 2020).
32. Available online: ftp://vqeg.its.bldrdoc.gov (accessed on 27 November 2020).
33. Xiph.org Video Test Media. Available online: https://media.xiph.org/video/derf/ (accessed on 20 November 2020).
34. Li, B.; Xu, J.; Zhang, D.; Li, H. QP refinement according to Lagrange multiplier for high efficiency video coding. In Proceedings of the IEEE International Symposium on Circuits and Systems, Beijing, China, 19–23 May 2013.
35. HoangVan, X. Adaptive Quantization Parameter Estimation for HEVC Based Surveillance Scalable Video Coding. Electronics 2020, 9, 915. [CrossRef]