JMVC Simulation Environment
Total Page:16
File Type:pdf, Size:1020Kb
Appendix A JMVC Simulation Environment The JVT (Joint Video Team), formed from the cooperation between the ITU-T Study Group 16 (VCEG) and ISO/IEC Motion Picture Experts group (MPEG), responsible for the standardization of the H.264, SVC (scalable video coding), and MVC provides software models used for algorithms experimentation and for stan- dards proof of concept. The JMVC (Joint Model for MVC), currently on version 8.5, is the reference software available for experimentation on the MVC standard. Along this work the JMVC software, coded using C++, was used and modifi ed to implement the proposed algorithms. Initially, the version 6.0 was used followed by an upgraded to version 8.5. Considering the length and complexity of the software, a high-level overview of the interaction between the main encoder classes is pre- sented here. Afterwards are shown the classes modifi ed to enable our algorithms experimentation. For in-depth details of the class structure refers to JMVC docu- mentation . A.1 JMVC Encoder Overview The JMVC classes are hierarchically structured as shown in Fig. A.1 . The JMVC encodes each view at a time requiring as many calls as number of view to be encoded. The reference views are stored in temporary fi les. The class H264AVCEncoderTest represents the top encoder entity; it initializes the encoder, and calls the CreateH264AVCEncoder class to initialize the other coding classes. At this level, the PicEncoder is initialized and the frame-level loop is controlled. The PicEncoder loops over the slices inside each frame and resets the RDcost. The slice encoder controls the MB-level loop and sets the reference frames for each slices. For MB encoding there are two main classes: the MbEncoder and the MBCoder . MbEncoder encapsulates all the prediction, transforms, and entropy steps. It imple- ments the mode decision by looping over and encoding all possible coding modes (in case of RDO-MD) and determining the minimum RDCost. At this point no MB B. Zatt et al., 3D Video Coding for Embedded Devices: Energy Effi cient 175 Algorithms and Architectures, DOI 10.1007/978-1-4614-6759-5, © Springer Science+Business Media New York 2013 176 Appendix A JMVC Encoder H264AVCEncoderTest (H264AVCEncoderTest.cpp) -Init encoder -Setup frame buffers CreateH264AVCEncoder -Loop over frames (CreaterH264AVCEncoder.cpp) PicEncoder - Init slice header - Set RDCost (λ) (PicEncoder.cpp) - Loop over slices SliceEncoder - Init reference frames (SliceEncoder.cpp) - Loop over MBs MbEncoder MbCoder BS (MbEncoder.cpp) (MbCoder.cpp) - Loop over coding modes - Encode the best mode - Perform the encoding for each mode - Create the bitstream - Select the best RDCost Fig. A.1 JMVC encoder high-level diagram coding data is written to the bitstream. Once the best mode is selected, the SliceEncoder calls the MbCoder to write the MB-level side information and resi- dues to the bitstream. Figure A.2 (Tech et al. 2010) depicts the hierarchical call graph of methods inside the mode decision process implemented in MbEncoder class. Firstly, the SKIP and Direct modes are evaluated,;along this monograph these modes are jointly referred as SKIP MBs. In the following, all inter-prediction block sizes are evalu- ated. For each partition size a call to the method MotionEstimation::estimateBlock WithStart (see Fig. A.3 discussion) is performed. The same happens for the sub- partitions in case of 8 × 8 partitioning. EstimateMb8 × 8Frext is only called in case the FRExt fl ag is set. Finally, the intra-frame coding modes including PCM, intra4 × 4, intra8 × 8 (FRExt only), and intra16 × 16 are called. The Estimate<mode> methods call the complete coding loop for that specifi c mode including prediction, transforms, quantization, entropy encoding, and reconstruction. It allows a precise defi nition of the minimum RDCost (λ) and an optimal best mode selection at the cost of elevated coding complexity. The MbCoder is called to entropy encode the best mode and write the data into the bitstream output buffer. The motion and disparity estimation search itself is defi ned in the method esti- mateBlockWithStart and is composed of three basic steps. The ME/DE datafl ow is represented by the arrows in Fig. A.3 . Once the estimateBlockWithStart is called, for instance in EstimateMb16 × 16 , the search runs for each reference frame list A.1 JMVC Encoder Overview 177 EstimateMbSkip Inter P EstimateMbDirect Inter B Inter P EstimateSubMbDirect Inter B estimateBlock EstimateMb8x8Frext 4x min EstimateSubMb8x8 1x WithStart estimateBlock EstimateSubMbDirect EstimateMb16x16 1x WithStart estimateBlock estimateBlock Minimum EstimateMb16x8 2x WithStart EstimateSubMb8x8 1x WithStart RDCost estimateBlock estimateBlock min EstimateMb8x16 2x EstimateSubMb8x4 2x WithStart Best WithStart Mode estimateBlock EstimateMb8x8 4x min EstimateSubMb4x8 2x WithStart estimateBlock EstimateSubMb4x4 4x WithStart EstimateMbPCM Intra EstimateMbIntra4 min 1x Test 9 Intra Modes EstimateMbIntra8 min 1x Test 4 Intra Modes EstimateMbintra16 min 1x Test 4 Intra Modes Fig. A.2 Mode decision hierarchy in JMVC InterSearch Minimum Search List 0 min ME/DE Sub Pel min ME/DE Full Pel MotionCost min Search List 1 min ME/DE Sub Pel min ME/DE Full Pel Interactive B Prediction min ME/DE Sub Pel min ME/DE Full Pel Fig. A.3 Inter-frame search in JMVC (List 0 and List 1) and for an interactive B search mode that exploits both lists in an interactive fashion (in case the interactive B is active). At the software perspective, there is no distinction between ME and DE. List 0 and List 1 store both temporal and disparity reference frames. The search for a given reference frame fi rstly fi nds the best candidate block among the integer pixels (ME/DE Full Pel) and then refi nes the result considering half and quarter pixels (ME/DE Sub Pel). The search pattern depends on the search algorithm, and JMVC implements TZ Search, Full Search, Spiral Search, and Log Search. The goal is to fi nd the candidate block that mini- λ mizes the Motion Cost ( Motion ) in terms of SAD, SAD-YUV (considering chroma channels), SATD, or SSE according to user-defi ned coding parameters. The posi- tion of the best matching candidate block position defi nes the motion or disparity vector. 178 Appendix A A.2 Modifi cations to the JMVC Encoder A.2.1 JMVC Encoder Tracing In order to generate the statistics used for coding modes and motion/disparity vec- tor, some modifi cations were done in the original JMVC code. The point selected for this tracing is inside the entropy encoder to guarantee that the extracted data is the same actually encoded and transmitted. The entropy encoder is declared as the virtual class MbSymbolWriteIf , but the actual implementation is in CabacWriter and UvlcWriter , depending on the entropy encoder selected in the confi guration fi le. The methods monitored are skipFlag that encodes the SKIP (and Direct)-coded MBs and mbMode that encodes all other modes. Note, MB coding mode codes ( uiMbMode ) vary with the slice type as defi ned in Tables 7-11, 7-12, 7-12, 7-13, and 7-14 of the MVC standard (JVT 2008). A.2.2 Communication Channels in JMVC Multiple algorithms proposed in this monograph employ the information from the 3D-neighborhood. For that, there is a need to build communication channels between neighboring MBs in the special, temporal, and disparity domains. In other words, an infrastructure to send and receive data at MB level, at frame level (in same view), and at view level (frames in different views). Therefore, a hierarchical com- munication infrastructure was designed and implemented. Figure A.4 presents graphically the modifi ed classes along with the new member data structures and communication methods. The MbDataAccess already provides direct access to the left and upper neigh- bors (A, B, C, and D in Fig. A.4 ). This access was extended to the right and bottom neighbors (A*, B*, C*, and D*) enabling access to data from all spatial neighboring PicEncoder SliceHeader MbDataAccess #MBx GOP Size #MBx Send() Send() D B C MB A MB A* Frame #MBy Frame Receive() #MBy Receive() C* B* D* MbDataAccess::Send(MB,x,y) SliceHeader::Send(slice.POC) MbDataAccess::Receive(MB,x,y) PicEncoder ::Read(file, view) SliceHeader::Receive(slice,POC) MbDataAccess::Write(MB,x,y) Read() Write() MB Storage File Storage File View N-1 View N Fig. A.4 Communication in JMVC A.2 Modifi cations to the JMVC Encoder 179 MBs. For temporal neighboring MB access the current MB data is sent to SliceHeader (using Send () methods) where a 2D array stores the information from the MBs belonging to the current slice. Once the slice is completely processed, the 2D array is sent to the PicEncoder class. PicEncoder maintains the data for the whole current GOP. For reading the data communication channel writes the requested data from PicEncoder to SliceHeader and fi nally to MbDataAccess (using Receive () methods ). As far as the views are processed in distinct encoder calls there is a need to use external temporary fi les to transmit the disparity neighboring infor- mation. The current MB data is written in these fi les from MbDataAccess , while the data from previous views is read in PicEncoder , as shown in Fig. A.4 . A.2.3 Mode Decision Modifi cation in JMVC The mode decision is programmed in a very simple becoming easy to fi nd and modify. MD is handled in MbEncoder::encodeMacroblock . To fi nd the exact point, search for the xEstimateMb methods responsible for calling the modes evaluation. Before this point are implemented the 3D-neighborhood communication calls and the calculation required to take the fast decisions. A.2.4 ME/DE Modifi cation in JMVC The modifi cations for fast ME/DE are inserted in two distinct classes. For modifi ca- tions at higher level such as avoiding interactive B search, search direction, and reference frames the modifi cations are done in the MbEncoder by modifying the xEstimateMb methods.