Phd THESIS Retina-Inspired Image and Video Coding

Total Page:16

File Type:pdf, Size:1020Kb

Phd THESIS Retina-Inspired Image and Video Coding UNIVERSITE´ COTEˆ D’AZUR GRADUATE SCHOOL STIC SCIENCE ET TECHNOLOGIES DE L’INFORMATION ET DE LA COMMUNICATION PhD THESIS A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Science Specialized in Signal and Image Processing presented by Effrosyni Doutsi Retina-Inspired Image and Video Coding supervised by Pr. Lionel Fillatre and Dr. Marc Antonini prepared at I3S Laboratory, SIS Team, Mediacoding Group Funded by 4G-TECHNOLOGY and ANRT Grant Defended on March 22, 2017 Committee: Referees : Pr. Janusz Konrad - Boston University Dr. Laurent Perrinet - CNRS, Aix-Marseille Universit´e Directors : Pr. Lionel Fillatre - I3S, Universit´eCˆote d’Azur Dr. Marc Antonini - I3S, CNRS Examiners : Pr. B´eatrice Pesquet - T´el´ecom ParisTech Dr. Pierre Kornbropst - Inria Pr. Fernando Pereira - University of Lisbon Invited : Mr. Julien Gaulmin - 4G-TECHNOLOGY 2 Contents 1 Introduction 9 1.1 GeneralFramework................................ 9 1.2 ContributionsandOutline. 11 1.2.1 PartII:DynamicFiltering. 11 1.2.2 Part III: Dynamic Quantization . 12 1.2.3 PartIV:Applications .......................... 13 I STATE-OF-THE-ART 15 2 State-of-the-art and Background in Compression 17 2.1 Whyweusecompression? . .. .. .. .. .. .. .. 18 2.1.1 Losslesscompression . 18 2.1.2 Lossycompression ............................ 19 2.2 CodingPrinciple ................................. 20 2.3 Rate-DistortionTheory . 20 2.3.1 Qualitative Metrics . 21 2.3.1.1 MeanSquareError(MSE) . 22 2.3.1.2 Peak Signal to Noise Ratio (PSNR) . 22 2.3.1.3 Structure SIMilarities (SSIM) . 22 2.3.1.4 PSNRvsSSIM......................... 23 2.3.2 ShannonEntropy............................. 24 2.3.2.1 HuffmanCoding ........................ 25 2.3.2.2 ArithmeticCoding. 26 2.3.2.3 Stack-RunCoding . 27 2.3.3 Rate-Distortion Optimality . 28 2.3.4 CodingUnitandComplexity . 29 2.3.5 Lagrangian Optimization . 29 2.3.5.1 Rate Allocation Problem . 31 2.3.5.2 Distortion Allocation Problem . 31 2.4 Progress inVideoCompressionStandards . ...... 32 2.4.1 VideoStream............................... 32 2.4.2 FromJPEGtoHEVC .......................... 33 2.4.3 OverviewofJPEGandJPEG2000 . 34 2.4.3.1 Discrete Cosine Transform (DCT) . 34 2.4.3.2 Discrete Wavelet Transform (DWT) . 35 2.4.3.3 Quantization . 36 2.4.3.4 Entropycoding......................... 36 2.4.3.5 Overview MJPEG and MJPEG2000 . 36 2.4.4 OverviewofMPEG-1 .......................... 37 2.4.4.1 Macroblocks .......................... 38 2.4.4.2 Motion Compensation . 38 1 2 CONTENTS 2.4.5 OverviewofMPEG-2 .......................... 39 2.4.6 Overview of H.264/MPEG-4/AVC . 40 2.4.7 OverviewofH.265 ............................ 42 2.5 Alternative Video Compression Algorithms . ....... 43 2.5.1 VP9.................................... 43 2.5.2 GreenMetadataStandard. 44 2.6 IEEE1857Standard ............................... 46 2.7 Conclusion: What is the future of video compression? . ......... 47 II DYNAMIC FILTERING 49 3 DoG Filters from Neuroscience to Image Processing 53 3.1 Introduction.................................... 53 3.2 IntroductiontotheVisualSystem . 54 3.2.1 Retina................................... 54 3.2.1.1 Photoreceptors . 56 3.2.1.2 Horizontal Cells . 56 3.2.1.3 Bipolar cells . 56 3.3 OPLApproximationModels. 56 3.3.1 SpatialDoGFilter ............................ 57 3.3.2 Separable Spatiotemporal DoG Filter . 57 3.3.3 Non-Separable Spatiotemporal Receptive Field . ....... 58 3.3.4 Non-Separable Spatiotemporal Filter . ..... 59 3.4 DoGinImageProcessing ............................ 60 3.4.1 SpatialDoGPyramid .......................... 60 3.4.2 Invertible Spatial DoG Pyramid . 61 3.4.3 Invertible Spatiotemporal DoG Pyramid . 61 3.5 Conclusion .................................... 62 4 Retina-inspired Filtering 63 4.1 Introduction.................................... 63 4.2 Definition ..................................... 64 4.3 Spatiotemporal Behavior and Convergence . ....... 65 4.4 WeightedDoGAnalysis ............................. 67 4.4.1 WDoGinSpaceDomain. .. .. .. .. .. .. .. 68 4.4.2 WDoG in Frequency Domain . 69 4.5 NumericalResults ................................ 72 4.5.1 1DInputSignal ............................. 72 4.5.2 ImageInputSignal............................ 73 4.6 Conclusion .................................... 76 5 Inverse Retina-Inspired Filtering 77 5.1 Introduction.................................... 77 5.2 Discrete Retina-inspired Filter . ...... 78 5.3 Retina-inspiredFrame . 79 5.3.1 FrameTheory .............................. 79 5.3.2 FrameProof ............................... 80 5.4 Pseudo-inverseFrame .. .. .. .. .. .. .. .. .. 82 5.5 ConjugateGradient ............................... 83 5.6 NumericalResults ................................ 84 5.6.1 Progressive Reconstruction . 85 5.6.2 Additive White Gaussian Noise . 86 CONTENTS 3 5.7 Conclusion .................................... 91 III DYNAMIC QUANTIZATION 93 6 Generation of Spikes 97 6.1 Introduction.................................... 97 6.2 GanglionCells .................................. 98 6.2.1 Morphology................................ 99 6.2.2 Functionality ............................... 99 6.3 SpikeGenerationModels. 100 6.3.1 RateCodes ................................ 101 6.3.1.1 Michaelis-Menten Function . 101 6.3.1.2 Rate as a Spike Count . 101 6.3.1.3 Rate as a Spike Density . 102 6.3.1.4 Rate as a Population Activity . 102 6.3.2 Time Code: Leaky Integrate and Fire (LIF) . 103 6.3.3 RankCode ................................ 105 6.4 Howtointerpretthespikes?. 106 6.5 Spikesincodingsystems. 107 6.5.1 RankOrderCoder(ROC). 107 6.5.2 ExtensionofROC ............................ 108 6.5.3 TimeEncodingMachine(TEM) . 108 6.5.4 A/D Bio-inspired Converter . 109 6.6 Conclusion .................................... 110 7 LIF Quantizer 113 7.1 Introduction.................................... 113 7.2 LIFQuantizer................................... 114 7.2.1 Decodingspikes.............................. 115 7.2.2 Dead-zone................................. 117 7.2.3 Quantization ............................... 117 7.2.4 Perfect-LIF dead-zone Quantizer . 120 7.2.5 Uniform-LIF dead-zone Quantizer . 121 7.2.6 Adaptive-LIF dead-zone Quantizer . 127 7.2.7 Optimized-LIF dead-zone Quantizer . 131 7.3 ProgressiveReconstruction . 135 7.4 Conclusion .................................... 136 IV APPLICATIONS 137 8 Application on Video Surveillance Data 139 8.1 VideoSurveillanceoverWSN . 140 8.1.1 TransmissionConstraints . 140 8.1.2 Network Topologies . 140 8.1.3 Pre-processingsolutions . 141 8.2 4G-TECHNOLOGY ............................... 142 8.2.1 BSViArchitecture ............................ 142 8.2.2 EViBOX/BVi Architecture . 142 8.2.3 SurveillanceScenarios . 142 8.2.3.1 WorkingArea ......................... 143 8.2.3.2 Traffic.............................. 143 4 CONTENTS 8.2.3.3 PublicArea........................... 144 8.3 OurContributions ................................ 144 8.4 VideoNumericalResults. 146 8.5 Conclusion .................................... 153 9 General Conclusion 155 9.1 Contributions................................... 155 9.1.1 Retina-inspired Filtering . 155 9.1.2 LIFQuantizer .............................. 156 9.1.3 Retina-inspired image and video codec . 156 9.2 Perspectives.................................... 156 V APPENDIX 159 A Proof of Lemma 1 161 A.1 Closed-form of Jc(t) ............................... 161 A.2 Closed-form of Js(t) ............................... 162 B List of Symbols 165 C List of Abbreviations 169 CONTENTS 5 Abstract The goal of this thesis is to propose a novel video coding architecture which is inspired by the visual system and the retina. If one sees the retina as a machine which processes the visual stimulus, it seems an intelligent and very efficient model to mimic. There are several reasons to claim that, first of all because it consumes low power, it also deals with high resolution inputs and the dynamic way it transforms and encodes the visual stimulus is beyond the current standards. We were motivated to study and release a retina-inspired video codec. The proposed algorithm was applied to a video stream in a very simple way according to the coding standards like MJPEG or MJPEG2000. However, this way allows the reader to study and explore all the advantages of the retina dynamic processing way in terms of compression and image processing. The current performance of the retina- inspired codec is very promising according to some final results which outperform MJPEG for bitrates lower than 100 kbps and MPEG-2 for bitrates higher than 70 kpbs. In addition, for lower bitrates the retina-inspired codec outlines better the content of the input scene. There are many perspectives which concern the improvement of the retina-inspired video codec which seem to lead to a groundbreaking compression architecture. Hopefully, this manuscript will be a useful tool for all the researchers who would like to study further than the perceptual capability of the visual system and understand how the structure and the functions of this efficient machine can in practice improve the coding algorithms. R´esum´e Cette th`ese vise `aproposer une nouvelle architecture de codage vid´eo qui s’inspire du syst`eme visuel et de la r´etine. La r´etine peut ˆetre consid´er´ee comme une machine intelligente qui traite le stimulus visuel de fa¸con tr`es efficace. De ce fait, elle repr´esente donc un bon candidat pour d´evelopper de nouveaux syst`emes de traitement d’image. Il y a plusieurs raisons pour cela, tout d’abord parce
Recommended publications
  • Versatile Video Coding – the Next-Generation Video Standard of the Joint Video Experts Team
    31.07.2018 Versatile Video Coding – The Next-Generation Video Standard of the Joint Video Experts Team Mile High Video Workshop, Denver July 31, 2018 Gary J. Sullivan, JVET co-chair Acknowledgement: Presentation prepared with Jens-Rainer Ohm and Mathias Wien, Institute of Communication Engineering, RWTH Aachen University 1. Introduction Versatile Video Coding – The Next-Generation Video Standard of the Joint Video Experts Team 1 31.07.2018 Video coding standardization organisations • ISO/IEC MPEG = “Moving Picture Experts Group” (ISO/IEC JTC 1/SC 29/WG 11 = International Standardization Organization and International Electrotechnical Commission, Joint Technical Committee 1, Subcommittee 29, Working Group 11) • ITU-T VCEG = “Video Coding Experts Group” (ITU-T SG16/Q6 = International Telecommunications Union – Telecommunications Standardization Sector (ITU-T, a United Nations Organization, formerly CCITT), Study Group 16, Working Party 3, Question 6) • JVT = “Joint Video Team” collaborative team of MPEG & VCEG, responsible for developing AVC (discontinued in 2009) • JCT-VC = “Joint Collaborative Team on Video Coding” team of MPEG & VCEG , responsible for developing HEVC (established January 2010) • JVET = “Joint Video Experts Team” responsible for developing VVC (established Oct. 2015) – previously called “Joint Video Exploration Team” 3 Versatile Video Coding – The Next-Generation Video Standard of the Joint Video Experts Team Gary Sullivan | Jens-Rainer Ohm | Mathias Wien | July 31, 2018 History of international video coding standardization
    [Show full text]
  • VOL. E100-C NO. 6 JUNE 2017 the Usage of This PDF File Must Comply
    VOL. E100-C NO. 6 JUNE 2017 The usage of this PDF file must comply with the IEICE Provisions on Copyright. The author(s) can distribute this PDF file for research and educational (nonprofit) purposes only. Distribution by anyone other than the author(s) is prohibited. IEICE TRANS. ELECTRON., VOL.E100–C, NO.6 JUNE 2017 643 PAPER A High-Throughput and Compact Hardware Implementation for the Reconstruction Loop in HEVC Intra Encoding Yibo FAN†a), Member, Leilei HUANG†, Zheng XIE†, and Xiaoyang ZENG†, Nonmembers SUMMARY In the newly finalized video coding standard, namely high 4×4, 8×8, 16×16, 32×32 and 64×64 with 35 possible pre- efficiency video coding (HEVC), new notations like coding unit (CU), pre- diction modes in intra prediction. Although several fast diction unit (PU) and transformation unit (TU) are introduced to improve mode decision designs have been proposed, still a consid- the coding performance. As a result, the reconstruction loop in intra en- coding is heavily burdened to choose the best partitions or modes for them. erable amount of candidate PU modes, PU partitions or TU In order to solve the bottleneck problems in cycle and hardware cost, this partitions are needed to be traversed by the reconstruction paper proposed a high-throughput and compact implementation for such a loop. reconstruction loop. By “high-throughput”, it refers to that it has a fixed It can be inferred that the reconstruction loop in intra throughput of 32 pixel/cycle independent of the TU/PU size (except for 4×4 TUs). By “compact”, it refers to that it fully explores the reusability prediction has become a bottleneck in cycle and hardware between discrete cosine transform (DCT) and inverse discrete cosine trans- cost.
    [Show full text]
  • Security Solutions Y in MPEG Family of MPEG Family of Standards
    1 Security solutions in MPEG family of standards TdjEbhiiTouradj Ebrahimi [email protected] NET working? Broadcast Networks and their security 16-17 June 2004, Geneva, Switzerland MPEG: Moving Picture Experts Group 2 • MPEG-1 (1992): MP3, Video CD, first generation set-top box, … • MPEG-2 (1994): Digital TV, HDTV, DVD, DVB, Professional, … • MPEG-4 (1998, 99, ongoing): Coding of Audiovisual Objects • MPEG-7 (2001, ongo ing ): DitiDescription of Multimedia Content • MPEG-21 (2002, ongoing): Multimedia Framework NET working? Broadcast Networks and their security 16-17 June 2004, Geneva, Switzerland MPEG-1 - ISO/IEC 11172:1992 3 • Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s – Part 1 Systems - Program Stream – Part 2 Video – Part 3 Audio – Part 4 Conformance – Part 5 Reference software NET working? Broadcast Networks and their security 16-17 June 2004, Geneva, Switzerland MPEG-2 - ISO/IEC 13818:1994 4 • Generic coding of moving pictures and associated audio – Part 1 Systems - joint with ITU – Part 2 Video - joint with ITU – Part 3 Audio – Part 4 Conformance – Part 5 Reference software – Part 6 DSM CC – Par t 7 AAC - Advance d Au dio Co ding – Part 9 RTI - Real Time Interface – Part 10 Conformance extension - DSM-CC – Part 11 IPMP on MPEG-2 Systems NET working? Broadcast Networks and their security 16-17 June 2004, Geneva, Switzerland MPEG-4 - ISO/IEC 14496:1998 5 • Coding of audio-visual objects – Part 1 Systems – Part 2 Visual – Part 3 Audio – Part 4 Conformance – Part 5 Reference
    [Show full text]
  • Dynamic Resource Management of Network-On-Chip Platforms for Multi-Stream Video Processing
    DYNAMIC RESOURCE MANAGEMENT OF NETWORK-ON-CHIP PLATFORMS FOR MULTI-STREAM VIDEO PROCESSING Hashan Roshantha Mendis Doctor of Engineering University of York Computer Science March 2017 2 Abstract This thesis considers resource management in the context of parallel multiple video stream de- coding, on multicore/many-core platforms. Such platforms have tens or hundreds of on-chip processing elements which are connected via a Network-on-Chip (NoC). Inefficient task allo- cation configurations can negatively affect the communication cost and resource contention in the platform, leading to predictability and performance issues. Efficient resource management for large-scale complex workloads is considered a challenging research problem; especially when applications such as video streaming and decoding have dynamic and unpredictable workload characteristics. For these type of applications, runtime heuristic-based task mapping techniques are required. As the application and platform size increase, decentralised resource management techniques are more desirable to overcome the reliability and performance bot- tlenecks in centralised management. In this work, several heuristic-based runtime resource management techniques, targeting real-time video decoding workloads are proposed. Firstly, two admission control approaches are proposed; one fully deterministic and highly predictable; the other is heuristic-based, which balances predictability and performance. Secondly, a pair of runtime task mapping schemes are presented, which make use of limited known application properties, communication cost and blocking-aware heuristics. Combined with the proposed deterministic admission con- troller, these techniques can provide strict timing guarantees for hard real-time streams whilst improving resource usage. The third contribution in this thesis is a distributed, bio-inspired, low-overhead, task re-allocation technique, which is used to further improve the timeliness and workload distribution of admitted soft real-time streams.
    [Show full text]
  • Parameter Optimization in H.265 Rate-Distortion by Single Frame
    https://doi.org/10.2352/ISSN.2470-1173.2019.11.IPAS-262 © 2019, Society for Imaging Science and Technology Parameter optimization in H.265 Rate-Distortion by single- frame semantic scene analysis Ahmed M. Hamza; University of Portsmouth Abdelrahman Abdelazim; Blackpool and the Fylde College Djamel Ait-Boudaoud; University of Portsmouth Abstract with Q being the quantization value for the source. The relation is The H.265/HEVC (High Efficiency Video Coding) codec and derived based on several assumptions about the source probability its 3D extensions have crucial rate-distortion mechanisms that distribution within the quantization intervals, and the nature of help determine coding efficiency. We have introduced in this the rate-distortion relations themselves (constantly differentiable work a new system of Lagrangian parameterization in RDO cost throughout, etc.). The value initially used for c in the literature functions, based on semantic cues in an image, starting with the was 0.85. This was modified and made adaptive in subsequent current HEVC formulation of the Lagrangian hyper-parameter standards including HEVC/H.265. heuristics. Two semantic scenery flag algorithms are presented Our work here investigates further adaptations to the rate- and tested within the Lagrangian formulation as weighted factors. distortion Lagrangian by semantic algorithms, made possible by The investigation of whether the semantic gap between the coder recent frameworks in computer vision. and the image content is holding back the block-coding mecha- nisms as a whole from achieving greater efficiency has yielded a Adaptive Lambda Hyper-parameters in HEVC positive answer. Early versions of the rate-distortion parameters in H.263 were replaced with more sophisticated models in subsequent stan- Introduction dards.
    [Show full text]
  • The H.264 Advanced Video Coding (AVC) Standard
    Whitepaper: The H.264 Advanced Video Coding (AVC) Standard What It Means to Web Camera Performance Introduction A new generation of webcams is hitting the market that makes video conferencing a more lifelike experience for users, thanks to adoption of the breakthrough H.264 standard. This white paper explains some of the key benefits of H.264 encoding and why cameras with this technology should be on the shopping list of every business. The Need for Compression Today, Internet connection rates average in the range of a few megabits per second. While VGA video requires 147 megabits per second (Mbps) of data, full high definition (HD) 1080p video requires almost one gigabit per second of data, as illustrated in Table 1. Table 1. Display Resolution Format Comparison Format Horizontal Pixels Vertical Lines Pixels Megabits per second (Mbps) QVGA 320 240 76,800 37 VGA 640 480 307,200 147 720p 1280 720 921,600 442 1080p 1920 1080 2,073,600 995 Video Compression Techniques Digital video streams, especially at high definition (HD) resolution, represent huge amounts of data. In order to achieve real-time HD resolution over typical Internet connection bandwidths, video compression is required. The amount of compression required to transmit 1080p video over a three megabits per second link is 332:1! Video compression techniques use mathematical algorithms to reduce the amount of data needed to transmit or store video. Lossless Compression Lossless compression changes how data is stored without resulting in any loss of information. Zip files are losslessly compressed so that when they are unzipped, the original files are recovered.
    [Show full text]
  • Design and Implementation of a Fast HEVC Random Access Video Encoder
    Design and Implementation of a Fast HEVC Random Access Video Encoder ALFREDO SCACCIALEPRE Master's Degree Project Stockholm, Sweden March 2014 XR-EE-KT 2014:003 Contents 1 Introduction 11 1.1 Background . 11 1.2 Thesis work . 12 1.2.1 Factors to consider . 12 1.3 The problem . 12 1.3.1 C65 . 12 1.4 Methods and thesis outline . 13 1.4.1 Methods . 13 1.4.2 Objective measurement . 14 1.4.3 Subjective measurement . 14 1.4.4 Test sequences . 14 1.4.5 Thesis outline . 16 1.4.6 Abbreviations . 16 2 General concepts 19 2.1 Color spaces . 19 2.2 Frames, slices and tiles . 19 2.2.1 Frames . 19 2.2.2 Slices and Tiles . 19 2.3 Predictions . 20 2.3.1 Intra . 20 2.3.2 Inter . 20 2.4 Merge mode . 20 2.4.1 Skip mode . 20 2.5 AMVP mode . 20 2.5.1 I, P and B frames . 21 2.6 CTU, CU, CTB, CB, PB, and TB . 21 2.7 Transforms . 23 2.8 Quantization . 24 2.9 Coding . 24 2.10 Reference picture lists . 24 2.11 Gop structure . 24 2.12 Temporal scalability . 25 2.13 Hierarchical B pictures . 25 2.14 Decoded picture buffer (DPB) . 25 2.15 Low delay and random access configurations . 26 2.16 H.264 and its encoders . 26 2.16.1 H.264 . 26 1 2 CONTENTS 3 Preliminary tests 27 3.1 Speed - quality considerations . 27 3.1.1 Interactive applications . 27 3.1.2 Entertainment applications .
    [Show full text]
  • Fast Coding Unit Encoding Mechanism for Low Complexity Video Coding
    RESEARCH ARTICLE Fast Coding Unit Encoding Mechanism for Low Complexity Video Coding Yuan Gao1,2,3☯, Pengyu Liu1,2,3☯, Yueying Wu1,2,3, Kebin Jia1,2,3*, Guandong Gao1,2,3 1 Beijing Advanced Innovation Center for Future Internet Technology, Beijing University of Technology, Beijing, China, 2 Beijing Laboratory of Advanced Information Networks, Beijing, China, 3 College of Electronic Information and Control Engineering, Beijing University of Technology, Beijing, China ☯ These authors contributed equally to this work. * [email protected] Abstract In high efficiency video coding (HEVC), coding tree contributes to excellent compression performance. However, coding tree brings extremely high computational complexity. Inno- vative works for improving coding tree to further reduce encoding time are stated in this OPEN ACCESS paper. A novel low complexity coding tree mechanism is proposed for HEVC fast coding Citation: Gao Y, Liu P, Wu Y, Jia K, Gao G (2016) unit (CU) encoding. Firstly, this paper makes an in-depth study of the relationship among Fast Coding Unit Encoding Mechanism for Low CU distribution, quantization parameter (QP) and content change (CC). Secondly, a CU Complexity Video Coding. PLoS ONE 11(3): coding tree probability model is proposed for modeling and predicting CU distribution. Even- e0151689. doi:10.1371/journal.pone.0151689 tually, a CU coding tree probability update is proposed, aiming to address probabilistic Editor: You Yang, Huazhong University of Science model distortion problems caused by CC. Experimental results show that the proposed low and Technology, CHINA complexity CU coding tree mechanism significantly reduces encoding time by 27% for lossy Received: September 23, 2015 coding and 42% for visually lossless coding and lossless coding.
    [Show full text]
  • ITC Confplanner DVD Pages Itcconfplanner
    Comparative Analysis of H.264 and Motion- JPEG2000 Compression for Video Telemetry Item Type text; Proceedings Authors Hallamasek, Kurt; Hallamasek, Karen; Schwagler, Brad; Oxley, Les Publisher International Foundation for Telemetering Journal International Telemetering Conference Proceedings Rights Copyright © held by the author; distribution rights International Foundation for Telemetering Download date 25/09/2021 09:57:28 Link to Item http://hdl.handle.net/10150/581732 COMPARATIVE ANALYSIS OF H.264 AND MOTION-JPEG2000 COMPRESSION FOR VIDEO TELEMETRY Kurt Hallamasek, Karen Hallamasek, Brad Schwagler, Les Oxley [email protected] Ampex Data Systems Corporation Redwood City, CA USA ABSTRACT The H.264/AVC standard, popular in commercial video recording and distribution, has also been widely adopted for high-definition video compression in Intelligence, Surveillance and Reconnaissance and for Flight Test applications. H.264/AVC is the most modern and bandwidth-efficient compression algorithm specified for video recording in the Digital Recording IRIG Standard 106-11, Chapter 10. This bandwidth efficiency is largely derived from the inter-frame compression component of the standard. Motion JPEG-2000 compression is often considered for cockpit display recording, due to the concern that details in the symbols and graphics suffer excessively from artifacts of inter-frame compression and that critical information might be lost. In this paper, we report on a quantitative comparison of H.264/AVC and Motion JPEG-2000 encoding for HD video telemetry. Actual encoder implementations in video recorder products are used for the comparison. INTRODUCTION The phenomenal advances in video compression over the last two decades have made it possible to compress the bit rate of a video stream of imagery acquired at 24-bits per pixel (8-bits for each of the red, green and blue components) with a rate of a fraction of a bit per pixel.
    [Show full text]
  • The H.264/MPEG4 Advanced Video Coding Standard and Its Applications
    SULLIVAN LAYOUT 7/19/06 10:38 AM Page 134 STANDARDS REPORT The H.264/MPEG4 Advanced Video Coding Standard and its Applications Detlev Marpe and Thomas Wiegand, Heinrich Hertz Institute (HHI), Gary J. Sullivan, Microsoft Corporation ABSTRACT Regarding these challenges, H.264/MPEG4 Advanced Video Coding (AVC) [4], as the latest H.264/MPEG4-AVC is the latest video cod- entry of international video coding standards, ing standard of the ITU-T Video Coding Experts has demonstrated significantly improved coding Group (VCEG) and the ISO/IEC Moving Pic- efficiency, substantially enhanced error robust- ture Experts Group (MPEG). H.264/MPEG4- ness, and increased flexibility and scope of appli- AVC has recently become the most widely cability relative to its predecessors [5]. A recently accepted video coding standard since the deploy- added amendment to H.264/MPEG4-AVC, the ment of MPEG2 at the dawn of digital televi- so-called fidelity range extensions (FRExt) [6], sion, and it may soon overtake MPEG2 in further broaden the application domain of the common use. It covers all common video appli- new standard toward areas like professional con- cations ranging from mobile services and video- tribution, distribution, or studio/post production. conferencing to IPTV, HDTV, and HD video Another set of extensions for scalable video cod- storage. This article discusses the technology ing (SVC) is currently being designed [7, 8], aim- behind the new H.264/MPEG4-AVC standard, ing at a functionality that allows the focusing on the main distinct features of its core reconstruction of video signals with lower spatio- coding technology and its first set of extensions, temporal resolution or lower quality from parts known as the fidelity range extensions (FRExt).
    [Show full text]
  • H.264/MPEG-4 Advanced Video Coding Alexander Hermans
    Seminar Report H.264/MPEG-4 Advanced Video Coding Alexander Hermans Matriculation Number: 284141 RWTH September 11, 2012 Contents 1 Introduction 2 1.1 MPEG-4 AVC/H.264 Overview . .3 1.2 Structure of this paper . .3 2 Codec overview 3 2.1 Coding structure . .3 2.2 Encoder . .4 2.3 Decoder . .6 3 Main differences 7 3.1 Intra frame prediction . .7 3.2 Inter frame prediction . .9 3.3 Transform function . 11 3.4 Quantization . 12 3.5 Entropy encoding . 14 3.6 In-loop deblocking filter . 15 4 Profiles and levels 16 4.1 Profiles . 17 4.2 Levels . 18 5 Evaluation 18 6 Patents 20 7 Summary 20 1 1 Introduction Since digital images have been created, the need for compression has been clear. The amount of data needed to store an uncompressed image is large. But while it still might be feasible to store a full resolution uncompressed image, storing video sequences without compression is not. Assuming a normal frame-rate of about 25 frames per second, the amount of data needed to store one hour of high definition video is about 560GB1. Compressing each image on its own, would reduce this amount, but it would not make the video small enough to be stored on today's typical storage mediums. To overcome this problem, video compression algorithms have exploited the temporal redundancies in the video frames. By using previously encoded, or decoded, frames to predict values for the next frame, the data can be compressed with such a rate that storage becomes feasible.
    [Show full text]
  • The Open-Source Turing Codec: Towards Fast, Flexible and Parallel Hevc Encoding
    THE OPEN-SOURCE TURING CODEC: TOWARDS FAST, FLEXIBLE AND PARALLEL HEVC ENCODING S. G. Blasi1, M. Naccari1, R. Weerakkody1, J. Funnell2 and M. Mrak1 1BBC, Research and Development Department, UK 2Parabola Research, UK ABSTRACT The Turing codec is an open-source software codec compliant with the HEVC standard and specifically designed for speed, flexibility, parallelisation and high coding efficiency. The Turing codec was designed starting from a completely novel backbone to comply with the Main and Main10 profiles of HEVC, and has many desirable features for practical codecs such as very low memory consumption, advanced parallelisation schemes and fast encoding algorithms. This paper presents a technical description of the Turing codec as well as a comparison of its performance with other similar encoders. The codec is capable of cutting the encoding complexity by an average 87% with respect to the HEVC reference implementation for an average coding penalty of 11% higher rates in compression efficiency at the same peak-signal-noise-ratio level. INTRODUCTION The ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG) combined their expertise to form the Joint Collaborative Team on Video Coding (JCT-VC), and finalised the first version of the H.265/High Efficiency Video Coding (HEVC) standard (1) in January 2013. HEVC provides the same perceived video quality as its predecessor H.264/Advanced Video Coding (AVC) at considerably lower bitrates (the MPEG final verification tests report average bitrate reductions of up to 60% (2) when coding Ultra High Definition, UHD, content). HEVC specifies larger block sizes for prediction and transform, an additional in-loop filter, and new coding modes for intra- and inter-prediction.
    [Show full text]