Signal Processing for Improved MPEG-based Communication Systems


PROEFSCHRIFT

ter verkrijging van de graad van doctor aan de Technische Universiteit Eindhoven, op gezag van de rector magnificus prof.dr.ir. F.P.T. Baaijens, voor een commissie aangewezen door het College voor Promoties, in het openbaar te verdedigen op woensdag 9 december 2015 om 16:00 uur

door

Onno Eerenberg

geboren te Zwolle

Dit proefschrift is goedgekeurd door de promotor en de samenstelling van de promotiecommissie is als volgt:

voorzitter: prof.dr.ir. J.W.M. Bergmans
1e promotor: prof.dr.ir. P.H.N. de With
copromotor(en): prof.dr. R.M. Aarts
leden: prof.dr. C. Hentschel (Brandenb. Univ. of Technol. Cottbus, Germany)
prof.dr. S. Sherratt (Univ. of Reading, United Kingdom)
prof.dr.ir. J.J. Lukkien
adviseur(s): dr. P. Hofman (Philips IP&S)

Het onderzoek of ontwerp dat in dit proefschrift wordt beschreven is uitgevoerd in overeenstemming met de TU/e Gedragscode Wetenschapsbeoefening.

Nulla tenaci invia est via – Voor de volhouder is geen weg onbegaanbaar (Spyker Automobielen N.V.).

CIP-DATA LIBRARY TECHNISCHE UNIVERSITEIT EINDHOVEN

Onno Eerenberg

Signal Processing for Improved MPEG-based Communication Systems / by Onno Eerenberg. - Eindhoven: Technische Universiteit Eindhoven, 2015.
A catalogue record is available from the Eindhoven University of Technology Library.
ISBN: 978-90-386-3979-6
NUR: 959
Trefw.: videocompressie / personal recording / MPEG-2 / H.264/MPEG4-AVC / trick play / DVB-H / video coding artifact detection.
Subject headings: conventional video navigation / advanced video navigation / MPEG-2-compliant video navigation / DVB-H link layer / mosquito and ringing artifact location detection.

Cover design: D.J.M. Frishert. Printed by: Dereumaux.

Copyright © 2015 by O. Eerenberg

All rights reserved. No part of this material may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without the prior permission of the copyright owner.

Summary

Signal Processing for Improved MPEG-based Communication Systems

This thesis describes improvements for MPEG-based consumer communication systems and focuses on three areas. The first area addresses intra-program video navigation for disk-based storage media, where three forms of video navigation are investigated, enabling conventional as well as more advanced forms of video navigation. The second area presents an efficient and robust data link layer for DVB-H, a broadcast standard targeting battery-powered mobile reception. The improved link layer results in a higher robustness and efficiency. The third area addresses picture quality for digital-television reception, presenting two detection systems for locating visual-coding artifact regions, which are potentially contaminated with either mosquito noise or ringing. The location information is used to attenuate the detected coding noise. The emphasis of the presented work is on embedded system solutions to be integrated into existing consumer platforms. The three areas are briefly summarized below.

In this thesis, three navigation techniques are presented for disk-based storage systems. The first navigation technique equals that of full-frame fast-search and slow-motion playback and is suitable for a push-based architecture, enabling deployment in a networked client-server system setup. Networked full-frame fast-search video navigation is based on re-using intra-coded MPEG-compressed normal-play video pictures. The proposed solution divides the signal processing for navigation over both the recording and the navigation-playback operation mode. It is based on finding characteristic point information during recording, revealing the storage locations of intra-coded pictures, which are then re-used for the generation of a fast-search navigation sequence. Furthermore, in order to adapt the frame rate, refresh rate, bit rate and playback speed, the solution employs repetition pictures, which repeat normal-play reference pictures. On the basis of field-based repetition pictures, rendering control at field level is obtained, enabling efficient removal of field-based video information (interlace kill), thereby avoiding motion judder during navigation and enabling the navigation method to be applied to both progressive and interlaced video formats. Slow-motion video navigation is implemented in a similar fashion. When applying repetition pictures to control the Quality of Experience (QoE) during navigation, the refresh rate should not drop below 1/3 of the frame rate, otherwise a slide-show effect occurs, whereby the viewer loses the fast-search navigation experience. For a typical video broadcast at SD resolution and 4 Mbit/s, the required DSP processing during recording requires a cycle frequency of 5 MHz, while a cycle frequency of 22 MHz is required for full-frame fast-search video playback with speed-up factor 12 and 5 MHz for slow motion with playback speed 0.5. Both fast-search and slow-motion video navigation can be operated with a reduced refresh rate and thus a considerably lower cycle frequency.

Based on the drawbacks associated with full-resolution fast-search trick play, a video navigation technique is presented based on a hierarchical mosaic-screen representation. This screen is composed of re-used MPEG-compressed subpictures, avoiding re-encoding of pictorial data. Hierarchical mosaic screens enable an instant overview of video information associated with a large temporal interval, thereby eliminating the individual picture refresh rate from the final video navigation rendering. Each subpicture is coded at a fixed bit cost, thereby simplifying the construction of the final mosaic screen in the compressed domain. A fixed-cost subpicture is achieved by dividing each subpicture into a set of “mini slices”, which are also encoded at a fixed bit cost. Furthermore, when subpictures use P-type coding syntax, new mosaic screens can be constructed using predictive coding, based on re-used subpictures available at the MPEG decoder. Both aspects clearly reduce the complexity of the implementation. The continuous derivation of subpictures for mosaic screens requires only a low fraction of the computation complexity, because the intra-coded normal-play pictures appear at a rate of only 2 Hz, which results in a processing load of 0.3 Hz when a scene duration of 3 seconds is used. It was found that this system can be implemented with the same architecture as the first navigation solution, because the processing required for the construction of a mosaic screen has a high resemblance with the fast-search full-frame navigation solution. We therefore expect that the involved playback processing for mosaic-screen navigation will show a similar throughput and DSP cycle load.

Finally, an audio-enhanced dual-window video navigation technique is presented, combining normal-play audiovisual fragments with a down-scaled fast-search information signal. This representation addresses human perception, which employs both visual and auditory cues. Hereby, detailed normal-play information is rendered in a main window, while a coarse overview is provided by fast-search information rendered in a second Picture-in-Picture (PiP) window. Due to the simultaneous rendering, a viewer can switch between the two information signals, perceiving either a fast or a detailed overview, guiding the viewer to the requested audiovisual fragment.

The main navigation signal is based on re-using normal-play audio and video fragments of a typical length of 3 seconds (approx. 6 GOPs), which are fully decoded. The fast-search navigation signal is derived from decoded normal-play intra-coded pictures, which are scalable MPEG-2 decoded already during recording. Because these navigation pictures may be of lower quality, the scalability enables downscaling already while decoding. These lower-quality pictures are stored, next to the other essential Characteristic Point Information (CPI), in the metadata database. During playback, the concatenated stream fragments are modified to ensure MPEG-2-compliant formatting, enabling seamless decoding. This modification involves, among others, audio padding, removal of predictive pictures without a reference and modification of the time base for navigation. Scalability is implemented by partial decoding of the intra-compressed data, which is worked out for both the MPEG-2 and H.264/MPEG4-AVC standards. The obtained picture quality for both standards is in the range of 23-32 dB, which is lower than the normal-play quality but sufficient for navigation purposes. The three proposed navigation solutions show a high commonality with respect to the required signal processing and the execution architectures, so that they can be jointly implemented in one overall architecture.
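To make the interplay between speed-up factor, refresh rate and repetition pictures concrete, the sketch below derives a fast-search schedule from the figures quoted above (intra-coded pictures every 0.5 s, refresh rate at least 1/3 of the frame rate). The function name, the rounding strategy and the error handling are illustrative assumptions, not the thesis' literal algorithm.

```python
def fast_search_schedule(speedup, frame_rate, refresh_rate, intra_period=0.5):
    """Return (stride, repeats): show every `stride`-th normal-play I-picture,
    each followed by `repeats` repetition pictures.

    intra_period is the normal-play spacing of intra-coded pictures in
    seconds (0.5 s for the 2-Hz I-picture rate assumed above)."""
    if refresh_rate < frame_rate / 3:
        raise ValueError("refresh rate below 1/3 of the frame rate "
                         "causes the slide-show effect described above")
    # Each new picture is held for frame_rate/refresh_rate frame periods,
    # i.e. it is followed by that many minus one repetition pictures.
    repeats = round(frame_rate / refresh_rate) - 1
    # Per refresh interval, normal-play time must advance by
    # speedup/refresh_rate seconds, i.e. by this many I-picture periods.
    stride = max(1, round(speedup / refresh_rate / intra_period))
    return stride, repeats

# Example: 25-Hz video, speed-up factor 12, refresh at 1/3 of the frame rate:
# every 3rd I-picture is shown and repeated twice (realized speed-up ~12.5;
# the exact factor is settled by the compliance processing of Chapter 3).
print(fast_search_schedule(speedup=12, frame_rate=25, refresh_rate=25 / 3))
```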

The second contribution of this thesis aims at improving the DVB-H link layer for mobile battery-powered MPEG communication. The problem of the standard solution in a receiver is that it does not provide sufficient robustness and is also based on inefficient bandwidth usage. The solution is based on locally obtained reliability and location information, in the form of 2-bit erasure flags, an Internet Protocol Entry Table (IPET) and a Correct Row Index Table (CRIT). The reliability information is derived on the basis of dual-stage FEC decoding, while the location information is derived from correctly received broadcast data. Hereby, the primary FEC is performed by the channel decoder, while the secondary FEC is integrated in the DVB-H link layer. The reliability information derived from the primary FEC is employed in two ways. First, this reliability information is applied for error and erasure decoding by the secondary FEC stage. For the situation that an MPE-FEC frame is still incorrect after this second FEC stage, this reliability information is used a second time, in combination with reliability information derived from the secondary FEC decoding stage, supplemented with the IPET and CRIT information. In this way, correctly received and corrected IP datagrams are extracted from the defect MPE-FEC frame. Using this new link-layer concept, a significantly improved performance was realized. As a result, the robustness for retrieving completely correct MPE-FEC frames improves by approximately 50%. However, the performance curves cluster around the same critical performance degradation point, resulting in a minor graceful signal degradation when errors cannot be avoided. Efficiency is achieved by avoiding IP datagram duplication, which means that correct IP datagrams, either correctly received or corrected by FEC, are forwarded only once to the IP stack, which optimizes bandwidth usage and contributes to a reduced power consumption.
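The error-and-erasure step can be illustrated with a small check per MPE-FEC row: DVB-H uses an RS(255,191) code, which succeeds as long as twice the number of byte errors plus the number of erasures stays within the 64 parity bytes. The 2-bit flag grading below follows the concept described above, but the flag names and the pessimistic treatment of "suspect" bytes are assumptions of this sketch.

```python
# Per-byte reliability grades, modeled after the 2-bit erasure flags above;
# the names and the handling of SUSPECT bytes are assumptions.
RELIABLE, SUSPECT, UNRELIABLE, ERASED = 0, 1, 2, 3

def row_decodable(flags, n=255, k=191):
    """Check whether one MPE-FEC row can be repaired by RS(n, k)
    errors-and-erasures decoding: 2 * errors + erasures <= n - k.
    Bytes flagged UNRELIABLE or ERASED are offered as erasures; SUSPECT
    bytes are budgeted as potential (unlocated) errors."""
    assert len(flags) == n
    erasures = sum(f >= UNRELIABLE for f in flags)
    errors = sum(f == SUSPECT for f in flags)
    return 2 * errors + erasures <= n - k

row = [RELIABLE] * 255
row[10:40] = [ERASED] * 30   # e.g. one lost transport-stream burst
print(row_decodable(row))    # True: 30 erasures fit the 64-byte parity budget
```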

MPEG-based video broadcasting introduces coding artifacts due to the quantization of signal components when the system is operated at limited bandwidth. Reduction of these coding artifacts requires a detection stage followed by an attenuation stage, whereby the detection should address different types of coding noise. Two block-based artifact-location detection systems are presented, operating in either the spatial domain or the frequency domain, for detecting the locations of mosquito-noise and ringing-noise patterns. The two detectors generate a filter-strength control signal, which is used by an entropy-based adaptive low-pass filter, attenuating the local contamination while preserving the overall picture quality. The activity of the detected noise or ringing is classified into three simple indications: “flat and/or low-frequency”, “noise contaminated (mosquito noise and ringing)” and “texture”. These signal features are employed in two ways. First, on the basis of the block-based signal classification, a simplified video model is derived, using the surrounding blocks of the considered area and classifying those blocks in a similar way. These surrounding blocks act as additional context information suitable for context reasoning about the visibility of possible coding noise. This gives a more robust location detection of potentially noise-pattern contaminated regions. Second, these block-based signal features are also employed afterwards to expand the detection signal, employing a diamond-shaped aperture. In this way, the detection signal becomes more consistent as a filter control signal and increases the area coverage of the detected contaminated region. Artifact detection in the spatial domain, employing a detection kernel of 11 × 7 pixels based on a fixed block size of 3 × 3 pixels, has a detection score of 80 and 86% for JPEG-compressed pictures with Q = 25 and Q = 50, respectively. For the frequency-domain detection system, employing a detection kernel of 20 × 12 pixels based on a fixed block size of 4 × 4 pixels, this detection score is 98 and 99% for JPEG-compressed pictures with Q = 50 and Q = 25, respectively. The spatial-domain detection method appeared to be slightly better and more accurate with respect to the object contours. The spatial-domain solution provides a moderate average increase in PSNR of +0.05 dB, but with locally strongly improved areas. Furthermore, the combined system avoids excessive image blur from filtering, e.g. noticeable from the positive PSNR contributions, which is due to the context reasoning in the kernel area and the detailed decision making.
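A minimal sketch of the block classification and the context reasoning is given below, with per-block variance standing in for the thesis' activity metrics and with hypothetical thresholds; the real detectors operate on 3 × 3 or 4 × 4 blocks inside 11 × 7 or 20 × 12 pixel kernels.

```python
import numpy as np

FLAT, NOISY, TEXTURE = 0, 1, 2  # the three indications named above

def classify_blocks(luma, block=3, t_flat=4.0, t_texture=80.0):
    """Grade each block x block region by a simple activity measure
    (variance here; the thesis detectors use dedicated metrics)."""
    rows, cols = luma.shape[0] // block, luma.shape[1] // block
    cls = np.empty((rows, cols), dtype=np.uint8)
    for by in range(rows):
        for bx in range(cols):
            act = luma[by*block:(by+1)*block, bx*block:(bx+1)*block].var()
            cls[by, bx] = FLAT if act < t_flat else (
                NOISY if act < t_texture else TEXTURE)
    return cls

def context_detect(cls):
    """Context reasoning over the surrounding blocks: a NOISY block is only
    reported when its neighbourhood also contains FLAT blocks, i.e. the
    activity borders a flat region where coding noise is not masked."""
    det = np.zeros(cls.shape, dtype=bool)
    for by in range(1, cls.shape[0] - 1):
        for bx in range(1, cls.shape[1] - 1):
            hood = cls[by-1:by+2, bx-1:bx+2]
            det[by, bx] = (cls[by, bx] == NOISY) and bool((hood == FLAT).any())
    return det
```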

Samenvatting

Signaalbewerking voor verbeteringen in mpeg-gebaseerde communicatiesystemen

Dit proefschrift beschrijft verbeteringen in drie toepassingsgebieden van mpeg-gebaseerde videocommunicatiesystemen. De eerste toepassing gaat over videonavigatie binnen een tv-programma voor diskgebaseerde consumentenproducten, waarvoor drie vormen van videonavigatie zijn onderzocht, die geschikt zijn voor zowel conventionele als geavanceerde vormen van videonavigatie. Het tweede toepassingsgebied presenteert een efficiënte en robuuste DVB-H data link layer voor een mobiele digitale tv-ontvanger met batterijvoeding. De derde toepassing gaat over beeldkwaliteitsverbetering van digitale tv-ontvangers. Er worden twee detectiesystemen gepresenteerd voor de detectie van potentieel zichtbare mosquito-ruis en ringing-effecten ten gevolge van videocompressie. De verkregen locatie-informatie wordt toegepast voor het reduceren van de gedetecteerde eerder genoemde coderingsruis. De drie toepassingsgebieden zijn hieronder samengevat.

In dit proefschrift worden drie navigatietechnieken gepresenteerd voor diskgebaseerde opslagsystemen. De eerste techniek omvat de versnelde en vertraagde videoweergave op basis van volledige tv-beelden. Deze methode is geschikt voor een push-gebaseerde systeemarchitectuur, die kan worden gebruikt in een genetwerkt client-serversysteem. Versnelde navigatie met volledige beelden in een genetwerkte systeemopzet is gebaseerd op het hergebruik van intraframe-gecodeerde mpeg tv-beelden. De gekozen oplossing verdeelt de signaalbewerking over de opname- en weergaveperiode. De oplossing is gebaseerd op het bepalen van de karakteristieke informatie tijdens opname die de opslaglocaties aangeeft van intraframe-gecodeerde tv-beelden, die worden hergebruikt voor het genereren van de versnelde navigatievideo.

Voor het aanpassen van de beeldfrequentie, beeldverversingssnelheid (refresh rate), bitsnelheid (bit rate) en weergavesnelheid worden in de oplossing zogenaamde herhalingsbeelden toegepast, die zorgen voor het herhaald presenteren van een referentiebeeld. Met field-gebaseerde herhalingsbeelden kan de weergave van geïnterlinieerde videobeelden worden gecontroleerd. Hiermee wordt het mogelijk om field-gebaseerde videoinformatie te verwijderen (interlace kill), waarmee bewegingstrilling tijdens de navigatie wordt vermeden, zodat deze navigatiemethode geschikt is voor progressive en geïnterlinieerde videoformaten. De vertraagde weergave wordt op een vergelijkbare manier gerealiseerd. Wanneer herhalingsbeelden tijdens versnelde videonavigatie worden toegepast voor het doorvoeren van Quality-of-Experience (QoE), dan moet de beeldverversing (refresh rate) niet beneden 1/3 van de beeldfrequentie zakken, anders ontstaat er een diashow-effect, waardoor de kijker het gevoel van versneld afspelen verliest. Voor een typische tv-uitzending met standaardresolutie (SD) op 4 Mbit/s resulteert de signaalbewerking tijdens opname in een 5 MHz kloksnelheid van de DSP-processor, terwijl een DSP-klokfrequentie van 22 MHz nodig is voor twaalfvoudig versnelde weergave met volledige beelden, en met 5 MHz kloksnelheid voor de DSP-processor voor vertraagd afspelen op halve snelheid. Zowel versnelde als vertraagde navigatieweergave kan worden gerealiseerd met een gereduceerde refresh rate en dus met een aanzienlijk lagere DSP-klokfrequentie.

De tweede gepresenteerde navigatiemethode is gebaseerd op een hiërarchisch mozaïekscherm, die probeert de nadelen van versnelde weergave met volledige beelden te vermijden. Het mozaïekscherm bestaat uit hergebruikte gecomprimeerde deelbeelden, waardoor opnieuw coderen niet meer nodig is voor de constructie van een mozaïekscherm. Hiërarchische mozaïekschermen bieden een instantaan overzicht van videoinformatie over een groot tijdsinterval, dat onafhankelijk is van de weergavetijd van individuele beelden in de uiteindelijke videonavigatie. Ieder deelbeeld is gecomprimeerd met een constante hoeveelheid bits (bit cost), waardoor de constructie van een mozaïekscherm met gecomprimeerde data wordt vereenvoudigd. Een deelbeeld met vaste bit cost is gerealiseerd door ieder deelbeeld op te splitsen in een set van “mini-slices” (series van vierkante blokken met beeldinformatie), die zelf ook weer op een vaste bit cost zijn gecodeerd. Wanneer de deelbeelden gebruik maken van de P-type coderingssyntax, dan kunnen nieuwe mozaïekschermen worden geconstrueerd met behulp van predictive codering, door gebruik te maken van deelbeelden die al beschikbaar zijn bij de mpeg-decoder. Beide aspecten verminderen de complexiteit van de implementatie aanzienlijk. Het voortdurend afleiden van deelbeelden voor mozaïekschermen vormt maar een klein gedeelte van de rekenkundige complexiteit, omdat deze intra-gecodeerde beelden verschijnen met een beeldfrequentie van 2 Hz, wat resulteert in een beeldverwerkingssnelheid van 0.3 Hz bij een scèneduur van 3 seconden. Uit onderzoek blijkt dat dit systeem kan worden geïmplementeerd met dezelfde architectuur als de eerste navigatiemethode, omdat de vereiste bewerkingen voor de constructie van een mozaïekschermgebaseerde videonavigatie een grote overeenkomst hebben met de versnelde navigatie met volledige beelden. Het is daarom aannemelijk dat de signaalbewerking voor mozaïekschermnavigatie een vergelijkbare bandbreedte en DSP-kloksnelheid nodig heeft.
Als laatste wordt een duovenster-videonavigatie gepresenteerd die met audio is uitgebreid, waarbij audiovisuele fragmenten worden gecombineerd met een versnelde zoekweergave, die wordt gemaakt van beelden met verlaagde resolutie. Deze navigatievorm appelleert aan zowel de visuele als de auditieve perceptie van de gebruiker. Bij deze combinatie worden gedetailleerde visuele programmafragmenten afgebeeld in een hoofdscherm en wordt de minder gedetailleerde maar versnelde navigatie afgebeeld in een tweede, kleiner venster als een Picture-in-Picture (PiP). Door het gelijktijdig presenteren van beide deelnavigaties kan de kijker dynamisch kiezen tussen een gedetailleerd en een minder gedetailleerd informatiesignaal, die hem naar de gewenste videofragmenten leiden.

Het gedetailleerde navigatiesignaal in het hoofdscherm is gebaseerd op hergebruikte audiovisuele fragmenten met een typische tijdsduur van 3 seconden (ongeveer 6 GOPs), die volledig gedecodeerd worden. Het versnelde navigatiesignaal is afgeleid van intraframe-gecomprimeerde beelden, welke tijdens opname zijn verkregen via schaalbare mpeg-decodering. Omdat deze navigatiebeelden een lagere kwaliteit hebben, kan een vereenvoudiging al worden gerealiseerd tijdens het decoderen door middel van schaalbaarheid. De benodigde beelden worden opgeslagen samen met andere essentiële karakteristieke informatie (Characteristic Point Information (CPI)) in de metadata database. Voor het vloeiend decoderen worden tijdens het afspelen de geconcateneerde audiovisuele fragmenten aangepast voor het verkrijgen van een mpeg-standaardsignaal. Deze modificatie omvat o.a. het aanpassen van audio padding, verwijdering van predictive beelden zonder tijdsreferentie en aanpassing van de navigatietijdbasis. Schaalbaarheid is gerealiseerd door partiële decodering van intra-gecodeerde beelden en uitgewerkt voor zowel de mpeg-2- als de h.264/mpeg-4-standaard. De resulterende navigatiebeeldkwaliteit voor beide standaarden is in de orde van 23-32 dB en is lager dan die van de originele opname, maar is van voldoende kwaliteit voor navigatiedoeleinden. De drie besproken navigatiemethoden vertonen een hoge mate van gelijkenis met betrekking tot de vereiste signaalbewerking en executiearchitectuur, zodat deze kunnen worden geïmplementeerd met dezelfde gemeenschappelijke architectuur.

De tweede bijdrage van dit proefschrift is gericht op het verbeteren van de DVB-H link layer voor mobiele digitale tv-ontvangers met batterijvoeding. Het probleem van de standaardimplementatie van het systeem is gebrek aan voldoende robuustheid en inefficiënt bandbreedtegebruik. De voorgestelde oplossing is gebaseerd op lokaal verkregen betrouwbaarheids- en locatie-informatie, in de vorm van 2-bit foutenvlaggen (erasure flags) en twee nieuwe, intern gebruikte tabellen, namelijk de Internet Protocol Entry Table (IPET) en de Correct Row Index Table (CRIT). De betrouwbaarheidsinformatie is verkregen op basis van twee foutencorrigerende decoders (FEC), terwijl de locatie-informatie is afgeleid van de foutloos ontvangen data. Hierbij wordt de primaire FEC uitgevoerd door de kanaaldecoder, terwijl de secondaire FEC wordt berekend door de DVB-H link layer. Het gebruik van betrouwbaarheidsinformatie die is afgeleid van de primaire FEC wordt op twee manieren toegepast. Ten eerste wordt de betrouwbaarheidsinformatie gebruikt door de secondaire FEC voor het decoderen volgens de foutencorrectie-met-erasuremethode. Voor de situatie dat na de secondaire FEC een MPE-FEC-dataveld nog steeds fouten bevat, wordt de betrouwbaarheidsinformatie een tweede keer toegepast, dit in combinatie met betrouwbaarheidsinformatie afgeleid van de secondaire FEC-decodering, die is aangevuld met de IPET- en CRIT-tabelinformatie. Op deze wijze worden correcte en gecorrigeerde IP-datagrammen geëxtraheerd uit een defect MPE-FEC-dataveld. Gebruik van dit nieuwe link-layer-concept levert een significante verbetering op, waarbij de robuustheid voor het construeren van volledig correcte MPE-FEC-datavelden met ongeveer 50% wordt verbeterd. Helaas clusteren de prestatiegrafieken rond hetzelfde kritische degradatiepunt, hetgeen leidt tot een matige geleidelijke degradatie wanneer toch fouten optreden. De systeemefficiëntie is verhoogd door duplicatie van een IP-datagram te vermijden, zodat correcte IP-datagrammen maar één keer aan de netwerklaag worden aangeboden. Hierdoor wordt de bandbreedte geoptimaliseerd, wat bijdraagt aan een gereduceerd vermogensverbruik.

Tv-uitzendingen die gebaseerd zijn op de mpeg-standaard introduceren coderingsruis en artefacten zoals “ringing” ten gevolge van het kwantiseren van signaalcomponenten, wanneer het tv-systeem met beperkte bandbreedte wordt gebruikt. Reductie van deze coderingsartefacten vereist een detectiestap gevolgd door een verzwakkingsstap in de signaalverwerking, waarbij de detectie verschillende soorten coderingsruis moet identificeren. Er worden twee detectiesystemen geïntroduceerd, die of in het spatiële domein of in het frequentiedomein werken voor het bepalen van de locaties voor het optreden van mosquito-ruis en ringing. Beide detectiesystemen produceren een controlesignaal, dat door een entropiegebaseerd adaptief laagdoorlaatfilter wordt gebruikt voor het verzwakken van de lokale coderingsruis met behoud van de globale beeldkwaliteit. De activiteit van de gedetecteerde ruis of ringing wordt geclassificeerd in drie eenvoudige indicaties, zoals “vlak en/of laagfrequent”, “ruisvervuild” (mosquito-ruis en ringing) en “textuur”. Het gebruik van deze signaalclassificatie wordt op twee manieren toegepast. Ten eerste wordt een eenvoudig videomodel afgeleid voor alle videoblokken in het omliggende gebied op basis van dezelfde classificatie voor elk blok. Deze naburige blokken leveren additionele contextinformatie, die geschikt is voor het redeneren over contextuele informatie en de mogelijke zichtbaarheid van de coderingsruis. Dit levert een meer robuuste locatiedetectie op van beeldgebieden die mogelijk ruispatronen bevatten. Ten tweede worden de blokgebaseerde signaaleigenschappen tevens toegepast na de ruisdetectie voor het expanderen van het detectiesignaal op basis van een diamantvormige filterapertuur. Op deze manier ontstaat er een meer consistent filtercontrolesignaal en neemt de grootte van het te analyseren contextuele gebied toe. Artefactdetectie in het spatiële domein, gebruikmakend van een detectiegebied (kernel) van 11 × 7 pixels en een vaste blokgrootte van 3 × 3 pixels, heeft een detectiescore van respectievelijk 80 en 86% voor jpeg-gecomprimeerde beelden met Q = 25 en Q = 50. Voor detectie in het frequentiedomein, op basis van een detectiekernel van 20 × 12 pixels gebaseerd op een vaste blokgrootte van 4 × 4 pixels, is deze detectiescore respectievelijk 98 en 99% voor jpeg-gecodeerde beelden met Q = 50 en Q = 25. De detectiemethode in het spatiële domein is iets beter en meer accuraat met betrekking tot de objectcontouren. De spatiële oplossing levert een redelijke gemiddelde verbetering in PSNR van +0.05 dB, maar met lokaal sterk kwaliteitsverbeterde gebieden. Daarnaast voorkomt deze oplossing lokale beeldonscherpte (blur) door een te sterke filtering, die wordt geïllustreerd door de positieve bijdrage aan de PSNR. Deze verbetering wordt veroorzaakt door de contextinformatie over de lokale beeldomgeving en de gedetailleerde besluitvorming over de coderingsruis.


Contents

Summary i

Samenvatting v

Contents xi

1 Introduction 1
  1.1 Preliminaries 1
  1.2 Background 3
  1.3 Research scope and problem description 6
    1.3.1 Video Navigation for Digital Recording 6
    1.3.2 Efficient and Robust DVB-H Link Layer 7
    1.3.3 Block-based Visual Artifact-Location Detection 8
  1.4 Contributions of the research 8
  1.5 Outline and scientific background 10

2 Technology overview 13
  2.1 Introduction 13
  2.2 MPEG-2 Standard 15
    2.2.1 MPEG-2 Part 2: Video 15
    2.2.2 MPEG-2 Part 1: Systems 19
    2.2.3 MPEG-2 Part 3: Audio 24
  2.3 H.264/MPEG4-AVC 26
    2.3.1 Intra-Macroblock Video Compression 27
    2.3.2 Inter-Macroblock Video Compression 28
  2.4 Video Coding Artifacts 29
  2.5 Personal Video Recording 31
    2.5.1 Intra-program video navigation 33

3 MPEG-2-compliant video navigation 41
  3.1 Introduction 41


  3.2 Proposed video navigation use cases 43
  3.3 Conceptual MPEG-2-compliant video navigation 45
  3.4 Networked full-frame video navigation 50
    3.4.1 Usage of repetition pictures 50
    3.4.2 System aspects for video navigation algorithms 54
    3.4.3 Algorithm for MPEG-2-compliant fast-search trick play 60
    3.4.4 Algorithm for MPEG-2-compliant slow-motion trick play 67
  3.5 Conceptual plane-based MPEG-2-compliant video navigation 77
  3.6 Networked hierarchical mosaic-screen navigation 80
    3.6.1 Concept of hierarchical mosaic-screen navigation 81
    3.6.2 System aspects of mosaic-screen navigation algorithm 82
    3.6.3 Algorithm for MPEG-2-compliant mosaic-screen navigation 90
  3.7 Experiments and validation results of both navigation concepts 94
    3.7.1 Functional block diagram of Personal Video Recorder (PVR) 95
    3.7.2 Performance measurement of implemented fast-search navigation 98
    3.7.3 Performance measurement of implemented fast-search navigation 99
    3.7.4 Performance estimation of slow-motion navigation 101
    3.7.5 Performance estimation of navigation processing during recording 103
    3.7.6 Picture quality validation of mosaic screens 104
  3.8 Discussion on automated mosaic screen scrolling 109
  3.9 Conclusions 110

4 Audio-enhanced dual-window video navigation 115
  4.1 Introduction 115
  4.2 Background and system aspects 117
  4.3 Concept of audio-enhanced dual-window video navigation 122
    4.3.1 Temporal subsampling of dual-stream video signal 123
    4.3.2 Conceptual solutions for audio-enhanced dual-window video navigation 124
  4.4 System integration and implementation 128
    4.4.1 AV algorithm implementation of chosen navigation concept 128
    4.4.2 Functional block diagram of the chosen concept 130
  4.5 Computational reduced H.264/MPEG4-AVC intraframe decoding 133
    4.5.1 Partial reconstruction of the prediction block 134
  4.6 Experimental results 137
    4.6.1 Picture quality of scalable MPEG-2 SD video decoding 138

    4.6.2 Picture quality for H.264/MPEG4-AVC decoding concept 1 & 2 138
    4.6.3 Computation reduction for H.264/MPEG4-AVC decoding 146
    4.6.4 Test panel results on audio-enhanced dual-window navigation 147
  4.7 Conclusion 149

5 Robustness improved DVB-H link layer 153
  5.1 Introduction 153
  5.2 DVB-H link layer essentials 154
  5.3 Conceptually improved DVB-H link layer 160
  5.4 Enhanced data recovery for an improved DVB-H link layer 162
    5.4.1 Solution approach 162
    5.4.2 Algorithm for IP recovery in defect MPE-FEC frame 167
  5.5 Implementation and performance evaluation of the improved DVB-H link layer 175
    5.5.1 Improved DVB-H link layer framework 175
    5.5.2 Validation test set-up for DVB-H link layer 177
    5.5.3 Algorithm performance validation 179
    5.5.4 DVB-H link layer hardware test set-up 184
    5.5.5 Performance results of the verified improved hardware link layer 186
  5.6 Conclusions 189

6 Block-based detection systems for visual artifact location 193
  6.1 Introduction 193
  6.2 Background and related work 196
  6.3 Conceptual artifact-location detection and filtering solution 201
  6.4 Block-based artifact-location detection algorithms and low-pass filter control 205
    6.4.1 Algorithm for spatial block-based artifact-location detection 205
    6.4.2 Algorithm for frequency-domain artifact-location detection 210
    6.4.3 Algorithm for control of the filter strength by entropy-based low-pass filtering 219
  6.5 Experimental results of block-based artifact-detection 223
    6.5.1 Evaluation of the detection approaches 225
  6.6 Conclusions 244

7 Conclusions 247
  7.1 Conclusion of the individual chapters 247
  7.2 Discussion on research questions 250


  7.3 Discussion and outlook 255

Appendices 257

A MPEG-2 Adaptation field 259

B MPEG-2 Timestamps 261

C MPEG-2 Section Syntax 263

D Characteristic Point Information 267

E Artifact-location detection on full-HD upscaled video 269

Publication List 271

Complete Bibliography 273

Acronyms 287

Acknowledgements 291

Curriculum Vitae 293

“Wer das Unmögliche nicht versucht, wird das Mögliche nie erreichen.”
Hermann Hesse, 1877 – 1962

1 Introduction

1.1 Preliminaries

Communication is a process where information is shared in space, time or a combination of them. Examples of space-based communication are telephone, radio and television, whereas time-shifted communication occurs when the information is recorded and played back at a later time instance, whereby the amount of delay depends on the application. Figure 1.1 depicts a basic system setup for video broadcasting using a satellite-based communication channel. At the left-hand side, cameras capture the scenery, whereby the final video information is selected by the video mixer/editor. In order to achieve cost-effective broadcasting, the audiovisual signals provided by the mixer/editor are compressed, thereby potentially introducing visible coding artifacts in the video information signal, which negatively influence the final picture quality. The compressed audiovisual information signals are packetized and multiplexed into a packet-based signal, suitable to be transmitted across an error-prone channel. For static reception, the packets forming the multiplex are equipped with redundancy by the channel encoder, which adds block-code-based Forward Error Correction (FEC) data, extending the deployed packet length and enabling error detection and potential correction of erroneously received information. For mobile handheld-based communication, a second FEC layer may be required to secure the fidelity of the received information. The FEC-equipped signal is finally modulated, using a modulation scheme matching the transmission-channel characteristics. Besides space-based communication, Fig. 1.1 also depicts various time-based communication forms. A first form is located near the video mixer, enabling long-term and short-term storage of the captured scenery. An example of long-term storage is the generation of a Digital Versatile Disc (DVD) (pre-recorded video), whereas an example of short-term storage is the replay function, which is typically deployed during broadcasted live events. The deployed compression differs for both previous storage applications, as a result of different requirements, i.e. frame-accurate video mixing versus high-quality, high-volume storage.


Figure 1.1 — Video communication in time and space.

Note that pre-recorded video is an example of space- and time-based communication, as it is not only played back at a different time instance but, most probably, also at a different place. Another form of time-based communication is local storage at the broadcast receiver side, enabling consumers to view a broadcasted program at a later time instance. All storage devices have means for video navigation, which is defined as video playback in non-consecutive or non-chronological order, compared to the original chronological capturing order. Video navigation can be divided into traditional fast-forward or fast-backward playback and advanced search methods, which are modern forms of video navigation. The former is found in analog and digital video recorders. The latter has become possible for random-access media such as disc and silicon-based memories. However, besides random access to the storage media, another important aspect is the applied video compression standard, which influences the video navigation. Modern multimedia communication embraces the Internet Protocol (IP) for various reasons, such as robustness against latency or robustness regarding data duplication. Although the IP protocol is robust against data duplication, this aspect needs to be handled with care for battery-powered mobile communication systems deploying the IP protocol. This especially holds when designing a multi-chip receiver solution, whereby IP datagrams are forwarded to the network layer via an external interface.


In DVB-H, such a situation may arise when an additional FEC is available in the data link layer.

Cost-effective video broadcasting or storage involves video compression, which removes not only irrelevant but also relevant video information. As a result, visual coding artifacts are introduced, appearing in various forms such as blockiness, ringing, mosquito noise and contouring. Suppression of visible artifacts requires video post-processing, typically involving a Low-Pass Filter (LPF) operation. In order to avoid image blurring, such an LPF is applied in a locally-adaptive manner. Temporal noise reduction not only attenuates Gaussian noise but, up to a certain extent, also non-static coding artifacts such as mosquito noise. In order to attenuate static visible mosquito noise and ringing, artifact-location detection followed by adaptive LPF is required. As coding artifacts differ in nature, so does their detection. The detection is simplified when considering only the contamination that occurs in flat or low-frequency regions, as coding artifacts occurring in texture regions are typically masked by that texture.

1.2 Background

In the past decade, advances in various technology fields enabled an industry-led European consortium, known as the Digital Video Broadcasting (DVB) Project [1], to specify a digital video broadcast system, which replaces the various analog-based broadcast standards. Note that in the US this was conducted by an industrial consortium group called the Grand Alliance [2]. Using open standards, the DVB Project specified the physical layer and data link layer, describing satellite (DVB-S), cable (DVB-C) and terrestrial (DVB-T) distribution systems for digital audiovisual information, data and associated return channels. After establishing these three communication standards, the DVB Project continued the development of new standards, replacing or enhancing existing standards. Digital Video Broadcasting Handheld (DVB-H) is such an enhancing standard, forming a super-set of the DVB-T standard and targeting mobile handheld television reception using battery-powered devices. With the introduction of DVB, the industry responded with the development of new products serving the needs of digital communication, thereby preserving existing features found in analog products, but also enabling new features that were previously not possible in a cost-effective manner. This thesis presents improvements for MPEG-based consumer electronic systems in three non-overlapping areas, indicated by the gray-shaded blocks in Fig. 1.2. The three areas cover techniques and methods for video navigation in digital storage, robust and efficient link-layer processing for DVB-H, and reliable location detection of regions potentially contaminated with mosquito noise and ringing.


Figure 1.2 — Overview area of improvements.

A. Trick play for digital consumer storage systems

Digital video for consumer applications became available in the early 1990s, enabled by the advances in video compression techniques and associated standards and their efficient implementation in integrated circuits. Video compression deploying only intraframe coding resulted in the Digital Video (DV) [3] standard, whereas the convergence of transform coding and motion-compensated prediction into a single hybrid coding scheme resulted in various standards in the early 1990s, such as MPEG-1 [4] and MPEG-2 [5]. These standards are capable of compressing video at different quality levels with modest up to high compression ratios and varying complexity. Each of the previous standards has been deployed in a digital storage system, based on different storage media. Hereby, a distinction is made between random-access and non-random-access storage media. Examples of the former are optical disc, hard disk drive and solid-state memory, whereas an example of the latter is magnetic tape. Standards such as Video Compact Disc (VCD), Super Video Compact Disc (SuperVCD or SVCD), Digital Versatile Disc (DVD) or Blu-ray Disc (BD) make use of locator information to guide the system when performing fast-search playback. They exploit independently decodable pictures as entry points in the stream, where these pictures are also termed intraframe-coded pictures. Tape-based storage devices require data processing during recording in order to support fast-search playback modes, whereby either the normal-play information is shuffled prior to storage on tape (DV), or an extra, virtual channel is defined, containing the fast-search information signal for a particular speed-up factor, which is employed during fast-search playback (D-VHS). The aforementioned storage systems deploy either a push- or a pull-based architecture. For the situation that a push-based architecture is deployed, which is the case e.g. for D-VHS, the information stream retrieved from the storage medium must be compliant with the deployed compression standard in order to avoid buffer violations. The advantage of a pull-based decoding architecture is that it simplifies trick-play playback, resulting in lower system costs. An advantage of a push-based architecture is that the decoder can be separated from the storage system, enabling remote decoding of audiovisual information.
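To illustrate how such locator information is typically used, the sketch below records a (presentation time, byte offset) pair for every intraframe-coded entry point during recording and resolves a seek target by binary search during playback. The class and method names are hypothetical and the offsets in the example are made-up numbers.

```python
import bisect

class CpiTable:
    """Sketch of characteristic-point-information usage (hypothetical field
    names): during recording, the presentation time and byte offset of every
    intraframe-coded entry point are appended; during fast search the player
    jumps between entries instead of parsing the whole stream."""

    def __init__(self):
        self.times = []    # seconds, ascending
        self.offsets = []  # byte position of the entry-point start code

    def add_entry(self, time_s, byte_offset):
        self.times.append(time_s)
        self.offsets.append(byte_offset)

    def entry_point(self, target_s):
        """Byte offset of the last entry point at or before target_s."""
        i = bisect.bisect_right(self.times, target_s) - 1
        return self.offsets[max(i, 0)]

cpi = CpiTable()
for n in range(10):                      # an I-picture every 0.5 s
    cpi.add_entry(0.5 * n, 40_000 * n)   # made-up byte offsets
print(cpi.entry_point(2.3))              # offset of the entry at t = 2.0 s
```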

B. DVB-H mobile battery-powered handheld television

Robust handheld battery-powered television reception resulted in the development of DVB-H, a time-sliced broadcasting standard based on the Internet Protocol (IP). In order to provide sufficient robustness against channel impairments, the physical and data link layer are equipped with additional features, making DVB-H a super-set of DVB-T. One of the additional robustness features, although optional, is an additional Reed-Solomon (RS)-based Forward Error Correction (FEC), which is located in the data link layer. This additional FEC protects the transmitted IP datagrams against various channel fluctuations, such as Carrier-to-Noise (C/N) variations, Doppler and impulse interference. The data link layer FEC requires correctly and incorrectly received IP datagrams to be stored in memory prior to the FEC calculation. For the situation that one or more IP datagrams are incorrectly received, the FEC calculation is required to try to correct the erroneously received IP datagrams. For the situation that the FEC corrects all errors, all IP datagrams are forwarded to the network layer, whereby the IP datagrams are retrieved from memory using the IP datagram total-length field, which is part of the IP header. However, when the FEC cannot correct the errors, all correctly received data may be lost, as the readout mechanism based on the IP datagram total-length field may be corrupted. This data loss can be avoided when correctly received IP datagrams are stored in the FEC memory and also forwarded directly to the network layer. When after the FEC calculation all erroneous IP datagrams are corrected, all IP datagrams are successfully forwarded to the network layer, resulting in data duplication of the correctly received IP datagrams. Although IP-based communication is robust against data duplication, duplication should be avoided in a battery-powered receiver system for optimizing the power consumption. Without proper precautions, optimization of the power consumption may jeopardize the picture quality severely, due to the fact that audiovisual information is received in a time-sliced manner.
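The fragility of this length-based readout is easy to demonstrate: the sketch below chains IPv4 total-length fields through the application-data part of an MPE-FEC frame, and a single corrupted header ends the walk, losing every datagram behind it even when those bytes were received intact. The function is illustrative, not the receiver implementation.

```python
import struct

def walk_ip_datagrams(table: bytes):
    """Split the application-data part of an MPE-FEC frame into IP datagrams
    by chaining the IPv4 total-length field (bytes 2-3 of each header).
    Returns the (offset, length) pairs reachable from the start; a single
    corrupted header ends the walk and loses everything behind it."""
    found, pos = [], 0
    while pos + 20 <= len(table) and table[pos] >> 4 == 4:  # IPv4 version nibble
        (total_len,) = struct.unpack_from("!H", table, pos + 2)
        if total_len < 20 or pos + total_len > len(table):
            break   # implausible length: readout cannot continue
        found.append((pos, total_len))
        pos += total_len
    return found

# Two back-to-back 40-byte datagrams; corrupting the first length field
# makes the second datagram unreachable even though it was received intact.
dgram = bytes([0x45, 0]) + struct.pack("!H", 40) + bytes(36)
print(len(walk_ip_datagrams(dgram + dgram)))                            # -> 2
print(len(walk_ip_datagrams(b"\x45\x00\xff\xff" + dgram[4:] + dgram)))  # -> 0
```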

C. Picture quality for digital television

Bandwidth consumption in audiovisual broadcasting is dominated by the video signal. When deploying modern transform-based, motion-compensated video compression techniques, cost-effective digital video broadcasting is achieved. Video compression removes not only irrelevant but also relevant information when quantizing transform coefficients, which effectively reduces the bit cost of pictures. This quantization not only deteriorates the picture quality due to a lack of sharpness, but also introduces visible artifacts, which depend on the block size of the deployed video compression transform. Typical coding artifacts are blocking, contouring, ringing and mosquito noise, which negatively influence the perceived picture quality. As coding artifacts differ in nature, so does their detection. Mosquito noise and ringing may either be clearly visible or masked by surrounding texture. For the situation that the mosquito noise and ringing are clearly visible, artifact attenuation involves locally-adaptive low-pass filtering, which can preserve the intended texture and avoid undesirable image blur. Due to the absence of a single metric revealing the presence and location of these artifacts, reduction of coding artifacts requires a two-stage approach, involving an artifact-location detection stage followed by a carefully controlled, locally-adaptive low-pass filtering stage.
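The two-stage principle can be summarized in a few lines: a detection stage delivers a per-pixel filter strength in [0, 1], and the attenuation stage blends the input with a low-pass version, so only detected regions are softened. This is a simplification; the thesis' filter is entropy-based and locally adaptive in a more refined way, and the box filter below is merely a stand-in.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def adaptive_lpf(luma, strength, size=3):
    """Blend the input with a low-pass version, steered per pixel by a
    detection-derived `strength` map in [0, 1]: 0 leaves the pixel
    untouched, 1 replaces it by the smoothed value."""
    smoothed = uniform_filter(luma.astype(np.float32), size=size)
    return (1.0 - strength) * luma + strength * smoothed

# Usage: full filtering inside a detected region, untouched elsewhere.
img = np.random.default_rng(0).integers(0, 256, (32, 32)).astype(np.float32)
mask = np.zeros_like(img)
mask[8:16, 8:16] = 1.0
out = adaptive_lpf(img, mask)
```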

1.3 Research scope and problem description

This section delimits the scope of our research and details the research questions and design requirements at the system level for the three research fields.

1.3.1 Video Navigation for Digital Recording

The requirements for video navigation within MPEG-coded information are similar to those of conventional trick-play systems as found in digital storage media (disc, tape, etc.). However, due to the fact that the video information is stored in compressed form, smooth video navigation requires more signal processing. For the consumer domain, it is essential to develop cost-effective navigation algorithms, which can be efficiently implemented in dedicated hardware or executed in software on either a Digital Signal Processor (DSP) or on the storage system's control processor. Furthermore, unlike tape-based storage systems, which provide sequential access to the stored information, non-tape-based storage media allow random access. Modern storage systems allow random access to the compressed information, enabling high-speed search, but it is more complicated to find the useful information. In order to study cost-effective video navigation methods for digital MPEG-based audiovisual storage devices, the following research questions (RQ) are addressed in this thesis.

• RQ1 How to efficiently perform trick-play playback on MPEG-compressed audiovisual information in various communication situations?

• RQ1a How can normal-play MPEG-compressed audiovisual information be re-used for conventional trick-play playback?

• RQ1b How to perform trick play in a client-server-based networked system setup?

• RQ1c How to fulfill the bit-rate and frame-rate constraints when re-using normal-play MPEG-compressed video information?

• RQ1d What are the relations and limitations of high-speed search in relation to the MPEG-based playback navigation information?

• RQ1e What is the impact of the employed video format in relation to trick-play playback?

• RQ1f How can audio information contribute to the video navigation efficiency?

• RQ1g Is there a system architecture that allows conventional as well as more advanced video navigation methods?

1.3.2 Efficient and Robust DVB-H Link Layer

The DVB-H standard addresses IP-based DTV reception for handheld battery-powered receivers, where the DTV information is broadcasted in MPEG format. To increase reception robustness, the DVB-H link layer is equipped with an optional Reed-Solomon Forward Error Correction (RS-FEC) to correct erroneously received data. However, despite its error-correcting capabilities, the Reed-Solomon FEC is embedded in such a way that data duplication is required, leading to inefficiency and higher bandwidth requirements. In order to improve the DVB-H link layer on efficiency and robustness, additional processing is required. This study leads to the following research questions (RQs) and system requirements (SRs) that need to be addressed.

• RQ2 How to improve the robustness of a standard DVB-H link layer while avoiding excessive load on system resources?


• RQ2a How can the error recovery of the embedded RS decoder be optimized, leading to improved robustness?

• RQ2b How to communicate correctly received and FEC-corrected IP datagrams in a smooth way?

1.3.3 Block-based Visual Artifact-Location Detection

Cost-effective MPEG-based video broadcasting is inherently operated at a low bandwidth to save communication costs. Consequently, the MPEG compression system, employing a block-based Discrete Cosine Transform (DCT) and subsequent strong quantization of the DCT signal components, introduces coding noise in the video signal. When classifying the coding noise in more detail, mosquito noise and ringing are the most annoying, especially when they occur in flat and/or low-frequency regions. For the situation that these artifacts occur in a dynamic fashion, temporal-noise filtering typically attenuates this distortion. However, for the situation that these artifacts are static, a different solution is required. To facilitate detailed noise removal on the locations where the noise is annoying, artifact-location detection is required, which reveals the locations that potentially contain visible noise patterns, so that they can later be removed with selective filtering. This issue involves the following research questions (RQs).

• RQ3 How to efficiently detect visible coding-noise locations in MPEG-coded video with sufficient performance?

• RQ3a With what methods can visible MPEG noise patterns reliably be found in the image and what are the corresponding metrics?

• RQ3b How can the reliability of the detection methods be improved?

• RQ3c How can this method be embedded in a DTV platform?

1.4 Contributions of the research

This thesis presents improvements for different types of MPEG-based consumer communication systems. The conducted research covers three MPEG-related areas: navigation techniques for client-based communication and storage systems, an efficient and robust DVB-H link layer for mobile television reception, and visual artifact-location detection for image enhancement in digital television communication.


A. Navigation for client-based communication and storage systems

This thesis presents three video navigation methods for MPEG-based video communication, each addressing different navigation criteria. The first video navigation method aims at fast-search and slow-search trick play on the basis of re-used normal-play video information, generating an MPEG-compliant information signal, which is suitable to be used in a client-server-based network architecture. The second video navigation method introduces a new hierarchical mosaic-screen-based video browsing method for networked communication, presenting an instantaneous overview on the basis of a set of re-usable images. This overview is derived from a particular normal-play time interval, which depends on the employed hierarchical navigation layer. The mosaic screens, in combination with predictive-coded images, form an MPEG-2-compliant signal, suitable to be used in a client-server-based network architecture. The third video navigation method aims at providing a multi-signal navigation scheme, containing both audio and video information. The method deploys normal-play audiovisual fragments in combination with fast-search video information, which is simultaneously presented within a dual-window video screen. In this way, the navigation signal simultaneously uses the human visual and auditory cues, thereby making the scene more informative for navigation purposes, while enabling the user to perform other tasks in parallel.
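One detail of the mosaic-screen method lends itself to a short illustration: when every "mini slice" lands on a fixed byte budget, each subpicture starts at a known stream position, and mosaic screens can be assembled by plain byte copies in the compressed domain. The sketch below, with a hypothetical helper name, pads an encoded mini-slice with zero stuffing, which MPEG-2 permits in front of the next start code.

```python
def pad_mini_slice(coded_slice: bytes, budget: int) -> bytes:
    """Pad one encoded mini-slice up to a fixed byte budget with zero
    stuffing (permitted in MPEG-2 before the next start code). With every
    mini-slice at a fixed cost, subpictures sit at known stream positions,
    so mosaic screens can be assembled by plain byte copies in the
    compressed domain. A slice that overshoots its budget must be
    re-encoded with coarser quantization, never truncated."""
    if len(coded_slice) > budget:
        raise ValueError("mini-slice exceeds its bit budget; re-encode coarser")
    return coded_slice + b"\x00" * (budget - len(coded_slice))
```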

B. Efficient and Robust DVB-H Link Layer

We have presented a method to improve the efficiency and robustness of an MPEG-based DVB-H receiver, while providing a best-effort signal degradation. In a standard approach of the DVB-H link layer, correctly received or RS-FEC-corrected datagrams cannot be retrieved from a defect data frame after reception. This leads to unnecessary data duplication and communication bandwidth usage. We propose a solution that is based on locally obtained reliability and location information, facilitating an improved FEC performance and the ability to locate correctly received and locally-corrected IP datagrams. Knowledge of the location of correct IP datagrams allows them to be communicated only once to the network layer, thereby considerably reducing the involved data traffic and handling. The robustness is improved by deriving the involved reliability information on the basis of small data packets, thereby improving the balance between correctly and incorrectly received IP datagrams. By implementing our scheme, we not only improve the robustness and efficiency of MPEG-based DVB-H reception, but also reduce the power consumption considerably. This improved DVB-H link layer has been embedded in a commercial chip.


C. Detection of Visual Coding Artifact Locations

MPEG-compressed video communication is typically performed with limited bandwidth to save costs, at the expense of video coding noise in the reconstructed images. We propose two artifact-location detection methods, operating in either the spatial domain or the frequency domain, for the location detection of mosquito-noise and ringing-noise patterns. Both solutions involve a block-based detection kernel and calculate an activity metric on a per-block basis, which is divided into classes for later filter control, using a simple classification. By employing context reasoning within the detection kernel, involving the block-based signal features, a distinction is made between potentially contaminated regions and intended texture. For the situation that a potentially contaminated region is detected, the detection is two-dimensionally extended, involving the block-based signal features. We have found that the spatial-domain solution always provides a locally enhanced PSNR for a broad range of image data and compression ratios, while the frequency-domain solution shows a fluctuating performance, which only occasionally outperforms the results obtained in the spatial domain. The solution operating in the spatial domain has been successfully embedded in three commercially available DTV chips for the reduction of static noise patterns.
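The two-dimensional extension of the detection signal can be pictured as a dilation of the boolean block-based detection map with a diamond-shaped aperture, i.e. a Manhattan-distance neighbourhood; the function name and default radius below are illustrative.

```python
import numpy as np

def diamond_expand(det, radius=2):
    """Dilate a boolean block-based detection map with a diamond-shaped
    aperture (all blocks within Manhattan distance `radius`), so the
    filter-control signal also covers the surroundings of each hit."""
    out = det.copy()
    for y, x in zip(*np.nonzero(det)):
        for dy in range(-radius, radius + 1):
            span = radius - abs(dy)
            yy = y + dy
            if 0 <= yy < det.shape[0]:
                out[yy, max(0, x - span):x + span + 1] = True
    return out
```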

1.5 Outline and scientific background

This section presents an outline of the chapters and briefly elaborates the key contributions of the individual chapters. The scientific contribution of each chapter is listed per chapter on the basis of the realized publications. The thesis outline is depicted in Fig. 1.3. Chapter 2 provides an introduction to the deployed compression schemes and protocols used by DVB. The actual research contributions are split over four chapters, Chapters 3-6. Chapters 3 and 4 discuss three forms of video navigation. Chapter 5 presents a robust and efficient DVB-H link layer. Chapter 6 addresses the work on visual coding artifact-location detection and reduction. In Chapter 7, conclusions on the conducted work are presented. In Chapter 3, we discuss two solutions for MPEG-2-compliant trick play, which is required when connecting a digital TV to a remote video server. In Chapter 4, another novel video navigation method is presented, based on a dual-window video concept, combining normal-play video fragments with associated sound and a fast-search video navigation signal. Chapter 5 introduces a robust and efficient DVB-H link layer, deploying a concept that derives reliability information during reception and forward error correction, so that correctly received IP-based information is forwarded only once to the IP-stack. Chapter 6 presents two block-based detection systems for locating visual mosquito-noise and ringing artifacts, enabling visibility reduction on the basis of locally-adaptive low-pass filtering.

Chapter 2 presents the MPEG technology, which forms the basis of the conducted work and is common to all chapters. The chapter also contains a brief description of the encapsulation of MPEG-compressed data in the MPEG-2 Transport Stream, utilized in the DVB-H broadcast standard for mobile video communication. Furthermore, a brief introduction is given on digital storage and associated techniques for video navigation.

Chapter 3 addresses two video navigation solutions suitable for a client-server-based communication setup. The first method is based on conventional fast-search and slow-motion video playback for digital recording systems. The second video navigation method involves mosaic screens, constructed from images derived from the normal-play video sequence, either obtained via uniform subsampling or based on specific filter criteria, enabling video browsing in a hierarchical manner.

Figure 1.3 — Structure of the thesis.

To facilitate interoperability between the client and the server in a communication setup, a special algorithm is presented, solving the trade-off between bit rate and frame rate within the boundaries of the MPEG standard to generate a compliant trick-play information signal. The contents of this chapter are presented in 4 patent applications: US6400888B1 [6], US2002/0167607A1 [7], US2006/0050790 [8] and US2003/0231863 [9], and 2 contributions to the IEEE Int. Conf. on Cons. Electr. (ICCE) 2002 and 2003 [10], [11]. Two journal papers were published in the IEEE Trans. on CE [12], [13]. Furthermore, this work was also released as part of a book chapter [14].

Chapter 4 extends the work on video navigation by adding audio as an additional navigation signal. This leads to audio-enhanced video navigation, deploying dual-window-based trick play, utilizing normal-play fragments shown in a main window, in combination with a fast-search information signal rendered in a Picture-in-Picture (PiP) window. The results are covered in a patent application: US2007/0035666A1 [15] and a contribution to the IEEE ICCE 2008 [16]. A journal paper was published in the IEEE Trans. on CE [17].

Chapter 5 introduces an efficient and robust DVB-H link layer, using both correctly received and error-corrected data and the associated reliability information before and after FEC, to enhance the data robustness and communication efficiency. We present an algorithm to communicate the IP datagrams only once to the network layer. The link-layer robustness is improved by carefully assigning reliability information to the received data, which is later exploited to establish Quality-of-Service (QoS). The work of this chapter is based on 4 patent applications: US2008/0209477 [18], US2008/0282310 [19], US2009/0006926 [20] and US20080263428 [21]. Furthermore, it was published in the IEEE ICCE 2006 [22], [23] and 2007 [24]. It was also published in 2 journal papers in the IEEE Trans. on CE [25], [26]. It was additionally published as a book chapter [27].

Chapter 6 contributes to the detection methods for visual MPEG coding-noise locations and elaborates on two artifact-detection kernels using spatial-domain and frequency-domain representations. For both methods, and on the basis of an activity-metric classification, a local video model is derived suitable for context reasoning, enabling the identification of potential artifact contamination. The derived detection methods are both evaluated using a locally-adaptive low-pass filter, with which the actual local video quality could be improved. This work is published in IEEE ICCE 2013 [28] and in the IEEE Trans. on CE [29].

Chapter 7 discusses the obtained results and combines the methods for video navigation. The results for the improved DVB-H link layer are put in a perspective for mobile communication. It is argued that the artifact-location detection is also applicable for future video-compressed communication systems.

“De wereld is een schouwtoneel, elk speelt zijn rol en krijgt zijn deel.” – The world is a stage; everyone plays his part and receives his share. (van den Vondel, 1587 – 1679)

2 Technology overview

2.1 Introduction

In 1988, the Moving Picture Experts Group (MPEG), with official name ISO/IEC JTC1/SC29/WG11, was founded with the objective to define compression standards for audiovisual information. The first standard was MPEG-1 [4], [30], [31], which became available in 1991 and formed the basis for other standards such as Video Compact Disc (VCD) and Digital Audio Broadcasting (DAB), or related products such as MP3-based music players. Limitations of the MPEG-1 Video standard, such as the lack of support for interlaced signals and the lack of a robust transport stream format, made this standard unsuitable for Digital Video Broadcasting (DVB). The shortcomings of the MPEG-1 standard were solved by its successor MPEG-2, which became available in 1994, paving the way for DVB and also for the successors of VCD, like Super Video Compact Disc (SuperVCD or SVCD) and Digital Versatile Disc (DVD), and it initially also formed the basis for Blu-ray Disc (BD).
At the end of the 1980s, industry was working on a successor for the various analog television broadcast standards. In the USA, this resulted in forming the Grand Alliance (GA), a consortium of companies developing the American HDTV standard, while Europe established an industrial consortium known as the Digital Video Broadcasting (DVB) Project. Due to differences between the American and European broadcasting business models and associated industry influences, the digital broadcasting standards targeting the USA and Europe differ on certain aspects. Common to both broadcast systems is the usage of the MPEG-2 Video compression standard and the way of multiplexing compressed audiovisual information, described by the MPEG-2 Systems standard.
The DVB Project used the MPEG-2 standard to develop a range of broadcasting standards, utilizing international open standards, addressing distribution via terrestrial, cable and satellite channels, also known as DVB-T, DVB-C and DVB-S, respectively. Since the MPEG-2 Video standard was well equipped with television broadcasting Profiles and Levels and interlacing was well

covered, this standard was completely adopted and integrated into DVB. Although the MPEG-2 Systems standard plays a key role in DVB, the standard does not cover all aspects for constructing a robust digital broadcasting system. For this reason, the DVB Project specified the physical layer, data link layer and associated return channels. The Program Specific Information (PSI) specified by MPEG-2 is not sufficient to enable e.g. an automatic adjustment of end-user equipment. Therefore, the DVB Project introduced Service Information (SI) to solve the MPEG-2 Systems shortcomings [32]. After establishing the DVB-S/T/C communication standards, the DVB Project continued the development of new standards, replacing or enhancing existing standards. Digital Video Broadcasting Handheld (DVB-H) is such a new standard, which was enabled by a new video coding standard known as H.264/MPEG-4 AVC.
With the appearance of DVB, customers expect new products with higher quality and features that are more sophisticated compared to features known from analog systems. Digital storage products, such as Personal Video Recording (PVR), allow manufacturers to develop specific features in the area of trick play and video navigation, which distinguish their products from traditional systems. Such a specific feature is video browsing, which is also known as trick play or intra-program navigation. However, such a feature should be feasible at low cost and with minimal impact on the PVR bill of material. This requires cost-effective signal processing, typically executed on the DSP or control processor of the embedded platform.
The low output bit rate of H.264/MPEG4-AVC has enabled mobile handheld television. Although this standard separates the video coding layer from the network layer, enabling a more robust transmission, this robustness is not sufficient for handheld television. Specific standards addressing handheld broadcasting, such as DVB-H [33], [34], therefore provide additional robustness by employing an additional Forward Error Correction (FEC) stage to protect the IP-datagram-based broadcast data. To facilitate the readout of correct or corrected IP datagrams, a special mechanism is deployed, which is not part of the DVB-H standard, thereby contributing to a robust and efficient system implementation.
Cost-effective broadcasting involves lossy video compression, whereby not only irrelevant but also relevant information is removed from the video information, resulting in visual coding artifacts. MPEG-2 coded video broadcast signals typically suffer from various impairments, while modern video coding standards such as H.264/MPEG4-AVC have means to reduce particular coding artifacts. Nevertheless, coding-artifact reduction as a video post-processing step remains required in modern digital television systems. Coding artifacts differ in nature, which is confirmed by the various proposals to detect and reduce these artifacts [35]. Two closely related artifacts are mosquito noise and ringing. These artifacts occur due to the removal of frequency information by the quantization process deployed by the video encoder. The


visibility of these artifacts depends on the nature of the spatial video content. Both artifacts are typically noticeable in flat and/or low-frequency regions preceding or succeeding a texture or edge region. Detection of these transition regions, either in the spatial domain or the frequency domain, provides location information from which a succeeding filtering stage can benefit, as the discrimination between intended texture and artifact contamination is improved. Such a filtering solution is an example of locally-adaptive processing to reduce coding artifacts, while minimizing image blur.

2.2 MPEG-2 Standard

MPEG-2 is an international standard and is officially referenced as ISO/IEC 13818. The video and systems parts of this standard were developed in collaboration with the ITU-T and are therefore also listed as ITU-T Recommendation H.262 for the video part and ITU-T Rec. H.222.0 for the systems part. The ISO/IEC 13818 standard consists of 11 parts¹, each describing a specific part of the standard, see Table 2.1. The parts from Table 2.1 which are relevant for this thesis are 1, 2, 3, 5, 6 and 9.

2.2.1 MPEG-2 Part 2: Video

The MPEG-2 Video specification deploys semantics in combination with an associated syntax, which is generic and serves a wide range of applications, such as digital storage and television broadcasting. In order to enable a cost-effective practical implementation, the syntax is constrained using Profiles and Levels. Profiles define a subset of the entire syntax, while Levels constrain certain values of the allowed syntax forming the coded bit stream. MPEG-2 video compression is a block-based hybrid video coding scheme, as depicted in Fig. 2.1. From Fig. 2.1 it becomes clear that MPEG-2 encoding is more complex than MPEG-2 decoding, which is typically the case for many classes of video coding. Block-based video compression schemes operate on a group of spatially adjacent pixels. A distinction is made between intraframe and interframe compression. In intraframe compression, only spatial information is used for compression, whereas for interframe compression also information from past and/or future pictures is used, see Fig. 2.2(b). The advantage of intraframe compression is that the picture can be decoded independently, thus without the need for information from previous or future pictures, thereby providing random access into a video sequence. However, a drawback is the modest compression factor, typically requiring a transmission time

¹ Part 8 has been dropped due to lack of industry interest in 10-bit video.


Table 2.1 — International standard ISO/IEC 13818 and sub-standards.

Part      Number             ITU-T Rec.   Title
Part 1    ISO/IEC 13818-1    H.222.0      Systems
Part 2    ISO/IEC 13818-2    H.262        Video
Part 3    ISO/IEC 13818-3                 Audio
Part 4    ISO/IEC 13818-4                 Conformance testing
Part 5    ISO/IEC 13818-5                 Software simulation
Part 6    ISO/IEC 13818-6                 Extensions for Digital Storage Media Command and Control (DSM-CC)
Part 7    ISO/IEC 13818-7                 Advanced Audio Coding (AAC)
Part 8    ISO/IEC 13818-8                 10-Bit Video
Part 9    ISO/IEC 13818-9                 Extension for real-time interface for systems decoders
Part 10   ISO/IEC 13818-10                Conformance extensions for DSM-CC
Part 11   ISO/IEC 13818-11                IPMP on MPEG-2 Systems

of more than a picture display period, which is the reciprocal of the video frame rate. The main purpose of intraframe-compressed pictures is to facilitate random access to a video broadcast or stored video sequence. To achieve a higher efficiency in video compression, interframe compression needs to be part of the video coding scheme, as it involves the transmission of only the changes compared to a reference picture. In the extreme situation that two successive pictures are equal, the interframe-compression method ultimately requires only the transmission of information indicating that a picture is a duplicate of the previous or future reference picture. This duplication rarely occurs in practice, but it can occur in specific situations such as in trick-play playback. MPEG-2 video compression is achieved by applying a Discrete Cosine Transform (DCT) to a group of 8 × 8 pixels, of which the resulting coefficients are quantized and run-length coded. For the situation that the video originates from an interlaced video source, the video lines from the top or bottom field can be separated, either at macroblock level or at picture level, prior to transformation, creating a field-coded macroblock or a field-coded picture, thereby increasing the coding efficiency in case of significant motion. A macroblock consists of four luminance (Y) DCT blocks and two chrominance blocks (Cb, Cr) in case of the usual 4:2:0 color subsampling format, see Fig. 2.2(a).
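To make this tool chain concrete, the following minimal Python sketch (an illustration, not the normative MPEG-2 algorithm) applies an 8 × 8 DCT, a uniform quantizer as a stand-in for the MPEG-2 intra weighting matrix, and a zigzag zero-run/level coder; the quantizer step qstep is a hypothetical parameter.

    import numpy as np
    from scipy.fft import dctn, idctn

    def zigzag_order(n=8):
        # Diagonal scan order used to serialize an n x n coefficient block.
        idx = [(i, j) for i in range(n) for j in range(n)]
        return sorted(idx, key=lambda ij: (ij[0] + ij[1],
                                           ij[0] if (ij[0] + ij[1]) % 2 else -ij[0]))

    def encode_intra_block(block, qstep=16):
        # 2-D DCT, uniform quantization (a stand-in for the MPEG-2 intra
        # weighting matrix) and zero-run/level coding of the scanned result.
        coeff = dctn(block.astype(float), norm='ortho')
        q = np.round(coeff / qstep).astype(int)
        runs, zeros = [], 0
        for i, j in zigzag_order():
            if q[i, j] == 0:
                zeros += 1
            else:
                runs.append((zeros, int(q[i, j])))  # (run of zeros, level)
                zeros = 0
        return q, runs                              # trailing zeros imply EOB

    def decode_intra_block(q, qstep=16):
        # Inverse quantization followed by the inverse 2-D DCT.
        return idctn(q * qstep, norm='ortho')

    blk = np.add.outer(np.arange(8), np.arange(8)) * 8.0  # smooth test block
    q, runs = encode_intra_block(blk)
    print(runs, float(np.abs(decode_intra_block(q) - blk).max()))

The smooth test block compresses into only a few run/level pairs, illustrating why intraframe coding works well on low-activity content yet still yields only a modest compression factor.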


Figure 2.1 — MPEG-2 video codec. (a) Encoder. (b) Decoder.

To facilitate specific features in transmission and storage, a macroblock is equipped with horizontal locator information, also known as the macroblock address increment, which reveals, when added to the previous macroblock locator information, the absolute horizontal location of the macroblock. Typically, multiple successive macroblocks are combined into a slice, forming a horizontal row of macroblocks in the image. Such a slice contains a slice start code, indicating the vertical locator information, see Fig. 2.2(a). For interframe compression, a macroblock can be predicted from previous and/or future reference pictures. This type of prediction can result in a P-picture, when predicted from a single past reference picture, or a B-picture when predicted from two reference pictures (past and future), see Fig. 2.2(b), which can be frame- or field-based, see Fig. 2.3. If the temporal difference is large within the motion-estimation search area, a macroblock is intraframe-coded, even if the picture type is P-type or B-type coded². Figure 2.2(b) indicates a typical video coding situation for a 25-Hz broadcast situation, resulting in an intraframe distance of N = 12 and a uni-directional distance M = 3. This intraframe distance and the occurrence of an I-picture marks the border and start of a Group Of Pictures (GOP), allowing random access. Figure 2.2(c) indicates picture transmission reordering, required to

² In the sequel of this thesis, we sometimes denote I-, P- and B-type pictures in abbreviated form as I-, P- and B-pictures.


Figure 2.2 — MPEG-2 block-based hybrid video coding. (a) Block-based compressed picture and its main compression syntax elements for video with 4:2:0 color subsampling format. (b) Interframe prediction for a GOP structure with M = 3 and N = 12. (c) Reordering of MPEG-2 coded pictures.

facilitate decoding of bi-predicted pictures in a streaming manner. The example in Fig. 2.2(c) indicates an open GOP structure, whereby the decoding of the bi-directionally coded pictures B10 and B11 of GOPi depends on the availability of the intraframe-coded picture of GOPi+1. In MPEG notation, the GOP length is indicated by N, while the distance between two successive P-pictures is indicated by M. For the situation that a GOP has a length of N = 12, this yields


Figure 2.3 — Examples of frame- and field-based prediction. (a) Frame-based prediction. (b) Top/bottom-field prediction. (c) Bottom-field prediction.

a worst-case decoding latency of 480 milliseconds, which equals 12 frame periods for a 25-Hz television system.
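The reordering of Fig. 2.2(c) can be illustrated with a small Python sketch; it assumes, for simplicity, a closed GOP in which the picture type follows directly from the display index with periods N and M, while the open-GOP case additionally borrows the I-picture of the next GOP.

    def picture_type(k, N=12, M=3):
        # Picture type from the display index k inside a closed GOP.
        if k % N == 0:
            return 'I'
        return 'P' if k % M == 0 else 'B'

    def coded_order(num_pics, N=12, M=3):
        # References (I/P) are transmitted before the B-pictures that
        # precede them in display order, cf. Fig. 2.2(c).
        out, held_b = [], []
        for k in range(num_pics):
            t = picture_type(k, N, M)
            if t == 'B':
                held_b.append((t, k))
            else:
                out.append((t, k))   # anchor first ...
                out += held_b        # ... then the held-back B-pictures
                held_b = []
        return out + held_b

    print(coded_order(13))
    # Worst-case random-access latency: one full GOP of N frame periods,
    # i.e. 12 frame periods at 25 Hz for the example above.
    print(12 * 1000 // 25, 'ms')   # 480 ms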

2.2.2 MPEG-2 Part 1: Systems

Besides specifying compression standards for audiovisual information, MPEG has also specified semantics and a syntax that construct a packetization scheme, enabling multiplexing of MPEG-compressed audiovisual information and data according to the ISO/OSI layered model, see Fig. 2.4. The possible packetization/multiplexing methods are described by the standard ISO/IEC 13818 Part 1: Systems. The standard addresses two methods: packetization for either quasi-error-free or error-prone channels, resulting in a Program Stream (PS) and a Transport Stream (TS), respectively. Practical examples of such channels are Digital Versatile Disc (DVD) for the quasi-error-free and Digital Video Broadcasting (DVB) for the error-prone channel. Both TS and PS provide a coding syntax, enabling synchronization, decoding and presentation of audiovisual information. These functions involve a number of specific measures and formats in the standard to facilitate the system aspects. For example, the coding syntax of TS and PS uses packet headers with specific start codes to enable random access. Furthermore, synchronization is implemented by inserting timestamps into the packet streams, enabling synchronization of the decoding and presentation process of the individual audiovisual streams. Figure 2.5 indicates the basic MPEG-2 encoding and packetization steps constructing an MPEG-2 TS. At the top of Fig. 2.5(a), uncompressed audiovisual information is compressed by the corresponding encoders, which generate an Elementary Stream (ES), as depicted in Fig. 2.5(b). The ES forms the input for the packetizer, which separates the ES into access units, which are decodable entities. For coded video, an access unit is e.g. a compressed picture, whereas for coded audio an access unit is a group of consecutive compressed audio samples, indicated as a frame. The packetizer generates a Packetized Elementary


Stream (PES) by adding a Packet Header (PH) to the compressed access unit, indicating e.g. when the access unit has to be decoded (DTS) and when it should be rendered (PTS), see Fig. 2.5(c) and Appendix B. Finally, the PES is partitioned into a stream of 188-Byte fixed-length packets, known as Transport Stream (TS) packets. During multiplexing, additional information is added to the Transport stream Header (TH), such as a Program Clock Reference (PCR), which is required during decoding for reconstruction of the decoder time-base, enabling proper decoding of the received audiovisual access units. Figure 2.6 depicts a basic MPEG-2 decoder operating on a TS, which is the output signal of a DVB channel decoder.

A. MPEG-2 Packetized Elementary Stream

In a Packetized Elementary Stream (PES), the ES access units are equipped with a header, which enables the transmission of various information fields related to the access unit and its decoding process. The presence of these fields is indicated by binary flags in the PES header. For this thesis, two information flags are important: the PTS_DTS_flags and the DSM_trick_mode_flag, indicating the presence of information regarding the timing for decoding and presentation, and the repetition behavior of pictures during trick play, respectively. More information on other flags can be found in [36]. More specifically, the PTS_DTS_flags field indicates the presence of a Decoding Time Stamp (DTS) and a Presentation Time Stamp (PTS). For the exact time stamp calculation, see Appendix B.
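As an illustration of this timestamp syntax, the following Python sketch unpacks the 33-bit, 90-kHz PTS (or DTS) from its 5-Byte PES header field, in which the timestamp bits are interleaved with marker bits; the layout follows ISO/IEC 13818-1.

    def parse_pts(field):
        # field: the 5-Byte PTS (or DTS) field of a PES header. Layout:
        # '001x' pts[32..30] marker | pts[29..22] | pts[21..15] marker |
        # pts[14..7] | pts[6..0] marker  -> 33 bits in 90-kHz units.
        pts = ((field[0] >> 1) & 0x07) << 30
        pts |= field[1] << 22
        pts |= ((field[2] >> 1) & 0x7F) << 15
        pts |= field[3] << 7
        pts |= (field[4] >> 1) & 0x7F
        return pts

    # 0x21 0x00 0x05 0xBF 0x21 encodes PTS = 90000, i.e. 1.0 second.
    print(parse_pts(bytes([0x21, 0x00, 0x05, 0xBF, 0x21])) / 90000.0)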

Figure 2.4 — MPEG-2 Systems according to the ISO/OSI layering model. (a) MPEG-2 layering model. (b) ISO/OSI layer model.


Figure 2.5 — MPEG-2 processing steps after compression. (a) Time-domain audiovisual information. (b) Compressed audiovisual access units. (c) Packetized audiovisual access units. (d) Segmenting audiovisual access units into MPEG-2 TS packets.

Figure 2.6 — Basic overview of MPEG-2 decoding.

The DSM_trick_mode_flag indicates the presence of information regarding trick-play playback modes, such as fast-search and slow-search at various playback speeds, or freeze-frame display. However, although the usage of the trick modes associated with the DSM_trick_mode_flag is recommended for decoding systems equipped with a digital interface, it is not strictly demanded [37]. As a result, the support for these trick-mode facilities is not reliable, so that other methods are required, circumventing the need for the DSM_trick_mode_flag and its associated playback control, while obtaining similar or equal trick-play playback.


Figure 2.7 — Syntax of MPEG-2 transport packet header.

B. MPEG-2 Transport Stream

The MPEG-2 Transport Stream (TS) is packet-based with a fixed length of 188 Bytes. The maximum payload capacity is 184 Bytes, resulting in a 2 % overhead due to the 4-Byte TS header. This 4-Byte TS header syntax is depicted in Fig. 2.7. A TS packet starts with an 8-bit sync byte field, which has a fixed value of “0x47”, followed by three 1-bit flags. The transport_error_indicator flag, when set to “0”, indicates that the TS packet has been received correctly, while when set to “1”, the TS packet contains errors. It is the responsibility of the channel decoder to set this flag, based on the outcome of a Reed-Solomon Forward Error Correction (FEC) calculation [38]. For this purpose, 16 Bytes are added to each TS packet by the channel encoder, extending the TS packet length to 204 Bytes. For the situation that the TS packet carries audiovisual data, the payload_unit_start_indicator flag, when set to “1”, indicates that the payload starts with the first Byte of a PES packet, while a “0” indicates that no PES packet starts in this TS packet. For the situation that the TS packet contains section data, the payload_unit_start_indicator flag, when set to “1”, indicates that the first Byte of the payload holds the pointer field, a syntax field that points to the location inside the current TS packet where the first Byte, i.e. the table_id, of the section starts. If the payload_unit_start_indicator flag is set to “0”, the TS packet does not contain the start of a new section. The flag transport_


Table 2.2 — Predefined MPEG-2 PID values.

PID value             Description
0x0000                Program Association Table
0x0001                Conditional Access Table
0x0002, ..., 0x000F   Reserved
0x1FFF                Stuffing

priority reveals, when set to “1”, that this TS packet has a higher priority compared to other TS packets with the same PID value. The Packet IDentifier (PID) can be regarded as the name of the transported information. It is a 13-bit value, enabling 8,192 unique information streams to co-exist simultaneously in a single multiplex. Although the PID is based on a 13-bit value, not all values can be used freely, as some PID values are reserved for particular information streams, see Table 2.2. The 2-bit transport_scrambling_control field, when set to “00”, indicates that the TS packet is not scrambled, whereas the values “01”, “10” and “11” indicate that the TS packet is scrambled using a method which is defined by the user. The 2-bit adaptation_field_control field indicates whether the TS header is followed by an adaptation field, see Fig. 2.7. The TS packet header finishes with the continuity_counter, which is incremented by one for each TS packet with the same PID value and wraps around modulo 16. In principle, for a single service constructed of TS packets with the same PID value, the continuity counter enables detection of the loss of up to 14 consecutive TS packets. Basically, this means that the burst length of lost TS packets that can be detected is 14k, with k being the number of multiplexed services, provided that they are all time-interleaved corresponding to a round-robin multiplexing scheme, i.e. sequential periodic counting. An MPEG-2 TS header can be succeeded by an adaptation field, indicated by the adaptation_field_control. This adaptation field enables the transmission of various information signals such as the Program Clock Reference (PCR), see Appendix A. Furthermore, this adaptation field is used to insert stuffing data, which is required when audiovisual access units have insufficient data to fill the payload part of a TS packet, see Fig. 2.5(d) [36]. For the situation that section data is carried by a TS packet, this adaptation field may also occur; however, insertion of stuffing Bytes is then facilitated by stuffing the payload part of the TS packet with “0xFF”-valued Bytes, starting directly after the last section Byte.
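The fixed 4-Byte header layout of Fig. 2.7 makes TS parsing straightforward. The Python sketch below (the field names follow the standard; the helper functions are ours) extracts the header fields and uses the continuity counter to estimate how many payload-carrying packets of a PID were lost.

    def parse_ts_header(pkt):
        # Parses the 4-Byte header of a 188-Byte TS packet (Fig. 2.7).
        assert pkt[0] == 0x47, 'sync byte lost'
        return {
            'transport_error':    bool(pkt[1] & 0x80),
            'payload_unit_start': bool(pkt[1] & 0x40),
            'transport_priority': bool(pkt[1] & 0x20),
            'pid':                ((pkt[1] & 0x1F) << 8) | pkt[2],
            'scrambling':         (pkt[3] >> 6) & 0x03,
            'adaptation_field':   (pkt[3] >> 4) & 0x03,
            'continuity_counter':  pkt[3] & 0x0F,
        }

    def lost_packets(prev_cc, hdr):
        # The counter advances modulo 16 only for packets that carry payload;
        # a jump larger than one reveals up to 14 lost packets on this PID.
        if prev_cc is None or not (hdr['adaptation_field'] & 0x01):
            return 0
        return (hdr['continuity_counter'] - prev_cc - 1) % 16

    pkt = bytes([0x47, 0x40, 0x11, 0x1A]) + bytes(184)
    hdr = parse_ts_header(pkt)
    print(hex(hdr['pid']), hdr['continuity_counter'], lost_packets(9, hdr))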


2.2.3 MPEG-2 Part 3: Audio

Besides the development of video compression standards, MPEG has also developed audio compression standards. The first result was the ISO/IEC 11172-3 standard [31], which offered three layers of audio compression with increasing complexity, known as Layer I, II and III, capable of compressing mono and stereo audio signals. The ISO/IEC 13818-3 standard is the MPEG-2 audio compression standard and forms an extension of the MPEG-1 audio compression standard. A major difference, compared to MPEG-1 audio compression, is the support of 5.1 multichannel audio, also known as 3/2 multichannel audio with an optional low-frequency enhancement (LFE) channel. The LFE channel is capable of handling signals in the range from 15 Hz to 120 Hz. Furthermore, to increase the audio quality at low bit rates, three additional sampling frequencies Fs have been introduced: 16 kHz, 22.05 kHz and 24 kHz, while additional bit rates are possible. Finally, MPEG-2 Audio encodes multichannel audio in such a way that a stereo information signal always remains available, enabling backward compatibility with existing MPEG-1 decoders.
Figure 2.8(a) depicts a basic block diagram of an MPEG-2 audio encoding scheme. At the left-hand side, Pulse Code Modulation (PCM) audio samples enter the mapping stage, which creates a filtered and subsampled representation of the input audio stream. The audio samples can either be mono, stereo or multichannel. The mapped samples are either called subband samples or transformed subband samples. The psychoacoustic processing block creates a set of data, which controls the quantization and coding process. The applied psychoacoustic model operates over a set of consecutive audio samples, whereby the number of involved samples depends on the deployed coding Layer. Table 2.3 indicates the time duration of a compressed audio access unit, also known as a frame. The frame-packing stage assembles the final bit stream, multiplexing the compressed audio information, optional ancillary data and optional CRC-based error detection.
Figure 2.8(b) depicts a basic block diagram of an MPEG-2 audio decoding scheme. At the left-hand side, the bit stream enters the decoder. The frame-unpacking block performs error detection, provided this feature is facilitated by the encoder, and demultiplexes the bit stream. The reconstruction stage dequantizes the subband samples (mapped samples), which are converted into PCM samples by means of the inverse mapping transformation. A frame is an independently decodable entity (access unit) and consists of 384 audio samples for Layer I and 1152 audio samples for Layer II and III. A frame starts with a so-called syncword and finishes before the next syncword. It consists of an integer number of slots, whereby a slot consists of 4 Bytes in Layer I and 1 Byte in Layer II and III. Layer III differs from Layer I and II in the sense that, although the distance between consecutive syncwords resembles an integer number of Bytes, the audio


Figure 2.8 — MPEG-2 audio coding. (a) MPEG-2 audio encoder. (b) MPEG-2 audio decoder.

information that belongs to a frame is generally not contained between two successive syncwords. There are sample frequencies, such as 44.1 kHz in MPEG-1 and 22.05 kHz in MPEG-2 audio compression, which require padding. Padding is a method to adjust the average length of an audio frame in time to the duration of the corresponding PCM samples, by conditionally adding a slot to the audio frame, which is indicated by means of the padding_bit flag [31], [39].

Table 2.3 — Audio frame duration.

           fs = 16 kHz   fs = 22.05 kHz   fs = 24 kHz   fs = 32 kHz   fs = 44.1 kHz   fs = 48 kHz
Layer I    24 ms         17.41 ms         16 ms         12 ms         8.71 ms         8 ms
Layer II   72 ms         52.24 ms         48 ms         36 ms         26.12 ms        24 ms
Layer III  36 ms         26.12 ms         24 ms         18 ms         13.06 ms        12 ms
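The durations in Table 2.3 follow directly from the frame length in samples divided by the sampling frequency; the Python sketch below reproduces the table, assuming (our reading of the table values) that Layer III uses the halved 576-sample frame at the three MPEG-2 low sampling frequencies.

    SAMPLES_PER_FRAME = {1: 384, 2: 1152, 3: 1152}
    MPEG2_RATES = (16000, 22050, 24000)   # the additional MPEG-2 rates

    def frame_duration_ms(layer, fs):
        n = SAMPLES_PER_FRAME[layer]
        if layer == 3 and fs in MPEG2_RATES:
            n = 576   # halved Layer III frame at the low sampling frequencies
        return 1000.0 * n / fs

    for layer in (1, 2, 3):
        print('Layer', layer,
              [round(frame_duration_ms(layer, fs), 2)
               for fs in (16000, 22050, 24000, 32000, 44100, 48000)])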


2.3 H.264/MPEG4-Advanced Video Coding

In 2004, the Joint Video Team (JVT) delivered an advanced video coding method, known as H.264/MPEG4 Part 10 or H.264/MPEG4-AVC (Advanced Video Coding). This standard combines the coding effort of the ITU-T Video Coding Experts Group (VCEG) with that of the ISO/IEC JTC1 Moving Picture Experts Group (MPEG). The standard is suitable for cost-effective broadcasting of HDTV and capable of delivering high-quality video at low bit rates, enabling new markets such as handheld television. H.264/MPEG4-AVC operates in a similar manner as previous block-based compression schemes (MPEG-2) and deploys a hybrid coding scheme, differing in various aspects. The modifications result in a better subjective picture quality and coding efficiency, at the expense of a higher complexity. Unlike MPEG-2, which performs prediction at the granularity of a picture, resulting in P-frames and B-frames, H.264/MPEG4-AVC performs prediction at slice level, resulting in P-slices and B-slices, thereby refining the granularity of prediction. Furthermore, H.264/MPEG4-AVC separates the video coding from the mapping onto the network layer, where the standard is partitioned into a Video Coding Layer (VCL) and a Network Abstrac-

Figure 2.9 — H.264/MPEG4-AVC codec. (a) Encoder. (b) Decoder.


tion Layer (NAL). H.264/MPEG4-AVC is based on hybrid coding, involving transform-based compression within a picture and motion-compensated prediction in the time domain. Figure 2.9 shows a block diagram of an H.264/MPEG4-AVC video codec [40]. Compared to the MPEG-2 coding scheme from Fig. 2.1, both coding schemes show a clear resemblance. However, H.264/MPEG4-AVC also has some clear differences, making this standard far more advanced and, at the same time, considerably more complex. Like MPEG-2, H.264/MPEG4-AVC video compression forms a coding toolbox, enabling a wide variety of compression methods, determined by means of Profiles and Levels. The choice of Profile limits the deployment of the syntax, while Levels put boundaries on the values of bit-stream parameters [5], [40]. This section presents a few H.264/MPEG4-AVC coding features, which are used in further work in this thesis.

2.3.1 Intra-Macroblock Video Compression

Similar to MPEG-2, H.264/MPEG4-AVC divides the image into a fixed number of macroblocks, spatial sub-images of size 16 × 16 pixels, which are sub-divided into smaller blocks of size 4 × 4, or a combination of 4 × 4 and 8 × 8 pixel blocks, depending on the deployed profile (Fidelity Range Extension) and the spatial complexity. In H.264/MPEG4-AVC, pixels that construct an intra-coded slice are spatially decorrelated by means of spatial prediction prior to block-based transformation. This differential information is either obtained at macroblock level (16 × 16 pixels) or at DCT-block level of size 4 × 4 or 8 × 8. At the macroblock level, 4 prediction modes are available, while for block sizes of 4 × 4 or 8 × 8, 9 prediction modes are available, see Fig. 2.10(b), using adjacent decoded pixels as reference candidates, indicated as gray locations in Fig. 2.10(a,c). Mode 2 is absent from the direction plot, as this mode is based on the average value of the available surrounding pixels. For the situation that spatial prediction is performed on an 8 × 8 DCT-block size, the decoded surrounding pixels deployed for spatial prediction are low-pass filtered prior to constructing the 8 × 8 prediction block. For the calculation of the 4 × 4 block-based predictor, the decoded pixels are deployed without low-pass filtering. Intra-coded macroblocks can be deployed in all three slice types (I, P and B). After applying spatial prediction, the residual information is transformed using an integer transform with a size equal to that of the deployed block-based prediction, which is based on 4 × 4 or 8 × 8 pixels. Prior to entropy coding, the DC coefficients of the DCT blocks constructing a macroblock are clustered and Hadamard transformed. The coefficients are entropy coded, using either Context-Adaptive Variable Length Coding (CAVLC), or an improved method known as Context-Based Adaptive Binary Arithmetic Coding (CABAC) [41].
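For illustration, the Python sketch below implements three of the nine 4 × 4 spatial prediction modes (vertical, horizontal and the DC mode 2 mentioned above); the remaining directional modes interpolate between the same reference pixels and are omitted here.

    import numpy as np

    def predict_4x4(mode, above, left):
        # above: the 4 decoded pixels on top of the block;
        # left:  the 4 decoded pixels to its left (gray in Fig. 2.10(a)).
        if mode == 0:                                  # vertical
            return np.tile(above, (4, 1))
        if mode == 1:                                  # horizontal
            return np.tile(left.reshape(4, 1), (1, 4))
        if mode == 2:                                  # DC (rounded mean)
            return np.full((4, 4), (above.sum() + left.sum() + 4) // 8)
        raise NotImplementedError('modes 3-8 are directional interpolations')

    above = np.array([100, 102, 104, 106])
    left = np.array([98, 97, 96, 95])
    print(predict_4x4(2, above, left))   # a 4 x 4 block filled with 100

The encoder selects the mode with the smallest residual, so that only the prediction mode and the (small) transformed difference need to be coded.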


Figure 2.10 — Location of reference pixels for H.264/MPEG4-AVC intraframe spatial prediction. (a) Pixel locations for 4 × 4 block size. (b) Spatial prediction directions. (c) Pixel locations for 8 × 8 block size.

2.3.2 Inter-Macroblock Video Compression

Like MPEG-2, H.264/MPEG4-AVC supports uni-directional and bi-directional coding, resulting in P-type and B-type slices. Additionally, H.264/MPEG4-AVC supports two more slice types: Switching-P (SP) and Switching-I (SI), providing increased error resistance [41]. A macroblock can be coded in field, frame or field/frame mode, whereby the macroblocks can be further sub-divided into smaller blocks, each equipped with their own motion vectors. The motion vectors are calculated at quarter-pixel accuracy, resulting in up to 16 motion vectors per macroblock. Unlike MPEG-2, in H.264/MPEG4-AVC a macroblock can be predicted from multiple reference pictures, leading to additional reference information for each 16 × 16, 16 × 8 or 8 × 16 macroblock partition or 8 × 8 sub-macroblock, indicating from which picture the prediction is obtained. H.264/MPEG4-AVC also supports skipped macroblocks. However, unlike in MPEG-2, the motion properties differ in the sense that they are not assumed to be zero, but are derived from neighboring macroblocks. Another difference between MPEG-2 and H.264/MPEG4-AVC is that P-slices can


be predicted from multiple decoded pictures, whereas B-slices can also be used as a predictor. The difference between P-type slices and B-type slices is that a macroblock in a B-type slice may be constructed on the basis of a weighted average of two motion-compensated predictions. The predictors are obtained from arbitrary reference pictures, where the indexing is arranged according to two picture lists, known as List 0 and List 1. Three different types of inter-picture prediction are distinguished in B-slices: List 0, List 1 and bi-predictive, depending on the reference picture list from which the prediction originates. Macroblocks or sub-macroblocks constructing a B-type slice may be directly coded, omitting the associated motion data. If additionally no residual information is left, this results in skipped macroblocks, which are efficiently coded.
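A minimal sketch of such bi-prediction is given below; with unit weights it reduces to the default rounded average of the two motion-compensated predictions, while the weighted mode is only approximated here (the standard applies a more elaborate two-stage rounding).

    import numpy as np

    def bipred(p0, p1, w0=1, w1=1, logWD=1, offset=0):
        # With w0 = w1 = 1 and logWD = 1 this is the default bi-prediction
        # average (p0 + p1 + 1) >> 1; other values approximate the explicit
        # weighted mode of the standard.
        p = (w0 * p0.astype(int) + w1 * p1.astype(int)
             + (1 << (logWD - 1)) + offset) >> logWD
        return np.clip(p, 0, 255).astype(np.uint8)

    p0 = np.array([[100, 50]], dtype=np.uint8)   # List 0 MC prediction
    p1 = np.array([[110, 51]], dtype=np.uint8)   # List 1 MC prediction
    print(bipred(p0, p1))                        # [[105 51]]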

2.4 Video Coding Artifacts

Cost-effective video broadcasting is obtained by removing irrelevant information and attenuating some relevant video information. Irrelevant video information contains e.g. high-frequency video information that cannot be observed under certain conditions, whereas relevant video information is mostly the remaining low- and medium-frequency video information. When video coding is operated at a cost-effective bit rate, typical coding artifacts are introduced, such as blocking, ringing and mosquito noise [35], [42].

• Block visibility artifacts are introduced due to the block-based coding scheme, featuring independent quantization of blocks. This quantization influences information crossing block borders, giving rise to discontinuities in signal waveforms, which results in visible vertical or horizontal edges at the DCT-block borders, see Fig. 2.11(b). A simple measure of this boundary discontinuity is sketched after this list.

• Ringing is an artifact of a more fundamental nature, caused by the truncation of the frequency components constructing waveforms (called the Gibbs phenomenon). The visible appearance depends on the deployed block size, the amount of quantization and the waveform positioning in the block. As a result, visible ringing occurs mostly when an edge, either horizontal or vertical, crosses the whole height or width of a transform block, see Fig. 2.11(b).

• Mosquito noise is a coding artifact that originates from either the truncation or removal of high-frequency information, or from the prediction mismatch of the Motion-Compensated (MC) prediction. The distortion typically has a low intensity and can be both static and dynamic. The degradation is clearly visible in low-frequency regions [43], see Fig. 2.11(d), especially when the video signal is enhanced by applying sharpness enhancement.
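As an illustration of the blocking artifact (not the detectors developed in this thesis), the Python sketch below compares the mean luminance jump across the 8-pixel DCT grid with the jump elsewhere; a ratio well above unity suggests visible blockiness.

    import numpy as np

    def blockiness(y, block=8):
        # y: 2-D luminance image; compares the mean jump across the
        # block-grid columns with the mean jump at all other columns.
        d = np.abs(np.diff(y.astype(float), axis=1))
        grid = d[:, block - 1::block].mean()          # jumps on the 8-pel grid
        rest = np.delete(d, np.s_[block - 1::block], axis=1).mean()
        return grid / (rest + 1e-6)                   # >> 1: visible blocking

    rng = np.random.default_rng(1)
    img = rng.normal(128.0, 2.0, (64, 64))
    img[:, 32:] += 10.0            # an artificial jump on a block border
    print(round(float(blockiness(img)), 2))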


Figure 2.11 — Example MPEG-2 coding artifacts (zoom factor is two). (a) Original image fragment. (b) Blocking and ringing artifacts. (c) Original image fragment. (d) Mosquito artifacts.

For all three distortions, the block size of the deployed transformation plays a crucial role. MPEG-2-based video compression employs an 8 × 8 DCT transformation. Due to this relatively large block size, artifacts such as ringing and mosquito noise can deteriorate a substantial number of pixels in the reconstructed block. H.264/MPEG4-AVC has addressed the visibility aspect of artifact appearance by employing a smaller 4 × 4 pixel block. With a 4 × 4 pixel block, the number of pixels that can potentially be degraded is smaller, masking ringing and limiting the visibility of mosquito noise. However, the Fidelity Range Extension (FRExt) Profiles allow the use of an 8 × 8-based transformation, thereby potentially undoing this benefit. Blockiness, i.e. artifacts that are located at the block boundaries, deteriorates the perceived video


quality. H.264/MPEG4-AVC therefore has an adaptive in-loop de-blocking filter reducing the blockiness, which also limits the diffusion of blockiness into other pixel blocks via the MC prediction process.
In order to improve the picture quality, video post-processing is regularly applied by the Consumer Electronics (CE) industry. Typically, this processing involves locally-adaptive low-pass filtering, which attenuates the detected coding artifacts while avoiding excessive image blur. Dynamic artifacts, e.g. mosquito noise due to Motion Compensation (MC) mismatch, are well suppressed when applying Temporal Noise Reduction (TNR) or Motion-Compensated Temporal Noise Reduction (MCTNR) [44], [45]. However, for the situation that the mosquito noise is static, a spatial filtering approach is required. Detection of mosquito-noise regions typically involves a two-step approach, as there is not a single metric revealing potentially degraded locations and allowing the distinction between intended texture and artifact contamination. As the majority of the papers on mosquito noise indicate that this noise appears near the border of objects, the first step typically determines the locations of edges, from which the degraded locations are derived. On the basis of the detected locations, an adaptive low-pass filter attenuates the artifact degradation.
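A toy version of this two-step approach is sketched below in Python: a Sobel edge map marks strong edges, and a low-pass filter is applied only in a small halo around them, where ringing and static mosquito noise reside. The threshold and halo width are hypothetical parameters, and the detectors developed in Chapter 6 are considerably more refined.

    import numpy as np
    from scipy.ndimage import sobel, uniform_filter, binary_dilation

    def reduce_ringing(y, edge_thresh=60.0, halo=3):
        # Step 1: locate strong edges via the Sobel gradient magnitude.
        y = y.astype(float)
        mag = np.hypot(sobel(y, axis=0), sobel(y, axis=1))
        edges = mag > edge_thresh
        # Step 2: low-pass filter only the flat pixels bordering an edge,
        # where ringing and static mosquito noise are visible.
        near = binary_dilation(edges, iterations=halo) & ~edges
        out = y.copy()
        out[near] = uniform_filter(y, size=3)[near]
        return out

    rng = np.random.default_rng(2)
    img = np.full((32, 32), 64.0)
    img[:, 16:] = 192.0                      # a sharp vertical edge
    img += rng.normal(0.0, 3.0, img.shape)   # pseudo mosquito noise
    den = reduce_ringing(img)
    flat = np.s_[4:28, 12:15]                # flat band just left of the edge
    print(round(float(img[flat].std()), 2), round(float(den[flat].std()), 2))

Because the filter never touches the edge pixels themselves, the sharpness of the intended transition is preserved, which is exactly the locally-adaptive behavior described above.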

2.5 Personal Video Recording

The availability of high-capacity hard-disc drives has fueled the development of Personal Video Recording (PVR). Such systems are consumer storage products for audiovisual information. These modern storage systems are equipped with rich features that go beyond the possibilities of traditional tape-based storage solutions. Figure 2.12 summarizes the main PVR features for which solutions have been developed in the past decade. Basically, a PVR can be network-based (client-based) [46], or operate in a standalone fashion. In the former situation, storage is performed by the service provider, which facilitates various features such as enhanced playback. In the latter case, the PVR operates autonomously at the client side, where advanced features need to be established based on locally derived information [47]–[50]. The combination of network-based PVR and client-based PVR is a hybrid PVR. This concept enables special configurations for PVR usage, such as the PVR operating as a server in a network, which is accessible by authorized remote devices for local playback [51]–[54]. A basic PVR features conventional trick play, such as fast search and slow motion, and has time-shift recording [55]. Alternatively, a more advanced PVR has features such as semi-automated video editing [56], video transcoding [57] and advanced forms of video navigation for inter- as well as intra-program video navigation. For stored audiovisual programs, video navigation can be separated into two categories:

video browsing and video retrieval, see Fig. 2.13. Advanced video browsing, also known as intra-program navigation, has the objective to increase the navigation efficiency through a recorded program [52], [58]–[60]. Another form of advanced video navigation is inter-program video navigation, enabling personalized program guides for guidance through available broadcast channels and personal program lists for navigation through stored programs, thereby increasing the search efficiency of broadcast and recorded programs [61]–[64]. Video navigation has been a subject of research for many years with the objective to improve the efficiency, exploiting methods to condense the spatial-temporal information and the presentation of the condensed information. Improvements regarding video navigation have resulted in e.g. text-based browsing [65], key-frame extraction [66] and program summarization [67], [68], but also in attractive rendering of the video navigation information [69]–[71].
Advanced intra-program video navigation provides solutions for reducing the stored spatial-temporal program information and for the associated rendering of that reduced spatial-temporal information. The spatial-temporal information can be reduced by interpreting program information such as closed-caption text [72], [73], and audiovisual information using character recognition [74], speech recognition [75] or scene-change detection [76]. These detections generate time markers associated with a certain characteristic on the program-time axis. These time markers are typically stored as Characteristic Point Informa-

Figure 2.12 — Overview of PVR features.


Figure 2.13 — Stages in PVR video navigation.

tion (CPI), which is basically a metadata information signal associated with a particular recording [73], [77], [78]. Besides the storage of time markers, enabling efficient browsing through a stored program using various methods, e.g. time-adaptive skip forward or skip backward [79], CPI may also contain thumbnail-sized images, which are representative for a particular scene, or may store string-based information enabling text search [65], [80].
The availability of thumbnails enables video browsing deploying a navigation bar/column or navigation plane [70], [81]–[83]. Thumbnail-based navigation may be further enhanced by additional information, such as text, instantaneously presenting information from multiple information sources, enhancing the navigation efficiency [72], [73]. With the appearance of 3D graphics, more attractive presentation methods have been proposed, utilizing object shapes such as a rotating ball, carrousel, or helix on which the video navigation information is depicted [71]. The usage of audio information for trick play has been limited to either playback speeds around unity [14], or to feature extraction enabling advanced intra-program video navigation [75].

2.5.1 Intra-program video navigation

Intra-program video navigation, also known as video browsing or trick play, is defined as video playback in non-consecutive or non-chronological order, as compared to the original capturing order. Trick play can be divided into conventional fast-forward/reverse playback or slow-motion forward/reverse playback, and advanced search methods, referring to modern forms of trick play. The conventional forms are found in analog and digital tape-based video recorders. The latter have become possible for storage systems deploying random-access media, such as disc and silicon-based memories. This section covers the basic aspects of conventional trick play and briefly elaborates on low-cost trick play deployed for pull-based and push-based storage system architectures. The navigation methods are presented without implementation aspects.


A. Conventional video navigation

For conventional trick play, a distinction is made between fast-search and slow-search mode. Let Ps be the relative playback speed, which is unity for normal play; then fast-search trick play is obtained for |Ps| > 1, and slow-motion trick play for |Ps| < 1. Although this is a firm separation between fast search and slow motion, there is an overlapping area for playback speeds in the vicinity of the unity playback speed. Fast-search video navigation is characterized by the fact that the pictures forming the trick-play sequence are derived from the normal-play sequence by applying a temporal equidistant subsampling factor, which corresponds to the intended playback speed Ps. In practice, Ps is typically limited to, but not restricted to, integer values, see Table 2.4.
Slow-motion search is obtained by repetitive display of each normal-play picture. The number of repetitions is equal to the reciprocal of Ps. Again, practical values for Ps are typically limited to, but not restricted to, integer reciprocals. Although the basic operation to obtain slow motion has low costs, a distinction is made between slow motion on progressive video and on interlaced video. When video originates from an interlaced video source, repetition of an interlaced picture may result in motion judder. Such a situation occurs when the spatial area contains an object that is subject to motion between the capture times of the odd and even field, forming a single frame picture. Repetition of such an interlaced picture causes the object to repetitively travel along its trajectory, which is perceived by the viewer as motion judder [12].
The above solutions for trick play are based solely on using resampled video information. For playback speeds in the vicinity of unity, there is an alternative implementation for trick play, that builds on also re-using the normal-play

Table 2.4 — Examples of playback speeds for conventional trick-play modes.

Playback mode    Typical playback speed Ps    Information signal
Fast search      4, 8, 12                     video
Slow motion      1/2, 1/3, 1/4                video
Pitch control    4/5, 3/2                     video/audio

audio information. For this situation, time-scaled normal-play audio information is used to create an audiovisual trick-play sequence. To maintain audibility of the time-scaled audio information, pitch control is required. The playback speeds Ps that can be used for this type of trick play depend heavily on aspects of the normal-play sequence, such as the speed of the oral information and the algorithm used for processing the audio signal. Algorithms for time- and pitch-scaling can be divided into frequency-domain and time-domain methods. Good results are obtained using Pointer Interval Controlled OverLap and Add (PICOLA) [84], an algorithm that operates in the time domain on mono-channel audio signals, for playback speeds in the range 0.8 ≤ Ps ≤ 1.5 for audiovisual content with spoken text. This form of trick play indicates that Ps does not need to be limited to integer-based values.
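The conventional picture-selection rules of this subsection can be summarized in a few lines of Python; the sketch below maps an integer fast-search speed to temporal subsampling and a reciprocal-integer slow-motion speed to picture repetition (non-integer ratios would require a rate-control loop, which is omitted here).

    def trick_play_indices(num_pics, ps):
        # ps > 1: temporal subsampling (fast search);
        # 0 < ps < 1: repeated display of every picture (slow motion).
        if ps >= 1:
            return list(range(0, num_pics, int(ps)))
        reps = round(1.0 / ps)        # repetitions per normal-play picture
        return [k for k in range(num_pics) for _ in range(reps)]

    print(trick_play_indices(12, 4))      # [0, 4, 8]
    print(trick_play_indices(3, 1 / 3))   # [0, 0, 0, 1, 1, 1, 2, 2, 2]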

B. Efficient video navigation

For fast-search trick play, there are navigation-efficiency limitations caused by the quality of the final trick-play video sequence. This video-quality penalty is either caused by the trick-play information retrieval, or imposed by the video refresh rate in combination with the applied trick-play playback speed. Section 2.5.1-C briefly discusses the video-quality penalty caused by the trick-play information retrieval process, while the video-quality decline due to the combination of video refresh rate and playback speed is discussed in this subsection. The discussion on video-quality degradation, based on a combination of trick-play playback speed and video refresh rate, requires the definition of navigation efficiency.
We define navigation efficiency as the ability of the viewer to successfully interpret the content of the received video information, and to consider that information to be suitable for navigation, while searching the normal-play video sequence. In order to obtain insight into the meaning of this definition, we discuss the following thought experiment.
Let us assume a normal-play video sequence constructed from non-correlated scenes, where each scene has a 3-second duration. Conducting conventional fast-search trick play on this experimental sequence, at maximum playback speed, would require that Ps = 75 for 25-Hz television signals. In this case, each normal-play scene interval of 3 seconds provides a single picture that contributes to the fast-search trick-play sequence. This trick-play video sequence has been tested by a test panel to answer the following question: “Does the conventional trick-play navigation efficiency decline for high playback speed?”
With 69 %, the test panel confirmed that the navigation efficiency declines for a high playback speed, whereas 8 % answered No, indicating no navigation-efficiency decline, while 23 % answered Don't know, neither confirming nor denying a navigation-efficiency issue. Although in the above example each scene of the normal-play video sequence contributes to the fast-search


video sequence, 69 % of the test panel considers the navigation efficiency to be negatively affected. The root cause for this navigation-efficiency drop is the lack of correlation between the successive pictures forming the trick-play video sequence. As a result, multiple images of a single scene are required in order to be successfully interpreted by a viewer. Experiments reveal that for a typical normal-play video sequence, a good visual performance is achieved when the trick-play video sequence is constructed using 3 correlated pictures, which results in a practical upper bound of Ps = 25 for the fast-search trick-play speed.

C. Trick play for digital storage systems

With the introduction of MPEG-based video compression, digital storage systems came on the market with features comparable to those of analog tape-based storage systems. For tape-based storage systems, trick play is a feature that is realized by means of the servo system, controlling the tape speed in relation to the scanner's scan path. Trick play on a tape-based analog recording is obtained on the basis of re-used normal-play field-based video fragments. Hereby, during trick-play playback, a noise bar appears between each retrieved fragment, influencing the final video navigation quality. The number of noise bars increases when increasing the trick-play playback speed. In digital tape-based storage systems, besides servo control employing either speed-lock or phase-lock based playback, see Fig. 2.14(d)(e), also the information stored on the locations read by the scanner plays an essential role for fast-search trick play. As a result, the locations read by the scanner during trick-play playback should/may be filled during recording with the final video trick-play data, see Fig. 2.14. In the previous sentence, we purposely used ‘should/may’ to distinguish two system approaches. In the first approach, specific locations on tape are allocated to contain dedicated trick-play data, which corresponds to a particular playback speed. This provides a guaranteed trick-play performance for a particular playback speed, but requires additional storage capacity on tape. In the second approach, data is distributed over the tape format and is made accessible in small segments, enabling trick-play picture refresh for certain playback speeds. In this way, additional overhead is circumvented, while providing region-based picture refresh, which is beneficial for low-speed trick play.
For optical disc-based video-storage systems, trick play has been a basic feature, providing fast-search and slow-motion trick play, enabled by the associated standard and facilitated by the random access of the storage medium. Basically, fast-search trick play is realized on the basis of re-used intra-coded normal-play video information. In order to efficiently locate these intra-coded


Figure 2.14 — Trick play for digital tape-based consumer storage systems deploying helical-scan recording. (a) Two-head scanner, with different azimuth. (b) Scanner scan path for normal-play playback. (c) Scanner scan path for a playback speed of twice the normal speed (fast-search trick play). (d) Locations on tape which can be read by the head with correct azimuth, when the servo deploys phase-lock. (e) Locations on tape that can be read by the head with correct azimuth, when the servo deploys speed-lock.

pictures, early standards such as Video Compact Disc (VCD) and Super Video Compact Disc (SVCD), as well as modern standards such as Digital Versatile Disc (DVD) and Blu-ray Disc (BD), incorporate mechanisms revealing the location of intra-coded pictures on the optical disc [14]. In this way, a low-cost video navigation solution can be implemented cost-effectively. These standards assume a playback speed equal to unity. For an optical disc drive that supports higher playback speeds, other forms of fast-search trick play can be realized.

D. Pull-based versus push-based trick-play video decoding

The storage-system architecture influences the involved trick-play signal processing. A digital storage system can have either a pull- or a push-based architecture. In a pull-based architecture, the video decoder pulls the data from the storage medium, whereas in a push-based architecture, the video decoder receives the audiovisual information compliant with the underlying standards. For the pull-based architecture, constraints such as bit rate and buffer sizes imposed by the standards are not relevant, as the decoder pulls the data off the record medium. For the push-based architecture, constraints imposed by the


Figure 2.15 — Trick play facilitated by the optical storage standard. (a) Navigation facilitated for SVCD. (b) Navigation facilitated for DVD.

applied standards need to be satisfied, in order not to jeopardize correct playback. For traditional trick play, i.e. fast search and slow motion, there is a difference in performance due to the different nature of pull- and push-based architectures and the variations in the deployed trick-play signal-processing concepts. For a pull-based architecture, the performance depends on the data retrieval rate, the processing speed of the multimedia decoder and the presence or absence of random access. For a push-based architecture, the performance is influenced by the constraints of the underlying standards. For audiovisual storage systems based on a push-based architecture using the MPEG-2 standard [5], trick play can either employ the trick-play signaling information provided by the Packetized Elementary Stream (PES) header, see Section 2.2.2, or generate an MPEG-2-compliant trick-play stream without the trick-play signaling informa-

tion [12]. The trick-play signaling information provided by the MPEG-2 PES header is used to control the output of the video decoder during trick play. Although this solution is specified by MPEG-2, its support is optional rather than obligatory [37].

E. Influence of normal-play video encoding parameters on trick-play video

Modern compression schemes, such as MPEG-2 or H.264/MPEG4-AVC, achieve high compression ratios by exploiting spatial and temporal correlation in the video signal. Pictures compressed by exploiting only spatial information are intraframe-coded, whereas pictures compressed by exploiting temporal correlation are interframe-coded, leading to P- and B-type pictures for MPEG-2 and P- and B-type slices for H.264/MPEG4-AVC compressed video. In a compressed video sequence, the distance between two successive intraframe-coded pictures is expressed by N, which is also known as the Group-Of-Pictures (GOP) length, whereas the distance between P-type predictive pictures is expressed by M. For the situation that M > 1, the number of B-type pictures preceding a P-picture is equal to M − 1. In general, trick play for digital consumer storage equipment is a low-cost feature, which limits the amount of involved signal processing. Low-cost trick-play algorithms deploy pictures that are selected from the compressed normal-play sequence. For fast-search trick play, the minimum fast-forward search speed is equal to M, whereas other fast-forward speeds are obtained for speed-up factors equal to N, or typically an integer multiple of N³. Furthermore, note that there is basically a gap between speed-up factors M and N if N ≫ M. This gap is caused by the fact that P-type pictures can only be decoded if the reference (anchor) picture has been decoded. For H.264/MPEG4-AVC this situation is even worse, as P-slices may also refer to B-slices, which leaves only I-slices to be deployed for trick play. For typical video compression applications, such as Digital Video Broadcasting (DVB) or digital recording, the GOP size N = 12 and P-picture distance M = 3 result in fast-search speed-up factors Ps = {3, 12, 24, ..., 12n}, with n = {1, 2, 3, ...}. Fast-search trick play in the reverse direction is obtained in a similar way, but for a speed-up factor equal to M, buffering is required to store the decompressed pictures to facilitate reordering. This is required to match (potential) motion with the reverse playback direction, as the decoding is always performed in the positive time direction. This forward decoding direction introduces an extra delay, which occurs only once when switching to the reverse-search mode with a speed-up factor equal to M.

³ Note that additional fast-search speed-up factors are possible, where the trick-play speed is an integer divisor of the GOP size N. This leads to an implementation where the I-picture is repeatedly displayed until the next I-picture can be decoded, where the repetition rate is related to the selected trick-play playback speed.
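The attainable speed-up factors can be enumerated directly from N and M; the Python sketch below combines the P-picture-based minimum speed M, the I-picture-based multiples of N, and the I-picture-repetition speeds of the footnote (integer divisors of N), reproducing Ps = {3, 12, 24, ...} for N = 12 and M = 3.

    def fast_search_speeds(N=12, M=3, n_max=4):
        speeds = {M}                                    # P-picture based minimum
        speeds |= {n * N for n in range(1, n_max + 1)}  # I-picture based
        speeds |= {N // d for d in range(1, N + 1)      # I-picture repetition:
                   if N % d == 0 and N // d > 1}        # integer divisors of N
        return sorted(speeds)

    print(fast_search_speeds())   # [2, 3, 4, 6, 12, 24, 36, 48]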


“De mens is een sociaal dier, hij is niet gemaakt om alleen te leven.” – Man is a social animal; he is not made to live alone. (Aristoteles, c. 384 BC)

3 MPEG-2-compliant video navigation

3.1 Introduction

Video navigation is an essential Personal Video Recording (PVR) feature, which encompasses basic navigation methods like fast search and slow motion, as presented in Section 2.5.1. However, conventional fast-search video navigation, typically indicated as trick play, is solely controlled by the playback speed Ps, which limits its usability. This is caused by the fact that the navigation-efficiency limitation from Section 2.5.1, in terms of rendering time, does not scale well for a broad range of trick-play playback speeds. There, it was argued that the maximum video navigation playback speed is limited to a practical value of around 25 times the normal-play speed.
In the past decade, a broad range of features for PVR has been proposed, either improving existing or adding new functionality, as discussed in Section 2.5. In this chapter, three video navigation use cases are proposed, each addressing a particular video navigation time interval over which the navigation is conducted. For each use case, one or more efficiency aspects, as depicted in Fig. 3.2, are considered. We have decided to adopt the following features.

• Interoperability. In general, interoperable networked communication systems employ communication standards that enable smooth and seamless communication. Communication systems that invoke non-mandatory communication protocols of an international standard may cause undefined behavior at the receiver side, e.g. when that receiver implements only the mandatory parts. For this reason, client-server communication should be based on supported (mandatory) communication protocols, in order to avoid such undefined receiver behavior.

• Network-based playback of video navigation information. Network-based storage systems rely on interoperability, enabling connectivity between heterogeneous networked systems for all PVR playback modes. This implies that the communicated information adheres to (MPEG) standards and that its communication is supported by both client and server systems.

• Navigation-plane-based video navigation. Hereby, the usage of a navigation plane, constructed in the form of a mosaic screen, alleviates the refresh-rate constraint of the individual picture-rendering process. This solves the problem that the perceived information updates too quickly during high-speed conventional trick play.

• Rendering of multiple information sources. Employing normal-play audiovisual information enables full exploitation of human perception, which relies on both visual and auditory cues.

Intra-program video navigation is a function for which different solutions are required to control the perceived information. For this reason, we divide intra-program video navigation into three use cases, each addressing a particular time interval over which the video navigation is conducted. Besides interoperability aspects and navigation efficiency, there are several essential system aspects, such as latency, Quality-of-Service (QoS) and complexity, that influence a possible video navigation solution. Any video navigation solution should in the end be deployed on low-cost DTV platforms, thus requiring cost efficiency. The remainder of this chapter is organized as follows. First, Section 3.2 introduces the three video navigation use cases, each addressing a specific video navigation time interval. Section 3.3 presents two conceptual solutions enabling short-time-interval video navigation in a network environment. Section 3.4 elaborates on the proposed short-time-interval video navigation solution. In Section 3.5, we present two conceptual solutions for solving long-time-interval video navigation. The proposed long-time-interval video navigation solution is presented in Section 3.6, while Section 3.7 discusses implementation and experimental results. Section 3.8 presents a short discussion on future options for the proposed mosaic-screen navigation concept. This chapter finishes with Section 3.9, presenting the conclusions.

Figure 3.1 — Networked personal video recording.


Figure 3.2 — Main efficiency axis for video navigation.

3.2 Proposed video navigation use cases

In the past two decades, digital video has become available for consumer applications [3], [85]–[87]. A transition from analog to digital video recorders has been made via tape-based systems, resulting in storage systems based on media such as hard disk (HDD), Solid State Disk (SSD) and optical disc (DVD/BD). Due to the continuous growth in storage capacity, the need for efficient navigation such as fast search is growing as well. Furthermore, due to the random access of non-tape-based storage systems and an increase in the signal processing capabilities of the employed platforms, new navigation methods are possible (see Fig. 3.2), contributing to the intra-program video navigation efficiency. Intra-program video navigation is challenged by the increased storage capacity of digital storage systems, where efficiently locating a specific scene can become a daunting task. Video navigation, which can be either autonomous or interactive, is further challenged when a storage system is operated in a client-based manner, which requires video navigation information to be transmitted across a digital interface, see Fig. 3.1. Efficient video navigation has been a subject of research for many years, providing efficient solutions in various directions, as indicated by Fig. 3.2. Typical PVR platforms are constrained on key system resources such as available memory bandwidth, cycle budget and memory. Since navigation is an additional feature of the storage system, video navigation solutions benefit from employing techniques such as computational

complexity scalability [88] and/or signal processing in the MPEG-compressed video domain. Furthermore, computation-intensive operations are preferably shifted to the MPEG video decoder, contributing to a complexity reduction at the server side. We have defined three video navigation use cases, each addressing a particular video navigation situation, based on the time interval over which the video navigation is conducted. Depending on the video navigation use case, a cost-effective autonomous or interactive solution is proposed, enabling local and/or client-based video navigation, employing selected efficiency aspects: (1) network-based video navigation, (2) multiple information sources and (3) plane-based rendering.

• Use case 1. Autonomous client-based video browsing over a short-time interval. Autonomous video browsing over a short-time interval, having a typical duration of up to a few minutes, is a frequently deployed feature during normal play, which is conducted either in forward or reverse direction. This type of video browsing leads to subsampling or repetition of pictures, which are derived from the normal-play video sequence. The former processing type is known as fast-search trick play, whereas the latter is known as slow-motion trick play. In a networked home, the storage location may differ from the decoding location, resulting in client-server-based playback. When deploying MPEG-compliant trick play, interoperability is guaranteed, thereby simplifying the networked operation. This type of video navigation is further elaborated in Section 3.4.

• Use case 2. Interactive client-server-based video browsing for a long-time interval. The video navigation efficiency of conventional fast-search trick play declines when used at high playback speed, as the pictures constructing a fast-search trick-play sequence have a reduced correlation when increasing the trick-play playback speed, as discussed in Section 2.5.1. The navigation efficiency is improved when decoupling the navigation from the refresh rate of the individual picture-rendering process. To this end, we propose an MPEG-2-compliant hierarchical mosaic-screen navigation method. By employing different temporal subsampling factors for the individual hierarchical layers, a zoom-in or zoom-out operation on the temporal video information is facilitated, enabling efficient browsing through a large amount of video information. This type of video navigation is further elaborated in Section 3.6.

• Use case 3. Autonomous audio-enhanced video browsing for a medium-time interval. Video browsing over a medium-time interval, having a typical duration of up to 30 minutes and conducted either in forward or reverse direction, is a feature that is typically deployed when searching for a particular scene, or to quickly obtain a global impression of the stored audiovisual content. As human perception employs both visual and auditory cues, it is opportune to employ audio information associated with the video-navigation information. When rendering detailed audiovisual normal-play information fragments, the viewer may lose the overall fast-search experience. This fast-search experience is maintained when simultaneously rendering a fast-search navigation signal in combination with normal-play audiovisual fragment information. In this way, the video browsing method can also be deployed as a viewing mode, while the simultaneous rendering of multiple information sources enhances the video navigation efficiency. Hereby, the trick-play information signal provides a coarse overview, whereas the normal-play fragments provide detailed information. This type of video navigation is further elaborated in Chapter 4.

3.3 Conceptual MPEG-2-compliant video navigation

In this section, we provide a conceptual outline for solving MPEG-2-compliant fast-search and slow-motion video navigation over a network, both of which can be realized in two ways. Both solutions follow the concept for fast-search and slow-motion trick play based on temporal resampling of the normal-play video sequence, as introduced in Section 2.5.1. Hereby, the first solution conducts video transcoding, while the second solution is based on re-use of MPEG-2-compressed normal-play video information. For both solutions, a conceptual architecture is presented, including performance estimates on the used key system resources, revealing the complexity of both solutions.

A. First concept: Transcoding of MPEG-2 video data

In a first approach, a trick-play video sequence, representing either fast-search or slow-motion playback, is derived from an MPEG-2-compressed normal-play video sequence by invoking MPEG-2 decoding, temporal resampling and re-encoding (transcoding) of the selected normal-play video information, see Fig. 3.3(a). At the left side of Fig. 3.3(a), MPEG-compressed data is retrieved from the storage medium and passed on to an MPEG-2 demultiplexer to access the video Elementary Stream (ES), consisting of the individually encoded pictures. In order to apply trick play to such a sequence, the video ES is fully decoded, resulting in decompressed pictures. For fast-search trick play, as discussed in Section 2.5.1, playback is obtained via temporal subsampling of the original sequence of normal-play pictures. Alternatively, for slow-motion playback, individual pictures from the original normal-play picture sequence are repeated to slow down the actual playback. For these two playback modes, the newly derived sequences of pictures are again MPEG encoded and packetized, forming


Figure 3.3 — MPEG-2-compliant trick play. (a) Trick play based on transcoding. (b) MPEG-2-compliant trick play based on re-used normal-play compressed pictures.

a new MPEG-compliant stream, which can be transmitted across a network to any type of client. The previously described process requires full decoding and re-encoding, which explains the term transcoding. Employing this concept for deriving a fast-search video navigation signal results in an estimated load on key system resources, as depicted in Table 3.1. The figures in Table 3.1 for fast search apply to full decoding of a 25-Hz Standard Definition (SD) video signal with a 4:2:0 color format and intraframe encoding of the fast-search video trick-play sequence. Table 3.1 also indicates performance estimates for deriving a slow-motion video navigation sequence, whereby the slow-motion information is MPEG-2 encoded using MP@ML. Normal-play decoding of an MPEG-2 MP@ML-encoded video sequence involves the retrieval of information from one or two reference pictures, resulting in a doubling or tripling of the full bit rate of a single sequence. For a 25-Hz SD video sequence with 720 pixels and 576 lines and 4:2:0 sampling format, the uncompressed video bit rate equals 124.42 Mbit/s and involves a memory capacity requirement of 608 kBytes per full-color frame (720 × 576 × 1.5 bytes). When deriving a fast-search video navigation sequence on the basis of full decoding, the MPEG-2 decoding process basically scales with the applied trick-play playback speed. For example, when applying a fourfold search, the retrieved normal-play video sequence has to be decoded at four times the normal-play speed, in order to access the corresponding pictures constructing the fast-search trick-play sequence. This is a costly solution, requiring considerable system resources, in particular for high playback speeds, where the processing quickly becomes too expensive for a consumer device. The quickly growing processing costs cannot be mitigated by parallel decoding, which is not possible due to a strong time-dependence of the individually encoded pictures.


Table 3.1 — Estimated system-resource utilization for full decoding of MPEG-2-compressed MP@ML normal-play SD video and corresponding MPEG-2 encoding.

Speed-up   Decoder    Encoder    Memory capacity   DSP/CPU
factor     (Mbit/s)   (Mbit/s)   (kBytes)          cycle load
2          776        139        2,430             High
4          1,552      139        2,430             Very high
11         4,270      139        2,430             Extremely high
1/2        194        388        3,645             High
1/3        130        388        3,645             High
1/5        78         388        3,645             High

This load on system resources is further increased when augmenting the normal-play SD video resolution to High Definition (HD) or even Ultra High Definition Television (UHDTV/UltraHD). For example, the throughput rate of HD signals increases by a factor of five. Let us proceed with an example of how the throughput rate increases with a higher playback speed, see Table 3.1. Using the MPEG-2 decoder from Fig. 2.1(b), the consumed bandwidth involves the 15 Mbit/s video ES and three times the 124.42 Mbit/s for two motion-compensated reference pictures and one output stream, resulting in a total decoder throughput of 388.25 Mbit/s. For the situation that the playback speed Ps = 2, the throughput rate for MPEG-2 decoding is doubled, resulting in 776.5 Mbit/s (see footnote 1). The generation of a slow-motion video navigation signal differs from fast search in the sense that the decoding load declines when the slow-motion playback speed decreases, while the encoding load remains constantly high, as depicted in Table 3.1. The load on system resources, for both fast-search and slow-motion navigation methods, is further increased considering the involved MPEG-2 demultiplexing and multiplexing of the compressed video ES into a compliant MPEG-2 Transport Stream (TS).
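The bandwidth figures above follow from simple arithmetic, which the sketch below reproduces for verification; the 4:2:0 SD frame size and the 15 Mbit/s MP@ML ES rate are the values used in the text, and everything else is plain bookkeeping.

#include <stdio.h>

/* Sketch of the bandwidth arithmetic behind Table 3.1 (illustrative only).
 * A 25-Hz SD frame in 4:2:0 format holds 720 x 576 x 1.5 bytes; full decoding
 * touches the compressed ES once and the uncompressed rate three times
 * (two reference-picture accesses plus one output stream). */
int main(void)
{
    const double es_rate   = 15.0;                            /* Mbit/s, MP@ML  */
    const double raw_rate  = 720 * 576 * 1.5 * 8 * 25 / 1e6;  /* 124.416 Mbit/s */
    const double decode_bw = es_rate + 3.0 * raw_rate;        /* ~388.25 Mbit/s */

    for (int ps = 1; ps <= 4; ps *= 2)                        /* speed-up factor */
        printf("Ps = %d: decoder throughput = %.2f Mbit/s\n", ps, ps * decode_bw);
    return 0;
}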

B. Second concept: Re-use of MPEG-compressed normal-play video data

In a second approach, a video navigation sequence, either fast search or slow motion, is based on re-use of normal-play MPEG-2-compressed video information, avoiding video transcoding prior to transmission, see Fig. 3.3(b). At the left side in this figure, MPEG-compressed data is retrieved from the storage

1 For simplicity, the values in Table 3.1 have been truncated.

medium and passes through an MPEG demux to access the video ES, consisting of the individually encoded pictures. For trick play, as discussed in Section 2.5.1, fast-search playback is obtained via temporal subsampling of the MPEG-2-compressed video ES containing the sequence of normal-play pictures. Alternatively, for slow-motion playback, individual MPEG-2-compressed pictures from the video ES are repeated to slow down the actual playback. For these two playback modes, the newly derived sequence of MPEG-2-compressed pictures in ES format is forwarded to the final MPEG-2 multiplexer, after being re-formatted into a final video navigation ES by the MPEG-2 rate-control block. This functional block fulfills a dual rate control: (1) bit-rate control of the final video navigation ES via insertion of locally generated repetition pictures, and (2) frame-rate control, also via insertion of the same type of repetition pictures. The concept of repetition pictures will be further elaborated in Section 3.4. This second concept only re-uses MPEG-2-compressed ES pictures, which leads to: (1) the absence of transcoding and (2) the re-use of only compressed pictures, explaining the term "re-use" to indicate the algorithm. A further advantage of this concept is that the finally generated video navigation signal is MPEG-2 compliant. This circumvents the usage of the trick-play signaling information, provided by the MPEG-2 PES header, to control the output of the video decoder during trick play. Although this solution is specified by MPEG-2, its support is optional and not obligatory, as discussed in Section 2.5.1. Let us now address the effect of this second video navigation concept, aiming at re-use, on the involved system resources. The main involved signal processing is reduced to bit-stream parsing, thereby decoding picture headers and selecting the MPEG-2-compressed pictures which form the video navigation signal. Table 3.2 indicates the estimated load on the required key system resources for parsing and buffering MPEG-2-encoded pictures (MP@ML) forming the video navigation signal. Such a trick-play stream contains intra-coded pictures and, optionally, locally derived P-coded pictures. The primary objective of the locally generated P-pictures is to conduct rate control. The algorithmic details of this rate control are further elaborated in Section 3.4. Let us now further detail the required throughput rate and memory occupation of this video navigation approach. The maximum input bit rate of the MPEG-2-encoded normal-play video ES is determined by the employed MPEG-2 Level, which is 15 Mbit/s for the Main Level. When conducting fast search on such a video ES, the throughput rate scales with the applied fast-search playback speed. For example, when applying a fourfold search, the throughput rate is increased to 60 Mbit/s. The maximum size of an I-picture equals the input buffer size of an MPEG-2 decoder, which is 224 kBytes. When applying double buffering, the total memory capacity becomes 448 kBytes, as depicted in Table 3.2. The load for demultiplexing, parsing, extracting, separately buffering re-used pictures and final multiplexing results in an acceptable system-resource load, while the cycle load on a DSP/CPU for the involved signal


Table 3.2 — Estimated system-resource utilization for full re-use of MPEG-2-compressed MP@ML normal-play SD video.

Speed-up   Stream parsing   Stream transmission   Memory capacity   DSP/CPU
factor     (Mbit/s)         (Mbit/s)              (kBytes)          cycle load
2          30               15                    448               Modest
4          60               15                    448               Modest
11         165              15                    448               High
1/2        8                15                    448               Modest
1/3        5                15                    448               Modest
1/5        3                15                    448               Modest

processing is typically expected to be quite modest. We have discussed two approaches for deriving a video navigation signal and estimated their utilization of system resources. A comparison of the two approaches on the basis of the utilized system resources clearly reveals that the second navigation processing concept, based on re-used compressed pictures, is simpler than the first approach involving transcoding. For example, when conducting a fourfold search, the throughput of the transcoding approach involves 1.552 Gbit/s for decoding, while the parsing throughput requires only 60 Mbit/s, which is almost a factor of 26 lower. This difference is so large that the second system concept is the evident choice. The linear bandwidth increase with growing trick-play playback speed can still become significant for higher speeds. For example, Table 3.2 shows that for a playback speed of 11, the bandwidth becomes 165 Mbit/s, compared to 30-60 Mbit/s for low search speeds. This linear bandwidth increase can be avoided, in order to bound the throughput rate as much as possible. When utilizing the random-access capability of the storage medium, it is possible to selectively address the intra-coded pictures on a nearly individual basis. This random access is possible for both disc-based storage (BD) and solid-state memory (SSD) storage. In this approach, only normal-play fragments containing intra-coded pictures are retrieved, avoiding full bit-stream parsing at high data rates to extract these I-pictures. A solution to achieve this objective is the usage of Characteristic Point Information (CPI), a method which is also employed by the Blu-ray Disc (BD) optical disc standard [89]. CPI is a separately stored information signal, containing metadata descriptions of the stored audiovisual program. An example of such metadata is the start location of I-pictures on the storage medium. In this thesis, this type of information is indicated as locator

information, revealing the start position of a particular access unit for decoding.

The second concept, with bounded bandwidth for fast search based on metadata, forms the basis for a cost-efficient video navigation solution, suitable for deployment on standard DTV platforms.

3.4 Networked full-frame video navigation

In this section, we propose two algorithms suitable for performing autonomous client-server-based video browsing over a short-time interval, addressing fast-search playback in forward or reverse direction and slow-motion playback in forward direction. The two algorithms are based on the second concept, employing full re-use of MPEG-2-compressed normal-play pictures in combination with an MPEG-2-compliant rate control, as discussed in Section 3.3(B). Hereby, the MPEG-2-compliant rate control is different from a traditional rate control, as it is based on repetition pictures, which simultaneously control the picture frame rate as well as the bit rate. Prior to commencing with the fast-search and slow-motion video navigation algorithms, we introduce the elegant solution of repetition pictures, which are based on the MPEG predictive-coded pictures (P or B). Furthermore, we elaborate on essential system aspects related to the re-use of compressed normal-play pictures and the perceptual impact of repetition pictures on the final video navigation.

3.4.1 Usage of repetition pictures

A repetition picture is an artificial duplicate of a previously decoded picture, derived by means of prediction. This duplicate is inserted as a predictive-encoded picture in the compressed stream, with the purpose of manipulating the picture refresh-rate and/or bit rate of the video navigation signal, or creating an adaptation of the navigation playback speed. Manipulation of the picture refresh-rate and/or bit rate only occurs when generating the fast-search video navigation signal at the server side of the network. Adaptation of the playback speed occurs for either fast-search or slow-motion playback. The adaptation of the playback speed and the two forms of rate control are performed on the basis of re-used compressed normal-play video sequences. The creation of repetition pictures for implementing fast search and slow motion is different for interlaced and progressive video formats, while it also depends on the video navigation direction (forward/reverse), as depicted in Fig. 3.4. In television video, the video format is based on the following rules. In progressive video, each video picture is a frame without distinction of odd and even lines, and all lines are sampled at the same time instance. The


MPEG-2 coding is therefore frame-based. In interlaced video, a frame consists of two fields, each captured at a different time instance, separated by half of the frame period. A field contains either the odd or the even lines of the frame, which explains the terms "top" and "bottom" field (see footnote 2). Let us now elaborate on the impact of field- and frame-based repetition pictures and the difference between uni- and bi-directionally coded repetition pictures.

A. Interlace kill: a field-based repetition picture

In MPEG-2 video compression, temporal correlation is exploited using either field- or frame-based prediction. This feature can be elegantly employed for designing a video navigation system, with a smooth usage of predictive pictures based on field- and frame-based processing. More specifically, we will design a video navigation algorithm with repetition pictures, which enables rendering control at the field-by-field level, achieving a finer granularity of rendering control. This field-by-field rendering control is enabled for the situation that the normal-play video sequence has an interlaced video format. Furthermore, when the video scene contains significant motion, straightforward repetition of an interlaced picture may result in motion judder. This is particularly annoying for the viewer and should be avoided. A special processing step enables controlled field elimination. A field-based repetition picture enables partial repetition of the reference-buffer content by selecting a predefined reference field, eliminating the motion judder. This field-based elimination process is indicated as "interlace kill". Figure 3.6 visualizes the impact of "interlace kill". The picture in Fig. 3.6(a) is constructed from two fields, see Fig. 3.5. For example, the clearly moving football player is displaced during the field period, as shown in Fig. 3.6(b). When constructing a new picture on the basis of a single field, this displacement is removed, resulting in a static picture, see Fig. 3.6(c). This artifact removal comes at the expense of halving the vertical resolution. In this way, the field-elimination process is conducted by the MPEG-2 decoder.
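The display-side effect of "interlace kill" can be illustrated with a few lines of pixel manipulation. The sketch below only emulates the visual result of Fig. 3.6(c) on a decoded luminance plane; in the actual system, the elimination is achieved inside the MPEG-2 decoder through field-based prediction, not by post-processing.

#include <stdio.h>
#include <string.h>

/* Emulate the visual effect of "interlace kill" on a decoded luminance
 * plane stored line by line: every bottom-field line is overwritten with
 * the top-field line directly above it, so both fields carry identical
 * content and inter-field motion disappears, at the cost of halving the
 * vertical resolution. */
static void interlace_kill(unsigned char *luma, int width, int height, int stride)
{
    for (int y = 0; y + 1 < height; y += 2)
        memcpy(luma + (size_t)(y + 1) * stride,   /* bottom-field line */
               luma + (size_t)y * stride,         /* top-field line    */
               (size_t)width);
}

int main(void)
{
    unsigned char pic[4][4] = {
        { 1, 1, 1, 1 },   /* top-field line                      */
        { 9, 9, 9, 9 },   /* bottom-field line (displaced content) */
        { 2, 2, 2, 2 },
        { 8, 8, 8, 8 },
    };
    interlace_kill(&pic[0][0], 4, 4, 4);
    printf("line 1 now starts with %d (was 9)\n", pic[1][0]);
    return 0;
}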

B. Generation of repetition pictures using P- and B-picture syntax

For the repetition of re-used MPEG-2-compressed pictures, a distinction is made between normal-play reference pictures (I, P) and non-reference pictures (B).

2 In analog broadcasting, the terms "odd" and "even" field are sometimes employed, referring to the parity of the line numbers constructing that field; e.g. an odd field contains the odd lines.


Figure 3.4 — Examples of creating predictive-coded repetition pictures. (a) Frame-based repetition picture for progressive video. (b) Repetition picture with "Interlace kill" by removing the bottom field for reverse-direction trick play. (c) Repetition picture with "Interlace kill" by removing the top field for forward-direction trick play.

Figure 3.5 — Interlaced picture consisting of (a) Top field and (b) Bottom field.

Figure 3.6 — Interlaced display. (a) Merged top and bottom field. (b) Football player constructed from interleaved top and bottom fields. (c) Football player constructed with the aid of two fields with equal content.


Repetition of reference pictures can be obtained via either P- or B-pictures. Although the visual effect is equal, the MPEG-2 decoder behavior is different for P- and B-pictures. When repetition is obtained using P-pictures, the reference buffer is updated with the reconstructed picture after decoding. This update of the reference buffer is not enforced when repetition is obtained utilizing B-pictures. These aspects result in a set of guidance rules for the final video navigation algorithm, where a distinction is made between fast-search and slow-motion playback; a compact decision sketch is given after the list below.

• Normal-play progressive video / fast search and slow motion. Due to the absence of motion within individual frames, each frame can be independently used for video navigation. Hence, both P- and B-pictures can be deployed for repetition of normal-play reference pictures. Repetition of normal-play B-pictures is obtained via re-decoding of that B-picture.

• Interlaced video / fast search. For fast search we use only I-pictures because of the high search speed. Repetition of I-pictures is always possible due to the absence of temporal dependencies. However, motion judder should be avoided by means of the "interlace kill" processing step. Possible prediction pictures for repetition are P- and B-pictures.

• Interlaced video / slow motion. In slow motion, all pictures have to be decoded in order to preserve smooth motion. Hence, all normal-play pictures have to be repeated, so that I-, P- and B-pictures will be used. However, due to the presence of motion in interlaced reference pictures, repetition of I-, P- and B-pictures will potentially result in visible motion judder. Unfortunately, the I- and P-reference pictures cannot be modified, since they serve as a reference for intermediate B- and future P-reference pictures. In order to avoid this judder, repetition of reference pictures in the decoder can only be achieved within the syntax rules of the MPEG-2 standard, which prescribe B-pictures for repetition at the current frame-sequence position. For these pictures, the occurrence of motion judder can be avoided by employing "interlace kill". Next to motion judder in reference pictures, motion judder can also be present in original B-pictures. Due to the coding nature of these pictures, extrapolation from past or future, or interpolation from past and future, motion judder can only be avoided on the basis of transcoding, thereby eliminating one field. As this is computationally expensive, we propose to decode these pictures only once, thereby at least avoiding visible motion judder. Due to this rule, the smooth motion portrayal is jeopardized and a video navigation playback speed-error is introduced (normal-play B-pictures are not repeated).
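As announced above, the guidance rules can be condensed into a single decision function. The sketch below is an illustrative restatement of the three bullets, not the full algorithm of Sections 3.4.3 and 3.4.4; the enum and function names are chosen for this example only.

#include <stdio.h>

/* Condensed restatement of the guidance rules above (illustrative only). */
enum fmt  { PROGRESSIVE, INTERLACED };
enum mode { FAST_SEARCH, SLOW_MOTION };

static const char *repetition_strategy(enum fmt f, enum mode m, char pict_type)
{
    if (f == PROGRESSIVE)          /* no motion within an individual frame */
        return (pict_type == 'B') ? "repeat by re-decoding the B-picture"
                                  : "repeat via P- or B-type repetition picture";
    if (m == FAST_SEARCH)          /* interlaced: only I-pictures are re-used */
        return "repeat I-picture via repetition picture with interlace kill";
    if (pict_type == 'B')          /* interlaced slow motion */
        return "decode only once (avoid judder, accept speed-error)";
    return "repeat reference picture via B-type repetition picture with interlace kill";
}

int main(void)
{
    puts(repetition_strategy(INTERLACED, SLOW_MOTION, 'B'));
    return 0;
}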


3.4.2 System aspects for video navigation algorithms

Section 3.3 has proposed a cost-efficient video navigation concept, suitable for deployment in a networked environment. This concept re-uses normal-play MPEG-2-compressed pictures, employing predictive-coded repetition pictures as discussed in Section 3.4.1. This section elaborates on video system aspects caused by our proposal to re-use normal-play MPEG-2-compressed pictures for constructing a fast-search and slow-motion video navigation signal. One of these video system aspects involves the rate control of the MPEG-2 video system. Due to the insertion of repetition pictures, the influence on MPEG-2-compliant rate control involves both the bit rate and the video frame rate.

A. MPEG-2 rate-control adaptation when employing repetition pictures

Depending on the video format, MPEG-2 video compression may employ a different picture encoding method, see Fig. 3.7, which influences the derivation of a video navigation signal. Based on the applied MPEG-2 video coding, frame-based or field-based pictures are selected from a normal-play GOP to construct a fast-search video navigation signal. We have utilized this diagram for the derivation of a fast-search video navigation algorithm. The effect during fast-search playback is that inter-field motion is observed. Ideally, the moving object should travel in the opposite motion direction when performing backward search. However, this smooth motion portrayal in fast search cannot be guaranteed beforehand. Although there is no guarantee, we can still positively influence the motion rendering during fast search by adjusting the top/bottom field rendering order, depending on the original motion direction. In practical MPEG-2 implementations, the bit costs of the individual I-, P- and B-pictures vary significantly. As a rule of thumb, an I-picture

Figure 3.7 — Picture video format and MPEG-2 encoding options.


Figure 3.8 — Low-cost MPEG-2-compliant fast-search trick play for a 25-Hz television system. (a) Normal-play-compressed pictures. (b) Visualization of the used bit cost per compressed picture type and associated transmission time. Dark gray refers to I-picture, white to B-picture and medium gray to P-picture bit cost. (c) Temporal subsampling of the normal-play sequence, using a speed-up factor of 12, resulting in the selection of only I-pictures. (d) Retrieval of I-pictures at normal-play bit rate. (e) Trick-play bit rate due to re-used I-pictures at 25-Hz television rate. (f) Transmission of fast-search trick play based on re-used I-pictures and repetition pictures, indicated by medium gray color, to illustrate the ability to comply with the supported MPEG-2 Main Level for broadcasting.

consumes the highest bit cost, while a P-picture consumes roughly one third of an I-picture and a B-picture requires about one sixth to one ninth of an I-picture. Let us now return to the issue of MPEG-2 bit-rate control in the navigation framework using repetition pictures. We have already indicated that both the frame rate and the bit rate (see the above rule of thumb) influence our navigation concept. In the following discussion, we will address both aspects by means of an example for deriving a fast-search video navigation signal.


Consider a typical MPEG-2-compressed video sequence as depicted in Fig. 3.8. This figure visualizes the impact of re-using normal-play I-pictures regarding the involved transmission time and corresponding bit rate. In Fig. 3.8(a), a typical MPEG-2 Group-Of-Pictures (GOP) structure is depicted, with an associated bit cost per picture indicated in Fig. 3.8(b). The transmission time of a re-used I-picture (see Fig. 3.8(c)) typically involves more than a single frame display period, see Fig. 3.8(d). When concatenating successive I-pictures selected from a normal-play sequence for fast-search playback, while employing the original normal-play transmission time, the resulting frame rate will be clearly lower. Such a decoder situation will probably lead to a decoder buffer underflow. Increasing the bit rate, such that the I-picture is transmitted in a single frame display period (see Fig. 3.8(e)), may exceed the maximum bit rate defined by the involved MPEG-2 Level, resulting in a bit-rate violation. In order to avoid a bit-rate violation and buffer underflow, we propose that repetition pictures [90] are inserted at the server side and transmitted (see Fig. 3.8(f)) prior to the (large) I-picture, thereby generating additional display time and thus transmission time. Although the usage of repetition pictures enables the derivation of an MPEG-2-compliant fast-search video navigation sequence, the effective navigation playback speed is reduced, resulting in a navigation playback speed-error. We will discuss this playback speed-error in the sequel.
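The number of repetition pictures required in front of a large I-picture follows directly from the picture's bit cost and the channel rate. The sketch below shows this bookkeeping under the simplifying assumption that the (small) bit cost of the repetition pictures themselves is neglected; the numbers in the usage example are assumed values, not measurements.

#include <math.h>
#include <stdio.h>

/* Sketch: how many repetition pictures must precede an I-picture so that
 * it can be transmitted without violating the channel bit rate. An
 * I-picture of i_bits needs ceil(i_bits / bits_per_period) display
 * periods; all but the first period are bridged by repetition pictures. */
static int repetition_pictures(double i_bits, double bitrate, double frame_rate)
{
    double bits_per_period = bitrate / frame_rate;
    return (int)ceil(i_bits / bits_per_period) - 1;
}

int main(void)
{
    /* assumed example: 600-kbit I-picture in a 6 Mbit/s stream at 25 Hz */
    printf("%d repetition picture(s) required\n",
           repetition_pictures(600e3, 6e6, 25.0));
    return 0;
}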

B. Navigation playback speed-error

Video navigation on the basis of re-used MPEG-compressed normal-play video may suffer from a playback speed-error.

• Fast-search playback speed-error. A playback speed-error occurs during fast-search video navigation when using repetition pictures to derive an MPEG-2-compliant fast-search video navigation sequence on either progressive or interlaced video.

• Slow-motion playback speed-error. A playback speed-error occurs during slow motion on MPEG-2 Main-Profile-encoded interlaced video (see footnote 3), due to the absence of artifact-free picture repetition when re-using MPEG-2-compressed normal-play B-pictures.

Fast-search playback speed-error. A fast-search playback speed-error arises when a selected I-picture requires a transmission time of more than one display period. Such an I-picture is preceded by repetition pictures, resulting in a repetition of the last decoded I-picture. These repetition pictures lower the playback speed, resulting in a playback speed-error. Such a playback speed-error is avoided when skipping a number of normal-play I-pictures equal to the

3 From now on, we assume that the compressed video is MPEG-2 MP encoded.


Figure 3.9 — Generation of fast-search trick play with reduced refresh-rate. (a) GOPs constructing the normal-play video. (b) Selected and skipped intra-coded pictures constructing a fast-search trick-play sequence. (c) Construction of a fast-search trick play with P-picture-based repetition, resulting in an MPEG-2-compliant trick-play sequence.

number of inserted repetition pictures, succeeding the "large" I-picture. For the situation that I-pictures are not skipped, the insertion of repetition pictures allows the generation of a video navigation sequence with a playback speed lower than the GOP length N. Let us now further analyze an MPEG-2-encoded normal-play sequence with a GOP length of N = 12. Fast-search trick play is possible for playback speeds Ps = {12, 24, ..., 12n} with n being an integer, as discussed in Section 2.5.1. For the situation that a selected I-picture requires a transmission time of two display periods, a single repetition picture precedes such an I-picture, as depicted in Fig. 3.9. In Fig. 3.9, I-picture Ij+1 requires a transmission time of two display periods. Repetition picture P causes repetition of I-picture Ij, resulting in a time interval of two display periods to transmit I-picture Ij+1. The insertion of repetition picture P results in a playback speed-error, as effectively two images are derived from GOPj instead of one, resulting in an effective playback speed Ps = 6 for the normal-play time interval corresponding to GOPj. The intended playback speed is restored by skipping I-picture Ij+2, see Fig. 3.9.

Slow-motion playback speed-error. A slow-motion playback speed-error arises for the situation that an interlaced normal-play video sequence is encoded with Main-Profile MPEG-2. Such a video sequence contains B-pictures, which can only be displayed once, in order to avoid motion judder. Let us now further analyze the repetition process in terms of the MPEG-2 GOP and MPEG-2 Profile parameters. For the situation of interlaced video, normal-play MPEG-2-compressed B-pictures are basically transmitted only once and not repeated, to avoid motion judder. The prevention of the above-mentioned motion judder leads to a video navigation speed-error during slow motion,


caused by the fact that not every picture is displayed for an equal number of display periods. This situation can be avoided by displaying the reference pictures more often than required. This will become clear from the following analysis. Let N be the GOP length and M the P-distance of an MPEG-2-compressed normal-play video sequence. The reciprocal of the trick-play slow-motion speed Ps is limited to an integer value, i.e. 1/Ps = {2, 3, 4, ...}. The number of display periods Dp used for normal-play picture repetition during slow motion is specified by

Dp = 1/Ps. (3.1)

The product N · Dp indicates the total number of slow-motion pictures derived from a normal-play GOP. This set of slow-motion pictures can be separated into a subset based on repeated normal-play reference pictures and a subset based on repeated normal-play B-pictures, which is analytically described by

N · Dp = Dp · Refpict + Dp · Bpict. (3.2)

We define the display error De as the number of absent display periods, which can be calculated on the basis of the normal-play GOP length N, the P-distance M and the number of display periods Dp used to repeat each normal-play picture. The display error is the difference between the intended display repetitions and the actual display repetitions, described by

De = N · Dp − Dp · (N/M) − (N − N/M). (3.3)

Factorization of Eqn. (3.3) provides

De = N [Dp · (1 − 1/M) − (1 − 1/M)]. (3.4)

From Eqn. (3.4), it becomes clear that for normal-play video sequences encoded at MPEG-2 Simple Profile (SP), which is characterized by M = 1, this results in De = 0, a zero-valued display error. For MPEG-2 MP with M > 1, the display error De > 0. The speed error can be compensated via additional display repetitions of the normal-play reference pictures. The number of reference pictures per GOP is N/M. The additional number of reference repetitions Ar is therefore the display error from Eqn. (3.4) divided by this fraction, resulting in

Ar = Dp(M − 1) − (M − 1). (3.5)

For a normal-play video sequence with GOP lengths N = 12 and N = 15 and various values of M, Table 3.3 shows the display error and the additional reference repetitions for a slow-motion speed of 1/3. For the situation that the reciprocal


of the applied slow-motion playback speed Ps is an integer value, the slow-motion playback speed-error compensation on the basis of Ar (see Eqn. (3.5)), i.e. the additional repetition of reference pictures, results in an exact speed-error compensation. We now proceed with an example of the derivation of a slow-motion sequence on the basis of re-used MPEG-2-compressed pictures. Figure 3.10(a) indicates a fragment from an MPEG-2-compressed normal-play sequence. Figure 3.10(b) shows a derived slow-motion navigation sequence for a playback speed of 1/2. This slow-motion video navigation sequence is obtained by means of reference-picture repetition on the basis of artificially derived repetition pictures and re-decoding of B-pictures. Such a derivation of a slow-motion video navigation sequence is possible for a video sequence with a progressive format. Figure 3.10(c) indicates the slow-motion fragment derived from the normal-play fragment depicted in Fig. 3.10(a), for the situation that the normal-play video sequence has an interlaced format. The video navigation sequence depicted in Fig. 3.10(c) has a speed error, as not all normal-play pictures are equally repeated. Figure 3.10(d) depicts the slow-motion video navigation sequence, which has been compensated for the trick-play speed-error on the basis of additional reference-picture repetitions. Based on Section 3.4.1 and Section 3.4.2, we summarize the main design rules for MPEG-2-compliant fast-search and slow-motion video navigation.

• The video navigation sequence follows a push-based communication model.

• Repetition pictures are deployed for trick play, while realizing MPEG-2-compliant rate control.

Table 3.3 — Display error De and additional reference repetitions Ar for Dp = 3 and various values of M with N = 12 and N = 15.

N    M    N/M    De     Ar
12   2    6      12     2
12   3    4      16     4
12   4    3      18     6
12   6    2      20     10
15   2    7.5    15     2
15   3    5      20     4
15   4    3.75   22.5   6
15   6    2.5    25     10
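The sketch below evaluates Eqns. (3.4) and (3.5) for the GOP structures of Table 3.3 and can be used to reproduce its De and Ar columns; Dp = 3 corresponds to the slow-motion speed of 1/3.

#include <stdio.h>

/* Evaluate the display error De (Eqn. 3.4) and the additional reference
 * repetitions Ar (Eqn. 3.5) for the GOP structures listed in Table 3.3. */
int main(void)
{
    const int Dp   = 3;                         /* slow-motion speed 1/3 */
    const int Ns[] = { 12, 15 };
    const int Ms[] = { 2, 3, 4, 6 };

    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 4; j++) {
            int    N  = Ns[i], M = Ms[j];
            double De = N * (Dp - 1) * (1.0 - 1.0 / M);   /* factored Eqn. (3.4) */
            int    Ar = Dp * (M - 1) - (M - 1);           /* Eqn. (3.5)          */
            printf("N=%2d M=%d: De=%5.1f Ar=%2d\n", N, M, De, Ar);
        }
    return 0;
}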


Figure 3.10 — Examples of slow-motion behavior for interlaced and progressive MPEG-2-compressed video with halved playback speed (1/2). (a) Normal-play video GOP with N = 4, M = 2. (b) Derived slow motion for the situation of progressive video. (c) Derived slow motion for the situation of interlaced video with playback speed-error. (d) Derived slow motion for the situation of interlaced video with correct playback speed.


• Both progressive and interlaced video formats are supported.

• "Interlace kill" for field-based rendering control is adopted to avoid motion judder.

• The rendering order of fields is adapted to the forward or reverse search direction.

• Trick-play playback speed-errors are compensated.

3.4.3 Algorithm for MPEG-2-compliant fast-search trick play

This section presents our proposed algorithm for deriving an MPEG-2-compliant fast-search trick-play sequence, for either forward or reverse search, on the basis of re-used MPEG-2-compressed normal-play I-pictures. The algorithm is based on the design rules discussed in Section 3.4.1 and Section 3.4.2. Figure 3.11 depicts the video navigation framework facilitating fast search and slow motion. The involved signal processing for both forms of video navigation is separated into two steps. The first step is conducted during recording, while the second step is performed during video navigation playback. In the first step, the recorded program is analyzed, deriving essential bitstream aspects, called Characteristic Point Information (CPI), which are stored as metadata on the storage medium. In the second step, signal processing is conducted on the provided normal-play information, resulting in an MPEG-2-compliant fast-search or slow-motion


Figure 3.11 — Fast-search and slow-motion video navigation framework.

video navigation signal. During fast search, fragments of the full normal-play TS containing the I-pictures are provided to the playback processing, while during slow motion, the full normal-play TS is provided at a playback speed equal to the slow-motion speed Ps. Next to the visual information, CPI information is supplied to the playback processing, simplifying the generation of the final MPEG-2-compliant video navigation signal. We will now elaborate on the algorithm of our proposed video navigation.

CPI generation during recording

The proposed fast-search and slow-motion video navigation algorithms employ characteristics of the MPEG-2-compressed normal-play video sequence, which cannot all be retrieved from the MPEG-2 syntax without full stream analysis. In order to avoid stream analysis during video navigation playback, the normal-play video sequence is analyzed during recording. For each normal-play GOP, the following characteristics are calculated: (1) the GOP length N, (2) the P-distance M, (3) the number of repetition pictures required to transmit the I-picture and (4) the I-picture start position in the recorded TS. Appendix D shows the algorithmic steps for deriving these CPI components; a sketch of the resulting metadata record is given below.
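As a concrete illustration, one CPI entry could be laid out as in the struct below; the field names are hypothetical and chosen for this sketch only, not taken from the BD specification or from the thesis implementation.

#include <stdint.h>

/* Hypothetical layout of one CPI entry holding the four characteristics
 * enumerated above. One entry is generated per normal-play GOP during
 * recording and consumed by the navigation playback processing. */
struct cpi_entry {
    uint16_t gop_length;    /* (1) GOP length N                                         */
    uint8_t  p_distance;    /* (2) P-distance M                                         */
    uint8_t  nr_rep_pict;   /* (3) repetition pictures needed to transmit the I-picture */
    uint64_t i_pict_start;  /* (4) I-picture start position (bytes) in the recorded TS  */
};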

Assumption on interlaced video broadcast

In Section 3.4.1, the concept of "interlace kill" has been introduced, where "odd" (top) and "even" (bottom) fields were distinguished. In the sequel, we assume that interlaced video always starts with the "odd" (top) field. The proposed


algorithms are based on this format, which is sometimes modified depending on the search direction, in order to optimize the perceptual quality of the navigation. When modified, this will be explicitly indicated in the algorithms. In practice, this is a reasonable assumption, as broadcasters typically maintain this format.

Fast-search signal processing during playback

Figure 3.12 shows the flowchart revealing the fast-search playback processing to generate a fast-search video navigation sequence. The video navigation processing involves readout of the MPEG-2 headers preceding the I-picture (Parse MPEG-2 headers), revealing the normal-play video format (progressive or interlaced) and picture size (width and height). On the basis of the derived picture size, a repetition picture is generated. The generation of a repetition picture is detailed in Fig. 3.13 and should be seen as an insert (Generate repetition picture) to Fig. 3.12. In case of an interlaced video format (interlaced == True ?), access to the picture header is required in order to adapt the rendering order of top and bottom field, depending on the video navigation direction (reverse == True ?). Insertion of repetition pictures (nr rep pict > 0 ?) depends on the I-picture metadata, which contains the number of repetition pictures "nr rep pict" (Read nr rep pict). This number was already calculated and stored as metadata during recording and reveals the number of repetition pictures to be generated and transmitted prior to an I-picture, creating an MPEG-2-compliant video navigation sequence. Adaptation of the "nr rep pict" value may be required (Calculate speed-error) when deriving a fast-search video navigation sequence for playback speeds Ps < N, with N being the normal-play GOP length and Ps the trick-play playback speed. This adaptation is incorporated in the detailed algorithmic description, see Algorithm 1 and Algorithm 2. Figure 3.13 shows the flowchart for generating a repetition picture, employed in the flowchart depicted in Fig. 3.12. In this figure, two signal paths are depicted. The left path depicts the repetition-picture generation for progressive video. This generation depends only on the picture size (width, height). The second signal path indicates the repetition-picture generation for the interlaced video format. The difference is based on the reference field from which the prediction is made, which depends on the video search direction (reverse == True ?). Slices forming a repetition picture are constructed on the basis of the normal-play image width, while the normal-play image height determines the number of slices constructing a repetition picture. For the situation that a video signal originates from an interlaced video source, the repetition picture is


Figure 3.12 — Fast-search video sequence generation re-using MPEG-2-coded I-pictures for fast-search video navigation in forward or reverse direction.

created on the basis of two field-based pictures, both referring to the same reference field (top/bottom). In this way, motion judder is avoided ("interlace kill"), whereby the eliminated field depends on the search direction, as depicted in Fig. 3.4(b)(c). Details on the fast-search video navigation algorithmic steps are found in Algorithms 1, 2 and 3. These algorithms are not related to Fig. 3.13. Algorithm 1 indicates the algorithmic steps involved in selecting the I-pictures


Algorithm 1 I-picture selection for fast-search trick play

Require: Ps, metadata[], direction
Ensure: I-picture selection and correct nr rep pict value for fast-search trick play at playback speed Ps
Initialize: CPIindex = 0, NextCPIindex = 0, Speed error = 0
while not end of fast search do ⊲ retrieve selected I-picture and CPI
  N = metadata[CPIindex].N ⊲ read GOP length
  nr rep pict = metadata[CPIindex].nr rep pict ⊲ read nr rep pict
  if (Ps < N) then
    call SelectIpic 1()
  else ⊲ case (Ps ≥ N)
    call SelectIpic 2()
  end if
  if direction == forward then
    CPIindex += 1 ⊲ go to next CPI entry
  else
    CPIindex −= 1 ⊲ go to previous CPI entry
  end if
end while


Figure 3.13 — Repetition picture generation for fast-search forward or reverse video navigation.

constructing the video navigation sequence, as depicted in Fig. 3.12, with the objective to avoid a video navigation playback speed-error. Such an error may occur when the re-used MPEG-2-compressed I-pictures are located at temporal locations different from those imposed by the fast-search video navigation temporal subsampling grid. Correction of this playback speed-error is distributed over Algorithms 1, 2 and 3. Hereby, Algorithm 1 performs playback-speed tracking over time (fine-tune operation), while Algorithm 2 and Algorithm 3 implement a coarse navigation-speed adaptation. However, the playback speed-error correction conducted by Algorithm 3 is not always exact, depending on the navigation playback speed Ps and the normal-play GOP length N.


Algorithm 2 Select I-picture for Ps < N

function SelectIpic 1()
  if (nr rep pict > Add rep) then
    temp1 = nr rep pict − Add rep ⊲ calc. rep. pict. difference
    temp2 = temp1 div (Add rep + 1)
    temp3 = temp1 mod (Add rep + 1)
    temp4 = (Add rep + 1) − temp3 ⊲ calc. remaining rep. pict.
    if (temp3 == 0) then
      skip CPI = temp2 ⊲ skip nr. GOP
    else
      skip CPI = temp2 + 1 ⊲ skip nr. GOP
      nr rep pict += temp4 ⊲ adjust navigation GOP
    end if
    if (direction == forward) then
      NextCPIindex = CPIindex + skip CPI ⊲ skip GOP entries
    else
      NextCPIindex = CPIindex − skip CPI ⊲ skip GOP entries
    end if
  else
    if (direction == forward) then
      NextCPIindex = CPIindex + 1 ⊲ next GOP entry
    else
      NextCPIindex = CPIindex − 1 ⊲ previous GOP entry
    end if
    nr rep pict = Add rep ⊲ compensate for absent images
  end if
  retrieve TS fragment ⊲ read I-picture
  nr rep pict += Speed error ⊲ compensate for speed error
  Speed error = 0 ⊲ reset speed error
end function

Algorithm 1 calculates a potential playback speed-error, which depends on the relation between the normal-play GOP length N and the fast-search playback speed Ps. This calculation is performed on the CPI data (stored as metadata) during navigation playback. Depending on the outcome of the relation between the navigation playback speed Ps and the normal-play GOP length N, either Algorithm 2 or Algorithm 3 is applied. There are two approaches for compensating the playback speed-error. For the situation that Ps < N, a playback speed-error occurs when N mod Ps ≠ 0, while for Ps ≥ N, a playback speed-error occurs when Ps mod N ≠ 0. A playback speed-error can be reset to zero at discrete moments in time or averaged


over time. To illustrate this, Algorithm 1 employs both methods. For the situation that Ps < N, Algorithm 2 evaluates the inequality nr rep pict > Add rep. Hereby, nr rep pict indicates the number of repetition pictures required to transmit an I-picture, while Add rep indicates the number of repetition pictures required to reduce a navigation playback speed-error. Solving the inequality nr rep pict > Add rep results in the calculation of the next CPI entry, indicated by NextCPIindex, as well as the final value for nr rep pict. This inequality is solved by employing the concept of skipping a successive I-picture (skip CPI), which is enabled by generating a navigation GOP length that is a multiple of the value Add rep + 1. Here, the value Add rep + 1 reveals the minimum navigation GOP length that matches the video navigation playback speed. This calculation is conducted employing four intermediate variables temp1, ..., temp4, see Algorithm 2. The variable temp1 indicates the difference in repetition pictures that exists between nr rep pict and Add rep. The calculated difference temp1 is tested against the navigation GOP length Add rep + 1. The value of temp2 is used to calculate the next CPI entry, thereby skipping an I-picture, whereas the value of temp3 is used to control the adaptation of the final number of repetition pictures. For this adaptation, an error signal temp4 is calculated, indicating the additionally involved repetition pictures, creating a navigation GOP with a GOP length that is an integer multiple of Add rep + 1. Algorithm 3 is applied when Ps ≥ N. The objective is to reduce a potential speed-error. This algorithm calculates, depending on the value of nr rep pict, the next CPI entry. The number of GOPs that are skipped is indicated by Ps div N. This integer-division value may be scaled by nr rep pict, for the situation that nr rep pict > 0. The calculation Ps div N is exact for the situation Ps mod N = 0. When Ps mod N ≠ 0, a playback speed-error remains, which is resolved in Algorithm 1, as discussed earlier.
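To make the skip arithmetic of Algorithm 2 concrete, the sketch below translates its core bookkeeping into C; the variable names follow the pseudocode, the CPI handling and stream retrieval are omitted, and the mapping of temp1, ..., temp4 onto plain C operators is an interpretation of the text above.

#include <stdio.h>

/* Core bookkeeping of Algorithm 2 (Ps < N), illustrative translation:
 * nr_rep_pict = repetition pictures needed to transmit the I-picture,
 * add_rep     = repetition pictures needed to realize the playback speed.
 * Outputs: how many CPI entries (GOPs) to skip and the final number of
 * repetition pictures to emit. */
static void select_ipic_1(int nr_rep_pict, int add_rep,
                          int *skip_cpi, int *final_rep)
{
    if (nr_rep_pict > add_rep) {
        int diff = nr_rep_pict - add_rep;        /* temp1 */
        int q    = diff / (add_rep + 1);         /* temp2 */
        int r    = diff % (add_rep + 1);         /* temp3 */
        if (r == 0) {
            *skip_cpi  = q;                      /* navigation GOP already fits */
            *final_rep = nr_rep_pict;
        } else {
            *skip_cpi  = q + 1;
            *final_rep = nr_rep_pict + (add_rep + 1) - r;   /* add temp4 */
        }
    } else {
        *skip_cpi  = 0;                          /* advance one GOP entry */
        *final_rep = add_rep;                    /* compensate for absent images */
    }
}

int main(void)
{
    int skip, rep;
    select_ipic_1(3, 1, &skip, &rep);  /* large I-picture at a speed needing 1 repetition */
    printf("skip %d extra GOP entries, emit %d repetition picture(s)\n", skip, rep);
    return 0;
}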

3.4.4 Algorithm for MPEG-2-compliant slow-motion trick play

This section presents our proposed algorithm to derive an MPEG-2-compliant slow-motion video navigation sequence for forward search, re-using MPEG-2-compressed normal-play pictures. The algorithm considers the system aspects discussed in Section 3.4.2 and employs repetition pictures, as discussed in Section 3.4.1, to realize the required playback frame-rate control.


Algorithm 3 Select I-picture for Ps ≥ N

function SelectIpic 2()
  if (nr rep pict > 0) then
    if (direction == forward) then
      NextCPIindex = CPIindex + (nr rep pict · ⌊Ps/N⌋)
    else
      NextCPIindex = CPIindex − (nr rep pict · ⌊Ps/N⌋)
    end if
  else
    if (direction == forward) then
      NextCPIindex = CPIindex + ⌊Ps/N⌋ ⊲ next entry
    else
      NextCPIindex = CPIindex − ⌊Ps/N⌋ ⊲ previous entry
    end if
  end if
  retrieve TS fragment ⊲ read I-picture
end function

Figure 3.11 depicts the video navigation framework already discussed, but now employed for slow-motion video navigation. The involved slow-motion signal processing is separated into two steps, as described in Section 3.4.3, resulting in signal processing during recording and signal processing during slow-motion playback. During recording, the P-distance M is explicitly calculated in advance, thereby avoiding bitstream analysis at the picture-header level during slow-motion playback. The proposed algorithm is suitable to derive a slow-motion video navigation sequence for both progressive and interlaced video formats, coded in either MPEG-2 Simple or Main Profile or MPEG-2 intraframe. Slow-motion video navigation is obtained by repetitive display of the individual pictures constructing a normal-play video sequence. When conducting the repetition process on the basis of re-used MPEG-2-compressed pictures, the relation between successive compressed pictures has to be considered. Figure 3.14 visualizes the possible transitions between the three different MPEG-2 picture types constructing a normal-play video sequence. Figure 3.15 depicts the slow-motion signal processing flowchart, revealing the main processing steps, which are different for interlaced and progressive video. The employed repetition pictures are generated (Generate repetition picture) prior to slow-motion video navigation. Fig. 3.16 should be seen as an insert to the previous Fig. 3.15 (Generate repetition picture) and describes the generation of either a P-repetition picture or a B-repetition picture. Our proposed algorithm employs the transitions between successive compressed pictures as visualized in Fig. 3.14, to control the derivation of the final slow-motion MPEG-2-compliant


Algorithm 4 shows algorithmic steps detailing the overall slow-motion flowchart in Fig. 3.15. Depending on the parsed picture (switch curpict do), either Algorithm 5, 6 or 7 is applied. Let us now further elaborate on Algorithm 4. During slow-motion playback, the normal-play video sequence is provided at a playback speed corresponding to the slow-motion playback speed Ps (now a fraction), resulting in a reduced frame rate fr · Ps, with fr being the normal-play television frame rate. The involved slow-motion processing either inserts MPEG-2-compressed repetition pictures, or repeats re-used normal-play pictures, as shown in Algorithms 5, 6 and 7, re-establishing the full television frame rate fr. Insertion of repetition pictures or repetition of the current picture n depends on the properties of the current picture n, the previous picture n − 1 and, in case of interlaced format, also on the P-distance M. Parameter n is an incremental index indicating MPEG-2-compressed pictures in transmission order. Prior to filtering the properties of picture n by means of a filter operation in the MPEG-2-compressed domain, the TS is first demultiplexed, see the top of Algorithm 4. The obtained video ES is parsed to locate individual pictures, filter the requested parameters and adapt top/bottom-field rendering. The extracted parameters are the video format and the image size, while the rendering process is adapted for interlaced video (video format == interlaced) such that a top field is always rendered first. The processing encompasses start-code detection conforming to the MPEG-2 standard and retrieving the full compressed picture.
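As a numeric illustration (our own sketch) of this frame-rate control:

```python
from fractions import Fraction

def slowmotion_display_count(ps: Fraction) -> int:
    """Dp = 1/Ps: how often each normal-play picture must appear in the
    navigation sequence to restore the full television frame rate
    (the original picture plus Dp - 1 repetition pictures)."""
    dp = Fraction(1, 1) / ps
    assert dp.denominator == 1, "Ps must be an integer reciprocal"
    return int(dp)

fr = 25                                  # European frame rate (Hz)
ps = Fraction(1, 3)                      # slow-motion playback speed
print(fr * ps)                           # reduced source rate: 25/3 Hz
print(slowmotion_display_count(ps))      # Dp = 3 display slots per picture
```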

Figure 3.14 — Possible picture transitions of an MPEG-2-coded normal-play video sequence.


On the basis of picture n (switch curpict do) and picture n − 1, the algorithm decides to insert repetition pictures, repeating the previous reference picture, or to duplicate picture n (B-picture), see Algorithms 5, 6 and 7. The slow-motion algorithm utilizes either P-pictures or B-pictures to repeat the previous reference picture, see Algorithm 8. For the situation that the normal-play video sequence is intraframe-coded, I-picture repetition is conducted on the basis of inserting P-pictures, whereas slow-motion for MPEG-2 SP and MP involves B-pictures for reference-picture repetition. For progressive video, B-pictures in MPEG-2 MP are repeated by means of repetitive decoding, while for interlaced video, B-pictures are decoded only once, in order to avoid motion judder, see Algorithm 9. Motion judder is avoided by employing repetition pictures enforcing “Interlace kill”; therefore, the top/bottom-field rendering order of interlaced pictures is set such that always the top field is rendered first. In this way, the “Interlace kill”-based repetition pictures will always repeat the last displayed field.

Figure 3.15 — Slow-motion playback signal processing re-using MPEG-2-coded normal-play pictures.


Algorithm 4 MPEG-2-compliant slow-motion trick play

Require: Ps, Pdistance
Ensure: Slow motion for playback speed Ps
Initialize: prevpict = Null, Dp = 1/Ps
while not end of slow motion do    ⊲ apply processing on normal-play video
    demultiplex TS packets    ⊲ extract video ES
    find start code in video ES    ⊲ bitstream parsing
    if (start code == sequence header) then
        get picture size    ⊲ image width and height
        transmit parsed data
    end if
    if (start code == sequence extension) then
        video format = progressive sequence    ⊲ determine video format
        transmit parsed data
    end if
    if (start code == GOP header) then
        M = Pdistance    ⊲ provided by CPI and calculated per GOP
        transmit parsed data
    end if
    if (start code == picture header) then
        curpict = picture type    ⊲ determine picture type I, P or B
        curtempref = temporal reference    ⊲ determine temporal location
        if (video format == interlaced) then
            top field = 1    ⊲ force top-field rendering first
        end if
        parse current picture    ⊲ read complete (I, P, B) picture
    end if
    switch curpict do
        case I: call I-picture()
        case P: call P-picture()
        case B: call B-picture()
    prevpict = curpict    ⊲ store current picture type I, P or B
    prevtempref = temporal reference    ⊲ store current temporal location
end while
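The control flow of Algorithm 4 maps onto a simple dispatch loop; the toy sketch below (our own names, with the parsing details stripped away) only records the picture-type transitions of Fig. 3.14:

```python
def slowmotion_dispatch(picture_types, handlers):
    """Dispatch each parsed picture to its type-specific processing step,
    tracking the previous picture type as in Algorithm 4."""
    prevpict = None
    for curpict in picture_types:       # picture types in transmission order
        handlers[curpict](prevpict)     # Algorithm 5, 6 or 7
        prevpict = curpict

# Toy usage: print the (prev, cur) transitions for an N = 6, M = 3 GOP
# in transmission order.
trace = []
handlers = {t: (lambda p, t=t: trace.append((p, t))) for t in "IPB"}
slowmotion_dispatch("IPBBPBB", handlers)
print(trace)
```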

Prior to explaining the algorithmic details of slow-motion signal processing, we discuss the meaning of the temporal reference index number assigned to each compressed MPEG-2 picture.

Figure 3.16 — Repetition-picture generation for slow-motion forward video navigation based on P- or B-type picture syntax. For repetition from normal-play intra-only MPEG-2 sequences, the I-pictures are repeated as P-pictures. For repetition, normal-play MPEG-2 sequences using SP or MP coding employ only B-pictures. Hence, the figure should be read as either P- or B-picture only, depending on the employed coding format.


Algorithm 5 I-picture slow-motion processing
function I-PICTURE()
    temporal reference = 0
    transmit current picture
    start navigation = False
    switch prevpict do
        case (prevpict == Null)
            start navigation = True    ⊲ potential open GOP
        case (prevpict == I)    ⊲ I-only sequence
            temporal reference = prevtempref + 1
            call P-repetition()
            start navigation = False
        case (prevpict == P)
            temporal reference = Dp ∗ prevtempref + 1
            call B-repetition()
            start navigation = False
        case (prevpict == B)
            temporal reference = Dp ∗ (prevtempref + 1) + 1
            call B-repetition()
            start navigation = False
end function

Due to the absence of the P-distance M in the MPEG-2 syntax, the MPEG-2 video decoder determines M on the basis of the temporal index distance between two successive reference frames. Since we are creating additional pictures for slow-motion, our algorithm has to satisfy the MPEG-2 syntax and therefore should calculate the correct temporal index, combined with the newly employed repetition-picture type. In this way, the correct calculation of the decoding and presentation time stamps is facilitated at the server and can be flawlessly used by the decoder client.

Algorithm 5 reveals the four algorithmic steps for the situation that the current picture is an I-picture. Slow-motion video navigation always starts with an I-picture, enabling proper sequence decoding. For a normal-play open-GOP situation, which becomes clear when parsing the next normal-play picture, a signal start navigation is set. This signal enables removal of potential B-picture(s), which may not be correctly decoded due to the absence of a previous reference picture, see Algorithm 7 (if start navigation == True then). The signal start navigation is only active until the next reference picture is processed, see Algorithm 5 and Algorithm 6.
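As a small illustration of this bookkeeping (our own reconstruction of the scaling rule visible in Algorithms 5–7):

```python
def scaled_temporal_reference(curtempref: int, dp: int) -> int:
    """Temporal reference of a re-used picture after slow-motion expansion.

    Each displayed picture occupies dp display slots, so display-order
    indices of re-used pictures scale by Dp (cf. Algorithms 5-7)."""
    return dp * curtempref

# Example for Dp = 2: normal-play display positions 0,1,2,... map to
# 0,2,4,..., leaving the intermediate slots for the repetition pictures.
print([scaled_temporal_reference(t, 2) for t in range(5)])  # [0, 2, 4, 6, 8]
```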


Algorithm 6 P-picture slow-motion processing
function P-PICTURE()
    temporal reference = Dp ∗ curtempref
    transmit current picture
    switch prevpict do
        case (prevpict == I)
            temporal reference = curtempref
            call B-repetition()    ⊲ duplicates previous ref. pict.
        case (prevpict == P)
            temporal reference = Dp ∗ prevtempref + 1
            call B-repetition()    ⊲ duplicates previous ref. pict.
        case (prevpict == B)
            temporal reference = Dp ∗ (curtempref − M) + 1
            call B-repetition()    ⊲ duplicates previous ref. pict.
            start navigation = False    ⊲ no more B-pictures need to be removed
end function

For the situation that the previous picture equals Null (case prevpict == Null), this indicates the very beginning of a slow-motion video navigation sequence. In this exceptional case, the current I-picture is used to start the navigation sequence. Technically, this means that the signal start navigation is set “True”, the temporal reference is set and the current I-picture is transmitted. For the regular situation, we detect a transition between successive compressed picture types, in order to determine the correct slow-motion processing. When the previous picture type equals I (case prevpict == I), P-repetition pictures are employed as a construct to make a repetition based on the previous I-picture (see Algorithm 8), followed by the transmission of the current I-picture. For the situation that the previous picture type equals P (case prevpict == P), while the current type is I, the temporal reference is set and the I-picture is transmitted, followed by the transmission of locally derived B-based repetition pictures (B-type is the expected type after I-type), see Algorithm 8. These B-pictures effectively repeat the previous P-picture. For the situation that the previous picture type equals B (case prevpict == B), the temporal reference is set and the current I-picture is transmitted. In order to repeat the previous reference picture, which is presumably a P-type picture, we insert B-based repetition pictures as expected, which enforces repetition of this last P-type reference picture, see Algorithm 8.

Algorithm 6 reveals the three algorithmic steps for the situation that the current picture is a P-picture. When the previous picture type equals I (case prevpict == I), B-repetition pictures are employed to repeat the previous I-picture (see Algorithm 8). Prior to repetition-picture insertion, the P-picture is transmitted with a new temporal reference.


Algorithm 7 B-picture slow-motion processing
function B-PICTURE()
    switch prevpict do
        case (prevpict == I)
            temporal reference = Dp ∗ curtempref − 1
            if start navigation == True then
                remove B-picture    ⊲ avoid decoding error
            else
                call B-duplication()    ⊲ duplicates current picture
            end if
        case (prevpict == P)
            temporal reference = Dp ∗ curtempref
            call B-duplication()    ⊲ duplicates current picture
        case (prevpict == B)
            if (video format == interlaced) then
                temporal reference = temporal reference    ⊲ value already calculated
            else
                temporal reference = Dp ∗ curtempref
            end if
            if start navigation == True then
                remove B-picture    ⊲ avoid decoding error
            else
                call B-duplication()    ⊲ duplicates current picture
            end if
end function

For the situation that the previous picture type equals P (case prevpict == P), while the current type is P, the temporal reference is set and the current P-picture is transmitted, followed by the transmission of locally derived B-based repetition pictures, see Algorithm 8. These B-pictures effectively repeat the previous P-picture. For the situation that the previous picture type equals B (case prevpict == B), the temporal reference is set and the current P-picture is transmitted. In order to repeat the previous reference picture, which can either be an I-type or a P-type picture, we insert B-based repetition pictures, which enforces repetition of this last reference picture, see Algorithm 8.

Algorithm 7 reveals the three algorithmic steps for the situation that the current picture is a B-picture. When the previous picture type equals I (case prevpict == I), a test is conducted (if start navigation == True then) to determine whether the current B-picture can be removed; otherwise the current B-picture is employed.


Algorithm 8 P/B-repetition picture
function PB-REPETITION()
    Ar = Dp(M − 1) − (M − 1)    ⊲ calc. additional rep. pict.
    if (video format == progressive) then
        for i = 1 to Dp − 1 do
            insert P/B-picture    ⊲ P- or B-type repetition picture
            temporal reference++
        end for
    else
        for i = 1 to Dp − 1 do    ⊲ for B-picture, loop range is Dp − 1 + Ar
            insert P/B-field top picture    ⊲ P- or B-type repetition field
            insert P/B-field bottom picture    ⊲ P- or B-type repetition field
            temporal reference++
        end for
    end if
end function
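Note that Ar = Dp(M − 1) − (M − 1) = (Dp − 1)(M − 1); a quick check with example values of our own:

```python
def additional_repetitions(dp: int, m: int) -> int:
    """Ar of Algorithm 8: extra reference repetitions needed for
    interlaced video to keep the slow-motion speed exact."""
    return dp * (m - 1) - (m - 1)       # equals (dp - 1) * (m - 1)

print(additional_repetitions(2, 3))     # Dp = 2, M = 3 -> Ar = 2
print(additional_repetitions(4, 3))     # Dp = 4, M = 3 -> Ar = 6
```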

Algorithm 9 B-duplication picture
function B-DUPLICATION()
    transmit current picture
    temporal reference++
    if (video format == progressive) then
        for i = 1 to Dp − 2 do
            transmit current picture
            temporal reference++
        end for
    end if
end function

For the situation that the video format is progressive, this B-type picture is repeated Dp − 1 times, whereas for the interlaced situation this picture is employed once, see Algorithm 9. This repetition processing applies to the next two cases. For the situation that the previous picture type equals P (case prevpict == P), while the current type is B, the temporal reference is set and the current B-picture is transmitted. For the situation that the previous picture type equals B (case prevpict == B), a test is conducted (if start navigation == True then) to determine whether the current B-picture can be removed; otherwise the current B-picture is employed.

Algorithm 8 reveals the repetition-picture insertion for both P- and B-pictures. For the situation that the normal-play video sequence is intraframe-encoded, P-pictures are employed for picture repetition, whereas for MPEG-2 SP and MP video coding, B-pictures are employed for repetition. The employed repetition pictures follow the video format (progressive/interlaced) that they are based upon, where we use “Interlace kill” for interlaced video. Furthermore, for the interlaced video format, in order to avoid a video navigation speed-error, the reference pictures are additionally repeated according to the value of Ar.

Algorithm 9 reveals the B-picture duplication. For normal-play progressive video, B-pictures are repeated conforming to the slow-motion playback speed, whereas for interlaced video, each normal-play B-picture is only rendered once, avoiding the appearance of motion judder.

3.5 Conceptual plane-based MPEG-2-compliant video navigation

In this section, we provide a conceptual outline for solving MPEG-2-compliant plane-based video navigation over a network, which can be realized in two ways. Both solutions follow the video navigation concept of a thumbnail-based mosaic screen4, whereby the thumbnails are derived on the basis of temporal resampling of the normal-play video sequence. The first solution conducts all involved signal processing during video navigation, while the second solution is based on re-use of MPEG-2-compressed thumbnail-sized video information, derived during recording. For both solutions, a conceptual architecture is presented, including performance-estimation figures for key system resources, revealing the complexity of both solutions.

A. First concept: Spatial-domain construction of mosaic screen

In the first approach, a video navigation solution employing a mosaic screen is constructed on the basis of fast-search video information, derived from an MPEG-2-compressed normal-play video sequence. In such a framework, we use MPEG-2 decoding, scaling, MPEG-2 encoding and MPEG-2 packetizing, see Fig. 3.17(a). In this figure, at the left side, MPEG-compressed data is retrieved from the storage medium and passes through an MPEG-2 demux to access the video ES, containing the individually encoded I-pictures. In order to derive thumbnail-sized pictures, the video ES is fully decoded, resulting in decompressed I-pictures. These pictures are downscaled to thumbnail-sized images and employed to construct the final mosaic screen, which is MPEG-2 Simple Profile (SP) encoded.

4In this thesis, we have adopted the term mosaic screen as proposed earlier in literature by the author. However, due to the desired minimum size of the employed thumbnail, the term mosaic screen may be better replaced by the term tile-based screen.


Table 3.4 — Performance estimation on system resource utilization for mosaic-screen construction based on 16 thumbnail-sized images derived from SD video.

Concept                          Decoder  Scaler  Mosaic screen  Encoder  Memory capacity  DSP/CPU
                                 (Mb/s)   (Mb/s)  (Mb/s)         (Mb/s)   (kBytes)         cycle load
Spatial-domain construction      89       85      -              388      2,656            High
Compressed-domain construction   11       11      1.7            0.8      1,109            Low

Finally, this video navigation sequence is packetized, forming a new MPEG-compliant stream, which can be transmitted across a network to any type of client. In the previously described process, the mosaic screen is constructed in the spatial domain, which explains the term spatial-domain construction. Considering the involved signal processing associated with the concept for deriving a mosaic-screen video navigation signal enables us to derive a performance estimation and thus the computational load on the key system resources. The outcome of this estimation is summarized in Table 3.4. Video navigation on the basis of mosaic screens is a manually controlled navigation form, for which we deploy an update rate of 1 Hz to generate a new mosaic screen. The figures in Table 3.4 apply to generating a mosaic screen that is based on 16 thumbnail-sized images at an update rate of 1 Hz, derived from a 720 × 576 picture size (SD) and 4:2:0 sampling format for the video signal and I-only decoding. Using the previous figures, an image with 16 thumbnails is constructed from 16 SD images with the above resolution and sampling format. This results in an uncompressed video bit rate of 79.63 Mbit/s and involves a memory-capacity requirement of 608 kBytes per full-color frame (720 × 576 × 1.5 pixels). The bit rate for such an SD signal and the resulting mosaic screen at 25 Hz (full frame rate) equals 124.42 Mbit/s. The load on the system resources is further increased due to the involved MPEG-2 demultiplexing and multiplexing of the compressed video ES into an MPEG-2-compliant TS.
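The quoted figures can be verified with a quick arithmetic check (our own sketch):

```python
# Bit-rate check for the spatial-domain concept (4:2:0 SD frames).
W, H = 720, 576
bytes_per_frame = W * H * 1.5            # luma + two quarter-size chroma planes
print(bytes_per_frame / 1024)            # 607.5 -> ~608 kBytes per frame

mbit = bytes_per_frame * 8 / 1e6         # ~4.98 Mbit per frame
print(16 * mbit)                         # 16 decoded frames/s -> ~79.63 Mbit/s
print(25 * mbit)                         # 25-Hz mosaic output -> ~124.42 Mbit/s
```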


Figure 3.17 — Conceptual signal-processing flow for video navigation based on mosaic screens. (a) Construction of mosaic screen in the spatial domain. (b) Signal-processing concept for construction of mosaic screens in the MPEG-2-compressed domain using MPEG-2-compressed thumbnail-sized pictures.

B. Second concept: Compressed-domain construction of mosaic screen

In a second approach, a mosaic-screen navigation sequence is constructed in the MPEG-2-compressed domain on the basis of MPEG-2 subpictures. These subpictures are derived from an MPEG-2 normal-play video sequence invoking: MPEG-2 decoding, scaling, MPEG-2 encoding, subpicture storage, mosaic-screen composition and, finally, the generation of a rate-controlled video navigation sequence, see Fig. 3.17(b). In this figure, at the left side, MPEG-2-compressed data is received from a digital broadcast and passes through an MPEG-2 demux to access the video ES, containing the individually encoded I-pictures. In an MPEG-2 video broadcast, the distance between successive I-pictures equals the GOP length N, which is typically 12 pictures.

For European broadcasting, this results in two I-pictures per second, which significantly reduces the involved throughput rate. In order to derive subpictures of the desired size, the video ES is fully decoded, leading to decompressed I-pictures. These pictures are then downscaled, MPEG-2 encoded and stored on the Hard Disk Drive (HDD). The mosaic screen is now constructed in the MPEG-2-compressed domain, while the final video navigation sequence is rate-controlled on the basis of repetition pictures. This navigation sequence is packetized, forming a new MPEG-2-compliant TS, which can be transmitted across a network to any type of client.

Taking into account the involved signal processing associated with the concept for deriving a mosaic-screen video navigation allows us to estimate the computational load and throughput rate based on the key system resources. The outcome of this estimation is summarized in the same table as the spatial-domain solution, see Table 3.4. Let us apply a similar reasoning as in the first case to quantify some system requirements. Again, we apply an update rate of 1 Hz for generating a new mosaic screen, while employing 16 MPEG-2-compressed subpictures per mosaic screen. The subpictures are derived from a 2-Hz SD video sequence with a resolution of 720 × 576 pixels and 4:2:0 color sampling format. This yields an uncompressed video bit rate of 9.95 Mbit/s and involves a memory-capacity requirement of 608 kBytes per full-color frame. For a derived subpicture size of 180 × 144 pixels and 4:2:0 sampling format, the uncompressed subpicture bit rate equals 0.62 Mbit/s and involves a memory capacity of 38 kBytes per full-color frame. After MPEG-2 compression with a factor three, the 16 compressed subpictures constructing a single mosaic screen result in a bit rate of 1.65 Mbit/s. Similar to the spatial-domain situation, the load on system resources is further increased when including the involved MPEG-2 demultiplexing and multiplexing of the compressed video ES into an MPEG-2-compliant TS.

When comparing the two proposed concepts with respect to their system requirements, it becomes clear that the second navigation concept, based on mosaic screens with compressed subpictures, results in a significantly lower throughput rate than the first concept. Moreover, the second navigation concept has a more balanced bandwidth consumption, so that a high bandwidth peak is avoided. For these reasons, we adopt the second system approach.

3.6 Networked hierarchical mosaic-screen navigation

In this section, we propose an algorithm suitable for performing manual client-based video browsing over a long time interval on the basis of hierarchical mosaic screens.


Figure 3.18 — Layered mosaic screens constructing hierarchical navigation layers.

Prior to commencing with the hierarchical navigation algorithm, we introduce the concept of hierarchical mosaic screens, followed by essential system aspects involved with the derivation of subpictures and the final generation of the mosaic screen on the basis of re-used MPEG-2-compressed subpictures.

3.6.1 Concept of hierarchical mosaic-screen navigation

In the previous section, we have elaborated on the derivation of subpictures constructing a mosaic screen. This section concentrates on employing such screens in a hierarchical manner, in order to provide efficient video navigation, which offers an instantaneous video summarization at different temporal scales. Figure 3.18 shows the concept of hierarchical video navigation on the basis of layered mosaic screens. The video navigation summarization strength is determined by the subpictures constructing the individual hierarchical mosaic screens, more particularly the differences between the employed subpictures. The instantaneous overview of the base layer depends on the value ∆t1, while for the higher layer this depends on ∆t2. An actual value for both ∆t1 and ∆t2 can either be constant or variable. When ∆t2 ≫ ∆t1, the summarization time interval of a higher navigation layer is significantly increased. The difference between successive subpictures of a mosaic screen may be based on the temporal content changes between successive scenes in a video program. This is due to the fact that a program director constructs a program from a set of different scenes.

One of the construction rules is that the director uses a certain amount of scene changes per time interval to create, e.g., an atmosphere of liveliness, continuous action and storytelling. This means that every scene typically has an average duration. When knowing this average scene duration, a subpicture from each scene can be obtained by equidistant temporal subsampling. The mosaic screen is then constructed from these sampled subpictures.

A good starting point for further discussion is the temporal correlation of a regular video sequence. The temporal correlation heavily depends on the nature of the video material, resulting in typical scene durations of 3–5 seconds. This observation forms the basis for the selection of the normal-play I-pictures, from which the subpictures are derived to construct the base-layer mosaic screens. Furthermore, on the basis of this typical scene duration, we can derive the number of hierarchical navigation layers. When using an average of 3 seconds for scene duration, a mosaic screen consisting of 16 subpictures presents an instantaneous video-information overview of 48 seconds of normal-play video information. Navigating through hierarchical mosaic screens corresponds to a temporal zoom-in/zoom-out operation. For this type of navigation, it is attractive to employ three navigation layers, offering an instantaneous overview varying from 1 minute for the base layer, up to a quarter of an hour for the middle navigation layer and up to a few hours for the third navigation layer. Evidently, this approach requires a subpicture selection algorithm revealing an appropriate summary of the video program and its constituting scenes. Mosaic-screen construction based on re-used MPEG-2 subpictures involves a two-step approach. The first step involves the selection of subpictures composing the mosaic screen, while the second step contains signal processing yielding an MPEG-2-compliant video navigation sequence.

The subpicture selection part of the algorithm will be discussed in Section 3.6.3. In the next section, we elaborate on system aspects related to subpicture derivation, mosaic-screen construction based on re-use of MPEG-2 subpictures and viewer feedback information supporting the navigation with mosaic screens.

3.6.2 System aspects of mosaic-screen navigation algorithm

Section 3.5 has proposed a cost-efficient plane-based video navigation concept, applicable to a networked environment. The concept re-uses subpictures derived from normal-play MPEG-compressed I-pictures in combination with predictive-coded repetition pictures, as discussed in Section 3.4.1, to provide the required rate control, resulting in an MPEG-compliant navigation sequence. Our desire to re-use MPEG-compressed subpictures to construct a mosaic screen imposes specific signal processing and conditions that need to be aligned with the MPEG-2 syntax and/or compliance rules. The concept of a mosaic screen constructed on the fly from re-used MPEG-2 pictures, see Fig. 3.19, enforces constraints regarding the subpicture coding. This implies the following aspects.

• Scalable MPEG-2 intraframe video decoding. From the normal-play video sequence, subpictures are derived using the intraframe-compressed pictures. As subpictures have a lower resolution than normal-play pictures, scalable decoding in the DCT domain is beneficial and is therefore pursued, thereby avoiding an additional video-scaling operation at the pixel level after decoding.

• MPEG-2 coding of subpictures. The MPEG-2 subpictures are based on a coding method using so-called “mini-slices”. This enables re-use of the MPEG-compressed video information for constructing a mosaic screen without transcoding. Furthermore, similarity between successive subpictures allows to perform predictive coding on those subpictures. This enables on-the-fly construction of mosaic screens.

• Mosaic-screen construction in the MPEG-2-compressed domain. When constructing mosaic screens in the compressed domain on the basis of subpictures, a number of measures have to be taken to generate an MPEG-compliant navigation sequence in the form of a mosaic screen. These measures involve a.o. mini-slice adaptation, MPEG-header generation and rate control.

• On Screen Display. On-screen display in a networked environment requires transcoding in order to insert video-overlay information, such as a progress indicator for navigation. Navigation within one mosaic screen or between several screens requires a visual starting point from which further navigation is conducted. This starting point is indicated by means of a bounding box around the considered subpicture.

Figure 3.19 — Mosaic screen signal processing chain.


Figure 3.20 — Main building blocks of the scalable MPEG-2 intraframe decoder.

Let us now further elaborate on each of the above-mentioned aspects.

A. Scalable MPEG-2 intraframe video decoding

The plane-based video navigation concept discussed in Section 3.5 involves an intraframe decoding operation, followed by a scaling operation to derive the final subpictures at their appropriate size. As these two operations occur in consecutive order, it is beneficial to combine them into a single computational step, known as (complexity) scalable decoding [88], [91]. This type of scalable decoding is based on selecting available, already-coded coefficient information and performing the decoding solely on the selected information. This reduces the amount of computations at the decoder, thereby limiting the utilization of system resources. This approach avoids the calculation of intermediate signals at a resolution larger than the target resolution, so that the resulting throughput rate is reduced. For example, let us consider the construction of a mosaic screen in the compressed domain using 16 subpictures and SD MPEG-2 video decoding. Scalable decoding would require a throughput bandwidth of 1.8 Mb/s, instead of 11 Mb/s in the non-scalable way. Moreover, the required bandwidth for video scaling would become zero. Consequently, the throughput rate for MPEG-2 decoding is reduced by approximately a factor of 6, compared to the values depicted in Table 3.4, without any further scaling bandwidth.

Let us now detail the algorithm of a scalable MPEG-2 decoder. Since the derived subpictures originate from intraframe-compressed pictures, the scalable decoder is based on the conventional intraframe decoding steps, depicted in Fig. 3.20. Spatial subsampling in the MPEG-compressed domain is achieved by modification of the 2D-IDCT. Figure 3.21 indicates the maximal set of received MPEG-2 DCT coefficients, from which a selection is made for applying the scalable MPEG-2 IDCT. The scalable 2D-IDCT uses only 4 coefficients, resulting in a 2 × 2 pixel block. The transform of 2 × 2 coefficients converts a Standard-Definition (SD) picture into a Quarter Common Intermediate Format (QCIF) picture (176 × 144 pixels). Equation (3.6) presents the regular MPEG-2 two-dimensional IDCT, which is specified by


$$f(x,y)=\frac{2}{N}\sum_{u=0}^{N-1}\sum_{v=0}^{N-1} C(u)\,C(v)\,F(u,v)\,\cos\Big(\frac{(2x+1)\pi u}{2N}\Big)\cos\Big(\frac{(2y+1)\pi v}{2N}\Big), \qquad (3.6)$$

$$C(u),\,C(v)=\begin{cases}\dfrac{1}{\sqrt{2}} & \text{for } u,v=0,\\[4pt] 1 & \text{otherwise,}\end{cases}$$

for x, y, u, v = 0, 1, 2, ..., N − 1, where x, y are the pixel coordinates in the picture domain, u, v are coordinates in the transform domain and F(u,v) are the received DCT coefficients. A standard MPEG-2 decoder employs 8 × 8 blocks (N = 8), while for the scalable MPEG-2 SD-to-QCIF decoder this value becomes N = 2. For a scalable MPEG-2 SD-to-QCIF decoder, the computation of Equation (3.6) is visualized in a diagram in Figure 3.22 for N = 2. Figure 3.22 indicates two different implementations for computing the scalable 2D-IDCT. The first implementation (a) is a direct translation of Equation (3.6), whereas the second implementation (b) is a simplified version, in which multiplications have been removed and replaced by shift operations. In this case, the normalization of the coefficients is modified to 1/8, so that shifting is enabled. For intraframe decoding, this introduces a small pixel error, which is not noticeable. The result of this scalable decoding operation, as discussed above, is a subpicture extracted from the intraframe picture (see Fig. 3.19) that is decoded in a scalable fashion, leading to the intended subpicture format.
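For illustration, a direct (unoptimized) implementation of Equation (3.6) for N = 2 could look as follows; the function and coefficient names are ours:

```python
import math

def idct2x2(F):
    """Direct 2x2 IDCT (Equation (3.6) with N = 2) applied to the four
    lowest-frequency DCT coefficients F[u][v] of a block."""
    N = 2
    C = lambda k: 1 / math.sqrt(2) if k == 0 else 1.0
    f = [[0.0, 0.0], [0.0, 0.0]]
    for x in range(N):
        for y in range(N):
            s = 0.0
            for u in range(N):
                for v in range(N):
                    s += (C(u) * C(v) * F[u][v]
                          * math.cos((2 * x + 1) * math.pi * u / (2 * N))
                          * math.cos((2 * y + 1) * math.pi * v / (2 * N)))
            f[x][y] = (2 / N) * s
    return f

# For N = 2 the kernel collapses to signed sums, f = 1/2 (F00 +- F01 +- F10 +- F11),
# which is why the multiplications can be replaced by additions and a shift
# (implementation (b) in Fig. 3.22).
print(idct2x2([[4, 2], [2, 0]]))   # approximately [[4, 2], [2, 0]]
```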

Let us now derive the computational speed-up of our proposed scalable 2D-IDCT. The two 2 × 2 IDCT implementations depicted in Fig. 3.22 were benchmarked on an ARM926 processor running at 160 MHz.

Figure 3.21 — Scalable 2 × 2 IDCT for an SD-to-QCIF MPEG-2 decoder. At the left side, the received coefficients (0...63) are depicted. At the right side, a 2 × 2 pixel block is depicted, obtained by applying a 2 × 2 scalable IDCT.


The influence of bus latency is eliminated by continuously executing the test routine on the same data set. The test routine containing the IDCT is invoked 9,720 times, which is equivalent to the maximum number of DCT blocks of an MPEG-2 ML intraframe picture with a 4:2:0 color sampling format. For a scalable decoder using multiplications, the execution time is 14.8 ms and consumes 2.32 Mcycles. The simplified IDCT using shift operations requires 8.5 ms and 1.36 Mcycles. This simplified algorithm almost halves the computational complexity. We conclude that for a European television system employing a 40-ms frame period, the proposed simplified 2 × 2 IDCT consumes only 3.4 % of the total frame period.

B. MPEG-2 coding of subpictures

In order to encode a subpicture in MPEG-2, the subpicture is divided into slices, which are composed of a consecutive set of macroblocks, as discussed in Section 2.2.1. The division of a subpicture into slices and macroblocks has to be considered when constructing a mosaic screen in the MPEG-2-compressed domain. The subpicture size is chosen such that it corresponds to an integer number of macroblocks, both horizontally and vertically. According to MPEG-2 coding, a horizontal row of macroblocks forms a slice. Limiting the length of a slice to the width of a subpicture creates a so-called “mini-slice” with respect to the mosaic screen. For example, with four subpictures in the horizontal direction of a mosaic screen, we create four mini-slices horizontally in the mosaic screen (see Figure 3.23). The MPEG-2 syntax to code a mini-slice depends on the employed picture coding type for encoding a mosaic screen. MPEG-2 video coding allows either intraframe coding, uni-directional predictive coding or a combination of these two coding methods. Let us now discuss the MPEG-2 coding aspects of our proposal.

Figure 3.22 — Scalable 2 × 2 IDCT for an SD-to-QCIF MPEG-2 decoder. (a) Standard IDCT implementation. (b) Optimized IDCT implementation removing the coefficient multiplications.


Figure 3.23 — Construction of a mosaic screen using subpictures composed of mini-slices.

The objective is to encode the subpictures facilitating the mosaic-screen application, while avoiding an expensive implementation. This is achieved by selecting a proper picture coding type in combination with an elegant mosaic-screen composition. The usage of a predictive picture type for compression allows motion-compensation techniques at a low computation cost, thereby reducing the computation cycles at the applied processor. Since each subpicture is a frame-based selection from an existing video sequence, each subpicture is independently coded, resulting in intraframe-coded macroblocks and thus intra-coded mini-slices. We have employed MPEG-2 compression for the mini-slices, leading to MPEG-2 subpictures. Although each mini-slice is composed of intraframe-encoded macroblocks, an MPEG-2 subpicture, and thus a mosaic screen, can be of various picture types. MPEG-2 offers three picture types (I, P and B). A straightforward choice would be to select the I-type picture. However, this requires that each mosaic screen is fully re-composed and decoded for each screen, even if a significant similarity exists between two time-succeeding mosaic screens. Since this processing is typically implemented in software, aiming at execution on a small co-processor, we avoid expensive computations at this processor and thus re-use already available information at the decoder side.

As a result, we propose to use the syntax of a P-type picture for encoding the mini-slices, thereby exploiting forward motion compensation. Furthermore, we propose to encode each mini-slice at a fixed bit cost within the MPEG-2 framework. The usage of predictive coding enables re-use of the already decoded information from a previous screen to construct the actual mosaic screen. The correct filling of a motion-compensated macroblock is obtained by defining a motion vector that refers to the desired information.


Figure 3.24 — Spatial location ordering of subpictures in a mosaic screen on the basis of intra- or inter-coded subpictures.

Hence, the MPEG-2 decoder performs motion compensation for those areas that already exist, with a zero difference signal (skip macroblock), and adds missing subpictures via intraframe decoding.

The fact that all elements of a mosaic screen are MPEG-2-compliant, e.g. mini-slices and subpictures, allows for MPEG-2 re-encoding of the constructed mosaic screen. This would enable browsing through neighboring mosaic screens, which are constructed on the fly. The use of P-coded mini-slices enables smooth scrolling operations that can easily be facilitated by a computing core in conjunction with the available MPEG-2 decoding. This novel extension will be elaborated at the end of this chapter.

C. Mosaic-screen construction in the MPEG-2-compressed domain

Mosaic-screen construction based on re-used MPEG-2-compressed subpictures, in accordance with the method described in Section 3.6.1, requires adaptation of the full-screen MPEG-2-compressed data, in order to aid the decoder in understanding the format of a mosaic screen consisting of subpictures constructed from mini-slices. We summarize the most important adaptations in the sequel.

• Automatic spatial positioning of subpictures. The mini-slices constructing the subpictures have to be correctly positioned and aligned to form the final subpicture-based mosaic screen. Therefore, we modify the slice headers of the mini-slices, in particular the slice start code, which indicates the new vertical position in the mosaic screen. Similarly, the horizontal position is adapted by modification of the macroblock address increment. Both parameters are part of the MPEG-2 video standard; a small positioning sketch is given after this list.


• Initialization of an MPEG video decoder. An MPEG video decoder requires initialization to properly decode a “mini-slice”-based video sequence. This initialization information is contained in a set of headers preceding the coded mini-slices constructing a mosaic screen. In our case, for the construction of a mosaic screen, we employ two headers: the sequence and picture header and their corresponding extensions. This means that we construct an MPEG-compliant video sequence containing the mosaic screen.

• MPEG-compliant rate control. In order to generate a decodable video sequence, i.e. an MPEG-compressed video navigation sequence, repetition pictures are employed, as discussed in Section 3.4.1. The resulting bit stream should be MPEG-compliant, requiring also rate control. We adopt the same algorithm from fast-forward navigation for the creation of our mosaic-screen navigation sequence. There are three differences: (1) the mosaic screen is always made from progressive video, so that extensions for interlacing can be skipped; (2) the mosaic screen is coded in P-pictures instead of I-pictures; (3) the number of repetition pictures required to create sufficient transmission time for communicating a mosaic screen is not generated during recording, but is selected as a fixed setting, depending on the desired quality of the mosaic screen.
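To illustrate the spatial-positioning adaptation from the first bullet above, the following sketch (our own simplification, not bit-exact MPEG-2 syntax handling; we assume the vertical macroblock row is carried in the last byte of the slice start code and the horizontal offset in the first macroblock address increment) computes the new locator information for a relocated mini-slice:

```python
def relocate_mini_slice(grid_row, grid_col, slice_in_subpict,
                        subpict_mb_height, subpict_mb_width):
    """New locator information for a mini-slice placed at mosaic grid
    position (grid_row, grid_col).

    vertical_position : 1-based macroblock row of the mini-slice, assumed
                        to be signalled via the slice start code
    mb_addr_increment : assumed to skip the macroblocks of the subpictures
                        to the left of this one in the same mosaic row
    """
    vertical_position = grid_row * subpict_mb_height + slice_in_subpict + 1
    mb_addr_increment = grid_col * subpict_mb_width + 1
    return vertical_position, mb_addr_increment

# Example: 4x4 mosaic of 11x9-macroblock subpictures; mini-slice 0 of the
# subpicture at grid position (1, 2) lands at macroblock row 10, after
# skipping 22 macroblocks horizontally.
print(relocate_mini_slice(1, 2, 0, 9, 11))   # (10, 23)
```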

D. On Screen Display

Navigation through a set of mosaic screens involves optical feedback to the user. Typically, such optical feedback is in the form of a bounding box surrounding a subpicture, indicating the actual navigation position. The visualization of this bounding box is called On Screen Display (OSD). In a network-based environment, there are three approaches to conduct OSD. In dual-channel video, the OSD information is transmitted as a second video information signal, which is mixed at the client side. This requires a dual-channel video decoder, which is not standardized. In a second approach, the MPEG-2-compressed subpicture is transcoded, involving full decoding, video mixing at pixel level and re-encoding of the mixed video signal. This is too expensive in terms of computations. The third and preferred option, which is also adopted, is to generate each subpicture in two formats: one with a bounding box and one without a bounding box. Hence, all subpictures are recorded in both formats as metadata. During navigation, the appropriate selection of subpictures is made. This solution doubles the involved storage requirements, but avoids the computational load during navigation. However, the encoding load during recording is not doubled, by elegantly sharing the compressed information of the inner part of the compressed subpicture between the two different picture formats.


3.6.3 Algorithm for MPEG-2-compliant mosaic-screen navigation

This section presents our proposed algorithm for generating a navigation sequence based on mosaic screens, re-using MPEG-2-compressed subpictures. The generation of a navigation sequence based on mosaic screens follows a two-step approach, involving signal processing during (1) recording and (2) video navigation. The algorithm is based on the chosen system aspects discussed in Section 3.6.2 and employs repetition pictures, as discussed in Section 3.4.1, to realize the final video navigation sequence.

A. Subpicture selection algorithm for hierarchical mosaic-screen navigation

In this section, we propose the subpicture selection algorithm for constructing the hierarchical mosaic screens. We propose the usage of three hierarchical levels to summarize normal-play fragments, employed at different time scales. Pictures constructing a digital video sequence can be denoted by fn(i,j), for 0 ≤ j ≤ W − 1 and 0 ≤ i ≤ H − 1, where n is the frame index, running from 0, 1, 2,... till the end of the normal-play sequence. The parameter W represents the picture width and H denotes the picture height. The subpictures of the base layer are derived from the sequence fn(i,j) by applying a subsampling of frame index n, where the subsampling factor corresponds to the average scene duration. This factor is explained below and differs for the various video navigation layers. Let us assume that n corresponds to the time nTf, where Tf is the frame time, which e.g. equals 40 ms (1/25 sec.) for the European case. Our approach is to apply a selective subsampling within the sequence of frames. The subsample factor for doing this is defined by parameter ks, which equals the speed-up factor for fast-search video navigation. Using the observation that a video scene on average changes every three seconds and one subpicture of each scene is desired, the base-layer subsampling factor ks amounts to ks = 3 × 25 = 75. Equation (3.7) indicates the frame index k1 that selects the frames from the normal-play sequence fn(i,j). These selected images form the base layer for the mosaic screen. The selected frame indexes are specified by

k1 = nks for ks = 75, n = 0, 1, 2,...nmax. (3.7)

This parameter selection results in 1,200 subpictures per hour of normal-play video, leading to 75 mosaic screens containing 16 subpictures at the base layer, each summarizing 48 seconds of normal-play video. A typical two-hour movie is summarized in this way by 150 mosaic screens. With these numbers, interpretation of the individual mosaic screens becomes a tedious task, especially when hours of video material are involved. This situation is avoided when applying hierarchical layering of mosaic screens.


The hierarchical structure allows efficient high-speed browsing through a complete movie and enables a quick identification of the sequence of interest. Our hierarchical navigation proposal employs three layers. The relation between the hierarchical layers is chosen such that, when descending in the hierarchy, there is a well-defined relation between the subpictures of the higher layer and the mosaic screens of the adjacent lower layer. Therefore, any higher-layer mosaic screen is derived from the base-layer subpictures, using a subsampling factor that is an integer multiple of the base-layer subsampling factor. In this way, only one subpicture of the lower-layer mosaic screen appears in the next adjacent higher-layer mosaic screen. Equation (3.8) indicates the frame index for the subpicture selection process of the second hierarchical navigation layer, which is specified by

k2 = nksS for ks = 75,S = 16, n = 0, 1, 2,...nmax. (3.8)

In this expression, S is the number of subpictures per mosaic screen and k2 the frame index of the second hierarchical navigation layer. Likewise, Equation (3.9) indicates the frame index for the subpicture selection process of the third hierarchical navigation layer, resulting in

k3 = nksS² for ks = 75, S = 16, n = 0, 1, 2,...nmax. (3.9)

Here, k3 is the frame index for the third hierarchical navigation layer. The final subpicture selection for layer l now becomes such that from the original picture sequence fn(i,j), the following frames fkl(i,j) for l = 1, 2, 3 are selected:

l = 1: base layer fk1(i,j),
l = 2: second layer fk2(i,j),    (3.10)
l = 3: third layer fk3(i,j),

where the picture indexes k1, k2, k3 are specified as in Equations (3.7), (3.8) and (3.9). The above picture subsampling grid has omitted one system aspect that we consider now for refinement. The subpictures constructing a mosaic screen are derived from normal-play I-type compressed pictures, which occur at a repetition rate of the GOP size N. Consequently, in order to properly select the I-pictures, we redefine the base-layer subsampling factor ks to be an integer multiple of the GOP length N, while incorporating the average scene duration. With an average scene duration of 3 seconds and a typical GOP length N = 12, the value becomes ks = 72. Figure 3.25 visualizes the normal-play summary per mosaic screen for each hierarchical layer and ks = 72.
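A compact sketch of the three-layer index selection of Equations (3.7)–(3.9) (our own helper; the GOP-aligned value ks = 72 is used as default):

```python
def subpicture_indices(layer: int, n_max: int, ks: int = 72, S: int = 16):
    """Frame indices k_l = n * ks * S**(layer - 1) selected for navigation
    layer l (1 = base), following Equations (3.7)-(3.9)."""
    step = ks * S ** (layer - 1)
    return range(0, n_max + 1, step)

# For a two-hour movie at 25 Hz (180,000 frames):
n_max = 2 * 3600 * 25
for layer in (1, 2, 3):
    idx = list(subpicture_indices(layer, n_max))
    print(layer, len(idx), idx[:3])   # layer, nr. of subpictures, first indices
```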

B. Signal processing during recording

In order to support mosaic-screen video navigation, specific extra signal processing during recording is conducted, as proposed in Section 3.5.


Figure 3.25 — Hierarchical navigation layers, their normal-play abstract duration and absolute normal-play picture index for mosaic screens, using 16 subpictures and a subsample factor of 72 for the base layer. For level three (a), this results in an abstract interval of 196.6 minutes; for level two (b), 12.2 minutes; and for level one (c), 0.8 minute per mosaic screen.

As this video navigation form co-exists next to the “conventional” navigation method discussed in Section 3.4, it is beneficial to re-use the video navigation framework presented in Section 3.4.3. The subpictures constructing a mosaic screen are derived from normal-play intraframe-coded pictures. Therefore, the CPI signal processing conducted during recording, which involves bit-stream parsing, is extended with an MPEG-2 scalable decoder and MPEG-2 encoder. For each derived subpicture, two subpicture versions, one with a bounding box and one without, are stored as CPI data in the metadata base. Furthermore, the number of repetition pictures required to transmit a mosaic screen is calculated, which depends on the chosen subpicture bit cost.

C. Signal processing during video navigation

Figure 3.26 shows the flowchart for generating mosaic screens. The mosaic-screen construction starts with determining the applied hierarchical navigation layer, see the top of Fig. 3.26. Depending on the selected hierarchical navigation layer, either Equation (3.7), (3.8) or (3.9) is employed for calculating the selected subpictures. For the situation that one or more subpictures are available as reference data in the MPEG-2 decoder, re-use is possible (re-use == True ?). Slices constructing the re-used subpictures are calculated on the basis of motion-compensated macroblocks. Subpictures that cannot be predicted are retrieved from the metadata base on the basis of either Equation (3.7), (3.8) or (3.9). Hereby, the calculated values k1, k2 or k3 form the CPI entry points, relative to the current normal-play playback position.


Algorithm 10 Mosaic-screen construction using MPEG-coded subpictures
Require: MPEG-compressed subpictures of fixed bit cost
Ensure: MPEG-compliant mosaic screen
Initialize: nr mini slices, nr hor subpict, nr vert subpict
generate sequence header    ⊲ seq. header for nav. sequence
generate sequence extension    ⊲ seq. ext. header for nav. sequence
generate picture header    ⊲ P-picture picture header
generate picture extension    ⊲ P-picture picture coding header
⊲ generate MPEG-compliant mosaic screen
for v = 1 to nr vert subpict do    ⊲ vertical subpicture loop
    for i = 1 to nr mini slices do    ⊲ nr. mini-slices constructing subpicture
        for h = 1 to nr hor subpict do    ⊲ horizontal subpicture loop
            select mini slice i from subpicture v ∗ 4 + h    ⊲ select mini-slice
            adapt locator information    ⊲ adjust mini-slice locator information
            append mini-slice to bit stream    ⊲ concatenate adjusted mini-slices
        end for
    end for
end for
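A Python rendering of the interleaving loops in Algorithm 10 (a sketch; subpictures are modeled as lists of pre-encoded mini-slice byte strings, and the locator adaptation is omitted):

```python
def build_mosaic(subpictures, nr_hor=4, nr_vert=4):
    """Interleave the mini-slices of a grid of subpictures into one
    bitstream, mini-slice row by mini-slice row, as in Algorithm 10."""
    nr_mini_slices = len(subpictures[0])
    bitstream = bytearray()
    for v in range(nr_vert):                  # vertical subpicture loop
        for i in range(nr_mini_slices):       # mini-slice row inside subpicture
            for h in range(nr_hor):           # horizontal subpicture loop
                slice_bytes = subpictures[v * nr_hor + h][i]
                # (locator adaptation of the slice header omitted here)
                bitstream += slice_bytes
    return bytes(bitstream)

# Toy example: 16 subpictures of 2 mini-slices each.
subs = [[f"s{k}m{i}".encode() for i in range(2)] for k in range(16)]
print(build_mosaic(subs)[:24])
```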

On the basis of solely intra-coded, or intra- and inter-coded subpictures, a preliminary mosaic screen is constructed. In order to construct an MPEG-compliant mosaic screen (Generate final mosaic screen), a minimum set of MPEG-2 video headers is calculated, followed by adaptation of the slice start code and macroblock address increment location information of the mini-slices constructing this preliminary mosaic screen. We refer to Algorithm 10 for detailed steps. This “still image” is converted into a video navigation sequence by adding frame-based repetition pictures, as indicated in the flowchart of Fig. 3.27. This repetition process equals the repetition method deployed for fast-search video navigation, as discussed in Section 3.4.3. Figure 3.26 should be considered as an insert for the block (Generate mosaic-screen) in Fig. 3.27. The mosaic-screen sequence generation re-uses parts of the fast-search video sequence generation, as discussed in Section 3.4.3 and depicted at the right-hand side of Fig. 3.27. Due to the fact that a mosaic screen is based on progressive subpictures, the interlaced-related processing is absent. Also absent is the speed-error determination, as the video navigation is controlled by the user. The number of required repetition pictures is calculated during recording and depends on the mosaic-screen bit cost, which is provided by the CPI. Finally, the repetition-picture calculation equals that of the fast-search progressive video situation, as depicted in Fig. 3.13. This flowchart should again be considered as an insert to the block Generate repetition picture in Fig. 3.27.


3.7 Experiments and validation results of both navigation concepts

In this section, we propose a PVR block diagram, suitable for implementing the video navigation solutions discussed in Section 3.4 (fast forward/reverse and slow motion) and Section 3.6 (hierarchical navigation). Furthermore, we present experimental and validation results on system aspects of our proposed video navigation solutions.

Figure 3.26 — Flowchart of mosaic-screen generation on the basis of re-used MPEG-compressed subpictures.


3.7.1 Functional block diagram of Personal Video Recorder (PVR)

For both solutions, we assume a common architecture of a Personal Video Recorder (PVR), based on a Hard Disk Drive (HDD) capable of storing an MPEG-2 Transport Stream (TS). This system is equipped with functionality supporting full-frame and mosaic-screen-based video navigation. Although both video navigation forms presented in this chapter can be implemented within the boundaries of this PVR block diagram, only fast-search video navigation has been fully implemented and experimentally validated. The validation based on measurements for full-frame slow-motion video navigation could not be completed due to a project abortion. We therefore provide a validation of slow-motion navigation, based on realistic performance estimates, which are derived from the measured fast-search navigation results. Let us first discuss the functional block diagram required for both navigation operations.

Figure 3.27 — Flowchart for generating mosaic-screen compliant video sequence.


Figure 3.28 portrays a functional block diagram of a networked PVR, where the control part has been omitted. The video navigation processing is separated into two parts: (1) recording and (2) playback. At the left side of the diagram, the compressed MPEG-2 TS enters the storage system. This signal can be viewed in real time, via signal path (a), or in time-shift playback mode via signal path (b). Video navigation playback is available via signal path (c). At the switch output (at the right side), the TS leaves the PVR system and is supplied to a networked client employing MPEG-2 decoding. During recording, at the left-hand side of Fig. 3.28, Characteristic Point Information (CPI) is generated, extracting parameters as discussed in Section 3.4, which are stored in the “metadata base”. Video navigation playback involves three functional blocks. The first block is “Video navigation processing”, which conducts the specific video navigation playback processing depending on the selected video navigation form, as depicted in Fig. 3.29. For fast-search or slow-motion playback, the “Video navigation processing” block operates on the provided normal-play TS and associated CPI. Therefore, this block requests the second block “Read-list” for playback information signals.

Figure 3.28 — Functional block diagram of an HDD-based PVR. The Switch block selecting signal path (a) indicates the real-time viewing mode; when selecting (b), it operates in time-shift playback; and when selecting (c), in the video-navigation operation mode.

The normal-play TS information is retrieved by the third block “From-disk” on the basis of CPI storage-locator information, provided by the “Read-list” block. This normal-play TS and the associated CPI are provided to the “Video navigation processing” block and utilized to derive either a fast-search or slow-motion video navigation signal. Note that the read CPI value may be modified in order to influence the video navigation playback. For mosaic-screen video navigation, the “Video navigation processing” block constructs the mosaic screen. Therefore, subpictures are requested from the “Read-list” block, which are retrieved from the “metadata base”, as well as the corresponding CPI data.

We continue by elaborating on the involved navigation signal processing related to the block diagram. Figure 3.29 depicts the processing during video navigation conducted by the “Video navigation processing” block. In order to derive a fast-search or slow-motion video navigation sequence, the normal-play TS is demultiplexed, resulting in a video ES, which is parsed to locate the individual pictures.

Figure 3.29 — Signal processing flow of video navigation processing for fast-search, slow-motion and hierarchical mosaic-screen video navigation.


For the fast-search playback mode, the “Temporal resampling” block provides the I-pictures to the “MPEG-2-compliant rate control” block. This block adds repetition pictures, depending on the provided CPI information. The repetition pictures depend on the video properties determined by the “MPEG-2 ES parsing” block, or on a CPI-modified version to enforce a particular playback. Finally, the video navigation ES is multiplexed into a video navigation TS. For slow-motion, the full TS is demultiplexed and the video ES is parsed, determining the properties of each normal-play picture. On the basis of these picture properties, the “MPEG-2-compliant rate control” block inserts either P- or B-type compressed repetition pictures or repeats normal-play B-pictures, after which the obtained slow-motion video navigation sequence is multiplexed into a video navigation TS. The “Mosaic-screen construction” block creates a mosaic screen on the basis of predictive-coded subpictures, in combination with inter-coded or intra-coded-only subpictures. This mosaic-screen bit stream is extended with P-type repetition pictures, creating the final video navigation sequence, which is multiplexed into the final video navigation TS.

3.7.2 Performance measurement of implemented fast-search navigation

A full software-based PVR system as described in Section 3.7.1 and equipped with fast-search video navigation as presented in Section 3.4.3, has been realized. The implementation is executed on a mediaprocessor DSP (commercially available as part of the PNX8525 set-top box chip of Philips Semiconductors), featuring a VLIW architecture with five issue slots and a clock frequency of 220 MHz. To analyze the DSP load, a 6 Mbit/s MPEG-2 TS is used as an input. This stream contains both compressed audio and video signals, where the video is coded with a GOP length of N = 12. Two use cases are explored.

• Case 1. This case is based on fast-search navigation, with a speed-up factor Ps = 12 and a picture refresh-rate of 25 Hz, which equals the nominal European television frame rate.

• Case 2. This case refers to fast-search navigation with a speed-up factor Ps = 4. This factor results in a picture refresh-rate of 8.33 Hz (see the relation below).
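Both cases are consistent with a simple relation, stated here as our shorthand implied by the GOP structure: since one I-picture occurs per N frames, emulating a speed-up factor Ps at frame rate f requires an I-picture refresh-rate of

\[ f_{\mathrm{refresh}} = \frac{P_s \, f}{N}, \]

giving 12 · 25/12 = 25 Hz for Case 1 and 4 · 25/12 ≈ 8.33 Hz for Case 2.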


Table 3.5 — Measured DSP clock-frequency fraction required for fast-search navigation (normal-play GOP length N = 12). The measured load is a fraction of the total execution clock frequency, measured with a dedicated tool and rounded to an integer number.

Speed-up factor | Picture refresh-rate (Hz) | DSP load (MHz) | DSP load (%)
12 | 25 | 33 | 15.0
4 | 8.33 | 12 | 5.5

3.7.3 Analysis of the fast-search measurement results

Table 3.5 indicates the measured DSP load for both use cases. In Table 3.5, the DSP load is expressed in terms of the required clock frequency (MHz) for real-time execution, which also includes the execution parallelism factor in the mediaprocessor DSP. Row 1 indicates the situation where every normal-play I-picture is retrieved from disk and used for constructing the fast-search navigation sequence, with a nominal refresh-rate. The measured DSP load indicates the video navigation signal processing involving demultiplexing, ES parsing, MPEG-compliant rate control and final TS multiplexing, as depicted at the top of Fig. 3.29. In Row 2, again each I-picture is recovered, but due to the lower playback speed-up factor, the rate control inserts two repetition pictures, resulting in a lower refresh-rate emulating the indicated fast-search speed-up factor Ps = 4. This reduces the number of retrieved I-pictures per second by a factor of three, thereby requiring less processing (approximately one third). From this experiment, we conclude that the execution of the involved signal processing scales linearly with the offered normal-play GOP size and the chosen speed-up factor. This is also plausible, since the involved processing executed on the mediaprocessor is based on a streaming architecture. This conclusion is further supported by the observation that successive I-pictures from consecutive GOPs have approximately the same bit cost. Figure 3.30 presents the visual impact of the fast-search navigation for the above-mentioned cases of searching. Figure 3.30(a) shows successive normal-play I-pictures with different contents, due to the significant differences in temporal positions. Figure 3.30(b) reveals the impact of a tripled bit cost of the selected I-picture, e.g. due to a higher picture quality or more complex data. Due to this high bit cost, the involved picture transmission time triples accordingly, which requires smoothing to come to the desired average bit rate.



Figure 3.30 — Navigation snapshots covering 240 ms of consecutive frames. The figure should be interpreted column-wise. (a) Selected normal-play I-pictures for GOP length N = 12 and refresh-rate of 25 Hz, Ps = 12. (b) Selected normal-play I-pictures for GOP length N = 12 and refresh-rate of 8.33 Hz, Ps = 12. (c) Selected normal-play I-pictures for GOP length N = 12 and refresh-rate of 8.33 Hz, Ps = 4.

Consequently, the involved bit-cost smoothing requires three frame periods, leading to a twofold picture repetition. This leads to rendering the same picture three times. In order to avoid a navigation playback speed-error, we need to skip two successive I-pictures.


Figure 3.30(c) shows the visual impact when applying repetition pictures and not skipping successive I-pictures. This results in a threefold lower navigation playback speed. A disadvantage of the last two methods is that the original motion is not preserved due to the repetition process. From the fast-search navigation experiments, we conclude with two findings regarding system aspects for this type of navigation.

• HDD throughput and system bandwidth. Lowering the refresh-rate can be exploited to significantly reduce the DSP or CPU load. We have found that a reduced refresh-rate is a concept that can be used to solve various problems: (1) to overcome a possible HDD seek-rate constraint, (2) to reduce the load on the system bandwidth and (3) to adjust the Quality-of-Service (QoS).

• Perceptual aspects of the navigation. A reduced refresh-rate can also be used to accommodate the visual perception experienced by the viewer during high-speed fast search. Due to little or no temporal correlation, the visual information during high-speed playback can hardly be followed by the viewer. Lowering the refresh-rate enables the viewer to better perceive the video information. However, when applied during high-speed search, e.g. Ps = 100, while preserving the speed-up factor, the navigation efficiency declines, as a considerable amount of normal-play information is disregarded. During our fast-search video navigation experiments, we have found that the normalized refresh-rate should not be lower than 1/3, as indicated by the gray area in Fig. 3.31. This is because a lower refresh-rate results in a slide-show effect, where the viewer loses the fast-search viewing experience.

3.7.4 Performance estimation of slow-motion navigation

Similar to fast-search navigation, slow-motion navigation is obtained on the basis of re-used normal-play MPEG-compressed video information. Evidently, the involved slow-motion signal processing has a high commonality with the fast-search processing. It was shown in the fast-search experiments that the measured execution fraction on the DSP grows linearly with the frame rate. Due to both observations and the conceptual similarity, we have omitted this extra implementation effort and did not repeat the same experiment for slow-motion navigation. Instead, we conduct a performance estimation on the basis of the measured fast-search performance.

Let us now derive how the fast-search numbers can be translated into the slow-motion case, where the emphasis is on calculating the slow-motion DSP clock-cycle performance.


The normal-play video sequence deployed in the fast-search performance measurement has been encoded at 6 Mbit/s, employing a GOP length N = 12. The I-pictures have been encoded such that the bit cost is maximally 600 kbits. When using this maximum for re-used I-pictures and a maximum refresh-rate of 25 Hz, we obtain a maximum bit rate for navigation of 15 Mbit/s. For this maximum throughput rate and the measured clock-cycle fraction, we can now define a linear function, revealing the relation between the DSP load as a function of the processing bit rate, resulting in 33/15 = 2.2 clock cycles/bit. This leads to a linear function y = 2.2x, with y being the DSP load in MHz and x the input bit rate in Mbit/s. Based on this linear function, we can now estimate the performance load for slow-motion video navigation. Table 3.6 indicates the results of this estimation function in the form of the estimated DSP clock-cycle loads for three slow-motion playback speeds. The performance estimates depicted in Table 3.6 show a declining clock-cycle consumption for lower playback speeds. This is due to the fact that for lower playback speeds, each normal-play picture is repeated several times, resulting in reduced bit-stream parsing per time instance. From the estimated DSP clock-frequency fraction, we conclude that slow-motion video navigation based on re-used MPEG-compressed normal-play video information results in a quite low computation load compared to the fast-search computation load.
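The linear model is easily reproduced; the following minimal sketch (ours, not the thesis software) recomputes the estimates of Table 3.6 from the measured 2.2 clock cycles/bit.

```python
import math

CYCLES_PER_BIT = 33 / 15   # 2.2 MHz per Mbit/s, measured for fast-search

def dsp_load_mhz(input_rate_mbit_s):
    """Estimated DSP load in MHz, rounded up with a ceiling function."""
    return math.ceil(CYCLES_PER_BIT * input_rate_mbit_s)

# Slow-motion factors and effective input bit rates for a 6 Mbit/s stream.
for factor, rate in [("1/2", 3.0), ("1/3", 2.0), ("1/4", 1.5)]:
    print(factor, rate, dsp_load_mhz(rate))   # -> 7, 5 and 4 MHz
```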

Figure 3.31 — Reduced refresh-rate. The gray region at the left indicates the preferred region of operation for fast-search navigation.


Table 3.6 — Estimated DSP clock-frequency fraction required for slow-motion navigation (normal-play bit rate 6 Mbit/s). The estimated clock frequency is rounded to an integer number with a ceiling function.

Slow-motion factor | Input bit rate (Mbit/s) | DSP load (MHz) | DSP load (%)
1/2 | 3 | 7 | 3.2
1/3 | 2 | 5 | 2.3
1/4 | 1.5 | 4 | 1.8

Table 3.7 — Estimated DSP clock frequency required for deriving the CPI metadata during recording. The estimated clock frequency is a fraction of the processor clock frequency rounded to an integer number.

Input bit rate (Mbit/s) | DSP load (MHz) | DSP load (%)
15 | 17 | 7.7
9 | 10 | 4.5
4 | 5 | 2.3

3.7.5 Performance estimation of navigation processing during recording

For flawless video navigation playback, a preparatory signal processing action is already performed during recording of the scene. This involves MPEG-2 demultiplexing and Characteristic Point Information (CPI) bit-stream parsing, in order to determine the GOP structure, especially the length N and P-distance M. These parameters are additionally stored as metadata with the recording, so that they are readily available when navigation starts. In this way, dual-stream processing is avoided and also the associated clock-cycle consumption. Similar to slow-motion, we estimate the required performance for the previous operations at recording. The video navigation signal processing during recording involves TS demultiplexing and video ES parsing in order to derive the CPI. These two operations are also applied during video navigation playback and form half of the involved signal processing chain. Based on this observation and the derived linear function from the previous subsection on slow-motion, we estimate the clock-cycle performance during recording. The involved processing for deriving the CPI at recording is estimated from Fig. 3.29.


In this figure, the fast-search processing is depicted at the top and consists of demultiplexing followed by ES parsing. The remaining processing is not conducted when deriving the CPI during recording. As a result, the computations involved with the CPI processing are approximately halved. Consequently, the linear function between the throughput rate (parameter x) and the resulting performance in clock cycles (parameter y) is also halved, leading to a function y = 1.1x. Table 3.7 indicates the estimated DSP clock-cycle load for three normal-play processing bit rates. On the basis of the estimated DSP clock frequency, we conclude that the video navigation processing conducted during recording, i.e. the CPI metadata generation, is estimated to also require a low computation load.

3.7.6 Picture quality validation of mosaic screens

The mosaic-screen quality is directly related to the employed bit cost for coding of the individual subpictures. These subpictures constructing a mosaic screen are derived during recording and stored as metadata, on the basis of normal-play intraframe-coded pictures. Figure 3.32 indicates the processing chain for deriving two subpictures, one with On-Screen Display (OSD) and one without. Hereby, the normal-play MPEG-2 TS is demultiplexed, followed by a scalable MPEG-2 decoder, operating only on normal-play intraframe-coded pictures. The encoding of the two subpictures is implemented such that the video part of the subpictures is shared, enabling a constant picture quality. In this way, the picture quality of an OSD-equipped subpicture equals that of its counterpart, avoiding a visible change in picture quality when conducting intra mosaic-screen navigation. Let us now determine the quality level for individual subpictures of the mosaic screen. In Section 3.6.2, we have discussed that the "mini-slices" within the subpictures are coded with a fixed bit cost. To determine this bit cost, we have conducted a number of perceptual quality evaluations.

Figure 3.32 — Subpicture processing chain for mosaic screens.


Figure 3.33 — PSNR of the subpictures of a mosaic screen. (a) Subpicture numbering of the mosaic screen. (b) PSNR per subpicture of the mosaic screen, corresponding with a transmission time p of 1, 2 or 3 frame periods and a transmission rate of 15 Mbit/s.

The subpictures should be of good quality, because they are used for navigation purposes and may therefore be visually inspected by the viewer. The quality choice is emphasized for the situation that a subpicture is tracked by the eye of the viewer during e.g. a scrolling operation. For the subjective experiments, a modified version of the MPEG-2 reference encoder software [92] was used. The primary modifications deal with the creation of mini-slices and the involved bit-rate control for fixed-cost mini-slices. The modified MPEG-2 reference encoder software [92], which employs a single-pass encoder, delivers a bit cost that is lower than the maximum bit cost indicated in Table 3.8. The fixed bit cost per mini-slice is obtained on the basis of padding, employing zero-valued bytes. In our experiments, we have used the parameter settings of a standard MPEG-2 MP@ML encoder, facilitating a bit rate of 15 Mbit/s. Using the 25-Hz frame rate for European broadcasting, the video coder running at 15 Mbit/s fills the VBV buffer in almost three frame periods. For our subjective experiments, we have encoded a mosaic screen at three different bit costs.


The mosaic-screen bit cost equals the product of the maximum MPEG-2 ML bit rate, the European frame period and a scale factor p, which denotes the mosaic-screen transmission time expressed in frame periods. Figure 3.34 visualizes the subpicture quality for the three different mosaic-screen bit costs, which require a transmission time as indicated in Table 3.8. It can be observed that the perceptual quality increases with increased subpicture bit cost. Because a mosaic screen consists of multiple subpictures, the total picture quality is determined by the quality of each individual subpicture. Figure 3.33(b) depicts the Peak Signal-to-Noise Ratio (PSNR) of each individual subpicture that is encoded using the single-pass MPEG encoder. A good mosaic-screen picture quality can be expected when individual subpictures have a PSNR of 30 dB or more. On the basis of the previous subjective and objective evidence, we conclude that a mosaic screen constructed from subpictures with PSNR > 30 dB requires a bit-cost setting of 225,000 Bytes per mosaic screen. This bit cost is almost the complete VBV buffer size, which equals 1,835,008 bits (224 kBytes) for MPEG-2 MP@ML. Let us now continue with the visual inspection of an example of a hierarchical mosaic screen. Figure 3.35 provides an instant hierarchical overview at three different time scales of the same video sequence, according to the algorithm of Section 3.6.3. Hereby, the base-layer mosaic screen provides a detailed normal-play instant overview, which is stretched to more visible samples of individual scenes from the sequence, eventually ending in equidistant snapshots of single pictures constructing the video sequence. Fig. 3.35 reveals the visual relation between the hierarchical navigation layers, where a higher-layer mosaic screen is constructed from 16 adjacent mosaic screens located at a lower layer. This becomes clear when visually inspecting the first mosaic screen at the second hierarchical layer (middle). This mosaic screen is constructed using subpictures derived from the base layer (bottom). In this example, the first three base-layer mosaic screens all provide a subpicture, located at the upper-left corner, to construct the first mosaic screen at the second hierarchical navigation layer.

Table 3.8 — Bit cost for mosaic screens in case of various transmission times at a bit rate of 15 Mbit/s.

Mosaic-screen bit cost (Bytes) | Transm. time p (No. frame periods) | Mosaic-screen transm. time (ms) | Subjective quality
75,000 | 1 | 40 | poor
150,000 | 2 | 80 | modest
225,000 | 3 | 120 | good
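In formula form (our restatement of the product described above), the mosaic-screen bit cost in Bytes for a transmission rate R = 15 Mbit/s and frame rate f = 25 Hz equals

\[ B_{\mathrm{ms}} = \frac{R \, p}{8 f} = \frac{15 \cdot 10^{6} \, p}{8 \cdot 25} = 75{,}000 \, p \ \text{Bytes}, \]

so that p = 1, 2, 3 yields the three rows of Table 3.8.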


Figure 3.34 — Subpicture quality using a fixed bit cost for the "mini-slices". The left column shows full pictures, while the right column shows a magnified view of the left picture. (a) Subpicture quality for a mosaic screen with a bit cost of 75,000 Bytes. (b) Subpicture quality for a mosaic screen with a bit cost of 150,000 Bytes. (c) Subpicture quality for a mosaic screen with a bit cost of 225,000 Bytes.

A similar situation applies to the mosaic screens located at the third hierarchical layer (upper), which are based on mosaic screens from the second hierarchical navigation layer.


Figure 3.35 — Example of hierarchical video navigation. (a) Base layer providing an instant overview of 0.8 minutes. (b) Second layer providing an instant overview of 12.2 minutes. (c) Third layer providing an instant overview of 196.6 minutes.


3.8 Discussion on automated mosaic screen scrolling

Navigation on the basis of mosaic screens as discussed in this chapter requires user interaction in order to request a new mosaic screen. Navigation through mosaic screens corresponding to different time scales (different hierarchical layers) is achieved in a convenient manner, due to the limited number of hierarchical layers. Scrolling (browsing) through mosaic screens belonging to a single hierarchical layer may become a tedious task, especially when conducted at the lowest hierarchical layer, i.e. the base layer. The root cause for this is the manual interaction with the storage system to request a new mosaic screen. In order to avoid this, we propose here an automated scrolling technique, a concept that avoids individual requests for new mosaic screens, so that the involved interaction is minimized. Based on a smooth horizontal motion portrayal, subpictures constructing a neighboring mosaic screen enter the display (shift in) automatically, while subpictures of the currently visible mosaic screens disappear from the display (shift out) in the opposite direction. The shifting direction of mosaic screens on the display is controlled by the search direction indicated by the viewer. Such an approach offers a highly intuitive user interface with a natural understanding for navigation.

However, the implementation of the above novel concept contains some challenges. The key issue is the available throughput rate of the adopted execution architecture, in our case the CPU-DSP combination with the streaming operations at the DSP. The proposed user interface requires continuous decoding, screen composition, shifting and final encoding, which forms a considerable load on the DSP throughput and CPU. A possible solution circumventing this high throughput and CPU load is the usage of the MPEG decoder for motion-based shifting of the mosaic screen, utilizing the available infrastructure for motion compensation. Motion compensation is a coding feature available when employing predictive coding and is supported when deploying P-type coding syntax. This coding feature was already proposed for the coding of the mini-slices constructing the subpicture, allowing the re-use of subpictures already available at the MPEG decoder as reference information. Automated scrolling through neighboring mosaic screens is a navigation feature which typically results in a high correlation between successive mosaic screens. This high amount of correlation allows the re-use of predictive coding, but now for mosaic screens, which would reduce the amount of new video information and thus the throughput rate, while simultaneously reducing the involved CPU load. For a practical implementation, motion compensation can be limited to an integer multiple of 16 pixels.
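The following conceptual sketch illustrates one scroll step under this scheme. It is our illustration only: macroblock syntax and bit-stream generation are abstracted away, and the shift is restricted to whole macroblocks (16 pixels), as suggested above.

```python
def scroll_step(mb_cols, mb_rows, shift_mbs):
    """Per-macroblock coding decisions for shifting the mosaic screen
    left by shift_mbs macroblocks (16 pixels each)."""
    decisions = []
    for y in range(mb_rows):
        for x in range(mb_cols):
            src_x = x + shift_mbs
            if src_x < mb_cols:
                # Content already at the decoder: P-type macroblock that
                # copies from the previous mosaic screen via a motion
                # vector of (16 * shift_mbs, 0) and no residual.
                decisions.append((x, y, "P", (16 * shift_mbs, 0)))
            else:
                # Newly entering column: intra-coded subpicture fragment.
                decisions.append((x, y, "I", None))
    return decisions
```

In this way, only the entering subpicture columns contribute new bits, which is exactly the throughput reduction argued above.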


3.9 Conclusions

In this chapter, we have classified video navigation into three categories, of which two categories are characterized by a networked decoder, operating at a distance from a storage device elsewhere in the network. These two categories have been elaborated and worked out experimentally in this chapter. The third category involves a different navigation concept, which is not networked and requires local audiovisual decoding. The two proposed video navigation categories from this chapter are suitable to be employed in a client-server-based communication system. Both solutions feature communication interoperability, based on standard MPEG encoding and decoding techniques and coded MPEG information transmitted across the network. In this way, we circumvent the usage of non-mandatory communication options provided by MPEG, so that the operation of our intended navigation is ensured under all conditions. Let us now briefly discuss the most important findings of our two proposed navigation concepts elaborated in this chapter.

Networked full-frame video navigation. Video navigation on the basis of this concept fully re-uses intraframe MPEG-compressed normal-play video pictures for deriving fast-search navigation. The implementation is based on the finding of characteristic point information during recording, revealing the storage location of these intra-coded pictures, which are then re-used for generation of a fast-search navigation sequence. In a similar fashion, we have implemented slow-motion video navigation on the basis of full re-use of all normal-play pictures. In order to reduce the playback speed, we employ repetition pictures, which repeat normal-play reference pictures. Repetition pictures are elegantly employed to also control the bit rate and frame rate of the video navigation sequence. This is achieved by assuming a fixed bit rate and then calculating the transmission time for each re-used intra-coded picture. In a similar way, for both video navigation playback speeds, we control the rendering at field or frame level, depending on the video format (interlaced or progressive). In this way, we efficiently remove field-based video information (interlace kill), thereby avoiding motion judder during navigation.

Networked hierarchical mosaic-screen video navigation. Video navigation on the basis of hierarchical mosaic screens relies on the usage of subpictures derived from intra-coded normal-play pictures during recording. A set of consecutive subpictures constructs a mosaic screen. Flexibility in the composition of mosaic screens is essential for providing different temporal instant overviews, covering short-, medium- or long-time intervals. This flexibility is obtained by coding each subpicture at a fixed bit cost, thereby enabling easy retrieval from the storage device while simplifying the construction of the final mosaic screen with the compressed subpictures.

A fixed-cost subpicture is achieved by dividing each subpicture into a set of "mini-slices", which are also encoded at a fixed bit cost. Furthermore, when encoding subpictures using P-type coding syntax, new mosaic screens can be constructed using predictive coding, based on re-using subpictures available at the MPEG decoder, thereby reducing the involved throughput and the CPU and DSP load.

Let us now summarize the results of the experiments for both the full-frame video navigation solution and the hierarchical mosaic-screen approach. We have found that the use of repetition pictures in a full-frame video navigation system facilitates various forms of control, which allow influencing the Quality of Experience (QoE) during navigation. When employing repetition pictures to derive video playback with a medium playback speed Ps = 4, the QoE suffers due to the lack of motion portrayal. However, at a high video playback speed Ps = 50, the QoE is improved due to an increased display time. Furthermore, when employing interlace kill to repeat a picture with interlaced format, motion judder is avoided, which significantly contributes to the perceived QoE. Moreover, the usage of repetition pictures reduces the involved CPU and DSP load. The usage of repetition pictures to reduce the refresh-rate is bounded by a normalized refresh-rate of 1/3, as for lower refresh-rates a slide-show effect occurs, whereby the viewer loses the fast-search navigation experience. Finally, due to the full re-use of normal-play encoded pictures, there is no difference in spatial picture quality between normal play and video navigation. Experiments with the method employing video navigation with hierarchical mosaic screens show that this approach provides an attractive instant overview of a normal-play video segment on the basis of equidistant temporal subsampling. In this way, the video navigation inefficiency with respect to full-picture rendering, which occurs for video navigation at high playback speeds Ps > 25 due to the picture refresh-rate of individual pictures, is eliminated. Furthermore, when increasing the temporal subsampling for higher navigation layers, the video information is condensed into a limited set of mosaic screens. The picture quality of a mosaic screen is determined by the quality of the individual subpictures. When employing a bit cost for a mosaic screen which equals almost the full VBV buffer size of the MPEG decoder, a good picture quality is obtained. We have found empirically that the PSNR of subpictures should be about 30 dB or higher.

Since the video navigation solution is embedded in the overall system structure of a networked storage system, multiple system aspects play a role. We conclude here on the most important aspects. The full-frame video navigation method involves analysis of the recorded video ES, in order to determine various essential stream aspects, which are required at video navigation playback. Essential elements are the normal-play GOP length, the P-distance and the storage location of the intraframe-compressed picture and its size.

These parameters enable video navigation without speed-errors and efficient retrieval of the re-used pictures and their transmission times. An example of the complexity analysis leads to the following numbers. When separating the video navigation processing in two parts, i.e. during recording and video navigation playback, the involved DSP load is kept at an acceptable level. For a typical video broadcast at SD resolution and 4 Mbit/s, the DSP load during recording requires a cycle load of 5 MHz, while a DSP cycle load of 22 MHz is required for full frame-rate fast-search video playback with speed-up factor 12, and 5 MHz for slow-motion with playback speed 0.5. For fast-search video navigation, a reduced refresh-rate also decreases the involved DSP cycle load. This also applies to the slow-motion playback mode, where an increased slow-motion factor reduces the DSP cycle load. With respect to the computational load of video navigation with hierarchical mosaic screens, there is an important aspect related to the generation of subpictures. The mosaic-screen technique involves the continuous derivation of subpictures from the received intra-coded normal-play pictures. When considering a typical video broadcast signal, these I-pictures appear at a rate of 2 Hz, leading to the generation of 2 subpictures per second. The derivation of subpictures is quite computationally intensive, involving MPEG-2 demultiplexing and scalable decoding, followed by fixed bit-cost encoding of the derived subpicture. In order to lower the throughput and processing load for deriving the subpictures, a single subpicture is selected per normal-play scene. When considering a typical scene duration of 3 seconds, this processing rate is reduced to 0.3 Hz, leading to 1 subpicture per 3 seconds. The construction of a mosaic screen occurs in the compressed domain, involving bit-stream parsing, rate control on the basis of repetition pictures and MPEG-2 TS multiplexing. This processing shows a high resemblance with the processing required for fast-search full-frame video navigation. We therefore expect that the involved playback processing for mosaic-screen navigation will show a similar throughput and DSP cycle load.

We have presented two forms of video navigation for client-server-based navigation with a low complexity, so that the method can be embedded in the existing infrastructure. By separating the involved signal processing into a recording and a video navigation stage, we have been able to divide the computational complexity into two more or less equal parts. In this approach, the signal processing during recording prepares information, obtained on the basis of normal-play GOPs, required by the signal processing for video navigation playback, thereby avoiding dual-stream parsing during navigation playback. This benefit holds for both navigation approaches. Another benefit of the embedded nature of the proposed techniques is that the methods can be upgraded over time, facilitating more advanced video navigation features.

A drawback of the proposed video navigation solutions is the usage of only a single information source (pictorial data), which limits the information perceived by the viewer. For example, auditive information may be useful for video navigation as well.

Networked video navigation employing hierarchical mosaic screens involves a manual interaction for navigation through mosaic screens belonging to the same hierarchical navigation layer. In order to avoid manual interaction, automated scrolling is proposed, employing motion-compensated mosaic-screen construction, in combination with intraframe-compressed subpicture fragments. In this way, mosaic screens are shifted in either left or right direction, facilitating horizontal scrolling in both directions.

The two presented video navigation solutions based on a client-server approach utilize only video information, which limits the instantaneously received information. In the next chapter, we will propose a video navigation solution which also employs the audio information associated with the video information. In this way, both visual and auditory cues are employed, enabling the viewer to simultaneously conduct other tasks, thereby eliminating the need to maintain eye contact with the display.


“Better to remain silent and be thought a fool than to speak out and remove all doubt.” Abraham Lincoln, 1809 – 1865

4

Audio-enhanced dual-window video navigation

4.1 Introduction

In Chapter 3 we have presented three video navigation use cases and proposed a specific navigation solution for both the short-time and the long-time interval use cases. The proposed solutions address two important system aspects, which also concern efficiency: (1) network-based interoperability and (2) decoupling the individual picture rendering rate from the employed video frame rate. In this chapter, we present a solution for medium-time interval navigation, which is the third video navigation use case. Although the proposed video navigation solutions for the short-time and long-time interval can also be employed as a solution for the medium-time interval use case, there is a limitation in the navigation information, as both solutions employ only video signals.

Figure 4.1 — Dual-window video rendering for medium-time navigation. The main window shows normal-play fragments with the associated audio also being rendered, whereas the PiP window shows video corresponding to the fast search.

This limits the information perceived by the viewer, making these navigation methods less suitable as a viewing mode to obtain a summarization (global overview) of the stored audiovisual information. Moreover, the short-time and long-time interval navigation solutions require the viewer to maintain visual contact with the display, limiting the viewer's freedom to conduct other tasks in parallel with navigation. In order to establish a navigation form suitable for visual summarization, while simultaneously offering user freedom in performing other tasks, we propose a medium-time interval video navigation with an improved summarization and additional auditive information, which is presented to the viewer. The video navigation through summarization is improved by detailing video sequence information and combining this with fast-search video information, which is presented in a dual-window fashion. More specifically, we propose a novel audio-enhanced dual-window navigation, based on a conventional primary image with associated audio, combined with an additional Picture-in-Picture (PiP) screen, as depicted in Fig. 4.1. Hereby, the primary window displays fragments of normal-play video, while conventional video trick-mode playback, such as fast forward or fast reverse, is presented in the PiP window. In this chapter, we mean with audio-enhanced navigation that the navigation is supported with audio signals (not that the audio itself is enhanced). The usage of audiovisual information for navigation purposes has multiple advantages compared to a video-only navigation solution. A first advantage is that the audio information forms an additional information source, making the combined audiovisual navigation information more informative. A second advantage is that the presence of auditive information avoids the need for a viewer to have permanent visual contact with the display, thereby enabling the viewer to perform other activities in parallel, while the viewer becomes less fatigued or bored by the navigation. A navigation solution based on audiovisual information combined with fast-search video information differs conceptually from a video-only navigation method, in the sense that both audio and video information signals should be handled simultaneously by the storage system, resulting in additional system requirements. These system requirements are further expanded due to the difference between the perception of audio and video information. As the medium-time interval video navigation solution will coexist next to the short-time and long-time interval video navigation solutions, this enables the sharing of functional parts of these navigation solutions. However, the medium-time navigation employs dual-signal video information, which cannot be received by a standard DTV platform. In order to avoid the transmission of a dual-signal video component in the network, both video signals need to be reformatted into a single video sequence. This involves video mixing and re-encoding and subsequent multiplexing of the audio signal data, which highly complicates the implementation in the basic DTV architecture. To simplify this issue, we aim at a concept employing only local playback of this dual-window navigation solution at the actual platform, avoiding the need for full re-encoding of the video navigation signal.

Our proposed navigation solution involves audio information and two independent video signals, which requires a secondary video decoder. Modern high-end consumer platforms offer such a secondary video decoder, whereas in a typical low-cost consumer platform, such a secondary video decoder is absent. For this situation, additional video decoding on a general-purpose control processor forms an attractive solution. As video decoding is a computationally intensive task, we aim at employing complexity-scalable video decoding, enabling the smooth operation of the CPU tasks. Complexity-scalable decoding algorithms exist for MPEG-2 intraframe-coded pictures. Although a scalable video decoding algorithm for H.264/MPEG4-AVC intraframe-coded pictures exists, this solution suffers from drift [93]. For thumbnail-sized pictures, this drift may be acceptable, but for high-resolution pictures this distortion is not acceptable. Summarizing, this chapter aims at developing a medium-time interval navigation solution, where dual-window video is employed for normal-play and fast-search video information, enhanced with auditive information. The solution should be deployable on a typical low-cost consumer DTV platform. Furthermore, we propose a low-cost drift-free decoding method for H.264/MPEG4-AVC intraframe-coded images, to derive the fast-search PiP-sized pictures. The remainder of this chapter is organized as follows. First, we present background information on related work and system requirements in Section 4.2. Section 4.3 elaborates on our conceptual solution for audio-enhanced dual-window video navigation. System integration and implementation are discussed in Section 4.4, while our proposal for computationally reduced H.264/MPEG4-AVC intraframe decoding is presented in Section 4.5. Section 4.6 discusses experimental results and conclusions are presented in Section 4.7.

4.2 Background and system aspects

This section briefly elaborates on the perception of auditive information for navigation purposes. Moreover, it discusses scalable decoding results from the literature for both MPEG-2 and H.264/MPEG4-AVC intraframe-coded pictures. Finally, audiovisual signal processing aspects relevant for system integration are addressed.

A. Consecutive sound bursts for interpretation of audio information

The usage of audio information for video navigation has been limited to either playback speeds around unity [14], or to feature extraction, enabling advanced intra-program video navigation [75].


A possible solution direction for advanced intra-program video navigation is the simultaneous rendering of multiple information signals. Since human perception employs both visual and auditory cues, it is opportune to employ audio information associated with the video-navigation information, thereby creating a multi-source information signal for advanced intra-program video navigation. However, cognitive processing of sound occurs at multiple time scales. In audio perception, these time scales range from microseconds for localization of sound sources, via milliseconds for pitch analysis and event detection, to tens of milliseconds for analyzing speech characteristics. In mobile communication, phone-ringing recognition requires about 100 milliseconds, while syllables and words in speech require larger time scales [94]. As a result, it is not effective to select the audio information associated with a single picture forming the video navigation signal, as this would generate only 40 or 33 milliseconds of consecutive sound for a 25-Hz or 30-Hz television system, respectively.

B. Complexity-scalable video decoding

In general, the concept of complexity-scalable video decoding may form an attractive solution to control decoding complexity and thus the required system resources [88]. In order to reduce the spatial resolution, the scalable decoding technique operates in the transform domain, where a subset of the transform coefficients is used to reconstruct a set of pixels, forming a reduced block size compared to the original encoding block size. Typical spatial reduction factors for an 8 × 8 coefficient block, while preserving the aspect ratio, are horizontally and vertically a factor of 8, 4 or 2.

Figure 4.2 — Basic block diagrams for intraframe decoding systems. (a) MPEG-2. (b) H.264/MPEG4-AVC.


A fast-search navigation signal with a PiP size on the display represents a video signal with inherently reduced spatial resolution, originating from intraframe-coded pictures. Therefore, this video PiP signal is suitable to be obtained via computationally reduced decoding techniques. Hereby, a distinction is made between MPEG-2 and H.264/MPEG4-AVC compressed video sequences. Figure 4.2 indicates the basic block diagrams for an MPEG-2 and an H.264/MPEG4-AVC intraframe video decoder. The main difference between the two intraframe coding schemes is that for MPEG-2, block-based pixel data is available after the inverse transformation, whereas for H.264/MPEG4-AVC, the result after inverse transformation is a block-based residual signal. This requires the local reconstruction of the spatial block-based predictor for reconstructing the final pixels. A complexity-scalable compression technique for MPEG-2 is described in [88], while for scalable decoding a solution is presented in [91] and for H.264/MPEG4-AVC in [95][93].

C. Scalable MPEG-2 intraframe decoder

In Section 3.6.2, scalable MPEG-2 intraframe decoding has been elaborated in depth. In this subsection, a brief summary is presented of the basic concept of scalable MPEG-2 decoding. In MPEG-2, video decorrelation is achieved by means of an 8 × 8 2D-DCT, which enables a scalable decoder to reconstruct a spatial region that has fewer pixels. Figure 4.3 indicates such a form of spatial scalability, where a selection of four coefficients (upper-left corner) from an 8 × 8 DCT matrix results in a spatial region size of 2 × 2 pixels, effectively reducing the original 8 × 8 region by a factor of four in horizontal and vertical direction.

Figure 4.3 — Scalable 2 × 2 IDCT for an SD-to-QCIF MPEG-2 decoder. At the left, the received 8 × 8 coefficients (0..63) are depicted. At the right, a 2 × 2 pixel block is depicted, which is obtained by applying a 2 × 2 scalable IDCT.
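A minimal sketch of this transform-domain downscaling is given below. It is our illustration, assuming orthonormal DCT coefficients; the actual decoder implementation (quantization, saturation, integer arithmetic) may differ.

```python
import numpy as np

# 2-point orthonormal DCT matrix; it is symmetric and orthogonal.
T2 = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)

def scalable_idct_2x2(coeff_8x8):
    """Reconstruct a 2x2 pixel block from the four lowest-frequency
    coefficients of an 8x8 orthonormal DCT block (SD-to-QCIF style
    downscaling by a factor of 4 in each direction)."""
    c = np.asarray(coeff_8x8)[:2, :2]
    # The factor 1/4 (sqrt(2/8) per dimension) compensates for the
    # different normalization of the 8-point and 2-point bases.
    return 0.25 * (T2.T @ c @ T2)

# A flat 8x8 block of value 128 has a single DC coefficient of 128 * 8.
coeffs = np.zeros((8, 8)); coeffs[0, 0] = 128.0 * 8.0
print(scalable_idct_2x2(coeffs))   # -> 2x2 block of 128s
```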


Figure 4.4 — Comparison of transform-domain reconstructed Harbor pictures of 176 × 144 pixels and corresponding error signal (factor 8 amplified). (a) Original picture. (b) Picture coded at QP=6. (c) Picture coded at QP=18. (d) Picture coded at QP=28. (e) Error for QP=6. (f) Error for QP=18. (g) Error for QP=28.

D. H.264/MPEG4-AVC scalable intraframe decoder

H.264/MPEG4-AVC intraframe compression differs from preceding video coding methods such as MPEG-1/2/4, due to the presence of spatial prediction prior to transform coding. Intraframe H.264/MPEG4-AVC decoding requires, next to calculating the block-based residual information, a block-based prediction signal, which is calculated using integer arithmetic, to reconstruct the final pixel values. This prediction signal can be derived in the frequency domain, enabling scalable decoding in the transform domain, as proposed by Chen [95]. Kim [93] has used this technique to derive thumbnail-sized images from H.264/MPEG4-AVC intraframe-coded pictures. Although the results in [93] work for H.264/MPEG4-AVC up to main profile, the solution has two shortcomings.


Figure 4.5 — Drift effect on picture quality for a PiP picture with size 480 × 270 pixels due to non-exact predictor calculation and the associated propagation in the reconstructed signal based on that predictor.

The first disadvantage is pixel-value drift, as shown in Fig. 4.4, resulting from incorrect calculation of the reference pixels employed for the calculation of the block-based spatial predictor. For thumbnail-sized pictures, this distortion may sometimes be perceptually acceptable, but for a PiP window of size 480 × 270 pixels, this distortion is perceived as annoying and therefore needs to be avoided, see Fig. 4.5. The second disadvantage of the frequency-domain decoding method of [95] is that it does not support the decoding of an 8 × 8 block size, which is deployed for profiles covered by the Fidelity Range Extension (FRExt) and utilizes low-pass filtering of the reference pixels prior to constructing the block-based prediction. From experiments using scalable decoding techniques as described in [95][93], it becomes clear that a computationally reduced H.264/MPEG4-AVC intraframe decoder must preserve the reference pixels, such that the decoder output is not contaminated by pixel drift. Furthermore, the method of Chen cannot be applied due to the above reasons.

Let us now summarize our requirements for audiovisual video navigation. The quality of the audiovisual navigation method relies heavily on the consecutive duration of the provided audio signal, whereas the derivation of the fast-search video navigation signal must be free from pixel drift. Further system and quality aspects and requirements of the navigation method are listed below.

1. Normal-play fragment duration: The normal-play duration is mainly determined by the perception of the audio signal, which differs for various content types (music, speech, etc.).


2. Scene-change detection: Scene-change information is beneficial in order to adjust the start of a normal-play fragment, thereby optimizing the selection of sufficiently long meaningful audio fragments within the related time interval.

3. PiP video decoding: The PiP-based fast-search video navigation signal can be obtained by means of hardware or software decoding. The former is possible for high-end consumer platforms, which are equipped with dual-channel video decoding, while the latter solution is typically suited for low-cost consumer platforms. A PiP-based video navigation signal, which has a reduced spatial resolution compared to the original video, benefits from scalable video decoding, which is more cost-effective regarding the system resources, such as cycle consumption, bandwidth and memory footprint.

4. Trick-play refresh-rate: The trick-play refresh-rate is typically equal to the rendering rate. For trick play with high speed-up factors, it is advantageous to reduce the refresh-rate to allow an improved interpretation by the viewer. A reduced refresh-rate also lowers the computational load and decreases bandwidth, which is beneficial when decoding the fast-search video signal on a control processor.

5. Video fragment processing: The normal-play video fragments are constructed using multiple Groups Of Pictures (GOPs). At the normal-play fragment boundaries, decoding artifacts may be visible, depending on the absence of reference pictures. To avoid visible artifacts, the first and last GOP constructing a normal-play fragment (typically open GOPs) require modification (closing the open GOP structure) for avoiding decoding artifacts.

6. Audio fragment processing: Depending on the audio compression standard, padding samples may occur in an audio frame. Concatenation of non-sequential audio frames having padding samples causes an audio buffer violation, which can be prevented by applying proper signal processing in the compressed audio domain.

Based on the above requirements and system aspects, the next section presents a conceptual solution for the desired audio-enhanced dual-window navigation method.

4.3 Concept of audio-enhanced dual-window video navigation

This section provides an outline of the audio-enhanced dual-window video navigation signal. First, we discuss the involved temporal subsampling for deriving audiovisual fragments from the normal-play signal and the associated fast-search video signal. Second, two solutions are presented for the involved signal processing.

4.3.1 Temporal subsampling of dual-stream video signal

The concept of audio-enhanced dual-window navigation is based on detailed normal-play audiovisual information, combined with fast-search video navigation information. The derivation of both visual navigation signals is based on temporal subsampling of the normal-play video sequence, which is depicted in Fig. 4.6(a). The first signal, depicted in Fig. 4.6(b), constructs the primary video window on the basis of normal-play fragments, featuring also the corresponding audio information. The second information signal constructs the PiP-based fast-search video navigation window, of which the time sampling is depicted in Fig. 4.6(c). Hereby, t_{s,k} indicates the time location relative to time t_{n,0}, which denotes the start point of navigation of the normal-play signal. Time sampling refers to a temporal picture selection process.

Figure 4.6 — Time line for extracting dual-window video navigation (indexes are explained in the text). (a) Recorded MPEG-compressed program with indexed normal-play fragments (NPF). (b) Selected normal-play fragments which are appended as a sequence. (c) Derived fast-search navigation signal using intra-coded pictures.

Hereby, those normal-play pictures are selected that coincide with the time moments for fast-search video navigation, according to the relation

t_{s,k} = \frac{k P_s}{f}. \qquad (4.1)

Parameter P_s is the relative playback speed for the fast-search video navigation signal, k = {0, 1, 2, · · · , K} is an index revealing the re-used normal-play pictures constructing the trick-play signal, and f is the television frame rate. The corresponding normal-play fragments have a normal-play fragment duration time T_{np}, where t_{n,m} denotes the m-th normal-play fragment start position relative to t_{n,0}. Hereby, parameter m = {0, 1, 2, · · · , M} represents an index pointing to the re-used normal-play fragments, which is specified by

t_{n,m} = m T_{np} P_s. \qquad (4.2)

Consider a 25-Hz television frame rate, a normal-play GOP length of 12 and a relative playback speed P_s = 12, the latter being motivated by a practical intraframe refresh-rate of 2 Hz. In this case, typical successive values for the trick-play fast-search signal t_{s,k} become, according to Equation (4.1), integer multiples of 0.48 seconds, while typical successive values for the normal-play fragments t_{n,m} become, according to Equation (4.2), integer multiples of 36 seconds when T_{np} = 3 seconds.
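These sampling instants are straightforward to reproduce; the fragment below (our illustration) evaluates Equations (4.1) and (4.2) for the parameter values above.

```python
def fast_search_times(k_max, ps=12, f=25.0):
    """t_{s,k} = k * P_s / f   (Eq. 4.1)"""
    return [k * ps / f for k in range(k_max + 1)]

def fragment_starts(m_max, t_np=3.0, ps=12):
    """t_{n,m} = m * T_np * P_s   (Eq. 4.2)"""
    return [m * t_np * ps for m in range(m_max + 1)]

print(fast_search_times(3))   # [0.0, 0.48, 0.96, 1.44] -> multiples of 0.48 s
print(fragment_starts(2))     # [0.0, 36.0, 72.0]       -> multiples of 36 s
```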

4.3.2 Conceptual solutions for audio-enhanced dual-window video navigation

Similar to the video navigation solutions discussed in Chapter 3, there are two options for efficiently deriving the involved video and audio signals: first, signal derivation during video navigation playback only, and second, signal derivation divided over record and navigation playback mode. Both options are briefly analyzed with respect to their complexity, so that the best concept can be chosen.

A. First concept: signal derivation during video navigation playback

In the first approach, the navigation signals are derived during playback only. In such an approach, we use MPEG decoding for the normal-play fragments and scalable MPEG decoding for deriving the trick-play video signal, see Fig. 4.7(a). In this figure, at the left side, two MPEG-compressed data fragments, i.e. one I-picture and one GOP, are retrieved from the storage medium and each fragment passes through an MPEG-2 demux to access pictures and audio from the resulting video/audio elementary streams.

The first fragment contains an intra-coded picture for deriving the fast-search navigation signal, while the second fragment contains the audiovisual information for detailed rendering. The following signal processing steps are required for further processing of the selected fragments.

• Video pre-processing: This step involves removal of predictive-coded pictures referring to non-existing reference pictures, thereby avoiding decoding artifacts due to an open GOP structure.

• Audio pre-processing: This involves stream adaptation resulting from concatenating a series of audio fragments to create a new audio stream that should be MPEG compliant for playback.

• Fast-search video signal processing: For deriving PiP-sized pictures, the fast-search video elementary stream, which is based on I-pictures only, is scalably MPEG decoded for deriving pictures of smaller size.

For the normal-play fragments, the video and audio elementary streams are fully decoded, resulting in decompressed normal-play pictures and decompressed audio samples, respectively. Finally, the scaled fast-search decoded I-pictures are mixed with the decoded normal-play fragment pictures, resulting in the final dual-window video navigation screen.

Complexity. Let us derive a performance estimation and thus an impression of the computational load on the key system resources. The outcome of this estimation is summarized in Table 4.1. The navigation operates at a 25-Hz video frame rate for the European situation. The figures in Table 4.1 apply to the situation that the dual-window video screen is derived from a 720 × 576 picture size (SD) and 4:2:0 sampling format for the video signal, and involves Main Profile MPEG-2 decoding. The PiP is derived in a scalable manner from normal-sized I-pictures, resulting in a PiP size of 180 × 144 pixels and 4:2:0 sampling format. We have omitted the audio signal, as the video signal is the dominant factor in complexity. Using the previous figures, dual-window video navigation is constructed from two independent images with the previous resolutions, sampling format and frame rate. This results in an uncompressed video bit rate of 124.42 Mbit/s and a memory capacity requirement of 608 kBytes per full-color main image frame (720 × 576 × 1.5 Bytes), while the PiP has a bit rate of 7.56 Mbit/s and involves a memory capacity requirement of 38 kBytes per full-color frame (180 × 144 × 1.5 Bytes). The total bit rate for such a dual-window video screen at 25 Hz (full frame rate) equals 441 Mbit/s and involves a memory capacity of 2,310 kBytes. The concept is based on playback of normal-play fragments of sufficient duration to also include audio. The audio signal is decoded in the same way as during normal playback of a video program.


The additional processing related to the fast-search PiP signal is only a small fraction of the normal-play processing complexity, as seen from the above figures. The decoding complexity of the PiP signal is expected to be modest, of which implementation details will be discussed later.

Table 4.1 — Performance estimation on system resource utilization for audio-enhanced dual-window navigation with full frame-rate PiP signal derived from SD video.

Concept | Decoder bandwidth | Mixer bandwidth | Memory capacity | DSP/CPU cycle load
Playback proc. only | 441 Mb/s | 257 Mb/s | 2,310 kB | Modest
Record & playback proc. | 411 Mb/s | 257 Mb/s | 2,310 kB | Low
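The per-frame and per-second figures quoted above follow directly from the 4:2:0 sampling format (1.5 Bytes per pixel at 8 bits per sample); the short check below (our illustration) reproduces the main-window numbers.

```python
def yuv420_frame_bytes(width, height):
    """Bytes per 8-bit 4:2:0 frame: luma plane plus two quarter-size
    chroma planes, i.e. 1.5 bytes per pixel."""
    return int(width * height * 1.5)

main_frame = yuv420_frame_bytes(720, 576)    # 622,080 B, i.e. ~608 kBytes
pip_frame = yuv420_frame_bytes(180, 144)     # 38,880 B, i.e. ~38 kBytes
main_rate = main_frame * 8 * 25 / 1e6        # 124.416 Mbit/s at 25 Hz
print(main_frame, pip_frame, main_rate)
```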

Figure 4.7 — Conceptual signal processing for audio-enhanced dual-window video navigation. (a) All navigation signals are decoded during navigation playback. (b) All navigation signals are decoded during navigation playback, but the fast-search signal is downscaled during record and stored in MPEG-2 format.


B. Second concept: video signal processing divided over record and navigation playback

In the second approach, the fast-search PiP images are already derived during recording and stored in MPEG format. This approach is visualized in Fig. 4.7(b). The normal-play fragments are processed with the steps as indicated in the previous concept. We now indicate only the difference for deriving the fast-search signal, which is an extra pre-processing block at the recording stage in the diagram of Fig. 4.7(b).

• Fast-search PiP processing: During recording, the intraframe-coded pictures are extracted, scalably MPEG decoded and then re-encoded as a separate MPEG intra-coded picture stream, stored on the recording medium.

During navigation, the fast-search video information signal is MPEG-2 decoded and mixed with the decoded normal-play fragment pictures, resulting in a final dual-window navigation screen.

Complexity. The performance estimation and key system resource usage is to a large extent similar to the first concept. Let us now focus on the difference. The PiP video signal is derived during recording from intraframe-compressed pictures on the basis of scalable MPEG-2 decoding, resulting in a resolution of 180 × 144 pixels and 4:2:0 color sampling format. During recording, these images are MPEG-2 compressed at a bit rate of 15 Mb/s and stored on the medium. This bit rate is rather high, because only intra-coded pictures are processed. The total bandwidth for such a dual-window video screen at 25 Hz (full frame rate) equals 411 Mbit/s and involves a memory capacity of 2,310 kBytes. There is only a small bandwidth difference of 30 Mb/s compared to the first concept, which is the difference between the required 45 Mb/s for the PiP signal in the first concept and the 15 Mb/s that is required for the PiP in the second concept.

Concept conclusion: When comparing the two proposed concepts with respect to their system requirements, they have an almost equal system load in terms of bandwidth and memory capacity. However, in the second concept, the fast-search signal is 16 times smaller in picture size than in the first concept. Moreover, the bit rate of the compressed fast-search stream in the second concept is also 16 times lower than in the first concept. The execution complexity of the fast-search decoding software scales down by a similar factor. Furthermore, the intra-coded pictures used for deriving the fast-search navigation signal have a frame rate depending on the GOP length, which typically equals 2 Hz, thereby enabling slow-speed processing for each picture. These arguments make the second navigation concept based on pre-scaled fast-search


Figure 4.8 — Essential signal processing steps for deriving multi-signal navigation information. (a) Processing steps for deriving a PiP-sized video navigation signal. (b) Processing steps for normal-play audio fragment processing. (c) Processing steps for normal-play video fragment processing.

images the preferred solution. This choice is further motivated when considering more complex compression standards such as H.264/MPEG4-AVC.

4.4 System integration and implementation

This section aims at implementing the adopted second navigation concept of the previous section, which was chosen after the analysis of the concepts. First, the section starts with further detailing the involved audio and video processing required for the adopted concept. Second, the involved signal processing is mapped onto a functional PVR block diagram, re-using or extending the functional blocks which have already been detailed in Chapter 3.

4.4.1 AV algorithm implementation of chosen navigation concept

For the second concept, Fig. 4.8 depicts the main involved signal processing steps for the three individual navigation information signals. Fig. 4.8(a) depicts the main processing steps for deriving the PiP-sized fast-search video

navigation signal, which is conducted during recording. Starting at the left, the normal-play video sequence is demultiplexed, providing access to the intra-coded pictures constructing the normal-play video sequence. This video elementary stream is parsed and partially scalably decoded. These decoded normal-play pictures are MPEG-2 intraframe (re-)coded and locally stored. At the right, during navigation playback, these PiP-sized MPEG-2 intra-coded pictures are decoded, forming the fast-search navigation information signal.

Fig. 4.8(b) depicts the main pre-processing steps for processing the MPEG-compressed audio information during video navigation. At the left, the demultiplexed Packetized Elementary Stream (PES) access units are pre-processed, involving time-stamp adaptation to the local navigation time base and potential adaptation of available padding slots, to comply with the MPEG data format. For this adaptation, the audio PES is parsed, scanning for audio start codes and inspecting audio headers. For the situation that padding is required, already available padding slots are removed and new padding slots are calculated. In this way, an MPEG-compliant audio sequence is derived in the MPEG-2 compressed domain.

Fig. 4.8(c) depicts the main pre-processing steps for processing the MPEG-compressed video information during video navigation. At the left, the demultiplexed PES access units are pre-processed for time-base adaptation and removal of predictive-coded pictures that cannot be fully decoded due to the absence of the associated reference pictures. This processing involves discontinuity detection in the original Decoding Time Stamps (DTS) and Presentation Time Stamps (PTS), which are present in the PES header. For the situation that a discontinuity occurs at the boundary of two concatenated normal-play GOPs, the potentially available B-coded pictures are replaced by repetition pictures, i.e. pictures repeating the last decoded reference picture. In this way, decoding artifacts are avoided, which would deteriorate the video navigation quality.

Let us now detail the processing algorithms. Figure 4.9(a) and (b) visualize the algorithmic video and audio processing steps, respectively. The fast-search video signal processing involved in scalable MPEG decoding was already discussed in Section 3.6.2 and is omitted here.

Video processing: The normal-play fragment-based video navigation signal is constructed on the basis of concatenating normal-play GOP-based fragments. As these GOPs typically contain bi-directionally predictive-coded pictures (B-type) potentially referring to the past and near future, these pictures regularly contain errors after decoding due to the absence of reference pictures. The objective of this video processing is to remove these B-pictures and replace them with a repetition picture, repeating the last decoded reference picture. In this way, severe video decoding artifacts are avoided. The discontinuity detection requires access to the PES header, which contains the normal-play DTS


and PTS time-stamp values. The time difference between these time stamps is a multiple of the reciprocal frame rate, typically 2 or 3 frame periods. Concatenation of GOPs that have a temporal distance of seconds leads to a jump in the temporal distance far beyond the 2–3 frame period interval (test discon == true? in Fig. 4.9(a)). For such a situation, the B-pictures are replaced with B-repetition pictures equipped with a temporal reference, which is derived from the removed B-pictures. Furthermore, the DTS and PTS values are updated corresponding to the navigation-playback time base, thereby enabling smooth playback.
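As an illustration of this discontinuity test, the following C fragment sketches the check on 90-kHz PES time stamps; the three-frame-period threshold and the helper name are our own illustrative choices, not taken from the actual implementation.

#include <stdint.h>
#include <stdbool.h>

#define FRAME_TICKS 3600U /* one frame period at 25 Hz on the 90-kHz clock */

/* A DTS jump well beyond the regular 2-3 frame periods marks the
   concatenation point of two non-consecutive normal-play GOPs. */
static bool is_discontinuity(uint64_t prev_dts, uint64_t dts)
{
    return (dts - prev_dts) > 3U * FRAME_TICKS;
}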

Audio processing: Similar to the video navigation, audio navigation is constructed by concatenating frames of audio intervals. Such an audio frame may be equipped with a padding slot, depending on the applied sampling rate, see Section 2.2.3. Concatenation of audio fragments corresponding with normal-play GOPs may result in a compliancy violation, due to the absence or too frequent occurrence of these padding slots. This problem is solved by the algorithmic steps in Fig. 4.9(b), involving first the detection of padding slots (pad. req. == True ?), followed by padding-slot averaging over the received frames. For the situation that the calculated average number of slots does not correspond to an integer value (pad. slot == True ?), a padding slot is inserted. This calculation requires continuous adaptation. Finally, the corresponding DTS value is modified according to the navigation playback time (Set DTS/PTS to trick-play time base), while preserving the audio time synchronization to the video signal.
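The padding-slot bookkeeping can be illustrated with a small sketch. The fragment below assumes MPEG-1 Layer II framing, where the ideal frame length 144 · bitrate/fs is generally fractional, so the fractional remainder is accumulated and a padding slot is inserted whenever a whole byte has accrued; the concrete sample rate and bit rate are example values, not those of the experiments.

#include <stdio.h>

int main(void)
{
    const double fs = 44100.0, bitrate = 224000.0;   /* example values */
    const double ideal = 144.0 * bitrate / fs;       /* ~731.43 bytes  */
    const int base = (int)ideal;
    double remainder = 0.0;

    for (int n = 0; n < 8; n++) {
        int pad = 0;
        remainder += ideal - base;   /* accumulate the fractional part */
        if (remainder >= 1.0) {      /* one whole byte accumulated:    */
            pad = 1;                 /* insert a padding slot          */
            remainder -= 1.0;
        }
        printf("frame %d: %d bytes (padding slot: %d)\n", n, base + pad, pad);
    }
    return 0;
}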

4.4.2 Functional block diagram of the chosen concept

This section briefly discusses the architecture of the implementation of the chosen concept for video navigation. We aim at a functional block diagram which is comparable with the diagrams presented in Chapter 3. It will become clear that there are significant commonalities between the diagrams of the navigation concepts of the previous chapter and the adopted concept in this chapter. Figure 4.10 depicts the PVR functional block diagram, extended with the additional functionality. The signal paths for the various signals in Fig. 4.10 required during navigation are indicated in the figure caption. When the switch at the bottom of Fig. 4.10 is set to path (c), this resembles navigation playback as proposed in Chapter 3, while when signal path (c) is used in combination with signal (d), this resembles audio-enhanced dual-window video navigation. The signal paths (a) and (b) correspond to the normal viewing condition and time-shift recording, respectively. Careful inspection of this block diagram reveals that most of the functional


Figure 4.9 — Video and audio processing steps for navigation. (a) Flowchart of the video algorithm. (b) Flowchart of the audio algorithm.

blocks are identical to the blocks from the diagram of Section 3.7. In contrast, the blocks surrounded by a dashed line contain novel or modified processing and extend the architecture for dual-window navigation. Let us briefly discuss the common blocks and the two dashed blocks.

• Common blocks: During Transport Stream (TS) recording, Characteristic Point Information (CPI) is derived, which amongst others allows tracking of the intraframe-coded pictures of the recorded program. This CPI block is functionally similar to, but slightly different from, the solution in the previous chapter. The difference is that the spatially reduced PiP is also intra-coded, but with a frame-based constraint instead of a slice-based constraint. This information and related parameters are stored in the metadata database in a similar way as in the previous chapter. The audiovisual information stream is written to or retrieved from the storage


Figure 4.10 — Functional block diagram for implementing audio-enhanced dual-window video navigation. Signal path (a) is for real-time viewing. Signal path (b) is for time-shift recording. Signal path (c) is for conventional trick play or contains the main video signal in case of double-window based trick play. Signal path (d) is for the PiP in case of double-window fast-search trick play.

medium via the to-disk block and from-disk block. For the various playback situations, the read-list block determines, via the metadata database, the medium access points of the individual information streams and controls the from-disk block. This block retrieves the normal-play TS, or in case of trick play, the normal-play TS fragments for trick play. The TS fragments for trick play contain a multiplex of audiovisual information, which is required for the audiovisual signal playback in the main window.

• Dashed block: audiovisual navigation proc.: This modified navigation processing block in Fig. 4.10 is not only dedicated to video, but also contains audio processing. It performs an MPEG-2 demultiplexing operation to access the individually compressed video and audio access units. The video processing, operating in the compressed domain, replaces pictures with repetition pictures, as explained above. The selected compressed normal-play audio frames are associated with the video fragments, thereby enabling audio-based navigation in parallel with video navigation for the main window. The processing also involves the padding insertion as previously discussed.

• Dashed block: Trick-play decoding and mixing: This second dashed block performs the involved video decoding and associated video mixing of the fast-search signal and the main signal. The pictures constructing the fast-search trick-play signal are retrieved from the metadata database.

4.5 Computation-reduced H.264/MPEG4-AVC intraframe decoding

Up to this point, the work on video navigation techniques has been based on MPEG-2 compressed signals. During this research, HD video communication and storage has become popular and has introduced follow-up standards for High-Definition (HD) compressed video. For HD video compression, the H.264/MPEG4-AVC standard has been widely accepted and applied. This standard is also based on motion-compensated DCT coding and achieves approximately a factor of two higher compression compared to MPEG-2 for a similar quality. This growth in video compression is achieved at the expense of an approximately four times higher complexity. For the details of the standard, we refer to Chapter 2, in particular Section 2.3.

The question arises whether the proposed navigation techniques in this thesis are also applicable to this new standard. This section elaborates on this aspect and will show that it is indeed possible to apply the proposed navigation techniques to H.264/MPEG4-AVC-compressed video sequences. This positive conclusion is expected, considering that the standard is also based on DCT coding in combination with motion-compensated predictive coding. Also in H.264/MPEG4-AVC, GOPs of consecutive video frames are constructed, starting with an intra-coded picture. However, the concept of predictive coding of successive images is different and is used more intensively, so that it is more difficult to collect reference frames. All of this makes video navigation clearly more complicated to implement. This motivates the discussion in this section on complexity-reduced decoding of H.264/MPEG4-AVC intra-coded pictures.

We have discussed scalable MPEG-2 decoding as an option for complexity reduction to facilitate the implementation of embedded navigation. For H.264/MPEG4-AVC, such techniques have also been explored, which was briefly addressed in Section 4.2. It was concluded that this solution suffers from pixel drift, leading to a clear deterioration of the image quality, which is not acceptable for PiP images of reasonable size. In this section, we therefore present two alternative concepts for complexity-reduced H.264/MPEG4-AVC intraframe decoding that avoid pixel drift, enabling the derivation of PiP-sized images with sufficient quality for fast-search navigation. The first concept exploits partial

calculation of the spatial prediction block, thereby ensuring exact calculation of the pixels employed for spatial prediction. In this way, drift can be circumvented. The pixel locations that are not exactly calculated obtain a pixel value based on duplicating the nearest exact reference pixel. This leads to the second concept, which is an add-on to the first concept. Whereas in the first concept the non-reference pixels are also transformed and decoded as a residual signal, the second concept omits this transform calculation for the non-reference pixels to further reduce the complexity. This approach is acceptable as the target fast-search resolution for navigation is clearly lower than the original signal resolution.

4.5.1 Partial reconstruction of the prediction block

H.264/MPEG4-AVC intraframe compression employs a block-based prediction signal to reduce a set of pixel values prior to DCT transformation. For the construction of this block-based prediction signal, a set of reference pixels is employed, which are located at the bottom rows and at the outer-right columns of the neighboring 4 × 4, 8 × 8 or 16 × 16 pixel blocks. Since H.264/MPEG4-AVC intraframe decoding (see Fig. 4.2) is similar to MPEG-2 and a reduced spatial resolution is targeted, it is plausible that during decoding, the complexity of computing the transform block and spatial prediction block can be reduced by partial calculation (as in scalable MPEG-2 decoding). In this approach, pixels not involved in the spatial prediction and not required for the derivation of the lower-resolution picture are omitted from calculation.

A. 1st concept: Partial calculation of block-based spatial predictor
Up to the Main profile, H.264/MPEG4-AVC intraframe compression deploys spatial prediction on the basis of a 4 × 4 or 16 × 16 pixel block size. For profiles based on the Fidelity Range Extensions (FRExt), an additional 8 × 8 block size is employed. Similar to the 4 × 4 block size, with the 8 × 8 block size a spatial prediction is conducted prior to the 8 × 8 transformation. Unlike the calculation of a 4 × 4 predictor, the pixels involved in calculating an 8 × 8 predictor are low-pass filtered prior to the calculation of the block-based predictor. Both 4 × 4 and 8 × 8 block sizes use 9 spatial prediction techniques, while the 16 × 16 block size uses only 4 spatial prediction techniques. Depending on the spatial prediction mode, arithmetic operations are employed to calculate the final block-based prediction signal, using neighboring reference pixels. Furthermore, potential spatial prediction pixels, located at the bottom row and outer-right column of a 4 × 4 or 8 × 8 pixel block, may also be used inside such a predictor block, see the gray locations in Fig. 4.11. Note that the index at the various pixel locations in Fig. 4.11 corresponds to the pixel index as deployed in the H.264/MPEG4-AVC reference software model JM18.2 to calculate the predictor,

where the three rightmost blocks have the origin at the upper-left corner, while the other blocks have the origin elsewhere. The caption indicates the prediction mode. For spatial prediction, an exactly calculated pixel means a pixel that is calculated with all required computations as specified by the standard. We also call such pixels reference pixels. The five shown prediction modes of Fig. 4.11 have clearly fewer exact reference pixels than the number of pixels in the block, so that the duplication provides a direct computation reduction. However, there are up to 9 prediction modes, of which 4 modes are implemented without computation reduction. Of these 4 modes, the vertical and horizontal prediction modes duplicate reference pixels only, involving no additional computations. Of the remaining 2 modes, the DC-based prediction mode calculates the average of the available reference pixels, which cannot be reduced. The last remaining mode is the diagonal-down-right prediction mode, which already employs the calculated reference pixel within the block for repetition. As a conclusion, only the 5 modes from Fig. 4.11 benefit from complexity reduction. Figure 4.11 indicates in gray the prediction pixels that are exactly calculated. All white pixels are obtained by repetition. Although the objective is to calculate only the prediction pixels at the bottom-row and outer-right block boundaries, a substantial number of non-boundary pixels are also still correctly calculated, which limits the degradation of the block-based predictor. Besides a 4 × 4 and 8 × 8 block-based predictor, H.264/MPEG4-AVC also employs 16 × 16 block-based prediction, see Fig. 4.12. Of the four 16 × 16 prediction modes, only the plane prediction mode, as shown in Fig. 4.12, benefits from the reduced calculation approach. Again, the white pixel positions

Figure 4.11 — Block-based predictor for 4 × 4 and 8 × 8 pixel prediction modes in H.264/MPEG4-AVC, utilizing partial calculation. Gray pixels are exactly calculated, whereas white pixels are duplicates of the nearest exact neighbor. (a) Horizontal-down prediction. (b) Vertical-right prediction. (c) Vertical-left prediction. (d) Horizontal-up prediction. (e) Diagonal-down-left prediction.


Table 4.2 — Computational reduction in percentages for various intraframe 4 × 4 and 8 × 8 block-based predictor calculations.

                       Calculation reduction          Calculation reduction
                       for 4 × 4 block size (%)       for 8 × 8 block size (%)
Prediction mode        Add    Mult   Div              Add    Mult   Div
Diagonal down left     45     42     42               46     46     46
Vertical right         27     16     30               30     21     31
Vertical left          36     20     40               42     27     45
Horizontal up          40     50     50               67     42     71
Horizontal down        24     16     30               30     21     31

obtain a predictor value based on the nearest gray pixel location. It should be noted that the first concept is executed with the standard H.264/MPEG4-AVC de-blocking filter switched off, to save on complexity and because the navigation picture resolution is limited. The obtained computational reduction for the 16 × 16 plane-based predictor equals 75% of the involved arithmetic operations: addition, multiplication, subtraction and division. Figure 4.12 reveals that not only the shown row and column predictor pixels are calculated exactly to avoid drift, but also 9 predictive samples located on a subsampled grid of 4 × 4 pixels. In this way, the PiP quality is optimized when deriving a grid that is downsampled by a factor of four horizontally and vertically. Table 4.2 shows the computation reduction in terms of percentages for the involved arithmetic operations for the 5 spatial prediction modes of Fig. 4.11.
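The duplication of non-calculated predictor pixels can be sketched generically as follows; the mode-dependent gray/white patterns of Fig. 4.11 are abstracted here into a Boolean mask, and the nearest-neighbor search is an illustrative stand-in for the (much cheaper) fixed copy patterns that an actual decoder would use per mode.

#include <stdbool.h>

/* pred[][] initially holds valid values only at the exactly calculated
   (gray) positions; every other (white) position copies the nearest
   exact pixel. */
static void duplicate_from_nearest(int pred[4][4], const bool exact[4][4])
{
    for (int y = 0; y < 4; y++) {
        for (int x = 0; x < 4; x++) {
            if (exact[y][x])
                continue;
            int best = -1, bx = 0, by = 0;
            for (int j = 0; j < 4; j++) {      /* find nearest exact pixel */
                for (int i = 0; i < 4; i++) {
                    if (!exact[j][i])
                        continue;
                    int d = (j - y) * (j - y) + (i - x) * (i - x);
                    if (best < 0 || d < best) {
                        best = d; by = j; bx = i;
                    }
                }
            }
            if (best >= 0)
                pred[y][x] = pred[by][bx];     /* replicate                */
        }
    }
}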

Figure 4.12 — Reconstructed 16 × 16 plane-based prediction in H.264/MPEG4-AVC for reduced calculation in intraframe decoding. Gray pixels are exactly calculated.


Table 4.3 — Number of operations (# ops) required for regular and partial calculation of a 4 × 4 and 8 × 8 block in the H.264/MPEG4-AVC inverse DCT transformation.

Transform     Operation      # Ops / block    # Ops / block         Reduction
block size    (add, shift)   normal dec.      partial calc. dec.
4 × 4         +              64               49                    24%
              ≫             16               13                    50%
8 × 8         +              512              365                   29%
              ≫             160              118                   26%

B. 2nd concept: Partial removal of the transformation of the residual block
This part refers to the add-on concept to further reduce the computations involved in intraframe H.264/MPEG4-AVC decoding. When partially calculating the spatial prediction block, certain pixels cannot be properly reconstructed. The complexity can be further reduced by employing a partial inverse transformation instead of a full transformation, thereby restricting the IDCT computation to only those pixels that are located at the block bottom row and outer-right column, which are always available, see e.g. Fig. 4.11. For the 4 × 4 and 8 × 8 IDCT, this principle of calculating only the bottom-row and outer-right column pixels is always applied. This yields a reduction in the number of operations, as shown in Table 4.3. The 2D IDCT deployed in H.264/MPEG4-AVC can be written as a matrix calculation X = A^T Y A, where matrix Y contains the transform coefficients, A denotes the IDCT transform matrix and X contains the output result of the inverse 2D transform. From the notation X = A^T Y A, it is evident that the 2D IDCT is calculated in two separable stages. A computational reduction is obtained only in the second stage, when the transform is calculated only for matrix elements located at potential spatial prediction locations, i.e. the bottom row and outer-right column. This partial inverse DCT transformation is used for decoding the residual signal that is added to the prediction signal.
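A minimal sketch of this partial inverse transform for the 4 × 4 case is given below, using the well-known H.264 integer-transform butterflies: the first (horizontal) stage is computed in full, while the second (vertical) stage is evaluated only for the bottom row and the outer-right column; all other output positions are intentionally left unwritten (to be filled by duplication). Dequantization is omitted for brevity.

static void partial_idct4x4(const int in[4][4], int out[4][4])
{
    int m[4][4];

    /* Stage 1: full horizontal transform of all rows. */
    for (int i = 0; i < 4; i++) {
        int e0 = in[i][0] + in[i][2];
        int e1 = in[i][0] - in[i][2];
        int e2 = (in[i][1] >> 1) - in[i][3];
        int e3 = in[i][1] + (in[i][3] >> 1);
        m[i][0] = e0 + e3;
        m[i][1] = e1 + e2;
        m[i][2] = e1 - e2;
        m[i][3] = e0 - e3;
    }

    /* Stage 2: vertical transform, evaluated only at the potential
       spatial-prediction reference positions. */
    for (int j = 0; j < 4; j++) {
        int e0 = m[0][j] + m[2][j];
        int e1 = m[0][j] - m[2][j];
        int e2 = (m[1][j] >> 1) - m[3][j];
        int e3 = m[1][j] + (m[3][j] >> 1);
        out[3][j] = (e0 - e3 + 32) >> 6;      /* bottom row          */
        if (j == 3) {                         /* outer-right column  */
            out[0][3] = (e0 + e3 + 32) >> 6;
            out[1][3] = (e1 + e2 + 32) >> 6;
            out[2][3] = (e1 - e2 + 32) >> 6;
        }
    }
}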

4.6 Experimental results

In this section, we discuss four aspects of the proposed navigation system. First, the scalable MPEG-2 decoding performance is presented, obtained for the algorithmic simplification proposed in Section 3.6.2, using IDCT computation reduction via adds and shifts. Second, for both H.264/MPEG4-AVC intraframe decoding concepts, the final picture quality is presented. Third, for these concepts, the complexity in terms of execution cycle load is given. Fourth, the proposed audio-enhanced dual-window navigation is perceptually evaluated by a small test panel.

The experiments are performed with the following experimental setup. For scalable MPEG-2 decoding, the algorithmic simplification is incorporated in the MPEG-2 video software reference model (ISO/IEC 13818-5), to evaluate the final intraframe-decoded picture quality. The reductions for calculating the partial spatial predictor block and the partial IDCT computation, as discussed in Section 4.5, are incorporated in the H.264/MPEG4-AVC reference software model JM18.2, in order to evaluate the final picture quality. This modified test software model is executed on a regular PC platform. In the first concept, the intraframe decoder employs a partial calculation of the spatial prediction block and omits the de-blocking filter. In the second concept, the first concept is extended by further reducing the decoding complexity through the partial calculation of the residual signal.

4.6.1 Picture quality of scalable MPEG-2 SD video decoding

This subsection presents the subjective and objective picture quality of the fast-search navigation based on scalable MPEG-2 decoding. Figure 4.13 shows a set of 8 QCIF-resolution pictures obtained with scalable MPEG-2 decoding. The subjective quality is good, especially when considering that these playback images are shown in fast-search mode, which makes detailed scene analysis by the viewer impossible, as they only serve the purpose of giving the viewer a global overview of the video contents. Table 4.4 shows the objective picture quality measured with scalable MPEG-2 decoding containing our algorithmic simplification. The label of each image corresponds to the picture labels in Fig. 4.13. The PSNR varies between 26–32 dB and depends considerably on the image content. Although a PSNR of 26 dB is at the lower bound for normal-play video content, for fast-search video information this is certainly acceptable for the employed navigation playback speeds (each picture is only shown for 240 ms with Ps = 12).
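For reference, the PSNR figures in Table 4.4 follow the usual definition for 8-bit video, PSNR = 10 log10(255² / MSE); a straightforward C implementation is sketched below (the caller must guard against a zero MSE).

#include <math.h>
#include <stddef.h>

static double psnr_8bit(const unsigned char *ref, const unsigned char *dec,
                        size_t n)
{
    double mse = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = (double)ref[i] - (double)dec[i];
        mse += d * d;                    /* accumulate squared error */
    }
    mse /= (double)n;
    return 10.0 * log10((255.0 * 255.0) / mse);
}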

4.6.2 Picture quality for H.264/MPEG4-AVC decoding concepts 1 & 2

For modern storage devices and HD broadcast TV, the navigation will be based on the H.264/MPEG4-AVC standard. In the previous section, two concepts were proposed, aiming at a partial calculation of the spatial prediction block and a partial IDCT calculation. Figure 4.11 and Fig. 4.12 in Section 4.5 visualize


Figure 4.13 — Visual results for QCIF-resolution PiP images obtained by scalable MPEG-2 decoding of SD intra-coded pictures, corresponding to fast-search play- back speed with Ps = 12.

in gray the exactly calculated pixels according to the standard and in white the replicated pixels. We first discuss the quality of these concepts and later the execution complexity.

Concept 1
Table 4.5 indicates the objective picture quality for full-HD decoded pictures, based on Concept 1 (partial spatial prediction without de-blocking filter). The quality of this navigation signal is first evaluated at full-HD resolution, to mimic the case that the PVR platform is equipped with a video scaler. Examples of visual results obtained with Concept 1 are depicted in Fig. 4.14 and Fig. 4.15. The shown error-difference signal is amplified by a factor of 64 for visualization only, as the average distortion is below unity. The printed pictures from Fig. 4.14 and Fig. 4.15 have also been visually inspected on a full-HD receiver

Table 4.4 — Objective image quality for QCIF-resolution PiP images obtained by scalable MPEG-2 decoding of SD intra-coded pictures, corresponding to fast-search playback speed with Ps = 12 (the image index corresponds to Fig. 4.13).

Image        1       2       3       4       5       6       7       8
PSNR (dB)    32.24   28.99   27.08   27.92   28.68   26.91   28.65   28.25


Figure 4.14 — Final PiP image quality obtained from decimation-filtered and downsampled intra-coded full-HD images according to Concept 1. The left column shows the derived PiP images. The right column shows the difference between the left image and a PiP image obtained with full-HD standard decoding followed by decimation-filtered downsampling. For visualization, the error is amplified by a factor of 64.


via HDMI reception. The navigation pictures were viewed at a PiP resolution of 480 × 270 pixels. The quality of these pictures appears to be quite good and well suited for navigation purposes. The printed pictures already indicate that there are no severe distortions. The right column in Fig. 4.14 and Fig. 4.15 shows that the coding error is equally distributed over the available textured areas in the picture and that there are no extreme image degradations. Although the intraframe-decoded picture quality is negatively influenced by the pixel replication process, the objective picture quality drops considerably but not dramatically, staying in the 30-dB range (31–37 dB), see Table 4.5, compared to the quality range without pixel replication (37–40 dB), which is indicated in the column to the right of the discussed column. It can be seen that the absolute quality of the H.264/MPEG4-AVC decoding process is dominant in the reference quality and that the complexity reduction causes only a minor image-quality degradation. This is confirmed by the small average error indicated in the left column of Table 4.5, despite the spurious maximum errors in the signal. Note that due to the absence of a de-blocking filter, small deviations also occur in the black regions. However, a good-quality PiP can be obtained, see the rightmost column of Table 4.5 for fast-search. The PiP image was obtained with a partially decoded image and applying a 7-tap horizontal and 5-tap vertical decimation filter. The evaluated quality in the range of 39–46 dB is derived by comparing a decimation-filtered downscaled full-HD decoded picture with the PiP

Table 4.5 — Objective quality of PiP pictures derived from decimation-filtered full intraframe-decoded H.264/MPEG4-AVC pictures involving partial spatial prediction block calculation, replication of calculated prediction pixels and omitted de-blocking filter.

                                    Partially dec.   Full dec.   PiP
Test image   Average   Maximum      trick play       luminance   luminance
(QP=28)      error     error        lum. PSNR (dB)   PSNR (dB)   PSNR (dB)
Image 1      1.87      169          36.81            40.01       46.41
Image 2      2.10      189          36.25            39.42       45.49
Image 3      2.48      213          33.90            38.85       43.20
Image 4      3.04      198          31.45            38.17       39.85
Image 5      1.84      169          37.99            40.08       48.83
Image 6      3.20      189          32.15            37.62       40.05
Image 7      3.17      142          33.32            37.64       41.02
Image 8      2.03      178          37.70            40.38       46.33


Figure 4.15 — Final PiP image quality obtained from decimation-filtered and downsampled intraframe-decoded full-HD images according to Concept 1, with the same meaning as Fig. 4.14.

picture based on Concept 1. The quality indicated in this column is a relative quality, for which it should be noted that the absolute quality of the PiP picture is much lower due to the spatial decimation and filtering. Given the high numbers in this column, it can be concluded that the proposed Concept 1 processing


hardly degrades the image quality compared to the fully standard-decoded PiP picture.

Concept 2
Table 4.6 indicates the objective picture quality for PiP-sized fast-search video navigation images, based on Concept 2 (partial spatial prediction and partial IDCT without de-blocking and decimation filtering). This simplified result of Concept 2 is obtained by comparing it with a fully processed normal-sized picture, which is afterwards scaled to a PiP-sized picture (incl. decimation etc.). The achieved quality in terms of PSNR is in the range of 23–30 dB, which is significantly lower than for Concept 1. Also, the average error is considerably larger than for Concept 1. The question arises whether this quality is sufficient for navigation purposes. Examples of visual results obtained with Concept 2 are depicted in Fig. 4.16. The left column shows the final PiP images, while the right column depicts the error differences obtained from a comparison between PiP-sized Concept-2 images and the previously mentioned fully processed images including filtering. The error-difference signal is amplified by a factor of 8 for visualization only, as the average absolute distortion varies between 3 and 8. The intensity of the error is clearly higher than with Concept 1 and the error appears strongly at the edges of objects, instead of the smooth error distribution obtained with Concept 1. Again, these pictures were also visually inspected with a TV receiver, as in the previous case with Concept 1. The navigation pictures were again observed at a PiP resolution of 480 × 270 pixels. The quality of these pictures seems to be attractive for navigation purposes, as they show

Table 4.6 — Objective quality of PiP pictures derived from downsampled partially intraframe-decoded H.264/MPEG4-AVC pictures according to Concept 2.

Video seq.   Avg.    Max.    PiP PSNR
(QP=28)      error   error   luminance (dB)
Image 1      4.80    144     27.09
Image 2      5.39    140     27.24
Image 3      8.09    170     24.93
Image 4      9.28    176     23.11
Image 5      5.17    169     26.94
Image 6      8.48    151     24.26
Image 7      7.09    155     26.45
Image 8      3.42    132     30.38


Figure 4.16 — Final PiP image quality obtained from downsampled intraframe-decoded HD images according to Concept 2. The left column shows the derived PiP images. The right column shows the difference with a PiP image obtained with full-HD decoding according to the standard, followed by decimation-filtered downsampling. For visualization, the error is amplified by a factor of 8.

a certain subjective sharpness due to aliasing combined with the limited picture size. The printed pictures already indicate the presence of this subjective sharpness.


Figure 4.17 — Enlarged views of image parts selected from Fig. 4.16, processed with Concept 2, showing the distortion caused by severe aliasing due to the lack of filtering.

Fortunately, this is acceptable because during navigation, successive images have less temporal correlation, which masks the introduced distortion and sometimes reduces the temporal fluctuations (depending on the contents). In Fig. 4.17, zoomed fragments are depicted, revealing the difference between regularly derived PiP images and PiP images computed with Concept 2. The distortions in textured areas are clearly noticeable. We have concluded that the subjective picture quality of the pictures in the right column, see Fig. 4.17, is still sufficiently good for fast-search video navigation, since potential artifacts are camouflaged by the fact that the fast-search sequence is constructed using temporally non-neighboring pictures. This causes successive navigation pictures to be less correlated, thereby masking some of the distortion.


Figure 4.18 — CPU cycle load as a function of the navigation frame rate for both concepts and full decoding.

4.6.3 Computation reduction for H.264/MPEG4-AVC decoding

The previous subsection has provided the resulting picture quality associated with the proposed concepts. Let us now briefly investigate the benefit of the algorithmic simplifications in terms of cycle load. The proposed algorithmic simplifications are incorporated in the H.264/MPEG4-AVC reference software model JM18.2. The concept algorithms are executed on a PC platform to measure the performance impact. Although the platform contains a multi-core CPU (i5 core, 2.66 GHz), the program does not exploit the multi-core parallelism and is executed on a single core, to mimic a single embedded CPU. The measured cycle load for the two concepts is depicted in Fig. 4.18. It can be observed that applying Concept 1 results in an overall cycle-load reduction of 17%. When extending this into Concept 2, the computation reduction is doubled to 34%. Furthermore, Fig. 4.18 clearly shows the computational benefit of lowering the picture refresh rate for deriving the fast-search video navigation signal. Table 4.7 shows the computation reduction for the main functions. The largest cycle-load reduction is obtained by the “memcpy” instructions, the IDCT and the absence of the de-blocking filter, whereas the cycle-load reduction for the spatial prediction block is relatively modest.


4.6.4 Test panel results on audio-enhanced dual-window navigation

The novel audio-enhanced navigation method has been subjectively tested using a small test panel consisting of 13 people. The tests are conducted using three different 25-Hz video sequences covering a news broadcast, a music video and a movie sequence. The proposed video navigation sequences have been derived using the following parameters. A relative playback speed of Ps = 12 has been employed for the fast-search video sequence, resulting in values for ts,k that are integer multiples of 0.48 seconds. The parameter Tnp is set to 3 seconds, resulting in values for tn,m that are integer multiples of 36 seconds. These values are partly imposed by the normal-play video encoding settings, while the value of parameter Tnp has been determined empirically. No additional processing has been applied to fine-tune the selection of the normal-play audiovisual fragments. Figure 4.19 presents a 240-ms snapshot showing a dual-window video navigation signal for the movie sequence employed in the subjective test.

The test panel members were supplied with a questionnaire containing seven questions, as listed in Table 4.8. The test panel scores for the questionnaire are depicted in Table 4.9. The proposed video navigation method is well perceived, with an accumulated score for the columns “good” or better of more than 50 % for all questions, although there are navigation aspects for which an almost equal percentage claims the opposite. The test panel is clear on the fact that audio adds meaningful additional information during video navigation and that this method provides a global indication of the stored audiovisual content. However, the viewer does not always obtain guidance from the audio to switch between the normal-play video fragments and the PiP-based fast-search video information. Furthermore, there is also a strong variation in appreciation of the audio duration. One of the reasons for this is that the selection of the normal-play audiovisual fragments lacks additional knowledge such as scene-change detection or sound detection, resulting in a regular

Table 4.7 — Computation reduction for Concepts 1 and 2 with SW-based H.264/MPEG4-AVC intraframe decoding, divided over the main navigation functions.

Concept   IDCT    Sp.Pred.   Sp.Pred.   Sp.Pred.   Memcpy   Remain.   Total
                  8 × 8      4 × 4      16 × 16    code     code
1         0%      7%         9%         25%        0%       23%       17%
2         42%     7%         9%         25%        94%      23%       34%


Figure 4.19 — Example of a 240-ms snapshot with consecutive audio-enhanced dual-window navigation for SD video. The primary window presents normal-play fragments, while the PiP window shows fast-search playback with Ps = 12.

occurrence of audio intervals that contain no meaningful auditory information. Finally, the test panel indicates that an audio-enhanced double-window navigation method does not solve the navigation-efficiency problem for high-speed navigation playback.


Table 4.8 — Questionnaire for the test panel on audio-enhanced navigation.

Question   Text
Q1         Does the audio provide additional meaningful information?
Q2         Does the audio facilitate switching between main and PiP-window?
Q3         Does audio provide extra information when viewing the PiP-window?
Q4         Does the PiP window provide a coarse overview?
Q5         Does a 3-sec. fragment provide sufficiently detailed information?
Q6         Does the navigation provide a global impression of the AV content?
Q7         Does the navigation perform well for high speedup factors?

Table 4.9 — Scores of test panel on the questionnaire of Table 4.8.

             Test panel scores
Question     Poor   Reasonable   Good   Very good   Excellent
1            0      1            6      4           2
2            2      4            6      1           0
3            0      3            8      2           0
4            1      1            5      6           0
5            0      6            2      4           1
6            0      2            6      4           1
7            0      6            3      4           0

4.7 Conclusion

In this chapter, we have proposed a video navigation method addressing the medium-time interval video navigation use case. This method employs multiple information signals to improve the perception of navigation playback. The method encompasses normal-play video fragments and corresponding audio information in combination with an additional fast-search video navigation window. The normal-play video fragments are presented in a primary window of full size, whereas the fast-search video navigation is presented in a smaller secondary Picture-in-Picture (PiP) window, resulting in a dual-window video navigation solution. This navigation form can be employed as a viewing mode to summarize the stored information and offers the possibility of conducting activities in parallel to video navigation. The proposed navigation concept is not networked and requires local audiovisual decoding.

Algorithms for navigation signal processing. The algorithms for constructing the

navigation signal are as follows. The main navigation signal is based on re-using normal-play audio and video fragments of typically 3 seconds in length (approx. 6 GOPs), which are fully decoded. The fast-search navigation signal is derived from decoded normal-play intra-coded pictures, which are scalably MPEG-2 decoded already during recording. The scalability enables downscaling during decoding, since the intended picture quality is lower. These lower-quality pictures are stored, next to the other essential Characteristic Point Information (CPI), in the metadata database.

During navigation playback, this downscaled signal is recovered, decoded and mixed with the main navigation signal. When deriving an audiovisual navigation signal on the basis of concatenating non-consecutive normal-play GOPs, the concatenated stream needs to be modified to ensure MPEG-compliant formatting and to enable seamless decoding. This modification involves, amongst others, audio padding, removal of predictive pictures without a reference and modification of the time base for navigation.

Computation-reduced H.264/MPEG4-AVC intraframe decoding. The usage of this advanced audio-enhanced dual-window navigation involves more complicated embedded processing that should be mapped onto a consumer DTV platform. For this reason, two concepts are proposed to reduce the complexity of the embedded signal processing, particularly for H.264/MPEG4-AVC-coded signals typically used for HD television. In the first concept, the reduction is achieved by using reference pixels and repetition pixels in the calculation of the spatial prediction block. The selection of these pixels is guided by the spatial prediction patterns employed in the H.264/MPEG4-AVC standard. Also, the de-blocking filter is switched off. The second concept is an extension: on top of the previous measures, a partial IDCT is performed. In the IDCT computation, only reference pixels are reconstructed, thereby avoiding the reconstruction of residual information based on replication pixels. It was measured that, in terms of the cycle load of the software execution, the complexity reduction corresponding to Concepts 1 and 2 was 17% and 34%, respectively. Recording of the PiP signal is typically done at a 2-Hz rate only, so that the extra processing load is small.

Picture quality of scalable MPEG-2 decoding. The software simulations performing scalable MPEG-2 intraframe video decoding show a good subjective video quality. The objective quality in terms of PSNR is in the range of 26–32 dB. For video navigation, this quality is sufficient, as the viewer has insufficient time to visually inspect the individual images. Furthermore, due to the typically limited temporal correlation between successive video navigation images, individual impairments are masked.


Picture quality of computation-reduced H.264/MPEG4-AVC decoding. The simplified H.264/MPEG4-AVC images have a good subjective quality for navigation purposes. When using Concept 1, the PSNR is in the range of 39–46 dB when deriving a PiP-sized picture with decimation filtering followed by subsampling. When further reducing the complexity with Concept 2, the decoded images show a considerable distortion, with an objective quality in terms of PSNR of 23–30 dB. Because of the reduced size, these PiP-sized images still have a sufficiently good subjective quality, due to aliasing caused by the absence of filtering, which is partly perceived as subjective sharpness.

PVR architecture. The architecture of the proposed audio-enhanced dual-window navigation appears to be an extension of the PVR functional block diagram that was established in the previous chapter. The extra functional blocks are mainly associated with the recording processing of the PiP signal, the PiP decoding during playback and the mixing with the primary window. The main signal also requires additional audio processing such as padding adaptations, while the video processing addresses open GOPs by detecting and removing predictive-coded pictures without a temporal reference.

In this thesis, we have presented three video navigation solutions, each addressing a particular time interval over which the navigation is conducted. The deployed concepts show a high commonality and differ mainly in the way of presenting the video navigation information. A key aspect of this work is the re-use of normal-play encoded audiovisual information, involving specific processing in the MPEG-2 compressed domain, resulting in a compliant video navigation signal. This enables re-use of video and audio decoding components, in such a way that transcoding is avoided and that the additional processing can be embedded on the existing processing units and control CPU. Furthermore, the derived CPI information is re-used by all three video navigation methods. This even holds for the navigation solutions which employ subpictures. Although the bit-cost constraints (slice level versus picture level) regarding MPEG-2 encoding differ for the mosaic-screen solution and the audio-enhanced dual-window solution of video navigation, the latter can re-use the subpictures of the former for constructing the fast-search video navigation sequence. We conclude that, on the basis of the chosen concept with small-picture generation during recording and the associated metadata, the re-use of already coded pictures with scalable and/or partial decoding and the measures for complexity control all together enable the combination of the proposed PVR concepts into a framework that can handle short-, medium- and long-time interval video navigation. Moreover, this framework would be feasible and realizable with limited complexity.


“Basic research is what I am doing when I don’t know what I am doing.”
Wernher von Braun (1912–1977)

5 Robustness improved DVB-H link layer

5.1 Introduction

In the fall of 2004, the European Telecommunications Standards Institute (ETSI) approved the Digital Video Broadcast Handheld (DVB-H) standard [33], [34], which is specifically tailored to battery-powered mobile reception. Basically, the DVB-H standard is an extension of the already existing DVB-T standard for terrestrial communication, but with extra features added to the physical and link layer. Although DVB-T is capable of providing mobile television reception, it is neither efficient nor robust enough for mobile, handheld battery-powered reception. In order to address these two system aspects, the DVB-H standard incorporates several additional features on top of the DVB-T physical and link layer, which particularly contribute to the efficiency and robustness. This makes DVB-H a superset of DVB-T.

Let us now elaborate further on the essential features of the DVB-H link layer. The DVB-H link layer uses a Time-Division-Multiplex (TDM) broadcast technique, called time-slicing, to transmit a service, enabling power-efficient service reception and service discovery in neighboring broadcast cells (areas). Furthermore, the DVB-H link layer is equipped with a second Forward Error Correction (FEC) layer, called MPE-FEC, which forms part of the link layer, applying a [255,191,65] Reed-Solomon (RS) code. The coding distance d = 65 of this RS code allows to correct up to e erasures and t errors as long as the inequality 2t + e < d is satisfied.

Although this second FEC layer contributes to an improved robustness, the improvement comes at the expense of duplicated IP datagram exchange between the data link layer and the network layer, and out-of-order reception of those datagrams by the network layer, resulting in additional data processing and communication. The root cause of this additional data processing is that the DVB-H standard does not facilitate the retrieval of only the correctly received IP datagrams from a defective MPE-FEC frame after the link-layer correction (the MPE-FEC frame is the data field area upon which the secondary FEC is active). As a consequence, all correctly received IP datagrams are already forwarded to the network layer prior to the correction of the whole application data table by the link-layer FEC, and stored in the MPE-FEC frame (the application data table forms the memory containing the IP datagrams). When the link-layer FEC has corrected all erroneously received data, all IP datagrams (correctly received and earlier erroneously received) are forwarded to the network layer, resulting in (1) data duplication of already correct IP datagrams and (2) frequently occurring out-of-order reception due to forwarding of already correct IP datagrams prior to communicating a possibly fully corrected MPE-FEC frame.

This chapter aims at solving the previously described inefficient IP datagram communication, thereby improving the associated energy consumption. Furthermore, we will propose a new algorithm to further improve the robustness of the DVB-H communication, by still employing corrected IP datagrams within a defective MPE-FEC frame. When these improvements have been established, it will be shown that both the robustness and the associated Quality of Service (QoS) result in a significantly improved performance.

The sequel of this chapter is organized as follows. Section 5.2 elaborates on the standard link layer as part of the DVB-H standard. Section 5.3 introduces a novel concept for a DVB-H link layer with minimized data communication and improved QoS. Section 5.4 presents our improved DVB-H link layer, while Section 5.5 provides the corresponding experimental results. Finally, conclusions are presented in Section 5.6.
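The error/erasure trade-off of the [255,191,65] RS code can be made concrete with a small sketch: for a given number of flagged erasures e, the maximum number of additionally correctable errors t follows from 2t + e ≤ d − 1.

/* [255,191,65] RS code: t errors and e erasures are correctable as
   long as 2t + e <= d - 1, with coding distance d = 65. */
static int max_correctable_errors(int erasures)
{
    const int d = 65;
    if (erasures > d - 1)
        return -1;                 /* beyond the correction capability    */
    return (d - 1 - erasures) / 2; /* e.g. e = 0 -> t = 32; e = 64 -> t = 0 */
}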

5.2 DVB-H link layer essentials

This section introduces the standard DVB-H link layer, its framework and its position in a typical DVB-H receiver system. Furthermore, the DVB-H broadcast protocol stack is briefly discussed, together with specific DVB-H features.

A. DVB-H receiver system
The DVB-H standard enables IP-based service reception on mobile handheld battery-powered receivers, based on a Time-Division-Multiplexing (TDM) service broadcast. The DVB-H link layer forms the interface between the physical


Figure 5.1 — DVB-H receiver system.

layer and the network layer, see part C of this section, and demultiplexes the received MPEG-2 TS, provided by the physical layer, into IP datagrams (service), Service Information (SI) and Program Specific Information (PSI), see the PSI section usage in Fig. 5.4. Figure 5.1 shows a basic DVB-H receiver system, consisting of a DVB-H receiver and an audiovisual multimedia decoder with control. In this DVB-H receiver, the tuner is controlled not only by the receiver application, but also by the link layer, enabling a sleep mode, which is facilitated by the TDM-based service broadcast and controlled by the link layer.

The DVB-H broadcast consists of MPE sections containing IP datagrams and MPE-FEC sections carrying Reed-Solomon (RS) parity data.

B. Multi-Protocol Encapsulation (MPE) and MPE Forward Error Correction (MPE-FEC)
DVB-H employs an MPE section [96] to transmit a single IPv4- or IPv6-based datagram. Optionally, RS parity data is transmitted using an MPE-FEC section. An MPE section for DVB-H is based on a modified private section format [36], where the MAC address fields are replaced by the real time parameters fields [96]. The real time parameters fields [97] contain four parameters required by the DVB-H link layer to perform: (1) service synchronization, (2) correct storage of IP datagrams and RS parity in the MPE-FEC frame, (3) power-down control of the receiver front-end and (4) signaling of the end-of-service data. The section carrying the RS parity data is called the MPE-FEC section [97] and is also equipped with the real time parameters fields. The MPE-FEC sections are compliant with the DSMCC section type “User private” [96]. The four real time parameters information fields are summarized below; a parsing sketch follows after the list.

• Address: The address field contains the MPE-FEC frame start address of the first byte of the section payload, enabling an appropriate


placement of received IP datagrams in the application data table or of RS parity data in the RS data table.
• Delta t: The delta t field indicates when the next service burst is broadcast, enabling the receiver to power down between two service bursts.
• Table boundary: The table boundary flag, when set to “0x1”, indicates the last section of the application data table or RS data table.
• Frame boundary: The frame boundary field, when set to “0x1”, denotes that the current section is the last section within the current burst, which is either an MPE section without available FEC data, or an MPE-FEC section.

Figure 5.2 shows an MPE-FEC frame, with column-wise storage of IP datagrams and RS parity data, each in their own table. An IP datagram can be smaller than, larger than, or equal to the MPE-FEC frame height, whereas the length of the RS parity data equals the MPE-FEC table height, which can be 256, 512, 768 or 1,024 rows. Because the storage start position of an IP datagram in the MPE-FEC frame does not necessarily start at a fixed position, correct placement is guaranteed by means of the real time parameters address field, even for error-prone service reception. It is essential to note that in a standard DVB-H link layer, the real time parameters address field is only used once during service reception.
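For illustration, the sketch below parses the 32-bit real time parameters field, assuming the layout of delta t (12 bits), table boundary (1 bit), frame boundary (1 bit) and address (18 bits); the struct and function names are our own, and the 10-ms interpretation of delta t should be verified against the standard.

#include <stdint.h>

struct rt_params {
    uint16_t delta_t;        /* time to next burst (assumed 10-ms units)  */
    uint8_t  table_boundary; /* last section of application/RS data table */
    uint8_t  frame_boundary; /* last section of the burst                 */
    uint32_t address;        /* byte offset into the MPE-FEC frame        */
};

static struct rt_params parse_rt_params(const uint8_t b[4])
{
    struct rt_params p;
    p.delta_t        = (uint16_t)(((unsigned)b[0] << 4) | (b[1] >> 4));
    p.table_boundary = (uint8_t)((b[1] >> 3) & 0x1);
    p.frame_boundary = (uint8_t)((b[1] >> 2) & 0x1);
    p.address        = ((uint32_t)(b[1] & 0x3) << 16) |
                       ((uint32_t)b[2] << 8) | b[3];
    return p;
}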

C. DVB-H Protocol Stack
DVB-H is a broadcast transmission system for datagrams [98]. These datagrams are based on IPv4 or IPv6 [99], [100], or other network-layer datagrams [101],

Figure 5.2 — DVB-H MPE-FEC frame. Starting from the left, the column-wise storage of IP datagrams. At the right, the column-wise storage of RS parity data.


Figure 5.3 — DVB-H broadcast protocol stack.

[102], and they are encapsulated in MPE sections [96]. Figure 5.3 depicts the DVB-H broadcast protocol stack and its relation to the OSI layering, and indicates the position of the DVB-H link layer (corresponding to the same level as the OSI data link layer). The DVB-H link-layer data consists of IP datagrams with MPE-FEC framing, Service Information (SI) sections and Program Specific Information (PSI) sections [103].

Figure 5.4 depicts a standard DVB-H link layer framework, enabling the filtering of multiple SI/PSI sections and IP-based services, which can optionally be protected against errors by FEC. The SI and PSI sections are forwarded to the receiver middleware, while the IP datagrams are forwarded to the network layer.

The DVB-H link layer conducts communication according to the OSI model, whereby each layer provides a service to the layer above, for which it delivers error-free information [104]. This error-free property of the provided information is satisfied with high probability (receivers can have occasional errors) and is ensured by two forms of error detection and correction. Error detection is enabled on the basis of a Cyclic Redundancy Check (CRC), which is part of the MPEG-2 section syntax. In a DVB-H broadcast, all data is transmitted using sections, which are protected with a CRC, mostly avoiding the forwarding of corrupted data to a higher layer. Error correction is an optional DVB-H link layer feature and is only available for IP-based AV and data information.

D. DVB-H link layer framework
Figure 5.4 depicts the framework of the standard DVB-H link layer with the individual processing stages, according to the DVB-H standard [97]. At the top, an MPEG-2 TS is demultiplexed into section-based program description data


Algorithm 11 IP datagram readout
while no padding do
    version ⇐ determine IP version
    if version == IPv4 ∨ version == IPv6 then
        datagram length ⇐ determine IP datagram length
        forward IP datagram to network layer
        application data table read address += datagram length
    end if
end while

(SI/PSI) and section-based audiovisual information. Depending on the payload type, the sections are subject to either SI/PSI or MPE/MPE-FEC filtering. For the situation that the SI/PSI section CRC is correct, the section is forwarded to the middleware, or rejected otherwise. For the situation that an MPE section CRC is correct, the IP datagram is forwarded to the network layer, while the real time parameters (see the right-hand side) are used in the system control. For the situation that the service is FEC protected, the IP datagram is also stored, on the basis of the associated real time parameters address field, in the MPE-FEC frame, enabling FEC decoding in case of reception errors during the service burst. An erroneously received IP datagram is flagged as an erasure, enabling erasure decoding. After reception of an erroneous service burst, two situations can occur: (1) FEC decoding corrects all received errors, or (2) FEC decoding fails to correct those errors. For situation (1), the individual IP datagrams stored in the MPE-FEC frame can be retrieved according to the essential mechanism shown in Algorithm 11. This readout mechanism is essentially based on parsing the IP datagram-length field, which is part of each datagram header. On the basis of this length-field parameter, individual IP datagrams are retrieved from the MPE-FEC frame, thereby supporting services based on variable-length IP datagrams.
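A C rendering of Algorithm 11 could look as follows; forward_to_network_layer() is a hypothetical helper standing in for receiver-specific code, and the loop stops when the next byte does not carry a valid IP version nibble, which indicates the padding area of the application data table.

#include <stddef.h>
#include <stdint.h>

/* Hypothetical receiver-specific helper. */
extern void forward_to_network_layer(const uint8_t *datagram, size_t len);

static void readout_application_data_table(const uint8_t *table, size_t size)
{
    size_t pos = 0;
    while (pos + 6 <= size) {
        unsigned version = table[pos] >> 4;        /* IP version nibble    */
        size_t len;
        if (version == 4)                          /* IPv4: total length   */
            len = ((size_t)table[pos + 2] << 8) | table[pos + 3];
        else if (version == 6)                     /* IPv6: header+payload */
            len = 40 + (((size_t)table[pos + 4] << 8) | table[pos + 5]);
        else
            break;                                 /* padding area: done   */
        if (len == 0 || pos + len > size)
            break;                                 /* malformed: stop      */
        forward_to_network_layer(&table[pos], len);
        pos += len;                                /* advance read address */
    }
}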

The problem of this simple readout mechanism is that, for erroneous MPE-FEC frames, this standard DVB-H link layer cannot make a distinction between correctly received and erroneously received IP datagrams. Depending on the final error status of the MPE-FEC frame after FEC, it may be possible that no additional transmission of the corrected MPE-FEC frame is conducted, or that the MPE-FEC frame is fully forwarded to the network layer, including both the newly error-corrected packets and the already correctly received packets. As a result, all correctly received IP datagrams are sometimes forwarded twice to the network layer. This results in data duplication and out-of-order reception of IP datagrams, particularly when the corrected datagrams have to


Figure 5.4 — Standard DVB-H link layer framework for SI/PSI data, FEC-based and FEC-less IP reception.

be selected from the full MPE-FEC frame transmission. The data duplication is considered the dominating factor for inefficiency. It is evident that this data duplication is the key factor for consuming additional bandwidth in the system, which also leads to additional energy consumption. In the worst case, one reception error in one of the packets of the MPE-FEC frame may result in a re-transmission of the full MPE-FEC frame, a situation that will frequently occur due to various channel impairments.


The above mechanism of re-transmission effectively provides robustness to the communication service, at the expense of extra bandwidth and thus energy consumption. When the number of errors exceeds the error-correcting distance, the corrupted MPE-FEC frame cannot be fully corrected, resulting in the usage of only the original, correctly received datagrams. These packets constitute only a part of the full data set, so that the applied data contains holes, distorting the finally decoded audiovisual sequence. In conclusion, the system will work under normal operating conditions, but will degrade immediately and seriously when the error-correcting distance is exceeded.

5.3 Conceptually improved DVB-H link layer

This section provides a conceptual outline for improving the robustness of the DVB-H link layer, while enabling a smooth signal degradation and aiming at minimizing the data communication.

Figure 5.5 indicates the intended impact of both proposed aspects: improved robustness and smooth signal degradation. This figure shows that when extending the 100 % data-recovery interval from A to A+B, the robustness is improved. Furthermore, a smooth signal degradation is obtained when the steepness of the declining curve is reduced. As a result, the reception interval between 100 % and e.g. 70 % data recovery is extended, offering an acceptable data-recovery degree. Consequently, the potential operational interval is extended with a fraction of the worst-case intervals depicted by C and D. The curve depicted by Fig. 5.5 is a conceptual curve, which requires additional measures to be realized on top of the DVB-H standard, in order to obtain a practical performance gain. Our proposed DVB-H link layer enhancement involves three improvement aspects: (1) higher robustness, (2) smoother degradation, aiming at a better Quality-of-Service, and (3) minimized data communication in order to reduce energy consumption.

• Robustness. Given the fact that the system is used in handheld operation, we expect that the received data is subject to sporadic large random errors and multiple small burst errors. The concept of an MPE-FEC frame with a secondary FEC enables a better exploitation of the error-correcting distance provided by the FEC, as it distributes the errors. It should be utilized in a subtle way to limit and preferably remove the influence of the multiple small burst errors. This means in practice that the decoding algorithm will balance between erasure decoding for error indication and the actual error correction. This approach paves the way for a higher robustness under poor reception conditions, maximally exploiting the error patterns that are just correctable.


• Smooth signal degradation. For the situation that the error pattern exceeds the FEC distance, a smooth signal-degradation behavior is pursued. This behavior can be realized by fully exploiting the error-concealment possibilities enabled by the MPEG decoding algorithm [105]. Such algorithms provide a fair quality when only 70-90 % of the correct data is available. Such a concept leads to an extra level of operation at a somewhat lower quality level, resulting in a better overall Quality-of-Service (QoS).

• Communication bandwidth. To optimize bandwidth in a DVB-H receiver, correctly received IP datagrams are forwarded only once to the network layer. This means that the data-reading process for providing data to the network layer is modified. As a result, two scenarios are supported by the implementation. The first scenario corresponds to the situation that all received IP datagrams can be successfully corrected by the FEC. In the second scenario, not all data can be corrected. In either scenario (i.e. fully corrected or partially corrected MPE-FEC frame), correct IP datagrams are forwarded only once to the network layer.

In the next section, the above aspects and directions for improvement are further elaborated and worked out in detail, considering the underlying DVB-H standard.

Figure 5.5 — Improved robustness and smooth signal degradation intervals, requiring an enhanced DVB-H link layer.


5.4 Enhanced data recovery for an improved DVB-H link layer

This section proposes an algorithm and the corresponding architecture and processing stages for the implementation of the improved DVB-H link layer. The section commences with the approach followed for the solution, followed by the involved algorithms. The corresponding architecture will be presented in Section 5.5. The presented algorithms relate to improved IP datagram processing within FEC decoding, using reliability information derived during service reception and FEC decoding.

5.4.1 Solution approach

The approach for our solution to improve the DVB-H link layer is as follows.

1. The received MPEG-2 TS packets contain the so-called “Transport Error Indicator (TEI)”, which indicates whether a TS packet is correctly received. This TEI parameter is employed to derive an indicator for each data byte in the TS packet payload, indicating whether that byte is reliable or not. This reliability information is kept as side information next to the MPE-FEC frame, to be further employed by the error-correction strategy. Furthermore, the MPEG-2 TS packets also contain a so-called “Continuity Counter (CC)”, which can be used to reveal the absence of one or more TS packets in a sequence of TS packets. If a continuity gap occurs, the missing part of the TS payloads should be indicated as errors in the previously mentioned reliability side information (a minimal sketch of this derivation is given after this list).

2. The received IP datagrams are vertically stored in the MPE-FEC frame, and next to this frame, the side information with the reliability information is filled accordingly. Then, the Cyclic Redundancy Check (CRC) indicates the correctness per vertical column. This is the first step of the FEC decoding. During filling of the columns with TS payload, the start positions of correctly received IP datagrams are separately stored in a second table. This additional start-position information enables recovery of correctly received IP datagrams, irrespective of their positions. This also enables the potential retrieval of a set of correct bytes from the MPE-FEC frame, forming a recovered IP datagram.

3. By storing the FEC decoding result on a per-row basis, the correction status of each row becomes available. This information can be combined with the previously stored reception reliability information and the location information of correctly received IP datagrams. On the basis of this combined information, the location of incorrectly received IP datagrams can be obtained, as well as the reliability status of each IP datagram byte after FEC decoding.


This enables the retrieval of FEC-corrected IP datagrams from defect MPE-FEC frames, even after FEC decoding.
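As an illustration of step 1, the following Python sketch derives per-byte 2-bit erasure flags from the TEI flag and the Continuity Counter of incoming TS packets. All names are hypothetical, and the handling of a CC value arriving in a TEI-flagged packet is simplified compared to the PrevCC update described later; the flag encoding follows Table 5.1.

```python
CORRECT, SOFT_ERASED, HARD_ERASED = 0b00, 0b01, 0b10  # 2-bit erasure flags
TS_PAYLOAD = 184                                      # payload bytes per TS packet

def erasure_flags(packets, expected_cc=0):
    """For each received TS packet, emit one erasure flag per payload byte.
    A TEI=1 packet is soft erased; a CC gap marks the payload bytes of the
    missing packets as hard erased."""
    flags = []
    for tei, cc in packets:                   # (TEI flag, 4-bit continuity counter)
        gap = (cc - expected_cc) % 16         # number of lost packets before this one
        flags.extend([HARD_ERASED] * TS_PAYLOAD * gap)
        flags.extend([SOFT_ERASED if tei else CORRECT] * TS_PAYLOAD)
        expected_cc = (cc + 1) % 16
    return flags
```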

A. Visualization of the proposed concept

The above-proposed solution concept is visualized in Fig. 5.6. The picture shows the MPE-FEC frame (imposed on a gray background), the two additional reliability masks (the 2-bit erasure flags at the bottom and the CRIT beside the frame rows) and the reliable IP datagram location mask at the top (IPET). At the top of the figure, the Internet Protocol Entry Table (IPET) stores the MPE-FEC start addresses of correctly received IP datagrams, while beside the frame rows, the Corrected Row Index Table (CRIT) stores the correctness of the FEC decoding per row. Below the MPE-FEC frame, the reliability information derived during service reception is stored in the form of 2-bit erasure flags, where every byte has its corresponding erasure flag. The proposed solution requires various functional extensions to the standard DVB-H link layer processing, see Fig. 5.7, where these functional extensions are highlighted in gray. At the left side of Fig. 5.7, an MPEG-2 TS enters the DVB-H link layer, which operates in a Time-Division-Multiplexing (TDM) mode on the basis of the derived delta-t value. During the time interval in which the service data

Figure 5.6 — Enhanced DVB-H link layer employing locally derived reliability information (erasure and CRIT tables) and location information for IP datagram recovery (IPET) from defect MPE-FEC frames after FEC.


Figure 5.7 — Functional block diagram of an improved DVB-H link layer.

is transmitted, the DVB-H link layer demultiplexes the incoming MPEG-2 TS on the basis of Packet IDentifier (PID) filtering. In addition, the demultiplexed MPEG-2 TS packets are subject to either SI/PSI section filtering or MPE/MPE-FEC section filtering. The SI/PSI filter extracts sections, which are required by the middleware and are not protected by an additional FEC, but instead are equipped with a CRC to determine correctness. The MPE/MPE-FEC filter decapsulates the IP datagrams and RS parity data from the selected service burst and stores the result in the MPE-FEC frame, enabling an optional MPE-FEC decoder to correct erroneous IP datagrams. During MPE filtering, the IPET generator derives location information from the incoming data, which is stored in the IPET memory. During service-burst reception, the erasure-flag generation fills the erasure-flag table with 2-bit erasure information, which is derived from the PID filtering stage. After receiving the complete service burst, IP readout is conducted. However, in case of detected errors, the MPE-FEC decoder is invoked on each MPE-FEC row to correct erroneously received data. The CRIT generator stores the result of the MPE-FEC decoder on a per-row basis in the CRIT table. For the situation that the MPE-FEC decoder corrects the complete MPE-FEC frame, IP datagram readout is conducted, which may involve an optional IP datagram filter, avoiding forwarding of undesired service data and thereby further improving the receiver power efficiency.


Figure 5.8 — Typical normalized number of defect bytes in a TS packet received under different conditions, where the transmission parameters are set to 16-QAM, 8K FFT and guard interval 1/8. The real probability value can be found by dividing by 9.

For the situation that the MPE-FEC decoder fails to correct an erroneously received MPE-FEC frame, IP datagram recovery is performed by the IP readout functional block by changing the reception reliability information, on the basis of the reliability information from both the CRIT and the 2-bit erasure flags, combined with the storage locations of the correctly received IP datagrams.

B. Validation of the channel error model

In Section 5.3, an assumption was made concerning the distribution of channel errors: multiple small burst errors versus a few long burst errors as the average behavior. This assumption is validated with a practical setup based on the official DVB-H guidelines for implementation [34]. The practical conditions are based on a mobile channel (TU6), with the following modulation settings: 8K FFT, guard interval 1/8, 16-QAM, convolutional code with R=2/3. The receiver does not use advanced Doppler-compensation techniques. With these channel conditions, the corresponding error distributions have been measured, of which the normalized results are shown in Fig. 5.8.


The conclusion of this measurement is that the average error behavior of the channel indeed results in the occurrence of multiple small byte errors and rarely in long burst errors. A closer inspection of the figure reveals that a small amount of burst errors is indeed corrected by the channel FEC, see the left part of the curve. However, a large portion of the occurring errors cannot be corrected, which is shown by the immediate growth of the error-distribution curve. Hence, the Reed-Solomon (RS) FEC employed by the channel decoder, i.e. the [204,188,17] RS code, gives insufficient correcting performance, despite the column-row interleaving to spread the errors. As a consequence, although the DVB-H physical layer is equipped with a primary [204,188,17] RS FEC, an MPEG-2 TS packet can still be defect after channel decoding. The DVB-H standard has addressed this shortcoming by inserting a secondary FEC layer into the link layer. This means that the channel decoder first removes errors from the received TS packets, while the second RS code is utilized for removing remaining burst errors in the constructed MPE-FEC frame, as depicted in Fig. 5.6. The RS parities depicted in this figure thus refer to the second FEC processing in the link layer. The problem of this separated protection approach is that both RS codes operate independently and do not communicate with each other.

A part of our proposed solution compensates this lack of communication by adding reliability information derived from the individual received TS packets. This information has the form of 2-bit erasure flags to capture the reception status of the individual bytes of the IP datagrams. The bytes constituting an MPEG-2 TS packet that traveled across the typical DVB-H channel may have a different reliability status after reception and processing by the channel decoder (soft/hard erasure). Figure 5.8 reveals that a majority of the bytes may still be correct, despite the occurrence of remaining errors. From the above explanation, we conclude that three reception situations can occur: (1) a TS packet is received correctly; (2) a TS packet is erroneously received, indicated by the TEI flag; (3) a TS packet is lost, leading to a relatively long signal gap. These three reception situations can be distinguished with 2-bit erasure flags, which are communicated to the second FEC.
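For reference, the limited correcting performance of the primary code noted above follows directly from its parameters: a [204,188,17] RS code has minimum distance d = 17 and, with error-only decoding, can therefore correct at most

\[ t = \left\lfloor \frac{d-1}{2} \right\rfloor = \left\lfloor \frac{17-1}{2} \right\rfloor = 8 \]

byte errors per 204-byte TS packet, which is easily exceeded by the burst errors of a mobile channel.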

Table 5.1 — Reliability information resulting in 2-bit erasure flags (TEI=transport error indicator, CC=continuity counter).

Reception situation                 Erasure type    2-bit erasure flag
correct                             correct         “00”
erroneous (TEI=1)                   soft erased     “01”
missing/lost (CC discontinuous)     hard erased     “10”


The 2-bit erasure-flag mapping is depicted in Table 5.1. Usage of the 2-bit erasure information as in Table 5.1 enables the distinction between correct, erroneous and missing/lost reception conditions. In the case of erroneous reception, the TEI flag is set by the channel decoder (of the PHY) and the TS payload is indicated as being unreliable (soft erased). When TS packets are missing (lost), this is detected on the basis of a discontinuity of the packet numbering indicated by the CC. The filling of the MPE-FEC frame continues with the first TS packet containing a Packet IDentifier (PID) that corresponds to the selected service PID. The absent data is labeled erroneous (hard erased). The difference between soft and hard erased is exploited prior to the actual MPE-FEC decoding and is discussed in the next subsection.

5.4.2 Algorithm for IP recovery in defect MPE-FEC frame

This section presents our proposed algorithm for retrieving correctly received and FEC-corrected IP datagrams from defect MPE-FEC frames after FEC. The algorithm is based on the reliability and location information signals as described in Section 5.4.1. IP datagrams are processed according to a three-stage approach, as indicated in Fig. 5.9. The first step is conducted during data reception, deriving reliability information. The second step is performed in case of erroneously received IP datagrams, where an attempt is made at full correction of the received errors. The third step commences when the MPE-FEC frame after FEC still contains erroneous IP datagrams.

Figure 5.9 — Steps for IP datagram recovery of defect MPE-FEC frames after FEC.



Step 1: Derive reception reliability information

Figure 5.10 depicts the flowchart showing the derivation of the reception-based reliability information. The reliability derivation involves readout (parsing) of the MPEG-2 TS packet header preceding the multi-protocol encapsulated data, which indicates correctness, incorrectness or packet loss. A received TS packet is tested on correctness by evaluating the transport error indicator (in the diagram TEI == True ?), while the TS payload is decapsulated and the corresponding 2-bit erasure information is assigned. If preceding TS packets are lost, these lost byte positions are indicated as depicted in Table 5.1. For the situation that a correct MPE section has been received (CRC == True ?), the IP datagram start-address location in the MPE-FEC frame is stored in the Internet Protocol Entry Table (IPET).

For each received TS packet with a PID equal to the service PID, the Transport Error Indicator (TEI) and the Continuity Counter (CC) are tested. For the situation that the channel decoder cannot correct an erroneously received TS packet, the TEI flag is set to unity, leading to the 2-bit erasure flags being set to soft erased. This is motivated by the observation that many data bytes of the TS packet can still be correct. A TS packet loss leads to a CC discontinuity, and the corresponding places in the MPE-FEC frame are consequently marked with a hard erasure. For the situation that an MPE section is correct, the corresponding real-time parameters address is stored in the IPET (see diagram CRC == True ?). The PrevCC is updated considering the current value of PrevCC, the CC value and the TEI flag.

Step 2: Error- and erasure-based FEC decoding

Figure 5.11 visualizes the flowchart describing the processing associated with the FEC decoding. In order to improve the DVB-H link layer robustness, a combined error and erasure decoding is applied, employing the derived 2-bit erasure flags. Prior to the actual FEC decoding, the inequality 2t + ε ≤ d − 1 is evaluated, where t denotes the number of errors, ε the number of assigned erasures and d the coding distance. For the situation that the assigned erasures exceed the correction capacity (see diagram 2t + ε > d − 1 ?), the soft erasures are degraded to the value “correct”. The philosophy behind this operation is that most of the soft-erased bytes might be “correct”, and by lowering the number of erasures, more capacity is released for correcting possible remaining errors among the soft-erased positions. The result of the FEC calculation (see diagram row == correct ?) is stored in the Corrected Row Index Table (CRIT).
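A minimal sketch of this degradation step, assuming hypothetical helper names and the 2-bit flag encoding of Table 5.1, could look as follows: if the flagged positions of a row exceed the budget d − 1 (i.e. 2t + ε ≤ d − 1 cannot hold even for t = 0), soft erasures are downgraded so that the RS decoder regains capacity for error correction.

```python
SOFT_ERASED, HARD_ERASED = 0b01, 0b10

def prepare_row_erasures(row_flags, d=65):
    """Decide which byte positions of one MPE-FEC row are passed to the
    RS decoder as erasures. If too many positions are flagged, soft
    erasures are degraded to 'correct', keeping only hard erasures."""
    erasures = [i for i, f in enumerate(row_flags) if f != 0]
    if len(erasures) > d - 1:                      # erasure budget exceeded
        erasures = [i for i, f in enumerate(row_flags) if f == HARD_ERASED]
    return erasures                                # positions for erasure decoding
```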


Figure 5.10 — Algorithm for generating reception-based reliability information.

Step 3: IP recovery in defect MPE-FEC frame

Figure 5.12 portrays the flowchart showing the IP recovery processing, which is applied to defect MPE-FEC frames after FEC. The correctly received IP datagrams are retrieved on the basis of their start addresses, which are contained in the IPET. Erroneously received IP datagrams result in an IPET discontinuity, which becomes apparent as a mismatch between the stored IP datagram start address and the calculated start address.


Figure 5.11 — FEC decoding based on erasure flags applying erasure degradation of “soft-erasure” flags, whereby i indicates the row index of the MPE-FEC frame.

This next-address calculation is based on the IP datagram length field of the retrieved IP datagram (see diagram address ≠ IPET ?). IP datagram(s) stored at such a discontinuity location are potentially recovered on the basis of the derived reliability information. After retrieving the last correctly received IP datagram, an attempt is made to further recover potentially corrected IP datagrams until an error occurs (see diagram while continue ?). The next step of the required processing involves a detailed discussion of the actual IP recovery.

This is presented as follows. First, the IP recovery algorithm is given in pseudo-code, representing the main body of the IP datagram recovery as shown in Fig. 5.12. The actual recovery has two separate recovery modules: the first module recovers potentially corrected IP datagrams located between correctly received IP datagrams (IP datagram recovery 1), while the second module recovers corrected IP datagrams succeeding the last correctly received IP datagram (IP datagram recovery 2).

The main body of IP datagram recovery is detailed in Algorithm 12 and described later, after a short introduction of the involved processing steps. Due to the fact that correctly received IP datagrams can be preceded or succeeded by incorrectly received IP datagrams, these MPE-FEC storage locations are tested using the derived 2-bit erasure reliability information and the CRIT information.

Figure 5.12 — IP datagram recovery from defect MPE-FEC frames after FEC.


Algorithm 12 IP datagram recovery in defect MPE-FEC frame after FEC
Require: MPEFEC[], IPET max entry, IPET[], CRIT[], Erflgs[]
Ensure: readout of correctly received and potentially corrected IP datagrams
Initialize: address = 0
  for i = 0 to IPET max entry do          ⊲ retrieve correctly received IP datagrams
      if address ≠ IPET[i] then           ⊲ determine defect location
          call IP-recovery-1()            ⊲ apply IP recovery calculation
      end if
      read IPversion                      ⊲ determine IP version for this IP datagram
      read IPlength                       ⊲ for either IPv4 or IPv6 datagram
      read IP datagram at MPEFEC[address] ⊲ read correct IP datagram
      address += IPlength                 ⊲ calculate next IP datagram location
  end for
  call IP-recovery-2()                    ⊲ search for potentially correct IP datagrams

Algorithm 13 shows the processing steps required to potentially recover a FEC-corrected IP datagram preceding a correctly received IP datagram. Algorithm 14 shows the processing steps related to IP datagram recovery for the case that the succeeding data does not contain a correctly received IP datagram. Since the IP datagram length is lost from this point onwards, the datagram length is employed in an extrapolation approach for attempting IP datagram recovery.

Let us now discuss the main body of the IP recovery, indicated by Algorithm 12. As depicted in Fig. 5.6, the IP datagrams are stored on a column-by-column basis. For every correctly received IP datagram, the IPET table contains the MPE-FEC start address, starting with address zero. Retrieval of correctly received IP datagrams invokes the evaluation of an inequality (see pseudo-code “test address ≠ IPET[i] ?”), which compares the actual MPE-FEC frame read address with the address value stored in the IPET. For the situation that the actual MPE-FEC read address differs from the address contained in the IPET (see pseudo-code “test address ≠ IPET[i] ?”), IP datagram recovery 1 is conducted. Otherwise, a single IP datagram is retrieved on the basis of the corresponding IP datagram length. Using the derived IP datagram length and the address value, the next IP datagram entry is calculated, which is compared against the address stored for the next IPET entry, repeating the previous process steps. Assuming that IP datagram recovery 1 is conducted, another algorithm becomes active: Algorithm 13, of which the involved processing steps are now discussed.
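To illustrate the discontinuity test of this main loop, a compact Python sketch could read as follows; `recover_gap`, `recover_tail` and `datagram_length` are hypothetical helpers (the last one parsing the IPv4/IPv6 length field as in the earlier sketch), standing in for Algorithms 13 and 14.

```python
def recover_from_defect_frame(mpefec, ipet, recover_gap, recover_tail):
    """Walk the IPET start addresses; a mismatch between the running
    read address and the next IPET entry marks a gap of lost datagrams,
    which is handed to the gap-recovery routine."""
    address, out = 0, []
    for entry in ipet:
        if address != entry:                       # IPET discontinuity
            out += recover_gap(mpefec, address, entry)
            address = entry
        length = datagram_length(mpefec, address)  # parse IP header length field
        out.append(mpefec[address:address + length])
        address += length
    out += recover_tail(mpefec, address)           # data after the last IPET entry
    return out
```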


Algorithm 13 IP recovery in defect MPE-FEC frame after FEC, preceding a correct IP datagram
function IP-RECOVERY-1()
    repeat
        error = FALSE                                 ⊲ reset error signal
        for j = address to address + 6 do             ⊲ check first 6 symbols
            if (CRIT[j] == 0 ∧ Erflgs[j] ≠ 0) then
                error = TRUE                          ⊲ one or more symbols are defect
            end if
        end for
        if ¬error then                                ⊲ first 6 symbols are correct
            read IPversion                            ⊲ determine IP version for this IP datagram
            read IPlength                             ⊲ for either IPv4 or IPv6 datagram
            for j = address to address + IPlength do  ⊲ verify symbols
                if (CRIT[j] == 0 ∧ Erflgs[j] ≠ 0) then
                    error = TRUE                      ⊲ one or more symbols are defect
                end if
            end for
            if ¬error then                            ⊲ all symbols are correct
                read IP datagram                      ⊲ recovered IP datagram
                address += IPlength
            else
                address += IPlength                   ⊲ skip IP datagram
            end if
        else
            address = IPET[i]                         ⊲ IP datagram can not be recovered
        end if
    until address == IPET[i]                          ⊲ check for multiple IP datagrams
end function

The discontinuity in the IPET address information indicates that an IP datagram is lost. In order to retrieve this IP datagram, the first 6 Bytes are tested on correctness. This test involves the corresponding CRIT value and the 2-bit erasure information (see pseudo-code “test (CRIT[j] == 0 ∧ Erflgs[j] ≠ 0) ?”). For the situation that one or more bytes are defect, the MPE-FEC read address is made equal to the address indicated by the IPET entry, thereby skipping the recovery process. For the situation that the first 6 Bytes of the IP datagram are correct, the IP datagram type (either IPv4 or IPv6) and its corresponding length can be determined, enabling all bytes constituting that IP datagram to be tested. For the situation that all bytes are correct, the IP datagram is recovered; otherwise, the IP datagram is discarded.


Algorithm 14 IP recovery in defect MPE-FEC frame after FEC, succeeding the last correct IP datagram
function IP-RECOVERY-2()
    repeat
        error = FALSE                                      ⊲ reset error signal
        for j = address to address + 6 do                  ⊲ check first 6 symbols
            if (CRIT[j] == 0 ∧ Erflgs[j] ≠ 0) then         ⊲ symbol error
                error = TRUE                               ⊲ one or more symbols are defect
            end if
        end for
        if ¬error then                                     ⊲ first 6 symbols are correct
            read IPversion                                 ⊲ determine IP version for this IP datagram
            if ¬(IPversion == IPv4 ∨ IPversion == IPv6) then
                error = TRUE                               ⊲ no valid IP version
            else
                read IPlength                              ⊲ for either IPv4 or IPv6 datagram
                for j = address to address + IPlength do   ⊲ verify symbols
                    if (CRIT[j] == 0 ∧ Erflgs[j] ≠ 0) then ⊲ symbol error
                        error = TRUE                       ⊲ one or more symbols are defect
                    end if
                end for
                if ¬error then                             ⊲ all symbols are correct
                    read IP datagram                       ⊲ recovered IP datagram
                    address += IPlength
                else
                    address += IPlength                    ⊲ skip IP datagram
                    error = FALSE                          ⊲ reset error signal
                end if
            end if
        end if
    until error == TRUE ∨ address == EOT                   ⊲ check for end condition
end function

This IP recovery process continues (see pseudo-code “repeat loop”) until the MPE-FEC read address equals the address indicated by the IPET entry. In this way, none, one or more IP datagrams can be recovered. The main-body processing loop in Algorithm 12 is capable of retrieving correctly received and potentially corrected IP datagrams. After finishing these processing steps, there may still be corrected IP datagrams succeeding the last correctly received IP datagram, which requires additional processing steps for recovery, as indicated by Algorithm 14.


Algorithm 14 is invoked when all correctly received IP datagrams have been retrieved, while the MPE-FEC frame still has remaining data positions. The processing steps of Algorithm 14 differ from Algorithm 13 due to the absence of a next entry point in the MPE-FEC frame revealing a succeeding correct IP datagram. The first step in Algorithm 14 is to determine the correctness of the first 6 Bytes, enabling the determination of the IP datagram version (see pseudo-code “test ¬(IPversion == IPv4 ∨ IPversion == IPv6) ?”). In case of an invalid IP version, i.e. padding, the IP recovery process is aborted; otherwise, the IP datagram length is determined, followed by the analysis of the successive IP datagram bytes. For the situation that there are one or more errors, the IP datagram is not recovered; otherwise, the IP datagram is recovered. This recovery process is repeated (see pseudo-code “test error == TRUE ∨ address == EOT ?”) until an error in the IP header occurs or the End-Of-Table (EOT) is reached, indicating the end of the MPE-FEC frame.

5.5 Implementation and performance evaluation of the improved DVB-H link layer

This section presents an improved DVB-H link layer framework, suitable for deploying the improvements discussed in Section 5.4. Moreover, we discuss an elegant validation/verification concept, involving a DVB-H data generator and data analyzer. The data generator and data analyzer allow both software-based system validation and system verification involving a hardware set-up. On the basis of this validation/verification concept, results are obtained on robustness and smooth signal degradation of our proposed improved DVB-H link layer.

5.5.1 Improved DVB-H link layer framework

The involved signal-processing functions, as discussed in Section 5.4, are depicted in the DVB-H link layer framework diagram. This diagram is an extension of the basic decoder signal-flow diagram corresponding to the DVB-H standard. This diagram was presented earlier in this chapter, see Fig. 5.4. The improvement functions are added to this diagram and indicated as gray blocks. Furthermore, this figure contains a dashed region, reflecting a strong data dependence and mutual interaction. This area represents the coupling between the primary FEC layer in the channel decoder and the secondary FEC layer indicated lower in the diagram. The new diagram with the improvement functions is shown in Fig. 5.13. Let us now discuss the framework diagram depicted in Fig. 5.13.

175 5. ROBUSTNESS IMPROVED DVB-H LINK LAYER scription below builds further on the explanation of diagram Fig. 5.4 in Sec- tion 5.2.0 - A. After finding the TS packet header, PID filtering is applied in or- der to access the data streams. From the received TS packets, the section-based data is collected. Furthermore, from the TS header, the TEI and CC fields are ex- tracted for further processing. We now enter the four proposed new functional blocks: IPET, 2-bit erasure flag generation, CRIT and the modified IP readout.

Figure 5.13 — Diagram of the improved DVB-H link layer framework for SI/PSI data, FEC-based and FEC-less IP reception. The gray blocks indicate newly added functions.


During TS packet reception, the TEI and CC fields are forwarded to the new block 2-bit erasure flag generation. This block utilizes these two parameters to derive the corresponding erasure flags, which are stored in the block Erasure flag. Furthermore, for each correctly received MPE section (CRC correct), the MPE-FEC frame address field is stored in the IPET. If errors are absent during service reception, the conventional IP datagram readout (IP readout) processing is applied. When errors occur, the derived erasure flags are used to support the secondary FEC layer offered by the FEC decoder block. During FEC decoding, the FEC result is stored on a per-row basis in the CRIT, which is the new extension table for detecting reliability after FEC. For the situation that the secondary FEC layer corrects all erroneous data, again the conventional IP datagram readout (IP readout) processing is applied, which is based on the IP datagram length field. However, in case of uncorrectable errors, the MPE-FEC frame is considered defect in the original situation. In the improved diagram, the derived reception reliability information, location information and FEC-based reliability information are employed to recover both correctly received and corrected IP datagrams from this defect MPE-FEC frame.

In the sequel, we present the validation and verification in several steps, distributed over four subsections.

• Section 5.5.2 discusses the test set-up for software-based validation of our proposed DVB-H link layer.
• Section 5.5.3 presents the performance curves obtained by the test set-up of the preceding subsection.
• Section 5.5.4 shows the test set-up for the hardware implementation of our proposed DVB-H link layer.
• Section 5.5.5 presents the performance curves obtained by the test set-up of the preceding subsection on hardware verification.

5.5.2 Validation test set-up for DVB-H link layer

For validation of the improved DVB-H link layer, we start with an efficient concept, involving a software-based system validation approach [26]. This system validation is conducted for both the standard and the improved DVB-H link layer, enabling a performance comparison regarding robustness and smooth signal degradation. This subsection describes the validation test set-up, while the simulation results are presented in the next Section 5.5.3. The system set-up for software-based validation is depicted in Fig. 5.14 and consists of three functional blocks. The enclosed data generator provides a complete, compliant MPEG-2 TS test sequence. Moreover, the data generator simultaneously

produces the corresponding reference data set, in the form of an MPE-FEC frame with automatically back-annotated erasure information indicated on a per-byte basis. This concept is based on the following approach and fundamental steps.

• DVB-H link layer reference model. In order to avoid a full functional DVB-H link layer reference model, based on a separate encoder and decoder with a transmission-channel model in between, we have used a data generator that provides the error patterns mimicking a full reference model. Next to the fully functional and MPEG-2-compliant TS, the generator produces a reference data set for the link layer with corresponding erasure information. We call this approach back-annotated erasure information.

• Erasure generation. Erasures can be generated on the basis of error traces derived from an actual service reception, or synthesized using a particular error distribution. Both methods have been adopted, whereby the real captured error traces have been obtained with an existing DVB-T receiver. These error traces have been obtained by employing recommended channel models of typical DVB-H use cases [34]. These models consist of urban and non-urban reception situations. For the situation that error traces are synthesized, an error distribution is employed such that the length of a burst error is limited to one TS packet and was chosen to have a fixed length of 40 bytes (equal to the measured average length in practice).

• Worst-case erasure types. The captured error traces have been modified by forcing “soft”-erased TS packets to be “hard”-erased, where the erasure types were explained in Section 5.4.1. This new reception condition resembles the worst-case communication situation.

• IP decapsulation. The back-annotated erasures depend on the erasure generation method employed by the DVB-H link layer, which in turn relies on the applied IP decapsulation method. Our method, presented in Section 5.4, differs from the standard method described in [97].

Figure 5.14 — Test set-up for validating the DVB-H link layer performance with software implementation.

The main difference between the two methods is that our method derives erasure information from the channel decoder on the basis of received or absent TS packets (see Section 5.4.1), while the standard method derives erasure information on the basis of the MPE or MPE-FEC section CRC [106].

Figure 5.14 depicts the validation set-up for both validation experiments: a basic DVB-H link layer according to the DVB-H standard, as well as the improved DVB-H link layer. The following paragraph summarizes the operation of the actual software FEC decoder on the basis of the provided information. Based on the previously explained erasure back-annotated reference data, a software-based system validation is conducted. This back-annotated erasure information is utilized by the analyzer to calculate whether an MPE-FEC frame row can be corrected by the second FEC. Therefore, the analyzer program inspects the MPE-FEC frame-based reference set data and counts the assigned erasure information. Hereby, the MPE-FEC frame is processed on a row-by-row basis. For the situation that the sum of the counted erasures satisfies the inequality 2t + ε ≤ d − 1, the row is considered correctable; otherwise, the row remains defect.
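A sketch of this analyzer check, assuming the erasure-flag encoding introduced earlier and pure erasure decoding (t = 0), might look as follows.

```python
def row_correctable(row_flags, d=65):
    """Emulate the second FEC on one MPE-FEC row: with erasure-only
    decoding, a row is correctable when the number of flagged byte
    positions does not exceed d - 1."""
    erasures = sum(1 for f in row_flags if f != 0)
    return erasures <= d - 1
```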

It should be noticed that the implemented software-based validation is a very simple model that emulates the FEC decoding performance only. This is sufficient for algorithmic validation of the possible improvements on IP decapsulation and IP datagram recovery, while concentrating on increasing robustness and smooth signal degradation.

5.5.3 Algorithm performance validation

The objective of this subsection is to compare the performance curves of the improved DVB-H link layer with those of the basic DVB-H link layer proposed by the standard [97]. The performance data is produced by the software-based simulation test set-up discussed in the previous subsection. The software-based validation is conducted on the basis of the following parameter settings.

• Error model. The DVB-H data generator produces an MPEG-2 TS with an (on average) constant error probability. The corresponding MPE-FEC frame-based reference data is equipped with “hard” erasure information (explained earlier), revealing the defect byte positions and indicating lost data reception. This type of erasure decoding resembles the worst-case situation, as typically not all bytes of an erroneously received TS packet are defect.


For signal generation, the following settings have been used; a small calculation of the resulting correction capacities is given after the list.

• MPE-FEC. The software simulation employs an MPE-FEC frame size of 1,024 rows, without shortening and puncturing, i.e. a code rate of 3/4. This results in a coding distance d = 65.

• IP decapsulation. All correctly or erroneously received IP data are stored in the MPE-FEC frame.

• Service construction. The service stream is based on a sequence of consecutive TS packets. This is the worst-case multiplexing situation, as typically services are temporally interleaved, thereby spreading the erroneous information caused by burst errors over different services.
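For reference, the figures quoted above follow directly from the MPE-FEC Reed-Solomon code [255,191,65]; the short sketch below (plain arithmetic, no assumptions beyond the standard code parameters) computes the coding distance and the resulting correction capacities.

```python
n, k = 255, 191          # RS(255,191): 191 data columns + 64 parity columns
d = n - k + 1            # coding distance: 255 - 191 + 1 = 65
t_max = (d - 1) // 2     # error-only decoding: up to 32 byte errors per row
e_max = d - 1            # erasure-only decoding: up to 64 erasures per row
print(d, t_max, e_max)   # -> 65 32 64
```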

A. Robustness and signal degradation improvements

For the basic DVB-H link layer, Fig. 5.15 depicts the obtained simulated performance curves. These curves have been constructed by employing FEC erasure decoding. This implies that the positions of the erasures are known and are derived from the TEI flag in combination with a CC error, or on the basis of a CRC error indication. As a result of this analysis, the FEC decoder equation is maximized regarding error-correcting capability. In other words, the capacity for correcting random errors is sacrificed in order to have maximum capacity for correcting erasures (in 2t + ε ≤ d − 1, t is set to zero, so that up to d − 1 = 64 erasures per row can be corrected).

• Robustness. Due to the CRC-based erasure-flag generation, full correction of a defect MPE-FEC frame strongly depends on the actual IP datagram size. With an increased IP datagram size, more data (symbols) are lost when employing a generic metric (like a CRC) to control the erasure assignment. This explains the decreased performance for larger datagram sizes, both in robustness and in signal degradation. The underlying reason for this rapid performance decay is the global nature of the CRC error detection: this erasure-assignment approach causes the complete IP datagram or parity data bytes to be set erased, even when correctly received. Furthermore, the rapid decrease of the robustness for larger datagram sizes is also due to the FEC parity data, which negatively influences the overall performance. This data is also protected by means of a CRC, which causes all parity data to be erased in case of an error. As a result, a complete MPE-FEC column is erased, even if the majority of the data has been received correctly. This causes the effective coding distance d to be quickly reduced.


• Signal degradation. Although the robustness is limited, the curves show a smooth signal degradation for defect MPE-FEC frames, indicating the presence of information suitable for error concealment. This degradation is larger and faster for increased datagram sizes.

The previous simulation experiment has been repeated with the proposed improved DVB-H link layer. The obtained performance curves for this improved link layer are shown in Fig. 5.16. The software simulation employs the same settings as for the basic DVB-H link layer, i.e. an MPE-FEC frame size of 1,024 rows without shortening and puncturing. The curves depicted in Fig. 5.16 have been obtained by employing FEC error decoding, instead of the erasure decoding of the previous experiment. With respect to the error-correction capability, only half of the number of errors can be corrected (no erasure decoding). Due to the fact that all incorrectly received data is signaled “hard” erased (as in the previous experiment), this forms the worst-case situation. Let us now discuss the simulated link layer performance and compare this with the standard situation. The performance of the improved link layer is visualized in Fig. 5.16.

• Robustness. The newly obtained performance curves show a significant robustness increase, providing a first drop in performance at an operation point that is 50-100 % better than the standard method. This large robustness increase is caused by the fine-granular erasure assignment, enabled at the TS packet level instead of at the large datagram size. As a consequence, less data is marked as hard erasure when a small error occurs.

• Signal degradation. Also for the improved link layer, the signal degradation strongly depends on the employed IP datagram size. For small-sized IP datagrams smoothness is present, but this smoothness declines when increasing the IP datagram size, so that the curves become steeper. Therefore, for these worst-case simulation settings, our objective to extend the operation with smooth signal degradation is not achieved.

B. Signal degradation improvement by additional data retrieval

The signal-degradation performance depicted above is obtained on the basis of correctly received and FEC-corrected IP datagrams. In the next paragraph, the simulated performance improvement is further detailed by separating the contributions of two data-retrieval mechanisms. The first data retrieval is based on the correctly received datagrams, while the second retrieval mechanism additionally employs the FEC-corrected datagrams in defect MPE-FEC frames after FEC decoding. The improved DVB-H link layer employs an additional table, indicated by the Corrected Row Index Table (CRIT), facilitating


Figure 5.15 — Basic DVB-H link layer performance curves obtained by our software-based simulation model.

Figure 5.16 — Improved DVB-H link layer performance curves obtained by our software-based simulation model.

the retrieval of corrected IP datagrams in defect MPE-FEC frames after FEC decoding. Let us now elaborate on the CRIT contribution regarding IP datagram recovery. For defect MPE-FEC frames after FEC decoding, the recovered IP datagrams as depicted in Fig. 5.16 are obtained on the basis of correctly received IP datagrams indicated by the IPET, and FEC-corrected IP datagrams recovered by using the CRIT table. Figure 5.17 indicates a comparison between the IP recovery performance based on using only the IPET table, and the combination of location information contained in the IPET table and reliability information using the CRIT table and erasure flags, indicated as “Total”. The IP recovery results for lower error probabilities are omitted in the figure, since for these error rates there is no performance difference, because the FEC was able to correct all corrupted symbols, resulting in a 100 % IP datagram recovery.

• Retrieval based on combined reliability information. IP datagram recovery based on the combined reliability information (IPET and CRIT) varies, but always gives a higher robustness, with an improvement varying from a few percent up to 20 %. However, the improvements are obtained in a relatively small error-probability interval of approximately 10-15 % error probability. Hereby, this interval strongly depends on the IP datagram size.

Figure 5.17 — IP datagram recovery based on IPET-only (correctly received, in blue color/left bars) versus IP datagram recovery on the basis of combined reliability information (correctly received and FEC corrected, red color/right bars).


The diagrams in Fig. 5.17 indicate that when the error probability increases, the performance improvement of the combined strategy decreases to zero, so that the performance becomes equal to the IPET-only approach, i.e. using only the correctly received IP datagrams.

5.5.4 DVB-H link layer hardware test set-up

For the verification of an improved DVB-H link layer realization with real hardware, we re-use the data generator and analyzer presented in Section 5.5.2. Next to the reference data set applied for software simulation, the verification also involves an additional software-based generation of a compliant MPEG-2 transport stream (TS). This TS is generated by means of a specific software module, which is included in the DVB-H data generator. As the improved DVB-H link layer is embedded in a DVB-H receiver system, see Fig. 5.1, the MPEG-2 TS is modulated onto an RF carrier. Although a DVB-H link layer involves various practical verification aspects, this section is limited to the verification of the IP datagram signal processing only. Figure 5.18 depicts the hardware test set-up.

Let us now describe the hardware test set-up in more detail. At the left side of Fig. 5.18, the DVB-H generator constructs an MPEG-2-compliant test sequence with or without errors, suitable for verifying the improved DVB-H link layer IP datagram processing. The MPEG-2 TS packets with possibly inserted errors are modulated, enabling proper interfacing with the DVB-H receiver system in hardware, see Fig. 5.18. Prior to modulation, the channel coding adds 16 parity bytes to each MPEG-2 TS packet, thereby enabling FEC decoding by the channel decoder. As the DVB-H generator already corrupts TS-packet bytes prior to modulation, the channel FEC decoding becomes a transparent operation, as the defect bytes are handled as correct bytes. This is ensured by using a coax-cable connection between the modulator and the hardware test receiver. At the right side of Fig. 5.18, the analyzer, which is implemented as a software program running on a personal computer, operates on two sets of information. The first set contains the IP datagrams provided by the improved DVB-H link layer, while the second set is provided by the DVB-H data generator and forms the data reference set. On the basis of the erasure information available in the reference data, the analyzer calculates the expected outcome for each received MPE-FEC frame (service burst).

The test set-up differs from the software-based validation test set-up discussed in Section 5.5.2 and applies the following different and fundamental steps.

• IP signal processing. Verification of the DVB-H link layer IP signal processing involves an MPEG-2 TS in combination with the generated software reference sets. These generated reference sets employ the same concept


Figure 5.18 — Hardware test set-up for verification of the improved DVB-H link layer.

of erasure back-annotation as discussed in Section 5.5.2, enabling the data analyzer to calculate the expected IP datagrams. Furthermore, the data analyzer also receives the IP datagrams from the tested DVB-H receiver system, which is now based on an actual silicon implementation on a chip. This chip receives simulated extreme signal conditions as a test pattern. The received IP datagram information is processed and forwarded to the external data analyzer, which compares this data set against the corresponding reference set. On a per-burst basis, the analyzer provides numerical results regarding the total number of received IP datagrams, which can be lower than, equal to, or higher than calculated by the data analyzer. This calculation depends on the applied DVB-H decapsulation method.

• Erasure generation. Erasures have been employed according to captured error traces derived from reception tests based on a TU-6 channel model. Thereby, the following standard modulation settings have been applied (16-QAM, 8K carriers, GI=1/4 and R=2/3), which is the most appropriate modulation scheme for mobile and portable reception [34].

• Erasure types. The captured error traces employ both “soft”- and “hard”-erased TS packets. This enables the back-annotated erasure information to utilize both “soft”- and “hard”-erasure assignment, depending on the TS packet reception status (see Section 5.4.1), similar to the approach proposed in [107].


Figure 5.19 — Multi-Chip Module of a commercial DVB-H receiver system containing the improved DVB-H link layer. Left: unshielded package; right: shielded package.

5.5.5 Performance results of the verified improved hardware link layer

Figure 5.19 shows an actual DVB-H receiver system on the basis of a Multi-Chip Module (MCM)¹. The MCM consists of a silicon tuner and a baseband chip interconnected via a laminate, packaged in a Ball Grid Array (BGA), forming a hardware implementation of the DVB-H receiver system, as depicted in Fig. 5.1. The MCM-based DVB-H receiver operates on an RF signal and delivers the requested SI, PSI and IP-based service data, according to the improved DVB-H framework as shown in Fig. 5.13. The novel MCM contains our proposed improved DVB-H link layer. This chip was exploited for verification of our proposed erasure-flag generation and IP datagram recovery scheme. The measured performance of IP datagram recovery, shown in Fig. 5.20, leads to the following observations.

• Robustness. Due to the TS-based erasure-flag generation, full correction of a defect MPE-FEC frame does not rely on the employed IP datagram size. Despite the fact that a DVB-H service is transmitted on the basis of consecutive TS packets and the channel introduces error bursts with lengths of up to multiple TS packets, full MPE-FEC recovery is possible up to a 10 % TS packet error rate. This significant performance improvement holds for all tested IP datagram sizes. Furthermore, all curves deteriorate at the same error probability of around 12 %.

• Signal degradation. The measured signal degradation strongly depends on the employed IP datagram size. For small-sized IP datagrams, a smooth

¹ This MCM is a custom-made chip, commercially known as the BGT205 of NXP Semiconductors.

decay of the performance curve occurs, but the smoothness declines rapidly when increasing the IP datagram size. For large-sized IP datagrams, the smoothness is negligible. A similar dependence was also observed during the software simulation. However, the steepness of the curves is higher than in the software-based simulation, so that the chip performance is lower on this aspect and resembles the typical waterfall behavior often found in digital systems.

Based on the above observations, we can conclude that the performance improvement regarding robustness predicted by the software simulation is also obtained with the hardware implementation. The other performance improvement, concerning smooth signal degradation, is only partly confirmed in the hardware system: in contrast with the software-based results, the hardware system degrades more rapidly. This performance discrepancy is explained by the difference in the error-burst distributions. Whereas in the software system the error bursts are limited to one TS packet resulting in a hard erasure, the hardware system is exposed to error patterns which frequently exceed the TS packet length of 188 Bytes. This means that the error pattern is longer than one TS packet, leading to a higher performance degradation.

Let us now explain why the system improvement in terms of robustness and signal degradation is lower than expected. This effect is mainly caused by the

Figure 5.20 — Measured IP datagram recovery for different IP datagram sizes and TS-packet error probability, using erasure FEC decoding.


Figure 5.21 — A single transmission period of a TDM-based DVB-H service broadcast (TDM is Time Division Multiplexing).

error distribution introduced by the receiver front-end. Initially, it was expected that there would be a balance between soft and hard erasures, enabling erasure degradation during the second FEC stage. However, the captured error traces show that hard erasures dominate the defect TS packets and, moreover, appear in bursts with relatively large lengths (10-20 TS packets). As a result, the errors are not equally distributed over the MPE-FEC frame space, which limits the full correction capability. This means effectively that the improved link layer operates with only half of the error-correction capability in many of the reception situations.

Another remark regarding the performance concerns the nature of “floating” data segments caused by erroneous storage information. The term “floating” refers to data segments for which the storage position in the MPE-FEC frame is corrupted due to transmission errors, so that the decoder is not aware where to position the data segment. As a result, all segment data is lost, leading to larger chunks of erroneous symbols. With this effect, the erasure influence becomes similar to the case of the standard DVB-H link layer, giving error accumulation. This phenomenon occurs, but at a low occurrence rate, so that its influence is limited.

Let us now discuss the influence of forwarding the IP datagrams only once to the network layer and its possible effect on the energy consumption. Figure 5.21 indicates the burst-based service broadcast employed by DVB-H. During reception, the receiver front-end is active and typically consumes up to 40 mW. After service reception, the receiver front-end is set into a sleep mode, reducing the power consumption down to 1 mW. This sleep mode is entered when all correct IP datagrams have been forwarded to the network layer. The receiver shown in Fig. 5.19 has an SPI interface (see also Fig. 5.7) for exchanging data and control, and operates at a 25-MHz clock. Transmitting the data contained in the MPE-FEC application data table, see Fig. 5.2, requires a worst-case transmission time equal to Bsize/fclk, where Bsize denotes the maximum application data table size in bits and fclk = 25 MHz.


For the situation that the MPE-FEC frame is based on 1,024 rows, the maximum bit cost equals 191 × 1,024 × 8 = 1,564,672 bits. The associated transmission time over SPI equals this number of bits divided by 25 × 10^6, requiring 62.59 ms, which corresponds to 2.5 mJ of additional energy consumption for each duplicated service burst. This additional energy consumption seems small, but may appear every 3 seconds, which can lead to a serious energy cost in cumulative form.
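A quick check of this arithmetic in plain Python, using the 40 mW active front-end power mentioned above:

```python
rows, cols, bits_per_byte = 1024, 191, 8
f_clk = 25e6                          # SPI clock [Hz]
p_active = 40e-3                      # active front-end power [W]

b_size = rows * cols * bits_per_byte  # 1,564,672 bits
t_tx = b_size / f_clk                 # ~62.59 ms worst-case SPI transfer time
e_extra = p_active * t_tx             # ~2.5 mJ per duplicated service burst
print(f"{b_size} bits, {t_tx*1e3:.2f} ms, {e_extra*1e3:.2f} mJ")
```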

5.6 Conclusions

In this chapter, we have proposed an improved DVB-H link layer, capable of improving the robustness and providing a best-effort signal degradation, while minimizing data communication. The solution is based on locally obtained reliability and location information, in the form of 2-bit erasure flags, an Internet Protocol Entry Table (IPET) and a Correct Row Index Table (CRIT). The reliability information is derived on the basis of dual-stage FEC decoding, while the location information is derived from correctly received broadcast data. Hereby, the primary FEC is performed by the channel decoder, while the secondary FEC is integrated in the DVB-H link layer. The reliability information derived from the primary FEC is employed in two ways. First, this reliability information is applied for error and erasure decoding by the secondary FEC stage. For the situation that an MPE-FEC frame is still incorrect after this second FEC stage, this reliability information is used a second time, in combination with reliability information derived from the second FEC decoding stage, supplemented with the IPET and CRIT information. In this way, correctly received and corrected IP datagrams are extracted from the defective MPE-FEC frame. Using this new link layer concept, we have found the following performance improvements.

Robustness. The usage of reliability information derived from the primary FEC to control the secondary FEC limits the instantaneously assigned erasure information, thereby improving the second FEC stage. As a result, the robustness for retrieving completely correct MPE-FEC frames improves by approximately 50%. Another aspect of this improvement is that the performance curves tend to cluster around the same critical performance degradation point. This tendency results in a lower dependence on the applied datagram size or parity data size, which is another benefit. In an absolute sense, the dependency on the datagram size also becomes low, because the discrimination between the various curves for gradual degradation is of little relevance at the remaining performance level.

Smooth signal degradation. IP recovery in defective MPE-FEC frames on the basis of joint reliability and location information results in up to 20% additional IP datagram recovery. However, this performance strongly depends on the IP datagram size, as well as on the error probability. As a result, there is no significant improvement of the signal degradation compared to the standard DVB-H link layer. This lack of improvement for smooth signal degradation is mainly caused by the mismatch between soft- and hard-erasures: the actual hardware generates significantly longer burst errors than initially expected, so that the expected performance gain cannot be achieved. However, this illustrates that multiple channel models apply to the real situation in practice. For some of these models, the degradation behavior may perform more favorably. This error-distribution mismatch is further discussed below.

Minimizing data communication. When forwarding IP datagrams only once to the network layer, the data communication is minimized, which contributes to a reduced power consumption.

Test framework. An elegant test framework has been proposed, enabling software-based algorithm validation as well as hardware verification, avoiding the need for a full-featured DVB-H link layer reference model. By utilizing a dedicated data generator and data analyzer, early algorithm validation has been achieved, as well as full hardware verification. Elegance is obtained by employing erasure back-annotated reference data sets. These data sets contain all IP datagrams and their associated reception reliability status, thereby enabling emulation of error, erasure or erasure-degradation FEC decoding.

Hardware realization. A DVB-H receiver has been realized on the basis of a DVB-T baseband, extended with the specific DVB-H features, combined with a silicon tuner and packaged into a Ball Grid Array (BGA) package. The additional features encompass, amongst others, an adaptive equalizer for data reception, as well as hardware provisions for the improved link layer, such as the IPET and CRIT and the associated control.

Performance differences. Software-based simulation has revealed a significant robustness improvement between the standard DVB-H link layer and our proposed DVB-H link layer. This significant robustness improvement is obtained by establishing an information link between the channel decoder FEC (in the PHY) and the DVB-H link layer FEC. In this way, the amount of unnecessarily erased data is minimized, leading to an improvement of successful FEC decoding. This mechanism is responsible for the significant improvement of the robustness, which is also confirmed by the hardware verification. Although the realized robustness improvements in hardware and software show a similar performance curve, it should be noted that there is a difference

between the software-based performance simulation and the hardware-based robustness improvement. The main difference between the two situations is the employed error-correcting strategy. In the hardware implementation, the error-correcting distance is fully exploited for error correction, while the software implementation employs flexible error and erasure decoding. The purpose of the software implementation was to determine the minimum improvement, while employing worst-case FEC decoding situations. In practice, the minimum improvement appeared to be also the practical improvement, because the error distribution contained more severe burst errors than the data generation employed for software simulation. Hence, the captured error traces are dominated by hard-erased TS packets, leaving almost no room for erasure degradation with the flexible arrangements as discussed in this chapter.


“Imagination is more important than knowledge.”
Albert Einstein (1879 – 1955)

6 Block-based detection systems for visual artifact location

6.1 Introduction

Image and video communication have largely benefited from the established standards in compression techniques (e.g. JPEG/MPEG), achieved in the last decades. Many of the popular video compression techniques deploy a 2D DCT to decorrelate a block-based spatial region prior to quantization [5][108][40]. As discussed in Section 2.4, cost-effective communication removes not only irrelevant information, but also visually relevant information, thereby deteriorating the video quality.

Modern state-of-the-art LCD-based or OLED-based flat panel displays depict a high-quality picture [109][110][111], clearly revealing any video imperfections such as video coding artifacts, introduced by bandwidth-limited video communication. Digital Video Broadcasting (DVB) is based on video coding techniques, which employ a block-based Discrete Cosine Transform (DCT) to decorrelate the spatial video data. Due to quantization of the transform coefficients, artifacts such as mosquito noise, ringing, blocking and contouring are introduced. These artifacts are clearly noticeable on modern flat panel displays. However, the visibility of coding artifacts depends on the spatial complexity of the picture. Typically, coding artifacts are clearly noticeable when occurring in “flat and/or low-frequency” regions, whereas they are masked when occurring in “textured” regions.

A global outline of a digital television receiver and the involved video processing is depicted in Fig. 6.1. Part of the video chain is the video pipe, which is a composition of video-specific processing blocks. Hereby, each individual block performs a particular video function, which can be analysis, e.g. a noise meter or bandwidth meter, or enhancement such as video sharpness enhancement. In this video pipe, the embedded artifact-reduction function supports the reduction of blockiness and ringing/mosquito noise. Modern storage systems also employ image processing during playback, applying e.g. video scaling


Figure 6.1 — Typical video processing chain from a digital television receiver, contain- ing video signal improvement functions. (MCTNR: motion-compensated temporal noise reduction, DEI: de-interlacing).

and enhancement functions, which means that an external video signal as indicated in Fig. 6.1 may be an HD-like signal originating from an upscaled SD signal. In a video sequence, coding artifacts can appear statically, dynamically or in mixed form, requiring a different artifact-reduction approach.

The video chain of modern digital TV receivers employs video enhancement functions to improve the subjective picture quality after video decoding for an improved presentation on a display. This picture enhancement processing has evolved from global to local processing and from static settings to adaptive processing. This adaptivity considers the local content of the video signal and also depends on general factors such as local contrast. In order to attenuate the visibility of coding artifacts, low-pass filtering is typically applied in a locally-adaptive manner. For example, H.264/MPEG4-AVC employs in-loop low-pass filtering in order to reduce block coding artifacts. Unfortunately, the most widely applied video communication standards lack such an internal filtering solution. Therefore, video post-processing is required in order to attenuate introduced

video coding artifacts at the receiver side. This type of video post-processing is a necessary add-on to the minimum required video processing chain. For this reason, such a sub-system is highly optimized with respect to costs. This will limit our choices for the algorithm exploration in this chapter.

Video coding artifacts can appear in different forms and can have a static or dynamic behavior. In still images, the following artifacts may be present: blocking, mosquito noise, ringing and contouring. In motion video, the previous coding artifacts appear mostly in dynamic form, except for non-moving areas. Besides this, due to motion-compensation coding, artifacts may appear within moving objects in the form of block noise, low-frequency blocking patterns and coding-noise propagation in the temporal domain, due to predictive coding of distortion occurring in reference pictures. The following paragraphs briefly discuss typical solutions from the past two decades, which have been proposed for the three most frequently occurring coding artifacts in broadcast-quality video.

• Detection and reduction of visible block-grid coding artifacts: The basic method for the reduction of blockiness employs a two-step approach, involving (1) block-grid location detection and (2) low-pass filtering of the video data at the identified location [112]–[115]. The applied methods determine the pixel locations, which are subjected to low-pass filtering to attenuate the detected block-grid boundaries. In a more advanced approach, the blockiness visibility is considered prior to final low-pass filtering [116].

• Reduction of dynamic coding artifacts: An alternative degradation is mosquito noise, which can be dynamic in its appearance. Solutions proposed to reduce such noise involve Temporal Noise Reduction (TNR) or Motion-Compensated Temporal Noise Reduction (MCTNR), which not only reduce dominant Gaussian noise, but, up to a certain extent, also attenuate the visibility of non-static coding artifacts [117][44][118]. However, video coding artifacts are most annoying when occurring in still or slowly moving video. When objects move at a slow-motion speed, the viewer is able to track them, thereby observing the object details including visible coding artifacts.

• Detection and reduction of visible ringing and mosquito noise coding artifacts: Mosquito noise can also appear in static form. The methods [45], [119]–[122] for attenuating this type of artifact follow a similar two-step approach as indicated above and in Fig. 6.1. In more recent work, the Human Visual System (HVS) is employed for determining locations where the ringing is most visible [123], while in [124] even the amount of ringing is estimated.


The work conducted in the field of static ringing and mosquito noise detection and the corresponding attenuation [117][119][120][45][121][115] suffers from undesired image blur, which occurs when the located coding artifacts are suppressed too much. Other solutions [123][124] provide good results, but are costly in terms of computational complexity, or involve a substantial amount of embedded memory [121].

The focus in this chapter is on the trade-off between the accuracy required for finding the artifacts within the image, i.e. the detection, and the corresponding complexity for the artifact removal and/or suppression. Since the design of good filters for artifact reduction is known, we focus particularly on the detection of the artifacts and the involved complexity. To this end, two block-based artifact-location detection systems are presented, operating in either the spatial domain or the frequency domain, suitable for locating the regions which potentially contain visually noticeable mosquito noise and/or ringing artifacts. The derived artifact-location information forms a control map for artifact filtering, which controls the final filtering strength of a locally-adaptive low-pass filter, as presented in [115]. The previously cited filter design will be used as a starting point for the design of the complete system and will be applied as the second stage in our artifact-reduction system. Based on this two-step approach, we aim at accurate artifact-location detection, such that visible mosquito noise and ringing artifacts are effectively removed by the adaptively controlled filter, so that the sharpness of object edges is preserved and excessive image blur is avoided as much as possible.

This chapter is organized as follows. First, Section 6.2 provides a brief introduction to mosquito noise and ringing coding artifacts. Section 6.3 proposes a framework suitable for the accurate detection of visible mosquito noise and ringing coding artifacts. Section 6.4 presents two block-based algorithms, each operating in a different domain, for detecting visible mosquito noise and ringing coding artifacts. In Section 6.5, the experimental results for both detection approaches are provided. Finally, conclusions are discussed in Section 6.6.

6.2 Background and related work

This section provides a short overview of the occurrence of mosquito noise and ringing coding artifacts. Furthermore, a few techniques are summarized for finding the locations in the image where the typical coding artifacts, such as ringing and mosquito noise, appear. This section finishes with a summary of the requirements for blind detection of ringing and mosquito noise artifacts.

A. Background on mosquito noise and ringing coding artifacts
Various publications on ringing and mosquito noise confirm that both forms of distortion occur near object boundaries, whereby some authors indicate


that this form of degradation is particularly noticeable in “flat and/or low-frequency” regions [117][43][125]. Some authors define the visual occurrence to be constrained, in the sense that the decoded blocks showing mosquito artifacts are based on both flat and textured areas [43]. Furthermore, another aspect is that the intensity of the ringing and mosquito noise is low, which causes this type of artifact to be masked by textured regions, while being visible in “flat and/or low-frequency” regions [117][43]. Figure 6.2 provides two visual examples of the previous observations regarding noise-pattern visibility. This figure shows the original image and its decoded version after MPEG-2 compression with a coarse quantizer setting (Q = 20). Figure 6.2(b) clearly reveals mosquito noise, which is particularly noticeable in the region around the tree structure. Figure 6.2(d) presents an example of contamination by ringing. Again, the image degradation is clearly

Figure 6.2 — Example of factor-2 zoomed mosquito noise and ringing, due to MPEG-2 8 × 8 transform coding (Q = 20). (a) Original fragment. (b) Mosquito-noise fragment. (c) Original fragment. (d) Ringing fragment.


Figure 6.3 — Example of mosquito noise and ringing due to 8 × 8 transform coding (Q = 40), grid size is 8 × 8 pixels. (a) Original video fragment. (b) Decoded video. (c) Decoding error with gain factor 8. (d) Fragment from upscaled SD (to HD) video (1920 × 1080 pixels), with the same grid size. (e) Decoding error for upscaled SD video fragment with gain factor 8.

visible in the sky region near the building edges. Both noise examples show that ringing and mosquito noise are visible in “flat and/or low-frequency” regions, while being masked in detailed regions. Furthermore, the noise pattern is bounded within the block size used for coding.

Let us now further analyze the noise patterns emerging when quantizing 2D DCT high-frequency components. For the typical situation that the transform block size is 8 × 8 pixels, the mosquito noise can occur anywhere within the 8 × 8 block. This can introduce a noise pattern which can stretch itself within the pixel block up to 7 pixels in either direction, see Fig. 6.3(b)(c). Ringing occurs when the boundary of an object or a line segment of an object crosses the border of the transform block in either horizontal or vertical direction, see Fig. 6.3(b). The ringing effect is best visible in Fig. 6.3(c), where it is particularly noticeable in the top (two) noise patterns of the block. The bottom (two) noise patterns reveal the mosquito noise, giving pixel noise within the block. Evidently, the ringing

location depends on the location of the horizontal or vertical object border in the transform block. The ringing effect is introduced by the quantization (attenuation or even elimination) of high-frequency DCT coefficients during compression, and is theoretically explained by the Gibbs phenomenon [35]. Both forms of distortion are spread over a larger area when the video is upscaled to a larger image format (e.g. SD to HD conversion), see Fig. 6.3(d)(e). Note that, depending on the upscaling filters, the distortion appearance may be softened, resulting in distortion blobs rather than pixel-based noisy patterns.

Summarizing, the discussion and analysis of the decoded images lead to a few visibility rules for visible artifact detection.

• Visible and annoying appearance of mosquito noise and ringing occurs in a “flat and/or low-frequency” region near an edge/texture region transition or vice versa.

• The noise pattern stretches itself through the whole coding block.

• The visibility of the noise pattern becomes particularly annoying when the “flat and/or low-frequency” region covers a substantial area of at least several adjacent blocks.

These rules require that an artifact-detection kernel has means to validate them, by measuring signal conditions such as local activity and the local regions surrounding the possible noise patterns. This conclusion has been employed as a guideline for the design of the detection systems proposed in this chapter. Let us now summarize the main techniques for detecting and reducing visible ringing and mosquito noise coding artifacts.

B. Related work on mosquito noise and ringing artifact reduction
Detection and associated reduction of visible ringing and mosquito noise is typically employed as part of the video processing chain in a television. Since these artifacts are dominant around edge transitions, the detection of these edges is attractive for localizing this contamination. Suitable edge-detection techniques can be conducted in either the spatial domain or the frequency domain, while locally-adaptive low-pass filtering is typically performed in the spatial domain.

• Edge detection in the spatial domain: Transition detection in the spatial domain is conducted on the luminance video information and typically involves a Sobel or Canny edge detector, or employs image analysis based on gradient information or a statistical metric [117][126][122]. In the vicinity of the detected transitions, block-based video classification


is conducted, employing features such as “texture region” and “smooth region”, revealing the video structure adjacent to the detected edges. On the basis of the calculated feature information, a binary signal is derived, indicating the coding-artifact contaminated locations, which is used to control the final low-pass filtering. Although the derivation of edge locations based on these methods typically provides good results, the overall performance of the coding-artifact reduction is limited, as these methods lack the discrimination between intended texture and artifact contamination. The performance of the visible artifact-location detection can be described by the detection score, which to a large extent determines the artifact-reduction performance. In order to improve the detection score, the chrominance video information is also utilized, thereby exploiting the differences in processing of the luminance and chrominance information during video coding [127]. Besides classical edge detection based on a Sobel or Canny edge detector, more advanced solutions involve additional signal processing, while considering the Human Visual System (HVS) [123]. The performance of such an advanced proposal in terms of detection score is considerably increased, at the expense of a higher computational complexity.

• Edge detection in the frequency domain: Edge detection in the frequency domain is possible on the basis of the 8 × 8 2D DCT coefficients, offering good contour tracking and edge detection. This approach performs better than conventional Sobel-based detection in the spatial domain [128]. However, such an approach is expensive in terms of the involved line memories, which makes it less attractive.

• Low-pass filtering: Reduction of visible coding artifacts in the vicinity of an edge requires an edge-preserving low-pass filter. In the past decade, multiple implementations of artifact-reduction filters have been presented [45][115]. Therefore, we focus further on a good system for coding-artifact detection.

We now summarize the requirements for blind detection of ringing and mosquito noise artifacts. With blind detection, we mean that no reference information about the original signal prior to broadcast is available. The quality of the resulting artifact reduction will strongly rely on the performance of the artifact-detection system. For this reason, our emphasis will be on the design of a good detection system, which is suitable for embedding in a TV processing chain, with the associated cost constraints. Further system and quality aspects and requirements of the artifact-detection system are listed and discussed below.


1. Single-component analysis: The existing solutions for artifact-location detection are based on one signal component, the luminance. We adopt this choice, as it contains all important information on texture and edges. This also aids in constraining the involved costs.

2. Required activity metric: The system requires an activity metric revealing the important locations for artifact occurrence. A good metric must be edge and texture sensitive and suitable for signal-characterization modeling. In this way, not only “texture” and “smooth” regions can be classified, but the metric also enables the classification of “potential ringing” and “potential mosquito noise”, thereby facilitating the construction of a simplified local signal model. Hence, after the activity measurement, the calculated activity is classified into particular signal features as above.

3. Detection domain: The detection of coding-artifact locations can be conducted in the spatial or frequency domain. Both approaches have their advantages, so that both will be explored.

4. Detection aperture: In order to reliably detect contaminated regions in the video image, the detection aperture must be sufficiently large to contain artifact-free background information of the video scene.

5. Detector output: The final detector output signal will be used as a control signal for the adaptive low-pass filter.

6. Avoidance of switching artifacts: The filtering stage can introduce visible artifacts in the form of on-off switching of the filter. Such artifacts are very annoying and should be avoided. An appropriate way to solve this issue is the generation of a smooth, diamond-shaped control signal for the filter.

7. Predetermined low-pass filter system: As indicated in the previous section, the low-pass filter will be adopted from previous work [115]. This filter is equipped with means to control the filter settings, which can be elegantly exploited for filtering and quality control.

Based on the above requirements and system aspects, the next section presents a conceptual solution for the desired artifact-detection and reduction system.

6.3 Conceptual artifact-location detection and filtering solution

This section provides an overview of the detection system and the main system aspects. In Section 6.4, two approaches will be elaborated in more detail, one system for the spatial domain and one for the frequency domain. The low-pass


filtering applied after detection will also be explained in Section 6.4.

Based on the previously discussed visibility rules for artifact detection, we present our conceptual noise-pattern detection. The detection system incorporates two important elements: (a) a technique for measuring the activity in blocks and (b) sufficient memory for analyzing a spatial area surrounding the noise pattern. Prior to presenting the overview of the actual detection, we briefly elaborate on the spatial region employed for inspecting the visibility rules. This region is implemented by a detection kernel constructed from several blocks, arranged in a two-dimensional rectangular area surrounding the actually considered block. Furthermore, each individual block of the detection kernel is evaluated with respect to its spatial activity (e.g. flat or low-frequency/texture). Figure 6.4 depicts such an artifact-detection kernel constructed from blocks. The kernel has in its center the actual block (H) for which artifact occurrence is determined. This determination is based on both local activity measurements and the resulting activity pattern that is generated throughout the kernel area, where for each block the activity is measured and classified. The two-dimensional activity pattern conceptually forms a basic model of the local video image surrounding the center block. In this way, the conditions for the visibility of the noise can be verified, particularly the condition that the “flat and/or low-frequency” region in the kernel covers a substantial area of at least several adjacent blocks. To analyze the activity pattern in the kernel in a structured way, each block is labeled with a character (A,B,...,N,O). In Figure 6.4(b), as an example, block H is contaminated due to the presence of “texture” in block D, which is visible due to the fact that block H is part of a larger “flat and/or low-frequency” region (K,L,M,N,O).

Let us now present the full diagram of our block-based artifact-location detection. Figure 6.5 depicts a block diagram of the coding-artifact detection and

Figure 6.4 — Ringing and/or mosquito noise detection kernel. (a) Construction of a detection kernel composed of a 2D set of blocks. (b) Analysis of occurring noise patterns. Block H is located between a “textured” block D and “flat and/or low- frequency” region blocks K,L,M,N,O, which may result in visible noise.


Figure 6.5 — Proposed block-based artifact-location detection system and associated low-pass filter-strength control. (a) Detection subsystem. (b) Filtering subsystem.

its connection to the low-pass filter stage. This system is based on the requirements as presented in Section 6.2. Hereby, Fig. 6.5(a) illustrates the involved coding-artifact detection, while Fig. 6.5(b) shows the edge-preserving low-pass filter. The proposed system allows coding-artifact detection to be conducted in either the spatial or the frequency domain, involving a block-based activity metric revealing the local video variations. This exploration will be implemented and evaluated for both the spatial and the frequency domain. As the applied detection kernel is composed of multiple blocks, the surrounding information of the measured video variations can also be exploited. This approach is clearly different from existing solutions in literature and will enable the system to distinguish edge information, local background information and information surrounding an edge.

The activity metric stage calculates the activity metric on a per-block basis. The classification stage partitions the calculated activity metric, employing preferably discriminative features which allow local modeling of the video signal. The artifact-location detection stage applies a simplified model of the local video signal by classifying the activity of each block and using this in a range of surrounding blocks as input to a local neighborhood analysis of the signal. The presence of potential coding artifacts, which is indicated as a binary decision, is derived for center block H of the detection kernel. From the binary-based location signal, a diamond-shaped low-pass filter control signal is derived in


the kernel. The low-pass filter stage reduces the coding artifacts on the basis of the calculated filter coefficients, the aperture and the diamond-shaped filter-strength signal. Let us now briefly discuss the main stages of the block diagram.

A. Activity measurement

The activity metric stage calculates the activity metric on a per-block basis. The metric is based on the local variability of the video signal and is calculated as a 2D Sum of Absolute Differences (SAD). Since the SAD is based on differences, it is capable of finding edges, texture variation and flat background blocks. Hence, this metric is capable of classifying the activity in blocks. Texture variation can also be measured in the frequency domain using a block-based transform. This is explored as an alternative.

B. Classification

The classification stage partitions the calculated block-based metric into intervals to discriminate the contents of the block. The intervals are ranked into the following classification: (1) “flat and/or low-frequency” region, (2) “potentially artifact contaminated” region and (3) “texture” region.

C. Artifact-location detection kernel

The size of the 2D detection region, i.e. the local neighborhood surrounding the center block or pixel, influences the artifact-detection accuracy and should be large enough to cover the discrimination of edges, background information and information surrounding an edge. This explains why the kernel size is chosen as a local neighborhood of 5 × 3 blocks, as depicted in Fig. 6.6(b). Due to the fact that ringing or mosquito noise cannot be uniquely distinguished from low-intensity texture with a simple metric, a method is required that sufficiently supports the finding of mosquito noise patterns. The blocks employed in the kernel are smaller than the MPEG-coded transform blocks, because this provides a better detection coverage. Hence, an artifact-contaminated region in the spatial domain is defined as either a single pixel p(x,y) or a group of pixels in block H, or all pixels constructing block H in a frequency-domain approach (discussed later). In either situation, a binary signal is derived as output, revealing the presence or absence of coding noise. On the basis of this binary signal, a diamond-shaped low-pass filter control signal is derived, utilizing the block features surrounding center block H.


Figure 6.6 — Configuration of an artifact-location detection kernel. (a) 2D region forming a block for activity metric calculation. (b) Blocks constructing the detection kernel considering the local neighborhood.

D. Low-pass filter-strength control signal
The purpose of indicating artifact locations is to attenuate the coding artifacts at the appropriate places, i.e. where they are typically visible. For these locations, the noise should be removed with an edge-preserving low-pass filter. In order to avoid visible switching artifacts introduced by the low-pass filter operation, we ensure a smooth filter-status transition for both gradual on- and off-switching of the filter. This smoothness is obtained by a transient function that gradually increases or decreases the filtering strength, as sketched below. In the next section, the above algorithmic aspects are further elaborated and worked out in detail.
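The exact transient function is not specified at this point, so the following Python sketch only illustrates the idea: a hypothetical linear ramp (length RAMP_LEN is an assumed parameter) that gradually moves the filter strength of one video line between off (0) and on (1), yielding the gradual on- and off-switching described above.

```python
RAMP_LEN = 4  # hypothetical transition length in pixels

def filter_strength(binary_map, ramp_len=RAMP_LEN):
    """Turn a per-pixel binary artifact map (one video line) into a
    gradually rising and falling filter-strength signal in [0, 1]."""
    strength, current, step = [], 0.0, 1.0 / ramp_len
    for contaminated in binary_map:
        target = 1.0 if contaminated else 0.0
        if current < target:                  # gradual on-switching
            current = min(current + step, target)
        elif current > target:                # gradual off-switching
            current = max(current - step, target)
        strength.append(current)
    return strength

print(filter_strength([0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0]))
# [0.0, 0.0, 0.25, 0.5, 0.75, 1.0, 1.0, 0.75, 0.5, 0.25, 0.0]
```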

6.4 Block-based artifact-location detection algorithms and low-pass filter control

In this section, we propose two block-based artifact-detection algorithms, operating in either the spatial or the frequency domain, suitable for detecting potentially visible artifact locations. Furthermore, we propose an algorithm for deriving the final low-pass filter-strength control signal and the integration of this signal with an existing edge-preserving low-pass filter.

6.4.1 Algorithm for spatial block-based artifact-location detection
This section presents our proposed algorithm for deriving locations with potentially visible coding artifacts in the spatial domain (from now on indicated as artifact locations). The algorithm follows the functional system decomposition as depicted in Fig. 6.5.

A. Activity metric
In order to avoid detection of individual edges, the activity metric in the spatial domain is based on a block-based operation involving the computation of the


2D Sum of Absolute Differences (SAD), specified by

SAD = Σ_{y=1}^{N} Σ_{i=0}^{M−1} |p(x+i, y) − p(x+i+1, y)| + Σ_{x=1}^{M} Σ_{j=0}^{N−1} |p(x, y+j) − p(x, y+j+1)|.   (6.1)

Hereby, p(x,y) denotes a start 2D pixel position at spatial block-grid location (x,y), while M and N denote the width and height of the block. This metric is calculated for each block constructing the artifact-location detection kernel.
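As an illustration, the following NumPy sketch transcribes Eq. (6.1) directly; it assumes 0-based array indexing, and the extra column and row read beyond the M × N block reflect the one-pixel block overlap of Fig. 6.7(a).

```python
import numpy as np

def block_sad(p, x, y, M, N):
    """2D SAD of Eq. (6.1): horizontal plus vertical absolute pixel
    differences inside the M x N block at grid position (x, y)."""
    p = np.asarray(p, dtype=np.int64)             # avoid uint8 wrap-around
    horiz = p[y:y + N, x:x + M + 1]               # one extra column: blocks
    vert = p[y:y + N + 1, x:x + M]                # overlap by one border pixel
    sad_h = np.abs(np.diff(horiz, axis=1)).sum()  # |p(x+i,y) - p(x+i+1,y)|
    sad_v = np.abs(np.diff(vert, axis=0)).sum()   # |p(x,y+j) - p(x,y+j+1)|
    return int(sad_h + sad_v)
```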

B. SAD classification
Each calculated SAD metric is classified employing discriminative features. Due to the nature of the applied SAD metric, the SAD value provides an indication of the amount of intra-block variation. Hereby, a low SAD value indicates a “flat and/or low-frequency” region, a medium SAD value indicates a “potentially artifact contaminated” region, and a high SAD value corresponds to a “texture” region. The possibly broad range of SAD values is partitioned into the previously indicated classification intervals by means of a set of thresholds, see Fig. 6.7(b) and Table 6.1. It should be noted that the thresholds refer to the cumulative SAD value, i.e. each threshold also involves the intervals with a lower SAD value. For example, a potentially contaminated region may contain a “flat and/or low-frequency” background with a superimposed noise pattern, leading to a medium SAD value. This leads to the consideration that a medium-valued SAD block is potentially contaminated by noise, or contains intended medium texture. For simplicity, we hypothesize that a center block is filled with coding noise and verify our hypothesis by inspecting the spatial context of this center block. For each block constructing the artifact-detection kernel except block H, see Fig. 6.7(a), the calculated SAD for that block is classified into one of the three intervals: “flat and/or low-frequency” region, “texture” region, or none of these two cases, leading to a medium-texture region, see Fig. 6.7(b). For the center block only, thus the SAD value of block H, Table 6.1 does not apply, as this block is only classified as “texture” or “non-texture”. Since

Table 6.1 — Threshold definition for spatial-domain SAD classification (Tt > Tl).

Threshold | Region classification             | SAD interval
Tl        | flat and/or low-frequency         | SAD ≤ Tl
Tt        | potentially artifact contaminated | SAD < Tt
Tt        | texture                           | SAD ≥ Tt


“non-texture” may mean that block H is noise contaminated, a further spatial analysis is then required, involving the inspection of the surrounding blocks.
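A minimal sketch of this classification, with the interval boundaries taken from Table 6.1; the string labels and function names are illustrative only.

```python
def classify_sad(sad, t_l, t_t):
    """Map a block SAD onto the Table 6.1 regions. Thresholds are
    cumulative: an interval also covers all lower SAD values."""
    if sad <= t_l:
        return "flat/low-frequency"
    if sad < t_t:
        return "potentially contaminated"
    return "texture"

def classify_center(sad_h, t_t):
    """Center block H is only split into texture / non-texture."""
    return "texture" if sad_h >= t_t else "non-texture"
```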

C. Artifact-location detection kernel
For each pixel in the image, located at the center of block H, an artifact-location detection kernel is analyzed using a set of surrounding blocks, which overlap with one border pixel, see Fig. 6.7(a). This kernel is shifted in a sliding-window fashion across the pixels constructing the video lines. Basically, the chosen block size for the detection kernel in the spatial domain is defined as a fixed setting of either 3 × 3 or 5 × 5 pixels. This choice depends on the video resolution, which is either native (SD/HD) or upscaled (3 × 3 and 5 × 5 pixels, respectively). An odd block size avoids the detection to be biased in a certain direction. In order to detect small-sized flat regions, the block size of center block H must be kept small. Although we have employed square blocks, this is not mandatory. A non-square block may be chosen to either: (1) extend the detection area, or (2) influence the SAD sensitivity.

Potential artifact-location detection. Figure 6.7(c) depicts the full artifact-detection system operating in the spatial domain. At the left-hand side, 7 video

Figure 6.7 — Spatial-domain artifact-location detection system for visual artifacts based on fixed 3 × 3 blocks. (a) Construction of detection kernel deploying one border pixel of overlapping for the 3 × 3 blocks (light gray color). (b) Positioning of threshold in SAD range. (c) Basic block diagram of the spatial-domain artifact- detection system.


lines form the input for the detection kernel, which is constructed on the basis of 15 one-pixel overlapped blocks, with a fixed block size of 3 × 3 pixels, as an example. For each block in the detection kernel, the SAD-based activity metric is calculated. This results in 15 block-based SAD values (SAD A,..,SAD O), which are classified using two thresholds Tl and Tt. This classification results in two Boolean signals per block, leading to two Boolean sets L = {La,..,Lo} and T = {Ta,..,To}, revealing the “flat and/or low-frequency” feature and the “texture” feature. On the basis of context reasoning, exploring the presence of “flat and/or low-frequency” regions and “texture” regions around the center block H, the decision concerning artifact contamination of block H is derived, leading to a Boolean signal “Contaminated”. This decision making is based on a set of rules, see Table 6.3, describing the possible noise occurrence cases that may arise from surrounding “texture” and “flat” block patterns. For the actual pixel location in the center of block H, a locally-adaptive control signal for the low-pass filter is derived.

Sensitivity control for perceptual tuning. The above design does not consider the noise-pattern visibility and the visual influence of the filter control. Therefore, a sensitivity control is added, which enables tuning of the detection performance. This tuning attempts to control the visibility of the artifact noise pattern, which is typically well visible in large “flat and/or low-frequency” regions. For control purposes, an additional threshold Tsc, see Fig. 6.7(c), is added to the artifact-location detection, which explicitly tests for the presence of significant coherent areas in the kernel related to “flat and/or low-frequency” regions. When the threshold Tsc is increased, the noise-detection sensitivity becomes lower, leading to more blocks classified as “flat and/or low-frequency”. This results in a higher probability of potential noise-contaminated regions and thus corresponding filtering.

Table 6.2 — Threshold values for native and upscaled video using an artifact-location detection kernel with square and unequally-sized blocks and an 8-bit SAD value.

          |        Native resolution        | Upscaled resolution
Threshold | kernel size                     | kernel size
          | 11 × 7   | 19 × 11  | 21 × 13  | 21 × 13
Tsc       | 0 - 255  | 0 - 255  | 0 - 255  | 0 - 255
Tl        | 0 - 150  | 0 - 150  | 0 - 150  | 0 - 80
Tt        | 150 - 255| 200 - 255| 200 - 255| 160 - 255

Table 6.3 — Parameters and rules for artifact-location detection in the spatial domain, based on context reasoning.

Parameter | Nature | Rules

Classification results:
L | binary | (lB, lC, lD, lG, lI, lL, lM, lN)
T | binary | (tA, tB, tC, tD, tE, tF, tG, tH, tI, tJ, tK, tL, tM, tN, tO)

Spatial-domain “sensitivity” status information:
F | binary | (fA, fB, fC, fD, fE, fF, fJ, fK, fL, fM, fN, fO)
Ftop | binary | (fA ∧ fB ∧ fC ∧ fD ∧ fE)
Fbottom | binary | (fK ∧ fL ∧ fM ∧ fN ∧ fO)
Fleft | binary | (fA ∧ fF ∧ fK)
Fright | binary | (fE ∧ fJ ∧ fO)
Flat | binary | (Ftop ∨ Fbottom ∨ Fleft ∨ Fright)

Spatial-domain “flat and/or low-frequency” status information:
Lrow1 | binary | (lB ∨ lC ∨ lD)
Lrow2 | binary | (lG ∨ lI)
Lrow3 | binary | (lL ∨ lM ∨ lN)
Low | binary | (Lrow1 ∨ Lrow2 ∨ Lrow3)

Spatial-domain “texture” status information:
Trow1 | binary | (tB ∨ tC ∨ tD)
Trow2 | binary | (tG ∨ tI)
Trow3 | binary | (tL ∨ tM ∨ tN)
Tex | binary | (Trow1 ∨ Trow2 ∨ Trow3)

Spatial-domain “artifact contaminated” center-block status information:
Block Hcenter | binary | ¬tH
Contaminated (center pixel mosquito noise / ringing contaminated) | binary | (Low ∧ Flat ∧ Tex ∧ Block Hcenter)

Empirical threshold settings. Table 6.2 indicates typical threshold values for native and upscaled video and an activity detection kernel with square blocks and rectangular blocks.

D. Spatial noise-pattern analysis
In order to determine artifact contamination of block H, context reasoning is conducted involving a set of Boolean equations. These equations describe the properties of the SAD-value distribution over the detection kernel. The satisfaction of each SAD value to the imposed thresholds results in a binary indicator, being either true or false. In order to encompass the sensitivity control with its threshold Tsc, an additional Boolean set F is introduced, indicating

whether a block is potentially part of a flat area in the detection kernel. Hence, the evaluation results in three sets L, F and T, respectively referring to “flat and/or low-frequency”, “part of flat area” and “texture”. Each set contains the binary classification values for the blocks constructing the detection kernel. The Boolean equations describing the context reasoning for potential artifact detection are presented in Table 6.3. The equations describe the context-reasoning process associated with the right-hand block in Fig. 6.7(c). Although the processing is block-based, the result of this reasoning process is a pixel-based decision function, indicating whether the center pixel of block H is artifact contaminated.

Let us start with the middle of Table 6.3. The blocks surrounding center block H are tested to contain at least one “flat and/or low-frequency” block and at least one “texture” block. If so, the “Low” and “Tex” indicators are set to true. Furthermore, if center block H is not textured, see the bottom of the table, artifact contamination is considered possible. Finally, the possible visibility of the coding noise is evaluated with the flat-area test. This test involves the verification that the complete row of top or bottom blocks of the kernel is flat, or that the two columns at the left and right of block H are flat. The binary indicator “Flat” becomes true if one of the four conditions occurs. The final filtering switches on for the situation that all four evaluations are true.
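The following Python sketch evaluates the Table 6.3 rules for one kernel position; the dictionaries L, T and F are assumed to hold the Boolean classification results for the blocks 'A'..'O' produced by the threshold tests on Tl, Tt and Tsc, respectively.

```python
def contaminated(L, T, F):
    """Table 6.3 context reasoning for center block H of the 5 x 3 kernel."""
    low = (L['B'] or L['C'] or L['D'] or      # row above H
           L['G'] or L['I'] or                # left/right of H
           L['L'] or L['M'] or L['N'])        # row below H
    tex = (T['B'] or T['C'] or T['D'] or
           T['G'] or T['I'] or
           T['L'] or T['M'] or T['N'])
    flat = ((F['A'] and F['B'] and F['C'] and F['D'] and F['E']) or  # top row
            (F['K'] and F['L'] and F['M'] and F['N'] and F['O']) or  # bottom row
            (F['A'] and F['F'] and F['K']) or                        # left column
            (F['E'] and F['J'] and F['O']))                          # right column
    center_ok = not T['H']                    # center block must be non-texture
    return low and flat and tex and center_ok
```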

6.4.2 Algorithm for frequency-domain artifact-location detection
This section presents our proposed algorithm for deriving artifact locations in the frequency domain, as an alternative to the spatial-domain solution of the previous section. The algorithm follows the same functional diagram of the system decomposition, as depicted in Fig. 6.5.

A. Adaptations resulting from employing the frequency domain
Although the frequency-domain solution has a high resemblance with the already proposed spatial-domain concept, it differs with respect to: (1) the involved activity metric, which is now related to the frequency domain, (2) the quantization of the calculated energy or a simplified form of it, and (3) the specific classification of the activity associated with the region characterization. It will become clear that analysis in the frequency domain enables a more refined characterization, since artifacts such as ringing can be well identified. The frequency transformation is not based on an FFT, but employs a 4 × 4 integer DCT. The use of the DCT is beneficial for two purposes: (1) the transform offers a real and fast implementation suited for embedded usage, and (2) the DCT is already required for the decoding of DTV signals. The DTV decoding is based on 8 × 8 DCT blocks, in which a 4 × 4 DCT can be included easily. In the above paragraph, the term

frequency is regularly used. The DCT produces spatial frequencies (u,v) after transformation, which are employed to determine noise patterns and the associated critical locations. In the remaining part of this chapter, we denote the usage of the spatial frequencies (u,v) as detection in the frequency domain, so that the notation is simplified. Signal processing in the frequency domain offers a more refined analysis of the noise patterns resulting from video compression. Compared to the spatial domain, we are now able to extend the analysis to measure specific noise patterns such as “mosquito noise” and “ringing”. For this reason, we present a set of four typical noise patterns, which potentially occur in DTV-coded video content. These noise patterns are shown in Fig. 6.8. Compared to the spatial domain, we have added two patterns specifically representing coding noise: (1) ringing and (2) mosquito noise. The measurement of these patterns can be efficiently implemented in the frequency domain by analyzing the AC coefficient patterns occurring after DCT transformation. Ringing patterns, as indicated in the figure, can be identified by applying energy-based ranking in combination with accumulating the coefficient energy of the non-pure horizontal (or vertical) AC coefficients. These pure directional structures are covered by the first row (or column) of the obtained coefficient matrix. The energy-ranking mechanism avoids the detection of individual low-

Figure 6.8 — Video coding artifacts used for activity measurement in the transform domain and the corresponding DCT coefficient subsets. (a) “Flat and/or low- frequency” region. (b) “Horizontal ringing” region. (c) “Vertical ringing” region. (d) “Mosquito noise” contaminated. Only coefficients with label a are used for en- ergy calculation in subfigure (a), (b) and (c). For subfigure (d), the coefficients with label b and c are used for magnitude comparison.


or high-frequency patterns, thereby simplifying the detection of the intended noise patterns. As in the spatial-domain case, the coefficient patterns are not unique, and the analysis involves the surrounding blocks covering the larger spatial context, providing information whether the noise pattern is a single occasional event or part of a larger structure. This remark also holds for the detection of mosquito noise.

B. Block-based pre-processing

The activity metric in the transform domain can be defined using a 2D DCT according to Y = A X Aᵀ. Matrix X(i,j) has a size of 4 × 4 samples and A denotes a 4 × 4 integer DCT transform matrix, where A is applied for horizontal and vertical transformation (separability of the 2D DCT). When extending this transformation with a quantization stage Qf, the final equation becomes YQ = A X Aᵀ ⊗ Qf. Hereby, Qf involves a post-scaling operation conducted on each individual DCT coefficient, implementing a scalar quantizer identical to the standard version as employed in H.264 coding [40]. In this way, a total of 52 weighted quantization values, in the range of 0-51, are available for compression. After transformation and quantization, the resulting quantized coefficient matrix YQ has DCT-domain coordinates (u,v) and can be used to compute the DCT coefficient energy, or a related version of it, on the basis of the obtained quantized set of 4 × 4 DCT coefficients. The coefficient with (u,v) = (0,0) is called the DC coefficient (average block value) and all other coefficients are defined as AC coefficients ((u,v) ≠ (0,0)). An important design parameter is the choice of the adopted block size for the DCT transformation. In the frequency-based system, we want to exploit a block size for artifact detection and associated processing comparable to that of the spatial-domain algorithm, leading to a final block size of 4 × 4 pixels.
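As an illustration, the following NumPy sketch applies the 4 × 4 integer transform and the element-wise post-scaling. The matrix A below is the forward 4 × 4 integer core transform of H.264; the derivation of the scale matrix Qf from the 0-51 quantization index is assumed given, as it follows the H.264 standard [40].

```python
import numpy as np

# Forward 4x4 integer core transform of H.264 (the matrix A above).
A = np.array([[1,  1,  1,  1],
              [2,  1, -1, -2],
              [1, -1, -1,  1],
              [1, -2,  2, -1]], dtype=np.int64)

def dct4x4_quantized(X, Qf):
    """Y_Q = (A X A^T) (element-wise) Qf: separable integer transform of a
    4x4 pixel block, followed by the per-coefficient post-scaling stage.
    Qf is a 4x4 matrix of scale factors derived from the 0-51 quantization
    index (H.264-style derivation assumed)."""
    Y = A @ np.asarray(X, dtype=np.int64) @ A.T
    return np.rint(Y * Qf).astype(np.int64)
```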

Table 6.4 — Threshold definition for frequency-domain energy classification.

Threshold | Region classification     | Energy interval
TFl       | flat and/or low-frequency | EFl ≤ TFl
THedge    | horizontal ringing        | EHedge ≤ THedge
TVedge    | vertical ringing          | EVedge ≤ TVedge
Tb, Tc    | mosquito noise            | |YQ(u,v)| < Tb resp. |YQ(u,v)| < Tc for the “b” and “c” coefficients (u > 0 ∧ v > 0)

C. Energy pattern measurements and classification

Due to the broad range of energy distributions, i.e. the absence and variations of AC coefficients at particular (u,v) locations and the associated AC coefficient amplitudes, we propose a generic calculation approach in combination with an energy-ranking method to conduct the final classification.

Energy and comparison metric. In order to be sufficiently sensitive to low- or high-frequency patterns, the AC coefficient energy is a suitable metric, as it involves the accumulation of squared coefficients. The AC energy calculation is performed according to EAC = Σ_AC YQ(u,v)². This metric generates a broad dynamic range of energies, even for small-sized blocks, enabling robust detection of specific noise patterns, see Fig. 6.8. However, there is one exception to this calculation concept, and that is the detection of mosquito noise. This pattern appears as a random noise pattern with low/medium AC coefficient amplitudes, for which the squaring operation does not aid in the detection. This particularly holds for distinguishing mosquito noise patterns from intended texture. This is a rather fundamental problem that cannot be uniquely solved in an easy way. We have observed a large set of mosquito-noise contaminated images and studied the AC coefficient appearance in detail. From the empirical evaluation of the mosquito noise appearances, we have concluded that the frequency components constructing the noise pattern have low/moderate amplitudes, while the majority of the frequency components are non-zero. For this reason, we make a reasonable assumption about the noise and the way to separate it from texture. Instead of squaring the coefficients and computing the energy, we apply an amplitude test that matches the previous observation about coefficient magnitudes. This amplitude test involves a simple comparison with a threshold. Also for mosquito noise, the noise assumption is then further validated by employing the concept of context reasoning using the surrounding block data patterns, as used in the previous section.
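A small sketch of both metrics follows; the “b” and “c” coefficient positions of Fig. 6.8(d) are assumed to be supplied as Boolean masks, since their exact layout is not reproduced here.

```python
import numpy as np

def ac_energy(YQ):
    """E_AC: sum of squared AC coefficients of a quantized 4x4 block
    (the DC term at (u,v) = (0,0) is excluded)."""
    Y2 = np.square(YQ.astype(np.int64))
    return int(Y2.sum() - Y2[0, 0])

def mosquito_amplitude_test(YQ, t_b, t_c, b_mask, c_mask):
    """Amplitude test used instead of energy for mosquito noise: every
    'b'-labelled coefficient must stay below t_b and every 'c'-labelled
    one below t_c (label positions per Fig. 6.8(d))."""
    mags = np.abs(YQ)
    return bool((mags[b_mask] < t_b).all() and (mags[c_mask] < t_c).all())
```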

Energy ranking. Let us now discuss how the actual ranking of the pattern detection is organized. The calculated energies for the four possible coding artifacts of each DCT block are classified on the basis of a set of thresholds, where each coding artifact has its own energy threshold, see Table 6.4. The key in the final classification is that the coding artifacts are ranked according to their importance. First, if the most important artifact is not measured, then the next-priority artifact is tested on its presence, etc. Second, if one of the artifact measurements satisfies its threshold condition, then the DCT block is assigned the corresponding coding-artifact classification. Third, for the situation that none of the supported artifacts is present, the DCT block is classified as “texture” (without visible noise), see Fig. 6.9, which is also the default setting (see Table 6.5). Let us now implement this approach.


Noise-pattern classification. Prior to elaborating on the ranking process, we define a set X, which contains the elements xi, with i being a block index referring to the surrounding blocks {A,B,C,...,M,N,O} constructing the detection kernel. For each block, the ranking process results in a final region classification, which is stored in xi. The energy calculation and the associated ranking are depicted in Table 6.5. At the top of this table (Step 0), block xi is initialized and obtains the classification value “texture”, which can be modified by one of the four coding artifacts, while sequentially processing the rows constructing Table 6.5.

Let us now further detail the ranking process as depicted in Table 6.5. The most important noise visibility occurs in “flat and/or low-frequency” regions. Therefore, we start with detecting this type of region (Step 1). To this end, we measure the AC energy of the whole DCT block. If this energy exceeds the associated threshold, then there is a chance that this noise occurs due to ringing or similar patterns. Therefore, we refine the detection of possible noise with respect to ringing. We do this by measuring the energy associated with the non-ringing parts of the AC coefficients. This non-ringing part is the AC coefficient block without the first column or row of coefficients. This non-ringing part is analyzed with respect to the AC energy, and if this is sufficiently low, the energy is contained in the first row and/or column (Steps 2 and 3). This coefficient pattern may involve ringing, which can be validated by context reasoning over the surrounding blocks. If the AC energy of the non-ringing part is higher than the associated threshold, the DCT block is tested for mosquito noise (Step 4), as discussed above. If none of the four noise patterns applies, the block remains classified as texture. For the first three coding noise patterns, the AC coefficients are squared, whereas for the fourth coding noise pattern, called “mosquito noise”, the classification compares coefficient magnitudes, where all “b” coefficients must be smaller than their threshold Tb and all “c” coefficients smaller than their threshold Tc. The overview of these individual tests is given at the bottom of Table 6.5. Hereby, the second top row in the table indicates the always-computed energy value based on the AC coefficients for the three noise types evaluated in Steps 1, 2 and 3. The last row indicates the AC-coefficient magnitude comparison for mosquito noise detection (Step 4).

Example of ringing detection. The transformation of a 4 × 4 region contaminated with vertical ringing results in an AC coefficient matrix YQ, as depicted in the lower part of Fig. 6.8(c). The energy of interest is located at the “×” locations of the coefficient matrix YQ in that figure, while theoretically the energy at the “a” locations is zero. When calculating the energy to verify the “flat and/or low-frequency” region, all AC coefficients are squared and accumulated. Due to the presence of vertical ringing, the final calculated energy will not satisfy the energy comparison associated with the “flat and/or low-frequency” region. The

Table 6.5 — Block-based energy calculation for frequency-domain feature classification. Input to the classification is formed by the set of AC coefficients YQ(u,v) for (u,v) ≠ (0,0).

Parameter                                     Specification
Step 0: Block feature initialization          xi = Tex
Energy equation for flat and/or
low-frequency and ringing detection           Ex = Σu Σv YQ(u,v)²
Step 1: Flat and/or low-frequency             EFl, with u ∈ {0,1,2,3}, v ∈ {0,1,2,3},
detection (xi = Fl)                           (u,v) ≠ (0,0); xi = Fl if EFl < TFl
Step 2: Vertical-edge/ringing                 EVedge over the non-ringing part v > 0
detection (xi = Vedge)                        (block without first coefficient row);
                                              xi = Vedge if EVedge < TVedge
Step 3: Horizontal-edge/ringing               EHedge over the non-ringing part u > 0
detection (xi = Hedge)                        (block without first coefficient column);
                                              xi = Hedge if EHedge < THedge
Step 4: Mosquito-noise detection              ∀ “b” coefficients: |YQ(u,v)| < Tb, and
(xi = Mos)                                    ∀ “c” coefficients: |YQ(u,v)| < Tc

D. Artifact-location detection kernel

For the final detection of potential noise patterns in center block H, we hypothesize that a center block is filled with coding noise and verify this hypothesis by inspecting the spatial context (the surrounding blocks) of this center block. For each region covered by the center block H, a detection kernel is analyzed using a set of non-overlapping surrounding 4 × 4 pixel blocks, see Fig. 6.9(a).


Figure 6.9 — Frequency-domain artifact detection. (a) Transform-block locations. (b) Feature ranking, lowest number refers to highest priority. (c) Basic block diagram of the frequency-domain artifact-detection system.

This kernel is shifted in a sliding-window fashion across the image on a block-by-block basis.

Potential artifact-location detection. Figure 6.9(c) depicts the basic block diagram of the artifact-detection system operating in the frequency domain. At the left-hand side, three block stripes (12 video lines) form the input for the window-based detection kernel. For each block in the detection kernel, the DCT-based activity metric is calculated. Prior to energy classification, the energy may be quantized, thereby locally smoothing the image in a 4 × 4 pixel block. The energy classification involves four conditions, which are depicted in Table 6.5. After energy ranking, see Table 6.5, each block xi ∈ {A, B, ..., N, O} is assigned its final feature. On the basis of context reasoning, using this set of block-oriented features, the potential contamination with “mosquito noise” or “ringing” in block H is derived, leading to a Boolean signal “contaminated”.


The derivation of this binary signal involves a process similar to the detection of “mosquito noise” contamination in the spatial domain from the previous section. However, a separate detection is conducted to reveal the presence of potential “ringing” in block H. The detection of potential “ringing” involves a direct relation with a “texture” region and its visibility enabled by a “flat and/or low-frequency” region. The upper part of Table 6.7 shows the rules for detecting “mosquito noise”, while the lower part depicts the rules for “ringing” detection in the frequency domain. The final calculation of the Boolean signal “contaminated” is depicted at the bottom of Table 6.7.

Empirical threshold settings. Table 6.6 indicates typical threshold values for noise-pattern detection in the frequency domain, involving native and upscaled video. For all rows in the table with a threshold range 0–n, the corresponding quantization value Qf ranges from 30 down to 4, respectively.

E. Frequency artifact noise-pattern analysis

In order to determine artifact contamination for block H, context reasoning is conducted with a set of Boolean equations, involving the detection-kernel block features xi. These equations describe the properties of the block-based feature distribution over the detection kernel. The final artifact detection is based on four aspects, each captured by a final Boolean variable. The Boolean equations describing the context reasoning for potential artifact detection are presented in Table 6.7. The equations describe the context-reasoning process performed by the right-hand block in Fig. 6.9(c). As the processing is block-based, the result of this reasoning process is a block-based decision function, indicating whether block H is contaminated with “mosquito noise” or “ringing”.

Table 6.6 — Thresholds for 8-bit native and upscaled video using an activity-detection kernel in the frequency domain with 10-bit AC coefficients and 24-bit energy values. For each row, the quantization value Qf ranges from 30 down to 4.

Threshold     Value native resolution     Value upscaled resolution
              kernel size 20 × 12         kernel size 20 × 20
TFl           0 - 60                      0 - 40
TVedge        0 - 10                      0 - 10
THedge        0 - 10                      0 - 10
Tb            0 - 40                      0 - 30
Tc            0 - 30                      0 - 20


Table 6.7 — Parameters and rules for spatial/context reasoning, based on surrounding blocks for final coding noise detection in the frequency domain.

Parameter       Reasoning rules

Frequency-domain texture detection
Txtr    = 1, if ∃ xi ∈ X, i ∉ {A,F,K,H,E,J,O}, where xi = Tex; 0, otherwise.

Frequency-domain flat and/or low-frequency detection
Fc1A    = 1, if ((xB = Fl ∨ xG = Fl ∨ xF = Fl) ∧ xA = Fl); 0, otherwise.
Fc1F    = 1, if ((xB = Fl ∨ xG = Fl ∨ xL = Fl) ∧ xF = Fl); 0, otherwise.
Fc1K    = 1, if ((xG = Fl ∨ xL = Fl ∨ xF = Fl) ∧ xK = Fl); 0, otherwise.
Fc1     = Fc1A ∨ Fc1F ∨ Fc1K
Flat    = Fc1 ∨ Fc2 ∨ Fc3 ∨ Fc4 (Fc2, Fc3 and Fc4 are defined analogously for columns C2–C4)

Frequency-domain center-block contamination
Cntr    = 1, if (xH = Fl) ∨ (xH = Mos); 0, otherwise.

Frequency-domain ringing detection
Rb      = 1, if (xM = Fl ∧ xH = Vedge ∧ xC = Tex); 0, otherwise.
Rl      = 1, if (xG = Fl ∧ xH = Hedge ∧ xI = Tex); 0, otherwise.
Rr      = 1, if (xI = Fl ∧ xH = Hedge ∧ xG = Tex); 0, otherwise.
Rt      = 1, if (xC = Fl ∧ xH = Vedge ∧ xM = Tex); 0, otherwise.
Ring    = Rb ∨ Rl ∨ Rr ∨ Rt

Frequency-domain center block artifact contaminated
Contaminated = (Flat ∧ Txtr ∧ Cntr) ∨ Ring

Let us start at the top of Table 6.7. Block H is “mosquito noise” contaminated when: (1) the surrounding blocks of center block H contain at least one “texture” block and at least two neighboring “flat and/or low-frequency” blocks, where this “flat and/or low-frequency” region is determined for the four columns C1, C2, C3, C4, see Fig. 6.9(a), using the neighboring block relations Fc1A, Fc1F, Fc1K, as depicted in Table 6.7; and (2) block H is either “flat and/or low-frequency” or “mosquito noise”, as depicted in the middle of Table 6.7.

Block H is “ringing” contaminated when: (1) block H is either “horizontal ringing” or “vertical ringing”, and (2) this block is preceded by a “flat and/or low-frequency” block and succeeded by a “texture” block, or vice versa, as depicted in the lower part of Table 6.7. The results of the previous individual conditions are combined in a final Boolean equation, as depicted at the bottom of Table 6.7. The final filtering is switched on when the mosquito-noise or ringing detection is true.
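The context reasoning of Table 6.7 reduces to a few Boolean expressions over the block features. The sketch below assumes the features are stored in a dictionary keyed by the block labels of Fig. 6.9(a); the relations Fc2–Fc4 are passed in, and the Rr condition is reconstructed by symmetry with Rl, as these are not fully listed in the table.

def h_is_contaminated(x, f_c2=False, f_c3=False, f_c4=False):
    # x maps the block labels 'A'..'O' of Fig. 6.9(a) to their features.
    # Texture present somewhere in the kernel, excluding A,F,K,H,E,J,O.
    txtr = any(feat == "Tex" for lbl, feat in x.items() if lbl not in "AFKHEJO")
    # Column-1 flat and/or low-frequency relations (Fc2..Fc4 analogous).
    fc1a = (x["B"] == "Fl" or x["G"] == "Fl" or x["F"] == "Fl") and x["A"] == "Fl"
    fc1f = (x["B"] == "Fl" or x["G"] == "Fl" or x["L"] == "Fl") and x["F"] == "Fl"
    fc1k = (x["G"] == "Fl" or x["L"] == "Fl" or x["F"] == "Fl") and x["K"] == "Fl"
    flat = fc1a or fc1f or fc1k or f_c2 or f_c3 or f_c4
    # Center block is flat or already classified as mosquito noise.
    cntr = x["H"] in ("Fl", "Mos")
    # Ringing: an edge-classified center block between a flat block and a
    # texture block; the Rr condition is mirrored from Rl by symmetry.
    ring = ((x["M"] == "Fl" and x["H"] == "Vedge" and x["C"] == "Tex") or  # Rb
            (x["G"] == "Fl" and x["H"] == "Hedge" and x["I"] == "Tex") or  # Rl
            (x["I"] == "Fl" and x["H"] == "Hedge" and x["G"] == "Tex") or  # Rr
            (x["C"] == "Fl" and x["H"] == "Vedge" and x["M"] == "Tex"))    # Rt
    return (flat and txtr and cntr) or ring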

6.4.3 Algorithm for control of the filter strength by entropy-based low-pass filtering

On the basis of the artifact-location signal, a filter-strength control signal with a diamond-shaped aperture is derived, avoiding the introduction of new artifacts resulting from the switching behavior of the low-pass filter. The applied low-pass filter has been adopted from [129], because of its flexibility to adapt the actual filter aperture and its preservation of strong edges. The filter-strength signal controls the low-pass filter in three ways: (1) it indicates the pixel locations that need to be smoothed, (2) it chooses the applied filter aperture to be either 3 × 3 or 5 × 5 (more smoothing), and (3) it modifies the filter coefficients according to [129].

Algorithm 15 2D expansion of spatial artifact-location detection signal
Require: Detection kernel block features
Ensure: Smooth filter-strength decline
  Set maximum filter strength sc for the detected artifact location
  ⊲ Adaptive filter-strength expansion in top, left, bottom, right,
  ⊲ anti-diagonal and diagonal direction
  for i = direction do
      if direction == flat then            ⊲ Check flatness in direction
          α = 1                            ⊲ Maximal aperture, slow decline rate
      else
          α = 2                            ⊲ Minimal aperture, fast decline rate
      end if
  end for
  At position (i − n, j − m):
      tmp = max(sc − α(|n| + |m|), 0)
      g(i − n, j − m) = max(g(i − n, j − m), tmp)
  for −N ≤ m ≤ +N and −(N − |m|) ≤ n ≤ +(N − |m|)


The derivation of the final filter-strength control signal is based on an adaptive 2D expansion of the artifact-location signal, which is elaborated hereafter.

Figure 6.10 — Aperture of the filter-strength control signal for spatial-domain detection. All parameters along the axes are relative distances in pixels, counted from the pyramid center.

A. Spatial-domain filter-strength control signal and expansion

For the spatial-domain detection system, the aperture of the artifact-location signal depends on the deployed block size constructing the detection kernel and on the presence of “flat and/or low-frequency” regions. The minimal and maximal apertures are depicted in Fig. 6.10. The plots are given for clarity, because in practice the strength value of the control signal will not follow such ideal slopes and the strength distribution will not be equal in all directions simultaneously (at least in one direction, there should be texture to create visible artifacts). In these artificial plots, the maximal aperture is shown for a detection kernel based on 5 × 5 pixel blocks, while the minimal aperture is shown for a detection kernel of 3 × 3 pixel blocks. The diamond-shaped aperture is calculated according to Algorithm 15. The specified calculation involves a diamond-shaped coefficient-pattern construction that is symmetric in both horizontal and vertical directions. The filter-strength setting is restricted to two values only, thus


α = {1, 2}. For α = 2, the control signal decline is faster, so that the effective aperture is halved.

The processing is conducted on a pixel-by-pixel and line-by-line basis. For each individual pixel, the artifact-detection signal leads to a diamond-shaped control signal overlapping the surrounding pixels. The final strength of the control signal of a pixel is determined when all individual pixels have been addressed. In practice this means that a previously calculated control signal for a pixel can be re-assigned to another value. In order to avoid that a particular pixel obtains a lower control value from a successive calculation, the largest filter-strength control value is maintained by incorporating a max function in the algorithm. In the presented algorithm, two max functions are applied, for the following reasons. The first max function prevents negative values outside the aperture and bounds the strength signal to zero. The second max function prevents a pixel location from being assigned a filter-strength value that is lower than a previously calculated strength value. Such a situation may happen when the pyramidal pattern of Fig. 6.11 is shifted from the current pixel to the next pixel, or from the current line to the next line. When shifting this pyramidal pattern while applying the max functions, the final control-signal pattern is effectively expanded over a larger area. This explains the term 2D expansion of the effective filter-control signal.
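A minimal Python sketch of this expansion, for a fixed decline rate α, is given below; Algorithm 15 additionally adapts α per direction, which is omitted here for brevity.

import numpy as np

def expand_detection(det, sc, alpha, N):
    # 2D expansion of a boolean per-pixel detection map `det` into a
    # filter-strength map g with a diamond-shaped decline.
    h, w = det.shape
    g = np.zeros((h, w), dtype=np.int32)
    for i, j in zip(*np.nonzero(det)):            # every detected pixel
        for m in range(-N, N + 1):
            span = N - abs(m)
            for n in range(-span, span + 1):
                y, x = i - n, j - m
                if 0 <= y < h and 0 <= x < w:
                    # First max: clip to zero outside the diamond.
                    tmp = max(sc - alpha * (abs(n) + abs(m)), 0)
                    # Second max: never lower a previously assigned strength.
                    g[y, x] = max(g[y, x], tmp)
    return g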

B. Frequency-domain filter-strength control signal


Figure 6.11 — Relation between the artifact-detection kernel and the filter-strength aperture due to 2D expansion. Here, the dark center block indicates the potential noise-contaminated block with the highest filter-strength value. The gray-valued blocks indicate the decreasing filter-strength region and the white blocks have a zero-valued filter strength.


Figure 6.12 — Aperture of the low-pass filter-strength control signal for frequency-domain detection. All parameters along the axes are relative distances in pixels, counted from the upper-left corner of the center block.

The derivation of the filter-strength control signal for the frequency-domain detection system differs from the spatial domain in the sense that the artifact-location detection signal is now block-based, resulting in a block-based filter-strength control signal. For the situation that center block H is contaminated with coding noise, all pixels constructing block H are filtered, see Fig. 6.11. As a result, the filter-strength control signal is “hat”-shaped, see Fig. 6.12. The filter-strength decline is obtained in a similar manner to the spatial domain and starts at the block boundary. The plots are again based on an illustrative example for clarity. The calculation, including the two max functions, is performed according to Algorithm 16, which follows a similar approach as discussed for the spatial domain, while the actual filter-strength signal is calculated on a per-block basis.

C. Control of the entropy-based low-pass filter

The generated filter-strength control signal is employed by the entropy-based low-pass filter depicted in Fig. 6.13. The calculated filter-strength control signal from the previous subsections A and B is supplied to the adaptive low-pass filter (at the top of the figure).


The complete filter of [129] also involves an entropy calculation and an edge detection. The filter-strength signal computed by our system is translated via a Look-Up Table (LUT) into an entropy value, thereby effectively controlling the filtering. Moreover, the control signal reveals the pixel locations where filtering should be applied. The edge detection is based on computing the local dynamic range, so that pixels located near an edge are protected against low-pass filtering. This prevents strong edges from being blurred. This mechanism overrules our computed filter-strength control signal to preserve sharpness.
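The control path can be sketched as follows; the LUT contents and the 3 × 3 dynamic-range test are illustrative assumptions, since the exact filter internals are specified in [129].

import numpy as np

def filter_control(strength, luma, lut, dyn_range_thr):
    # Translate the filter-strength map into per-pixel entropy settings via
    # a LUT, and let the edge detection (local dynamic range) overrule the
    # control signal near strong edges to preserve sharpness.
    entropy = lut[strength]                       # strength -> entropy value
    h, w = luma.shape
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            win = luma[i - 1:i + 2, j - 1:j + 2]
            if int(win.max()) - int(win.min()) > dyn_range_thr:
                entropy[i, j] = 0                 # protect edge pixels
    return entropy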

6.5 Experiments and validation results of block-based artifact-detection

In this section, we present experimental results for the two proposed approaches of the artifact-location detection system and the resulting picture quality after filtering when such detection systems are applied. The first part of this section addresses the performance and scores of each detector at pixel level, where for each detector both visual and numerical results are presented. The second part of this section provides the results on filtering of the detected artifacts, when the previous detection systems are applied.

The systems used for the detection experiments follow the system descriptions presented in Section 6.4.1 and Section 6.4.2, in particular Fig. 6.7 and Fig. 6.9. The system used for adaptive low-pass filtering is presented in Section 6.4.3, according to Fig. 6.13. The derived filter-strength control signal for both approaches is supplied to an edge-preserving local-entropy-based low-pass filter [129], which then actively filters the coding noise. All results have been obtained on the basis of a standalone software-based artifact-reduction simulation, employing both spatial-domain and frequency-domain detection processing.

Figure 6.13 — Basic block diagram of the entropy-based locally-adaptive LP filter.

The spatial-domain-based detection system has been validated using an artifact-location detection kernel utilizing various block sizes, whereas the frequency-domain-based visual-artifact location detection system is validated for a fixed block size.

Algorithm 16 2D expansion of frequency artifact-location detection signal
Require: Detection kernel block features
Ensure: Smooth filter-strength decline
  Set maximum filter strength for the detected artifact location:
      gH(i, j) = sc, for 0 ≤ i ≤ B − 1 and 0 ≤ j ≤ B − 1   ⊲ B is the block dimension
  ⊲ Adaptive filter-strength expansion in top, left, bottom, right,
  ⊲ anti-diagonal and diagonal direction
  for i = direction do
      if direction == flat then            ⊲ Check flatness in direction
          α = 1                            ⊲ Maximal aperture
      else
          α = 2                            ⊲ Minimal aperture
      end if
  end for
  ⊲ Calculate the strengths for all surrounding blocks
  Choose block coordinates
  for all pixels (i − n, j − m) inside a block do
      top_diag     = max(sc − αm − αn, 0)  ⊲ Top-left block
      top          = max(sc − αm, 0)       ⊲ Top-middle block
      top_antidiag = max(sc − αm − αn, 0)  ⊲ Top-right block
      left         = max(sc − αn, 0)       ⊲ Left-hand block
      etc.
      ⊲ Final filter-strength assignment
      gB(i − n, j − m) = max(gB(i − n, j − m), top_diag)                     ⊲ Top-left block
      gC(i + n − 1, j − m) = max(gC(i + n − 1, j − m), top)                  ⊲ Top-middle block
      gD(i + B − 1 + n, j − m) = max(gD(i + B − 1 + n, j − m), top_antidiag) ⊲ Top-right block
      gG(i − n, j + m − 1) = max(gG(i − n, j + m − 1), left)                 ⊲ Left-hand block
      etc.
  end for


Both approaches have been tested on native video, while results for upscaled video are given in an appendix.

The chosen dataset with test images has been adopted from [123], which enables a fair comparison with results obtained from literature published in the same time period.

6.5.1 Evaluation of the detection approaches

We compare the artifact-detection performances of both spatial-domain and frequency-domain processing. For the spatial-domain detection, we have used block sizes of 3 × 3 and 5 × 5 pixels, forming detection-kernel sizes of 11 × 7 and 21 × 13 pixels, respectively. Furthermore, we have used a detection kernel with an aperture of 19 × 11 pixels constructed from a combination of three different block sizes in one grid, where the center block has 3 × 3 pixels and its horizontal and vertical neighbors have rectangular block sizes, either 3 × 5 or 5 × 3 pixels. In this mixed grid, all remaining blocks have 5 × 5 pixels. The motivation for this mixed-grid experiment is to obtain a better discrimination between intended texture and noise patterns, while preserving the detection of finely detailed edges. The frequency-domain detection system is verified for a fixed block size of 4 × 4 pixels, resulting in a detection-kernel aperture of 20 × 12 pixels. The motivation for this fixed size is the desire to align the detection block size with the 8 × 8-pixel block grid employed for MPEG video coding in the DTV receiver.

A. Visual performance impression of the artifact-detection systems

Prior to detailing the experimental artifact-location detection results for spatial- and frequency-based detection, we first visualize for both domains the detected pixel results for a representative image. This visualization gives a quick impression of the operation of the detection system.

Artifact-location detection in the spatial domain. Figure 6.14(b) shows that the 2D SAD value is suitable for clearly distinguishing flat regions from textured regions. However, note that the top region of the face has low SAD values, although filtering should be prevented at such locations. This explains why the proposed algorithm requires that relatively large flat regions be present for detection. Figure 6.14(c) reveals the detected pixels for possible filtering. The border of the hat is consistently detected, but the top region of the face contains several false-positive detections. In order to reduce the number of false positives, the sensitivity of the detector is decreased. Figure 6.14(d) clearly shows the absence of these false detections due to the lower sensitivity setting.

225 6. BLOCK-BASED DETECTION SYSTEMS FOR VISUAL ARTIFACT LOCATION setting. However, this benefit is achieved at the expense of introducing small gaps between the detected regions around the object and a somewhat larger open space between the detected region and the object border. As discussed earlier, these gaps will be bridged later by employing a diamond-shaped ex- pansion.

Artifact-location detection in the frequency domain. Figure 6.15(b) depicts the raw detected video patterns, without any context reasoning, for the situation that the detector thresholds (see Table 6.6) are set to zero and no coefficient quantization is applied.

Figure 6.14 — Visualization of artifact-location detection in the spatial domain. (a) Original image. (b) The per-pixel 2D SAD value in gray level, using 5 × 5 pixel blocks in the detection kernel. (c) In red color, the detected artifact pixels with maximal sensitivity. (d) Artifact-location detection with a lower sensitivity.


Figure 6.15 — Visualization of artifact-location detection in the frequency domain. (a) Original image. (b) Raw AC DCT-coefficient pattern classification where thresholds and quantization are switched off. (c) Raw AC DCT-coefficient pattern detection with thresholds set to non-zero and a quantization step size of 4. (d) Finally detected potential artifact locations.

From this figure it becomes clear that without proper thresholding and coefficient quantization, the detected patterns have a rather random behavior, due to the presence of low-amplitude high-frequency AC coefficients.


We have applied coefficient quantization using a scalar coefficient quantizer as used in the H.264 standard [40]. This quantization harmonizes the existing coefficient amplitudes, so that the detection image becomes more consistent and thus suitable for context reasoning, see Fig. 6.15(c). Due to quantization, the spurious ringing-like frequency patterns virtually disappear and the flat region in the background becomes coherent in the detection. The remaining blue-colored blocks (see Fig. 6.15(c)), indicating potential mosquito noise, are scattered in the facial region and around the object. The proximity of flat regions in the background and the presence of texture surrounding the inner contour of the object then lead to positive artifact detections around the object contour, see Fig. 6.15(d).
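As a simplified stand-in for the H.264-style scalar quantizer mentioned above, the following sketch applies a uniform quantizer with the step size of 4 used for Fig. 6.15(c); small AC coefficients collapse to zero, which removes the random low-amplitude detections.

import numpy as np

def quantize_ac(Y, step=4):
    # Uniform scalar quantization of the coefficient block prior to the
    # pattern classification; the DC coefficient is left untouched.
    Yq = np.round(np.asarray(Y, dtype=np.float64) / step) * step
    Yq[0, 0] = Y[0, 0]
    return Yq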

B. Visible detection performance of the coding noise

Let us now detail both the subjective and the objective detection performance of potential artifact-contaminated regions. The subjective performance evaluation is conducted on the basis of a visual inspection of the visualized detection locations. The quantitative analysis is performed on an image fragment containing manually derived visible noise-pattern pixel locations. The images from the dataset are JPEG compressed with a typical setting of Q = 25, leading to a PSNR interval of 25–35 dB, and are supplied as input to the proposed detection systems. We apply JPEG compression in order to benchmark our results against the results obtained by [123]. The JPEG coding noise is also based on 8 × 8 DCT compression, as in MPEG, so that the results are comparable. Figures 6.16 and 6.17 display in red color the obtained detected potential artifact locations and in yellow the diamond-shaped expansion of this location information. Figures 6.18 and 6.19 display in the same way the detected artifact locations for higher-quality JPEG-compressed images (Q = 50), leading to a PSNR interval of 28–38 dB. From Figures 6.16, 6.17, 6.18 and 6.19, the following observations are made. A detection kernel of 11 × 7 pixels, constructed from a fixed block size of 3 × 3 pixels, follows the dominant edges in the picture reasonably well, covering detailed as well as coarse edges, see column (b) in the aforementioned figures. The detected activity regions (red-colored pixels) show discontinuities along the object borders, indicating a lower detection coherence, which are effectively appended by the diamond-shaped expansion (yellow-colored pixels) of the detection signal. The expansion algorithm performs similarly to a morphological dilation operator. This is noticeable from the fact that the red-color pixel locations are surrounded by the yellow-color pixels. Furthermore, columns (c) and (d) of the discussed figures refer to larger detection-kernel sizes (19 × 11 and 21 × 13 pixels), leading to a thicker region of detected pixels (red). However, the consistency of the detected border degrades, compared to the 3 × 3 pixel block detection kernel.


Figure 6.16 — Visualization of detected visible noise-pattern locations in JPEG-compressed pictures with Q = 25. In red color, detected pixel locations. In yellow color, pixel locations appended by the diamond-shaped expansion of the detection signal. (a) Original picture. (b) Detected locations for spatial-domain kernel size 11 × 7 pixels. (c) Detected locations for spatial-domain kernel size 19 × 11 pixels. (d) Detected locations for spatial-domain kernel size 21 × 13 pixels. (e) Detected locations for frequency-domain kernel size 20 × 12 pixels.


Figure 6.17 — Visualization of detected visible noise-pattern locations in JPEG-compressed pictures with Q = 25. In red color, detected pixel locations. In yellow color, pixel locations appended by the diamond-shaped expansion of the detection signal. (a) Original picture. (b) Detected locations for spatial-domain kernel size 11 × 7 pixels. (c) Detected locations for spatial-domain kernel size 19 × 11 pixels. (d) Detected locations for spatial-domain kernel size 21 × 13 pixels. (e) Detected locations for frequency-domain kernel size 20 × 12 pixels.


Figure 6.18 — Visualization of detected visible noise-pattern locations in JPEG-compressed pictures with Q = 50. In red color, detected pixel locations. In yellow color, pixel locations appended by the diamond-shaped expansion of the detection signal. (a) Original picture. (b) Detected locations for spatial-domain kernel size 11 × 7 pixels. (c) Detected locations for spatial-domain kernel size 19 × 11 pixels. (d) Detected locations for spatial-domain kernel size 21 × 13 pixels. (e) Detected locations for frequency-domain kernel size 20 × 12 pixels.


Figure 6.19 — Visualization of detected visible noise-pattern locations in JPEG-compressed pictures with Q = 50. In red color, detected pixel locations. In yellow color, pixel locations appended by the diamond-shaped expansion of the detection signal. (a) Original picture. (b) Detected locations for spatial-domain kernel size 11 × 7 pixels. (c) Detected locations for spatial-domain kernel size 19 × 11 pixels. (d) Detected locations for spatial-domain kernel size 21 × 13 pixels. (e) Detected locations for frequency-domain kernel size 20 × 12 pixels.


A small counter effect is that in low-detail regions, the small-block detections disappear. In a subjective comparison, we have found that the latter effect is less visually important than the edge consistency. In all cases, column (e) shows that the frequency-based detection of detailed edges performs poorly with respect to consistency. The root cause of this lower performance is that the frequency-based detector considers the pre-defined noise patterns instead of the well-defined SAD metric, which clearly reveals significant changes in texture. In a further experiment with a small quantitative analysis, we study the actual objective detection performances for the various detection kernels, using a single typical image fragment from the dataset. The selected fragment is part of a larger region, which has been identified by an expert panel [123] to contain visible coding artifacts. This identification has been adopted as ground truth (indicated as green-color pixels) in our experiment, see Fig. 6.20(b). In the following subfigures, we first introduce coding noise by JPEG compression with a fixed quality setting of Q = 25, except for subfigures (a) and (c). The purpose of the experiment is to measure the detection score of our proposed algorithm (without and with expansion) over the ground-truth area that was indicated by the expert panel. Figures 6.20(c) and (d) are only added for an exaggerated visualization of the coding noise for the two settings Q = 50 and Q = 25, respectively. These subpictures show that the critical coding artifacts lie along the border of the object and that these artifacts appear with similar strengths for both quantizer settings, whereas the other coding noise grows with the coarseness of the settings. Our approach targets the detection of this constantly visible coding noise. Figures 6.20(e), (f), (g) and (h) depict in red color the detected noise-pattern pixel locations (partly overwriting the ground truth). From the combined overlay, it becomes clear that the frequency-domain detection (in (h)) outperforms the spatial-domain detection (in (e) and (f)) with respect to detection consistency (red color) along the object border. The detection depicted in (g) also shows detection consistency; however, this detection is achieved in the “flat and/or low-frequency” region and only partly at the noise-pattern location. As a next step, we invoke the diamond-shaped expansion of the detected data, so that the discontinuities along the object border are filled. This expansion process is applied to both spatial-domain and frequency-domain detection, as shown in Figs. 6.20(e), (f), (g) and (h), respectively. Except for the situation depicted in Fig. 6.20(g), the expansion algorithm works properly and creates a consistently filled area along the border. In all subpictures, the ground-truth regions are overwritten first by the red detected pixels, which are then expanded with the blue pixels from the expansion. However, in Fig. 6.20(g), there remains a gap between the expanded region and the actual object border, visible as remaining green pixels from the ground truth. From these subfigures, it can be observed that this two-step detection approach results in a detection coverage of the ground truth of nearly 100% for the frequency domain.


Table 6.8 — Detection-score performance in percentages of noise-pattern pixel detection in the spatial domain and frequency domain for native SD-resolution video. The kernel and block sizes are indicated in pixels. The table corresponds to subfigures (e), (f), (g) and (h) in Fig. 6.20. True detected pixels are red-colored in the corresponding visual subfigures and expansion pixels are blue-colored in those subfigures.

                        Spatial domain                     Frequency domain
Kernel size             11 × 7    19 × 11    21 × 13       20 × 12
Block size              3 × 3     mixed      5 × 5         4 × 4
JPEG Q=25
  True detected         22%       48%         9%           72%
  Expansion detected    58%       40%        52%           27%
  Total                 80%       88%        62%           99%
JPEG Q=50
  True detected         34%       54%        14%           63%
  Expansion detected    52%       35%        56%           35%
  Total                 86%       89%        70%           98%

For the spatial domain, the detection coverage depends on the detection-kernel aperture and shows a substantial noise-pattern pixel coverage. However, this coverage declines for a detection kernel based on fixed 5 × 5 pixel blocks. There is also a counter effect: a larger detection kernel with a larger block size produces a broader area along the object border, covering also non-contaminated pixels. This also holds for the frequency-domain approach. We will later verify the perceptual quality of this aspect. To quantify the detection score, we have measured the pixel coverage in this experiment. The pixel locations indicated in green represent the annotation by the expert panel. A 100% detection score means that all annotated green pixels have been successfully detected. The counting only considers the detection of the green pixels, including the expansion algorithm; a counting sketch is given below. The detection-score results are depicted in Table 6.8, which indicates the percentages of detected noise-pattern pixels for all four previously discussed detection kernels (three different spatial-domain kernels and one frequency-domain kernel). The detailed numbers reflect the percentages of the red and blue pixels compared to the green ground-truth pixels. For similar detection results on full-HD upscaled video, see Table E.1 in Appendix E.
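The pixel counting behind Table 6.8 can be sketched as follows, with boolean maps for the green ground truth, the red detections and the blue expansion pixels (the array names are illustrative):

import numpy as np

def detection_scores(ground_truth, detected, expanded):
    # Percentage of annotated (green) pixels covered by the raw detection
    # (red) and additionally by the diamond-shaped expansion (blue).
    n_gt = ground_truth.sum()
    true_det = np.logical_and(ground_truth, detected).sum() / n_gt
    exp_det = np.logical_and(ground_truth,
                             np.logical_and(expanded, ~detected)).sum() / n_gt
    return 100 * true_det, 100 * exp_det, 100 * (true_det + exp_det)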


Figure 6.20 — Visible coding-noise locations from 8 × 8 DCT-based JPEG compression with ground truth from an expert panel. All images have setting Q = 25, except subfigures (a) and (c). (a) Original picture. (b) Zoomed image region with manually annotated ground-truth noise locations indicated by green pixels. (c) and (d) Sharpened image region with noise locations from JPEG compression with Q = 50 and Q = 25, respectively. Sharpening is only for visualization. (e), (f) and (g) In red color, spatial-domain detected noise pixels based on a kernel aperture of 11 × 7, 19 × 11 and 21 × 13 pixels, respectively. (h) In red color, frequency-domain detected noise pixels based on a kernel aperture of 20 × 12 pixels. In the four bottom subfigures, the blue color indicates the diamond-shaped expansion of the detection signal.

Table 6.8 indicates that the detection score of the frequency-domain detection is nearly 100%, whereas the spatial-domain 11 × 7 detection kernel has a pixel-based detection score of 80%. For this reason, we have employed increased detection-kernel apertures in the spatial domain by increasing the block size, either uniformly or in mixed form. However, the true detection scores obtained for these larger kernel sizes in the spatial domain are still lower than obtained with the frequency-domain approach. The reason for this performance difference is twofold. First, when increasing the detection-kernel block size, the detection efficiency declines due to an SAD saturation effect. This effect occurs when computing the SAD values over larger textured areas. Already with small block sizes the SAD value is quite high, so that for larger blocks this value further increases and therefore loses its discriminative capability. Second, when constructing a detection kernel based on a fixed large block size of 5 × 5 pixels, the detection already becomes vertically active prior to the actual noise-contaminated region, resulting in a detection offset. A similar effect occurs in the horizontal direction, although it starts later. As a result, the actual detection area is shifted away from the actual object border and thus also from the coding noise (which leaves some green pixels undetected). This shifting happens despite the good detection consistency. The described phenomenon is probably due to the fixed threshold setting for the spatial-domain processing. For the spatial-domain detection kernels, the diamond-shaped expansion in combination with the detected noise-pattern pixels results in a total noise-pattern pixel detection varying between 62% and 88%, while for the frequency domain this detection becomes nearly 100%. This is visualized by the diminishing green-colored pixels. In the former examples, non-contaminated pixels may also obtain a filter-strength value due to the diamond-shaped expansion. The impact of this incorrect filter-strength assignment is limited due to: (1) the decline of the filter-strength control signal and (2) protection by the adaptive low-pass filter, which facilitates integrated edge protection and dynamic filter-aperture control.

C. Coding-noise reduction performance

In the third and last experiment, we not only detect the coding noise, but also apply the corresponding low-pass filter, as described in Section 6.4.3 and depicted in Fig. 6.13. On the basis of the derived artifact-location signal, a locally-adaptive low-pass filter is controlled, attenuating the visible noise patterns. In this experiment we measure the visual enhancement both locally, along the borders of the objects, and globally, at picture level. The local measurement is performed on the basis of the noise-detected pixels only. Both measurements are based on the Peak Signal-to-Noise Ratio (PSNR). The reference signal is the uncoded original image, which is corrupted by coding noise from the JPEG compression. Again, we apply JPEG compression in order to benchmark our results against the results obtained by [123]. Besides objective PSNR measurements, we also discuss the subjective performance.


Table 6.9 — Local PSNR measurement after adaptive low-pass filtering (JPEG compression with quantizer setting Q=50).

              Spatial domain                                          Frequency domain
Picture       kernel 11 × 7      kernel 19 × 11     kernel 21 × 13    kernel 20 × 12
              orig.   LP.        orig.   LP.        orig.   LP.       orig.   LP.
beach         35.67   35.99      36.50   36.26      37.24   37.26     34.88   35.10
caps          35.69   36.11      37.03   37.05      37.78   37.80     35.51   35.88
door          34.39   34.51      35.67   35.35      36.46   36.33     34.24   34.21
lighthouse    34.38   34.62      35.97   35.91      36.76   36.68     34.17   34.56
parrots       36.21   36.33      37.65   36.80      38.33   37.85     36.59   36.31
plane         34.90   35.33      35.77   36.12      36.74   37.09     34.31   34.83
sailboat      35.32   35.57      35.99   35.86      36.42   36.42     33.35   33.76
stream        33.47   33.70      34.41   34.72      35.66   35.81     33.81   34.17
average       35.00   35.27      36.12   36.01      36.92   36.91     34.61   34.85
delta         -       +0.27      -       -0.11      -       -0.01     -       +0.24

Let us start with a PSNR analysis of the four kernels, both for local and global measurements. Hereby, the local PSNR indicates the quality improvement obtained due to the locally-adaptive low-pass filtering and is calculated on the basis of all detected pixels, which also form the area where the filter-strength control signal is active. The global PSNR is calculated on a per-picture basis, revealing the overall impact of the locally-adaptive low-pass filtering. Table 6.9 indicates the local PSNR derived on the basis of the area formed by the red- and yellow-colored pixels, see Figs. 6.18 and 6.19. Table 6.10 indicates the local PSNR derived on the basis of the red- and yellow-colored pixels, see Figs. 6.16 and 6.17. Note that because each detection kernel selects a different set of reference pixels, the original local PSNR differs for equal images and depends on those reference pixels. The local PSNR results for JPEG-compressed pictures with Q = 50 are depicted in Table 6.9. From this table it becomes immediately clear that two detection systems provide on average an increase of the PSNR. The first system operates in the spatial domain with the smallest kernel of 11 × 7 pixels and 3 × 3 pixel blocks, while the second system operates in the frequency domain with a detection aperture of 20 × 12 pixels and is based on 4 × 4 pixel blocks.
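Both measurements follow the standard PSNR definition; a sketch covering the global case (no mask) and the local case (boolean mask of red and yellow pixels) is given below.

import numpy as np

def psnr(ref, test, mask=None):
    # PSNR between the uncoded reference and the (filtered) test picture,
    # computed globally or only over the detected (masked) pixel area.
    ref = np.asarray(ref, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    if mask is not None:
        ref, test = ref[mask], test[mask]
    mse = np.mean((ref - test) ** 2)
    return 10.0 * np.log10(255.0 ** 2 / mse)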


The spatial-domain system provides an improvement for all pictures in the dataset, whereas the frequency-domain system provides an improvement for the majority of pictures in the dataset. It is remarkable to observe that the frequency-domain detection system operates on a set of pixels which on average have the lowest PSNR, when compared to the other detection systems. Artifact-location detection in the spatial domain, operating with a detection-kernel aperture of 19 × 11 or 21 × 13 pixels, on average deteriorates the image quality. This variation in performance is caused by the fact that JPEG-compressed images with Q = 50 have a high picture quality, so that the total detected artifact locations should be carefully filtered with constrained filter settings. Although the frequency-domain detection gives the highest detection score (see previous subsection), the PSNR results are somewhat lower than for the small-kernel spatial-domain processing. This difference is explained by the broader detection region around object borders, which sometimes expands outside the ground-truth area. The corresponding filtering therefore lowers the measured image quality in those expanded regions. Table 6.10 reveals the local quality improvement for the JPEG-compressed pictures with a coarser quantizer value of Q = 25, depicted in Figs. 6.16 and 6.17. From this table, it can be observed that on average all detection systems provide a locally enhanced PSNR. For strongly compressed image data composed of large flat regions, such as in the caps and plane images, the detection system operating in the frequency domain outperforms the spatial domain with a local PSNR improvement of up to 0.5 dB. It is remarkable that the spatial-domain detection with a detection kernel of 11 × 7 pixels operates on a set of pixels which on average have the lowest PSNR, when compared to the other detection systems. This makes the chances for visual enhancement higher. In Tables 6.11 and 6.12, the global PSNR is presented. These results are fully consistent with the locally measured PSNR values. For low-quality compression, all kernels provide a small improvement of the PSNR, while for high-quality compression only the small spatial-domain kernel and the frequency-domain system provide on average a positive PSNR enhancement. On the basis of the previous results, it is concluded that artifact-location detection conducted in the spatial domain, employing a detection kernel of 11 × 7 pixels based on a fixed block size of 3 × 3 pixels, provides good results on picture-quality enhancement for a broad range of image content and compression settings. Building on the previous PSNR results, we now subjectively compare the picture quality between the originally decompressed and the locally-adaptive low-pass filtered version. We conduct this visual comparison only for the best-performing detection kernel, i.e. the spatial-domain detection kernel of 11 × 7 pixels using a block size of 3 × 3 pixels.


Table 6.10 — Locally measured PSNR after adaptive low-pass filtering (JPEG, Q=25).

              Spatial domain                                          Frequency domain
Picture       kernel 11 × 7      kernel 19 × 11     kernel 21 × 13    kernel 20 × 12
              orig.   LP.        orig.   LP.        orig.   LP.       orig.   LP.
beach         31.74   32.04      33.09   33.44      34.27   34.50     32.31   32.59
caps          32.10   32.45      33.64   34.00      34.99   35.31     33.05   33.57
door          31.42   31.58      32.80   32.83      33.84   33.92     31.91   31.96
lighthouse    30.82   31.03      32.64   32.86      34.25   34.42     32.00   32.42
parrots       33.01   33.18      34.77   34.66      35.75   35.77     34.15   34.15
plane         31.33   31.57      32.28   32.62      34.02   34.35     31.82   32.37
sailboat      30.85   31.11      32.25   32.58      33.50   33.65     31.17   31.50
stream        29.44   29.54      30.68   30.92      32.58   32.72     31.37   31.62
average       31.34   31.56      32.77   32.99      34.15   34.33     32.22   32.52
delta         -       +0.22      -       +0.22      -       +0.18     -       +0.30

Table 6.11 — Globally measured PSNR after adaptive low-pass filtering (JPEG, Q=25).

              Spatial domain                                Frequency domain
Picture       orig.    LP. 11 × 7   LP. 19 × 11   LP. 21 × 13   LP. 20 × 12
beach         33.27    33.37        33.46         33.35         33.31
caps          33.87    33.98        34.02         33.96         33.98
door          32.84    32.89        32.85         32.88         32.85
lighthouse    30.07    30.09        30.09         30.08         30.09
parrots       35.39    35.42        35.34         35.39         35.39
plane         32.56    32.62        32.66         32.61         32.64
sailboat      29.46    29.49        29.51         29.48         29.47
stream        25.75    25.76        25.76         25.76         25.76
average       31.65    31.70        31.71         31.69         31.69
delta         -        +0.05        +0.06         +0.04         +0.04

Let us start with a visual inspection of the image fragments depicted in columns (a) and (b) of Fig. 6.21.


Table 6.12 — Globally measured PSNR after adaptive low-pass filtering (JPEG, Q=50).

              Spatial domain                                Frequency domain
Picture       orig.    LP. 11 × 7   LP. 19 × 11   LP. 21 × 13   LP. 20 × 12
beach         35.73    35.79        35.64         35.73         35.76
caps          36.15    36.24        36.16         36.16         36.22
door          34.77    34.80        34.62         34.73         34.77
lighthouse    32.24    32.26        32.24         32.24         32.26
parrots       37.72    37.74        37.43         37.59         37.68
plane         34.76    34.82        34.82         34.80         34.84
sailboat      31.85    31.87        31.84         31.85         31.87
stream        28.10    28.10        28.10         28.10         28.10
average       33.92    33.95        33.86         33.90         33.94
delta         -        +0.03        -0.06         -0.02         +0.02

For visual clarity, the image fragments have been zoomed by 300%, such that the coding noise can be clearly observed. All image fragments depicted in columns (a) and (c) of Fig. 6.21 show triangular-shaped noise patterns introduced by JPEG compression. Columns (b) and (d) show for all fragments a clear reduction of the visible coding noise. However, due to the limited vertical aperture of the detection kernel, not all noise-pattern pixels are assigned a sufficient filter strength. As a result, the adaptive low-pass filter is not able to sufficiently smooth these pixels, resulting in some remaining visible distortion after low-pass filtering. This type of visible distortion is particularly annoying, because it occurs at a distance of up to 7 pixels from the edge/texture transition. The third image fragment from the top of column (b) shows a part of the parrot's beak. At the right side of that image, slightly above the middle, there is a small discontinuity in the green background due to insufficient low-pass filtering. Note that this distortion is not present in the filtered version depicted in column (d), due to the finer quantization and thus less severe distortion, requiring less intensive low-pass filtering.

D. Comparison with an existing solution

In this subsection, the proposed detection systems are visually compared with the results from [123], which was published in the same time frame. For this comparison, we have selected four typical images from the applied dataset.


Figure 6.21 — Image fragments showing the effect of noise detection and subsequent locally-adaptive low-pass filtering for JPEG-compressed images, with Q = 25 for columns (a) and (b) and Q = 50 for columns (c) and (d). Column (a): zoomed (300%) JPEG-compressed image fragment. (b): zoomed, detected and filtered JPEG-compressed image fragment. (c): zoomed (300%) JPEG-compressed image fragment. (d): zoomed, detected and filtered JPEG-compressed image fragment.


Figure 6.22 — Visual comparison of subjective ringing regions of [123] and our small-kernel spatial-domain detection results for JPEG-compressed images with Q = 25. Column (a): Original pictures. (b): Subjective ringing regions indicated by the expert panel, taken from [123]. (c): Detection results obtained with Canny edge detection, non-linear smoothing and bilateral filtering (taken from [123]). (d): Detected locations for spatial-domain kernel size 11 × 7 pixels.


Figure 6.23 — Visual comparison of subjective ringing regions of [123] and our small-kernel spatial-domain detection results for JPEG-compressed images with Q = 50. Column (a): Original pictures. (b): Subjective ringing regions indicated by the expert panel, taken from [123]. (c): Detection results obtained with Canny edge detection, non-linear smoothing and bilateral filtering (taken from [123]). (d): Detected locations for spatial-domain kernel size 11 × 7 pixels.


Figure 6.22 depicts this comparison in detail for the selected images. Column (b) shows the identified object borders manually indicated by the expert panel. Column (c) shows the automatic detection performance obtained by special processing. The author indicates in [123] that this automatic processing involves the following steps: Canny edge detection (without its inherent smoothing step), applied to a non-linearly smoothed image using a bilateral filter to obtain the perceptually more meaningful edges. The detected edge pixels are subsequently combined into perceptually salient elements, facilitating further analysis and processing. The red color indicates the additionally found object borders from these processing steps. In column (d), the detection results of our proposed 11 × 7 spatial-domain kernel are shown. It should be noticed that our small detection blobs do not result in severe filtering, because they are only one or two pixels in size and the filtering has a diamond-shaped decline of the filter control. There are two major differences between the literature and our proposal. First, with respect to detection, our solution covers all essential object borders over the full object shape. This is not the case in the published reference work, where parts of the object border are missing, with the risk that filtering is not applied despite the noise visibility. Second, the proposed detection processing from the literature is significantly more complex than our spatial-domain processing. Therefore, we consider our proposal more attractive for implementation in a real television system.

6.6 Conclusions

In this chapter, we have proposed two block-based artifact-location detection systems for detecting the locations of mosquito-noise and ringing noise patterns. Our proposed solutions operate either in the spatial domain or in the frequency domain and apply an artifact-detection kernel, which is constructed using either fixed-size or variable-size blocks. For each block constructing the detection kernel in the spatial-domain processing, the SAD value is calculated as an activity metric. For the frequency-domain solution, a set of noise and ringing patterns is defined for detection. The activity of the detected noise or ringing is classified into three simple indications: “flat and/or low-frequency”, “noise contaminated (mosquito noise and ringing)” and “texture”. These signal features are employed in two ways. First, on the basis of the block-based signal classification, a simplified video model is derived, using the surrounding blocks of the considered area and classifying those blocks in a similar way. These surrounding blocks act as additional context information suitable for context reasoning about the visibility of possible coding noise. This gives a more robust location detection of potential noise-contaminated regions. Second, these block-based signal features are also employed afterwards

to expand the detection signal, employing a diamond-shaped aperture. In this way, the detection signal becomes more consistent as a filter-control signal and increases the area coverage of the detected contaminated region. The derived signal controls a locally-adaptive low-pass filter, attenuating the located noise patterns. Using this new artifact-location detection concept, we have found the following performance aspects.

Performance of artifact detection. The first performance measurement on potential artifact-location detection is obtained with the spatial-domain detection system, employing a detection kernel of 11 × 7 pixels based on a fixed block size of 3 × 3 pixels. The pixel detection associated with the noise patterns, in combination with the diamond-shaped expansion, has a score of 80% and 86% for JPEG-compressed pictures with Q = 25 and Q = 50, respectively. We have also discussed a frequency-domain detection system, employing a detection kernel of 20 × 12 pixels based on a fixed block size of 4 × 4 pixels. In combination with the diamond-shaped expansion, this detection achieves a score of 99% and 98% for JPEG-compressed pictures with Q = 25 and Q = 50, respectively. Although the detection score obtained in the frequency domain is higher, the spatial-domain detection in conjunction with expansion is more accurate along a broad range of object contours, including finely detailed edges. Furthermore, the 2D expansion is more subtle due to the smaller block size and the pixel-based detection, resulting in a limited number of false-positive pixel detections.

Local image enhancement. Visible coding-noise reduction obtained on the basis of the spatial-domain detection system, operating with a kernel aperture of 11 × 7 pixels, results in an average local image enhancement of 0.2 and 0.3 dB for JPEG-compressed pictures with Q = 25 and Q = 50, respectively. Local image enhancement on the basis of the frequency-domain kernel of 20 × 12 pixels results in an average improvement of 0.3 and 0.2 dB for JPEG-compressed pictures with Q = 25 and Q = 50, respectively. We have found that for modest coding-quality images, the enhancement obtained on average with frequency-domain detection is better than with the spatial domain, albeit with sometimes significant variations per individual video sequence. In contrast, and for the same image quality, the spatial-domain detection always results in a positive enhancement. For high coding quality, the spatial-domain detection outperforms the frequency-domain system on average and always contributes a positive local enhancement, which is due to a more subtle artifact-location detection.

Final performance differences. Software-based simulation of the combination employing both detection and filtering has revealed the following differences.

The spatial-domain solution employing a kernel of 11 × 7 pixels provides an average global increase in PSNR of +0.05 and +0.03 dB for all images from the applied dataset, JPEG-compressed with Q = 25 and Q = 50, respectively. For the images of this dataset, the frequency-domain detection system using a kernel of 20 × 12 pixels provides a global image enhancement of +0.04 and +0.02 dB for JPEG-compressed images with Q = 25 and Q = 50, respectively. From these experiments, we conclude that both systems slightly enhance the image quality at a global scale, but with locally strongly improved areas. Furthermore, the combined system avoids excessive image blur from filtering, e.g., noticeable from the positive PSNR contributions, which is due to the context reasoning in the kernel area and the detailed decision making.

Hardware realization in real commercial TV systems. Due to the successful performance of the spatial-domain detection system and its low implementation costs, this detection system has been adopted for embedding in a real digital-TV platform. The embedding was realized in two chips, called SX6 and SX7, targeting DTV reception. Furthermore, this detection system has been adopted as part of a back-end video-processor chip FRCX¹, aiming at Ultra HD frame-rate conversion.

¹All these chips are commercially available from Sigma Design, USA.

“Een conclusie is de plaats waar het denken moe werd.” – “A conclusion is the place where thinking grew tired.” (Matz, 1967)

7

Conclusions

7.1 Conclusion of the individual chapters

Chapter 2. This chapter provides a brief overview of the parts of the MPEG standards relevant to this thesis. Moreover, a brief introduction is given to the noise patterns introduced by MPEG video compression and to intra-program video navigation for personal video recording. Furthermore, conventional video navigation is discussed, especially the limitations of this navigation form.

Chapter 3. This chapter aims at solutions for networked navigation through MPEG-2-compressed databases. To this end, we have classified video navigation into three categories, of which two categories are characterized by a networked decoder, operating at a distance from a storage device elsewhere in the network. The third category involves a different navigation concept and is presented in Chapter 4. The two proposed video navigation categories from this chapter are suitable for employment in a client-server-based communication system. Both solutions feature communication interoperability, based on standard MPEG coding techniques and coded MPEG information transmitted across the network. The two proposed solutions are full-frame-based navigation techniques, the first for fast-search and slow-motion playback and the second for hierarchical screen searching. The networked fast-search and slow-motion video navigation is based on (1) re-used intra-coded MPEG-compressed normal-play video pictures for deriving the fast-search navigation sequence, and (2) full re-use of all normal-play pictures for the slow-motion sequence. In order to adapt the playback speed, frame rate, bit rate and field-rendering control, we employ artificially generated repetition pictures, which repeat normal-play reference pictures. This adaptation is achieved by assuming a fixed bit rate and then calculating the transmission time for each re-used intra-coded picture. The proposed networked hierarchical mosaic-screen video navigation is based on hierarchical mosaic screens involving the usage of MPEG-2-compressed subpictures derived from intra-coded normal-play pictures during recording.

Flexibility in the composition of mosaic screens is essential for providing overviews at different temporal instants. This is obtained by coding each subpicture at a fixed bit cost, involving fixed-cost coded “mini slices” on the basis of P-type coding syntax, thereby facilitating easy retrieval from the storage device, while simplifying the construction of the final mosaic screen from the compressed subpictures.

Chapter 4. Video navigation is further extended with a new audiovisual proposal, involving multiple information signals (2 video and 1 audio) to improve the perception of navigation playback. The method encompasses re-use of MPEG-2-compressed normal-play video fragments with corresponding audio information, in combination with an additional fast-search video navigation window. The normal-play video fragments are presented in a primary window enhanced with an auditive signal, whereas the fast-search video navigation is presented in a smaller secondary Picture-in-Picture (PiP) window, overall resulting in a dual-window video navigation solution. The algorithms for constructing the navigation signal involve scalable MPEG decoding and re-encoding of intra-coded normal-play pictures, forming the fast-search information during recording, which is stored as metadata. Moreover, the re-used normal-play fragments are processed for, amongst others, audio padding adaptation and removal of normal-play predictive-coded pictures without available reference. The proposed algorithmic simplifications for MPEG-2 and H.264/MPEG4-AVC intraframe decoding result in pictures of good subjective video quality, while the objective quality in terms of PSNR is low: 28.69 dB for MPEG-2 and 26.30 dB for H.264/MPEG4-AVC. For video navigation, this quality is sufficient, as the viewer will have insufficient time to visually inspect the individual images.

As a general conclusion, the three video navigation solutions from this thesis each address a particular navigation time interval. The proposed concepts show a high commonality and differ mainly in the presentation of video navigation information. A key aspect of this work is the re-use of normal-play encoded audiovisual information, involving specific processing in the MPEG-2 compressed domain, resulting in a compliant video navigation signal. This enables re-use of video and audio decoding components, in such a way that transcoding is avoided and the additional processing can be embedded on the existing processing units and control CPU. Furthermore, derived CPI information is re-used by all three video navigation methods. This even holds for the navigation solutions which employ subpictures. By separating the navigation processing into a recording stage and a playback stage, computational complexity can be controlled. We conclude that, on the basis of the chosen concept with small-picture generation during recording and the associated metadata, the re-use of already coded pictures with scalable and/or partial decoding and the measures for complexity control all together enable the combination of the proposed PVR concepts to create a framework that can handle short-, medium- and long-time interval video navigation. Moreover, this framework would be feasible and realizable with limited complexity.

Chapter 5. MPEG-coded information is transmitted over terrestrial channels for mobile reception, making use of IP encapsulation in the framework of the DVB-H standard. The combination of both standards does not achieve the best possible robustness. This chapter presents an improved DVB-H link layer, capable of improving the robustness and providing a best-effort signal degradation, while minimizing data communication. Key aspects of the solution are the locally obtained reliability and location information, revealing the reliability status of individually received bytes before and after error correction and the storage locations of correctly received IP datagrams. The MPEG-received reliability information is elegantly employed in two ways. First, this reliability information is applied for error and erasure decoding by the link-layer FEC stage. For the situation that an MPE-FEC frame is still incorrect after this second FEC stage, this reliability information is used for a second time, in combination with reliability information derived from the secondary FEC decoding stage, supplemented with the location information. In this way, correctly received and corrected IP datagrams are extracted from the same defective MPE-FEC frame. This typically leads to a completely corrected MPE-FEC frame without errors, while sometimes exceptions occur and errors remain. We have shown that this concept for retrieving completely correct MPE-FEC frames indeed clearly improves the robustness by approximately 50%, and that the performance curves tend to cluster around the same critical performance degradation point. Moreover, IP recovery in the remaining defective MPE-FEC frames is enabled on the basis of joint reliability and correct IP datagram location information, resulting in up to 20% additional IP datagram recovery. Evidently, this performance strongly depends on the IP datagram size, as well as the error probability. Furthermore, when forwarding IP datagrams only once to the network layer, the data communication is minimized, which contributes to a reduced power consumption. This improved DVB-H link layer implementation has been adopted in a commercial DVB-H receiver.
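The gain from error-and-erasure decoding can be illustrated with a minimal sketch. The MPE-FEC frame rows are protected by an RS(255,191) code with 64 parity bytes, and a Reed-Solomon decoder can handle t byte errors and e marked erasures per codeword as long as 2t + e does not exceed the number of parity bytes. The Python fragment below (names are illustrative) only evaluates this correctability condition, not the decoding itself.

    def rs_row_correctable(n_erasures, n_errors, n=255, k=191):
        # An RS(n, k) decoder corrects t errors and e erasures iff
        # 2t + e <= n - k; for the DVB-H MPE-FEC code, n - k = 64.
        return 2 * n_errors + n_erasures <= n - k

    # Erasure marking doubles the correction power per row:
    print(rs_row_correctable(n_erasures=60, n_errors=0))   # True
    print(rs_row_correctable(n_erasures=0,  n_errors=33))  # False (> 32 errors)

This shows why byte-level reliability flags pay off: up to 64 unreliable bytes per row can be repaired when marked as erasures, against only 32 unmarked byte errors.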

Chapter 6. MPEG-compressed video is always communicated through bandwidth-limited channels, sometimes leading to visible and even annoying coding noise. In modern digital television systems, these coding artifacts are typically detected and correspondingly attenuated, which gives undesirable image blur. We have proposed two block-based artifact-location detection systems for detecting the locations of mosquito noise and ringing noise patterns. A key aspect of both solutions is the construction of an artifact-detection kernel, which is constructed either in the spatial domain or in the frequency domain, using either fixed-sized or variable-sized blocks. For each block constructing the detection kernel in the spatial-domain processing, the SAD value is calculated as an activity metric. For the frequency-domain solution, a set of noise and ringing patterns is defined for detection. The activity of the detected noise or ringing is classified into three simple indications, namely “flat and/or low-frequency”, “noise contaminated (mosquito noise and ringing)” and “texture”. On the basis of the classified block-based signal features, a simplified video signal model of the local region is derived, suitable for context reasoning with the surrounding blocks about the visibility of possible coding noise. The block-based signal features are also employed afterwards to expand the detection signal, thereby improving the consistency of the detection signal for filter control and increasing the accuracy of the covered area of the detected contaminated region. The derived signal controls a locally-adaptive low-pass filter, which is only active in the detected region and seamlessly adapts the filtering if the certainty about noise is lower. The artifact detection score for the frequency-domain detection is higher compared to the spatial-domain detection. However, the spatial-domain solution always provides a local image PSNR enhancement of 0.2 and 0.3 dB for high and low compression factors, respectively. The proposed spatial-domain noise detection system has been adopted in practice and is implemented in various commercial DTV applications.
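The seamless adaptation of the filtering can be expressed as a certainty-weighted blend between the original and the low-pass filtered picture. The sketch below is a simplified illustration of this control principle and assumes a per-pixel detection certainty in [0, 1]; the uniform filter merely stands in for the actual locally-adaptive low-pass filter.

    import numpy as np
    from scipy.ndimage import uniform_filter

    def attenuate_coding_noise(luma, certainty, size=3):
        # Blend filtered and original pixels under control of the detection
        # signal: certainty 1 applies full low-pass filtering, certainty 0
        # leaves the pixel untouched, avoiding global image blur.
        luma = luma.astype(np.float32)
        lowpass = uniform_filter(luma, size=size)
        return certainty * lowpass + (1.0 - certainty) * luma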

7.2 Discussion on research questions

The proposed solutions are now addressed with respect to the research questions posed in Section 1.3.

RQ1 How to efficiently perform trick-play playback on MPEG-compressed audiovisual information in various communication situations? For the first research question (RQ1), two chapters are relevant. In Chapter 3, we propose three algorithms for deriving a video-only navigation signal. Two algorithms address networked full-frame video navigation, while the third algorithm involves mosaic-screen-based video navigation. In Chapter 4, we propose an algorithm for multi-signal navigation including audio information.

RQ1a How can normal-play MPEG-compressed audiovisual information be re-used for conventional trick-play playback? Conventional trick-play playback is divided into fast-search and slow-motion playback, so that the research question is answered in two ways. First, fast-search playback on MPEG-2-compressed video data is obtained by re-using intra-coded normal-play pictures, which can be efficiently retrieved from the storage medium using the storage locations derived during recording. In this way, fast-search video navigation can be obtained for speed-up factors being a multiple of the normal-play GOP length. Second, slow-motion playback is obtained by repetitive display of all normal-play images. Hereby, a distinction is made between the video formats, i.e. the interlaced or progressive format of the MPEG-2-coded program. For progressive video, slow-motion playback on MPEG-2-compressed normal-play video data is established by inserting MPEG-2 B-type repetition pictures repeating the reference pictures, while normal-play B-type pictures are repetitively decoded. For slow-motion playback of interlaced video, the algorithm is slightly different. First, the reference pictures are repeated using B-type repetition pictures, which repeat the last rendered field, thereby avoiding motion judder. Furthermore, the normal-play B-type pictures are decoded once, to avoid motion judder. The speed error that occurs is compensated by additional repetitions of the reference pictures.

RQ1b How to perform trick play in a client-server-based networked system setup? When conducting trick play in a networked client-server-based system, MPEG-2 offers dedicated trick-play signaling within the Packetized Elementary Stream (PES) layer to control the trick-play playback. However, this form of signaling is non-mandatory, which may result in undefined system behavior when employed. It is therefore recommended that regular MPEG-2 encoding and decoding techniques are applied, which makes a trick-play signal equal to a normal-play video sequence from an MPEG-2 coding perspective. For the associated algorithms to implement such a way of navigation, the re-used pictures are selected, grouped and re-formatted into a new compliant MPEG stream that can be handled by any MPEG decoder. The answer to RQ1a presents algorithms for navigation based on this principle.

RQ1c How to fulfill the bit-rate and frame-rate constraints when re-using normal-play MPEG-compressed video information? This two-dimensional problem can be reduced to a one-dimensional problem by pre-setting the bit rate to a fixed practical value, which provides a fixed upper bound to the bit-cost budget for each picture constructing the video navigation sequence. For the situation that re-used MPEG-2-compressed intra-coded pictures have bit costs that satisfy the upper-bound condition, no special action has to be taken. However, for the situation that the bit costs of the re-used MPEG-2-compressed pictures exceed this upper bound, additional transmission time is required. This is achieved by the insertion of one or more artificially predictive-coded repetition pictures, repeating the last decoded picture. As these artificially predictive-coded images have a bit cost which is far less than the picture bit-cost budget, the involved transmission time is extremely short. The remaining transmission time is then used for the transmission of the re-used normal-play picture.
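As a worked example with assumed numbers: at a fixed bit rate of 15 Mbit/s and a frame rate of 25 Hz, the per-picture bit-cost budget equals 15·10^6/25 = 600 kbit. A re-used intra-coded picture of 800 kbit exceeds this budget and, following the relation of Algorithm 17 (Appendix D), requires ⌈(25 × 800·10^3)/(15·10^6)⌉ − 1 = 1 repetition picture, whose short transmission time frees one extra picture period for transmitting the oversized intra-coded picture.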

RQ1d What are the relations and limitations of high-speed search in relation to the MPEG-based playback navigation information? Fast-search video navigation based on re-used MPEG-2-compressed pictures is conducted using solely intra-coded pictures. In the normal-play video sequence, these pictures occur at a distance equal to the GOP length. As a result, the minimal fast-search playback speed equals the normal-play GOP length. High-speed navigation-playback speeds are therefore an integer multiple of this normal-play GOP length. When conducting fast-search playback, non-consecutive normal-play pictures construct the video navigation sequence. When considering a typical scene duration of 3 seconds and a GOP length of 12 pictures, the maximal video navigation playback speed becomes 75, resulting in a concatenation of a single picture from each scene. However, interpretation of the visual navigation information by the viewer involves multiple display periods. We have found a good fast-search navigation performance for video navigation playback at a speed of 25, resulting in the concatenation of 3 correlated (same-scene) pictures.
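In formula form, with scene duration T_s, frame rate f_r and a speed-up factor v that is a multiple of the GOP length, the number of correlated pictures presented per scene equals n = T_s · f_r / v, so that n = (3 × 25)/75 = 1 at the maximal speed and n = (3 × 25)/25 = 3 at the preferred speed of 25.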

RQ1e What is the impact of the employed video format in relation to trick-play playback? Fast-search video navigation on the basis of re-used MPEG-2-compressed video pictures involves picture repetition on the basis of artificially predictive-coded repetition pictures. For the situation that the video format is progressive, picture repetition results in an exact duplicate of the reference picture. When the video format is interlaced, picture repetition results in an exact duplicate of the reference picture including the inter-field motion. As a result, objects that are subject to this inter-field motion cause motion judder, which is annoying for the viewer. Motion judder is avoided when applying predictive-coded field-based repetition pictures, which duplicate the last rendered field of the reference picture, depending on the search direction, thereby avoiding the appearance of motion judder. This makes the navigation playback perceptually more pleasing.

RQ1f How can audio information contribute to the video navigation efficiency? Audio information can contribute to the video navigation efficiency, provided that the audio time-interval information is of sufficient duration. We have found that auditive information related to a conventional fast-search video navigation provides extra information. A video navigation sequence that is based on temporally subsampling the normal-play sequence, while selecting normal-play audiovisual fragments with a 3-second duration, provides suitable and sufficiently detailed information, which is on average perceived as a good interval for presenting audio.

RQ1g Is there a system architecture that allows conventional as well as more advanced video navigation methods? The video navigation solutions proposed in Chapter 3 and Chapter 4 employ a PVR functional block diagram, which shows a high commonality. When separating the involved navigation signal processing over the recording mode and navigation playback mode, conventional as well as advanced video navigation methods can be facilitated. A key aspect is the derivation of metadata describing the properties of the recorded program and additional navigation information during recording. Based on this metadata, more advanced video navigation methods can be realized at navigation playback with limited complexity. Other important aspects are re-use of MPEG-coded frames, scalable decoding of these frames matching with the desired image size, and associated smart rendering. Since all three navigation forms are based on one or more of these aspects, it is possible to design an overall PVR architecture that embeds these forms of navigation.

RQ2 How to improve the robustness of a standard DVB-H link layer while avoiding excessive load on system resources? For the second research question (RQ2), Chapter 5 is relevant.

RQ2a How can the error recovery of the embedded RS decoder be optimized, leading to improved robustness? Key aspects of the solution are the locally obtained reliability (2-bit erasure) and location information, revealing the reliability status of individually received bytes before and after error correction and the storage locations of correctly received IP datagrams. The MPEG-received reliability information is employed in two ways. First, this reliability information is applied for error and erasure decoding by the link-layer FEC stage. For the situation that an MPE-FEC frame is still incorrect after this second FEC stage, this reliability information is used for a second time, in combination with reliability information derived from the secondary FEC decoding stage, supplemented with the location information. We have shown that this concept for retrieving completely correct MPE-FEC frames indeed clearly improves the robustness by approximately 50%, while IP recovery in the remaining defective MPE-FEC frames, enabled on the basis of joint reliability and correct IP datagram location information, further leads to 20% additional IP datagram recovery.

RQ2b How to communicate correctly received and FEC-corrected IP datagrams in a smooth communication way? DVB-H is an IP-datagram-based communication standard and uses, from each IP datagram header, the length field to retrieve this IP datagram from memory. This retrieval mechanism only works properly when all received IP datagrams are correct, either due to reception or after link-layer FEC. For the situation that the link-layer FEC cannot correct all erroneously received IP datagrams, loss of IP datagrams may occur. Our proposed solution is based on deriving the storage locations of correctly received IP datagrams, enabling their retrieval from memory. However, the memory may also contain defective IP datagrams after FEC, which prevents retrieval of successive IP datagrams. When deriving reliability information during link-layer FEC and combining this information with the reliability and location information derived during data reception, this combination facilitates the successful retrieval of FEC-corrected IP datagrams from memory; a sketch is given below. It should be noted that the remaining defective IP datagrams remain in the memory and are not passed on to the network layer. In this way, all IP datagrams are forwarded once to the network layer, which contributes to the efficiency and thus reduces power consumption.
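The retrieval step can be summarized by a small sketch. Assume that reception and FEC decoding have produced a hypothetical table of (offset, length, reliable) records for the datagrams in an MPE-FEC frame buffer; only the reliable ones are then forwarded, so a defective datagram no longer breaks the length-field walk through memory.

    def forward_correct_datagrams(frame, datagram_table):
        # frame: byte buffer of the application-data part of an MPE-FEC frame
        # datagram_table: (offset, length, reliable) records gathered during
        # reception and link-layer FEC decoding (illustrative structure)
        for offset, length, reliable in datagram_table:
            if reliable:
                yield bytes(frame[offset:offset + length])  # pass to network layer
            # defective datagrams stay in memory and are never forwarded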

RQ3 How to efficiently detect visible coding noise locations in MPEG-coded video with sufficient performance? For the third research question (RQ3), Chapter 6 is relevant.

RQ3a With what methods can visible MPEG noise patterns reliably be found in the image and what are the corresponding metrics? In Chapter 6, we propose two block-based artifact-location detection systems operating in either the spatial domain or the frequency domain. The proposed algorithms employ an artifact-detection kernel constructed of a set of neighboring blocks. For each block constructing the detection kernel in the spatial-domain processing, the SAD value is calculated as an activity metric. For the frequency-domain solution, a set of noise and ringing patterns is defined for detection. The activity of the detected noise or ringing is classified into three simple indications, namely “flat and/or low-frequency”, “noise contaminated” and “texture”. These signal features are employed in two ways. First, on the basis of the block-based signal classification a simplified video model is derived, using the surrounding blocks of the considered area and classifying those blocks in a similar way. These surrounding blocks act as additional context information, which is suitable for context reasoning about the visibility of possible coding noise. This gives a more robust location detection of potentially noise-contaminated regions. Second, these block-based signal features are also employed afterwards to expand the detection signal, employing a diamond-shaped aperture. The combined detection performance for the spatial-domain detection system varies within 62–89%, depending on the employed block size and amount of video compression, while for the frequency-domain detection system the detection score is 98–99%, which only depends on the amount of video compression because the involved block size is fixed.
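A minimal sketch of the spatial-domain block classification is given below; the SAD is taken here with respect to the block mean, and the two thresholds are illustrative assumptions, since the text does not fix their values at this point.

    import numpy as np

    def classify_block(block, t_flat=40, t_texture=400):
        # SAD activity of one kernel block: low activity indicates a flat
        # or low-frequency block, very high activity indicates texture,
        # and the band in between is potentially noise contaminated.
        sad = int(np.abs(block.astype(np.int32) - round(block.mean())).sum())
        if sad <= t_flat:
            return "flat and/or low-frequency"
        if sad >= t_texture:
            return "texture"
        return "noise contaminated"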

RQ3b How can the reliability of the detection methods be improved? Visible noise patterns are located in flat and/or low-frequency regions. For the situation that a noise-contaminated region is detected, the surrounding region is tested for a larger flat and/or low-frequency region, which confirms the visibility aspect. If this large region is indeed present, the detection is confirmed, otherwise the detection is rejected.


RQ3c How can this method be embedded in a DTV platform? The deployment of these detection systems in an embedded platform can be conducted in either a separable or a non-separable approach. Hereby, the activity calculation and associated context reasoning are either calculated in the same processing stage, or separated over two different processing stages. When employing this detection system in a non-separable approach, the detection is embedded in a video enhancement pipe, which requires local storage of the video signal involving line memories. Hereby, the amount of line memories depends on the vertical aperture of the detection kernel, which depends on the employed block size. The adopted systems employ a block size of either 3 × 3 or 4 × 4 pixels, which is small enough to be embedded or combined with a DCT transform of 4 × 4 or 8 × 8 pixel blocks. Second, the SAD metric is a standard operation in modern CPUs and DSPs, so that it can be easily implemented, even with parallelism. Third, the frequency transform is based on a DCT computation, which enables re-use of processing blocks in MPEG-based systems. For a smooth implementation of the kernel memory, the activity calculation and the spatial reasoning are separated. The advantage of such an approach is that the vertical aperture, and thus the amount of embedded line memories, depends on the employed block size and not on the deployed detection kernel size. However, this approach increases the external memory input/output for temporary storage of the video data. Such a separable system allows the detection kernel to have an increased vertical aperture, while avoiding the need for the corresponding embedded line memories.

7.3 Discussion and outlook

Video navigation
Video navigation is an essential feature, which will also be present in future digital storage systems. Due to advances in technology, these digital storage systems may differ from current solutions and involve e.g. silicon-based storage media and model-based compression technology. Despite these advances in technology, the navigation solutions presented in this thesis will still be applicable, provided that images or image parts are still coded as recoverable data fragments, since they separate the involved navigation processing over the recording and playback phases, which enables balancing of the involved signal processing. Furthermore, the solutions presented in this thesis re-use normal-play compressed pictures, either intra-coded or predictive-coded, which are conceptual video compression approaches that will also be available, most probably in a more advanced form, in future video compression algorithms.

It can be observed that the consumer storage systems developed in the past decade are equipped with conventional video navigation features, while more advanced navigation features are still absent. This is likely caused by two reasons. The first is based on the cost constraints associated with video navigation, which are traditionally kept at a low level, while more advanced navigation features would exceed the cost budget for conventional video navigation. A second reason is the absence of a strong demand for advanced video navigation features in video storage systems. Although the storage capacity of consumer recording devices has grown over time, the conventional video navigation feature is still considered sufficient for typical navigation needs.

In the future, with the increased video services offered in the cloud, all video-related processing will shift away from the end user towards centralized storage systems. In order to efficiently navigate through this enormous amount of video information, there will be a larger demand for advanced video navigation solutions. It is therefore expected that the work conducted in the field of intra- and inter-program video navigation will emerge via new cloud-based video distribution solutions.

Mobile IP-based television
Future battery-powered mobile communication systems will most probably employ IP-based data protected by an additional forward error correction layer, to improve the reception performance. For such future systems, two system aspects described in this thesis remain suitable. First, in order to improve the reception robustness, the usage of error and erasure decoding, whereby the erasure flags are derived on small-sized data fragments. Second, the avoidance of IP datagram duplication, which reduces data bandwidth and lowers the energy consumption.

Artifact-location detection
Coding artifacts are inevitable and inherently associated with lossy video compression, making artifact detection a basic function as part of the video enhancement processing. Coding artifacts may vary for different compression schemes, which results in different enhancement processing. It is expected that ringing, as it finds its origin in the Gibbs phenomenon, will remain a basic artifact inevitably associated with the huge legacy of block-based video compression standards (even emerging ones) and requires locally-adaptive filtering in order to attenuate this visible distortion. Furthermore, depending on the compression scheme, local image enhancement may also involve locally-adaptive sharpening. This would serve the reconstruction of edges, which suffered from high-frequency detail removal due to compression. The artifact-detection approaches presented in this thesis enable location detection of regions which exhibit blurring and ringing artifacts caused by the attenuation and quantization of the high-frequency components. These deteriorations not only occur in all MPEG standards so far, but are also the case for low-rate JPEG2000-compressed images based on wavelet coding.

Appendices


A

MPEG-2 Adaptation field

The MPEG-2 transport stream packet can be equipped with an adaptation field. This field is used for various multiplexing options, such as stuffing at TS level, indicated by the adaptation field length, and the transmission of the Program Clock Reference (PCR), indicated by the PCR flag. For the remaining fields, see [36]. The PCR is transmitted as a 33-bit program clock reference base value, see Equation (A.1a), and a 9-bit program clock reference extension value, see Equation (A.1b). Hereby the system clock reference is a 27-MHz clock.

PCR_base(i) = ((system clock reference × t(i)) DIV 300) mod 2^33    (A.1a)
PCR_ext(i) = ((system clock reference × t(i)) DIV 1) mod 300        (A.1b)

Hereby t(i) denotes the intended arrival time of byte i at the input of the system target decoder.

The PCR indicates the intended time of arrival of the byte containing the last bit of the program clock reference base at the input of the system target decoder.
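As an illustration of Eqns. (A.1a) and (A.1b), the following Python fragment (names are illustrative) splits a 27-MHz system time into the 33-bit base and 9-bit extension fields.

    def pcr_fields(t_seconds, f_sc=27_000_000):
        # Count 27-MHz ticks, then split them into the 90-kHz base
        # (ticks DIV 300, modulo 2^33) and the extension (ticks mod 300).
        ticks = int(f_sc * t_seconds)
        pcr_base = (ticks // 300) % 2**33
        pcr_ext = ticks % 300
        return pcr_base, pcr_ext

    print(pcr_fields(1.0))  # -> (90000, 0): one second equals 90 000 base ticks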


B

MPEG-2 Timestamps

In a Packetized Elementary Stream (PES), the PTS_DTS_flags field indicates the presence of a Decoding Time Stamp (DTS) and a Presentation Time Stamp (PTS), which are calculated according to Eqns. (B.1a) and (B.1b), whereby the system clock reference is a 27-MHz clock.

PTS(j) = ((system clock reference × tp(j)) DIV 300) mod 2^33    (B.1a)
DTS(j) = ((system clock reference × td(j)) DIV 300) mod 2^33    (B.1b)

Hereby td(j) denotes the decoding time in the system target decoder of an access unit j. Similarly, tp(j) denotes the presentation time in the system target decoder of an access unit j. Figure B.1 visualizes a video decoding example of the DTS and PTS values in relation to the PCR for a 25-Hz-based television system.

Figure B.1 — Program clock reference example.

In Fig. B.1, DTS(j) indicates the moment when the decoding process can start; the time prior to DTS(j) is required to fill the video Elementary Stream (ES) decoder buffer, such that it contains at least a complete access unit. The decoded picture associated with DTS(j) is presented one frame period later, indicated by PTS(j), whereby the next compressed picture is decoded, as indicated by DTS(j + 1). Such a situation applies to MPEG-2 Simple Profile compressed video, which consists of only I-type and P-type pictures.

The DSM_trick_mode_flag indicates the presence of information regarding trick-play playback modes, such as fast-search modes, slow-search modes or freeze-frame display. However, although the usage of trick modes associated with the DSM_trick_mode_flag is recommended for decoding systems equipped with a digital interface, this is not mandatory [37]. As a result, the support for these trick-mode facilities is not reliable and other methods are required, circumventing the need for the DSM_trick_mode_flag and its associated playback, while obtaining similar or equal trick-play playback [36].
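As a short worked example of Eqns. (B.1a) and (B.1b): since the base values count in units of 27 MHz DIV 300 = 90 kHz, one frame period of a 25-Hz system corresponds to 90 000/25 = 3600 ticks, so in the Simple Profile example of Fig. B.1, PTS(j) = DTS(j) + 3600 and DTS(j + 1) = DTS(j) + 3600.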

C

MPEG-2 Section Syntax

A section is a syntactic structure that can be regarded as a container and is defined by ISO/IEC 13818-1 [36]. The section syntax structure is deployed by MPEG-2 Program Specific Information (PSI) tables [36] and by Service Information (SI) tables [130], defined by the DVB Project. The SI section syntactic structures comply with the private section syntax defined by ISO/IEC 13818-1 [36]. Sections are carried by TS packets with either predetermined Packet IDentifiers (PIDs) or user-selectable PIDs, and offer error detection by means of a Cyclic Redundancy Check (CRC) [36]. A section starts with a Table id, which serves as an identification number. For the situation that the Table id is present in that particular TS packet, the TS header is followed by a pointer field, which in turn may be preceded by an adaptation field. This pointer field points to the position where the first byte, the Table id, of a section starts. When a section finishes, the next byte position, if available, contains either the Table id of the next section or “0xFF”.

Figure C.1 — TS packet examples containing section data. (a) TS packet containing first section byte. (b) TS packet with only section payload. (c) TS packet with last part of section and start of new section. (d) TS packet with last part of section followed by stuffing bytes.


If “0xFF”, then this indicates stuffing data, which fills up the remaining TS payload byte positions of a particular TS packet, see Fig. C.1. Figure C.1 indicates some typical TS packet examples containing section payload, whereby n is a linear index indicating successive sections. The first eight bytes of a section form a generic part, followed by a section-dependent part, see Fig. C.3 and Fig. C.2. Figure C.3 and Fig. C.2 show two sections which are used by Digital Video Broadcasting Handheld (DVB-H) for the transmission of IP data and, if available, the Reed-Solomon parity data. Besides the transmission of SI/PSI table information, typically deploying a data-carousel-based transmission model, sections are also used for multi-protocol encapsulation, as deployed in DVB-H. Hereby an IP datagram, see Fig. C.3, is encapsulated into a Multi Protocol Encapsulation (MPE) section, and the RS parity data, see Fig. C.2, is encapsulated into a Multi Protocol Encapsulation Forward Error Correction (MPE-FEC) section. The MPE section is based on the DSM-CC section [96] and modified according to [97], while the MPE-FEC sections are based on the DSM-CC section type “User private” [96], [97].
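For illustration, the following Python sketch locates the first section byte (the Table id) in a 188-byte TS packet, taking the optional adaptation field and the pointer field into account; it is a simplified reading of the ISO/IEC 13818-1 packet layout, without error handling.

    def section_start_offset(ts_packet):
        # Returns the byte offset of the Table id, or None when the packet
        # carries only the continuation of a section (no pointer field).
        assert len(ts_packet) == 188 and ts_packet[0] == 0x47  # sync byte
        pusi = (ts_packet[1] >> 6) & 0x01   # payload_unit_start_indicator
        afc = (ts_packet[3] >> 4) & 0x03    # adaptation_field_control
        pos = 4
        if afc & 0x02:                      # adaptation field present
            pos += 1 + ts_packet[4]         # its first byte holds the length
        if not pusi:
            return None                     # case (b) in Fig. C.1
        pointer = ts_packet[pos]            # pointer field
        return pos + 1 + pointer            # position of the Table id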

Figure C.2 — DVB-H Multi-Protocol Encapsulation Forward Error Correction (MPE-FEC) section.

Figure C.3 — DVB-H Multi-Protocol Encapsulation (MPE) section based on a modified DSM-CC section.


D

Characteristic Point Information

In order to optimize the involved signal processing during video navigation, Characteristic Point Information (CPI) is derived involving signal processing during recording, which is stored as metadata. The involved signal processing determines features of the MPEG-2-compressed normal-play video sequence that are not available as MPEG-2 syntax elements. Moreover, to facilitate efficient fast-search video navigation, the start location of I-pictures is determined and stored as metadata. Algorithm 17 indicates the involved CPI signal processing.


Algorithm 17 Video navigation record processing
Initialize: GOPentry = 1, TScnt = 0, N = 0, Bpictcnt = 0, fr = television frame rate
while not end of recording do                      ⊲ record processing on normal-play video
    demultiplex TS packet                          ⊲ MPEG-2 demux of the current TS packet,
                                                   ⊲ giving access to the video elementary stream
    while nextbytes ≠ 0x000001 do                  ⊲ byte-level search for the start-code prefix
        nextbytes = (nextbytes ≪ 8) + nextbyte
    end while
    read start_code
    if start_code == picture header then           ⊲ process each individual picture
        N = N + 1
        if picture == I-picture then
            M = N/(N − Bpictcnt)                   ⊲ calculate M for previous GOP
            metadata[GOPentry − 1].M = M           ⊲ store M for previous GOP
            metadata[GOPentry − 1].N = N           ⊲ store N for previous GOP
            metadata[GOPentry].address = TScnt × 188   ⊲ I-picture start address
            determine Ipicture_size                ⊲ determine current I-picture size
            nr_rep_pict = ⌈fr × Ipicture_size / maxbitrate⌉ − 1
            metadata[GOPentry].transmission = nr_rep_pict
            GOPentry = GOPentry + 1
            N = 0
            Bpictcnt = 0
            metadata[GOPentry].pict = subpict      ⊲ sub-picture
            metadata[GOPentry].opict = subpictOSD  ⊲ OSD sub-picture
            nr_rep_pict = ⌈fr × mosaic_screen_size / maxbitrate⌉ − 1
            metadata[GOPentry].mosaic_transmission = nr_rep_pict
        end if
        if picture == B-picture then
            Bpictcnt = Bpictcnt + 1                ⊲ count all B-pictures per GOP
        end if
    end if
    TScnt = TScnt + 1                              ⊲ count all normal-play TS packets
end while

E

Artifact-location detection on full-HD upscaled video

Table E.1 — Detection score performance in percentages of noise-pattern pixel detection in the spatial domain and frequency domain for upscaled HD resolution video. The kernel and block sizes are indicated in pixels.

                        Spatial-domain                     Frequency-domain
kernel size             11 × 11    19 × 19    21 × 21      20 × 20
block size              3 × 3      mixed      5 × 5        4 × 4

JPEG Q=25
True detected           8%         46%        40%          54%
Expansion detected      21%        40%        44%          33%
Total                   29%        86%        84%          87%

JPEG Q=50
True detected           14%        54%        49%          57%
Expansion detected      32%        32%        40%          29%
Total                   46%        86%        89%          86%


Publication List

The following conference papers, journal papers and book chapters have been published based on the research presented in this thesis.

[10] J.P. van Gassel, D.P. Kelly, O. Eerenberg and P.H.N. de With. “MPEG-2 Compliant Trick play over a Digital Interface”. In: Proc. Int. Conf. Consum. Electron. June 2002, pp. 170–171. ISBN: 0-7803-7300-6.
[11] O. Eerenberg and P.H.N. de With. “System Requirements and Considerations for Visual Table of Contents in Personal Video Recording”. In: Proc. Int. Conf. Consum. Electron. June 2003, pp. 24–25. ISBN: 0-7803-7721-4.
[12] O. Eerenberg and P.H.N. de With. “MPEG-2 Compliant Trick play over a Digital Interface”. In: IEEE Trans. Consum. Electron. 51.3 (Aug. 2005), pp. 958–966.
[13] O. Eerenberg and P.H.N. de With. “System Requirements and Considerations for Visual Table of Contents in PVR”. In: IEEE Trans. Consum. Electron. 54.3 (Aug. 2008), pp. 1206–1214.
[14] O. Eerenberg and P.H.N. de With. “Digital Video”. In: Intech, Vukovar, Croatia, Feb. 2010. Chap. 15. ISBN: 978-9537619701.
[16] O. Eerenberg, R.M. Aarts and P.H.N. de With. “System Design of Advanced Video Navigation Reinforced with Audible Sound in Personal Video Recording”. In: Proc. Int. Conf. Consum. Electron. Jan. 2008, pp. 1–2. ISBN: 1-4244-1458-1.
[22] O. Eerenberg, A. Koppelaar, A. Stuivenwold and P.H.N. de With. “IP-recovery in the DVB-H Link Layer for TV on Mobile”. In: Proc. Int. Conf. Consum. Electron. Jan. 2006, pp. 411–412. ISBN: 0-7803-9459-3.
[23] A.G.C. Koppelaar, O. Eerenberg, L.M.G.M. Tolhuizen and V. Aue. “Restoration of IP-datagrams in the DVB-H link-layer for TV on mobile”. In: Proc. Int. Conf. Consum. Electron. Jan. 2006, pp. 409–410. ISBN: 0-7803-9459-3.


[24] O. Eerenberg, P. Wendrich, E.H.W. van Orsouw and P.H.N. de With. “Efficient Validation/Verification of a Robust DVB-H Link Layer”. In: Proc. Int. Conf. Consum. Electron. Jan. 2007, pp. 15–16. ISBN: 1-4244-0762-1.
[25] O. Eerenberg, A. Koppelaar, A. Stuivenwold and P.H.N. de With. “IP-Recovery in the DVB-H Link Layer for TV on Mobile”. In: IEEE Trans. Consum. Electron. 57.2 (May 2011), pp. 339–347.
[26] O. Eerenberg, P. Wendrich, E.H.W. van Orsouw and P.H.N. de With. “Efficient Validation/Verification of a Robust DVB-H Link Layer”. In: IEEE Trans. Consum. Electron. 57.5 (Nov. 2011), pp. 1679–1687.
[27] O. Eerenberg, A. Koppelaar and P.H.N. de With. “Mobile Multimedia Broadcasting Standards: Technology and Practice”. In: Springer, New York, USA, Nov. 2008. Chap. 8. ISBN: 978-0387782621.
[28] O. Eerenberg, J. Kettenis and P.H.N. de With. “Block-Based Detection Systems for Visual Artifact Location”. In: Proc. Int. Conf. Consum. Electron. Jan. 2013, pp. 116–117. ISBN: 978-1-4673-1362-9.
[29] O. Eerenberg, J. Kettenis and P.H.N. de With. “Block-based detection systems for visual artifact location”. In: IEEE Trans. Consum. Electron. 59.2 (May 2013), pp. 376–384.
[131] O. Eerenberg, Y. Gao, E. Trauschke and P.H.N. de With. “Pattern-Based De-Correlation for Visual-Lossless Video Compression for Wireless Display Applications”. In: Proc. Int. Conf. Consum. Electron. Jan. 2010, pp. 229–230. ISBN: 978-1-4244-4315-4.

Complete Bibliography

[1] DVB Project. http://www.dvb.org.
[2] A/53: ATSC Digital Television Standard, Parts 1 - 6, 2007. Jan. 2004.
[3] CEI/IEC-61834: “Helical-scan digital video cassette recording system using 6,35 mm magnetic tape for consumer use (525-60, 625-50, 1125-60 and 1250-50 systems)”. 1998.
[4] ISO/IEC 11172-2, Information technology - Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s - Part 2: Video. Aug. 1993.
[5] ITU-T Rec. H.262 and ISO/IEC 13818-2 (MPEG2), Generic Coding of Moving Pictures and Associated Audio Information Part 2: Video. Nov. 1994.
[6] O. Eerenberg and A.M.A. Rijckaert. Patent Application – Digital information recording apparatus using tracks on record carrier uses first and second digital information signals to create first recording signal and first trick play signal, suitable for recording in tracks of record carrier. US6400888B1, June 2002.
[7] O. Eerenberg, D.P. Kelly and J.P. Gassel. Patent Application – Encoded video signal for TV based trick play, has empty interlaced elimination picture displayed immediately following original picture. US20020167607A1, Nov. 2002.
[8] O. Eerenberg, E. Gijsbers and E. Brandsma. Patent Application – Compressed video signal sub-pictures manipulating method, involves manipulating sub-picture of compressed video signal by manipulating association of control data with compressed picture blocks related to sub-picture. US20060050790A1, Mar. 2006.
[9] W.H.A. Bruls, O. Eerenberg and A.M.A. Rijckaert. Patent Application – Trick play signal generation method for . US20030231863A1, Dec. 2003.


[10] J.P. van Gassel, D.P. Kelly, O. Eerenberg and P.H.N. de With. “MPEG-2 Compliant Trick play over a Digital Interface”. In: Proc. Int. Conf. Consum. Electron. June 2002, pp. 170–171. ISBN: 0-7803-7300-6.
[11] O. Eerenberg and P.H.N. de With. “System Requirements and Considerations for Visual Table of Contents in Personal Video Recording”. In: Proc. Int. Conf. Consum. Electron. June 2003, pp. 24–25. ISBN: 0-7803-7721-4.
[12] O. Eerenberg and P.H.N. de With. “MPEG-2 Compliant Trick play over a Digital Interface”. In: IEEE Trans. Consum. Electron. 51.3 (Aug. 2005), pp. 958–966.
[13] O. Eerenberg and P.H.N. de With. “System Requirements and Considerations for Visual Table of Contents in PVR”. In: IEEE Trans. Consum. Electron. 54.3 (Aug. 2008), pp. 1206–1214.
[14] O. Eerenberg and P.H.N. de With. “Digital Video”. In: Intech, Vukovar, Croatia, Feb. 2010. Chap. 15. ISBN: 978-9537619701.
[15] R. Aarts and G. Bloemen. Patent Application – Method and apparatus for providing a video signal. US20070035666A1, Oct. 2003.
[16] O. Eerenberg, R.M. Aarts and P.H.N. de With. “System Design of Advanced Video Navigation Reinforced with Audible Sound in Personal Video Recording”. In: Proc. Int. Conf. Consum. Electron. Jan. 2008, pp. 1–2. ISBN: 1-4244-1458-1.
[17] O. Eerenberg, R.M. Aarts and P.H.N. de With. “PVR system design of advanced video navigation reinforced with audible sound”. In: IEEE Trans. Consum. Electron. 60.4 (Nov. 2014), pp. 681–689.
[18] A.G.C. Koppelaar, O. Eerenberg and M.G. Verhoeven. Patent Application – Packet reconstruction apparatus for digital video broadcast system for handheld, stores and ignores fragment and its length in column along with its associated length variable when fragment is in packet with un-correctable error. US20080209477A1, Aug. 2008.
[19] A.G.C. Koppelaar, L.M.G.M. Tolhuizen and O. Eerenberg. Patent Application – Packet reconstruction apparatus for digital video broadcast system for handheld, stores and ignores fragment and its length in column along with its associated length variable when fragment is in packet with un-correctable error. US20080282310A1, Nov. 2008.
[20] A.G.C. Koppelaar, O. Eerenberg and A.M. Stuivenwold. Patent Application – for receiving bursts in communications network requests end of reception of error correction data when amount of correctly received data is less than necessary data amount. US20090006926A1, Jan. 2009.


[21] O. Eerenberg, A.G.C. Koppelaar and A.M. Stuivenwold. Patent Application – Device e.g. mobile device used in transmission system using digital video broadcasting standards, has memory unit overwriting forward error correction data of one burst with data of other burst. US20080263428A1, Oct. 2008.
[22] O. Eerenberg, A. Koppelaar, A. Stuivenwold and P.H.N. de With. “IP-recovery in the DVB-H Link Layer for TV on Mobile”. In: Proc. Int. Conf. Consum. Electron. Jan. 2006, pp. 411–412. ISBN: 0-7803-9459-3.
[23] A.G.C. Koppelaar, O. Eerenberg, L.M.G.M. Tolhuizen and V. Aue. “Restoration of IP-datagrams in the DVB-H link-layer for TV on mobile”. In: Proc. Int. Conf. Consum. Electron. Jan. 2006, pp. 409–410. ISBN: 0-7803-9459-3.
[24] O. Eerenberg, P. Wendrich, E.H.W. van Orsouw and P.H.N. de With. “Efficient Validation/Verification of a Robust DVB-H Link Layer”. In: Proc. Int. Conf. Consum. Electron. Jan. 2007, pp. 15–16. ISBN: 1-4244-0762-1.
[25] O. Eerenberg, A. Koppelaar, A. Stuivenwold and P.H.N. de With. “IP-Recovery in the DVB-H Link Layer for TV on Mobile”. In: IEEE Trans. Consum. Electron. 57.2 (May 2011), pp. 339–347.
[26] O. Eerenberg, P. Wendrich, E.H.W. van Orsouw and P.H.N. de With. “Efficient Validation/Verification of a Robust DVB-H Link Layer”. In: IEEE Trans. Consum. Electron. 57.5 (Nov. 2011), pp. 1679–1687.
[27] O. Eerenberg, A. Koppelaar and P.H.N. de With. “Mobile Multimedia Broadcasting Standards: Technology and Practice”. In: Springer, New York, USA, Nov. 2008. Chap. 8. ISBN: 978-0387782621.
[28] O. Eerenberg, J. Kettenis and P.H.N. de With. “Block-Based Detection Systems for Visual Artifact Location”. In: Proc. Int. Conf. Consum. Electron. Jan. 2013, pp. 116–117. ISBN: 978-1-4673-1362-9.
[29] O. Eerenberg, J. Kettenis and P.H.N. de With. “Block-based detection systems for visual artifact location”. In: IEEE Trans. Consum. Electron. 59.2 (May 2013), pp. 376–384.
[30] ISO/IEC 11172-1, Information technology - Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s - Part 1: Systems. Aug. 1993.
[31] ISO/IEC 11172-3, Information technology - Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s - Part 3: Audio. Aug. 1993.
[32] EN 300 468: Digital Video Broadcasting (DVB); Specification for Service Information (SI) in DVB systems. Feb. 1998.
[33] ETSI EN 302 304: “Digital Video Broadcasting (DVB); Transmission System for Handheld Terminals (DVB-H)”. Nov. 2004.


[34] Draft ETSI TR 102 377 v1.1.1: Digital Video Broadcasting (DVB); DVB-H Implementation Guidelines. Feb. 2005.
[35] M. Yuen and H.R. Wu. “A survey of hybrid MC/DPCM/DCT video coding distortions”. In: Signal Process., special issue on image and video quality metrics 70.3 (Nov. 1998), pp. 247–278.
[36] ITU-T Rec. H.262 and ISO/IEC 13818-1 (MPEG2), Generic Coding of Moving Pictures and Associated Audio Information Part 1: Systems. Nov. 1994.
[37] Digital Video Broadcasting (DVB); Specification for the use of Video and Audio Coding in Broadcasting Applications based on the MPEG-2 Transport Stream. Nov. 2012.
[38] S. Baggen. “Digital consumer electronics handbook”. In: McGraw-Hill, Hightstown, NJ, USA, May 1997. Chap. 4. ISBN: 0-07-034143-5.
[39] ITU-T Rec. H.262 and ISO/IEC 13818-3 (MPEG2), Generic Coding of Moving Pictures and Associated Audio Information Part 3: Audio. Nov. 1994.
[40] ITU-T Rec. H.264 and ISO/IEC 14496-10 (MPEG4-AVC), Advanced Video Coding for Generic Audiovisual Services, v4. July 2005.
[41] A. Luthra, G.J. Sullivan and T. Wiegand. “Introduction to the Special Issue on the H.264/AVC Video Coding Standard”. In: IEEE Trans. Circuits Syst. Video Technol. 13.7 (July 2003), pp. 557–559.
[42] C. Fenimore, J. Libert and P. Roitman. “Mosquito noise in MPEG compressed video: test patterns and metrics”. In: vol. 3959. June 2000, pp. 604–612. ISBN: 978-0-8194-3577-4.
[43] S.J.P. Westen, R.L. Lagendijk and J. Biemond. “Adaptive spatial noise shaping for dct based ”. In: Proc. Int. Conf. Acoust., Speech, Signal Process. May 1996, pp. 2124–2127. ISBN: 0-7806-3193-1.
[44] C. Mantel, P. Ladret and T. Kunlin. “Temporal mosquito noise corrector”. In: Int. Workshop Quality of Multimedia Experience. July 2009, pp. 29–31. ISBN: 978-1-4244-4370-3.
[45] D.T. Vo, T.Q. Nguyen, S. Yea and A. Vetro. “Coding artifacts reduction using edge map guided adaptive and fuzzy filtering”. In: IEEE Trans. Image Process. 18.6 (June 2009), pp. 1166–1178.
[46] ETSI TS 182 027 V2.0.0, TISPAN; IPTV Architecture; IPTV functions supported by the IMS subsystem. Feb. 2008.
[47] S.Y. Lim, J.H. Choi, J.M. Seok and H.K. Lee. “Advanced PVR Architecture with Segment-based Time-Shift”. In: Proc. Int. Conf. Consum. Electron. Jan. 2007, pp. 1–2. ISBN: 1-4244-0763-X.
[48] J.H. Lee, G.G. Lee and W.Y. Kim. “Automatic video summarizing tool using MPEG-7 descriptors for personal video recorder”. In: Proc. Int. Conf. Consum. Electron. June 2003, pp. 742–749. ISBN: 0-7803-7721-4.


[49] F.E. Sandnes, Y.P. Huang and Y.M. Huang. “Browsing of Large Video Collections on Personal Video Recorders with Remote Control Navigation Keys”. In: Proc. Int. Conf. Future Generation Commun. Networking. Dec. 2008, pp. 422–427. ISBN: 978-0-7695-3431-2.
[50] S. Suh, J.R. Kim and S. Sull. “Design and implementation of an enhanced personal video recorder for DTV”. In: IEEE Trans. Consum. Electron. 47.4 (Nov. 2001), pp. 864–869. ISSN: 0098-3063.
[51] K.W. Tindell. A digital video recorder system connectable to devices running a web browser. WO2012032174A1, Mar. 2012.
[52] B. Here, Y. Poupet and L. Vantalon. Method and Apparatus for Trick Mode Operation of a Personal Video Recorder Over a Network. EP-2383740, Apr. 2011.
[53] E. Mikoczy, R. Kadlic and P. Podhradsky. “Advance PVR Applications in IMS Based IPTV Environment”. In: Proc. Int. Conf. Syst., Signals and Image Process. June 2009, pp. 1–4. ISBN: 978-1-4244-4530-1.
[54] F. Callaly and P.M. Corcoran. “Architecture of a PVR appliance with ‘long-tail’ Internet-TV capabilities”. In: IEEE Trans. Consum. Electron. 52.2 (May 2006), pp. 454–459.
[55] E.S. Kim, S.W. Jung and D.H. Lee. “Design and implementation of an enhanced personal video recorder for DTV”. In: IEEE Trans. Consum. Electron. 47.4 (Nov. 2001), pp. 864–869. ISSN: 0098-3063.
[56] M.F. Demeyer. Patent Application – Apparatus and method for automated video editing. US8290334B2, Oct. 2012.
[57] V. Iverson, G. Martz, J. Mcveigh, R. Rao, K. Salzberg, S. Sirivara and D. Wagner. Transcoding media content from a personal video recorder for a portable device. WO2003107672A1, Mar. 2005.
[58] H. Yuen. Patent Application – Personal Video Recorder with High-Capacity Archive. US20020186957A1, Apr. 2002.
[59] A. Divakaran and R. Cabasson. “Content-based browsing system for personal video recorders”. In: Proc. Int. Conf. Consum. Electron. June 2002, pp. 114–115. ISBN: 0-7803-7300-6.
[60] A. Divakaran. Multimedia Content Analysis. Springer Verlag, 2009. ISBN: 978-0-387-76567-9.
[61] M.D. Ellis. Recommendation of Media Content on a Media System. WO2003098932A1, Apr. 2003.
[62] M.D. Ellis. Systems and Methods for Interactive Program Guides with Personal Video Recording Features. WO2002069636A1, Feb. 2002.


[63] B. Engelbert, M. Blanken, R. Kruthoff and K. Morisse. “A user supporting personal video recorder by implementing a generic Bayesian classifier based recommendation system”. In: Proc. Int. Conf. Pervasive Comput. Commun. Workshops. Mar. 2011, pp. 567–571. ISBN: 978-1-61284-938-6.
[64] S.Y. Lim, J.H. Choi, J.M. Seok and H.K. Lee. “Advanced PVR Architecture with Segment-based Time-Shift”. In: Proc. Int. Conf. Consum. Electron. Jan. 2007, pp. 1–2. ISBN: 1-4244-0762-1.
[65] J. Park. Patent Application – Method of searching scenes recorded in PVR and television receiver using the same. US20070040936A1, Feb. 2007.
[66] H. Kim, J. Kim, M. Rostoker, Y.S. Seong and S. Sull. Patent Application – Techniques for navigating multiple video streams. US20060064716A1, Mar. 2006.
[67] G.G. Lee, J.H. Lee and W.Y. Kim. “Automatic video summarizing tool using MPEG-7 descriptors for personal video recorder”. In: IEEE Trans. Consum. Electron. 49.3 (Aug. 2003), pp. 742–749. ISSN: 0098-3063.
[68] Y. Rui, Z. Xiong, R. Radhakrishnan, A. Divakaran and T.S. Huang. A Unified Framework for Video Summarization Browsing and Retrieval. Academic Press, 2005. ISBN: 978-0-12-369387-7.
[69] S.H. Lee. Patent Application – Personal Video Recorder and method for operating the same. US20030128969A1, Jan. 2003.
[70] R.R. Dunton, L.A. Booth and B. Hakimi. Patent Application – Image-Keyed Index for Video Program Stored in Personal Video Recorder. US20060110128A1, May 2006.
[71] W.S. Herz. Patent Application – Video Navigation System and Method. US20090074377A1, Mar. 2009.
[72] A.F. Elcock and J. Kamienicki. Patent Application – Navigation Recorded Video Using . US20070154171A1, July 2007.
[73] A.F. Elcock and J. Kamienicki. Patent Application – Navigation Recorded Video Using Captioning, Dialogue and Sound. US20070154176A1, July 2007.
[74] S.J.G. Min. Motional video browsing data structure and browsing method therefor. EP1006461, June 2000.
[75] E. Belinsky and E. Shahar. Patent Application – Method and System for Navigation of Audio and Video Files. US20100141655A1, June 2010.
[76] J.R. Kim, S. Suh and S. Sull. “Fast scene change detection for personal video recorder”. In: IEEE Trans. Consum. Electron. 49.3 (Aug. 2003), pp. 683–688.
[77] J.G. Jeong, C.K. Jung, S. Moon and S.K. Kim. Patent Application – Method, Medium and System Generating Navigation Information of Input Video. US20070291986A1, Dec. 2007.


[78] M. Bober and S. Paschalakis. Method and Apparatus for Video Navigation. WO2007028991A1, Mar. 2007.
[79] C.R. Johnson. Patent Application – User-specific time values for time-based navigation functions of video recorder systems. US6865336B2, Mar. 2005.
[80] S. Mo, C.J. Ochoa, V. Szilagyi and E. Smith. Method and Apparatus for Keyword-Based, Non-Linear Navigation of Video Streams and Other Content. WO2013039473A1, Sept. 2011.
[81] D.G. Cronin and R.A. Morris. Patent Application – Thumbnail Navigation Bar for Video. US20090172543A1, July 2009.
[82] M. Teicher, E. Lev and N. Cohen. Patent Application – Browsing System, Method and Apparatus for Video Motion Pictures. WO2000018120, Mar. 2000.
[83] W.S. Herz. Video Perspective Navigation System and Method. US20090160933A1, June 2009.
[84] International Standard ISO/IEC 14496-3, Information technology — Coding of audio-visual objects — Part 3: Audio. Dec. 2001.
[85] M. Fujita et al. “Newly developed D-VHS digital tape recording system for the multimedia era”. In: IEEE Trans. Consum. Electron. 42.3 (Aug. 1996), pp. 617–622.
[86] C. Buma et al. “DVD+RW: 2-Way Compatibility for Video and Data Applications”. In: Proc. Int. Conf. Consum. Electron. Jan. 2000, pp. 88–89.
[87] D.P. Kelly, W. van Gestel, T. Hamada, M. Kato and K. Nakamura. “Blu-ray disc - a versatile format for recording high definition video”. In: Proc. Int. Conf. Consum. Electron. June 2003, pp. 72–73. ISBN: 0-7803-7721-4.


[94] J.R. Leonardi. “Multiple Time-Scales of Language Dynamics: An Exam- ple From Psycholinguistics”. In: Taylor & Francis, Ecological Psychology (2010), pp. 269–285. ISSN: 1040-7413. [95] Chen Chen, Ping-Hao Wu and H. Chen. “Transform-Domain Intra Pre- diction for H.264”. In: IEEE ISCAS (May 2005), pp. 1497–1500. [96] ISO/IEC 13818-6: Information Technology - Generic Coding of Moving Pictu- res and Associated Audio Recommendation (Extensions for DSM-CC). Sept. 1998. [97] ETSI EN 301 192: Digital Video Broadcasting (DVB); DVB specification for data broadcasting. June 2004. [98] ETSI TR 102 469: Digital Video Broadcasting (DVB); IP Datacast over DVB- H: Architecture. May 2006. [99] J. Postel, ”Internet Protocol - DARPA Internet Program Protocol Specifica- tion,” RFC 791, USC/Information Sciences Institute. Sept. 1981. [100] S. Deering and R. Hinden, ”Internet Protocol, Version 6 (IPv6) Specification”, RFC 2460. Dec. 1998. [101] IS0/IEC 8802-2 Logical Link Control (LLC). 1998. [102] ISO/IEC 8802-1a SubNetwork Attachment Point (SNAP) specification, IP- datacast baseline specification; Specification of interfaceI MT A80. 2001. [103] ETSI TS 102 471: Digital Video Broadcasting (DVB); IP Datacast over DVB- H: Electronic Service Guide (ESG). Mar. 2010. [104] ISO/IEC 7498-1: ”Information Technology - Open Systems Interconnection - Basic Reference Model: The Basic Model”. [105] L.R. Siruvuri, P. Salama, and D.S. Kim. “Adaptive Error Resilience for Video Streaming”. In: Int. Journal of Digital Multimedia Broadcasting. Vol. - 2009. [106] H. Joki, J. Paavola and V. Ipatov. “Analysis of Reed-Solomon Coding Combined with Cyclic Redundancy Check in DVB-H link layer”. In: Int. Symposium Wireless Commun. Systems. Sept. 2005, pp. 313–317. ISBN: 0- 7803-9206-X. [107] H. Joki and J. Paavola. “A Novel Algorithm for Decapsulation and De- coding of DVB-H Link Layer Forward Error Correction”. In: IEEE Int. Conf. Commun. (ICC). June 2006, pp. 5283–5288. ISBN: 1-4244-0355-3. [108] ISO/IEC JTC 1, Coding of Audio-Visual Objects Part 2: Visual, ISO/IEC 14496-2 (MPEG4 Visual Version 2). Feb. 2000. [109] J. Hutchison. “Plasma display panels: the colorful history of an Illinois technology”. In: ECE alumni news, University of Illinois 36.1 (Oct. 2002).


[110] W. C. OMara. Liquid crystal flat panel display: manufacturing science and technology. Springer Verlag, 1993. ISBN: 978-0-442-01428-5. [111] Y.Juhn-Suk, J. Sang-Hoon, K. Yong-Chul, B. Seung-Chan, K. Jong-Moo, C. Nack-Bong, Y. Soo-Young, K. Chang-Dong, H. Yong-Kee and C. In- Jae. “Highly Flexible AM-OLED Display With Integrated Gate Driver Using Amorphous Silicon TFT on Ultrathin Metal Foil”. In: J. display technol. 6.11 (Nov. 2010), pp. 565–570. [112] D. Tralic, J. Petrovic and S. Grgic. “JPEG image tampering detection us- ing blocking artifacts”. In: Int. Conf. Systems, Signals, Image Process. (IWS- SIP). 2012, pp. 5–8. [113] H. Yao-Min, D. Leou and M. Cheng. “A Post for H.264 Video”. In: Proc. Int. Conf. Computer Commun. Networks (ICCCN). 2007, pp. 1137–1142. [114] A. Petrov, T. Kartalov and Z. Ivanovski. “Blocking effect reduction in low bitrate video on a mobile platform”. In: IEEE Int. Conf. Image Process. (ICIP). 2009, pp. 3937–3940. [115] I.O. Kirenko, S. Ling, R. Muijs and A. Nakonechny. “Enhancement of compressed video signals using a local blockiness metric”. In: IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP). 2008, pp. 1397–1400. [116] H. Liu and I. Heynderickx. “A perceptually relevant no-reference block- iness metric based on local image characteristics”. In: EURASIP J.Adv. Signal Process., vol. 2009. Vol. 2009. 2009. [117] F.X. Coudoux, M.G. Gazalet, and P. Corlay. “A postprocessor for reduc- ing temporal busyness in low-bit-rate video applications”. In: Signal Pro- cess. Image Commun. 18 (Feb. 2003), 455463. [118] K. Jostschulte. “A new cascaded spatio-temporal noise reduction scheme for interlaced video”. In: Proc. Int. Conf. Image Process. Oct. 1998, pp. 493– 497. ISBN: 0-8186-8821-1. [119] P. and Y.T. Kim. Patent Application – Method and apparatus for reducing mosquito noise in decoded video sequence. US7657098B2, Feb. 2010. [120] P. and Y.T. Kim. Patent Application – Video quality adaptive coding artifact reduction. US7865035B2, Jan. 2011. [121] P. Hou, C. Lin, S. Kondo and C. Wu. “Reduction Of Image Coding Arti- facts Using Spatial Structure Analysis”. In: Feb. 2007, pp. 1–4. [122] L. Shao and I. Kirenko. “Content adaptive coding artifact reduction for decompressed video and images”. In: Proc. Int. Conf. Consum. Electron. Jan. 2007, pp. 29–31. ISBN: 1-4244-0762-1.


[123] H. Liu, N. Klomp and I. Heynderickx. “A Perceptually Relevant Approach to Ringing Region Detection”. In: IEEE Trans. Image Process. 19.6 (2010), pp. 1414–1426.
[124] A.V. Nasonov and A.S. Krylov. “Scale-space method of image ringing estimation”. In: Proc. 16th IEEE Int. Conf. Image Process. (ICIP). 2009, pp. 2793–2796.
[125] S. Chebbo, P. Durieux and B. Pesquet-Popescu. “Adaptive deringing and mosquito noise reducer”. In: Jan. 2010.
[126] H.S. Kong, Y. Nie, A. Vetro, H. Sun and K.E. Barner. “Coding Artifact Reduction Using Edge Map Guided Adaptive and Fuzzy Filter”. In: Nov. 2004, pp. 1135–1138.
[127] I. Kirenko. “Reduction of coding artifacts using chrominance and luminance spatial analysis”. In: Proc. Int. Conf. on Consum. Electron. Jan. 2006, pp. 7–11. ISBN: 0-7803-9459-3.
[128] Z. Qian, W. Wang and T. Qiao. “An Edge Detection Method in DCT Domain”. In: Procedia Engineering. 2012, pp. 344–348.
[129] I. Kirenko, S. Ling and A. Nakonechny. “Quality Enhancement of Compressed Video Signals”. In: Proc. Int. Conf. Consum. Electron. Jan. 2008, pp. 1–2. ISBN: 1-4244-1458-1.
[130] Digital Video Broadcasting (DVB); Specification for Service Information (SI) in DVB systems, DVB Document A38. Jan. 2011.
[131] O. Eerenberg, Y. Gao, E. Trauschke and P.H.N. de With. “Pattern-Based De-Correlation for Visual-Lossless Video Compression for Wireless Display Applications”. In: Proc. Int. Conf. Consum. Electron. Jan. 2010, pp. 229–230. ISBN: 978-1-4244-4315-4.
[132] ETSI TR 102 401 V1.1.1: Technical Report Digital Video Broadcasting (DVB); Transmission to Handheld Terminal (DVB-H); Validation Task Force Report. May 2005.
[133] ISO/IEC 13818-4: Information Technology - Generic Coding of Moving Pictures and Associated Audio Recommendation H.222.0 (conformance). 1998.
[134] H. Schulzrinne, S. Casner, R. Frederick and V. Jacobson. “RTP: A Transport Protocol for Real-Time Applications”. RFC 1889. Jan. 1996.
[135] J. Postel. “User Datagram Protocol”. RFC 768. Aug. 1980.
[136] ITU-T Rec. H.263, Video Coding for Low Bit Rate Communication, v3. Nov. 2000.
[137] http://iphome.hhi.de/suehring/tml/.
[138] IEEE Standard 1012: IEEE Standard for Software Verification and Validation. 1998.


[139] R.J.J. Saeijs and F.J. Jorritsma. Patent Application – Signal Processing On Information Files So As To Obtain Characteristic Point Information Sequences. US6871007B1, Mar. 2005.
[140] D. Marpe, T. Wiegand and S. Gordon. “H.264/MPEG4-AVC fidelity-range extensions: tools, profiles, performance, and application areas”. In: IEEE Int. Conf. Image Process. 1 (Sept. 2005), pp. I-593–596.
[141] CEI/IEC-60774-1: Helical-scan video tape cassette system using 12,65 mm (0.5 in) magnetic tape on type VHS, part 1: VHS and compact VHS video cassette system. 1994.
[142] CEI/IEC-60774-5: Helical-scan video tape cassette system using 12,65 mm (0.5 in) magnetic tape on type VHS, part 5: D-VHS. 2004.
[143] P.H.N. de With and A.M.A. Rijckaert. “Design considerations of the video compression system of the new DV standard”. In: IEEE Trans. Consum. Electron. 43.4 (Nov. 1997), pp. 1160–1179.
[144] H. Ting et al. “Trick play schemes for advanced television recording on digital VCR”. In: IEEE Trans. Consum. Electron. 41.4 (Nov. 1996), pp. 1159–1168.
[145] H. Ting et al. “Fast scan technology for digital video tape recorders”. In: IEEE Trans. Consum. Electron. 39.3 (Aug. 1993), pp. 186–191.
[146] IEC 61883-4: Digital Interface For Consumer Audio/Video Equipment - Part 4: MPEG-TS data transmission. 1998.
[147] ISO/IEC 14496-2: Information Technology - Coding of audio-visual objects - Part 2: Visual. 2001.
[148] ISO/IEC 15938-1: Information Technology - Multimedia Content Description Interface - Part 1: Systems. 2001.
[149] ISO/IEC 15938-3: Information Technology - Multimedia Content Description Interface - Part 3: Visual. 2002.
[150] I. Kirenko, R. Muijs and L. Shao. “Coding artifact reduction using non-reference block grid visibility measure”. In: IEEE Int. Conf. Multimedia & Expo. July 2006, pp. 469–472. ISBN: 1-4244-0367-7.
[151] H. Hu and G. de Haan. “Simultaneous coding artifacts reduction and sharpness enhancement”. In: Proc. Int. Conf. Consum. Electron. Jan. 2007, pp. 1–2. ISBN: 1-4244-0762-1.
[152] S. Choy, Y. Chan and W. Siu. “Reduction of Block-Transform Image Coding Artifacts by Using Local Statistics of Transform Coefficients”. In: IEEE Signal Process. Lett. 4.1 (Jan. 1997), pp. 5–7.
[153] Z. Yu, H.R. Wu, S. Winkler and T. Chen. “Vision-model-based impairment metric to evaluate blocking artifacts in digital video”. In: Proc. of IEEE 90.1 (2002), pp. 154–169.


[154] Y. Nie, A. Vetro, H. Sun and K.E. Barner. “Coding artifacts reduction using edge map guided adaptive and fuzzy filtering”. In: June 2004, pp. 1135–1138. ISBN: 0-7803-7965-9.
[155] L. Shao and I. Kirenko. “Coding Artifact Reduction Based on Local Entropy Analysis”. In: IEEE Trans. Consum. Electron. 53.2 (May 2007), pp. 691–696.
[156] Y. Kato, T. Goto, S. Hirano and M. Sakurai. “Compression artifacts reduction for MPEG-2 video utilizing total variation regularization method”. In: Proc. Int. Conf. Consum. Electron. (Jan. 2011), pp. 251–252.
[157] I. Yankilevich. Patent Application – Mosquito Noise Detection and Reduction. US7949051B2, May 2011.
[158] P.W. Chao and H.Y. Ou. Patent Application – Image Process. Method and Device for Performing Mosquito Noise Reduction. US2008085059A1, Apr. 2008.
[159] J. Taylor. DVD Demystified. McGraw-Hill, 2001. ISBN: 0-07-135026-8.
[160] IEC 60774-5: Helical-scan video tape cassette system using 12,65 mm (0,5 in) magnetic tape on type VHS - Part 5: D-VHS. Apr. 2004.
[161] IEC 62107: Super Video Compact Disc - Disc-interchange system specification. International Standard, July 2000.
[162] System Description Blu-ray Disc Read-Only Format, Part 3: Audio Visual Basic Specifications (3-1 Core Specifications), Version 2.5. June 2011.
[163] A.G. MacInnis. System and Method for Personal Video Recording. WO2002043385A2, May 2002.
[164] G. Aggarwal, A. Gopalakrishna Rao, M. Kellerman, D. Erickson, F. Demas, S. Bhatia and G. Hulmani. Patent Application – Performing Personal Video Recording (PVR) Functions on Digital Video Streams. US20030169815A1, Sept. 2003.
[165] ETSI TS 181 016 V2.0.0: TISPAN; Service Layer Requirements to Integrate NGN Services and IPTV. Nov. 2007.
[166] J. Bae, D. Kim and H.I. Kang. “An Efficient Personal Video Recorder System”. In: Proc. Int. Conf. Intell. Comput. Technol. Autom. May 2010, pp. 501–504. ISBN: 978-1-4244-7279-6.
[167] J. Bae, D. Kim and H.I. Kang. “An Efficient Personal Video Recorder System”. In: Proc. Int. Conf. on Consum. Electron. June 2002, pp. 114–115. ISBN: 0-7803-7300.
[168] J.R. Kim, S. Suh and S. Sull. “Fast Scene Change Detection for Personal Video Recorder”. In: Proc. Int. Conf. Consum. Electron. June 2003, pp. 742–749. ISBN: 0-7803-7721-4.


[169] Su-Woon Jung and Dong-Ho Lee. “Modeling and analysis for optimal PVR implementation”. In: IEEE Trans. Consum. Electron. 52.3 (2006), pp. 864–869. ISSN: 0098-3063.
[170] J.R. Kim, S. Suh and S. Sull. “A novel PVR scheme for content protection”. In: Proc. Int. Conf. Consum. Electron. Jan. 2011, pp. 489–490. ISBN: 978-1-4244-8711-0.
[171] H. Lee, J. Son and H. Oh. “A novel PVR scheme for content protection”. In: IEEE Trans. Consum. Electron. 57.1 (Feb. 2011), pp. 173–177. ISSN: 0098-3063.
[172] J.R. Kim, S. Suh and S. Sull. “Fast scene change detection for personal video recorder”. In: IEEE Trans. Consum. Electron. 49.3 (Aug. 2003), pp. 683–688.
[173] J. Bae, D. Kim and H.I. Kang. “An Efficient Personal Video Recorder System”. In: Proc. Int. Conf. Intell. Comput. Technol. Automation. May 2010, pp. 501–504. ISBN: 978-1-4244-7279-6.
[174] H.M. Nam, J.Y. Jeong, K.Y. Byun, J.O. Kim and S.J. Ko. “A Complexity Scalable H.264 Decoder with Downsizing Capability”. In: IEEE Trans. Consum. Electron. 56.2 (May 2010), pp. 1025–1033.
[175] J.R. Leonardi. “Multiple time-scales of language dynamics: An example from psycholinguistics”. In: Taylor & Francis, Ecological Psychology 22.4 (Nov. 2010), pp. 269–285.
[176] P.W. Chao and H.Y. Ou. Patent Application – Image processing method and device for performing mosquito noise reduction. US20080085059A1, Oct. 2006.
[177] Y. Nie, H.S. Kong, A. Vetro and K.E. Barner. “Fast Adaptive Fuzzy Post-Filtering for Coding Artifacts Removal in Interlaced Video”. In: Mar. 2005, pp. 993–996.
[178] J. Singh, D. Singh and M. Uddin. “Detection methods for blocking artefacts in transform coded images”. In: 2014, pp. 435–444.
[179] U.S. Mohammed. “A pixel-domain post-processing technique to reduce the blocking artifacts in transform-coded images”. In: IEEE Int. Symposium Signal Process. Inf. Technol. (ISSPIT). 2010, pp. 266–270.
[180] H. Yao-Min, J. Leou and M. Cheng. “A Post Deblocking Filter for H.264 Video”. In: Proc. 16th Int. Conf. Computer Commun. Networks (ICCCN). 2007, pp. 1137–1142.
[181] K. Minoo and T.Q. Nguyen. “A perceptual metric for blind measurement of blocking artifacts with applications in transform-block-based image and video coding”. In: Proc. 15th IEEE Int. Conf. Image Process. (ICIP). 2008, pp. 3152–3155.


[182] C. Mantel, P. Ladret and T. Kunlin. “A temporal mosquito noise corrector”. In: Int. Workshop on Quality of Multimedia Experience (QoMEx). 2009, pp. 244–249.
[183] Z. Qian, W. Wang and T. Qiao. “An Edge Detection Method in DCT Domain”. In: Procedia Engineering 29.0 (2012). Int. Workshop Inf. Electron. Eng., pp. 344–348. ISSN: 1877-7058.

Acronyms

AVC Advanced Video Coding
AGWN Additive White Gaussian Noise
BD Blu-ray Disc
CPI Characteristic Point Information
CPU Central Processing Unit
CRC Cyclic Redundancy Check
DAB Digital Audio Broadcasting
DCT Discrete Cosine Transform
DSP Digital Signal Processor
DTS Decoding Time Stamp
DTV Digital Television
DVB Digital Video Broadcasting
DVB-C Digital Video Broadcasting Cable
DVB-H Digital Video Broadcasting Handheld
DVB-S Digital Video Broadcasting Satellite
DVB-T Digital Video Broadcasting Terrestrial
DVD Digital Versatile Disc
ES Elementary Stream
ETSI European Telecommunications Standards Institute


FEC Forward Error Correction

GA Grand Alliance

GOP Group Of Pictures

HD High Definition

HDD Hard Disk Drive

HDTV High Definition Television

IEC International Electrotechnical Commission

ISO International Organization for Standardization

ITU-T International Telecommunication Union - Telecommunication Standardization Sector

JPEG Joint Photographic Experts Group

JTC1 Joint Technical Committee for Information Technology

JVT Joint Video Team

LCD Liquid Crystal Display

LPF Low Pass Filter

LUT Look Up Table

MC Motion Compensation

MCTNR Motion Compensated Temporal Noise Reduction

MPE Multi-Protocol Encapsulation

MPEG Moving Picture Experts Group

NAL Network Abstraction Layer

OLED Organic Light-Emitting Diode

OSD On Screen Display

OSI Open Systems Interconnection

PAT Program Association Table

PH Packet Header

PHY Physical Layer


PMT Program Map Table

PCR Program Clock Reference

PCM Pulse Code Modulation
PDP Plasma Display Panel

PES Packetized Elementary Stream

PS Program Stream

PSI Program Specific Information

PTS Presentation Time Stamp

PVR Personal Video Recording
RS Reed-Solomon

QoE Quality of Experience

QoS Quality of Service

SD Standard Definition

SI Service Information
SSD Solid State Disc

SVCD Super Video Compact Disc

TNR Temporal Noise Reduction

TS Transport Stream

UHDTV Ultra High Definition Television

VCD Video Compact Disc
VCEG Video Coding Experts Group

VCL Video Coding Layer


“Je pense donc je suis” – I think, therefore I am (René Descartes)

Acknowledgements

After a long journey, which was initiated more than a decade ago, this thesis has finally come to an end. Although this work carries my name, it would not have been possible without the aid of many people. I hereby thank the members of the promotion committee for their time and effort in reading this thesis and for their valuable feedback. First of all, I would like to thank Prof.dr.ir. P.H.N. de With. Dear Peter, over the past two decades you have influenced my career twice in a substantial manner. Our first encounter was in 1992, resulting in my involvement in digital recording at Philips Research. Our second encounter was in 2002, when you challenged me to conduct an industrial PhD. Although the first conference contribution was quickly obtained, the remaining contributions required substantially more time, due to organizational changes. I deeply appreciate the devotion you showed during the many hours of reviewing, either in the office or at home, during working days or in the weekends and even during holidays. Furthermore, I would like to thank Dr. E. Persoon for granting this journey in 2002. Moreover, I would like to thank S. Borgers, J. Misker and E. Lamberts for enabling all the conference travel involved. The work presented in this thesis has been established during various projects, conducted for different organizations involving Philips Semiconductors, NXP Semiconductors and Trident Microsystems. I express my gratitude to Paul van Niekerk and Ewout Brandsma, who enabled the work on video navigation. Furthermore, I would like to thank Armand Stuivenwold, Arie Koppelaar, Edwin Dilling, Frans van de Pavoort and Marc Klaassen for a great TV-on-Mobile project, resulting in an improved DVB-H link layer. Finally, words of appreciation go to Erwin Bellers, Jeroen Kettenis, Paul Hofman, Bahman Zafarifar and Yuanjia Du, for the work on artifact-location detection. Furthermore, I want to thank Thijs Withaar, Marcel Steenhuizen, Klaas de Waal, Johan van der Graaf and Dr. Stan Baggen for joining the weekly sport activity. Besides the people with whom I used to work in industry, I also acknowledge and say thanks to the people of SPS-VCA: Egor Bondarev, Sveta Zinger,


Gijs Dubbelman, Lykele Hazelhoff, Ivo Creusen, Solmaz Javanbakhti, Kostas Triantafyllidis, Fons van der Sommen and of course Anja de Valk-Roulaux. I also highly appreciate the cooperation of Jan Reinhard and Chrétien Bergmans and the Mechatronic team of Avans University of Applied Sciences Breda, for enabling the final part of this thesis. Finally, I give special words of gratitude to my family Maike, Frederieke, Christoph and my parents for all the patience, especially during the last 5 months.

Curriculum vitae

Onno Eerenberg was born in Zwolle, the Netherlands, in 1966. He graduated in Electrical Engineering (B.Sc. degree) from the Polytechnical College of Amsterdam in 1992 and obtained an M.Sc. degree in engineering product design from the University of Wolverhampton (UK) in 1998. He started his career as a research assistant at Philips Research Laboratories, working on magnetic recording systems. In this period, he was involved in the hardware realization of digital tape-based recording systems and the development of trick play for the Digital Video Home System (D-VHS). In 1998 he joined Philips Communication and Security Systems, where he worked on digitizing CCTV applications on the basis of MPEG-2 compression and ATM technology. In 2000 he joined Philips Semiconductors, where he developed advanced video navigation methods for hard disk recording. In 2003 he returned to Philips Research, where he was involved in the development of an efficient and robust DVB-H link layer. In 2007 he switched to NXP Semiconductors Research, where he developed video algorithms for digital television and set-top box products. This work was continued in 2010, when Trident Microsystems acquired the BU-Home activities from NXP Semiconductors. Mr. Eerenberg holds 20 patent families (clustered patents) and has published 13 papers in the fields of digital recording, mobile television reception and television. Furthermore, he is co-author of two book chapters: one on the DVB-H link layer, published by Springer Verlag in 2008, and a chapter on digital video, published by Intech in 2010. Since 2008 he has served as a Technical Program Committee (TPC) member for the International Conference on Consumer Electronics (ICCE). He is currently a lecturer in Electrical and Mechatronic Engineering at Avans University of Applied Sciences Breda, the Netherlands.
