Dynamic Resource Management of Network-On-Chip Platforms for Multi-Stream Video Processing

Total Page:16

File Type:pdf, Size:1020Kb

Dynamic Resource Management of Network-On-Chip Platforms for Multi-Stream Video Processing DYNAMIC RESOURCE MANAGEMENT OF NETWORK-ON-CHIP PLATFORMS FOR MULTI-STREAM VIDEO PROCESSING Hashan Roshantha Mendis Doctor of Engineering University of York Computer Science March 2017 2 Abstract This thesis considers resource management in the context of parallel multiple video stream de- coding, on multicore/many-core platforms. Such platforms have tens or hundreds of on-chip processing elements which are connected via a Network-on-Chip (NoC). Inefficient task allo- cation configurations can negatively affect the communication cost and resource contention in the platform, leading to predictability and performance issues. Efficient resource management for large-scale complex workloads is considered a challenging research problem; especially when applications such as video streaming and decoding have dynamic and unpredictable workload characteristics. For these type of applications, runtime heuristic-based task mapping techniques are required. As the application and platform size increase, decentralised resource management techniques are more desirable to overcome the reliability and performance bot- tlenecks in centralised management. In this work, several heuristic-based runtime resource management techniques, targeting real-time video decoding workloads are proposed. Firstly, two admission control approaches are proposed; one fully deterministic and highly predictable; the other is heuristic-based, which balances predictability and performance. Secondly, a pair of runtime task mapping schemes are presented, which make use of limited known application properties, communication cost and blocking-aware heuristics. Combined with the proposed deterministic admission con- troller, these techniques can provide strict timing guarantees for hard real-time streams whilst improving resource usage. The third contribution in this thesis is a distributed, bio-inspired, low-overhead, task re-allocation technique, which is used to further improve the timeliness and workload distribution of admitted soft real-time streams. Finally, this thesis explores parallelisation and resource management issues, surrounding soft real-time video streams that have been encoded using complex encoding tools and mod- ern codecs such as High Efficiency Video Coding (HEVC). Properties of real streams and decod- ing trace data are analysed, to statistically model and generate synthetic HEVC video decoding workloads. These workloads are shown to have complex and varying task dependency struc- tures and resource requirements. To address these challenges, two novel runtime task clustering and mapping techniques for Tile-parallel HEVC decoding are proposed. These strategies con- sider the workload communication to computation ratio and stream-specific characteristics to balance predictability improvement and communication energy reduction. Lastly, several task to memory controller port assignment schemes are explored to alleviate performance bottle- necks, resulting from memory traffic contention. 3 4 Contents Abstract 3 Contents 5 List of Tables 9 List of Figures 10 Acknowledgements 12 Declaration 13 1 Introduction 14 1.1 Background and motivation............................... 14 1.1.1 Use-cases for predictable real-time multi-stream video decoding..... 15 1.2 Research problems.................................... 16 1.2.1 The problem of shared-memory based parallel video decoding....... 17 1.2.2 The problem of resource management for unpredictable workloads.... 17 1.2.3 The problem of scalable management..................... 18 1.2.4 The problem of resource sharing........................ 18 1.3 Research hypothesis................................... 19 1.4 Thesis contributions................................... 19 1.5 Thesis outline....................................... 20 2 Literature Survey 22 2.1 Video decoding applications............................... 22 2.1.1 Video stream decoding overview........................ 22 2.1.2 Features of the H.265 (HEVC) coding standard................ 24 2.1.3 Complexities of video decoding......................... 26 2.1.4 Parallel video decoding.............................. 27 2.1.5 Challenges in characterisation of video decoding workload......... 28 2.2 Many-core platforms................................... 31 2.2.1 Network-on-chip interconnects......................... 32 2.2.2 Predictability in NoCs.............................. 34 2.2.3 Many-core memory challenges......................... 35 2.2.4 Communication models............................. 35 2.2.5 Modelling many-core systems.......................... 36 2.3 Resource management in multicore platforms.................... 37 2.3.1 Multiprocessor real-time systems........................ 38 2.3.2 Resource management organisation...................... 44 2.3.3 Dynamic task mapping.............................. 50 5 2.3.4 Runtime task re-allocation............................ 57 2.4 Summary.......................................... 62 3 Eval. platform, metrics and problem statement 64 3.1 System model....................................... 64 3.1.1 Application model................................ 64 3.1.2 Platform model.................................. 68 3.2 Metrics........................................... 70 3.2.1 Predictability metrics............................... 71 3.2.2 Utilisation-based performance metrics.................... 72 3.2.3 Resource management energy-efficiency................... 74 3.2.4 Resource management overhead........................ 74 3.3 Simulation-based evaluation.............................. 75 3.4 Problem statement.................................... 76 4 Admission control strategies to balance predictability and utilisation 78 4.1 Runtime task mapping and priority assignment.................... 79 4.1.1 Runtime task mapping table........................... 79 4.2 Admission control tests.................................. 80 4.2.1 Deterministic admission controller....................... 80 4.2.2 Heuristic-based admission controler...................... 83 4.3 Evaluation......................................... 85 4.3.1 Experimental design............................... 85 4.3.2 Results discussion................................. 86 4.4 Summary and novel contributions........................... 90 5 Dynamic task mapping for hard real-time video streams 91 5.1 System model....................................... 92 5.1.1 Application model refinements......................... 92 5.1.2 Platform model refinements........................... 94 5.1.3 Memory-specific response time analysis refinements............ 96 5.2 Least worst-case remaining slack heuristic....................... 96 5.3 LWCRS-aware runtime mapping............................ 96 5.3.1 LWCRS mapping complexity analysis...................... 99 5.4 Clustered I and P frames mapping........................... 99 5.4.1 IPC mapping complexity analysis........................ 100 5.5 Baseline static task mapping............................... 100 5.6 Evaluation......................................... 103 5.6.1 Experimental design............................... 103 5.6.2 Results discussion................................. 105 5.7 Summary and novel contributions........................... 110 6 Distributed task remapping 112 6.1 System model refinement................................ 113 6.1.1 Omitting the memory transaction model................... 113 6.1.2 Disabled admission control........................... 113 6.1.3 Job-level task mapping/remapping....................... 113 6 6.1.4 Low-overhead remapping procedure...................... 114 6.2 Bio-inspired distributed task remapping........................ 114 6.2.1 Overview of the PS distributed load-balancing algorithm.......... 114 6.2.2 Extensions to the PS algorithm......................... 115 6.2.3 PSRM parameter selection............................ 118 6.2.4 Complexity and overhead analysis of PSRM.................. 119 6.2.5 Challenges of PSRM................................ 120 6.2.6 Application of PSRM............................... 120 6.3 Cluster-based distributed task remapping....................... 121 6.3.1 Improvements to CCPRMV1 ........................... 122 6.3.2 CCPRMV2 parameter selection......................... 122 6.3.3 Complexity and overhead analysis of CCPRMV2 ................ 122 6.4 Evaluation......................................... 123 6.4.1 Experimental design............................... 123 6.4.2 Results discussion................................. 124 6.5 Summary and novel contributions........................... 128 7 Extending the application model to support HEVC encoded video streams 130 7.1 Application model refinements............................. 131 7.2 HEVC stream analysis and synthetic workload generation methodology...... 131 7.3 Video sequences and codec tools under investigation................ 132 7.3.1 Encoder and decoder settings.......................... 133 7.4 Adaptive GoP characteristics............................... 134 7.4.1 Distribution of different frame types...................... 135 7.4.2 Contiguous B-frames and reference frame distances............. 136 7.5 HEVC coding unit characteristics............................ 136 7.5.1 CU sizes and types................................ 137 7.5.2 CU decoding time................................. 137 7.6 Workload communication volume characteristics.................. 139 7.6.1 Reference data................................... 139 7.6.2 Encoded frame sizes............................... 140 7.7 CPU and memory usage
Recommended publications
  • Datasheet-Smartsignage
    Fanless Embedded Box PC with Intel® Celeron® SmartSignage 4-core Processor J1900 EBC-3311 Features Fanless digital signage solution Cableless design Low power consumption 1 x USB 3.0 Line-out 1 x USB 2.0 2 x RJ-45 VGA DC INPUT 12V On board Intel® Celeron® 4-core Processor EBC-3311B High definition video output J1900 (Bay Trail) Supports DDR3L SO-DIMM 1333 max. up to 8GB 1 x High definition video output & 1 VGA or 2 x High definition video output 1 x USB 3.0 Line-out 1 x USB 2.0 2 x RJ-45 DC INPUT 12V Front Side 1 mSATA supported High definition video output Supports VESA mount Rear Side Power Switch Applications Digital Signage Kiosk Engine POS PC IoT Gateway Specifications Image format Jpeg, tiff, png, (anim) gif, bmp System Optional for 802.11 b/g/n CPU Intel® Celeron® 4-core Processor J1900 WiFi (Bay Trail) Expansion Slot 1 x full-size Mini card Chipset SoC integrated (for mSATA) System Memory 1 x DDR3L 1333 max up to 8GB 1 x half-size PCI Express Mini card Storage 1 x mSATA supported for optional WiFi module Watchdog Timer 255 levels, 1-255 sec. Power Supply 12V DC-in I/O 1 x USB 2.0 , 1 x USB 3.0 2 x 10/100/1000Mbps Ethernet Dimensions (W x D x H) 170 x 120 x 40 mm 1 x power on/off button 6.69 x 4.72 x 1.57” 1 x Audio (Line-out) Packing Dimension 320 x 205 x 65 mm Video I/O EBC-3311 EBC-3311B (W x D x H) 12.59 x 8.07 x 2.55” 1 x High 2 x High definition Weight (net/gross) 0.7kg(1.5lb)/1.2kg(2.61b) definition video output video output Environmental 1 x VGA Operation Temperature 0°C – +40°C,(32°F – 104°F) Video MPEG-4, MPEG-2, MPEG-1, H.264,
    [Show full text]
  • VOL. E100-C NO. 6 JUNE 2017 the Usage of This PDF File Must Comply
    VOL. E100-C NO. 6 JUNE 2017 The usage of this PDF file must comply with the IEICE Provisions on Copyright. The author(s) can distribute this PDF file for research and educational (nonprofit) purposes only. Distribution by anyone other than the author(s) is prohibited. IEICE TRANS. ELECTRON., VOL.E100–C, NO.6 JUNE 2017 643 PAPER A High-Throughput and Compact Hardware Implementation for the Reconstruction Loop in HEVC Intra Encoding Yibo FAN†a), Member, Leilei HUANG†, Zheng XIE†, and Xiaoyang ZENG†, Nonmembers SUMMARY In the newly finalized video coding standard, namely high 4×4, 8×8, 16×16, 32×32 and 64×64 with 35 possible pre- efficiency video coding (HEVC), new notations like coding unit (CU), pre- diction modes in intra prediction. Although several fast diction unit (PU) and transformation unit (TU) are introduced to improve mode decision designs have been proposed, still a consid- the coding performance. As a result, the reconstruction loop in intra en- coding is heavily burdened to choose the best partitions or modes for them. erable amount of candidate PU modes, PU partitions or TU In order to solve the bottleneck problems in cycle and hardware cost, this partitions are needed to be traversed by the reconstruction paper proposed a high-throughput and compact implementation for such a loop. reconstruction loop. By “high-throughput”, it refers to that it has a fixed It can be inferred that the reconstruction loop in intra throughput of 32 pixel/cycle independent of the TU/PU size (except for 4×4 TUs). By “compact”, it refers to that it fully explores the reusability prediction has become a bottleneck in cycle and hardware between discrete cosine transform (DCT) and inverse discrete cosine trans- cost.
    [Show full text]
  • Chapter 3 Image Data Representations
    Chapter 3 Image Data Representations 1 IT342 Fundamentals of Multimedia, Chapter 3 Outline • Image data types: • Binary image (1-bit image) • Gray-level image (8-bit image) • Color image (8-bit and 24-bit images) • Image file formats: • GIF, JPEG, PNG, TIFF, EXIF 2 IT342 Fundamentals of Multimedia, Chapter 3 Graphics and Image Data Types • The number of file formats used in multimedia continues to proliferate. For example, Table 3.1 shows a list of some file formats used in the popular product Macromedia Director. Table 3.1: Macromedia Director File Formats File Import File Export Native Image Palette Sound Video Anim. Image Video .BMP, .DIB, .PAL .AIFF .AVI .DIR .BMP .AVI .DIR .GIF, .JPG, .ACT .AU .MOV .FLA .MOV .DXR .PICT, .PNG, .MP3 .FLC .EXE .PNT, .PSD, .WAV .FLI .TGA, .TIFF, .GIF .WMF .PPT 3 IT342 Fundamentals of Multimedia, Chapter 3 1-bit Image - binary image • Each pixel is stored as a single bit (0 or 1), so also referred to as binary image. • Such an image is also called a 1-bit monochrome image since it contains no color. It is satisfactory for pictures containing only simple graphics and text. • A 640X480 monochrome image requires 38.4 kilobytes of storage (= 640x480 / (8x1000)). 4 IT342 Fundamentals of Multimedia, Chapter 3 8-bit Gray-level Images • Each pixel has a gray-value between 0 and 255. Each pixel is represented by a single byte; e.g., a dark pixel might have a value of 10, and a bright one might be 230. • Bitmap: The two-dimensional array of pixel values that represents the graphics/image data.
    [Show full text]
  • Parameter Optimization in H.265 Rate-Distortion by Single Frame
    https://doi.org/10.2352/ISSN.2470-1173.2019.11.IPAS-262 © 2019, Society for Imaging Science and Technology Parameter optimization in H.265 Rate-Distortion by single- frame semantic scene analysis Ahmed M. Hamza; University of Portsmouth Abdelrahman Abdelazim; Blackpool and the Fylde College Djamel Ait-Boudaoud; University of Portsmouth Abstract with Q being the quantization value for the source. The relation is The H.265/HEVC (High Efficiency Video Coding) codec and derived based on several assumptions about the source probability its 3D extensions have crucial rate-distortion mechanisms that distribution within the quantization intervals, and the nature of help determine coding efficiency. We have introduced in this the rate-distortion relations themselves (constantly differentiable work a new system of Lagrangian parameterization in RDO cost throughout, etc.). The value initially used for c in the literature functions, based on semantic cues in an image, starting with the was 0.85. This was modified and made adaptive in subsequent current HEVC formulation of the Lagrangian hyper-parameter standards including HEVC/H.265. heuristics. Two semantic scenery flag algorithms are presented Our work here investigates further adaptations to the rate- and tested within the Lagrangian formulation as weighted factors. distortion Lagrangian by semantic algorithms, made possible by The investigation of whether the semantic gap between the coder recent frameworks in computer vision. and the image content is holding back the block-coding mecha- nisms as a whole from achieving greater efficiency has yielded a Adaptive Lambda Hyper-parameters in HEVC positive answer. Early versions of the rate-distortion parameters in H.263 were replaced with more sophisticated models in subsequent stan- Introduction dards.
    [Show full text]
  • Design and Implementation of a Fast HEVC Random Access Video Encoder
    Design and Implementation of a Fast HEVC Random Access Video Encoder ALFREDO SCACCIALEPRE Master's Degree Project Stockholm, Sweden March 2014 XR-EE-KT 2014:003 Contents 1 Introduction 11 1.1 Background . 11 1.2 Thesis work . 12 1.2.1 Factors to consider . 12 1.3 The problem . 12 1.3.1 C65 . 12 1.4 Methods and thesis outline . 13 1.4.1 Methods . 13 1.4.2 Objective measurement . 14 1.4.3 Subjective measurement . 14 1.4.4 Test sequences . 14 1.4.5 Thesis outline . 16 1.4.6 Abbreviations . 16 2 General concepts 19 2.1 Color spaces . 19 2.2 Frames, slices and tiles . 19 2.2.1 Frames . 19 2.2.2 Slices and Tiles . 19 2.3 Predictions . 20 2.3.1 Intra . 20 2.3.2 Inter . 20 2.4 Merge mode . 20 2.4.1 Skip mode . 20 2.5 AMVP mode . 20 2.5.1 I, P and B frames . 21 2.6 CTU, CU, CTB, CB, PB, and TB . 21 2.7 Transforms . 23 2.8 Quantization . 24 2.9 Coding . 24 2.10 Reference picture lists . 24 2.11 Gop structure . 24 2.12 Temporal scalability . 25 2.13 Hierarchical B pictures . 25 2.14 Decoded picture buffer (DPB) . 25 2.15 Low delay and random access configurations . 26 2.16 H.264 and its encoders . 26 2.16.1 H.264 . 26 1 2 CONTENTS 3 Preliminary tests 27 3.1 Speed - quality considerations . 27 3.1.1 Interactive applications . 27 3.1.2 Entertainment applications .
    [Show full text]
  • Fast Coding Unit Encoding Mechanism for Low Complexity Video Coding
    RESEARCH ARTICLE Fast Coding Unit Encoding Mechanism for Low Complexity Video Coding Yuan Gao1,2,3☯, Pengyu Liu1,2,3☯, Yueying Wu1,2,3, Kebin Jia1,2,3*, Guandong Gao1,2,3 1 Beijing Advanced Innovation Center for Future Internet Technology, Beijing University of Technology, Beijing, China, 2 Beijing Laboratory of Advanced Information Networks, Beijing, China, 3 College of Electronic Information and Control Engineering, Beijing University of Technology, Beijing, China ☯ These authors contributed equally to this work. * [email protected] Abstract In high efficiency video coding (HEVC), coding tree contributes to excellent compression performance. However, coding tree brings extremely high computational complexity. Inno- vative works for improving coding tree to further reduce encoding time are stated in this OPEN ACCESS paper. A novel low complexity coding tree mechanism is proposed for HEVC fast coding Citation: Gao Y, Liu P, Wu Y, Jia K, Gao G (2016) unit (CU) encoding. Firstly, this paper makes an in-depth study of the relationship among Fast Coding Unit Encoding Mechanism for Low CU distribution, quantization parameter (QP) and content change (CC). Secondly, a CU Complexity Video Coding. PLoS ONE 11(3): coding tree probability model is proposed for modeling and predicting CU distribution. Even- e0151689. doi:10.1371/journal.pone.0151689 tually, a CU coding tree probability update is proposed, aiming to address probabilistic Editor: You Yang, Huazhong University of Science model distortion problems caused by CC. Experimental results show that the proposed low and Technology, CHINA complexity CU coding tree mechanism significantly reduces encoding time by 27% for lossy Received: September 23, 2015 coding and 42% for visually lossless coding and lossless coding.
    [Show full text]
  • R-Photo User's Manual
    User's Manual © R-Tools Technology Inc 2020. All rights reserved. www.r-tt.com © R-tools Technology Inc 2020. All rights reserved. No part of this User's Manual may be copied, altered, or transferred to, any other media without written, explicit consent from R-tools Technology Inc.. All brand or product names appearing herein are trademarks or registered trademarks of their respective holders. R-tools Technology Inc. has developed this User's Manual to the best of its knowledge, but does not guarantee that the program will fulfill all the desires of the user. No warranty is made in regard to specifications or features. R-tools Technology Inc. retains the right to make alterations to the content of this Manual without the obligation to inform third parties. Contents I Table of Contents I Start 1 II Quick Start Guide in 3 Steps 1 1 Step 1. Di.s..k.. .S..e..l.e..c..t.i.o..n.. .............................................................................................................. 1 2 Step 2. Fi.l.e..s.. .M..a..r..k.i.n..g.. ................................................................................................................ 4 3 Step 3. Re..c..o..v..e..r.y.. ...................................................................................................................... 6 III Features 9 1 File Sorti.n..g.. .............................................................................................................................. 9 2 File Sea.r.c..h.. ............................................................................................................................
    [Show full text]
  • The Open-Source Turing Codec: Towards Fast, Flexible and Parallel Hevc Encoding
    THE OPEN-SOURCE TURING CODEC: TOWARDS FAST, FLEXIBLE AND PARALLEL HEVC ENCODING S. G. Blasi1, M. Naccari1, R. Weerakkody1, J. Funnell2 and M. Mrak1 1BBC, Research and Development Department, UK 2Parabola Research, UK ABSTRACT The Turing codec is an open-source software codec compliant with the HEVC standard and specifically designed for speed, flexibility, parallelisation and high coding efficiency. The Turing codec was designed starting from a completely novel backbone to comply with the Main and Main10 profiles of HEVC, and has many desirable features for practical codecs such as very low memory consumption, advanced parallelisation schemes and fast encoding algorithms. This paper presents a technical description of the Turing codec as well as a comparison of its performance with other similar encoders. The codec is capable of cutting the encoding complexity by an average 87% with respect to the HEVC reference implementation for an average coding penalty of 11% higher rates in compression efficiency at the same peak-signal-noise-ratio level. INTRODUCTION The ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG) combined their expertise to form the Joint Collaborative Team on Video Coding (JCT-VC), and finalised the first version of the H.265/High Efficiency Video Coding (HEVC) standard (1) in January 2013. HEVC provides the same perceived video quality as its predecessor H.264/Advanced Video Coding (AVC) at considerably lower bitrates (the MPEG final verification tests report average bitrate reductions of up to 60% (2) when coding Ultra High Definition, UHD, content). HEVC specifies larger block sizes for prediction and transform, an additional in-loop filter, and new coding modes for intra- and inter-prediction.
    [Show full text]
  • Design and Analysis of Video Compression Technique Using Hevc Intra-Frame Coding K
    ISSN: 2277-9655 [Reddy* et al., 6(6): June, 2017] Impact Factor: 4.116 IC™ Value: 3.00 CODEN: IJESS7 IJESRT INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY DESIGN AND ANALYSIS OF VIDEO COMPRESSION TECHNIQUE USING HEVC INTRA-FRAME CODING K. Sripal Reddy*1, Boppidi Srikanth2 & C. Loknath Reddy3 *1,2&3Vardhaman College of Engineering DOI: 10.5281/zenodo.817839 ABSTRACT High Efficiency Video Coding (HEVC) is currently being prepared as the newest video coding standard of the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group. The main goal of the HEVC standardization effort is to enable significantly improved compression performance relative to existing standards in the range of 50% bit-rate reduction for equal perceptual video quality. Intra-frame coding is essential in both still image and video coding. In the block-based coding scheme, the spatial redundancy can be removed by utilizing the correlation between the current pixel and its neighboring reconstructed pixels from the differential pulse code modulation (DPCM) in the early video coding standard H.261 to the angular intra prediction in the latest H.265/HEVC [1], different intra prediction schemes are employed. Almost without exception, linear filters are used in these prediction schemes. This paper provides an overview of the Intra-frame coding techniques of the HEVC standard. KEYWORDS: High Efficiency Video Coding (HEVC), Intra- frame coding, angular intra prediction. I. INTRODUCTION Along with the development of multimedia and hardware technologies, the demand for high-resolution video services with better quality has been increasing. These days, the demand for ultrahigh definition (UHD) video services is emerging, and its resolution is higher than that of full high definition (FHD), by a factor of 4 or more.
    [Show full text]
  • Digital Recording of Performing Arts: Formats and Conversion
    detailed approach also when the transfer of rights forms part of an employment contract between the producer of Digital recording of performing the recording and the live crew. arts: formats and conversion • Since the area of activity most probably qualifies as Stijn Notebaert, Jan De Cock, Sam Coppens, Erik Mannens, part of the ‘cultural sector’, separate remuneration for Rik Van de Walle (IBBT-MMLab-UGent) each method of exploitation should be stipulated in the Marc Jacobs, Joeri Barbarien, Peter Schelkens (IBBT-ETRO-VUB) contract. If no separate remuneration system has been set up, right holders might at any time invoke the legal default mechanism. This default mechanism grants a proportionate part of the gross revenue linked to a specific method of exploitation to the right holders. The producer In today’s digital era, the cultural sector is confronted with a may also be obliged to provide an annual overview of the growing demand for making digital recordings – audio, video and gross revenue per way of exploitation. This clause is crucial still images – of stage performances available over a multitude of in order to avoid unforeseen financial and administrative channels, including digital television and the internet. Essentially, burdens in a later phase. this can be accomplished in two different ways. A single entity can act as a content aggregator, collecting digital recordings from • Determine geographical scope and, if necessary, the several cultural partners and making this content available to duration of the transfer for each way of exploitation. content distributors or each individual partner can distribute its own • Include future methods of exploitation in the contract.
    [Show full text]
  • Designing and Developing a Model for Converting Image Formats Using Java API for Comparative Study of Different Image Formats
    International Journal of Scientific and Research Publications, Volume 4, Issue 7, July 2014 1 ISSN 2250-3153 Designing and developing a model for converting image formats using Java API for comparative study of different image formats Apurv Kantilal Pandya*, Dr. CK Kumbharana** * Research Scholar, Department of Computer Science, Saurashtra University, Rajkot. Gujarat, INDIA. Email: [email protected] ** Head, Department of Computer Science, Saurashtra University, Rajkot. Gujarat, INDIA. Email: [email protected] Abstract- Image is one of the most important techniques to Different requirement of compression in different area of image represent data very efficiently and effectively utilized since has produced various compression algorithms or image file ancient times. But to represent data in image format has number formats with time. These formats includes [2] ANI, ANIM, of problems. One of the major issues among all these problems is APNG, ART, BMP, BSAVE, CAL, CIN, CPC, CPT, DPX, size of image. The size of image varies from equipment to ECW, EXR, FITS, FLIC, FPX, GIF, HDRi, HEVC, ICER, equipment i.e. change in the camera and lens puts tremendous ICNS, ICO, ICS, ILBM, JBIG, JBIG2, JNG, JPEG, JPEG 2000, effect on the size of image. High speed growth in network and JPEG-LS, JPEG XR, MNG, MIFF, PAM, PCX, PGF, PICtor, communication technology has boosted the usage of image PNG, PSD, PSP, QTVR, RAS, BE, JPEG-HDR, Logluv TIFF, drastically and transfer of high quality image from one point to SGI, TGA, TIFF, WBMP, WebP, XBM, XCF, XPM, XWD. another point is the requirement of the time, hence image Above mentioned formats can be used to store different kind of compression has remained the consistent need of the domain.
    [Show full text]
  • Parallel Deblocking Filter Based on Modified Order of Accessing the Coding Tree Units for HEVC on Multicore Processor
    KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS VOL. 11, NO. 3, Mar. 2017 1684 Copyright ⓒ2017 KSII Parallel Deblocking Filter Based on Modified Order of Accessing the Coding Tree Units for HEVC on Multicore Processor Haiwei Lei1, Wenyi Liu1 and Anhong Wang2 1 Key Laboratory of Instrumentation Science & Dynamic Measurement, Ministry of Education, North University of China Taiyuan, 030051- China [e-mail: [email protected]; [email protected]] 2 School of Electronic Information Engineering, Taiyuan University of Science and Technology Taiyuan, 030024 - China [e-mail: [email protected]] *Corresponding author: Wenyi Liu; Anhong Wang Received April 11, 2016; revised October 18, 2016; revised December 12, 2016; accepted January 15, 2017; published March 31, 2017 Abstract The deblocking filter (DF) reduces blocking artifacts in encoded video sequences, and thereby significantly improves the subjective and objective quality of videos. Statistics show that the DF accounts for 5–18% of the total decoding time in high-efficiency video coding. Therefore, speeding up the DF will improve codec performance, especially for the decoder. In view of the rapid development of multicore technology, we propose a parallel DF scheme based on a modified order of accessing the coding tree units (CTUs) by analyzing the data dependencies between adjacent CTUs. This enables the DF to run in parallel, providing accelerated performance and more flexibility in the degree of parallelism, as well as finer parallel granularity. We additionally solve the problems of variable privatization and thread synchronization in the parallelization of the DF. Finally, the DF module is parallelized based on the HM16.1 reference software using OpenMP technology.
    [Show full text]