
International Journal of Research ISSN NO:2236-6124

COMPARATIVE STUDY OF MPEG-4 AND H.264 COMPRESSION STANDARDS

#1 GOPAL KASAT, P.G. Scholar
#2 SUHAS JADHAV, Professor
#3 ZAMEER FAROOQUI, Professor
Department of Electronics and Telecommunication Engineering
Aditya Engineering College, Beed, Maharashtra.

ABSTRACT—We propose an improved saliency guided compression scheme for low-bitrate image/video coding applications. Important regions (faces in security camera feeds, vehicles in traffic surveillance) are degraded significantly at low bitrates by existing compression standards such as JPEG, JPEG-2000 and MPEG-4, since these do not explicitly use any knowledge of which regions are salient. We design a compression algorithm which, given an image/video and a saliency value for each pixel, computes a corresponding saliency value in the wavelet domain. Our algorithm ensures that wavelet coefficients representing salient regions have a high saliency value. The coefficients are transmitted in decreasing order of their saliency. This allows important regions in the image/video to remain well preserved even at very low bitrates. Further, our compression scheme can handle several salient regions with different relative importance. We compare the performance of our method with the JPEG/JPEG-2000 image standards and the MPEG-4 video standard through two experiments: face detection and vehicle tracking. We show improved detection rates and quality of reconstructed images/videos using our Saliency Based Compression (SBC) algorithm.

I.INTRODUCTION

Recently, the exponential growth in networking technologies and the widespread use of video-based multimedia information over the internet for mass communication through social networking, e-commerce, education, etc. have promoted the development of video coding to a great extent. Various video coding schemes have already been designed for seamless transmission of data and for mass storage of digital information. The primary goal of video coding is to achieve higher compression performance while maintaining high visual quality. The human eye is a space-variant, non-uniform resolution sampling system. Hence, foveation based video coding yields higher compression performance by varying the visual quality of video data across space to match the non-uniform spatial sampling of the human eye.

In the present doctoral research work, efforts are made to develop fast and efficient foveated video compression schemes that achieve higher compression performance as well as higher visual quality at a lower computational complexity.

1.1 Digital Video

Digital video is three-dimensional data describing a dynamic visual scene, sampled spatially and temporally. A visual scene sampled at a given time instant is known as a frame.
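The idea of video as spatially and temporally sampled data can be made concrete with a small sketch. The toy dimensions, the `make_video` helper and the synthetic brightness formula are assumptions made purely for illustration; they are not taken from this study.

```python
# Hypothetical toy clip: 3 frames of 2 rows x 4 columns, 8-bit luminance values.
WIDTH, HEIGHT, NUM_FRAMES = 4, 2, 3

def make_video(num_frames, height, width):
    """Build video[t][y][x]; each sample is a synthetic brightness in 0..255."""
    return [
        [[(10 * t + 3 * y + x) % 256 for x in range(width)] for y in range(height)]
        for t in range(num_frames)
    ]

video = make_video(NUM_FRAMES, HEIGHT, WIDTH)

# A frame is the spatial slice obtained by fixing the temporal index t.
frame_1 = video[1]

# Total number of samples = temporal samples x rows x columns.
total_samples = sum(len(row) for frame in video for row in frame)
```

Indexing with `video[t][y][x]` keeps the temporal axis outermost, so a single frame is simply `video[t]`.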

Volume 7, Issue XI, November/2018 Page No:1275 International Journal of Research ISSN NO:2236-6124

Figure 1.1: Illustration of spatio-temporal sampling of a video scene

1.2 Fundamentals of Video Compression

1.2.1 Background

In the modern world, the demand for video data has increased manifold due to massive internet applications such as social networking, e-governance, and security and surveillance. Hence, network bandwidth has become a major bottleneck for efficient transmission of this vast amount of video data in real time, even though present technology offers quite large bandwidths. Most probably, this problem will persist, since the demand for video transmission applications will keep growing. Therefore, a well-designed and efficient video compression system is always required to reduce the transmission bit-rate of video content without degrading the visual quality significantly. In a heterogeneous network, where medium to low data rates are supported, transmission of video data is an even more challenging task.

The data rates available within a network vary across channels according to the characteristics of the network, i.e. the type of transmission channel and receiving data terminal as well as network traffic congestion. Consequently, video data must be transmitted at a variety of bit-rates for efficient transmission. Efficient and adaptive video compression schemes are required to solve these issues [1].

Figure 1.4: Typical video coding system

A typical video coding system is shown in Figure 1.4. Video data generated at the source is encoded at a low bit-rate by a video encoder. The compressed video data is either sent to storage devices or transmitted through a communication channel. At the receiving end, the compressed video data is decoded by a video decoder and the reconstructed video frames are displayed to users.

II.LITERATURE REVIEW

A space-variant, non-uniform resolution image can be generated by various foveation filtering schemes. The encoding of oblique featured video data is a challenging task. Different directional transform schemes are available in the literature, which efficiently encode such oblique featured video data. Motion estimation is one of the most important tools of a hybrid video compression scheme. Various motion estimation schemes are present in the literature to find the best matched block in a reference frame and enhance the compression efficiency with minimum computation cost. In this chapter, some well-known, efficient, standard and benchmark schemes related to the different tools of efficient foveated video compression are studied. The proposed schemes, developed and designed in this doctoral research work, are compared against these in subsequent chapters. Therefore, attempts are made here at a detailed and critical analysis of these schemes. The literature review is categorized into three domains of the proposed foveated video compression schemes, as shown in Figure 2.1. The detailed discussion of each category is given below.

Figure 2.1: Categorisation of literature review

2.1 Foveated Video Compression

Recently, foveated video compression (FVC) schemes have gained major interest among researchers in the field of video coding. Since FVC schemes exploit the non-uniform resolution of the retina by allocating more bits to visual fixation points and reducing resolution drastically away from the fixation points, they deliver perceptually high quality at greatly reduced bandwidths. There are several efficient foveated schemes available in the literature, for example, foveation filtering (local bandwidth reduction) [4], saliency detection based foveating [6] and wavelet based foveated compression [3].

In 1993, Silsbee et al. introduced image coding based on the properties of the human visual system (HVS) [4]. The video is encoded by dividing the frame into a number of spatio-temporal patterns which are based on spatio-temporal properties of the HVS. The adaptation of foveated processing to various video coding standards is demonstrated in [2].

Broadly, foveation methods can be classified into three categories:

1. Geometry based foveation (GBF),
2. Filtering based foveation (FBF) and
3. Multi-resolution based foveation (MBF).

In GBF schemes, uniformly sampled image coordinates are transformed into spatially variant coordinates by the log map transform, also known as the foveation coordinate transform, which exploits the retinal sampling geometry [5].

Wallace et al. [5] and Kortum and Geisler [4] have shown a geometric transformation of a uniformly sampled image to a non-uniform, space-variant sampled image using super pixels. The super pixels are generated to match the retinal sampling distribution by grouping and averaging the uniform pixels. Lee and Bovik have shown that foveation is a coordinate transformation from Cartesian coordinates to curvilinear coordinates, and that the local bandwidth of a foveated image is uniformly distributed over the curvilinear coordinates [5].

2.2 Directional Transforms

Recent developments in video acquisition and display systems and the exponential growth in transmission bandwidths have increased the demand for superior quality video content in multimedia applications, with resolutions ranging from 176 × 144 pixels (QCIF) to 3840 × 2160 pixels (UHD). With the widespread adoption of emerging applications like video streaming, video surveillance, Blu-ray disc video, etc., video compression has become an integral component of such multimedia applications.

However, video data in an uncompressed format demands a huge amount of storage space and transmission bandwidth. To overcome these physical constraints, an efficient video compression scheme is always required. Various video coding methods have been developed in the literature to accomplish video compression, such as entropy coding [8], predictive coding [9], block transform coding [6] and wavelet/sub-band coding [1]. Block transform coding is highly exploited in image and video coding because it reduces the inherent spatial redundancies between neighbouring pixels.

III.VIDEO COMPRESSION

3.1 INTRODUCTION

Network bitrates continue to increase (dramatically in the local area and somewhat less so in the wider area), high bitrate connections to the home are commonplace and the storage capacity of hard disks, flash memories and optical media is greater than ever before. With the price per transmitted or stored bit continually falling, it is perhaps not immediately obvious why video compression is necessary (and why there is such a significant effort to make it better). Video compression has two important benefits. First, it makes it possible to use digital video in transmission and storage environments that would not support uncompressed ('raw') video. For example, current Internet throughput rates are insufficient to handle uncompressed video in real time (even at low frame rates and/or small frame size). A Digital Versatile Disk (DVD) can only store a few seconds of raw video at television-quality resolution and frame rate, and so DVD-Video storage would not be practical without video and audio compression. Second, video compression enables more efficient use of transmission and storage resources. If a high bitrate transmission channel is available, then it is a more attractive proposition to send high-resolution compressed video or multiple compressed video channels than to send a single, low-resolution, uncompressed stream. Even with constant advances in storage and transmission capacity, compression is likely to be an essential component of multimedia services for many years to come.

Figure 3.1 Video frame (showing examples of homogeneous regions)

Figure 3.2 Video frame (low-pass filtered background)
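To make the case for compression concrete, the raw bitrate of an uncompressed sequence can be estimated from its frame size, frame rate and sampling format. The sketch below is illustrative only: the function name, the 15 frames/s rate chosen for QCIF and the chroma factors are assumptions made for the example, not values taken from this study.

```python
def raw_bitrate_bps(width, height, fps, bits_per_sample=8, chroma_factor=1.5):
    """Uncompressed bitrate in bits per second.

    chroma_factor is the number of samples per pixel once chroma is included:
    1.5 for 4:2:0 sampling, 2.0 for 4:2:2, 3.0 for 4:4:4.
    """
    samples_per_frame = width * height * chroma_factor
    return samples_per_frame * bits_per_sample * fps

# QCIF (176 x 144) at an assumed 15 frames/s with 4:2:0 sampling.
qcif_bps = raw_bitrate_bps(176, 144, 15)

# Standard-definition frame (720 x 576) at 25 frames/s with 4:2:2 sampling.
sd_bps = raw_bitrate_bps(720, 576, 25, chroma_factor=2.0)
```

Under these assumptions the QCIF sequence needs about 4.6 Mbit/s and the standard-definition sequence about 166 Mbit/s for the active picture alone, which illustrates why raw video strains both channels and storage media.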


Figure 3.3 Video frame 2

By removing different types of redundancy (spatial, frequency and/or temporal) it is possible to compress the data significantly at the expense of a certain amount of information loss (distortion). Further compression can be achieved by encoding the processed data using an entropy coding scheme such as Huffman coding or arithmetic coding. Image and video compression has been a very active field of research and development for over 20 years and many different systems and algorithms for compression and decompression have been proposed and developed. In order to encourage interworking, competition and increased choice, it has been necessary to define standard methods of compression encoding and decoding to allow products from different manufacturers to communicate effectively. This has led to the development of a number of key International Standards for image and video compression, including the JPEG, MPEG and H.26× series of standards.

4. VIDEO COMPRESSION STANDARDS

4.1 INTRODUCTION

Compression is the process of compacting data into a smaller number of bits. Video compression (video coding) is the process of compacting or condensing a digital video sequence into a smaller number of bits. 'Raw' or uncompressed digital video typically requires a large bitrate (approximately 216 Mbits for 1 second of uncompressed TV-quality video) and compression is necessary for practical storage and transmission of digital video. Compression involves a complementary pair of systems, a compressor (encoder) and a decompressor (decoder).

The encoder converts the source data into a compressed form (occupying a reduced number of bits) prior to transmission or storage and the decoder converts the compressed form back into a representation of the original video data. The encoder/decoder pair is often described as a CODEC (enCOder/DECoder) (Figure 4.1).

Figure 4.1 Encoder/decoder

Compression is achieved by removing redundancy, i.e. components that are not necessary for faithful reproduction of the data. Many types of data contain statistical redundancy and can be effectively compressed using lossless compression, so that the reconstructed data at the output of the decoder is a perfect copy of the original data. Unfortunately, lossless compression of image and video information gives only a moderate amount of compression.
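The encoder/decoder pairing and the meaning of lossless reconstruction can be sketched with a general-purpose lossless coder. This is an illustration only: it uses Python's standard `zlib` module (DEFLATE), which is not part of MPEG-4 or H.264, and the synthetic byte pattern is a hypothetical stand-in for image data.

```python
import zlib

# Hypothetical "frame": a flat background with a short bright stripe.
# Long runs of identical bytes are a simple form of statistical redundancy.
raw = bytes([16] * 900 + [235] * 100 + [16] * 900)

encoded = zlib.compress(raw, 9)      # encoder: fewer bits for the same data
decoded = zlib.decompress(encoded)   # decoder: reconstructs the source

is_lossless = decoded == raw
compression_ratio = len(raw) / len(encoded)
```

The synthetic data here is far more redundant than a natural image, so the ratio it achieves says nothing about real video; the point is only that the decoded output is bit-for-bit identical to the input.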


Figure 4.2 Spatial and temporal correlation in a video sequence

The best that can be achieved with current lossless standards such as JPEG-LS [1] is a compression ratio of around 3–4 times. Lossy compression is necessary to achieve higher compression. In a lossy compression system, the decompressed data is not identical to the source data and much higher compression ratios can be achieved at the expense of a loss of visual quality. Lossy video compression systems are based on the principle of removing subjective redundancy, elements of the image or video sequence that can be removed without significantly affecting the viewer's perception of visual quality.

4.2 VIDEO CODEC

A video CODEC (Figure 4.4) encodes a source image or video sequence into a compressed form and decodes this to produce a copy or approximation of the source sequence. If the decoded video sequence is identical to the original, then the coding process is lossless; if the decoded sequence differs from the original, the process is lossy.

The CODEC represents the original video sequence by a model (an efficient coded representation that can be used to reconstruct an approximation of the video data). Ideally, the model should represent the sequence using as few bits as possible and with as high a fidelity as possible. These two goals (compression efficiency and high quality) are usually conflicting, because a lower compressed bit rate typically produces reduced image quality at the decoder.

Figure 4.3 Video encoder block diagram

5.MPEG-4 AND H.264

5.1 INTRODUCTION

MPEG-4 Visual and H.264 (also known as Advanced Video Coding) are standards for the coded representation of visual information. Each standard is a document that primarily defines two things: a coded representation (or syntax) that describes visual data in a compressed form, and a method of decoding the syntax to reconstruct visual information. Each standard aims to ensure that compliant encoders and decoders can successfully interwork with each other, whilst allowing manufacturers the freedom to develop competitive and innovative products. The standards specifically do not define an encoder; rather, they define the output that an encoder should produce. A decoding method is defined in each standard but manufacturers are free to develop alternative decoders as long as they achieve the same result as the method in the standard.

MPEG-4 Visual (Part 2 of the MPEG-4 group of standards) was developed by the Moving Picture Experts Group (MPEG), a working group of the International Organisation for Standardisation (ISO). This group of several hundred technical experts (drawn from industry and research organisations) meets at 2–4 month intervals to develop the MPEG series of standards. MPEG-4 (a multi-part standard covering audio coding, systems issues and related aspects of audio/visual communication) was first conceived in 1993 and Part 2 was standardised in 1999.

The H.264 standardisation effort was initiated by the Video Coding Experts Group (VCEG), a working group of the International Telecommunication Union (ITU-T) that operates in a similar way to MPEG and has been responsible for a series of visual telecommunication standards. The final stages of developing the H.264 standard were carried out by the Joint Video Team, a collaborative effort of both VCEG and MPEG, making it possible to publish the final standard under the joint auspices of ISO/IEC (as MPEG-4 Part 10) and ITU-T (as Recommendation H.264) in 2003.

MPEG-4 Visual and H.264 have related but significantly different visions. Both are concerned with compression of visual data, but MPEG-4 Visual emphasises flexibility whilst H.264's emphasis is on efficiency and reliability. MPEG-4 Visual provides a highly flexible toolkit of coding techniques and resources, making it possible to deal with a wide range of types of visual data including rectangular frames ('traditional' video material), video objects (arbitrary-shaped regions of a visual scene), still images and hybrids of natural (real-world) and synthetic (computer-generated) visual information.

MPEG-4 Visual provides its functionality through a set of coding tools, organised into 'profiles', recommended groupings of tools suitable for certain applications. Classes of profiles include 'simple' profiles (coding of rectangular video frames), object-based profiles (coding of arbitrary-shaped visual objects), still texture profiles (coding of still images or 'texture'), scalable profiles (coding at multiple resolutions or quality levels) and studio profiles (coding for high-quality studio applications). In contrast with the highly flexible approach of MPEG-4 Visual, H.264 concentrates specifically on efficient compression of video frames. Key features of the standard include compression efficiency (providing significantly better compression than any previous standard), transmission efficiency (with a number of built-in features to support reliable, robust transmission over a range of channels and networks) and a focus on popular applications of video compression.

Only three profiles are currently supported (in contrast to nearly 20 in MPEG-4 Visual), each targeted at a class of popular video communication applications. The Baseline profile may be particularly useful for "conversational" applications such as videoconferencing, the Extended profile adds extra tools that are likely to be useful for video streaming across networks and the Main profile includes tools that may be suitable for consumer applications such as video broadcast and storage.

The MPEG-4 and H.264 Standards

An understanding of the process of creating the standards can be helpful when interpreting or implementing the documents themselves. In this section we examine the role of the ISO MPEG and ITU VCEG groups in developing the standards. We discuss the mechanisms by which the features and parameters of the standards are chosen and the driving forces (technical and commercial) behind these mechanisms. We explain how to 'decode' the standards and extract useful information from them, and give an overview of the two standards covered here, MPEG-4 Visual (Part 2) [1] and H.264/MPEG-4 Part 10 [2].


We concentrate on the target applications, the 'shape' of each standard and the method of specifying the standard. We briefly compare the two standards with related International Standards such as MPEG-2, H.263 and JPEG.

5.2 DEVELOPING THE STANDARDS

Creating, maintaining and updating the ISO/IEC 14496 ('MPEG-4') set of standards is the responsibility of the Moving Picture Experts Group (MPEG), a study group that develops standards for the International Organisation for Standardisation (ISO). The H.264 Recommendation (also known as MPEG-4 Part 10, 'Advanced Video Coding', and formerly known as H.26L) is a joint effort between MPEG and the Video Coding Experts Group (VCEG), a study group of the International Telecommunication Union (ITU). MPEG developed the highly successful MPEG-1 and MPEG-2 standards for coding video and audio, now widely used for communication and storage of digital video, and is also responsible for the MPEG-7 standard and the MPEG-21 standardisation effort. VCEG was responsible for the first widely-used video telephony standard (H.261) and its successor, H.263, and initiated the early development of the H.26L project. The two groups set up the collaborative Joint Video Team (JVT) to finalise the H.26L proposal and convert it into an international standard (H.264/MPEG-4 Part 10) published by both ISO/IEC and ITU-T.

5.3 H.264/MPEG4

5.3.1 INTRODUCTION

The Moving Picture Experts Group and the Video Coding Experts Group (MPEG and VCEG) have developed a new standard that promises to outperform the earlier MPEG-4 and H.263 standards, providing better compression of video images. The new standard is entitled 'Advanced Video Coding' (AVC) and is published jointly as Part 10 of MPEG-4 and ITU-T Recommendation H.264 [1, 4].

6.RESULTS

RESULT AND ANALYSIS

FIG.1 INPUT VIDEO FOR COMPRESSION
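The performance parameters reported for the experiment are a bit error rate (BER), a compression ratio and an entropy value. The paper does not state how these were computed, so the sketch below uses common textbook definitions as an assumption; the function names and the toy inputs are hypothetical.

```python
import math
from collections import Counter

def shannon_entropy(samples):
    """Entropy in bits per sample of the empirical value distribution."""
    counts = Counter(samples)
    total = len(samples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def compression_ratio(original_bytes, compressed_bytes):
    """Size of the original divided by size of the compressed representation."""
    return original_bytes / compressed_bytes

def bit_error_rate(sent, received):
    """Fraction of differing bits between two equal-length byte strings."""
    if len(sent) != len(received):
        raise ValueError("inputs must have the same length")
    wrong = sum(bin(a ^ b).count("1") for a, b in zip(sent, received))
    return wrong / (8 * len(sent))

pixels = [0, 0, 255, 255]                    # toy data, two equally likely values
h = shannon_entropy(pixels)                  # 1 bit per sample
cr = compression_ratio(1000, 250)            # 4.0
ber = bit_error_rate(b"\x0f", b"\x0e")       # 1 wrong bit out of 8
```

With 8-bit samples the entropy is bounded above by 8 bits per sample, which gives a rough sanity check for any entropy value computed on pixel data in this way.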


FIG.2 COMPRESSED VIDEO BY PROPOSED WORK

FIG.3 COMPRESSION RATIO FOR ALL SELECTED FRAMES

PERFORMANCE PARAMETERS:

1. BER_OF_MPEG4 = 0.1042
2. compression_ratio_of_mpeg4 = 16.4924
3. entropy_MPEG4 = 7.2422

7.CONCLUSION

The proposed work is a comparative study of two recent, robust video compression standards, H.264 and MPEG-4. We used both subjective and objective quality assessment techniques to assess the robustness of the two techniques. Although the two encoding formats are used in different areas owing to their respective advantages and disadvantages, many people prefer H.264 to MPEG-4 in terms of video quality, file size and other aspects. The reasons why H.264 is superior are summarized below.

1. One advantage of H.264 is its high compression rate, which is about 2 times more efficient than MPEG-4 encoding. In other words, the high compression rate makes it possible to store more information on the same hard disk.
2. H.264 vs MPEG-4 quality: the image quality of H.264 is better and playback is smoother than with MPEG-4 compression.
3. H.264 supports mobile surveillance applications more efficiently.

REFERENCES:

[1] I. Richardson, Video Codec Design: Developing Image and Video Compression Systems, 1st ed. John Wiley & Sons, Ltd, 2002.
[2] J. D. Gibson and A. Bovik, Eds., Handbook of Image and Video Processing, 1st ed. Orlando, FL, USA: Academic Press, Inc., 2000.
[3] K. Jack, Video Demystified: A Handbook for the Digital Engineer, 5th ed. Newton, MA, USA: Newnes, 2007.
[4] R. W. G. Hunt and M. R. Pointer, Measuring Colour, 4th ed. John Wiley & Sons Inc., September 2011.
[5] T. Wiegand, G. J. Sullivan, G. Bjøntegaard, and A. Luthra, "Overview of the H.264/AVC video coding standard," IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 7, pp. 560–576, 2003.
[6] ITU-T Recommendation H.264 / ISO/IEC 14496-10, Advanced Video Coding for Generic Audiovisual Services, ITU-T / ISO/IEC Std., March 2005.
[7] Recommendation ITU-R BT.601-5, Studio encoding parameters of digital television for standard 4:3 and wide-screen 16:9 aspect ratios, ITU-R Std., 1995.
[8] L. Hanzo, P. Cherriman, and J. Streit, Video Compression and Communications: From Basics to H.261, H.263, H.264, MPEG4 for DVB and HSDPA-Style Adaptive Turbo-Transceivers, 2nd ed. Wiley-IEEE Press, 2007.
[9] S. B. Solak and F. Labeau, Sustainable ICTs and Management Systems for Green Computing. IGI Global, June 2012, ch. Green Video Compression for Portable and Low-Power Applications, pp. 325–349.
[10] A. Malewar, A. Bahadarpurkar, and V. Gadre, "A linear rate control model for better target buffer level tracking in H.264," Signal, Image and Video Processing, vol. 7, pp. 275–286, 2011.
[11] "International telecommunication union-telecommunication," http://www.itu.int.
[12] "International organization for standardization," http://www.iso.org.
[13] ITU-T Recommendation H.261, Video codec for audiovisual services at p × 64 kbit/s, ITU-T Std., December 1990.
[14] ISO/IEC Standard 11172-2, Information technology: coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbit/s-part 2: Video, ISO/IEC Std., 1993.
[15] ISO/IEC Standard 13818-2, Information technology: generic coding of moving pictures and associated audio information: Video, ISO/IEC Std., 1995.
[16] ITU-T Recommendation H.262, Information technology - Generic coding of moving pictures and associated audio information: Video, ITU-T Std., July 1995.
[17] ITU-T Recommendation H.263, Video coding for low bit rate communication, ITU-T Std., March 1996.
[18] M. G. Strintzis, "Object-based coding of stereoscopic and 3D image sequences," IEEE Signal Process. Mag., pp. 14–28, 1999.
[19] T. Ebrahimi and C. Horne, "MPEG-4 natural video coding - an overview," Signal Processing: Image Communication, vol. 15, no. 4, pp. 365–385, 2000.
[20] ISO/IEC Standard 14496-2, Information technology: coding of audio-visual objects-part 2: Visual, ISO/IEC Std., 1998.
[21] ITU-T Recommendation H.265 / ISO/IEC 23008-2, High Efficiency Video Coding (HEVC), ITU-T / ISO/IEC Std., October 2014.
[22] F. Bossen, B. Bross, K. Suhring, and D. Flynn, "HEVC complexity and implementation analysis," IEEE Trans. Circuits Syst. Video Technol., vol. 22, no. 12, pp. 1685–1696, Dec 2012.
[23] J. Vanne, M. Viitanen, T. D. Hamalainen, and A. Hallapuro, "Comparative rate-distortion-complexity analysis of HEVC and AVC video codecs," IEEE Trans. Circuits Syst. Video Technol., vol. 22, no. 12, pp. 1885–1898, Dec 2012.
[24] M. B. Dissanayake and D. L. B. Abeyrathna, "Performance comparison of HEVC and H.264/AVC standards in broadcasting environments," Journal of Information Processing Systems, vol. 11, no. 3, pp. 483–494, September 2015.
[25] I. E. Richardson, H.264 and MPEG-4 Video Compression: Video Coding for Next-generation Multimedia. John Wiley & Sons, 2003.

