Platforms for Handling and Development of Audiovisual Data

Faculdade de Engenharia da Universidade do Porto

José Pedro Sousa Horta

Project/Dissertation for the Mestrado Integrado em Engenharia Informática

Advisor: Professor Eurico Carrapatoso

July 2008

© José Pedro Sousa Horta, 2008

Approved in public examination by the jury:
President: Prof. Doutor António Fernando Coelho
Examiner: Prof. Doutor Nuno Magalhães Ribeiro
Member: Prof. Doutor Eurico Manuel Carrapatoso
July 17th 2008

Abstract

Everywhere around us, digital is replacing analogue. This is especially true in the audiovisual field: in both the consumer and the professional market, the advantages of computer-based media have quickly persuaded investors. To choose which technologies software should be based on, a proper understanding of the major aspects behind digital media is essential. An overview is given of the state of the art in digital video compression, the existing container formats and the multimedia frameworks (which enable easier playback and editing of audiovisual content), preceded by an introduction to the main topics. Owing to the context in which it was written, this document focuses particularly on the professional video market, and predominantly on MPEG-2, MXF and DirectShow. To obtain more accurate results, grounded in practical contact with the technology, a hands-on programming project was completed, using DirectShow to play back both local and remote WMV files. Compression algorithms are continuously being improved and new ones invented, the designers of container formats are paying increased attention to metadata, and some serious alternatives to DirectShow are emerging.

In a continuously changing environment, light is shed on what has happened, what is happening and what will happen in the coming years on the digital video scene. Results show that the choice of a particular video coding format or multimedia framework depends heavily on the purpose of its application and that there is no universal solution. For various usage scenarios, a recommendation is given as to which technologies are best applied. The software developed ultimately received positive feedback from the client.

Keywords: Digital video compression, Digital video containers, Multimedia frameworks; DV, Motion JPEG 2000, MPEG 2 / 4 / 21, ASF, MXF, DirectShow, Helix DNA, QuickTime

Resumo (translated from Portuguese)

Almost everywhere, digital formats are replacing analogue ones. This is especially evident in the audiovisual world: whether in the professional or the consumer market, the advantages of computer-based media quickly convinced investors. To ease the choice of which digital technologies to employ, sufficient knowledge of the main aspects behind the handling of digital media is essential. An overview is given of the state of the art of digital video compression and transport formats and of multimedia platforms (which allow easy handling and editing of audiovisual content), preceded by an introduction to the main topics. The professional video market receives particular attention in this document, owing to the context in which it was written, predominantly the MPEG-2, MXF and DirectShow technologies. To obtain more relevant results, a practical project based on DirectShow was completed, with the aim of playing back local and remote WMV files. Compression algorithms are continuously being improved and invented, storage formats devote ever more resources to metadata, and serious alternatives to DirectShow are beginning to emerge.

In a continuously changing environment, the veil is lifted on what has happened, what is happening and what will happen in the coming years in the field of digital video. The results indicate that the choice of a particular video file format or multimedia framework depends greatly on the intended use, and that there are no universal solutions. Several scenarios are considered and solutions recommended regarding the best technological options to adopt. The implemented application was evaluated positively by the client.

Acknowledgements

I would like to thank all my colleagues at MOG Solutions for an outstanding work environment that considerably eased the flow of work. My special thanks go to my supervisor at MOG Solutions, Vitor Teixeira, for letting me find my own pace and for his valuable knowledge; to Luis Barral for his priceless expertise in DirectShow programming and his willingness to share it; to Orlando Ribeiro for his constant availability to help; and to Pedro Campos for keeping me motivated throughout these four and a half months. Many thanks to my faculty advisor, Prof. Eurico Carrapatoso, for kindly embracing this project. And a final thank you to my brother Frederico, for providing me with his layman's insight when reviewing this document.

Table of Contents

1. INTRODUCTION
  1.1 SCOPE
  1.2 MOTIVATION
  1.3 OBJECTIVES
  1.4 STRUCTURE
  1.5 DIGITAL VIDEO
2. DIGITAL VIDEO COMPRESSION
  2.1 DIGITAL DATA COMPRESSION
    2.1.1 Lossless Compression
      2.1.1.1 Run-Length Encoding
      2.1.1.2 Entropy Encoding
        Huffman Coding
        Arithmetic Coding
    2.1.2 Lossy Compression
      2.1.2.1 Markov Sources
    2.1.3 Transform Coding
      2.1.3.1 Discrete Cosine Transform
      2.1.3.2 Wavelet Transform
  2.2 DIGITAL IMAGE COMPRESSION
    2.2.1 Sampling
    2.2.2 Quantization
    2.2.3 Predicting Image Values
    2.2.4 Chroma Subsampling
  2.3 DIGITAL VIDEO COMPRESSION
    2.3.1 Motion Estimation and Compensation
      2.3.1.1 Frame Types
        Intra Frames
        Predicted Frames