
FORMAT INDEPENDENCE PROVISION OF AUDIO AND VIDEO DATA IN MULTIMEDIA DATABASE MANAGEMENT SYSTEMS

Der Technischen Fakultät der Universität Erlangen-Nürnberg

zur Erlangung des Grades

D O K T O R – I N G E N I E U R

vorgelegt von

Maciej Suchomski

Erlangen – 2008

Als Dissertation genehmigt von der Technischen Fakultät der Universität Erlangen-Nürnberg

Tag der Einreichung: 13.05.2008
Tag der Promotion: 31.10.2008

Dekan: Prof. Dr.-Ing. habil. Johannes Huber

Berichterstatter: Prof. Dr.-Ing. Klaus Meyer-Wegener, Vizepräsident der FAU
Prof. Dr. Andreas Henrich

BEREITSTELLUNG DER FORMATUNABHÄNGIGKEIT VON AUDIO- UND VIDEODATEN IN MULTIMEDIALEN DATENBANKVERWALTUNGSSYSTEMEN

Der Technischen Fakultät der Universität Erlangen-Nürnberg

zur Erlangung des Grades

D O K T O R – I N G E N I E U R

vorgelegt von

Maciej Suchomski

Erlangen – 2008

Als Dissertation genehmigt von der Technischen Fakultät der Universität Erlangen-Nürnberg

Tag der Einreichung: 13.05.2008
Tag der Promotion: 31.10.2008

Dekan: Prof. Dr.-Ing. habil. Johannes Huber

Berichterstatter: Prof. Dr.-Ing. Klaus Meyer-Wegener, Vizepräsident der FAU
Prof. Dr. Andreas Henrich

To My Beloved Parents
Dla Moich Kochanych Rodziców


ABSTRACT

Since the late 1990s there has been a noticeable revolution in the consumption of multimedia data, analogous to the electronic data processing revolution of the 1980s and 1990s. The multimedia revolution covers different aspects such as multimedia production, storage, and delivery, but it also triggers completely new solutions on the consumer market such as multifunction handheld devices, digital and Internet TV, or home cinemas. However, it brings new problems as well. The variety of multimedia formats is on the one hand an advantage, but on the other hand one of the problems, since every consumer has to understand the data in a specific format in order to consume them.

On the other hand, database management systems have always been responsible for providing data to consumers and applications regardless of format and storage characteristics. In the case of multimedia data, however, MMDBMSs have failed to provide data independence due to the complexity of the “translation process”, especially when continuous data such as audio and video are considered. There are many reasons for this situation: the time characteristics of continuous data (processing must satisfy not only functional correctness but also time correctness), the complexity of the conversion algorithms (especially compression), and the demand for processing resources that varies over time (due to its dependence on the content) and thus requires sophisticated resource-allocation algorithms.

This work proposes the conceptual model of the real-time audio-video conversion (RETAVIC) architecture in order to diminish the existing problems in the multimedia format translation process and thus to allow the provision of format independence for audio and video data. The data processing within the RETAVIC architecture has been divided into four phases: capturing, analysis, storage and delivery. The key assumption is meta-data-based real-time transcoding in the delivery phase, where quality-adaptive decoding and encoding employing the Hard-Real-Time Adaptive model takes place. In addition, the Layered Lossless Video format (LLV1) has been implemented within this project, and an analysis of format-independence approaches and of their support in current multimedia management systems has been conducted. The prototypical real-time implementation of the critical parts of the transcoding chain for video provides the functional, quantitative and qualitative evaluation.


KURZFASSUNG

Seit den späten 1990er Jahren gibt es eine wahrnehmbare Revolution im Konsumverhalten von Multimediadaten, welche analog der Revolution der elektronischen Datenverarbeitung in 1980er und 1990er Jahren ist. Diese Multimediarevolution umfasst verschiedene Aspekte wie Multimediaproduktion, -speicherung und -verteilung, sie bedingt außerdem vollständig neue Lösungen auf dem Absatzmarkt für Konsumgüter wie mobile Endgeräte, digitales und Internet-Fernsehen oder Heimkinosysteme. Sie ist jedoch ebenfalls Auslöser bis dato unbekannter Probleme. Die Multimediaformatvielzahl ist einerseits ein Vorteil, auf der anderen Seite aber eines dieser Probleme, da jeder Verbraucher die Daten in einem spezifischen Format „verstehen“ muss, um sie konsumieren zu können.

Andererseits sind die Datenbankverwaltungssysteme aber auch dafür verantwortlich, dass die Daten unabhängig von Format- und Speichereigenschaften für die Verbraucher und für die Anwendungen zur Verfügung stehen. Im Falle der Multimediadaten jedoch haben die MMDBVSe die Datenunabhängigkeit wegen der Komplexität „im Übersetzungsprozess“ nicht zur Verfügung stellen können, insbesondere wenn es sich um kontinuierliche Datenströme wie Audiodaten und Videodaten handelt. Es gibt viele Gründe für solche Phänomene: die Zeiteigenschaften der kontinuierlichen Daten (die Verarbeitung entsprechend der Funktionskorrektheit aber auch entsprechend der Zeitkorrektheit), die Komplexität der Umwandlungsalgorithmen (insbesondere jene der Kompression) und die Anforderungen an die Verarbeitungsressourcen, die in der Zeit schwanken (wegen der Inhaltsabhängigkeit) und die daher anspruchsvolle Ressourcenzuweisungsalgorithmen erforderlich machen.

Die vorliegende Arbeit konzentriert sich auf einen Vorschlag des Begriffsmodells der Echtzeitumwandlungsarchitektur der Audio- und Videodaten (RETAVIC), um vorhandene Probleme im Multimediaformat-Übersetzungsprozess zu mindern und somit die Bereitstellung der Formatunabhängigkeit von Audio- und Videodaten zu erlauben. Die Datenverarbeitung innerhalb der RETAVIC-Architektur ist in vier Phasen unterteilt worden: Erfassung, Analyse, Speicherung und Anlieferung. Die Haupthypothese ist die metadaten-bezogene Echtzeittranskodierung in der Anlieferungsphase, in der die qualitätsanpassungsfähige Decodierung und Enkodierung mit dem Einsatz des „Hard-Real-Time Adaptive (Hart-Echtzeit-Adaptiv)-Modells“ auftritt. Außerdem ist das „Layered Lossless Video Format“ (Geschichtetes Verlustfreies Videoformat) innerhalb dieses Projektes implementiert worden; eine Analyse der Formatunabhängigkeitsansätze sowie der -unterstützung in den gegenwärtigen Multimedia-Managementsystemen wurde durchgeführt. Die prototypische Echtzeit-Implementierung der kritischen Teile der Transkodierungskette für Video ermöglicht die funktionale, quantitative und qualitative Auswertung.


ACKNOWLEDGEMENTS

First and foremost, I would like to thank my supervisor, Prof. Klaus Meyer-Wegener. It was a great pleasure to work under his supervision. He was always able to find time for me and to conduct stimulating discussions. His advice and suggestions at the crossroads allowed me to choose the correct path and to bring my research forward, keeping me on track right to the end of the road. His great patience, tolerance, and understanding helped in conducting the research and testing new ideas. His great wisdom and active support are beyond doubt. Prof. Meyer-Wegener spent not only days but also nights co-authoring the papers published during the research for this work. Without him, beginning and completing this thesis would never have been possible.

Next, I would like to express my gratitude to Prof. Hartmut Wedekind for his great advice, for the spiritual experiences shared during our stay in the Sudety Mountains, and for accepting the position of chairman during the viva voce examination. I also want to thank Prof. Andreas Henrich for many fruitful discussions during the workshops of the MMIS Group. Moreover, I am very happy that Prof. Henrich committed himself to being the reviewer of my dissertation, and I will never forget these efforts. Subsequently, I give my great thanks to Prof. André Kaup for his participation in the PhD examination procedure.

The expression of enjoyment derived from the cooperative and personal work, from the meetings “in the kitchen”, and from the funny everyday situations goes to all my colleagues at the chair. In particular, I would like to thank a few of them. First, I want to give my thanks to our ladies: to Mrs. Knechtel for her organizational help and warm welcome at the university, and to Mrs. Braun and Mrs. Stoyan for keeping the hardware infrastructure up and running, allowing me to work without additional worries. My appreciation is directed to Prof. Jablonski for much scholarly advice and for the smooth and unproblematic collaboration during the preparation of the exercises. I give my great thanks to Dr. Marcus Meyerhöfer – I have really enjoyed the time with you, not only in the shared office but also outside the university, and none of the time spent together will be forgotten. Finally, I would like to thank my other colleagues: Michael Daum, Dr. Ilia Petrov, Dr. Rainer Lay, Dr. Sascha Müller, Dr. Udo Mayer, Florian Irmert and Robert Nagy. They willingly spent time with me outside the office as well and brought me closer not only to the German culture but also to the nightlife fun.


I am also grateful to all the students I supervised, who did their study projects and master theses in support of the RETAVIC project. Their contributions, including among other things discussions on architectural issues, writing code, benchmarking and evaluation, allowed the concepts to be refined and the doubts to be clarified. Especially the best-effort converter prototypes and their subsequent real-time implementations made proving the assumed hypotheses possible – great thanks to my developers and active discussion partners.

I must express my gratitude to my dear brother, Paweł. We are two people but one heart, blood, soul and mind. His helpfulness and kindness in supporting my requests will never be forgotten. He helped me a lot with the organizational aspects of everyday life and was the reliable representative of my person in many important matters in Wroclaw, Poland, during my stay in Erlangen. I am not able to say how much I owe to him.

I would like to thank my beloved wife, Magda, for her spiritual support and love, for the big and small things day by day, for always being with me in good and bad times, for tolerating my bad moods, and for her patience and understanding. Her cheerfulness and good humour made me very happy even during the times under pressure. I have managed to finish this work thanks to her.

Finally, I am sincerely thankful to my dear parents for their advice, continuous support and love. It is a great honor, and at the same time my good fortune, to have such loving parents and to have the privilege of drawing on their wisdom and experience. To them I dedicate this work entirely.


CONTENTS

ABSTRACT ...... I
KURZFASSUNG ...... III
ACKNOWLEDGEMENTS ...... V
CONTENTS ...... VII
LIST OF FIGURES ...... XIII
LIST OF TABLES ...... XIX
INDEX OF EQUATIONS ...... XXI

CHAPTER 1 – INTRODUCTION...... 1

I. INTRODUCTION ...... 1
I.1. EDP Revolution ...... 2
I.2. Digital Multimedia – New Hype and New Problems ...... 3
I.3. Data Independence ...... 6
I.4. Format Independence ...... 9
I.5. AV Conversion Problems ...... 11
I.6. Assumptions and Limitations ...... 11
I.7. Contribution of the RETAVIC Project ...... 12
I.8. Thesis Outline ...... 13

CHAPTER 2 – RELATED WORK ...... 15

II. FUNDAMENTALS AND FRAMEWORKS ...... 15
II.1. Terms and Definitions ...... 16
II.2. Multimedia data delivery ...... 17
II.3. Approaches to Format Independence ...... 18
II.3.1. Redundancy Approach ...... 19
II.3.2. Adaptation Approach ...... 20
II.3.3. Transcoding Approach ...... 22
II.3.3.1 Cascaded transcoding ...... 24
II.3.3.2 MD-based transcoding ...... 25
II.4. Video and Audio Transformation Frameworks ...... 27
II.4.1. Converters and Converter Graphs ...... 28
II.4.1.1 Well-known implementations ...... 29
II.4.2. End-to-End Adaptation and Transcoding Systems ...... 31
II.4.3. Summary of the Related Transformation Frameworks ...... 34
II.5. Format Independence in Multimedia Management Systems ...... 34
II.5.1. MMDBMS ...... 35
II.5.2. Media (Streaming) Servers ...... 39
III. FORMATS AND RT/OS ISSUES ...... 40
III.1. Storage Formats ...... 40


III.1.1. Video Data ...... 40
III.1.1.1 Scalable codecs ...... 41
III.1.1.2 Lossless codecs ...... 42
III.1.1.3 Lossless and scalable codecs ...... 42
III.1.2. Audio Data ...... 43
III.2. Real-Time Issues in Operating Systems ...... 45
III.2.1. OS Kernel – Execution Modes, Architectures and IPC ...... 46
III.2.2. Real-time Processing Models ...... 47
III.2.2.1 Best-effort, soft- and hard-real-time ...... 47
III.2.2.2 Imprecise computations ...... 48
III.2.3. Scheduling Algorithms and QAS ...... 48

CHAPTER 3 – DESIGN ...... 51

IV. SYSTEM DESIGN ...... 51
IV.1. Architecture Requirements ...... 52
IV.2. General Idea ...... 53
IV.2.1. Real-time Capturing ...... 56
IV.2.1.1 Grabbing techniques ...... 56
IV.2.1.2 Throughput and storage requirements of uncompressed media data ...... 58
IV.2.1.3 Fast and simple lossless encoding ...... 61
IV.2.1.4 Media buffer as temporal storage ...... 64
Different hardware storage solutions and offered throughput ...... 64
Evaluation of storage solutions in context of RETAVIC ...... 68
IV.2.2. Non-real-time Preparation ...... 69
IV.2.2.1 Archiving origin source ...... 70
IV.2.2.2 Conversion to internal format ...... 71
IV.2.2.3 Content analysis ...... 72
IV.2.3. Storage ...... 72
IV.2.3.1 Lossless scalable binary stream ...... 73
IV.2.3.2 Meta data ...... 75
IV.2.4. Real-time Delivery ...... 76
IV.2.4.1 Real-time transcoding ...... 76
Real-time decoding ...... 77
Real-time processing ...... 77
Real-time encoding ...... 78
IV.2.4.2 Direct delivery ...... 80
IV.3. Evaluation of the General Idea ...... 81
IV.3.1. Insert and Update Delay ...... 82
IV.3.2. Architecture Independence of the Internal Storage Format ...... 83
IV.3.2.1 Correlation between modules in different phases related to internal format ...... 84
IV.3.2.2 Procedure of internal format replacement ...... 85
IV.3.2.3 Evaluation of storage format independence ...... 86
V. VIDEO PROCESSING MODEL ...... 87
V.1. Analysis of the Video Representatives ...... 87
V.2. Assumptions for the Processing Model ...... 92
V.3. Video-Related Static MD ...... 94
V.4. LLV1 as Video Internal Format ...... 99
V.4.1. LLV1 Algorithm – Encoding and Decoding ...... 100
V.4.2. Mathematical Refinement of Formulas ...... 102
V.4.3. Compression efficiency ...... 104
V.5. Video Processing Supporting Real-time Execution ...... 105


V.5.1. MD-based Decoding ...... 105
V.5.2. MD-based Encoding ...... 106
V.5.2.1 MPEG-4 as representative ...... 106
V.5.2.2 Continuous MD set for video encoding ...... 108
V.5.2.3 MD-XVID as proof of concept ...... 111
V.6. Evaluation of the Video Processing Model through Best-Effort Prototypes ...... 114
V.6.1. MD Overheads ...... 114
V.6.2. Data Layering and Scalability of the Quality ...... 116
V.6.3. Processing Scalability in the Decoder ...... 122
V.6.4. Influence of Continuous MD on the Data Quality in Encoding ...... 126
V.6.5. Influence of Continuous MD on the Processing Complexity ...... 129
V.6.6. Evaluation of Complete MD-Based Video Transcoding Chain ...... 131
VI. AUDIO PROCESSING MODEL ...... 134
VI.1. Analysis of the Audio Encoders Representatives ...... 134
VI.2. Assumptions for the Processing Model ...... 138
VI.3. Audio-Related Static MD ...... 140
VI.4. MPEG-4 SLS as Audio Internal Format ...... 142
VI.4.1. MPEG-4 SLS Algorithm – Encoding and Decoding ...... 143
VI.4.2. AAC Core ...... 144
VI.4.3. HD-AAC / SLS Extension ...... 145
VI.4.3.1 Integer Modified Discrete Cosine Transform ...... 147
VI.4.3.2 Error mapping ...... 151
VI.5. Audio Processing Supporting Real-time Execution ...... 151
VI.5.1. MD-based Decoding ...... 151
VI.5.2. MD-based Encoding ...... 153
VI.5.2.1 MPEG-4 standard as representative ...... 153
Generalization of Perceptual Audio Coding Algorithms ...... 153
MPEG-1 Layer 3 and MPEG-2/4 AAC ...... 154
VI.5.2.2 Continuous MD set for audio coding ...... 156
VI.6. Evaluation of the Audio Processing Model through Best-Effort Prototypes ...... 158
VI.6.1. MPEG-4 SLS Scalability in Data Quality ...... 158
VI.6.2. MPEG-4 SLS Processing Scalability ...... 159
VII. REAL-TIME PROCESSING MODEL ...... 161
VII.1. Modeling of Continuous Multimedia Transformation ...... 161
VII.1.1. Converter, Conversion Chains and Conversion Graphs ...... 161
VII.1.2. Buffers in the Multimedia Conversion Process ...... 164
Jitter-constrained periodic stream ...... 164
Leading time and buffer size calculations ...... 165
M:N data stream conversion ...... 165
VII.1.3. Data Dependency in the Converter ...... 166
VII.1.4. Data Processing Dependency in the Conversion Graph ...... 167
VII.1.5. Problem with JCPS in Graph Scheduling ...... 168
VII.1.6. Operations on Media Streams ...... 173
Media integration (multiplexing) ...... 173
Media demuxing ...... 173
Media replication ...... 173
VII.1.7. Media data synchronization ...... 173
VII.2. Real-time Issues in Context of Multimedia Processing ...... 175
VII.2.1. Remarks on JCPS – Data Influence on the Converter Description ...... 175
VII.2.2. Hard Real-time Adaptive Model of Media Converters ...... 176
VII.2.3. Dresden Real-time Operating System as RTE for RETAVIC ...... 178


Architecture ...... 178
Scheduling ...... 180
Real-time thread model ...... 180
VII.2.4. DROPS Streaming Interface ...... 182
VII.2.5. Controlling the Multimedia Conversion ...... 184
VII.2.5.1 Generalized control flow in the converter ...... 184
VII.2.5.2 Scheduling of the conversion graph ...... 185
Construct the conversion graph ...... 185
Predict quant data volume ...... 187
Calculate bandwidth ...... 187
Check and allocate the resources ...... 187
VII.2.5.3 Converter’s time prediction through division on function blocks ...... 189
VII.2.5.4 Adaptation in processing ...... 190
VII.2.6. The Component Streaming Interface ...... 190
VII.3. Design of Real-Time Converters ...... 192
VII.3.1. Platform-Specific Factors ...... 193
VII.3.1.1 Hardware architecture influence ...... 193
VII.3.1.2 Compiler effects on the processing time ...... 195
VII.3.1.3 Thread models – priorities, multitasking and caching ...... 197
VII.3.2. Timeslices in HRTA Converter Model ...... 199
VII.3.3. Precise time prediction ...... 200
VII.3.3.1 Frame-based prediction ...... 201
VII.3.3.2 MB-based prediction ...... 204
VII.3.3.3 MV-based prediction ...... 210
VII.3.3.4 The compiler-related time correction ...... 215
VII.3.3.5 Conclusions to precise time prediction ...... 216
VII.3.4. Mapping of MD-LLV1 Decoder to HRTA Converter Model ...... 216
VII.3.5. Mapping of MD-XVID Encoder to HRTA Converter Model ...... 218
VII.3.5.1 Simplification in time prediction ...... 218
VII.3.5.2 Division of encoding time according to HRTA ...... 219

CHAPTER 4 – IMPLEMENTATION...... 223

VIII. CORE OF THE RETAVIC ARCHITECTURE ...... 223
VIII.1. Implemented Best-effort Prototypes ...... 223
VIII.2. Implemented Real-time Prototypes ...... 224
IX. REAL-TIME PROCESSING IN DROPS ...... 226
IX.1. Issues of Porting to DROPS ...... 226
IX.2. Process Flow in the Real-Time Converter ...... 229
IX.3. RT-MD-LLV1 Decoder ...... 231
IX.3.1. Setting-up Real-Time Mode ...... 231
IX.3.2. Preempter Definition ...... 232
IX.3.3. MB-based Adaptive Processing ...... 233
IX.3.4. Decoder’s Real-Time Loop ...... 234
IX.4. RT-MD-XVID Encoder ...... 235
IX.4.1. Setting-up Real-Time Mode ...... 236
IX.4.2. Preempter Definition ...... 237
IX.4.3. MB-based Adaptive Processing ...... 238
IX.4.4. Encoder’s Real-Time Loop ...... 239

CHAPTER 5 – EVALUATION AND APPLICATION...... 241


X. EXPERIMENTAL MEASUREMENTS ...... 241
X.1. The Evaluation Process ...... 242
X.2. Measurement Accuracy – Low-level Test Bed Assumptions ...... 243
X.2.1. Impact Factors ...... 243
X.2.2. Measuring Disruptions Caused by Impact Factors ...... 245
X.2.2.1 Deviations in CPU Cycles Frequency ...... 245
X.2.2.2 Deviations in the Transcoding Time ...... 246
X.2.3. Accuracy and Errors – Summary ...... 248
X.3. Evaluation of RT-MD-LLV1 ...... 249
X.3.1. Checking Functional Consistency with MD-LLV1 ...... 249
X.3.2. Learning Phase for RT Mode ...... 250
X.3.3. Real-time Working Mode ...... 253
X.4. Evaluation of RT-MD-XVID ...... 255
X.4.1. Learning Phase for RT-Mode ...... 255
X.4.2. Real-time Working Mode ...... 258
XI. COROLLARIES AND CONSEQUENCES ...... 264
XI.1. Objective Selection of Application Approach based on Transcoding Costs ...... 264
XI.2. Application Fields ...... 265
XI.3. Variations of the RETAVIC Architecture ...... 267

CHAPTER 6 – SUMMARY...... 269

XII. CONCLUSIONS ...... 269
XIII. FURTHER WORK ...... 271

APPENDIX A ...... 275

XIV. GLOSSARY OF DEFINITIONS ...... 275
XIV.1. Data-related Terms ...... 275
XIV.2. Processing-related Terms ...... 277
XIV.3. Quality-related Terms ...... 280

APPENDIX B ...... 281

XV. DETAILED ALGORITHM FOR LLV1 FORMAT ...... 281
XV.1. The LLV1 decoding algorithm ...... 281

APPENDIX C ...... 285

XVI. COMPARISON OF MPEG-4 AND H.263 STANDARDS ...... 285
XVI.1. Algorithmic differences and similarities ...... 285
XVI.2. Application-oriented comparison ...... 288
XVI.3. Implementation analysis ...... 291
XVI.3.1. MPEG-4 ...... 292
XVI.3.2. H.263 ...... 292


APPENDIX D...... 295

XVII. LOADING CONTINUOUS METADATA INTO ENCODER ...... 295

APPENDIX E ...... 297

XVIII. TEST BED ...... 297
XVIII.1. Non-real-time processing of high load ...... 297
XVIII.2. Imprecise measurements in non-real-time ...... 300
XVIII.3. Precise measurements in DROPS ...... 301

APPENDIX F ...... 303

XIX. STATIC META-DATA FOR FEW VIDEO SEQUENCES ...... 303
XIX.1. Frame-based static MD ...... 303
XIX.2. MB-based static MD ...... 304
XIX.3. MV-based static MD ...... 306
XIX.3.1. Graphs with absolute values ...... 306
XIX.3.2. Distribution graphs ...... 307

APPENDIX G ...... 311

XX. FULL LISTING OF IMPORTANT REAL-TIME FUNCTIONS IN RT-MD-LLV1 ...... 311
XX.1. Function preempter_thread() ...... 311
XX.2. Function load_allocation_params() ...... 312
XXI. FULL LISTING OF IMPORTANT REAL-TIME FUNCTIONS IN RT-MD-XVID ...... 313
XXI.1. Function preempter_thread() ...... 313

APPENDIX H...... 315

XXII. MPEG-4 AUDIO TOOLS AND PROFILES ...... 315
XXIII. MPEG-4 SLS ENHANCEMENTS ...... 317
XXIII.1. Investigated versions – origin and enhancements ...... 317
XXIII.2. Measurements ...... 317
XXIII.3. Overall Final Improvement ...... 318

BIBLIOGRAPHY ...... 321


LIST OF FIGURES

Number Description Page

Figure 1. Digital Item Adaptation architecture [Vetro, 2004] ...... 23
Figure 2. Bitstream syntax description adaptation architecture [Vetro, 2004] ...... 24
Figure 3. Adaptive transcoding system using meta-data [Vetro, 2001] ...... 31
Figure 4. Generic real-time media transformation framework supporting format independence in multimedia servers and database management systems. Remark: dotted lines refer to optional parts that may be skipped within a phase ...... 54
Figure 5. Comparison of compression size and decoding speed of lossless audio codecs [Suchomski et al., 2006] ...... 63
Figure 6. Location of the network determines the storage model [Auspex, 2000] ...... 66
Figure 7. Correlation between real-time decoding and conversion to internal format ...... 84
Figure 8. Encoding time per frame for various codecs ...... 88
Figure 9. Average encoding time per frame for various codecs ...... 89
Figure 10. Example distribution of time spent on different parts in the XVID encoding process for Clip no 2 ...... 90
Figure 11. Example distribution of time spent on different parts in the FFMPEG MPEG-4 encoding process for Clip no 2 ...... 91
Figure 12. Initial static MD set focusing on the video data ...... 98
Figure 13. Simplified LLV1 algorithm: a) encoding and b) decoding ...... 101
Figure 14. Quantization layering in the frequency domain in the LLV1 format ...... 104
Figure 15. Compressed file-size comparison normalized to LLV1 ([Militzer et al., 2005]) ...... 105
Figure 16. DCT-based video coding of: a) intra-frames, and b) inter-frames ...... 108
Figure 17. XVID Encoder: a) standard implementation and b) meta-data based implementation ...... 113
Figure 18. Continuous Meta-Data: a) overhead cost related to LLV1 Base Layer for tested sequences and b) distribution of given costs ...... 115
Figure 19. Size of LLV1 compressed output: a) cumulated by layers and b) percentage of each layer ...... 118
Figure 20. Relation of LLV1 layers to origin uncompressed video in YUV format ...... 119
Figure 21. Distribution of layers in LLV1-coded sequences showing average with deviation ...... 120
Figure 22. Picture quality for different quality layers for Paris (CIF) [Militzer et al., 2005] ...... 122


Figure 23. Picture quality for different quality layers for Mobile (CIF) [Militzer et al., 2005] ...... 122
Figure 24. LLV1 decoding time per frame of the Mobile (CIF) considering various data layers [Militzer et al., 2005] ...... 123
Figure 25. LLV1 vs. Kakadu – the decoding time measured multiple times and the average ...... 126
Figure 26. Quality value (PSNR in dB) vs. output size of compressed Container (QCIF) sequence [Militzer, 2004] ...... 127
Figure 27. R-D curves for Tempete (CIF) and Salesman (QCIF) sequences [Suchomski et al., 2005] ...... 128
Figure 28. Speed-up effect of applying continuous MD for various bit rates [Suchomski et al., 2005] ...... 129
Figure 29. Smoothing effect on the processing time by exploiting continuous MD [Suchomski et al., 2005] ...... 130
Figure 30. Video transcoding scenario from internal LLV1 format to MPEG-4 SP compatible (using MD-XVID): a) usual real-world case and b) simplified investigated case ...... 131
Figure 31. Execution time of the various data quality requirements according to the simplified scenario ...... 132
Figure 32. OggEnc and FAAC encoding behavior of the silence (based on [Penzkofer, 2006]) ...... 135
Figure 33. Behavior of the Lame encoder for all three test audio sequences (based on [Penzkofer, 2006]) ...... 136
Figure 34. OggEnc and FAAC encoding behavior of the male_speech.wav (based on [Penzkofer, 2006]) ...... 136
Figure 35. OggEnc and FAAC encoding behavior of the go4_30.wav (based on [Penzkofer, 2006]) ...... 137
Figure 36. Comparison of the complete encoding time of the analyzed codecs ...... 138
Figure 37. Initial static MD set focusing on the audio data ...... 141
Figure 38. Overview of simplified SLS encoding algorithm: a) with AAC-based core [Geiger et al., 2006] and b) without core ...... 143
Figure 39. Structure of HD-AAC coder [Geiger et al., 2006]: a) encoder and b) decoder ...... 146
Figure 40. Decomposition of MDCT ...... 148
Figure 41. Overlapping of blocks ...... 148
Figure 42. Decomposition of MDCT by Windowing, TDAC and DCT-IV [Geiger et al., 2001] ...... 150


Figure 43. Givens rotation and its decomposition into three lifting steps [Geiger et al., 2001] ...... 150
Figure 44. General perceptual coding algorithm [Kahrs and Brandenburg, 1998]: a) encoder and b) decoder ...... 153
Figure 45. MPEG Layer 3 encoding algorithm [Kahrs and Brandenburg, 1998] ...... 155
Figure 46. AAC encoding algorithm [Kahrs and Brandenburg, 1998] ...... 155
Figure 47. Gain of ODG with scalability [Suchomski et al., 2006] ...... 159
Figure 48. Decoding speed of SLS version of SQAM with truncated enhancement stream [Suchomski et al., 2006] ...... 160
Figure 49. Converter model – a black-box representation of the converter (based on [Schmidt et al., 2003; Suchomski et al., 2004]) ...... 162
Figure 50. Converter graph: a) meta-model, b) model and c) instance examples ...... 163
Figure 51. Simple transcoding used for measuring times and data amounts on best-effort OS with exclusive execution mode ...... 170
Figure 52. Execution time of simple transcoding: a) per frame for each chain element and b) per chain element for total transcoding time ...... 170
Figure 53. Cumulated time of source period and real processing time ...... 175
Figure 54. Cumulated size of source and encoded data ...... 176
Figure 55. DROPS Architecture [Reuther et al., 2006] ...... 179
Figure 56. Reserved context and real events for one periodic thread ...... 181
Figure 57. Communication in DROPS between kernel and application threads (incl. scheduling context) ...... 182
Figure 58. DSI application model (based on [Reuther et al., 2006]) ...... 183
Figure 59. Generalized control flow of the converter [Schmidt et al., 2003] ...... 185
Figure 60. Scheduling of the conversion graph [Märcz et al., 2003] ...... 186
Figure 61. Simplified OO-model of the CSI [Schmidt et al., 2003] ...... 191
Figure 62. Application model using CSI: a) chain of CSI converters and b) the details of control application and converter [Schmidt et al., 2003] ...... 191
Figure 63. Encoding time of the simple benchmark on different platforms ...... 194
Figure 64. Proposed machine index based on simple multimedia benchmark ...... 195
Figure 65. Compiler effects on execution time for media encoding using MD-XVID ...... 196
Figure 66. Preemptive task switching effect (based on [Mielimonka, 2006]) ...... 198
Figure 67. Timeslice allocation scheme in the proposed HRTA thread model of the converter ...... 199
Figure 68. Normalized average LLV1 decoding time counted per frame type for each sequence ...... 201


Figure 69. MD-XVID encoding time of different frame types for representative number of frames in Container QCIF ...... 202
Figure 70. Difference between measured and predicted time ...... 203
Figure 71. Average of measured time and predicted time per frame type ...... 204
Figure 72. MB-specific encoding time using MD-XVID for Carphone QCIF ...... 205
Figure 73. Distribution of different types of MBs per frame in the sequences: a) Carphone QCIF and b) Coastguard CIF (no B-frames) ...... 206
Figure 74. Cumulated processing time along the execution progress for the MD-XVID encoding (based on [Mielimonka, 2006]) ...... 207
Figure 75. Average coding time partitioning in respect to the given frame type (based on [Mielimonka, 2006]) ...... 208
Figure 76. Measured and predicted values for MD-XVID encoding of Carphone QCIF ...... 209
Figure 77. Prediction error of MB-based estimation function in comparison to measured values ...... 210
Figure 78. Distribution of MV-types per frame in the Carphone QCIF sequence ...... 211
Figure 79. Sum of motion vectors per frame and MV type in the static MD for Carphone QCIF sequence: a) with no B-frames and b) with B-frames ...... 211
Figure 80. MD-XVID encoding time of MV-type-specific functional blocks per frame for Carphone QCIF (96) ...... 212
Figure 81. Average encoding time measured per MB using the given MV-type ...... 213
Figure 82. MV-based predicted and measured encoding time for Carphone QCIF (no B-frames) ...... 214
Figure 83. Prediction error of MB-based estimation function in comparison to measured values ...... 215
Figure 84. Mapping of MD-XVID to HRTA converter model ...... 221
Figure 85. Process flow in the real-time converter ...... 230
Figure 86. Setting up real-time mode (based on [Mielimonka, 2006; Wittmann, 2005]) ...... 232
Figure 87. Decoder’s preempter thread accompanying the processing main thread (based on [Wittmann, 2005]) ...... 233
Figure 88. Timeslice overrun handling during the processing of enhancement layer (based on [Wittmann, 2005]) ...... 234
Figure 89. Real-time periodic loop in the RT-MD-LLV1 decoder ...... 235
Figure 90. Encoder’s preempter thread accompanying the processing main thread ...... 237
Figure 91. Controlling the MB-loop in real-time mode during the processing of enhancement layer ...... 238
Figure 92. Logarithmic time scale of computer events [Bryant and O'Hallaron, 2003] ...... 244


Figure 93. CPU frequency measurements in kHz for: a) AMD Athlon 1800+ and b) Pentium Mobile 2GHz ...... 245
Figure 94. Frame processing time per timeslice type depending on the quality level for Container CIF (based on [Wittmann, 2005]) ...... 249
Figure 95. Normalized average time per MB for each frame consumed in the base timeslice (based on [Wittmann, 2005]) ...... 251
Figure 96. Normalized average time per MB for each frame consumed in the enhancement timeslice (based on [Wittmann, 2005]) ...... 251
Figure 97. Normalized average time per MB for each frame consumed in the cleanup timeslice (based on [Wittmann, 2005]) ...... 252
Figure 98. Percentage of decoded MBs for enhancement layers for Mobile CIF with increasing framerate (based on [Wittmann, 2005]) ...... 253
Figure 99. Percentage of decoded MBs for enhancement layers for Container QCIF with increasing framerate (based on [Wittmann, 2005]) ...... 254
Figure 100. Percentage of decoded MBs for enhancement layers for Parkrun ITU601 with increasing framerate (based on [Wittmann, 2005]) ...... 254
Figure 101. Encoding time per frame of various videos for RT-MD-XVID: a) average and b) deviation ...... 256
Figure 102. Worst-case encoding time per frame and achieved average quality vs. requested Lowest Quality Acceptable (LQA) for Carphone QCIF ...... 257
Figure 103. Time-constrained RT-MD-XVID encoding for Mobile QCIF and Carphone QCIF ...... 260
Figure 104. Time-constrained RT-MD-XVID encoding for Mobile CIFN and Coastguard CIF ...... 261
Figure 105. Time-constrained RT-MD-XVID encoding for Carphone QCIF with B-frames ...... 262
Figure 106. Newly proposed temporal layering in the LLV1 format ...... 272
Figure 107. LLV1 decoding algorithm ...... 282
Figure 108. Graph of ranges – quality vs. bandwidth requirements ...... 291
Figure 109. Distribution of frame types within the used set of video sequences ...... 304
Figure 110. Percentage of the total gained time between the original code version and the final version [Wendelska, 2007] ...... 319


LIST OF TABLES

Number Description Page

Table 1. Throughput and storage requirement for few digital cameras from different classes ...... 59
Table 2. Throughput and storage requirements for few video standards ...... 60
Table 3. Throughput and storage requirements for audio data ...... 61
Table 4. Hardware solutions for the media buffer ...... 65
Table 5. Processing time consumed and amount of data produced by the example transcoding chain for Mobile (CIF) video sequence ...... 172
Table 6. The JCPS calculated for the respective elements of the conversion graph from the Table 5 ...... 172
Table 7. JCPS time and size for the LLV1 encoder ...... 175
Table 8. Command line arguments for setting up timing parameters of the real-time thread (based on [Mielimonka, 2006]) ...... 236
Table 9. Deviations in the frame encoding time of the MD-XVID in DROPS caused by microscopic factors (based on [Mielimonka, 2006]) ...... 247
Table 10. Time per MB for each sequence: average for all frames, maximum for all frames, and the difference (max-avg) in relation to the average (based on [Wittmann, 2005]) ...... 252
Table 11. Configuration of the MultiMonster cluster ...... 298
Table 12. The hardware configuration for the queen-bee server ...... 298
Table 13. The hardware configuration for the bee-machines ...... 299
Table 14. The configuration of PC_RT ...... 300
Table 15. The configuration of PC ...... 300
Table 16. MPEG Audio Object Type Definition based on Tools/Modules [MPEG-4 Part III, 2005] ...... 315
Table 17. Use of few selected Audio Object Types in MPEG Audio Profiles [MPEG-4 Part III, 2005] ...... 316


INDEX OF EQUATIONS

Equation Page Equation Page Equation Page

(1) 94     (27) 165    (53) 207
(2) 94     (28) 165    (54) 208
(3) 95     (29) 165    (55) 209
(4) 95     (30) 165    (56) 209
(5) 95     (31) 166    (57) 209
(6) 95     (32) 166    (58) 212
(7) 96     (33) 166    (59) 213
(8) 96     (34) 166    (60) 213
(9) 96     (35) 168    (61) 213
(10) 96    (36) 169    (62) 215
(11) 96    (37) 169    (63) 217
(12) 103   (38) 169    (64) 217
(13) 103   (39) 169    (65) 217
(14) 103   (40) 171    (66) 217
(15) 107   (41) 177    (67) 219
(16) 140   (42) 188    (68) 219
(17) 140   (43) 188    (69) 219
(18) 140   (44) 188    (70) 219
(19) 140   (45) 188    (71) 220
(20) 148   (46) 202    (72) 220
(21) 149   (47) 202    (73) 220
(22) 149   (48) 202    (74) 220
(23) 149   (49) 203    (75) 264
(24) 150   (50) 206
(25) 150   (51) 206
(26) 151   (52) 206


Chapter 1 – Introduction

The problems treated here are those of data independence –the independence of application programs and terminal activities from growth in data types and changes in data representation– and certain kinds of data inconsistency which are expected to become troublesome even in nondeductive systems. Edgar F. Codd (1970, A Relational Model of Data for Large Shared Data Banks, Comm. of ACM 13/6)

I. INTRODUCTION

The wisdom of humanity and the accumulated experience of generations derive from the human ability to apply intelligently the knowledge gained through the senses. However, before a human being acquires knowledge, he must learn to understand the meaning of information. The common sense (or meaning) of natural languages has been molded through the ages and taught implicitly to the young generation by the old one; in the case of artificial languages like programming languages, however, the sense of terms must be explained explicitly to the users. The sense of the language terms used in communication between people allows them to understand the information that is carried by the data written or spoken in the specific language. The data are located at the base level of the hierarchy of information science. Data in different forms (i.e. represented by various languages) may carry the same information, and data in the same form may carry different information; in the second case the interpretation of the terms used differs, usually depending on the context.


Based on the above discussion, it may be deduced that people rely on data and their meaning in the context of information, knowledge and wisdom; thus, everything is about data and their understanding. However, the data themselves are useless until they are processed in order to obtain the information out of them. In other words, data without processing may be just a collection of garbage, and the processing is possible only if the format of the data is known.

I.1. EDP Revolution

The term data processing covers all actions dealing with data, including data understanding, data exchange and distribution, data modification, data translation (or transformation) and data storage. People have been using data as carriers of information for ages, and they have likewise been processing these data in manifold, but sometimes inefficient, ways.

The revolution in data processing finds its roots in the late sixties of the twentieth century [Fry and Sibley, 1976], when the digital world came into play and two data models were developed: the network model by Charles Bachman, named Integrated Data Store (IDS) but officially known as the Codasyl Data Model (which defined DDL and DML) [CODASYL Systems Committee, 1969], and the hierarchical model implemented by IBM under the supervision of Vern Watts, called Information Management System (IMS) [IBM Corp., 1968]. In both models, access to the data was provided through operations using pointers and paths linked to the data records. As a result, restructuring the data required rewriting the access methods, and thus the physical structure had to be known in order to query the data.

Edgar F. Codd, who worked at IBM at that time, did not like the idea of the physically dependent navigational models of Codasyl and IMS, in which data access depended on the storage structures. Therefore he proposed a relational model of data for large data banks [Codd, 1970]. The relational model separated the logical organization of the database, called the schema, from the physical storage methods. Based on Codd's model, System R, a relational database system, was proposed [Astrahan et al., 1976; Chamberlin et al., 1981]. Moreover, System R was the first implementation of SQL, the structured query language, supporting transactions.


After the official birth of System R, an active movement of data and their processing from the analog into the electronic world began. The first commercial systems, such as IBM DB2 (the production successor of System R) and Oracle, fostered and accelerated electronic data processing. The development of these systems focused on implementing the objectives of database management systems, such as data independence, centralization of data and system services, a declarative query language (SQL) and application neutrality, multi-user operation with concurrent access, error treatment (recovery), and the concept of transactions. However, these systems supported only textual and numerical data; other, more complex types of data like images, audio or video were not even mentioned.

I.2. Digital Multimedia – New Hype and New Problems

Nowadays, a multimedia revolution analogous to the electronic data processing (EDP) revolution can be observed. It has been triggered by scientific achievements in information systems, followed by a wide range of available multimedia software and the continuous reduction of computer equipment prices, which was especially noticeable in the storage device sector in the 1980s and 1990s. On the other hand, there has been a rising demand for multimedia data consumption since the late eighties.

However, the demand could not be fully satisfied due to missing standards, i.e. due to deficiencies in the common definition of norms and terms. Thus standardization bodies were established: WG-11 (known as MPEG) within ISO/IEC and SG-16 within ITU-T. MPEG is a working group of JTC1/SC29 within the ISO/IEC organization. It was set up in 1988 to answer these demands, at first by standardizing the coded representation of digital audio and video, enabling many new technologies, e.g. VideoCD and MP3 [MPEG-1 Part III, 1993]. SG-16, a study group of ITU-T, finds its roots in the ITU CCITT Study Group VIII (SG-7), which was founded in 1986. SG-16's primary goal was to develop a new, more efficient compression scheme for continuous-tone still images (known as JPEG [ITU-T Rec. T.81, 1992]). Currently, the activities of MPEG and SG-16 cover the standardization of all technologies required for interoperable multimedia, including media-specific data coding, compositions, descriptions and meta-data definitions, transport systems and architectures.


In parallel to the standardization of coding technologies, the transmission and storage of digital video and audio have become very important for almost all kinds of applications that process audio-visual data. For example, analog archiving and broadcasting was the dominant way of handling audio and video data even less than 10 years ago, but now the situation has changed completely, and services such as DVB-T, DTV or ITV are available. Moreover, the standardization process has triggered research activities delivering new ideas, which proposed an even more extensive usage of digital storage and transmission of multimedia data. Digital pay-TV (DTV) is transmitted through cable networks to households, and the terrestrial broadcasting of digital video and audio (DVB-T and DAB-T in Europe, ISDB-T in Japan and Brazil, ATSC in the USA and some other countries) is already available in some high-tech regions, delivering digital standard-definition television (SDTV) and allowing for high-definition TV (HDTV) in the future. The rising Internet bandwidths enable high-quality digital video-on-demand (VoD) applications, and the sharing of home-made videos is possible through Google Video, YouTube, Vimeo, Videoegg, Jumpcut, blip.tv, and many other providers. Music and video stores are capable of selling digital media content through the Internet (e.g. iTunes). The recent advances in mobile networking (e.g. 3GPP, 3GPP2) permit audio and video streaming to handheld devices. Home entertainment devices like DVD players or digital camcorders with digital video (DV) have hit the masses. Almost all modern PCs by default offer hardware and software support for digital video playback and editing. The national TV and radio archives of analog media go online through government-sponsored digitizing companies (e.g. INA in France). Digital cinemas allow for new experiences with respect to high audio and video quality. The democratically created and low-cost Internet TV (e.g. Dein-TV), which has its analogy in the open-source development community, seems to be approaching our doors.

The increasing popularity of the mentioned applications causes an ever increasing amount of audio and video data to be processed, stored and transmitted, and at the same time it closes the self-feeding circle of providing better know-how and then developing new applications, which, once applied, indicate new requirements for the existing technologies. The common and continuous production of audio and visual information triggered by new standards requires a huge amount of hard disk space and network bandwidth, which in turn highly stimulates the development of more efficient and thus even more complex multimedia compression algorithms.

In the past, the development of MPEG-1 [LeGall, 1991] and MPEG-2 (H.262 [ITU-T Rec. H.262, 2000] is a common text with MPEG-2 Part 2 Video [MPEG-2 Part II, 2001]) was driven by the commercial storage, distribution and transmission of digital video at different resolutions, accompanied by audio in formats such as MPEG-1 Layer 3 (commonly known as MP3) [MPEG-1 Part III, 1993] or Advanced Audio Coding (AAC) [MPEG-2 Part VII, 2006]. Next, H.263 [ITU-T Rec. H.263+, 1998; ITU-T Rec. H.263++, 2000; ITU-T Rec. H.263+++, 2005] was motivated by the demand for a low-bitrate encoding solution for videoconferencing applications. The newer MPEG-4 Video [MPEG-4 Part II, 2004] with High Efficiency AAC [MPEG-4 Part IV Amd 8, 2005] was required to support Internet applications in manifold ways, and H.264 [ITU-T Rec. H.264, 2005] and MPEG-4 Part 10, which were developed by the JVT, the joint video team of MPEG and SG-16, and are technically aligned to each other [MPEG-4 Part X, 2007], were meant as next-generation AV compression algorithms supporting high-quality digital audio and video distribution [Schäfer et al., 2003]. However, after the standardization process, the applicability of a standard usually crosses the borders of the imagination present at the time of its definition: e.g. MPEG-2 found application in DTV and DVB/DAB through satellite, cable, and terrestrial channels as planned, but also as the standard for SVCD and DVD production.

Considering the described multimedia revolution delivering more and more information, people have moved from the poor and hard-to-read world of textual and numerical data to the rich and easily understandable information carried by multimedia data. For the human perception system, the consumption of audio-visual information provided by applications rich in media data like images, music, animation or video is simpler, more pleasant and more comprehensible; hence the popularity of any multimedia-oriented solution keeps rising. The trend towards rich-media applications, supported by hardware development and combined with the advances in computing performance over the past few years, has made media data the most important factor in digital storage and transmission today.

On the other hand, the multimedia revolution also causes new problems. Today's large variety of multimedia formats, finding application in many different fields, confuses the usual consumers, because the software and hardware used is not always capable of understanding the formats and is not able to present the information in the expected way (if at all). Moreover, among the different application fields the quality requirements imposed on the multimedia data vary to a high degree: e.g. a video played back on the small display of a handheld device can hardly have the same dimensions and the same framerate as a movie displayed on a large digital cinema screen; a picture presented on a computer screen will differ from one downloaded to a cellular phone; a song downloaded from an Internet audio shop may be decoded and played back by a home set-top multimedia box, but it might not be consumable on a portable device.

I.3. Data Independence

The amount and popularity of multimedia data with its variety of applications, and on the other hand the complexity and diversity of coding techniques, have been consistently motivating the researchers and developers of database management systems. Complex data types such as image, audio and video data need to be stored and accessed. Most of the research on multimedia databases has considered only timeless media objects, i.e. the multimedia database management systems have been extended by services supporting the storage and retrieval of time-independent media data like still images or vector graphics. However, managing time-dependent objects like audio or video data requires sophisticated delivery techniques (including buffering, synchronization and consideration of time constraints) and efficient storage and access techniques (gradation of physical storage, caching with preloading, indexing, layering of data), which imposes completely new challenges on database management systems in comparison to those known from the EDP revolution of the 80s and 90s. Furthermore, new searching techniques for audio and video have to be proposed due to the enormously large amounts of data; e.g. context-based retrieval needs a new index methodology, because present indexing facilities are not able to process the huge amounts of media data stored.

While managing typical timeless data within one single database management system is possible, the handling of audio-video data is usually distributed over two distinct systems: a standard relational database and a specialized media server. The first is responsible for managing meta-data and searching, and the second provides data delivery, but together they should provide the objectives of a database management system mentioned in the previous section. The centralization of data and system services and multi-user operation with concurrent access are provided by many multimedia management systems. The concept of transactions is not really applicable to multimedia data (unless one considers the upload of a media stream as a transaction), and thus error treatment is solved by simple reload or restart mechanisms. Work on a declarative query language (SQL) has resulted in the multimedia extension to SQL:1999 (known as SQL/MM [Eisenberg and Melton, 2001]), but it still falls short of the possibilities offered by abstract data types for multimedia and is not implemented in any well-known commercial system. Finally, application neutrality and data independence have somehow been left behind, and there is a lack of solutions supporting them.

The data independence of multimedia data has been a research topic of Professor Meyer-Wegener since the early years of his research at the beginning of the 1990s. He started with the design of the media object storage system (MOSS) [Käckenhoff et al., 1994] as a component of a multimedia database management system. Next, he continued with research on a kernel architecture for next generation archives and realtime-operable objects (KANGAROO), allowing for media data encapsulation and media-specific operations [Marder and Robbert, 1997]. In the meantime, he worked on the integration of media servers with DBMSs and support of quality-of-service (IMOS) [Berthold and Meyer-Wegener, 2001], which was continued by Ulrich Marder in the VirtualMedia project [Marder, 2002]. Next, he supervised the Memo.real project, focused on media-object encoding by multiple operations in realtime [Lindner et al., 2000]. Finally, this work, started in 2002 and named the RETAVIC project, focuses on the provision of format independence by real-time audio-video conversion for multimedia database management systems. It was meant to be a mutual complement to the memo.REAL project (the continuation of Memo.real) [Märcz et al., 2003; Suchomski et al., 2004], which was started in 2001 but unfortunately had to be discontinued midway.

After these years of research, the data independence of multimedia data is still a very complex issue because of the variety of media data types [Marder, 2001]. The media data types are classified into two general groups: non-temporal and temporal (also known as timed, time-dependent, or continuous). Text, images, and 2D and 3D graphics belong to the non-temporal group, because time has no influence on the data. In contrast, audio (e.g. wave), music (e.g. MIDI), video and animation are temporal media data, because they are time-constrained. For example, an audio stream consists of discrete values (samples) usually obtained by sampling audio with a certain frequency, and a video stream consists of images (frames) that have also been captured with a certain frequency (called the framerate). The characteristic distinguishing these streams from non-temporal media data is the time constraint (the continuous property of a data stream), i.e. the occurrence of the data events (samples or frames) is ordered and the periods between them are usually constant [Suchomski et al., 2004]. Thus, providing data-independent access to various media types requires solutions specific to each media type or at least to each group (non-temporal vs. temporal).
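
To make the notion of a continuous stream more tangible, the following minimal sketch (illustrative only; the types, names and rates are assumptions and not part of the RETAVIC prototypes) derives the nominal period between data events and the presentation time of the n-th event from the stream rate, which is exactly the time constraint distinguishing temporal from non-temporal media:

    /* Illustrative sketch: a continuous media stream is an ordered sequence of
     * data events (audio samples or video frames) whose nominal inter-event
     * period follows directly from the stream rate.                          */
    #include <stdio.h>
    #include <stdint.h>

    typedef struct {
        const char *name;     /* e.g. "video" or "audio"                      */
        double      rate_hz;  /* framerate or sampling frequency              */
    } continuous_stream_t;

    /* Nominal period between two consecutive events, in microseconds. */
    static double period_us(const continuous_stream_t *s)
    {
        return 1e6 / s->rate_hz;
    }

    /* Presentation time of the n-th event relative to the stream start. */
    static double presentation_time_us(const continuous_stream_t *s, uint64_t n)
    {
        return n * period_us(s);
    }

    int main(void)
    {
        continuous_stream_t video = { "video", 25.0 };     /* 25 fps (PAL)     */
        continuous_stream_t audio = { "audio", 44100.0 };  /* 44.1 kHz audio   */

        printf("%s: period %.1f us, frame 100 at %.1f us\n",
               video.name, period_us(&video), presentation_time_us(&video, 100));
        printf("%s: period %.2f us, sample 100 at %.2f us\n",
               audio.name, period_us(&audio), presentation_time_us(&audio, 100));
        return 0;
    }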

Secondly, data independence in database management systems covers both format independence (logical) and storage independence (physical) [Connolly and Begg, 2005; Elmasri and Navathe, 2000]. Format independence defines format neutrality for the user application, i.e. the internal storage format may differ from the output format delivered to the application, but the application is not bothered by any internal conversion steps and simply consumes the understandable data as designed in the external schema. Storage independence defines the neutrality of physical storage and access paths by hiding the internal access and caching methodology, i.e. the application does not have to know how the data are stored physically (on disc, tape, CD/DVD), how they are accessed (index structures, hashing algorithms) or in which file system they are located (local access paths, network shares); it only knows the logical localization (usually represented by a URL retrieved from the multimedia database) from the external schema used by the application (a short illustrative sketch is given after the following list). Hence, the provision of data independence and application neutrality for multimedia data relies on many fields of research in computer science:

• databases and information systems (e.g. lossless storage and hierarchical data access, physical storage structures, format transformations),
• coding and compression techniques (e.g. domain transformations (DCT, MDCT) including lifting schemes (binDCT, IntMDCT), entropy repetitive-sequence coding (RLE), entropy statistical variable-length coding, CABAC, bit-plane coding (BPGC)),
• transcoding techniques (cascade transcoder, transform-domain transcoder, bit-rate transcoder, quality-adaptation transcoder, frequency-domain transcoder, spatial-resolution transcoder, temporal transcoder, logo insertion and watermarking),
• audio and video analysis (motion detection and estimation, scene detection, analysis of important macroblocks),
• audio- and video-specific algorithms (zig-zag and progressive scanning, intra- and inter-frame modeling, quantization with constant-step, matrix-based or function-dependent schemes, perceptual modeling, noise shaping),
• digital networks and communication with streaming technologies (time-constrained protocols such as MMS and RTP/RTSP, bandwidth limitations, buffering strategies, quality-of-service issues, AV gateways),
• and operating systems with real-time aspects (memory allocation, IPC, scheduling algorithms, caching, timing models, OS boot-loading).
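
Returning to the distinction drawn above, the following purely hypothetical sketch (none of these types or functions belong to an existing MMDBMS API; they are illustrative assumptions) makes both notions concrete: the application names only a logical object and a delivery format, while the physical path and the internal storage format stay hidden, and a conversion step is inserted transparently when needed.

    /* Hypothetical sketch of data-independent media access.                  */
    #include <stdio.h>

    typedef enum { FMT_LLV1, FMT_MPEG4_SP, FMT_H263 } format_t;

    typedef struct {
        const char *logical_id;   /* e.g. a URL handed out by the MMDBMS      */
        const char *phys_path;    /* hidden from the application (storage)    */
        format_t    internal_fmt; /* hidden from the application (format)     */
    } media_object_t;

    /* Stand-in for the conversion step; in RETAVIC this role is played by the
     * real-time transcoding chain.                                           */
    static void transcode(const media_object_t *o, format_t target)
    {
        printf("transcoding %s: internal format %d -> delivery format %d\n",
               o->logical_id, o->internal_fmt, target);
    }

    /* The only call the application sees: logical identifier + desired format. */
    static void deliver(const media_object_t *o, format_t target)
    {
        if (o->internal_fmt == target)
            printf("direct delivery of %s\n", o->logical_id); /* no conversion */
        else
            transcode(o, target);                             /* hidden step   */
    }

    int main(void)
    {
        media_object_t clip = { "mmdb://videos/42", "/raid0/seg/000042.llv1", FMT_LLV1 };
        deliver(&clip, FMT_MPEG4_SP);  /* the application never learns about LLV1 */
        return 0;
    }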

None of the solutions created for research purposes so far can be considered complete with respect to the provision of data independence for continuous data in an MMDBMS. The RETAVIC project [Suchomski et al., 2005] and the co-related memo.REAL project [Lindner et al., 2000; Suchomski et al., 2004] were founded with the aim of developing a solution for multimedia database management systems or media servers that provides data independence in the context of capturing, storage of and access to potentially large media objects, and real-time quality-aware delivery of continuous media data, comprising a modeling of converters, transcoding and processing with graph-based models. However, considering all the aspects from the mentioned fields of computer science, the complexity of providing data independence and application neutrality based on real-time processing and QoS control is enormous; e.g. it requires the design of a real-time capable file system, network adapters and infrastructure, and the development of a real-time transformation framework. Therefore, the problem considered in this work has been limited to a subset of the mentioned aspects on the server side, namely to a solution for the provision of format independence for audio and video data by transformations on the MMDBMS side. Many issues of operating systems (e.g. real-time file access, storage design) and of network and communication systems have been left out.

I.4. Format Independence

Format independence can be compared to the Universal Multimedia Access (UMA) idea [Mohan et al., 1999]. In UMA, it is assumed that some amount of audiovisual (AV) signal is transmitted over different types of networks (optical, wireless, wired) to a variety of AV terminals that support different formats. The core assumption of UMA is to provide the best QoS or QoE by either selecting the appropriate content formats, adapting the content format directly to meet the playback environment, or adapting the content playback environment to accommodate the content [Mohan et al., 1999]. The key problem is to fix the mismatch between rich multimedia contents, networks and terminals [Mohan et al., 1999]; however, it has not been specified how to do this. On the other hand, UMA goes beyond the borders of multimedia database management systems and proposes to do format transformations within the network infrastructure and with dedicated transcoding hardware. This, however, makes the problem of format independence even more complex due to too many factors deriving from the variety of networks, hardware solutions, and operating systems. Moreover, introducing real-time QoS control within the scope of a global distribution area is hardly possible, because the networks have their constraints (bandwidth) and the terminals their own hardware (processing power and memory capacity) and software capabilities (supported formats, running OS). Including all these aspects and supporting all applications in one global framework is hardly possible, so this work focuses only on the part connected with the MMDBMS, i.e. format independence provision and application neutrality within the MMDBMS, where the heterogeneity of the problem is kept at a reasonably low level.

There are three perspectives in the research on the format independence of continuous media data and on application neutrality (a detailed discussion is provided in section II.3). The first one uses multiple copies of the same media object in many formats and various qualities. The second covers adaptation with scalable formats, where the quality is adapted during transmission. The third one, presented in this work, considers on-demand conversion from internal format(s) to miscellaneous delivery formats, analogous to the UMA idea of transparent transformation of media data, but only within the MMDBMS [Marder, 2000].

Storing videos in different formats and dimensions seems as reasonable as transmitting them in a unique format and with fixed dimensions (and then adapting the quality and format on the receiver's side). However, the waste of storage and network resources is huge: in the first case the replicas occupy extra space on the storage, and in the second case the bandwidth of the transmission channel, regardless of whether it is wireless or wired, is wasted. Moreover, none of these two solutions provides full format independence. Only an audio-video conversion allowing for quality adaptation during processing and for transformation to the required coding scheme would allow for full format independence.

I.5. AV Conversion Problems

However, there are many problems when considering audio-video conversion. First, the time characteristic of the continuous data requires that the processing is controlled not only according to functional correctness but also according to time correctness. For example, a video frame converted and delivered after its presentation time is useless, just as listening to audio with samples ordered in a different time order than in the original sequence is senseless. Secondly, the conversion algorithms (especially compression) for audio and video data are very complex and computationally demanding (they require a lot of CPU processing power). Therefore, an optimization of the media-specific transformation algorithms is required. Thirdly, the processing demand of the conversion varies in time and is heavily dependent on the media content, i.e. one part of the media data may be easily convertible with small effort, while another contains high energy of the audio or visual signal and requires much more computation during compression. Thus the resource allocation must be able to cope with a varying load, or the adaptation of the processing and quality-of-service (QoS) support must be included in the conversion process, i.e. the processing elements of the transcoding architecture, such as decoders, filters and encoders, should provide a mechanism to check requests and to guarantee a certain QoS for such requests once their feasibility has been tested positively.
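
To illustrate the last point, the following minimal sketch (in C, with purely hypothetical structures and numbers, not the RETAVIC interface) shows the basic idea of such a feasibility check: a converter admits a transcoding request only if the predicted worst-case demand fits into the remaining CPU budget, and only then gives a QoS guarantee.

/* Hypothetical admission test for a converter: reserve CPU budget only if
 * the predicted worst-case demand of the request is feasible. */
#include <stdbool.h>

struct transcode_request {
    double frame_rate;          /* frames per second to be delivered           */
    double cost_per_frame_ms;   /* predicted worst-case CPU cost per frame     */
};

struct node_budget {
    double capacity_ms_per_s;   /* CPU time available per second of real time  */
    double reserved_ms_per_s;   /* already promised to admitted requests       */
};

static bool admit(struct node_budget *b, const struct transcode_request *r)
{
    double demand = r->frame_rate * r->cost_per_frame_ms;   /* ms of CPU per second */
    if (b->reserved_ms_per_s + demand > b->capacity_ms_per_s)
        return false;                     /* infeasible: reject, no QoS guarantee   */
    b->reserved_ms_per_s += demand;       /* feasible: reserve and guarantee QoS    */
    return true;
}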

I.6. Assumptions and Limitations

The MMDBMS should support not only format independence but also application neutrality. This means that different aspects and requirements of applications should already be considered in the process of multimedia database design. One such key aspect is long-term storage involving no loss of information, i.e. the data should keep the original information for generations. As such, the MMDBMS must be capable of supporting a lossless internal format and lossless transformations between the internal and the external (delivery) format. It is assumed that a lossy frame/window of samples is a subset of a lossless frame/window of samples, so being able to provide lossless properties also means being able to deliver lossy characteristics.

Moreover, it is assumed that huge sets of data are stored in the MMDBMS. In such collections, the access frequency of a media object is low and only a few clients access the same data at a time (contrary to VoD, where many clients access the same data very often). Examples of such media data sets are scientific collections of video and images, digital libraries with scanned documents, police archives of fingerprints and criminal photography, and videos from surgery and medical operations (e.g. brain surgery or coronary artery (heart) bypassing).

Furthermore, the clients accessing media data differ from each other and have different requirements with respect to format and quality. The quality requirements may range from lossless quality with full information to very low quality with limited but still understandable information. An ideal system would deliver all possible formats; in reality, however, only formats implemented in the transformation process can be supported by the MMDBMS.

Embedded systems with dedicated hardware would eventually provide a fast environment for audio-video conversion supporting various formats, but they are neither flexible nor inexpensive, so this work aims at a software-only solution for the real-time transcoding architecture. Moreover, if conversion on the client side were assumed, bandwidth of the transmission channels would be wasted; besides, it would hardly be possible to do the conversion on power-sensitive and insufficiently powerful mobile devices, which are usually optimized for decoding one specific format (the manufacturer's implementation of the decoder).

I.7. Contribution of the RETAVIC Project

The central contribution of this dissertation is a proposal of the conceptual model of the real-time audio-video conversion architecture, which includes: a real-time capturing phase with fast, simple lossless encoding and a media buffer; a non-real-time preparation phase with conversion to the internal format and content analysis; a storage phase with lossless, scalable binary formats and a meta-data set definition; and a real-time transcoding phase with real-time capable quality-adaptive decoding and encoding. The key assumption in the real-time transcoding phase is the exploitation of the meta-data set describing a given media object for feasibility analysis, scheduling and controlling the process. Moreover, the thesis proposes media-specific processing models based on the conceptual model and defines hard-real-time adaptive processing. The need for a lossless, scalable binary format led to the Layered Lossless Video format (LLV1) [Militzer et al., 2005] being designed and implemented within this project. The work is backed by an analysis of the requirements for the internal media storage format for audio and video and by a review of the format independence support in current multimedia management systems, i.e. multimedia database management systems and media servers (streaming servers). The proof of concept is given by the prototypical real-time implementation of the critical parts of the transcoding chain for video, which was evaluated with respect to functional, quantitative and qualitative properties.

I.8. Thesis Outline

In Chapter 2, the related work is discussed. It is divided into two big sections: the fundamentals and frameworks, being the core related work (Section II), and the data format and real-time operating system issues, being loosely coupled related work (Section III).

In Chapter 3, the design is presented. At first, the conceptual model of format independence provision is proposed in Section IV. It includes real-time capturing, non-real-time preparation, storage, and real-time transcoding. In Section V, the video processing model is described: LLV1 is introduced, the video-specific meta-data set is defined in detail, and real-time decoding and encoding are presented. Analogously, in Section VI, the audio processing model with MPEG-4 SLS, the audio-specific meta-data and audio transcoding are described. Finally, in Section VII, the real-time issues in the context of continuous media transcoding are explained. Prediction, scheduling, execution and adaptation are the subjects considered there.

In Chapter 4, the details about the best-effort prototypes and the real-time implementation are given. Section VIII points out the core elements of the RETAVIC architecture and states what the target of the implementation phase was. Section IX describes the real-time implementations of two representative converters, respectively RT-MD-LLV1 and RT-MD-XVID. The control of the real-time processes is also described in this section.

In Chapter 5, the proof of concept is given. The evaluation and measurements are presented in Section X: the evaluation process, a discussion of the measurement accuracy, and the evaluation of the real-time converters are covered. The applicability of the RETAVIC architecture is discussed in Section XI. Moreover, a few variations and extensions of the architecture are mentioned.

Finally, the summary and an outlook on further work are included in Chapter 6. The conclusions are listed in Section XII and the further work is covered by Section 0. Additionally, there are a few appendices detailing some of the related issues.

Chapter 2 – Related Work

One thing I have learned in a long life: that all our science, measured against reality, is primitive and childlike—and yet it is the most precious thing we have. Albert Einstein (1955, Speech for Israel in Correspondence on a New Anti-War Project)

Even though many fields of research have been named as related areas in the introduction, this chapter covers only the most relevant issues, which are referred to later in this dissertation. The chapter is divided into two sections: an essential one, being the most important related work, and a surrounding one, describing adjacent but still important related work.

II. FUNDAMENTALS AND FRAMEWORKS

The essential related work covers terms and definitions, multimedia delivery, format independence methods and applications, and transformation frameworks. First, there are definitions provided directly from other sources as well as keywords with a meaning refined for the RETAVIC context – both are given in the Terms and Definitions section. They are grouped into data-related, processing-related, and quality-related terms. Next, the issues of delivering multimedia data such as streaming, size and time constraints, and buffering are described. The three possible methods of providing format independence (already shortly mentioned in the introduction), which are later referred to as approaches, are discussed subsequently. After that, comments on some video and audio transformation frameworks are given. And last but not least, the related research on format independence with respect to multimedia management systems is considered.

II.1. Terms and Definitions

Most of the following definitions used within this work are collected in [Suchomski et al., 2004], and some of them are adopted or refined for the purposes of this work. There are also some terms added from other works or newly created to clarify the understanding. All terms are listed and explained in detail in Appendix A.

The terms and definitions fall into three groups: data-related, processing-related and quality-related. They are grouped as follows:

a) data-related: media data, media object (MO), multimedia data, multimedia object (MMO), meta data (MD), quant, timed data (time-constrained data, time-dependent data), continuous data (periodic or sporadic), data stream (shortly stream), continuous MO, audio stream, video stream, multimedia stream, continuous MMO, container format, compression/coding scheme;

b) processing-related: transformation (lossy, lossless), multimedia conversion (media-type, format and content changers), coding, decoding, converter, coder/decoder, codec, (data) compression, decompression, compression efficiency (coding efficiency), compression ratio, compression size, transcoding, heterogeneous transcoding, transcoding efficiency, transcoder, cascade transcoder, adaptation (homogeneous or unary-format transcoding), chain of converters, graph of converters, conversion model of (continuous) MMO, (error) drift;

c) quality-related: quality, objective quality, subjective quality, Quality-of-Service (QoS), Quality-of-Data (QoD), transformed QoD (T(QoD)), Quality-of-Experience (QoE).

This work uses the above terms extensively; thus the reader is referred to Appendix A in case of doubts or problems with understanding.

II.2. Multimedia Data Delivery

The delivery of multimedia data is different from conventional data delivery, because a multimedia stream is, due to its size, not transferred as a whole from the data storage server (such as an MMDBMS or a media server) to the client. Usually, the consuming application on the client side starts displaying the data almost immediately after their arrival. Hence, the data do not have to be complete, i.e. not all parts of the multimedia data have to be present on the consumer side, but just those parts required for displaying at a given time. This delivery method is known as streaming [Gemmel et al., 1995].

Streaming multimedia data to clients differs in two ways from transferring conventional data [Gemmel et al., 1995]: 1) the amount of data and 2) the real-time constraints. For example, a two-hour MPEG-4 video compressed with an average throughput of 3.96 Mbps1 needs more than 3.56 GB of disk space, and analogously, the accompanying two-hour stereo audio signal in the MPEG-1 Layer III format with an average throughput of 128 kbps requires 115 MB. And even though modern compression schemes allow for compression ratios of 1:50 [MPEG-4 Part V, 2001] or even more, storing and delivering many media files is not a trivial task, especially if higher resolutions and frame rates are considered (e.g. HDTV 1080p). Please note that even for highly compressed, high-quality AV streams the data rates vary between 2 and 10 Mbps.
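
The figures above can be verified with simple arithmetic; the small C program below recomputes them. It assumes the convention of footnote 1, where "M" denotes 2^20 bits – with decimal prefixes the raw PAL rate would be about 124.4 Mbps instead of 118.5 Mbps, and the two-hour video would occupy about 3.56 GB instead of 3.48 GiB.

/* Back-of-the-envelope recalculation of the storage and bit-rate figures. */
#include <stdio.h>

int main(void)
{
    const double MBIT = 1024.0 * 1024.0;             /* binary mega-bit (footnote 1) */
    double raw_bps  = 720.0 * 576.0 * 12.0 * 25.0;   /* PAL 4CIF, YV12, 25 fps       */
    double comp_bps = raw_bps / 30.0;                /* 1:30 compression (MPEG-2)    */
    printf("raw PAL rate: %.1f Mbps\n", raw_bps / MBIT);    /* ~118.7 */
    printf("compressed:   %.3f Mbps\n", comp_bps / MBIT);   /* ~3.955 */

    double secs  = 2.0 * 3600.0;                     /* two hours                    */
    double video = 3.96 * MBIT * secs / 8.0;         /* bytes of video               */
    double audio = 128.0 * 1024.0 * secs / 8.0;      /* bytes of MP3 audio           */
    printf("video: %.2f GiB, audio: %.1f MiB\n",
           video / (1024.0 * 1024.0 * 1024.0),       /* ~3.48 GiB                    */
           audio / (1024.0 * 1024.0));               /* ~112.5 MiB                   */
    return 0;
}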

Secondly, the delivery becomes even more difficult because of the real-time constraints. For conventional files, variations of the transfer rate are negligible as long as all bits are transferred correctly; for streaming, in contrast, the most important factor is that a certain minimum transfer rate is sustained [Gemmel et al., 1995]. If we take the above example of a compressed audio-video stream, the transfer rate must at least equal the combined average throughput of 4.09 Mbps in order to allow the client to consume the multimedia stream. However, even this constraint is not sufficient. Variations of the transfer rate, which may occur due to network congestion or the server's inability to deliver the data on time, usually lead to a stuttering effect during play-out on the client side (i.e. the data required to continue playing have not arrived yet).

1 For example, a PAL video (4CIF) in the YV12 color scheme (12 bits per pixel) with a resolution of 720 x 576 pixels and 25 fps results in a requirement of about 118.5 Mbps. A compression ratio of 1:30, which is reasonable for MPEG-2 compression, gives for this video a value of 3.955 Mbps.

Buffering on the client side is used to reduce the network-throughput fluctuation problem by fetching multimedia data into a local cache and starting the play-out only when the buffer cache is filled up [Gemmel et al., 1995]. If the transfer rate suddenly decreases, the player can still use the data from the buffer, which is filled again when the transfer rate has recovered. On the other hand, the larger the buffer on the client side, the bigger the latency before the play-out starts, so the buffer size should be as low as required. Buffering overcomes short bottlenecks deriving from network overloads, but if the server cannot deliver data fast enough for a longer period of time, the media playback is affected anyway [Gemmel et al., 1995]. Therefore, enough resources have to be allocated and guaranteed on the server, and the resource allocation mechanism has to detect when the transfer reaches its limits and in such a case disallow further connections.
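
The effect of client-side buffering can be illustrated with the following minimal simulation (hypothetical rates and a one-second time step, not a model of any real player): a short drop of the arrival rate is absorbed by the pre-filled buffer, whereas a longer drop still leads to an underrun and thus to stuttering.

/* Toy play-out buffer: pre-fill, then consume at the stream rate. */
#include <stdio.h>

int main(void)
{
    const double consume = 4.09;   /* Mbit consumed per second of play-out       */
    const double prefill = 16.0;   /* ~4 s of data buffered before play-out      */
    /* arrival rate per second: normal, a 6-second congestion, then normal again */
    double arrival[20] = {5,5,5,5, 1,1,1,1,1,1, 5,5,5,5,5,5,5,5,5,5};
    double buffer = 0.0;
    int playing = 0;

    for (int t = 0; t < 20; t++) {
        buffer += arrival[t];
        if (!playing && buffer >= prefill)
            playing = 1;                      /* start play-out once pre-filled   */
        if (playing) {
            if (buffer >= consume)
                buffer -= consume;            /* smooth play-out from the buffer  */
            else {
                printf("t=%2d: underrun -> stuttering\n", t);
                buffer = 0.0;
            }
        }
    }
    return 0;
}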

Another approach is buffering within the network infrastructure, e.g. on caching and/or transcoding proxies in a tree-based distribution network [Li and Shen, 2005]. Various caching policies for the placement and replacement of media objects exist; one promising2 policy is caching in transcoding proxies for tree networks (TCTN), an optimized strategy using a dynamic-programming-based solution that considers transcoding costs through weighted transcoding graphs [Li and Shen, 2005]. In general, the problems of server unavailability mentioned before could be solved by buffering on the proxies; however, the complexity of an application coordinating the media transcoding grows enormously, because all network aspects and different proxy platforms must be considered.

2 At least, it outperforms the least recently used (LRU), least normalized cost replacement (LNC-R), aggregate effect (AE) and web caching in transcoding proxies for linear topology (TCLT) strategies.

II.3. Approaches to Format Independence

As already mentioned in the introduction, there are three research approaches in different fields using multimedia data which could be applied to the provision of format independence. These three approaches are defined within this work as: redundancy, adaptation and transcoding. They have been developed without application neutrality in mind and usually focus on specific requirements. One exception is the Digital Item Adaptation (DIA) based on the Universal Media Access (UMA) idea, which is discussed later.

II.3.1. Redundancy Approach

The redundancy approach provides format independence through multiple copies of the multimedia data, which are kept in many formats and/or various qualities. In other words, the MO is kept in a few preprocessed instances, which may have the same coding scheme but different predefined quality characteristics (e.g. a lower resolution), may have a different coding scheme but the same quality characteristics, or may differ in both the coding scheme and the quality characteristics. The disadvantages of this method are as follows:

• waste of storage – due to multiple copies representing the same multimedia information,
• only a partial format independence solution – i.e. it covers only a limited set of quality characteristics or coding schemes, because it is impossible to prepare copies in all potential qualities and coding schemes supporting yet undefined applications,
• very high start-up cost – in order to deliver different qualities and coding schemes, the multimedia data must be preprocessed with a cost of O(m·n·o) complexity (see the small example after this list), i.e. the number of MO instances, and thus their preparation, depends directly on:

  - m – the number of multimedia objects,
  - n – the number of provided qualities,
  - o – the number of provided coding schemes.
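
A quick calculation (with hypothetical collection sizes) makes the start-up cost tangible:

/* O(m*n*o): every object is pre-converted into every quality/coding-scheme
 * combination before it can be served. */
#include <stdio.h>

int main(void)
{
    long m = 10000;   /* media objects in the collection */
    long n = 3;       /* provided quality levels         */
    long o = 4;       /* provided coding schemes         */
    printf("instances to prepare and store: %ld\n", m * n * o);   /* 120000 */
    return 0;
}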

The biggest advantage is the relatively small cost of delivery and the possibility of using distribution techniques such as multicasting or broadcasting. The redundancy approach could be used as an imitation of format independence, but only in a limited set of applications where just a few classes of devices exist (e.g. Video-on-Demand). Therefore, this approach is not sufficient for format independence provision by an MMDBMS and is applicable only as an additional extension for optimization purposes at the cost of storage.

II.3.2. Adaptation Approach

Another approach, which could partly serve as a format independence solution, is the adaptation approach. Its goal is not to provide data independence (i.e. any format requested by the application), but rather to adapt an existing (scalable) format to the network environment and the end-device capabilities. Here, adaptation points at the borders of networks are defined (usually called media proxies or media gateways), which are responsible for adapting the transmitted multimedia stream to the end-point consumer devices, or rather to a given class of consumer devices. This brings the disadvantage of wasting network resources due to the bandwidth over-allocation of the delivery channel from the central point of distribution to the media proxies at the network borders. A second important disadvantage is that data are only ever dropped: the more proxies adapt the data between server and client, the lower the delivered data quality. A third drawback is the dependency on a static solution, because the scalable format cannot easily be changed once it has been chosen.

Three adaptation architectures of increasing complexity have been distinguished [Vetro, 2001]3; they are mainly useful for non-scalable formats (such as MPEG-2 [MPEG-2 Part II, 2001]):

1) Simple open-loop system – simply cutting the variable-length codes corresponding to the high-frequency coefficients (i.e. the higher-level data carrying less important information), used for bit-rate reduction; this involves only variable-length parsing (and no decoding); however, it produces the biggest error drift;

2) Open-loop system with partial decoding to un-quantized frequency domain – where the operations (e.g. re-quantization) are conducted on the decoded transformed coefficients in the frequency domain;

3) Closed-loop systems – where decoding down to the spatial/time domain (with pixel or sample values) is conducted in order to compensate the drift for the newly re-quantized data; however, only one coding scheme is involved due to the design (see the definition of the adaptation term); this is called the simplified spatial domain transcoder (SSDT).

3 The research work of Vetro focuses only on video data and these architectures are called "classical transcoding" architectures. However, within this work, Vetro's "classical transcoding" reflects the adaptation term.

These three architectures have been refined by [Sun et al., 2005]. Vetro's first architecture is called Architecture 1 – Truncation of the High-Frequency Coefficients and the second one Architecture 2 – Requantizing the DCT Frequency Coefficients; both are subject to drift due to their simplicity and operation in the frequency domain. They are useful for applications such as trick modes and extended-play recording in digital video recorders. There are also some optimizations proposed through constrained dynamic rate shaping (CDRS) or general (unconstrained) dynamic rate shaping (GDRS) [Eleftheriadis and Anastassiou, 1995], which operate directly on the coefficients in the frequency domain. The Architecture 3 – Re-Encoding with Old MVs and Mode Decisions and the Architecture 4 – Re-Encoding with Old MVs (and New MD) are resistant to drift error and reflect the closed-loop system from [Vetro, 2001]. They are useful for VoD and statistical multiplexing of multiple channels. Here an optimization is also proposed by simplifying the SSDT and creating a drift-free Frequency Domain Transcoder at reduced computational complexity, which performs the motion-compensation step in the frequency domain through approximate matrices computing the MC-DCT residuals [Assunncao and Ghanbari, 1998].
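
As an illustration of the open-loop idea (Architecture 2 above), the following sketch re-quantizes the DCT coefficients of one block with a coarser step directly in the frequency domain; it is a generic illustration, not the inverse-quantization formula of any particular codec, and it shows why drift appears: the prediction loop is never closed.

/* Open-loop re-quantization of one 8x8 block of quantized DCT levels. */
static void requantize_block(int level[64], int q_old, int q_new)
{
    for (int i = 0; i < 64; i++) {
        int coeff = level[i] * q_old;                            /* de-quantize */
        int sign  = (coeff < 0) ? -1 : 1;
        level[i]  = sign * ((sign * coeff + q_new / 2) / q_new); /* re-quantize */
    }
}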

Another adaptation approach is to use existing scalable formats and to operate directly on the coded bit stream by dropping the unnecessary parts. This significantly reduces the complexity of the adaptation process: the scalable-format adaptation is much simpler than the adaptations mentioned before because it does not require any decoding of the bit stream, i.e. it operates in the coded domain. A few scalable formats have been proposed recently, for example MPEG-4 Scalable Video Coding [MPEG-4 Part X, 2007] or MPEG-4 Scalable to Lossless Coding [Geiger et al., 2006; MPEG-4 Part III FDAM5, 2006], and a few scalable extensions to existing formats have been defined as well, e.g. the Fine Granularity Scalability (FGS) profile of the MPEG-4 standard [MPEG-2 Part II, 2001]. However, there is the unquestionable disadvantage of additional costs in coding efficiency caused by the additional scalability information included within the bit stream.
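
For scalable formats the adaptation can stay entirely in the coded domain, as the following hedged sketch shows (hypothetical per-layer bit rates and interface): the sender simply keeps the base layer and as many enhancement layers as fit the target rate, and drops the rest without decoding anything.

/* Select how many layers of a scalable stream fit into a target bit rate. */
static int select_layers(const double layer_bps[], int num_layers, double target_bps)
{
    double used = 0.0;
    int keep = 0;
    for (int i = 0; i < num_layers; i++) {
        if (used + layer_bps[i] > target_bps)
            break;                     /* this and all higher layers are dropped    */
        used += layer_bps[i];
        keep++;                        /* base layer first, then enhancement layers */
    }
    return keep;
}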

The biggest disadvantage of all adaptation approaches, whether using scalable or non-scalable formats, is still the assumption that the storage format has to be defined or standardized with respect to the end-user applications. Or even worse, if the problem is treated from the distributor's perspective, only consumers compliant with the standard storage format can be supported by the adaptation architecture. As such, there remains the drawback of having just a partial format independence solution, similar to the one present in the redundancy approach. And even though one could compare transcoding to scalable coding with adaptivity [Vetro, 2003], the adaptation of one universal format to different qualities is not considered a fully-fledged data independence solution usable by an MMDBMS.

II.3.3. Transcoding Approach

The last, but not least, approach is the transcoding approach. It is based on multimedia conversion from the internal (source) format to the external (requested) format. In general, there are two methodologies: on-demand (on-line) and off-line. The off-line approach is a best-effort solution where no delivery time is guaranteed and the MO is converted to the requested format completely before delivery starts. Obviously, this introduces huge latencies in response time. The on-demand method is meant for delivering multimedia data right after the client request is received, i.e. the transcoding starts as soon as the request for data appears. In this case, there are two types of on-line transformations: real-time and best-effort. While in the first one a sophisticated mechanism for QoS control may be implemented, in the second case execution and delivery guarantees cannot be given.

On the other hand, [Dogan, 2001] discusses video transcoding in two aspects: homogeneous and heterogeneous. Homogeneous video transcoders only change the bit rate, frame rate, or resolution, while heterogeneous video transcoding allows for transformations between different formats, coding schemes and network topologies, e.g. between different video standards like H.263 [ITU-T Rec. H.263+++, 2005] and MPEG-4 [MPEG-4 Part II, 2004]. In analogy to Dogan's definitions, homogeneous transcoding is treated as the adaptation of the previous section, heterogeneous transcoding is discussed subsequently within this section, and both aspects are considered with respect to audio as well as video.

The transcoding approach is exploited within the MPEG-21 Digital Item Adaptation (DIA) standard [MPEG-21 Part VII, 2004], which is based on the Universal Media Access (UMA) idea and tries to cope with the "mismatch" problem. As stated in the overview of DIA [Vetro, 2004], the DIA is "a work to help alleviate some of burdens confronting us in connecting a wide range of multimedia content with different terminals, networks, and users. Ultimately, this work will enable (…) UMA.". The DIA defines an architecture for Digital Item4 [MPEG-21 Part I, 2004] transformation delivering format-independent mechanisms that provide support in terms of resource adaptation, description adaptation, and/or QoS management, which are collectively referred to as DIA Tools. The DIA architecture is depicted in Figure 1. However, the DIA describes only the tools assisting the adaptation process but not the adaptation engines themselves.

Figure 1. Digital Item Adaptation architecture [Vetro, 2004].

The DIA also covers other aspects related to the adaptation process. The usage environment tools describe four environmental system conditions: the terminal capabilities (incl. codec capabilities, I/O capabilities, and device properties), the network characteristics (covering static network capabilities and network conditions), the user characteristics (e.g. the user's information, preferences and usage history, presentation preferences, accessibility characteristics and location characteristics such as mobility and destination), and the natural environment characteristics (current location and time or the audio-visual environment). Moreover, the DIA architecture proposes not only multimedia data adaptation, but also bitstream syntax description adaptations, which could easily be called meta-data adaptations, being on a higher logical level and allowing the adaptation process to work in the coded bitstream domain in a codec-independent manner. An overview of the adaptation on the bitstream level is depicted in Figure 2.

4 A Digital Item is understood as the fundamental unit of distribution and transaction within the MPEG-21 multimedia framework, representing the "what" part. As such, it reflects the definition of the media object within this work, i.e. DI is equal to MO. The detailed definition of what the DI exactly is can be found in Part 2 of MPEG-21 containing the Digital Item Declaration (DID) [MPEG-21 Part II, 2005], which is divided into three normative sections: model, representation, and schema.

Figure 2. Bitstream syntax description adaptation architecture [Vetro, 2004].

To summarize, if the DIA really supported a format-independent mechanism for adaptation, i.e. allowed for transformation from one coding scheme to a different one (which seems to be the case, at least from the standard description), it should not be called adaptation anymore but rather Digital Item Transformation, and then it is to be treated without any doubt as a transcoding approach and definitely not as an adaptation approach.

II.3.3.1 Cascaded transcoding

The cascaded transcoding approach is understood in the context of this work as a straightforward transformation using a cascade transcoder, with complete decoding from one coding scheme and complete encoding to another one, and it has the maximum complexity of the described solutions [Sun et al., 2005]. Due to this complexity, conversion-based format independence built on the cascaded approach has not been widely accepted as usable in real applications. For example, when transforming one video format into another, a full recompression of the video data demanding expensive decoding and encoding steps5 is required. So, in order to achieve reasonable processing speed, modern video encoders (e.g. the popular XviD MPEG-4 encoder) employ sophisticated block-matching algorithms instead of a straightforward full search in order to reduce the complexity of motion estimation (ME). Often, predictive algorithms like EPZS [Tourapis, 2002] are used, which offer a 100-5000 times speed-up over a full search while achieving similar picture quality. The performance of predictive search algorithms, however, highly depends on the characteristics of the input video (and is especially low for sequences with irregular motion). This content-dependent and unpredictable characteristic of the ME step makes it very difficult to predict the behaviour of a video encoder, and thus interferes with making video encoders part of a real-time transcoding process within the cascaded transcoder.

5 In multimedia storage and distribution, asymmetric compression techniques are usually used. This means that the effort spent on encoding is much higher than on decoding – the ratio may reach 10 times and more.
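
The cost of a straightforward full search can be sketched as follows (illustrative C code, not the XviD or EPZS implementation; frame border handling is omitted): for a single 16x16 macroblock and a +/-16 pixel search window, (2*16+1)^2 = 1089 SAD evaluations are needed, which is exactly the work that predictive algorithms avoid – at the price of content-dependent, hard-to-predict execution times.

/* Full-search block matching with the sum of absolute differences (SAD). */
#include <stdlib.h>

static int sad16(const unsigned char *cur, const unsigned char *ref, int stride)
{
    int sad = 0;
    for (int y = 0; y < 16; y++)
        for (int x = 0; x < 16; x++)
            sad += abs(cur[y * stride + x] - ref[y * stride + x]);
    return sad;
}

static void full_search(const unsigned char *cur, const unsigned char *ref,
                        int stride, int *best_dx, int *best_dy)
{
    int best = 1 << 30;
    for (int dy = -16; dy <= 16; dy++)
        for (int dx = -16; dx <= 16; dx++) {
            int sad = sad16(cur, ref + dy * stride + dx, stride);
            if (sad < best) { best = sad; *best_dx = dx; *best_dy = dy; }
        }
}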

II.3.3.2 MD-based transcoding

The unpredictability of classical transcoding can be eliminated without compromising compression efficiency by using meta-data (MD) [Suchomski et al., 2005; Vetro, 2001]. There are plenty of proposals for simplifying the transcoding process by exploiting MD-based multimedia conversion. For example, the meta-data guiding the transcoder specifically for video content are divided into low-level and high-level features [Vetro et al., 2001], where the low-level features refer to color, motion, texture and shape, and the high-level features may include storyboard information or the semantics of the various objects in the scene. Other research in the video-processing and networking fields has discussed specific transcoding problems in more detail [Sun et al., 2005; Vetro et al., 2003], like object-based transcoding, MPEG-4 FGS-to-SP, or MPEG-2-to-MPEG-4, and MD-based approaches in particular have been proposed by [Suzuki and Kuhn, 2000] and [Vetro et al., 2000].

[Suzuki and Kuhn, 2000] proposed the difficulty hint, which assists in bit allocation during transcoding by providing information about the complexity of one segment with respect to the other segments. It is represented as a weight in the range [0,1] for each segment (a frame or a sequence of frames), obtained by normalizing the bits spent on this segment by the total bits spent on all segments. Thus, the rate-control algorithm may use this hint for controlling the transcoding process and optimizing the temporal allocation of bits (if variable bit rate (VBR) is allowed). One constraint should be considered during the calculation of the hint, namely that it should be calculated at a fine QP; still, the results may vary [Sun et al., 2005]. The specific application of the difficulty hint and some other issues such as motion uncompensability and search-range parameters are further discussed in [Kuhn and Suzuki, 2001].
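
The difficulty hint itself reduces to a simple normalization, sketched below (generic C, not the reference implementation): the per-segment bit counts are turned into weights in [0,1] that sum to one and can then drive the bit allocation of the rate control.

/* Difficulty hints: bits spent per segment normalized by the total bits. */
static void difficulty_hints(const double bits[], int n, double hint[])
{
    double total = 0.0;
    for (int i = 0; i < n; i++)
        total += bits[i];
    for (int i = 0; i < n; i++)
        hint[i] = (total > 0.0) ? bits[i] / total : 0.0;   /* weight in [0,1] */
}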

[Vetro et al., 2000] proposes the shape change and motion intensity hints, which are especially usable for supporting data dependency in object-based transcoding. The well-known problem of variable temporal resolution of the objects within a visual sequence has already been investigated in the literature and two shape-distortion measures have been proposed: the Hamming distance (the number of differing pixels between two shapes) and the Hausdorff distance (a maximum function between two sets of pixels based on the Euclidean distance between the points). [Vetro, 2001] proposes to derive the shape change hint from one of these measures after normalization to the range [0,1]: by dividing by the number of pixels within the object for the Hamming distance, or by the maximum width or height of the rectangle bounding the object or the frame for the Hausdorff distance. The motion intensity hint is defined as a measure of the significance of the object and is based on the intensity of motion activity6 [MPEG-7 Part III, 2002], the number of bits, a normalizing factor reflecting the object size, and a constant (greater than zero) usable for zero-motion objects [Vetro et al., 2000]. Larger values of the motion intensity hint indicate a higher importance of the object and may be used for decisions on the quantization parameter or on skipping with respect to each object.
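
A minimal sketch of the shape change hint derived from the Hamming distance follows; the binary-mask representation and the clipping to [0,1] are assumptions of this illustration, not taken from [Vetro, 2001].

/* Shape change hint: differing mask pixels normalized by the object size. */
static double shape_change_hint(const unsigned char *mask_a,
                                const unsigned char *mask_b,
                                int num_pixels, int object_pixels)
{
    int differing = 0;
    for (int i = 0; i < num_pixels; i++)
        if (mask_a[i] != mask_b[i])
            differing++;                                    /* Hamming distance        */
    double hint = (object_pixels > 0) ? (double)differing / object_pixels : 0.0;
    return (hint > 1.0) ? 1.0 : hint;                       /* clip to [0,1] (assumed) */
}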

A collective solution has been proposed by MPEG-7, where the MediaTranscodingHints descriptor in the MediaProfile descriptor of the Media Description Tools [MPEG-7 Part V, 2003] has been defined. The transcoding properties proposed there fit the existing implementations of encoders, as required for MMDBMS format independence provision, only partly. Among others, the MediaTranscodingHints define useful general motion, shape and difficulty coding hints7. On the other hand, a property like "intraFrameDistance" [MPEG-7 Part V, 2003] is not a good hint for encoding, since scene changes may appear unpredictably (and usually intra frames are used then). Thus, the intraFrameDistance should be treated rather as a constraint than as a hint. Regardless, in comparison to the MD set defined later within this work, a few parameters of MPEG's MDT are similar (e.g. importance is somehow related to the MB priority) and some very important ones are still missing (e.g. frame type, or MB mode suggestions).

6 The intensity of motion activity is defined by MPEG-7 as the standard deviation of the motion vector magnitudes.

7 Suzuki, Kuhn, Vetro, and Sun have taken an active part in the MPEG-7 standard development, especially in the part responsible for defining meta-data for transcoding (MediaTranscodingHints) [MPEG-7 Part V, 2003]. As such, the transcoding hints derive from their earlier works (discussed in the paragraphs above).

In general, the focus of the previous research is on the transcoding itself, including functional compatibility, compression efficiency and network issues, because the goal was to simplify the execution in general, but not to predict the required resources or to adapt the process to a real-time environment (RTE). Concerns such as limiting the motion search range, or improving the bit allocation between different objects by identifying key objects or by regulating the temporal resolution of objects, are at the center of interest. The subjects connected with real-time processing, scheduling of transcoders and QoS control in the context of the MMDBMS (i.e. format independence and application neutrality) are not investigated, and so no meta-data set has been proposed to support coping with these topics.

II.4. Video and Audio Transformation Frameworks

The transformation of continuous multimedia data has been a well-researched topic for a fairly long time. Many applications for converting or transforming audio and video can be found. Some built-in support for multimedia data is also already available in a few operating systems (OSes), which are called multimedia OSes for that reason, but usually this support can neither meet all requirements nor handle all possible kinds of continuous media data. In order to better solve the problems associated with the large diversity of formats, frameworks have been proposed; but to my knowledge, there is a lack of collective work presenting a theory of multimedia transformations in a broad range and in various applications, i.e. considering the other solutions, various media types, real-time processing, and QoS control at the same time and for different applications. One very good book aspiring to cover almost all of these topics, but only with respect to video data, is [Sun et al., 2005], and this work refers to it many times.

There are many audio and/or video transformation frameworks. They are discussed within this section, and every effort has been made to keep the description sound and complete. However, the author cannot guarantee that no other related framework exists.

II.4.1. Converters and Converter Graphs

The well-accepted and most general research approach comes from the signal processing field and is based on converters (filters) and converter graphs. It has been covered extensively in multimedia networking and mobile communication, so the idea is not new; it is rooted in [Pasquale et al., 1993], supported by [Candan et al., 1996; Dingeldein, 1995] and extended by Yeadon in his doctoral dissertation [Yeadon, 1996]. A more recent approach is described by Marder [Marder, 2002]. All of them, however, restrict the discussion to the function and consider neither execution time nor real-time processing nor QoS control. Typical implemented representatives are Microsoft DirectShow, Sun's Java Media Framework (JMF), CORBA A/V Streams, and the Multi Media Extension (MME) Toolkit [Dingeldein, 1995].

The pioneers have introduced some generalizations of video transformations in [Candan et al., 1996; Pasquale et al., 1993; Yeadon, 1996]. A filter as a transformer of one or more input streams of a multi-stream into an output (multi-)stream has been introduced by [Pasquale et al., 1993]; in other words, after the transformation the output (multi-)stream replaces the input (multi-)stream. The filters have been classified into three functional groups: selective, transforming, and mixing. These classes have been extended by [Yeadon, 1996] into five generic filter mechanisms: hierarchical, frame-dropping, codec, splitting/mixing, and parsing. Yeadon also proposed the QoS-Filtering Model, which uses a few key objects to constitute the overall architecture: sources, sinks, filtering entities, streams, and agents. [Candan et al., 1996] proposed collaborators capable of displaying, editing and conversion within the collaborative multimedia system called c-COMS, which is defined as an undirected weighted graph consisting of a set of collaborators (V), connections (E) and the cost of information transmission over a connection (ρ). Moreover, it defines collaboration formally, and discusses quality constraints and a few object synthesis algorithms (OSAs). [Dingeldein, 1995] proposes a GUI-based framework for interactive editing of continuous media supporting synchronization and mixing. It supports media objects divided into complex media (Synchronizer, TimeLineController) as composition and simple objects (audio, video data) as media control, and defines ports (source and sink) for processing. An adaptive framework for developing multimedia software components, called the Presentation Processing Engine (PPE) framework, is proposed in [Posnak et al., 1997]. PPE relies on a library of reusable modules implementing primitive transformations [Posnak et al., 1996] and proposes a mechanism for composing processing pipelines from these modules. There are other published works, e.g. [Margaritidis and Polyzos, 2000] or [Wittmann and Zitterbart, 1997], but they follow or somehow adapt the above-mentioned classifications and approaches of the earlier research and do not introduce breakthrough ideas. However, all of the mentioned works consider only aspects of the communication layer (networking) or the presentation layer, which is not sufficient when talking about multimedia transformations in the context of an MMDBMS.
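
The common denominator of these approaches can be captured in a few lines of C (a simplified sketch with hypothetical format identifiers, not one of the cited APIs): each converter declares an input and an output format, and a chain – the simplest form of a converter graph – is valid if the formats match pairwise from the source to the sink.

/* Minimal converter-chain model: formats must match along the chain. */
#include <stddef.h>
#include <stdbool.h>

typedef enum { FMT_LLV1, FMT_RAW_YV12, FMT_MPEG4_ASP } format_t;

struct converter {
    const char *name;
    format_t in_fmt;
    format_t out_fmt;
    /* a real converter would also expose a process() function per quant */
};

static bool chain_is_valid(const struct converter *chain, size_t len,
                           format_t source, format_t sink)
{
    format_t current = source;
    for (size_t i = 0; i < len; i++) {
        if (chain[i].in_fmt != current)
            return false;               /* mismatch: the graph cannot be built */
        current = chain[i].out_fmt;
    }
    return current == sink;
}

/* Example chain: LLV1 decoder -> raw video -> MPEG-4 encoder. */
static const struct converter demo_chain[] = {
    { "llv1-decoder",  FMT_LLV1,     FMT_RAW_YV12  },
    { "mpeg4-encoder", FMT_RAW_YV12, FMT_MPEG4_ASP },
};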

VirtualMedia is another work of very high importance [Marder, 2000]; it defines a theory of multimedia metacomputing as a new approach to the management and processing of multimedia data in web-based information systems. A solution for application independence of multimedia data (called transformation independence) through an advanced abstraction concept has been introduced by [Marder, 2001]. The work discusses theoretically several ideas like device independence, location transparency, execution transparency, and data independence. Moreover, an approach to constructing a set of connected filters, a description of the conversion process, and an algorithm to set up the conversion graph have been proposed in subsequent work [Marder, 2002], where individual media and filter signatures are used for creating transformation graphs. However, no implementation exists as a proof of concept.

II.4.1.1 Well-known implementations

Microsoft's DirectX platform is an example of a media framework in an OS-specific environment. DirectShow [Microsoft Corp., 2002a] is the most interesting part of the DirectX platform with respect to audio-video transformations. It deals with multimedia files and uses a filter-graph manager and a set of filters working with different audio and/or video stream formats and coding schemes. The filters (also called media codecs) are specially designed converters supporting the DirectShow internal communication interface. Filter graphs are built either manually, by the programmer creating a known execution path consisting of defined filters, or automatically, by comparing the output formats provided by a previously selected filter with the acceptable input formats of a potential following media codec [Microsoft Corp., 2002b]. It provides mechanisms for stream synchronization according to the OS time; however, the transformation processes cannot get execution and time guarantees from the best-effort OS. Moreover, DirectX is only available under one OS family, so its use at the client side is limited.

Sun Microsystems, a competing company, has also provided a media framework analogous to MS DirectShow. The Java Media Framework (JMF) [Sun Microsystems Inc., 1999] uses the Manager object to cope with Players, Data Sources, Data Sinks and Processors. The processors are equivalents of converter graphs and are built from processing components called plug-ins (i.e. converters) ordered in transformation tracks within the processor. Processors can be configured using suitable controls by hand, i.e. constructed by the programmer on the fly (TrackControl), or on the basis of a predefined processor model, which specifies input and output requirements, or the processor can be left to auto-configure itself by specifying only the output format [Sun Microsystems Inc., 1999]. In contrast to DirectShow filter graphs, processors can be combined with each other, which introduces one additional logical level of complexity in the hierarchy between simple converters and very complicated graphs. Moreover, JMF is not limited to just one OS due to the Java platform-independence properties provided through Java Virtual Machines (JVMs), but still it does not support QoS control and no execution guarantees in real time can be given8. One disadvantage may derive from the taken-for-granted inefficiency of Java applications in comparison to low-level languages and platform-specific implementations, although a detailed efficiency investigation and benchmarking would be advisable.

8 Time line, timing and synchronization in JMF are provided through an internal time model defining objects such as Time, Clock, TimeBase and Duration (all with nanosecond precision).

Transcode [Östreich, 2003] is the third related implementation that has to be mentioned. It is an open-source program for audio and video transcoding and is still under development; even so, reliable versions are available and can be very useful for experienced users. Transcode's goal was to be a popular utility for audio and video data processing that runs under an operating system's text console, allowing shell scripting and parametric execution for the purpose of automation. In contrast, the previously mentioned frameworks require the development of an application before their transcoding capabilities can be used. The approach is analogous to cascaded transcoding, which uses raw (uncompressed) data between coded inputs and outputs, i.e. transcoding is done by loading modules that are responsible either for decoding and feeding Transcode with raw video or audio streams (import modules), or for encoding the stream from the raw to the encoded representation (export modules). Up to now, the tool supports many popular formats (AVI, MOV, ES, PES, VOB, etc.) and compression methods (video: MPEG-1, MPEG-2, MPEG-4/DivX/XviD, DV, M-JPEG; sound: AC3, MP3, PCM, ADPCM), but it does not support QoS control and is developed as a best-effort application without real-time processing support.

II.4.2. End-to-End Adaptation and Transcoding Systems

Other work in the field of audio and video transformation relates directly to the concept of audio and video adaptation and transcoding as a method allowing for interoperability in heterogeneous networks by changing the container format, the structure (resolution, frame rate), the transmission rate, and/or the coding scheme, e.g. the MPEG transcoder [Keesman et al., 1996], the MPEG-2 transcoder [Kan and Fan, 1998], or the low-delay transcoder [Morrison, 1997]. Here the converter is referred to as a transcoder. [Sun et al., 2005], [Dogan, 2001] and [Vetro, 2001] give overviews of video transcoding and propose solutions. However, [Dogan, 2001] covers only H.263 and MPEG-4, and does not address the problem of transformation between different standards, which is a crucial assumption for format independence.

Figure 3. Adaptive transcoding system using meta-data [Vetro, 2001].

[Vetro, 2001] proposes an object-based transcoding framework (Figure 3) that is, among all the referenced related works, the solution most similar to the transformation framework of the RETAVIC architecture; therefore it is described in more detail. The author defines a feature-extraction part, executed only in non-real-time scenarios, that generates descriptors and meta-data describing the characteristics of the content. However, he does not specify the set of produced descriptors or meta-data elements – he just proposes to use the shape change and motion intensity transcoding hints (discussed earlier in section II.3.3.2 MD-based transcoding). Moreover, he proposes these hints solely for the purpose of functional decisions made by the transcoding control, and more precisely by the analysis units responsible for the recognition of shape importance, the temporal decision (such as frame skip) and the quantization parameter selection. The author has also mentioned two additional units for resize analysis and texture shape analysis (for reduction of the shape's resolution) in other work [Vetro et al., 2000]. Further, [Vetro, 2001] names two major differences to other research [Assunncao and Ghanbari, 1998; Sun et al., 1996], namely the inclusion of the shape hint within the bit stream and some new tools adopted for DC/AC prediction with regard to texture coding; no other descriptors or meta-data are mentioned. The transcoding control is used only for controlling the functional properties of the transcoders. The core of the object transcoders is analogous to the multi-program transcoding framework [Sorial et al., 1999] and the only difference is the input stream – the object-based MPEG-4 video streams do not correspond to frame-based MPEG-2 video program streams9. The issues of real-time processing and QoS control are not considered – neither in the meta-data set nor in the transcoder design. Thus, this work extends Vetro's research partially in the functional aspects and entirely in the quantitative aspects of transcoding.

9 The issue of the impossibility of frame skipping in MPEG-2 is a well-known problem, bypassed by spending one bit per macro block to mark each MB in the frame as skipped; thus it is not cited here.

Many examples of end-to-end video streaming and transcoding systems are discussed in [Sun et al., 2005]. For example, the MPEG-4 Fine Granular Scalability (FGS) to MPEG-4 Simple Profile (SP) transcoder is mentioned in the 3rd chapter, and the spatial and temporal resolution reduction is discussed on the functional level in the 4th chapter, accompanied by a discussion of motion vector refinement and requantization. The "syntactical adaptation" being one-to-one (binary-format) transcoding is discussed for JPEG-to-MPEG-1, MPEG-2-to-MPEG-1, DV-to-MPEG-2, and MPEG-2-to-MPEG-4. Some more issues such as error-resilient transcoding, logo insertion, watermarking, picture switching and statistical multiplexing are also discussed in the 4th chapter of [Sun et al., 2005]. Finally, the novel picture-in-picture (PIP) transcoding for H.264/AVC [ITU-T Rec. H.264, 2005] discusses two cases: the PIP Cascade Transcoder (PIPCT) and the optimized Partial Re-Encoding Transcoder Architecture (PRETA). However, all the transcoder examples mentioned in this paragraph are either adaptations (homogeneous or unary-format transcoding) or binary-format (one-to-one) transcoding, and no transformation framework providing one-to-many or many-to-many coding-scheme transcoding, i.e. a "real" heterogeneous solution, is proposed. Moreover, they do not consider real-time processing and QoS control.

Yet another example is discussed in Chapter 11 of [Sun et al., 2005] – the real-time server containing a transcoder operating on precompressed content is given in Figure 11.2 on p. 329 of that book10. A few elements of the server side are common with our architecture, but a few critical ones are still missing (e.g. content analysis, MD-based encoding). Moreover, the discussed test-bed architecture is MPEG-4 FGS-based, which is also one of the adaptation solutions. An extension of the test bed to MPEG-4 transcoding is proposed, but it assumes that only several requested resolutions exist, all being delivered in the MPEG-4 format (which is not the RETAVIC goal), and the application of an MD-based encoding algorithm is still not considered.

10 The complete multimedia test-bed architecture is also depicted in Figure 11.5 on p. 399 of [Sun et al., 2005].

Finally, there are also other, less related proposals enhancing media data transformation during delivery to the end client. [Knutsson et al., 2003] proposes an extension to the HTTP protocol to support server-directed transcoding on proxies; even though it states that any kind of data could be managed in this way, only static image data and no other media types, especially continuous ones, are investigated. [Curran and Annesley, 2005] discusses the transcoding of audio and video for mobile devices constrained by bandwidth, and additionally discusses properties of media files and their applicability in streaming with respect to the device type. The framework is not presented in detail, but it is based on JMF and considers neither QoS nor real-time processing – at least, neither is mentioned anywhere, and within the transcoding algorithm there is no step referring to any kind of time or quality control (it is just a best-effort solution). The perceived quality evaluation is done by means of mean opinion scores (MOS) in off-line mode, and total execution times are measured. An audio streaming framework is proposed in [Wang et al., 2004], but only an algorithm for multi-stage interleaving of audio data and a layered, unequal-sized packetization useful in error-prone networks are discussed, and no format transformations are mentioned.

II.4.3. Summary of the Related Transformation Frameworks

Summarizing, there are interesting solutions for media transformations that are ready to be applied in certain fields, but there is still no solution that supports QoS, real-time processing, and format independence in a single architecture or framework directly applicable in the MMDBMS, where the specific properties of multimedia databases, such as the presence of the data and of a meta-data repository describing the data characteristics, could be exploited.

Thus the RETAVIC media transformation framework [Suchomski et al., 2005] is proposed, where each media object is processed at least twice before being delivered to a client and media transformations are assisted by meta-data. This is analogous to two-pass encoding techniques in video compression [Westerink et al., 1999], so the optimization techniques deriving from the two-pass idea can also be applied to the RETAVIC approach. However, the RETAVIC framework goes beyond that – it heavily extends the idea of MD-assisted processing, employs meta-data to reduce the complexity and enhance the predictability of the transformations in order to meet real-time requirements, and proposes an adaptive converter model to provide QoS control.
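
The two-phase idea can be summarized by the following conceptual sketch (hypothetical fields and numbers, not the actual RETAVIC interfaces): the non-real-time preparation pass analyzes each frame once and stores per-frame meta-data; the real-time delivery pass then uses the stored predicted cost for feasibility checks and the stored hints instead of repeating the content-dependent analysis.

/* Toy illustration of MD-assisted two-phase processing. */
#include <stdio.h>

struct frame_md {
    char frame_type;   /* suggested 'I' or 'P' decision                    */
    int  cost_ms;      /* predicted worst-case encoding cost of the frame  */
};

int main(void)
{
    /* Preparation phase (non-real-time): meta-data stored next to the MO. */
    struct frame_md md[5] = { {'I', 18}, {'P', 7}, {'P', 8}, {'P', 6}, {'I', 17} };

    /* Delivery phase (real-time): 40 ms budget per frame at 25 fps.       */
    for (int i = 0; i < 5; i++) {
        if (md[i].cost_ms > 40) {
            printf("frame %d: infeasible, adapt quality\n", i);
            continue;
        }
        printf("frame %d: encode as %c (predicted %d ms)\n",
               i, md[i].frame_type, md[i].cost_ms);
    }
    return 0;
}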

II.5. Format Independence in Multimedia Management Systems

Even though plenty of research has been done in the direction of audio and video transcoding, as mentioned in the previous sections, the currently available media servers and multimedia database management systems have huge deficiencies with respect to format independence for media objects, especially when considering audio and video data. They offer either data independence or real-time support, but not both. The media servers that have been analyzed and compared support only a small number of formats [Bente, 2004], not to mention the possibility to transform one format into another when the output format is defined by the end user. Some attempts towards format independence are made with respect to quality scalability, but only within one specific format, i.e. the adaptation approach. The only server allowing a limited transformation (from RealMedia Format to Advanced Streaming Format) supports neither QoS control nor real-time processing. Besides, the user cannot freely specify quality attributes such as a different resolution or a higher compression, and at most can only choose between predefined qualities of a given format (redundancy approach).

II.5.1. MMDBMS

The success of using database management systems for organizing large amounts of character-based data (e.g. structured or unstructured text) has led to the idea of extending them to support multimedia data as well [Khoshafian and Baker, 1996]. In contrast to multimedia servers, which are designed to simply store the multimedia data and manage access to them in analogy to file servers [Campbell and Chung, 1996], multimedia database systems handle the multimedia data in a more flexible way, without redundancy, allowing them to be queried through the standardized Structured Query Language (SQL) [Rakow et al., 1995], which may deliver data independence and application neutrality. Moreover, an MMDBMS is responsible for coping with a multi-user environment and for controlling parallel access to the same data.

Before going into the details of each system, a short summary of the demands on a state-of-the-art multimedia database management system (MMDBMS), based on [Meyer-Wegener, 2003], is given. The multimedia data are stored and retrieved as media objects (MOs); however, storing means much more than just storing files (as in media servers). Other algorithms, e.g. manipulation of the content for video production or broadcasting, should not be implemented on the MMDBMS side. The majority of DBMSes already have built-in support for binary large objects (BLOBs) for integrating arbitrary binary data, but BLOBs are not really suited to the various media data, because there is no differentiation between media types, and thus neither an interpretation nor the specific characteristics of the data can be exploited. An MO should be kept as a whole, without division into separate parts (e.g. an image and not separate pixels with positions; a video and not each frame as an image; an audio stream and not each sample individually). The exceptions are scalable media formats and coding schemes, but these require further media-specific MO refinement. The data independence (including storage device and format) of MOs shall be provided due to its crucial importance for any DBMS. Device independence is more critical for multimedia data than format independence, because of hierarchical access driven by the frequency of use of the data, where the access times of the storage devices differ. However, format independence may not be neglected if data independence is to be fully supported, especially for avoiding data redundancy, supporting long-lived applications and neutrality, allowing internal format updates without influencing the outside world, etc. Additionally, an MMDBMS must prevent data inconsistencies if multiple copies are required in any circumstances (e.g. due to optimization of delivery at the cost of storage, in analogy to materialized views). More sophisticated search capabilities, not only by name or creation date, have to be provided – for example using indexes or meta-data delivered by media-specific recognition or analysis algorithms. The indexes or meta-data can be produced by hand or automatically in advance (see footnote 11). Finally, the time constraints of continuous data have to be considered, which means implementing: a) a best-effort system with no quality control that simply works fast enough to deliver in real time, b) a soft-RTOS-based system with just statistical QoS, or c) a hard-RTOS-based system providing scheduling algorithms and admission control, thus allowing exact QoS control and precise timing. Summarizing, an MMDBMS combines the advantages of a conventional DBMS with the specific characteristics of multimedia data.

Researchers have already delivered many solutions considering various perspectives on the evolution of MMDBMSes. In general, they can be classified into two functional extensions: 1) a focus on the “multi” part of multimedia, or 2) coping with device independence, i.e. access transparency. The goal of the first group is to provide integrated support for many different media types in one complete architecture. Here METIS [King et al., 2004] and MediaLand [Wen et al., 2003] can be named as representatives. METIS is a Java-based unified multimedia management solution for arbitrary media types including a query processor, persistence abstraction, web-based visualization and semantic packs (containers for semantically related objects). In contrast, MediaLand, coming from Microsoft, is a prototypical framework for uniform modeling, managing and retrieving of various types of multimedia data by exploiting different querying methodologies such as standard database queries, information-retrieval and content-based retrieval techniques, and hypermedia (graph) modeling.

The second group, coping with device independence, proposes solutions such as SIRSALE [Mostefaoui et al., 2002] or the Mirror DBMS [van Doorn and de Vries, 2000]. These are especially useful for distributed multimedia data. The first one, SIRSALE, proposes a modular indexing model for searching in various domains of interest and independently of the device, while the other, the Mirror DBMS, solves only the problem of physical data independence with respect to storage access and its querying mechanism. However, neither discusses data independence with respect to the format of the multimedia data.

11 On-demand analysis at query time is too costly (not to say unfeasible) and has to be avoided.

From another perspective, there are commercial DBMSs available, such as Oracle, IBM’s DB2, or Informix. These are complex and reliable relational systems with object-oriented extensions; however, they lack direct support for multimedia data, which can be handled only through the mentioned object-relational extensions, e.g. by additional implementation of special data types and related methods for indexing and query optimization. Thus, Informix offers DataBlades as extensions, DB2 offers Extenders, and Oracle proposes interMedia Cartridges. All of them provide some limited level of format independence through data conversion. The most sophisticated solution is delivered by Oracle interMedia [Oracle Corp., 2003]. There, ORDImage.Process() allows converting a stored picture to different image formats. Moreover, the functions processAudioCommand() included in the ORDAudio interface and processVideoCommand() present in ORDVideo are used for media processing calls (i.e. also transcoding), but these are just interfaces for passing the processing commands to plug-ins that still have to be implemented for each user-defined audio and video format. Such a format-specific implementation would lead to the M-to-N conversion problem, because there has to be an implemented solution for each format handing over the data in the user-requested format. Moreover, Oracle interMedia allows for storing, processing and automatic extraction of meta-data covering the format (e.g. MIME type, file container, coding scheme) and the content (e.g. author, title) [Jankiewicz and Wojciechowski, 2004]. The other system, DB2, proposes Extenders for a few types of media, i.e. Image, Audio and Video. However, it provides conversion only for the DB2Image type and for none of the continuous media types [IBM Corp., 2003].

There has also been some work done on extensions of the declarative Structured Query Language (SQL) to support multimedia data. As a result, the multimedia extension to SQL:1999 under the name SQL Multimedia and Application Packages (known in short as SQL/MM) has been proposed [Eisenberg and Melton, 2001]. The standard [JTC1/SC32, 2007] defines a number of packages of generic data types common to various kinds of data used in multimedia and application areas, in order to enable these data to be stored and manipulated in an SQL database. It currently covers five parts:

1) Part 1: Framework – defines the concepts, notations and conventions common to two or more other parts of the standard, and in particular explains the user-defined types and their behavior;

2) Part 2: Full-Text – defines the full-text user-defined types and their associated routines;

3) Part 3: Spatial – defines the spatial user-defined types, routines and schemas for generic spatial data handling, covering aspects such as geometry, location and topology, usable by geographic information systems (GIS), decision support, data mining and data warehousing systems;

4) Part 5: Still Image – defines the still-image user-defined types and their associated routines for generic image handling, covering characteristics such as height, width, format, coding scheme, color scheme, and image features (average color, histograms, texture, etc.), but also operations or methods (rotation, scaling, similarity assessment);

5) Part 6: Data Mining – defines the data-mining user-defined types and their associated routines covering data-mining models, settings and test results.

However, SQL/MM still falls short of the possibilities offered by abstract data types for timed multimedia data and is not fully implemented in any well-known commercial system. Oracle 10g implements only Part 5 of the standard through the SQL/MM StillImage types (e.g. SI_StillImage), which is an alternative, standard-compliant solution to the more powerful Oracle-specific ORDImage type [Jankiewicz and Wojciechowski, 2004]. Oracle 10g also partially supports the full-text and spatial parts of SQL/MM (i.e. Part 2 and Part 3).

To the best knowledge of the author of this thesis, neither the prototypes nor the commercial MMDBMSs have considered format independence of continuous multimedia data. The end-user perspective is likewise neglected, because the systems do not address the requirement of delivering a variety of formats on request. None of the systems provides format and quality conversion through transcoding, besides the adaptation possibility deriving solely from the use of a scalable format. What is more, serving different clients with respect to their hardware properties and limitations (e.g. mobile devices vs. multimedia set-top boxes) is usually solved by limiting the search result to the subset restricted by the constraints of the user’s context, which means that only the data suiting the given platform are considered during the search. Besides, none of them is designed to be consistent with the MPEG-7 MDS model [MPEG-7 Part V, 2003].

II.5.2. Media (Streaming) Servers

As pointed out previously, the creation of a fully featured MMDBMS supporting continuous data has failed so far, probably due to the enormous complexity of the task. On the other hand, the demand for delivery of audio-visual data, imposing needs for effective storage and transmission, has been rising incessantly. As a result, the development of simple and pragmatic solutions was started, which has delivered today’s audio and video streaming servers, especially useful for video-on-demand applications.

The RealNetworks Media Server [RealNetworks, 2002], the Apple Darwin Streaming Server [Apple, 2003] and the Windows Media Services [Microsoft, 2003] are definitely the best-known commercial products currently available. All of them deliver continuous data with low latency and control the clients’ access to the stored audio/video data on the server side. The drawback is that the data are stored on the server only in a proprietary format, and they are streamed to the client over the network in that same format. To provide at least some degree of scalability, including various qualities of the data accompanied by a range of bandwidth characteristics, the redundancy approach has to be applied: usually several instances of the same video, pre-coded at various bit rates and resolutions, are stored on the server.


III. FORMATS AND RT/OS ISSUES

III.1. Storage Formats

Only the lossless and/or scalable coding schemes are discussed within this section. The non-scalable and lossy ones are of no interest, as can be derived from the later part of the work. But before going into details, a few general remarks have to be stated. It is a fact that multimedia applications have to cope with huge amounts of data, and thus compression techniques are exploited for storage and transmission. It is also clear that the trade-off between compression efficiency and data quality has to be considered when choosing the compression method. So the first decision is to choose between lossless and lossy solutions. Secondly, the number of provided SNR quality levels has to be selected: either discrete with no scalability (one quality level) or just a few levels, or continuous (see footnote 12) with a smooth quality transition over the quality range (often represented by fine-granular scalability). It is taken for granted that higher compression leads to lower data quality for lossy algorithms, and that the introduction of scalability lowers the coding efficiency for the same quality in comparison to non-scalable coding schemes. Moreover, lossless coding is much worse than lossy coding when considering only coding efficiency. All these general remarks apply to both audio and video data and their processing.

III.1.1. Video Data

When this work was started, there was no scalable and lossless video compression available, to the author’s knowledge. Therefore, only coding schemes that are either scalable or lossless (but not both) are discussed at the beginning. Next, as an exception, MJPEG-2000 – a possible candidate for a lossless, scalable image coding format applied to video encoding – is shortly described, and the 3D-DWT-based solution is mentioned subsequently. With respect to the newest, still ongoing and unfinished standardization process of MPEG-4 SVC [MPEG-4 Part X, 2007], some remarks are given in the Further Work.

12 The term continuous is an overstatement in this case, because there is no real continuity in the digital world. Thus the border between discrete and continuous is very vague, i.e. one could ask how many levels are required to be considered continuous and not discrete anymore. It is suggested that anything below 5 levels is discrete, but the decision is left to the reader.

III.1.1.1 Scalable codecs

The MPEG-4 FGS profile [Li, 2001] defines two layers with identical spatial resolution, where one is the base layer (BL) and the other is the enhancement layer (EL). The base layer is compressed according to the Advanced Simple Profile (ASP) including temporal prediction, while the EL stores the difference between the originally calculated DCT coefficients and the coarsely quantized DCT coefficients stored in the BL using a bit-plane method, such that the enhancement bitstream can be truncated at any position, and so fine-granularity scalability is provided. There is no temporal prediction within the EL, so the decoder is resistant to error drift and recovers robustly from errors in the enhancement stream.
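
To make the bit-plane idea more tangible, the following minimal C sketch (illustrative only; it does not follow the MPEG-4 FGS bitstream syntax, and the residual values are invented) shows how sending the residual magnitudes most-significant-plane first allows the enhancement information to be cut after any plane while still yielding a coarser but usable approximation:

#include <stdio.h>
#include <stdlib.h>

#define N 8          /* residual coefficients of one (tiny) block         */
#define PLANES 6     /* magnitudes are assumed to fit into 6 bit-planes   */

/* Reconstruct the residual using only the first 'kept_planes' planes,
 * i.e. as if the enhancement stream had been truncated there. */
static void decode_planes(const int residual[N], int kept_planes, int out[N])
{
    for (int i = 0; i < N; i++) {
        int mag = abs(residual[i]);
        int kept = 0;
        for (int p = PLANES - 1; p >= PLANES - kept_planes; p--)
            kept |= mag & (1 << p);
        out[i] = (residual[i] < 0) ? -kept : kept;
    }
}

int main(void)
{
    /* invented difference between original and coarsely quantized DCT
     * coefficients (what the EL would carry) */
    int residual[N] = { 37, -21, 14, -9, 6, -3, 2, -1 };
    int approx[N];

    for (int k = 1; k <= PLANES; k++) {       /* possible truncation points */
        decode_planes(residual, k, approx);
        printf("planes kept = %d:", k);
        for (int i = 0; i < N; i++)
            printf(" %4d", approx[i]);
        printf("\n");
    }
    return 0;
}

Every additional plane roughly halves the remaining reconstruction error, which is the fine-granular refinement behavior that FGS-based adaptation exploits.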

Two extensions to FGS have been proposed in order to lower the FGS coding deficiencies caused by the missing temporal redundancy removal in the enhancement layer. Motion-Compensated FGS (MC-FGS) [Schaar and Radha, 2000] is one of them; it proposes that the higher-quality frames from the EL be used as reference for motion compensation, which leads to smaller residuals in the EL and thus better compression, but at the same time suffers from higher error drift due to error propagation within the EL. Progressive FGS (PGFS) [Wu et al., 2001] is the other proposal; it adopts a special prediction method for the EL in a separate loop, using only a partial temporal dependency on the higher-quality frame. In this way, it closes the gap between the relatively inefficient FGS with no drift in the EL and the efficient MC-FGS with its susceptibility to error drift in the EL. Moreover, an enhancement to PGFS called Improved PGFS [Ding and Guo, 2003] has already been proposed, where the coding efficiency is even higher (for the same compressed size the PSNR [Bovik, 2005] gain reaches 0.5 dB) by using the higher-quality frame as reference for all frames; additionally, error accumulation is prevented by attenuation factors.

Even though very high quality can be achieved by MPEG-4 FGS-based solutions, it is impossible to obtain losslessly coded information due to the lossy properties of the DCT and quantization steps in the implementation of the MPEG-4 standard.


III.1.1.2 Lossless codecs

Many different lossless video codecs providing high compression performance and efficient storage are available. They may be divided into two groups: 1) those using general-purpose compression methods, and 2) those using transform-based coding.

The first group exploits lossless general-purpose compression methods like Huffman coding or the Lempel-Ziv algorithm and its derivatives, e.g. LZ77, LZ78, LZW or LZY [Effelsberg and Steinmetz, 1998; Gibson et al., 1998]. A more efficient solution is to combine these methods into one – known as DEFLATE (defined by Phil Katz in PKZIP and used, for example, in the popular gzip) – and examples of such video codecs are: the Lossless Codec Library (known as LCL AVIzlib/mszh), the LZO Lossless Codec, and CSCD (RenderSoft CamStudio). These methods are still relatively inefficient due to their generality and the unexploited spatial and temporal redundancy, so a more advanced method employing spatial or temporal prediction to exploit the redundancies in video data would improve the compression performance. Examples of such enhanced methods are HuffYUV [Roudiak-Gould, 2006] and Motion CorePNG, and it is also assumed that the proprietary AlparySoft Lossless Video Codec [WWW_AlparySoft, 2004] belongs to this group as well.
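
The benefit of putting even a trivial temporal prediction in front of a general-purpose coder can be sketched as follows (a hedged illustration using the standard zlib compress() call on two synthetic frames; the frame contents and sizes are made up, and real codecs of this group use more elaborate predictors):

#include <stdio.h>
#include <stdlib.h>
#include <zlib.h>

#define W 320
#define H 240

/* Return the DEFLATE-compressed size of a buffer (0 on failure). */
static unsigned long deflated_size(const unsigned char *buf, unsigned long len)
{
    uLongf out_len = compressBound(len);
    unsigned char *out = malloc(out_len);
    if (out == NULL || compress(out, &out_len, buf, len) != Z_OK)
        out_len = 0;
    free(out);
    return out_len;
}

int main(void)
{
    static unsigned char prev[W * H], cur[W * H], diff[W * H];

    /* two synthetic luminance frames: a gradient shifted by one pixel */
    for (int y = 0; y < H; y++)
        for (int x = 0; x < W; x++) {
            prev[y * W + x] = (unsigned char)((x + y) & 0xFF);
            cur[y * W + x]  = (unsigned char)((x + 1 + y) & 0xFF);
        }

    for (int i = 0; i < W * H; i++)          /* trivial temporal prediction */
        diff[i] = (unsigned char)(cur[i] - prev[i]);

    printf("DEFLATE of raw frame        : %lu bytes\n", deflated_size(cur, W * H));
    printf("DEFLATE of frame difference : %lu bytes\n", deflated_size(diff, W * H));
    return 0;
}

On this synthetic content the difference signal compresses far better than the raw frame, which is exactly the redundancy that the enhanced codecs of this group exploit.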

The second group of codecs uses compression techniques derived from transform-based algorithms, which were originally developed for still images. If such a technique is applied to video data on a frame-by-frame basis, it produces a simple sequence of pictures which are independently coded without loss of information. Typical examples are Lossless Motion JPEG [Wallace, 1991] or LOCO-I [Weinberger et al., 2000]. There are also many implementations available, both in software (e.g. PICVideo Lossless JPEG, Lead Lossless MJPEG) and in hardware (e.g. Pinnacle DC10, Matrox DigiSuite).

III.1.1.3 Lossless and scalable codecs

The combination of lossless coding and scalability in video coding has been proposed in recent research; however, no ready, implemented solutions have been provided. One important solution focused on video is MPEG-4 SVC [MPEG-4 Part X, 2007], discussed in the Further Work (due to its work-in-progress status).


Another possible lossless and scalable solution for video data could be the use of a wavelet-transform-based technology, e.g. Motion JPEG 2000 (MJ2) using the two-dimensional discrete wavelet transform (2D-DWT) [Imaizumi et al., 2002]. However, there are still many unsolved problems regarding the efficient exploitation of temporal redundancies (JPEG 2000 only covers still pictures, not motion). Extensions for video exploiting temporal characteristics are still under development.

Last but not least, the three-dimensional discrete wavelet transform (3D-DWT) could be applied to video coding. The 3D-DWT operates not in two dimensions, as the transform used by MJ2, but in three dimensions, i.e. the third dimension is time. This allows treating a video sequence, or a part of it consisting of more than one frame and called a group of pictures (GOP), as a 3-D space of luminance or chrominance values. There is, however, an unacceptable drawback for real-time compression – a time buffer of at least the size of the GOP has to be introduced. On the other hand, the approach better exploits the correlation between pixels in subsequent frames around one point or macro-block of the frame, there is no need to divide the frame into non-overlapping 2-D blocks (which avoids blocking artifacts), and the method inherently allows for scaling [Schelkens et al., 2003]. It is also assumed that, by selecting more important regions, one could achieve compression of 1:500 (vs. 1:60 for DCT-based algorithms [ITU-T Rec. T.81, 1992]). Still, the computing cost of 3D-DWT-based algorithms may be much higher than that of previously published video compression algorithms such as 2D-DWT-based ones. Moreover, only a proprietary prototypical implementation has been developed.
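
As an illustration of the temporal (third) dimension only, the following C sketch applies one level of an integer Haar (lifting) decomposition along the time axis of a small, synthetic GOP; the spatial 2-D transform, quantization and entropy coding of a real 3D-DWT codec are omitted, and the GOP size and pixel values are invented:

#include <stdio.h>

#define W 4
#define H 4
#define GOP_LEN 4              /* must be even for one Haar level          */

int main(void)
{
    int gop[GOP_LEN][H][W];
    int lo[GOP_LEN / 2][H][W]; /* temporal low-pass band (approximation)   */
    int hi[GOP_LEN / 2][H][W]; /* temporal high-pass band (detail)         */

    /* synthetic GOP: a flat picture that slowly brightens over time */
    for (int t = 0; t < GOP_LEN; t++)
        for (int y = 0; y < H; y++)
            for (int x = 0; x < W; x++)
                gop[t][y][x] = 100 + 2 * t;

    /* one level of the integer Haar (lifting) transform along the time
     * axis, applied per pixel position; invertible, hence lossless */
    for (int t = 0; t < GOP_LEN; t += 2)
        for (int y = 0; y < H; y++)
            for (int x = 0; x < W; x++) {
                int a = gop[t][y][x], b = gop[t + 1][y][x];
                int d = b - a;                  /* detail coefficient       */
                hi[t / 2][y][x] = d;
                lo[t / 2][y][x] = a + (d >> 1); /* rounded average          */
            }

    printf("pixel (0,0): lo = %d %d, hi = %d %d\n",
           lo[0][0][0], lo[1][0][0], hi[0][0][0], hi[1][0][0]);
    return 0;
}

Note that all GOP_LEN frames must be buffered before the first coefficient can be produced – the real-time drawback mentioned above – while the high-pass band, holding mostly small values for slowly changing content, is what makes the subsequent entropy coding effective.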

III.1.2. Audio Data

In contrast to the video section, this audio section discusses only lossless and scalable coding schemes, because just a few coding schemes are applicable within this work. In general, there are a few methods of storing audio data without loss: 1) in uncompressed form, e.g. in a RIFF WAVE file using Pulse Code Modulation, Differential PCM, Adaptive DPCM, or any other modulation method; 2) in compressed form using standard lossless compression algorithms like ARJ, ZIP or RAR; or 3) in compressed form using one of the audio-specific algorithms for lossless compression. From the perspective of this work, only the third class is shortly described, due to its capability of layering, its compression efficiency and its relation to audio data.


The best-known and most efficient coding schemes using audio-specific lossless compression are as follows:

• Free Lossless Audio Codec (FLAC) [WWW_FLAC, 2006]
• Monkey's Audio [WWW_MA, 2006]
• LPAC
• MPEG-4 Audio Lossless Coding (ALS) [Liebchen et al., 2005] (LPAC has been used as a reference model for ALS)
• RKAU from M Software
• WavPack [WWW_WP, 2006]
• OptimFrog
• Shorten from SoftSound
• MPEG-4 Scalable Lossless Coding (SLS) [MPEG-4 Part III FDAM5, 2006]

Out of all the mentioned coding schemes, only WavPack, OptimFrog and MPEG-4 SLS are (to some extent) scalable formats, while the others are non-scalable, which means that all data must be read from the storage medium and decoded completely, and there is no way of obtaining lower quality through partial decoding. Both WavPack and OptimFrog have a feature providing rudimentary scalability, such that there are only two layers: a relatively small lossy file (base layer) and an error-compensation file which has to be combined with the decoded lossy data to provide the lossless information (enhancement layer). This scalability feature is called Hybrid Mode in WavPack and DualStream in OptimFrog.

Besides the mentioned two-layer scalability, fine-granular scalability would be possible in WavPack, because the algorithm uses linear prediction coding where the most significant bits are encoded first. So, if the implementation of WavPack were reworked, it would be possible to use every additional bit layer for improving the quality and to turn WavPack into a scalable-to-lossless audio codec with multiple layers of scalability. However, such a rework has not been implemented so far, and no scalability finer than two layers has been proposed.
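
The two-layer principle behind the Hybrid Mode and DualStream features can be sketched as follows (an illustration only, not the WavPack or OptimFrog bitstream: the base layer here is a plain coarse quantization and the correction values are stored unencoded, whereas the real codecs apply prediction and entropy coding to both layers):

#include <stdio.h>

#define N 8
#define SHIFT 6      /* the base layer drops the 6 least significant bits  */

int main(void)
{
    short pcm[N]  = { 1203, -518, 32767, -32768, 7, -7, 250, -251 };
    short base[N], corr[N];

    for (int i = 0; i < N; i++) {
        /* lossy base layer: coarse quantization (arithmetic shift assumed
         * for negative samples, as on all common platforms) */
        base[i] = (short)((pcm[i] >> SHIFT) << SHIFT);
        /* enhancement layer: the error that restores losslessness */
        corr[i] = (short)(pcm[i] - base[i]);
    }

    for (int i = 0; i < N; i++) {
        short rec = (short)(base[i] + corr[i]);     /* lossless decode      */
        printf("%6d = %6d + %2d  (%s)\n", pcm[i], base[i], corr[i],
               rec == pcm[i] ? "exact" : "ERROR");
    }
    return 0;
}

Decoding the base layer alone yields the small lossy version; combining it with the correction layer restores the original samples bit-exactly – the same split the hybrid files make, only with prediction and entropy coding applied to both layers.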

Similarly, two layers are proposed in MPEG-4 SLS by default [Geiger et al., 2006], i.e. the base layer (also called the core) and the enhancement layer. The core of MPEG-4 SLS uses the well-known MPEG-4 AAC coding scheme [MPEG-2 Part VII, 2006], while the EL stores the transform-based encoded error values (i.e. the difference between the decoded core and the lossless source) coded according to the SLS algorithm. The SLS EL, analogously to WavPack, also stores the most significant bits of the difference first, but in contrast the SLS algorithm allows cutting bits off the EL, so fine-granular scalability is possible. Moreover, the transform-based coding of MPEG-4 SLS outperforms the linear prediction coding used in WavPack in coding efficiency at low bit rates. Additionally, MPEG-4 SLS can also be switched into an SLS-only mode where no audio data are stored in the base layer (no core; see footnote 13) and just the enhancement layer encodes the error. Such a solution is equivalent to encoding the error difference between zero and the lossless signal value in a scalable, from-the-first-bit manner. Of course, at low bit rates the SLS-only method cannot compete with the AAC-based core.

III.2. Real-Time Issues in Operating Systems

The idea of transcoding-driven format independence of audio and video data imposes enormous performance requirements on the operating system, including both computing power and data transfers. Secondly, the time constraints of continuous data limit the applicable solutions to only a subset of the existing algorithms. For example, preemptive scheduling with a few priority levels [Tannenbaum, 1995], which has proved to be relatively simple and efficient in managing the workload in best-effort systems such as Linux or Windows, is insufficient in more complex scenarios where many threads with time constraints appear and execution deadlines should be considered.

Additionally, in order to understand the later parts of this work, some background in the field of operating systems is also required. Not all aspects related to OS issues are discussed within this section; just the most critical points and definitions are covered – more detailed aspects building on the definitions introduced here appear whenever required in the respective chapters (as inline explanations or footnotes). First the execution modes, kernel architectures and inter-process communication are discussed, then the real-time processing models are shortly introduced, and next the related scheduling algorithms are referenced.

13 This method is called SLS-based core, but the name may be misunderstood, because no bits are spent for storing audio quanta in the base layer. Only descriptive and structural information such as bit rate, number of channels, sampling rate, etc. is stored in the BL.

III.2.1. OS Kernel – Execution Modes, Architectures and IPC

In general, two kinds of memory space (or execution modes) are distinguishable: user space (user mode) and kernel space (kernel mode). As may be derived from the names, the OS kernel (usually together with the device drivers and kernel extensions or modules) runs in the non-swappable (see footnote 14) kernel memory space in kernel mode, while the applications use the swappable user memory and are executed in user mode [Tannenbaum, 1995]. Of course, programs in user mode cannot access the kernel space, which is very important for system stability and allows handling buggy or malicious applications.

On the other hand, there are two types of OS kernel architecture in common use (see footnote 15): a micro-kernel and a monolithic kernel. Monolithic kernels embed all the OS services, such as memory management (address spaces with mapping and the TLB (see footnote 16), virtual memory, caching mechanisms), threads and thread switching, scheduling mechanisms, interrupt handling, the inter-process communication (IPC; see footnote 17) mechanism [Spier and Organick, 1969], file-system support (block devices, local FS, network FS) and network stacks (protocols, NIC drivers), PnP hardware support, and many more. At the same time, all these services run in the kernel space in kernel mode. Modern monolithic kernels are usually re-configurable through the possibility of dynamically loading additional kernel modules, e.g. when a new hardware driver or support for a new file system is required. In this respect only, they are similar to micro-kernels. What differentiates micro-kernels from monolithic kernels is the embedded set of OS services, and the address space and execution mode used for these services. Only a minimal set of abstractions (address spaces, threads, IPC) is included in the micro-kernel [Härtig et al., 1997], which is used as a base for the user-space implementation of the remaining functionality (i.e. for the other OS services, usually called servers). In other words, only the micro-kernel with its minimal concepts (primitives) runs in kernel mode using kernel-assigned memory, and all the servers are loaded into the protected user memory space and run at the user level [Liedtke, 1996]. Hence the micro-kernels have some advantages over the monolithic kernels, such as a smaller binary image, a reduced memory and cache footprint, a higher resistance to malicious drivers, and easier portability to different platforms [Härtig et al., 1997]. An example of a micro-kernel is L4 [Liedtke, 1996], implementing primitives such as: 1) address spaces with grant, map and flush, 2) threads and IPC, 3) clans and chiefs for kernel-based restriction of IPCs (the clans and chiefs themselves are located in user space), and 4) unique identifiers (UIDs). One remark: device-specific interrupts are abstracted to IPCs and are not handled internally by the L4 micro-kernel, but rather by the device driver in user mode.

14 Swappable means that parts of the virtual memory (VM) may be swapped out to secondary storage (usually to a swap file or swap partition on a hard disk drive) temporarily whenever the process is inactive or the given part of the VM has not been used for some time. Analogously, non-swappable memory cannot be swapped out.
15 A nano-kernel is intentionally left out due to its limitations, i.e. only a minimalist hardware abstraction layer (HAL) including the CPU interface. Interrupt management and the memory management unit (MMU) interface are very often included in nano-kernels (due to the coupled CPU architectures) even though they do not really belong there. Nano-kernels are suitable for real-time single-task applications, and thus are applicable in embedded systems for hardware independence. Alternatively, it might be said that nano-kernels are not OS kernels in the traditional sense, because they do not provide a minimal set of OS services.
16 TLB stands for Translation Lookaside Buffer, which is a cache of mappings from the operating system’s page table.
17 IPC finds its roots in the meta-instructions defined for multiprogrammed computations by [Dennis and Van Horn, 1966] within MIT’s Project MAC.

An efficient implementation of IPC is required by micro-kernels [Härtig et al., 1997] and is also necessary for parallel and distributed processing [Marovac, 1983]. IPC allows for data exchange (usually by sending messages) between different processes or processing threads [Spier and Organick, 1969]. Moreover, by limiting the IPC interface of a given server or by implementing a different IPC redirection mechanism, additional security may be enforced through efficient and more flexible access control [Jaeger et al., 1999]. However, a badly designed IPC introduces additional overhead into the implemented OS kernel [Liedtke, 1996].
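
For readers unfamiliar with the concept, the following minimal POSIX sketch shows message-based data exchange between two processes over a pipe; it is a generic illustration only and does not reflect the synchronous IPC primitives of L4 (the message text is invented):

#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int fd[2];
    if (pipe(fd) != 0) {
        perror("pipe");
        return 1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return 1;
    }
    if (pid == 0) {                      /* child process: the "server"    */
        char msg[64];
        ssize_t n = read(fd[0], msg, sizeof msg - 1);   /* receive message */
        if (n > 0) {
            msg[n] = '\0';
            printf("server received: %s\n", msg);
        }
        return 0;
    }

    const char *request = "decode frame 42";            /* send message    */
    if (write(fd[1], request, strlen(request)) < 0)
        perror("write");
    close(fd[1]);
    wait(NULL);                          /* parent waits for the child     */
    return 0;
}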

III.2.2. Real-time Processing Models

III.2.2.1 Best-effort, soft- and hard-real-time

A few processing models applicable to on-line multimedia processing can be distinguished with respect to real-time behavior. The first one is best-effort processing without any consideration of time during scheduling and without any QoS control, for example the aforementioned preemptive scheduling with a few priority levels used in best-effort OSes [Tannenbaum, 1995].

The second processing model is known as soft real-time, where deadline misses are acceptable (with processing being continued) and time buffers are required for keeping the quality constant. However, if the execution time varies so much that the limited buffer overruns, a frame skip may occur and the quality will drop noticeably. Moreover, delays depending on the size of the time buffer are introduced, so the trade-off between the buffer size and the quality guarantees is crucial here (and guarantees of delivering all frames cannot be given). Additionally, a sophisticated scheduling algorithm must allow for variable periods per frame in such a case, so the jitter-constrained periodic streams (JCPS) model [Hamann, 1997] is applicable here.
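
The buffer-size/quality trade-off can be illustrated with a toy simulation (all numbers are synthetic assumptions: a 40 ms frame period and uniformly distributed decoding times): a larger playout buffer hides more execution-time jitter and therefore causes fewer late or skipped frames, but introduces a proportionally larger start-up delay.

#include <stdio.h>
#include <stdlib.h>

#define FRAMES 500
#define PERIOD 40.0                      /* ms per frame at 25 fps         */

static double decode_time(void)          /* content-dependent, jittery     */
{
    return 25.0 + (double)(rand() % 30); /* 25..54 ms, mean below PERIOD   */
}

static int skipped_frames(int buffer_frames)
{
    double reserve = buffer_frames * PERIOD; /* playout time buffered ahead */
    int skipped = 0;

    srand(1);                            /* identical jitter for every run */
    for (int i = 0; i < FRAMES; i++) {
        reserve += PERIOD - decode_time();    /* one frame produced         */
        if (reserve > buffer_frames * PERIOD)
            reserve = buffer_frames * PERIOD; /* buffer cannot grow further */
        if (reserve < 0.0) {                  /* underrun: frame comes late */
            skipped++;
            reserve = 0.0;
        }
    }
    return skipped;
}

int main(void)
{
    for (int b = 0; b <= 8; b += 2)
        printf("buffer = %d frames (start-up delay %3.0f ms): %3d of %d frames late\n",
               b, b * PERIOD, skipped_frames(b), FRAMES);
    return 0;
}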

The third model is hard real-time, where deadline misses are not allowed and the processing must be executed completely. So, guarantees of delivering all the frames may be given here. However, a waste of resources is the problem, because in order to guarantee complete execution on time, worst-case scheduling must be applied [El-Rewini et al., 1994].

III.2.2.2 Imprecise computations

Imprecise computation [Lin et al., 1987] is a scheduling model which obeys the deadlines like the hard-real-time model, but applies to flexible computations, i.e. if there is still time left within the strict period, the result should be improved to achieve better quality. Here deadline misses are not allowed, but the processing is adapted to meet all the deadlines at the cost of graceful degradation of the result’s quality. The idea is that a minimum level of quality is provided by the mandatory part, and improved quality is delivered by optional computations if the resources are available.
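
A minimal sketch of this split into mandatory and optional parts is given below; the period, the worst-case costs and the notion of a content-dependent load factor are invented for the illustration:

#include <stdio.h>

#define PERIOD_MS      40.0   /* one frame period at 25 fps                */
#define MANDATORY_WCET 22.0   /* worst-case cost of the mandatory part     */
#define OPTIONAL_WCET  30.0   /* worst-case cost of the optional part      */

/* Process one frame; the load factor models how content-dependent the
 * actual costs are. Returns the quality level reached within the period. */
static int process_frame(double load_factor)
{
    double used = MANDATORY_WCET * load_factor;   /* mandatory: always run  */
    int quality = 1;                              /* minimum quality        */

    if (used + OPTIONAL_WCET * load_factor <= PERIOD_MS) {
        used += OPTIONAL_WCET * load_factor;      /* optional refinement    */
        quality = 2;                              /* improved quality       */
    }
    printf("load %.2f: %5.1f of %.0f ms used, quality level %d\n",
           load_factor, used, PERIOD_MS, quality);
    return quality;
}

int main(void)
{
    process_frame(0.5);    /* easy content: the optional part still fits   */
    process_frame(1.0);    /* worst case: only the mandatory part runs     */
    return 0;
}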

III.2.3. Scheduling Algorithms and QAS

A scheduling algorithm based on the idea of imprecise computation, applicable to real-time audio-video transformation with minimal quality guarantees and no deadline misses, is quality-assuring scheduling (QAS) proposed by [Steinberg, 2004]. QAS is based on the periodic use of resources, where the application may split a resource request into a few sub-tasks with different fixed priorities. The reservation time is the sub-task’s exclusive time allocated on the resource. This allocation is done a priori, i.e. during the admission test before the application is started. Moreover, the allocation is possible only if there is still free (unused) reservation time of the CPU available.
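
The a-priori reservation can be illustrated by a simplified admission test (a sketch under the assumption of a plain utilization bound; the actual QAS admission analysis in [Steinberg, 2004] is more involved, and the task parameters below are invented):

#include <stdio.h>

#define CAPACITY 1.0    /* fraction of the CPU available for reservations  */

static double admitted_utilization = 0.0;

/* A task asks to reserve 'reservation_ms' of CPU time in every period of
 * 'period_ms'; it is admitted only while unused capacity remains. */
static int admit(const char *name, double reservation_ms, double period_ms)
{
    double u = reservation_ms / period_ms;
    if (admitted_utilization + u > CAPACITY) {
        printf("%-16s rejected (needs %2.0f%%, only %2.0f%% left)\n",
               name, 100.0 * u, 100.0 * (CAPACITY - admitted_utilization));
        return 0;
    }
    admitted_utilization += u;
    printf("%-16s admitted (%2.0f%% reserved, %3.0f%% in total)\n",
           name, 100.0 * u, 100.0 * admitted_utilization);
    return 1;
}

int main(void)
{
    admit("video decode",    18.0, 40.0);   /* 45% of the CPU every 40 ms  */
    admit("video encode",    16.0, 40.0);   /* 40%                         */
    admit("audio transcode",  5.0, 20.0);   /* 25%: exceeds the capacity   */
    return 0;
}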


QAS scheduling [Steinberg, 2004] is the most suitable for the provision of format independence using real-time transformations of continuous multimedia data within the RETAVIC project, because guarantees of delivering all the frames may be given and a minimum level of quality will always be provided. It is also assumed that the complete task does not have to be finished by the deadline, but just its mandatory part delivering the minimum requested quality; thus the optional jobs (sub-tasks) within the periodic task may not be executed at all, or may be interrupted and stopped. So, QAS will be referred to further in this work and clarified whenever needed.

The reader can find an extensive discussion, including advantages and disadvantages, of scheduling algorithms usable for multimedia data processing in the related work of [Hamann et al., 2001a]. This discussion refers to:

1) the extended theoretical research on imprecise computations (Liu, Chong, or Baldwin), resulting in just one attempt to build a system for CPU allocation and no other resources (Hull);

2) time-value functions (Jensen with Locke, or Rajkumar with Lee), where the varying execution time of jobs but no changes in system load are considered;

3) Abdelzaher’s system for QoS negotiation, which considers only long-lived real-time services and assumes that a task’s resource requirements are constant;

4) the network-oriented view on packet scheduling in media streams (West with Poellabauer), also with guarantees of a specific level of quality, but not considering the semantic dependencies between packets;

5) statistical rate-monotonic scheduling (SRMS from Atlas and Bestavros), which relies on the actual execution time and on scheduling at the beginning of each period;

6) the resource kernel (Oikawa and Rajkumar), which manages and provides resources directly in the kernel (in contrast, QAS is server-based on top of a micro-kernel) and, moreover, does not address overloads, quality, or resources for optional parts.

Since the opinion on applying QAS in the RETAVIC project is similar to the one presented by [Hamann et al., 2001a] – namely that it is applicable to real-time multimedia processing – the scheduling algorithms are not discussed further in this work.


Chapter 3 – Design

Entia non sunt multiplicanda praeter necessitatem. (see footnote 18)
William of Occam (14th century, “Occam’s Razor”)

IV. SYSTEM DESIGN

This chapter gives an overview of the RETAVIC conceptual model. First, the architecture requirements are briefly stated. Next, the general idea based on the generic 3-phase video server framework proposed in [Militzer, 2004] is depicted; however, the 4-phase model (as already published in [Suchomski et al., 2005]) is described in even more detail, considering both audio and video, and then each phase is further clarified in a separate subsection. Some drawbacks of the previously described models of the architecture are mentioned in the respective subsections, and extensions are proposed accordingly. Subsequently, the general idea is evaluated in a short summary. Next, the hard-real-time adaptive model is proposed. Finally, the real-time issues in the context of continuous media according to the proposed conceptual model are described.

The subsections of the conceptual model referring to storage, including the internal format and meta-data (IV.2.3), and to real-time transcoding (IV.2.4) are further explained for each media type (media-specific details) separately in the following sections of this chapter: video in Section V, audio in Section VI, and both combined as multi-media in Section VII.

18 In English: Entities should not be multiplied beyond necessity. Based on this sentence the KISS principle has been developed: Keep It Simple, Stupid – but never oversimplify.

IV.1. Architecture Requirements

The main difficulty of media transformation is caused by the huge complexity of the process [Suchomski et al., 2004], especially for continuous media, where it must be done in real time to allow uninterrupted playout. The processing algorithms for audio and video require enormous computational resources, and their needs vary heavily since they depend on the input data. Thus accurate resource allocation is often impossible due to this unpredictable behavior [Liu, 2003] (a huge difference between worst-case and average execution times), and the missing scalability and adaptability of the source data formats inhibit QoS control.

The essential objective of this project is to develop functionality extending today’s multimedia database services that provides efficient and format-transparent access to multimedia data by utilizing real-time format conversion respecting the user’s specific demands. In other words, the RETAVIC project has been initiated with the goal of enabling format independence by real-time audio-video conversion in multimedia database servers. The conversion services will run in a real-time environment (RTE) in order to provide a specific QoS. A few main directions have been defined in which the multimedia DBMS should be extended:

Data independence – various clients' requests should be served by transparent on-line format transformations (format independence) without regard to physical storage and access methodology (data independence).

Data completeness – data should be stored and provided without any loss of information, whenever it is required.

Data access scalability – just a portion of the data should be accessed when lower quality is requested, and a complete reading should only take place if lossless data are requested.


Real-time conversion – a user should be allowed to transparently access data on demand. So, the conversion should be executed just in time, which imposes specific real-time requirements and thus requires QoS control.

Redundancy/transformation trade-off – a single copy (an instance) of each media object should be kept to save space and to ease updates. However, the system must not be limited to exactly one copy, especially when many clients with identical quality and format requirements are expected to be served, e.g. using caching proxies.

Real-time capturing – lossless insertion (recording) should be supported, which is especially important in scientific and medical fields.

These directions do not exclude Codd’s well-known Twelve Rules [Codd, 1995]; on the contrary, they extend them by considering continuous multimedia data. For example, the first direction specifies a method for Codd’s 8th and 9th rules (see footnote 19). Moreover, Codd’s rules have been defined for an ideal relational DBMS, which is not fully applicable to other DBMSes such as ODBMS, ORDBMS or XML-specific DBMS. For example, Codd’s 1st rule (the Information Rule – all data presented in tables) is somewhat useless with respect to multimedia data, where the presentation layer is usually left to the end-user application.

According to the mentioned six directions and the limitations of the existing media-server solutions (previously discussed in Chapter 2), a proposal of a generic real-time media transformation framework for multimedia database management systems and multimedia servers is developed and presented in the following part.

IV.2. General Idea

The generic real-time media transformation framework defined within the RETAVIC project (in short, the RETAVIC framework) finally consists of four phases: real-time capturing, non-real-time preparation, storage, and real-time delivery. The RETAVIC framework is depicted in Figure 4.

19 Codd’s Rules: 1. The Information Rule, 2. Guaranteed Access Rule, 3. Systematic Treatment of Null Values, 4. Dynamic On-Line Catalog Based on the Relational Model, 5. Comprehensive Data Sublanguage Rule, 6. View Updating Rule, 7. High-level Insert, Update, and Delete, 8. Physical Data Independence, 9. Logical Data Independence, 10. Integrity Independence, 11. Distribution Independence, 12. Nonsubversion Rule.

Figure 4. Generic real-time media transformation framework supporting format independence in multimedia servers and database management systems. Remark: dotted lines refer to optional parts that may be skipped within a phase.

Real-time capturing (Phase 1) includes a fast and simple lossless encoding of the captured multimedia stream and an input buffer for the encoded multimedia binary stream (details are given in Section IV.2.1). Phase 2 – the non-real-time preparation phase – prepares the multimedia objects and creates meta-data related to these objects; finally it forwards all the produced data to a multimedia storage system. An optional archiving of the original source, a format conversion from many different input formats into an internal format (which has specific characteristics), and a content analysis are executed in Phase 2 (described in Section IV.2.2). Phase 3 is a multimedia storage system where the multimedia bitstreams are collected and the accompanying meta-data are kept. Phase 3 has to be able to provide real-time access to the next phase (more about that in Section IV.2.3). Finally, Phase 4 – real-time delivery – has two parallel processing channels. The first one is the real-time transcoding sub-phase, where the obligatory processes, namely real-time decoding and real-time encoding, take place, and which may include optional real-time media processing (e.g. resizing, filtering). The second channel – marked as the bypass delivery process in Figure 4 – is used for direct delivery, which does not apply any transformation besides the protocol encapsulation. Both processing channels are treated as isolated, i.e. they work separately and do not influence each other (details about Phase 4 are given in Section IV.2.4).

A few general remarks apply to the proposed framework as a whole:

• the phases may be distributed over a network – then one must consider additional complexity and communication overhead, but can gain an increase in processing efficiency (and thus a higher number of served requests);

• Phase 1 is optional and may be completely skipped, however, only if the application scenario does not require live video and audio capturing;

• Phase 1 is not meant for non-stop live capturing (there must be some breaks in between to empty the media buffer), because of the unidirectional communication problem between the real-time and non-real-time phases (solvable only by guaranteed resources with worst-case assumptions in the non-real-time phase).

There are also two additional boundary elements visible in Figure 4 besides those within the described four phases. They are not considered to be a part of the framework, but they help in understanding its construction. The multimedia data sources and the multimedia clients with an optional feedback channel (useful in failure-prone environments) are the two constituents. The MM data sources represent the input data for the proposed architecture. The sources are generally restricted by the previously mentioned remark about non-stop live capturing and by the available decoder implementations. However, if the assumption is made that a decoder is available for each encoded input format, the input data may be in any format and quality, including lossy compression schemes, because the input data are simply treated as the original information, i.e. the source data for storage in the MMDBMS or MM server. The multimedia clients are analogously restricted only by the available real-time encoder implementations. And again, if the existence of such encoders is assumed for each existing format, the restriction of the client to a given format is dispelled due to the format independence provided by on-demand real-time transcoding. Moreover, classes of multimedia clients consuming identical multimedia data, which have already been requested and cached earlier, are supported by direct delivery without any conversion, the same as clients accessing the original format of the multimedia stream.

IV.2.1. Real-time Capturing

The first phase is designed to deliver real-time capturing, which is required in some application scenarios where no frame drops are allowed. Even though grabbing audio and video on the fly (live media recording) is a rather rare case considering the variety of applications in reality – most of the time the input audio and video data have already been stored on some other digital storage or device (usually pre-compressed to some media-specific format) – it must not be neglected. A typical use case for a capture-heavy scenario are scientific and industrial research systems, especially in the medical field, which regularly collect (multi)media data where loss of information is not allowed and real-time recording of the processes under investigation is a necessity (see footnote 20). Thus, the capturing phase of the RETAVIC architecture must provide a solution for information preservation and has to rely on a real-time system. Of course, if the multimedia data are already available as a media file, then Phase 1 can be completely skipped, as shown in Figure 4 by connecting the oval “Media Files” shape directly to Phase 2.

IV.2.1.1 Grabbing techniques

The process of audio-video capturing includes analog-to-digital conversion (ADC) of the signal, which is a huge topic of its own and will not be discussed in detail. Generally, the ADC is carried out only for an analog signal and produces a digital representation of the given analog input. Of course, ADC is a lossy process, like every discrete mapping of a continuous function. In many cases the ADC is already conducted directly by the recording equipment (e.g. within the digital camera’s hardware). In general, there are a few possibilities of getting audio and video signals from reality into the digital world:

20 In the case of live streaming of already compressed data, like MPEG-2-coded digital TV channels distributed through DVB-T/M/C/S, these streams could simply be dumped to the media buffer in real time. The process of dumping is not investigated due to its simplicity; the media buffer is discussed in the next subsection.


• by directly connecting (a microphone or camera) to standardized digital interfaces like IEEE 1394 (see footnote 21) (commonly known as FireWire, limited to 400 Mbps), Universal Serial Bus (USB) in version 2.0 (limited to 480 Mbps), wireless Bluetooth technology in version 2.0 (limited to 2.1 Mbps), Camera Link using the MDR-26 connector (or the optical RCX C-Link Adapter over MDR-26), or S/P-DIF (see footnote 22) (Sony/Philips Digital Interface Format), being a consumer version of IEC 60958 (see footnote 23) Type II Unbalanced using a coaxial RCA jack, or Type II Optical using an optical F05 connector (known as TOSLINK or EIAJ Optical – an optical fiber connection developed by Toshiba);

• by using specialized audio-video grabbing hardware producing digital audio and video, like PC audio and video cards (see footnote 24) (e.g. Studio 500, Studio Plus, Avid Liquid Pro from Pinnacle, VideoOh! from Adaptec, All-In-Wonder from ATI, REALmagic Xcard from Sigma Designs, AceDVio from Canopus, PIXCI Frame Grabbers from EPIX) – a subset of these is also known as Personal Video Recorder cards (PVRs, e.g. WinTV-PVR from Hauppauge, EyeTV from Elgato Systems) – or stand-alone digital audio and video converters (see footnote 25) (e.g. ADVC A/D Converters from Canopus, DVD Express or Pyro A/V Link from ADS Technologies, PX-AV100U Digital Video Converter from Plextor, DAC DV Converter from Datavideo, Mojo DNA from Avid) – a subset of stand-alone PVRs is also available (e.g. PX-TV402U from Plextor);

21 The first version of IEEE 1394 appeared in 1995, but the IEEE 1394a (2000) and 1394b (2002) amendments are also available. IEEE 1394b is known as FireWire 800 and is limited to 800 Mbps, thus the previous standards may be referred to as FireWire 400; a further amendment, IEEE 1394c, is coming up. However, to our knowledge, the faster FireWire has not yet been applied in audio-video grabbing solutions. i.Link is Sony’s implementation of IEEE 1394 (in which the 2 power pins are removed).
22 There is a small difference in the Channel Status Bit information between the specification of AES/EBU (or AES3) and its implementation by S/P-DIF. Moreover, the optical version of S/P-DIF is usually referred to as TOSLINK (in contrast to “coaxial S/P-DIF” or just “S/P-DIF”).
23 It is known as AES/EBU or AES3 Type II (Unbalanced or Optical).
24 Analog audio is typically connected through RCA (cinch) jack pairs, and analog video is usually connected through: coaxial RF (known as F-type, used for antennas; audio and video signals together), composite video (known as RCA – one cinch for video, a pair for audio), S-video (luminance and chrominance separately; known as Y/C video; audio separately) or component video (each R, G, B channel separately; audio also independently).
25 Digital audio is carried by the mentioned S/P-DIF or TOSLINK. Digital video is transmitted by DVI (Digital Video Interface) or HDMI (High-Definition Multimedia Interface). HDMI is also capable of transmitting a digital audio signal in addition.


• by using network communication, e.g. WLAN-based mobile devices (like stand-alone cameras with built-in WLAN cards) or Ethernet-based cameras (connected directly to the network);

• by using generic data acquisition (DAQ) hardware performing the AD conversion, either stand-alone with USB support (e.g. HSE-HA USB from Harvard Apparatus, Multifunction DAQs from National Instruments, Personal DAQ from IOTech), PCI-based (e.g. HSE-HA PCI from Harvard Apparatus, DaqBoard from IOTech), PXI-based (PCI eXtensions for Instrumentation, e.g. the Dual-Core PXI Express Embedded Controller from National Instruments) or Ethernet-based (e.g. E-PDISO 16 from Measurement Computing).

Regardless of the variety of grabbing technologies named above, it is assumed in the RETAVIC project that the audio and visual signals are already delivered to the system as digital signals for further capturing and storage (ADC is not analyzed any further). Moreover, it is assumed that huge amounts of continuous data (discussed in the next section) must be processed without loss of information (see footnote 26), for example when monitoring a scientific process with high-end 3CCD industrial video cameras connected through 3 digital I/O channels by DAQs, or when recording the digital audio of a high-quality symphonic concert using multi-channel AD converters transmitting over coaxial S/P-DIF.

IV.2.1.2 Throughput and storage requirements of uncompressed media data

A few digital cameras grouped into different classes may serve as examples of throughput and storage requirements (see footnote 27):

1) very high-resolution 1CCD monochrome/color camera e.g. Imperx IPX-11M5-LM/LC, Basler A404k/A404kc or Pixelink PL-A686C,

26 Further loss of information is meant, beyond the loss already introduced by the ADC.
27 The digital cameras used as examples are representative of the market on September 3rd, 2006. The proposed division into five classes should not change in the future, i.e. the characteristics of each class allow existing cameras to be assigned correctly at a given point in time. For example, in the future an HDTV camera might be available in the group of consumer cameras (class 5) instead of professional cameras (class 4).


2) high-resolution 3CCD (3-chips) color camera e.g. Sony DXC-C33P or JVC KY-F75U,

3) high speed camera e.g. Mikrotron MotionBLITZ Cube ECO1/Cube ECO2/Cube3, Mikrotron MC1310/1311, Basler A504k/A504kc,

4) professional digital camera e.g. JVC GY-HD111E, Sony HDC-900 or Thomson LDK-6000 MKII (all three are HDTV cameras),

5) consumer digital (DV) camera e.g. Sony DCR-TRV950E, JVC MG505, Canon DM-XM2E or DC20.

The first three classes comprise the industrial and scientific cameras. Class 4 refers to the professional market, while class 5 covers only the needs of ordinary consumers.

Class  Name       Model                  Width  Height  FPS   Bit Depth  Throughput [Mbps]  File Size of 60 sec. [GB]
1      Imperx     IPX-11M5-LM            4000   2672    5     12         612                4.48
1      Imperx     IPX-11M5-LC            4000   2672    5     12         612                4.48
1      Basler     A404k                  2352   1726    96    10         3717               27.22
1      Basler     A404kc                 2352   1726    96    10         3717               27.22
1      Pixelink   PL-A686C               2208   3000    5     10         316                2.31
2      JVC        KY-F75U                1360   1024    7.5   12         120                0.88
2      Sony       DXC-C33P               752    582     50    10         209                1.53
3      Mikrotron  MotionBLITZ Cube ECO1  640    512     1000  8          2500               18.31
3      Mikrotron  MotionBLITZ Cube ECO2  1280   1024    500   8          5000               36.62
3      Mikrotron  MotionBLITZ Cube3      512    512     2500  8          5000               36.62
3      Basler     A504k                  1280   1024    500   8          5000               36.62
3      Basler     A504kc                 1280   1024    500   8          5000               36.62
3      Mikrotron  MC1310                 1280   1024    500   10         6250               45.78
3      Mikrotron  MC1311                 1280   1024    500   10         6250               45.78
4      JVC        GY-HD111E              1280   720     60    8          422                3.09
4      Sony       HDC-900                1920   1080    25    12         593                4.35
4      Thomson    LDK-6000 MKII          1920   1080    25    12         593                4.35
5      Sony       DCR-TRV950E            720    576     25    8          79                 0.58
5      JVC        MG505                  1173   660     25    8          148                1.08
5      Canon      DM-XM2E                720    576     25    8          79                 0.58
5      Canon      DC20                   720    576     25    8          79                 0.58

Table 1. Throughput and storage requirements for a few digital cameras from different classes.


Table 1 shows how much data has to be handled when recording from the mentioned five classes of digital cameras. The bandwidth of the video data ranges from 5 Mbps up to 6250 Mbps. Interestingly, the highest bit rate is achieved neither by the highest resolution nor by the highest frame rate alone, but by a combination of high resolution and high frame rate (for the high-speed cameras in class 3). Moreover, a file keeping just 60 seconds of the visual signal from the most demanding camera requires almost 46 GB of space on the storage system.

Name | Width | Height | FPS | Pixel Bit Depth | Throughput [Mbps] | File Size of 60 sec. [GB]
HDTV 1080p | 1920 | 1080 | 25 | 8 | 396 | 2,90
HDTV 720p | 1280 | 720 | 25 | 8 | 176 | 1,29
SDTV | 720 | 576 | 50 | 8 | 158 | 1,16
PAL (ITU601) | 720 | 576 | 25 | 8 | 79 | 0,58
CIF | 352 | 288 | 25 | 8 | 19 | 0,14
CIFN | 352 | 240 | 30 | 8 | 19 | 0,14
QCIF | 176 | 144 | 25 | 8 | 5 | 0,04

Table 2. Throughput and storage requirements for selected video standards.

Additionally, the standard resolutions are listed in Table 2 in order to compare their bandwidth and storage requirements as well. The highest quality of the high-definition television standard (HDTV 1080p) requires as much as a low-end camera among the high-resolution industrial cameras (class 1). Hence, the requirements of the consumer-market standards are not as critical as the scientific or industrial demands.

Analogously, the audio requirements are presented in Table 3. The throughput of uncompressed audio ranges from 84 kbps for low-quality speech, through 1,35 Mbps for CD quality, up to 750 Mbps for 128-channel studio recordings. Correspondingly, an audio file of 60 seconds needs from 600 kB, through 10 MB, up to 5,6 GB (please note that the values in the file-size column are given in megabytes).


Name | Sampling frequency [kHz] | Sample Bit Depth | Number of channels | Throughput [Mbps] | File Size of 60 sec. [MB]
Studio 32/192/128 | 192 | 32 | 128 | 750,000 | 5625
Studio 24/96/128 | 96 | 24 | 128 | 281,250 | 2109
Studio 32/192/48 | 192 | 32 | 48 | 281,250 | 2109
Studio 24/96/48 | 96 | 24 | 48 | 105,469 | 791
Studio 32/192/4 | 192 | 32 | 4 | 23,438 | 176
Studio 24/96/4 | 96 | 24 | 4 | 8,789 | 66
Studio 20/96/4 | 96 | 20 | 4 | 7,324 | 55
Studio 16/48/4 | 48 | 16 | 4 | 2,930 | 22
DVD 7+1 | 96 | 24 | 8 | 17,578 | 132
DVD 5+1 | 96 | 24 | 6 | 13,184 | 99
DVD Stereo | 96 | 24 | 2 | 4,395 | 33
DAT | 48 | 16 | 2 | 1,465 | 11
CD | 44,1 | 16 | 2 | 1,346 | 10
HQ Speech | 44,1 | 16 | 1 | 0,673 | 5,0
PC-Quality | 22,05 | 16 | 2 | 0,673 | 5,0
Low-End-PC | 22,05 | 8 | 2 | 0,336 | 2,5
LQ-Stereo | 11 | 8 | 2 | 0,168 | 1,3
LQ Speech | 11 | 8 | 1 | 0,084 | 0,6

Table 3. Throughput and storage requirements for audio data.
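The raw figures in Tables 1–3 follow directly from the stream parameters. The small C sketch below (added here for illustration, not part of the original measurement setup) reproduces the calculation; it assumes, as the tables apparently do, binary prefixes (1 Mbps = 2^20 bit/s, 1 GB = 2^30 bytes, 1 MB = 2^20 bytes).

```c
#include <stdio.h>

/* Raw video bit rate in bit/s: width x height x frame rate x bit depth. */
static double video_bps(double w, double h, double fps, double depth)
{
    return w * h * fps * depth;
}

/* Raw audio bit rate in bit/s: sampling rate x bit depth x channels. */
static double audio_bps(double rate_hz, double depth, double channels)
{
    return rate_hz * depth * channels;
}

int main(void)
{
    /* Mikrotron MC1310/MC1311 from Table 1: 1280x1024, 500 fps, 10 bit */
    double v = video_bps(1280, 1024, 500, 10);
    printf("video: %.0f Mbps, %.2f GB per 60 s\n",
           v / (1 << 20),                  /* 6250 Mbps */
           v * 60.0 / 8.0 / (1UL << 30));  /* ~45,78 GB */

    /* Studio 32/192/128 from Table 3: 192 kHz, 32 bit, 128 channels */
    double a = audio_bps(192000, 32, 128);
    printf("audio: %.3f Mbps, %.0f MB per 60 s\n",
           a / (1 << 20),                  /* 750,000 Mbps */
           a * 60.0 / 8.0 / (1 << 20));    /* 5625 MB */
    return 0;
}
```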

IV.2.1.3 Fast and simple lossless encoding

Given such huge storage requirements, compression should be employed in order to store the captured volumes of data efficiently. However, the throughput requirement must not be neglected. Hence, the trade-off between compression efficiency and algorithm speed must be considered.

In general, a slower algorithm generates smaller output than a faster one due to its more sophisticated compression scheme –the algorithm itself is meant here, not the quality of a particular implementation– including steps such as motion estimation and prediction, scene-change detection, or the decision on a frame type or macro-block type. Unfortunately, slow and complex algorithms cannot be used in the capture process, because their motion estimation and prediction consume too many resources to permit recording of visual signals as provided by high-quality, very high-quality or high-speed digital cameras. Besides, some slow algorithms are not even able to process video from HDTV cameras in real time. What is more, the implementations of the compression algorithms usually


are still best-effort implementations dedicated to non-real-time systems, and thus they merely happen to work in real time (or rather "on time"), i.e. they produce the compressed data fast enough by processing a sufficient number of frames or samples per second on expensive hardware. Secondly, there is very little or no control over the process of on-line encoding. Moreover, and in spite of expectations, some implementations are not even able to handle huge amounts of data at high resolution and high frame rate, or at high sample rate with multiple streams, in real time even on modern hardware.

So, the only solution is to capture the input media signals with a fast and simple compression scheme, which does not contain complicated functions, is easy to control and at least delivers the data on time (preferably running on a real-time system). A very good candidate for such compression of video data is the HuffYUV algorithm proposed by [Roudiak-Gould, 2006]. It is a lossless video codec employing a selectable prediction scheme and Huffman entropy coding. It predicts each sample separately, and the entropy coder encodes the resulting error signal by selecting the most appropriate predefined Huffman table for a given channel (it is also possible to specify an external table). There are three possible prediction schemes: left, gradient and median. The left scheme predicts from the previous sample of the same channel; it is the fastest one but in general delivers the worst results. The gradient method predicts from a calculation of three values – Left plus Above minus AboveLeft – and is a good trade-off between speed and compression efficiency. The median scheme predicts the median of three values: Left, Above and the gradient predictor; it delivers the maximal compression but is the slowest one (yet still much faster than complex algorithms with motion prediction, e.g. DCT-based codecs [ITU-T Rec. T.81, 1992]). The HuffYUV codec supports the YUV 4:2:2 (16 bpp), RGB (24 bpp28) and RGBA (RGB with alpha channel; 32 bpp) color schemes. With respect to the YUV color scheme, the UYVY and YUY2 ordering methods are allowed. As the author claims, a computer "with a modern processor and a modern IDE hard drive should be able to capture CCIR 601 video at maximal compression (…) without problems" [Roudiak-Gould, 2006]29. A known limitation of HuffYUV is that the resolution must be a multiple of four.
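To make the three prediction schemes concrete, the following C sketch shows HuffYUV-style predictors for a single 8-bit channel. It is only an illustration under stated assumptions: buffer layout, function names and the border handling are chosen for clarity and do not reproduce the actual codec's channel interleaving or sign handling.

```c
#include <stdint.h>

/* Sketch of the three HuffYUV-style predictors for one 8-bit channel.
 * row is the current line, prev_row the previously coded line (or NULL
 * for the first line); names and layout are illustrative only. */

static uint8_t predict_left(const uint8_t *row, int x)
{
    return x > 0 ? row[x - 1] : 0;          /* previous sample, same channel */
}

static int predict_gradient(const uint8_t *row, const uint8_t *prev_row, int x)
{
    int left       = x > 0 ? row[x - 1] : 0;
    int above      = prev_row ? prev_row[x] : 0;
    int above_left = (prev_row && x > 0) ? prev_row[x - 1] : 0;
    return left + above - above_left;       /* Left + Above - AboveLeft */
}

static uint8_t predict_median(const uint8_t *row, const uint8_t *prev_row, int x)
{
    int a = predict_left(row, x);
    int b = prev_row ? prev_row[x] : 0;
    int c = predict_gradient(row, prev_row, x);
    if (a > b) { int t = a; a = b; b = t; } /* ensure a <= b              */
    c = c < a ? a : (c > b ? b : c);        /* median of the three values */
    return (uint8_t)c;
}

/* The entropy coder then encodes only the prediction error (modulo 256). */
static uint8_t prediction_error(uint8_t sample, int prediction)
{
    return (uint8_t)(sample - prediction);
}
```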

28 bpp stands for bits per pixel.
29 Some comparison tests have been carried out within the RETAVIC project. For details, the reader is referred to [Militzer et al., 2005].


According to [Suchomski et al., 2006], from among the lossless audio formats available on the market, WavPack [WWW_WP, 2006] and the Free Lossless Audio Codec (FLAC) [WWW_FLAC, 2006] are good candidates for use in the real-time capturing phase. Figure 5 shows a comparison of decoding speed and compression size of some lossless audio codecs, measured on the EBU Sound Quality Assessment Material (SQAM) [WWW_MPEG SQAM, 2006] and on a private set of music samples (Music) [WWW_Retavic - Audio Set, 2006].

[Figure 5: scatter plot of decoding speed (× real-time, logarithmic scale from 1 to 1000) versus compression rate (25% to 70%) for the SQAM and Music test sets, comparing SLS (with 64/256 kbps core and without core), WavPack, Monkey's Audio, FLAC and OptimFROG.]

Figure 5. Comparison of compression size and decoding speed of lossless audio codecs [Suchomski et al., 2006].

It has been observed in [Suchomski et al., 2006] that the codecs achieve different compression rates depending on the content of the input samples, as well as different coding speeds. The best results in terms of speed are achieved by FLAC, whose decoding is about 150 to 160 times faster (on the used test bed) than required for real-time processing. In contrast, WavPack is only about 100 times faster. In terms of compression rate, however, FLAC does not achieve the best results. FLAC is rich in features: it supports bit depths from 4 to 32 bits per sample, sampling rates up to 192 kHz and currently up to 8 audio channels. However, FLAC offers no kind of scalability feature. As the source code of the FLAC libraries is licensed under the BSD license, modifications would be possible without licensing problems. Analogous behavior with respect to the encoding speeds may also be observed, but further investigation should be conducted.


IV.2.1.4 Media buffer as temporal storage

Given the extreme throughput and storage requirements presented in the previous subsection, a solution for storing the audio and video data in real time must be provided. It would be possible to store them directly in the MMDBMS; however, this direct integration is not investigated further with respect to the RETAVIC architecture. One of the arguments for keeping a separate media buffer is the non-real-time preparation phase (Phase 2), which does not allow any real-time processing and which follows Phase 1 and precedes storing the media data in the MMDBMS (Phase 3).

Thus, the solution proposed to handle the storing process defines a media buffer as intermediate temporary storage for the captured media data, associated with the real-time capturing phase. This prevents losing the data and keeps it for further complex analysis and preparation in the subsequent Phase 2. Of course, its content is only temporary and is to be removed after the required steps of Phase 2 have been executed.

Different hardware storage solutions and offered throughput
The media buffer should have a few characteristics derived from the capturing requirements, namely: real-time processing, an appropriate size with respect to the application, and enough bandwidth according to the properties of the grabbed media data (allowing the required throughput). A few hardware solutions are compared in Table 4. The caches on the different levels (L1, L2, L3) are omitted from the MEM class because their small sizes make them inapplicable as a media buffer.

The MEM class is non-permanent memory – the most temporary storage of all four classes given in Table 4, since its content is erased when the power is switched off. It is divided into: simple RAM, dual-channel RAM and NUMA. Simple RAM (random access memory) is delivered in hardware as single in-line memory modules (SIMM) or dual in-line memory modules (DIMM). The best-known representatives of SIMMs are DRAM SIMM (30)30 and FPM31 SIMM (72). Many versions of DIMMs are available: DDR2 SDRAM (240), DDR32

30 The number in brackets gives the number of pins present on the module.
31 FPM – Fast Page Mode (DRAM), optimized for successive reads and writes (bursts of data).
32 DDR – Double Data Rate (of the front-side bus), achieved by transferring data on both the rising and falling edges of the clock.


SDRAM33 (184), (D)RDRAM34 (184), FPM (168), EDO35 (168), SDRAM (168). There are also versions for notebooks called SODIMMs, which include: DDR SDRAM (200 pins), FPM (144), EDO (144), SDRAM (144), FPM (72), EDO (72). Only DDR2 SDRAM is referred to in Table 4, because these modules are the fastest of all listed above.

Class | Type | Example | Peak Bandwidth [MB/s]
MEM | Simple RAM | DIMM DDR2 64bit/800MHz | 6104
MEM | Dual-Channel RAM | 2x DIMM DDR2 64bit/800MHz | 12207
MEM | NUMA | HP Integrity SuperDome | 256000
NAS | Ethernet | 10GbE | 1250
NAS | Channel Bonded Ethernet | 2x 10GbE | 2500
NAS | Myrinet | 4th Generation Myrinet | 1250
NAS | Infiniband | 12x Link | 3750
SAN | SCSI over Fiber Channel | FC SCSI | 500
SAN | SATA over Fiber Channel | FC SATA | 500
SAN | iSCSI | 10GbE iSCSI | 1250
SAN | AoE | 10GbE AoE | 1250
DAS | RAID SAS | 16x SAS Drives Array | 300
DAS | RAID SATA | 16x SATA Drives Array | 150
DAS | RAID SCSI | 16x Ultra-320 SCSI Drives Array | 320
DAS | RAID ATA/ATAPI | 16x ATAPI UDMA 6 Drives Array | 133

Table 4. Hardware solutions for the media buffer.36

Dual-channel RAM is a technology for bonding two memory modules into one logical unit. A present-day motherboard chipset may support two independent memory controllers allowing simultaneous access to the RAM modules; thus, one 64-bit channel may be defined for upstream data and the other for downstream data (IN and OUT), or both channels may be bonded into one 128-bit channel (either IN or OUT).

33 SDRAM – Synchronous Dynamic RAM, i.e. it is synchronized with the CPU speed (no wait states are present).
34 (D)RDRAM – (Direct) Rambus DRAM, a type of SDRAM proposed by Rambus Corp.; however, its latency is higher than that of DDR/DDR2 SDRAM, and its heat output is higher, requiring special metal heat spreaders.
35 EDO – Extended Data Out (DRAM), about 5% faster than FPM, because a new cycle starts while data output from the previous cycle is still in progress (overlapping of operations, i.e. pipelining).
36 This table includes only raw values of the specified hardware solutions, i.e. no protocol, communication or other overheads are considered. In order to take them into account, measurements of user-level or application-level performance must be conducted.


NUMA is Non-Uniform Memory Access (sometimes Architecture), defined as a computer memory design with hierarchical access (i.e. the access time depends on the memory location relative to a processor) and used in multiprocessor systems. A CPU can thus access its own local memory faster than non-local (remote) memory (which is memory local to another processor, or shared memory). ccNUMA is cache-coherent NUMA, in which keeping the caches of the same memory regions consistent is critical; this is usually solved by inter-process communication (IPC) between cache controllers. As a result, ccNUMA performs badly on applications that access the same memory regions in rapid succession. A huge advantage of NUMA and ccNUMA is the much bigger memory space available to the application while still providing very high bandwidth, known as the scalability advantage over symmetric multiprocessors (SMPs – where it is extremely hard to scale beyond 8-12 CPUs). In other words, it is a good trade-off between Shared-Memory MIMD37 and Distributed-Memory MIMD.

The differences between the next classes of storage, namely NAS, SAN and DAS, are depicted in Figure 6, which comes directly from the well-known Auspex Storage Architecture Guide38 [Auspex, 2000].

Figure 6. Location of the network determines the storage model [Auspex, 2000].

37 MIMD – Multiple Instruction Multiple Data (in contrast to SISD, MISD, or SIMD).
38 Auspex is commonly recognized as the first true NAS company, established before the term NAS had even been defined.


The Network Attached Storage (NAS) class is a network-based storage solution and, in general, it is provided by a network file system. The network infrastructure is used as the base layer for a storage solution provided by clusters (e.g. Parallel Virtual File System [Carns et al., 2000], Oracle Clustered File System V2 [Fasheh, 2006], Cluster NFS [Warnes, 2000]) or by dedicated file servers (e.g. software-based SMB/CIFS or NFS servers, Network Appliance Filer [Suchomski, 2001], EMC Celerra, Auspex Systems). NAS may be based on different network types regardless of the type of physical connection (glass fiber or copper), such as Ethernet, Channel Bonded Ethernet, Myrinet or Infiniband. Thus the bandwidth presented in Table 4 is given for each of the mentioned network types without consideration of the overhead introduced by the file-sharing protocol of a specific network file system39. Ethernet is the hardware base of the Internet and its standardized bandwidth ranges from 10 Mbps, through 100 Mbps and 1 Gbps (1GbE), up to the newest 10 Gbps (10GbE). Channel Bonded Ethernet is analogous to dual-channel RAM but refers to two NICs coupled logically into one channel. Myrinet was the first network allowing bandwidth up to 1 Gbps; the current fourth-generation Myrinet supports 10 Gbps. In general, it has two fiber-optic connectors (upstream/downstream). Myricom Corp. offers NICs which support both 10GbE and 10 Gbps Myrinet. Infiniband uses two types of connectors: the Host Channel Adapter (HCA), used mainly for IPC, and the Target Channel Adapter (TCA), used mainly in I/O subsystems. Myrinet and Infiniband are low-latency and low-protocol-overhead networks in comparison to Ethernet.

SAN stands for Storage Area Network and is a hardware solution delivering high I/O bandwidth. It is more often associated with block I/O services (block-based storage in analogy to SCSI/EIDE) than with file access services (a higher abstraction level). It usually uses one of four types of network for block-level I/O: SCSI over fiber channel, SATA over fiber channel, iSCSI, or the recently proposed ATA over Ethernet (AoE) [Cashin, 2005]. The SCSI40/SATA41 over fiber

39 A network file system resides on a higher logical level of the ISO/OSI network model, so in order to measure its efficiency, an evaluation at the user or application level must be carried out, considering the bandwidth of the underlying network layer.
40 SCSI is the Small Computer System Interface, the most commonly used 8- or 16-bit parallel interface for connecting storage devices in servers. It ranges from SCSI (5 MB/s) up to Ultra-320 SCSI (320 MB/s). The newer SAS (Serial Attached SCSI) is also available, being much more flexible (hot-swapping, improved fault tolerance, higher number of devices) and still achieving 300 MB/s.
41 SATA stands for Serial Advanced Technology Attachment (in contrast to the original ATA, which was parallel and is now referred to as PATA) and is used to connect devices with a bandwidth of up to 1,5 Gbps (but due to the 8b/10b encoding on the physical layer


channel usually provides a bandwidth of 1 Gbps, 2 Gbps or 4 Gbps and requires the use of a special Host Bus Adapter (HBA). iSCSI is Internet SCSI (or SCSI over IP), which may use a standard Ethernet NIC (e.g. 10GbE) as the medium. AoE is a network protocol designed for accessing ATA42 storage devices over an Ethernet network, thus enabling cheap SANs based on low-cost, standard technologies. AoE simply puts ATA commands into low-level network packets (replacing the ATA ribbon – the wide 40- or 80-line cable – by an Ethernet cable). Its only drawback (but also a design goal) is that it is not routable beyond the local network (and thus very simple).

Direct Attached Storage (DAS), or simply local storage43, consists of devices (disk drives, disk arrays, RAID arrays) attached directly to the computer by one of the standardized interfaces such as ATA/ATAPI, SCSI, SATA or SAS. Only the fastest representatives of each of the mentioned standards are given in Table 4. Each hardware implementation of a given standard has its limits, in contrast to SATA/SCSI over fiber channel, iSCSI or AoE, where just the protocols are implemented and a different physical layer providing higher transfer speed is used as the carrier.

Evaluation of storage solutions in the context of RETAVIC
From the RETAVIC perspective, each solution has advantages and disadvantages. Memory is very limited in size unless the very expensive NUMA technology is used, which allows for scalability. Besides, its non-permanent characteristic demands that the grabbing hardware stays on-line until all captured data has been processed by Phase 2. An advantage is definitely the ease of implementing real-time support (it might be supported directly in the RTOS kernel without special drivers), and it has extremely high bandwidth – some applications, like capturing with high-speed cameras, may only be realized by using memory directly (e.g. the Mikrotron MC1311 requires at least 6,25 Gbps of bandwidth without overhead). DAS is limited in scalability

it achieves only 1,2 Gbps). The newer version (SATA-2) doubles the bandwidth (3 Gbps/2,4 Gbps respectively), and a third version under development should allow a four times higher bandwidth (6 Gbps/4,8 Gbps).
42 ATA is also commonly referred to as Integrated Drive Electronics (IDE), Enhanced IDE (EIDE), ATA Packet Interface (ATAPI), Programmed Input/Output (PIO), Direct Memory Access (DMA), or Ultra DMA (UDMA), which is incorrect, because ATA is a standard interface for connecting devices; IDE/EIDE only uses ATA, and PIO, DMA and UDMA are different data access methods. ATAPI is an extension of ATA and has replaced it. ATA/ATAPI ranges from 2,1 MB/s up to 133 MB/s (with UDMA 6).
43 DAS is also called captive storage or server-attached storage.


(compared to NAS or SAN); however, real-time support could be provided with only some effort (a block-access driver supporting the real-time kernel usually must be implemented for each specific solution). Moreover, it has a good price-to-benefit ratio. SAN is a scalable and reliable solution which used to be very expensive; however, with the introduction of AoE it seems to be approaching the prices of DAS. Still, its real-time support requires a sophisticated implementation within the RTOS, considering the network driver coupled with the block-access method (additionally, the network communication between the server and the SAN device must be real-time capable). NAS is a relatively cheap and very scalable solution; however, it demands an even more sophisticated implementation to provide real-time capabilities, because none of the existing network file-sharing systems provides real-time and QoS control (due to the missing real-time support in the network layer used as the base layer for the network file system).

Due to this variety of cost-versus-efficiency characteristics, the final decision on using one of the given solutions is left to the system designer exploiting the RETAVIC architecture. For the application within this project, NAS has been chosen as most suitable for testing purposes (limited size / high speed / ease of use).

IV.2.2. Non-real-time Preparation

Phase 2 is responsible for the insertion and update of data in the RETAVIC architecture, i.e. the actual step of putting the media objects into the media server is performed in this non-real-time preparation part. Its main goal is the conversion to a particular internal format, i.e. converting the input media from the source storage format into the internal storage format that is most suitable for real-time conversion. Secondly, it performs a content analysis and stores the meta-data. Additionally, it is able to archive the origin source, i.e. to keep the input media data in the origin format.

Phase 2 does not require any real-time processing and thus may use best-effort implementations of the format conversion and complex analysis algorithms. The drawback of the missing real-time requirement is that an implementation of the architecture cannot guarantee the time needed to conduct the insert or update operations, i.e. they can only be executed in a best-effort manner with respect to the controlling mechanism (e.g. as fast as possible according to the selected thread/process/operation priority). Moreover, the transactional behavior


of the insert and update operations must be provided regardless of whether the real-time properties are considered critical or not.

IV.2.2.1 Archiving origin source

The archiving origin source module is optional and does not have to exist in every scenario; however, it is required in order to cover all possible applications. This module was introduced in the latest version of the proposed RETAVIC architecture (Figure 4) due to two requirements: exact bit-by-bit preservation of the origin stream, and even more flexible application in the real world, allowing simpler delivery with smaller costs in certain use cases (this also required changes in Phase 4). Moreover, where meta-data is embedded in the source encoded bitstream, this meta-information could be dropped by the decoding process; it is now preserved by keeping all the original bits and bytes whenever required.

The first goal is achieved because the source media bit stream is simply stored completely as the origin binary stream regardless of the format used, in analogy to the well-known binary large objects (BLOBs). The problem noticed here is the varying lossiness of the decoding process, i.e. the amount of noise in the decoded raw representation always varies due to different decoder implementations (e.g. with dissimilar rounding functions), even if the considered decoders operate on the same encoded input bitstream. This problem was neglected in earlier versions of the RETAVIC architecture due to the assumption that the media data were regarded as the source only after being decoded from a lossy format by a lossy decoder. Now, however, this problematic aspect is additionally handled by preserving a bit-by-bit copy of the source (encoded) data. Of course, using only BLOBs without any additional (meta) information would be impossible, but in the context of the proposed architecture it makes sense because of the existing relationship between the BLOB and its well-understandable representation in the scalable internal format.

The second goal of achieving higher application flexibility is addressed by giving the user the opportunity to access the source bitstream in its origin format. If the proposed archiving origin source module were not present, the origin format would have to be produced in the real-time transcoding phase like every other requested format. By introducing the module, however, the process may be simplified and the transcoding phase may be skipped. On the other


hand, the probability of a user requesting a format exactly identical to the source format is extremely small. Thus it may be useful only in applications where the requirement of preserving the bitwise origin source is explicitly stated.

IV.2.2.2 Conversion to internal format

Here the integration of media objects is achieved by importing different formats, decoding them to a raw format (uncompressed representation) and then encoding them into the internal format. This integration is depicted as the conversion module in the middle of Phase 2 in Figure 4. The media source can either be a lossless binary stream from the media buffer of Phase 1, described in the previous section IV.2.1, or an origin media file (also called the origin media source) delivered to the MMDBMS in digital (usually compressed) form from the outside environment.

If the media data come from the media buffer, the decoding is fairly simple because the format of the lossless binary stream grabbed by Phase 1 is exactly known. Hence, the decoding process is just the inverse of the previously defined fast and simple lossless encoding (third subsection of IV.2.1 Real-time Capturing) and is similarly easy to handle.

If the media data come from the outside environment as a media file, the decoding is more complex than in the previous case, because it requires an additional step of detecting the storage format with all necessary properties (whenever the format with the required parameters was not specified explicitly), and then the correct decoder for the detected/given digital representation of the media data has to be chosen. Finally, the decoding is executed. Problems may appear if the format can be decoded by more than one available decoder; in this case, a selection scheme must be applied. Some methods have already been proposed, e.g. by Microsoft DirectShow or the Java Media Framework. Both schemes allow manual or automatic selection of the decoders used44.

44 A semi-automatic method would also be possible, where the media application decides on its own if there is just one possible decoder and asks the user for a decision if there is more than one. However, the application must handle this additional use case.


Secondly, this part employs an encoding algorithm to produce a lossless scalable binary stream (according to the next section IV.2.3) and decoder-related control meta-data for each continuous medium separately. A detailed explanation of each media-specific format and its encoder is given in the following chapters: for video in V and for audio in VI. One important characteristic of employing a media-specific encoder is the possibility to exchange (or update) the internal storage format for each medium at will, simply by switching the current encoding algorithm to a new one within the Conversion to internal format module. Moreover, such an exchange should not influence the decoding part of this module, but must be done in coordination with Phase 3 and Phase 4 of the RETAVIC architecture (additional information on how to do this is given in section IV.3.2).

IV.2.2.3 Content analysis

The last but not least important aspect of feeding media data into the media server is the content analysis of these data. This step looks into a media object (MO) and its content, and based on that produces the meta-data (described in the next section) which are required later in the real-time transcoding part (section IV.2.4). The content analysis produces only a certain set of meta-data. The set is specific to each media type, and the proposed MD sets are presented later; however, these MD sets are neither finished nor closed – so let us call them initial MD sets. An MD set may be extended, but due to the close relation between the content analysis and the produced meta-data set, the content analysis has to be extended in parallel.

Since the conversion into the internal storage format and the content analysis of the input data need to be performed just once upon import of a new media file into the system, the resource consumption of the non-real-time preparation phase is not critical for the behavior of the media server. As a result, the mentioned operations are designed as best-effort non-real-time processes that can run at the lowest priority in order not to negatively influence the resource-critical real-time transcoding phase.

IV.2.3. Storage

Phase 3 differs from all other phases in one point, namely it is not a processing phase, i.e. no processing of media data or meta-data is performed here. It would be possible to employ some querying or information retrieval here, but this was clearly defined as out of the scope of the


RETAVIC architecture. Moreover, the access methods and I/O operations are also not research points to be dealt with. It is simply taken for granted that an established storage solution is provided, analogous to one of the usable storage methods like SAN, NAS or DAS – these methods have already been described in detail in the subsection Media buffer as temporal storage (of section IV.2.1).

Moreover, the set of storage methods may be extended by considering higher-level representations of data, e.g. instead of file- or block-oriented methods it may be possible to store the media data in another multimedia database management system or on a media server addressed by a uniform resource identifier (URI), or anything similar.

However, the chosen solution must also offer well-controllable real-time media data access. Thus, the RETAVIC architecture does not limit the storage method beyond requiring real-time support for media access (similar to the real-time capturing), i.e. real-time I/O operations must be provided, e.g. write/store/put/insert&update and read/load/get/select. This real-time requirement may be hard to implement from the hardware or operating-system perspective [Reuther et al., 2006], but again it is the task of other projects to solve the problem of real-time access to the storage facilities. A few examples of research on real-time access with QoS support can be found in [Löser et al., 2001a; Reuther and Pohlack, 2003].

IV.2.3.1 Lossless scalable binary stream

All media objects are to be stored within the RETAVIC architecture as lossless scalable streams to provide application flexibility, efficiency, and scalability of access and processing. There are a few reasons for storing the media data in such a way.

First, there are applications that require lossless storage of audio and video recordings. They would not be supported if the architecture assumed a lossy internal format from the beginning in its design postulation. Conversely, undemanding applications, which do not require lossless preservation of information, can still benefit from the losslessly stored data by extracting just a subset of the data or by lossy transcoding. Hence, the lossless requirement for the internal storage format allows all application fields to be covered, and thus brings application flexibility to the architecture. From another perspective, every DBMS is obliged


to preserve the information as it was stored by the client and deliver it back without any changes. If a lossy internal format were used, such as the FGS-based video codecs described in section III.1.1.1, information would be lost during the encoding process. Moreover, the introduced noise would grow with every update due to the well-known tandem coding problem [Wylie, 1994] of lossy encoders, even if the decoding did not introduce any additional losses45.

Considering application flexibility and information preservation on the one hand, and efficiency and scalability of access and processing on the other, the only possible solution is to make a lossless storage format scalable. Lossless formats have been designed for their lossless properties and hence are not as efficient in compression as lossy formats, which usually exploit perceptual models of the human being46. Moreover, lossless codecs are usually unscalable, e.g. those given in section III.1.1.2, and process all or nothing, i.e. the origin data is stored in one coded file, which requires that the system reads the stored data completely and then also decodes it completely; there is no way to access and decode just a subset of the data. Because the compressed size delivered by lossless codecs usually ranges from 30% to 70% of the original, due to the requirement of information preservation, a relatively large amount of data always has to be read, which is not required in all cases (lossless information is not always needed).

By introducing scalability (e.g. by data layering) into the storage format, scalability of access is provided, i.e. the ability to access and read just a subset of the compressed data (e.g. just one layer). Moreover, it also allows scalability of processing by handling just this smaller set of input data, which is usually faster than dealing with the complete set. As a side effect, the scalability of the binary storage format also provides lossy information (by delivering a lower quality of the information) at a smaller compressed size; for example, just one tenth of the compressed data may be delivered to the user. Examples of scalable and lossless coding

45 The tandem coding problem appears if the data is first selected from the MMDBMS and decoded, and then encoded and updated in the MMDBMS. Of course, if the update occurs without selecting and decoding the media data from the MMDBMS, but by getting them from a different source, the tandem coding problem no longer applies.
46 In any case, a comparison of lossless to lossy encoders does not really make sense, because the lossless algorithms will always lose with respect to compression efficiency.


schemes usable for video and audio data are discussed in the Related Work in sections III.1.1.3 and III.1.2.
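To make the notion of scalability of access more concrete, the following C sketch shows how a layered, losslessly stored bitstream could be read selectively. The layer index structure and the function are hypothetical illustrations under the assumption that each layer's offset and size are recorded next to the bitstream; they do not describe the actual RETAVIC storage layout.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical per-frame layer index for a layered lossless bitstream.
 * Reading only the first n layers realizes scalability of access:
 * fewer bytes are read and decoded when a lower quality is sufficient. */
typedef struct {
    long   offset;   /* byte offset of the layer within the stream */
    size_t size;     /* compressed size of the layer in bytes      */
} LayerIndexEntry;

/* Read the first n_wanted layers of one frame into a caller-provided buffer.
 * Returns the number of bytes actually read, or -1 on error. */
static long read_layers(FILE *stream, const LayerIndexEntry *index,
                        int n_layers, int n_wanted, unsigned char *buf)
{
    long total = 0;
    if (n_wanted > n_layers)
        n_wanted = n_layers;
    for (int i = 0; i < n_wanted; i++) {
        if (fseek(stream, index[i].offset, SEEK_SET) != 0)
            return -1;
        if (fread(buf + total, 1, index[i].size, stream) != index[i].size)
            return -1;
        total += (long)index[i].size;
    }
    return total;   /* e.g. base layer only: total is far below the full size */
}
```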

IV.2.3.2 Meta data

Multimedia transcoding, and especially audio and video processing, is a very complex and unpredictable task. The algorithms cannot be controlled during execution with respect to the amount of processed data and the time constraints, because their behavior depends on the content of the audio and video, i.e. the functional properties are defined with respect to coding/compression efficiency as well as to quality according to the human perception system – the human hearing system (HHS) for audio information [Kahrs and Brandenburg, 1998] and the human visual system (HVS) for video perception, respectively [Bovik, 2005]. Due to the complexity and unpredictability of the algorithms, the idea of using meta-data (MD) to ease the transcoding process and to make its execution controllable is a core solution used in the RETAVIC project.

As already mentioned, the MD are generated during the non-real-time preparation (Phase 2) by the process called content analysis. The content analysis looks into a media object (MO) and its content, and based on that produces two types of MD: static and continuous. The two MD types have different purposes in the generic media transformation framework.

The static MD describe the MO as it is stored and hold information about the structure of the video and audio binary streams. Thus, static MD keep statistical and aggregated data allowing an accurate prediction of the resource allocation for the media transformation in real time. They must therefore be available before the actual transcoding starts. However, static MD are not required anymore during the transcoding process itself, so they may be stored separately from the MO.

The continuous MD are time-dependent (like the video and audio data themselves) and are to be stored together with the media bit stream in order to guarantee real-time delivery of the data. The continuous MD are meant to help the real-time encoding process by feeding it with the pre-calculated values prepared by the content analysis step. In other words, they are required in order to reduce the complexity and unpredictability of the real-time encoding process (as explained in the subsection Real-time transcoding of Section IV.2.4).


A noticeable fact is that the size of the static MD is very small in comparison to the continuous MD. Moreover, the static MD are sometimes referred to as coarse-granularity data due to their aggregative properties (e.g. the number of existing I-frames, the total number of audio samples), while the continuous MD are correspondingly called fine-granularity data because of their close relation to each quantum (or even to a part of a quantum) of the MO.
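The distinction between the two MD types could be reflected in data structures roughly as sketched below. The concrete fields are illustrative assumptions only; the actual MD sets are defined later in chapters V and VI.

```c
#include <stdint.h>

/* Coarse-granularity (static) MD: aggregated per media object, small,
 * needed before transcoding starts for resource-allocation prediction.
 * Field names are illustrative only. */
typedef struct {
    uint32_t frame_count;     /* total number of video frames       */
    uint32_t i_frame_count;   /* number of intra-coded frames       */
    uint64_t sample_count;    /* total number of audio samples      */
    uint32_t max_frame_bits;  /* worst-case compressed frame size   */
    double   avg_decode_cost; /* estimated decoding cost per frame  */
} StaticMD;

/* Fine-granularity (continuous) MD: one record per quantum (frame or
 * sample block), stored and delivered together with the bitstream. */
typedef struct {
    uint64_t timestamp;       /* presentation time of the quantum    */
    uint8_t  frame_type;      /* e.g. I/P/B decision from Phase 2    */
    uint32_t coded_size;      /* compressed size of this quantum     */
    uint32_t est_cycles;      /* predicted processing cost           */
    /* pre-computed hints such as motion vectors would follow here   */
} ContinuousMD;
```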

IV.2.4. Real-time Delivery

The last but not least phase is Phase 4. The real-time delivery is divided into two processing channels. The first one is real-time transcoding, which is the key task for achieving format independence in the RETAVIC architecture. The second one is an extension of the earlier proposed version of the architecture and provides real-time direct delivery of the stored media objects to the client application. Both processing channels are further described in the following subsections.

IV.2.4.1 Real-time transcoding

This part finally provides format independence of the stored media data to the client application. It employs media transcoding that meets real-time requirements. The processing idea is derived from the converter graph, which extends the conversion chains and is an abstraction of analogous technologies like DirectX/DirectShow, the processors with controls of the Java Media Framework (JMF), media filters, and transformers (for details see II.4.1 Converters and Converter Graphs). However, the processing in the RETAVIC architecture is fed with additional processing information, or in other words it is based on additional meta-data (discussed in Meta data in section IV.2.3). The MD are used to control the real-time transcoding process and to reduce its complexity and unpredictability.

The real-time transcoding is divided into three subsequent tasks –using different classes of converters– namely: real-time decoding, real-time processing, and real-time encoding. The tasks representing the converters use pipelining to process media objects, i.e. they pass so-called media quanta [Meyer-Wegener, 2003; Suchomski et al., 2004] between the consecutive converters.
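A minimal sketch of such a converter chain is given below. The uniform interface, the structure names and the sequential push are illustrative assumptions; a real pipeline would run the converters concurrently under real-time scheduling.

```c
#include <stddef.h>

/* A media quantum passed between converters (a frame or a block of samples). */
typedef struct {
    void   *data;
    size_t  size;
    long    timestamp;
} Quantum;

/* Uniform converter interface: decoder, processor and encoder all implement
 * it, so they can be chained arbitrarily. Illustrative only. */
typedef struct Converter Converter;
struct Converter {
    /* consume one input quantum, produce (at most) one output quantum */
    int (*push)(Converter *self, const Quantum *in, Quantum *out);
    void *state;
    Converter *next;   /* next converter in the pipeline */
};

/* Push a quantum through the whole chain (decode -> process -> encode). */
static int pipeline_push(Converter *head, const Quantum *in, Quantum *final_out)
{
    Quantum cur = *in, next;
    for (Converter *c = head; c != NULL; c = c->next) {
        if (c->push(c, &cur, &next) != 0)
            return -1;   /* converter failed or missed its deadline */
        cur = next;
    }
    *final_out = cur;
    return 0;
}
```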


Real-time decoding
First, the stored bitstreams –this applies to both audio and video data– must be read and decoded, resulting in the raw intermediate format, i.e. uncompressed media-specific data, e.g. a sequence of frames47 with pixel values of luminance and chrominance48 for video or a sequence of audio samples49 for audio. The layered internal storage formats allow the binary media data to be read selectively, so the system is used efficiently, because only the data required for decoding in the requested quality are read. The reading operation must of course support real-time capabilities in this case (as mentioned in the Storage section IV.2.3). Then the meta-data necessary for the decoding processes are read accordingly; depending on the MD type, real-time capabilities are required or not. Next, the media-specific decoding algorithms are executed. The algorithms are designed in such a way that they are scalable not only in data consumption, due to the layered internal formats, but also in processing, i.e. a bigger amount of data needs more processing power. Obviously, the trade-off between the amount of data, the amount of required computation, and the data quality must be considered in the implementation.

Real-time processing
Next, the media-specific data may optionally be processed, i.e. some media conversion [Suchomski et al., 2004] may be applied. This step covers converting operations on media streams with respect to real time. These conversions may be grouped into a few classes: quanta modification, quanta-set modification, quanta rearrangement and multi-stream operations. This grouping applies to both discussed media types: audio as well as video. The first group –quanta modification– refers to direct operations conducted on the content of every quantum of a media stream. Examples of video quanta modification are: color conversion, resizing (scaling up / scaling down), blurring, sharpening, and other video effects. Examples of audio quanta modification are: volume operations, high- and low-pass filters, and re-quantization (changing the bit resolution per sample, e.g. from 24 to 16 bits, gives a smaller possible set of values). Quanta-set modification considers actions on the set of quanta, i.e. the number of quanta in the stream is changed. Examples are conversion of the rate

47 See the term video stream in Appendix A.
48 It may be another uncompressed representation with separate values for each of the red, green and blue (RGB) colors. By default, the luminance and two chrominance values (red, blue) are assumed. Other modes are not forbidden by the architecture.
49 See the term audio stream in Appendix A.


of video frames (known as frames per second – fps), in which 25 fps are transformed to 30 fps (frame-rate upscaling) or 50 fps may be halved to 25 fps (frame-rate downscaling or frame dropping); analogously in audio, the sample rate may be changed (e.g. from the studio sampling frequency of 96 kHz to the CD standard of 44,1 kHz) by downscaling or simply dropping samples. The third category changes neither the quanta themselves nor the set of involved quanta, but the sequence of quanta with respect to time. Here time operations are involved, like double- or triple-speed play, slow (or stopped) play, but also frame reordering. The fourth proposed group covers operations on many streams at the same time, i.e. several streams are always involved on the input or output of the converter. The most suitable representatives are mixing (only of the same type of media) and stream duplication (providing an exact copy of the stream, e.g. for two outputs). Other examples cover the mux operation (merging different types of media), stream splitting (the opposite of mux), and re-synchronization.

In general, the operations mentioned above are linear operations and do not depend on the content of the media, i.e. it does not matter how the pixel values are distributed in a frame or whether the variation of sample values is high. However, the operations do depend on the structure of the raw intermediate format. For example, the number of input pixels, calculated from width and height, together with the output resolution, influences the time required for a resize operation. Similarly, the number of samples influences the amount of time spent on making a sound louder, but the current volume level does not affect the linear processing.
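As an illustration of such a linear quanta modification, the following C sketch combines a volume operation with re-quantization from 24 to 16 bits; its running time depends only on the number of samples, never on their values, which is exactly what makes these operations easy to schedule in real time. The function name and the 24-bit-in-int32 layout are assumptions for the example.

```c
#include <stdint.h>
#include <stddef.h>

/* Linear quanta modification: scale the volume of a block of samples and
 * re-quantize from 24-bit (stored in int32_t) to 16-bit. The cost is O(n)
 * and independent of the sample values. */
static void volume_and_requantize(const int32_t *in24, int16_t *out16,
                                  size_t n, double gain)
{
    for (size_t i = 0; i < n; i++) {
        double v = (double)in24[i] * gain;   /* apply gain in the 24-bit domain */
        long   q = (long)(v / 256.0);        /* drop 8 bits: 24 bit -> 16 bit   */
        if (q >  32767) q =  32767;          /* clip to the 16-bit range        */
        if (q < -32768) q = -32768;
        out16[i] = (int16_t)q;
    }
}
```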

Real-time encoding
Finally, the media data are encoded into the user-requested output format, which involves the respective compression algorithm. Many different compression algorithms for audio and video are available. The best-known representative codecs are:

• For video

- MPEG-2 P.2: MPEG-4 (for MPEG-2), bitcontrol MPEG-2, Elecard MPEG-2, InterVideo, Ligos, MainConcept, Pinnacle MPEG-2
- MPEG-4 P.2: XviD (0.9, 1.0, 1.1), DivX (3.22, 4.12, 5.2.1, 6.0), Microsoft MPEG-4 (v1, v2, v3), (4.5), OpenDivX (0.3)


- H.264 / MPEG-4 P.10 (AVC) [ITU-T Rec. H.264, 2005]: , Mpegable AVC, H.264, MainConcept H.264, Fraunhofer IIS H.264, Ateme MPEG-4 AVC/H.264, Videosoft H.264, ArcSoft H.264, ATI H.264, Elecard H.264, VSS H.264
- 9
- QuickTime (2, 3, 4)
- Real Media Video (8, 9, 10)
- Motion JPEG
- Motion JPEG 2000

• For audio

- MPEG-1 P.3 (MP3): Fraunhofer IIS
- MPEG-2 P.3 (AAC): Fraunhofer IIS, Coding Technologies
- MPEG-4 SLS: Fraunhofer IIS
- AAC+: Coding Technologies
- Lame
-
- FLAC
- Monkey's Audio

All the mentioned codecs are provided for non-real-time systems such as MS Windows, Linux or Mac OS, in which the ratio of worst-case to average execution time of audio and video compression may reach even thousands, so accurate resource allocation for real-time processing is impossible with these standard algorithms. The variations in processing time are caused by the content analysis step (due to the complexity of the algorithms), i.e. for video these are motion-related calculations like prediction, detection and compensation, and for audio these are filter-bank processing (incl. the modified DCT or FFT) and perceptual-model masking [Kahrs and Brandenburg, 1998] (some results are given later for each specific medium in the analysis-related sections V.1 and VI.1). Within the analysis part of the codec, the most suitable representation of the intermediate data for the subsequent standard compression algorithms


(like RLE or Huffman coding) is found, in which the similarity of the data is further exploited (thus making the compression more efficient and leading to a smaller compressed size).

Thus, it is proposed in the RETAVIC architecture to separate the analysis step, and the unpredictability accompanying it, from the real-time implementation and to put it into the non-real-time preparation phase. Secondly, the encoder should be extended with support for MD, which allows the data produced by the analysis to be exploited. This idea is analogous to the two-pass method in compression algorithms [Bovik, 2005; Westerink et al., 1999], where the non-real-time codec first analyzes a video sequence entirely and then optimizes a second encoding pass of the same data by using the analysis results. A two-pass codec, however, uses internal structures for keeping the results and has no possibility to store them externally [Westerink et al., 1999]; the analysis is redone on each run even if the same video is used.
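The difference to a classical two-pass encoder can be sketched as follows: the analysis results are read from the stored continuous MD instead of being recomputed on-line. All names below are illustrative, and the stub stands in for a real codec call; this is a sketch of the idea, not the actual RETAVIC encoder.

```c
#include <stddef.h>

/* Per-frame hints pre-computed during content analysis (Phase 2). */
typedef struct {
    int frame_type;   /* e.g. I/P/B decision taken during content analysis */
    int quant_hint;   /* suggested quantizer                               */
    /* references to pre-computed motion vectors etc. would follow here    */
} FrameHints;

/* Placeholder for the actual encoder of the requested output format. */
static int encode_frame_with_hints(const void *raw_frame,
                                   const FrameHints *hints,
                                   void *out_bitstream)
{
    (void)raw_frame; (void)hints; (void)out_bitstream;
    return 0;
}

/* MD-assisted encoding loop: no motion estimation or frame-type decision
 * happens here, so the per-frame cost becomes far more predictable. */
static int encode_sequence(const void *const *raw_frames,
                           const FrameHints *hints,   /* continuous MD */
                           size_t n_frames,
                           void *out_bitstream)
{
    for (size_t i = 0; i < n_frames; i++) {
        if (encode_frame_with_hints(raw_frames[i], &hints[i], out_bitstream) != 0)
            return -1;
    }
    return 0;
}
```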

Finally, the transcoded media data are delivered outside the RETAVIC architecture – to the client application. The delivery may involve real-time encapsulation into a network protocol capable of delivering data under real-time constraints. The network problems and solutions, such as the Real-time Transport Protocol (RTP) [Schulzrinne et al., 1996], the Microsoft Media Server (MMS) protocol [Microsoft Corp., 2007a] or the Real-Time Streaming Protocol (RTSP) [Schulzrinne et al., 1998], are however not within the scope of the RETAVIC project and are not investigated further.

IV.2.4.2 Direct delivery

The second delivery channel is called bypass delivery, which is a direct delivery method. The idea here is very simple – the media data stored by Phase 3 of RETAVIC are read from the storage and delivered to the client application. Of course, real-time processing is necessary here, so real-time requirements for reading analogous to those of real-time decoding must be considered.

There are three possible scenarios in direct delivery:

1) Delivering internal storage format

2) Delivering origin source


3) Reusing processed media data

In order to deliver media data in the internal storage format, not much has to be done within the architecture. All required facilities are already provided by the real-time transcoding part: obviously, real-time reading and data provision to the outside of the system must already be supported there. So, bypass delivery should simply make use of them.

If one considers storing the origin source within an application scenario, the archiving origin source module proposed as optional (in Section IV.2.1) has to be included. Moreover, the capability of managing the origin source within the storage phase has to be developed; this capability covers the adaptation of the media data and meta-data models. Furthermore, if searching facilities are present, they must also support the origin source. Apart from these issues, the bypass delivery of the origin source is conducted in a similar way to that of the internal format.

The third possible scenario considers reusing the already processed media data and may be referred to as caching. This is depicted by the arrow leading back from the real-time encoding to the multimedia bitstream collection. The idea behind the reuse is to offer the possibility of storing the encoded media data after each request for a format not yet present in the multimedia bitstream collection, in order to speed up further processing of the same request at the cost of storage. As one can notice, the compromise between the costs of storage and the costs of processing is a crucial point which has to be considered, so the application scenario should define whether such a situation (of reusing the processed media data) is really relevant. If it is, extensions analogous to those for delivering the origin source have to be implemented, taking into account the linking of more than one media bitstream to a media object. Searching of already existing instances of the various formats also has to be provided.

IV.3. Evaluation of the General Idea

In contrast to the framework proposed by [Militzer, 2004] and [Suchomski et al., 2005], the extended architecture allows for both storing the origin source of the data and converting it to the internal storage format. The initial RETAVIC approach of keeping the origin source was completely rejected in [Militzer, 2004] and only the internal format was allowed. This rejection may, however, limit the potential application field of the RETAVIC framework, and somewhat


contradicts the MPEG-7 and MPEG-21 models, in which a master copy (origin instance) of the media object (media item) is allowed and supported. On the other hand, keeping the origin instance introduces higher redundancy50, but that is a trade-off between application flexibility and redundancy in the proposed architecture, which in my opinion is targeted well.

The newest proposal of the RETAVIC framework (Figure 4) is a hybrid of the previous assumptions regarding many origin formats and the single internal storage format, thus keeping all the advantages of the previous framework versions (as in [Militzer, 2004] and [Suchomski et al., 2005]) and at the same time delivering higher flexibility to the applications. Moreover, all the drawbacks present in the initial proposal of the RETAVIC architecture (as discussed in sections 2.1 and 2.2 of [Militzer, 2004]) ––like the complexity of many-to-many conversion (of arbitrary input and output formats), the difficulty of accurately determining the behavior of a black-box converter (global average, worst- and best-case behavior), unpredictable resource allocation (due to the lack of influence on the black-box decoding), and no scalability in data access and thus no adaptivity in the decoding process–– are no longer present in the latest version of the framework.

IV.3.1. Insert and Update Delay

The only remaining drawback is the delay when storing new or updating old media data in the MMDBMS. As previously outlined, Phase 1 and Phase 2, even though separated, together serve as the data insert/update facility: the real-time compression to an intermediate format is only required for capturing uncompressed live video (part of Phase 1), the media buffer is required for intermediate data captured in real time regardless of the real-time source (the next part of Phase 1), the archiving of the origin source is only required in some application cases, and the conversion into the internal format and the content analysis are performed in all application cases in subsequent steps. Obviously, the last steps of the insert/update facility, namely the conversion and analysis steps, are computationally complex and run as best-effort processes, thus they

50 Redundancy was not an objective of the RETAVIC architecture; however, it is one of the key factors influencing system complexity and storage efficiency: the higher the redundancy, the more complex the system needed to manage consistency and the less efficient the storage.


may require quite some time to finish51. Actually, it may take a few hours for a longer audio or video object to be completely converted and analyzed, especially when assuming (1) a high load of served requests for media data in general and (2) the conversion/analysis processes running only at low priority (such a setting favors serving many data-requesting clients simultaneously (1) over uploading/updating clients (2)). To summarize, the delay between the actual insertion/update time and the moment when the new data become usable in the system and visible to the client is an unavoidable fact within the proposed RETAVIC framework.

However, it is believed that this sole limitation can be well accepted in practice, considering that support for the few applications demanding real-time capturing is still available. Moreover, considering points (1) and (2) of the previous paragraph, most of the system's resources would be spent on real-time transcoding delivering format-converted media data, i.e. inserting new media data into the MMDBMS is a rather rare case compared to transcoding and transmitting already stored media content to a client for playback. Consequently, the proposed framework delivers higher responsiveness to the media-requesting clients due to the assumed trade-off between delay during insertion and speed-up during delivery.

Additionally, making newly inserted or updated media data available to the clients as soon as possible is not considered to be an essential feature of the MMDBMS. It actually does not matter much to the user whether he or she is able to access newly inserted or updated data just after the real-time import or a couple of hours later; it is more important that the MMDBMS can guarantee reliable behavior and delivery on time. Nevertheless, the newly inserted or updated data should still become accessible within a relatively short period.

IV.3.2. Architecture Independence of the Internal Storage Format

As already mentioned, by employing the media-specific encoder as an atomic and isolated part of the Conversion to internal format module, the architecture gains the possibility of exchanging (or updating) the internal storage format without changing the outside interfaces and

51 Please note that during the input or update of data, real-time insertion is in most cases not required according to the Architecture Requirements of Section IV.1. The few cases requiring real-time capturing are handled by Phase 1. Hence, the unpredictable behavior of the conversion to the internal format and of the analysis process is not considered a drawback.


functionality, i.e. the data formats understood and the data formats provided by an MMDBMS designed according to the RETAVIC architecture will stay the same as before the format update/exchange. Of course, the exchange/update can be done for each medium separately, thus allowing the fastest possible adoption of new results of the autonomous research on each media type52.

IV.3.2.1 Correlation between modules in different phases related to internal format

Figure 7. Correlation between real-time decoding and conversion to internal format.

Before conducting the replacement of the internal format, the correlation between the affected modules has to be explained. Figure 7 depicts this correlation by grey arrows with black edges. The real-time decoding uses two data sources as input: media data and meta-data. The first is simply the binary stream in the known format which the decoder understands. The second is used by the self-controlling process that is part of the decoder, and by the resource allocation for the prediction and allocation of the required resources. Hence, if someone decides to replace the internal storage format of a medium, he must also exchange the real-time decoding module and, accordingly, the related meta-data. On the other hand, the new format and its related meta-data have to be prepared by a new encoder and placed on the storage. This encoder has to be placed in the encoding part of the conversion module of Phase 2. Due to the changes in the meta-data, the DB schema has to be adapted accordingly to keep the new set of data. This few-step exchange,

52 This is exactly the case in research – video processing and audio processing are separate scientific fields and in general do not depend on or relate to each other. Sometimes the achievements from both sides are put together to assemble a new audio-video standard.


however, should influence neither the decoding part of the non-real-time conversion module nor the remaining modules of the real-time delivery phase.

IV.3.2.2 Procedure of internal format replacement

A step-by-step guide for replacement of the internal storage format is proposed as follows:

1. prepare real-time decoding – implement it in the RTE of Phase 4 according to the used real-time decoder interface

2. prepare non-real-time encoding – according to the encoder interface of the Conversion to internal format module

3. design changes in the meta-data schema – those reflecting the data required by real-time decoding

4. encode all stored media bitstreams in the new format (on the temporary storage)

5. prepare meta-data for the new format

6. begin transactional update

a. replace the real-time decoder

b. replace the encoder in the Conversion to internal format module

c. update the schema for meta-data

d. update the meta-data

e. update all required media bitstreams

7. commit

This algorithm is somewhat similar to the distributed two-phase commit protocol [Singhal and Shivaratri, 1994]: it has a preparation phase (commit-request: points 1-5) and an execution phase (commit: points 6-7). It should work without any problems53 for non-critical systems in which access to the data may be blocked exclusively for a longer time. In 24x7 systems, however, such a transactional update, especially point 6.e), may be hard to conduct. Then one of the possibilities

53 The blocking disadvantage of the two-phase commit protocol is not considered here, because the human beings updating the system (the system administrators) act as coordinators and at the same time as cohorts in the sense of this transaction.


would be to prepare the bitstreams on an exact copy of the storage system, but with the updated bitstreams, and to exchange these systems instead of updating all the required media data in place. Another solution would be to skip the transactional update and to update on the basis of a locking mechanism for the separate media bitstreams, allowing two versions of the real-time decoding to operate in parallel. This, however, is a more complex solution and has an impact on the control application of the RT transcoding (because support for more than one real-time decoder per medium was not investigated).
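To illustrate the kind of decoupling discussed in this section, the following minimal C sketch shows one possible way of hiding the internal-format decoder behind a fixed interface, so that exchanging the internal storage format only means registering a different implementation. The names (rt_decoder_iface, register_internal_decoder, etc.) are hypothetical and do not reproduce the actual RETAVIC interfaces.

/* Hypothetical plug-in interface for the real-time decoder of Phase 4.
 * All names are illustrative only. */
#include <stddef.h>
#include <stdio.h>

typedef struct {
    const unsigned char *bitstream;  /* media data in the internal format    */
    size_t               size;
    const void          *meta_data;  /* static and continuous MD of this MO  */
} rt_decode_input;

typedef struct {
    const char *format_name;                                   /* e.g. "LLV1" */
    int  (*open)(const rt_decode_input *in, void **state);
    int  (*decode_frame)(void *state, unsigned char *out, size_t out_size);
    void (*close)(void *state);
} rt_decoder_iface;

/* The delivery phase only ever talks to this interface, so replacing the
 * internal storage format means registering a different rt_decoder_iface. */
static const rt_decoder_iface *active_decoder;

static void register_internal_decoder(const rt_decoder_iface *dec)
{
    active_decoder = dec;
    printf("internal-format decoder switched to: %s\n", dec->format_name);
}

int main(void)
{
    /* a dummy entry standing in for the LLV1 real-time decoder */
    static const rt_decoder_iface llv1 = { "LLV1", NULL, NULL, NULL };
    register_internal_decoder(&llv1);
    return 0;
}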

IV.3.2.3 Evaluation of storage format independence

Summarizing, since the internal format can be replaced at will, the RETAVIC architecture is independent of the internal storage format. As a result, any lossless scalable format may be used. Consequently, the level of scalability may also be chosen at will and depends only on the format selected for the given application scenario, which makes the application flexibility of the RETAVIC architecture even higher. Moreover, because the architecture requirements assume a lossless internal format, the number of replacements is not limited, contrary to the case of lossy formats where the tandem-coding problem occurs.


V. VIDEO PROCESSING MODEL

Chapter V introduces the video-specific processing model based on the logical model proposed in the previous chapter. First, an analysis of a few representatives of existing video codecs and a short discussion of the possible solution are presented. Next, one particular solution is proposed for each phase of the logical model. The video-related static meta-data are described. Then LLV1 is proposed as the internal storage format, and the coding algorithm based on [Militzer, 2004; Militzer et al., 2005] is explained and detailed further. After that, the processing idea is explained: decoding using its own MD subset and then encoding, which also employs its own subset of MD. Only the continuous MD are referred to in the processing part (Section V.5). Finally, the evaluation of the video processing model by means of best-effort prototypes is presented.

The MD set covering the encoding part was first proposed by [Militzer, 2004] and named coarse- and fine-granular MD (as mentioned in the subsection Meta-data of Section IV.2.3, coarse granularity refers to static MD, and analogously fine granularity to continuous MD). The MD set was then extended and refined in [Suchomski and Meyer-Wegener, 2006]. The extension of the continuous part of the MD supporting adaptivity in LLV1 decoding was proposed by [Wittmann, 2005].

V.1. Analysis of the Video Codec Representatives

The analysis of the execution time of representatives of DCT-based video encoding such as FFMPEG54, XVID or DIVX clearly showed that the processing is irregular and unpredictable; the time spent per frame varies to a large extent [Liu, 2003]. For example, the encoding time per frame of the three mentioned codecs for exactly the same video data is depicted

54 Whenever the abbreviation FFMPEG is used, the MPEG-4 algorithm within the FFMPEG codec is meant. FFMPEG is sometimes called a multi-codec implementation because it can also provide differently coded outputs, e.g. MPEG-1, MPEG-2 PS (known as VOB), MJPEG, FLV (Macromedia Flash), RM (Real Media A+V), QT (QuickTime), DV (Digital Video) and others. Moreover, FFMPEG supports many more decoding formats (as input), e.g. all output formats as well as MPEG-2 TS (used in DVB), FLIC, and proprietary formats like GXF, MXF (Media eXchange Format), and NSV (Nullsoft Video). The complete list may be found in Section 6, Supported File Formats and Codecs, of [WWW_FFMPEG, 2003].


in Figure 8⁵⁵. This clearly shows that the execution time per frame differs between codecs and even between their configurations, and that it is also not constant within one codec, i.e. from frame to frame. Moreover, even with the same input data, the behavior of the codecs cannot be generalized into simple behavior functions depending directly on the data: even for the same scene (frames between 170 and 300), one codec speeds up, another slows down, and the third does neither. It is also clearly noticeable that raising the quality parameter56 of XVID to five (Q=5), and thus making the motion estimation and compensation steps more complex, makes the execution time vary more from frame to frame and the overall execution slower (blue line above the green one).

[Figure 8: line chart of the encoding time per frame (Encoding Time [s], range 0–70) over frames 0–400 for DIVX, FFMPEG, XVID Q1 and XVID Q5.]

Figure 8. Encoding time per frame for various codecs.57

55 These curves represent the average encoding time per frame over five repetitions of the given encoding benchmark on exactly the same system configuration. 56 The quality parameter (Q) referred to defines the set of parameters used for controlling the motion estimation and compensation process. Five classes from 1 to 5 have been defined, such that the smaller the parameter, the simpler the algorithms involved. 57 The figure is based on the same data as Figure 4-4 (p. 51) in [Liu, 2003], i.e. measurements of the video clip “Clip no 2” with fast motion and fade-out. The content of the video clip is depicted in Figure 2-3 (p. 23) in [Liu, 2003]. The only difference is that the color conversion from RGB to YV12 was eliminated from the encoding process, i.e. the video data was prepared beforehand in the YV12 color scheme (which is the usual form used in video compression).


[Figure 9: chart of the average (with min/max), deviation and variance of the encoding time per frame for the tested codec configurations; the underlying values (encoding time per frame [s]) are:]

Codec      DIVX   FFMPEG  XVID Q0  XVID Q1  XVID Q2  XVID Q3  XVID Q4  XVID Q5
Deviation   8.4      3.2      3.6      2.4      2.7      4.0      4.9      5.3
Variance   70.9     10.0     13.1      6.0      7.2     15.9     24.3     28.0
Average    34.5     48.2     22.1     19.4     27.4     29.1     31.0     36.1

Figure 9. Average encoding time per frame for various codecs58.

The encoding time was analyzed further and the results are depicted in Figure 9. Obviously, FFMPEG was the slowest on average (black bullets) and XVID with the quality parameter set to one59 (Q=1) the fastest. The minimal and maximal values of the time spent per frame, however, are more interesting. Here the peak-to-average ratio of the execution time plays a key role: it allows us to predict the potential loss of resources (i.e. inefficient allocation/use) if worst-case resource allocation is used for real-time transcoding. The biggest peak-to-average ratio for this specific video data is achieved by XVID Q0, with a factor of 3.32. Other figures worth noticing are the MIN/AVG and MIN/MAX ratios, the variance, and the standard deviation. While MIN/MAX or MIN/AVG may be exploited by the resource allocation algorithm (e.g. to define how important it is to free resources for other processes), the variance and standard deviation60 indicate the required time buffer for real-time processing. So, the more the

58 All possible configurations of the quality parameter Q of the XVID encoder are presented – from 0 to 5, where 5 means the most sophisticated motion estimation and compensation. 59 The quality parameter set to zero (Q=0) lets the XVID encoder take the default values and/or detect them, which requires additional processing; that is why the execution is a bit slower than with Q=1. 60 The variance magnifies the measured differences due to its quadratic nature. Thus it is very useful when constancy is very important and even small changes must be detected. One must be careful, however, when the variance is used with fractional values between 0 and 1 (e.g. coming from the difference of compared values), because then the measured error is decreased. The standard deviation, in contrast, exhibits linear characteristics.


frame-to-frame execution time varies, the bigger the variance and deviation become. For example, DIVX has the biggest variance and deviation, while its average encoding time lands somewhere in the middle of all codec results. Conversely, FFMPEG is the slowest on average, but its variance is much smaller in comparison to DIVX.
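As an illustration of the figures discussed above, the following minimal C sketch computes the average, the peak-to-average and MIN/AVG ratios, the variance and the standard deviation from a series of per-frame encoding times; the function name and the sample values are made up for this example and are not measured data.

/* Illustrative computation of the per-frame statistics discussed above. */
#include <math.h>
#include <stdio.h>

static void frame_time_stats(const double *t, int n)
{
    double sum = 0.0, min = t[0], max = t[0];
    for (int i = 0; i < n; i++) {
        sum += t[i];
        if (t[i] < min) min = t[i];
        if (t[i] > max) max = t[i];
    }
    double avg = sum / n;

    double var = 0.0;
    for (int i = 0; i < n; i++)
        var += (t[i] - avg) * (t[i] - avg);
    var /= n;

    printf("avg=%.2f  peak/avg=%.2f  min/avg=%.2f  var=%.2f  stddev=%.2f\n",
           avg, max / avg, min / avg, var, sqrt(var));
}

int main(void)
{
    /* made-up encoding times per frame [ms] */
    const double t[] = { 22.0, 25.5, 19.8, 73.2, 24.1, 21.7 };
    frame_time_stats(t, (int)(sizeof t / sizeof t[0]));
    return 0;
}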

[Figure 10: stacked-percentage chart (0–100%) of the time spent per frame (frames 0–399) on the XVID encoding stages: Motion Estimation, Motion Compensation, DCT, Quant, IDCT, IQuan, Edges, Interpolation, Transfer, Prediction, Coding, Interlacing.]

Figure 10. Example distribution of time spent on different parts in the XVID encoding process for Clip no 2⁶¹.

Further investigations in [Liu, 2003] delivered detailed results on certain parts of the mentioned codecs. According to the obtained values, the most problematic and time-consuming steps in the processing are motion estimation (ME) and motion compensation (MC). The encoding time distribution, calculated for each frame separately62 along the video stream, is depicted here for just two codecs: XVID in Figure 10 and FFMPEG in Figure 11. The first figure clearly shows that the time required for MC and ME in XVID reaches over 50% of the total time spent on a frame. Interestingly, the MC/ME time rises, but the distribution does not vary much from frame to frame. A different behavior can be noticed for FFMPEG, where the processing

61 The figure is based on the same data as Figure 4-19 (p. 61) in [Liu, 2003], i.e. the same video clip “Clip no 2”. The encoder quality parameter Q was set to 5 (option “–Q 5”). 62 Please note that these are the percentages of time used for ME/MC per frame related to the total time per frame, and not the absolute ME/MC time spent per frame.


time used per frame varies from frame to frame – expressed by the jagged line (peaks and drops alternating). One more conclusion may be drawn when comparing the two graphs: there is a jump in the processing time starting from the 29th frame of the given sequence, but the reaction of each encoder is different. XVID adapts the control settings of its internal processing slowly and distributes the adaptation over many frames, while FFMPEG reacts at once. This is also the reason for the jagged (FFMPEG) versus smooth (XVID) curves. It has also been verified that after encoding there are one intra-frame and 420 inter-frames in XVID, vs. two intra-frames and 419 inter-frames in FFMPEG. The extra intra-frame in FFMPEG was at the 251st position, and thus there is also a peak where the MC/ME time was negligible with respect to the remaining parts.

[Figure 11: stacked-percentage chart (0–100%) of the time spent per frame (frames 0–414) on the FFMPEG MPEG-4 encoding stages: Motion Estimation, MC, Edge, DCT, Quantization, IQ+IDCT, Transfer, Interlacing, Simple Frame Det., IP+CS+PI, Picture Header, Huffman, Preprocessing, Postprocessing, Rest.]

Figure 11. Example distribution of time spent on different parts in the FFMPEG MPEG-4 encoding process for Clip no 2⁶³.

Other measurements from [Liu, 2003] (e.g. Figure 4-19 or 4-21) showed that the execution times of ME/MC vary much more than those of all other parts of the encoding. This is due to the

63 The figure is based on the same data as Figure 4-21 (p. 62) in [Liu, 2003]. The video is the same as in the example used for XVID (Figure 10), with the same limitation. The noticeable peak is due to the encoder’s decision on a different frame type (I) for frame 250.


dependency of the ME and MC on the video content, which is very hard to describe as a mathematical function. Hence, ME/MC is the most unpredictable part of the encoding, where the consumed time varies even between subsequent frames. This effect of ME/MC fluctuations can partly be seen in the changing shape of the ME/MC curves in both figures. For that purpose, however, it is better to consult the curves depicting the time spent in absolute measured values (mentioned in the first sentence of this paragraph).

Summarizing, the ME/MC steps are the problematic part of video encoding. The problem lies both in the complexity and the amount of time required for the processing and in the unpredictable behavior and the variations in the time spent per frame.

V.2. Assumptions for the Processing Model

The first idea was to skip the ME/MC step completely in order to gain control over the encoding process, which has to work in real time. The encoding would thus be roughly twice as fast and much more stable with respect to the time spent on each frame. However, a video processing model based on the straightforward removal of the complex ME/MC step would cause a noticeable drop in coding efficiency and thus a worse quality of the video information – obviously, such additional complexity in the video encoder yields a higher compression ratio and reduced data rates for the decoders [Minoli and Keinath, 1993]. Therefore, the idea of simply excluding the ME/MC step from the processing chain was dropped.

Still, removing the ME/MC step from the real-time encoding seemed to be the only reasonable way of gaining control over the video processing. Thus another, more reasonable concept was to move the ME/MC step out of the real-time processing – not dropping it, but putting it into the non-real-time part – and to use only the data produced by the moved step as additional input to the real-time encoding. Based on that assumption, it was necessary to measure how much data, namely motion vectors, is produced by the ME/MC step. As stated in the published paper [Suchomski et al., 2005], the size overhead of this additional information amounts to about 2% of the losslessly encoded video data. Hence it was clearly acceptable to obtain a twice as fast and more stable real-time encoding – yielding a lower worst-case-to-average ratio of the time consumed per frame and consequently allowing for more accurate resource allocation – at such a small storage cost.


Finally, the non-real-time preparation phase includes the ME/MC steps in the content analysis of the video data, i.e. the ME/MC steps cover all the related activities of the compression algorithm such as scene-change detection, frame-type recognition, global motion estimation, partitioning into macro blocks, detection of motion characteristics and complexity, and the decision about the type of each macro block. Moreover, if there are ME-related parts applicable only to a given compression algorithm and not named in the previous sentence, they should also be carried out in the non-real-time content analysis step. All these ME activities serve to produce the meta-data used in the real-time phase, not only for the compression process itself, but also for scheduling and resource allocation.

The remaining parts responsible for motion compensation, like motion vector encoding and the calculation of texture residuals, and the parts used for compression in DCT-based algorithms [ITU-T Rec. T.81, 1992], such as the DCT, quantization, scanning of the quantized coefficients (e.g. zig-zag scanning), run-length encoding (RLE), Huffman encoding and bit-stream building, are done in the real-time transcoding phase.

This splitting of the video encoding algorithm into two parts, in the non-real-time and the real-time phase, leads to the RETAVIC architecture, which may be regarded as already optimized with respect to processing costs at design time, because the model itself provides for a reduction of resource consumption. The overall optimization is gained first because the analysis step is executed only once, contrary to executing the complete algorithm (in which the analysis is repeated for the same video during every un-optimized encoding on demand). Secondly, encoding-related optimizations are obtained through a simplified encoder construction without the analysis part. This simplification of the encoding algorithm leads to faster execution in real time and thus makes it possible to serve a larger number of potential clients. Additionally, the compression algorithm should behave more smoothly, which decreases the buffer sizes.

Earlier attempts at a processing model for video encoding did not consider the relationship between the behavior and functionality of the encoder and the video data [Rietzschel, 2002]. VERNER [Rietzschel, 2003] – a real-time player and encoder implemented in DROPS – used a real-time implementation of the XVID codec in which the original source code was embedded directly in the real-time environment and no adaptation of the processing was performed, besides treating the encoding operation as the mandatory part and


the post-processing functionality as the optional part [Rietzschel, 2002]. As a result, the proposed solution failed to integrate real-time capability and predictability usable for QoS control, due to the still too high variations of the required processing time per frame within the mandatory part64. Therefore the novel MD-based approach is investigated in the next sections.

V.3. Video-Related Static MD

As mentioned in Section IV.2.3.2, the static MD are used for resource allocation. The initial MD set (introduced in IV.2.2.3) related only to video is discussed in this section. It is a superset of two sets reflecting the MD required by the two algorithms used in the prototypical implementation of the RETAVIC architecture, namely the MD useful for the LLV1 and for the XVID algorithm, respectively.

It is assumed that a media object (MO)65 defined by Equation (1) belongs to the set of media objects O. The MO is uniquely identifiable by its media object identifier (MOID), as expressed by Equation (2), where i and j refer to different MOIDs.

$$\forall i: mo_i = (type_i, content_i, format_i) \;\wedge\; type_i \in T \wedge content_i \in C \wedge format_i \in F$$
$$\forall i: mo_i \in O, \quad 1 \le i \le X \quad (1)$$

where T denotes the set of possible media types66, C the set of content, F the set of formats, and X the number of all media objects in O; none of the sets may be empty.

$$\forall i \,\forall j: i \neq j \Rightarrow mo_i \neq mo_j, \quad 1 \le i \le X \wedge 1 \le j \le X \quad (2)$$

In other words, an MO refers to data of one media type, either video or audio, and represents exactly one stream of the given media type.

64 The mandatory part is assumed to be exactly and deterministically predictable, which is impossible without considering the data in the case of video encoding; otherwise it must be modeled with the worst-case model. More details are explained later. 65 The definitions of MMO and MO used here are analogous to those described in [Suchomski et al., 2004]. An MO is defined to have a type, content and format, and an MMO consists of more than one MO, as mentioned in the fundamentals of the related work. 66 The media type set is limited within this work to {V, A}, i.e. to the video and audio types.


The multimedia object (MMO) is an abstract representation of the multimedia data and consists of more than one MO. An MMO belongs to the set of multimedia objects M and is defined by Equation (3):

$$\forall i: mmo_i = \{\, mo_j \mid mo_j \in O \,\}, \quad \forall i: mmo_i \in M, \quad 1 \le i \le Y \quad (3)$$

where Y is the number of all multimedia objects in M. Analogously to the MO, an MMO is uniquely identified by its MMOID, as formally given by Equation (4):

$$\forall i \,\forall j: i \neq j \Rightarrow mmo_i \neq mmo_j \quad (4)$$

Therefore an MO may be related to an MMO by the MMOID67.

Initial static meta-data are defined for the multimedia object (MMO) as depicted in Figure 12. The static MD describing a specific MO are related to this MO by its identifier (MOID). The MD, however, differ between media types, so the static MD for video are related to MOs having the video type, and the set (StaticMD_Video) is defined as:

$$\forall i: MD_V(mo_i) \subset StaticMD\_Video \Leftrightarrow type_i = V \quad (5)$$

where type_i denotes the type of the media object mo_i, V is the video type, and MD_V is a complex function extracting the set of meta-data related to the video media object mo_i. The index V of MD_V denotes video-specific meta-data, in contrast to the index A, which is used for audio-specific MD later on.

Moreover, the video stream identifier (VideoID) is a one-to-one mapping to the given MOID:

$$\forall i \,\exists j \,\neg\exists k: VideoID_i = MOID_j \wedge VideoID_i = MOID_k, \quad k \neq j \quad (6)$$

67 An MO does not have to be related to an MMO, i.e. the reference attribute in the MO pointing to the MMOID is nullable.


The static MD of a video stream include the sums of each frame type, i.e. the number of frames is calculated separately for each frame type within the video. The frame type (f.type) is defined as I, P, or B:

$$\forall i: f_i.type \in \{I, P, B\}, \quad 1 \le i \le N \quad (7)$$

where I denotes the type of an intra-coded frame (I-frame), P the type of a predicted inter-coded frame (P-frame), B the type of a bidirectionally predicted inter-coded frame (B-frame), and N the number of all frames in the video media object.

The sum for I-frames is defined as:

$$IFrameSum_{mo_i} = \left| \{\, f_j \mid f_j \in mo_i \wedge 1 \le j \le N \wedge f_j.type = I \,\} \right| \quad (8)$$

where f_j is the frame at the j-th position and f_j.type denotes the type of the j-th frame. Analogously for P- and B-frames (N, f_j and f_j.type as in Equation (8)):

$$PFrameSum_{mo_i} = \left| \{\, f_j \mid f_j \in mo_i \wedge 1 \le j \le N \wedge f_j.type = P \,\} \right| \quad (9)$$

$$BFrameSum_{mo_i} = \left| \{\, f_j \mid f_j \in mo_i \wedge 1 \le j \le N \wedge f_j.type = B \,\} \right| \quad (10)$$

Next, static MD are defined per frame of the video – namely the frame number, the frame type, and the calculated sums of each macro block (MB) type. The frame number represents the position of the frame in the frame sequence of the given video, and the frame type specifies the type of this frame, f_j.type. The sum of macro blocks of a given MB type is calculated analogously to the frames in Equations (8), (9) and (10), but N refers to the total number of MBs per frame and the conditions are replaced respectively by:

$$mbs_j = \begin{cases} 1 & \Leftrightarrow mb_j.type = t \\ 0 & \Leftrightarrow mb_j.type \neq t \end{cases} \quad \text{for the respective MB type } t \in \{I, P, B\} \quad (11)$$


The information about the sums of the different block types is stored with respect to the layer type (LLV1 defines four layers – one base and three enhancement layers [Militzer et al., 2005]) in StaticMD_Layer.

Finally, the sums of the different motion vector types are kept per frame in StaticMD_MotionVector. Nine types of vectors that can occur in a video are recognized so far [Militzer, 2004]. For each video frame, a motion vector class histogram is created by counting the number of motion vectors corresponding to each class, and the result is stored in relation to VectorID, FrameNo and VideoID.

A motion vector (x, y) activates one of nine different interpolations of pixel samples depending on the values of the vector’s components [Militzer, 2004]. Therefore nine types of motion vectors are distinguished [Militzer, 2004]:
mv1 – both x and y are in half-pixel precision: luminance and chrominance samples need to be interpolated horizontally and vertically to obtain prediction values;
mv2 – only the x component is given in half-pel precision: the luminance and chrominance match in the reference frame needs to be horizontally interpolated, and no vertical interpolation is applied to the chrominance samples as long as the y component is a multiple of four;
mv3 – as mv2, but y is not a multiple of four: the chrominance samples are additionally filtered vertically;
mv4 – only the y component is in half-pixel precision: the luminance and chrominance pixels in the reference frame need to be vertically interpolated, and no horizontal filtering is employed as long as the x component is a multiple of four;
mv5 – as mv4, but x is not a multiple of four: the chrominance samples are additionally filtered horizontally;
mv6 ÷ mv9 – both x and y have full-pel values; the interpolation complexity then depends only on whether chrominance samples need to be interpolated: if neither component is a multiple of four, the chrominance samples are filtered horizontally and vertically (mv6); if only the y component is a multiple of four, they are filtered horizontally (mv7); if only the x component is a multiple of four, they are filtered vertically (mv8); and if both x and y are multiples of four, no interpolation is required (mv9).
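The classification above can be expressed compactly in code. The following C sketch assumes that the vector components are given in half-pel units, so that an odd value means half-pixel precision – an assumption made only for this illustration; the function names are likewise illustrative and the actual LLV1/XVID representation may differ.

/* Classification of a motion vector into the nine interpolation classes. */
#include <stddef.h>
#include <stdio.h>

static int is_half_pel(int c) { return c % 2 != 0; }   /* half-pel units assumed */
static int is_mult4(int c)    { return c % 4 == 0; }

static int mv_class(int x, int y)
{
    if (is_half_pel(x) && is_half_pel(y)) return 1;              /* mv1       */
    if (is_half_pel(x))                   return is_mult4(y) ? 2 : 3; /* mv2/mv3 */
    if (is_half_pel(y))                   return is_mult4(x) ? 4 : 5; /* mv4/mv5 */
    /* both components have full-pel values */
    if (!is_mult4(x) && !is_mult4(y))     return 6;              /* mv6       */
    if (is_mult4(y) && !is_mult4(x))      return 7;              /* mv7       */
    if (is_mult4(x) && !is_mult4(y))      return 8;              /* mv8       */
    return 9;                                                    /* mv9       */
}

int main(void)
{
    int hist[10] = { 0 };
    /* made-up vectors; in the analysis phase one vector per block is counted */
    const int mv[][2] = { {1, 3}, {2, 4}, {6, 2}, {4, 8}, {0, 0} };
    for (size_t i = 0; i < sizeof mv / sizeof mv[0]; i++)
        hist[mv_class(mv[i][0], mv[i][1])]++;
    for (int c = 1; c <= 9; c++)
        printf("mv%d: %d\n", c, hist[c]);
    return 0;
}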

In order to simplify the understanding of the initial MD set, the current definition is mapped to a relational schema and depicted as a relational schema diagram (with primary keys, foreign keys and integrity constraints for the relationships) in Figure 12.


Figure 12. Initial static MD set focusing on the video data68.

The video static MD mapped according to Figure 12 allow exploiting the power of SQL by calculating the respective sum for a given type with a simple SQL query (Listing 1):

SELECT   VideoID, count(FrameType)
FROM     StaticMD_Frame
WHERE    FrameType = 0   -- for P-frames: FrameType = 1; for B-frames: FrameType = 2
GROUP BY VideoID
ORDER BY VideoID;

Listing 1. Calculation of the sum of I-frames from the existing MD for all described videos (the query conditions for P- and B-frames are given as comments).

68 The figure is based on Figure 2 of [Suchomski and Meyer-Wegener, 2006].


The sums for all frame types may also be calculated in one SQL query (Listing 2):

SELECT   VideoID, FrameType, count(FrameType)
FROM     StaticMD_Frame
GROUP BY VideoID, FrameType
ORDER BY VideoID, FrameType;

Listing 2. Calculation of all sums according to frame type for all described videos.

If the sums of all motion vector types over the whole video are required for all videos, the following SQL query extracts this information (Listing 3):

SELECT   VideoID, VectorID, sum(MVsSum)
FROM     StaticMD_MotionVector
GROUP BY VideoID, VectorID
ORDER BY VideoID, VectorID;

Listing 3. Calculation of all MV types per video for all described videos.

Of course, the entity set StaticMD_Frame must be complete in the sense that all frames existing in the video are described and included in this set. Under this assumption, the sums included in StaticMD_Video are treated as derived attributes computed by the above SQL statements. However, for optimization reasons and because the MD set is rarely updated, these values are materialized and included as regular attributes in StaticMD_Video. On the other hand, if StaticMD_Frame were not complete, i.e. did not include all frames, the sum attributes could not be treated as derived.

V.4. LLV1 as Video Internal Format

LLV1 stands for Lossless Layered Video One and was first proposed by [Militzer, 2004]. The detailed description can be found in [Militzer, 2004]; it covers: 1) the high-level logical design, 2) the advantages of using LLV1, 3) scalable video decoding useful for real-time transcoding, 4) the implementation, and 5) a proposed method of decoding time estimation. A compact description including all the mentioned issues together with the related work is published in [Militzer et al., 2005]. The work of [Wittmann, 2005] focuses only on the decoding part of LLV1 and describes it in more detail, where each step is represented separately by an algorithm. The analysis, implementation and evaluation with respect to real time are also described in [Wittmann,


2005] (they are also extended and described in Section IX.3 RT-MD-LLV1 of this work). Given the available literature, this section will only 1) summarize the most important facts about LLV1 by explaining the algorithm in a simplified way, and 2) refine the mathematical definitions where necessary, thus making the previously ambiguous explanation more precise. Some further extensions of the LLV1 algorithm are given in the Further Work section of the last chapter of this work.

V.4.1. LLV1 Algorithm – Encoding and Decoding

The LLV1 format fulfills the requirement of losslessness while at the same time giving the media server the flexibility to guarantee the best possible picture quality in the face of user requirements and changing QoS characteristics. These two aspects are achieved by layered storage where each layer can be read selectively. It is possible to limit the access to just a portion of the stored media data (e.g. just the base layer, which amounts to about 10% of the whole video [Militzer et al., 2005]). Thus, if lower resolutions of highly compressed videos are requested, only parts of the data are actually accessed. The other layers, i.e. the spatial and temporal enhancement layers, can be used to increase the quality of the output, both in terms of image distortion and frame rate. Though being a layered format designed for real-time decoding, LLV1 is still competitive in compression performance compared to other well-known lossless formats69 [Militzer et al., 2005]. The reason for this competitiveness may derive from its origins – LLV1 is based on the XVID codec and was adapted by exchanging the lossy transform for a lossless one [Tran, 2000] and by refining the variable-length encoding (incl. Huffman tables and other escape codes).

The simplified encoding and decoding algorithms are depicted in Figure 13. The input video sequence (VS IN) is read frame by frame and frame type detection (FTD) is conducted to find out whether the current frame should be an intra- or an inter-coded frame. Usually, the intra-coded type is used when a new scene appears or the difference between subsequent frames crosses a certain threshold. If the frame is assigned to be inter-coded, the next steps of motion detection (MD), producing the motion vectors (MV), and motion error prediction (MEP),

69 At the time development began, no lossless and layered (or scalable) video codecs were available. Thus the conducted benchmarks relate LLV1 only to lossless codecs without scalability characteristics. Nowadays there is ongoing work on the standardized MPEG-4 Scalable Video Coding (SVC) [MPEG-4 Part X, 2007], which is a much more sophisticated and promising solution, but the official standard was only finished in July 2007 and thus could not be used within this work.


calculating the motion compensation errors (MCE), are applied. Otherwise, the pixel values (PV) are delivered to the transform step (binDCT). The binDCT is an integer transform using a lifting scheme; it is similar in its characteristics to the discrete cosine transform (DCT), but it is invertible (i.e. lossless) and produces larger numbers in the output [Liang and Tran, 2001]. Here, depending on the frame type, either the PV or the MEP values are transformed from the time domain (pixel-value domain) to the frequency domain (coefficient domain).

Figure 13. Simplified LLV1 algorithm: a) encoding and b) decoding.


As the next step, a quantization similar to the well-known H.263 quantization from the MPEG-4 reference software [MPEG-4 Part V, 2001] is applied to the coefficient values. A different quantization parameter (QP) is used for each layer: QP=3 for the quantization base layer (QBL), QP=2 for the first quantization enhancement layer (QEL1), QP=1 for the second QEL and QP=0 for the third QEL – denoted correspondingly by Q3, Q2, Q1 and Q0. Subsequently, variable-length encoding (VLE) is applied to the base layer coefficients and the accompanying motion vectors, producing the BL bitstream. In parallel, the quantization bit-plane calculation (QPC) is applied sequentially to the coefficients of QEL1, then QEL2 and finally QEL3. The QPC computes the prediction values and the sign values (q-plane) using the formulas defined later. Each q-plane is then coded separately by the bit-plane variable-length encoding (BP VLE), which produces the encoded bitstreams of all QELs.

The decoding is the inverse of the encoding. The encoded input bitstreams are specified for the decoding – there must of course be the BL and optionally ELs, so the decoder accepts 1, 2, 3 or 4 streams. The BL bitstream is decoded by variable-length decoding (VLD), and the quantization coefficients (QC) or, respectively, the quantization motion compensation error (QMCE) are inverse quantized (IQ). Next, the inverse transform step (IbinDCT) is executed. Motion compensation (MC) using the motion vectors (MV) is additionally performed for predicted frames. For the enhancement layers, the bit-plane variable-length decoding (BP VLD) is the first step. Secondly, the quantization plane reconstruction (QPR), based on the QC of the layer below, is conducted. Finally, IQ and IbinDCT are performed only for the highest-quality quantization plane. The complete decoding algorithm with a detailed explanation is given in Appendix B.

V.4.2. Mathematical Refinement of Formulas

A few details needed to cover all the problematic aspects have been missing in the previous papers, e.g. how exactly to calculate the bit-plane values and when to store the sign information in the quantization enhancement layer (QEL). The formula given in the previous papers (no. 3.2 on p. 22 in [Militzer, 2004] and no. 1 on p. 440 in [Militzer et al., 2005]) for calculating the coefficient of an enhancement layer in the decoding process looks, after refinement, like Equation (12):


$$C_i = \begin{cases} 2 \cdot C_{i-1} + P_i & \Leftrightarrow C_{i-1} > 0 \\ 2 \cdot C_{i-1} - P_i & \Leftrightarrow C_{i-1} < 0 \\ P_i \cdot S_i & \Leftrightarrow C_{i-1} = 0 \end{cases} \;:\; P_i \in \{0,1\} \wedge S_i \in \{-1,1\} \wedge i \in \{1,2,3\} \quad (12)$$

where i denotes the current enhancement layer, C_i is the calculated coefficient of the current layer i, C_{i-1} is the coefficient of the layer below, P_i is the predicted value and S_i is the sign information.

Moreover, the formulas for calculating the prediction value and the sign information of a given bit plane during the encoding process have been neither officially stated nor published before. They are now given in Equation (13) and Equation (14), respectively:

$$P_i = C_i - 2 \cdot C_{i-1} \;\wedge\; i \in \{1,2,3\} \quad (13)$$

$$S_i = \begin{cases} -1 & \Leftrightarrow C_i < 0 \wedge C_{i-1} = 0 \\ \;\;\,1 & \Leftrightarrow C_i \ge 1 \vee C_{i-1} \neq 0 \end{cases} \;:\; i \in \{1,2,3\} \quad (14)$$

Please note that the predicted value and the sign information are stored only for the enhancement layers, so no error occurs when the current layer is the base layer. Moreover, the sign information is stored only when the negative coefficient of the next lower layer has been zeroed and only if the current coefficient is smaller than 0 (i.e. negative).
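A minimal C sketch of the decode-side reconstruction according to Equation (12) may help in reading the formula; the function and variable names are chosen for this example only and do not claim to reproduce the LLV1 decoder code.

/* Reconstruction of an enhancement-layer coefficient according to Equation (12):
 * C_i is rebuilt from the coefficient of the layer below (C_{i-1}), the stored
 * prediction bit P_i and, only when C_{i-1} == 0, the stored sign S_i. */
#include <stdio.h>

static int reconstruct_coeff(int c_prev, int p, int s)
{
    if (c_prev > 0) return 2 * c_prev + p;
    if (c_prev < 0) return 2 * c_prev - p;
    return p * s;                 /* C_{i-1} == 0: sign taken from S_i */
}

int main(void)
{
    /* example: base-layer coefficient 3, prediction bits 1, 0, 1 for QEL1..QEL3 */
    int c = 3;
    const int p[3] = { 1, 0, 1 };
    for (int i = 0; i < 3; i++) {
        c = reconstruct_coeff(c, p[i], 1);   /* the sign only matters if c == 0 */
        printf("QEL%d coefficient: %d\n", i + 1, c);
    }
    return 0;
}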

The calculation of the coefficient according to the algorithm described above is illustrated by an example in Figure 14, where QBL stands for the quantization base layer and QELx for the respective quantization enhancement layer (x ∈ {1,2,3}). The QBL stores signed coefficients, while the QELs store the predicted values as unsigned bit planes. Only the blue-colored data are actually stored in the LLV1 bitstream.


Figure 14. Quantization layering in the frequency domain in the LLV1 format.

V.4.3. Compression Efficiency

Compression efficiency is not the biggest advantage of LLV1, but it is still comparable to that of other lossless codecs. The results of a comparison with a set of popular lossless codecs are depicted for a few well-known sequences in Figure 15. YV12 has been used as the internal color scheme for all tested compression algorithms to guarantee a fair comparison.

The results – the output file sizes – are normalized to the losslessly encoded LLV1 output file sizes (100%), i.e. all layers are included. As can be seen, LLV1 performs no worse than most of the other codecs. In fact, LLV1 provides better compression than all other tested formats except LOCO-I, which outperforms LLV1 by approximately 9% on average. Of course, the other codecs have been designed especially for lossless compression and do not provide scalability features.


[Figure 15: compressed file sizes, normalized to LLV1 (= 100%, scale 0–140%), of HuffYUV, Lossless JPEG, Alparysoft, LOCO-I and LLV1 for the sequences Knightshields (720p), VQEG src20 (NTSC), Paris (CIF) and Silent (QCIF), plus the overall result.]

Figure 15. Compressed file-size comparison normalized to LLV1 ([Militzer et al., 2005]).

V.5. Video Processing Supporting Real-time Execution

This section focuses on MD-supported processing. First, the continuous MD set, which includes just one attribute applicable during LLV1 video decoding, is explained.

V.5.1. MD-based Decoding

The LLV1 decoding was designed from the outset to be scalable in its processing. Moreover, the processing of each layer was meant to be stable, i.e. the time spent per frame along the video was expected to be constant for a given type of frame or macro block. Thus it should have been sufficient to include just the static MD in the decoding process for prediction. This assumption was tested in practice and then refined: in order to support MD-based decoding, the existing LLV1 had to be extended by one important element placed in the continuous MD set, allowing for even more adaptive decoding. The granularity of adaptation was enhanced by allowing the decoding to stop in the middle of a frame, which corresponds to processing at the fine-grained macro block level.


Based on [Wittmann, 2005], the functionality of the existing best-effort LLV1 decoder has been extended70 by the possibility of storing, for all enhancement layers, the size occupied by each frame in the compressed stream. The best-effort LLV1 decoder therefore accepts an additional argument that defines whether the compressed frame sizes should be written to an additional file as continuous MD71. The original best-effort decoder had no need for this meta-data, since the whole frame was read for each enhancement layer that had to be decoded and it did not matter how long this took.

However, if the execution time is limited (e.g. the deadline is approaching), it can happen that not the complete frame is decoded from the compressed stream. In such a case, the real-time decoder should leave out some macro blocks at the end of the allocated time (period or timeslice). The problem is that it then cannot start decoding the next frame, due to the Huffman coding. Therefore the end of the frame (in other words, the position of the beginning of the next frame) has to be passed as meta-data.
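The following C sketch illustrates, under hypothetical names, how the stored compressed frame size can be used: the real-time decoder stops decoding macro blocks when the deadline is missed and then simply seeks to the next frame boundary instead of Huffman-decoding to the end of the frame. The stubs stand in for the real scheduling check and bit-plane decoding.

/* Sketch of resynchronisation via the per-frame size stored as continuous MD. */
#include <stdio.h>

typedef struct {
    FILE *stream;        /* compressed enhancement-layer bitstream       */
    long  frame_start;   /* file offset where the current frame begins   */
} el_decoder;

/* dummy stand-ins for the real-time scheduling check and the actual decoding */
static int deadline_missed(void)                 { return 0; }
static int decode_next_macroblock(el_decoder *d) { (void)d; return 0; }

static void decode_el_frame(el_decoder *d, long compressed_size, int mb_count)
{
    d->frame_start = ftell(d->stream);
    for (int mb = 0; mb < mb_count; mb++) {
        if (deadline_missed())
            break;                     /* leave the remaining MBs undecoded */
        decode_next_macroblock(d);
    }
    /* jump to the next frame boundary using the stored size, whether or not
     * the whole frame was decoded */
    fseek(d->stream, d->frame_start + compressed_size, SEEK_SET);
}

int main(void)
{
    el_decoder d = { tmpfile(), 0 };
    if (!d.stream) return 1;
    decode_el_frame(&d, 0, 396);       /* e.g. a CIF frame has 396 MBs */
    fclose(d.stream);
    return 0;
}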

V.5.2. MD-based Encoding

The idea of MD-supported encoding is explained throughout this work using the MPEG-4 video coding standard [MPEG-4 Part II, 2004]. This, however, does not limit the idea. Different encoders may reuse or partly adopt the existing MD. It may also be necessary to extend the MD set by parameters other than those proposed here.

V.5.2.1 MPEG-4 standard as representative

The MPEG-4 video coding standard was chosen as representative because of a few properties that are common to most video encoding techniques. First, it is a transform-based algorithm. There are several domain transforms that could be applied to video processing, such as the Fourier Transform (FT) [Davies, 1984] in its discrete form (DFT), usually computed by the fast Fourier transform (FFT), the Discrete Sine Transform (DST), the Discrete Cosine Transform (DCT) [Ahmed et al., 1974],

70 The additional argument that defines whether this information should be written to an extra MD file is available through the –m switch of the decoder. 71 This functionality should be included on the encoder side (and used during the analysis phase). However, it required less programming effort to include it on the decoder side, decoding the existing streams in order to extract the sizes of the frames in the ELs of the considered media objects.


the Laplace Transform (LT) [Davies, 1984] or the Discrete Wavelet Transform (DWT) [Bovik, 2005]. All these transforms, if applied to video compression, have to consider two dimensions (2D), the height and width of each frame, but it is also possible to express the 2D transform through 1D transforms. Of the available transforms, only the DCT has been widely accepted in video processing, because: 1) the DCT is equivalent to a DFT of roughly twice the length (which would be more complex to calculate), 2) the FFT could be a competitor but produces larger coefficients from the same input data (so it is harder to handle by entropy coding, and thus less efficient compression is obtained), 3) the DCT operates on real data with even symmetry (in contrast to the odd DST), 4) there are extremely fast 1-D DCT implementations (in comparison to other transforms) operating on 8x8 matrices of values (representing luminance or chrominance)72, and 5) the 2D-DWT does not allow applying ME/MC techniques for exploiting temporal redundancy73. There are floating-point as well as integer implementations of the fast and well-accepted DCT; to be more precise, the 1-D DCT Type II given by Equation (15) [Rao and Yip, 1990] is the most popular due to the wide range of proposed low-complexity approximations and fast integer-based implementations for many computing platforms [Feig and Winograd, 1992]. The DCT Type II is applied in the MPEG standards as well as in the ITU-T standards and in many other non-standardized codecs. Hence, the resulting coefficients are expected to be similar in the codecs using this type of DCT.

$$Z_k = c_k \sqrt{\frac{2}{N}} \sum_{n=0}^{N-1} x_n \cos\frac{k(2n+1)\pi}{2N} \quad \text{where } c_k = \begin{cases} \frac{1}{\sqrt{2}} & \Leftrightarrow k = 0 \\ 1 & \Leftrightarrow k \neq 0 \end{cases} \quad (15)$$
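For illustration, the following C sketch evaluates Equation (15) directly in O(N²); production codecs use fast low-complexity integer approximations instead, so this is not how XVID or LLV1 actually compute the transform. The sample values are made up.

/* Straightforward evaluation of the 1-D DCT Type II from Equation (15). */
#include <math.h>
#include <stdio.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

#define N 8

static void dct_ii(const double *x, double *z)
{
    for (int k = 0; k < N; k++) {
        double ck = (k == 0) ? 1.0 / sqrt(2.0) : 1.0;
        double sum = 0.0;
        for (int n = 0; n < N; n++)
            sum += x[n] * cos(k * (2 * n + 1) * M_PI / (2.0 * N));
        z[k] = ck * sqrt(2.0 / N) * sum;
    }
}

int main(void)
{
    /* one row of luminance samples (made-up values) */
    const double x[N] = { 52, 55, 61, 66, 70, 61, 64, 73 };
    double z[N];
    dct_ii(x, z);
    for (int k = 0; k < N; k++)
        printf("Z[%d] = %7.2f\n", k, z[k]);
    return 0;
}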

Secondly, as a generalization of video coding algorithms, almost all transform-based algorithms include two types of frames: intra-coded and inter-coded frames can be distinguished. Intra-coded processing does not include any motion algorithms (estimation/prediction/compensation), while the inter-coding technique uses motion algorithms intensively and encodes not the coefficients of the pixel values (as intra-coding does) but the error coming from the difference between two compared sets of pixels (usually 8x8 matrices).

72 In the newest video encoding algorithms, other derivatives of the DCT are employed. For example, in MPEG-4 AVC it is possible to use an integer-based DCT that operates on 4x4 matrices, which is faster in the calculation of 256 values (4 matrices vs. 1 matrix of the old DCT). 73 There are also fast 3D-DWTs available, already mentioned in the related work, but they are omitted due to their inapplicability in real time.


Figure 16. DCT-based video coding of: a) intra-frames, and b) inter-frames.

Thirdly, it can be noticed that there are common parts in the DCT-based algorithms [ITU-T Rec. T.81, 1992], which may still differ slightly in their implementation. These are namely: 1) quantization (the best-known commonly used types are the MPEG-based and the H.263-based one, the latter of which may also be used in MPEG-compliant codecs), 2) coefficient scanning according to a few patterns (horizontal, vertical, and the most popular zig-zag), 3) run-length encoding, 4) Huffman encoding (the Huffman tables often differ from codec to codec), and, in the case of inter-frames, 5) motion estimation and prediction generating the motion vectors and the motion compensation error (also called prediction error), which uses 6) the frame buffer with the previous and/or following frames relative to the currently processed frame. A more detailed comparison of MPEG-4 vs. H.263 can be found in Appendix C.
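Two of these common parts can be sketched briefly in C: the zig-zag scan order, generated here algorithmically instead of using the usual hard-coded table, and a simplified run-length encoding of the scanned coefficients. Real codecs additionally map the (run, level) pairs onto Huffman/VLC codes, which is omitted here; the sample block values are made up.

/* Zig-zag scanning of an 8x8 block and a simplified RLE of the coefficients. */
#include <stdio.h>

static void zigzag_order(int order[64])
{
    int idx = 0;
    for (int d = 0; d < 15; d++) {           /* d = row + column (anti-diagonal) */
        int lo = (d < 8) ? 0 : d - 7;        /* valid row range on diagonal d    */
        int hi = (d < 8) ? d : 7;
        if (d % 2)                           /* odd diagonal: top-right -> bottom-left */
            for (int r = lo; r <= hi; r++) order[idx++] = r * 8 + (d - r);
        else                                 /* even diagonal: bottom-left -> top-right */
            for (int r = hi; r >= lo; r--) order[idx++] = r * 8 + (d - r);
    }
}

static void rle_block(const int block[64])
{
    int order[64];
    zigzag_order(order);
    int run = 0;
    for (int i = 0; i < 64; i++) {
        int level = block[order[i]];
        if (level == 0) { run++; continue; }
        printf("(run=%d, level=%d) ", run, level);
        run = 0;
    }
    printf("EOB\n");                         /* trailing zeros collapse into EOB */
}

int main(void)
{
    int block[64] = { 0 };                   /* made-up quantized coefficients */
    block[0] = 12; block[1] = -3; block[8] = 5; block[17] = 1;
    rle_block(block);
    return 0;
}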

V.5.2.2 Continuous MD set for video encoding.

Based on the mentioned commonalities of the DCT-based compression algorithms, a few kinds of information are stored as continuous MD for video, namely: the frame coding type, the bipred value, the MB width and height, the MB mode and priority, the three most significant coefficient values, and the motion vector(s). Especially the fine-granularity information on coefficient values and motion vectors makes the size relatively large, and thus the continuous MD should be compressed along with the media bit stream (as tests showed, yielding about 2% of the LLV1 stream size [Suchomski et al., 2005]). Of course, the continuous MD are not interleaved with the video data but stored as a separate bitstream (e.g. to allow media data passed in direct delivery to be decoded by a standard decoder). What is important is the time dependency of the continuous MD due to their close


relationship to the given frame; thus they should be delivered by a real-time-capable system along with the video stream.

Parts of the proposed continuous MD set have already been proposed in the literature, as mentioned at the beginning of Section V within this chapter. However, these parts have not yet been collected in one place to make the continuous MD set comprehensive. Therefore, the elements of the continuous MD set are described in the following.

Frame coding type. It holds information on the type of a given frame in the video sequence (I-, P- or B-frame). The type is detected in the analysis process to eliminate the resource-demanding scene detection and frame-type decision. Furthermore, it allows a better prediction of the compression behavior and resource allocation, because the three types of frames are compressed in very different ways due to the use of dissimilar prediction schemes.

Bipred value. The bipred value is used in addition to the frame coding type and indicates whether two vectors have been used to predict a block – if any block in the frame has two vectors, the value is set. This is a useful optimization for bidirectionally predicted frames (B-frames): if just one reference frame is to be used during encoding, the algorithm may skip the interpolation of the two referenced pixel matrices (with all the accompanying processing).

Macro block width and height. This is simply the number of MBs in the two directions, horizontal and vertical respectively. It allows us to calculate the number of MBs in the frame (it was not assumed that this is always the same).

Macro block mode. Similarly to the frame coding type, the information about the macro block (MB) mode allows distinguishing whether the MB should be coded as intra, inter or bi-directionally predicted. It also stores more detailed information. For example, if we consider only bi-directionally predicted MBs, the following five modes are possible: forward predicted, backward predicted, direct mode with delta vector74, direct mode without MV, and not coded. The special case of a bi-directional prediction with two vectors is considered separately (by the bipred

74 For each luminance block there may be none, one or two vectors, which means that there may be none, 4 x 1 or 4 x 2 vectors for the luminance in an MB. There is, however, one more “delta vector” mode, i.e. the backward or forward vector is the same for all four luminance blocks.


value). If we go further and take H.264/AVC [ITU-T Rec. H.264, 2005] into consideration, there can be 23 different types for similar bidirectional MBs. For intra MBs there are 25 possible types in H.264/AVC, but only 2 in MPEG-4 (inter MBs have 5 types in both standards).

Such a diversity of MB coding types makes the compression algorithm very complex, because it has to find the optimal type. Thus, using meta-data from the analysis process significantly speeds up the compression and allows the execution path to be recognized, which is useful for resource allocation.

Macro block priority. Additionally, if the priority of an MB is considered, the processing can be influenced by calculating the MBs with the highest importance first, which is done in our implementation for the intra blocks (they have the highest priority). Moreover, depending on the complexity of the blocks within the MB, the encoder can assign more or less memory to the respective blocks in the quantization step [Symes, 2001]. Then, not only the complexity of a block but also the importance of the current MB can influence the bit allocation algorithm and optimize the quantization parameter selection, thus positively influencing the overall perceived quality.

Most significant coefficients. The first three coefficients, the DC and two AC coefficients, are stored in addition, but only for the intra-coded blocks. This allows for better processing control, since the coefficient calculation (DCT) can be skipped in case of a lack of resources – in such a situation it is possible to deliver estimated instead of the real pixel values and still provide acceptable picture quality. Since the DCTs of different codecs are expected to work in a similar way, storing the first three coefficients influences the size of the MD only a little, but on the other hand allows skipping the DCT calculation without the cost of dropping the whole macro block. In other words, the quality provided when the MB is skipped and just the three coefficients are used will still be higher than zero.

Motion vectors. They could be stored for the whole MB or for parts of an MB called blocks; our current implementation considers each block separately. Of course, only temporally predicted MBs require associated MVs (MVs do not exist for intra MBs at all). However, different compression algorithms use different numbers of MVs per MB, e.g. H.264/AVC allows keeping up to 16 MVs per MB. So, if all possible


combinations were searched (using e.g. quarter-pel accuracy), the execution time would explode. Thus pre-calculated MVs help a lot in real-time encoding, even though in some cases they are not directly applicable (in such situations they should be used in adapted form).

The loading of the continuous MD is to be implemented as part of the real-time encoder. Pseudo code showing how to implement it is given in Section XVII of Appendix D.
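For illustration only, a possible in-memory layout of the described continuous MD could look like the following C sketch; the field names and widths are assumptions made for this example and do not reproduce the actual RETAVIC bitstream layout of Appendix D.

/* One possible in-memory layout of the continuous MD described above. */
#include <stdint.h>
#include <stdio.h>

enum frame_coding_type { FRAME_I, FRAME_P, FRAME_B };

typedef struct {
    int16_t mv_x, mv_y;           /* pre-calculated motion vector per block    */
} md_block;

typedef struct {
    uint8_t  mode;                /* intra / inter / bi-directional variants   */
    uint8_t  priority;            /* importance of the MB (intra blocks highest) */
    int16_t  coeff[3];            /* DC + two AC coefficients (intra blocks only) */
    md_block blocks[4];           /* per-block motion vectors (luminance)      */
} md_macroblock;

typedef struct {
    enum frame_coding_type type;  /* frame coding type                          */
    uint8_t  bipred;              /* set if any block of the frame uses two MVs */
    uint16_t mb_width, mb_height; /* number of MBs horizontally / vertically    */
    md_macroblock *mb;            /* mb_width * mb_height entries               */
} md_frame;

int main(void)
{
    md_frame f = { FRAME_P, 0, 22, 18, NULL };   /* e.g. CIF: 22 x 18 MBs */
    printf("MBs per frame: %u\n", (unsigned)(f.mb_width * f.mb_height));
    return 0;
}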

V.5.2.3 MD-XVID as proof of concept

XVID is a state-of-the-art open-source implementation of the MPEG-4 standard for visual data [MPEG-4 Part II, 2004], more precisely for rectangular video (natural and artificial) supporting the Advanced Simple Profile. It was chosen as the base for the representative meta-data-based encoder due to its good compression [Vatolin et al., 2005], stable execution and source-code availability. It was adapted to support the proposed continuous MD set, i.e. the algorithm given in the previous section has been implemented. The design details of the MD-based XVID encoder can be found in [Militzer, 2004]; here only an overview is given.

The XVID encoder is depicted in Figure 17 a); it combines the previously discussed DCT algorithm for inter- and intra-frames (Figure 16) and optimizes the quality through an additional decoding part within the encoder. The first step done by XVID is the decision about the frame type, needed to apply the further steps selectively, which is crucial for the coding efficiency. As given in the literature [Henning, 2001], the compression ratio of the MPEG-4 standard depends on the frame type and typically amounts to 7:1 for I-frames, 20:1 for P-frames and 50:1 for B-frames. Of course, these ratios can change depending on the temporal and spatial similarity, i.e. on the number of different MB types encoded within the frame and the accompanying MVs. Then, depending on whether a P- or a B-frame is to be processed, motion estimation, which produces the motion vector data, is applied together with motion prediction delivering the motion compensation error. Additionally, coding reordering must take place in the case of B-frames, and two reference frames are used instead of one, so the ME/MP complexity is higher than for a P-frame. Next, the standard steps of DCT-based encoding are applied: DCT, quantization, zig-zag scanning, RLE and Huffman encoding. Due to the mentioned quality optimization by the additional decoding loop, the lossily transformed and quantized AC/DC coefficients are decoded back through dequantization, inverse DCT and


finally motion compensation, which is applied if the reconstructed frame is an inter-coded frame (precisely a P-frame). Such an encoder extension imitates the client-side behavior of the decoder and provides a common basis for the reference-frame reconstruction, thus avoiding an additional reference error deriving from the difference between the decoded reference frame and the original reference frame.

A representative of the meta-data-based video encoder based on XVID (called MD-XVID for short) is depicted in Figure 17 b). The lossless meta-data decompression step is omitted from the picture for simplicity, but in reality it takes place as suggested in Section V.1. The depicted MD represents not both types of MD but only the continuous set as given in Section V.5.2.2, i.e. all seven MD elements are included, and for each of them an arrowed connector shows the flow of meta-data to the respective module of the encoding process. The Frame Type Decision from Figure 17 a) is replaced by the much faster MD-Based Frame Type Selection. The Motion Estimation step present in Figure 17 a) is completely removed in MD-XVID thanks to the MV data flowing into the Motion Prediction step, and the MP is simplified due to the inflow of three additional elements of the continuous MD set, namely the bipred value, MB mode and MB priority.

Finally, the first three and most significant coefficients are delivered to the Quantization step if and only if they have not been calculated beforehand (the X-circle connector symbolizes the logical XOR operation). This last option allows delivering the lowest possible quality of the processed MBs in high-load situations. For example, if previous steps like motion prediction and the DCT could not be finished on time, or if the most important MBs took too much time and the remaining MBs are not going to be processed, then just the lowest possible quality, consisting of the first three coefficients and zeros for the rest, is further processed and finally delivered.
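The fallback path can be sketched as follows in C; the assumption that the three stored values occupy the first three positions of the scan order (the DC plus the first two AC coefficients) is made only for this example, and the names are illustrative.

/* Fallback: build a block from the three coefficients stored as continuous MD
 * (assumed here to be the first three in scan order) and zeros elsewhere. */
#include <string.h>
#include <stdio.h>

static void fallback_block(int out[64], const int stored_coeff[3])
{
    memset(out, 0, 64 * sizeof out[0]);
    out[0] = stored_coeff[0];     /* DC                              */
    out[1] = stored_coeff[1];     /* first AC (zig-zag position 1)   */
    out[8] = stored_coeff[2];     /* second AC (zig-zag position 2)  */
}

int main(void)
{
    const int md_coeff[3] = { 104, -7, 3 };   /* made-up values from the MD */
    int block[64];
    fallback_block(block, md_coeff);
    printf("block[0]=%d block[1]=%d block[8]=%d\n", block[0], block[1], block[8]);
    return 0;
}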

Optionally, there could also be an arrowed connector in Figure 17 b) symbolizing the flow of the MB priority to the Quantization step, thus allowing for even better bit allocation by an extended bit-rate control algorithm. However, this option has been neither implemented nor investigated yet, and so it is left for further work.


Figure 17. XVID Encoder: a) standard implementation and b) meta-data based implementation.


V.6. Evaluation of the Video Processing Model through Best-Effort Prototypes

The video processing model has been implemented at first as a best-effort prototype in order to evaluate the assumed ideas. A few critical aspects reflecting the core components of the architecture (from Phase 2: conversion to the internal format and content analysis; from Phase 3: decoding and encoding) have been considered, namely: the generation of the MD set by the analysis step, the encoding to the internal storage format demonstrating the scalability in the data amount and quality, the decoding from the internal format exhibiting the scalability of the processing in relation to the data quality, and the encoding using the prepared MD, where the quality of the delivered data and the processing complexity are considered. Finally, the evaluation of the complete processing of the video transcoding chain is done with respect to the execution time.

The evaluation of the static MD is not included in this section, because it can be done only with the help of an environment with well-controlled timing (i.e. it is included in the evaluation of the implementation in the real-time system).

V.6.1. MD Overheads

In the best-effort prototype, the content analysis of Phase 2 generating the MD has been integrated with the conversion step transforming the video data from the source format to the internal format. Here, the implemented LLV1 encoder has been extended by the content analysis functionality, such that the LLV1 encoder delivers the required statistical data describing the structure of the lossless bit stream used for scheduling as well as the data required for the encoding process in real-time transcoding, mainly used by the real-time encoder. Depending on the number of LLV1 layers to be decoded and the requested output format properties of the transcoding process, not all generated MD may be required. In these cases, only the really necessary MD can be accessed selectively, so that a waste of resources is avoided.

The overhead of introducing the continuous meta-data is depicted in Figure 18 a). The average cost for the 25 well-known sequences, calculated as the ratio of the continuous MD size to the LLV1 base layer size multiplied by 100%, was equal to 11.27%. The measure related to the base layer was used instead of the relation to the complete LLV1 (i.e. including all layers) because the differences can be noticed more easily; the cost of the continuous MD in relation to the complete LLV1 amounts on average to 1.63% with a standard deviation of 0.34%.
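Written out, the two overhead measures used above are simply the following (a restatement of the definition from the text, with sizes taken per sequence):

\mathrm{cost}_{\mathrm{BL}} = \frac{size_{\mathrm{cMD}}}{size_{\mathrm{BL}}} \cdot 100\,\%, \qquad \mathrm{cost}_{\mathrm{LLV1}} = \frac{size_{\mathrm{cMD}}}{size_{\mathrm{LLV1}}} \cdot 100\,\%

which over the 25 test sequences average 11.27% and 1.63% respectively.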

[Figure 18 charts: a) "Cost of continuous MD": size_cMD / size_BL [%] (0%-30%) for each tested sequence; b) "Distribution": number of results falling below the given cost range.]

Figure 18. Continuous Meta-Data: a) overhead cost related to LLV1 Base Layer for tested sequences and b) distribution of given costs.

The difference between the average cost for all sequences and the sequence-specific cost is also depicted by the thin line with the marks. Obviously, the size of the continuous MD is not constant, because it heavily depends on the content, i.e. on the motion characteristics including MVs and coefficients, and additionally it is influenced by the lossless entropy compression. However, the precise relationship between the continuous MD and the content of the sequence was neither mathematically defined nor further investigated.

Additionally, the distribution of the MD costs with respect to all tested sequences has been calculated using the frequency of occurrence in equally sized ranges of 5% with a maximum value of 30%75. This distribution is depicted in Figure 18 b). It is clearly visible that in most cases the MD overhead lies in the range between 10% and 15%; an MD overhead lower than 15% occurs in 80% of the cases (20 out of 25 sequences), and lower than 25% in 96% of the cases.

75 The ranges are as follows: (0%; 5%), [5%; 10%), [10%; 15%), [15%; 20%), [20%; 25%), and [25%; 30%). There was no continuous MD cost above 30% measured.

Finally, the static MD overhead was calculated. The average cost in relation to the LLV1 BL amounts to 0.72% with a standard deviation below 0.93%, so such a small overhead can easily be neglected with respect to the total storage space for the complete video data. If one considers that the LLV1 BL occupies between 10% and 20% of the losslessly stored video in the LLV1 format, then the overhead can be treated as unnoticeable, on the order of 1‰ of the complete LLV1.
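As a rough plausibility check of this order of magnitude, multiplying the measured 0.72% (relative to the BL) by the two bounds given above for the BL share yields:

0.72\,\% \cdot 0.10 = 0.072\,\% \approx 0.7\text{‰}, \qquad 0.72\,\% \cdot 0.20 = 0.144\,\% \approx 1.4\text{‰}

i.e. roughly one per mille of the complete LLV1.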

To summarize, the static MD is just a small part of the proposed MD set, and the continuous MD plays the key role here. Still, the continuous MD introduces only a very small overhead (1.6%) when related to the complete lossless data, and a small overhead (11.3%) when measured only against the LLV1 base layer. Please note that the LLV1 BL also has a limited level of data quality, as shown in the next section, but the size of the continuous MD is constant for a given sequence regardless of the number of LLV1 layers used in the later processing.

V.6.2. Data Layering and Scalability of the Quality

The scalability of the data with respect to quality is an important factor in the RETAVIC architecture. The LLV1 was designed with the purpose of having the data layered in an additive way without unnecessary redundancy in the bit streams. So the proposed layering method provides many advantages over traditional non-scalable lossless video codecs such as HuffYUV [Roudiak-Gould, 2006], Motion CorePNG, AlparySoft Lossless Video Codec [WWW_AlparySoft, 2004], Lossless Motion JPEG [Wallace, 1991] or LOCO-I [Weinberger et al., 2000]. Especially in the context of the RETAVIC transformation framework targeting format independence, where a changeable level of quality is a must, the data must not be accessible only on an all-or-nothing basis without scalability; merely reducing the file size of the compressed video to a manageable amount [Dashti et al., 2003] is not enough anymore.

The organization of the data blocks allocated on the storage device is expected to follow the LLV1 layering, such that the separate layers can be read efficiently and independently of each other, ideally sequentially, due to the highest throughput efficiency of today's hard drives, which then achieve their peak performance [Sitaram and Dan, 2000]. On the other hand, the data prefetching mechanism must consider the time constraints of the processed quanta of each layer, which is hardly possible with sequential reading for a varying number of data layers being stored separately. The recently proposed rotational-position-aware RT disk scheduling using a dynamic active subset, which exploits QAS (discussed in the related work), can be helpful here [Reuther and Pohlack, 2003]. This algorithm provides a framework optimizing the disk utilization under hard and statistical service guarantees by calculating the set of active (out of all outstanding) requests at each scheduling point such that no deadlines are missed. The optimization is provided within the active set by employing the shortest-access-time-first (SATF) method considering the rotational position of the request.

The time constraints for write operations are left out intentionally due to their unimportance to the RETAVIC architecture, in which only the decoding process of the video data stored on the server must meet real-time access requirements. Since the conversion from input videos into the internal storage format is assumed to be a non-RT process, it does not require a time-constrained storing mechanism so far.

The major advantage of the layered approach through bit-stream separation in LLV1 is the possibility of defining picture quality levels and temporal resolutions by the user, where the amount of requested as well as really accessed data can vary (Figure 19). In contrast, other scalable formats designed with network scalability in mind cannot simply be accessed in a scalable way; in such a case, the video bit stream has to be read completely (usually sequentially) from the storage medium and unnecessary parts can be dropped only during the decoding or transmission process, where two possibilities exist: 1) the entropy decoding takes place to find out the layers, or 2) the bit stream is packetized such that the layers are distinguishable without decoding [MPEG-4 Part I, 2004]. Regardless of the method, all the bits have to be read from the storage medium.

Obviously, lossless content requires significantly higher throughput than the lossy case. Even though modern hard disks are able to deliver unscaled and losslessly coded content in real time, it is a waste of resources to always read the complete bitstream even when not required. The scalable optimization of data prefetching for further processing in RETAVIC is simply delivered by a scheme where just a certain subset of the data is read, in which the number of layers, and thus the amount of data, can vary, as shown for the Paris (CIF) sequence in Figure 19 a). The base layer takes only 6.0% of the completely coded LLV1 including all the layers (Figure 19 b)). The addition of the temporal enhancement layer to the base layer (BL+TEL) brings the storage space requirement for the LLV1-coded Paris sequence to the level of about 9.6%. A further increase of the data amount by the successive quantization enhancement layers (QELs), i.e. QEL1 and QEL2, raises the required size to 32.1% and 65.6% respectively for this specific sequence. The last enhancement layer (QEL3) needs about 34.4% to make the Paris sequence lossless.
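The cumulative percentages quoted above simply add up the per-layer shares shown in Figure 19 b):

6.0\,\% + 3.6\,\% = 9.6\,\%, \qquad 9.6\,\% + 22.5\,\% = 32.1\,\%, \qquad 32.1\,\% + 33.5\,\% = 65.6\,\%, \qquad 65.6\,\% + 34.4\,\% = 100\,\%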

[Figure 19 charts for the Paris (CIF) sequence compressed in the LLV1 format: a) cumulated sizes (0-100 MB scale) for BL, BL+TEL, BL+TEL+QEL1, BL+TEL+QEL1+QEL2 and all layers; b) share of each layer: BL 6.0%, TEL 3.6%, QEL1 22.5%, QEL2 33.5%, QEL3 34.4%.]

Figure 19. Size of LLV1 compressed output: a) cumulated by layers and b) percentage of each layer.

The LLV1 layering scheme, in which the size of the base layer is noticeably smaller than the volume of all the layers combined (Figure 19), entails a number of disk requests made by the decoding process that is directly proportional to the amount of data actually requested. Thus, LLV1 offers new freedom for real-time admission control and disk scheduling, since most of the time users actually request videos at a significantly lower quality than the internally stored lossless representation due to limited network or playback capabilities (e.g. handheld devices or cellular phones). As a result, by using separated bit streams in the layered representation, like the LLV1 format, for internal storage, the waste of resources used for data access can be reduced. Consequently, the separation of bit streams in LLV1 delivers a more efficient use of the limited hard-disk throughput and allows more concurrent client requests to be served.
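A minimal sketch of such layer-selective access on the server side follows; the layer identifiers, file naming and read routine are hypothetical and only illustrate that a lower requested quality directly translates into fewer byte ranges being fetched from disk.

#include <stdio.h>
#include <stdlib.h>

/* Hypothetical layer identifiers following the LLV1 layering (BL, TEL, QEL1..QEL3). */
enum llv1_layer { LLV1_BL = 0, LLV1_TEL, LLV1_QEL1, LLV1_QEL2, LLV1_QEL3, LLV1_LAYERS };

/* Read one separately stored layer bitstream into memory (placeholder I/O). */
static long read_layer(const char *basename, enum llv1_layer layer, unsigned char **buf)
{
    char path[256];
    snprintf(path, sizeof(path), "%s.layer%d", basename, (int)layer);
    FILE *f = fopen(path, "rb");
    if (!f) return -1;
    fseek(f, 0, SEEK_END);
    long size = ftell(f);
    fseek(f, 0, SEEK_SET);
    *buf = malloc((size_t)size);
    long got = (long)fread(*buf, 1, (size_t)size, f);
    fclose(f);
    return got;
}

/* Only the layers needed for the requested quality are touched on disk;
 * e.g. quality level 1 corresponds to BL+TEL, level 4 to the lossless case. */
static void prefetch_for_quality(const char *basename, int quality_level)
{
    unsigned char *buf;
    for (int l = LLV1_BL; l < 1 + quality_level && l < LLV1_LAYERS; ++l) {
        long n = read_layer(basename, (enum llv1_layer)l, &buf);
        if (n >= 0) { printf("layer %d: %ld bytes\n", l, n); free(buf); }
    }
}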

To verify whether the layering scheme is as efficient for other video sequences as for the Paris sequence (Figure 19), a subset of well-known videos [VQEG (ITU), 2005; WWW_XIPH, 2007] has been encoded, and the size required for the data on each level of the LLV1 layering scheme has been compared to the original size of the uncompressed video sequence. The results are depicted in Figure 20. There the base layer was always investigated together with the temporal enhancement layer in order to retain the frame rate of the original temporal resolution. It can be noticed that the size of the base layer (including TEL) proportionally influences the next layer, the QEL1 built directly on top of BL+TEL. For example, the BL+TEL of the Mobile sequence crosses the 20% line in all three resolutions (CIF, QCIF, CIFN), and likewise Mobile's QEL1s are above the average. In contrast, Mother and Daughter, Foreman or News have BL+TELs as well as QEL1s smaller than the respective average values. Thus it may be derived that the bigger the base layer, the bigger the share of QEL1 in the LLV1-coded sequence. In contrast, the QEL2 and QEL3 are only slightly influenced by the BL+TEL compression size and both are almost constant: the average compression size with respect to the original uncompressed sequence amounts to 19% and 19.2% respectively, and the standard deviation is equal to 0.47% and 0.41%.

[Figure 20 chart: "Relation of LLV1 layers to uncompressed video": percentage (0-100%) of the original size taken by BL+TEL, QEL1, QEL2, QEL3 and all layers together for each tested sequence.]

Figure 20. Relation of LLV1 layers to the original uncompressed video in YUV format.

Summarizing, it may be deduced that QEL2 and QEL3 are resistant to content changes and have an almost constant compression ratio/size, while QEL1 depends somewhat on the content and BL+TEL is very content-dependent. For QEL1 the minimum compression size amounts to 8.2% and the maximum to 17.8%, so the MAX/MIN ratio reaches 2.16; the BL+TEL, demonstrating even less stability, achieves 3.4% and 31.6% respectively, with a MAX/MIN ratio equal to 9.41. In contrast, the MAX/MIN ratios of QEL2 and QEL3 are equal to 1.11 and 1.10. On the other hand, the content-independent compression may indicate a suboptimal compression scheme that is not capable of exploiting the signal characteristics and the entropy of the source data on the higher layers, but some more investigation is required to prove such a statement.

Figure 21. Distribution of layers in LLV1-coded sequences showing average with deviation.

Additionally, the distribution of the layers in the LLV1-coded sequences has been calculated and is depicted in Figure 21. The average is calculated for the same set of videos as in Figure 20, but this time the percentage of each quantization layer76 within the LLV1-coded sequence is presented. The maximum and minimum deviation for each layer is determined; it is given as superscript and subscript assigned to the given average and depicted as red half-rings filled with the layer's color. The BL+TEL occupies less than two-elevenths of the whole, QEL1 requires a bit more than one-fifth, and QEL2 and QEL3 each need a bit less than one-third. Obviously, the percentages of QEL2 and QEL3 are not almost-constant as in the previous case, because the size of the LLV1-coded sequences changes according to the higher or lower compression of BL+TEL and QEL1, whose compression sizes depend on the source data. The small deviation (both negative and positive) of QEL1 confirms its relationship to BL+TEL: if the BL+TEL size changes, the QEL1 size follows these variations such that the percentage of QEL1's size with respect to the changed total size of LLV1 is kept at the same level, fluctuating only in a small range (from -4.2 to +1.7). A higher or lower percentage of BL+TEL is reflected in corresponding losses or gains in the percentages of the two other enhancement layers (QEL2 and QEL3).

76 The base layer together with the temporal enhancement layer is used as the base for the QELs in the benchmarking for the sake of simplicity.

The scalability in the amount of data would be useless if no differentiation of the signal-to-noise ratio (SNR) took place. The LLV1 is designed such that the SNR scalability is directly proportional to the number of layers, and thus to the amount of data. The peak signal-to-noise ratio (PSNR)77 [Bovik, 2005] values of each frame between decoded streams including different combinations of layers are given in Figure 22 and Figure 23. Four combinations of quantization layering (temporal layering is again turned off) are investigated:

1) just base quantization layer,

2) base layer and quantization enhancement layer (BL+QEL1),

3) base layer, first and second quantization layer (BL+QEL1+QEL2)

4) all layers

The detailed results for Paris (CIF) are presented in Figure 22 and for Mobile (CIF) in Figure 23. The quality values of the fourth option, i.e. the decoding of all layers from BL up to QEL3, are not depicted in the graphs since the PSNR values of lossless data are infinite. The PSNR is based on the MSE and is a decreasing function of it, so an MSE of zero is a proof of lossless compression where no difference error between the images exists. Hence, the lossless property was proved by checking whether the mean square error (MSE) is equal to zero. The results show that the difference of the PSNR values between two consecutive layers averages about 5-6 dB, and this difference in quality between the layers seems to be constant for the various sequences. Moreover, there is no significant variation along the frames, so the perceived quality is also experienced as constant.

77 In most cases the PSNR value is in accordance with the compression quality. But sometimes this metric does not reflect the presence of some important visual artefacts. For example, the severity of blocking artefacts cannot be estimated, i.e. the compensation performed by some codecs as well as the presence of "snow" artefacts (strong flickering of isolated pixels) cannot be detected in the compressed video using only the PSNR metric. Moreover, it is difficult to say whether a 2 dB difference is significant or not in some cases.
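For reference (cf. footnote 77), the PSNR used throughout this comparison is the standard MSE-based definition for 8-bit samples (peak value 255), restated here only for convenience [Bovik, 2005]:

\mathrm{MSE} = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\left(x_{i,j}-\hat{x}_{i,j}\right)^{2}, \qquad \mathrm{PSNR} = 10 \log_{10}\frac{255^{2}}{\mathrm{MSE}}\ \mathrm{dB}

so MSE = 0 (bit-exact reconstruction) corresponds to an infinite PSNR, which is why the lossless case is not plotted.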

[Figure 22 chart: "Picture Quality for Different Layers (Paris - CIF)": PSNR [dB] (scale 30.00-44.00) per frame (1-1051) for the BL, QEL1 and QEL2 configurations.]

Figure 22. Picture quality for different quality layers for Paris (CIF) [Militzer et al., 2005]

[Figure 23 chart: "Picture Quality for Different Layers (Mobile - CIF)": PSNR [dB] (scale 30.00-44.00) per frame (1-251) for the BL, QEL1 and QEL2 configurations.]

Figure 23. Picture quality for different quality layers for Mobile (CIF) [Militzer et al., 2005]

Finally, the scalability in quality is achieved such that a lower layer has a proportionally lower quality. The PSNR value for the BL ranges from about 32 dB to roughly 34 dB, for the QEL1 it lies between 37 dB and 39 dB, and for the QEL2 between 43 dB and 44 dB. The lossless property of all layers together (from BL up to QEL3) has been confirmed.

V.6.3. Processing Scalability in the Decoder

Due to the data scalability in LLV1, the quantification of resource requirements in relation to the amount of data seems to be possible. This allows QoS-like definitions of the required resources, e.g. processing time, required memory, required throughput, and thus allows better resource management in a real-time environment (RTE). So, the data scalability enables some control over the decoding complexity with respect to the requested data amount and thus the data quality (Figure 24). The achieved processing scalability, being proportional to the number of layers and thus to the amount of data, fits the needs of the RETAVIC framework.

Figure 24. LLV1 decoding time per frame of the Mobile (CIF) considering various data layers [Militzer et al., 2005].

The decoding process of, e.g., the base layer and the temporal enhancement layer (BL+TEL) takes only about 25% of the time for decoding the complete lossless information (Figure 24). The LLV1 base layer is expected to provide the minimal quality of the video sequence, so only the BL decoding is mandatory. To achieve higher levels of data quality, additional layers have to be decoded at the cost of processing time. Thus, the LLV1 decoding process can be well partitioned into a mandatory part, requiring just very few computational resources, and additional optional parts, requiring the remaining 75% of the resources spent in total. Contrary to the decoding of traditional single-layered video, this scalable processing characteristic can be exploited according to the imprecise computation concept [Lin et al., 1987], where at first a low-quality result is delivered and then, according to the available resources and time, additional calculations are done to elevate the accuracy of the information. Moreover, the processing with mandatory and optional computation parts can be scheduled according to QAS [Hamann et al., 2001a].
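A minimal sketch of this mandatory/optional split, assuming a per-frame time budget, is given below; the stubbed decode routines and the budget value are placeholders and do not reflect the actual LLV1 decoder interfaces or the QAS admission machinery.

#include <time.h>
#include <stdio.h>
#include <stdbool.h>

/* Placeholder decode steps standing in for the real LLV1 routines:
 * BL+TEL is the mandatory part, QEL1..QEL3 are the optional parts. */
static void decode_bl_tel(void)      { /* mandatory, minimal quality */ }
static bool decode_next_qel(int qel) { return qel <= 3; /* pretend layer exists */ }

static double now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
}

/* Imprecise-computation style decoding of one frame: the mandatory part always
 * runs; enhancement layers are added only while the per-frame budget allows it. */
static int decode_frame_imprecise(double frame_budget_ms)
{
    double start = now_ms();
    int decoded_qels = 0;
    decode_bl_tel();
    for (int qel = 1; qel <= 3; ++qel) {
        if (now_ms() - start >= frame_budget_ms)
            break;                 /* deadline pressure: stop refining */
        if (!decode_next_qel(qel))
            break;
        ++decoded_qels;
    }
    return decoded_qels;           /* how many optional layers made it in time */
}

int main(void)
{
    printf("optional layers decoded: %d\n", decode_frame_imprecise(40.0 /* ms */));
    return 0;
}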

Moreover, the required computational resources for the decoding process can be controlled even more smoothly if the decoding granularity is not only the number of enhancement layers but also the per-frame or even the macroblock level; for example, the complete BL+TEL layers are decoded and QEL1 only partly, where partly means only some of the frames, or even only some important MBs of a given frame. In such a case only some frames or parts of frames will have a higher quality.

When considering LLV1 decoding versus non-scalable decoding, the computational complexity of decoding an LLV1 stream at a specific quality level (e.g. targeting a given PSNR value) is not much higher than that of decoding a single-layered representation at about the same quality. Such behavior is achieved because the dequantization and the inverse transform are executed only once, no matter whether just one or three quantization enhancement layers are used for decoding. The main overhead cost for higher quality then derives neither from the dequantization nor from the inverse transformation, but from the amount of data being decoded by the entropy decoder, which also in the case of single-layered decoders rises in accordance with the higher quality.

The described adaptivity in terms of computational complexity allows a media server to make better use of its resources. By employing LLV1 as the unified source format for real-time transformations, more concurrent client requests with lower quality or a smaller number of clients with higher quality can be served. Thus, the QoS control gains a controlling mechanism based on the construction of the internal storage format and can make decisions according to the established QoS strategy. Such QoS control would not be possible if no scalability were present in the data and in the processing.

Additionally, and in analogy to the sophisticated QoS levels defined for ATM networks, a minimum guaranteed quality can be defined, since there is the possibility of dividing the LLV1 decoding process into mandatory and optional parts. Moreover, if the system has some idle resources, the quality can be improved according to the QAS such that the conversion process calculates data from the enhancement layers but does not allocate the resources exclusively, so that other new processes can still be admitted. For example, let us assume that there is some number of simultaneous clients requesting the minimum guaranteed quality and this number is smaller than the maximum number of possible clients requesting the minimum quality. Rationally thinking, such an assumption is the standard case, because otherwise the allocation would be refused and the client's request would be rejected. So, there are still some idle resources which could be adaptively assigned to the running transformations in order to maximize the overall output quality, and finally deliver a quality higher than the guaranteed minimum.

The comparison to other coding algorithms makes sense only if both the lossless and the scalable properties are assumed. A few wavelet-based video decoders which could have the mentioned characteristics have been proposed [Devos et al., 2003; Mehrseresht and Taubman, 2005]. However, in [Mehrseresht and Taubman, 2005] only the quality evaluation but not the processing complexity is provided. In [Devos et al., 2003] the implementation is just a prototype and not the complete decoding is considered, but just three main parts of the algorithm, i.e. wavelet entropy decoding (WED), inverse discrete wavelet transform (IDWT) and motion compensation (MC). Anyway, the results are far behind those achieved even by the best-effort LLV1, e.g. just the three mentioned parts need from 48 sec. in low quality up to 146 sec. in high quality for Mobile CIF on an Intel Pentium IV 2.0 GHz machine. In contrast, the RT-LLV1 decoder requires about 3 sec. for just the base layer and 10 sec. for the lossless (most complex) decoding on the PC_RT machine, which is specified in Appendix E (section XVIII.2)78.

Additionally, LLV1 was compared to Kakadu, a well-known JPEG 2000 implementation; the results of both best-effort implementations are shown in Figure 25. On average LLV1 takes 11 sec, while Kakadu needs almost 14 sec on the same PC_RT machine, so the processing of LLV1 is less CPU-demanding than Kakadu even though the last few steps of the LLV1 algorithm, such as inverse quantization (IQ) and inverse binDCT (IbinDCT), have to be executed twice.

78 Please note that PC_RT is an AMD Athlon XP 1800+ running at 1.533 GHz, so the results for LLV1 on a machine using an Intel Pentium IV 2 GHz should be even better.


[Figure 25 chart: total execution time [s] (scale 0-16) of LLV1 vs. JPEG2K-Kakadu decoding for ten runs (1-10) and the average (AVG).]

Figure 25. LLV1 vs. Kakadu – the decoding time measured multiple times and the average.

V.6.4. Influence of Continuous MD on the Data Quality in Encoding

In order to evaluate the influence of the continuous MD on the data quality in the encoding process, simulations have been conducted on a set constructed from well-known standard video clips recognized for research evaluation [WWW_XIPH, 2007] and from the sequences of the Video Quality Experts Group [WWW_VQEG, 2007]. In order to check how the motion vectors coming from the continuous MD set influence the quality of the output, a comparison between a best-effort MPEG-4 encoder using the standard full motion estimation step and an MD-based encoder directly exploiting the motion vectors has been done. The results for these two cases are depicted in Figure 26.

The PSNR value (in dB) given on the Y-axis is averaged per compressed sequence and plotted for a number of different sizes of the compressed bit stream specified on the non-linear X-axis. The non-linear scale is logarithmic, because the PSNR measure is a logarithmic function of the peak signal value divided by the noise represented by the mean square error (MSE), and so the differences of the PSNR values, especially at lower bit rates, can be better observed.

It can be observed in Figure 26 that the direct use of the MVs from the continuous MD performs very well when targeting low qualities (from 34 to 38 dB); however, if a very low quality, i.e. a very high compression, is expected, then the direct MV reuse delivers worse results in comparison to the full motion estimation. This is caused by the difference between the currently processed frame and the reference frame reconstructed within the encoder with a higher quantization parameter (for a higher compression ratio), which introduces a higher MSE for the same constant set of MVs. Since the MVs are defined in analogy to the BL of LLV1, which targets a quality of 32-34 dB, errors are also introduced in the case of higher quality (above 40 dB); however, these errors are much smaller than the errors in the case of very low quality (25 to around 32 dB).

Figure 26. Quality value (PSNR in dB) vs. output size of compressed Container (QCIF) sequence [Militzer, 2004].

Contrary to the undoubted interest in the higher qualities, it is questionable whether the very low qualities (below 32 dB) are of interest to users. Thus another type of graph, commonly known as rate-distortion (R-D) curves, has been used for evaluating the applicability of the MD-based method. The R-D curves depict the quality against the more readable bit rates expressed in bits per pixel (bpp), which may be directly transformed into the compression ratio; i.e., assuming the YV12 color space with 12 bpp in the uncompressed video source, it can be derived that achieving 0.65 bpp in the compressed domain delivers a compression ratio of 18.46 (a compressed size of 5.5%), and analogously 0.05 bpp yields 240 (about 0.4%) and 0.01 bpp yields 1200 (0.83‰).
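The bpp-to-compression-ratio conversion used above is just a division by the 12 bpp of the uncompressed YV12 source; the three values quoted in the text follow directly:

\mathrm{CR} = \frac{12\ \mathrm{bpp}}{\mathrm{bpp}_{\mathrm{compressed}}}: \qquad \frac{12}{0.65} \approx 18.5, \qquad \frac{12}{0.05} = 240, \qquad \frac{12}{0.01} = 1200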

Figure 27. R-D curves for Tempete (CIF) and Salesman (QCIF) sequences [Suchomski et al., 2005].

The R-D graphs showing the comparison of the standard XVID encoder and the MD-based XVID have been used for comparisons targeting different bit rates and low, but not very low, quality (option Q=2). Considering various bits per pixel of the compressed output, the quality for both processing cases of two sequences (Tempete and Salesman) is depicted in Figure 27. It can be noticed that the curves overlap in the range of 30 to 40 dB, thus the influence of the direct MV reuse is negligible. In the case of higher quality, e.g. around 49 dB, the cost of introducing MD-based encoding in terms of compression efficiency is still very small, i.e. 0.63 bpp is achieved for Tempete instead of 0.61 bpp, which means a compression ratio of 19.1 instead of 19.7. Even better results are achieved for Salesman.


V.6.5. Influence of Continuous MD on the Processing Complexity

The processing complexity is an important factor in video transcoding, especially if applied in real time for delivering format transformation. The continuous meta-data have been designed such that they influence the complexity in a positive manner, i.e. the processing benefits from a simplification (speed-up) of the algorithm and from a stabilization of the processing time counted per frame.

Figure 28. Speed-up effect of applying continuous MD for various bit rates [Suchomski et al., 2005].

The speed-up effect is noticeable for the different bit rates targeted by the encoder (Figure 28), so the positive effects of using continuous MD cover not only the wide spectrum of qualities, which is proportional to the achieved bit rate (Figure 27), but also the processing complexity, which is inversely proportional to the bit rate, i.e. lowering the bit rate yields a higher fps and thus a smaller complexity. Please note that the Y-axis of Figure 28 represents frames per second, so the higher the value the better. It is clearly visible that the MD-based encoder outperforms the unmodified XVID by allowing many more frames per second to be processed (black line above the gray one). For example, if the bit rate of 0.05 bpp is requested, a speed-up of 1.44 (MD-XVID ~230 fps vs. XVID ~160 fps) is achieved, and respectively a speed-up of 1.47 for the bit rate of 0.2 bpp (MD-XVID ~191 fps vs. XVID ~132 fps); the MD-XVID (145 fps) is 1.32 times faster than XVID for the bit rate of 0.6 bpp (~110 fps).

[Figure 29 chart: "Good Will Hunting Trailer (DVD), PAL, 1000 kbit/s": processing time [ms] (scale 0-35) per frame (1-1639) for XviD 1.0, quality 2, with direct MV reuse and with regular ME.]

Figure 29. Smoothing effect on the processing time by exploiting continuous MD [Suchomski et al., 2005].

The encoding time per frame for one bit rate (of 1000 kbps) is depicted for a real-world sequence, namely the Good Will Hunting Trailer, in Figure 29. Besides the noticeable speed-up (smaller processing time), a smoothing effect in the processing time for the MD-based XVID can be recognized, i.e. the difference between the MIN and MAX frame-processing times and the variations of the time from frame to frame are smaller (Figure 29). Such a smoothing effect is very helpful in real-time processing, because it makes the behavior of the encoder more predictable, and as a result the scheduling of the compression process is easier to manage. Consequently, the resource requirements can be determined more accurately and stricter buffer techniques for continuous data processing can be adopted.

Moreover, the smoothing effect will be even more valuable if more complex video sequences are processed. For example, assuming a close-to-worst-case scenario for a usual (non-MD-based) encoder, where very irregular and unpredictable motion exists, the processing time spent per frame fluctuates much more than the one depicted in Figure 29 or in Figure 8. So, the reuse of the MVs, the frame type or the MB type in such cases eliminates the numerous unnecessary steps leading to misjudged decisions in the motion prediction by delivering the almost-perfect results.

V.6.6. Evaluation of Complete MD-Based Video Transcoding Chain

Finally, the evaluation of a simple but complete chain using the MD-based transcoding has been conducted. As the RETAVIC framework proposes, the video encoder has been split into two parts: the content analysis, which is encapsulated in the LLV1 encoder (in the non-real-time preparation phase), and the MD-based encoding (MD-XVID) using the continuous MD delivered from the outside (read from the hard disk). The video data is stored internally in the proposed LLV1 format and the continuous MD are losslessly compressed using entropy coding. The continuous MD decoder is embedded in the MD-XVID encoder, thus the cost of MD decoding is included in the evaluation.

Figure 30. Video transcoding scenario from internal LLV1 format to MPEG-4 SP compatible (using MD-XVID): a) usual real-world case and b) simplified investigated case.

The video transcoding scenario is depicted in Figure 30. Two cases are presented: a) the usual real-world case where the delivery to the end-client through the network occurs, and b) the simplified case for investigating only the most interesting parts of the MD-based transcoding (marked in color). So, the four simple steps of the example conversion are performed as depicted in Figure 5 b), namely: the LLV1-coded video together with the accompanying MD is read from the storage, next the compressed video data are adaptively decoded to raw data, then the decoding of the continuous MD and the encoding to MPEG-4 Simple Profile using the MD-based XVID is performed, and finally the MPEG-4 stream is written to the storage. For the real-world situation (a), the MPEG-4 stream is converted into the packetized elementary stream (PES) according to [MPEG-4 Part I, 2004] and sent to the end-client through the network instead of being written to the storage (b).
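A structural sketch of the simplified case b) is given below; the data types, function names and stubbed bodies are placeholders for the actual LLV1 decoder, continuous-MD decoder and MD-XVID encoder interfaces and only illustrate the per-frame order of the steps.

#include <stdbool.h>
#include <stdio.h>

/* Opaque stand-ins for the real data units of the chain. */
typedef struct { int dummy; } raw_frame_t;   /* decoded raw video frame    */
typedef struct { int dummy; } frame_md_t;    /* per-frame continuous MD    */
typedef struct { int dummy; } m4v_frame_t;   /* compressed MPEG-4 SP frame */

/* Stubbed component interfaces; the real chain would call the LLV1 decoder,
 * the continuous-MD decoder and the MD-XVID encoder here. */
static int frames_left = 3;
static bool llv1_decode_frame(int layers, raw_frame_t *out) { (void)layers; (void)out; return frames_left-- > 0; }
static bool md_decode_frame(frame_md_t *out) { (void)out; return true; }
static void mdxvid_encode_frame(const raw_frame_t *in, const frame_md_t *md,
                                int target_kbps, m4v_frame_t *out)
{ (void)in; (void)md; (void)target_kbps; (void)out; }
static void write_frame(const m4v_frame_t *frm) { (void)frm; puts("frame written"); }

/* Simplified investigated case b): read -> adaptive LLV1 decode -> continuous-MD
 * decode -> MD-XVID encode -> write, repeated per frame. */
static void transcode(int llv1_layers, int target_kbps)
{
    raw_frame_t raw; frame_md_t md; m4v_frame_t out;
    while (llv1_decode_frame(llv1_layers, &raw) && md_decode_frame(&md)) {
        mdxvid_encode_frame(&raw, &md, target_kbps, &out);
        write_frame(&out);
    }
}

int main(void) { transcode(2 /* e.g. BL+TEL+QEL1 */, 400 /* kbps */); return 0; }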

Figure 31. Execution time of the various data quality requirements according to the simplified scenario.

The results of the simplified scenario (b) for the Paris (CIF) sequence are depicted in Figure 31. Here, the execution time covers the area marked by color (Figure 30), i.e. only the decoding and encoding are summed up and the read/write operations are not included (they influence the results insignificantly anyway). Four sets of quality requirements specified by the user have been investigated (a small configuration sketch follows the list):

1. low quality – where the BL and TEL of LLV1 are decoded and the bit rate of the MPEG-4 bitstream targets 200 kbps;

2. average quality – where BL+TEL and QEL1 of LLV1 are decoded and the targeted bit rate is equal to 400 kbps;

3. high quality – where the layers up to QEL2 of LLV1 are decoded and the higher bit rate of 800 kbps is achieved;

4. excellent quality – where all LLV1 layers are processed (lossless decoding) and the MPEG-4 stream with a bit rate of 1200 kbps is delivered.
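The four configurations can be summarized in a small table; the struct below merely restates the list (the layer count refers to the QELs decoded on top of BL+TEL) and is not part of the prototype's actual interface.

/* Quality presets used in the evaluation: how many LLV1 quantization
 * enhancement layers are decoded and which MPEG-4 bit rate is targeted. */
struct quality_preset {
    const char *name;
    int qel_layers;      /* QELs decoded on top of BL+TEL (0..3)        */
    int target_kbps;     /* bit rate of the produced MPEG-4 SP stream   */
};

static const struct quality_preset presets[] = {
    { "low",       0,  200 },   /* BL+TEL only                    */
    { "average",   1,  400 },   /* + QEL1                         */
    { "high",      2,  800 },   /* + QEL2                         */
    { "excellent", 3, 1200 },   /* all layers, lossless decoding  */
};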

The execution time for each quality configuration is measured per frame in order to see the behavior of the transcoding process. There are still a few peaks present, which may derive from thread preemption in the best-effort system; however, in general the execution time per frame is very stable, and one could even risk the statement that it is almost constant. Obviously, the use of the continuous MD allows for a reduced encoder complexity, making the share of the LLV1 decoder's time in the total execution time bigger in contrast to a chain with the standard XVID encoder. On the other hand, the proven LLV1 processing scalability now significantly influences the total transcoding time and thus allows gaining better control over the amount and quality of the processed data at the encoder's input.

All in all, the whole transcoding using the continuous MD shows much more stable behavior for all considered quality ranges up to the full-content lossless decoding of the LLV1 video. Summarizing, the MD-based framework allows gaining more control over the whole transcoding process and speeds up the execution of the video processing chain in comparison to transcoding without any MD.


VI. AUDIO PROCESSING MODEL

In analogy to chapter V, chapter VI proposes the audio-specific processing model based on the logical model proposed in chapter IV. An analysis of a few well-known representative audio encoders is given at the beginning. Next, the general assumptions for the audio processing are made. Subsequently, the audio-related static MD is defined, and MPEG-4 SLS is proposed as the internal format and described in detail. The MD-based processing is described separately for decoding and encoding. Finally, the evaluation of the decoding part of the processing model is given. This part covers only the MPEG-4 SLS evaluation and its enhancement.

VI.1. Analysis of Representative Audio Encoders

Three codecs have been selected for the analysis as representatives of different perceptual transform-based algorithms. These are the following:

• OggEnc [WWW_OGG, 2006] – standard encoding application of the Ogg Vorbis coder, being part of the vorbis-tools package,
• FAAC [WWW_FAAC, 2006] – open source MPEG-2 AAC compliant encoder,
• LAME [WWW_LAME, 2006] – open source MPEG-1 Layer III (MP3) compliant encoder.

All of them are open source programs and they are recognized as state of the art for their specific audio formats. The decoding part of these lossy coders is not considered as important for RETAVIC, since only the encoding part is executed in the real-time environment. Moreover, the most important factors under analysis cover the constancy and predictability of the encoding time as well as the encoding speed (deriving from the coding complexity). In all cases the default settings of the encoders have been used.

Different types of audio data have been used in the evaluation process. The data covered the instrumental sounds from the MPEG Sound Quality Assessment Material [WWW_MPEG SQAM, 2006] and an own set of music from commercial CDs and self-created audio [WWW_Retavic - Audio Set, 2006]; however, in the later part only three representatives are used (namely silence.wav, male_speech.wav, and go4_30.wav). These three samples are enough to show the behavior of the encoding programs under different circumstances like silence, speech and music, having high and low volume or different dynamic ranges.

The graphs depicting the behavior of all encoders for each of the selected samples are given such that the encoding time is measured per block of PCM samples, defined as 2048 samples for FAAC and OggEnc, and 1152 samples for Lame. Such a division of samples per block derives from the audio coding algorithm itself and cannot simply be changed (i.e. the block size in Lame is fixed); thus the results cannot simply be compared on one graph, and so Lame is depicted separately.

As expected, the silence sample results show a very constant encoding time for all the encoders. The results are depicted in Figure 32 and Figure 33 respectively. The FAAC (pink color) needs about 2 ms per block of PCM samples, while the OggEnc (blue color) requires roughly 2.4 ms. Lame achieves about 2.1 ms per block; however, due to the smaller block size, a normalized time of 3.73 ms per 2048-sample block can be derived (it is not depicted, though, to avoid flattening of the curve).
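The normalization mentioned above simply rescales Lame's per-block time to the 2048-sample block size used by the other two encoders:

2.1\ \mathrm{ms} \cdot \frac{2048}{1152} \approx 3.73\ \mathrm{ms}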

Figure 32. OggEnc and FAAC encoding behavior of the silence.wav (based on [Penzkofer, 2006]).


Figure 33. Behavior of the Lame encoder for all three test audio sequences (based on [Penzkofer, 2006]).

Figure 34. OggEnc and FAAC encoding behavior of the male_speech.wav (based on [Penzkofer, 2006]).

The results for the encoding of male_speech are depicted in Figure 33 for Lame and in Figure 34 for FAAC and OggEnc. Contrary to silence, the encoding time fluctuates much more for male_speech. The FAAC encodes the blocks in a time between 1.5 ms and 2 ms and is still relatively constant with respect to this range, since there are no important peaks that exceed these values. The encoding time of Lame fluctuates much more, but still there are just a few peaks above 3 ms. The OggEnc behaves unstably, i.e. there are plenty of peaks in both directions going even beyond 7 ms per block.

Yet another behavior of the audio encoders can be observed for the music example go4_30 (Figure 35). A narrow tunnel with min and max values can still be defined for the FAAC encoder (1.6 to 2.2 ms), but the OggEnc is now very unstable, having even more peaks than in the case of male_speech. Also Lame (yellow color in Figure 33) manifests the unstable behavior and requires more processing power than in the case of male_speech, i.e. the encoding time ranges from 2 to 4 ms per block of 1152 samples and the average encoding time is longer (yellow curve above the two other curves).

Figure 35. OggEnc and FAAC encoding behavior of the go4_30.wav (based on [Penzkofer, 2006]).

The overall FAAC encoding time of the different files does not differ very much even if different types of audio samples are used. This may ease the prediction of the encoding time. The OggEnc results demonstrate huge peaks in both directions of the time axis for male_speech and for go4_30. These peaks are caused by the variable window length, which is possible in the Ogg Vorbis codec, i.e. the number of required encoding steps is not fixed for a block of 2048 samples and is determined by block analysis. Here, pre-calculated meta-data keeping the window length of every block would be applicable to help in the prediction of the execution time. On the other hand, the peaks oscillate around a relatively constant average value and usually a positive peak is followed by a negative one. Such behavior could be supported by a small time buffer leading to a constant average time.

Finally, the overall encoding time of the analyzed encoders differs significantly, as shown in Figure 36. The male_speech is processed faster because the sample sequence is shorter than the two others. Still, it has to be pointed out that these codecs are lossy encoders, in which the compression ratio, sound quality and processing time are related to each other, and a different relationship exists for each encoder. Thus, in order to compare speed, the quality and the produced size shall also be considered. The measurement of perceived quality is subjective for each person, so it needs a special environment, hardware settings and experienced listeners to evaluate the quality. Anyway, the default configuration settings should be acceptable to a home user. On the other hand, the quality and compression ratio were not the point of the behavior evaluation under the default configuration settings of the selected encoders. So the idea of the graph is to show the different behavior depending on the audio source, i.e. for different sequences one encoder is sometimes faster and sometimes slower than its competitors.

Figure 36. Comparison of the complete encoding time of the analyzed codecs.

VI.2. Assumptions for the Processing Model

Considering the results from the previous section, a better understanding of the coding algorithm is required before proposing enhancements in the processing. All the audio codecs under investigation employ the perceptual audio coding idea, which uses a psychoacoustic model79 to specify which parts of the data are really inaudible and can be removed during compression [Sayood, 2006; Skarbek, 1998]. The goal of the psychoacoustic model is to increase the efficiency of compression by maximizing the compression ratio and to make the reconstructed file as close to the source as possible in terms of perceived quality. Of course, an algorithm using a psychoacoustic model is a lossy coding technique, so there is no bit-exactness between the source and the reconstructed data files.

To make the audio compression effective, it is reasonable to know what kind of sounds a human can actually hear. The distinction between the important-for-perception elements and the less important ones lets the developers concentrate only on the audible parts and ignore the segments which would not be heard anyway. Thanks to this, aggressive compression is possible, or even cutting out some parts of the signal without significant influence on the quality of the heard sound. Above all, the frequencies which exceed the perception threshold (e.g. very high frequencies) can be removed, while the frequencies important for human beings, between 200 Hz and 8 kHz, must be reflected precisely during the compression, because a human can easily hear a decrease of the sound quality in that frequency range.

When two sounds occur simultaneously, the masking phenomenon takes place [Kahrs and Brandenburg, 1998]. For example, when one sound is very loud and the other one is rather quiet, only the loud sound will be perceived. Also, if a quiet sound occurs at a short interval of time after or before a loud, sudden sound, it will not be heard. The inaudible parts of a signal can be encoded with a low precision or not encoded at all. Psychoacoustics helped in the development of effective, almost lossless compression in terms of the perceived quality. The removed parts of the signal do not affect the comfort of the sound perception to a high degree, although they may decrease its signal quality.

79 The ear of an average human can detect sounds in a frequency range between 16 Hz and 22 kHz and within about 120 dB of dynamic range in the intensity domain [Kahrs and Brandenburg, 1998]. The dynamic range is said to span from the lowest level of hearing (0 dB) to the threshold of pain (120-130 dB). The human ear is most sensitive to frequencies from the range of 1000 Hz to 3500 Hz, and human speech communication takes place between 200 and 8000 Hz [Kahrs and Brandenburg, 1998]. Psychoacoustics as a research field aims at the investigation of the dependencies between the sound wave entering the human ear and the subjective perception of this wave. The discipline deals with describing and explaining some specific aspects of auditory perception based on the anatomy and physiology of the human ear and on cognitive psychology. The results of these investigations are utilized when inventing new, modern compression algorithms [Skarbek, 1998].


VI.3. Audio-Related Static MD

As stated for the video-related static MD (section V.3), the MD are different for the various media types; thus also the static MD for audio are related to the MOs having the type audio, for which the set (StaticMD_Audio) is defined as:

\forall i:\; MD_A(mo_i) \subset StaticMD\_Audio \;\Leftrightarrow\; type_i = A, \qquad mo_i \in O,\; 1 \le i \le X \qquad (16)

where type_i denotes the type of the i-th media object mo_i, A is the audio type, MD_A is a function extracting the set of meta-data related to the audio media object mo_i, and X is the number of all media objects in O.

Analogously to video, the audio stream identifier (AudioID) is also a one-to-one mapping to the given MOID:

\forall i\ \exists j\ \neg\exists k:\; AudioID_i = MOID_j \;\wedge\; AudioID_i = MOID_k, \qquad k \ne j \qquad (17)

The static MD set of the audio stream includes the sums of each window type, analogously to the frame types of video:

\forall i:\; w_i.type \in WT \;\wedge\; WT = \{ST, SP, L, M, S\} \qquad (18)

where ST denotes the start window type, SP the stop window type, L the long, M the medium and S the short window type, and WT denotes the set of these window types.

The sum for each window type is defined as:

WindowSum_{mo_i}^{w_i.type} = \{\, w_j \mid w_j \in mo_i \;\wedge\; 1 \le j \le N \;\wedge\; w_j.type = w_i.type \;\wedge\; w_i.type \in WT \,\} \qquad (19)

where w_i.type is one of the window types defined by Equation (18), N denotes the number of all windows, w_j is the window at the j-th position in the audio stream, and w_j.type denotes the type of the j-th window.

The sums of all window types are kept in the respective attributes of the StaticMD_Audio. Analogously to the window type, the information about the sums of the different window switching modes along the complete audio stream is kept in StaticMD_SwitchingModes. Eleven types of window switching modes are differentiated up to now, thus there are eleven derived aggregation attributes.

The current definition of the initial static MD set is mapped to a relational schema (Figure 37).

Figure 37. Initial static MD set focusing on the audio data.

Analogously to the video static MD, the sums of all window types and window switching modes may be calculated by the respective SQL queries:

SELECT AudioID, WindowType, count(WindowType)
FROM StaticMD_Window
GROUP BY AudioID, WindowType
ORDER BY AudioID, WindowType;

Listing 4. Calculation of all sums according to the window type.


SELECT AudioID, WindowSwitchingMode, count(WindowSwitchingMode)
FROM StaticMD_Window
GROUP BY AudioID, WindowSwitchingMode
ORDER BY AudioID, WindowSwitchingMode;

Listing 5. Calculation of all sums according to the window switching mode.

Of course, the entity set of StaticMD_Window must be complete in the sense that all windows existing in the audio are included in this set. Based on this assumption, the sums included in StaticMD_Audio and StaticMD_WindowSwitchingSum are treated as derived attributes computed by the above SQL statements. However, due to optimization issues and the rare updates of the MD set, these values are materialized and included as usual attributes in StaticMD_Audio and StaticMD_WindowSwitchingSum. If the completeness assumption of the StaticMD_Window set were not fulfilled, the sum attributes could not be treated as derived.

VI.4. MPEG-4 SLS as Audio Internal Format

Following the system design in section IV, the internal storage format has to be chosen. The MPEG-4 SLS is proposed as the suitable format for storing the audio data internally in the RETAVIC architecture. The reasons have been discussed in detail in [Suchomski et al., 2006], where the format requirements have been stated and an evaluation has been conducted with respect to the qualitative aspects considering data scalability as well as the quantitative aspects referring to the processing behavior and resource consumption. Other research considering the MPEG-4 SLS algorithm and its evaluation is covered by the recent MPEG verification report [MPEG Audio Subgroup, 2005], but both works are complementary to each other due to their different assumptions, i.e. they discuss different configuration settings of the MPEG-4 SLS and evaluate the format from distinct perspectives. For example, [MPEG Audio Subgroup, 2005] compares the coding efficiency for only two settings, AAC-Core and No-Core, contrary to [Suchomski et al., 2006] where additionally various AAC cores have been used. Secondly, [Suchomski et al., 2006] provides an explicit comparison of coding efficiency and processing complexity (e.g. execution time) to other lossless formats, while [MPEG Audio Subgroup, 2005] reports details on SLS coding efficiency without such a comparison and a very detailed analysis of algorithmic complexity. Finally, the worst-case processing complexity usable for a hard-RT DSP implementation is given in [MPEG Audio Subgroup, 2005], whereas [Suchomski et al., 2006] discusses some scalability issues of the processing applicable in different RT models, e.g. cutting off the enhancement part of the bitstream during on-line processing with consideration of the data quality in the output80.

VI.4.1. MPEG-4 SLS Algorithm – Encoding and Decoding

The MPEG-4 Scalable Lossless Audio Coding (SLS) was designed as an extension of MPEG Advanced Audio Coding (AAC). Combined, these two technologies compose an innovative solution joining scalability and losslessness, which is commercially referred to as High Definition Advanced Audio Coding (HD-AAC) [Geiger et al., 2006] and is based on the standard AAC perceptual audio coder with an additional SLS enhancement layer. Together they ensure that even at lower bit rates a relatively good quality can be delivered. The SLS can also be used as a stand-alone application in a non-core mode thanks to its internal compression engine. The quality scalability varies from the AAC-coded representation through a wide range of in-between near-lossless levels up to the fully lossless representation. An overview of the simplified SLS encoding algorithm is depicted in Figure 38 for the two possible modes: a) with AAC core and b) as stand-alone SLS mode (without core).

Figure 38. Overview of simplified SLS encoding algorithm: a) with AAC-based core [Geiger et al., 2006] and b) without core.

80 The bitstream truncation itself is discussed in [MPEG Audio Subgroup, 2005], however from the format and not from the processing perspective.


VI.4.2. AAC Core

Originally, Advanced Audio Coding (AAC) was invented as an improved successor of the MPEG-1 Layer III (MP3) standard. It is specified in the MPEG-2 standard, Part 7 [MPEG-2 Part VII, 2006] and later in MPEG-4, Part 3 [MPEG-4 Part III, 2005], and can be described as a high-quality multi-channel coder. The psychoacoustic model used is the same as the one in the MPEG-1 Layer III model, but there are some significant differences in the algorithm. AAC did not have to be backward compatible with Layer I and Layer II of MPEG-1 as MP3 had to, so it can offer higher quality at lower bitrates. In addition to its efficiency at standard bitrates, AAC shows great encoding performance at very low bitrates. As an algorithm, it offers new functionalities like low delay, error robustness and scalability, which are introduced as AAC tools/modules and are used by specific MPEG audio profiles (a detailed description is given in section XXII of Appendix H).

The main improvements were made by adding the Long Term Predictor (LTP) [Sayood, 2006] to reduce the bit rate in the spectral coding block. AAC also supports more sample frequencies (from 8 kHz to 96 kHz) and introduces additional sampling half-step frequencies (16, 22.05, 24 kHz). Moreover, the LTP is computationally cheaper than its predecessor and is a forward adaptive predictor. The other improvement over MP3 is the increased number of channels: provision of multi-channel audio with up to 48 channels, but also support for the 5.1 surround sound mode. Generally, the coding and decoding efficiency has been improved, which results in smaller data sizes and better sound quality than MP3 when compared at the same bitrate. The filter bank converts the input audio signal from the time domain into the frequency domain by using the Modified Discrete Cosine Transform (MDCT) [Kahrs and Brandenburg, 1998]. The algorithm supports dynamic switching between two window lengths of 2048 samples and 256 samples. All windows are overlapped by 50% with the neighboring blocks. This results in generating 1024 or respectively 128 spectral coefficients from the samples. For better frequency selectivity, the encoder can switch between two different window shapes, a sine-shaped window and a Kaiser-Bessel Derived (KBD) window with improved far-off rejection of its filter response [Kahrs and Brandenburg, 1998].
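The coefficient counts given above follow from the MDCT with 50% overlap producing N/2 spectral coefficients per window of N samples:

\frac{N}{2}\Big|_{N=2048} = 1024, \qquad \frac{N}{2}\Big|_{N=256} = 128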


The standard contains two types of predictors. The first one, the intra-block predictor, also called Temporal Noise Shaping (TNS)81, is applied to input containing transients. The second one, the inter-block predictor, is useful only under stationary conditions. For stationary signals, a further reduction of the bit rate can be achieved by using prediction to reduce the dynamic range of the coefficients. During stationary periods, the coefficients at a certain frequency generally do not change their values to a significant degree between blocks. This fact allows transmitting only the prediction error between subsequent blocks (backward adaptive predictor).

The noiseless coding block uses twelve Huffman codebooks applied to two- and four-tuples of quantized coefficients. They are used to maximize the redundancy reduction within the spectral data coding. A codebook is applied to sections of scale-factor bands so as to minimize the resulting bitrate. Parts of the signal that are perceived as noise are in most cases indistinguishable from other noise-like signals. This fact is exploited by not transmitting such scale-factor bands themselves but only the power of the coefficients in the "noise" band. The decoder then generates random noise according to the transmitted power and places it in the region where the original noise was.

VI.4.3. HD-AAC / SLS Extension

The encoding process consists of two main parts (see Figure 38 a). At first the input is encoded by the AAC core coder and the core layer is produced. AAC is a lossy algorithm, so not all of the information is encoded. The preservation of all information is attained by the second step, which delivers the enhancement layer. This layer encodes the residual signal missing in the AAC-encoded signal, i.e. the error between the lossy and the source signal. Moreover, the enhancement layer allows scalability in terms of quality due to the bit-significance-based bit-plane coding, which preserves the spectral shape of the quantization noise used by the AAC core [Geiger et al., 2006].

81 TNS is a technique that operates on the spectral coefficients only during pre-echoes. It uses prediction in the frequency domain to temporally shape the introduced quantization noise. The spectrum is filtered, and the resulting prediction residual is quantized and coded. The prediction coefficients are transmitted in the bitstream to allow later recovery of the original signal in the decoder.


The detailed architecture of the HD-AAC coder is depicted as a block diagram in Figure 39 for a) encoding and b) decoding. HD-AAC exploits the Integer Modified Discrete Cosine Transform (IntMDCT) [Geiger et al., 2004], a completely invertible integer transform, to avoid additional errors while performing the transformations. During the encoding process the transform produces coefficients, which are later mapped into both layers. The residual signal introduced by the AAC coding is calculated by the error-mapping process and then mapped into bit-planes in the enhancement layer. The entropy coding of the bit-planes is done by three different methods: Bit-Plane Golomb Coding (BPGC), Context-Based Arithmetic Coding (CBAC) and Low Energy Mode (LEM).


Figure 39. Structure of HD-AAC coder [Geiger et al., 2006]: a) encoder and b) decoder.

The error correction is placed at the beginning of each sample window. The most significant bits (MSB) correspond to the coarser part of the error, while the least significant bits (LSB) refine it; obviously, the more bits of the residual are used, the smaller the loss in the output. The scalability is achieved by truncation of the correction value.

Core and enhancement layers are subsequently multiplexed into the HD-AAC bitstream. The decoding process can be done in two ways, by decoding only the perceptually-coded AAC part, or by using both AAC and the additional residual information from the enhancement layer.

VI.4.3.1 Integer Modified Discrete Cosine Transform

In all audio coding schemes, one of the most important steps is the transformation of the input audio signal from the time domain into the frequency domain. To achieve a block-wise frequency representation of an audio signal, Fourier-related transforms (DCT, MDCT) or filter banks are used. The problem is that these transforms produce floating-point output even for integer input. The data rate must then be reduced by quantizing the floating-point data, which means that some additional error is introduced. For lossless coding, any additional rounding error must be avoided either by using a very precise quantization, so that the error can be neglected, or by applying a different transform. The Integer Modified Discrete Cosine Transform is an invertible integer approximation of the MDCT [Geiger et al., 2004]. It retains the advantageous properties of the MDCT while producing integer output values: it provides a good spectral representation of the audio signal, critical sampling and overlapping of blocks.

The following part describes the IntMDCT in some detail and is based on [Geiger et al., 2001; Geiger et al., 2004]. The properties mentioned above are obtained by applying the lifting scheme to the MDCT. The lifting scheme decomposes the transform into a sequence of simple lifting steps; their most advantageous property is that the inverse transform is a mirror of the forward transform, so each step can be inverted exactly even when rounding is applied. The signal can thus be processed by simple operations and transformed back into one signal without any rounding errors.

The MDCT can be decomposed into Givens rotations, and such a decomposed transform can be approximated by a reversible, lossless integer transform. Figure 40 illustrates this decomposition. To achieve it, the MDCT must be divided into three parts: 1) windowing, 2) Time Domain Aliasing Cancellation (TDAC) and 3) a Discrete Cosine Transform of type IV. Of course, the TDAC is not used directly, but as an integer approximation derived from the decomposition into three lifting steps.

Figure 40. Decomposition of MDCT.

The MDCT is defined by [Geiger et al., 2001]:

X(m) = \sqrt{\tfrac{2}{N}} \sum_{k=0}^{2N-1} w(k)\,x(k)\,\cos\frac{(2k+1+N)(2m+1)\pi}{4N} \qquad (20)

where x(k) is the time-domain input, w(k) is the windowing function, N defines the number of calculated spectral lines, and m is an integer between 0 and N−1.

Figure 41. Overlapping of blocks.

To achieve both the critical sampling and the overlapping of blocks, frequency-domain subsampling is performed. This procedure introduces aliasing in the time domain, so TDAC is used to cancel the aliasing by the "overlap and add" formula applied to two subsequent blocks in the synthesis filter bank. Two succeeding blocks overlap by 50%, so two blocks are needed to obtain one full set of samples. A better preservation of the original data is achieved by this redundancy of information (see Figure 41). Each set of samples (marked with different colors) is first put into the right part of the corresponding block and into the left part of the succeeding block. For each block, 2N time-domain samples are used to calculate N spectral lines, and each block corresponds to one window. The MDCT introduces an aliasing error, which can be cancelled by adding the outputs of the inverse MDCT of two overlapping blocks (as depicted). The windows must fulfill certain conditions in their overlapping parts to ensure this cancellation [Geiger et al., 2001]:

w(k) = \sin\left(\frac{\pi}{4N}\,(2k+1)\right), \quad k = 0,\dots,2N-1 \qquad (21)

To decompose the MDCT (with a window length of 2N) into Givens rotations, it must first be decomposed into windowing, time domain aliasing and a Discrete Cosine Transform of Type IV (DCT-IV) of length N. The combination of windowing and TDAC for the overlapping part of two subsequent blocks results in the application of [Geiger et al., 2001]:

\begin{pmatrix} w(\tfrac{N}{2}+k) & -\,w(\tfrac{N}{2}-1-k) \\ w(\tfrac{N}{2}-1-k) & w(\tfrac{N}{2}+k) \end{pmatrix}, \quad k = 0,\dots,\tfrac{N}{2}-1 \qquad (22)

which is itself a Givens rotation.

The DCT-IV is defined as [Geiger et al., 2001]:

X(m) = \sqrt{\tfrac{2}{N}} \sum_{k=0}^{N-1} x(k)\,\cos\frac{(2k+1)(2m+1)\pi}{4N}, \quad m = 0,\dots,N-1 \qquad (23)

The DCT-IV coefficients build an orthogonal N×N matrix, which means that it can be decomposed into Givens rotations. The rotations applied for windowing and time domain aliasing can also be used in the inverse MDCT, in reversed order and with different angles. The inverse of the DCT-IV is the DCT-IV itself. Figure 42 depicts the decomposition of the MDCT and of the inverse MDCT, where 2N samples are first rotated and then transformed by the DCT-IV to obtain N spectral lines.


Figure 42. Decomposition of MDCT by Windowing, TDAC and DCT-IV [Geiger et al., 2001].

Given the conditions for the TDAC, it can be observed that for certain angles the MDCT can be completely decomposed into Givens rotations; a Givens rotation by an angle α is defined as [Geiger et al., 2001]:

\begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} x_1\cos\alpha - x_2\sin\alpha \\ x_1\sin\alpha + x_2\cos\alpha \end{pmatrix} \qquad (24)

The input values x1 and x2 are rotated and thus transformed into x1·cosα − x2·sinα and x1·sinα + x2·cosα, respectively. Moreover, every Givens rotation can be divided into three lifting steps [Geiger et al., 2001]:

\begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix} = \begin{pmatrix} 1 & \frac{\cos\alpha-1}{\sin\alpha} \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ \sin\alpha & 1 \end{pmatrix} \begin{pmatrix} 1 & \frac{\cos\alpha-1}{\sin\alpha} \\ 0 & 1 \end{pmatrix} \qquad (25)

Figure 43. Givens rotation and its decomposition into three lifting steps [Geiger et al., 2001].
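To make the role of the lifting decomposition concrete, the following minimal C sketch applies a Givens rotation through the three lifting steps of Equation (25) with rounding after each step, and then inverts it exactly by mirroring the steps. The function names and the choice of lround() are illustrative assumptions; this is not the IntMDCT reference implementation.

#include <math.h>
#include <stdio.h>

/* Rounded lifting steps of a Givens rotation (cf. Eq. (25) and Figure 43).
 * Each step adds a rounded multiple of one value to the other, so it can be
 * inverted exactly by subtracting the same rounded term again. */
static void int_givens_forward(double alpha, long *x1, long *x2)
{
    double c = (cos(alpha) - 1.0) / sin(alpha), s = sin(alpha);
    *x1 += lround(c * (double)(*x2));   /* lifting step 1 */
    *x2 += lround(s * (double)(*x1));   /* lifting step 2 */
    *x1 += lround(c * (double)(*x2));   /* lifting step 3 */
}

static void int_givens_inverse(double alpha, long *x1, long *x2)
{
    double c = (cos(alpha) - 1.0) / sin(alpha), s = sin(alpha);
    *x1 -= lround(c * (double)(*x2));   /* undo step 3 */
    *x2 -= lround(s * (double)(*x1));   /* undo step 2 */
    *x1 -= lround(c * (double)(*x2));   /* undo step 1 */
}

int main(void)
{
    long a = 12345, b = -6789;
    int_givens_forward(0.7, &a, &b);    /* integer approximation of the rotation  */
    int_givens_inverse(0.7, &a, &b);    /* exact reconstruction despite rounding  */
    printf("%ld %ld\n", a, b);          /* prints the original values: 12345 -6789 */
    return 0;
}

The same mirroring principle, applied to all rotations of the decomposed MDCT, is what makes the IntMDCT losslessly invertible.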


Givens rotations are mostly used to introduce zeros into matrices. This is done to simplify the calculations and thereby reduce the number of required operations and the overall complexity; Givens rotations are therefore often used to decompose matrices or to annihilate selected components.

VI.4.3.2 Error mapping

The error mapping process provides the link between the perceptual AAC core and the scalable lossless enhancement layer of the coder. Instead of encoding all IntMDCT coefficients in the lossless enhancement layer, the information already coded in AAC is used. Only the resulting residuals between the IntMDCT spectral values and their equivalents in the AAC layer are coded.

The residual signal e[k] is given by [Geiger et al., 2006]:

e[k] = \begin{cases} c[k] - thr(i[k]), & i[k] \neq 0 \\ c[k], & i[k] = 0 \end{cases} \qquad (26)

where c[k] is an IntMDCT coefficient, i[k] is the AAC quantized value, and thr(i[k]) is the next quantized value closer to zero with respect to i[k].

If the coefficients belong to a scale-factor band that was quantized and coded in the AAC core, the residual coefficients are obtained by subtracting the quantization thresholds from the IntMDCT coefficients. If the coefficients do not belong to a coded band, or are quantized to zero, the residual spectrum is composed of the original IntMDCT values. This process improves the coding efficiency without losing information.
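As a minimal illustration of Formula (26), the following C sketch computes the residual spectrum from the IntMDCT coefficients and the AAC quantization indices; the threshold function is passed in as a callback because its definition belongs to the AAC core and is outside this sketch. All names are hypothetical.

#include <stddef.h>

/* Error mapping (Formula (26)): e[k] = c[k] - thr(i[k]) if i[k] != 0,
 * and e[k] = c[k] otherwise. c[] are IntMDCT coefficients, i[] are the AAC
 * quantization indices, and thr() returns the next quantized value closer
 * to zero for a given index. */
void error_mapping(const long *c, const int *i, long *e, size_t n,
                   long (*thr)(int))
{
    for (size_t k = 0; k < n; k++)
        e[k] = (i[k] != 0) ? c[k] - thr(i[k]) : c[k];
}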

VI.5. Audio Processing Supporting Real-time Execution

VI.5.1. MD-based Decoding

In analogy to LLV1, the SLS decoding was designed with the goal of scalability; contrary to LLV1, however, it already offers the possibility to adapt the processing to real-time constraints. Thus, in order to support MD-based decoding, the existing SLS does not have to be extended: real-time adaptive decoding can be supported through the possibility of skipping some parts of the extension stream, i.e. the residuals of a given window of samples or even of a set of windows of samples82. Of course, the extension stream stores not the samples themselves but the compressed residual signal used for the error correction.

Such meta-data allow stopping the decoding in the middle of the currently processed window and continuing the enhancement processing at the subsequent window (or at one of the next windows) whenever an unpredictable peak in processing occurs. In other words, this process may be understood as truncation of the bitstream on the fly during a high-load period, and not before the start of the decoder execution (as it is in the standard SLS right now83).

The MD required for this functionality of the existing best-effort SLS is already covered by the analogous extension proposal for LLV1 (as given in section V.5.1), owing to the possibility of storing the size occupied by each data window of the enhancement layer in the compressed domain at the beginning of that window. So, the real-time SLS decoder only has to be able to consume this additional information as continuous MD.

Of course, the original best-effort SLS decoder has no need for such MD, because the stream is read and decoded as specified by the input parameters (i.e. if truncated, then beforehand). The discontinuation of processing of one or a few data windows in the compressed domain is only required under strict real-time constraints and not in a best-effort system: if the execution time is limited (e.g. the deadline is approaching) and a peak in the processing occurs (higher coding complexity than predicted), the decoding of the enhancement layer should be terminated and started again with the next or a later window. Finding the position of the next or a later data window in the compressed stream is not problematic anymore, since the beginning of the next data window is delivered by the mentioned MD. Additionally, the window positions could be stored in an index, which would allow an even faster localization of the next window.

82 In order to skip a few data windows, the information about the size of each window has to be read sequentially, and an fseek operation has to be executed in steps in order to read the size stored at the beginning of each window. 83 There is an additional application, BstTruncation, used for truncating the enhancement layer in the coded domain.
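A possible shape of the MD-based skipping in a real-time SLS decoder is sketched below in C: each enhancement-layer window is assumed to be preceded by its compressed size (the continuous MD proposed above), so that under deadline pressure the decoder can seek over one or more windows. The stream layout, field width and function names are assumptions made only for this illustration.

#include <stdio.h>
#include <stdint.h>

/* Assumed layout of the enhancement layer with continuous MD:
 *   [uint32_t window_size_in_bytes][window_size_in_bytes of coded residual]
 * repeated for every window of samples. */

/* Decode one enhancement window if there is still time, otherwise skip it
 * and fall back to core-only quality for this window. Returns 0 at EOF. */
static int process_enhancement_window(FILE *stream, int deadline_endangered)
{
    uint32_t size;
    if (fread(&size, sizeof size, 1, stream) != 1)
        return 0;                              /* end of the enhancement stream */

    if (deadline_endangered) {
        fseek(stream, (long)size, SEEK_CUR);   /* drop this window's residual   */
        return 1;
    }
    /* ... read 'size' bytes and run the bit-plane decoding here ... */
    fseek(stream, (long)size, SEEK_CUR);       /* placeholder for real decoding */
    return 1;
}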


VI.5.2. MD-based Encoding

To understand the proposed MD-based extensions for audio coding, the general perceptual coding algorithm together with the reference MPEG-4 audio coding standard is explained first. Then the concrete continuous MD are discussed with reference to audio coding.

VI.5.2.1 MPEG-4 standard as representative

Generalization of Perceptual Audio Coding Algorithms

The simplified algorithm of perceptual audio coding is depicted in Figure 44. The input sound in digital form, which is usually Pulse Code Modulation (PCM) audio in 16-bit words, is first divided in the time domain into windows, usually of constant duration (not depicted). Each window corresponds to a specific number of samples, e.g. one window of a voice/speech signal lasts approx. 20 ms, which corresponds to 160 samples at a sampling rate of 8 kHz (8000 samples/s × 0.02 s) [Skarbek, 1998].


Figure 44. General perceptual coding algorithm [Kahrs and Brandenburg, 1998]: a) encoder and b) decoder.

Then the windows of samples are transformed from the time domain to the frequency domain by the analysis filter bank (analogous to the transform step in video), which decomposes the input signal into its sub-sampled spectral components – frequency sub-bands – forming a time-indexed series of coefficients. The analysis filter bank applies the Modified Discrete Cosine Transform (MDCT) or the discrete Fast Fourier Transform (FFT) [Kahrs and Brandenburg, 1998]. The windows can be overlapping or non-overlapping. During this process also a set of parameters is extracted, which gives information about the distribution of the signal, the masking power over the time-frequency plane, and the signal mapping used to shape the coding distortion. All these assist in the perceptual analysis, in perceptual noise shaping and in the reduction of redundancies, and are later needed for quantization and encoding.

The perceptual model, also called psychoacoustic model, is used to simulate the ability of the human auditory system to perceive different frequencies. Additionally, it models the masking effect of loud tones, which mask quieter tones (frequency masking and temporal masking) as well as the quantization noise around their frequency. The masking effect depends on the frequency and the amplitude of the tones. The psychoacoustic model analyzes the parameters from the filter banks to calculate the masking thresholds needed in quantization. After the masking thresholds are calculated, the bit allocation is assigned to the signal representation in each frequency sub-band for the specified bitrate of the data stream.

Analogically to the quantization and coding steps in video, the frequency coefficients are next quantized and coded. The purpose of the quantization is to implement the psychoacoustic threshold while maintaining the required bit rate. The quantization can be uniform (equally distributed) or non-uniform (with a varying quantization step). The coding uses scale factors to determine noise-shaping factors for each frequency sub-band: the spectral coefficients are scaled before quantization in order to influence the scale of the quantization noise, and are grouped into bands corresponding to different frequency ranges. The scale-factor values are found from the perceptual model by using two nested iteration loops, and are themselves coded by Huffman coding (only the difference between the values of subsequent bands is coded). Finally, the windows of samples are packed into the bitstream according to the required bitstream syntax.
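The two nested iteration loops mentioned above are commonly realized as an inner rate-control loop and an outer distortion-control loop. The following C sketch only illustrates this structure under assumed helper functions (count_bits, band_noise); it is not the ISO reference implementation, and the abort conditions of a real encoder are omitted.

/* Assumed helpers (not defined here): the coded size of the window for given
 * scale factors and global gain, and the quantization noise energy of one
 * scale-factor band. */
extern int    count_bits(const double *coef, int bands,
                         const int *scf, int global_gain);
extern double band_noise(const double *coef, int band,
                         const int *scf, int global_gain);

/* Two-loop search (sketch): the inner loop coarsens the global quantizer step
 * until the bit budget is met; the outer loop amplifies bands whose noise
 * still exceeds the masking threshold delivered by the perceptual model. */
void search_scale_factors(const double *coef, int bands,
                          const double *mask_thr, int bit_budget,
                          int *scf, int *global_gain)
{
    int changed = 1;
    while (changed) {
        *global_gain = 0;                         /* inner (rate) loop */
        while (count_bits(coef, bands, scf, *global_gain) > bit_budget)
            (*global_gain)++;

        changed = 0;                              /* outer (distortion) loop */
        for (int b = 0; b < bands; b++) {
            if (band_noise(coef, b, scf, *global_gain) > mask_thr[b]) {
                scf[b]++;                         /* finer quantization here */
                changed = 1;
            }
        }
        /* a real encoder adds abort conditions here (all bands amplified,
         * maximum scale-factor difference reached, no further improvement) */
    }
}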

MPEG-1 Layer 3 and MPEG-2/4 AAC

MPEG-1 Layer 3 (shortly MP3) [MPEG-1 Part III, 1993] and MPEG-2 AAC extended by MPEG-4 AAC (shortly AAC) [MPEG-4 Part III, 2005] are the most widespread and most often used coding algorithms for natural audio, thus this work discusses these two coding algorithms from the encoder perspective. The MP3 encoding block diagram is depicted in Figure 45, the AAC one in Figure 46.


Figure 45. MPEG Layer 3 encoding algorithm [Kahrs and Brandenburg, 1998].

Figure 46. AAC encoding algorithm [Kahrs and Brandenburg, 1998].

There are a few noticeable differences between the MP3 and AAC encoding algorithms [Kahrs and Brandenburg, 1998]. MP3 uses a hybrid analysis filter bank (Figure 44) consisting of two cascaded modules, a Polyphase Quadrature Filter (PQF)84 and the MDCT (Figure 45), while AAC uses only a switched MDCT. Secondly, MP3 uses a window size of 1152 (or 576) values, whereas the length of the AAC window equals 102485. Thirdly, MPEG-4 AAC uses additional tools such as Temporal Noise Shaping (the intra-block predictor), the Long-Term Predictor (not depicted), Intensity (Stereo) / Coupling, (inter-block) Prediction and Mid/Side Processing.

The window-type detection together with the window/block switching, and the determination of the M/S stereo IntMDCT, are the most time-consuming blocks of the encoding algorithm.

VI.5.2.2 Continuous MD set for audio coding

Having explained the audio coding, the definition of the continuous meta-data set can be formulated. There are some differences between the coding algorithms; however, there are also common aspects. Based on these similarities, the following elements of the continuous MD set for audio are defined: window type, window switching mode, and M/S flag.

Window type. It is analogous to the frame type in video, however here the type is derived not from the processing method but primarily from the audio content. The window type is decided upon the computed MDCT coefficients and their energy; an example algorithm for determining the window type is given in [Youn, 2008]. Depending on the current signal, the following window types are defined so far: start, stop, long, medium and short. These can be mapped to the standard AAC window types. In contrast, MPEG-4 SLS allows for more window types, but they can still be classified as sub-groups of the mentioned groups and refined on the fly during execution (which is not as expensive as the complete calculation).
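A very simple, purely illustrative window-type decision is sketched below in C: it compares the energy of the current window with that of the previous one and chooses short windows when a transient (attack) is detected, while the transition (start/stop) windows are left to the window-switching logic discussed next. It uses time-domain energy and an arbitrary threshold, so it is only a stand-in for the MDCT-energy-based algorithm of [Youn, 2008].

enum window_type { WT_LONG, WT_SHORT };

/* Hypothetical energy-based decision between long and short windows: a sharp
 * rise of the window energy relative to the previous window indicates a
 * transient, for which short windows limit pre-echoes. The factor 8.0 is an
 * arbitrary illustrative threshold. */
enum window_type decide_window_type(const double *samples, int n,
                                    double *prev_energy)
{
    double energy = 0.0;
    for (int k = 0; k < n; k++)
        energy += samples[k] * samples[k];

    int attack = (*prev_energy > 0.0) && (energy > 8.0 * (*prev_energy));
    *prev_energy = energy;
    return attack ? WT_SHORT : WT_LONG;
}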

Window switching mode. In the case of different window types following one another, window switching is required. It can be conducted only when the window types of the current and the next frame are known. If the window type is already delivered by the continuous MD, the window switching mode can be calculated; however, even though enhancements have been proposed (e.g. [Lee et al., 2005]), this still requires additional processing power. Thus the window switching mode can also be included in the continuous MD. It is proposed to include the same eleven modes as defined by MPEG-486 in the continuous MD.

84 PQF splits a signal into N = 2^x equidistant sub-bands, allowing an increase of the redundancy removal. There are 32 sub-bands (x = 5) for MP3, with a frequency range between 0 and 16 kHz. However, the PQF introduces aliasing for certain frequencies, which has to be removed by a filter overlapping the neighboring sub-bands (e.g. by an MDCT having such a characteristic). 85 There are a few possible window lengths in MPEG-4 AAC [MPEG-4 Part III, 2005]. Generally only two are used: the short window with 128 samples and the long window with 1024. The short windows are grouped by 8, so the total length is then 1024, which is why usually the window size of 1024 is mentioned. The other possible windows are: medium – 512, long-short – 960, medium-short – 480, short-short – 120, long-ssr – 256, short-ssr – 32.

Mid/Side flag. It holds the information on whether M/S processing may be used. There are three modes of using M/S processing:

• 0 – the M/S processing is always turned off, regardless of the incoming signal characteristics
• 1 – the M/S processing is always turned on, so the left and right channels are always mapped to the M/S signal, which may be inefficient in case of big differences between the channels
• 2 – the M/S processing decides dynamically whether each channel signal is processed separately or in the (M/S) combination.

Obviously, the last option is the default for encoders supporting the M/S functionality. However, checking each window separately is a CPU-consuming operation.
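Taken together, the three elements could be carried as one small record per window of samples, for example as in the following C sketch; the field names and widths are illustrative and do not define a normative bitstream layout.

#include <stdint.h>

/* Hypothetical per-window record of the continuous MD set for audio. */
enum ms_mode { MS_OFF = 0, MS_ON = 1, MS_DYNAMIC = 2 };

struct audio_window_md {
    uint8_t window_type;      /* coarse class: start, stop, long, medium, short  */
    uint8_t switching_mode;   /* one of the eleven MPEG-4 window switching modes */
    uint8_t ms_flag;          /* enum ms_mode: 0 = off, 1 = on, 2 = dynamic      */
};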

There is also an additional opportunity to use meta-data in a continuous manner such that real-time encoding could possibly be conducted faster: Dynamic Range Control (DRC) [Cutmore, 1998; MPEG-4 Part III, 2005]. DRC is a process manipulating the dynamic range of an audio signal, thus (automatically) altering the volume of the audio, such that the gain (level) is reduced when the wave's amplitude crosses a certain threshold (deriving directly from the limitations of the hardware or of the coding algorithm). If the dynamics of the audio are analyzed before it is coded, it is possible to exploit an additional "helper" signal along with the digital audio which "predicts" the gain that will shortly be required [Cutmore, 1998]. This allows modification of the dynamic range of the reproduced audio in a smooth manner. The support for DRC has already been included in MPEG-4 AAC [MPEG-4 Part III, 2005]; however, the DRC meta-data is produced on the encoder side and consumed on the decoder side, and there is no possibility to exploit DRC MD on the encoder side. Eliminating this drawback would allow using the DRC information as a hint for the scale-factor calculation87.

86 Types supported by MPEG-4: STATIC_LONG, STATIC_MEDIUM, STATIC_SHORT, LS_STARTSTOP_SEQUENCE, LM_STARTSTOP_SEQUENCE, MS_STARTSTOP_SEQUENCE, LONG_SHORT_SEQUENCE, LONG_MEDIUM_SEQUENCE, MEDIUM_SHORT_SEQUENCE, LONG_MEDIUM_SHORT_SEQUENCE, FFT_PE_WINDOW_SWITCHING.

VI.6. Evaluation of the Audio Processing Model through Best-Effort Prototypes

VI.6.1. MPEG-4 SLS Scalability in Data Quality

The scalability of MPEG-4 SLS with respect to the quality of the input data has been measured (shown in Figure 47). The SLS enhancement bitstream for different bitrates of the core layer was truncated with the BstTruncation tool to different sizes and then decoded. The resulting audio data have then been compared to the reference files using a PEAQ implementation that yields the Objective Difference Grade (ODG) [PEAQ, 2006]; the closer the ODG value of the perceived sound-quality difference is to zero, the better the quality. Besides the standard SLS core option, SLS was also tested in the non-core mode, referred to as SLS no Core, and for the pure core (i.e. AAC), denoted by SLS Core only (Figure 47). The test set included both the MPEG-based SQAM subset [WWW_MPEG SQAM, 2006] and a private music set [WWW_Retavic - Audio Set, 2006].

Overall, the scalability of SLS is very efficient with respect to the quality gained per added enhancement bits. The biggest gain in ODG is achieved when scaling towards bitrates around 128 kbps. High-bitrate AAC cores affect the sound quality positively until about 256 kbps, with the disadvantage of no scalability below the bitrate of the AAC core. In the area between near-lossless and lossless bitrates the ODG converges towards zero. SLS achieves the lossless state at rates of about 600-700 kbps, which is not possible for lossy AAC. The pure basic AAC stream (SLS Core only) starts to fall behind the scalable-to-lossless SLS from about 160 kbps upwards.

87 The MPEG-4 AAC uses a Spectral Line Vector which is calculated based on the Core Scaling factor and the Integer Spectral Line Vector (IntSLV). The Core Scaling factor is defined as a constant for four possible cases: MDCT_SCALING = 45.254834, MDCT_SCALING_MS = 32, MDCT_SCALING_SHORT = 16, and MDCT_SCALING_SHORT_MS = 11.3137085. The IntSLV is calculated by the MDCT function.

[Plot: ODG (0 to −3.5) versus bitrate of core + enhancement layer (64–384 kbps) for SLS with 64 kbps core, SLS with 128 kbps core, SLS Core only, and SLS no Core.]

Figure 47. Gain of ODG with scalability [Suchomski et al., 2006].

Concluding, SLS provides very good scalability with respect to quality versus occupied size, and at the same time it can still be competitive with other, non-scalable lossless codecs.

VI.6.2. MPEG-4 SLS Processing Scalability

In order to investigate the processing scalability, the decoding speed was compared to the bitrate of the input SLS bitstream. Figure 48 shows the SLS decoding speed with respect to enhancement streams truncated at different bit rates, analogical to the data-quality scalability test.

Obviously, the truncated enhancement streams are decoded faster, as the amount of data and the number of bit planes decrease due to the truncation. Furthermore, the non-core SLS streams are decoded faster than normal SLS, but in both cases the increase in speed is very small – about a factor of 2 from the minimum to the maximum enhancement bitrate (whereas the expected behavior would show much faster decoding for a small amount of data, i.e. the curve should decline more steeply). However, the situation changes dramatically when the FAAD2 AAC decoder [WWW_FAAD, 2006] is used for decoding the AAC core and the enhancement layer is dropped completely: in that case the decoding is over 100 times faster than real-time. But even then, as soon as the decoding of enhancement layers takes place, the decoding speed drops below 10 times faster than real-time.

[Plot: decoding speed (× real-time, 1.0–1000.0, logarithmic) versus bitrate (64 kbps to max) for SLS no core, SLS with 64 kbps core, and SLS 64 kbps core only.]

Figure 48. Decoding speed of SLS version of SQAM with truncated enhancement stream [Suchomski et al., 2006].

Overall, real processing scalability, as would be expected, is not given with MPEG-4 SLS. From this point of view it can be reduced to only two steps of scalability – using the enhancement layer or not – so the scalability of SLS in terms of processing speed cannot compete with its scalability in terms of sound quality. This is caused by the inverse integer MDCT (InvIntMDCT) taking the largest part of the overall processing time: even though it is almost constant and does not depend on the source audio signal, it consumes between 50% and 70% of the decoding time depending on the source audio. A better and faster implementation has been proposed [Wendelska, 2007] and is described in section XXIII, MPEG-4 SLS Enhancements, of Appendix H. It can clearly be seen (Figure 110) that the enhancements in the implementation have brought benefits such as a smaller decoding time.


VII. REAL-TIME PROCESSING MODEL

The time aspect of the data in continuous media such as audio and video influences the prediction, scheduling and execution methods. If real-time processing is considered as a base for the format-independence provision in an MMDBMS, the converters participating in the media transformation process should be treated not as separate entities but as parts of the whole execution. Of course, a converter first of all must be able to run in the real-time environment (RTE); as such, it must be able to execute in the RTOS and to react to the IPC used for controlling real-time processes specific to the given RTOS. Secondly, there exist data dependencies in the processing, which have not been considered by the previously discussed models [Hamann et al., 2001a; Hamann et al., 2001b]. So, at first the modeling of continuous multimedia transformations with the aspects of data dependency is discussed. Then, the real-time issues are discussed in the context of multimedia processing. Finally, the design of real-time media converters is proposed.

VII.1. Modeling of Continuous Multimedia Transformation

VII.1.1. Converter, Conversion Chains and Conversion Graphs

A simple black-box view of the converter [Schmidt et al., 2003] considers as visible: the data source, the data destination, the resource utilization and the processing function, which consumes the resources. The conversion chain described in [Schmidt et al., 2003] assumed that, in order to handle all kinds of conversion, a directed acyclic graph is required (conversion graph), i.e. split and re-join operations are needed in order to support multiplexed multimedia streams; however, the split and re-join operations are treated in [Schmidt et al., 2003] as separate operations lying outside the defined conversion chain. On the other hand, it was stated that, due to the interaction between only two consecutive converters in most cases, only conversion chains shall be investigated. Those two assumptions somewhat contradict each other. If a consistent, sound, and complete conversion model were provided, then the split and re-join operations would be treated as converters as well, and thus the whole conversion graph should be considered.


[Diagram: converter black box with m data inputs, n data outputs, a processing function, and its resource utilisation.]

Figure 49. Converter model – a black-box representation of the converter (based on [Schmidt et al., 2003; Suchomski et al., 2004]).

Accordingly, the conversion model was extended in [Suchomski et al., 2004] such that the incoming data may be represented not by only one input but by many inputs, and the outgoing data not by one output but by many outputs, i.e. the input/output of the converter may consume/provide one or more media data streams, as depicted in Figure 49. Moreover, it is assumed that the converter may convert from m media streams to n media streams, where m∈N, n∈N, and m does not have to be equal to n (but it may happen that m=n). Due to this extension, the simple media converter could be extended to a multimedia converter, and the view on the conversion model could be broadened from the conversion chain to conversion graphs [Suchomski et al., 2004]. Moreover, the model presented in [Suchomski et al., 2004] is a generalization of the multimedia transformation based on the previous research (mentioned in the Related Work in section II.4).
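The extended black-box view can be pictured in code roughly as follows; the structure is only a sketch of Figure 49 with invented names, not a part of the RETAVIC implementation.

#include <stddef.h>

typedef struct stream_buffer stream_buffer;   /* connection between converters */

typedef struct resource_usage {
    double cpu_time_per_quant_ms;   /* expected processing time per quant */
    size_t memory_bytes;            /* working memory of the converter    */
} resource_usage;

typedef struct converter {
    stream_buffer **in;     /* m data inputs;  m == 0 marks a source (e.g. file reading) */
    size_t          m;
    stream_buffer **out;    /* n data outputs; n == 0 marks a sink (e.g. screen)         */
    size_t          n;
    resource_usage  usage;  /* resource utilization of the processing function           */
    int (*process)(struct converter *self);   /* consumes inputs, produces outputs       */
} converter;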

The special case of source and sink, represented as converters on the borders of the conversion graph, should still be mentioned. These two types of converters have, within the model, only the output or only the input, respectively. The source (e.g. file reading) delivers the data for the consecutive converter, while the sink (e.g. screen) only consumes the data from the previous converter. The file itself or the file system (its structure, access method, etc.), as well as the network or the graphics adapter, should not be present in the conversion graph as fully-fledged converters; they are considered representatives of the outside environment having just one type of data interconnection and are mapped to the converter sink or source, respectively. Of course, the conversion graph is not limited to just one source and one sink (there may be many of them), but there must always be at least one instance of a source and one instance of a sink present in the conversion graph. The converter graph is depicted in Figure 50, where (a) presents the meta-model of the converter graph, (b) shows a model of the converter graph derived from the meta-model definition, and (c) depicts three examples being instances of the converter graph model, i.e. the particular converters are selected such that the converter graph is functionally correct and executable (explained later in section VII.2.5.2, Scheduling of the conversion graph).

[Diagram: a) meta-model – a CONVERTER plays Producer and Consumer roles linked at runtime by M:N CONNECTIONs, with the constraints that a SOURCE may not have any Consumer role and a SINK may not have any Producer role; b) model – converter graphs and converter chains built from a converter (source), intermediate converters and a converter (sink) connected through buffers; c) instance examples – disk → demux mpeg-sys → mpeg-video to DivX (V) and mpeg-audio to wma (A) → mux asf → network; selective reading → LLV1 decoding → XVID encoding → network; DVB-T live multiplex stream → demux mpeg-sys → internet radio (A) and archive.]

Figure 50. Converter graph: a) meta-model, b) model and c) instance examples.


VII.1.2. Buffers in the Multimedia Conversion Process

Within the discussed conversion-graph model the data inputs and data outputs are logically joined into connections, which are mapped onto buffers between converters. This mapping has to be designed carefully because of the key problem of data-transmission costs between multimedia converters. Especially if the converters operate on uncompressed multimedia data, the cost of copying data between buffers may heavily influence the efficiency of the system.

Hence, the assumption of having zero-copy operations provided by the operating system should be made. A zero-copy operation can be provided, for example, by buffers shared between processes, called a global buffer cache [Miller et al., 1998] – in other words, by using inter-process shared or mapped memory for input/output operations through pass-by-reference transfers88. Yet another example of a zero-copy operation, which can deliver efficient data transfers, is based on page remapping, "moving" data residing in the main memory across the protection boundaries and thus avoiding copy operations (e.g. IO-Lite [Pai et al., 1997]). In contrast, message passing, where the message is formed out of the function invocation, a signal and data packets, is not a zero-copy operation, due to the fact that data are copied from/to local memory into/from the packets transmitted through the communication channel [McQuillan and Walden, 1975]89.
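As an illustration of the shared-memory variant of zero-copy transfer, the following C sketch maps one POSIX shared-memory object into two converter processes, so that a quant written by the producer is read by the consumer from the very same pages and only a reference (offset) has to be exchanged. Error handling and synchronization are omitted, the object name is arbitrary, and an RTOS used for a real-time prototype would offer an analogous primitive of its own.

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map a named shared-memory object of the given size; the producer calls it
 * with create != 0, the consumer with create == 0. Both obtain a pointer to
 * the same physical pages, so handing over a quant requires no data copy. */
static void *map_shared_buffer(const char *name, size_t size, int create)
{
    int fd = shm_open(name, create ? (O_CREAT | O_RDWR) : O_RDWR, 0600);
    if (create)
        ftruncate(fd, (off_t)size);
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                         /* the mapping remains valid after close */
    return p;
}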

Jitter-constrained periodic stream

Obviously, the size of the buffer depends on the size of the media quanta and on the client's QoS requirements [Schmidt et al., 2003], but the buffer size must still somehow be calculated. It can be done, for example, by the statistical approach using jitter-constrained periodic streams (JCPS) proposed by [Hamann et al., 2001b]. A JCPS is defined as two streams, time and size [Hamann et al., 2001b]:

88 Another example of zero-copy operation is well-known direct memory access (DMA) for hard disks, graphical adapters or network cards used by the OS drivers. 89 There may be a message passing implemented with zero-copy method, however, it is commonly acknowledged that message passing introduces some overhead and is less efficient than shared memory on local processor(s) (e.g. on SMPs) [LeBlanc and Markatos, 1992]. This however, may not be the case on distributed shared memory (DSM) multiprocessor architectures [Luo, 1997]. Nevertheless, there is some research done in the direction of providing the software-based distributed shared memory e.g. over virtual interface architecture (VIA) for Linux-based clusters, which seems to be simpler in handling and still competitive solution [Rangarajan and Iftode, 2000], or using openMosix architecture allowing for multi-threaded process migration [Maya et al., 2003].


Time stream: JCPS_t = (T, D, \tau, t_0) \qquad (27)

where T is the average distance between events, i.e. the length of the period (T > 0), D is the minimum distance between events (0 ≤ D ≤ T), τ is the maximum lateness, i.e. the maximum deviation from the beginning of the period over all periods, and t_0 is the starting time point (t_0 ∈ ℝ);

Size stream: JCPS_s = (S, M, \sigma, s_0) \qquad (28)

where S is the average quant size (S > 0), M is the minimum quant size (0 ≤ M ≤ S), σ is the maximum deviation from the accumulated quantum size, and s_0 is the initial value (s_0 ∈ ℝ) [Hamann et al., 2001b].

Leading time and buffer size calculations

According to the JCPS specification it is possible to calculate the leading time of the producer P with respect to the consumer C and the minimum size of the buffer as follows [Hamann et al., 2001b]:

t_{lead} = \frac{\sigma_C}{R} + \tau_P = \frac{T_P \cdot \sigma_C}{S_P} + \tau_P \qquad (29)

B_{min} = \left\lceil (\tau_C + \tau_P) \cdot R + \sigma_P + \sigma_C - s_0 \right\rceil = \left\lceil (\tau_C + \tau_P) \cdot \frac{S_P}{T_P} + \sigma_P + \sigma_C - s_0 \right\rceil \qquad (30)

where R is the ratio of size to time (i.e. the bit rate) of both communicating JCPS streams. Of course, there is only one P and one C for each buffer.
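A small C sketch of the JCPS bookkeeping may help to see how Formulas (29) and (30), as well as the latency Formula (38) introduced later in section VII.1.5, are used; the structure and function names are invented for illustration.

#include <math.h>

/* Jitter-constrained periodic stream, Formulas (27) and (28):
 * time stream (T, D, tau, t0) and size stream (S, M, sigma, s0). */
typedef struct { double T, D, tau, t0; } jcps_time;
typedef struct { double S, M, sigma, s0; } jcps_size;

/* Leading time of producer P w.r.t. consumer C, Formula (29):
 * t_lead = sigma_C / R + tau_P with R = S_P / T_P. */
static double jcps_lead_time(jcps_time pt, jcps_size ps, jcps_size cs)
{
    double R = ps.S / pt.T;
    return cs.sigma / R + pt.tau;
}

/* Minimum buffer size, Formula (30). */
static double jcps_min_buffer(jcps_time pt, jcps_size ps,
                              jcps_time ct, jcps_size cs, double s0)
{
    double R = ps.S / pt.T;
    return ceil((ct.tau + pt.tau) * R + ps.sigma + cs.sigma - s0);
}

/* Start-up latency of a buffer of size B, Formula (38): L = B / S * T. */
static double jcps_latency(double B, jcps_size s, jcps_time t)
{
    return B / s.S * t.T;
}

For the Mobile (CIF) example discussed in section VII.1.4 (P_t = C_t = (40, 40, 0, 0) and P_s = C_s = (1201.5, 1201.5, 0, 0)), these functions return t_lead = 0 and B_min = 0 – exactly the values that the data-dependency discussion below shows to be misleading.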

M:N data stream conversion

However, the JCPS considers only conversion chains based on a simple I/O converter, i.e. a converter accepting one input and one output (such as given in [Schmidt et al., 2003]); multiple streams and stream operations are not supported. The remedy, which allows supporting additional stream operations such as join, split or multiplex, could be delivered in two ways: 1) as an extension of the model such that the condition for buffer allocation supports multiple producers and multiple consumers, or 2) by changing the core assumption, i.e. substituting the used converter model by the one given in [Suchomski et al., 2004] (depicted in Figure 49). In the first case, the assumptions for the producer and for the consumer have to be extended as:

Pt = Pt1 + Pt2 + … + Ptn where Ptn = (TPn, DPn, τPn, t0Pn) (31)

Ps = Ps1 + Ps2 + … + Psn where Psn = (SPn, MPn, σPn, s0Pn) (32)

Ct = Ct1 + Ct2 + … + Ctn where Ctn = (TCn, DCn, τCn, t0Cn) (33)

Cs = Cs1 + Cs2 + … + Csn where Csn = (SCn, MCn, σCn, s0Cn) (34)

where the "+" operator for the time streams symbolizes a function combining the time event streams into one, such that all events are included and the length of the period (T), the minimum event distance (D), the maximum lateness (τ) and the starting point (t0) are correctly calculated, and the "+" operator for the size streams represents the addition of the average sizes (S) together with the calculation of the minimum quantum size (M), the maximum accumulated deviation of the quantum size (σ), and the initial value (s0). However, the definition of these operators is expected to be very complex and has not been given yet. The other method, in contrast, where the new converter model is assumed, does not require changes in the JCPS buffer model, i.e. the outputs of a converter are treated logically as separate producers delivering to separate consumers formed by the inputs of the subsequent converter(s). The drawback is obvious: the converter itself must cope with the synchronization of each input buffer (acting as consumer) and each output buffer (acting as producer).

VII.1.3. Data Dependency in the Converter

As proved in previous work [Liu, 2003] and briefly pointed out within this work, the behavior of a converter depends highly on the processed data. Even the application of the MD-based transcoding approach does not eliminate but only reduces the variations of the processing-time requirements, making a precise definition of the relationship between data and processing impossible, i.e. an exact prediction of the execution time is still hardly feasible. On the other hand, it reduces the required computations appreciably, e.g. the speed-up for video ranges between 1.32 and 1.47. The data influence on the processing has been investigated to some extent and is presented later in this work in the section Real-time Issues in Context of Multimedia Processing (subsection VII.2.1).


VII.1.4. Data Processing Dependency in the Conversion Graph

There are also data dependencies within the conversion graph. Let us assume that there are only two converters: a producer (P) and a consumer (C). P delivers quanta synchronously – 25 decoded frames per second, without delay and with a constant size derived from the color scheme and resolution, or from the number of MBs (each MB consists of 6 blocks of 64 8-bit values) – written to the buffer for the sequence Mobile (CIF), and C consumes exactly the way P produces; in other words, both have no delay in delivery and intake. According to the JCPS model they can be represented as:

Pt = (TP, DP, τP, t0P) where TP = 40 [ms], DP = 40 [ms], τP = 0, t0P = 0, i.e. Pt = (40, 40, 0, 0)

Ps = (SP, MP, σP, s0P) where SP = 1201.5 [kb], MP = 1201.5 [kb], σP = 0, s0P = 0, i.e. Ps = (1201.5, 1201.5, 0, 0)

and analogically Ct = (40, 40, 0, 0) and Cs = (1201.5, 1201.5, 0, 0).

The lead time and the minimal buffer can then be calculated according to Formulas (29) and (30) for this simple theoretical case:

t_lead = 0.0 [s] and B_min = 0 [b]

Even though P and C fulfill the requirement of a constant rate, i.e. R = Ss/Ts = St/Tt, the above result would lead to an error in reality, because there is a hidden data dependency between successive converters90, which is not represented within the model but is very important for any type of scheduling of the conversion process. Namely, C can start consumption only when P has completed the first quant, i.e. when the buffer is filled with the amount of bits C needs. Moreover, considering the times at which subsequent quanta are consumed by C, all of them must have been produced beforehand by P. Thus, if scheduling is considered, P should start at least one full quant (frame/window of samples) before C can work, i.e. t_lead must equal 40 ms. Moreover, if the buffer is used as the transport medium, it should at least allow storing a full frame, which means it should be at least 1201.5 [kb]. Finally, if P and C are modeled as stated above, both must occupy the processor exclusively due to the full use of the processing time (in both cases 40 [ms] × 25 [fps] = 1 [s]), which means that they run either on two processors or on two systems.

90 According to the JCPS model, the contents of the quanta are irrelevant here, i.e. the converter is not dependent on the data.

The data dependency in the conversion graph exists regardless of the scheduling model and the system, i.e. it does not matter whether there is one processor or more. In the case of one processor, the scheduling must create an execution sequence of converters such that the producer occupies the processor, always produces the quant required by the consumer in advance, and shares the time with the consumer. In the case of parallel processing, the consumer on the next processor can start only when the producer on the previous one has delivered the data to the buffer. This is analogous to instruction pipelining, but applied on the thread level instead of the instruction level.

VII.1.5. Problem with JCPS in Graph Scheduling

If the transcoding graph is considered from the user perspective, the output of the last converter in the chain must account for the execution time of all previous converters and must allow for synchronized data delivery. Moreover, if processing on one CPU is assumed, the total time consumed by all converters for a given quant should not exceed the period size of the output stream, because otherwise the rule of a constant quant rate is broken. For example, if the Mobile (CIF) sequence is delivered with 25 fps, the period size equals 40 ms, so the sum of the execution times of all converters on one processor should not exceed this value; otherwise, real-time execution is not possible. Hence, the following inequality must hold for real-time delivery on one CPU:

T_{OUT} \geq \sum_{i=1}^{n} T_i \qquad (35)

where T_OUT is the period size of the output JCPS stream requested by the user, T_i is the JCPS period size for the specific element in the conversion graph and n is the number of converters.


Moreover, the leading times (t_lead) of all converters should be summed up, and analogically the buffer sizes should be considered, as follows:

t_{lead_{OUT}} \geq \sum_{i=1}^{n} t_{lead_i} \qquad (36)

B_{min_{OUT}} \geq \sum_{i=1}^{n} B_{min_i} \qquad (37)

These requirements hold especially for infinite streams. Even though introducing a buffer allows for some deviations in execution time, the extra time consumed by converters that took longer must be balanced by those that took less time, so that the average still fulfills the requirement. On the other hand, a respective buffer size allows for some variation in the processing time, but at the same time introduces start-up latency in the delivery process: the bigger the buffer, the bigger the latency. The start-up latency can be calculated for each conversion-graph element by:

L = \frac{B}{S} \cdot T \qquad (38)

where L is the latency derived from the given buffer and, following the definitions given previously, B is the buffer size, S is the average quantum size and T is the length of the period.

The latency of the complete converter graph is calculated as the sum of the latencies of all converters in the conversion graph:

L_{OUT} \geq \sum_{i=1}^{n} L_i \qquad (39)

An example of a simple transcoding chain is depicted in Figure 51. Here the LLV1 BL+TEL bit stream is read from the disk, put into a buffer, decoded by the LLV1 decoder, put into a buffer, encoded by XVID, put into a buffer, and finally the encoded MPEG-4 bit stream is stored on the disk. The transcoding chain has been executed sequentially on a best-effort OS in exclusive execution mode (occupying 100% of the CPU) in order to allow modeling with JCPS by measuring the time spent by each element of the chain for all frames: at first the selected data have been read from the storage and buffered completely for all frames, then the first converter A (LLV1 decoding) processed the data and put the decoded frames into the next buffer, next the second converter B (XVID encoding) compressed the data and put them into the third buffer, and finally the write-to-disk operation has been executed.

Figure 51. Simple transcoding used for measuring times and data amounts on best-effort OS with exclusive execution mode.

[Charts: a) per-frame transcoding time in ms for frames 1–15, split by chain element; b) each element's share of the total transcoding time – Converter B Encode 61.4%, Converter A Decode 27.9%, Source Read 4.3%, Sink Store 3.5%, and the buffers 2.2%, 0.7% and 0.0%.]

Figure 52. Execution time of simple transcoding: a) per frame for each chain element and b) per chain element for total transcoding time.

The measured execution times together with the amount of produced data are listed for the first 15 quanta of each participating chain element in Table 5 and depicted in Figure 52. The time required for reading (from the buffer) by a given converter can be neglected in the context of the converter's time due to the fast memory access employing the caching mechanism; thus it is measured together with the converter's time – only in the case of the source does the measured time represent reading from the disk. Secondly, the quant size read by the next converter is equal to the one stored in the buffer. Thus the consumer part of each converter is hidden (in order to avoid repetitions in the table), and the data in the buffers are called Data In, which reflects their consumer characteristic, in contrast to Data Out at the source, the converters and the sink (being producers). As can be noticed, the complete chain is evaluated and the execution and buffering times are calculated. The execution time is the simple addition of the time values for source, converters and sink, while the buffering time is the sum of the writing-to-buffer times. Finally, the waiting time required for synchronization is calculated – the synchronization is assumed to be done with respect to the user specification, i.e. the requested 25 fps of the output stream, which is given by the JCPS time stream JCPSt = (40, 40, 0, 0).

Finally, the JCPS is calculated for each element of the hypothetical conversion chain. The time and size streams are both given in Table 6 (analogically to Table 5). The time JCPSs are also calculated for the execution, buffering and wait values. It can easily be noticed that the addition of the respective elements of these three time JCPSs will not give the time JCPS specified by the user. Only the period size (T) can be added; the other attributes (D, τ, t0) have to be calculated in a different, yet unknown way (as mentioned in section VII.1.2 and denoted by the symbol "+").

Summarizing, JCPS alone is not sufficient for scheduling converter graphs, because the influence of the data on the behavior of the converter graph is not considered in the JCPS model, as mentioned above. Thus additional MD are required, which might be analogous to trace information. Trace data are defined as statistical data coming from the analysis of the execution recorded for each quant; they are the most suitable for predicting repetitive executions but are rather expensive. Moreover, JCPS, due to its unawareness of the media data, is not as good as a model based on trace information.

Hence, one goal could be to minimize the trace information and encapsulate it in the MD set. Another possible solution is to define the MD set separately in such a way that a schedule of the processing analogous to the trace can be calculated.

Moreover, the JCPS should be extended by the target frequency of the processed data, because specifying only the period sizes, as originally proposed, will not provide the truly expected synchronous output. Hence, the following definition for the time JCPS is given:

Time stream: JCPS_t = (T, D, \tau, t_0, F) \qquad (40)

where the additional parameter F represents the target frequency of the events delivering the quanta.


Sequence: MOBILE_CIF (356x288, 25 fps); period (1/fps): 40 ms; number of frames: 15. All times in [ms], all data amounts in [kb]. The last three columns summarize the complete chain (execution time, buffering time, synchronization/wait time).

Quant No | Source Read: Time | Data Out | Buffer: Time | Data In | Conv. A Decode: Time | Data Out | Buffer: Time | Data In | Conv. B Encode: Time | Data Out | Buffer: Time | Data In | Sink Store: Time | Data In | Exec Time | Buffer Time | Sync (Wait) Time
1 | 2.23 | 44.6 | 0.36 | 44.6 | 8.03 | 1201.5 | 0.67 | 1201.5 | 17.67 | 35.6 | 0.02 | 35.6 | 1.8 | 35.6 | 29.7 | 1.0 | 9.2
2 | 0.74 | 14.9 | 0.12 | 14.9 | 8.32 | 1201.5 | 0.67 | 1201.5 | 18.31 | 11.9 | 0.01 | 11.9 | 0.6 | 11.9 | 28.0 | 0.8 | 11.2
3 | 1.49 | 29.7 | 0.24 | 29.7 | 9.07 | 1201.5 | 0.67 | 1201.5 | 19.96 | 23.8 | 0.01 | 23.8 | 1.2 | 23.8 | 31.7 | 0.9 | 7.4
4 | 0.76 | 15.1 | 0.12 | 15.1 | 8.34 | 1201.5 | 0.67 | 1201.5 | 18.35 | 12.1 | 0.01 | 12.1 | 0.6 | 12.1 | 28.1 | 0.8 | 11.1
5 | 1.34 | 26.7 | 0.21 | 26.7 | 9.01 | 1201.5 | 0.67 | 1201.5 | 19.83 | 21.4 | 0.01 | 21.4 | 1.1 | 21.4 | 31.2 | 0.9 | 7.9
6 | 0.83 | 16.5 | 0.13 | 16.5 | 8.21 | 1201.5 | 0.67 | 1201.5 | 18.07 | 13.2 | 0.01 | 13.2 | 0.7 | 13.2 | 27.8 | 0.8 | 11.4
7 | 2.08 | 41.6 | 0.33 | 41.6 | 9.03 | 1201.5 | 0.67 | 1201.5 | 19.86 | 33.3 | 0.02 | 33.3 | 1.7 | 33.3 | 32.6 | 1.0 | 6.4
8 | 0.89 | 17.8 | 0.14 | 17.8 | 8.18 | 1201.5 | 0.67 | 1201.5 | 17.99 | 14.3 | 0.01 | 14.3 | 0.7 | 14.3 | 27.8 | 0.8 | 11.4
9 | 1.63 | 32.7 | 0.26 | 32.7 | 9.01 | 1201.5 | 0.67 | 1201.5 | 19.81 | 26.1 | 0.01 | 26.1 | 1.3 | 26.1 | 31.8 | 0.9 | 7.3
10 | 0.74 | 14.9 | 0.12 | 14.9 | 8.23 | 1201.5 | 0.67 | 1201.5 | 18.10 | 11.9 | 0.01 | 11.9 | 0.6 | 11.9 | 27.7 | 0.8 | 11.6
11 | 1.19 | 23.8 | 0.19 | 23.8 | 9.05 | 1201.5 | 0.67 | 1201.5 | 19.91 | 19.0 | 0.01 | 19.0 | 1.0 | 19.0 | 31.1 | 0.9 | 8.0
12 | 0.89 | 17.8 | 0.14 | 17.8 | 8.09 | 1201.5 | 0.67 | 1201.5 | 17.79 | 14.3 | 0.01 | 14.3 | 0.7 | 14.3 | 27.5 | 0.8 | 11.7
13 | 2.52 | 50.5 | 0.40 | 50.5 | 9.01 | 1201.5 | 0.67 | 1201.5 | 19.81 | 40.4 | 0.02 | 40.4 | 2.0 | 40.4 | 33.4 | 1.1 | 5.5
14 | 0.80 | 15.9 | 0.13 | 15.9 | 8.21 | 1201.5 | 0.67 | 1201.5 | 18.07 | 12.8 | 0.01 | 12.8 | 0.6 | 12.8 | 27.7 | 0.8 | 11.5
15 | 1.95 | 39.0 | 0.31 | 39.0 | 9.03 | 1201.5 | 0.67 | 1201.5 | 19.86 | 31.2 | 0.02 | 31.2 | 1.6 | 31.2 | 32.4 | 1.0 | 6.6
Total | 20.1 | 401.3 | 3.2 | 401.3 | 128.8 | 18022.5 | 10.0 | 18022.5 | 283.4 | 321.1 | 0.2 | 321.1 | 16.1 | 321.1 | 448.3 | 13.4 | 138.3

Table 5. Processing time consumed and amount of data produced by the example transcoding chain for Mobile (CIF) video sequence.

For each chain element the time JCPS (JCPSt) and, where applicable, the size JCPS (JCPSs) are given; the rows correspond to the JCPS parameters T/S, D/M, τ/σ and t0/s0.

Parameter | Source JCPSt | Source JCPSs | Buffer JCPSt | Buffer JCPSs | A JCPSt | A JCPSs | Buffer JCPSt | Buffer JCPSs | B JCPSt | B JCPSs | Buffer JCPSt | Buffer JCPSs | Sink JCPSt | Sink JCPSs | Exec JCPSt | Buffer JCPSt | Wait JCPSt
T / S | 1.3 | 26.8 | 0.2 | 26.8 | 8.6 | 1201.5 | 0.7 | 1201.5 | 18.9 | 21.4 | 0.0 | 21.4 | 1.1 | 21.4 | 29.9 | 0.9 | 9.2
D / M | 0.7 | 14.9 | 0.1 | 14.9 | 8.0 | 1201.5 | 0.7 | 1201.5 | 17.7 | 11.9 | 0.0 | 11.9 | 0.6 | 11.9 | 27.5 | 0.8 | 5.5
τ / σ | 0.9 | 17.8 | 0.1 | 17.8 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 14.2 | 0.0 | 14.2 | 0.7 | 14.2 | 0.0 | 0.2 | 4.0
t0 / s0 | -1.3 | -25.1 | -0.2 | -25.1 | -0.8 | 0.0 | 0.0 | 0.0 | -1.8 | -20.1 | 0.0 | -20.1 | -1.0 | -20.1 | -3.8 | -0.2 | 0.0

Table 6. The JCPS calculated for the respective elements of the conversion graph from the Table 5.


VII.1.6. Operations on Media Streams

Media integration (multiplexing)

The multiplexing (also called merging or muxing) of media occurs when two or more conversion chains are joined by one converter, i.e. when the converter accepts two or more inputs. In such a case, a synchronization problem occurs (see later). A typical example is multiplexing audio and video into one synchronous transport stream.

Media demuxing

Demuxing (or demultiplexing) is the inverse of the muxing operation. It allows separating each media-specific stream from the interleaved multimedia stream. An important element of demuxing is the assignment of time stamps to the media quanta in order to allow synchronization of the media (see later). A typical example is decoding from a multiplexed stream in order to display video and audio together.

Media replication

Replication is a simple copy of the (multi)media stream such that the input is mapped to multiple (at least two) outputs by copying the exact content of the stream. No other special functionality is required here.

Both demuxing and replication are considered one-input/many-outputs converters according to the converter model defined previously (depicted in Figure 49), and muxing, respectively, is considered a many-inputs/one-output converter.

VII.1.7. Media data synchronization

The synchronization problem can be solved by using the digital phase-lock loop (DPLL)91 in two ways: 1) employing the buffer-fullness flag and 2) using time stamps [Sun et al., 2005] (Chapter VI). The second technique has the advantage of allowing asynchronous execution of the producer and the consumer. These two techniques are usually applied in the network area between the transmitter and the receiving terminals, however they are not limited to it. Thus, a global timer representing the time clock of the media stream, and time stamps assigned to the processed quanta, should be applied in the conversion graph (in analogy to the solution described for MPEG in [Sun et al., 2005]92). These elements are required, however they are not sufficient within the conversion graph. Additionally, the target frequencies of all media must be considered to define the synchronization points, which can be calculated integer-based using the least common multiple (LCM) and the greatest common divisor (GCD)93 – otherwise the synchronization may include some minor rounding error. If multiple streams with only a minor difference in quant frequency are considered (e.g. 44.1 kHz vs. 48 kHz, or 25 vs. 24.99 fps), it may be desirable to synchronize more often than the integer-based GCD approach allows, and then the avoidance of rounding errors is not possible.

91 A DPLL is an apparatus/method for generating a digital clock signal which is frequency- and phase-referenced to an external digital data signal. The external digital data signal is typically subject to variations in data frequency and to high-frequency jitter unrelated to changes in the data frequency. A DPLL may consist of a serial shift register receiving digital input samples, a stable local clock signal supplying clock pulses that drive the shift register, and a phase-corrector circuit adjusting the phase of the regenerated clock to match the received signal.

In order to explain this more clearly, let us assume that there are two streams, video and audio. The video should be delivered with a frame rate of 10 fps, i.e. a target frequency of 10 Hz. The audio should be delivered with a sampling rate of 11.025 kHz (standard phone quality). However, the stored source data have a QoD higher than the requested QoD, achieved by a higher frame rate (25 fps) and a higher sampling rate (44.1 kHz). The source data are synchronized at every frame and at every 1764th sample, i.e. GCD(25, 44100)=25, 25/25=1 and 44100/25=1764. In contrast, the produced data can be integer-synchronized at every 2nd frame and every 2205th sample, i.e. GCD(10, 11025)=5, 10/5=2 and 11025/5=2205. If fractional synchronization were used such that the video is synchronized at every frame, the audio would have to be synchronized exactly in the middle between samples 1102 and 1103 (2205/2=1102.5).

For the other examples mentioned previously, GCD(44100, 48000) equals 300, meaning synchronization at every 147th sample of the 44.1 kHz stream and every 160th sample of the 48 kHz stream, and GCD(2499, 2500) equals 1 (with the frame rates scaled to hundredths), meaning synchronization only at every 2499th frame of the 24.99 fps stream and every 2500th frame of the 25 fps stream, i.e. once per 100 seconds.
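The integer-based calculation of the synchronization points described above can be expressed compactly. The following minimal C++ sketch (not part of the original design; the function name is illustrative) computes how many quanta of each stream lie between two consecutive synchronization points for given integer quant frequencies:

```cpp
#include <cstdio>
#include <numeric>   // std::gcd (C++17)

// For two streams with integer quant frequencies fA and fB (quanta per second),
// the streams align every 1/gcd(fA, fB) seconds, i.e. every fA/gcd quanta of the
// first stream and every fB/gcd quanta of the second one.
static void sync_interval(long fA, long fB) {
    const long g = std::gcd(fA, fB);
    std::printf("%ld Hz / %ld Hz: sync every %ld / %ld quanta (every %.4f s)\n",
                fA, fB, fA / g, fB / g, 1.0 / g);
}

int main() {
    sync_interval(25, 44100);    // source video/audio of the example above
    sync_interval(10, 11025);    // requested video/audio of the example above
    sync_interval(44100, 48000); // two audio streams with close sampling rates
    return 0;
}
```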

92 MPEG-2 allows for two types of time stamps (TS): decoding (DTS) and presentation (PTS). Both are based on the system time clock (STC) of the encoder, which is analogous to the global timer. The STC of MPEG-2 has a constant frequency of 27 MHz and is also represented in the stream by program clock references (PCR) or system clock references (SCR).
93 In mathematics, LCM is also known as the smallest common multiple (SCM), and GCD is also called the greatest common factor (GCF) or highest common factor (HCF).


VII.2. Real-time Issues in Context of Multimedia Processing

VII.2.1. Remarks on JCPS – Data Influence on the Converter Description

JCPS is proposed to be used for modeling the converters applicable in timed-media processing. However, it was found empirically that for different media data (two different video or audio streams), the same converter requires the specification of different values according to the JCPS definition. Moreover, the difference exists in both the time and the size specifications of JCPS (Table 7).

               JCPSt (time)                  JCPSs (size)
               T      D     τ      t0        S       M       σ        s0
source         133    133   0      0         67584   67584   0        0
mobile_cif     136    90    1686   50        54147   39344   799983   48273
tempete_cif    134    87    1637   47        35852   24418   501602   34227

Table 7. JCPS time and size for the LLV1 encoder.

To illustrate the difference between the calculation backgrounds of the above JCPSs, two graphs are presented: the cumulated time in Figure 53 and the cumulated size in Figure 54. The values are cumulated over the frame number for the first 20 frames.

[Figure: cumulated time in seconds over the first 20 frames for the source, mobile_cif and tempete_cif curves.]

Figure 53. Cumulated time of source period and real processing time.


[Figure: cumulated size in kB over the first 20 frames for the source, mobile_cif and tempete_cif curves.]

Figure 54. Cumulated size of source and encoded data.

The Tempete and Mobile sequences used for this comparison have the same resolution and frame rate, and thus the same constant data rate (depicted as source). However, the contents of these sequences are different. Please note that JCPS has been designed under the assumption of no data dependency.

As can be noticed, the JCPSs given for the LLV1 encoder processing video sequences with the same constant data rate and time requirements but with different contents (Tempete vs. Mobile) differ in both size and time. If there were really no data dependency, the JCPS for both time and size should be the same for one converter. This is not the case, and especially the difference in size is noticeable (Figure 54).

VII.2.2. Hard Real-time Adaptive Model of Media Converters

Due to these problems with a pure application of JCPS in the conversion graph, another solution has been investigated. It is based on imprecise computations and a hard-real-time assumption, and is therefore called within this work the hard real-time adaptive (HRTA) converter model. As mentioned in the Related Work (subsection III.2.2.2), imprecise computation allows for scalability in the processing time by delivering less or more accurate results. If it is applied such that the minimum time for calculating the lowest quality acceptable (LQA) [ITU-T Rec. X.642, 1998] is guaranteed, some interesting applications to multimedia processing emerge.


For example, frame skips need not occur at all in the imprecise computation model: when decoding/encoding is designed according to the model, a minimal quality of each frame can be computed at lower cost and then enhanced according to the available resources, while this minimal quality is guaranteed in 100% of the cases. Moreover, such a model does not require converter-specific time buffers (besides those used for the reference quanta, which is the case in all processing models) and does not introduce an additional initial processing delay beside the one required by the data dependency in the conversion graph (i.e. beside the graph-specific time buffer, which is present anyway in all processing models). As a result, the model allows for average-case allocation with LQA guarantees and no frame drops, thus achieving a higher perceived quality of the decoded video, which is an advantage over the all-or-nothing model, where a deadline miss causes the frame to be skipped.

The hard real-time adaptive model of media converters uses quality-assuring scheduling (QAS), but the two are not the same. The difference is that the model proposes how to structure a media converter if imprecise computations are to be included, whereas QAS is a tool for mapping from the converter model to the system implementation.

The HRTA converter model is defined as:

C_HRTA = (C_M, C_O, C_D)    (41)

where C_HRTA denotes a converter supporting adaptivity and hard real-time constraints, C_M defines the mandatory part of the processing, C_O specifies the optional part of the algorithm, and C_D is a delivery part providing the results to the next converter (also called the clean-up step).

In order to provide LQA, C_M and C_D are always executed, whereas C_O is executed only when idle resources are available. C_M is responsible for processing the media data and coding them according to the LQA definition. C_O enhances the results, either by repeating the calculation with higher precision on the same data or by operating on additional data and then improving the results or calculating additional results. C_D decides which results to deliver: if no C_O has been executed, the output of C_M is selected; otherwise the output of C_O is integrated with the output of C_M; finally, the results are provided as the output of the converter.
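The C_M/C_O/C_D split can be illustrated with a minimal C++ sketch. The class and member names below are hypothetical and only mirror the decomposition defined above; they do not reproduce the RETAVIC implementation:

```cpp
#include <vector>

// Hypothetical media quantum exchanged between converters.
struct Quant { std::vector<unsigned char> payload; bool enhanced = false; };

// Sketch of an HRTA-compliant converter: the mandatory part (C_M) always
// produces the lowest quality acceptable (LQA), the optional part (C_O) runs
// only if idle resources are left, and the delivery part (C_D) selects and
// hands over the result.
class HrtaConverter {
public:
    Quant process(const Quant& in, bool idle_resources_available) {
        Quant base = mandatory(in);            // C_M: guaranteed LQA result
        if (idle_resources_available) {
            enhance(in, base);                 // C_O: optional refinement
            base.enhanced = true;
        }
        return deliver(base);                  // C_D: integrate and hand over
    }

protected:
    virtual Quant mandatory(const Quant& in) { return in; }
    virtual void  enhance(const Quant& in, Quant& partial) { (void)in; (void)partial; }
    virtual Quant deliver(Quant result) { return result; }
};
```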


The hard real-time adaptive model of a media converter is flexible and still supports the quant-based all-or-nothing method, i.e. it can be applied at different levels of processing quality, such as dropping information at different granularity. A quant, regardless of whether it is an audio sample or a video frame, can be dropped completely in such a way that C_M is empty—no processing is defined—, C_O does all the processing, and C_D simply delivers the output of C_O (if it completed its execution) or reports the quant drop (in which case no LQA is provided). However, the advantage of the model would be wasted in such a case, because its biggest benefit is the possibility to influence the conversion algorithm during the processing of a quant and to keep the partial results, thus raising the final quality.

For example, a video frame may be coded only partially by including only a subset of macro blocks, while the remaining set of MBs is dropped. In other words, the data dropping is conducted on the macro-block level. This obviously raises the quality, because instead of having 0% of the frame (frame skipped) and losing the spent resources, the result includes something between 0% and 100% and no resources are wasted.

VII.2.3. Dresden Real-time Operating System as RTE for RETAVIC

The requirement of real-time transformation and QoS control enforces embedding the conversion graph in the real-time environment (RTE). The reliable RTE can be provided by the real-time operating system, because then reservations of processing time and of resources can be guaranteed. A suitable system is the Dresden Real-Time Operating System (DROPS), which aims at supporting the real-time applications with different timing modes. Moreover, it supports both timesharing (i.e. best-effort) and real-time threads running on the same system where the timesharing threads do not influence the real-time threads [Härtig et al., 1998], and are allowed to be executed only on unreserved (idle) resources.

Architecture
DROPS is a modular system (Figure 55) based on the Fiasco microkernel, which offers event-triggered or time-triggered executions. Fiasco is an L4-based microkernel providing fine-grained as well as coarse-grained timer modes for its scheduling algorithm. The fine-grained timer mode (called one-shot mode) has a precision of 1 µs and is thus able to produce a timer interrupt for every event at the exact time it occurs. In contrast, the coarse-grained timer mode


generates interrupts periodically and is the default mode. The interrupt period has a granularity of roughly 1 ms (976 µs), which might yield unacceptable overhead for applications demanding small periods. On the other hand, the fine-grained timer may introduce additional switching overhead. Three types of clock are possible:

• RTC – the real-time clock generates timer interrupts on IRQ8 and is the default mode
• PIT – generates timer interrupts analogously to RTC but works on IRQ0; it is advised for use with VMware machines and does not work with profiling
• APIC – the most advanced mode using the APIC timer, where the next timer interrupt is computed each time for the scheduling context

The RTC and PIT allow only for coarse-grained timers, and only the APIC mode allows for both timer modes. Thus if the application requires precise scheduling, the APIC mode with one-shot option has to be chosen.

Figure 55. DROPS Architecture [Reuther et al., 2006].

Fiasco uses non-blocking synchronization, ensuring that higher-priority threads do not block waiting on lower-priority threads or the kernel (avoiding priority inversion), and supports static priorities allowing fully preemptible execution [Hohmuth and Härtig, 2001]. In comparison to RTLinux, another real-time operating system, Fiasco can deliver a smaller response time on interrupts for event-triggered executions, such that the maximum response time may be guaranteed [Mehnert et al., 2003]. Additionally, a real-time application may be executed at a given scheduled point of time (time-triggered), where DROPS grants the resources to the given thread, controls the state of the execution and interacts with the thread according to the scheduled time point, thus ensuring QoS control and accomplishment of the task.

Scheduling
The quality assuring scheduling (QAS) [Hamann et al., 2001a] is adopted as the scheduling algorithm for DROPS. The implementation of QAS works such that the preemptive tasks are ordered according to priorities, and from all threads ready to run in the given scheduling period, the thread with the highest priority is always executed. If a thread with a higher priority than the current one becomes ready, i.e. the next period for the given real-time thread starts, the current thread is preempted and the new thread is assigned to the processor. Among threads with the same priority in the same period, a simple round-robin scheduling algorithm is applied. Thus, a time quantum and a priority must be assigned to each thread in order to control the scheduling algorithm on the application level; if one or both of them are not assigned, default values are used. Moreover, a thread with default values is treated as a time-sharing (i.e. non-real-time) thread with the lowest priority and no time constraints.

Real-time thread model
A periodic real-time thread in the system is characterized by its period and arbitrarily many timeslices. The timeslices can be classified into mandatory timeslices, which always have to be executed, and optional timeslices, which improve the quality but can be skipped if necessary. Any combination of mandatory and optional timeslices is imaginable, as the type of a timeslice is only subject to the programmer of the thread on the application level, not to the kernel. Each timeslice has two required properties: length and priority. The intended summed length of the timeslices, together with the length of the period, makes up the thread's reserved context, as shown in Figure 56. In other words, the reserved context has to define the deadline (i.e. the end of the period) and the intended end of each timeslice (if there are three timeslices, then three timeslice ends are defined for each period).


Figure 56. Reserved context and real events for one periodic thread.

Nevertheless, the work required within one timeslice might consume more time than estimated, so that the scheduled thread does not finish its work by the given end of the timeslice, i.e. the timeslice exceeds the reserved time. The kernel is able to recognize such an event and reacts by sending a timeslice overrun (TSO) IPC message connected with the thread's timeslice to the preempter thread (Figure 57), which runs in parallel to the working thread of the application at the highest priority. Similarly, the kernel is able to recognize a deadline miss (DLM) and to send the respective IPC call. A deadline miss happens to a thread when the end of the period is reached and the thread has not yet signaled to the kernel that it is waiting for the end of the period. Normally, the kernel does not communicate directly with the working thread.

The scheduling context 0 (SC0) is meant as the time-sharing part and is always included in any application. A real-time application additionally requires at least one scheduling context (numbered from 1 to n). SC0 is then used as the empty (idle) time-sharing part allowing the thread to wait for the next period. The kernel does not distinguish between mandatory and optional timeslices, but only between scheduling contexts, so it is the responsibility of the application developer to handle each scheduling context according to the timeslice's type by defining its reserved context through a mapping from the timeslice type to the scheduling context. Consequently, SC1 is understood as the mandatory part with the highest priority, SC2…SCn are meant for the optional parts, each with a lower priority than the previous one, and SC0 is the time-sharing part with the lowest priority.
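A possible mapping from the HRTA parts to the scheduling contexts described above can be sketched as follows. The reservation call is a hypothetical placeholder standing in for the corresponding DROPS/L4 primitive and is not the actual kernel API; only the SC1/SC2/SC0 assignment is illustrated:

```cpp
#include <cstdint>

// Hypothetical description of one reserved timeslice: length plus priority,
// mirroring the thread model described above.
struct Timeslice { std::uint64_t length_us; unsigned priority; };

// SC1 carries the mandatory part with the highest priority, SC2 an optional
// part with a lower priority; SC0 is the unreserved time-sharing remainder.
struct HrtaReservation {
    std::uint64_t period_us;
    Timeslice mandatory;   // mapped to scheduling context SC1 (C_M and C_D)
    Timeslice optional;    // mapped to scheduling context SC2 (C_O)
};

// Stub standing in for the real reservation primitive offered by the RTOS.
static bool reserve_context(unsigned /*sc*/, const Timeslice&, std::uint64_t /*period*/) {
    return true;
}

inline bool setup_hrta_thread(const HrtaReservation& r) {
    return reserve_context(1, r.mandatory, r.period_us)   // SC1: mandatory
        && reserve_context(2, r.optional,  r.period_us);  // SC2: optional
}
```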

Figure 57. Communication in DROPS between kernel and application threads (incl. scheduling context).

VII.2.4. DROPS Streaming Interface

The DROPS streaming interface (DSI) has been designed to support the transfer of large amounts of timed data, which is typical for multimedia applications. In particular, it provides controlling facilities enabling real-time applications and efficient data transfer. The DSI implements a packet-oriented zero-copy transfer where the traffic model is based on JCPS. It allows for sophisticated QoS control by supporting non-blocking transport, synchronization, time-limited validity, data (packet) dropping, and resynchronization. Other RTOSes like QNX [QNX, 2001] and RTLinux [Barabanov and Yodaiken, 1996] do not provide such a concept for data streaming.


[Figure: DSI application model — a control application 1) initiates the stream, 2) assigns the stream (shared data area and shared control area) to the producer and the consumer, which then 3) perform signaling and data transfer directly.]

Figure 58. DSI application model (based on [Reuther et al., 2006]).

The DSI application model is depicted in Figure 58 in order to explain how the DSI is used for real-time interprocess communication [Löser et al., 2001b]. Three types of DROPS servers holding socket references are involved: the control application, the producer, and the consumer. The zero-copy data transfer is provided through shared memory areas (called a stream), which are mapped into the servers participating in the data exchange—one as producer and the other as consumer. A control application sets up the connection by creating the stream in its own context and then assigning it to the producer and the consumer through socket references. The control application initiates but is not involved in the data transfer—the producer and the consumer are responsible for arranging the data exchange independently of the control application through signaling employing the ordered control data area in the form of a ring buffer with a limited number of packet descriptors [Löser et al., 2001b]. The control data kept in a packet descriptor, such as the packet position (packet sequence number), the timestamp, and pointers to the start and end positions of the timed data, allow arranging the packets arbitrarily in the shared data area [Löser et al., 2001a].
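The three-step setup of Figure 58 can be outlined in a short C++ sketch; the handle types and function names are hypothetical placeholders and do not reproduce the real DSI interface, only the control flow:

```cpp
#include <cstddef>

// Hypothetical handles; the real DSI works with socket references and a
// shared control/data area created by the control application.
struct StreamHandle { int id; };
struct SocketRef    { int id; };

// Stubs standing in for the DSI calls (illustrative only).
static StreamHandle create_stream(std::size_t /*data_bytes*/, std::size_t /*descriptors*/) {
    return StreamHandle{1};
}
static SocketRef assign_producer(StreamHandle s) { return SocketRef{s.id * 10 + 1}; }
static SocketRef assign_consumer(StreamHandle s) { return SocketRef{s.id * 10 + 2}; }

// 1) initiate the stream, 2) assign it to producer and consumer; afterwards
// 3) both exchange packets directly over the shared areas (zero copy) and
// signal each other without the control application.
void setup_connection() {
    StreamHandle stream = create_stream(4 * 1024 * 1024, 64);
    SocketRef producer  = assign_producer(stream);
    SocketRef consumer  = assign_consumer(stream);
    (void)producer; (void)consumer;  // handed over to the converter servers
}
```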

The communication can be done in blocking or non-blocking mode [Löser et al., 2001b]. The non-blocking mode must rely on correct scheduling, the virtual-time mechanism, and polling or signaling in order to avoid data loss, but the communication avoids the IPC overhead of the un-/blocking mechanism and is therefore a few times faster than the blocking method. On the other hand, the blocking mode handles situations in which the required data is not yet available (empty buffer) or the processed data cannot be sent (full buffer). Additionally, the DSI supports handling data omission [Löser et al., 2001a], e.g. when frame dropping occurs in video processing.

Summarizing, the DSI provides the required streaming interface for the real-time processing of multimedia data. It delivers efficient continuous data transfer avoiding costly copy operations between processes, which is critical in the RETAVIC architecture. It is used with the fastest variant, static mapping in non-blocking mode, in contrast to the slower blocking mode. The possibility of using dynamic mapping is put aside due to its higher costs, which grow linearly with the packet size of the transferred data [Löser et al., 2001b].

VII.2.5. Controlling the Multimedia Conversion

Most of the existing software implementations of media converters (e.g. XVID, FFMPEG) have not been designed for any RTOS but rather for wide-spread best-effort systems such as Microsoft Windows or Linux. Thus, there is no facility for controlling the converter during processing with respect to the processing time. The only exception is stopping the converter; however, this is not a controlling facility in our understanding. By a controlling facility we mean the minimal interface implementation required for the interaction between the RTOS, the processing thread, and the control application. A proposal of such a system was given by [Schmidt et al., 2003]; thus it is only shortly explained here, and the problem is not further investigated within the RETAVIC project.

VII.2.5.1 Generalized control flow in the converter

The converters differ from each other in terms of their processing function, but their control flow can be generalized. The generalization of the control flow has been proposed in [Schmidt et al., 2003] and is depicted in Figure 59. Three parts are distinguished: pre-processing, control loop, and post-processing. The pre-processing allocates memory for the data structures on the media-stream level and arranges the I/O configuration, while the post-processing frees the allocated memory and cleans up the created references. The control loop iterates over all incoming media quanta and produces other quanta. In every iteration of the loop, a processing function is executed on the quant taken from the buffer. This processing function does the conversion of the media data, and the output of the function is then written to the buffer of the next converter.


[Figure: generalized control flow — pre-processing, then a control loop (read a quant, processing function, write a quant), then post-processing.]

Figure 59. Generalized control flow of the converter [Schmidt et al., 2003].

The time constraints connected with the processing have to be considered here. Obviously, the processing function occupies most of the time; however, the other parts cannot be neglected (as already shown in Table 5). The stream-related pre- and post-processing is assumed to be executed only once, during the conversion-chain setup. If there were some quant-related pre- and post-processing, it would become a part of the processing function. The time constraints are considered in the scheduling of the whole conversion chain in cooperation with the OS.
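The generalized control flow of Figure 59 can be sketched in a few lines of C++; the buffer interfaces are hypothetical stand-ins for the streaming interface (e.g. DSI or CSI) and only the structure is reproduced:

```cpp
#include <optional>

struct Quant { /* media quantum (payload omitted in this sketch) */ };

// Hypothetical buffer interfaces; read() returns no value when the stream ends.
struct InBuffer  { std::optional<Quant> read() { return std::nullopt; } };
struct OutBuffer { void write(const Quant&) {} };

class GenericConverter {
public:
    void run(InBuffer& in, OutBuffer& out) {
        pre_process();                         // stream-level setup, executed once
        while (auto q = in.read()) {           // control loop over all incoming quanta
            out.write(process(*q));            // processing function + write a quant
        }
        post_process();                        // free memory, clean up references
    }
protected:
    virtual void  pre_process() {}
    virtual Quant process(const Quant& q) { return q; }
    virtual void  post_process() {}
};
```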

VII.2.5.2 Scheduling of the conversion graph

Each converter uses resources of the underlying hardware and to provide guarantees for timely execution, the converters must be scheduled in RTE. In general, the scheduling process can be divided into multiple steps, starting with a user request and ending with a functionally correct and successfully scheduled conversion chain. The algorithm of the whole scheduling process proposed in the cooperative work on memo.REAL [Märcz et al., 2003] is depicted in Figure 60.

Construct the conversion graph
The first step of the scheduling process creates the conversion graph for the requested transformation. In this graph, the first converter accepts the format of the source media objects from the source, and the last converter produces the destination format requested by the user through the sink. Moreover, each successive pair of converters must be compatible, so that the output produced by the producer is accepted as input by the consumer, i.e. there is a functionally correct coupling.


[Figure: scheduling of the conversion graph — interaction between the media server and the resource managers of the RTOS.]

Figure 60. Scheduling of the conversion graph [Märcz et al., 2003].

There can be several functionally correct conversion graphs, since some of the conversions can be performed in arbitrary order, or there exist converters with different behavior but the same processing function. The advantage of this is that the graph with the lowest resource consumption can be chosen. In fact, some of the graphs may request more resources than available, and thus cannot be considered for scheduling even if they are functionally correct.

A method for constructing a transcoding chain from a pool of available converters has been proposed by [Ihde et al., 2000]. The algorithm uses the capability advertisement (CA) of the converter, represented in a simple Backus-Naur Form (BNF) grammar, and allows for media-type transformations. However, the approach does not consider the performance, i.e. the quantitative properties of the converters, thus no distinction is made between two functionally equal chains.


Predict quant data volume
Secondly, a static resource planning is done individually for the various resource usages based on a description of the resource requirements of each converter. This yields the data volume that a converter turns over on each resource when processing a single media quant. Examples are the data volumes that are written to disk, sent through the network, or calculated by a processing function.

Calculate bandwidth
In the third step, resource requirements in terms of bandwidth are calculated with the help of the quant rate (e.g. frame, sample, or GOP rate). The quant rate is specified by the client request, by the media object format description, and by the function of each converter. The output of this step is the set of bandwidths required on each resource, which may be calculated as shown below in Equation (44).

Check and allocate the resources
Fourth, the available bandwidth of each resource is inquired. With this information, a feasibility analysis of all conversion graphs is performed:

a) Each conversion graph is tested as to whether the available bandwidth on all resources suffices to fulfill the requirements calculated in 3rd step. If this is not the case, the graph must be discarded.

b) Based on the calculated bandwidth (from 3rd step), the runtime for each converter is computed using the data volume processed for a single quant. If the runtime goes beyond the limits defined by the quant rate, the graph is put aside.

c) The final part of the feasibility analysis is to calculate the buffer sizes e.g. according to [Hamann et al., 2001b] where the execution time follows the JCPS model. If some buffer sizes emerge as too large for current memory availability, the whole feasibility analysis must be repeated with another candidate graph.

The details on the feasibility analysis in bandwidth-based scheduling can be found in [Märcz and Meyer-Wegener, 2002]. The available bandwidth of a resource (B_R) is defined with respect to the resource capacity in terms of the maximum data rate (C_R) and the resource utilization (η_R) such that:

B_R = (1 − η_R) · C_R    (42)

The data volume s_R on the resource for a given quant is measured as [Märcz and Meyer-Wegener, 2002]:

s_R,i = C_R · t_R,i = D_R,i · t′_R,i    (43)

where t_R,i is the time required by the processing function for the i-th quant when the converter uses the resource R exclusively, i.e. the used resource bandwidth D_R,i occupies 100% of the resource. In the case of parallel execution with a lower used resource bandwidth (i.e. a percentage below 100%), the longer time t′_R,i on the resource is required by the i-th quant.

The required bandwidth Q_R of each resource can be calculated using the desired output quant rate per second of the converter (f_rate) as [Märcz and Meyer-Wegener, 2002]:

Q_R = S(s_R) · f_rate    (44)

where S is the average size, a function of the given data volume processed on the specific resource (in analogy to the first parameter of JCPSs).

Finally, the feasibility check of the bandwidth allocation is done according to [Märcz and Meyer-Wegener, 2002]:

∑_x Q_R,x ≤ B_R    (45)

where the required bandwidths on the given type of resource R are summed over all x converters, and this sum must not exceed the available bandwidth B_R of that resource (i.e. overlapping bandwidth reservations are not allowed).
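The feasibility check of equations (42)–(45) reduces to a simple summation; the following C++ sketch uses illustrative names and units and is not the memo.REAL implementation:

```cpp
#include <vector>

// One resource of the RTE, e.g. CPU, disk, or network.
struct Resource {
    double capacity;      // C_R: maximum data rate of the resource
    double utilization;   // eta_R: fraction already in use (0..1)
    double available() const { return (1.0 - utilization) * capacity; }  // (42) B_R
};

// Per-converter usage of that resource.
struct ConverterUsage {
    double avg_quant_volume;  // S(s_R): average data volume per quant on this resource
    double quant_rate;        // f_rate: produced quanta per second
    double required() const { return avg_quant_volume * quant_rate; }    // (44) Q_R
};

// (45): the sum of the required bandwidths of all converters in the graph
// must not exceed the available bandwidth of the resource.
bool feasible(const Resource& r, const std::vector<ConverterUsage>& converters) {
    double sum = 0.0;
    for (const auto& c : converters) sum += c.required();
    return sum <= r.available();
}
```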

In case no graph passes all the tests, the client has to be involved to decide on the measures to be taken. For example, it might consider lowering the QoS requirements. In the better case of more than one graph being left, any of them can be selected using different criteria. Minimum resource usage would be an obvious criterion. Finally, the bandwidth-based resource reservations should be conveyed to the resource manager of the operating system.

Summarizing, the result of the scheduling process is a set of bandwidths (dynamic resources) and buffer sizes (static resources) required to run the converters with a guaranteed Quality of Service. To avoid resource scheduling by worst-case execution time one of the already discussed models could be exploited.

VII.2.5.3 Converter’s time prediction through division on function blocks

The time required by the processing function, mentioned in Equation (43), can be measured physically on the running system. However, this time depends on the content of the video sequences (section VII.2.1), so the measured time may vary from sequence to sequence even when it is determined on a per-quant basis. Here, the proposed static meta-data may be very helpful. The idea is to split the converter into basic execution blocks (also called function blocks [Meyerhöfer, 2007]) which process subsets of the given quant (usually iteratively in a loop) and have defined input and output data; for video, the split can be done on the macro-block and then on the block level, e.g. by detailing the LLV1 algorithm in Figure 13 (on p. 101) or the MD-XVID algorithm in Figure 17 b) (on p. 113). The edges of the graphs should represent the control flow rather than the data flow, since the focus is on the processing times; however, the data flow of the multimedia data should not be neglected due to the accompanying transfer costs and buffering, which also influence the final execution time. Moreover, the data dependency between subsequent converters must be considered.

It is natural in media processing that the defined function blocks can have internally multiple alternatives and completely separated execution paths. Of course, it may happen that only one execution path exists or that a nothing-to-do execution path is included. In any case, the execution paths are contextual, i.e. they depend on the processed data, and the decision about which path to take is made based on the current context values of the data. To prove this assumption, the different execution times for different frame types within a given function block should be measured on the same platform (with statistical errors eliminated).


Additionally, the measurement could deliver the platform-specific factors usable by the processing time calculation and prediction if executed for the same data on different platforms. Some details on calculating and measuring the processing time are given in the upcoming section VII.3 Design of Real-Time Converters.

VII.2.5.4 Adaptation in processing

If any mismatch exists between the predicted time and the actually consumed time, i.e. when the reserved time for the processing function is too small or too big, adaptation in the converter processing can be employed. In other words, adaptation during execution is understood as a remedy for an imprecise prediction of the processing time.

Analogously to the hard real-time adaptive model of media converters, the converter including its processing function has to be reworked such that the defined parts of the HRTA converter model are included. Moreover, a mechanism for coping with overrun situations has to be included in the HRTA-compliant media converter.

VII.2.6. The Component Streaming Interface

The separation of the processing part of the control loop from the OS environment and from any data-streaming functionality was the first requirement of the conversion-graph design. The other one was to make the converter's implementation OS-independent. Both goals are realized by the component streaming interface (CSI) proposed by [Schmidt et al., 2003]. The CSI is an abstraction implementing the multimedia converter interface, providing the converter's processing function with a media quant and pushing the produced quant back to the conversion chain for further data flow to the subsequent converter. This abstraction exempts the converter's processing function from dealing directly with an OS-specific streaming interface like DSI, as depicted in Figure 62 a). An obvious benefit of the CSI is the option to adapt a CSI-based converter to another real-time environment with a different streaming interface without changing the converter code itself.

On the converter side, the CSI is implemented in a strictly object-oriented manner. Figure 61 shows the underlying class concept. The converter class itself is an abstract class representing only the basic CSI functionality. For each specific converter implementation, the programmer is supposed to override at least the processing function of the converter. Some examples of implemented converters (filter, multiplexer) are included in the UML diagram as subclasses of the Converter class.
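The override pattern can be illustrated with a small C++ sketch; the class shape and method signature below are assumptions echoing the simplified OO model of Figure 61, not the actual CSI code:

```cpp
#include <cstdint>
#include <vector>

// Assumed shape of a media quantum exchanged through the CSI buffers.
struct Quant { std::vector<std::uint8_t> data; };

// Simplified abstract CSI converter: concrete converters override at least
// the processing function.
class Converter {
public:
    virtual ~Converter() = default;
    virtual Quant process(const Quant& in) = 0;
};

// Example analogous to the Filter subclass of Figure 61: a trivial filter
// that passes every quantum through unchanged.
class PassThroughFilter : public Converter {
public:
    Quant process(const Quant& in) override { return in; }
};
```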

[Figure: simplified UML class diagram with ConverterChain, ConverterConnection, Converter, JCPSendBuffer, JCPReceiveBuffer, Sender, Receiver, and the Converter subclasses Filter and Multiplexer.]

Figure 61. Simplified OO-model of the CSI [Schmidt et al., 2003].

[Figure: a) a chain of CSI converters coupled through DSI; b) a ConverterChain with ConverterConnections controlling the Converters via IPC, each Converter using a JCPReceiveBuffer and a JCPSendBuffer towards the DSI buffers.]

Figure 62. Application model using CSI: a) chain of CSI converters and b) the details of control application and converter [Schmidt et al., 2003].

The chain of converters using the CSI interface is depicted in Figure 62 a), and the details of the application scenario showing the control application managing a ConversionChain with ConverterConnections and the CSI Converters are shown in Figure 62 b) [Schmidt et al., 2003]. The prototypical Converters run as stand-alone servers under DROPS. In analogy to the model of a DSI application (Figure 58), a control application manages the setup of the conversion chain and forwards user-interaction commands to the affected converter applications using interprocess communication (IPC). The ConverterConnection object is affiliated with the Converter running in the run-time RTE and employs the JCPReceiveBuffer and JCPSendBuffer, which help to integrate the streaming operations on multimedia quanta. The prototypical implementation allowed setting up a conversion chain and playing an AVI video, which was converted online from the RGB to the YUV color space [Schmidt et al., 2003].

The CSI prototype was initially selected as the environment for embedding the media converters. However, there have been problems with the versioning of DROPS (see the Implementation chapter), and the CSI implementation has not been maintained anymore due to the early closing of the memo.REAL project. Thus, the CSI has not been used by the prototypes explained in the later part of this work, and the DSI has been used directly. Still, the CSI concepts developed within the memo.REAL project are considered useful and applicable in a further implementation of the control application of the RETAVIC architecture.

VII.3. Design of Real-Time Converters

The best-effort implementations introduced previously and evaluated shortly in Evaluation of the Video Processing Model through Best-Effort Prototypes (section V.6) are the groundwork for the design of the real-time converters. The real-time converters must be able to run in the real-time environment under time restrictions, and as such have to meet certain QoS requirements to allow for interaction during processing. When porting a best-effort converter to a real-time one, the run-time RTE and the selected processing model have to be considered already in the design phase. Since the decision for DROPS with DSI has been made, the converters have to be adapted to this RTOS and to its streaming interface. However, before going into the implementation details, the design decisions connected with the additional functionality supporting the time constraints are investigated, and proposals to rework the converters' algorithms are given.

A systematic and scientific method of algorithm refinement is exploited during the design phase, in analogy to OS performance tuning with its understanding, measuring, and improving steps [Ciliendo, 2006]. These three steps are further separated into the low-level design, which delivers the knowledge about the platform-specific factors influencing processing efficiency, and the high-level design, where the mapping of the algorithms onto the logical level occurs.


Obviously, the low-level design does not influence the functionality of the processing function but demonstrates the influence of the elements of the run-time environment, which includes the hardware and the OS specifics. In contrast, the high-level design does not cover any run-time-specific issue and may lead to changes of the converter's algorithm providing different functionality and thus to altered output results. Furthermore, the quantitative properties shall be investigated with respect to the processing time after each modification implementing a new concept, in order to prove whether the expected gains are achieved. This investigation is called quantitative time analysis in the performance evaluation [Ciliendo, 2006], and it distinguishes between processing-time variations deriving from the algorithmic or implementation-based modifications and other time deviations related to measurement errors (or side effects)—the latter can be classified into statistical errors deriving from the measurement environment and structural errors caused by the specifics of the hardware system [Meyerhöfer, 2007].

VII.3.1. Platform-Specific Factors

There are a few classes of the low-level design in which the behavior and the capabilities of the run-time environment are investigated. These classes are called platform-specific factors and include: hardware-architecture-dependent processing time, compiler optimizations, priorities in thread models, multitasking effects, and caching effects.

VII.3.1.1 Hardware architecture influence

The hardware architecture of the computer should first of all allow for running the software-based RTOS, and secondly, it has to be capable of coping with the requested workload. The performance evaluation of a computer hardware system is usually conducted by well-defined benchmarks delivering a deterministic machine index (e.g. BogoMips evaluating the CPU in the Linux kernel). Since such indexes are hard to interpret in relation to multimedia processing and scheduling, and the CPU clock rate alone cannot tell everything about the final performance, a simple video-encoding benchmark has been defined in order to deliver a kind of multimedia platform-specific index usable for scheduling and processing prediction. In such a case, the obtained measurements should reflect not just one specific subsystem, but the efficiency of the integrated hardware system with respect to the CPU with its L1, L2 and L3 cache sizes, the pipeline length and branch prediction, the implementation of the instruction set architecture (ISA) with SIMD94 processor extensions (e.g. 3DNow!, 3DNow!+, MMX, SSE, SSE2, SSE3), and other factors influencing the overall computational power. Thus, the index could easily characterize the machine and allow for a simple hardware representation in the prediction algorithm.

The video-encoding benchmark has been compiled once with the same configuration, i.e. the video data (the first four frames of Carphone QCIF), the binaries of the system modules, the binaries of the encoder, and the parameters have not been changed. The benchmark has been executed on two different test-bed machines called PC (using an Intel Mobile processor) and PC_RT (with an AMD Athlon processor), which are listed in Appendix E. The run-time environment configuration (incl. DROPS modules) is also stated in this appendix in the section Precise measurements in DROPS. The average time per frame of the four-times-executed processing is depicted in Figure 63, where the minimum and maximum values are also marked.

[Figure: "Hardware Architecture Comparison – Encoding Time" — processing time in ms (about 7–10 ms) per frame 1–4 for the Pentium4- and AMD-based machines, with min/max marks.]

Figure 63. Encoding time of the simple benchmark on different platforms.

It can be noticed that the AMD-based machine is faster; however, that was not the point. Based on these time measurements, an index was proposed such that the execution time of the first frame in the first run is considered the normalization factor, i.e. the other values are normalized

94 SIMD stands for single instruction, multiple data (in contrast to SISD, MISD, or MIMD). It usually refers to processor architecture enhancements including floating-point and integer-based multimedia-specific instructions: Multimedia Extensions (MMX; integer-only), Streaming SIMD Extensions (SSE 1 & 2), AltiVec (for Motorola), the 3DNow! family (for AMD), SSE3 (known as Prescott New Instructions – PNI) and SSE4 (known as Nehalem New Instructions – NNI). A short comparison of SIMD processor extensions can be found in [Stewart, 2005].


to this value and depicted in Figure 64. The average standard deviation of this index with respect to a specific platform and a specific frame is smaller than 0.7%, which is interpreted as the measurement error of multiple runs (and may be neglected). However, the average standard deviation computed with respect to a given frame but across different platforms is equal to 6.1% for all frames and 8.0% for only the predicted frames—such a significant deviation can be regarded neither as measurement error nor as side effect.
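The normalization itself is trivial; a minimal sketch (illustrative names only) of the proposed index calculation looks as follows:

```cpp
#include <vector>

// The proposed index normalizes every per-frame encoding time by the time of
// the first frame of the first run on the same machine.
std::vector<double> machine_index(const std::vector<double>& frame_times_ms) {
    std::vector<double> index;
    if (frame_times_ms.empty()) return index;
    const double norm = frame_times_ms.front();   // first frame, first run
    index.reserve(frame_times_ms.size());
    for (double t : frame_times_ms) index.push_back(t / norm);
    return index;
}
```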

[Figure: "Hardware Architecture Comparison – Machine Index" — execution time normalized to the first frame of the first run (about 0.7–1.3) per frame 1–4 for the Pentium4- and AMD-based machines.]

Figure 64. Proposed machine index based on simple multimedia benchmark.

As a result, the proposed index cannot be used in the prediction of the processing time, because it does not behave in a similar way on different platforms. The various types of frames are executed in different ways on diverse architectures, thus a more sophisticated measure should be developed that allows reflecting the architecture specifics in the scheduling process.

VII.3.1.2 Compiler effects on the processing time

Compiler optimization is another aspect related to the platform-specific factors. It applies whenever source code in a higher-level language (e.g. C/C++) has to be translated into a machine-understandable language, i.e. binary or machine code (e.g. assembler). Since the whole development of the real-time converters (including the source code of the RTOS) is done in C/C++, the language-specific effects have been shortly investigated. The source code of MD-XVID as well as of MD-LLV1 is already assembler-optimized for IA3295-compatible systems; however, there is still an option to turn the optimization off (to investigate the speed-up or to support other architectures).

A simple test has been executed to investigate the efficiency of the executable code delivered by different compilers [Mielimonka, 2006]. MD-XVID has been compiled with assembler optimizations using four different versions of the well-known open-source GNU Compiler Collection (gcc). The executables have then been run on the test machine PC_RT and the execution time has been measured. Additionally, the assembler optimizations have been investigated by turning them off for the most recent version of the compiler (gcc 3.3.6).

[Figure: a) whole-sequence encoding time in seconds (about 1.46–1.64 s) for gcc 2.95.4, gcc 3.3.6, gcc 3.4.5, gcc 4.0.2 and gcc 3.3.6 without assembler optimizations; b) the same times normalized to gcc 3.3.6 (about 98%–108%).]

Figure 65. Compiler effects on execution time for media encoding using MD-XVID.

The results for the encoding of the Carphone QCIF sequence are depicted in Figure 65. Part a) shows the execution time measured for the whole sequence, and part b) presents the values normalized to gcc 3.3.6 in order to better depict the differences between the measured times. It can be noticed that the oldest compiler (2.95.4) is about 6.8% slower than the fastest one. Moreover, the assembler optimizations done by hand are also conducted by the compiler during the compilation process. Even though the hand-made optimizations seem to be slightly better (99.93% of the normalization factor), the difference is still within the measurement-error range of below 0.7% (as discussed in the previous section).

95 IA32 is an abbreviation of 32-bit Intel™ Architecture.


Finally, it can be concluded that the compiler significantly influences the execution time, and thus only the fastest compiler should be used in the real-time evaluations. Additionally, the decision not to optimize the code further by hand is taken, since the gains are small or unnoticeable. Even more, the use of the fastest compiler not only delivers better and more efficient code, but also keeps the opportunity to use the higher-level source code on platforms other than IA32 (which was the assembler-optimization target).

VII.3.1.3 Thread models – priorities, multitasking and caching

Obviously, the thread models and execution modes existing in any OS influence the way every application runs under the given OS. A real-time system already provides some advantages usable for multimedia processing, such as the controlled use of resources, time-based scheduling, or reservation guarantees used by the QoS control. On the other hand, the application must be able to utilize such advantages.

One of the important benefits of DROPS is the possibility to assign priorities to the device drivers, which derives from the microkernel construction, i.e. the device drivers are treated as user processes. The assignment of a lower priority to a device driver allows for lowering or sometimes even avoiding96 the influence of device interrupts, which is especially critical for real-time applications using memory actively, due to potential overload situations deriving from the PCI-bus sharing between memory and device drivers and causing deadline misses [Schönberg, 2003].

Multitasking (or preemptive task switching) is another important factor influencing the execution of an application's thread, because the processor timeline is not equal to the application's processing timeline, i.e. the thread may be active or inactive for some time. The problem for a real-time application is to obtain enough active time to finish its work before the defined deadlines. However, DROPS uses fixed-priority-based preemptive scheduling (with a round-robin algorithm for equal priority values) for non-real-time threads, analogously to a best-effort system, i.e. the highest-priority thread is

96 This is possible only when a device not needed in the multimedia server is not selected at all, e.g. drivers for USB or IEEE hubs and connectors are not loaded.


activated first and equal-priority threads share the processor time equally. Thus, an investigation has been conducted in which the MD-XVID video encoder (1st thread) and a mathematical program (2nd thread) have been executed in parallel with equal priorities. The execution time according to the processor timeline (not the application timeline) has been measured for the encoder in the parallel environment (Concurrent) and compared to the stand-alone execution (Single); the results are depicted in Figure 66. It is clearly noticeable that the concurrent execution takes much more time for some frames than the single one—namely, for those frames where the thread was preempted into the inactive state and was waiting for its turn.

[Figure: "Multitasking Effect" — per-frame execution time in ms (0–30 ms) for frames 1–94 of the encoder in concurrent and single execution.]

Figure 66. Preemptive task switching effect (based on [Mielimonka, 2006]).

It is obvious that a multitasking system is required for a multimedia server, where many threads are the usual case, but preemptive task switching as shown in Figure 66 is too dangerous for real-time applications. Thus, a mechanism for QoS control and time-based scheduling is required; luckily, DROPS already provides a controlling mechanism, namely the QAS, which is applicable here. The QAS may be used with the admission server controlling the current use of resources (especially CPU time) and reserving the required active time of the given real-time thread by allocating the periodic timeslice within the period provided by the resource.


VII.3.2. Timeslices in HRTA Converter Model

Having discussed the platform-specific factors, the timeslice for each part defined in the HRTA converter model has to be introduced. In contrast to the thread model in DROPS (Figure 57 on p. 182), where only one mandatory timeslice, any number of optional timeslices, and one empty (idle) time-sharing timeslice exist, the time for each part of the HRTA converter model is depicted in Figure 67. According to the definition of the HRTA converter model, two mandatory timeslices and one optional timeslice are defined: the time t_base_ts of the mandatory base timeslice is defined for C_M, and analogously t_enhance_ts of the optional enhancement timeslice for C_O, and t_cleanup_ts of the mandatory cleanup timeslice for C_D.

[Figure: one period divided into the timeslices t_base_ts, t_enhance_ts, t_cleanup_ts and t_idle_ts.]

Figure 67. Timeslice allocation scheme in the proposed HRTA thread model of the converter.

The last time value, called t_idle_ts, is introduced in analogy to the time-sharing part of the DROPS model. It is a nothing-to-do part in the HRTA-compatible converter and is used for the inactive state in the multitasking system—other threads may be executed in this time—or, if a timeslice overrun happened, the other converter parts can still exploit this idle time. The period depicted in Figure 67 is assumed to be constant and derives directly from the target frequency F of the converter output (i.e. the period equals the inverse of the target frame rate, 1/fps); it definitely does not have to be equal to the length of the period as defined by T in JCPSt.


VII.3.3. Precise time prediction

The processing time may be predicted based on statistics and real measurements, but one has to keep in mind that it has already been demonstrated that the multimedia data influence the processing time. Therefore, three methods have been investigated during the design of the processing-time estimation:

1) Frame-based prediction

2) MB-based prediction

3) MV-based (or block-based) prediction

They are ordered by complexity and accuracy i.e. the frame-based prediction is the simplest one but the MV-based seems to have the highest accuracy. Moreover, the more complex the estimation algorithm is, the more additional meta-data it requires. So, these methods are directly related to the different levels of the static MD set as given in Figure 12 (on p. 98).

All these methods depend on both: the platform characteristics (as discussed in section VII.3.1) and the converter behavior. The data influence is respected by each method with small to high attention by using a certain subset of the proposed meta-data. Anyway, all of the methods require two steps in the estimation process:

a) Learning phase – where the platform characteristics in context of the used converter and a given set of video sequences are measured and a machine index is defined (please note that the simple one-value index proposed in VII.3.1.1 is insufficient);

b) Working mode – where the defined machine index together with the prepared static meta-data of the video sequences is combined by the estimation algorithm in order to define the time required for the execution (let’s call it default estimation).

Additionally, the working mode could be used to refine the estimation by extending the meta-data set with the execution trace and storing it back in the MMDBMS. The prediction could then be based on the trace or on a combination of the trace and the default estimation, and would thus be more accurate. If the exactly same request appears in the future, not only the trace but also the already produced video data could be stored; this, however, produces an additional amount of media data and should only be considered as a trade-off between processing and storage costs. Neither the estimation refinement by the trace nor the reuse of already-processed data has been further investigated.

VII.3.3.1 Frame-based prediction

This method is relatively simple. It delivers the average execution time per frame, considering the distinction between frame types and the video size. The idea of distinguishing between frame types comes directly from the evaluation. The decoding time per frame type is depicted for a few sequences in different resolutions in Figure 68. In order to make the results comparable, the higher resolutions are normalized as follows: CIF by a factor of 4 and PAL (ITU601) by a factor of 16. Moreover, B-frames are used only in the video sequences with the “_temp” extension.

[Figure: "Normalized Average Decoding Time per Frame Type" — decoding time in ms (0–12 ms) per frame type (I, P, B) for the QCIF, CIF and ITU601 test sequences, with the higher resolutions divided by 4 and 16, respectively.]

Figure 68. Normalized average LLV1 decoding time counted per frame type for each sequence.

Analogously, the MD-XVID encoding is depicted only for the representative first forty frames of Container QCIF in Figure 69, where the average of the B-frame encoding is above the average of the P-frame encoding (about 9.17 ms and 8.36 ms, respectively). Summarizing, it is clearly visible that I-frames are processed fastest and B-frames slowest for both the decoding and the encoding algorithms.


[Figure: "Frame Encoding Time (carphone_qcif, first 40 frames)" — per-frame encoding time in ms (6–11 ms) over the frame-type sequence I, B, P, B, P, …, with the averages of P-frames and B-frames marked.]

Figure 69. MD-XVID encoding time of different frame types for representative number of frames in Container QCIF.

The predicted time for the given resolution is defined as:

T_exec = n · T^avg    (46)

where T^avg is the machine-specific time vector containing the average execution times measured for I-, P- and B-frames during the learning phase, and n is the static MD vector holding the number of frames of each type for the specific video sequence:

n = [n_I, n_P, n_B],  with  n_I = IFrameSum_moi,  n_P = PFrameSum_moi,  n_B = BFrameSum_moi    (47)

The distribution of frame types for the investigated video sequences is depicted in Figure 109 (section XIX.1 Frame-based static MD of Appendix F).

Thus, the predicted time for the converter execution may simply be calculated as:

T_exec = n_I · T_I^avg + n_P · T_P^avg + n_B · T_B^avg    (48)
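Equation (48) maps directly to code; the following C++ sketch uses illustrative type names and assumes the averages were obtained in the learning phase:

```cpp
// Frame-based prediction of equation (48): n holds the frame counts from the
// static meta-data, T_avg the machine-specific average times per frame type.
struct FrameCounts   { int nI, nP, nB; };       // I-, P-, B-frame counts of the sequence
struct AvgFrameTimes { double tI, tP, tB; };    // measured averages in ms (learning phase)

double predicted_execution_time_ms(const FrameCounts& n, const AvgFrameTimes& t) {
    return n.nI * t.tI + n.nP * t.tP + n.nB * t.tB;   // T_exec in ms
}
```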

No distinction between the types of converters has been made yet, but it is obvious that T^avg should be measured for each converter separately (as demonstrated in Figure 68 and Figure 69). The prediction error calculated for the example data is depicted in Figure 70. The difference between the total execution time and the total time predicted according to Equation (48) is shown on the left-side Y-axis, and the error, given as the ratio between this difference and the real value, on the right-side axis. The average absolute error is equal to 7.3%, but the maximal deviation reaches almost 11.6% for overestimation and 12.2% for underestimation.

[Figure: "Prediction Error" — difference between measured and predicted total time in seconds (left scale, about −6 to 4 s) and relative error in percent (right scale, about −60% to 40%) for each investigated test sequence.]

Figure 70. Difference between measured and predicted time.

Moreover, a different video resolution definitely causes a different average processing time per frame. Thus, it is advised to conduct the learning step on at least two well-known resolutions and then to estimate the time for a scaled video according to the following formula:

T_{new}^{avg} = \theta_{exec} \cdot T_{old}^{avg}, \quad \text{where} \quad
\theta_{exec} =
\begin{cases}
\left(1 - \frac{1}{\log(p_{new})}\right) \cdot \frac{p_{new}}{p_{old}} & \Leftrightarrow p_{new} > p_{old} \\
\left(1 - \frac{1}{\log(p_{new})}\right)^{-1} \cdot \frac{p_{new}}{p_{old}} & \Leftrightarrow p_{new} < p_{old}
\end{cases}    (49)


and T_{new} and p_{new} are the time and the number of pixels in the new resolution, while T_{old} and p_{old} refer to the original test video (i.e. the one of the measurement). A linear prediction, where the slope θ is simply the ratio p_{new}/p_{old} between the new and the old number of pixels, may also be applied, but it yields a higher estimation error, as depicted by the thin black lines for the LLV1 decoding in Figure 71. The theta-based prediction performs best for I-frames and in general better when downscaling; when upscaling, however, the predicted time is underestimated in most of the cases.

[Chart: Average and predicted time per frame type – measured QCIF, CIF and PAL values compared with theta-based and linear up-/downscaling predictions; execution time in ms.]

Figure 71. Average of measured time and predicted time per frame type.

VII.3.3.2 MB-based prediction

The MB-based method considers the number of different MBs in the frame. Three types are possible: I-MBs, P-MBs and B-MBs. They are functionally analogous to the frame types, but they are not directly related to them, i.e. I-MBs may appear in all three types of frames, P-MBs may be included only in P- and B-frames, and B-MBs occur solely in B-frames. Thus, a differentiation not on the frame level but on the MB level is more reliable for the average time measurement of the learning phase. Examples depicting the time consumed per MB are given for the different frame types in Figure 72. Interestingly, the B-MBs are coded faster than the I- or P-MBs, while in general B-frames take longer to code than P-frames; for


a comparison see Figure 68, Figure 69 and Figure 70 in the previous section. The high standard deviations for P- and B-frames stem from the different MB-types probably using different MV-types. It may thus be deduced that the more B-MBs there are in a frame, the faster the MB-coding of MD-based XVID and of MD-LLV1 will be; obviously, this may not hold for standard coders using motion estimation without meta-data (it is done in neither MD-LLV1 nor MD-XVID).

[Chart: MB-specific encoding time in μs over the MB number. I-frame: Avg.=43.4, Std.Dev.=4.7, Δ|max-min|=20.4; P-frame: Avg.=42.2, Std.Dev.=9.1, Δ|max-min|=43.8; B-frame: Avg.=30.2, Std.Dev.=7.8, Δ|max-min|=33.8.]

Figure 72. MB-specific encoding time using MD-XVID for Carphone QCIF.

The example distribution of the different MB types for two sequences is depicted in Figure 73 (more investigated examples are depicted in section XIX.2 MB-based static MD of Appendix F). There is a noticeable difference between the pictures because the B-frames do (a) or do not (b) appear in the video sequence. Even if the B-frames appear in the video sequence, it does not have to mean that all MBs within B-frame are B-MBs—this rule is nicely depicted for even frames in Figure 73 a), where B-MBs in yellow color cover only part of all MBs in the frame.


[Charts: Coded MBs L0 – number of I-, P- and B-MBs per frame for a) and b).]

Figure 73. Distribution of different types of MBs per frame in the sequences: a) Carphone QCIF and b) Coastguard CIF (no B-frames).

Now, based on the different amount of each MB-type in the frame the predicted time for each frame may be calculated as:

TF_{exec} = \mathbf{m} \cdot T^{MBavg} + f(T^{Davg})    (50)

where T^{MBavg} is the time vector including the converter-specific average execution time measured respectively for I-, P- or B-MBs during the learning phase (as shown in Figure 72), which includes the operations covering the code for preparation and support of the MB structure (e.g. zeroing MB matrixes), error prediction and interpolation, transform coding, quantization, and entropy coding of MBs incl. quantized coefficients or quantized error values with MVs (MB-related bitstream coding). It is defined as:

T^{MBavg} = [T_I^{MBavg}, T_P^{MBavg}, T_B^{MBavg}]    (51)

The vector \mathbf{m} is analogous to \mathbf{n} (in the previous section), but this static MD vector keeps the sum of MBs of the given type for the specific j-th frame of the video sequence (i.e. of the i-th media object), i.e.:

\mathbf{m} = [m_I, m_P, m_B], \quad m_I = IMBsSum_{mo_{i,j}}, \quad m_P = PMBsSum_{mo_{i,j}}, \quad m_B = BMBsSum_{mo_{i,j}}    (52)


The function f(T^{Davg}) returns the average time per frame required for the frame-type-dependent default operations other than MB processing, related to the specific converter before and after the MB-coding, i.e.:

T^{Davg} = [T_I^{Davg}, T_P^{Davg}, T_B^{Davg}]    (53)

and the function returns one of these values depending on the type of the processed frame for each converter. T_I^{Davg} covers the operations required for assigning internal parameters, zeroing frame-related structures, continuous MD decoding and non-MB-related bitstream coding (e.g. bit-stream structure information like width, height, frame type, frame number in the sequence), T_P^{Davg} includes the operations of T_I^{Davg} plus the preparation of the reference frame (inverse quantization, inverse transform and edging), and T_B^{Davg} includes the operations of T_I^{Davg} plus the preparation of two reference frames. Obviously, not all of the mentioned operations have to be included in a converter, e.g. the LLV1 decoder does not need continuous MD decoding. Moreover, T^{Davg} is related to the frame size analogously to the frame-based prediction, and thus the scaling should be applied respectively.
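As an illustration only, a minimal C sketch of the MB-based per-frame estimate of Equation (50) is given below; all names are hypothetical and the frame-type-dependent default time f(T^{Davg}) is assumed to be either measured or estimated (e.g. via Equation (54) further down):

/* MB-based per-frame prediction, Equation (50): a minimal sketch. */
typedef struct { double t_i, t_p, t_b; } mb_avg_t;     /* T^MBavg [ms] */
typedef struct { unsigned m_i, m_p, m_b; } mb_cnt_t;   /* m from static MD */

double predict_frame_time(const mb_avg_t *t_mb, const mb_cnt_t *m,
                          const double t_default[3],   /* f(T^Davg) per frame type */
                          int ftype /* 0=I, 1=P, 2=B */)
{
    double t_mbs = m->m_i * t_mb->t_i
                 + m->m_p * t_mb->t_p
                 + m->m_b * t_mb->t_b;   /* m . T^MBavg */
    return t_mbs + t_default[ftype];     /* + f(T^Davg) of the processed frame */
}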

[Chart: Encoding Process (carphone_qcif) – cumulated processing time in ms over the MB-encoding progress for an I-, a P- and a B-frame.]

Figure 74. Cumulated processing time along the execution progress for the MD-XVID encoding (based on [Mielimonka, 2006]).


To explain the meaning of f(T^{Davg}), the distribution of time within the encoding algorithm has been investigated from the perspective of MB-coding. The results for three representative frames (of different types) of Carphone QCIF are depicted in Figure 74. Here, the progress between 0 and 1 corresponds to the first part of f(T^{Davg}) and the progress between 99 and 100 denotes its second part. The progress between 1 and 99 is the processing time spent on MB-coding. As shown, P-frames need more time for preparation than I-frames, and B-frames need even more.

However, the opposite holds in the phase after the MB-coding, where I-frames need the most processing time and B-frames the least. This situation is better depicted in Figure 75, in which the time specific for each frame type has been divided into preparation, MB-coding and completion. Obviously, the time for the preparation and completion phases is given by f(T^{Davg}).

[Chart: Coding Time Partitioning – percentage shares of Preparation, MB-Coding and Completion for I-, P- and B-frames.]

Figure 75. Average coding time partitioning in respect to the given frame type (based on [Mielimonka, 2006]).

Now, the relation between f(T^{Davg}) and \mathbf{m} \cdot T^{MBavg} can be derived for the given frame, such that:

f(T^{Davg}) = \Delta \cdot \mathbf{m} \cdot T^{MBavg}, \quad \Delta = [\Delta_I, \Delta_P, \Delta_B] \;\wedge\; \Delta_k = \frac{a + b}{1 - (a + b)}    (54)

and

\begin{cases}
a = 24.7\% \wedge b = 17.5\%, & \text{if } k = I \Leftrightarrow f.type = I \\
a = 43.8\% \wedge b = 8.3\%, & \text{if } k = P \Leftrightarrow f.type = P \\
a = 64.4\% \wedge b = 4.1\%, & \text{if } k = B \Leftrightarrow f.type = B
\end{cases}    (55)

which may be further detailed as:

f(T_I^{Davg}) = \frac{24.7\% + 17.5\%}{100\% - (24.7\% + 17.5\%)} \cdot \mathbf{m} \cdot T^{MBavg}

f(T_P^{Davg}) = \frac{43.8\% + 8.3\%}{100\% - (43.8\% + 8.3\%)} \cdot \mathbf{m} \cdot T^{MBavg}    (56)

f(T_B^{Davg}) = \frac{64.4\% + 4.1\%}{100\% - (64.4\% + 4.1\%)} \cdot \mathbf{m} \cdot T^{MBavg}
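A minimal C sketch of this estimate, assuming the preparation/completion shares of Figure 75 and hypothetical names, could look as follows:

/* Estimate of f(T^Davg) from the MB-related time, Equations (54)-(56). */
double default_ops_time(double t_mbs /* m . T^MBavg */, int ftype /* 0=I, 1=P, 2=B */)
{
    static const double a[3] = { 0.247, 0.438, 0.644 };  /* preparation share */
    static const double b[3] = { 0.175, 0.083, 0.041 };  /* completion share  */
    double delta = (a[ftype] + b[ftype]) / (1.0 - (a[ftype] + b[ftype]));  /* Delta_k */
    return delta * t_mbs;
}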

Of course, f(T^{Davg}) according to the above definitions is only a rough estimate. The values predicted using Equation (50) in combination with the estimates from Equation (56) are presented in relation to the real measured values of Carphone QCIF in Figure 76 and Figure 77. The maximal error of overestimation and underestimation was equal to 15% and 8.6% respectively, while the average absolute error was equal to 3.93%. Moreover, the average error was positive, which means an over-allocation of resources in most cases.

[Chart: Measured and Predicted Time – processing time in ms per frame (Measured vs. Predicted) over the frame sequence.]

Figure 76. Measured and predicted values for MD-XVID encoding of Carphone QCIF.

Finally, the total predicted time for the converter execution may be calculated as:

T_{exec} = \sum_{i=0}^{n_I} TF_{exec,i} + \sum_{j=0}^{n_P} TF_{exec,j} + \sum_{k=0}^{n_B} TF_{exec,k}    (57)


where n_I, n_P and n_B are defined by Equation (47).

The total predicted time for the MD-XVID encoding of Carphone QCIF calculated according to Equations (50), (56) and (57) was equal to 873 ms and the measured one to 836 ms, thus the over-allocation was equal to 4.43% for the given example.

[Chart: Prediction Error per frame – difference in ms (left scale) and relative error in % (right scale).]

Figure 77. Prediction error of MB-based estimation function in comparison to measured values.

VII.3.3.3 MV-based prediction

As can be seen in Figure 72, the time required for each MB-type varies from MB to MB. These variations, measured by the standard deviation, are smallest for the I-MBs (4.7) and almost twice as big for P-MBs (9.1) and B-MBs (7.8) in the obtained results. The main difference in the coding algorithm between intra- and inter-coded macro blocks is the prediction part, which employs nine types of prediction vectors causing different execution paths. This leads to the assumption that the different types of prediction in the case of predicted macro blocks influence the final execution time of each measured MB. Thus the third method, based on the motion vector types, has been investigated, which has led to the function-block decomposition [Meyerhöfer, 2007] for the MD-based encoding, such that the execution of the code has been measured per MV-type separately.

Before going into detailed measurements, the static MD related to motion vectors has to be explained. The distribution of the motion vector types within the video sequence is depicted in Figure 78 and the absolute values of MV sums per frame are shown in Figure 79. The detailed explanation of graphs’ meaning and other examples of MV-based MD are given in section XIX.3 MV-based static MD of Appendix F.

Figure 78. Distribution of MV-types per frame in the Carphone QCIF sequence.

[Charts: MVs per frame – number of MVs per frame split by MV-type (mv1–mv9, no_mv) for a) and b).]

Figure 79. Sum of motion vectors per frame and MV type in the static MD for Carphone QCIF sequence: a) with no B-frames and b) with B-frames.

The time consumed for the encoding, measured per functional block specific to each MV-type, is depicted in Figure 80. It is clearly visible that the encoding time per MV-type is proportional to the number of MVs of the given type in the frame.

The most noticeable is the mv1 execution (black in Figure 80), which can be mapped almost one-to-one onto the absolute values included in the static MD (violet in Figure 79 a). Another behavior visible at once is caused by mv9 (dark blue in both figures). The results in both remarkable cases prove the linear dependency between the coding time and the number of MVs of the considered MV-type. Moreover, the distribution graph (Figure 78) helps to find out very quickly which MV-types influence the frame encoding time, i.e. in the given example mv1 is the darkest for almost the whole sequence besides the middle part, where mv9 is the darkest. The other MVs, namely mv2 and mv4, also have a key impact on the encoding of the Carphone QCIF (96) example.

[Chart: encoding time in ms per frame for the MV-type-specific functional blocks (mv1–mv9).]

Figure 80. MD-XVID encoding time of MV-type-specific functional blocks per frame for Carphone QCIF (96).

The encoding time has been measured per MV-type and the average T_i^{MVavg} is depicted in Figure 81 for each MV-type. This average MV-related time is used as the basis for the prediction calculation:

TF_{exec} = \mathbf{v} \cdot T^{MVavg} + f(T^{Davg})    (58)

where f(T^{Davg}) is the one defined for Equation (50), and T^{MVavg} is the time vector including the converter-specific average execution time measured respectively for the different MV-types during the learning phase (Figure 81), which includes all operations related to MBs using the given MV-type (e.g. zeroing MB matrixes, error prediction and interpolation, transform coding, quantization, and entropy coding of MBs incl. quantized coefficients or quantized error values with MVs). It is defined as a vector holding nine average encoding time values referring to the given MV-types:

T^{MVavg} = [T_1^{MVavg}, T_2^{MVavg}, T_3^{MVavg}, \ldots, T_9^{MVavg}]    (59)

The vector \mathbf{v} is a sum vector keeping the number of MVs of the given type for the specific j-th frame of the i-th video sequence:

\mathbf{v} = [v_1, v_2, v_3, \ldots, v_9], \quad v_i = MVsSum_{mo_{i,j}}(VectorID), \quad 1 \le i \le 9    (60)

where VectorID is a given MV-type, v_i is the sum of the MVs of the given MV-type and:

MVsSum_{mo_{i,j}}(VectorID) = \{ mv_i \mid mv_i \in MV \wedge TYPE(mv_i) = VectorID \wedge 1 \le i \le V \}    (61)

where V is the total number of MVs in the j-th frame, mv_i is the current motion vector belonging to the set MV of all motion vectors of the j-th frame, and the function TYPE(mv_i) returns the type of mv_i.
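As an illustration only, a minimal C sketch of the MV-based per-frame estimate of Equation (58) is shown below; all names are hypothetical, and f(T^{Davg}) is assumed to be obtained as in the MB-based method:

/* MV-based per-frame prediction, Equation (58): a minimal sketch. */
double predict_frame_time_mv(const double t_mv[9],   /* T^MVavg (Figure 81) */
                             const unsigned v[9],    /* MV counts per type  */
                             double t_default        /* f(T^Davg)           */)
{
    double t = t_default;
    for (int i = 0; i < 9; i++)
        t += v[i] * t_mv[i];   /* v . T^MVavg */
    return t;
}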

[Chart: average encoding time per MB for each MV-type, summarized in the following table.]

MV-type:          mv1  mv2  mv3  mv4  mv5  mv6  mv7  mv8  mv9
Avg. Time [μs]:    50   47   39   46   37   46   42   42   31

Figure 81. Average encoding time measured per MB using the given MV-type.


The total time for the video sequence is calculated the same way as in Equation (57).

The measured encoding time and the time predicted according to Equations (58) and (57) are presented in Figure 82. The predicted total time is denoted by TotalPredicted, and it is the sum of TstartupPredicted and TcleanupPredicted, both reflecting f(T^{Davg}), and of TMoCompPredicted, which is the sum of the products of the number of MVs and the average time calculated per MV-type. Analogously, the real measured total time is denoted by Total and the respective components are TimeStartup&1stMB, AllMBsBut1st and TimeCleanUp.

[Chart: predicted encoding time (TotalPredicted = TstartupPredicted + TMoCompPredicted + TcleanupPredicted) and measured encoding time (Total = TimeStartup&1stMB + AllMBsBut1st + TimeCleanUp) in ms per frame.]

Figure 82. MV-based predicted and measured encoding time for Carphone QCIF (no B- frames).

The error of the MV-based prediction is shown in Figure 83 as the difference in absolute values and as a percentage of the measured time.

The total predicted time was equal to 845 ms and the measured one to 836 ms (as in the MB-based case), thus the difference was equal to 9 ms, which resulted in an overestimation of 1.04%. Thus, the MV-based prediction achieved better results than the MB-based prediction.


[Chart: Prediction Error per frame – difference in ms (left scale) and relative error in % (right scale).]

Figure 83. Prediction error of the MV-based estimation function in comparison to measured values.

VII.3.3.4 The compiler-related time correction

Finally, the calculated time should be corrected by the compiler factor. This factor may be derived from the measurement conducted in the previous section (VII.3.1.2). So, the compiler-related time correction is defined as:

T_{exec\_corr} = \upsilon \cdot T_{exec}    (62)

where υ is a factor representing the relation of the execution times in the different runtime environments of the learning phase and of the working mode, i.e. it is equal to 1 if the same compiler was used in both phases, and otherwise it is calculated as the ratio of the execution times (or of the normalized times with the normalization factor) for the different compilers. For example, assume the values presented in section VII.3.1.2: the execution time of the converter compiled with gcc 2.95.4 is equal to 1.6204 (normalized time of 106.78% with a normalization factor of 1.5174) and with gcc 4.0.2 it is equal to 1.5678 (or 103.32% with the same normalization factor) for the same set of media data; if the code was compiled with gcc 2.95.4 for the learning phase and with gcc 4.0.2 for the working mode, then the value of υ used for the time prediction in the working mode is equal to the ratio 1.5678/1.6204 (or 103.32%/106.78%). Moreover, if the normalization factor for gcc 4.0.2 is different, e.g. 1.5489, then the normalized time for gcc 4.0.2 is equal to 101.22% and the ratio is calculated as (101.22%·1.5489)/(106.78%·1.5174). Please note that the normalization factor does not have to be constant, but then two values instead of one have to be stored, so it is advised to normalize the execution time by a constant normalization factor. On the other hand, using the measured execution time directly for calculating υ delivers the same results and does not require storing the normalization factor, but at the same time it hides the straightforward information about which compilers are faster and which are slower.
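A minimal C sketch of this correction, reproducing the example above with hypothetical names, could look as follows:

/* Compiler-related correction, Equation (62): a minimal sketch. */
double compiler_correction(double t_working_compiler, double t_learning_compiler)
{
    return t_working_compiler / t_learning_compiler;   /* upsilon */
}

/* example from the text: upsilon = 1.5678 / 1.6204 (gcc 4.0.2 vs. gcc 2.95.4),
   i.e. roughly 0.9675, so the predicted time is scaled down by about 3.2%:
   t_exec_corr = compiler_correction(1.5678, 1.6204) * t_exec;               */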

VII.3.3.5 Conclusions to precise time prediction

It can easily be noticed that none of the given solutions can predict the exact execution time. Each method delivers estimates burdened with some error. The error is biggest for the frame-based prediction, smaller for the MB-based prediction and smallest for the MV-based prediction. Moreover, the influence of the compiler's optimization cannot be neglected unless the same compiler is used for producing the executables. Finally, the investigation of the time prediction has led to the conclusion that errors in the prediction cannot really be avoided, and the HRTA converter model is the only one that can exploit the predicted time (obtained with any of the methods) at the cost of some drop in the quality of the output data. Obviously, the resulting quality drop is bigger if a less precise prediction method is used.

VII.3.4. Mapping of MD-LLV1 Decoder to HRTA Converter Model

To stick to the hard real-time adaptive converter model and to guarantee the minimal quality and the delivery of all frames, the decoding algorithm had to be split in such a way that the complete base layer without any enhancements is decoded first (before the given deadline) and the enhancement layers up to the highest (lossless) one are decoded next. However, an optimization problem occurred here: the multiple executions of the inverse quantization (IQ), of the inverse lifting scheme, i.e. the inverse binDCT (IbinDCT), and of the correction of pixel values in the output video stream for each layer. So, to avoid the loss of processing time in the case of more enhancement layers, the decoding was finally optimized and split into the three parts of CHRTA as follows:

• CM – decoding of the complete frame in BL – de-quantized and transformed copy of the frame in BL goes to frame store (FS – buffer), and quantized coefficients are used in further EL computations


• CO – decoding of the ELs – the decoded bit planes representing differences between coefficient values on different layers are computed by formula (12) (on page 103)

• CD – cleaning up and delivery – includes the final execution of IQ, IbinDCT and pixel correction for the last enhancement layer, and as such utilizing all readily processed MBs from optional part, and provides the frame to the consumer.

The time required for the base timeslice (guaranteed BL decoding) is calculated as follows:

t_{base\_ts} = t_{avg\_base/MB} \cdot m    (63)

where m denotes the number of MBs in one frame and t_{avg\_base/MB} is the average time consumed for one MB of the BL in the given resolution (regardless of the MB-type).

The time required for the cleanup timeslice (guaranteed frame delivery) is calculated as follows:

t_{cleanup\_ts} = t_{max\_cleanup/MB} \cdot m + t_{max\_enhance/MB}    (64)

where t_{max\_cleanup/MB} is the maximum time consumed for the cleanup step of one MB and t_{max\_enhance/MB} is the maximum time for the enhancement step of one MB in one of the ELs (to care for the last processed MB in the optional part), with respect to the given resolution.

The time required for the enhancement timeslice (complete execution not guaranteed, but behaves like imprecise computations) is calculated according to:

t_{enhance\_ts} = T - (t_{base\_ts} + t_{cleanup\_ts})    (65)

where T denotes the length of the period (analogous to T in JCPSt).

Finally, the decoder must check if it can guarantee the minimal QoS i.e. if the LQA for all the frames is delivered:

t_{base\_ts} + t_{enhance\_ts} \ge t_{max\_base/MB} \cdot m    (66)

The check is relatively simple: the criterion is whether the maximum decoding time per macro block t_{max\_base/MB} multiplied by the number of MBs in one frame m fits into the sum of the first two timeslices. Only if this check holds will the resource allocation providing QoS work and may the LQA guarantees be given. Otherwise, the allocation will be refused.
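A minimal C sketch of this allocation and admission check (Equations (63)-(66)), with hypothetical names and the per-MB times taken from the measurements described below, could look as follows:

/* Timeslice allocation for the RT-MD-LLV1 decoder: a minimal sketch.
   Returns 0 if the LQA guarantee cannot be given (allocation refused). */
typedef struct { double base, enhance, cleanup; } timeslices_t;   /* [ms] */

int allocate_timeslices(timeslices_t *ts, double period,
                        unsigned m,               /* MBs per frame        */
                        double t_avg_base_mb,     /* avg. BL time per MB  */
                        double t_max_base_mb,     /* max. BL time per MB  */
                        double t_max_cleanup_mb,  /* max. cleanup per MB  */
                        double t_max_enhance_mb)  /* max. EL time per MB  */
{
    ts->base    = t_avg_base_mb * m;                          /* (63) */
    ts->cleanup = t_max_cleanup_mb * m + t_max_enhance_mb;    /* (64) */
    ts->enhance = period - (ts->base + ts->cleanup);          /* (65) */
    /* admission check (66): worst-case BL decoding must fit into the
       base plus enhancement timeslice, otherwise refuse the allocation */
    return (ts->base + ts->enhance) >= (t_max_base_mb * m);
}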

The input values for the formulas are obtained by measurements of the set of videos classified along the given resolutions:

• QCIF – container, mobile, mother_and_daughter
• CIF – container, mobile, mother_and_daughter
• PAL – mobcal, parkrun, shields

It has been decided so because the maximum real values can only be measured by executing the real data with the compiled decoder. The average values of time per MB have been calculated per frame for all frames in the different sequences of the same resolution, which means that they are burdened with an error at least as high as that of the frame-based prediction. On the other hand, the average values per MB could be calculated according to one of the proposed methods mentioned earlier for the whole sequence and then averaged per MB for the given frame type, which could yield more exact average times for the processed sequence; this, however, has not been investigated.

VII.3.5. Mapping of MD-XVID Encoder to HRTA Converter Model

Analogously to the MD-LLV1 HRTA decoder, the MD-XVID encoder has been mapped to the HRTA converter model, but only one time prediction method has been used in the latter case.

VII.3.5.1 Simplification in time prediction

None of the prediction methods discussed in the Precise time prediction section is able to predict the exact execution time, and each of them estimates the time with some bigger or smaller error. Because the differences between the MB-based and the MV-based prediction are relatively small (compare Figure 77 and Figure 83), the simpler method, i.e. the MB-based prediction, has been chosen for calculating the timeslices of the HRTA converter model, and moreover, it has been simplified further.


Additionally, it has been decided to allow only constant timeslices driven by the output frame frequency with a strictly periodic execution. The constancy of the timeslices is derived directly from the average execution time, namely it is based on the maximum average frame time for the default operations and on the average MB-specific time. The maximum average frame time is chosen out of the three frame-type-dependent default-operation average times (as given in Equation (53)):

T_{MAX}^{Davg} = \max(T_I^{Davg}, \max(T_P^{Davg}, T_B^{Davg}))    (67)

where \max(x,y) is defined as [Trybulec and Byliński, 1989]:

\max(x,y) = \begin{cases} x, & \text{if } x \ge y \\ y, & \text{otherwise} \end{cases} = \frac{1}{2} \cdot \left( |x - y| + x + y \right)    (68)

and the average MB-specific time is the mean value:

T_{AVG}^{MBavg} = \frac{T_I^{MBavg} + T_P^{MBavg} + T_B^{MBavg}}{3}    (69)

VII.3.5.2 Division of encoding time according to HRTA

Having defined the above simplification, the hard real-time adaptive converter model of the MD-XVID encoder with the minimal quality guarantees, including the processing of all frames, could be defined. The encoding algorithm had to be split in such a way that the default operations are treated completely as the mandatory part, while the MB-specific encoding is treated partly as mandatory and partly as optional.

The time required for the base timeslice according to CHRTA is calculated as follows:

t_{base\_ts} = T_{MAX}^{Davg} \cdot \frac{a}{a + b}    (70)

where a and b are defined according to Equation (55).

Analogically, the cleanup timeslice is calculated as follows:


t_{cleanup\_ts} = T_{MAX}^{Davg} \cdot \frac{b}{a + b}    (71)

The time required for the enhancement timeslice, in which the complete execution of all MBs is not guaranteed, is calculated according to:

t_{enhance\_ts} = T_{AVG}^{MBavg} \cdot m    (72)

where m is the number of MBs to be coded.

Both Equations (70) and (71) work under a worst-case assumption independently of the processed frame type. So an optimization may be introduced that allows "moving" MB-specific processing into the unused time of the base timeslice. Of course, the worst-case assumption should stay untouched for the clean-up step.

Thus, the following additional relaxation condition is proposed:

TF_{exec} \cdot a < t_{base\_ts} \;\Rightarrow\; m_{base} = \left\lfloor \frac{t_{base\_ts} - (TF_{exec} \cdot a)}{T_{AVG}^{MBavg}} \right\rfloor    (73)

and

t_{enhance\_ts} = T_{AVG}^{MBavg} \cdot (m - m_{base})    (74)
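A minimal C sketch of this timeslice calculation including the relaxation (Equations (70)-(74)), with hypothetical names and the shares a and b passed in as in Equation (55), could look as follows:

/* Timeslice calculation for the RT-MD-XVID encoder: a minimal sketch. */
#include <math.h>

typedef struct { double base, enhance, cleanup; } enc_timeslices_t;   /* [ms] */

void encoder_timeslices(enc_timeslices_t *ts,
                        double t_d_max,    /* T^Davg_MAX, Equation (67)       */
                        double t_mb_avg,   /* T^MBavg_AVG, Equation (69)      */
                        double a, double b,/* shares from Equation (55)       */
                        unsigned m,        /* MBs per frame                   */
                        double tf_exec     /* predicted frame time, Eq. (50)  */)
{
    ts->base    = t_d_max * a / (a + b);                       /* (70) */
    ts->cleanup = t_d_max * b / (a + b);                       /* (71) */
    /* relaxation (73)/(74): move MBs into unused base-timeslice time */
    unsigned m_base = 0;
    if (tf_exec * a < ts->base)
        m_base = (unsigned)floor((ts->base - tf_exec * a) / t_mb_avg);
    if (m_base > m)
        m_base = m;
    ts->enhance = t_mb_avg * (m - m_base);                     /* (72)/(74) */
}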

Figure 84 demonstrates the mapping of the MD-XVID to HRTA converter model including the idea of relaxation according to Equations (73) and (74).


[Diagram: reservation according to CHRTA – startup, MB-processing, clean-up and IDLE phases within the periods n-1, n and n+1 for I-, P- and B-frames (top), and the relaxation variant where m_moved MBs are shifted into the unused time of the base timeslice (bottom).]

Figure 84. Mapping of MD-XVID to HRTA converter model.



Chapter 4 – Implementation

As soon as we started programming, we found to our surprise that it wasn’t as easy to get programs right as we had thought. Debugging had to be discovered. I can remember that exact instant when I realized that a large part of my life from then on was going to be spent in finding mistakes in my own programs. Maurice Wilkes (1979, reminiscing about his early days on EDSAC in the 1940s)

VIII. CORE OF THE RETAVIC ARCHITECTURE

The RETAVIC project was divided into parts, i.e. sub-projects. Each part was meant to be covered by one (or more) student work(s); however, not all sub-projects have been conducted due to the time factor or missing human resources. Finally, the focus was on covering the most important and critical parts of the RETAVIC project, proving the idea of controllable meta-data-based real-time conversion for the most complex type of media (i.e. the video type), which is the base assumption for the format independence of multimedia data in an MMDBMS.

VIII.1. Implemented Best-effort Prototypes

In the first phase, the video transcoding chain has been implemented in the best-effort system, or to be more precise on two platforms of best-effort OSes, namely on Windows XP and on Linux. The implementation covered:


• p-domain XVID – extension of the XVID code to support the p-domain-based bit rate control algorithm;
• LLV1 codec – including the encoding and decoding parts – implementation of the temporal and quantization layering schemes together with the binDCT algorithm and adaptation of the Huffman tables;
• MD analyzer – produces the static and continuous MD for the given video sequence;
• MD-LLV1 codec – extension of LLV1 to support additional meta-data allowing skipping frames of the enhancement layers in the coded domain (i.e. the bit stream does not have to be decoded); here both the producing and the consuming parts are implemented;
• MD-XVID codec – extension of XVID to support additional meta-data (e.g. direct MV reuse); it also includes some enhancements in the quality such as: 1) a rotating diamond algorithm based on the diamond search algorithm [Tourapis, 2002] with only two iterations for checking if the full-pel MD-based MVs are suitable for the converted content, 2) a predictor recheck, which allows for checking the MD-based MV against the zero vector and against the median vector of three MD-based vectors (of the MBs in the neighborhood – left, top, top-right), and 3) a subpel refinement, where the MC using the MD-based full-pel MVs is calculated also for half-pel and q-pel.

Having the codec implemented, the functional and quality evaluations have been conducted, and moreover, some best-effort efficiency benchmarks have been accomplished. All these benchmarks and evaluations have already been reflected by charts and graphs in the previous part of this thesis i.e. in the Video Processing Model (V) and Real-Time Processing Model (VII) sections.

VIII.2. Implemented Real-time Prototypes

The second phase of the implementation covered at first the adaptation of the source code of the best-effort prototypes to support the DROPS system and its specific base functions and procedures as non-real-time threads, and secondly the implementation of the real-time threading features allowing for the algorithm division as designed previously. The implementation is based on only two of the previously mentioned best-effort prototypes, i.e. on the MD-LLV1 and MD-XVID codecs, and it covers:


• DROPS porting of MD-LLV1 – adaptation of the MD-LLV1 codec to support the DROPS-specific environment; no distinction is made within this work between the MD-LLV1 implemented in DROPS and the one in Windows/Linux, since only OS-specific hacking activities have been conducted and neither algorithmic nor functional changes have been made; moreover, the implementation in DROPS behaves analogously to the best-effort system implementation since no real-time support has been included, and even the source code is kept in the same CVS tree as the best-effort MD-LLV1;
• DROPS porting of MD-XVID – exactly analogous to MD-LLV1; the only difference is that it is based on the MD-XVID codec;
• RT-MD-LLV1 decoder – the implementation based on the DROPS porting of MD-LLV1 covering the real-time issues, i.e. the division of the algorithm into the mandatory and optional parts (which has been explained in the Design of Real-Time Converters section), the implementation of the preempter thread, special real-time logging through the network, etc.; it is described in detail in the following chapters;
• RT-MD-XVID encoder – the implementation covers aspects analogous to RT-MD-LLV1 but is based on the DROPS porting of MD-XVID (see also the Design of Real-Time Converters section); it is also detailed in the subsequent part.

The real-time implementations allowed evaluating quantitatively the processing time under the real-time constraints and provided the means of assessing the QoS control of the processing steps during the real-time execution.


IX. REAL-TIME PROCESSING IN DROPS

IX.1. Issues of Source Code Porting to DROPS

As already mentioned, the MD-LLV1 and MD-XVID codecs had to be ported to the DROPS environment. The porting steps, which are defined below, have been conducted for both implemented converters, and thus they may be generalized as a guideline of source code porting to DROPS for any type of converter implemented in a best-effort system such as Linux or Windows.

The source code porting to DROPS algorithm consists of the following steps:

1) Adaptation (if exists) or development (if is not available) of the time measurement facility

2) Adaptation to logging environment for delivering standard and error messages obeying the real-time constraints

3) Adaptation to L4 environment

The first step covered the time measurement facility, which may be based on functions returning the system clock or directly on the tick counter of the processor. DROPS provides a set of time-related functions giving the values in nanoseconds, such as l4_tsc_to_ns or l4_tsc_to_s_and_ns; however, as has been tested by measurements, they demonstrate some inaccuracy in the conversion from the number of ticks (expressed by l4_time_cpu_t, a 64-bit integer CPU-internal time stamp counter) into nanoseconds, traded off for a higher performance. While in non-real-time applications such inaccuracy is acceptable, it is intolerable in the continuous-media conversion using the hard real-time adaptive model, where very small time-point distances (e.g. between the begin and the end of the processing of one MB) are measured. Therefore a more accurate implementation has been employed, which also uses the processor's tick counter (by exploiting DROPS functions like l4_calibrate_tsc, l4_get_hz, l4_rdtsc) but is based on floating-point calculations instead of the integer-based, CPU-specific tsc_to_ns. This has led to more precise calculations.
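A minimal C sketch of such a floating-point based conversion is given below; it assumes that the DROPS functions l4_calibrate_tsc(), l4_get_hz() and l4_rdtsc() behave as referenced above, and the helper names are hypothetical:

static double ns_per_tick;                       /* machine-specific scale factor */

void init_time_measurement(void)
{
    l4_calibrate_tsc();                          /* calibrate the TSC once at start-up */
    ns_per_tick = 1e9 / (double)l4_get_hz();     /* CPU frequency in Hz */
}

unsigned long long ticks_to_ns(unsigned long long ticks)
{
    /* floating point avoids the rounding inaccuracy of the integer-based
       conversion at the cost of some performance */
    return (unsigned long long)((double)ticks * ns_per_tick);
}

/* usage: t0 = l4_rdtsc(); ...measured code...; dt_ns = ticks_to_ns(l4_rdtsc() - t0); */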

Secondly, exhaustive logging is required for analysis purposes, especially for investigating the behavior during the execution of the real-time benchmarks. DROPS provides mechanisms for logging to the screen (LogServer) or to the network (LogNetServer). The first option is not really useful for further analysis due to the limited screen size, and only the second one is applicable here. However, the LogNetServer is based on the OSKit framework [Ford et al., 1997] and can be compiled only with gcc 2.95.x, which has been proved to be the least effective in producing efficient binaries (Figure 65 on p. 196), but this is still not a problem. The real drawback is that logging through the LogNetServer influences the system under measurement due to the task-switching effect (Figure 66 on p. 198) caused by the LOG command being executed by the log DROPS server, which is different from the converter DROPS server97, and thus the delivered measures would be distorted. In addition, the LogNetServer, due to its reliance on synchronous IPC over TCP/IP, has unpredictable behavior and thus does not conform to the real-time application model. As a result, logging has to be avoided during the real-time thread execution. On the other hand, it cannot simply be dropped, so the solution of using log buffers in memory, which save the logging messages generated during the real-time phase and are flushed to the network by the LogNetServer during the non-real-time phase, is proposed. Such a solution avoids the task-switching problems and the unpredictable synchronous communication, delivering an intact real-time execution of the conversion process.
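A minimal C sketch of such a buffered logging scheme could look as follows (an assumption for illustration, not the thesis code; LOG() stands for the DROPS logging macro mentioned above, and the buffer size is arbitrary):

#include <string.h>

#define RT_LOG_SIZE (1 << 20)
static char rt_log_buf[RT_LOG_SIZE];
static unsigned rt_log_pos;

void rt_log(const char *msg)                 /* real-time phase: append only */
{
    unsigned len = (unsigned)strlen(msg);
    if (rt_log_pos + len + 1 <= RT_LOG_SIZE) {   /* silently drop if buffer is full */
        memcpy(rt_log_buf + rt_log_pos, msg, len);
        rt_log_buf[rt_log_pos + len] = '\n';
        rt_log_pos += len + 1;
    }
}

void rt_log_flush(void)                      /* non-real-time phase: flush to network */
{
    LOG("%.*s", (int)rt_log_pos, rt_log_buf);
    rt_log_pos = 0;
}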

Finally, the adaptation to the L4-specific environment used in DROPS has to be conducted. DROPS does not conform to the Portable Operating System Interface (POSIX), the fundamental and standard API for Linux/Unix-based systems. Moreover, DROPS is still under development and kernel changes may occur, so it is important to recognize which version of the DROPS kernel is actually used and then to do the respective adaptations for the chosen system configuration: L4Env_base for the upgraded version 1.2 (called l4v2) or L4Env_Freebsd for the previous version 1.0 (called l4v0) [WWW_DROPS, 2006]. L4Env_base differs from L4Env_Freebsd in that the latter is based on the OSKit [Ford et al., 1997] while the first

97 The DROPS’s servers are referred here as applications in the user space and outside the microkernel according to the OS ontology.


one has complete implementation of the fundamental system libraries (e.g. libc, log, names, basic_io, basic_mmap, syslog, time, etc.) on its own [Löser and Aigner, 2007]98.

As a result, the POSIX-specific functions and procedures often simply work, but sometimes they require adaptation. For example, the POSIX-compatible, assembler-based SIGILL check used by MD-LLV1 and MD-XVID to determine the SIMD processor extensions (namely MMX and SSE) is not supported on DROPS in the L4Env_base mode. Thus, both converters have been adapted by simply assuming that the used hardware supports these extensions, and no additional checks are conducted99, which eliminated the use of the problematic command.

Another problem with DROPS is that it still does not support I/O functionality with respect to real-time reads from and writes to the disk, and this is an undoubted limitation for a multimedia server. The supported simple file operations are not sufficient due to the missing real-time abilities. So, another practical solution has been employed for delivering the bit streams with video data, static and continuous meta-data as input: the particular data have been linked after compilation as binaries into the executable, loaded during the DROPS booting process into the memory, which is real-time capable, and accessed through the defined general input interface allowing for the integration of different inputs by calling an input-specific read function. Obviously, such a technique is unacceptable for real-world applications, but it allows for conducting the proof of concept of the assumed format independence through real-time transcoding. Another possibility would be using the low-latency real-time Ethernet-based data transmission [Löser and Härtig, 2004], but there only a specific subset of hardware allowing traffic shaping in a closed system has been used100, so it needs to be found out if this technique may be applied in

98 There are also other modes available like Tiny, Sigma0, L4Env, L4Linux, etc. The detailed specification of all available configurations together with detailed include paths and libraries can be found in [Löser and Aigner, 2007]. 99 Such assumption may be called a hack; however it reflects the reality, because nowadays almost all processors include the MMX- and SSE-based SIMD extensions in their ISA. Still, there has been the option for turning this SIMD support off by setting compilation flags XVID_CPU_MMX and/or XVID_CPU_SSE to zero to prohibit the usage of the given ISA subset. 100 The evaluation has been conducted with only three switches (two fast-ethernet and one gigabit) and two Ethernet cards (one fast-ethernet - Intel EEPro/100 and one gigabit - 3Com 3C985B-SX). The support for other hardware is not stated. On the other hand, the evaluation included the only real-time transmission as well as the shared network between real-time (DROPS) and best-effort (Linux) transmissions and proved the ability of guaranteeing sub-millisecond delays for network utilization of 93% for fast and 49% for gigabit Ethernet.


the wide area of general purpose systems such as multimedia server using common off-the-shelf hardware. Moreover, it would require development of the converters being able to read from and to write to the real-time capable network (e.g. DROPS-compliant RTSP/RTP sender or receiver).

IX.2. Process Flow in the Real-Time Converter Before going into the details of converter’s processing function, the process flow has to be defined. The DSI has already been mentioned as the streaming interface between the converters. This is one of the possible options for I/O operations required in the process flow of the real-time converter. Other possibility covers memory-mapped binaries (as mentioned in previous paragraph) for input and internal buffer (which is a real-time capable memory allocated by the converter). The DROPS simple file system and real-time capable Ethernet are jet another I/O option. The memory-mapped binaries and the internal output buffer have been selected for evaluation purposes due to their simplicity in the implementation. Moreover, some problems have appeared with other options: DROPS Simple FS did not support guaranteed real-time I/O operations, and RT-Ethernet required the specific hardware support (or would require OS- related activities in the driver development).

The process flow of the real-time converter is depicted in Figure 85. All mentioned I/O options (in gray) and the selected ones (in black) have been marked. The abstraction of the I/O can be delivered by the general input/output interface or the CSI proposed by [Schmidt et al., 2003]. However, the CSI has been left out due to its support only for version 1.0 of the DROPS (dependent on OSKit) – the development has been abandoned due to the closing of the memo.REAL project. So, if the abstraction had to be supported, the only option was to write the general I/O interface, such that no unnecessary copy operations appear e.g. by using pointer handing over and shared memory. So, the general input/output interfaces are nothing else but wrappers delivering information about the pointer to the given shared memory, which is delivered by the previous/subsequent element. This is a bit similar to the CSI described in section VII.2.6 (p. 190).


Figure 85. Process flow in the real-time converter.

The real-time converter uses the general input interface consisting of three functions:

int initialize_input(int type);
int provide_input(int type, unsigned char **address_p, unsigned int pos_n, unsigned int size_n);
int p_read_input(int type, unsigned char **address_p, unsigned int size_n);

where type defines the input type (the left-most element of Figure 85) and can be provided by the control application to the converter after building the functionally-correct chain. Obviously, the given type of input has to provide all the required types of data, i.e. media data, static MD, and continuous MD used by the real-time converter (otherwise the input should not be selected for the functionally-correct chain). Then the data in the input buffer are used by calling provide_input, which checks if the requested size of data (size_n) at the given position (pos_n) can be provided by the input, and p_read_input, which, based on the size of the quant (size_n), sets address_p (a pointer) to the correct position of the next quant in the memory (and does not read any data!). The general input interface then forwards the calls to the input-specific implementation based on the type of the input, i.e. for memory-mapped binaries provide_input calls the provide_input_binary function respectively. The nice thing about this is that the real-time converter does not have to know how the input-specific function is implemented but only has to know the type to be called to get the data; thus the flexibility of delivering different transcoding chains by the control application is preserved.


Obviously, the input-specific functions should provide all three fundamental types of I/O functions such as open, read, and lseek in POSIX. It is done in the memory-mapped binaries by: initialize_input_binary (being equivalent to open), provide_input_binary (checks and reads at given position like read), set_current_position and get_current_position (counterparts of lseek).
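The following minimal C sketch is hypothetical (only the function names are given above, so the constant INPUT_TYPE_BINARY and the signature of provide_input_binary are assumptions); it illustrates how the general input interface could forward a call to the input-specific implementation for memory-mapped binaries:

int provide_input(int type, unsigned char **address_p,
                  unsigned int pos_n, unsigned int size_n)
{
    switch (type) {
    case INPUT_TYPE_BINARY:   /* memory-mapped binaries linked into the executable */
        return provide_input_binary(address_p, pos_n, size_n);
    default:
        return -1;            /* unknown input type: not part of a functionally-correct chain */
    }
}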

The general output interface is implemented in analogy to the general input interface i.e. there are functions like initialize_output, provide_output and p_write_output defined. One remark is that, the output internal buffer allocates the memory itself based on the constant size provided by a system value during the start-up phase of the transcoding chain; however, it should be provided by the control application as variable parameter after all. Moreover, the output type should also fulfill the requirement of accepting the different output data produced by the real-time converter (again the rule of functionally-correct chain has to be applied) and should be given to the converter by the control application for calling the output functions properly.

IX.3. RT-MD-LLV1 Decoder

The first step was porting the decoder to the real-time environment by adapting all the standard file access and time measurement functions to those present in DROPS, as stated in the previous sections. Next, the algorithmic changes in the processing function had to be introduced. The timing functions and periods had to be defined in order to obtain a constant frame rate requested by the user, in such a way that exactly one frame is provided within one period. Here, a mechanism of stopping the processing of a frame even if not every macro block (MB) was completely decoded had to be introduced. This has been provided by the additional meta-data (MD) describing the bit-stream structure of each enhancement layer (discussed in section V.5.1 MD-based Decoding), because finding the next frame without Huffman decoding of the remaining MBs of the current frame is not possible. So, the MD allowed for a skip operation and going to the next frame on the binary level, i.e. an operation on the encoded bit stream.

IX.3.1. Setting-up Real-Time Mode

The time assigned to each timeslice is calculated by Equations (63), (64), (65) and (66) given in section VII.3.4 (p. 217) by means of initial measurements with the real-time decoder. Example initial values embedded in the source code as constants are listed in Appendix G in the function load_allocation_params(), but they should rather be included in the machine-dependent and resolution-related static meta-data. These measurements provide the average decoding time per macro block as well as the maximum decoding time per macro block on a given platform. This allows us to avoid a complex analysis of the specific architecture (possible due to the use of the HRTA converter model) and delivers huge simplifications in the allocation algorithm. Alternatively, the predicted time could be calculated according to the formulas given in the Precise time prediction section. The code responsible for setting up the timeslices and the real-time mode is given in Figure 86.

set up RT mode:

  createPreempterThread(preemptPrio);

  // set up RT periodic thread
  registerRTMainThread(preemptThread, periodLength);

  // set up timeslices
  addTimeslice(baseLength, basePrio);
  addTimeslice(enhanceLength, enhancePrio);
  addTimeslice(cleanupLength, cleanupPrio);

  // switch to RT mode
  startRT(timeOffset);
  while(running){
    do_RT_periodic_loop();
  }

Figure 86. Setting up real-time mode (based on [Mielimonka, 2006; Wittmann, 2005]).

IX.3.2. Preempter Definition

The decoder’s adaptive ability on the MB level (mentioned in sub-section VII.3.4 of the Design of Real-Time Converters section) requires handling of the time-related IPCs from the DROPS kernel. The timeslice-overrun IPC is only relevant for the enhancement timeslice of the main thread. In the case of an enhancement timeslice overrun, the rest of the enhancement layer processing has to be stopped and skipped. For the mandatory and cleanup timeslices, timeslice overruns do not affect the processing, i.e. for the base-quality processing the enhancement timeslice can additionally be used, and for the cleaning up in the delivery timeslice a timeslice overrun should never happen – otherwise it is the result of erroneous measurements, as the maximum (worst-case) time should be allocated for it. The deadline-miss IPC (DLM) is absolutely unintended for a hard real-time adaptive system, but it is nevertheless optionally handled by skipping the frame whenever the processing of the current frame is not finished. The system tries to limit the damage by this skipping operation, but a proper processing with the guaranteed quality cannot be assured anymore. It must be clear that a DLM might occur only due to system instability, assuming the correct allocation for the delivery timeslice (which is worst-case-based and thus compliant with the HRTA converter model). Finally, the preempter thread has been defined and is given as pseudo code in Figure 87 (the full listing is given in Appendix G).

preempter:

  while(running){
    receive_preemption_ipc(msg);
    switch(msg.type){
      case TIMESLICE_OVERRUN:
        if(msg.ts_id == ENHANCE_TS) {
          abort_enhance();
        }
        break;
      case DEADLINE_MISS:
        if(frame_processing_finished)
          abort_waiting_for_next_period();
        else {
          skip_frame();
          raise_delivery_error();
        }
        break;
    }
  }

Figure 87. Decoder’s preempter thread accompanying the processing main thread (based on [Wittmann, 2005]).

IX.3.3. MB-based Adaptive Processing

The use of the abort_enhance() function in the preempter enforces the implementation of a checking function in the enhancement timeslice, in order to recognize when the timeslice overrun occurred and to stop the processing correctly. Therefore, the decoder checks the timeslice-overrun (TSO) semaphore before decoding the next MB of the enhancement layer. If the TSO occurred, then all the remaining MBs of this enhancement layer as well as all MBs of the higher layers are skipped, i.e. the semaphore blocks them and the MB loop is left. The prototypical code used for decoding a frame of the given enhancement layer (EL) is listed in Figure 88.

Regardless of the number of already processed MBs in the enhancement timeslice, the current results are processed and arranged by the delivery step. The mandatory and delivery timeslices are required, and accordingly have to be executed completely in a normal processing, but as already mentioned the deadline miss is handled additionally to cope with erroneous processing. The part responsible for the base layer decoding has no time-related functions, i.e. neither timeslice overrun nor deadline miss are checked (the analogous description was given for the preempter pseudo code listed in Figure 87).

decode_frame_enhance(EL_bitstream, EL_no):

  for (x = 0; x < mb_width * mb_height; x++) {
    // check the TSO semaphore before decoding the next MB
    if (TSO_Enhance) {
      // timeslice overrun: skip the remaining MBs of this and all higher ELs
      break;
    }
    decode_mb_enhance(EL_bitstream, EL_no, x);   // per-MB decoding (call name illustrative)
  }

Figure 88. Timeslice overrun handling during the processing of enhancement layer (based on [Wittmann, 2005]).

Finally, the delivery part delivers: 1) the results from the base layer if no MB from the enhancement processing has been produced or 2) the output of the dequantization and the inverse transform of the MBs prepared by the enhancement timeslice. If at least one enhancement layer has been processed completely the dequantization and inverse transform is executed for all (i.e. mb_width· mb_height) MBs in the frame but only once.

IX.3.4. Decoder’s Real-Time Loop

The real-time periodic loop demonstrating all parts of the HRTA converter model is given by the pseudo code listed in Figure 89. It is clearly visible that at first the decoding of the base layer takes place. When it is finished, the context is switched to the next reserved timeslice. Here it does not matter whether the TSO of the mandatory (base-layer) timeslice occurred or not, because only a TSO of the enhancement timeslice would be critical; and the enhancement TSO is anyway not reached during the base-layer processing, since the minimal time has been allocated according to the worst case (for details see condition (66) on page 217). Another interesting event occurs during the enhancement timeslice, namely setting the position to the end of the frame for each decoded enhancement layer. It occurs always: 1) if the decoding of the given EL was finished, the function sets the pointer to the same position, and 2) if the decoding was not finished, the pointer is moved to the end of the frame based on the delivered continuous meta-data, which allows jumping over the skipped MBs in the coded domain of the video stream. The context is switched to the next timeslice just after the enhancement TSO, i.e. at most after the time required for processing exactly one MB; then the delivery step is executed and the processing context is switched to the non-real-time (i.e. idle) part. The delivery step finishes before or exactly at the period deadline (otherwise there is an erroneous situation). If it finishes before the deadline, the converter waits in the idle mode until the begin of the next period.

do_RT_periodic_loop():

  // BASE_TS
  decode_base(BL_Bitstream);
  next_reservation(ts1);

  // ENHANCE_TS
  for (EL = 1; EL <= desiredLayers; EL++){
    if(!TSO_Enhance) {
      decode_frame_enhance(EL_Bitstream[EL], EL);
    }
    setPositionToEndOfFrame(EL_Bitstream[EL]);
  }
  next_reservation(ts2);

  // CLEANUP_TS
  if(desiredLayers > 0){
    decoder_dequant_idc_enhance();
    decoder_put_enhance(outputBuffer);
  } else {
    copy_reference_frame(outputBuffer);
  }
  next_reservation(ts3);

  // NON_RT_TS – do nothing until the deadline
  if(!deadline_miss){ // i.e. normal case
    wait_for_next_period();
  }

Figure 89. Real-time periodic loop in the RT-MD-LLV1 decoder.

IX.4. RT-MD-XVID Encoder

Analogously to RT-MD-LLV1, the first step in the real-time implementation of MD-XVID was the adaptation of all standard procedures and functions for file access and time measurement to the DROPS-specific environment. Also the timing functions and periods had to be defined analogously to RT-MD-LLV1. The same MB-based stopping mechanism of the process had to be introduced for each type of frame. Of course, this mechanism has been possible due to the provided continuous meta-data (see section V.5.2 MD-based Encoding), allowing the definition of a compressed emergency output in two ways: 1) by skipping the frame using a special encoded symbol at the very beginning of each frame, or 2) by avoiding the frame skip through exploiting the processed MBs and zeroing only those which have not been processed yet. Additionally, the reuse of the continuous MD, substituting the not yet coded MBs by refining the first three coefficients and zeroing the rest, has been implemented, but it can be applied directly only if no resolution change of the frame is applied within the transcoding process101.

IX.4.1. Setting-up Real-Time Mode

The time assigned to each timeslice according to the HRTA converter model (see VII.3.5 Mapping of MD-XVID Encoder to HRTA Converter Model) has been measured analogously to the decoder; however, this time the values have not been hard-coded within the encoder source code, but provided as outside parameters through command-line arguments to the encoder process. Such a solution allows separating the time prediction mechanism from the real-time encoder. The possible parameters are listed in Table 8. The code responsible for setting up the timeslices and the real-time mode is exactly the same as for RT-MD-LLV1 (given in Figure 86 on p. 232), such that the variables are set to the values taken from the arguments.

Argument            Description
-period_length      Length of the period used by the real-time main thread; given in [ms] => periodLength
-mandatory_length   Length of the mandatory timeslice, delivering LQA under the worst-case assumption; given in [ms] => baseLength
-optional_length    Length of the optional timeslice for improving the video quality; given in [ms] => enhanceLength
-cleanup_length     Length of the delivery timeslice for exploiting the data processed in the mandatory and optional parts; it should be specified according to the worst-case assumption; in case of specifying zero, the frame skip will always be applied; given in [ms] => cleanupLength

Table 8. Command line arguments for setting up timing parameters of the real-time thread (based on [Mielimonka, 2006])
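As an illustration, the following minimal C sketch shows how such command-line arguments could be mapped onto the timing variables. The parsing helper itself is hypothetical; only the argument names and the target variables follow Table 8.

    #include <stdlib.h>
    #include <string.h>

    /* Timeslice lengths in [ms], filled from the command line (cf. Table 8). */
    static int periodLength, baseLength, enhanceLength, cleanupLength;

    static void parse_timing_args(int argc, char **argv)
    {
        for (int i = 1; i + 1 < argc; i += 2) {
            if      (!strcmp(argv[i], "-period_length"))    periodLength  = atoi(argv[i + 1]);
            else if (!strcmp(argv[i], "-mandatory_length")) baseLength    = atoi(argv[i + 1]);
            else if (!strcmp(argv[i], "-optional_length"))  enhanceLength = atoi(argv[i + 1]);
            else if (!strcmp(argv[i], "-cleanup_length"))   cleanupLength = atoi(argv[i + 1]);
        }
    }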

101 Some kind of indirect application of the first three coefficients is possible in case of a resolution change: the MD-based coefficients would have to be rescaled analogously in the frequency domain, or transformed to the pixel domain and rescaled there. However, this causes an undeniable overhead of additional processing, which in case of a timeslice overrun may not be affordable.


IX.4.2. Preempter Definition

Analogically to RT-MD-LLV1, the encoder’s adaptive ability on the MB level also requires handling the DROPS kernel’s IPCs informing about the time progress. The timeslice overrun (TSO) IPC is handled differently for each timeslice of the main thread. For the mandatory timeslice, a TSO changes the thread state from mandatory to optional, and the processing continues in the enhancement mode. In case of an optional TSO, the rest of the enhancement processing has to be stopped and skipped. For the cleanup timeslice, a TSO should not happen; if it does, it indicates a wrong allocation due to erroneous parameters and is handled by skipping the frame if the processing of the current frame has not been finished yet. Obviously, the worst-case condition should be used for the time allocation of the delivery step. The deadline-miss IPC (DLM) is absolutely unintended for a hard-real-time adaptive encoder and raises a delivery error of the encoder, but only if the processing of the current frame has not been finished before. Contrary to the decoder, the encoder is stopped immediately in case of a DLM. The pseudo code of the encoder’s preempter thread is listed in Figure 90 (the full listing is given in Appendix G).

preempter:

    while (running) {
        receive_preemption_ipc(msg);
        switch (msg.type) {
        case TIMESLICE_OVERRUN:
            if (msg.ts_id == BASE_TS) {
                next_timeslice();
            } else if (msg.ts_id == ENHANCE_TS) {
                next_timeslice();
            } else if (msg.ts_id == CLEANUP_TS) {
                if (!frame_processing_finished)
                    skip_frame();
                next_timeslice();
            }
            break;
        case DEADLINE_MISS:
            if (!frame_processing_finished) {
                raise_delivery_error();
                stop_encoder_immediatly();
            }
            break;
        }
    }

Figure 90. Encoder’s preempter thread accompanying the processing main thread.


IX.4.3. MB-based Adaptive Processing

It can easily be noticed that there is a difference between the preempter in the decoder and in the encoder. The decoder’s preempter only changes the semaphore signaling that the timeslice overrun occurred, and the context of the current timeslice is not changed, i.e. the priority of the execution thread is not changed by the preempter but only after the currently processed MB has been finished. In contrast, the encoder’s preempter switches the context immediately, because there is no clear separation between the base and enhancement layers as in the decoder, i.e. MBs are encoded in the mandatory part within the same loop as in the enhancement part (for details see VII.3.5 Mapping of MD-XVID Encoder to HRTA Converter Model). The consequence is that not all MBs assigned to the optional part may be processed, because when switching to the clean-up step, the MB loop is left in order to allow finishing the frame processing. The deadline miss is again handled within this loop, but it should never occur during correct processing; it may only be expected in case of an erroneous allocation (e.g. when the period deadline occurs before the end of the cleanup timeslice) or system instability. Then, however, a more drastic action is taken than in the case of the real-time LLV1 decoder, namely the encoder is stopped at once by returning the error signal XVID_STOP_IT_NOW, such that the real-time processing is interrupted and no further data is delivered.

encode_frame(frameType):

    for (i = 0; i < mb_width * mb_height; i++) {

        // do calculations for given MB
        encode_MB(MB_Type);

        // REALTIME control within the MB loop – inactive in non-RT mode
        if (realtime_switch == REALTIME) {
            if ((MANDATORY) OR (OPTIONAL)) {
                continue;
            }
            if (CLEANUP) {
                // leave the MB loop and clean up
                break;
            }
            if (DEADLINE) {
                // only in erroneous allocation e.g. DEADLINE before CLEANUP
                // leave the MB loop & stop immediately
                return XVID_STOP_IT_NOW;
            }
        }
    }

Figure 91. Controlling the MB-loop in real-time mode during the processing of enhancement layer.


IX.4.4. Encoder’s Real-Time Loop

Owing to the proposed construction of the preempter (Figure 90) and the real-time support embedded within the code of the MB loop (Figure 91), there is no extra facility for controlling the real-time loop analogous to the one of the RT-MD-LLV1 decoder given in IX.3.4. The functionality responsible for switching the current real-time allocation context is included in the preempter thread, i.e. it calls the next_timeslice() function, which in turn executes the DROPS-specific next_reservation() function for the subsequent timeslice. The final stage after the clean-up step, which switches to the context of the non-real-time timeslice where idle processing by wait_for_next_period() occurs, is executed in all cases except when the XVID_STOP_IT_NOW signal was generated (in other words, not if a deadline miss occurred).
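A minimal sketch of such a context-switching helper is given below. The static state variable and the cyclic mapping onto the reservations ts1 to ts3 are assumptions made for illustration only and do not reproduce the actual RT-MD-XVID code; next_reservation() denotes the DROPS-specific primitive mentioned above.

    /* Hypothetical sketch: advance the real-time allocation context. */
    enum { TS_BASE = 0, TS_ENHANCE = 1, TS_CLEANUP = 2 };
    static int current_ts = TS_BASE;

    static void next_timeslice(void)
    {
        switch (current_ts) {
        case TS_BASE:    next_reservation(ts1); break;  /* leave base, enter enhancement   */
        case TS_ENHANCE: next_reservation(ts2); break;  /* leave enhancement, enter cleanup */
        case TS_CLEANUP: next_reservation(ts3); break;  /* leave cleanup, enter non-RT part */
        }
        current_ts = (current_ts + 1) % 3;
    }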



Chapter 5 – Evaluation and Application

Statistics are like bikinis. What they reveal is suggestive, but what they conceal is vital. Aaron Levenstein (Why People Work)

X. EXPERIMENTAL MEASUREMENTS

Sun, Chen and Chiang have proposed in their book [Sun et al., 2005] to conduct the evaluation of transcoding in two phases: one is called static tests, the other dynamic tests. The static tests evaluate only the coding efficiency of the transcoding algorithm, which is obviously done with respect to the data quality (QoD). Typical graphs are PSNR vs. bit rate [kbps] or vs. bits per pixel [bpp] without bit-rate control, or PSNR vs. frame number within the video sequence where a bit-rate control algorithm or a fixed quantization parameter is used (as a constant factor) – e.g. the ρ-domain bit-rate control algorithm with two network profiles: a VBR profile varying in 2-second periods and a constant CBR profile. The dynamic tests are used for evaluating the real-time behavior of the transcoding process. However, [Sun et al., 2005] only proposes the use of a simulation environment, namely a test based on the MPEG-21 P.12 test bed [MPEG-21 Part XII, 2004], to simulate the real-time behavior of the transcoder. It follows the rules of the workload estimation process, which includes the following four stages:

1) developing the workload model generating the controlled media streams (e.g. various bitrates, resolutions, framerates) using different statistical distributions;


2) developing estimation of resource consumption;

3) developing the application for generating the workload according to specified model and allowing load scalability (e.g. number of media streams);

4) measuring and monitoring the workload and the resource consumption.

However, such a model only simulates the real-time behavior and will never precisely reflect a real environment covering a set of real media data. Thus it was decided at the very beginning of the RETAVIC project not to use any modeling or simulation environment, but to prepare prototypes running on a real computing system under the selected RTOS with a few media sequences well recognized by the research community focused on audio-video processing.

X.1. The Evaluation Process

In analogy to the two mentioned phases, the evaluations have been conducted in both directions. The static tests have already been included in the respective sections, for example in Evaluation of the Video Processing Model through Best-Effort Prototypes (section V.6). The dynamic tests covering execution-time measurements have been conducted under best-effort as well as real-time OSes. Those run under best-effort systems, for example depicting the behavior of converters (e.g. in section V.1 Analysis of the Video Codec Representatives), have been used only for imprecise time measurements due to the raised risk of external disruption caused by thread preemption and the potential precision errors of the available timing functions.

In contrast, the benchmarks executed under the RTOS have delivered precise execution-time measurements, which are especially important in real-time systems. They have been used for quantitative evaluation in two ways under DROPS: without scheduling (non-real-time mode) and with scheduling (real-time mode). The dynamic tests under the RTOS in non-RT mode have already been exploited in the previous Design of Real-Time Converters section (VII.3) for evaluating the accuracy of the prediction algorithms. The other dynamic benchmarks under the RTOS in RT as well as non-RT mode are discussed in the following subsections. They quantitatively evaluate the real-time behavior of the converters, either including the scheduling according to the HRTA converter model (with TSO / DLM monitoring) or including the execution time in the non-RT mode for comparisons with the best-effort implementations.

Since the RETAVIC architecture is a complex system requiring a big team of developers to get it implemented completely, only a few parts have been evaluated. Of course, these evaluations derive directly from the implementation, i.e. they are conducted only for the implemented prototypes explained in the previous chapter. Each step in the evolution of the implemented prototypes has to be tested as to whether it fulfills the demanded quantitative requirements, which can only be checked by real measurements. In addition, the real-time converter itself needs measurements for analyzing the performance of the underlying system. The following sections discuss measurements as a basis for time prediction, calibration and admission in the run-up to the encoding process. A time trace, which can be delivered by time recording during the real-time operation, can be used to recognize overrun situations and to adjust the processing in the future.

X.2. Measurement Accuracy – Low-level Test Bed Assumptions

The temporal progress of a computer program is expected to be directly related to the binary code and the processed data, i.e. the execution should go through the same sequence of processor instructions for a given combination of input data (functional correctness), and thus the time spent on a given set of subsequent steps should be the same. However, it has been proven that there exist many factors that can influence the execution time [Bryant and O'Hallaron, 2003]. These factors have to be eliminated as much as possible in every precise evaluation process, but beforehand they have to be identified and classified according to their possible impacts.

X.2.1. Impact Factors

There are two levels of impact factors according to the time scale of the duration of computer-system events [Bryant and O'Hallaron, 2003], and they are depicted in Figure 92. The microscopic granularity covers such events as processor instructions (e.g. addition of integers, floating-point multiplication or floating-point division) and is measured in nanoseconds on a gigahertz machine102. The deviations existing here derive from impacts such as the branch prediction logic and the processor caches [Bryant and O'Hallaron, 2003]. A completely different type of impact can be classified on the macroscopic level and covers external events, for example disc-access operations, refresh of the screen, keystrokes or other devices (network card, USB controller, etc.). The duration of macroscopic events is measured in milliseconds, which means that they last about one million times longer than microscopic events. An external event generates an interrupt (IRQ) to activate the scheduler [Bryant and O'Hallaron, 2003], and if the IRQ handler has a higher priority than the current thread according to the scheduling policy, the preemption of the task occurs (for more details see section VII.3.1.3 Thread models – priorities, multitasking and caching) and unquestionably generates errors in the results of the evaluation process.

Figure 92. Logarithmic time scale of computer events [Bryant and O'Hallaron, 2003].

As mentioned, the impact factors being sources of irregularities have to be minimized to get highly accurate measurements. The platform-specific factors discussed in section VII.3.1 are considered already in the design phase, and especially the third subsection considers the underlying OS’s factors. Contrary to best-effort operating systems, some of the impacts can be eliminated in DROPS. Owing to the closed run-time system as described in section XVIII.3 of Appendix E, the macroscopic events such as device, keystroke or even disc-access interrupts have been eliminated, thus achieving a level acceptable for the evaluation of the transcoding processes.

102 Obviously, they may be measured in tenths or even hundredths of a nanosecond for faster processors, and in hundreds of nanoseconds or in microseconds on megahertz machines.

X.2.2. Measuring Disruptions Caused by Impact Factors

On the other hand, not all impacts can be eliminated, especially those caused by the microscopic events. In that case, they should be measured and then considered in the final results. A simple method applicable here is to measure the same procedure many times and to consider warm-up, measurement and cool-down phases [Fortier and Michel, 2002].
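The following generic C sketch illustrates such a harness; it is not the actual benchmark code, and measure_once() is a hypothetical placeholder for one timed execution. It runs the procedure repeatedly, discards the warm-up and cool-down runs and evaluates only the measurements in-between.

    #include <math.h>
    #include <stdio.h>

    #define RUNS     110
    #define WARMUP   10
    #define COOLDOWN 10

    extern unsigned long long measure_once(void);   /* one execution time, e.g. in ns */

    static void run_benchmark(void)
    {
        unsigned long long t[RUNS];
        for (int i = 0; i < RUNS; i++)
            t[i] = measure_once();

        /* evaluate only the measurements between warm-up and cool-down */
        int n = RUNS - WARMUP - COOLDOWN;
        double sum = 0.0, max_dev = 0.0, mean;
        for (int i = WARMUP; i < RUNS - COOLDOWN; i++)
            sum += (double)t[i];
        mean = sum / n;
        for (int i = WARMUP; i < RUNS - COOLDOWN; i++) {
            double dev = fabs((double)t[i] - mean);
            if (dev > max_dev) max_dev = dev;
        }
        printf("avg = %.0f ns, max. abs. deviation = %.0f ns\n", mean, max_dev);
    }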

X.2.2.1 Deviations in CPU Cycles Frequency

An example of measuring the CPU frequency103, which may influence the time measurement of the transcoding process, is depicted in Figure 93. Here the warm-up as well as the cool-down phase covers 10 executions each, and the 90 measured values in-between are selected as results for the evaluation. Two processors have been evaluated: a) AMD Athlon 1800+ and b) Intel Pentium Mobile 2 GHz.


Figure 93. CPU frequency measurements in kHz for: a) AMD Athlon 1800+ and b) Intel Pentium Mobile 2 GHz.

The maximum absolute deviation and the standard deviation were equal to 1.078 kHz and 0.255 kHz, respectively, for the first processor (Figure 93 a) – please note that the scale range of the graph is on the level of 100 kHz (being 1/10000 of a GHz) in order to show the aberrations. Thus the absolute error, being on the level of 1/100,000th of a percent (i.e. 1·10⁻⁷ of the measured value), is negligible with respect to the one-second period. On the other hand, the maximum absolute deviation can also be expressed in absolute values in nanoseconds. The measurement period of one second is divided into 10⁹ nanoseconds, and during this period there are on average 1,533,086,203 clock ticks plus/minus 1,078 clock ticks. That makes the error expressed in nanoseconds equal to 703 ns, and this value is considered later on.

103 The frequency instability is a well-known impact factor, and it can be recognized by a simple CPU performance evaluation that loops an assembler-based CPU-counter read during a one-second period derived from the timing functions. The responsible prototypical source code is implemented within the MD-LLV1 codec in the utils subtree in the timer.c module (functions: init_timer() and get_freq()).

Another situation can be observed for the second processor. Here the noticeable short drop of the frequency to about 1.33 GHz is caused by an internal processor controlling facility (such as energy saving and overheating protection). Even if this drop is omitted, i.e. only 70 instead of 90 values are considered, both the maximum absolute deviation and the standard deviation are a few hundred times higher than for the previous processor, being equal to 777.591 kHz and 203.497 kHz respectively. The absolute error on the level of 1/100 of a percent (i.e. 1·10⁻⁵) is bigger than the previous one but could still be ignored; however, the unpredictable controlling facility generates impacts of large frequency changes (of about 33%) that are too complex to be considered in the measurements and are not in the scope of this work. Thus the precise performance evaluations have been executed on a processor working with exactly one frequency (e.g. the AMD Athlon), where no frequency switching is possible as in the Intel Pentium Mobile processors104.

Finally, the error caused by the deviations in CPU cycles frequency definitely influences the time estimation of the microscopic events but can be neglected for the macroscopic events. Further, depending on the investigated event type (micro vs. macro), this error should or should not be considered.
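Footnote 103 outlines how the frequency has been determined; the following C sketch merely reproduces this idea on x86 by counting TSC ticks over a one-second interval derived from standard timing functions. It is an illustration only, not the actual timer.c code.

    #include <stdint.h>
    #include <time.h>

    static inline uint64_t read_tsc(void)
    {
        uint32_t lo, hi;
        __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
        return ((uint64_t)hi << 32) | lo;
    }

    /* Estimate the CPU frequency in kHz by counting TSC ticks over one second. */
    static uint64_t estimate_cpu_freq_khz(void)
    {
        time_t start = time(NULL);
        while (time(NULL) == start)      /* align to a second boundary */
            ;
        uint64_t t0 = read_tsc();
        time_t sec = time(NULL);
        while (time(NULL) == sec)        /* busy-wait for one full second */
            ;
        uint64_t t1 = read_tsc();
        return (t1 - t0) / 1000;         /* ticks per second -> kHz */
    }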

X.2.2.2 Deviations in the Transcoding Time

An example of executing the coding of the same data repeatedly is used for recognizing the measurement errors of the transcoding. Here the impact factors have been analyzed in the context of the application by determining the variations of the MD-XVID encoder under DROPS for the Coastguard CIF and Foreman QCIF sequences, with 100 executions for each sequence (with warm-up and cool-down phases of 10 executions each). The interruption by external macroscopic events has been eliminated by running only the critical parts of DROPS (detailed in section XVIII.3 of Appendix E). Next, three independent frames (i.e. the 100th, 200th, and 300th) out of all frames in the sequence have been selected for the comparison of the execution time and the recognition of the level of the deviations. The results are listed in Table 9.

104 The frequency switching is probably the reason for the problems during the attempt to define the machine index described previously in section VII.3.1.1 Hardware architecture influence (on p. 193).

Sequence         Frame No.   Avg. Execution   Max. Abs.        Standard         Max. Abs.   Standard
                             Time [ns]        Deviation [ns]   Deviation [ns]   Error [%]   Error [%]
Foreman QCIF     100         8 669 832        158 198          40 972           1.82%       0.47%
Foreman QCIF     200         8 501 218        148 223          39 212           1.74%       0.46%
Foreman QCIF     300         8 549 199        162 409          43 531           1.90%       0.51%
Coastguard CIF   100         31 398 821       251 021          66 213           0.80%       0.21%
Coastguard CIF   200         30 814 549       260 892          70 911           0.85%       0.23%
Coastguard CIF   300         30 231 784       269 940          64 432           0.89%       0.21%

Table 9. Deviations in the frame encoding time of the MD-XVID in DROPS caused by microscopic factors (based on [Mielimonka, 2006]).

It can be seen that the maximum absolute error is now on the level of one percent (i.e. 1·10⁻²), while the standard error (counted from the standard deviation) is on the level of a few tenths of a percent. Obviously, these errors derive only from the microscopic impact factors, since the maximum absolute deviation is 1000 times smaller than the duration of a macroscopic event; if a macroscopic event had appeared during the execution of the performance test, the absolute deviation would be on the level of milliseconds.

In contrast to the CPU-frequency error expressed in thousands of nanoseconds, the maximum absolute deviation of the frame encoding time is expressed in hundreds of thousands of nanoseconds. Consequently, the contribution of the CPU-frequency fluctuation error to the frame-encoding error amounts to a few tenths of a percent, i.e. 703 ns vs. 158,198 ns gives about 0.44% of the encoding error. In other words, the error measured here is a few hundred times bigger than the one deriving from the measurements based on the clock-cycle counter and the CPU frequency.


Additionally, the standard error calculated here is on the same level as the average standard error obtained in the previous multiple-execution measurements of the machine index, which was equal to or smaller than 0.7% (see section VII.3.1.1 Hardware architecture influence); this may be regarded as a confirmation of the correctness of the measurement.

Due to the facts given above, the CPU frequency deviations, although an undeniable burden for the microscopic events, are neglected in the frame-based transcoding-time evaluations. Secondly, the time values measured per frame can be represented by numbers having only the first four most significant digits.

X.2.3. Accuracy and Errors – Summary

Finally, the accuracy of the measurements in the context of the frame-based transcoding is on the level of a few tenths of a percent. The impact factors of the macroscopic events have been eliminated, which was proved by errors being a thousand times smaller than the duration of a macroscopic event. Moreover, the encoding task per frame in QCIF takes roughly the same time as a macroscopic event.

The measurement inaccuracy is mainly caused by microscopic events being influenced by the branch prediction logic and the processor caches. The fluctuations of the CPU frequency are unimportant for the real-time frame-based transcoding evaluations due to the minor influence on the final error of the transcoding time counted per frame and they are treated as spin-offs or side-effects.

However, if the level of measurement goes beneath the per-frame resolution (e.g. time measured per MB), the CPU frequency fluctuations may gain in importance and thus may influence the results. In such a case, the frequency errors should not be treated as side-effects but considered as meaningful for the results. These facts have to be kept in mind during the evaluation process.


X.3. Evaluation of RT-MD-LLV1

The experimental system on which the measurements have been conducted is the PC_RT described in Appendix E, and the DROPS configuration is given in section XVIII.3 of this appendix.

X.3.1. Checking Functional Consistency with MD-LLV1

To investigate the behavior of the RT-MD-LLV1 converter with respect to the requested data quality, four tests have been executed for Container CIF, in which the time per frame spent in each timeslice type according to the HRTA converter model has been measured. The results are depicted collectively in one graph (Figure 94).


Figure 94. Frame processing time per timeslice type depending on the quality level for Container CIF (based on [Wittmann, 2005]).

Each quality level corresponds to the number of processed QELs, i.e. none, one, two or all three. The times spent in the base (“mandatory_”), enhancement (“optional_”) and cleanup (“delivery_”) timeslices are depicted. The lowest quality level is denoted by the “_QBL” suffix in the graph. The mandatory_QBL curve (thick and black) is on the level of 5.4 ms per frame (it is covered by another curve, namely delivery_QEL1), and respectively the optional_QBL is at 0.03 ms and the delivery_QBL at 0.8 ms, which obviously is correct since in the lowest quality no calculations are done in the optional part. The higher qualities (“_QELx”) required more time than the lowest quality. Taking more time for higher-quality processing was expected for the enhancement and cleanup timeslices, but not for the base timeslice, since it processes exactly the same amount of base layer data; on the other hand, this difference between mandatory_QBL and mandatory_QELx may stem from the optimizations of the RT-MD-LLV1, where no frame is prepared for further enhancing if only the base quality is requested. It is also noticeable that the processing of the base timeslice took the same time for all three higher quality levels (mandatory_QELx).

Summarizing, it is clearly visible that the RT-MD-LLV1 decoder behaves, as assumed, similarly to the best-effort implementation of the LLV1 decoder (see Figure 24 on p. 123), i.e. the higher the requested quality, the more time for the processing of the enhancement layers has to be allocated to the optional_QELn timeslice (see the respective optional curves of QEL1, QEL2, and QEL3). What is more, the curves are nearly constant, flat lines, thus the processing is more stable and better predictable.

X.3.2. Learning Phase for RT Mode

As the allocation is based on average and maximum execution times on the macro-block level (see VII.3.4 Mapping of MD-LLV1 Decoder to HRTA on p. 216), the time consumed in the different timeslices was measured. This was done by setting the framerate down to a value at which all MBs in the highest quality level (up to QEL3) could easily be processed, such that the reserved processing time was even ten times bigger than the average case, e.g. for CIF and QCIF videos this was 10 fps (i.e. a period of 100 ms) and for PAL video 4 fps (i.e. a period of 250 ms). Of course, a waste of idle resources took place in such a configuration. The period had the following timeslices assigned (as defined in VII.3.2 Timeslices in HRTA): 30% of the period for the base timeslice, the next 30% for the enhancement timeslice, 20% for the cleanup timeslice, and the remaining 20% for the idle non-RT part (i.e. waiting for the next period).
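This split can be expressed by a small helper; the following is only a sketch with assumed names, using the 30/30/20/20 percentages stated above.

    typedef struct {
        double base_ms, enhance_ms, cleanup_ms, idle_ms;
    } TimesliceSplit;

    /* Learning-phase allocation: the period derives from the (deliberately low)
     * framerate and is divided 30/30/20/20 among the timeslices. */
    static TimesliceSplit learning_phase_split(double fps)
    {
        double period_ms = 1000.0 / fps;       /* e.g. 10 fps -> 100 ms, 4 fps -> 250 ms */
        TimesliceSplit s;
        s.base_ms    = 0.30 * period_ms;       /* base (mandatory) timeslice       */
        s.enhance_ms = 0.30 * period_ms;       /* enhancement (optional) timeslice */
        s.cleanup_ms = 0.20 * period_ms;       /* cleanup (delivery) timeslice     */
        s.idle_ms    = 0.20 * period_ms;       /* remaining non-RT (idle) part     */
        return s;
    }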

The time really consumed by each timeslice has been measured per frame and normalized per MB according to the number of MBs specific to each resolution (e.g. QCIF => 99 MBs) in order to allow a comparison of different resolutions. This average time per MB for each frame is shown in Figure 95 for the base timeslice, in Figure 96 for the enhancement timeslice, and in Figure 97 for the cleanup timeslice. Please notice that for the enhancement timeslice (Figure 96) the time is measured per MB but across all quality enhancement layers (up to QEL3), i.e. the given time is the sum of the times in QEL1, QEL2 and QEL3 per one MB and leads to the lossless video data.


Figure 95. Normalized average time per MB for each frame consumed in the base timeslice (based on [Wittmann, 2005]).


Figure 96. Normalized average time per MB for each frame consumed in the enhancement timeslice (based on [Wittmann, 2005]).

As can be seen in all three figures above, the average execution time per MB for the same video does not vary much across frames. There are some minor deviations in the curves, but in general the curves are almost constant for each video. Still, there is a noticeable difference between the videos, but it cannot be deduced that this difference is directly related to the resolution, as one could expect.



Figure 97. Normalized average time per MB for each frame consumed in the cleanup timeslice (based on [Wittmann, 2005]).

To investigate the existing differences in more detail, the average and maximum times have been calculated, and the difference between them has been expressed as a percentage of the average. The detailed results of the execution time over all frames of a given video are listed in Table 10. The smallest difference between average and maximum time (Δ%) can be noticed for the cleanup timeslice. For the enhancement timeslice the difference between the average case and the worst case is bigger, and for the base timeslice the differences are the biggest, reaching up to 20% of the average time. On the other hand, a worst-case time being only 20% bigger than the average case is already a very good achievement considering video decoding and its complexity.

Time per MB [µs]
Video Sequence             Base TS                  Enhancement TS           Cleanup TS
Name                       avg     max     Δ%       avg     max     Δ%       avg     max     Δ%
container_cif              17.39   18.23    4.83%   39.15   40.71    3.98%   14.51   14.80   2.00%
container_qcif             18.97   20.69    9.07%   45.97   49.95    8.66%   12.50   12.74   1.92%
mobile_qcif                29.35   30.35    3.41%   54.10   55.31    2.24%   12.67   12.99   2.53%
mother_and_daughter_qcif   19.47   23.24   19.36%   45.80   48.65    6.22%   13.10   13.17   0.53%
shields_itu601             20.33   23.92   17.66%   39.20   45.26   15.46%   14.25   14.88   4.42%

Table 10. Time per MB for each sequence: average for all frames, maximum for all frames, and the difference (max-avg) in relation to the average (based on [Wittmann, 2005]).


There are, however, differences between the videos, e.g. the average case for Container differs much from Mobile (both in QCIF), i.e. 18.97 µs vs. 29.35 µs. This can be explained by the different number of coded blocks in the videos: in Mobile roughly 550 blocks (out of 594, which is 6 blocks·99 MBs) have been coded per frame for the base layer, whereas in Container only about 350 blocks have been coded, while the applied normalization considered the resolution (i.e. the constant number of MBs) and not the really coded blocks.

X.3.3. Real-time Working Mode

Now, with the average and maximum times measured in the learning phase, an allocation is possible for each video. The framerate can be set to an appropriate level in accordance with the user request and the capabilities and characteristics of the decoding algorithm. When the framerate is set gradually higher, the quality naturally drops, because the timeslice for the enhancement-layer processing becomes smaller, i.e. the optional part consumes less and less time. There is, however, a limit where the framerate cannot be raised anymore and the allocation is refused in order to guarantee the base-layer quality being the lowest acceptable quality (for details see the check condition given by Equation (66) on p. 217). The measurements of the percentage of processed MBs for the different enhancement layers are depicted in Figure 98 for Mobile CIF, in Figure 99 for Container QCIF and in Figure 100 for Parkrun ITU601 (PAL).
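Equation (66) itself is not reproduced here; the following sketch (all names are hypothetical) only illustrates the general shape of such an admission test: the allocation is accepted only if even the worst-case base-layer time of a frame fits into the mandatory timeslice derived from the requested framerate.

    /* Hypothetical sketch of an admission test in the spirit of the feasibility
     * check: refuse the allocation if the worst-case base-layer decoding of one
     * frame does not fit into the mandatory timeslice derived from the framerate. */
    static int admit_framerate(double fps, int mbs_per_frame,
                               double worst_case_base_us_per_mb,
                               double base_share)   /* e.g. 0.30 of the period */
    {
        double period_us          = 1e6 / fps;
        double base_timeslice_us  = base_share * period_us;
        double worst_case_base_us = worst_case_base_us_per_mb * mbs_per_frame;
        return worst_case_base_us <= base_timeslice_us;   /* 1 = accept, 0 = refuse */
    }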


Figure 98. Percentage of decoded MBs for enhancement layers for Mobile CIF with increasing framerate (based on [Wittmann, 2005]).



Figure 99. Percentage of decoded MBs for enhancement layers for Container QCIF with increasing framerate (based on [Wittmann, 2005]).


Figure 100. Percentage of decoded MBs for enhancement layers for Parkrun ITU601 with increasing framerate (based on [Wittmann, 2005]).

For framerates small enough (e.g. 29 fps or less for Mobile CIF, 122 fps for Container QCIF and 6 fps for Parkrun ITU601), the complete video can be decoded with all enhancement layers, achieving a lossless reconstruction of the video. For higher framerates the quality has to be adapted by leaving out the remaining unprocessed MBs of the enhancement layers. Thus the quality may be changed not only according to the levels of certain layers (such as the PSNR values for each enhancement layer presented already in section V.6.2, where the levels achieve about 32 dB, 38 dB, 44 dB and ∞ dB respectively, with a difference of roughly 6 dB), but fine-grain scalability is also achievable. Since the LLV1 enhancement algorithm is based on bit planes, the amount of processed bits from the bit plane (deriving from the number of processed MBs) is linearly proportional to the gained quality expressed in PSNR values, assuming an equal distribution of error-bit values in the bit plane. For example, the framerate of 36 fps for Mobile CIF allows achieving more than 44 dB, because the complete QEL2 and 5% of QEL3 are decoded, and if the framerate targets 40 fps, then roughly 41 dB is obtained (QEL1 completely and about 50% of QEL2).
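Under the equal-distribution assumption, the expected quality can thus be approximated by a simple interpolation over the partially processed layer; the following sketch uses the approximate per-layer PSNR values quoted above and is an illustration only, not part of the RT-MD-LLV1 code.

    #include <math.h>

    /* Approximate decoded quality (PSNR in dB) for 'complete_layers' fully processed
     * enhancement layers (0..3) plus 'fraction_of_next' (0..1) of the next layer.
     * ~6 dB per full layer; a partially decoded QEL3 yields only a lower bound,
     * since QEL3 leads to lossless reconstruction. */
    static double approx_psnr(int complete_layers, double fraction_of_next)
    {
        const double layer_psnr[3] = { 32.0, 38.0, 44.0 };  /* BL only, +QEL1, +QEL2 */

        if (complete_layers >= 3)
            return INFINITY;                     /* all enhancement layers -> lossless */
        double psnr = layer_psnr[complete_layers];
        if (complete_layers < 2)
            psnr += 6.0 * fraction_of_next;      /* linear gain within QEL1 or QEL2 */
        return psnr;                             /* for partial QEL3: lower bound 44 dB */
    }
    /* approx_psnr(1, 0.50) ~= 41 dB (QEL1 complete, ~50% of QEL2, cf. 40 fps above) */
    /* approx_psnr(2, 0.05)  > 44 dB (QEL2 complete,   5% of QEL3, cf. 36 fps above) */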

If the framerate is set too high, the feasibility check according to Equation (66) will fail, because the base layer cannot be processed completely anymore. Such a situation occurs for the given test bed when setting the framerate to 61 fps for Mobile CIF, and respectively to 291 fps for Container QCIF and 15 fps for Parkrun ITU601. Consequently, the allocation is refused for such framerates, since the LQA cannot be guaranteed. Of course, the user is to be informed about this fact deriving from the lack of resources.

X.4. Evaluation of RT-MD-XVID

X.4.1. Learning Phase for RT-Mode

The learning phase has been implicitly discussed in the Precise time prediction section, i.e. the measurements have been conducted to find out the calculation factors for each of the mentioned prediction methods. The graphs depicting those measurements are given in: Figure 68, Figure 69, Figure 72, Figure 75, Figure 76, Figure 80, Figure 81 and Figure 82. However, no quality aspects have been investigated there, neither with respect to the duration of the mandatory timeslice nor with respect to the requested lowest quality acceptable (LQA). Thus some more benchmarks have been conducted.

At first the complete coding (without loss of any MB) of the videos has been conducted. The implemented RT-MD-XVID has been used for this purpose; however, no time restrictions have been specified, i.e. the best-effort execution mode has been used such that neither timeslice overruns nor deadline misses nor encoding interruptions have occurred. The goal was to measure the average encoding time per frame with the minimum and maximum values, and to find out the relevant deviations. The minimum value represents the fastest encoding and reflects the I-frame encoding time, while the maximum value represents the slowest encoding and, depending on the use of B-frames, indicates the P- or B-frames. The results are depicted in Figure 101.

[Data shown in Figure 101 – average encoding time per frame and its deviation, both in ms:
Carphone QCIF (IP): 8.60 (0.44); Carphone QCIF (IPB): 8.91 (0.47); Coastguard CIF (IP): 31.12 (0.72);
Football CIFN (IP): 27.14 (0.80); Foreman QCIF (IP): 8.86 (0.36); Mobile QCIF (IP): 9.17 (0.26);
Mobile CIFN (IP): 28.46 (1.18)]

Figure 101. Encoding time per frame of various videos for RT-MD-XVID: a) average and b) deviation.

The deviation is on the level of 2.3% to 5.2%. An interesting fact is that for higher resolutions peaks occur (see the maximum values), but the overall execution is more stable (with a smaller percentage deviation). In any case, the real-time implementation of the MD-based encoding is, in comparison to best-effort standard encoding (see Figure 9), much more stable and predictable, since the deviation is on a lower level (up to 5.2% versus 6.6% to 24.4%) and the max/min ratio is noticeably lower (the slowest frame requires 1.2 to 1.33 times more processing than the fastest frame, in contrast to 1.93 to 4.28 times).

Although the above results prove the positive influence of the MD on processing stability and thus allow using the average coding time in the prediction, it is most interesting to see whether the time constraints can be met during the real-time processing. If it is assumed that a one-CPU system is given (e.g. the one used for the above tests – see section XVIII.3 in Appendix E), that the encoding may use only half of the CPU’s processing time (the other half is meant for decoding and other operations), and that the frame rate of the test videos is equal to 25 fps, then there are only 20 ms per frame available (out of the 40 ms period). Consequently, only the QCIF encoding can be executed within such a time without any loss of MBs (complete encoding). In the other cases (CIF), the computer is not powerful enough to encode the videos completely in real time. The situation is analogous for the sequences with the higher frame rate (CIFN), i.e. the available 50% of the CPU time maps to 16.67 ms out of the 33.33 ms period (due to 30 fps). Then the test-bed machine is also not capable of encoding the complete frame, so quality losses are indispensable.
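The per-frame budget used in this argument follows directly from the framerate and the CPU share reserved for encoding; the trivial helper below only makes the arithmetic explicit.

    /* Available encoding time per frame in ms for a given framerate and CPU share. */
    static double encode_budget_ms(double fps, double cpu_share)
    {
        return cpu_share * (1000.0 / fps);
        /* 0.5 * (1000/25) = 20.00 ms,  0.5 * (1000/30) = 16.67 ms */
    }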

Thus, the next investigation covers the analysis of the time required for a specified quality expressed by a given amount of processed MBs. As already mentioned in Mapping of MD-XVID Encoder to HRTA Converter Model (section VII.3.5), the mandatory and cleanup timeslices are calculated according to the worst-case processing time selected as the maximum of the frame-type-dependent default operations as defined in Equations (70) and (71). Since the time required for the default operations is the smallest for I-frames and the biggest for B-frames, there is always some idle time in the base timeslice for processing at least some of the MBs (see relaxation condition (73)).


Figure 102. Worst-case encoding time per frame and achieved average quality vs. requested Lowest Quality Acceptable (LQA) for Carphone QCIF.


On the other hand, this definition allows delivering all frames, but it may happen that the B-frames have no MBs coded at all, the P-frames have only a few MBs coded, and the I-frames a few more than the P-frames. Since the MB coding is generally treated as optional, there is no possibility to define a minimum quality per frame that should be delivered. Thus the worst-case execution time has been measured per LQA defined on different levels, i.e. different amounts of MBs to be always coded have been requested and the worst encoding time per frame has been measured. This worst-case time is depicted in Figure 102 (right scale). The quality steps correspond to every 10% of all MBs (as given on the X-axis). Having such LQA-specific worst-case times, guarantees of delivering the requested LQA can be given. Please note that the values for the video with P-frames only are lower than for the one with P&B-frames.

Now, if the measured worst-case time of a given LQA is used as the basis for the base timeslice calculation, then the achieved quality will definitely be higher (of course assuming the relaxation condition, which allows moving some of the MB-coding part into the base timeslice). To verify the really obtained quality with guaranteed LQA, the encoding was executed once again, not with the base timeslice calculated according to (70) but with the timeslice set to the measured LQA-specific worst-case execution time. The really achieved quality is depicted in Figure 102 (left scale). It is clearly visible that the obtained quality is higher than the LQA. Moreover, the bigger the requested LQA is, the smaller the differences between the achieved quality and the LQA are, e.g. for LQA=90% the achieved quality is equal to 95% for the P-only as well as the P&B encoded video, and for LQA=10% the P-only video achieves 28% and the P&B encoding achieves 40%.

X.4.2. Real-time Working Mode

Finally, the time-constrained execution fully exploiting the characteristics of the HRTA converter model has to be conducted. Here the encoding is driven by the available processor time and by the requested quality. Of course, the quality is to be mapped onto the predicted time according to the learning phase, and it must be checked whether the execution is feasible at all. For the following tests, the feasibility check was assumed to be always positive, since the goal was to analyze the quality drop with respect to the time constraints (analogically to the previously tested RT-MD-LLV1). Moreover, not all but only the most interesting configurations of the timeslice division are depicted; namely, for the investigated resolutions the respective mandatory and optional timeslices have been chosen such that the ranges depict the quality-drop phase for the first fifty frames of each video sequence. In other words, timeslices too big, where no quality drop occurs, or too small, where none of the MBs is processed, are omitted. Additionally, the cleanup time is not included in the graphs, since its worst-case assumption allowed finishing all frames correctly. The results are depicted in Figure 103, Figure 104, and Figure 105.

Figure 103 shows two sequences with QCIF resolution (i.e. 99 MBs) for two configurations of mandatory and optional timeslices:

1) t_base_ts = 6.6 ms and t_enhance_ts = 3 ms

2) t_base_ts = 5 ms and t_enhance_ts = 3 ms

The first configuration was prepared under the assumption of running three encoders in parallel within the limited period size of 20 ms, and the second one assumes the parallel execution of four encoders. As can be seen, the RT-MD-XVID is able to encode on average 57.2 MBs for Mobile QCIF and 65.4 MBs for Carphone QCIF in the first configuration, with standard deviations of 2.3 and 5.0 MBs respectively. This results in processing about 58% and 66% of all MBs, which according to [Militzer, 2004] should yield a PSNR quality of about 80% to 85% in comparison to the completely coded frame. If the defined enhancement timeslice is used as well, the encoder processes all MBs for both video sequences.

For the second configuration, the RT-MD-XVID encodes on average 21.4 MBs of Mobile QCIF and 24.6 MBs of Carphone QCIF. The standard deviations amount to 1.2 MBs for both videos, which is also visible in the flattened curves. Thus, processing only 21% and 25% of all MBs respectively results in a very low but still acceptable quality [Militzer, 2004]105. And even if the enhancement part is used, the encoded frame will sometimes (Carphone QCIF) or always (Mobile QCIF) have some MBs still not coded. Based on the above results, an acceptable quality range of the encoding time having a minimum of 5 ms and a maximum of 9.6 ms can be derived.

105 The minimal threshold of 20% of all MBs shall be considered for frame encoding.


Figure 103. Time-constrained RT-MD-XVID encoding for Mobile QCIF and Carphone QCIF.

Analogically to the QCIF resolution, the evaluation of the RT-MD-XVID encoding has been conducted for the CIFN (i.e. 330 MBs) and CIF (i.e. 396 MBs) resolutions, but here it was assumed that only one encoder works at a time. So the test timeslices are defined as follows:

1) for CIFN: t_base_ts = 16.6 ms and t_enhance_ts = 5 ms

2) for CIF: t_base_ts = 20 ms and t_enhance_ts = 10 ms

The results are depicted in Figure 104. The achieved quality was better for the CIF than for the CIFN sequence, i.e. the number of coded MBs reached 149.8 MBs for Coastguard CIF vs. 86.8 MBs for Mobile CIFN (respectively 37.8% and 26.3%), and the standard deviation was on the level of 9.9 and 4.8 MBs. The disadvantage of CIFN derives directly from the smaller timeslice, which is caused by the additional costs of having 5 more frames per second. On the other hand, the content may also influence the results. In any case, the encoder was not able to encode all MBs for either video even when the defined enhancement timeslice was used. In the case of Coastguard CIF, 12% to 21% of all MBs per frame were missing, but only for the first few frames.


In contrast, the encoding of Mobile CIFN produced only 38% to 53% of all MBs per frame along the whole sequence.

Figure 104. Time-constrained RT-MD-XVID encoding for Mobile CIFN and Coastguard CIF.

The last experiment has targeted the processing of B-frames. Therefore the RT-MD-XVID has been executed with the additional option -use_bframes. Assumptions analogous to the QCIF case without B-frames have been made, i.e.:

1) t_base_ts = 6.6 ms and t_enhance_ts = 3 ms

2) t_base_ts = 5 ms and t_enhance_ts = 3 ms

The results of the RT-MD-XVID encoding for two QCIF sequences are depicted in Figure 105. The minimal threshold of processed MBs of Carphone QCIF in the mandatory part has been reached only for the I- and P-frames and only for the bigger timeslice (first configuration), but the additional enhancement allowed finishing all MBs for all frames. On average, 45.1 MBs have been processed within the base timeslice; however, there is a big difference between the frame-related processing times, expressed through the standard deviation on the high level of 19.8 MBs. Such behavior may yield noticeable quality changes in the output.

Figure 105. Time-constrained RT-MD-XVID encoding for Carphone QCIF with B-frames.

The execution of the RT-MD-XVID with the second configuration ended up with even worse results. Here the mandatory timeslice was not able to deliver any MBs for the B-frames. If the processing of the other frame types were treated as a separate subset, it would be comparable to the previous results (Figure 103). In reality, however, the average is calculated over all frames, and thus the processing of the mandatory part is unsatisfactory (i.e. an average of 12.2 processed MBs leads to a quality as low as 12.4%). Using the enhancement part allows processing 84 MBs (84.4%) on average, but with a still high standard deviation of 15.3 MBs (15.4%).

As a result, the use of B-frames is not advised, since the jitter in the number of processed MBs results in quality deviations, which contradicts the assumption of delivering a quality level as constant as possible; humans classify fluctuations between good and low quality as lower quality even if the average is higher [Militzer et al., 2003].


Considering all investigated resolutions and frame rates, the processing complexity of the CIF and CIFN sequences marks the limits of real-time MD-based video encoding, at least for the used test bed. The rejection of B-frame processing makes the processing simpler, more predictable and more efficient; nevertheless, it is still possible to use B-frame processing in some specific applications requiring higher compression at the cost of higher quality oscillations or of additional processing power.


XI. COROLLARIES AND CONSEQUENCES

XI.1. Objective Selection of Application Approach based on Transcoding Costs

Before going into details about specific fields of application, some general remarks on the format-independence approach are given. As stated in Related Work, there are three possible solutions to provide at least some kind of multi-format or multi-quality media delivery. It is an important issue to recognize which method is really required for the considered application, and some research has already been done in this direction. In general, two aspects are always considered: dynamic and static transformations. The dynamic transformation refers to any type of processing (e.g. transcoding, adaptation) done on the fly, i.e. during the real-time transmission on demand. The static approach considers the off-line preparation of the multimedia data (regardless of whether it is a multi-copy or a scalable solution).

Based on those two views, the trade-off between storage and processing is investigated. In [Lum and Lau, 2002], a method is proposed which finds an optimum considering the CPU processing (i.e. dynamic) cost, also called the transcoding overhead, and the I/O storage (i.e. static) cost of pre-adapted content. The hybrid model selectively prepares a subset of quality variants of the data and leaves the remaining qualities to the dynamic algorithm, which uses this prepared subset as a base for its calculations. The authors proposed a content-adaptation framework allowing for content negotiation and delivery realization; the latter is responsible for adapting the content by employing the transcoding relation graph (TRG) with transcoding costs on the edges and data versions on the nodes, a modified greedy-selection algorithm supporting time and space, and a simple linear cost model [Lum and Lau, 2002], which is defined as:

t_j = m × |v_i| + c    (75)

where t_j is the processing time of version j of the data transformed from version i (v_i), c is the fixed overhead of synthesizing any content (independent of the content size), m is the transcoding time per unit of the source content, and |·| is the size operator. The authors claim that the algorithm can be applied to any type of data; however, in the case of audio and video the cost defined as in Equation (75) will not work, because the transcoding costs are influenced not only by the amount of data but also by the content. Moreover, neither the frequency of data use (popularity) nor the storage bandwidth for accessing the multimedia data is considered.
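For reference, the linear cost model of Equation (75) amounts to the following trivial computation (a sketch only; the parameters m and c would have to be calibrated per conversion):

    /* Linear cost model of Equation (75): t_j = m * |v_i| + c */
    static double transcoding_time(double m_time_per_unit,
                                   double source_size,
                                   double c_fixed_overhead)
    {
        return m_time_per_unit * source_size + c_fixed_overhead;
    }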

Another solution is proposed in [Shin and Koh, 2004], where the authors consider the skewed access patterns to multimedia objects (i.e. the frequency of use, or popularity) and the storage costs (bandwidth vs. space). The solution is validated by simulating a QoS-adaptive VoD streaming service with Poisson-distributed requests and the popularity represented by a Zipf distribution using various parameters (from 0.1 to 0.4). However, it does not consider the transcoding costs at all.

Based on the two mentioned solutions, a best-suited cost algorithm could be defined in analogy to the first proposal [Lum and Lau, 2002] and applied to audio and video data; but first, both storage size and bandwidth have to be considered as in [Shin and Koh, 2004], the transcoding cost has to be refined, and the popularity of the data should be respected. Then the applicability of the real-time transcoding-based approach could be evaluated based not only on subjective requirements but also on objective measures. Since this was not in the scope of the RETAVIC project, the idea is left for further research.

XI.2. Application Fields

The RETAVIC architecture, if fully implemented, could find application in many fields. Only some of these fields are discussed within this section, and the focus is placed on the three most important aspects.

The undoubtedly most important application area is the archiving of multimedia data for the purpose of long-term storage, in which the group of potential users is represented by TV producers and national and private broadcasting companies, including TV and radio. Private and national museums and libraries also belong to this group, as they are interested in storing high-quality multimedia data, where the separation of the internal format from the presentation format could allow for very long-term storage. The main interest here is to provide the available collections and assets in a digital form through the Internet to usual end-users with average-to-high quality expectations, or through intranets or other distribution media to researchers in archeology, fine arts and other fields.

A second possible application targets scientific databases and simulations. Here the generated high-quality video precisely demonstrating processor-demanding simulations can be stored without loss of information. Such videos can be created by various users from chemical, physical or bio-engineering laboratories, gene research, or other modeling and development centers. Of course, the costs of conducting the calculations and simulations should be much higher than the costs of recording and storage. The time required for calculating the final results should also be considered, since the stored multimedia data can be accessed relatively fast.

A very similar application area comprises multimedia databases for industrial research, manufacturing and applied sciences. The difference to the previous fields is that the multimedia data come not from artificial simulations but from real recordings, which are usually referred to as natural audio-video. A first example of using various data qualities is the medical domain, in which an advanced, untypical and complex surgery is recorded without loss of information and then distributed among physicians, medical students and professors. Another example is the recording of scientific experiments where the execution costs are very high, so that the lossless audio-video recording is critical for further analysis. Here, microscope and telescope observations, where three high-resolution cameras deliver the RGB signals separately, are considered significant. Yet another example is the application to industrial inspection, namely a critical system requiring short but very high-quality periodic observations. Finally, industrial research testing new prototypes may require very high quality in very short periods of time by employing high-speed, high-resolution cameras. Of course, the data generated by such cameras should be kept in a lossless state in order to combine them with other sensor data.

Some other general application fields without specific examples can also be found. Among them, a few interesting ones are worth listing:

• Analysis of fast-moving or explosive scenes
  - Analysis of fast-moving machine elements
  - Optimization of manufacturing machines
  - Tests of explosive materials
  - Crash test experiments
  - Airbag developments

• Shock and vibration analysis

• Material and quality control
  - Material forming analysis
  - Elastic deformations

• High-definition promotional and slow-motion recordings for movies and television

Finally, a partial application in video-on-demand systems can be imagined as well. Here, however, not the distribution chain including network issues like caching or media proxies is meant, but rather the high-end multimedia data centers, where the multimedia data are prepared according to the quality requirements of a class of end-user systems (which is represented as one user exposing specific requirements). Of course, there may also be a few classes of devices interested in the same content during multicasting or broadcasting. However, for a VoD system based on unicast communication, the RETAVIC architecture is not advised at all.

XI.3. Variations of the RETAVIC Architecture

The RETAVIC architecture, beside its direct application in the mentioned fields, could be used as a source for other variants. Three possible ideas are proposed here. The first one is to use the RETAVIC architecture not as an extension of the MMDBMS but just for media servers, providing them with additional functionality like audio-video transcoding. Such a conversion option could be a separate extra module. It could work with the standard format of the media server (usually one lossy format), which can be treated as the internal storage format in RETAVIC. Then only the format-specific decoding would have to be implemented as described in Evaluation of storage format independence (p. 86).


Next, a non-real-time implementation of the RETAVIC architecture could be possible. Here, however, OS-specific scheduling functionality analogous in its behaviour to QAS [Hamann et al., 2001a] has to be guaranteed. Moreover, precise timing functionality allowing the HRTA-compliant converters to be controlled should be provided. For example, [Nilsson, 2004] discusses the possibility of using fine-grained timing under the MS Windows NT family106, in which the timing resolution goes beneath the standard timer with its 10 ms intervals, first to the level of 1 ms and then to 100 ns units. Besides timing, additional issues such as tick frequency, synchronization, protection against system time changes, interrupts (IRQs), thread preemptiveness, and the avoidance of preemption through thread priority (which is limited to a few possible classes) are also discussed [Nilsson, 2004]. Even though the discussed OS could provide an acceptable time resolution for controlling HRTA converters, the scheduling problem still requires additional extensions.

Finally, a mixture of the mentioned variations is also possible, i.e. a direct application on the media-server-specific OS. Then the need for an RTOS could be eliminated. Obviously, the real-time aspects would have to be supported by additional extensions in the best-effort OS, analogously to the discussion of the previous variation.

106 In the article this covers MS Windows NT 4.0, MS Windows 2000, and MS Windows XP.


Chapter 6 – Summary

If we knew what it was we were doing, it would not be called research, would it? Albert Einstein

XII. CONCLUSIONS

The research on format-independence provision for multimedia database systems conducted during the course of this work has brought many answers but even more questions. The main contribution of this dissertation is the RETAVIC architecture, exploiting meta-data-based real-time transcoding and lossless scalable formats that provide quality-dependent processing. RETAVIC has been discussed in many aspects, of which the design is the most important part; however, the other aspects such as implementation, evaluation, and applications have not been neglected.

The system design included the requirements, the conceptual model, and its evaluation. The video and audio processing models covered the analysis of codec representatives, the statement of assumptions, the specification of static and continuous media-type-related meta-data, the presentation of the peculiar media formats, and the evaluation of the models. The need for a lossless, scalable binary format led to the Layered Lossless Video format (LLV1) [Militzer et al., 2005] being designed and implemented within this project. The attached evaluation proved that the proposed internal formats for lossless media data storage are scalable in data quality and in processing, and that meta-data-assisted transcoding is an acceptable solution with lower complexity and still acceptable quality. The real-time processing part discussed aspects connected with multimedia processing in respect of its time-dependence and with the design of real-time media converters, including the hard-real-time adaptive converter model, the evaluation of three proposed prediction methods, and the mapping of the best-effort meta-data-based converters to HRTA-compliant converters.

The implementation part depicted the key elements of the programming phase, covering the pseudo code for the important algorithms and the most critical source code examples. Additionally, it described the systematic course of porting source code from a best-effort to an RTOS-specific environment, which has been referred to as a guideline for porting the source code of any type of converter implemented in a best-effort system such as Linux or Windows to DROPS.

The evaluation has proved that time-constrained media transcoding executed under the DROPS real-time operating system is possible. The prototypical real-time implementation of the critical parts of the transcoding chain for video (the real-time video coders) has been evaluated in respect of functional, quantitative and qualitative properties. The results have shown the possibility of controlling the decoding and encoding processes according to the quality specification and the workload limitation of the available resources. The workload limits of the test-bed machine were already reached when processing sequences in CIF resolution.

Finally, this work delivered the analysis of requirements for the internal media storage format for audio and video, the review of format-independence support in current multimedia management systems (incl. MMDBMSs and media servers), and the discussion of various transcoding architectures potentially applicable for format-independent delivery.


Imagination is more important than knowledge. Knowledge is limited. Imagination encircles the world. Albert Einstein (1929, Interview by George S. Viereck in Philadelphia Saturday Evening Post)

XIII. FURTHER WORK

There are a few directions in which further work can be conducted. One is to improve the audio and video formats themselves, e.g. their compression efficiency or processing optimization. Another can be the refinement of the internal storage within the RETAVIC architecture – here proposals of new formats could be expected. Yet another could be the improvement of the RETAVIC architecture itself.

An extension of the RETAVIC architecture is a different aspect than a variant of the architecture. An extension is meant here as an improvement, enhancement, upgrade, or refinement. As such, one enhancement can be proposed in the real-time delivery phase in the direct processing channel, namely a module for bypassing static and continuous MD. Currently, the RETAVIC architecture sends only the multimedia data in the requested format, but it may assist in providing other formats through intermediate caching proxies. Such an extension could be referred to as Network-Assisted RETAVIC, and it would allow pushing the MD-based conversion to the borders of the network (e.g. to media gateways or proxies). For example, the MD could be transmitted together with the audio and video streams in order to allow building converting proxies that are cheaper in processing, since they no longer have to use worst-case scheduling for transcoding. The proxies should be built analogously to the real-time transcoding of the real-time delivery phase (with all the issues discussed in the related sections). Of course, there is an obvious limitation of such a solution – it is not applicable to live transmission (which may not be an issue for current proxies working with the worst-case assumption) until there is a method for predictable real-time MD creation. Moreover, there are two evident disadvantages:


1) the load in the network segment between the central server and the proxies will be higher, and 2) a more sophisticated and distributed management of the proxies is required (e.g. changing the internal format, extending the MD set, adding or changing supported encoders). On the other hand, the transcoding may be applied closer to the client and may reduce the load on the central server. This is especially important if there is a class of clients in the same network segment having the same format requirements.

Another possible extension is the refinement of the internal storage format by exchanging the old and no longer efficient format for a newer and better one, as described in Evaluation of storage format independence. For example, MPEG-4 SVC [MPEG-4 Part X, 2007] could be applied as the internal storage format for video. The WavPack/Vorbis hybrid investigated in [Penzkofer, 2006] could be applied as the internal audio format in systems where only a lower level of audio scalability is expected (just two layers).

The last aspect mentioned in this section, which could be investigated in the future, is an extension of the LLV1 format. Here two functional changes and one processing optimization can be planned: a) new temporal layering, b) new quantization layering, and c) an optimization in the decoding of the lossless stream.

Figure 106. Newly proposed temporal layering in the LLV1 format.

The first functional change covers a new proposal of temporal layering, which is depicted in Figure 106. The idea behind it is to use P-frames instead of B-frames (due to the instability in the processing of B-frames) in the temporal enhancement layer in order to obtain a smoother decoding process and thus a better prediction of the real-time processing. This, however, may introduce some losses in the compression efficiency of the generated bitstream. Thus the trade-off between the coding efficiency and the gain in processing stability should be investigated in detail.

The next change in the LLV1 algorithm refers to a different division into quantization enhancement layers. The cross-layer switching of coefficients between the enhancement bitplanes allows the most important DC/AC coefficients to be reconstructed first. Since the coefficients produced by the binDCT are ordered by the zig-zag scan according to their importance, it could be possible to realize an analogous 3-D zig-zag scanning across the enhancement bitplanes, also according to coefficient importance; namely, it can work as follows:

• In the current LLV1 bitplane, the stored value represents, for each coefficient, the difference to the next quantization layer.

• Assume the first three values from each enhancement layer are taken, namely: c1^QEL1, c2^QEL1, c3^QEL1, c1^QEL2, c2^QEL2, c3^QEL2, c1^QEL3, c2^QEL3, c3^QEL3.
• Then:
  ◦ the first values of each layer, c1^QEL1, c1^QEL2, c1^QEL3, are ordered as the first three values of the new first enhancement layer: c1^QEL1, c2^QEL1, c3^QEL1;
  ◦ the next three values at the second position of each layer, c2^QEL1, c2^QEL2, c2^QEL3, are ordered as the second-next three values of the new first enhancement layer: c4^QEL1, c5^QEL1, c6^QEL1;
  ◦ the next three values at the third position of each layer, c3^QEL1, c3^QEL2, c3^QEL3, are ordered as the third-next three values of the new first enhancement layer: c7^QEL1, c8^QEL1, c9^QEL1.
• Next, groups of three values, one from each layer (ci^QEL1, ci^QEL2, ci^QEL3), are always taken and assigned respectively to the next elements of the current QEL.
• If the current QEL is complete, the values are assigned to the next QEL.

Such a reorganization would definitely raise the quality with respect to the number of coded bits (assuming the coefficient-importance assumption holds), i.e. it would raise the coding efficiency if certain quality levels are considered. In other words, the data quality is distributed not linearly (as now) but with the important values accumulated at the beginning of the whole enhancement part. This method, however, increases the complexity of the coding algorithm, since there is no longer a clear separation into one-layer-specific processing. The algorithm shall be further investigated to prove the data quality and processing changes.
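To make the proposed cross-layer reordering concrete, a minimal sketch is given below. It assumes three equally long QELs holding per-coefficient difference values in zig-zag order; all names are illustrative and do not come from the LLV1 sources.

    /* Sketch of the proposed 3-D zig-zag cross-layer reordering: the i-th value
     * of every enhancement layer is emitted before the (i+1)-th value of any
     * layer, so the new first QEL accumulates the most important differences. */
    #include <stddef.h>

    #define NUM_QELS 3

    /* qel[l][i] - i-th difference value of enhancement layer l (input)
     * out[k]    - reordered value stream filling the new QELs one after another
     * coeffs    - number of coefficient positions per layer */
    static void reorder_cross_layer(const int *qel[NUM_QELS], int *out, size_t coeffs)
    {
        size_t k = 0;
        for (size_t i = 0; i < coeffs; i++)      /* coefficient position  */
            for (int l = 0; l < NUM_QELS; l++)   /* layer at that position*/
                out[k++] = qel[l][i];            /* emit c_i^QEL(l+1)     */
    }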

Finally, a simple processing optimization of the LLV1 decoding can be evaluated. Theoretically, if the lossless stream is requested and QEL3 is encoded (decoded), the quantization (inverse quantization) step may be omitted completely. This follows from the format assumption, i.e. the last enhancement layer produces coefficients quantized with a quantization step equal to 1, which means that the quantized value is equal to the unquantized value. So, the (de-)quantization step is not required anymore. This introduces yet another case in the processing and was not checked during the development of LLV1.
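A minimal sketch of this decoder-side shortcut, with an illustrative block-reconstruction routine (the names are assumptions, not taken from the actual LLV1 code), could look as follows:

    /* When the lossless stream is requested (QEL3 decoded), the quantization
     * step is 1 by design, so dequantization would only copy the values. */
    static void dequantize_block(int *coeff, int n, int quant_step, int lossless_qel3)
    {
        if (lossless_qel3)          /* QEL3 present: quant_step == 1      */
            return;                 /* coefficients are already final     */

        for (int i = 0; i < n; i++) /* regular inverse quantization       */
            coeff[i] *= quant_step;
    }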


Appendix A

XIV. GLOSSARY OF DEFINITIONS

The list of definitions and terms is divided into three groups: data-related, processing-related and quality-related terms. Each group is ordered in a logical (not alphabetical) sequence, i.e. the most fundamental terms come first, followed by more complex definitions, such that a subsequent definition may refer to a previous term, but the previous terms are not based on the following ones.

XIV.1. Data-related Terms

• media data – text, image (natural pictures, 2D and 3D graphics, 3D pictures), audio (natural sound incl. human voice, synthetic), and visual (natural video, 2D and 3D animation, 3D video);
• media object – MO – a special digital representation of media data; it has a type, a content and a format (with structure and coding scheme);
• multimedia data – a collection of media data, where more than one type of media data is involved (e.g. movie, musical video-clip);
• multimedia object – MMO – a collection of MOs; it represents multimedia data in digital form;
• meta data – MD – a description of an MO or an MMO;


• quant107 (based on [Gemmel et al., 1995]) – a portion of digital data that is treated as one logical unit occurring at a given time (e.g. a sample, a frame, a window of samples such as 1024, or a group of pictures – GOP, but also a combination of samples and frames);
• timed data – a type of data that depends on time108, i.e. the data (e.g. a quant) is usable if and only if it occurs at a given point of time; in other words, a too early or too late occurrence makes the data invalid or unusable; other terms equivalent to timed within this work are: time-constrained, time-dependent;
• continuous data – time-constrained data ordered in such a way that continuity of the related parts of the data is present; they may be periodic or sporadic (irregular);
• data stream (or shortly stream) – a digital representation of continuous data; the most similar definition is from [ANS, 2001]: "a sequence of digitally encoded signals used to represent information in transmission";
• audio stream – a sequence of audio samples, i.e. a sequence of numerical values representing the magnitude of the audio signal at identical intervals. The direct equivalent is the pulse-code modulation (PCM) of the audio. There are also extensions of PCM such as Differential (or Delta) PCM (DPCM) or Adaptive DPCM (ADPCM), which represent not the value of the magnitude but the differences between these values. In DPCM, it is simply the difference between the current and the previous value; in ADPCM it is almost the same, but the size of the quantization step additionally varies (thus allowing a more accurate digital representation of small values in comparison to high values of the analog signal);
• video stream – a sequence of video frames; it may be a sequence of half-frames if the interlaced mode is required; by default, the full-frame (one picture) mode is assumed;
• continuous MO – an MO that has properties analogous to continuous data; in other words, it is a data stream in which the continuous data is exactly one type of media data;

107 Quanta is the plural of quant.
108 There is also another research area of database systems which discusses timed data [Schlesinger, 2004]. However, a different perspective on the "timed" issue is presented there and completely different aspects are discussed (global views in Grid computing and their problems with data coming from snapshots at different points in time).


• audio stream, video stream – used interchangeably for a continuous MO of type audio or of type video;
• multimedia stream – a data stream including quanta of more than one MO type; an audio-video (AV) stream is the most common case; it is also called a continuous MMO;
• container format – a (multi)media file format for storing media streams; it may be used for one or many types of media (depending on the container format specification); it may be designed for storage, e.g. RIFF AVI [Microsoft Corp., 2002c] and MPEG-2 Program Stream (PS) [MPEG-2 Part I, 2000], or optimized for transmission, e.g. ASF [Microsoft Corp., 2007b] or MPEG-2 Transport Stream (TS) with packetized elementary streams (PES) [MPEG-2 Part I, 2000];
• compression/coding scheme – a binary compressed/encoded representation of the media stream for exactly one specific media type; it is usually described by a four-character code (FOURCC) being a registered (or well-recognized) abbreviation of the name of the compression/coding algorithm;
• Lossless Layered Video One (LLV1) – a scalable video format having four quantization layers and two temporal layers, allowing a YUV 4:2:0 video source to be stored without any loss of information. The upper layer extends the lower layer by storing additional information, i.e. the upper layer relies on data from the layer below.

XIV.2. Processing-related Terms

• transformation – a process of moving data from one state to another (transforming) – the most general term; the terms conversion/converting are equivalents within this work; a transformation may be lossy, where loss of information is possible, or lossless, where no loss of information occurs;
• multimedia conversion – a transformation that refers to many types of media data; there are three categories of conversion: media-type, format and content changers;
• coding – "the altering of the characteristics of a signal to make the signal more suitable for an intended application (…)" [ANS, 2001]; decoding is the inverse process to coding; coding and decoding are conversions;


• converter – a processing element (e.g. a computer program) that applies a conversion to the processed data;
• coder/decoder – a converter used for coding/decoding; encoder also refers to a coder;
• codec – acronym for coder-decoder [ANS, 2001], i.e. an assembly consisting of an encoder and a decoder in one piece of equipment, or a piece of software capable of encoding to a coding scheme and decoding from this scheme;
• (data) compression – a special case (or a subset) of coding used for 1) increasing the amount of data that can be stored in a given domain, such as space, time, or frequency, or contained in a given message length, or 2) reducing the amount of storage space required to store a given amount of data, or reducing the length of message required to transfer a given amount of information [ANS, 2001]; decompression is an inverse process to compression, but not necessarily mathematically inverse;
• compression efficiency – a general non-quantitative term reflecting the efficiency of a compression algorithm, such that a more compressed output with a smaller size is understood as better algorithm effectiveness; it is also often referred to as coding efficiency;
• compression ratio109 – the uncompressed (origin) to compressed (processed) size; the bigger the value, the better; the compression ratio is usually bigger than 1; however, it may occur that the value is lower than 1 – in that case the compression algorithm is not able to compress anymore (e.g. if already-compressed data is compressed again) and should not be applied;
• compression size109 [in %] – the compressed (processed) to uncompressed (origin) size of the data multiplied by 100%; the smaller the value, the better; the compression size usually ranges between more than 0% and 100%; a value higher than 100% corresponds to a compression ratio lower than 1 (a small numerical illustration is given directly after this list);
• transcoding – according to [ANS, 2001] it is a direct digital-to-digital conversion from one encoding scheme to a different encoding scheme without returning the signals to analog form; however, within this work it is defined in a more general way as a

109 The term "compression rate" is not used here due to its ambiguity, i.e. in many papers it refers once to the compression ratio and elsewhere to the compression size. Moreover, the compression ratio and size are obviously properties of the data, but they derive directly from the processing, and so they are classified as processing-related terms.


conversion from one encoding scheme to a different one, where normally at least two different codecs have to be involved; it is also referred to as heterogeneous transcoding; other special cases of transcoding are distinguished in the later part of this work;
• transcoding efficiency – analogous to the coding efficiency defined before, but with respect to transcoding;
• transcoder – a device or system that converts one bit stream into another bit stream that possesses a more desirable set of parameters [Sun et al., 2005];
• cascade transcoder – a transcoder that fully decodes and then fully encodes the data stream; in other words, it is a decoder-encoder combination;
• adaptation – a subset of transcoding, where only one encoding scheme is involved and no coding scheme or bit-stream syntax is changed, e.g. an MPEG-2 to MPEG-2 conversion used for lowering the quality (i.e. bit rate reduction, spatial or temporal resolution decrease); it is also known as homogeneous or unary-format transcoding;
• chain of converters – a directed, one-path, acyclic graph consisting of a few converters;
• graph of converters – a directed, acyclic graph consisting of a few interconnected chains of converters;
• conversion model of a (continuous) MMO – a model provided for conversion independent of hardware, implementation and environment; the JCPS (event and data) [Hamann et al., 2001b] or the hard-real-time adaptive model (described later) are suggested to be most suitable here; "continuous" is often omitted due to the assumption that audio and video are usually involved in the context of this work;
• (error) drift – an erroneous effect in successively predicted frames caused by a loss of data, regardless of whether intentional or unintentional, causing a mismatch between the reference quant used for the prediction of the next quant and the original quant used before; it is defined in [Vetro, 2001] for video transcoding as "blurring or smoothing of successively predicted frames caused by the loss of high frequency data, which creates a mismatch between the actual reference frame used for prediction in the encoder and the degraded reference frame used for prediction in the transcoder or decoder";
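As a small numerical illustration of the compression ratio and compression size defined in the list above (the sizes below are made-up example values, not measurement results):

    /* Tiny worked example of the two measures defined above (illustrative only). */
    #include <stdio.h>

    int main(void)
    {
        double uncompressed = 30.0 * 1024 * 1024;   /* e.g. 30 MB of raw data   */
        double compressed   = 10.0 * 1024 * 1024;   /* e.g. 10 MB after coding  */

        double ratio = uncompressed / compressed;          /* compression ratio */
        double size  = compressed / uncompressed * 100.0;  /* compression size  */

        printf("compression ratio: %.2f (values > 1 are desirable)\n", ratio);
        printf("compression size : %.1f %%\n", size);      /* here: 3.00 and 33.3 % */
        return 0;
    }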


XIV.3. Quality-related Terms

• quality – a specific characteristic of an object which allows two objects to be compared (objectively or subjectively) and a statement to be made about which one has the higher level of excellence; usually it refers to an essence of an object; however, in computer science it may also refer to a set of characteristics; another common definition is the "degree to which a set of inherent characteristics fulfils requirements" [ISO 9000, 2005];
• objective quality – the quality that is measured by facts using quantitative methods, where the metric110 has an uncertainty according to metrology theory; the idea behind the objective measures is to emulate subjective quality assessment results by metrics and quantitative methods, e.g. for the psycho-acoustic listening test [Rohdenburg et al., 2005];
• subjective quality – the quality that is measured by the end user and heavily depends on his experience and perception capabilities; an example of a standardized methodology for subjective quality evaluation used in speech processing can be found in [ITU-T Rec. P.835, 2003];
• Quality-of-Service – QoS – a set of qualities related to the collective behavior of one or more objects [ITU-T Rec. X.641, 1997], i.e. an assessment of a given service based on its characteristics; it is assumed within this work that it is objectively measured;
• Quality-of-Data – QoD – the objectively measured quality of the stored MO or MMO; it is assumed to be constant with respect to the given (M)MO111;
• transformed QoD – T(QoD) – the objectively measured quality requested by the user; it may be equal to the QoD or worse (but not better), e.g. a lower resolution requested;
• Quality-of-Experience – QoE – the subjectively assessed quality perceived with some level of experience by the end user (also called subjective QoS), which depends on QoD, T(QoD), QoS and human factors; QoE is a well-defined term reflecting the subjective quality given above;

110 A metric is a scale of measurement defined in terms of a standard (i.e. well-defined unit). 111 The QoD may change only when (M)MO has scalable properties i.e. QoD will scale according to the amount of accessed data (which is enforced by the given coding scheme).


Appendix B

XV. DETAILED ALGORITHM FOR LLV1 FORMAT

XV.1. The LLV1 decoding algorithm

To understand how the LLV1 bitstream is processed and how the reconstruction of the video from all the layers is performed, the detailed decoding algorithm is presented in Figure 107. The input for the decoding process is defined by the user, i.e. the user specifies how many layers the decoder should decode. Thus, the decoder accepts the base layer (BL) binary stream (required) and up to three optional QELs. Since the QELs depend on the BL, the video properties as well as other structural data are encoded only within the BL bitstream in order to avoid redundancy.

Three loops can be distinguished in the core of the algorithm: the frame loop, the macro block loop, and the block-based processing. The first one is the outermost loop and is responsible for processing all the frames in the encoded binary stream. For each frame, the frame type is extracted from the BL. Four types are possible: intra-coded, forward-predicted, bi-directionally predicted, and skipped frames. Depending on the frame type, further actions are performed. The next inner distinguishable part is called the macro block loop. The MB type and the coded block patterns (CBPs) of the macro blocks for all layers requested by the user are extracted. Based on that, just some or all blocks are processed within the innermost block loop. In the case of an inter MB (forward or bi-directionally predicted), before entering the block loop, the motion vectors are additionally decoded and the motion-compensated frame is created by calculating the reference sample interpolation, which uses an input reference frame from the frame buffer of the BL.


Figure 107. LLV1 decoding algorithm.


The block loops are executed for the base layer first and then for the enhancement layers. In contrast to the base layer, however, not all steps are executed for all the enhancement layers – only the enhancement layer executed as the last one includes all the steps. The quantization plane (q-plane) reconstruction is the step required to calculate the coefficient values by applying Equation (12) (on p. 103) and using data from the bit plane of the QEL. Dequantization and the inverse binDCT are executed once, if only the BL was requested, or at most twice, if any other QEL was requested. The reconstruction of the base layer is required anyway in both cases, because the frames from the BL are used for the reference sample interpolation mentioned earlier. In the case of intra blocks, an additional step is applied, namely motion error compensation, i.e. the correction of pixel values of the interpolated frames by the motion error extracted from the BL or the respective QEL.
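For illustration, the loop structure described above can be summarized by the following strongly simplified skeleton; all types, helper functions and their signatures are assumptions made for this sketch and do not claim to match the actual LLV1 decoder sources.

    /* Simplified control-flow sketch of the LLV1 decoding loops (names assumed). */
    enum frame_type { FRAME_I, FRAME_P, FRAME_B, FRAME_SKIPPED };

    struct bitstream;                        /* opaque handles for BL and QELs   */
    struct macroblock { int is_inter; };

    /* Helpers assumed to exist in the real decoder: */
    int  more_frames(struct bitstream *bl);
    enum frame_type read_frame_type(struct bitstream *bl);
    void read_mb_info(struct bitstream *bl, struct bitstream **qel, int n, struct macroblock *mb);
    void read_motion_vectors(struct bitstream *bl, struct macroblock *mb);
    void interpolate_reference(struct macroblock *mb);       /* uses BL frame buffer */
    void decode_base_block(struct bitstream *bl, struct macroblock *mb, int b);
    void reconstruct_qplane(struct bitstream *qel, struct macroblock *mb, int b); /* Eq. (12) */
    void dequantize_and_inverse_bindct(struct macroblock *mb, int b);
    void store_reference_frame(void);

    void decode_sequence(struct bitstream *bl, struct bitstream **qel, int num_qels,
                         int mb_width, int mb_height)
    {
        while (more_frames(bl)) {                             /* 1. frame loop       */
            enum frame_type ft = read_frame_type(bl);         /* type from BL only   */
            if (ft == FRAME_SKIPPED)
                continue;                                     /* repeat previous     */

            for (int y = 0; y < mb_height; y++)               /* 2. macro block loop */
                for (int x = 0; x < mb_width; x++) {
                    struct macroblock mb;
                    read_mb_info(bl, qel, num_qels, &mb);     /* MB type + CBPs      */
                    if (mb.is_inter) {
                        read_motion_vectors(bl, &mb);
                        interpolate_reference(&mb);           /* BL reference frames */
                    }
                    for (int b = 0; b < 6; b++) {             /* 3. block loop       */
                        decode_base_block(bl, &mb, b);
                        dequantize_and_inverse_bindct(&mb, b);   /* BL reconstruction */
                        for (int l = 0; l < num_qels; l++)
                            reconstruct_qplane(qel[l], &mb, b);  /* q-plane per QEL   */
                        if (num_qels > 0)
                            dequantize_and_inverse_bindct(&mb, b); /* last QEL only   */
                    }
                }
            store_reference_frame();                          /* kept for later MC   */
        }
    }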


Appendix C

XVI. COMPARISON OF MPEG-4 AND H.263 STANDARDS

XVI.1. Algorithmic differences and similarities

This section describes in short the most important differences between the H.263 and MPEG-4 standards for natural video coding. The differences are organized by features of the standards, according to the part of the encoding process where they are used: Motion Estimation and Compensation, Quantization, Coefficient Re-scanning, and Variable Length Coding. At the end, features are discussed that provide enhanced functionality not specifically related to the previous categories.

Motion Estimation and Compensation: The most interesting tools in this section are without doubt Quarter Pixel Motion Compensation (Qpel), Global Motion Compensation (GMC), Unrestricted Motion Vectors (UMV) and Overlapped Block Motion Compensation. Quarter Pixel Motion Compensation is a feature unique to MPEG-4, allowing the motion compensation process to search for a matching block using ¼ pixel accuracy and thus enhancing the compression efficiency. Global Motion Compensation defines a global transformation (warping) of the reference picture used as a base for motion compensation. This feature is implemented in both standards with some minor differences, and it is especially useful when coding global motion on a scene, such as zooming in/out. Unrestricted Motion Vectors allow the Motion Compensation process to search for a match for a block in the reference picture using larger search ranges, and it is implemented in both standards now. Overlapped Block Motion Compensation has been introduced in H.263 (Annex F) [ITU-T Rec. H.263+, 1998] as a feature to provide better concealment when errors occur in the reference frame, and to enhance the perceptual visual quality of the video.


DCT: The Discrete Cosine Transform algorithm used by any video coding standard is specified statistically so as to comply with the IEEE standard 1180-1990. Both standards are the same in this respect.

Quantization: DCT Coefficient Prediction and MPEG-4 Quantization are the most important features in this category. DCT Coefficient Prediction allows DCT coefficients in a block to be spatially predicted from a neighboring block, to reduce the number of bits needed to represent them, and to enhance the compression efficiency. Both standards specify DCT coefficient prediction now. MPEG-4 Quantization is unique to the MPEG-4 standard; unlike the basic quantization method, which uses a fixed-step quantizer for every DCT coefficient in a block, MPEG-4 uses a weighted quantization table method, in which each DCT coefficient in a block is quantized differently according to a weight table. This table can be customized to achieve better compression depending on the characteristics of the video being coded. H.263 adds one further operation after the quantization of the DCT coefficients, the Deblocking Filter mode, which is particularly efficient at improving the visual quality of video coded at low bit rates by removing blocking effects.
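The difference between the two quantizer styles can be illustrated with a strongly simplified sketch; the scaling constants and the weight handling below are illustrative only and do not reproduce the normative quantizer formulas of either standard.

    #define BLOCK_SIZE 64

    /* H.263-style uniform quantization: one fixed step for all coefficients. */
    static void quant_uniform(const int coeff[BLOCK_SIZE], int level[BLOCK_SIZE], int quant)
    {
        for (int i = 0; i < BLOCK_SIZE; i++)
            level[i] = coeff[i] / (2 * quant);
    }

    /* MPEG-4-style weighted quantization: each position is additionally scaled
     * by an entry of a customizable weight matrix w (e.g. a default intra matrix). */
    static void quant_weighted(const int coeff[BLOCK_SIZE], int level[BLOCK_SIZE],
                               int quant, const unsigned char w[BLOCK_SIZE])
    {
        for (int i = 0; i < BLOCK_SIZE; i++)
            level[i] = (coeff[i] * 16) / (w[i] * quant);
    }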

Coefficient re-scanning: The use of alternate scan modes (vertical and horizontal), besides the common zig-zag DCT coefficient reordering scheme, is a feature now available to both standards. These scan modes are used in conjunction with DCT coefficient prediction to achieve better compression efficiency.

Variable-length coding: Unlike earlier standards, which used only a single VLC table for coding run-length coded (quantized) DCT coefficients in both intra- and inter-frames, MPEG-4 and H.263 specify the use of a different Huffman VLC table for coding intra-pictures, enhancing the compression efficiency of the standards. H.263 goes a little bit further by allowing some inter-frames to be coded using the Intra VLC table (H.263 Annex S) [ITU-T Rec. H.263+, 1998].

Next, additional and new features are discussed which do not fall into the basic encoding categories but define new capabilities of the standards.


Arbitrary-Shaped-Object coding (ASO): Defines algorithms to enable the coding of a scene as a collection of several objects. These objects can then be coded separately allowing a coding with higher quality for more important objects of the scene and higher compression for unimportant details. ASO coding is unique to MPEG-4. H.263 does not offer any comparable capability.

Scalable Coding: Scalable coding allows encoding of a scene using several layers. One base layer contains a low-quality / low-resolution version of the scene, while the enhancement layers code the residual error and consecutively refine the quality and resolution of the image. MPEG-4 and H.263 introduce three types of scalability. Temporal Scalability layers enhance the temporal resolution of a coded scene (e.g. from 15 fps to 25 fps). Spatial scalability layers enhance the resolution of a coded scene (e.g. from QCIF to CIF). SNR Scalability (also known as FGS, Fine Granularity Scalability) layers enhance the Signal-to-Noise Ratio of a coded scene. Although both standards support scalable coding, the standards differ in the approach used to support this capability. In contrast to H.263, MPEG-4 implements SNR scalability by using only one enhancement layer to code the reconstruction error from the base layer. This enhancement layer can be used to refine the scene progressively by truncating the layer according to the capabilities / restrictions of the decoding client to achieve good quality under QoS restrictions.

Error-resilient coding: Error resilience coding features have been introduced in MPEG-4 and H.263 to be able to effectively detect, reduce and conceal errors in the video stream caused by transmission over error-prone communication channels. These features are especially intended for low bit-rate video, but are not restricted to that case. Features such as Reversible Variable Length Codes (unique to MPEG-4), data partitioning, and slices (video packets) fall into this category and enable better error detection, recovery and concealment.

Real-time coding: There are tools that enable better control for the encoding application to adapt to changing QoS restrictions and bandwidth conditions. These tools use a backward channel from the decoder to the encoder, so that the latter can change encoding settings to better control the quality of the video. Reduced Resolution Coding (MPEG-4; Reduced Resolution Update in H.263) is a feature used by both standards. It enables the encoder to code a downsampled image in order to meet a given bit rate without causing the comparably bigger loss of visual quality that occurs when dropping frames in the encoding process. MPEG-4 uses NEWPRED, which enables the encoder to select a different reference picture for the motion compensation process if the current one leads to errors in decoding. H.263 defines a better version of this feature, Enhanced Reference Picture Selection (H.263++, Annex U) [ITU-T Rec. H.263++, 2000], which offers the same capability as NEWPRED but adds the possibility of using multiple reference frames in the motion compensation of P and B pictures.

XVI.2. Application-oriented comparison

In a case study based on the proposed algorithm, many existing solutions as well as possible applications in the near future have been analyzed. This has resulted in four general and most common comparison scenarios: Baseline, Compression efficiency, Real-time, and Scalable coding. Together with the discussion of the standards, some examples of suitable applications for each comparison scenario are given (thus making it tangible for the reader).

Baseline: Here the basic encoding tools proposed by each standard are compared, i.e. MPEG-4 Simple versus H.263 Baseline. It is only a theoretical comparison, since H.263 Baseline, being an earlier standard and the starting point for MPEG-4 too, lacks many of the tools already used in the MPEG-4 Simple profile. This scenario is suitable for applications which do not require high-quality video or high compression efficiency and which use relatively error-free communication channels, with the advantage of widespread compatibility and a cheap, low-complexity implementation. A typical application for this would be capturing video for low-level or mid-range digital cameras, home grabbing, or popular cheap hardware (simple TV cards). Because of the limitations of H.263 Baseline, an MPEG-4 Simple Profile compliant coding solution would be better for this case. However, H.263 Baseline combined with Advanced Intra Coding (H.263 Annex I) [ITU-T Rec. H.263+, 1998] offers almost the same capabilities with similar complexity. So the choice between any of these solutions is a matter of taste, although maybe a series of in-depth tests and benchmarks of available implementations could shed better light on which standard performs better in this case (some of the results are available in [Topivata et al., 2001; Vatolin et al., 2005; WWW_Doom9, 2003]).

Compression efficiency: This is one of the main comparison scenarios. Here a comparison of H.263 and MPEG-4 regarding the tools they offer to achieve high compression efficiency is proposed, that is, the tools that help to encode a scene with a given quality using the least possible amount of bits. For this scenario, the MPEG-4 Advanced Simple Profile is compared against H.263's High Latency Profile. In this application scenario the focus is on achieving high compression and good visual quality. Typical representative applications for this scenario include home-user digital video at VCD and DVD qualities, the digital video streaming and downloading (Video on Demand) industry, as well as the High Definition TV (Digital TV) broadcasting industry, surgery in hospitals, digital libraries and museums, multimedia encyclopedias, video sequences in computer and console games, etc. The standard of choice for this type of application is MPEG-4, as it offers better compression tools, such as MPEG-4 Quantization, Quarter Pixel Motion Compensation and B-frames (H.263 can support B-frames, but only in scalability mode).

Realtime: This is another interesting comparison scenario, in which the tools that each standard offers for dealing with real-time encoding and decoding of a video stream are examined. For this scenario, MPEG-4's Advanced Real-Time Streaming profile is compared with H.263's Conversational Internet Profile. The focus in this scenario lies not on compression or high resolution, but on manageable complexity for real-time encoding and on the error detection, correction and concealment features typical for a real-time communication scenario, where transmission errors are more probable. Applications in this scenario make use of video with low to medium resolution and usually low constant bitrates (CBR) to facilitate its live transmission. Video conferencing is a good representative of this application scenario: video is coded in real time, and there is a continuous CBR communication channel between encoder and decoder, so that information is exchanged for controlling and monitoring purposes. Other applications which need 'live' encoding of video material, such as video telephony, process monitoring applications, surveillance applications, network video recording (NVR), web cams and mobile video applications (GPRS phones, military systems), live TV transmissions, etc., can make use of video encoding solutions for the real-time scenario. Both H.263 and MPEG-4 have put effort into developing features for this type of application. However, H.263 is still the standard of choice here, since it offers tools specially designed to offer better video at low resolutions and low bitrates. Features such as the Deblocking Filter and Enhanced Reference Picture Selection make H.263 a better choice than MPEG-4.


Scalable coding: Last but not least, a comparison of both standards according to the tools they provide for scalable coding is proposed. Scalable coding is an attractive alternative to real-time encoding for satisfying Quality-of-Service restrictions. In this scenario, MPEG-4's Simple Scalable and FGS scalable profiles are compared to H.263 Baseline + Advanced Intra Coding (Annex I) + Scalability (Annex O) [ITU-T Rec. H.263+, 1998]. The goal is to compare the ability of both standards to provide good quality and flexible adaptation to a particular QoS level by using enhancement layers. The general idea of scalable coding is to encode the video just once, but to serve it at several quality / resolution levels. The desired QoS shall be achieved by sending more or fewer enhancement layers according to the network's bandwidth conditions. Scalable coding is designed to suit a large variety of applications, due to its ability to encode and send video at various bit rates. Video-on-Demand applications can make use of the features offered by this scenario and offer low (e.g. modem) / medium (e.g. ISDN) / high (e.g. ADSL) bitrate versions of music videos or movie trailers, without having to keep three different versions of the video stream, one for each bit rate. Other types of applications that benefit from this scenario are those where the communication channel used to transmit video does not offer a constant bandwidth, so that the video bit rate has to adapt to the changing conditions of the network; streaming applications over mobile channels come to mind. Although H.263 offers features to support scalable coding, these features are not as powerful as those offered by MPEG-4. Of special interest here is the new SNR scalability approach of MPEG-4, which is much more flexible than former scalability solutions. One of the potential problems of scalable coding, however, is the limited availability of open and commercial encoders and decoders supporting it at present, since most MPEG-4 compliant products only comply with the Simple or Advanced Simple Profile (compression efficiency), and most H.263 products target only the mobile / real-time low bit rate market.

After defining a comparison scenario, the addressed area of the problem can be presented by a graph of ranges (or graph of coverage). Such graphs can show the dependencies between the application requirements and the area covered by the scenario. As an example (Figure 108), a graph of quality range vs. bandwidth requirements (with roughly estimated H.263 and MPEG-4 functions of behavior) is depicted.



Figure 108. Graph of ranges – quality vs. bandwidth requirements

XVI.3. Implementation analysis

In this part we take a look at current implementations of the MPEG-4 and the H.263 standard (one representative has been chosen for each of them). However, we do not dig into details, because many ad-hoc comparisons are publicly available; for example, [WWW_Doom9, 2003] compared seven different implementations in 2003 and [Vatolin et al., 2005] compared a different set of seven codecs in 2005. Besides, we do not want to provide yet another benchmark and performance evaluation description.

One of the disadvantages of this type of comparison is that the current implementations of each standard target different application markets. MPEG-4-compliant applications are mostly compliant with the Simple Profile (SP) or the Advanced Simple Profile (ASP). Some of these applications are open source, but most are commercial products. Other MPEG-4 profiles do not offer such a variety of current solutions on the market, and for many of them it is even very hard to find more than one company offering solutions for that specific profile. H.263 is exclusively a low bit-rate encoder. There are few non-commercial products based on the standard, and even the reference implementation, now maintained by the University of British Columbia (UBC), has become a commercial product. Even for research purposes, obtaining the source of the encoder is subject to payment. H.263 and MPEG-4 both use many algorithms whose patents and rights are held by commercial companies, and as such, one must be very careful not to break copyright agreements.


XVI.3.1. MPEG-4

This is a list of products based on the MPEG-4 standard (as their owners declare):

• 3viX: SP, ASP
• On2 VP6: SP, ASP
• Ogg (VP3-based): SP
• DivX 5.x: SP, ASP
• XVID 0.9: SP, ASP
• Dicas mpegable: SP, ASP, streaming technology
• QuickTime MPEG-4: SP, streaming technology
• Sorenson MPEG-4 Pro: SP, ASP
• UB Video: SP, ASP

XVID [WWW_XVID, 2003] is open source. As such, it was easier to analyze this product and test its compliance with MPEG-4. The results of the analysis of the source code (version 0.9, stable) show that XVID is at the moment only an SP-compliant encoder. However, the development version of the codec aims for ASP compliance. The ARTS profile should be included in later versions of XVID.

One of the missing parts is the ability to generate an MPEG-4 System stream; only the MPEG-4 Video stream is produced. On the other hand, the video stream may be encapsulated in the AVI container. Moreover, there are tools available to extract the MPEG-4-compliant stream and encapsulate it in an MPEG-4 System stream.

XVI.3.2. H.263

This is only a short list of products based on the H.263 standard (as owners declare):

• Telenor TMN 3.0
• Scalar VTC: H.263+
• UBC H.263 Library 0.3

As the representative for the H.263 standard, we chose the Telenor TMN 3.0 encoder, which was the reference implementation of the standard. This version is, however, somewhat obsolete in comparison to the new features introduced by H.263+++ [ITU-T Rec. H.263+++, 2005]. The software itself only supports the annexes proposed by the H.263 standard document (Version 1 in 1995) and the following annexes from the H.263+ standard document (Version 2): K, L, P, Q, R [ITU-T Rec. H.263+, 1998].


Appendix D

XVII. LOADING CONTINUOUS METADATA INTO ENCODER

The pseudo code showing how to load the continuous MD into the encoder for each frame is presented in Listing 6. The continuous MD are stored in this case as a binary compressed stream, so at first a decompression using simple Huffman decoding should be applied (not depicted in the listing). The resulting stream is then nothing else than a sequence of bits, where the given position(s) is (are) mapped to a certain value of the continuous MD element.

LoadMetaData
    R  bipred      (1 bit)   {0,1}
    R  frame_type  (2 bits)  {I_VOP, P_VOP, B_VOP}
    if !I_VOP
        oR fcode (FCODEBITS) => length, height
        if bipred
            oR bcode (BCODEBITS) => b_length, b_height
        endif
    endif
    R  mb_width  (MBWIDTHBITS)
    R  mb_height (MBHEIGHTBITS)
    // do for all macro blocks
    for 0..mb_height
        for 0..mb_width
            // def. MACROBLOCK pMB
            R  pMB->mode     (MODEBITS)
            R  pMB->priority (PRIORITYBITS)
            // do for all blocks
            for i = 0..5
                if I_VOP
                    oR pMB->DC_coeff[i]   (12)
                    oR pMB->AC_coeff[i]   (12)
                    oR pMB->AC_coeff[i+6] (12)
                elseif (B_VOP && MODE_FORWARD) || (P_VOP && (MODE_INTER || MODE_INTRA))
                    oR MVECTOR(x,y) (length, height)
                    if !B_VOP && MODE_INTRA
                        oR pMB->DC_coeff[i]   (12)
                        oR pMB->AC_coeff[i]   (12)
                        oR pMB->AC_coeff[i+6] (12)
                    elseif B_VOP && MODE_BACKWARD
                        oR MVECTOR(x,y) (b_length, b_height)
                    else
                        // for direct mode: B_VOP && MODE_DIRECT
                        // and other (unsupported yet) modes
                        // do nothing with MD bitstream
                    endif
                endif
                if bipred && (B_VOP || (P_VOP && !MODE_INTRA))
                    oR MVECTOR(x,y) (b_length, b_height)
                endif
            endfor
        endfor
    endfor
endLoadMetaData

Listing 6. Pseudo code for loading the continuous MD.

The rows marked with R mean that the data is always read (highlighted in the original listing), and the rows marked with oR mean an optional read (not always included in the stream – it depends on previously read values). The size of the read data in bits is given in round brackets just after the given MD property (in bold). The size may be represented by a constant in capitals, which relates as follows: FCODEBITS to the maximum size of the forward MV, BCODEBITS to the maximum size of the backward MV, MBWIDTHBITS and MBHEIGHTBITS to the maximum number of MBs in width and in height, MODEBITS to the number of supported MB types, and PRIORITYBITS to the total number of MBs in the frame. The values are given in curly brackets if the domain is strictly defined. The sign "=>" means that the read MD attribute allows other elements used later on to be calculated.
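For illustration only, the mandatory (R) and optional (oR) reads at the beginning of Listing 6 could be realized with a plain MSB-first bit reader as sketched below; all names and the concrete bit widths are assumptions, since the actual widths derive from the format definition.

    #include <stddef.h>
    #include <stdint.h>

    #define FCODEBITS    3   /* width of fcode  - illustrative value only  */
    #define BCODEBITS    3   /* width of bcode  - illustrative value only  */
    #define MBWIDTHBITS  8   /* max number of MBs in width  - illustrative */
    #define MBHEIGHTBITS 8   /* max number of MBs in height - illustrative */

    enum { I_VOP = 0, P_VOP = 1, B_VOP = 2 };

    struct bitreader { const uint8_t *buf; size_t pos; };   /* pos counts bits */

    /* Read the next n bits, most significant bit first. */
    static uint32_t read_bits(struct bitreader *br, unsigned n)
    {
        uint32_t v = 0;
        while (n--) {
            v = (v << 1) | ((br->buf[br->pos >> 3] >> (7 - (br->pos & 7))) & 1u);
            br->pos++;
        }
        return v;
    }

    struct frame_md {
        uint32_t bipred, frame_type;
        uint32_t fcode, bcode;        /* only present for non-intra frames */
        uint32_t mb_width, mb_height;
    };

    static void load_frame_header_md(struct bitreader *br, struct frame_md *md)
    {
        md->bipred     = read_bits(br, 1);             /* R  bipred (1 bit)      */
        md->frame_type = read_bits(br, 2);             /* R  frame_type (2 bits) */
        if (md->frame_type != I_VOP) {
            md->fcode = read_bits(br, FCODEBITS);      /* oR fcode => MV range   */
            if (md->bipred)
                md->bcode = read_bits(br, BCODEBITS);  /* oR bcode => b-MV range */
        }
        md->mb_width  = read_bits(br, MBWIDTHBITS);    /* R  mb_width            */
        md->mb_height = read_bits(br, MBHEIGHTBITS);   /* R  mb_height           */
    }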


Appendix E

XVIII. TEST BED

Many resources have been used for conducting the RETAVIC project. They have been allocated depending on the tasks defined in section X.1 (The Evaluation Process). Due to the type of measurements, three general groups are distinguished: non-real-time processing of high load, imprecise measurements in non-real-time, and precise measurements in real-time. These three groups have used different equipment, and the details are given for each group separately.

XVIII.1. Non-real-time processing of high load

The goal of the test bed for non-real-time processing of high load is to measure the functional aspects of the examined algorithms, i.e. the measurement of compression efficiency, the evaluation of the quality, the dependencies between achieved bitrates and quality, and the scalability of the codecs.

The specially designed MultiMonster server served this purpose. The server was a powerful cluster built of one "queen bee" and eight "bees". The detailed cluster specification is given in Table 11; the hardware details of the queen bee are given in Table 12 and those of the bees in Table 13.


MULTIMONSTER CLUSTER
CONTROL – Administrative Management Console: 19'' LCD + keyboard + mouse; switch 16x
BEES – 8x MM Processing Server: 2x Intel Pentium 4 2.66 GHz (only 1 processor installed)
NETWORK – Switch 1 Gbps, 24 ports
STORAGE – EasyRAID System 3.2 TB: 16x WD 200 GB, 7.2k RPM, 2 MB cache, 8.9 ms; effective storage: RAID Level 5 => 1.3 TB, RAID Level 3 => 1.3 TB
QUEEN BEE – 1x MM System Server: 2x Intel Xeon 2.8 GHz, RAID storage attached, OS management and configuration, cluster tools: ClusterNFS & OpenMOSIX
POWER – UPS 3 kVA (just for QUEEN BEE and STORAGE)
Total available processors: real 10 / virtually seen 12 (Xeon HyperThreading)

Table 11. Configuration of the MultiMonster cluster.

QUEEN BEE
CPU model name: 2x Intel(R) Xeon(TM) CPU 2.80GHz
CPU clock (MHz): 2785
Cache size: 512 KB
Memory (MB): 2560
CPU flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush acpi mmx fxsr sse sse2 ss ht tm
CPU speed (BogoMips): 5570.56
Network cards: Intel Corp. 82543GC Gigabit Ethernet Controller (Fiber) (rev 02); 2x Broadcom Corporation NetXtreme BCM5703 Gigabit Ethernet (rev 02)
RAID bus controller: Compaq Computer Corporation Smart Array 5i/532 (rev 01)

Table 12. The hardware configuration for the queen-bee server.


BEE
CPU model name: Intel(R) Pentium(R) 4 CPU 2.66GHz
CPU clock (MHz): 2658
Cache size: 512 KB
Memory (MB): 512
CPU flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
CPU speed (BogoMips): 5308.41
Network card: 2x Broadcom Corporation NetXtreme BCM5702 Gigabit Ethernet (rev 02)

Table 13. The hardware configuration for the bee-machines.

The operating system is an adapted version of Linux SuSE 8.2. The special OpenMOSIX kernel in version 2.4.22 was patched to support the Broadcom gigabit network cards (bcm5700 v.6.2.17-1). Moreover, it was extended by the special ClusterNFS v.3.0 functionality. Special kernel configuration options have been applied, among others: support for 64 GB RAM and symmetric multiprocessing (SMP). The software used within the testing environment covers: OpenMOSIX tools (such as view, migmon, mps, etc.), self-written scripts for cluster management (cexce, xcexec, cping, creboot, ckillall, etc.), transcode (0.6.12) and audio-video processing software (codecs, libraries, etc.).

There are also other tools available which have been written by students to support the RETAVIC project, to name just a few: the audio-video conversion benchmark AVCOB (based on transcode 0.6.12) by Shu Liu [Liu, 2003], file analysis extensions for MPEG-4 (based on tcprobe 0.6.12) by Xinghua Liang, the LLV1 codec with analyzer and transcoder (based on XviD) by Michael Militzer, the MultiMonster Multimedia Server by Holger Velke, Jörg Meier and Marc Iseler (based on JBoss AS and JavaServlet Technology), and automation scripts for benchmarking audio codecs by Florian Penzkoffer.

Finally, there are also a few applications by the author, such as a PSNR measurement tool, a YUV2AVI converter, a YUV presenter, and a web application presenting some of the results (written mainly in PHP).
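Since PSNR is the central objective quality measure used throughout the evaluation, a minimal sketch of such a computation over one 8-bit plane is given below; it only illustrates the standard formula PSNR = 10*log10(255^2/MSE) and does not claim to reproduce the internals of the author's tool.

    #include <math.h>
    #include <stddef.h>
    #include <stdint.h>

    /* PSNR between a reference and a test plane of n 8-bit samples. */
    static double psnr_8bit(const uint8_t *ref, const uint8_t *test, size_t n)
    {
        double mse = 0.0;
        for (size_t i = 0; i < n; i++) {
            double d = (double)ref[i] - (double)test[i];
            mse += d * d;
        }
        mse /= (double)n;
        if (mse == 0.0)
            return INFINITY;                  /* identical planes: lossless case */
        return 10.0 * log10((255.0 * 255.0) / mse);
    }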


XVIII.2. Imprecise measurements in non-real-time

This part was used for the first proof of concept with respect to the expected behavior of the audio-video processing algorithms. It was applied in a best-effort system (Linux or Windows), so some error has been allowed. The goal was not to obtain exact measurements but rather just to show the differences between the standard and the developed algorithms and to justify the relevance of the proposed ideas.

To provide imprecise measurements of the analyzed audio and video processing algorithms that are still burdened with only a relatively small error, isolation from the network has to be applied in order to avoid unpredictable outside influence. Moreover, to prove the behavior of the processing on diverse processor architectures, different computer configurations had to be used in some cases. Thus a few other configurations have been employed, as listed below in Table 14 and Table 15.

PC_RT
CPU model name: AMD Athlon(tm) XP 1800+
CPU clock (MHz): 1533
Cache size: 256 KB
Memory (MB): 512
CPU flags: fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr sse syscall mp mmxext 3dnowext 3dnow
CPU speed (BogoMips): 3022.84
Network card: 3Com 3c905 100BaseTX

Table 14. The configuration of PC_RT.

PC
CPU model name: Intel(R) Pentium(R) 4 Mobile CPU 1.60GHz
CPU clock (MHz): 1596
Cache size: 512 KB
Memory (MB): 512
CPU flags: fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
CPU speed (BogoMips): 2396.42
Network card: not used

Table 15. The configuration of PC.


XVIII.3. Precise measurements in DROPS

The precise measurements in the real-time system are done in a closed system. To provide comparability of the measurements, exactly one computer has been used. Its detailed configuration is listed in the previous section as PC_RT in Table 14.

Due to the DROPS requirements, only a specific type of network card could be used. The network card used was based on the 3Com 3c905 chip. Three identical machines with such network cards have been configured for the development under DROPS; however, the real-time measurements have always been conducted on the same machine (faui6p7). This allowed the error to be minimized even further during the measurement process.

If not explicitly stated otherwise in the in-place description, a general ordered set of DROPS modules has been used. The set included the following modules in sequence:

• rmgr – the resource manager with the sigma0 option (reference to the root pager) for handling physical memory, interrupts and tasks, and for loading the kernel
• modaddr 0x0b000000 – allowing higher memory addresses to be allocated
• fiasco_apic – the microkernel with the APIC one-shot mode configured, providing the scheduling
• sigma0 – the root pager, which possesses all the available memory at the beginning and makes it available to other processes
• log_net_elinkvortex – the network logging server using the supported 3Com network card driver; in very rare cases, the standard log was used instead of the network log_net_elinkvortex
• names – the server for registering the names of running modules; it was also used to control the boot sequence of the modules (because GRUB does not provide such functionality)
• dm_phys – the simple dynamic memory (data space) manager providing memory parts for demanding tasks
• simple_ts – the simple, generic task server required for additional address space creation during runtime (L4 tasks)
• real_time_application – the real-time audio-video decoding or encoding application; it runs as a demanding L4 task (thus it needs simple_ts and dm_phys)


Such a defined configuration of a constant set of OS modules allowed a very stable and predictable benchmarking environment to be achieved, in which no outside processes could influence the real-time measurements. The only possible remaining source of interrupts could be the network log server used in DROPS for grabbing the measurement values. However, it did not send the output at once, but only after the real-time execution was finished. Thus, possible interrupts generated by the network card on the given IRQ have been eliminated from the measured values.


Appendix F

XIX. STATIC META-DATA FOR FEW VIDEO SEQUENCES

The attributes' values of the entities in the initial static MD set have been calculated for a few video sequences. Instead of representing these values in tables, which would occupy an enormous amount of space, they are demonstrated in graphical form. There are three levels of values depicted, in analogy to the natural hierarchy of the initial static MD set proposed in the thesis (Figure 12 on p. 98)112. The frame-based static MD represent the StaticMD_Video subset, the MB-based static MD refer to the StaticMD_Frame subset, and the MV-based (or block-based) static MD are connected with the StaticMD_MotionVector (or StaticMD_Layer) subset. Each of these levels is depicted in an individual section.
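As a reading aid only, the three-level hierarchy can be pictured as nested records. The following C sketch uses the counter names that actually appear in this appendix (IFramesSum, PFramesSum, BFramesSum, IMBsSum, PMBsSum, BMBsSum, MVsSum); all other field and type details are illustrative assumptions and not the definitions used in the thesis.

/* Illustrative sketch of the static-MD hierarchy, not the thesis' actual types. */
typedef struct {
    int type;     /* one of the nine MV types of Section V.3, or no_mv (assumption) */
    int MVsSum;   /* number of MVs of this type in the frame */
} StaticMD_MotionVector;

typedef struct {
    int IMBsSum, PMBsSum, BMBsSum;       /* macro-block counts of the frame */
    StaticMD_MotionVector mvs[10];       /* nine MV types + no_mv */
} StaticMD_Frame;

typedef struct {
    int IFramesSum, PFramesSum, BFramesSum;  /* frame-type counts of the sequence */
    StaticMD_Frame *frames;                  /* one entry per frame */
    int frameCount;
} StaticMD_Video;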

XIX.1. Frame-based static MD

The sequences under investigation have usually been prepared in two versions (Figure 109): the normal one, where only I- and P-frames have been defined, and the one with the _temp extension, where additionally B-frames have been included. The distribution of P- and B-frames within the video sequences was enforced by the LLV1 temporal scalability layer, i.e. an equal number of P- and B-frames appeared. The sums of each frame type (IFramesSum, PFramesSum, and BFramesSum) are shown as the distribution within the video sequence.

112 The division into three levels has been used for the precise time prediction during the design of the real-time processing model.


[Chart "Distribution of I-, P- & B-frames": for each test sequence (parkrun_itu601, shields_itu601, mobcal_itu602_temp, mobcal_itu601, mobile_cif, mobile_cif_temp, container_cif, container_cif_temp, mother_and_daugher_cif_temp, mother_and_daugher_cif, mobile_qcif_temp, mobile_qcif, container_qcif, container_qcif_temp, mother_and_daugher_qcif, mother_and_daugher_qcif_temp) the shares of I-frames, P-frames and B-frames are shown on a 0%-100% scale.]

Figure 109. Distribution of frame types within the used set of video sequences.

XIX.2. MB-based static MD

The MB-based static MD have been prepared for a few video sequences in analogy to the frame-based calculations. There are three types of macro blocks distinguished, and the respective sums (IMBsSum, PMBsSum, and BMBsSum) included in the StaticMD_Frame subset of the initial static MD set are calculated for each frame in the video sequence and depicted in the charts below.

[Charts "Coded MBs L0": per-frame counts of I-, P- and B-MBs for the sequence pairs carphone_qcif_96 / carphone_qcif_96_temp, coastguard_cif / coastguard_cif_temp, coastguard_qcif / coastguard_qcif_temp, container_cif / mobile_cif, mobile_cifn_140 / mobile_cifn_140_temp, and mobile_qcif / mobile_qcif_temp. The X-axis gives the frame number, the Y-axis the number of coded macro blocks per frame.]


XIX.3. MV-based static MD

The MV-based static MD have been prepared for a few videos, again in analogy to the previous sections. There are nine types of MVs distinguished, as described in the Video-Related Static MD section (V.3). The respective frame-specific sum (MVsSum) is kept in relation to the MV type in the StaticMD_MotionVector subset of the initial static MD set. Besides the nine types, there is one more value called no_mv. This value refers to the macro blocks in which no motion vector is stored, i.e. the MB is intra-coded. Please note that no_mv is different from the zero MV (i.e. x=0 and y=0), because in the case of no_mv neither backward-predicted nor bi-directionally-predicted interpolation occurs, while in the other case one of these is applied.

XIX.3.1. Graphs with absolute values

The charts below depict the absolute number of MVs per frame depending on the type. The sum of all ten cases (nine MV types + no_mv) is constant for sequences having only I- or P-MBs (or frames), because either exactly one MV or no_mv is assigned per MB. In contrast, two MVs are assigned to bi-directionally predicted MBs, so the total number may vary between the number of MBs (no B-MBs) and twice the number of MBs (only B-MBs).
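As a plausibility check (assuming the usual 16x16-pixel macro blocks), the bounds can be computed directly from the frame sizes:

$$N_{MB}^{QCIF} = \tfrac{176}{16}\cdot\tfrac{144}{16} = 99, \qquad N_{MB}^{CIF} = \tfrac{352}{16}\cdot\tfrac{288}{16} = 396,$$

so the per-frame sum of all ten counters lies between 99 and 198 for the QCIF sequences and between 396 and 792 for the CIF sequences, which matches the value ranges of the charts below.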

[Charts "MVs per frame": per-frame counts of the nine MV types (mv1-mv9) and of no_mv for the sequence pairs carphone_qcif_96 / carphone_qcif_96_temp, coastguard_cif / coastguard_cif_temp, coastguard_qcif / coastguard_qcif_temp, container_cif / mobile_cif, mobile_cifn_140 / mobile_cifn_140_temp, and mobile_qcif / mobile_qcif_temp. The X-axis gives the frame number, the Y-axis the absolute number of MVs per frame.]

XIX.3.2. Distribution graphs


The distribution graphs for the same sequences are depicted below. The small rectangles (bars) depict the sum of the given type of vector within the frame, such that white is equal to zero and black is equal to all MVs in the frame; obviously, the darker the color, the more MVs of the given type exist in the frame. The frame numbers run along the X-axis, starting with the first frame on the left and ending with the last frame on the right; the bar width depends on the number of frames presented in the histogram. The MV types are assigned along the Y-axis, starting with mv1 at the top, going step-by-step down to mv9, and having no_mv at the very bottom. Thus, it is easily noticeable that for the first frame of each video sequence the bottom-left rectangle is dark and all nine rectangles above it in the same column are white; this is due to the use of only closed GOPs in each sequence and thus always an I-frame at the beginning, which contains no MVs at all (because only I-MBs are included).
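One hedged way to formalize the shading described above (this formula is not given explicitly in the thesis and is only an interpretation of the white-to-black mapping) is a per-frame normalization:

$$\mathrm{shade}(t,f) \;=\; 1 - \frac{\mathrm{MVsSum}(t,f)}{\sum_{t'} \mathrm{MVsSum}(t',f)},$$

where $t$ ranges over the ten categories (mv1-mv9 and no_mv) of frame $f$, so a value of 1 corresponds to white (no MVs of that type) and 0 to black (all MVs of the frame belong to that type).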

[Distribution graphs for: carphone_qcif_96, carphone_qcif_96_temp, coastguard_cif, coastguard_cif_temp, coastguard_qcif, coastguard_qcif_temp, container_qcif, container_qcif_temp, mobile_cifn_140, mobile_cifn_140_temp, mobile_qcif, mobile_qcif_temp, and mobile_cif.]


Appendix G

This appendix includes the full listings of the real-time functions of the meta-data-based converters implemented for DROPS, i.e. the RT-MD-LLV1 decoder and the RT-MD-XVID encoder.

XX. FULL LISTING OF IMPORTANT REAL-TIME FUNCTIONS IN RT-MD-LLV1

XX.1. Function preempter_thread()

#if REALTIME
static void preempter_thread (void)
{
  l4_rt_preemption_t _dw;
  l4_msgdope_t _result;
  l4_threadid_t id1, id2;
  extern l4_threadid_t main_thread_id;
  extern volatile char timeslice_overrun_optional;
  extern volatile char timeslice_overrun_mandatory;
  extern volatile char deadline_miss;
  extern int no_base_tso;
  extern int no_enhance_tso;
  extern int no_clean_tso;
  extern int no_deadline_misses;

  id1 = L4_INVALID_ID;
  id2 = L4_INVALID_ID;

  while (1) {
    // wait for preemption IPC
    if (l4_ipc_receive(l4_preemption_id(main_thread_id),
                       L4_IPC_SHORT_MSG, &_dw.lh.low, &_dw.lh.high,
                       L4_IPC_NEVER, &_result) == 0) {
      if (_dw.p.type == L4_RT_PREEMPT_TIMESLICE) {
        /* this is timeslice 1 ==> mandatory */
        if (_dw.p.id == 1) {
          /* mark this TSO */
          timeslice_overrun_mandatory = 1;
          /* count tso */
          no_base_tso++;
        }
        /* this is timeslice 2 ==> optional */
        else if (_dw.p.id == 2) {
          /* mark this TSO for main thread */
          timeslice_overrun_optional = 1;
          /* count tso */
          no_enhance_tso++;
        }
        /* this is timeslice 3 ==> mandatory */
        else if (_dw.p.id == 3) {
          /* count tso */
          no_clean_tso++;
        }
      }
      /* this is a deadline miss !
       * => we're really in trouble! */
      else if (_dw.p.type == L4_RT_PREEMPT_DEADLINE) {
        /* mark deadline miss */
        deadline_miss = 1;
        /* count tso */
        no_deadline_misses++;
      }
    }
    else
      LOG("Preempt-receive returned %x", L4_IPC_ERROR(_result));
  }
}
#endif /*REALTIME*/

XX.2. Function load_allocation_params()

#if REALTIME
/* load parameters for allocation */
void load_allocation_params(void)
{
  /* list with allocations (must be defined as -D with Makefile) */
#ifdef _qcif
  file = "_qcif";
  max_base_per_MB = 0.0;
  avg_base_per_MB_base = 27.15;
  max_base_per_MB_base = 28.17;
  avg_base_per_MB_enhance = 29.35;
  max_base_per_MB_enhance = 30.35;
  max_enhance_per_MB = 18.44;
  avg_cleanup_per_MB_base = 2.22;
  max_cleanup_per_MB_base = 3.05;
  avg_cleanup_per_MB_enhance = 13.10;
  max_cleanup_per_MB_enhance = 13.17;
#elif defined _cif
  file = "_cif";
  max_base_per_MB = 0.0;
  avg_base_per_MB_base = 21.36;
  max_base_per_MB_base = 22.14;
  avg_base_per_MB_enhance = 25.12;
  max_base_per_MB_enhance = 25.97;
  max_enhance_per_MB = 15.17;
  avg_cleanup_per_MB_base = 2.02;
  max_cleanup_per_MB_base = 2.07;
  avg_cleanup_per_MB_enhance = 14.66;
  max_cleanup_per_MB_enhance = 15.30;
#elif defined _itu601
  file = "_itu601";
  max_base_per_MB = 0.0;
  avg_base_per_MB_base = 18.59;
  max_base_per_MB_base = 23.54;
  avg_base_per_MB_enhance = 22.71;
  max_base_per_MB_enhance = 27.69;
  max_enhance_per_MB = 15.09;
  avg_cleanup_per_MB_base = 1.60;
  max_cleanup_per_MB_base = 1.72;
  avg_cleanup_per_MB_enhance = 14.37;
  max_cleanup_per_MB_enhance = 14.96;
#else
  file = "unknown_video";
  max_base_per_MB = 0.0;
  avg_base_per_MB_base = 27.15;
  max_base_per_MB_base = 28.17;
  avg_base_per_MB_enhance = 29.35;
  max_base_per_MB_enhance = 30.35;
  max_enhance_per_MB = 18.44;
  avg_cleanup_per_MB_base = 2.22;
  max_cleanup_per_MB_base = 3.05;
  avg_cleanup_per_MB_enhance = 14.96;
  max_cleanup_per_MB_enhance = 15.30;
#endif
}
#endif /*REALTIME*/
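The listing does not state the unit or the exact use of these constants. Assuming they are per-macro-block processing-time bounds that feed the per-frame timeslice reservations (an assumption, not confirmed by the listing), a worst-case base-layer budget for a CIF frame (396 MBs) would be on the order of

$$T^{\max}_{\mathrm{base}} \approx N_{MB} \cdot \mathrm{max\_base\_per\_MB\_base} = 396 \cdot 22.14 \approx 8.8 \cdot 10^{3}$$

time units per frame.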


XXI. FULL LISTING OF IMPORTANT REAL-TIME FUNCTIONS IN RT-MD-XVID

XXI.1. Function preempter_thread()

#if REALTIME
static void preempter_thread (void)
{
  l4_rt_preemption_t _dw;
  l4_msgdope_t _result;
  l4_threadid_t id1, id2;
  extern l4_threadid_t main_thread_id;
  extern volatile char deadline_miss;
  extern int no_deadline_misses;

  id1 = L4_INVALID_ID;
  id2 = L4_INVALID_ID;

  while (1) {
    // wait for preemption IPC
    if (l4_ipc_receive(l4_preemption_id(main_thread_id),
                       L4_IPC_SHORT_MSG, &_dw.lh.low, &_dw.lh.high,
                       L4_IPC_NEVER, &_result) == 0) {
      if (_dw.p.type == L4_RT_PREEMPT_TIMESLICE) {
        // this is timeslice 1 ==> mandatory
        if (_dw.p.id == 1) {
          realtime_mode = OPTIONAL;
          l4_rt_next_reservation(1, &left);
        }
        // this is timeslice 2 ==> optional
        else if (_dw.p.id == 2) {
          realtime_mode = MANDATORY_CLEANUP;
          l4_rt_next_reservation(2, &left);
        }
        // this is timeslice 3 ==> mandatory
        else if (_dw.p.id == 3) {
          realtime_mode = DEADLINE;
          l4_rt_next_reservation(3, &left);
        }
      }
      // this is a deadline miss !
      // => we're really in trouble!
      else if (_dw.p.type == L4_RT_PREEMPT_DEADLINE) {
        // mark deadline miss
        deadline_miss = 1;
        // count tso
        no_deadline_misses++;
      }
    }
    else
      LOG("Preempt-receive returned %x", L4_IPC_ERROR(_result));
  }
}
#endif /*REALTIME*/



Appendix H

This appendix covers audio-specific aspects, namely the MPEG-4 audio tools and profiles as well as the MPEG-4 SLS enhancements.

XXII. MPEG-4 AUDIO TOOLS AND PROFILES

[Table 16 (flattened in this copy): for each MPEG-4 Audio Object Type (AAC main, AAC LC, AAC SSR, AAC LTP, SBR, AAC Scalable, TwinVQ, CELP, HVXC, TTSI, Main synthetic, Wavetable synthesis, General MIDI, Algorithmic Synthesis and Audio FX, ER AAC LC, ER AAC LTP, ER AAC scalable, ER TwinVQ, ER BSAC, ER AAC LD, ER CELP, ER HVXC, ER HILN, ER Parametric, SSC, Layer-1, Layer-2, Layer-3) the table marks which coding functionality (tools/modules) it employs: gain control, block switching, standard and AAC-LD window shapes, standard and SSR filterbanks, TNS, LTP, intensity coupling, frequency-domain prediction, PNS, MS, SIAQ, FSS, upsampling filter tool, AAC coding & quantization, TwinVQ coding & quantization, BSAC coding & quantization, ER AAC tools, ER payload syntax, EP tool, silence compression, CELP, HVXC, 4kb HVXC, SA tools, SASBF, MIDI, HILN, TTSI, SBR, and MPEG Layer-1/2/3.]

Table 16. MPEG Audio Object Type Definition based on Tools/Modules [MPEG-4 Part III, 2005].

315 Appendix H XXII. MPEG-4 Audio Tools and Profiles

Explanation of the abbreviations used in Table 16 (for a detailed description of the Tools/Modules, readers are referred to [MPEG-2 Part VII, 2006; MPEG-4 Part III, 2005] and the respective MPEG-4 standard amendments):

• LC – Low Complexity
• ER – Error Robust
• SSR – Scalable Sample Rate
• BSAC – Bit Sliced Arithmetic Coding
• LTP – Long Term Predictor
• SBR – Spectral Band Replication
• HILN – Harmonic and Individual Lines plus Noise
• TwinVQ – Transform-domain Weighted Interleaved Vector Quantization
• SSC – Sinusoidal Coding
• LD – Low Delay
• CELP – Code Excited Linear Prediction
• TNS – Temporal Noise Shaping
• HVXC – Harmonic Vector Excitation Coding
• SA – Structured Audio
• SASBF – Structured Audio Sample Bank Format
• TTSI – Text-to-Speech Interface
• MIDI – Musical Instrument Digital Interface
• HE – High Efficiency
• PS – Parametric Stereo

[Table 17 (flattened in this copy): for the selected Audio Object Types, identified by their Object Type ID (1 AAC main, 2 AAC LC, 3 AAC SSR, 4 AAC LTP, 23 ER AAC LD), the table marks their inclusion in the MPEG-4 Audio Profiles (Main Audio, Scalable Audio, High Quality Audio, Low Delay Audio, Natural Audio, Mobile Audio Internetworking, AAC, and High Efficiency AAC). AAC main is used in two profiles, AAC LC in six, AAC SSR in two, AAC LTP in four, and ER AAC LD in three.]

Table 17. Use of a few selected Audio Object Types in MPEG Audio Profiles [MPEG-4 Part III, 2005].


XXIII. MPEG-4 SLS ENHANCEMENTS

This section is based on the work conducted within the framework of the joint master thesis project [Wendelska, 2007] in cooperation with Dipl.-Math. Ralf Geiger from Fraunhofer IIS.

XXIII.1. Investigated versions - origin and enhancements

Origin
• v0 – origin version (not used due to printf overhead)
• v01 – origin version (printf commented out for the measurements)

New interpolations
• v1 – new interpolateValue1to7 (but incomplete measurements – only 2 sequences checked)
• v2 – new interpolateFromCompactTable (1st method)

Vectorizing Headroom
• v3 – new vectorized msbHeadroomINT32 (with the old interpolateFromCompactTable); see the sketch after this list

Vectorizing the 2-level loop of srfft_fixpt
• v4 – new vectorized 1st loop of srfft_fixpt (with the old msbHeadroomINT32)
• v5 – old 1st loop of srfft_fixpt, new vectorized 2nd loop of srfft_fixpt
• v6 – new vectorized 1st and 2nd loop of srfft_fixpt

New interpolation and vectorizing Headroom
• v7 – new interpolateFromCompactTable (1st method) and new vectorized msbHeadroomINT32 [incl. v2 & v3]
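Purely for illustration, the following C sketch shows the kind of computation the name msbHeadroomINT32 suggests: determining the unused most-significant bits of INT32 samples and taking the minimum over a block, which is the natural candidate for vectorization. It is a hypothetical sketch and not the MPEG-4 SLS reference code optimized in [Wendelska, 2007].

#include <stdint.h>

/* Headroom of a single INT32 value: number of redundant MSBs below the sign bit. */
static int msb_headroom_int32(int32_t x)
{
    uint32_t v = (x < 0) ? ~(uint32_t)x : (uint32_t)x;  /* magnitude-like bits */
    int h = 0;
    while (h < 31 && !(v & (1u << (30 - h))))           /* count unused leading bits */
        h++;
    return h;
}

/* Headroom of a whole block = minimum over all samples (the part worth vectorizing). */
static int block_headroom(const int32_t *x, int n)
{
    int h = 31;
    for (int i = 0; i < n; i++) {
        int hi = msb_headroom_int32(x[i]);
        if (hi < h)
            h = hi;
    }
    return h;
}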

XXIII.2. Measurements

[Chart "Average Time": average execution time per version (v01, v2-v7) for the sequences bach44m, barcelona32s, chopin44s, jazz48m and jazz48s, with min/max deviation bars.]

The chart depicts the execution times of the different versions for the five compared sequences. Only the times of v2, v3 and v7 are smaller than the origin time of the unoptimized code for all sequences. The minimum and maximum times measured are depicted as deviations from the average of twelve executions. Out of all 420 measurements, there are only 3 cases where the difference between MAX and AVG exceeds 5.8% (namely 15.5%, 10.7% and 7.5%), and only 2 cases where the MIN-to-AVG difference exceeds 5.0% (namely 6.1% and 5.1%). These five measurements were influenced by outside factors and are thus treated as irrelevant.


[Chart "Average Time Cumulated": execution time accumulated over all sequences (bach44m, barcelona32s, chopin44s, jazz48m, jazz48s) per version (v01, v2-v7).]

The average execution time was accumulated over all sequences for the respective versions. It clearly shows that a smaller time is achieved only by v2, v3 and v7.

[Chart "Execution Time (vN%) vs. Origin Time (v01%)": relative execution time of each version (v2-v7) with respect to v01, per sequence (bach44m, barcelona32s, chopin44s, jazz48m, jazz48s) and on average.]

The execution time of each version in comparison to the origin time (v01) demonstrates the gain for each sequence and for all of them on average. The v2 version needs only 96.6%, v3 only 79.8% and v7 only 76.4% of the origin time of the unoptimized version.

[Chart "Speed-up Ratio (Origin vs. Current)": speed-up of each version relative to v01, per sequence and on average.]

Thus the v7 version finally delivers the best speed-up, ranging from 1.26 to 1.35 for the different sequences and being equal to 1.31 on average.

XXIII.3. Overall Final Improvement

The final benchmark has been conducted in comparison to the origin version. Figure 110 presents the overall speed-up ratios for both the encoder and the decoder, i.e. the percentage of processing time gained by the final code version over the original one. The total execution time of the encoder was decreased by 21%-36%, depending on the input file, while the decoder's total time was reduced only by 15%-25%. All the successfully vectorized functions and operations together obtained about 18% speed-up of the total execution time and about 28% of the IntMDCT time compared to the original code version. The enhancement in the accumulated execution time of the IntMDCT calculations, being the focus of the project, was noticeably larger than the overall results, i.e. the IntMDCT-encoding speed-up achieved 42%-48% and the InvIntMDCT required 45%-50% less time, respectively. As a result, a decrease of the main optimization target by roughly a factor of two has been achieved.
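Reading these time reductions as speed-up factors makes the factor-of-two statement explicit: a reduction of the execution time by a fraction r corresponds to a speed-up of

$$S = \frac{t_{\mathrm{orig}}}{t_{\mathrm{new}}} = \frac{1}{1-r}, \qquad r = 0.42\ldots0.48 \;\Rightarrow\; S \approx 1.7\ldots1.9, \qquad r = 0.45\ldots0.50 \;\Rightarrow\; S \approx 1.8\ldots2.0.$$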

Figure 110. Percentage of the total gained time between the original code version and the final version [Wendelska, 2007].



Bibliography

[Ahmed et al., 1974] Ahmed, N., Natarajan, T., Rao, K. R.: Discrete Cosine Transform. IEEE Trans. Computers Vol. xx, pp.90-93, 1974. [ANS, 2001] ANS: American National Standard T1.523-2001: Telecom Glossary 2000. Alliance for Telecommunications Industry Solutions (ATIS) Committee T1A1, National Telecommunications and Information Administration's Institute for Telecommunication Sciences (NTIA/ITS) - Approved by ANSI, Feb. 28th, 2001. [Assunncao and Ghanbari, 1998] Assunncao, P. A. A., Ghanbari, M.: A Frequency-Domain Video Transcoder for Dynamic Bit-Rate Reduction of MPEG-2 Bit Streams. IEEE Trans. Circuits and Systems for Video Technology Vol. 8(8), pp.953-967, 1998. [Astrahan et al., 1976] Astrahan, M. M., Blasgen, M. W., Chamberlin, D. D., Eswaran, K. P., Gray, J. N., Griffiths, P. P., King, W. F., Lorie, R. A., McJones, P. R., Mehl, J. W., Putzolu, G. R., Traiger, I. L., Wade, B., V., W.: System R: A Relational Approach to Database Management. ACM Transactions on Database Systems Vol. 1(2), pp.97-137, 1976. [Auspex, 2000] Auspex: A Storage Architecture Guide. White Paper, Santa Clara (CA), USA, Auspex Systems, Inc., 2000. [Barabanov and Yodaiken, 1996] Barabanov, Yodaiken: Real-Time Linux. Linux Journal Vol., 1996. [Bente, 2004] Bente, N.: Comparison of Multimedia Servers Available on Nowadays Market— Hardware and Software. Study Project, Database Systems Chair. FAU Erlangen-Nuremberg, Erlangen, Germany [Berthold and Meyer-Wegener, 2001] Berthold, H., Meyer-Wegener, K.: Schema Design and Query Processing in a Federated Multimedia Database System. 6th International Conference on Cooperative Information Systems (CoopIS'01), in Lecture Notes in Computer Science Vol.2172, Trento, Italy, Springer Verlag, Sep. 2001. [Bovik, 2005] Bovik, A. C.: Handbook of Image and Video Processing (2nd Ed.), Academic Press, ISBN 0-12-119792-1, 2005. [Bryant and O'Hallaron, 2003] Bryant, R. E., O'Hallaron, D. R.: Computer Systems – A Programmer's Perspective. Chapter IX. Measuring Program Execution Time, Prentice Hall, ISBN 0- 13-034074-X, 2003. [Campbell and Chung, 1996] Campbell, S., Chung, S.: Database Approach for the Management of Multimedia Information. Multimedia Database Systems. Ed.: K. Nwosu, Kluwer Academic Publishers, ISBN 0-7923-9712-6, 1996. [Candan et al., 1996] Candan, K. S., Subrahmanian, V. S., Rangan, P. V.: Towards a Theory of Collaborative Multimedia IEEE International Conference on Multimedia Computing and Systems (ICMCS'96), Hiroshima, Japan, Jun. 1996.


[Carns et al., 2000] Carns, P. H., Ligon III, W. B., Ross, R. B., Thakur, R.: PVFS: A Parallel File System For Linux Clusters. 4th Annual Linux Showcase and Conference, Atlanta (GA), USA, Oct. 2000. [Cashin, 2005] Cashin, E.: Kernel Korner - ATA Over Ethernet: Putting Hard Drives on the LAN. Linux Journal Vol. 134, 2005. [Chamberlin et al., 1981] Chamberlin, D. D., Astrahan, M. M., Blasgen, M. W., Gray, J. N., King, W. F., Lindsay, B. G., Lorie, R., Mehl, J. W., Price, T. G., Putzolu, F., Selinger, P. G., Schkolnick, M., Slutz, D. R., Traiger, I. L., Wade, B. W., Yost, R. A.: A History and Evaluation of System R. Communications of the ACM Vol. 24(10), pp.632-646, 1981. [Ciliendo, 2006] Ciliendo, E.: Linux-Tuning: Performance-Tuning für Linux-Server. iX Vol. 01/06, pp.130-132, 2006. [CODASYL Systmes Committee, 1969] CODASYL Systmes Committee: A Survey of Generalized Data Base Management Systems. Technical Report (PB 203142), May 1969. [Codd, 1970] Codd, E. F.: A Relational Model of Data for Large Shared Data Banks. Communications of the ACM Vol. 13(6), pp.377-387, 1970. [Codd, 1995] Codd, E. F.: "Is Your DBMS Really Relational?" and "Does Your DBMS Run By the Rules?" ComputerWorld, (Part 1: October 14, 1985, Part 2: October 21, 1985). Vol. xx, 1995. [Connolly and Begg, 2005] Connolly, T. M., Begg, C. E.: Database Systems: A Practical Approach to Design, Implementation, and Management (4th Ed.). Essex, England, Pearson Education Ltd., ISBN 0-321-21025-5, 2005. [Curran and Annesley, 2005] Curran, K., Annesley, S.: Transcoding Media for Bandwidth Constrained Mobile Devices. International Journal of Network Management Vol. 15, pp.75-88, 2005. [Cutmore, 1998] Cutmore, N. A. F.: Dynamic Range Control in a Multichannel Environment. Journal of the Audio Engineering Society Vol. 46(4), pp.341-347, 1998. [Dashti et al., 2003] Dashti, A., Kim, S. H., Shahabi, C., Zimmermann, R.: Server Design, Prentice Hall Ptr, ISBN 0-13-067038-3, 2003. [Davies, 1984] Davies, B.: Integral Transforms and Their Applications (Applied Mathematical Sciences), Springer, ISBN 0-387-96080-5, 1984. [Dennis and Van Horn, 1966] Dennis, J. B., Van Horn, E. C.: Programming semantics for multiprogrammed computations. Communications of the ACM Vol. 9(3), pp.143-155, 1966. [Devos et al., 2003] Devos, H., Eeckhaut, H., Christiaens, M., Verdicchio, F., Stroobandt, D., Schelkens, P.: Performance requirements for reconfigurable hardware for a scalable wavelet video decoder. CD-ROM Proceedings of the ProRISC / IEEE Benelux Workshop on Circuits, Systems and Signal Processing, STW, Utrecht, Nov. 2003. [Ding and Guo, 2003] Ding, G.-g., Guo, B.-l.: Improvement to Progressive Fine Granularity Scalable Video Coding 5th International Conference on Computational Intelligence and Multimedia Applications (ICCIMA'03) Xi'an, China, Sep. 2003. [Dingeldein, 1995] Dingeldein, D.: Multimedia interactions and how they can be realized. SPIE Photonics West Symposium, Multimedia Computing and Networking, San José (CA), USA, SPIE Vol. 2417, pp.46-53, Mar. 1995. [Dogan, 2001] Dogan, S.: Video Transcoding for Multimedia Communication Networks. PhD Thesis. University of Surrey, Guildford, United Kingdom. Oct. 2001.


[Effelsberg and Steinmetz, 1998] Effelsberg, W., Steinmetz, R.: Video Compression Techniques. Heidelberg, Germany, dpunkt Verlag, 1998. [Eisenberg and Melton, 2001] Eisenberg, A., Melton, J.: SQL Multimedia and Application Packages (SQL/MM). SIGMOD Record Vol. 30(4), 2001. [El-Rewini et al., 1994] El-Rewini, H., Lewis, T. G., Ali, H. H.: Task Scheduling in Parallel and Distributed Systems. New Jersey, USA, PTR Prentice Hall, ISBN 0-13-099235-6, 1994. [Eleftheriadis and Anastassiou, 1995] Eleftheriadis, A., Anastassiou, D.: Constrained and General Dynamic Rate Shaping of Compressed Digital Video. 2nd IEEE International Conference on Image Processing (ICIP'95), Arlington (VA), USA, IEEE, Oct. 1995. [Elmasri and Navathe, 2000] Elmasri, R., Navathe, S. B.: Fundamentals of Database Systems. Reading (MA), USA, Addison Wesley Longman Inc., ISBN 0-8053-1755-4, 2000. [Fasheh, 2006] Fasheh, M.: OCFS2: The Oracle Clustered File System, Version 2, retrieved on 21.07.2006, 2006, from http://oss.oracle.com/projects/ocfs2/dist/documentation/fasheh.pdf, 2006. [Feig and Winograd, 1992] Feig, E., Winograd, S.: Fast Algorithms for the Discrete Cosine Transform. IEEE Trans. Signal Processing Vol. 40(9), pp.2174-2193, 1992. [Ford et al., 1997] Ford, B., van Maren, K., Lepreau, J., Clawson, S., Robinson, B., Turner, J.: The FLUX OS Toolkit: Reusable Components for OS Implementation. 6th IEEE Workshop on Hot Topics in Operating Systems, Cape Cod (MA), USA, May 1997. [Fortier and Michel, 2002] Fortier, P. J., Michel, H. E.: Computer Systems Performance Evaluation and Prediction. Burlington (MA), USA, Digital Press, ISBN 1-55558-260-9, 2002. [Fry and Sibley, 1976] Fry, J. P., Sibley, E. H.: Evolution of Data-Base Management Systems. ACM Computing Surveys (CSUR) Vol. 8(1), pp.7-42, 1976. [Geiger et al., 2001] Geiger, R., Sporer, T., Koller, J., Brandenburg, K.: Audio Coding Based On Integer Transforms. 111th Convention AES Convention, New York (NY), USA, AES, Sep. 2001. [Geiger et al., 2004] Geiger, R., Yokotani, Y., Schuller, G., Herre, J.: Improved Integer Transforms using Multi-Dimensional Lifting. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'04), Montreal (Quebec), Canada, IEEE, May, 11-17th, 2004. [Geiger et al., 2006] Geiger, R., Yu, R., Herre, J., Rahardja, S. S., Kim, S.-W., Lin, X., Schmidt, M.: ISO / IEC MPEG-4 High-Definition Scalable Advanced Audio Coding. 120th Convention of Audio Engineering Society (AES), Paris, France, AES No. 6791, May 2006. [Gemmel et al., 1995] Gemmel, D. J., Vin, H. M., Kandlur, D. D., Rangan, P. V., Rowe, L. A.: Multimedia Storage Servers: A Tutorial. IEEE Computer Vol. 28(5), pp.40-49, 1995. [Gibson et al., 1998] Gibson, J. D., Berger, T., Lookabaugh, T., Lindbergh, D., Baker, R. L.: Digital Compression for Multimedia: Principles and Standards. London, UK, Academic Press, 1998. [Hamann, 1997] Hamann, C.-J.: On the Quantitative Specification of Jitter Constrained Periodic Streams. 5th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems MASCOTS’97, Haifa, Israel, Jan. 1997. [Hamann et al., 2001a] Hamann, C.-J., Löser, J., Reuther, L., Schönberg, S., Wolter, J., Härtig, H.: Quality-Assuring Scheduling – Using Stochastic Behavior to Improve Resource Utilization. 22nd IEEE Real-Time Systems Symposium (RTSS 2001), London, UK, Dec. 2001.


[Hamann et al., 2001b] Hamann, C.-J., Märcz, A., Meyer-Wegener, K.: Buffer Optimization in Realtime Media Streams using Jitter-Constrained Periodic Streams. SFB 358 - G3 - 01/2001 Technical Report. TU Dresden, Dresden, Germany. Jan. 2001. [Härtig et al., 1997] Härtig, H., Hohmuth, M., Liedtke, J., Schönberg, S.: The performance of μ- kernel-based systems. 16th ACM Symposium on Operating Systems Principles, Saint Malo, France, 1997. [Härtig et al., 1998] Härtig, H., Baumgartl, R., Borriss, M., Hamann, C.-J., Hohmuth, M., Mehnert, F., Reuther, L., Schönberg, S., Wolter, J.: DROPS — OS Support for Distributed Multimedia Applications. 8th ACM SIGOPS European Workshop (SIGOPS EW'98), Sintra, Portugal, Sep. 1998. [Henning, 2001] Henning, P. A.: Taschenbuch Multimedia (2nd Ed.). München, Germany, Carl Hanser Verlag, ISBN 3-446-21751-7, 2001. [Hohmuth and Härtig, 2001] Hohmuth, M., Härtig, H.: Pragmatic Nonblocking Synchronization for Real-time Systems. USENIX Annual Technical Conference, Boston (MA), USA, Jun. 2001. [IBM Corp., 1968] IBM Corp.: Information Management Systems/360 (IMS/360) - Application Description Manual, New York (NY), USA, IBM Corp. Form No. H20-0524-1 White Plains, 1968. [IBM Corp., 2003] IBM Corp.: DB2 Universal Database: Image, Audio, and Video Extenders - Administration and Programming, Version 8. 1st Ed., Jun. 2003. [Ihde et al., 2000] Ihde, S. C., Maglio, P. P., Meyer, J., Barrett, R.: Intermediary-based Transcoding Framework. Poster Proc. of 9th Intl. World Wide Web Conference (WWW9), 2000. [Imaizumi et al., 2002] Imaizumi, S., Takagi, A., Kiya, H.: Lossless Inter-frame Video Coding using Extended JPEG2000. International Technical Conference on Circuits Systems, Computers and Communications (ITC CSCC '02), Phuket, Thailand Jul. 2002. [ISO 9000, 2005] ISO 9000: Standard 9000:2005 – Quality Management Systems – Fundamentals and Vocabulary, ISO Technical Committee 176 / SC1, Sep. 2005. [ITU-T Rec. H.262, 2000] ITU-T Rec. H.262: Information Technology – Generic Coding of Moving Pictures and Associated Audio Information: Video. Recommendation H.262, ITU-T, Feb. 2000. [ITU-T Rec. H.263+, 1998] ITU-T Rec. H.263+: Video coding for low bit rate communication (called H.263+). Recommendation H.263, ITU-T, Feb. 1998. [ITU-T Rec. H.263++, 2000] ITU-T Rec. H.263++: Video coding for low bit rate communication - Annex U,V,W (called H.263++). Recommendation H.263, ITU-T, Nov. 2000. [ITU-T Rec. H.263+++, 2005] ITU-T Rec. H.263+++: Video coding for low bit rate communication - Annex X and unified specification document (called H.263+++). Recommendation H.263, ITU-T, Jan. 2005. [ITU-T Rec. H.264, 2005] ITU-T Rec. H.264: for Generic Audiovisual Services. Recommendation H.264 & ISO/IES 14496-10 AVC, ITU-T & ISO/IES, Mar. 2005. [ITU-T Rec. P.835, 2003] ITU-T Rec. P.835: Subjective Test Methodology for Evaluating Speech Communication Systems that include Noise Suppression Algorithm. Recommendation P.835, ITU-T, Nov. 2003.


[ITU-T Rec. T.81, 1992] ITU-T Rec. T.81: Information Technology – Digital Compression and Coding of Continuous-Tone Still Images – Requirements and Guidelines. ITU-T Recommendation T.81 and ISO/IEC International Standard 10918-1, JPEG (ITU-T CCITT SG-7 and ISO/IEC JTC-1/SC-29/WG-10), Sep. 1992. [ITU-T Rec. X.641, 1997] ITU-T Rec. X.641: Information technology – Quality of Service: Framework. Recommendation X.641, ITU-T, Dec. 1997. [ITU-T Rec. X.642, 1998] ITU-T Rec. X.642: Information technology – Quality of Service: Guide to Methods and Mechanisms. Recommendation X.642, ITU-T, Sep. 1998. [Jaeger et al., 1999] Jaeger, T., Elphinstone, K., Liedke, J., Panteleenko, V., Park, Y.: Flexible Access Control Using IPC Redirection. 7th Workshop on Hot Topics in Operating Systems (HOTOS), Rio Rico (AZ), USA, IEEE Computer Society, Mar. 1999. [Jankiewicz and Wojciechowski, 2004] Jankiewicz, K., Wojciechowski, M.: Standard SQL/MM: SQL Multimedia and Application Packages. IX Seminarium PLUG "Przetwarzanie zaawansowanych struktur danych: Oracle interMedia, Spatial, Text i XML DB", Warsaw, Poland, Stowarzyszenie Polskiej Grupy Użytkowników systemu Oracle, Mar. 2004. [JTC1/SC32, 2007] JTC1/SC32: ISO/IEC 13249: 2002 Information technology -- Database languages -- SQL multimedia and application packages. ISO/IEC 13249 3rd Ed., ISO/IEC, 2007. [Käckenhoff et al., 1994] Käckenhoff, R., Merten, D., Meyer-Wegener, K.: "MOSS as Multimedia Object Server - Extended Summary". Multimedia: Advanced Teleservices and High Speed Communication Architectures, Proc. 2nd Int. Workshop - IWACA '94 (Heidelberg, Sept. 26-28, 1994), Ed. R. Steinmetz, Lecture Notes in Computer Science Vol.868, Heidelberg, Germany, Springer-Verlag [Kahrs and Brandenburg, 1998] Kahrs, M., Brandenburg, K.: Applications of Digital Signal Processing to Audio and Acoustics, Kluwer Academic Publishers, ISBN 0-7923-8130-0, 1998. [Kan and Fan, 1998] Kan, K.-S., Fan, K.-C.: Video Transcoding Architecture with Minimum Buffer Requirement for Compressed MPEG-2 Bitstream. Signal Processing Vol. 67(2), pp.223- 235, 1998. [Keesman et al., 1996] Keesman, G., Hellinghuizen, R., Hoeksema, F., Heideman, G.: Transcoding of MPEG Bitstream. Signal Processing - Image Communication Vol. 8(6), pp.481-500, 1996. [Khoshafian and Baker, 1996] Khoshafian, S., Baker, A.: MultiMedia and Imaging Databases, Morgan Kaufmann, ISBN 1-55860-312-3, 1996. [King et al., 2004] King, R., Popitsch, N., Westermann, U.: METIS: a Flexible Database Foundation for Unified Media Management. ACM Multimedia 2004 (ACMMM'04), New York (NY), USA, Oct. 2004. [Knutsson et al., 2003] Knutsson, B., Lu, H., Mogul, J., Hopkins, B.: Architecture and Performance of Server-Directed Transcoding. ACM Transactions on Internet Technology (TOIT) Vol. 3(4), pp.392-424, 2003. [Kuhn and Suzuki, 2001] Kuhn, P., Suzuki, T.: MPEG-7 Metadata for Video Transcoding: Motion and Difficulty Hint. SPIE Conference on Storage and Retrieval for Multimedia Databases, San Jose (CA), USA, SPIE Vol. 4315, 2001.


[LeBlanc and Markatos, 1992] LeBlanc, T. J., Markatos, E. P.: Shared Memory vs. Message Passing in Shared-Memory Multiprocessors. 4th IEEE Symposium on Parallel and Distributed Processing, Arlington (TX), USA Dec. 1992. [Lee et al., 2005] Lee, C.-J., Lee, K.-S., Park, Y.-C., Youn, D.-H.: Adaptive FFT Window Switching for Psychoacoustic Model in MPEG-4 AAC, Seoul, Korea, Yonsei University Digital Signal Processing Lab, pp.553, Jul. 2005. [LeGall, 1991] LeGall, D.: MPEG: A Video Compression Standard for Multimedia Applications. Communications of the ACM Vol. 34(4), pp.46-58, 1991. [Li and Shen, 2005] Li, K., Shen, H.: Coordinated Enroute Multimedia Object Caching in Transcoding Proxies for Tree Networks. ACM Transactions on Multimedia Computing, Communications and Applications Vol. 1(3), pp.289-314, 2005. [Li, 2001] Li, W.: Overview of Fine Granularity Scalability in MPEG-4 Video Standard. IEEE Trans. Circuits and Systems for Video Technology Vol. 11(3), pp.301-317, 2001. [Liang and Tran, 2001] Liang, J., Tran, T. D.: Fast Multiplierless Approximation of the DCT with the Lifting Scheme. IEEE Trans. Signal Processing Vol. 49(12), pp.3032-3044, 2001. [Liebchen et al., 2005] Liebchen, T., Moriya, T., Harada, N., Kamamoto, Y., Reznik, Y.: The MPEG-4 Audio Lossless Coding (ALS) Standard - Technology and Applications. 119th AES Convention, New York (NY), USA, Oct. 2005. [Liedtke, 1996] Liedtke, J.: L4 Reference Manual 486 Pentium Pentium Pro Version 2.0. Research Report RC 20549, Yorktown Heights (NY), USA, IBM T. J. Watson Research Center, Sep. 1996. [Lin et al., 1987] Lin, K. J., Natarajan, S., Liu, J. W. S.: Imprecise Results: Utilizing Partial Computations in Real-Time Systems. 8th IEEE Real-Time Systems Symposium (RTSS '87), San Jose (CA), USA, Dec. 1987. [Lindner et al., 2000] Lindner, W., Berthold, H., Binkowski, F., Heuer, A., Meyer-Wegener, K.: Enabling Hypermedia Videos in Multimedia Database Systems Coupled with Realtime Media Servers. International Symposium on Database Engineering & Applications (IDEAS), Yokohama, Japan, Sap. 2000. [Liu, 2003] Liu, S.: Audio-Video Conversion Benchmark “AVCOB” – Analysis, Design and Implementation. Master Thesis, Database Systems Chair. FAU Erlangen-Nuremberg, Erlangen, Germany [Löser et al., 2001a] Löser, J., Härtig, H., Reuther, L.: A Streaming Interface for Real-Time Interprocess Communication. Technical Report TUD-FI01-09, Operating Systems Group. TU Dresden, Dresden, Germany. Aug. 2001. [Löser et al., 2001b] Löser, J., Härtig, H., Reuther, L.: Position Summary: A Streaming Interface for Real-Time Interprocess Communication. 8th Workshop on Hot Topics in Operating Systems (HotOS-VIII), Schloss Elmau in Bavaria, Germany, May 2001. [Löser and Härtig, 2004] Löser, J., Härtig, H.: Low-latency Hard Real-Time Communication over Switched Ethernet. 16th Euromicro Conference on Real-Time Systems (ECRTS'04), Catania (Sicily), Italy, Jun.-Jul. 2004. [Löser and Aigner, 2007] Löser, J., Aigner, R.: Building Infrastructure for DROPS (BID) Specification. Publicly-Available Specification, Operating Systems Group. TU Dresden, Dresden, Germany. Apr. 25th, 2007.


[Lum and Lau, 2002] Lum, W. Y., Lau, F. C. M.: On Balancing between Transcoding Overhead and Spatial Consumption in Content Adaptation. 8th Intl. Conf. on Mobile Computing and Networking, Atlangta (GA), USA, ACM, Sep. 2002. [Luo, 1997] Luo, Y.: Shared Memory vs. Message Passing: the COMOPS Benchmark Experiment. Internal Report, Los Alamos (NM), USA, Los Alamos National Laboratory (Scientific Computing Group CIC-19), Apr. 1997. [Märcz and Meyer-Wegener, 2002] Märcz, A., Meyer-Wegener, K.: Bandwidth-based Converter Description for Realtime Scheduling at Application Level in Media Servers. SDA Workshop 2002, Dresden, Germany, pp.10, Mar. 2002. [Märcz et al., 2003] Märcz, A., Schmidt, S., Suchomski, M.: Scheduling Data Streams in memo.REAL. Internal Communication, TU Dresden / FAU Erlangen, pp.8, Jan. 2003. [Marder and Robbert, 1997] Marder, U., Robbert, G.: The KANGAROO Project. Proc. 3rd Int. Workshop on Multimedia Information Systems, Como, Italy, Sep. 1997. [Marder, 2000] Marder, U.: VirtualMedia: Making Multimedia Database Systems Fit for World- wide Access. 7th Conference on Extending Database Technology (EDBT'02) - PhD Workshop, Konstanz, Germany, Mar. 2000. [Marder, 2001] Marder, U.: On Realizing Transformation Independence in Open, Distributed Multimedia Information Systems. Datenbanksysteme in Büro, Technik und Wissenschaft (BTW) Vol., pp.424-433, 2001. [Marder, 2002] Marder, U.: Multimedia Metacomputing in Webbasierten multimedialen Informationssytemen. PhD Thesis. Univeristy of Kaiserslautern, Kaiserslautern, Germany. 2002. [Margaritidis and Polyzos, 2000] Margaritidis, M., Polyzos, G.: On the Application of Continuous Media Filters over Wireless Networks. IEEE Int. Conf. on Multimedia and Expo (ICME'00), New York (NY), USA, IEEE Computer Society, Aug. 2000. [Marovac, 1983] Marovac, N.: On Interprocess Interaction in Distributed Architectures. ACM SIGARCH News Vol. 11(4), pp.17-22, 1983. [Maya et al., 2003] Maya, Anu, Asmita, Snehal, Krushna (MAASK): MigShm - Shared Memory over openMosix. Project Report on MigShm. From http://mcaserta.com/maask/Migshm_Report.pdf, Apr. 2003. [McQuillan and Walden, 1975] McQuillan, J. M., Walden, D. C.: Some Considerations for a High Performance Message-based Interprocess Communication System. ACM SIGCOMM/SIGOPS Workshop on Interprocess Communications - Applications, Technologies, Architectures, and Protocols for Computer Communication, 1975. [Mehnert et al., 2003] Mehnert, F., Hohmuth, M., Härtig, H.: Cost and Benefit of Separate Address Spaces in Real-Time Operating Systems. 23rd IEEE Real-Time Systems Symposium (RTSS'03), Austin, Texas, USA, Dec. 2003. [Mehrseresht and Taubman, 2005] Mehrseresht, N., Taubman, D.: An efficient content-adaptive motion compensated 3D-DWT with enhanced spatial and temporal scalability. Preprint submitted to IEEE Transactions on Image Processing, May 2005. [Meyer-Wegener, 2003] Meyer-Wegener, K.: Multimediale Datenbanken - Einsatz von Datenbanktechnik in Multimedia-Systemen (2. Auflage). Wiesbaden, Germany, B. G. Teubner Verlag / GWV Fachverlag GmbH, ISBN 3-519-12419-X, 2003.


[Meyerhöfer, 2007] Meyerhöfer, M. B.: Messung und Verwaltung von Softwarekomponenten für die Performancevorhersage. PhD Thesis, Database Systems Chair. FAU Erlangen-Nuremberg, Erlangen. Jul. 2004. [Microsoft Corp., 2002a] Microsoft Corp.: Introducing DirectShow for Automotive. MSDN Library - Mobil and Embedded Development Documentation, retrieved on Feb. 10th, 2002a. [Microsoft Corp., 2002b] Microsoft Corp.: The Filter Graph and Its Components. MSDN Library - DirectX 8.1 C++ Documentation, retrieved on Feb. 10th, 2002b. [Microsoft Corp., 2002c] Microsoft Corp.: AVI RIFF File Reference. MSDN Library - DirectX 9.0 DirectShow Appendix, retrieved on Nov. 22nd, from http://msdn.microsoft.com/archive/en-us/directx9_c/directx/htm/avirifffilereference.asp, 2002c. [Microsoft Corp., 2007a] Microsoft Corp.: [MS-MMSP]: Microsoft Media Server (MMS) Protocol Specification. MSDN Library, retrieved on Mar. 10th, from http://msdn2.microsoft.com/en- us/library/cc234711.aspx, 2007a. [Microsoft Corp., 2007b] Microsoft Corp.: Overview of the ASF Format. MSDN Library - Windows Media Format 11 SDK, retrieved on Jan. 21st, from http://msdn2.microsoft.com/en-us/library/aa390652.aspx, 2007b. [Mielimonka, 2006] Mielimonka, A.: The Real-Time Implementation of XVID Encoder in DROPS Supporting QoS for Video Streams. Study Project, Database Systems Chair. FAU Erlangen-Nuremberg, Erlangen, Germany. Sep. 2006. [Militzer et al., 2003] Militzer, M., Suchomski, M., Meyer-Wegener, K.: Improved p-Domain Rate Control and Perceived Quality Optimizations for MPEG-4 Real-time Video Applications. 11th ACM International Conference of Multimedia (ACM MM'03), Berkeley (CA), USA, Nov. 2003. [Militzer, 2004] Militzer, M.: Real-Time MPEG-4 Video Conversion and Quality Optimizations for Multimedia Database Servers. Diploma Thesis, Database Systems Chair. FAU Erlangen- Nuremberg, Erlangen, Germany. Jul. 2004. [Militzer et al., 2005] Militzer, M., Suchomski, M., Meyer-Wegener, K.: LLV1 – Layered Lossless Video Format Supporting Multimedia Servers During Realtime Delivery. Multimedia Systems and Applications VIII in conjuction to OpticsEast, Boston (MA), USA, SPIE Vol. 6015, pp.436-445, Oct. 2005. [Miller et al., 1998] Miller, F. W., Keleher, P., Tripathi, S. K.: General Data Streaming. 19th IEEE Real-Time Systems Sysmposium (RTSS), Madrid, Spain, Dec. 1998. [Minoli and Keinath, 1993] Minoli, D., Keinath, R.: Distributed Multimedia Through Broadband Communication. Norwood, UK, Artech House, ISBN 0-89006-689-2, 1993. [Mohan et al., 1999] Mohan, R., Smith, J. R., Li, C.-S.: Adapting Multimedia Internet Content for Universal Access. IEEE Trans. Multimedia Vol. 1(1), pp.104-114, 1999. [Morrison, 1997] Morrison, G.: Video Transcoders with Low Delay. IEICE Transactions on Communications Vol. E80-B(6), pp.963-969, 1997. [Mostefaoui et al., 2002] Mostefaoui, A., Favory, L., Brunie, L.: SIRSALE: a Large Scale Video Indexing and Content-Based Retrieving System. ACM Multimedia 2002 (ACMMM'02), Juan- les-Pins, France, Dec. 2002. [MPEG-1 Part III, 1993] MPEG-1 Part III: ISO/IEC 11172-3:1993 Information technology – Generic coding of moving pictures and associated audio for digital storage media at up to


about 1,5 Mbit/s – Part 3: Audio. ISO/IEC 11172-3 Audio, MPEG (ISO/IEC JTC-1/SC- 29/WG-11), 1993. [MPEG-2 Part I, 2000] MPEG-2 Part I: ISO/IEC 13818-1:2000 Information technology – Generic coding of moving pictures and associated audio information – Part 1: Systems. ISO/IEC 13818-1 Systems, MPEG (ISO/IEC JTC-1/SC-29/WG-11), Dec. 2000. [MPEG-2 Part II, 2001] MPEG-2 Part II: ISO/IEC 13818-2:2000 Information technology – Generic coding of moving pictures and associated audio information – Part 2: Video. ISO/IEC 13818-2 Video, MPEG (ISO/IEC JTC-1/SC-29/WG-11), Dec. 2000. [MPEG-2 Part VII, 2006] MPEG-2 Part VII: ISO/IEC 13818-2:2000 Information technology – Generic coding of moving pictures and associated audio information – Part 7: Advanced Audio Coding (AAC). ISO/IEC 13818-7 AAC Ed. 4, MPEG (ISO/IEC JTC-1/SC-29/WG- 11), Jan. 2006. [MPEG-4 Part I, 2004] MPEG-4 Part I: ISO/IEC 14496-1:2004 Information technology – Coding of audio-visual objects – Part 1: Systems (3rd Ed.). ISO/IEC 14496-1 3rd Ed., MPEG (ISO/IEC JTC-1/SC-29/WG-11), Nov. 2004. [MPEG-4 Part II, 2004] MPEG-4 Part II: ISO/IEC 14496-2:2004 Information technology – Coding of audio-visual objects – Part 2: Visual (3rd Ed.). ISO/IEC 14496-2 3rd Ed., MPEG (ISO/IEC JTC-1/SC-29/WG-11), Jun. 2004. [MPEG-4 Part III, 2005] MPEG-4 Part III: ISO/IEC 14496-3:2005 Information technology – Coding of audio-visual objects – Part 3: Audio (3rd Ed.). ISO/IEC 14496-3: , MPEG Audio Subgroup (ISO/IEC JTC-1/SC-29/WG-11), Dec. 2005. [MPEG-4 Part III FDAM5, 2006] MPEG-4 Part III FDAM5: ISO/IEC 14496- 3:2005/Amd.3:2006 Scalable Lossless Coding (SLS). ISO/IEC 14496-3 Amendment 3, MPEG Audio Subgroup (ISO/IEC JTC-1/SC-29/WG-11), Jun. 2006. [MPEG-4 Part IV Amd 8, 2005] MPEG-4 Part IV Amd 8: ISO/IEC 14496-4:2004/Amd.8:2005 High Efficiency Advanced Audio Coding, audio BIFS, and Structured Audio Conformance. ISO/IEC 14496-4 Amendment 8, MPEG Audio Subgroup (ISO/IEC JTC-1/SC-29/WG-11), May 2005. [MPEG-4 Part V, 2001] MPEG-4 Part V: ISO/IEC 14496-5:2001 Information technology – Coding of audio-visual objects – Part 5: Reference Software (2nd Ed.). ISO/IEC 14496-5 Software for Visual Part, MPEG (ISO/IEC JTC-1/SC-29/WG-11), 2001. [MPEG-4 Part X, 2007] MPEG-4 Part X: ISO/IEC 14496-10:2005/FPDAM 3 Information technology – Coding of audio-visual objects – Part 10: Advanced Video Coding – Amendment 3: Scalable Video Coding. ISO/IEC 14496-10 Final Proposal Draft, MPEG (ISO/IEC JTC- 1/SC-29/WG-11), Jan. 2007. [MPEG-7 Part III, 2002] MPEG-7 Part III: ISO/IEC 15938-3 Information Technology – Multimedia Content Description Interface – Part 3: Visual. ISO/IEC 15938-3, MPEG (ISO/IEC JTC-1/SC-29/WG-11), Apr. 2002. [MPEG-7 Part V, 2003] MPEG-7 Part V: ISO/IEC 15938-5 Information Technology – Multimedia Content Description Interface – Part 5: Multimedia Description Schemes. ISO/IEC 15938-5 Chapter 8 Media Description Tools, MPEG (ISO/IEC JTC-1/SC-29/WG-11), 2003.


[MPEG-21 Part I, 2004] MPEG-21 Part I: ISO/IEC 21000-1 Information Technology – Multimedia Framework (MPEG-21) – Part 1: Vision, Technologies and Strategy. ISO/IEC 21000-1 2nd Ed., MPEG (ISO/IEC JTC-1/SC-29/WG-11), Nov. 2004. [MPEG-21 Part II, 2005] MPEG-21 Part II: ISO/IEC 21000-2 Information Technology – Multimedia Framework (MPEG-21) – Part 2: Digital Item Declaration. ISO/IEC 21000-2 2nd Ed., MPEG (ISO/IEC JTC-1/SC-29/WG-11), Oct. 2005. [MPEG-21 Part VII, 2004] MPEG-21 Part VII: ISO/IEC 21000-7 Information Technology – Multimedia Framework (MPEG-21) – Part 7: Digital Item Adaptation. ISO/IEC 21000-7, MPEG (ISO/IEC JTC-1/SC-29/WG-11), Oct. 2004. [MPEG-21 Part XII, 2004] MPEG-21 Part XII: MPEG N5640 - ISO/IEC 21000-12 Information Technology – Multimedia Framework (MPEG-21) – Part 12: Multimedia Test Bed for Resource Delivery. ISO/IEC 21000-12 Working Draft 2.0, MPEG (ISO/IEC JTC-1/SC- 29/WG-11), Oct. 2004. [MPEG Audio Subgroup, 2005] MPEG Audio Subgroup: Verification Report on MPEG-4 SLS (MPEG2005/N7687). MPEG Meeting "Nice'05", Nice, France, Oct. 2005. [Nilsson, 2004] Nilsson, J.: Timers: Implement a Continuously Updating, High-Resolution Time Provider for Windows. MSDN Magazine. Vol. 3, 2004. [Oracle Corp., 2003] Oracle Corp.: Oracle interMedia User's Guide. Ver.10g Release 1 (10.1) -- Chapter 7, Section 7.4 Supporting Media Data Processing, 2003. [Östreich, 2003] Östreich, T.: Transcode – Linux Video Stream Processing Tool, retrieved on July 10th, from http://www.theorie.physik.uni-goettingen.de/~ostreich/transcode/ (http://www.transcoding.org/cgi-bin/transcode), 2003. [Pai et al., 1997] Pai, V., Druschel, P., Zwaenopoel, W.: IO-Lite: A Unified I/O Buffering and Caching System. Technical Report TR97-294, Computer Science. Rice University, Houston (TX), USA. 1997. [Pasquale et al., 1993] Pasquale, J., Polyzos, G., Anderson, E., Kompella, V.: Filter Propagation in Dissemination Trees: Trading Off Bandwidth and Processing in Continuous Media Networks. 4th Intl. Workshop ACM Network and Operating System Support for Digital Audio and Video (NOSSDAV'93), pp.259-268, Nov. 1993. [PEAQ, 2006] PEAQ: ITU-R B5.1387-1 – Implementation PQ-Eval-Audio, Part of "AFsp Library and Tools V8 R1" (ftp://ftp.tsp.ece.mc-gill.ca/TSP/AFsp/AFsp-v8r1..gz), Jan. 2006. [Penzkofer, 2006] Penzkofer, F.: Real-Time Audio Conversion and Format Independence for Multimedia Database Servers. Study Project, Database Systems Chair. FAU Erlangen-Nuremberg, Erlangen, Germany. Jul. 2006. [Posnak et al., 1996] Posnak, E. J., Vin, H. M., Lavender, R. G.: Presentation Processing Support for Adaptive Multimedia Applications. SPIE Multimedia Computing and Networking, San Jose (CA), USA, SPIE Vol. 2667, pp.234-245, Jan. 1996. [Posnak et al., 1997] Posnak, E. J., Lavender, R. G., Vin, H. M.: An Adaptive Framework for Developing Multimedia Software Components. Communications of the ACM Vol. 40(10), pp.43- 47, 1997. [QNX, 2001] QNX: QNX Neutrino RTOS (Ver.6.1). QNX Software Systems Ltd., 2001.


[Rakow et al., 1995] Rakow, T., Neuhold, E., Löhr, M.: Multimedia Database Systems – The Notions and the Issues. GI-Fachtagung, Dresden, Germany, Datenbanksysteme in Büro, Technik und Wissenschaft (BTW) [Rangarajan and Iftode, 2000] Rangarajan, M., Iftode, L.: Software Distributed Shared Memory over Virtual Interface Architecture - Implementation and Performance. 4th Annual Linux Conference, Atlanta (GA), USA, Oct. 2000. [Rao and Yip, 1990] Rao, K. R., Yip, P.: Discrete Cosine Transform: Algorithms, Advantages, Applications. San Diego (CA), USA, Academic Press, Inc., ISBN 0-12-580203-X, 1990. [Reuther and Pohlack, 2003] Reuther, L., Pohlack, M.: Rotational-Position-Aware Real-Time Disk Scheduling Using a Dynamic Active Subset (DAS). 24th IEEE International Real-Time Systems Symposium, Cancun, Mexico, Dec. 2003. [Reuther et al., 2006] Reuther, L., Aigner, R., Wolter, J.: Building Microkernel-Based Operating Systems: DROPS - The Dresden Real-Time Operating System (Lecture Notes Summer Term 2006), retrieved on Nov. 25th, 2006, from http://os.inf.tu-dresden.de/Studium/KMB/ (http://os.inf.tu-dresden.de/Studium/KMB/Folien/09-DROPS/09-DROPS.pdf), 2006. [Rietzschel, 2002] Rietzschel, C.: Portierung eines Video-Codecs auf DROPS. Study Project, Operating Systems Group - Institute for System Architecture. TU Dresden, Dresden, Germany. Dec. 2002. [Rietzschel, 2003] Rietzschel, C.: VERNER ein Video EnkodeR uNd -playER für DROPS. Master Thesis, Operating Systems Group - Institute for System Architecture. TU Dresden, Dresden, Germany. Sep. 2003. [Rohdenburg et al., 2005] Rohdenburg, T., Hohmann, V., Kollmeier, B.: Objective Perceptual Quality Measures for the Evaluation of Noise Reduction Schemes. International Workshop on Acoustic Echo and Noise Control '05, High Tech Campus, Eindhoven, The Netherlands, Sep. 2005. [Roudiak-Gould, 2006] Roudiak-Gould, B.: HuffYUV v2.1.1 Manual. Description and Source Code, retrieved on Jul. 15th, 2006, from http://neuron2.net/www.math.berkeley.edu/benrg/huffyuv.html, 2006. [Sayood, 2006] Sayood, K.: Introduction to Data Compression (3rd Ed.). San Francisco (CA), USA, Morgan Kaufman, 2006. [Schaar and Radha, 2000] Schaar, M., Radha, H.: MPEG M6475: Motion-Compensation Based Fine-Granular Scalability (MC-FGS) MPEG Meeting, MPEG (ISO/IEC JTC-1/SC-29/WG- 11), Oct. 2000. [Schäfer et al., 2003] Schäfer, R., Wiegand, T., Schwarz, H.: The emerging H.264/AVC Standard. EBU Technical Review, Berlin, Germany, Heinrich Hertzt Institute, Jan. 2003. [Schelkens et al., 2003] Schelkens, P., Andreopoulos, Y., Barbarien, J., Clerckx, T., Verdicchio, F., Munteanu, A., van der Schaar, M.: A comparative study of scalable video coding schemes utilizing wavelet technology. SPIE Photonics East - Wavelet Applications in Industrial Processing, SPIE Vol. 5266, Providence (RI), USA,. [Schlesinger, 2004] Schlesinger, L.: Qualitätsgetriebene Konstruktion globaler Sichten in Grid- organisierten Datenbanksystemen. PhD Thesis, Database Systems Chair. FAU Erlangen- Nuremberg, Erlangen. Jul. 2004.


[Schmidt et al., 2003] Schmidt, S., Märcz, A., Lehner, W., Suchomski, M., Meyer-Wegener, K.: Quality-of-Service Based Delivery of Multimedia Database Objects without Compromising Format Independence. 9th International Conference on Distributed Multimedia Systems (DMS'03), Miami (FL), USA Sep. 2003. [Schönberg, 2003] Schönberg, S.: Impact of PCI-Bus Load on Applications in a PC Architecture. 24th IEEE International Real-Time Systems Symposium, Cancun, Mexico, Dec. 2003. [Schulzrinne et al., 1996] Schulzrinne, H., Casner, S., Frederick, R., Jacobson, V.: RTP: A transport protocol for Real-Time Applications. RFC 1889, Jan. 1996. [Schulzrinne et al., 1998] Schulzrinne, H., Rao, A., Lanphier, R.: Real Time Streaming Protocol (RTSP). RFC 2326, Apr. 1998. [Shin and Koh, 2004] Shin, I., Koh, K.: Cost Effective Transcoding for QoS Adaptive Multimedia Streaming. Symposium on Applied Computing (SAC'04), Nicosia, Cyprus, ACM, Mar. 2004. [Singhal and Shivaratri, 1994] Singhal, M., Shivaratri, N.: Advanced Concepts in Operating Systems, McGraw-Hill, ISBN 978-0070575721, 1994. [Sitaram and Dan, 2000] Sitaram, D., Dan, A.: Multimedia Servers: Applications, Environments and Design, Morgan Kaufmann, ISBN 1-55860-430-8, 2000. [Skarbek, 1998] Skarbek, W.: Multimedia. Algorytmy i standardy kompresji. Warszawa, PL, Akademicka Oficyna Wydawnicza PLJ, 1998. [Sorial et al., 1999] Sorial, H., Lynch, W. E., Vincent, A.: Joint Transcoding of Multiple MPEG Video Bitstreams. IEEE International Symposium on Circuits and Systems (ISCAS'99), Orlando (FL), USA, May 1999. [Spier and Organick, 1969] Spier, M. J., Organick, E. I.: The multics interprocess communication facility. 2nd ACM Symposium on Operating Systems Principles, Princeton (NJ), USA, 1969. [Steinberg, 2004] Steinberg, U.: Quality-Assuring Scheduling in the Fiasco Microkernel. Master Thesis, Operating Systems Group. TU Dresden, Dresden, Germany. Mar. 2004. [Stewart, 2005] Stewart, J.: An Investigation of SIMD Instruction Sets. Study Project, School of Information Technology and Mathematical Sciences. University of Ballarat, Ballarat (Victoria), Australia. Nov. 2005. [Suchomski, 2001] Suchomski, M.: The Application of Specialist Program Suites in Network Servers Efficiency Research. Master Thesis. New University of Lisbon and Wroclaw University of Technology, Monte de Caparica - Lisbon, Portugal and Wroclaw, Poland. Jul. 2001. [Suchomski et al., 2004] Suchomski, M., Märcz, A., Meyer-Wegener, K.: Multimedia Conversion with the Focus on Continuous Media. Transformation of Knowledge, Information and Data. Ed.: P. van Bommel. London, UK, Information Science Publishing. Chapter XI, ISBN 159140528-9, 2004. [Suchomski et al., 2005] Suchomski, M., Militzer, M., Meyer-Wegener, K.: RETAVIC: Using Meta-Data for Real-Time Video Encoding in Multimedia Servers. ACM NOSSDAV '05, Skamania (WA), USA, Jun. 2005. [Suchomski and Meyer-Wegener, 2006] Suchomski, M., Meyer-Wegener, K.: Format Independence of Audio and Video in Multimedia Database Systems. 5th Internationa Conference on Multimedia and Network Information Systems 2006 (MiSSI '06), Wroclaw, Poland, Oficyna Wydawnicza Politechniki Wroclawskiej, pp.201-212, Sep. 2006.

[Suchomski et al., 2006] Suchomski, M., Meyer-Wegener, K., Penzkofer, F.: Application of MPEG-4 SLS in MMDBMSs – Requirements for and Evaluation of the Format. Audio Engineering Society (AES) 120th Convention, Paris, France, AES Preprint No. 6729, May 2006.
[Sun et al., 1996] Sun, H., Kwok, W., Zdepski, J.: Architectures for MPEG Compressed Bitstream Scaling. IEEE Trans. Circuits and Systems for Video Technology Vol. 6(2), 1996.
[Sun et al., 2005] Sun, H., Chen, X., Chiang, T.: Digital Video Transcoding for Transmission and Storage. Boca Raton (FL), CRC Press. Chapter 11: 391-413, 2005.
[Sun Microsystems Inc., 1999] Sun Microsystems Inc.: Java Media Framework API Guide (Nov. 19th, 1999), retrieved on Jan. 10th, 2003, from http://java.sun.com/products/java-media/jmf/2.1.1/guide/, 1999.
[Suzuki and Kuhn, 2000] Suzuki, T., Kuhn, P.: A proposal for segment-based transcoding hints. ISO/IEC M5847, Noordwijkerhout, The Netherlands, Mar. 2000.
[Symes, 2001] Symes, P.: Video Compression Demystified, McGraw-Hill, ISBN 0-07-136324-6, 2001.
[Tannenbaum, 1995] Tannenbaum, A. S.: Moderne Betriebssysteme - 2nd Ed., Prentice Hall International: 78-88, ISBN 3-446-18402-3, 1995.
[Topivata et al., 2001] Topivata, P., Sullivan, G., Joch, A., Kossentini, F.: Performance evaluation of H.26L, TML 8 vs. H.263++ and MPEG-4. Technical Report N18, ITU-T Q6/SG16 (VCEG), Sep. 2001.
[Tourapis, 2002] Tourapis, A. M.: Enhanced predictive zonal search for single and multiple frame motion estimation. SPIE Visual Communications and Image Processing, SPIE Vol. 4671, Jan. 2002.
[Tran, 2000] Tran, T. D.: The BinDCT – Fast Multiplierless Approximation of the DCT. IEEE Signal Processing Letters Vol. 7(6), 2000.
[Trybulec and Byliński, 1989] Trybulec, A., Byliński, C.: Some Properties of Real Numbers - Operations: min, max, square, and square root. Mizar Mathematical Library (MML) - Journal of Formalized Mathematics Vol. 1, 1989.
[van Doorn and de Vries, 2000] van Doorn, M. G. L. M., de Vries, A. P.: The Psychology of Multimedia Databases. 5th ACM Conference on Digital Libraries, San Antonio (TX), USA, 2000.
[Vatolin et al., 2005] Vatolin, D., Kulikov, D., Parshin, A., Kalinkina, D., Soldatov, S.: MPEG-4 Video Codecs Comparison, retrieved in Mar. 2005, from http://www.compression.ru/video/codec_comparison/mpeg-4_en.html, 2005.
[Vetro et al., 2000] Vetro, A., Sun, H., Divakaran, A.: Adaptive Object-Based Transcoding using Shape and Motion-Based Hints. ISO/IEC M6088, Geneva, Switzerland, May 2000.
[Vetro, 2001] Vetro, A.: Object-Based Encoding and Transcoding. PhD Thesis, Electrical Engineering. Polytechnic University, Brooklyn (NY), USA. Jun. 2001.
[Vetro et al., 2001] Vetro, A., Sun, H., Wang, Y.: Object-Based Transcoding for Adaptable Video Content Delivery. IEEE Transactions on Circuits and Systems for Video Technology Vol. 11(3), pp.387-401, 2001.
[Vetro, 2003] Vetro, A.: Transcoding, Scalable Coding & Standardized Metadata. International Workshop Very Low Bitrate Video (VLBV) Vol. 2849, Urbana (IL), USA, Sep. 2003.
[Vetro et al., 2003] Vetro, A., Christopoulos, C., Sun, H.: Video Transcoding Architectures and Techniques: An Overview. IEEE Signal Processing Magazine Vol. 20(2), pp.18-29, 2003.

[Vetro, 2004] Vetro, A.: MPEG-21 Digital Item Adaptation: Enabling Universal Media Access. IEEE Multimedia Vol. 11, pp.84-87, 2004.
[VQEG (ITU), 2005] VQEG (ITU): Tutorial - Objective Perceptual Assessment of Video Quality: Full Reference Television, ITU Video Quality Expert Group, Mar. 2004.
[Wallace, 1991] Wallace, G. K.: The JPEG Still Picture Compression Standard. Communications of the ACM Vol. 34, pp.30-34, 1991.
[Wang et al., 2004] Wang, Y., Huang, W., Korhonen, J.: A Framework for Robust and Scalable Audio Streaming. ACM Multimedia '04 (ACMMM'04), New York (NY), USA, ACM, Oct. 2004.
[Warnes, 2000] Warnes, G. R.: A Recipe for a diskless MOSIX cluster using Cluster-NFS, retrieved on May 10th, 2000, from http://clusternfs.sourceforge.net/Recipe.pdf, 2000.
[Weinberger et al., 2000] Weinberger, M. J., Seroussi, G., Sapiro, G.: The LOCO-I Lossless Image Compression Algorithm: Principles and Standardization into JPEG-LS. IEEE Trans. on Image Processing Vol. 9(8), pp.1309-1324, 2000.
[Wen et al., 2003] Wen, J.-R., Li, Q., Ma, W.-Y., Zhang, H.-J.: A Multi-paradigm Querying Approach for a Generic Multimedia Database Management System. ACM SIGMOD Record Vol. 32(1), pp.26-34, 2003.
[Wendelska, 2007] Wendelska, J. A.: Optimization of the MPEG-4 SLS Implementation for Scalable Lossless Audio Coding. Diploma Thesis, Database Systems Chair. FAU Erlangen-Nuremberg, Erlangen, Germany. Aug. 2007.
[Westerink et al., 1999] Westerink, P. H., Rajagopalan, R., Gonzales, C. A.: Two-pass MPEG-2 variable-bit-rate encoding. IBM Journal of Research and Development - Digital Multimedia Technology Vol. 43(4), pp.471, 1999.
[Wittmann and Zitterbart, 1997] Wittmann, R., Zitterbart, M.: Towards Support for Heterogeneous Multimedia Communications. 6th IEEE Workshop on Future Trends of Distributed Computing Systems, Bologna, Italy, Nov. 2000.
[Wittmann, 2005] Wittmann, R.: A Real-Time Implementation of a QoS-aware Decoder for the LLV1 Format. Study Project, Database Systems Chair. FAU Erlangen-Nuremberg, Erlangen, Germany. Nov. 2005.
[Wu et al., 2001] Wu, F., Li, S., Zhang, Y.-Q.: A Framework for Efficient Progressive Fine Granularity Scalable Video Coding. IEEE Trans. Circuits and Systems for Video Technology Vol. 11(3), pp.332-344, 2001.
[WWW_AlparySoft, 2004] WWW_AlparySoft: Lossless Video Codec - Ver. 2.0 Alpha, retrieved on Dec. 17th, 2004, from http://www.alparysoft.com/products.php?cid=8, 2004.
[WWW_Doom9, 2003] WWW_Doom9: Codec shoot-out 2003 – 1st Installment, retrieved on Apr. 10th, 2003, from http://www.doom9.org/codecs-103-1.htm, 2003.
[WWW_DROPS, 2006] WWW_DROPS: The Dresden Real-Time Operating System Project, retrieved on Oct. 23rd, 2006, from http://os.inf.tu-dresden.de/drops/, 2006.
[WWW_FAAC, 2006] WWW_FAAC: FAAC - Advanced Audio Coder (Ver. 1.24), retrieved on Dec. 10th, 2006, from http://sourceforge.net/projects/faac/, 2006.
[WWW_FAAD, 2006] WWW_FAAD: FAAD - Freeware Advanced Audio Coder (Ver. 2.00), retrieved on Nov. 10th, 2006, from http://www.audiocoding.com, 2006.

[WWW_FFMPEG, 2003] WWW_FFMPEG: FFmpeg Documentation, retrieved on Nov. 23rd, 2006, from http://ffmpeg.mplayerhq.hu/ffmpeg-doc.html, 2003.
[WWW_FLAC, 2006] WWW_FLAC: Free Lossless Audio Codec (FLAC), retrieved on Feb. 28th, 2006, from http://flac.sourceforge.net/, 2006.
[WWW_LAME, 2006] WWW_LAME: Lame Version 3.96.1, “Lame Ain’t an MP3 Encoder”, retrieved on Dec. 10th, 2006, from http://lame.sourceforge.net, 2006.
[WWW_MA, 2006] WWW_MA: Monkey’s Audio - A Fast and Powerful Lossless Audio Compressor, retrieved on Sep. 23rd, 2006, from http://www.monkeysaudio.com/, 2006.
[WWW_MPEG SQAM, 2006] WWW_MPEG SQAM: MPEG Sound Quality Assessment Material – Subset of EBU SQAM, retrieved on Nov. 15th, from http://www.tnt.uni-hannover.de/project/mpeg/audio/sqam/, 2006.
[WWW_OGG, 2006] WWW_OGG: Ogg (libogg), Vorbis (libvorbis) and OggEnc (vorbis-tools) Version 1.1, retrieved on Dec. 10th, 2006, from http://www.xiph.org/vorbis/, 2006.
[WWW_Retavic - Audio Set, 2006] WWW_Retavic - Audio Set: Evaluation Music Set, retrieved on Jan. 10th, from http://www6.informatik.uni-erlangen.de/research/projects/retavic/audio/, 2006.
[WWW_VQEG, 2007] WWW_VQEG: Official Website of Video Quality Expert Group - Test Video Sequences, retrieved on Feb. 14th, 2007, from http://www.its.bldrdoc.gov/vqeg/ (ftp://vqeg.its.bldrdoc.gov/, mirror with thumbnails: http://media.xiph.org/vqeg/TestSeqences/), 2007.
[WWW_WP, 2006] WWW_WP: WavPack - Hybrid Lossless Audio Compression, retrieved on Feb. 26th, 2006, from http://www.wavpack.com/, 2006.
[WWW_XIPH, 2007] WWW_XIPH: Xiph.org Test Media - Derf's Collection of Test Video Clips, retrieved on Feb. 14th, from http://media.xiph.org/video/derf/, 2007.
[WWW_XVID, 2003] WWW_XVID: XVID MPEG-4 Video Codec v.1.0, retrieved on Apr. 3rd, 2003, from http://www.xvid.org, 2003.
[Wylie, 1994] Wylie, F.: Tandem Coding of Digital Audio Data Compression Algorithms. 96th Convention of Audio Engineering Society (AES), Belfast, N. Ireland, AES No. 3784, Feb. 1994.
[Yeadon, 1996] Yeadon, N. J.: Quality of Service Filtering for Multimedia Communications. PhD Thesis. Lancaster University, Lancaster, UK, 1996.
[Youn, 2008] Youn, J.: Method of Making a Window Type Decision Based on MDCT Data in Audio Encoding. U.S. Patent and Trademark Office (PTO), USA, Sony Corporation (Tokyo, JP), 2008.
