Perceptual Audio Coding Contents
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Proceedings 2005
LAC2005 Proceedings 3rd International Linux Audio Conference April 21 – 24, 2005 ZKM | Zentrum fur¨ Kunst und Medientechnologie Karlsruhe, Germany Published by ZKM | Zentrum fur¨ Kunst und Medientechnologie Karlsruhe, Germany April, 2005 All copyright remains with the authors www.zkm.de/lac/2005 Content Preface ............................................ ............................5 Staff ............................................... ............................6 Thursday, April 21, 2005 – Lecture Hall 11:45 AM Peter Brinkmann MidiKinesis – MIDI controllers for (almost) any purpose . ....................9 01:30 PM Victor Lazzarini Extensions to the Csound Language: from User-Defined to Plugin Opcodes and Beyond ............................. .....................13 02:15 PM Albert Gr¨af Q: A Functional Programming Language for Multimedia Applications .........21 03:00 PM St´ephane Letz, Dominique Fober and Yann Orlarey jackdmp: Jack server for multi-processor machines . ......................29 03:45 PM John ffitch On The Design of Csound5 ............................... .....................37 04:30 PM Pau Arum´ıand Xavier Amatriain CLAM, an Object Oriented Framework for Audio and Music . .............43 Friday, April 22, 2005 – Lecture Hall 11:00 AM Ivica Ico Bukvic “Made in Linux” – The Next Step .......................... ..................51 11:45 AM Christoph Eckert Linux Audio Usability Issues .......................... ........................57 01:30 PM Marije Baalman Updates of the WONDER software interface for using Wave Field Synthesis . 69 02:15 PM Georg B¨onn Development of a Composer’s Sketchbook ................. ....................73 Saturday, April 23, 2005 – Lecture Hall 11:00 AM J¨urgen Reuter SoundPaint – Painting Music ........................... ......................79 11:45 AM Michael Sch¨uepp, Rene Widtmann, Rolf “Day” Koch and Klaus Buchheim System design for audio record and playback with a computer using FireWire . 87 01:30 PM John ffitch and Tom Natt Recording all Output from a Student Radio Station . -
Fast Computational Structures for an Efficient Implementation of The
ARTICLE IN PRESS Signal Processing ] (]]]]) ]]]–]]] Contents lists available at ScienceDirect Signal Processing journal homepage: www.elsevier.com/locate/sigpro Fast computational structures for an efficient implementation of the complete TDAC analysis/synthesis MDCT/MDST filter banks Vladimir Britanak a,Ã, Huibert J. Lincklaen Arrie¨ns b a Institute of Informatics, Slovak Academy of Sciences, Dubravska cesta 9, 845 07 Bratislava, Slovak Republic b Delft University of Technology, Department of Electrical Engineering, Mathematics and Computer Science, Mekelweg 4, 2628 CD Delft, The Netherlands article info abstract Article history: A new fast computational structure identical both for the forward and backward Received 27 May 2008 modified discrete cosine/sine transform (MDCT/MDST) computation is described. It is Received in revised form the result of a systematic construction of a fast algorithm for an efficient implementa- 8 January 2009 tion of the complete time domain aliasing cancelation (TDAC) analysis/synthesis MDCT/ Accepted 10 January 2009 MDST filter banks. It is shown that the same computational structure can be used both for the encoder and the decoder, thus significantly reducing design time and resources. Keywords: The corresponding generalized signal flow graph is regular and defines new sparse Modified discrete cosine transform matrix factorizations of the discrete cosine transform of type IV (DCT-IV) and MDCT/ Modified discrete sine transform MDST matrices. The identical fast MDCT computational structure provides an efficient Modulated -
LOW COMPLEXITY H.264 to VC-1 TRANSCODER by VIDHYA
LOW COMPLEXITY H.264 TO VC-1 TRANSCODER by VIDHYA VIJAYAKUMAR Presented to the Faculty of the Graduate School of The University of Texas at Arlington in Partial Fulfillment of the Requirements for the Degree of MASTER OF SCIENCE IN ELECTRICAL ENGINEERING THE UNIVERSITY OF TEXAS AT ARLINGTON AUGUST 2010 Copyright © by Vidhya Vijayakumar 2010 All Rights Reserved ACKNOWLEDGEMENTS As true as it would be with any research effort, this endeavor would not have been possible without the guidance and support of a number of people whom I stand to thank at this juncture. First and foremost, I express my sincere gratitude to my advisor and mentor, Dr. K.R. Rao, who has been the backbone of this whole exercise. I am greatly indebted for all the things that I have learnt from him, academically and otherwise. I thank Dr. Ishfaq Ahmad for being my co-advisor and mentor and for his invaluable guidance and support. I was fortunate to work with Dr. Ahmad as his research assistant on the latest trends in video compression and it has been an invaluable experience. I thank my mentor, Mr. Vishy Swaminathan, and my team members at Adobe Systems for giving me an opportunity to work in the industry and guide me during my internship. I would like to thank the other members of my advisory committee Dr. W. Alan Davis and Dr. William E Dillon for reviewing the thesis document and offering insightful comments. I express my gratitude Dr. Jonathan Bredow and the Electrical Engineering department for purchasing the software required for this thesis and giving me the chance to work on cutting edge technologies. -
Document Title
Video-conference system based on open source software Emanuel Frederico Barreiros Castro da Silva Dissertation submitted to obtain the Master Degree in Information Systems and Computer Engineering Jury Chairman: Prof. Doutor Joao Paulo Marques da Silva Supervisor: Prof. Doutor Nuno Filipe Valentim Roma Co-supervisor: Prof. Doutor Pedro Filipe Zeferino Tomas Member: Prof. Doutor Luis Manuel Antunes Veiga June 2012 Acknowledgments I would like to thank Dr. Nuno Roma for the patience in guiding me through all the research, implementation and writing phase of this thesis. I would also like to thank Dr. Pedro Tomas´ for the Latex style sheet for this document and for the final comments and corrections on this thesis. I would also like to thank all my colleagues that helped me and supported me in hard times when i though it was impossible to implement the proposed work and was thinking in giving up. I would like to mention: Joao˜ Lima, Joao˜ Cabral, Pedro Magalhaes˜ and Jose´ Rua for all the support! Joao˜ Lima and Joao˜ Cabral were also fundamental for me, since they discussed different possibilities in the proposed solution with me and gave me fundamental support and feedback on my work! You guys were really the best friends i could had in the time being! I do not have enough thank words for you! Andre´ Mateus should also be mentioned due to all the patience and support for installing, uninstalling, installing, uninstalling and installing different Virtual Machines with Linux some more times. I would like to thank my family, specially my mother, for all the support and care they gave me for almost two years. -
A Game Engine for the Blind
Bachelor in Computer Science and Engineering 2016/2017 Bachelor Thesis Designing User Experiences: a Game Engine for the Blind Álvaro Cáceres Muñoz Tutor/s: Teresa Onorati Thesis defense date: July 3rd, 2017 This work is subject to the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Public License. Abstract Video games experience an ever-increasing interest by society since their incep- tion on the 70’s. This form of computer entertainment may let the player have a great time with family and friends, or it may as well provide immersion into a story full of details and emotional content. Prior to the end user playing a video game, a huge effort is performed in lots of disciplines: screenwriting, scenery design, graphical design, programming, opti- mization or marketing are but a few examples. This work is done by game studios, where teams of professionals from different backgrounds join forces in the inception of the video game. From the perspective of Human-Computer Interaction, which studies how people interact with computers to complete tasks [9], a game developer can be regarded as a user whose task is to create the logic of a video game using a computer. One of the main foundations of HCI1. is that an in-depth understanding of the user’s needs and preferences is vital for creating a usable piece of technology. This point is important as a single piece of technology (in this case, the set of tools used by a game developer) may – and should have been designed to – be used on the same team by users with different knowledge, abilities and capabilities. -
DCP-O-Matic Users' Manual
DCP-o-matic users’ manual Carl Hetherington DCP-o-matic users’ manual ii Contents 1 Introduction 1 1.1 What is DCP-o-matic?.............................................1 1.2 Licence.....................................................1 1.3 Acknowledgements...............................................1 1.4 This manual...................................................1 2 Installation 2 2.1 Windows....................................................2 2.2 Mac OS X....................................................2 2.3 Debian, Ubuntu or Mint Linux........................................2 2.4 Fedora, Centos and Mageia Linux.......................................2 2.5 Arch Linux...................................................3 2.6 Other Linux distributions...........................................3 2.7 ‘Simple’ and ‘Full’ modes...........................................4 3 Creating a DCP from a video 5 3.1 Creating a new film..............................................5 3.2 Adding content.................................................6 3.3 Making the DCP................................................7 4 Creating a DCP from a still image9 5 Manipulating existing DCPs 12 5.1 Importing a DCP into DCP-o-matic..................................... 12 5.2 Decrypting encrypted DCPs.......................................... 12 5.3 Making a DCP from a DCP.......................................... 12 5.3.1 Re-use of existing data......................................... 13 5.3.2 Making overlay files......................................... -
Using Daala Intra Frames for Still Picture Coding
Using Daala Intra Frames for Still Picture Coding Nathan E. Egge, Jean-Marc Valin, Timothy B. Terriberry, Thomas Daede, Christopher Montgomery Mozilla Mountain View, CA, USA Abstract—Recent advances in video codec technology have techniques. Unsignaled horizontal and vertical prediction (see been successfully applied to the field of still picture coding. Daala Section II-D), and low complexity prediction of chroma coef- is a new royalty-free video codec under active development that ficients from their spatially coincident luma coefficients (see contains several novel coding technologies. We consider using Daala intra frames for still picture coding and show that it is Section II-E) are two examples. competitive with other video-derived image formats. A finished version of Daala could be used to create an excellent, royalty-free image format. I. INTRODUCTION The Daala video codec is a joint research effort between Xiph.Org and Mozilla to develop a next-generation video codec that is both (1) competitive performance with the state- of-the-art and (2) distributable on a royalty free basis. In work- ing to produce a royalty free video codec, Daala introduces new coding techniques to replace those used in the traditional block-based video codec pipeline. These techniques, while not Fig. 1. State of blocks within the decode pipeline of a codec using lapped completely finished, already demonstrate that it is possible to transforms. Immediate neighbors of the target block (bold lines) cannot be deliver a video codec that meets these goals. used for spatial prediction as they still require post-filtering (dotted lines). All video codecs have, in some form, the ability to code a still frame without prior information. -
Format De Données
Format de Cannées • Hfila Perls 2.95Cd nes 1015/0912.99 PHI Format de données Un article de Wild Paris Descartes. Des clés pour comprendre l'Université numérique Accès per catégories au glossaire : Accès thématique HYPERGLOSSAIFtE :ABCDEF GH IJKLMN OP QRSTUVWXYZ Format de données Sommaire Le format des données est la représentation informatique (sous forme de nombres binaires) des données (texte, image, vidéo, audio, archivage et myptage, exécutable) en 1 Format de données vue de les stocker, de les traiter, de les manipuler. Les données peuvent être ainsi • 1.1 Format ouvert échangées entre divers programmes informatiques ou logiciels (qui les lisent et les • 1.2 Format libre interprètent), soit par une connexion directe soit par l'intermédiaire d'un fichier. • 1.3 Format fermé • 1.4 Format Le format de fichier est le type de codage utilisé pour coder les données contenues dans propriétaire un fichier. Il est identifié par l'extension qui suit le nom du fichier. • 1.5 Format normalisé Les fichiers d'ordinateurs peuvent être de deux types : • 1.6 Format conteneur • 2 Les formas texte • Les fichiers ASCII : ils contiennent des caractères qui respectent les codes • 2.1 Bureautique ASCII. Ils sont donc lisibles (éditables dans n'importe quel éditeur de texte). • 3 Les formats d'image • Les fichiers binaires ils contiennent des informations codées directement en • 4 Les formats audio binaire (des 0 et des l). Ces fichiers sont lisibles dans un éditeur de texte, mais • 5 Les formats vidéo complètement déchiffrables par la spécification publique fournie par le producteur. • 6 DR/yr Pour afficher ou exécuter un fichier binaire, il faut donc utiliser un logiciel • 7 Liens pour approfondir compatible avec le logiciel qui a produit ce fichier. -
Lapped Transforms in Perceptual Coding of Wideband Audio
Lapped Transforms in Perceptual Coding of Wideband Audio Sien Ruan Department of Electrical & Computer Engineering McGill University Montreal, Canada December 2004 A thesis submitted to McGill University in partial fulfillment of the requirements for the degree of Master of Engineering. c 2004 Sien Ruan ° i To my beloved parents ii Abstract Audio coding paradigms depend on time-frequency transformations to remove statistical redundancy in audio signals and reduce data bit rate, while maintaining high fidelity of the reconstructed signal. Sophisticated perceptual audio coding further exploits perceptual redundancy in audio signals by incorporating perceptual masking phenomena. This thesis focuses on the investigation of different coding transformations that can be used to compute perceptual distortion measures effectively; among them the lapped transform, which is most widely used in nowadays audio coders. Moreover, an innovative lapped transform is developed that can vary overlap percentage at arbitrary degrees. The new lapped transform is applicable on the transient audio by capturing the time-varying characteristics of the signal. iii Sommaire Les paradigmes de codage audio d´ependent des transformations de temps-fr´equence pour enlever la redondance statistique dans les signaux audio et pour r´eduire le taux de trans- mission de donn´ees, tout en maintenant la fid´elit´e´elev´ee du signal reconstruit. Le codage sophistiqu´eperceptuel de l’audio exploite davantage la redondance perceptuelle dans les signaux audio en incorporant des ph´enom`enes de masquage perceptuels. Cette th`ese se concentre sur la recherche sur les diff´erentes transformations de codage qui peuvent ˆetre employ´ees pour calculer des mesures de d´eformation perceptuelles efficacement, parmi elles, la transformation enroul´e, qui est la plus largement r´epandue dans les codeurs audio de nos jours. -
Volume 2 – Vidéo Sous Linux
Volume 2 – Vidéo sous linux Installation des outils vidéo V6.3 du 20 mars 2020 Par Olivier Hoarau ([email protected]) Vidéo sous linux Volume 1 - Installation des outils vidéo Volume 2 - Tutoriel Kdenlive Volume 3 - Tutoriel cinelerra Volume 4 - Tutoriel OpenShot Video Editor Volume 5 - Tutoriel LiVES Table des matières 1 HISTORIQUE DU DOCUMENT................................................................................................................................4 2 PRÉAMBULE ET LICENCE......................................................................................................................................4 3 PRÉSENTATION ET AVERTISSEMENT................................................................................................................5 4 DÉFINITIONS ET AUTRES NOTIONS VIDÉO......................................................................................................6 4.1 CONTENEUR................................................................................................................................................................6 4.2 CODEC.......................................................................................................................................................................6 5 LES OUTILS DE BASE POUR LA VIDÉO...............................................................................................................7 5.1 PRÉSENTATION.............................................................................................................................................................7 -
Daala: a Perceptually-Driven Still Picture Codec
DAALA: A PERCEPTUALLY-DRIVEN STILL PICTURE CODEC Jean-Marc Valin, Nathan E. Egge, Thomas Daede, Timothy B. Terriberry, Christopher Montgomery Mozilla, Mountain View, CA, USA Xiph.Org Foundation ABSTRACT and vertical directions [1]. Also, DC coefficients are com- bined recursively using a Haar transform, up to the level of Daala is a new royalty-free video codec based on perceptually- 64x64 superblocks. driven coding techniques. We explore using its keyframe format for still picture coding and show how it has improved over the past year. We believe the technology used in Daala Multi-Symbol Entropy Coder could be the basis of an excellent, royalty-free image format. Most recent video codecs encode information using binary arithmetic coding, meaning that each symbol can only take 1. INTRODUCTION two values. The Daala range coder supports up to 16 values per symbol, making it possible to encode fewer symbols [6]. Daala is a royalty-free video codec designed to avoid tra- This is equivalent to coding up to four binary values in parallel ditional patent-encumbered techniques used in most cur- and reduces serial dependencies. rent video codecs. In this paper, we propose to use Daala’s keyframe format for still picture coding. In June 2015, Daala was compared to other still picture codecs at the 2015 Picture Perceptual Vector Quantization Coding Symposium (PCS) [1,2]. Since then many improve- Rather than use scalar quantization like the vast majority of ments were made to the bitstream to improve its quality. picture and video codecs, Daala is based on perceptual vector These include reduced overlap in the lapped transform, finer quantization (PVQ) [7]. -
Dialogic Diva System Release 8.5WIN Reference Guide
Dialogic® Diva® System Release 8.5WIN Service Update Reference Guide February 2012 206-339-34 www.dialogic.com Dialogic® Diva® System Release 8.5 WIN Reference Guide Copyright and Legal Notice Copyright © 2000-2012 Dialogic Inc. All Rights Reserved. You may not reproduce this document in whole or in part without permission in writing from Dialogic Inc. at the address provided below. All contents of this document are furnished for informational use only and are subject to change without notice and do not represent a commitment on the part of Dialogic Inc. and its affiliates or subsidiaries (“Dialogic”). Reasonable effort is made to ensure the accuracy of the information contained in the document. However, Dialogic does not warrant the accuracy of this information and cannot accept responsibility for errors, inaccuracies or omissions that may be contained in this document. INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH DIALOGIC® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN A SIGNED AGREEMENT BETWEEN YOU AND DIALOGIC, DIALOGIC ASSUMES NO LIABILITY WHATSOEVER, AND DIALOGIC DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF DIALOGIC PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY INTELLECTUAL PROPERTY RIGHT OF A THIRD PARTY. Dialogic products are not intended for use in certain safety-affecting situations. Please see http://www.dialogic.com/company/terms-of-use.aspx for more details. Due to differing national regulations and approval requirements, certain Dialogic products may be suitable for use only in specific countries, and thus may not function properly in other countries.