Music Sound Synthesis Using Machine Learning: Towards a Perceptually Relevant Control Space
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Minimoog Model D Manual
3 IMPORTANT SAFETY INSTRUCTIONS WARNING - WHEN USING ELECTRIC PRODUCTS, THESE BASIC PRECAUTIONS SHOULD ALWAYS BE FOLLOWED. 1. Read all the instructions before using the product. 2. Do not use this product near water - for example, near a bathtub, washbowl, kitchen sink, in a wet basement, or near a swimming pool or the like. 3. This product, in combination with an amplifier and headphones or speakers, may be capable of producing sound levels that could cause permanent hearing loss. Do not operate for a long period of time at a high volume level or at a level that is uncomfortable. 4. The product should be located so that its location does not interfere with its proper ventilation. 5. The product should be located away from heat sources such as radiators, heat registers, or other products that produce heat. No naked flame sources (such as candles, lighters, etc.) should be placed near this product. Do not operate in direct sunlight. 6. The product should be connected to a power supply only of the type described in the operating instructions or as marked on the product. 7. The power supply cord of the product should be unplugged from the outlet when left unused for a long period of time or during lightning storms. 8. Care should be taken so that objects do not fall and liquids are not spilled into the enclosure through openings. There are no user serviceable parts inside. Refer all servicing to qualified personnel only. NOTE: This equipment has been tested and found to comply with the limits for a class B digital device, pursuant to part 15 of the FCC rules. -
Real-Time Timbre Transfer and Sound Synthesis Using DDSP
REAL-TIME TIMBRE TRANSFER AND SOUND SYNTHESIS USING DDSP Francesco Ganis, Erik Frej Knudesn, Søren V. K. Lyster, Robin Otterbein, David Sudholt¨ and Cumhur Erkut Department of Architecture, Design, and Media Technology Aalborg University Copenhagen, Denmark https://www.smc.aau.dk/ March 15, 2021 ABSTRACT Neural audio synthesis is an actively researched topic, having yielded a wide range of techniques that leverages machine learning architectures. Google Magenta elaborated a novel approach called Differ- ential Digital Signal Processing (DDSP) that incorporates deep neural networks with preconditioned digital signal processing techniques, reaching state-of-the-art results especially in timbre transfer applications. However, most of these techniques, including the DDSP, are generally not applicable in real-time constraints, making them ineligible in a musical workflow. In this paper, we present a real-time implementation of the DDSP library embedded in a virtual synthesizer as a plug-in that can be used in a Digital Audio Workstation. We focused on timbre transfer from learned representations of real instruments to arbitrary sound inputs as well as controlling these models by MIDI. Furthermore, we developed a GUI for intuitive high-level controls which can be used for post-processing and manipulating the parameters estimated by the neural network. We have conducted a user experience test with seven participants online. The results indicated that our users found the interface appealing, easy to understand, and worth exploring further. At the same time, we have identified issues in the timbre transfer quality, in some components we did not implement, and in installation and distribution of our plugin. The next iteration of our design will address these issues. -
Presented at ^Ud,O the 99Th Convention 1995October 6-9
Tunable Bandpass Filters in Music Synthesis 4098 (L-2) Robert C. Maher University of Nebraska-Lincoln Lincoln, NE 68588-0511, USA Presented at ^ uD,o the 99th Convention 1995 October 6-9 NewYork Thispreprinthas been reproducedfrom the author'sadvance manuscript,withoutediting,correctionsor considerationby the ReviewBoard. TheAES takesno responsibilityforthe contents. Additionalpreprintsmay be obtainedby sendingrequestand remittanceto theAudioEngineeringSocietY,60 East42nd St., New York,New York10165-2520, USA. All rightsreserved.Reproductionof thispreprint,or anyportion thereof,isnot permitted withoutdirectpermissionfromthe Journalof theAudio EngineeringSociety. AN AUDIO ENGINEERING SOCIETY PREPRINT TUNABLE BANDPASS FILTERS IN MUSIC SYNTHESIS ROBERT C. MAHER DEPARTMENT OF ELECTRICAL ENGINEERING AND CENTERFORCOMMUNICATION AND INFORMATION SCIENCE UNIVERSITY OF NEBRASKA-LINCOLN 209N WSEC, LINCOLN, NE 68588-05II USA VOICE: (402)472-2081 FAX: (402)472-4732 INTERNET: [email protected] Abst/act: Subtractive synthesis, or source-filter synthesis, is a well known topic in electronic and computer music. In this paper a description is given of a flexible subtractive synthesis scheme utilizing a set of tunable digital bandpass filters. Specific examples and applications are presented for realtime subtractive synthesis of singing and other musical signals. 0. INTRODUCTION Subtractive (or source-filter) synthesis is used widely in electronic and computer music applications. Subtractive synthesis general!y involves a source signal with a broad spectrum that is passed through a filter. The properties of the filter largely define the shape of the output spectrum by attenuating specific frequency ranges, hence the name subtractive synthesis [1]. The subtractive synthesis model is appropriate for the wide class of physical systems in which an input source drives a passive acoustical or mechanical system. -
Computationally Efficient Music Synthesis
HELSINKI UNIVERSITY OF TECHNOLOGY Department of Electrical and Communications Engineering Laboratory of Acoustics and Audio Signal Processing Jussi Pekonen Computationally Efficient Music Synthesis – Methods and Sound Design Master’s Thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Technology. Espoo, June 1, 2007 Supervisor: Professor Vesa Välimäki Instructor: Professor Vesa Välimäki HELSINKI UNIVERSITY ABSTRACT OF THE OF TECHNOLOGY MASTER’S THESIS Author: Jussi Pekonen Name of the thesis: Computationally Efficient Music Synthesis – Methods and Sound Design Date: June 1, 2007 Number of pages: 80+xi Department: Electrical and Communications Engineering Professorship: S-89 Supervisor: Professor Vesa Välimäki Instructor: Professor Vesa Välimäki In this thesis, the design of a music synthesizer for systems suffering from limitations in computing power and memory capacity is presented. First, different possible syn- thesis techniques are reviewed and their applicability in computationally efficient music synthesis is discussed. In practice, the applicable techniques are limited to additive and source-filter synthesis, and, in special cases, to frequency modulation, wavetable and sampling synthesis. Next, the design of the structures of the applicable techniques are presented in detail, and properties and design issues of these structures are discussed. A major implemen- tation problem is raised in digital source-filter synthesis, where the use of classic wave- forms, such as sawtooth wave, as the source signal is challenging due to aliasing caused by waveform discontinuities. Methods for existing bandlimited waveform synthesis are reviewed, and a new approach using polynomial bandlimited step function is pre- sented in detail with design rules for the applicable polynomials. -
Masterarbeit
Masterarbeit Erstellung einer Sprachdatenbank sowie eines Programms zu deren Analyse im Kontext einer Sprachsynthese mit spektralen Modellen zur Erlangung des akademischen Grades Master of Science vorgelegt dem Fachbereich Mathematik, Naturwissenschaften und Informatik der Technischen Hochschule Mittelhessen Tobias Platen im August 2014 Referent: Prof. Dr. Erdmuthe Meyer zu Bexten Korreferent: Prof. Dr. Keywan Sohrabi Eidesstattliche Erklärung Hiermit versichere ich, die vorliegende Arbeit selbstständig und unter ausschließlicher Verwendung der angegebenen Literatur und Hilfsmittel erstellt zu haben. Die Arbeit wurde bisher in gleicher oder ähnlicher Form keiner anderen Prüfungsbehörde vorgelegt und auch nicht veröffentlicht. 2 Inhaltsverzeichnis 1 Einführung7 1.1 Motivation...................................7 1.2 Ziele......................................8 1.3 Historische Sprachsynthesen.........................9 1.3.1 Die Sprechmaschine.......................... 10 1.3.2 Der Vocoder und der Voder..................... 10 1.3.3 Linear Predictive Coding....................... 10 1.4 Moderne Algorithmen zur Sprachsynthese................. 11 1.4.1 Formantsynthese........................... 11 1.4.2 Konkatenative Synthese....................... 12 2 Spektrale Modelle zur Sprachsynthese 13 2.1 Faltung, Fouriertransformation und Vocoder................ 13 2.2 Phase Vocoder................................ 14 2.3 Spectral Model Synthesis........................... 19 2.3.1 Harmonic Trajectories........................ 19 2.3.2 Shape Invariance.......................... -
11C Software 1034-1187
Section11c PHOTO - VIDEO - PRO AUDIO Computer Software Ableton.........................................1036-1038 Arturia ...................................................1039 Antares .........................................1040-1044 Arkaos ....................................................1045 Bias ...............................................1046-1051 Bitheadz .......................................1052-1059 Bomb Factory ..............................1060-1063 Celemony ..............................................1064 Chicken Systems...................................1065 Eastwest/Quantum Leap ............1066-1069 IK Multimedia .............................1070-1078 Mackie/UA ...................................1079-1081 McDSP ..........................................1082-1085 Metric Halo..................................1086-1088 Native Instruments .....................1089-1103 Propellerhead ..............................1104-1108 Prosoniq .......................................1109-1111 Serato............................................1112-1113 Sonic Foundry .............................1114-1127 Spectrasonics ...............................1128-1130 Syntrillium ............................................1131 Tascam..........................................1132-1147 TC Works .....................................1148-1157 Ultimate Soundbank ..................1158-1159 Universal Audio ..........................1160-1161 Wave Mechanics..........................1162-1165 Waves ...........................................1166-1185 -
THE COMPLETE SYNTHESIZER: a Comprehensive Guide by David Crombie (1984)
THE COMPLETE SYNTHESIZER: A Comprehensive Guide By David Crombie (1984) Digitized by Neuronick (2001) TABLE OF CONTENTS TABLE OF CONTENTS...........................................................................................................................................2 PREFACE.................................................................................................................................................................5 INTRODUCTION ......................................................................................................................................................5 "WHAT IS A SYNTHESIZER?".............................................................................................................................5 CHAPTER 1: UNDERSTANDING SOUND .............................................................................................................6 WHAT IS SOUND? ...............................................................................................................................................7 THE THREE ELEMENTS OF SOUND .................................................................................................................7 PITCH ...................................................................................................................................................................8 STANDARD TUNING............................................................................................................................................8 THE RESPONSE OF THE HUMAN -
A Unit Selection Text-To-Speech-And-Singing Synthesis Framework from Neutral Speech: Proof of Concept 39 II.1 Introduction
Adding expressiveness to unit selection speech synthesis and to numerical voice production 90) - 02 - Marc Freixes Guerreiro http://hdl.handle.net/10803/672066 Generalitat 472 (28 de Catalunya núm. Rgtre. Fund. ADVERTIMENT. L'accés als continguts d'aquesta tesi doctoral i la seva utilització ha de respectar els drets de ió la persona autora. Pot ser utilitzada per a consulta o estudi personal, així com en activitats o materials d'investigació i docència en els termes establerts a l'art. 32 del Text Refós de la Llei de Propietat Intel·lectual undac F (RDL 1/1996). Per altres utilitzacions es requereix l'autorització prèvia i expressa de la persona autora. En qualsevol cas, en la utilització dels seus continguts caldrà indicar de forma clara el nom i cognoms de la persona autora i el títol de la tesi doctoral. No s'autoritza la seva reproducció o altres formes d'explotació efectuades amb finalitats de lucre ni la seva comunicació pública des d'un lloc aliè al servei TDX. Tampoc s'autoritza la presentació del seu contingut en una finestra o marc aliè a TDX (framing). Aquesta reserva de drets afecta tant als continguts de la tesi com als seus resums i índexs. Universitat Ramon Llull Universitat Ramon ADVERTENCIA. El acceso a los contenidos de esta tesis doctoral y su utilización debe respetar los derechos de la persona autora. Puede ser utilizada para consulta o estudio personal, así como en actividades o materiales de investigación y docencia en los términos establecidos en el art. 32 del Texto Refundido de la Ley de Propiedad Intelectual (RDL 1/1996). -
Tools for Creating Audio Stories
Tools for Creating Audio Stories Steven Rubin Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No. UCB/EECS-2015-237 http://www.eecs.berkeley.edu/Pubs/TechRpts/2015/EECS-2015-237.html December 15, 2015 Copyright © 2015, by the author(s). All rights reserved. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission. Acknowledgement Advisor: Maneesh Agrawala Tools for Creating Audio Stories By Steven Surmacz Rubin A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Computer Science in the Graduate Division of the University of California, Berkeley Committee in charge: Professor Maneesh Agrawala, Chair Professor Björn Hartmann Professor Greg Niemeyer Fall 2015 Tools for Creating Audio Stories Copyright 2015 by Steven Surmacz Rubin 1 Abstract Tools for Creating Audio Stories by Steven Surmacz Rubin Doctor of Philosophy in Computer Science University of California, Berkeley Professor Maneesh Agrawala, Chair Audio stories are an engaging form of communication that combines speech and music into com- pelling narratives. One common production pipeline for creating audio stories involves three main steps: recording speech, editing speech, and editing music. Existing audio recording and editing tools force the story producer to manipulate speech and music tracks via tedious, low-level wave- form editing. -
Physically-Based Parametric Sound Synthesis and Control
Physically-Based Parametric Sound Synthesis and Control Perry R. Cook Princeton Computer Science (also Music) Course Introduction Parametric Synthesis and Control of Real-World Sounds for virtual reality games production auditory display interactive art interaction design Course #2 Sound - 1 Schedule 0:00 Welcome, Overview 0:05 Views of Sound 0:15 Spectra, Spectral Models 0:30 Subtractive and Modal Models 1:00 Physical Models: Waveguides and variants 1:20 Particle Models 1:40 Friction and Turbulence 1:45 Control Demos, Animation Examples 1:55 Wrap Up Views of Sound • Sound is a recorded waveform PCM playback is all we need for interactions, movies, games, etc. (Not true!!) • Time Domain x( t ) (from physics) • Frequency Domain X( f ) (from math) • Production what caused it • Perception our image of it Course #2 Sound - 2 Views of Sound Time Domain is most closely related to Production Frequency Domain is most closely related to Perception we will see that many hybrids abound Views of Sound: Time Domain Sound is produced/modeled by physics, described by quantities of • Force force = mass * acceleration • Position x(t) actually < x(t), y(t), z(t) > • Velocity Rate of change of position dx/dt • Acceleration Rate of change of velocity dv/dt Examples: Mass+Spring+Damper Wave Equation Course #2 Sound - 3 Mass/Spring/Damper F = ma = - ky - rv - mg F = ma = - ky - rv (if gravity negligible) d 2 y r dy k + + y = 0 dt2 m dt m ( ) D2 + Dr / m + k / m = 0 2nd Order Linear Diff Eq. Solution 1) Underdamped: -t/τ ω y(t) = Y0 e cos( t ) exp. -
Fabián Esqueda Native Instruments Gmbh 1.3.2019
SOUND SYNTHESIS FABIÁN ESQUEDA NATIVE INSTRUMENTS GMBH 1.3.2019 © 2003 – 2019 VESA VÄLIMÄKI AND FABIÁN ESQUEDA SOUND SYNTHESIS 1.3.2019 OUTLINE ‣ Introduction ‣ Introduction to Synthesis ‣ Additive Synthesis ‣ Subtractive Synthesis ‣ Wavetable Synthesis ‣ FM and Phase Distortion Synthesis ‣ Synthesis, Synthesis, Synthesis SOUND SYNTHESIS 1.3.2019 Introduction ‣ BEng in Electronic Engineering with Music Technology Systems (2012) from University of York. ‣ MSc in Acoustics and Music Technology in (2013) from University of Edinburgh. ‣ DSc in Acoustics and Audio Signal Processing in (2017) from Aalto University. ‣ Thesis topic: Aliasing reduction in nonlinear processing. ‣ Published on a variety of topics, including audio effects, circuit modeling and sound synthesis. ‣ My current role is at Native Instruments where I work as a Software Developer for our Synths & FX team. ABOUT NI SOUND SYNTHESIS 1.3.2019 About Native Instruments ‣ One of largest music technology companies in Europe. ‣ Founded in 1996. ‣ Headquarters in Berlin, offices in Los Angeles, London, Tokyo, Shenzhen and Paris. ‣ Team of ~600 people (~400 in Berlin), including ~100 developers. SOUND SYNTHESIS 1.3.2019 About Native Instruments - History ‣ First product was Generator – a software modular synthesizer. ‣ Generator became Reaktor, NI’s modular synthesis/processing environment and one of its core products to this day. ‣ The Pro-Five and B4 were NI’s first analog modeling synthesizers. SOUND SYNTHESIS 1.3.2019 About Native Instruments ‣ Pioneered software instruments and digital -
A Parametric Sound Object Model for Sound Texture Synthesis
Daniel Mohlmann¨ A Parametric Sound Object Model for Sound Texture Synthesis Dissertation zur Erlangung des Grades eines Doktors der Ingenieurwissenschaften | Dr.-Ing. | Vorgelegt im Fachbereich 3 (Mathematik und Informatik) der Universitat¨ Bremen im Juni 2011 Gutachter: Prof. Dr. Otthein Herzog Universit¨atBremen Prof. Dr. J¨ornLoviscach Fachhochschule Bielefeld Abstract This thesis deals with the analysis and synthesis of sound textures based on parametric sound objects. An overview is provided about the acoustic and perceptual principles of textural acoustic scenes, and technical challenges for analysis and synthesis are con- sidered. Four essential processing steps for sound texture analysis are identified, and existing sound texture systems are reviewed, using the four-step model as a guideline. A theoretical framework for analysis and synthesis is proposed. A parametric sound object synthesis (PSOS) model is introduced, which is able to describe individual recorded sounds through a fixed set of parameters. The model, which applies to harmonic and noisy sounds, is an extension of spectral modeling and uses spline curves to approximate spectral envelopes, as well as the evolution of pa- rameters over time. In contrast to standard spectral modeling techniques, this repre- sentation uses the concept of objects instead of concatenated frames, and it provides a direct mapping between sounds of different length. Methods for automatic and manual conversion are shown. An evaluation is presented in which the ability of the model to encode a wide range of different sounds has been examined. Although there are aspects of sounds that the model cannot accurately capture, such as polyphony and certain types of fast modula- tion, the results indicate that high quality synthesis can be achieved for many different acoustic phenomena, including instruments and animal vocalizations.