Neural Granular Sound Synthesis

Total Page:16

File Type:pdf, Size:1020Kb

Neural Granular Sound Synthesis Neural Granular Sound Synthesis Adrien Bitton Philippe Esling Tatsuya Harada IRCAM, CNRS UMR9912 IRCAM, CNRS UMR9912 The University of Tokyo, RIKEN Paris, France Paris, France Tokyo, Japan [email protected] [email protected] [email protected] ABSTRACT Granular sound synthesis is a popular audio generation technique based on rearranging sequences of small wave- resynthesis generated sequence match decode form windows. In order to control the synthesis, all grains grains in a given corpus are analyzed through a set of acoustic continuous + f(↵) descriptors. This provides a representation reflecting some discrete + condition + form of local similarities across the grains. However, the grain space + quality of this grain space is bound by that of the descrip- + + + grain tors. Its traversal is not continuously invertible to signal grain + + latent space sample and does not render any structured temporality. library dz acoustic R We demonstrate that generative neural networks can im- analysis encode plement granular synthesis while alleviating most of its target signal shortcomings. We efficiently replace its audio descriptor input basis by a probabilistic latent space learned with a Vari- signal ational Auto-Encoder. In this setting the learned grain space is invertible, meaning that we can continuously syn- thesize sound when traversing its dimensions. It also im- Figure 1. Left: A grain library is analysed and scattered (+) into the acoustic dimensions. A target is defined, by analysing an other signal (o) plies that original grains are not stored for synthesis. An- or as a free trajectory, and matched to the library through the acoustic other major advantage of our approach is to learn struc- descriptors. Subsequently, grains are selected and arranged into a wave- tured paths inside this latent space by training a higher- form. Right: The grain latent space can continuously synthesize wave- level temporal embedding over arranged grain sequences. form grains. Latent features can be encoded from an input signal, sampled from a structured temporal embedding or freely drawn. Explicit controls The model can be applied to many types of libraries, in- can be learned as target conditions for the decoder. cluding pitched notes or unpitched drums and environmen- tal noises. We report experiments on the common granular synthesis processes as well as novel ones such as condi- An instance of corpus-based synthesis, named granular tional sampling and morphing. sound synthesis [2], uses short waveform windows of a fixed length. These units (called grains) usually have a size 1. INTRODUCTION ranging between 10 and 100 milliseconds. For a given cor- pus, the grains are extracted and can be analyzed through The process of generating musical audio has seen a con- audio descriptors [3] in order to facilitate their manipula- tinuous expansion since the advent of digital systems. Au- tion. Such analysis space provides a representation that dio synthesis methods relying on parametric models can reflects some form of local similarities across grains. The be derived from physical considerations, spectral analysis grain corpus is displayed as a cloud of points whose dis- (e.g. sinusoids plus noise [1] models) or signal processing tances relate to some of their acoustic relationships. By re- operations (e.g. frequency modulation). Alternatively to lying on this space, resynthesis can be done with concate- those signal generation techniques, samplers provide syn- native sound synthesis [4]. To a certain extent, this process arXiv:2008.01393v3 [cs.SD] 3 Jul 2021 thesis mechanisms by relying on stored waveforms and can emulate the spectro-temporal dynamics of a given sig- sets of audio transformations. However, when tackling nal. However, the perceptual quality of the audio similar- large audio sample libraries, these methods cannot scale ities, assessed through predefined sets of acoustic descrip- and are also unable to aggregate a model over the whole tors, is inherently biased by their design. These only offer data. Therefore, they cannot globally manipulate the audio a limited consistency across many different sounds, within features in the sound generation process. To this extent, the corpus and with respect to other targets. Furthermore, corpus-based synthesis has been introduced by slicing sets it should be noted that the synthesis process can only use of signals in shorter audio segments, which can be rear- the original grains, precluding continuously invertible in- ranged into new waveforms through a selection algorithm. terpolations in this grain space. To enhance the expressivity of granular synthesis, grain Copyright: ©2020 Adrien Bitton et al. This is an open-access article dis- sequences should be drawn in more flexible ways, by un- tributed under the terms of the Creative Commons Attribution License 3.0 derstanding the temporal dynamics of trajectories in the Unported, which permits unrestricted use, distribution, and reproduction acoustic descriptor space. However, current methods are in any medium, provided the original author and source are credited. only restricted to perform random or simple hand-drawn dz paths. Traversals across the space map to grain series that lower-dimensional space z R (dz dx), as a higher- are ordered according to the corresponding features. How- level representation generating2 any given example. The ever, given that the grain space from current approaches is complete model is defined by p(x; z) = p(x z)p(z). How- not invertible, these paths do not correspond to continuous ever, a real-world dataset follows a complexj distribution audio synthesis, besides that of each of the scattered orig- that cannot be evaluated analytically. The idea of varia- inal grains. This could be alleviated by having a denser tional inference (VI) is to address this problem through op- grain space (leading to a smoother assembled waveform), timization by assuming a simpler distribution qφ(z x) but it would require a correspondingly increasing amount from a family of approximate densities [5]. Thej goal2 ofQ of memory, quickly exceeding the gigabyte scale when VI is to minimize differences between the approximated considering nowadays sound sample library sizes. In a and real distribution, by using their Kullback-Leibler (KL) real-time setting, this causes further limitations to consider divergence in a traditional granular synthesis space. As current meth- ∗ ods only account for local relationships, they cannot gen- q (z x) = argmin KL qφ (z x) pθ (z x) : (1) φ j D j k j erate the structured temporal dynamics of musical notes qφ(zjx)2Q or drum hits without having a strong inductive bias, such By developing this divergence and re-arranging terms (de- as a target signal. Finally, the audio descriptors and the tailed development can be found in [5]), we obtain slicing size of grains are critical parameters to choose for these methods. They model the perceptual relationships log p(x) q (z x) p (z x) across elements and set a trade-off: shorter grains allow − DKL φ j k θ j for a denser space and faster sound variations at the ex- = Ez log p(x z) KL qφ(z x) pθ(z) : (2) pense of a limited estimate of the spectral features and the j − D j k need to process larger series for a given signal duration. This formulation of the Variational Auto-Encoder(VAE) In this paper, we show that we can address most of the relies on an encoder qφ(z x), which aims at minimizing aforementioned shortcomings by drawing parallels between the distance to the unknownj conditional latent distribution. granular sound synthesis and probabilistic latent variable Under this assumption, the Evidence Lower Bound Objec- models. We develop a new neural granular synthesis tech- tive (ELBO) is optimized by minimization of a β weighted nique that refines granular synthesis and is efficiently solved KL regularization over the latent distribution added to the by generative neural networks (Figure1). Through the re- reconstruction cost of the decoder pθ(x z) peated observation of grains, our proposed technique adap- j θ,φ = Eqφ(z) log pθ(x z) +β KL qφ(z x) pθ(z) : (3) tively and unsupervisedly learns analysis dimensions, struc- L − j ∗ D j k turing a latent grain space, which is continuously invert- reconstruction regularization ible to signal domain. Such space embeds the training | {z } | {z } dataset, which is no longer required in memory for gen- The second term of this loss requires to define a prior dis- eration. It allows to continuously generate novel grains tribution over the latent space, which for ease of sampling at any interpolated latent position. In a second step, this and back-propagation is chosen to be an isotropic gaussian space serves as basis for a higher-level temporal model- of unit variance pθ(z) = (0; I). Accordingly, a forward N ing, by training a sequential embedding over contiguous pass of the VAE consists in encoding a given data point series of grain features. As a result, we can sample la- qφ : x µ(x); σ(x) to obtain a mean µ(x) and vari- −!f g tent paths with a consistent temporal structure and more- ance σ(x). These allow us to obtain the latent z by sam- over relieve some of the challenges to learn to generate pling from the Gaussian, such that z (µ(x); σ(x)). ∼ N raw waveforms. Its architecture is suited to optimizing lo- The representation learned with a VAEhas a smooth topol- cal spectro-temporal features that are essential for audio ogy [6] since its encoder is regularized on a continuous quality, as well as longer-term dependencies that are ef- density and intrinsically supports sampling within its unsu- ficiently extracted from grain-level sequences rather than pervised training process. Its latent dimensions can serve individual waveform samples. The trainable modules used both for analysis when encoding new samples, or as gen- are well-grounded in digital signal processing (DSP), thus erative variables that can continuously be decoded back to interpretable and efficient for sound synthesis.
Recommended publications
  • Granular Synthesis and Physical Modeling with This Manual Is Released Under the Creative Commons Attribution 3.0 Unported License
    Granular synthesis and physical modeling with This manual is released under the Creative Commons Attribution 3.0 Unported License. You may copy, distribute, transmit and adapt it, for any purpose, provided you include the following attribution: Kaivo and the Kaivo manual by Madrona Labs. http://madronalabs.com. Version 1.3, December 2016. Written by George Cochrane and Randy Jones. Illustrated by David Chandler. Typeset in Adobe Minion using the TEX document processing system. Any trademarks mentioned are the sole property of their respective owners. Such mention does not imply any endorsement of or associa- tion with Madrona Labs. Introduction “The future evolution of virtual devices is less constrained than that of real devices.” –Julius O. Smith Kaivo is a software instrument combining two powerful synthesis techniques (physical modeling and granular synthesis) in an easy-to- use semi-modular package. It’s laid out a bit like an acoustic instru- For more information on the theory be- ment; the GRANULATOR module acts like the player’s touch, exciting hind Kaivo, see Chapter 1, “Physi-who? Granu-what?” one or more tuned objects (here, the RESONATOR module, based on physical models of resonant objects) that come together in a central resonating body (the BODY module, also physics-based). This allows for a natural (or uncanny) sense of space, pleasing in- teractions between voices, and a ton of expressive potential—all traits in short supply in digital synthesis. The acoustic comparison begins to pale when you realize that your “touch” is really a granular sam- ple player with scads of options, and that the physical properties of the resonating modules are widely variable, and in real time.
    [Show full text]
  • The Sonification Handbook Chapter 9 Sound Synthesis for Auditory Display
    The Sonification Handbook Edited by Thomas Hermann, Andy Hunt, John G. Neuhoff Logos Publishing House, Berlin, Germany ISBN 978-3-8325-2819-5 2011, 586 pages Online: http://sonification.de/handbook Order: http://www.logos-verlag.com Reference: Hermann, T., Hunt, A., Neuhoff, J. G., editors (2011). The Sonification Handbook. Logos Publishing House, Berlin, Germany. Chapter 9 Sound Synthesis for Auditory Display Perry R. Cook This chapter covers most means for synthesizing sounds, with an emphasis on describing the parameters available for each technique, especially as they might be useful for data sonification. The techniques are covered in progression, from the least parametric (the fewest means of modifying the resulting sound from data or controllers), to the most parametric (most flexible for manipulation). Some examples are provided of using various synthesis techniques to sonify body position, desktop GUI interactions, stock data, etc. Reference: Cook, P. R. (2011). Sound synthesis for auditory display. In Hermann, T., Hunt, A., Neuhoff, J. G., editors, The Sonification Handbook, chapter 9, pages 197–235. Logos Publishing House, Berlin, Germany. Media examples: http://sonification.de/handbook/chapters/chapter9 18 Chapter 9 Sound Synthesis for Auditory Display Perry R. Cook 9.1 Introduction and Chapter Overview Applications and research in auditory display require sound synthesis and manipulation algorithms that afford careful control over the sonic results. The long legacy of research in speech, computer music, acoustics, and human audio perception has yielded a wide variety of sound analysis/processing/synthesis algorithms that the auditory display designer may use. This chapter surveys algorithms and techniques for digital sound synthesis as related to auditory display.
    [Show full text]
  • Combining Granular Synthesis with Frequency Modulation
    Combining granular synthesis with frequency modulation. Kim ERVIK Øyvind BRANDSEGG Department of music Department of music University of Science and Technology University of Science and Technology Norway Norway [email protected] [email protected] Abstract acoustical events. Sound as particles has been Both granular synthesis and frequency used in applications like independent time and modulation are well-established synthesis pitch scaling, formant modification, analog synth techniques that are very flexible. This paper modeling, clouds of sound, granular delays and will investigate different ways of combining reverbs, etc [3]. Examples of granular synthesis the two techniques. It will describe the rules parameters are density, grain pitch, grain duration, of spectra that emerge when combining, compare it to similar synthesis techniques and grain envelope, the global arrangement of the suggest some aesthetic perspectives on the grains, and of course the content of the grains matter. (which can be a synthetic waveform or sampled sound). Granular synthesis gives the musician vast Keywords expressive possibilities [10]. Granular synthesis, frequency modulation, 1.3 FM synthesis partikkel, csound, sound synthesis Frequency modulation has been a known method of coding audio into radio signal since the beginning of commercial radio. In 1964 that John Chowning discovered the implication of frequency modulation of audio working on synthesis of brass instrument at Stanford [2]. He discovered that modulating the frequency resulted in sideband emerging from the carrier frequency. Fig 1:Grain Pitch modulation Yamaha later implemented the technique in the hugely successful DX7. 1 Introduction If we consider FM synthesis using sinusoidal carrier and modulation oscillator, the spectral 1.1 Method components present in a FM sound can be mathematically stated as in figure 2.
    [Show full text]
  • Concatenative Sound Synthesis: the Early Years
    CONCATENATIVE SOUND SYNTHESIS: THE EARLY YEARS Diemo Schwarz Ircam – Centre Pompidou 1, place Igor-Stravinsky, 75003 Paris, France http://www.ircam.fr/anasyn/schwarz http://concatenative.net [email protected] ABSTRACT Concatenative sound synthesis (CSS) methods use a large database of source sounds, segmented into units, Concatenative sound synthesis is a promising method and a unit selection algorithm that finds the sequence of of musical sound synthesis with a steady stream of work units that match best the sound or phrase to be synthe- and publications for over five years now. This article of- sised, called the target. The selection is performed ac- fers a comparative survey and taxonomy of the many dif- cording to the descriptors of the units, which are charac- ferent approaches to concatenative synthesis throughout teristics extracted from the source sounds, or higher level the history of electronic music, starting in the 1950s, even descriptors attributed to them. The selected units can then if they weren't known as such at their time, up to the recent be transformed to fully match the target specification, and surge of contemporary methods. Concatenative sound are concatenated. However, if the database is sufficiently synthesis methods use a large database of source sounds, large, the probability is high that a matching unit will be segmented into units, and a unit selection algorithm that found, so the need to apply transformations, which always finds the units that match best the sound or musical phrase degrade sound quality, is reduced. The units can be non- to be synthesised, called the target. The selection is per- uniform (heterogeneous), i.e.
    [Show full text]
  • Chroma Palette: Chromatic Maps of Sound As Granular Synthesis
    Proceedings of the 2007 Conference on New Interfaces for Musical Expression (NIME07), New York, NY, USA Chroma Palette: Chromatic Maps of Sound As Granular Synthesis Interface Justin Donaldson Ian Knopke Chris Raphael Indiana University School of Indiana University School of Indiana University School of Informatics Informatics Informatics 1900 E. 10th Street, Room 931 1900 E. 10th Street, Room 932 1900 E. 10th Street, Room 933 Bloomington, IN 47406 Bloomington, IN 47406 Bloomington, IN 47406 [email protected] [email protected] [email protected] ABSTRACT There are many different categories of granular synthesis, Chroma based representations of acoustic phenomenon are such as synchronous, quasi-synchronous, and asynchronous representations of sound as pitched acoustic energy. A frame- forms, referring to the regularity with which grains are wise chroma distribution over an entire musical piece is a useful reassembled. Grains are usually windowed, both to aid resynthesis and straightforward representation of its musical pitch over time. and to avoid audible clicks. However, the choice of window This paper examines a method of condensing the block-wise function can also have a pronounced effect on the resulting timbre chroma information of a musical piece into a two dimensional and is an important component of the synthesis process. Granular embedding. Such an embedding is a representation or map of the synthesis has similarities to other common analysis/resynthesis different pitched energies in a song, and how these energies relate methodologies such as the short-term Fourier transform and to each other in the context of the song. The paper presents an wavelet-based techniques. Figure 1 shows an example of an interactive version of this representation as an exploratory envelope windowing and overlap arrangement for four different analytical tool or instrument for granular synthesis.
    [Show full text]
  • NUMERICAL SOUND SYNTHESIS Ii Numerical Sound Synthesis
    NUMERICAL SOUND SYNTHESIS ii Numerical Sound Synthesis Stefan Bilbao November 27, 2007 iii iv Contents Preface xiii 1 Sound Synthesis and Physical Modeling 1 1.1 AbstractDigitalSoundSynthesis . .......... 2 1.1.1 AdditiveSynthesis . .. .. .. .. .. .. .. .. .. .. .. .. .... 3 1.1.2 SubtractiveSynthesis . ..... 5 1.1.3 WavetableSynthesis . .... 5 1.1.4 AMandFMSynthesis.............................. 7 1.1.5 OtherMethods .................................. 8 1.2 PhysicalModeling ................................ ..... 9 1.2.1 LumpedMass-SpringNetworks . ..... 10 1.2.2 ModalSynthesis ................................ .. 11 1.2.3 DigitalWaveguides. .... 13 1.2.4 HybridMethods ................................. 16 1.2.5 DirectNumericalSimulation . ...... 17 1.3 PhysicalModeling: ALargerView . ........ 20 1.3.1 Physical Models as Descended from Abstract Synthesis ............ 20 1.3.2 Connections: Direct Simulation and Other Methods . ........... 21 1.3.3 ComplexityofMusicalSystems . ...... 22 1.3.4 Why? ........................................ 25 2 Time Series and Difference Operators 27 2.1 TimeSeries ...................................... ... 28 2.2 Shift, Difference and AveragingOperators . ............ 29 2.2.1 TemporalWidth ofDifference Operators. ........ 30 2.2.2 CombiningDifferenceOperators . ...... 31 2.2.3 TaylorSeriesandAccuracy . ..... 32 2.2.4 Identities .................................... .. 33 2.3 FrequencyDomainAnalysis . ....... 34 2.3.1 Laplace and z transforms ............................. 34 2.3.2 Frequency Domain Interpretation
    [Show full text]
  • Physically-Based Parametric Sound Synthesis and Control
    Physically-Based Parametric Sound Synthesis and Control Perry R. Cook Princeton Computer Science (also Music) Course Introduction Parametric Synthesis and Control of Real-World Sounds for virtual reality games production auditory display interactive art interaction design Course #2 Sound - 1 Schedule 0:00 Welcome, Overview 0:05 Views of Sound 0:15 Spectra, Spectral Models 0:30 Subtractive and Modal Models 1:00 Physical Models: Waveguides and variants 1:20 Particle Models 1:40 Friction and Turbulence 1:45 Control Demos, Animation Examples 1:55 Wrap Up Views of Sound • Sound is a recorded waveform PCM playback is all we need for interactions, movies, games, etc. (Not true!!) • Time Domain x( t ) (from physics) • Frequency Domain X( f ) (from math) • Production what caused it • Perception our image of it Course #2 Sound - 2 Views of Sound Time Domain is most closely related to Production Frequency Domain is most closely related to Perception we will see that many hybrids abound Views of Sound: Time Domain Sound is produced/modeled by physics, described by quantities of • Force force = mass * acceleration • Position x(t) actually < x(t), y(t), z(t) > • Velocity Rate of change of position dx/dt • Acceleration Rate of change of velocity dv/dt Examples: Mass+Spring+Damper Wave Equation Course #2 Sound - 3 Mass/Spring/Damper F = ma = - ky - rv - mg F = ma = - ky - rv (if gravity negligible) d 2 y r dy k + + y = 0 dt2 m dt m ( ) D2 + Dr / m + k / m = 0 2nd Order Linear Diff Eq. Solution 1) Underdamped: -t/τ ω y(t) = Y0 e cos( t ) exp.
    [Show full text]
  • Soft Sound: a Guide to the Tools and Techniques of Software Synthesis
    California State University, Monterey Bay Digital Commons @ CSUMB Capstone Projects and Master's Theses Spring 5-20-2016 Soft Sound: A Guide to the Tools and Techniques of Software Synthesis Quinn Powell California State University, Monterey Bay Follow this and additional works at: https://digitalcommons.csumb.edu/caps_thes Part of the Other Music Commons Recommended Citation Powell, Quinn, "Soft Sound: A Guide to the Tools and Techniques of Software Synthesis" (2016). Capstone Projects and Master's Theses. 549. https://digitalcommons.csumb.edu/caps_thes/549 This Capstone Project is brought to you for free and open access by Digital Commons @ CSUMB. It has been accepted for inclusion in Capstone Projects and Master's Theses by an authorized administrator of Digital Commons @ CSUMB. Unless otherwise indicated, this project was conducted as practicum not subject to IRB review but conducted in keeping with applicable regulatory guidance for training purposes. For more information, please contact [email protected]. California State University, Monterey Bay Soft Sound: A Guide to the Tools and Techniques of Software Synthesis Quinn Powell Professor Sammons Senior Capstone Project 23 May 2016 Powell 2 Introduction Edgard Varèse once said: “To stubbornly conditioned ears, anything new in music has always been called noise. But after all, what is music but organized noises?” (18) In today’s musical climate there is one realm that overwhelmingly leads the revolution in new forms of organized noise: the realm of software synthesis. Although Varèse did not live long enough to see the digital revolution, it is likely he would have embraced computers as so many millions have and contributed to these new forms of noise.
    [Show full text]
  • The Evolution of Granular Synthesis: an Overview of Current Research
    International Symposium on The Creative and Scientific Legacies of Iannis Xenakis, 8-10 June 2006, University of Guelph, Toronto, Canada ________________________________________________________ The evolution of granular synthesis: an overview of current research Curtis Roads Media Arts and Technology University of California Santa Barbara CA 93106-6065 USA 9 June 2006 This lecture presents an overview of several projects pursued over the past five years in our laboratories at Santa Barbara. All this research is based on a scientific model of sound initially proposed by Dennis Gabor (1946), and soon afterward extended to music by Iannis Xenakis (1960). Granular analysis (also called atomic decomposition) and granular synthesis have evolved over more than five decades from a paper theory and primitive experiments into a broad range of applied techniques. Specific to the granular model is its focus on the microacoustic time scale (typically 1 to 100 ms). Granular methods treat sound as a stream of acoustic particles in both the time domain and the time-frequency (TF) domain. For more details, see Roads (2002; Kling, et al. 2005). In this lecture, I first very briefly trace the history of the idea of sound particles. Next I will demonstrate PulsarGenerator, an application developed by Alberto de Campo and me in 2001 for a specific type of particle synthesis with links to past analog techniques. I will also demonstrate the SweepingQGranulator, a tool that I wrote in the SuperCollider language for the microfiltration of granulated sound. The latest threads in this line of research go in two directions. The first is a time-frequency analysis method known as matching pursuit decomposition.
    [Show full text]
  • Morphing of Granular Sounds
    Proc. of the 18th Int. Conference on Digital Audio Effects (DAFx-15), Trondheim, Norway, Nov 30 - Dec 3, 2015 MORPHING OF GRANULAR SOUNDS Sadjad Siddiq Advanced Technology Division, Square Enix Co., Ltd. Tokyo, Japan [email protected] ABSTRACT Granular sounds are commonly used in video games but the con- ventional approach of using recorded samples does not allow sound designers to modify these sounds. In this paper we present a tech- nique to synthesize granular sound whose tone color lies at an arbi- trary point between two given granular sound samples. We first ex- tract grains and noise profiles from the recordings, morph between them and finally synthesize sound using the morphed data. Dur- ing sound synthesis a number of parameters, such as the number of grains per second or the loudness distribution of the grains, can be altered to vary the sound. The proposed method does not only al- low to create new sounds in real-time, it also drastically reduces the Figure 1: Summary of granular synthesis. Grains are at- memory footprint of granular sounds by reducing a long recording tenuated and randomly added to the final mix. to a few hundred grains of a few milliseconds length each. simplest case their amplitude is attenuated randomly. Depending 1. INTRODUCTION on the probability distribution of the attenuation factors, this can 1.1. Granular synthesis in video games have a very drastic effect on the granularity and tone color of the produced sound. For a more detailed introduction to granular syn- In previous work we described the morphing between simple im- thesis refer to [4] or [5].
    [Show full text]
  • CM3106 Chapter 5: Digital Audio Synthesis
    CM3106 Chapter 5: Digital Audio Synthesis Prof David Marshall [email protected] and Dr Kirill Sidorov [email protected] www.facebook.com/kirill.sidorov School of Computer Science & Informatics Cardiff University, UK Digital Audio Synthesis Some Practical Multimedia Digital Audio Applications: Having considered the background theory to digital audio processing, let's consider some practical multimedia related examples: Digital Audio Synthesis | making some sounds Digital Audio Effects | changing sounds via some standard effects. MIDI | synthesis and effect control and compression Roadmap for Next Few Weeks of Lectures CM3106 Chapter 5: Audio Synthesis Digital Audio Synthesis 2 Digital Audio Synthesis We have talked a lot about synthesising sounds. Several Approaches: Subtractive synthesis Additive synthesis FM (Frequency Modulation) Synthesis Sample-based synthesis Wavetable synthesis Granular Synthesis Physical Modelling CM3106 Chapter 5: Audio Synthesis Digital Audio Synthesis 3 Subtractive Synthesis Basic Idea: Subtractive synthesis is a method of subtracting overtones from a sound via sound synthesis, characterised by the application of an audio filter to an audio signal. First Example: Vocoder | talking robot (1939). Popularised with Moog Synthesisers 1960-1970s CM3106 Chapter 5: Audio Synthesis Subtractive Synthesis 4 Subtractive synthesis: Simple Example Simulating a bowed string Take the output of a sawtooth generator Use a low-pass filter to dampen its higher partials generates a more natural approximation of a bowed string instrument than using a sawtooth generator alone. 0.5 0.3 0.4 0.2 0.3 0.1 0.2 0.1 0 0 −0.1 −0.1 −0.2 −0.2 −0.3 −0.3 −0.4 −0.5 −0.4 0 2 4 6 8 10 12 14 16 18 20 0 2 4 6 8 10 12 14 16 18 20 subtract synth.m MATLAB Code Example Here.
    [Show full text]
  • Implementing Real-Time Granular Synthesis Ross Bencina Draft of 31St August 2001
    Copyright ©2000-2001 Ross Bencina, All rights reserved. Implementing Real-Time Granular Synthesis Ross Bencina Draft of 31st August 2001. Overview This article describes a flexible architecture for Real-Time Granular Synthesis that accommodates a number of Granular Synthesis variants including Tapped Delay Line, Stored Sample and Synthetic Grain Granular Synthesis in both Pitch Synchronous and Asynchronous forms. Three efficient algorithms for generating grain envelopes are reviewed and two methods for generating stochastic grain onset times are discussed. Readers are advised to consult the literature for further information on the theory and applications of Granular Synthesis.1, 2, 3,4,5 Background Granular Synthesis or Granulation is a flexible method for creating animated sonic textures. Sounds produced by granular synthesis have an organic quality sometimes reminiscent of sounds heard in nature: the sound of a babbling brook, or leaves rustling in a tree. Forms of granular processing involving sampled sound may be used to create time stretching, time freezing, time smearing, pitch shifting and pitch smearing effects. Perceptual continua for granular sounds include gritty / smooth and dense / sparse. The metaphor of sonic clouds has been used to describe sounds generated using Granular Synthesis. By varying synthesis parameters over time, gestures evocative of accumulation / dispersal, and condensation / evaporation may be created. The output of a Granular Synthesizer or Granulator is a mixture of many individual grains of sound. The sonic quality of a granular texture is a result of the distribution of grains in time and of the parameters selected for the synthesis of each grain. Typically, grains are quite short in duration and are often distributed densely in time so that the resultant sound is perceived as a fused texture.
    [Show full text]