Implementing Audio Feature Extraction in Live Electronic Music

Implementing audio feature extraction in live electronic music Jamie Bullock Birmingham Conservatoire Birmingham City University A thesis submitted in partial fulllment of the requirements for the degree of Doctor of Philosophy October ò , òýý For Harry and Jake. Acknowledgements I would like to thank my supervisor, Lamberto Coccioli for changing the way I think about soware design. anks also to Simon Hall for invaluable support and guidance throughout my research and to Peter Johnson for supporting us in this crazy artform we call live electronic music. anks to Rivka Golani for input into Fission, and to Larissa Brown and David Matheu for really ‘going for it’ in the performance. anks to Katharine Lam for a beautiful performance of Variations. Many thanks to Miller Puckette for writing Pure Data, one of the most con- cise pieces of soware ever developed, and the Pure Data community for continuing to prove its value. anks to Richard Stallman for the GPL li- cense, the single most intelligent contribution to computing in the òýth century, and to Linus Torvalds for adopting it for Linux. anks to Ichiro Fujinaga, Tristan Jehan and Tae Hong Park whose seminal work started me thinking about the potential of audio feature extraction in live electronics. anks to the team at RootDown FM. I couldn’t have survived ‘the writeup’ without a soundtrack of Jazz, Funk, Hip-Hop, Soul, Latin, Reg- gae, Afrobeat, Leeld, Downtempo and Beats! anks to Aki Shiramizu, Martin Robinson, Andrew Deakin and Jonty Harrison for original and in- uential thinking, and to Henrik Frisk for helping me realise the original- ity of my own work. anks to James Stockley for ‘beer and scrubs’, Paul Broaderick and Bev Tagg for great friendship, and the team at Birmingham Women’s hospital Neo-Natal unit for giving my son medical care whilst I wrote my dissertation in the ‘parents room’. anks to my family for teaching me to believe I could achieve anything, and just ‘being there’ when times were bad. Most of all I would like to thank my wife Cheryl, the love of my life. Related Publications Chapter ò may include fragments of text from Bullock (òýýÞ). Abstract Music with live electronics involves capturing an acoustic input, converting it to an electrical signal, processing it electronically and converting it back to an acoustic waveform through a loudspeaker. e electronic processing is usually controlled during performance through human interaction with potentiometers, switches, sensors and other tactile controllers. ese tangible interfaces, when operated by a technical assistant or dedicated electronics performer can be eective for controlling multiple processing pa- rameters. However, when a composer wishes to delegate control over the electronics to an (acoustic) instrumental performer, physical interfaces can sometimes be problematic. Performers who are unfamiliar with electronics technology, must learn to operate and interact eectively with the interfaces provided. e operation of the technology is sometimes unintuitive, and ts uncomfortably with the performer’s learned approach to her instrument, creating uncertainty for both performer and audience. e presence of switches or sensors on and around the instrumental performer begs the questions: how should I interact with this and is it working correctly? In this thesis I propose an alternative to the physical control paradigm, whereby features derived from the sound produced by the acoustic instrument itself are used as a control source. is approach removes the potential for performer anxiety posed by tangible interfaces and allows the performer to focus on instrumental sound production and the eect this has on the electronic processing. A number of experiments will be conducted through a reciprocal process of composition, performance and soware development in order to evaluate a range of methods for instrumental interaction with electronics through sonic change. e focus will be on the use of ‘low level’ audio features including, but not limited to, fundamental frequency, amplitude, brightness and noise content. To facilitate these experiments, a number of pieces of soware for audio feature extraction and visualisation will be developed and tested, the nal aim being that this soware will be publically released for download and usable in a range of audio feature extraction contexts. In the conclusion, I will propose a new approach to working with audio feature extraction in the context of live electronic music. is approach will combine the audio feature extraction and visualisation techniques dis- cussed and evaluated in previous chapters. A new piece of soware will be presented in the form of a graphical user interface for perfomers to work interactively using sound as an expressive control source. Conclusions will also be drawn about the methdology employed during this research, with particular focus on the relationship between composition, ‘do-it-yourself’ live electronics and soware development as research process. Contents Ô Introduction Ô.Ô Research Context...................Ô Ô.ò Research questions................... Ô.ç Research methodology................. Ô.¥ Ethical considerations................. Ôý Ô. Background.....................ÔÔ Ô.â Audio feature extraction................. òý Ô.Þ Precedents...................... òò Ô. Perception...................... çý Ô.À Conclusions..................... ¥ ò Extraction ò.Ô Introduction..................... ¥â ò.ò LibXtract...................... ¥Þ ò.ç Existing systems.................... ¥Þ ò.¥ Library design.................... ç ò. Feature List...................... ò.â Other functionality provided by the library......... ò.Þ Implementations................... Þ ò. Eciency and real-time use............... ò.À Future work..................... À vi CONTENTS ò.Ôý Conclusions..................... Àý ç Visualisation ç.Ô Introduction..................... ÀÔ ç.ò e importance of visualisation.............. Àò ç.ç Existing audio feature visualisation soware......... Ôýâ ç.¥ Braun........................ Ôò ç. Conclusions..................... Ôçò ¥ Case Studies ¥.Ô Introduction..................... Ôç ¥.ò Fission....................... Ôç ¥.ç Undercurrent..................... Ô Ô ¥.¥ Sparkling Box..................... Ô ¥. Variations...................... ÔâÀ ¥.â Conclusions..................... Ôò Conclusions .Ô Research ndings................... Ô¥ .ò Future work..................... ÔÀý .ç Postlude....................... òýý Appendices A k-NN classier implementation vii CONTENTS B Simple perceptron code listing C LibXtract main header and public API D e Open Source Denition D.Ô Introduction..................... òò¥ E Audio Recordings E.Ô Track listing..................... òòÞ References viii List of Figures Ô.Ô Composition with live electronics work ow diagram . .ç Ô.ò Diagrammatic representation of the relationship between the written thesis, compositions and soware development components of the sub- mission.....................................â Ô.ç Archetypal live electronics conguration . Ôò Ô.¥ Live electronics setup for Madonna of Winter and Spring by Jonathan Harvey(ÔÀâ) ................................. Ô Ô. Instrument-player continuum that extends Rowe’splayer-paradigm and instrument-paradigm(Rowe, ÔÀÀò) . Ôâ Ô.â Process employed by Wessel (ÔÀÞÀ) for ‘extracting’ timbre similarity judge- ments made by a human listener . ò¥ Ô.Þ Pitch chromagram from Jehan (òýý ) showing a series of eight dierent chords (four AmÞ and four DmÞ) looped â times on a piano at Ôòý BPM ò Ô. Venndiagram of the hypothetical relationship between musically useful features and perceptually correlated features . çÔ Ô.À Equal-loudness contours (red) (from ISO òòâ:òýýç revision) Fletcher- Munson curves shown (blue) for comparison . ç¥ Ô.Ôý e multidimensional (vector) nature of timbre compared to the scalars pitch and loudness . ç¥ Ô.ÔÔ Pierre Schaeer’s TARTYP diagram(Schaeer, ÔÀââ, p. ¥ À) . çÞ Ô.Ôò Expanded typology diagram showing novel notation, from oresen (òýýò)...................................... çÀ Ô.Ôç Smalley’s spectro-morphological continua, from (Smalley, ÔÀâ, p. â - Þò)........................................ ¥Ô ix LIST OF FIGURES Ô.Ô¥ Dimension reduction for the purpose of live electronics control . ¥ç ò.Ô Example libXtract call graph showing quasi-arbitrary passing of data between functions . ç ò.ò Distribution graph showing mean, standard deviation, and variance . Þ ò.ç Distributions with positive and negative skewness . ò.¥ Distributions with positive and negative kurtosis . À ò. Comparison between Irregularity (Krimpho), Red, and Irregularity (Jensen), Green, for a linear cross-fade between a ¥¥ýHz sine wave and whitenoise................................... âÔ ò.â Korg Kaos pad . âò ò.Þ Audio tristimulus graph . âç ò. A typical B at clarinet spectrum . â ò.À Operations to reduce the eects of non-fundamental partials on the PDA....................................... ÞÔ ò.Ôý Manhattan distance versus Euclidean distance: e red, blue, and yel- low lines have the same length (Ôò) in both Euclidean and taxicab√ geometry. In Euclidean geometry, the green line has length âx ò ≈ .¥, and is the unique shortest path. In taxicab geometry, the green line’s length is still Ôò, making it no shorter than any other path shown. Þò ò.ÔÔ e eect of changing p in the Lp-norm computation for Spectral Flux (graph generated using Mazurka Scaled Spectral Flux plugin) . Þç ò.Ôò Hertz to bark conversion graph . Þâ ò.Ôç LibXtract peak extraction algorithm. e coecient value

Load more