
Music Information Retrieval in Live Coding: A Theoretical Framework

Anna Xambó,* Alexander Lerch,† and Jason Freeman†
*Department of Music, Norwegian University of Science and Technology, 7491 Trondheim, Norway
†Center for Music Technology, Georgia Institute of Technology, 840 McMillan St NW, Atlanta, Georgia 30332, USA
[email protected]
{alexander.lerch, jason.freeman}@gatech.edu

Computer Music Journal, 42:4, pp. 9–25, Winter 2018. doi:10.1162/COMJ_a_00484
© 2019 Massachusetts Institute of Technology.

Abstract: Music information retrieval (MIR) has great potential in musical live coding because it can help the musician–programmer to make musical decisions based on audio content analysis and to explore new sonorities by means of MIR techniques. The use of real-time MIR techniques can be computationally demanding, and thus they have rarely been used in live coding; when they have been used, it has been with a focus on low-level feature extraction. This article surveys and discusses the potential of MIR applied to live coding at a higher musical level. We propose a conceptual framework of three categories: (1) audio repurposing, (2) audio rewiring, and (3) audio remixing. We explored the three categories in live performance through MIRLC, an application programming interface library written in SuperCollider. We found that it is still a technical challenge to use high-level features in real time, yet using rhythmic and tonal properties (midlevel features) in combination with text-based information (e.g., tags) helps to achieve a closer perceptual level centered on pitch and rhythm when using MIR in live coding. We discuss challenges and future directions of utilizing MIR approaches in the computer music field.

Live coding in music is an improvisation practice based on generating code in real time, either by writing it directly or by using interactive programming (Brown 2006; Collins et al. 2007; Rohrhuber et al. 2007; Freeman and Troyer 2011). Music information retrieval (MIR) is an interdisciplinary research field that targets, generally speaking, the analysis and retrieval of music-related information, with a focus on extracting information from audio recordings (Lerch 2012). Applying MIR techniques to live coding can contribute to the process of music generation, both creatively and computationally. One potential scenario would be to create categorizations of audio streams and extract information on timbre and performance content, as well as to drive semiautomatic audio remixing, enabling the live coder to focus on high-level musical properties and decisions. Another potential scenario is to model a high-level music space with certain algorithmic behaviors and allow the live coder to combine the semiautomatic retrieval of sounds from both crowdsourced and personal databases.

This article explores the challenges and opportunities that MIR techniques can offer to live coding practices. Its main contributions are: a survey review of the state of the art of MIR in live coding; a categorization of approaches to live coding using MIR, illustrated with examples; an implementation in SuperCollider of the three approaches, demonstrated in test-bed performances; and a discussion of future directions of real-time computer music generation based on MIR techniques.

Background

In this section, we overview the state of the art of live coding environments and of MIR related to live performance applications.

Live Coding Programming Languages

The terms "live coding language" and "live coding environment," which support the activity of live coding in music, will be used interchangeably in this article. Programming languages that have been used for live coding include ChucK (Wang 2008), Extempore (previously Impromptu; Sorensen 2018), FoxDot (Kirkbride 2016), Overtone (Aaron and Blackwell 2013), Sonic Pi (Aaron and Blackwell 2013), SuperCollider (McCartney 2002), TidalCycles (McLean and Wiggins 2010), Max and Pd (Puckette 2002), and Scratch (Resnick et al. 2009). With the advent and development of JavaScript and Web Audio, a new generation of Web-based live coding languages has emerged, e.g., Gibber (Roberts and Kuchera-Morin 2012). Visual live coding is also an emerging practice, notably with Hydra (https://github.com/ojack/hydra) and Cyril (http://cyrilcode.com/).

Commercial Software Using MIR for Live Performance

Real-time MIR has been investigated and implemented in mainstream software, including digital audio workstations (DAWs) such as Ableton Live; karaoke software such as Karaoki (http://www.pcdj.com/karaoke-software/karaoki/) and My Voice Karaoke (http://www.emediamusic.com/karaoke-software/my-voice-karaoke.html); DJ software such as Traktor (https://www.native-instruments.com/en/products/traktor/), Dex 3 (http://www.pcdj.com/dj-software/dex-3/), and Algoriddim's Djay Pro 2 (https://www.algoriddim.com/); and song retrieval or query-by-humming applications such as Midomi (https://www.midomi.com/) and SoundHound (https://soundhound.com/). Typically, these solutions include specialized MIR tasks, such as pitch tracking for the karaoke software and beat and key detection for the DJ software. In these applications, the MIR tasks are focused on high-level musical information (as opposed to low-level feature extraction) and thus inspire this research. There are several collaborations and conversations between industry and research looking at real-time MIR applied to composition, editing, and performance (Bernardini et al. 2007; Serra et al. 2013). To our knowledge, there has been little research on the synergies between MIR in musical live coding and MIR in commercial software. The present article aims to help fill that gap.
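To make these tasks concrete, the following minimal sketch (our own example, not code from any of the products mentioned above) prototypes pitch tracking and beat detection with machine-listening UGens built into SuperCollider, the environment used later in this article. The OSC address /mir and the polling rate are arbitrary choices.

(
{
    var sig, freq, hasFreq, chain, beat, half, quarter, tempo;
    sig = SoundIn.ar(0);               // live audio input
    // Pitch tracking, as in karaoke software: returns a frequency and a voiced flag.
    # freq, hasFreq = Pitch.kr(sig);
    // Beat tracking, as in DJ software: operates on FFT frames of the input.
    chain = FFT(LocalBuf(1024), sig);
    # beat, half, quarter, tempo = BeatTrack.kr(chain);
    // Post the tracked values a few times per second for inspection.
    SendReply.kr(Impulse.kr(4), '/mir', [freq, hasFreq, tempo]);
    Silent.ar;
}.play;

OSCdef(\mir, { |msg|
    "freq: % Hz, voiced: %, tempo estimate: %".format(
        msg[3].round(0.1), msg[4], msg[5].round(0.01)).postln;
}, '/mir');
)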
Real-Time Feature Extraction Tools for Live Coding Environments

In this section, we present a largely chronological and functional analysis of existing MIR tools for live coding and interactive programming, in order to understand the characteristics and properties of the supported features, identify the core set of features, and discuss whether the existing tools satisfy the requirements of live coders.

Nearly all MIR systems extract a certain intermediate feature representation from the input audio and use this representation for inference—for example, classification or regression. Although still an open debate, there is some common ground in the MIR literature about the categorization of audio features into low-, mid-, and high-level. In an earlier publication, the second author referred to low-level features as instantaneous features, extracted from small blocks of audio and describing various, mostly technical properties, such as the spectral shape, intensity, or a simple characterization of pitched content (Lerch 2012). Serra et al. (2013) refer to tonal and temporal descriptors as midlevel features. High-level features usually describe semantic characteristics, e.g., emotion, mood, musical form, and style (Lerch 2012), or overall musical, cultural, and psychological characteristics, such as genre, harmony, mood, rhythm, and tonality (Serra et al. 2013). The distinction between low-, mid-, and high-level audio features is used here because it provides a sufficient level of granularity to be useful to the live coder.
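This distinction can be made concrete in SuperCollider, whose built-in machine-listening UGens span the first two levels. The sketch below (our own illustration; the thresholds, rates, and OSC names are arbitrary) extracts two low-level, instantaneous features—intensity and spectral centroid—alongside two midlevel descriptors of the tonal and temporal kind: a key estimate and onset events.

(
{
    var sig, chain, amp, centroid, key, onsets;
    sig = SoundIn.ar(0);
    chain = FFT(LocalBuf(4096), sig);   // KeyTrack expects a large FFT frame
    // Low-level (instantaneous) features: intensity and spectral shape.
    amp = Amplitude.kr(sig);
    centroid = SpecCentroid.kr(chain);
    // Midlevel features: tonal (key estimate over major and minor keys)
    // and temporal (onset triggers).
    key = KeyTrack.kr(chain);
    onsets = Onsets.kr(chain, 0.5);
    SendReply.kr(Impulse.kr(2), '/features', [amp, centroid, key]);
    SendReply.kr(onsets, '/onset');
    Silent.ar;
}.play;

OSCdef(\features, { |msg|
    "amp: %, centroid: % Hz, key: %".format(
        msg[3].round(0.001), msg[4].round(1), msg[5]).postln;
}, '/features');
OSCdef(\onset, { "onset detected".postln }, '/onset');
)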
Various audio feature extraction tools for live coding have existed for some time. In this section, we present an overview of existing tools for the three environments ChucK (Wang 2008), Max (Puckette 2002), and SuperCollider (McCartney 2002), because of their popularity, extensive functionality, and maturity. The findings and conclusions, however, are generalizable to other live coding environments. An illustrative selection of MIR tools for analysis is listed in Table 1.

Table 1. Selected MIR Tools for Live Coding Environments

Library Type   MIR Tool                               Authors                                Live Coding Environment
Included       Unit Analyzer (UAna)                   Wang, Fiebrink, and Cook (2007)        ChucK
Included       fiddle~, bonk~, and sigmund~ objects   Puckette, Apel, and Zicarelli (1998)   Max
Included       Built-in UGens for machine listening   SuperCollider community                SuperCollider
Third-party    SMIRK                                  Fiebrink, Wang, and Cook (2008)        ChucK
Third-party    LibXtract                              Jamie Bullock (2007)                   Max, SuperCollider
Third-party    Zsa.Descriptors                        Malt and Jourdan (2008)                Max
Third-party    analyzer~ object                       Tristan Jehan                          Max
Third-party    MuBu                                   Schnell et al. (2009)                  Max
Third-party    SCMIR                                  Nick Collins (2011)                    SuperCollider
Third-party    Freesound quark                        Gerard Roma                            SuperCollider

Figure 1 shows a timeline of the analyzed MIR tools from the 1990s to the present, which provides a historical and technological perspective on the evolution of the programming environments and dependent libraries. As shown in Figure 2, the live coding environment that includes the most audio features is SuperCollider, and the third-party libraries with the largest number of features are LibXtract and ...

Considering these most popular features, it is noticeable that the majority are low-level (instantaneous) features (e.g., RMS or spectral centroid). These features have been widely used in audio content analysis—for example, to classify audio signals. In live coding, they can be helpful to determine audiovisual changes in real time, yet the amount of data can be overwhelming and most of ...
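As a sketch of that real-time use (again our own example; the mapping itself is an arbitrary choice), the following SuperCollider code smooths two such low-level features of a live input and maps them directly to synthesis parameters, so that louder input raises the level and brighter input raises the pitch of a sine tone.

(
{
    var in, chain, amp, centroid;
    in = SoundIn.ar(0);
    chain = FFT(LocalBuf(1024), in);
    // Smooth the instantaneous features to tame the data rate noted above.
    amp = Amplitude.kr(in).lag(0.1);             // RMS-like intensity follower
    centroid = SpecCentroid.kr(chain).lag(0.2);  // brightness estimate in Hz
    // Brighter input -> higher tone; louder input -> louder tone.
    SinOsc.ar(centroid.clip(100, 4000)) * amp.clip(0, 0.5) ! 2;
}.play;
)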