
A Bassline Generation System Based on Sequence-to-Sequence Learning

Behzad Haki
Music Technology Group
Universitat Pompeu Fabra
Roc Boronat, 138
08018 Barcelona, Spain
[email protected]

Sergi Jorda
Music Technology Group
Universitat Pompeu Fabra
Roc Boronat, 138
08018 Barcelona, Spain
[email protected]

ABSTRACT
This paper presents a detailed explanation of a system generating basslines that are stylistically and rhythmically interlocked with a provided audio drum loop. The proposed system is based on a natural language processing technique: word-based sequence-to-sequence learning. The word-based sequence-to-sequence learning method proposed in this paper is comprised of recurrent neural networks composed of LSTM units. The novelty of the proposed method lies in the fact that the system is not reliant on a voice-by-voice transcription of drums; instead, in this method, a drum representation is used as an input sequence from which a translated bassline is obtained at the output. The drum representation consists of fixed-size sequences of onsets detected from a 2-bar audio drum loop in eight different frequency bands. The basslines generated by this method consist of pitched notes with different durations. The proposed system was trained on two distinct datasets compiled for this project by the authors. Each dataset contains a variety of 2-bar drum loops with annotated basslines from two different styles of dance music: House and Soca. A listening experiment designed around the system revealed that the proposed system is capable of generating basslines that are interesting and well rhythmically interlocked with the drum loops from which they were generated.

Author Keywords
NIME, Generative music, LSTM, word-based sequence-to-sequence learning, EDM

CCS Concepts
•Applied computing → Sound and music computing; •Information systems → Natural language generation; Music retrieval;

Licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0). Copyright remains with the author(s).
NIME'19, June 3-6, 2019, Federal University of Rio Grande do Sul, Porto Alegre, Brazil.

1. INTRODUCTION
In electronic dance music, many producers do not have a traditional compositional approach to creating musical pieces. For these producers, unpredictability may be a source of creativity. Moreover, many producers are not trained musicians in the sense that they have limited theoretical music training; hence, compositional tools may allow them to commonly (while not necessarily) go beyond their limited and possibly cliche compositional approaches.

While unpredictability can improve the aesthetic quality of a piece of work, a great many producers may favour creating pieces that are within a specific style. As a result, the motivation behind this work was to provide a method that would enable the development of tools assisting users to creatively improve the compositional aspects of their work. Given the prominence of basslines in a variety of dance music styles, we specifically focused our work on the generation of style-specific basslines.

An automatically generated bassline needs to work well rhythmically with the other components of a track. Given that the rhythmic structure of a danceable track is highly correlated with the underlying drum patterns, the bassline needs to be well interlocked with the percussive instruments. In dance music, the percussive patterns are either programmed, performed live or sampled. While in some cases a symbolic representation of the drum pattern can be available, in most cases only an audio recording of the percussive instruments is available. Hence, practical tools that require rhythmic information should rely on audio signals (i.e., they should be capable of extracting rhythmic information from audio recordings).

The primary objective of this work was to develop a system that would generate a style-specific bassline based on a provided "audio" drum loop. As explained above, the dependency of the system on audio drum loops would make the system more practical. While there are methods to automatically transcribe the drums in a percussive recording, the state-of-the-art methods on this topic still do not yield acceptable results. Also, given the scope of the current work, we could not afford to implement these methods. As a result, another objective of this work was to come up with a meaningful alternative "representation" of the drums (rather than an instrument-by-instrument transcription) using which the system would generate a bassline. Moreover, to be able to generate basslines within a style, we aimed to focus our work on "popular" tracks that supposedly define a particular style of dance music. Given a lack of such datasets, we aimed to create datasets containing popular tracks within a specific style.

2. RELATED WORK
Algorithmic Composition refers to "using some formal process to make music with minimal human intervention" [1][13]. With the advent of computers, some composers have used computers to compose music using algorithmic methods. The earliest example of such works is the Illiac Suite by Lejaren Hiller and Leonard Isaacson. The entirety of this piece was composed using a high-speed digital computer in 1955-56 at the University of Illinois. To compose this piece, Hiller and Isaacson created a computer program which would generate musical contents that would then be modified and selected based on pre-defined functions and rules [1][13]. In another approach using computers, Iannis Xenakis created a program which would compose a piece based on a probability distribution of notes provided to the program. In short, the earliest Algorithmic Music Composition (AMC) approaches using computers were based on stochastic or rule-based systems [13].

A relatively recent approach to computer-based AMC is Artificial Intelligence (AI) systems. AI systems may be considered rule-based systems with the distinction that these systems are capable of "defining their own grammar" [13]. AI systems are commonly developed using Machine Learning (ML) techniques. In [7][3], relevant ML techniques for automatic music composition have been comprehensively reviewed. Based on this review, earlier ML-based techniques were mostly based on Markov chains and genetic algorithms, while the most recent ones are based on Neural Networks (NN).

The earliest AMC methods relevant to dance music are based on Markov models. A Markov model is a stochastic model predicting the probability of the occurrence of an event given the previous occurrences leading up to that event. In musical terms, Markov models are quite useful in learning short-term temporal structures. Pachet's Continuator is an example of a Markov-based AMC system [14][15]. While the above works were not primarily focused on dance music, their capability to generate stylistically consistent content laid the ground for later AMC research focused solely on dance music, such as "GESMI: Generative Electronica Statistical Modeling Instrument" [6] and "GEDMAS: Generative Electronic Dance Music Algorithmic System" [2], developed by Eigenfeldt et al. [5].

3. TEXT-BASED LSTM NETWORKS
Music can be represented as a sequence of events. Understanding the conditional probabilities between these events allows for generating new musical content. However, fully understanding the conditional probabilities between these events is a complicated task. In recent years, many machine learning techniques have been employed to understand the relationships between different musical events and, hence, to generate new musical content based on the learned information. Specifically, a number of researchers have investigated the applicability of some neural-network-based natural language processing (NLP) techniques to the field of AMC.

Choi et al. [4] investigated the effectiveness of character- or word-based recurrent neural networks (Char-RNN / Word-RNN) in generating chord progressions or multi-voice drum patterns. In their method, the chord progressions and the drum patterns are represented as text. The textual representation is then used to train the two proposed Char-RNN and Word-RNN models. The authors found that both architectures are capable of learning (and hence generating) chord progressions. However, the Char-RNN model, as opposed to the Word-RNN model, failed to learn any meaningful information about drum patterns.

In [12], Makris et al. extend the model proposed in [4] by introducing additional features. These features were based on the basslines accompanying the drum patterns. In other words, the authors used the basslines accompanying the drum patterns to put additional constraints on the Word-RNN architecture responsible for learning and generating drum patterns.

In addition to Word-RNN and conditional Word-RNN methods, another NLP technique, sequence-to-sequence (seq2seq) translation, has recently been proposed in [11] for generating drum patterns. In a sequence-to-sequence translation method, a multilayered LSTM is used to map (encode) an input sequence to a vector of fixed dimensionality, while another LSTM layer decodes a target sequence from the encoded vector [16] (see Figure 1).

Figure 1: Block diagram of a typical sequence to sequence translation architecture [16]

Hutchings uses this method based on the assumption that in a drum phrase, different instruments speak different languages while saying the same thing [11]. Using the seq2seq architecture, the author develops a generative system which is capable of generating a drum groove using only a kick drum pattern and a provided musical style.

Although music is most certainly different from a language, the above works illustrate the potential of using NLP techniques for AMC. The current paper tries to further investigate the applicability of NLP (specifically the method proposed in [11]) for AMC.

4. METHODOLOGY
The objective of this project is to design, implement and test a generative model that is capable of generating a two-bar bassline that is stylistically interlocked with a provided two-bar audio drum loop. There are numerous resources on how to create "groovy" or "danceable" basslines that are well interlocked with a drum pattern; however, these guidelines can be entirely subjective (i.e., they are biased towards the authors' aesthetic preferences) and, hence, may not best characterize the style of the music. To avoid these biases, we decided to use machine learning techniques on a corpus consisting of a variety of style-specific music loops compiled from numerous artists/releases. Using the compiled dataset, a generative model was then trained so as to generate a bassline musical score given a provided audio loop.

4.1 Dataset Preparation
One of the objectives of our project was generating style-specific basslines. As a result, to study the capabilities of our design in generating style-specific basslines, we required at least two different datasets, each of which consisted of entries from a very "distinct" style of dance music. The two "distinct" styles of dance music used in this project were House and Soca. This selection was based on the clear difference in the level of rhythmic complexity of drums and basslines in House and Soca music (Soca, or Soul of Calypso, is a style of music based on Calypso, however with soul influences).

The dataset preparation is based on many drum and bassline loops obtained from popular House or Soca tracks identified through Discogs' catalog [9]. In Discogs, the releases contain genre (Electronic, Rock, Funk, ...) and style (sub-genre) tags. Using Discogs' API, we collected thousands of releases tagged as Electronic (genre) / House Music (style) or Reggae (genre) / Soca (style). One of the advantages of Discogs is that for many releases Youtube links are available for previewing a release. If a Youtube link was available for a relevant Discogs release, we

downloaded the audio from the video hosted on the platform using a Python script1. It should be noted that, in order to avoid other biases in the dataset, in cases where multiple videos were available, the script only downloaded the most popular link, where popularity was defined based on views, likes, dislikes and the time elapsed since upload.

About a thousand tracks were collected using the above methodology. From these, we selected and extracted 100 2-bar drum and bassline phrases (50 for House and 50 for Soca) using a semi-automatic process. Initially, the goal was to use a harmonic/percussive separation technique to separate the bassline from the drum loop; however, somewhere along the process, we realized that this separation results in many artifacts in the separated percussive component that may potentially affect the quality of the extracted rhythmic patterns. As a result, in each track we looked for a 2-bar loop with a bassline and another 2-bar loop containing only drums that was very similar to the actual loop interlocked with the bassline. To be able to perform this process quickly, a script was developed in which beat tracking methods were used to segment the audio recordings into 2-bar segments. Manually reviewing the segments, for each track (if available), we identified a drums-only segment and a very similar segment containing the drums as well as a bassline.

After collecting the loops, a symbolic representation of the audio loops was required so as to be used for designing, training and testing a generative system. The raw audio with only the drum loop could be used directly for obtaining the symbolic representation. However, the bassline loops were usually mixed with a drum loop. As a result, we needed to decompose the mixed bassline loop into two separate bassline (harmonic) and drums (percussive) components. After the separation, the extracted bassline component could be used for transcription. An interactive graphical user interface (as shown in Figure 2) was developed so as to facilitate the separation and transcription processes2.

Figure 2: The GUI for the developed transcriber software

Processing the obtained loops using the developed software, we were able to come up with a transcription of the basslines in terms of pitch, onset and offset locations (quantized to 16th-note accuracy). For the drums, we did not want to come up with an accurate instrument-by-instrument transcription. Instead, we wanted to come up with a meaningful representation of the drum activities with which we would be able to train a model. As a result, the drum loop was represented as a 2D matrix of onsets in eight frequency bands at each 16th-note time step (see Figure 3). The eight frequency bands were 40-70Hz, 70-110Hz, 130-145Hz, 160-190Hz, 300-400Hz, 5-7kHz, 7-10kHz and 10-15kHz [10].

Figure 3: A snapshot of the drum representation

The final datasets3 used in this project consist of 50 House loops and 50 Soca loops respectively. For each loop in the datasets, we provide a representation of the drum pattern and a bassline transcription obtained using the developed software. Quantized to 16th-note time steps and having a duration of 2 bars, the drums are represented at 32 different time steps. Hence, the drum representations are a 32x8 matrix of onsets. On the other hand, the basslines are vectors of 32x1, where the nth entry identifies whether a note starts (or should be sustained) at the nth time step. Figure 4 shows the overall pitch class distribution of the transcribed basslines in each dataset.

Figure 4: Overall distribution of pitch classes in the datasets

Reviewing the pitch class distributions, we believe that the basslines in the two datasets are harmonically distinct to some extent. It should be noted that the starting pitch of each bassline was transposed to C3; hence, there is a predominance of C in the pitch class distribution.

1 Source code is publicly available at https://github.com/behzadhaki/discogs-youtube-dl
2 Source code is publicly available at https://github.com/behzadhaki/qtHarmonicPercussiveSeparatorTranscriber
3 The datasets compiled for this project are publicly available at https://github.com/behzadhaki/drum_bassline_dataset
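The transposition mentioned above (normalizing every bassline so that its first sounding pitch is C3, MIDI 48) can be sketched as follows. This is a minimal illustration assuming the 1000/0 sustain/silence coding used in the dataset (detailed in Section 4.2); the function name is ours, not the authors':

```python
import numpy as np

SUSTAIN, SILENCE = 1000.0, 0.0  # dataset markers for "hold previous note" / "rest"

def transpose_to_c3(bassline, target=48):
    """Shift all MIDI pitches so the first sounding note becomes C3 (MIDI 48).

    Sustain (1000) and silence (0) markers are left untouched.
    """
    bassline = np.asarray(bassline, dtype=float)
    is_note = (bassline != SUSTAIN) & (bassline != SILENCE)
    if not is_note.any():
        return bassline  # nothing to transpose
    offset = target - bassline[is_note][0]  # interval moving the first pitch to 48
    out = bassline.copy()
    out[is_note] += offset
    return out

# A 4-step excerpt: E2 (40), sustain, G2 (43), silence.
# The first pitch becomes 48; the minor-third interval to 51 is preserved.
print(transpose_to_c3([40., 1000., 43., 0.]))
```

Applying the same shift to every note preserves all melodic intervals, which is why the transposition only changes the pitch class distribution (concentrating it on C) and not the contour of the basslines.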

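As a rough sketch of how the 32x8 onset matrix described in the previous section can be assembled once onsets have been detected in each frequency band: the quantization helper below and the 125 BPM figure are our illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# The eight analysis bands from Section 4.1 (Herrera et al. [10]).
BANDS = ["40-70Hz", "70-110Hz", "130-145Hz", "160-190Hz",
         "300-400Hz", "5-7kHz", "7-10kHz", "10-15kHz"]

def onsets_to_matrix(band_onsets, bpm=125.0, steps=32):
    """Quantize per-band onset times (seconds) to a steps x 8 binary matrix.

    band_onsets: list of 8 lists, one list of onset times per frequency band.
    One 16th note lasts 60 / bpm / 4 seconds; a 2-bar loop spans 32 such steps.
    """
    step_dur = 60.0 / bpm / 4.0            # 16th-note duration in seconds
    mat = np.zeros((steps, len(BANDS)), dtype=int)
    for band, onsets in enumerate(band_onsets):
        for t in onsets:
            idx = int(round(t / step_dur))  # snap to the nearest 16th-note step
            if 0 <= idx < steps:
                mat[idx, band] = 1
    return mat

# Kicks (lowest band) on every beat of a hypothetical 125 BPM, 2-bar loop:
beat = 60.0 / 125.0
kicks = [i * beat for i in range(8)]
m = onsets_to_matrix([kicks] + [[]] * 7)
print(m[:, 0])  # ones at every 4th step (steps 0, 4, ..., 28)
```

Each column of the resulting matrix corresponds to one frequency band, so a kick, snare or hi-hat hit shows up as onsets in the bands its energy occupies, without requiring any instrument-level transcription.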
Test Group                 Tests                                               Count
House Loops, Sub-group A   T1) Drum + Original Bassline                        5
                           T2) Drum + Generated Bassline using House model
                           T3a) Drum + Random Bassline from House dataset
House Loops, Sub-group B   T1) Drum + Original Bassline                        5
                           T2) Drum + Generated Bassline using House model
                           T3b) Drum + Random Bassline from Soca dataset
Soca Loops, Sub-group A    T1) Drum + Original Bassline                        5
                           T2) Drum + Generated Bassline using Soca model
                           T3a) Drum + Random Bassline from Soca dataset
Soca Loops, Sub-group B    T1) Drum + Original Bassline                        5
                           T2) Drum + Generated Bassline using Soca model
                           T3b) Drum + Random Bassline from House dataset
Overall                                                                        20 tests

Table 1: Listening experiment tests

4.2 Automatic Music Composition
In [11], Hutchings uses an LSTM-based Word-RNN seq2seq architecture to generate a style-specific drum pattern based on a provided kick drum pattern. Inspired by this approach, we decided to use the same architecture to translate a provided multi-instrument drum pattern to a style-specific bassline.

For every entry in the datasets, a bassline vector and a drum vector were prepared based on the available drum and bassline transcriptions/representations so as to be used for designing and implementing the models. The vectors were obtained in the following manner:

1. Bassline Vector: Using the available transcriptions in the compiled datasets, we constructed a bassline vector of length 32 (2 bars) for every entry in the datasets. The ith element in this vector represents the state of the bassline at the ith time step. Moreover, the basslines were all modified such that in all of them the starting pitch was C3 (MIDI 48). As an example, a typical bassline vector in the dataset looks like: array([48., 49., 1000., ..., 48., 1000., 1000., 0.]). A value of 1000 in this vector identifies that the note at the previous step needs to be sustained. A value of 0 identifies that there is a silence at the corresponding time step. Any other value is a MIDI number which identifies that a note (with the given MIDI number) starts at the corresponding time step (see Figure 5).

Figure 5: Bassline Coding

2. Drums Vector: Using the available transcriptions in the compiled datasets, we constructed a drums vector of length 32 (2 bars) for every entry in the datasets. The ith element in this vector represents the state of the drums at the ith time step. Each ith element of the drums vector is an 8-bit binary encoding in which the kth bit shows whether a drum onset exists in the kth frequency band at the ith time step. A typical drum vector for a sample loop in the dataset looks like: ['0b11111000', '0b00000000', ..., '0b00000011']. Note that the 0b prefix denotes binary encoding.

In a seq2seq architecture, a one-hot encoding is required for every single input and output node. Input node N represents the drum pattern at the Nth time step. For the input of the encoder layer, a one-hot-encoded vector of length 256 is used to denote which of the 256 possible drum onset combinations exists at a given time step. For the bassline outputs, a one-hot-encoded vector of length M is used, where

M = pitch class count in dataset + 1 (sustain) + 1 (silence).

In the compiled dataset for House music, there are 32 distinct notes (i.e., M = 34), while in the Soca dataset, there are 34 distinct notes (i.e., M = 36).

Two almost identical Word-RNN sequence-to-sequence models were implemented in Keras using a TensorFlow backend4. The models consisted of an encoder layer of 128 LSTM units. The decoder layer consisted of the same number of LSTM units. Moreover, the output of the decoder layer was connected to a dense softmax layer consisting of M cells. In the case of House music, the softmax layer consisted of 34 cells (M = 34), while in the case of Soca music, the softmax layer consisted of 36 cells (M = 36) (see Figure 6).

To fulfill the requirements of the seq2seq LSTM architecture, a reversed and shifted version of the bassline was also used as an input during the training process. A sequence-to-sequence model tries to translate an input phrase to an output phrase. Unlike language, in which for an input sentence there can be only one translation, in our case, for a given input (drum pattern), more than one output (bassline) can be valid. Hence, our assumption was that if we train the model on the entire dataset, a generated bassline should still be different from the bassline used for training. Moreover, if a model is over-fitted to the training set, the output can still be quite usable for practical applications. For these reasons, we decided to train our models on the entirety of the datasets and use randomly selected entries from the same training sets to evaluate the trained models. The training was done for 4000 epochs with a batch size of 50. These values were selected by trial and error such that reasonably varying basslines were generated for randomly tested inputs.

5. SYSTEM EVALUATION
A subjective evaluation of the generated basslines shows that the density of the events in the generated basslines is usually higher than in the original basslines. Moreover, while the generated basslines were never identical to the original bassline associated with a drum loop, some generated basslines shared rhythmic and melodic similarities5.

To better evaluate the developed bassline generation system, we prepared a listening experiment. The listening test aimed to study whether the generated basslines were comparable to real basslines existing in the datasets. The experiment comprised 20 tests, each of which consisted of 3 different loops. The tests were prepared as detailed in Table 1.

For each test, three different arrangements of a single drum loop combined with three different basslines were created. The first bassline was the actual bassline accompanying the drum loop in the dataset. The second bassline was generated based on the model developed for the style of the drum loop. Lastly, a random bassline either from the same style (Sub-group A) or the other style (Sub-group B) was selected to accompany the drum loop.

4 The source codes used for preparing the data as well as constructing, training and predicting with the model are publicly available at https://github.com/behzadhaki/bassline_seq2seq
5 Some examples of the generated basslines can be found at https://drive.google.com/open?id=1Tpl5aHKRC8FF48u7HhprT2phTZpM_gd0
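The encodings described above can be sketched as follows. The 256-way drum encoding and the M = pitches + sustain + silence output size follow the text; the vocabulary-building helper and the toy data are our illustrative assumptions:

```python
import numpy as np

SUSTAIN, SILENCE = 1000.0, 0.0  # dataset markers for "hold previous note" / "rest"

def drum_step_to_onehot(bits):
    """'0b11111000' -> one-hot vector of length 256 (one slot per onset combination)."""
    vec = np.zeros(256)
    vec[int(bits, 2)] = 1.0  # the 8-bit band pattern is its own index
    return vec

def build_bass_vocab(basslines):
    """Map every distinct pitch plus sustain and silence to one softmax cell.

    M = distinct pitch count + 1 (sustain) + 1 (silence), as in the paper
    (M = 34 for House, M = 36 for Soca).
    """
    pitches = sorted({v for b in basslines for v in b} - {SUSTAIN, SILENCE})
    tokens = [SILENCE, SUSTAIN] + pitches
    return {tok: i for i, tok in enumerate(tokens)}

def bass_step_to_onehot(value, vocab):
    vec = np.zeros(len(vocab))
    vec[vocab[value]] = 1.0
    return vec

# Toy corpus with 3 distinct pitches -> M = 3 + 1 (sustain) + 1 (silence) = 5
vocab = build_bass_vocab([[48., 1000., 51., 0.], [48., 50., 1000., 0.]])
print(len(vocab))                                   # 5
print(drum_step_to_onehot('0b11111000').argmax())   # 248
```

A 2-bar loop then becomes a sequence of 32 such one-hot vectors on each side: 32x256 at the encoder input and 32xM at the decoder output.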

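To make the data flow of the encoder-decoder concrete, here is a minimal NumPy sketch of a forward pass with the dimensions described above (32 time steps, 256-way drum input, M = 34 for House, 128 LSTM units). The weights are random and untrained, and the gate arithmetic is a generic single-layer LSTM, not the authors' Keras code:

```python
import numpy as np

rng = np.random.default_rng(0)

def lstm_step(x, h, c, W, U, b):
    """One LSTM step. W: (4H, D), U: (4H, H), b: (4H,); gate order i, f, o, g."""
    H = h.shape[0]
    z = W @ x + U @ h + b
    sig = lambda v: 1.0 / (1.0 + np.exp(-v))
    i, f, o = sig(z[:H]), sig(z[H:2 * H]), sig(z[2 * H:3 * H])
    g = np.tanh(z[3 * H:])
    c = f * c + i * g
    return o * np.tanh(c), c

def init(D, H):
    return rng.normal(0, 0.1, (4 * H, D)), rng.normal(0, 0.1, (4 * H, H)), np.zeros(4 * H)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

T, D_in, M, H = 32, 256, 34, 128  # time steps, drum one-hot size, output size, LSTM units
enc_W, enc_U, enc_b = init(D_in, H)
dec_W, dec_U, dec_b = init(M, H)
out_W = rng.normal(0, 0.1, (M, H))  # dense softmax layer on top of the decoder

def forward(drums_onehot, decoder_inputs):
    h = c = np.zeros(H)
    for x in drums_onehot:        # encode the 32-step drum sequence into (h, c)
        h, c = lstm_step(x, h, c, enc_W, enc_U, enc_b)
    probs = []
    for y in decoder_inputs:      # decode, conditioned on the encoder state
        h, c = lstm_step(y, h, c, dec_W, dec_U, dec_b)
        probs.append(softmax(out_W @ h))
    return np.array(probs)

drums = np.eye(D_in)[rng.integers(0, D_in, T)]  # random stand-in drum loop
dec_in = np.eye(M)[rng.integers(0, M, T)]       # random stand-in decoder inputs
p = forward(drums, dec_in)
print(p.shape)  # (32, 34): one distribution over the M bassline symbols per step
```

The fixed-size state (h, c) handed from encoder to decoder is the "vector of fixed dimensionality" of Figure 1: the entire drum loop is compressed into it before any bassline symbol is emitted.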
Figure 6: The architecture of the proposed model for House Music
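Section 4.2 mentions that a reversed and shifted version of the bassline is fed to the decoder during training. One plausible reading of that preprocessing, sketched with NumPy, is the usual teacher-forcing recipe with the sequence additionally reversed; the zero start-of-sequence vector and the exact reverse/shift order are our assumptions:

```python
import numpy as np

def seq2seq_training_pair(bassline_onehot):
    """Build a (decoder input, decoder target) pair from a (T, M) one-hot bassline.

    The decoder target is the bassline itself; the decoder input is the same
    sequence shifted one step (teacher forcing) and reversed, with a zero
    vector standing in for a start-of-sequence symbol (our simplification).
    """
    target = bassline_onehot                  # (T, M): what the decoder must emit
    shifted = np.zeros_like(target)
    shifted[1:] = target[:-1]                 # shift right by one time step
    decoder_input = shifted[::-1]             # reversed, per the paper's description
    return decoder_input, target

seq = np.eye(3)[[0, 1, 2, 1]]                 # toy 4-step sequence over 3 symbols
dec_in, tgt = seq2seq_training_pair(seq)
print(dec_in.shape, tgt.shape)                # (4, 3) (4, 3)
```

Shifting ensures the decoder predicts each symbol from the previous ones rather than copying its input, and reversing input sequences is the trick Sutskever et al. [16] report as easing optimization.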

The drum loops and the non-generated basslines were selected randomly from the training dataset. The three arrangements were randomly presented to the participants without mentioning the conditions under which they were created. The participants were asked to go through the tests and rank the three different loop arrangements presented in each test based on their perceptual interest. Before submitting the results, participants were asked to identify the prominent factors which played a role in making their judgments. Moreover, the questionnaire started with a description of the experiment and a number of demographic questions.

23 people participated in the experiment. The majority of the participants were 20 to 40 years old. The demographic questions showed that the majority of the participants were musically experienced. Moreover, we were interested in what factors constitute the likeability of a bassline. Hence, we asked the users to select from a number of factors which may have affected their judgments. The results show that for 78% of the participants, the melodic structure of a bassline contributed to the likeability of the bassline. Moreover, 70% and 65% of the users reported that, respectively, the danceability of the loops and the drum/bassline rhythmic consistency contributed to the likeability of the bassline.

Prior to performing statistical analysis on the responses, an overall grade was generated for every audio loop in the experiment using the ordinal rankings provided by participants6. Analysis of Variance (ANOVA) was performed to investigate whether the means of the responses for different loops were different. If the ANOVA analysis indicated that some means were different in an analysis subset, a post-hoc analysis was performed to identify the differing means7. Figure 7 illustrates the summary of means and standard deviations of the responses for Subsets 1, 2, 3 and 4. The ANOVA analysis of the results showed that:

1. For House Loops (Sub-group A), on average, the generated bassline (T2) is more interesting than the original bassline (T1) and the bassline selected from the same style (T3a). Moreover, the random basslines are as interesting as the original ones.

2. For House Loops (Sub-group B), on average, the generated bassline (T2) is as interesting as the original bassline (T1) and the bassline selected from the other style (T3b). Moreover, the random basslines are significantly less interesting than the original and generated ones.

3. For Soca Loops (Sub-groups A and B), on average, all three basslines are equally interesting.

Figure 7: Summary of Results (Group by Group)

Finally, all of the collected data was analyzed irrespective of the style of the loop or the style of the bassline. Figure 8 illustrates the summary of means and standard deviations of the responses irrespective of the style.

Figure 8: Summary of Combined Results

The above ANOVA analysis shows that, generally, the generated basslines are as interesting as the original basslines, while the random basslines are significantly less interesting.

6 A grade of 3, 2, or 1 was assigned to "Most Interesting", "Neutral" or "Least Interesting" loops respectively so as to generate an overall grade.
7 A detailed analysis of the results is available in Appendix B of the thesis published for this project [8].

6. CONCLUSIONS
Given the small size of the datasets (50 loops for each style), we were limited to the most compact architecture possible.

As a result, we limited the encoder/decoder networks to a single LSTM layer. However, we tried different numbers of LSTM units in the encoder/decoder network. Our subjective evaluation (based on the rhythmic and melodic structure of the generated basslines) showed that the most interesting basslines were obtained using 128 LSTM units in each of the encoder/decoder networks. Moreover, we trained the selected model using different numbers of epochs, and based on our subjective evaluation, the most interesting basslines were generated using 4000 epochs. The resulting models were capable of generating basslines that were largely melodically coherent and rhythmically complex. We also believe that not only were the generated basslines interesting, but they were also well interlocked with the input drum loop.

To better investigate the quality of the generated basslines, we conducted a listening experiment. In the case of House music, the results of the experiment showed that the model generated basslines which were as interesting as the original basslines while being significantly more interesting than the randomly selected basslines. In the case of Soca music, the evaluation showed the same pattern, with the exception that the random basslines were also as likeable as the original and generated versions.

One interesting observation was that, in the case of Soca loops, the basslines randomly selected from the other style were as likeable as the original and generated basslines. We suspect that the higher rhythmic complexity of the Soca loops allows for more variation in the basslines and, as a result, more basslines are deemed acceptable and likeable. Moreover, in the case of Soca music, the compositions are more diverse in terms of instrumentation. In fact, based on our observations, numerous brass and vocal components existed in every track. The instrumentation in Soca tracks may imply that the basslines need not only be interlocked with the drums but must also match the other instruments in the track (i.e., the quality of a bassline can only be evaluated in the context of a multi-instrument loop). Hence, with a lack of instruments other than drums and bass in the listening experiment, the basslines do not convey enough information to establish a clear trend in the likeability of the loops. Moreover, the unfamiliarity of the users with this style might have contributed to this trend.

In conclusion, the method proposed in this report is capable of generating basslines within a specific style. To generate a bassline, an audio drum loop can be fed into the system, from which a symbolic representation will be derived in terms of percussive onsets in 8 different frequency bands. The symbolic representation is fed into a seq2seq model to generate a melodically and rhythmically interesting bassline.

7. FUTURE WORKS
In future iterations, a dataset using a producer's tracks can be created to investigate whether the model is capable of replicating the aesthetic preferences of that producer. Moreover, an alternative method for possible drum representations is to use Variational Auto-Encoders (VAEs). The benefit of this method may be that a large dataset containing only drum loops can be used for training the VAE. Subsequently, the latent space of the VAE can be used as a representation of the drum loops in a given dataset (containing both drums and basslines). This alternative representation can then be used for training the model proposed in this paper. Finally, an interactive software tool can be developed for musicians/producers. This software can be trained either on a provided corpus or on a dataset compiled from the musicians'/producers' own compositions. This tool can be developed such that it would provide real-time recommendations. The effectiveness of this tool should be fully evaluated for both production and performance purposes.

8. REFERENCES
[1] A. Alpern. Techniques for algorithmic composition of music. 95:120, 1995.
[2] C. Anderson, A. Eigenfeldt, and P. Pasquier. The generative electronic dance music algorithmic system (GEDMAS). In Proceedings of the Artificial Intelligence and Interactive Digital Entertainment (AIIDE 13) Conference, 2013.
[3] J.-P. Briot. Deep Learning Techniques for Music Generation. Springer, 2018.
[4] K. Choi, G. Fazekas, and M. Sandler. Text-based LSTM networks for automatic music composition. arXiv preprint arXiv:1604.05358, 2016.
[5] A. Eigenfeldt. Towards a generative electronica, Mar 2013.
[6] A. Eigenfeldt and P. Pasquier. Considering vertical and horizontal context in corpus-based generative electronic dance music. In ICCC, pages 72-78, 2013.
[7] J. D. Fernández and F. Vico. AI methods in algorithmic composition: A comprehensive survey. Journal of Artificial Intelligence Research, 48:513-582, 2013.
[8] B. Haki. Drums and Bass Interlocking: A Bassline Generation System Based on Sequence-to-Sequence Learning, Apr. 2019. Available at https://doi.org/10.5281/zenodo.2631874.
[9] J. Hartnett. Discogs.com. The Charleston Advisor, 16(4):26-33, 2015.
[10] P. Herrera, A. Yeterian, and F. Gouyon. Automatic classification of drum sounds: a comparison of feature selection methods and classification techniques. In Music and Artificial Intelligence, pages 69-80. Springer, 2002.
[11] P. Hutchings. Talking drums: Generating drum grooves with neural networks. arXiv preprint arXiv:1706.09558, 2017.
[12] D. Makris, M. Kaliakatsos-Papakostas, I. Karydis, and K. L. Kermanidis. Combining LSTM and feed-forward neural networks for conditional rhythm composition. In International Conference on Engineering Applications of Neural Networks, pages 570-582. Springer, 2017.
[13] J. A. Maurer. The history of algorithmic composition, Mar 1999.
[14] F. Pachet. The Continuator: Musical interaction with style. Journal of New Music Research, 32(3):333-341, 2003.
[15] F. Pachet, P. Roy, G. Barbieri, and S. C. Paris. Finite-length Markov processes with constraints. transition, 6(1/3), 2001.
[16] I. Sutskever, O. Vinyals, and Q. V. Le. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems, pages 3104-3112, 2014.
