A Model of Inter-musician Communication for Artificial Musical Intelligence

Oscar Alfonso Puerto Melendez

Thesis of 60 ECTS credits Master of Science (M.Sc.) in Computer Science

June 2017

A Model of Inter-musician Communication for Artificial Musical Intelligence

by

Oscar Alfonso Puerto Melendez

Thesis of 60 ECTS credits submitted to the School of Computer Science at Reykjavík University in partial fulfillment of the requirements for the degree of Master of Science (M.Sc.) in Computer Science

June 2017

Supervisor: David Thue, Assistant Professor, Reykjavík University, Iceland

Examiners: Hannes Högni Vilhjálmsson, Associate Professor, Reykjavík University, Iceland

Thor Magnusson, Senior Lecturer, University of Sussex, UK

Copyright
Oscar Alfonso Puerto Melendez
June 2017

A Model of Inter-musician Communication for Artificial Musical Intelligence

Oscar Alfonso Puerto Melendez

June 2017

Abstract

Artificial Musical Intelligence is a subject that spans a broad array of disciplines related to human cognition, social interaction, cultural understanding, and music generation. Although significant progress has been made on particular areas within this subject, the combination of these areas remains largely unexplored. In this dissertation, we propose an architecture that facilitates the integration of prior work on Artificial Intelligence and music, with a focus on enabling computational creativity. Specifically, our architecture represents the verbal and non-verbal communication used by human musicians using a novel multi-agent interaction model, inspired by the interactions that a jazz quartet exhibits when it performs. In addition to supporting direct communication between autonomous musicians, our architecture presents a useful step toward integrating the different subareas of Artificial Musical Intelligence.

Titill verkefnis

Oscar Alfonso Puerto Melendez

júní 2017

Útdráttur

Tónlistargervigreind er grein sem spannar fjölbreytt svið tengd mannlegri þekkingu, félagslegum samskiptum, skilningi á menningu og gerð tónlistar. Þrátt fyrir umtalsverða framþróun innan greinarinnar, hefur samsetning einstakra sviða hennar lítið verið rannsökuð. Í þessari ritgerð eru lögð drög að leið sem stuðlar að samþættingu eldri verka á sviði gervigreindar og tónlistar, með áherslu á getu tölva til sköpunar. Nánar tiltekið lýtur ritgerðin að samskiptum sem notuð eru af tónlistarmönnum við nýju samskiptalíkani fyrir fjölþætt samskipti, sem er innblásið af samskiptum jazz kvarteta á sviði. Til viðbótar við stuðning við bein samskipti sjálfvirkra tónlistarmanna, felst í rannsókninni framþróun samþættingar ólíkra undirsviða tónlistargervigreindar.

A Model of Inter-musician Communication for Artificial Musical Intelligence

Oscar Alfonso Puerto Melendez

Thesis of 60 ECTS credits submitted to the School of Computer Science at Reykjavík University in partial fulfillment of the requirements for the degree of Master of Science (M.Sc.) in Computer Science

June 2017

Student:

Oscar Alfonso Puerto Melendez

Supervisor:

David Thue

Examiner:

Hannes Högni Vilhjálmsson

Thor Magnusson

The undersigned hereby grants permission to the Reykjavík University Library to reproduce single copies of this Thesis entitled A Model of Inter-musician Communication for Artificial Musical Intelligence and to lend or sell such copies for private, scholarly or scientific research purposes only. The author reserves all other publication and other rights in association with the copyright in the Thesis, and except as herein before provided, neither the Thesis nor any substantial portion thereof may be printed or otherwise reproduced in any material form whatsoever without the author’s prior written permission.

date

Oscar Alfonso Puerto Melendez
Master of Science

To my parents José Jacobo Puerto and Iris Melendez.

Acknowledgements

I would like to thank my supervisor David Thue for his remarkable guidance throughout my research, and my family for all their support and encouragement.

Contents

Acknowledgements

Contents

List of Figures

List of Tables

1 Introduction
  1.1 General Background
    1.1.1 Artificial Music Intelligence
    1.1.2 Computational models of music creativity
    1.1.3 Common algorithms applied to musical composition
      1.1.3.1 Cellular automata
      1.1.3.2 Grammar based music composition
      1.1.3.3 Genetic Algorithms
    1.1.4 Interactive musical systems for live music
    1.1.5 Communication protocols in a musical multi-agent system
      1.1.5.1 The FIPA Protocols
  1.2 Musebot Ensemble
    1.2.1 Limited real-time capabilities
    1.2.2 Changing musical roles
    1.2.3 Collaboration between agents
  1.3 Summary

2 Problem Formulation
  2.1 Our vision: an ideal AMI system
    2.1.1 Autonomy
    2.1.2 Musical communication
    2.1.3 Performance of alternative musical styles
    2.1.4 Taking on different roles within a musical ensemble
  2.2 Our focus: enabling communication between musical agents
  2.3 Jazz as a good domain for AMI research
  2.4 Expected challenges
  2.5 Summary

3 Related Work
  3.1 Algorithmic Composition
    3.1.1 Generative grammars
    3.1.2 Evolutionary methods
    3.1.3 Artificial neural networks
  3.2 Live Algorithms
  3.3 Musical multi-agent systems
    3.3.1 Multi-agent system for collaborative music
    3.3.2 Multi-agent system for simulating musicians' behaviors
    3.3.3 Multi-agent system towards the representation of musical structures
  3.4 Efforts toward integration
  3.5 Commercial software for music
  3.6 Summary

4 Proposed Approach
  4.1 General Architecture
    4.1.1 The CADIA Musebot
      4.1.1.1 Musician Agent
      4.1.1.2 Composer Agent
      4.1.1.3 Synchronizer Agent
      4.1.1.4 Ensemble Assistant Agent
  4.2 Interaction Between Agents
    4.2.1 Interaction Protocols
  4.3 Development Process
    4.3.1 Initial Considerations
    4.3.2 Development Challenges and Implementations
  4.4 Summary

5 Evaluation
  5.1 Testing and Simulations
    5.1.1 Bastien and Hostager's case study
    5.1.2 Song 1
      5.1.2.1 Results
      5.1.2.2 Analysis
    5.1.3 Song 2
      5.1.3.1 Results
      5.1.3.2 Analysis
    5.1.4 Song 3
      5.1.4.1 Results
      5.1.4.2 Analysis
  5.2 Summary

6 Discussion and future work
  6.1 Benefits and limitations
  6.2 Future Work
    6.2.1 Towards Algorithmic Composition
    6.2.2 Towards Live Algorithms
    6.2.3 Towards a Standard Taxonomy in AMI
    6.2.4 Further Considerations

7 Conclusion

Bibliography

List of Figures

1.1 An example of a linear single Cellular Automaton.
1.2 Example of the binary and notational representations of the cellular automaton rhythm extracted from Brown's work [36].

4.1 An example of our Musebot agent architecture. Each Musebot represents an autonomous musician and is a multi-agent system composed of four agents: musician, composer, synchronizer, and ensemble assistant.
4.2 The musician agent's (MA's) Finite State Machine (FSM).
4.3 Example of the musician agent's behaviour hierarchy.
4.4 Cycle process in which the MAs exchange roles.
4.5 The Intro FSM.
4.6 The Accompaniment FSM.
4.7 The Solo FSM.
4.8 One of the agent interaction protocols that we implemented to support communication in our agent architecture (the FIPA Contract Net Interaction Protocol).

6.1 Synchronization process of two agents in the Musebot Ensemble. The Musebot Bass comes from being a leader and is ready to support the Musebot Piano, but it will need to request its Synchronizer Agent to obtain the current section being played by the Musebot Piano.
6.2 Interaction between three internal agents through custom protocols.

List of Tables

1.1 Set of notes in an input melody
1.2 Transition probability matrix from Brown's Markov process example [42].
1.3 FIPA Interaction Protocols

4.1 This table represents the properties of the song's structure object. The tempo and the time signature's numerator and denominator are stored as integer values. The form stores a string of characters, and the sections of the form store a list of musical chords.

5.1 This table enumerates each agent's possible actions during a song's progression. The actions are classified into communicative behaviours and musical behaviours.
5.2 Introduction of Song 1. The leader waits while the piano plays an introduction, and all three accompanists compose their parts.
5.3 First Chorus (AAB) of Song 1. The leader (sax) waits until the end of the first section, because it needs to give the CA time to find out the current section (first A) being played by the accompanists. The leader then starts to play the solo at the beginning of the second A section. Finally, at section B, the "sax" passes the leadership to the "bass". All three accompanists play their parts during this chorus.
5.4 Second Chorus (AAB) of Song 1. The new leader (piano) plays a solo in each section of this chorus and passes the leadership to the "bass" at the last section. The rest of the members play the accompaniments.
5.5 Third Chorus (AAB) of Song 1. The "bass" takes its turn to be the leader and plays a solo while the accompanists play their parts. At section B of this chorus, the "drums" accept being the new leader of the ensemble.
5.6 Fourth Chorus (AAB) of Song 1. The new leader (drums) plays a solo during the entire chorus, and the accompanists support this by playing their parts.
5.7 Fifth Chorus (AAB) of Song 1. During this chorus, the "drums" continue playing a solo until the end of the second A section. At this moment of the song, the "sax" takes back the leadership and the "drums" return to playing the accompaniments, while the rest of the agents ("bass" and "piano") play their parts.
5.8 Sixth Chorus (AAB) of Song 1. This is the last chorus of Song 1; the "sax" plays a solo for a single section and then passes the leadership to the "piano", which continues playing the next solo during the rest of the chorus.
5.9 Introduction of Song 2. The leader ("sax") waits for the end of the introduction, which is played by the "piano" while all the accompanists compose their parts.
5.10 First Chorus (A) of Song 2. The leader ("sax") composes the solo and waits for the end of the chorus, while the accompanists play their parts.

5.11 Second Chorus (A) of Song 2. The "sax" starts to play a solo and the rest of the members of the ensemble play the accompaniments.
5.12 Third Chorus (A) of Song 2. The "sax" ends playing the solo at second fifty-five of section A and passes the leadership to the "piano". The "piano" and the "drums" play the accompaniments during the entire chorus.
5.13 Fourth Chorus (A) of Song 2. After the "piano" accepted the leadership in the previous chorus, it starts to play the solo at the beginning of section A. The "sax" and "drums" support the new soloist with the accompaniments.
5.14 Fifth Chorus (A) of Song 2. The leader ("piano") continues playing the solo during this chorus, while the accompanists play their parts.
5.15 Sixth Chorus (A) of Song 2. The "piano" passes the leadership to the "drums" and ends its solo when section A finishes. The "drums" then accept the leadership and end their part while the "sax" continues playing the accompaniment.
5.16 Seventh Chorus (A) of Song 2. The new leader ("drums") starts to play a solo. The "piano" has become an accompanist and supports the "drums" along with the "sax".
5.17 Eighth Chorus (A) of Song 2. The accompanists ("sax" and "piano") continue supporting the soloist. The leader plays a section of inventive solo.
5.18 Ninth Chorus (A) of Song 2. The leader ("drums") continues improvising a solo during this chorus, while the other agents play the accompaniments.
5.19 Tenth Chorus (A) of Song 2. The "drums" play their last section of improvised solo and pass the leadership to the "sax". The "sax" accepts the leadership and plays its last accompaniment in the song, while the piano continues supporting the leader.
5.20 Eleventh Chorus (A) of Song 2. The leader ("sax") composes a solo while the rest of the agents play the accompaniments.
5.21 Introduction of Song 3. The leader ("sax") waits for the end of the introduction, which is played by the "piano" while all the accompanists compose their parts.
5.22 First Chorus (AABA) of Song 3. At the first section, the "sax" composes a solo and waits until this section is finished. The leader later starts to play the solo at the beginning of the second section. Next, at section B, the "sax" passes the leadership to the "piano" and afterwards becomes an accompanist, playing its part from the beginning of the last section; at that moment the "piano" becomes the new leader and starts to play a solo. The "bass" and "drums" play their parts during the entire chorus without any change in their roles.
5.23 Second Chorus (AABA) of Song 3. In this chorus, the "piano" plays a solo during the first two sections, after which it passes the leadership to the "sax" and then becomes an accompanist. The "sax" accepts the leadership during the second section and plays a solo during the next two sections (B and A); after that, the "sax" passes the leadership back to the "piano". The "bass" and "drums" continue this chorus without any change in their roles.
5.24 Third Chorus (AABA) of Song 3. The "piano" plays a solo during the first three sections; the leader ("piano") then becomes an accompanist and lets the "bass" become the new leader of the ensemble. The "bass" starts to play a solo at the beginning of the last section, while the "sax" and the "drums" play their parts as accompanists.

5.25 Fourth Chorus (AABA) of Song 3. The leader ("bass") plays a solo during the first two sections. The leader later passes the leadership to the "drums" and subsequently changes its role to an accompanist. The "drums" accept being the new leader during the second section and then play a solo during the rest of the chorus. The "sax" and the "piano" continue this chorus without any changes in their roles.

Chapter 1

Introduction

Artificial Musical Intelligence is a broad area of research that uses Artificial Intelligence (AI) techniques to build autonomous, interactive, musical systems [1]. Hiller [2] was one of the pioneers of combining AI and music, building an application that generated musical compositions based on rule systems and Markov Chains. Later, Cope’s [3] “Experiments in Music Intelligence” used symbolic AI techniques such as grammars to generate a musical composition. Readers interested in the history of AI and music are encouraged to read Miranda’s [4] survey. We believe that the intersection of AI and music is an ideal context for the study of computational creativity. Computational Creativity is the use of autonomous systems to generate and apply new ideas that would be considered creative in different disciplines of social life, including art, science, and engineering [5]. In this dissertation, we focus particularly on the creativity that is inherent in collaborative, improvised, musical performance, and we adopt Roads’s [6] assertion that both composition and performance can be usefully tackled using AI techniques. Research and practical efforts that pursue these objectives are commonly discussed under the title “Musical Metacreation” (MuMe) [7], which can be thought of as a subfield of Computational Creativity.

The study of Musical Metacreation (MuMe) techniques has been approached from several different perspectives, which we roughly organize into three subareas: Algorithmic Composition, Live Algorithms, and Musical Multi-agent Systems.

Algorithmic Composition is a subarea of MuMe that seeks to automate different aspects of musical composition, including orchestration, score editing, and sound synthesis [8]. For example, Stella [9] is an automated music representation system that can be used to edit musical scores.

Live Algorithms seeks to build autonomous systems that can perform in a collaborative musical setting, sharing the same privileges, roles, and abilities as a human performer. For example, ixi lang [10] is an interpreted programming language that produces musical events in response to instructions typed in real-time – a practice known as “coding live music” [11], or simply "live coding". Autonomy is a central concern when creating such systems, meaning that the interaction between the musician and the system must be strictly collaborative; neither should control the other. Elements of Algorithmic Composition, live electronics (e.g., Imaginary Landscape [12]), and free improvisation are often combined to satisfy this constraint [13]. While Algorithmic Composition aims to model different elements of the composition task, Live Algorithms seeks to model the creative abilities that can be observed when human musicians perform.

Musical Multi-agent Systems is a subarea of MuMe that seeks to model the composition and performance of music as a task that requires collaboration between multiple agents. The general concepts of multi-agent systems can be applied to MuMe in two ways:

multiple agents can be used to represent a single autonomous musician (e.g., a composer and a performer) [14], or the behaviour of multiple autonomous musicians can be represented as a single multi-agent system (e.g., a string quartet) [15]–[17]. Ideas related to computer networking are often used in this context, with communication protocols being defined and used to deliver messages between interacting agents.

Although many useful advances have been made in each of these three subareas, methods and architectures for combining such advances remain largely unexplored. Recently, Bown, Carey, and Eigenfeldt [18] developed “Musebot Ensemble” – a platform for agent-based music creation in which musical agents are developed individually and designed to accomplish a specific role in a larger ensemble. In their paper, the authors asserted that the agents in the ensemble (called “Musebots”) must be able to communicate musical ideas through a set of predefined messages, toward supporting collective performances that are analogous to those of human musical ensembles. While the Musebot Ensemble platform offers a basis for integrating various MuMe techniques, its model of how agents communicate can be improved. Specifically, it lacks support for direct communication between agents, choosing instead to allow only a special, centralized “Conductor” agent to communicate directly with each Musebot. The result is that some agents can be left unaware of decisions that are made by other agents, reducing their ability to perform well [19]. Furthermore, direct communication between agents is essential to certain kinds of music, where the need for real-time coordination is inherent to the musical style (e.g., small-group jazz [20]). In this dissertation, we propose an architecture for Musical Metacreation that offers two contributions. First, it extends the Musebot Ensemble platform with a model of direct communication between autonomous musicians. Second, it does so in a way that facilitates integrating recent advances in the MuMe subareas of Algorithmic Composition, Live Algorithms, and Musical Multi-agent Systems.

The remainder of this dissertation is organized as follows. We begin with a broad description of background in Artificial Music Intelligence. We then formulate our challenge and follow with an overview of related work, covering each of MuMe’s subareas in turn. We then present our architecture in two parts; we describe its overall structure and each of its component parts, and then explain how the parts interact with one another. We conclude by discussing our contributions and offering some suggestions for future work.

1.1 General Background

In this section, we provide a general background on artificial music intelligence (AMI), an overview of the most common AMI techniques that are currently used to build generative music systems, and a brief summary of the AMI subfields.

1.1.1 Artificial Music Intelligence

Music is a human cognitive activity that many different academic disciplines have studied, including Psychology [21], Mathematics [22], and Sociology [23]. A large group of researchers is interested in Artificial Intelligence (AI) models of music. The application of AI to music was mainly motivated by composers who, prior to the existence of computers, used algorithms as a means of inspiration for their musical compositions. For example, Mozart’s dice game [24] was a technique comprising a stochastic algorithm used to randomly generate notes in order to compose a melody. With the arrival of AI during the last century and the creation of programming languages such as Lisp and Prolog, composers became interested in developing computer systems that used algorithms based on AI techniques to generate music. For instance, Emmy and Emily Howell are computer programs that generate musical compositions [25]. While the origins of the intersection between AI and music favoured musical composition, several studies have been conducted in other areas of music, including performance (e.g., the flavor band system [26]), sound synthesis and analysis (e.g., Seewave [27]), and improvisation (e.g., JIG [28]). This thesis only addresses areas of music and AI that are related to creativity and collaboration.

1.1.2 Computational models of music creativity

The study of human behavior is a fundamental practice for much AI research. Studies have shown impressive advances in our ability to model human strategic thinking (e.g., game theory and decision theory in multi-agent systems [29] and simulation-based approaches to General Game Playing [30]). However, while some human behaviors have been effectively captured by computer models, others have only been modestly explored. Creativity is one human behavior whose cognitive basis is still difficult to understand. In spite of the unexplained nature of creativity, researchers argue that it is a feature of the human mind that computers could model. Margaret Boden [31], a researcher in cognitive science, states that there is nothing magical about creativity; it is just a natural characteristic of human intelligence that can be expressed in three forms: "combinational, exploratory, and transformational" [31]. Boden claims that these could be effectively modeled by AI [31].

Computational creativity (CC) is a subfield of AI that studies how to develop computer programs that autonomously model creative behaviors. Different artistic disciplines such as painting, storytelling, and music provide interesting models for CC. Today, we can find open-source projects that apply machine learning techniques to carry out artistic tasks. For instance, Magenta is a public framework built by Google with the goal of advancing AI research within the arts and music. This framework supports the application of machine learning to simulate different artistic disciplines, such as painting and music. The International Conference on Computational Creativity (ICCC) has been an important resource since 2010, publishing research related to computers being creative and evaluating computational creativity, generative systems, and interactive demonstrations.

Numerous computational models of musical creativity have been presented at the ICCC. The aim of these models is to create autonomous, generative music systems with results that are indistinguishable from human musical compositions. It is important to note that these systems are assessed by people, making it difficult to implement reliable methods of evaluation. From a scientific perspective, this is a research limitation; however, in this dissertation, we do not focus on evaluating how closely the music generated by our proposed system resembles the music created by humans. We describe our method of evaluation in Chapter 5.

1.1.3 Common algorithms applied to musical composition

Musical composition involves a series of activities that practitioners try to automate. For instance, musicians use musical notation to represent, store, and communicate musical ideas. The Musical Instrument Digital Interface (MIDI) is a protocol that encodes musical information such as pitch, velocity, and volume. For decades, MIDI has been a prominent form of communication between electronic instruments as well as a significant tool for using computers in musical composition. Implementing algorithms through a programming language is one way to compose music using computers. Programming languages are becoming more flexible, and it is possible to adapt programming interfaces to extend the capacities of the programming language. For example, jMusic is a programming library developed in Java that allows one to represent music in common notation, analogous to how musicians write music in a score, including elements such as parts, phrases, and notes; these elements are represented in the data structure of the jMusic library [32]. One can use this data structure to implement algorithms related to cellular automata, generative grammars, and genetic algorithms.
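To make this representation concrete, the sketch below models a jMusic-style note/phrase/part hierarchy in plain Python. The class and field names are illustrative assumptions for this dissertation's discussion, not the actual jMusic API.

```python
# Illustrative sketch (not the jMusic API): a minimal note/phrase/part
# hierarchy, mirroring how score-like data can be organized in code.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Note:
    pitch: int        # MIDI pitch number, e.g., 60 = middle C
    duration: float   # duration in beats, e.g., 1.0 = quarter note
    velocity: int = 90

@dataclass
class Phrase:
    notes: List[Note] = field(default_factory=list)

@dataclass
class Part:
    instrument: str
    phrases: List[Phrase] = field(default_factory=list)

@dataclass
class Score:
    tempo: float
    parts: List[Part] = field(default_factory=list)

# Example: a one-bar bass phrase inside a score.
bass_line = Phrase([Note(36, 1.0), Note(43, 1.0), Note(36, 1.0), Note(43, 1.0)])
score = Score(tempo=120.0, parts=[Part("bass", [bass_line])])
print(len(score.parts[0].phrases[0].notes))  # -> 4
```

Algorithms such as the ones discussed next can then read and write structures of this kind instead of raw MIDI events.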

1.1.3.1 Cellular automata

In 1948, John Von Neumann [33] introduced cellular automata, a mathematical concept that models a discrete dynamic system. A cellular automaton consists of two principal elements: first, a grid of cells in which each cell takes one of an arbitrary number of possible states (see Figure 1.1), and second, a set of transition rules used to update the state of each cell over time.

Figure 1.1: An example of a linear single Cellular Automaton.

Cellular automata have been studied by different disciplines [34], [35]. This concept has been applied to music with the goal of experimenting with alternative paradigms of musical composition. For instance, researchers have identified elements of musical composition (e.g., repetition and musical patterns) that are suitable to model with cellular automata. Brown [36] explored the generation of rhythmic patterns using a single linear cellular automaton. Brown assigned a binary set of states to each cell of the cellular automaton, where 1 represented a musical note and 0 represented a rest (see Figure 1.2).

Figure 1.2: Example of the binary and notational representations of the cellular automaton rhythm, extracted from Brown’s work [36]

To produce a new sequence of notes and rests, Brown initially experimented with what he called a totalistic rule. This rule updates the state of each cell according to the sum of 1s in its neighborhood (a subset of cells from the cellular automaton). The aesthetics of the resulting composition are shaped by the application of the rule in each repetition [36].
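As a sketch of this idea (our own minimal example, not Brown's code), the following one-dimensional binary automaton applies a simple totalistic rule, where the next state of a cell depends on the sum of 1s in its three-cell neighbourhood, and reads each generation as a rhythm of notes (1) and rests (0). The rule set and seed pattern are arbitrary choices for illustration.

```python
# Minimal 1-D binary cellular automaton driving a rhythm (illustrative only).
def step(cells, born={1, 2}):
    """Totalistic rule: a cell becomes 1 iff the sum of its
    three-cell neighbourhood (left, self, right) is in `born`."""
    n = len(cells)
    return [1 if (cells[(i - 1) % n] + cells[i] + cells[(i + 1) % n]) in born else 0
            for i in range(n)]

def as_rhythm(cells):
    # 1 -> note onset ('x'), 0 -> rest ('.')
    return "".join("x" if c else "." for c in cells)

cells = [0, 0, 0, 1, 0, 0, 0, 0]   # initial seed pattern
for generation in range(4):
    print(as_rhythm(cells))
    cells = step(cells)
```

Each printed line can be interpreted as one bar of a rhythmic pattern, and changing the rule set changes the character of the evolving rhythm.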

1.1.3.2 Grammar based music composition

Grammar is one of the principal features of every language used for communication. AMI pioneers explored the similarities between musical notation and natural languages. Studies on formal grammar (e.g., The Algebraic Theory of Context-Free Languages [37]) served as inspiration for the application of grammar-based methods in AI and music. In contrast to natural language, music escapes the problem of dealing with semantics, since music does not have any precise meaning [38]; thus, grammar-based techniques are one of the most popular approaches used in algorithmic composition.

The great majority of AI domains employ models based on theories of probability and statistics, from evaluation functions based on Monte Carlo tree search in games [39] to Bayesian networks [40]. Markov chains are commonly used in computer-based musical composition. A Markov chain represents a chance process where the outcome of an event can be influenced by the outcome of a previous event [41]. For a description of the Markov chain process, see Grinstead and Snell [41]. Translating this mechanism into a computer algorithm that outputs a melody requires an existing composition as input. Brown gives an example of this operation [42]. Given the melody described in Table 1.1, the probability of the note G depends on the occurrence of the previous note. This is determined by representing the likelihood of each note using a Markov matrix of all possibilities. Table 1.2 shows how Brown calculated the probabilities of each pitch by counting the repeated transitions from each note in the first column (e.g., the transition from D in the first column to E in the top row is repeated four times in the melody). After normalizing the values in the matrix so that the counts in each row sum to 1, the matrix can be used to generate a new melody of any length sharing similar characteristics with the previous one [42].

Table 1.1: Set of notes in an input melody

E, D, C, D, E, E, E, D, D, D, E, G, G, E, D, C, D, E, E, E, E, D, E, D, C

Table 1.2: Transition probability matrix from Brown’s Markov process example [42].

      C      D       E       G
C     0.0    1.0     0.0     0.0
D     0.3    0.3     0.4     0.0
E     0.0    0.4545  0.4545  0.0909
G     0.0    0.0     0.5     0.5
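A short sketch of the same procedure (our own illustration, not Brown's implementation): count the note-to-note transitions in the input melody of Table 1.1, normalize each row, and then sample a new melody from the resulting matrix.

```python
# Illustrative first-order Markov melody generator (not Brown's code).
import random
from collections import defaultdict

melody = ["E","D","C","D","E","E","E","D","D","D","E","G","G",
          "E","D","C","D","E","E","E","E","D","E","D","C"]

# 1. Count transitions between consecutive notes.
counts = defaultdict(lambda: defaultdict(int))
for a, b in zip(melody, melody[1:]):
    counts[a][b] += 1

# 2. Normalize each row so its probabilities sum to 1 (cf. Table 1.2).
table = {a: {b: c / sum(row.values()) for b, c in row.items()}
         for a, row in counts.items()}

# 3. Sample a new melody of any length from the transition table.
def generate(start, length):
    note, out = start, [start]
    for _ in range(length - 1):
        choices, weights = zip(*table[note].items())
        note = random.choices(choices, weights=weights)[0]
        out.append(note)
    return out

print(generate("E", 16))
```

The generated sequence follows the same local transition statistics as the input melody, which is why the result tends to "sound like" the original material.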

Cope’s [3] “Experiments in Music Intelligence” provides an example of a system based on this process of statistical analysis. Additional examples and further discussion of this method appear in Chapter 3.

1.1.3.3 Genetic Algorithms

Genetic algorithms (GAs) are part of the research being conducted on evolutionary algorithms. They were created in the early days of computer science and were mainly influenced by attempts to simulate biological processes. GAs play an important role in the process of algorithmic music improvisation due to their characteristics of mutation and recombination. Brown [42] explained this process as music improvisation based on the spontaneous combination of notes evolving continuously over time. Specifically, Brown [42] used GAs to generate a new melody from a given sample. First, he produced a random variation of the given sample. Second, he mutated the generated sample, applying arbitrary changes to some elements of the melody (e.g., pitch, rhythm, volume, etc.) and generating an extensive number of variations. Third, deciding which variation best fit the given sample involved a selection process. The selection process was carried out by applying a fitness function that rated the generated variations according to certain criteria. Finally, the highest-rated variations were chosen [42].
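The sketch below illustrates one iteration of this loop on a toy scale. It is a hedged example of the general technique, not Brown's system: the seed melody, mutation rate, and fitness criteria are arbitrary assumptions.

```python
# Toy genetic-algorithm step for melody variation (illustrative only).
import random

SCALE = [60, 62, 64, 65, 67, 69, 71, 72]          # C major, MIDI pitches
seed  = [60, 64, 67, 72, 67, 64, 60, 64]          # the given sample

def mutate(melody, rate=0.3):
    """Randomly replace some pitches with other scale tones."""
    return [random.choice(SCALE) if random.random() < rate else p for p in melody]

def fitness(candidate):
    """Toy criterion: prefer variations that stay close to the seed
    and avoid large melodic leaps."""
    closeness = -sum(abs(a - b) for a, b in zip(candidate, seed))
    smoothness = -sum(abs(a - b) for a, b in zip(candidate, candidate[1:]))
    return closeness + smoothness

population = [mutate(seed) for _ in range(50)]     # many variations
best = max(population, key=fitness)                # selection
print(best)
```

In a full GA, this step would be repeated over many generations, typically adding crossover between high-scoring variations before the next round of mutation and selection.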

1.1.4 Interactive musical systems for live music

The development of new music interface designs has been explored by researchers in the arts. Hundreds of novel music interfaces have been proposed over the years, such as Music Bottles [43], the Audio Shaker [44], and Musical Trinkets [45]. The International Conference on New Interfaces for Musical Expression (NIME) is an important channel for researchers who want to exchange knowledge about new music interface design. Artists are not the only ones who have expressed their passion for the development of new interactive forms of music; computer scientists and engineers have also developed software that supports a certain degree of musical interaction during live performance. One of the genres that revolutionized the use of computer software in music is electronic music. Making music by organizing sounds created with electronic devices challenged composers to work with complex digital artifacts such as computers. This prompted a whole new range of research questions, today related to investigations on sound synthesis and analysis. With the digitalization of the synthesizer, an electronic device that imitates acoustic instruments by mixing waveforms, composers have been able to store different instrumental sounds on computers. This has helped musicians to automate some elements of the synthesizer, such as speed, which allows them to create very fast passages and complicated rhythms that humans cannot perform. The film industry has also shown an interest in this type of music, especially for the creation of sound effects and futuristic melodies. In parallel, some musicians have adopted these electronic devices as tools for their live performances.

While some artists and engineers research sound qualities that are beyond human dexterity, others are interested in improving musicians’ collaborative performing abilities. Live Algorithms is an area of research dedicated to building systems that autonomously collaborate with musicians in a music ensemble during live performances. Blackwell, Bown, and Young [46] listed four attributes that they adopted from human performers.

• Autonomy. Blackwell, Bown, and Young [46] describe an autonomous system in music as an agent that bases its actions on the musical inputs that it receives in real time. The system might have to make decisions and respond in order to adapt its actions to the musical environment.

• Novelty. All artists are compelled to sustain a high level of originality; for instance, each time a jazz musician improvises a solo, he or she is creating a brand new composition. This fundamental feature of artistic production has been emulated by hundreds of computer algorithms, such as Biles’s [47] genetic algorithm for generating jazz solos.

• Participation. A computer system that collaborates in a musical group must be able to contribute continuous musical ideas and support future musical directions accordingly [46].

• Leadership. Blackwell, Bown, and Young stressed that a live algorithm must be able to propose a change in the musical direction, acting not only proactively but also receptively. This extends the role of a live algorithm in a musical ensemble to assuming direct control over decisions that impact the future outcomes of the sound.

Blackwell, Bown, and Young also described a framework for Live Algorithms called the P, F and Q architecture. P represents sound analysis of the incoming audio input (listening), F is the reasoning module of the system that processes the received audio by applying algorithms in a musical context (brain), and Q represents the functions of synthesis, converting the system’s output into audio and streaming the music (voice). In general terms, F represents the integration of algorithmic composition into the framework. The whole process is an abstraction of musicians’ listening, cognition, and production [46].
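A schematic reading of the P–F–Q decomposition could look like the following pipeline, where P listens, F reasons, and Q produces output. This is our own hedged sketch; the function names, data types, and the reactive rule inside F are invented for illustration and are not part of Blackwell, Bown, and Young's framework itself.

```python
# Hedged sketch of the P (listen) -> F (reason) -> Q (produce) pipeline.
# Function names, data shapes, and the decision rule are illustrative assumptions.
from typing import List

def P(audio_frame: List[float]) -> dict:
    """Listening: analyze incoming audio into musical features."""
    energy = sum(abs(s) for s in audio_frame) / max(len(audio_frame), 1)
    return {"loudness": energy}

def F(features: dict, memory: List[dict]) -> dict:
    """Reasoning: decide what to play next from the analyzed features."""
    memory.append(features)
    answer_loud = features["loudness"] > 0.5   # simple reactive rule
    return {"pitch": 72 if answer_loud else 60, "velocity": 100 if answer_loud else 60}

def Q(decision: dict) -> str:
    """Production: render the decision as an output event (stub)."""
    return f"play pitch={decision['pitch']} velocity={decision['velocity']}"

memory: List[dict] = []
incoming = [0.9, -0.8, 0.7, -0.9]              # a loud analysis frame (fake input)
print(Q(F(P(incoming), memory)))
```

In a real live algorithm, P would perform audio feature extraction, F would host the algorithmic-composition logic, and Q would drive a synthesizer or MIDI output in real time.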

1.1.5 Communication protocols in a musical multi-agent system

One of the challenges of multi-agent systems is organizing the agents’ activities so that they coordinate with each other. Consider, for example, the process through which the members of a music ensemble are able to coordinate amongst themselves during a live performance. The music ensemble is a network of musicians who interpret different parts of a musical score using their instruments or their voices; this is transformed into a piece of music and delivered to the audience. In order to operate efficiently, the ensemble must work in a coordinated manner. The accurate coordination of decisions and management of actions among musicians is what ultimately determines the efficient achievement of the ensemble’s goal. This structure has been modeled by different multi-agent applications, such as Wulfhorst, Nakayama, and Vicari’s [15] multi-agent approach for musical interactive systems and Wulfhorst, Flores, Nakayama, et al.’s [48] open architecture for musical multi-agent systems. These models require a communication mechanism that organizes the distribution of activities within the ensemble. Agent communication is defined by a language and interaction protocols (e.g., conversations) that serve a certain purpose (e.g., a negotiation or a request for something). In the context of coordinated activities, which is a main concern in multi-agent systems, researchers have designed specifications of agent communication languages (ACLs). These are messages composed of different elements such as a sender, a receiver, a performative verb (the type of message), an ontology, etc.

The earliest attempts to use ACLs date back to the late 1970s, when Cohen and Perrault [49] proposed a plan-based theory of speech acts. A speech act is defined by two parts: 1) a performative element, which is a communicative verb that distinguishes between different intended meanings; and 2) a propositional element, which indicates what the speech is about. Some of the most commonly used ACL standards are the Knowledge Query and Manipulation Language (KQML) and the agent communication language of the Foundation for Intelligent Physical Agents (FIPA). While both are message standards based on speech acts, in this dissertation we focus on FIPA because it is part of the implementation phase of our proposed approach.

1.1.5.1 The FIPA Protocols

FIPA is a collection of standards that improve interaction between agents from multiple platforms. This collection includes an agent interaction protocol suite (AIPS). As Poslad [50] asserted, instead of being a single agent communication language, AIPS is a set of semantic protocols for agent communication: "Interaction Process, Communicative Acts, Content Logic, and Content Ontologies" [50]. Communicative acts (CAs) are fundamental to the FIPA agent communication language (FIPA-ACL), which is based on speech act theory and composed of a set of message definitions that represent actions or performatives. For instance, "propose", "accept proposal", "reject", and "refuse" are some of the most commonly used communicative acts. Furthermore, these communicative acts are the foundation for the standardization of several interaction protocols, which consist of exchanging a sequence of messages to coordinate agents’ actions. The existing FIPA interaction protocols are listed in Table 1.3. Generally speaking, the FIPA interaction protocols either request that a task be performed or share information among the participants in a multi-agent system. In the FIPA request interaction protocol, an agent (the initiator) asks another agent (the responder) to perform an action. The responder evaluates the request and decides whether to accept or refuse it. From a music perspective, this kind of protocol is a powerful tool for negotiating which passages agents will perform. We demonstrate this capability in Chapter 4.

Table 1.3: FIPA Interaction Protocols

Interaction Protocol       Task    Info sharing
Request                     X
Request-when(ever)          X
Query                               X
Contract-Net                X
English/Dutch Auction       X
Broker                      X
Recruit                     X
Subscribe                           X
Propose                     X
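To make the request protocol concrete, here is a minimal sketch of the message flow between two musical agents. It only mimics the shape of FIPA-ACL messages (performative, sender, receiver, content); the class names and the busy/idle rule are illustrative assumptions, not the FIPA specification or a real agent platform API.

```python
# Illustrative FIPA-request-style exchange between two musical agents
# (message fields mimic FIPA-ACL; this is not a real FIPA/JADE implementation).
from dataclasses import dataclass

@dataclass
class ACLMessage:
    performative: str   # e.g. "request", "agree", "refuse", "inform"
    sender: str
    receiver: str
    content: str

class Accompanist:
    def __init__(self, name, busy=False):
        self.name, self.busy = name, busy

    def handle(self, msg: ACLMessage) -> ACLMessage:
        # Responder side of the request protocol: agree if able, else refuse.
        if msg.performative == "request" and not self.busy:
            return ACLMessage("agree", self.name, msg.sender, msg.content)
        return ACLMessage("refuse", self.name, msg.sender, msg.content)

# Initiator (the leader) asks the piano to play an accompaniment.
leader = "sax"
piano = Accompanist("piano")
request = ACLMessage("request", leader, "piano", "play accompaniment, section A")
reply = piano.handle(request)
print(reply.performative, "-", reply.content)   # -> agree - play accompaniment, section A
```

The same message skeleton generalizes to the other protocols in Table 1.3 by changing the allowed sequence of performatives.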

1.2 Musebot Ensemble

Musebot Ensemble is an open-source project through which the community of researchers on MuMe can share their ideas and create autonomous agents that perform in a collaborative manner [18]. Despite the novelty of its architecture, this generative music system has certain limitations. We describe these limitations in the following subsections.

1.2.1 Limited real-time capabilities

The original intention of the ensemble was for it to facilitate interactivity that allows Musebots to work together, contribute to a common creative outcome, and complement each other, thus leading to creative music. In its current form, Musebot Ensemble does not exhibit elevated levels of creative autonomy [51]. One characteristic of an intelligent musical system is the ability to independently create music in real time, and enabling computer music systems to perform in real time produces a series of interesting challenges. With the incorporation of electronic synthesizers in the early 80s, researchers started to focus on developing software that overcomes those challenges. This created an interaction between personal computers and musical instruments, leading music to a technological era of live electronics [52].

With technological advances and the evolution of AI, software has become increasingly more automated, creating a new branch of inquiry in the domain of MuMe. For instance, to avoid human intervention, a computational model of music creativity will need to listen to its musician peer, and instantaneously process and react to what it has heard [53]. Eigenfeldt and Pasquier [53] proposed a set of real-time generative music systems, using aspects of artificial intelligence and stochastic methods [53]. However, the evaluation of the effectiveness of their experiment has yet to be concluded [53]. The complexity of simulating human listening, understanding, and response abilities remains an open question for research in Live Algorithms. Incorporating Live Algorithms into the architecture of Musebot Ensemble is expected to improve the quality of synchronizing and tracking beats [54]. A key capability of Live Algorithms is to allow performances to be synchronized between computational and human musicians. This functionality further allows the system to track any changes in the structure of the music and take corrective measures in real time. Brown and Gifford [55] argue that implementing Live Algorithms allows systems to make corrections to the changing structure of music in real time.

1.2.2 Changing musical roles

Musical improvisation consists of conceiving musical ideas and playing them simultaneously during a performance; it is a traditional practice in music performance that requires a deep understanding of music and mastery of a musical instrument [56]. Musical improvisation is one of the main features of jazz. Consequently, several studies on artificial musical intelligence have based their investigations on this style of music (e.g., [57] and [58]). Both performing an accompaniment to support a musician and taking musical solos in an improvised fashion have been tackled through Algorithmic Composition, with exceptional results. However, a musician’s ability to trade between the roles of accompanist and soloist remains unexplored.

Implementing Algorithmic Composition can improve the Musebot Ensemble’s capacity to compose either an accompaniment or a solo, based on the specification of the musical structure. Such an implementation essentially allows musical compositions to be generated autonomously through computer algorithms.

1.2.3 Collaboration between agents

Incorporating Musical Multi-Agent Systems into the Musebot Ensemble would allow musicians to collaborate while performing music through interaction protocols. It would essentially synchronize the musicians’ activities and allow different musical tasks to be scheduled for each member of the ensemble, via direct communication among agents. The complexity of such protocols could make them difficult to implement in MuMe. However, various studies have pointed to the possibility of them being feasible through the adoption of specific communication protocols [55].

1.3 Summary

In this chapter, we gave a brief background of AMI research. We started by discussing common computational models of music creativity. We then addressed some of the mechanisms used in earlier studies of AI and music, including cellular automata, grammar-based music composition, and genetic algorithms. Furthermore, we listed the different communication protocols provided by FIPA. Finally, we described the Musebot Ensemble and its limitations. In Chapter 2, we will discuss the challenge of creating an AMI system with a focus on communication.

Chapter 2

Problem Formulation

The study of human cognition and how it can be modeled has contributed to several improvements in our daily lives, including “smart systems” that use AI techniques to ease users’ experiences. Inspired by this perspective, we view the study and emulation of human musical abilities as an important avenue to explore in pursuit of autonomous musical systems, which are computer programs based on AI techniques. Several autonomous musical systems have been developed over the years that propose a vast number of approaches to either aid musicians with composition or generate original compositions that are considered to be aesthetically creative. In this chapter, we describe the specific challenges within this area that we seek to address in our work, and we motivate why jazz is a useful domain for our research.

2.1 Our vision: an ideal AMI system

Researchers’ initial approaches to AI and collaborative music began in the 1980s due to advances in music communication technology. The benefits of MIDI provided researchers with mechanisms to implement interactive computer performers [59]. Before this, studies related to computer music were concentrated on implementing computer software that individually generated compositions in a monophonic or polyphonic manner, which provided promising results [59]. However, each instrument in the composition was unaware of the intentions of the other instruments [59]. While computer software was built to emulate the abilities of live human performers, such as adjusting the tempo and anticipating mistakes [59], the novelty of these systems lies in their capability to simulate cognitive musical processes. For instance, a skilled live performer is capable of becoming a member of any musical ensemble regardless of the style of music intended by the ensemble, and sometimes he or she will have the ability to play any type of musical instrument.

Commonly, interactive computer performers are constrained by their fellow participants. The human musician who is playing alongside the computer performer will dictate the flow of the song, and the computer software will rely on the musical inputs provided by its human peer in order to process the musical information and then output its musical contribution. Thus, the computer performer cannot be involved in leadership tasks and cannot propose any changes in the structure of the song, limiting it to performing merely as an accompanist [59].

De Mantaras and Arcos [60] classified three types of computer music systems concerned with AI techniques: compositional, improvisational, and performance systems. They reviewed different approaches to each of these areas and how these have individually grown under the domain of AI and music. We believe that a generative music system should be able to integrate all three components into a single system, as a talented human musician is able to compose, improvise, and perform. While integrating these three abilities into one system could present various challenges, doing so will enhance the generative music system’s capabilities and will enable it to more closely approximate a human musical performance. In our view, an ideal AMI system must be able to: (1) perform autonomously, (2) communicate musical information, (3) perform alternative musical styles, and (4) take on different roles within a musical ensemble.

2.1.1 Autonomy

We envision an ideal AMI system as an autonomous music generator that does not rely on any human intervention. In contrast to most of the previously proposed interactive music systems that collaborate with human musicians, an autonomous AMI system should be able to perform in a musical setting without expecting any input from a human musician.

2.1.2 Musical communication

An ideal AMI system must be able to communicate musical ideas and understand topics related to the musical context, perhaps by encoding these topics into conversations. Furthermore, it must be able to hold multiple conversations in order to negotiate or request musical tasks. To do so, it must be aware of the rest of the musicians’ intentions by receiving information from them. It must be able to exhibit some degree of collaboration and cooperation by exploiting the different communication elements.

2.1.3 Performance of alternative musical styles

We have already mentioned how musicians develop the skills to perform different musical styles: they have the necessary talent to perform as members of a rock band, a jazz ensemble, a classical music group, etc. Generally, a generative music system will be aesthetically similar to the genre of music that its creator has pre-established. An ideal AMI system should be able to perform any musical style, for this will allow the system not only to be computationally creative, but also to be creative using musical structures that are aesthetically closer to those produced by humans.

2.1.4 Taking on different roles within a musical ensemble

Pressing [61] highlighted the critical role of creativity in free jazz improvisation, addressing the expressive tension that is felt by the members of a jazz ensemble and suggesting that this tension is traditionally handled via role playing. Here, each instrument performs a nominal part in the larger work. More specifically, Pressing described the role of the bass player as providing chord foundations and time, the drums as setting up and maintaining the rhythm, and the soloist as playing the main melody [61].

In a jazz ensemble, the soloist typically makes musical decisions in relation to what the rest of the musicians are doing [62]. For instance, in Bastien and Hostager’s case study [20], the first soloist determines the structure of the song and conveys this information to the other players. In this dissertation, we refer to the soloist as the leader of the ensemble and to the rest of the members as the accompanists. In an ideal AMI system, the musical agent would be capable of playing as either a leader or an accompanist. As a leader, the agent should have the main voice in the ensemble, determining the structure of the song. It would also be the first to perform the main melody of the song and the first to assign musical tasks to the accompanists. As an accompanist, it should support the leader and respond to the leader’s requests. It should also provide compelling musical accompaniments according to the specifications of the musical structure. Simulating musicians’ ability to trade between roles has remained unexplored; therefore, the difficulty of doing so remains unknown.

2.2 Our focus: enabling communication between musical agents

In this work, we focus particularly on the challenge of enabling communication between musical agents. In general, an autonomous generative music system must possess attributes connected to Gardner’s theory of multiple intelligences [63]. According to this theory, a human being is endowed with eight types of intelligence: logical–mathematical intelligence, verbal–linguistic intelligence, interpersonal intelligence, bodily–kinesthetic intelligence, musical intelligence, visual–spatial intelligence, intrapersonal intelligence, and naturalistic intelligence.

Despite the distinct disciplines these types of intelligence are related to, they are not independent from one another. On the contrary, these intelligences are combined when performing specific activities such as playing a sport or performing in a musical ensemble [64]. For example, people who want to pursue a musical profession will need to develop a high level of musical intelligence to understand musical structure, compose rhythms and musical patterns, recognize pitches, and follow variations in the music. Additionally, they will need to develop bodily–kinesthetic intelligence to coordinate their body movements in order to play an instrument and simultaneously express musical ideas in a collaborative musical performance (e.g., body movement that indicates when the music is about to end) [64].

Musical performance researchers [65] have studied social interaction in a musical ensemble, finding that facial expressions and body movements are significant forms of communication among musicians. Further, verbal feedback during rehearsal is essential for musicians’ professional development [65]. Therefore, one set of abilities that merits emulation enables and supports direct communication between musicians, spanning both verbal and non-verbal modes. Specifically, musicians’ abilities to negotiate, synchronize, compose, and perform with others are essential in the context of collaborative musical improvisation [66]. Noting that the capacity for direct, inter-musician communication is lacking in most AMI systems, we seek to address this lack.

2.3 Jazz as a good domain for AMI research

Jazz is a genre of music in which performers exhibit an elevated degree of collaboration. This is due to the constant spontaneous variations the musicians must adapt to during a song, as well as the impromptu creation of a new composition each time a musician plays a solo. This practice forces musicians to collaboratively listen and respond to one another [67]. Analyses of such interactions from a social perspective have been undertaken by different disciplines within the social sciences. For instance, from their case-study observation of a jazz quartet’s performance, Bastien and Hostager [20] concluded that musicians engage in direct verbal and non-verbal communication across three distinct modes: instruction, cooperation, and collaboration.

Furthermore, Sawyer [64] reported that jazz musicians relate music to speech. This is perhaps one origin of implementing grammar-based techniques in generative music systems. However, most formal grammatical implementations are strongly related to the structure of music and have failed to address problems relevant to musical performance [64]. In spite of this, Sawyer argued that collaborative musical performance and language are heavily connected as means of communication. Sawyer’s work focused on the observation of jazz improvisation and the similarities that its features share with everyday conversation. He proposed that everyday conversations are creatively improvised, and neither the speaker nor the interlocutor can know for certain what the next line in the conversation will be; both are guided through the conversation by facial or visual expressions [64]. Sawyer uses the term "collaborative emergence" to describe the unpredictable directions in which a jazz ensemble can go.

In this work, we adopt and extend Bastien and Hostager’s [20] case study as a model of the desired behavior of an AMI system. This model must be customized to exhibit attributes related to the previously mentioned characteristics of musical performers, including autonomy, musical communication, performing alternative musical styles, and taking on different roles within a musical ensemble.

2.4 Expected challenges

Modeling Bastien and Hostager’s [20] work in an AMI system will be challenging, since such a simulation requires a process of negotiation among the members of the ensemble that has not been explored in prior AMI work. Furthermore, an analysis of the intrinsic limitations of contemporary advances in the field of MuMe is key to formulating strategies that pursue our vision of an ideal AMI system. Therefore, the goal of incorporating contemporary advances in MuMe will serve as a foundation for implementing all of the described functionalities of an AMI system. In doing so, we aim to show that different MuMe techniques, such as live algorithms, algorithmic composition, and musical multi-agent systems, can be successfully integrated into an AMI system.

2.5 Summary

Generally, AMI is based on the notion of computational creativity [18]. The essence of the field of computational creativity is to advance the development of artificial systems in order to demonstrate creative behavior or produce original artifacts [68]. In essence, computational creativity research such as MuMe endows machines with the capability to perform functional musical tasks. Such musical tasks include combination, accompaniment, composition, and interpretation [18], and existing systems that exhibit all of these characteristics are limited, as we discuss in Chapter 3. We aim to create a system that achieves the above-mentioned musical tasks and enables our vision of an ideal AMI system.

In this chapter, we described the different abilities that an ideal AMI system should have and explained our decision to focus on enabling direct musical communication. To enable the creation of ideal AMI systems, we aim to design and demonstrate a platform that allows the different techniques of MuMe’s subareas to be integrated and that supports direct communication between musicians in an ensemble.

Chapter 3

Related Work

The community of researchers studying MuMe has grown over the years, with projects like the Musebot Ensemble platform and research networks like Live Algorithms for Music (LAM) seeking to encourage interest and integration in the context of musical creativity. As a result, numerous papers have been published in the MuMe subareas of algorithmic composition, live algorithms, and musical multi-agent systems, as considered in the subsections that follow. We will conclude our review of related research by summarizing recent efforts to facilitate and promote integration across MuMe's subareas, along with a section devoted to commercial music software.

3.1 Algorithmic Composition

Commonly, musicians who compose without the help of any technological device will think of an idea and work diligently with it on a piece of paper from concept to completion [69]. However, if we encapsulate this process into computer software, the software can develop a composer's idea using various parameters to generate a diverse series of versions of compositions based on the original idea [69]. This assertion was made by Edwards, when he expressed that the advantages offered by computer software inspired him to explore the formalization of music composition through computer algorithms. Furthermore, Edwards stressed why algorithms are useful in simulating the music composition process. He explained that composers write music on a score, or in any other format, in an algorithmic manner. Therefore, algorithmic composition formalizes this process and makes it explicit by using a finite set of step-by-step procedures encapsulated in a software program to create music [69]. Algorithmic Composition (AC) is the basis of creative music passages in a generative music system. Moreover, it has served as a foundation for using AI methods at the intersection of computers and music [8]. Furthermore, AC is an area of research that has contributed to several technological advances in the music industry, as many tools have been created to help musicians automate their composition tasks. A great majority of contemporary music uses computer systems to generate or modify music during live performances. However, in this section, we focus only on algorithms for composition that involve the application of AI, and particularly on those that do not require any human intervention.

3.1.1 Generative grammars

Generative grammars and Markov models were among the first AI methods used in AC [70]–[74]. The key to applying grammars in algorithmic music is to establish a set of rules that will define the generative composition. Applying these rules to musical attributes can be achieved using production rules from a Context-Free Grammar (CFG). Perchy and Sarria [75] developed a stochastic CFG that generates harmonic progressions based upon J.S. Bach's rules of homophonic composition [75]. Additionally, a rule-based generative music system was engineered by Wallis, Ingalls, Campana, et al. [76]. While Perchy and Sarria's system used a CFG to generate musical sequences that provide ideas for musicians to use in their compositions, the Wallis, Ingalls, Campana, et al. system focused on emotional music synthesis. Both systems are limited in scope with respect to real-time music composition in multi-agent systems. Some grammar-based methods have been combined with evolutionary algorithms, creating a compound approach to algorithmic composition. For instance, work on grammatical evolution in automatic composition was proposed by Puente, Alfonso, and Moreno [77]. The goal of this work is to provide an alternative to algorithmic composition, merging genetic algorithms and grammar-based mechanisms to create music that sounds comparable to that played by humans [77]. Despite the connections between human composition and algorithmic composition, there are problems beyond the quality of the music being generated, including a musician's capacity to interchange roles during the production of music (e.g., musicians who play a solo right after playing an accompaniment). Other approaches involve a variation of formal grammars. A Lindenmayer System (L-system) is based on a parallel rewriting system. A definition of a rewriting system is given by Prusinkiewicz and Lindenmayer in [78] as follows:

"Rewriting is a technique for defining complex objects by successively re- placing parts of a simple initial object using a set of rewriting rules or produc- tions " [78].

The main difference between formal grammars and L-systems is defined by how they apply their production rules: while a formal grammar applies its rules in sequence, an L-system applies its rules in parallel. McCormack [79] described an implementation of rewriting rules over a representation of pitch, duration, and timbre, which are encoded as grammar symbols and stored in strings. After simultaneously applying the production rules to these strings, the result is translated to MIDI data, which allows the system to play the composition through various synthesizers. In his paper, McCormack reported that, in contrast to previous grammar-based methods, L-systems allow the development of complex musical patterns and the exploitation of note attributes, such as velocity and volume. Furthermore, SLIPPERY CHICKEN is an open-source algorithmic composition system that combines different techniques used in AC. This system implements L-systems in three ways: "as a straightforward cycling mechanism; as a simple L-System without transition; and as a Transitioning L-system" [80]. However, these methods do not yet address requirements involving musicians' behaviors, such as negotiation and cooperation. Abdallah and Gold [81] offer a detailed explanation of grammar-based models and Markov models, as well as a comparison between them.
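
To make the parallel-rewriting idea concrete, the following Java sketch applies an L-system to melodic material. The two-symbol alphabet, its production rules, and the mapping from symbols to MIDI pitches are illustrative assumptions only; they do not reproduce McCormack's or SLIPPERY CHICKEN's implementations.

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class LSystemMelody {

        // Production rules: every symbol is rewritten simultaneously on each pass.
        private static final Map<Character, String> RULES = new HashMap<>();
        static {
            RULES.put('A', "AB");   // hypothetical rule: A -> A B
            RULES.put('B', "A");    // hypothetical rule: B -> A
        }

        // Map grammar symbols to MIDI pitch numbers (C4 = 60, E4 = 64).
        private static final Map<Character, Integer> PITCHES = new HashMap<>();
        static {
            PITCHES.put('A', 60);
            PITCHES.put('B', 64);
        }

        // Apply the rules to every symbol of the current string in parallel.
        static String rewrite(String axiom) {
            StringBuilder next = new StringBuilder();
            for (char symbol : axiom.toCharArray()) {
                next.append(RULES.getOrDefault(symbol, String.valueOf(symbol)));
            }
            return next.toString();
        }

        public static void main(String[] args) {
            String current = "A";   // the axiom
            for (int generation = 0; generation < 5; generation++) {
                current = rewrite(current);
            }
            // Translate the final string into a pitch sequence (e.g., for MIDI output).
            List<Integer> melody = new ArrayList<>();
            for (char symbol : current.toCharArray()) {
                melody.add(PITCHES.get(symbol));
            }
            System.out.println("Symbols: " + current);
            System.out.println("Pitches: " + melody);
        }
    }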

3.1.2 Evolutionary methods

Researchers have also been interested in evolutionary methods, in which a subset of solutions is generated from an initial set and then evaluated using a fitness function to measure their quality [82]. Evolutionary Algorithms (EAs) are commonly used for the generation of music patterns. However, a common problem within the music context is the dilemma of defining automatic fitness functions [8]. Tokui and Iba [83] provided a system in which the preference encoded by the fitness function is defined implicitly by the user. The system combines genetic algorithms with genetic programming, resulting in the generation of rhythm patterns. Although the system promotes interaction between humans and evolutionary algorithms through a novel composition tool, the fitness values are supplied by a user in advance. Therefore, the system's autonomy is limited by its user. EAs are often combined with other AI methods, given their reproductive nature. We will discuss a hybrid system that combines Artificial Neural Networks with EAs in the next subsection.
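
As a concrete illustration of the fitness-function dilemma, the sketch below evolves 16-step rhythm patterns with a hard-coded placeholder fitness that merely rewards onsets on quarter-note positions. In a system like Tokui and Iba's, that judgement would come from a human user instead, which is precisely the autonomy limitation noted above; the population size, mutation scheme, and fitness here are assumptions made only for illustration.

    import java.util.Arrays;
    import java.util.Comparator;
    import java.util.Random;

    public class RhythmEvolution {
        static final Random RNG = new Random(42);
        static final int STEPS = 16, POPULATION = 20, GENERATIONS = 50;

        // Placeholder fitness: reward onsets on quarter-note positions, penalize the rest.
        static int fitness(boolean[] pattern) {
            int score = 0;
            for (int i = 0; i < STEPS; i++) {
                if (pattern[i]) score += (i % 4 == 0) ? 1 : -1;
            }
            return score;
        }

        // Mutation: flip one randomly chosen step of the parent pattern.
        static boolean[] mutate(boolean[] parent) {
            boolean[] child = parent.clone();
            child[RNG.nextInt(STEPS)] ^= true;
            return child;
        }

        public static void main(String[] args) {
            boolean[][] population = new boolean[POPULATION][STEPS];
            for (boolean[] p : population)
                for (int i = 0; i < STEPS; i++) p[i] = RNG.nextBoolean();

            for (int g = 0; g < GENERATIONS; g++) {
                // Sort by fitness (best first), then replace the worst half with mutants of the best half.
                Arrays.sort(population, Comparator.comparingInt(RhythmEvolution::fitness).reversed());
                for (int i = POPULATION / 2; i < POPULATION; i++)
                    population[i] = mutate(population[i - POPULATION / 2]);
            }
            Arrays.sort(population, Comparator.comparingInt(RhythmEvolution::fitness).reversed());
            StringBuilder best = new StringBuilder();
            for (boolean hit : population[0]) best.append(hit ? "x" : ".");
            System.out.println("Best pattern: " + best);
        }
    }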

3.1.3 Artificial neural networks

Another method that is widely implemented in AC is the Artificial Neural Network (ANN), in which interconnected processing units are often used to accomplish a pattern recognition task [84]. For example, Goldman, Gang, Rosenschein, et al. [85] developed a hybrid system based on the communication and cooperation of agents. These agents applied heuristic rules to solve problems relevant to polyphonic music composition, in real time. Melodies produced by the system are generated by neural networks that predict the expected value of each subsequent note in the melody. Although Goldman, Gang, Rosenschein, et al. successfully modeled some aspects of inter-agent communication (such as agreeing on which notes to play together), other important aspects were not included in the model (e.g., cueing transitions or negotiating over necessary tasks). Furthermore, their system focused primarily on modelling the cognitive processes of a human musician and their individual capacity to undertake different musical tasks (e.g., analyzing possible combinations of notes and performing at the same time). Nishijima and Watanabe [86] used ANNs to learn musical styles. Moreover, the Recurrent Neural Network (RNN) is a type of ANN that has recently gained popularity in language and handwriting recognition because of its capacity to store information about what has been computed so far over an arbitrary number of sequence steps [87]. Nayebi and Vitelli [88] experimented with algorithmic composition to compare two types of RNNs, the Long Short-Term Memory (LSTM) and the Gated Recurrent Unit (GRU) [88]. Their system takes an audio waveform as input and generates a new musical composition as output. Nayebi and Vitelli [88] asserted that the music generated by their LSTM network was comparable to music created by humans. Furthermore, Eck and Schmidhuber [89] implemented LSTM networks to study Blues music and were able to compose singular compositions in that style [89]. An approach for increasing the chance of producing pleasant melodies by evolving RNNs was reported in [90]. Finally, an overview and taxonomy of methods in AC is provided in Fernández and Vico's survey [8]. Despite the capacity of AC to automate certain aspects of music composition, it remains challenging to represent features that are exclusively in the domain of communication between agents, such as the ability to share musical ideas with another musician.

3.2 Live Algorithms

Most of the research conducted on live algorithms focuses on beat tracking and sound analysis and their respective challenges. This research is aimed at allowing artificial musicians to synchronize performances and effectively take turns playing with human musicians. ANTESCOFO is an example of one such work. It is an anticipatory system built on the coordination of two agents [91]. One of these agents handles audio and the other handles tempo, and together they help the music synchronize more effectively and accurately with the musical partners. ANTESCOFO is an excellent example of computational music, wherein the synchronization of the two elements (the audio agent and the tempo agent) offers impeccable results. The interesting thing about this system is that it is capable of predicting changes in the structure of the music and adapting to those changes in real time with efficiency and accuracy. The resultant music can be considered dynamic in nature, with the ability to create new structures as and when required. It can produce enjoyable music by identifying patterns and structures by itself, and it provides a sense of accomplishment to its users. Genjam, similar to ANTESCOFO, is also capable of performing with human musicians [47]. The effects are similar to ANTESCOFO, with varying forms of musical creation. Music can be created using these programs without employing actual performers, since the algorithm is able to follow programmed instructions in a systematic manner without being monitored [71]. According to Biles [47], the most important feature of this system is its capability to trade fours and eights. Fours and eights are the parts of a jazz performance where soloists exchange and improvise musical ideas turn by turn in a way that mimics verbal communication. A characteristic of live algorithms is that they are able to improvise just by following the leader of the ensemble, without the need for the leader to give instructions. Interaction between musicians in a musical ensemble could also be represented by the self-organization components of swarms [92]. Alternatively, Harrald [93] discussed interactive improvisation in musical ensembles using the game-theoretic concept of the Prisoner's Dilemma. Finally, members of the research network "Live Algorithms for Music" have provided a description of live algorithms, wherein they classified the different attributes a live algorithm must possess [46]. Research on live algorithms develops automatic systems capable of performing in coordinated musical settings and sharing the same privileges, abilities, and roles as human performers [46]. Ixi Lang, previously mentioned in Chapter 1, is one such interpreted language, able to produce musical events based on the instructions provided to it by typing in real time [10]. This practice is known as live music coding, or live coding. While we believe that the study of live algorithms is essential to the development of artificial musical intelligence, it has thus far only addressed the challenges of building autonomous systems that can perform together with humans. In contrast, our work seeks to understand and address the challenge of having multiple autonomous systems perform together, without any reliance on human participation.

3.3 Musical multi-agent systems

In the context of the intersection between computers and music, an agent can be used to study various problems concerning musical intelligence [94]. For instance, a common approach is to represent the members of a musical ensemble as agents. Although several studies in multi-agent systems have shown advances in the representation of musicians playing in a musical group, the simulation of intelligent musical behaviors has also been addressed frequently. Therefore, this section is divided into three subsections: research related to multi-agent systems for collaborative music, research toward the simulation of musicians' behaviors, and research toward the representation of musical structures.

3.3.1 Multi-agent system for collaborative music

Musical multi-agent systems are commonly used to enable collaborative performances of music when all the musicians are performing in the same space. However, a different perspective has been provided by many researchers, including [16], [95] and [96]: the concept of performance through networking offers a compelling abstraction for musical multi-agent systems. In networked music performance, where musicians perform in different physical locations using the Internet as a communication channel, challenges such as "low delay audio capture and transmission, time synchronization and bandwidth requirements" [16] are common. Collaborative music communication via the Internet thus needs a standardized protocol to govern the messages exchanged between different agents [16]. The protocol becomes the foundation of the music created, as it maintains the tempo, rhythm, and harmony of the work. Various methods have been discussed to both implement and sustain multi-agent musical communications and represent the cognitive agents used for the music. Several such protocols have been developed by researchers for multi-agent systems. MAMA is one example of a multi-agent system, developed by a group of researchers [14]. The system assists agent interaction through the use of musical acts, which are based on the theory of speech acts and implemented in a variety of agent communication languages such as FIPA-ACL [14]. Murray-Rust, Smaill, and Edwards describe the implementation of their system architecture, in which two agents, a musician and a composer, use a set of formal musical acts to develop a more precise understanding between them [14].

3.3.2 Multi-agent system for simulating musicians' behaviors

There are a variety of natural human behaviors implicit in musical intelligence. One important musical behavior is the ability to play one's instrument following a determined tempo. MMAS, developed by Wulfhorst, Flores, Nakayama, et al. [48], is a system that simulates the behavior of a musician in an ensemble through the interaction of musical events. The aim of this system is to simulate the ability exhibited by musicians when they are playing their instruments in synchrony [48]. However, Wulfhorst, Flores, Nakayama, et al.'s approach to agent synchronization requires sending MIDI events to a centralized "blackboard" to mitigate possible protocol communication delays. In spite of the efficacy of the system, this method of synchronization does not allow each agent to play its own instrument. Similarly, Oliveira, Davies, Gouyon, et al. [97] explored the natural human ability to follow the beat of the music; this ability is commonly exhibited by people, for example, when someone taps a foot or moves their head in unison with the beat [97]. Their beat synchronization system was based on a multi-agent architecture, which tracked tempo and beats by automatically monitoring the state of the music [97]. Dixon's system simulated this beat-tracking ability without obtaining any information about the music, including time signature, style, and tempo [98]. Dixon reported that his system is capable of following the beat of different styles of music with a minimal margin of error [98]. The connection of music to human emotion is a characteristic often exploited by different products, such as computer games and motion pictures. Inmamusys is a multi-agent system that aims to develop music based on the response received from the user, given a specific profile containing emotional content [99]. Emotional music has found increased traction in recent years, having become increasingly popular with the public; such music is increasingly used to catch the attention of prospective listeners, and listeners have become open to the creation and display of new, diverse kinds of music. The system creates a music piece that attempts to match an emotional profile; users can request upbeat or down-tempo songs based on their moods [99]. Finally, similar to AC, computer music frameworks based on multi-agent systems have been developed to stimulate human musical creativity, providing musicians with new ideas that can be used to generate unique compositions. Navarro, Corchado, and Demazeau's [100] system assists composers in creating harmonies through a multi-agent system based on virtual organizations [100]. Despite the promising results offered by these multi-agent systems, they lack the musicians' ability to take on multiple roles during the performance of a song.

3.3.3 Multi-agent system towards the representation of musical structures

While the efforts of Wulfhorst, Flores, Nakayama, et al. [48] and Oliveira, Davies, Gouyon, et al. [97] were focused on the application of MAS to analyze problems related to the musical context of human behaviors, other researchers addressed issues associated with musical structures. VIRTUALATIN, described by Murray-Rust, Smaill, and Maya [94], is a MAS dedicated to generating accompaniment rhythms to complement music. Such rhythms are generated through a percussive agent which adds accompaniments to Latin music [94]. Pachet [101] developed a MAS framework which models musical rhythms through the simulation of a game where percussive agents engage in mutual collaboration, playing in real time, without any previous knowledge of the music to be played. While several efforts to create autonomous music systems have used a multi-agent approach, the majority of them have modelled each autonomous musician as a multi-agent system, while saying little about how a group of such musicians should coordinate or interact. Furthermore, the integration of a multi-agent system framework in the context of the Musebot Ensemble platform (e.g., representing a Musebot as a multi-agent system) remains unexplored, and we believe that such work could benefit the interaction between agents in the Musebot Ensemble platform.

3.4 Efforts toward integration

Workshops on MuMe are held annually in conjunction with the International Conference on Computational Creativity (ICCC), toward inspiring collaboration and integration between artistic and technological approaches. Recently, a modest amount of work has been completed in this area, and most of it has been related to meta-creations that interact with a human performer. The Musebot Ensemble platform arose as a result of the effort to create an alternative venue for the area, with the particular goal of promoting both collaboration and the evaluation of work done in the field [18]. Eigenfeldt, Bown, and Carey [19] presented the first application of this platform, including specifications for their Musebots, the architecture of the Musebot Ensemble platform, and a discussion of its benefits and limitations. They asserted that the platform does not take precedents from a human band, but in our view, the use of such precedents holds great potential for advancing our knowledge of musically creative systems (as we argued in Chapter 1). Finally, in a different line of previous work, Thomaz and Queiroz [17] developed a framework that aims to integrate general ideas (pulse detection, instrument simulation, and automatic accompaniment) in the context of musical multi-agent systems. While this framework offers a convenient layer of supporting functionality (e.g., synchronization), it does not support the kinds of direct communication between agents that we pursue in this work.

3.5 Commercial software for music

The intersection of AI and music has impacted many activities of the musical world. For instance, it has changed the ways people teach, learn, perform, reproduce, and compose music. Moreover, the progress made in academic research areas such as algorithmic composition has created interest in the development of commercial tools that can be employed in different areas of the music industry. For instance, Band-in-a-Box is an "intelligent music accompaniment software" [102] which allows musicians to practice their solos along with a generated musical accompaniment [102]. Alternatively, ChordPulse simulates a band to support musicians in practicing improvisation in the company of a musical ensemble [103]. Other tools are intended to aid composers with writing new compositions. Crescendo is music notation software that allows musicians to write music on computers. The aim of these tools is to allow musicians to avoid the laborious process of manually writing musical notation, allowing them to concentrate their efforts on the production of the music. Tools developed to assist musicians with their composition tasks are commonly known as computer-aided algorithmic composition (CAAC) tools [8]. The games industry has also made exceptional advances in AI and music. Video games such as Guitar Hero, Singstar and Rock Band have contributed to the possibility of interaction between humans and computers within a musical environment. Rocksmith is a game which allows the player to connect a traditional electric guitar. The purpose of the game is to create a musical environment where players can develop their performing skills by practicing a variety of popular songs. A version of the game that includes a live session mode was introduced at the nucl.ai conference in 2014. In this session mode, the game provides a set of virtual instruments that respond according to the notes played on the user's guitar. It is notable that the game is able to adapt to changes in tempo and style produced by the player in real time. Although these commercial tools and games have automated some music-related activities, they lack autonomy, since all of them are controlled by a user.

3.6 Summary

In this chapter, we have reviewed some AI approaches used in the three subareas of MuMe: algorithmic composition, live algorithms, and musical multi-agent systems. First, we discussed the advantages of different implementations of AC and how this area effectively expresses the formalization of music composition; however, AC alone cannot yet fully represent musicians' interactions. Secondly, we described some of the work on live algorithms and the limitations it faces regarding the performance of musical agents without any human intervention. Next, we presented some previous contributions in multi-agent systems and discussed some challenges regarding the coordination of a musical ensemble. We then described the Musebot Ensemble platform as a previous effort toward the integration of MuMe's subareas. Finally, we devoted a section to commercial tools provided by the music and games industries. In the next chapter, we will explain in detail our proposed approach, which aims to overcome some of the limitations discussed in this chapter.

Chapter 4

Proposed Approach

We seek to understand and build a system that demonstrates the cognitive abilities exhibited by musicians in a musical ensemble. A musical ensemble is a group of musicians who are able to coordinate through various communication modes during a live performance. The ensemble interprets different parts from a musical score with instruments or vocals, which are then transformed into a piece of music and delivered to an audience. To operate effectively, the musical ensemble must work in a coordinated manner. Accurate coordination of decisions and management of actions among musicians are what ultimately determine the success or failure of the ensemble's goal. The Musebot Ensemble provides some of the mechanisms required to model a musical ensemble; its concept of musical creativity (based on a group of agents, each representing a single musical instrument) makes it suitable for our purpose of providing the AMI system with a high level of autonomy. The Musebot Ensemble also offers a communication architecture which enables each Musebot to receive and transmit information pertinent to the participation of each musician in the ensemble. This architecture also provides each Musebot with the ability to communicate certain elements of the musical composition, such as tempo. Therefore, this software is a relevant piece of technology for the proposed AMI system. Our goal is to extend the Musebot Ensemble platform in a way that supports direct communication between Musebots while simultaneously offering a clear avenue for integrating recent advances in MuMe's subareas. We have chosen to use the concept of a jazz quartet as a case study for this work, since jazz is a genre of music that routinely requires real-time coordination and improvisation from its players. Thus, it serves as a suitable and convenient proving ground for the techniques of Algorithmic Composition, Live Algorithms, and Musical Multi-agent Systems. Furthermore, jazz performance (among humans) has been studied from the perspective of social science due to its inherent social interactivity, providing us with a solid point of reference when considering whether and how autonomous musicians can be made to play jazz. We will base our discussion on the work of Bastien and Hostager [20], who presented a study of how four jazz musicians could coordinate their musical ideas without the benefit of rehearsal and without the use of sheet music. In their study, the authors found that the musicians communicated through a variety of different means, including visual, verbal, and non-verbal cues. We aim to model such communication between autonomous musical agents. To extend the Musebot Ensemble platform in a way that supports direct communication between Musebots, we propose the use of a two-level agent architecture in which each Musebot is itself comprised of multiple interacting agents. In the remainder of this chapter, we will describe the general architecture of the AMI system, the functionality of a Musebot as an individual performer, and each of its components. We will then address the different levels of interaction between the Musebots in the ensemble and include a brief explanation of the interaction protocols implemented to support direct communication between the Musebots.

4.1 General Architecture

In this section, we describe in detail the architecture of the proposed system, which is based on a two-level structure. The first level is represented by a Musebot agent, and the second level is represented by a set of interactive sub-agents: a musician agent, a composer agent, a synchronizer agent, and an ensemble assistant agent. Figure 4.1 shows a graphical representation of our architecture, using part of Bastien and Hostager's jazz quartet (a saxophonist and a pianist) as an example.

Figure 4.1: An example of our Musebot agent architecture. Each Musebot represents an autonomous musician and is a multi-agent system composed of four agents: musician, composer, synchronizer, and ensemble assistant.

4.1.1 The CADIA Musebot

The CADIA Musebot is autonomous musical software engineered to perform certain musical tasks, including composition, interpretation, and collaboration. At the top level of our structure (Figure 4.1, left side), each Musebot represents a single autonomous musician, such as a saxophonist or a pianist. The software was designed to perform in a virtual environment where, as a musical agent, it plays music along with an arbitrary number of fellow Musebots. A Musebot chooses its actions with the intention of achieving common goals within the Musebot Ensemble. For example, some of these goals are to support a soloist with musical accompaniments or to improvise original melodies as a soloist. While the CADIA Musebot was created to perform as a single musical instrument, this does not mean that it cannot perform as a variety of instruments at the same time. For instance, a Musebot intended to play a drum set will need to play different instruments simultaneously, since in the system, each component of a drum set (such as a ride or a snare) is represented as an individual musical instrument. Each Musebot is made up of four different sub-agents (musician, composer, synchronizer, and ensemble assistant), which are designed to distribute the various tasks that a Musebot should perform. Agents at the lower level share a common goal: to help the Musebot perform appropriately as part of the Musebot Ensemble, including communicating and interacting effectively with other Musebots. To achieve this goal, several actions are executed by the sub-agents, based on the role that the Musebot is currently playing within the musical ensemble. We developed these sub-agents using the Java Agent Development Framework (JADE), an agent-oriented framework designed in compliance with specifications from the Foundation for Intelligent Physical Agents (FIPA). FIPA is an organization that provides an agent communication language along with standards for several interaction protocols [104]. This framework provides a variety of components:

• Agent Platform (AP): The AP is the virtual environment in which the agents are launched. The AP runs under the computer operating system and can be executed across multiple computers [105].

• Agent: An agent is a computational thread that processes different types of behaviours and provides computational capabilities that could be requested by other agents. An agent must be labeled with a unique identification address, which is used as contact information [105].

• Agent's composite behaviours: A composite behaviour is defined by a hierarchy of behaviours composed of a parent behaviour and various child behaviours. These composite behaviours are used to deal with complex tasks, for example, tasks involving several operations such as handling multiple conversations. JADE offers three types of composite behaviours: the Sequential behaviour, the Finite State Machine (FSM) behaviour, and the Parallel behaviour [105].

• Agent Management System (AMS): The AMS manages the operations of the AP; all agents must be registered to the AMS. Additionally, this component is responsible for the creation and deletion of every agent [105].

• Directory Facilitator (DF): The DF is an additional agent included with the AP that provides “yellow pages” to the other agents [105]. It keeps a list of the identification of all agents within the AP. An agent can register its services into the yellow pages through the DF.

• Message Transport Service (MTS): The MTS is a component within the framework that allows the agents to communicate through FIPA-ACL messages. 26 CHAPTER 4. PROPOSED APPROACH

In the following sections, we explain how we used JADE to develop the agents in our archi- tecture.
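
As a minimal illustration of these components (not the CADIA Musebot itself), the sketch below shows a JADE agent that registers a hypothetical "musician" service with the DF and uses a cyclic behaviour to react to incoming FIPA-ACL messages.

    import jade.core.Agent;
    import jade.core.behaviours.CyclicBehaviour;
    import jade.domain.DFService;
    import jade.domain.FIPAException;
    import jade.domain.FIPAAgentManagement.DFAgentDescription;
    import jade.domain.FIPAAgentManagement.ServiceDescription;
    import jade.lang.acl.ACLMessage;

    public class MinimalMusicianAgent extends Agent {

        @Override
        protected void setup() {
            // Register a (hypothetical) "musician" service in the DF's yellow pages.
            DFAgentDescription dfd = new DFAgentDescription();
            dfd.setName(getAID());
            ServiceDescription sd = new ServiceDescription();
            sd.setType("musician");
            sd.setName(getLocalName() + "-musician");
            dfd.addServices(sd);
            try {
                DFService.register(this, dfd);
            } catch (FIPAException e) {
                e.printStackTrace();
            }

            // A cyclic behaviour: executed repeatedly, handling one message per pass.
            addBehaviour(new CyclicBehaviour(this) {
                @Override
                public void action() {
                    ACLMessage msg = myAgent.receive();  // non-blocking receive
                    if (msg != null) {
                        System.out.println(getLocalName() + " received \""
                                + msg.getContent() + "\" from "
                                + msg.getSender().getLocalName());
                    } else {
                        block();  // suspend until the next message arrives
                    }
                }
            });
        }
    }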

4.1.1.1 Musician Agent

In our architecture, the musician agent (MA) is the primary component of a Musebot. It is responsible for carrying out the Musebot's role in the ensemble (soloist or accompanist); in doing so, it interacts with the rest of the agents in the architecture. As soon as the MA becomes active, it starts to behave according to the Finite State Machine (FSM) shown in Figure 4.2. Each state in this FSM is represented by a child behaviour of the composite

Figure 4.2: The musician agent’s (MA’s) Finite State Machine (FSM).

FSM behaviour mentioned in the previous section. Some child behaviours are composed of one or more sub-behaviours, meaning that sometimes a single state consists of two or three layers of behaviours, as shown in Figure 4.3. Such structures help the agent handle different types of computations, including tasks that are programmed to be executed either only once or in cycles. Having the MA manage this hierarchy of behaviours makes it highly complex; however, it is a suitable approach for simulating the actions taken by a music performer, because those actions can be distributed across different musical tasks. After the "start," the MA registers its service with the DF, followed by a search for every member of the ensemble. There must be at least four agents registered in the ensemble in order to advance to the following state. Once the MA is placed into the "Search musicians" state and has successfully found all of its musician partners, it will have to decide whether to take an accompanist or a leader role. This decision currently depends on a rule pre-established by an external user. If the MA happens to be the leader, it will take the transition that leads to the "Leader/Soloist" state; otherwise, it will be directed to the "Accompanist" state. We model a musician agent being the leader or an accompanist using separate musician agents. Soloist Musician Agent. Being a soloist in the Musebot Ensemble is equivalent to being the leader of the group. The MA with a leadership position will be the one coordinating the musical activities as well as deciding upon and sharing the structure of the song to be played, which is currently set prior to the performance by an external user.

Figure 4.3: Example of the musician agent's behaviours hierarchy.

Once the MA is placed in the "Leader/Soloist" state, it will check whether this is the very first time it has passed through this state. If so, the transition will follow to the "Get song's structure" state. Here, the MA will take the structure of the song (an object encoding musical elements such as tempo, time signature, and the form of the song) and a list of chords, which comprise each section of the form (see Table 4.1). In the currently proposed system, we only support two sections: "A" and "B."

Table 4.1: This table represents the properties of the song’s structure object. The tempo and time signature’s numerator and denominator are stored as integer values. The form stores a string of characters, and the sections of the form store a list of musical chords.

Tempo            120
Time Signature   Numerator: 4, Denominator: 4
Form             AAB
Section A        Em7, A7, Dm7, G7
Section B        CM7, CM7
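
As an illustration of how such a structure could be encoded, the following sketch defines a Java class mirroring Table 4.1. The class and field names are hypothetical rather than the system's actual implementation; it is declared Serializable so that, in principle, it could be carried inside an ACL message (see Section 4.2.1).

    import java.io.Serializable;
    import java.util.Arrays;
    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.stream.Collectors;

    public class SongStructure implements Serializable {
        private final int tempo;                              // beats per minute, e.g. 120
        private final int timeSigNumerator;                   // e.g. 4
        private final int timeSigDenominator;                 // e.g. 4
        private final String form;                            // e.g. "AAB"
        private final Map<Character, List<String>> sections;  // chords per section

        public SongStructure(int tempo, int num, int den, String form,
                             Map<Character, List<String>> sections) {
            this.tempo = tempo;
            this.timeSigNumerator = num;
            this.timeSigDenominator = den;
            this.form = form;
            this.sections = sections;
        }

        public int getTempo() { return tempo; }
        public String getForm() { return form; }

        // Expand the form into the full chord sequence of one chorus.
        public List<String> chorusChords() {
            return form.chars()
                       .mapToObj(c -> sections.get((char) c))
                       .flatMap(List::stream)
                       .collect(Collectors.toList());
        }

        public static void main(String[] args) {
            Map<Character, List<String>> sections = new LinkedHashMap<>();
            sections.put('A', Arrays.asList("Em7", "A7", "Dm7", "G7"));
            sections.put('B', Arrays.asList("CM7", "CM7"));
            SongStructure song = new SongStructure(120, 4, 4, "AAB", sections);
            System.out.println(song.chorusChords());
            // Prints: [Em7, A7, Dm7, G7, Em7, A7, Dm7, G7, CM7, CM7]
        }
    }

Running the main method expands the form AAB into the full chord sequence of one chorus, which is the kind of information the leader shares with the rest of the ensemble.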

After retrieving this information, the MA will share it with the other members of the ensemble, whereupon confirmation is received from each member. The MA will continue to the next state and negotiate the introduction to the song by requesting that another Musebot play it. If every Musebot refuses or fails to play the introduction, the leader will continue trying to find someone that wants to cooperate. (We address the mechanisms of this negotiation in Section 4.2.1.) From there, the MA will remain ready to improvise a solo after the completion of the introduction of the song. Finally, after playing a number of sections

(relying upon the form in the structure of the song), the MA will pass leadership to another musician, supporting the new soloist from then on as an accompanist. Accompanist Musician Agent. An MA designated as an accompanist has a largely passive behavior; it responds to and supports the soloist's requests. From the "Accompanist" state (see Figure 4.2), the MA will directly receive the structure of the song provided by the leader of the ensemble. It will then transition to the "Play intro" state, where the MA will wait until the leader sends a petition to play the introduction. The accompanist can accept or reject the request, depending on how the simulation computes its decision. If several accompanists accept to play the introduction, the leader will choose only the first one that accepted and will reject the rest. Next, it will follow the transition to the "Play accompaniment" state; in this state, it will play an accompaniment to the song, following the given structure. Finally, the accompanist MA will be directed to the "Wait to be a leader" state, where it will remain waiting for a request to be the succeeding leader of the Musebot Ensemble. A fragment of the musician agent's FSM is illustrated in Figure 4.4. The cycle involving four states ("Request solo," "Pass the leadership," "Request accompaniment," and "Wait to be a leader") allows the system to perform multiple roles in the AMI system.

Figure 4.4: Cycle process in which the MAs exchange roles.

Once the leader is ready to pass the lead, the accompanist will respond to this request by either accepting or rejecting it. If the accompanist accepts, it will take the transition to the "Request solo" state and automatically become the leader of the ensemble. Similarly, the soloist, after passing the lead, continues to the "Request accompaniment" state, where it will play the accompaniment to the song and thereby change its role from a soloist to an accompanist. Although the "Request accompaniment" state has similar functionality to the "Play accompaniment" state, they are different; only "Play accompaniment" is concerned with calculations related to the introduction of the song, since it follows the "Play intro" state. Therefore, to decrease the complexity of returning to the "Play accompaniment" state, we decided to add a state that removes any computation related to the introduction of the song.
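
The role-exchange cycle described above can be expressed directly with JADE's FSM behaviour. The sketch below registers a simplified version of the cycle from Figure 4.4; the state names, transition codes, and placeholder state behaviours are illustrative assumptions, and the real musician agent's states contain considerably more logic.

    import jade.core.Agent;
    import jade.core.behaviours.FSMBehaviour;
    import jade.core.behaviours.OneShotBehaviour;

    public class RoleCycleAgent extends Agent {

        private static final int ACCEPTED = 1;   // hypothetical transition event

        @Override
        protected void setup() {
            FSMBehaviour fsm = new FSMBehaviour(this);

            fsm.registerFirstState(step("waiting for a leadership request"), "WaitToBeLeader");
            fsm.registerState(step("improvising a solo"), "RequestSolo");
            fsm.registerState(step("negotiating the next soloist"), "PassLeadership");
            fsm.registerState(step("playing an accompaniment"), "RequestAccompaniment");

            // The cycle: accompanist -> soloist -> pass the lead -> accompanist -> ...
            fsm.registerTransition("WaitToBeLeader", "RequestSolo", ACCEPTED);
            fsm.registerDefaultTransition("RequestSolo", "PassLeadership");
            fsm.registerDefaultTransition("PassLeadership", "RequestAccompaniment");
            fsm.registerDefaultTransition("RequestAccompaniment", "WaitToBeLeader");

            addBehaviour(fsm);
        }

        // Placeholder state behaviour: prints its task and signals a transition event.
        private OneShotBehaviour step(String description) {
            return new OneShotBehaviour(this) {
                @Override
                public void action() {
                    System.out.println(getLocalName() + ": " + description);
                }
                @Override
                public int onEnd() {
                    return ACCEPTED;   // in the real agent this depends on negotiation
                }
            };
        }
    }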

4.1.1.2 Composer Agent

The composer agent (CA) is responsible for composing melodies, chord progressions, and/or solos at run-time. Each composition will depend on the role of the Musebot and the instrument that it is playing. For instance, a Musebot playing a drum solo will compose rhythm patterns, while a Musebot playing piano accompaniment will likely play the chord progression of the song. The CA's behaviour is based on three FSMs: the intro FSM, the accompaniment FSM, and the solo FSM. These three FSMs run simultaneously once the CA is launched by the AMS. Intro FSM. The intro FSM (see Figure 4.5) provides the mechanisms to compose the introduction of the song. It begins in the "Wait for request" state; here, after the CA receives a request from the MA, it will take the transition leading to the "Compose intro" state. This state applies methods of algorithmic composition to generate the introduction to the song. Once the composition is completed, the CA will pass into the "Play intro" state, where it will take the composition and play it at run-time. The CA then goes to the "End" state, completing the musical task. This is the only FSM of the CA that runs just once during the performance of the song.

Figure 4.5: The Intro FSM.

Accompaniment FSM. The accompaniment FSM is responsible for composing and performing the accompaniment (see Figure 4.6). Like the intro FSM, the process begins when the MA requests the composition of an accompaniment. The request could happen after the end of an introduction; in this situation, the CA will follow the transition to the "Confirm" state. This state informs the MA that the petition was successfully processed. The only transition that follows the "Confirm" state leads to the "Compose" state, where the accompaniment will be composed. Afterward, it will be played in the "Play" state. However, if the MA requests an accompaniment after playing a solo, the transition will continue to the "Request info to sync" state, where coordination with the Synchronizer Agent will occur (see Section 4.1.1.3).

Figure 4.6: The Accompaniment FSM.

The CA then passes to the "Get info from sync" state and retrieves information concerning the current section being played in the song. It also sends confirmation to the MA that the process has been executed. Next, as seen in Figure 4.6, the CA goes into the "Compose" state. It is important to address the relation between the "Compose" state and the "Play" state, which allows the Musebot to perform in real time: while the CA is playing a given section of the form, the "Compose" state is preparing the next section to be played, providing enough time for the "Play" state to perform each section of the song. This cycle continues until the MA is requested to play a solo. Solo FSM. The solo FSM is analogous to the accompaniment FSM (see Figure 4.7) and follows the same principles. However, the solo FSM is distinguished by the procedure applied to the compositions: it uses methods related to stochastic algorithms, giving it the ability to improvise. The compositions are constructed using "jMusic," a Java library that encodes music in a symbolic representation similar to Common Practice Notation (CPN), and are played using the Java MIDI sound bank. This agent provides a point of integration for recent advances in Algorithmic Composition. Afterward, the composition will be played in the "Play" state.

Figure 4.7: The Solo FSM.

4.1.1.3 Synchronizer Agent

The structure of the Synchronizer Agent (SA) is based upon the JADE mechanism known as "parallel behaviour" (see Section 4.1.1). This behaviour is the parent of a set of multiple behaviours executed in parallel. One function of the SA is to store information about events that happen during the progression of a song. For instance, when a Musebot's MA is ready to perform an introduction, it will inform its SA of the time it started to play, along with the expected duration of the introduction. This information will then be stored by the SA and kept available to share with the rest of the agents through an interaction protocol (see Section 4.2.1). This mechanism allows any agent that lacks this information to request it and calculate the time at which the introduction will finish, so that it knows when it must play its part of the song. This agent also provides a point of integration for recent advances in Live Algorithms.

Most of the functionalities of the SA are strongly related to the communication and interaction protocols proposed in this dissertation. Therefore, we will discuss the operations performed by this agent in detail later in this chapter.

4.1.1.4 Ensemble Assistant Agent

The ensemble assistant agent (EAA) serves as an intermediary between the Musebot Conductor (required by the Musebot Ensemble platform) and each Musebot. The Musebot Conductor provides a way for external users to control the ensemble, for example, varying tempo, volume, or which Musebots are involved [18].

4.2 Interaction Between Agents

The interaction between agents in our architecture is managed by two different mechanisms: the Musebot Ensemble platform and a set of interaction protocols. The interaction required by the Musebot Ensemble platform is defined by a specific set of messages [18], which are exchanged between the Musebot Conductor and the Musebots. The messages are human-readable and classified into categories. For example, the message "/mc/time" is broadcast from the Musebot Conductor to every Musebot in the ensemble, conveying the tempo of the composition for use in synchronizing the agents. Similarly, "agent/kill" is a message that indicates that the particular agent receiving it should stop performing. In our architecture, this interaction mechanism is handled by the EAA; its principal task is to interpret these messages and transmit them to the Musebot's multi-agent system.
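
As a minimal illustration of the EAA's interpretation step, the sketch below dispatches conductor messages by their address string. Real Musebots exchange these messages as OSC packets over UDP; that transport layer is omitted here, and the listener interface standing in for the rest of the Musebot is hypothetical.

    public class ConductorMessageDispatcher {

        /** Hypothetical callback interface toward the rest of the Musebot. */
        public interface EnsembleListener {
            void onTempo(float bpm);
            void onKill();
        }

        private final EnsembleListener listener;

        public ConductorMessageDispatcher(EnsembleListener listener) {
            this.listener = listener;
        }

        // Interpret one message, e.g. dispatch("/mc/time", "120").
        public void dispatch(String address, String argument) {
            switch (address) {
                case "/mc/time":
                    listener.onTempo(Float.parseFloat(argument));  // conductor broadcasts tempo
                    break;
                case "agent/kill":
                    listener.onKill();                             // this Musebot should stop
                    break;
                default:
                    System.out.println("Unhandled message: " + address);
            }
        }

        public static void main(String[] args) {
            ConductorMessageDispatcher d = new ConductorMessageDispatcher(new EnsembleListener() {
                public void onTempo(float bpm) { System.out.println("Set tempo to " + bpm + " bpm"); }
                public void onKill() { System.out.println("Stopping performance"); }
            });
            d.dispatch("/mc/time", "120");
            d.dispatch("agent/kill", "");
        }
    }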

4.2.1 Interaction Protocols

The interaction protocols implemented in the system are inherited from JADE [105] and are represented as sequences of messages (based on speech acts) for handling different actions between agents, such as agreement, negotiation, and others [106]. We have implemented two types of interaction protocols: the FIPA Contract Net Interaction Protocol (an example is shown in Figure 4.8) and the FIPA Query Interaction Protocol. We used the FIPA Contract Net Interaction Protocol to model parts of the interaction between musicians that Bastien and Hostager [20] described in their work. For example, prior to performing a song, the jazz quartet took time to discuss which musician should play an introduction to the song. The Contract Net Interaction Protocol provides all the necessary elements to represent this negotiation. The Musebot designated as the leader becomes the initiator of the conversation, while the rest of the Musebots become responders. The initiator sends a call for proposals to the responders, recommending that one of them should play an introduction. Each responder then replies with either a proposal to play the introduction or a refusal to play it. The initiator evaluates the received proposals, accepting one and rejecting the others. Once a proposal is accepted by the initiator, the matching responder will compose and play the introduction to the song, afterward informing the initiator that the action was successfully completed. In case of failure by the responder, the initiator will repeat the conversation until the introduction gets played. This protocol was also implemented to model how musicians take turns performing solos. For instance, when the leader determines that it is done playing a solo, it will finish and then negotiate with the rest of the members to confirm which musician will play the next solo.

Figure 4.8: One of the agent interaction protocols that we implemented to support communication in our agent architecture (the FIPA Contract Net Interaction Protocol).
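
A sketch of the call-for-proposal round of this protocol is shown below, built from raw FIPA-ACL messages rather than JADE's ready-made ContractNet helper classes; the behaviours, content strings, and method names are illustrative only.

    import jade.core.AID;
    import jade.core.Agent;
    import jade.core.behaviours.OneShotBehaviour;
    import jade.domain.FIPANames;
    import jade.lang.acl.ACLMessage;

    public class IntroNegotiation {

        /** Leader (initiator) side: ask the other Musebots to play the introduction. */
        public static OneShotBehaviour callForIntro(Agent leader, String[] responders) {
            return new OneShotBehaviour(leader) {
                @Override
                public void action() {
                    ACLMessage cfp = new ACLMessage(ACLMessage.CFP);
                    cfp.setProtocol(FIPANames.InteractionProtocol.FIPA_CONTRACT_NET);
                    cfp.setContent("play-introduction");
                    for (String name : responders) {
                        cfp.addReceiver(new AID(name, AID.ISLOCALNAME));
                    }
                    myAgent.send(cfp);
                }
            };
        }

        /** Accompanist (responder) side: answer a call for proposal. */
        public static OneShotBehaviour answerCfp(Agent accompanist, ACLMessage cfp, boolean willing) {
            return new OneShotBehaviour(accompanist) {
                @Override
                public void action() {
                    ACLMessage reply = cfp.createReply();
                    reply.setPerformative(willing ? ACLMessage.PROPOSE : ACLMessage.REFUSE);
                    reply.setContent(willing ? "I can play the introduction" : "busy");
                    myAgent.send(reply);
                }
            };
        }
    }

After this round, the initiator would answer the collected proposals with one ACCEPT_PROPOSAL message and REJECT_PROPOSAL messages for the rest, and the chosen responder would eventually send an INFORM (or FAILURE), as shown in Figure 4.8.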

While this example describes an interaction between agents at the top level of the architecture (e.g., a saxophonist interacting with a bassist, drummer, and pianist), other conversations are carried out internally by the agents at the lower level. There is constant communication between the musician agent and the synchronizer agent each time a piece of the song is planned to be played. For instance, the SA stores information about the sections being played as the song progresses, which allows each Musebot to synchronize the transition from the introduction to the accompaniment. The MA needs to access information about the introduction, so that it can calculate how long the introduction will be played and how much time it has left to compose and play the accompaniment. We use the FIPA Query Interaction Protocol to do this. This protocol follows a sequence of messages started by the initiator sending a FIPA-ACL message with "QUERY_IF" as the performative (a component of speech act theory, which uses verbs to represent agents' actions). The responder will handle this message with either a confirmation (in the case that it has the requested information) or a refusal (in the case that it lacks it).
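
A corresponding sketch of this query exchange is given below, assuming a hypothetical content key ("intro-end-time") and leaving out the surrounding behaviours for brevity.

    import jade.core.AID;
    import jade.core.Agent;
    import jade.domain.FIPANames;
    import jade.lang.acl.ACLMessage;

    public class IntroTimeQuery {

        /** Musician agent side: ask the synchronizer when the introduction ends. */
        public static void askSynchronizer(Agent musician, String synchronizerName) {
            ACLMessage query = new ACLMessage(ACLMessage.QUERY_IF);
            query.setProtocol(FIPANames.InteractionProtocol.FIPA_QUERY);
            query.addReceiver(new AID(synchronizerName, AID.ISLOCALNAME));
            query.setContent("intro-end-time");
            musician.send(query);
        }

        /** Synchronizer agent side: confirm with the value, or refuse if unknown. */
        public static void answer(Agent synchronizer, ACLMessage query, Long introEndMillis) {
            ACLMessage reply = query.createReply();
            if (introEndMillis != null) {
                reply.setPerformative(ACLMessage.INFORM);
                reply.setContent(String.valueOf(introEndMillis));
            } else {
                reply.setPerformative(ACLMessage.REFUSE);
            }
            synchronizer.send(reply);
        }
    }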

4.3 Development Process

Here, we describe the different approaches we took during the development of the AMI system.

4.3.1 Initial Considerations

Our goal was to build an autonomous AMI system capable of simulating some of the communicative modes exhibited by musicians during a live performance. To develop an AMI system capable of the four abilities described in Section 2.1 in a feasible amount of time, we decided to use the Musebot Ensemble platform instead of building our own platform from scratch, as this platform offers a foundation for collaborative musical contexts. The majority of existing Musebots were developed using MAX/MSP (a visual programming language). Initially, we considered developing the CADIA Musebot (see Section 4.1) using this tool; however, after experimenting with several previous Musebots proposed by the MuMe community, we discovered a Musebot developed using Java (downloaded from https://bitbucket.org/musebots/profile/repositories) that was more suited to our programming abilities.

4.3.2 Development Challenges and Implementations

After reverse engineering the Musebot developed with Java, we chose not to use the components implemented to generate the musical compositions, as these were implemented to process audio waves whose output sounded aesthetically close to electronic dance music. We only adopted the implementation of the communication protocols, which are based on the exchange of single Open Sound Control (OSC) messages in compliance with the specifications set out in the Musebot Ensemble platform. This implementation is the basis of the Ensemble Assistant agent (see Section 4.1). One of the principal challenges in the development process was to enhance the communication between the Musebots. We were inspired by Barbuceanu and Fox [107], who modeled the communication between intelligent agents along a supply chain by using a framework based on a coordination language representing different levels of coordination. We designed several conversations based on the OSC messages proposed in the specifications of the Musebot Ensemble platform; however, to use these conversations, we had to implement different network communication elements. For example, we needed a mechanism that dynamically identified the different Musebots participating in the ensemble (allowing the Musebots to send messages directly to one another instead of broadcasting messages to all of the members of the ensemble). We also needed a mechanism that allowed us to send information encoded in Java objects (e.g., sending the song's structure in the content of a message). The lack of these elements motivated us to search for alternative communication protocols. The JADE framework (see Section 4.1.1) provides different interactive communication protocols that include components such as the FIPA Agent Identifier (AID), which provides a unique identifier to agents so they can be addressed unambiguously, and an agent communication language featuring parameters that allow the agents to encode specific message content [104]. We decided to include JADE in the CADIA Musebot as an alternative communication mechanism that allowed the Musebots to cooperate and negotiate in a coordinated manner. Furthermore, it provided the agents with a structure based on different types of behaviours (see Section 4.1.1.1), which helped us to manage the various agents' actions.
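
As an illustration of the second mechanism, the sketch below wraps a Java object in an ACL message using JADE's content-object facility. The SongStructure class is the illustrative Serializable sketch given after Table 4.1, not the system's actual data type.

    import java.io.IOException;

    import jade.core.AID;
    import jade.lang.acl.ACLMessage;
    import jade.lang.acl.UnreadableException;

    public class SongStructureMessaging {

        /** Leader side: wrap the structure in an INFORM message. */
        public static ACLMessage buildShareMessage(SongStructure song, String receiverName)
                throws IOException {
            ACLMessage msg = new ACLMessage(ACLMessage.INFORM);
            msg.addReceiver(new AID(receiverName, AID.ISLOCALNAME));
            msg.setContentObject(song);   // serializes the object into the message content
            return msg;
        }

        /** Accompanist side: recover the structure from a received message. */
        public static SongStructure readSharedStructure(ACLMessage msg)
                throws UnreadableException {
            return (SongStructure) msg.getContentObject();
        }
    }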

We used jMusic through the Composer Agent to replace the musical composition components of the previous Musebot (see Section 4.1.1.2). In contrast with previous methods of algorithmic composition, we did not require any pre-established melodies to generate either accompaniments or solos. Instead, we used different rhythm patterns stored in arrays that were randomly chosen by the algorithms to generate the musical compositions. This method is described in [42] as a "random walk." Finally, the decision to develop the CADIA Musebot as a set of four different sub-agents allowed us to distribute the agents' multiple musical tasks and reduce the CADIA Musebot's complexity. The two-level architecture (see Figure 4.1) provided an opportunity to integrate advances in Algorithmic Composition and Live Algorithms.
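
A minimal sketch of this random-walk idea, using jMusic's Note, Phrase, Part, and Score classes, is given below. The scale, the stored rhythm values, and the walk itself are simplified stand-ins for the composer agent's actual algorithms.

    import java.util.Random;

    import jm.music.data.Note;
    import jm.music.data.Part;
    import jm.music.data.Phrase;
    import jm.music.data.Score;
    import jm.util.Play;

    public class RandomWalkSolo {

        public static void main(String[] args) {
            Random rng = new Random();
            int[] scale = {60, 62, 64, 65, 67, 69, 71, 72};   // C major, MIDI pitches
            double[] rhythms = {0.5, 0.5, 1.0, 0.25, 0.25};   // pre-stored rhythm values (in beats)

            Phrase solo = new Phrase();
            int index = rng.nextInt(scale.length);
            for (int i = 0; i < 32; i++) {
                // Random walk over the scale: step up, step down, or stay.
                index = Math.max(0, Math.min(scale.length - 1, index + rng.nextInt(3) - 1));
                double rhythm = rhythms[rng.nextInt(rhythms.length)];
                solo.addNote(new Note(scale[index], rhythm));
            }

            Part part = new Part("Solo");
            part.addPhrase(solo);
            Score score = new Score("Random walk sketch");
            score.addPart(part);
            score.setTempo(120);
            Play.midi(score);   // renders the result via the Java MIDI sound bank
        }
    }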

4.4 Summary

In this chapter, we discussed our proposed AMI system. We addressed the different components that provide the system with functionalities exhibited by human music performers, such as composition, improvisation, and performance. Furthermore, we proposed the use of interaction protocols to extend the communication architecture of the Musebot Ensemble and provided examples of two interaction protocols used to model characteristics of Bastien and Hostager's observations of a (human) jazz quartet [20]. We also discussed in detail the mechanisms that allow the Musebots to perform different roles in the ensemble, as well as the importance of these roles to coordinating their musical tasks. We explained how the components of our Musebots enable the integration of advances in Algorithmic Composition and Live Algorithms. In the next chapter, we will evaluate the system and determine whether it constitutes a relevant approach for building an ideal AMI system.

Chapter 5

Evaluation

In this chapter, we evaluate the proposed AMI system by comparing its operation across three simulations of Bastien and Hostager's case study [20] of a jazz quartet. We review the different abilities of an ideal AMI system that we discussed in Chapter 2 and assess whether they are present in the proposed system.

5.1 Testing and Simulations

We previously explained the different aspects of jazz that make it a suitable domain for simulating certain aspects of musical performance. To clarify the purpose of our simulation, we begin with a detailed description of Bastien and Hostager’s case study.

5.1.1 Bastien and Hostager's case study

Bastien and Hostager's paper aims to understand the organizational processes exhibited during live jazz performance. They observed the performance of four jazz musicians (Bud Freeman on tenor saxophone, Art Hodes on piano, Biddy Bastien on bass, and Hal Smith on drums [20]) and the way they coordinated without rehearsing or using sheet music. Bastien and Hostager classified the social practices of the jazz quartet using behavioral norms and communicative codes. They gave examples of behavioral norms in jazz, which we summarize as follows. First, the leader of the jazz ensemble chooses each song to be played. Second, the leader decides the tempo and structure of the song, and the rest of the members of the ensemble are expected to support this determination [20]. Finally, each musician is given the opportunity to play a solo during the performance [20]. Bastien and Hostager also identified basic patterns in the jazz process of organizational innovation [20]. Before performing, the members of the jazz quartet planned what they would do. Guided by the leader, they agreed to the following arrangements. First, the saxophone player would be the nominal leader and would decide each song. Second, the chosen songs would be popular jazz songs, considering the knowledge of each member of the ensemble. Third, there would be no change in the tempo of each song. Finally, all songs would start with an introduction played by the piano, followed by an improvised solo played by the saxophone; next, the pianist would take the leadership and play a solo. Then, either the bassist or the drummer would accept the lead, play a couple of measures of inventive solos, and pass the lead to the remaining musician that had not played a solo. We used the proposed AMI system to model these basic patterns and social practices described by Bastien and Hostager. To demonstrate the varied capabilities of the proposed system, we used different input values to model three different songs, each of which the system can perform on demand. The remainder of this section will describe the communicative events that happen during the progression of each song. These communicative events and other actions are represented in the results by the abbreviations that we enumerate in Table 5.1.

Table 5.1: This table enumerates each agent's possible actions during a song's progression. The actions are classified into communicative behaviours and musical behaviours.

Abbreviation | Action | Behaviour Type
PI | Play the Introduction | Musical
WI | Waiting for the end of the Introduction | Musical
CA | Compose an Accompaniment | Musical
CS | Compose a Solo | Musical
PA | Playing the Accompaniment | Musical
PS | Playing the Solo | Musical
WS | Wait for the end of the first Section | Musical
CPL | Cue: Pass the Lead to another musician | Communicative
CAL | Cue: Accept the Leadership role | Communicative
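The classification in Table 5.1 can also be expressed as a small data type. The following is a minimal sketch in Java; the type and constant names are illustrative only and do not come from the thesis implementation.

    // Hypothetical encoding of the actions in Table 5.1.
    public enum EnsembleAction {
        PI("Play the Introduction", BehaviourType.MUSICAL),
        WI("Waiting for the end of the Introduction", BehaviourType.MUSICAL),
        CA("Compose an Accompaniment", BehaviourType.MUSICAL),
        CS("Compose a Solo", BehaviourType.MUSICAL),
        PA("Playing the Accompaniment", BehaviourType.MUSICAL),
        PS("Playing the Solo", BehaviourType.MUSICAL),
        WS("Wait for the end of the first Section", BehaviourType.MUSICAL),
        CPL("Cue: Pass the Lead to another musician", BehaviourType.COMMUNICATIVE),
        CAL("Cue: Accept the Leadership role", BehaviourType.COMMUNICATIVE);

        public enum BehaviourType { MUSICAL, COMMUNICATIVE }

        private final String description;
        private final BehaviourType type;

        EnsembleAction(String description, BehaviourType type) {
            this.description = description;
            this.type = type;
        }

        public String getDescription() { return description; }
        public BehaviourType getType() { return type; }
    }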

5.1.2 Song 1

In jazz, a song is based on a chord progression, which is a sequence of musical chords that support the song's melody and guide improvised solos. The chord progression is often divided into sections, which together represent the "form" of the song. For example, consider a song of twenty-four measures where the first eight measures are the same as the next eight, but the last eight are different. This song has three eight-measure sections with the form AAB: Section A is a sequence of chords that will be repeated a second time, followed by a different sequence of chords represented by Section B. When performing a jazz song, "playing a chorus" means playing one time through every chord in the song's form. A jazz performance typically involves several choruses played one after another.

Song 1 is based on a common jazz chord progression. Four Musebots ("sax", "piano", "bass", and "drums") were set up to perform. The agent playing the saxophone was designated by a user to be the leader. The leader was provided with several song attributes, including tempo, song structure, time signature, and introduction length; a sketch of one possible encoding of these attributes follows the list.

1. The structure of the song is based on the form AAB. The chords Em7, A7, Dm7, and G7 make up section A, while the chords CM7, CM7 make up section B.
2. The tempo is 120 bpm.
3. The introduction is six measures long.
4. The time signature is 4/4.
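The following is a minimal sketch in Java of how these song attributes might be represented when they are handed to the leader; the class, field, and method names are hypothetical and are not taken from the thesis code.

    import java.util.Arrays;
    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;

    // Hypothetical container for the attributes a user provides to the leader.
    public class SongStructure {
        final String form;                          // e.g., "AAB"
        final Map<Character, List<String>> chords;  // chord sequence for each section letter
        final int tempoBpm;
        final int introMeasures;
        final String timeSignature;

        SongStructure(String form, Map<Character, List<String>> chords,
                      int tempoBpm, int introMeasures, String timeSignature) {
            this.form = form;
            this.chords = chords;
            this.tempoBpm = tempoBpm;
            this.introMeasures = introMeasures;
            this.timeSignature = timeSignature;
        }

        // Builds the attributes listed above for Song 1.
        static SongStructure song1() {
            Map<Character, List<String>> chords = new LinkedHashMap<>();
            chords.put('A', Arrays.asList("Em7", "A7", "Dm7", "G7"));
            chords.put('B', Arrays.asList("CM7", "CM7"));
            return new SongStructure("AAB", chords, 120, 6, "4/4");
        }
    }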

The four Musebots were launched sequentially, generating a musical composition, which readers can access through the following link: https://tinyurl.com/simulationsong1

5.1.2.1 Results

Tables 5.2-5.8 show the roles of each musician during the progression of the song, as well as the different actions performed by the agents.

Table 5.2: Introduction of Song 1. The leader waits while the piano plays an introduction, and all three accompanists compose their parts.

Track counter | Section | Saxophone (sax) | Piano (piano) | Bass (bass) | Drums (drums)
00:03 - 00:14 | Introduction | Leader: WI | Accompanist: PI and CA | Accompanist: CA and WI | Accompanist: CA and WI

Table 5.3: First Chorus (AAB) of Song 1. The leader (sax) waits until the end of the first section, because it needs to give the CA time to determine the current section (the first A) being played by the accompanists. The leader then starts to play the solo at the beginning of the second section A. Finally, at section B, the "sax" passes the leadership to the "bass". All three accompanists play their parts during this chorus.

Track counter | Section | Saxophone (sax) | Piano (piano) | Bass (bass) | Drums (drums)
00:15 - 00:22 | A | Leader: CS and WS | Accompanist: PA | Accompanist: PA | Accompanist: PA
00:23 - 00:30 | A | Leader: PS | Accompanist: PA | Accompanist: PA | Accompanist: PA
00:31 - 00:34 | B | Leader: PS and CPL | Accompanist: PA and CAL | Accompanist: PA | Accompanist: PA

Table 5.4: Second Chorus (AAB) of Song 1. The new leader ("piano") plays a solo in each section of this chorus and passes the leadership to the "bass" in the last section. The rest of the members play the accompaniments.

Track counter | Section | Saxophone (sax) | Piano (piano) | Bass (bass) | Drums (drums)
00:35 - 00:42 | A | Accompanist: PA | Leader: PS | Accompanist: PA | Accompanist: PA
00:43 - 00:50 | A | Accompanist: PA | Leader: PS | Accompanist: PA | Accompanist: PA
00:51 - 00:54 | B | Accompanist: PA | Leader: PS and CPL | Accompanist: PA and CAL | Accompanist: PA

Table 5.5: Third Chorus (AAB) of Song 1. The "bass" takes its turn as the leader and plays a solo while the accompanists play their parts. At section B of this chorus, the "drums" accepts the role of new leader of the ensemble.

Track counter | Section | Saxophone (sax) | Piano (piano) | Bass (bass) | Drums (drums)
00:55 - 01:02 | A | Accompanist: PA | Accompanist: PA | Leader: PS | Accompanist: PA
01:03 - 01:10 | A | Accompanist: PA | Accompanist: PA | Leader: PS | Accompanist: PA
01:11 - 01:14 | B | Accompanist: PA | Accompanist: PA | Leader: PS and CPL | Accompanist: PA and CAL

Table 5.6: Fourth Chorus (AAB) of Song 1. The new leader ("drums") plays a solo during the entire chorus, and the accompanists support it by playing their parts.

Track counter | Section | Saxophone (sax) | Piano (piano) | Bass (bass) | Drums (drums)
01:15 - 01:22 | A | Accompanist: PA | Accompanist: PA | Accompanist: PA | Leader: PS
01:23 - 01:30 | A | Accompanist: PA | Accompanist: PA | Accompanist: PA | Leader: PS
01:31 - 01:34 | B | Accompanist: PA | Accompanist: PA | Accompanist: PA | Leader: PS

Table 5.7: Fifth Chorus (AAB) of Song 1. During this chorus, the "drums" continues playing a solo until the end of the second section A. At this point in the song, the "sax" takes back the leadership and the "drums" returns to playing the accompaniment, while the rest of the agents ("bass" and "piano") play their parts.

Track counter | Section | Saxophone (sax) | Piano (piano) | Bass (bass) | Drums (drums)
01:35 - 01:42 | A | Accompanist: PA | Accompanist: PA | Accompanist: PA | Leader: PS
01:43 - 01:50 | A | Accompanist: PA and CAL | Accompanist: PA | Accompanist: PA | Leader: PS and CPL
01:51 - 01:54 | B | Leader: CS | Accompanist: PA | Accompanist: PA | Accompanist: PA

Table 5.8: Sixth Chorus (AAB) of Song 1. This is the last chorus of Song 1. The "sax" plays a solo for a single section and then passes the leadership to the "piano," which plays the next solo during the rest of the chorus.

Track counter | Section | Saxophone (sax) | Piano (piano) | Bass (bass) | Drums (drums)
01:55 - 02:02 | A | Leader: PS and CPL | Accompanist: PA and CAL | Accompanist: PA | Accompanist: PA
02:03 - 02:10 | A | Accompanist: PA | Leader: PS | Accompanist: PA | Accompanist: PA
02:11 - 02:15 | B | Accompanist: PA | Leader: PS | Accompanist: PA | Accompanist: PA

5.1.2.2 Analysis

We can see in Song 1 how the simulation replicates the basic patterns in the jazz process described by Bastien and Hostager. It begins with "piano" playing the introduction, followed by the first soloist change, which occurs during the shift from the first chorus to the second chorus when "sax" requests "piano" to play the next solo (Tables 5.3-5.4). The next change occurs at the transition from the second chorus to the third chorus when "piano" passes the leadership to "bass" (Tables 5.4-5.5). Next, "drums" takes the next solo at the beginning of the fourth chorus (Table 5.6). Finally, the lead returns to "sax" at the end of the fifth chorus (Table 5.7).

5.1.3 Song 2

In this simulation, the system performed a twelve-measure blues in C major. We only used three agents ("sax", "piano", and "drums"). The agent "sax" was designated to be the leader of the ensemble. Furthermore, the leader was provided with the following information.

1. The form of the song is based on a single section (A), which contains the following sequence of chords: C7, F7, C7, C7, F7, F7, C7, C7, G7, F7, C7, C7.

2. The tempo is 170 bpm.

3. The introduction is four measures long.

4. The time signature is 4/4.

The three Musebots were launched in sequence, producing Song 2, which readers can access through the following link: https://tinyurl.com/simulationsong2.

5.1.3.1 Results

Tables 5.9-5.20 show the roles of each musician during the progression of the song, as well as the different actions performed by the agents.

Table 5.9: Introduction of Song 2. The leader ("sax") waits for the end of the introduction, which is played by the "piano" while all the accompanists compose their parts.

Track counter | Section | Saxophone (sax) | Piano (piano) | Drums (drums)
00:02 - 00:06 | Introduction | Leader: WI | Accompanist: PI and CA | Accompanist: CA and WI

Table 5.10: First Chorus (A) of Song 2. The leader ("sax") composes a solo and waits for the end of the chorus, while the accompanists play their parts.

Track counter | Section | Saxophone (sax) | Piano (piano) | Drums (drums)
00:07 - 00:23 | A | Leader: CS and WS | Accompanist: PA | Accompanist: PA

Table 5.11: Second Chorus (A) of Song 2. The "sax" starts to play a solo and the rest of the members of the ensemble play the accompaniments.

Track counter | Section | Saxophone (sax) | Piano (piano) | Drums (drums)
00:24 - 00:41 | A | Leader: PS | Accompanist: PA | Accompanist: PA

Table 5.12: Third Chorus (A) of Song 2. The "sax" finishes playing its solo at second fifty-five of section A and passes the leadership to the "piano". The "piano" and the "drums" play the accompaniments during the entire chorus.

Track counter | Section | Saxophone (sax) | Piano (piano) | Drums (drums)
00:42 - 00:57 | A | Leader: PS and CPL | Accompanist: PA and CAL | Accompanist: PA

Table 5.13: Fourth Chorus (A) of Song 2. Having accepted the leadership in the previous chorus, the "piano" starts to play a solo at the beginning of section A. The "sax" and "drums" support the new soloist with their accompaniments.

Track counter | Section | Saxophone (sax) | Piano (piano) | Drums (drums)
00:58 - 01:15 | A | Accompanist: PA | Leader: PS | Accompanist: PA

Table 5.14: Fifth Chorus (A) of Song 2. The leader ("piano") continues playing its solo during this chorus, while the accompanists play their parts.

Track counter | Section | Saxophone (sax) | Piano (piano) | Drums (drums)
01:16 - 01:31 | A | Accompanist: PA | Leader: PS | Accompanist: PA

Table 5.15: Sixth Chorus (A) of Song 2. The "piano" passes the leadership to the "drums" and ends its solo when section A finishes. The "drums" then accepts the leadership and finishes its accompaniment, while the "sax" continues playing the accompaniment.

Track counter | Section | Saxophone (sax) | Piano (piano) | Drums (drums)
01:32 - 01:48 | A | Accompanist: PA | Leader: PS and CPL | Accompanist: PA and CAL

Table 5.16: Seventh Chorus (A) of Song 2. The new leader ("drums") starts to play a solo. The "piano" has become an accompanist and supports the "drums" along with the "sax".

Track counter | Section | Saxophone (sax) | Piano (piano) | Drums (drums)
01:49 - 02:05 | A | Accompanist: PA | Accompanist: PA | Leader: PS

Table 5.17: Eighth Chorus (A) of Song 2. The accompanists "sax" and "piano" continue supporting the soloist. The leader plays another section of its inventive solo.

Track counter | Section | Saxophone (sax) | Piano (piano) | Drums (drums)
02:06 - 02:22 | A | Accompanist: PA | Accompanist: PA | Leader: PS

Table 5.18: Ninth Chorus (A) of Song 2. The leader ("drums") continues improvising a solo during this chorus, while the other agents play the accompaniments.

Track counter | Section | Saxophone (sax) | Piano (piano) | Drums (drums)
02:23 - 02:39 | A | Accompanist: PA | Accompanist: PA | Leader: PS

Table 5.19: Tenth Chorus (A) of Song 2. The "drums" plays its last section of improvised solo and passes the leadership to the "sax". The "sax" accepts the leadership and plays its last accompaniment of the song, while the "piano" continues supporting the leader.

Track counter | Section | Saxophone (sax) | Piano (piano) | Drums (drums)
02:40 - 02:56 | A | Accompanist: PA and CAL | Accompanist: PA | Leader: PS and CPL

Table 5.20: Eleventh Chorus (A) of Song 2. The leader ("sax") composes a solo while the rest of the agents play the accompaniments.

Track counter | Section | Saxophone (sax) | Piano (piano) | Drums (drums)
02:56 - 03:13 | A | Leader: CS | Accompanist: PA | Accompanist: PA

5.1.3.2 Analysis

Similarly to Song 1, Song 2 begins with an introduction played by the piano. However, this introduction is seven seconds shorter than that of Song 1, and the solos played in each chorus are longer than those of Song 1. For example, the first solo played by "sax" in Song 2 is twenty-one seconds longer than the first solo played in Song 1. The difference between the lengths of the solos in these songs is caused by the number of chords played in each chorus. Whereas the choruses of Song 1 consist of ten chords, the choruses of Song 2 are composed of twelve chords. The soloists of Song 2 thus play two more chords than the soloists of Song 1. Furthermore, "bass" was not included in Song 2. Despite this, the level of communication remained the same, with each agent taking turns to play a solo, sequentially beginning with "sax" and ending with "drums".

5.1.4 Song 3

In this simulation, the proposed system performed "Sunday," a popular jazz song performed by the jazz quartet in Bastien and Hostager's case study [20]. The agent "sax" was once again designated to be the first agent leading the ensemble and was provided with the following song attributes.

1. The form of the song: AABA. Section A of the form is represented by the following sequence of chords: C, C, C7, G7, A7, Dm7, G7, C, C. Section B of the form is represented by the following sequence of chords: E7, E7, A7, A7, D7, D7, G7, G7.

2. The tempo is 120 bpm.

3. The introduction is four measures long.

4. The time signature is 4/4.

The four Musebots were launched sequentially, generating a musical composition. Readers can access it through the following link: https://tinyurl.com/simulationsong3.

5.1.4.1 Results

Tables 5.21-5.25 show the roles of each musician during the progression of the song and the different actions performed by the agents.

Table 5.21: Introduction of Song 3. The leader ("sax") waits for the end of the introduction, which is played by the "piano" while all the accompanists compose their parts.

Track counter | Section | Saxophone (sax) | Piano (piano) | Bass (bass) | Drums (drums)
00:04 - 00:10 | Introduction | Leader: WI | Accompanist: PI and CA | Accompanist: CA | Accompanist: CA

Table 5.22: First Chorus (AABA) of Song 3. In the first section, the "sax" composes a solo and waits until this section is finished. The leader then starts to play the solo at the beginning of the second section. Next, at section B, the "sax" passes the leadership to the "piano" and afterwards becomes an accompanist, playing its part from the beginning of the last section; at that moment, the "piano" becomes the new leader and starts to play a solo. The "bass" and "drums" play their parts during the entire chorus without any change in their roles.

Track counter | Section | Saxophone (sax) | Piano (piano) | Bass (bass) | Drums (drums)
00:11 - 00:27 | A | Leader: CS and WS | Accompanist: PA | Accompanist: PA | Accompanist: PA
00:27 - 00:43 | A | Leader: PS | Accompanist: PA | Accompanist: PA | Accompanist: PA
00:44 - 00:59 | B | Leader: PS and CPL | Accompanist: PA and CAL | Accompanist: PA | Accompanist: PA
01:00 - 01:14 | A | Accompanist: PA | Leader: PS | Accompanist: PA | Accompanist: PA

Table 5.23: Second Chorus (AABA) of Song 3. In this chorus, the "piano" plays a solo during the first two sections; afterward, it passes the leadership to the "sax" and becomes an accompanist. The "sax" accepts the leadership during the second section and plays a solo during the next two sections (B and A); after that, the "sax" passes the leadership back to the "piano". The "bass" and "drums" continue this chorus without any change in their roles.

Track counter | Section | Saxophone (sax) | Piano (piano) | Bass (bass) | Drums (drums)
01:15 - 01:31 | A | Accompanist: PA | Leader: PS | Accompanist: PA | Accompanist: PA
01:32 - 01:47 | A | Accompanist: PA and CAL | Leader: PS and CPL | Accompanist: PA | Accompanist: PA
01:47 - 02:03 | B | Leader: PS | Accompanist: PA | Accompanist: PA | Accompanist: PA
02:04 - 02:19 | A | Leader: PS and CPL | Accompanist: PA and CAL | Accompanist: PA | Accompanist: PA

Table 5.24: Third Chorus (AABA) of Song 3. The "piano" plays a solo during the first three sections; it then becomes an accompanist and lets the "bass" become the new leader of the ensemble. The "bass" starts to play a solo at the beginning of the last section, while the "sax" and the "drums" play their parts as accompanists.

Track counter | Section | Saxophone (sax) | Piano (piano) | Bass (bass) | Drums (drums)
02:20 - 02:35 | A | Accompanist: PA | Leader: PS | Accompanist: PA | Accompanist: PA
02:36 - 02:51 | A | Accompanist: PA | Leader: PS | Accompanist: PA | Accompanist: PA
02:52 - 03:07 | B | Accompanist: PA | Leader: PS and CPL | Accompanist: PA and CAL | Accompanist: PA
03:08 - 03:22 | A | Accompanist: PA | Accompanist: PA | Leader: PS | Accompanist: PA

Table 5.25: Fourth Chorus (AABA) of Song 3. The leader ("bass") plays a solo during the first two sections. It then passes the leadership to the "drums" and changes its role to accompanist. The "drums" accepts the leadership during the second section and then plays a solo during the rest of the chorus. The "sax" and the "piano" continue this chorus without any changes in their roles.

Track counter | Section | Saxophone (sax) | Piano (piano) | Bass (bass) | Drums (drums)
03:23 - 03:39 | A | Accompanist: PA | Accompanist: PA | Leader: PS | Accompanist: PA
03:40 - 03:55 | A | Accompanist: PA | Accompanist: PA | Leader: PS and CPL | Accompanist: PA and CAL
03:56 - 04:11 | B | Accompanist: PA | Accompanist: PA | Accompanist: PA | Leader: PS
04:12 - 04:27 | A | Accompanist: PA | Accompanist: PA | Accompanist: PA | Leader: PS

5.1.4.2 Analysis

Song 3 shows a different pattern in the communication behaviour, specifically in the second chorus, when "piano" passed the leadership to "sax" instead of passing it to "bass," as in the first two simulations (Table 5.23). Moreover, after "sax" played its second solo, it returned the leadership to "piano," which performed its second solo and decided to pass the leadership to "bass" (Table 5.24). This characteristic of the communication behaviour was generated by a modification in the decision simulator of the MA's "Request solo" behaviour.

5.2 Summary

In this chapter, we evaluated the proposed AMI system. We presented the results of three different simulations, each arising from different input values. These simulations are classified in the chapter as Song 1, Song 2, and Song 3, and their results are described through various tables in Section 5.1.

From the results, we can see that the agents communicate with each other during different moments of each song and that each change happening in the song is the result of communicative acts between the agents. This reveals the high degree of autonomy of each Musebot during the progression of the song. This behaviour complies with the first two abilities of our vision of an ideal AMI system (see Section 2.2).

We can also see how the different songs' attributes produce different outcomes, specifically affecting the length of the solos and the style of the music. For instance, solos played in Song 2 are longer than those played in Song 1. The possibility of configuring the system with song structures that represent different genres of music allows our system to support the performance of diverse musical styles (the third ability of our vision of an ideal AMI system). Furthermore, the results of these simulations show how agents are able to change roles during the progression of the song, enabling them to either lead the ensemble or support the leader with accompanying compositions. Therefore, we have successfully implemented the four abilities described in our vision of an ideal AMI system.

Chapter 6

Discussion and future work

Artificial Musical Intelligence (AMI) involves a combination of different research areas, including AI, music, and mathematics, among others. In this dissertation, we proposed an AMI system that integrates some advances in the subareas of MuMe. We reviewed several generative music systems that have provided promising results in these subareas, but found all of them to lack certain necessary attributes of the ideal AMI system that we envision.

When performing, an AMI system must be autonomous (e.g., independent of human intervention during performance), communicative (e.g., able to exchange musical ideas with the members of an ensemble), musically diverse (e.g., able to play different styles of music), and adaptable enough to perform either as a leader or as an accompanist.

To fulfill each of these capacities, we chose to extend the previous MuMe work of the Musebot Ensemble platform. Certain mechanisms of the Musebot Ensemble platform (e.g., the musical multi-agent system architecture) helped us to achieve our goal of integrating algorithmic composition, live algorithms, and musical multi-agent systems. However, different adjustments to the platform were needed to achieve the requirements stated in our problem formulation, such as maintaining coordination between Musebots and ensuring their awareness of what the rest of the members in the ensemble are doing during the progression of a song. To do so, we enhanced the communication mechanisms of the Musebot Ensemble platform.

The Musebot Ensemble platform's original methods of communication required the implementation of a "musebot conductor," the main objective of which is to initiate the performance of the ensemble and pass messages with musical parameters between Musebots to coordinate their musical activities. The Musebot conductor needed to be running or the Musebot Ensemble would not be able to perform, and the only direct communication supported by each Musebot was with the Musebot Conductor.

6.1 Benefits and Limitations

We have proposed a new architecture for the Musebot Ensemble that seeks to achieve the goal of integration across MuMe subareas and to extend the capabilities of the platform. Compared to previous approaches, our architecture offers certain benefits. First, by extending the Musebot Ensemble platform rather than attempting to replace it, we have provided the system with independence from the Musebot conductor. That is, Musebots no longer rely on the conductor to communicate their musical ideas.

We still support some of the Musebot conductor's functionalities to maintain compatibility with the Musebot Ensemble platform. For instance, the conductor can still change the speed of a song by sending a message to the Musebots containing a parameter for tempo. Furthermore, the agents in the ensemble are capable of performing even if the Musebot conductor is not running, which we achieved by implementing standardized protocols (the FIPA Contract Net Interaction Protocol).

Bown, Carey, and Eigenfeldt [18] claim that there is no need for conversations between the agents in the Musebot Ensemble, as everything can be handled via the passing of messages through the conductor. However, conversations based on interactive protocols allow Musebots to negotiate, coordinate, and plan autonomously in a peer-to-peer fashion, which is a closer representation of how human musicians perform. Additionally, implementing this interaction protocol allows the agents to speak the same communication language (FIPA-ACL), which provides the system with considerable flexibility when the agents need to share information. This is useful, for example, when a leader wants to pass a song structure composed of different parameters, including one or two lists of chords. These parameters can be encoded and sent in a single message by a leader to its interlocutor. The interlocutor (the accompanist) will receive and decode this information in a small number of simple steps. The communication is quick and clean considering the amount of information that passes through a single message. Furthermore, while the original Musebot Ensemble passes information continually during an entire performance, our communication architecture transmits information only at the precise moment when it is required.

Allowing the system to share these song structures once before a performance not only simplifies the communication architecture, but also provides the system with the ability to perform different musical styles. While we based our evaluation on the simulation of jazz compositions, other genres of music could be generated by providing parameters related to a specific musical style in the structure of a song, as well as by implementing certain musical algorithms in the CA (e.g., a new implementation of Algorithmic Composition in the behaviour represented by the "Compose" state; see Section 4.1.1.2). Such an effort remains as future work.

The second benefit offered by our architecture is that it is able to support cross-compatibility with different implementations of the platform. For example, given an ensemble made up of Musebots defined using our architecture, the addition of an arbitrary other Musebot into the ensemble (i.e., one which does not implement our architecture) should result in as viable a performance as the Musebot Ensemble platform allows.
However, since the new Musebot's abilities to communicate with the others would be reduced (because it lacks our architecture), the resulting ensemble performance might be impaired. Testing this hypothesis with a variety of different Musebots remains a task for future work.
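To make the earlier description of passing a song structure in a single message concrete, the following is a minimal sketch assuming the JADE framework and its FIPA-ACL support; the receiver name "piano", the string encoding of the content, and the class name are illustrative assumptions rather than the thesis code.

    import jade.core.AID;
    import jade.core.Agent;
    import jade.domain.FIPANames;
    import jade.lang.acl.ACLMessage;

    // Hypothetical leader agent that announces a song structure to an accompanist.
    public class LeaderAgent extends Agent {

        @Override
        protected void setup() {
            // Encode the whole song structure in one payload; a serialized object
            // sent with setContentObject(...) would also work.
            String songStructure =
                    "form=AAB;A=Em7,A7,Dm7,G7;B=CM7,CM7;tempo=120;intro=6;time=4/4";

            ACLMessage cfp = new ACLMessage(ACLMessage.CFP);
            cfp.setProtocol(FIPANames.InteractionProtocol.FIPA_CONTRACT_NET);
            cfp.addReceiver(new AID("piano", AID.ISLOCALNAME));
            cfp.setContent(songStructure);
            send(cfp);
        }
    }

In this sketch, the accompanist decodes the content on arrival, which corresponds to the "number of simple steps" mentioned above; how the content is actually structured in our system is described in Chapter 4.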

[Figure 6.1 is a diagram. It shows the Musebot Bass (accompanist), whose state is "calculate how much time is left to compose and join the soloist at the beginning of the next section", and the Musebot Piano (leader), whose state is "playing the solo at section A". Their Synchronizer Agents request and provide the current section and the form (AAB) in five numbered steps.]

Figure 6.1: Synchronization process of two agents in the Musebot Ensemble. The Musebot Bass has just finished leading and is ready to support the Musebot Piano, but it needs to ask its Synchronizer Agent for the current section being played by the Musebot Piano.

With respect to the number of Musebots running in the ensemble, certain synchronization issues were discovered when the system was running with only two agents. The synchronization problems originated when a Musebot that had been leading the ensemble finished its solo and prepared to become an accompanist. For example, consider the process shown in Figure 6.1. First, the bass requests the current section being played from its own SA (see step 1 in Figure 6.1). Second, the bass' SA requests this information from the piano's SA (see step 2 in Figure 6.1). Third, the piano's SA obtains the current section and provides this information to the bass' SA (see step 3 in Figure 6.1). Finally, the bass' SA passes the current section to the accompanist (see step 4 in Figure 6.1).

Once the bass received this information, it could calculate the time left in the current section by using the parameters encoded in the content of the message (including the length of the section and the time elapsed between the moment that the piano started to play the section and the moment that the bass received the information). It could then use the time left to compose the accompaniment and prepare to join the soloist at the beginning of the following section. However, the remaining time of the current section played by the soloist could at times be too short, and the accompanist would not have enough time to play the accompaniment at the right moment. Thus, by the time the leader finished its solo and passed leadership back to the accompanist, neither agent would know the current section of the song, which would lead to an infinite loop wherein both asked each other for the current section of the song without receiving a correct answer.

Such issues reveal the complex nature of developing a generative music system that performs in real time. Numerous related works have avoided such problems by saving their generated compositions into files that can be listened to after a simulation has finished. We instead dealt with this problem by adding a third Musebot to the ensemble. An ensemble with three Musebots will always have an agent that knows the current section of the song, which removes the uncertainty created in the two-Musebot ensemble. An alternative solution to this issue would be to compose and play smaller groups of measures instead of complete sections. This approach remains to be tested in future work.

While the interactive protocols proposed in this dissertation offer many benefits, there were certain cases in which they could not be implemented exactly according to their specification. For instance, sometimes the agents needed to put an ongoing conversation on hold to request information from another agent that was needed to complete a musical task. The problem with interactive protocols such as FIPA Contract Net and FIPA Query is that it is not possible to interrupt a sequence of messages without experiencing undesirable results. In such cases, we replaced the interactive protocols with custom protocols that allowed the agents to maintain multiple conversations simultaneously. Figure 6.2 shows an example of a custom protocol in a finite state machine. In this example, before the Composer Agent (CA) could confirm to the Musician Agent (MA) that it was going to compose a solo, the CA needed to receive the current section of the song from the Synchronizer Agent (SA) and only then enter the "Compose" state.
As we can see in this example, the CA holds conversations with the MA and the SA simultaneously. However, one disadvantage of using a custom protocol is that we were not able to catch possible communication problems within the network. For example, the speaker does not get a notification when the interlocutor fails to respond.

A common challenge experienced by researchers of MuMe concerns the evaluation of the effectiveness of their music generation systems. Delgado, Fajardo, and Molina-Solana [99] state that the assessment of any musical work is a complex task based on subjective individual opinions. MuMe has a dedicated area of research that specifically studies problems related to the evaluation of computational creativity. Often, people will find value in a generative music system based on the quality of its musical compositions and their similarity to a human musician's work. However, in this dissertation, we were not interested in generating musical compositions that are indistinguishable from those of humans, but rather focused on modeling aspects of musicians' behaviors in a collaborative environment. As we demonstrated in Chapter 5, our proposed AMI system succeeded in this task.

[Figure 6.2 is a diagram showing the finite state machines of a Musebot's three internal agents: the Musician Agent (MA), the Composer Agent (CA), and the Synchronizer Agent (SA). The recoverable states include "Request solo", "Request the section", "Wait for the section", "Confirm to MA", "Compose", "Play", and END.]

Figure 6.2: Interaction between three internal agents through custom protocols.
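To illustrate how a single behaviour can keep such simultaneous conversations apart, here is a minimal sketch assuming the JADE framework; the conversation identifiers "request-solo" and "current-section" and the class name are hypothetical, not those used in the implementation.

    import jade.core.behaviours.CyclicBehaviour;
    import jade.lang.acl.ACLMessage;
    import jade.lang.acl.MessageTemplate;

    // Hypothetical behaviour of the CA that interleaves two conversations.
    public class ComposerConversations extends CyclicBehaviour {

        // Separate templates keep the two custom-protocol conversations apart.
        private final MessageTemplate fromMusician =
                MessageTemplate.MatchConversationId("request-solo");
        private final MessageTemplate fromSynchronizer =
                MessageTemplate.MatchConversationId("current-section");

        @Override
        public void action() {
            ACLMessage musicianMsg = myAgent.receive(fromMusician);
            if (musicianMsg != null) {
                // Handle the MA's request to compose a solo...
            }
            ACLMessage sectionMsg = myAgent.receive(fromSynchronizer);
            if (sectionMsg != null) {
                // Handle the SA's reply carrying the current section...
            }
            if (musicianMsg == null && sectionMsg == null) {
                block(); // nothing pending on either conversation
            }
        }
    }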

6.2 Future Work

In this section, we describe possible further directions for our work. We believe that our proposed approach offers a variety of opportunities for those interested in the various subareas of MuMe. We summarize these opportunities in the following sections.

6.2.1 Towards Algorithmic Composition

Our Composer Agent is the main entry point for the application of algorithmic composition techniques. In the proposed AMI system, the improvised melodies that are generated consist of very simple musical passages based on variations of the same set of rhythms. However, the Composer Agent offers a potential avenue for experimenting with any AI technique that is used to compose music. It would be interesting to implement different algorithmic composition techniques, such as artificial neural networks, in the Composer Agent to generate more substantial musical compositions, perhaps making them closer to those composed by talented musicians.
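As a sketch of what such an integration point might look like (the interface name and method signature are hypothetical and not part of the thesis code), an algorithmic-composition technique could be plugged into the Composer Agent behind a small interface:

    import java.util.List;

    // Hypothetical extension point: any algorithmic-composition technique
    // (rule-based, evolutionary, neural, ...) could implement this interface
    // and be called from the Composer Agent's "Compose" state.
    public interface MelodyGenerator {

        // Returns a melody (e.g., as MIDI note numbers) improvised over the
        // chord sequence of one section at the given tempo.
        List<Integer> generateSolo(List<String> sectionChords, int tempoBpm);
    }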

6.2.2 Towards Live Algorithms

Further development of the Synchronizer Agent could be interesting for researchers of Live Algorithms. Several previous studies examining live algorithms focused on methods that enable a system to synchronize with a human musician as well as mirror musicians' ability to perform. However, it would be interesting to implement the three modules of the Live Algorithms architecture (P, Q, and F) proposed by Blackwell in the proposed AMI system. The F module, which, according to Blackwell, is directly concerned with different elements of artificial intelligence, could be particularly fruitful (e.g., the manner in which agents can reason about their actions and base their decisions on specific judgments). We propose allowing the Musebots to learn from and be influenced by each other's compositions, as well as to judge how well a Musebot is playing; based on this, they could decide which Musebot will perform the next solo.

6.2.3 Towards a Standard Taxonomy in AMI

Several approaches have been taken to build AMI systems, and different theoretical concepts have been used to identify the different components that a generative music system might possess. For instance, Blackwell, Bown, and Young described a Live Algorithm as being capable of four human musician abilities: autonomy, novelty, participation, and leadership [46]. In this dissertation, we discussed our vision of an ideal AMI system (see Chapter 2), which was also capable of four fundamental abilities: autonomy, communicating musical information, playing different musical styles, and fulfilling different roles within a musical ensemble. Additionally, musical multi-agent systems have proposed different communication protocols based on speech acts (see Section 3.3). Although various terms are used among generative music systems to outline their design methodologies, some of these terms share the same characteristics, and researchers in a new sub-area of MuMe could pursue a standard theoretical framework for AMI.

6.2.4 Further Considerations

There is still much to accomplish in the pursuit of Artificial Musical Intelligence and the goals of MuMe. We view our architecture as a step in this direction, since it offers both an extension to existing, related work and a convenient basis for integrating other recent advances. It would be interesting to apply our model of communication in a jazz quartet to different genres of music, both with and without additional Musebots that do not implement our architecture. We also hope to test the performance of our system with a larger number of Musebots, such as a symphony orchestra or a jazz big band. Future work could also involve extending the communication of the CADIA Musebot to support the OSC messages used in the original Musebot Ensemble. This would allow the CADIA Musebot to participate in more varied Musebot Ensemble settings. We hope that our integrated, communication-focused approach will encourage and support further work in Musical Metacreation.

Chapter 7

Conclusion

Understanding the variety of human actions that lead to creative behaviors is a complicated task, yet the study of creative behavior has gained the attention of artists and programmers alike. Together, they hope to create technological artifacts and artificial systems that can act autonomously in disciplines that seem to require high levels of creative intelligence.

In this dissertation, we studied AMI as a medium for integrating current advances in MuMe. First, we outlined our vision of an ideal AMI system and explained how it must be capable of four fundamental abilities: autonomy, musical communication, the performance of alternative musical styles, and the performance of multiple roles. Second, we proposed an approach for enabling direct communication between the autonomous musical agents of the Musebot Ensemble platform. Finally, we situated our work within the musical genre of jazz and showed how our approach can mimic certain human social behaviors (such as communicative modes) in the context of a jazz quartet, through our analysis of three different simulations. Two simulations based on a jazz quartet and one based on a jazz trio were executed using the proposed system. Different musical parameters were put into the system prior to each simulation, generating three different collaborative compositions.

The main novelty of our extended communication architecture for the Musebot Ensemble is the agents' ability to interact without any direct human intervention. Furthermore, as evidenced by the results of our evaluation, we have emulated communicative behaviors similar to those described in Bastien and Hostager's work [20]. Our results show how the FIPA interactive protocols and the custom protocols are comparable to the cues that jazz musicians use during performances to indicate changes in solos. However, in contrast to human improvisation, computers can easily plan and calculate the exact duration of a fragment of a song, as well as pre-plan future interactions.

In this dissertation, we have summarized the current state of the advances of MuMe and explored the integration of algorithmic composition, live algorithms, and musical multi-agent systems. Despite the minor limitations of our presented approach, we believe that we have presented a valuable AMI system that meets the criteria described in our problem formulation. We hope that it can be used as a tool for future research in the area of MuMe.

Bibliography

[1] N. M. Collins, “Towards autonomous agents for live computer music: Realtime machine listening and interactive music systems”, PhD thesis, University of Cambridge, 2006.
[2] L. A. Hiller, “Computer music”, Scientific American, vol. 201, pp. 109–121, 1959.
[3] D. Cope, “Computer modeling of musical intelligence in EMI”, Computer Music Journal, vol. 16, no. 2, pp. 69–83, 1992.
[4] E. R. Miranda, Readings in music and artificial intelligence. Routledge, 2013, vol. 20.
[5] T. R. Besold, M. Schorlemmer, A. Smaill, et al., Computational creativity research: towards creative machines. Springer, 2015.
[6] C. Roads, “Research in music and artificial intelligence”, ACM Computing Surveys (CSUR), vol. 17, no. 2, pp. 163–190, 1985.
[7] A. Eigenfeldt and O. Bown, Eds., Proceedings of the 1st International Workshop on Musical Metacreation (MUME 2012), AAAI Press, 2012.
[8] J. D. Fernández and F. Vico, “AI methods in algorithmic composition: A comprehensive survey”, Journal of Artificial Intelligence Research, vol. 48, pp. 513–582, 2013.
[9] H. Taube, “Stella: Persistent score representation and score editing in common music”, Computer Music Journal, vol. 17, no. 4, pp. 38–50, 1993.
[10] T. Magnusson, “ixi lang: a SuperCollider parasite for live coding”, in Proceedings of the International Computer Music Conference, University of Huddersfield, 2011, pp. 503–506.
[11] ——, “Algorithms as scores: Coding live music”, Leonardo Music Journal, vol. 21, pp. 19–23, 2011.
[12] J. Cage, Imaginary landscape, no. 1: for records of constant and variable frequency, large Chinese cymbal and string piano, 1. Henmar Press, 1960.
[13] T. Blackwell, “Live algorithms”, in Dagstuhl Seminar Proceedings, Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2009.
[14] D. Murray-Rust, A. Smaill, and M. Edwards, “MAMA: An architecture for interactive musical agents”, Frontiers in Artificial Intelligence and Applications, vol. 141, p. 36, 2006.
[15] R. D. Wulfhorst, L. Nakayama, and R. M. Vicari, “A multiagent approach for musical interactive systems”, in Proceedings of the second international joint conference on Autonomous agents and multiagent systems, ACM, 2003, pp. 584–591.

[16] A. Carôt, U. Krämer, and G. Schuller, “Network music performance (NMP) in narrow band networks”, in Audio Engineering Society Convention 120, Audio Engineering Society, 2006.
[17] L. F. Thomaz and M. Queiroz, “A framework for musical multiagent systems”, Proceedings of the SMC, vol. 1, no. 2, p. 10, 2009.
[18] O. Bown, B. Carey, and A. Eigenfeldt, “Manifesto for a Musebot Ensemble: A platform for live interactive performance between multiple autonomous musical agents”, in Proceedings of the International Symposium of Electronic Art, 2015.
[19] A. Eigenfeldt, O. Bown, and B. Carey, “Collaborative Composition with Creative Systems: Reflections on the First Musebot Ensemble”, in Proceedings of the Sixth International Conference on Computational Creativity, ICCC, 2015, p. 134.
[20] D. T. Bastien and T. J. Hostager, “Jazz as a process of organizational innovation”, Communication Research, vol. 15, no. 5, pp. 582–602, 1988.
[21] D. J. Hargreaves, The developmental psychology of music. Cambridge University Press, 1986.
[22] G. Assayag and H. G. Feichtinger, Mathematics and music: A Diderot mathematical forum. Springer Science & Business Media, 2002.
[23] J. Shepherd, Whose music?: A sociology of musical languages. Transaction Publishers, 1977.
[24] L. Harkleroad, The math behind the music. Cambridge University Press, 2006, vol. 1.
[25] D. Cope, “The well-programmed clavier: Style in computer music composition”, XRDS: Crossroads, The ACM Magazine for Students, vol. 19, no. 4, pp. 16–20, 2013.
[26] C. Fry, “Flavors Band: A language for specifying musical style”, Computer Music Journal, vol. 8, no. 4, pp. 20–34, 1984.
[27] J. Sueur, T. Aubin, and C. Simonis, “Seewave, a free modular tool for sound analysis and synthesis”, Bioacoustics, vol. 18, no. 2, pp. 213–226, 2008.
[28] M. Grachten, “JIG: Jazz improvisation generator”, in Proceedings of the MOSART Workshop on Current Research Directions in Computer Music, 2001, pp. 1–6.
[29] S. Parsons and M. Wooldridge, “Game theory and decision theory in multi-agent systems”, Autonomous Agents and Multi-Agent Systems, vol. 5, no. 3, pp. 243–254, 2002.
[30] H. Finnsson and Y. Björnsson, “Simulation-Based Approach to General Game Playing”, in AAAI, vol. 8, 2008, pp. 259–264.
[31] M. A. Boden, “Computer models of creativity”, AI Magazine, vol. 30, no. 3, p. 23, 2009.
[32] A. R. Brown and A. C. Sorensen, “Introducing jMusic”, in Australasian Computer Music Conference, A. R. Brown and R. Wilding, Eds., Queensland University of Technology, Brisbane: ACMA, 2000, pp. 68–76. [Online]. Available: http://eprints.qut.edu.au/6805/.
[33] M. Delorme and J. Mazoyer, Cellular Automata: a parallel model. Springer Science & Business Media, 2013, vol. 460.

[34] K. Nagel and M. Schreckenberg, “A cellular automaton model for freeway traffic”, Journal de physique I, vol. 2, no. 12, pp. 2221–2229, 1992.
[35] D. A. Wolf-Gladrow, Lattice-gas cellular automata and lattice Boltzmann models: an introduction. Springer, 2004.
[36] A. R. Brown, “Exploring rhythmic automata”, in Workshops on Applications of Evolutionary Computation, Springer, 2005, pp. 551–556.
[37] N. Chomsky and M. P. Schützenberger, “The algebraic theory of context-free languages”, Studies in Logic and the Foundations of Mathematics, vol. 35, pp. 118–161, 1963.
[38] J. P. Swain, “The range of musical semantics”, The Journal of Aesthetics and Art Criticism, vol. 54, no. 2, pp. 135–152, 1996.
[39] M. Lanctot, V. Lisý, and M. H. Winands, “Monte Carlo tree search in simultaneous move games with applications to Goofspiel”, in Workshop on Computer Games, Springer, 2013, pp. 28–43.
[40] K. B. Korb and A. E. Nicholson, Bayesian artificial intelligence. CRC Press, 2010.
[41] C. M. Grinstead and J. L. Snell, Introduction to probability. American Mathematical Soc., 2012.
[42] A. Brown, Making Music with Java: An Introduction to Computer Music, Java Programming and the JMusic Library. Lulu.com, 2005.
[43] H. Ishii, A. Mazalek, and J. Lee, “Bottles as a minimal interface to access digital information”, in CHI’01 Extended Abstracts on Human Factors in Computing Systems, ACM, 2001, pp. 187–188.
[44] F. Menenghini, P. Palma, M. Taylor, and A. Cameron, The art of experimental interaction design. Systems Design Limited, 2004, vol. 4.
[45] J. A. Paradiso, K.-y. Hsiao, and A. Benbasat, “Tangible music interfaces using passive magnetic tags”, in Proceedings of the 2001 Conference on New Interfaces for Musical Expression, National University of Singapore, 2001, pp. 1–4.
[46] T. Blackwell, O. Bown, and M. Young, “Live Algorithms: towards autonomous computer improvisers”, in Computers and Creativity, Springer, 2012, pp. 147–174.
[47] J. A. Biles, “GenJam: Evolution of a jazz improviser”, Creative Evolutionary Systems, vol. 168, p. 2, 2002.
[48] R. D. Wulfhorst, L. V. Flores, L. Nakayama, C. D. Flores, L. O. C. Alvares, and R. Viccari, “An open architecture for a musical multi-agent system”, in Proceedings of the Brazilian Symposium on Computer Music, 2001.
[49] P. R. Cohen and C. R. Perrault, “Elements of a plan-based theory of speech acts”, Cognitive Science, vol. 3, no. 3, pp. 177–212, 1979.
[50] S. Poslad, “Specifying protocols for multi-agent systems interaction”, ACM Transactions on Autonomous and Adaptive Systems (TAAS), vol. 2, no. 4, p. 15, 2007.
[51] O. Bown and A. Martin, “Autonomy in music-generating systems”, in Eighth Artificial Intelligence and Interactive Digital Entertainment Conference, 2012, p. 23.

[52] A. Eigenfeldt, “Real-time Composition or Computer Improvisation? A composer’s search for intelligent tools in interactive computer music”, Proceedings of the Electronic Music Studies, vol. 2007, 2007.
[53] A. Eigenfeldt and P. Pasquier, “A realtime generative music system using autonomous melody, harmony, and rhythm agents”, in XIII Internationale Conference on Generative Arts, Milan, Italy, 2009.
[54] P. Pasquier, A. Eigenfeldt, O. Bown, and S. Dubnov, “An introduction to musical metacreation”, Computers in Entertainment (CIE), vol. 14, no. 2, p. 2, 2016.
[55] A. R. Brown and T. Gifford, “Prediction and proactivity in real-time interactive music systems”, in Musical Metacreation: Papers from the 2013 AIIDE Workshop, AAAI Press, 2013, pp. 35–39.
[56] G. Solis and B. Nettl, Musical improvisation: Art, education, and society. University of Illinois Press, 2009.
[57] P. Toiviainen, “Modeling the target-note technique of bebop-style jazz improvisation: An artificial neural network approach”, Music Perception: An Interdisciplinary Journal, vol. 12, no. 4, pp. 399–413, 1995.
[58] G. Papadopoulos and G. Wiggins, “A genetic algorithm for the generation of jazz melodies”, Proceedings of STEP, vol. 98, 1998.
[59] B. Baird, D. Blevins, and N. Zahler, “Artificial intelligence and music: Implementing an interactive computer performer”, Computer Music Journal, vol. 17, no. 2, pp. 73–79, 1993.
[60] R. L. De Mantaras and J. L. Arcos, “AI and music: From composition to expressive performance”, AI Magazine, vol. 23, no. 3, p. 43, 2002.
[61] J. Pressing, “Free jazz and the avant-garde”, The Cambridge Companion to Jazz, pp. 202–216, 2002.
[62] I. Monson, Saying something: Jazz improvisation and interaction. University of Chicago Press, 2009.
[63] H. Gardner, Multiple intelligences: New horizons. Basic Books, 2006.
[64] R. K. Sawyer, “Music and conversation”, Musical Communication, vol. 45, p. 60, 2005.
[65] A. Williamon and J. W. Davidson, “Exploring co-performer communication”, Musicae Scientiae, vol. 6, no. 1, pp. 53–72, 2002.
[66] W. F. Walker, “A computer participant in musical improvisation”, in Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems, ACM, 1997, pp. 123–130.
[67] F. A. Seddon, “Modes of communication during jazz improvisation”, British Journal of Music Education, vol. 22, no. 01, pp. 47–61, 2005.
[68] K. Agres, J. Forth, and G. A. Wiggins, “Evaluation of musical creativity and musical metacreation systems”, Computers in Entertainment (CIE), vol. 14, no. 3, p. 3, 2016.
[69] M. Edwards, “Algorithmic composition: computational thinking in music”, Communications of the ACM, vol. 54, no. 7, pp. 58–67, 2011.
[70] G. M. Rader, “A method for composing simple traditional music by computer”, Communications of the ACM, vol. 17, no. 11, pp. 631–638, 1974.
[71] C. Roads and P. Wieneke, “Grammars as representations for music”, Computer Music Journal, pp. 48–55, 1979.

[72] C. Rueda, G. Assayag, and S. Dubnov, “A concurrent constraints factor oracle model for music improvisation”, in XXXII Conferencia Latinoamericana de Informática (CLEI 2006), 2006, pp. 1–1.
[73] R. M. Keller and D. R. Morrison, “A grammatical approach to automatic improvisation”, in Proceedings of the Fourth Sound and Music Conference, SMC, 2007.
[74] D. Morris, I. Simon, and S. Basu, “Exposing Parameters of a Trained Dynamic Model for Interactive Music Creation”, in AAAI, 2008, pp. 784–791.
[75] S. Perchy and G. Sarria, “Musical Composition with Stochastic Context-Free Grammars”, in 8th Mexican International Conference on Artificial Intelligence (MICAI 2009), 2009.
[76] I. Wallis, T. Ingalls, E. Campana, and J. Goodman, “A rule-based generative music system controlled by desired valence and arousal”, in Proceedings of the 8th International Sound and Music Computing Conference (SMC), 2011, pp. 156–157.
[77] A. O. de la Puente, R. S. Alfonso, and M. A. Moreno, “Automatic composition of music by means of grammatical evolution”, in ACM SIGAPL APL Quote Quad, ACM, vol. 32, 2002, pp. 148–155.
[78] P. Prusinkiewicz and A. Lindenmayer, The algorithmic beauty of plants. Springer Science & Business Media, 2012.
[79] J. McCormack, “Grammar based music composition”, Complex Systems, vol. 96, pp. 321–336, 1996.
[80] M. Edwards, “An introduction to slippery chicken”, in ICMC, 2012.
[81] S. A. Abdallah and N. E. Gold, “Comparing models of symbolic music using probabilistic grammars and probabilistic programming”, in Proceedings of the 40th International Computer Music Conference (ICMC|SMC|2014), 2014, pp. 1524–1531.
[82] C. A. C. Coello, G. B. Lamont, D. A. Van Veldhuizen, et al., Evolutionary algorithms for solving multi-objective problems. Springer, 2007, vol. 5.
[83] N. Tokui, H. Iba, et al., “Music composition with interactive evolutionary computation”, in Proceedings of the 3rd International Conference on Generative Art, vol. 17, 2000, pp. 215–226.
[84] B. Yegnanarayana, Artificial neural networks. PHI Learning Pvt. Ltd., 2009.
[85] C. V. Goldman, D. Gang, J. S. Rosenschein, and D. Lehmann, “Netneg: A hybrid interactive architecture for composing polyphonic music in real time”, in Proceedings of the International Computer Music Conference, Michigan Publishing, 1996, pp. 133–140.
[86] M. Nishijima and K. Watanabe, “Interactive music composer based on neural networks”, Fujitsu Scientific and Technical Journal, vol. 29, no. 2, pp. 189–192, 1993.
[87] H. Sak, A. W. Senior, and F. Beaufays, “Long short-term memory recurrent neural network architectures for large scale acoustic modeling”, in Interspeech, 2014, pp. 338–342.
[88] A. Nayebi and M. Vitelli, “GRUV: Algorithmic Music Generation using Recurrent Neural Networks”, 2015.
[89] D. Eck and J. Schmidhuber, “Learning the long-term structure of the blues”, Artificial Neural Networks—ICANN 2002, pp. 796–796, 2002.

[90] C.-C. Chen and R. Miikkulainen, “Creating melodies with evolving recurrent neural networks”, in Neural Networks, 2001. Proceedings. IJCNN’01. International Joint Conference on, IEEE, vol. 3, 2001, pp. 2241–2246.
[91] A. Cont, “ANTESCOFO: Anticipatory Synchronization and Control of Interactive Parameters in Computer Music”, in International Computer Music Conference (ICMC), 2008, pp. 33–40.
[92] T. Blackwell, “Swarm music: improvised music with multi-swarms”, in Proceedings of the AISB ’03 Symposium on Artificial Intelligence and Creativity in Arts and Science, AISB, 2003, pp. 41–49.
[93] L. Harrald, “Collaborative Music Making with Live Algorithms”, in Australasian Computer Music Conference, Australasian Computer Music Association, 2007, pp. 59–65.
[94] D. Murray-Rust, A. Smaill, and M. C. Maya, “VirtuaLatin - towards a musical multi-agent system”, in Sixth International Conference on Computational Intelligence and Multimedia Applications (ICCIMA’05), 2005, pp. 17–22. DOI: 10.1109/ICCIMA.2005.59.
[95] C. Alexandraki and D. Akoumianakis, “Exploring new perspectives in network music performance: The DIAMOUSES framework”, Computer Music Journal, vol. 34, no. 2, pp. 66–83, 2010.
[96] M. Sarkar and B. Vercoe, “Recognition and prediction in a network music performance system for Indian percussion”, in Proceedings of the 7th International Conference on New Interfaces for Musical Expression, ACM, 2007, pp. 317–320.
[97] J. L. Oliveira, M. E. Davies, F. Gouyon, and L. P. Reis, “Beat tracking for multiple applications: A multi-agent system architecture with state recovery”, IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 10, pp. 2696–2706, 2012.
[98] S. Dixon, “A lightweight multi-agent musical beat tracking system”, PRICAI 2000 Topics in Artificial Intelligence, pp. 778–788, 2000.
[99] M. Delgado, W. Fajardo, and M. Molina-Solana, “Inmamusys: Intelligent multiagent music system”, Expert Systems with Applications, vol. 36, no. 3, pp. 4574–4580, 2009.
[100] M. Navarro, J. M. Corchado, and Y. Demazeau, “A Musical Composition Application Based on a Multiagent System to Assist Novel Composers”, 2014.
[101] F. Pachet, “Rhythms as emerging structures”, in ICMC, Berlin, Germany, 2000.
[102] P. M. Inc. (2017). Band-in-a-Box, [Online]. Available: http://www.pgmusic.com/ (visited on 01/01/2017).
[103] F. Bt. (2015). ChordPulse, [Online]. Available: http://www.chordpulse.com/ (visited on 01/01/2015).
[104] FIPA, “FIPA ACL message structure specification”, Foundation for Intelligent Physical Agents, http://www.fipa.org/specs/fipa00061/SC00061G.html (30.6.2004), 2002.
[105] F. Bellifemine, “Jade and beyonds”, Presentation at AgentCities Information Day, vol. 1.

[106] F. Bellifemine, A. Poggi, and G. Rimassa, “JADE – A FIPA-compliant agent framework”, in Proceedings of PAAM, London, vol. 99, 1999, p. 33.
[107] M. Barbuceanu and M. S. Fox, “COOL: A Language for Describing Coordination in Multi Agent Systems”, in ICMAS, 1995, pp. 17–24.
[108] R. J. Zatorre, J. L. Chen, and V. B. Penhune, “When the brain plays music: auditory–motor interactions in music perception and production”, Nature Reviews Neuroscience, vol. 8, no. 7, pp. 547–558, 2007.

School of Computer Science Reykjavík University Menntavegur 1 101 Reykjavík, Iceland Tel. +354 599 6200 Fax +354 599 6201 www.ru.is