Musical Style Modification as an Optimization Problem

Frank Zalkow Stephan Brand Bejamin Graf Saarland University SAP SE SAP SE [email protected] [email protected] [email protected]

ABSTRACT style is seen probabilistic, supporting the attempt of this paper to tackle style computationally. Third, in the text- This paper concerns musical style modification on sym- books he mentions, probabilities are used in a very broad bolic level. It introduces a new, flexible method for chang- sense. Words like frequently or sometimes aren’t enough ing a given piece of music so that its style is modified to for the models of this paper. So one cannot rely on text- another one that previously has been learned from a cor- books and has to work through real data. pus of musical pieces. Mainly this is an optimization task Working through data is the guidance of this new ap- with the music’s note events being optimized for different proach for musical style modification, i.e. changing a given objectives relating to that corpus. The method has been piece of music so that its style is modified to another one developed for the use case of pushing existing monophonic that previously has been learned from a data corpus, while pieces of music closer to the style of the outstanding elec- the original piece of music should shine through the new tric bass player Jaco Pastorius. one (section 2). This style modification is seen as an opti- mization problem, where the music is to be optimized re- 1. INTRODUCTION garding different objectives (section 4). The method has Musical style is a quite vague term that is at risk not to be been developed for monophonic melodic bass lines, along captured computationally or even analytically. The musi- with chord annotations—especially for the style of the out- cologist Guido Adler states in his early, grand monograph standing electric player Jaco Pastorius (see sec- about musical style: tion 5 for some results). Due to the nature of the use case, the method is oriented towards monophonic symbolic mu- So [regarding the definition of style] one has to con- tent oneself with periphrases. Style is the center of sic, but most parts of the procedure could easily be ex- artistic approaching and conceiving, it proves itself, tended to polyphonic music as well. In this case the chord as Goethe says, as a source of knowledge about annotations may be even computed in a preceding auto- deep truth of life, rather than mere sensory obser- mated step, so that annotating would not be necessary. vation and replication. [1, p. 5] 1

Music doing small This passage suggests not to approach musical style com- input changes. . . Musicnew putationally. Nearly 80 years after that Leonard B. Meyer expresses a rather opposite view: Multiple Once a musical style has become part of the habit objectives responses of composers, performers, and practiced listeners it may be regarded as a complex system of probabilities. That musical styles are internalized Scoringinput Scoringnew probability systems is demonstrated by the rules of musical grammar and syntax found in textbooks on Music with harmony, counterpoint, and theory in general. [...] better scoring For example, we are told that in the tonal harmonic will replace. . . system of Western music the tonic chord is most often followed by the dominant, frequently by the Figure 1: Rough overview of the modification procedure subdominant, sometimes by the submediant, and so forth. [2, p. 414]

There are a couple of things to notice here: First, Meyer 2. STYLE MODIFICATION PROCEDURE supports the view that style isn’t in the music per se, but only when regarded in relation with other systems. Second, A simple monophonic, symbolic music representation can be seen as a series of I note events N, where each note 1 Translation by the author. Original version: So muß man sich mit Umschreibungen begnugen.¨ Der event ni is a tuple of pitch pi as MIDI note number and Stil ist das Zentrum kunstlerischer¨ Behandlung und Er- duration di in quarter lengths (ql). fassung, er erweist sich, wie Goethe sagt, als eine Erken- ntnisquelle von viel tieferer Lebenswahrheit, als die N =(n1,n2,...nI ) where ni =(pi,di) (1) bloße sinnliche Beobachtung und Nachbildung. In the case of rest, a predefined rest symbol takes the place of note number. Along with the note events a chord an- notation is needed, also being represented as a series of J Copyright: c 2016 Frank Zalkow et al. This is an open-access article chord events C, where each chord event c is a tuple of distributed under the terms of the Creative Commons Attribution License j 3.0 Unported, which permits unrestricted use, distribution, and reproduc- chord symbol sj and duration dj, again in ql. tion in any medium, provided the original author and source are credited. C =(c1,c2,...cJ ) where cj =(sj,dj) (2)

206 Proceedings of the International Computer Music Conference 2016 Such an existing N with a corresponding C should be mod- In our use case pieces of Jaco Pastorius (8 pieces of 2227.5 ified in such a way that it comes closer to a specific style. ql duration in total) form the target corpus whereas pieces C is fixed and won’t be changed. This task is viewed as of Victor Wooten (8 pieces of 3642.75 ql duration in total) a local search, performing a multi-objective optimization form the counterexamples. So both corpora contain elec- for finding a local-optimal N [3]. The main idea is to start tric bass guitar music in the genre of rock, whereas from a piece of music and try out neighbors. If a neigh- both bassists clearly show a different style. The complete bor is better than the original one, according to the objec- list of pieces used is to be found in [4]. tives described in section 4, the neighbor is saved and the In the following section the objectives used in the Pastorius- process is iteratively continued with this new one. In a project are introduced. But a key feature of this method is multi-objective optimization it is not straightforward what its flexibility: If some of its objectives don’t seem appro- is considered better. For ensuring that no objective be- priate or other objectives seem to bee needed for a specific comes worse, Pareto optimality is utilized: A state is only target style, it is easy to leave some out and/or develop new considered better, if all objectives stay the same or increase ones. in value. A neighbor is reached by randomly doing one of the following changes: 4. OBJECTIVES • Changing the pitch pi of a single note event. The 4.1 Feature Classification maximum change interval is a major third upwards or downwards. Before tackling the modification task, a classification task • is to be solved. For that purpose one has to design individ- Changing the duration of two notes di1 and di2 so that the overall duration of the note succession in ual feature extractors, tailored to the specifics of the target this voice stays the same. style. If, like in our case, it is to be assumed that style is not 1 1 • A note ni is divided into multiple ones, having ( /2, /2), only recognizable when the full piece has been played, but 2 1 1 1 1 ( /3, /3) or ( /3, /3, /3) of the original duration di. also on a local level, one can apply windowing of a fixed • Two successive notes ni and ni+1 are joined into a musical duration. single one with duration di + di+1. The feature extractors compute for each window a row vector xi, forming the feature matrix X. A target value yi Ideally one wants to have a manageable amount of neigh- can be assigned to each feature vector, numerically repre- bors so that every one can be tried to find out which one senting the target style with 0 and the style of the coun- is best. Unfortunately the amount is not manageable in terexamples with 1. So for the corpus the target values are this case: For a given state with N note events, there are obvious, forming the column vector Y . Therefore a func- 16N 2 − 4N − 1 possible neighbors (if one assumes that tion f : X → Y is needed. This can be learned from each possibility of splitting/joining notes is always valid). the data corpus, where cross validation assures a certain Because one cannot try out all possibilities, one randomly generalizability. For learning this function we made good tries out neighbors and applies a technique from the tool- experiences with Gradient Tree Boosting, because of its box of metaheuristics: Tabu search, which means stor- good performance as well as the interpretability of the uti- ing the already tried out possibilities for not visiting them lized decision trees. For details regarding Decision Trees again. But even by this only a small fraction of possibilities and Gradient Tree Boosting see [5]. The Gradient Tree will be tried out and one will be trapped in a local optimum. boosting classifier cannot only classify a given window of In most cases many efforts are employed to overcome local music, it also can report a probability of a the window be- optima for finding the global one or at least solutions bet- longing to the target style. This probability forms the first ter than the first local optimum. But here there is a relaxing objective that is to be optimized. property: There is no need for finding the global optimum This objective is the most flexible one because the fea- anyway. Changing a given piece of music for pushing it ture design can be customized for the characteristics that into a specific style should not mean to completely throw the modified music should have. Adopting this method for away the original piece and replace it by the stylistic best new styles goes hand in hand with a considerable work in piece ever possible. It is reasonable doing small modifica- designing appropriate feature extractors. tions until no modification improves the values of all ob- Beside 324 feature dimensions coming from already im- jectives. By that it can be assured that the modified piece plemented feature extractors of music21 [6], multiple fea- retains characteristics of the original one. ture extractors have been customly designed for the mu- sic of the Pastorius project, outputting 86 dimensions. As 3. DATA CORPUS an example a feature extractor should be shortly described that aims to model one of the striking features of the bass Before formulating objectives, a data corpus of examples guitar play of Jaco Pastorius, according Pastorius-expert of music in the target style as well as counterexamples has Sean Malone: to be established. The choice of counterexamples is cru- cial: Ideally a musical corpus would be needed that covers Measure 47 [of Donna Lee, author’s note] contains the first occurrence of what would become a Pas- all music ever possible that doesn’t belong to the target torius trademark: eighth-note triplets in four-note style in a statistically significant way. In practice this isn’t groups, outlining descending seventh-chord arpeg- possible. As an compromise, on the one hand it should be gios. The effect is polyrhythmic – the feeling of a style quite different from the target style, on the other two separate pulses within the bar that don’t share an equal division. [...] As we will see, Jaco utilizes hand it should be not too far off, so that the target style is this same technique (including groupings of five) in sharpened by differentiating it from the counterexamples. many of his solos. [7, p. 6]

Proceedings of the International Computer Music Conference 2016 207 For calculating this feature fJaco, firstly the lengths of Some further adjustments have been made to improve the the sequences of notes with common direction are deter- results: mined. Common direction means either successively as- cending or descending in pitch. The length occurring most • Separate chains have been trained for durations and often is called the most common sequence length fLen. So pitches for getting less sparse probability matrices. this feature is calculated by taking the fractional part of the • Since we assume chord annotations, separate chains quotient of the most common sequence length and the de- can be trained for each chord symbol type. 5 nominator of most common quarter length duration fDur. • Linear interpolation smoothing as well as additive 6 fLen a smoothing has been applied. Both smoothing tech- fJaco = mod 1 , where = fDur, niques counteract the zero-frequency problem, i.e. b b (2) 2 the problem of yet unseen data. The first one means gcd(a, b)=1 to take the average of several order Markov chains See figure 2 for an example, where fLen =4and fDur = (0 ≤ O ≤ 4 in our project) and the latter one to add 1 4 1 /3, so fJaco = /3 mod 1 = /3. a tiny constant term to all probabilities.

8va See figure 3 for a graphical depiction of this objective 3 3 3 47 3 3 3 3 3 function based on real data of Jaco Pastorius. The opti-     mization can be imagined as an hill climbing on this sur-   4     4  face plot.

Figure 2: Bar 47–48 of Pastorius’ Donna Lee. Brackets in- dicate sequences of notes with common direction. 3 0.07 The complete list and description of the features extrac- 0.06 ¯ 0.05 tors used is to be found in [4]. P(p1, p2) 0.04 0.03 0.02 4.2 Markov Classification 0.01 0.00 While the classification described in the previous section 39 44 49 captures general characteristics of the music, depending on 54 59 the feature extractors, Markov chains [8] are an old friend p 64 44 39 2 69 54 49 for music generation that works well on a local level. The 74 64 59 79 74 69 p1 first-order Markov model assumes a fixed set of possible 79 note events S = {s1,s2,...,sK } and assigns a probability Figure 3: Surface of an objective function of two successive akl for each note event sk being preceded by a note event pitches regarding the average smoothed Markov probabil- sl. ity (orders 0 and 1) for the Pastorius-project akl ≡ P (ni = sk|ni−1 = sl) with akl ≥ 0 and K (3) 4.3 Ratio of example/counter-example Markov akl =1 probability l=1 Along with the K×K transition matrix akl the initial prob- The preceding objective only takes the Markov model of abilities πk for note events without predecessor are needed. the target style into account. So it also rewards changes K that make a given note succession more close to general πk ≡ P (n1 = sk) with πk ≥ 0 and πk =1 (4) musical characteristics. To foster the specific characteris- k=1 tics of the target style, the value of this objective is the ratio The assumption of the dependency of a fixed number of O of the average smoothed Markov probability of the target style and average smoothed Markov probability of coun- predecessors is called the order of the Markov chain. First 7 order chains, like in equations 3–5, clearly are an unrealis- terexamples. Figure 4 shows a graphical depiction of this objective tic model, but for music, even when limO→∞ an Oth-order Markov chain wouldn’t hold true, because the probability function. Compare with figure 3 to clarify the big differ- of a note can even be influenced by its successors. 4 Nev- ence between this and the previous objective. ertheless a sufficiently large order O usually captures good 5 See [9] for a general depiction and [10] for one referring to Markov local characteristics. model music generation. 6 For optimizing a given note sequence of length , the See [11] for a comparison of different smoothing techniques. There N I on p. 311 it is argued that additive smoothing generally performs poorly, mean probability is used as objective. but note in the case of this project it is used in combination with in- I terpolation smoothing (related to what is there called Jelinek-Mercer- ¯ P (n1) · i=2 P (ni|ni−1) Smoothing). P (N|akl,πk)= (5) 7 A possibility for improving this objective is applying Bayes’ theo- I rem. This objective could than be reformulated (with target being the 2 gcd means greatest common divisor. This line is just for the purpose target style and counter being the style of the counterexamples): to indicate hat a/b is irreducible in lowest terms. P (N|target)P (target) 3 P (target|N)= and This and the following Pastorius music examples in this paper are P (N) newly typeset with [7] as reference. 4 Think of the climax of a musical phrase that is headed for already P (N)=P (N|target)P (target) + P (N|counter)P (counter) some time before.

208 Proceedings of the International Computer Music Conference 2016 F7 B m7 E 7 A (29)           4     40   4 35 (a) from Pastorius’ Donna Lee P¯Pastorius p ,p 30 ( 1 2) E 7 G m7 C 7 F m7 Wooten 25 P¯ (p1,p2)     20 (94)       15      10   4      5  4 3 0 (b) from Pastorius’ Donna Lee 39 44 49 54 70 ×10−3 59 70 3 p 64 44 39 2 69 54 49 74 64 59 65 2 79 69 65 79 74 p1 1 60 60 Figure 4: Surface of an objective function of two succes- 0 pitch sive pitches regarding the ratio of the example/counter- 55 pitch 55 example-ratio for the average smoothed Markov probabil- correlation −1 ities (order 0 and 1) with Jaco Pastorius being the target 50 50 style and Victor Wooten being the counter-example. −2 45 45 −3 0 2 4 6 8 10 12 14 0 1 2 3 4 5 6 7 8 0 2 4 6 8 10 12 14 time (ql) time (ql) time lag (ql) (c) roll repre- (d) Piano roll repre- (e) Cross-correlation sentation of a. sentation of b. of c and d. 4.4 Time correlations for chord-repetitions          4     4          The main idea of this objective is to capture some large    4           4 3 scale structural similarities within the pieces of the target style. A shallow idea of large scale structure should also (f) Aligning of the two examples with b being circularly shifted to the maximum value of e. be given by the feature classification (subsection 4.1) when the relative position of the window is given as feature. But Figure 5: Illustration of this objective for two examples with in practice large scale structure is something one misses the same intervals between the roots of the chord progres- most in the generated music. So this is an additional ap- sion (when enharmonic change is ignored). The cross- proach to include that. Our approach is based on two as- correlation shows that a circular shifting of 12 ql the two sumptions about repetitions in music: The first assumption examples have the maximal motivic similarity. is, that structure evolves by the absence or presence of rep- etition, e.g. the varied reoccurrence of material already played some time ago. The second assumption is that such 5. RESULTS kind of repetition most probably occurs when the relative In contrast to the classification, where the performance can changes in harmony also repeat. For each such a repeti- be evaluated relatively impartially, such an objective eval- tion both corresponding subparts of the note sequence N uation is not possible for the modification. One can only is converted into a piano roll representation (cf. figure 5c try it out and evaluate the result subjectively. So the as- and d) where, after subtracting the mean and normalizing, sessment heavily depends on the judges, their musical and a circular cross correlation (cf. figure 5e) is performed, so cultural background, taste. As an example, a simple tra- this objective correlates with motivic similarity. When re- ditional melody has been chosen: New Britain, often sung (1) (2) I ferring to both parts as N and N with length , the along with the Christian hymn Amazing Grace. See figure circular cross correlation is given by: 6 for the original version. During the modification pro- I 1 (1) (2) cess, 3398 different changes have been tried out whereby RN (1)N (2) (l)= N N (6) I i (i+l) mod I only 14 ones have been accepted by Pareto optimality. See i=1 figure 7 for the final version. And that’s how it is applied in the optimization: All cross- correlations between parts with related chord-progressions N.C. F F7 B F in the target style corpus are pre-computed. During the op-     timization the cross-correlations of those related sections 3    4 3 are computed, too. Then the inner product between each C C7 F F7 correlation of the piece of music to be optimized and the 6   ones of the target style corpus are computed and the max-       imum one is returned as value of this objective. Since the 3 dot product of correlations of different lengths cannot be B F C7 F 12  computed, all correlations of the same progression length        are brought to the same length by linear interpolation. By  3 that means one ensures that the correlation within the chord progressions in the music to be optimized becomes more Figure 6: New Britain resp. Amazing Grace, original ver- similar to ones of the examples in the target style. sion.

Proceedings of the International Computer Music Conference 2016 209 N.C. F F7 B F of Pastorius, because signatures are less dominant. Nev-       ertheless, it would be interesting to apply Copes inspiring 3   3   4    work to eclectic and erratic styles since Cope developed C 7 7 his methodology sophisticatedly, far exceeding the rough 6 C F F     and basic ideas just touched here.     Cope describes basic categories into which music com-  3 posing programs fall: B F C7 F 12   The approaches [...] include rules-based algorithms,      data-driven-programming, genetic algorithms, neu-   3 ral networks, fuzzy logic, mathematical modeling, and sonification. Although there are other ways to Figure 7: New Britain resp. Amazing Grace, modified ver- program computers to compose music, these seven sion. basic processes represent the most commonly used types. [12, p. 57] The tieing of notes from bars 4 to 5 seems to interrupt If one would like to force the approach of this paper to fall the arc of suspense, thus it seems to be rather unfavourable 8 from a musical viewpoint. The run from bar 9 seems in- into these categories, rules-based programming and data- teresting and coherent. The two eighths in bar 9 as well as driven-programming would fit, but a considerable amount the ones in bar 15 seem to surround the following principal of this work wouldn’t be described. Especially Cope’s note, which lets the music appear quite natural. Especially category of genetic algorithms (GAs) is too specific and the first example can be considered a broadened double could be generalized to metaheuristics, which then would appoggiatura. Some successions seem harsh, like the suc- also fit for the method described here. GAs enable ran- cession of the major to the minor third of the chord in bar dom jumps in the optimization neighborhood whereas the 11 or the melodic succession of a semitone and a tritone in method presented here only takes small steps for ensuring bars 12–13, but such harsh elements are not unusual in the that a local optimum is targeted—which is desired as de- music of Pastorius, see figure 8. Maybe they seem spuri- scribed in section 2. Markov chains have a great tradition ous, since one rather bears in mind the consonant original in music. They found application very early in both com- version, which is still noticeable in the modified version to puter aided music generation [15–17] and in musicologi- a large extent. But from that point of view, the result is cal studies [18, 19]. More recently, researchers from the a successful blending of the original version and the style Sony Computer Science Laboratory rediscovered Markov of Pastorius, even if it is doubtful if Pastorius would have chains by combining them with constraint based program- improvised over New Britain like this. ming, yielding very interesting results [10, 20]. In general, most music generation approaches, including Cope’s and 7 all Markovian methods, are united by the strategy of re- 7 24 E 3 C 3 26          combining elements of an existing musical corpus. Other   4        4 4 3       4 3 attempts that fall under this umbrella are suffix based meth- (a) from Donna Lee (b) from The Days of Wine and ods [21, 22]. The method presented here, however, also Roses enables recombinatorial results, but is not restricted to that (4) 51   because the other optimization objectives also foster the   4     4   4  4 generation of music that is similar to the corpus on a more (c) from Donna Lee (d) from (Used To Be A) Cha abstract level. [22, 23] share the concept of an underlying Cha harmonic progression with the approach presented here. Figure 8: Jaco Pastorius examples, that could be the model [24] is similar in the approach to apply metaheuristics for for harsh results in the modification. a and b: Succes- musical style modification, but is not about learning the sion of the minor on the major third of the chord. c and objectives from a data corpus in the way described here. d: Melodic succession of a semitone and a tritone. Having mentioned some major branches of automatic mu- sic generation, the author recommends a more complete survey [25] for those interested in more branches of this field. 6. RELATED WORK

One of the most prominent researchers, engaging compu- 7. CONCLUSIONS tationally with musical style, especially in symbolic style synthesis, is David Cope [12–14]. In its basic form his In this paper a novel approach of musical style modifica- style replication program EMI (“Experiments in Musical tion has been presented. Basically this is a multi-objective Intelligence”) has to be fed by over a thousand of user in- optimization, where the objectives try to reward similar- put questions. Cope also attempts to overcome this by au- ity to the target style in different respects. By that means tomatically analyzing a corpus of music. Roughly, this in- a given piece of music can be transformed with the aim volves finding what Cope calls signatures, frequently reoc- of pushing it closer to a specified target style. There are curring sequences, assigning functional units to them and plenty of possibilities to built upon this work: making it recombining the corpus with special regards to those func- real-time capable (currently it is not), paying more regard tional signatures. This leads to impressive results for music to the metrical structure (a weakness in the Pastorius-project), with rather homogeneous texture, but it may be less ap- 8 That fits since in Cope’s terminology Markovian processes fall into propriate for more eclectic and erratic styles, like the one this category

210 Proceedings of the International Computer Music Conference 2016 validly evaluating the modification results by empirical ex- [10] F. Pachet and P. Roy, “Markov constraints. Steer- periments. We also started to conceptualize about peda- able generation of Markov sequences,” in Constraints, gogical applications: during the modification process, rea- vol. 16, no. 2, March 2011, pp. 148–172. sons why things has been changed in certain manner, can be tracked—something that could be expanded for auto- [11] S. F. Chen and J. Goodman, in An empirical study of matically explaining style. This shows that the potential of smoothing techniques for language modeling, vol. 13, this approach is far from exhausted. no. 4, 1999, pp. 359–393. [12] D. Cope, Computers and Musical Style. Madison: Acknowledgments A-R Editions, Inc., 1991. This paper is a condensed form of a Master thesis [4], so [13] ——, Virtual Music. Computer Synthesis of Musical thanks are owed to all who supported this thesis: First and Style. Cambridge: MIT Press, 2001. foremost SAP SE who generously supported it in financial terms, by a providing a great working environment as well [14] ——, Computer Models of Musical Creativity. Cam- as with personnel support: Stephan Brand who initiated bridge: MIT Press, 2005. the project and debated about the project from the perspec- tive of a bassist, Pastorius enthusiast and software man- [15] F. Brooks, A. Hopkins, P. Neumann, and W. Wright, ager; Benjamin Graf who was available weekly for discus- “An experiment in musical composition,” in IRE Trans- sions especially about data science issues. Special thanks actions on Electronic Computers, vol. 6, no. 3, 1957, to Prof. Denis Lorrain whose encouraging and far-sighted pp. 175–182. arguments have been inspiring as well as Sean Malone, who contributed by occasional mail contact with benefi- [16] L. Hiller and L. Isaacson, Experimental Music. Com- cial comments from his view as bassist, musicologist and position With an Electronic Computer. New York: Pastorius expert. McGraw-Hill, 1959. [17] I. Xenakis, Formalized Music. Thought and Math- 8. REFERENCES ematics in Composition, 2nd ed., S. Kanach, Ed. Stuyvesant: Pendragon Press, 1992. [1] G. Adler, Der Stil in der Musik. 1. Buch: Prinzipien und Arten des Musikalischen Stils. Leipzig: Breitkopf [18] R. C. Pinkerton, “Information theory and melody,” in &Hartel,¨ 1911. Scientific American, vol. 194, no. 2, 1956, pp. 77–86. [19] J. E. Youngblood, “Style as Information,” in Journal of [2] L. B. Meyer, “Meaning in Music and Information the- Music Theory, vol. 2, no. 1, 1958, pp. 24–35. ory,” in The Journal of Aesthetics and Art Criticism, vol. 15, no. 4, 1957, pp. 412–424. [20] F. Pachet, P. Roy, and G. Barbieri, “Finite-length Markov Processes with Constraints,” in Proceedings of [3] S. Luke, Essentials of Metaheuristics, 2nd ed. the 22nd International Joint Conference on Artificial Raleigh: Lulu, 2013. [Online]. Available: https: Intelligence, vol. 1, 2011, pp. 635–642. //cs.gmu.edu/∼sean/book/metaheuristics/ [21] S. Dubnov, G. Assayag, O. Lartillot, and G. Bejerano, [4] F. Zalkow, “Automated musical style analysis. Compu- “Using Machine-Learning Methods for Musical Style tational exploration of the bass guitar play of Jaco Pas- Modeling,” Computer, vol. 36, no. 10, pp. 73–80, Oct. torius on symbolic level,” Master’s Thesis, University 2003. of Music Karlsruhe, Germany, Sep. 2015. [22] J. Nika and M. Chemillier, “Improtek: integrating har- [5] T. J. Hastie, R. J. Tibshirani, and J. H. Friedman, The monic controls into improvisation in the filiation of elements of statistical learning. Data mining, infer- OMax,” in Proceedings of the 2012 International Com- ence, and prediction, ser. Springer series in statistics. puter Music Conference, 2012, pp. 180–187. New York: Springer, 2009. [23] A. Donze,´ R. Valle, I. Akkaya, S. Libkind, S. A. Se- [6] M. S. Cuthbert, C. Ariza, and L. Friedland, “Feature shia, and D. Wessel, “Machine Improvisation with For- Extraction and Machine Learning on Symbolic Mu- mal Specifications,” in Proceedings of the 40th Inter- sic using the music21 Toolkit,” in Proceedings of the national Computer Music Conference, 2014, pp. 1277– 12th International Symposium on Music Information 1284. Retrieval, 2011, pp. 387–392. [24] T. Dimitrios and M. Elen, “A GA Tool for Computer Assisted Music Composition,” in Proceedings of the [7] S. Malone, A Portrait of Jaco. The Solo Collection. 2007 International Computer Music Conference, 2007, Milwaukee: Hal Leonard, 2002. pp. 85–88. [8] E. Alpaydın, Introduction to Machine Learning, [25] J. D. Fernandez´ and F. Vico, “AI Methods in Algorith- 2nd ed., ser. Adaptive computation and machine learn- mic Composition: A Comprehensive Survey,” in Jour- ing. Cambridge and London: MIT Press, 2010. nal of Artificial Intelligence Research, vol. 48, no. 1, 2013, pp. 513–582. [9] F. Jelinek and R. L. Mercer, “Interpolated estimation of Markov source parameters from sparse data,” in Pro- ceedings, Workshop on Pattern Recognition in Prac- tice. Amsterdam: North Holland, 1980, pp. 381–397. Proceedings of the International Computer Music Conference 2016 211