Systematic Evaluation and Improvement of Statistical Models of Harmony
Total Page:16
File Type:pdf, Size:1020Kb
Computational Creativity 2007 SYSTEMATIC EVALUATION AND IMPROVEMENT OF STATISTICAL MODELS OF HARMONY Raymond P. Whorley Geraint A. Wiggins Marcus T. Pearce Centre for Cognition, Computation and Culture Goldsmiths, University of London New Cross, London SE14 6NW, U.K. r.whorley,g.wiggins,m.pearce @gold.ac.uk { } Abstract tical machine learning techniques. There are two primary motivations for this work. The first is that the models so We are investigating the utility of Markov models in re- produced should be of use in addressing stylistic and mu- lation to the learning of the task of four-part harmoni- sicological issues, and the second is that the models could sation, which is a creative musical activity. A program form the basis of a cognitive model of harmonic percep- is described which uses statistical machine learning tech- tion (Pearce et al., 2002). It is intended that the program niques to learn this task from a suitable corpus of homo- will eventually handle music which is not completely ho- phonic music. The task is decomposed into a series of mophonic, such as the set of chorale melodies harmonised more manageable sub-tasks; these are each modelled by by J. S. Bach. Markov models, which can use contexts drawn from sym- The approach adopted breaks the overall harmonisa- bols describing past, current and future chords. The re- tion task into a series of more manageable sub-tasks (Hild sults of a number of initial studies, for example compar- et al., 1992; Allan, 2002) . The sub-tasks implemented so ing different types of model and the effect of corpus size, far relate to the harmonic function of the chords; sub-tasks are given. There is also some discussion about harmonisa- that for example assign particular notes to the alto, tenor tions that have been generated by the program by random and bass parts will follow in due course. An information- sampling of the probability distributions in the models. theoretic measure is used to evaluate the various sub-task Following this, a procedure for the systematic evaluation models (Allan and Williams, 2005), and a systematic ap- and “optimisation” of the sub-task models, involving the proach to evaluation is taken which guides the creation of application of an information-theoretic measure, is pre- the models (Pearce, 2005). Having learned a style of har- sented, along with some more results. An appraisal of the monisation in this way, the program is equipped to gener- procedure’s shortcomings is made, and ideas for its im- ate harmonisations to “unseen” melodies. provement are put forward. Finally, an indication of the It should be noted that this is very much “work in future direction of the work (which is currently in its early progress”; indeed, work at a very early stage. The results stages) is given. obtained so far can only therefore be taken as indicative, rather than definitive. Keywords: Machine learning, harmonisation, Markov models, evaluation. 2 Corpus, Test Data and Input 1 Introduction The training corpus currently comprises fifteen major key hymn tunes with harmonisations taken from Vaughan The current work contributes to the study of computa- Williams (1933) and Nicholson et al. (1950). In addition, tional creativity in music by seeking to improve the com- there are two sets of five major key test melodies with har- putational simulation of four-part harmonisation, which is monisations from the same sources (although only three a creative human activity. We are investigating the utility of these melodies are used for initial studies). of Markov models in relation to the learning of this task, All of the training corpus melodies, along with a cer- in pursuit of which a program is being developed which tain amount of annotation (Ponsford et al., 1999), are as- seeks to learn a style of four-part harmonisation from a signed to a single sequence (array) of symbols. Annota- suitable corpus of homophonic music by means of statis- tion symbols are placed at the beginning and end of each melody, and at phrase boundaries. All of the training cor- Permission to make digital or hard copies of all or part of this pus bass parts are assigned to another symbol sequence work for personal or classroom use is granted without fee pro- in a completely analogous way. Melody and bass notes vided that copies are not made or distributed for profit or com- are normalised to seven scale degree symbols which are mercial advantage and that copies bear this notice and the full not dependent upon key; for example a tonic note is rep- citation on the first page. resented by “1” and a dominant note by “5”. These scale degrees can be chromatically altered by post-fixing with c 2007 Goldsmiths, University of London “#” or “b” (sharp or flat). No account is currently taken of 81 Computational Creativity 2007 note duration or metrical importance. exist at the various stages of the harmonisation process; Symbols representing the harmonic function of chords for example, future melodic symbols can be used from in the training corpus (determined by the principal author) the beginning of the process. are assigned, along with annotation symbols, to yet an- other sequence. The chords are normalised to root posi- 3.2 Smoothing tion triads and seventh chords; for example a tonic chord is represented by “I” and a supertonic seventh chord by It is often the case that contexts occurring while, for exam- “ii7”. Chords containing chromatically altered notes are ple, generating a harmony to an “unseen” melody, cannot deemed to be triads or seventh chords in the nearest possi- be found in the model. In order to be able to make in- ble related key; for example if a chord containing the notes formed predictions in these circumstances, a number of D, F ] and A appears when the key is C major, it is repre- models using different contexts are used in conjunction sented by “V/V” (dominant of the dominant). Since this with a smoothing technique. Conklin and Cleary (1988) program does not keep track of modulations, the chord and Allan (2002) use a simple method, which has worked could, alternatively, be described as “II”, a major triad well with text, known as back-off smoothing (Katz, 1987). on the second degree of the original key. The latter ap- For Markov chains, a number of models of different order proach requires less music-theoretic knowledge and gives are created. When searching for a particular context, the an equally good description; therefore it might be adopted highest order model is checked first. If the context is not later. On the other hand, in their multiple viewpoint tech- found, the second-highest order model is checked, and so nique, Conklin and Cleary (1988) propose a viewpoint to on, until the (progressively smaller) context is found. For detect modulations; something of this sort could be incor- the “mixed” contexts that we are using, however, back-off porated into the program. Note that since the hymns are is not limited to successive contexts of decreasing size; homophonic (i.e., consisting of block chords only, with no different contexts of the same overall size can be involved extra-chord movement such as passing notes), it is sim- in the back-off sequence. ple to associate all symbols comprising a particular chord It should be noted that an escape method is normally with the same array index for ease of retrieval. associated with back-off smoothing, whereby some of the Test melodies, bass parts and harmonic symbols are probability mass is assigned to unseen Markov chains, the assigned to sequences in the same way as for the training probabilities of which are estimated during back-off. No corpus, except that they are individually assigned. Only such escape method has so far been implemented here; if a one test melody/harmony combination is used per pro- context is not found in a particular model, then the model gram run. is discarded. The next model to be tried is assumed to A second version of the harmonic symbol sequence contain all of the probability mass. also exists, which contains chord inversion information in There are other more complicated smoothing methods the symbols. This is used for purposes of comparison. that are also known to work well, but one of the purposes of this paper is to investigate how an optimum, or close to optimum, back-off sequence can be determined, and how 3 The Models and Their Construction well it performs. 3.1 Markov Models 3.3 Decomposition of the Harmonisation Task Markov chains, or n-gram models, widely used in nat- ural language processing (Manning and Schutze,¨ 1999), Allan (2002) follows Hild et al. (1992) in breaking up the are sequences of n symbols (e.g., a sequence of musical harmonisation task into three sub-tasks, in the following note names), where it is assumed that the probability of order: “harmonic skeleton”, “chord skeleton” and “or- the last symbol appearing is dependent upon the previ- namentation”. In the “harmonic skeleton” sub-task, har- ous n 1 symbols (the context). This is an approxima- monic symbols are assigned to each beat; the actual notes tion of− the probability of the last symbol being dependent of each chord are filled out in the “chord skeleton” sub- upon its entire history; it can therefore be expected that task; and additional notes such as passing notes are sup- the longer the context, the more accurate the transition plied during “ornamentation”. Wiggins (1998) suggests probabilities. The size of the context is known as the or- that the problem should be broken up even further, by for der of the Markov chain; therefore for example a 2-gram example choosing cadences first.