Prediction in Polyphony: Modelling Musical Auditory Scene Analysis
Total Page:16
File Type:pdf, Size:1020Kb
Prediction in polyphony: modelling musical auditory scene analysis by Sarah A. Sauvé A thesis submitted to the University of London for the degree of Doctor of Philosophy School of Electronic Engineering & Computer Science Queen Mary University of London United Kingdom September 2017 Statement of Originality I, Sarah A Sauvé, confirm that the research included within this thesis is my own work or that where it has been carried out in collaboration with, or supported by others, that this is duly acknowledged below and my contribution indicated. Previously published material is also acknowledged below. I attest that I have exercised reasonable care to ensure that the work is original, and does not to the best of my knowledge break any UK law, infringe any third party's copyright or other Intellectual Property Right, or contain any confidential material. I accept that the College has the right to use plagiarism detection software to check the electronic version of the thesis. I confirm that this thesis has not been previously submitted for the award of a degree by this or any other university. The copyright of this thesis rests with the author and no quotation from it or information derived from it may be published without the prior written consent of the author. Signature: Sarah A Sauvé Date: 1 September 2017 2 Details of collaboration and publication One journal article currently in review and one paper uploaded to the ArXiv database contain work presented in this thesis. Two conference proceedings papers contain work highly related to, and fundamental to the development of the work presented in Chapters 5 and 7. Details of these as well as their location in the thesis and details of the collaborations, where appropriate, are presented here: Chapter 5 ● Sauvé, S. A., Pearce, M. T. & Stewart, L. (2014). The effect of musical training on auditory grouping. In Proceedings of the ICMPC-APSCOM 2014 Joint Conference. Seoul, Korea. ● Sauvé, S. A., Pearce, M. T. & Stewart, L. (2017). Attention but not musical training affect auditory grouping. ArXiv. Chapter 6 ● Sauvé, S. A., Sayed, A., Dean, R. T. & Pearce, M. T. (in review). Effects of pitch and timing expectancy on musical emotion. Psychomusicology. In this project, Sayed designed the experiment with Pearce and collected the data as part of a master’s thesis. I, with support from Dean, carried out the analysis presented in this thesis. I also, with the support of Dean and Pearce and final approval of Sayed, wrote the manuscript submitted to Psychomusicology and currently under review. 3 Chapter 7 ● Bussi, M., Sauvé, S. & Pearce, M. (2016). Relative Salience in Polyphonic Music. In Proceedings of the 14th International Conference for Music Perception and Cognition. San Francisco, USA. In this project, Bussi and myself designed the experiment, with support from Pearce. Bussi collected data and analysed it with my support. He also wrote the manuscript with support from myself and Pearce. This project inspired the second experiment of Chapter 7, where the same concept is applied to newly designed stimuli. 4 Abstract How do we know that a melody is a melody? In other words, how does the human brain extract melody from a polyphonic musical context? This thesis begins with a theoretical presentation of musical auditory scene analysis (ASA) in the context of predictive coding and rule-based approaches and takes methodological and analytical steps to evaluate selected components of a proposed integrated framework for musical ASA, unified by prediction. Predictive coding has been proposed as a grand unifying model of perception, action and cognition and is based on the idea that brains process error to refine models of the world. Existing models of ASA tackle distinct subsets of ASA and are currently unable to integrate all the acoustic and extensive contextual information needed to parse auditory scenes. This thesis proposes a framework capable of integrating all relevant information contributing to the understanding of musical auditory scenes, including auditory features, musical features, attention, expectation and listening experience, and examines a subset of ASA issues – timbre perception in relation to musical training, modelling temporal expectancies, the relative salience of musical parameters and melody extraction – using probabilistic approaches. Using behavioural methods, attention is shown to influence streaming perception based on timbre more than instrumental experience. Using probabilistic methods, information content (IC) for temporal aspects of music as generated by IDyOM (information dynamics of music; Pearce, 2005), are validated and, along with IC for pitch and harmonic aspects of the music, are subsequently linked to perceived complexity but not to salience. Furthermore, based on the hypotheses that a melody is internally coherent and the most complex voice in a piece of polyphonic music, IDyOM has been extended to extract melody from symbolic representations of chorales by J.S. Bach and a selection of string quartets by W.A. Mozart. 5 Acknowledgements I am endlessly grateful to my supervisor, Dr. Marcus T. Pearce, for his unwavering support and patience throughout this journey, particularly as I learned to code in LISP. Thank you also to my remaining supervisory panel, Matthias Mauch, Elaine Chew, and Simon Dixon, for helpful feedback as I progressed through critical stages of this PhD. Thank you to all the members of the Music Cognition Lab who have offered friendship, feedback and support over the past three years. Thank you also to the Music Performance and Expression Lab, and the Centre for Digital Music and the Cognitive Science research groups for friendships, advice, and making this musician feel welcome in a computer science department. A specific thank you must go to Peter Harrison, lab mate, flatmate and friend, without whom I would be lost in code and who has been such an encouragement in the last few weeks of this process. Thanks to CISV, and all my friends in the organization, who provided invaluable emotional support to an expat far away from her home, and encouraged me to always continue to challenge myself. Finally, thanks to my family and my closest friends, who have generously tolerated the incessant always-working student mode both when I was at home and from abroad, and supported me through some significant life and lifestyle decisions. I could not have done this without all of you – thank you. 6 Table of Contents Statement of Originality ........................................................................................................ 2 Details of collaboration and publication ................................................................................ 3 Abstract ................................................................................................................................ 5 Acknowledgements ............................................................................................................... 6 Table of Contents .................................................................................................................. 7 List of Figures ..................................................................................................................... 14 List of Tables ...................................................................................................................... 17 1 Introduction ...................................................................................................................... 20 2 Auditory streaming: understanding the special case of music perception .......................... 25 2.1 Auditory features ........................................................................................................... 28 2.2 Musical features ............................................................................................................ 32 2.3 Attention ....................................................................................................................... 36 2.4 Expectation ................................................................................................................... 38 2.5 Musical training ............................................................................................................ 42 2.6 Predictive coding ........................................................................................................... 45 3 Materials .......................................................................................................................... 52 3.1 IDyOM ......................................................................................................................... 52 3.2 Musical corpora ............................................................................................................. 59 3.2.1 Montreal Billboard Corpus .................................................................. 60 7 3.2.2 Bach Chorale Dataset .......................................................................... 60 3.2.3 String Quartet Dataset ......................................................................... 61 3.2.4 Nova Scotia Folk Songs ...................................................................... 61 3.2.5 Essen Folksong Collection Subset ....................................................... 61 3.2.6 Bach Soprano ...................................................................................... 62 3.3 Goldsmiths Musical Sophistication Index .....................................................................