Proutskova, Polina. 2019. Investigating the Singing Voice: Quantitative and Qualitative Ap- Proaches to Studying Cross-Cultural Vocal Production
Total Page:16
File Type:pdf, Size:1020Kb
Proutskova, Polina. 2019. Investigating the Singing Voice: Quantitative and Qualitative Ap- proaches to Studying Cross-Cultural Vocal Production. Doctoral thesis, Goldsmiths, University of London [Thesis] https://research.gold.ac.uk/id/eprint/26133/ The version presented here may differ from the published, performed or presented work. Please go to the persistent GRO record above for more information. If you believe that any material held in the repository infringes copyright law, please contact the Repository Team at Goldsmiths, University of London via the following email address: [email protected]. The item will be removed from the repository while any claim is being investigated. For more information, please contact the GRO team: [email protected] Investigating singing voice: quantitative and qualitative approaches to studying cross-cultural vocal production Polina Proutskova PhD Thesis Department of Computing Goldsmiths, University of London 6 October 2017 I hereby declare that, except where explicit attribution is made, the work presented in this thesis is entirely my own. PolinaProutskova 2 Abstract This thesis was motivated by an experiment carried out in the 1960s that stud- ied the relationship between vocal performance practice and society by means of statistical analysis. Using a comprehensive corpus of audio recordings of singing from around the world collected over several decades, the ethnomusicologist Alan Lomax devised the Cantometrics project, the largest comparative study of music, in which 36 performance practice characteristics were rated for each recording. With particular interest in vocal production, we intended to formalise the knowledge of vocal production to enable statistical and computational approaches in the spirit of Cantometrics. Three models of vocal production were investigated: the perceptual model from Cantometrics, a physical model from voice science and a physiological model from singing education. We built on Johan Sundberg’s vocal source parameters and Jo Estill’s physiological building blocks as the basis to develop an ontology of vocal production. Two approaches to automated characterisation of the ontological descriptors were considered. For the incremental approach a proof-of-concept experiment on auto- matic labelling of phonation modes was presented, based on reconstructing the vocal source waveform by means of inverse filtering. We created a dataset of sustained sung vowels with annotations on pitch, vowel and phonation mode on which our model was trained. Steps to generalise this experiment to more complex data were outlined, discussing the challenges of such generalisation. The integrated approach addressed the full variance in the data, turning to the methodology of expert knowledge elicitation in order to annotate the original Canto- metrics dataset with our descriptors. We performed an investigative mixed-methods study in which 13 vocal physiology experts from different professional backgrounds were interviewed; they used our ontology to analyse vocal production in the Can- tometrics dataset. The goal of the study was to: a) validate the acceptance of our ontological terms, b) verify the consensus between experts on the values of the descriptors, c) collect reliable annotations. While the acceptance of the ontology 3 was good for most terms, quantitative analysis showed good agreement between experts for only two out of 11 descriptors (larynx height, aryepiglottic sphincter). A detailed qualitative analysis of the interview data (over 33 hours) was followed by a meta-analysis extracting common themes and confounding issues which point to probable reasons for the disagreement. For aryepiglottic sphincter and larynx height we collected the average ratings, which constitute the first set of reliable annota- tions on vocal production. A strong correlation was found between larynx height and the vocal width parameter from Cantometrics; larynx height was therefore a good candidate to replace vocal width as a more objective descriptor. The current work was based on knowledge from a number of research discip- lines, and its results are discussed from the viewpoint of several fields – MIR, vocal pedagogy, Cantometrics – for which they present significant implications. Future research is suggested for each of the fields. Based on the meta-analysis, we account for the reasons for disagreement between experts on the subject of vocal produc- tion, from music information retrieval (MIR) and singing education perspectives. We further explain the various kinds of bias that affect raters. We conclude that vocal physiology, though offering a more objective language than perceptual descriptors, is not well-suited as an ontological middle layer for statistical approaches to singing given the current state of knowledge. A mixed perceptual-objective path to ontology building is suggested and ways to collect reliable annotations are outlined. In the domain of vocal pedagogy we touch on the issue of communication on vocal physiology between experts, between teacher and student; we consider the future of teaching vocal technique and make suggestions for new experiments in the field. A plan is presented for revising and scaling up Cantometrics as an interdiscip- linary collaboration. Possible contributions of MIR, ethnomusicologists and vocal production specialists are specified. 4 Acknowledgements I would first like to thank my supervisors Prof. Geraint Wiggins, Dr Christophe Rhodes and Prof. Tim Crawford for their extraordinary support in this thesis process. The intellectual discussion, encouragement and guidance and the critique they offered have been invaluable. I am particularly grateful to Prof. Wiggins for taking over my supervision after a difficult period and helping me to get back on track, ultimately leading to the thesis completion. My thanks go to Dr Rhodes for helping me to navigate through the bureaucratic maze. My thanks to Prof. Michael Casey for introducing me to music informatics and to Prof. Mark D’Inverno for his help. In addition, I would like to thank Lesley Hewings and the Graduate School for supporting my case. I would like to sincerely thank Dr Victor Grauer, the co-inventor of Cantometrics, for suggesting to focus on the subordination of women hypothesis, which gave the initial direction for the current work. His explorations on the evolution of musical style have been a great inspiration that motivated me to return to academic research. My special thanks go to Anna Lomax Wood, the Alan Lomax Archives and the Association for Cultural Equity for interest in my research and for providing me with the recordings from the Cantometrics Training Tapes which have been the subject of my research ever since. I am very grateful to Prof. Johan Sundberg for his involvement with my recordings for the Phonation Modes Dataset and his general support. His Summer School taught me more about singing voice in two weeks than I had learned in years. The participants in my study require a particular mention: I greatly appreciate that in spite of all their other commitments they found time to take part and shared their knowledge and expertise with me with openness, intellectual rigour and trust. I am heartily thankful to Dr Gillyanne Kayes for her support and advice and for being a role model. My sincere thanks to Kim Chandler for introducing me to the British Voice Association and to the science minded singing voice community. I am grateful to Dr Daniel Müllensiefen, Dr Kat Agres and Vincent Akkermans for their valuable recommendations. 5 I’d like to thank my singing teacher Natalia Vladimirovna Silina for her steadfast guidance and support through all these years and her dedication to vocal pedagogy that inspired my love of studying the voice. My husband Dr Sven Macholl has been my greatest supporter through all the long years this PhD took to mature and to come to completion. Without his care, understanding and patience it would not have happened. I am indebted by my family for believing in me, giving me space and time to work and cheering me up whenever I felt lost. My gratitude includes my Mum, who sacrificed her own PhD in order to raise me. It is the values she instilled in me – the love of learning, the importance of knowledge and the respect for the academic world – that have led me to where I am today. My mother’s kindness has always been my biggest resource. I am so happy that I can share this achievement with her. Last but not least, my son has grown up alongside my PhD, filling my days with joy and meaning. Seeing how mature and self-reliant he has become makes clear what a long way it took for this body of work to shape into the thesis presented here. 6 Contents Abstract 3 Acknowledgements 5 Main contributions 17 Author’s relevant publications 21 1 Introduction 23 1.1 Broad context and motivation – revising the Cantometrics experiment 25 1.2 Vocalproductionvocabulary . 30 1.3 Modelsofvocalproductionandourontology . 32 1.4 Revising Cantometrics: methodological challenges for MIR . 34 1.5 Documentstructure ........................... 35 2 Ontology 38 2.1 Relatedpreviouswork .......................... 38 2.1.1 Vocalsource–JohanSundberg . 38 2.1.1.1 Phonation modes in singing: voice acoustics . 40 2.1.1.2 Performance practice . 43 2.1.2 Registers ............................. 45 2.1.3 Physiologicalbuildingblocks–JoEstill . 50 2.1.3.1 Research, commerce, impact . 50 2.1.3.2 TheEstillmodel .................... 52 2.2 Ontologyofvocalproduction . 64 2.3 Deconstructing Cantometrics parameters . 69 2.4 Vocal