Towards a Better Understanding of Mix Engineering
Brecht De Man

Submitted in partial fulfilment of the requirements of the Degree of Doctor of Philosophy

School of Electronic Engineering and Computer Science
Queen Mary University of London
United Kingdom

January 2017

Statement of originality

I, Brecht Mark De Man, confirm that the research included within this thesis is my own work or that where it has been carried out in collaboration with, or supported by others, that this is duly acknowledged below and my contribution indicated. Previously published material is also acknowledged below.

I attest that I have exercised reasonable care to ensure that the work is original, and does not to the best of my knowledge break any UK law, infringe any third party's copyright or other Intellectual Property Right, or contain any confidential material.

I accept that the College has the right to use plagiarism detection software to check the electronic version of the thesis.

I confirm that this thesis has not been previously submitted for the award of a degree by this or any other university.

The copyright of this thesis rests with the author and no quotation from it or information derived from it may be published without the prior written consent of the author.

Details of collaboration and publications: see Chapter 1: Introduction, Section 1.5.

Signature:
Date:

Abstract

This thesis explores how the study of realistic mixes can expand current knowledge about multitrack music mixing. An essential component of music production, mixing remains an esoteric matter with few established best practices. Research on the topic is challenged by a lack of suitable datasets, and consists primarily of controlled studies focusing on a single type of signal processing. However, considering one of these processes in isolation neglects the multidimensional nature of mixing.
For this reason, this work presents an analysis and evaluation of real-life mixes, demonstrating that it is a viable and even necessary approach to learn more about how mixes are created and perceived. Addressing the need for appropriate data, a database of 600 multitrack audio recordings is introduced, and mixes are produced by skilled engineers for a selection of songs. This corpus is subjectively evaluated by 33 expert listeners, using a new framework tailored to the requirements of comparison of musical signal processing. By studying the relationship between these assessments and objective audio features, previous results are confirmed or revised, new rules are unearthed, and descriptive terms can be defined. In particular, it is shown that examples of inadequate processing, combined with subjective evaluation, are essential in revealing the impact of mix processes on perception. As a case study, the percept 'reverberation amount' is expressed as a function of two objective measures, and a range of acceptable values can be delineated.

To establish the generality of these findings, the experiments are repeated with an expanded set of 180 mixes, assessed by 150 subjects with varying levels of experience from seven different locations in five countries. This largely confirms initial findings, showing few distinguishable trends between groups. Increasing experience of the listener results in a larger proportion of critical and specific statements, and agreement with other experts.

Table of Contents

List of Tables
List of Figures
Acknowledgements
1 Introduction
    1.1 Research questions
    1.2 Objectives
    1.3 Thesis structure
    1.4 Applications
    1.5 Related publications by the author
        1.5.1 Journal articles
        1.5.2 Conference papers
        1.5.3 Book chapters
        1.5.4 Patents
2 Knowledge-engineered mixing
    2.1 System
        2.1.1 Rule list
        2.1.2 Measurement modules
        2.1.3 Processing modules
    2.2 Perceptual evaluation
        2.2.1 Participants
        2.2.2 Apparatus
        2.2.3 Materials
        2.2.4 Procedure
    2.3 Results and discussion
    2.4 Conclusion
3 Data collection
    3.1 Testbed creation and curation
        3.1.1 Content
        3.1.2 Infrastructure
        3.1.3 Mix creation experiment
    3.2 Perceptual evaluation of mixing practices
        3.2.1 Basic principles
        3.2.2 Interface
        3.2.3 Listening environment
        3.2.4 Subject selection and surveys
        3.2.5 Tools
        3.2.6 Perceptual evaluation experiment
    3.3 Conclusion
4 Single group analysis
    4.1 Objective features
        4.1.1 Features overview
        4.1.2 Statistical analysis of audio features
        4.1.3 Workflow statistics
        4.1.4 Conclusion
    4.2 Subjective numerical ratings
        4.2.1 Preference rating
        4.2.2 Correlation of audio features with preference
        4.2.3 Correlation of workflow statistics with preference
        4.2.4 Conclusion
    4.3 Subjective free-form description
        4.3.1 Thematic analysis
        4.3.2 Challenges
        4.3.3 Conclusion
    4.4 Real-time attribute elicitation
        4.4.1 System
        4.4.2 Term analysis
        4.4.3 Conclusion
5 Multi-group analysis
    5.1 Experiments
    5.2 Objective features
    5.3 Subjective numerical ratings
        5.3.1 Average rating
        5.3.2 Self-assessment
    5.4 Subjective free-form description
        5.4.1 Praise and criticism
        5.4.2 Comment focus
        5.4.3 Agreement
    5.5 Conclusion
6 Conclusion
Appendix - Case study: Use and perception of reverb
    A.1 On reverb
    A.2 Background
    A.3 Problem formulation
    A.4 Comment analysis
    A.5 Relative Reverb Loudness
    A.6 Equivalent Impulse Response
        A.6.1 Process
        A.6.2 Equivalent Impulse Response analysis and results
    A.7 Multi-group analysis
    A.8 Conclusion
Bibliography

List of Tables

1.1 Overview of systems that automate music production processes
2.1 Dynamic range compression rules
2.2 Equalisation rules
2.3 Spectral descriptors in practical sound engineering literature
2.4 Panning rules
2.5 Songs used in the perceptual evaluation experiment
3.1 Metadata fields per song, track, stem, and mix
3.2 Songs used in the experiment
3.3 Microphones under test
3.4 Subject groups
3.5 Existing listening test platforms
3.6 Selection of supported listening test formats
4.1 List of extracted features
4.2 Average values of features per instrument
4.3 Average change of feature values per instrument
4.4 Number of different individual subgroup types
4.5 Number of different multi-instrument subgroup types
4.6 Correlation coefficients of extracted features with perception
4.7 Spearman's rank correlation coefficient for different kinds of subgroups
4.8 Top 25 most occurring descriptive terms over all comments
4.9 Features extracted from the audio before and after processing
4.10 Highest ranking terms
4.11 The first ten descriptors per processor, ranked by number of entries Ndk
5.1 Overview of evaluation experiments
5.2 Overview of mixed content
5.3 Effect of expertise on proportion of negative statements
5.4 Effect of expertise on generality of statements
6.1 Summary of confirmed, revised (crossed out), and new mixing rules
A.1 Overview of studies on perception of reverberation of musical signals
A.2 Logistic regression results

List of Figures

1.1 Example of a music production chain
2.1 Block diagram of knowledge-engineered automatic mixing system
2.2 Activity in function of audio level