Software-Based Extraction of Objective Parameters from Music Performances
Dissertation submitted to Faculty I of the Technische Universität Berlin for the academic degree of Dr. phil. by Dipl.-Ing. Alexander Lerch

Date of submission: August 2008

Abstract

Different music performances of the same score may differ significantly from each other. It is obvious that not only the composer's work, the score, defines the listener's music experience, but that the music performance itself is an integral part of this experience. Music performers use the information contained in the score, but interpret, transform or add to this information.

Four parameter classes can be used to describe a performance objectively: tempo and timing, loudness, timbre and pitch. Each class contains a multitude of individual parameters that are at the performers' disposal to generate a unique physical rendition of musical ideas. The extraction of such objective parameters is one of the difficulties in music performance research.

This work presents an approach to the software-based extraction of tempo and timing, loudness and timbre parameters from audio files, providing a tool for automatic parameter extraction from music performances. The system is applied to extract data from 21 string quartet performances, and a detailed analysis of the extracted data is presented. The main contributions of this thesis are the adaptation and development of signal processing approaches to performance parameter extraction and the presentation and discussion of string quartet performances of a movement of Beethoven's late String Quartet op. 130.

Keywords: music performance, music performance analysis, automatic tempo extraction, loudness analysis, timbre analysis, string quartet performance, audio content analysis, audio-to-score-matching

Zusammenfassung (Summary)

Different performances of the same musical work differ markedly from one another. It is evident that the listener's musical experience is determined not only by the underlying score but also, to a large degree, by the performing musicians' interpretation of that score. In the course of their performance, they interpret, modify or extend the information contained in the notation.

Such a music performance can be described objectively with parameters from the categories tempo, loudness, timbre and pitch. Each of the four categories provides a multitude of parameters that enable the musicians to realize musical ideas in a unique physical form. The extraction of such parameters is one of the typical problems of performance analysis.

This work presents a software system that can be used as a tool for the automatic extraction of tempo, loudness and timbre features. The system was used for a systematic analysis of 21 string quartet recordings. The work addresses two main topics: the development and optimization of audio signal processing algorithms for extracting parameters from audio recordings of music performances, and the analysis and discussion of string quartet performances of a movement from Beethoven's late String Quartet op. 130.

Keywords: musical interpretation, performance analysis, tempo detection, loudness analysis, timbre analysis, string quartet analysis, music analysis

Contents

1 Introduction
2 Music Performance and its Analysis
  2.1 Music Performance
  2.2 Music Performance Analysis
  2.3 Analysis Data
    2.3.1 Data Acquisition
    2.3.2 Instrumentation & Genre
    2.3.3 Variety & Significance of Input Data
    2.3.4 Extracted Parameters
  2.4 Research Results
    2.4.1 Performance
    2.4.2 Performer
    2.4.3 Recipient
  2.5 Software Systems for Performance Analysis
3 Tempo Extraction
  3.1 Performance to Score Matching
    3.1.1 Score Following
    3.1.2 Audio to Score Alignment
  3.2 Proposed Algorithm
    3.2.1 Definitions
    3.2.2 Pre-Processing
    3.2.3 Processing
    3.2.4 Similarity Measure
    3.2.5 Tempo Curve Extraction
    3.2.6 Evaluation
4 Dynamics Feature Extraction
  4.1 Implemented Features
    4.1.1 Peak Meter
    4.1.2 VU Meter
    4.1.3 Root Mean Square Based Features
    4.1.4 Zwicker Loudness Features
  4.2 Example Results
5 Timbre Feature Extraction
  5.1 Implemented Features
    5.1.1 Spectral Rolloff
    5.1.2 Spectral Flux
    5.1.3 Spectral Centroid
    5.1.4 Spectral Spread
    5.1.5 Mel Frequency Cepstral Coefficients
  5.2 Example Results
6 Software Implementation
  6.1 Data Extraction
    6.1.1 FEAPI
    6.1.2 Performance Optimizations
  6.2 Performance Player
    6.2.1 Smoothing Filter
    6.2.2 Overall Results for each Feature
    6.2.3 Graphical User Interface
7 String Quartet Performance Analysis
  7.1 Music
  7.2 Recordings
  7.3 Procedure
    7.3.1 Audio Treatment
    7.3.2 Analysis Data
    7.3.3 Feature Space Dimensionality Reduction
  7.4 Overall Performance Profiles
    7.4.1 Tempo
    7.4.2 Timing
    7.4.3 Loudness
    7.4.4 Timbre
  7.5 Performance Similarity
    7.5.1 Repetition Similarity
    7.5.2 Overall Similarity
  7.6 Overall Observations
    7.6.1 Dimensionality of Overall Observations
    7.6.2 Relationships between Overall Observations
  7.7 Summary
8 Conclusion
  8.1 Summary
  8.2 Potential Algorithmic Improvements
  8.3 Future Directions
List of Figures
List of Tables
A Standard Transformations
  A.1 Discrete Fourier Transformation
  A.2 Principal Component Analysis
B Software Documentation
  B.1 Parameter Extraction
    B.1.1 Command Line
    B.1.2 Input and Output Files
  B.2 Performance Player
    B.2.1 Loading Performances
    B.2.2 Visualize Parameters
    B.2.3 Play Performances
C Result Tables - String Quartet Analysis
Bibliography

List of Symbols and Abbreviations

ACA       Audio Content Analysis
ADSR      technical description of the volume envelope phases of single sounds, frequently used in synthesizers: attack, decay, sustain and release
ANOVA     Analysis of Variance
AOT       Acoustic Onset Time
API       Application Programmer's Interface
AU        Audio Unit (Apple)
B         similarity matrix
b(m, n)   similarity matrix entry
C         number of audio channels
CD        Compact Disc
CPU       Central Processing Unit (processor)
CQT       Constant Q Transform
δ         evaluation criterion
DFT       Discrete Fourier Transformation
DIN       Deutsches Institut für Normung
DMA       Deutsches Musikarchiv
DTW       Dynamic Time Warping
f         frequency in Hz
fS        sample rate in Hz
FEAPI     Feature Extraction Application Programmer's Interface
FFT       Fast Fourier Transformation
FLTK      Fast Light Toolkit
GPL       GNU General Public License
GUI       Graphical User Interface
H         hop-size between STFT blocks
HMM       Hidden Markov Model
IBI       Inter-Bar-Interval
IEC       International Electrotechnical Commission
IOI       Inter-Onset-Interval
ISO       International Organization for Standardization
ITU       International Telecommunication Union
K         block-size of STFT blocks
LADSPA    Linux Audio Developer's Simple Plugin API
M         number of similarity matrix rows
MDS       Multidimensional Scaling
MFCC      Mel Frequency Cepstral Coefficient
MIDI      Musical Instrument Digital Interface
MIREX     Music Information Retrieval Evaluation eXchange
MPA       Music Performance Analysis
MPEG      Moving Picture Experts Group
N         number of similarity matrix columns
N'(z)     specific loudness on the Bark scale
NOT       Note Onset Time
O(m)      observation at block index m
ω(k, m)   instantaneous frequency at bin k for block m
OPCn      nth PCA component (variables: overall features, observations: performances)
p         MIDI pitch
P         alignment path
PAT       Perceptual Attack Time
PCA       Principal Component Analysis
PCnF      nth PCA component (variables: features, observations: performances)
POT       Perceptual Onset Time
PPM       Peak Programme Meter
RMS       Root Mean Square
RPCn      nth rotated component from the selected OPCn components
S(n)      score or audio event with list index n
SDK       Software Development Kit
SD        Semantic Differential
SIMD      Single Instruction Multiple Data
SSE       Streaming SIMD Extensions (Intel)
STFT      Short Time