Non-invasive Gesture Sensing, Physical Modeling, Machine Learning and Acoustic Actuation for Pitched Percussion

by

Shawn Trail
Bachelor of Arts, Bellarmine University, 2002
Master of Music, Purchase College Conservatory of Music, 2008

A Dissertation Submitted in Partial Fulfillment of the Requirements for the Degree of

DOCTOR OF PHILOSOPHY

In Interdisciplinary Studies: Computer Science and Music

© Shawn Trail, 2018
University of Victoria

All rights reserved. This dissertation may not be reproduced in whole or in part, by photocopying or other means, without the permission of the author.

Non-invasive Gesture Sensing, Physical Modeling, Machine Learning and Acoustic Actuation for Pitched Percussion

by

Shawn Trail
Bachelor of Arts, Bellarmine University, 2002
Master of Music, Purchase College Conservatory of Music, 2008

Supervisory Committee

Dr. Peter F. Driessen, Co-Supervisor (Electrical Engineering)

Dr. W. Andrew Schloss, Co-Supervisor (Music)

Dr. George Tzanetakis, Co-Supervisor (Computer Science)

Supervisory Committee

Dr. Peter F. Driessen, Co-Supervisor (Electrical Engineering)

Dr. W. Andrew Schloss, Co-Supervisor (Music)

Dr. George Tzanetakis, Co-Supervisor (Computer Science)

ABSTRACT

This thesis explores the design and development of digitally extended, electro-acoustic (EA) pitched percussion instruments, and their use in novel, multi-media performance contexts. The proposed techniques address the lack of expressivity in existing EA pitched percussion systems. The research is interdisciplinary in nature, combining Computer Science and Music to form a type of musical human-computer interaction (HCI) in which novel playing techniques are integrated in performances. Supporting areas include Electrical Engineering (design of custom hardware circuits/DSP) and Mechanical Engineering (design/fabrication of new instruments). The contributions can be grouped into three major themes: 1) non-invasive gesture recognition using sensors and machine learning, 2) acoustically-excited physical models, 3) timbre-recognition software used to trigger idiomatic acoustic actuation. In addition to pitched percussion, which is the main focus of the thesis, application of these ideas to other music contexts is also discussed.

Contents

Supervisory Committee
Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgements
Dedication
Foreword

I INTRODUCTION

1 Overview
1.1 Outline
1.2 Background
1.3 Motivation for this work
1.4 Novel Contributions

2 Related Work
2.1 New Instruments for Musical Expression (NIME)
2.2 Electronic Pitched Percussion
2.3 Non-Invasive Sensing
2.4 Physical Modeling Synthesis
2.5 Acoustic Actuation
2.6 Summary

II IDIOMATIC INTERFACES, MACHINE LEARNING AND ACTUATION

3 Musical-HCI
3.1 Gesture Sensing
3.1.1 Xylophone Based
3.1.2 Lamellophone Based
3.2 Gyil Gourd Physical Modeling Synthesis
3.2.1 Model Description
3.2.2 Gyil: Physical Measurements
3.2.3 Experimental Results

4 Sensing
4.1 Machine Awareness
4.1.1 Gesture Prediction
4.1.2 Drum Pattern Identification
4.2 Surrogate Sensor Framework
4.2.1 Hybrid Acoustic/Physical Model
4.2.2 Sound Source Separation and Automatic Transcription

5 Actuation
5.1 Auto-Calibration
5.1.1 Drum Classification for Auto-Mapping
5.1.2 Timbre-Adaptive Velocity Calibration
5.1.3 Gesture recognition using Dynamic Time Warping
5.2 Marimba platform
5.2.1 Physical Design
5.2.2 Electrical Design
5.2.3 DSR - bars as speakers
5.2.4 Idiomatic HCI
5.3 Lamellophone Tine Excitation
5.4 Auto-monochord
5.4.1 Introduction
5.4.2 Design Considerations
5.4.3 System Description

6 Contributions to other musical contexts
6.1 Extended Framework
6.1.1
6.1.2 Trumpet
6.2 Geometric Rhythm Theory
6.2.1 Geometric Sequencer
6.2.2 System Description
6.2.3 Euclidean Visualizer
6.2.4 Musicality

III CONCLUSION

7 Final Thoughts and Future Work

Appendices
A Author publications related to this thesis
B Deployment
C Potential Patent Claims

Bibliography

List of Tables

Table 3.1 Likembe tuning
Table 3.2 3D gesture mappings
Table 3.3 Wiimote: switch mappings
Table 3.4 Dynamic gourd dimensions
Table 3.5 Model parameters/Gyil physical properties
Table 3.6 Gyil expert feedback

Table 4.1 Sound-source separation: Signal-to-Distortion ratio (SDR)
Table 4.2 Direct sensor: detection accuracy (%)
Table 4.3 Indirect sensor: detection accuracy (%)

Table 5.1 SVM classifier accuracy
Table 5.2 String tuning threshold formulas
Table 5.3 STARI: GPIO assignments

Table 6.1 EROSS: sensor test results
Table 6.2 EROSS battery specifications
Table 6.3 GeoSEEq: Control Logic

List of Figures

Figure 2.1 Ayotte sensor system (ca. 2000)
Figure 2.2 Original K and K MIDI-Vibe
Figure 2.3 Guitaret
Figure 2.4 Pianet tine
Figure 2.5 Gyil: bass gourd (14" deep, 10" dia. body, and 4" dia. mouth)

Figure 3.1 Virtual vibraphone faders
Figure 3.2 Hardware diagram
Figure 3.3 Software diagram
Figure 3.4 Audio signal chain
Figure 3.5 Music control design
Figure 3.6 Latency: Radiodrum vs. Kinect
Figure 3.7 Captured motion of four drum strikes
Figure 3.8 Radiodrum viewable area
Figure 3.9 Kinect viewable area
Figure 3.10 Horizontal range of Radiodrum vs. Kinect
Figure 3.11 Kinect depth variance
Figure 3.12 Wiikembe and custom footpedal
Figure 3.13 Wiikembe system overview
Figure 3.14 Likembe tines with rings
Figure 3.15 Tine layout/approx. A=440
Figure 3.16 1. A-rvrb, B-dly; 2. FX vol.; 3. Wiikembe vol.
Figure 3.17 Arduino controller
Figure 3.18 Wiimote axes/filter parameters
Figure 3.19 Wiimote switch labels
Figure 3.20 El-lamellophone
Figure 3.21 Modularity of components
Figure 3.22 Mounted system
Figure 3.23 Sensor interface (l); Beaglebone (r)
Figure 3.24 9DOF axes of rotation
Figure 3.25 Puredata patch
Figure 3.26 Gourds/frame construction
Figure 3.27 DFT: resonating wooden bar results
Figure 3.28 Signal flow for one gourd
Figure 3.29 Membrane's asymmetric motion
Figure 3.30 Signal flow for one-gourd/one-membrane model variant
Figure 3.31 DFT non-linearities

Figure 4.1 Forecasting system operation modes
Figure 4.2 Convolution with Gaussian function
Figure 4.3 Surrogate sensor system
Figure 4.4 Simple wave folder w/ adjustable symmetry
Figure 4.5 Diagram of signal flow for one gourd
Figure 4.6 Various spectral plots for Gyil

Figure 5.1 Solenoid actuated frame drum array
Figure 5.2 Calibrated input velocities mapped to output driving velocities
Figure 5.3 Precision of Radiodrum and vibraphone gestures
Figure 5.4 Loudness and timbre based velocity calibration
Figure 5.5 DSRmarimbA
Figure 5.6 DSRm hardware configuration
Figure 5.7 Solenoid sequencer
Figure 5.8 Beaglebone/Arduino/heatsink
Figure 5.9 Solenoid
Figure 5.10 Ruler and solenoids
Figure 5.11 Schematic of solenoid actuation
Figure 5.12 Arduino code
Figure 5.13 Piezo summing mixer
Figure 5.14 ADSR
Figure 5.15 Piezo schematic and response
Figure 5.16 Ella system flow
Figure 5.17 STARI
Figure 5.18 Monochord
Figure 5.19 STARI: flowchart
Figure 5.20 STARI: bridge
Figure 5.21 Motor enclosure, tuning mount, bridge
Figure 5.22 Servo/guitar pick and h-bridge
Figure 5.23 Beaglebone, battery, pickup, pick unit
Figure 5.24 Canopy model/interface
Figure 5.25 System schematic
Figure 5.26 Beaglebone flowchart
Figure 5.27 Control message management in Pd
Figure 5.28 PureData patch for tuning thresholding
Figure 5.29 Frequency estimation/onset analysis

Figure 6.1 Guitar/FX interaction
Figure 6.2 Schematic of RANGE
Figure 6.3 Adrian Freed's augmented guitar prototype
Figure 6.4 Membrane potentiometer circuit
Figure 6.5 Pd guitar fx
Figure 6.6 Audio latency measurements
Figure 6.7 EROSS
Figure 6.8 Sensor detection zone
Figure 6.9 Sensor Locations (SL)
Figure 6.10 Valve Positions (VP)
Figure 6.11 EROSS mounted under the valves
Figure 6.12 Top view of system housing
Figure 6.13 Geo SEEq
Figure 6.14 Sine wave cycle
Figure 6.15 Circle of Fifths
Figure 6.16 Simple bell pattern ostinato commonly found in African drumming
Figure 6.17 Anku's over-arching interlocked cyclic form
Figure 6.18 Lead part in relation to background ostinato
Figure 6.19 Getting from phrase to phrase
Figure 6.20 Son Clave (Cuba) compared to Fume-fume bell (Ghana)
Figure 6.21 Prototype mock-up
Figure 6.22 Rhythm Necklaces app (Meara O'Reilly and Sam Tarakajian)
Figure 6.23 Patterning: Olympia Noise Company
Figure 6.24 Zoom Arc
Figure 6.25 252e: Buchla Polyphonic Rhythm Generator
Figure 6.26 Modified TR-909
Figure 6.27 Common polyrhythms in 4/4
Figure 6.28 Example of polymeter
Figure 6.29 Greek rhythm - 5 groups of 5 pulses
Figure 6.30 Two variations of Dave Brubeck's Blue Rondo a la Turk
Figure 6.31 3 variations of Brubeck's 11-pulse Countdown
Figure 6.32 Timing modulation visualized in red
Figure 6.33 Common rhythms around the world
Figure 6.34 12 arbitrary 16-pulse rhythms (l); previous 6 common rhythms (r)

Figure 7.1 Thumb piano with electromagnetic drivers by Dan Wilson

ACKNOWLEDGEMENTS

This work is the culmination of my life experiences and a compilation of what inspires me in the world, merged into a singular project. Where possible I have tried to give direct attribution, but due to the vast network of amazing artists, engineers, musicians, designers, bands, teachers, philosophers, writers and generally topically interesting people that have been my muse or lent me support over the years, not all due credit is listed. However, this is my heartfelt thank you to the anonymous group of individuals I have crossed paths with leading to this point of accomplishment in my life. Here's to paying it forward, the way you did for me. More specifically, this work would not have been possible without the direct input and influence of specific people in my life. While this list is not comprehensive, it reflects a certain group of people that this project owes a debt to. First and foremost I thank my collaborators: Michael Dean, Dan Godlovitch, Stephen Harrison, Leonardo Jenkins, Thor Kell, Duncan MacConnell, Steven Ness, Wendy Nie, Gabrielle Odowichuk, Jeff Snyder, and Tiago Fernandes Tavares. This work is the result of true interdisciplinary collaboration. In nearly all facets, this work would not have been possible without this multidisciplinary team of engineers and artists. I firmly believe in the adage that "...the whole is greater than the sum of its parts..."; in this case I would not have been able to realize any one aspect of this work on my own at the rate and in the way we did. I learned so much working with this team and will always look on my time at MISTIC as a special period where nearly anything we could dream up was possible. Secondly I thank my mentors who gave me opportunities or shared their knowledge with me: Niels Boettcher, Richard Burchard, Mantis Cheng, Sofia Dahl, Smilen Dimitrov, Peter Denenberg, Suzanne Farrin, Andy Farnell, Steven Gelineck, Lorel Lashley, Payton MacDonald, Andrew McPherson, Valerie Naranjo, Ben Neill, Dan Overholt, Wyatt Page, Kevin Patton, Stefania Serafin, Eric Singer, and Du Yun. I would be remiss if I did not thank specific family or friends that aided in invisible ways to this project, such as lending me a place to sleep, giving me rides to the airport, loaning me money, playing me a new record for some much needed sonic respite, or just literally offering a shoulder to lean on in hard times: John Cason, Robert DiNinno, Barbara Duncan, Maggie Gardiner, Jason Greer, Lee Gutterman, Matt Langley, Calum and Fergus MacConnell, Julian Marrs, Jon Miles, Phil Moffa, Greg Morris, Kaitlin Paul, Jeremy Sherrer, Barbara Sizemore and Tyler Trotter.

To my teachers in Ghana, where I learned what it means to be an innovator while being accountable to your community: Francis Kofi Akotua, Christopher Ametefe, William Anku, John Collins, SK Kakraba, Kakraba Lobi, and Aaron Bebe Sukura. Thank you to the following individuals and organizations: Don Buchla, Perry Cook, Adrian Freed, Randy Jones, Ajay Kapur, Max Mathews, Nandini Ranganathan, Trimpin, Adam Tindale, Michel Waisvisz, David Wessel, Ableton, Cycling '74, ICMA, NIME, SMC, STEIM, EMMI, Pretty Good Not Bad, Ministry of Casual Living, Good Party, Gary Cassettes and Auralgami Sounds. I must thank several musical influences that directly inform this project: Freak Heat Waves for their inexhaustible dedication to their craft and artistic vision, Fela Kuti for his commitment to social rights activism, Jaga Jazzist for their seamless combination of jazz and electronics, Bjork for her insatiable sense of sonic exploration, Konono No.1 for the vast unknown of African electro-acoustic music they represent, Sun Ra for the vision that comes only through sound, Pat Metheny for his professionalism and ability to accomplish the impossible, and lastly DUB, arguably the first bona fide form of cultural . This work sought to directly marry the techniques used to produce DUB at the mixing-desk with acoustic instruments into one intuitive practice. "DUB is DUB" - Mad Professor (16).

Supervisory Committee My supervisory committee is an amazing group of pioneering multidisciplinary musicians, engineers, and computer scientists. This journey has taught me an infinite amount about music, computing and research, and will be the foundation I go forward on. Infinite gratitude for the support and guidance, especially through the difficult moments, and for inviting me in the first place to pursue this work. It has been a true labor of love, something I feel is essential, and without your mentoring this project could not have been what it is: something I am incredibly proud of, true to the original vision and beyond it. Thank you for supporting this research. I brought an idea to you. You saw its potential and how it fit with your existing research, helped me procure funding, and made the work far richer through each of your unique and individual perspectives.

Funding Sources Thank you to: the Ord and Linda Anderson Interdisciplinary Graduate Fellowship, the Social Sciences and Humanities Research Council of Canada, the Natural Sciences and Engineering Research Council of Canada, and Fulbright.

DEDICATION

This dissertation is dedicated in two parts: first, to my mother and father, Carol and Bennie Trail. My parents have supported me through thick-and-thin my entire life and have shown me, through action, what perseverance and hard work can accomplish. Their optimism and love of life are unparalleled, and their mutual work ethic and dedication to excellence are an inspiration. I grew up watching my mother sing in church with joy, and my father and I took combo-organ lessons together starting when I was 5. Both experiences are key to who I have become as a musician. We always had a computer in the house, so I got started early on my Commodore 64. My father was a teacher and school principal, so I was always in a learning environment feeding my curiosity. I have fond memories of going to work with my father and playing Oregon Trail in his school library all day while he worked, not knowing that those early computer sounds were forming the initial framework for the soundtrack to my life. My parents supported every musical interest I ever had, buying me instruments and gear, paying for lessons, hosting the jam sessions, listening to my crazy ideas, giving me a little spending money to go record shopping, taking me to concerts, letting me play shows on school nights, coming to my recitals, and giving me the encouragement when the going was rough. My mother and father are from Appalachia and had to work hard to have the lifestyle they wanted. My mother didn't get to go to college, even though she was awarded a full academic scholarship; she had to work from an early age to help with family responsibilities. When I got my Fulbright, she had been laid off from her job of 30+ years, and my father went to work on the night shift at the local grocery store to make ends meet, showing me the true meaning of necessity, hard work and responsibility. Completing this PhD was the single most difficult task of my life and is only a small reflection of what my parents have accomplished. They are my true heroes. The second half of this work is dedicated to my daughters Alma and Isla. Nearly everything I do is for you. I hope this work serves as inspiration for you to seek collaboration through diversity, help your fellows, believe in yourself, follow your dreams, set your goals high, remain impervious to discouragement, never settle, be passionate about your work, never stop learning, and to persevere through the difficult times.

FOREWORD

In Ghana the xylophone player will often design and build their own instruments specific to their own needs. The instrument is also central to their society. The computer is central to our society and is also now our most ubiquitous instrument. Now more than ever we have the ability, using modern physical computing techniques, to realize our own electronic music instruments. This body of work is meant to serve as a framework for artists and educators that want to design their own solutions for working with computers in context-specific, musically appropriate ways not reflected in the consumer market. We need to be the designers of our instruments, as in Ghana, or technology will consume us: products will no longer reflect the needs of artists, and entire disciplines will be threatened. Music, and more generally sound, is the primary medium through which I experience the world, as it is for many sound artists. Sound has been an emancipatory medium for me, giving my life personal meaning and trajectory. Access to information and resources is the biggest hurdle to pursuing music technology vocationally. Also, the skills learned through music technology education can be a vehicle for self-empowerment, leading to pathways in engineering and the sciences. Where possible I have sought to develop low-cost solutions to the technological hurdles presented herein, in hopes that the work can reach communities that previously would have had no means by which to practice electronic music. This work has also been, and will continue to be, adapted and applied pedagogically in efforts to formalize a framework for Digital Music Instrument design as a standard part of general music education and computer literacy across economic, linguistic, demographic, cultural, and/or gender barriers.

"...social experiences and productions of sound and audition...may inform emancipatory practices...sound works to unsettle and exceed arenas of visibility by relating us to the unseen...particular tonalities and musics, silences, and noises may transgress certain partitions or borders, expanding the agentive possibilities of the uncounted and the underheard...to embolden the voices of the few...within a multiplicity of territories and languages." - Sonic Agency: Sound and Emergent Forms of Resistance, Brandon LaBelle.

Part I

INTRODUCTION

Chapter 1

Overview

"...noise is triumphant and reigns sovereign over the sensibility of men. This evolution of music is comparable to the multiplication of machines, which everywhere collaborate with man...creating such a variety and contention of noises that pure sound in its slightness and monotony no longer provokes emotion. We must...conquer the infinite variety of noise sounds... combining in our thoughts the noises of trams, of automobile engines, of carriages and brawling crowds...the practical difficulties involved in the construction of these instruments are not serious...thus every instrument will have to offer...extended range..." The Art of Noises: Futurist Manifesto, Luigi Russolo, ca. 1913

1.1 Outline

The focus of this research is to develop new interactive digital systems for acoustic pitched percussion based multi-media performance. The goal is to present a flexible set of ideas and proof-of-concept prototypes to pitched percussionists, composers, and sonic interaction designers rather than providing a specific integrated implementation or detailed technical guidelines. A fundamental goal has been to develop intuitive interfaces that augment traditional instrumental techniques without interrupting traditional performance modes. This general approach is referred to as non-invasive gesture sensing. Harnessing the broad variety of gestures available to the expert, the interfaces afford the performer the ability to network acoustic instruments with computers.

Many new interfaces target users with limited instrument playing abilities, while others are aimed more at experts. This work offers insight into the theory and practice of Digital Music Instruments (DMI) while delivering novel forms of HCI, MIR/machine learning, physical modeling synthesis, gesture recognition, and acoustic actuation appropriate for pitched percussion instruments. This dissertation presents studies in three areas: (i) electro-acoustic pitched percussion instrumental performance contexts; (ii) digital augmentation and interfacing with the computer; (iii) acoustic actuation (musical robotics) via the idiomatic pitched percussion interfaces. It is organized in three parts:

• Part I "INTRODUCTION" presents a short background and description of the topic (pitched percussion context and scope), the objectives and aims (contemporary considerations and problems for the instrument), an overview of some of the related work that has been done by others (history of electro-acoustic pitched percussion), and the introduction to the twelve related peer-reviewed publications written while conducting the research described in this dissertation.

• Part II "IDIOMATIC INTERFACES, MACHINE LEARNING AND ACTUATION" focuses on the thematic contents of the associated publications, providing an overview of aspects of design and deployment, experimental results, system descriptions and contributions to the field: alternative interfaces, custom DSP, idiomatic actuation and applications of the proposed techniques applied to contexts beyond pitched percussion.

• Part III ”CONCLUSION” discusses final thoughts and takes a look at possible future directions for pitched-percussion hyper-instruments.

1.2 Background

This research presents the design and development of digital interactivity for acoustic pitched percussion instruments. The main idea is to acquire and analyze performance gestures used to play traditional and conventional concert instruments. The extracted information is used for real-time control to provide extended digital capabilities to these instruments. This is in order to effectively bridge musical human gestures and complex computer interaction methods in a musically appropriate manner for the particular instrument group under consideration (Musical-HCI).

In many cases, my research studying traditional instruments and techniques in Ghana has informed my approach to sound design. Extended techniques and physical modifications of specific traditional instruments result in modulation of the overall sound [27]. For instance, how one holds a Likembe will dictate the amount of buzz produced by the tines' rings. I model this using position sensors and digital filters and/or control data applied to the direct audio signal or a sequencer of the instrument in question. As playing position changes, so do corresponding parameters of the digital processing on the signal. Where possible I have tried to directly couple traditional acoustical sound design principles with the digital counterpart prototyped. This is an intentional grounding of novel technologies designed for sonic interaction. Here the goal is linking traditional analog sonic interaction design techniques, which are embedded with cultural meaning, to novel digital interactive sound design methods. I have chosen the conventional concert marimba as the primary model for this work. The vibraphone, Gyil and Likembe are also augmented and explored in this thesis. There are important challenges when incorporating pitched percussion in contemporary settings. The contemporary marimbist is largely limited to the orchestra or to appearances as a soloist in the classical concert hall. Sometimes, when appearing in chamber groups or electro-acoustic music compositions in alternative venues, amplification using diaphragm microphones is utilized, but amplification tends to be rare. Early proponents of the electrified marimba include Mickey Hart (Grateful Dead) and Ruth Underwood (Frank Zappa). The vibraphone, primarily appearing in rock and jazz, is more commonly electrified, but this is still rare. The lamellophone family of traditional instruments is sometimes amplified with pickups, but established traditions are few within this scope. As a whole, electrified pitched percussion is a largely undocumented field. There are some limited commercial xylophone style MIDI controllers, but no lamellophone based instruments, outside of prototypes and an obscure patent [72]. This thesis seeks to address these concerns by introducing a collection of tools, implementations, new instruments, and custom software and hardware innovations that augment and expand the existing electronic pitched percussion performance canon. The ideas and techniques presented in this thesis can form the foundation of an integrated framework, and some steps have been taken in this direction. Potential users should approach this set of innovations modularly and apply aspects described to their own practice as they see fit. The novel techniques developed can be utilized pedagogically and are meant to serve as a resource for composers seeking new sound palettes and modes of interaction between performer and computer.
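To make the Likembe example above concrete, the following is a minimal, hypothetical sketch of that position-to-parameter coupling: a normalized tilt or playing-position reading, as might come from a position sensor mounted on the instrument, is mapped to the cutoff of a one-pole low-pass filter applied to the direct audio signal. The filter choice, ranges, and mapping curve here are illustrative only; the implementations described in later chapters use Pure Data patches and custom sensor hardware rather than this standalone Python code.

import math

SR = 44100.0  # audio sample rate (Hz)

def tilt_to_cutoff(tilt, lo=200.0, hi=8000.0):
    """Map a 0..1 tilt/position value to a cutoff frequency on a log scale."""
    tilt = min(max(tilt, 0.0), 1.0)
    return lo * (hi / lo) ** tilt

class OnePoleLowpass:
    """Simple one-pole low-pass: y[n] = y[n-1] + a * (x[n] - y[n-1])."""
    def __init__(self):
        self.y = 0.0
        self.a = 1.0

    def set_cutoff(self, fc):
        # Standard one-pole coefficient for cutoff fc at sample rate SR.
        self.a = 1.0 - math.exp(-2.0 * math.pi * fc / SR)

    def process(self, block):
        out = []
        for x in block:
            self.y += self.a * (x - self.y)
            out.append(self.y)
        return out

if __name__ == "__main__":
    lp = OnePoleLowpass()
    # As the performer's playing position (tilt) changes, so does the filter.
    for tilt in (0.1, 0.5, 0.9):
        lp.set_cutoff(tilt_to_cutoff(tilt))
        # 'block' stands in for one buffer of the instrument's direct signal.
        block = [math.sin(2 * math.pi * 1000 * n / SR) for n in range(64)]
        print(f"tilt={tilt:.1f}  cutoff={tilt_to_cutoff(tilt):7.1f} Hz  "
              f"last sample={lp.process(block)[-1]:+.3f}")

The same pattern generalizes to any continuously sensed gesture scaled onto any digital processing parameter.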

Equally, this research is meant for designers and engineers to use as a point of reference for the musical sensitivities informing the fabrication of the prototypes developed herein. Largely speaking, commercial music technology solutions are mass marketed to a broad user base. My contributions are to be viewed as an open system for the rapid prototyping of non-invasive gesture sensing, signal processing, and acoustic actuation specific to expert-level EA pitched percussion performance. Some of these ideas have also been applied to other instrumental contexts and are described in more detail later in the thesis. My overarching artistic goals for this research are to compose and perform new musical works for a Modern Orchestra of both conventional instruments and Hyperinstruments (acoustic instruments enhanced with sensors, gesture sensors, and robotic actuators), while formalizing their sonic possibilities and performance contexts. MISTIC's previous research on new DMIs concentrated on innovations in virtual computer-sound controllers like the Radiodrum, the Soundplane, and enhanced electro-acoustic percussion instruments like the EDrumset. In these scenarios sound generation was either oriented towards electronics (synthesized instruments played through speakers) or acoustics (various drums struck by a robotic mechanism). This body of work expands the ensemble to now include pitched percussion, focusing on the Western marimba, vibraphone, Gyil and lamellophones. This work is inspired by: DUB (King Tubby, Scientist), Hip Hop (J Dilla, Prince Paul) and Jazz (Miles Davis, Herbie Hancock); 20th Century Minimalism (Steve Reich, Manuel Göttsching); West and South African xylophone and drumming traditions; the Indonesian Gamelan; and various lamellophone cultures from West, Central, and Southern Africa, including Mbira, Kalimba, and Likembe. Additionally, experimental electronic and rock music pioneers such as Tortoise directly inform this work. Equally, this research is indebted to the international league of computer music researchers, engineers, and artists that view the future through a steady, curious and optimistic scope. No tradition was ever static or will ever be. A further area of development is in robotic musical instruments controlled by either hyperinstruments or other gesture sensors [31]. At MISTIC, we refer to the evolution of new musical instruments in terms of both the sound generator and the controller. Traditionally, these separate functions are aspects of one physical system; for example, the violin makes sound via vibrations of the violin body, transmitted via the bridge from the strings, which have been excited by the bow or the finger.

The artistry of the violin consists of controlling all aspects of the strings' vibrations [151]. The piano is a more complex machine in which the player does not directly touch the sound generating aspect (a hammer hitting the string), but still the piano has a unified construction in which the controller is the keyboard, and is directly linked to the sound generation. For hyperinstruments, we decouple these two aspects, allowing for the controller to have an effect that is either tightly linked to the sound produced (as any conventional acoustic instrument has to be) or can be mapped to arbitrary sounds. There is a long history of mechanically controlled musical instruments that produce acoustic sound. More recently a variety of such instruments that can be digitally controlled have been built. The term robotic music instruments is frequently used to describe them; here we refer to them as actuated acoustic instruments. In their most basic form they are simply devices that respond to control messages the composer/performer generates. Robotic musicianship refers to a more sophisticated instance where the control of the robotic instrument is informed by programming it to autonomously "listen" and respond to the musical context. For example, a robotic percussion instrument might use real-time beat tracking to accompany a live musician making expressive tempo changes, or a robotic piano might use chord detection to determine what key to play in. An example context for this work is jazz guitarist Pat Metheny's (17-time Grammy winner) Orchestrion Project [86]. Metheny performs solo accompanied by an entire stage full of robotic instruments. In the fall of 2010 I was the Robotics and Control Interface Technician on the world tour. The Orchestrion provided an invaluable experience observing what does and doesn't work in a rigorous performance context using experimental interfaces and DMIs. As it takes many years to realize and master any instrument, an extended instrument combines musical mastery with a technical requirement that nearly doubles the proficiency overhead of the artist. As such, towards developing an open framework for pitched percussionists, my goal has been to design and develop the technology utilized in my own music as opposed to using commercial solutions.

1.3 Motivation for this work

The conventional modern marimba is marginalized in contemporary music due to limited portability, amplification limitations, its antiquated image and limited performance contexts. The acoustic xylophone does not lend itself to ready amplification in a setting where electric instruments are present.

Relying on a diaphragm microphone as the sole method for electrifying the xylophone fails due to inherent and unavoidable debilitating feedback issues. This severely limits the instrument's range of modern performance contexts. This research seeks to effectively solve this. Beyond simply amplifying the sound, the contributions of this work allow for extensive novel signal processing and human-computer interaction possibilities using various custom designed gesture sensing platforms and signal analysis using music information retrieval (MIR). As a particular example, consider an enhanced African xylophone described later in the thesis. The developed instrument has been equipped with a custom direct analog audio pickup system used concurrently with a custom gesture sensor system and software that analyzes, processes and converts the audio signal and specified gestures into control data that can then be re-routed and mapped to MIDI, OSC, or potentially other types of networking protocol, vastly expanding the instrument's sonic potential and digital interactivity. The custom gesture-sensing interfaces allow for interaction with computer parameters via inherent performance techniques of the instrument, preventing the performance interruption that typically occurs when engaging with a knob or fader in a conventional electro-acoustic setting. In doing so the instrument is modular and compatible with virtually all MIDI hardware/software platforms, so that any user could develop their own custom hardware/software interface with relative ease and vast potential. Additionally, the system allows for use in rigorous stage environments where substantial sound reinforcement is prevalent. Since amplifying and processing the acoustic audio is presumed, the cumbersome resonators underneath the bars are no longer needed. This innovation greatly reduces the frame's mass, improving the portability of an otherwise very large, bulky, nearly immobile instrument. The minimized frame is light and strong: resilient enough to withstand the rigor of constant use, yet light enough to manage as a single compact item in transport. The legs are a keyboard stand and the performer can decide to sit or stand, making it easier to engage pedal boards, incorporate gesture sensing with the legs, add novel interfaces, or even play percussion with foot pedals. The frame fits in a custom case without requiring the bars to be removed; the pickup system remains mounted and protected in transit. This makes the instrument more portable and manageable for performance.
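As a minimal illustration of the gesture-to-control-data conversion described above, the sketch below scales a normalized sensor reading both to a 7-bit MIDI Control Change message and to an OSC-style address/value pair, which is where OSC's higher resolution pays off. The transport layer, the OSC address, and the controller number are hypothetical placeholders, not the thesis implementation.

def to_midi_cc(value, controller=1, channel=0):
    """Scale a 0..1 reading to a 3-byte MIDI Control Change message."""
    v = int(round(min(max(value, 0.0), 1.0) * 127))
    status = 0xB0 | (channel & 0x0F)          # CC status byte on given channel
    return bytes([status, controller & 0x7F, v & 0x7F])

def to_osc(value, address="/gyil/gesture/position"):
    """Keep full resolution for OSC, which is not limited to 7-bit values."""
    return address, float(min(max(value, 0.0), 1.0))

def send_midi(msg):   # placeholder for a real MIDI transport
    print("MIDI:", " ".join(f"{b:02X}" for b in msg))

def send_osc(addr, val):  # placeholder for a real OSC/UDP transport
    print("OSC: ", addr, val)

if __name__ == "__main__":
    for reading in (0.0, 0.33, 0.87):   # simulated gesture-sensor readings
        send_midi(to_midi_cc(reading))
        send_osc(*to_osc(reading))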

Cultural Urgency

The xylophone is one of our oldest instruments, perhaps the first instance in which melody (the voice) and struck rhythm (percussion) were combined to communicate. In cultures such as the Lobi (Northwestern Ghana), the xylophone plays an integral role in the retention and transmission of complex information systems that took centuries to develop. These are pre-digital coding systems intrinsic to the xylophone itself [145]. Modern advancements in consumer-based digital technologies are allowing us unprecedented control over sound in ways that make traditional methods seem primitive. However, what the traditional model provides in terms of performance is a context for proven, successful, socially relevant uses of music. This information is essential to retaining and evolving methods of music production that were developed through explicit cultural impetus over many epochs. The marimba, presented in this work, will serve as a nexus for the xylophone traditions at large. These xylophone traditions are endangered in nearly all indigenous contexts, largely due (aside from war, genocide, deracination, or natural disaster) to the youth, responsible for the tradition, being drawn to modern production trends popularized by media from the West. In societies where the indigenous xylophone exists, musical innovators typically have little access to experimental electronics, or even knowledge of the history of their use artistically. Lack of resources due to the prohibitive cost factors associated with purchasing equipment, the difficult nature of importing gear, and limited accessibility to relevant recordings or literature perpetuates the disassociation even more. Eradicating the limitations imposed by geography and class can potentially sustain the evolution of specific endangered musical systems by making possible previously inconceivable collaborations between isolated, indigenous cultures. From polyphony to the piano to recorded music, technology has been at the forefront of musical innovation and is the reason traditions have not remained static. In 2018 the computer stands as the primary folk instrument of humans in general, perhaps even superseding the guitar, drums, and voice forever [84]. Musicians in this new field need to have a mastery of their tools more than ever as the relevance of fading traditions wanes. In this thesis, I have striven to make the computing platform and sensor interfaces open-source and extremely low-cost where possible. With the advancement of LiPo and solar batteries the system can run in remote rural areas without the need for electricity.

This project subsequently seeks to bridge the gap caused by the lack of exposure and economic limitations artists face when seeking to develop their own digitally extended platform. By creating and laying forth an open framework and a direct entry point of reference to the general set of centralized, computer-augmented pitched-percussion innovations outlined in this dissertation, future artists, researchers, and educators can use and expand upon this work towards the development of a sustainable online community using the cloud, IoT, and social media, with a low barrier to entry.

1.4 Novel Contributions

Pitched Percussion Hyperinstrument System

The overall contribution of this thesis can be described as a pitched idiophone hyperinstrument platform. It comprises:

1. idiomatic gesture sensing interfaces;

2. gesture and signal processing software;

3. a physical model;

4. a custom computing platform;

Each of these component contributions is further described below:

Gesture Sensing Interfaces

1. foot switches,

2. computer vision based 3D skeletal tracking system;

3. 9DOF sensor position tracking system;

4. membrane sensor linear position tracking array;

5. capacitive touch and proximity sensing array;

6. pressure sensor array;

7. optical position tracking sensor array.

Controller information

• assignable momentary and toggle footswitch data;

• multinode skeletal tracking, including:

• mallet differentiation in 3 assignable axes;

• assignable pitch, yaw, roll and acceleration;

• assignable linear position tracking;

• direct and discrete touch;

• vertical proximity and touch pressure.

- gestures derived from the interfaces are reassignable;
- assignments are converted to various mapping logic (a minimal sketch follows below);
- mapping logic is routed to the gesture and signal processing software.
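As a minimal, hypothetical sketch of what reassignable mapping logic can look like in code (class, gesture names, destination addresses and scaling curves are all illustrative, not the thesis implementation): a small routing table binds named gestures to target parameters with per-assignment transforms, and reassigning a gesture simply overwrites its entry.

class MappingTable:
    def __init__(self):
        self.routes = {}   # gesture name -> (destination, transform function)

    def assign(self, gesture, destination, transform=lambda v: v):
        """(Re)assign a gesture to a destination with an optional transform."""
        self.routes[gesture] = (destination, transform)

    def route(self, gesture, value):
        """Convert an incoming normalized gesture value into a control event."""
        if gesture not in self.routes:
            return None
        destination, transform = self.routes[gesture]
        return destination, transform(min(max(value, 0.0), 1.0))

if __name__ == "__main__":
    table = MappingTable()
    table.assign("mallet_height", "/fx/reverb/mix")                   # linear
    table.assign("tilt_roll", "/fx/delay/feedback", lambda v: v * v)  # squared
    table.assign("footswitch_1", "/looper/record", lambda v: int(v > 0.5))

    for gesture, value in (("mallet_height", 0.8),
                           ("tilt_roll", 0.5),
                           ("footswitch_1", 1.0)):
        print(table.route(gesture, value))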

Gesture and Signal Processing Software

1. gesture recognition;

2. gesture prediction;

3. mapping logic automation;

4. pattern differentiation;

5. dynamics calibration;

6. proprioception.

- the gesture recognition component analyzes 3D gestures;
- the prediction component anticipates control data generated from gestures;
- mapping logic is applied to machine control.

Physical Model

1. a physical model that replaces material resonators;

2. resonators of xylophone;

3. resonators of a traditional Ghanaian xylophone;

4. a piezo-based surrogate sensor system used for dynamic control of the model.

- the model reacts to user input;
- the model employs a piezo system;
- the piezo system is coupled with microphone input;
- the piezo and mic together comprise the surrogate sensor system;
- an algorithm analyzes the audio input data (a minimal sketch follows below);
- the analysis is correlated to a training library;
- the library is converted to control data for the model.
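A minimal, hypothetical sketch of the analyze/correlate/convert chain listed above; the features, the tiny "training library", and the control targets are illustrative stand-ins, not the features or classifiers used in the thesis. A crude level/brightness pair computed from one audio frame is matched against labeled library entries by nearest neighbour, and the matched label selects control data for the model.

import math

def frame_features(frame, sr=44100):
    """Return (RMS level, brightness proxy in Hz) for one audio frame."""
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    # Zero-crossing rate scaled to Hz as a rough brightness proxy.
    zc = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0)
    brightness = zc * sr / (2.0 * len(frame))
    return rms, brightness

# Stand-in "training library": feature vector -> strike class -> control data.
LIBRARY = [
    ((0.05, 400.0),  "soft_centre", {"gourd_gain": 0.3, "buzz_amount": 0.1}),
    ((0.30, 900.0),  "hard_centre", {"gourd_gain": 0.9, "buzz_amount": 0.4}),
    ((0.20, 2500.0), "edge_strike", {"gourd_gain": 0.5, "buzz_amount": 0.8}),
]

def classify(features):
    """Nearest-neighbour match against the library (brightness scaled down)."""
    def dist(a, b):
        return (a[0] - b[0]) ** 2 + ((a[1] - b[1]) / 1000.0) ** 2
    return min(LIBRARY, key=lambda entry: dist(entry[0], features))

if __name__ == "__main__":
    # A synthetic, fairly loud and bright frame stands in for piezo/mic input.
    frame = [0.3 * math.sin(2 * math.pi * 1800 * n / 44100) for n in range(1024)]
    feats = frame_features(frame)
    _, label, control = classify(feats)
    print(f"features={feats}  class={label}  control={control}")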

Acoustic Actuation

1. tine excitation;

2. bi-modal bar actuation;

3. multi-modal string platform.

- tine: electromagnetic resonance;
- two bar modes: piezo-driven transducer; solenoid strike (a timing sketch follows below);
- three string modes: self-listening/tuning; picking; electromagnetic resonance.
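A minimal, hypothetical sketch of the solenoid-strike mode listed above, reduced to its timing core: a MIDI-style velocity is mapped to the width of the pulse that energizes the coil, since a longer pulse yields a harder strike up to a safe maximum. The set_solenoid() call is a placeholder for a real GPIO/driver write (e.g., on the Beaglebone/Arduino chain pictured in Chapter 5), and the pulse widths are illustrative.

import time

MIN_PULSE_MS = 3.0    # barely audible strike
MAX_PULSE_MS = 14.0   # hardest safe strike before the coil overheats

def set_solenoid(on):
    """Placeholder for the real GPIO write driving the solenoid transistor."""
    print("solenoid", "ON " if on else "OFF", f"at {time.perf_counter():.4f}s")

def strike(velocity):
    """Fire one strike; velocity is a MIDI-style value in 1..127."""
    v = min(max(velocity, 1), 127) / 127.0
    pulse_ms = MIN_PULSE_MS + v * (MAX_PULSE_MS - MIN_PULSE_MS)
    set_solenoid(True)
    time.sleep(pulse_ms / 1000.0)   # hold the coil energized
    set_solenoid(False)

if __name__ == "__main__":
    for vel in (20, 64, 127):        # soft, medium, hard strikes
        strike(vel)
        time.sleep(0.1)              # let the plunger retract before re-firing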

Computing Platform

1. a single board computer;

2. freely assignable GPIO;

3. Linux open architecture;

4. low latency audio IO.

The thesis is organized following this hierarchy of contributions and describes each of them in more detail, along with the role the author had in its development.

Chapter 2

Related Work

2.1 New Instruments for Musical Expression (NIME)

Although instruments of the modern symphony orchestra have reached maturity, musical instruments continue to evolve. A significant area of development is in electroacoustic instruments, combining natural acoustics with electronic sound and/or electronic control means, also called hyperinstruments [5]. The evolution of new musical instruments can be described in terms of both the sound generator and the controller. Traditionally, these separate functions are aspects of one physical system; for example, the violin makes sound via vibrations of the violin body, transmitted via the bridge from the strings, which have been excited by the bow or the finger. The artistry of the violin consists of controlling all aspects of the strings' vibrations [136]. The piano is a more complex machine in which the player does not directly touch the sound-generating aspect (a hammer hitting the string), but still the piano has a unified construction in which the controller is the keyboard, and is directly linked to the sound-generation. For hyperinstruments these two aspects are decoupled, allowing for the controller to have an effect that is either tightly linked to the sound produced (as in any conventional acoustic instrument) or can be mapped to arbitrary sounds. Modern advancements in consumer based digital technologies are allowing for unprecedented control over sound in ways that make traditional methods seem primitive. What the traditional model provides in terms of performance is a context for proven-to-be successful, socially binding uses of music. This information is critical to retaining and evolving essential methods of music production, dissemination and performance [28].

John Collins describes "The Musical Ghost in the Machine", where he discusses the problem of losing the flow of creative energy that takes place between musicians in a live context. Specifically, in this example the machine lacks the ability to express the context of meaning within the music specific to tradition. He explains that this aspect is lost due to a lack of local rhythmic content and no relation between song melody and the tonal movement of the language. Computers present the risk of eschewing tradition at large. This work seeks to raise awareness of the relevance of tradition when employing novel uses of technology. Hyperinstruments are augmented acoustic instruments with added electronic sensors used as gesture acquisition devices. Designed for interfacing with a computer in a more intuitively musical manner than conventional means, they leverage the performer's instrumental expertise [128]. The acoustic pitched percussion family is problematic when included in electro-acoustic contexts because it is difficult to acquire a direct signal for sound reinforcement, signal processing or generating control data. While typically used for audio reinforcement, microphones are prone to feedback in most EA situations for this particular application, as they don't fully capture the signal and pick up unwanted environmental sonic artifacts. These are poor conditions from which to generate control data. It is also desirable to avoid directly adding electronics and sensors to the instrument that impede traditional technique or corrupt the sound. A common approach to providing digital control without modifying the instrument is the use of an external interface such as a fader/knob style MIDI controller or a set of foot pedals. However, this is problematic, as the performer has to either stop playing in order to interface with the computer or use one hand. Foot pedals are better, yet still can be distracting since a mallet player is often moving about on their feet. To address these concerns, a set of tools that take advantage of dynamic gestural parameters intrinsic to pitched percussion performance has been developed [34]. These tools are designed for capturing these gestures in musically meaningful, non-invasive ways in order to control the computer in performance in ways idiomatic to the conventional pitched percussion family. The tools are based on a combination of non-invasive sensing, high-level gesture control detection, and MIR extraction methods. The goal is to develop novel tools any performer can easily use without substantial technical installation requirements or musical constraints. The system also addresses cost issues by taking advantage of low-cost and open source resources when possible. In general, the marimba is used as a canonical example for this work; however, the toolset is also applied to the vibraphone, the Gyil (an African xylophone from Ghana), and the Likembe, a central African lamellophone.

2.2 Electronic Pitched Percussion

A pitched percussion instrument is a percussion instrument used to produce musical notes of one or more pitches, as opposed to an unpitched percussion instrument which is used to produce sounds of indefinite pitch. The term pitched percussion is now preferred to the traditional term tuned percussion. In this thesis, the xylophone and lamellophone groups serve as the primary instrument targets for augmentation.

Xylophone

Previous areas of research that directly inspire this body of work include Randy Jones' Soundplane (a force-sensitive surface for intimate control of electronic music that transmits x, y and pressure data using audio signals generated by piezo input captured at a high audio sampling rate) [108], Adam Tindale's acoustically excited physical models using his E-Drumset (a piezo based drum surface that uses audio signals, created by actual drumming, to drive physical models) [107], Ajay Kapur's North Indian hyperinstrument and robotic instrument system [106] and the Radiodrum [12]. These examples represent a culmination of platforms this work merges, applied specifically to pitched percussion¹. Past work regarding electronic pitched percussion includes: the Simmons Silicon Mallet [104], MalletKat², Xylosynth³, Marimba Lumina⁴, the Deagan Electravibe and the Ludwig Electro-Vibe electric vibraphone systems [17]; the Deagan Amplivibe and Mussar Marimbas Inc. Ampli-pickup [62]; and Ayotte's (Fig. 2.1), K and K's⁵ (which formerly featured MIDI, Fig. 2.2), Vanderplaas's⁶ (which currently features MIDI), and Mallettech's⁷ piezo based systems [131]. Conventional percussion based trigger interfaces typically use piezo/FSR technology (in some cases acquiring the direct audio signal, which is scaled for data and amplified for processing), but usually only detect onset and velocity scaled to the relatively limited MIDI protocol, compared to Open Sound Control [150]. In this case most of the dynamic information contained in the performance gestures is lost. Seeking to address these concerns, this thesis explores analyzing and utilizing the full audio waveform of the acoustic instrument [4] coupled with extended 3D control mapping options [36] to also take advantage of the entire range of motion involved with performance in order to modulate the source audio and control other instruments (synths/robotics/FX) remotely without interrupting conventional performance techniques.

¹ http://nuke.ninoderose.it/Vibrafono/vibrafonoelettrico/tabid/95/Default.aspx
² https://www.alternatemode.com/malletkat/
³ http://www.wernick.net/
⁴ http://www.absolutedeviation
⁵ http://kksound.com/instruments/vibraphone.php
⁶ http://www.vanderplastal.com/
⁷ https://www.mostlymarimba.com/accessories/v3-pickup-system.html

Figure 2.1: Ayotte sensor system (ca. 2000)

Commercial mallet-percussion based MIDI control solutions are limited. The Simmons Silicon Mallet and MalletKat both incorporate rubber pads with internal FSRs laid out in a chromatic keyboard fashion and vary in the octave range afforded. The Simmons model offered very little reconfigurability, as it was basically an autonomous instrument, and had a 3-octave mallet keyboard controller played with mallets or sticks connected to an analog sound module similar to the SDS8⁸, but laid out in a xylophone format. The MalletKat has some internal digital sound sampling synthesis and offers the same typical internal MIDI configurability as high-end piano style keyboard controllers, yet catered to the mallet instrumentalist. It can connect to any MIDI compatible sound module or otherwise with appropriate MIDI conversion interfaces (MIDI/CV/OSC, etc.). K and K Sound used to produce a piece of hardware that extracts control data from a vibraphone using an array of piezo sensors (Fig. 2.2) and is the first known commercial device of this type. The Xylosynth, by Wernick, is a digital instrument that aims at providing the same natural haptic response as a wooden xylophone. It generates MIDI notes yet produces no acoustic sound. Strikes are converted to MIDI note data, and therefore it is purely a controller requiring an additional sound module/instrument to generate sound.

⁸ http://www.simmonsmuseum.com

Figure 2.2: Original K and K MIDI-Vibe

The Marimba Lumina (ML) is another mallet style controller. ML adds a few new features to the usual capabilities of a mallet instrument controller. Expanding the potential for expressive control, the instrument responds to several new performance variables, including position along the length of the bars, mallet presence, dampening, and note density. Additionally, it allows different controls to be triggered when playing different bars. User definable "Zones" also allow portions of the instrument to respond to gesture in different ways. ML can also identify which of four color-coded mallets has struck (mallet differentiation). This allows one to program different sounds for different instrumental responses for each mallet. This affords the ability to implement musical structures in which one mallet selects a course of action while others modify or implement it, or to simplify voice leading and give each mallet independent expressive control over the same sounds. The ML is a direct influence on this research; it is now extremely rare and expensive. The work presented in this thesis offers a way to emulate behaviors exclusive to the ML while adapting them to acoustic instruments. While these mallet based DMIs and controllers are all relatively robust, they are still unable to offer the same haptic and spatial feedback as an acoustic instrument. Typically limited to MIDI formats, they require the instrumentalist to modify playing technique in order to achieve successful results. They are also costly and relatively rare.
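To make concrete what onset-and-velocity-only MIDI triggering, as in the conventional trigger interfaces discussed in this section, captures and what it discards, the following is a minimal, hypothetical sketch. The threshold, re-trigger guard, scaling, and the synthetic test signal are illustrative and do not correspond to any of the commercial systems named above: the continuous signal is reduced to discrete (onset time, 7-bit velocity) pairs, and everything else about the gesture is lost.

import math

SR = 44100                          # sample rate of the piezo-like signal
THRESHOLD = 0.05                    # onset threshold on the rectified signal
RETRIGGER_GUARD = int(0.05 * SR)    # ignore re-triggers within 50 ms

def detect_onsets(signal):
    """Return (sample_index, midi_velocity) pairs from a piezo-like signal."""
    events, last_onset = [], -RETRIGGER_GUARD
    for n, x in enumerate(signal):
        if abs(x) > THRESHOLD and n - last_onset >= RETRIGGER_GUARD:
            # Peak level over the next few ms stands in for strike strength.
            peak = max(abs(v) for v in signal[n:n + int(0.005 * SR)])
            velocity = max(1, min(127, int(peak * 127)))
            events.append((n, velocity))
            last_onset = n
    return events

if __name__ == "__main__":
    # Two synthetic decaying strikes of different strengths.
    sig = [0.0] * SR
    for start, amp in ((1000, 0.9), (22050, 0.4)):
        for k in range(2000):
            sig[start + k] += amp * math.exp(-k / 400.0) * math.sin(0.3 * k)
    for n, vel in detect_onsets(sig):
        print(f"onset at {n / SR:.3f} s  ->  MIDI velocity {vel}")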

Lamellophone

Lamellophones have a series of tines fixed at one end and free on the other. Musical tones are produced by depressing and releasing the free end with the thumb, allowing the tine to freely vibrate [68]. They are ideal instruments for augmentation, being hand-held, compact and ergonomic. The instrument's design facilitates uninterrupted traditional playing while allowing the performer to engage various specific sensors with broad or minute arm/hand gestures or explicitly with the free fingers. The unused surface area of the body is ideal for intuitive sensor placement, while direct audio acquisition is simple and robust via a piezo transducer. Lamellophones are typically acoustic folk instruments; however, augmentations such as these equip the instrument for contemporary settings where audio processing and sound reinforcement are presumed, while offering experimental, novel interactive sound design and multi-media performance practice capabilities. Previous work in lamellophone hyperinstruments can be reviewed in [132]. Electrified lamellophones are common [142]; however, lamellophone hyperinstruments are rare. This work is inspired by Konono No.1 from the Democratic Republic of the Congo. Konono No.1, a band founded by Mawangu Mingiedi in 1966, combines voice, homemade percussion, hand-carved wooden microphones, and 3 custom built electric Likembe. Their music was originally adapted Zongo ritual music played on horns crafted from elephant tusks. Based centrally in the dense urban environment of Kinshasa, their acoustic music became inaudible because of increased noise pollution. Compelled to build amplification systems for their instruments using discarded, obsolete electronics, their motivation was partially of practical concern [46]. To sustain their musical heritage, amplification was essential. As a result, an unprecedented neo-traditional musical movement was born, spawning a range of new instruments and sound design practices. The advent of electronics in their music propelled Konono No.1 onto the world stage, while leaving an indelible mark on their own culture's contemporary music scene [20]. In 2007, they toured as the supporting backing band for Bjork (a 14-time Grammy Award nominee from Iceland), while collaborating with critically acclaimed producer Timbaland. Konono No.1's album Live At Couleur Cafe was nominated for a Grammy in 2008. They won a Grammy in 2011, as guests on pianist Herbie Hancock's album The Imagine Project.

My system utilizes relatively easily sourced, low-cost components, drawing from Konono No.1's [46] spirit of using salvaged electronic components to develop new electro-acoustic instruments, analog signal processing and amplification systems for their neo-traditional lamellophone based music. Instrument manufacturer Hohner experimented with electric lamellophone designs in the mid-twentieth century. The Guitaret⁹ has internal tines that are excited when the instrument's unique keys are depressed (Figure 2.3), producing a lamellophone style tone [152]. The Pianet¹⁰ is another of their lamellophone inspired instruments from the same period [144]. It is an electric piano with tines that are, by default, slightly depressed in the unplayed state; a piano-style key press releases the tine, causing it to vibrate (Figure 2.4) and produce the respective tone, an inverse lamellophone effect. Both have passive electromagnetic single coil pickups per tine, summed to a monophonic output signal intended for direct audio amplification via a standard 1/4" jack. Although unique, rare, and antique, they are excellent examples of repurposing ancient musical instrument engineering techniques to achieve novel results in contemporary music. This type of repurposing is the focus of OSP's engineering approach for this work, much the way Konono No.1's work repurposes technology in the form of recycling discarded analog electronics to arrive at new instruments, sounds, and contexts based on traditional means.

Figure 2.3: Guitaret

Lamellophone style hyperinstruments can be seen in the work of: Adrian Freed's Kalimba Controller [48]; Fernando Rocha and Joseph Malloch's Hyper-Kalimba [111]; Jose A. Olivares' Kalimbatronique [97]; Ian Hattwick's Mbira controller [55]; and Daniel Schlessinger's Kalichord [117]. Kalimbo by Rob Blazy¹¹ is an example of hybrid EA/digital design using OSP and arriving at a new instrument. A patent from 2004 [72] describes an augmented Mbira for interactive multi-media systems. An electromagnetically actuated, electrified lamellophone can be seen in Octant's work¹².

⁹ https://en.wikipedia.org/wiki/Guitaret
¹⁰ https://en.wikipedia.org/wiki/Pianet
¹¹ https://vimeo.com/179818357

There are even virtual software versions: plugin instruments and multi-touch screen instruments coupled with a lamellophone style GUI graphic for interfacing the synthesis models.

Figure 2.4: Pianet tine

2.3 Non-Invasive Sensing

The focus of this thesis is the development of a system for the design of digitally extended musical instruments (DMI) and their utilization in multi-media performance contexts, specifically for traditional and conventional acoustic pitched percussion instruments. In this section, previous work in non-invasive sensing, which is a core idea in this work, is presented. An important influence has been the 3D control paradigm previously developed on the Radiodrum. The Radiodrum is an electromagnetic capacitive 3D input device that uses audio carrier waves transmitted by the tip of the drum stick to track the stick tip's movement in 3D space through capacitive sensing using an antenna. Invented by Dr. Max Mathews [81], it has been pioneered and mastered by Dr. Andrew Schloss [118]. Using the Kinect (a motion sensing input device for the Xbox 360 video game console), aspects of the 3D sensing features of the Radiodrum are emulated and compared in this thesis. The Kinect has native features that, when adapted for use within vibraphone performance contexts, offer a dynamic and flexible intuitive interface prototyping platform. This is an example of effectively embedding new technologies onto traditional instruments in a non-invasive capacity. This work has been influenced by several new music instruments beyond the pitched percussion family. The Theremin is unique and relevant in that it is played without physical contact, controlled by 3D hand gestures, and therefore can be considered non-invasive. The modern Moog Ethervox, while functionally still a Theremin, can also be used as a MIDI controller and allows the artist to control any MIDI enabled FX/robotic instrument with it.

12http://matthewsteinke.com/octant

With the development of new gaming interfaces based on motion controls, it has become easier to build low-cost, robust musical interfaces controlled by bodily movement. This allows one to use natural motion, which, as observed in [34], is an inherent part of pitched-percussion performance. Motion-based game controllers were used as musical tools in [56], based on the Wiimote device. In [87], the idea of using the Kinect as a controller for musical expression is discussed. Here, the Kinect is a low-cost way to obtain 3D position data in order to enhance the soundscape-manipulation capabilities of a performer. The ability to network existing musical instruments with computer systems in this way has large potential. Hyperinstruments are acoustic instruments that have been augmented with sensing hardware to capture performance information [80]. The most common use of hyperinstruments has been in live electro-acoustic music performance, where they combine the wide variety of control possibilities of custom controllers with the expressive richness of acoustic instruments. Another interesting application of hyperinstruments is performance analysis. The most common example is the use of (typically expensive) acoustic pianos fitted with a robotic actuation system on the keys that can capture the exact details of the player's actions and replicate them (the Yamaha Disklavier). Such a system allows the exact nuances of a particular pianist to be captured; when played back on the same acoustic piano, the result sounds identical to the original performance. The captured information can be used to analyze specific characteristics of the performance, such as how the timing of different sections varies among performers. John Collins discusses the potential for analysis in African music, where the rhythmic systems and interaction between performers are so nuanced that "...purely for the analytical reasons of studying the minute details of complex rhythms (such as local cross-rhythmic ones) the computer is very useful - as standard notation is often not precise enough to deal with the tiny spaces between the quavers" [28]. MISTIC, the lab in which this research took place, has been an early proponent of computational ethnomusicology in this regard [103]. The majority of hyperinstruments proposed in the existing literature have either been modifications of existing Western musical instruments (such as guitars, keyboards, pianos, and strings) or entirely novel designs. The focus of this work is to offer context for instruments beyond the conventional marimba and vibraphone: it includes extending the Gyil (Ch. 3.2a/b), a traditional wooden West African xylophone, with digital sensing capabilities. The conventional Gyil has 14 or 15 wooden

bars tuned to a pentatonic scale mounted on a wooden frame. Hollowed-out gourds are hung from the frame below the wooden bars and act as resonators. The Gyil's sound is characterized by a buzzing resonance due to the preparation of the gourds: holes are drilled in them and covered with a wax-paper-like material (typically spider silk egg casings). These covers act as vibrating membranes, but have irregular forms owing to their material properties and the shape of each hole. A similar emphasis on Western music occurs in musicology and MIR research. As mentioned above, this has prompted research in computational ethnomusicology, which is defined as the use of computers to assist ethnomusicological research [50], [123]. Beyond exclusively Western instruments, contemporary research also addresses the use of hyperinstruments in other music cultures. For example, it has been explored in the context of North Indian music [3] and Indonesian Gamelan, where digital sensors were used in the development of Gamelan Electrica [70], a new electronic set of instruments based on Balinese performance practice. One interesting motivation behind the design of Gamelan Electrica is a reduction in physical size and weight, simplifying transportation, a concern that extends to the exclusivity of the manufacturing of the traditional instruments as well [15]. The same concern motivated investigating the replacement of the gourd resonators by simulating them digitally using physical modeling. The hybrid use of microphones to capture the sound excitation and of simulation to model the needed resonances has been proposed in the context of percussion instruments and termed acoustically excited physical models [7], [4]. Past physical modeling of pitched percussion instruments has focused on the vibraphone and marimba [143]. This method can also aid in the preservation of the instrument as the tradition wanes and becomes endangered [43]. Most music in the world is orally transmitted and has traditionally been analyzed after the fact through manual transcription, either in common music notation or in invented conventions specific to the culture studied. In the same way that direct sensing technology opened up new possibilities in the study of piano performance, direct sensing can provide invaluable information towards the understanding of traditional music. Music is defined as much by the process of creation as by recorded artifacts [23]. Capturing information about musicians' actions can aid in understanding the process of music creation. This is particularly important for orally transmitted music cultures that are slowly hybridizing or disappearing under the onslaught of global pop music culture. The goal is that by interfacing the Gyil with the computer we will be able to understand more about how it is played, enable the creation of idiomatic electro-

acoustic compositions and performances that integrate it, and better understand the physics of its sound production. In addition, the work intends to spark collaborations with masters of the tradition, for whom electro-acoustic mediums are largely unexplored due to typically limited access to the associated technologies13. Indirect acquisition refers to the process of extracting performance information by processing audio of the performance captured by a microphone rather than by using direct sensors. It has been motivated by some of the disadvantages of hyperinstruments, such as the need for modifications to the original instrument and the difficulty of replication [134], [127]. In general it requires sophisticated audio signal processing, and sometimes machine learning, to extract the required information. An interesting connection between direct and indirect acquisition is the concept of a surrogate sensor. The idea is to use direct sensors to train and evaluate an algorithm that takes as input the audio signal from a microphone and outputs the same control information as the direct sensor [11]. When the trained surrogate sensor(s) exhibit satisfactory performance, they can replace the direct sensor(s).
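As a concrete illustration of the surrogate-sensor idea (not the implementation used in this thesis), the sketch below fits a least-squares linear map from a simple audio feature (per-hit RMS energy) to the values reported by a direct sensor; once fitted, the microphone feature alone stands in for the sensor. The feature choice, training data, and names are illustrative assumptions.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// Fit y = a*x + b by least squares: x = audio feature (e.g. per-hit RMS),
// y = direct-sensor reading (e.g. strike velocity) recorded during training.
struct LinearSurrogate {
    double a = 0.0, b = 0.0;
    void fit(const std::vector<double>& x, const std::vector<double>& y) {
        const double n = static_cast<double>(x.size());
        double sx = 0, sy = 0, sxx = 0, sxy = 0;
        for (size_t i = 0; i < x.size(); ++i) {
            sx += x[i]; sy += y[i]; sxx += x[i] * x[i]; sxy += x[i] * y[i];
        }
        a = (n * sxy - sx * sy) / (n * sxx - sx * sx);
        b = (sy - a * sx) / n;
    }
    double predict(double feature) const { return a * feature + b; }
};

// RMS of one analysis window of audio samples.
double rms(const std::vector<double>& frame) {
    double acc = 0.0;
    for (double s : frame) acc += s * s;
    return std::sqrt(acc / frame.size());
}

int main() {
    // Hypothetical training pairs: RMS of each detected hit and the velocity
    // the direct sensor reported for the same hit.
    std::vector<double> hitRms    = {0.05, 0.12, 0.20, 0.33, 0.41};
    std::vector<double> sensorVel = {18,   42,   66,   101,  124};

    LinearSurrogate surrogate;
    surrogate.fit(hitRms, sensorVel);

    // After training, the microphone feature alone estimates the control value.
    std::printf("estimated velocity: %.1f\n", surrogate.predict(0.25));
}
```

In practice a richer feature set and model would be used, but the training/evaluation loop against the direct sensor is the essential point.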

2.4 Physical Modeling Synthesis

In this thesis, a physical model for the Gyil is described. The Gyil is an African mallet instrument with a unique system of resonator gourds mounted below wooden bars. The Gyil gourds are unique in that they have holes drilled in them covered over with membranes. These membranes react to sound pressure in a highly non-linear fashion and produce a buzzing sound when the bars are played with force. Physical modeling efforts have typically focused on modeling plucked and bowed stringed instruments, struck bars, and membranes, usually in the context of Western instruments. The Gyil, and the nonlinear processes that cause its characteristic sound, have not been studied before in the context of physical modeling synthesis. One objective behind creating the model is to provide performers and composers with an audio effect that can be applied to the signal of any mallet percussion instrument. This enables the use of techniques associated with the Gyil without having to include the actual gourds and membranes. The gourds are cumbersome, fragile, difficult to construct, and cannot be added to conventional pitched percussion

13https://www.raspberrypi.org/blog/guest-post-a-pi-lab-in-rural-ghana/

instruments. Electro-acoustic mallet percussionists therefore have little choice but to apply filters and audio effects designed for other instruments, such as guitar pedals. In contrast, this model and its associated audio effect are historically rooted in pitched percussion instruments and their sound-design paradigms. The Gyil is an ancestor of the marimba and originates in West Africa, within the region defined by the political borders of northwest Ghana, northeast Ivory Coast, and southwest Burkina Faso. It has strong roots in the area near Wa, Ghana, along the Black Volta river, where it forms a key part of Lobi and Dagara culture, occupying a central role, with Gyil performances accompanying major ceremonies and festivals [89]. There is a special variety of Gyil for solo performances at funerals, and it is believed that the sound emitted from the instrument escorts the deceased's soul into the next world. The Gyil is a pentatonic instrument with between 11 and 18 keys, played with wooden mallets whose rubber tips are about the size of a nickel in diameter and which are held between the index and middle fingers. The keys are carved from a local hardwood called Legaa and are mounted to a wooden frame with leather strips [90]. A dried calabash gourd is hung below each bar. These gourds have irregularly spaced holes drilled in them, covered over with spider silk egg casings. The egg casings form membranes that produce a buzzing sound when the bars are played with enough force. This buzzing distinguishes the Gyil, and it is this aspect of the instrument that the modeling efforts focus on. (Figure 2.5) shows photos of a Gyil and the associated gourds.

Figure 2.5: Gyil: bass gourd (14" deep, 10" dia. body, 4" dia. mouth)

The sound design of the buzzing is motivated by the Lobi belief that the vibrations

it produces balance the water in the human body and have physiological healing qualities. There is no traditional written notation for the Gyil, although transcription efforts have been published by Ghanaians [135]; traditionally the repertoire and technique are passed on orally. Players typically perform solo, in duos, or in groups, accompanied by a bell, the Kuar (a calabash resonator drum with animal skin stretched over an opening, played with the hands), and the Gangaa (a double-pitched log drum played with sticks). The music combines pre-written melodies with improvisation, comprising bass ostinatos in the left hand and lead solos in the right hand, similar to jazz arrangements. The repertoire is made up of pieces that are time-, season-, and context-appropriate: funeral songs for men versus women, wedding songs, and so on. Physical modeling is an audio synthesis technique in which the sound of an instrument or circuit is emulated by numerically solving the differential equations that describe the physics of the system [75]. In contrast to other synthesis techniques, the parameters of physical models have a direct correlation with the properties of the sound-generating object being modeled. For example, in a model of a vibrating string the length, linear density, and stiffness can be directly controlled with parameters [98]. Most research on physical models has focused on modern Western European instruments. Even for instruments that have been intensely studied, such as the violin, the resulting sound is still quite distinct from the original acoustic instrument [124], [38], [122]. At the same time, physical models provide a realism and physicality to the generated sound that is impossible to achieve with techniques such as waveform sampling. Early work on physical modeling was based on linear approximations of non-linear phenomena, such as the behavior of percussive surfaces [30], [76]. Non-linear models can provide a more realistic sonic experience [16], but are considerably more complex to understand [45]. This difficulty has motivated the development of exploratory interfaces for the parameters of these models [52], as well as learning models that can automatically obtain interesting non-linear mappings for sound synthesis [40]. Another idea is to combine or manipulate the audio signal acquired from an acoustic instrument with digital processing as a form of hybrid synthesis [126], [7]. There are some existing efforts to obtain physical models for non-Western-European instruments, such as the ancient Greek auloi [64], as well as pitched percussive instruments [143], [57]. However, there are many instruments which have not been physically

modeled, including the Gyil. As described in Section 2.3, each wooden bar of the Gyil is retrofitted with a contact microphone, providing a clean audio signal per bar that can be used for sound enhancement, synthesis, and transcription [113]. An early version of the physical model of the gourds described in Section 2.3 was also used. In this thesis the algorithm is improved, the design is explained in more detail (Ch. 3.2c), various configurations are explored, and feedback from a group of expert musicians with intimate knowledge of the instrument is reported.
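To make concrete how the parameters of a physical model map directly onto physical properties, the sketch below implements a classic Karplus-Strong plucked-string model, where the delay-line length sets the pitch and a loss factor sets the decay. This is a generic textbook model used purely for illustration, not the Gyil gourd model developed in this thesis.

```cpp
#include <cmath>
#include <cstdio>
#include <random>
#include <vector>

// Karplus-Strong plucked string: a delay line of length N ~ sampleRate/f0
// filtered by a two-point average. N controls pitch, 'loss' controls decay.
std::vector<float> pluck(float f0, float seconds, float loss = 0.996f,
                         int sampleRate = 44100) {
    const int N = static_cast<int>(std::round(sampleRate / f0));
    std::vector<float> delay(N);

    // Excite the "string" with a burst of noise (the pluck).
    std::mt19937 rng(42);
    std::uniform_real_distribution<float> noise(-1.0f, 1.0f);
    for (float& s : delay) s = noise(rng);

    std::vector<float> out(static_cast<size_t>(seconds * sampleRate));
    size_t idx = 0;
    for (float& y : out) {
        const size_t next = (idx + 1) % N;
        y = delay[idx];
        // Average two adjacent samples, scale by the loss factor, feed back.
        delay[idx] = loss * 0.5f * (delay[idx] + delay[next]);
        idx = next;
    }
    return out;
}

int main() {
    // A 220 Hz string for 2 seconds; write raw mono floats for auditioning.
    std::vector<float> samples = pluck(220.0f, 2.0f);
    if (FILE* f = std::fopen("string_220hz_f32_mono_44100.raw", "wb")) {
        std::fwrite(samples.data(), sizeof(float), samples.size(), f);
        std::fclose(f);
    }
}
```

Changing N changes the effective string length (pitch), and changing the loss factor changes the decay time, which is exactly the kind of direct parameter-to-physics correspondence the paragraph above describes.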

2.5 Acoustic Actuation

Introduction

There is a long history of mechanical devices that generate acoustic sounds without direct human interaction, from mechanical birds in antiquity to the sophisticated player pianos of the early 19th century that could perform arbitrary scores written in piano-roll notation. Using computers to control such devices has opened up new possibilities in terms of flexibility and control while retaining the richness of the acoustic sound of actual musical instruments. The terms music robots or musical robotic instruments have been used to describe such devices [67]. These new robotic instruments have a legitimate place, with the potential to become part of embedded, conventional musical practice rather than remaining a research curiosity. While musical robotics might seem niche and esoteric at this point [22], historic innovations such as the move from monophonic to polyphonic music, the electrical amplification of the guitar, and the computer in the recording studio all met skepticism before eventually becoming mainstay practices. Although such music robots have been used in performances of both composed and improvised music, with and without human performers sharing the stage, they are essentially passive output devices that receive control messages and in response actuate sound-producing mechanisms. Their control is typically handled by software written specifically for each piece by the composer/performer, although new companies are emerging with commercial solutions [http://polyend.com/]. The author has worked intimately with musical robotics, most notably Trimpin's (CanonX+4:33=100), a set of robotically actuated prepared-piano instruments. These instruments employ electromagnetic string actuation, making the strings resonate and creating a droning effect, while solenoid beaters hit either the strings directly or various preparations attached to the strings, creating percussive effects. Other instruments include Eric Singer's

LEMURbots and Guitarbot [119], and Ragtime West's pneumatic instruments14. The author also composed for and performed with EMMI's15 robotic string instrument AMI (Automatic Monochord Instrument) [137]. In retrospect all of these instruments had quirks, as expected. The Guitarbot, while arguably the most expressive, is manually tuned and has no self-sensing technology: it typically goes out of tune during a concert, is cumbersome to transport, and is difficult to maintain. Trimpin's pianos were a site-specific installation rather than practical instruments, being large and immobile. AMI serves as a model for modularity, mobility, and expressivity.

Design

In the past few years there has been growing interest in music robotics. The proliferation of physical computing and the support of online user communities dedicated to specific platforms have made working with sound and computers, and building custom hardware and software for such tasks, more accessible than ever to the general public. Robotic instruments that generate sound acoustically using actuators have been increasingly developed and used in performances and compositions over the past ten years [67]. The natural sound-radiation patterns of acoustic instruments, coupled with the sophisticated control paradigms that computers allow, make musical robotics a dynamic platform. Observing the electronic music performance convention of a performer at a computer managing many layers of software synths, samplers, audio loops, and effects processing simultaneously gives a better perspective on the motivations behind musical robotics. Simply put, the new instruments being introduced in the research and artistic communities open up a world of acoustic sounds that have yet to be heard, matched with a level of digital interactivity that has never before been possible. These new instruments present unprecedented potential for novel multi-media performance contexts of great richness. Aside from modifying existing acoustic instruments, a designer can invent entirely new instruments around the physics of available actuators so as to optimize their potential. Acoustic instruments designed for the gestures a human produces with conventional techniques are often difficult to play mechanically at a level of musical sophistication comparable to a human's nuance on the same instrument [110].

14http://www.ragtimewest.com/
15http://expressivemachines.com/dev/wordpress/

Proprioception

Proprioception, in this case, refers to the ability of a digital, electro-mechanically actuated acoustic instrument to autonomously sense, respond to, and create musical events specific and appropriate to the musical setting. An early example of an automated, programmable musical instrument ensemble was described by Ismail al-Jazari (ca. 1136-1206 A.D.), a Kurdish scholar, inventor, artist, and mathematician who lived during the Islamic Golden Age (the Middle Ages in the West). Best known for writing the Book of Knowledge of Ingenious Mechanical Devices in 1206, al-Jazari described automata in the form of a fountain on a boat featuring four automatic musicians that floated on a lake to entertain guests at royal drinking parties. It featured a programmable drum machine with pegs (cams) that bumped into little levers operating the percussion; the drummer could be made to play different rhythms and drum patterns by moving the pegs around, and the automata performed more than fifty facial and body actions during each musical selection. This was achieved through the innovative use of hydraulic switching [60]. A modern example of a robotic musical ensemble is guitarist Pat Metheny's Orchestrion project, which was specifically influenced by the player piano. Metheny cites his grandfather's player piano as the catalyst for his interest in orchestrions: machines that play music and are designed to sound like an orchestra or band [18]. As the Robotics and Interface Technician for that project, I worked intimately with the instruments and designed interface solutions between the guitar and the computer that drove the robotics. A seminal text on the subject is Machine Musicianship [112], one section of which describes a comprehensive system for composition, creation, and performance between humans and robots. Rowe describes improvisational and compositional systems that combine music feature extraction, musical analysis, and interactivity to generate engaging experiences for the audience. Chapter 5 describes recurring problems I have experienced with robots in music performance, which have led to the development of signal processing and machine learning techniques informed by music information retrieval (MIR) ideas. Musicians, through training, acquire a body of musical concepts commonly known as musicianship. Machine musicianship refers to the technology of implementing musical processes such as segmentation, pattern processing, and interactive improvisation in computer programs. The majority of existing work in this area has focused on symbolic digital

representations of music, typically MIDI. The growing body of MIR research, especially audio-based MIR, can provide the audio signal processing and machine learning techniques needed to develop machine musicianship involving audio signals. In my work, the integration of machine musicianship and music robotics has been used to develop a robotic percussionist that can improvise with a human performer. These innovations are described in detail in Chapter 5. Work by Overholt [39] and Kapur [136] has greatly influenced the development of the interface solutions presented here, building on previous work in gesture sensing and machine learning for performance. Shimon, a robotic marimba player [146], is an example of relevant previous work in machine musicianship, where a computer system analyzes and performs musical events autonomously alongside human performers. Closely related to my work, Shimon's human-robot Jazz improvisation system [58] uses a gesture-based framework, recognizing that musicianship involves not just the production of notes but also the intentional and consequential communication between musicians [59].

2.6 Summary

This is a broad thesis describing a multi-modal approach to EA pitched percussion augmentation. It covers many different topics, and a first-time reader might find it hard to navigate. The document is intended to provide a nexus for the pitched percussionist and composer seeking to delve into idiomatic hyperinstrument design and rapid prototyping. The work is meant to provide a foundation in the history of the context and offer insight into possible future directions. The framework presented proves to be an instrument of great flexibility due to the spectrum of solutions explored. The range of topics and techniques used to address the research goals might superficially seem unrelated; however, when leveraged together, the extensive set of tools provides vast possibilities for real-time control of multimedia and audiovisual processes, enabling novel interaction for the performer.

Part II

IDIOMATIC INTERFACES, MACHINE LEARNING AND ACTUATION

Chapter 3

Musical-HCI

3.1 Gesture Sensing

This work merges my experience as a percussionist, electronic musician, and studio producer into a singular, real-time performance practice, combining what would previously have been possible to realize sonically only in a static pre/post-production context. Using sensing techniques that allow uninterrupted interfacing via traditional playing techniques, the primary contribution is less a novel piece of engineering than an explicit approach to developing minimally invasive hyperinstrument augmentations for idiomatic machine-learning interaction within the pitched percussion family. Aside from serving the author's artistic motivations, the work is intended to provide an easily reproducible, low-cost context that other artists can build on to develop their own previously inconceivable hyperinstruments with relative ease.

3.1.1 Xylophone Based

In order to take into account the specifics of musical interaction, one needs to consider the various existing contexts, sometimes called metaphors, for musical control [147] in which gestural control can be applied to computer music. To devise strategies for the design of new hyperinstruments for gestural control of sound production and modulation, it is essential to analyze the characteristics of the actions produced by expert instrumentalists during performance [35]. The importance of the study of gestures in new hyperinstrument design is justified by the need to better understand the physical actions and reactions that take place during expert performance.

The xylophone is marginalized in contemporary music for several reasons: its antiquated, bulky frame makes it hard to move, and its sound is generally difficult to amplify without expensive specialized equipment. However, it couples well with technology because of the constant, visceral movement required to produce sound. This is advantageous for two reasons: the sound produced is well suited for conversion into control data, and the motion involved is ideal for computer-vision recognition. Thus, along with furthering specific 3D control practices, I seek to advance the acoustic xylophone tradition by incorporating advanced HCI capabilities via this method. Below, a set of techniques for extending the digital control possibilities of pitched percussion instruments is described.

Computer Vision

A) Magic Eyes: An initial design goal for a new interface using the Kinect was a set of multi-axis, mappable 3D gesture controllers. This was done in collaboration with graduate research assistant and electrical engineer Gabrielle Odowichuck. My role was as design engineer, contributing to high-level theoretical design concerns, sensor implementation, and mapping strategies. The first challenge of the new musical interface was communicating with the Kinect controller. The OpenNI API was chosen because it offered access to the Kinect's embedded user-tracking functionality. From there, a C++ application was developed that takes the 3D position data provided by the Kinect and sends it as a stream of Open Sound Control (OSC) messages. The Kinect and its associated software libraries are used to perform human skeleton tracking. This tracking produces estimated positions of the performer's mallet tips, which can then be mapped to musical parameters. Once the position data is in OSC format, many other music-related programs can receive and manipulate it. B) Fantom Faders: Each Cartesian axis of motion was mapped to filter parameters that modify live and sampled audio from the vibraphone, creating control parameters accessed through the mallets in the space over the keys, oriented as a virtual MIDI mixer. This tracking is used to augment each bar of the vibraphone with the functionality of a fader. On a 37-bar vibraphone, this provides 37 virtual faders that may be used as mappable controllers with unlimited preset variations. This augmentation, illustrated in (Figure 3.1), provides an intuitive platform that

allows the performer to control a large number of sliders without interrupting performance. In this case, mallet tips are tracked based on their color. To detect the positions of the mallet tips, the color image from the video camera is converted to Hue, Saturation, and Value (HSV). A threshold is applied to each channel to filter out unwanted colors, the resulting masks are combined, and a contour-detection algorithm is executed. This process yields bounding rectangles that identify the mallet tips; the centroid of each bounding rectangle is taken as the position of the mallet in the virtual representation.
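The following sketch shows one way such color-based tip tracking can be written with OpenCV's C++ API. The original system was built on openFrameworks rather than exactly this code, and the HSV threshold values are placeholders that would need calibration to the actual mallet color.

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

int main() {
    cv::VideoCapture cam(0);                 // camera pointed at the vibraphone
    cv::Mat frame, hsv, mask;

    while (cam.read(frame)) {
        cv::cvtColor(frame, hsv, cv::COLOR_BGR2HSV);

        // Keep only pixels matching the mallet-tip color (placeholder range).
        cv::inRange(hsv, cv::Scalar(35, 80, 80), cv::Scalar(85, 255, 255), mask);

        // Find blobs and take the centroid of each bounding rectangle.
        std::vector<std::vector<cv::Point>> contours;
        cv::findContours(mask, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
        for (const auto& c : contours) {
            if (cv::contourArea(c) < 50.0) continue;     // ignore noise blobs
            cv::Rect box = cv::boundingRect(c);
            cv::Point tip(box.x + box.width / 2, box.y + box.height / 2);
            cv::circle(frame, tip, 4, cv::Scalar(0, 0, 255), -1);
            // 'tip' would be forwarded to the virtual-fader mapping stage.
        }

        cv::imshow("mallet tracking", frame);
        if (cv::waitKey(1) == 27) break;                 // Esc to quit
    }
}
```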

Figure 3.1: Virtual vibraphone faders

Determining the position of the mallet tips relative to the vibraphone bars requires a virtual representation of the instrument. This representation was created from explicit measurements of the vibraphone's bars and consists of the set of bars, each with a corresponding placement, size, pitch, and associated control value. The tracking algorithm supplies mallet positions relative to the camera, but positions in the virtual space are needed. Mapping the tracked mallets therefore involves several transformations and a calibration phase in which the position of the vibraphone with respect to the camera is recorded. Once the position of the mallets in the vibraphone's coordinate space is obtained, the system reports which bar is currently covered by the mallet and the fader value associated with that bar. A delay-based activation time was added so that the performer must pause for a few milliseconds over a bar before its slider latches and starts to change with mallet movement within the fader zone. The 640x480 resolution used in my prototype is sufficient for accurate detection of the mallet positions. The main restriction is that the camera should view the vibraphone from a distance at which each bar spans a reasonable number of pixels. While this may suggest using a higher resolution,

it is important to note that the algorithms must run in real time, so less data is desirable. Future work could involve fusing the video camera data with IR sensor data, and combining motion-tracking algorithms with the current contour detection. This fusion would improve the robustness of the tracking system and address its current sensitivity to changes in ambient lighting.
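As an illustration of the bar lookup and the delay-based latch described above (a sketch under assumed data structures, not the thesis code), the following maps a calibrated mallet position to a bar and only lets the fader move after a short dwell time. Which axis of motion drives the fader once latched is an assumption here.

```cpp
#include <cstdio>
#include <vector>

// One vibraphone bar in the calibrated (instrument-space) model.
struct Bar {
    float x, y, w, h;   // placement and size in instrument coordinates
    int   midiPitch;    // pitch of the bar
    float faderValue;   // current virtual-fader value, 0..1
};

struct FaderLatch {
    int   activeBar = -1;
    float dwellMs   = 0.0f;
    static constexpr float kLatchMs = 150.0f;   // assumed dwell threshold

    // Called once per video frame with the mallet tip in instrument space.
    void update(std::vector<Bar>& bars, float mx, float my, float frameMs) {
        int hit = -1;
        for (size_t i = 0; i < bars.size(); ++i) {
            const Bar& b = bars[i];
            if (mx >= b.x && mx <= b.x + b.w && my >= b.y && my <= b.y + b.h)
                hit = static_cast<int>(i);
        }
        if (hit != activeBar) { activeBar = hit; dwellMs = 0.0f; return; }
        if (hit < 0) return;

        dwellMs += frameMs;
        if (dwellMs >= kLatchMs) {
            // Once latched, position along the bar's length sets the fader.
            Bar& b = bars[hit];
            b.faderValue = (mx - b.x) / b.w;
            std::printf("bar %d (pitch %d) fader = %.2f\n",
                        hit, b.midiPitch, b.faderValue);
        }
    }
};
```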

Experiment

In many ways, the Radiodrum and Kinect are similar controllers: both return a set of 3D positions in real time. However, there are major differences between the two technologies. In this section, I present some early experiments that aim to demonstrate those differences. (Figure 3.2) shows the basic layout of the hardware. The Kinect connects to the computer via USB, and the Radiodrum via FireWire through an audio interface. A microphone is also connected to the audio interface and is used as a reference when comparing the reaction times of the sensors, similar to [78].

Figure 3.2: Hardware diagram

Custom software was developed to record and compare data received from both sensors. The program flow is shown in (Figure 3.3). A C++ program was written that takes the user tracking data made available by OpenNI and sends it as OSC

data. A Max/MSP program then receives data from the audio interface, converts it into position data, upsamples the Kinect data, and saves everything to a file. The transport of data between the two programs is notable, as it may introduce additional temporal effects. Various movements were captured in an attempt to demonstrate some of the effects observed when using these sensors in a musical context.
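The OSC bridge between the two programs can be as simple as the sketch below, shown here with the oscpack library; the thesis does not name the OSC library it used, and the /kinect/hand address pattern is an illustrative assumption. In the real application the 3D position would come from the OpenNI skeleton tracker once per frame; placeholder values stand in for it here.

```cpp
#include "osc/OscOutboundPacketStream.h"
#include "ip/UdpSocket.h"

int main() {
    const char* host = "127.0.0.1";
    const int   port = 9000;                  // port the Max/MSP patch listens on
    UdpTransmitSocket socket(IpEndpointName(host, port));

    // Placeholder joint position; in practice this is updated per Kinect frame.
    float x = 0.12f, y = 0.85f, z = 1.40f;

    char buffer[256];
    osc::OutboundPacketStream packet(buffer, sizeof(buffer));
    packet << osc::BeginMessage("/kinect/hand")   // hypothetical OSC address
           << x << y << z
           << osc::EndMessage;
    socket.Send(packet.Data(), packet.Size());
}
```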

Figure 3.3: Software diagram

This work is not an attempt to show that one device is superior to the other. Instead, I am more interested in comparing the sensors so that their data can be intelligently fused for better control over the instrument. I experimented with practical applications of the Kinect in a musical scenario. Using a vibraphone as the sole sound source, I arranged a system for improvisation that would take advantage of the new extended capabilities. The audio from the acoustic vibraphone was captured with an Akai 414 microphone and input into an RME Fireface 800 connected to a MacBook Pro hosting Ableton Live. The vibraphone audio channel went to the master bus with a slight amount of equalization and compression; it was also routed to auxiliary sends 1 and 2 (Fig. 3.4). On Aux Return 1 was a harmonic resonance filter plug-in. The X axis of the right hand controlled the tonic of the 7-voice harmonic scale pre-programmed into the filter: moving the mallet to the left lowered the tonic, and moving it to the right raised the pitch (see Fig. 3.5). The Y axis controlled the global frequency gain of the filter, allowing a performer to over-accentuate the high frequencies of the filtered audio by raising the mallet and to boost the low frequencies by lowering it.

Figure 3.4: Audio signal chain

Aux Return 2 hosted a bit-reduction plug-in used as a type of digital distortion, much the way a guitarist would use an analog fuzz pedal. In this case the Ghanaian Gyil was a direct influence: in a traditional context the Gyil's gourd resonators have holes drilled into them with spider egg casings stretched over them, producing an intentional distorting buzz of the acoustic sound. The X axis of the left hand controlled the ratio of the filter applied to the source signal: moving the mallet to the left reduced the effect, leaving a dry source signal, while moving it to the right increased the wetness of the effect. The Y axis controlled the rate of reduction: the bit rate decreases as the mallet is raised, resulting in more distortion, and lowering the mallet results in less distortion. These effects are calibrated so that the default playing position on the vibraphone results in no processing, reacting only to extended gestures. The specific mappings and filter parameters chosen were not arbitrary, but specific to the researcher's artistic practice. Being a sound designer, computer musician, and vibraphonist, the researcher chose intuitive mappings based on specific vibraphone techniques within a given original composition. The extended digital sound-design parameters were based on familiarity with both the music and the instrument's natural characteristics.

Figure 3.5: Music control design

Latency

Humans can control transient events with relatively high speed. This is demonstrated in the percussive technique known as the flam, which trained musicians can play with 1 ms temporal precision [41]. It is also important to look at the accuracy of events: delays experienced by the performer change the perceived responsiveness of the musical instrument, a major consideration for musicians. A basic difference between these two sensors is the vast difference in the frame rate of the captured data. The Radiodrum sends continuous signals to an audio interface, and the sampling rate of the data is determined by the interface; for this experiment I used 48000 Hz, though higher rates are possible. The Kinect, by contrast, outputs position data at approximately 30 Hz, the starkest difference in the capabilities of the sensors. I begin by demonstrating this latency by holding the tip of the Radiodrum stick and hitting the surface of a frame drum that has been placed on the surface of the Radiodrum. This gives the output of three sensors to compare. The microphone has very little delay, so the auditory event of the frame drum being hit occurs before these events are seen by the gesture-capturing devices; for my purposes, the audio response is considered the benchmark. The Radiodrum captures the position of the tips of each drumstick during this motion, and the Kinect captures the position of the user's hands.

Figure 3.6: Latency: Radiodrum vs. Kinect

I applied simple piece-wise constant upsampling to the Kinect data, so that its low frame rate is evident. As seen in (Figure 3.6), the Kinect displays a significant amount of latency. A small amount of this could be attributed to the data transport, but the slow frame rate makes it nearly impossible to detect sudden events like the strike of a mallet.
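The piece-wise constant (zero-order hold) upsampling used for this comparison can be expressed as below; the 30 Hz and 48 kHz rates follow the text, while the function name is purely illustrative.

```cpp
#include <cstddef>
#include <vector>

// Zero-order-hold upsampling: repeat each low-rate Kinect sample until the
// next one arrives, so the ~30 Hz stream can be plotted against 48 kHz data.
std::vector<float> upsampleHold(const std::vector<float>& lowRate,
                                double lowRateHz  = 30.0,
                                double highRateHz = 48000.0) {
    const double samplesPerStep = highRateHz / lowRateHz;   // ~1600
    std::vector<float> out;
    out.reserve(static_cast<size_t>(lowRate.size() * samplesPerStep) + 1);
    for (size_t i = 0; i < lowRate.size(); ++i) {
        const size_t end = static_cast<size_t>((i + 1) * samplesPerStep);
        while (out.size() < end) out.push_back(lowRate[i]);   // hold the value
    }
    return out;
}
```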

Figure 3.7: Captured motion of four drum strikes

Although small temporal changes will not be detected by the Kinect, capturing slower movements is still possible. The following plot (Fig. 3.7) shows four discrete hits of the drum. Although the Kinect would not be able to use this information to

produce a responsive sound immediately, it can still perform beat detection to determine the tempo at which a performer is playing. It is, however, already a sufficient alternative to the traditional fader-style mapping parameter, as shown in the example above.

Range of Sensing

The choice of mapping of gestures onto or into audio data has also been a source of significant attention: how much apparent change should a movement produce?

Figure 3.8: Radiodrum viewable area

First, I examine the range of motion for both sensors. The Radiodrum only returns consistent position data while the drum sticks are in an area close above the surface of the sensor (Fig. 3.8), and it tends to bend values towards the center as the sticks move farther above the surface. Perspective viewing gives the Kinect a much larger viewable area. The Kinect's depth sensor has a field of view of 57 degrees horizontally and 43 degrees vertically. This means that at close range the Kinect cannot detect objects far to the sides of the camera, whereas at greater depth, objects far from the center of view can be detected (Fig. 3.9).

Figure 3.9: Kinect viewable area

To demonstrate the constraint on possible movements recorded by the Radiodrum, I recorded the captured output of a user moving their hand back and forth while holding the Radiodrum stick. The Kinect is able to capture a much larger range of gestures (Fig. 3.10).

Figure 3.10: Horizontal range of Radiodrum vs. Kinect

One of my goals is to go beyond simple static one-to-one mappings and incorporate machine learning for gesture recognition. While one-to-one mappings are by far the most commonly used, other mapping strategies can be implemented. It has been shown that for the same gestural controller and synthesis algorithm, the choice of mapping strategy becomes the determining factor in the expressivity of the instrument [22]. I have been actively developing a systematized mapping strategy for many years and will use it as an apparatus to inform and evaluate my work with the Kinect. Another scenario that is in development but has yet to be implemented involves extending the predicted movements from the Kinect to include accurate tracking of the tips of a mallet. Verifying the accuracy and precision of mallet-tip position data acquired from both the Radiodrum and the Kinect would also be vital for effectively modeling an interactive virtual marimba. As previously established, the Radiodrum is accurate near the surface of the instrument, but position data veers towards the center of the rectangular surface as the tip is moved away. Preliminary measurements ranging from half a meter to a full meter away from the plane containing the Kinect's cameras showed that depth values increase linearly (R2 = 1) along a straight line perpendicular to that plane. However, the slope of the line was not 1; rather, it varied from 1.01 to 1.05. In cases where only the relative change in position matters, such a value would not strongly affect results. When the absolute position is desired, as in our application, it means the deviation from the actual value can be as much as 5

cm at a depth of 1 meter. Another potential area of exploration involves comparing the three-dimensional coordinate measurements from both the Radiodrum and the Kinect with a ground truth and attempting to compensate for the disparity.

Figure 3.11: Kinect depth variance

This shows that there is significant latency and temporal jitter in the Kinect data relative to the signals from the Radiodrum, making direct fusion of the data difficult except for slow movements (Fig. 3.11). One potential way to resolve this issue is to extend the body model to include more of the physics of motion. The current body model is based largely on the geometry of segments (a kinematic description), whereas a full biomechanical model would include inertial (kinetic) parameters of the limb segments as well as local limb acceleration constraints. Once the biomechanical model is initialized with some slow-moving data, it can be used to predict (feed-forward) short-term future motion, and the delayed motion data from the Kinect can then be used to make internal corrections (feedback). Also, because the biomechanical body model internally maintains estimates of limb-segment accelerations, it would be relatively easy to incorporate data from 3D accelerometers placed on important limb segments (such as the wrist) to enhance motion tracking. The Kinect sees the world from a single viewpoint, so although it produces depth information, this is only for the objects closest to it (in the foreground); all objects in the background along the same Z axis are occluded. This results in occlusion shadows, where the Kinect sets these areas to zero depth as a marker. These regions appear as shadows due to the area sampling of the coded-light algorithm used to estimate scene depth. The use of a pair of Kinects at +/- 45 degrees to the performance capture area has the potential to make the motion capture much more robust to occlusions and should improve spatial accuracy. Practically, the IR coded-light projectors on each Kinect would need to be alternated on and off so that they do not interfere. If the timing is synchronized and interleaved, the effective frame rate could potentially be doubled

for non-occluded areas. General free motion (motion with minimal physical contact) is usually imprecise in absolute terms, as it relies on our proprioception (our internal sense of the relative position of neighboring parts of the body) combined with vision to provide feedback on the motion. The exception is highly trained free motion such as gymnastics or acrobatics. Physical contact with a target reduces the spatial degrees of freedom and provides hard boundaries or contact feedback points. In the absence of this hard feedback, gestures performed in free space are difficult to automatically recognize and react to unless they are highly structured and trained, a requirement that could significantly distract from the expressive nature of a performance. Because the proposed future system will include a biomechanical body model of the motion, it should be possible to predict potential inertial trajectories (gestures) in real time as each trajectory evolves. This would allow the system to continuously predict the likely candidate match to the gesture and output an appropriate response. The trajectory mapping will need to be robust and relatively invariant to absolute positions and rotations. My Kinect instrument is compatible with virtually all MIDI hardware/software platforms, so any user can develop their own custom hardware/software interface with relative ease. Externals can be developed in Max/MSP for use in Max4Live, enabling anyone to plug in a Kinect and adapt this platform for 3D sound control in their own practice with minimal configuration. This has many implications, putting technology that was previously restricted to research purposes into the common musician's hands. My belief is that these new instruments have a legitimate place [136], with the potential to become part of an embedded conventional musical practice, not just a research curiosity. While hyperinstruments might seem niche or esoteric at this point [61], historic innovations such as the leap from monophonic to polyphonic music, the electrical amplification of the guitar, and computers in the recording studio all brought skepticism before eventually becoming mainstay practices.
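The feed-forward/feedback idea described above can be illustrated with a one-dimensional alpha-beta filter: the filter extrapolates position between the Kinect's slow updates and corrects its estimate when a (delayed) measurement arrives. This is a simplified stand-in for the proposed biomechanical model, and the gain values are assumptions.

```cpp
#include <cstdio>

// One-dimensional alpha-beta tracker: predict position between slow Kinect
// updates (feed-forward), then correct when a delayed measurement arrives
// (feedback). A stand-in for the fuller biomechanical model discussed above.
struct AlphaBeta {
    double pos = 0.0, vel = 0.0;
    double alpha = 0.85, beta = 0.005;   // assumed gains

    // Advance the internal state by dt seconds (called at the fast rate).
    double predict(double dt) {
        pos += vel * dt;
        return pos;
    }

    // Fold in a new measurement taken dt seconds after the previous one.
    void correct(double measurement, double dt) {
        const double residual = measurement - pos;
        pos += alpha * residual;
        vel += beta * residual / dt;
    }
};

int main() {
    AlphaBeta tracker;
    const double kinectDt = 1.0 / 30.0;               // ~30 Hz measurements
    const double fakeMeasurements[] = {0.0, 0.02, 0.05, 0.09, 0.14};
    for (double z : fakeMeasurements) {
        // Predict forward at a faster control rate between measurements.
        for (int i = 0; i < 10; ++i) tracker.predict(kinectDt / 10.0);
        tracker.correct(z, kinectDt);
        std::printf("est pos %.3f  est vel %.3f\n", tracker.pos, tracker.vel);
    }
}
```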

Non-invasive Sensing

All systems described in this study were implemented to be used within a computer music environment such as Max/MSP1 and Ableton Live2, which the author considers the most ubiquitous, comprehensive, and intuitive audio computing platforms for performance to date.

1https://cycling74.com/
2https://www.ableton.com/

At the time of this work, the computer vision system used the openCV5 and openFrameworks6 libraries. It is compiled as a standalone program that emits Open Sound Control (OSC) messages. The gesture-repetition algorithm described in Chapter 4 was implemented as a patch for Max/MSP. This allows the user to design the exact means by which each mode will be triggered: the user may require a specific button to trigger each mode, a single button that cycles through all modes, or some other interface that makes more sense in a specific context. The patch also offers, as feedback, the current value of the maximum autocorrelation coefficient and the index of that value; when both remain more or less constant for some time, the system has acquired a certain gesture. Finally, the drum-pattern detection algorithm, also described in Chapter 4, was implemented using the Max4Live framework. This work presents different ways of augmenting pitched percussion instruments such as the marimba, vibraphone, and thumb piano with digital control capabilities. Using these approaches the instruments can be turned into hyperinstruments without requiring invasive additions. In retrospect, regarding the usability of the interfaces in performance, the virtual faders are responsive and accurate to the point that using the mallet tips on the surface of a bar proves a sufficient and intuitive fader-style interface. The technique required is inspired by the acoustic pitch-bend extended technique used by expert vibraphonists. The system could be improved by adding vertical detection of the mallet tip to engage and disengage the fader, rather than relying only on a time-based approach; this would solve the problem of accidental activation when playing a single bar long enough to engage its respective fader. The two methods of gesture control based on repetition and rhythmic patterns offer effective ways to transmit control data while holding the mallets. Issues with calibration and mappings could be addressed with a more comprehensive GUI. While the system can be improved, it has proven effective and reliable in performance within its known limitations on stage. In future work, further improvements to the usability and effectiveness of the proposed algorithms will be implemented.
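A sketch of the kind of autocorrelation check described above follows; the exact windowing and thresholds used in the thesis patch are not specified, so the values here are illustrative. The lag of the maximum normalized autocorrelation coefficient gives the repetition period, and a gesture is considered acquired when both the coefficient and its lag stay roughly constant across successive analysis windows.

```cpp
#include <cmath>
#include <cstdio>
#include <utility>
#include <vector>

// Return {bestLag, bestCoefficient} of the normalized autocorrelation of a
// window of control data (e.g. one axis of mallet position over time).
std::pair<int, double> maxAutocorr(const std::vector<double>& w, int minLag) {
    double energy = 0.0;
    for (double v : w) energy += v * v;
    int bestLag = minLag;
    double best = 0.0;
    for (int lag = minLag; lag < static_cast<int>(w.size()) / 2; ++lag) {
        double acc = 0.0;
        for (size_t i = lag; i < w.size(); ++i) acc += w[i] * w[i - lag];
        const double r = (energy > 0.0) ? acc / energy : 0.0;
        if (r > best) { best = r; bestLag = lag; }
    }
    return {bestLag, best};
}

int main() {
    // Synthetic repeated gesture: sine-like motion with a period of 20 samples.
    const double pi = std::acos(-1.0);
    std::vector<double> window(120);
    for (size_t n = 0; n < window.size(); ++n)
        window[n] = std::sin(2.0 * pi * static_cast<double>(n) / 20.0);

    auto [lag, coeff] = maxAutocorr(window, 5);
    // A real patch would track lag/coeff over time and report "gesture
    // acquired" once both stay nearly constant and coeff exceeds a threshold.
    std::printf("repetition period %d samples, coefficient %.2f\n", lag, coeff);
}
```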

3.1.2 Lamellophone Based

A) WIIKEMBE:

Typically, hyperinstruments require specialized engineering and the development of proprietary prototypes. Because the field of hyperinstruments is relatively young,

standardized systems need to be designed from existing paradigms (OSP). Such paradigms are typically singular in nature, solely reflecting the idiosyncrasies of the artist, and are often difficult to reproduce. This work simplifies the process, offering a basic design prototyping platform. The Wiikembe's design (Fig. 3.12) facilitates uninterrupted traditional playing: the sensors can be engaged with broad or minute arm/hand gestures or explicitly with the free fingers. The unused surface area of the lamellophone body is ideal for intuitive sensor placement, while direct audio acquisition is simple and robust via a piezo transducer. These augmentations equip the instrument for contemporary settings where audio processing and sound reinforcement are commonplace. Percussionists and EA musicians typically use their feet in various ways, so a simple, custom-built Arduino/Pure Data footswitch is also included; its construction and implementation are trivial.

Figure 3.12: Wiikembe and custom footpedal

Like Konono No. 1, the Wiikembe uses easily sourced, low-cost components. The design is meant to be easily replicable and modular, so that any lamellophone performer could render their own version with little engineering background. The Wiikembe is an augmented Likembe fitted with a Wiimote, 2 rotary potentiometers, 1 slider

potentiometer, 1 membrane potentiometer [47], and a piezo transducer3. The sensors feed the analog GPIO of an Arduino4 Nano, are bussed to Pd5 via serial USB, converted into MIDI, and then mapped into Ableton Live for control of filter parameters, volume levels, and audio event sequencing. The Likembe audio is acquired via the piezo transducer and input into the computer as a discrete mono signal. A flowchart of the configuration can be seen in (Fig. 3.13).
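The thesis routes the sensors through Pduino, which normally pairs with the standard Firmata firmware, so no custom Arduino code is strictly required. The Arduino (C++) sketch below is an equivalent hand-rolled alternative that reads the three potentiometers and the membrane sensor and streams their values over serial for Pd to parse; the pin assignments and serial format are assumptions.

```cpp
// Minimal Arduino Nano sketch: read four analog sensors (two rotary pots, a
// slider, and a membrane potentiometer -- pin assignments are assumptions)
// and stream "index value" pairs over serial. In the actual Wiikembe setup
// Pduino/Firmata fills this role.
const int sensorPins[4] = {A0, A1, A2, A3};
int lastValue[4] = {-1, -1, -1, -1};

void setup() {
  Serial.begin(115200);
}

void loop() {
  for (int i = 0; i < 4; ++i) {
    // Scale 10-bit ADC readings to 0-127 so they map directly onto MIDI CC.
    int value = analogRead(sensorPins[i]) >> 3;
    if (value != lastValue[i]) {          // only send on change
      Serial.print(i);
      Serial.print(' ');
      Serial.println(value);
      lastValue[i] = value;
    }
  }
  delay(5);                               // ~200 scans per second
}
```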

Figure 3.13: Wiikembe system overview

Wiikembe Components

1. Likembe

The Likembe used for this study is hand-carved from wood, with an effigy at the top for ornamentation. The tines are mounted to the body in the conventional manner with a bridge. Additionally, small metal rings have been added

3https://www.contactmicrophones.com/
4https://www.arduino.cc/
5https://puredata.info/

to the tines to create a desired sympathetic, sustaining buzz when the tines are plucked (Fig. 3.14), a sound-design practice typical of many traditional African instruments and common in West Africa [14], [33]. The body has been carved out on the inside, creating a resonant chamber. The instrument is tuned to a single-octave diatonic scale. Since the instrument is non-Western, it does not conform to the tempered chromatic scale: the suggestion of the octave and of the corresponding tones is only relative to each preceding note when ascending the scale. That is, the instrument is tuned by ear and not by mathematically uniform, equally spaced intervals based on Hz [96]. The exact frequencies of each tine on this specific Likembe can be reviewed in [133].

Figure 3.14: Likembe tines with rings

The instrument is tuned by adjusting the length of each tine (longer = lower frequency, shorter = higher frequency; see Fig. 3.15). (Table 3.1) shows the approximate scale of the instrument.
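The "nearest pitch" values in Table 3.1 can be reproduced by rounding each measured frequency to the closest equal-tempered note relative to A = 440 Hz; the short sketch below (an illustration, not part of the thesis tooling) shows the calculation.

```cpp
#include <cmath>
#include <cstdio>

// Map a measured frequency to the nearest equal-tempered pitch (A4 = 440 Hz).
void nearestPitch(double hz) {
    static const char* names[12] = {"C", "C#", "D", "D#", "E", "F",
                                    "F#", "G", "G#", "A", "A#", "B"};
    const int midi =
        static_cast<int>(std::lround(69.0 + 12.0 * std::log2(hz / 440.0)));
    const double refHz = 440.0 * std::pow(2.0, (midi - 69) / 12.0);
    std::printf("%6.1f Hz -> %s%d (%.2f Hz)\n",
                hz, names[midi % 12], midi / 12 - 1, refHz);
}

int main() {
    // Measured tine frequencies from Table 3.1.
    const double tines[8] = {213, 247, 272, 308, 352, 383, 540, 624};
    for (double f : tines) nearestPitch(f);
}
```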

2. Wiimote

The Wiimote offers versatility as a music controller [149]. Combining wireless 3D position tracking [74] with switches, the Wiimote extends the Likembe, affording the performer a broader set of musical gestures than would be possible with potentiometers alone. The Wiimote's design sits transparently with the ergonomics of the Likembe.

Figure 3.15: tine layout/approx. A=440

Tine   True Frequency (Hz)   Nearest Pitch (note = Hz)
 1     213                   G#3 = 207.65
 2     247                   B3  = 246.94
 3     272                   C#4 = 277.18
 4     308                   D#4 = 311.13
 5     352                   F4  = 349.23
 6     383                   G4  = 392.00
 7     540                   C#5 = 554.37
 8     624                   D#5 = 622.25

Table 3.1: Likembe tuning

The top switches are easily accessible via the thumb, while the switch on the underside is accessible via the forefinger, both with minimal adjustment to the conventional playing position. The position tracking allows the performer to interface with DSP parameters without interrupting playing in the way that turning a knob would. This affords a level of interaction with software effects and other processes that would otherwise require interrupting play on the traditional instrument in order to engage the filter parameters.

3. Arduino and Sensors

An Arduino Nano is mounted to the back of the instrument (Fig. 3.17). Four dedicated sensors send CC data into Pd via Pduino [121]. Pd is used because it is free, widely supported, and relatively easy to prototype in. This Likembe has a groove carved out of its back which allows the potentiometers (pots) to sit comfortably recessed in the instrument's body. The pots are mounted on prototyping board alongside a low-profile slider. Situated on the instrument's rear, the sensors are out of the way yet easily accessible from a natural playing position. They fit comfortably within the available space as dedicated controllers for the filter parameters meant for this type of interaction [51].

Figure 3.16: 1. A: reverb, B: delay; 2. FX vol.; 3. Wiikembe vol.

Lastly, a membrane potentiometer is mounted on the right side of the instrument's body. Nearly as thin as paper, it is minimally invasive to the Likembe's traditional playing techniques, yet remains easily accessible via the performer's index finger [51]. The sensor data from the Arduino is converted into MIDI in Pd and routed into Ableton Live.

Figure 3.17: Arduino controller

4. Software

Osculator6 was used to receive the Wiimote data and convert it into OSC or MIDI messages to be mapped to the user's software or hardware of choice. Osculator is flexible, inexpensive, robust, and intuitive (though no longer supported). It streamlines the

6https://osculator.net/

sensor data acquisition, scaling, and mapping stages of working with the Wiimote, and it is easy to use and reliable. In this implementation all data is sent as MIDI into Ableton Live and mapped to filter parameters and device controls (Fig. 3.12). Live is used as a sequencing, looping, effects processing, and audio mixing environment. Audio from the instrument is acquired from the contact mic via the audio interface. The signal has a dedicated channel in Live and passes through a frequency-shifting delay plug-in and a looper (Fig. 3.18): in this case, the free Valhalla DSP Freq Echo7 and the Ableton Looper device, respectively. The signal is output mono via Output Channel 2 on the audio interface to its own speaker. The Wiimote's gesture tracking is mapped as shown in (Table 3.2), and the Wiimote's switches are mapped as shown in (Table 3.3).

Figure 3.18: Wiimote axes/filter parameters

Axis    Mapping
Pitch+  frequency shift up
Pitch-  frequency shift down
Yaw+    delay rate
Yaw-    feedback rate
Roll+   filter wet
Roll-   filter dry

Table 3.2: 3D gesture mappings

Mappings are such that the Wiikembe's volume is controlled by the membrane

7https://valhalladsp.com/shop/delay/valhalla-freq-echo/

potentiometer. Two rotary knobs control auxiliary sends A and B: send A feeds a reverb and send B a Simple Delay (both native Live devices). Both aux channels are bussed to the master channel, whose volume is controlled by the slider potentiometer (Figure 3.16). The effects chain outputs a mono signal from Output Channel 1 on the interface to its own speaker. These explicit mixing controls are essential to computer music performance of any style, offering spatiality to an otherwise 2D acoustic experience.

Figure 3.19: Wiimote switch labels

The Arduino interface is designed to afford the performer the ability to mix multiple audio signals from the instrument itself with minimal adjustment of technique or interruption of performance. The specific mappings are mostly arbitrary, reflecting the artist's performance idiosyncrasies and personal mapping preferences; any artist can map the controllers to whatever parameters they like. It is worth noting, however, that the Looper device in Live is configured so that a single button press activates loop recording, loop completion, and then playback8, while a double press starts and stops playback, respectively. The button chosen for this function sits nearly at the location of the thumb in the normal playing position and makes for a very intuitive interface for this application compared with a conventional looping-pedal paradigm or an external controller (Fig. 3.19). The 3D tracking allows the player to interface with parameters typically reserved for traditional knob turning, which forces one to disengage from instrumental practice in order to manually operate a separate hardware device. Ben Neill has shown an example of mounting knobs on a trumpet, embedding the knob interface on the

8https://www.ableton.com/en/manual/welcome-to-live/

instrument itself, making for a more intuitive interface [91]. The Wiikembe goes beyond this by turning the whole instrument into several different knobs based on the axis of motion, simply by mounting a 3D position-tracking sensor to the instrument's body.

Switch      Key Code / MIDI        Mapping
1 - up      key 5                  loop double speed
1 - down    key 2                  loop half speed
1 - left    key 3                  loop undo
1 - right   key 4                  clear loop
2 - A       key 6                  view Looper device
3 - home    MIDI note F2, ch. 7    tap tempo
4 - minus   key 1                  loop reverse
5 - plus    MIDI note F2, ch. 12   freq. filter on/off
6 - 1       MIDI note F2, ch. 5    track mute
7 - 2       MIDI note F1, ch. 5    track arm
8 - B       spacebar               global transport

Table 3.3: Wiimote switch mappings

B) EL-LAMELLOPHONE (El-La):

Components

The design is adaptable, easily replicable [69], and open-source, so that any lamellophone performer could render their own version with little engineering background. El-La (short for Electric Lamellophone) builds on the same system as the Wiikembe and expands those contributions (Figure 3.20). The sensor interface is custom-built, incorporating a capacitive touch sensor controller, a three-axis gyroscope, a force-sensitive resistor (FSR), 6 switches, 2 rotary potentiometers, a membrane slider potentiometer, an X/Y-axis joystick, and LEDs for visual feedback. To facilitate easy sensor acquisition, all sensors interface with an Arduino Nano. The capacitive touch sensor controller communicates with the Arduino over an I2C interface, while the IMU outputs all of its sensors, processed by its onboard ATmega328, via a serial stream (UART). All other sensors (buttons, pots, joystick, and FSR) connect to the Arduino Nano's GPIO. The Nano plugs directly into, and is powered by, the Beaglebone and communicates via USB. The whole system is reconfigurable, easily removable, and portable (Figure 3.21).

Figure 3.20: El-lamellophone

Figure 3.21: Modularity of components

1. Interface

Piezo

A contact microphone, mounted on the body of the El-Lamellophone, is connected straight to the 1/8” audio jack of the Beaglebone audiocape9 to acquire the audio signal. The signal is then processed in Pure Data.

Arduino

The Arduino Nano is a small, complete, and breadboard-friendly board based on the ATmega328 (Arduino Nano 3.0); it connects via a Mini-B USB cable (Figure 5.21).

9 https://www.element14.com/community/docs/DOC-67906/l/beaglebone-audio-cape

Sensors

The HCI components of the El-Lamellophone are strategically located so that they don’t interfere with the performer’s playability of the instrument [114].

Figure 3.22: Mounted system

Switches, Rotary and Membrane Potentiometers: Six pushbuttons and two 47 kΩ rotary potentiometers (the value was chosen arbitrarily) are mounted on the upper back panel, accessed by the left and right index fingers respectively. One Force Sensing Resistor (FSR) is located on the lower right front panel, below the tines, easily reachable by the right middle finger. The value of 2 kΩ was chosen for the resistor in series with the FSR by connecting the FSR circuit to a gain CV input while viewing the output voltage on an oscilloscope; this value gave the best compromise between musical expressiveness and a full-scale reading on the oscilloscope. Also mounted on the back panel is one XY resistive analog joystick, accessible by the left middle finger. On the side panel one SoftPot resistive membrane is mounted. The pushbuttons connect to the digital pins of the Arduino board, while the rest of the components connect to the analog pins (Figure 3.23).

Figure 3.23: Sensor interface (l); Beaglebone (r)

Capacitive Touch: For this implementation, 20 AWG wires soldered to the eight tines of the El-Lamellophone are used as electrodes. Each wire is connected to the MPR121 Capacitive Touch Sensor Controller10 from Freescale. This package can control twelve electrodes and interfaces via the I2C protocol. It also includes a three-level filtering process to eliminate noise-induced detections. Its maximum response time is 32 µs, and its output byte contains the touch/release status for all tines. An advantage of this package is that it allows simultaneous touch detection across the eight tines, which in this case is used for polyphony. 9DOF: For the gyroscope implementation, an InvenSense ITG-3200 three-axis (X (roll), Y (pitch) and Z (yaw)) gyroscope with an I2C interface is used. This small package (4 × 4 × 0.9 mm) digitally outputs rotational data from any of the three sensed axes every 20 ms. In the current implementation the gyroscope is built into a 9DOF Razor Inertial Measurement Unit (IMU)11, accessed via the I2C protocol. The IMU's firmware scales the 16-bit output range of the gyroscope into a [−180.00, 180.00] range for each axis and then transmits the rotational data to the host Arduino board via the serial port's Tx and Rx lines at a baud rate of 57600 bps. Figure 3.24 shows the direction of rotation of each of the sensing axes in reference to the El-Lamellophone. A cheaper platform could be explored for future prototyping; this model was readily available at no cost, so it was used. Lamellophone-based gesture sensing using an IMU builds on the Wiikembe's contributions [133].

Visual Feedback

A 10-segment LED bar graph is used to provide state and sensor feedback to the El-Lamellophone user. To economize the wiring footprint and the number of Arduino pins needed, two 8-bit shift registers are used to control the LED bar graph.

10 https://www.sparkfun.com/datasheets/Components/MPR121.pdf
11 https://www.sparkfun.com/products/retired/10736

Figure 3.24: 9DOF axes of rotation

Firmware

The Arduino firmware is simple: on each loop iteration, an output vector of floating-point numbers is populated with the measurement from each sensor and then sent to Pure Data over a serial connection. First, all analog sensors are read and stored in the first section of the vector. The capacitive touch sensor's states are stored in the next portion of the output vector. Finally, the IMU's serial stream is parsed and stored at the end of the output vector. The output vector is then fully populated and ready to be transmitted to Pure Data. From the Arduino's perspective, transmission of the output vector to Pure Data is accomplished by writing the entire vector to the serial port one value at a time, with a newline character signifying the end of the vector.
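As a rough illustration only (not the actual El-La firmware), an Arduino-style C++ sketch of this read-and-transmit loop might look as follows; the pin counts, baud rate, and the two helper stubs are placeholders for the real MPR121 and IMU handling.

// Illustrative sketch of the sensor-acquisition loop described above.
// Pin counts, baud rate, and the helper stubs are placeholders, not the
// actual El-La firmware.
const int NUM_ANALOG = 5;   // pots, FSR, joystick axes (assumed count)
const int NUM_TINES  = 8;   // capacitive touch electrodes
const int NUM_GYRO   = 3;   // roll, pitch, yaw from the IMU
float outputVector[NUM_ANALOG + NUM_TINES + NUM_GYRO];

// Placeholder: a real implementation would query the MPR121 over I2C.
float readTouchState(int tine) { return 0.0f; }

// Placeholder: a real implementation would parse the Razor IMU serial stream.
void readIMU(float *dest) { dest[0] = dest[1] = dest[2] = 0.0f; }

void setup() {
  Serial.begin(115200);     // link to the Beaglebone / Pure Data
}

void loop() {
  int idx = 0;
  // 1) analog sensors fill the first section of the vector
  for (int pin = 0; pin < NUM_ANALOG; pin++) {
    outputVector[idx++] = analogRead(pin);     // 0..1023
  }
  // 2) capacitive touch states fill the next section
  for (int t = 0; t < NUM_TINES; t++) {
    outputVector[idx++] = readTouchState(t);   // 0.0 or 1.0
  }
  // 3) IMU rotational data fills the end of the vector
  readIMU(&outputVector[idx]);
  idx += NUM_GYRO;
  // transmit one value at a time; a newline terminates the vector
  for (int i = 0; i < idx; i++) {
    Serial.print(outputVector[i]);
    Serial.print(i < idx - 1 ? ' ' : '\n');
  }
}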

2. Host Computer and Software

Beaglebone

Significant previous work has been done by the author using this autonomous music-computing platform featuring the Beaglebone. Technical specifications (including DSP latency performance) can be found in [79] and are described in detail in Chapter 6.

Puredata

The DSP environment is built in Pure Data (Pd). Pd was chosen because it is free, widely supported, flexible, and robust. Serial data is received from the Arduino, then scaled and mapped to control specific effect parameters (delay length, feedback level, filter frequency, etc.). This simple prototyping system is robust yet offers a great deal of flexibility and potential. The sensor data can also provide an intuitive control interface for synthesis applications, including controlling oscillator frequency, filter frequency/bandwidth, MIDI control/note attributes, and envelope values (attack, decay, sustain, release) (Figure 5.27). This allows El-La to be used as a versatile synthesizer controller while the lamellophone can still be played as usual, all housed in one unit.

Figure 3.25: Puredata patch

Comport and Sensor Data Parsing: All of the sensor data is transmitted to Pure Data over a serial connection with the Arduino. The comport object provides a serial port interface within Pure Data, making all sensor measurements available. The incoming ASCII bytes are then converted to numerical values. Audio Analysis for Control Data: Audio analysis of the input signal provides an amplitude-envelope tracking system. The RMS energy is thresholded to a usable range, and energy levels are displayed as they occur. This provides a simple means of visualizing the rise and fall of the total signal energy, and further thresholding control can be added as needed. This data can be used to create dynamic, audio-responsive control data for other devices.
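As a simple illustration of the envelope-tracking idea (a sketch only; the actual implementation is a Pure Data patch), a block-wise RMS follower with a usable-range threshold could be written in C++ as below. The block size and floor/ceiling values are assumed, not taken from the patch.

#include <cmath>
#include <cstddef>

// Block-wise RMS of an audio buffer.
float blockRMS(const float *block, std::size_t n) {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) sum += block[i] * block[i];
    return static_cast<float>(std::sqrt(sum / static_cast<double>(n)));
}

// Map RMS into a 0..1 control value; the floor/ceiling levels are illustrative.
float rmsToControl(float rms, float floorLevel = 0.01f, float ceilLevel = 0.5f) {
    if (rms <= floorLevel) return 0.0f;   // gate out low-level noise
    if (rms >= ceilLevel)  return 1.0f;   // clamp to full scale
    return (rms - floorLevel) / (ceilLevel - floorLevel);
}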

3.2 Gyil Gourd Physical Modeling Synthesis

A physical model is proposed for the Gyil, an African pentatonic idiophone whose wooden bars have sonic characteristics similar to those of the western marimba. This project was done in collaboration with Mathematics post-doctoral research fellow Dr. Dan Godlivitch. Here my role was as design engineer, where I contributed towards high-level theoretical design concerns, sensor implementation and mapping strategies. The primary focus is modeling the gourds that are suspended beneath each bar, which play a role similar to the tuned tubes below the bars of western mallet instruments. The prominent effect of these resonators is the added buzz that results when a bar is struck. This type of intentional sympathetic distortion is inherent to African instrument design, as it helps unamplified instruments be heard above crowds of dancing people. The Gyil's distortion is created by drilling holes in the sides of each gourd and covering them with membranes, traditionally made from the silk of spider egg casings stretched across the opening. Analysis of the sonic characteristics of this distortion shows that the physical mechanisms that create it are highly nonlinear, and it is therefore desirable to model them computationally. In addition to the fully synthetic model, I consider a hybrid version in which the acoustic sound captured by contact microphones on the wooden bars is fed into a virtual model of the gourd resonators. This hybrid approach significantly simplifies the logistics of traveling with the instrument, as the gourds are bulky, fragile and hard to pack. I propose several variants of the model, and discuss the feedback received from expert musicians.

3.2.1 Model Description

This work introduces a physical model for the Gyil, an African mallet instrument with a unique system of resonator gourds mounted below the wooden bars. These gourds are unique in that they have holes drilled in them, which are covered over with membranes. These membranes react to sound pressure in a highly non-linear fashion and produce a buzzing sound when the bars are played with force. Physical modeling efforts have typically focused on stringed instruments, struck bars, and membranes, usually in the context of western classical music instruments. To the best of my knowledge, the Gyil, and the nonlinear processes which are the cause of its characteristic sound, have not been studied before in the context of physical modeling synthesis.

One objective behind creating the model is to provide performers and composers an audio effect that can be applied to the signal of any mallet percussion instrument. This enables the use of techniques associated with the Gyil without having to include the actual gourds and membranes. The gourds are cumbersome, fragile, difficult to construct, and can not be added to conventional pitched percussion instruments. Electro-acoustic mallet percussionists have little choice but to apply filters and audio effects that are designed for other instruments such as guitar pedals. In contrast, my model and its associated audio effects are idiomatic to pitched percussion instruments.

3.2.2 Gyil: Physical Measurements

Each individual Gyil is unique, as it is constructed using natural materials and is tuned by ear by its builder. We present physical measurements made using a specific Gyil with the goal of obtaining reasonable ranges for the parameters of the proposed model. The synthesis of the wooden bar sound can be performed using standard modal synthesis techniques [6] and therefore the focus of the measurements was to obtain information about the prepared gourds. The resonant characteristics of each gourd depend on its geometry, so we measured the width, the height, and the radius of the mouth of all the gourds for the specific instrument considered. The buzzy sound that is a characteristic of the Gyil is created by several membranes that are attached to the gourd so their number and radius were also measured. The measurements are shown in (Table 3.4).

Parameter             Typical Values
Gourd width           5 cm - 25 cm
Gourd height          5 cm - 40 cm
Mouth radius          1 cm - 5 cm
Membrane radius       0.5 cm - 1 cm
Number of membranes   2 - 5

Table 3.4: Dynamic gourd dimensions

Figure 3.26: Gourds/frame construction

The gourds are natural objects and therefore there is no consistency in their exact shape. As seen in (Figure 3.26), their shapes can be very different, making it practically impossible to model them in detail. The same holds for the spider silk egg casings used in the membranes. For that reason, the sonic characteristics of the gourd were evaluated by comparing two different recordings: one of a note of the Gyil without the gourds (that is, a sound similar to that of a xylophone) and the other of the same note of the same Gyil with the gourds reattached. The Discrete Fourier Transform of the first 300 ms after the attack was calculated and is shown in (Figure 3.27). The energy present below the fundamental is likely to be caused by intermodulation distortion.

Figure 3.27: DFT: resonating wooden bar results

As can be seen, attaching the gourds to the Gyil introduces non-harmonic

frequency components around 1.2 kHz and 1.8 kHz that indicate non-linear behavior. These frequency components cause the characteristic buzzy sound of the Gyil. In the next section a technique for obtaining this type of distortion is described. The instruments used were built in 2002 for the author while on a six-month research visit to Ghana. They were crafted by the late master builder/performer Kakraba Lobi12. The two instruments were built as a pair and tuned to a western G major pentatonic scale, which is unorthodox, as the instruments are usually tuned by ear. One Gyil was kept intact, while the gourds were removed from the other for hybrid synthesis.

Model Description

A model of the Gyil gourds, which produce the characteristic buzzing sound of the instrument, is presented. The gourds are treated in isolation from the wooden bars. This is justified by the fact that the ratio of the mass of the bars to the sound pressure levels produced is large, and the gourds are not precisely tuned to resonate at the frequencies of the bars, so it can be assumed that there is no feedback of acoustic energy from the resonators to the bars. Consequently, the Gyil is treated as a source-filter system, and modeling efforts focus on the gourds, in particular aiming to simulate the non-linear response of the vibrating membranes as well as the filtering characteristics of the gourd. In constructing a mathematical description of the Gyil, two options present themselves: a model can be built that is a direct representation of the understanding of the physics of the system, or the processes at work can be abstracted and idealized, representing them by algorithms which perform a similar function to the various components of the instrument. The first option has the advantage of the highest degree of realism available; however, it can lead to highly complex numerical models which are unusable in real time. In particular, even a simplified mathematical description of the physics of the membranes would require a high degree of complexity due to the irregular shapes of the membranes, the unique material properties of the egg casings, and the coupling between the egg casings and the resonating surface of the gourd, which is itself highly irregular. The second option is to represent the processes using signal processors which replicate the effect of the process. In addition to the inherent nonlinearity of the system, each Gyil is built by hand, using the available materials

12 https://www.discogs.com/artist/1537170-Kakraba-Lobi

(e.g. wood, gourd shells, and spider silk) that the builder judges to be most suitable. In the course of construction, the instrument is tuned by ear by the craftsman. Under these circumstances, it makes little sense to attempt to create an exact model of a particular Gyil. Instead, I seek to emulate the sonic features identified as characteristic of the instrument. The proposed model is based on a general structure with adjustable parameters that are set by the user for fine tuning; I propose that this process is analogous to the decisions made by the craftsman when building the physical instrument. The proposed system consists of a model for the gourd itself, which is excited either by a signal generated by a modal model [6] of the wooden bar, or by audio acquired from the wooden bars using a microphone or contact microphones [113]. The modal model used in this study was created in the Reaktor13 programming language, and was tuned to the sound of the xylophone recordings by ear and by comparison with spectral plots of the recordings. The modal model used 6 partials above the fundamental, and was excited with a filtered impulse mixed with an enveloped burst of white noise, in order to simulate the sound of the mallet strike on the bars. The gourd is modeled as a resonant filter [99] and the membranes are modeled as a non-linear function that yields the Gyil's buzzing sound. The output of the membranes is heard both directly and after being filtered again by the gourd. This leads to the model structure seen in (Figure 3.28).

Figure 3.28: Signal flow for one gourd

The center frequency of the resonant filter model associated with a gourd is inversely proportional to the volume of the gourd. For that reason, larger gourds are chosen to be associated with the lower pitched notes. The filter's Q-factor is inversely proportional to the size of the opening at the top. Due to the natural shape

13 https://www.native-instruments.com/en/products/komplete/synths/reaktor-6/

of the gourds used, the resonators hung below the higher bars tend to have wider apertures, and the model can be tuned to reflect this.

Modeling the Membranes

The membranes present a non-linear behavior that is difficult to quantify, as there are no precise measurements of the characteristics of the specific spider silk used in papering the gourd holes. While some Gyil makers use varieties of paper as a replacement for the silk, the rheology of paper membranes is an unexplored field, and there are few physical results on which to base the investigation. It is reasonable to assume that the membranes respond linearly to low-intensity acoustic signals, but are only able to vibrate within a certain range of amplitudes, hence clipping the input signal at levels A+ and A−: the output is limited to A+ when the input exceeds A+, and to A− when the input falls below A−. Uneven tension on the membrane, coupled with its material properties, can lead it to fold back on itself (crinkle) when forced above a certain amplitude A. I represent this behavior using a wave folder nonlinearity. Other options which have been considered and tested are a cubic nonlinearity and a sine waveshaper. Preliminary testing showed that the cubic waveshaper did not generate sufficient harmonic content to begin to emulate the behavior of the membranes, and the sine shaper performed similarly to the wave folder, but at a higher computational cost. Due to the position of the membrane on the outside of the gourd, its motion may be impeded (Figure 3.29) when the pressure inside the gourd is drawing it inward. This behavior is included by making the wave folder asymmetric.

Figure 3.29: Membrane’s asymmetric motion

The output signal from the wave folder may be expressed in terms of the input signal as:

x_{\mathrm{out}} = \begin{cases} x_{\mathrm{in}} & A_- \le x_{\mathrm{in}} \le A_+ \\ 2A_+ - x_{\mathrm{in}} & x_{\mathrm{in}} > A_+ \\ 2A_- - x_{\mathrm{in}} & x_{\mathrm{in}} < A_- \end{cases}

where x_out is the output, x_in is the input, and A+ (A−) is the amplitude at which the wave is folded over for positive (negative) amplitudes. The wave folder produces a signal with more high-frequency energy than a simple clipper due to a higher level of inter-modulation distortion. To implement asymmetric behavior in

the model, a symmetric wave folder with parameter A = A+ = −A− is used, and an offset c is added to the input signal prior to applying the non-linear function. Adding an offset changes the energy of the harmonics added to the output. The two parameters, A and c, that define the behavior of the membrane model are connected, respectively, to the age of the membranes, as old membranes tend to require more energy to trigger the non-linear behavior, and to the asymmetry of the membranes' construction, which is determined manually by the craftsman. Since there is a great deal of variability between individual membranes, in material properties, size, and position, the level of asymmetry in the wavefolding is varied between them. In order to explore a range of possible parameterizations of the behavior of the membranes, three different algorithms are tested: a simple clipper, a simple wave folder, and a series association in which the output of a wave folder is clipped. These algorithms are compared in Section 3.2.3. As there are typically several membranes (between 3 and 5) attached to each gourd, several membrane models (with N being their number) are run in parallel. To mimic the different sizes of the membranes, and the variations in material, the non-linearities must have different parameter settings for A and c. Although these differences could be set manually for each membrane, that would imply adjusting 2N parameters. To simplify this tuning process, an adjustable scaling parameter 0 < sw < 1 is specified so that, for a model with N membranes, the fold level of the non-linearity in the nth membrane is given by s_w^n A, and the offset level c is proportionally adjusted so as to keep the asymmetry constant, thus reducing the number of adjustable parameters for the wave folders from 2N to 3.
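A minimal C++ sketch of the three nonlinearity variants, using the symmetric-folder-plus-offset formulation described above (single-stage folding and all parameter choices are illustrative assumptions, not the exact implementation):

#include <algorithm>

// Hard clipper at +/-A.
float clip(float x, float A) {
    return std::max(-A, std::min(A, x));
}

// Symmetric single-stage wave folder: values beyond +/-A are reflected back inward.
float foldSym(float x, float A) {
    if (x > A)  return 2.0f * A - x;
    if (x < -A) return -2.0f * A - x;
    return x;
}

// Membrane variants compared in Section 3.2.3; asymmetry comes from the offset c
// added to the input before the nonlinearity, as described above.
float membraneClip(float x, float A, float c)     { return clip(x + c, A); }
float membraneFold(float x, float A, float c)     { return foldSym(x + c, A); }
float membraneFoldClip(float x, float A, float c) { return clip(foldSym(x + c, A), A); }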

Signal Flow

Using a resonant filter to model the gourd, and clipping and wavefolding to model the behavior of the membranes, the building blocks of the Gyil gourd model are established. To complete the gourd-membrane model, the membrane outputs are coupled back to the gourd by summing them and feeding them back into the resonant filter with a controllable gain g. The optimal signal flow for the whole instrument is configured next. In the physical instrument, sound energy created by striking a bar excites the gourds, with the strongest excitation in the gourds immediately proximal to the bar. The sound heard by listeners is a mixture of the direct sound from the bar, the sound from the mouths of the gourds, and the sound from the membranes (an example of the signal flow is shown in Figure 3.30).

In order to explore the relative importance of these sources, a number of signal path structures can be implemented.

Figure 3.30: Signal flow for one-gourd/one-membrane model variant

The question of the importance of the excitation of more than one gourd by the striking of a bar can be addressed by comparing the performance of a model in which there is a one-to-one connection between the bars and the resonant filters to one in which the signal generated by the bar is fed into a number of parallel gourd models, with signal weightings given by an array of values B. In a system with p bars and p gourds, the amplitude of the signal from bar i is b_i(t) and the level of the signal reaching gourd j is B_ij b_i(t), where 0 ≤ B_ij ≤ 1, and it is assumed that the bar and gourd indices are such that B_ii > B_ij for all j ≠ i, so that the gourd directly below the bar receives the strongest signal. The attenuation matrix may be written as the product of a scalar b_lev and a matrix B that is normalized so that its largest element is equal to 1. By doing so, the global level of the signal that reaches the gourds can be adjusted with a single parameter. When the signal level from the bars to the gourds not immediately underneath them is negligible (i.e. B_ij = δ_ij), the model structure can be simplified by reducing it to a one-to-one bar-gourd signal flow. For any situation where the off-diagonal elements of B are non-zero, the signal from each bar must be fed into a number of parallel gourd algorithms and the results summed. Regardless of the configuration of the bar-gourd system, the signal from the bar and the signals from both the gourd and the membranes must be summed. To obtain these signals, two output taps are taken in the gourd algorithm: one immediately after the resonant filter, and one immediately after the non-linearity. By adjusting the relative levels of these taps, the output of the gourd model can be fine-tuned to best emulate the

behavior of the physical Gyil. In the following section a user survey investigating the algorithms is presented, as previously discussed. The performance of the three different forms of the non-linearity has been evaluated by a trained Gyil player, as has the coupling of the bars to the gourds.

Parameter   Range               Meaning
fc          20 to 10^4 (Hz)     volume of the gourd
Q           0 to 1              size of the mouth of the gourd
A           0 to 1              age of the membranes
c           0 to 1              asymmetry of the membranes
N           > 0                 number of membranes
sw          0 to 1              similarity of the membranes
g           0 to 1              membrane feedback gain
k           0 to 1              membrane direct gain

Table 3.5: Model parameters/Gyil physical properties

In addition, the performance of a physical model which uses a modal algorithm to simulate the bars is compared to the data produced by a hybrid synthesis method in which the acoustic signal of the bars captured by contact microphones is used as input to the gourd and membrane model. Although the model proposed in this section relies on a number of parameters that must be manually set, all of them are meaningful and correlated to actual characteristics of the Gyil. The correspondence between the model parameters and physical properties is shown in (Table 3.5).
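To make the bar-to-gourd weighting described above concrete, a per-sample routing sketch in C++ could look as follows; gourdProcess() is a hypothetical stand-in for one instance of the gourd/membrane model, and the matrix values are assumptions, not measured data.

#include <vector>
#include <cstddef>

// Stand-in for one gourd/membrane model instance (identity here; a real model
// would apply the resonant filter and wave-folder membranes).
float gourdProcess(std::size_t gourdIndex, float in) { (void)gourdIndex; return in; }

// Route p bar signals b_i(t) through p gourd models using the normalized
// attenuation matrix B (largest element 1) scaled by the global level blev.
float mixGourds(const std::vector<float>& barSamples,        // b_i(t), one per bar
                const std::vector<std::vector<float>>& B,    // B[i][j] in [0, 1]
                float blev) {
    const std::size_t p = barSamples.size();
    float out = 0.0f;
    for (std::size_t j = 0; j < p; ++j) {
        float excitation = 0.0f;
        for (std::size_t i = 0; i < p; ++i)
            excitation += blev * B[i][j] * barSamples[i];     // weighted bar signals
        out += gourdProcess(j, excitation);                   // per-gourd model output
    }
    return out;                                               // summed gourd outputs
}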

3.2.3 Experimental Results

The proposed model was evaluated in two different ways. The first was a comparative analysis, in which the output of the implemented models was compared to the audio signal recorded from a real Gyil. In the second, the model was applied as a digital effect, fed both with the audio signal acquired via contact sensors from the wooden bars without attached gourds and with the output of the modal synthesis model. By means of a survey, I solicited feedback from expert musicians about the different configurations of the model and report on the findings.

Comparative analysis

The experiments in this section were performed by the author, who is a percussionist with experience playing the Gyil. The parameters of the model were changed until the maximum auditory similarity, in the perception of the user, was obtained. The timbre of the model is greatly improved when a wave folder is used for the nonlinearity. The sound is further improved by adding clipping in series with the wave folder. This corresponds to the expected behavior of the membrane. The final timbre lacks some of the brightness of the sound of the real instrument, but does sound like a Gyil. The magnitude spectra of the signals yielded by each model, considering the first 0.3 s after the onset, were calculated and are shown in (Figure 3.31).

Figure 3.31: DFT non-linearities

Comparing these results to the spectrum of the real Gyil in (Figure 3.31), it can be seen that the digital models produce more energy below 1 kHz. Conversely, the physical Gyil has more high-frequency energy that is only present in the models that 66

include clipping. The real sound has significant energy between 1.2kHz and 1.8kHz, which was not produced by any of the models. These harmonics may play a role in defining the timbre of the real instrument. The energy of the harmonic content in the model follows the general rule 1/f, whereas that rule is clearly not applicable for the spectra of the sound of the real Gyil. The 1/f behavior is common to most non-linear mathematical functions, which means the behavior of the membrane can not be modeled as a simple non-linear function. Despite the limitations of the model, the obtained timbre is considerably similar to the one of the real instrument. Interestingly, although waveshaping synthesis is known to shorten the decay envelope of sounds [63], signals processed by my model had greater sustain than their acoustic counterparts. The attenuation of the decays of the Gyil sounds is likely due to coupling between the bars and gourds, and remains a matter for future investigation.

Qualitative Analysis

A qualitative assessment was also conducted to deal with aspects hard to identify with the spectral analysis method used. This assessment was based on short recordings of a physical Gyil as well as synthesized clips. A phrase was played on a xylophone retrofitted with contact microphones [113] for multi-channel recording. Using a simple thresholding process, symbolic data was extracted from this recording and used to render an audio file with the modal model that simulates vibrating wooden bars [6]. In total, four sound configurations were considered: a real Gyil recording, a multi- channel hybrid synthesis model in which each bar of the prepared xylophone was used to drive a different instance of the gourd model, a single channel hybrid synthesis model in which the signal of the prepared xylophone was mixed to mono and then driven into a single instance of the model, and a single-channel fully synthesized version in which the audio signal obtained from modal synthesis was used to drive the model. The audio files obtained were analyzed by experienced Gyil players, who kindly provided feedback regarding the usability of the sounds in real performances and the similarity of the modeled distortion to the buzzy sound of the real instrument. My goal was not a quantitative user study where findings are aggregated but rather to solicit feedback about my choices from musicians and scholars with significant experience with the Gyil. The names and affiliations of the experts are provided in (Table 3.6). 67

Name                      Info
Greg Harris               Professional vibist, specialist, music producer, Gyil researcher
Birdooh SK Kakraba        Professional Gyil performer, composer, instructor - Hewale Sounds Ensemble, Ghana
Dr. Payton MacDonald      Chair of percussion studies at William Paterson University, master marimbist, composer
Valerie Naranjo           Professional percussionist, clinician, Gyil expert, West African mallet percussion pioneer
Dr. Karolyn Stonefelt     Chair of percussion studies, State University of New York Fredonia, Gyil scholar
Aaron Bebe Sukura         Professional Gyil performer, composer, instructor at the ICAMD, University of Ghana
Dorawana Edmund Tijan     Master Gyil performer, composer, teacher and builder from Saru, Ghana
Dr. Michael B. Vercelli   Director of the World Music Performance Center, West Virginia University, Gyil scholar
Bernard Woma              Master Gyil performer, teacher, and founder of the Dagara Music Center in Accra, Ghana

Table 3.6: Gyil expert feedback

While the evaluators could identify the model output as having some characteristics of the Gyil sound, many of them commented that the synthesized samples do not fully capture the richness and life of the sound of a Gyil. Despite this shortcoming, the samples obtained by hybrid synthesis were generally considered to be usable for performance. Of all the model variants, the one preferred by most of the specialists was the single-channel gourd model, which is surprising given that it should be equivalent to a physical instrument with a single gourd. The synthesis of the sound using the modal model was considered the least realistic and, in general, not sounding like a Gyil. The detailed comments from the evaluators suggested that the buzz of the synthesized samples was too dark, mellow and predictable, whereas the buzz characteristic of a real Gyil is more inconsistent, in the sense that it does not present the same response at every stroke of the mallet. It was also observed that the samples recorded from the real Gyil lacked some desirable sonic characteristics, such as a deeper bass resonance and a correct timbre of the buzz. This is partly because the particular physical instrument on which the digital model was based has aged membranes.

Chapter 4

Sensing

4.1 Machine Awareness

In this section, different ways for pitched percussion instruments to sense the gestures of a performer and the associated musical environment are presented. They have been informed by the performance practice of the author and used in a variety of live music performances. More specifically two abstractions are presented: ”Ghost Hands”, a system for detecting, capturing and replaying repeated performer gestures, and ”Mstr Drmmr++”, a system for recognizing patterns of percussive events from a user defined set of patterns as a way of triggering computer-controlled musical events. Even though these abstractions are described in the context of electro-acoustic pitched percussion instruments, they can also be applied in other musical performance contexts.

4.1.1 Gesture Prediction

”Ghost Hands”

Ghost Hands is a machine learning tool intended to recognize repetition within the gesture-control language and then automate/loop patterns it recognizes as being constant within a detection threshold [131]. It was developed in collaboration with software engineer Dr. Tiago Tavares; my role was as design engineer, contributing to high-level theoretical design concerns, sensor implementation and mapping strategies. The tool appropriates and abstracts the audio looping paradigm, essentially serving as a real-time event parameter automator. While it can be used as an audio looper or event sequencer, it has primarily been explored as an effects parameter looper. In the way an audio engineer/producer might automate effects-processing parameters in post-

production, Ghost Hands is designed for the stage. When a particular gesture is repeated enough times, based on the device's threshold settings, the gesture is learned and then repeated. The control data is then mapped to the parameters of choice, in this case the various delay settings on the Valhalla device. Once the filter parameter loop has been established, the performer can continually re-contextualize how the filter modulates the incoming audio by constantly shifting the relationship between musical phrases and the continually changing filter parameters. In this way, the filters can be explored in a more dynamic capacity rather than being left at purely static settings. Because the hands are occupied during a lamellophone performance, Ghost Hands, coupled with the gesture tracking, affords the performer freedom of expressivity in an unprecedented, non-invasive, intuitive manner without interrupting the traditional playing techniques of the instrument. Many parameters can be mapped and automated simultaneously; there is no limit to how many parameters are engaged at a given instant. This opens up an entirely new world of interaction that would previously have required several other individuals at a mixing desk turning knobs during the performance in order to approximate the same results.

Forecasting of Position Data

A method capable of forecasting the continuation of a given data array x[n], where x[n] is the value of the n-th reading of a certain sensor, is described. The forecasting algorithm does not require prior training on templates, which means that the musician is free to improvise, generate creative soundscapes, and define the appropriate gestures while performing. The key idea is that a gesture is identified by being repeated, without requiring a preset looping duration. When forecasting, the system provides data that aims to be equivalent to the sensor data that would be yielded if the musician continued the previous gesture. This allows the performer to continue playing while still getting the soundscape variations derived from sensor data. The forecasting method is based on evaluating which time lag, in samples, is most likely to be the fundamental period of the received data. This means that although the motion may be freely composed and improvised, only repetitive motions can be predicted by the system. The method begins by estimating the autocorrelation of the last N input samples, which is defined as (Equation 4.1):

r_x[k] = \sum_{n=0}^{\frac{N}{2}-1} x[n]\, x[n-k]    (4.1)

An autocorrelation signal presents peaks at the positions k that correspond to the time lags at which x[n] is most self-similar. However, there are other peaks, especially at the zero time lag k = 0, that must be filtered out. To do so, a signal r'_x[k] is calculated by upsampling r_x[k] by a factor of two and subtracting it from the original signal. The value j = arg max_k r'_x[k] is then obtained by a simple linear search. The forecasting proceeds by yielding the estimate x̂[N + 1] = x[N − j + 1]. The quality of the estimate may be evaluated by the ratio r_x[j]/r_x[0], which presents values closer to 1 when the signal is significantly periodic. The forecasting system works in three different modes: learning, predicting and bypass. The learning mode should be triggered when a new gesture is to be acquired. While in this mode, the internal buffer is updated and the forecasting algorithm is executed at each new sample. Since the learning mode assumes that the gesture is not yet learned, the system yields bypassed data instead of predicted data. When the prediction mode is triggered, the system starts yielding forecasts, adding them to the internal buffer as if they were received as inputs. While in this mode, the system does not recalculate the forecasting algorithm, keeping a single value of j during the whole process. The bypass mode simply passes input data to the output, ignoring any processing. Hence, if the prediction mode is triggered while the system is in bypass mode, the system will recall the last learned pattern; when operating in bypass mode, the system preserves the last learned gesture, so it may be used again by triggering the prediction mode.

Figure 4.1: Forecasting system operation modes 71

The internal operation of the system is visualized in (Figure 4.1), which shows the input (bypassed) data from a one-dimensional sensor and the predicted data. The figure was generated while acquiring manually-driven data from a one-dimensional sensor. In the first five seconds, the system is in bypass mode, and the prediction system has not yet received any samples. When the learning mode is triggered, the predicted data quickly synchronizes with the repetitive sensor data received as input. When the prediction mode is triggered, the forecasting system continues the learned gesture and ignores the sensor data, creating a mode for potential man-machine improvisation.
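A compact C++ sketch of the lag estimation and forecasting described above is given below; the buffer handling, indexing, and peak-filtering details are illustrative simplifications, not the Ghost Hands internals.

#include <vector>
#include <cstddef>

// Sketch of the repetition-based forecaster: estimate the dominant lag of the
// last N readings via the autocorrelation of Eq. 4.1, then predict the next
// value by looking one period back.
class GestureForecaster {
public:
    explicit GestureForecaster(std::size_t N) : x_(N, 0.0f) {}

    // push a new sensor reading (learning / bypass modes)
    void push(float v) { x_.erase(x_.begin()); x_.push_back(v); }

    // r_x[k] as in Eq. 4.1; terms that would need negative indices are skipped
    float autocorr(std::size_t k) const {
        float r = 0.0f;
        for (std::size_t n = k; n < x_.size() / 2; ++n) r += x_[n] * x_[n - k];
        return r;
    }

    // pick the lag j that maximizes r'_x[k] = r_x[k] - r_x[k/2],
    // which suppresses the spurious peak at k = 0
    std::size_t estimateLag() const {
        std::size_t best = 1;
        float bestVal = -1e30f;
        for (std::size_t k = 1; k < x_.size() / 2; ++k) {
            float rPrime = autocorr(k) - autocorr(k / 2);
            if (rPrime > bestVal) { bestVal = rPrime; best = k; }
        }
        return best;
    }

    // prediction mode: the next reading is estimated as the value one period ago
    float forecast(std::size_t j) const { return x_[x_.size() - j]; }

    // quality measure r_x[j] / r_x[0]: close to 1 for strongly periodic gestures
    float quality(std::size_t j) const {
        float r0 = autocorr(0);
        return r0 > 0.0f ? autocorr(j) / r0 : 0.0f;
    }

private:
    std::vector<float> x_;   // history buffer of the last N readings
};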

4.1.2 Drum Pattern Identification

Mstr Drmmr++

Another approach to gesture control is to recognize percussive patterns from a user-defined set of patterns in order to initiate computer-controlled musical events. A software device was developed in collaboration with undergraduate computer science research assistant Mike Dean. My role was as design engineer, contributing to high-level theoretical design concerns, sensor implementation and mapping strategies. The software is named Mstr Drmmr after the tradition of the master drummer in West African drumming; this is another application of OSP. The machine learning software has been designed to listen and react much in the same way an ensemble follows a master drummer in a traditional context.

Chernoff states: ”A master drummer’s varied improvisations will isolate or draw attention to parts of the ensemble more than they seek to emphasize their own rhythmic lines, and a musician must always play with a mind to communicative effectiveness.”[?] Master drummers will know all the parts of the arrangement, the respective dances, and give the calls and breaks in music direction. The software is meant to emulate this role, offering the performer the ability to train the algorithm with a musical phrase ”break” that can be mapped to various control parameters. When the software ”hears” the phrase, it reacts accordingly.

The proposed system aims at providing the performer with the possibility of influencing the computer ensemble using musical cues without having to alter the playing technique, which represents a significant improvement over conventional external tactile

controls. My approach is based on calculating the similarity between an input drum pattern, played at a certain point during the execution of a piece, and a previously recorded pattern. The similarity is a continuous value that is greater when the pattern played is closer to the recorded pattern. If the similarity value exceeds a certain user-defined threshold, the system triggers an associated action. Two equal-length sequences of numerical data can be compared in terms of a single value between zero and one. A value of one indicates the two sequences are identical, whereas zero indicates a drastic lack of similarity between the two data sets. This similarity computation is known as correlation. (Equation 4.2) computes a value that describes the similarity between two sequences.

\mathrm{Similarity} = \frac{\sum_{n=0}^{N-1} x[n]\, y[n]}{\sqrt{\sum_{n=0}^{N-1} x[n]^2 \times \sum_{n=0}^{N-1} y[n]^2}}    (4.2)

To account for slight variation, sequences can be convolved with a Gaussian function with standard deviation σ and mean µ, as shown in (Equation 4.3), before being compared. This process results in a smearing of the sequence's values to adjacent locations, as seen in (Figure 4.2).

f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}    (4.3)

Figure 4.2: Convolution with Gaussian function.
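A C++ sketch combining Equations 4.2 and 4.3 is shown below; the kernel radius, σ, and default values are illustrative choices, not parameters taken from the thesis.

#include <vector>
#include <cmath>
#include <cstddef>

// Smear a binary onset sequence with a small Gaussian kernel (Eq. 4.3).
std::vector<float> smear(const std::vector<float>& seq, float sigma = 1.0f, int radius = 2) {
    const float PI = 3.14159265f;
    std::vector<float> out(seq.size(), 0.0f);
    for (std::size_t n = 0; n < seq.size(); ++n) {
        for (int k = -radius; k <= radius; ++k) {
            int m = static_cast<int>(n) + k;
            if (m < 0 || m >= static_cast<int>(seq.size())) continue;
            float g = std::exp(-(k * k) / (2.0f * sigma * sigma)) / (sigma * std::sqrt(2.0f * PI));
            out[n] += seq[m] * g;    // spread each onset to neighboring time slots
        }
    }
    return out;
}

// Normalized correlation between two equal-length sequences (Eq. 4.2); 1 = identical.
float similarity(const std::vector<float>& x, const std::vector<float>& y) {
    float xy = 0.0f, xx = 0.0f, yy = 0.0f;
    for (std::size_t n = 0; n < x.size(); ++n) {
        xy += x[n] * y[n];
        xx += x[n] * x[n];
        yy += y[n] * y[n];
    }
    return (xx > 0.0f && yy > 0.0f) ? xy / std::sqrt(xx * yy) : 0.0f;
}

In use, a stored template and an incoming one-measure pattern would both be smeared before comparison, and an action is triggered when similarity() exceeds the user-defined threshold.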

To assemble such sequences in real time, the proposed system assumes that the player will perform to a specific tempo. For each time interval in the measure, if an onset is detected, it is notated in the sequence using a value of one. Otherwise a value 73

of zero is used, indicating a rest. The onset algorithm used calculates the power of the input signal and triggers onsets when a threshold has been surpassed [77]. From the methods described, the similarity between musical patterns can be calculated. To do so, the system must have two modes of operation: recording a template pattern, and comparing incoming patterns to the template pattern. When comparing incoming patterns, the similarity value of the input pattern to a set of stored templates is calculated, and if it exceeds a user-defined threshold an event is triggered. In order to gain perspective on the effectiveness of the system in a real performance situation, the pattern recognition system was tested with a human percussionist performing on a traditional West African pitched percussion instrument known as the Gyil. The instrument provided audio input to the software through contact microphones mounted on each bar of the instrument. The pattern recognition system was used to recognize patterns played on a single bar. Of the two types of tests done, the first provided similarity values when a pattern was played during a performance. In this first test, the sampled bar was only played when the user wished to trigger the software using a pattern; otherwise, the bar was avoided as the performer played the instrument. In the second test, the entire instrument was utilized by the performer; the sampled bar was used both for triggering patterns and for playing. The first test used five unique, one-measure-long rhythms that were tested ten times each at two different tempos; the two tempos differed by one hundred beats per minute. For all one hundred measurements in the first test, an average similarity of 0.731 was calculated. The measured similarity values were affected most drastically by two factors: the human inaccuracies when recording the template pattern and the human inaccuracies when repeating the pattern in order to trigger an action. In the second test, the system incorrectly triggered an action during a sixteen-bar solo. During each solo performance, the user played rhythms on the sampled instrument bar every second measure. The number of times an action was not triggered when the player intended to do so was also documented. Of the ten tests for these sixteen-bar solos, the system incorrectly identified a player's pattern as the template pattern only twice, or 0.025 of the measures in which the bar was played. For these two incorrectly identified patterns, there was a stark similarity to the template pattern, and it was unsurprising that the system gave a false positive result. When the player intended to trigger an action by performing the template pattern, these tests gave no false negative results.

4.2 Surrogate Sensor Framework

The work described in this section has been motivated by three applications: computer analysis of Gyil performance, live improvised electro-acoustic music incorporating the Gyil, and hybrid sampling and physical modeling. In all three of these cases, detailed information about what is played on the Gyil needs to be digitally captured in real time. I describe a direct sensing apparatus that can be used to achieve this. It is based on contact microphones and is informed by the specific characteristics of the Gyil. An alternative approach, based on indirect acquisition, is to apply polyphonic transcription to the signal acquired by a microphone, without requiring the instrument to be modified. The direct sensing apparatus developed can be used to acquire ground truth for evaluating different approaches to polyphonic transcription and to help create a surrogate sensor. Some initial results comparing different strategies for polyphonic transcription are presented. The custom Gyil surrogate-sensing research describes related work and provides context for the custom sensing apparatus, which is coupled with a physical model of the gourd resonators to create an acoustically excited hybrid model that retains the wooden bars while eschewing the gourds, reproducing their sonic contribution via MIR/DSP. Experiments using sound source separation algorithms tailored specifically to the Gyil, performing real-time causal transcription, potentially bypass the need for direct sensing. The direct sensing apparatus is used to train this indirect ”surrogate” sensor by automatically providing ground truth to evaluate how good the transcription is. An underlying motivator is to achieve better results in both sensor design and audio analysis by tailoring algorithms specifically to the Gyil. The Gyil (pronounced Jee-Lee) is a central instrument native to the Lobi culture in the Black Volta River Basin on the borders of Burkina Faso, Cote d'Ivoire, and Ghana [96]. It is a wooden xylophone whose number of bars varies between 11 and 22, but is typically 14. The bars are arranged in a single row, made of Legaa wood (the region's indigenous hardwood), and use gourds (sized relative to the bar they are under) as resonators. The instrument resembles the marimba sonically and is graduated in size from left to right: bigger/lower pitch to smaller/higher pitch, respectively. Its tuning is technically pentatonic, and spans three to four octaves (specifically accommodating the vocal range). Because West Africans do not traditionally have a collectively established fixed frequency to designate as a systematic tuning convention, e.g. A=440 Hz, instruments

are tuned at the discretion of the instrument maker and each has its own unique tuning, even if meant to be a pair. In general, African equidistant tuning is based on the recognition of steps that resemble one another. When a key does not conform yet is tolerated, it is considered dissonant. This idea of dissonance in the Lobi context will actually change the scale system from tetratonic to pentatonic, and is perceived as something that should be avoided. This is an important consideration when it comes to physical modeling, because the convenience of uniform distribution is not an afforded luxury; each instrument, and each bar on that instrument, must therefore be considered individually, along with its corresponding gourd, the sympathetic resonances from each unique neighboring gourd, and the subsequent individual character of each bar. The direct sensing prototype relies on a combination of standard digital pro-audio equipment: a Firewire AD/DA multi-channel audio interface, a personal computer with appropriate audio processing capabilities, the necessary software, playback speakers, and low-cost piezo transducers (less than 1 cent when bought in bulk) used as contact microphones, hot-glued directly to the bars, to acquire direct digital audio (Figure 4.3).

Figure 4.3: Surrogate sensor system

The acquired signals have a very high signal-to-noise ratio and capture the acoustic sound well. A similar approach does not work very well with metal vibraphone bars, as metal bars are much more sensitive to the damping effect of attaching a piezo, which attenuates the higher-frequency partials and shortens the decay time. It does work with other wooden pitched percussion instruments, such as the chromatic western concert xylophone and marimba. However, digital audio

input using the method described is impractical and cost-prohibitive (unless custom multiplexing is used), requiring more than 37 channels for any instrument over the standard 3-octave range. At the time of this publication, this is still a cost-prohibitive design goal. The system presented here is designed so that each bar has a piezo-electric sensor glued onto it on the player's side near the bridge, though not uniformly, as the undersides of the Gyil bars are irregular. The most resonant location for each piezo was sought, offering the best replication of the bar's sound. Each sensor is connected via a 1/4” jack input to a Metric Halo1 2882 audio interface. The 2882 interface has eight 1/4” inputs with digitally controlled gain. This system utilizes two 2882s linked optically, allowing for 16 direct, identical channels of audio input, ideal for our Gyil, which has 15 bars. The cabling was fabricated using low-cost materials and the analog circuit is completely passive. Ableton Live and Max/MSP were used to process the incoming 15 channels of audio. Each bar is assigned its own channel in Live and can be processed individually as needed using native devices, third-party plug-ins, or custom Max4Live devices. Monitoring was done on a stereo pair of Genelec2 8030A speakers via a MIDAS3 Verona 320 mixing console, but any monitoring configuration can be implemented.

4.2.1 Hybrid Acoustic/Physical Model

The Gyil is unique within the xylophone family primarily in the specific use of gourds that are hung on the body of the instrument, below the wooden bars as discussed in Chapter 3. They act as resonators amplifying the sound of the wooden bars. The uniquely characteristic buzzing sound of the Gyil is the result of the preparation of the gourds, which have holes drilled in them and papered over with spider silk egg casings. The egg casings act as vibrating membranes, but are irregular in their form, due to the shape of the holes, and their material properties. A physical model for the Gyil gourd has been developed and is described further in this section. It has been motivated by the desire to create sound synthesis techniques based on physical modeling for the Gyil as well as a way to simplify the transportation of the instrument as the gourds are large in volume, awkward in shape, and fragile. The direct sensing apparatus described in the previous section can be easily packed as

1 http://mhsecure.com
2 https://www.genelec.com/
3 https://www.musictri.be/brand/midas/home

it consists of wires, contact microphones, and an external multi-channel sound card. The wooden bars can be easily folded and are quite robust. In addition, it lowers the cost of making an instrument. The developed physical model is designed to act as an electronic replacement for the gourds while preserving the unique timbre of the instrument. By having separate audio input through the contact microphones for each bar, as described in the previous section, each bar can be fed into a separate gourd model, resulting in a more realistic model of the actual instrument. The proposed system consists of a model for the gourd itself and a model for the membranes. The assumption is made that the feedback from the gourd to the wooden bar is negligible, so the instrument can be viewed as a source-filter system. In order to make measurements of the acoustic properties of a single gourd, the bars were removed from the Gyil frame. Investigations were carried out by exciting the gourd with sine and triangle waves of fixed frequency and varying amplitude, which led to the wave-wrapping model of the membrane described below. It was found that there is a clear threshold over which there is significant high-frequency distortion, which we speculate is caused when the egg-casing membranes start deforming instead of simply vibrating. The gourds are considered simple resonant bodies, which can be modeled by resonant band-pass filters [99]. The frequency of the filter is inversely proportional to the volume of the gourd, and the resonance or Q-factor is inversely proportional to the size of the opening at the top. For that reason, larger gourds are chosen to be associated with the lower pitched notes. The range of gourd sizes may either be selected by hand,

or a scaling constant, 0 < sf < 1, may be used, so that the center frequency of the

bandpass filter representing the smallest gourd, f0, is selected and the frequency of the nth gourd is given by s_f^{n-1} f_0. The membranes are modeled using wave folders, which have the property of wrapping the input signal around two predefined limits. Mathematically, for an input signal with amplitude x_in and a specified reflection amplitude A, the output signal x_out is given by (Equation 4.4):

x_{\mathrm{out}} = \begin{cases} x_{\mathrm{in}} & -cA < x_{\mathrm{in}} < A \\ A - (x_{\mathrm{in}} - A) & x_{\mathrm{in}} > A \\ -cA - (x_{\mathrm{in}} + cA) & x_{\mathrm{in}} < -cA \end{cases}    (4.4)

where 0 ≤ c ≤ 1 is a parameter which we may set to adjust the asymmetry of 78

the folding, that is, the difference between the absolute values of the high-level and the low-level limits, as can be seen in (Figure 4.4). The choice of wave folding over a clipping function reflects the need for a greater degree of harmonic content to be generated than clipping produces, when processing a dynamic signal.

Figure 4.4: Simple wave folder w/ adjustable symmetry

The resonant filter feeds its signal to a number of membranes, which are wave folders as described by (Equation 4.4). To mimic the different sizes of the membranes, and the slight variations in material, the wave folders have different parameter settings for A and c (as defined in Equation 4.4). To simplify the implementation of the model, and to make it more transparent, an adjustable scaling parameter 0 < sw < 1 is specified so that, for a model with N membranes, the level at which the signal is folded in the nth membrane is given by s_w^n A, thus reducing the number of adjustable parameters for the wave folders from N to 2. Last, the output of the wave folders is fed back into the resonant filter with a controllable gain. The whole system may be visualized in (Figure 4.5).

Figure 4.5: Diagram of signal flow for one gourd
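Bringing the pieces together, the per-gourd signal chain of Figure 4.5 might be sketched in C++ as follows; the biquad filter design, sample rate, and default parameter values are illustrative stand-ins, not taken from the actual implementation.

#include <cmath>

// Sketch of one gourd: resonant band-pass filter -> N membrane wave folders
// -> feedback into the filter (gain g). All values are illustrative only.
struct Gourd {
    float fc, q, A, c, g;   // center freq, resonance, fold level, asymmetry, feedback gain
    int   numMembranes;
    float sw;               // similarity scaling between membranes
    float fs;               // sample rate
    float b0, b1, b2, a1, a2, z1 = 0, z2 = 0, fb = 0;   // band-pass state

    Gourd(float fc_, float q_, float A_, float c_, float g_, int N, float sw_, float fs_)
        : fc(fc_), q(q_), A(A_), c(c_), g(g_), numMembranes(N), sw(sw_), fs(fs_) {
        // standard constant-peak-gain band-pass biquad coefficients
        float w0 = 2.0f * 3.14159265f * fc / fs;
        float alpha = std::sin(w0) / (2.0f * q);
        float a0 = 1.0f + alpha;
        b0 = alpha / a0; b1 = 0.0f; b2 = -alpha / a0;
        a1 = -2.0f * std::cos(w0) / a0; a2 = (1.0f - alpha) / a0;
    }

    // asymmetric wave folder following Eq. 4.4
    static float fold(float x, float a, float c) {
        if (x > a)      return a - (x - a);
        if (x < -c * a) return -c * a - (x + c * a);
        return x;
    }

    // process one input sample from the bar (direct sensor or modal model)
    float process(float barIn) {
        float x = barIn + g * fb;                 // membrane feedback into the filter
        float y = b0 * x + z1;                    // band-pass (transposed direct form II)
        z1 = b1 * x - a1 * y + z2;
        z2 = b2 * x - a2 * y;
        float membranes = 0.0f;
        float level = A;
        for (int n = 0; n < numMembranes; ++n) {  // N membranes with scaled fold levels
            membranes += fold(y, level, c);
            level *= sw;
        }
        fb = membranes / numMembranes;
        return y + membranes;                     // filter tap + membrane tap summed
    }
};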

The signal yielded by the physical model is the output of the resonant filter, which will be very close to the sound of the vibrating bar at low amplitudes and will have a considerable amount of distortion at high amplitudes. The gourd has the effect of amplifying the two harmonics present in the bar and introducing a significant amount of spectral content between 1 kHz and 7 kHz, as can be seen by comparing the top and middle plots in (Figure 4.6). This gourd model, driven by the output of the piezo-electric direct sensor, succeeds in introducing high-frequency content in a similar manner to the Gyil gourd, as seen in the lowest spectral plot in (Figure 4.6). It can be seen that the Gyil gourd produces a more broadband spectrum than the model, which is characterized by well-defined spectral peaks. The broadband spectrum introduced by the gourd suggests a greater degree of non-linearity in the physical system than in the first-order model.

Figure 4.6: Various spectral plots for Gyil

The proposed model benefits from the direct sensors. In the physical instrument, larger gourds are related to the sound of lower pitched notes with higher intensity, 80

and the same holds for smaller gourds and higher pitched notes. If the sound signal from the Gyil is simply obtained through a single microphone, this pitch-dependent coupling is not possible. By using a different channel for sensing each bar, it is possible to route signals through different instances of the model described above, hence obtaining a sound whose timbre is closer to that of the natural Gyil sound.

4.2.2 Sound Source Separation and Automatic-Transcription

The goal of automatic source separation is to separate the individual sound sources that comprise a mixture of sounds from a single-channel recording of that mixture. Such systems can be used as front ends for automatic transcription, which has the related but different goal of detecting when and for how long different sources are activated, without requiring actual separation of the signals [141]. A large family of source separation/music transcription algorithms is based on various forms of factorization in which the time/frequency representation of the mixture is decomposed into a weighted linear combination from a dictionary consisting of basis functions that correspond to the time/frequency representations of the individual sound sources [120]. The majority of existing approaches are not causal (they require all of the input signal) and are designed for more general types of sound sources; therefore they are not appropriate for real-time processing. This work investigates the potential of such techniques in a real-time context and with a dictionary specifically tailored to the Gyil. One possible criterion for evaluating how good a certain approximation is, is the least-squares error [53]. Since sound source activities are always positive, a non-negativity criterion may be incorporated in the calculation of the approximation, leading to the Non-Negative Least Squares (NNLSQ) algorithm [53]. The NNLSQ algorithm has been used in the context of transcription (that is, the detection of symbolic note data) by Niedermayer [94] and, as it represents a form of dictionary-based factorization, it may also be used to perform some forms of sound source separation while forming the basis of this transcription system. Digital signal processing techniques are applied with the purpose of obtaining the same data yielded by the multi-channel direct sensors, but using only a single channel as input. This single channel can either be the summation of the sensor signals or the sound data acquired from a regular microphone that is not directly attached to the instrument. The motivation is to attempt to extract the same information without requiring any modifications to the actual instrument. For this

purpose it was decided to investigate sound source separation algorithms. In order to obtain satisfactory performance the approach was tailored to the Gyil. In order to evaluate transcription algorithms it is necessary to have some form of ground truth of what the right answer should be. In most of the existing literature this is obtained through symbolic representations of the music that are then synthesized and passed through the source separation system. In the music traditions being evaluated here there is no concept of a score, and therefore evaluation of a transcription system would require time-consuming manual creation of ground truth. By utilizing the direct sensing apparatus described above one can effectively collect unlimited amounts of ground truth and training data simply by playing the instrument. The techniques described below rely on a factorization algorithm called Non-Negative Least Squares (NNLSQ), as proposed in [54], which aims to obtain the best description (in the least-squares sense) of a certain signal using a non-negative combination of a set of pre-defined basis functions. The signal representation used for the purpose of this detection is the logarithm of the magnitude spectrum, which is obtained by dividing the input signal into frames of 12ms (with a 6ms overlap), multiplying each frame by a Hanning window, zero-padding it to twice its length and, for each frame x, calculating the spectrum as y = log10(1 + ||DFT(x)||). The resulting spectra are trimmed in order to eliminate frequency components outside the range of the instrument's notes (and reduce the computational load). The basis vectors are obtained by averaging a few frames of the spectrogram of a performance of each isolated note. The NNLSQ algorithm is then executed over each frame of the analyzed piece's spectrogram, yielding the activation level of each note. This approach is similar to the one proposed by [94]. This information may be used either for sound source separation or for automatic transcription, as described below. The dataset used in this experiment consists of an audio recording approximately 2 minutes long. It was simultaneously recorded using the direct sensors (as separate tracks) and a microphone placed in front of the instrument. After the recording, the data from the direct sensors was artificially mixed, simulating the process of analog mixing that is part of the first scenario described above. Hence, there were three different synchronized tracks: a multi-channel track where each bar of the instrument is explicitly recorded in a different channel, a single-channel track containing data obtained from the direct sensors, and a single-channel track obtained from a microphone recording. All experiments in this section take into account using direct (contact) and indirect (microphone) sensors both to acquire the basis functions and to test the

performance of the analyzed methods.
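A minimal Python sketch of the frame-wise NNLSQ decomposition just described is given below. It assumes SciPy's nnls routine as the solver and basis spectra already averaged from recordings of each isolated note; the trimming of out-of-range frequency bins is omitted for brevity, and the function names are this sketch's own.

    import numpy as np
    from scipy.optimize import nnls

    def log_spectrogram(x, fs, frame_ms=12, hop_ms=6):
        """Log-magnitude spectrogram: 12ms Hann-windowed frames, 6ms hop,
        zero-padded to twice the frame length, as described above."""
        n = int(fs * frame_ms / 1000)
        hop = int(fs * hop_ms / 1000)
        win = np.hanning(n)
        frames = []
        for start in range(0, len(x) - n, hop):
            spec = np.abs(np.fft.rfft(x[start:start + n] * win, 2 * n))
            frames.append(np.log10(1.0 + spec))
        return np.array(frames).T                 # shape: (bins, frames)

    def note_activations(mix, bases, fs):
        """Per-frame NNLSQ activation of each note basis in the mixture.
        `bases` is a (bins, notes) matrix of averaged isolated-note spectra."""
        Y = log_spectrogram(mix, fs)
        A = np.zeros((bases.shape[1], Y.shape[1]))
        for t in range(Y.shape[1]):
            A[:, t], _ = nnls(bases, Y[:, t])     # non-negative least squares per frame
        return A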

Sound Source Separation

In this section, an NNLSQ-based technique for sound source separation is evaluated. This technique relies on the assumption that polyphonic audio signals that result from the mixing of several different sources are, in terms of physical measures, the sum of the signals corresponding to each individual source. Also, the human perception derived from listening to that sound is essentially the superposition of the sensations triggered when listening to each individual source. Therefore, a reasonable mathematical model for the phenomenon of sound source identification is:

X = BA (4.5)

In that model, B is a set of basis functions, that is, a set of vector representations of the sources to be identified, X is the representation of several measures of the phenomenon in the same domain as B, and A is a set of weight coefficients that represent how much each source defined in B is active in each measurement. Evaluation of the technique is based on a resynthesis schema, as follows. Since the spectral representation related to each basis vector is known, it is possible to re-synthesize the expected audio for each channel. That is done by summing the magnitude spectral representation of each basis, weighted by the activation level of that note, using the phase information from the input signal and then calculating an inverse DFT. In the experiments, the input signal was separated and then remixed. The final result was compared to the input using the evaluation method described by [24]. The Signal-to-Distortion Ratio (SDR) is reported, as it is the only meaningful evaluation metric for one-input, one-output systems. The results can be viewed in (Table 4.1).

Input \ Basis Source    Microphone    Contact
Microphone              0.0408        0.0049
Contact                 0.0857        0.1558

Table 4.1: Sound-source separation: Signal-to-Distortion ratio (SDR)

As can be seen, the use of sound-source separation techniques is not as effective as the use of individual sensor data, regardless of the basis vectors used. This means

that the use of multiple channels of direct sensors is interesting for the purpose of obtaining audio data. The next section describes an algorithm for obtaining control data, namely, the onsets and pitches that are played.

Automatic Transcription

Although obtaining audio data from the mono signal, as described above, may be difficult with today's techniques [141], it might only be desired to simply obtain the control data that would be provided by the direct sensors. Experiments were conducted aiming to determine the limits under which control data may be reliably obtained from these single-channel mixes. The ground truth transcription data was obtained by automatically detecting onsets using straightforward energy thresholding in the multi-channel track (offsets were ignored, as the Gyil does not have gestures related to offsets). This method of obtaining ground truth data is called surrogate sensing [5], and by using it a large amount of annotated data can be acquired in real-time. The method used to obtain symbolic data in this work relies on obtaining basis vectors and then running the NNLSQ algorithm over the testing data. In order to obtain discrete control data, the activation levels are passed to a rule-based decision algorithm that works as follows. First, all activation levels below a certain threshold are set to zero. After that, all values whose activation level difference is below another threshold b are set to zero. When a non-zero activation value is found, an adaptive threshold is set to that level multiplied by an overshoot factor c. The adaptive threshold decays linearly at a known rate d, and all activation levels below it are ignored. Finally, the system deals with polyphony by assuming that a certain activation level only denotes an onset if it is greater than a ratio g of the sum of all activation values for that frame. After this process a list of events, described by onset and pitch, is yielded. The events in the ground truth list and in the obtained list are matched using the automatic algorithm described in [25]. An event is considered correct if its pitch is equal to the pitch of the matched event and its onset is within a 100ms range of the ground truth. The values reported are the Recall (R, number of correct events divided by the total number of events in the ground truth), Precision (P, number of correct events divided by the total number of yielded events) and the F-Measure (F = 2RP/(R + P)). These are standard metrics used for evaluating transcription, originating in information retrieval. (Tables

4.2, 4.3) show these metrics for both analyzed pieces.

Input Source    Recall    Precision    F-Measure
Microphone      36.39     33.63        34.95
Contact         81.01     44.91        57.79

Table 4.2: Direct sensor: detection accuracy (%)

Input Source    Recall    Precision    F-Measure
Microphone      59.81     47.97        53.24
Contact         20.89     74.16        32.59

Table 4.3: Indirect sensor: detection accuracy (%)

As can be seen, results degrade when using different sensors to calculate the basis vectors than the sensors used to acquire the signal on which detection is performed. That is because the microphone signal is considerably noisier - due to normal ambient noise when playing, as well as sounds from the Gyil frame - and the basis vectors obtained from microphone recordings take these model imperfections into account. It is also important to note that the precision obtained when analyzing direct sensors was always greater, which is easily explained by the absence of ambient noise in the recording. The results, however, indicate that there is significant room for improvement, and for greater accuracy requirements it is necessary to use the multi-channel direct sensor data, as current real-time techniques for polyphonic transcription have limited reliability. The direct sensing has been essential in enabling us to evaluate these different approaches. It is also important to note that the results degrade drastically when using basis functions that are not obtained from Gyil recordings.
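For concreteness, the sketch below restates the rule-based decision procedure and the evaluation metrics described above in Python. The absolute floor (here called a) and the thresholds b, c, d and g follow the description, but the numeric defaults are illustrative placeholders, and the greedy matcher stands in for the matching algorithm of [25].

    import numpy as np

    def detect_onsets(A, hop_s, a=0.1, b=0.05, c=1.2, d=0.5, g=0.15):
        """Rule-based onsets from NNLSQ activations A (notes x frames).
        a: absolute floor, b: minimum frame-to-frame rise, c: overshoot
        factor, d: linear decay per second, g: polyphony ratio."""
        notes, frames = A.shape
        events = []                                        # (onset time in s, note index)
        adaptive = np.zeros(notes)
        for t in range(frames):
            adaptive = np.maximum(adaptive - d * hop_s, 0.0)   # linear decay
            frame_sum = A[:, t].sum() + 1e-12
            for k in range(notes):
                level = A[k, t]
                rise = level - (A[k, t - 1] if t > 0 else 0.0)
                if level < a or rise < b or level < adaptive[k]:
                    continue
                if level / frame_sum < g:                  # polyphony gate
                    continue
                events.append((t * hop_s, k))
                adaptive[k] = c * level                    # overshoot, then decays
        return events

    def prf(events, truth, tol_s=0.1):
        """Recall, precision and F-measure with a 100ms onset tolerance."""
        matched, correct = set(), 0
        for t_time, t_note in truth:
            for i, (e_time, e_note) in enumerate(events):
                if i not in matched and e_note == t_note and abs(e_time - t_time) <= tol_s:
                    matched.add(i)
                    correct += 1
                    break
        r = correct / max(len(truth), 1)
        p = correct / max(len(events), 1)
        return r, p, (2 * r * p / (r + p) if r + p else 0.0)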

Chapter 5

Actuation

5.1 Auto-Calibration

In this section we describe a system designed to aid the computer in hearing itself so that it can monitor its own output for auto-calibration. This project was developed in collaboration with software engineer Dr. Steven Ness, who was a PhD candidate at the time. Here my role was as design engineer, where I contributed towards high-level theoretical design concerns, mapping strategies and system implementation. The typical architecture of interactive music robots is that the control software receives symbolic messages based on what the other performers (robotic or human) are playing as well as messages from some kind of score for the piece. It then sends control messages to the robot in order to trigger the actuators generating the acoustic sound. In some cases the audio output of the other performers is automatically analyzed to generate control messages. For example, audio beat tracking can be used to adapt to the tempo played. Self-listening is a critical part of musicianship, as anyone who has struggled to play music on a stage without a proper monitor setup has experienced. However this ability is conspicuously absent in existing musical robots. One could remove the acoustic drum actuated by a solenoid so that no sound would be produced, and the robotic percussionist would continue blissfully playing along. This work has been motivated by practical problems experienced in a variety of performances involving percussive robotic instruments. (Figure 5.1) shows an experimental setup in which solenoid actuators are used to excite different types of frame drums. The ability of a robot to listen, especially to its own acoustic audio output, is critical in addressing these problems.

Figure 5.1: Solenoid actuated frame drum array

Accordingly, relevant music information retrieval techniques adapted for this purpose are described in Chapter 4. More specifically, self-listening can be used to automatically map controls to actuators as well as to provide self-adapting velocity response curves. Pitch extraction and dynamic time warping are used for high-level gesture analysis in both sensor and acoustic domains.

MOTIVATION

MISTIC has extensive experience designing music robotic instruments, implementing control and mapping strategies, and using them in live and interactive performances with human musicians, frequently in an improvisatory context. In addition the author has regularly performed with robotic instruments. One of the most important precursors to any musical performance is the sound check/rehearsal that takes place before a concert in a particular venue. During this time the musicians set up their instruments, adjust the sound levels of each instrument and negotiate information specific to the performance such as positioning, sequencing and cues. A similar activity takes place in performances involving robotic acoustic instruments, in which the robots are set up, their acoustic output is calibrated and adjusted to the particular venue, and mappings between controls and gestures are established. This process is frequently tedious and typically requires extensive manual intervention. This work attempts to utilize MIR to simplify and automate this process. This is in contrast to previous work in robotic musicianship that mostly deals with the actual performance. More specifically, three problems are focused on: automatic mapping, velocity calibration, and melodic and kinetic gesture recognition.

The experimental setup used consists of a modular robotic design in which multiple solenoid-based actuators can be attached to a variety of different drums. Audio signal processing and machine-learning techniques are employed to create robotic musical instruments that listen to themselves using a single centrally located microphone. It is a time consuming and challenging process to set up robotic instruments in different venues. One issue is that of mapping, that is, which signal sent from the computer maps to which robotic instrument. As the number of drums grows, it becomes more challenging to manage the cables and connections between the controlling computer and the robotic instruments. The system proposed here performs timbre classification of the incoming audio, automatically mapping solenoids correctly in real-time to the note messages sent to the musically desired drum. For example, rather than sending an arbitrary control message to actuator 40, the control message is addressed to the bass drum and will be routed to the correct actuator by simply listening to what each actuator is playing in a sound-check stage. That way actuators can be moved or replaced easily, even during the performance, without changes in the control software. The same approach is also used to detect broken or malfunctioning actuators that do not produce sound. When working with mechanical instruments, there is a great deal of non-linearity and physical complexity that makes the situation fundamentally different from working with electronic sound, which is entirely virtual (or at least not physical) until it comes out of the speakers. The moving parts of the actuators have momentum, and changes of direction are not instantaneous. Gravity may also play a part, and there is friction to be overcome. Frequently actuators are on separate power supplies, which can result in inconsistencies in the voltage. The compositional process, rehearsal and performance of The Space Between Us by David A. Jaffe1, in which Andrew Schloss was soloist on robotic percussion, involved hand-calibrating every note of the robotic chimes, xylophone and glockenspiel. This required 18+23+35 separate hand calibrations and took valuable rehearsal time. The Orchestrion2, in which I was Robotics Technician, had 100+ actuators on stage that needed to be calibrated before every concert, beginning at 9 am and taking the entire day, with sound-check two hours before the concert. Many times during a concert instruments would become uncalibrated, which was difficult to fix while in performance.

1 https://2104310a1da50059d9c5-d1823d6f516b5299e7df5375e9cf45d2.ssl.cf2.rackcdn.com/2014/03/The-Space-Between-Us.pdf
2 http://www.nonesuch.com/albums/the-orchestrion-project

I describe a method for velocity calibration, that is, what voltage should be sent to a solenoid to generate a desired volume and timbre from an instrument. Due to the mechanical properties of solenoids and drums, a small movement in the relative position of these two can lead to a large change in sound output. The most dramatic case is when, during performance, a drum moves out of place enough that a voltage that at the start of the performance allowed the drum to be hit now fails to make the drum sound. Depending on the musical context, this can be disastrous in performance. Good velocity scaling is essential for a percussion instrument to give a natural graduated response to subtle changes in gesture, e.g. a slight increase in the strength (velocity) of a stroke should not result in a sudden increase in the loudness of sound. Issues like velocity calibration or control mapping seem quite pedestrian, or even trivial, until one has grappled with them on real instruments. The ability of a robotic instrument to perceive at some level its own functioning is important in making robust, adaptive systems that do not require regular human intervention to function properly. This ability of the actuator to perceive its own status is referred to here as proprioception. I also describe some experiments recognizing melodic and kinetic gestures at different tempi and with variations in how they are performed. This can be viewed as an exchange of cues established before the performance, especially in an improvisatory context. This allows higher level gestures to be used as cues without requiring exact reproduction from the human performer interacting with the robotic instrument and enables a more fluid and flexible structuring of performances.

5.1.1 Drum Classification for Auto-Mapping

An experiment was performed to investigate the performance of an audio feature extraction and machine learning system that classifies drum sounds in order to perform automatic mapping. The audio features used were the well known Mel-Frequency Cepstral Coefficients (MFCC), calculated with a window size of 22.3ms. These were then used as input to a Support Vector Machine (SVM) machine learning system. A dataset of audio with 4 different frame drums being struck by the robot, with a time of 128ms between strikes, was collected; all the MFCCs of this audio were then calculated, the 8 frames with the highest MFCC0 (roughly corresponding to perceptual loudness) were found, and these were marked as onsets in the audio. The MFCC feature vectors corresponding

to these onsets were used to train the classifier.

Peak offset    Percent correct
0              66.38
1              91.95
2              91.67
3              91.95
4              90.52
5              86.49
6              86.49
7              77.59

Table 5.1: SVM classifier accuracy

A separate test data set was also collected. Percussive sounds can be challenging to classify, as there is not a lot of steady state spectral information. The results of this experiment gave a classification accuracy of 66.38%, as shown in the first line (Peak offset 0) in (Table 5.1). The same experiment was performed again but using different offsets from the highest peak in window sizes of 22.3ms. When all frames were classified using the frame immediately after the highest peak, a classification accuracy of 91.95% was obtained. This result is interpreted to mean that the resonance after the transient is clearly distinguishable for different drums, whereas the transient at the onset is fairly similar for different drums. This performance quickly degrades when moving away from the onset. These results are for individual 22.3ms frames, so it is easy to get 100% correct identification by voting across the entire recording, which can then be used for the automatic mapping. Each solenoid on the robotic instrument is actuated in turn, the audio is classified and then the appropriate mappings are set so that the control software can address the actual frame drums rather than the actuators.
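A condensed Python sketch of this auto-mapping step is given below. It assumes librosa and scikit-learn as stand-ins for the feature extraction and SVM used in the experiment; the frame size, the use of MFCC0 peaks as onsets, the one-frame offset and the voting step follow the description above, while the remaining details (function names, recording format) are illustrative.

    import numpy as np
    import librosa
    from sklearn.svm import SVC

    FRAME = 1024            # roughly 22.3ms at 44.1kHz, matching the window size above

    def onset_mfccs(y, sr, n_onsets=8, offset=1):
        """MFCC vectors taken `offset` frames after the highest-energy peaks;
        MFCC0 serves as a rough loudness proxy for locating the strikes."""
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_fft=FRAME, hop_length=FRAME)
        peaks = np.argsort(mfcc[0])[-n_onsets:]            # frames with highest MFCC0
        cols = np.clip(peaks + offset, 0, mfcc.shape[1] - 1)
        return mfcc[:, cols].T                             # one feature vector per strike

    def train_drum_mapper(recordings, sr):
        """recordings: {drum_name: waveform}. Returns an SVM classifying strikes."""
        X, labels = [], []
        for name, y in recordings.items():
            feats = onset_mfccs(y, sr)
            X.append(feats)
            labels += [name] * len(feats)
        return SVC(kernel="rbf").fit(np.vstack(X), labels)

    def identify_actuator(clf, y, sr):
        """Actuate one solenoid, record the result and vote across frames to
        decide which drum it is attached to (the auto-mapping step)."""
        votes = clf.predict(onset_mfccs(y, sr))
        names, counts = np.unique(votes, return_counts=True)
        return names[np.argmax(counts)]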

Figure 5.2: Calibrated input velocities mapped to output driving velocities

5.1.2 Timbre-Adaptive Velocity Calibration

The acoustic response of a drum, both in terms of perceived loudness and timbral quality, is non-linear with respect to linear increases in voltage as well as to the distance of the solenoid from the vibrating surface. In the past, calibration was performed manually by listening to the output and adjusting the mapping of input velocities to voltage until smooth changes in loudness and timbre were heard. In this section I describe how to derive an automatic data-driven mapping that is specific to the particular drum. The first objective is to achieve a linear increase in loudness with increasing MIDI velocity for a given fixed distance between beater and drumhead. However, in practice, the beater may be mounted on a stand and placed next to the drumhead mounted on a different stand. Thus the distance between beater and drumhead will vary depending on setup, and may even change during a performance. Thus a second objective is to achieve a similar loudness versus MIDI velocity (corresponding to voltage) curve over a range of distances between beater and drumhead. To achieve these objectives audio was collected for all velocity values and three distance configurations (near 1cm, medium 2cm, far 3cm). The loudness and timbre variation possible is captured by computing MFCC for each strike. More specifically, for each velocity value and a particular distance a vector of MFCC values is obtained. The frequency of beating was kept constant at 8 strikes per second for these measurements. The first MFCC coefficient (MFCC0) at the time of onset is used to approximate loudness. Plots of MFCC0 for the distance configurations are shown in (Figure 5.4.a). In order to capture some of the timbral variation in addition to the loudness variation, the MFCC vectors are projected to a single dimension (the first principal component) using Principal Component Analysis (PCA) [63]. As can be seen in (Figure 5.4.c), the PCA0 values follow closely the loudness curve. This is expected, as loudness is the primary characteristic that changes with increasing velocity. However, there is also some information about timbre, as can be seen by the near plot that has higher variance in PCA0 than in MFCC0. The goal is to obtain a mapping (from user input calibrated velocity to output driving velocity) such that linear changes in input (MIDI velocity) will yield approximately linear changes in the perceived loudness and timbre as expressed in PCA0. Data is utilized from all three distance configurations for the PCA computation

so that the timbre space is shared. That way, even though separate calibration mappings for each distance configuration are obtained, they have the property that the same calibrated input value will generate the same output in terms of loudness and timbre independently of distance. In order to obtain this mapping, the PCA0 values for each distance configuration are quantized into 128 bins that correspond to the calibrated input velocities. The generated mapping runs in the wrong direction, i.e. from output driving velocities to calibrated input velocities, and is not an injection (one-to-one function), so it can not be directly inverted. To invert the mapping, for each calibrated input velocity (or equivalently quantized PCA bin) the average of all the output driving velocities that map to it is taken as the output driving value. This calibration mapping is shown in Figure 5.2. (Figures 5.4.b and 5.4.d) show how changing the calibrated input velocity linearly results in a linearized progression through the timbre space (PCA0) and loudness (MFCC0). These graphs directly show the results of this calibration, but it is also possible to fit lines to them. In either case (directly calculated mapping or line fit) the calibrated output changes sound smoother than the original output.
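The inversion step can be summarized with the following Python sketch, which, for brevity, computes the PCA projection and the 128-bin quantization for a single distance configuration (in the procedure above the PCA is computed on the data from all three distances so that the timbre space is shared). The function name and the interpolation used to fill empty bins are assumptions of this sketch.

    import numpy as np
    from sklearn.decomposition import PCA

    def calibration_table(mfcc_by_velocity):
        """mfcc_by_velocity: (128, n_mfcc) array, one MFCC vector per output
        driving velocity (0..127) at a given beater distance.  Returns a
        128-entry lookup from calibrated input velocity to driving velocity."""
        pca0 = PCA(n_components=1).fit_transform(mfcc_by_velocity).ravel()
        # Quantize PCA0 into 128 bins spanning the observed loudness/timbre range.
        edges = np.linspace(pca0.min(), pca0.max(), 129)
        bin_of_velocity = np.clip(np.digitize(pca0, edges) - 1, 0, 127)
        # The raw mapping (driving velocity -> bin) is not one-to-one, so invert
        # it by averaging every driving velocity that lands in each bin.
        table = np.full(128, np.nan)
        for b in range(128):
            members = np.where(bin_of_velocity == b)[0]
            if len(members):
                table[b] = members.mean()
        # Fill empty bins by interpolation so the lookup is defined everywhere.
        idx = np.arange(128)
        good = ~np.isnan(table)
        table = np.interp(idx, idx[good], table[good])
        return np.round(table).astype(int)   # calibrated input -> driving velocity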

5.1.3 Gesture recognition using Dynamic Time Warping

Collaborating musicians frequently utilize high-level cues to communicate with each other, especially in improvisations. For example, a jazz ensemble might agree to switch to a different section/rhythm when the saxophone player plays a particular melodic pattern during soloing. This type of communication through high level cues is difficult to achieve when performing with robotic music instruments. In my performances I have utilized a variety of less flexible communication strategies including pre-programmed output (the simplest), direct mapping of sensors from a performer to robotic actions, and indirect mapping through automatic beat tracking. The final experiments described in this work show how high-level gestures can be correctly identified, in a way that is robust to changes in tempo and pitch contour, and used as cues. This system is flexible and can accept input from a wide variety of input systems. Experimental results with the Radiodrum are shown as well as melodic patterns played on a vibraphone. There has been considerable work done in the area of using Dynamic Time Warping for gesture recognition, including work done by Akl and Valaee [8] and Liu et al. [61]. For the first experiment, the most recent iteration of the Radiodrum system was

used, a new instrument designed by Bob Boie that dramatically outperforms the original Radiodrum in terms of both data rate and accuracy. A professional percussionist generated 8 different instances of 5 types of gestures, which were an open stroke roll, a sweep of the stick through the air, a pinching gesture similar to the pinch-to-zoom metaphor on touchscreens, a circle in the air and a buzz roll. (X, Y, Z) triplets of data were collected from the sensor at a sample rate of 44100Hz and then downsampled to 120Hz to allow comparison of gestures that were on average 1-2 seconds in length while remaining within the memory limits of our computer system. It was empirically determined that this rate captured most of the information relevant to gesture recognition. From this data, the similarity matrix of each gesture to each other gesture is computed. Dynamic Time Warping [115] is used to compute an alignment score for each pair of gestures that corresponds to how similar they are. For each query gesture we return a ranked list based on the alignment score and calculate the average precision for each gesture. As can be seen in (Figure 5.3), gesture identification is quite reliable in both cases. The Mean Average Precisions (MAP) are 0.931 and 0.799.

Figure 5.3: Precision of Radiodrum and vibraphone gestures
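The alignment score itself is the standard Dynamic Time Warping recursion; a compact Python version operating on the downsampled (X, Y, Z) trajectories is sketched below. This is a textbook implementation for illustration, not the exact code used for the reported results.

    import numpy as np

    def dtw_score(g1, g2):
        """DTW alignment cost between two gestures, each an (n, 3) array of
        downsampled (X, Y, Z) samples; lower means more similar."""
        n, m = len(g1), len(g2)
        D = np.full((n + 1, m + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = np.linalg.norm(g1[i - 1] - g2[j - 1])
                D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
        return D[n, m] / (n + m)               # length-normalized alignment score

    def rank_gestures(query, library):
        """Return stored gesture labels ranked by DTW similarity to the query."""
        scored = sorted(library.items(), key=lambda kv: dtw_score(query, kv[1]))
        return [label for label, _ in scored]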

5.2 Marimba platform

DSRmarimbA

Typically electronic music requires speakers, which flatten all the layered voices that might be occurring in the performance into one focal point radiating outward in the direction the speaker is facing, usually situated in a stereo configuration. In an otherwise all acoustic setting these sounds would have a spatialized context, radiating in a 360 degree pattern while modulating each other's character as well. Musical robotics retain the physical, acoustic vibrations of the instrument occurring in a spatial context, while allowing for (at times) unprecedented musical results that humans cannot physically execute as instrumentalists.

Figure 5.4: Loudness and timbre based velocity calibration

The DSRmarimba (DSRm; the name comes from the idea of adding variable Decay, Sustain and Release amplitude envelopes to the bars' Attack) sets out to develop, compile, and embed certain sensing and actuation techniques in order to minimize the often cumbersome footprint of actuated pitched-percussion instruments while preserving conventional playing techniques, so that a performer can play in tandem with the actuated capacity, creating an all-in-one framework (Figure 5.5).

Figure 5.5: DSRmarimbA

The DSRm is meant to be portable and compact, yet robust and flexible as a platform for testing various pitched percussion actuation and interfacing techniques. Design problems that the DSRm addresses for improved functionality and increased flexibility are as follows: low-cost, easy to implement, ready-cased solenoids mounted underneath the bars so that the instrument can still be played conventionally, and

direct signal acquisition for filtering and amplification. A further goal was that the DSRm be autonomous, eschewing PC dependency. Pure Data (Pd) is used; Pd runs on the Beaglebone, being Linux compatible. This affords embedding the computing environment on the DSRm, while taking advantage of Pd's existing DSP libraries. This way, I can develop new custom interfaces for the DSRm that are compatible with the existing system.

5.2.1 Physical Design

1. Hardware Configuration

The DSRm's computer is the Beaglebone. Pd is used to communicate between interfaces and the solenoids, as well as being the DSP environment. Both the DSRm and the Likembe have piezo pickups on them that plug directly into the Beaglebone's audio cape, which provides stereo audio ADC/DAC via two 1/8 in. jacks. Their signals are routed and processed in Pd, then output directly to their own speakers if selected, or into the xylophone bars, as described later. The Likembe has a custom Arduino Nano interface that plugs into the Beaglebone's USB port. This sends the serial data from the sensors into Pure Data for mapping. The Beaglebone communicates with an Arduino UNO, which has a shield and breakout circuit that drives the solenoids via control messages from Pd (Figure 5.6), where a custom sequencer (Fig. 5.7) has been designed to accept messages from the bar hits via triggers detected by the piezos, or from the capacitive touch interface on the Likembe.

2. Mechanical Implementation

The current version of the DSRm is a rapid prototype. Solenoids were spaced appropriately and taped to a ruler (Figure 5.10) which is C-clamped to the frame of the marimba. The solenoids patch into a protoboard that has the Arduino Uno and Beaglebone mounted to it (Fig. 5.8). This board setup is designed to be easily removable and reconfigurable. The board is velcroed to the frame of the marimba for easy removal. Despite its primitive nature the prototype is quite robust and streamlined for the marimba's frame, while allowing for easy access to all the I/O of both the Beaglebone and Arduino for maintenance. The audio mixer is also built on a protoboard and velcroed to the frame, making it easily removable as well. The piezos are taped to the underside of the bars and patched into the mixer respectively.

Figure 5.6: DSRm hardware configuration

Figure 5.7: Solenoid sequencer

3. Solenoid

The main goal was to implement an actuation system that did not interfere much with the natural configuration and playing technique of the marimba. For this purpose a set of 5V solenoids was found. These solenoids have a 4.5mm throw and were placed underneath the marimba bars; they provide strong enough hits (65 gf @ 4mm throw) to the bars to produce audible signals and permit amplification with a good signal-to-noise ratio. The unit is the Sparkfun ROB-11015 and costs only $4.95US each (Figure 5.9).

Figure 5.8: Beaglebone/Arduino/heatsink

Figure 5.9: Solenoid

5.2.2 Electrical Design

1. Solenoid Actuation

A circuit based on Darlington transistors was designed to control the solenoids. An Arduino UNO connected to a sensor shield drives the solenoids. The schematic shown in (Figure 5.11) represents the system's circuit.

Figure 5.10: Ruler and solenoids

A TIP102 Darlington transistor was used, with the base connected to a digital pin of the Arduino, the collector connected to one of the terminals of the solenoid, and the emitter connected to ground. The other terminal of the solenoid was connected to power. A 1N4004 diode was connected in parallel with the solenoid, with its anode connected to the collector of the transistor, to dissipate any remaining energy when the solenoid is turned off. A 1K ohm resistor was connected between the Arduino's digital pin and the transistor's base to limit the current, protecting the pin.

Figure 5.11: Schematic of solenoid actuation

Initially I tried to achieve a more streamlined and enclosed system, so initial testing was done with the Arduino's 5V pin used to power the circuit for the solenoids. Unfortunately this resulted in some unreliable behavior, with some solenoids not being triggered at all. It was decided then to incorporate a dedicated battery pack to power the solenoids. A power switch was incorporated to disconnect the battery pack from the circuit when not in use. The Arduino's firmware consists mainly of a routine that reads incoming data from the serial port that communicates when to trigger the corresponding solenoids. When the system is turned on, the microcontroller configures the digital pins connected to the solenoids to Output mode and triggers each solenoid once to make sure that the circuit is functioning correctly. The internal LED (digital pin 13) on the Arduino lights up while the pins are being set up, to provide visual feedback (Figure 5.12).

Figure 5.12: Arduino code

Figure 5.13: Piezo summing mixer

2. Audio Pickup and Mixer

Piezos are used for direct audio acquisition and a summing mixer has been designed to accept all the signals of each bar, outputting a summed mono signal (Figure 5.13).

3. DSP

This project draws from previous Open Music Computing work, expanding earlier research [79] that continues to be extended as new musical applications arise. The system provides analog-to-digital conversion (ADC) by way of a TLV320AIC3106

codec provided by a Beaglebone expansion board, known as the Audio Cape. This expansion also provides digital-to-analog conversion (DAC) for output. Since this framework provides audio interfacing, the DSRm's mixer is connected directly into the 1/8 in. audio input jack on the Beaglebone, which can receive both the Likembe's and the DSRm's signals independently using a standard dual-mono to stereo adaptor.

Figure 5.14: ADSR

The DSP system uses Pd. Pd facilitates the capture of audio, as well as the rapid prototyping of custom digital signal processing (DSP) techniques. Both the marimba and Likembe have piezo pickups that input directly into the audio cape. Within Pd their signals are routed independently into various filters for creative processing. For testing, a multi-channel audio looper, a multi-tap delay with low-pass frequency filtering and a sinusoidal synthesizer instrument are implemented. The various parameters of each of the modules are controlled via the Likembe and marimba interfaces.

4. Bar Triggers

The piezo signal from each bar, which has been summed for audio routing, can be repurposed as control signal data providing the amplitude envelope (ADSR; Figure 5.14) of each hit independently, by being split and routed into the analog inputs of the Arduino or Beaglebone GPIO (Figure 5.15). This works the same way as conventional drum triggers. In Pd the data is scaled and used the same as any FSR would be for percussion triggering. This offers the possibility of looping musical phrases (MIDI) played by the human that can then continue to be played by the solenoids.
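As an illustration of how the piezo signal can be repurposed as trigger data, the Python sketch below converts a bar's piezo signal into onset/velocity events, in the way FSR data would be scaled in Pd. The smoothing constant, threshold and refractory time are illustrative placeholders.

    import numpy as np

    def bar_triggers(piezo, fs, threshold=0.05, refractory_s=0.05):
        """Convert a bar's piezo signal into (time, velocity) trigger events,
        the way a conventional drum trigger or FSR would be handled."""
        env = np.abs(np.asarray(piezo, dtype=float))
        alpha = np.exp(-1.0 / (0.005 * fs))          # ~5ms peak-hold decay
        for i in range(1, len(env)):
            env[i] = max(env[i], alpha * env[i - 1])
        events, last = [], -np.inf
        for i in range(1, len(env)):
            recovered = (i - last) / fs > refractory_s
            if env[i] >= threshold and env[i - 1] < threshold and recovered:
                peak = env[i:i + int(0.01 * fs)].max()   # peak within 10ms of onset
                velocity = int(np.clip(peak / (env.max() + 1e-12), 0, 1) * 127)
                events.append((i / fs, velocity))
                last = i
        return events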

5.2.3 DSR - bars as speakers

DSR stands for decay, sustain, release. When you strike the marimba you naturally have an attack and a relatively fast decay that cannot be expressively sustained.

With this system one can shape the envelope the same way one would with a synthesizer, only it is all driven from an acoustic audio waveform triggered by the bar hit. A video demo linked below shows the bar hit triggering an oscillator that is played back through the bar. During the video, filters are applied to the signal from the digital oscillator, in which case the parameters are modified via a MIDI interface. It can be imagined that the author's previous work with idiomatic and non-invasive gesture sensing for the marimba could be employed to have a fully dynamic system tailored to the DSRm [114]. The DSRm builds on the work of C. Britt [19].

Figure 5.15: Piezo schematic and response

However, the DSRm has the potential for broader extended acoustic sound design capabilities, as well as the option to acquire a direct signal for filtering possibilities and/or amplification, using the piezos as both sensors and actuators. The EmVibe is only able to produce long sustaining tones with minimal dynamic range. This is a result of driving the bars to reach their self-resonating frequency and offers little control of the dynamics and no possibility of direct signal acquisition of the bars' acoustic sound. Also, in contrast to the EmVibe, the DSRm's digitally extended sounds are not interfered with by traditional striking of the bars by a human, which allows 3 dimensions of playability: 1. with the mallets; 2. with the solenoids; and 3. with the

audio signals. Given that there are 30+ bars on the marimba, theoretically one could perceive there to be that many speakers and/or solenoids, allowing for a wide range of sonic possibilities. The EmVibe's major drawback, besides the lack of dynamic control over the actuated sound, is that the extended sound is dampened and compromised when the bar is struck by the mallet or the sustain pedal closes; this is not the case with the DSRm.

5.2.4 Idiomatic HCI

The majority of existing work in this area has focused on symbolic digital representations of music, typically MIDI. This system is embedded and as a result uses its own custom protocol. The typical architecture of interactive music robots is that the control software receives symbolic messages based on what the other performers (robotic or human) are playing as well as messages from some kind of score for the piece. It then sends control messages to the robot in order to trigger the actuators generating the acoustic sound. In this case the solenoids are interfaced two ways via software: one controlled from the tines of a Likembe (a type of lamellophone) via a custom designed interface, and another controlled from triggers generated by the hits on the xylophone bars by the human performer using piezos. A capacitive touch interface has been created to allow for communication directly with the DSRm via Pd on the Beaglebone via the tines. A press of a tine results in the corresponding marimba bar being struck by the solenoid without interrupting play on the Likembe, creating a symbiotic duo instrumental relationship. This instrument is called El-Lamellophone (El-la) [130] and is a system for lamellophone hyperinstruments. El-la, combined with the sensing paradigms developed specifically for mallet instruments in the author's previous work ([49], [113], [93]), offers a complete sensing and DSP paradigm for pitched percussion providing unprecedented and seamless integration with the computer. This level of custom multi-modal interfacing techniques coupled with the various novel sound production methods creates a complete performance system [73]. The DSRm eschews the laptop and the need for speakers, while retaining the power of DSP filters and synthesis capabilities, which are designed to be interfaced by the traditional marimbist's conventional techniques. For a detailed overview of how the interfacing integrates, see (Figure 5.16).

Figure 5.16: Ella system flow

5.3 Lamellophone Tine Excitation

A system for actuating the tines of the instrument electromagnetically has been developed, but not yet integrated into the existing instrument. This aspect of acoustic actuation was developed in collaboration with Jeff Snyder (Director of Electronic

Music, Princeton University). Experiments, tests, and implementation done on his behalf were contributions towards [131]. This work is primarily based on prior developments by Berdahl [42], McPherson [82], and Britt [110]. McPherson's Techniques and Circuits for Electromagnetic Actuation outlines the possible methods that can be employed for this purpose; this design derives from the force on a ferromagnetic object. Britt [19] suggests that the complexity of the circuits used to drive both McPherson's and Berdahl's electromagnetically enhanced pianos/strings can be reduced when the system is driving an object without significant harmonic overtones. This is possible because a simpler amplifier that lacks a linear response can be used: a square wave input will produce a sine wave output on the physical vibrating body, as the harmonic overtones of the input signal are ignored by the vibrating body. However, in [19], the large mass of the vibraphone bar necessitates a powerful amplifier [130]. In addition to sharing the same enharmonic overtone property as vibraphone bars, the lamellophone tines have added advantages from their low mass and natural ferromagnetic properties, which allow for even greater simplification of the amplifier. The current amplifier design uses a single-ended 12V supply driving a single MOSFET transistor with a freewheeling diode in parallel. This configuration is effective at driving the tines with a usable amplitude, is very low cost, and requires minimal heat-sinking. The most expensive element of the system is the electromagnets (ELMATU0210203, 28AWG), which cost 20 apiece in the quantities purchased for the project. The single-ended nature of the amplifier system also eliminates the need for the 3-state waveform required by the vibraphone bar, so a much simpler PWM output directly from a microcontroller pin can serve as the driving frequency for El-La. Polyphony handling for the current system is an adaptation of the system used by the EM-Vibe, described in [19]. Generation of the multiple PWM waveforms is handled by a series of inexpensive microcontrollers communicating over I2C, a significant advantage in cost over the 8-channel audio interfaces used in the previous systems mentioned.

3 SolenoidCity.com

5.4 Auto-monochord

5.4.1 Introduction

EMMI's AMI [137] inspired the design goals of STARI (Self-tuning Auto-monochord Instrument): a singular, streamlined unit with a self-contained pick-up/amplification system (Figure 5.17). Its actuators feature a picking mechanism similar to the Guitarbot, electro-magnetic actuation and a solenoid beater hitting the strings similar to Trimpin's pianos, a string dampening mechanism, and a fretting system that affords the production of different frequencies.

Figure 5.17: STARI

The motivation for this work is to explore the possibilities of musical-proprioception in autonomous, self-actuated musical instruments. Its intended uses are in computer music pedagogy: physical music computing- sensor technologies, micro-controllers; actuation- motors, solenoids, electro-magnets; embedded DSP- creative audio filter design, MIR, and synthesis; frequency analysis pertaining to music theory and audio perception; rapid prototyping- 3D printing, hardware design/fabrication; and composition and performance. This section details the instrument's ability to actuate itself for analysis and self-tune.

5.4.2 Design Considerations

The monochord (Figure 5.18) demonstrates the mathematics and resultant frequencies of musical pitch. The tonic's (open string) next higher octave is produced by half the string's length (2:1). A third of the string's length sounds three times (3:1) the string's root frequency. This relationship continues for further integer divisions of the string. Our design required the instrument to self-tune to a particular frequency, using one motor to adjust the string tension up or down and one to actuate the string. STARI is meant to be portable and compact, yet robust and flexible as a platform for testing various string actuation methods and interfacing techniques.

Figure 5.18: Monochord

Design problems that STARI addresses for improved functionality and increased flexibility are as follows: AMI received control input via a PC using Max/MSP. A goal was that STARI be autonomous, eschewing the PC dependency. Max/MSP requires a PC and is relatively expensive. Pd is used instead, being free and ubiquitous within the computer music community. Pd runs on the Beaglebone, being Linux compatible. This affords embedding the computing environment on the instrument, while taking advantage of the software's existing DSP libraries for MIR analysis. Like the Guitarbot, AMI is tuned manually. STARI's design goals were to develop a self-aware instrument capable of sensing the frequencies it plays in real-time in order to continually monitor and correct itself autonomously during use, should the instrument detune. This is in response to user experience with the Guitarbot. An important aspect to consider when designing the frequency estimation-to-string actuation process is a tuning threshold (Table 5.2) of acceptable estimation. A certain range of tolerance is necessary when determining whether the string is in tune or not. In a conventional, Western equal-tempered scale (A=440Hz), the octave is divided into 12 discrete semitones. Each semitone consists of 100 cents, allowing for extremely fine tuning. (Table 5.2) shows an example of the relationship between a target frequency and its high (Fhigh) and low (Flow) thresholds, respectively.

Tuning Threshold
F1 = 200Hz (target frequency)
n = 30 cents (threshold)
Fhigh = F1 × 2^(n/1200) = 203.4Hz
Flow = F1 / 2^(n/1200) = 196.5Hz

Table 5.2: String tuning threshold formulas 106
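The threshold formulas in Table 5.2 amount to scaling the target frequency by a factor of 2^(n/1200) in each direction; the short Python helper below makes the calculation explicit (the function name is arbitrary).

    def tuning_band(target_hz, cents=30):
        """Upper and lower acceptance limits for a target frequency, given a
        tolerance in cents (100 cents per equal-tempered semitone)."""
        ratio = 2 ** (cents / 1200)
        return target_hz / ratio, target_hz * ratio

    # The 200Hz, 30-cent example from Table 5.2:
    low, high = tuning_band(200.0)
    print(low, high)        # prints the low and high limits, roughly 196.5Hz and 203.5Hz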

5.4.3 System Description

STARI utilizes a collection of commercially available electrical components, a guitar string and pickup, a micro-controller, a microcomputer, and custom built 3D printed structural components (Figure 5.19). The instrument is implemented using a motor driven tuning mechanism. The string's signal is acquired via a passive single coil electro-magnetic guitar pickup and routed to the Beaglebone for analysis and processing. The signal is output to a speaker for amplification. The string is actuated by a standard guitar pick fitted to a servo motor and controlled natively from Pd or remotely via an external interface. The string tunes itself using the embedded analysis software determining the frequency. Control signals are sent to the stepper motor to tighten/loosen the string, raising/lowering the pitch respectively, until reaching the target pitch. (Figure 5.19) shows a flowchart of STARI's operation.

2a Mechanical Design

Figure 5.19: STARI: flowchart

A string tensioning system fixed to a board measuring 25 x 4 x .75 in. was designed. The bridges (Figure 5.20) were custom designed (using the free Autodesk Inventor Fusion program), 3D printed and spaced 14 in. apart. The string was fixed to the board through a hole, with the string's stopper anchored against the underside of the body. The other end was wound onto a standard tuning peg mounted to a custom bracket designed to withstand the tension of a tuned string. Because this force was relatively severe, a substantial bracket was required to fully brace the tuning peg. The end of the peg meant for turning was fitted with a custom bracket fixed to the stepper motor axis. The stepper motor presented substantial body vibration, requiring a secure enclosure to brace it to the wood.

Figure 5.20: STARI: bridge

Large, dense rubber feet were selected to further reduce the vibrations, in an effort to reduce the resulting noise produced when in use. (Figure 5.21) shows the 3D printed mounts. An H-bridge is mounted next in line, bolted to the wood foundation. The servo motor required a bracket for mounting. We found that it oscillated, interfering with the motor's performance and ability to pluck the string. We solved this by bolting it to a 3D printed bracket (Figure 5.22). It was mounted just off center from the string with just enough reach for the pick to pluck the string sufficiently. A small slit was cut into the arm of the motor so a pick could be fitted and glued on. This was a convenient solution and prevented having to alter the structure significantly. The wiring from the stepper motor was wrapped in heat shrink to keep the unit tidy and secure. The battery pack was zip-tied to the body of the wood through holes drilled into the body, and the Beaglebone was mounted above (Figure 5.23). The guitar pickup is placed near the bridge opposite the stepper motor to avoid magnetic noise interference and is plugged into the Beaglebone's audio cape4 via a 1/8 in. connector. The Arduino and custom shield were housed in a canopy designed without need for permanent mounting so the user can easily maintain the circuit (Figure 5.24). The backside has a small opening to access the USB port of the Arduino and the front is open so the cables can be easily managed. Five LEDs are mounted for real-time visual feedback of the system's state, as well as an LCD display for graphical feedback.

4 https://elinux.org/Beagleboard:BeagleBoneCapes

Figure 5.21: Motor enclosure, tuning mount, bridge

Switches provide a simple, proprietary interface for testing. The device can also be interfaced remotely via the Beaglebone's USB port.

Figure 5.22: Servo/guitar pick and h-bridge

2b. Electrical and Software Design

The operation of the self-tuning string system is coordinated by a Single-Pole-Double-Throw (SPDT) switch (S1 in Figure 5.25). The switch's center position is the default mode, where power to the motors is cut, thereby eliminating the idle noise they

produce when not activated. The switch selects between Reset (L position) and Tuning (R position) operation modes. In Reset mode the system unwinds the string, preparing the system for tuning operation. Tuning mode is the system's main operation mode. In this mode the process of plucking the string, detecting its frequency and adjusting its tension takes place. To better understand the process involved, the system has been broken down into an Input Stage and an Output Stage.

Figure 5.23: Beaglebone, battery, pickup, pick unit

The Input stage deals with the capture of the signal from the string and subsequent processing to determine its frequency. The Output stage deals with the control and actuation of the motor to adjust the string's tuning. The Output stage is implemented using the Arduino UNO board.

Figure 5.24: Canopy model/interface

(Figure 5.25) shows the schematic for the current build of the functioning system. (Table 5.3) shows the pin configuration for the associated schematic and current system build.

Figure 5.25: System schematic

Input Stage

In this stage the string is plucked and a transducer captures the signal. The transducer is connected to a processing unit, which performs operations to determine the frequency. Noise coming from the stepper and servo motors interferes with the string's audio signal. This is avoided by using a single coil guitar pickup, electromagnetically converting the string's vibration to an electric signal. This type of transducer is independent of acoustic signals and thus won't pick up errant environmental noise; however, it did present the magnetic noise from the motors. This did not interfere with the analysis software, but introduced noise in the audio output. The frequency estimation expands previous research [79], which continues to be extended as new digital musical applications arise. The framework provides analog-to-digital conversion (ADC) by way of a TLV320AIC3106 codec provided by a Beaglebone expansion board, known as the Audio Cape. This expansion also provides digital-to-analog conversion (DAC) for output. Since this framework provides audio interfacing, the transducer is connected directly into the 1/8 in. audio input jack on the Beaglebone. The flowchart in (Figure 5.26) shows the process and components for this approach. The framework uses the robust, object oriented audio programming language Pure Data (Pd). Pd is open-source and free. Programming is done in a graphical environment called a patch by way of interconnecting processing objects, resembling the patching paradigm in analog audio signal chains (or, even more simply, a flowchart). Being oriented towards audio applications and given its relative ease of use (as compared to the low-level C programming typical of embedded DSP), Pd facilitates the capture of audio, as well as custom digital signal processing (DSP) techniques such as frequency estimation.

Figure 5.26: Beaglebone flowchart

Figure 5.27: Control message management in Pd

The Pd patch continuously analyzes the audio from the codec and compares the estimated frequency to the target frequency. Simultaneously, a serial connection is created to communicate from Pd to the Output Stage to control motor actuation. The frequency difference is translated to a simple control message system with three possible states: tune up, tune down, don't tune. (Figure 5.27) shows how the control messages are generated based on frequency estimation in Pd. After the control message is sent from Pd to the Output Stage via the serial connection, the latter needs to send an acknowledgement message to allow Pd to send

another control message. A frequency error tolerance is incorporated in the Pd patch as well, based on the tuning threshold. (Figure 5.28) shows the stage corresponding to the tuning threshold. The frequency detection in Pd was consistently within 3Hz of the target frequency, given the thresholding settings.

Figure 5.28: PureData patch for tuning thresholding

Fiddle, a standard object for frequency estimation in Pd [77], was used. Fiddle continuously estimates the pitch and amplitude, as well as provides sinusoidal peak analysis. The incoming audio from the plucked string is analyzed and its fundamental frequency is estimated. Once the patch detects that the string has been plucked, the difference between detected frequency and desired frequency is calculated, and translated into a control message. The way Pd detects that the string has been plucked is by amplitude onset analysis. Bonk (another object described in [77]) detects onsets and sends a message when the audio passes a user defined threshold. In this case, the message is set to engage the tuning apparatus. (Figure 5.29) depicts the onset and frequency analysis stage.
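The decision logic carried out by the patch can be summarized in Python as follows; the returned strings are placeholders for whatever serial messages the Pd patch actually emits.

    def tuning_message(estimated_hz, target_hz, cents=30):
        """Three-state control decision mirroring the Pd patch: compare the
        estimated pitch against the tuning threshold around the target."""
        ratio = 2 ** (cents / 1200)
        low, high = target_hz / ratio, target_hz * ratio
        if estimated_hz < low:
            return "tune up"        # string is flat: tighten
        if estimated_hz > high:
            return "tune down"      # string is sharp: loosen
        return "dont tune"          # within tolerance: cut power to the motors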

Output Stage

The output stage refers to the system's actuation of the string via the servo motor and activation of the stepper motor to tune it to the target frequency. In both approaches this stage deals with taking the control signal, produced from the estimated difference between the target frequency and the frequency of the string, and turning the stepper motor to adjust the tension of the string. This stage also includes the appropriate visual feedback associated with the tuning operation (Table 5.3).

Figure 5.29: Frequency estimation/onset analysis

Stepper Motor

The stepper motor turns the tuning peg and maintains string tension. A bipolar stepper motor (42BYGHW811) was chosen, which easily drives the unloaded tuning peg. An estimate of the maximum tension that the motor can withstand can be made by studying the physical properties of the vibrating string. It is known that the fundamental frequency of a tensioned string is equal to the velocity of the wave divided by twice the length of the string, with the velocity of the wave equal to the square root of the string tension divided by its mass per unit length. Since the parameter of interest is the tension T, (Equation 5.1) is as follows5:

T = (4.45 × (2 × L × f)^2 × UW) / 386.4 (5.1)

where f = frequency in Hz; L = string length from bridge to nut; and UW = Unit Weight, or Mass per Unit Length (MPL), of the string. In this project, L = 35.5cm and f = 261.6Hz (middle C). The guitar string that was available is the D'Addario PL019, which has UW = 0.01428g/cm. Solving the equation above, the resulting tension needed for the string is 5023.75gf.

5 www.daddario.com/upload/tensionchart13934.pdf

Pin#    Description
0       Red High LED
2       Servo signal
3       Yellow High LED
4       Tune Mode (switch SW2)
5       Reset Mode (switch SW2)
6       Green High LED
8       Stepper Motor 'In 1'
9       Stepper Motor 'In 2'
10      Stepper Motor 'In 3'
11      Stepper Motor 'In 4'
12      Yellow Low LED
13      Red Low LED

Table 5.3: STARI: GPIO assignments

Motor torque was calculated using the following (Equation 5.2):

τ = F × r (5.2)

The length of the lever arm r is 2 cm, represented by the motor's coupling device to the tuning peg. The stepper motor has a rated torque of 4240gf.cm, which results in a force of 8400gf. It was determined that this motor is more than capable of driving the tuning peg past the required string tension to reach the target frequency.
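The tension figure quoted above can be reproduced directly from the physical relation stated earlier (f = v/2L with v = sqrt(T/mu)), which rearranges to T = mu(2Lf)^2. The short Python check below uses the values from the text (L = 35.5cm, f = 261.6Hz, UW = 0.01428g/cm); the function name and unit conversions are this sketch's own.

    G = 9.80665                                 # m/s^2, to express force in gram-force

    def string_tension_gf(length_m, freq_hz, uw_g_per_cm):
        """Tension required for a string of linear density `uw` to sound
        `freq_hz` at speaking length `length_m`, from T = mu * (2*L*f)^2."""
        mu = uw_g_per_cm * 1e-3 / 1e-2          # g/cm -> kg/m
        tension_n = mu * (2 * length_m * freq_hz) ** 2
        return tension_n / G * 1000             # Newtons -> gram-force

    # Values from the text: L = 35.5cm, f = 261.6Hz (middle C), UW = 0.01428g/cm.
    print(round(string_tension_gf(0.355, 261.6, 0.01428), 1))   # ~5023 gf, as quoted above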

H-Bridge

To drive the motor, a ready-made dual motor driver based on the L298N H-bridge was chosen. The order in which the wires coming from the motor connect to each of the Out headers of the driver is as follows: (Red = 1, Blue = 2, Black = 3, Green = 4). Another feature is the user-accessible 5V regulator, which in this case has been enabled to power the Arduino from the battery coming into the driver rather than from the USB port, making the system autonomous, without need of an external power source. In the output stage, the stepper motor is set to move ±15 steps for every control message coming from the Beaglebone board. This parameter ensured a controlled variation of the string's tuning and a more accurate result.

Chapter 6

Contributions to other musical contexts

6.1 Extended Framework

In this section, additional contributions beyond the pitched percussion family of instruments are described. Themes and ideas described in this thesis are used to inform the design of a reconfigurable novel guitar controller, a minimally-invasive sensing platform for the trumpet, and a graphical user interface for geometric rhythm theory. These contributions were collaborations. The work on guitar augmentation was developed by Duncan MacConnell, an undergraduate research assistant in computer science at the time. The trumpet research was developed by Master of Engineering research assistant Leo Jenkins. In both projects my role was as design engineer, contributing to high-level theoretical design concerns, sensor implementation, and/or mapping strategies.

6.1.1 Guitar

RANGE- Reconfigurable Autonomous Novel Guitar Effects

The RANGE guitar is a minimally-invasive hyperinstrument incorporating electronic sensors and integrated DSP. It introduces an open framework for autonomous music computing that eschews the use of the laptop on stage. The framework uses an embedded Linux microcomputer to provide sensor acquisition, analog-to-digital conversion (ADC) for audio input, DSP, and digital-to-analog conversion (DAC) for audio output.

The DSP environment is built in Pd. The sensors selected can be mounted in a variety of ways without compromising traditional playing technique. Integration with a conventional guitar leverages established techniques and preserves the natural gestures of each player's idiosyncratic performing style. The result is an easy-to-replicate, reconfigurable, idiomatic sensing and signal processing system for the electric guitar requiring little modification of the original instrument.

Electric guitar players have utilized audio effects since their inception. An extensive variety of DSP guitar effects are offered commercially, some of which even provide a code environment for user modification of DSP algorithms1; however, in most cases the functionality of these devices is specific and their programmability is limited. These commercial audio effects are typically implemented either as foot pedals or as separate hardware devices. An alternative is the use of a laptop and audio interface to replace the dedicated guitar effects. This approach is generic in the sense that any audio effect can be implemented as long as the computer is fast enough to calculate it in real-time. Using a laptop is also completely open, flexible, and programmable. However, such a setup requires more cables, more power, and is cumbersome to transport and awkward on stage [36]. In both of these cases (dedicated hardware or laptop) the control of the effects is separated from the actual guitar playing, as shown in (Figure 6.1).

Figure 6.1: Guitar/FX interaction

The guitar has been a subject of augmentation, perhaps, since its advent. Historic examples include the Guitar Organs, the analog synth controller Stepp DGX MIDI guitar2, and the Roland G-3033. Systems for mobile devices acting as the DSP/sensor interface host have also become common [139]. Augmented guitars have also been explored. MIT's Chameleon Guitar has multiple soundboards, each equipped with piezo sensors and DSP filtering to simulate the

1https://line6.com/tcddk/
2http://www.muzines.co.uk/articles/stepp-dg1-electronic-guitar/253
3http://www.joness.com/gr300/G-303.htm

guitar tones offered by different woods [153]. Another example, the Moog Guitar, is an electric guitar with onboard sliders that control augmentation of the guitar's traditional sound by sending electro-magnetic energy into the strings. This allows for infinite note sustain, while similarly pulling energy from the strings creates short staccato sounds. Edgar Berdahl introduced a similar idea in his Feedback Guitar [42]. The DUL Radio [74] from the Center for Digital Urban Living at Aarhus University, Denmark, is a wireless accelerometer sensor package designed for artists to use with Pd or Max/MSP. The group demonstrates the device by attaching an accelerometer to the headstock of the guitar for 3D gesture tracking. Unfortunately it was not possible to integrate it with the RANGE system because of the Linux requirement, which wasn't supported by the DUL drivers at the time of this project.

MOTIVATION

In designing an augmented guitar instrument, consideration must be taken to ensure the extensions do not inhibit traditional guitar technique. Effort should be made to create intuitive control interfaces that take advantage of the guitar player's natural performance technique. Traditional audio effect units and commercial DSP solutions tend to disregard this, forcing the musician to interact with musical parameters by way of non-musical gestures: turning a knob or adjusting a fader [51]. This conflicts with the normal gestural interaction, fails to convey any meaningful event information, and can even act as a distraction for the audience [88]. There has always been a union of guitar and effect despite a separation of guitar playing and effect control. To address this issue, minimally invasive sensors have been integrated on the body of the guitar to allow natural and intuitive DSP control. The RANGE system was designed for use in performance contexts to allow guitar players more expressivity in controlling DSP effects than conventional pedal controllers provide. The proximity of the sensors to the guitarist's natural hand position is important, as it allows the guitarist to combine DSP control with traditional guitar playing technique. Like the Moog Guitar, the sensors sit flat on the guitar body, eliminating any interference with a guitarist's performance technique. Further, we have reduced the hardware dependencies, cabling, and power requirements to a minimal footprint. Design goals were motivated by the desire to shift away from the cumbersome and distracting laptop on stage in exchange for a smaller, open architecture. This framework

is designed to take advantage of low-cost electronic components and free open-source software, facilitating reconfiguration and adaptation to the specific needs of different instruments and musicians.

SYSTEM DESCRIPTION

The RANGE system builds on previous work developing robust hyperinstrument prototypes, and provides audio input, output, and sensor acquisition. The system itself provides a completely open platform for designing and testing hyperinstruments. The hardware and software components that comprise this system are modular in nature and the configuration is designed so that any user can adapt this work for their own use. For this implementation, the sensor interface consists of three membrane potentiometer strips mounted on the body of the guitar which feed into the analog inputs of the embedded Linux computer. The Beaglebone is used for its flexibility and low cost. The guitar's audio signal goes directly into the Beaglebone for analysis and processing, using an Audio Cape for ADC/DAC. Pitch tracking is performed on the incoming audio signal using Fiddle [105] and used to generate control data. The control data and DSP are managed in Pd. The potentiometer outputs are mapped to continuous controller values that modify the parameters of effects, oscillators, and filters. These potentiometers offer the guitarist a broad range of interface solutions and sound design possibilities in a small embedded format that has previously been possible only with a laptop. The system has a relatively low cost (all prices in US dollars at the time of this project): Beaglebone (89), audio cape (58), and membrane sensors (3 × 13 = 39), for a total of 186 USD. (Figure 6.2) shows a schematic diagram of the system.

1. Analog Sensor Input

Membrane potentiometers are a common sensor for capturing musical data, and are often incorporated into hyperinstrument design. Adrian Freed [47] provides a detailed look at force sensing resistors, membrane potentiometers, and other sensors. His Many and DuoTouch Augmented Guitar Prototype provides simple and elegant circuit solutions to achieve desired sensor behaviour for musical applications (Figure 6.3). The RANGE guitar is equipped with three 50 mm SoftPot membrane potentiometers. These sensors are arranged on the body of the guitar, near the volume and tone

Figure 6.2: Schematic of RANGE

Figure 6.3: Adrian Freed's augmented guitar prototype

controls. This arrangement allows the guitarist to easily access the sensors, and the orientation affords comfortable interaction. The sensors are limited to the body of the guitar, corresponding to the expressive hand of the guitar player. The expressive hand, responsible for the rhythm and dynamics of the guitar, is most suited for acute sensor control. In addition to the three touch sensors, toggle switches are also mounted on the guitar to provide a simple method for switching software state. Traditional potentiometers use the position of a sliding wiper to determine resistance. Membrane potentiometers function similarly, providing a variable resistance level based on the position of the user's finger. The main difference is that membrane potentiometers only allow current to flow when the membrane is pressed, and so the value is lost when the user's finger is removed. The RANGE's sensor input behaviour

must be consistent and stable in order to be used musically. Specifically, the instrument design requires that the analog input values remain when the membrane is not pressed. In order to secure a stable and usable signal from the membrane potentiometers, pull-up resistors are used. These pull the analog input to a known value when the finger is removed and the potentiometer goes open circuit. A simple software solution within Pd detects when the membrane is open, and the previous buffered value is retained. (Figure 6.4) shows the corresponding circuit.
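A minimal sketch of this value-hold behaviour, assuming a 12-bit ADC and an open-circuit reading near full scale (the threshold used here is an assumption, not a measured value from the thesis):

# Minimal sketch of the membrane value-hold described above. With the pull-up in place,
# an untouched membrane reads near full scale, so readings above OPEN_THRESHOLD are
# treated as "finger removed" and the previous value is kept. The threshold is assumed.
ADC_MAX = 4095            # 12-bit ADC full scale
OPEN_THRESHOLD = 4000     # readings above this are treated as open-circuit (assumed)

class HeldPot:
    """Buffers the last pressed value of one membrane potentiometer."""
    def __init__(self):
        self.last_value = 0

    def update(self, raw):
        if raw < OPEN_THRESHOLD:      # membrane pressed: accept the new reading
            self.last_value = raw
        return self.last_value        # otherwise retain the buffered value

    def normalized(self):
        """0..1 value suitable for mapping onto an effect parameter."""
        return self.last_value / ADC_MAX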

Figure 6.4: Membrane potentiometer circuit

2. Hardware, Software, and Latency

The RANGE uses a Beaglebone microcomputer, which provides on-board GPIO and

ADC pin access, as well as UART, I2C, and SPI. It features a 600 MHz ARM Cortex-A8 core using the ARMv7 architecture, as well as full USB and Ethernet support. The Beaglebone is becoming widely supported in the embedded computing community [13][83], and many expansion Capes4 are being developed to provide an array of hardware interaction opportunities. The Beaglebone Audio Cape provides audio input and output by way of two 3.5 mm connectors, and supports sampling rates up to 96 kHz for capture and playback by way of the cape's TLV320AIC3106 codec. The system described provides a complete DSP platform, allowing users to connect to the Beaglebone via ethernet for rapid interface prototyping.

4https://beagleboard.org/cape

For this iteration, Ubuntu 12.04 was used, which facilitates Pd installation and interfaces well with the audio codec provided by the Beaglebone Audio Cape. To provide access to the Beaglebone's GPIO and ADC pins, a Pd external has been developed. The ADC provides 12-bit values, which are accessed by the external by directly reading the corresponding files in userspace. The external is designed to report analog and digital pin values each time the object receives a bang message. In this way, pin values can be obtained at any rate, and can be coupled with other musically timed events within the patch. The control data is not altered in any way by the external that retrieves it, as it is meant to make any sensor's data available within Pd. Control of digital audio effects has been a desired function of the RANGE from its inception. The stable sensor values allow for reliable control, while placing the sensors directly on the body allows the guitarist faster and more intuitive interaction. Analog values obtained by the Pd external can be scaled and mapped to any control. Therefore specific effect parameters (delay length, feedback level, filter frequency, etc.) are adjusted by the touch potentiometers. This simple prototyping system is robust yet offers a lot of flexibility and potential. The potentiometers can also provide an intuitive control interface for synthesis applications. Some novel applications include controlling oscillator frequency, filter frequency/bandwidth, MIDI note attributes, and envelope values (attack, decay, sustain, release). This allows the RANGE to be used as a versatile synthesizer controller while the guitar can still be played as usual. (Figure 6.5) shows an example mapping, with sensor input controlling a typical electric guitar effect chain on the left and common DSP applications on the right. In contrast to pure controller approaches that utilize a laptop for DSP, RANGE's goal is to use the Beaglebone for both control and DSP. Many modern guitar effects are actually implemented internally using a dedicated embedded DSP chip even though to a guitar player they appear similar to traditional analog pedals. RANGE makes this DSP functionality accessible, providing a wide range of possibilities for both digital audio effects and their control. In order to be a viable platform for this purpose it is critical that the overall system latency is appropriate for music applications. RANGE provides simple sensor and audio throughput, using Pd with the ALSA API. For latency tests, it is important to perform measurements under different system (CPU/DSP) load applications [78]. For this system, all tests were performed in the "normal state" (audio throughput, no effects processing) as well as the "use state"
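The external itself is a C object inside Pd; the Python sketch below only illustrates the read-on-demand idea of polling the ADC's userspace files and scaling the result onto a control range. The sysfs path shown is typical of BeagleBone ADC drivers but is an assumption, since the exact kernel interface is not given here.

# Python analogue of the "read the ADC file on a bang" behaviour of the Pd external.
# AIN_PATH is an assumed, BeagleBone-typical sysfs location, not taken from the thesis.
AIN_PATH = "/sys/bus/iio/devices/iio:device0/in_voltage{ch}_raw"

def read_adc(channel):
    """Return the raw 12-bit reading for one analog input channel."""
    with open(AIN_PATH.format(ch=channel)) as f:
        return int(f.read().strip())

def scale(value, out_lo, out_hi, in_lo=0, in_hi=4095):
    """Map a raw ADC value onto an arbitrary control range (e.g. delay time in ms)."""
    value = min(max(value, in_lo), in_hi)
    return out_lo + (value - in_lo) * (out_hi - out_lo) / (in_hi - in_lo)

def on_bang():
    """Equivalent of the external reporting all three sensor strips on a bang."""
    return [scale(read_adc(ch), 0.0, 1.0) for ch in range(3)]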

Figure 6.5: Pd guitar fx

(audio throughput, effects processing). In the normal state, audio latency for the system corresponds to the audio delay set by Pd. With Pd set to a 10 ms audio delay, there is a total system delay of 10 ms. With effects engaged, the use-state latency is measured at 12 ms. This difference can be attributed to the system load increase from the signal processing. For pitch analysis, normal and use state behaved the same, measuring a total pitch analysis latency of 15 ms. All results reflect usable latency levels for audio applications, as they approach the general latency goal of 10 ms ([78], [101]). (Figure 6.6) shows the audio and pitch-tracking latency.

6.1.2 Trumpet

EROSS- Easily Removable, wireless Optical Sensing System

This section introduces a minimally-invasive, wireless optical sensor system for use with any conventional piston valve acoustic trumpet. It is designed to be easy to install and remove by any trumpeter. The goal is to offer the extended control afforded by hyperinstruments without the hard-to-reverse or irreversible invasive modifications that are typically used for adding digital sensing capabilities. Optical sensors track the continuous position displacement of the three trumpet valves; these values are transmitted wirelessly and can be used by an external controller.

Figure 6.6: Audio latency measurements

Figure 6.7: EROSS

The hardware has been designed to be reconfigurable by having the housing 3D printed so that the dimensions can be adjusted for any particular trumpet model. The result is a low cost, low power, easily replicable sensor solution that offers any trumpeter the ability to augment their own existing trumpet without compromising the instrument's structure or playing technique. The extended digital control afforded by EROSS is interwoven with the natural playing gestures of an acoustic trumpet. This seamless integration is critical for enabling effective and musical HCI (Figure 6.7).

BACKGROUND

As discussed, hyperinstruments are expanded acoustic musical instruments that use digital sensors to capture detailed aspects of the performer's gestures as well as provide additional expressive capabilities through digital control. They enable the musician to access the diverse possibilities afforded by digital control while at the same time leveraging the skill developed through years of training by professional musicians. Since the concept was introduced, a variety of different hyperinstruments, designed for the main musical instrument families, have been proposed. Early representative examples include: the Hypercello (strings) [102], the Metasaxophone [21] and Hyperflute (winds) [44], the Morrison Digital Trumpet5, and the Cook/Morrill trumpet controller (brass) [100]. Any acoustic instrument can serve as the basis for designing and building a hyperinstrument, and the concept has also been applied to non-western instruments such as the sitar [2] and the Gyil African xylophone [113]. Despite the potential of hyperinstruments to enable extended performance techniques, their adoption has been slow and idiosyncratic. This is mostly due to the following factors:

1. they are expensive custom-made devices that are hard to replicate (typically there is only a single instance in existence);

2. they frequently require invasive modifications to the already expensive (both in cost and skill) original acoustic instrument;

3. their operation and maintenance requires significant technical expertise with electronics.

The focus of the work presented herein is to augment traditional trumpet playing with digital control. There is a long history of approaches that have been proposed to achieve this. The most generic approach is to use external controllers such as foot pedals, knobs, sliders, and keyboards to control digital processes as well as filters/effects on the audio signal produced by the horn. An iconic example of this approach was the use of electric guitar effects by Miles Davis to modify the sound of the trumpet during his electric phase in the 1970s [37]. As any controller can be used for this purpose, this approach provides a very rich set of control possibilities to the trumpet player. In addition, it requires no modification to the acoustic instrument other than attaching a microphone to the horn. However, interacting with the

5http://www.digitaltrumpet.com.au/

controllers is external to the trumpeter's technique. This creates additional cognitive load and is similar to trying to play two instruments at the same time. An alternative approach is to directly introduce sensors for digital control on the body of the trumpet, making it a hyperinstrument. A good overview of different approaches and sensors that can be used for augmenting the trumpet can be found in the Master's thesis of Thibodeau [125]. The pioneering Cook/Morrill trumpet controller [100] was built to create an interface for trumpeter Wynton Marsalis. Sensors on the valves, mouthpiece, and bell enabled fast and accurate pitch detection and provided extended computer control. Another well-known example is the Mutantrumpet, a hybrid electro-acoustic instrument designed and evolved over many years by composer and inventor Ben Neill [92]. The Mutantrumpet started as an acoustic instrument (three trumpets and a trombone combined into a single instrument). In the mid 1980s electronics were integrated into the instrument in collaboration with synthesizer inventor Robert Moog [148], and in the 1990s the instrument was made computer interactive. By attaching the sensors directly on the instrument, the digital control is more easily accessible by the player and therefore is more naturally integrated with the traditional way of playing the instrument. However, each instrument is unique, idiosyncratic and custom-made, which makes replication and therefore adoption difficult. The modifications to the instrument are extensive (sometimes even radically modifying the original instrument, as in the case of the Mutantrumpet) and the sensing apparatus is hard to remove when not required. Finally, the detailed design plans and component parts of most augmented trumpets (and hyperinstruments in general) are not publicly available. Here a low-cost, easily removable and minimally invasive optical sensing system (EROSS) is presented that provides continuous control data from the position of the valves on the acoustic trumpet. The housing has been 3D printed, making it fully customizable for any standard trumpet. The sensors and housing bracket mount under the valves without obstructing the conventional playing position, and the two pieces are connected together via magnets without altering the horn's structure or requiring cumbersome fastening. The approach has been inspired by the Electrumpet [71], which is an enhancement of a normal trumpet with a variety of electronic sensors and buttons. The Electrumpet can be attached and detached to any Bb trumpet. It is common for trumpet players to attach various types of mutes to their horn. It is hoped that making the sensing apparatus no more difficult to attach and remove than a mute will facilitate adoption. Like the Electrumpet, EROSS's design plans are

available, the sensing apparatus is removable, and we utilize wireless communication. The EROSS approach is more naturally integrated, as it focuses on providing continuous control data from the positions of the three valves used for playing. In contrast, the Electrumpet provides additional valve-like potentiometers and buttons that are not part of regular trumpet playing. The use of magnets and 3D printing for the housing is an additional difference and a contribution of EROSS. For sensing the continuous valve positions, optical sensing is utilized, unlike the variable-resistor potentiometers used in the Electrumpet. This optical sensing technique has been proposed in the design of another augmented trumpet, which was done as a course project at Cornell University [32]. Unlike our approach, it was used for discrete rather than continuous control. (Figure 6.7) shows EROSS mounted on a trumpet.

MOTIVATION AND DESIGN

A long-term objective for this research is to design and establish a robust, low cost, reconfigurable sensing hardware and software framework that can be adapted to various musical instruments in order to give them hyperinstrument capabilities without requiring invasive and hard-to-reverse modifications to the acoustic instrument. There is a strong need for such a framework because of the high cost and hard-to-replicate nature of existing hyperinstrument designs. Advances in open source software, electrical sensors, micro-controller frameworks such as Arduino, and 3D printing open the possibility of creating reusable, adaptable designs that can easily be applied to existing instruments without requiring extensive technical expertise, while leveraging traditional playing technique. It is hoped that as electronics on stage become more ubiquitous in many musical genres, approaches like EROSS will reach larger communities of users. The proposed trumpet sensing system is a good case study of how such a framework can be used for instrument augmentation.

A. Design Considerations

The proposed system, compared to other augmented trumpets that have been described, is minimal. It only provides continuous control information from the position of the three piston valves that are used for playing. This design decision was influenced by the design principles for new musical interfaces outlined by Cook [31] that emphasize simplicity ("Programmability is a curse" and "Smart instruments are often not smart") and integration with existing playing techniques ("Copying an instrument

is dumb, leveraging expert technique is smart"). The system was also designed to be minimally invasive and easily removable, for the reasons outlined above. The first consideration is that the trumpet should remain physically unmodified, without even removing the valve bottom caps. These caps have a hole at the center, which works to our advantage. The sensor is located such that the IR emitter shoots a pulse through this hole and the reflection off the bottom surface of the piston is detected by the photo-pin-diode. All trumpet pistons are hollow cylinders with cylindrical tunnels welded across and an irregular bottom surface with a hole in its center, which is concentric to the hole in the valve bottom cap. This configuration makes the optical path for the sensor multifaceted. When not pressed, the piston is at its maximum distance from the sensor and its bottom surface hole represents a small portion of the area covered by the sensor's detection zone. When fully pressed, it is at the minimum distance from the sensor and the hole represents the majority, if not all, of the area covered by the detection zone, making it difficult to sense the appropriate distance to the piston's surface. (Figure 6.8) depicts the valve's configuration and its interaction with the optical sensor.

Figure 6.8: Sensor detection zone

Another challenge was the idea of having a system that could be easily attached and detached to the trumpet, but at the same time be sturdy enough to keep the sensors as steady as possible beneath the valves' bottom caps. Optical sensors tend to have high sensitivity, and even the slightest movements and/or temperature changes can affect the system's accuracy. To compensate for this issue, a calibration stage needed to be included when designing the software. The system also has to be mounted in a way such that the performer's natural grasping technique is not compromised.

B. Sensor Placement

A critical aspect of the design is the placement of the optical sensors for determining the valve positions. This section describes the empirical investigation that was carried out to determine the optimal placement according to a set of design constraints. Five candidate locations near the bottom cap were considered (indicated as SL1, ..., SL5 in Figure 6.9). The optical sensor consists of an emitter and a detector. For each candidate location, the measured optical sensor readings for eight valve positions

(indicated as VP1, ..., VP8 in Figure 6.10) were collected. The full range of possible valve displacement was divided linearly into these eight valve positions, where VP1 is the fully released valve position and VP8 is the fully pressed valve position.

Figure 6.9: Sensor Locations (SL)

Figure 6.10: Valve Positions (VP)

The measured optical sensor readings are expected to be noisy. Thus, to determine the best sensor placement, multiple measurements at each candidate location were needed to obtain the noise distribution, along with a clear set of criteria by which to score each location. The vector of measurements for each configuration is denoted as

x_l^p = x_l^p[1], ..., x_l^p[1000], where l = 1...5 corresponds to the sensor location and p = 1...8 corresponds to the valve position. For each configuration (l, p) the following statistics are calculated:

µ_l^p = (1/N) Σ_{i=1}^{N} x_l^p[i]

σ_l^p = sqrt( (1/(N−1)) Σ_{i=1}^{N} (x_l^p[i] − µ_l^p)² )

ρ_l^p = |max(x_l^p) − min(x_l^p)|

where N = 1000, µ_l^p is the sample mean, σ_l^p is the sample standard deviation, and ρ_l^p is the range. For each configuration, the number of measurements C_l^p from x_l^p that fall within ±σ_l^p of the mean µ_l^p is calculated; any configuration for which C_l^p was less than 70% was rejected. Due to the complexity of the optical path, three criteria were used to score the possible locations. The first is Linearity, because it is desired that the system's transfer function (valve position vs. output data) be as linear as possible. The second is Dynamic Range, to ensure a better signal-to-noise ratio. The third is Robustness, because preference is for a location that robustly handles noise, such as that induced by the sensor and the mechanics of the system, providing consistent output measurements. A score was calculated for each criterion and each sensor location l.

• Linearity

The curve-fitting cost L_l from a least-squares linear fit on the means µ_l^1, ..., µ_l^8 of each VP instance corresponding to a particular location l (each curve is fitted to 8 points). Lower cost signifies a more linear response from the system.

• Dynamic Range

The distance in measurement space between the minimum and maximum of two adjacent valve positions: H_l^p = |min(x_l^p) − max(x_l^{p−1})|, where p = 2, ..., 8. To obtain a single score for a particular location l, each valve position was weighted as follows: wh^p = [1.05, 1.15, 1.25, 1.35, 1.45, 1.55, 1.65]. The weights were determined empirically in order to emphasize the accuracy at the fully pressed position (p = 8). The final Dynamic Range score is H_l = Σ_{p=2}^{8} wh^p × H_l^p. A higher score implies better dynamic range.

• Robustness

The ratio R_l^p = σ_l^p / ρ_l^p between the standard deviation and range for each configuration (l, p). As defined, this criterion is likely to be sensitive to outliers, but in fact this might be advantageous when taking into account any possible noise, such as that induced by EROSS not being mechanically locked when mounted. To obtain a single robustness score for a particular location l, each valve position is weighted as follows: wr^p = [1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7]. The final robustness score is R_l = Σ_{p=1}^{8} wr^p × R_l^p. The weights were determined empirically in order to emphasize the accuracy at the fully pressed position (p = 8). A lower score implies better robustness. A sketch of this scoring procedure is given below.
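The scoring can be condensed into a few lines of NumPy. The sketch below assumes the measurements for one candidate location are arranged as an 8 × 1000 array (eight valve positions, 1000 readings each); the weights are those listed above, while the least-squares fit and implementation details follow the description only loosely and are not the thesis code.

# Sketch of the placement scoring described above (assumed data layout: one candidate
# location at a time, x_l of shape (8, 1000) -> eight valve positions, 1000 readings).
import numpy as np

WH = np.array([1.05, 1.15, 1.25, 1.35, 1.45, 1.55, 1.65])   # dynamic-range weights, p = 2..8
WR = np.array([1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7])     # robustness weights,  p = 1..8

def scores_for_location(x_l):
    """Return (Linearity cost L, Dynamic Range H, Robustness R) for one location."""
    mu = x_l.mean(axis=1)
    lo, hi = x_l.min(axis=1), x_l.max(axis=1)
    rho = np.abs(hi - lo)                      # per-position range
    sigma = x_l.std(axis=1, ddof=1)            # per-position sample standard deviation

    # Linearity: residual cost of a least-squares line through the eight means.
    p_idx = np.arange(1, 9)
    _, residuals, *_ = np.polyfit(p_idx, mu, 1, full=True)
    L = float(residuals[0]) if len(residuals) else 0.0

    # Dynamic range: weighted gap between adjacent valve positions (higher is better).
    H = float(np.sum(WH * np.abs(lo[1:] - hi[:-1])))

    # Robustness: weighted ratio of standard deviation to range (lower is better).
    R = float(np.sum(WR * sigma / rho))
    return L, H, R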

The scores for the three most suitable sensor locations are shown in (Table 6.1). These scores show that, while SL4 is the most robust location, SL1 has the least variation from a linear curve (Linearity criterion) and the widest dynamic range.

Sensor Location   Linearity (L_l)   Dynamic Range (H_l)   Robustness (R_l)
SL1               752               270                   8
SL4               2511              316                   6
SL5               1714              402                   20

Table 6.1: EROSS: sensor test results

The scores R_l, H_l, and L_l were normalized and added with equal weight to yield a final score S_l. The sensor location that yielded the highest score, l̂ = arg max_l S_l, was selected for placing the sensor; it corresponds to the detector being positioned at the center of the cap's hole while the emitter sits at its edge (SL1).

SYSTEM DESCRIPTION

The use of optical sensors provided the least invasive option out of the different valve sensing approaches considered, such as the ones described in [125], [1]. Since the system is intended to perform continuous range control on the trumpet valves rather than just event detection, a tightly focused sensor implementation was needed in order to achieve accurate results. As in previous work, the system's sensor location is at the bottom of the valves, below the bottom caps, for easy removal of the system. The system's main components are the optical sensors, the microcontroller, and the wireless transceiver.

Optical Sensors

A fully integrated proximity sensor was chosen so the constraints on spatial resolution and robustness could be achieved. This sensor is a small surface mount device measuring 4 mm (L) x 4 mm (W) x 0.75 mm (H). It includes an IR emitter, photo-pin-diode and processing circuitry such that up to 16 bits of effective proximity resolution are possible. This sensor has an effective angle of half sensitivity of ±55°, meaning that if we assume a cone-shaped detection zone with a ±55° angle (Figure 6.8), then the intensity of radiation of the IR signal is at its maximum in the middle, and half the maximum at the edges. For proximity measurements, the emitter sends a train of pulses at a specific frequency and the detector, tuned to that same frequency, captures the reflected signal, which is subsequently processed and converted into the 16-bit output value, known as counts. The inherent noise from the circuitry is between ±5 and ±20 counts. It features an I2C interface, which is supported by the chosen microcontroller, and its performance features are configured by writing to registers in its internal processing unit.

Although it seems counterintuitive, the optimal location SL1 allows the IR signal within half of the detection zone to enter the valve chamber, while allowing full exposure of the detector to capture the reflected signal (see Figure 6.9). Also, in principle this location ideally eliminates reflections off of the bottom cap.

Development Board

A ready-made wireless development board (TI eZ430-RF2500) was used. This package was chosen because of its immediate availability as well as ease of integration with the optical sensors via I2C, plus the integrated wireless RF transceiver. Also, the fact that this board's size is 33 mm (L) x 20 mm (W) x 4 mm (H) makes it more suitable for a minimally-invasive system. In addition, the microcontroller (MSP430F2274) on this board has some performance features that are relevant for this application. Perhaps the most important feature is the Low-Power mode (LPM), in which the microcontroller shuts down its CPU while in the idle state (i.e. waiting for the user to press the activation button). Since the transmitter side (Tx) is powered by a battery and not a power supply from the wall or a computer, this feature is very advantageous, as described later. It is also worth noting that the RAM size for this particular microcontroller is 1 KB, so the system's software design had to be constrained by memory availability. The Flash memory size is 32 KB and can be used when RAM is

not available; however, its slower performance must be taken into account.

A. Hardware Design

The system consists of three optical sensors located underneath the bottom valve caps, connected via I2C to the wireless development board.

Figure 6.11: EROSS mounted under the valves

Unfortunately, all three sensors come preprogrammed with the same fixed slave address, which defeats the purpose of the I2C protocol's multi-drop capability. To overcome this issue, an I2C-based multiplexing device (TI PCA9544A) was incorporated into the design. An actuation button (SENSE), strategically attached to a trumpet valve guard so it doesn't interfere with the natural grasp of the instrument, gives the user control over the system to enable/disable sensing operation. Three LEDs (one for each valve) convey basic status information for the sensors. The battery pack was attached at the bottom of the system, as well as an On/Off switch and a Calibration button (CAL). A removable mounting structure was 3D-printed. The structure consists of two C-shaped sliding brackets coupled around the valve chambers for mechanical support. One bracket holds the circuit and sensors attached to the bottom, and the valve guard with the actuation button and status LEDs attached to the side. The other bracket slides apart in order to detach the sensing apparatus. (Figure 6.11) shows a close-up picture of the mounted system and (Figure 6.12) shows the 3D-printed mounting bracket with the attached sensors from a top view.

B. Software design

Due to the I2C-bus topography, the sensing needed to be done sequentially. For the purpose of this system, the process of reading each sensor once is referred to as a

Figure 6.12: Top view of system housing

sensing cycle. The transmitter side of the system is the main component and its operation is described in this section. When powered up, the microcontroller first makes sure the sensors are connected by testing the I2C bus. Then it configures the sensors for proximity measurements by writing into the IR LED Current, Proximity Measurement Signal Frequency, and Proximity Modulator Timing Adjustment registers. Once the sensors are initialized, the system shuts down the CPU and waits for the user to press the CAL button. When the calibration (CAL) button is pressed, the system goes into Calibration mode. In this mode the system performs ten sensing cycles and averages the measurements. This process is done twice, once with the valves up to determine the lower boundary, or min, and once with the valves down to determine the upper boundary, or max. In this stage the slope of the linear model is calculated as (Equation 6.1):

m = 1024 / (max − min) (6.1)

After calibration, the CPU is shut down while it waits for the user to press the SENSE button. Due to the complexity of the optical path, there was a concern that there might not be a monotonic relationship at the output (one count value for each position), in which case it would be impossible to linearize the readings from the sensor; in practice, however, the relationship was in fact monotonic. Linearization is done through the equation Y = m(x − min), where x is the value from the sensor and Y is the linearized output value; m is the slope and min is the lower boundary, as calculated in the Calibration stage. In Sensing mode the system wirelessly transmits the linearized, averaged measurements from three sensing cycles. It stays in this mode until the system is powered down again.
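The calibration arithmetic is compact enough to restate as a sketch. The counts below are made up for illustration only; in the real firmware each reading is an I2C proximity measurement on the MSP430, which is not shown here.

# Sketch of Calibration and Sensing-mode linearization (Equation 6.1 and Y = m(x - min)).
def calibrate(cycles_valves_up, cycles_valves_down):
    """Average the sensing cycles for both extremes and derive the linear model."""
    lo = sum(cycles_valves_up) / len(cycles_valves_up)        # lower boundary (min)
    hi = sum(cycles_valves_down) / len(cycles_valves_down)    # upper boundary (max)
    m = 1024.0 / (hi - lo)                                    # Equation 6.1
    return m, lo

def linearize(x, m, lo):
    """Map a raw proximity count onto the 0..1024 output range."""
    return m * (x - lo)

# Example with made-up counts: ~2000 counts valves up, ~3000 counts fully pressed.
m, lo = calibrate([2010, 1995, 2003], [2990, 3012, 3001])
print(round(linearize(2500, m, lo)))    # roughly mid-travel -> ~510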

PERFORMANCE

Calibration is done every time the system is powered up, and it compensates for any gain and offset error that the system may introduce. The main performance parameters are Spatial/Temporal Resolution, System Robustness, and Power Consumption.

Spatial / Temporal resolution

The trumpet valve has a range of 17 mm (0.669"), and the optimal sensor position produces a full-scale dynamic range of about 1000 counts, or 10 bits, so the spatial resolution is roughly 16.5 µm per count. In practice it has been found that there is a variation in full-scale dynamic range across valves of 1 bit, possibly due to a variation in the reflective properties of the bottoms of the pistons. This small loss in resolution has minimal effect on practical performance. Latency is determined by the time it takes the system to read data from the sensors and send it over to the receiver. When using the microcontroller's Low-Power mode, the latency is 3.6 ms. The threshold for noticeable latency is about 20 ms for the average human ear, but could go as low as 15 ms.

Battery Type   Idle (LPM)   Idle (no LPM)   Sensing (LPM)   Sensing (no LPM)
AAA            170          94              77              78
LiPo           206          114             94              94

Table 6.2: EROSS battery specifications

This difference of 11.4 ms gives room to do more sensing cycles and average them, thus potentially improving the robustness of the system. Operating the microcontroller in low-power mode has no significant effect on the latency.

System Robustness

Under repeated trials involving attaching and detaching the current system, the greatest non-linearity transfer error, or variance, was 15%, or about 150 counts in 1000, without averaging. Even though the robustness can be greatly improved by averaging the measurements and reducing the system's profile, it performs consistently after attaching/detaching. Time and usage would probably decrease the system's performance as the bottoms of the pistons become less reflective due to dust and spit. This

issue is largely mitigated by the calibration stage.

Power Consumption

An estimation of the current drawn by the system helps determine the battery's life expectancy. For the first prototype, two rechargeable AAA batteries with 700 mAh of capacity were used; for the second implementation, a LiPo battery with a capacity of 850 mAh was used. To estimate the battery's life cycle, two performance scenarios were studied:

1. Idle: After Initialization and Sensor Calibration, the system stays on but never goes into Sensing Operation mode.

2. Sensing: After Initialization and Sensor Calibration, the system goes into Sens- ing Operation mode and stays there until the battery is depleted.

These two scenarios cover the range of use of the system from minimum performance to full Sensing Operation. Two estimations were done, one with use of the microcontroller's Low-Power mode (LPM) and one without. Based on the results in (Table 6.2), it is clear that the long battery life will ensure that changing or recharging batteries is infrequent.

6.2 Geometric Rhythm Theory

This section describes a graphical geometric rhythm sequencer: geoSEEq (Gq), a software rhythm sequencer that visualizes the geometric patterns of specific rhythms in a pulse-defined cycle. The rhythms are visualized in a way that illuminates the shapes formed by drawing lines between positive events on the cycle, entered by clicking on the dots representing the pulses. Western notation, box notation, and geo-notation are viewed simultaneously. Gq shows that shapes present in commonly found rhythms around the world tend to reveal universally shared geometric patterns otherwise imperceptible without this method. These patterns often also reveal that common rhythms share symmetry as a trait in the shapes they create. This symmetry suggests why certain rhythms have groove and others don't.

6.2.1 Geometric Sequencer

GeoSEEq (Fig. 6.13) is a music sequencer that allows the user to view and program geometric shapes within the space of a sphere along the line of a pulse division to control musical events. Features include editing/viewing the music notation of the respective rhythms being sequenced and editing/viewing the box notation (MIDI style) of the respective rhythms.

Figure 6.13: Geo SEEq

Editing happens from all three viewing categories (geometric, notation, and box), with all three reflecting changes in real-time. The number of steps can also be edited in real-time. There are four potential voices that can occur concurrently; each can be selected, muted, and reset individually. They are designated by color and viewable simultaneously. The program can receive MIDI in order to be controlled externally while still conveying the geometry of that particular rhythm. The sequencer can also send MIDI in order to control external hardware with the geometric patterns being visualized. Gq exposes that certain rhythms are often synonymous with one another, yet have different names, instruments, and purposes for their execution, causing them to appear intrinsically unique from one another through a cultural lens. These cultural differences can impose bias where the inherent symmetry would otherwise be imperceptible without this form of novel geometric graphical representation as notation. Gq reveals a visual profile that exposes rhythmic universality through inherent geometry not evident in traditional notation.

The Theme of Cycles

The theme of cycles, as in life, is ever-present in music. From the initial purity of a sine tone at a specific frequency in Hertz (Fig. 6.14), to harmonic cycles (Fig. 6.15), cycles in the form of ostinato (Fig. 6.16), to using cycles as a specific philosophical impetus for new compositional schemes [26], cycles are a constituent element of music.

Figure 6.14: Sine wave cycle

Cycle can mean several things in music. Acoustically, it refers to one complete vibration (Fig. 6.14), the base unit of Hertz being one cycle per second [109]. Theoretically, an interval cycle is a collection of pitch classes created by a sequence of identical intervals. Individual pieces that aggregate into larger works are considered cycles, for example, the movements of a suite, symphony, sonata, or string quartet [138]. Harmonic cycles, repeated sequences of a harmonic progression, are at the root of many musical genres, such as the twelve-bar blues and most pop songs. The Circle of Fifths shows how cycles define harmonic theory in Western music (Fig. 6.15). Rhythm cycles are universally pervasive: critical in classical Indian music (Talas), Indonesian Gamelan (Colotomy), the Ghanaian Gyil [145], and the polyrhythmic West African drumming music comprised of many interlocking rhythmic loops, which is investigated in this section. While the geometry of melodic content has been analyzed in depth [140], the work presented here focuses singularly on rhythm cycles (Fig. 6.17). West African music in general provides an excellent fabric for analysis when observing rhythm cycles because it is predominantly rich with polyrhythmic looping patterns. Rhythm cycles in traditional West African music have been closely observed by Willi Anku. It is noteworthy to state that the author took a course with Anku titled African Rhythm Theory at the University of Ghana's International Center for African Music and Dance in 2002.

Figure 6.15: Circle of Fifths

Figure 6.16: simple bell pattern ostinato commonly found in African drumming

Anku states:

Much of African music is circular. This circular concept of time ultimately defines a structural set. The set represents a structural module from which the entire performance is derived. The performance consists of a steady ostinato framework of multi-concentric rhythms on which various manipulations of the set are realized by a leader (e.g. lead drummer). It is in these complex structural manipulations (against a background of a steady ostinato referent) that Africa finds its finest rhythmic qualities [10].

Here, Anku's work reveals the nature behind interlocking rhythmic ostinati that might seem random when only perceived linearly (Fig. 6.18). Anku states that there is an intrinsic logic to the nature of the lead drummer's solos, where the various separate phrasings' relationships to one another would be imperceptible without the graphical representation. In this case, instead of viewing the lead drummer's solo as expressive improvisation, a much deeper connection is revealed between the base ostinato (ensemble), the seemingly extemporaneous embellishments (lead solo), the individual parts to one another, and the overall form of the particular piece (Fig. 6.19).

Figure 6.17: Anku's over-arching interlocked cyclic form

Figure 6.18: Lead part in relation to background ostinato

Each circle represents a particular single rhythmic phrase and intersects with the next one at various Relative Time Points (RTP). This illustrates how parts align with one another on the timeline and where they share the downbeat with the background ostinato. This matrix of intricately intersecting concentric circles represents how the lead drummer navigates his solo and formulates the arrangement of lead phrases concurrently with the ensemble playing the primary loop in a musically and socially meaningful manner, as opposed to meaningless improvising or noodling. Anku's theories are reflected in the work of Godfried Toussaint.

Figure 6.19: Getting from phrase to phrase

Toussaint states:

"If we were to take a common rhythm native to Cuba, the Son Clave, and view it side-by-side with another commonly found rhythm from Ghana, Fume Fume, we see that by extrapolating the geometric line created by connecting positive events on the cycle we get the same shape, or rhythm. This similarity would otherwise be obscured against the background ostinato we observed in Anku's work, but using Toussaint's method for geometric notation, we see that the phrases are actually nearly identical despite having a different total number of pulses, meter, relationship to the timeline (RTP), and different support patterns (Fig. 6.20). Despite the difference in cultural contexts and number of pulses, both beats divide their own global cycle the same way and are symmetrical in the same orientation."

Figure 6.20: Son Clave (Cuba) compared to Fume-fume bell (Ghana).

This concept of Toussaint's is used as the basis of the design for the Gq prototype, a real-time geometric sequencer. The main goals were to be able to program, edit, and

visualize from all categories. (Fig. 6.21) provides a mock template for my design goals.

Design Goals:

1. Rotary Notation using Toussaints concepts;

2. Conventional Notation reflecting the geometry;

3. Box Notation (Standard MIDI format);

4. Global parameters- tempo, pulses, meter, and bars.

Figure 6.21: Prototype mock-up

Figure 6.22: Rhythm Necklaces app (Meara O'Reilly and Sam Tarakajian)

Related Work

Toussaint's concepts can be seen in several recent commercial apps. Rhythm Necklaces6 (Fig. 6.22) directly references Toussaint. Patterning7, by Olympia Noise Company, reflects a very similar paradigm, where what would conventionally be the number of steps to program for a rhythm sequence is instead represented as equally divided zones on a circle (Fig. 6.23). Patterning has a polished user interface that is dynamic and musical, with the many tweakable parameters one would expect from pro-studio grade music sequencing software. A hardware version that is distant kin to Toussaint's theories is Jeff Mills' modified Roland TR-9098, a classic drum machine for which the artist wanted a tactile circular interface for real-time sequencing in performance. Buchla has released the 252e Buchla Polyphonic Rhythm Generator9 (Fig. 6.25) and Zoom created and marketed the Arc10 circular interactive groove box (Fig. 6.24).

Figure 6.23: Patterning: Olympia Noise Company

6http://rhythmnecklace.com/
7http://www.olympianoiseco.com/apps/patterning/
8http://daily.redbullmusicacademy.com/2015/04/how-jeff-mills-built-a-ufo-drum-machine
9https://buchla.com/product/252e-buchla-polyphonic-rythm-generator/
10https://www.zoom-na.com/products/production-recording/digital-instruments/zoom-arq-ar-96-aero-rhythmtrak

Figure 6.24: Zoom Arc

Figure 6.25: 252e: Buchla Polyphonic Rhythm Generator

6.2.2 System Description

1. Software Design

The software was prototyped in Max/MSP and Processing11. Max handles all control logic between the GUI (developed in Processing), audio events (managed in Processing), and notation visualization (built using MaxScore12).

11https://processing.org/
12http://www.computermusicnotation.com/

Figure 6.26: Modified TR-909

Max receives 4-channel OSC data from Processing as live notation for populating MaxScore and for sending/receiving MIDI.

Processing

The Graphical User Interface was programmed in Processing because it is streamlined for OSC, MIDI, and Audio implementations, as well as visual processing. Percussion samples are called from Processing. The Processing patch receives and sends MIDI CCs for controlling an external sound module or for using an external sequencer. Each channel is represented in a unique color.

Max Notation

The sequencer's musical notation is powered by and visualized using MaxScore, which requires a license. Since the time of this project, a notation visualizer has become available in Max 7, which might prove useful for future iterations of Gq. The patch works both ways: editing the MaxScore notation is reflected in Gq and vice versa. This is useful for analyzing specific rhythms, either starting with the notation (as in the case of transcription or entering musical examples) or viewing the notation of a particular geometric shape. Box notation reflects all changes and can also originate a pattern (Fig. 6.2), with all changes reflected in all categories (Box, Musical, Geometric).

Control Mappings

Control and Input/Output: reset, toggles, channel clear, hot keys, clock (Table 6.3); a sketch of this mapping follows the table.

ASCII                              MIDI
SPACE = play/pause                 MIDI Notes: Kick 36, Snare 38, Hi-Hat 42, Ride 51
X = reset                          MIDI Output - [tx] to instrument/device
1-4 = mutes                        MIDI Input - [rx] from external sequencer/controller
Arrow up/down: steps (32 max)      n/a
Arrow up/down: BPM/tempo box       n/a

Table 6.3: GeoSEEq: Control Logic
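As an illustration of the mapping in Table 6.3, the sketch below converts one 16-step pattern per voice into (step, MIDI note, velocity) events using the note numbers listed there. The real sequencer lives in Max/Processing; the example pattern contents and the velocity value are assumptions made for this illustration.

# Sketch of mapping GeoSEEq-style step patterns onto the MIDI notes listed in Table 6.3.
VOICE_NOTES = {"kick": 36, "snare": 38, "hihat": 42, "ride": 51}

def pattern_to_events(patterns, velocity=100):
    """patterns: dict of voice -> list of 0/1 steps. Returns sorted (step, note, velocity)."""
    events = []
    for voice, steps in patterns.items():
        note = VOICE_NOTES[voice]
        events.extend((i, note, velocity) for i, hit in enumerate(steps) if hit)
    return sorted(events)

# Example 16-pulse cycle: a son-clave-like kick figure, backbeat snare, even hi-hat.
example = {
    "kick":  [1,0,0,1, 0,0,1,0, 0,0,1,0, 1,0,0,0],
    "snare": [0,0,0,0, 1,0,0,0, 0,0,0,0, 1,0,0,0],
    "hihat": [1,0,1,0, 1,0,1,0, 1,0,1,0, 1,0,1,0],
}
print(pattern_to_events(example)[:5])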

6.2.3 Euclidean Visualizer

Toussaint cites [129] that in Birth of Tragedy, Friedrich Nietzsche states:

"Music is like geometric figures and numbers, which are the universal forms of all possible objects of experience."

The geometric characteristics alone of a rhythm can provide new understandings of an aspect of a particular piece of music that one couldn't view easily in traditional notation. This novel facet of rhythmic analysis reveals new ways of perceiving intrinsic qualities of music that would otherwise not be revealed through established conventions alone. Rhythm is arguably the most fundamental aspect of music; as such, geometric presentation is a potentially profound contribution to musicologists, composers, and producers alike [95].

Polyrhythm

Figure 6.27: Common polyrhythms in 4/4 pop music

Here (Fig. 6.27) one can observe three distinct rhythms that could easily be found in pop music. When stacked on top of one another they create a polyrhythm that we are viewing in geometric notation, revealing aspects of the polyrhythmic ostinato that are imperceptible otherwise. Symmetry is obviously inherent to this rhythmic loop, as in many cases with popular music.

Polymeter

Figure 6.28: Example of polymeter

In (Fig. 6.28) two timelines with the same BPM (denoted by the circles being the same size, each representing a full cycle of a bar of music) are shown. In the left example there are 12 steps; in the right there are 16. Considering that the full representation of a cycle is equal in each of the examples in terms of the time elapsed, one can observe how and when parts align, and when they don't, relative to their geometric characteristics. It is also possible to easily see when the meters realign at their respective mutual beginnings/RTP [10] (Fig. 6.19).
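The realignment points can be computed directly: steps of two equal-duration cycles coincide wherever they land on the same fraction of the bar, which for 12 against 16 steps happens gcd(12, 16) = 4 times per bar. A small sketch:

# Where do steps of a 12-step and a 16-step cycle of equal duration coincide?
from fractions import Fraction
from math import gcd

def coincidences(steps_a, steps_b):
    """Fractions of the bar at which steps of both cycles land together."""
    a = {Fraction(i, steps_a) for i in range(steps_a)}
    b = {Fraction(j, steps_b) for j in range(steps_b)}
    return sorted(a & b)

print(coincidences(12, 16))   # [Fraction(0, 1), Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]
print(gcd(12, 16))            # 4 shared points per bar, including the mutual downbeat (RTP)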

Odd Meter

Below (Fig. 6.29) is an ancient Greek rhythm in box notation with 5 groups of 5 pulses:

Figure 6.29: Greek rhythm - 5 groups of 5 pulses

Popular music also offers examples of odd meter. Above are two variations (Fig. 6.30) of a nine-pulse rhythm from Dave Brubeck's Blue Rondo a la Turk. Below (Fig. 6.31) are three different 11-pulse rhythms: from Dave Brubeck's Countdown (left), the chorus of Burt Bacharach and Hal David's I Say a Little Prayer (center), and the meter of the Bulgarian folk dance Krivo Horvo (right). Symmetry is present in all of these rhythms, revealing characteristics that make them appealing as odd meters that we would not otherwise perceive in conventional or box notation.

Figure 6.30: Two variations of Dave Brubeck's Blue Rondo a la Turk

Figure 6.31: 3 variations of Brubeck's 11-pulse Countdown

6.2.4 Musicality

Groove vs. no Groove- What makes a rhythm good?

When a rhythm is described as good by Toussaint, the word is intended to denote that it is effective as a timeline, as judged by cultural traditions and the test of time; these rhythms are found across cultural lines and thus have different names where they are used. One aspect of groove that can't be visualized with the current configuration of Gq is the rhythmic offsets in a humanized performance, where a particular rhythm is slightly displaced, either a few milliseconds before or after the intended positive pulse event, causing a type of rushed or behind-the-beat feel that some iconic percussionists incorporate into their signature feel, or personal groove language. Being able to view and edit these timing modulations during the analysis of a particular performance could reveal new perceptual understandings about timing, feel, and the interconnectedness of polyrhythm through the lens of a geometric global clock. The way Gq (and typical sequencers/visualizers) quantizes the beat to the metric grid, with precise timing relative to the BPM, is universal within the MIDI protocol. This prevents slight timing modulations (within the designated metric division) from being visualized and handicaps the effectiveness of using Gq as an analysis tool. This feature is slated for future work. If the timeline could stretch

in a similar fashion to elastic audio13, one could compensate for metric offsets and fine-tune to adjust for accurate timing modulations of a particular performance from a particular player in a specific style being observed (Fig. 6.32).

Figure 6.32: Timing modulation visualized in red

Symmetry

Symmetry suggests groove. There are many rhythms possible within the time-space of 16 equidistant rhythmic pulses. Of six commonly found rhythms around the world, all but the Rumba are observed to have pure symmetry (Fig. 6.33).

Figure 6.33: Common rhythms around the world

13https://www.ableton.com/en/manual/audio-clips-tempo-and-warping/

Toussaint presents a formula [129] that counts all possible rhythms producible within 16 steps; this formula yields the number 4,368. The figure below shows, in box notation, a dozen arbitrary members of this large family of rhythms (Fig. 6.34). He illustrates that not every rhythm in this family is considered good, where good means that somewhere on the planet it has been adopted into traditional repertoire as a timeline or as a recognized, socially meaningful and distinct rhythm.
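The figure of 4,368 matches the binomial coefficient C(16, 5), i.e. the number of ways to place five onsets on sixteen pulses, which is the reading assumed in the sketch below. The sketch also builds the evenly spread five-onset pattern via a simple floor-based spacing (which yields the Euclidean rhythm up to rotation, not Toussaint's own algorithm) and tests it for an axis of mirror symmetry.

# 4,368 = C(16, 5): the number of 5-onset patterns on a 16-pulse cycle (assumed reading).
from math import comb

def euclidean(onsets, pulses):
    """Spread `onsets` hits as evenly as possible over `pulses` steps (up to rotation)."""
    positions = {(i * pulses) // onsets for i in range(onsets)}
    return [1 if step in positions else 0 for step in range(pulses)]

def is_mirror_symmetric(pattern):
    """True if the cyclic pattern has at least one axis of reflective symmetry."""
    n = len(pattern)
    return any(all(pattern[i] == pattern[(c - i) % n] for i in range(n)) for c in range(n))

print(comb(16, 5))               # 4368 possible 5-onset, 16-pulse rhythms
e = euclidean(5, 16)
print(e)                         # onsets fall on pulses 0, 3, 6, 9 and 12
print(is_mirror_symmetric(e))    # True: the even spread has an axis of symmetry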

Figure 6.34: 12 arbitrary 16 pulse rhythms (l); previous 6 common rhythms (r)

It is clear that in simply glancing at the figure on the right, where the common set of global rhythms is written in box notation, one would not be able to discern the geometric symmetry inherent in each of the rhythms, or that they all actually share a common shape, even though their accents and phrasing don't convey it clearly. In contrast, one of the main goals is to design a set of algorithms for generating good rhythms, using the simple mathematical principles observed above in the geometry, to help psychologists and musicologists better understand why certain rhythms are embraced and become hypnotically seductive, while others seem dissonant and obtuse. This can be observed on a surface level simply from viewing the inherent shapes of the rhythms belonging to the overall group of 4,368 possible 16-step cycles. While some are recognizable as adopted timelines, with their shapes being distinctly symmetrical, the obtuse rhythms of the group tend to produce no consistently distinct, symmetrical shapes whatsoever and sound rhythmically dissonant from a social standpoint when looped on their own.

Experiments

1. Sounds: Default sounds are provided and can be changed by swapping the samples in the patch bank or by controlling external sound engines and robotic instruments over MIDI Out. The samples are called by Processing and stored in the corresponding project folder.

2. eMarimba: In this experiment I developed a pitch and amplitude envelope tracker in Max/MSP that generates a MIDI note for each acoustic bar hit and feeds it into Gq. A marimba equipped with an electro-acoustic pickup system provides the patch with a direct signal. I used this interface to program Gq in real time, in an effort to create a performance-based format for the geometric visualizer that displays the inherent geometry of the marimba performance rhythms: specific bars trigger specific percussion sounds in real time (a minimal sketch of this signal path appears after this list).

3. Robot: This Max patch takes the MIDI notes produced by Gq and routes them to the Notomoton using a simple MIDI note scaler. This way, one can program and visualize robotic percussion sequences in geometric notation.
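The following sketch approximates, outside of Max/MSP, the two pieces of glue logic used in experiments 2 and 3: envelope-threshold hit detection on the pickup signal, and a note scaler that remaps the resulting pitches onto an assumed Notomoton note range. The thresholds, note numbers, and ranges are illustrative assumptions, not values from the actual patches.

    # Minimal sketch of the eMarimba -> Gq -> Notomoton glue logic
    # (illustrative thresholds and note ranges; not the Max/MSP patches).
    def detect_hits(envelope, threshold=0.2):
        """Return indices where the amplitude envelope crosses the threshold upward."""
        return [i for i in range(1, len(envelope))
                if envelope[i - 1] < threshold <= envelope[i]]

    def scale_note(note, src_lo=60, src_hi=75, dst_lo=36, dst_hi=47):
        """Linearly remap a marimba MIDI note onto an assumed Notomoton drum range."""
        note = max(src_lo, min(src_hi, note))
        return dst_lo + round((note - src_lo) * (dst_hi - dst_lo) / (src_hi - src_lo))

    print(detect_hits([0.0, 0.05, 0.3, 0.1, 0.02, 0.25]))  # [2, 5]
    print(scale_note(67))                                   # 41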

Part III

CONCLUSION

Chapter 7

Final Thoughts and Future Work

In conclusion, the presented research has proven to be an inspiring platform for the author to pursue, with plans to build on and further the work in post-doctoral research, pedagogical, and artistic capacities. It is intended to expand the work into a framework that can be adopted by other players. A user study would offer additional feedback on ways to improve the system. Intercultural collaborations using the framework present the possibility of new music being explored through cultural lenses that might afford insight into the important social meanings of new interfaces for musical expression.

Interfaces

Wiikembe: The Wiikembe presents the possibility of continually changing filter parameters at will while remaining fully engaged in playing the acoustic aspect of the instrument in a traditional context, without interruption. This allows for the continual re-contextualization of the relationship between the acoustic signal and the filter-modulated output in real time. Being able to interface with the filters this way greatly diversifies the sonic possibilities and improves the control aspect of interacting with filters in performance. As a result, it also increases the improvisational capacity of digital effects processing, evolving it into a musically dynamic practice, analogous and symbiotic with the traditional techniques of the acoustic instrument. This work is meant to establish a rapid prototyping framework for idiomatic, intimate musical-DSP interaction so that artists can design custom instruments and computer music interfaces with minimal engineering overhead.

The Wiikembe has been used in concert with Trimpin's robotic prepared pianos (CanonX+4:33=100), at NIME 2012 with the Notomoton [65], and at Sound and Music Computing 2012 with EMMI1. In all of these concerts the author used the instrument in its native capacity, along with the hyperinstrument extensions, to control parameters modulating filters used for creative signal processing, as well as to control the robotic instruments' parameters: velocity, sequence launching, and signal modulation. The instrument has proved stable, robust, compelling to use in concert, and easy to reconfigure, achieving the artist's goals. The Wiikembe presents new avenues for sound design, sonic interaction, and musical-HCI in an idiomatic context specific to lamellophones, making technology that would previously have required custom engineering available to the conventional percussionist, using common, low-cost, easily sourced contemporary components that are simple to work with. Further engineering experiments are intended, including the development of an open-source, autonomous, purpose-built framework that eschews the laptop and Wiimote dependencies. This work is in development and the results are promising. Work with MEMS microphones and sensors is being explored to reduce the footprint of the embedded circuitry on the instrument.

El-Lamellophone: The El-La system has been tested and deployed in rigorous performance contexts. It has also been used extensively on an album exclusively featuring the instrument in duo with the RANGE guitar, recorded at STEIM2 in Amsterdam during an Artist Residency3. The instrument has proved stable and robust, and it remains inspiring to the artist. Future work includes designing a PCB and proper housing for the interface and microcontroller that can be mounted onto the instrument while remaining easily removable. A custom chromatic lamellophone design has been explored to ascertain whether a proprietary version with an embedded interface and controller would be more intuitive. A user study is planned to gain valuable feedback about the instrument's augmentations. Solar cells for power could potentially make the instrument truly autonomous and ready for deployment in the field, where collaboration with traditional instrumentalists could be established.

1http://expressivemachines.com/dev/wordpress/
2http://steim.org/
3https://southwoods.bandcamp.com/releases

Physical Model

The physical and sonic characteristics of the Gyil have been analyzed, and a physical model of the gourds, which produce its characteristic sound, has been developed.

The performance of several different variants of the model has been documented through both comparative analysis and listening tests with experts. The creation of a physical model of the gourds presents immediate applications for performers. The gourd model can be used in real time, with any audio input, as an effect. In particular, it can be used to process audio produced by other members of the idiophone family. Given the rarity of the Gyil outside of western Africa, this model presents performers with a means to access the characteristic Gyil sound without requiring the sourcing or construction of a Gyil. It also enables efficient traveling, as packing the wooden bars is easy and the gourds, which are fragile and cumbersome, can be emulated digitally. Because the physical model uses control parameters that are clearly linked to the physical properties of the gourds on which they are modeled, users have an intuitive way to modify the sound of the model to suit their own purposes. The parameters and parameter ranges of the model have been specifically selected for use with mallet instruments (particularly idiophones), based on the pitch range and timbre produced by the Gyil, which makes the model a uniquely useful tool for musicians and composers working with these instruments.

While this is a good model of the Gyil gourds, there are improvements that could be implemented in future developments of the model. The primary remaining difference between the model and the physical instrument is the spectral character of the distortion produced by the gourd. As demonstrated in the comparative and qualitative listening tests, the Gyil model produces output with less high-frequency energy than the physical instrument. In future work, a more detailed model of the physics of the spider-silk membranes could address this. While the complexity of the physics of the membranes makes them a difficult object to model, this does not mean that efforts in this area will be fruitless. Future experiments could include a nonlinear waveguide mesh. Furthermore, comments from the expert listening tests suggested that the buzz produced by the physical Gyil is much less predictable than the buzz produced by the model. In future work, stochastic methods could introduce random variation.
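As a minimal illustration of how stochastic variation could be introduced, the sketch below jitters the threshold of a simple hard-clipping nonlinearity per sample. The clipping stage itself is an assumption made for illustration; it stands in for, and is not, the membrane model described earlier.

    # Minimal sketch: stochastic variation of a buzz nonlinearity by jittering
    # the clipping threshold per sample (illustrative stand-in, not the gourd model).
    import random

    def buzz(samples, threshold=0.3, jitter=0.1, seed=None):
        rng = random.Random(seed)
        out = []
        for x in samples:
            th = threshold * (1.0 + jitter * (2.0 * rng.random() - 1.0))
            out.append(max(-th, min(th, x)))  # hard clip at the jittered threshold
        return out

    print(buzz([0.1, 0.5, -0.8, 0.2], seed=1))

Because the threshold wanders from sample to sample, repeated identical strikes no longer produce an identical buzz, which is the perceptual property the expert listeners found missing from the deterministic model.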

Surrogate Sensing

A direct sensing apparatus specifically designed for the African wooden xylophone called the Gyil has been described. An array of piezo-electric pickups mounted on the wooden bars enables the capture of detailed performance data.

The raw audio data produced can be used to drive a physical model of the gourds. I also describe how the direct sensors can be used to obtain ground truth information for evaluating audio transcription approaches tailored to the particular instrument. There are many directions for future work. It is intended to study the variations in technique among different players in the context of performance analysis. The physical model can be improved by more detailed modeling of the gourd resonators, as well as by including coupling between neighboring gourds. I also plan to create a full physical model in which the wooden bar excitation is simulated. With the existing algorithms, indirect acquisition is not sufficiently accurate to obtain performance data, so I plan to investigate several variations of factorization methods. The surrogate sensor approach to obtaining ground truth will be essential in continuously improving these algorithms. Finally, it is intended to include the developed model system in live performances of electro-acoustic music.
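One family of factorization methods of the kind alluded to above is non-negative matrix factorization (NMF) of the magnitude spectrogram, where each bar contributes a spectral template and the activations serve as a rough transcription. The sketch below is a generic multiplicative-update NMF under that assumption; it is not the specific algorithm evaluated earlier in the thesis.

    # Minimal sketch: NMF with multiplicative updates (Euclidean cost) on a
    # magnitude spectrogram V (freq x time). W holds per-bar spectral templates,
    # H the activations used as a rough transcription. Generic method only.
    import numpy as np

    def nmf(V, n_components, n_iter=200, eps=1e-9, seed=0):
        rng = np.random.default_rng(seed)
        F, T = V.shape
        W = rng.random((F, n_components)) + eps
        H = rng.random((n_components, T)) + eps
        for _ in range(n_iter):
            H *= (W.T @ V) / (W.T @ W @ H + eps)
            W *= (V @ H.T) / (W @ H @ H.T + eps)
        return W, H

    # Toy spectrogram: two "bars" with distinct spectra played at different times.
    V = np.array([[1.0, 0.0, 1.0, 0.0],
                  [0.0, 1.0, 0.0, 1.0],
                  [0.5, 0.5, 0.5, 0.5]])
    W, H = nmf(V, n_components=2)
    print(np.round(W @ H, 2))  # should approximate V

With templates fixed from the direct piezo recordings, only the activations would need updating at transcription time, which is one way the surrogate-sensor ground truth could feed back into indirect acquisition.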

Actuation

Musical Information Robotics: Techniques from Music Information Retrieval (MIR) can be adapted and used to solve practical problems in music robotics. More specifically, audio classification can be used for automatic mapping, principal component analysis for velocity/timbre calibration, and dynamic time warping for gesture recognition. This system has not yet been tried in performance, but it is intended to be improved and used in a live setting, where many opportunities exist. In the future it is planned to extend this work using more sensors, including multiple microphones on both the robot and the performers. To obtain the maximum possible dynamic range, I plan to have multiple actuators placed at different distances on the same drum, so that the far ones are used for loud sounds and the near ones for soft sounds. The proposed calibration method will be used to drive both actuators seamlessly (a minimal sketch of this blending appears at the end of this section). I would also like to investigate how MIR techniques can be used to teach the robot to play and recognize rhythmic and melodic patterns.

Decay Sustain Release marimbA (DSRm): The DSRm is a platform for physically actuated, digitally controlled mallet instruments. It presents specific building blocks as techniques that can be expanded and serves as a foundation to explore various aspects of musical robotics [66]. However, room for improvement exists. Further work needs to be done to improve the digital audio signal fidelity. Optical sensors for audio pickups will be explored, as introduced by the Guitarbot [119] and the Mechbass [85], which eliminate unwanted system noise interference. This work has been initiated in EROSS [69] for control data, and it is intended to expand it for audio acquisition. Other interfacing methods will also be explored. The robustness of the audio actuation system will be further developed as I attempt to implement a multi-channel version in which many bars can be actuated independently with different types of signals on the respective bars. I plan to introduce wireless capabilities in order to minimize the footprint and cabling requirements of the system.

In conclusion, the DSRm is an inexpensive, relatively easy to implement and use, modular hyper-marimba framework prototype that presents vast potential for previously unattainable sounds and performance-based interfacing solutions for digital pitched percussion electro-acoustic music. Using a small micro-computer that eliminates the need for speakers and the laptop, the DSP capabilities that a conventional music computing platform would afford are retained while an array of entirely new sound production facets is introduced to the instrument. The instrument itself (the bars) becomes the speaker, and the instrument remains totally acoustic, able to be performed in its most simple, traditional capacity as well as with the new digitally extended features. The author intends to develop a solar-powered battery system so that the instrument can be played in remote settings where no electricity is available; the whole system, including the computer, has minimal power requirements and could conceivably be battery powered. Since the bars themselves serve as the speakers, the DSRm presents a framework for a computer instrument that could be used in a folk capacity. A user study featuring the feedback of experienced musicians is planned to gather information about how to improve the instrument so that artists can begin to realize repertoire for the system.

Tine Actuation: Some aspects of the current Lemellea actuation system prevent easy integration into El-La without compromising the performer's ability to play the instrument normally. While the amplifier requires little to no heat-sinking, and the driver electronics are relatively compact, the magnets themselves are somewhat bulky and require significant heat-sinking to avoid overheating. Ideally, the magnets would be positioned directly over the lamellophone tine, near the plucked end, but this configuration prevents normal playing technique. Nicolas Collins illustrates a similar idea (Figure 7.1) in [29], where the tines are likewise obscured from traditional playing. The actuation of the magnets also tends to cause the pitch of the tines to lower slightly, for reasons that have not yet been sufficiently explored by the author.

Figure 7.1: Thumb piano with electromagnetic drivers by Dan Wilson

For these reasons, the current actuation system is a separate module from the rest of El-La, able to be actuated by the Beaglebone but not part of the same lamellophone system held by the performer. In this case one can employ the various sensors, such as the capacitive touch interface via the tines on the performer's instrument, to control the actuation of the tines on the electro-magnetic lamellophone. Further, the signal from the actuated instrument can be acquired in the same format, using a piezo, amplified, and processed autonomously via its own Beaglebone audio processing system. The two Beaglebones can be networked and synced via Ethernet or over a wireless connection using OSC.

Self-Tuning Auto-monochord Robotic Instrument (STARI): STARI presents a platform for physically actuated string instrument development. It offers specific building blocks as techniques that can be expanded and serves as a foundation to explore various aspects of musical robotics [66]. However, with this implementation, room for improvement exists. Further work needs to be done to improve audio signal fidelity. Optical sensors for audio pickups will be explored, as introduced by the Guitarbot [119] and the Mechbass [85], which eliminate unwanted system noise interference. This work has been initiated in EROSS [69], and it is intended to expand it for audio acquisition. I intend to develop my own magnetic actuator system, along with solenoid beaters and dampeners similar to those of Trimpin's pianos and AMI. Other ideas include exploring a possible emulation of Trimpin's pianos that had scrapers running laterally along the string, emulating a pick slide on the guitar. I intend to introduce multiple strings to achieve polyphony, as well as to research the Autoharp to determine whether it is possible to adopt its actuation techniques to achieve robotically actuated harmonic relationships. A simple, proprietary control interface allowing for wireless networking was developed; this needs optimization to be reliable. STARI is intended for creative purposes, building on previous work so that it can be coupled with conventional computer music controllers, hyperinstruments, or used alone.
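Returning to the dynamic-range plan described at the start of this section, the sketch below shows one way a near (soft) and a far (loud) actuator on the same drum could be blended from a single velocity value. The crossover point and calibration exponents are assumptions for illustration, not values produced by the calibration procedure described in Chapter 5.

    # Minimal sketch: blend a near (soft) and far (loud) drum actuator from one
    # MIDI velocity. Crossover and curve values are illustrative assumptions.
    def blend_actuators(velocity, crossover=80, curve_near=1.0, curve_far=1.5):
        v = max(0, min(127, velocity)) / 127.0
        x = crossover / 127.0
        if v <= x:
            return (v / x) ** curve_near, 0.0          # soft hits: near actuator only
        fade = (v - x) / (1.0 - x)
        near = (1.0 - fade) ** curve_near              # fade the near actuator out
        far = fade ** curve_far                        # and the far actuator in
        return near, far                               # normalized drive levels, 0..1

    for vel in (20, 60, 90, 127):
        print(vel, [round(level, 2) for level in blend_actuators(vel)])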

Abstracted Contributions

Reconfigurable Autonomous Novel Guitar Effects (RANGE): RANGE successfully presents a reconfigurable, autonomous, and novel DSP and sensing framework for the guitar. The RANGE system has been used in concert several times and in the studio, proving to be a novel, satisfying method of DSP interaction for the touring guitar player and computer musician. A user study contrasting the proposed approach with a traditional electric guitar effect setup is planned. It is also planned to investigate polyphonic transcription using a surrogate sensing approach [113]. An extended platform supporting magnets for robotic guitar string actuation, building on the previous work by Berdahl [42] and McPherson [82] on electromagnetic actuation of stringed instruments, is also intended.

Easily Removable, wireless Optical Sensing System (EROSS): The main goal of EROSS was to motivate minimally invasive and easy to remove digital sensing systems for musical instruments in general. For that reason, describing explicit artistic mappings and performances with the system was refrained from, as it is hoped EROSS can be used by many artists in different and unique ways. In the current configuration the added weight of EROSS (180 g compared to the 880-1130 g of a typical trumpet) still allows the trumpet to be held comfortably, but the high profile of the system, along with its low center of gravity, affects the robustness of EROSS by applying unwanted mechanical torque when tilting the trumpet. Replacing the AAA batteries with a slimmer LiPo battery, as well as designing a printed circuit board, would reduce the system's profile, thereby reducing the torque and improving the robustness. Future plans include providing versions of the system to other researchers experimenting with augmenting the trumpet, as well as to expert trumpet players who do not have a technical background. In this way, feedback can be gathered about how intuitive the system is to use, how stable and robust it is, and what possible improvements exist, as well as more subjective aspects such as whether it is inspiring to use, whether it hinders performance, and whether it fits into a particular artist's workflow.

An easy to remove audio pickup, in conjunction with real-time pitch detection, can be used to provide accurate pitch tracking information that can also be informed by the valve positions. One interesting application of the proposed system that is planned for exploration is real-time transcription and latency compensation. This involves the concept of what has been referred to as Negative Latency, which is associated with the time it takes for a valve to go down when it is pressed. In order to play a note, a trumpeter generally has the valve(s) already fully pressed down (open valve positions are a special case) when they start to blow into the mouthpiece. From previous tests, it was determined that a player is able to fully press a valve in 50 ms. If the instant the player begins to blow is considered t = 0 ms, because that is when the sound is produced, then at t = -50 ms they would have started to press the valve(s). This negative instant could be used to give the system a head start to perform calculations and predictions, in effect compensating for any system-induced latency (a minimal sketch of this bookkeeping appears at the end of this section). The idea is to study how this Negative Latency could improve time accuracy in real-time transcription, as opposed to more discrete approaches to valve sensing [31, 100].

EROSS provides a low-cost and adaptable platform that artists can utilize without much background in engineering and electronics. This way they can realize their own specific vision while still taking advantage of a flexible framework for interactive physical computing. The use of minimally invasive and easy to remove augmentations to instruments has the potential to widen the adoption of hyperinstruments. EROSS demonstrates the potential of an easily removable, minimally invasive wireless sensing system for augmenting the trumpet with digital control capabilities. Although in practice the dynamic range of the system is below the desired 10-bit range, it provides a fluid, dynamic, and fairly linear response. The noise induced by EROSS not being mechanically locked when mounted to the trumpet, plus the sensor's inherent noise, decreased the system's robustness. With a non-linearity measure of the system's transfer function on the order of 5 percent, it still proved robust enough to give consistent output when detaching and attaching the system. In summary, the objective of devising an easily removable system that provides continuous gesture control from the trumpet valves has been achieved and deployed in performance reliably and to great effect. The system was made possible by leveraging advances in sensors, micro-controllers, wireless transmission, and 3D printing. The system is stable and works as intended under the design constraints posed.

It does not affect acoustic trumpet playing and is tightly integrated with the gestures used for playing. It is hoped that, because of the open design of the system, it will be adopted, used, and improved by others. Iterations are beginning to emerge, as evidenced by MIGSI (Minimally Invasive Gesture Sensing Interface) [116].

Geometric Sequencer (GeoSEEq): GeoSEEq works well as a prototype for its intended use. Future work will include eventually making an autonomous, dedicated application with all bugs corrected. Notation could be improved by using the new notation functionality in Max; currently, meter is displayed incorrectly. Being able to visualize groove offsets (elastic MIDI), multiple rhythms in differing meters, and dynamics-related control parameters are all features to be implemented in the future. An open-source, dedicated, idiomatic hardware controller using a touch screen for visualization and interaction will also be explored and applied towards live performance using a matrix spatial speaker system reflecting the GUI [9]. Toussaint discusses isomorphism in applying the rhythmic analysis to a parallel interpretation concerning melody. The Notomoton/eMarimba relationship could be explored in this capacity and is a compelling avenue for future work.
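As a minimal illustration of the Negative Latency bookkeeping described above for EROSS, the sketch below estimates how much downstream processing latency is hidden by the valve lead time; the 50 ms figure comes from the tests mentioned earlier, while the function name and the 30 ms latency value are assumptions for illustration.

    # Minimal sketch of the "Negative Latency" idea: a valve begins moving
    # about 50 ms before the note sounds, and that lead time can absorb
    # downstream processing latency. Numbers are illustrative.
    VALVE_TRAVEL_MS = 50.0  # time for a valve to go fully down (from prior tests)

    def hidden_latency_ms(valve_motion_ms, note_onset_ms, system_latency_ms):
        """Return how much of the system latency is covered by the valve lead time."""
        lead = note_onset_ms - valve_motion_ms   # expected to be about VALVE_TRAVEL_MS
        return min(lead, system_latency_ms)

    # Valve motion detected at t = -50 ms, note onset at t = 0 ms: a 30 ms
    # processing latency would be fully absorbed by the head start.
    print(hidden_latency_ms(valve_motion_ms=-50.0, note_onset_ms=0.0, system_latency_ms=30.0))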

Impact on the Field

This work has had a proven impact in the NIME community. The work has been cited in leading conference articles, journals, Master's theses, doctoral dissertations, and book chapters on electronic music. There are too many projects that cite this work to list (107 citations to date). Relevant notable projects citing this work branch into other instrumental paradigms, research that analyzes orchestral gestures, primary education technology resources, and more. For a full list of publications with citations and accompanying media, visit the author's website4.

Final Thoughts

In conclusion, the system described herein presents important advancements specific to the art of pitched percussion hyper-instrument performance that can be abstracted to other instruments and contexts. The author intends to formalize the framework into a comprehensive, dedicated instrument with a redesigned frame and embedded electronics, along with the addition of an acoustic damper pedal similar to that of the vibraphone.

4www.shawntrail.co

The author would like to further the research into areas including, but not limited to, 4D spatial audio, AR/VR5, and cross-cultural telematic collaborations. Additionally, the author seeks to commission new works for the platform through collaborations with composers and performers, in which new etudes, methods for study, a gesture library, and a system manual can be developed towards formalizing a framework for use by others.

5http://store.steampowered.com/app/778430/OSCKit/

Appendix A

Author publications related to this thesis

In reverse chronological order

1. GeoSEEq: a new musical sequencer that conveys the geometry of rhythmic cycles. S Trail. 2015 (unpublished)


2. El-Lamellophone - A Low-cost, DIY, Open Framework for Acoustic Lemellophone Based Hyperinstruments. S Trail, D MacConnell, L Jenkins, J Snyder, G Tzanetakis, PF Driessen. In Proc. Int. Conf. On New Interfaces for Musical Expression (NIME), 2014.

Abstract: The El-Lamellophone (El-La) is a Lamellophone hyperinstrument incorporating electronic sensors and integrated DSP. An embedded Linux micro-computer supplants the laptop. A piezo-electric pickup is mounted to the underside of the body of the instrument for direct audio acquisition, providing a robust signal with little interference. The signal is used for electric sound-reinforcement, creative signal processing and audio analysis developed in Puredata (Pd). This signal inputs and outputs the micro-computer via stereo 1/8th inch phono jacks. Sensors provide gesture recognition, affording the performer a broader, more dynamic range of musical human-computer interaction (MHCI) over specific DSP functions. The instrument's metal tines (conventionally used for plucking, the traditional lamellophone sound production method) have been adapted to include capacitive touch in order to control a synthesizer. Initial investigations have been made into digitally-controlled electromagnetic actuation of the acoustic tines, aiming to allow performer control and sensation via both traditional Lamellophone techniques, as well as extended playing techniques that incorporate shared human/computer control of the resulting sound. The goal is to achieve this without compromising the traditional sound production methods of the acoustic instrument while leveraging inherent performance gestures with embedded continuous controller values essential to MHCI. The result is an intuitive, performer designed, hybrid electro-acoustic instrument, idiomatic computer interface, and robotic acoustic instrument in one framework.

3. DSRmarimba: low-cost, open source actuated acoustic marimba framework. S Trail, L Jenkins, P Driessen. Proceedings of the International Computer Music Conference (ICMC), 2014

Abstract: This paper outlines the motivation, design and development of the DSRmarimba (DSRm.). This work presents a portable, autonomous, mechanically actuated, digitally controlled musical mallet instrument intended for creative and pedagogical use. Detailed tests performed to optimize technical aspects of the DSRmarimba are described to highlight usability and performance specifications for artists and educators. The DSRm. (Figure 1) is intended to be open-source so that the results are reproducible and expandable using common components with minimal financial constraints and little engineering overhead. Because the field of musical robotics is so new, standardized systems need to be designed from existing paradigms. Such paradigms are typically singular in nature, solely reflecting the idiosyncrasies of the artist and often difficult to reproduce. The DSRm. is an attempt to standardize certain existing and novel robotic pitched-percussion techniques in order to establish a formal system for experimentation.

4. STARI: A self tuning auto-monochord robotic instrument. S Trail, L Jenkins, D MacConnell, G Tzanetakis, M Cheng, P Driessen. IEEE Intl. Conf. on Pacific Communications, Computers and Signal Processing (PacRim), 2013

Abstract: This paper outlines the motivation, design and development of a self-tuning, robotic monochord. This work presents a portable, autonomous musical robotic string instrument intended for creative and pedagogical use. Detailed tests performed to optimize technical aspects of STARI are described to highlight usability and performance specifications for artists and educators. STARI is intended to be open-source so that the results are reproducible and expandable using common components with minimal financial constraints. Because the field of musical robotics is so new, standardized systems need to be designed from existing paradigms. Such paradigms are typically singular in nature, solely reflecting the idiosyncrasies of the artist and often difficult to reproduce. STARI is an attempt to standardize certain existing actuated string techniques in order to establish a formal system for experimentation and pedagogy.

5. The Wiikembe - performer designed lamellophone hyperinstrument for idiomatic musical-DSP interaction. S Trail, G Tzanetakis. PacRim, 2013.

Abstract: The Wiikembe is an augmented Likembe from Zaire, believed to be 100+ years old. A Wiimote affords 3D gesture sensing for musical-HCI. An Arduino interface offers explicit control over DSP functions. Puredata (Pd) scales, converts, and routes control data into Ableton Live. A contact mic is used to acquire a direct audio signal from the Likembe. The audio inputs into a conventional computer audio interface and is routed into Live, which handles event sequencing, DSP, and audio bussing. The result is a compact, intuitive, and robust lamellophone hyperinstrument. The Wiikembe extends the sonic possibilities of the acoustic Likembe without compromising traditional sound production methods or performance techniques. We chose specific sensors and their placement based on constraints regarding the instrument's construction, playing techniques, the author's idiosyncratic compositional approach and sound design requirements. The Wiikembe leverages and combines inherent performance gestures with analogous embedded gestural sensing to achieve unprecedented intimate musical-DSP interaction. Specific gesture recognition techniques and mapping strategies have been standardized using easily sourced and implementable, low-cost components. This work is an effort to establish an implementable pitched-percussion hyperinstrument framework for experimentation and pedagogy with minimal engineering requirements.

6. An Easily Removable, wireless Optical Sensing System (EROSS) for the Trumpet. L Jenkins, S Trail, G Tzanetakis, PF Driessen, W Page. Proceedings of the Int. Conf. On New Interfaces for Musical Expression (NIME), 2013.

Abstract: This paper presents a minimally-invasive, wireless optical sensor system for use with any conventional piston valve acoustic trumpet. It is designed to be easy to install and remove by any trumpeter. Our goal is to offer the extended control afforded by hyperinstruments without the hard to reverse or irreversible invasive modifications that are typically used for adding digital sensing capabilities. We utilize optical sensors to track the continuous position displacement values of the three trumpet valves. These values are transmitted wirelessly and can be used by an external controller. The hardware has been designed to be reconfigurable by having the housing 3D printed so that the dimensions can be adjusted for any particular trumpet model. The result is a low cost, low power, easily replicable sensor solution that offers any trumpeter the ability to augment their own existing trumpet without compromising the instrument's structure or playing technique. The extended digital control afforded by our system is interwoven with the natural playing gestures of an acoustic trumpet. We believe that this seamless integration is critical for enabling effective and musical human computer interaction.

7. Reconfigurable autonomous novel guitar effects (RANGE). D MacConnell, S Trail, G Tzanetakis, P Driessen, W Page. Proceedings of the Int. Conf. On Sound and Music Computing (SMC), 2013.

Abstract: The RANGE guitar is a minimally-invasive hyperinstrument incorporating electronic sensors and integrated digital signal processing (DSP). It introduces an open framework for autonomous music computing eschewing the use of the laptop on stage. The framework uses an embedded Linux microcomputer to provide sensor acquisition, analog-to-digital conversion (ADC) for audio input, DSP, and digital-to-analog conversion (DAC) for audio output. The DSP environment is built in Puredata (Pd). We chose Pd because it is free, widely supported, flexible, and robust. The sensors we selected can be mounted in a variety of ways without compromising traditional playing technique. Integration with a conventional guitar leverages established techniques and preserves the natural gestures of each player's idiosyncratic performing style. The result is an easy to replicate, reconfigurable, idiomatic sensing and signal processing system for the electric guitar requiring little modification of the original instrument.

8. Physical modeling and hybrid synthesis for the Gyil African xylophone. D Godlovitch, S Trail, TF Tavares, G Tzanetakis. Proceedings of the Int. Conf. On Sound and Music Computing (SMC), 2012.

Abstract: We propose a physical model for the Gyil, an African pentatonic idiophone with wooden bars that have similar sonic characteristics to the western marimba. The primary focus is modeling the gourds that are suspended beneath each bar and have a similar role to the tuned tubes below the bars in western mallet instruments. The prominent effect of these resonators is the added buzz that results when the bar is struck. This type of intentional sympathetic distortion is inherent to African instrument design as it helps unamplified instruments be heard above crowds of people dancing and singing. The Gyil's distortion is created by drilling holes on the sides of each gourd and covering them with membranes traditionally made from the silk of spider egg casings stretched across the opening. By analyzing the sonic characteristics of this distortion we have found that the physical mechanisms that create it are highly nonlinear, and we have attempted to model them computationally. In addition to the fully synthetic model we consider a hybrid version where the acoustic sound captured by contact microphones on the wooden bars is fed into a virtual model of the gourd resonators. This hybrid approach simplifies significantly the logistics of traveling with the instrument, as the gourds are bulky, fragile and hard to pack. We propose several variants of the model, and discuss the feedback we received from expert musicians.

9. Direct and surrogate sensing for the Gyil African xylophone. S Trail, TF Tavares, D Godlovitch, G Tzanetakis. Proceedings of the Int. Conf. On New Interfaces for Musical Expression (NIME), 2012.

Abstract: The Gyil is a pentatonic African wooden xylophone with 14-15 keys. The work described in this paper has been motivated by three applications: computer analysis of Gyil performance, live improvised electro-acoustic music incorporating the Gyil, and hybrid sampling and physical modeling. In all three of these cases, detailed information about what is played on the Gyil needs to be digitally captured in real-time. We describe a direct sensing apparatus that can be used to achieve this. It is based on contact microphones and is informed by the specific characteristics of the Gyil. An alternative approach based on indirect acquisition is to apply polyphonic transcription on the signal acquired by a microphone without requiring the instrument to be modified. The direct sensing apparatus we have developed can be used to acquire ground truth for evaluating different approaches to polyphonic transcription and help create a surrogate sensor. Some initial results comparing different strategies to polyphonic transcription are presented.

10. Non-invasive sensing and gesture control for pitched percussion hyper-instruments using the Kinect. S Trail, M Dean, G Odowichuk, TF Tavares, PF Driessen, WA Schloss, G Tzanetakis. Proc. of the Int. Conf. On New Interfaces for Musical Expression (NIME), 2012.

Abstract: Hyper-instruments extend traditional acoustic instruments with sensing technologies that capture digitally subtle and sophisticated aspects of human performance. They leverage the long training and skills of performers while simultaneously providing rich possibilities for digital control. Many existing hyper-instruments suffer from being one of a kind instruments that require invasive modifications to the underlying acoustic instrument. In this paper we focus on the pitched percussion family and describe a non-invasive sensing approach for extending them to hyper-instruments. Our primary concern is to retain the technical integrity of the acoustic instrument and sound production methods while being able to intuitively interface with the computer. This is accomplished by utilizing the Kinect sensor to track the position of the mallets without any modification to the instrument, which enables easy and cheap replication of the proposed hyper-instrument extensions. In addition we describe two approaches to higher-level gesture control that remove the need for additional control devices such as foot pedals and fader boxes that are frequently used in electro-acoustic performance. This gesture control integrates more organically with the natural flow of playing the instrument, providing user selectable control over filter parameters, synthesis, sampling, sequencing, and improvisation using a commercially available low-cost sensing apparatus.

11. Sensor fusion: Towards a fully expressive 3D music control interface. G Odowichuk, S Trail, P Driessen, W Nie, W Page. IEEE International Conference on Pacific Communications, Computers and Signal Processing (PacRim), 2011.

Abstract: This paper describes a study into the realisation of a new method for capturing 3D sound control data. It expands on our established practice with the radiodrum 3D input device by incorporating a computer vision platform that we have developed using the Xbox Kinect motion sensing input device. It also extends our previous work in systemizing a low-cost, open-source 3D gesture sensor system, as a novel method for musically oriented human computer interaction. We discuss in detail the development process and the system performance in different scenarios and outline the future directions of this development.

12. Music Information Robotics: Coping Strategies for Musically Chal- lenged Robots. SR Ness, S Trail, PF Driessen, WA Schloss, G Tzanetakis. International Society for Music Information Retrieval, 2011.

Abstract: In the past few years there has been a growing interest in music robotics. Robotic instruments that generate sound acoustically using actuators have been increasingly developed and used in performances and compositions over the past 10 years. Although such devices can be very sophisticated mechanically, in most cases they are passive devices that directly respond to control messages from a computer. In the few cases where more sophisticated control and feedback is employed it is in the form of simple mappings with little musical understanding. Several techniques for extracting musical information have been proposed in the field of music information retrieval. In most cases the focus has been the batch processing of large audio collections rather than real time performance understanding. In this paper we describe how such techniques can be adapted to deal with some of the practical problems we have experienced in our own work with music robotics. Of particular importance is the idea of self-awareness or proprioception, in which the robot(s) adapt their behavior based on understanding the connection between their actions and sound generation through listening. More specifically we describe techniques for solving the following problems: 1) controller mapping, 2) velocity calibration, and 3) gesture recognition.

Appendix B

Deployment

An important aspect of this research is the perspective of the end user, who is also the design engineer. The author has developed techniques, repertoire, and performance contexts for the platform. This includes multi-media contexts for these new instruments: improvisatory scenarios; notated music and graphical scores for new extended digital techniques; and abstractions into other formats. This work has been deployed in the form of concerts, installations, videos, and presentations, along with several compilations of original recordings and album appearances. A full-length, professionally produced album accompanies this dissertation in an effort to encapsulate this work's potential. Deployment, discography appearances, and media assets can be found on the author's website1.

1. CanonX+4:33=100 '12 - Wiikembe, Range (MacConnell), robotic pianos (Trimpin)

2. TEDx Victoria '12 - Wiikembe, Notomoton

3. NIME '12 - Wiikembe, Fantom Faders, Notomoton

4. SMC '12 - Wiikembe, EMMI robots

5. Music Maker Festival Vancouver '12 - Wiikembe, Notomoton

6. STEIM '13 (Southwoods Vol. 1 - Gary Cassettes) - El-la, Range (MacConnell)

1http://www.shawntrail.co

Appendix C

Potential Patent Claims

This section outlines the innovations of this study. What is claimed is: 1. A digitally interactive pitched percussion system comprising:

• a plurality of augmented pitched idiophones;

• electro-piezos proximate to idiophone;

• a plurality of gesture sensing interfaces;

• gesture and signal processing software;

• a physical model;

• a custom computing platform;

– 2. The system of claim 1, wherein each of said plurality of gesture sensing interfaces is selected from the group consisting of: foot switches, a computer vision based 3D skeletal tracking system, a 9DOF sensor position tracking system, a membrane sensor linear position tracking array, a capacitive touch and proximity sensing array, a pressure sensor array and an optical position tracking sensor array. – 3. The system of claim 2, wherein said interface generates a plurality of digital control information. – 4. The system of claim 3, wherein each of said plurality of controller information is selected from the group consisting of:

assignable momentary and toggle footswitch data; multinode skeletal tracking, including but not limited to mallet differentiation in 3 assignable axes, assignable pitch, yaw, roll and acceleration of position in space; assignable physical linear position tracking; direct and discrete touch; vertical proximity and touch pressure. – 5. The system of claim 4, wherein all of said plurality of gestures derived from the interface are reassignable. – 6. The system of claim 5, wherein said plurality of assignments is converted to various mapping logic. – 7. The system of claim 6, wherein one of said plurality of mapping logic is routed to said gesture and signal processing software.

8. The system of claim 1, wherein each of said plurality of components are selected from the group of gesture and signal processing software comprising:

• gesture recognition;

• gesture prediction;

• pattern differentiation;

• dynamics calibration;

• proprioception.

– 9. The system of claim 3, further comprising a gesture recognition component that analyzes said plurality of 3D gestures; – 10. The system of Claim 4, further comprising a prediction component that anticipates the plurality of control data generated from said gestures; – 11. The system of Claim 7, further comprising a mapping logic automation component; – 12. The system of Claim 1, wherein said mapping logic is applied to actuation control.

13. The system of claim 1, wherein a physical model is described comprising:

• a physical model that replaces material resonators

• resonators of xylophone

• resonators of a traditional Ghanaian xylophone

• surrogate sensor system is used for dynamic control of model

– 14. Said model of Claim 13 reacts to user input. – 15. Said model of Claim 13 employs the piezo system of claim 1. – 16. Said piezo system of Claim 1 is coupled with microphone input. – 17. Said system of Claim 16 comprises said surrogate system of Claim 13. – 18. An algorithm analyzes audio input data. – 19. The algorithm analysis of Claim 18 is correlated to a training library. – 20. Said library of Claim 19 is converted to control data for said model system.

21. The system of claim 1, wherein a computing platform is described comprising:

• a single board computer

• assignable GPI/O

• open architecture

• low latency audio I/O

– 22. Said single board computer of Claim 21 has GPI/O for the sensor acquisition devices described in Claim 2. – 23. Said system of Claim 21 has an open architecture running a Linux OS. – 24. Said system of Claim 21 performs low latency audio I/O supporting the software system of Claim 1.

Bibliography

[1] A. Hollinger, J. Thibodeau, et al. An embedded hardware platform for fungible interfaces. Proc. of the Int. Computer Music Conference (ICMC), pages 56-59, 2010.

[2] A. Kapur, A. Lazier, P. Davidson, R. Wilson, and P. Cook. The electronic sitar controller. NIME, 2004.

[3] A. Kapur, P. Davidson, P. R. Cook, P. Driessen, and W. A. Schloss. Digitizing North Indian Performance. In Proc. of the Int. Computer Music Conf. (ICMC), 2004.

[4] Adam R. Tindale, Ajay Kapur, George Tzanetakis, Peter Driessen, and Andrew Schloss. A comparison of sensor strategies for capturing percussive gestures. Proceedings of the Conference on New Interfaces for Musical Expression (NIME), pages 200-203, 2005.

[5] Adam Tindale, Ajay Kapur, and George Tzanetakis. Training surrogate sensors in musical gesture acquisition systems. IEEE Transactions on Multimedia, vol. 13, no. 1, pp. 50-59, Feb. 2011.

[6] J. M. Adrien. The missing link: Modal synthesis. Representations Of Musical Signals, Cambridge, 1991.

[7] R.M. Aimi. Hybrid percussion: extending physical instruments using sampled acoustics. Massachusetts Institute of Technology, 2007.

[8] A. Akl and S. Valaee. Accelerometer-based gesture recognition via dynamic-time warping, affinity propagation, and compressive sensing. ICASSP, 2010.

[9] Ali Momeni and David Wessel. Characterizing and Controlling Musical Material Intuitively with Geometric Models. NIME, 2003.

[10] W. Anku. Circles and Time: A Theory of Structural Organization of Rhythm in African Music. The Online Journal of the Society for Music Theory, Vol. 6, Number 1, 2000.

[11] A. R. Tindale, A. Kapur, W. A. Schloss, and G. Tzanetakis. Indirect acquisition of percussion gestures using timbre recognition. In Proc. Conf. On Interdisciplinary Musicology (CIM), 2005.

[12] B. Nevile, P. Driessen, and W. A. Schloss. Radio drum gesture detection system using only sticks, antenna and computer with audio interface. Proceedings of the International Computer Music Conference (ICMC), 2006.

[13] S. Barrett and J. Kridner. Bad to the Bone: Crafting Electronic Systems with BeagleBone. Morgan and Claypool, 2013.

[14] Francis Bebey. African Music: A People’s Art. Lawrence Hill Books, 1969.

[15] Judith Becker. Earth, Fire, Śakti, and the Javanese Gamelan. Ethnomusicology, Vol. 32, No. 3 (Autumn, 1988), pp. 385-391.

[16] S. Bilbao. Percussion synthesis based on models of nonlinear shell vibration. IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, no. 4, pp. 872-880, 2010.

[17] James Blades. Percussion Instruments and Their History, pg. 409. The Bold Strummer, Ltd., 2005.

[18] Q. David Bowers. Encyclopedia of automatic musical instruments: Cylinder music boxes, disc music boxes, piano players and player pianos... Incl. a dictionary of automatic musical instrument terms. Vestal Press, 1988.

[19] C. Britt and et al. The emvibe: An electromagnetically actuated vibraphone. Int. Conf. on New Interfaces for Musical Expression (NIME), 2012.

[20] J. Brown. Buzz and rumble: Global pop music and utopian impulse. Social Text, 2010.

[21] M. Burtner. The metasaxophone: concept, implementation, and mapping strategies for a new computer music instrument. Organised Sound, 7(2):201-213, 2002.

[22] M. Burtner. A theory of modulated objects for new shamanic controller design. Proceedings of the 2004 conference on New interfaces for musical expression, 2004.

[23] David Byrne. How Music Works. McSweeney’s Books, 2012.

[24] E. Vincent C. Fevotte and R. Gribonval. Performance measurement in blind audio source separation. IEEE Trans. Audio, Speech and Language Processing, 2006.

[25] E. Vincent C. Fevotte and R. Gribonval. Towards the evaluation of automatic transcription of music. Anais do VI Congresso de Engeharia de Audio, 2008.

[26] R. Carl. Terry Riley's In C (Studies in Musical Genesis, Structure, and Interpretation). Oxford University Press, 1st edition, 2009.

[27] Samuel Charters. The Roots of the Blues: An African Search. M. Boyars, 1981.

[28] John Collins. Highlife Time- Ch. 51 (A) The Musical Ghost in the Machine. Anemsesem Publications Limited, 1994.

[29] Nicolas Collins. Handmade Electronic Music. Routledge, 2006.

[30] P. R. Cook. Physically informed sonic modeling (phism): Synthesis of percussive sounds. Computer Music Journal , vol. 21, no. 3, 1997.

[31] Perry Cook. Principles for Designing Computer Music Controllers. CHI, 2001.

[32] T. Craig and B. Factor. The trumpet midi controller. Cornell, 2008.

[33] T. Tavares D. Godlovitch, S. Trail and G. Tzanetakis. Physical modeling and hybrid synthesis for the gyil African xylophone. In Proc. Int. Conf. Sound and Music Computing (SMC), 2012.

[34] S. Dahl and A. Friberg. Expressiveness of musician’s body movements in per- formances on marimba. Gesture-Based Communication in Human-Computer Interaction, pages 361-362. Springer Berlin/Heidelberg, 2004.

[35] S. Dahl and A. Friberg. Expressiveness of musicians' body movements in performances on marimba. Gesture-Based Communication in Human-Computer Interaction, vol. 2915, pp. 361-362. Springer Berlin/Heidelberg, 2004.

[36] Matthew Wright David Wessel. Problems and Prospects for Intimate Musical Control of Computers. Proceedings of the CHI’01 Workshop on New Interfaces for Musical Expression (NIME-01), Seattle, USA, 2008.

[37] M. Davis. Live-evil. Music Album from Columbia Records, 1971.

[38] M. Demoucron and R. Causs. Sound synthesis of bowed string instruments using a gesture based control of a physical model. in Proc. of the International Symposium on Musical Acoustics, 2000.

[39] D.Overholt. The overtone violin. In Proc. New Interfaces for Musical Expression (NIME), 2005.

[40] C. Drioli and D. Rocchesso. Learning pseudophysical models for sound synthesis and transformation. In Proc. of IEEE Int. Conf. on Systems, Man, and Cybernetics, pp. 1085-1090, 1998.

[41] D. Wessel and M. Wright. Problems and prospects for intimate musical control of computers. Computer Music Journal, vol. 26, no. 3, pp. 11-22, 2002.

[42] E. Berdahl, G. Niemeyer, and J. O. Smith III. Active control of a vibrating string. Acoustical Society of America (ASA) and European Acoustics Association (EAA), 2008.

[43] Elizabeth England. Using Three-Dimensional Modeling to Preserve Cultural Heritage. Library of Congress, 2017.

[44] C. Palacio-Quintin et al. The hyper-flute. NIME, 2003.

[45] N. H. Fletcher. The nonlinear physics of musical instruments. Reports on Progress in Physics, vol. 62, no. 5, p. 723, 1999.

[46] D. Font-Navarrete. File under import: Musical distortion, exoticism, and authenticity in Congotronics. Ethnomusicology Review, 2011.

[47] A. Freed. Novel and forgotten current-steering techniques for resistive multi- touch, duotouch, and polytouch position sensing with pressure. Proc. Int. Conf. New Interfaces for Musical Expression (NIME), 2009.

[48] A. Freed. Kalimba controller. research blog, unknown.

[49] P. Driessen W. Nie G. Odowichuk, S. Trail and W. Page. Sensor fusion: To- wards fully expressive 3d music control interface. In IEEE Pacific Rim Conf. PACRIM), 2011.

[50] W.A. Schloss G. Tzanetakis, A. Kapur and M. Wright. Computational ethno- musicology. Journal of interdisciplinary music studies, 2007.

[51] S. Gelineck. A quantitative evaluation of the differences between knobs and sliders. Proc. Int. Conf. New Interfaces for Musical Expression (NIME), 2009.

[52] S. Gelineck and S. Serafin. A practical approach towards an exploratory framework for physical modeling. Computer Music Journal, vol. 34, no. 2, pp. 51-65, 2010.

[53] R. J. Hanson and C. L. Lawson. Solving least squares problems. Philadelphia, 1995.

[54] R. J. Hanson and C. L. Lawson. Solving least squares problems. Philadelphia, 1995.

[55] I. Hattwick. Mbira controller. unknown, unknown.

[56] S. Heise and J. Loviscach. A versatile expressive percussion instrument with game technology. In Proc. IEEE Int. Conf. on Multimedia and Expo (ICME), 2008.

[57] L. L. Henrique and J. Antunes. Optimal design and physical modelling of mallet percussion instruments. Acta Acustica united with Acustica, vol. 89, no. 6, pp. 948-963, 2003.

[58] G. Hoffman and G. Weinberg. Gesture-based human-robot jazz improvisation. Int. Conf. on Robotics and Automation (ICRA), 2010.

[59] G. Hoffman and G. Weinberg. Shimon: an interactive improvisational robotic marimba player. CHI Extended Abstracts, 2010.

[60] Ibn al-Razzaz al-Jazari (Donald R. Hill, translator). The Book of Knowledge of Ingenious Mechanical Devices (Kitāb fī ma'rifat al-ḥiyal al-handasiyya). Springer, 1974.

[61] J. Liu, Z. Wang, L. Zhong, J. Wickramasuriya, and V. Vasudevan. uWave: Accelerometer-based personalized gesture recognition and its applications. IEEE Int. Conf. on Pervasive Computing and Communications, pages 1-9, 2009.

[62] John Beck et al. Encyclopedia of Percussion, pg. 401. Routledge, 2007.

[63] I.T. Jolliffe. Principal Component Analysis. Springer, 2002.

[64] K. Tsachalinas, K. Tzedaki, P. Cook, S. Psaroudaks, D. Kamarotos, and T. Rikakis. Physical modeling simulation of the ancient Greek Elgin auloi. In Proc. of the Int. Computer Music Conf. (ICMC), 1997.

[65] A. Kapur. The karmetik notomoton: A new breed of musical robot for teaching and performance. Proc. Int. Conf. New Interfaces for Musical Expression (NIME), 2011.

[66] A. Kapur and et al. Ensembles of digitally controlled acoustic instruments. Computer Music Journal, 2011.

[67] Ajay Kapur. A History of Robotic Musical Instruments. ICMC, 2005.

[68] G. Kubik. Lamellophone, in: The New Grove Dictionary of Music and Musicians. Macmillan Publishers, 1981.

[69] S. Trail G. Tzanetakis L. Jenkins, W. Page and P. Driessen. An easily removable, wireless optical sensing system (eross) for the trumpet. Proc. Int. Conf. on New Interfaces for Musical Expression (NIME), 2013.

[70] M. Boch C. Southworth L. S. Pardue, A. Boch and A. Rigopulos. Gamelan elektrika: An electronic Balinese Gamelan. In Proc. Int. Conf. on New Interfaces for Musical Expression (NIME), 2011.

[71] H. Leeuw. The electrumpet, a hybrid electro-acoustic instrument. Proc. Int. Conf. on New Interfaces for Musical Expression (NIME), 2009.

[72] L. F. Ludwig. Real-time floor controller for control of music signal processing, mixing, video, lighting, and other systems. Patent, 2004.

[73] M. Bretan, M. Cicconet, R. Nikolaidis, and G. Weinberg. Developing and composing for a robotic musician using different modes of interaction. Proceedings of the International Computer Music Conference (ICMC), 2012.

[74] M. Brynskov, R. Lunding, and L. S. Vestergaard. DUL Radio: A light-weight, wireless toolkit for sketching in hardware. Tangible and Embedded Interaction Work-in-Progress Workshop Proceedings, 2011.

[75] M. Karjalainen, C. Erkut, and L. Savioja. Compilation of unified physical models for efficient sound synthesis. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2003.

[76] M. Macon, A. McCree, W.-M. Lai, and V. Viswanathan. Efficient analysis/synthesis of percussion musical instrument sounds using an all-pole model. In IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), vol. 6, 1998.

[77] T. Apel M. Puckette and D. Zicarelli. Real-time audio analysis tools for pd and msp. In International Computer Music Conference, 1998.

[78] M. Wright, R. Cassidy, and M. F. Zbyszynski. Audio and gesture latency measurements on Linux and OSX. CCRMA, 2004.

[79] D. MacConnell and et al. Reconfigurable autonomous novel guitar effects (range). Proc. Int. Conf. Sound and Music Computing (SMC), 2013.

[80] T. Machover. Hyperinstruments: A composer’s approach to the evolution of intelligent musical instruments. Organized Sound, pages 67-76, 1991.

[81] M. Mathews. The Radio Drum as a synthesizer controller. In Proc. Int. Com- puter Music Conference (ICMC), 1989.

[82] A. McPherson. The magnetic resonator piano: electronic augmentation of an acoustic musical instrument. Journal of New Music Research, 2010.

[83] A. McPherson. Bela: An embedded platform for low-latency feedback control of sound. The Journal of the Acoustical Society of America, 2017.

[84] J. McVay. Mechbass: A systems overview of a new four-stringed robotic bass guitar. School of Engineering, Victoria University, Technical Report, 2012.

[85] J. McVay. Mechbass: A systems overview of a new four-stringed robotic bass guitar. School of Engineering Victoria University, Tech. Rep., 2012.

[86] Pat Metheny. The Orchestrion Project. Nonesuch, 2013.

[87] M. J. Yoo, J. W. Beak, and I. K. Lee. Creating musical expression using Kinect. Proc. New Interfaces for Musical Expression (NIME), Oslo, Norway, 2011.

[88] J. Murphy. The helio: A study of membrane potentiometers and long force sensing resistors for musical interfaces. Proc. Int. Conf. on New Interfaces for Musical Expression (NIME), 2010.

[89] V. Naranjo. The Dagari music of Ghana. Percussive Notes, 1992.

[90] V. Naranjo. My introduction to the gyil. Percussive Notes, 1996.

[91] B. Neill and B. Jones. Posthorn. Proceedings of the 28th international conference extended abstracts on Human factors in computing systems, 2010.

[92] B. Neill and B. Jones. Posthorn. Proceedings of the 28th international conference extended abstracts on Human factors in computing systems, pages 3107-3112. ACM, 2010.

[93] S. Ness et al. Music information robotics: Coping strategies for musically challenged robots. Proc. International Society for Music Information Retrieval Conf. (ISMIR), 2011.

[94] Bernhard Niedermayer. Non-negative matrix division for the automatic transcription of polyphonic music. In Proc. of the ISMIR, 2008.

[95] Friedrich Nietzsche. Die Geburt der Tragödie aus dem Geiste der Musik. E. W. Fritzsch, 1872.

[96] J.H. Kwabena Nketia. The music of Africa. Norton, 1974.

[97] J. A. Olivares. Electric Kalimbatronique. unknown, unknown.

[98] H. F. Olson. Music, Physics and Engineering. 2nd ed. Dover Publications Inc., 1967.

[99] H. F. Olson. Music, Physics and Engineering. Dover Publications Inc., 1967.

[100] P. Cook, D. Morrill, and J. Smith. A MIDI control and performance system for brass instruments. Proc. of the International Computer Music Conference (ICMC), 1993.

[101] P. F. Driessen, T. E. Darcie, and B. Pillay. The effects of network delay on tempo in musical performance. Computer Music Journal, vol. 35, no. 1, pp. 76-89, 2011.

[102] J. Paradiso and N. Gershenfeld. Musical applications of electric field sensing. Computer Music Journal, 21(2):69-89, 1997.

[103] Peter van Kranenburg, Daniel Peter Biro, Steven Ness, and George Tzanetakis. A computational investigation of melodic contour stability in Jewish Torah trope performance traditions. 12th International Society for Music Information Retrieval Conference (ISMIR), 2011.

[104] Tim Ponting. Simmons Silicon Mallet. Music Technology Magazine, Jan., 1988.

[105] M. S. Puckette. Pure data: another integrated computer music environment. Proceedings Second Intercollege Computer Music Concerts, 1996.

[106] R. Jones, P. Driessen, A. Schloss, and G. Tzanetakis. Digitizing North Indian Performance: Extension and Preservation Using Multimodal Sensor Systems, Machine Learning and Robotics. Dissertation, University of Victoria, 2007.

[107] R. Jones, P. Driessen, A. Schloss, and G. Tzanetakis. Advancing the Art of Percussion. Dissertation, University of Victoria, 2009.

[108] R. Jones, P. Driessen, A. Schloss, and G. Tzanetakis. A force-sensitive surface for intimate control. Proceedings of the Conference on New Interfaces for Musical Expression (NIME), pp. 236-241, 2009.

[109] Don Randel. Cycle. The Harvard Dictionary of Music, Cambridge, MA: Belknap Press, p. 218, 1986.

[110] Robert Van Rooyen, Andrew Schloss, and George Tzanetakis. Voice Coil Actuators for Percussion Robotics. Proc. Int. Conf. on New Interfaces for Musical Expression (NIME), 2017.

[111] F. Rocha and J. Malloch. The hyper-kalimba: Developing an augmented instrument from a performer's perspective. In Proc. Int. Conf. Sound and Music Computing (SMC), 2009.

[112] Robert Rowe. Machine Musicianship. MIT Press, 2001.

[113] S. Trail, T. F. Tavares, D. Godlovich, and G. Tzanetakis. Direct and surrogate sensing for the Gyil African xylophone. Proc. Int. Conf. on New Interfaces for Musical Expression (NIME), 2012.

[114] S. Trail, M. Dean, T. Tavares, G. Odowichuk, P. Driessen, A. Schloss, and G. Tzanetakis. Non-invasive sensing and gesture control for pitched percussion hyperinstruments using the Kinect. Proc. Int. Conf. New Interfaces for Musical Expression (NIME), 2012.

[115] H. Sakoe and S. Chiba. Dynamic programming algorithm optimization for spoken word recognition. IEEE Transactions on Acoustics, Speech and Signal Processing, 1978.

[116] Sarah Reid, Ryan Gaston, Colin Honigman, and Ajay Kapur. MIGSI (Minimally Invasive Gesture Sensing Interface). Proc. Int. Conf. on New Interfaces for Musical Expression (NIME), 2016.

[117] D. Schlessinger and J. O. Smith III. The Kalichord: A physically modeled electroacoustic plucked string instrument. In Proc. Int. Conf. New Interfaces for Musical Expression (NIME), 2009.

[118] W. Andrew Schloss and David A. Jaffe. Intelligent musical instruments: The future of musical performance or the demise of the performer? Journal of New Music Research, pages 183-193, 2008.

[119] E. Singer. Lemur guitarbot: Midi robotic string instrument. Proc. Int. Conf. New Interfaces for Musical Expression (NIME), 2003.

[120] P. Smaragdis and J.C. Brown. Non-negative matrix factorization for polyphonic music transcription. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, pages 177-180, 2003.

[121] H.-C. Steiner. Firmata: Towards making microcontrollers act like extensions of the computer. Proc. Int. Conf. on New Interfaces for Musical Expression (NIME), 2009.

[122] M. Sterling and M. Bocko. Empirical physical modeling for bowed string instruments. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 433-436, 2010.

[123] T. Lidy, C. N. Silla Jr., O. Cornelis, F. Gouyon, A. Rauber, C. A. A. Kaestner, and A. L. Koerich. On the suitability of state-of-the-art music information retrieval methods for analyzing, categorizing and accessing non-western and ethnic music collections. Signal Processing, 2010.

[124] T. Takala, J. Hiipakka, M. Laurson, and V. Välimäki. An Expressive Synthesis Model for Bowed String Instruments. ICMA, 2000.

[125] J. Thibodeau. Trumpet augmentation: Rebirth and symbiosis of an acoustic instrument. Master's thesis, McGill University, 2010.

[126] A. R. Tindale. A hybrid method for extended percussive gesture. In Proc. New Interfaces for Musical Expression (NIME), 2007.

[127] A.R. Tindale. A hybrid method for extended percussive gesture. In Proc. New Interfaces for Musical Expression (NIME), 2007.

[128] T. Machover and J. Chung. Hyperinstruments: Musically Intelligent and Interactive Performance and Creativity Systems. Proceedings of the International Computer Music Conference (ICMC), 1989.

[129] G. Toussaint. The Geometry of Musical Rhythm: What Makes a Good Rhythm Good? Chapman and Hall/CRC, January 11, 2013.

[130] S. Trail. El-lamellophone - a low-cost DIY open framework for acoustic lemellophone based hyperinstruments. Proc. Int. Conf. on New Interfaces for Musical Expression (NIME), 2014.

[131] S. Trail et al. Non-invasive sensing and gesture control for pitched percussion hyper-instruments using the Kinect. Proceedings of the Conference on New Interfaces for Musical Expression (NIME), 2012.

[132] S. Trail and G. Tzanetakis. The Wiikembe: performer designed lamellophone hyperinstrument for idiomatic musical-DSP interaction. In IEEE Pacific Rim Conf. on Communications, Computers and Signal Processing (PacRim), 2013.

[133] S. Trail and G. Tzanetakis. The Wiikembe: performer designed lamellophone hyperinstrument for idiomatic musical-DSP interaction. IEEE Pacific Rim Conf. on Communications, Computers and Signal Processing (PacRim), 2013.

[134] C. Traube and J.O. Smith. Estimating the plucking point on a guitar string. In Proc. Digital Audio Effects, 2000.

[135] Trevor Wiggins and Joseph Kobom. Xylophone Music From Ghana. White Cliffs Media Company, 1992.

[136] Ajay Kapur, Eric Singer, Manj Benning, George Tzanetakis, and Trimpin. Integrating hyperinstruments, musical robots and machine musicianship for North Indian classical music. Proceedings of the Conference on New Interfaces for Musical Expression (NIME), 2007.

[137] Troy Rogers, Steven Kemper, and Scott Barton. MARIE: Monochord-Aerophone Robotic Instrument Ensemble. Proceedings of the 15th International Conference on New Interfaces for Musical Expression (NIME), 2015.

[138] G. M. Tucker and Roger Parker. Cyclic Form. The Oxford Companion to Music, edited by Alison Latham, Oxford and New York: Oxford University Press, 2001.

[139] Luca Turchet, Andrew McPherson, and Carlo Fischione. Smart Instruments: Towards an Ecosystem of Interoperable Devices Connecting Performers and Audiences. Proceedings of the Sound and Music Computing Conference, 2016.

[140] D. Tymoczko. A Geometry of Music: Harmony and Counterpoint in the Extended Common Practice. Oxford University Press, 1st edition, 2011.

[141] Tiago F. Tavares, Gabrielle Odowichuk, Sonmaz Zehtabi, and George Tzanetakis. Audio-visual vibraphone transcription in real time. IEEE 14th International Workshop on Multimedia Signal Processing (MMSP), 2012.

[142] unknown. Electric lemellophone. Wikipedia, unknown.

[143] V. Doutaut, D. Matignon, and A. Chaigne. Numerical simulations of xylophones. II. Time-domain modeling of the resonator and of the radiated sound pressure. Journal of the Acoustical Society of America, 104(3):1633-1647, 1998.

[144] Mark Vail. Vintage Synthesizers. Backbeat Books, 1993.

[145] Michael B. Vercelli. Ritual Communication Through Percussion: Identity and Grief Governed by Birifor Gyil Music. unknown, unknown.

[146] G. Weinberg and S. Driscoll. Toward robotic musicianship. Computer Music Journal, 2006.

[147] D. Wessel and M. Wright. Problems and prospects for intimate musical control of computers. Computer Music Journal, pp. 11-22, 2001.

[148] Tim Wilson. Mikael Seifu: Zelalem. The Quietus, 2016.

[149] E. L. Wong et al. Designing Wii controller: a powerful musical instrument in an interactive music performance system. Proc. of the 6th Int. Conf. on Advances in Mobile Computing and Multimedia, 2008.

[150] Matthew Wright. Open Sound Control: an enabling technology for musical networking. unknown, 2005.

[151] Diana Young. The Hyperbow Controller: Real-Time Dynamics Measurement of Violin Performance. Proceedings of the Conference on New Interfaces for Musical Expression (NIME), 2007.

[152] Ernst Zacharias. Hohner: Guitaret Manual. Hohner, Trossingen, Germany, 1963.

[153] A. Zoran and J. A. Paradiso. The chameleon guitar: guitar with a replaceable resonator. Journal of New Music Research, 2011.