Scalable Audio Processing Across Heterogeneous Distributed Resources an Investigation Into Distributed Audio Processing for Music Information Retrieval
Total Page:16
File Type:pdf, Size:1020Kb
Scalable Audio Processing Across Heterogeneous Distributed Resources An Investigation into Distributed Audio Processing for Music Information Retrieval Ahmad Al-Shakarchi School of Computer Science, Cardiff University ii Declaration This work has not previously been accepted in substance for any degree and is not con- currently submitted in candidature for any degree. Signed ............................................. (candidate) Date ......................... STATEMENT 1 This thesis is being submitted in partial fulfillment of the requirements for the degree of Ph.D. Signed ............................................. (candidate) Date ......................... STATEMENT 2 This thesis is the result of my own independent work/investigation, except where oth- erwise stated. Other sources are acknowledged by explicit references. Signed ............................................. (candidate) Date ......................... STATEMENT 3 I hereby give consent for my thesis, if accepted, to be available for photocopying and for inter-library loan, and for the title and summary to be made available to outside or- ganisations. Signed ............................................. (candidate) Date ......................... iii iv Acknowledgements The work presented in this thesis could not have been completed without the help and support of various researchers, scientists, family members and close friends that I am fortunate enough to have in my life. I would like to thank the Triana developers and researchers that I have worked closely with during my time as a PhD student, and that built the foundation for my work; Dr. Matthew Shields, Dr. Andrew Harrison, and Dr. Ian Wang. Extra special thanks are extended to Kieran Evans at Cardiff University for his help and expertise - I am eternally indebted! I am also particularly grateful for the assistance of researchers and developers at var- ious institutes around the world that helped me conduct such large scale experiments - researchers at: Laboratoire de l'Acc´el´erateurLin´eaire(Paris, France), SZTAKI (Hungarian Academy of the Sciences, Hungary), the Information Sciences Institute at the University of Southern California and the Center for Computation and Technology at Louisiana State University (USA), the University of Coimbra (Portugal), and not forgetting researchers and staff at Cardiff University. I would like to thank the Computer Science and Informatics department at Cardiff for allowing me the opportunity to perform this research. I would - of course - like to extend a very special `thank you' to my supervisor Dr. Ian Taylor, for his patience, his ongoing dedication to my work, his vision and superior intellect (!), and most of all his friendship throughout my PhD. Thank you for all the cool opportunities you've provided me! I could not have asked for a better and more understanding supervisor and am happy to have developed a great friend over the years. Rock on! To my label mates and closest associates; Dan, Ruff, Ralph, Mud and Huw, who may v not know too much about my academic career, but whose antics have provided me with the levity required to maintain my sanity1 over the years. Thank you! To Sarah, who sacrificed so much and showed massive commitment, love and support to me. You helped me more than you know and I will never forget it. I'm also indebted to the entire Kingman family - especially Roger and Gaynor - for their support during hard times. Finally, I would like to thank my family - Ali, Israa, Khitam, Mazin - for supporting me in every way imaginable; I know I asked too much of you all and hope I can repay you with unwavering love, support and patience, in turn. Especially to my father, for whom all of my work and perseverance was really for. Thank you, dad. 1Relatively speaking. vi Abstract Audio analysis algorithms and frameworks for Music Information Retrieval (MIR) are expanding rapidly, providing new ways to discover non-trivial information from audio sources, beyond that which can be ascertained from unreliable metadata such as ID3 tags. MIR is a broad field and many aspects of the algorithms and analysis components that are used are more accurate given a larger dataset for analysis, and often require extensive computational resources. This thesis investigates if, through the use of modern distributed computing techniques, it is possible to design an MIR system that is scalable as the number of participants in- creases, which adheres to copyright laws and restrictions, whilst at the same time enabling access to a global database of music for MIR applications and research. A scalable platform for MIR analysis would be of benefit to the MIR and scientific community as a whole. A distributed MIR platform that encompasses the creation of MIR algorithms and workflows, their distribution, results collection and analysis, is presented in this thesis. The framework, called DART - Distributed Audio Retrieval using Triana - is designed to facilitate the submission of MIR algorithms and computational tasks against either remotely held music and audio content, or audio provided and distributed by the MIR researcher. Initially a detailed distributed DART architecture is presented, along with simulations to evaluate the validity and scalability of the architecture. The idea of a pa- rameter sweep experiment to find the optimal parameters of the Sub-Harmonic Summation (SHS) algorithm is presented, in order to test the platform and use it to perform useful and real-world experiments that contribute new knowledge to the field. DART is tested on various pre-existing distributed computing platforms and the feasi- bility of creating a scalable infrastructure for workflow distribution is investigated through- vii out the thesis, along with the different workflow distribution platforms that could be inte- grated into the system. The DART parameter sweep experiments begin on a small scale, working up towards the goal of running experiments on thousands of nodes, in order to truly evaluate the scalability of the DART system. The result of this research is a functional and scalable distributed MIR research platform that is capable of performing real world MIR analysis, as demonstrated by the successful completion of several large scale SHS parameter sweep experiments across a variety of dif- ferent input data - using various distribution methods - and through finding the optimal parameters of the implemented SHS algorithm. DART is shown to be highly adaptable both in terms of the distributed MIR analysis algorithm, as well as the distribution mech- anism used. viii Contents 1 Introduction 1 1.0.1 Overview . .1 1.0.2 Hypothesis and Research Question . .4 1.0.3 Approach . .5 1.0.4 Contributions to the State of the Art . .5 1.0.5 Project History . .7 1.0.6 Layout & chapter plan/thesis outline . .7 2 Background 11 2.1 Overview . 11 2.2 Review of Musical Concepts . 12 2.2.1 Sound . 12 2.2.2 Pitch . 12 2.2.3 Intensity . 14 2.2.4 Timbre . 14 2.3 Digital Audio . 15 2.3.1 Digital Signal Processing . 15 2.3.2 Analog and digital signals . 16 2.3.3 Sampling . 16 2.3.4 Time domain representation . 18 2.3.5 Frequency Domain Representation . 18 2.3.6 Fourier Transform . 19 2.3.7 Why the Frequency Domain? . 23 2.4 Pitch Detection Algorithms . 24 2.4.1 Time Domain Pitch Detection . 25 ix CONTENTS 2.4.2 Frequency Domain Pitch Detection . 26 2.5 Music Information Retrieval . 30 2.5.1 Distributed MIR systems . 33 2.5.2 Music Recommendation Systems . 35 2.6 Triana Programming Environment . 39 2.6.1 Triana Data Types . 40 2.6.2 Audio Toolkit . 42 2.6.3 Creating and Running Triana Programs . 46 2.6.4 Related Music Data-flow Tools . 47 2.6.5 JavaSound API . 51 3 Requirements Analysis, Underlying Infrastructures and Architecture 55 3.1 DART Overview . 55 3.2 Investigation into the Use of a P2P Architecture . 62 3.2.1 Peer to Peer Computing . 62 3.2.2 A P2P DART Case Study . 64 3.2.3 Proposed Work Package Assignment and Results Retrieval Archi- tecture . 66 3.2.4 Peer Overview . 70 3.2.5 Workflow Design . 71 3.3 Simulation Results . 71 3.3.1 Distributed Simulations . 71 3.4 Conclusions . 77 3.4.1 P2P Technology Feasibility . 77 3.5 Relevant Distributed Computing Technologies . 80 3.5.1 XtremWeb . 80 3.5.2 EGEE / EGI . 83 3.5.3 GRID5000 . 84 3.5.4 BOINC . 85 3.5.5 AtticFS . 89 3.5.6 Distributed Computing Summary . 89 4 Design 91 4.1 Chapter Overview . 91 x CONTENTS 4.2 Design Overview . 92 4.3 The Sub-Harmonic Summation Algorithm and Triana . 93 4.3.1 LoadSound Unit . 94 4.3.2 Fast Fourier Transform Unit . 95 4.3.3 OneSide . 97 4.3.4 AmplitudeSpectrum . 97 4.3.5 PitchDetection . 97 4.3.6 NoteMapper . 98 4.3.7 Unit Design . 98 4.4 Mapping Triana Workflows to the DART Execution Environment . 100 4.4.1 Analysis of Triana Application Structure . 102 4.4.2 Designing a standalone Triana Workflow framework application . 103 4.4.3 User Interface . 105 4.5 DART Experiment Design Choices . 106 4.5.1 Experiment Design Overview . 107 4.6 Input Data Design . 111 4.6.1 Acoustic Guitar . 112 4.6.2 Distorted Guitar . 113 4.6.3 Oboe . 114 4.6.4 Grand Piano . 114 4.6.5 Violin . 115 4.6.6 Tubular Bells . 115 4.6.7 Input Data Design Summary . 116 4.7 Requirements Analysis . 116 4.8 Distribution of Workflows . 119 4.8.1 XtremWeb . 120 4.8.2 BOINC . 121 4.9 Results Analysis . 122 4.9.1 Comparison of Distribution Platforms . 124 4.10 Summary . ..