<<

The Single Hidden Layer Neural Network Based Classifiers for Han Chinese Folk Songs

Sui Sin Khoo

A thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy, Faculty of Engineering and Industrial Sciences, Swinburne University of Technology, Australia

2013


Abstract

This thesis investigates the application of a few powerful machine learning techniques in music classification using a symbolic database of folk songs: The Essen Folksong Collection.

Firstly, a meaningful and representative theory-based method of encoding Chinese folk songs, called the musical feature density map (MFDMap), is developed to enable efficient classification by machines. This encoding method effectively encapsulates useful musical information in a form that is machine-readable and, at the same time, easily interpreted by humans. This encoding will aid ethnomusicologists in future folk song research.

The extreme learning machine (ELM), an extremely fast machine learning algorithm that utilizes the structure of the single-hidden layer feedforward neural network (SLFN), is employed as the machine classifier. This algorithm performs at a very fast speed and has good generalization performance. The application of the ELM classifier and its enhanced variant, the regularized extreme learning machine (R-ELM), to real-world multi-class folk song classification is examined in this thesis. The effectiveness of the MFDMap encoding technique combined with the ELM classifiers for multi-class folk song classification is verified.

The finite impulse response extreme learning machine (FIR-ELM) is a relatively new learning algorithm. It is a powerful algorithm in the sense that its robustness is reflected in the design of both the input weights and the output weights. This algorithm can effectively remove input disturbances and undesired frequency components in the input data. The capability of the FIR-ELM in solving complex real-world multi-class classification is examined in this thesis. The MFDMap performed more effectively with the FIR-ELM, and the classification accuracy using the FIR-ELM is significantly better than that of both the ELM and the R-ELM.

The techniques of folk song classification proposed in this thesis are further investigated on different data samples. These techniques are also applied to European folk songs, whose culture is very different from the Chinese culture, to investigate the flexibility of the learning machines. In addition, the roles and relationships of four music elements, solfege, interval, duration and duration ratio, are investigated.

Acknowledgement

I would like to express my gratitude to my supervisor, Professor Zhihong Man, who has given me both guidance and courage to pursue the work in this thesis. Special thanks for his patience with my slow responses and for his advice that led me along the path.

I would also like to express my utmost gratefulness to my parents for their love and constant support, interest and encouragement that have led me to this point in my life. I would love to express my deepest appreciation to my dearest brother, who has led and inspired me along my way.

Sweet thanks to Aiji, Kevin, Fei Siang, Hai, and Tuan Do for all the laughter and companionship during my years of research at Swinburne.



Declaration

This is to certify that:

1. This thesis contains no material which has been accepted for the award to the candidate of any other degree or diploma, except where due reference is made in the text of the examinable outcome.

2. To the best of the candidate’s knowledge, this thesis contains no material previously published or written by another person except where due reference is made in the text of the examinable outcome.

3. Where the work is based on joint research and publications, the relative contributions of the respective authors are disclosed.

______

Sui Sin Khoo, 2013




Table of Contents

ABSTRACT ..... i
ACKNOWLEDGEMENT ..... iii
LIST OF FIGURES ..... xi
LIST OF TABLES ..... xiii
LIST OF ACRONYMS ..... xix

1. INTRODUCTION ..... 1
   1.1 Motivation ..... 1
   1.2 Contribution ..... 3
   1.3 Organization of the Thesis ..... 4

2. LITERATURE REVIEW ..... 7
   2.1 Artificial Neural Network ..... 7
       2.1.1 McCulloch-Pitts Threshold Processing Unit ..... 8
       2.1.2 Rosenblatt's Perceptron ..... 9
       2.1.3 Multi-Layer Perceptron ..... 11
       2.1.4 Learning Algorithms ..... 13
       2.1.5 Extreme Learning Machine ..... 16
   2.2 Music Representations ..... 20
       2.2.1 Audio Format ..... 21
       2.2.2 Symbolic Format ..... 32
   2.3 Discussion ..... 41

3. MUSIC REPRESENTATION AND THE MUSICAL FEATURE DENSITY MAP ..... 43
   3.1 Ethnomusicology Background on Geographical Based Han Chinese Folk Song Classification ..... 44
       3.1.1 Rationale for the Choice of the Five Classes ..... 49
   3.2 Music Data Set – The Essen Folksong Collection ..... 51
       3.2.1 The **Kern Representation ..... 52
       3.2.2 An Example of Han Chinese Folk Song in **Kern Format ..... 53
       3.2.3 Assumptions in **Kern Version of the Essen Folksong Collection ..... 59
   3.3 Music Elements and Encoding ..... 60
       3.3.1 Pitch Elements ..... 61
       3.3.2 Duration Elements ..... 66
   3.4 The Musical Feature Density Map ..... 72
       3.4.1 Advantage of the Musical Feature Density Map ..... 79
       3.4.2 Future Enhancement to the Musical Feature Density Map ..... 92

4. THE EXTREME LEARNING MACHINE FOLK SONG CLASSIFIER ..... 93
   4.1 Introduction ..... 94
   4.2 Extreme Learning Machine ..... 95
   4.3 Regularized Extreme Learning Machine ..... 98
   4.4 Experiment Design and Setting ..... 100
       4.4.1 Data Pre-Processing and Post-Processing ..... 100
       4.4.2 Parameter Setting ..... 109
   4.5 Experiment Results ..... 110
   4.6 Discussion ..... 122
   4.7 Conclusion ..... 125

5. THE FINITE IMPULSE RESPONSE EXTREME LEARNING MACHINE FOLK SONG CLASSIFIER ..... 127
   5.1 Introduction ..... 127
   5.2 Finite Impulse Response Extreme Learning Machine ..... 129
   5.3 Experiment Design and Setting ..... 135
       5.3.1 Data Pre-Processing and Post-Processing ..... 136
       5.3.2 Parameter Setting ..... 137
   5.4 Experiment Results ..... 138
   5.5 Discussion ..... 149
   5.6 Conclusion ..... 152

6. A TWO-CASE EUROPEAN FOLK SONG CLASSIFICATION ..... 155
   6.1 Introduction ..... 155
   6.2 Experiment Design and Setting ..... 156
       6.2.1 The Musical Feature Density Map ..... 156
       6.2.2 Data Set ..... 160
       6.2.3 Parameter Setting ..... 160
   6.3 Experiment Results ..... 161
   6.4 Discussion ..... 164
   6.5 Conclusion ..... 167

7. CONCLUSION ..... 169
   7.1 Summary ..... 169
   7.2 Future Works ..... 171

REFERENCES ..... 173
APPENDIX A. FOLK SONG CLASSIFICATION USING AUDIO REPRESENTATION ..... 189
LIST OF PUBLICATIONS ..... 213



List of Figures

2.1 An example of a Threshold Processing Unit ..... 9
2.2 A single-hidden layer feedforward neural network ..... 12
2.3 The flow diagram of the construction of a beat histogram ..... 30
3.1 Map of the three main rivers: the Yellow River, the Yangtze River and the Pearl River ..... 46
3.2 Map of the regions in China with the five classes studied in this thesis highlighted ..... 50
3.3 The music score of a Jiangsu folk song – Si Ji Ge ..... 54
3.4 A **kern representation of the Jiangsu folk song – Si Ji Ge ..... 55
3.5 An example of a Jiangsu folk song encoded using solfege representation ..... 64
3.6 An example of a Jiangsu folk song encoded using interval representation ..... 65
3.7 The seven most commonly used durations ..... 67
3.8 Examples of tie notes and their equivalence in duration ..... 67
3.9 Examples of dotted notes and their equivalence in duration ..... 67
3.10 Examples of triplets and their equivalence in duration ..... 68
3.11 An example of a Jiangsu folk song encoded using duration representation ..... 69
3.12 An example of a Jiangsu folk song encoded using duration ratio representation ..... 71
3.13 The flow chart for constructing a MFDMap ..... 74
3.14 The music score and the encoded solfege, interval, duration and duration ratio representations (Step 1 to 4 in constructing Case 1 MFDMap) ..... 75
3.15 The Case 1 MFDMap for Shanxi folk song – Zou Xi Kou ..... 77
3.16 The music score and the encoded solfege, interval, duration and duration ratio representations (Step 1 to 4 in constructing Case 2 MFDMap – rests omitted) ..... 80
3.17 The Case 2 MFDMap for Shanxi folk song – Zou Xi Kou ..... 82
3.18 Example of Class 1 folk song using windowing method ..... 84
3.19 Example of Class 2 folk song using windowing method ..... 85
3.20 Example of Class 3 folk song using windowing method ..... 85
3.21 Example of Class 4 folk song using windowing method ..... 86
3.22 Example of Class 5 folk song using windowing method ..... 86
3.23 Example of Class 1 folk song using Case 1 MFDMap ..... 87
3.24 Example of Class 2 folk song using Case 1 MFDMap ..... 87
3.25 Example of Class 3 folk song using Case 1 MFDMap ..... 88
3.26 Example of Class 4 folk song using Case 1 MFDMap ..... 88
3.27 Example of Class 5 folk song using Case 1 MFDMap ..... 89
3.28 Example of Class 1 folk song using Case 2 MFDMap ..... 89
3.29 Example of Class 2 folk song using Case 2 MFDMap ..... 90
3.30 Example of Class 3 folk song using Case 2 MFDMap ..... 90
3.31 Example of Class 4 folk song using Case 2 MFDMap ..... 91
3.32 Example of Class 5 folk song using Case 2 MFDMap ..... 91
4.1 A single-hidden layer feedforward neural network ..... 96
5.1 A single hidden layer neural network with linear neurons and time-delay elements ..... 130
6.1 An example of raw musical data of Austrian folk song ..... 157
6.2 An example of raw musical data of German folk song ..... 157
6.3 An example of a MFDMap of Austrian folk song ..... 158
6.4 An example of a MFDMap of German folk song ..... 158
6.5 The FIR-ELM network structure with linear neurons and time-delay elements ..... 161
6.6 Classification accuracy of the low-pass FIR-ELM with 100 hidden neurons (MFDMap: interval, duration and duration ratio) ..... 166
6.7 Classification accuracy of four filters FIR-ELM with cutoff frequency 0.1 (MFDMap: interval, duration and duration ratio) ..... 167


List of Tables

3.1 The solfege encoding reference table (all tonics start within principal octave) ..... 63
3.2 List of durations and the encoded representation ..... 70
3.3 List of encoded music representations and their respective occurrence percentage (Step 5 to 7 in constructing Case 1 MFDMap) ..... 76
3.4 List of encoded music representations and their respective occurrence percentage (Step 5 to 7 in constructing Case 2 MFDMap — rests omitted) ..... 81
4.1 Selected list of reduced MFDMaps and their respective list of features (Case 1, notes and rests) ..... 103
4.2 Selected list of reduced MFDMaps and their respective list of features (Case 2, only notes) ..... 106
4.3 Classification accuracy of the ELM classifier using Case 1 and Case 2 MFDMap with the original map size ..... 111
4.4 Classification accuracy of the ELM classifier using Case 1 and Case 2 MFDMap with x = 3 ..... 112
4.5 Classification accuracy of the ELM classifier using Case 1 and Case 2 MFDMap with x = 5 ..... 112
4.6 Classification accuracy of the ELM classifier using Case 1 and Case 2 MFDMap with x = 10 ..... 113
4.7 Classification accuracy of the ELM classifier using Case 1 and Case 2 MFDMap with x = 15 ..... 113
4.8 Classification accuracy of the ELM classifier using Case 1 and Case 2 MFDMap with x = 20 ..... 114
4.9 Classification accuracy of the ELM classifier using Case 1 and Case 2 MFDMap with x = 30 ..... 114
4.10 Classification accuracy of the ELM classifier using Case 1 and Case 2 MFDMap with x = 40 ..... 115
4.11 Classification accuracy of the ELM classifier using Case 1 and Case 2 MFDMap with x = 50 ..... 115
4.12 Classification accuracy of the R-ELM classifier using Case 1 and Case 2 MFDMap with the original map size ..... 116
4.13 Classification accuracy of the R-ELM classifier using Case 1 and Case 2 MFDMap with x = 3 ..... 116
4.14 Classification accuracy of the R-ELM classifier using Case 1 and Case 2 MFDMap with x = 5 ..... 117
4.15 Classification accuracy of the R-ELM classifier using Case 1 and Case 2 MFDMap with x = 10 ..... 117
4.16 Classification accuracy of the R-ELM classifier using Case 1 and Case 2 MFDMap with x = 15 ..... 118
4.17 Classification accuracy of the R-ELM classifier using Case 1 and Case 2 MFDMap with x = 20 ..... 118
4.18 Classification accuracy of the R-ELM classifier using Case 1 and Case 2 MFDMap with x = 30 ..... 119
4.19 Classification accuracy of the R-ELM classifier using Case 1 and Case 2 MFDMap with x = 40 ..... 119
4.20 Classification accuracy of the R-ELM classifier using Case 1 and Case 2 MFDMap with x = 50 ..... 120
4.21 Confusion matrix for Case 1 MFDMap with map size 71 (x = 15) at 8000 hidden neurons, using the ELM classifier ..... 120
4.22 Confusion matrix for Case 2 MFDMap with map size 63 (x = 15) at 8000 hidden neurons, using the ELM classifier ..... 121
4.23 Confusion matrix for Case 1 MFDMap with map size 121 (x = 3) at 3000 hidden neurons, using the R-ELM classifier ..... 121
4.24 Confusion matrix for Case 2 MFDMap with map size 63 (x = 15) at 5000 hidden neurons, using the R-ELM classifier ..... 122
5.1 Classification accuracy using Case 1 MFDMap with the original map size (map size = 172, ω_c = 0.6, d/γ = 0.001) ..... 139
5.2 Classification accuracy using Case 1 MFDMap with x = 3 (map size = 121, ω_c = 0.6, d/γ = 0.001) ..... 140
5.3 Classification accuracy using Case 1 MFDMap with x = 5 (map size = 101, ω_c = 0.6, d/γ = 0.001) ..... 140
5.4 Classification accuracy using Case 1 MFDMap with x = 10 (map size = 81, ω_c = 0.6, d/γ = 0.001) ..... 141
5.5 Classification accuracy using Case 1 MFDMap with x = 15 (map size = 71, ω_c = 0.6, d/γ = 0.001) ..... 141
5.6 Classification accuracy using Case 1 MFDMap with x = 20 (map size = 63, ω_c = 0.6, d/γ = 0.001) ..... 142
5.7 Classification accuracy using Case 1 MFDMap with x = 30 (map size = 55, ω_c = 0.6, d/γ = 0.001) ..... 142
5.8 Classification accuracy using Case 1 MFDMap with x = 40 (map size = 47, ω_c = 0.6, d/γ = 0.001) ..... 143
5.9 Classification accuracy using Case 1 MFDMap with x = 50 (map size = 40, ω_c = 0.6, d/γ = 0.001) ..... 143
5.10 Classification accuracy using Case 2 MFDMap with the original map size (map size = 145, ω_c = 0.6, d/γ = 0.001) ..... 144
5.11 Classification accuracy using Case 2 MFDMap with x = 3 (map size = 102, ω_c = 0.6, d/γ = 0.001) ..... 144
5.12 Classification accuracy using Case 2 MFDMap with x = 5 (map size = 88, ω_c = 0.6, d/γ = 0.001) ..... 145
5.13 Classification accuracy using Case 2 MFDMap with x = 10 (map size = 73, ω_c = 0.6, d/γ = 0.001) ..... 145
5.14 Classification accuracy using Case 2 MFDMap with x = 15 (map size = 63, ω_c = 0.6, d/γ = 0.001) ..... 146
5.15 Classification accuracy using Case 2 MFDMap with x = 20 (map size = 58, ω_c = 0.6, d/γ = 0.001) ..... 146
5.16 Classification accuracy using Case 2 MFDMap with x = 30 (map size = 49, ω_c = 0.6, d/γ = 0.001) ..... 147
5.17 Classification accuracy using Case 2 MFDMap with x = 40 (map size = 44, ω_c = 0.6, d/γ = 0.001) ..... 147
5.18 Classification accuracy using Case 2 MFDMap with x = 50 (map size = 37, ω_c = 0.6, d/γ = 0.001) ..... 148
5.19 Confusion matrix for Case 1 MFDMap with x = 15 at 500 hidden neurons (map size = 71, ω_c = 0.6, d/γ = 0.001) ..... 148
5.20 Confusion matrix for Case 2 MFDMap with x = 15 at 500 hidden neurons (map size = 63, ω_c = 0.6, d/γ = 0.001) ..... 149
5.21 Classification accuracy of the RPROP, ELM, R-ELM, FIR-ELM and SVM classifier ..... 149
6.1 The fifteen MFDMaps ..... 159
6.2 Classification accuracy (%) using one music element in the MFDMap ..... 162
6.3 Classification accuracy (%) using two music elements in the MFDMap ..... 162
6.4 Classification accuracy (%) using three music elements in the MFDMap ..... 163
6.5 Confusion matrix for MFDMap using interval, duration and duration ratio elements ..... 163
6.6 Classification accuracy of the RPROP, ELM, R-ELM, FIR-ELM and SVM classifier ..... 163
A.1 Classification accuracy (%) of the RPROP classifier using median ..... 191
A.2 Classification accuracy (%) of the RPROP classifier using mean ..... 191
A.3 Classification accuracy (%) of the RPROP classifier using variance ..... 192
A.4 Classification accuracy (%) of the RPROP classifier using median and mean ..... 192
A.5 Classification accuracy (%) of the RPROP classifier using median and variance ..... 193
A.6 Classification accuracy (%) of the RPROP classifier using mean and variance ..... 193
A.7 Classification accuracy (%) of the RPROP classifier using median, mean and variance ..... 194
A.8 Classification accuracy (%) of the ELM classifier using median ..... 194
A.9 Classification accuracy (%) of the ELM classifier using mean ..... 195
A.10 Classification accuracy (%) of the ELM classifier using variance ..... 195
A.11 Classification accuracy (%) of the ELM classifier using median and mean ..... 196
A.12 Classification accuracy (%) of the ELM classifier using median and variance ..... 196
A.13 Classification accuracy (%) of the ELM classifier using mean and variance ..... 197
A.14 Classification accuracy (%) of the ELM classifier using median, mean and variance ..... 197
A.15 Classification accuracy (%) of the low-pass FIR-ELM classifier using median ..... 198
A.16 Classification accuracy (%) of the low-pass FIR-ELM classifier using mean ..... 198
A.17 Classification accuracy (%) of the low-pass FIR-ELM classifier using variance ..... 199
A.18 Classification accuracy (%) of the low-pass FIR-ELM classifier using median and mean ..... 199
A.19 Classification accuracy (%) of the low-pass FIR-ELM classifier using median and variance ..... 200
A.20 Classification accuracy (%) of the low-pass FIR-ELM classifier using mean and variance ..... 200
A.21 Classification accuracy (%) of the low-pass FIR-ELM classifier using median, mean and variance ..... 201
A.22 Classification accuracy (%) of the high-pass FIR-ELM classifier using median ..... 201
A.23 Classification accuracy (%) of the high-pass FIR-ELM classifier using mean ..... 202
A.24 Classification accuracy (%) of the high-pass FIR-ELM classifier using variance ..... 202
A.25 Classification accuracy (%) of the high-pass FIR-ELM classifier using median and mean ..... 203
A.26 Classification accuracy (%) of the high-pass FIR-ELM classifier using median and variance ..... 203
A.27 Classification accuracy (%) of the high-pass FIR-ELM classifier using mean and variance ..... 204
A.28 Classification accuracy (%) of the high-pass FIR-ELM classifier using median, mean and variance ..... 204
A.29 Classification accuracy (%) of the band-pass FIR-ELM classifier using median ..... 205
A.30 Classification accuracy (%) of the band-pass FIR-ELM classifier using mean ..... 205
A.31 Classification accuracy (%) of the band-pass FIR-ELM classifier using variance ..... 206
A.32 Classification accuracy (%) of the band-pass FIR-ELM classifier using median and mean ..... 206
A.33 Classification accuracy (%) of the band-pass FIR-ELM classifier using median and variance ..... 207
A.34 Classification accuracy (%) of the band-pass FIR-ELM classifier using mean and variance ..... 207
A.35 Classification accuracy (%) of the band-pass FIR-ELM classifier using median, mean and variance ..... 208
A.36 Classification accuracy (%) of the band-stop FIR-ELM classifier using median ..... 208
A.37 Classification accuracy (%) of the band-stop FIR-ELM classifier using mean ..... 209
A.38 Classification accuracy (%) of the band-stop FIR-ELM classifier using variance ..... 209
A.39 Classification accuracy (%) of the band-stop FIR-ELM classifier using median and mean ..... 210
A.40 Classification accuracy (%) of the band-stop FIR-ELM classifier using median and variance ..... 210
A.41 Classification accuracy (%) of the band-stop FIR-ELM classifier using mean and variance ..... 211
A.42 Classification accuracy (%) of the band-stop FIR-ELM classifier using median, mean and variance ..... 211


List of Acronyms

ANN      artificial neural network
BH       beat histogram
BP       backpropagation
bpm      beats-per-minute
DFT      discrete Fourier transform
DWT      discrete wavelet transform
ELM      extreme learning machine
ERM      empirical risk minimization
EsAC     Essen Associative Code
FFT      fast Fourier transform
FIR      finite impulse response
FIR-ELM  finite impulse response extreme learning machine
FNN      feedforward neural network
FPH      folded pitch histogram
LPC      linear predictive coding
MFCC     Mel-frequency cepstral coefficients
MIDI     Musical Instrument Digital Interface
MLP      multi-layer perceptron
MFDMap   musical feature density map
OSC      Open Sound Control
PH       pitch histogram
R-ELM    regularized extreme learning machine
RMS      root mean square
RPROP    resilient propagation
SACF     summary enhanced autocorrelation function
SC       spectral centroid
SF       spectral flux
SR       spectral roll-off
SLFN     single-hidden layer feedforward neural network
SRM      structural risk minimization
SVM      support vector machine
TPU      threshold processing unit
UPH      unfolded pitch histogram
ZC       zero-crossing


Chapter 1

Introduction

This thesis will investigate the application of a few powerful learning machines in music classification using a symbolic database of folk songs: the Essen Folksong Collection [1]. A meaningful and representative theory-based method of encoding Chinese folk songs is first developed to enable classification by learning machines. This encoding will aid ethnomusicologists in future folk song research, and the superiority of the finite impulse response extreme learning machine (FIR-ELM) for music classification is confirmed.

1.1 Motivation

The single-hidden layer feedforward neural network (SLFN) is the simplest and most popular structure of multi-layer perceptrons. It has been shown in [2-3] that SLFNs with any continuous bounded nonlinear activation function, or any arbitrary (continuous or non-continuous) bounded activation function that has unequal limits at infinity, can approximate any continuous function and implement any classification application given a sufficiently large number of hidden neurons. Such an architecture has vast applications, particularly in pattern recognition.

In recent years, an emerging technology called the extreme learning machine (ELM) [4] has been attracting attention within the machine learning domain. The ELM is a learning algorithm designed specifically for single-hidden layer feedforward neural networks. Unlike conventional gradient descent-based algorithms, its main attractions are a very fast learning speed and good generalization performance. The ELM has wide applications in the pattern classification domain. Some examples are handwritten character recognition [5-6], classification of bioinformatics datasets [7-11], financial credit scoring [12-13], internet-based information processing [14-17] and music genre classification [18].

The finite impulse response extreme learning machine (FIR-ELM), an enhanced variation of the ELM which has been theoretically proven to greatly improve the robustness of the ELM, was proposed in [19]. The FIR-ELM is also designed for single-hidden layer feedforward neural networks. This algorithm adopts the concept of the FIR filter in the design of the hidden layer of the neural network to effectively remove input disturbances and undesired frequency components. This modification has greatly improved the robustness of the original ELM algorithm, especially in handling noisy data. In addition, an objective function that includes both the weighted sum of the output error squares and the weighted sum of the output weight squares is minimized in the output weight space of the neural network to compute a set of optimal output weights, further improving the robustness of the neural network. This new algorithm was employed for a real-world binary classification task with a bioinformatics dataset in [20]. However, until now there has been no application of such an algorithm to any real-world multi-class classification problem.

Chinese folk songs are an important part of Chinese culture. They are a valuable source for humanities research. They reflect the history, society, customs, tradition and everyday life of the nation. They are the faithful companion of the people in their daily life, serving as a form of entertainment and as an aid in labouring work. They serve as a medium to transfer and exchange knowledge and information, to express feelings, thoughts and emotions, to communicate and to entertain.

Chinese folk songs have a significant influence on the development of other forms of traditional music, including traditional dance music, opera, instrumental music and quyi (曲艺) [21]. Many instrumental and dance pieces are adapted or rearranged from folk songs. Chinese folk songs also have an active influence on court music, religious music and cultivated music. In addition, many contemporary composers produce works that use folk songs, or components of folk songs, as their themes, and works that reflect great influence from folk songs.

Chinese folk songs are unquestionably a very important asset of humanity. This thesis intends to contribute to preserving and sustaining this important art.

1.2 Contribution

The main contributions of this thesis are summarized as follows:

• A novel music encoding method is developed for encoding Chinese folk songs. This encoding method utilizes the symbolic representations of the musical elements and enables music to be represented in a manner that is as close to human perception as possible.

• The ELM technique is successfully implemented for folk song classification using real-world Han Chinese folk song data set.

• The FIR-ELM, an improved version of the ELM, gives a better outcome in solving folk song classification. The capability of such an algorithm in multi-class classification is verified. In addition, a potentially useful method of encoding songs is demonstrated which may be helpful in future ethnomusicology research of Chinese folk songs.

• The developed song encoding technique and the machine learning based classification algorithms are then applied to European folk songs and the performance and usefulness are successfully verified.


1.3 Organization of the Thesis

The main contents of this thesis are organized as follows:

Chapter 2 presents a brief overview of the artificial neural networks (ANNs), focusing on the SLFNs and the conventional learning algorithms developed for the network structure. A brief review on the techniques used for the representations of music in machine classification is included.

Chapter 3 discusses the ethnomusicology background for geographically based Han Chinese folk song classification, and the format and musical contents of the real-world data set employed for the research in this thesis. The music elements employed to characterize each class of folk songs and their respective methods of representation are presented. Finally, the novel technique of developing a feature map to meaningfully represent folk songs for machine classification without loss of musical meaning is proposed.

Chapter 4 presents an outline of the extreme learning machine (ELM) and the regularized extreme learning machine (R-ELM) algorithms, followed by a detailed description on the experiments’ design and settings for the implementation of machine classification. This chapter also includes a careful discussion on the technique of automatic classification for Chinese folk songs.

Chapter 5 investigates the capability of a new robust algorithm called the finite impulse response extreme learning machine (FIR-ELM) on multi-class classification problems. At the same time, the enhancement to the performance of automatic classification for Han Chinese folk songs is tested using such algorithms on a series of different experiments.

Chapter 6 presents a two-case European folk song classification task using the conclusions derived in Chapter 5, to further investigate the success rate of such techniques on folk songs of other cultures.


Chapter 7 concludes the research activities in this thesis and presents a summary of the findings. Some suggestions for future work are included in this chapter.




Chapter 2

Literature Review

This chapter presents a brief overview of the artificial neural networks, focusing particularly on the structure of the single-hidden layer feedforward neural networks and the conventional learning algorithms applicable to this network structure. A brief review on the techniques for the representations of music in machine classification is also included.

2.1 Artificial Neural Network

The human brain is a highly complex system. It is capable of performing parallel computation in a non-linear manner. Neurons in the human brain can be organized to perform multiple tasks such as pattern recognition, motor control and perception. Artificial neural networks (ANNs) mimic the organization and functionality of the human brain. The work on ANNs, commonly referred to simply as "neural networks", is vast and usually mimics the natural behaviour and phenomena of such a thinking system.

In order to have the capability of performing complex tasks, neural networks employ a massive interconnection of simple computing cells, which are usually referred to as "neurons" or "processing units". A good definition of a neural network is as follows [22]:


A neural network is a massively parallel distributed processor that has a natural propensity for storing experiential knowledge and making it available for use. It resembles the brain in two respects: 1. Knowledge is acquired by the network through a learning process. 2. Interneuron connection strengths known as synaptic weights are used to store the knowledge.

The learning algorithm is the procedure used to perform the learning process; it functions by modifying the synaptic weights in the network in an orderly fashion so as to achieve a desired design objective.

2.1.1 McCulloch-Pitts Threshold Processing Unit

The McCulloch-Pitts Threshold Processing Unit (TPU) is a concept developed by McCulloch and Pitts [22-24] in 1943. It can be considered the "initial" structure of the ANN. This model only takes binary inputs (0 or 1), each of which is connected to a fixed weight; a bias term is also included. The output of the model is obtained by multiplying the inputs with the weights, summing them together with the bias, and passing the result through a threshold activation function. The output is also in binary form.

An example of the TPU model is shown in Figure 2.1. The computation of the

TPU is as follows. For a sample input data vector x = [x_1, x_2, ..., x_n], the output of the model is

y = g(x) = g\left( \sum_{i=1}^{n} w_i x_i + b \right)    (2.1)

where w_i is the weight connecting the ith input, b is the threshold term and g(x) is the threshold activation function. In the TPU, the weights are decimal numbers which rank the relative importance of each input, and the threshold is a small value that has the effect of applying an affine transformation to the output y. The threshold activation function, g(x), is defined as follows:


g(x) = \begin{cases} 1 & \text{if } x \geq 0 \\ 0 & \text{if } x < 0 \end{cases}    (2.2)

Figure 2.1: An example of a Threshold Processing Unit.
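As a small illustration of (2.1) and (2.2), the Python sketch below implements a threshold processing unit with NumPy. The weights, bias and the AND-style example inputs are hypothetical values chosen only for demonstration; they do not come from the thesis.

```python
import numpy as np

def tpu_output(x, w, b):
    """McCulloch-Pitts threshold processing unit: binary output per (2.1)-(2.2)."""
    a = np.dot(w, x) + b           # weighted sum of the inputs plus the bias
    return 1 if a >= 0 else 0      # threshold activation g(.)

# Example: a unit that fires only when both binary inputs are 1 (logical AND)
print(tpu_output(np.array([1, 1]), w=np.array([1.0, 1.0]), b=-1.5))  # -> 1
print(tpu_output(np.array([1, 0]), w=np.array([1.0, 1.0]), b=-1.5))  # -> 0
```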

2.1.2 Rosenblatt’s Perceptron

The first perceptron was developed by Frank Rosenblatt [25-26] in 1958. It was the first model proposed for learning with a teacher (supervised learning). It is the simplest form of a neural network used for classification of patterns that are linearly separable. The perceptron resembles the structure of a TPU: it has a single neuron, weights and a bias. Unlike the TPU, the perceptron has adjustable synaptic weights. In order to tune the synaptic weights, an error-correction rule known as the perceptron convergence algorithm was developed. The synaptic weights are adjusted on an iteration-by-iteration basis.

In the perceptron model, the input vector is defined as

In the perceptron model, the input vector is defined as x(n) = [+1, x_1(n), x_2(n), ..., x_m(n)]^T, where the fixed input "+1" corresponds to the bias term b, and n denotes the time-step in applying the algorithm. Correspondingly, the weight vector is defined as w(n) = [b, w_1(n), w_2(n), ..., w_m(n)]^T. Then, the linear combiner output of the neuron can be written in the compact form

y(n) = \sum_{i=0}^{m} w_i(n) x_i(n) = w^T(n) x(n)    (2.3)

In order for the perceptron to function properly, the two classes C_1 and C_2 must be linearly separable. This means that the patterns must be sufficiently separated from each other so that the decision surface consists of a hyperplane. Equivalently, there exists a weight vector w such that

w^T x > 0  for every input vector x belonging to class C_1
w^T x ≤ 0  for every input vector x belonging to class C_2    (2.4)

The error-correction algorithm for adapting the weights in the perceptron can be summarized as follows [26]:

1. Initialization. Set w(0) = 0. Then, perform the following computations for time- step n = 1,2,…

2. Activation. At time-step n, activate the perceptron by applying the continuous-valued input vector x(n) and desired response d(n).

3. Computation of actual response. Compute the actual response y(n) of the perceptron as

y(n) = \mathrm{sgn}[w^T(n) x(n)]    (2.5)

where sgn(·) is the signum function.

4. Adaptation of weight vector. Update the weight vector of the perceptron to obtain

w(n+1) = w(n) + η [d(n) − y(n)] x(n)    (2.6)


where

d(n) = \begin{cases} +1 & \text{if } x(n) \text{ belongs to class } C_1 \\ -1 & \text{if } x(n) \text{ belongs to class } C_2 \end{cases}    (2.7)

η is the learning-rate parameter, a positive constant limited to the range 0 < η ≤ 1, and the difference d(n) – y(n) plays the role of the error signal.

5. Continuation. Increase time-step n by one and go back to step 2.
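The five steps above can be collected into a short training routine. The following Python sketch is one possible reading of the perceptron convergence algorithm; the toy data, the learning rate and the fixed number of epochs are assumptions made for illustration (the original algorithm simply continues until the weights converge).

```python
import numpy as np

def train_perceptron(X, d, eta=0.5, epochs=20):
    """Perceptron convergence algorithm of Section 2.1.2.

    X : (N, m) array of input patterns; d : (N,) array of labels in {+1, -1}.
    Returns the weight vector w = [b, w_1, ..., w_m].
    """
    Xa = np.hstack([np.ones((X.shape[0], 1)), X])   # prepend the fixed input +1 (bias)
    w = np.zeros(Xa.shape[1])                       # step 1: w(0) = 0
    for _ in range(epochs):
        for x_n, d_n in zip(Xa, d):                 # steps 2-5: sweep over time-steps
            y_n = 1.0 if w @ x_n >= 0 else -1.0     # step 3: y(n) = sgn(w^T(n) x(n))
            w += eta * (d_n - y_n) * x_n            # step 4: error-correction update (2.6)
    return w

# Toy linearly separable example (hypothetical data)
X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -1.5], [-2.0, -0.5]])
d = np.array([1, 1, -1, -1])
print(train_perceptron(X, d))
```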

2.1.3 Multi-Layer Perceptron

The multi-layer perceptron (MLP) [22] extends the capability of Rosenblatt's perceptron from classifying linearly separable patterns to solving non-linear problems. The typical structure of an MLP consists of one or more hidden layers in between the input layer and output layer, forming a cascade structure of perceptrons. The input signals are propagated through the network in a forward direction, on a layer-by-layer basis. The MLP is also known as the multi-layer feedforward neural network (FNN) due to the direction of the propagation of signals.

The structure of an MLP can be of any size. The simplest MLP is the single-hidden layer feedforward neural network (SLFN), whose structure consists of only one hidden layer besides the input and output layers.

2.1.3.1 Single-Hidden Layer Feedforward Neural Network

A single-hidden layer feedforward neural network has three network layers: the input layer, the hidden layer and the output layer. The input layer consists of sensory units called input neurons that receive activation signals from an external source and then supply the respective elements of the activation pattern (input vector) to the neurons in the second layer, i.e. the hidden layer. The hidden layer consists of computational nodes called hidden neurons and serves to intervene between the input layer and the output layer. It acts as a pre-processor that receives the input pattern from the input layer and projects it

into the feature space in order for the features to be more easily separated. Finally, the output layer, which consists of computational nodes that are called output neurons, receives the pre-processed pattern from the hidden layer and performs further computation to produce a set of output signals that constitutes the overall response of the neural network to the set of activation patterns supplied by the input neurons. It is to be noted that the input neurons are non-computational nodes. They simply receive activation signals and supply them to the hidden layer for computation.

The network structure of a single-hidden layer feedforward neural network is shown in Figure 2.2. In this neural network, there are n input neurons, Ñ hidden neurons and m output neurons. The analytic function corresponding to the SLFN in Figure 2.2 can be written as follows. The output of the jth hidden neuron is obtained by first forming a weighted linear combination of all n input values and adding a bias to give

a_j = \sum_{i=1}^{n} w_{ji} x_i + b_j    (2.8)

for j = 1,2,3,…, Ñ with wji the weight connecting the ith input to the jth hidden neuron and bj the bias term for the jth hidden neuron.

Figure 2.2: A single-hidden layer feedforward neural network.


Then, the linear sum in (2.8) is transformed using a non-linear activation function g(x) to give the activated output

y_j = g(a_j).    (2.9)

The final outputs of the neural network are obtained by transforming the activations of the hidden neurons using a second layer of processing elements, i.e. the output neurons in the output layer. Thus, for each output neuron k, a linear combination of the outputs of the hidden neurons is formed to give

a_k = \sum_{j=1}^{Ñ} β_{kj} y_j + b_k    (2.10)

for k = 1,2,3,…,m where βkj is the output weight connecting the jth hidden neuron to the kth output and bk the bias term for the kth output.

Similarly, an activation function is applied to the linear sum in (2.10) to give the final output

o_k = g̃(a_k).    (2.11)

The notation g̃(x) is used to emphasize that the activation function for the output layer need not be the same as the activation function for the hidden layer. Often, the activation function for the output neurons is different from that for the hidden neurons because the output neurons perform a different role than the hidden neurons. In most cases, instead of a non-linear function, a linear activation function is used for the output layer.
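A minimal sketch of the forward computation in (2.8)-(2.11) is given below, assuming a tanh activation for the hidden layer and a linear output layer; the dimensions, random values and the activation choice are illustrative assumptions rather than settings used in the thesis.

```python
import numpy as np

def slfn_forward(x, W, b_hidden, beta, b_out, g=np.tanh):
    """Forward pass of the SLFN in Figure 2.2, following (2.8)-(2.11).

    x        : (n,) input vector
    W        : (N_hidden, n) input weights, row j holds w_j
    b_hidden : (N_hidden,) hidden biases b_j
    beta     : (m, N_hidden) output weights
    b_out    : (m,) output biases b_k
    g        : hidden-layer activation (tanh here; the output layer is kept linear)
    """
    a = W @ x + b_hidden        # (2.8): weighted sums of the hidden neurons
    y = g(a)                    # (2.9): activated hidden outputs
    o = beta @ y + b_out        # (2.10)-(2.11) with a linear output activation
    return o

# Tiny usage example with arbitrary sizes (n = 4 inputs, 6 hidden neurons, m = 3 outputs)
rng = np.random.default_rng(0)
o = slfn_forward(rng.normal(size=4), rng.normal(size=(6, 4)), rng.normal(size=6),
                 rng.normal(size=(3, 6)), rng.normal(size=3))
print(o.shape)  # (3,)
```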

2.1.4 Learning Algorithms

The functionality of a multi-layer perceptron is its capability in learning a suitable mapping from a given data set. The efficiency of the MLP mainly depends on the learning algorithm. The learning algorithm determines the ideal adjustments and settings to the parameters of the MLP. There are two broad categories of learning algorithm: supervised learning and unsupervised learning.


Supervised learning is also known as active learning [22], where an external "teacher" is supplied. The role of the teacher is to provide the desired or targeted response for a training vector in order for the network to learn a good mapping of the input-output patterns. The desired response represents the optimum action to be performed by the neural network. The network parameters are then adjusted under the combined influence of the training vector and the error signal (the error signal is defined as the difference between the actual response of the network and the desired response). The adjustment continues iteratively in a step-by-step fashion with the aim that the neural network will eventually emulate the teacher.

Unsupervised learning, also known as self-organized learning [22], is the opposite of supervised learning. There is no teacher present in the learning. Rather, the parameters of the network are optimized with respect to a task-independent measure of the input. An internal representation of the input is formed without influence from any external source.

The techniques employed in this thesis are supervised learning techniques. Hence, discussions on unsupervised learning will not be included.

2.1.4.1 Gradient Descent-Based Algorithms

The most popular, also one of the simplest learning algorithms for the MLPs is the gradient descent method (also known as steepest descent). In gradient descent method, the network learning process starts with an initial random weight vector. The weight vector is then iteratively updated in steps such that, at each step, it moves a short distance in the direction of the negative gradient (i.e. the greatest rate of decrease) of the error surface. At each successive step the value of the error function, E, will decrease, eventually leading to a weight vector at which

∇E = 0 . (2.12)

The error function, typically the mean sum of squares, is defined as


E = \frac{1}{m} \sum_{k=1}^{m} (o_k − t_k)^2    (2.13)

where m is the number of outputs, ok is the actual neural network response of the kth output neuron in (2.10) and (2.11) and tk is the corresponding target for a particular input pattern xn.

In order to reduce the error value, E, the network weights are updated as follows:

w_{ji}^{new} = w_{ji}^{old} + Δw_{ji}    (2.14)

where wji is the weight connecting the ith input to the jth hidden neuron and

Δw_{ji} = −η \frac{\partial E}{\partial w_{ji}}    (2.15)

η is the learning rate parameter for the gradient descent algorithm. The output weights are updated using the analogous expressions.
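For concreteness, the sketch below performs one gradient-descent update of the output weights of an SLFN with linear output neurons, using the error function (2.13) and the update rules (2.14)-(2.15). Restricting the update to the output layer, and the learning rate value, are simplifying assumptions for illustration; a full implementation would also backpropagate the error to the input weights.

```python
import numpy as np

def output_weight_step(y_hidden, beta, b_out, t, eta=0.01):
    """One gradient-descent update of the output weights per (2.14)-(2.15).

    For linear output neurons and E = (1/m) * sum_k (o_k - t_k)^2 as in (2.13),
    the gradient is dE/d(beta_kj) = (2/m) * (o_k - t_k) * y_j.
    """
    m = t.shape[0]
    o = beta @ y_hidden + b_out                      # actual outputs, (2.10) with linear output
    err = o - t                                      # o_k - t_k
    grad_beta = (2.0 / m) * np.outer(err, y_hidden)  # dE/d(beta_kj)
    grad_b = (2.0 / m) * err                         # dE/d(b_k)
    return beta - eta * grad_beta, b_out - eta * grad_b
```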

The main advantage of gradient descent-based methods is the relatively simple computation of the algorithm. However, although the optimization always arrives at a minimum, depending on the initial starting point this minimum might be a local minimum instead of the global minimum. Unfortunately, once the algorithm converges to a minimum, there is no further way to decrease the error value and the optimization process has to be restarted. Hence, the solution obtained will often be non-optimal.

The other problem with gradient descent-based methods is the long learning time. As the optimization process of the gradient descent algorithms is performed in an iterative, step-by-step manner, they require a long learning time. In addition, the number of parameters that need tuning also leads to a time-consuming process.

2.1.4.2 Discriminant-Based Algorithms

The discriminant-based algorithms are fairly different from the gradient descent algorithms. Unlike the gradient descent algorithms which approximate the parameters


for the feature probability distribution, the discriminant-based algorithms focus on finding the discriminants that separate members of different classes by estimating these discriminants directly.

The support vector network, also known as the support vector machine (SVM) [27-28], is one of the most popular algorithms in this group. It is an alternative method proposed to overcome the problems of the gradient descent algorithms. Unlike the gradient descent algorithms, the SVM is a non-probabilistic binary linear classifier which utilizes Lagrange multipliers in its output weight optimization operation.

The SVM was initially designed to solve binary classification problems. The SVM works by non-linearly mapping the input vectors to a very high-dimensional feature space where a linear decision surface can then be constructed. Unlike the gradient descent algorithms, the non-linear mapping function of the SVM is decided based on a priori knowledge and the output layer decision surface is then computed using the optimization method.

One of the drawbacks of the SVM is the complexity of the optimization procedure and the high degrees of polynomial used for forming the decision surfaces. This leads to a considerably long learning time. The running times of the state-of-the-art SVM learning algorithms scale approximately quadratically with the number of training samples. In addition, as the SVM is designed for binary classification, in order to solve a multi-class classification problem the algorithm has to break down the single multi-class problem into multiple binary classification problems.

2.1.5 Extreme Learning Machine

The major bottlenecks of the gradient descent-based feedforward neural network, such as the one described in Section 2.1.4.1, are the very slow learning speed and the issue of converging to local minima. It has been shown in [29] and [30] that single-hidden layer feedforward neural networks with N hidden neurons and arbitrarily chosen input weights (weights connecting input layer to hidden layer) can learn N distinct observations with arbitrarily small error. This method has been proved to produce good


generalization performance and extremely fast learning speed on both artificial and real applications in [31]. It has also been further proved in [32] that SLFNs with arbitrarily assigned input weights and hidden layer biases and with almost any non-zero activation function are capable of universally approximating any continuous functions on any compact input sets.

The extreme learning machine (ELM), an emerging learning algorithm that utilizes the structure of a single-hidden layer feedforward neural network, has been proved to overcome the limitations of both the gradient descent-based algorithms and the support vector machine through its technique of parameter assignment, and has been proved to outperform both algorithms [33].

Unlike conventional gradient descent-based algorithms, the ELM randomly assigns the input weights and hidden layer biases and deterministically computes the optimal output weights using the generalized inverse of the hidden layer outputs. Hence, the ELM's learning speed can be many times faster than that of conventional gradient descent-based algorithms while obtaining better performance. In addition, the generalized inverse operation allows the ELM to reach the smallest training error and the smallest norm of weights.

The ELM uses the network structure as shown in Figure 2.2, i.e. a single-hidden layer feedforward neural network. For a dataset with N distinct samples {(X, T) | X = [x_1, x_2, …, x_N], T = [t_1, t_2, …, t_N]}, where x_i = [x_i1, x_i2, …, x_in]^T ∈ R^n is the input vector and t_i = [t_i1, t_i2, …, t_im]^T ∈ R^m is the target vector, the SLFN with Ñ hidden neurons can be written as

\sum_{j=1}^{Ñ} β_j g(w_j \cdot x_i + b_j) = o_i    (2.16)

for i = 1,2,…,N, where β_j = [β_j1, β_j2, …, β_jm]^T is the output weight vector connecting the jth hidden neuron and the output neurons, w_j = [w_j1, w_j2, …, w_jn]^T is the input weight vector connecting the input neurons and the jth hidden neuron, b_j is the bias of the jth hidden neuron, w_j · x_i denotes the inner product of w_j and x_i, g(x) is the activation function and o_i = [o_i1, o_i2, …, o_im]^T ∈ R^m is the output vector with respect to the input vector x_i = [x_i1, x_i2, …, x_in]^T. It is to be noted that the output neurons are linear, i.e. the activation function of the output neurons is a linear function.

For the SLFN with Ñ hidden neurons and activation function g(x) to approximate N data samples with zero error, there exist βj, wj and bj such that

\sum_{j=1}^{Ñ} β_j g(w_j \cdot x_i + b_j) = t_i    (2.17)

for i = 1,2,…,N. Equation (2.17) can then be written compactly in matrix form Hβ = T, where

H(w_1, …, w_Ñ, b_1, …, b_Ñ, x_1, …, x_N) =
\begin{bmatrix}
g(w_1 \cdot x_1 + b_1) & g(w_2 \cdot x_1 + b_2) & \cdots & g(w_Ñ \cdot x_1 + b_Ñ) \\
g(w_1 \cdot x_2 + b_1) & g(w_2 \cdot x_2 + b_2) & \cdots & g(w_Ñ \cdot x_2 + b_Ñ) \\
\vdots & \vdots & \ddots & \vdots \\
g(w_1 \cdot x_N + b_1) & g(w_2 \cdot x_N + b_2) & \cdots & g(w_Ñ \cdot x_N + b_Ñ)
\end{bmatrix}_{N \times Ñ},    (2.18)

β = \begin{bmatrix} β_1^T \\ β_2^T \\ \vdots \\ β_Ñ^T \end{bmatrix}_{Ñ \times m}  and    (2.19)

T = \begin{bmatrix} t_1^T \\ t_2^T \\ \vdots \\ t_N^T \end{bmatrix}_{N \times m}.    (2.20)

The output weight matrix, β, of the SLFN is then computed as follows:

β = (H^T H)^{-1} H^T T.    (2.21)
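The training procedure described above reduces to a few lines of linear algebra. The following Python/NumPy sketch assumes a sigmoid hidden activation and uses the Moore-Penrose pseudo-inverse for the generalized inverse in (2.21); the uniform weight-initialization range is an assumption made for illustration, not a setting taken from the thesis.

```python
import numpy as np

def elm_train(X, T, n_hidden, rng=np.random.default_rng(0)):
    """Train an ELM per Section 2.1.5: random input weights, analytic output weights.

    X : (N, n) input matrix; T : (N, m) target matrix. Returns (W, b, beta).
    """
    n = X.shape[1]
    W = rng.uniform(-1.0, 1.0, size=(n_hidden, n))   # random input weights w_j
    b = rng.uniform(-1.0, 1.0, size=n_hidden)        # random hidden biases b_j
    H = 1.0 / (1.0 + np.exp(-(X @ W.T + b)))         # hidden layer output matrix (2.18), sigmoid g
    beta = np.linalg.pinv(H) @ T                     # (2.21) via the generalized (pseudo-)inverse
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W.T + b)))
    return H @ beta                                  # linear output neurons
```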

2.1.5.1 Regularized Extreme Learning Machine

Although the ELM greatly improves upon the performance of conventional gradient descent-based algorithms, the design of the output layer weights in the ELM gives rise to an issue. As the output weights are determined through the generalized inverse of the hidden layer output matrix, this minimum norm least-squares solution is an empirical risk minimization (ERM) operation, which tends to result in an overfitting model, especially if the training set is not sufficiently large.

Deng, Zheng and Chen [34] proposed to overcome this drawback in the output weights by introducing a regularization term into the ELM algorithm. A weight factor, γ, for the empirical risk is inserted to regularize the proportion between the empirical risk, ‖ε‖², and the structural risk, ‖β‖². Their improved algorithm is called the regularized extreme learning machine (R-ELM).

In the R-ELM algorithm, the output weights are calculated by minimizing both the weighted sum of the output error squares and the sum of the output weights squares of the SLFN:

Minimize \left\{ \frac{1}{2} γ ‖ε‖^2 + \frac{1}{2} ‖β‖^2 \right\}    (2.22)

subject to  ε = O − T = Hβ − T.    (2.23)

The problem is solved by using the method of Lagrange multipliers:

L = \frac{γ}{2} \sum_{i=1}^{N} \sum_{j=1}^{m} ε_{ij}^2 + \frac{1}{2} \sum_{i=1}^{Ñ} \sum_{j=1}^{m} β_{ij}^2 − \sum_{k=1}^{N} \sum_{p=1}^{m} λ_{kp} \left( h_k β_p − T_{kp} − ε_{kp} \right)    (2.24)

where ε_ij is the ijth element of the error matrix ε, β_ij is the ijth element of the output weight matrix β, T_ij is the ijth element of the output data matrix T, h_i is the ith row of the hidden layer output matrix H, β_j is the jth column of the output weight matrix β,

λ_ij is the ijth Lagrange multiplier and γ is the constant parameter used to adjust the empirical risk. Differentiating L in (2.24) with respect to (β_ij, ε_ij) and setting the derivatives equal to zero gives

\frac{\partial L}{\partial β_{ij}} = 0 \;\rightarrow\; β = H^T λ  and    (2.25)


\frac{\partial L}{\partial ε_{ij}} = 0 \;\rightarrow\; λ = −γ ε.    (2.26)

Considering the constraint in (2.23), (2.26) can be expressed as

λ = −γ (Hβ − T).    (2.27)

Using (2.27) in (2.25) leads to the computation of the output weight matrix, β, of the SLFN:

β = \left( \frac{I}{γ} + H^T H \right)^{-1} H^T T.    (2.28)
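A corresponding sketch of the R-ELM output-weight computation in (2.28) is shown below; the value of γ is an arbitrary placeholder, and the hidden layer output matrix H is assumed to have already been computed as in the ELM sketch above.

```python
import numpy as np

def relm_output_weights(H, T, gamma=1000.0):
    """R-ELM output weights per (2.28): beta = (I/gamma + H^T H)^{-1} H^T T."""
    n_hidden = H.shape[1]
    A = np.eye(n_hidden) / gamma + H.T @ H   # regularized normal-equation matrix
    return np.linalg.solve(A, H.T @ T)       # solve instead of explicit inversion
```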

In this thesis, the application of the single-hidden layer feedforward neural network using the extreme learning machine technique as the folk song classifier will be investigated. The discussion of this superior technique for folk song classification will be presented in Chapter 4 and Chapter 5.

2.2 Music Representations

As computer technology has become more sophisticated and accessible, interest in involving machines in music classification has flourished. Automatic music classification consists of using a machine to obtain useful features from music and using these features to identify which of a set of classes a new piece of music most likely belongs to.

The two main formats of digital representation of music are the audio format and the symbolic format. In the audio format, music is represented in the form of raw audio signals. There is no explicit information about the music notes, voicing and phrasing, nor any musical symbols and tags. WAV and MP3 files are the most commonly used audio representations [35]. The symbolic format, on the other hand, uses symbols and notations with direct musical meaning to model the visual aspects of a music score, and audio information or annotations related to the music piece. Symbolic representations contain information about what and how a music piece is to be played. Some commonly used


symbolic representations are MIDI, Humdrum, abc and MusicXML [36]. In music classification, the choice of the format usually depends on the availability of the data samples.

2.2.1 Audio Format

In music classification using audio data, features used to characterize each of the classes are constructed based on information directly derived from audio signal properties. These features are usually referred to as low-level features, which do not provide direct and precise information regarding the musical context and content. This information is usually obtained by performing feature extraction on a fixed-size segment of the audio signal called a window or frame. A window can contain audio samples ranging from a few milliseconds to seconds and sometimes even minutes. While most features extracted from the audio signal are based on short windows, some longer windows can be used if information on a large-scale structure is desired. A music audio signal is usually segmented into many overlapping windows in order to increase time localization. The distance between the starts of two overlapping windows is usually called the hop size. Although there is no fixed standard for the hop size, the common hop size applied in music classification is half the size of the analysis window.

In the following, a summarized list of common audio features employed in [18,37-59] is briefly discussed. The first section of the discussion focuses on features extracted in the time domain. The second section discusses features that are extracted in the frequency domain using the discrete Fourier transform (DFT) technique. In order to allow the combination of features from both the time domain and the frequency domain, the size of the analysis window in both cases is usually made the same. Nonetheless, it is not compulsory to do so. Finally, in the third section, two high-level features that can be extracted from the audio signal are discussed.


2.2.1.1 Common Audio Features Extracted in Time Domain

Features derived from the audio signal in the time domain are usually calculated directly from the sequence of samples. Whilst most features described in this section are extracted from individual short windows, usually with a time scale ranging from 10 milliseconds to 40 milliseconds (with the purpose of being consistent with the window size applied to feature extraction in the frequency domain), some features are calculated based on a collection of consecutive short windows in order to capture the pattern of the signal changes over time. To define the features mathematically, first let the music signal be denoted as x and the tth analysis window, constructed using N samples at a time from the music signal x with hop size h, be denoted as x_t; we then have

x_t[n] = \begin{cases} x[n + (t−1)h] & 0 ≤ n ≤ N−1 \\ 0 & \text{otherwise} \end{cases}    (2.29)

For non-overlapping analysis window, the hop size h is equivalent to the window size N.

Root Mean Square (RMS)

The root mean square is a measure of the power in the music signal. It is often used as a loudness feature in audio based music classification. The RMS is defined as follows:

RMS_t = \sqrt{ \frac{1}{N} \sum_{n=0}^{N−1} x_t[n]^2 }.    (2.30)

Fraction of Low Energy Windows

The fraction of low energy windows is a measure of the fraction of analysis windows, within a set of consecutive windows, that have a root mean square value below some threshold value. The common calculation of the low energy fraction uses the average RMS of the set of windows under consideration as the threshold value. This feature gives an indication of the fraction of silence or near silence in the segment of signal under consideration. Therefore, music with little silence, for example music with high instrumental activity, will have a low fraction of low energy windows.


Zero-Crossing (ZC)

The zero-crossing is the number of times the waveform changes sign within a given music frame of length N. In other words, it is the number of times a signal passes the zero midpoint of the signal range. It is used as an indication of noisiness, as signals with no DC component will tend to cross the midpoint more often. The zero-crossing count is highly correlated with the spectral centroid of clean (non-noisy) signals. The zero-crossing count is computed as:

ZC_t = \sum_{n=0}^{N−1} \left| \mathrm{sign}(x_t[n]) − \mathrm{sign}(x_t[n−1]) \right|    (2.31)

where

\mathrm{sign}(x_t[n]) = \begin{cases} 1 & x_t[n] ≥ 0 \\ 0 & x_t[n] < 0 \end{cases}, \quad 0 ≤ n ≤ N−1.    (2.32)
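The time-domain features above can be computed directly from the framed signal. The sketch below follows (2.29)-(2.32); the use of the set-average RMS as the low-energy threshold follows the text, while the array layout and the assumption that the signal is at least one frame long are implementation choices.

```python
import numpy as np

def frame_signal(x, N, h):
    """Split signal x into analysis windows of length N with hop size h, cf. (2.29).

    Assumes len(x) >= N; returns an array of shape (n_frames, N)."""
    n_frames = 1 + (len(x) - N) // h
    return np.stack([x[t * h : t * h + N] for t in range(n_frames)])

def rms(frames):
    """Root mean square per frame, (2.30)."""
    return np.sqrt(np.mean(frames ** 2, axis=1))

def low_energy_fraction(frames):
    """Fraction of frames whose RMS falls below the average RMS of the set."""
    r = rms(frames)
    return np.mean(r < r.mean())

def zero_crossings(frames):
    """Zero-crossing count per frame, (2.31)-(2.32): number of sign changes."""
    s = (frames >= 0).astype(int)                    # sign() as defined in (2.32)
    return np.sum(np.abs(np.diff(s, axis=1)), axis=1)
```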

Linear Predictive Coding (LPC)

The linear predictive coding is a method initially developed to analyze and encode human speech signals. The LPC works by first estimating the formants (spectral bands corresponding to the resonance frequencies in the human vocal tract), then performing inverse filtering to remove the effects of these formants from the speech signal. It then estimates the intensity and frequency of the residue (the remaining signal after the subtraction). The result of these steps is a vector of values that describe the intensity and frequency of the residue, the formants and the residue signal. This vector can be used to recreate speech: the intensity and frequency of the residue and the residue signal can be used to create the source signal, and the formants can be used to create a filter. Speech is produced by running the source signal through the filter. A detailed explanation of the LPC can be found in [60].

The most important aspect of the LPC is that it allows a music sample to be approximated as a linear combination of previous samples. The unique set of predictor coefficients is determined by minimizing the sum of the squared differences between


the actual signal and the predicted signal. Different approaches can be used for the minimization such as autocorrelation method, covariance method and lattice method. One common application of the LPC in music is for identifying instrument types.

2.2.1.2 Common Audio Features Extracted in Frequency Domain

In order to extract features from a music audio signal in the frequency domain, the audio signal is first segmented into overlapping, very short analysis frames on a time scale between 10 milliseconds and 40 milliseconds, over which the signal is considered stationary. The overlap step size is usually within the range of 5 milliseconds to 20 milliseconds. Each of these analysis frames is then multiplied with a windowing function. The windowing function preserves the continuity of the first and last samples in an analysis frame and reduces the problem of spectral leakage, which refers to power being assigned to frequency components that are not actually present in the signal being analyzed. There are many windowing functions, but the most commonly used in music classification are the Hamming window and the Hann window. If the music signal in the tth frame is denoted as

x_t[n] = \begin{cases} x[n + (t-1)h], & 0 \le n \le N-1 \\ 0, & \text{otherwise} \end{cases} \qquad (2.33)

where h is the hop size and N is the number of samples within a frame, then the signal after applying the windowing function is

x_t^w[n] = x_t[n] \times w[n] \qquad (2.34)

where, for the Hamming window,

w[n] = 0.54 - 0.46 \cos\!\left( \frac{2\pi n}{N-1} \right), \quad 0 \le n \le N-1 \qquad (2.35)

and for the Hann window,

w[n] = 0.5 \left( 1 - \cos\!\left( \frac{2\pi n}{N-1} \right) \right), \quad 0 \le n \le N-1. \qquad (2.36)


Finally, after applying the windowing function, the fast Fourier transform (FFT), an optimized implementation of the DFT, is performed on each analysis frame to obtain the magnitude frequency response. A detailed discussion of the Fourier transform can be found in [61]. In the following, M_t[n] is the magnitude spectrum of the Fourier transform at frequency bin n, out of N bins, for Fourier analysis frame t.
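A sketch combining (2.34)-(2.36) with the FFT step, assuming NumPy; keeping only the non-negative frequency bins returned by rfft is an implementation detail rather than part of the definitions.

import numpy as np

def magnitude_spectrum(frame, window="hamming"):
    N = len(frame)
    n = np.arange(N)
    if window == "hamming":
        w = 0.54 - 0.46 * np.cos(2 * np.pi * n / (N - 1))   # eq. 2.35
    else:
        w = 0.5 * (1 - np.cos(2 * np.pi * n / (N - 1)))      # eq. 2.36 (Hann)
    return np.abs(np.fft.rfft(frame * w))                    # M_t[n]: eq. 2.34 followed by the FFT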

Spectral Centroid (SC)

The spectral centroid (SC) is a measure of the spectral brightness of a music signal. Higher centroid values correspond to “brighter” textures with more high frequencies. The spectral centroid is usually used to characterize the timbre of musical instruments. It is defined as the center of gravity of the magnitude spectrum of the Fourier transform:

SC_t = \frac{\sum_{n=0}^{N-1} M_t[n] \times n}{\sum_{n=0}^{N-1} M_t[n]}. \qquad (2.37)
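A direct transcription of (2.37), assuming M is the magnitude spectrum of one analysis frame:

import numpy as np

def spectral_centroid(M):
    # Centre of gravity of the magnitude spectrum (eq. 2.37).
    n = np.arange(len(M))
    return np.sum(M * n) / np.sum(M)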

Spectral Roll-off (SR)

The spectral roll-off measures the spectral shape and indicates how much of the energy is concentrated in the lower frequencies. It is usually used in speech analysis to differentiate between voiced and unvoiced speech. In music analysis, it is used as a feature to characterize the timbre of musical instruments. It is defined as the frequency value below which resides the 85% (can be any number but 85% is the typical value) of the magnitude distribution:

\sum_{n=0}^{SR_t} M_t[n] = 0.85 \times \sum_{n=0}^{N-1} M_t[n]. \qquad (2.38)
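A sketch of (2.38) that returns the roll-off bin index; the cumulative-sum search is an implementation choice under the assumption that the smallest bin satisfying the condition is wanted.

import numpy as np

def spectral_rolloff(M, fraction=0.85):
    # Smallest bin SR_t such that the magnitudes up to SR_t reach `fraction` of the total (eq. 2.38).
    return int(np.searchsorted(np.cumsum(M), fraction * np.sum(M)))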


Spectral Flux (SF)

The spectral flux measures the amount of local spectral change in the signal. It is computed by calculating the change in the normalized magnitude spectrum, Nt[n], between successive frames:

SF_t = \sum_{n=0}^{N-1} \left( N_t[n] - N_{t-1}[n] \right)^2. \qquad (2.39)

The spectral flux is another feature used for characterizing timbre of musical instruments.
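A sketch of (2.39), assuming the magnitude spectra of two successive frames are normalized by their sums (one common choice of normalization, introduced here as an assumption):

import numpy as np

def spectral_flux(M_curr, M_prev):
    # Squared change between the normalized magnitude spectra of successive frames (eq. 2.39).
    N_curr = M_curr / np.sum(M_curr)
    N_prev = M_prev / np.sum(M_prev)
    return np.sum((N_curr - N_prev) ** 2)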

Mel-Frequency Cepstral Coefficients (MFCC)

The Mel-frequency cepstral coefficients are among the most widely used features in both speech recognition and audio-based music classification. The MFCC takes into account human perception sensitivity with respect to frequencies. It is computed as follows: (1) take the log-amplitude of the magnitude spectrum; (2) group and smooth the frequency bins according to the perceptually motivated Mel-frequency scaling; (3) apply the discrete cosine transform to de-correlate the resulting feature vectors. Typically, 13 coefficients are used in speech analysis. Tzanetakis and Cook [38] found that the first five coefficients provide the best performance in music genre classification. Further details of the MFCC are presented in [62]. The Mel-frequency cepstral coefficients are also a feature for timbre-based representation.
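In practice the MFCC computation is usually delegated to a library. A hedged sketch using librosa (assumed to be installed; the file name is hypothetical, and averaging over frames is only one possible summary):

import librosa

y, sr = librosa.load("folk_song.wav", sr=None, mono=True)
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # 13 coefficients per analysis frame
mfcc_per_song = mfcc.mean(axis=1)                    # one common way to summarise the frame-level values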

2.2.1.3 High-Level Features Extracted from Audio

Although low-level features are useful, in most cases, they are not representative in other applications where high-level music features such as pitch and rhythmic patterns are required. Extracting high-level information from audio signals is less straightforward and less accurate than from symbolic musical data. Nonetheless, under the assumption that imperfections in the extracted information can be averaged out in broad high-level representations, it is possible to derive some useful high-level


information from audio signals. The two main high-level representations that can be constructed from audio signals are the pitch histogram and the beat histogram.

Pitch Histograms

Tzanetakis and Cook [38] proposed a technique for deriving pitch information from sound signal through constructing a variety of different pitch histograms. The pitch content feature detection algorithm employed to construct the pitch histograms is based on the multi-pitch detection algorithm described by Tolonen and Karjalainen [63]. In their algorithm, the sound signal is first decomposed into two frequency bands: below 1000 Hz and above 1000Hz. Amplitude envelopes are then extracted for each frequency band. The envelope extraction is performed by applying half-wave rectification and low-pass filtering on the signals.

The extracted envelopes are summed and an enhanced autocorrelation function called the summary enhanced autocorrelation function (SACF) is then computed in order to reduce the effect of the integer multiples of the peak frequencies to the multiple pitch detection. The prominent peaks of the SACF are treated as the main pitches of a corresponding short segment of sound signals. The three dominant peaks of the SACF are then accumulated into a pitch histogram (PH) over the entire sound file.

Next, the frequencies corresponding to each histogram peak are converted to musical pitches such that each bin of the PH corresponds to a musical note with a specific pitch; for example, the musical note A4 is equivalent to 440 Hz. The musical pitches are labeled using the MIDI note numbering scheme, where the conversion from frequency to MIDI note number can be performed using the following equation:

n = 12 \log_2\!\left( \frac{f}{440} \right) + 69 \qquad (2.40)

where f is the frequency in Hertz and n is the histogram bin (MIDI note number).
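A sketch of (2.40) together with the accumulation of SACF peaks into an unfolded pitch histogram; the peak lists are assumed to come from the multi-pitch detection stage, and rounding to the nearest integer bin is an implementation choice.

import numpy as np

def frequency_to_midi(f):
    # Eq. 2.40: frequency in Hz to MIDI note number (440 Hz maps to 69, i.e. A4).
    return int(round(12 * np.log2(f / 440.0) + 69))

def unfolded_pitch_histogram(peak_frequencies, peak_amplitudes):
    # Accumulate the amplitude of each detected SACF peak into its MIDI-pitch bin.
    uph = np.zeros(128)
    for f, amplitude in zip(peak_frequencies, peak_amplitudes):
        bin_index = frequency_to_midi(f)
        if 0 <= bin_index < 128:
            uph[bin_index] += amplitude
    return uph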

There are two versions of pitch histogram proposed in [38]: the unfolded pitch histogram (UPH) and the folded pitch histogram (FPH). The UPH is constructed using

(2.40). The FPH method discards the octave information of a note and groups notes according to pitch classes. In the FPH, the octave information of all notes is normalized to a single octave using the mapping equation:

c = n \bmod 12 \qquad (2.41)

where c is the folded histogram bin (i.e. the pitch class or chroma value) and n is the unfolded histogram bin (MIDI note number).

The main difference between the UPH and the FPH is that the unfolded pitch histogram contains information about the pitch range of a musical piece while the folded pitch histogram contains information regarding the pitch classes or harmonic content of the music. The FPH method is similar to the chroma-based representation employed in [64] for audio thumbnailing. Detailed explanations of the chroma and height dimensions of musical pitch can be found in [65], and the relation of musical scales to frequency is discussed in [66].

A variant of the FPH called the circle of fifths histogram is designed such that adjacent histogram bins are spaced a fifth apart rather than a semitone apart as in the original FPH. The authors [38] believed that the distances between adjacent bins in this variant are better suited for expressing tonal music relations (tonic-dominant) and that the extracted features result in better classification accuracy. The mapping from the original FPH to the new circle of fifths histogram can be achieved by

c' = (7 \times c) \bmod 12 \qquad (2.42)

where c' is the new circle of fifths histogram bin after the mapping and c is the original folded histogram bin. The number '7' corresponds to the seven semitones of the musical interval of a fifth.
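The two mappings (2.41) and (2.42) can be sketched as follows, assuming the 128-bin unfolded histogram from the previous sketch:

import numpy as np

def folded_pitch_histogram(uph):
    # Eq. 2.41: collapse MIDI note numbers onto the 12 pitch classes (c = n mod 12).
    fph = np.zeros(12)
    for n, value in enumerate(uph):
        fph[n % 12] += value
    return fph

def circle_of_fifths_histogram(fph):
    # Eq. 2.42: reorder the folded bins so that adjacent bins are a fifth (7 semitones) apart.
    cfh = np.zeros(12)
    for c, value in enumerate(fph):
        cfh[(7 * c) % 12] += value
    return cfh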

There are many useful features that can be calculated from the pitch histograms. For example, the difference between the lowest pitch and the highest pitch in a pitch histogram can indicate the pitch range. The bin label of the pitch class histogram with the highest amplitude may indicate the primary key of the piece, or at least the dominant.


The interval between the two strongest pitches of the folded pitch class histogram can give an indication of the centrality of tonality in a piece.

Beat Histograms

A common automatic beat detector structure consists of signal decomposition into frequency bands using a filterbank, followed by an envelope extraction step, and finally a periodicity detection algorithm to detect the lags at which the signal's envelope is most similar to itself. The process of beat detection is similar to pitch detection except that it is performed on a larger time scale (approximately 0.5 seconds to 1.5 seconds for beat detection compared to 2 milliseconds to 50 milliseconds for pitch).

The concept of constructing a histogram of time intervals between note onsets, which gives some overall information about the rhythmic patterns in the signal as a whole, was first promoted by Tzanetakis and Cook [38]. In their approach, the features used to represent the rhythmic structure of a piece of music are based on the most salient periodicities of the sound signal. Figure 2.3 shows the flow diagram of the construction of a beat histogram (BH) [38]. To construct the beat histogram, the sound signal is first decomposed into a number of octave frequency bands using the discrete wavelet transform. Then, the time domain amplitude envelope of each band is extracted by applying full-wave rectification, low-pass filtering and downsampling to each octave frequency band, followed by a mean removal. These envelopes are then summed together and the autocorrelation of the sum is computed. The dominant peaks of the autocorrelation function each correspond to one of the various periodicities of the signal's envelope. The peaks obtained are then accumulated over the whole sound file to build the beat histogram. Each histogram bin in the BH corresponds to a peak lag, i.e. the beat period in beats-per-minute (bpm). When compiling the beat histogram, instead of adding one to a bin, the amplitude of each peak is added to the bin. Using this method, if the signal is very similar to itself (strong beat), the histogram peaks will be higher.

The equations used in each step of the beat analysis algorithm [38] are listed below. In the equations, x is the sound signal and n = 1, 2, …, N, where N is the total number of samples in the signal.


Figure 2.3: The flow diagram of the construction of a beat histogram.

Full wave rectification

y[n] = \left| x[n] \right| \qquad (2.43)

Full wave rectification is applied to extract the temporal envelope of the sound signal rather than the time domain signal.

Low pass filtering

y[n] = (1 - \alpha)\, x[n] + \alpha\, y[n-1] \qquad (2.44)

is a one-pole filter with an alpha value (α) of 0.99. It is used to smooth the envelope.

Downsampling

y[n] = x[kn] \qquad (2.45)


where k = 16 is used in [38]. Due to the large periodicities of beat analysis, the objective of applying downsampling is to reduce computation time for the autocorrelation computation without affecting the performance of the algorithm.

Mean removal

y[n] = x[n] - E\big[x[n]\big] \qquad (2.46)

Mean removal is used to center the signal at zero for the autocorrelation stage.

Autocorrelation

y[\mathrm{lag}] = \frac{1}{N} \sum_{n=1}^{N} Y[n]\, Y[n - \mathrm{lag}] \qquad (2.47)

where lag is the number of samples of delay. The autocorrelation is calculated for all integer values of lag, subject to 0 ≤ lag < N. Y is the outcome of pre-processing the sound signal, which includes the full wave rectification, low-pass filtering, downsampling and mean removal.

Autocorrelation is a technique that involves comparing a signal with versions of itself delayed by successive intervals, which yields the relative strength of different periodicities within the signal. In music processing, autocorrelation allows one to find the relative strength of different rhythmic pulses.

The calculation of the autocorrelation results in a histogram where each bin corresponds to a different lag time. Since the sampling rate of the signal is known, the histogram can provide an indication of the relative importance of the time intervals that pass between strong peaks.
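A compact sketch of the envelope extraction chain (2.43)-(2.46) and the autocorrelation (2.47), assuming the octave-band signals have already been obtained (for instance from a wavelet decomposition) and all have the same length; SciPy's lfilter is used for the one-pole smoothing, and the function names are hypothetical.

import numpy as np
from scipy.signal import lfilter

def band_envelope(band, alpha=0.99, k=16):
    rectified = np.abs(band)                                  # eq. 2.43, full wave rectification
    smoothed = lfilter([1 - alpha], [1, -alpha], rectified)   # eq. 2.44, one-pole low-pass filter
    downsampled = smoothed[::k]                               # eq. 2.45, keep every k-th sample
    return downsampled - downsampled.mean()                   # eq. 2.46, mean removal

def envelope_autocorrelation(bands):
    Y = np.sum([band_envelope(b) for b in bands], axis=0)     # summed envelopes of all bands
    N = len(Y)
    # eq. 2.47: autocorrelation for every lag; its dominant peaks indicate candidate beat periods.
    return np.array([np.dot(Y[lag:], Y[: N - lag]) / N for lag in range(N)])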

High-level rhythmic information can be derived from a beat histogram. For example, the number of strong peaks can provide some measure of rhythmic sophistication. The periods of the highest peaks can provide good information about the tempo of an audio signal. The ratios between the highest peaks, in terms of both amplitude and period, can give metrical insights and an indication as to whether a signal


is likely polyrhythmic or not. The sum of the histogram as a whole can give an indication of beat strength. The proportional collective strength of low-level bins can give an indication of the degree of rubato or rhythmic looseness.

2.2.2 Symbolic Format

Musical information is represented in an essentially different way in symbolic musical file formats than in audio files. Unlike audio files, which store a digital approximation of the actual sound signals, symbolic files store higher-level notions about the music rather than a direct representation of the sound. For example, an audio file will store an approximation of the actual sound waves produced by a singer singing "Jasmine Flower" (a Jiangsu folk song), whereas a symbolic file will store information such as the pitch of each note sung by the singer, the instrument used to produce the sound (in this case, the human voice), and the duration of each note.

The symbolic representation of music can exist in many formats, for example, the physical forms of written or printed scores, holes punched in player piano rolls and keypunched cards, and of course the digital files of the modern age such as MIDI [67-68], Open Sound Control [69-71], GUIDO [72], Humdrum [73] and MusicXML [74-75]. A short overview of some commonly encountered symbolic music file formats is presented below. A good overview of symbolic formats (except MIDI) can be found in [36]. Dannenberg presented a useful survey on symbolic music representation in [76].

2.2.2.1 Symbolic Music File Formats

In general, the digital symbolic music file formats can be divided into three broad categories: (1) formats for communicating performance information between controllers, computers and synthesizers; (2) formats for representing musical scores and associated visual formatting information; and (3) formats for facilitating theoretical and musicological analysis [77].


Formats for communicating information between controllers, computers and synthesizers

The most well-known format in this group is MIDI – the Musical Instrument Digital Interface [68] format. MIDI is a technical standard which describes a set of protocols, digital interfaces and connectors that allow a wide variety of electronic musical instruments, computers and other related devices to connect and communicate with each other. Due to its popularity, a very large amount of music of many kinds is stored in this format. Consequently, a large portion of music classification research that employs symbolic music representation uses MIDI files as the research data set.

Open Sound Control (OSC), developed by Wright and Freed [69-71], is a successor to the MIDI format. It is a real-time, performance-oriented symbolic file format that is widely recognized as technically superior to its predecessor, the MIDI format. Some advantages of OSC include improved time resolution, explicit compatibility with modern networking technology and improved general flexibility.

Formats for representing musical scores and associated visual formatting information

The most commonly used file formats within this group are the file formats of the two leading score editing applications: Finale1 (.mus format) and Sibelius2 (.sib format). Both applications are commercial software and the details of their file format representations are not published. One needs to purchase the software in order to read or write files in these formats. This limitation greatly reduces the research value of these file formats. Nonetheless, there are some research-oriented formats that can be used for representing musical scores. Two of the better-known formats are GUIDO [72] and LilyPond [78]. Both of them are text-based formats.

MusicXML [74-75] is an XML-based file format for representing Western music notation. Although the format is proprietary, it can be freely used under a

1 www.finalemusic.com
2 www.sibelius.com


Public License. MusicXML has achieved relatively high popularity due to its adoption by a variety of commercial and non-commercial music notation programs such as Finale, Sibelius, MuseScore3, SmartScore4, Steinberg Cubase5 and Rosegarden6. MusicXML can serve well as an intermediate file format to transfer data between .mus and .sib files.

Formats intended for facilitating theoretical and musicological analysis

The most prominent file formats in this category are the formats associated with the Humdrum Toolkit [79]. Among them, the most popular and most general is the **kern format [80]. Some of the many Humdrum file formats are designed to represent more specialized music types, such as the **bhatk format for transcribing Hindustani music, the **hildegard format for the German manuscripts and the **koto format for the koto (a traditional Japanese stringed musical instrument similar to the Chinese zheng). Humdrum also facilitates translation to and from MIDI data.

2.2.2.2 Benefits of Using Symbolic Music File Formats

To date, more research in music classification has been performed using audio files than symbolic music files. This is largely due to the increase in commercial music information retrieval applications and users that are much more interested in processing audio files. In general, this body of research views music as a type of sound rather than considering its other contexts beyond the sound perspective. Nonetheless, some examples of automatic music classification using symbolic data can be seen in [81-91].

On the other hand, musicologists and music theorists usually prefer the symbolic musical representations. The main reason is that features extracted from audio data

3 .org 4 www.musitek.com 5 www.steinberg.net 6 www.rosegardenmusic.com


generally have little intuitive meaning to humans. For example, although features such as the zero-crossing rate and the Mel-frequency cepstral coefficients extracted over a sequence of audio windows are useful for automatic music classification, they are unlikely to give any insight or inspiration to music theorists. Conversely, features extracted from symbolic data, such as those related to the key and meter of a piece of music, are much more straightforward and meaningful to humans. These features often provide useful insights on music.

The main advantage of symbolic data is that it provides much more immediate and reliable access to musically meaningful information than audio data. Since the fundamental elements in symbolic files are usually a precise representation of musical notation while the fundamental elements in audio files are typically sound samples, it is much easier to extract high-level music information with high accuracy from symbolic files than from audio files.

In addition, some symbolic file formats such as MIDI are usually more compact than audio recordings. This makes storing, processing and transmitting much faster and easier. Furthermore, it is much easier to correct and edit symbolic files than audio recordings, and the correction can be made more accurately.

Existing optical music recognition techniques such as [92-94] and software such as SmartScore, OpenOMR7, SharpEye8 and Gamera9 allow printed or written scores to be processed into symbolic file formats from which music features can then be extracted. This is particularly useful in cases where an audio recording of a music score does not exist or is hard to obtain. From a musicological perspective, it is better to use features extracted from music scores than from audio recordings, as this eliminates potential performance biases and errors. This enables analysis to be based entirely on the artifact provided by the composer.

7 sourceforge.net/projects/openomr
8 www.visiv.co.uk
9 gamera.informatik.hsnr.de


2.2.2.3 Symbolic Features Extracted from MIDI

Among the digital formats of symbolic representation, MIDI is the most popular and widely used format in automatic music classification research, largely owing to its popularity and hence the availability of data. MIDI, short for Musical Instrument Digital Interface, is an encoding system used to represent, transfer and store musical information. Information is represented as sequences of instructions called MIDI messages. Each MIDI message corresponds to either an event or a change in a control parameter. The details of MIDI and its specifications are not covered in this thesis, but many books on MIDI are available; for example, [67] can be consulted for further details. Also, the official web site of the MIDI Manufacturers Association10 provides comprehensive information and documentation on MIDI. An extensive list of features extracted from MIDI can be grouped into seven groups [84]: pitch based, melody based, chord based, rhythm based, instrumentation based, musical texture based and dynamics based. The derived features can then be compiled into relevant feature vectors for music classification.

Pitch Based Features

Three pitch histograms are constructed in [84] based on the technique proposed by [37-38]. The first histogram is the basic pitch histogram, which consists of 128 bins, one for each MIDI pitch. The magnitude of each bin corresponds to the number of times a Note On event occurred at that particular pitch. This histogram gives an insight into the range and spread of notes in a music piece.

The second histogram is called the pitch class histogram, which has 12 bins, one for each of the twelve pitch classes. The magnitude of each bin corresponds to the number of times a Note On event occurred for that particular pitch class. This histogram gives insights into the types of scales used and the amount of transposition present.

10 www.midi.org


The third histogram is the fifths pitch histogram, which consists of 12 bins. The bins are a reordering of the bins in the pitch class histogram such that adjacent bins are a perfect fifth apart rather than a semitone apart.
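A sketch of the three histograms, assuming the MIDI pitch numbers of all Note On events have already been collected into a list (how they are parsed from the file is outside the scope of this sketch, and the function name is hypothetical):

import numpy as np

def midi_pitch_histograms(note_on_pitches):
    basic = np.zeros(128)          # one bin per MIDI pitch
    pitch_class = np.zeros(12)     # one bin per pitch class
    for p in note_on_pitches:
        basic[p] += 1
        pitch_class[p % 12] += 1
    # Reorder the pitch class bins so that adjacent bins are a perfect fifth apart.
    fifths = np.array([pitch_class[(7 * b) % 12] for b in range(12)])
    return basic, pitch_class, fifths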

The list of pitch based features proposed in [84] includes the most common pitch prevalence, the most common pitch class prevalence, the relative strength of top pitches, the relative strength of top pitch classes, the interval between strongest pitches, the interval between strongest pitch classes, the number of common pitches, the pitch variety, the pitch class variety, the pitch range, the most common pitch, the primary register, the importance of the bass register, the importance of the middle register, the importance of the high register, the most common pitch class, the dominant spread, the strong tonal centres, the basic pitch histogram, the pitch class distribution, the fifths pitch histogram, the quality, the glissando prevalence, the average range of glissandos, the vibrato prevalence and the prevalence of micro-tones.

Melody Based Features

The pitch based features discussed above do not reflect the information relating to the order in which pitches are played. The melody is a very important part of how humans reflect on music that they hear. In order to achieve this, the statistics about melodic motion and intervals are used. A melodic interval histogram is proposed in [84] where each bin of the histogram is labeled with a number indicating the number of semitones separating sequentially adjacent notes in a given channel. The magnitude of each bin indicates the fraction of all melodic intervals that correspond to the melodic interval of the given bin. Features are then derived from this histogram.

The list of melody based features includes the melodic interval histogram, the average melodic interval, the most common melodic interval, the distance between most common melodic intervals, the most common melodic interval prevalence, the relative strength of most common intervals, the number of common melodic intervals, the amount of arpeggiation, the repeated notes, the chromatic motion, the stepwise motion, the melodic thirds, the melodic fifths, the melodic tritones, the melodic octaves, the embellishment, the direction of motion, the duration of melodic arcs, the size of melodic arcs and the melodic pitch variety.


Chord Based Features

Musical chords are created when different notes are played simultaneously. Some techniques of chord analysis presented in Rowe [95] are adopted to design the chord based features in [84]. Two histograms are constructed for the chord based features. The first histogram is the vertical interval histogram, which consists of bins labeled with different vertical intervals. The magnitude of each bin in the histogram is the sum of all vertical intervals that are sounded at each tick.

The second histogram is the chord type histogram. In this histogram, each bin is labeled with one of the following types of chords: two pitch class chord, major triad, minor triad, other triad, diminished, augmented, dominant seventh, major seventh, minor seventh, other chord with four pitch classes and chord with more than four pitch classes.

The list of chord based features proposed are the vertical intervals, the chord types, the most common vertical interval, the second most common vertical interval, the distance between two most common vertical intervals, the prevalence of most common vertical interval, the prevalence of second most common vertical interval, the ratio of prevalence of two most common vertical intervals, the average number of simultaneous pitch classes, the variability of number of simultaneous pitch classes, the minor major ratio, the perfect vertical intervals, the unisons, the vertical minor seconds, the vertical thirds, the vertical fifths, the vertical tritones, the vertical octaves, the vertical dissonance ratio, the partial chords, the minor major triad ratio, the standard triads, the diminished and augmented triads, the dominant seventh chords, the seventh chords, the complex chords, the non-standard chords and the chord duration.

Rhythm Based Features

Studies such as [95-98] emphasized that rhythm plays a very important role in many types of music. In defining the rhythm based features, a beat histogram is constructed using the technique proposed in [37-38]. However, instead of using the not-quite-accurate beat information derived from audio signals, the precise representation of beat information in


MIDI is employed to construct the beat histogram. The rhythm based features employed are then derived from the beat histogram.

The rhythm based features that are derived from the beat histogram include the strongest rhythmic pulse, the second strongest rhythmic pulse, the harmonicity of the two strongest rhythmic pulses, the strength of the strongest rhythmic pulse, the strength of the second strongest rhythmic pulse, the strength ratio of the two strongest rhythmic pulses, the combined strength of the two strongest rhythmic pulses, the number of strong pulses, the number of moderate pulses, the number of relatively strong pulses, the rhythmic looseness, the polyrhythms, the rhythmic variability and the beat histogram itself.

There are other rhythmic features that are not derived from the beat histogram. These are the note density, the note density variability, the average note duration, the variability of note duration, the maximum note duration, the minimum note duration, the staccato incidence, the average time between attacks, the variability of time between attacks, the average time between attacks for each voice, the average variability of time between attacks for each voice, the incidence of complete rests, the maximum complete rest duration, the average rest duration per voice, the average variability of rest duration across voices, the initial tempo, the initial time signature, the compound or simple meter, the triple meter, the quintuple meter and the change of meter.

Instrumentation Based Features

This group of features utilizes the capability of the General MIDI (Level 1) specification, which allows recordings to make use of 128 pitched-instrument patches and a further 47 percussion instruments in the Percussion Key Map. The instrumentation based features proposed include the presence of pitched instruments, the presence of unpitched instruments, the note prevalence of pitched instruments, the note prevalence of unpitched instruments, the time prevalence of pitched instruments, the variability of note prevalence of pitched instruments, the variability of note prevalence of unpitched instruments, the number of pitched instruments, the number of unpitched instruments, the percussion prevalence, the string keyboard fraction, the acoustic guitar fraction, the electric guitar fraction, the violin fraction, the saxophone fraction, the brass fraction, the


woodwinds fraction, the orchestral strings fraction, the string ensemble fraction and the electric instrument fraction.

Musical Texture Based Features

The musical texture based features make use of the fact that MIDI notes can be assigned to different channels and to different tracks, thus making it possible to segregate the notes belonging to different voices. The texture related features include the maximum number of independent voices, the average number of independent voices, the variability of number of independent voices, the voice equality number of notes, the voice equality note duration, the voice equality dynamics, the voice equality melodic leaps, the voice equality range, the importance of loudest voice, the relative range of loudest voice, the relative range isolation of loudest voice, the range of highest line, the relative note density of highest line, the relative note durations of lowest line, the melodic intervals in lowest line, the simultaneity, the variability of simultaneity, the voice overlap, the parallel motion and the voice separation.

Dynamic Based Features

In music, dynamics usually refer to the loudness of a piece. In [84], the dynamic of a note refers to its velocity value scaled by the channel volume messages:

\text{note dynamic} = \text{note velocity} \times \left( \frac{\text{channel volume}}{127} \right). \qquad (2.48)

All dynamic features in [84] use relative measures rather than absolute measures, as the default volume and velocity values set by different sequencers vary. The list of dynamic based features includes the overall dynamic range, the variation of dynamics, the variation of dynamics in each voice and the average note-to-note dynamics change.
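A sketch of (2.48) and of the overall dynamic range derived from it, assuming parallel lists of Note On velocities and the channel volumes in effect at those events (both on the MIDI 0-127 scale); the function name is hypothetical.

import numpy as np

def note_dynamics(velocities, channel_volumes):
    v = np.asarray(velocities, dtype=float)
    cv = np.asarray(channel_volumes, dtype=float)
    dynamics = v * (cv / 127.0)                       # eq. 2.48
    dynamic_range = dynamics.max() - dynamics.min()   # overall dynamic range (a relative measure)
    return dynamics, dynamic_range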

It can be seen in [84] that an extensive list of potential features can be extracted from MIDI data. However, not all of them are applicable to the various existing types of music. In addition, depending on the author of the MIDI files, some data necessary to derive certain features might not be available in the files. Apart from that, not all symbolic file formats are as compact as MIDI files. Hence, depending on the

application of the tasks and the file formats available, the number and types of features derived and employed in the research will vary. In short, there is no standard list of features; the choice of features depends on the availability and capability of the data and the characteristics of the music under investigation. In any case, however, the use of pitch and rhythmic information is inevitable.

2.3 Discussion

The two main components of folk song classification are the machine classifier and the music encoding. In this thesis, the single-hidden layer feedforward neural network is employed as the machine classifier for the folk song classification research. However, neither the gradient descent-based learning algorithm nor the discriminant-based learning algorithm is employed for the SLFN. Instead, a superior technique which has been proven capable of overcoming the drawbacks of both types of algorithm, called the extreme learning machine, is employed in order to examine and verify its performance in multi-class classification tasks, particularly folk song classification. Nevertheless, the classification performance of the gradient descent-based learning algorithm and the support vector machine will be included for comparison.

As mentioned in Chapter 1, this thesis intends to contribute towards preserving and sustaining the art of Chinese folk songs. Hence, during the data collection, efforts were made to obtain data in a format that facilitates ethnomusicological analysis. Since folk songs in the Essen Folksong Collection [1] are documented using the kern representation, this database is employed for this purpose. It is to be noted that many symbolic music formats can be easily converted into audio formats. There are many commercial converters for converting MIDI to MP3 or MIDI to WAV. In addition, symbolic formats other than MIDI usually provide tools for conversion into standard MIDI files. For example, hum2mid11 is a program for converting kern files into standard MIDI files.

11 extra.humdrum.org/man/hum2mid/


Preliminary investigations were performed to examine the performance of folk song classification using audio representation technique. The results are recorded in the Appendix. The overall result for the research using audio representation is below average. This preliminary research suggests that a more appropriate and efficient music representation technique is required for encoding the Han Chinese folk songs for machine classification.


Chapter 3

Music Representation and the Musical Feature Density Map

As presented in Chapter 2, there are two main approaches to music representation: audio and symbolic. In most cases, the choice of the music representation is greatly dependent on the format of the available data set. In this thesis, the Essen Folksong Collection [1] is employed as the data set for the research. Folk songs in this database are recorded in kern format. Hence, in this thesis, discussions are framed around the symbolic approach to music representation, i.e. from a high-level, musicological point of view instead of the low-level audio signal view.

This chapter begins with a discussion on the ethnomusicology background of the research topic – geographical based Han Chinese folk song classification – followed by a discussion of the music database employed for the research. Then, the music elements employed to define the different classes of Han Chinese folk songs are discussed. Finally, a novel encoding method called the musical feature density map (MFDMap) is proposed for encoding useful music information. The MFDMap is designed to incorporate ethnomusicology theory into the structure of the music feature vector.


3.1 Ethnomusicology Background on Geographical Based Han Chinese Folk Song Classification

The Han Chinese is an ethnic group native to East Asia and is the largest ethnic group in China. Most scholars use the general word "Chinese" to refer to the Han Chinese. However, there is considerable linguistic, customary and social diversity among the subgroups of the Han, mostly due to historical events, geographical conditions and the assimilation of various regional ethnicities and tribes. As discussed in Chapter 1, folk songs are an important part of traditional Chinese music. They reflect the ideals and emotions of the common people and illustrate their customs and social life over thousands of years of Chinese history.

There are two major classification systems for the study of Han folk songs [99]: (i) according to the place of origin (geographical based) and (ii) according to the occasion when they are sung (song type based). The study of the classification of Han folk songs according to geographical factors falls into the first group. It was pioneered by two prominent ethnomusicologists: Jing Miao and Jianzhong Qiao [100] in the late 1980’s. As a whole, geographical factors such as the environment, weather and landscape structure determine the social and economic activities. Subsequently, the social and economic structures influence the development trends and characteristics of the cultures. Naturally, these cultural elements are then reflected in the folk songs. Therefore, geographical based classification of Han folk songs is not a meaningless and unproven task.

In their research [100], Miao and Qiao suggest that due to factors such as inter- marriages, social exchanges, business communications, etc. the cultural practices in neighbouring regions are usually very similar and closely related. Therefore, there are many similar music elements exhibited in the folk songs originating from these places. This suggests that these closely related regions should be grouped instead of being viewed individually. However, there is usually no single way of drawing the boundaries. Depending on the level of understanding, the point of view and the amount of available information, Han folk songs can be partitioned into any number of categories. For example, from the world viewpoint, Han folk songs are placed in the


eastern group (as opposed to the western). If the viewpoint is narrowed down to just the Han culture, these folk songs can be broadly placed into any other number of categories. If the viewpoint is further narrowed down (to the maximum extent), each folk song is unique and has its own texture and style. In other words, any fashion of division has its relativity and should be used as a reference instead of an ultimate convention.

In addition, Miao and Qiao [100] also emphasized that, the features used to identify and define a particular class label might not be universally demonstrated in all songs that are originated from that class. In many cases, as a result of population migration, social transformations, revolutions, wars, social exchanges and other historical factors, some folk songs were “mutated”, “propagated” or “migrated”. During the process, these folk songs lost some or all of their originality while adapting to other influences. Hence, in the classification of folk songs according to geographical region of origin, it is fairly common to have candidates that do not exhibit similar characteristics to others in the same class or folk songs that exhibit characteristics that belong to more than one class. Miao and Qiao highlighted that, in many cases, the research outcomes can only be applied to the typical examples. As a result, the features used to define a class in geographical-based folk song classification can only be an approximation.

In order to derive useful and meaningful attributes to describe each geographical class, it is important to understand the factors that contribute to the forming of the musical style in the folk songs. As mentioned previously, folk songs reflect the culture of the people, hence it is important to understand the culture of a population in a certain geographical region. It is commonly said that human civilization originated around river basins. In their study [100], Miao and Qiao indicate that the geographical structure of the Han folk song culture can be divided into two broad regions: the north and the south, each of which is closely associated with the two main rivers in China: the Huang He (黄河, Yellow River) of the north and the Chang Jiang (长江, Yangtze River) of the south. In the north, the Huang He basin is further divided into two: the plains in the east and the plateaus in the west. Due to a more complex geographical structure, the south region is divided into more chunks. The Chang Jiang basin is divided into three regions: east, center and west. The east is mainly plains, the center regions comprise mountainous areas, hilly areas and lake areas, and the west is mainly plateaus. There is


another significant river in the south called the Zhu Jiang (珠江, Pearl River) that also plays an important role in forming the folk song culture. Besides those regions that are situated on the river basins, the regions situated between them are classified as transitional areas. A map of the three main rivers is shown in Figure 3.1.

Figure 3.1: Map of the three main rivers: the Yellow River, the Yangtze River and the Pearl River.

The east of Huang He basin covers regions such as Hebei, Shandong, Liaoning, Jilin and Heilongjiang. These regions have fertile land, rich natural resources, convenient transportation and a prosperous economy. Agriculture, forestry, animal husbandry and fishery are active in these regions. The natural environments and the various economic activities create diversity in the society in these regions. In addition, active trading that happened in these regions gives rise to external influence in the folk songs in these regions. There are substantial numbers of folk songs in these regions that are dispersed from the northwest and southwest regions. The common forms of folk songs in these regions are xiaodiao (ditty) and haozi (work song). Xiaodiao is usually


sung as a form of entertainment or as folk art performances. The melody is usually well organized and very decorative. Haozi are sung during collective physical work. They usually have fast and powerful rhythms which synchronize with the movements of the laborers. Folk songs from these regions usually have intervallic jumps of a fifth, sixth or seventh.

The northwest regions (west of the Huang He basin), such as Shaanxi, Shanxi, Ningxia and Gansu, have large areas covered by the Loess Plateau (also known as the Huangtu Plateau). These regions are sparsely populated, have many gullies and ravines, and contain many areas that are difficult to access. Unlike the northeast regions, the land here is not suitable for agriculture and people have to travel to other regions for jobs. The land structure also leads to problems in building a good transportation system. The main means of transportation are the horse and the donkey. This gives rise to a unique social class – the porter, who is responsible for transporting local products for trading. The job obliges people to travel long distances through rugged and remote mountain roads. While traveling, they sing songs to relieve tiredness and as a form of self-entertainment. These songs usually have free rhythm and a bold, unrestrained, dark and "long-drawn-out" texture along with a hint of misery and gloom. This style of folk song is usually unique to the plateau land structure and very uncommon in plains and watery regions such as those in the east of the Chang Jiang basin. The common form of folk songs is the shange (mountain song). Shange are songs sung in open areas like the mountains or open fields. Some shange are sung while working but, unlike haozi, the associated physical movements are usually minimal and less intense. The interval distances of a fourth (especially the perfect fourth) and a second (especially the major second) are the common representatives of the style of folk songs in these regions. They can effectively express the dreary, desolate and sorrowful mood of the plateau.

The southwest regions, including Sichuan, Guizhou, Yunnan and the northwest part of Guangxi, are where the majority of the Han people resides. These regions are located on the western part of the Chang Jiang basin and have a land structure similar to the northwest regions, which is mainly plateaus. Unlike the dry and windy climate of the northwest, these southwest regions fall in the temperate and subtropical climate zones with sufficient rainfall throughout the year. Rice is one of the main crops in these regions. The most popular form of folk song in these regions is the shange. Most


shange from these southwest regions are lyrical. Some of them are love songs, and many of the lyrics in these songs include words that picture beautiful scenes of the villages and landscapes. Chuanfu haozi (boatman work song) is also very common in Sichuan. Gewu xiaodiao (dancing ditty) is popular in Yunnan and Guizhou. Folk songs in these regions usually have a small pitch range and small intervals. It should be noted that since ancient times, these regions in the southwest of China have been populated by peoples from many different ethnic groups. The influence of non-Han materials in the Han folk songs is therefore bound to be common. In addition, some folk songs are commonly shared among the Han and non-Han peoples.

The regions around the Zhu Jiang basin include the majority of Guangdong (except the non-Han areas), the southern part of Guangxi and Hainan. The climate here belongs to the subtropical zone. These regions are surrounded by islands and harbours in the south. Fishery is very active, and many forms of folk songs are common in these regions: gewu xiaodiao, yuge (fishermen song), shange, haozi and xiaodiao. The folk songs in these regions focus a lot on the life of the fishermen and farmers (the two main occupations in these regions). The pitch range used in folk songs originating from these regions is usually slightly more than an octave. "Sol" and "re" are fairly commonly used, and interval distances of a fifth, sixth and seventh are common among folk songs in these regions.

The southeast regions such as Jiangsu, Zhejiang and Anhui are on the plains of the Chang Jiang basin. These regions have a mild climate, rich resources and adequate rainfall, and are a suitable area for growing rice. Many forms of folk song circulate around these regions. Among them are tiange (farm field song), xiaodiao, haozi, yuge, shange and chage (tea song). Tiange is usually sung by farmers when working in the rice fields to create a lively atmosphere and to make the work less tiresome. Similarly, chage is sung during tea-picking. Xiaodiao is the most popular and most representative form of folk song from these regions, and has great influence on folk songs in other regions of China. Folk songs in the southeast usually proceed in stepwise movement. It is also a common feature to insert a big interval in the stepwise progression of folk songs. The interval is usually a minor sixth (especially "mi" to "do") or a perfect octave. Most folk songs in these regions, especially Jiangsu, follow the pentatonic scale closely (i.e. "fa" and "ti" rarely occur).


This section does not include all regions in China. The regions that are left out are mainly within the transitional zone which is not within the focus of this thesis. A thorough analysis and discussion on geographical based classification of folk songs is presented in [100].

3.1.1 Rationale for the Choice of the Five Classes

This thesis focuses on Han folk songs from five classes: Dongbei1 (东北), Shanxi (山西), Sichuan (四川), Guangdong (广东) and Jiangsu (江苏). A few factors were taken into consideration when selecting the classes for the research.

1. The five classes selected are all within the main regions of the folk song culture highlighted in the previous section and also in [100]. Dongbei is part of the plains located east of the Huang He basin while Shanxi is in the west of the Huang He basin. Sichuan is on the plateau in the west of Chang Jiang basin and Jiangsu, on the other hand, is in the east. Finally, Guangdong is located on the Zhu Jiang basin. These classes are highlighted in Figure 3.2.

2. In [100], the authors point out that folk songs from neighbouring regions usually possess similar characteristics and texture. This is generally due to the similar customs, social structures and practices, and other cultural activities that are shared among people in those areas. These similarities usually result from communication and social exchanges among the people. However, mountains and rivers usually act as natural barriers that break off communication and hence naturally encourage the growth of different cultures. The five classes selected are geographically reasonably far apart from each other. Hence, it is practical to categorize them as separate classes. However, as mentioned earlier, the migration of people and the propagation of popular folk tunes remain a concern, causing similarity between folk songs from different regions. In other words, although each of these five classes can be regarded as geographically

1 Dongbei comprises Liaoning, Jilin and Heilongjiang.


separate, it is unavoidable that some folk songs are related to more than one class.

Figure 3.2: Map of the regions in China with the five classes studied in this thesis highlighted.

3. Another concern when selecting the five classes for the research is the size of the sample data. When there is more than one choice, the region with the largest number of samples is employed. For example, Shanxi, Gansu, Ningxia and Shaanxi are all located within the western part of the Huang He basin and are all considered to have a similar "folk song colour" in [100]. Hence, when deciding the candidate region for research, the region with the largest data sample is used. It is important to note that, even though these regions fall within the same "colour area", each of them still possesses its own differences. In other words, they are similar from a broader perspective but dissimilar in the


narrower details. In many cases, especially when the area covered by a region is significantly large, folk songs within the region might show dissimilarities in many subtle details.

4. A major concern when deciding the candidates for the study is to avoid regions that have large populations of non-Han (minority ethnic group) people. For example, Ningxia has a large population of Hui people in addition to Han people, and Qinghai is home to a large population of ethnic Tibetans. As pointed out in [100], cross-cultural phenomena can easily be traced in folk songs originating from regions that are under the influence of cultures other than Han. Hence, if these regions were included in the study, the overall classification task might be "contaminated" and "complicated".

5. Although it might be of minor importance, the popularity of the folk songs contributes to another concern when performing the selection. The five targeted classes encompass folk songs that are more familiar and well known to both professional and non-professional people within and outside China.

It is important to note that Dongbei comprises three regions: Liaoning, Jilin and Heilongjiang. The reasons for combining these regions into one are: (i) in the Essen Folksong Collection, the origin information of some folk songs does not clearly state which of the three regions the folk song originated from (only "Dongbei" is stated); (ii) these three areas are commonly referred to as Dongbei in much of the literature; (iii) in [100], these three regions are usually viewed as one combined region; (iv) the number of folk songs from each region is limited, especially when there is no indication as to which of the three regions a particular folk song should be classified under.

3.2 Music Data Set – The Essen Folksong Collection

The Essen Folksong Collection [1] was originally created by Helmut Schaffrath and later edited by Ewa Dahlig and David Huron. It was developed in the early 1980s for


monophonic folk music research. The Humdrum **kern2 version of the Essen Folksong Collection was prepared and edited by David Huron. It is publicly available at http://kern.ccarh.org/browse?l=essen. The database contains folk songs from Europe, Asia and the Americas. The largest categories in the database are Germany and China.

There are a total of 1,222 songs from the Han Chinese category in the online version of the Essen Folksong Collection. This thesis focuses on Chinese folk songs from five classes: Dongbei (东北), Shanxi (山西), Sichuan (四川), Guangdong (广东) and Jiangsu (江苏). There are 333 folk songs belonging to these five classes: 70 from Dongbei, 75 from Shanxi, 43 from Sichuan, 61 from Guangdong and 84 from Jiangsu.

3.2.1 The **Kern Representation

The **kern representation [80] is the most popular of the many predefined Humdrum representations developed by David Huron. It is a symbolic format of music encoding that allows researchers to encode, manipulate and output musically pertinent representations. **Kern conforms to the broader Humdrum Syntax, a grammar for representing musical information. The details of the Humdrum Syntax are beyond the scope of this thesis, but further information about Humdrum can be found in [73,79,101]. Kern permits the representation of core musical information, including information on pitch, duration, accidentals, ties, slurs, phrasing, bar lines, articulation, ornamentation, stem direction, etc. The main purpose of the kern representation is to facilitate analytic applications. It represents the underlying syntactic information conveyed by a musical score. In other words, kern encodes the canonical score rather than its visual or orthographic rendering. The **kern representation supports the score-related signifiers listed as follows [80]:

Pitch: concert pitch, accidentals, clefs, position, key signatures, key, harmonics, glissandi, arpeggiations, unpitched events, multiple stops, etc.;

2 **kern is the formal name, especially within the context of other Humdrum representations. Kern is used in a less strict manner.


Duration: canonic musical durations, rests, augmentation dots, n-tuplets, ties, tempo (beats per minute), meter signatures, gruppetto designations, acciaccaturas, indefinite or durationless events;
Articulation and ornamentation: staccato, spiccato, pizzicato, attacca, accent mark, sforzando, breath mark, generic articulation, trills (half-step), trills (whole-step), mordent, inverted mordent, turn, inverted turn, generic ornaments;
Timbre: instrument name, instrument class;
Other: phrase marks, slurs, elision markers, bar lines, double bar lines, dotted bar lines, partial bar lines, invisible bar lines, measure numbers, system/staff arrangement, beams, partial beams, stem directions, up-bows, down-bows;
Editorial: sic, editorial interpretation markers, editorial intervention markers, editorial footnotes, global comments, local comments, user-defined symbols.

A musical work encoded using kern can comprise any number of the above listed signifiers but none of this information is mandatory. For example, a bona fide kern file might consist of just phrase marks and bar lines. Kern is able to encode the bare bones of traditional Western musical notation but it still lacks the ability to represent additional types of information, for example, transposed pitch, pitch frequency, scale degree, MIDI key number, cents, melodic contour and pitch intervals. One notable limit of kern is its inability to represent musical dynamics.

3.2.2 An Example of Han Chinese Folk Song in **Kern Format

All kern files are standard ASCII files. Typically, one file is used to encode a single work or movement. Figure 3.3 and Figure 3.4 are used as an example to demonstrate the **kern representation. Figure 3.3 is an example of a Han Chinese folk song from Jiangsu (江苏) titled Si Ji Ge (四季歌). Figure 3.4 is the illustration of **kern representation of the music in Figure 3.3. In **kern representation, a single column of data is used to represent a part or an instrument. For music with more than one part or


one instrument, there will be multiple columns within a file, each representing one of the parts or instruments. The kern representation proceeds vertically down the page.

Figure 3.3: The musical score of a Jiangsu folk song – Si Ji Ge.

In kern files, comments are lines (records) that begin with an exclamation mark: global comments “!!” pertain to the entire encoding and local comments “!” pertain to a single column of data. Reference records are a special type of comment. They are a formal way of encoding “library-type” information pertaining to a Humdrum document and provide standardized ways of encoding bibliographic information. Reference records usually start with three exclamation marks (!!!). The following is an explanation of each of the reference records shown in Figure 3.4:

!!!OTL: Title. This item records the title of the specific work or section or segment. Titles are rendered in the original language.

!!!ARE: Geographical region of origin. This reference identifies the geographical location from which the work originates. Location is usually encoded using the local language. The location begins with the continent designation and becomes more refined. Depending on the available information, the refinement can include suburban district or even street address.


!!!OTL: Si ji ge
!!!ARE: Asia, China, nan Jiangsu
!! Ethnic Group: Han
!!!SCT: C0954
!!!YEM: Copyright 1995, estate of Helmut Schaffrath.
**kern
*ICvox
*Ivox
*M2/4
*k[f#c#g#]
*A:
{8cc# 8b 8cc# 8ee
=1 8a 8f# 8ee 8cc#
=2 4.b 16cc# 16b}
=3 {8a 16a 16f# 8a 8b
=4 8a 8f# 8e 8c#
=5 2e}
=6 {4f# 8a 8e
=7 8.f# 16a 8b 8cc#
=8 8b 8a 8f# 8e
=9 2f#}
=10 {8.f# 16a 8b 8cc#
=11 8b 8a 8f# 8e
=12 8.c# 16e 8a 8f#
=13 2e}
==
!!!AGN: Xiaodiao, Shidiao, Kuqiqi, Mengjiangnüdiao. Siji sixiang
!!!ONB: ESAC (Essen Associative Code) Database: CHINA
!!!AMT: simple duple
!!!AIN: vox
!!!EED: Helmut Schaffrath
!!!EEV: 1.0
*-

Figure 3.4: A **kern representation of the Jiangsu folk song – Si Ji Ge. (The note and bar-line tokens are grouped by measure here for readability; in the actual kern file each token occupies its own record, proceeding vertically down the page.)


!!!SCT: Scholarly catalogue abbreviation and number.

!!!YEM: Copyright message. This record conveys any special texts related to copyright. It might convey a simple warning, registration or licensing information, or indicate that the document is shareware.

!!!AGN: Genre designation. This is a free-form text message that can be used to identify the genre of the work.

!!!ONB: Free format note related to the title or identity of the encoded work.

!!!AMT: Metric classification. There are eight categories in which the meter of a file may be classified: simple duple, simple triple, simple quadruple, compound duple, compound triple, compound quadruple, irregular and various.

!!!AIN: Instrumentation. This reference is used to list all instruments (and voices) used in the work. Instrumentation is encoded using the abbreviations specified by the *I tandem interpretation.

!!!EED: Electronic editor. This reference identifies the name of the editor of the electronic document.

!!!EEV: Electronic edition version. This reference identifies the specific editorial version of the work.

All kern representations begin with the keyword **kern to indicate that the subsequent encoded material conforms to the kern representation. The encoded passage always ends with a special terminator token (*-). Tandem interpretations are used in all Humdrum documents to encode additional or supplementary information. Tandem interpretations are identified by a single asterisk (*). The following explains the meaning of each of the tandem interpretations in Figure 3.4:

*ICvox: *IC is the tandem interpretation to identify the instrument class. *ICvox identifies the pre-defined instrument class for voice.

*Ivox: *I is the tandem interpretation to identify the instrument name. *Ivox signifies generic voice.

*M2/4: *M identifies the meter signatures. *M2/4 signifies simple duple, i.e. two quarter notes per bar.


*k[f#c#g#]: *k identifies the key signatures. *k[f#c#g#] signifies the key signatures for A major.

*A:: *A: is the tandem interpretation to identify the key (in this case, A major). Major keys are represented using upper-case letters and minor keys are represented using lower-case letters. Key specifications are always preceded by an asterisk (*) and terminated with a colon (:).

In Humdrum documents, information is encoded in data tokens. The **kern representation distinguishes three types of data tokens: (1) notes, (2) rests and (3) bar lines. In all kern files, all non-null data tokens must be one of these three types. The note tokens can encode a variety of attributes including absolute pitch, accidental, canonic musical duration, ties, articulation, ornamentation, slurs, musical phrasing, stem direction and beaming. Pitch information is encoded using upper- and lower-case letters. Middle C (C4) is represented using the lower-case letter “c”. Successive octaves are denoted by repeating the letter, i.e. C5 is “cc”, C6 is “ccc” and so on; the higher the octave, the more repetitions of the letter. Pitches below Middle C are represented using upper-case letters: C3 is designated as “C”, C2 is “CC” and so on. The same scheme is applied to all other pitch letter-names. Changes of octave are deemed to occur between B and C. For example, the B below Middle C is represented as “B” while the B below “cc” is represented as “b”. All pitches are encoded as equally-tempered values. In the very rare cases where a tuning system other than equal temperament is used, a special tandem interpretation is provided to indicate the new tuning system.

Accidentals are encoded immediately after the diatonic pitch information. Sharps are encoded using the hash sign (#), flats using the minus sign (-) and naturals using the lower-case letter “n”. Double-flats and double-sharps are represented by repetition of their respective signs. It is to be noted that in **kern representation, all pitches are encoded as contextually independent absolute values. This means that all pitches are encoded as isolated entities, regardless of the events going on around them. For example, pitches must be encoded with the appropriate accidental even if the accidental is specified by the key signature. In addition, for transposing instruments, the pitches are encoded regardless of the transposition, i.e. the pitches are represented at the sounding (concert) pitch. However, a special tandem interpretation will be provided to indicate the nature of the transposition.
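To make the pitch rules above concrete, the following Python sketch (not part of the thesis; the function and dictionary names are illustrative) converts the pitch portion of a **kern note token into a MIDI note number, assuming equal temperament and the octave and accidental conventions just described.

```python
# Illustrative sketch only: map a **kern pitch token such as "8cc#", "B-" or
# "16f" to a MIDI note number, following the letter/octave/accidental rules
# described above. Names are hypothetical, not taken from any kern library.
PITCH_CLASS = {'c': 0, 'd': 2, 'e': 4, 'f': 5, 'g': 7, 'a': 9, 'b': 11}

def kern_pitch_to_midi(token: str) -> int:
    """Return the MIDI note number encoded by the pitch letters of a kern token."""
    letters = [ch for ch in token if ch.lower() in PITCH_CLASS]
    if not letters:
        raise ValueError(f"no pitch letters in token {token!r}")
    letter = letters[0].lower()
    count = len(letters)
    if letters[0].islower():        # lower case: Middle C octave and above
        octave = 4 + (count - 1)    # "c" -> C4, "cc" -> C5, "ccc" -> C6, ...
    else:                           # upper case: octaves below Middle C
        octave = 4 - count          # "C" -> C3, "CC" -> C2, ...
    midi = 12 * (octave + 1) + PITCH_CLASS[letter]
    midi += token.count('#')        # each sharp raises the pitch by a semitone
    midi -= token.count('-')        # each flat lowers the pitch by a semitone
    return midi

# Examples: "8cc#" -> 73 (C#5), "B" -> 59 (B3), "16f#" -> 66 (F#4)
```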

In kern, durations are encoded using reciprocal numerical values corresponding to the American duration names, i.e. “1” for the whole note, “4” for the quarter note, “8” for the eighth note and so on. The number zero (0) is used for the breve duration (i.e. a duration of twice the length of a whole note). The augmentation dot (.) is used to indicate dotted durations. It is added immediately following the numerical value. For example, a dotted-quarter note is represented as “4.”. Any number of augmentation dots may follow the duration integer. Hence, “4..” signifies a doubly dotted-quarter note.

Triplet and other irregular durations are represented using the same logic. Take for example, the half-note triplet duration. Three half-note triplets occur in the time of two half notes, which is the duration of one whole note. If the whole note duration “1” is divided equally into three parts, each part has the duration of one-third. The corresponding reciprocal integer for 1/3 is 3. Hence, the half-note triplet is represented as “3” in kern representation.
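The reciprocal scheme, augmentation dots and tuplets can likewise be illustrated with a short sketch. The code below is an illustration, not the thesis implementation: it converts the numeric part of a **kern token into a length measured in crotchets (quarter notes), the base duration used later in Section 3.3.2, with the breve token "0" handled as stated above.

```python
from fractions import Fraction

def kern_duration_to_crotchets(token: str) -> Fraction:
    """Length of a **kern duration token in crotchet (quarter-note) units."""
    digits = ''.join(ch for ch in token if ch.isdigit())
    dots = token.count('.')
    value = int(digits)
    if value == 0:
        length = Fraction(8, 1)      # "0" is the breve: two whole notes = 8 crotchets
    else:
        length = Fraction(4, value)  # reciprocal notation: "4" -> 1, "8" -> 1/2, "3" -> 4/3
    # each augmentation dot adds half of the previously added value
    total, add = length, length
    for _ in range(dots):
        add /= 2
        total += add
    return total

# Examples: "4." -> 3/2 (dotted crotchet), "16" -> 1/4 (semiquaver),
#           "3"  -> 4/3 (minim triplet),   "4.." -> 7/4 (doubly dotted crotchet)
```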

All curved lines found in printed scores are explicitly interpreted as either ties, slurs or phrases in **kern representation. There is no generic representation for them. Ties are represented by square brackets. The open square bracket ([) denotes the first note of a tie and the closed square bracket (]) marks the last note of the tie. The underscore character (_) denotes the middle notes of a tie. Slurs are represented by parenthesis. Open parenthesis (() signifies the beginning of a slur and closed parenthesis ()) signifies the end of the slur. Phrases are marked with open brace ({) and closed brace (}) to denote the beginning and end of a phrase respectively. Slur and phrase markings can be nested and may also be elided.

The rest tokens in **kern representation are denoted by a single lower-case letter “r” along with the numerical value of the duration signifier. Bar lines are signified by the equals sign (=). Immediately following the equals sign is an optional integer value indicating the measure number. For example, “=2” indicates the end of the second measure (bar). Double bar lines are signified by a minimum of two successive equals signs (==). Several consecutive equals signs might be used to enhance readability.

As seen in Figure 3.4, not all signifiers listed in the previous section are used in the kern file. The **kern representation of Han Chinese folk songs in the Essen


Folksong Collection consists of the concert pitch, accidentals, key signatures, key, canonic musical durations, rests, augmentation dots, n-tuplets, ties, meter signatures, explicit phrase marks, bar lines, double bar lines, measure numbers, instrument name, instrument class, global comments and reference records. It should be noted that song lyrics are not included in the kern files. A complete list of kern signifiers, their function, application in music encoding, coding conventions and other examples of music documented using kern representation can be found in [73,36].

3.2.3 Assumptions in **Kern Version of the Essen Folksong Collection

The **kern version of the Essen Folksong Collection was translated automatically from the Essen Associative Code (EsAC) with hand editing. During the production, a number of assumptions were made and they should be carefully considered before using the **kern version of the Essen Folksong Collection. These assumptions [1] are listed below:

1. The original EsAC database was encoded from various sources. In most cases, the citations to the sources were significantly abbreviated in the original data. No attempt was made to provide more complete reference information in the **kern version of the database. All citation information present in the source database has been retained. Nonetheless, some original sources may remain obscure.

2. The EsAC does not encode absolute pitch height. Instead, the pitch information is presented as a combination of the tonic pitch and the (diatonic) solfege-type designations. In translating to **kern representation, an estimation of the appropriate octave placement was made. In the **kern translation, the tonic pitch for the “principal octave” is assigned to the range C4 to B4. Hence, for example, a work beginning on the mediant pitch with a tonic of C would be assigned to E4.


3. The **kern key designators require a distinction between major and minor keys. Unfortunately, this information was not present in the original EsAC database. Hence, major/minor designations were assigned according to the following method: keys were assumed to be major unless a lowered mediant or lowered submediant tone appeared in the first phrase – in which case the key is assumed to be minor.

3.3 Music Elements and Encoding

In this section, the music information that can be derived from the folk songs in the Essen Folksong Collection and that is useful for Han Chinese folk song classification is discussed. This section also discusses the encoding method used to convert this music information into numerical representations that can later be meaningfully and easily adopted to construct feature vectors for machine classification.

The upbringing and ideology that accompanies a person’s knowledge about music can lead to favoritism of certain characteristics of music. In this thesis, efforts were made to minimize favoritism by employing musical elements that are essential to music of most styles. Nonetheless, the Western tonal musical training that the author received and the nature of the music representation employed in the dataset might involuntarily lead to a discussion using terms that are based on Western music tradition.

Music is the art of sound. Owen [102] characterized musical sound using four main elements: pitch, duration, dynamic and timbre. Pitch is the human perception of the relative highness or lowness of a sound and may be further described as definite pitch or indefinite pitch. Duration is the relative length of a sound. Dynamic is the relative strength or loudness of a sound. Timbre is the quality or tone colour of a sound. It varies between voices and types of musical instruments. It is the result of complex interactions between various pitches (harmonics), durations and dynamics over time. These four elements have a common dimension – time. Therefore, music is characterized as a temporal art.


As demonstrated by the example in Figure 3.4, the **kern representation of Han folk songs encoded only pitch and duration information. No information on dynamic and timbre is presented. Hence, the discussion in the following sections will focus only on the pitch and duration information of the folk songs. It is important to note that all discussions are based on the twelve-tone equal temperament tuning system.

3.3.1 Pitch Elements

3.3.1.1 Solfege

One of the main characteristics of folk songs is their method of transmission. Unlike other styles of music, such as art songs and composed songs, folk songs are transmitted through oral tradition. When dealing with the pitch element of music, most existing research employs the absolute pitch representation of the song melody. However, in this thesis, an alternative pitch representation – solfege is proposed. Solfege is a solmization technique that is commonly used for sight singing where each note of the scale is represented by a special syllable. The seven most commonly used syllables are: do, re, mi, fa, sol, la and ti. There are two methods of applying solfege: fixed-do and movable-do. In the fixed-do solfege system, the syllables each correspond to the name of a note. Hence, do, re, mi, fa, sol, la and si (si is used instead of ti in this system) are used to name notes the same way that the letters C, D, E, F, G, A and B are used. In the movable-do system, each of the syllables corresponds to a scale degree instead of to a pitch. The first degree of a major scale is always denoted as “do”, the second as “re”, the third as “mi” and so on. For minor keys, the first degree is “la”, the second is “ti”, the third is “do” and so on. For example, if a piece is in C major, then C is denoted as “do”, D is “re” and so on. If a piece is in C minor, then C is “la”, D is “ti” and so on. In the movable-do system, a tune is always sol-faed on the same syllables no matter what key it is in. In this thesis, the movable-do solfege system is employed.

The advantage of the movable-do solfege system is its ability to assist in the theoretical understanding of music. From an established tonic, the melodic and chordal implication, through the whole music piece, can be easily inferred. The movable-do


solfege system thus serves to cultivate a mental apparatus for relating each tone to its neighbours, hence capturing the overall progression of the notes. Through the emphasis on pitch contours, solfege can convey the similarity (in a relative sense) of works built (in an absolute sense) from different components. That is, for example, the syllables do-re-mi can represent the pitch names C-D-E and A-B-C# equally well. A similar concept is applied in the oral tradition of folk song. As a folk song is passed on (orally) to the next generation, due to varied voice ranges between the predecessor and the successor, instead of learning the song at the same absolute pitch (where some pitches might not exist within the successor’s voice range), the successor learns the pitch contour of the song and reproduces the melody using a more comfortable pitch range. Hence, if the singing of both the predecessor and the successor of the same folk song were to be documented in pitch names, there would be two different versions of the same folk song. However, if the solfege system is used instead, both outcomes will be identical.

The solfege notation can be easily computed using pitch and key information. It measures the scale distance between a note and its referencing key note in the musical scale. Table 3.1 portrays the reference table for computing the solfege representation. The entries in the table are calculated using the pitch name and the key available in the kern files. As highlighted previously, the music system used in this thesis is the twelve-tone equal temperament tuning system. Hence, there are 12 notes in the musical scale as shown in Table 3.1.

One of the assumptions made in translating the Essen Folksong Collection into kern representation was to assign the “principal octave” to the range C4 to B4. In this thesis, when calculating the solfege translation, the same assumption was applied. In addition, as vocal ranges are usually around two octaves [103], this thesis assumes that the “safe” range for the solfege representation is from two octaves below the principal octave up to two octaves above the principal octave, i.e. the solfege representation ranges from 1 to 60 for each of the 12 major keys. Solfege representation for minor keys takes the same value as their respective relative major key. Table 3.1 shows the solfege representation for all 12 major keys with the tonic starting within the range of the principal octave. For the other octaves, the solfege representation is calculated as follows:


$\text{solfege}_i = \text{solfege}_{\text{principal octave}} + (i \times 12)$ (3.1)

where i is the number of octaves above or below the principal octave; i can be either -1 or -2 for octaves below the principal octave and +1 or +2 for octaves above the principal octave. For informational purposes, the MIDI note number equivalent of the pitch name is also included in Table 3.1 to facilitate applicability in MIDI-format datasets. It is to be noted that all rests are encoded as “0” solfege. An illustrative sketch of this computation is given after Table 3.1. Figure 3.5 shows an example of a Jiangsu folk song – Si Ji Ge encoded with solfege representation.

Table 3.1: The solfege encoding reference table (all tonics start within principal octave).

| Pitch Name | C4 | C#4/Db4 | D4 | D#4/Eb4 | E4 | F4 | F#4/Gb4 | G4 | G#4/Ab4 | A4 | A#4/Bb4 | B4 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MIDI Note Number | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 | 68 | 69 | 70 | 71 |
| C major | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 |
| Db major | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 |
| D major | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 |
| Eb major | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 |
| E major | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 |
| F major | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 |
| F# major | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 |
| G major | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 |
| Ab major | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 |
| A major | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 |
| Bb major | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 |
| B major | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 |

3 The lower case ‘b’ is used in this case to represent the musical ‘flat’.
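As a concrete illustration of (3.1) and Table 3.1, the sketch below (hypothetical names, not taken from the thesis) computes a solfege value from a MIDI note number and the tonic of the relative major key, with the tonic anchored in the principal octave C4 to B4 as assumed above.

```python
# Illustrative sketch of the solfege encoding in (3.1) and Table 3.1.
# The tonic of the (relative) major key is anchored in the principal octave
# C4-B4, so the tonic itself receives the value 25; two octaves below and
# above give the assumed range 1..60. Rests are encoded separately as 0.
TONIC_MIDI = {'C': 60, 'Db': 61, 'D': 62, 'Eb': 63, 'E': 64, 'F': 65,
              'F#': 66, 'G': 67, 'Ab': 68, 'A': 69, 'Bb': 70, 'B': 71}

def solfege_value(midi_note: int, major_key: str) -> int:
    """Solfege value of a note relative to the major key's principal-octave tonic."""
    value = midi_note - TONIC_MIDI[major_key] + 25
    if not 1 <= value <= 60:
        raise ValueError("note lies outside the assumed five-octave range")
    return value

# Example: E4 (MIDI 64) in C major -> 29, matching the Table 3.1 entry;
#          C4 (MIDI 60) in D major -> 23.
```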


Figure 3.5: An example of a Jiangsu folk song encoded using solfege representation.

3.3.1.2 Interval

In music theory, an interval measures the distance between two musical pitches. Music gets its richness from intervals. There are two types of intervals: harmonic interval and melodic interval. A harmonic interval is the interval between two simultaneously sounding musical notes, such as a chord. A melodic interval is the interval between two notes that are separate in time, one after the other, such as two adjacent pitches in a melody. As the Essen Folksong Collection consists of monophonic melody of folk songs, the intervals discussed in this thesis are therefore the melodic intervals.

The interval reflects the pitch ratio between two adjacent notes in a melody. The most commonly used intervals are those formed between the notes of the chromatic scale, with the smallest interval being a semitone. In this thesis, the interval is measured in terms of the number of semitones between two adjacent notes and can be computed from the solfege representation:

$\text{interval}(n) = \text{solfege}(n) - \text{solfege}(n-1)$ (3.2)


where n = 2,3,4,…,N and N is the number of notes in each folk song. The value for the interval can be either a positive or a negative integer. The sign of the interval reflects the direction of the notes progression. Positive intervals denote ascending progression and negative intervals denote descending movement. It is to be noted that rests are silence and do not carry any pitch information hence are omitted when constructing the interval representation.
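A minimal sketch of (3.2) in Python (illustrative only), assuming rests are encoded with the solfege value 0 and are skipped as described above:

```python
def interval_sequence(solfege_seq):
    """Melodic intervals in semitones between successive pitched notes.

    Rests (solfege value 0) carry no pitch information and are dropped
    before the differences are taken, as described in the text.
    """
    pitches = [s for s in solfege_seq if s != 0]
    return [pitches[n] - pitches[n - 1] for n in range(1, len(pitches))]

# Example: [25, 27, 0, 29, 27] -> [2, 2, -2]
```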

As discussed in Section 3.1, the geographical aspects such as the environment, climate and landscape structure play a significant role in forming the characteristics of the folk songs. These characteristics are reflected through the size of interval. For example, the cold, dry and windy climate in the northwest regions that affect the agricultural activities and lifestyle of the people are reflected by a more intense and disjunct progression of melody in folk songs from that area [99]. In addition, the mountainous land structure makes traveling difficult. This is also reflected by the large intervals in the folk songs sung by the travelers from those regions [100]. On the other hand, folk songs from the southeast regions are usually more lyrical and conjunct [99]. Hence, the melody progression is usually smoother with small intervals between notes. This important information of the musical flow and movement of folk songs can be featured through the measurement of the interval distances. Figure 3.6 shows an example of the interval representation of the Jiangsu folk song – Si Ji Ge.

Figure 3.6: An example of a Jiangsu folk song encoded using interval representation.


3.3.2 Duration Elements

3.3.2.1 Duration

In music, duration is the relative length in time of a musical note or rest. It measures the relative amount of time a note should sound or a silence should be perceived (for rests). Duration is the fundamental property that forms the rhythm. The longest note duration in Western music is the breve (double whole note) but it is rarely used. The longest duration that is normally encountered is the semibreve (whole note), which is half the length of the breve. The next longest is the minim (half note), then the crotchet (quarter note).

The list of durations follows a logical scheme whereby, descending down the list, each successive duration is half the length of the preceding duration. Therefore, the quaver (eighth note) is half the length of the crotchet, or one eighth of the length of the semibreve; the semiquaver (sixteenth note) is half as long as the quaver; the demisemiquaver (thirty-second note) is half the length of the semiquaver and the hemidemisemiquaver (sixty-fourth note) is half the duration of the demisemiquaver. Notes shorter than the hemidemisemiquaver are very rarely used. Figure 3.7 depicts the relationships between the seven most commonly used durations ranging from the semibreve to the hemidemisemiquaver. It should be noted that the total duration on each line is the same, except that as the length of each note gets shorter, the number of notes on each line is twice that of the previous line.

In music, silence is often as important as sound. Silences are represented by rests. The note durations discussed previously each have a corresponding rest duration. Ties and dots are two ways of extending durations. In order to achieve a longer duration, ties can be used to join notes of various durations. When tied notes are encountered, only the first note within the group of tied notes is played and the sound is sustained for as long as the total duration of all the notes. For example, two tied crotchets are equivalent to the length of a minim, and a minim tied to two crotchets is equivalent to the length of a semibreve. Dots are used to increase the duration of a note or a rest by 50 percent of its original value. There can be more than one dot added to a note or a rest. The first dot represents 50 percent of the value of the initial duration, the second dot is equivalent to half the value of the first dot and so on. Figure 3.8 shows some examples of the use of ties and Figure 3.9 shows some examples of dotted notes and their equivalent duration.

Figure 3.7: The seven most commonly used durations.

Figure 3.8: Examples of tie notes and their equivalence in duration.

Figure 3.9: Examples of dotted notes and their equivalence in duration.


Tuplets are divisions of a beat into combinations of durations that do not possess simple duple ratios. In other words, they are combinations of notes or rests that have unusual durations. The most commonly used is the triplet, where three equal durations are to be performed within the same time span that two of the notated values would usually occupy. Figure 3.10 demonstrates some examples of triplets. The first example (leftmost) in Figure 3.10 is to be interpreted as performing the three minims slightly shorter so that the total time of three minims is the same as the time of two normal minims. Similarly, the second example in Figure 3.10 is to be interpreted as performing the three crotchets within the time of a minim (two normal crotchets). In the third example, the quavers are to be performed in the time of a crotchet (two normal quavers).

Figure 3.10: Examples of triplets and their equivalence in duration.

There are other divisions of durations in tuplets but they are very rare in Han Chinese folk songs and hence are not covered in this thesis. In addition, there are other variations to common durations such as grace notes which were also not employed in the Essen Folksong Collection, hence not discussed. In order for the duration information to be transformed into feature vectors for machine classification in the later stage, this information needs to be encoded into meaningful numerical representations. The duration information is encoded using the relative duration with respect to the designated base duration, the crotchet. This method is invariant to the change of tempo and yet meaningfully retains the information about the differences in length between these durations. Table 3.2 lists the durations that are common to the Han Chinese folk songs, particularly the durations that were employed in the Han folk songs within the Essen Folksong Collection, and their respective encoded representation. Rests are encoded using the negative equivalent of the same numerical value of each of their


corresponding note durations. For example, a crotchet rest is encoded as “-1” and a quaver rest is “-1/2”. Figure 3.11 demonstrates an example of duration encoding for the Jiangsu folk song – Si Ji Ge.

Figure 3.11: An example of a Jiangsu folk song encoded using duration representation.

3.3.2.2 Duration Ratio

In music, intervals serve to reveal the pattern of the pitch contour. Similarly, duration ratios can be used to express the rhythmic contour of the folk songs. Duration ratios signify the amount of change from one note (or rest) duration to the subsequent note (or rest) duration. The duration ratio is a useful measurement for expressing the rhythmic pattern of the folk songs, especially as it is invariant to the tempo and the meter of the songs.


Table 3.2: List of durations and the encoded representation.

| Duration | Encoded Representation | Dotted Duration | Encoded Representation |
|---|---|---|---|
| semibreve | 4 | dotted-semibreve | 6 |
| minim | 2 | dotted-minim | 3 |
| minim triplet | 4/3 (1.3333) | dotted-minim triplet | 2 |
| crotchet | 1 | dotted-crotchet | 3/2 (1.5) |
| crotchet triplet | 2/3 (0.6667) | dotted-crotchet triplet | 1 |
| quaver | 1/2 (0.5) | dotted-quaver | 3/4 (0.75) |
| quaver triplet | 1/3 (0.3333) | dotted-quaver triplet | 1/2 (0.5) |
| semiquaver | 1/4 (0.25) | dotted-semiquaver | 3/8 (0.375) |
| semiquaver triplet | 1/6 (0.1667) | dotted-semiquaver triplet | 1/4 (0.25) |
| demisemiquaver | 1/8 (0.125) | dotted-demisemiquaver | 3/16 (0.1875) |
| demisemiquaver triplet | 1/12 (0.0833) | dotted-demisemiquaver triplet | 1/8 (0.125) |

The information regarding the rhythmic patterns is important as it describes the distinction between music of diverse style and form. For example, a melody with lots of short notes presents an agitated impression but a chunk of lengthy notes usually creates a more graceful feeling. This knowledge can be applied to differentiating folk songs from different geographical locations. As discussed in the beginning of this chapter, the geographical characteristics of the folk songs influence the texture of these songs. Since duration ratios measure the rhythmical changes, they can be used to describe the different textures of the folk songs. For example, a large duration ratio reflects a drastic change in rhythmic progression which portrays a rougher, more intense and disjunct texture. On the other hand, smaller ratios depict a smooth, more continuous and lyrical texture.

The duration ratio is calculated based on the duration encoding discussed in the previous section. The following equation demonstrates the method of computing the duration ratio for a note or a rest:

$\text{duration ratio}(n) = \dfrac{\text{duration}(n)}{\text{duration}(n-1)}$ (3.3)


for n = 2,3,4,…,N where N is the number of notes (including rests) in a folk song.

The value for the duration ratio can be either positive or negative. A negative duration ratio denotes a change from a note to a rest or vice versa. Positive duration ratios denote the ratio between two notes or two rests. The absolute values of the duration ratios determine the pattern of the change. An absolute value of ‘1’ denotes no change between the two consecutive durations. Absolute duration ratio values that are greater than one denote a change from a shorter duration to a longer duration, while absolute values of duration ratios that are smaller than one denote changes from longer durations to shorter durations. Figure 3.12 depicts an example of a Jiangsu folk song encoded using the duration ratio representation.
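A minimal sketch of (3.3) in Python (illustrative only), assuming rests carry negative encoded durations as in Table 3.2 so that note-to-rest changes produce negative ratios automatically:

```python
from fractions import Fraction

def duration_ratio_sequence(duration_seq):
    """Duration ratios between successive encoded durations (rests are negative)."""
    return [Fraction(duration_seq[n]) / Fraction(duration_seq[n - 1])
            for n in range(1, len(duration_seq))]

# Example: durations [1, 1/2, -1/2, 1] (crotchet, quaver, quaver rest, crotchet)
# give ratios [1/2, -1, -2]: shortening, note-to-rest, rest-to-note.
```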

Figure 3.12: An example of a Jiangsu folk song encoded using duration ratio representation.


3.4 The Musical Feature Density Map

An efficient music encoding scheme must preserve the most significant musical information. In this section, a new encoding scheme for constructing music feature vectors using the music information encoded in Section 3.3 will be introduced.

The musical feature density map (MFDMap) is an encoding method proposed for constructing music feature vectors from monophonic folk song melodies using information derived from the two fundamental music elements: the musical pitch and musical duration. The concept of using these music elements corresponds to the main ethnomusicology ideas of geographically based classification of Han folk songs discussed in [99] and [100]. In addition, as these music elements are the most essential elements that must be present in all types of music, they exist in all formats of (digitized) music representations. Hence, from the practical perspective, the MFDMap can be used to construct music feature vectors from any music format.

The musical feature density map is a size v vector where each index of the vector represents an encoded representation of one of the four defined music elements: solfege, interval, duration and duration ratio. The content of each index is the percentage of the occurrence of that particular music representation within a folk song. The vector size, v, is flexible and can be altered accordingly to match the type of music it represents. In order to construct the MFDMap, a folk song has to be encoded into its solfege representation, interval representation, duration representation and duration ratio representation. Then, a list of the encoded music representation for each of the four music elements is constructed and the percentage of occurrence frequency of these representations is calculated using the following equation:

$\text{content of index}_i = \dfrac{\text{occurrence frequency of music representation}_i}{N} \times 100$ (3.4)

for i = 1,2,3,…,v, where N is the total number of notes and rests in a folk song and, regardless of the number of notes tied together, each set of tied notes is counted as one note. Note that instead of N, N – (number of rests) – 1 is used for the interval representation and N – 1 is used for the duration ratio representation.


The arrangement of the music representations in the vector is in the ascending order ranging from the smallest value to the largest value for each of the four elements and according to the sequence: solfege, interval, duration then duration ratio. Figure 3.13 shows the flow chart for constructing a MFDMap from a folk song.
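The construction in Steps 5 to 7 can be sketched as follows (illustrative code, not the thesis implementation). Because the interval sequence already excludes rests and both difference sequences are one element shorter than the note sequence, dividing each element's counts by the length of its own sequence reproduces the denominators described for (3.4).

```python
from collections import Counter

def build_mfdmap(solfege, interval, duration, duration_ratio, unified_lists):
    """Build an MFDMap from the four encoded sequences of one folk song.

    unified_lists: dict with keys 'solfege', 'interval', 'duration', 'ratio',
    each holding the ascending, unified list of admissible encoded values.
    """
    sequences = {'solfege': solfege, 'interval': interval,
                 'duration': duration, 'ratio': duration_ratio}
    mfdmap = []
    for element in ('solfege', 'interval', 'duration', 'ratio'):
        seq = sequences[element]
        counts = Counter(seq)
        for value in unified_lists[element]:          # fixed ascending order
            mfdmap.append(100.0 * counts.get(value, 0) / len(seq))
    return mfdmap          # length equals the total number of representations, e.g. 172
```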

Silences in music are created through the use of rests. These silences are often as important as the sounded parts. In this thesis, the contribution of silences in Han Chinese folk songs towards the classification performance is examined through a comparison between two cases. In the first case, all folk songs are encoded into their respective MFDMap representation by considering all sounded notes and rests. In the second case, rests in all folk songs are removed and the folk songs are encoded as if there is no rest in any of the songs.

By using a Shanxi folk song – Zou Xi Kou as an example of including rests, Figure 3.14, Table 3.3 and Figure 3.15 demonstrate the steps taken to construct the MFDMap. Figure 3.14 shows the music score of the folk song together with the encoded solfege representation, interval representation, duration representation and duration ratio representation (Step 1 to Step 4). Table 3.3 shows the combined outcome from Step 5, Step 6 and Step 7, that is, the sorted list of encoded music representations for each of the four music elements: solfege, interval, duration and duration ratio, together with their corresponding occurrence percentage. The column “Vector Index” in Table 3.3 is used to identify the position of each encoded music representation in the music feature vector. Finally, Figure 3.15 shows the final MFDMap, a music feature vector used to represent the Shanxi folk song, Zou Xi Kou. In this particular example, N is 45 and the vector size, v, is 39.


The flow chart in Figure 3.13 starts from a folk song and ends with its MFDMap, via the following steps:

Step 1: Encode the folk song into solfege representation.
Step 2: Encode the folk song into interval representation.
Step 3: Encode the folk song into duration representation.
Step 4: Encode the folk song into duration ratio representation.
Step 5: Sort all values in each representation in ascending order.
Step 6: Group the sorted values in the following sequence: solfege, interval, duration and duration ratio.
Step 7: Calculate the occurrence percentage of each value.

Figure 3.13: The flow chart for constructing a MFDMap.


Figure 3.14: The music score and the encoded solfege, interval, duration and duration ratio representations (Step 1 to 4 in constructing Case 1 MFDMap).


Table 3.3: List of encoded music representations and their respective occurrence percentage (Step 5 to 7 in constructing Case 1 MFDMap).

| Music Element (Total) | Encoded Music Representation | Occurrence Frequency | Occurrence Percentage | Vector Index |
|---|---|---|---|---|
| Solfege (45) | 0 | 2 | 4.44 | 1 |
| | 27 | 1 | 2.22 | 2 |
| | 29 | 3 | 6.67 | 3 |
| | 32 | 4 | 8.89 | 4 |
| | 34 | 3 | 6.67 | 5 |
| | 36 | 2 | 4.44 | 6 |
| | 37 | 12 | 26.67 | 7 |
| | 39 | 7 | 15.56 | 8 |
| | 41 | 7 | 15.56 | 9 |
| | 42 | 1 | 2.22 | 10 |
| | 44 | 3 | 6.67 | 11 |
| Interval (42) | -8 | 1 | 2.38 | 12 |
| | -4 | 5 | 11.90 | 13 |
| | -3 | 5 | 11.90 | 14 |
| | -2 | 9 | 21.43 | 15 |
| | -1 | 2 | 4.76 | 16 |
| | 0 | 5 | 11.90 | 17 |
| | 2 | 9 | 21.43 | 18 |
| | 3 | 1 | 2.38 | 19 |
| | 5 | 4 | 9.52 | 20 |
| | 8 | 1 | 2.38 | 21 |
| Duration (45) | -1/2 | 2 | 4.44 | 22 |
| | 1/4 | 10 | 22.22 | 23 |
| | 1/2 | 24 | 53.33 | 24 |
| | 3/4 | 4 | 8.89 | 25 |
| | 1 | 1 | 2.22 | 26 |
| | 2 | 3 | 6.67 | 27 |
| | 5/2 | 1 | 2.22 | 28 |
| Duration Ratio (44) | -1 | 4 | 9.09 | 29 |
| | 1/5 | 1 | 2.27 | 30 |
| | 1/4 | 2 | 4.55 | 31 |
| | 1/3 | 4 | 9.09 | 32 |
| | 1/2 | 4 | 9.09 | 33 |
| | 1 | 14 | 31.82 | 34 |
| | 3/2 | 4 | 9.09 | 35 |
| | 2 | 7 | 15.91 | 36 |
| | 4 | 2 | 4.55 | 37 |
| | 5 | 1 | 2.27 | 38 |
| | 8 | 1 | 2.27 | 39 |


Figure 3.15: The Case 1 MFDMap for Shanxi folk song – Zou Xi Kou.

As can be seen from the example, the list of encoded music representations and the size of the MFDMap are dependent on the particular folk song they represent. Hence, all 333 Han Chinese folk songs studied in this thesis will result in MFDMaps of various lengths and different encoded music representation lists. In order to achieve unity among all folk songs, a standard list of encoded music representations needs to be defined. In Section 3.3, the range for the solfege representation is set as two octaves below the principal octave up to two octaves above the principal octave, i.e. it ranges from 1 to 60. The duration list contains 22 durations. One method to define a standard list is to include all 60 solfege representations, all possible combinations of intervals between these 60 solfeges, the 22 durations and all possible duration ratios that can be computed using these 22 durations. However, this will result in an enormous list of representations with many redundant representations that are inapplicable to Chinese folk songs.

A more practical method of defining a standard list of representations is to combine the representation list of all 333 folk songs to form a unified list of encoded music representations. Thus, all the possibilities that do not occur in any of the 333 folk songs are excluded. The list of encoded music representations that are to be used to construct the MFDMap for all 333 folk songs is as follows:


Solfege Representation: 0, 13, 15, 17, 18, 19, 20, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 41, 42, 43, 44, 46 and 49.

Interval Representation: -15, -14, -12, -10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17 and 19.

Duration Representation: -3, -2, -1.5, -1, -0.75, -0.5, -0.25, 0.0833, 0.1667, 0.25, 0.3333, 0.375, 0.5, 0.6667, 0.75, 1, 1.1667, 1.25, 1.5, 1.75, 2, 2.25, 2.5, 2.75, 3, 3.25, 3.5, 4, 4.5 and 5.

Duration Ratio Representation: -8, -6, -4, -3, -2, -1.5, -1.3333, -1, -0.75, -0.6667, -0.5, -0.4, -0.3333, -0.2857, -0.25, -0.2, -0.1667, -0.1429, -0.125, -0.1111, -0.0909, 0.0417, 0.05, 0.0667, 0.0714, 0.0833, 0.1, 0.1111, 0.125, 0.1429, 0.1667, 0.1875, 0.2, 0.2143, 0.2222, 0.25, 0.2857, 0.3, 0.3333, 0.375, 0.4, 0.4286, 0.4444, 0.5, 0.625, 0.6667, 0.7273, 0.75, 0.8, 0.8571, 0.875, 1, 1.2308, 1.3333, 1.5, 2, 2.25, 2.5, 2.6667, 2.75, 3, 3.5, 4, 4.5, 5, 5.3333, 6, 7, 7.5, 8, 9, 10, 11, 12, 13, 14, 16, 18, 20 and 24.

The list contains 31 solfege representations, 31 interval representations, 30 duration representations and 80 duration ratio representations, which sum up to a total of 172 representations.

The unified list of encoded music representations for constructing MFDMap for all 333 folk songs excluding rests is derived as before. The unified list of encoded representations for the second case contains 30 solfege representations, 31 interval representations, 23 duration representations and 61 duration ratio representations, which sum up to a total of 145 representations. These representations are listed as follows:

Solfege Representation: 13, 15, 17, 18, 19, 20, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 41, 42, 43, 44, 46 and 49.

Interval Representation: -15, -14, -12, -10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17 and 19.

Duration Representation: 0.0833, 0.1667, 0.25, 0.3333, 0.375, 0.5, 0.6667, 0.75, 1, 1.1667, 1.25, 1.5, 1.75, 2, 2.25, 2.5, 2.75, 3, 3.25, 3.5, 4, 4.5 and 5.

Duration Ratio Representation: 0.0417, 0.05, 0.0625, 0.0667, 0.0714, 0.0833, 0.0909, 0.1, 0.1111, 0.125, 0.1429, 0.1667, 0.1875, 0.2, 0.2143, 0.2222, 0.25, 0.2858, 0.3, 0.3333, 0.375, 0.4, 0.4286, 0.4444, 0.5, 0.625, 0.667, 0.7273, 0.75, 0.8, 0.8571, 0.875, 1, 1.2308, 1.3333, 1.5, 2, 2.25, 2.5, 2.6667, 2.75, 3, 3.5, 4, 4.5, 5, 5.3333, 6, 7, 7.5, 8, 9, 10, 11, 12, 13, 14, 16, 18, 20 and 24.

Figure 3.16, Table 3.4 and Figure 3.17 demonstrate the steps to construct a MFDMap that adheres to the second case (ignoring the rests) using the same folk song. Figure 3.16 shows the music score of the folk song together with the encoded solfege representation, interval representation, duration representation and duration ratio representation (Step 1 to Step 4). The differences in the encoded representations (between the first case and the second case) are highlighted in Figure 3.16. Table 3.4 then shows the combined outcome from Step 5, Step 6 and Step 7, that is, the sorted list of encoded music representations for each of the four music elements: solfege, interval, duration and duration ratio, together with their corresponding occurrence percentage. The column “Vector Index” in Table 3.4 is used to identify the position of each encoded music representation in the music feature vector. Finally, Figure 3.17 shows the final MFDMap, with rests excluded. For this example, N is 43 and the vector size, v, is 36. It is important to note that in the second case, all rests are omitted. Hence, the N in (3.4) is the total number of notes (only notes, rests are not included) in a folk song and instead of N, N – 1 is used for both interval and duration ratio representations.

3.4.1 Advantage of the Musical Feature Density Map

The musical feature density map is designed to incorporate the ethnomusicology theory into the structure of the music feature vector. The main characteristic of the MFDMap is that it utilizes the occurrence frequency of the music elements in folk songs to distinguish the differences between Han Chinese folk songs of various geographical origins.

The main advantage of the MFDMap is that the need to segment folk songs into fragments can be completely avoided and that each folk song is analyzed as a whole. Hence, the problems of finding a representative window size and the representative window position can be easily avoided. In addition, since no segmentation of folk songs is done, the loss of continuity and integrity of music can be prevented.


Figure 3.16: The music score and the encoded solfege, interval, duration and duration ratio representations (Step 1 to 4 in constructing Case 2 MFDMap — rests omitted).


Table 3.4: List of encoded music representations and their respective occurrence percentage (Step 5 to 7 in constructing Case 2 MFDMap — rests omitted).

| Music Element (Total) | Encoded Music Representation | Occurrence Frequency | Occurrence Percentage | Vector Index |
|---|---|---|---|---|
| Solfege (43) | 27 | 1 | 2.33 | 1 |
| | 29 | 3 | 6.98 | 2 |
| | 32 | 4 | 9.30 | 3 |
| | 34 | 3 | 6.98 | 4 |
| | 36 | 2 | 4.65 | 5 |
| | 37 | 12 | 27.91 | 6 |
| | 39 | 7 | 16.28 | 7 |
| | 41 | 7 | 16.28 | 8 |
| | 42 | 1 | 2.33 | 9 |
| | 44 | 3 | 6.98 | 10 |
| Interval (42) | -8 | 1 | 2.38 | 11 |
| | -4 | 5 | 11.90 | 12 |
| | -3 | 5 | 11.90 | 13 |
| | -2 | 9 | 21.43 | 14 |
| | -1 | 2 | 4.76 | 15 |
| | 0 | 5 | 11.90 | 16 |
| | 2 | 9 | 21.43 | 17 |
| | 3 | 1 | 2.38 | 18 |
| | 5 | 4 | 9.52 | 19 |
| | 8 | 1 | 2.38 | 20 |
| Duration (43) | 1/4 | 10 | 23.26 | 21 |
| | 1/2 | 24 | 55.81 | 22 |
| | 3/4 | 4 | 9.30 | 23 |
| | 1 | 1 | 2.33 | 24 |
| | 2 | 3 | 6.98 | 25 |
| | 5/2 | 1 | 2.33 | 26 |
| Duration Ratio (42) | 1/5 | 1 | 2.38 | 27 |
| | 1/4 | 2 | 4.76 | 28 |
| | 1/3 | 4 | 9.52 | 29 |
| | 1/2 | 4 | 9.52 | 30 |
| | 1 | 16 | 38.10 | 31 |
| | 3/2 | 4 | 9.52 | 32 |
| | 2 | 7 | 16.67 | 33 |
| | 4 | 2 | 4.76 | 34 |
| | 5 | 1 | 2.38 | 35 |
| | 8 | 1 | 2.38 | 36 |


Figure 3.17: The Case 2 MFDMap for Shanxi folk song – Zou Xi Kou.

Musical works 4 are often of varied length. In order to perform machine classification on these musical works, feature vectors need to be constructed. Research in music classification employed the windowing method on each of the musical works in the database when constructing feature vectors. In the windowing method, instead of considering the musical work as a whole, each musical work is broken into one or more (depending on the window size) fixed size fragments. These music fragments are then encoded using the similar encoding technique discussed in Section 3.3 and feature vectors are constructed using these encoded fragments of music.

The main limitation of the windowing method is that there is no fixed standard for the size of the window. The choice of the window size determines the amount of information captured to define the characteristics of a musical work. In many cases, a small window size often poses the problem of having repeated window content for different musical works, which results in non-unique representations of completely independent musical works. On the other hand, an oversized window might result in information overload where redundant information is jumbled with useful information

4 Musical works here refer to all types of music, which includes folk music, folk songs and other composed music and songs.


which disguises the identity of the musical work and complicates the analysis process. Hence, varied window size often results in diverse classification performance. Therefore, exhaustive testing is usually needed to determine the most suitable window size for the classification task.

In addition to the size, the positioning of the window within a musical work is an issue that needs careful consideration. In music, each musical form has its own musical structure. Each structure usually presents a specific musical content or theme at a different musical section within the whole musical work. In other words, different musical sections contain different music information. Hence, it is important to choose the most representative position to place the window when segmenting a musical work in order to capture useful and meaningful music information.

The musical feature density map does not employ the windowing method. Instead, the MFDMap attempts to overcome the limitations of the windowing method by utilizing features that can meaningfully define a complete folk song without the need to segment it into fragments. In order to achieve this, the frequency of occurrence of the music elements is used. This method is new in machine classification of music but has ethnomusicological grounds. Béla Bartók – the great Hungarian composer, pianist, teacher and scholar – used the frequency of occurrence in his Serbo-Croatian folk song research, where certain conclusions regarding the musical characteristics and texture were made based on these statistics [104]. In addition, Miao and Qiao’s research in geographically based Han Chinese folk song classification [100] also involves investigation of frequently occurring music elements and patterns.

In typical cases, folk songs from varied geographical origins can be differentiated by the high occurrence of certain music elements or the absence of them [100]. This is effectively modeled in the MFDMap. By encapsulating this useful information, the MFDMap makes the differences between the five classes of folk songs more obvious. This can be seen from Figure 3.18 to Figure 3.32. Figure 3.18 to Figure 3.22 are examples of folk song from each of the five classes using the windowing method. In these examples, the window size of 10 musical notes is employed and the folk songs are encoded using the technique discussed in Section 3.3. All four music elements: solfege, interval, duration and duration ratio are used. Figure 3.23 to Figure


3.27 are the same examples of folk songs using the first case MFDMap method where both notes and rests are taken into consideration. Figure 3.28 to Figure 3.32 are the examples using the second case MFDMap where rests are omitted from folk songs. Notice that the various locations of the peaks and the spread reveal patterns for differentiating between the five classes.

Figure 3.18: Example of Class 1 folk song using windowing method.


Figure 3.19: Example of Class 2 folk song using windowing method.

Figure 3.20: Example of Class 3 folk song using windowing method.


Figure 3.21: Example of Class 4 folk song using windowing method.

Figure 3.22: Example of Class 5 folk song using windowing method.


Figure 3.23: Example of Class 1 folk song using Case 1 MFDMap.

Figure 3.24: Example of Class 2 folk song using Case 1 MFDMap.


Figure 3.25: Example of Class 3 folk song using Case 1 MFDMap.

Figure 3.26: Example of Class 4 folk song using Case 1 MFDMap.


Figure 3.27: Example of Class 5 folk song using Case 1 MFDMap.

Figure 3.28: Example of Class 1 folk song using Case 2 MFDMap.


Figure 3.29: Example of Class 2 folk song using Case 2 MFDMap.

Figure 3.30: Example of Class 3 folk song using Case 2 MFDMap.


Figure 3.31: Example of Class 4 folk song using Case 2 MFDMap.

Figure 3.32: Example of Class 5 folk song using Case 2 MFDMap.


3.4.2 Future Enhancement to the Musical Feature Density Map

The musical feature density map is initially designed to address the task of Han Chinese folk song classification where folk songs consist of a single melody line. It is possible that the same concept can be extended to accommodate polyphonic music. However, the design of the structure of the MFDMap needs careful consideration to accommodate changes in the number of parts in a piece of music.

The MFDMap uses four music elements: solfege, interval, duration and duration ratio. This is sufficient for geographical based Han Chinese folk song classification but might not be adequate for other music classification tasks. In classification tasks that involve other types of music, for example, symphony or other instrumental music, there is often more than one instrument involved in the whole musical work. In this case, the current MFDMap is not capable of accommodating the music information that is needed to define the various instruments that are involved. (It is to be noted that folk songs usually involve only human voices).


Chapter 4

The Extreme Learning Machine Folk Song Classifier

The multi-layer perceptron (MLP) is one of the classic neural network architectures and is very popular in the pattern recognition domain. Hornik [2] and Hornik, Stinchcombe and White [3] have proved that a multi-layer perceptron, using any arbitrarily bounded non-constant activation function, is a universal approximator. Such a network is capable of approximating any function if given a sufficient number of hidden neurons and a sufficiently large training set. The extreme learning machine (ELM), an emerging technique that utilizes the structure of a single-hidden layer feedforward neural network (SLFN), is receiving increased attention in the pattern recognition domain. However, only one previous study in the music classification area has employed this technique. In this chapter, the ELM algorithm and its enhanced variant called the regularized extreme learning machine (R-ELM) are employed as the neural network based music classifier. The performance of these machine learning algorithms in a complex real-world multi-class classification task, namely Han Chinese folk song classification, will be investigated. This chapter begins with an outline of the ELM and the R-ELM algorithms. This is followed by discussions on the design and results of the experiments for the study of machine classification of Han Chinese folk songs using both the ELM and the R-ELM.


4.1 Introduction

The single-hidden layer feedforward neural network is the simplest and most popular structure of the multi-layer perceptron. It has been found in [105] that the SLFNs with any continuous bounded nonlinear activation functions or any arbitrary (continuous or non-continuous) bounded activation function which has unequal limits at infinities can approximate any continuous function and implement any classification application with a sufficiently large number of hidden neurons.

The functionality of a single-hidden layer feedforward neural network is its capability in learning a suitable mapping from a given data set. The learning in neural network is based on the definition of a suitable error function, which is then minimized with respect to the weights and biases in the network. Therefore, the learning process comprises two stages. In the first stage, the derivatives of the error function with respect to the weights are evaluated. In the second stage, the derivatives are then used to compute weight values which minimize the error function by using an optimization method and the weights are adjusted accordingly.

The gradient descent-based algorithms are the most commonly used non-linear learning algorithm for the SLFNs. The process of gradient descent-based algorithms begins with a random selection of initial weights for the network, resulting in the network being placed at a random position on the error surface. Then, these weights are modified in an iteratively step-by-step process where each step taken on the error surface is in a direction that reduces the error. The direction is calculated using the gradient of the error surface at the current position.

The main advantage of the gradient descent-based methods is the relatively simple computation of the algorithm. However, these methods possess two major bottlenecks: the very slow learning speed and the issue of converging to local minima. In the conventional gradient descent-based learning algorithms, all parameters (weights and biases) of the neural network need iterative tuning to improve the learning performance. This iterative process is time consuming and resource consuming. In addition, gradient descent-based algorithms also suffer from the problem of choosing


the learning step that gives good convergence. In the iterative learning process, a gradient descent-based algorithm searches for the solution by gradually descending the slope of the cost function and finally arrives at a minimum. However, due to the unknown shape of the cost function, the function curve might contain more than one minimum. In the search process, depending on the random initial starting point, the gradient descent-based algorithm will stop at the nearest minimum, which might just be a local minimum and not the global minimum. Hence, the solution obtained will be non-optimal.

The extreme learning machine algorithm proposed by Huang, Zhu and Siew [33] has been proved to be capable of overcoming the above-mentioned limitations through its technique of parameter assignment.

4.2 Extreme Learning Machine

It has been shown in [33] that a single-hidden layer feedforward neural network with arbitrarily chosen input weights and hidden layer biases can be viewed as a linear system and the output weights (connecting the hidden layer to the output layer) of this SLFN can be analytically determined through a simple generalized inverse operation on the hidden layer output matrix. Such concepts form the foundation of the extreme learning machine.

In the ELM algorithm, the input weights and hidden layer biases of the SLFN are randomly assigned and the optimal output weights of the SLFN are deterministically computed using the Moore-Penrose generalized inverse of the hidden layer outputs. Hence, the ELM’s learning speed can be many times faster than conventional gradient descent-based algorithms while obtaining better performance. According to Bartlett’s [106] theory on the generalization performance of feedforward neural networks, for a feedforward neural network reaching smaller training error, the generalization performance of the network tends to be better if the norm of the network weights is smaller. The ELM algorithm tends to reach the smallest training error and the smallest norm of weights through the output weights computation using the generalized inverse operation. Hence, the ELM algorithm tends to have good generalization performance for feedforward neural networks.

The ELM algorithm utilizes the structure of a single-hidden layer feedforward neural network as in Figure 4.1. The overview of the ELM algorithm is described in the following.

Figure 4.1: A single-hidden layer feedforward neural network.

For a dataset with N distinct samples $\{(\mathbf{X},\mathbf{T}) \mid \mathbf{X} = [\mathbf{x}_1,\mathbf{x}_2,\ldots,\mathbf{x}_N],\ \mathbf{T} = [\mathbf{t}_1,\mathbf{t}_2,\ldots,\mathbf{t}_N]\}$, where $\mathbf{x}_i = [x_{i1},x_{i2},\ldots,x_{in}]^T \in \mathbb{R}^n$ is the input vector and $\mathbf{t}_i = [t_{i1},t_{i2},\ldots,t_{im}]^T \in \mathbb{R}^m$ is the target vector, the SLFN with $\tilde{N}$ hidden neurons can be written as

$$\sum_{j=1}^{\tilde{N}} \boldsymbol{\beta}_j\, g(\mathbf{w}_j \cdot \mathbf{x}_i + b_j) = \mathbf{o}_i \qquad (4.1)$$

for $i = 1,2,\ldots,N$, where $\boldsymbol{\beta}_j = [\beta_{j1}, \beta_{j2},\ldots, \beta_{jm}]^T$ is the output weight vector connecting the $j$th hidden neuron and the output neurons, $\mathbf{w}_j = [w_{j1}, w_{j2},\ldots, w_{jn}]^T$ is the input weight vector connecting the input neurons and the $j$th hidden neuron, $b_j$ is the bias of the $j$th hidden neuron, $\mathbf{w}_j \cdot \mathbf{x}_i$ denotes the inner product of $\mathbf{w}_j$ and $\mathbf{x}_i$, $g(x)$ is the activation function and $\mathbf{o}_i = [o_{i1},o_{i2},\ldots,o_{im}]^T \in \mathbb{R}^m$ is the output vector with respect to the input vector $\mathbf{x}_i = [x_{i1},x_{i2},\ldots,x_{in}]^T$. It is to be noted that the output neurons are linear, i.e. the activation function of the output neurons is a linear function.

For such a SLFN with Ñ hidden neurons and activation function g(x) to approximate N data samples with zero error, i.e.

$$\sum_{i=1}^{N} \left\| \mathbf{o}_i - \mathbf{t}_i \right\| = 0, \qquad (4.2)$$

there exist βj, wj and bj such that

$$\sum_{j=1}^{\tilde{N}} \boldsymbol{\beta}_j\, g(\mathbf{w}_j \cdot \mathbf{x}_i + b_j) = \mathbf{t}_i \qquad (4.3)$$

for $i = 1,2,\ldots,N$. (4.3) can then be written compactly in matrix form as $\mathbf{H}\boldsymbol{\beta} = \mathbf{T}$, where

$$\mathbf{H}(\mathbf{w}_1,\ldots,\mathbf{w}_{\tilde{N}}, b_1,\ldots,b_{\tilde{N}}, \mathbf{x}_1,\ldots,\mathbf{x}_N) =
\begin{bmatrix}
g(\mathbf{w}_1 \cdot \mathbf{x}_1 + b_1) & g(\mathbf{w}_2 \cdot \mathbf{x}_1 + b_2) & \cdots & g(\mathbf{w}_{\tilde{N}} \cdot \mathbf{x}_1 + b_{\tilde{N}}) \\
g(\mathbf{w}_1 \cdot \mathbf{x}_2 + b_1) & g(\mathbf{w}_2 \cdot \mathbf{x}_2 + b_2) & \cdots & g(\mathbf{w}_{\tilde{N}} \cdot \mathbf{x}_2 + b_{\tilde{N}}) \\
\vdots & \vdots & \ddots & \vdots \\
g(\mathbf{w}_1 \cdot \mathbf{x}_N + b_1) & g(\mathbf{w}_2 \cdot \mathbf{x}_N + b_2) & \cdots & g(\mathbf{w}_{\tilde{N}} \cdot \mathbf{x}_N + b_{\tilde{N}})
\end{bmatrix}_{N \times \tilde{N}}, \qquad (4.4)$$

$$\boldsymbol{\beta} = \begin{bmatrix} \boldsymbol{\beta}_1^T \\ \boldsymbol{\beta}_2^T \\ \vdots \\ \boldsymbol{\beta}_{\tilde{N}}^T \end{bmatrix}_{\tilde{N} \times m} \qquad \text{and} \qquad (4.5)$$

$$\mathbf{T} = \begin{bmatrix} \mathbf{t}_1^T \\ \mathbf{t}_2^T \\ \vdots \\ \mathbf{t}_N^T \end{bmatrix}_{N \times m}. \qquad (4.6)$$

H is known as the hidden layer output matrix of the neural network [107-108] where the jth column of H is the jth hidden neuron outputs with respect to inputs x1,x2,…,xN.

As mentioned before, the ELM algorithm randomly assigns the input weights and hidden layer biases of the SLFN and deterministically computes the output weights using the Moore-Penrose generalized inverse operation on the hidden layer output matrix, H. Hence, the output weights, β, of an SLFN trained with the ELM algorithm can be computed as follows:

$$\boldsymbol{\beta} = \left(\mathbf{H}^T\mathbf{H}\right)^{-1}\mathbf{H}^T\mathbf{T}. \qquad (4.7)$$
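As an illustration of the algorithm summarized above, the following is a minimal sketch of ELM training and prediction. NumPy is assumed, the hyperbolic tangent is used as the hidden activation, and the Moore-Penrose pseudo-inverse is used for the output weights, which coincides with (4.7) when H has full column rank; all identifiers are illustrative rather than the author's implementation.

```python
import numpy as np

def elm_train(X, T, n_hidden, rng=None):
    """Train an SLFN with the ELM algorithm: random input weights and biases,
    output weights from the Moore-Penrose generalized inverse of H."""
    rng = np.random.default_rng(0) if rng is None else rng
    n_features = X.shape[1]
    W = rng.uniform(-1.0, 1.0, (n_hidden, n_features))   # random input weights
    b = rng.uniform(-1.0, 1.0, n_hidden)                  # random hidden biases
    H = np.tanh(X @ W.T + b)                              # hidden layer output matrix (N x Ñ)
    beta = np.linalg.pinv(H) @ T                          # output weights, cf. eq. (4.7)
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Linear output neurons: outputs are the hidden activations times beta."""
    return np.tanh(X @ W.T + b) @ beta
```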

4.3 Regularized Extreme Learning Machine

The extreme learning machine has attracted much attention in the pattern recognition domain owing to its extremely fast learning speed and relatively good generalization performance. However, the ELM is an empirical risk minimization (ERM) based algorithm and therefore tends to generate models that overfit. In addition, the ELM algorithm provides weak control capacity as it directly calculates the minimum norm least-squares solution [34]. In order to overcome these drawbacks, Deng, Zheng and Chen [34] proposed an improved algorithm called the regularized extreme learning machine. This variant of the ELM is based on the structural risk minimization (SRM) principle of statistical learning theory and is hence able to provide better generalization capability than the ELM algorithm.

Similar to the ELM algorithm, the R-ELM utilizes the single-hidden layer feedforward neural network structure, and the input weights and hidden layer biases of the SLFN are also randomly assigned. However, in the R-ELM algorithm, Lagrange multipliers are utilized in the output weight optimization.

The definition of the SLFN with $\tilde{N}$ hidden neurons and m output neurons using the R-ELM algorithm is the same as for the ELM, (4.1) to (4.6). In order to improve the generalization performance, a weighting factor, γ, is introduced in the R-ELM to regularize the proportion between the empirical risk, represented by the sum of squared errors $\|\boldsymbol{\varepsilon}\|^2$, and the structural risk $\|\boldsymbol{\beta}\|^2$.

The proposed mathematical model for computing the output weights of the SLFN is as follows [34]:


$$\text{Minimize } \left\{ \frac{\gamma}{2}\|\boldsymbol{\varepsilon}\|^2 + \frac{1}{2}\|\boldsymbol{\beta}\|^2 \right\} \qquad (4.8)$$

subject to $\boldsymbol{\varepsilon} = \mathbf{O} - \mathbf{T} = \mathbf{H}\boldsymbol{\beta} - \mathbf{T} \qquad (4.9)$

where γ is a constant balancing parameter for adjusting the balance between the empirical risk and the structural risk. This problem can be solved by using the method of Lagrange multipliers:

$$L = \frac{\gamma}{2}\sum_{i=1}^{N}\sum_{j=1}^{m}\varepsilon_{ij}^2 + \frac{1}{2}\sum_{i=1}^{\tilde{N}}\sum_{j=1}^{m}\beta_{ij}^2 - \sum_{k=1}^{N}\sum_{p=1}^{m}\lambda_{kp}\left(\mathbf{h}_k^T\boldsymbol{\beta}_p - T_{kp} - \varepsilon_{kp}\right) \qquad (4.10)$$

where εij is the ijth element of the error matrix ε, βij is the ijth element of the output weight matrix β, Tij is the ijth element of the output data matrix T, hk is the kth row of the hidden layer output matrix H (written as a column vector), βj is the jth column of the output weight matrix β, λij is the ijth Lagrange multiplier and γ is the constant parameter used to adjust the empirical risk. Differentiating L in (4.10) with respect to (βij, εij) and setting the derivatives to zero gives

$$\frac{\partial L}{\partial \beta_{ij}} = 0 \;\Rightarrow\; \boldsymbol{\beta} = \mathbf{H}^T\boldsymbol{\lambda} \qquad \text{and} \qquad (4.11)$$

$$\frac{\partial L}{\partial \varepsilon_{ij}} = 0 \;\Rightarrow\; \boldsymbol{\lambda} = -\gamma\boldsymbol{\varepsilon}. \qquad (4.12)$$

Considering the constraint in (4.9), (4.12) can be expressed as

$$\boldsymbol{\lambda} = -\gamma\left(\mathbf{H}\boldsymbol{\beta} - \mathbf{T}\right). \qquad (4.13)$$

Using (4.13) in (4.11) leads to the computation of the output weight matrix, β, of the SLFN:

$$\boldsymbol{\beta} = \left(\frac{\mathbf{I}}{\gamma} + \mathbf{H}^T\mathbf{H}\right)^{-1}\mathbf{H}^T\mathbf{T}. \qquad (4.14)$$

In [34], (4.14) is known as the unweighted regularized extreme learning machine, and the ELM is a special case of the unweighted R-ELM when γ → ∞.
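A corresponding sketch of the R-ELM output weight computation in (4.14) is given below, under the same assumptions as the previous listing (NumPy; H already computed from random input weights and biases). As γ grows large the regularization term vanishes and the solution approaches the ELM solution.

```python
import numpy as np

def relm_output_weights(H, T, gamma):
    """Regularized ELM output weights, cf. eq. (4.14):
    beta = (I/gamma + H^T H)^{-1} H^T T."""
    n_hidden = H.shape[1]
    A = np.eye(n_hidden) / gamma + H.T @ H
    return np.linalg.solve(A, H.T @ T)   # solve instead of an explicit inverse
```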


4.4 Experiment Design and Setting

This section describes the design of the experiments used to study the capability of machines in classifying five different classes of Han Chinese folk songs defined by geographical region. In order for a machine to perform a music classification task, it first needs to learn the characteristics that distinguish the folk songs. Knowledge acquisition is accomplished through the neural network learning process, and the effectiveness of the learning is verified in the testing phase, where the machine classifier is presented with a new set of folk songs (folk songs not used in learning) and must classify them based on the learned knowledge.

The process of translating folk songs into useful musical information that can be processed by the machine classifier is explained in the following sections, together with descriptions of the settings of the machine classifier used for the classification task.

4.4.1 Data Pre-Processing and Post-Processing

Data Pre-Processing

In order to perform machine classification of Han Chinese folk songs, each of the folk songs needs to go through a pre-processing phase where, firstly, useful music information is derived from the folk songs and then converted into meaningful representations. Next, musical feature density maps (MFDMaps) are constructed from these representations. Each of these MFDMaps is a musical feature vector which is then used as an input vector to the neural network classifier. The techniques and procedures for each stage of the pre-processing phase are covered in detail in Chapter 3 and will not be repeated here.


As explained in Chapter 3, there are two cases of MFDMap design: (1) including all notes and rests in the folk songs and (2) only considering sounded notes, i.e. omitting rests in the folk songs.

Feature Selection and Dimensionality Reduction

As described in Chapter 3, the initial MFDMap of each folk song varies in size because different folk songs contain different values of the music elements. In order to standardize the size of the final musical feature vector, the MFDMap is normalized to include all musical values that appear in any of the 333 folk songs. This results in a larger MFDMap.

Increasing the number of features in the feature vector might improve performance, but this is not always the case. Under the curse-of-dimensionality phenomenon [109-110], increasing the number of features can instead degrade performance, and this is a recurring problem when working with a limited number of data samples. Learning a "state-of-nature" from a finite number of data samples in a high-dimensional feature space, with each feature having a number of possible values, requires an enormous amount of training data in order to ensure that there are several samples for each combination of values. Thus, with a limited quantity of data samples, increasing the dimensionality of the feature space quickly leads to the point where the data is very sparse, in which case it provides a very poor representation of the mapping.

From the discussion above, working with a large MFDMap might not achieve good classification performance. Hence, it might be necessary to reduce the size of the MFDMap in order to achieve better classification performance. There is a range of feature selection methods with which reduced-size feature vectors can be constructed from selected features that are more "significant" or "representative". Although few of these methods can guarantee an optimal feature selection, they often significantly improve the classification results compared with cases where no feature selection is done.


For the experiments in this chapter, the feature selection method is based on the significance of a particular musical feature1. A feature is considered significant to a particular class if a certain proportion of the members of that class possess that feature. For each of the five classes, a list of features possessed by at least x% of the members of that class is constructed. The reduced-size MFDMap is then built using the combined list of features from the five classes. The experiments start with the original MFDMap for the two cases (notes and rests, and notes only) and gradually reduce the MFDMap size by varying the value of x from 1 to 50, as sketched in the listing below. A selected list of reduced-size MFDMaps, together with the list of features each MFDMap contains, is shown in Table 4.1 for Case 1 MFDMaps and Table 4.2 for Case 2 MFDMaps. It is to be noted that for x = 1, the MFDMap size for both Case 1 and Case 2 is the same as the respective original MFDMap (see Section 3.4).
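The following sketch illustrates this significance-based selection. It assumes the MFDMaps are stored as a NumPy matrix with one row per folk song, that a feature is "possessed" when its density entry is non-zero, and that class labels are integer-coded; the names are illustrative only.

```python
import numpy as np

def select_significant_features(mfdmaps, labels, x_percent):
    """Keep a feature if at least x% of the members of some class possess it
    (non-zero density), then take the union of the per-class feature lists."""
    keep = np.zeros(mfdmaps.shape[1], dtype=bool)
    for c in np.unique(labels):
        class_maps = mfdmaps[labels == c]                   # songs of this class
        presence = (class_maps > 0).mean(axis=0) * 100.0    # % of members with the feature
        keep |= presence >= x_percent                       # union over the five classes
    return mfdmaps[:, keep], keep

# Example: build the reduced-size MFDMaps for x = 1, 3, 5, ..., 50.
# reduced, mask = select_significant_features(mfdmaps, labels, x_percent=15)
```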

Data Post-Processing

The targets for the neural classifiers in all experiments are set using the 1-of-c method, assigning one target to each of the five classes. In this method, for a set of targets, the one representing the class label of the sample is assigned '1' and the remaining targets are assigned '0'. In order to obtain the class label from the neural classifier's outputs on the test data set, the winner-takes-all method [22] is employed: among the set of outputs, only the output with the highest activation is taken into consideration, and this output determines the class label that the neural classifier assigns to a particular test sample.
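A minimal sketch of this post-processing, assuming NumPy and integer class labels from 0 to 4 for the five classes; identifiers are illustrative.

```python
import numpy as np

def one_of_c_targets(labels, n_classes=5):
    """1-of-c coding: the target of the true class is 1, all others are 0."""
    T = np.zeros((len(labels), n_classes))
    T[np.arange(len(labels)), labels] = 1.0
    return T

def winner_takes_all(outputs):
    """Assign each test sample to the class whose output has the highest activation."""
    return np.argmax(outputs, axis=1)
```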

1 Musical feature here also means the individual encoded music representation used in constructing the MFDMap. The word "feature" is used to be consistent with the concept of the feature vector.

Table 4.1: Selected list of reduced MFDMaps and their respective list of features (Case 1, notes and rests), Part 1 of 3.

MFDMap x List of Features Size Solfege Representations: 0, 13, 15, 17, 18, 19, 20, 22, 23, 24, 25, 31 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 41, 42, 43, 44, 46, 49. Interval Representations: -15, -14, -12, -10, -9, -8, -7, -6, -5, -4, - 31 3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 19. Duration Representations: -3, -2, -1.5, -1, -0.75, -0.5, -0.25, 30 0.0833, 0.1667, 0.25, 0.3333, 0.375, 0.5, 0.6667, 0.75, 1, 1.1667, 1.25, 1.5, 1.75, 2, 2.25, 2.5, 2.75, 3, 3.25, 3.5, 4, 4.5, 5. 1 172 Duration Ratio Representations: -8, -6, -4, -3, -2, -1.5, -1.3333, -1, -0.75, -0.6667, -0.5, -0.4, -0.3333, -0.2857, -0.25, -0.2, - 0.1667, -0.1429, -0.125, -0.1111, -0.0909, 0.0417, 0.05, 0.0667, 0.0714, 0.0833, 0.1, 0.1111, 0.125, 0.1429, 0.1667, 0.1875, 0.2, 80 0.2143, 0.2222, 0.25, 0.2857, 0.3, 0.3333, 0.375, 0.4, 0.4286, 0.4444, 0.5, 0.625, 0.6667, 0.7273, 0.75, 0.8, 0.8571, 0.875, 1, 1.2308, 1.3333, 1.5, 2, 2.25, 2.5, 2.6667, 2.75, 3, 3.5, 4, 4.5, 5, 5.3333, 6, 7, 7.5, 8, 9, 10, 11, 12, 13, 14, 16, 18, 20, 24. Solfege Representations: 0, 15, 17, 18, 20, 22, 24, 25, 26, 27, 29, 24 30, 31, 32, 33, 34, 36, 37, 38, 39, 41, 42, 44, 46. Interval Representations: -12, -10, -9, -8, -7, -5, -4, -3, -2, -1, 0, 26 1, 2, 3, 4, 5, 7, 8, 9, 10, 12, 14, 15, 16, 17, 19. Duration Representations: -2, -1.5, -1, -0.75, -0.5, -0.25, 0.0833, 22 0.1667, 0.25, 0.3333, 0.375, 0.5, 0.75, 1, 1.25, 1.5, 1.75, 2, 2.5, 3, 3 121 3.5, 4. Duration Ratio Representations: -4, -3, -2, -1.5, -1.3333, -1, - 0.75, -0.6667, -0.5, -0.4, -0.3333, -0.25, -0.1667, 0.0417, 0.0667, 49 0.0833, 0.1, 0.1111, 0.125, 0.1429, 0.1667, 0.1875, 0.2, 0.2143, 0.25, 0.3333, 0.375, 0.4, 0.5, 0.6667, 0.75, 0.875, 1, 1.3333, 1.5, 2, 2.6667, 3, 4, 5, 5.3333, 6, 7, 7.5, 8, 9, 10, 12, 14.


Table 4.1: Selected list of reduced MFDMaps and their respective list of features (Case 1, notes and rests), Part 2 of 3.

MFDMap x List of Features Size Solfege Representations: 0, 15, 17, 18, 20, 22, 24, 25, 26, 27, 29, 22 30, 31, 32, 34, 36, 37, 39, 41, 42, 44, 46. Interval Representations: -12, -10, -9, -8, -7, -5, -4, -3, -2, -1, 0, 23 1, 2, 3, 4, 5, 7, 8, 9, 10, 12, 14, 15. Duration Representations: -2, -1, -0.75, -0.5, -0.25, 0.0833, 5 101 21 0.1667, 0.25, 0.3333, 0.375, 0.5, 0.75, 1, 1.25, 1.5, 1.75, 2, 2.5, 3, 3.5, 4. Duration Ratio Representations: -4, -2, -1.5, -1, -0.6667, -0.5, - 0.3333, -0.25, 0.1, 0.125, 0.1429, 0.1667, 0.1875, 0.2, 0.25, 35 0.3333, 0.375, 0.4, 0.5, 0.6667, 0.75, 1, 1.3333, 1.5, 2, 3, 4, 5, 5.3333, 6, 7, 8, 9, 10, 12. Solfege Representations: 0, 15, 17, 20, 22, 24, 25, 27, 29, 30, 31, 20 32, 34, 36, 37, 39, 41, 42, 44, 46. Interval Representations: -10, -9, -8, -7, -5, -4, -3, -2, -1, 0, 1, 2, 21 3, 4, 5, 7, 8, 9, 10, 12, 15. 10 81 Duration Representations: -1, -0.5, -0.25, 0.1667, 0.25, 0.375, 15 0.5, 0.75, 1, 1.5, 1.75, 2, 2.5, 3, 4. Duration Ratio Representations: -2, -1, -0.5, -0.3333, -0.25, 25 0.125, 0.1667, 0.1875, 0.25, 0.3333, 0.375, 0.5, 0.6667, 0.75, 1, 1.3333, 1.5, 2, 3, 4, 5, 5.3333, 6, 7, 8. Solfege Representations: 0, 17, 20, 22, 24, 25, 27, 29, 30, 32, 34, 16 36, 37, 39, 41, 44. Interval Representations: -10, -9, -8, -7, -5, -4, -3, -2, -1, 0, 1, 2, 20 3, 4, 5, 7, 8, 9, 10, 12. 15 71 Duration Representations: -1, -0.5, 0.1667, 0.25, 0.375, 0.5, 12 0.75, 1, 1.5, 2, 3, 4. Duration Ratio Representations: -2, -1, -0.5, -0.3333, -0.25, 23 0.125, 0.1667, 0.1875, 0.25, 0.3333, 0.375, 0.5, 0.6667, 0.75, 1, 1.3333, 1.5, 2, 3, 4, 5.3333, 6, 8.


Table 4.1: Selected list of reduced MFDMaps and their respective list of features (Case 1, notes and rests), Part 3 of 3.

MFDMap x List of Features Size Solfege Representations: 0, 20, 22, 24, 25, 27, 29, 30, 32, 34, 36, 15 37, 39, 41, 44. Interval Representations: -10, -8, -7, -5, -4, -3, -2, -1, 0, 2, 3, 4, 18 5, 7, 8, 9, 10, 12. 20 63 Duration Representations: -1, -0.5, 0.25, 0.375, 0.5, 0.75, 1, 1.5, 10 2, 3. Duration Ratio Representations: -2, -1, -0.5, -0.3333, 0.125, 20 0.1667, 0.1875, 0.25, 0.3333, 0.375, 0.5, 0.75, 1, 1.5, 2, 3, 4, 5.3333, 6, 8. Solfege Representations: 0, 20, 22, 24, 25, 27, 29, 30, 32, 34, 36, 14 37, 39, 41. Interval Representations: -8, -7, -5, -4, -3, -2, -1, 0, 2, 3, 4, 5, 7, 15 10, 12. 30 55 Duration Representations: -1, -0.5, 0.25, 0.375, 0.5, 0.75, 1, 1.5, 9 2. Duration Ratio Representations: -2, -1, -0.5, 0.125, 0.1667, 17 0.25, 0.3333, 0.5, 0.75, 1, 1.5, 2, 3, 4, 5.3333, 6, 8. Solfege Representations: 0, 20, 22, 24, 25, 27, 29, 32, 34, 36, 37, 13 39, 41. Interval Representations: -8, -7, -5, -4, -3, -2, -1, 0, 2, 3, 4, 5, 7, 14 40 47 12. 7 Duration Representations: -0.5, 0.25, 0.5, 0.75, 1, 1.5, 2. Duration Ratio Representations: -1, -0.5, 0.25, 0.3333, 0.5, 13 0.75, 1, 1.5, 2, 3, 4, 6, 8. 10 Solfege Representations: 0, 20, 22, 25, 27, 29, 32, 34, 37, 39. 12 Interval Representations: -7, -5, -4, -3, -2, -1, 0, 2, 3, 4, 5, 7. 50 40 7 Duration Representations: -0.5, 0.25, 0.5, 0.75, 1, 1.5, 2. Duration Ratio Representations: -1, 0.25, 0.3333, 0.5, 0.75, 1, 11 1.5, 2, 3, 4, 8.


Table 4.2: Selected list of reduced MFDMaps and their respective list of features (Case 2, only notes), Part 1 of 3.

MFDMap x List of Features Size Solfege Representations: 13, 15, 17, 18, 19, 20, 22, 23, 24, 25, 30 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 41, 42, 43, 44, 46, 49. Interval Representations: -15, -14, -12, -10, -9, -8, -7, -6, -5, -4, - 31 3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 19. Duration Representations: 0.0833, 0.1667, 0.25, 0.3333, 0.375, 23 0.5, 0.6667, 0.75, 1, 1.1667, 1.25, 1.5, 1.75, 2, 2.25, 2.5, 2.75, 3, 1 145 3.25, 3.5, 4, 4.5, 5. Duration Ratio Representations: 0.0417, 0.05, 0.0625, 0.0667, 0.0714, 0.0833, 0.0909, 0.1, 0.1111, 0.125, 0.1429, 0.1667, 0.1875, 0.2, 0.2143, 0.2222, 0.25, 0.2858, 0.3, 0.3333, 0.375, 0.4, 61 0.4286, 0.4444, 0.5, 0.625, 0.667, 0.7273, 0.75, 0.8, 0.8571, 0.875, 1, 1.2308, 1.3333, 1.5, 2, 2.25, 2.5, 2.6667, 2.75, 3, 3.5, 4, 4.5, 5, 5.3333, 6, 7, 7.5, 8, 9, 10, 11, 12, 13, 14, 16, 18, 20, 24. Solfege Representations: 15, 17, 18, 20, 22, 24, 25, 26, 27, 29, 23 30, 31, 32, 33, 34, 36, 37, 38, 39, 41, 42, 44, 46. Interval Representations: -12, -10, -9, -8, -7, -5, -4, -3, -2, -1, 0, 26 1, 2, 3, 4, 5, 7, 8, 9, 10, 12, 14, 15, 16, 17, 19. Duration Representations: 0.0833, 0.1667, 0.25, 0.3333, 0.375, 3 102 16 0.5, 0.75, 1, 1.25, 1.5, 1.75, 2, 2.5, 3, 3.5, 4. Duration Ratio Representations: 0.0417, 0.0667, 0.0833, 0.1, 0.1111, 0.125, 0.1429, 0.1667, 0.1875, 0.2, 0.2143, 0.2222, 0.25, 37 0.3333, 0.375, 0.4, 0.5, 0.6667, 0.75, 0.875, 1, 1.3333, 1.5, 2, 2.6667, 3, 4, 5, 5.3333, 6, 7, 7.5, 8, 9, 10, 12, 14.


Table 4.2: Selected list of reduced MFDMaps and their respective list of features (Case 2, only notes), Part 2 of 3.

MFDMap x List of Features Size Solfege Representations: 15, 17, 18, 20, 22, 24, 25, 26, 27, 29, 21 30, 31, 32, 34, 36, 37, 39, 41, 42, 44, 46. Interval Representations: -12, -10, -9, -8, -7, -5, -4, -3, -2, -1, 0, 23 1, 2, 3, 4, 5, 7, 8, 9, 10, 12, 14, 15. 5 88 Duration Representations: 0.0833, 0.1667, 0.25, 0.3333, 0.375, 16 0.5, 0.75, 1, 1.25, 1.5, 1.75, 2, 2.5, 3, 3.5, 4. Duration Ratio Representations: 0.0833, 0.1, 0.125, 0.1429, 28 0.1667, 0.1875, 0.2, 0.25, 0.3333, 0.375, 0.4, 0.5, 0.6667, 0.75, 1, 1.3333, 1.5, 2, 3, 4, 5, 5.3333, 6, 7, 8, 9, 10, 12. Solfege Representations: 15, 17, 20, 22, 24, 25, 27, 29, 30, 31, 19 32, 34, 36, 37, 39, 41, 42, 44, 46. Interval Representations: -10, -9, -8, -7, -5, -4, -3, -2, -1, 0, 1, 2, 21 3, 4, 5, 7, 8, 9, 10, 12, 15. 10 73 Duration Representations: 0.1667, 0.25, 0.375, 0.5, 0.75, 1, 1.5, 12 1.75, 2, 2.5, 3, 4. Duration Ratio Representations: 0.125, 0.1429, 0.1667, 0.1875, 21 0.25, 0.3333, 0.375, 0.5, 0.6667, 0.75, 1, 1.3333, 1.5, 2, 3, 4, 5, 5.3333, 6, 7, 8. Solfege Representations: 17, 20, 22, 24, 25, 27, 29, 30, 32, 34, 15 36, 37, 39, 41, 44. Interval Representations: -10, -9, -8, -7, -5, -4, -3, -2, -1, 0, 1, 2, 20 3, 4, 5, 7, 8, 9, 10, 12. 15 63 Duration Representations: 0.1667, 0.25, 0.375, 0.5, 0.75, 1, 1.5, 10 2, 3, 4. Duration Ratio Representations: 0.125, 0.1667, 0.1875, 0.25, 18 0.3333, 0.375, 0.5, 0.6667, 0.75, 1, 1.3333, 1.5, 2, 3, 4, 5.3333, 6, 8.


Table 4.2: Selected list of reduced MFDMaps and their respective list of features (Case 2, only notes), Part 3 of 3.

MFDMap x List of Features Size Solfege Representations: 20, 22, 24, 25, 27, 29, 30, 32, 34, 36, 14 37, 39, 41, 44. Interval Representations: -10, -8, -7, -5, -4, -3, -2, -1, 0, 2, 3, 4, 18 5, 7, 8, 9, 10, 12. 20 58 8 Duration Representations: 0.25, 0.375, 0.5, 0.75, 1, 1.5, 2, 3. Duration Ratio Representations: 0.125, 0.1667, 0.1875, 0.25, 18 0.3333, 0.375, 0.5, 0.6667, 0.75, 1, 1.3333, 1.5, 2, 3, 4, 5.3333, 6, 8. Solfege Representations: 20, 22, 24, 25, 27, 29, 30, 32, 34, 36, 13 37, 39, 41. Interval Representations: -8, -7, -5, -4, -3, -2, -1, 0, 2, 3, 4, 5, 7, 15 30 49 10, 12. 7 Duration Representations: 0.25, 0.375, 0.5, 0.75, 1, 1.5, 2. Duration Ratio Representations: 0.125, 0.1667, 0.25, 0.3333, 14 0.5, 0.75, 1, 1.5, 2, 3, 4, 5.3333, 6, 8. Solfege Representations: 20, 22, 24, 25, 27, 29, 32, 34, 36, 37, 12 39, 41. Interval Representations: -8, -7, -5, -4, -3, -2, -1, 0, 2, 3, 4, 5, 7, 14 40 44 12. 6 Duration Representations: 0.25, 0.5, 0.75, 1, 1.5, 2. Duration Ratio Representations: 0.125, 0.25, 0.3333, 0.5, 0.75, 12 1, 1.5, 2, 3, 4, 6, 8. 9 Solfege Representations: 20, 22, 25, 27, 29, 32, 34, 37, 39. 12 Interval Representations: -7, -5, -4, -3, -2, -1, 0, 2, 3, 4, 5, 7. 50 37 6 Duration Representations: 0.25, 0.5, 0.75, 1, 1.5, 2. Duration Ratio Representations: 0.25, 0.3333, 0.5, 0.75, 1, 1.5, 10 2, 3, 4, 8.


4.4.2 Parameter Setting

Two classifiers are employed in this chapter: the extreme learning machine and the regularized extreme learning machine. For all experiments, the single-hidden layer feedforward neural network structure is used for both learning algorithms. 333 Han Chinese folk songs from five different classes are employed to study the capability of such neural network classifiers in differentiating Han Chinese folk songs from different geographical regions. The data sets used for training the neural classifier and for testing its classification performance are constructed as follows: for each class of folk songs, 10% of the songs are used for testing and the remaining 90% are used for training.

A 10-fold cross-validation is employed to thoroughly assess the folk song classification technique in this chapter. In this method, each experiment is repeated 10 times, each time using a different combination of samples as the training and testing data sets. In this way no data sample is favoured: each of the 333 folk songs is assigned as a test sample once and only once, and no combination of data sets is duplicated.
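The evaluation protocol can be sketched as follows: a stratified 10-fold split in which roughly 10% of each class is held out per fold and every song is tested exactly once, repeated over many random weight initializations. NumPy is assumed; train_fn and predict_fn are placeholders standing for the ELM or R-ELM routines sketched earlier, wrapped to accept integer class labels (for example via the 1-of-c coding and winner-takes-all decoding above).

```python
import numpy as np

def stratified_folds(labels, n_folds=10, seed=0):
    """Split sample indices into folds so that roughly 10% of each class lands in each fold."""
    rng = np.random.default_rng(seed)
    folds = [[] for _ in range(n_folds)]
    for c in np.unique(labels):
        idx = rng.permutation(np.where(labels == c)[0])
        for i, sample in enumerate(idx):
            folds[i % n_folds].append(sample)
    return [np.array(f) for f in folds]

def evaluate(X, labels, train_fn, predict_fn, n_repeats=50):
    """Mean and standard deviation of test accuracy: the 10-fold cross-validation is
    run n_repeats times, with fresh random network weights inside train_fn each time."""
    folds = stratified_folds(labels)
    accuracies = []
    for _ in range(n_repeats):
        correct = 0
        for test_idx in folds:
            train_idx = np.setdiff1d(np.arange(len(labels)), test_idx)
            model = train_fn(X[train_idx], labels[train_idx])
            predictions = predict_fn(model, X[test_idx])
            correct += np.sum(predictions == labels[test_idx])
        accuracies.append(correct / len(labels) * 100.0)
    return np.mean(accuracies), np.std(accuracies)
```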

In all experiments, the input weights and hidden layer biases of the SLFN are randomly assigned and the output weights are computed accordingly for each of the two classifiers. Because of this random element in the neural classifier, each experiment is repeated 50 times and the classification accuracy is reported as the mean accuracy of the 50 repetitions.

In the previous discussion, a non-linear activation function is used to activate the hidden neurons in the hidden layer. In general, a multi-layer perceptron learns better when the sigmoidal activation function built into the neuron model of the network is antisymmetric than when it is non-symmetric [22]. In all experiments, the hyperbolic tangent function is employed. The hyperbolic tangent function is an antisymmetric (odd) function and can be defined as

$$\tanh(x) = \frac{\sinh(x)}{\cosh(x)} = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}. \qquad (4.15)$$


In all experiments, the simulation of the neural network begins with one hidden neuron in the hidden layer and this number is gradually increased to a maximum number of 10000 hidden neurons.

In (4.14), a balancing parameter, γ, is employed in the design of the network output weights for the R-ELM algorithm. This parameter balances the empirical risk and the structural risk of the neural network. In this chapter, a set of different values is used to investigate the effect of this parameter on classifier performance; the values assigned to γ are 0.001, 0.01, 0.1, 1, 10, 100 and 1000.

4.5 Experiment Results

The single-hidden layer feedforward neural network, trained independently using either the extreme learning machine algorithm or the regularized extreme learning machine algorithm, is employed as the machine classifier and experiments are designed according to the settings discussed in the previous section.

Table 4.3 to Table 4.11 report the mean classification accuracy (in %) of the ELM classifier on the testing data set over 50 repetitions of the experiments using the different MFDMap designs, while Table 4.12 to Table 4.20 report the mean classification accuracy of the R-ELM classifier. In each repetition, the experiment is run over all 10 of the cross-validation training and testing data sets. The standard deviations of the mean values are also reported. Each table contains the results for both Case 1 and Case 2 MFDMaps.

Table 4.3 and Table 4.12 show the results using the original MFDMaps with 172 and 145 elements, as shown in Table 4.1 and Table 4.2. Table 4.4 and Table 4.13 contain the classification accuracy using the list of features for the MFDMap design where at least 3% of the members of a class possess such features. Table 4.5 and Table 4.14 are the case with at least 5%, Table 4.6 and Table 4.15 with at least 10%, and Table 4.7 and Table 4.16 with at least 15% of members of a class possessing such features. Table 4.8 and Table 4.17, Table 4.9 and Table 4.18, Table 4.10 and Table 4.19, and Table 4.11 and Table 4.20 are the cases for 20%, 30%, 40% and 50% respectively.

Table 4.21 shows the confusion matrix for the experiment using the Case 1 MFDMap at significance value x = 15 (MFDMap size 71) at 8000 hidden neurons. Table 4.22 is the confusion matrix for the experiment using the Case 2 MFDMap which is also at significance value x = 15 (MFDMap size 63) at 8000 hidden neurons. The ELM classifier is employed in both cases. Table 4.23 and Table 4.24 are the confusion matrices using the R-ELM classifier. Table 4.23 is the confusion matrix for Case 1 MFDMap at significance value x = 3 (MFDMap size 121) at 3000 hidden neurons while Table 4.24 is the Case 2 MFDMap at significance value x = 15 (MFDMap size 63) at 5000 hidden neurons.
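The reported confusion matrices are averages over the repeated runs. A minimal sketch of accumulating such a matrix is given below; NumPy is assumed, rows index the true class and columns the assigned class, and each run is assumed to cover all 333 songs once (as in the cross-validation above).

```python
import numpy as np

def mean_confusion_matrix(true_labels_per_run, predicted_labels_per_run, n_classes=5):
    """Average the per-run confusion counts: each cell is the mean number of
    test songs of a true class (row) assigned to a predicted class (column)."""
    total = np.zeros((n_classes, n_classes))
    for truth, pred in zip(true_labels_per_run, predicted_labels_per_run):
        for t, p in zip(truth, pred):
            total[t, p] += 1
    return total / len(true_labels_per_run)
```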

It is to be noted that all classification results reported in this section for the R-ELM classifier are the results using γ = 0.001.

Table 4.3: Classification accuracy of the ELM classifier using Case 1 and Case 2 MFDMap with the original map size. Case 1 MFDMap Case 2 MFDMap Number of (Map Size = 172) (Map Size = 145) Hidden Standard Standard Neurons Accuracy (%) Accuracy (%) Deviation (%) Deviation (%) 50 51.18 8.15 50.64 9.09 100 52.81 8.20 51.13 8.57 500 54.18 7.54 54.97 6.75 1000 56.71 7.91 56.13 8.22 1500 63.24 6.93 63.73 7.30 2000 65.28 8.57 66.69 7.87 2500 68.08 7.69 69.94 8.08 3000 68.98 7.40 70.25 8.37 4000 70.71 7.87 71.44 7.86 5000 71.37 7.29 73.43 8.79 8000 70.14 7.99 72.40 7.76 10000 69.88 7.72 71.00 8.16


Table 4.4: Classification accuracy of the ELM classifier using Case 1 and Case 2 MFDMap with x = 3. Case 1 MFDMap Case 2 MFDMap Number of (Map Size = 121) (Map Size = 102) Hidden Standard Standard Neurons Accuracy (%) Accuracy (%) Deviation (%) Deviation (%) 50 51.81 8.33 51.73 7.77 100 52.32 8.04 52.21 7.94 500 53.10 8.44 53.71 8.74 1000 56.20 8.16 55.80 7.57 1500 63.01 7.61 62.60 8.39 2000 66.67 7.10 66.23 8.15 2500 68.97 8.93 68.97 7.88 3000 69.15 7.70 69.99 7.47 4000 70.00 8.05 72.45 7.86 5000 71.79 8.28 72.65 7.96 8000 70.60 8.06 71.02 7.27 10000 69.88 7.55 70.07 8.10

Table 4.5: Classification accuracy of the ELM classifier using Case 1 and Case 2 MFDMap with x = 5. Case 1 MFDMap Case 2 MFDMap Number of (Map Size = 101) (Map Size = 88) Hidden Standard Standard Neurons Accuracy (%) Accuracy (%) Deviation (%) Deviation (%) 50 51.50 7.59 51.24 7.48 100 52.31 7.95 52.22 8.94 500 53.41 8.17 53.40 8.47 1000 56.48 7.55 55.41 8.39 1500 63.43 7.86 62.86 7.56 2000 65.77 8.18 65.95 8.16 2500 68.19 7.08 67.84 7.72 3000 68.92 8.52 68.14 7.84 4000 69.32 9.21 69.57 7.71 5000 71.99 8.48 70.90 7.49 8000 73.50 7.84 73.42 7.55 10000 71.00 8.21 70.00 7.90


Table 4.6: Classification accuracy of the ELM classifier using Case 1 and Case 2 MFDMap with x = 10. Case 1 MFDMap Case 2 MFDMap Number of (Map Size = 81) (Map Size = 73) Hidden Standard Standard Neurons Accuracy (%) Accuracy (%) Deviation (%) Deviation (%) 50 51.78 7.15 50.96 8.63 100 52.62 8.30 51.92 8.24 500 53.68 7.94 55.08 8.59 1000 55.61 8.35 57.88 8.06 1500 63.24 8.13 62.95 7.67 2000 66.33 8.25 64.00 7.26 2500 67.30 7.26 68.22 8.08 3000 67.47 7.49 69.29 8.41 4000 70.08 7.78 69.89 7.85 5000 70.97 8.05 71.40 7.73 8000 71.68 8.56 71.67 7.85 10000 70.09 7.86 70.88 7.45

Table 4.7: Classification accuracy of the ELM classifier using Case 1 and Case 2 MFDMap with x = 15. Case 1 MFDMap Case 2 MFDMap Number of (Map Size = 71) (Map Size = 63) Hidden Standard Standard Neurons Accuracy (%) Accuracy (%) Deviation (%) Deviation (%) 50 51.00 7.75 51.73 8.30 100 52.98 7.91 52.21 8.30 500 55.38 8.02 53.71 6.91 1000 58.24 7.97 55.80 7.34 1500 62.32 8.31 62.60 7.28 2000 65.58 8.05 66.23 8.20 2500 67.00 8.91 67.83 8.74 3000 68.52 8.16 69.35 7.84 4000 70.18 8.90 69.95 8.41 5000 71.37 7.64 71.38 8.27 8000 73.54 8.26 73.90 7.47 10000 70.94 7.87 72.00 8.20


Table 4.8: Classification accuracy of the ELM classifier using Case 1 and Case 2 MFDMap with x = 20. Case 1 MFDMap Case 2 MFDMap Number of (Map Size = 63) (Map Size = 58) Hidden Standard Standard Neurons Accuracy (%) Accuracy (%) Deviation (%) Deviation (%) 50 51.69 7.62 50.76 7.69 100 52.23 8.76 53.94 8.12 500 53.62 7.98 55.45 7.50 1000 55.48 8.82 57.47 8.49 1500 61.59 7.79 60.46 7.68 2000 64.21 8.29 66.25 8.90 2500 67.30 7.97 68.33 7.46 3000 68.63 6.99 70.38 8.10 4000 69.34 7.51 71.27 7.24 5000 69.75 8.31 71.32 7.64 8000 71.43 7.97 72.22 7.70 10000 68.99 7.44 70.55 8.20

Table 4.9: Classification accuracy of the ELM classifier using Case 1 and Case 2 MFDMap with x = 30. Case 1 MFDMap Case 2 MFDMap Number of (Map Size = 55) (Map Size = 49) Hidden Standard Standard Neurons Accuracy (%) Accuracy (%) Deviation (%) Deviation (%) 50 50.37 8.47 50.60 8.09 100 51.70 8.15 51.84 7.85 500 53.19 7.81 55.13 7.93 1000 55.11 8.41 57.56 8.30 1500 60.44 8.40 60.47 8.52 2000 64.88 8.06 66.34 7.90 2500 66.82 8.29 68.03 8.16 3000 68.07 8.21 68.68 7.88 4000 69.63 7.51 69.52 8.11 5000 69.64 8.38 71.43 8.22 8000 71.19 7.67 72.65 7.69 10000 70.00 7.70 70.25 8.14


Table 4.10: Classification accuracy of the ELM classifier using Case 1 and Case 2 MFDMap with x = 40. Case 1 MFDMap Case 2 MFDMap Number of (Map Size = 47) (Map Size = 44) Hidden Standard Standard Neurons Accuracy (%) Accuracy (%) Deviation (%) Deviation (%) 50 50.28 8.30 52.76 8.06 100 51.95 8.05 53.33 8.22 500 55.73 8.86 54.98 8.05 1000 58.07 7.70 55.65 9.39 1500 62.79 7.63 62.44 7.42 2000 66.72 7.13 63.47 7.07 2500 67.12 8.46 67.66 7.43 3000 67.48 8.43 68.08 7.45 4000 70.64 7.96 69.12 7.78 5000 70.95 8.45 71.31 7.92 8000 69.51 8.09 69.94 7.89 10000 68.99 8.15 69.53 8.27

Table 4.11: Classification accuracy of the ELM classifier using Case 1 and Case 2 MFDMap with x = 50. Case 1 MFDMap Case 2 MFDMap Number of (Map Size = 40) (Map Size = 37) Hidden Standard Standard Neurons Accuracy (%) Accuracy (%) Deviation (%) Deviation (%) 50 51.66 8.19 50.48 7.75 100 52.03 8.38 51.27 8.12 500 53.12 8.89 52.12 8.12 1000 55.36 8.61 52.90 8.04 1500 59.30 7.36 60.69 7.70 2000 63.56 6.84 64.42 7.39 2500 65.56 8.45 65.09 8.16 3000 67.28 7.08 67.90 7.33 4000 67.47 8.03 67.35 7.48 5000 70.66 8.02 70.90 8.67 8000 69.69 9.11 70.00 7.79 10000 67.10 7.97 67.43 7.93


Table 4.12: Classification accuracy of the R-ELM classifier using Case 1 and Case 2 MFDMap with the original map size. Case 1 MFDMap Case 2 MFDMap Number of (Map Size = 172) (Map Size = 145) Hidden Standard Standard Neurons Accuracy (%) Accuracy (%) Deviation (%) Deviation (%) 50 45.69 8.05 45.16 7.90 100 52.87 8.09 52.91 7.84 500 67.92 8.13 68.43 8.08 1000 73.35 7.75 72.21 8.12 1500 74.43 7.98 74.47 8.16 2000 74.91 8.04 75.61 7.85 2500 76.94 8.07 75.24 7.87 3000 76.73 7.87 75.34 7.85 4000 76.13 7.90 76.63 8.00 5000 76.13 8.08 77.16 7.94 8000 75.83 7.99 76.84 8.13 10000 74.74 8.06 75.75 7.85

Table 4.13: Classification accuracy of the R-ELM classifier using Case 1 and Case 2 MFDMap with x = 3. Case 1 MFDMap Case 2 MFDMap Number of (Map Size = 121) (Map Size = 102) Hidden Standard Standard Neurons Accuracy (%) Accuracy (%) Deviation (%) Deviation (%) 50 46.84 7.83 48.05 7.97 100 53.29 8.02 52.80 8.18 500 70.26 8.12 70.38 8.09 1000 73.32 7.92 72.72 7.99 1500 74.53 7.87 74.26 8.13 2000 75.15 7.99 75.73 7.90 2500 76.22 7.84 76.33 7.97 3000 77.04 8.00 76.25 8.14 4000 75.92 8.08 77.65 8.15 5000 75.53 8.02 76.05 8.07 8000 75.24 7.81 76.03 8.05 10000 74.74 7.95 74.85 7.94


Table 4.14: Classification accuracy of the R-ELM classifier using Case 1 and Case 2 MFDMap with x = 5. Case 1 MFDMap Case 2 MFDMap Number of (Map Size = 101) (Map Size = 88) Hidden Standard Standard Neurons Accuracy (%) Accuracy (%) Deviation (%) Deviation (%) 50 46.24 8.07 47.36 7.97 100 54.25 7.92 52.03 8.05 500 68.15 8.06 69.90 7.97 1000 73.60 8.03 73.15 8.04 1500 75.72 8.11 74.98 7.99 2000 75.19 7.96 75.83 8.00 2500 76.73 8.06 74.64 8.05 3000 75.33 8.08 76.95 8.14 4000 75.45 7.91 77.35 8.05 5000 77.04 8.02 77.96 8.16 8000 76.54 8.16 77.65 7.80 10000 73.73 8.01 75.75 7.96

Table 4.15: Classification accuracy of the R-ELM classifier using Case 1 and Case 2 MFDMap with x = 10. Case 1 MFDMap Case 2 MFDMap Number of (Map Size = 81) (Map Size = 73) Hidden Standard Standard Neurons Accuracy (%) Accuracy (%) Deviation (%) Deviation (%) 50 45.91 8.02 46.39 8.24 100 54.40 7.92 51.88 7.95 500 69.53 7.87 69.72 8.06 1000 73.20 8.06 73.57 7.90 1500 74.88 8.06 75.63 8.13 2000 76.18 8.03 75.64 7.90 2500 76.33 8.04 76.43 8.02 3000 76.73 7.91 77.13 7.94 4000 75.85 7.95 77.95 8.05 5000 75.83 7.99 77.45 8.00 8000 75.12 7.93 76.84 8.00 10000 74.74 8.03 73.73 8.29


Table 4.16: Classification accuracy of the R-ELM classifier using Case 1 and Case 2 MFDMap with x = 15. Case 1 MFDMap Case 2 MFDMap Number of (Map Size = 71) (Map Size = 63) Hidden Standard Standard Neurons Accuracy (%) Accuracy (%) Deviation (%) Deviation (%) 50 48.18 7.94 46.62 7.77 100 54.05 8.00 52.75 7.84 500 70.20 8.27 69.81 8.04 1000 72.96 7.89 73.93 7.93 1500 74.43 8.06 75.28 7.97 2000 74.30 7.89 75.97 7.85 2500 74.93 8.10 76.65 7.91 3000 75.33 8.03 77.04 7.96 4000 76.34 8.07 77.34 7.93 5000 76.72 7.97 78.64 7.91 8000 75.54 8.02 76.95 7.96 10000 74.74 8.15 76.35 7.91

Table 4.17: Classification accuracy of the R-ELM classifier using Case 1 and Case 2 MFDMap with x = 20. Case 1 MFDMap Case 2 MFDMap Number of (Map Size = 63) (Map Size = 58) Hidden Standard Standard Neurons Accuracy (%) Accuracy (%) Deviation (%) Deviation (%) 50 47.03 8.13 47.68 7.94 100 51.93 7.90 51.18 7.82 500 68.45 8.18 70.89 7.96 1000 73.26 7.96 73.42 8.09 1500 73.59 7.85 74.94 8.07 2000 74.29 7.94 76.39 8.23 2500 74.73 8.09 76.14 8.02 3000 75.53 8.11 76.34 7.78 4000 75.72 8.02 77.04 8.17 5000 75.72 8.03 77.14 8.13 8000 75.73 8.06 76.84 7.94 10000 72.72 7.85 76.14 8.02


Table 4.18: Classification accuracy of the R-ELM classifier using Case 1 and Case 2 MFDMap with x = 30. Case 1 MFDMap Case 2 MFDMap Number of (Map Size = 55) (Map Size = 49) Hidden Standard Standard Neurons Accuracy (%) Accuracy (%) Deviation (%) Deviation (%) 50 46.41 8.08 45.07 7.99 100 53.21 8.04 52.73 7.97 500 68.11 8.07 67.40 8.12 1000 72.73 7.99 74.67 8.06 1500 73.83 8.00 74.62 8.05 2000 74.37 8.01 74.98 7.86 2500 76.82 7.92 75.73 7.90 3000 75.42 8.14 76.13 8.02 4000 75.64 8.00 76.95 7.97 5000 73.81 8.07 77.55 8.13 8000 73.73 7.91 76.04 8.13 10000 73.73 7.89 74.94 7.94

Table 4.19: Classification accuracy of the R-ELM classifier using Case 1 and Case 2 MFDMap with x = 40. Case 1 MFDMap Case 2 MFDMap Number of (Map Size = 47) (Map Size = 44) Hidden Standard Standard Neurons Accuracy (%) Accuracy (%) Deviation (%) Deviation (%) 50 45.31 8.09 47.58 8.18 100 52.34 8.14 49.15 8.02 500 68.02 8.03 69.08 8.27 1000 72.15 8.16 71.98 7.97 1500 72.81 8.11 73.41 8.06 2000 74.50 8.02 75.76 8.16 2500 74.83 8.09 74.94 8.27 3000 75.43 8.02 75.83 8.03 4000 75.63 7.96 76.43 7.92 5000 75.43 8.04 75.83 8.08 8000 75.05 8.00 75.83 7.87 10000 73.73 7.96 74.93 7.97


Table 4.20: Classification accuracy of the R-ELM classifier using Case 1 and Case 2 MFDMap with x = 50. Case 1 MFDMap Case 2 MFDMap Number of (Map Size = 40) (Map Size = 37) Hidden Standard Standard Neurons Accuracy (%) Accuracy (%) Deviation (%) Deviation (%) 50 45.44 8.03 45.33 8.13 100 52.38 8.15 51.56 8.21 500 67.72 8.14 68.24 8.01 1000 71.73 8.00 71.90 8.02 1500 73.23 8.09 73.05 7.94 2000 73.52 7.97 73.77 8.16 2500 73.60 8.07 73.93 7.99 3000 73.73 8.19 75.74 7.95 4000 74.12 8.02 74.11 8.22 5000 75.31 7.97 76.55 8.08 8000 73.21 7.95 73.64 8.07 10000 72.72 8.07 72.62 7.90

Table 4.21: Confusion matrix for Case 1 MFDMap with map size 71 (x = 15) at 8000 hidden neurons, using the ELM classifier.

                      Class 1     Class 2     Class 3     Class 4       Class 5
                      (Dongbei)   (Shanxi)    (Sichuan)   (Guangdong)   (Jiangsu)
Class 1 (Dongbei)       5.80        0.34        0.10        0.06          0.70
Class 2 (Shanxi)        0.44        5.88        0.24        0.42          0.62
Class 3 (Sichuan)       0.36        0.36        2.26        0.38          0.84
Class 4 (Guangdong)     0.12        0.14        0.76        4.22          0.76
Class 5 (Jiangsu)       1.12        0.36        0.22        0.46          6.34


Table 4.22: Confusion matrix for Case 2 MFDMap with map size 63 (x = 15) at 8000 hidden neurons, using the ELM classifier.

                      Class 1     Class 2     Class 3     Class 4       Class 5
                      (Dongbei)   (Shanxi)    (Sichuan)   (Guangdong)   (Jiangsu)
Class 1 (Dongbei)       5.78        0.32        0.10        0.10          0.70
Class 2 (Shanxi)        0.58        6.02        0.26        0.30          0.44
Class 3 (Sichuan)       0.08        0.40        2.26        0.38          1.08
Class 4 (Guangdong)     0.06        0.08        0.88        4.12          0.86
Class 5 (Jiangsu)       1.04        0.18        0.40        0.44          6.44

Table 4.23: Confusion matrix for Case 1 MFDMap with map size 121 (x = 3) at 3000 hidden neurons, using the R-ELM classifier.

                      Class 1     Class 2     Class 3     Class 4       Class 5
                      (Dongbei)   (Shanxi)    (Sichuan)   (Guangdong)   (Jiangsu)
Class 1 (Dongbei)       6.00        0.23        0.13        0.10          0.53
Class 2 (Shanxi)        0.40        6.27        0.07        0.20          0.67
Class 3 (Sichuan)       0.43        0.30        2.07        0.57          0.83
Class 4 (Guangdong)     0.23        0.07        0.63        4.50          0.57
Class 5 (Jiangsu)       0.90        0.17        0.06        0.53          6.83


Table 4.24: Confusion matrix for Case 2 MFDMap with map size 63 (x = 15) at 5000 hidden neurons, using the R-ELM classifier.

                      Class 1     Class 2     Class 3     Class 4       Class 5
                      (Dongbei)   (Shanxi)    (Sichuan)   (Guangdong)   (Jiangsu)
Class 1 (Dongbei)       6.07        0.20        0.10        0.17          0.47
Class 2 (Shanxi)        0.20        6.47        0.27        0.17          0.50
Class 3 (Sichuan)       0.13        0.50        2.33        0.33          0.90
Class 4 (Guangdong)     0.07        0           0.70        4.70          0.53
Class 5 (Jiangsu)       1.00        0.13        0.10        0.63          6.63

4.6 Discussion

The results of the experiments using the extreme learning machine and the regularized extreme learning machine as classifiers for Han Chinese folk song classification were presented in the previous section. Overall, the best classification accuracy is 73.90% for the ELM classifier and 78.64% for the R-ELM classifier. The confusion matrices in Table 4.21 to Table 4.24 show that Sichuan folk songs are the most difficult class to classify. It seems that the current music encoding technique might not be sophisticated enough to clearly distinguish Sichuan folk songs from the other classes of folk songs.

The experiments in this chapter are designed to investigate the influence of the following factors on the classification performance: (1) size of the neural network hidden layer (i.e. number of hidden neurons), (2) size of the MFDMaps and (3) the effect of the rests in folk song classification task.


Size of the Hidden Layer

The size of a hidden layer is the number of hidden neurons employed in that layer. The role of the hidden neurons is to allow the neural network to extract high-order signal features from the inputs. The hidden layer serves as a pre-processor that projects the high-dimensional input (feature) space into a simpler (abstract) feature space; patterns represented in this space are more easily separated by the network output layer. The size of the hidden layer can therefore influence the performance of the neural network. In classification tasks, a larger hidden layer (a greater number of hidden neurons) allows the features to be projected more sparsely, so they can be more easily separated and a better mapping of the feature patterns can be achieved.

It can be clearly seen from the results in Table 4.3 to Table 4.20 that the classification accuracy improves as the number of hidden neurons increases, and that the accuracy begins to deteriorate once the best accuracy is achieved. This is where the saturation point of the neural network is observed.

In most experiments using the ELM classifier, the best classification accuracy is achieved at 8000 hidden neurons, except in Table 4.3, Table 4.4, Table 4.10 and Table 4.11, where the best accuracy is achieved at 5000 hidden neurons. As the hidden layer size increases from 50 to 8000 hidden neurons, the classifier accuracy improves from around 50% to around 73%.

The saturation point of the neural network for Case 1 MFDMaps using the R-ELM classifier varies within the range of 2500 to 5000 hidden neurons. For Case 2 MFDMaps, the best classification accuracy is achieved at 5000 hidden neurons, except in Table 4.13, Table 4.15 and Table 4.19, where the saturation point is at 4000 hidden neurons. The classification accuracies using the R-ELM classifier start at around 45% and improve to around 78%.


Size of the MFDMaps

As discussed before, although it is useful in a classification task to have a greater number of features in order to better represent the characteristics of each of the different classes, a large feature vector does not always guarantee good classification results. Although the input vectors (MFDMaps) in the experiments are not of particularly high dimension, it is useful to examine the classifier performance with various designs of the input vectors.

The original size of the MFDMap (also, the length of the input vector) is 172 for Case 1 MFDMap and 145 for Case 2 MFDMap. A method of dimensionality reduction based on the significance level of features is employed to reduce the size of the MFDMap. In Table 4.3 to Table 4.11 and Table 4.12 to Table 4.20, the size of the MFDMap is reduced from 172 to 40 for Case 1 MFDMap and from 145 to 37 for Case 2 MFDMap. Overall, the difference in the classification accuracy is not significant.

For experiments using the ELM classifier, the performance shows slight deterioration as the MFDMap size is reduced. For experiments using the R-ELM classifier, a similar pattern is observed for Case 1 MFDMaps. For Case 2 MFDMaps, the classification accuracy improves as the MFDMap size is reduced down to the significance value x = 15; beyond that point, the classification accuracy shows slight deterioration. Nonetheless, the changes in classification accuracy are not significant, especially if the standard deviations are taken into consideration.

Effect of the Musical Rests

As rests are included as part of the features, all Case 1 MFDMaps are larger than their Case 2 equivalents. The classification results in Table 4.3 to Table 4.20 show that, in most cases, the classification accuracy is slightly higher when rests are excluded from the MFDMap. Again, however, these differences are not significant if the standard deviations are taken into consideration.


4.7 Conclusion

In this chapter, the extreme learning machine algorithm and the regularized extreme learning machine were employed to train the single-hidden layer feedforward neural network classifier for Han Chinese folk song classification. There is no previous example of this technique being employed in folk song related research.

Overall, a classification accuracy of 73.90% is achieved for Han Chinese folk song classification using the extreme learning machine and 78.64% using the regularized extreme learning machine. These results are fairly good compared with previous research [51].

The confusion matrices in Section 4.5 show that Sichuan folk songs are the most difficult of the five classes to distinguish. This is because this particular class of folk songs possesses characteristics that are fairly close to those of the other classes, and the current music encoding method is not sophisticated enough to distinguish the differences.

A smaller MFDMap neither significantly improves the classification accuracy nor causes significant deterioration in the classification results. However, a smaller MFDMap allows some saving in training time and resources.

The author believes that the classification accuracy can be further improved if a more robust classifier, such as the one described in the next chapter, is employed for the classification task.




Chapter 5

The Finite Impulse Response Extreme Learning Machine Folk Song Classifier

In the previous chapter, the extreme learning machine and the regularized extreme learning machine were employed as classifiers for geographically based Han Chinese folk song classification. In this chapter, an improved variant of the extreme learning machine, called the finite impulse response extreme learning machine (FIR-ELM), is employed as the music classifier to study the effect of the new algorithm on Han Chinese folk song classification. The chapter begins with an overview of the FIR-ELM algorithm and proceeds to the design and setting of the experiments that verify the performance of the FIR-ELM on multi-class classification of Han Chinese folk songs. The chapter concludes with a discussion of the experimental results.

5.1 Introduction

As discussed in the previous chapter, the extreme learning machine algorithm greatly improves on the learning speed of conventional gradient descent-based learning algorithms for feedforward neural networks and yet manages to achieve good generalization performance. The ELM algorithm, designed for single-hidden layer feedforward neural networks, randomly assigns the input weights and hidden layer biases, after which the SLFN is simply treated as a linear network. The output weights of the SLFN are then analytically computed through a generalized inverse operation on the hidden layer output matrix. Such a technique eliminates the need to iteratively tune the network's parameters as in gradient descent-based algorithms.

Although the extreme learning machine greatly improves on the performance of conventional gradient descent-based algorithms, it still has drawbacks. The random assignment of the input weights and hidden layer biases results in a poor robustness property of the SLFN when the ELM algorithm is employed for signal processing with noisy data [19]. When the input weights and hidden layer biases are randomly assigned, input disturbances can cause large changes in the hidden layer output matrix, which subsequently result in large changes in the output weight matrix of the SLFN.

Two modified ELM algorithms were proposed in [34] and [111] to improve the robustness of the ELM algorithm in [33]. In these variants of the ELM, the cost function consists of the sum of the weighted error squares and the sum of the weighted output weight squares. This modification balances and reduces the structural and empirical risks through the optimization of the cost function in the output weight space and a proper choice of the weights of the error squares. However, the structural and empirical risks are not significantly reduced, and the robustness of the SLFN is not significantly improved, because the input weights and the hidden layer biases are still randomly assigned.

The statistical learning theory [112-118] shows that significant changes in the output weight matrix will largely increase both the structural risk and empirical risk of the SLFNs. Therefore, in order for the empirical risk and the structural risk to be significantly reduced, and for the robustness property of the network with respect to input disturbances to be improved, the input weights of the SLFNs need to be properly chosen instead of arbitrarily assigned.


5.2 Finite Impulse Response Extreme Learning Machine

The finite impulse response extreme learning machine is an improved variant of the ELM algorithm in which robustness is introduced through careful design of the input and output weights. The FIR-ELM learning algorithm is also designed for single-hidden layer feedforward neural networks. However, instead of having non-linear nodes in the hidden layer of the SLFN (such as those in the ELM), the hidden layer is designed with linear nodes and an input tapped-delay-line memory for signal processing purposes. The output layer contains linear nodes.

Since the output of each hidden neuron in the SLFN is the sum of the weighted input data, this design of the hidden layer (linear nodes with an input tapped-delay-line memory) enables each hidden neuron to be treated as an FIR filter. By using FIR filter design techniques from signal processing [119-122], these hidden neurons can be designed as a group of low-pass, high-pass, band-pass, band-stop or any other desired types of filters. This allows the hidden layer to act as a pre-processor that removes input disturbances and undesired frequency components from the input data. In addition, the structural and empirical risks of the SLFN can be greatly reduced in terms of the output of the SLFN.

In designing the output weight matrix for the SLFN, an objective function that includes both the weighted sum of the output error squares and the weighted sum of the output weight squares is chosen [26,34,111,123-125]. This objective function is then minimized in the output weight space. The use of the FIR filter design technique for the input weights, together with this objective function for the output weights, allows the structural and empirical risks to be reduced and balanced for signal processing purposes.

The network structure of a single-hidden layer feedforward neural network with linear hidden nodes and an input tapped-delay-line memory is shown in Figure 5.1. In the figure, D denotes a time-delay element; n – 1 such elements are added to the input layer to form the tapped-delay-line memory. The input sequence x(k), x(k – 1),…, x(k – n + 1) represents a time series consisting of the present observation x(k) and the past n – 1 observations of the process. The hidden layer consists of Ñ linear neurons and the output layer has m linear neurons.

Figure 5.1: A single-hidden layer neural network with linear neurons and time-delay elements.

The input data vector x(k) and the output data vector O(k) of the SLFN in Figure 5.1 can be expressed as follows:

$$\mathbf{x}(k) = \left[\, x(k) \;\; x(k-1) \;\; \cdots \;\; x(k-n+1) \,\right]^T \qquad (5.1)$$

$$\mathbf{o}(k) = \left[\, o_1(k) \;\; o_2(k) \;\; \cdots \;\; o_m(k) \,\right]^T. \qquad (5.2)$$

Then, the output of the ith hidden neuron is computed as

$$y_i(k) = \sum_{j=1}^{n} w_{ij}\, x(k-j+1) = \mathbf{w}_i^T \mathbf{x}(k) \qquad (5.3)$$

for $i = 1,2,\ldots,\tilde{N}$ and

$$\mathbf{w}_i = \left[\, w_{i1} \;\; w_{i2} \;\; \cdots \;\; w_{in} \,\right]^T \qquad (5.4)$$


for i = 1,2,…,Ñ. The ith output of the neural network, oi(k), is of the form

$$o_i(k) = \sum_{p=1}^{\tilde{N}} \beta_{pi}\, \mathbf{w}_p^T \mathbf{x}(k) \qquad (5.5)$$

for $i = 1,2,\ldots,m$. Thus, the output data vector of the SLFN, $\mathbf{o}(k)$, can be written as

$$\mathbf{o}(k) = \sum_{p=1}^{\tilde{N}} \boldsymbol{\beta}_p\, \mathbf{w}_p^T \mathbf{x}(k) \qquad (5.6)$$

where

$$\boldsymbol{\beta}_i = \left[\, \beta_{i1} \;\; \beta_{i2} \;\; \cdots \;\; \beta_{im} \,\right]^T \qquad (5.7)$$

for $i = 1,2,\ldots,\tilde{N}$.

The N distinct sample data vector pairs $(\mathbf{x}_i, \mathbf{t}_i)$ used to train the SLFN in Figure 5.1 can be expressed as

$$\mathbf{x}_i = \left[\, x_{i1} \;\; x_{i2} \;\; \cdots \;\; x_{in} \,\right]^T \qquad \text{and} \qquad (5.8)$$

$$\mathbf{t}_i = \left[\, t_{i1} \;\; t_{i2} \;\; \cdots \;\; t_{im} \,\right]^T \qquad (5.9)$$

for i = 1,2,…,N where xi is the desired input data vector and ti is the desired target data vector. Hence, for the ith input data vector xi, the corresponding neural network output vector oi can be expressed as

$$\mathbf{o}_i = \sum_{p=1}^{\tilde{N}} \boldsymbol{\beta}_p\, \mathbf{w}_p^T \mathbf{x}_i \qquad (5.10)$$

for $i = 1,2,\ldots,N$, and all N equations can then be written in matrix form:

$$\mathbf{H}\boldsymbol{\beta} = \mathbf{O} \qquad (5.11)$$

where


$$\mathbf{H} = \begin{bmatrix} h_{11} & h_{12} & \cdots & h_{1\tilde{N}} \\ h_{21} & h_{22} & \cdots & h_{2\tilde{N}} \\ \vdots & \vdots & \ddots & \vdots \\ h_{N1} & h_{N2} & \cdots & h_{N\tilde{N}} \end{bmatrix} = \begin{bmatrix} \mathbf{w}_1^T\mathbf{x}_1 & \cdots & \mathbf{w}_{\tilde{N}}^T\mathbf{x}_1 \\ \vdots & \ddots & \vdots \\ \mathbf{w}_1^T\mathbf{x}_N & \cdots & \mathbf{w}_{\tilde{N}}^T\mathbf{x}_N \end{bmatrix}, \qquad (5.12)$$

$$\boldsymbol{\beta} = \begin{bmatrix} \boldsymbol{\beta}_1^T \\ \boldsymbol{\beta}_2^T \\ \vdots \\ \boldsymbol{\beta}_{\tilde{N}}^T \end{bmatrix}_{\tilde{N} \times m} \qquad \text{and} \qquad (5.13)$$

$$\mathbf{O} = \begin{bmatrix} \mathbf{o}_1^T \\ \mathbf{o}_2^T \\ \vdots \\ \mathbf{o}_N^T \end{bmatrix}_{N \times m}. \qquad (5.14)$$

H is known as the hidden layer output matrix of the neural network [33], where the ith column of H contains the outputs of the ith hidden neuron with respect to the input vectors x1, x2,…, xN.

As discussed above, the robustness of the SLFN can be improved through a proper choice of its input weights. This can be achieved by designing the hidden layer as a group of FIR filters that remove input disturbances and undesired frequency components from the input data. It can be seen that (5.3) has the typical structure of an FIR filter [119-122]: the input weight set {wij} can be treated as the set of filter coefficients (the impulse response coefficients of the filter), the output yi(k) is the convolution sum of the filter impulse response and the input data, and the length of the filter is the number of inputs of the SLFN. Since it is possible to obtain a priori knowledge of the frequency responses from the training data sets of the neural network, the weights wij for the ith hidden neuron can be designed as follows:

$$w_{i1} = \hat{h}_{id}[0], \quad w_{i2} = \hat{h}_{id}[1], \quad \ldots, \quad w_{in} = \hat{h}_{id}[n-1] \qquad (5.15)$$

where


$$\hat{h}_{id}[k] = \frac{1}{2\pi}\int_{-\omega_c}^{\omega_c} e^{-j\omega(n-1)/2}\, e^{j\omega k}\, d\omega = \frac{\sin\!\left[\omega_c\left(k - (n-1)/2\right)\right]}{\pi\left(k - (n-1)/2\right)} \qquad (5.16)$$

for $k = 0, 1, \ldots, n-1$. In (5.16), $\hat{h}_{id}[k]$ is the impulse response of a truncated low-pass filter for the ith hidden neuron and ωc is the cutoff frequency of the low-pass filter. It is to be noted that similar design methods from signal processing [119-122] can be used to design the hidden neurons as high-pass, band-pass, band-stop or any other type of filter, for the purpose of pre-processing the input data to remove input disturbances and undesired frequency components.
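A sketch of the truncated low-pass design in (5.15)-(5.16) is given below. NumPy is assumed; np.sinc (which computes sin(πx)/(πx)) is used so that the centre point k = (n-1)/2 needs no special casing. The filter length and cutoff in the example are hypothetical.

```python
import numpy as np

def lowpass_fir_weights(n, cutoff):
    """Impulse response of a truncated ideal low-pass filter, cf. eq. (5.16):
    h[k] = sin(cutoff*(k-(n-1)/2)) / (pi*(k-(n-1)/2)), for k = 0,...,n-1."""
    m = np.arange(n) - (n - 1) / 2.0
    return (cutoff / np.pi) * np.sinc(cutoff * m / np.pi)

# Example: one hidden neuron of an SLFN with n inputs, designed as a low-pass FIR filter.
w_i = lowpass_fir_weights(n=64, cutoff=0.3 * np.pi)   # hypothetical length and cutoff
```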

The robust output weights of the SLFNs can be achieved by minimizing both the weighted sum of the output error squares and the weighted sum of the output weights squares of the SLFNs [26,34,111,123-125]:

$$\text{Minimize } \left\{ \frac{\gamma}{2}\|\boldsymbol{\varepsilon}\|^2 + \frac{d}{2}\|\boldsymbol{\beta}\|^2 \right\} \qquad (5.17)$$

subject to $\boldsymbol{\varepsilon} = \mathbf{O} - \mathbf{T} = \mathbf{H}\boldsymbol{\beta} - \mathbf{T} \qquad (5.18)$

where γ and d are constant balancing parameters for adjusting the balance between the empirical risk and the structural risk.

The above problem can be solved by using the method of Lagrange multipliers:

$$L = \frac{\gamma}{2}\sum_{i=1}^{N}\sum_{j=1}^{m}\varepsilon_{ij}^2 + \frac{d}{2}\sum_{i=1}^{\tilde{N}}\sum_{j=1}^{m}\beta_{ij}^2 - \sum_{k=1}^{N}\sum_{p=1}^{m}\lambda_{kp}\left(\mathbf{h}_k^T\boldsymbol{\beta}_p - T_{kp} - \varepsilon_{kp}\right) \qquad (5.19)$$

where εij is the ijth element of the error matrix ε, βij is the ijth element of the output weight matrix β, Tij is the ijth element of the output data matrix T, hk is the kth row of the hidden layer output matrix H (written as a column vector), βj is the jth column of the output weight matrix β, λij is the ijth Lagrange multiplier, and γ and d are constant parameters used to adjust the balance between the empirical risk and the structural risk.

Differentiate L in (5.19) with respect to βij to obtain


$$\frac{\partial L}{\partial \beta_{ij}} = d\beta_{ij} - \frac{\partial}{\partial \beta_{ij}}\left(\sum_{k=1}^{N}\sum_{p=1}^{\tilde{N}}\lambda_{kj}\,\beta_{pj}\,h_{kp}\right) = d\beta_{ij} - \sum_{k=1}^{N}\lambda_{kj}\,h_{ki} = d\beta_{ij} - \left(\lambda_{1j}h_{1i} + \lambda_{2j}h_{2i} + \cdots + \lambda_{Nj}h_{Ni}\right).$$ (5.20)

Then, setting $\partial L / \partial \beta_{ij} = 0$ and using (5.20), the following is obtained:

$$d\beta_{ij} = \lambda_{1j}h_{1i} + \lambda_{2j}h_{2i} + \cdots + \lambda_{Nj}h_{Ni} = \begin{bmatrix} h_{1i} & h_{2i} & \cdots & h_{Ni} \end{bmatrix}\begin{bmatrix} \lambda_{1j} \\ \lambda_{2j} \\ \vdots \\ \lambda_{Nj} \end{bmatrix}$$ (5.21)

and

$$d\begin{bmatrix} \beta_{i1} & \beta_{i2} & \cdots & \beta_{im} \end{bmatrix} = \begin{bmatrix} h_{1i} & h_{2i} & \cdots & h_{Ni} \end{bmatrix}\begin{bmatrix} \lambda_{11} & \lambda_{12} & \cdots & \lambda_{1m} \\ \lambda_{21} & \lambda_{22} & \cdots & \lambda_{2m} \\ \vdots & \vdots & & \vdots \\ \lambda_{N1} & \lambda_{N2} & \cdots & \lambda_{Nm} \end{bmatrix}.$$ (5.22)

Thus,

$d\boldsymbol{\beta} = \mathbf{H}^T\boldsymbol{\lambda}.$ (5.23)

Then, differentiate L with respect to ɛij to obtain

$$\frac{\partial L}{\partial \varepsilon_{ij}} = \gamma\varepsilon_{ij} + \lambda_{ij}.$$ (5.24)

Similarly, setting $\partial L / \partial \varepsilon_{ij} = 0$, the following relationship can be obtained:

$\boldsymbol{\lambda} = -\gamma\boldsymbol{\varepsilon}.$ (5.25)

Finally, by considering the constraint in (5.18), (5.25) is then expressed as


$\boldsymbol{\lambda} = -\gamma\left(\mathbf{H}\boldsymbol{\beta} - \mathbf{T}\right)$ (5.26)

and using (5.26) in (5.23) leads to

$d\boldsymbol{\beta} = -\gamma\,\mathbf{H}^T\left(\mathbf{H}\boldsymbol{\beta} - \mathbf{T}\right).$ (5.27)

Hence, the output weight matrix β of the SLFN can be obtained as follows:

$$\boldsymbol{\beta} = \left(\frac{d}{\gamma}\,\mathbf{I} + \mathbf{H}^T\mathbf{H}\right)^{-1}\mathbf{H}^T\mathbf{T}.$$ (5.28)
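A minimal numerical sketch of (5.28) is given below; the variable names are illustrative and a linear solve is used rather than an explicit matrix inverse.

```python
import numpy as np

def output_weights(H, T, d_over_gamma):
    """Regularized output weight solution of (5.28):
    beta = (d/gamma * I + H^T H)^(-1) H^T T.

    H : (N, N_tilde) hidden layer output matrix
    T : (N, m)       target matrix (1-of-c coded)
    """
    n_hidden = H.shape[1]
    A = d_over_gamma * np.eye(n_hidden) + H.T @ H
    return np.linalg.solve(A, H.T @ T)  # solve instead of explicitly inverting A
```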

5.3 Experiment Design and Setting

The FIR-ELM algorithm is employed on a single-hidden layer feedforward neural network for a multi-class classification task. The experimental designs and settings in this chapter are similar to those in Chapter 4 but are not exactly the same. Therefore, for clarity and convenience, all details are given in this section rather than by reference to Chapter 4.

This section begins with the explanations of the pre-processing phase where folk songs are encoded into feature vectors that are ready to be fed to the machine classifier for classification. Next, the post-processing phase where the classifier output is processed to obtain the classification results is outlined. Finally, the section concludes with a discussion on the parameter settings and network structure of the neural classifier.


5.3.1 Data Pre-Processing and Post-Processing

Data Pre-Processing

The inputs to a neural classifier must be in a form to which further mathematical operations can be applied. The folk songs in the Essen Folksong Collection are encoded into respective feature vectors using a novel encoding technique called the musical feature density map (MFDMap). The details of this encoding technique are presented in Chapter 3. There are two variants. Case 1 MFDMaps are designed such that all musical notes and rests are included in the final feature vectors. Case 2 MFDMaps completely omit the musical rests and use only sounded musical notes. Both designs of MFDMaps are used in the experiments.

Feature Selection and Dimensionality Reduction

The phenomenon of the curse-of-dimensionality [109-110] arises when the number of data samples is not large enough to represent each combination of values in a high-dimensional feature space, which leads to a poor representation of the mapping. The curse-of-dimensionality phenomenon typically leads to poor classification performance. Hence, it is important to ensure that such a phenomenon is prevented.

In order to investigate the effect of the curse-of-dimensionality, a set of MFDMaps of various sizes was designed from the two variants of MFDMaps (Case 1 and Case 2 MFDMaps); see Table 4.1 and Table 4.2. The criterion for selecting the features from the original MFDMaps to be included in the new reduced-size MFDMaps is based on the level of significance of each feature in each of the five classes of folk songs.
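The exact selection rule is defined in Chapter 3; the sketch below is one plausible reading of it, based on the later statement that a feature is kept when at least x% of the folk songs in some class possess it. The function and variable names are hypothetical.

```python
import numpy as np

def select_features(V, labels, x_percent):
    """Keep a feature if at least x_percent of the songs of some class use it.

    V        : (num_songs, num_features) MFDMap feature matrix
    labels   : (num_songs,) class label of each song
    x_percent: significance level x, e.g. 3, 5, 10, 15, 20, 30, 40 or 50
    Returns the indices of the retained features.
    """
    keep = np.zeros(V.shape[1], dtype=bool)
    for c in np.unique(labels):
        class_songs = V[labels == c]
        presence = (class_songs > 0).mean(axis=0) * 100.0  # % of songs in class c with the feature
        keep |= presence >= x_percent
    return np.flatnonzero(keep)
```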


Data Post-Processing

Similar to the experiments in Chapter 4, all experiments in this chapter employ the 1-of-c method to encode the targets for the neural classifiers. In the 1-of-c method, the number of target outputs is the same as the number of classes. The output with the highest activation is taken as the class label assigned by the neural classifier for a particular test sample. This method of post-processing the neural network outputs is known as the winner-takes-all method [22].
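A short sketch of the two steps described above, using hypothetical function names, is given below.

```python
import numpy as np

def one_of_c_targets(labels, num_classes):
    """1-of-c coding: one target output per class, '1' for the true class, '0' otherwise."""
    T = np.zeros((len(labels), num_classes))
    T[np.arange(len(labels)), labels] = 1.0
    return T

def winner_takes_all(outputs):
    """Assign each test sample to the class whose network output is largest."""
    return np.argmax(outputs, axis=1)
```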

5.3.2 Parameter Setting

The single-hidden layer feedforward neural network is employed as the folk song classifier. In this chapter, the neural network is trained using the finite impulse response extreme learning machine algorithm. It is to be noted that the SLFN structure has both linear hidden neurons and linear output neurons, i.e. no activation function is employed for the neurons in either the hidden or the output layer.

As in Chapter 4, the training and testing data sets for all experiments in this chapter are constructed from a total of 333 Han Chinese folk songs of five different classes. The division of the training and testing data in each of the five classes is 90% and 10% respectively. A 10-fold cross-validation method is employed to assess all 333 folk songs. Using this method, each experiment is repeated 10 times, employing a different set of testing data at each repetition.
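A minimal sketch of such a 10-fold split is shown below. For brevity it permutes all samples together, whereas the thesis splits 90%/10% within each class; a stratified split per class would follow the same pattern.

```python
import numpy as np

def ten_fold_splits(num_songs, seed=0):
    """Yield (train, test) index pairs; each fold serves once as the ~10% test set."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(num_songs), 10)
    for k in range(10):
        test_idx = folds[k]
        train_idx = np.hstack([folds[j] for j in range(10) if j != k])
        yield train_idx, test_idx
```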

In the experiments, four commonly used filters are employed: low-pass, high-pass, band-pass and band-stop filters. As described in Section 5.2, the cutoff frequencies used for the four filters are normalized cutoff frequencies. Various cutoff frequencies are used to investigate the effect of the filtered data on the classification results. The range of the normalized cutoff frequency ωc is set from 0.1 to 0.9, with a step size of 0.1, for both the low-pass and high-pass filters. For the band-pass and band-stop filters, a bandwidth of ±0.05 around each frequency in the same range (i.e. 0.1 to 0.9) is used. It is to be noted that the rectangular window method is employed for the design of the input weights.

In (5.28), there are two balancing parameters, d and γ, employed in the design of the robust network output weights. These parameters are used to balance and reduce the empirical risk and the structural risk of the neural network. In this chapter, a set of different values is used to investigate the effect of the ratio d/γ on classifier performance. The values assigned to the ratio d/γ are 0.001, 0.01, 0.1, 1, 10, 100 and 1000.
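Taken together with the previous paragraph, the experiment settings amount to a sweep over filter type, cutoff frequency and the ratio d/γ; a hypothetical sketch of that sweep is shown below.

```python
from itertools import product

# Hypothetical sweep over the settings quoted in this section.
filters = ["low-pass", "high-pass", "band-pass", "band-stop"]
cutoffs = [round(0.1 * i, 1) for i in range(1, 10)]   # 0.1, 0.2, ..., 0.9
ratios = [0.001, 0.01, 0.1, 1, 10, 100, 1000]         # d / gamma

for filt, wc, d_over_gamma in product(filters, cutoffs, ratios):
    pass  # design the input weights for (filt, wc), train the FIR-ELM, record the 10-fold accuracy
```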

Unlike the extreme learning machine classifiers discussed in the previous chapter, the FIR-ELM classifier has no random elements, i.e. no random assignment of weights or other network parameters. Hence, no repetition of the experiments is required (apart from the 10-fold cross-validation).

In all experiments, the number of hidden neurons employed in the hidden layer of the neural classifier starts from one hidden neuron and is gradually increased to a maximum of 10000 hidden neurons. The results of each experiment are shown in the next section.

5.4 Experiment Results

This section presents the results of the experiments designed to verify the performance of the finite impulse response extreme learning machine classifier for geographically based Han Chinese folk song classification. The details of the experiment settings are presented in the previous section.

Table 5.1 to Table 5.9 present the results for Case 1 MFDMaps using the original MFDMap and eight reduced-size MFDMaps. The results for Case 2 MFDMaps are presented in Table 5.10 to Table 5.18. The sequence of both sets of results is as follows: (1) original MFDMap, (2) reduced-size map with x = 3, (3) reduced-size map with x = 5, (4) reduced-size map with x = 10, (5) reduced-size map with x = 15, (6) reduced-size map with x = 20, (7) reduced-size map with x = 30, (8) reduced-size map with x = 40 and (9) reduced-size map with x = 50. Recall that x is the significance level required to include a particular feature in a MFDMap. In these tables, μ represents the mean classification accuracy across all 10 cross-validation data sets and σ represents the standard deviation of the mean.

Table 5.19 and Table 5.20 show the confusion matrices for the Case 1 and Case 2 MFDMaps at significance value x = 15 with 500 hidden neurons. It is to be noted that all classification results reported in this section are obtained using the four filters at cutoff frequency ωc = 0.6 and with d/γ = 0.001.

Table 5.21 presents a comparison of the classification accuracy between five machine classifiers: the resilient propagation (RPROP) [126] single-hidden layer feedforward neural network, the extreme learning machine (ELM) [33], the regularized extreme learning machine (R-ELM) [34], the finite impulse response extreme learning machine (FIR-ELM) and the support vector machine (SVM) [27]. In the table, the linear kernel is used in the SVM classifier as it performs the best among some of the popular kernels such as the Gaussian radial basis function kernel and the polynomial kernel.

Table 5.1: Classification accuracy using Case 1 MFDMap with the original map size

(map size = 172, ωc = 0.6, d/γ = 0.001).

Number of        Low-Pass        High-Pass       Band-Pass       Band-Stop
Hidden Neurons   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)
50               57.89   5.67    59.41   5.19    63.91   5.00    44.11   6.22
100              66.63   5.39    66.32   5.02    71.44   6.42    57.30   5.75
500              74.45   5.86    74.73   7.30    71.73   5.79    74.15   6.05
1000             74.43   5.55    73.51   6.49    70.23   6.96    72.96   6.60
1500             72.65   5.86    72.59   6.13    71.73   5.80    72.56   6.06
2000             71.44   5.77    72.59   5.51    71.72   6.20    72.56   5.48
2500             71.14   5.80    72.59   5.43    71.73   5.43    72.56   5.57
3000             71.14   5.75    72.59   6.27    71.72   5.69    71.76   5.92
4000             71.13   6.62    72.31   6.78    71.14   5.42    71.76   5.90
5000             70.82   6.31    72.30   5.15    70.84   6.20    71.02   5.57
8000             70.22   6.03    70.21   5.78    70.53   6.65    70.57   6.09
10000            69.89   5.27    69.78   5.96    69.99   5.70    69.81   6.63


Table 5.2: Classification accuracy using Case 1 MFDMap with x = 3

(map size = 121, ωc = 0.6, d/γ = 0.001).

Number of        Low-Pass        High-Pass       Band-Pass       Band-Stop
Hidden Neurons   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)
50               64.23   5.87    64.23   6.25    54.91   7.01    66.64   5.77
100              75.93   5.90    71.41   6.38    59.11   6.01    78.04   6.05
500              75.64   4.90    72.32   6.39    63.31   6.15    78.07   5.87
1000             75.34   5.61    72.30   5.26    66.02   5.53    77.75   5.91
1500             75.33   5.30    71.69   6.27    68.14   6.84    76.44   5.48
2000             74.72   5.81    71.69   5.95    68.75   6.06    75.15   5.84
2500             74.44   6.26    71.69   5.62    68.73   6.27    74.75   6.38
3000             74.14   6.76    71.69   5.65    68.14   5.52    74.44   6.87
4000             73.53   6.90    71.66   6.64    67.84   6.43    74.15   5.42
5000             73.52   5.94    71.30   5.60    67.84   6.19    73.86   7.19
8000             72.01   5.84    71.00   5.38    67.22   5.42    73.86   6.76
10000            71.56   6.41    70.50   6.11    66.85   6.02    72.94   6.08

Table 5.3: Classification accuracy using Case 1 MFDMap with x = 5

(map size = 101, ωc = 0.6, d/γ = 0.001).

Number of        Low-Pass        High-Pass       Band-Pass       Band-Stop
Hidden Neurons   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)
50               61.22   5.85    61.22   5.66    66.64   7.31    54.91   6.08
100              73.26   5.65    72.63   6.59    72.34   6.28    74.75   5.86
500              75.94   6.42    75.83   6.40    74.13   6.15    80.13   6.58
1000             75.64   5.65    73.93   6.14    72.91   5.61    75.06   5.43
1500             74.45   5.77    72.63   6.00    72.01   5.47    74.44   6.34
2000             74.44   6.44    72.32   6.18    71.71   5.12    74.15   5.67
2500             74.42   6.22    72.22   7.76    71.61   5.79    74.00   5.80
3000             74.42   6.45    72.22   5.94    71.55   5.47    73.84   5.66
4000             74.41   6.25    72.21   5.22    71.31   6.32    73.04   6.29
5000             74.40   5.80    72.30   6.96    71.11   5.84    73.04   5.61
8000             72.91   5.74    72.00   6.30    70.58   6.88    72.73   5.47
10000            71.56   6.40    71.50   5.68    70.02   6.76    71.93   6.28


Table 5.4: Classification accuracy using Case 1 MFDMap with x = 10

(map size = 81, ωc = 0.6, d/γ = 0.001).

Number of        Low-Pass        High-Pass       Band-Pass       Band-Stop
Hidden Neurons   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)
50               65.74   5.79    69.32   6.05    66.60   5.17    66.02   5.62
100              76.52   6.18    77.44   5.58    70.84   6.97    78.04   5.44
500              75.63   5.82    76.83   5.82    72.62   5.46    80.45   6.04
1000             74.73   6.13    74.13   5.91    72.62   6.11    78.05   7.05
1500             74.73   4.72    73.82   5.76    71.72   6.55    77.17   5.64
2000             73.82   6.23    73.82   6.42    71.42   6.07    75.05   5.86
2500             73.51   6.93    73.82   7.27    71.42   7.15    74.94   6.58
3000             73.20   6.52    73.73   5.34    71.13   7.38    74.64   6.61
4000             72.91   6.46    73.53   6.06    70.55   6.07    74.04   6.24
5000             72.58   5.88    72.32   5.28    70.23   5.05    73.73   6.51
8000             71.68   6.09    72.01   6.65    69.64   5.82    72.93   6.44
10000            70.59   6.12    71.55   6.70    69.04   5.58    71.69   5.81

Table 5.5: Classification accuracy using Case 1 MFDMap with x = 15

(map size = 71, ωc = 0.6, d/γ = 0.001).

Number of        Low-Pass        High-Pass       Band-Pass       Band-Stop
Hidden Neurons   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)
50               68.73   6.21    69.63   5.61    73.52   6.26    67.51   5.52
100              76.52   5.85    75.63   6.39    73.24   6.43    79.22   5.18
500              75.62   5.55    75.37   6.70    72.44   6.67    81.03   6.38
1000             74.43   6.32    73.24   5.73    71.97   4.75    80.45   6.60
1500             74.71   6.03    73.54   6.96    71.45   5.92    78.33   6.82
2000             74.02   5.91    73.34   5.91    70.85   6.18    75.63   5.23
2500             74.02   6.15    73.24   5.88    70.25   6.36    74.64   5.33
3000             73.43   6.49    72.63   5.55    70.25   5.35    74.43   5.26
4000             73.12   6.20    72.63   5.60    69.74   5.50    74.03   5.98
5000             72.93   6.10    72.33   5.52    69.35   6.40    73.63   5.69
8000             71.63   6.14    72.02   6.18    69.32   5.94    72.72   6.66
10000            71.03   6.03    71.51   6.80    68.87   6.28    72.11   5.27


Table 5.6: Classification accuracy using Case 1 MFDMap with x = 20

(map size = 63, ωc = 0.6, d/γ = 0.001).

Number of        Low-Pass        High-Pass       Band-Pass       Band-Stop
Hidden Neurons   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)
50               70.22   5.13    72.62   5.85    72.63   6.37    73.50   5.85
100              74.39   6.10    78.65   6.91    74.40   5.59    75.02   6.23
500              76.82   6.60    72.91   6.46    74.12   6.29    78.32   5.86
1000             75.93   5.60    72.92   5.97    73.83   6.14    75.64   6.22
1500             73.83   5.37    73.82   6.65    73.23   6.57    75.03   5.93
2000             74.43   5.93    74.43   5.48    73.54   5.79    74.72   5.99
2500             73.53   5.18    74.43   5.83    73.52   6.32    74.72   6.23
3000             73.51   6.01    74.43   6.71    73.41   6.40    74.72   6.68
4000             72.93   6.41    74.14   6.75    72.94   5.55    74.13   6.23
5000             72.65   6.11    74.13   6.37    72.94   6.08    74.13   6.82
8000             72.34   5.05    73.83   6.25    72.34   6.80    73.53   4.99
10000            72.00   5.73    73.65   5.71    72.13   6.06    73.04   5.78

Table 5.7: Classification accuracy using Case 1 MFDMap with x = 30

(map size = 55, ωc = 0.6, d/γ = 0.001).

Number of        Low-Pass        High-Pass       Band-Pass       Band-Stop
Hidden Neurons   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)
50               70.80   6.12    69.95   7.18    76.25   5.68    70.83   4.86
100              72.91   5.58    74.47   5.76    73.55   5.98    73.50   5.18
500              73.52   5.36    74.72   6.32    72.63   7.34    77.12   6.21
1000             74.13   6.31    73.22   5.48    74.11   5.43    76.32   5.67
1500             74.11   6.31    73.52   6.67    73.81   6.28    75.84   5.85
2000             72.92   6.14    73.41   5.52    73.51   5.46    73.82   5.25
2500             72.92   6.20    73.30   6.10    73.51   6.52    73.85   5.55
3000             72.82   5.56    73.30   5.69    73.21   6.16    73.82   5.80
4000             72.51   5.75    73.21   6.26    73.12   6.33    73.53   5.64
5000             72.31   5.95    73.12   6.01    72.82   5.86    73.21   5.57
8000             71.92   5.66    72.81   5.98    72.41   6.12    72.91   5.79
10000            71.69   6.17    72.17   7.47    71.54   6.74    72.75   5.53


Table 5.8: Classification accuracy using Case 1 MFDMap with x = 40

(map size = 47, ωc = 0.6, d/γ = 0.001).

Number of        Low-Pass        High-Pass       Band-Pass       Band-Stop
Hidden Neurons   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)
50               68.71   6.67    75.04   5.71    72.34   6.39    72.02   5.95
100              72.94   5.51    75.33   5.08    75.92   6.19    76.23   5.87
500              75.94   6.91    76.53   5.78    73.83   6.35    76.22   6.60
1000             73.84   5.81    73.82   6.47    73.24   5.94    73.52   6.30
1500             74.14   5.27    73.53   6.36    73.24   5.98    73.51   6.27
2000             73.83   5.69    73.22   7.14    73.24   6.04    72.91   5.28
2500             73.83   6.47    73.13   6.08    72.84   5.61    73.04   5.52
3000             73.83   6.53    73.51   4.92    72.24   6.71    73.84   6.10
4000             72.83   6.08    73.51   6.84    72.14   6.00    73.75   5.83
5000             72.53   6.14    73.13   6.64    71.74   6.34    73.55   6.65
8000             72.73   6.32    72.91   5.71    71.14   5.57    73.35   6.67
10000            72.00   5.27    72.50   6.11    70.64   5.46    72.34   5.71

Table 5.9: Classification accuracy using Case 1 MFDMap with x = 50

(map size = 40, ωc = 0.6, d/γ = 0.001).

Number of        Low-Pass        High-Pass       Band-Pass       Band-Stop
Hidden Neurons   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)
50               70.53   6.44    72.92   6.89    68.70   6.14    69.59   6.67
100              71.13   6.70    73.82   6.11    70.21   6.74    70.82   7.06
500              74.14   6.16    73.84   7.37    71.71   6.72    76.01   6.03
1000             73.23   6.81    74.45   5.85    73.81   5.99    75.81   6.08
1500             72.93   6.53    74.14   6.28    72.91   6.46    75.44   5.68
2000             72.34   6.11    73.85   6.79    73.83   5.84    74.85   6.81
2500             72.22   6.44    73.85   7.36    73.81   6.33    74.82   5.96
3000             71.93   6.10    73.25   6.15    72.91   6.96    73.83   5.76
4000             71.65   5.79    72.95   5.60    72.61   6.08    73.44   7.09
5000             71.56   6.18    72.74   6.40    72.60   5.85    73.14   6.40
8000             71.24   6.02    72.33   5.34    72.30   5.75    72.25   6.36
10000            70.89   5.82    71.81   5.86    71.72   6.36    72.17   5.50


Table 5.10: Classification accuracy using Case 2 MFDMap with the original map size

(map size = 145, ωc = 0.6, d/γ = 0.001).

Number of        Low-Pass        High-Pass       Band-Pass       Band-Stop
Hidden Neurons   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)
50               58.81   6.22    59.13   6.60    61.82   5.89    61.51   6.31
100              68.47   6.26    72.96   6.26    71.16   5.56    73.84   5.86
500              79.27   5.45    77.17   6.20    75.36   5.84    76.87   6.30
1000             78.96   5.89    75.94   5.76    75.64   5.61    74.46   5.88
1500             78.98   5.80    75.94   5.88    76.54   5.82    73.84   5.11
2000             78.36   6.26    75.15   6.31    76.23   6.06    73.84   4.83
2500             78.10   5.50    75.96   6.84    76.84   6.09    73.55   5.14
3000             77.74   6.54    75.96   6.28    76.14   5.89    73.54   5.88
4000             77.45   6.89    75.26   5.40    76.05   5.92    73.45   5.69
5000             77.44   5.85    75.26   6.22    75.45   6.02    73.15   5.64
8000             76.84   6.00    74.86   5.95    75.06   6.23    73.14   6.02
10000            76.43   6.25    74.23   5.88    74.87   6.64    73.05   5.67

Table 5.11: Classification accuracy using Case 2 MFDMap with x = 3

(map size = 102, ωc = 0.6, d/γ = 0.001).

Number of        Low-Pass        High-Pass       Band-Pass       Band-Stop
Hidden Neurons   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)
50               60.63   5.68    59.71   5.71    63.05   6.08    64.23   6.09
100              75.35   6.30    77.42   6.25    74.14   5.62    78.95   6.46
500              80.19   6.39    79.25   6.50    75.34   5.91    80.75   5.94
1000             79.86   7.22    78.95   6.54    78.65   5.90    80.14   6.79
1500             80.16   6.15    78.65   6.39    77.75   6.45    79.23   6.28
2000             79.86   6.03    78.04   4.87    79.55   6.21    79.55   5.79
2500             78.96   5.71    79.56   5.72    79.86   6.27    79.25   5.92
3000             78.95   5.90    79.25   6.45    79.57   6.07    79.24   5.86
4000             78.35   5.97    78.86   6.20    79.27   5.82    78.94   6.12
5000             78.35   5.12    78.56   6.00    79.26   6.03    78.93   6.38
8000             78.05   5.87    78.06   6.22    78.88   6.11    78.64   5.85
10000            77.99   6.37    77.86   6.57    78.28   5.30    78.24   6.23


Table 5.12: Classification accuracy using Case 2 MFDMap with x = 5

(map size = 88, ωc = 0.6, d/γ = 0.001).

Number of        Low-Pass        High-Pass       Band-Pass       Band-Stop
Hidden Neurons   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)
50               65.45   6.88    65.45   4.89    69.97   6.09    63.60   7.00
100              72.99   6.47    77.74   6.22    74.76   6.03    81.04   6.71
500              77.78   6.41    78.96   6.00    74.75   6.34    82.25   6.01
1000             80.17   5.59    80.34   5.62    76.85   5.30    79.85   5.53
1500             79.87   5.73    78.35   6.20    77.76   6.71    78.64   5.13
2000             80.47   6.12    79.55   5.60    77.75   5.55    78.65   6.01
2500             80.45   5.95    79.45   6.43    77.65   6.02    79.55   6.11
3000             79.34   5.19    79.45   6.03    77.58   5.82    79.55   6.52
4000             79.17   5.24    79.17   5.18    77.25   6.07    78.65   5.52
5000             79.17   6.51    79.17   4.79    77.23   5.03    78.56   6.40
8000             78.78   5.62    79.17   5.86    76.97   6.76    78.55   6.04
10000            78.15   7.04    79.05   6.57    76.81   6.27    78.17   5.61

Table 5.13: Classification accuracy using Case 2 MFDMap with x = 10

(map size = 73, ωc = 0.6, d/γ = 0.001).

Number of        Low-Pass        High-Pass       Band-Pass       Band-Stop
Hidden Neurons   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)
50               67.83   6.39    71.17   6.18    71.76   5.57    66.93   5.80
100              74.47   6.13    80.76   6.94    72.05   5.29    81.64   5.26
500              81.37   5.88    81.36   5.85    74.71   5.74    83.76   5.83
1000             80.17   6.94    80.77   5.49    78.05   5.46    85.57   5.34
1500             80.78   6.30    81.16   5.88    79.28   5.65    82.56   5.71
2000             79.57   5.94    81.06   5.76    79.55   5.85    81.08   5.63
2500             79.88   6.14    81.06   5.84    79.57   5.40    82.86   5.41
3000             79.88   6.04    80.45   6.24    79.26   5.99    81.39   4.97
4000             79.56   6.09    80.45   5.93    79.26   6.17    80.78   6.28
5000             79.56   6.23    80.14   5.70    79.23   5.52    80.49   7.00
8000             79.26   5.70    79.85   5.78    78.97   6.56    80.18   7.11
10000            78.88   5.60    79.24   5.34    78.66   5.21    79.93   5.75


Table 5.14: Classification accuracy using Case 2 MFDMap with x = 15

(map size = 63, ωc = 0.6, d/γ = 0.001).

Number of        Low-Pass        High-Pass       Band-Pass       Band-Stop
Hidden Neurons   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)
50               72.34   5.98    71.16   5.16    70.84   5.75    70.53   5.75
100              73.83   5.77    79.87   6.21    75.15   6.20    78.64   5.69
500              80.46   6.04    80.45   6.25    77.62   6.54    85.58   5.53
1000             79.87   5.54    79.26   6.04    78.66   6.48    82.85   5.86
1500             79.28   5.04    79.56   6.08    78.93   6.14    81.35   5.90
2000             79.17   5.98    79.26   5.74    78.93   5.68    80.76   6.07
2500             80.47   5.39    79.55   6.36    78.93   6.31    82.25   5.56
3000             79.88   5.05    79.55   5.58    78.64   5.96    81.65   6.04
4000             79.86   7.19    79.24   5.60    78.63   6.69    80.45   5.70
5000             79.56   5.88    79.24   6.36    78.23   5.30    80.15   5.82
8000             79.25   6.20    78.64   6.84    77.93   6.07    79.85   5.58
10000            78.78   6.60    78.36   5.81    77.64   6.64    79.55   5.86

Table 5.15: Classification accuracy using Case 2 MFDMap with x = 20

(map size = 58, ωc = 0.6, d/γ = 0.001).

Number of        Low-Pass        High-Pass       Band-Pass       Band-Stop
Hidden Neurons   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)
50               74.73   7.78    72.66   6.00    75.34   6.41    74.14   5.29
100              77.11   7.70    78.05   6.57    76.24   5.89    77.44   5.98
500              79.63   6.57    79.93   6.21    76.84   6.13    81.94   6.13
1000             78.56   6.39    79.65   6.09    78.34   5.55    81.07   6.26
1500             78.96   5.36    79.85   6.33    79.86   4.92    80.45   6.34
2000             78.36   5.71    79.25   6.29    78.95   6.05    78.65   6.34
2500             78.97   5.69    79.54   5.19    78.96   5.52    80.45   6.34
3000             78.36   6.27    79.25   6.01    78.93   6.59    80.45   5.84
4000             78.35   6.24    79.25   5.79    78.85   6.87    80.06   6.19
5000             78.15   6.05    79.16   4.98    78.55   5.81    79.87   4.87
8000             77.86   5.83    78.76   5.34    77.93   6.60    79.26   5.42
10000            77.15   5.54    78.25   5.84    77.15   6.48    78.84   6.60


Table 5.16: Classification accuracy using Case 2 MFDMap with x = 30

(map size = 49, ωc = 0.6, d/γ = 0.001).

Number of        Low-Pass        High-Pass       Band-Pass       Band-Stop
Hidden Neurons   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)
50               72.92   5.67    76.85   5.73    75.04   6.21    69.31   5.80
100              74.42   6.16    77.35   5.87    75.94   5.53    72.02   4.91
500              77.16   6.75    78.43   5.45    78.34   6.42    79.83   5.40
1000             76.53   5.56    76.85   6.27    77.46   6.55    77.16   6.18
1500             76.25   5.52    76.25   4.91    76.86   6.16    77.16   7.21
2000             76.24   5.75    76.25   6.07    76.55   6.01    76.84   6.37
2500             76.54   6.22    76.55   6.33    76.55   5.42    76.74   6.15
3000             76.24   5.90    76.46   5.58    76.16   6.25    76.74   5.73
4000             76.15   5.51    76.16   5.97    75.85   5.39    76.16   6.06
5000             76.05   5.21    76.06   6.15    75.78   6.80    76.16   6.48
8000             75.76   5.81    75.85   5.69    75.05   6.26    75.81   5.87
10000            75.06   5.89    75.16   5.99    74.87   6.13    75.45   6.95

Table 5.17: Classification accuracy using Case 2 MFDMap with x = 40

(map size = 44, ωc = 0.6, d/γ = 0.001).

Number of        Low-Pass        High-Pass       Band-Pass       Band-Stop
Hidden Neurons   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)
50               71.44   5.39    69.95   6.18    74.14   5.76    72.34   6.50
100              73.83   5.81    74.75   5.37    74.14   5.29    75.04   6.21
500              75.36   6.03    76.21   5.48    76.52   6.49    79.54   5.50
1000             77.45   5.02    77.44   5.79    77.13   6.14    77.44   5.68
1500             77.16   6.28    77.14   6.04    76.84   6.58    76.27   5.48
2000             77.15   6.04    76.84   6.90    76.74   5.55    76.56   6.25
2500             76.85   6.04    76.75   6.40    76.74   6.12    76.65   6.17
3000             76.85   6.40    76.75   5.78    76.64   5.67    76.45   5.18
4000             76.55   6.52    76.44   5.69    76.33   6.64    76.45   5.05
5000             76.25   6.40    76.14   6.77    76.03   5.72    76.04   5.98
8000             75.85   5.77    75.54   7.37    75.74   5.96    75.85   6.35
10000            75.46   5.98    75.25   6.08    75.15   5.83    75.45   6.08


Table 5.18: Classification accuracy using Case 2 MFDMap with x = 50

(map size = 37, ωc = 0.6, d/γ = 0.001).

Number of        Low-Pass        High-Pass       Band-Pass       Band-Stop
Hidden Neurons   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)   μ (%)   σ (%)
50               69.62   5.23    72.93   7.37    72.04   6.27    69.31   5.04
100              71.73   5.22    75.34   5.24    71.74   5.93    72.02   6.31
500              77.73   6.43    77.45   6.22    75.63   6.31    77.16   6.38
1000             77.13   5.93    77.15   5.89    76.84   6.00    77.16   6.11
1500             76.85   5.81    76.86   5.59    76.86   6.55    76.84   5.61
2000             76.55   6.66    76.85   5.58    76.55   5.91    76.54   6.00
2500             76.26   5.60    76.86   6.25    76.45   5.44    76.54   6.05
3000             76.26   6.07    76.85   7.16    76.05   6.12    76.45   6.47
4000             76.03   6.21    76.85   5.60    75.84   6.78    76.22   6.33
5000             75.96   6.41    76.27   6.27    75.75   5.40    75.98   5.82
8000             75.12   5.57    75.96   5.72    75.05   5.88    75.44   6.81
10000            74.84   6.18    75.75   6.99    74.44   6.50    74.99   5.97

Table 5.19: Confusion matrix for Case 1 MFDMap with x = 15 at 500 hidden neurons

(map size = 71, ωc = 0.6, d/γ = 0.001).

                     Class 1     Class 2    Class 3     Class 4       Class 5
                     (Dongbei)   (Shanxi)   (Sichuan)   (Guangdong)   (Jiangsu)
Class 1 (Dongbei)    6.3         0.1        0           0.1           0.5
Class 2 (Shanxi)     0.2         6.8        0.2         0             0.4
Class 3 (Sichuan)    0.5         0.1        2.3         0.1           1.2
Class 4 (Guangdong)  0.5         0.2        0.6         4.1           0.6
Class 5 (Jiangsu)    0.8         0          0           0.2           7.5


Table 5.20: Confusion matrix for Case 2 MFDMap with x = 15 at 500 hidden neurons

(map size = 63, ωc = 0.6, d/γ = 0.001).

                     Class 1     Class 2    Class 3     Class 4       Class 5
                     (Dongbei)   (Shanxi)   (Sichuan)   (Guangdong)   (Jiangsu)
Class 1 (Dongbei)    6.1         0.1        0           0.2           0.6
Class 2 (Shanxi)     0.1         6.9        0.2         0.1           0.3
Class 3 (Sichuan)    0.1         0          2.9         0.5           0.7
Class 4 (Guangdong)  0.3         0.2        0.3         4.7           0.5
Class 5 (Jiangsu)    0.4         0.1        0           0.1           7.9

Table 5.21: Classification accuracy of the RPROP, ELM, R-ELM, FIR-ELM and SVM classifier.

Classifier               RPROP   ELM     R-ELM   FIR-ELM   SVM
Accuracy (%)             59.84   73.90   78.64   85.58     63.92
Standard Deviation (%)   9.40    7.47    7.91    5.53      11.29

5.5 Discussion

Overall, the best performance achieved in all experiments is 85.58%, using the band-stop filter at cutoff frequency ωc = 0.6 and ratio d/γ = 0.001. The MFDMap used to represent the folk songs is the Case 2 MFDMap of size 63. This MFDMap is constructed using features possessed by at least 15% of the folk songs in a particular class. For Case 1 MFDMaps, the best accuracy achieved is 81.03%. This result is also obtained using the band-stop filter at cutoff frequency ωc = 0.6 and ratio d/γ = 0.001, and with a similar MFDMap structure, i.e. the MFDMap of size 71 constructed using features possessed by at least 15% of the folk songs in a particular class. Both results are achieved using 500 hidden neurons in the SLFN.

It can be seen that, in most cases, the neural classifier achieves its best performance within 1000 hidden neurons. The number of hidden neurons required is many times smaller than that used by the ELM and R-ELM classifiers presented in the previous chapter, and yet the accuracy is much better (85.58% compared with 73.90% for the ELM classifier and 78.64% for the R-ELM). This can be interpreted as evidence of the robustness of the FIR-ELM design.

In the discussion in Chapter 4, the experimental results did not reveal much of a relationship between the size of the MFDMaps (i.e. the size of the input vectors) and the classification performance. However, in this chapter, with the FIR filtering capability, the "cleaner" input samples allow the effect of using various MFDMaps to be revealed. The most obvious pattern appears among the results using the band-stop FIR-ELM. The classification accuracies of the nine different MFDMaps follow a bell-like pattern with the peak at the MFDMap with x = 15. In other words, starting from the original MFDMap (the MFDMap with the largest size), the classification accuracy improves as the size is reduced, up to the peak at the fifth MFDMap (x = 15), after which the accuracy starts to deteriorate. This pattern is consistent in both Case 1 and Case 2 MFDMaps. This interesting pattern suggests that the MFDMap with x = 15 might be the optimal design to adopt. In addition, the pattern in the classification accuracies shows a trace of the curse-of-dimensionality phenomenon, where a large input dimensionality results in poorer performance.

Overall, the band-stop FIR-ELM shows a fairly consistent performance across all MFDMaps. Its best accuracy in each of the MFDMap cases is almost always achieved using 500 hidden neurons. Although this is not always the case, in the majority of cases the band-stop FIR-ELM performed the best among the four filters.

The band-pass FIR-ELM, on the other hand, is the worst among the four. For Case 2 MFDMaps, unlike the band-stop FIR-ELM, the band-pass FIR-ELM always requires more hidden neurons, usually double that required by the band-stop FIR-ELM. The band-pass FIR-ELM typically requires about 1500 hidden neurons and yet its performance is not as good as that achieved by the band-stop FIR-ELM. However, the situation is somewhat different for Case 1 MFDMaps, where the band-pass FIR-ELM starts off with better accuracy and then deteriorates. This phenomenon can be seen in Table 5.5 to Table 5.8.

Recall that the reason for having Case 1 MFDMaps and Case 2 MFDMaps is to investigate the effect of the presence of musical rests on the task of Han Chinese folk song classification. From Table 5.1 to Table 5.18, it can be seen that Case 2 MFDMaps always perform better than Case 1 MFDMaps in each of the individual cases. These results suggest that, instead of contributing more features to represent each of the five classes, the extra rest-related features might in fact distort the overall feature patterns.

The confusion matrices for the best performing experiment designs for Case 1 and Case 2 MFDMaps are shown in Table 5.19 and Table 5.20, respectively. It can be seen that Sichuan folk songs are still the most difficult to differentiate among the five classes. Part of the reason could be that there are not enough samples to represent the class. It could also hint that Sichuan folk songs possess characteristics similar to folk songs from other classes. It is interesting to see that, in both confusion matrices (and also those in Chapter 4), Sichuan folk songs are most often "confused" with Jiangsu folk songs.

The comparison of the classification accuracy among the five classifiers in Table 5.21 shows that the FIR-ELM classifier achieves the best accuracy with the smallest standard deviation. The regularized extreme learning machine, an enhanced variant of the extreme learning machine, ranks second. The extreme learning machine classifier comes third, followed by the support vector machine and then the resilient propagation neural network.


5.6 Conclusion

In this chapter, the finite impulse response extreme learning machine algorithm is employed as the training algorithm for the single-hidden layer feedforward neural network in Han Chinese folk song classification. This technique has not previously been applied to music classification tasks, and this is its first application to multi-class classification.

The main characteristics of the FIR-ELM lie in the design of the robust input weights and output weights of the SLFN. FIR filter techniques are employed to design the hidden layer so that each hidden neuron functions as an FIR filter to remove input disturbances and undesired frequency components. The robust output weights are obtained by minimizing an objective function that includes both the weighted sum of squared output errors and the weighted sum of squared output weights of the SLFN.

The robustness of the FIR-ELM neural classifier, particularly in Han Chinese folk song classification, was verified through the experiments in this chapter. It can be seen that the overall performance of the classifier is much better than that in Chapter 4. The classification accuracies were significantly improved using the FIR-ELM classifier. In addition, the FIR-ELM classifier successfully achieved a better accuracy than some commonly used machine classifiers in music classification, such as the gradient descent-based neural network and the support vector machine.

In this chapter, two variants of MFDMaps, Case 1 MFDMaps (including both musical notes and rests) and Case 2 MFDMaps (musical notes only), are designed to verify the role of musical rests in Han Chinese folk song classification. In each variant, a number of reduced-size MFDMaps are designed to verify the phenomenon of the curse-of-dimensionality.

In the experiments using the FIR-ELM classifier, two useful patterns were successfully revealed. Firstly, the experiments using various reduced-size MFDMaps show that a large number of features does not guarantee good performance. This coincides with the theory of the curse-of-dimensionality where, for a fixed set of samples, a high-dimensional input will lead to poor performance. In the experiments, the best accuracy for each of the two variants of MFDMap is achieved using the MFDMap with x = 15. Each of these reduced-size MFDMaps is more than 50% smaller than its respective original MFDMap. It is very encouraging to conclude from the experimental results that better classification accuracy can be achieved using smaller MFDMaps. In addition, the best classification accuracy is achieved using the SLFN with 500 hidden neurons. A smaller MFDMap and a smaller network structure mean that fewer resources are required.

Next, a comparison between the experimental results of Case 1 and Case 2 MFDMaps reveals that the presence of musical rests in Han Chinese folk song classification not only increases the size of the MFDMaps but also results in poorer classification accuracy. This suggests that musical rests might not be representative of the characteristics of folk songs from different classes and might instead distort the representation.

It is interesting to see from the confusion matrices that, although the classification accuracy is greatly improved by using the FIR-ELM classifier, the problem of confusion in recognizing the Sichuan folk songs has not been completely solved.

A good classification result is successfully obtained for Han Chinese folk song classification using the technique of the MFDMap and the FIR-ELM. This shows the potential of using such techniques for Han Chinese folk song classification. Therefore, the techniques employed in this chapter are worth further investigation. In this chapter, the rectangular window method is employed for the FIR filter. One potential direction is to investigate the effect of other window methods, such as the Kaiser, Hamming and Bartlett windows. As mentioned in [19], in some cases non-linear sigmoid hidden nodes can help reduce the effects of the disturbances and lower both the structural and empirical risks. Although this is not always the case, it might be worthwhile to investigate the effect of applying non-linear hidden neurons in Han Chinese folk song classification.




Chapter 6

A Two-Case European Folk Song Classification

This chapter presents a two-case (German and Austrian) European folk song classification using the technique developed in Chapter 5. The main objective of the research in this chapter is to investigate and validate the capability of this technique to solve the folk song classification problem using an entirely different set of data samples. In addition, the capability to generalize the technique to folk songs of other countries, particularly those with significant cultural differences, needs to be verified. This chapter begins with a brief review of the issue, followed by the design of the experiments. Then, the experimental results are presented and discussed.

6.1 Introduction

It is well known that Western and Eastern cultures differ significantly. Hence, it is expected that folk songs from these cultures possess significant differences. The technique of employing the musical feature density map to construct input feature vectors and the finite impulse response extreme learning machine as the machine classifier was shown to give good performance on Han Chinese folk song classification. Hence, the functionality of this technique in a very different setting is worth investigating.


In order to verify the performance of the proposed technique in European folk song classification, it is helpful to have existing results to refer to. To the author's knowledge, there is one prior study related to the topic discussed in this chapter.

Chai and Vercoe [81] examine folk music of three countries: Ireland, Germany and Austria. In their paper, they performed two-case and three-case classification using a hidden Markov model as the machine classifier. Four melody representations were employed: absolute pitch representation, absolute pitch with duration representation, interval representation and contour representation. They reported 77% classification accuracy for two-case classification between Irish and Austrian folk music, 75% between Irish and German folk music and 66% between German and Austrian folk music. According to the authors, the results of the two-case classification coincide with their intuition that German folk music and Austrian folk music are less distinguishable from each other than from Irish folk music. Their three-case classification achieves an accuracy of 63%.

6.2 Experiment Design and Setting

6.2.1 The Musical Feature Density Map

The musical feature density map is a music encoding method that is designed to represent music elements in folk songs. It portrays the occurrence frequency of each music representation in a folk song. Figure 6.1 shows an example of the raw musical data of an Austrian folk song and Figure 6.2 shows an example for a German folk song. As seen in Figure 6.1 and Figure 6.2, the original musical data patterns of the two folk songs are fairly disordered and do not reveal much of a pattern to differentiate the two classes. Figure 6.3 and Figure 6.4 show the same folk songs encoded using the MFDMap. After encoding, the MFDMap is able to reveal some differences in pattern between the two classes. Notice that the locations of the peaks in the figures and their spread provide good hints of the differences.


Figure 6.1: An example of the raw musical data of an Austrian folk song.

Figure 6.2: An example of the raw musical data of a German folk song.


Figure 6.3: An example of the MFDMap of an Austrian folk song.

Figure 6.4: An example of the MFDMap of a German folk song.


In this chapter, 15 different Case 2 MFDMaps are designed to investigate the roles of the four music elements (solfege, interval, duration and duration ratio) in differentiating between folk songs of the two countries. These MFDMaps are designed using different combinations of the four elements. The 15 MFDMaps are divided into three groups. The first group consists of MFDMaps that employ only one music element, the second group consists of MFDMaps that encompass two music elements, and each MFDMap in the third group employs at least three music elements. The list of the elements in the different combinations is presented in Table 6.1.

Table 6.1: The fifteen MFDMaps.

Group   MFDMap      Number of Music Elements   List of Music Elements
1       MFDMap 1    1                          Solfege only
1       MFDMap 2    1                          Interval only
1       MFDMap 3    1                          Duration only
1       MFDMap 4    1                          Duration ratio only
2       MFDMap 5    2                          Solfege & interval
2       MFDMap 6    2                          Solfege & duration
2       MFDMap 7    2                          Solfege & duration ratio
2       MFDMap 8    2                          Interval & duration
2       MFDMap 9    2                          Interval & duration ratio
2       MFDMap 10   2                          Duration & duration ratio
3       MFDMap 11   3                          Solfege, interval & duration
3       MFDMap 12   3                          Solfege, interval & duration ratio
3       MFDMap 13   3                          Solfege, duration & duration ratio
3       MFDMap 14   3                          Interval, duration & duration ratio
3       MFDMap 15   4                          Solfege, interval, duration & duration ratio
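As a small illustration, the 15 maps in Table 6.1 are simply the non-empty subsets of the four music elements (4 + 6 + 4 + 1 = 15), and enumerating them in order of subset size reproduces the listing in the table.

```python
from itertools import combinations

elements = ["solfege", "interval", "duration", "duration ratio"]

# All non-empty combinations of the four elements, grouped by size,
# in the same order as Table 6.1.
maps = [c for r in range(1, 5) for c in combinations(elements, r)]
for i, combo in enumerate(maps, start=1):
    print(f"MFDMap {i}: {' & '.join(combo)}")
```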


6.2.2 Data Set

The two classes of the folk song classification task are folk songs from Germany and Austria. Due to their geographic location and cultural influences, German and Austrian folk songs are closely related, hence they are not easily distinguishable. This is one of the main reasons they are chosen as the data set. The other main reason for using these folk songs is the availability of the data.

The melodies of 106 German folk songs and 104 Austrian folk songs in the Essen folksong database [1] are used as the data set for the classification task discussed in this chapter. 20% of the folk songs from each country are randomly selected by the computer to form the testing set. There are a total of 168 songs in the training set and 42 songs in the testing set. It is to be noted that the same database is used in [81].
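A minimal sketch of such a split is given below; the function name and the fixed seed are illustrative only. Applied per country, 106 German and 104 Austrian songs give roughly 168 training songs and 42 testing songs in total, as stated above.

```python
import numpy as np

def hold_out_20_percent(num_songs, seed=0):
    """Randomly hold out 20% of one country's songs for testing."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(num_songs)
    n_test = round(0.2 * num_songs)
    return idx[n_test:], idx[:n_test]   # (train indices, test indices)
```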

6.2.3 Parameter Setting

The finite impulse response extreme learning machine algorithm is used as the training algorithm, and a single-hidden layer feedforward neural network with the structure shown in Figure 6.5 is employed. This SLFN has linear hidden neurons, linear output neurons and an input tapped-delay-line memory.

The simulations were performed using four different types of FIR filter, namely low-pass, high-pass, band-pass and band-stop filters, over a range of cutoff frequencies ωc from 0.1 to 0.9 with a step size of 0.1. A bandwidth of ±0.05 is used for the band-pass and band-stop filters. The number of hidden neurons employed in the simulations ranges from 1 to 2000. The targets for the FIR-ELM classifier are set using the 1-of-c method by assigning each of the two countries to one target. For a set of targets, the one representing a particular country is assigned '1' and the remaining target is assigned '0'. The final output of the neural network is determined using the winner-takes-all method.


Figure 6.5: The FIR-ELM network structure with linear neurons and time-delay elements.

6.3 Experiment Results

The low-pass filter performs the best among the four different filters. Hence the FIR-ELM shown in Table 6.2 to Table 6.6 is the low-pass FIR-ELM with cutoff frequency 0.1 and balancing parameter ratio d/γ = 0.1. Table 6.2 to Table 6.4 show the classification accuracy achieved by the FIR-ELM in two-case European folk song classification using the 15 MFDMaps. In Table 6.2, a single music element is used in each MFDMap. Table 6.3 shows the classification accuracy of the FIR-ELM classifier using two music elements in each of the MFDMaps. Finally, the MFDMaps in Table 6.4 are each constructed with either three music elements or all four elements. Table 6.5 is the confusion matrix for the highest accuracy of 83.33%, namely the two-case folk song classification using MFDMap 14 (interval, duration and duration ratio) at 100 hidden neurons. Table 6.6 shows a comparison of the classification accuracy between five classifiers: the gradient descent-based resilient propagation (RPROP) [126] single-hidden layer feedforward neural network, the extreme learning machine (ELM) [33], the regularized extreme learning machine (R-ELM) [34], the finite impulse response extreme learning machine (FIR-ELM) [19] and the support vector machine (SVM) [27].


The classification accuracy for RPROP, ELM and R-ELM is the average of 50 repetitions, each using a different set of random initial weights.

Table 6.2: Classification accuracy (%) using one music element in the MFDMap.

Number of Hidden Neurons   Solfege Only   Interval Only   Duration Only   Duration Ratio Only
50                         52.38          66.67           45.24           61.90
100                        52.38          71.43           45.24           61.90
500                        54.76          71.43           52.38           57.14
1000                       57.14          71.43           52.38           61.90
1500                       61.90          64.29           59.52           59.52
2000                       61.90          59.52           57.14           59.52

Table 6.3: Classification accuracy (%) using two music elements in the MFDMap.

Number of        Solfege &   Solfege &   Solfege &        Interval &   Interval &       Duration &
Hidden Neurons   Interval    Duration    Duration Ratio   Duration     Duration Ratio   Duration Ratio
50               47.62       52.38       50.00            73.81        71.43            54.76
100              61.90       47.62       54.76            78.57        78.57            54.76
500              61.90       50.00       66.67            64.29        69.05            59.52
1000             66.67       54.76       64.29            66.67        71.43            61.90
1500             64.29       52.38       57.14            69.05        73.81            61.90
2000             59.52       57.14       59.52            66.67        76.19            61.90


Table 6.4: Classification accuracy (%) using three music elements in the MFDMap.

Number of        Solfege, Interval   Solfege, Interval   Solfege, Duration   Interval, Duration   Solfege, Interval,
Hidden Neurons   & Duration          & Duration Ratio    & Duration Ratio    & Duration Ratio     Duration & Duration Ratio
50               54.76               59.52               57.14               66.67                59.52
100              59.52               64.29               57.14               83.33                59.52
500              66.67               78.57               61.90               80.95                76.19
1000             66.67               73.81               69.05               71.43                76.19
1500             69.05               69.05               61.90               69.05                71.43
2000             64.29               69.05               61.90               69.05                71.43

Table 6.5: Confusion matrix for MFDMap using interval, duration and duration ratio elements.

           German   Austrian
German     17       4
Austrian   3        18

Table 6.6: Classification accuracy of the RPROP, ELM, R-ELM, FIR-ELM and SVM classifier.

Classifier               RPROP   ELM     R-ELM   FIR-ELM   SVM
Accuracy (%)             57.86   62.62   64.05   83.33     57.14
Standard Deviation (%)   5.27    7.10    2.62    0         0


6.4 Discussion

The best classification accuracy achieved by the FIR-ELM classifier in two-case European folk song classification is 83.33%, using the MFDMap with three music elements (interval, duration and duration ratio). In general, improvement in classification accuracy is expected as the number of hidden neurons increases until the classifier reaches a saturation point. In this case, the neural network only requires 100 hidden neurons, i.e. the saturation point is at 100 hidden neurons.

In Table 6.2, the interval element appears to be the best of the four elements. The classification accuracy using the interval element reaches its best (71.43%) at 100 hidden neurons and maintains this good performance through to 1000 hidden neurons. The solfege element comes second among the four, with consistent improvement in performance as the number of hidden neurons increases. On the other hand, the duration ratio element gives its best performance at the very early stages and the accuracy then fluctuates above 100 hidden neurons. As for the duration element, the performance improves gradually as the number of hidden neurons increases and drops slightly at 2000 hidden neurons.

The performance of the MFDMaps using two music elements (Table 6.3) seems less stable, except for MFDMap 5 (solfege and interval) and MFDMap 10 (duration and duration ratio). It is interesting to note that both of these MFDMaps use elements of the same character (solfege and interval are both pitch-related elements, while duration and duration ratio are both rhythmical elements). Despite this instability, the MFDMap using interval and duration and the MFDMap using interval and duration ratio perform the best within this group of MFDMaps.

Finally, the classification accuracy stabilizes as the third (and fourth) music element is added to the MFDMap. Although each MFDMap in Table 6.4 reaches its best performance at various stages, the classification accuracy generally improves as the number of hidden neurons increases. The best performing MFDMap within the group uses the element combination of interval, duration and duration ratio. The MFDMap using solfege, interval and duration ratio comes second, while the MFDMap that employs all four music elements comes third.

An interesting pattern is observed in the classification performance from Table 6.2 to Table 6.4. Firstly, a conclusion might be drawn from Table 6.2 that the interval element is the best performing element. Next, the best performing combinations in Table 6.3 are interval and duration, and interval and duration ratio. The classification accuracy of these two combinations (in Table 6.3) is higher than the best performance in Table 6.2. Finally, the best performing MFDMap in Table 6.4 is also the best performing among all 15 MFDMaps; it uses the interval, duration and duration ratio elements. Again, its classification accuracy is higher than that in Table 6.3.

Another pattern can be easily spotted from Table 6.2 through to Table 6.4. Firstly, in Table 6.2, the duration ratio element performs better than the duration element. In Table 6.3, the MFDMap using the combination of solfege and duration ratio performs better than the MFDMap using solfege and duration; the same holds between the MFDMap using interval and duration ratio and the MFDMap using interval and duration. Finally, the same pattern occurs again in Table 6.4, where the combination of solfege, interval and duration ratio performs better than the combination of solfege, interval and duration.

The classification accuracy using combined elements is usually better than that using individual elements, and it usually improves as the number of elements increases. For example, the classification accuracy using the interval element alone is 71.43%. The accuracy improves to 78.57% when the interval element is used in combination with either the duration or the duration ratio element. Finally, the accuracy improves to 83.33% using the combination of interval, duration and duration ratio.

Table 6.6 clearly shows that the FIR-ELM classifier performs the best among all five classifiers. The R-ELM classifier comes second, followed by the ELM classifier. The gradient descent-based RPROP classifier achieves a classification accuracy similar to that of the SVM classifier.

Figure 6.6 depicts the classification accuracy of the low-pass FIR-ELM using the MFDMap with three elements (interval, duration and duration ratio) at different cutoff frequencies with 100 hidden neurons. Figure 6.7 portrays the classification accuracy of the FIR-ELM using each of the four filters with cutoff frequency 0.1. The MFDMap structure employed in these classifiers is the three-element combination of interval, duration and duration ratio.

Figure 6.6: Classification accuracy of the low-pass FIR-ELM with 100 hidden neurons (MFDMap: interval, duration and duration ratio).


Figure 6.7: Classification accuracy of four filters FIR-ELM with cutoff frequency 0.1 (MFDMap: interval, duration and duration ratio).

6.5 Conclusion

In this chapter, the technique of using the musical feature density map and the finite impulse response extreme learning machine is verified on a two-case German and Austrian folk song classification task. The experiments show that even with a single music element, the FIR-ELM classifier gives a reasonably good performance. The classification accuracy improves as the number of music elements increases. The highest accuracy achieved is 83.33%, using the combination of interval, duration and duration ratio elements. The low-pass FIR-ELM classifier performs better than the high-pass, band-pass and band-stop FIR-ELMs.

The classification accuracy achieved is fairly encouraging. A poorer accuracy is expected if the number of classes is increased for the European folk song data. To further investigate the machine's capability in folk song classification, future work should include folk songs from other European countries.




Chapter 7

Conclusion

This chapter presents a summary of the research activities and contributions presented in this thesis. Some suggestions for further development of the topic discussed in this thesis are also included.

7.1 Summary

This thesis presents the research topic of machine classification of Han Chinese folk songs. In this research, a machine is used in place of the human. In order for the machine to be able to read and understand the music, a simple yet meaningful encoding method is developed in Chapter 3 to represent the folk songs. This encoding method effectively encapsulated useful musical information that is readable by the machine and at the same time can be easily interpreted by humans. The main aim of developing an encoding method that benefits both humans and the machines was achieved effectively by the MFDMap.

In Chapter 4, this encoding method is put to the test to verify its functionality in an actual machine classification task. The extreme learning machine, an extremely fast learning algorithm, and its enhanced variant, the regularized extreme learning machine, demonstrated that the learning of folk songs can be accomplished several hundred times faster than with a conventional gradient descent algorithm, and yet the classification accuracy can be 14% better than that achieved by a gradient descent learning algorithm and 10% better than the support vector machine. In addition, the effect of incorporating or eliminating musical rests is investigated to verify their importance in folk song classification. The outcomes of the experiments in Chapter 4 do not suggest that musical rests play a significant role.

Further investigations were carried out in Chapter 5, where a more robust learning algorithm is employed as the classifier. The finite impulse response extreme learning machine is a powerful learning algorithm whose robustness is reflected in the design of both the input weights and the output weights. This algorithm is designed so that input disturbances and undesired frequency components in the data samples can be handled through the pre-processing capability of the hidden neurons of the neural network. It is believed that the undesired features in the input vectors (MFDMaps), which act as "noise" in the overall structure of the input data, were effectively suppressed and hence the classification accuracy was significantly improved. The classification accuracy using the FIR-ELM is about 11% better than the ELM and 7% better than the R-ELM. The capability of the FIR-ELM in real-world multi-class classification is verified through its performance on a real-world five-class folk song classification task.

In Chapter 5, it was found that the inclusion of musical rests in the encoding of the songs into MFDMaps worsened the overall performance of the classification task. In addition, the various experiments show that it is not necessary to include all features in the input vector; better classification accuracy is achieved by using a reduced-size MFDMap.

Finally, in Chapter 6, the technique employed in Chapter 5 is applied on the European folk songs to verify its performance on folk songs of other countries, particularly folk songs with significant cultural differences. The outcomes achieved in Chapter 6 show that the same technique employed for Han Chinese folk songs can be successfully applied to European folk songs.


7.2 Future Works

Below is a list of potential future developments on the research topic presented in this thesis:

1. As seen in Chapter 5, the design of the structure of the SLFN consists of linear hidden neurons, linear output neurons and an input tapped-delay-line memory. This structure contains finite-depth memory, which enables the network to perform dynamic learning. However, the conventional structure of the SLFN uses non-linear hidden neurons and linear output neurons without dynamics to learn complex non-linear mappings. Therefore, an in-depth analysis should be performed on the effect of non-linear hidden nodes in the FIR-ELM network structure, particularly in the application of Han Chinese folk song classification.

2. In Chapter 5, the rectangular window method is employed for the FIR filter. The effect of other window methods, such as the Kaiser, Hamming and Bartlett methods, on the input samples and the classifier performance should be investigated.

3. In Chapters 4 and 5, the optimal MFDMap is determined through exhaustive testing. An effective method to analytically determine the optimal MFDMap should be developed to eliminate this tedious procedure.

4. The musical feature density map is designed to address the task of Han Chinese folk song classification where folk songs consist of a single melody line. It is possible that the same concept can be extended to accommodate polyphonic music.

5. This thesis focuses on features extracted from the melody of the folk songs. The song lyrics in many cases convey extra information. Further research on the fusion of features from both the melody and the lyrics might reveal more patterns that can be used to develop a theoretical basis for encoding songs for machine classification.


6. The folk song classification techniques developed in this thesis focus on Han Chinese folk songs. Although a two-case European folk song classification was performed to verify the effectiveness of the technique on folk songs from other countries, more diversity should be included in the examples.


References

[1] H. Schaffrath, The Essen Folksong Collection in Kern Format, D. Huron (ed.), Menlo Park, CA: Center for Computer Assisted Research in the Humanities, 1995.

[2] K. Hornik, “Approximation capabilities of multilayer feedforward networks,” Neural Networks, vol. 4, no. 2, pp. 251-257, 1991.

[3] K. Hornik, M. Stinchcombe, and H. White, “Multilayer feedforward networks are universal approximators,” Neural Networks, vol. 2, no. 5, pp. 359-366, 1989.

[4] G.-B. Huang, D. H. Wang, and Y. Lan, “Extreme learning machines: a survey,” International Journal of Machine Learning and Cybernetics, vol. 2, pp. 107-122, 2011.

[5] D. Yu and L. Deng, “Efficient and effective algorithms for training single-hidden-layer neural networks,” Pattern Recognition Letters, vol. 33, no. 5, pp. 554-558, 2012.

[6] B. P. Chacko and A. P. Babu, “Online sequential extreme learning machine based handwritten character recognition,” in IEEE Students’ Technology Symposium, TechSym 2011, pp. 142-147, 2011.

[7] T. Helmy and Z. Rasheed, “Multi-category bioinformatics dataset classification using extreme learning machine,” in Proceedings of the IEEE Congress on Evolutionary Computation, CEC 2009, pp. 3234-3240, 2009.


[8] D. Wang and G.-B. Huang, “Protein sequence classification using extreme learning machine,” in Proceedings of the IEEE International Joint Conference on Neural Networks, IJCNN 2005, vol. 3, pp. 1406-1411, 2005.

[9] G. Wang, Y. Zhao, and D. Wang, “A protein secondary structure prediction framework based on the extreme learning machine,” Neurocomputing, vol. 72, no. 1-3, pp. 262-268, 2008.

[10] R. Zhang, G.-B. Huang, N. Sundararajan, and P. Saratchandran, “Multicategory classification using an extreme learning machine for microarray gene expression cancer diagnosis,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 4, no. 3, pp. 485-495, 2007.

[11] S. Baboo and S. Sasikala, “Multicategory classification using an extreme learning machine for microarray gene expression cancer diagnosis,” in Proceedings of the IEEE International Conference on Communication Control and Computing Technologies, ICCCCT 2010, pp. 748-757, 2010.

[12] F.-C. Li, P.-K. Wang, and G.-E. Wang, “Comparison of the primitive classifiers with extreme learning machine in credit scoring,” in Proceedings of the IEEE International Conference on Industrial Engineering and Engineering Management, IEEM 2009, pp. 685-688, 2009.

[13] G. Duan, Z. Huang, and J. Wang, “Extreme learning machine for bank clients classification,” in Proceedings of the International Conference on Information Management, Innovation Management and Industrial Engineering, vol. 2, pp. 496-499, 2009.

[14] W. Deng, Q.-H. Zheng, S. Lian, and L. Chen, “Adaptive personalized recommendation based on adaptive learning,” Neurocomputing, vol. 74, no. 11, pp. 1848-1858, 2011.


[15] X.-G. Zhao, G. Wang, X. Bi, P. Gong, and Y. Zhao, “XML document classification based on ELM,” Neurocomputing, vol. 74, no. 16, pp. 2444-2451, 2011.

[16] Y. Sun, Y. Yuan, and G. Wang, “An OS-ELM based distributed ensemble classification framework in P2P networks,” Neurocomputing, vol. 74, no. 16, pp. 2438-2443, 2011.

[17] W. Deng, Q.-H. Zheng, and L. Chen, “Real-time collaborative filtering using extreme learning machine,” in Proceedings of the IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technologies, WI-IAT 2009, vol. 1, pp. 466-473, 2009.

[18] Q. J. B. Loh and S. Emmanuel, “ELM for the classification of music genres,” in Proceedings of the 9th International Conference on Control, Automation, Robotics and Vision, ICARCV 2006, pp. 1-6, 2006.

[19] Z. Man, K. Lee, D. Wang, Z. Cao, and C. Miao, “A new robust training algorithm for a class of single-hidden layer feedforward neural networks,” Neurocomputing, vol. 74, pp. 2491-2501, 2011.

[20] K. Lee, Z. Man, D. H. Wang, and Z. Cao, “Classification of bioinformatics dataset using finite impulse response extreme learning machine for cancer diagnosis,” Neural Computing & Applications, Available online 30 Jan. 2012, Doi: 10.1007/s00521-012-0847-z.

[21] J. Jin, Chinese Music. Cambridge: Cambridge University Press, 2011.

[22] S. Haykin, Neural Networks: A Comprehensive Foundation. New York: Macmillan, 1994.

[23] W. McCulloch and W. Pitts, “A logical calculus of the ideas immanent in nervous activity,” Bulletin of Mathematical Biophysics, vol. 7, pp. 115-133, 1943.


[24] J. C. Principe, N. R. Euliano, and W. C. Lefebvre, Neural and Adaptive Systems: Fundamentals through Simulations. New York: John Wiley & Sons, Inc., 1999.

[25] F. Rosenblatt, “The perceptron: A probabilistic model for information storage and organization in the brain,” Cornell Aeronautical Laboratory, Psychological Review, vol. 65, no. 6, pp. 386-408, 1958.

[26] S. Haykin, Neural Networks and Learning Machines. Third Edition, New Jersey: Pearson Prentice Hall, 2009.

[27] C. Cortes and V. Vapnik, “Support-Vector Networks,” Machine Learning, vol. 20, pp. 273-297, 1995.

[28] S. Abe, Support Vector Machines for Pattern Classification. Springer, 2005.

[29] S. Tamura and M. Tateishi, “Capabilities of a four-layered feedforward neural network: Four layers versus three,” IEEE Transactions on Neural Networks, vol. 8, no. 2, pp. 251-255, 1997.

[30] G.-B. Huang, “Learning capability and storage capacity of two-hidden-layer feedforward networks,” IEEE Transactions on Neural Networks, vol. 14, no. 2, pp. 274-281, 2003.

[31] G.-B. Huang, Q.-Y. Zhu, and C.-K. Siew, “Real-time learning capability of neural networks,” in School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, Technical Report ICIS/45/2003, Apr. 2003.

[32] G.-B. Huang, L. Chen, and C.-K. Siew, “Universal approximation using incremental feedforward networks with arbitrary input weights,” in School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, Technical Report ICIS/46/2003, Oct. 2003.

[33] G.-B. Huang, Q.-Y. Zhu, and C.-K. Siew, “Extreme learning machine: Theory and applications,” Neurocomputing, vol. 70, pp. 489-501, 2006.


[34] W. Deng, Q. Zheng, and L. Chen, “Regularized extreme learning machine,” in Proceedings of the IEEE Symposium on Computational Intelligence and Data Mining, pp. 389-395, 2009.

[35] M. Bosi and R. E. Goldberg, Introduction to digital audio coding and standards. (The Kluwer International Series in Engineering and Computer Science). New York: Springer, 2003.

[36] E. Selfridge-Field, Beyond MIDI: The Handbook of Musical Codes. MIT Press, 1997.

[37] G. Tzanetakis, G. Essl, and P. Cook, “Automatic musical genre classification of audio signals,” in Proceedings of the International Symposium of Music Information Retrieval, ISMIR, 2001.

[38] G. Tzanetakis and P. Cook, “Musical genre classification of audio signals,” IEEE Transactions on Speech and Audio Processing, vol. 10, no. 5, pp. 293-302, 2002.

[39] M. Cooper and J. Foote, “Automatic music summarization via similarity analysis,” in Proceedings of the International Conference on Music Information Retrieval, pp. 81-85, 2002.

[40] Y. Zhang and J. Zhou, “A study on content based music classification,” in Proceedings of the 7th International Symposium on Signal Processing and Its Applications, pp. 113-116, 2003.

[41] M. F. McKinney and J. Breebaart, “Features for audio and music classification,” in Proceedings of the International Conference on Music Information Retrieval, pp. 151-158, 2003.


[42] C. Xu, N. C. Maddage, X. Shao, F. Cao, and Q. Tian, “Musical genre classification using support vector machines,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 429-432, 2003.

[43] T. Li, M. Ogihara, and Q. Li, “A comparative study on content-based music genre classification,” in Proceedings of the 26th Annual ACM Conference Research and Development in Information Retrieval, SIGIR 2003, pp. 282-289, 2003.

[44] F. Gouyon, S. Dixon, E. Pampalk, and G. Widmer, “Evaluating rhythmic descriptors for musical genre classification,” in Proceedings of the AES 25th International Conference, pp. 196-20, 2004.

[45] R. Malheiro, R. Paiva, A. Mendes, T. Mendes, and A. Cardoso, “A prototype for classification of classical music using neural networks,” in Proceedings of the 8th International Conference on Artificial Intelligence and Soft Computing, pp. 294-299, 2004.

[46] G. Mitri, A. Uitdenbogerd, and V. Ciesielski, “Automatic music classification problems,” in Proceedings of the 27th Australasian Conference on Computer Science, vol. 26, pp. 315-322, 2004.

[47] C. Xu, N. C. Maddage, and X. Shao, “Automatic music classification and summarization,” IEEE Transactions on Speech and Audio Processing, vol. 13, no. 3, pp. 441-450, 2005.

[48] A. Meng and J. Shawe-Taylor, “An investigation of feature models for music genre classification using the support vector classifier,” in Proceedings of the International Conference on Music Information Retrieval, pp. 604-609, 2005.

[49] N. Scaringella, G. Zoia, and D. Mlynek, “Automatic genre classification of music content: a survey,” IEEE Signal Processing Magazine, vol. 23, no. 2, pp. 133-141, 2006.


[50] Y. Liu, J. Xu, L. Wei, and Y. Tian, “The study of the classification of Chinese folk songs by regional style,” in Proceedings of the International Conference on Semantic Computing, pp. 657-662, 2007.

[51] J. Xu, P. Wang, and L. Yan, “Feature selection for automatic classification of Chinese folk songs,” in Proceedings of the Congress on Image and Signal Processing, pp. 441-446, 2008.

[52] Y. Panagakis, E. Benetos, and C. Kotropoulos, “Music genre classification: a multilinear approach,” in Proceedings of the International Conference on Music Information Retrieval, pp. 583-588, 2008.

[53] Y. Liu, L. Wei, and P. Wang, “Regional style automatic identification for Chinese folk songs,” in Proceedings of the World Congress on Computer Science and Information Engineering, pp. 5-9, 2009.

[54] T. Langlois and G. Marques, “A music classification method based on timbral features,” in Proceedings of the International Conference on Music Information Retrieval, pp. 81-86, 2009.

[55] Y.-L. Lo, “Content-based music classification,” in Proceedings of the 3rd IEEE International Conference on Computer Science and Information Technology, vol. 2, pp. 112-116, 2010.

[56] C. Xiang and Z. Zhou, “A new music classification method based on BP neural network,” International Journal of Digital Content Technology and its Applications, vol. 5, no. 6, 2011.

[57] Z. Fu, G. Lu, K. M. Ting, and D. Zhang, “A survey of audio-based music classification and annotation,” IEEE Transactions on Multimedia, vol. 13, no. 2, pp. 303-319, 2011.


[58] Z. Fu, G. Lu, K. Ting, and D. Zhang, “Music classification via the bag-of-features approach,” Pattern Recognition Letters, vol. 32, no. 14, pp. 1768-1777, 2011.

[59] J. Salamon, B. Rocha, and E. Gomez, “Musical genre classification using melody features extracted from polyphonic music signals,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 81-84, 2012.

[60] D. Li, I. K. Sethi, N. Dimitrova, and T. McGee, “Classification of general audio data for content-based retrieval,” Pattern Recognition Letters, vol. 22, pp. 533-544, 2001.

[61] D. Manolakis and V. Ingle, Applied Digital Signal Processing. Cambridge: Cambridge University Press, 2011.

[62] L. Rabiner and B. H. Juang, Fundamental of Speech Recognition. Prentice Hall, 1993.

[63] T. Tolonen and M. Karjalainen, “A computationally efficient multipitch analysis model,” IEEE Transactions on Speech and Audio Processing, vol. 8, no. 6, pp. 708–716, 2000.

[64] M. A. Bartsch and G. H. Wakefield, “To catch a chorus: Using chroma-based representation for audio thumbnailing,” in Proceedings of the IEEE International Workshop on Applications of Signal Processing to Audio and Acoustics, pp. 15-19, Mohonk, NY, 2001.

[65] R. N. Shepard, “Circularity in judgments of relative pitch,” Journal of the Acoustical Society of America, vol. 35, pp. 2346–2353, 1964.

[66] J. Pierce, “Consonance and scales,” in P. Cook, editor, Music Cognition and Computerized Sound, pp. 167–185, MIT Press, 1999.


[67] J. Rothstein, MIDI: A Comprehensive Introduction. Madison, WI: A-R Editions, 1995.

[68] MIDI Manufacturers Association, Complete MIDI 1.0 detailed specification v96.1. Los Angeles, International MIDI Association, 2001.

[69] M. Wright and A. Freed, “Open Sound Control: A new protocol for communicating with sound synthesizers,” in Proceedings of the International Computer Music Conference, pp. 101-104, 1997.

[70] M. Wright, OpenSound Control Specification v1.0, http://archive.cnmat.berkeley.edu/OpenSoundControl/OSC-spec.html, 2002. Retrieved 26 January 2013.

[71] M. Wright, “Open Sound Control: An enabling technology for musical networking,” Organised Sound, vol. 10, no. 3, pp. 193-200, 2005.

[72] H. Hoos, K. Hamel, K. Renz, and J. Kilian, “Representing score-level music using the GUIDO music-notation format,” Computing in Musicology, vol. 12, 2001.

[73] D. Huron, Music Research Using Humdrum: A User’s Guide. Menlo Park, CA: Center for Computer Assisted Research in the Humanities, 1999.

[74] M. Good, “MusicXML for notation and analysis”, in The Virtual Score: Representation, Retrieval, Restoration, W. B. Hewlett and E. Selfridge-Field, Ed. Cambridge, MA: MIT Press, pp. 113-124, 2001.

[75] M. Good, “MusicXML: An internet-friendly format for sheet music,” in Proceedings of the XML 2001 Conference, Orlando, FL, 2001.

[76] R. B. Dannenberg, “Music representation issues, techniques, and systems,” Computer Music Journal, vol. 17, no. 3, pp. 20-30, 1993.


[77] C. McKay, “Automatic music classification with jMIR,” Ph.D. dissertation, Department of Music Research, McGill University, Montreal, 2010.

[78] H. W. Nienhuys and J. Nieuwenhuizen, “LilyPond, a system for automated music engraving,” in Proceedings of the XIV Colloquium on Musical Informatics, 2003.

[79] D. Huron, The Humdrum Toolkit: Reference Manual. Menlo Park, CA: Center for Computer Assisted Research in the Humanities, 1995.

[80] D. Huron, “Humdrum and Kern: Selective feature encoding,” in Beyond MIDI: The Handbook of Musical Codes, E. Selfridge-Field, Ed. Cambridge, Massachusetts: MIT Press, pp. 375-401, 1997.

[81] W. Chai and B. Vercoe, “Folk music classification using hidden Markov models,” in Proceedings of the International Conference on Artificial Intelligence, 2001.

[82] C. Lin, N. Liu, Y. Wu, and A. Chen, “Music classification using significant repeating patterns,” Database Systems for Advanced Applications, vol. 2973, pp. 506-518, 2004.

[83] P. León and J. Iñesta, “Musical style classification from symbolic data: a two- styles case study,” Computer Music Modeling and Retrieval, vol. 2771, pp. 166– 177, 2004.

[84] C. McKay and I. Fujinaga, “Automatic genre classification using large high-level musical feature sets,” in Proceedings of the 5th International Conference on Music Information Retrieval, 2004.

[85] Y.-P. Huang, G.-L. Guo, and C.-T. Lu, “Using back propagation model to design a MIDI music classification system,” in Proceedings of the International Computer Symposium, pp. 15-17, 2004.


[86] R. Basili, A. Serafini, and A. Stellato, “Classification of musical genre: a machine learning approach,” in Proceedings of the International Symposium on Music Information Retrieval, 2004.

[87] C. Pérez-Sancho, J. Iñesta, and J. Calera-Rubio, “Style recognition through statistical event models,” Journal of New Music Research, vol. 34, no. 4, pp. 331-339, 2005.

[88] A. Nanopoulos, Y. Manolopoulos, and I. Karydis, “Symbolic musical genre classification based on repeating patterns,” in Proceedings of the 1st ACM Workshop on Audio and Music Computing Multimedia, pp. 53-58, 2006.

[89] C. Kofod and D. O. Arroyo, “Exploring the design space of symbolic music genre classification using data mining techniques,” in Proceedings of the International Conference on Computational Intelligence for Modelling, Control and Automation, pp. 43-48, 2008.

[90] R. Hillewaere, B. Manderick, and D. Conklin, “Global feature versus event models for folk song classification,” in Proceedings of the 10th International Society for Music Information Retrieval Conference, ISMIR 2009, pp. 729-733, 2009.

[91] C. Pérez-Sancho, D. Rizo, S. Kersten, and R. Ramirez, “Genre classification of music by tonal harmony,” Intelligent Data Analysis, vol. 14, no. 5, pp. 533-545, 2010.

[92] D. Bainbridge and N. Carter, “Automatic reading of music notation,” in Handbook of character recognition and document image analysis, H. Bunke and P. Wang, Ed. Singapore: World Scientific, 1997.

[93] P. Bellini, I. Bruno, and P. Nesi, “Assessing optical music recognition tools,” Computer Music Journal, vol. 31, no. 1, pp. 68-93, 2007.


[94] P. Bellini, I. Bruno, and P. Nesi, “Optical music recognition: Architecture and algorithms,” in Interactive multimedia music technologies, K. Ng and P. Nesi, Ed. Hershey, PA: Information Science Reference, 2008.

[95] R. Rowe, Machine musicianship. Cambridge, MA: MIT Press, 2001.

[96] R. Middleton, Studying Popular Music. Philadelphia: Open University Press, 1990.

[97] R. Middleton, “Popular music analysis and musicology: Bridging the gap,” in Reading Pop: Approaches to Textual Analysis in Popular Music, R. Middleton, Ed. New York: Oxford University Press, 2000.

[98] G. Cooper and L. B. Meyer, The Rhythmic Structure of Music. Chicago: University of Chicago Press, 1960.

[99] K.-H. Han, “Folk songs of the Han Chinese: Characteristics and classifications,” Asian Music, vol. 20, no. 2, pp. 107-128, 1989.

[100] J. Miao and J. Qiao, Lun Hanzu Minge Jinshi Secaiqu de Huafen (A Study of Similar Color Area Divisions in Han Folk Songs). Beijing: Wenhua Yishu, 1987.

[101] D. Huron, “Music information processing using the Humdrum toolkit: Concepts, examples, and lessons,” Computer Music Journal, vol. 26, no. 2, pp. 11-26, 2002.

[102] H. Owen, Music Theory Resource Book. Oxford University Press, 2000.

[103] J. McKinney, The Diagnosis and Correction of Vocal Faults: A Manual for Teachers of Singing and for Choir Directors. Nashville, TN: Genovex Music Group, 1994.

[104] B. Bartók and A. B. Lord, Serbo-Croatian Folk Songs. New York: Columbia University Press, 1951.


[105] G.-B. Huang, Y.-Q. Chen, and H. A. Babri, “Classification ability of single hidden layer feedforward neural networks,” IEEE Transactions on Neural Networks, vol. 11, no. 3, pp. 799-801, 2000.

[106] P. L. Bartlett, “The sample complexity of pattern classification with neural networks: The size of the weights is more important than the size of the network,” IEEE Transactions on Information Theory, vol. 44, no. 2, pp. 525-536, 1998.

[107] G.-B. Huang and H. A. Babri, “Upper bounds on the number of hidden neurons in feedforward networks with arbitrary bounded nonlinear activation functions,” IEEE Transactions on Neural Networks, vol. 9, no. 1, pp. 224-229, 1998.

[108] G.-B. Huang, “Learning capability and storage capacity of two-hidden-layer feedforward networks,” IEEE Transactions on Neural Networks, vol. 14, no. 2, pp. 274-281, 2003.

[109] R. Bellman, Adaptive Control Processes: A Guided Tour. New Jersey: Princeton University Press, 1961.

[110] C. M. Bishop, Neural Networks for Pattern Recognition. Oxford University Press, Walton Street, Oxford, 1995.

[111] G.-B. Huang, X. Ding and H. Zhou, “Optimization method based extreme learning machine for classification,” Neurocomputing, vol. 74, pp. 155–163, 2010.

[112] V. Vapnik, Statistical Learning Theory. New York: John Wiley, 1998.

[113] M. Anthony and P. L. Bartlett, Neural Network Learning: Theoretical Foundations. Cambridge: Cambridge University Press, 1999.

[114] L. Breiman, J. Friedman, R. Olshen and C. Stone, Classification and Regression Trees. Belmont, CA: Wadsworth International, 1984.


[115] L. Devroye, L. Gyorfi and G. Lugosi, A Probabilistic Theory of Pattern Recognition. New York: Springer-Verlag, 1996.

[116] R. Duda and P. Hart, Pattern Classification and Scene Analysis. New York: John Wiley, 1973.

[117] K. Fukunaga, Introduction to Statistical Pattern Recognition. New York: Academic Press, 1972.

[118] M. Kearns and U. Vazirani, An Introduction to Computational Learning Theory. Cambridge, Massachusetts: MIT Press, 1994.

[119] J. Proakis and D. Manolakis, Digital Signal Processing. 3rd Edition, Prentice Hall, 1996.

[120] S. M. Kuo, B. H. Lee and W. Tian, Real-Time Digital Signal Processing. John Wiley & Sons Ltd, 2007.

[121] A. V. Oppenheim, R. W. Schafer and J. R. Buck, Discrete-Time Signal Processing. Prentice Hall, 1999.

[122] E. C. Ifeachor and B. W. Jervis, Digital Signal Processing: A Practical Approach. 2nd Edition, Prentice Hall, 2002.

[123] S. S. Rao, Engineering Optimization: Theory and Practice. John Wiley & Sons Inc., 1996.

[124] P. S. Iyer, Operations Research. Tata McGraw-Hill, 2008.

[125] F. S. Hillier and G. J. Lieberman, Introduction to Operations Research. McGraw Hill, 2005.


[126] M. Riedmiller and H. Braun, “A direct adaptive method for faster backpropagation learning: The RPROP algorithm,” in Proceedings of the IEEE International Conference on Neural Networks, 1993.


This page is intentionally left blank.


Appendix A

Folk Song Classification Using Audio Representation

This appendix records the results of the preliminary experiments on five-class folk song classification using audio data. The audio files are 16-bit monophonic wave files with a sampling rate of 22,050 Hz, obtained by converting the ready-made MIDI files of the 333 Han Chinese folk songs in the Essen Folksong Collection. The conversion tool employed is the Direct MIDI to MP3 Converter by Piston Software (www.pistonsoft.com/midi2mp3.html).

As folk songs vary in length, each folk song is segmented into five-second clips. For feature extraction, the analysis window size is set to 20 milliseconds with a hop size of 10 milliseconds (i.e. 50% overlap). There is a total of 1466 data samples, of which 80% are used for training and the remaining 20% for testing; hence, there are 1173 training samples and 293 testing samples.

Two types of audio features are employed: time domain features and frequency domain features. The time domain features are the root mean square (RMS), fraction of low energy windows and zero-crossing (ZC). The frequency domain features are the spectral centroid (SC), spectral roll-off (SR), spectral flux (SF) and Mel-frequency cepstral coefficients (MFCC). The experiments consist of three groups: the first group employs only the time domain features, the second group employs only the frequency domain features, and the third group uses features from both the time and frequency domains.


For each feature in all the groups (except the fraction of low energy windows feature), statistics such as the median, mean and variance are computed over all analysis windows of each 5-second clip and then used as the inputs for the machine classifier. Different combinations of these statistics are used to design the experiments within each of the three groups.
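As a concrete illustration of the framing and the per-clip statistics described above, the sketch below (hypothetical function and variable names; not the code used for these experiments) computes two of the time domain features, the RMS and the zero-crossing count, over 20 ms windows with a 10 ms hop at 22,050 Hz, and aggregates them with the median, mean and variance over a 5-second clip.

import numpy as np

def time_domain_stats(clip, fs=22050, win_ms=20, hop_ms=10):
    """Illustrative sketch (hypothetical names): RMS and zero-crossing
    counts per analysis window, aggregated over one 5-second clip."""
    win = int(fs * win_ms / 1000)               # 441 samples per 20 ms window
    hop = int(fs * hop_ms / 1000)               # 220 samples per 10 ms hop
    frames = [clip[i:i + win] for i in range(0, len(clip) - win + 1, hop)]
    rms = np.array([np.sqrt(np.mean(f ** 2)) for f in frames])
    zc = np.array([np.count_nonzero(np.diff(np.signbit(f).astype(np.int8)))
                   for f in frames])
    feats = []
    for x in (rms, zc):
        feats.extend([np.median(x), np.mean(x), np.var(x)])
    return np.array(feats)                      # one input vector for a classifier

# Example on a synthetic 5-second clip.
clip = 0.01 * np.random.randn(5 * 22050)
print(time_domain_stats(clip))                  # six statistics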

Three machine classifiers are employed to examine the performance of Han Chinese folk song classification using audio data. The first classifier is a conventional gradient descent-based single-hidden layer feedforward neural network. The learning algorithm employed is the resilient propagation (RPROP) algorithm. The second classifier is the extreme learning machine (ELM) and the third classifier is the finite impulse response extreme learning machine (FIR-ELM).

In all experiments, the hyperbolic tangent function is used to activate the hidden neurons for both the RPROP and the ELM, and the linear activation function is used for the output layer of the RPROP. Due to the random characteristics of both the RPROP and the ELM classifiers, each experiment is repeated 50 times and the classification accuracy is reported as the mean accuracy of the 50 repetitions. In the experiments using the FIR-ELM classifier, the simulations are performed with four types of FIR filter: low-pass, high-pass, band-pass and band-stop, with cutoff frequencies ωc from 0.1 to 0.9 in steps of 0.1. A bandwidth of ±0.05 is used for the band-pass and band-stop filters. In all experiments, the simulation begins with one hidden neuron in the hidden layer and the number is gradually increased to a maximum of 10,000 hidden neurons.
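For reference, a minimal ELM classifier consistent with the setup described above (random input weights and biases, hyperbolic tangent hidden neurons, least-squares output weights via the Moore-Penrose pseudoinverse) can be sketched as follows; the function names and the use of NumPy are illustrative assumptions rather than the thesis implementation.

import numpy as np

def elm_train(X, T, n_hidden, seed=0):
    """Basic ELM sketch (illustrative only): random hidden layer, tanh
    activation, output weights solved by least squares."""
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    W = rng.standard_normal((n_hidden, n_in))   # random input weights, never trained
    b = rng.standard_normal(n_hidden)           # random hidden biases
    H = np.tanh(X @ W.T + b)                    # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T                # Moore-Penrose least-squares solution
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.argmax(np.tanh(X @ W.T + b) @ beta, axis=1)

The FIR-ELM differs only in how the hidden layer is obtained: the random weights and the tanh activation are replaced by fixed FIR filter coefficients and linear neurons, and the output weights are regularized as noted below.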

It should be noted that all classification results for the FIR-ELM classifier recorded in this appendix are obtained using the four filters at a cutoff frequency of ωc = 0.5 and with d/γ = 0.001.


Table A.1: Classification accuracy (%) of the RPROP classifier using median.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 30.26 | 31.40 | 32.54
100   | 29.69 | 34.13 | 32.76
500   | 31.29 | 36.18 | 33.67
1000  | 31.63 | 35.49 | 33.22
1500  | 31.51 | 33.56 | 32.76
2000  | 31.06 | 33.45 | 32.54
2500  | 30.60 | 31.63 | 31.63
3000  | 29.69 | 30.49 | 30.57
4000  | 29.58 | 29.69 | 30.21
5000  | 29.24 | 29.47 | 29.45
8000  | 29.01 | 29.24 | 28.11
10000 | 28.21 | 28.78 | 28.09

Table A.2: Classification accuracy (%) of the RPROP classifier using mean.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 31.06 | 32.88 | 34.13
100   | 30.15 | 33.45 | 34.24
500   | 29.58 | 34.33 | 35.38
1000  | 29.47 | 36.41 | 38.45
1500  | 29.58 | 34.58 | 38.57
2000  | 28.33 | 32.08 | 32.31
2500  | 28.78 | 31.51 | 31.50
3000  | 29.81 | 30.38 | 30.19
4000  | 29.47 | 29.69 | 29.58
5000  | 29.49 | 27.87 | 29.16
8000  | 29.01 | 27.87 | 28.97
10000 | 28.73 | 26.85 | 27.91


Table A.3: Classification accuracy (%) of the RPROP classifier using variance.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 32.08 | 31.17 | 31.29
100   | 31.06 | 31.63 | 30.94
500   | 30.60 | 31.51 | 31.51
1000  | 30.26 | 30.72 | 31.97
1500  | 30.26 | 26.05 | 30.86
2000  | 29.69 | 27.30 | 30.15
2500  | 29.24 | 26.39 | 29.57
3000  | 30.03 | 25.37 | 29.00
4000  | 29.69 | 25.37 | 28.67
5000  | 29.24 | 24.23 | 27.31
8000  | 28.78 | 23.66 | 25.60
10000 | 28.12 | 23.44 | 24.43

Table A.4: Classification accuracy (%) of the RPROP classifier using median and mean.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 31.85 | 30.83 | 31.74
100   | 32.08 | 32.08 | 31.51
500   | 31.97 | 34.70 | 34.58
1000  | 31.51 | 34.02 | 35.27
1500  | 31.29 | 32.65 | 32.65
2000  | 31.17 | 32.31 | 32.54
2500  | 30.94 | 31.63 | 32.33
3000  | 30.83 | 30.49 | 31.80
4000  | 30.38 | 30.12 | 31.49
5000  | 30.26 | 29.58 | 30.91
8000  | 30.17 | 29.46 | 29.85
10000 | 29.47 | 28.94 | 27.90


Table A.5: Classification accuracy (%) of the RPROP classifier using median and variance.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 35.95 | 30.60 | 30.26
100   | 35.15 | 34.13 | 33.33
500   | 33.56 | 35.38 | 32.88
1000  | 34.24 | 31.51 | 32.20
1500  | 33.90 | 30.83 | 30.72
2000  | 33.33 | 29.54 | 30.15
2500  | 32.42 | 28.66 | 29.84
3000  | 32.88 | 28.31 | 29.13
4000  | 32.76 | 27.42 | 28.86
5000  | 32.20 | 26.73 | 28.01
8000  | 31.63 | 26.85 | 27.94
10000 | 30.72 | 25.88 | 26.30

Table A.6: Classification accuracy (%) of the RPROP classifier using mean and variance.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 34.58 | 33.56 | 30.72
100   | 34.47 | 34.36 | 35.49
500   | 33.33 | 34.47 | 33.90
1000  | 32.42 | 32.31 | 32.85
1500  | 33.67 | 31.51 | 31.10
2000  | 32.20 | 28.78 | 29.12
2500  | 32.08 | 27.65 | 29.57
3000  | 31.51 | 26.96 | 28.64
4000  | 31.40 | 26.15 | 28.10
5000  | 31.29 | 25.99 | 27.65
8000  | 30.83 | 25.45 | 27.11
10000 | 30.49 | 24.65 | 26.91


Table A.7: Classification accuracy (%) of the RPROP classifier using median, mean and variance.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 35.72 | 32.76 | 27.99
100   | 34.93 | 34.36 | 33.69
500   | 34.02 | 33.11 | 34.24
1000  | 35.49 | 32.99 | 31.85
1500  | 34.36 | 30.94 | 32.30
2000  | 34.36 | 29.58 | 31.83
2500  | 34.70 | 28.94 | 30.65
3000  | 33.67 | 28.16 | 29.90
4000  | 33.56 | 27.83 | 28.21
5000  | 33.22 | 26.89 | 27.08
8000  | 33.56 | 26.73 | 26.99
10000 | 32.76 | 25.90 | 25.51

Table A.8: Classification accuracy (%) of the ELM classifier using median.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 30.44 | 30.31 | 30.03
100   | 31.74 | 30.72 | 31.19
500   | 31.26 | 31.40 | 32.01
1000  | 31.26 | 32.08 | 33.21
1500  | 30.92 | 35.49 | 38.98
2000  | 32.01 | 31.54 | 35.58
2500  | 32.90 | 31.19 | 34.78
3000  | 33.24 | 30.27 | 33.34
4000  | 32.08 | 29.83 | 32.83
5000  | 31.95 | 29.83 | 31.13
8000  | 31.54 | 28.19 | 30.83
10000 | 30.79 | 27.51 | 30.58


Table A.9: Classification accuracy (%) of the ELM classifier using mean.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 30.79 | 30.31 | 29.35
100   | 31.06 | 31.06 | 30.99
500   | 31.13 | 32.63 | 31.98
1000  | 31.40 | 34.33 | 32.01
1500  | 30.58 | 37.00 | 38.43
2000  | 29.83 | 35.80 | 35.80
2500  | 29.42 | 34.20 | 34.85
3000  | 29.08 | 32.89 | 33.52
4000  | 29.01 | 31.19 | 32.70
5000  | 28.94 | 30.65 | 31.60
8000  | 28.81 | 29.08 | 30.38
10000 | 28.60 | 28.65 | 29.35

Table A.10: Classification accuracy (%) of the ELM classifier using variance.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 28.26 | 29.08 | 29.69
100   | 29.01 | 30.85 | 30.79
500   | 29.08 | 31.74 | 31.06
1000  | 28.94 | 32.29 | 32.97
1500  | 28.74 | 36.11 | 35.97
2000  | 28.67 | 35.18 | 33.31
2500  | 28.46 | 34.34 | 33.04
3000  | 28.46 | 32.92 | 32.66
4000  | 27.65 | 31.47 | 31.95
5000  | 27.24 | 31.47 | 31.13
8000  | 27.24 | 30.58 | 29.69
10000 | 27.10 | 29.42 | 29.35


Table A.11: Classification accuracy (%) of the ELM classifier using median and mean.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 32.63 | 31.67 | 30.22
100   | 30.72 | 32.63 | 30.72
500   | 28.74 | 33.11 | 32.35
1000  | 28.67 | 36.04 | 34.13
1500  | 28.67 | 39.52 | 39.22
2000  | 28.26 | 38.89 | 38.84
2500  | 28.05 | 35.62 | 36.25
3000  | 27.71 | 34.98 | 34.74
4000  | 27.44 | 33.24 | 33.99
5000  | 27.37 | 33.24 | 33.69
8000  | 27.24 | 32.97 | 32.35
10000 | 27.11 | 31.60 | 31.47

Table A.12: Classification accuracy (%) of the ELM classifier using median and variance.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 30.44 | 31.74 | 30.44
100   | 30.31 | 32.22 | 32.63
500   | 30.24 | 32.63 | 34.54
1000  | 30.17 | 36.21 | 35.56
1500  | 30.03 | 37.82 | 39.66
2000  | 29.90 | 36.72 | 39.28
2500  | 29.90 | 34.68 | 38.12
3000  | 29.56 | 33.65 | 35.97
4000  | 29.56 | 33.42 | 35.36
5000  | 29.49 | 32.26 | 34.47
8000  | 29.42 | 31.21 | 32.49
10000 | 29.01 | 30.61 | 30.44


Table A.13: Classification accuracy (%) of the ELM classifier using mean and variance.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 30.10 | 29.15 | 31.95
100   | 30.24 | 30.94 | 33.38
500   | 30.85 | 32.63 | 34.16
1000  | 30.79 | 32.97 | 35.22
1500  | 30.72 | 33.92 | 35.29
2000  | 30.58 | 38.63 | 38.57
2500  | 28.94 | 35.29 | 38.26
3000  | 28.40 | 34.08 | 35.97
4000  | 28.40 | 33.65 | 35.56
5000  | 28.19 | 33.31 | 35.15
8000  | 27.65 | 33.24 | 34.26
10000 | 27.44 | 31.26 | 32.90

Table A.14: Classification accuracy (%) of the ELM classifier using median, mean and variance.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 29.90 | 28.26 | 30.79
100   | 30.15 | 30.85 | 31.81
500   | 32.22 | 33.45 | 33.38
1000  | 31.83 | 34.16 | 35.56
1500  | 30.95 | 35.15 | 36.01
2000  | 30.65 | 37.13 | 37.44
2500  | 30.16 | 40.89 | 39.80
3000  | 29.90 | 37.34 | 36.66
4000  | 29.39 | 35.85 | 36.18
5000  | 28.96 | 35.77 | 34.68
8000  | 28.37 | 34.85 | 33.72
10000 | 27.85 | 34.58 | 32.35


Table A.15: Classification accuracy (%) of the low-pass FIR-ELM classifier using median.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 26.96 | 34.47 | 35.15
100   | 30.72 | 34.13 | 33.45
500   | 28.33 | 35.49 | 33.11
1000  | 27.99 | 34.47 | 35.15
1500  | 27.99 | 34.47 | 35.49
2000  | 27.30 | 34.47 | 36.18
2500  | 26.96 | 34.47 | 35.84
3000  | 26.96 | 34.47 | 35.84
4000  | 26.96 | 34.13 | 35.84
5000  | 26.96 | 34.13 | 35.84
8000  | 26.96 | 34.13 | 35.49
10000 | 26.96 | 34.13 | 35.49

Table A.16: Classification accuracy (%) of the low-pass FIR-ELM classifier using mean.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 26.62 | 34.13 | 33.79
100   | 30.72 | 33.79 | 33.45
500   | 28.67 | 35.84 | 35.15
1000  | 27.99 | 36.52 | 35.15
1500  | 27.30 | 35.84 | 36.18
2000  | 27.30 | 35.49 | 36.18
2500  | 26.96 | 35.49 | 36.86
3000  | 26.96 | 35.49 | 37.20
4000  | 26.62 | 35.49 | 37.20
5000  | 26.62 | 35.49 | 37.20
8000  | 26.62 | 35.49 | 36.86
10000 | 26.62 | 35.49 | 36.18


Table A.17: Classification accuracy (%) of the low-pass FIR-ELM classifier using variance.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 31.74 | 37.88 | 35.84
100   | 32.08 | 35.84 | 33.11
500   | 31.74 | 34.47 | 36.52
1000  | 31.74 | 34.13 | 36.52
1500  | 31.74 | 34.13 | 36.52
2000  | 31.74 | 34.47 | 36.18
2500  | 31.74 | 34.47 | 36.18
3000  | 31.74 | 34.47 | 36.18
4000  | 31.74 | 34.47 | 36.18
5000  | 31.74 | 34.47 | 36.18
8000  | 31.74 | 34.47 | 36.18
10000 | 31.74 | 34.47 | 36.18

Table A.18: Classification accuracy (%) of the low-pass FIR-ELM classifier using median and mean.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 25.26 | 32.42 | 35.15
100   | 30.03 | 33.11 | 34.81
500   | 30.03 | 34.81 | 38.23
1000  | 29.69 | 37.20 | 38.23
1500  | 29.35 | 37.54 | 37.88
2000  | 29.35 | 37.54 | 37.88
2500  | 29.35 | 38.23 | 37.54
3000  | 29.01 | 37.20 | 37.54
4000  | 29.01 | 37.20 | 37.20
5000  | 29.01 | 36.86 | 36.86
8000  | 28.67 | 36.86 | 36.86
10000 | 28.67 | 36.86 | 36.86


Table A.19: Classification accuracy (%) of the low-pass FIR-ELM classifier using median and variance.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 30.38 | 35.15 | 36.82
100   | 31.06 | 35.15 | 35.84
500   | 33.11 | 36.52 | 36.86
1000  | 32.76 | 38.23 | 37.20
1500  | 32.76 | 36.86 | 37.20
2000  | 32.76 | 36.86 | 37.20
2500  | 32.76 | 36.86 | 37.20
3000  | 32.08 | 36.86 | 37.20
4000  | 31.74 | 36.86 | 36.86
5000  | 31.40 | 36.52 | 36.86
8000  | 31.06 | 36.18 | 36.86
10000 | 30.38 | 36.18 | 36.52

Table A.20: Classification accuracy (%) of the low-pass FIR-ELM classifier using mean and variance.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 30.38 | 35.49 | 37.20
100   | 31.06 | 36.52 | 38.23
500   | 32.08 | 37.54 | 38.23
1000  | 31.74 | 38.23 | 37.88
1500  | 31.74 | 37.88 | 37.88
2000  | 31.74 | 37.88 | 37.88
2500  | 31.06 | 37.88 | 37.88
3000  | 31.06 | 37.88 | 37.88
4000  | 31.06 | 37.54 | 37.54
5000  | 31.06 | 37.54 | 37.54
8000  | 31.06 | 37.54 | 37.54
10000 | 30.38 | 37.54 | 37.20


Table A.21: Classification accuracy (%) of the low-pass FIR-ELM classifier using median, mean and variance.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 31.40 | 38.23 | 34.81
100   | 32.08 | 38.91 | 36.86
500   | 33.79 | 39.25 | 39.21
1000  | 33.79 | 39.93 | 41.30
1500  | 33.45 | 39.93 | 40.96
2000  | 33.45 | 39.59 | 40.96
2500  | 33.45 | 39.25 | 40.61
3000  | 33.11 | 39.25 | 39.93
4000  | 33.11 | 39.25 | 39.59
5000  | 32.76 | 39.25 | 39.25
8000  | 32.76 | 39.25 | 39.25
10000 | 32.08 | 39.25 | 38.91

Table A.22: Classification accuracy (%) of the high-pass FIR-ELM classifier using median.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 26.96 | 32.08 | 31.40
100   | 27.30 | 33.79 | 32.76
500   | 30.72 | 34.81 | 35.15
1000  | 27.99 | 34.47 | 36.52
1500  | 27.99 | 34.81 | 36.18
2000  | 27.30 | 34.81 | 35.84
2500  | 27.30 | 34.81 | 35.84
3000  | 26.96 | 34.81 | 35.84
4000  | 26.96 | 34.81 | 35.84
5000  | 26.96 | 34.47 | 35.49
8000  | 26.96 | 33.79 | 35.49
10000 | 26.96 | 33.79 | 35.49


Table A.23: Classification accuracy (%) of the high-pass FIR-ELM classifier using mean.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 26.62 | 34.81 | 34.13
100   | 27.99 | 34.81 | 33.45
500   | 30.38 | 35.49 | 35.49
1000  | 27.99 | 35.49 | 36.52
1500  | 27.30 | 35.49 | 36.18
2000  | 27.30 | 35.49 | 35.84
2500  | 27.30 | 35.49 | 35.84
3000  | 26.62 | 35.49 | 35.49
4000  | 26.62 | 34.81 | 35.49
5000  | 26.62 | 34.81 | 35.49
8000  | 26.62 | 34.47 | 35.49
10000 | 26.62 | 34.47 | 35.49

Table A.24: Classification accuracy (%) of the high-pass FIR-ELM classifier using variance.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 31.74 | 34.13 | 33.45
100   | 32.08 | 36.18 | 35.15
500   | 31.74 | 34.81 | 36.18
1000  | 31.74 | 34.47 | 36.52
1500  | 31.74 | 34.47 | 36.18
2000  | 31.74 | 34.47 | 36.18
2500  | 31.74 | 34.47 | 36.18
3000  | 31.74 | 34.13 | 36.18
4000  | 31.74 | 34.13 | 36.18
5000  | 31.74 | 34.13 | 36.18
8000  | 31.74 | 34.13 | 36.18
10000 | 31.74 | 34.13 | 36.18


Table A.25: Classification accuracy (%) of the high-pass FIR-ELM classifier using median and mean.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 25.26 | 33.79 | 35.49
100   | 29.01 | 35.84 | 35.15
500   | 30.38 | 37.20 | 37.88
1000  | 30.03 | 37.54 | 37.88
1500  | 30.03 | 37.20 | 37.88
2000  | 30.03 | 37.20 | 37.54
2500  | 29.69 | 36.86 | 37.54
3000  | 29.01 | 36.86 | 37.54
4000  | 28.67 | 36.86 | 37.54
5000  | 28.33 | 36.52 | 37.54
8000  | 28.33 | 36.18 | 36.86
10000 | 27.30 | 35.49 | 36.52

Table A.26: Classification accuracy (%) of the high-pass FIR-ELM classifier using median and variance.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 30.38 | 35.15 | 37.20
100   | 31.74 | 35.15 | 37.54
500   | 33.11 | 38.23 | 37.20
1000  | 32.76 | 37.20 | 37.20
1500  | 32.76 | 36.86 | 37.20
2000  | 32.76 | 36.86 | 36.86
2500  | 32.42 | 36.86 | 36.86
3000  | 32.08 | 36.52 | 36.86
4000  | 31.74 | 36.52 | 36.86
5000  | 31.11 | 36.52 | 36.86
8000  | 31.11 | 36.18 | 36.52
10000 | 30.38 | 36.18 | 36.52


Table A.27: Classification accuracy (%) of the high-pass FIR-ELM classifier using mean and variance.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 30.38 | 35.49 | 36.18
100   | 30.72 | 37.20 | 38.23
500   | 31.40 | 37.54 | 37.88
1000  | 31.74 | 37.88 | 37.54
1500  | 31.74 | 37.88 | 37.54
2000  | 31.40 | 37.88 | 37.54
2500  | 31.06 | 37.88 | 37.54
3000  | 31.06 | 37.88 | 37.20
4000  | 31.06 | 37.88 | 37.20
5000  | 31.06 | 37.88 | 36.86
8000  | 31.06 | 37.54 | 36.86
10000 | 31.06 | 37.20 | 36.52

Table A.28: Classification accuracy (%) of the high-pass FIR-ELM classifier using median, mean and variance.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 29.35 | 36.18 | 36.18
100   | 30.74 | 38.57 | 38.23
500   | 33.11 | 38.57 | 38.91
1000  | 34.13 | 38.91 | 40.96
1500  | 33.11 | 40.51 | 40.96
2000  | 33.11 | 40.27 | 40.61
2500  | 32.76 | 40.27 | 40.61
3000  | 32.42 | 39.59 | 40.61
4000  | 32.42 | 39.59 | 40.61
5000  | 32.42 | 39.59 | 40.27
8000  | 31.74 | 39.59 | 40.27
10000 | 31.74 | 39.59 | 39.59


Table A.29: Classification accuracy (%) of the band-pass FIR-ELM classifier using median.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 26.96 | 33.45 | 36.18
100   | 26.96 | 34.81 | 36.86
500   | 30.72 | 33.79 | 36.18
1000  | 27.99 | 33.79 | 36.18
1500  | 27.30 | 33.79 | 36.18
2000  | 27.30 | 33.79 | 35.84
2500  | 26.96 | 33.79 | 35.84
3000  | 26.96 | 33.45 | 35.84
4000  | 26.96 | 33.45 | 35.84
5000  | 26.96 | 33.45 | 35.49
8000  | 26.96 | 33.11 | 35.49
10000 | 26.96 | 33.11 | 35.49

Table A.30: Classification accuracy (%) of the band-pass FIR-ELM classifier using mean.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 26.62 | 34.13 | 35.15
100   | 26.62 | 35.84 | 35.84
500   | 30.38 | 36.18 | 37.20
1000  | 27.99 | 35.84 | 36.86
1500  | 27.30 | 35.84 | 36.52
2000  | 27.30 | 35.49 | 36.52
2500  | 26.96 | 35.49 | 36.52
3000  | 26.96 | 35.49 | 36.52
4000  | 26.96 | 35.49 | 36.18
5000  | 26.62 | 35.49 | 36.18
8000  | 26.62 | 35.15 | 36.18
10000 | 26.62 | 35.15 | 36.18


Table A.31: Classification accuracy (%) of the band-pass FIR-ELM classifier using variance.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 31.74 | 34.47 | 35.15
100   | 31.74 | 34.47 | 35.49
500   | 31.74 | 34.13 | 36.52
1000  | 31.74 | 34.13 | 36.18
1500  | 31.74 | 34.13 | 36.18
2000  | 31.74 | 34.13 | 36.18
2500  | 31.74 | 34.13 | 36.18
3000  | 31.74 | 34.13 | 36.18
4000  | 31.74 | 34.13 | 36.18
5000  | 31.74 | 34.13 | 36.18
8000  | 31.74 | 34.13 | 36.18
10000 | 31.74 | 34.13 | 36.18

Table A.32: Classification accuracy (%) of the band-pass FIR-ELM classifier using median and mean.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 25.26 | 36.52 | 33.45
100   | 26.96 | 36.86 | 36.18
500   | 29.01 | 38.23 | 38.91
1000  | 30.03 | 38.23 | 39.93
1500  | 30.03 | 38.23 | 39.93
2000  | 30.03 | 38.23 | 39.93
2500  | 30.03 | 37.88 | 39.25
3000  | 30.03 | 37.88 | 39.25
4000  | 29.35 | 37.88 | 39.25
5000  | 29.35 | 37.88 | 38.91
8000  | 29.01 | 37.20 | 38.23
10000 | 28.67 | 37.20 | 37.20


Table A.33: Classification accuracy (%) of the band-pass FIR-ELM classifier using median and variance.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 32.08 | 36.52 | 35.84
100   | 32.08 | 37.20 | 36.52
500   | 32.76 | 38.57 | 37.88
1000  | 32.76 | 38.57 | 37.20
1500  | 32.76 | 38.57 | 36.86
2000  | 32.76 | 38.57 | 36.86
2500  | 32.42 | 38.57 | 36.18
3000  | 32.42 | 38.57 | 36.18
4000  | 32.08 | 38.57 | 35.84
5000  | 32.08 | 38.23 | 35.84
8000  | 32.08 | 38.23 | 35.49
10000 | 32.08 | 37.88 | 35.49

Table A.34: Classification accuracy (%) of the band-pass FIR-ELM classifier using mean and variance.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 31.40 | 36.86 | 36.86
100   | 32.08 | 37.54 | 37.20
500   | 32.42 | 38.57 | 37.54
1000  | 32.08 | 38.57 | 37.20
1500  | 32.08 | 38.57 | 36.52
2000  | 32.08 | 38.57 | 36.52
2500  | 32.08 | 38.23 | 36.18
3000  | 31.74 | 38.23 | 36.18
4000  | 31.40 | 38.23 | 35.84
5000  | 31.40 | 37.88 | 35.84
8000  | 31.06 | 37.88 | 35.49
10000 | 31.06 | 37.88 | 35.15


Table A.35: Classification accuracy (%) of the band-pass FIR-ELM classifier using median, mean and variance.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 29.69 | 37.88 | 37.20
100   | 30.38 | 38.57 | 39.25
500   | 32.08 | 40.96 | 40.96
1000  | 34.47 | 41.30 | 40.96
1500  | 33.79 | 40.96 | 40.61
2000  | 33.79 | 40.96 | 40.27
2500  | 33.79 | 40.61 | 39.93
3000  | 33.79 | 40.61 | 39.93
4000  | 33.45 | 39.93 | 39.59
5000  | 33.45 | 39.93 | 39.59
8000  | 33.11 | 39.59 | 39.25
10000 | 33.11 | 38.57 | 39.25

Table A.36: Classification accuracy (%) of the band-stop FIR-ELM classifier using median.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 26.96 | 28.26 | 29.01
100   | 26.96 | 31.67 | 35.02
500   | 26.96 | 34.40 | 37.88
1000  | 26.96 | 39.52 | 40.89
1500  | 26.96 | 40.61 | 41.71
2000  | 26.96 | 41.37 | 42.66
2500  | 26.96 | 41.98 | 41.91
3000  | 26.96 | 41.16 | 41.71
4000  | 26.96 | 41.02 | 40.27
5000  | 26.96 | 40.89 | 40.00
8000  | 26.96 | 39.93 | 39.86
10000 | 26.96 | 38.29 | 39.39


Table A.37: Classification accuracy (%) of the band-stop FIR-ELM classifier using mean.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 26.62 | 27.44 | 30.17
100   | 26.62 | 29.83 | 31.60
500   | 26.62 | 33.45 | 34.13
1000  | 26.62 | 37.54 | 37.82
1500  | 26.62 | 39.04 | 40.27
2000  | 26.62 | 41.09 | 43.82
2500  | 26.62 | 43.41 | 41.77
3000  | 26.62 | 42.25 | 41.57
4000  | 26.62 | 41.30 | 41.09
5000  | 26.62 | 41.09 | 40.55
8000  | 26.62 | 40.55 | 40.27
10000 | 26.62 | 39.32 | 39.66

Table A.38: Classification accuracy (%) of the band-stop FIR-ELM classifier using variance.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 31.74 | 26.76 | 28.67
100   | 31.74 | 29.22 | 31.54
500   | 31.74 | 36.11 | 37.20
1000  | 31.74 | 38.02 | 38.43
1500  | 31.74 | 39.25 | 39.32
2000  | 31.74 | 40.27 | 39.93
2500  | 31.74 | 39.59 | 39.93
3000  | 31.74 | 39.39 | 39.66
4000  | 31.74 | 39.25 | 39.32
5000  | 31.74 | 38.77 | 38.63
8000  | 31.74 | 38.43 | 38.50
10000 | 31.74 | 38.02 | 37.95


Table A.39: Classification accuracy (%) of the band-stop FIR-ELM classifier using median and mean.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 26.28 | 29.62 | 28.19
100   | 26.28 | 31.13 | 31.33
500   | 26.28 | 36.59 | 35.49
1000  | 26.96 | 39.32 | 40.48
1500  | 26.96 | 40.96 | 41.71
2000  | 26.96 | 44.03 | 44.10
2500  | 29.01 | 43.75 | 43.75
3000  | 27.65 | 43.62 | 43.41
4000  | 26.96 | 43.28 | 42.53
5000  | 26.96 | 42.12 | 42.39
8000  | 26.96 | 41.64 | 42.32
10000 | 26.96 | 41.23 | 42.05

Table A.40: Classification accuracy (%) of the band-stop FIR-ELM classifier using median and variance.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 31.40 | 31.47 | 30.44
100   | 31.40 | 37.06 | 33.45
500   | 31.40 | 39.53 | 38.91
1000  | 31.40 | 40.14 | 39.80
1500  | 31.40 | 41.77 | 41.16
2000  | 31.40 | 43.55 | 43.41
2500  | 31.74 | 43.28 | 43.07
3000  | 31.74 | 42.18 | 42.46
4000  | 31.40 | 42.12 | 42.39
5000  | 31.40 | 41.84 | 41.50
8000  | 31.06 | 41.57 | 40.14
10000 | 31.06 | 40.14 | 40.00


Table A.41: Classification accuracy (%) of the band-stop FIR-ELM classifier using mean and variance.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 31.40 | 31.95 | 31.67
100   | 31.40 | 32.01 | 34.47
500   | 31.74 | 37.95 | 38.29
1000  | 31.74 | 40.68 | 40.20
1500  | 31.74 | 42.32 | 43.21
2000  | 31.74 | 43.41 | 43.96
2500  | 32.08 | 42.25 | 43.82
3000  | 31.40 | 41.71 | 43.34
4000  | 31.40 | 41.37 | 42.39
5000  | 31.06 | 41.30 | 42.05
8000  | 30.72 | 40.00 | 41.91
10000 | 30.72 | 38.16 | 40.75

Table A.42: Classification accuracy (%) of the band-stop FIR-ELM classifier using median, mean and variance.

Number of Hidden Neurons | Time Domain Features | Frequency Domain Features | Time & Frequency Domain Features
50    | 30.03 | 31.47 | 32.01
100   | 30.72 | 35.70 | 38.29
500   | 30.72 | 39.11 | 39.25
1000  | 30.72 | 41.91 | 40.20
1500  | 30.72 | 42.73 | 41.84
2000  | 32.42 | 44.10 | 44.64
2500  | 32.42 | 43.69 | 43.07
3000  | 31.40 | 43.69 | 42.80
4000  | 31.06 | 43.62 | 42.59
5000  | 30.72 | 42.66 | 41.57
8000  | 30.38 | 42.59 | 40.20
10000 | 30.38 | 41.68 | 40.00


This page is intentionally left blank.

List of Publications

1. S. Khoo, Z. Man and Z. Cao, “Automatic Han Chinese folk song classification using extreme learning machines,” 25th Australasian Joint Conference on Artificial Intelligence, AI 2012, pp. 49-60, 4-7 Dec. 2012.

2. S. Khoo, Z. Man and Z. Cao, “Automatic Han Chinese folk song classification using the musical feature density map,” 6th International Conference on Signal Processing and Communication Systems, ICSPCS 2012, 12-14 Dec. 2012.

3. S. Khoo, Z. Man, Z. Cao and J. Zheng, “German vs. Austrian folk song classification,” 8th IEEE Conference on Industrial Electronics and Applications, ICIEA 2013, 19-21 Jun. 2013.
