An Assessment of Psychoacoustical Models in The
Abstract

A simple system for recognizing music is presented, based on various musical descriptors, numbers that describe some aspect of the music. Various descriptors are discussed; in particular, a novel descriptor, the floor-1 cepstral coefficient (F1CC) measure, a refinement of MFCCs based on the Vorbis psychoacoustical model, is presented and evaluated. Also, various forms of statistical dimensionality reduction, among them PCA and LDA, are considered in the present context. Finally, a few directions for future work are discussed.

Acknowledgments

First of all, I would like to thank my advisor Jan Tro, who patiently provided feedback and guidance over the course of the entire semester. However, several other people have played important roles: Greg Maxwell originally proposed the idea that eventually led to the development of F1CCs, and Chris Montgomery provided helpful guidance on the internals of the Vorbis encoder. Mette Langaas helped with various insights on statistics, in particular dimensionality reduction. Håvard Midtkil provided his entire music collection in FLAC format as data material, saving countless hours of ripping labor. Finally, Rune Holm and Magne Mæhre proofread the manuscript at various stages, providing invaluable feedback, corrections and suggestions.

Contents

Abstract
Acknowledgments
Contents
1 Introduction
1.1 Music Information Retrieval
1.2 Aim of study
1.3 Structure
1.4 Previous work
2 Audio descriptors
2.1 Motivation
2.2 Formal description
2.3 Desired properties
2.4 Distortion and noise
2.5 Choice of source fragment
2.6 Basic musical descriptors
2.7 Human descriptors
3 Mel frequency cepstral coefficients (MFCC)
3.1 Psychoacoustical motivation