Music Data Mining Presents a Variety of Approaches to Successfully Employ Data Mining Techniques for the Purpose of Music Processing
Total Page:16
File Type:pdf, Size:1020Kb
Computer Science Chapman & Hall/CRC Music Data Mining Chapman & Hall/CRC Data Mining and Knowledge Discovery Series Data Mining and Knowledge Discovery Series The research area of music information retrieval has gradually evolved to address the challenges of effectively accessing and interacting large collections of music and associated data, such as styles, artists, Music Data lyrics, and reviews. Bringing together an interdisciplinary array of top researchers, Music Data Mining presents a variety of approaches to successfully employ data mining techniques for the purpose of music processing. Mining The book first covers music data mining tasks and algorithms and audio feature extraction, providing a framework for subsequent chapters. With a focus on data classification, it then describes a computational approach inspired by human auditory perception and examines instrument recognition, the effects of music on moods and emotions, and the connections between power laws and music aesthetics. Given the importance of social aspects in understanding music, the text addresses the use of the Web and peer-to-peer networks for both music data mining and evaluating music mining tasks and algorithms. It also discusses indexing with tags and explains how data can be collected using online human computation games. The final chapters offer a balanced exploration of hit song science as well as a look at symbolic musicology and data mining. and Tzanetakis and The multifaceted nature of music information often requires Ogihara, Li, algorithms and systems using sophisticated signal processing and machine-learning techniques to better extract useful information. An excellent introduction to the field, this volume presents state-of- the-art techniques in music data mining and information retrieval to create novel ways of interacting with large music collections. E D I T E D B Y Tao Li Mitsunori Ogihara K11597 George Tzanetakis K11597_Cover.indd 1 6/6/11 11:16 AM Music Data Mining E D I T E D B Y Tao Li Mitsunori Ogihara George Tzanetakis This page intentionally left blank CRC Press Taylor & Francis Group 6000 Broken Sound Parkway NW, Suite 300 Boca Raton, FL 33487-2742 CRC Press is an imprint of Taylor & Francis Group, an Informa business No claim to original U.S. Government works Printed in the United States of America on acid-free paper Version Date: 20110603 International Standard Book Number: 978-1-4398-3552-4 (Hardback) This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint. Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information stor- age or retrieval system, without written permission from the publishers. For permission to photocopy or use material electronically from this work, please access www.copy- right.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that pro- vides licenses and registration for a variety of users. For organizations that have been granted a pho- tocopy license by the CCC, a separate system of payment has been arranged. Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe. Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the CRC Press Web site at http://www.crcpress.com This page intentionally left blank Contents List of Figures xiii List of Tables xvii Preface xix List of Contributors xxiii I Fundamental Topics 1 1 Music Data Mining: An Introduction 3 Tao Li and Lei Li 1.1 Music Data Sources . 4 1.2 An Introduction to Data Mining . 7 1.2.1 Data Management . 8 1.2.2 Data Preprocessing . 9 1.2.3 Data Mining Tasks and Algorithms . 10 1.2.3.1 Data Visualization . 10 1.2.3.2 Association Mining . 11 1.2.3.3 Sequence Mining . 11 1.2.3.4 Classification . 11 1.2.3.5 Clustering . 12 1.2.3.6 Similarity Search . 12 1.3 Music Data Mining . 13 1.3.1 Overview . 13 1.3.2 Music Data Management . 14 1.3.3 Music Visualization . 16 1.3.4 Music Information Retrieval . 17 1.3.5 Association Mining . 19 1.3.6 Sequence Mining . 19 1.3.7 Classification . 20 1.3.8 Clustering . 25 1.3.9 Music Summarization . 26 1.3.10 Advanced Music Data Mining Tasks . 27 1.4 Conclusion . 29 Bibliography . 31 v vi 2 Audio Feature Extraction 43 George Tzanetakis 2.1 Audio Representations . 44 2.1.1 The Short-Time Fourier Transform . 45 2.1.2 Filterbanks, Wavelets, and Other Time-Frequency Representations . 50 2.2 Timbral Texture Features . 51 2.2.1 Spectral Features . 51 2.2.2 Mel-Frequency Cepstral Coefficients . 52 2.2.3 Other Timbral Features . 52 2.2.4 Temporal Summarization . 53 2.2.5 Song-Level Modeling . 56 2.3 Rhythm Features . 57 2.3.1 Onset Strength Signal . 59 2.3.2 Tempo Induction and Beat Tracking . 60 2.3.3 Rhythm Representations . 62 2.4 Pitch/Harmony Features . 64 2.5 Other Audio Features . 65 2.6 Musical Genre Classification of Audio Signals . 66 2.7 Software Resources . 68 2.8 Conclusion . 69 Bibliography . 69 II Classification 75 3 Auditory Sparse Coding 77 Steven R. Ness, Thomas C. Walters, and Richard F. Lyon 3.1 Introduction . 78 3.1.1 The Stabilized Auditory Image . 80 3.2 Algorithm . 81 3.2.1 Pole{Zero Filter Cascade . 81 3.2.2 Image Stabilization . 83 3.2.3 Box Cutting . 83 3.2.4 Vector Quantization . 83 3.2.5 Machine Learning . 84 3.3 Experiments . 85 3.3.1 Sound Ranking . 85 3.3.2 MIREX 2010 . 89 3.4 Conclusion . 91 Bibliography . 92 4 Instrument Recognition 95 Jayme Garcia Arnal Barbedo 4.1 Introduction . 96 4.2 Scope Delimitation . 97 vii 4.2.1 Pitched and Unpitched Instruments . 97 4.2.2 Signal Complexity . 97 4.2.3 Number of Instruments . 100 4.3 Problem Basics . 102 4.3.1 Signal Segmentation . 102 4.3.2 Feature Extraction . 103 4.3.3 Classification Procedure . 105 4.3.3.1 Classification Systems . 105 4.3.3.2 Hierarchical and Flat Classifications . 106 4.3.4 Analysis and Presentation of Results . 108 4.4 Proposed Solutions . 111 4.4.1 Monophonic Case . 111 4.4.2 Polyphonic Case . 119 4.4.3 Other Relevant Work . 125 4.5 Future Directions . 125 Bibliography . 127 5 Mood and Emotional Classification 135 Mitsunori Ogihara and Youngmoo Kim 5.1 Using Emotions and Moods for Music Retrieval . 136 5.2 Emotion and Mood: Taxonomies, Communication, and Induction . 137 5.2.1 What Is Emotion, What Is Mood? . 137 5.2.2 A Hierarchical Model of Emotions . 138 5.2.3 Labeling Emotion and Mood with Words and Its Issues 138 5.2.4 Adjective Grouping and the Hevner Diagram . 140 5.2.5 Multidimensional Organizations of Emotion . 140 5.2.5.1 Three and Higher Dimensional Diagrams . 142 5.2.6 Communication and Induction of Emotion and Mood 145 5.3 Obtaining Emotion and Mood Labels . 146 5.3.1 A Small Number of Human Labelers . 146 5.3.2 A Large Number of Labelers . 147 5.3.3 Mood Labels Obtained from Community Tags . 148 5.3.3.1 MIREX Mood Classification Data . 148 5.3.3.2 Latent Semantic Analysis on Mood Tags . 149 5.3.3.3 Screening by Professional Musicians . 150 5.4 Examples of Music Mood and Emotion Classification . 150 5.4.1 Mood Classfication Using Acoustic Data Analysis . 150 5.4.2 Mood Classification Based on Lyrics . 151 5.4.3 Mixing Audio and Tag Features for Mood Classification 153 5.4.4 Mixing Audio and Lyrics for Mood Classification . 154 5.4.4.1 Further Exploratory Investigations with More Complex Feature Sets . 156 5.4.5 Exploration of Acoustic Cues Related to Emotions . 157 5.4.6 Prediction of Emotion Model Parameters . 157 viii 5.5 Discussion . 158 Bibliography . 160 6 Zipf's Law, Power Laws, and Music Aesthetics 169 Bill Manaris, Patrick Roos, Dwight Krehbiel, Thomas Zalonis, and J.R. Armstrong 6.1 Introduction . 171 6.1.1 Overview . 171 6.2 Music Information Retrieval . 172 6.2.1 Genre and Author Classification . 172 6.2.1.1 Audio Features . 172 6.2.1.2 MIDI Features . 173 6.2.2 Other Aesthetic Music Classification Tasks . 174 6.3 Quantifying Aesthetics . 175 6.4 Zipf's Law and Power Laws . 178 6.4.1 Zipf's Law . 178 6.4.2 Music and Zipf's Law . 181 6.5 Power-Law Metrics . 182 6.5.1 Symbolic (MIDI) Metrics . 182 6.5.1.1 Regular Metrics . 182 6.5.1.2 Higher-Order Metrics . 182 6.5.1.3 Local Variability Metrics . 184 6.5.2 Timbre (Audio) Metrics . 184 6.5.2.1 Frequency Metric . 184 6.5.2.2 Signal Higher-Order Metrics .