Sequence Classification and Ranking in Indian Classical Music with Deep Learning

Total Page:16

File Type:pdf, Size:1020Kb

Sequence Classification and Ranking in Indian Classical Music with Deep Learning DEEPSRGM - SEQUENCE CLASSIFICATION AND RANKING IN INDIAN CLASSICAL MUSIC WITH DEEP LEARNING Sathwik Tejaswi Madhusudhan Girish Chowdhary University of Illinois, Urbana Champaign University of Illinois, Urbana Champaign [email protected] [email protected] ABSTRACT note, slide from one note to another etc [17]), arohana and avarohana (upward and downward melodic movement) A vital aspect of Indian Classical Music (ICM) is Raga, and melodic phrases/ motifs [12, 20]. Emotion is another which serves as a melodic framework for compositions and attribute that makes Ragas distinctive. Balkwill et al. [2] improvisations alike. Raga Recognition is an important conducted studies on the perception of Ragas and observed music information retrieval task in ICM as it can aid nu- that even an inexperienced listener is able to identify emo- merous downstream applications ranging from music rec- tions portrayed by various Ragas. Given the importance ommendations to organizing huge music collections. In of Raga in ICM, Machine Learning (ML) based automatic this work, we propose a deep learning based approach to raga recognition can aid in organizing large audio libraries, Raga recognition. Our approach employs efficient pre- labeling recording in the same, etc. possessing and learns temporal sequences in music data We believe automatic raga recognition has tremendous using Long Short Term Memory based Recurrent Neural potential in empowering content-based recommendation Networks (LSTM-RNN). We train and test the network on systems pertaining to ICM and contemporary Indian mu- smaller sequences sampled from the original audio while sic. However, Raga recognition is not a trivial problem. the final inference is performed on the audio as a whole. There are numerous examples where two or more Ragas Our method achieves an accuracy of 88.1% and 97 % dur- have the same or a similar set of notes but are worlds ing inference on the Comp Music Carnatic dataset and its apart in the musical effect they produce due factors like 10 Raga subset respectively making it the state-of-the-art the gamaka, temporal sequencing (which has to abide by for the Raga recognition task. Our approach also enables the constraints presented in the arohana and avarohana), sequence ranking which aids us in retrieving melodic pat- as well as places of svara emphasis and rest. Raga identi- terns from a given music data base that are closely related fication is an acquired skill that requires significant train- to the presented query sequence. ing and practice. As such, automatic Raga classification methods have been widely studied. However, existing 1. INTRODUCTION methods based on Pitch Class Distributions (PCD) disre- gard the temporal information and are highly error-prone, Carnatic and Hindustani music are the two main branches while other methods (reviewed in detail in Section 2) are of Indian Classical Music (ICM), which underlies most of highly dependent on preprocessing and hand made fea- the music emanating from the Indian subcontinent. Ow- tures, which limit the performance of such methods. ing to the contemplative and spiritual nature of these art forms and varied cultural influences, a lot of emphases is In this work, we introduce a deep learning based solu- placed on melodic development. Raga, which governs var- tion to Raga recognition and Raga content-based retrieval. ious melodic aspects of ICM, serves as a framework for First, we reformulate the problem of Raga recognition as compositions and improvisations. Krishna et al. [20] de- a sequence classification task performed using an LSTM fine a Raga as the “collective melodic expression that con- RNN based architecture with attention. We then show that sists of phraseology which is part of the identifiable macro- the same network, with minimal fine-tuning based on the melodic movement”. A major portion of contemporary In- triplet margin loss [27], can be used for sequence ranking. dian Music including film, folk and other forms of music We introduce sequence ranking as a new sub-task of auto- heavily draw inspiration from ICM [10, 12]. matic Raga recognition. In this configuration, the model Numerous melodic attributes make Ragas distinctive in can be used to perform Raga content-based retrieval. A nature, such as the svara (roughly, a musical note), the user provides the model with a query sequence and the gamaka (for example oscillatory movement about a given model returns sequences that are very closely related to the presented query. This can aid in downstream applications like recommendation systems and so on. c Sathwik Tejaswi Madhusudhan, , Girish Chowdhary. Li- Deep learning has proven to be an indispensable asset censed under a Creative Commons Attribution 4.0 International License due to its flexibility, scalability, learning abilities, end to (CC BY 4.0). Attribution: Sathwik Tejaswi Madhusudhan, , Girish Chowdhary. “DeepSRGM - Sequence classification and ranking in In- end training, and most importantly, unprecedented success dian Classical music with deep learning”, 20th International Society for in modeling unstructured spatial and temporal data over Music Information Retrieval Conference, Delft, The Netherlands, 2019. the past few years. Our work draws inspiration from the 533 Proceedings of the 20th ISMIR Conference, Delft, Netherlands, November 4-8, 2019 successes in [26, 27] introduce a convolutional neural net- work based recommendation system for music which over- Audio comes and outperforms various drawbacks of traditional Model recommendation systems like collaborative filtering, a bag Word embedding of words method, etc. It also leverages LSTM-RNN and Preprocessing attention based models which have proven to be extremely with embedding dimension = 128 effective in tasks like sequence classification, sequence to Deep learning based sequence learning, neural machine translation, image cap- audio source tioning task, etc., all of which directly deal with sequences. separation model LSTM layer To summarize, the main contributions of our work are hidden size = 768 as follows : Pitch tracking • We present a new approach to Raga recognition us- Attention Layer ing deep learning using LSTM based RNN architec- Tonic Normalization ture to address problem of Raga recognition. Dense layer 1 and note quantization Dense layer 2 • With our approach we obtain 97.1 % on the 10 Raga Softmax classification task and 88.1 % accuracy on the 40 Random Sampling Raga classification task on the Comp Music Carnatic Music Dataset (CMD), hence improving the state- of-the-art on the same. Predicted Class label • We introduce sequence ranking as a new sub task of Raga recognition, which can be used in creating Raga content based recommendation systems. Figure 1. Figure shows various preprocessing steps and model architecture for SRGM1 (refer Section 3) Note that we refer to our Raga recognition (i.e sequence classification) model as SRGM1 and the sequence ranking model as SRGM2 throughout the paper. Although arohana and avarohana are critical components to identifying a Raga, they do not contain as much infor- mation as the audio excerpt itself. Most previous works 2. RELATED WORKS heavily rely on prepossessing and feature extraction, most Broadly speaking, ML methods for Raga classification can of which are handcrafted, which can prove to be a limita- either be PCD based or sequence based. Many of the tion as it only gets harder to devise ways to extract complex previous works have focused on PCD based methods for and rich features. Raga recognition [6, 7, 18] with Chordia et al. [4], cur- rently holding the best performing results for PCD based 3. RAGA RECOGNITION AS A SEQUENCE methods with an accuracy of 91%. [17] presents an in- CLASSIFICATION PROBLEM (SRGM1) depth study of various distribution based methods for Raga classification. Although an intuitive and an effective ap- Sequence classification is a predictive modeling problem proach, the major shortcoming of PCD based methods is where an input, a sequence over space or time, is to be that it disregards temporal information and hence is error classified as one among several categories. In modern prone. To overcome this, previous works have used Hidden NLP literature, sentiment classification is one such exam- Markov Models based approach [19], convolutional neural ple, where models predict sentiments of sentences. We ar- networks based approach [21], arohana avarohana based gue that the Raga recognition is closely related to the task approach [23] among various other methods. of sequence classification, since any audio excerpt in ICM [9, 24] employ Raga phrase-based approach for Raga involving vocals/melody instruments, can be broken down recognition. [11] use an interesting phrase-based approach as a sequence of notes spread over time. To reformulate inspired by the way in which seasoned listeners identify Raga recognition as a sequence classification task, we treat a Raga. This method is currently the state-of-the-art on the notes obtained via predominant melody estimation as a 10 Raga subset of CMD. [12] introduces Time Delayed words, the set of all words form the vocabulary and each Melodic Surface(TDMS) which is a feature based on delay Raga as a class. We detail our approach to raga recognition co-ordinates that is capable of describing both tonal and as below. temporal characteristics of the melody given a song. With TDMS, the authors achieve state-of-art performance on the 3.1 Preprocessing CMD. 3.1.1 Deep Learning based audio source separation While the PCD based methods ignore the temporal in- formation, many works like assume that the audio is mono- Audio recordings available as part of CMD accurately rep- phonic. Works like [19,23] attempt to take advantage of the resent live ICM concerts with vocalists accompanied by temporal information by extracting arohana and avarohana. various string, wind, and percussion instruments. How- 534 Proceedings of the 20th ISMIR Conference, Delft, Netherlands, November 4-8, 2019 ever, for the purposes of raga recognition, analyzing the 3.1.5 Choosing an appropriate value for Nr vocals is sufficient.
Recommended publications
  • An Exploration of the Relationship Between Mathematics and Music
    An Exploration of the Relationship between Mathematics and Music Shah, Saloni 2010 MIMS EPrint: 2010.103 Manchester Institute for Mathematical Sciences School of Mathematics The University of Manchester Reports available from: http://eprints.maths.manchester.ac.uk/ And by contacting: The MIMS Secretary School of Mathematics The University of Manchester Manchester, M13 9PL, UK ISSN 1749-9097 An Exploration of ! Relation"ip Between Ma#ematics and Music MATH30000, 3rd Year Project Saloni Shah, ID 7177223 University of Manchester May 2010 Project Supervisor: Professor Roger Plymen ! 1 TABLE OF CONTENTS Preface! 3 1.0 Music and Mathematics: An Introduction to their Relationship! 6 2.0 Historical Connections Between Mathematics and Music! 9 2.1 Music Theorists and Mathematicians: Are they one in the same?! 9 2.2 Why are mathematicians so fascinated by music theory?! 15 3.0 The Mathematics of Music! 19 3.1 Pythagoras and the Theory of Music Intervals! 19 3.2 The Move Away From Pythagorean Scales! 29 3.3 Rameau Adds to the Discovery of Pythagoras! 32 3.4 Music and Fibonacci! 36 3.5 Circle of Fifths! 42 4.0 Messiaen: The Mathematics of his Musical Language! 45 4.1 Modes of Limited Transposition! 51 4.2 Non-retrogradable Rhythms! 58 5.0 Religious Symbolism and Mathematics in Music! 64 5.1 Numbers are God"s Tools! 65 5.2 Religious Symbolism and Numbers in Bach"s Music! 67 5.3 Messiaen"s Use of Mathematical Ideas to Convey Religious Ones! 73 6.0 Musical Mathematics: The Artistic Aspect of Mathematics! 76 6.1 Mathematics as Art! 78 6.2 Mathematical Periods! 81 6.3 Mathematics Periods vs.
    [Show full text]
  • How to Read Music Notation in JUST 30 MINUTES! C D E F G a B C D E F G a B C D E F G a B C D E F G a B C D E F G a B C D E
    The New School of American Music How to Read Music Notation IN JUST 30 MINUTES! C D E F G A B C D E F G A B C D E F G A B C D E F G A B C D E F G A B C D E 1. MELODIES 2. THE PIANO KEYBOARD The first thing to learn about reading music A typical piano has 88 keys total (including all is that you can ignore most of the informa- white keys and black keys). Most electronic tion that’s written on the page. The only part keyboards and organ manuals have fewer. you really need to learn is called the “treble However, there are only twelve different clef.” This is the symbol for treble clef: notes—seven white and five black—on the keyboard. This twelve note pattern repeats several times up and down the piano keyboard. In our culture the white notes are named after the first seven letters of the alphabet: & A B C D E F G The bass clef You can learn to recognize all the notes by is for classical sight by looking at their patterns relative to the pianists only. It is totally useless for our black keys. Notice the black keys are arranged purposes. At least for now. ? in patterns of two and three. The piano universe tends to revolve around the C note which you The notes ( ) placed within the treble clef can identify as the white key found just to the represent the melody of the song.
    [Show full text]
  • Music Is Made up of Many Different Things Called Elements. They Are the “I Feel Like My Kind Building Bricks of Music
    SECONDARY/KEY STAGE 3 MUSIC – BUILDING BRICKS 5 MINUTES READING #1 Music is made up of many different things called elements. They are the “I feel like my kind building bricks of music. When you compose a piece of music, you use the of music is a big pot elements of music to build it, just like a builder uses bricks to build a house. If of different spices. the piece of music is to sound right, then you have to use the elements of It’s a soup with all kinds of ingredients music correctly. in it.” - Abigail Washburn What are the Elements of Music? PITCH means the highness or lowness of the sound. Some pieces need high sounds and some need low, deep sounds. Some have sounds that are in the middle. Most pieces use a mixture of pitches. TEMPO means the fastness or slowness of the music. Sometimes this is called the speed or pace of the music. A piece might be at a moderate tempo, or even change its tempo part-way through. DYNAMICS means the loudness or softness of the music. Sometimes this is called the volume. Music often changes volume gradually, and goes from loud to soft or soft to loud. Questions to think about: 1. Think about your DURATION means the length of each sound. Some sounds or notes are long, favourite piece of some are short. Sometimes composers combine long sounds with short music – it could be a song or a piece of sounds to get a good effect. instrumental music. How have the TEXTURE – if all the instruments are playing at once, the texture is thick.
    [Show full text]
  • Automatic Music Transcription Using Sequence to Sequence Learning
    Automatic music transcription using sequence to sequence learning Master’s thesis of B.Sc. Maximilian Awiszus At the faculty of Computer Science Institute for Anthropomatics and Robotics Reviewer: Prof. Dr. Alexander Waibel Second reviewer: Prof. Dr. Advisor: M.Sc. Thai-Son Nguyen Duration: 25. Mai 2019 – 25. November 2019 KIT – University of the State of Baden-Wuerttemberg and National Laboratory of the Helmholtz Association www.kit.edu Interactive Systems Labs Institute for Anthropomatics and Robotics Karlsruhe Institute of Technology Title: Automatic music transcription using sequence to sequence learning Author: B.Sc. Maximilian Awiszus Maximilian Awiszus Kronenstraße 12 76133 Karlsruhe [email protected] ii Statement of Authorship I hereby declare that this thesis is my own original work which I created without illegitimate help by others, that I have not used any other sources or resources than the ones indicated and that due acknowledgement is given where reference is made to the work of others. Karlsruhe, 15. M¨arz 2017 ............................................ (B.Sc. Maximilian Awiszus) Contents 1 Introduction 3 1.1 Acoustic music . .4 1.2 Musical note and sheet music . .5 1.3 Musical Instrument Digital Interface . .6 1.4 Instruments and inference . .7 1.5 Fourier analysis . .8 1.6 Sequence to sequence learning . 10 1.6.1 LSTM based S2S learning . 11 2 Related work 13 2.1 Music transcription . 13 2.1.1 Non-negative matrix factorization . 13 2.1.2 Neural networks . 14 2.2 Datasets . 18 2.2.1 MusicNet . 18 2.2.2 MAPS . 18 2.3 Natural Language Processing . 19 2.4 Music modelling .
    [Show full text]
  • Musical Elements in the Discrete-Time Representation of Sound
    0 Musical elements in the discrete-time representation of sound RENATO FABBRI, University of Sao˜ Paulo VILSON VIEIRA DA SILVA JUNIOR, Cod.ai ANTONIOˆ CARLOS SILVANO PESSOTTI, Universidade Metodista de Piracicaba DEBORA´ CRISTINA CORREA,ˆ University of Western Australia OSVALDO N. OLIVEIRA JR., University of Sao˜ Paulo e representation of basic elements of music in terms of discrete audio signals is oen used in soware for musical creation and design. Nevertheless, there is no unied approach that relates these elements to the discrete samples of digitized sound. In this article, each musical element is related by equations and algorithms to the discrete-time samples of sounds, and each of these relations are implemented in scripts within a soware toolbox, referred to as MASS (Music and Audio in Sample Sequences). e fundamental element, the musical note with duration, volume, pitch and timbre, is related quantitatively to characteristics of the digital signal. Internal variations of a note, such as tremolos, vibratos and spectral uctuations, are also considered, which enables the synthesis of notes inspired by real instruments and new sonorities. With this representation of notes, resources are provided for the generation of higher scale musical structures, such as rhythmic meter, pitch intervals and cycles. is framework enables precise and trustful scientic experiments, data sonication and is useful for education and art. e ecacy of MASS is conrmed by the synthesis of small musical pieces using basic notes, elaborated notes and notes in music, which reects the organization of the toolbox and thus of this article. It is possible to synthesize whole albums through collage of the scripts and seings specied by the user.
    [Show full text]
  • Understanding Music Past and Present
    Understanding Music Past and Present N. Alan Clark, PhD Thomas Heflin, DMA Jeffrey Kluball, EdD Elizabeth Kramer, PhD Understanding Music Past and Present N. Alan Clark, PhD Thomas Heflin, DMA Jeffrey Kluball, EdD Elizabeth Kramer, PhD Dahlonega, GA Understanding Music: Past and Present is licensed under a Creative Commons Attribu- tion-ShareAlike 4.0 International License. This license allows you to remix, tweak, and build upon this work, even commercially, as long as you credit this original source for the creation and license the new creation under identical terms. If you reuse this content elsewhere, in order to comply with the attribution requirements of the license please attribute the original source to the University System of Georgia. NOTE: The above copyright license which University System of Georgia uses for their original content does not extend to or include content which was accessed and incorpo- rated, and which is licensed under various other CC Licenses, such as ND licenses. Nor does it extend to or include any Special Permissions which were granted to us by the rightsholders for our use of their content. Image Disclaimer: All images and figures in this book are believed to be (after a rea- sonable investigation) either public domain or carry a compatible Creative Commons license. If you are the copyright owner of images in this book and you have not authorized the use of your work under these terms, please contact the University of North Georgia Press at [email protected] to have the content removed. ISBN: 978-1-940771-33-5 Produced by: University System of Georgia Published by: University of North Georgia Press Dahlonega, Georgia Cover Design and Layout Design: Corey Parson For more information, please visit http://ung.edu/university-press Or email [email protected] TABLE OF C ONTENTS MUSIC FUNDAMENTALS 1 N.
    [Show full text]
  • How to Play the Musical
    ' '[! . .. ' 1 II HOW TO PLAY THE MUSI CAL SAW by RENE ' BOGART INTRODUCTION TO PLAYING THE MUSICA L SAW: It is unknown just when nor where THE MUSICAL SAW had it's origin and was first played. It ~s believed among some MUSICAL SAW players, however, that i t originated in South America in the 1800's and some believe the idea was imported into the U. S. from Europe . I well remember the fi r st time I heard it played. It wa s in Buffalo, N.Y. and played at the Ebrwood M!L6,£.c..Ha..U about 1910 when I was a boy. It has been used extensively in the ho me, on stage at private cl ubs, church fun ctions, social gatherings , hospita l s and nursing homes, in vaudevi l le and theaters for years, and more recently on Radio and Televisi on Programs in the enter­ tainmentfield. The late Cf.alte.nc..e.MM ~e.ht , of Fort Atkinson, Wi sconsin, was the first to introduce and suppl y Musical Saw Kits to the field under the name of M!L6 ~e. ht&Wutpha1.,** and did a great deal to develop the best line of SAWS suitable for this use, after considerable research. Before taking up the MUSICAL SAW and going into the detail s of there­ quired proceedure for playing it wel l , it is important to understand some of the basic rudiments and techniques involved for it's proper manipu l ation. While the MUSICAL SAW is essential l y a typi cal carpenter's or craftsman's saw, it' s use as a very unique musical instrument has gained in popu larity, and has recently become a means of expressing a very distinctive qua l ity of music.
    [Show full text]
  • Elementary Music Synthesis
    Elementary Music Synthesis This lab is by Professor Virginia Stonick of Oregon State University. The purp ose of this lab is to construct physically meaningful signals math- ematically in MATLAB using your knowledge of signals. You will gain some understanding of the physical meaning of the signals you construct by using audio playback. Also, we hop e you have fun! 1 Background In this section, we explore how to use simple tones to comp ose a segment of music. By using tones of various frequencies, you will construct the rst few bars of Beethoven's famous piece \Symphony No. 5 in C-Minor." In addition, you will get to construct a simple scale and then play it backwards. IMPORTANT: Eachmusical note can b e simply represented by a sinusoid whose frequency dep ends on the note pitch. Assume a sampling rate of 8KHz and that an eighth note = 1000 samples. Musical notes are arranged in groups of twelve notes called octaves. The notes that we'll be using for Beethoven's Fifth are in the o ctave containing frequencies from 220 Hz to 440 Hz. When we construct our scale, we'll include notes from the o ctave containing frequencies from 440 Hz to 880 Hz. The twelve notes in each o ctave are logarithmically spaced in frequency, 1=12 with each note b eing of a frequency 2 times the frequency of the note of lower frequency. Thus, a 1-o ctave pitch shift corresp onds to a doubling of the frequencies of the notes in the original o ctave.
    [Show full text]
  • Frequency-And-Music-1.34.Pdf
    Frequency and Music By: Douglas L. Jones Catherine Schmidt-Jones Frequency and Music By: Douglas L. Jones Catherine Schmidt-Jones Online: < http://cnx.org/content/col10338/1.1/ > This selection and arrangement of content as a collection is copyrighted by Douglas L. Jones, Catherine Schmidt-Jones. It is licensed under the Creative Commons Attribution License 2.0 (http://creativecommons.org/licenses/by/2.0/). Collection structure revised: February 21, 2006 PDF generated: August 7, 2020 For copyright and attribution information for the modules contained in this collection, see p. 51. Table of Contents 1 Acoustics for Music Theory ......................................................................1 2 Standing Waves and Musical Instruments ......................................................7 3 Harmonic Series ..................................................................................17 4 Octaves and the Major-Minor Tonal System ..................................................29 5 Tuning Systems ..................................................................................37 Index ................................................................................................49 Attributions .........................................................................................51 iv Available for free at Connexions <http://cnx.org/content/col10338/1.1> Chapter 1 Acoustics for Music Theory1 1.1 Music is Organized Sound Waves Music is sound that's organized by people on purpose, to dance to, to tell a story, to make other people
    [Show full text]
  • Note Value Recognition for Piano Transcription Using Markov Random Fields Eita Nakamura, Member, IEEE, Kazuyoshi Yoshii, Member, IEEE, Simon Dixon
    JOURNAL OF LATEX CLASS FILES, VOL. XX, NO. YY, ZZZZ 1 Note Value Recognition for Piano Transcription Using Markov Random Fields Eita Nakamura, Member, IEEE, Kazuyoshi Yoshii, Member, IEEE, Simon Dixon Abstract—This paper presents a statistical method for use Performed MIDI signal 15 (sec) 16 1717 1818 199 20 2121 2222 23 in music transcription that can estimate score times of note # # # # b # # onsets and offsets from polyphonic MIDI performance signals. & # # # Because performed note durations can deviate largely from score- ? Pedal indicated values, previous methods had the problem of not being able to accurately estimate offset score times (or note values) Hidden Markov model (HMM) etc. and thus could only output incomplete musical scores. Based on Incomplete musical score by conventional methods observations that the pitch context and onset score times are nœ œ œ œ #œ œ œ œ œ œ œ influential on the configuration of note values, we construct a # 6 œ œ œ œ nœ & # 8 œ œ œ œ œ œ bœ nœ #œ œ context-tree model that provides prior distributions of note values œ œ #œ nœ œ œ œ œ œ nœ œ œ œ œ ? # 6 œ n œ œ œ œ œ œ œ # 8 œ using these features and combine it with a performance model in (Rhythms are scaled by 2) the framework of Markov random fields. Evaluation results show Markov random field (MRF) model that our method reduces the average error rate by around 40 percent compared to existing/simple methods. We also confirmed Complete musical score by the proposed method that, in our model, the score model plays a more important role # œ nœ œ œ œ œ œ œ #œ œ œ œ ˙.
    [Show full text]
  • UNIVERSITY of CALIFORNIA, SAN DIEGO Probabilistic Topic Models
    UNIVERSITY OF CALIFORNIA, SAN DIEGO Probabilistic Topic Models for Automatic Harmonic Analysis of Music A dissertation submitted in partial satisfaction of the requirements for the degree Doctor of Philosophy in Computer Science by Diane J. Hu Committee in charge: Professor Lawrence Saul, Chair Professor Serge Belongie Professor Shlomo Dubnov Professor Charles Elkan Professor Gert Lanckriet 2012 Copyright Diane J. Hu, 2012 All rights reserved. The dissertation of Diane J. Hu is approved, and it is ac- ceptable in quality and form for publication on microfilm and electronically: Chair University of California, San Diego 2012 iii DEDICATION To my mom and dad who inspired me to work hard.... and go to grad school. iv TABLE OF CONTENTS Signature Page . iii Dedication . iv Table of Contents . v List of Figures . viii List of Tables . xii Acknowledgements . xiv Vita . xvi Abstract of the Dissertation . xvii Chapter 1 Introduction . 1 1.1 Music Information Retrieval . 1 1.1.1 Content-Based Music Analysis . 2 1.1.2 Music Descriptors . 3 1.2 Problem Overview . 5 1.2.1 Problem Definition . 5 1.2.2 Motivation . 5 1.2.3 Contributions . 7 1.3 Thesis Organization . 9 Chapter 2 Background & Previous Work . 10 2.1 Overview . 10 2.2 Musical Building Blocks . 10 2.2.1 Pitch and frequency . 11 2.2.2 Rhythm, tempo, and loudness . 17 2.2.3 Melody, harmony, and chords . 18 2.3 Tonality . 18 2.3.1 Theory of Tonality . 19 2.3.2 Tonality in music . 22 2.4 Tonality Induction . 25 2.4.1 Feature Extraction . 26 2.4.2 Tonal Models for Symbolic Input .
    [Show full text]
  • The Sounds of Music: Science of Musical Scales∗ 1
    SERIES ARTICLE The Sounds of Music: Science of Musical Scales∗ 1. Human Perception of Sound Sushan Konar Both, human appreciation of music and musical genres tran- scend time and space. The universality of musical genres and associated musical scales is intimately linked to the physics of sound, and the special characteristics of human acoustic sensitivity. In this series of articles, we examine the science underlying the development of the heptatonic scale, one of the most prevalent scales of the modern musical genres, both western and Indian. Sushan Konar works on stellar compact objects. She Introduction also writes popular science articles and maintains a Fossil records indicate that the appreciation of music goes back weekly astrophysics-related blog called Monday Musings. to the dawn of human sentience, and some of the musical scales in use today could also be as ancient. This universality of musi- cal scales likely owes its existence to an amazing circularity (or periodicity) inherent in human sensitivity to sound frequencies. Most musical scales are specific to a particular genre of music, and there exists quite a number of them. However, the ‘hepta- 1 1 tonic’ scale happens to have a dominating presence in the world Having seven base notes. music scene today. It is interesting to see how this has more to do with the physics of sound and the physiology of human auditory perception than history. We shall devote this first article in the se- ries to understand the specialities of human response to acoustic frequencies. Human ear is a remarkable organ in many ways. The range of hearing spans three orders of magnitude in frequency, extending Keywords from ∼20 Hz to ∼20,000 Hz (Figure 1) even though the sensitivity String vibration, beat frequencies, consonance-dissonance, pitch, tone.
    [Show full text]