Computational Analysis of Audio Recordings and Music Scores for the Description and Discovery of Ottoman-Turkish Makam Music
Total Page:16
File Type:pdf, Size:1020Kb
Computational Analysis of Audio Recordings and Music Scores for the Description and Discovery of Ottoman-Turkish Makam Music Sertan Şentürk TESI DOCTORAL UPF / 2016 Director de la tesi Dr. Xavier Serra Casals Music Technology Group Department of Information and Communication Technologies Copyright © 2016 by Sertan Şentürk http://compmusic.upf.edu/senturk2016thesis http://www.sertansenturk.com/phd-thesis Dissertation submitted to the Department of Information and Com- munication Technologies of Universitat Pompeu Fabra in partial fulfillment of the requirements for the degree of DOCTOR PER LA UNIVERSITAT POMPEU FABRA, with the mention of European Doctor. Licensed under Creative Commons Attribution - NonCommercial - NoDerivatives 4.0 You are free to share – to copy and redistribute the material in any medium or format under the following conditions: • Attribution – You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. • Non-commercial – You may not use the material for com- mercial purposes. • No Derivative Works – If you remix, transform, or build upon the material, you may not distribute the modified ma- terial. Music Technology Group (http://mtg.upf.edu/), Department of Informa- tion and Communication Technologies (http://www.upf.edu/etic), Univer- sitat Pompeu Fabra (http://www.upf.edu), Barcelona, Spain The doctoral defense was held on .................. 2017 at Universitat Pompeu Fabra and scored as ........................................................... Dr. Xavier Serra Casals Thesis Supervisor Universitat Pompeu Fabra (UPF), Barcelona, Spain Dr. Gerhard Widmer Thesis Committee Member Johannes Kepler University, Linz, Austria Dr. Barış Bozkurt Thesis Committee Member Koç University, Istanbul, Turkey Dr. Tillman Weyde Thesis Committee Member City, University of London, London, United Kingdom To Sare, Fadime, Osman and Fatma, who still show me their boundless love and compassion beyond time... vi This thesis has been carried out at the Music Technology Group (MTG) of Universitat Pompeu Fabra (UPF) in Barcelona (Spain) from September 2011 to November 2016, and at the Department of Electrical and Electron- ics Engineering of Bahçeşehir University in Istanbul (Turkey) from July 2014 to September 2014. It is supervised by Dr. Xavier Serra Casals. The work in Chapter 6 has been partially conducted in close collaboration with Dr. André Holzapfel (KTH Royal Institute of Technology, Stock- holm, Sweden). Dr. M. Kemal Karaosmanoğlu (Yıldız University, Istan- bul, Turkey) has been consulted in all matters pertaining the music scores used throughout the work. The work in several parts of this thesis have been carried out in collabora- tion with the CompMusic team at the MTG, the partner group led by Dr. Barış Bozkurt (Koç University, Istanbul, Turkey) and the Perceptual In- telligent Lab – Statistical Inference Group led by Dr. Ali Taylan Cemgil (Boğaziçi University, Istanbul, Turkey). A detailed list of collaborators include (alphabetically ordered) Alastair Porter, Altuğ Karakurt, Andrés Ferraro, Burak Uyar, Hasan Sercan Atlı, Gopala Krishna Koduri, Miraç Atıcı, Mohamed Sordo, Sankalp Gulati and Uğur Şimşekli. M. Kemal Karaosmanoğlu and Nezihe Şentürk (Gazi University, Ankara, Turkey) have been consulted for all the music related aspects in the work. This work has been supported by the Department of Information and Communication Technologies (DTIC) PhD fellowship (2011–16), Uni- versitat Pompeu Fabra and the European Research Council (ERC) under the European Union’s Seventh Framework Program, as part of the Comp- Music project (ERC grant agreement 267583). The research stay at the Bahçeşehir University, Istanbul has been funded by European Commis- sion under the Erasmus+ program. The epigraphs in the beginning of each Chapter are taken from İhsan Ok- tay Anar’s phenomenal novel Suskunlar (Taciturns). Acknowledgements Phew, what a journey it was..! It has been a little bit more that 5 years since I have started my journey in the CompMusic project. Seasons have passed, lives have changed. The world has *really* changed (I wonder how Dick- ens would have started its tale). I have also changed, for better or worse. This thesis is a fruit of these years; a fruit which I think as the harvest of collaborative labor. First of all, I would like to thank my supervisor Xavier Serra. It is his vision, which made the CompMusic project possible. It is his perspective, which gathered this exceptional team throughout the world. Likewise, it is his guidance that shaped the thesis. There is not a single day I find myself thinking of all merits I have learned from him and how much more there is still to gain. He has the biggest role in what I have become as a determined researcher, for which I owe the greatest gratitute. Nevertheless, I should say that I owe the biggest debt to M. Ke- mal Karaosmoğlu. Without his unparalleled devotion to collect the makam music scores for decades, and his invaluable help through- out the thesis, this work would simply be impossible. He has been and will always be the definition of idealism I would like to take an example of. I was able to realize most of the audio-score alignment, the core contribution of my thesis, under André Holzapfel’s guidance. vii viii Moreover, he has given me some of the most crucial and sincere suggestions about my PhD work. He am indebted to him not only for his lead during my PhD, and but for his sincerity as a friend. Andres Ferrero is definitely one of the most fruitful people I have had the chance to work with. It is his tireless effort and ded- ication, which made our “baby” web application become what it is. I would like to give a special thanks to Ara Dinkjian, Özer Özel, Robert Garfias and Harold Hagopian for granting us the streaming rights of many recordings they hold the rights of. Seymour Shilen has also helped me numerous times by pointing out the errors in the content. Last but not the least, I want to thank all my family (especially my mother, father and my kel sister) and my friends (especially Emre Can Dağlıoğlu and Can Öktemer). Nothing has changed in the last five years: “Without their selfless support, I would have never even made it to a new day. On the contrary, without my ridiculous ways, they would have a much better life.” Müge Kalen- der has always been on my side for all these uncountable days and nights, until the very last moment. I don’t know how many times Levent Ateşoğlu took care of me. I will always be grateful to Eva Arrizabalaga: I would not be able to continue without her motherly care, while all the bombs (literally) were exploding. Muchos be- sos a Tamar Nalcı, Narod Erkol, Beril Karavit, Cana Göksu, Berta Garcia Faet, Dafni Anastasiadi and Rüstem Can. Finally, I should not forget Şeker and İrfan Hamdi, for their ... cuteness. Abstract This thesis addresses several shortcomings on the current state of the art methodologies in music information retrieval (MIR). In particular, it proposes several computational approaches to automatically analyze and describe music scores and audio recordings of Ottoman-Turkish makam music (OTMM). The main contributions of the thesis are the music cor- pus that has been created to carry out the research and the audio-score alignment methodology developed for the analysis of the corpus. In addi- tion, several novel computational analysis methodologies are presented in the context of common MIR tasks of relevance for OTMM. Some exam- ple tasks are predominant melody extraction, tonic identification, tempo estimation, makam recognition, tuning analysis, structural analysis and melodic progression analysis. These methodologies become a part of a complete system called Dunya-makam for the exploration of large cor- pora of OTMM. The thesis starts by presenting the created CompMusic Ottoman- Turkish makam music corpus. The corpus includes 2200 music scores, more than 6500 audio recordings, and accompanying metadata. The data has been collected, annotated and curated with the help of music experts. Using criteria such as completeness, coverage and quality, we validate the corpus and show its research potential. In fact, our corpus is the largest and most representative resource of OTMM that can be used for compu- tational research. Several test datasets have also been created from the corpus to develop and evaluate the specific methodologies proposed for different computational tasks addressed in the thesis. The part focusing on the analysis of music scores is centered on phrase and section level structural analysis. Phrase boundaries are automatically identified using an existing state-of-the-art segmentation methodology. ix x Section boundaries are extracted using heuristics specific to the format- ting of the music scores. Subsequently, a novel method based on graph analysis is used to establish similarities across these structural elements in terms of melody and lyrics, and to label the relations semiotically. The audio analysis section of the thesis reviews the state-of-the-art for analysing the melodic aspects of performances of OTMM. It proposes adaptations of existing predominant melody extraction methods tailored to OTMM. It also presents improvements over pitch-distribution-based tonic identification and makam recognition methodologies. The audio-score alignment methodology is the core of the thesis. It addresses the culture-specific challenges posed by the musical charac- teristics, music theory related representations and oral praxis of OTMM. Based on several techniques such as subsequence dynamic time warping, Hough transform and variable-length Markov models, the audio-score alignment methodology is designed to handle the structural differences between music scores and audio recordings. The method is robust to the presence of non-notated melodic expressions, tempo deviations within the music performances, and differences in tonic and tuning. The method- ology utilizes the outputs of the score and audio analysis, and links the audio and the symbolic data.