Computational Music Archiving as Physical Culture Theory

Prof. Dr. Rolf Bader

Institute of Systematic Musicology, University of Hamburg, Neue Rabenstr. 13, 20354 Hamburg, Germany. [email protected]

ABSTRACT: The framework of the Computational Music and Sound Archive (COMSAR) is discussed. The aim is to analyze and sort musical pieces from all over the world with computational tools. The analysis is based on Music Information Retrieval (MIR) tools; the sorting algorithms used are Hidden Markov models and self-organizing Kohonen maps (SOM). Different kinds of systematizations, like taxonomies and self-organizing systems, as well as bottom-up methods with physiological motivation, are discussed, next to the basic signal-processing algorithms. Further implementations include instrument geometries with their radiation characteristics as measured by microphone arrays, as well as the vibrational reconstruction of these instruments using physical modeling. Practically, the aim is a search engine for music which is based on musical parameters like pitch, rhythm, tonality, form or timbre, using methods close to neuronal and physiological mechanisms. Still, the concept also suggests a culture theory based on physical mechanisms and parameters, and therefore omits speculation and theoretical overload.

2 BACKGROUND

2.1 KINDS OF MUSIC SYSTEMS

The aim of Systematic Musicology, right from its start around 1900, has been the search for universals in music: for rules, relations, systems or interactions holding for all musical styles of all ethnic groups and cultures around the world [61]. Music recordings and phonogram archives played a crucial role in establishing the field, as only after the invention of the Edison phonograph for recording music on wax cylinders [60] did it become possible to compare music recorded by ethnomusicologists. The first of such archives was the Phonogram Archive established by Erich von Hornbostel [59], with a historical recording of a Thai phi pha orchestra during a visit to the Berlin Tiergarten in 1900. Many recordings followed, like those of Jaap Kunst recording the music of Indonesia [?].

2.1.1 TAXONOMY

One way of giving the endless variety of musics a system was the Hornbostel/Sachs classification of musical instruments [27], in line with the Sachs dictionary of musical instruments [54] as a detailed basis for such a classification. This classification system still holds today and has only been enlarged by electronic musical instruments, whose development started in the second half of the 19th century and was very prominent in the second half of the 20th century [51]. The success of this classification stems from its idea of sorting musical instruments according to acoustical properties, namely their driving mechanisms: instruments can be plucked, bowed, blown or struck, which produces similar timbres. The classification therefore shows a unity in the variety by comparing instruments. For this reason, the field of Systematic Musicology was in those days called Comparative Musicology [59]. Taxonomies have been applied in many contexts in the field; a prominent example is that of musical styles forming feature lists [38], as applied to singing styles around the world [39].

2.1.2 SELF-ORGANIZATION

Classification is a hierarchical structure with global nodes followed by subnodes to differentiate plucked or blown instruments into many subcategories. This is only one kind of a

system, and several others have been proposed. One such way is describing music as a self-organized system. The sounds produced by musical instruments show a very simple behaviour, the harmonicity of their overtone structures, only due to a very complex system of linear and nonlinear substructures, like turbulence in wind instruments or the bow-string interaction of the violin family [5]. The perception of music in the neural networks of the brain is also a self-organizing process [35, 37, 67, 68], where many neurons interact in nonlinear ways to result in simple outputs: the perception of timbre, rhythm, melodies or musical form. Therefore self-organization is a second system proposed to understand musics from all around the world in a more differentiated way [5].

2.1.3 PSYCHOACOUSTIC BOTTOM-UP PROCESS

Yet a third kind of system to understand music is often used in Musical Signal Processing and Music Information Retrieval (MIR) [29, 16, 17]. Here algorithms investigate the digital waveform of recorded music to retrieve information from this waveform, such as pitch, timbre, rhythm or other psychoacoustic parameters like roughness, brightness, density or the like. Classification of musical instruments is performed, as well as many other tasks like score following or Networked Music Performance [2]. In score following, the computer understands the music and can tell a player where he is in the score. In Networked Music Performance, musicians around the world play together over the internet, and the task of the computer is to synchronize their playing, working around the restrictions of the speed of light, which delays the transmissions. This can be achieved by estimating the music a musician will play before it is actually played. Piano-roll extraction is yet another such task, where the computer understands typically piano music and prints a score from the sound wave file [10].

2.2 BASIC TYPES OF ALGORITHMS

Many of those tasks are realized by algorithms taking a bottom-up approach to music retrieval, where the sound file is first analyzed in terms of its spectrum or cepstrum, which is then further processed using more complex algorithms to end up in the retrieval result [29]. To what extent those algorithms represent the perception of music by humans, that is the processes physiologically present from the cochlea and the following neural nuclei up to the primary auditory cortex and beyond, is not much discussed. Indeed, neural processing is self-organization, and therefore many of the algorithms are often only roughly related to perception [50, 48, 49]. Still, this is not the main aim of such algorithms, which can be seen as engineering solutions to a given task and often perform very well.

Other, more complex algorithms have been proposed which come closer to human perception, like self-organizing maps of the Kohonen type [35, 37, 67, 68]. Here features of musical sounds are sorted in a map according to their similarities. As the map organizes itself, no initial estimations are needed to decide about the way similarity is measured; the features of the different sounds decide about this on their own. Also, algorithms estimating the fractal dimension of sounds show considerable relatedness both to the processing of sound in humans and to the perceptional sensation of musical density, as a simple result of a complex computation [5].

Another kind of algorithm used for such tasks is the Hidden Markov Model (HMM) [69]. Here the temporal development of events is predicted as the result of some hidden process. This process consists of transitions between a small number of states, like musical pitches or musical instruments. The development of their appearance is modeled as the probability of one state switching into another, e.g. one pitch followed by another. As such transition probabilities are of a statistical nature, likelihoods describe the process, and the output is therefore not fixed beforehand, leaving space for arbitrariness. Still, to what extent these models fit human perception is under debate.

Yet a totally different kind of algorithm describes the sound production of musical instruments. Physical Modeling is a set of methods to produce the sound of instruments from knowledge about the instrument geometries and the physical laws governing their vibration [9]. Several stages of complexity and simplicity exist here, from lumped models, digital waveguides or delay lines [53] to whole-body geometries solving the differential equations governing the vibrations of plates, membranes or turbulent air flow [5, 18]. These algorithms use the detailed geometry and solve the problems in a time-dependent manner,

resulting in very realistic sounds and estimations of vibrating frequencies, transients or radiation. Using extremely parallel computation on a Field-Programmable Gate Array (FPGA), these geometries can be simulated in real time [42, 43]. Understanding the behaviour of instruments is often achieved in a bottom-up way by trying different models, adding or leaving out geometrical details, and deciding from the comparison between the computed and the measured sound how the instrument works in detail. Other, simpler models are able to show relations between musical instruments and instrument families more easily by starting with global estimations and adding necessary features in a top-down way [5].

Measurements of musical instruments are also a crucial part of understanding the instruments and their relation to one another. Whole-body measurement techniques, like microphone arrays [4], laser interferometry [41] or modal analysis [?], all give a detailed picture of the spatial and temporal development of the vibrations and transients of musical instruments. High-speed cameras and subpixel tracking analysis show the movement of strings or reeds [45]. This understanding leads to estimations of the global behaviour of instrument radiation, the role of different instrument parts in sound production, the use of materials like woods, hybrid or metamaterials, and the interaction of musical instruments with room acoustics or between instruments and players. Modern high-resolution methods of computer tomography (CT) give very detailed geometries of the instruments which can be used as input to physical modeling, showing details within the structures not accessible from their surfaces, or giving estimations of material parameters like density or Young's modulus. Therefore these methods are able to compare instruments and instrument families and give insight into building strategies and methods.

The different approaches to understanding music should at best all be used in a Computational Music and Sound Archive (COMSAR) as proposed here. Traditional phonogram archives only consist of recordings and their metadata, like the country they have been recorded in, the musicians playing, the instruments in use, etc. [13, 65, 66, 46, 12]. Still, to address the aim of Systematic Musicology of finding universals in music [33], understanding its system, its production and its perception needs to use the analysis and analysis-by-synthesis tools discussed above. Including all these tasks is a tremendous effort; still, all these fields nowadays show a high degree of specialization and are able to give detailed and robust results. Therefore combining them into an automatic analysis and search engine is straightforward.

2.3 COMSAR AS BIG DATA SOLUTION

Such automatic systems are needed in many fields. The endless amount of music recordings accessible via the internet, on CDs and in archives makes it practically impossible for a single researcher, and even for research teams, to perform these analyses by hand. Such a Big Data problem needs automatic tools for researchers to cope with. Additionally, the number of methods and their complexity are so large that it is not feasible to have one researcher perform all tasks.

Also, in terms of the 'buzz' the internet produces, the endless variety of musics, and research and consumer demands, search engines are needed to point researchers, musicians, listeners and music lovers to music they would hardly find otherwise. Such search engines need to be based on real musical parameters. Existing search engines for music used for mood regulation or work-outs sort music mainly in terms of its tempo and its vitalizing properties. Still, the musical styles of the world are so many and so differentiated that such simple parameters are not able to represent traditional music, all-in-a-box productions of musicians of ethnic groups in remote areas, boy groups in jungle regions combining their tradition with Western harmony or electronic music, free improvisation, global Hip-Hop or Electronic Dance Music. Here much more differentiated algorithms are needed, and all of those mentioned above should be combined.

2.4 COMSAR AS PHYSICAL CULTURE THEORY

The COMSAR standard is not only about updating traditional phonogram archives with modern methods and algorithms, and not only about coping with Big Data. It is also the attempt to realize a culture theory based on physical reasoning.
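As an aside, the delay-line family of physical models mentioned in section 2.2 can be illustrated with a minimal Karplus-Strong-style plucked string: a noise burst circulating in a delay line with a damped averaging filter organizes itself into a harmonic tone. This is only a sketch; the whole-body geometry models discussed above are far more elaborate, and all parameter values here are illustrative, not taken from the COMSAR implementation.

```python
import numpy as np

def pluck(f0=440.0, sr=44100, dur=1.0, damping=0.996):
    """Minimal Karplus-Strong plucked string: a delay line of length
    sr/f0 with a two-point averaging (low-pass) filter in the loop."""
    n = int(sr / f0)                    # delay-line length sets the pitch
    line = np.random.uniform(-1, 1, n)  # noise burst stands in for the pluck
    out = np.empty(int(sr * dur))
    for i in range(out.size):
        out[i] = line[0]
        # loop filter: average the two oldest samples, slightly damped
        new = damping * 0.5 * (line[0] + line[1])
        line = np.roll(line, -1)
        line[-1] = new
    return out

sound = pluck()
```

The delay-line length sets the period, and the low-pass loop filter damps higher harmonics faster than lower ones, which is why the initially chaotic noise burst settles into a nearly harmonic, decaying overtone structure.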

As has been shown, both musical instruments and neural brain networks are self-organizing systems [5]. Both are highly nonlinear and intensely coupled, only to output a very simple behaviour. In terms of musical instruments, this output is the harmonic overtone spectrum, which would not be perfect without the self-organizing process. In terms of music perception and production, pitch, rhythm, melodies, musical form and other features are the results of self-organization and synchronization in the human brain.

Historically, musical instruments have been developed and built by humans over at least the last forty thousand years according to the physiological and physical mechanisms the human brain and body are built of. The human voice, as well as animal vocalization, are also self-organizing processes. They produce sounds which are not often found in non-living systems, namely the harmonicity of the partials. As the voice is meant for communication, harmonic sounds are evolutionarily related to semantics. Therefore the semantics found in music need to be there because they are built into the human auditory system reacting to the harmonicity of the sounds.

Self-organization is the basis of life as compared to dead matter; it turns non-living things to life. Its main issues are the maintenance of life in a destructive world, differentiation into parts to fulfill difficult tasks, and the ability to assimilate in a changing environment. So building musical instruments as self-organizing systems means making them similar to living systems. They exhibit behaviour we know only from animals or humans: harmonic tone production. Musical instrument builders have obviously decided to make this physical feature the core of musical instruments, and therefore the core of music as a cultural phenomenon. So the core of musical instruments is their self-organizing nature. We have built a music culture as artificial life by building musical instruments and performing on them.

A Physical Culture Theory takes culture as a physical and physiological self-organization process building artificial life, and therefore extending our life, by inventing physical tools and processes which again work as self-organizing systems. Therefore the culture we build appears to us as a living system. The music speaks to us; the development of musical styles follows living behaviour; styles are born, live, die and are remembered, become legend. Musicians fuse with their instruments and experience them as having their own life, relating to them very similarly to the way they relate to humans.

COMSAR, as an implementation of many of these mechanisms and systems, therefore approaches music as a living culture, a self-organizing process. Of course it is only an approach as yet; still, extending the system in the future towards more and more precise algorithms and tools is only a matter of time.

Due to the differences between the algorithms discussed above, in the following we give a deeper insight into the main properties of and research done in these fields, without being able to mention all of the work done here. MIR is already implemented in phonogram archives; physical modeling and microphone array techniques are not, and therefore examples of work using these tools are given. Self-organizing maps as a front end for search engines are discussed.

3 TOOLS AND APPLICATIONS FOR COMSAR

In 1978 Halmos, Kőszegi, and Mandler coined the term Computational Ethnomusicology for using MIR also in regard to non-Western music [23]. Since then many tools and applications have been suggested to retrieve, sort and understand musical content from sound.

3.1 BOTTOM-UP MIR TOOLS

In early attempts to extract musical parameters from sound, simple multi-line textures consisting of two voices were considered, which must not have overlapping overtones [44]. More modern approaches include percussion sounds, meter and rhythm estimation [30], are designed for the analysis of harmonic as well as percussive instruments [24], include psychoacoustic knowledge [31], or use different approaches in the matrix domain [14]. Similarity matrices of spectral features, like Mel Frequency Cepstral Coefficients (MFCC), amongst others, have also been proposed to relate parts of a piece, like verse or chorus [20]. Singular Value Decomposition (SVD) has also been employed in this context [21]. Recently TARSOS, a platform to extract pitch

information from sound using 1200 cents per octave, has been developed [62] to suit demands in Computational Ethnomusicology [23].

The first task to accomplish in analysing an audio signal will invariably be the detection of the onset of any given signal event (see [15] for a review). The approaches employed here range from the measurement of strong amplitude rises and phase differences to fluctuation estimation. It appears that the choice of the onset detection algorithm depends on the type of sound to be analyzed. For percussive sounds, measurement of the amplitude rise is sufficient, yet for fusing tones, like piano or violin sounds, measurement of fluctuations seems more promising. As fluctuations on a phase level seem to include both to a great extent, the approach favoured by the applicant and his staff is based on a Modified Modulation Algorithm [2].

A second task is the estimation of pitch, often referred to as f0-estimation, i.e. detecting the fundamental partials of a given harmonic spectrum [25, 29]. An approach used in the context stated in this proposal is based on algorithms like autocorrelation functions, but furthermore employs a correlogram representation for f0 estimation in multi-line textures [11, 6]. This robust method allows the estimation of harmonic overtone structures within very short time frames. Additionally, to estimate whether a piece is single- or multi-line, the fractal correlation dimension is appropriate, as the integer dimension number thus obtained constitutes the number of harmonic overtone series present in a given musical sound [22, 5].

3.2 SELF-ORGANIZING TOOLS

Representation of the results of an analysis for use in IR algorithms has been proposed in several ways. COMSAR uses self-organizing maps (SOM), Hidden Markov Models (HMM), and correlation matrices, all based on the extracted data. Self-organizing Kohonen maps have been proposed for pitch and chord mapping [37, 36] and for sound level assessment [35]; for a review see [5]. This method has also been successfully applied to related fields, such as speech estimation [47] and soundscape mapping. Here the feature vector extracted by the MIR algorithms, consisting of pitch contour, spectral centroid, fluctuations, inharmonicity, etc., is fed into an Artificial Neural Network within a defined training space. After the training process, any such system should be able to identify new feature vectors by itself, will therefore be able to define a parameter space for these features for all of the analysed archival assets, and will be able to detect structural similarities on a best-estimation basis.

3.3 HIDDEN-MARKOV MODEL

A complementary approach to be employed is the implementation of the Hidden Markov Model (HMM), used for stochastically estimating transition probabilities between hidden states which, performed consecutively, result in an event series, as present with both musical rhythm and melody. These models have been used extensively for musical applications [1, 2]. The Markov model consists of musically meaningful states. So when representing, for example, a multi-line rhythm, these states could be bass drum, snare drum, hi-hat, tom-tom, etc. These are mathematically represented as a mixed Poisson distribution. Additionally, a transition matrix between these hidden states is calculated using an Expectation-Maximization (EM) algorithm [69]. Both the Poisson distribution and the transition matrix determine the musical parameters rhythm, melody and texture. This representation may then be compared to all previously analysed assets in the ESRA database, again forming a state space, detecting similarities, relating objects, etc.

4 COMSAR ARCHITECTURE

A MIR-based data infrastructure and classification scheme is to be implemented within the framework of the ESRA database currently under development, to be able to categorize the database content in regard to basic musical parameters derived from the digital audio data stream.

The three main musical parameters treated using the MIR analysis described in this proposal are pitch (melody, texture), rhythm (single- and multi-line), and timbre (single- and multi-line). The MIR structure has two main threads, the timbre thread (TT) and the pitch thread (PT). While TT always deals with the whole sound information, PT performs a pitch extraction from the sounds and proceeds with pitch information only.

4.1 TIMBRE THREAD (TT)

The first step in TT is a segmentation of the audio file in terms of onset detection (OD). Here two main methods are used, the fluctuation method for fusing tones [2] and a simple amplitude model for percussive onsets [29]. From the segments, three MIR estimations are performed: a Timbre Thread Rhythm (TTR), a Timbre Thread Timbre Multi-line (TTM) and a Timbre Thread Timbre Single-line (TTS).

4.1.1 TIMBRE THREAD RHYTHM (TTR)

TTR takes the sound played by several instruments as one; it does not attempt any splitting of compound sounds into individual instrument sounds. As discussed above, retrieving the individual sounds of musical instruments from a multi-instrumental recording is theoretically impossible, because no clear association of all partials with harmonic pitch structures can be assumed from the sound alone without any further knowledge. Still, to be able to deal with more complex rhythms, in the PT section (see below) a multi-line estimation is performed to detect the most probable events without the need to extract the sounds perfectly.

Within the TTR, for each segment a spectral centroid is calculated as the most prominent parameter of timbre perception. The list of centroid values of the onsets found is then fed into a Hidden Markov Model (HMM) using a Poisson Mixture Model (PMM). The results of the HMM are the parameters of the PMM, which represent the rhythmical structure of the centroid values of the onsets. This PMM, as well as the Transition Probabilities (TP), are calculated for all objects in the database, and a correlation matrix between all PMMs and TPs is calculated to relate the different rhythm structures in terms of similarity.

4.1.2 TIMBRE THREAD TIMBRE MULTI-LINE (TTM)

As discussed in the Pitch Thread (PT) section below, it is estimated whether a given recording contains a multi-line or a single-line melody (this may also be judged aurally and used as additional, external input). Additionally, the fractal correlation dimension D of a given piece is calculated for adjacent sound sections of 50 ms. If, 100 ms after the initial transient, D >= 2.0, the sound has more than one harmonic overtone structure and is therefore considered multi-line. Within this definition, all percussion objects are multi-line, too. This is reasonable even if only one drum is played. If the piece is found to be multi-line in nature, the TTM algorithm estimates a feature vector for each segment provided by the onset detector, using the spectral centroid, fluctuations within the steady state of the sound, the amount of chaoticity of the initial transient, and other related features found with timbre perception in multi-dimensional scaling experiments. These features are calculated for adjacent times within each segment to end in a multidimensional trajectory of the sound development, as found crucial to explain the nearly endless possible sounds within a low-dimensional timbre space by adding the temporal development of the sound within this space.

This feature vector is then used as input to a self-organizing Kohonen map. After training, this map constitutes a two-dimensional representation of the objects in the database. All segments of all objects are then fed into the map, where the neuron with maximum similarity to the given segment positions the segment within the map. Therefore, segments or objects can be estimated for similarity from the trained map.

4.1.3 TIMBRE THREAD TIMBRE SINGLE-LINE (TTS)

If a piece is found to be single-line all through, as discussed above, the same procedure is performed, training a Kohonen map with the feature vector of the given sound. Again, the trained map is then able to relate all segments and all pieces and give similarity judgements. The reason why the single- and multi-line cases are separated is to have one map which is able to classify single instruments alone, while the other is able to deal with orchestrated multi-instrument sounds. So if a musical instrument is to be judged in terms of similarity, the TTS can be used. Another reason is the problem of dealing with the different pitches of the sounds. The TTS map will classify both pitch and timbre. As pitch is the most prominent factor in musical instrument similarity judgements, the map will have different regions for different pitches. Then, within each region, the differentiation in terms of timbre is present. This is automatically performed by the map. Still, it is necessary, as one instrument may sound

considerably different within different registers. Differences in articulation within one pitch region will again be met by the differentiation of the map within the pitch region of the sound investigated. This cannot be done with multi-line sounds, as here virtually endless possibilities of pitch combinations can be present.

4.2 PITCH THREAD (PT)

PT represents a piece on the score level, although of course it also needs to start from the recorded sound. So PT first performs a pitch extraction, both single- and multi-line. Two main algorithms are used here, the correlogram for multi-line and the autocorrelation function for single-line sounds. The correlogram detects whole overtone structures, which are related to pitch, and finds the basic frequency for them within small time frames of about 20 ms. As it also displays multiple harmonic series, the pitches of different instruments can be detected with high frequency resolution. If a piece is single-line, this algorithm can be used, too. Additionally, an autocorrelation estimation over small time frames of again about 20 ms is performed, adding information to the correlogram. The result is a temporally and spectrally high-resolution function of the harmonic overtone series, the pitches, over time, for the whole object. From this pitch texture, again three musical parameters are calculated: the Pitch Thread Rhythm (PTR), the Pitch Thread Texture (PTT), and the Pitch Thread Melisma (PTM).

4.2.1 PITCH THREAD RHYTHM (PTR)

For PTR, note onsets are calculated from the pitch texture to end up in a musical score. This score consists of pitch events over time, both in terms of Western pitch classes and in terms of their microtonal precision of up to 1200 cents per octave. As with the TTR, a Hidden Markov Model (HMM) is used with a Poisson Mixture Model (PMM), which has as many hidden states as there are different pitches appearing in the given object. The PMM and the Transition Probability (TP) calculated by the HMM then represent the objects, which can therefore be related in terms of similarity.

4.2.2 PITCH THREAD TEXTURE (PTT)

Here, the onsets are again calculated from the pitch texture, again to end in a score. For PTT, this score itself is used to correlate the objects in terms of their similarities. PTT is meant for objects with no glissando, where the pitch texture holds the main information of the object.

4.2.3 PITCH THREAD MELISMA (PTM)

Contrarily, PTM is meant for objects in which pitch changes are very important, in terms of vibrato, glissandi, etc. Again, after performing onset detection, the objects are divided into small segments for each played note. Still, all detailed pitch information over time within the segments is preserved here. Again, a Kohonen map is used to represent possible ornamentations, melismata, glissandi, or vibrato. So here special ornamentations or melismata can be compared, rather than whole objects themselves.

4.3 SUMMARY OF THREADS

After performing all these analyses on all objects, six parameters result, which estimate the basic musical parameters timbre, pitch, and rhythm for each piece:

Timbre
- Kohonen map of multi-line timbre (TTM)
- Kohonen map of single-line timbre, musical instrument sounds (TTS)

Pitch
- Score for multi- and single-line objects (PTT)
- Kohonen map of melisma for all segments of all objects (PTM)

Rhythm
- Hidden Markov Model of multi-line fused sounds based on sound level (TTR)
- Hidden Markov Model of multi- and single-line objects based on pitch level (PTR)

With these six models all objects within the archive can be compared in terms of all the sub-features present in the models. Also, new objects or sounds can be compared to all the existing features in the models. The similarities proposed by the algorithms then need to be judged by listeners and experts.
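How transition probabilities can act as a similarity representation, as in the TTR and PTR models summarized above, can be sketched with plain first-order Markov transition matrices standing in for the full HMM/PMM machinery; the three-state event sequences below are invented for illustration.

```python
import numpy as np

def transition_matrix(events, n_states):
    """Row-normalized count matrix of state-to-state transitions."""
    T = np.zeros((n_states, n_states))
    for a, b in zip(events[:-1], events[1:]):
        T[a, b] += 1
    rows = T.sum(axis=1, keepdims=True)
    return np.divide(T, rows, out=np.zeros_like(T), where=rows > 0)

def similarity(T1, T2):
    """Pearson correlation of the flattened transition matrices."""
    return np.corrcoef(T1.ravel(), T2.ravel())[0, 1]

# invented rhythm patterns over 3 states (e.g. bass drum, snare, hi-hat)
a = [0, 2, 1, 2, 0, 2, 1, 2, 0, 2, 1, 2]
b = [0, 2, 1, 2, 0, 2, 1, 2, 0, 2, 1, 0]   # near-identical pattern
c = [0, 0, 1, 1, 2, 2, 0, 0, 1, 1, 2, 2]   # different pattern
Ta, Tb, Tc = (transition_matrix(x, 3) for x in (a, b, c))
assert similarity(Ta, Tb) > similarity(Ta, Tc)
```

Each object in the archive would contribute one such matrix, and the pairwise correlations form the similarity matrix used to relate objects, which is the role the PMM/TP correlation matrix plays in the threads above.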

5 INCLUDING MUSICAL INSTRUMENT MEASUREMENTS AND MODELING IN COMSAR

Organology was part of understanding and systematizing music as part of the Hornbostel/Sachs classification. Musical instrument dictionaries like those cited above describe thousands of musical instruments from all over the world. Features like their origin and use in the musical culture, the material they are built of or the building process are documented. These dictionaries are a useful source when it comes to identifying instruments collected in the field and for giving background information about them.

Still, research has advanced far in terms of the acoustics, properties and building processes of musical instruments. The basic principles of how musical instruments vibrate and radiate sound have been studied extensively (for reviews see3 19). Many musical instruments have been investigated in great detail, mainly those of the West, but also many others all over the world. The materials used are known in terms of their material parameters like Young's modulus, density or internal damping. The building processes of many instruments have been described not only as plain craftsmanship but also in terms of the acoustical and musical function by which these processes are motivated.

Several theoretical frameworks on the acoustical properties of musical instruments have been developed over the last decades, making it possible to classify them not only in terms of their driving mechanisms, as Hornbostel/Sachs have done, but by their physical mechanisms and features. The Impulse Pattern Formulation (IPF) considers musical instruments as working with short impulses caused by one part of the instrument, e.g. the force of a string acting on a body or the pressure impulse of wind instruments produced at the player's mouth. These impulses are transferred to other parts of the instrument, like top and back plates, rims and ribs, are filtered there and return to their origin. In a self-organizing process this system starts with a transient phase which is complex and chaotic, only to organize itself after a short time of at most 50 ms and end in a harmonic overtone sound radiation5. Other proposals are that of a nonlinear driving generator and a linear resonator which interact, producing sound, or that of phase-locking of different partials (for a review see5).

Also the geometry of the instruments is known in great detail. High-resolution Computer Tomography (CT) scans of whole instrument geometries display the instruments with resolutions of a fraction of a millimeter. From these results material properties can be derived, like density, speed of sound, diffraction or internal damping. Therefore the rough estimations of geometrical data have been replaced by detailed and precise measurements.

The radiation of musical instruments has also been measured extensively, as only during the last years technological advances have made it possible to record single instruments with microphone arrays consisting of up to 128 microphones when recording single sounds, and up to several thousand microphone positions when recording multiple sound instances (for a review see4). Some techniques allow the back-propagation of the radiated sound to the radiating surface, the musical instrument geometry. This means a measurement of the instrument vibration all over its geometry, and therefore a measurement of the internal vibrational energy distribution, of the role of geometrical parts in the acoustical output, or an estimation of the radiation of the instrument at any place in a performance space.

Physical modeling of musical instruments has also been performed extensively over the last decade (for a review see7 9 3). Here the differential equations governing the vibration of the instruments are used together with a geometrical model of the instrument to make this virtual instrument vibrate in silico. High temporal and spatial resolution allows the precise modeling of the instrument and the production of a sound very close to the original sound of the instrument. By changing the mathematical model, the role of different kinds of vibrations, couplings or instrument parts can be shown. By changing the geometry or the material parameters, the instruments can be understood in terms of why which geometry is used and how changes would alter the sound of the instrument. The use of geometrical changes or alternative materials can be tested before building the instrument, a property needed nowadays as climate change forces new wood species to be planted, e.g. in Europe.

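The microphone-array measurements and the back-propagation of radiated sound described above rest on classical beamforming ideas. The following sketch is purely illustrative and not the processing chain of any archive mentioned here: the 8-microphone line array, the 2 cm spacing and the 4 kHz test tone are invented example values. A delay-and-sum beamformer then estimates the direction a plane wave arrives from:

```python
import numpy as np

# Invented example parameters: 8-mic line array, 2 cm spacing,
# 4 kHz test tone, speed of sound 343 m/s, 48 kHz sampling rate.
N_MICS, SPACING, FREQ, C, SR = 8, 0.02, 4000.0, 343.0, 48000
t = np.arange(2048) / SR

def mic_delays(angle_deg):
    """Per-microphone arrival delays of a plane wave from angle_deg."""
    return np.arange(N_MICS) * SPACING * np.sin(np.radians(angle_deg)) / C

def simulate_array(angle_deg):
    """Each microphone receives the same tone, shifted by its delay."""
    return np.stack([np.sin(2 * np.pi * FREQ * (t - d))
                     for d in mic_delays(angle_deg)])

def delay_and_sum(signals, angle_deg):
    """Advance every channel by the delay expected for angle_deg and average."""
    shifts = np.round(mic_delays(angle_deg) * SR).astype(int)
    return np.mean([np.roll(sig, -s) for sig, s in zip(signals, shifts)], axis=0)

def estimate_direction(signals):
    """The steered sum is loudest when steering matches the true direction."""
    angles = np.arange(-90, 91)
    power = [np.mean(delay_and_sum(signals, a) ** 2) for a in angles]
    return int(angles[np.argmax(power)])
```

Back-propagating sound to the radiating surface of an instrument generalizes this delay logic from a handful of candidate directions to many candidate source points on the measured geometry, which is why the large arrays and the CT-scanned geometries mentioned above belong together.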

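The in-silico vibration of a virtual instrument can be sketched for the simplest possible case, a string with fixed ends governed by the one-dimensional wave equation and solved with an explicit finite-difference scheme. The string length, target fundamental and grid resolution below are illustrative placeholders, not values taken from any measured instrument:

```python
import numpy as np

# Illustrative values: a 65 cm string, wave speed chosen so that the
# fundamental f0 = c / (2 L) is 110 Hz (roughly an A2 guitar string).
L, F0 = 0.65, 110.0
C = 2 * L * F0                 # transverse wave speed in m/s
N = 100                        # spatial grid points
dx = L / (N - 1)
dt = 0.5 * dx / C              # time step satisfying the CFL stability limit
STEPS = 40000

# Initial condition: string 'plucked' into a triangle, zero initial velocity.
x = np.linspace(0.0, L, N)
pluck = 0.8 * L
y = np.where(x < pluck, x / pluck, (L - x) / (L - pluck))
y_prev = y.copy()

r = (C * dt / dx) ** 2         # squared Courant number
bridge = []                    # displacement 'recorded' near one fixed end
for _ in range(STEPS):
    y_next = np.zeros_like(y)  # end points stay at zero (fixed boundaries)
    y_next[1:-1] = (2 * y[1:-1] - y_prev[1:-1]
                    + r * (y[2:] - 2 * y[1:-1] + y[:-2]))
    y_prev, y = y, y_next
    bridge.append(y[1])

# The spectrum of the synthetic signal peaks at the fundamental c / (2 L).
spec = np.abs(np.fft.rfft(bridge))
freqs = np.fft.rfftfreq(len(bridge), d=dt)
mask = freqs > 20.0            # ignore the DC bin
f_est = float(freqs[mask][np.argmax(spec[mask])])
```

Replacing the one-dimensional spatial operator with plate or air-column operators, using geometry and material parameters obtained from CT scans as discussed above, leads towards the full instrument models the text refers to; changing C here already demonstrates how a material change shifts the resulting pitch.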
All these contemporary features of musical instrument research, models and experimental setups need to become part of a phonogram archive standard in the near future. Two examples are given below, one about the use of microphone arrays to measure the acoustical behaviour of a lounuet, a New Ireland friction instrument, and one about the use of paste to tune the membrane of a Burmese pat wain. Both examples show the clarification of ethnomusicological questions about the instruments. A modern way of using microphone-array measurements, high-resolution CT scans or physical modeling still needs to be developed and faces several issues, like server space, computational capability of servers or the retrieval of the information necessary to build such models. Still, solutions exist for all these aspects, so that the implementation of these features in COMSAR is mainly a matter of time and energy and not a fundamental problem.

The advantages of such methods are many. Instrument builders could go online and look for the instrument they build, or for similar ones, understand more about its vibration, radiation or the influence of several parts or materials on the sound, introduce changes of material or parameters online and listen to the resulting sound of such a new instrument instantaneously. They could then decide to build such an instrument or think of different changes. They could also be inspired by similar instruments, their building processes, materials or sounds, and decide to try new instruments similar to their traditional one.

Researchers could estimate how important different aspects of an instrument are for sound production. Some building processes are decisive for the sound, some may be needed in terms of rituals connected to the building process, and some might be pure myths, handed down by tradition rather than based on a sound idea, and unnecessary or even unwanted today. In the history of instrument building of Western instruments, like violins, pianos, trumpets or guitars, many of these myths have been identified over the years, and from an ethnomusicological standpoint it is important to know which stories are true and which are not.

Also from an educational point of view such a system would be highly attractive to young people who are interested in the music of the world and used to using the internet, search engines and simple music production systems to be creative. Such tools could be rated as 'cool' and contemporary, and therefore be used with the by-effect of making their users understand the principles of musical instruments and their use.

In terms of replacing wood and other natural materials, which are becoming scarce nowadays, with artificial materials, metamaterials or the like, such tools would be highly welcome, too, as everybody could try online how such changes would affect the sound and whether they might be used.

Many other applications are to be expected from an inclusion of these techniques in phonogram archive standards in the near future.

6 CONCLUSION

A Computational Music and Sound Archive does not only fit the needs of sorting and analyzing music automatically in times of Big Data and digital accessibility of music. It is also a way to understand music and culture in terms relating cultures and ethnic groups rather than stressing the dissimilarities between them. It is also objective in a way that omits cultural bias. It therefore suggests a view of music as a complex system rather than a sorting into genres or styles. Such a Physical Culture Theory is thus able both to cope with the complexity of today's reality and to suggest a new and fresh look on music in the world.

7 REFERENCES

[1] Aucouturier, J.-J. and M. Sandler 2001: Segmentation of Musical Signals Using Hidden Markov Models. In Proc. of the Audio Engineering Society 110th Convention.

[2] Alexandraki, Ch. and R. Bader 2013: Real-time Concatenative Synthesis for Networked Musical Interactions. J. Acoust. Soc. Am. 133, 3367, 2013.

[3] Bader, R.: Springer Handbook of Systematic Musicology. Springer, Berlin, Heidelberg 2018.


[4] Bader, R.: Microphone Array. In: T. Rossing (ed.): Springer Handbook of Acoustics. 1179-1207, 2014.

[5] Bader, R. 2013: Nonlinearities and Synchronization in Musical Acoustics and Music Psychology. Springer Series Current Research in Systematic Musicology, Vol. 2. Springer, Heidelberg.

[6] Bader, R. 2011: Buddhism, Animism, and Entertainment in Cambodian Melismatic Chanting smot. In: A. Schneider and A. von Ruschkowski (eds.): Hamburg Yearbook of Musicology 28.

[7] Bader, R. & Hansen, U.: Acoustical Analysis and Modeling of Musical Instruments Using Modern Signal Processing Methods. In: D. Havelock, M. Vorländer, S. Kuwano (eds.): Handbook of Signal Processing in Acoustics. Springer, 219-247, 2008.

[8] Bader, R.: Finite-element calculation of a bass drum. J. Acoust. Soc. Am. 119, 3290, 2006.

[9] Bader, R.: Computational Mechanics of the Classical Guitar. Springer, Heidelberg 2000.

[10] Benetos, E. & Dixon, S.: Multiple-instrument polyphonic music transcription using a temporally constrained shift-invariant model. J. Acoust. Soc. Am. 133 (3), 1727-1741, 2013.

[11] Chai, W. and B. Vercoe 2003: Structural Analysis of Musical Signals for Indexing and Thumbnailing. In Proc. ACM/IEEE Joint Conference on Digital Libraries.

[12] Copeland, P. 2008: Manual of Analogue Sound Restoration Techniques. London: The British Library.

[13] Elschek, O. et al. 2001: Digitizing world music. Digitalisierung von Weltmusik. Special issue, Systematische Musikwissenschaft/Systematic Musicology Vol. VII, no. 3.

[14] Dessein, A., A. Cont and G. Lemaitre 2010: Real-time polyphonic music transcription with non-negative matrix factorization and beta-divergence. In Proceedings of the 11th International Society for Music Information Retrieval Conference (ISMIR), pp. 489-494.

[15] Dixon, S. 2006: Onset detection revisited. In Proc. of the Int. Conf. on Digital Audio Effects (DAFx-06), pp. 133-137.

[16] Downie, J. S. 2003: Music information retrieval. Annual Review of Information Science and Technology 37, 295-340. (Available from http://musicir.org/downie mir arist37.pdf)

[17] Fingerhut, M. 2004: Music information retrieval, or how to search for (and maybe find) music and do away with incipits. Proc. IAML-IASA Joint Congress, Music and Multimedia, Oslo, August 2004. (Available from http://www.ismir.net/admin/ismirbookle.pdf)

[18] Fischer, J.: Numerical simulations of the turbulent flow and the sound field of the Turkish ney end-blown flute. J. Acoust. Soc. Am. 141, 3960, 2017.

[19] Fletcher, N. & Rossing, Th. D.: Physics of Musical Instruments. Springer, October 2005.

[20] Foote, J.: Automatic Audio Segmentation Using a Measure of Audio Novelty. In Proc. ICME'00, 2000.

[21] Foote, J. and M. Cooper 2003: Media Segmentation Using Self-Similarity Decomposition. In Proc. of SPIE Storage and Retrieval for Multimedia Databases, Vol. 5021.

[22] Gibiat, V. & Castellengo, M. 2000: Period Doubling Occurrences in Wind Instruments Musical Performance. Acustica 86, 746-754.

[23] Gómez, E., Herrera, P. & Gómez-Martín, F. 2013: Computational Ethnomusicology: perspectives and challenges. J. New Music Research 42:2, 111-112.

[24] Goto, M. 2001a: An Audio-based Real-time Beat Tracking System for Music With or Without Drum-sounds. Journal of New Music Research 30(2), 159-171.

[25] Goto, M. 2001b: A predominant-f0 estimation method for real-world musical audio signals: MAP estimation for incorporating prior knowledge about f0s and tone models. In Proc. Workshop on Consistent and Reliable Acoustic Cues for Sound, 1-4.

[26] Harwood, D. L. 1976: Universals in Music: a Perspective from Cognitive Psychology. Ethnomusicology, 20, 521-533.

[27] Hornbostel, E. M. v. & Sachs, C.: Systematik der Musikinstrumente. Ein Versuch. [Classification of Musical Instruments: An Attempt.] Zeitschrift für Ethnologie 46, 4-5, 553-590, 1914.

[28] International Association of Sound and Audiovisual Archives Technical Committee (eds.) 2005: IASA-TC 03 Standards, Recommended Practices and Strategies: the Safeguarding of the Audio Heritage: Ethics, Principles and Preservation Strategy. http://www.iasaweb.org/IASA TC03/TC03 English.pdf, as of 2013-09-13.

[29] Klapuri, A. & M. Davy (eds.): Signal Processing Methods for Music Transcription. Springer-Verlag, New York, 2006.

[30] Klapuri, A.: Signal Processing Methods for the Automatic Transcription of Music. PhD Dissertation, Tampere University of Technology, 2004.

[31] Klapuri, A.: Sound onset detection by applying psychoacoustic knowledge. 1999 IEEE International Conference on Acoustics, Speech and Signal Processing Proceedings (ICASSP'99), Vol. 6, 3089-3092, 1999.

[32] Kolinski, M. 1957: The Determinants of Tonal Construction in Tribal Music. The Musical Quarterly, 43(1), 50-56.

[33] Kolinski, M. 1978: The Structure of Music: Diversification Versus Constraint. Ethnomusicology, 22(2), 229-244.

[34] Kolinski, M. 1982: Reiteration Quotients: a Cross-Cultural Comparison. Ethnomusicology, 26(1), 85-90.

[35] Kostek, B. 2005: Perception-based Data Processing in Acoustics. Applications to Music Information Retrieval and Psychophysiology of Hearing. Springer.

[36] Leman, M. 1995: Music and Schema Theory. Berlin, Springer.

[37] Leman, M. and F. Carreras 1997: Schema and Gestalt: Testing the Hypothesis of Psychoneural Isomorphism by Computer Simulation. In: M. Leman (ed.): Music, Gestalt, and Computing. Studies in Cognitive and Systematic Musicology. Springer, Berlin, 144-168.

[38] Lomax, A. and N. Berkowitz 1972: The Evolutionary Taxonomy of Culture. Science 177(4045), 228-239.

[39] Lomax, A. 1976: Cantometrics: an Approach to the Anthropology of Music. Berkeley: University of California Extension Media Center.

[40] Moelants, D. et al. 2007: The Problems and Opportunities of Content-based Analysis and Description of Ethnic Music. International Journal of Intangible Heritage 2, 57-67.

[41] Moore, T.: Measurement Techniques. In: R. Bader (ed.): Springer Handbook of Systematic Musicology. Springer, Berlin, Heidelberg, 81-103, 2018.

[42] Pfeifle, F.: Real-Time Signal Processing on Field Programmable Gate Array Hardware. In: R. Bader (ed.): Springer Handbook of Systematic Musicology. Springer, Berlin, Heidelberg, 385-417, 2018.

[43] Pfeifle, F. and Bader, R.: Real-Time Finite-Difference Method Physical Modeling of Musical Instruments Using Field-Programmable Gate Array Hardware. J. Audio Engineering Society 63 (12), 1001-1016, 2015.

[44] Piszczalski, M. and B. Galler 1977: Automatic Music Transcription. Computer Music Journal, 1(4), 24-31.

[45] Plath, N.: High-speed camera displacement measurement (HCDM) technique of string vibrations. Proceedings of the Stockholm Music Acoustics Conference, 188-192, 2013.

[46] Proutskova, P. 2007: Musical Memory of the World: Data Infrastructure in Ethnomusicological Archives. Proc. of the 8th Intern. Conf. on Music Information Retrieval (ISMIR), 2007.

[47] Rabiner, L. R. and B. H. Juang 1993: Fundamentals of Speech Recognition. Prentice Hall Signal Processing Series.

[48] Rohrmeier, M. and Pearce, M.: Musical Syntax I: Theoretical Perspectives. In: R. Bader (ed.): Springer Handbook of Systematic Musicology. Springer, Berlin, Heidelberg, 473-486, 2018.

[49] Pearce, M. and Rohrmeier, M.: Musical Syntax II: Empirical Perspectives. In: R. Bader (ed.): Springer Handbook of Systematic Musicology. Springer, Berlin, Heidelberg, 487-505, 2018.

[50] Rohrmeier, M. A. & Cross, I. 2014: Modelling unsupervised online-learning of artificial grammars: Linking implicit and statistical learning. Consciousness and Cognition: An International Journal, 27, 155-167.

[51] Ruschkowski, A.: Elektronische Klänge und musikalische Entdeckungen [Electronic Sounds and Musical Discoveries]. Reclam, Ditzingen 1998.

[52] Rzeszutek, T., Savage, P. E. & Brown, S. 2012: The Structure of Cross-Cultural Musical Diversity. Proceedings of the Royal Society B: Biological Sciences, 279, 1602-1612.

[53] Scavone, G.: Delay-Lines and Digital Waveguides. In: R. Bader (ed.): Springer Handbook of Systematic Musicology. Springer, Berlin, Heidelberg, 259-272, 2018.

[54] Sachs, C.: Real-Lexikon der Musikinstrumente: zugleich ein Polyglossar für das gesamte Instrumentengebiet. [Encyclopedia of Musical Instruments, at the Same Time a Polyglossary for the Entire Field of Instruments.] Bard, Berlin 1913.

[55] Sachs, C.: Geist und Werden der Musikinstrumente. [Spirit and Becoming of Musical Instruments.] Berlin: Reimer, 1929.

[56] Sadie, S.: The New Grove Dictionary of Musical Instruments. Vol. 2. Macmillan Press Limited, 1984.

[57] Savage, P. E., Merritt, E., Rzeszutek, T. & Brown, S. 2012: CantoCore: a New Cross-Cultural Song Classification Scheme. Analytic Approaches to World Music, 2(1), 87-137.

[58] Scavone, G.: Delay-Lines and Digital Waveguides. In: R. Bader (ed.): Springer Handbook of Systematic Musicology. Springer, Berlin, Heidelberg, 259-272, 2018.

[59] Schneider, A. 2001: Sound, Pitch, and Scale: from Tone Measurements to Sonological Analysis in Ethnomusicology. Ethnomusicology, 45(3), 489-519.

[60] Schneider, A. 2006: Comparative and Systematic Musicology in Relation to Ethnomusicology: A Historical and Methodological Survey. Ethnomusicology, 50(2), 236-258.

[61] Schneider, A. 2018: Systematic Musicology: a Historical Interdisciplinary Perspective. In: R. Bader (ed.): Springer Handbook of Systematic Musicology, 1-24, Springer, Heidelberg, Berlin.

[62] Six, J., O. Cornelis and M. Leman 2013: Tarsos, a Modular Platform for Precise Pitch Analysis of Western and Non-Western Music. J. New Music Research 42:2, 113-129.

[63] Tenzer, M. (ed.) 2006: Analytical Studies in World Music. New York: Oxford University Press.

[64] Tenzer, M. & Roeder, J. (eds.) 2011: Analytical and Cross-Cultural Studies in World Music. New York: Oxford University Press.

[65] The National Recording Preservation Board of the Library of Congress (eds.) 2010: The State of Recorded Sound Preservation in the United States: a National Legacy at Risk in the Digital Age. Washington, D.C.: Council on Library and Information Resources and The Library of Congress.

[66] The National Recording Preservation Board of the Library of Congress (eds.) 2012: The Library of Congress National Recording Preservation Plan. Washington, D.C.: Council on Library and Information Resources and The Library of Congress.

[67] Toiviainen, P. (ed.) 2009: Musical Similarity [Special issue]. Musicae Scientiae, 13(1 suppl).

[68] Toiviainen, P. and Eerola, T. 2001: A Method for Comparative Analysis of Folk Music Based on Musical Feature Extraction and Neural Networks. In: H. Lappalainen (ed.): Proceedings of the VII International Symposium of Systematic and Comparative Musicology and the III International Conference on Cognitive Musicology, pp. 41-45. Jyväskylä: University of Jyväskylä.

[69] Zucchini, W. and I. L. MacDonald 2009: Hidden Markov Models for Time Series. An Introduction Using R. Chapman & Hall.