MIDI Implementation of a Modal Synthesizer using the VST2.4 Standard

Ann Franchesca B. Laguna, Nicanor Marco P. Valdez and Rowena Cristina L. Guevara
Digital Signal Processing Laboratory, Electrical and Electronics Engineering Institute, University of the Philippines Diliman

Abstract—Philippine Indigenous Music is slowly disappearing partly because of insufficient accessibility to Philippine indigenous musical instruments. This project aims to give musicians access to the sounds of the Philippine instrument known as the kulintang through a synthesizer plug-in. This paper ventures into the implementation of a Virtual Studio Technology Instrument (VSTi) plug-in containing kulintang sounds analyzed using the Modal Distribution and synthesized using Sum of Sinusoids (SOS) Synthesis. The VSTi plug-in was implemented in C++ using the VST2.4 SDK (Software Development Kit). The plug-in is controlled by Musical Instrument Digital Interface (MIDI) events. The Filipino Kulintang MIDI (FKM1.0) was proposed as an initial MIDI mapping standard for Filipino kulintang instruments. A VST Dynamic Link Library (DLL) containing the kulintang synthesizer was successfully ported and controlled by a Digital Audio Workstation (DAW). Twenty-three tuning presets with eight gongs each were included in the library.

Keywords-Kulintang, Virtual Studio Technology, Philippine Indigenous Music, Audio Synthesis.

I. INTRODUCTION

Philippine indigenous music has been declining because it has not been able to sustain itself in the mainstream music scene. The number of practitioners has also been decreasing due to the increasing influence of western-style music in the Philippines. The goal of this project is to give contemporary Philippine artists easier access to sounds from their musical roots in the hope that it will bring about a resurgence of Philippine culture in modern Philippine music. Moreover, the project also aims to provide a pedagogical tool for teaching Filipino musical culture.

The kulintang is one of the most popular Indigenous Philippine musical instruments. This project implements a Kulintang VSTi plug-in which allows a DAW to play kulintang sounds via MIDI data, and it initiates a standard MIDI mapping scheme for Philippine Indigenous Instruments. The synthesis is adapted from the Modal Distribution frequency and amplitude estimates obtained by Agsaway et al. [1]. Sum of Sinusoids (SOS), or additive synthesis, was then used in the implementation of the synthesizer.

This paper is organized as follows: a short background on the Philippine kulintang ensemble and Virtual Studio Technology is presented first, followed by the methodology, the conclusions of the project and, finally, recommendations and topics of interest for future work.

II. BACKGROUND

A. Philippine Kulintang Ensemble

The kulintang is commonly played as part of a kulintang ensemble. A kulintang is typically made up of five to nine graduated bossed gongs that are horizontally suspended on a wooden platform. It is played by striking the bosses with a stick. The gongs are usually made of brass and have a diameter of 53 cm to 77 cm, and a depth of 6 cm to 7 cm. A set is usually played by one player who uses two sticks to strike as many as two gongs at a time. A kulintang ensemble typically consists of a kulintang, a mid-range gong called babandir, a drum called dabakan, one or two large gongs called agong, and four suspended gongs. Each Southern Philippine Indigenous Group has its own variation of the ensemble. A notation system using numbers on a tablature that denote the rhythm and gong type was created by Butocan, A.M. et al. [3].

For this project, the focus will be on the kulintang instrument. The other instruments in the ensemble will be addressed in future studies.

TABLE I. KULINTANG PRESETS

Federizon Scale Number   Federizon Scale Label          Kulintang Preset    Number of Gongs
N/A                      N/A                            Agsaway             8
1                        DMR                            DMR1                8
2                        Santos                         Santos              8
3                        DMR                            DMR2                8
4                        Dioquino                       Dioquino            8
5                        DMR                            DMR3                8
6                        DMR                            DMR4                8
7                        DMR                            DMR5                8
8                        DMR                            DMR6                8
9                        Laureola                       Laureola            8
10                       Maceda                         Maceda              8
11                       Cacnio                         Cacnio              8
12                       Balingit                       Balingit            8
13                       Villanueva                     Villanueva          8
14                       Arambulo                       Arambulo            8
15                       DMR                            DMR7                8
16                       Magind. 10 Tape SD 1 R 17      Maguindanao10       8
17                       Magind. 9 Tape SD 1 R 2        Maguindanao9        8
18                       Magind. 11 Tape SD 1 R 12      Maguindanao11       8
19                       Magind. 14 Tape SD 1 R 12      Maguindanao14       8
20                       Magind. 23 Tape SD 1 R 22      Maguindanao23R22    11
21                       Magind. 23 Tape SD 1 R 1       Maguindanao23R1     8
22                       Magind. 5 Tape 1 SD 1 R 6      Maguindanao5        8
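The tunings behind the presets in Table I were recorded as a Western note plus a deviation in cents, and the paper converts them to Hz. A minimal sketch of such a conversion is given here. It assumes twelve-tone equal temperament with A4 = 440 Hz and MIDI-style note numbers (C4 = 60); the function name noteCentsToHz is hypothetical and not taken from the plug-in source.

```cpp
#include <cassert>
#include <cmath>

// Convert a note number plus a deviation in cents to a frequency in Hz.
// Assumption: equal temperament, A4 (note 69) = 440 Hz, 100 cents per semitone.
double noteCentsToHz(int midiNote, double cents) {
    const double semitonesFromA4 = (midiNote - 69) + cents / 100.0;
    return 440.0 * std::pow(2.0, semitonesFromA4 / 12.0);
}
```

Under these assumptions, noteCentsToHz(60, 0.0) gives approximately 261.6 Hz for C4 and noteCentsToHz(82, 0.0) gives approximately 932.3 Hz for A#5, matching the range quoted for Federizon's note values.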

Kulintang tunings do not follow the Western musical scale because they belong to the Gong-Chime culture of Southeast Asia. Federizon analyzed a list of Kulintang tunings using a Stroboconn in the 1960s [4]. The results are available in the University of the Philippines Center for Ethnomusicology (UPCE). The data is in the form of musical notes on a G-clef with the variation in cents from the actual western note marked underneath. His data included 22 unique scales. Each of the scales of Federizon has 8 gongs except for one which has 11. Note values range from C4 (261.6 Hz) to A#5 (932.3 Hz). A list of the Kulintang Presets is given in Table I.

Multiple recordings of the instruments in a kulintang set with 8 gongs were made and analyzed in [1]. The recordings exhibit various ways of playing a kulintang gong, such as striking it then letting the sound decay, and striking it then muffling it with a hand. The modal partials generated from the first 8 recordings (which were generated by striking each gong on the boss with a stick) were used in this study.

The modal data does not contain the actual gong waveforms but the amplitudes and frequencies of different partials at different time frames. The modal partials per waveform were summarized into two matrices: one for amplitude and one for frequency. Each column corresponds to a time frame. For the frequency matrix, each row holds the instantaneous frequencies (in Hz) of one partial across the time frames. For the amplitude matrix, each row holds the instantaneous amplitudes of one partial across the time frames. Excerpts from the modal database are given in Tables II and III as examples.

TABLE II. EXCERPT FROM THE AMPLITUDE MATRIX OF KULINTANG1_01_BEED_EST.TXT

                 Time Frame (0.005828 s)
Partial          1 of 53       2 of 53       3 of 53
11 of 215        0.000479      0.000538      0.000479
12 of 215        0.000321      0.000362      0.000326
13 of 215        0.000180      0.000305      0.000276

TABLE III. EXCERPT FROM THE FREQUENCY MATRIX OF KULINTANG1_01_BEED_EST.TXT

                 Time Frame (0.005828 s)
Partial          1 of 53       2 of 53       3 of 53
11 of 215        273.589056    273.367035    273.124425
12 of 215        298.931976    299.572594    300.303680
13 of 215        322.931685    323.147098    324.335863

B. Virtual Studio Technology

Virtual Studio Technology (VST) was developed by Steinberg Media Technologies in order to allow the simulation of real world instruments and audio effects in a Digital Audio Workstation (DAW). The VST Software Development Kit (SDK) is coded in C++ [2].

VST has two main components: the host and the plug-in. The Virtual Studio Technology Instrument (VSTi) is a special type of VST plug-in for playing and mapping synthesized instruments. Plug-ins come in the form of Dynamic Link Library (DLL) files. The DLL files are used by a VST Host, such as a DAW, to implement audio processing algorithms. Fig. 1 shows the interaction between a VST Host and a VSTi plug-in.

Fig. 1. VST Host and VSTi Plug-in Interaction

The current VST version is VST 3.0. However, most DAWs only support VST 2.4. Starting with VST 2.0, VST modules have been capable of accepting Musical Instrument Digital Interface (MIDI) events needed to control the audio output. Synthesizer modules can be implemented using the VST standard in order to create plug-ins that can be imported in DAWs.

To create a VSTi, a new class for the new plug-in must be created by inheriting from the AudioEffectX class, which can be found in the SDK. The three most relevant functions of this class are:

• the constructor
• processEvents
• processReplacing

The constructor is called by the host to create the plug-in and must contain all the initialization functions.

The processEvents function is called whenever the plug-in receives events such as MIDI data from the host. All MIDI handling starts from the function processEvents.

The processReplacing function handles the audio input and output of the plug-in. The function is called continuously: it checks for input data to process, for output data to return to the host, and for changes in the state of the plug-in that determine whether it should write data to the output. For a VSTi, there is no audio input. The function deals with the output buffer through frames of a variable length related to the sound driver settings. This length is often very small, such as 256 samples.

In VST terminology, a parameter is a variable that the user can edit while a program is a set of preset parameters.

III. METHODOLOGY

A. Generation of Kulintang Program Partials

The note and cent values from Federizon's kulintang scales were converted to Hz and tabulated. These frequencies were used as the basis for each of the preset programs. It was noticed that Federizon's note values were an octave below the fundamental frequencies obtained from the recordings. The frequencies tabulated from Federizon were doubled to bring them up to the range of the gong recordings.

Partials were generated for each of the gongs in the kulintang scales by shifting the values in the frequency matrix of the partials of [1] to the frequency range of Federizon's pitches.

The fundamental frequency of each of the gong recordings in [1] was estimated in order to know how much each partial should be shifted. The peak of a dB plot of the mean partial amplitudes across time frames against the mean partial frequencies across time frames was used to find the fundamental. The results were aurally validated by comparing each gong's pitch and fundamental with the pitch it seemed closest to in the western musical scale.

From the initial analysis, it was noticed that the recordings were contaminated by 60 Hz mains hum. The hum was removed by deleting all the rows in both the amplitude and frequency matrices with frequencies close to 60 Hz and its harmonics.

Each program contains a set of partials for 8 gongs. For the scale which had 11 gongs, only the first 8 were used in its corresponding program.

The partials were generated by taking the original partials and shifting their frequency data by the difference between double the frequency found in Federizon's scales and the gong's original mean fundamental frequency. The partial data used depended on the gong number of the set (e.g., Gong 3 of DMR1 was based on Gong 3 of [1], Gong 7 of Santos was based on Gong 7 of [1], etc.). In some situations, shifting the partials resulted in negative frequencies. These partials were removed from the gong's partial database. The shifting process is shown in (1).

\[
F'_x(n,m) =
\begin{cases}
F_x(n,m) + \left(2 f^{Fed}_x - \bar{f}_x\right), & \text{if } F_x(n,m) + 2 f^{Fed}_x - \bar{f}_x \ge 0 \\
\text{row } n \text{ deleted}, & \text{otherwise}
\end{cases}
\tag{1}
\]

where $F_x$ represents the original frequency partial matrix of gong x, $F'_x$ represents the new frequency partial matrix after shifting, n and m indicate the row and column of a particular partial's instantaneous frequency, $\bar{f}_x$ represents the mean fundamental frequency of gong x, and $f^{Fed}_x$ represents the frequency of Federizon's tuning for gong x. Deleting a row from the partial database includes deleting the row from the frequency matrix as well as its corresponding row in the amplitude matrix.

For this project, there were zero parameters and 23 programs: one program for each of the 22 kulintang scales and one program for the original sound files from [1]. A total of 184 gong partial databases were created.

B. VST Modules

The Kulintang VSTi was created using VST2.4 in order to achieve compatibility with most DAWs.

The synthesis of each gong depends on the number of partials N included in the matrix, which is different for each gong. Generating each sample of a gong sound requires 2N multiplications and N-1 additions, as shown in (2). This algorithm could be too complex for slow processors to do in real time, hence a wavetable is first generated upon startup of the plug-in to reduce the complexity.

\[
x(t) = \sum_{n=1}^{N} A_n(k)\,\sin\!\left(2\pi f_n(k)\,t + \phi_n(k)\right)
\tag{2}
\]

where $x(t)$ is the synthesized gong, $A_n(k)$ and $f_n(k)$ are the amplitude and frequency of the nth partial in the kth time frame, and $\phi_n(k)$ is the last phase value of the partial's previous time frame.

1) Program Storage and Synthesis

Two implementations were tested in storing and synthesizing the program data upon VST object construction.

The first implementation generated the preset partials outside the VST plug-in. These partials were then stored in different functions to be called whenever the program was changed. This implementation allowed for more freedom in manipulating the data for each program but was much larger and had a longer compilation time. This implementation was abandoned because compiling took too long for the project schedule to allow.

The second implementation only stored the partials generated in [1] and the frequency offset for each gong in each program. There was less freedom in this implementation since the frequency shifting algorithm was located within the plug-in, but it was considerably smaller since large sets of partials for each program were no longer included.

When a program is initialized, the program name is set in the plug-in to be displayed by the host. Each gong is synthesized using the Modal Synthesis Code found in [1] (which was ported from C to C++). However, for the second implementation, the offset algorithm shown in (1) is included in the waveform synthesis in order to get the proper gong pitches. The waveforms are stored in a wavetable to be called later.

After the construction of the plug-in, the default program is that of the original recordings from [1].

2) MIDI Manager Module

A MIDI event manager module was implemented which only captures Note On events and Global Note Off events. There was no need to capture Note Off events because gong sounds have no sustain and release. The note number and velocity of each Note On event are extracted by the MIDI event manager to be used in the audio module.

A MIDI mapping standard for Philippine indigenous instruments was initiated with the Filipino Kulintang MIDI 1.0 Standard (FKM1.0) as shown in Table IV. For now, only the 8 gongs are included in this MIDI map. Gongs were mapped from C4 (261.6 Hz) upwards using only the white keys of the keyboard. Thus, gong 1 would be played using the C4 key, gong 2 would be played using the D4 key, gong 3 would be played using the E4 key, etc.

TABLE IV. MIDI MAPPING

Gong    Note    MIDI Number
1       C4      60
2       D4      62
3       E4      64
4       F4      65
5       G4      67
6       A4      69
7       B4      71
8       C5      72

3) Audio Module

The playback of a single gong sound requires several iterations of processReplacing due to the small buffer size allotted to the function as explained in Section II-B. Because there are 8 gongs in each program, there was also a need to implement a polyphony of 8.

A representation of the Audio Module is shown in Fig. 2. For each of the eight gongs, there is a buffer index pointer and a buffer that contains its synthesized waveform. The pointer's default position is at the end of its corresponding wavetable, as demonstrated by gong pointers 5 and 7. When a Note On event is received by the MIDI Manager Module, the corresponding pointer is moved to the start of the buffer, as shown in gong pointer 6.

The pointers keep track of the next sample to be relayed to the output audio buffer. For example, gong 6 in Fig. 2 has just begun to be played while gong 3 is in an advanced part of its decay. Gongs 5 and 7, whose pointers are at the end, are silent.

During an iteration of processReplacing, each active gong pointer is incremented until it reaches the end of the output buffer. For each increment, a data point from each live gong buffer is summed into the output buffer. In summary, a chunk of data from each live buffer, as shown by the blue regions in Fig. 2, is summed and sent to the VST host via the output audio buffer for every iteration of processReplacing.

Fig. 2. Audio Module

A Note On event always resets the gong's pointer to the beginning of its buffer, regardless of the pointer's previous position. This procedure was used because we do not yet have data or a model of how a gong sound will change if the gong is struck while it is still producing a sound.

For a simple velocity-sensitive implementation, the velocity value of the MIDI event was mapped to the waveform volume. Velocity was implemented as a simple linear scaling of the original waveform. The value of the output sample $y(t)$ can be expressed as a linear combination of the gong samples $x_g(t - \tau_g)$, where the time advance $\tau_g$ depends on when gong g was last hit and $v_g$ is the corresponding volume of that gong, as shown in (3).

\[
y(t) = \sum_{g=1}^{8} v_g\, x_g(t - \tau_g)
\tag{3}
\]

C. Testing

The file was tested using MiniHost, a VST Host application which can be downloaded as donation-ware from [5], and Cakewalk Sonar 8.5. Both VST Hosts were able to successfully interact with the kulintang plug-in.

Initial tests with only 8 presets implemented showed a very long waiting time at plug-in startup. The actual start-up time was not quantified because there were many variables to consider (hardware, different implementations of VST hosts, current CPU usage, etc.) and there was not enough time to code an efficient timer program into the plug-in. Thus, our approach was to optimize the code as much as possible in the hope that it would decrease the waiting time.

The sum of sinusoids kulintang synthesis was the main bottleneck during startup. In the initial implementation, all the presets were synthesized at startup, which accounted for the long startup time. The code was revised so that a preset is synthesized only at preset change. Memory management was also improved by making sure that allocated memory was freed to avoid memory leaks. After implementing these optimizations, all 23 programs were included in the final plug-in DLL file. The result is a plug-in with a much shorter startup time and preset change delay.
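The sum of sinusoids synthesis named above as the startup bottleneck can be sketched as follows. This is only an illustration under stated assumptions, not the Modal Synthesis Code of [1]: the names ModalData, shiftPartials and synthesize are hypothetical, a sine oscillator is assumed, every matrix row is assumed to have the same number of time frames, and the number of samples per frame is obtained by truncation. The offset passed to shiftPartials would be double Federizon's frequency minus the gong's mean fundamental, as described in Section III-A.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical container for one gong: rows are partials, columns are time frames.
struct ModalData {
    std::vector<std::vector<double>> amp;   // amplitude per partial per frame
    std::vector<std::vector<double>> freq;  // frequency in Hz per partial per frame
};

// Shift all frequencies by offsetHz. A partial (row) that would contain a
// negative frequency is dropped together with its amplitude row.
ModalData shiftPartials(const ModalData& in, double offsetHz) {
    assert(in.amp.size() == in.freq.size());
    ModalData out;
    for (std::size_t n = 0; n < in.freq.size(); ++n) {
        std::vector<double> shifted;
        bool keep = true;
        for (double f : in.freq[n]) {
            if (f + offsetHz < 0.0) { keep = false; break; }
            shifted.push_back(f + offsetHz);
        }
        if (keep) {
            out.freq.push_back(shifted);
            out.amp.push_back(in.amp[n]);
        }
    }
    return out;
}

// Additive synthesis into a wavetable. Each partial carries a running phase
// across frames so the waveform stays continuous at frame boundaries.
std::vector<float> synthesize(const ModalData& d, double frameSec, double sampleRate) {
    const double kTwoPi = 6.283185307179586;
    const std::size_t frames = d.freq.empty() ? 0 : d.freq[0].size();
    const std::size_t perFrame = static_cast<std::size_t>(frameSec * sampleRate);
    std::vector<float> table(frames * perFrame, 0.0f);
    for (std::size_t n = 0; n < d.freq.size(); ++n) {
        double phase = 0.0;  // phase carried over from the previous frame
        for (std::size_t k = 0; k < frames; ++k) {
            const double step = kTwoPi * d.freq[n][k] / sampleRate;
            for (std::size_t i = 0; i < perFrame; ++i) {
                table[k * perFrame + i] +=
                    static_cast<float>(d.amp[n][k] * std::sin(phase));
                phase += step;
            }
        }
    }
    return table;
}
```

The cost grows with the number of partials times the number of samples, which is consistent with synthesizing a preset's wavetables once at preset change rather than inside processReplacing.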

IV. CONCLUSION

In this project we successfully created a Kulintang VSTi plug-in with 23 preset kulintang scales that follow the tunings recorded by Federizon. The plug-in can produce a total of 184 gong sounds of different pitches using SOS synthesis and the gong partial matrices created from Agsaway's analysis. The program was coded in C++ using the VST SDK. VST hosts, such as a DAW, were able to control the plug-in and produce kulintang gong sounds with it. The plug-in follows the FKM1.0 MIDI mapping standard defined in this paper. This standard maps the 8 gongs in a kulintang set to 8 MIDI numbers.
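As an illustration of how the FKM1.0 mapping and the pointer-based playback summarized above fit together, a minimal sketch is given here. It is not the plug-in's source: GongVoice, makeVoice, noteOn and render are hypothetical names, velocity is assumed to scale linearly as velocity/127, and render stands in for the body of a processReplacing-style callback with a single output channel.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// FKM1.0 (Table IV): gongs 1 to 8 on the white keys from C4 (60) to C5 (72).
// Returns a zero-based gong index, or -1 for an unmapped note.
int gongIndexForNote(int midiNote) {
    static const int kNotes[8] = {60, 62, 64, 65, 67, 69, 71, 72};
    for (int g = 0; g < 8; ++g)
        if (kNotes[g] == midiNote) return g;
    return -1;
}

// Hypothetical voice state: wavetable, read pointer and gain for one gong.
struct GongVoice {
    std::vector<float> table;
    std::size_t pos;   // pos == table.size() means the gong is silent
    float gain;
};

// Create a silent voice: the pointer rests at the end of the wavetable.
GongVoice makeVoice(std::vector<float> table) {
    GongVoice v;
    v.pos = table.size();
    v.gain = 0.0f;
    v.table = std::move(table);
    return v;
}

// Note On: restart the gong from the start of its buffer, even if it is
// still sounding, and scale it linearly by the MIDI velocity.
void noteOn(std::vector<GongVoice>& voices, int midiNote, int velocity) {
    const int g = gongIndexForNote(midiNote);
    if (g < 0 || static_cast<std::size_t>(g) >= voices.size()) return;
    voices[g].pos = 0;
    voices[g].gain = static_cast<float>(velocity) / 127.0f;
}

// Fill one output buffer by summing a chunk from every live gong buffer.
void render(std::vector<GongVoice>& voices, float* out, std::size_t frames) {
    for (std::size_t i = 0; i < frames; ++i) out[i] = 0.0f;
    for (GongVoice& v : voices) {
        for (std::size_t i = 0; i < frames && v.pos < v.table.size(); ++i)
            out[i] += v.gain * v.table[v.pos++];
    }
}
```

Because each voice stores only a read position, a gong that spans many output buffers needs no extra bookkeeping, and a polyphony of 8 follows directly from keeping one voice per gong.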

V. RECOMMENDATIONS

The synthesis algorithm can be further optimized to reduce the startup time. A timer can also be implemented to quantify the waiting time of different implementations. Theoretically, the number of partials can still be further reduced by removing partials that are masked by other partials. Reducing the partial matrices will reduce synthesis time. However, listening tests should be done to make sure that signal degradation is not apparent.

The recordings in [1] included different excitations of the gongs, such as muffling a vibrating gong or hitting it in different places with different velocities. Further study can also be done on how a kulintang gong reacts to different physical excitations. The sounds used in this study were produced independent of the current excitation state of the gong, i.e., whether it was already vibrating or not. The behavior of the gong can be investigated in order to create a better model of the response when gong sounds overlap. These models can be incorporated later on for a VSTi of a larger scope. Furthermore, better kulintang recordings could be used to improve the quality of the synthesis.

This study is only a preliminary investigation into the VST standard by our research group. In the future, we would like to implement a Kulintang VSTi with a velocity module that more accurately models the actual kulintang. Implementations of the other instruments in the Kulintang Ensemble and of other Indigenous Philippine Instruments are planned as future projects. The other instruments in the Kulintang Ensemble will be included in the Filipino Kulintang MIDI Standard in future studies.

VI. REFERENCES

[1] F. Agsaway, M. Co, and R. C. L. Guevara, "Modal Distribution Analysis and Sum of Sinusoids Synthesis of Kulintang Musical Signals," Undergraduate Research Project, University of the Philippines, 2005.
[2] http://www.steinberg.net/en/company/developer.html [March 22, 2012]
[3] Z. E. Velasco, "Kulintangan," Palabuniyan Gongs, Filipino Folk Arts Theatre. http://members.aol.com/TaraCelest/kulintang_instruments.html [August 17, 2004]
[4] R. Federizon, "Illustration 3. Kulintang Scales," Kulintang and Kudyapiq: Gong Ensemble and Two-string Lute among the Magindanaon in Mindanao Philippines, University of the Philippines College of Music, Quezon City, 1988.
[5] http://www.tobybear.de/p_minihost.html [March 22, 2012]