EURASIP Journal on Applied Signal Processing
Model-Based Sound Synthesis
Guest Editors: Vesa Välimäki, Augusto Sarti, Matti Karjalainen, Rudolf Rabenstein, and Lauri Savioja
Copyright © 2004 Hindawi Publishing Corporation. All rights reserved.
This is a special issue published in volume 2004 of “EURASIP Journal on Applied Signal Processing.” All articles are open access articles distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Editor-in-Chief Marc Moonen, Belgium
Senior Advisory Editor
K. J. Ray Liu, College Park, USA

Associate Editors
Kiyoharu Aizawa, Japan; Gonzalo Arce, USA; Jaakko Astola, Finland; Kenneth Barner, USA; Mauro Barni, Italy; Sankar Basu, USA; Jacob Benesty, Canada; Helmut Bölcskei, Switzerland; Chong-Yung Chi, Taiwan; M. Reha Civanlar, Turkey; Tony Constantinides, UK; Luciano Costa, Brazil; Satya Dharanipragada, USA; Petar M. Djurić, USA; Jean-Luc Dugelay, France; Touradj Ebrahimi, Switzerland; Sadaoki Furui, Japan; Moncef Gabbouj, Finland; Sharon Gannot, Israel; Fulvio Gini, Italy; A. Gorokhov, The Netherlands; Peter Handel, Sweden; Ulrich Heute, Germany; John Homer, Australia; Jiri Jan, Czech; Søren Holdt Jensen, Denmark; Mark Kahrs, USA; Thomas Kaiser, Germany; Moon Gi Kang, Korea; Aggelos Katsaggelos, USA; Mos Kaveh, USA; C.-C. Jay Kuo, USA; Chin-Hui Lee, USA; Kyoung Mu Lee, Korea; Sang Uk Lee, Korea; Y. Geoffrey Li, USA; Mark Liao, Taiwan; Bernie Mulgrew, UK; King N. Ngan, Hong Kong; Douglas O’Shaughnessy, Canada; Antonio Ortega, USA; Montse Pardas, Spain; Ioannis Pitas, Greece; Phillip Regalia, France; Markus Rupp, Austria; Hideaki Sakai, Japan; Bill Sandham, UK; Wan-Chi Siu, Hong Kong; Dirk Slock, France; Piet Sommen, The Netherlands; John Sorensen, Denmark; Michael G. Strintzis, Greece; Sergios Theodoridis, Greece; Jacques Verly, Belgium; Xiaodong Wang, USA; Douglas Williams, USA; An-Yen (Andy) Wu, Taiwan; Xiang-Gen Xia, USA
Contents
Editorial, Vesa Välimäki, Augusto Sarti, Matti Karjalainen, Rudolf Rabenstein, and Lauri Savioja Volume 2004 (2004), Issue 7, Pages 923-925
Physical Modeling of the Piano, N. Giordano and M. Jiang Volume 2004 (2004), Issue 7, Pages 926-933
Sound Synthesis of the Harpsichord Using a Computationally Efficient Physical Model, Vesa Välimäki, Henri Penttinen, Jonte Knif, Mikael Laurson, and Cumhur Erkut Volume 2004 (2004), Issue 7, Pages 934-948
Multirate Simulations of String Vibrations Including Nonlinear Fret-String Interactions Using the Functional Transformation Method, L. Trautmann and R. Rabenstein Volume 2004 (2004), Issue 7, Pages 949-963
Physically Inspired Models for the Synthesis of Stiff Strings with Dispersive Waveguides, I. Testa, G. Evangelista, and S. Cavaliere Volume 2004 (2004), Issue 7, Pages 964-977
Digital Waveguides versus Finite Difference Structures: Equivalence and Mixed Modeling, Matti Karjalainen and Cumhur Erkut Volume 2004 (2004), Issue 7, Pages 978-989
A Digital Synthesis Model of Double-Reed Wind Instruments, Ph. Guillemain Volume 2004 (2004), Issue 7, Pages 990-1000
Real-Time Gesture-Controlled Physical Modelling Music Synthesis with Tactile Feedback, David M. Howard and Stuart Rimell Volume 2004 (2004), Issue 7, Pages 1001-1006
Vibrato in Singing Voice: The Link between Source-Filter and Sinusoidal Models, Ixone Arroabarren and Alfonso Carlosena Volume 2004 (2004), Issue 7, Pages 1007-1020
A Hybrid Resynthesis Model for Hammer-String Interaction of Piano Tones, Julien Bensa, Kristoffer Jensen, and Richard Kronland-Martinet Volume 2004 (2004), Issue 7, Pages 1021-1035
Warped Linear Prediction of Physical Model Excitations with Applications in Audio Compression and Instrument Synthesis, Alexis Glass and Kimitoshi Fukudome Volume 2004 (2004), Issue 7, Pages 1036-1044

EURASIP Journal on Applied Signal Processing 2004:7, 923–925
© 2004 Hindawi Publishing Corporation
Editorial
Vesa Välimäki
Laboratory of Acoustics and Audio Signal Processing, Helsinki University of Technology, P.O. Box 3000, 02015 Espoo, Finland
Email: vesa.valimaki@hut.fi

Augusto Sarti
Dipartimento di Elettronica e Informazione, Politecnico di Milano, piazza Leonardo da Vinci 32, 20133 Milan, Italy
Email: [email protected]

Matti Karjalainen
Laboratory of Acoustics and Audio Signal Processing, Helsinki University of Technology, P.O. Box 3000, 02015 Espoo, Finland
Email: matti.karjalainen@hut.fi

Rudolf Rabenstein
Multimedia Communications and Signal Processing, University Erlangen-Nuremberg, 91058 Erlangen, Germany
Email: [email protected]

Lauri Savioja
Laboratory of Telecommunications Software and Multimedia, Helsinki University of Technology, P.O. Box 5400, 02015 Espoo, Finland
Email: lauri.savioja@hut.fi
Model-based sound synthesis has become one of the most active research topics in musical signal processing and in musical acoustics. The earliest attempts at generating musical sound with a physical model were made over three decades ago. The first commercial products were seen only some twenty years later. Recently, many refinements to previous signal processing algorithms and several new ones have been introduced. We have learned that new signal processing methods can still be devised, or old ones modified, to advance the field.

Today there exist efficient model-based synthesis algorithms for many sound sources, while there are still some for which we do not have a good model. Certain issues, such as parameter estimation and real-time control, require further work for many model-based approaches. Finally, the capabilities of human listeners to perceive details in synthetic sound should be accounted for, in a way similar to perceptual audio coding, in order to optimize the algorithms. The success and future of the model-based approach depends on researchers and the results of their work.

The roots of this special issue are in a European project called ALMA (Algorithms for the Modelling of Acoustic Interactions, IST-2001-33059, see http://www-dsp.elet.polimi.it/alma/), where the guest editors and their research teams collaborated in the period from 2001 to 2004. The goal of the ALMA project was to develop an elegant, general, and unifying strategy for a blockwise design of physical models for sound synthesis. A “divide-and-conquer” approach was taken, in which the elements of the structure are individually modeled and discretized, while their interaction topology is separately designed and implemented in a dynamical and physically sound fashion. As a result, several high-quality demonstrations of virtual musical instruments played in a virtual environment were developed. During the ALMA project, the guest editors realized that this special issue could be created, since the field was very active but there had not been a special issue devoted to it for a long time.

This EURASIP JASP special issue presents ten examples of recent research in model-based sound synthesis. The first two papers are related to keyboard instruments. First, Giordano and Jiang discuss physical modeling synthesis of the piano using the finite-difference approach. Then, Välimäki et al. show how to synthesize the sound of the harpsichord based on measurements of a real instrument. An efficient implementation using a visual software synthesis package is given for real-time synthesis.

In the third paper, Trautmann and Rabenstein present a multirate implementation of a vibrating string model that is based on the functional transformation method. In the next paper, Testa et al. investigate the modeling of stiff string behavior. The dispersive wave phenomenon, perceivable as inharmonicity in many string instrument sounds, is studied by deriving different physically inspired models.

In the fifth paper, Karjalainen and Erkut propose a very interesting and general solution to the problem of how to build composite models from digital waveguides and finite-difference time-domain blocks. The next contribution is from Guillemain, who proposes a real-time synthesis model of double-reed wind instruments based on a nonlinear physical model.

The paper by Howard and Rimell provides a viewpoint quite different from the others in this special issue. It deals with the design and implementation of user interfaces for model-based synthesis. An important aspect is the incorporation of tactile feedback into the interface.

Arroabarren and Carlosena have studied the modeling and analysis of human voice production, particularly the vibrato used in the singing voice. Source-filter modeling and sinusoidal modeling are compared to gain deeper insight into these phenomena. Bensa et al. bring the discussion back to the physical modeling of musical instruments, with particular reference to the piano. They propose a source/resonator model of hammer-string interaction aimed at a realistic production of piano sound. Finally, Glass and Fukudome incorporate a plucked-string model into an audio coder for audio compression and instrument synthesis.

The guest editors would like to thank all the authors for their contributions. We would also like to express our deep gratitude to the reviewers for their diligent efforts in evaluating all submitted manuscripts. We hope that this special issue will stimulate further research work on model-based sound synthesis.

Vesa Välimäki
Augusto Sarti
Matti Karjalainen
Rudolf Rabenstein
Lauri Savioja

Vesa Välimäki was born in Kuorevesi, Finland, in 1968. He received the M.S. degree, the Licentiate of Science degree, and the Doctor of Science degree, all in electrical engineering, from Helsinki University of Technology (HUT), Espoo, Finland, in 1992, 1994, and 1995, respectively. He was with the HUT Laboratory of Acoustics and Audio Signal Processing from 1990 to 2001. In 1996, he was a Postdoctoral Research Fellow with the University of Westminster, London, UK. During the academic year 2001-2002, he was Professor of signal processing at the Pori School of Technology and Economics, Tampere University of Technology (TUT), Pori, Finland. In August 2002 he returned to HUT, where he is currently Professor of audio signal processing. He was appointed Docent in signal processing at the Pori School of Technology and Economics, TUT, in 2003. His research interests are in the application of digital signal processing to audio and music. Dr. Välimäki is a Senior Member of the IEEE Signal Processing Society and is a Member of the Audio Engineering Society, the Acoustical Society of Finland, and the Finnish Musicological Society.

Augusto Sarti, born in 1963, received the “Laurea” degree (1988, cum laude) and the Ph.D. (1993) in electrical engineering from the University of Padua, Italy, with research on nonlinear communication systems. He completed his graduate studies at the University of California at Berkeley, where he spent two years doing research on nonlinear system control and on motion planning of nonholonomic systems. In 1993 he joined the Dipartimento di Elettronica e Informazione of the Politecnico di Milano, where he is now an Associate Professor. His current research interests are in the area of digital signal processing, with particular focus on sound analysis, processing, and synthesis; image processing; video coding; and computer vision. Augusto Sarti authored over 100 scientific publications. He is leading the Image and Sound Processing Group (ISPG) at the Dipartimento di Elettronica e Informazione of the Politecnico di Milano, which contributed to numerous national projects and eight European research projects. He is currently coordinating the IST-2001-33059 European Project “ALMA: Algorithms for the Modelling of Acoustic Interactions,” and is co-coordinating the IST-2000-28436 European Project “ORIGAMI: A new paradigm for high-quality mixing of real and virtual.”

Matti Karjalainen was born in Hankasalmi, Finland, in 1946. He received the M.S. and the Dr.Tech. degrees in electrical engineering from the Tampere University of Technology, in 1970 and 1978, respectively. Since 1980 he has been a Professor of acoustics and audio signal processing at the Helsinki University of Technology in the Faculty of Electrical Engineering. In audio technology, his interest is in audio signal processing, such as DSP for sound reproduction, perceptually based signal processing, as well as music DSP and sound synthesis. In addition to audio DSP, his research activities cover speech synthesis, analysis, and recognition; perceptual auditory modeling and spatial hearing; DSP hardware, software, and programming environments; as well as various branches of acoustics, including musical acoustics and modeling of musical instruments. He has written more than 300 scientific or engineering articles and contributed to organizing several conferences and workshops. Professor Karjalainen is an AES Fellow and a Member of the IEEE (Institute of Electrical and Electronics Engineers), ASA (Acoustical Society of America), EAA (European Acoustics Association), ICMA (International Computer Music Association), ESCA (European Speech Communication Association), and several Finnish scientific and engineering societies.
Rudolf Rabenstein received the “Diplom-Ingenieur” and “Doktor-Ingenieur” degrees in electrical engineering and the “Habilitation” degree in signal processing, all from the University of Erlangen-Nuremberg, Germany, in 1981, 1991, and 1996, respectively. He worked with the Telecommunications Laboratory, University of Erlangen-Nuremberg, from 1981 to 1987. From 1987 to 1991, he was with the Physics Department of the University of Siegen, Germany. In 1991, he returned to the Telecommunications Laboratory of the University of Erlangen-Nuremberg. His research interests are in the fields of multidimensional systems theory, multimedia signal processing, and computer music. Rudolf Rabenstein is the author and coauthor of more than 100 scientific publications, has contributed to various books and book chapters, and holds several patents in audio engineering. He is a Board Member of the School of Engineering of the Virtual University of Bavaria, Germany, and a member of several engineering societies.
Lauri Savioja works as a Professor for the Laboratory of Telecommunications Software and Multimedia in the Helsinki University of Technology (HUT), Finland. He received the Doctor of Science degree in Technology in 1999 from the Department of Computer Science, HUT. His research interests include virtual reality, room acoustics, and human-computer interaction.

EURASIP Journal on Applied Signal Processing 2004:7, 926–933
© 2004 Hindawi Publishing Corporation
Physical Modeling of the Piano
N. Giordano Department of Physics, Purdue University, 525 Northwestern Avenue, West Lafayette, IN 47907-2036, USA Email: [email protected]
M. Jiang Department of Physics, Purdue University, 525 Northwestern Avenue, West Lafayette, IN 47907-2036, USA Department of Computer Science, Montana State University, Bozeman, MT 59715, USA Email: [email protected]
Received 21 June 2003; Revised 27 October 2003
A project aimed at constructing a physical model of the piano is described. Our goal is to calculate the sound produced by the instrument entirely from Newton’s laws. The structure of the model is described along with experiments that augment and test the model calculations. The state of the model and what can be learned from it are discussed.

Keywords and phrases: physical modeling, piano.
1. INTRODUCTION the instrument. However, as far as we can tell, certain fea- tures of the model, such as hammer-string impulse func- This paper describes a long term project by our group aimed tions and the transfer function that ultimately relates the at physical modeling of the piano. The theme of this volume, sound pressure to the soundboard motion (and other sim- model based sound synthesis of musical instruments, is quite ilar transfer functions), are taken from experiments on real broad, so it is useful to begin by discussing precisely what instruments. This approach is a powerful way to produce re- we mean by the term “physical modeling.” The goal of our alistic musical tones efficiently, in real time and in a man- project is to use Newton’s laws to describe all aspects of the ner that can be played by a human performer. However, this piano. We aim to use F = ma to calculate the motion of the approach cannot address certain questions. For example, it hammers, strings, and soundboard, and ultimately the sound would not be able to predict the sound that would be pro- that reaches the listener. duced if a radically new type of soundboard was employed, Of course, we are not the first group to take such a New- or if the hammers were covered with a completely differ- ton’s law approach to the modeling of a musical instrument. ent type of material than the conventional felt. The physi- For the piano, there have been such modeling studies of the cal modeling method that we describe in this paper can ad- hammer-string interaction [1, 2, 3, 4, 5, 6, 7, 8, 9], string vi- dress such questions. Hence, we view the ideas and method brations [8, 9, 10], and soundboard motion [11]. (Nice re- embodied in work of Bank and coworkers [20] (and the ref- views of the physics of the piano are given in [12, 13, 14, 15].) erences therein) as complementary to the physical modeling There has been similar modeling of portions of other instru- approach that is the focus of our work. 
ments (such as the guitar [16]), and of several other com- In this paper, we describe the route that we have taken plete instruments, including the xylophone and the timpani to assembling a complete physical model of the piano. [17, 18, 19]. Our work is inspired by and builds on this pre- This complete model is really composed of interacting sub- vious work. models which deal with (1) the motions of the hammers and At this point, we should also mention how our work re- strings and their interaction, (2) soundboard vibrations, and lates to other modeling work, such as the digital waveguide (3) sound generation by the vibrating soundboard. For each approach, which was recently reviewed in [20]. The digital of these submodels we must consider several issues, includ- waveguide method makes extensive use of physics in choos- ing selection and implementation of the computational algo- ing the structure of the algorithm; that is, in choosing the rithm, determination of the values of the many parameters proper filter(s) and delay lines, connectivity, and so forth, that are involved, and testing the submodel. After consider- to properly match and mimic the Newton’s law equations of ing each of the submodels, we then describe how they are motion of the strings, soundboard, and other components of combined to produce a complete computational piano. The Physical Modeling of the Piano 927 quality of the calculated tones is discussed, along with the The issue of listening tests brings us to the question of lessons we have learned from this work. A preliminary and goals, that is, what do we hope to accomplish with such a abbreviated report on this project was given in [21]. modeling project? At one level, we would hope that the cal- culated piano tones are realistic and convincing. The model could then be used to explore what various hypothetical pi- 2. OVERALL STRATEGY AND GOALS anos would sound like. 
For example, one could imagine con- structing a piano with a carbon fiber soundboard, and it One of the first modeling decisions that arises is the question would be very useful to be able to predict its sound ahead of of whether to work in the frequency domain or the time do- time, or to use the model in the design of the new sound- main. In many situations, it is simplest and most instructive board. On a different and more philosophical level, one to work in the frequency domain. For example, an under- might want to ask questions such as “what are the most im- standing of the distribution of normal mode frequencies, and portant elements involved in making a piano sound like a pi- the nature of the associated eigenvectors for the body vibra- ano?” We emphasize that it is not our goal to make a real time tions of a violin or a piano soundboard, is very instructive. model, nor do we wish to compete with the tones produced However, we have chosen to base our modeling in the time by other modeling methods, such as sampling synthesis and domain. We believe that this choice has several advantages. digital waveguide modeling [20]. First, the initial excitation—in our case this is the motion of a piano hammer just prior to striking a string—is described most conveniently in the time domain. Second, the interac- 3. STRINGS AND HAMMERS tion between various components of the instrument, such Our model begins with a piano hammer moving freely with a as the strings and soundboard, is somewhat simpler when speed v just prior to making contact with a string (or strings, viewed in the time domain, especially when one considers h since most notes involve more than one string). Hence, we the early “attack” portion of a tone. Third, our ultimate goal ignore the mechanics of the action. 
This mechanics is, of is to calculate the room pressure as a function of time, so it is course, quite important from a player’s perspective, since it appealing to start in the time domain with the hammer mo- determines the touch and feel of the instrument [26]. Nev- tion and stay in the time domain throughout the calculation, ertheless, we will ignore these issues, since (at least to a first ending with the pressure as would be received by a listener. approximation) they are not directly relevant to the compo- Our time domain modeling is based on finite difference cal- sition of a piano tone and we simply take v as an input pa- culations [10] that describe all aspects of the instrument. h rameter. Typical values are in the range 1–4 m/s [9]. A second element of strategy involves the determination When a hammer strikes a string, there is an interaction of the many parameters that are required for describing the force that is a function of the compression of the hammer piano. Ideally, one would like to determine all of these pa- felt, y f . This force determines the initial excitation and is rameters independently, rather than use them as fitting pa- thus a crucial factor in the composition of the resulting tone. rameters when comparing the modeling results to real (mea- Considerable effort has been devoted to understanding the sured) tones. This is indeed possible for all of the parame- hammer-string force [1, 2, 3, 4, 5, 6, 7, 27, 28, 29, 30, 31, ters. For example, dimensional parameters such as the string 32, 33]. Hammer felt is a very complicated material [34], diameters and lengths, soundboard dimensions, and bridge and there is no “first principles” expression for the hammer- positions, can all be measured from a real piano. Likewise, ff string force relation Fh(y f ). 
Much work has assumed a sim- various material properties such as the string sti ness, the ple power law function elastic moduli of the soundboard, and the acoustical proper- ties of the room in which the numerical piano is located, are p Fh y f = F0 y ,(1) well known from very straightforward measurements. For a f few quantities, most notably the force-compression charac- where the exponent p is typically in the range 2.5–4 and F0 teristics of the piano hammers, it is necessary to use separate is an overall amplitude. This power law form seems to be at (and independent) experiments. least qualitatively consistent with many experiments and we This brings us to a third element of our modeling therefore used (1) in our initial modeling calculations. strategy—the problem of how to test the calculations. The While (1) has been widely used to analyze and inter- final output is the sound at the listener, so one could “test” pret experiments, and also in previous modeling work, it the model by simply evaluating the sounds via listening tests. has been known for some time that the force-compression However, it is very useful to separately test the submod- characteristic of most real piano hammers is not a simple els. For example, the portion of the model that deals with reversible function [7, 27, 28, 29, 30]. Ignoring the hystere- soundboard vibrations can be tested by comparing its pre- sis has seemed reasonable, since the magnitude of the ir- dictions for the acoustic impedance with direct measure- reversibility is often found to be small. Figure 1 shows the ments [11, 22, 23, 24]. Likewise, the room-soundboard com- force-compression characteristic for a particular hammer (a putation can be compared with studies of sound production Steinway hammer from the note middle C) measured in by a harmonically driven soundboard [25]. This approach, two different ways. 
In the type I measurement, the hammer involving tests against specially designed experiments, has struck a stationary force sensor and the resulting force and proven to be extremely valuable. felt compression were measured as described in [31]. We see 928 EURASIP Journal on Applied Signal Processing
cles for the felt. There is considerable hysteresis during these 20 cycles, much more than might have been expected from the Hammer force characteristics type I result. The overall magnitude of the type II force is also Hammer C4 somewhat smaller; the hammer is effectively “softer” under the type II conditions. Since the type II arrangement is the one found in real piano, it is important to use this hammer- force characteristic in modeling.
(N) We have chosen to model our hysteretic type II hammer
h Type I exp. F 10 measurements following the proposal of Stulov [30, 33]. He has suggested the form Fh y f (t) −∞ = F0 g y f (t) − 0 g y f (t ) exp − (t − t )/τ0 dt . t Type II exp. (2) 0 00.20.40.6 Here, τ is a characteristic (memory) time scale associated y f (mm) 0 with the felt, 0 is a measure of the magnitude of the hystere- sis, and ( ) is the variation of the compression with time. Figure 1: Force-compression characteristics measured for a partic- y f t ular piano hammer measured in two different ways. In the type I ex- In other words, (2) says that the felt “remembers” its pre- periment (dotted curve), the hammer struck a stationary force sen- vious compression history over a time of order τ0, and that sor and the resulting force, Fh, and felt compression, y f , were mea- the force is reduced according to how much the felt has been sured. The initial hammer velocity was approximately 1 m/s. The compressed during that period. The inherent nonlinearity of solid curve is the measured force-compression relation obtained in the hammer is specified by the function g(z); Stulov took this a type II measurement, in which the same hammer impacted a pi- to be a power law ano string. This behavior is described qualitatively by (2), with pa- = = × 13 = = × −5 p rameters p 3.5, F0 1.0 10 N, 0 0.90, and τ0 1.0 10 g(z) = z . (3) second. The dashed arrows indicate compression/decompression branches. Stulov has compared (2) to measurements with real ham- mers and reported very good agreement using τ0, 0, p,and F0 as fitting parameters. Our own tests of (2) have not shown that for a particular value of the felt compression, y f , the such good agreement; we have found that it provides only a force is larger during the compression phase of the hammer- qualitative (and in some cases semiquantitative) description string collision than during decompression. However, this of the hysteresis shown in Figure 1 [35]. 
Nevertheless, it is ff di erence is relatively small, generally no more than 10% of currently the best mathematical description available for the the total force. Provided that this hysteresis is ignored, the hysteresis, and we have employed it in our modeling calcula- type I result is described reasonably well by the power law tions. function (1)withp ≈ 3. However, we will see below that (1) Our string calculations are based on the equation of mo- is not adequate for our modeling work, and this has led us to tion [8, 10, 36] consider other forms for F . h In order to shed more light on the hammer-string force, ∂2 y ∂2 y ∂4 y ∂y ∂3 y we developed a new experimental approach, which we refer = c2 − − α + α ,(4) ∂t2 s ∂x2 ∂x4 1 ∂t 2 ∂t3 to as a type II experiment, in which the force and felt com- pression are measured as the hammer impacts on a string where y(x, t) is the transverse string displacement at time t [32, 35]. Since the string rebounds in response to the ham- ≡ mer, the hammer-string contact time in this case is consider- and position x along the string. cs µ/T is the wave speed ably longer (by a factor of approximately 3) than in the type I for an ideal string (with stiffness and damping ignored), with measurement. The force-compression relation found in this T the tension and µ the mass per unit length of the string. type II measurement is also shown in Figure 1.Incontrastto When the parameters , α1,andα2 are zero, this is just the the type I measurements, the type II results for Fh(y)donot simple wave equation. Equation (4) describes only the po- consist of two simple branches (one for compression and an- larization mode for which the string displacement is parallel other for decompression). Instead, the type II result exhibits to the initial velocity of the hammer. The other transverse “loops,” which arise for the following reason. 
When the hammer first contacts the string, it excites pulses that travel to the ends of the string, are reflected at the ends, and then return. These pulses return while the hammer is still in contact with the string, and since they are inverted by the reflection, they cause an extra series of compression/decompression cycles …

… mode and also the longitudinal mode are both ignored; experiments have shown that both of these modes are excited in real piano strings [37, 38, 39], but we will leave them for future modeling work. The term in (4) that is proportional to ε arises from the stiffness of the string. It turns out that εcs² = rs²Es/ρs, where rs, Es, and ρs are the radius, Young's modulus, and density of the string, respectively [9, 36]. For typical piano strings, ε is of order 10⁻⁴, so the stiffness term in (4) is small, but it cannot be neglected, as it produces the well-known effect of stretched octaves [36]. Damping is accounted for with the terms involving α1 and α2; one of these terms is proportional to the string velocity, while the other is proportional to ∂³y/∂t³. This combination makes the damping dependent on frequency in a manner close to that observed experimentally [8, 10].

Physical Modeling of the Piano 929

Our numerical treatment of the string motion employs a finite difference formulation in which both time t and position x are discretized in units ∆ts and ∆xs [8, 9, 10, 40]. The string displacement is then y(x, t) ≡ y(i∆xs, n∆ts) ≡ y(i, n). If the derivatives in (4) are written in finite difference form, this equation can be rearranged to express the string displacement at each spatial location i at time step n + 1 in terms of the displacement at previous time steps, as described by Chaigne and Askenfelt [8, 10]. The equation of motion (4) does not contain the hammer force. This is included by the addition of a term on the right-hand side proportional to Fh, which acts at the hammer strike point. Since the hammer has a finite width, it is customary to spread this force over a small length of the string [8]. So far as we know, the details of how this force is distributed have never been measured; fortunately, our modeling results are not very sensitive to this factor (so long as the effective hammer width is qualitatively reasonable). With this approach to the string calculation, the need for numerical stability together with the desired frequency range require that each string be treated as 50–100 vibrating numerical elements [8, 10].

4. THE SOUNDBOARD

Wood is a complicated material [41]. Soundboards are assembled from wood that is "quarter sawn," which means that two of the principal axes of the elastic constant tensor lie in the plane of the board. The equation of motion for such a thin orthotropic plate is [11, 22, 23, 42]

    ρb hb ∂²z/∂t² = −Dx ∂⁴z/∂x⁴ − (Dx νy + Dy νx + 4Dxy) ∂⁴z/∂x²∂y² − Dy ∂⁴z/∂y⁴ + Fs(x, y) − β ∂z/∂t,    (5)

where the rigidity factors are

    Dx = hb³Ex / [12(1 − νxνy)],
    Dy = hb³Ey / [12(1 − νxνy)],    (6)
    Dxy = hb³Gxy / 12.

Here, our board lies in the x-y plane and z is its displacement. (These x and y directions are, of course, not the same as the x and y coordinates used in describing the string motion.) The soundboard coordinates x and y run perpendicular and parallel to the grain of the board. Ex and νx are Young's modulus and Poisson's ratio for the x direction, and so forth for y; Gxy is the shear modulus, hb is the board thickness, and ρb is its density. The values of all elastic constants were taken from [41]. In order to model the ribs and bridges, the thickness and rigidity factors are position dependent (since these factors are different at the ribs and bridges than on the "bare" board), as described in [11]. There are also some additional terms that enter the equation of motion (5) at the ends of bridges [11, 17, 18, 43]. Fs(x, y) is the force from the strings on the bridge. This force acts at the appropriate bridge location; it is proportional to the component of the string tension perpendicular to the plane of the board, and is calculated from the string portion of the model. Finally, we include a loss term proportional to the parameter β [11]. The physical origin of this term involves elastic losses within the board. We have not attempted to model this physics according to Newton's laws, but have simply chosen a value of β which yields a quality factor for the soundboard modes similar to that observed experimentally [11, 24].¹ Finally, we note that the soundboard "acts back" on the strings, since the bridge moves and the strings are attached to the bridge. Hence, the interaction of strings in a unison group, and also sympathetic string vibrations (with the dampers disengaged from the strings), are included in the model.

For the solution of (5), we again employed a finite difference algorithm. The space dimensions x and y were discretized, both in steps of size ∆xb; this spatial step need not be related to the step size for the string, ∆xs. As in our previous work on soundboard modeling [11], we chose ∆xb = 2 cm, since this is just small enough to capture the structure of the board, including the widths of the ribs and bridges. Hence, the board was modeled as ∼100 × 100 vibrating elements.

The behavior of our numerical soundboard can be judged by calculations of the mechanical impedance, Z, as defined by

    Z = F / vb,    (7)

where F is an applied force and vb is the resulting soundboard velocity. Here, we assume that F is a harmonic (single frequency) force applied at a point on the bridge and vb is measured at the same point. Figure 2 shows results calculated from our model [11] for the soundboard from an upright piano. Also shown are measurements for a real upright soundboard (with the same dimensions and bridge positions, etc., as in the model). The agreement is quite acceptable, especially considering that parameters such as the dimensions of the soundboard, the position and thickness of the ribs and bridges, and the elastic constants of the board were taken from either direct measurements or handbook values (e.g., Young's modulus).

¹ In principle, one might expect the soundboard losses to be frequency dependent, as found for the string. At present there is no good experimental data on this question, so we have chosen the simplest possible model with just a single loss term in (5).

930 EURASIP Journal on Applied Signal Processing
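The explicit string update described above can be written out in a few lines. This is a hedged sketch, not the authors' code: it keeps only a single frequency-independent loss term, replaces the hammer force with a crude initial pulse, and all parameter values (N, c, L, eps, alpha1) are invented for the example.

```python
import numpy as np

# Illustrative finite-difference string update in the spirit of the
# Chaigne-Askenfelt scheme cited in the text. Simplifications (hedged):
# a single velocity-proportional loss term, no hammer force term, and
# invented parameter values.

N = 100                      # number of vibrating elements (text: 50-100)
c = 320.0                    # transverse wave speed (m/s), illustrative
L = 0.65                     # string length (m), illustrative
eps = 1.0e-4                 # stiffness parameter (m^2), "of order 1e-4"
alpha1 = 1.0                 # velocity-proportional loss (1/s), illustrative

dx = L / N
dt = 0.25 * dx / c           # well below the stability limit for this scheme
r2 = (c * dt / dx) ** 2      # Courant number squared

y = np.zeros(N + 1)          # y(i, n)
ym = np.zeros(N + 1)         # y(i, n-1)
y[N // 8] = 1.0e-3           # crude initial pulse standing in for the hammer

def step(y, ym):
    """Advance the string one time step; the ends stay pinned at zero."""
    d2 = y[:-2] - 2.0 * y[1:-1] + y[2:]           # second spatial difference
    d4 = np.zeros(N - 1)
    d4[1:-1] = d2[:-2] - 2.0 * d2[1:-1] + d2[2:]  # fourth spatial difference
    yp = np.zeros_like(y)
    yp[1:-1] = (2.0 * y[1:-1] - ym[1:-1]
                + r2 * d2
                - r2 * (eps / dx**2) * d4         # stiffness (dispersion) term
                - alpha1 * dt * (y[1:-1] - ym[1:-1]))
    return yp, y

for n in range(500):
    y, ym = step(y, ym)
```

Choosing ∆t well below ∆x/c keeps this explicit scheme stable even with the added stiffness term, which tightens the stability limit relative to the ideal string.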
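The meaning of the impedance definition (7) can be checked on a much simpler system than the full plate model. The sketch below is our own illustration with invented values, not the authors' method: it drives a single mass-spring-damper, standing in for one soundboard mode, with a harmonic force and reads |Z| = F/vb off the steady-state velocity amplitude.

```python
import numpy as np

# Illustration of Z = F / v_b from (7), using one damped oscillator as a
# stand-in for a single soundboard mode. All parameter values are invented.

m, b, k = 0.5, 50.0, 2.0e5        # modal mass (kg), damping (kg/s), stiffness (N/m)
F0 = 1.0                          # drive amplitude (N)
w = 2.0 * np.pi * 200.0           # harmonic drive at 200 Hz

dt = 1.0e-6
steps_per_cycle = int(round(2.0 * np.pi / (w * dt)))
x, v = 0.0, 0.0
vel = []
for n in range(int(0.2 / dt)):    # 0.2 s is ample for the transient to die out
    F = F0 * np.sin(w * n * dt)
    a = (F - b * v - k * x) / m   # Newton's law for the driven oscillator
    v += a * dt                   # semi-implicit Euler
    x += v * dt
    vel.append(v)

v_amp = max(abs(u) for u in vel[-steps_per_cycle:])  # steady-state |v|
Z_est = F0 / v_amp                                   # simulated |Z|
Z_exact = np.hypot(b, m * w - k / w)                 # analytic |Z| = |b + i(mw - k/w)|
```

The simulated and analytic magnitudes agree closely, which is exactly the kind of single-point, single-frequency measurement plotted in Figure 2.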
[Figure 2 here: log-log plot of the soundboard impedance Z (kg/s) versus frequency f (Hz), 100 Hz–10 kHz, for an upright piano at middle C; curves labeled "Experiment" and "Model."]

Figure 2: Calculated (solid curve) and measured (dotted curve) mechanical impedance for an upright piano soundboard. Here, the force was applied and the board velocity was measured at the point where the string for middle C crosses the bridge. Results from [11, 24].

5. THE ROOM

Our time domain room modeling follows the work of Botteldooren [44, 45]. We begin with the usual coupled equations for the velocity and pressure in the room:

    ρa ∂vx/∂t = −∂p/∂x,
    ρa ∂vy/∂t = −∂p/∂y,
    ρa ∂vz/∂t = −∂p/∂z,    (8)
    ∂p/∂t = ρa ca² (−∂vx/∂x − ∂vy/∂y − ∂vz/∂z),

where p is the pressure, the velocity components are vx, vy, and vz, ρa is the density, and ca is the speed of sound in air. This family of equations is similar in form to an electromagnetic problem, and much is known about how to deal with it numerically. We employ a finite difference approach in which staggered grids in both space and time are used for the pressure and velocity. Given a time step ∆tr, the pressure is computed at times n∆tr while the velocity is computed at times (n + 1/2)∆tr. A similar staggered grid is used for the space coordinates, with the pressure calculated on the grid i∆xr, j∆xr, k∆xr, while vx is calculated on the staggered grid (i + 1/2)∆xr, j∆xr, and k∆xr. The grids for vy and vz are arranged in a similar manner, as explained in [44, 45].

Sound is generated in this numerical room by the vibration of the soundboard. We situate the soundboard from the previous section on a plane perpendicular to the z direction in the room, approximately 1 m from the nearest parallel wall (i.e., the floor). At each time step the velocity vz of the room air at the surface of the soundboard is set to the calculated soundboard velocity at that instant, as obtained from the soundboard calculation.

The room is taken to be a rectangular box with the same acoustical properties for all 6 walls. The walls of the room are modeled in terms of their acoustic impedance, Z, with

    p = Z vn,    (9)

where vn is the component of the (air) velocity normal to the wall [46]. Measurements of Z for a number of materials [47] have found that it is typically frequency dependent, with the form

    Z(ω) ≈ Z0 − iZ1/ω,    (10)

where ω is the angular frequency. Incorporating this frequency domain expression for the acoustic impedance into our time domain treatment was done in the manner described in [45]. The time step for the room calculation was ∆tr = 1/22050 ≈ 4.5 × 10⁻⁵ s, as explained in the next section.

The choice of spatial step size ∆xr was then influenced by two considerations. First, in order for the finite difference algorithm to be numerically stable in three dimensions, one must have ∆xr/(√3 ∆tr) > ca. Second, it is convenient for the spatial steps for the soundboard and room to be commensurate. In the calculations described below, the room step size was ∆xr = 4 cm, that is, twice the soundboard step size. When using the calculated soundboard velocity to obtain the room velocity at the soundboard surface, we averaged over 4 soundboard grid points for each room grid point. Typical numerical rooms were 3 × 4 × 4 m³, and thus contained ∼10⁶ finite difference elements.

Figure 3 shows results for the sound generation by an upright soundboard.
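A minimal version of the staggered-grid update (8), assuming rigid walls (the Z → ∞ limit of the wall model (9)) and a toy-sized grid, can be sketched as follows; grid size and source are illustrative only.

```python
import numpy as np

# Hedged sketch of the staggered-grid pressure/velocity update for (8) in a
# small 3-D box. Simplifications relative to the text: rigid walls and a tiny
# grid (a real room at dx = 4 cm contains ~1e6 cells). An initial pressure
# pulse stands in for the soundboard source.

rho_a, c_a = 1.2, 343.0                 # air density (kg/m^3), sound speed (m/s)
dx = 0.04                               # 4 cm room step, as in the text
dt = 1.0 / 22050.0                      # room time step, as in the text
assert dx / (np.sqrt(3.0) * dt) > c_a   # 3-D stability condition from the text

n = 24
p = np.zeros((n, n, n))
vx = np.zeros((n + 1, n, n))            # velocities live on staggered cell faces
vy = np.zeros((n, n + 1, n))
vz = np.zeros((n, n, n + 1))
p[n // 2, n // 2, n // 2] = 1.0         # initial pressure pulse

for _ in range(100):
    # rho_a dv/dt = -grad p (interior faces only; wall faces stay at v = 0)
    vx[1:-1, :, :] -= dt / (rho_a * dx) * np.diff(p, axis=0)
    vy[:, 1:-1, :] -= dt / (rho_a * dx) * np.diff(p, axis=1)
    vz[:, :, 1:-1] -= dt / (rho_a * dx) * np.diff(p, axis=2)
    # dp/dt = -rho_a c_a^2 div v
    p -= rho_a * c_a**2 * dt / dx * (np.diff(vx, axis=0)
                                     + np.diff(vy, axis=1)
                                     + np.diff(vz, axis=2))
```

The leapfrog structure (pressure at integer steps, velocity at half steps) is exactly the staggering described in the text; the assert encodes the √3 stability bound quoted there.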
Here, the soundboard was driven harmonically at the point where the string for middle C contacts the bridge, and we plot the sound pressure normalized by the board velocity at the driving point [25]. It is seen that the model results compare well with the experiments. This provides a check on both the soundboard and the room models.

[Figure 3 here: log-log plot of p/vb (arb. units) versus frequency (Hz), 20 Hz–10 kHz, panel title "Sound generation, soundboard driven at C4"; curves labeled "Experiment" and "Model."]

Figure 3: Results for the sound pressure normalized by the soundboard velocity for an upright piano soundboard: calculated (solid curve) and measured (dotted curve). The board was driven at the point where the string for middle C crosses the bridge. Results from [25].

6. PUTTING IT ALL TOGETHER

Our model involves several distinct but coupled subsystems—the hammers/strings, the soundboard, and the room—and it is useful to review how they fit together computationally. The calculation begins by giving some initial velocity to a particular hammer. This hammer then strikes a string (or strings), and they interact through either (1) or (2). This sets the string(s) for that note into motion, and these in turn act on the bridge and soundboard. As we have already mentioned, the vibrations of each component of our model are calculated with a finite difference algorithm, each with an associated time step. Since the systems are coupled—that is, the strings drive the soundboard, the soundboard acts back on the strings, and the soundboard drives the room—it would be computationally simpler to use the same value of the time step for all three subsystems. However, the equation of motion for the soundboard is highly dispersive, and the stability requirements demand a much smaller time step for the soundboard than is needed for string and room simulations. Given the large number of room elements, this would greatly (and unnecessarily) slow down the calculation. We have therefore chosen to instead make the various time steps commensurate, with

    ∆tr = 1/22050 s,
    ∆ts = ∆tr/4,    (11)
    ∆tb = ∆ts/6,

where the subscripts correspond to the room (r), string (s), and soundboard (b). To explain this hierarchy, we first note that the room time step is chosen to be compatible with common audio hardware and software; 1/∆tr is commensurate with the data rates commonly used in CD sound formats. We then see that each room time step contains 4 string time steps; that is, the string algorithm makes 4 iterations for each iteration of the room model. Likewise, each string time step contains 6 soundboard steps.

The overall computational speed is currently somewhat less than "real time." With a typical personal computer (clock speed 1 GHz), a 1 minute simulation requires approximately 30 minutes of computer time. Of course, this gap will narrow in the future in accord with Moore's law. In addition, the model should transfer easily to a cluster (i.e., multi-CPU) machine. We have also explored an alternative approach to the room modeling involving ray tracing [48]. Ray tracing allows one to express the relationship between soundboard velocity and sound pressure as a multiparameter map, involving approximately 10⁴ parameters. The values of these parameters can be precalculated and stored, resulting in about an order of magnitude speed-up in the calculation as compared to the room algorithm described above.

7. ANALYSIS OF THE RESULTS: WHAT HAVE WE LEARNED AND WHERE DO WE GO NEXT?

In the previous section, we saw that a real-time Newton's law simulation of the piano is well within reach. While such a simulation would certainly be interesting, it is not a primary goal of our work. We instead wish to use the modeling to learn about the instrument. With that in mind, we now consider the quality of the tones calculated with the current version of the model.

In our initial modeling, we employed power law hammers described by (1), with parameters based on type I hammer experiments by our group [31]. The results were disappointing—it is hard to accurately describe the tones in words, but they sounded distinctly plucked and somewhat metallic. While we cannot include our calculated sounds as part of this paper, they are available on our website http://www.physics.purdue.edu/piano. After many modeling calculations, we came to the conclusion that the hammer model—for example, the power law description (1)—was the problem. Note that we do not claim that power law hammers must always give unsatisfactory results. Our point is that when the power law parameters are chosen to fit the type I behavior of real hammers, the calculated tones are poor. It is certainly possible (and indeed, likely) that power law parameters that will yield good piano tones can be found. However, based on our experience, it seems that these parameters should be viewed as fitting parameters, as they may not accurately describe any real hammers.

This led us to the type II hammer experiments described above, and to a description of the hammer-string force in terms of the Stulov function (2), with parameters (τ0, ε0, etc.) taken from these type II experiments [35]. The results were much improved. While they are not yet "Steinway quality," it is our opinion that the calculated tones could be mistaken for a real piano. In that sense, they pass a sort of acoustical Turing test. Our conclusion is that the hammers are an essential part of the instrument. This is hardly a revolutionary result. However, based on our modeling, we can also make a somewhat stronger statement: in order to obtain a realistic piano tone, the modeling should be based on hammer parameters observed in type II measurements, with the hysteresis included in the model.

There are a number of issues that we plan to address in the future. (1) The hammer portion of the model still needs attention. Our experiments [35] indicate that while the Stulov function does provide a qualitative description of the hammer force hysteresis, there are significant quantitative differences. It may be necessary to develop a better functional description to replace the Stulov form. (2) As it currently stands, our string model includes only one polarization mode, corresponding to vibrations parallel to the initial hammer velocity. It is well known that the other transverse polarization mode can be important [37]. This can be readily included, but will require a more general soundboard model, since the two transverse modes couple through the motion of the bridge. (3) The soundboard of a real piano is supported by a case. Measurements in our laboratory indicate that the case acceleration can be as large as 5% or so of the soundboard acceleration, so the sound emitted by the case is considerable. (4) We plan to refine the room model. Our current room model is certainly a very crude approximation to a realistic room. Real rooms have wall coverings of various types (with differing values of the acoustic impedances), and contain chairs and other objects. At our current level of sophistication, it appears that the hammers are more of a limitation than the room model, but this may well change as the hammer modeling is improved.

In conclusion, we have made good progress in developing a physical model of the piano. It is now possible to produce realistic tones using Newton's laws with realistic and independently determined instrument parameters. Further improvements of the model seem quite feasible. We believe that physical modeling can provide new insights into the piano, and that similar approaches can be applied to other instruments.

ACKNOWLEDGMENTS

We thank P. Muzikar, T. Rossing, A. Tubis, and G. Weinreich for many helpful and critical discussions. We are also indebted to A. Korty, J. Winans II, J. Millis, S. Dietz, J. Jourdan, J. Roberts, and L. Reuff for their contributions to our piano studies. This work was supported by the National Science Foundation (NSF) through Grant PHY-9988562.

REFERENCES

[1] D. E. Hall, "Piano string excitation in the case of small hammer mass," Journal of the Acoustical Society of America, vol. 79, no. 1, pp. 141–147, 1986.
[2] D. E. Hall, "Piano string excitation II: General solution for a hard narrow hammer," Journal of the Acoustical Society of America, vol. 81, no. 2, pp. 535–546, 1987.
[3] D. E. Hall, "Piano string excitation III: General solution for a soft narrow hammer," Journal of the Acoustical Society of America, vol. 81, no. 2, pp. 547–555, 1987.
[4] D. E. Hall and A. Askenfelt, "Piano string excitation V: Spectra for real hammers and strings," Journal of the Acoustical Society of America, vol. 83, no. 4, pp. 1627–1638, 1988.
[5] D. E. Hall, "Piano string excitation. VI: Nonlinear modeling," Journal of the Acoustical Society of America, vol. 92, no. 1, pp. 95–105, 1992.
[6] H. Suzuki, "Model analysis of a hammer-string interaction," Journal of the Acoustical Society of America, vol. 82, no. 4, pp. 1145–1151, 1987.
[7] X. Boutillon, "Model for piano hammers: Experimental determination and digital simulation," Journal of the Acoustical Society of America, vol. 83, no. 2, pp. 746–754, 1988.
[8] A. Chaigne and A. Askenfelt, "Numerical simulations of piano strings. I. A physical model for a struck string using the finite difference method," Journal of the Acoustical Society of America, vol. 95, no. 2, pp. 1112–1118, 1994.
[9] A. Chaigne and A. Askenfelt, "Numerical simulations of piano strings. II. Comparisons with measurements and systematic exploration of some hammer-string parameters," Journal of the Acoustical Society of America, vol. 95, no. 3, pp. 1631–1640, 1994.
[10] A. Chaigne, "On the use of finite differences for musical synthesis. Application to plucked stringed instruments," Journal d'Acoustique, vol. 5, no. 2, pp. 181–211, 1992.
[11] N. Giordano, "Simple model of a piano soundboard," Journal of the Acoustical Society of America, vol. 102, no. 2, pp. 1159–1168, 1997.
[12] H. A. Conklin Jr., "Design and tone in the mechanoacoustic piano. Part I. Piano hammers and tonal effects," Journal of the Acoustical Society of America, vol. 99, no. 6, pp. 3286–3296, 1996.
[13] H. Suzuki and I. Nakamura, "Acoustics of pianos," Applied Acoustics, vol. 30, pp. 147–205, 1990.
[14] H. A. Conklin Jr., "Design and tone in the mechanoacoustic piano. Part II. Piano structure," Journal of the Acoustical Society of America, vol. 100, no. 2, pp. 695–708, 1996.
[15] H. A. Conklin Jr., "Design and tone in the mechanoacoustic piano. Part III. Piano strings and scale design," Journal of the Acoustical Society of America, vol. 100, no. 3, pp. 1286–1298, 1996.
[16] B. E. Richardson, G. P. Walker, and M. Brooke, "Synthesis of guitar tones from fundamental parameters relating to construction," Proceedings of the Institute of Acoustics, vol. 12, no. 1, pp. 757–764, 1990.
[17] A. Chaigne and V. Doutaut, "Numerical simulations of xylophones. I. Time-domain modeling of the vibrating bars," Journal of the Acoustical Society of America, vol. 101, no. 1, pp. 539–557, 1997.
[18] V. Doutaut, D. Matignon, and A. Chaigne, "Numerical simulations of xylophones. II. Time-domain modeling of the resonator and of the radiated sound pressure," Journal of the Acoustical Society of America, vol. 104, no. 3, pp. 1633–1647, 1998.
[19] L. Rhaouti, A. Chaigne, and P. Joly, "Time-domain modeling and numerical simulation of a kettledrum," Journal of the Acoustical Society of America, vol. 105, no. 6, pp. 3545–3562, 1999.
[20] B. Bank, F. Avanzini, G. Borin, G. De Poli, F. Fontana, and D. Rocchesso, "Physically informed signal processing methods for piano sound synthesis: a research overview," EURASIP Journal on Applied Signal Processing, vol. 2003, no. 10, pp. 941–952, 2003.
[21] N. Giordano, M. Jiang, and S. Dietz, "Experimental and computational studies of the piano," in Proc. 17th International Congress on Acoustics, vol. 4, Rome, Italy, September 2001.
[22] J. Kindel and I.-C. Wang, "Modal analysis and finite element analysis of a piano soundboard," in Proc. 5th International Modal Analysis Conference, pp. 1545–1549, Union College, Schenectady, NY, USA, 1987.
[23] J. Kindel, "Modal analysis and finite element analysis of a piano soundboard," M.S. thesis, University of Cincinnati, Cincinnati, Ohio, USA, 1989.
[24] N. Giordano, "Mechanical impedance of a piano soundboard," Journal of the Acoustical Society of America, vol. 103, no. 4, pp. 2128–2133, 1998.
[25] N. Giordano, "Sound production by a vibrating piano soundboard: Experiment," Journal of the Acoustical Society of America, vol. 104, no. 3, pp. 1648–1653, 1998.
[26] A. Askenfelt and E. V. Jansson, "From touch to string vibrations. II. The motion of the key and hammer," Journal of the Acoustical Society of America, vol. 90, no. 5, pp. 2383–2393, 1991.
[27] T. Yanagisawa, K. Nakamura, and H. Aiko, "Experimental study on force-time curve during the contact between hammer and piano string," Journal of the Acoustical Society of Japan, vol. 37, pp. 627–633, 1981.
[28] T. Yanagisawa and K. Nakamura, "Dynamic compression characteristics of piano hammer," Transactions of Musical Acoustics Technical Group Meeting of the Acoustic Society of Japan, vol. 1, pp. 14–17, 1982.
[29] T. Yanagisawa and K. Nakamura, "Dynamic compression characteristics of piano hammer felt," Journal of the Acoustical Society of Japan, vol. 40, pp. 725–729, 1984.
[30] A. Stulov, "Hysteretic model of the grand piano hammer felt," Journal of the Acoustical Society of America, vol. 97, no. 4, pp. 2577–2585, 1995.
[31] N. Giordano and J. P. Winans II, "Piano hammers and their force compression characteristics: does a power law make sense?," Journal of the Acoustical Society of America, vol. 107, no. 4, pp. 2248–2255, 2000.
[32] N. Giordano and J. P. Millis, "Hysteretic behavior of piano hammers," in Proc. International Symposium on Musical Acoustics, D. Bonsi, D. Gonzalez, and D. Stanzial, Eds., pp. 237–240, Perugia, Umbria, Italy, September 2001.
[33] A. Stulov and A. Mägi, "Piano hammer: Theory and experiment," in Proc. International Symposium on Musical Acoustics, D. Bonsi, D. Gonzalez, and D. Stanzial, Eds., pp. 215–220, Perugia, Umbria, Italy, September 2001.
[34] J. I. Dunlop, "Nonlinear vibration properties of felt pads," Journal of the Acoustical Society of America, vol. 88, no. 2, pp. 911–917, 1990.
[35] N. Giordano and J. P. Millis, "Using physical modeling to learn about the piano: New insights into the hammer-string force," in Proc. International Congress on Acoustics, S. Furui, H. Kanai, and Y. Iwaya, Eds., pp. III–2113, Kyoto, Japan, April 2004.
[36] N. H. Fletcher and T. D. Rossing, The Physics of Musical Instruments, Springer-Verlag, New York, NY, USA, 1991.
[37] G. Weinreich, "Coupled piano strings," Journal of the Acoustical Society of America, vol. 62, no. 6, pp. 1474–1484, 1977.
[38] M. Podlesak and A. R. Lee, "Dispersion of waves in piano strings," Journal of the Acoustical Society of America, vol. 83, no. 1, pp. 305–317, 1988.
[39] N. Giordano and A. J. Korty, "Motion of a piano string: longitudinal vibrations and the role of the bridge," Journal of the Acoustical Society of America, vol. 100, no. 6, pp. 3899–3908, 1996.
[40] N. Giordano, Computational Physics, Prentice-Hall, Upper Saddle River, NJ, USA, 1997.
[41] V. Bucur, Acoustics of Wood, CRC Press, Boca Raton, Fla, USA, 1995.
[42] S. G. Lekhnitskii, Anisotropic Plates, Gordon and Breach Science Publishers, New York, NY, USA, 1968.
[43] J. W. S. Rayleigh, Theory of Sound, Dover, New York, NY, USA, 1945.
[44] D. Botteldooren, "Acoustical finite-difference time-domain simulation in a quasi-Cartesian grid," Journal of the Acoustical Society of America, vol. 95, no. 5, pp. 2313–2319, 1994.
[45] D. Botteldooren, "Finite-difference time-domain simulation of low-frequency room acoustic problems," Journal of the Acoustical Society of America, vol. 98, no. 6, pp. 3302–3308, 1995.
[46] P. M. Morse and K. U. Ingard, Theoretical Acoustics, Princeton University Press, Princeton, NJ, USA, 1986.
[47] L. L. Beranek, "Acoustic impedance of commercial materials and the performance of rectangular rooms with one treated surface," Journal of the Acoustical Society of America, vol. 12, pp. 14–23, 1940.
[48] M. Jiang, "Room acoustics and physical modeling of the piano," M.S. thesis, Purdue University, West Lafayette, Ind, USA, 1999.

N. Giordano obtained his Ph.D. from Yale University in 1977, and has been at the Department of Physics at Purdue University since 1979. His research interests include mesoscopic and nanoscale physics, computational physics, and musical acoustics. He is the author of the textbook Computational Physics (Prentice-Hall, 1997). He also collects and restores antique pianos.

M. Jiang has a B.S. degree in physics (1997) from Peking University, China, and M.S. degrees in both physics and computer science (1999) from Purdue University. Some of the work described in this paper was part of his physics M.S. thesis. After graduation, he worked as a software engineer for two years, developing Unix kernel software and device drivers. In 2002, he moved to Bozeman, Montana, where he is now pursuing a Ph.D. in computer science at Montana State University. Minghui's current research interests include the design of algorithms, computational geometry, and biological modeling and bioinformatics.

EURASIP Journal on Applied Signal Processing 2004:7, 934–948
© 2004 Hindawi Publishing Corporation
Sound Synthesis of the Harpsichord Using a Computationally Efficient Physical Model
Vesa Välimäki
Laboratory of Acoustics and Audio Signal Processing, Helsinki University of Technology, P.O. Box 3000, 02015 Espoo, Finland
Email: vesa.valimaki@hut.fi

Henri Penttinen
Laboratory of Acoustics and Audio Signal Processing, Helsinki University of Technology, P.O. Box 3000, 02015 Espoo, Finland
Email: henri.penttinen@hut.fi

Jonte Knif
Sibelius Academy, Centre for Music and Technology, P.O. Box 86, 00251 Helsinki, Finland
Email: jknif@siba.fi
Mikael Laurson Sibelius Academy, Centre for Music and Technology, P.O. Box 86, 00251 Helsinki, Finland Email: laurson@siba.fi
Cumhur Erkut Laboratory of Acoustics and Audio Signal Processing, Helsinki University of Technology, P.O. Box 3000, 02015 Espoo, Finland Email: cumhur.erkut@hut.fi
Received 24 June 2003; Revised 28 November 2003
A sound synthesis algorithm for the harpsichord has been developed by applying the principles of digital waveguide modeling. A modification to the loss filter of the string model is introduced that allows more flexible control of the decay rates of partials than is possible with a one-pole digital filter, which is a common choice for the loss filter. A version of the commuted waveguide synthesis approach is used, where each tone is generated with a parallel combination of the string model and a second-order resonator that are excited with a common excitation signal. The second-order resonator, previously proposed for this purpose, approximately simulates the beating effect appearing in many harpsichord tones. The characteristic key-release thump that terminates harpsichord tones is reproduced by triggering a sample that has been extracted from a recording. A digital filter model for the soundboard has been designed based on recorded bridge impulse responses of the harpsichord. The output of the string models is injected into the soundboard filter, which imitates the reverberant nature of the soundbox and, particularly, the ringing of the short parts of the strings behind the bridge. Keywords and phrases: acoustic signal processing, digital filter design, electronic music, musical acoustics.
1. INTRODUCTION

Sound synthesis is particularly interesting for acoustic keyboard instruments, since they are usually expensive and large and may require amplification during performances. Electronic versions of these instruments benefit from the fact that keyboard controllers using MIDI are commonly available and fit for use. Digital pianos imitating the timbre and features of grand pianos are among the most popular electronic instruments. Our current work focuses on the imitation of the harpsichord, which is expensive and relatively rare, but is still commonly used in music from the Renaissance and the baroque era. Figure 1 shows the instrument used in this study. It is a two-manual harpsichord that contains three individual sets of strings and two bridges, and has a large soundboard.
Instead of wavetable and sampling techniques that are popular in digital instruments, we apply modeling techniques to design an electronic instrument that sounds nearly identical to its acoustic counterpart and faithfully responds to the player's actions, just as an acoustic instrument. We use the modeling principle called commuted waveguide synthesis [1, 2, 3], but have modified it, because we use a digital filter to model the soundboard response. Commuted synthesis uses the basic property of linear systems that, in a cascade of transfer functions, their ordering can be changed without affecting the overall transfer function. This way, the complications in the modeling of the soundboard resonances extracted from a recorded tone can be hidden in the input sequence. In the original form of commuted synthesis, the input signal contains the contribution of the excitation mechanism—the quill plucking the string—and that of the soundboard with all its vibrating modes [4]. In the current implementation, the input samples of the string models are short (less than half a second) and contain only the initial part of the soundboard response; the tail of the soundboard response is reproduced with a reverberation algorithm.

Digital waveguide modeling [5] appears to be an excellent tool for the synthesis of harpsichord tones. A strong argument supporting this view is that tones generated using the basic Karplus-Strong algorithm [6] are reminiscent of the harpsichord for many listeners.¹ This synthesis technique has been shown to be a simplified version of a waveguide string model [5, 7]. However, this does not imply that realistic harpsichord synthesis is easy. A detailed imitation of the properties of a fine instrument is challenging, even though the starting point is very promising. Careful modifications to the algorithm and proper signal analysis and calibration routines are needed for a natural-sounding synthesis.

The new contributions to stringed-instrument models include a sparse high-order loop filter and a soundboard model that consists of the cascade of a shaping filter and a common reverb algorithm. The sparse loop filter consists of a conventional one-pole filter and a feedforward comb filter inserted in the feedback loop of a basic string model. Methods to calibrate these parts of the synthesis algorithm are proposed.

This paper is organized as follows. Section 2 gives a short overview of the construction and acoustics of the harpsichord. In Section 3, signal-processing techniques for synthesizing harpsichord tones are suggested. In particular, the new loop filter is introduced and analyzed. Section 4 concentrates on calibration methods to adjust the parameters according to recordings. The implementation of the synthesizer using a block-based graphical programming language is described in Section 5, where we also discuss the computational complexity and potential applications of the implemented system. Section 6 contains conclusions and suggests ideas for further research.

Figure 1: The harpsichord used in the measurements has two manuals, three string sets, and two bridges. The picture was taken during the tuning of the instrument in the anechoic chamber.

2. HARPSICHORD ACOUSTICS

The harpsichord is a stringed keyboard instrument with a long history dating back to at least the year 1440 [8]. It is the predecessor of the pianoforte and the modern piano. It belongs to the group of plucked string instruments due to its excitation mechanism. In this section, we briefly describe the construction and the operating principles of the harpsichord and give details of the instrument used in this study. For a more in-depth discussion and description of the harpsichord, see, for example, [9, 10, 11, 12]; for a description of the different types of harpsichord, the reader is referred to [10].

2.1. Construction of the instrument

The form of the instrument can be roughly described as triangular, and the oblique side is typically curved. A harpsichord has one or two manuals that control two to four sets of strings, also called registers or string choirs. Two of the string choirs are typically tuned in unison. These are called the 8′ (8-foot) registers. Often the third string choir is tuned an octave higher, and it is called the 4′ register. The manuals can be set to control different registers, usually with a limited number of combinations. This permits the player to use different registers with the left- and right-hand manuals, and therefore vary the timbre and loudness of the instrument. The 8′ registers differ from each other in the plucking point of the strings. Hence, the 8′ registers are called the 8′ back and front registers, where "back" refers to the plucking point away from the nut (and the player).

The keyboard of the harpsichord typically spans four or five octaves, which became a common standard in the early 18th century. One end of the strings is attached to the nut and the other to a long, curved bridge. The portion of the string behind the bridge is attached to a hitch pin, which is on top of the soundboard. This portion of the string also tends to vibrate for a long while after a key press, and it gives the instrument a reverberant feel. The nut is set on a very rigid wrest plank. The bridge is attached to the soundboard.

¹ The Karplus-Strong algorithm manages to sound something like the harpsichord in some registers only when a high sampling rate is used, such as 44.1 kHz or 22.05 kHz. At low sample rates, it sounds somewhat similar to violin pizzicato tones.
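To make the footnote's point concrete, the basic Karplus-Strong algorithm fits in a dozen lines. This is the textbook version, not the synthesizer developed in this paper; the fundamental frequency is chosen arbitrarily for the example.

```python
import numpy as np

# The basic Karplus-Strong plucked-string algorithm: a delay line filled
# with noise, with a two-point average as the loop (loss) filter.
# Minimal textbook sketch, not the synthesizer described in this paper.

fs = 44100                        # high sample rate, as the footnote recommends
f0 = 220.0                        # illustrative fundamental (Hz)
L = int(fs / f0)                  # delay-line length sets the pitch

rng = np.random.default_rng(0)
buf = rng.uniform(-1.0, 1.0, L)   # noise burst "plucks" the string
out = np.empty(fs)                # one second of output

for n in range(fs):
    i = n % L
    out[n] = buf[i]
    buf[i] = 0.5 * (buf[i] + buf[(i + 1) % L])   # lowpass feedback loop
```

The two-point average attenuates high partials faster than low ones, which is what gives the decaying, string-like character noted in the text.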
Figure 2: Overall structure of the harpsichord model for a single string. The model structure is identical for all strings in the three sets, but the parameter values and sample data are different.
Therefore, the bridge is mainly responsible for transmitting string vibrations to the soundboard. The soundboard is very thin—about 2 to 4 mm—and it is supported by several ribs installed in patterns that leave trapezoidal areas of the soundboard vibrating freely. The main function of the soundboard is to amplify the weak sound of the vibrating strings, but it also filters the sound. The soundboard forms the top of a closed box, which typically has a rose opening. It causes a Helmholtz resonance, the frequency of which is usually below 100 Hz [12]. In many harpsichords, the soundbox also opens to the manual compartment.

2.2. Operating principle

A plectrum—also called a quill—anchored onto a jack plucks the strings. The jack rests on a string, but there is a small piece of felt (called the damper) between them. One end of the wooden keyboard lever is located a small distance below the jack. As the player pushes down a key on the keyboard, the lever moves up. This action lifts the jack and causes the quill to pluck the string. When the key is released, the jack falls back and the damper comes into contact with the string to dampen its vibrations. A spring mechanism in the jack guides the plectrum so that the string is not replucked when the key is released.

2.3. The harpsichord used in this study

The harpsichord used in this study (see Figure 1) was built in 2000 by Jonte Knif (one of the authors of this paper) and Arno Pelto. It has the characteristics of harpsichords built in Italy and Southern Germany. This harpsichord has two manuals and three sets of string choirs, namely an 8' back, an 8' front, and a 4' register. The instrument was tuned to the Vallotti tuning [13] with the fundamental frequency of A4 at 415 Hz.2 There are 56 keys from G1 to D6, which correspond to fundamental frequencies of 46 Hz and 1100 Hz, respectively, in the 8' register; the 4' register is an octave higher, so the corresponding lowest and highest fundamental frequencies are about 93 Hz and 2200 Hz. The instrument is 240 cm long and 85 cm wide, and its strings are all made of brass. The plucking point changes from 12% to about 50% of the string length in the bass and in the treble range, respectively. This produces a round timbre (i.e., weak even harmonics) in the treble range. In addition, the dampers have been left out in the last octave of the 4' register to increase the reverberant feel during playing. The wood material used in the instrument has been heat treated to artificially accelerate the aging process of the wood.

2 The tuning is considerably lower than the current standard (440 Hz or higher). This is typical of old musical instruments.

3. SYNTHESIS ALGORITHM

This section discusses the signal processing methods used in the synthesis algorithm. The structure of the algorithm is illustrated in Figure 2. It consists of five digital filters, two sample databases, and their interconnections. The physical model of a vibrating string is contained in block S(z). Its input is retrieved from the excitation signal database, and it can be modified during run time with a timbre-control filter, which is a one-pole filter. In parallel with the string, a second-order resonator R(z) is tuned to reproduce the beating of one of the partials, as proposed earlier by Bank et al. [14, 15]. While we could use more resonators, we have decided to target a maximally reduced implementation to minimize the computational cost and the number of parameters. The sum of the string model and resonator output signals is fed through a soundboard filter, which is common to all strings. The tone corrector is an equalizer that shapes the spectrum of the soundboard filter output. By varying the coefficients g_release and g_sb, it is possible to adjust the relative levels of the string sound, the soundboard response, and the release sound. In the following, we describe the string model, the sample databases, and the soundboard model in detail, and discuss the need for modeling the dispersion of harpsichord strings.

3.1. Basic string model revisited

We use a version of the vibrating string filter model proposed by Jaffe and Smith [16]. It consists of a feedback loop, where a delay line, a fractional delay filter, a high-order allpass filter, and a loss filter are cascaded. The delay line and the fractional delay filter determine the fundamental frequency of the tone. The high-order allpass filter [16] simulates dispersion, which
Figure 3: Structure of the proposed string model. The feedback loop contains a one-pole filter (denominator of (1)), a feedforward comb filter called "ripple filter" (numerator of (1)), the rest of the delay line, a fractional delay filter F(z), and an allpass filter Ad(z) simulating dispersion.

is a typical characteristic of vibrating strings and which introduces inharmonicity in the sound. For the fractional delay filter, we use a first-order allpass filter, as originally suggested by Smith and Jaffe [16, 17]. This choice was made because it allows a simple and sufficient approximation of the delay when a high sampling rate is used.3 Furthermore, there is no need to implement fundamental frequency variations (pitch bend) in harpsichord tones. Thus, the recursive nature of the allpass fractional delay filter, which can cause transients during pitch bends, is not harmful.

3 The sampling rate used in this work is 44100 Hz.

The loss filter of waveguide string models is usually implemented as a one-pole filter [18], but now we use an extended version. The transfer function of the new loss filter is

H(z) = b (r + z^(-R)) / (1 + a z^(-1)),    (1)

where the scaling parameter b is defined as

b = g(1 + a),    (2)

R is the delay line length of the ripple filter, r is the ripple depth, and a is the feedback gain. Figure 3 shows the block diagram of the string model with details of the new loss filter, which is seen to be composed of the conventional one-pole filter and a ripple filter in cascade. The total delay line length L in the feedback loop is 1 + R + L1 plus the phase delay caused by the fractional delay filter F(z) and the allpass filter Ad(z).

The overall loop gain is determined by parameter g, which is usually selected to be slightly smaller than 1 to ensure stability of the feedback loop. The feedback gain parameter a defines the overall lowpass character of the filter: a value slightly smaller than 0 (e.g., a = -0.01) yields a mild lowpass filter, which causes high-frequency partials to decay faster than the low-frequency ones, which is natural. The ripple depth parameter r is used to control the deviation of the loss filter gain from that of the one-pole filter. The delay line length R is determined as

R = round(rrate L),    (3)

where rrate is the ripple rate parameter that adjusts the ripple density in the frequency domain and L is the total delay length in the loop (in samples, or sampling intervals).

The ripple filter was developed because it was found that the magnitude response of the one-pole filter alone is overly smooth when compared to the loop gain behavior required for harpsichord sounds. Note that the ripple factor r in (1) increases the loop gain, but it is not accounted for in the scaling factor in (2). This is purposeful, because we find it useful that the loop gain oscillates symmetrically around the magnitude response of the conventional one-pole filter (obtained from (1) by setting r = 0). Nevertheless, it must be ensured somehow that the overall loop gain does not exceed unity at any of the harmonic frequencies—otherwise the system becomes unstable. It is sufficient to require that the sum g + |r| remains below one, or |r| < 1 - g. In practice, a slightly larger magnitude of r still results in a stable system when r < 0, because this choice decreases the loop gain at 0 Hz; the conventional loop filter is a lowpass filter, and thus its gain at the harmonic frequencies is smaller than g.

With small positive or negative values of r, it is possible to obtain wavy loop gain characteristics, where two neighboring partials have considerably different loop gains and thus decay rates. The frequency of the ripple is controlled by parameter rrate so that a value close to one results in a very slow wave, while a value close to 0.5 results in a fast variation where the loop gain for neighboring even and odd partials differs by about 2r (depending on the value of a). An example is shown in Figure 4, where the properties of a conventional one-pole loss filter are compared against the proposed ripply loss filter. Figure 4a shows that by adding a feedforward path with a small gain factor r = 0.002, the loop gain characteristics can be made less regular. Figure 4b shows the corresponding reverberation time (T60) curve, which indicates how long it takes for each partial to decay by 60 dB. The T60 values are obtained by multiplying the time-constant values τ by -60/[20 log(1/e)], or 6.9078.
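Equations (1)-(3) are simple enough to check numerically. The sketch below (Python; the parameter values are those quoted in the Figure 4 caption) evaluates the loop gain of the ripply loss filter at the partial frequencies and verifies the sufficient stability condition g + |r| < 1:

```python
import cmath, math

def ripply_loop_gain(f, fs, g, a, r, rrate, L):
    """Loop gain magnitude of H(z) = b (r + z^-R) / (1 + a z^-1),
    with b = g (1 + a) from Eq. (2) and R = round(rrate * L) from Eq. (3)."""
    b = g * (1.0 + a)
    R = round(rrate * L)
    z = cmath.exp(2j * math.pi * f / fs)
    return abs(b * (r + z ** (-R)) / (1.0 + a * z ** (-1)))

# parameter values from the Figure 4 caption
fs, L, f0 = 44100.0, 200, 220.5
g, a, r, rrate = 0.995, -0.05, 0.002, 0.5
assert g + abs(r) < 1.0        # sufficient stability condition from the text
gains = [ripply_loop_gain(k * f0, fs, g, a, r, rrate, L) for k in range(1, 14)]
# with rrate = 0.5 (R = 100), the loop gains of neighboring even and odd
# partials alternate around the one-pole response, differing by about 2r
```

With rrate = 0.5 the ripple term z^(-R) evaluates to (-1)^k at the kth partial, which is exactly the fast even/odd alternation described in the text.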
Figure 4: The frequency-dependent (a) loop gain (magnitude response) and (b) reverberation time T60 determined by the loss filter. The dashed lines show the smooth characteristics of a conventional one-pole loss filter (g = 0.995, a = -0.05). The solid lines show the characteristics obtained with the ripply loss filter (g = 0.995, a = -0.05, r = 0.0020, rrate = 0.5). The bold dots indicate the actual properties experienced by the partials of the synthetic tone (L = 200 samples, f0 = 220.5 Hz).
The time constants τ(k) for partial indices k = 1, 2, 3, ..., on the other hand, are obtained from the loop gain data G(k) as

τ(k) = -1 / [f0 ln G(k)].    (4)

The loop gain sequence G(k) is extracted directly from the magnitude response of the loop filter at the fundamental frequency (k = 1) and at the other partial frequencies (k = 2, 3, 4, ...).

Figure 4b demonstrates the power of the ripply loss filter: the second partial can be rendered to decay much more slowly than the first and the third partials. This is also perceived in the synthetic tone: soon after the attack, the second partial stands out as the loudest and the longest-ringing partial. Formerly, this kind of flexibility has been obtained only with high-order loss filters [17, 19]. Still, the new filter has only two parameters more than the one-pole filter, and its computational complexity is comparable to that of a first-order pole-zero filter.

3.2. Inharmonicity

Dispersion is always present in real strings. It is caused by the stiffness of the string material. This property of strings gives rise to inharmonicity in the sound. An offspring of the harpsichord, the piano, is famous for its strongly inharmonic tones, especially in the bass range [9, 20]. This is due to the large elastic modulus and the large diameter of the high-strength steel strings in the piano [9]. In waveguide models, inharmonicity is modeled with allpass filters [16, 21, 22, 23]. Naturally, it would be cost-efficient not to implement the inharmonicity, because then the allpass filter Ad(z) would not be needed at all.

The inharmonicity of the recorded harpsichord tones was investigated in order to find out whether it is relevant to model this property. The partials of recorded harpsichord tones were picked semiautomatically from the magnitude spectrum, and with a least-squares fit we estimated the inharmonicity coefficient B [20] for each recorded tone. The measured B values are displayed in Figure 5 together with the threshold of audibility and its 90% confidence intervals taken from listening test results [24]. It is seen that the B coefficient is above the mean threshold of audibility in all cases, but above the frequency 140 Hz, the measured values are within the confidence interval. Thus, it is not guaranteed that these cases actually correspond to audible inharmonicity. At low frequencies, in the case of the 19 lowest keys of the harpsichord, where the inharmonicity coefficients are about 10^-5, the inharmonicity is audible according to this comparison. It is thus important to implement the inharmonicity for the lowest two octaves or so, but it may not be necessary to implement it for the rest of the notes.

This conclusion is in accordance with [10], where inharmonicity is stated to be part of the tonal quality of the harpsichord, and also with [12], where it is mentioned that the inharmonicity is less pronounced than in the piano.

3.3. Sample databases

The excitation signals of the string models are stored in a database from where they can be retrieved at the onset time. The excitation sequences contain 20,000 samples (0.45 s),
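The conversion from loop gain to decay time in (4), together with the factor -60/[20 log(1/e)] = 6.9078 used for the T60 curves, can be sketched as follows (Python; the loop gain values in the example call are illustrative, not measured data):

```python
import math

def t60_from_loop_gain(G, f0):
    """Per-partial reverberation times: tau(k) = -1 / (f0 ln G(k)) from Eq. (4),
    scaled by -60 / [20 log10(1/e)] = 6.9078 to convert tau into T60."""
    factor = -60.0 / (20.0 * math.log10(1.0 / math.e))
    return [factor * (-1.0 / (f0 * math.log(Gk))) for Gk in G]

# illustrative loop gains: the closer G(k) is to one, the longer the partial rings
t60 = t60_from_loop_gain([0.999, 0.995, 0.99], f0=220.5)
```

For G(k) = 0.995 at f0 = 220.5 Hz this gives a T60 of about 6.25 seconds, consistent with the order of magnitude of the dashed curve in Figure 4b.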
Figure 5: Estimates of the inharmonicity coefficient B for all 56 keys of the harpsichord (circles connected with a thick line). Also shown are the threshold of audibility for the B coefficient (solid line) and its 90% confidence intervals (dashed lines) taken from [24].

Figure 6: Time-frequency plot of the harpsichord air radiation when the 8' bridge is excited. To exemplify the fast decay of the low-frequency modes, only the first 2 seconds and frequencies up to 4000 Hz are displayed.

and they have been extracted from recorded tones by canceling the partials. The analysis and calibration procedure is discussed further in Section 4 of this paper. The idea is to include in these samples the sound of the quill scraping the string plus the beginning of the attack of the sound, so that a natural attack is obtained during synthesis and the initial levels of partials are set properly. Note that this approach is slightly different from the standard commuted synthesis technique, where the full inverse-filtered recorded signal is used to excite the string model [18, 25]. In the latter case, all modes of the soundboard (or soundbox) are contained within the input sequence, and virtually perfect resynthesis is accomplished if the same parameters are used for inverse filtering and synthesis. In the current model, however, we have truncated the excitation signals by windowing them with the right half of a Hanning window. The soundboard response is much longer than that (several seconds), but imitating its ringing tail is taken care of by the soundboard filter (see the next subsection).

In addition to the excitation samples, we have extracted short release sounds from recorded tones. One of these is retrieved and played each time a note-off command occurs. Extracting these samples is easy: once a note is played, the player can wait until the string sound has completely decayed and then release the key. This way, a clean recording of noises related to the release event is obtained, and any extra processing is unnecessary. An alternative way would be to synthesize these knocking sounds using modal synthesis, as suggested in [26].

3.4. Modeling the reverberant soundboard and undamped strings

When a note is plucked on the harpsichord, the string vibrations excite the bridge and, consequently, the soundboard. The soundboard has its own modes depending on the size and the materials used. The radiated acoustic response of the harpsichord is reasonably flat over a frequency range from 50 to 2000 Hz [11]. In addition to exciting the air and structural modes of the instrument body, the pluck excites the part of the string that lies behind the bridge, the high modes of the low strings that the dampers cannot perfectly attenuate, and the highest octave of the 4' register strings.4 The resonance strings behind the bridge are about 6 to 20 cm long and have a very inharmonic spectral structure. The soundboard filter used in our harpsichord synthesizer (see Figure 2) is responsible for imitating all these features. However, as will be discussed further in Section 4.5, the lowest body modes can be ignored, since they decay fast and are present in the excitation samples. In other words, the modeling is divided into two parts so that the soundboard filter models the reverberant tail while the attack part is included in the excitation signal, which is fed to the string model. Reference [11] discusses the resonance modes of the harpsichord soundboard in detail.

4 The instrument used in this study does not have dampers in the last octave of the 4' register.

The radiated acoustic response of the harpsichord was recorded in an anechoic chamber by exciting the bridges (8' and 4') with an impulse hammer at multiple positions. Figure 6 displays a time-frequency response of the 8' bridge when excited between the C3 strings, that is, approximately at the middle point of the bridge. The decay times at frequencies below 350 Hz are considerably shorter than in the frequency range from 350 to 1000 Hz. The T60 values at the respective bands are about 0.5 seconds and 4.5 seconds. This can be explained by the fact that the short string portions behind the bridge and the undamped strings resonate and decay slowly.

As suggested by several authors, see, for example, [14, 27, 28], the impulse response of a musical instrument body can be modeled with a reverberation algorithm. Such algorithms were originally devised for imitating the impulse response of concert halls. In a previous work, we triggered a static sample of the body response with every note [29]. In contrast to the sample-based solution, which produces the same response every time, the reverberation algorithm produces additional variation in the sound: as the input signal of the reverberation algorithm is changed, or in this case as the key or register is changed, the temporal and frequency content of the output changes accordingly.

The soundboard response of the harpsichord in this work is modeled with an algorithm presented in [30]. It is a modification of the feedback delay network [31], where the feedback matrix is replaced with a single coefficient and comb allpass filters have been inserted in the delay line loops. A schematic view of the reverberation algorithm is shown in Figure 7. This structure is used because of its computational efficiency. The Hk(z) blocks represent the loss filters, the Ak(z) blocks are the comb allpass filters, and the delay lines are of length Pk. In this work, eight (N = 8) delay lines are implemented.

One-pole lowpass filters are used as loss filters, which implement the frequency-dependent decay. The comb allpass filters increase the diffusion effect, and they all have the transfer function

Ak(z) = (a_ap,k + z^(-Mk)) / (1 + a_ap,k z^(-Mk)),    (5)

where Mk are the delay-line lengths and a_ap,k are the allpass filter coefficients. To ensure stability, it is required that a_ap,k lies in (-1, 1). In addition to the reverberation algorithm, a tone-corrector filter, as shown in Figure 2, is used to match the spectral envelope of the target response, that is, to suppress the low frequencies below 350 Hz and give some additional lowpass characteristics at high frequencies. The choice of the parameters is discussed in Section 4.5.

4. CALIBRATION OF THE SYNTHESIS ALGORITHM

The harpsichord was brought into an anechoic chamber where the recordings and the acoustic measurements were conducted. The registered signals enable the automatic calibration of the harpsichord synthesizer. This section describes the recordings, the signal analysis, and the calibration techniques for the string and the soundboard models.

4.1. Recordings

Harpsichord tones were recorded in the large anechoic chamber of Helsinki University of Technology. Recordings were made with multiple microphones installed at a distance of about 1 m above the soundboard. The signals were recorded digitally (44.1 kHz, 16 bits) directly onto the hard disk, and to remove disturbances in the infrasonic range, they were highpass filtered. The highpass filter is a fourth-order Butterworth highpass filter with a cutoff frequency of 52 Hz or 32 Hz (for the lowest tones). The filter was applied to the signal in both directions to obtain zero-phase filtering. The recordings were compared in an informal listening test among the authors, and the signals obtained with a high-quality studio microphone by Schoeps were selected for further analysis.

All 56 keys of the instrument were played separately with six different combinations of the registers that are commonly used. This resulted in 56 × 6 = 336 recordings. The tones were allowed to decay into silence, and the key release was included. The length of the single tones varied between 10 and 25 seconds, because the bass tones of the harpsichord tend to ring much longer than the treble tones. For completeness, we recorded examples of different dynamic levels of different keys, although it is known that the harpsichord has a limited dynamic range due to its excitation mechanism. Short staccato tones, slow key pressings, and fast repetitions of single keys were also registered. Chords were recorded to measure the variations of attack times between simultaneously played keys. Additionally, scales and excerpts of musical pieces were played and recorded.

Both bridges of the instrument were excited at several points (four and six points for the 4' and the 8' bridge, respectively) with an impulse hammer to obtain reliable acoustic soundboard responses. The force signal of the hammer and the acceleration signal obtained from an accelerometer attached to the bridge were recorded for the 8' bridge at three locations. The acoustic response was recorded in synchrony.

4.2. Analysis of recorded tones and extraction of excitation signals

Initial estimates of the synthesizer parameters can be obtained from the analysis of recorded tones. For the basic calibration of the synthesizer, the recordings were selected where each register is played alone. We use a method based on the short-time Fourier transform and sinusoidal modeling, as previously discussed in [18, 32]. The inharmonicity of harpsichord tones is accounted for in the spectral peak-picking algorithm with the help of the estimated B coefficient values. After extracting the fundamental frequency, the analysis system essentially decomposes the analyzed tone into its deterministic and stochastic parts, as in the spectral modeling synthesis method [33]. However, in our system the decay times of the partials are extracted, and the loop filter design is based on the loop gain data calculated from the decay times. The envelopes of partials in the harpsichord tones exhibit beating and two-stage decay, as is usual for string instruments [34]. The residual is further processed, that is, the soundboard contribution is mostly removed (by windowing the residual signal in the time domain) and the initial level of each partial is adjusted by adding a correction obtained through sinusoidal modeling and inverse filtering [35, 36]. The resulting processed residual is used as an excitation signal to the model.
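The comb allpass filter of (5) can be realized directly from its difference equation, y(n) = a_ap x(n) + x(n - M) - a_ap y(n - M). A minimal sketch (Python; the values M = 8 and a_ap = 0.5 below are illustrative, the latter happening to coincide with the calibrated coefficient mentioned in Section 4.5):

```python
def comb_allpass(x, M, a):
    """Comb allpass filter of Eq. (5): A(z) = (a + z^-M) / (1 + a z^-M),
    realized as y(n) = a*x(n) + x(n-M) - a*y(n-M); stability needs |a| < 1."""
    y = []
    for n, xn in enumerate(x):
        xm = x[n - M] if n >= M else 0.0
        ym = y[n - M] if n >= M else 0.0
        y.append(a * xn + xm - a * ym)
    return y

# impulse response: h(0) = a, h(M) = 1 - a^2, then h(kM) = (-a)^(k-1) (1 - a^2);
# being allpass, the response carries unit energy, it only spreads it in time
h = comb_allpass([1.0] + [0.0] * 319, M=8, a=0.5)
```

The time-spreading of the impulse without spectral coloration is exactly the diffusion effect the text attributes to these filters.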
Figure 7: A schematic view of the reverberation algorithm used for soundboard modeling.
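A toy reverberator in the spirit of Figure 7 can be sketched as follows (Python). This is a deliberately simplified, assumption-laden sketch: the single feedback coefficient is read here as g_fb acting per delay line (g_fb times the identity matrix), and the comb allpass diffusers Ak(z) are omitted; only the parallel delay lines with one-pole lowpass loss filters are shown:

```python
def sparse_fdn(x, delays, lp_a, gfb=-0.25):
    """Simplified sketch of a modified feedback delay network: parallel delay
    lines of lengths P_k, each with a one-pole lowpass loss filter (the H_k(z)
    blocks) and a common feedback coefficient gfb, assumed here to act per
    line. The comb allpass diffusers A_k(z) are left out of this sketch."""
    bufs = [[0.0] * P for P in delays]
    lp = [0.0] * len(delays)                 # one-pole lowpass states
    out = []
    for n, xn in enumerate(x):
        y = 0.0
        for k, P in enumerate(delays):
            v = bufs[k][n % P]               # delay line output
            lp[k] = (1.0 - lp_a[k]) * v + lp_a[k] * lp[k]   # lossy smoothing
            bufs[k][n % P] = xn + gfb * lp[k]               # feedback path
            y += lp[k]
        out.append(y)
    return out

# impulse response of a small 4-line network; the prime-like lengths follow
# the text's advice to keep delay-line lengths incommensurate
imp = [1.0] + [0.0] * 19999
h = sparse_fdn(imp, delays=[1009, 1151, 1327, 1499], lp_a=[0.2] * 4)
```

With |g_fb| < 1 and unity-DC-gain loss filters, each line decays exponentially, giving the frequency-dependent reverberant tail described in the text; the real algorithm of [30] additionally inserts the diffusing allpass sections.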
4.3. Loss filter design

Since the ripply loop filter is an extension of the one-pole filter that allows improved matching of the decay rate of one partial and simply introduces variations to the others, it is reasonable to design it after the one-pole filter. This kind of approach is known to be suboptimal in filter design, but the highest possible accuracy is not the main goal of this work. Rather, we aim at a simple and reliable routine for automatically processing a large amount of measurement data, leaving a minimum number of erroneous results to be fixed manually.

Figure 8 shows the loop gain and T60 data for an example case. It is seen that the target data (bold dots in Figure 8) contain a fair amount of variation from one partial to the next, although the overall trend is downward as a function of frequency. Partials with indices 10, 11, 16, and 18 are excluded (set to zero), because their decay times were found to be unreliable (i.e., loop gain larger than unity). The one-pole filter response fitted using a weighted least-squares technique [18] (dashed lines in Figure 8) can follow the overall trend, but it evens up the differences between neighboring partials.

Figure 8: (a) The target loop gain for a harpsichord tone (f0 = 197 Hz) (bold dots), the magnitude response of the conventional one-pole filter with g = 0.9960 and a = -0.0296 (dashed line), and the magnitude response of the ripply loss filter with r = -0.0015 and rrate = 0.0833 (solid line). (b) The corresponding T60 data. The total delay-line length is 223.9 samples, and the delay-line length R of the ripple filter is 19 samples.

The ripply loss filter can be designed using the following heuristic rules.

(1) Select the partial with the largest loop gain, starting from the second partial5 (the sixth partial in this case, see Figure 8), and denote its index by kmax. Usually one of the lowest partials will be picked once the outliers have been discarded.

(2) Set the absolute value of r so that, together with the one-pole filter, the magnitude response will match the target loop gain of the partial with index kmax, that is, |r| = G(kmax) - |H(kmax f0)|, where the second term is the loop gain due to the one-pole filter at that frequency (in this case r = 0.0015).

(3) If the target loop gain of the first partial is larger than the magnitude response of the one-pole filter alone at that frequency, set the sign of r to positive, and otherwise to negative, so that the decay of the first partial is made fast (in the example case in Figure 8, the minus sign is chosen, that is, r = -0.0015).

(4) If a positive r has been chosen, conduct a stability check at the zero frequency. If it fails (i.e., g + r >= 1), the value of r must be made negative by changing its sign.

(5) Set the ripple rate parameter rrate so that the longest-ringing partial will occur at the maximum nearest to 0 Hz. This means that the parameter must be chosen according to the following rule:

rrate = 1/kmax when r >= 0,
rrate = 1/(2 kmax) when r < 0.    (6)

5 In practice, the first partial may have the largest loop gain. However, if we tried to match it using the ripply loss filter, the rrate parameter would go to 1, as can be seen from (6), and the delay-line length R would become equal to L rounded to an integer, as can be seen from (3). This practically means that the ripple filter would be reduced to a correction of the loop gain by r, which can also be done by simply replacing the loop gain parameter g with g + r. For this reason, it is sensible to match the loop gain of a partial other than the first one.

In the example case, as the ripple pattern is a negative cosine wave (in the frequency domain) and the peak should hit the 6th partial, we set the rrate parameter equal to 1/12 = 0.0833. This implies that the minimum will occur at every 12th partial and the first maximum will occur at the 6th partial. The result of this design procedure is shown in Figure 8 with the solid line. Note that the peak is actually between the 5th and the 6th partial, because fractional delay techniques are not used in this part of the system and the delay-line length R is thus an integer, as defined in (3). It is obvious that this design method is limited in its ability to follow arbitrary target data. However, as we now know that the resolution of human hearing is also very limited in evaluating differences in decay rates [37], we find the match in most cases to be sufficiently good.

4.4. Beating filter design

The beating filter, a second-order resonator R(z) coupled in parallel with the string model (see Figure 2), is used for reproducing the beating in harpsichord synthesis. In practice, we decided to choose the center frequency of the resonator so that it brings about the beating effect in one of the low-index partials that has a prominent level and a large beat amplitude. These criteria make sure that the single resonator will produce an audible effect during synthesis.

In this implementation, we probed the deviation of the actual decay characteristics of the partials from the ideal exponential decay. This procedure is illustrated in Figure 9. In Figure 9a, the mean-squared error (MSE) of the deviation is shown. The lowest partial that exhibits a high deviation (the 10th partial in this example) is selected as a candidate for the most prominent beating partial. Its magnitude envelope is presented in Figure 9b by a solid curve. It exhibits a slow beating pattern with a period of about 1.5 seconds. The second-order resonator that simulates beating, in turn, can be tuned to result in a beating pattern with this same rate. For comparison, the magnitude envelopes of the 9th and 11th partials are also shown by dashed and dash-dotted curves, respectively.

Figure 9: (a) The mean squared error of exponential curve fitting to the decay of partials (f0 = 197 Hz), where the lowest large deviation has been circled (10th partial), and the acceptance threshold is presented with a dashed-dotted line. (b) The corresponding temporal envelopes of the 9th, 10th, and 11th partials, where the slow beating of the 10th partial and deviations in decay rates are visible.

The center frequency of the resonator is measured from the envelope of the partial. In practice, the offset ranges from practically 0 Hz to a few Hertz. The gain of the resonator, that is, the amplitude of the beating partial, is set to be the same as that of the partial it beats against. This simple choice is backed by the recent result by Järveläinen and Karjalainen [38] that the beating in string instrument tones is essentially perceived as an on/off process: if the beating amplitude is above the threshold of audibility, it is noticed, while if it is below it, it becomes inaudible. Furthermore, changes in the beating amplitude appear to be inaccurately perceived. Before knowing these results, in a former version of the synthesizer, we also decided to use the same amplitude for the two components that produce the beating, because the mixing parameter that adjusts the beating amplitude was not giving a useful audible variation [39]. Thus, we are now convinced that it is unnecessary to add another parameter to all string models by allowing changes in the amplitude of the beating partial.

4.5. Design of soundboard filter

The reverberation algorithm and the tone correction unit are set in cascade, and together they form the soundboard model, as shown in Figure 2. For determining the soundboard filter, the parameters of the reverberation algorithm and its tone corrector have to be set. The parameters for the reverberation algorithm were chosen as proposed in [31]. To match the frequency-dependent decay, the ratio between the decay times at 0 Hz and at fs/2 was set to 0.13, so that T60 at 0 Hz became 6.0 seconds. The lengths of the eight delay lines varied from 1009 to 1999 samples. To avoid superimposing the responses, the lengths were incommensurate numbers [40]. The lengths Mk of the delay lines in the comb allpass structures were set to 8% of the total length of each delay line path Pk, the filter coefficients a_ap,k were all set to 0.5, and the feedback coefficient gfb was set to -0.25.
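The heuristic rules (1)-(5) of the loss filter design in Section 4.3 can be condensed into a short routine. The sketch below (Python) exercises the rule logic on synthetic target data built to mimic the Figure 8 example (f0 = 197 Hz, g = 0.996, a = -0.0296, sixth partial boosted, weak first partial); the data construction is hypothetical, while the selection and sign logic follows the text:

```python
import cmath, math

def one_pole_gain(f, fs, g, a):
    """Magnitude response of the conventional one-pole loss filter,
    H1(z) = g (1 + a) / (1 + a z^-1)."""
    z = cmath.exp(2j * math.pi * f / fs)
    return abs(g * (1.0 + a) / (1.0 + a * z ** (-1)))

def design_ripple(G, f0, fs, g, a):
    """Heuristic rules (1)-(5): pick the strongest partial above the first
    (zeroed entries in G are discarded outliers), set |r| to the residual gain
    there, choose the sign of r from the first partial and the stability
    check, and set rrate by Eq. (6). G[k-1] is the target gain of partial k."""
    kmax = max((k for k in range(2, len(G) + 1) if G[k - 1] > 0),
               key=lambda k: G[k - 1])                         # rule (1)
    r = abs(G[kmax - 1] - one_pole_gain(kmax * f0, fs, g, a))  # rule (2)
    if G[0] <= one_pole_gain(f0, fs, g, a) or g + r >= 1.0:
        r = -r                                                 # rules (3)-(4)
    rrate = 1.0 / kmax if r >= 0 else 1.0 / (2.0 * kmax)       # Eq. (6)
    return r, rrate, kmax

# hypothetical target data shaped like the Figure 8 case
fs, f0, g, a = 44100.0, 197.0, 0.996, -0.0296
G = [one_pole_gain(k * f0, fs, g, a) for k in range(1, 11)]
G[5] += 0.0015          # boost the 6th partial
G[0] -= 0.001           # weak first partial forces a negative r
r, rrate, kmax = design_ripple(G, f0, fs, g, a)
```

On this synthetic case the routine reproduces the paper's example design: kmax = 6, r = -0.0015, and rrate = 1/12.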
The excitation signals for the harpsichord synthesizer are 0.45 second long, and hence contain the necessary fast-decaying modes for frequencies below 350 Hz (see Figure 6). Therefore, the tone correction section is divided into two parts: a highpass filter that suppresses frequencies below 350 Hz and another filter that imitates the spectral envelope at the middle and high frequencies. The highpass filter is a 5th-order Chebyshev type I design with a 5 dB passband ripple, the 6 dB point at 350 Hz, and a roll-off rate of about 50 dB per octave below the cutoff frequency. The spectral envelope filter for the soundboard model is a 10th-order IIR filter designed using linear prediction [41] from a 0.2-second long windowed segment of the measured target response (see Figure 6 from 0.3 second to 0.5 second). Figure 10 shows the time-frequency plot of the target response and the soundboard filter for the first 1.5 seconds up to 10 kHz. The target response has a prominent lowpass characteristic, which is due to the properties of the impulse hammer. While the response should really be inverse filtered by the hammer force signal, in practice we can approximately compensate this effect with a differentiator whose transfer function is Hdiff(z) = 0.5 − 0.5 z^−1. This is done before the design of the tone corrector, so the compensation filter is not included in the synthesizer implementation.
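The two-part tone corrector can be sketched with standard tools. The snippet below is our illustrative reconstruction, not the published implementation: it uses SciPy's `cheby1` for the 5th-order Chebyshev type I highpass and a small autocorrelation (normal-equations) LPC helper for the spectral-envelope fit; the synthetic AR(1) signal at the end only exercises the helper.

```python
import numpy as np
from scipy import signal

fs = 44100

# 5th-order Chebyshev type I highpass, 5 dB passband ripple, edge near 350 Hz.
b_hp, a_hp = signal.cheby1(5, 5, 350.0 / (fs / 2), btype='highpass')

# Differentiator compensating the impulse hammer's lowpass character,
# applied to the target response before the envelope fit:
b_diff = [0.5, -0.5]                     # H_diff(z) = 0.5 - 0.5 z^-1

def lpc(x, order):
    """All-pole fit A(z) via the autocorrelation normal equations."""
    r = np.correlate(x, x, mode='full')[len(x) - 1:len(x) + order]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:order + 1])
    return np.concatenate(([1.0], -a))   # denominator of the all-pole model 1/A(z)

# Sanity check on a synthetic AR(1) signal with a pole at 0.9:
rng = np.random.default_rng(0)
x = signal.lfilter([1.0], [1.0, -0.9], rng.standard_normal(4000))
a_env = lpc(x, 1)
```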
5. IMPLEMENTATION AND APPLICATIONS
Figure 10: The time-frequency representation of (a) the recorded soundboard response and (b) the synthetic response obtained as the impulse response of a modified feedback delay network.

This section deals with computational efficiency, implementation issues, and musical applications of the harpsichord synthesizer.

5.1. Computational complexity

The computational cost caused by implementing the harpsichord synthesizer and running it at an audio sample rate, such as 44100 Hz, is relatively small. Table 1 summarizes the amount of multiplications and additions needed per sample for various parts of the system. In this cost analysis, it is assumed that the dispersion is simulated using a first-order allpass filter. In practice, the lowest tones require a higher-order allpass filter, but some of the highest tones may not have the allpass filter at all. So the first-order filter represents an average cost per string model. Note that the total cost per string is smaller than that of an FIR filter of order 12 (i.e., 13 multiplications and 12 additions). In practice, one voice in harpsichord synthesis is allocated one to three string models, which simulate the different registers. The soundboard model is considerably more costly than a string model: the number of multiplications is more than fourfold, and the number of additions is almost seven times larger. The complexity analysis of the comb allpass filters in the soundboard model is based on the direct form II implementation (i.e., one delay line, two multiplications, and two additions per comb allpass filter section).

The implementation of the synthesizer, which is discussed in detail in the next section, is based on high-level programming and control. Thus, it is not optimized for the fastest possible real-time operation. The current implementation of the synthesizer runs on a Macintosh G4 (800 MHz) computer, and it can simultaneously run 15 string models in real time without the soundboard model. With the soundboard model, it is possible to run about 10 strings. A new, faster computer and optimization of the code can increase these numbers. With optimized code and fast hardware, it may be possible to run the harpsichord synthesizer with full polyphony (i.e., 56 voices) and soundboard in real time using current technology.

5.2. Synthesizer implementation

The signal-processing part of the harpsichord synthesizer is realized using a visual software synthesis package called PWSynth [42]. PWSynth, in turn, is part of a larger visual programming environment called PWGL [43]. Finally, the control information is generated using our music notation package ENP (expressive notation package) [44]. In this section, the focus is on design issues that we have encountered when implementing the synthesizer. We also give ideas on
Table 1: The number of multiplications and additions in different parts of the synthesizer.
Part of synthesis algorithm                        Multiplications   Additions
String model
  • Fractional delay allpass filter F(z)                  2              2
  • Inharmonizing allpass filter Ad(z)                    2              2
  • One-pole filter                                       2              1
  • Ripple filter                                         1              1
  • Resonator R(z)                                        3              2
  • Timbre control                                        2              1
  • Mixing with release sample                            1              1
Soundboard model
  • Modified FDN reverberator                            33             47
  • IIR tone corrector                                   11             10
  • Highpass filter                                      12              9
  • Mixing                                                1              1
Total
  • Per string (without soundboard model)                13             10
  • Soundboard model                                     57             67
  • All (one string and soundboard model)                70             77
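As a rough real-time feasibility check, the totals in Table 1 can be converted into operations per second at the audio rate. The helper below is our own illustration, not part of the synthesizer:

```python
# Operation counts per output sample, taken from Table 1.
FS = 44100
STRING_OPS = 13 + 10        # multiplications + additions per string model
SOUNDBOARD_OPS = 57 + 67    # multiplications + additions for the soundboard

def ops_per_second(num_strings):
    """Total arithmetic operations per second for num_strings string
    models plus one soundboard model, at a 44.1 kHz sample rate."""
    return (num_strings * STRING_OPS + SOUNDBOARD_OPS) * FS

total = ops_per_second(10)  # about 10 strings ran in real time on the G4
```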
how the model is parameterized so that it can be controlled from the music notation software.

Our previous work in designing computer simulations of musical instruments has resulted in several applications, such as the classical guitar [39], the Renaissance lute, the Turkish ud [45], and the clavichord [29]. The two-manual harpsichord tackled in the current study is the most challenging and complex instrument that we have yet investigated. As this kind of work is experimental, and the synthesis model must be refined by interactive listening, a system is needed that is capable of making fast and efficient prototypes of the basic components of the system. Another nontrivial problem is the parameterization of the harpsichord synthesizer. In a typical case, one basic component, such as the vibrating string model, requires over 10 parameters so that it can be used in a convincing simulation. Thus, since the full harpsichord synthesizer implementation has three string sets each having 56 strings, we need at least 1680 (= 10 × 3 × 56) parameters in order to control all individual strings separately.

Figure 11 shows a prototype of a harpsichord synthesizer. It contains three main parts. First, the top-most box (called “num-box” with the label “number-of-strings”) gives the number of strings within each string set used by the synthesizer. This number can vary from 1 (useful for preliminary tests) to 56 (the full instrument). In a typical real-time situation, this number can vary, depending on the polyphony of the musical score to be realized, between 4 and 10. The next box of interest is called “string model.” It is a special abstraction box that contains a subwindow. The contents of this window are displayed in Figure 12. This abstraction box defines a single string model. Next, Figure 11 shows three “copy-synth-patch” boxes that determine the individual string sets used by the instrument. These sets are labeled as follows: “harpsy1/8-fb/,” “harpsy1/8-ff,” and “harpsy1/4-ff/.” Each string set copies the string model patch count times, where count is equal to the current number of strings (given by the upper number-of-strings box). The rest of the boxes in the patch are used to mix the outputs of the string sets.

Figure 12 gives the definition of a single string model. The patch consists of two types of boxes. First, the boxes with the name “pwsynth-plug” (the boxes with the darkest outlines in grey-scale) define the parametric entry points that are used by our control system. Second, the other boxes are low-level DSP modules, realized in C++, that perform the actual sample calculation, and boxes which are used to initialize the DSP modules. The “pwsynth-plug” boxes point to memory addresses that are continuously updated while the synthesizer is running.
Each “pwsynth-plug” box has a label that is used to build symbolic parameter pathnames. While the “copy-synth-patch” boxes (see the main patch of Figure 11) copy the string model in a loop, the system automatically generates new unique pathnames by merging the label from the current “copy-synth-patch” box, the current loop index, and the label found in the “pwsynth-plug” boxes. Thus, pathnames like “harpsy1/8-fb/1/lfgain” are obtained, which refers to the lfgain (loss filter gain) of the first string of the 8 back string set of a harpsichord model called “harpsy1.”

5.3. Musical applications

The harpsichord synthesizer can be used as an electronic musical instrument controlled either from a MIDI keyboard or from sequencer software. Recently, some composers have been interested in using a formerly developed model-based guitar synthesizer for compositions, which are either experimental in nature or extremely challenging for human players.
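The pathname scheme of Section 5.2 (string-set label + loop index + plug label) can be sketched as follows; `make_pathnames` and the plug labels passed to it are hypothetical helpers of ours, not PWSynth API:

```python
def make_pathnames(set_label, count, plug_labels):
    """Build unique parameter pathnames by merging the string-set label,
    the 1-based string index, and each parameter-entry (plug) label."""
    return [f"{set_label}{i}/{plug}"
            for i in range(1, count + 1)
            for plug in plug_labels]

# One string set of the full instrument, with two example plug labels:
paths = make_pathnames("harpsy1/8-fb/", 56, ["lfgain", "freqsc"])
```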
Figure 11: The top-level prototype of the harpsichord synthesizer in PWSynth. The patch defines one string model and the three string sets used by the instrument.
Another fascinating idea is to extend the range and timbre of the instrument. A version of the guitar synthesizer, which we call the super guitar, has an extended range and a large number of strings [46]. We plan to develop a similar extension of the harpsichord synthesizer.

In the current version of the synthesizer, the parameters have been calibrated based on recordings. One obvious application for a parametric synthesizer is to modify the timbre by deviating the parameter values. This can lead to extended timbres that belong to the same instrument family as the original instrument or, in the extreme cases, to a novel virtual instrument that cannot be recognized by listeners. One of the most obvious subjects for modification is the decay rate, which is controlled with the coefficients of the loop filter.

A well-known limitation of the harpsichord is its restricted dynamic range. In fact, it is a controversial issue whether the key velocity has any audible effect on the sound of the harpsichord. The synthesizer easily allows the implementation of an exaggerated dynamic control, where the key velocity has a dramatic effect on both the amplitude and the timbre, if desired, such as in the piano or in the acoustic guitar. As the key velocity information is readily available, it can be used to control the gain and the properties of a timbre control filter (see Figure 2).

Luthiers who make musical instruments are interested in modern technology and want to try physics-based synthesis to learn about the instrument. A synthesizer allows varying certain parameters in the instrument design which are difficult or impossible to adjust in the real instrument. For example, the point where the quill plucks the string is structurally fixed in the harpsichord, but as it has a clear effect on the timbre, varying it is of interest. In the current harpsichord synthesizer, it would require the knowledge of the plucking point and then inverse filtering its contribution from the excitation signal. The plucking point contribution can then be implemented in the string model by inserting another feedforward comb filter, as discussed previously in several works [7, 16, 17, 18]. Another prospect is to vary the location of the damper. Currently, we do not have an exact model for the damper, and neither is its location a parameter. Testing this is still possible, because it is known that the nonideal functioning of the damper is related to the nodal points of the strings, which coincide with the locations of the damper. The ripply loss filter allows the imitation of this effect.

Luthiers are interested in the possibility of virtual prototyping without the need for actually building many versions of an instrument out of wood. The current synthesis model may not be sufficiently detailed for this purpose. A real-time or near-real-time implementation of a physical model, where several parameters can be adjusted, would be an ideal tool for testing prototypes.
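The feedforward comb filter mentioned above for the plucking-point effect can be sketched as follows. This is an illustrative helper under our own assumptions (the relative pluck position is mapped to a delay in samples), not the synthesizer's code:

```python
import numpy as np

def plucking_point_comb(x, pluck_pos, string_delay):
    """y[n] = x[n] - x[n - d], with d the pluck position expressed as a
    fraction of the string delay-line length.  The notch pattern of this
    feedforward comb imitates the effect of the plucking point."""
    d = max(1, round(pluck_pos * string_delay))
    y = x.astype(float).copy()
    y[d:] -= x[:-d]
    return y

# An impulse comes out as an impulse pair spaced d samples apart:
x = np.zeros(16)
x[0] = 1.0
y = plucking_point_comb(x, 0.25, 20)   # d = 5
```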
Figure 12: The string model patch. The patch contains the low-level DSP modules and parameter entry points used by the harpsichord synthesizer.
6. CONCLUSIONS

This paper proposes signal-processing techniques for synthesizing harpsichord tones. A new extension to the loss filter of the waveguide synthesizer has been developed which allows variations in the decay times of neighboring partials. This filter will be useful also for the waveguide synthesis of other stringed instruments. The fast-decaying modes of the soundboard are incorporated in the excitation samples of the synthesizer, while the long-ringing modes at the middle and high frequencies are imitated using a reverberation algorithm. The calibration of the synthesis model is made almost automatic. The parameterization and use of simple filters also allow manual adjustment of the timbre. A physics-based synthesizer, such as the one described here, has several musical applications, the most obvious one being the usage as a computer-controlled musical instrument.

Examples of single tones and musical pieces synthesized with the synthesizer are available at http://www.acoustics.hut.fi/publications/papers/jasp-harpsy/.

ACKNOWLEDGMENTS

The work of Henri Penttinen has been supported by the Pythagoras Graduate School of Sound and Music Research. The work of Cumhur Erkut is part of the EU project ALMA (IST-2001-33059). The authors are grateful to B. Bank, P. A. A. Esquef, and J. O. Smith for their helpful comments. Special thanks go to H. Järveläinen for her help in preparing Figure 5.

REFERENCES

[1] J. O. Smith, “Efficient synthesis of stringed musical instruments,” in Proc. International Computer Music Conference, pp. 64–71, Tokyo, Japan, September 1993.
[2] M. Karjalainen and V. Välimäki, “Model-based analysis/synthesis of the acoustic guitar,” in Proc. Stockholm Music Acoustics Conference, pp. 443–447, Stockholm, Sweden, July–August 1993.
[3] M. Karjalainen, V. Välimäki, and Z. Jánosy, “Towards high-quality sound synthesis of the guitar and string instruments,” in Proc. International Computer Music Conference, pp. 56–63, Tokyo, Japan, September 1993.
[4] J. O. Smith and S. A. Van Duyne, “Commuted piano synthesis,” in Proc. International Computer Music Conference, pp. 319–326, Banff, Alberta, Canada, September 1995.
[5] J. O. Smith, “Physical modeling using digital waveguides,” Computer Music Journal, vol. 16, no. 4, pp. 74–91, 1992.
[6] K. Karplus and A. Strong, “Digital synthesis of plucked string and drum timbres,” Computer Music Journal, vol. 7, no. 2, pp. 43–55, 1983.
[7] M. Karjalainen, V. Välimäki, and T. Tolonen, “Plucked-string models, from the Karplus-Strong algorithm to digital waveguides and beyond,” Computer Music Journal, vol. 22, no. 3, pp. 17–32, 1998.
[8] F. Hubbard, Three Centuries of Harpsichord Making, Harvard University Press, Cambridge, Mass, USA, 1965.
[9] N. H. Fletcher and T. D. Rossing, The Physics of Musical Instruments, Springer-Verlag, New York, NY, USA, 1991.
[10] E. L. Kottick, K. D. Marshall, and T. J. Hendrickson, “The acoustics of the harpsichord,” Scientific American, vol. 264, no. 2, pp. 94–99, 1991.
[11] W. R. Savage, E. L. Kottick, T. J. Hendrickson, and K. D. Marshall, “Air and structural modes of a harpsichord,” Journal of the Acoustical Society of America, vol. 91, no. 4, pp. 2180–2189, 1992.
[12] N. H. Fletcher, “Analysis of the design and performance of harpsichords,” Acustica, vol. 37, pp. 139–147, 1977.
[13] J. Sankey and W. A. Sethares, “A consonance-based approach to the harpsichord tuning of Domenico Scarlatti,” Journal of the Acoustical Society of America, vol. 101, no. 4, pp. 2332–2337, 1997.
[14] B. Bank, “Physics-based sound synthesis of the piano,” M.S. thesis, Department of Measurement and Information Systems, Budapest University of Technology and Economics, Budapest, Hungary, 2000, published as Tech. Rep. 54, Laboratory of Acoustics and Audio Signal Processing, Helsinki University of Technology, Espoo, Finland, 2000.
[15] B. Bank, V. Välimäki, L. Sujbert, and M. Karjalainen, “Efficient physics based sound synthesis of the piano using DSP methods,” in Proc. European Signal Processing Conference, vol. 4, pp. 2225–2228, Tampere, Finland, September 2000.
[16] D. A. Jaffe and J. O. Smith, “Extensions of the Karplus-Strong plucked-string algorithm,” Computer Music Journal, vol. 7, no. 2, pp. 56–69, 1983.
[17] J. O. Smith, Techniques for digital filter design and system identification with application to the violin, Ph.D. thesis, Stanford University, Stanford, Calif, USA, 1983.
[18] V. Välimäki, J. Huopaniemi, M. Karjalainen, and Z. Jánosy, “Physical modeling of plucked string instruments with application to real-time sound synthesis,” Journal of the Audio Engineering Society, vol. 44, no. 5, pp. 331–353, 1996.
[19] B. Bank and V. Välimäki, “Robust loss filter design for digital waveguide synthesis of string tones,” IEEE Signal Processing Letters, vol. 10, no. 1, pp. 18–20, 2003.
[20] H. Fletcher, E. D. Blackham, and R. S. Stratton, “Quality of piano tones,” Journal of the Acoustical Society of America, vol. 34, no. 6, pp. 749–761, 1962.
[21] S. A. Van Duyne and J. O. Smith, “A simplified approach to modeling dispersion caused by stiffness in strings and plates,” in Proc. International Computer Music Conference, pp. 407–410, Århus, Denmark, September 1994.
[22] D. Rocchesso and F. Scalcon, “Accurate dispersion simulation for piano strings,” in Proc. Nordic Acoustical Meeting, pp. 407–414, Helsinki, Finland, June 1996.
[23] B. Bank, F. Avanzini, G. Borin, G. De Poli, F. Fontana, and D. Rocchesso, “Physically informed signal processing methods for piano sound synthesis: a research overview,” EURASIP Journal on Applied Signal Processing, vol. 2003, no. 10, pp. 941–952, 2003.
[24] H. Järveläinen, V. Välimäki, and M. Karjalainen, “Audibility of the timbral effects of inharmonicity in stringed instrument tones,” Acoustics Research Letters Online, vol. 2, no. 3, pp. 79–84, 2001.
[25] M. Karjalainen and J. O. Smith, “Body modeling techniques for string instrument synthesis,” in Proc. International Computer Music Conference, pp. 232–239, Hong Kong, China, August 1996.
[26] P. R. Cook, “Physically informed sonic modeling (PhISM): synthesis of percussive sounds,” Computer Music Journal, vol. 21, no. 3, pp. 38–49, 1997.
[27] D. Rocchesso, “Multiple feedback delay networks for sound processing,” in Proc. X Colloquio di Informatica Musicale, pp. 202–209, Milan, Italy, December 1993.
[28] H. Penttinen, M. Karjalainen, T. Paatero, and H. Järveläinen, “New techniques to model reverberant instrument body responses,” in Proc. International Computer Music Conference, pp. 182–185, Havana, Cuba, September 2001.
[29] V. Välimäki, M. Laurson, and C. Erkut, “Commuted waveguide synthesis of the clavichord,” Computer Music Journal, vol. 27, no. 1, pp. 71–82, 2003.
[30] R. Väänänen, V. Välimäki, J. Huopaniemi, and M. Karjalainen, “Efficient and parametric reverberator for room acoustics modeling,” in Proc. International Computer Music Conference, pp. 200–203, Thessaloniki, Greece, September 1997.
[31] J. M. Jot and A. Chaigne, “Digital delay networks for designing artificial reverberators,” in Proc. 90th Convention Audio Engineering Society, Paris, France, February 1991.
[32] C. Erkut, V. Välimäki, M. Karjalainen, and M. Laurson, “Extraction of physical and expressive parameters for model-based sound synthesis of the classical guitar,” in Proc. 108th Convention Audio Engineering Society, p. 17, Paris, France, February 2000.
[33] X. Serra and J. O. Smith, “Spectral modeling synthesis: a sound analysis/synthesis system based on a deterministic plus stochastic decomposition,” Computer Music Journal, vol. 14, no. 4, pp. 12–24, 1990.
[34] G. Weinreich, “Coupled piano strings,” Journal of the Acoustical Society of America, vol. 62, no. 6, pp. 1474–1484, 1977.
[35] V. Välimäki and T. Tolonen, “Development and calibration of a guitar synthesizer,” Journal of the Audio Engineering Society, vol. 46, no. 9, pp. 766–778, 1998.
[36] T. Tolonen, “Model-based analysis and resynthesis of acoustic guitar tones,” M.S. thesis, Laboratory of Acoustics and Audio Signal Processing, Department of Electrical and Communications Engineering, Helsinki University of Technology, Espoo, Finland, 1998, Tech. Rep. 46.
[37] H. Järveläinen and T. Tolonen, “Perceptual tolerances for decay parameters in plucked string synthesis,” Journal of the Audio Engineering Society, vol. 49, no. 11, pp. 1049–1059, 2001.
[38] H. Järveläinen and M. Karjalainen, “Perception of beating and two-stage decay in dual-polarization string models,” in Proc. International Symposium on Musical Acoustics, Mexico City, Mexico, December 2002.
[39] M. Laurson, C. Erkut, V. Välimäki, and M. Kuuskankare, “Methods for modeling realistic playing in acoustic guitar synthesis,” Computer Music Journal, vol. 25, no. 3, pp. 38–49, 2001.
[40] W. G. Gardner, “Reverberation algorithms,” in Applications of Digital Signal Processing to Audio and Acoustics, M. Kahrs and K. Brandenburg, Eds., pp. 85–131, Kluwer Academic, Boston, Mass, USA, 1998.
[41] J. D. Markel and A. H. Gray Jr., Linear Prediction of Speech, Springer-Verlag, Berlin, Germany, 1976.
[42] M. Laurson and M. Kuuskankare, “PWSynth: a Lisp-based bridge between computer assisted composition and sound synthesis,” in Proc. International Computer Music Conference, pp. 127–130, Havana, Cuba, September 2001.
[43] M. Laurson and M. Kuuskankare, “PWGL: a novel visual language based on Common Lisp, CLOS and OpenGL,” in Proc. International Computer Music Conference, pp. 142–145, Gothenburg, Sweden, September 2002.
[44] M. Kuuskankare and M. Laurson, “ENP2.0: a music notation program implemented in Common Lisp and OpenGL,” in Proc. International Computer Music Conference, pp. 463–466, Gothenburg, Sweden, September 2002.
[45] C. Erkut, M. Laurson, M. Kuuskankare, and V. Välimäki, “Model-based synthesis of the ud and the Renaissance lute,” in Proc. International Computer Music Conference, pp. 119–122, Havana, Cuba, September 2001.
[46] M. Laurson, V. Välimäki, and C. Erkut, “Production of virtual acoustic guitar music,” in Proc. Audio Engineering Society 22nd International Conference on Virtual, Synthetic and Entertainment Audio, pp. 249–255, Espoo, Finland, June 2002.

Vesa Välimäki was born in Kuorevesi, Finland, in 1968. He received the M.S. degree, the Licentiate of Science degree, and the Doctor of Science degree, all in electrical engineering, from Helsinki University of Technology (HUT), Espoo, Finland, in 1992, 1994, and 1995, respectively. He was with the HUT Laboratory of Acoustics and Audio Signal Processing from 1990 to 2001. In 1996, he was a Postdoctoral Research Fellow with the University of Westminster, London, UK. During the academic year 2001–2002 he was Professor of signal processing at the Pori School of Technology and Economics, Tampere University of Technology (TUT), Pori, Finland. He is currently Professor of audio signal processing at HUT. He was appointed Docent in signal processing at the Pori School of Technology and Economics, TUT, in 2003. His research interests are in the application of digital signal processing to music and audio. Dr. Välimäki is a Senior Member of the IEEE Signal Processing Society and is a Member of the Audio Engineering Society, the Acoustical Society of Finland, and the Finnish Musicological Society.

Henri Penttinen was born in Espoo, Finland, in 1975. He received the M.S. degree in electrical engineering from Helsinki University of Technology (HUT), Espoo, Finland, in 2003. He has worked at the HUT Laboratory of Acoustics and Signal Processing since 1999 and is currently a Ph.D. student there. His main research interests are signal processing algorithms, real-time audio applications, and musical acoustics. Mr. Penttinen is also active in music through playing, composing, and performing.

Mikael Laurson was born in Helsinki, Finland, in 1951. His formal training at the Sibelius Academy consists of a guitar diploma (1979) and a doctoral dissertation (1996). In 2002, he was appointed Docent in music technology at Helsinki University of Technology, Espoo, Finland. Between the years 1979 and 1985 he was active as a guitarist. Since 1989 he has been working at the Sibelius Academy as a Researcher and Teacher of computer-aided composition. After conceiving the PatchWork (PW) programming language (1986), he started a close collaboration with IRCAM resulting in the first PW release in 1993. After 1993 he has been active as a developer of various PW user libraries. Since the year 1999, Dr. Laurson has worked in a project dealing with physical modeling and sound synthesis control funded by the Academy of Finland and the Sibelius Academy Innovation Centre.

Cumhur Erkut was born in Istanbul, Turkey, in 1969. He received the B.S. and the M.S. degrees in electronics and communication engineering from the Yildiz Technical University, Istanbul, Turkey, in 1994 and 1997, respectively, and the Doctor of Science degree in electrical engineering from Helsinki University of Technology (HUT), Espoo, Finland, in 2002. Between 1998 and 2002, he worked as a Researcher at the HUT Laboratory of Acoustics and Audio Signal Processing. He is currently a Postdoctoral Researcher in the same institution, where he contributes to the EU-funded research project “Algorithms for the Modelling of Acoustic Interactions” (ALMA, European project IST-2001-33059). His primary research interests are model-based sound synthesis and musical acoustics.
Jonte Knif was born in Vaasa, Finland, in 1975. He is currently studying music technology at the Sibelius Academy, Helsinki, Finland. Prior to this he studied the harpsichord at the Sibelius Academy for five years. He has built and designed many historical keyboard instruments and adaptations, such as an electric clavichord. His present interests also include loudspeaker and studio electronics design.

EURASIP Journal on Applied Signal Processing 2004:7, 949–963
© 2004 Hindawi Publishing Corporation
Multirate Simulations of String Vibrations Including Nonlinear Fret-String Interactions Using the Functional Transformation Method
L. Trautmann
Multimedia Communications and Signal Processing, University of Erlangen-Nuremberg, Cauerstrasse 7, 91058 Erlangen, Germany
Email: [email protected]
Laboratory of Acoustics and Audio Signal Processing, Helsinki University of Technology, P.O. Box 3000, 02015 Espoo, Finland
Email: [email protected].fi

R. Rabenstein
Multimedia Communications and Signal Processing, University of Erlangen-Nuremberg, Cauerstrasse 7, 91058 Erlangen, Germany
Email: [email protected]
Received 30 June 2003; Revised 14 November 2003
The functional transformation method (FTM) is a well-established mathematical method for accurate simulations of multidimensional physical systems from various fields of science, including optics, heat and mass transfer, electrical engineering, and acoustics. This paper applies the FTM to real-time simulations of transversal vibrating strings. First, a physical model of a transversal vibrating, lossy, and dispersive string is derived. Afterwards, this model is solved with the FTM for two cases: the ideally linearly vibrating string and the string interacting nonlinearly with the frets. It is shown that accurate and stable simulations can be achieved with the discretization of the continuous solution at audio rate. Both simulations can also be performed with a multirate approach with only minor degradations of the simulation accuracy but with preservation of stability. This saves almost 80% of the computational cost for the simulation of a six-string guitar, bringing it into the range of the computational cost of digital waveguide simulations.

Keywords and phrases: multidimensional system, vibrating string, partial differential equation, functional transformation, nonlinear, multirate approach.
1. INTRODUCTION

Digital sound synthesis methods can mainly be categorized into classical direct synthesis methods and physics-based methods [1]. The first category includes all kinds of sound processing algorithms like wavetable, granular, and subtractive synthesis, as well as abstract mathematical models, like additive or frequency modulation synthesis. What is common to all these methods is that they are based on the sound to be (re)produced.

The physics-based methods, also called physical modeling methods, start at the physics of the sound production mechanism rather than at the resulting sound. This approach has several advantages over the sound-based methods.

(i) The resulting sound, and especially transitions between successive notes, always sounds acoustically realistic as far as the underlying model is sufficiently accurate.

(ii) Sound variations of acoustical instruments due to different playing techniques or different instruments within one instrument family are described in the physics-based methods with only a few parameters. These parameters can be adjusted in advance to simulate a distinct acoustical instrument, or they can be controlled by the musician to morph between real-world instruments to obtain more degrees of freedom in the expressiveness and variability.

The second item makes physical modeling methods quite useful for multimedia applications where only a very limited bandwidth is available for the transmission of music as, for example, in mobile phones. In these applications, the physical model has to be transferred only once, and afterwards it is sufficient to transfer only the musical score while keeping the variability of the resulting sound.

The starting points for the various existing physical modeling methods are always physical models varying for a certain vibrating object only in the model accuracies. The application of the basic laws of physics to an existing or imaginary vibrating object results in continuous-time, continuous-space models. These models are called initial-boundary-value problems, and they contain a partial differential equation (PDE) and some initial and boundary conditions. The discretization approaches to the continuous models and the digital realizations are different for the single physical modeling methods.

One of the first physical modeling algorithms for the simulation of musical instruments was made by Hiller and Ruiz in 1971 [2] with the finite difference method. It directly discretizes the temporal and spatial differential operators of the PDE to finite difference terms. On the one hand, this approach is computationally very demanding, since temporal and spatial sampling intervals have to be chosen small for accurate simulations. Furthermore, stability problems occur, especially in dispersive vibrational objects, if the relationship between temporal and spatial sampling intervals is not chosen properly [3]. On the other hand, the finite difference method is quite suitable for studies in which the vibration has to be evaluated on a dense spatial grid. Therefore, the finite difference method has mainly been used for academic studies rather than for real-time applications (see, e.g., [4, 5]). However, the finite difference method has recently become more popular also for real-time applications in conjunction with other physical modeling methods [6, 7].

A mathematically similar discretization approach is used in mass-spring models that are closely related to the finite element method. In this approach, the vibrating structure is reduced to a finite number of mass points that are interconnected by springs and dampers. One of the first systems for the simulation of musical instruments was the CORDIS system, which could be realized in real time on a specialized processor [8]. The finite difference method, as well as the mass-spring models, can be viewed as direct discretization approaches of the initial-boundary-value problems. Despite the stability problems, they are very easy to set up, but they are computationally demanding.

In modal synthesis, first introduced in [9], the PDE is spatially discretized at not necessarily equidistant spatial points, similar to the mass-spring models. The interconnections between these discretized spatial points reflect the physical behavior of the structure. This discretization reduces the degrees of freedom for the vibration to the number of spatial points, which is directly transferred to the same number of temporal modes the structure can vibrate in. The reduction does not only allow the calculation of the modes of simple structures, but it can also handle vibrational measurements

17, 18]. The DWG first simplifies the PDE to the wave equation, which has an analytical solution in the form of a forward and a backward traveling wave, called the d'Alembert solution. It can be realized computationally very efficiently with delay lines. The sound effects like damping or dispersion occurring in the vibrating structure are included in the DWG by low-order digital filters concentrated in one point of the delay line. This procedure ensures the computational efficiency, but the implementation loses the direct connection to the physical parameters of the vibrating structure.

The focus of this article is the FTM. It was first introduced in [19] for the heat-flow equation and first used for digital sound synthesis in [20]. Extensions to the basic model of a vibrating string and comparisons between the FTM and the above-mentioned physical modeling methods are given, for example, in [12]. In the FTM, the initial-boundary-value problem is first solved analytically by appropriate functional transformations before it is discretized for computer simulations. This ensures a high simulation accuracy as well as an inherent stability. One of the drawbacks of the FTM is so far its computational load, which is about five times higher than the load of the DWG [21].

This article extends the FTM by applying a multirate approach to the discrete realization of the FTM, such that the computational complexity is significantly reduced. The extension is shown for the linearly vibrating string as well as for the nonlinear limitation of the string vibration by a fret-string interaction occurring in slapbass synthesis.

The article is organized as follows. Section 2 derives the physical model of a transversal vibrating, dispersive, and lossy string in terms of a scalar PDE and initial and boundary conditions. Furthermore, a model for a nonlinear fret-string interaction is given. These models are solved in Section 3 with the FTM in continuous time and continuous space. Section 4 discretizes these solutions at audio rate and derives an algorithm to guarantee stability even for the nonlinear discrete system. A multirate approach is used in Section 5 for the simulation of the continuous solution to save computational cost. It is shown that this multirate approach also works for nonlinear systems. Section 6 compares the audio rate and the multirate solutions with respect to the simulation accuracy and the computational complexity.

2. PHYSICAL MODELS

In this section, a transversal vibrating, dispersive, and lossy string is analyzed using the basic laws of physics.
From this of more complicated structures at a finite number of spatial analysis, a scalar PDE is derived in Section 2.1. Section 2.2 points [10]. A commercial product of the modal synthesis, defines the initial states of the vibration, as well as the fixings Modalys, is described, for example, in [11]. For a review of of the string at the nut and the bridge end, in terms of ini- modal synthesis and a comparison to the functional trans- tial and boundary conditions, respectively. In Section 2.3, the formation method (FTM), see also [12]. linear model is extended with a deflection-dependent force The commercially and academically most popular phys- simulating the nonlinear interaction between the string and ical modeling method of the last two decades was the digital the frets, well known as slap synthesis [22]. waveguide method (DWG) because of its computational ef- In all these models, the strings are assumed to be homo- ficiency. It was first introduced in [13] as a physically inter- geneous and isotropic. Furthermore, the smoothness of their preted extension of the Karplus-Strong algorithm [14]. Ex- surfaces may not permit stress concentrations. The deflec- tensions of the DWG are described, for example, in [15, 16, tions of the strings are assumed to be small enough to change Multirate Simulations of String Vibrations Using the FTM 951 neither the cross section area nor the tension on the string so string deflection y(x, t) by replacing v(x, t)withy˙(x, t)and that the string itself behaves linearly. ϕ(x, t) = y(x, t)from(3d) and with (3b)and(3c). Then (3) can be written in a general notation of scalar PDEs 2.1. Linear partial differential equation derived by basic laws of physics D y(x, t) +L y(x, t) +W y(x, t) (4a) The string under examination is characterized by its ma- = fe1(x, t), x ∈ [0, l], t ∈ [0, ∞), terial and geometrical parameters. 
The material parameters are given by the mass density ρ, the Young’s modulus E, the with laminar air flow damping coefficient d1, and the viscoelastic ffi D y(x, t) = ρAy¨(x, t)+d1 y˙(x, t), damping coe cient d3. The geometrical parameters consist of the length l, the cross section area A and the moment of L y(x, t) =−Ts y (x, t)+EIB y (x, t), (4b) inertia I.Furthermore,atensionT is applied to the string in s = =− axial direction. Considering only a string segment between W y(x, t) WD WL y(x, t) d3 y˙ (x, t). the spatial positions and + ∆ , the forces on this string xs xs x {} segment can be analyzed in detail. They consist of the restor- Asitcanbeseenin(4), the operator D contains only tem- {} ing force caused by the tension , the bending force poral derivatives, the operator L has only spatial deriva- fT Ts fB {} caused by the stiffness of the string, the laminar air flow force tives, and the operator W consists of mixed temporal and spatial derivatives. The PDE is valid only on the string be- fd1, the viscoelastic damping force fd3 (modeled here without tween x = 0andx = l and for all positive times. Equation memory), and the external excitation force fe. They result at x in (4) forms a continuous-time, continuous-space PDE. For a s unique solution, initial and boundary conditions must be f x , t = T sin ϕ x , t ≈ T ϕ x , t ,(1a)given as specified in the next section. T s s s s s f x , t =−EIb x , t ,(1b) B s s 2.2. Initial and boundary conditions f x , t = d ∆xv x , t ,(1c) d1 s 1 s Initial conditions define the initial state of the string at time fd3 xs, t = d3 sin ϕ˙ xs, t ≈ d3ϕ˙ xs, t ,(1d)t = 0. This definition is written in the general operator nota- tion with where ϕ(xs, t) is the slope angle of the string, b(xs, t) is the curvature of the string, v(x , t) is the velocity, and prime de- y(x,0) s fT y(x, t) = = 0, x ∈ [0, l], t = 0. (5) notes spatial derivative and dot denotes temporal derivative. 
i y˙(x,0) Note that in (1a) and in (1d) it is assumed that the amplitude of the string vibration is small so that the sine function can Since the scalar PDE (4) is of second order with respect to be approximated by its argument. Similar equations can be time, only two initial conditions are needed. They are chosen found for the forces at the other end of the string segment at arbitrarily by the initial deflection and the initial velocity of xs + ∆x. the string as seen in (5). For musical applications, it is a rea- All these forces are combined by the equation of motion sonable assumption that the initial states of the strings vanish to at time t = 0asgivenin(5). Note that this does not prevent the interaction between successively played notes since the ρA∆xv˙ x , t = f x , t + f x , t − f x + ∆x, t s y s d3 s y s time is not set to zero for each note. Thus, this kind of initial − fd3 xs + ∆x, t − fd1 xs, t + fe xs, t , condition is only used for, for example, the beginning of a (2) piece of music. In addition to the initial conditions, also the fixings of where fy = fT + fB. Setting ∆x → 0 and solving (2) for the the string at both ends must be defined in terms of bound- excitation force density fe1(xs, t) = fe(xs, t)δ(x − xs), four ary conditions. In most stringed instruments, the strings are coupled equations are obtained, that are valid not only at the nearly fixed at the nut end (x = x0 = 0) and transfer energy = = string segment xs ≤ x ≤ xs + ∆x but also at the whole string at the other end (x x1 l) via the bridge to the resonant 0 ≤ x ≤ l. δ(x) denotes the impulse function. body [2]. For some instruments (e.g., the piano) it is also a justified assumption, that the bridge fixing can be modeled = − − ˙ fe1(x, t) ρAv˙(x, t)+d1v(x, t) fy (x, t) d3b(x, t), (3a) to be ideally rigid [23]. Then the boundary conditions are given by fy(x, t) = Tsϕ(x, t) − EIb (x, t), (3b) = b x1, t ϕ (x, t), (3c) y x , t T = i = ∈ ∈ ∞ fbi y(x, t) 0, i 0, 1, t [0, ). 
(6) v x1, t = ϕ˙(x, t). (3d) y xi, t
An extended version of the derivation of (3)canbefound It can be seen from (6) that the string is assumed to be fixed, in [12]. The four coupled equations (3) can be simplified allowed to pivot at both ends, such that the deflection y and to one scalar PDE with only one output variable. All the the curvature b = y must vanish. These are boundary con- dependent variables in (3a) can be written in terms of the ditions of first kind. For simplicity, there is no energy fed 952 EURASIP Journal on Applied Signal Processing
[Figure 1 block diagram: the PDE with IC and BC is Laplace transformed (L{·}) into an ODE with BC; the Sturm-Liouville transformation (T{·}) yields an algebraic equation; reordering gives the MD TFM; discretization gives the discrete MD TFM; the inverse transformations T⁻¹{·} and z⁻¹{·} yield the discrete solution.]

Figure 1: Procedure of the FTM for solving initial-boundary-value problems defined in the form of PDEs, IC, and BC.
into the system via the boundary, resulting in homogeneous boundary conditions.

The PDE (4), in conjunction with the initial (5) and boundary conditions (6), forms the linear continuous-time, continuous-space initial-boundary-value problem to be solved and simulated.

2.3. Nonlinear extension to the linear model for slap synthesis

Nonlinearities are an important part of the sound production mechanisms of musical instruments [23]. One example is the nonlinear interaction of the string with the frets, well known from slap synthesis. This effect was modeled first for the DWG in [22] as a nonlinear amplitude limitation. For the FTM, the effect was already applied to vibrating strings in [24].

A simplified model for this interaction interprets the fret as a spring with a high stiffness coefficient Sfret, acting at one position xf as a force ff on the string at time instances where the string is in contact with the fret. Since this force depends on the string deflection, it is nonlinear, defined by

ff(xf, t, y, yf) = Sfret (yf(xf, t) − y(xf, t)) for y(xf, t) − yf(xf, t) > 0,
ff(xf, t, y, yf) = 0 for y(xf, t) − yf(xf, t) ≤ 0. (7)

The deflection of the fret from the string rest position is denoted by yf. The PDE (4) becomes nonlinear by adding the slap force ff to the excitation function fe1(x, t). Thus, a linear and a nonlinear system for the simulation of the vibrating string are derived. Both systems are solved in the next sections with the FTM.

3. CONTINUOUS SOLUTIONS USING THE FTM

To obtain a model that can be implemented on the computer, the continuous initial-boundary-value problem has to be discretized. Instead of using a direct discretization approach as described in Section 1, the continuous analytical solution is derived first and discretized subsequently. This procedure is well known from the simulation of one-dimensional systems like electrical networks. It has several advantages, including simulation accuracy and guaranteed stability.

The outline of the FTM is given in Figure 1. First, the PDE with initial conditions (IC) and boundary conditions (BC) is Laplace transformed (L{·}) with respect to time to derive a boundary-value problem (ODE, BC). Then a so-called Sturm-Liouville transformation (T{·}) is used for the spatial variable to obtain an algebraic equation. Solving for the output variable results in a multidimensional (MD) transfer function model (TFM). It is discretized, and by applying the inverse Sturm-Liouville transformation T⁻¹{·} and the inverse z-transformation z⁻¹{·}, the discretized solution in the time and space domain results.

The impulse-invariant transformation is used for the discretization shown in Figure 1. It is equivalent to the calculation of the continuous solution by inverse transformation into the continuous time and space domain, with subsequent sampling. The calculation of the continuous solution is presented in Sections 3.1 to 3.5; the discretization is shown in Sections 4 and 5.

For the nonlinear system, the transformations obviously cannot result in a TFM. Therefore, the procedure has to be modified slightly, resulting in an MD implicit equation, described in Section 3.6.

3.1. Laplace transformation

As known from linear electrical network theory, the Laplace transformation removes the temporal derivatives in linear and time-invariant (LTI) systems and includes, due to the differentiation theorem, the initial conditions as additive terms (see, e.g., [25]). Since first- and second-order time derivatives occur in (4) and the initial conditions (5) are homogeneous, the application of the Laplace transformation to the initial-boundary-value problem derived in Section 2 results in

dD(s) Y(x, s) + L{Y(x, s)} + wD(s) WL{Y(x, s)} = Fe1(x, s), x ∈ [0, l], (8a)
fbi{Y(x, s)} = 0, i = 0, 1. (8b)

The Laplace-transformed functions are written with capital letters, and the complex temporal frequency variable is denoted by s = σ + jω. It can be seen in (8a) that the temporal derivatives of (4a) are replaced with scalar multiplications of the functions

dD(s) = ρA s² + d1 s, wD(s) = −d3 s. (8c)

Thus, the initial-boundary-value problem (4), (5), and (6) is replaced with the boundary-value problem (8) after Laplace transformation.

3.2. Sturm-Liouville transformation

The transformation of the spatial variable should have the same properties as the Laplace transformation has for the time variable: it should remove the spatial derivatives, and it should include the boundary conditions as additive terms. Unfortunately, there is no unique transformation available for this task, due to the finite spatial definition range in contrast to the infinite time axis. That calls for a determination of the spatial transformation at hand, depending on the spatial differential operator and the boundary conditions. Since it leads to an eigenvalue problem first solved for simplified problems by Sturm and Liouville between 1836 and 1838, this transformation is called a Sturm-Liouville transformation (SLT) [26]. Mathematical details of the SLT applied to scalar PDEs can be found in [12].

The SLT is defined by

T{Y(x, s)} = Ȳ(µ, s) = ∫₀ˡ K(µ, x) Y(x, s) dx. (9)

Note that there is a finite integration range in (9), in contrast to the Laplace transformation. The transformation kernels K(µ, x) of the SLT are obtained as the set of eigenfunctions of the spatial operator LW = L + WL with respect to the boundary conditions (8b). The corresponding eigenvalues are denoted by β⁴µ(s), where βµ(s) is the discrete spatial frequency variable (see, e.g., [12] for details).

For the boundary-value problem defined in (8) with the operators given in (4b), the transformation kernels and the discrete spatial frequency variables result in

K(µ, x) = sin(µπx/l), µ ∈ N, (10a)
β⁴µ(s) = EI (µπ/l)⁴ + (Ts + d3 s)(µπ/l)². (10b)

Thus, the SLT can be interpreted as an extended Fourier series decomposition.

3.3. Multidimensional transfer function model

Applying the SLT (9) to the boundary-value problem (8) and solving for the transformed output variable Ȳ(µ, s) results in the MD TFM

Ȳ(µ, s) = 1/(dD(s) + β⁴µ(s)) · F̄e(µ, s). (11)

Hence, the transformed input forces F̄e(µ, s) are related via the MD transfer function given in (11) to the transformed output variable Ȳ(µ, s). The denominator of the MD TFM depends quadratically on the temporal frequency variable s and to the power of four on the spatial frequency variable βµ. This is based on the second-order temporal and fourth-order spatial derivatives occurring in the scalar PDE (4). Thus, the transfer function is a two-pole system with respect to time for each discrete spatial eigenvalue βµ.

3.4. Inverse transformations

As explained at the beginning of Section 3, the continuous solution in the time and space domain is now calculated by using inverse transformations.

Inverse SLT

The inverse SLT is defined by an infinite sum over all discrete eigenvalues βµ:

Y(x, s) = T⁻¹{Ȳ(µ, s)} = Σµ (1/Nµ) Ȳ(µ, s) K(µ, x). (12)

The inverse transformation kernel K(µ, x) and the inverse spatial frequency variable βµ are the same eigenfunctions and eigenvalues as for the forward transformation, due to the self-adjointness of the spatial operators L and WL (see [12] for details). Thus, the inverse SLT can be evaluated at each spatial position by evaluating the infinite sum. Since only quadratic terms of µ occur in the denominator, it is sufficient to sum over positive values of µ and to double the result to account for the negative values. The norm factor results in that case in Nµ = l/4.

Inverse Laplace transformation

It can be seen from (11) and (8c), (10b) that the transfer functions consist of two-pole systems with conjugate complex pole pairs for each discrete spatial eigenvalue βµ. Therefore, the inverse Laplace transformation results, for each spatial frequency variable, in a damped sinusoidal term, called a mode.

3.5. Continuous solution

After applying the inverse transformations to the MD TFM, the continuous solution results in

y(x, t) = (4/(ρAl)) Σµ=1…∞ (1/ωµ) [e^(σµt) sin(ωµt) ∗ f̄e(µ, t)] K(µ, x) δ−1(t). (13)

The step function, denoted by δ−1(t), is used since the solution is only valid for positive time instances; ∗ denotes temporal convolution. f̄e(µ, t) is the spatially transformed excitation force, derived by inserting fe1 into (9). The angular frequencies ωµ, as well as their corresponding damping coefficients σµ, can be calculated from the poles of the transfer function model (11). They directly depend on the physical parameters of the string and can be expressed by

ωµ = sqrt( (EI/(ρA) − (d3/(2ρA))²)(µπ/l)⁴ + (Ts/(ρA) − d1 d3/(2(ρA)²))(µπ/l)² − (d1/(2ρA))² ),
σµ = −d1/(2ρA) − (d3/(2ρA))(µπ/l)². (14)

Thus, an analytical continuous solution (13), (14) of the initial-boundary-value problem (4), (5), (6) is derived without temporal or spatial derivatives.

3.6. Implicit equation for slap synthesis

The PDE (4) becomes nonlinear by adding the solution-dependent slap force ff(xf, t, y, yf) in (7) to the right-hand side of the linear PDE. Obviously, the application of the Laplace transformation and the SLT to the nonlinear initial-boundary-value problem cannot lead to an MD TFM, since a TFM always requires linearity. However, assuming that the nonlinearity can be represented as a finite power series and that the nonlinearity does not contain spatial derivatives, both transformations can be applied to the system [12]. With (7), both premises are given, such that the slap force can also be transformed into the frequency domains. The Y(x, s)-dependency of F̄f can be expressed with (12) in terms of Ȳ(ν, s), to be consistently in the spatial frequency domain. Then an MD implicit equation is derived in the temporal and spatial frequency domain:

Ȳ(µ, s) = 1/(dD(s) + β⁴µ(s)) [F̄e(µ, s) + F̄f(µ, s, Ȳ(ν, s))]. (15)

Note that the different argument ν in the output dependence of F̄f(µ, s, Ȳ(ν, s)) denotes an interaction between all modes caused by the nonlinear slap force. Details can be found in [12].

Since the transfer functions in (11) and (15) are the same, the spatial transformation kernels and frequency variables also stay the same as in the linear case. Thus, the temporal poles of (15) are also the same as in the MD TFM (11), and the continuous solution results in the implicit equation

y(x, t) = (4/(ρAl)) Σµ=1…∞ (1/ωµ) [e^(σµt) sin(ωµt) ∗ (f̄e(µ, t) + f̄f(µ, t, ȳ(ν, t)))] K(µ, x) δ−1(t), (16)

with ωµ and σµ given in (14). It is shown in the next sections that this implicit equation is turned into explicit ones by applying different discretization schemes.

4. DISCRETIZATION AT AUDIO RATE

This section describes the discretization of the continuous solutions for the linear and the nonlinear cases. It is performed at audio rate, for example with the sampling frequency fs = 1/T = 44.1 kHz, where T denotes the sampling interval. The discrete realization is shown as it can be implemented on the computer. For the nonlinear slap synthesis, some extensions of the discrete realization are required and, furthermore, the stability of the entire system must be controlled.

4.1. Discretization of the linear MD model

The discrete realization of the MD TFM (11) consists of a three-step procedure performed below:

(1) discretization with respect to time,
(2) discretization with respect to space,
(3) inverse transformations.

Discretization with respect to time

Discretizing the time variable with t = kT, k ∈ N, and assuming an impulse-invariant system, an s-to-z mapping is applied to the MD TFM (11) with z = e^(sT). This procedure directly leads to an MD TFM with the discrete-time frequency variable z:

Ȳd(µ, z) = [(T/(ρAωµ)) z e^(σµT) sin(ωµT)] / [z² − 2z e^(σµT) cos(ωµT) + e^(2σµT)] · F̄d(µ, z). (17)

The superscript d denotes discretized variables. The angular frequency variables and the damping coefficients are given in (14). Pole-zero diagrams of the continuous and the discrete system are shown in [27].

Discretization with respect to space

For the spatial frequency domain, there is no need for discretization, since the spatial frequency variable µ is already discrete. However, a discretization has to be applied to the spatial variable x. This spatial discretization consists of simply evaluating the analytical solution (13) at a limited number of arbitrary spatial positions xa on the string. They can be chosen to be the pickup positions or the fret positions, respectively.

Inverse transformations

The inverse SLT cannot be performed any longer for an infinite number of µ, due to the temporal discretization. To avoid temporal aliasing, the number of modes must be limited to µT, such that |ωµT T| ≤ π, which also ensures realizable computer implementations. Effects of this truncation are described in [12]. The most important conclusion is that the sound quality is not affected, since only modes beyond the audible range are neglected.

By applying the shifting theorem, the inverse z-transformation results in µT second-order recursive systems in parallel, each one realizing one vibrational mode of the string. The structure is shown with solid lines in Figure 2. This linear structure can be implemented directly on the computer, since it only includes delay elements z⁻¹, adders, and multipliers. Due to (14), the coefficients of the second-order recursive systems in Figure 2 depend only on the physical parameters of the vibrating string.

4.2. Extensions for slap synthesis

The discretization procedure for the nonlinear slap synthesis can be performed with the same three steps described in Section 4.1. Here, the discretized MD TFM is extended with the output-dependent slap force F̄fd(µ, z, Ȳd(ν, z)) and thus stays implicit. However, after discretization with respect to space as described above, and inverse z-transformation with application of the shifting theorem, the resulting recursive systems are explicit. This is caused by the time shift of the excitation function due to the multiplication with z in the numerator of (17). Therefore, the linear system given with solid lines in Figure 2 is extended with feedback paths, denoted by dashed lines, from the output to additional inputs between
[Figure 2 diagram: the excitation force fe^d(k) and the slap force ff^d(k), the latter formed by the nonlinearity NL, are weighted by c1,e(µ) and c1,s(µ) and feed parallel second-order recursive systems with delay elements z⁻¹ and feedback coefficients 2e^(σµT) cos(ωµT) and −e^(2σµT); the outputs are weighted by K(µ, xa)/Nµ and summed to yd(xa, k).]

Figure 2: Basic structure of the FTM simulations derived from the linear initial-boundary-value problem (4), (5), and (6), with several second-order resonators in parallel. Solid lines represent the basic linear system, while dashed lines represent extensions for the nonlinear slap force.
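For illustration, the parallel resonator bank of Figure 2 can be sketched in a few lines of Python. This is only a sketch: all string parameters below are invented example values, not taken from the article, and the input is a unit impulse. The recursion coefficients follow the impulse-invariant form (17), and the modal parameters follow (14).

```python
import numpy as np

# Hypothetical string parameters (illustrative values only).
rho_A = 5.0e-4    # mass per unit length rho*A in kg/m
l = 0.65          # string length in m
Ts = 60.0         # tension in N
EI = 1.0e-4       # bending stiffness E*I in N m^2
d1 = 2.0e-4       # air-flow damping coefficient
d3 = 1.0e-6       # viscoelastic damping coefficient
fs = 44100.0
T = 1.0 / fs

def modal_params(mu):
    """sigma_mu and omega_mu of mode mu, following (14)."""
    k2 = (mu * np.pi / l) ** 2
    sigma = -d1 / (2 * rho_A) - d3 / (2 * rho_A) * k2
    omega2 = ((EI / rho_A - (d3 / (2 * rho_A)) ** 2) * k2 ** 2
              + (Ts / rho_A - d1 * d3 / (2 * rho_A ** 2)) * k2
              - (d1 / (2 * rho_A)) ** 2)
    return sigma, np.sqrt(omega2)

# Keep only the modes below the Nyquist frequency, |omega*T| <= pi.
modes = []
for mu in range(1, 200):
    sigma, omega = modal_params(mu)
    if omega * T > np.pi:
        break
    modes.append((mu, sigma, omega))

# Impulse excitation at x_e, output picked up at x_a.
x_e, x_a, n_samp = 0.15, 0.55, 2000
y = np.zeros(n_samp)
for mu, sigma, omega in modes:
    a1 = 2 * np.exp(sigma * T) * np.cos(omega * T)   # feedback coeffs of (17)
    a2 = -np.exp(2 * sigma * T)
    b1 = T / (rho_A * omega) * np.exp(sigma * T) * np.sin(omega * T)
    w_in = np.sin(mu * np.pi * x_e / l)       # SLT weight of the input position
    w_out = np.sin(mu * np.pi * x_a / l) * 4 / l   # inverse SLT, N_mu = l/4
    s1 = s2 = 0.0
    for k in range(n_samp):
        x = w_in if k == 0 else 0.0           # unit impulse input
        s0 = a1 * s1 + a2 * s2 + b1 * x       # one second-order recursion
        y[k] += w_out * s0
        s2, s1 = s1, s0
```

The one-sample input delay implied by the z in the numerator of (17) is absorbed here as a constant output shift.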
[Figure 3 diagram: one second-order recursive system with inputs c1,e(µT) fe^d(k) and c1,s(µT) ff^d(k), internal states ȳ1^d(µT, k) and ȳ2^d(µT, k), and feedback coefficients 2e^(σµT T) cos(ωµT T) and −e^(2σµT T); ȳ1,s^d(µT, k) denotes the first state after application of the slap force.]

Figure 3: Recursive system realization of one mode of the transversally vibrating string.
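The state-based amplitude monitoring behind the passivity check can be verified numerically. The following sketch (illustrative mode parameters; the constant factors of the energy expression (19) are omitted) runs one damped second-order recursion and evaluates the quadratic form of its two internal states: the result decays exactly by e^(2σT) per step, that is, it tracks the squared modal amplitude.

```python
import numpy as np

fs = 44100.0
T = 1.0 / fs
sigma, omega = -40.0, 2 * np.pi * 440.0    # illustrative mode parameters

a = np.exp(sigma * T)
a1, a2 = 2 * a * np.cos(omega * T), -a * a

# Run the recursion of one mode (impulse input at k = 0).
n = 500
s = np.zeros(n)
s[0] = 1.0
for k in range(1, n):
    s[k] = a1 * s[k - 1] + (a2 * s[k - 2] if k >= 2 else 0.0)

def sq_amplitude(y1, y2):
    """Quadratic form of the two internal states (constants of (19) omitted)."""
    num = y1 * y1 - 2 * y1 * y2 * a * np.cos(omega * T) + y2 * y2 * a * a
    return num / (a * a * np.sin(omega * T) ** 2)

E = np.array([sq_amplitude(s[k], s[k - 1]) for k in range(1, n)])
# Each step multiplies the squared amplitude by e^(2 sigma T):
ratios = E[1:] / E[:-1]
```

Because the quantity is available from the two delayed states already present in Figure 3, monitoring it adds almost no cost, which is what the force limitation of Section 4.3 exploits.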
the unit delays of all recursive systems. The feedback paths are weighted with the nonlinear (NL) function (7).

4.3. Guaranteeing stability

The discretized LTI systems derived in Section 4.1 are inherently stable as long as the underlying continuous physical model is stable, due to the use of the impulse-invariant transformation [25]. However, for the nonlinear system derived in Section 4.2, this stability consideration is not valid any more. It might happen that the passive slap force of the continuous system becomes active with the direct discretization approach [24]. To preserve the passivity of the system, and thus the inherent stability, the slap force must be limited such that the discrete impulses correspond to their continuous counterparts.

The instantaneous energy of the string vibration can be calculated by monitoring the internal states of the modal deflections [12]. The slap force limitation can then be obtained directly from the available internal states. For an illustration of these internal states, the recursive system of one mode µT is given in Figure 3.

The variables c1,e(µT) and c1,s(µT), denoting the weightings of the linear excitation force fe^d(k) at xe and of the slap force ff^d(k) at xf, respectively, result with (9), (10a), and (17) in

c1,(e,s)(µT) = (2T/(ρAωµT)) sin(ωµT T) sin(µT π x(e,s)/l). (18)

The total instantaneous energy of the string vibration without slap force density can be calculated with [12, 28] (time step k and mode number µT dependencies are omitted for concise notation)

Evibr(k) = ΣµT (4ρA/l)(σµT² + ωµT²) × [ȳ1^d² − 2 ȳ1^d ȳ2^d e^(σµT T) cos(ωµT T) + ȳ2^d² e^(2σµT T)] / [e^(2σµT T) sin²(ωµT T)]. (19)

In (19), the instantaneous energy is calculated without application of the slap force, since the internal states ȳ1^d(µT, k) are used (see Figure 3). For calculating the instantaneous energy Es(k) after applying the slap force, ȳ1^d(µT, k) must be replaced with ȳ1,s^d(µT, k) in (19). To meet the condition of passivity of the elastic slap collision, both energies must be related by Evibr(k) ≥ Es(k). Here, only the worst-case scenario with regard to the instability problem is discussed, where both energies are the same. By inserting the corresponding expressions of (19) into this energy equality and solving for the slap force, ff^d(k) results in

ff^d(k) = ΣµT c5(µT) [2 e^(σµT T) cos(ωµT T) ȳ2^d(µT, k) − 2 ȳ1^d(µT, k)], (20a)

with

c5(µT) = [c1,s(µT)(σµT² + ωµT²) / (e^(2σµT T) sin²(ωµT T))] / [ΣκT c1,s²(κT)(σκT² + ωκT²) / (e^(2σκT T) sin²(ωκT T))]. (20b)

The force limitation discussed here can be implemented very efficiently. Only one additional multiplication, one summation, and one binary shift are needed for each vibrational mode (see (20a)), since the more complicated constants c5(µT) have to be calculated only once, and the weighting of ȳ2^d(µT, k) has to be performed within the recursive system anyway (compare Figure 3).

Discrete realizations of the analytical solutions of the MD initial-boundary-value problems have been derived in this section. For the linear and nonlinear systems, they resulted in stable and accurate simulations of the transversally vibrating string. The drawback of these straightforward discretization approaches of the MD systems in the frequency domains is the high computational complexity of the resulting realizations. Assuming a typical nylon guitar string with 247 Hz pitch frequency, 59 eigenmodes have to be calculated up to the Nyquist frequency at 22.050 kHz. With an average of 3.1 and 4.2 multiplications per output sample (MPOS) per recursive system for the linear and the nonlinear systems, respectively, the total computational cost for the whole string results in 183 MPOS and 248 MPOS. Note that the fractions of the average MPOS result from the assumption that there are only few time instances where an excitation force acts on the string, such that the input weightings of the recursive systems do not have to be calculated at each sample step. Since this is also assumed for the nonlinear slap force, the fractional part in the nonlinear system is higher than in the linear system.

These computational costs are approximately five times higher than those of the most efficient physical modeling method, the DWG [21]. The next section shows that this disadvantage of the FTM can be fixed by using a multirate approach for the simulation of the recursive systems.

5. DISCRETIZATION WITH A MULTIRATE APPROACH

The basic idea of using a multirate approach for the FTM realization is that the single modes have a very limited bandwidth as long as the damping coefficients σµ are small. By subdividing the temporal spectrum into different bands that are processed independently of each other, the modes within these bands can be calculated with a sampling rate that is a fraction of the audio rate. Thus, the computational complexity can be reduced with this method. The sidebands generated by this procedure at audio rate are suppressed with a synthesis filter bank when all bands are added up to the output signal. The input signals of the subsampled modes also have to be subsampled. To avoid aliasing, the respective input signals for the modes are obtained by processing the excitation signal fe^d(k) through an analysis filter bank. This general procedure is shown with solid lines in Figure 4. It shows several modes (RS # i), each one running at its respective downsampled rate.

This filter bank approach is discussed in detail in the next two sections for the linear as well as for the nonlinear model of the FTM.

5.1. Discretization of the linear MD model

For the realization of the structure shown in Figure 4, two major tasks have to be fulfilled [29]:

(1) designing an analysis and a synthesis filter bank that can be realized efficiently,
(2) developing an algorithm that can simulate band changes of single sinusoids, to keep the flexibility of the FTM.

Filter bank design

There are numerous design procedures for filter banks that are mainly specialized to perfect or nearly perfect reconstruction requirements [30]. In the structure shown in Figure 4, there is no need for perfect reconstruction as in sound-processing applications, since the sound production mechanism is performed within the single downsampled frequency bands. Therefore, inaccuracies of the interpolation filters can be corrected by additional weightings of the subsampled recursive systems. Linear-phase filters with finite impulse responses (FIR) are used for the filter bank, due to the variability of the single sinusoids over time. Furthermore, a real-valued generation of the sinusoids in the form of second-order recursive systems, as shown in Figure 2, is preferred to complex-valued first-order recursive systems. This approach avoids, on the one hand, additional real-valued multiplications of complex numbers. On the other hand, the nonlinear slap implementation can be performed in a similar way for the multirate approach, as explained for the audio-rate realization in Section 4.2. A multirate realization of the FTM with complex-valued first-order systems is described in [31].

To fulfill these prerequisites and the requirement of low-order filters for computational efficiency, with necessarily flat filter edges, a filter bank with different downsampling factors for different bands has to be designed. A first step is to design a basic filter bank with PED equidistant filters, all using the same downsampling factor rED = PED. Due to the flat filter edges, there will be PED − 1 frequency gaps between the single filters that have neither a sufficient passband amplification nor a sufficient stopband attenuation. These gaps are
[Figure 4 block diagram: an analysis filter bank (downsamplers ↓4 and ↓6) feeding recursive systems RS #1 to RS #7 in the single bands, followed by a synthesis filter bank (upsamplers ↑4 and ↑6); dashed and dotted paths carry the nonlinear slap force f_NL^d(rk) and the signals y^d(xa, rk), y^d(xa, k), and fe^d(k).]
Figure 4: Structure of the multirate FTM. Solid lines represent the basic linear system, while dashed and dotted lines represent the extensions for the nonlinear slap force. RS means recursive system. The arrow between RS # 3 and RS # 4 indicates a band change.
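The per-band signal chain of Figure 4 can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the mode frequency, damping value, downsampling factor, and windowed-sinc interpolation filter are all assumed here for demonstration.

```python
import numpy as np

def mode_in_band(excitation, w, sigma, T, r, interp_fir):
    """One band of a multirate modal structure (illustrative sketch):
    downsample the excitation, run the second-order recursive system
    of one mode at the reduced rate r*T, then upsample and interpolate."""
    Tr = r * T                                    # reduced-rate sampling interval
    x = excitation[::r]                           # analysis side (ideal decimation)
    a1 = 2.0 * np.exp(-sigma * Tr) * np.cos(w * Tr)
    a2 = -np.exp(-2.0 * sigma * Tr)               # recursion coefficients at rate 1/Tr
    y = np.zeros_like(x, dtype=float)
    y1 = y2 = 0.0
    for k in range(len(x)):                       # second-order recursive system
        y[k] = a1 * y1 + a2 * y2 + x[k]
        y1, y2 = y[k], y1
    up = np.zeros(len(y) * r)
    up[::r] = y                                   # synthesis side: zero stuffing ...
    return np.convolve(up, interp_fir)[:len(up)]  # ... and FIR interpolation

fs = 44100.0
r = 4                                             # assumed downsampling factor
n = np.arange(-16, 17)
interp = r * np.sinc(n / r) * np.hamming(33)      # simple windowed-sinc interpolator
exc = np.zeros(4096)
exc[0] = 1.0                                      # impulsive excitation
out = mode_in_band(exc, 2 * np.pi * 500.0, sigma=50.0, T=1.0 / fs, r=r, interp_fir=interp)
```

The mode is generated entirely at one quarter of the audio rate; only the zero stuffing and the short FIR interpolation run at full rate, which is the source of the computational savings discussed in the text.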
filled with low-order FIR filters that realize the interpolation for downsampling factors different from rED. The combination of all filters forms the filter bank. It is used for the analysis and the synthesis filter bank as shown in Figure 4.

An example of this procedure is shown in Figure 5 with PED = 4. The total number of bands is P = 7. The frequency regions where the single filters are used as passbands in the filter bank are separated by vertical dashed lines. The filters are designed by a weighted least-squares method such that they meet the desired passband bandwidths and stopband attenuations. Note that there are several frequency regions for each filter where the frequency response is not specified explicitly. These so-called "don't care bands" occur since only a part of the Nyquist bandwidth in the downsampled domain is used for the simulation of the modes. Thus, there can only be images of these sinusoids in the upsampled version in distinct regions. All other parts of the spectrum are "don't care bands"; for the lowpass filter they are shown as gray areas in Figure 5. Magnitude ripples of ±3 dB are allowed in the passband, which can be compensated by a correction of the weighting factors of the single sinusoids. The stopbands are attenuated by at least −60 dB, which is sufficient for most hearing conditions. Merely in studio-like hearing conditions, larger stopband attenuations must be used such that artifacts produced by using the filter bank cannot be heard.

Due to the different specifications of the filters, concerning bandwidths and edge steepnesses, they have different orders and thus different group delays. To compensate for the different group delays, delay lines of length (Mmax − Mp)/2 are used in conjunction with the filters. The number of coefficients of the interpolation filters is denoted by Mp, where Mmax is the maximum order of all filters. The delay lines consume some memory space but no additional computational cost [32].

[Figure 5 plots: magnitude responses (dB) versus ωµT/π; the passbands are labeled with downsampling factors 4, 4, 4, 4 (top) and 6, 5, 6 (center).]
Figure 5: Top: frequency responses of the equidistant filters (with downsampling factor four in this example). Center: frequency responses of the filters with other downsampling factors. Bottom: frequency response of the filter bank. The downsampling factors r are given within the corresponding passbands. The FIR filter orders are between Mmin = 34 and Mmax = 72 in this example. They realize a stopband attenuation of at least −60 dB and allow passband ripples of ±3 dB.

958 EURASIP Journal on Applied Signal Processing
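A weighted least-squares FIR design with a "don't care" transition region, as described above, can be reproduced for instance with `scipy.signal.firls`. The band edges, filter length, and weights below are illustrative assumptions, not the paper's actual specifications:

```python
import numpy as np
from scipy import signal

# Weighted least-squares lowpass with an unspecified ("don't care")
# transition region: the gap between 0.20 and 0.30 is simply left out
# of the band specification. Edges, order, and weights are assumed.
numtaps = 73                        # linear-phase FIR, odd length
bands   = [0.0, 0.20, 0.30, 1.0]    # passband 0-0.2, stopband 0.3-1 (1 = Nyquist)
desired = [1.0, 1.0, 0.0, 0.0]
weight  = [1.0, 100.0]              # emphasize stopband attenuation
h = signal.firls(numtaps, bands, desired, weight=weight)

wgrid, H = signal.freqz(h, worN=4096)
mag_db = 20 * np.log10(np.maximum(np.abs(H), 1e-12))
f = wgrid / np.pi                   # normalized frequency, 1 = Nyquist
stop_att = mag_db[f >= 0.32].max()  # worst-case stopband level (dB)
pass_rip = np.abs(mag_db[f <= 0.19]).max()
```

Weighting the stopband heavily trades some passband ripple for attenuation; as noted in the text, passband ripple can be compensated by reweighting the single sinusoids, so this trade-off is acceptable here.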
Realizing the filter bank in a polyphase structure, each filter bank results in a computational cost of

C_filterbank = Σ_{p=1}^{P} (M_p / r_p) MPOS,   (21)

with the downsampling factors r_p of each band. For the example given above, each filter bank needs 73 MPOS. In (21) it is assumed that each band contains at least one mode to be reproduced, so that it is a worst-case scenario. As long as the excitation signal is known in advance, the excitations for each band can be precalculated such that only the synthesis filter bank must be implemented in real time. The case that the excitation signals are known and stored as wavetables in advance is quite frequent in physical modeling algorithms, although the pure physicality of the model is lost by this approach. For example, for string simulations, typical plucking or striking situations can be described by appropriate excitation signals which are determined in advance.

The practical realization of the multirate approach starts with the calculation of the modal frequencies ωµT and their corresponding damping coefficients σµT. The frequency denotes in which band the mode is synthesized. The coefficients of the recursive systems, as shown in Figure 2 for the audio rate realization, have to be modified in the downsampled domain since the sampling interval T is replaced by

T^(r) = rT^(1) = rT.   (22)

Superscript (r) denotes the downsampled simulation with factor r. The downsampling factors of the different bands r_p are given in the top and center plots of Figure 5. No further adjustments have to be performed for the coefficients of the recursive systems in the multirate approach, since modes can be realized in the downsampled baseband or each of the corresponding images.

Band changes of single modes

One advantage of the FTM is that the physical parameters of a vibrating object can be varied while playing. This is not only valid for successively played notes but also within one note, as it occurs, for example, in vibrato playing. As far as one or several modes are at the edges of the filter bank bands, these variations can cause the modes to change bands while they are active. This is shown with an arrow in Figure 4. In such a case, the reproduction cannot be performed by just adjusting the coefficients of the recursive systems with (22) to the new downsampling rate and using the other interpolation filter. This procedure would result in strong transients and in a modification of the modal amplitudes and phases. Therefore, a three-step procedure has to be applied to the band changing modes:

(1) adjusting the internal states of the recursive systems such that no phase shift and no amplitude difference occurs in the upsampled output signal from this mode,
(2) canceling the filter output of the band changing mode,
(3) training of the new interpolation filter to avoid transient behavior.

Similar to the calculation of the instantaneous energy for slap synthesis, also the instantaneous amplitude and phase can be calculated from the internal states of a second-order recursive system, ȳ1 and ȳ2. They can be calculated for the old band with downsampling factor r1, as well as for the new band with factor r2. Demanding the equality of both amplitudes and phases, the internal states of the new band are calculated from the internal states of the old band to

ȳ1^(r2) = ȳ1^(r1) sin(ωµ r2 T) / sin(ωµ r1 T)
        + ȳ2^(r1) e^(σµ r1 T) ( sin(ωµ r2 T) / tan(ωµ r1 T) − cos(ωµ r2 T) ),   (23)
ȳ2^(r2) = ȳ2^(r1) e^(−σµ (r1 − r2) T).

The second item of the three-step procedure means that the output of the synthesis interpolation filter must not contain those modes that are leaving that band at time instance k_ch T for time steps kT ≥ k_ch T. Since the filter bank is a causal system of length M_p T, the information of the band change must either be given in advance at (k_ch − M_p)T, or a turbo filtering procedure has to be applied. In the turbo filtering, the calculations of several sample steps are performed within one sampling interval at the cost of a higher peak computational complexity. In this case, the turbo filtering must calculate the previous outputs of the modes leaving the band and subtract their contribution from the interpolated output for time instances kT ≥ k_ch T. Due to the higher peak computational complexity of the turbo filtering and the low orders of the interpolation filters, the additional delay of M_p T is preferred here.

In the same way as the band changing mode must not have an effect on the leaving band from k_ch T on, it must also be included in the interpolation filter of the new band from this time instance on. In other words, the new interpolation filter must be trained to correctly produce the desired mode without transients, as addressed in the third item of the three-step procedure above. This can also be performed with the turbo processing procedure at a higher computational cost, or with the delay of M_p T between the information of the band change and its effect in the output signal.

Now, the linear solution (13) of the transversal vibrating string derived with the FTM is realized also with a multirate approach. Since the single modes are produced at a lower rate than the audio rate, this procedure saves computational cost in comparison to the direct discretization procedure derived in Section 4.1. The amount of computational savings with this procedure is discussed in more detail in Section 6.

5.2. Extensions for slap synthesis

In the discretization approach described in Section 4.2 the output y^d(xa, k) is fed back to the recursive systems via the path of the external force fe^d(k) (compare Figure 2).
Using the same path in the multirate system shown in Figure 4 would result in a long delay within the feedback path due to the delays in the interpolation filters of the analysis and the synthesis filter bank. Furthermore, the analysis filter bank should not be realized in real time as long as the excitation signal is known in advance.

Fortunately, the recursive systems calculate directly the instantaneous deflection of the single modes, but in the downsampled domain. Considering a system where only modes are simulated in the baseband, the signal can be fed back in between the down- and upsampling boxes in Figure 4 and thus directly in the downsampled domain. In comparison to the full-rate system, the observation of the penetration of the string into the fret might be delayed by up to (r_p − 1)T seconds. This delay results in a different slap force, but by applying the stabilization procedure described in Section 4.3, stability is guaranteed.

However, in realistic simulations there are also modes in the higher frequency bands, not just in the baseband. This modifies the simulations described above in two ways:

(i) the deflection of the string and thus the penetration into the fret depends on the modes of all bands,
(ii) there is an interaction due to the nonlinear slap force between all modes in all bands.

The calculation of the instantaneous string deflection in the downsampled rates is rarely possible, since there are various downsampling rates as shown in Figure 4. Thus, there are only a few time instances k_all T where the modal deflections are updated in all bands at the same time. Since in almost all bands one sample value of the recursive systems represents more than half the period of the mode, it is not reasonable to use the previously calculated sample values for the calculation of the deflection at time instances kT = k_all T. However, all the equidistant bands of the filter bank as shown on top of Figure 5 have the same downsampling factor and can thus represent the same time instances for the calculation of the deflection. Furthermore, most of the energy of guitar string vibrations is in the lower modes [28], such that the deflection is mostly defined by the modes simulated in the lowest bands. Therefore, the string deflection is determined here at each r1th audio sample from all equidistant bands, and at each ((k mod r1 = 0) ∧ (k mod r2 = 0))th audio sample from all equidistant bands and the bands with the downsampling rate of the lowest band-pass. This is shown in the right dashed and dotted paths in Figure 4. In the example of Figure 5, at each fourth audio sample the deflection is calculated from the four equidistant bands, and at each twelfth audio sample it is calculated also from the second and sixth bands.

In the same way as the string deflection is calculated with varying participation of the different bands, the slap force is also only applied to modes in these bands, as shown in the left dashed and dotted paths in Figure 4. This procedure has two effects: firstly, there is no interaction between all modes at all (downsampled) time instances from the slap force. Secondly, the slap force itself, being an impulse-like signal with a bright spectrum, is filtered by the filter bank. The first effect is not that important since the procedure ensures interactions between most modes but only restricts them to few time instances, in the example above every fourth or twelfth audio sample. These low delays of the interaction are not noticeable. The second effect can be handled by adding impulses directly to the interpolation filters of the synthesis filter bank. The weights of the impulses in each band are determined by the difference between the sum of all slap force impulses in all bands and the applied slap force impulses in that band. In that way, a slap force only applied to baseband modes produces a nearly white noise slap signal at audio rate.

The stabilization procedure described in Section 4.3 can also be applied to the multirate realization of the nonlinear slap force. The only differences to the audio rate simulations are that T is replaced by r_p T as given in (22), and that the summation for the calculation of the stable slap force f_f^d(k) as given in (20a) is only performed over the modes realized in the participating bands. Thus, there are time instances where the slap force is only applied to the modes in the equidistant bands, and time instances where it is applied also to bands with another downsampling factor. This is shown with the dotted lines in Figure 4. Due to the different cases of participating bands, two versions of the constants c5(µT) also have to be calculated, since the products and sums in (20b) depend only on the participating modes.

Now, a stable and realistic simulation of the nonlinear slap force is also obtained in the multirate realization. In the nonlinear case, the simulation accuracy obviously decreases with higher downsampling factors and thus with an increasing number of bands. This effect is discussed in more detail in the next section.

6. SIMULATION ACCURACY AND COMPUTATIONAL COMPLEXITY

In the previous sections, stable, linear and nonlinear, discrete FTM models have been derived. In the next sections, the simulation accuracies of these models and their corresponding computational complexities are discussed.

6.1. Simulation accuracies

For the linearly vibrating string, the discrete realization of the single modes at full rate is an exactly sampled version of the continuous modes. This is true as long as the input force can be modeled with discrete impulses, since the impulse-invariant transformation is used as explained in Section 4.1. However, the exactness of the complete system is lost with the truncation of the summation of partials in (12) to avoid aliasing effects. Therefore, the results are only accurate as long as the excitation signal has only low energy in the truncated high frequency range. This is true for the guitar and most other musical instruments [28] and, furthermore, the neglected higher partials cannot be perceived by the human auditory system as long as the sampling interval T is chosen small enough. Since the audible modes are simulated exactly and the simulation error is out of the audible range, the FTM is used here as an optimized discretization approach for sound synthesis applications.
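The exactness of the impulse-invariant mode realization, both at the audio rate T and at a downsampled rate rT as in (22), can be checked numerically. The frequency, damping, and downsampling factor below are assumed example values, not taken from the paper:

```python
import numpy as np

def damped_sine_recursive(w, sigma, T, n):
    """Impulse-invariant second-order recursion producing
    e^{-sigma t} sin(w t) exactly at the sample instants kT."""
    a1 = 2.0 * np.exp(-sigma * T) * np.cos(w * T)
    a2 = -np.exp(-2.0 * sigma * T)
    y = np.zeros(n)
    y[1] = np.exp(-sigma * T) * np.sin(w * T)   # impulse-invariant input scaling
    for k in range(2, n):
        y[k] = a1 * y[k - 1] + a2 * y[k - 2]
    return y

fs = 44100.0
T = 1.0 / fs
w, sigma = 2 * np.pi * 440.0, 8.0     # assumed modal frequency and damping
r = 4                                 # downsampling factor, cf. (22)

y_full = damped_sine_recursive(w, sigma, T, 4096)       # audio-rate simulation
y_down = damped_sine_recursive(w, sigma, r * T, 1024)   # same mode at rate 1/(rT)

t = np.arange(4096) * T
exact = np.exp(-sigma * t) * np.sin(w * t)              # continuous solution, sampled
err_full = np.max(np.abs(y_full - exact))
err_down = np.max(np.abs(y_down - exact[::r]))          # agrees at the rT instants
```

Both errors are at numerical precision, illustrating that replacing T by rT in the recursion coefficients, as stated in (22), loses nothing for a single mode; the multirate inaccuracies discussed next stem from the filter bank, not from the recursions themselves.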
In multirate simulations of linear systems as described in Section 5.1, the single modes are produced exactly within the downsampled domain. But due to the imperfectness of the analysis filter bank, modes are not only excited by the correct frequency components of the excitation force, but also by aliasing terms that occur with downsampling. In the same way, the images produced by upsampling the outputs of the recursive systems are not suppressed perfectly by the synthesis filter bank. However, the filter banks have been designed such that the stopband suppressions are at least −60 dB. This is sufficient for most listening conditions as defined in Section 5.1. Furthermore, the filters are designed in a least-mean-squares sense such that the energy of the side lobes in the stopbands is minimized. Further filter bank optimizations with respect to the human auditory system are difficult since the filter banks are designed only once for all kinds of mode configurations concerning their positions and amplitude relations in the simulated spectrum.

In the audio rate string model excited nonlinearly with the slap force as described in Section 4.2, the truncation of the infinite sum in (16) also affects the accuracy of the lower modes through the nonlinearity. The simulations are accurate only as long as the external excitation and the nonlinearity have low contributions to the higher modes. Although the external excitation contributes rarely to the higher modes, there is an interaction between all modes due to the slap force. This interaction grows with the modal frequencies. It can be directly seen in the coefficients c5(µT) in (20b), since they have larger absolute values for higher frequencies. However, the force contributions of the omitted modes are distributed to the simulated modes since the denominator of (20b) decreases for less simulated partials. Furthermore, the sign of c5(µT) changes with µT due to (18), as does the expression in parentheses of (20a) with time. Thus, there is a bidirectional interaction between low and high modes and not only an energy shift from low to high frequencies. Neglecting modes out of the audible range results in less energy fluctuations of the audible modes. But since the neglected energy fluctuations have high frequencies, they are also out of the audible range.

In the multirate implementation of the nonlinear model as described in Section 5.2, the interactions between almost all modes are retained. It is more critical here that the observation of the fret-string penetration might be delayed by several audio samples. This not only circumvents the strict limitation of the string deflection by the fret, but it also changes the modal interactions because the nonlinear system is not time-invariant. However, the audible slap effect stays similar to the full-rate simulations and sounds realistic. Audio examples can be found at http://www.LNT.de/∼traut/JASP04/sounds.html.

It has been shown that the FTM realizes the continuous solutions of the physical models of the vibrating string accurately. With the multirate approach, the FTM loses the exactness of the linear audio rate model, but the inaccuracies cannot be heard. For the nonlinear model, the multirate approach leads to audible differences compared to the audio rate simulations, but the characteristics of the slap sounds are preserved. Thus, simplifications and computational savings due to the filter bank approach are performed here with respect to the human auditory system.

6.2. Computational complexities

The computational complexities of the FTM are explained with two typical examples, a single bass guitar string simulated in different qualities and a six-string acoustic guitar. The first example simulates the vibration of one bass guitar string with a fundamental frequency of 41 Hz. The corresponding physical parameters can be found, for example, in [12]. This string is simulated in different sound qualities by varying the number of simulated modes from 1 to 117, which corresponds to the simulation of all modes up to the Nyquist frequency with a sampling frequency of fs = 44.1 kHz.

Figure 6 shows the dependency of the computational complexities on the number of simulated modes and thus on the simulation accuracy or sound quality. The procedure used here to enhance the sound quality consists of simulating more and more modes in consecutive order from the lowest mode on. Thus, the enhancement of the sound quality sounds like opening the lowpass in subtractive synthesis. The upper plot shows the computational complexities for the linear system, simulated at audio rate and with the multirate approach using filter banks with P = 7 and P = 15. The bottom plot shows the corresponding graphs for the nonlinear systems. It is assumed that the external forces only act on the string at one tenth of the output samples such that the weighting of the inputs does not have to be performed at each time instance. Thus, each linear recursive system needs 3.1 MPOS for the calculation of one output sample, whereas the nonlinear system needs 4.2 MPOS.

It can be seen that the multirate implementations are much more efficient than the audio-rate simulations, except for simulations with very few modes. With all 117 simulated modes, the relation between audio rate and multirate simulations (P = 7) is 363 MPOS to 157 MPOS for the linear system and 492 MPOS to 187 MPOS for the nonlinear system. This is a reduction of the computational complexity of more than 60%.

The steps in the multirate graphs denote the offset of the filter bank realization and the fact that the interpolations of the filter bank bands are only calculated as long as there is at least one mode simulated in those bands. On the one hand, the regions between the steps are steeper in the filter bank with P = 7 than in that with P = 15, due to the higher downsampling factors in filter banks with more bands. On the other hand, the steps are higher for filter banks with more bands due to the higher interpolation filter orders. In this example, the multirate approach with P = 7 is superior to the filter bank with P = 15 for high qualities, since there are only a few modes simulated in the higher bands of P = 15, but the filter bank offset is higher. For other configurations with a higher number of simulated modes, this situation is different, as shown in the next example.
[Figure 6 plots: computational complexity (MPOS, 0–200) versus number of modes (0–120), panels (a) and (b).]
Figure 6: Computational complexities of the FTM simulations dependent on the number of simulated modes at audio rate (dotted line), and with multirate approaches with P = 7 (dashed line) and P = 15 (solid line). (a): Linearly vibrating string, (b): vibrating string with nonlinear slap forces.
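The qualitative behavior of Figure 6, a fixed filter-bank offset per active band traded against per-mode savings, can be mimicked with a toy cost model in the spirit of (21). The filter orders, downsampling factors, and band capacities below are invented for illustration and do not reproduce the paper's exact curves:

```python
# Toy cost model in the spirit of (21) and Figure 6 (band data are
# illustrative assumptions, not the paper's actual design values).
MPOS_LIN = 3.1                      # per-mode cost of the linear system (from the text)

# (filter order M_p, downsampling factor r_p, mode capacity) per band -- assumed
bands = [(72, 4, 30), (60, 4, 30), (48, 6, 30), (40, 6, 30)]

def cost_audio_rate(n_modes):
    """All modes simulated at full audio rate."""
    return MPOS_LIN * n_modes

def cost_multirate(n_modes):
    """Modes filled into bands from the lowest on; each active band
    adds its interpolation-filter offset M_p / r_p, cf. (21)."""
    cost, remaining = 0.0, n_modes
    for M, r, cap in bands:
        if remaining <= 0:
            break
        used = min(cap, remaining)
        cost += M / r                  # filter-bank offset of an active band
        cost += used * MPOS_LIN / r    # modes run at the reduced rate
        remaining -= used
    return cost
```

With few modes the offset dominates and the audio-rate simulation is cheaper; with many modes the per-mode savings take over, reproducing the crossover and the step structure visible in Figure 6.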
The second example shows the computational complexities of the simultaneous simulation of six independent strings as they occur in an acoustic guitar. Obviously, only one interpolation filter bank is needed for all strings. The average number of simulated modes for each guitar string is assumed to be 60. In contrast to the first example, it is assumed that the modes are equally distributed in the frequency domain, such that at least one mode is simulated in each band.

Figure 7 shows that the computational complexities depend on the choice of the used filter bank. On the one hand, each filter bank needs a fixed amount of computational cost which grows with the number of used bands. On the other hand, filter banks with more bands provide higher downsampling factors for the production of the sinusoids, which saves computational cost. Thus, the choice of the optimal filter bank depends on the number of simultaneously simulated modes. For practical implementations this has to be estimated in advance.

It can be seen that for the linear case (solid line) the minimum computational cost is 272 MPOS using the filter bank with P = 11. In the nonlinear case, the filter bank with P = 15 has the minimum computational cost with 319 MPOS for the simulation of all six strings. Compared to the audio-rate simulations with 1116 MPOS and 1512 MPOS for the linear and nonlinear case, respectively, the multirate simulations allow computational savings of up to 79%. Thus, the multirate simulations have a computational complexity of approximately 45 MPOS (53 MPOS) for each linearly (nonlinearly) simulated string.

[Figure 7 plot: computational complexity (MPOS, approximately 300–400) versus number of bands P = 7, 11, 15, 19.]
Figure 7: Computational complexities of the FTM simulations of a six-string guitar dependent on the number of bands for the multirate approach. Solid line: linearly vibrating string. Dashed line: vibrating string with nonlinear slap forces.

Compared to high quality DWG simulations, the computational complexities of the multirate FTM approach are nearly the same.
Linear DWG simulations need up to 40 MPOS for the realization of the reflection filters [21], and the nonlinear limitation of the string by the fret additionally needs 3 MPOS per fret position [22].
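The per-string figures quoted above follow directly from the quoted totals and can be checked with a few lines of arithmetic:

```python
# Check of the six-string guitar cost figures quoted in the text.
strings = 6
lin_multirate, nl_multirate = 272.0, 319.0   # MPOS at P = 11 and P = 15
lin_audio, nl_audio = 1116.0, 1512.0         # MPOS at audio rate

per_string_lin = lin_multirate / strings     # about 45 MPOS per string
per_string_nl = nl_multirate / strings       # about 53 MPOS per string
saving_lin = 1.0 - lin_multirate / lin_audio # about 76% savings
saving_nl = 1.0 - nl_multirate / nl_audio    # about 79% savings ("up to 79%")
```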
7. CONCLUSIONS

The complete procedure of the FTM has been described, from the basic physical analysis of a vibrating structure resulting in an initial boundary value problem, via its analytical solution, to efficient digital multirate implementations. The transversal vibrating dispersive and lossy string with a nonlinear slap force served as an example. The novel contribution is a thorough investigation of the implementation and the properties of a multirate realization.

It has been shown that the differences between audio-rate and multirate simulations for linearly vibrating string simulations are not audible. The differences of the nonlinear simulations were audible, but the multirate approach preserves the sound characteristics of the slap sound. The application of the multirate approach saves almost 80% of the computational cost at audio rate. Thus, it is nearly as efficient as the most popular physical modeling method, the DWG.

The multirate FTM is by far not limited to the example of vibrating strings. It can be applied in a similar way to spatially multidimensional systems, like membranes or plates, or even to other physical problems like heat flow or diffusion.

ACKNOWLEDGMENTS

The authors would like to thank Vesa Välimäki for numerous discussions and his help in the filter bank design for the multirate FTM. Furthermore, the financial support of the Deutsche Forschungsgemeinschaft (DFG) for this research is gratefully acknowledged.

REFERENCES

[1] C. Roads, S. Pope, A. Piccialli, and G. De Poli, Eds., Musical Signal Processing, Swets & Zeitlinger, Lisse, The Netherlands, 1997.
[2] L. Hiller and P. Ruiz, "Synthesizing musical sounds by solving the wave equation for vibrating objects: Part I," Journal of the Audio Engineering Society, vol. 19, no. 6, pp. 462–470, 1971.
[3] A. Chaigne and V. Doutaut, "Numerical simulations of xylophones. I. Time-domain modeling of the vibrating bars," Journal of the Acoustical Society of America, vol. 101, no. 1, pp. 539–557, 1997.
[4] A. Chaigne, "On the use of finite differences for musical synthesis. Application to plucked stringed instruments," Journal d'Acoustique, vol. 5, no. 2, pp. 181–211, 1992.
[5] A. Chaigne and A. Askenfelt, "Numerical simulations of piano strings. I. A physical model for a struck string using finite difference methods," Journal of the Acoustical Society of America, vol. 95, no. 2, pp. 1112–1118, 1994.
[6] M. Karjalainen, "1-D digital waveguide modeling for improved sound synthesis," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, vol. 2, pp. 1869–1872, IEEE Signal Processing Society, Orlando, Fla, USA, May 2002.
[7] C. Erkut and M. Karjalainen, "Finite difference method vs. digital waveguide method in string instrument modeling and synthesis," in Proc. International Symposium on Musical Acoustics, Mexico City, Mexico, December 2002.
[8] C. Cadoz, A. Luciani, and J. Florens, "Responsive input devices and sound synthesis by simulation of instrumental mechanisms: the CORDIS system," Computer Music Journal, vol. 8, no. 3, pp. 60–73, 1984.
[9] J. M. Adrien, "Dynamic modeling of vibrating structures for sound synthesis, modal synthesis," in Proc. AES 7th International Conference, pp. 291–299, Audio Engineering Society, Toronto, Canada, May 1989.
[10] G. De Poli, A. Piccialli, and C. Roads, Eds., Representations of Musical Signals, MIT Press, Cambridge, Mass, USA, 1991.
[11] G. Eckel, F. Iovino, and R. Caussé, "Sound synthesis by physical modelling with Modalys," in Proc. International Symposium on Musical Acoustics, pp. 479–482, Le Normant, Dourdan, France, July 1995.
[12] L. Trautmann and R. Rabenstein, Digital Sound Synthesis by Physical Modeling Using the Functional Transformation Method, Kluwer Academic Publishers, New York, NY, USA, 2003.
[13] D. A. Jaffe and J. O. Smith, "Extensions of the Karplus-Strong plucked-string algorithm," Computer Music Journal, vol. 7, no. 2, pp. 56–69, 1983.
[14] K. Karplus and A. Strong, "Digital synthesis of plucked-string and drum timbres," Computer Music Journal, vol. 7, no. 2, pp. 43–55, 1983.
[15] J. O. Smith, "Physical modeling using digital waveguides," Computer Music Journal, vol. 16, no. 4, pp. 74–91, 1992.
[16] J. O. Smith, "Efficient synthesis of stringed musical instruments," in Proc. International Computer Music Conference, pp. 64–71, Tokyo, Japan, September 1993.
[17] M. Karjalainen, V. Välimäki, and Z. Jánosy, "Towards high-quality sound synthesis of the guitar and string instruments," in Proc. International Computer Music Conference, pp. 56–63, Tokyo, Japan, September 1993.
[18] M. Karjalainen, V. Välimäki, and T. Tolonen, "Plucked-string models, from the Karplus-Strong algorithm to digital waveguides and beyond," Computer Music Journal, vol. 22, no. 3, pp. 17–32, 1998.
[19] R. Rabenstein, "Discrete simulation of dynamical boundary value problems," in Proc. EUROSIM Simulation Congress, pp. 177–182, Vienna, Austria, September 1995.
[20] L. Trautmann and R. Rabenstein, "Digital sound synthesis based on transfer function models," in Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, pp. 83–86, IEEE Signal Processing Society, New Paltz, NY, USA, October 1999.
[21] L. Trautmann, B. Bank, V. Välimäki, and R. Rabenstein, "Combining digital waveguide and functional transformation methods for physical modeling of musical instruments," in Proc. Audio Engineering Society 22nd International Conference on Virtual, Synthetic and Entertainment Audio, pp. 307–316, Espoo, Finland, June 2002.
[22] E. Rank and G. Kubin, "A waveguide model for slapbass synthesis," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, pp. 443–446, IEEE Signal Processing Society, Munich, Germany, April 1997.
[23] M. Kahrs and K. Brandenburg, Eds., Applications of Digital Signal Processing to Audio and Acoustics, Kluwer Academic Publishers, Boston, Mass, USA, 1998.
[24] L. Trautmann and R. Rabenstein, "Stable systems for nonlinear discrete sound synthesis with the functional transformation method," in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, vol. 2, pp. 1861–1864, IEEE Signal Processing Society, Orlando, Fla, USA, May 2002.
[25] B. Girod, R. Rabenstein, and A. Stenger, Signals and Systems, John Wiley & Sons, Chichester, West Sussex, UK, 2001.
[26] R. V. Churchill, Operational Mathematics, McGraw-Hill, New York, NY, USA, 3rd edition, 1972.
[27] R. Rabenstein and L. Trautmann, "Digital sound synthesis of string instruments with the functional transformation method," Signal Processing, vol. 83, no. 8, pp. 1673–1688, 2003.
[28] N. H. Fletcher and T. D. Rossing, The Physics of Musical Instruments, Springer-Verlag, New York, NY, USA, 1998.
[29] L. Trautmann and V. Välimäki, "A multirate approach to physical modeling synthesis using the functional transformation method," in Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, pp. 221–224, IEEE Signal Processing Society, New Paltz, NY, USA, October 2003.
[30] P. P. Vaidyanathan, Multirate Systems and Filter Banks, Prentice Hall, Englewood Cliffs, NJ, USA, 1993.
[31] S. Petrausch and R. Rabenstein, "Sound synthesis by physical modeling using the functional transformation method: Efficient implementation with polyphase filterbanks," in Proc. International Conference on Digital Audio Effects, London, UK, September 2003.
[32] B. Bank, "Accurate and efficient method for modeling beating and two-stage decay in string instrument synthesis," in Proc. MOSART Workshop on Current Research Directions in Computer Music, pp. 134–137, Barcelona, Spain, November 2001.
L. Trautmann received his "Diplom-Ingenieur" and "Doktor-Ingenieur" degrees in electrical engineering from the University of Erlangen-Nuremberg, in 1998 and 2002, respectively. In 2003 he was working as a Postdoc in the Laboratory of Acoustics and Audio Signal Processing at the Helsinki University of Technology, Finland. His research interests are in the simulation of multidimensional systems with a focus on digital sound synthesis using physical models. Since 1999, he has published more than 25 scientific papers, book chapters, and books. He is a holder of several patents on digital sound synthesis.
R. Rabenstein received his "Diplom-Ingenieur" and "Doktor-Ingenieur" degrees in electrical engineering from the University of Erlangen-Nuremberg, in 1981 and 1991, respectively, as well as the "Habilitation" in signal processing in 1996. He worked with the Telecommunications Laboratory of this university from 1981 to 1987 and since 1991. From 1987 to 1991, he was with the Physics Department of the University of Siegen, Germany. His research interests are in the fields of multidimensional systems theory and simulation, multimedia signal processing, and computer music. He serves in the IEEE TC on Signal Processing Education. He is a Board Member of the School of Engineering of the Virtual University of Bavaria and has participated in several national and international research cooperations.

EURASIP Journal on Applied Signal Processing 2004:7, 964–977
© 2004 Hindawi Publishing Corporation
Physically Inspired Models for the Synthesis of Stiff Strings with Dispersive Waveguides
I. Testa
Dipartimento di Scienze Fisiche, Università di Napoli "Federico II," Complesso Universitario di Monte S. Angelo, 80126 Napoli, Italy
Email: [email protected]

G. Evangelista
Dipartimento di Scienze Fisiche, Università di Napoli "Federico II," Complesso Universitario di Monte S. Angelo, 80126 Napoli, Italy
Email: [email protected]

S. Cavaliere
Dipartimento di Scienze Fisiche, Università di Napoli "Federico II," Complesso Universitario di Monte S. Angelo, 80126 Napoli, Italy
Email: [email protected]
Received 30 June 2003; Revised 17 November 2003
We review the derivation and design of digital waveguides from physical models of stiff systems, useful for the synthesis of sounds from strings, rods, and similar objects. A transform method approach is proposed to solve the classic fourth-order equations of stiff systems in order to reduce them to two second-order equations. By introducing scattering boundary matrices, the eigenfrequencies are determined and their $n^2$ dependency is discussed for the clamped, hinged, and intermediate cases. On the basis of the frequency-domain physical model, the numerical discretization is carried out, showing how the insertion of an all-pass delay line generalizes the Karplus-Strong algorithm for the synthesis of ideally flexible vibrating strings. Knowing the physical parameters, the synthesis can proceed using the generalized structure. Another point of view is offered by Laguerre expansions and frequency warping, which are introduced in order to show that a stiff system can be treated as a nonstiff one, provided that the solutions are warped. A method to compute the all-pass chain coefficients and the optimum warping curves from sound samples is discussed. Once the optimum warping characteristic is found, the length of the dispersive delay line to be employed in the simulation is simply determined from the requirement of matching the desired fundamental frequency. The regularization of the dispersion curves by means of optimum unwarping is experimentally evaluated.

Keywords and phrases: physical models, dispersive waveguides, frequency warping.
1. INTRODUCTION

Interest in digital audio synthesis techniques has been reinforced by the possibility of transmitting signals to a wider audience within the structured audio paradigm, in which algorithms and restricted sets of data are exchanged [1]. Among these techniques, the physically inspired models play a privileged role since the data are directly related to physical quantities and can be easily and intuitively manipulated in order to obtain realistic sounds in a flexible framework. Applications are, amongst others, the simulation of a "physical situation" producing a class of sounds as, for example, a closing door, a car crash, the hiss made by a crawling creature, human-computer interaction and, of course, the simulation of musical instruments.

In the general physical models technique, continuous-time solutions of the equations describing the physical system are sought. However, due to the complexity of the real physical systems, from the classic design of musical instruments to the molecular structure of extended objects, solutions of these equations cannot generally be found in an analytic way and one should resort to numerical methods or approximations. In many cases, the resulting approximation scheme only closely resembles the exact model. For this reason, one could better define these methods as physically inspired models, as first proposed in [2], where the mathematical equations or solutions of the physical problem serve as a solid base to inspire the actual synthesis scheme. One of the advantages of using physically inspired models for sound synthesis is that they allow us to perform a "selection" of the physical parameters actually influencing the sound so that a trade-off between completeness and particular goals can be achieved.
In the following, we will focus on stiff vibrating systems, including rods and stiff strings as encountered in pianos. However, extensions to two- or three-dimensional systems can be carried out with little effort.

Vibrating physical systems have been extensively studied over the last thirty years for their key role in many musical instruments. The wave equation can be directly approximated by means of finite difference equations [3, 4, 5, 6, 7], or by discretization of the wave functions as proposed by Jaffe and Smith [8, 9], who reinterpreted and generalized the Karplus-Strong algorithm [10] in a wave propagation setting. The outcome of the approximation of the time-domain solution of the wave equation is the design of a digital waveguide simulating the string itself: the sound signal simulation is achieved by means of an appropriate excitation signal, such as white noise. However, in order to achieve a more realistic and flexible synthesis, the interaction of the excitation system with the vibrating element is, in turn, physically modeled. Digital waveguide methods for the simulation of physical models have been widely used [11, 12, 13, 14, 15, 16]. One of the reasons for their success is that they are appropriate for real-time synthesis [17, 18, 19, 20]. This result allowed us to change our approach to modeling musical instruments based on vibrating strings: waveguides can be designed for modeling the "core" of the instruments, that is, the vibrating string, but they are also suitable for the integration of interaction models, for example, for the excitation due to a hammer [21] or to a bow [9], the radiation of sound due to the body of the instrument [22, 23, 24, 25], and different side-effects in plucked strings [26]. It must be pointed out that, the interactions being highly nonlinear, their modeling and the determination of the range of stability is not an easy task.

In this paper, we will review the design of a digital waveguide simulating a vibrating stiff system, focusing on stiff strings and treating bars as a limit case where the tension is negligible. The purpose is to derive a general framework inspiring the determination of a discrete numerical model. A frequency-domain approach has been privileged, which allows us to separate the fourth-order differential equation of stiff systems into two second-order equations, as shown in Section 2. This approach is also useful for the simulation of two-dimensional (2D) systems such as thin plates. By enforcing proper boundary conditions, we obtain the eigenfrequencies and the eigenfunctions of the system as found, for the case of strings, in the classic works by Fletcher [27, 28]. Once the exact solutions are completely characterized, their numerical approximation is discussed [29, 30] together with their justification based on physical reasoning. The discretization of the continuous-time domain solutions is carried out in Section 3, which naturally leads to dispersive waveguides based on a long chain of all-pass filters. From a different point of view, the derived structure can be described in terms of Laguerre expansions and frequency warping [31]. In this framework, a stiff system can be shown to be equivalent to a nonstiff (Karplus-Strong like) system, whose solutions are frequency warped, provided that the initial and the possibly moving boundary conditions are properly unwarped [32, 33]. As a side effect, this property can be exploited in order to perform an analysis of piano sounds by means of pitch-synchronous frequency-warped wavelets, in which the excitation can be separated from the resonant sound components [34].

The models presented in this paper provide at least two entry points for the synthesis. If the physical parameters and boundary conditions are completely known, or if it is desired to specify them to model arbitrary strings or rods, then the eigenfunctions, hence the dispersion curve, can be determined. The problem is then reconducted to that of finding the best approximation of the continuous-time dispersion curve with the phase response of a suitable all-pass chain using the methods illustrated in Section 3. Another entry point is offered if sound samples of an instrument are available. In this case, the parameters of the synthesis model can be determined by finding the warping curve that best fits the data given by the frequencies of the partials, together with the length of the dispersive delay line. This is achieved by means of a regularization method of the experimental dispersion data, as reported in Section 4.

The physical entry point is to be preferred in those situations where sound samples are not available, for example, when we are modeling a nonexisting instrument by extension of the physical model, such as a piano with unusual speaking length. The other entry level is best for approximating real instrument sounds. However, in this case, the synthesis is limited to existing sources, although some variations can be obtained in terms of the warping parameters, which are related to, but do not directly represent, physical factors.

2. PHYSICAL STIFF SYSTEMS

In this section, we present a brief overview of the stiff string and rod equations of motion and of their solution. The purpose is twofold. On the one hand, these equations give the necessary background to the physical modeling of stiff strings. On the other hand, we show that their frequency-domain solution ultimately provides the link between continuous-time and discrete-time models, useful for the derivation of the digital waveguide and suitable for their simulation. This link naturally leads to Laguerre expansions for the solution and to frequency warping equivalences. Furthermore, enforcing proper boundary conditions determines the eigenfrequencies and eigenfunctions of the system, useful for fitting experimentally measured resonant modes to the ones obtained by simulation. This fit allows us to determine the parameters of the waveguide through optimization.

2.1. Stiff string and bar equation

The equation of motion for the stiff string can be determined by studying the equilibrium of a thin plate [35, 36]. One obtains the following 4th-order differential equation for the deformation of the string y(x, t):

$$ -\varepsilon \frac{\partial^4 y(x,t)}{\partial x^4} + \frac{\partial^2 y(x,t)}{\partial x^2} = \frac{1}{c^2} \frac{\partial^2 y(x,t)}{\partial t^2}, \qquad \varepsilon = \frac{EI}{T}, \qquad c = \sqrt{\frac{T}{\rho S}}, \quad (1) $$
featuring the Young modulus of the material E, the inertia moment I with respect to the transversal axis of the cross-section of the string (for a circular section of radius r, $I = \pi r^4/4$ as in [36]), the tension of the string T, and the mass per unit length ρS. Note that for ε → 0, (1) becomes the well-known equation of the vibrating string [35]. Otherwise, if the applied tension T is negligible, we obtain

$$ -\varepsilon \frac{\partial^4 y(x,t)}{\partial x^4} = \frac{\partial^2 y(x,t)}{\partial t^2}, \qquad \varepsilon = \frac{EI}{\rho S}, \quad (2) $$

which is the equation for the transversal vibrations of rods. Solutions of (1) and (2) are best found in terms of the Fourier transform of y(x, t) with respect to time:

$$ Y(x,\omega) = \int_{-\infty}^{+\infty} y(x,t) \exp(-i\omega t)\, dt, \quad (3) $$

where ω is the angular frequency related to the frequency f by the relationship ω = 2πf.

By taking the Fourier transform of both members of (1) and (2), we obtain

$$ \varepsilon \frac{\partial^4 Y(x,\omega)}{\partial x^4} - \frac{\partial^2 Y(x,\omega)}{\partial x^2} - \frac{\omega^2}{c^2} Y(x,\omega) = 0 \quad (4) $$

for the stiff string and

$$ \varepsilon \frac{\partial^4 Y(x,\omega)}{\partial x^4} - \omega^2 Y(x,\omega) = 0 \quad (5) $$

for the rod.

The second-order ∂²/∂x² spatial differential operator is defined as a repeated application of the L² space extension of the −i(∂/∂x) operator [37]. To the purpose, we seek solutions whose spatial and frequency dependency can be factored, according to the separation of variables method, as follows:

$$ Y(x,\omega) = W(\omega) X(x). \quad (6) $$

Substituting (6) in (4) and (5) results in the elimination of the W(ω) term, obtaining ordinary differential equations, whose characteristic equations, respectively, are

$$ \varepsilon \lambda^4 - \lambda^2 - \frac{\omega^2}{c^2} = 0 \quad \text{(stiff string)}, \qquad \varepsilon \lambda^4 - \omega^2 = 0 \quad \text{(rod)}. \quad (7) $$

The elementary solutions for the spatial part X(x) have the form X(x) = C exp(λx). It is important to note that in both cases, the characteristic equations have the following form:

$$ \left(\lambda^2 - \xi_1^2\right)\left(\lambda^2 - \xi_2^2\right) = 0, \quad (8) $$

where ξ₁ and ξ₂ are, in general, complex numbers that depend on ω. Equation (8) allows us to factor both equations in (4) and (5) as follows:

$$ \left(\frac{\partial^2}{\partial x^2} - \xi_1^2\right) \cdot \left(\frac{\partial^2}{\partial x^2} - \xi_2^2\right) Y(x,\omega) = 0. \quad (9) $$

The operator −∂²/∂x² is selfadjoint with respect to the L² scalar product [37]. Therefore, (9) can be separated into the following two independent equations:

$$ \left(\frac{\partial^2}{\partial x^2} - \xi_1^2\right) Y_1(x,\omega) = 0, \qquad \left(\frac{\partial^2}{\partial x^2} - \xi_2^2\right) Y_2(x,\omega) = 0, \quad (10) $$

where

$$ Y(x,\omega) = Y_1(x,\omega) + Y_2(x,\omega). \quad (11) $$

As we will see, (10) justifies the use, with proper modifications, of a second-order generalized waveguide based on progressive and regressive waves for the numerical simulation of stiff systems.

2.2. General solution of the stiff string and bar equations

In this section, we will provide the general solution of (8). The particular eigenfunctions and eigenfrequencies of rods and stiff strings are determined by proper boundary conditions and are treated in Section 2.3. From (7), it can be shown that

$$ \xi_1^{\pm} = \pm\sqrt{-\frac{\sqrt{1 + 4\omega^2\varepsilon/c^2} - 1}{2\varepsilon}}, \qquad \xi_2^{\pm} = \pm\sqrt{\frac{\sqrt{1 + 4\omega^2\varepsilon/c^2} + 1}{2\varepsilon}} \quad \text{(stiff string)}, $$
$$ \xi_1^{\pm} = \pm\sqrt{-\frac{\omega}{\sqrt{\varepsilon}}}, \qquad \xi_2^{\pm} = \pm\sqrt{\frac{\omega}{\sqrt{\varepsilon}}} \quad \text{(rod)}. \quad (12) $$

Note that in both cases, the eigenvalues $\xi_1^{\pm}$ are complex numbers, while $\xi_2^{\pm}$ are real numbers. It is also worth noting that

$$ \xi_1^2 + \xi_2^2 = \frac{1}{\varepsilon} \quad \text{(stiff string)}, \qquad \xi_1^2 + \xi_2^2 = 0 \quad \text{(rod)}, \quad (13) $$

where ξ₁ corresponds to the positive choice of the sign in front of the square root in (12) and $\xi_2 = |\xi_2^{\pm}|$. As expected, if we let T → 0, then both sets of eigenvalues of the stiff string tend to those found for the rod. Using the equations in (12), we then have for both strings and rods

$$ Y_1(x,\omega) = c_1^{+} \exp\left(\xi_1 x\right) + c_1^{-} \exp\left(-\xi_1 x\right), \qquad Y_2(x,\omega) = c_2^{+} \exp\left(\xi_2 x\right) + c_2^{-} \exp\left(-\xi_2 x\right), \quad (14) $$

where $c_1^{\pm}$, $c_2^{\pm}$ are, in general, functions of ω. Note that Y₁(x, ω) is an oscillating term, while, since ξ₂ is real, Y₂(x, ω) is nonoscillating. For finite-length strings, both positive and negative real exponentials are to be retained.
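As a quick numerical illustration of (12) and (13), the following sketch (ours, with arbitrary parameter values, not part of the paper) evaluates the root magnitudes and checks them against the characteristic equations (7). Since $\xi_1 = i|\xi_1|$, the identity (13) reads $\xi_2^2 - |\xi_1|^2 = 1/\varepsilon$ for the stiff string and $\xi_2^2 - |\xi_1|^2 = 0$ for the rod:

```python
import math

def stiff_string_roots(omega, eps, c):
    """Root magnitudes from eq. (12) for the stiff string:
    xi1 is the magnitude of the imaginary pair (oscillating solutions),
    xi2 the real pair (evanescent solutions)."""
    s = math.sqrt(1.0 + 4.0 * omega ** 2 * eps / c ** 2)
    xi1 = math.sqrt((s - 1.0) / (2.0 * eps))   # |xi_1|, since xi_1 = i*|xi_1|
    xi2 = math.sqrt((s + 1.0) / (2.0 * eps))
    return xi1, xi2

def rod_roots(omega, eps):
    """Root magnitudes from eq. (12) for the rod: |xi_1| = xi_2."""
    xi = math.sqrt(omega / math.sqrt(eps))
    return xi, xi
```

Substituting $\lambda = \xi_2$ and $\lambda = i|\xi_1|$ back into the first characteristic equation in (7) gives zero in both cases, which is what the checks below verify.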
From (12), we see that the primary effect of stiffness is the dependency on frequency of the argument (from now on, phase) of the solutions of (4) and (5). Therefore, the propagation of the wave from one section of the string located at x to the adjacent section located at x + ∆x is obtained by multiplication of a frequency-dependent factor exp(ξ₁∆x). Consequently, the group velocity u, defined as $u \equiv (d\xi_1/d\omega)^{-1}$, also depends on frequency. This results in a dispersion of the wave packet, characterized by the function ξ₁(ω), whose modulus is plotted in Figure 1 for the case of a brass string using the following values of the physical parameters r, T, ρ, and E:

$$ r = 1\ \text{mm}, \qquad T = 9 \cdot 10^7\ \text{dyne}, \qquad \rho = 8.44\ \text{g cm}^{-3}, \qquad E = 9 \cdot 10^{11}\ \text{dyne cm}^{-2}. \quad (15) $$

Figure 1: Plot of the phase module of the stiff model equation solution for ε = π/4 cm² and c ≈ 2 · 10⁴ cm s⁻¹.

Clearly, the previous example is a very crude approximation of a physical piano string (e.g., real-life piano strings in the low register are built out of more than one material and a copper or brass wire is wrapped around a steel core). For the sake of completeness, we give the explicit expression of |u| in both the cases we are studying. We have

$$ |u| = \frac{2c\sqrt{c^2 + 4\omega^2\varepsilon}}{\sqrt{2c^2 \pm 2c\sqrt{c^2 + 4\omega^2\varepsilon}}} \quad \text{(stiff string)}, \qquad |u| = 2\sqrt{\omega\sqrt{\varepsilon}} \quad \text{(rod)}. \quad (16) $$

If T → 0, the two group velocities are equal. Moreover, if in the first line in (16), we let ε → 0, then u → c, which is the limit case of the ideally flexible vibrating string. These facts further justify the use of a dispersive waveguide in the numerical simulation. With respect to this point, a remark is in order: the dispersion introduced by stiffness can be treated as a limiting "nonphysical" consequence of the Euler-Bernoulli beam equation:

$$ \frac{d^2}{dx^2}\left( EI \frac{d^2 y}{dx^2} \right) = p, \quad (17) $$

where p is the distributed load acting on the beam. It is "nonphysical" in the sense that u → ∞ as $\sqrt{\omega}$. However, in the discrete-time domain, this "nonphysical" situation is avoided if we suppose that all the signals are bandlimited.

2.3. Complete characterization of stiff string and rod solution

Boundary conditions for real piano strings lie in between the conditions of clamped extrema:

$$ Y\left(-\frac{L}{2}, \omega\right) = Y\left(\frac{L}{2}, \omega\right) = 0, \qquad \left.\frac{\partial Y(x,\omega)}{\partial x}\right|_{-L/2} = \left.\frac{\partial Y(x,\omega)}{\partial x}\right|_{L/2} = 0, \quad (18) $$

and of hinged extrema [5, 16, 31, 35, 36]:

$$ Y\left(-\frac{L}{2}, \omega\right) = Y\left(\frac{L}{2}, \omega\right) = 0, \qquad \left.\frac{\partial^2 Y(x,\omega)}{\partial x^2}\right|_{-L/2} = \left.\frac{\partial^2 Y(x,\omega)}{\partial x^2}\right|_{L/2} = 0. \quad (19) $$

Before determining the conditions for the eigenfrequencies of the considered stiff systems, we find a more compact way of writing (18) and (19). Starting from the factorized form of the stiff systems equation (see (10)), and using the symbols introduced in Section 2.2, we have

$$ Y_1(x,\omega) = \psi_1^{+}(x,\omega) + \psi_1^{-}(x,\omega), \qquad Y_2(x,\omega) = \psi_2^{+}(x,\omega) + \psi_2^{-}(x,\omega), \quad (20) $$

where we let

$$ \psi_1^{\pm}(x,\omega) = c_1^{\pm} \exp\left(\xi_1^{\pm} x\right), \qquad \psi_2^{\pm}(x,\omega) = c_2^{\pm} \exp\left(\xi_2^{\pm} x\right). \quad (21) $$

Conditions (18) can then be rewritten as follows:

$$ Y_1\left(-\frac{L}{2},\omega\right) = -Y_2\left(-\frac{L}{2},\omega\right), \qquad Y_1\left(\frac{L}{2},\omega\right) = -Y_2\left(\frac{L}{2},\omega\right), $$
$$ \left.\frac{\partial Y_1(x,\omega)}{\partial x}\right|_{-L/2} = -\left.\frac{\partial Y_2(x,\omega)}{\partial x}\right|_{-L/2}, \qquad \left.\frac{\partial Y_1(x,\omega)}{\partial x}\right|_{L/2} = -\left.\frac{\partial Y_2(x,\omega)}{\partial x}\right|_{L/2}. \quad (22) $$

At the terminations of the string or of the rod, we have

$$ \psi_1^{+} + \psi_1^{-} = -\left(\psi_2^{+} + \psi_2^{-}\right), \qquad \xi_1^{+}\psi_1^{+} + \xi_1^{-}\psi_1^{-} = -\left(\xi_2^{+}\psi_2^{+} + \xi_2^{-}\psi_2^{-}\right), \quad (23) $$

which can be rewritten in matrix form:

$$ \begin{pmatrix} 1 & 1 \\ \xi_1^{+} & \xi_2^{+} \end{pmatrix} \begin{pmatrix} \psi_1^{+} \\ \psi_2^{+} \end{pmatrix} = - \begin{pmatrix} 1 & 1 \\ \xi_1^{-} & \xi_2^{-} \end{pmatrix} \begin{pmatrix} \psi_1^{-} \\ \psi_2^{-} \end{pmatrix}. \quad (24) $$

By left-multiplying both members of (24) by the inverse of the $\begin{pmatrix} 1 & 1 \\ \xi_1^{+} & \xi_2^{+} \end{pmatrix}$ matrix, we have

$$ \begin{pmatrix} \psi_1^{+} \\ \psi_2^{+} \end{pmatrix} = S_c \begin{pmatrix} \psi_1^{-} \\ \psi_2^{-} \end{pmatrix}, \quad (25) $$

where we let

$$ S_c \equiv \frac{1}{\xi_2^{+} - \xi_1^{+}} \begin{pmatrix} -\left(\xi_2^{+} + \xi_1^{+}\right) & -2\xi_2^{+} \\ 2\xi_1^{+} & \xi_2^{+} + \xi_1^{+} \end{pmatrix}. \quad (26) $$

The matrix $S_c$ relates the incident wave with the reflected wave at the boundaries. Independently of the roots $\xi_i$, it has the following properties:

$$ \left| S_c \right| = -1, \qquad S_c^2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \quad (27) $$

In the case of a hinged stiff system (see (19)) at both ends, we have

$$ \psi_1^{+} + \psi_1^{-} = -\left(\psi_2^{+} + \psi_2^{-}\right), \qquad \left(\xi_1^{+}\right)^2\psi_1^{+} + \left(\xi_1^{-}\right)^2\psi_1^{-} = -\left(\left(\xi_2^{+}\right)^2\psi_2^{+} + \left(\xi_2^{-}\right)^2\psi_2^{-}\right), \quad (28) $$

which, in matrix form, becomes

$$ \begin{pmatrix} 1 & 1 \\ \left(\xi_1^{+}\right)^2 & \left(\xi_2^{+}\right)^2 \end{pmatrix} \begin{pmatrix} \psi_1^{+} \\ \psi_2^{+} \end{pmatrix} = - \begin{pmatrix} 1 & 1 \\ \left(\xi_1^{-}\right)^2 & \left(\xi_2^{-}\right)^2 \end{pmatrix} \begin{pmatrix} \psi_1^{-} \\ \psi_2^{-} \end{pmatrix}. \quad (29) $$

By taking the inverse of the matrix $\begin{pmatrix} 1 & 1 \\ (\xi_1^{+})^2 & (\xi_2^{+})^2 \end{pmatrix}$, we obtain

$$ \begin{pmatrix} \psi_1^{+} \\ \psi_2^{+} \end{pmatrix} = S_h \begin{pmatrix} \psi_1^{-} \\ \psi_2^{-} \end{pmatrix}, \quad (30) $$

where

$$ S_h = - \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \quad (31) $$

The $S_h$ matrix for the hinged stiff system is independent of the roots $\xi_i$. The matrices $S_h$ and $S_c$ are related in the following way:

$$ \left| S_h \right| = -\left| S_c \right|, \qquad S_h^2 = S_c^2. \quad (32) $$

In conclusion, the boundary conditions for stiff systems can be expressed in terms of matrices that can be used in the numerical simulation of stiff systems. Moreover, since the real-life boundary conditions for stiff strings in the piano lie in between the conditions given in (18) and (19), we can combine the two matrices $S_c$ and $S_h$ in order to enforce more general conditions, as illustrated in Section 3. In the following, we will solve (4) and (5) applying separately these sets of boundary conditions.

2.3.1. The clamped stiff string and rod

In order to characterize the eigenfunctions in the case of conditions (18), in (12) we let

$$ \xi_1 = i\bar{\xi}_1 \quad (33) $$

for both the stiff string and the rod solution. By definition, $\bar{\xi}_1$ is a real number. Moreover, for the rod, we have $\bar{\xi}_1 = \xi_2$. With this position, it can be shown that conditions (18) for the stiff string lead to the equations [35, 38]

$$ \begin{pmatrix} \tan\left(\bar{\xi}_1 \frac{L}{2}\right) & \tanh\left(\xi_2 \frac{L}{2}\right) \\ \tanh\left(\xi_2 \frac{L}{2}\right) & -\tan\left(\bar{\xi}_1 \frac{L}{2}\right) \end{pmatrix} \begin{pmatrix} \bar{\xi}_1 \\ \xi_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \quad (34) $$

while, for the rod, we have

$$ \cos\left(\bar{\xi}_1 L\right) \cosh\left(\xi_2 L\right) = 1. \quad (35) $$

Equations (34) and (35) can be solved numerically. In particular, taking into account the second line in (12), solutions of (35) are [35]

$$ \omega_n = \frac{\pi^2}{4}\left(3.011^2, 5^2, 7^2, \ldots, (2n+1)^2\right)\alpha^2, \qquad \alpha = \frac{\sqrt[4]{\varepsilon}}{L}. \quad (36) $$

A similar trend can be obtained for the stiff string. In view of their historical and practical relevance, we here report the numerical approximation for the allowed eigenfrequencies of the stiff string given by Fletcher [27]:

$$ \omega_n \simeq \frac{c}{L}\, n\pi \sqrt{1 + n^2\pi^2\alpha^2}\left(1 + 2\alpha + 4\alpha^2\right), \qquad \alpha = \frac{\sqrt{\varepsilon}}{L}. \quad (37) $$

If we expand the above expression in a series of powers of α truncated to second order, we have the following approximate formula valid for small values of stiffness:

$$ \omega_n \simeq \frac{c}{L}\, n\pi \left[ 1 + 2\alpha + \left(1 + \frac{1}{8}\, n^2\pi^2\right)(2\alpha)^2 \right]. \quad (38) $$

The last approximation does not apply to bars. For ε = 0, we have α = 0 and the eigenfrequencies tend to the well-known formula for the vibrating string [35]:

$$ \omega_n = n\omega_1. \quad (39) $$

Typical curves of the relative spacing $\chi_n \equiv \Delta\omega_n/\omega_1$, where $\Delta\omega_n \equiv \omega_{n+1} - \omega_n$, of eigenfrequencies for the stiff string are
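To make the inharmonicity described by (37) and (38) concrete, here is a small sketch (our own illustration, with hypothetical parameter values, not taken from the paper) that evaluates Fletcher's approximation and its small-stiffness expansion:

```python
import math

def fletcher_partial(n, L, c, eps):
    """Clamped stiff string eigenfrequency, Fletcher's approximation (37),
    with alpha = sqrt(eps)/L."""
    alpha = math.sqrt(eps) / L
    return ((c / L) * n * math.pi
            * math.sqrt(1.0 + (n * math.pi * alpha) ** 2)
            * (1.0 + 2.0 * alpha + 4.0 * alpha ** 2))

def fletcher_partial_small(n, L, c, eps):
    """Second-order expansion (38), valid for small stiffness."""
    alpha = math.sqrt(eps) / L
    return (c / L) * n * math.pi * (
        1.0 + 2.0 * alpha
        + (1.0 + n ** 2 * math.pi ** 2 / 8.0) * (2.0 * alpha) ** 2)
```

For small α the two formulas agree closely, and the ratio $\omega_n/(n\omega_1)$ grows with n, that is, the upper partials are progressively sharp, which is the characteristic inharmonicity of stiff strings.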
shown in Figure 2 with variable r, where the values of the other physical parameters are the same as in (15).

Figure 2: Typical eigenfrequencies relative spacing curves of the clamped stiff string for different values of the radius r of the section S.

Figure 3: Typical warping curves of the clamped stiff string for different values of the radius r of the section S.

Due to the dependency on the frequency of the phase of the solution, the eigenfrequencies of the stiff string are not equally spaced. For a small radius r, hence for a low degree of stiffness of the string (see (1)), the relative spacing is almost constant for all the considered orders of eigenfrequencies. However, for higher stiffness, the spacing of the eigenfrequencies increases, in first approximation, as a linear function of the order of the eigenfrequency. The above results are summarized by the typical "warping curves" of the system, shown in Figure 3, in which the quantity $\omega_n - n\omega_1$, which represents the deviation from linearity, is plotted in terms of the spacing $\Delta\omega_n$ between consecutive eigenfrequencies.

In the stiff string case, we have two sets of eigenfunctions, one having even parity and the other one having odd parity, whose analytical expressions are respectively given by [38]

$$ Y(x,\omega) = C(\omega)\cos\left(\bar{\xi}_1\frac{L}{2}\right)\left[\frac{\cos\left(\bar{\xi}_1 x\right)}{\cos\left(\bar{\xi}_1 L/2\right)} - \frac{\cosh\left(\xi_2 x\right)}{\cosh\left(\xi_2 L/2\right)}\right], $$
$$ Y(x,\omega) = C(\omega)\sin\left(\bar{\xi}_1\frac{L}{2}\right)\left[\frac{\sin\left(\bar{\xi}_1 x\right)}{\sin\left(\bar{\xi}_1 L/2\right)} - \frac{\sinh\left(\xi_2 x\right)}{\sinh\left(\xi_2 L/2\right)}\right], \quad (40) $$

where C(ω) is a constant that can be calculated by imposing the initial conditions.

2.3.2. Hinged stiff string and rod

Conditions (19) lead to the following sets of equations for the stiff string:

$$ \sin\left(\bar{\xi}_1 L\right)\sinh\left(\xi_2 L\right) = 0, \qquad \bar{\xi}_1^2 + \xi_2^2 = 0, \quad (41) $$

and for the rod:

$$ \sin\left(\bar{\xi}_1 L\right)\sinh\left(\xi_2 L\right) = 0. \quad (42) $$

The second line in (41) has no solutions since both $\bar{\xi}_1^2$ and $\xi_2^2$ are real functions. It follows that hinged stiff systems are only described by (42). In this equation, sinh(ξ₂L) = 0 has no solution, hence the eigenfrequencies are determined by the condition

$$ \bar{\xi}_1 = \frac{n\pi}{L}. \quad (43) $$

Using the parameters α respectively defined in (36) and (37), the eigenfrequencies for the hinged stiff string are exactly expressed as follows:

$$ \omega_n = \frac{c}{L}\, n\pi \sqrt{n^2\pi^2\alpha^2 + 1}, \quad (44) $$

while for the rod, we have

$$ \omega_n = n^2\pi^2\alpha^2. \quad (45) $$

As the tension T → 0, (44) tends to (45). Figure 4 shows the relative spacing of the eigenfrequencies in the case of the hinged stiff string. Relative eigenfrequencies spacing curves are very similar to the ones of the clamped string and so are the "warping curves" of the system, as shown in Figure 5.

Using (45), we can give an analytic expression for the spacing of the eigenfrequencies of the hinged rod. We have

$$ \Delta\omega_n = \pi^2\alpha^2(2n + 1). \quad (46) $$

Equation (43) leads to the following set of odd and even
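A one-line check of (45) and (46) for the hinged rod (an illustration of ours, with α = ⁴√ε/L as in (36)):

```python
import math

def hinged_rod_partial(n, L, eps):
    """Hinged rod eigenfrequency, eq. (45): omega_n = n^2 pi^2 alpha^2."""
    alpha = eps ** 0.25 / L
    return (n * math.pi * alpha) ** 2

def hinged_rod_spacing(n, L, eps):
    """Spacing of consecutive partials, eq. (46): pi^2 alpha^2 (2n + 1)."""
    alpha = eps ** 0.25 / L
    return math.pi ** 2 * alpha ** 2 * (2 * n + 1)
```

The spacing grows linearly with the partial index n, which is the stretching of the upper modes visible in the relative spacing curves.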
eigenfunctions for the stiff string [38]:

$$ Y_n(x,\omega) = 2D(\omega)\sin\left(\frac{2n\pi}{L}\,x\right), \qquad Y_n(x,\omega) = 2D(\omega)\cos\left(\frac{(2n+1)\pi}{L}\,x\right), \quad (47) $$

where D(ω) must be determined by enforcing the initial conditions. It is worth noting that both functions in (47) are independent of the stiffness parameter ε. In Section 3, we will use the obtained results in order to implement the dispersive waveguides digitally simulating the solutions of (4) and (5).

Figure 4: Typical eigenfrequencies relative spacing curves of the hinged stiff string for different values of the radius r of the section S.

Figure 5: Typical warping curves of the hinged stiff string for different values of the radius r of the section S.

Finally, we need to stress the fact that the eigenfrequencies of the hinged stiff string are similar to the ones for the clamped case except for the factor (1 + 2α + 4α²). Therefore, for small values of stiffness, they do not differ too much. This can also be seen from the similarity of the warping curves obtained with the two types of boundary conditions.

Taking into account the fact that real-piano string boundary conditions lie in between these two cases, we can conclude that the eigenfrequencies of real-piano strings can be calculated by means of the approximated formula [27, 28]:

$$ \omega_n \simeq A\, n \sqrt{B n^2 + 1}, \quad (48) $$

where A and B can be obtained from measurements. Approximation (48) is useful in order to match measured vibrating modes against the model eigenfrequencies.

3. NUMERICAL APPROXIMATIONS OF STIFF SYSTEMS

Most of the problems encountered when dealing with the continuous-time equation of the stiff string consist in determining the general solution and in relating the initial and boundary conditions to the integrating constants of the equation. In this section, we will show that we can use a similar technique also in discrete time, which yields a numerical transform method for the computation of the solution.

In Section 2, we noted that (1) becomes the equation of the vibrating string in the case of a negligible stiffness coefficient ε. It is well known that the technique known as the Karplus-Strong algorithm implements the discrete-time domain solution of the vibrating string equation [8], allowing us to reach good quality acoustic results. The block diagram of the adopted loop circuit is shown in Figure 6.

Figure 6: Basic Karplus-Strong delay cascade: the input X(z) feeds a loop containing 2P delays (z⁻²ᴾ) and a low-pass loop filter G(z), producing the output Y(z).

The transfer function of the digital loop chain can be written as follows:

$$ H(z) = \frac{1}{1 - z^{-2P}\, G(z)}, \quad (49) $$

where the loop filter G(z) takes into account losses due to nonrigid terminations and to internal friction, and P is the number of sections in which the string is subdivided, as obtained from time and space sampling. Loop filter design can be based on measured partial amplitude and frequency trajectories [18], or on linear predictive coding (LPC)-type methods [9]. The filter G(z) can be modelled as IIR or FIR and it must be estimated from samples of the sound or from a model of the string losses, where, for stability, we need $|G(e^{j\omega})| < 1$. Clearly, in the IIR case or in the nonlinear-phase FIR case, the phase response of the loop filter introduces a
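As an illustration of the loop (49), here is a minimal Karplus-Strong sketch (our own; the lossy two-point average chosen for G(z) is one common textbook choice, not a filter specified by the paper):

```python
import random

def karplus_strong(num_samples, P, g=0.996, seed=0):
    """Time-domain form of eq. (49), y[n] = x[n] + g*(y[n-2P] + y[n-2P-1])/2,
    i.e. G(z) = g*(1 + z^-1)/2 with |G(e^{jw})| < 1 for 0 < g < 1,
    excited by a white-noise burst one round trip (2P samples) long."""
    delay = 2 * P
    rng = random.Random(seed)
    y = [0.0] * num_samples
    for n in range(num_samples):
        x = rng.uniform(-1.0, 1.0) if n < delay else 0.0  # noise excitation
        a = y[n - delay] if n >= delay else 0.0
        b = y[n - delay - 1] if n > delay else 0.0
        y[n] = x + 0.5 * g * (a + b)
    return y
```

The pitch is roughly f_s/(2P); replacing the 2P unit delays with all-pass sections, as discussed next, turns this structure into a dispersive waveguide.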
limited amount of dispersion. Additional phase terms in the form of all-pass filters can be added in order to tune the string model to the required pitch [13] and contribute to further dispersion.

Since the group velocity for a traveling wave in a stiff system depends on frequency (see (16)), it is natural to substitute, in discrete time, the cascade of unit delays with a chain of circuital elements whose phase responses do depend on frequency. One can show that the only choice that leads to rational transfer functions is given by a chain of first-order all-pass filters [39, 40]. More complex physical systems, for example, as in the simulation of a monaural room, call for substituting the delay chain with a more general filter as illustrated in [41]:

$$ A(z, u) = \frac{z^{-1} - u}{1 - uz^{-1}}, \quad (50) $$

whose phase characteristic is

$$ \theta(\Omega) = \Omega + 2\arctan\left(\frac{u\sin\Omega}{1 - u\cos\Omega}\right). \quad (51) $$

The phase characteristics in (51) are plotted in Figure 7 for various values of u.

Figure 7: First-order all-pass phase plotted for various values of u.

A comparison between the curve in Figure 1 and the ones in Figure 7 gives more elements of plausibility for the approximation of the solution phase of the stiff model equations, given in (12), with the all-pass filter phase (51). Adopting a similar circuital scheme as in the Karplus-Strong algorithm [10], in which the unit delays are replaced by first-order all-pass filters, the approximation is given by

$$ \frac{L}{P}\,\xi_1\left(\Omega f_s\right) \simeq \theta(\Omega), \quad (52) $$

where $f_s$ is the sampling frequency. Note that, by definition, both members of (52) are real numbers. Therefore, in the z-domain, a nonstiff system can be mapped into a stiff system by means of the frequency warping map

$$ z^{-1} \longrightarrow A(z). \quad (53) $$

The resulting circuit is shown in Figure 8.

Figure 8: Dispersive waveguide used to simulate dispersive systems: the 2P delays of Figure 6 are replaced by the all-pass cascade A(z)²ᴾ, with the low-pass loop filter G(z).

Note that the feedback all-pass chain results in delay-free loops. Computationally, these loops can be resolved by the methods illustrated in [34, 42, 43]. Moreover, the phase response of the loop filter G(z) contributes to the dispersion and it must be taken into account in the global model.

The circuit in Figure 8 can be optimized in order to take into account the losses and the coupling amongst strings (e.g., as in the piano). In the framework of this paper, we confined our interest to the design of the stiff system filter. For a review of the design of lossy filters and coupling models, see [17].

3.1. Stiff system filter parameters determination

Within the framework of the approximation (52) in the case of the dispersive waveguide, the integer parameter P can be obtained by constraining the two functions to attain the same values at the extrema of the bandwidth. Since θ(π) = π, we have

$$ P = \frac{\xi_1\left(\pi f_s\right) L}{\pi}. \quad (54) $$

As we will see, condition (54) is not the only one that can be obtained for the parameter P. The deviation from linearity introduced by the warping θ(Ω) can be written as follows:

$$ \Delta(\Omega) \equiv \theta(\Omega) - \Omega = 2\arctan\left(\frac{u\sin\Omega}{1 - u\cos\Omega}\right). \quad (55) $$

The function ∆(Ω) is plotted, for different values of u, in Figure 9. One can see that the absolute value of ∆(Ω) has a maximum which corresponds to the maximum deviation from the linearity of θ(Ω). It can be shown that this maximum occurs for

$$ \Omega = \Omega_M = \arccos(u), \quad (56) $$
for which the maximum deviation is

    \Delta(\Omega_M, u) = 2 \arcsin(u).    (57)

Substituting (56) in (51), we have

    \theta(\Omega_M) = \frac{\pi}{2} + \arcsin(u).    (58)

Since the solution phase \xi_1 is approximated by \theta(\Omega), it has to satisfy the condition

    \xi_1\left(\frac{\Omega_M}{T}\right) \frac{L}{P} \approx \frac{\pi}{2} + \arcsin(u),    (59)

and therefore we have the following bound on P:

    P \approx \frac{L\, \xi_1\big(f_s \arccos(u)\big)}{\pi/2 + \arcsin(u)}.    (60)

For higher-order all-pass filters of order Q, (60) can be written as follows:

    P \approx \frac{L\, \xi_1\Big(\frac{f_s}{Q} \sum_{i=1}^{Q} \arccos u_i\Big)}{\pi/2 + \arcsin\Big(\frac{1}{Q} \sum_{i=1}^{Q} u_i\Big)}.    (61)

An optimization algorithm can be used to obtain the vector parameter u. Based on our experiments, we estimated that an optimal order Q is 4 for the piano string. Therefore, using the values in (15) for the 58 Hz tone of an L = 200 cm brass string, we obtain P = 209. Although this is not a model for a real-life wound inhomogeneous piano string, this example gives a rough idea of the typical number of required all-pass sections. The computation of this long all-pass chain can be too heavy for real-time applications. Therefore, an approximation of the chain by means of a cascade of all-pass filters of order much smaller than 2P, together with unit delays, is usually sought [13, 29, 30]. A simple and accurate approach is to model the all-pass as a cascade of first-order sections with variable real parameter u [38]. However, a more general approach calls for including in the design second-order all-pass sections, equivalent to a pair of complex-conjugate first-order sections [29]. In Section 4, we will bypass this estimation procedure based on the theoretical eigenfunctions of the string and estimate the all-pass parameters and the number of sections from samples of the piano.

[Figure 9: Plot of the deviation from linearity of the all-pass filter phase for different values of the parameter u.]

3.2. Laguerre sequences

An invertible and orthogonal transform, which is related to the all-pass chain included in the stiff string model, is given by the Laguerre transform [44, 45]. The Laguerre sequences l_i[m, u] are best defined in the z-domain as follows:

    L_i(z, u) = \frac{\sqrt{1 - u^2}}{1 - u z^{-1}} \left( \frac{z^{-1} - u}{1 - u z^{-1}} \right)^i.    (62)

Thus, the Laguerre sequences can be obtained from the z-domain recurrence

    L_0(z, u) = \frac{\sqrt{1 - u^2}}{1 - u z^{-1}},
    L_{i+1}(z, u) = A(z) L_i(z, u),    (63)

where A(z) is defined as in (50). Comparison of (62) with (50) shows that the phase of the z-transform of the Laguerre sequences is suitable for approximating the phase of the solution of the stiff model equation. A biorthogonal generalization of the Laguerre sequences, calling for a variable u from section to section, is illustrated in [46]. This is linked to the refined approximation of the solution previously shown.

3.3. Initial conditions

Putting together the results obtained in Section 1, we can write the solution of the stiff model, Y(\Omega, x), as follows (see (11) and (14)):

    Y(\omega, x) = c_1^+(\omega) \exp(i \xi_1 x) + c_1^-(\omega) \exp(-i \xi_1 x).    (64)

We are now disregarding the transient term due to \xi_2 since it does not influence the acoustic frequencies of the system. In discrete time and space, we let x = m(L/P) as in [10]. With the approximation (52), (64) becomes

    Y(m, \Omega) \approx c_1^+(\Omega) \exp\big(i m \theta(\Omega)\big) + c_1^-(\Omega) \exp\big(-i m \theta(\Omega)\big).    (65)

Substituting (63) in (65), we have

    Y(\Omega, m) \approx c_1^+(\Omega) \frac{L_m(\Omega, u)}{L_0(z, u)} + c_1^-(\Omega) \frac{L_{-m}(\Omega, u)}{L_0(z, u)},    (66)

where we have used the fact that

    A(e^{i\Omega}, u) = \frac{e^{-i\Omega} - u}{1 - u e^{-i\Omega}} = \exp\big(i \theta(\Omega)\big).    (67)

Physically Inspired Models 973
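The z-domain recurrence (63) maps directly onto repeated all-pass filtering: the impulse response of L_0(z, u) is passed again and again through A(z, u). The following sketch (Python with NumPy and SciPy; the function name, the truncation length, and the Gram-matrix check are our own illustration, not part of the original implementation) generates truncated Laguerre sequences and lets one verify their orthonormality numerically:

```python
import numpy as np
from scipy.signal import lfilter

def laguerre_sequences(num_seq, num_samples, u):
    """Truncated Laguerre sequences l_i[n, u] from the z-domain
    recurrence (63): L_0 = sqrt(1 - u^2) / (1 - u z^-1) and
    L_{i+1} = A(z) L_i, with A(z) = (z^-1 - u) / (1 - u z^-1)."""
    delta = np.zeros(num_samples)
    delta[0] = 1.0
    # impulse response of L_0(z, u)
    seq = lfilter([np.sqrt(1.0 - u**2)], [1.0, -u], delta)
    seqs = [seq]
    for _ in range(num_seq - 1):
        # one more pass through the first-order all-pass A(z, u)
        seq = lfilter([-u, 1.0], [1.0, -u], seq)
        seqs.append(seq)
    return np.vstack(seqs)

lag = laguerre_sequences(8, 4096, u=0.3)
gram = lag @ lag.T  # approaches the identity matrix as the
                    # truncation length grows (orthonormality)
```

On the unit circle, each application of A(z, u) contributes the phase \theta(\Omega) of (51), which is how the chain approximates the dispersive phase of the stiff string.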
By defining

    V_+(\Omega) \equiv \frac{c_1^+(\Omega)}{L_0(z, u)}, \qquad V_-(\Omega) \equiv \frac{c_1^-(\Omega)}{L_0(z, u)},    (68)

(66) can be written as follows:

    Y(m, \Omega) \approx V_+(\Omega) L_m(\Omega, u) + V_-(\Omega) L_{-m}(\Omega, u).    (69)

Taking the inverse discrete-time Fourier transform (IDTFT) of both sides of (69), we obtain

    y[m, n] = y_+[m, n] + y_-[m, n],    (70)

where

    y_+[m, n] = \sum_{k=-\infty}^{\infty} v_+[n - k]\, l_m[k, u],
    y_-[m, n] = \sum_{k=-\infty}^{\infty} v_-[n - k]\, l_{-m}[k, u],    (71)

and the sequences v_\pm[n] are the IDTFT of V_\pm(\Omega). For the sake of conciseness, we do not report here the expression of v_\pm[n] in terms of the constants c_1^\pm; for further details, see [31, 38]. The numerical solution y[m, n] can be written in terms of a generic initial condition

    y[m, 0] = y_+[m, 0] + y_-[m, 0].    (72)

In order to do this, we resort to the extension of the Laguerre sequences to negative arguments,

    l_m[n, u] = \begin{cases} l_m[n, u], & n \ge 0, \\ l_m[-n, u], & n < 0, \end{cases}    (73)

and to the property

    l_m[n, u] = l_n[m, -u].    (74)

If we introduce the quantities

    y_k^{\pm}[u] = \sum_{m=0}^{\infty} y_\pm[m, 0]\, l_k[\pm m, u], \qquad l_k[\pm m, u] = l_{\pm m}[k, u],    (75)

then, with a simple mathematical manipulation, (71) can be written as follows:

    y_+[m, n] = \sum_{k=-\infty}^{\infty} y_k^+[u]\, l_m[k + n, u],
    y_-[m, n] = \sum_{k=-\infty}^{\infty} y_k^-[u]\, l_m[k + n, u].    (76)

Therefore, the numeric solution becomes

    y[m, n] = \sum_{k=-\infty}^{\infty} y_k^+ l_m[k + n, u] + \sum_{k=-\infty}^{\infty} y_k^- l_m[k + n, u].    (77)

We have just shown that the solution of the discrete-time stiff model equation can be written as a Laguerre expansion of the initial condition. At the same time, this shows that the stiff string model is equivalent to a nonstiff string model cascaded with frequency warping obtained by Laguerre expansion.

3.4. Boundary conditions

In Section 1, we discussed the stiff model equation boundary conditions in continuous time (see (18) and (19)). In this section, we discuss the homogeneous boundary conditions (i.e., the first line in both (18) and (19)) in the discrete-time domain. Using the approximation (52) and letting the number of sections P of the stiff system be an even integer, we can write the homogeneous conditions as follows (see also (69)):

    Y\left(-\frac{P}{2}, \Omega\right) = 0 \implies V_+(\Omega) L_{-P/2}(\Omega, u) + V_-(\Omega) L_{P/2}(\Omega, u) = 0,
    Y\left(+\frac{P}{2}, \Omega\right) = 0 \implies V_+(\Omega) L_{P/2}(\Omega, u) + V_-(\Omega) L_{-P/2}(\Omega, u) = 0.    (78)

Like (34), (78) can be expressed in matrix form:

    \begin{pmatrix} L_{P/2}(\Omega, u) & L_{-P/2}(\Omega, u) \\ L_{-P/2}(\Omega, u) & L_{P/2}(\Omega, u) \end{pmatrix} \begin{pmatrix} V_+(\Omega) \\ V_-(\Omega) \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.    (79)

As shown in Section 3.3, the functions V_\pm(\Omega) are determined by means of the Laguerre expansion of the initial condition sequences through (71) and (76). For any choice of these initial conditions, the determinant of the coefficient matrix in (79) must be zero, yielding the condition

    L_{P/2}(\Omega, u)^2 - L_{-P/2}(\Omega, u)^2 = 0.    (80)

Recalling the z-transform expression for the Laguerre sequences, we have

    \sin\big(\theta(\Omega) P\big) = 0, \qquad \theta(\Omega) = \frac{k\pi}{P}, \quad k = 1, 2, 3, \ldots.    (81)

In the stiff string case, the eigenfrequencies of the system are not harmonically related. In our approximation of the phase of the solution with the digital all-pass phase, harmonicity is reobtained at a different level: the displacement of the all-pass phase values is harmonic according to the law written in (81). The distance between two consecutive values of this phase is \pi/P. Due to the nonrigid terminations, the real-life boundary conditions can be given in terms of frequency-dependent functions, which are included in the loop filter. In mapping the stiff structure to a nonstiff one, care must be taken to unwarp the loop filter as well.

4. SYNTHESIS OF SOUND

In order to implement a piano simulation via the physical model, we need to determine the design parameters of the
[Figure 10: Computed all-pass optimized parameters u.]
[Figure 11: Warped deviation from linearity.]
dispersive waveguide, that is, the number of all-pass sections and the coefficients u_i of the all-pass filters. This task could be performed by means of lengthy measurements or estimation of the physical variables, such as the tension, Young's modulus, density, and so forth. However, as we already remarked, due to the constitutive complexity of real-life piano strings and terminations, this task seems to be quite difficult and to lead to inaccurate results. In fact, the given physical model only approximately matches the real situation. Indeed, in order to model and justify the measured eigenfrequencies, we resorted to Fletcher's experimental model described by (48). However, in that case, we ignore the exact form of the eigenfunctions, which is required in order to determine the number of sections of the waveguide and the other parameters. A more pragmatic and effective approach is to estimate the waveguide parameters directly from the measured eigenfrequencies \omega_n. These can be extracted, for example, from recorded samples of notes played by the piano under examination. Fletcher's parameters A and B can be calculated as follows:

    A = \frac{1}{2n} \sqrt{\frac{16\omega_n^2 - \omega_{2n}^2}{3}}, \qquad
    B = \frac{4\gamma^2 - 1}{n^2 \left(1 - 16\gamma^2\right)}, \qquad \gamma = \frac{\omega_n}{\omega_{2n}}.    (82)

In practice, in the model where the all-pass parameters u_i are equal throughout the delay line, one does not even need to estimate Fletcher's parameters. In fact, in view of the equivalence of the stiff string model with the warped nonstiff model, one can directly determine, through optimization, the parameter u that makes the dispersion curve of the eigenfrequencies the closest to a straight line, using a suitable distance. A result of this optimization is shown in Figure 10. It must be pointed out that our point of view differs from the one proposed in [29, 30], where the objective is the minimization of the number of nontrivial all-pass sections in the cascade.

Given the optimum warping curve, the number of sections is then determined by forcing the pitch of the cascade of the nonstiff model (Karplus-Strong like) with warping to match the required fundamental frequency of the recorded tone. An example of this method is shown in Figure 11, where the measured warping curves pertaining to several piano keys in the low register, as estimated from the resonant eigenfrequencies, are shown. In Figure 12, the optimum sequence of all-pass parameters u for the examined tones is shown. Finally, in Figure 13, the plot of the dispersion curves regularized by means of optimum unwarping is shown. For further details about this method, see [47, 48, 49].

[Figure 12: Optimized all-pass parameters u for the A#3 tone.]
[Figure 13: Optimum unwarped regularized dispersion curves.]

Frequency warping has also been employed in conjunction with 2D waveguide meshes in the effort of reducing the artificial dispersion introduced by the nonisotropic spatial sampling [50]. Since the required warping curves do not match the first-order all-pass phase characteristic, in order to overcome this difficulty a technique including resampling operators has been used in [50, 51], according to a scheme first introduced in [33] and further developed in [52] for the wavelet transforms. However, the downsampling operators inevitably introduce aliasing. While in the context of wavelet transforms this problem is tackled with multichannel filter banks, this is not the case for 2D waveguide meshes.

5. CONCLUSIONS

In order to support the design and use of digital dispersive waveguides, we reviewed the physical model of stiff systems, using a frequency-domain approach in both continuous and discrete time. We showed that, for dispersive propagation in discrete time, the Laguerre transform allows us to write the solution of the stiff model equation in terms of an orthogonal expansion of the initial conditions and to reobtain harmonicity at the level of the displacement of the all-pass phase values. Consequently, we showed that the stiff string model is equivalent to a nonstiff string model cascaded with frequency warping, in turn obtained by Laguerre expansion. Finally, we showed that, due to this equivalence, the all-pass coefficients can be computed by means of optimization algorithms matching the stiff model with a warped nonstiff one.

The exploration of physical models of musical instruments requires mathematical or physical approximations in order to make the problem treatable. When available, the solutions will only partially reflect the ensemble of mechanical and acoustic phenomena involved. However, the physical models serve as a solid background for the construction of physically inspired models, which are flexible numerical approximations of the solutions. Per se, these approximations are interesting for the synthesis of virtual instruments. However, in order to fine-tune the physically inspired models to real instruments, one needs methods for the estimation of the parameters from samples of the instrument. In this paper, we showed that dispersion from stiffness is a simple case in which the solution of the raw physical model suggests a discrete-time model, which is flexible enough to be used in synthesis and which provides realistic results when the characteristics are estimated from the samples.

REFERENCES

[1] B. L. Vercoe, W. G. Gardner, and E. D. Scheirer, "Structured audio: creation, transmission, and rendering of parametric sound representations," Proceedings of the IEEE, vol. 86, no. 5, pp. 922–940, 1998.
[2] P. Cook, "Physically informed sonic modeling (PhISM): synthesis of percussive sounds," Computer Music Journal, vol. 21, no. 3, pp. 38–49, 1997.
[3] L. Hiller and P. Ruiz, "Synthesizing musical sounds by solving the wave equation for vibrating objects: Part I," Journal of the Audio Engineering Society, vol. 19, no. 6, pp. 462–470, 1971.
[4] L. Hiller and P. Ruiz, "Synthesizing musical sounds by solving the wave equation for vibrating objects: Part II," Journal of the Audio Engineering Society, vol. 19, no. 7, pp. 542–551, 1971.
[5] A. Chaigne and A. Askenfelt, "Numerical simulations of piano strings. I. A physical model for a struck string using finite difference methods," Journal of the Acoustical Society of America, vol. 95, no. 2, pp. 1112–1118, 1994.
[6] A. Chaigne and A. Askenfelt, "Numerical simulations of piano strings. II. Comparisons with measurements and systematic exploration of some hammer-string parameters," Journal of the Acoustical Society of America, vol. 95, no. 3, pp. 1631–1640, 1994.
[7] A. Chaigne, "On the use of finite differences for musical synthesis. Application to plucked stringed instruments," Journal d'Acoustique, vol. 5, no. 2, pp. 181–211, 1992.
[8] D. A. Jaffe and J. O. Smith III, "Extensions of the Karplus-Strong plucked-string algorithm," in The Music Machine, C. Roads, Ed., pp. 481–494, MIT Press, Cambridge, Mass, USA, 1989.
[9] J. O. Smith III, Techniques for digital filter design and system identification with application to the violin, Ph.D. thesis, Electrical Engineering Department, Stanford University (CCRMA), Stanford, Calif, USA, June 1983.
[10] K. Karplus and A. Strong, "Digital synthesis of plucked-string and drum timbres," in The Music Machine, C. Roads, Ed., pp. 467–479, MIT Press, Cambridge, Mass, USA, 1989.
[11] J. O. Smith III, "Physical modeling using digital waveguides," Computer Music Journal, vol. 16, no. 4, pp. 74–91, 1992.
[12] J. O. Smith III, "Physical modeling synthesis update," Computer Music Journal, vol. 20, no. 2, pp. 44–56, 1996.
[13] S. A. Van Duyne and J. O. Smith III, "A simplified approach to modeling dispersion caused by stiffness in strings and plates," in Proc. 1994 International Computer Music Conference, pp. 407–410, Aarhus, Denmark, September 1994.
[14] J. O. Smith III, "Principles of digital waveguide models of musical instruments," in Applications of Digital Signal Processing to Audio and Acoustics, M. Kahrs and K. Brandenburg, Eds., pp. 417–466, Kluwer Academic Publishers, Boston, Mass, USA, 1998.
[15] M. Karjalainen, T. Tolonen, V. Välimäki, C. Erkut, M. Laurson, and J. Hiipakka, "An overview of new techniques and effects in model-based sound synthesis," Journal of New Music Research, vol. 30, no. 3, pp. 203–212, 2001.
[16] J. Bensa, S. Bilbao, R. Kronland-Martinet, and J. O. Smith III, "The simulation of piano string vibration: from physical models to finite difference schemes and digital waveguides," Journal of the Acoustical Society of America, vol. 114, no. 2, pp. 1095–1107, 2003.
[17] B. Bank, F. Avanzini, G. Borin, G. De Poli, F. Fontana, and D. Rocchesso, "Physically informed signal processing methods for piano sound synthesis: a research overview," EURASIP Journal on Applied Signal Processing, vol. 2003, no. 10, pp. 941–952, 2003.
[18] V. Välimäki, J. Huopaniemi, M. Karjalainen, and Z. Jánosy, "Physical modeling of plucked string instruments with application to real-time sound synthesis," Journal of the Audio Engineering Society, vol. 44, no. 5, pp. 331–353, 1996.
[19] J. O. Smith III, "Efficient synthesis of stringed musical instruments," in Proc. 1993 International Computer Music Conference, pp. 64–71, Tokyo, Japan, September 1993.
[20] M. Karjalainen, V. Välimäki, and Z. Jánosy, "Towards high-quality sound synthesis of the guitar and string instruments," in Proc. 1993 International Computer Music Conference, pp. 56–63, Tokyo, Japan, September 1993.
[21] G. Borin and G. De Poli, "A hysteretic hammer-string interaction model for physical model synthesis," in Proc. Nordic Acoustical Meeting, pp. 399–406, Helsinki, Finland, June 1996.
[22] G. E. Garnett, "Modeling piano sound using digital waveguide filtering techniques," in Proc. 1987 International Computer Music Conference, pp. 89–95, Urbana, Ill, USA, August 1987.
[23] J. O. Smith III and S. A. Van Duyne, "Commuted piano synthesis," in Proc. 1995 International Computer Music Conference, pp. 319–326, Banff, Canada, September 1995.
[24] S. A. Van Duyne and J. O. Smith III, "Developments for the commuted piano," in Proc. 1995 International Computer Music Conference, pp. 335–343, Banff, Canada, September 1995.
[25] M. Karjalainen and J. O. Smith III, "Body modeling techniques for string instrument synthesis," in Proc. 1996 International Computer Music Conference, pp. 232–239, Hong Kong, August 1996.
[26] M. Karjalainen, V. Välimäki, and T. Tolonen, "Plucked-string models, from the Karplus-Strong algorithm to digital waveguides and beyond," Computer Music Journal, vol. 22, no. 3, pp. 17–32, 1998.
[27] H. Fletcher, "Normal vibration frequencies of a stiff piano string," Journal of the Acoustical Society of America, vol. 36, no. 1, pp. 203–209, 1964.
[28] H. Fletcher, E. D. Blackham, and R. Stratton, "Quality of piano tones," Journal of the Acoustical Society of America, vol. 34, no. 6, pp. 749–761, 1962.
[29] D. Rocchesso and F. Scalcon, "Accurate dispersion simulation for piano strings," in Proc. Nordic Acoustical Meeting, pp. 407–414, Helsinki, Finland, June 1996.
[30] D. Rocchesso and F. Scalcon, "Bandwidth of perceived inharmonicity for physical modeling of dispersive strings," IEEE Trans. Speech and Audio Processing, vol. 7, no. 5, pp. 597–601, 1999.
[31] I. Testa, G. Evangelista, and S. Cavaliere, "A physical model of stiff strings," in Proc. Institute of Acoustics (International Symposium on Music and Acoustics), vol. 19, pp. 219–224, Edinburgh, UK, August 1997.
[32] S. Cavaliere and G. Evangelista, "Deterministic least squares estimation of the Karplus-Strong synthesis parameter," in Proc. International Workshop on Physical Model Synthesis, pp. 15–19, Firenze, Italy, June 1996.
[33] G. Evangelista and S. Cavaliere, "Discrete frequency warped wavelets: theory and applications," IEEE Trans. Signal Processing, vol. 46, no. 4, pp. 874–885, 1998.
[34] A. Härmä, M. Karjalainen, L. Savioja, V. Välimäki, U. K. Laine, and J. Huopaniemi, "Frequency-warped signal processing for audio applications," Journal of the Audio Engineering Society, vol. 48, no. 11, pp. 1011–1031, 2000.
[35] N. H. Fletcher and T. D. Rossing, Principles of Vibration and Sound, Springer-Verlag, New York, NY, USA, 1995.
[36] L. D. Landau and E. M. Lifšits, Theory of Elasticity, Editions Mir, Moscow, Russia, 1967.
[37] N. Dunford and J. T. Schwartz, Linear Operators. Part 2: Spectral Theory, Self Adjoint Operators in Hilbert Space, John Wiley & Sons, New York, NY, USA, 1st edition, 1963.
[38] I. Testa, Sintesi del suono generato dalle corde vibranti: un algoritmo basato su un modello dispersivo, Physics degree thesis, Università Federico II di Napoli, Napoli, Italy, 1997.
[39] H. W. Strube, "Linear prediction on a warped frequency scale," Journal of the Acoustical Society of America, vol. 68, no. 4, pp. 1071–1076, 1980.
[40] J. A. Moorer, "The manifold joys of conformal mapping: applications to digital filtering in the studio," Journal of the Audio Engineering Society, vol. 31, no. 11, pp. 826–841, 1983.
[41] J.-M. Jot and A. Chaigne, "Digital delay networks for designing artificial reverberators," in Proc. 90th Convention of the Audio Engineering Society, Paris, France, preprint no. 3030, February 1991.
[42] M. Karjalainen, A. Härmä, and U. K. Laine, "Realizable warped IIR filters and their properties," in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 3, pp. 2205–2208, Munich, Germany, April 1997.
[43] A. Härmä, "Implementation of recursive filters having delay free loops," in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 3, pp. 1261–1264, Seattle, Wash, USA, May 1998.
[44] P. W. Broome, "Discrete orthonormal sequences," Journal of the ACM, vol. 12, no. 2, pp. 151–168, 1965.
[45] A. V. Oppenheim, D. H. Johnson, and K. Steiglitz, "Computation of spectra with unequal resolution using the fast Fourier transform," Proceedings of the IEEE, vol. 59, pp. 299–301, 1971.
[46] G. Evangelista and S. Cavaliere, "Audio effects based on biorthogonal time-varying frequency warping," EURASIP Journal on Applied Signal Processing, vol. 2001, no. 1, pp. 27–35, 2001.
[47] G. Evangelista and S. Cavaliere, "Auditory modeling via frequency warped wavelet transform," in Proc. European Signal Processing Conference, vol. I, pp. 117–120, Rhodes, Greece, September 1998.
[48] G. Evangelista and S. Cavaliere, "Dispersive and pitch-synchronous processing of sounds," in Proc. Digital Audio Effects Workshop, pp. 232–236, Barcelona, Spain, November 1998.
[49] G. Evangelista and S. Cavaliere, "Analysis and regularization of inharmonic sounds via pitch-synchronous frequency warped wavelets," in Proc. 1997 International Computer Music Conference, pp. 51–54, Thessaloniki, Greece, September 1997.
[50] L. Savioja and V. Välimäki, "Reducing the dispersion error in the digital waveguide mesh using interpolation and frequency-warping techniques," IEEE Trans. Speech and Audio Processing, vol. 8, no. 2, pp. 184–194, 2000.
[51] L. Savioja and V. Välimäki, "Multiwarping for enhancing the frequency accuracy of digital waveguide mesh simulations," IEEE Signal Processing Letters, vol. 8, no. 5, pp. 134–136, 2001.
[52] G. Evangelista, Dyadic Warped Wavelets, vol. 117 of Advances in Imaging and Electron Physics, Academic Press, New York, NY, USA, 2001.
I. Testa was born in Napoli, Italy, on September 21, 1973. He received the Laurea in Physics from the University of Napoli "Federico II" in 1997 with a dissertation on physical modeling of vibrating strings. In the following years, he has been engaged in research on the didactics of physics, in the field of secondary school teacher training on the use of computer-based activities, and in teaching computer architecture for the information sciences course. He is currently teaching electronics and telecommunications at the Vocational School Galileo Ferraris, Napoli.
G. Evangelista received the Laurea in physics (with the highest honors) from the University of Napoli, Napoli, Italy, in 1984, and the M.S. and Ph.D. degrees in electrical engineering from the University of California, Irvine, in 1987 and 1990, respectively. Since 1995, he has been an Assistant Professor with the Department of Physical Sciences, University of Napoli "Federico II". From 1998 to 2002, he was a Scientific Adjunct with the Laboratory for Audiovisual Communications, Swiss Federal Institute of Technology, Lausanne, Switzerland. From 1985 to 1986, he worked at the Centre d'Etudes de Mathématique et Acoustique Musicale (CEMAMu/CNET), Paris, France, where he contributed to the development of a DSP-based sound synthesis system, and from 1991 to 1994, he was a Research Engineer at the Microgravity Advanced Research and Support Center, Napoli, where he was engaged in research in image processing applied to fluid motion analysis and material science. His interests include digital audio, speech, music, and image processing; coding; wavelets; and multirate signal processing. Dr. Evangelista was a recipient of the Fulbright Fellowship.
S. Cavaliere received the Laurea in electronic engineering (with the highest honors) from the University of Napoli "Federico II", Napoli, Italy, in 1971. Since 1974, he has been with the Department of Physical Sciences, University of Napoli, first as a Research Associate and then as an Associate Professor. From 1972 to 1973, he was with CNR at the University of Siena. In 1986, he spent an academic year at the Media Laboratory, Massachusetts Institute of Technology, Cambridge. From 1987 to 1991, he received a research grant for a project devoted to the design of VLSI chips for real-time sound processing and for the realization of the Musical Audio Research Station, a workstation for sound manipulation, IRIS, Rome, Italy. He has also been a Research Associate with INFN for the realization of very large systems for data acquisition from nuclear physics experiments (KLOE in Frascati and ARGO in Tibet) and for the development of techniques for the detection of signals in high-level noise in the Virgo experiment. His interests include sound and music signal processing, in particular for the Web, signal transforms and representations, VLSI, and specialized computers for sound manipulation.

EURASIP Journal on Applied Signal Processing 2004:7, 978–989
© 2004 Hindawi Publishing Corporation
Digital Waveguides versus Finite Difference Structures: Equivalence and Mixed Modeling
Matti Karjalainen Laboratory of Acoustics and Audio Signal Processing, Helsinki University of Technology, 02150 Espoo, Finland Email: matti.karjalainen@hut.fi
Cumhur Erkut Laboratory of Acoustics and Audio Signal Processing, Helsinki University of Technology, 02150 Espoo, Finland Email: cumhur.erkut@hut.fi
Received 30 June 2003; Revised 4 December 2003
Digital waveguides and finite difference time domain schemes have been used in physical modeling of spatially distributed systems. Both of them are known to provide exact modeling of ideal one-dimensional (1D) band-limited wave propagation, and both of them can be composed to approximate two-dimensional (2D) and three-dimensional (3D) mesh structures. Their equal capabilities in physical modeling have been shown for special cases and have been assumed to cover generalized cases as well. The ability to form mixed models by joining substructures of both classes through converter elements has been proposed recently. In this paper, we formulate a general digital signal processing (DSP)-oriented framework where the functional equivalence of these two approaches is systematically elaborated and the conditions of building mixed models are studied. An example of mixed modeling of a 2D waveguide is presented.

Keywords and phrases: acoustic signal processing, hybrid models, digital waveguides, scattering, FDTD model structures.
1. INTRODUCTION

Discrete-time simulation of spatially distributed acoustic systems for sound and voice synthesis finds its roots both in the modeling of speech production and in that of musical instruments. The Kelly-Lochbaum vocal tract model [1] introduced a one-dimensional transmission-line simulation of speech production with two-directional delay lines and scattering junctions for nonhomogeneous vocal tract profiles. The delay sections discretize the d'Alembert solution of the wave equation [2], and the scattering junctions implement the acoustic continuity laws of pressure and volume velocity in a tube of varying diameter. Further simplification led to the synthesis models used as the basis for linear prediction of speech [3].

A similar modeling approach to musical instruments, such as string and wind instruments, was formulated later and named the technique of digital waveguides (DWGs) [4, 5]. For computational efficiency reasons, in DWGs two-directional delay lines are often reduced to single delay loops [6]. DWGs have been further discussed in two-dimensional (2D) and three-dimensional (3D) modeling [5, 7, 8, 9, 10], sometimes combined with a finite difference approach into DWG meshes.

Finite difference schemes [11] were introduced to the simulation of the vibrating string as a numerical integration solution of the wave equation [12, 13], and the approach has been developed further, for example, in [14] as a finite difference time domain (FDTD) simulation. The second-order finite difference scheme including propagation losses was formulated as a digital filter structure in [15], and its stability issues were discussed in [16]. This particular structure is the main focus of the finite difference discussions in the rest of this paper, and we will refer to it as the FDTD model structure.

DWG and FDTD approaches to discrete-time simulation of spatially distributed systems show a high degree of functional equivalence. As discussed in [5], in the one-dimensional band-limited case, ideal wave propagation can be exactly modeled by both methods. The basic difference is that the FDTD model structures process the signals as they are, whereas DWGs process their wave decomposition. There are other known differences between DWGs and FDTD model structures. One of them is the instabilities ("spurious" responses) found in FDTD model structures, but not in DWGs, in response to specific excitations. Another difference is the numeric behavior in finite-precision computation.
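The exact equivalence in the ideal 1D band-limited case is easy to verify numerically: a two-directional delay-line (DWG) model and the second-order finite difference recursion y(k, n+1) = y(k−1, n) + y(k+1, n) − y(k, n−1), started from consistent states, produce sample-by-sample identical outputs. A minimal sketch (Python/NumPy; the periodic domain and the random initial waves are our own illustration, not from the paper):

```python
import numpy as np

K, STEPS = 64, 200
rng = np.random.default_rng(0)
y_right = rng.standard_normal(K)   # right-going traveling wave
y_left = rng.standard_normal(K)    # left-going traveling wave

def dwg(n):
    """W-model: total signal after n delay-line shift updates
    (periodic domain, implemented with circular shifts)."""
    return np.roll(y_right, n) + np.roll(y_left, -n)

# K-model: leapfrog recursion on the physical variable itself.
y_prev, y_curr = dwg(0), dwg(1)
for n in range(1, STEPS):
    y_next = np.roll(y_curr, 1) + np.roll(y_curr, -1) - y_prev
    y_prev, y_curr = y_curr, y_next
    assert np.allclose(y_curr, dwg(n + 1))  # identical at every step
```

The agreement is exact (up to floating-point rounding) because, on a periodic band-limited grid, both schemes implement the same d'Alembert solution; the differences between the paradigms arise in excitation, boundaries, and finite-precision behavior, as noted above.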
Comparison of these two different paradigms has been developed further in [10, 17, 18]. In [17], the interesting and important possibility of building mixed models with submodels of DWG and FDTD types was introduced; it was generalized to elements with arbitrary wave impedances in [18]. The problem of functional comparison and compatibility analysis has remained, however, and is the topic of this paper.

The rest of the paper is organized as follows. Section 2 provides the background information and notation that will be used in the following sections. A summary of wave-based modeling and finite difference modeling is also included in this section. Section 3 provides the derivation of the FDTD model structures, including the source terms, scattering, and the continuity laws. Based on the wave equation in the acoustical domain, this section highlights the functional equivalence of DWGs and FDTD model structures. It also presents a way of building mixed models. The formal proofs of equivalence are provided in the appendix. Section 4 is devoted to real-time implementation of mixed models. Finally, Section 5 draws conclusions and indicates future directions.

2. BACKGROUND

Sound synthesis algorithms that simulate spatially distributed acoustic systems usually provide discrete-time solutions to a hyperbolic partial differential equation, that is, the wave equation. According to the domain of simulation, the variables correspond to different physical quantities. The physical variables may further be characterized by their mathematical nature. An across variable is defined here to describe a difference between two values of an irrotational potential function (a function that integrates or sums up to zero over closed trajectories), whereas a through variable is defined here to describe a solenoidal function (a quantity that integrates or sums up to zero over closed surfaces). For example, in the acoustical domain, the deviation from the steady-state pressure p(x, t) is an across variable and the volume velocity u(x, t) is a through variable, where x is the spatial vector variable and t is the temporal scalar variable. Similarly, in the mechanical domain, the across variable is the force and the through variable is the velocity. The ratio of the across and through variables yields the impedance Z. The admittance is the inverse of Z, that is, Y = 1/Z.

In a one-dimensional (1D) medium, the spatial vector variable reduces to a scalar variable x, so that in a homogeneous, lossless, unbounded, and source-free medium the wave equation is written

y_tt = c² y_xx,  (1)

where y is a physical variable, subscript tt refers to the second partial derivative in time t, xx to the second partial derivative in the spatial variable x, and c is the speed of the wavefront in the medium of interest. For example, in the mechanical domain (e.g., a vibrating string) we are primarily interested in transversal wave motion, for which c = √(T/µ), where T is the tension force and µ is the mass per unit length of the string [2]. The impedance is closely related to the tension T, mass density µ, and the propagation speed c, and is given by Z = √(Tµ) = T/c. In the acoustical domain, the admittance is also related to the acoustical propagation speed c. For instance, the admittance of a tube with a constant cross-section area A is given by

Y = A/(ρc),  (2)

where ρ is the gas density in the tube.

The two common forms of discretizing the wave equation for numerical simulation are the traveling wave solution and the finite difference formulation.

2.1. Wave-based modeling

The traveling wave formulation is based on the d'Alembert solution, the propagation of two waves in opposite directions, that is,

y(x, t) = y→(x − ct) + y←(x + ct).  (3)

Here, the arrows denote the right-going and the left-going components of the total waveform. Assuming that the signals are bandlimited to half of the sampling rate, we may sample the traveling waves without losing any information by selecting T as the sample interval and X as the position interval between samples so that T = X/c. Sampling is applied in a discrete time-space grid in which n and k are related to time and position, respectively. The discretized version of (3) becomes [5]

y(k, n) = y→(k − n) + y←(k + n).  (4)

It follows that the wave propagation can be computed by updating the state variables in two delay lines by

y→(k, n+1) = y→(k−1, n),  y←(k, n+1) = y←(k+1, n),  (5)

that is, by simply shifting the samples to the right and left, respectively. The shift is implemented with a pair of delay lines, and this kind of discrete-time modeling is called DWG modeling [5]. Since the physical variables are split into directional wave components, we will refer to such models as W-models. According to (3) or (4), a single physical variable (either through or across) is computed by summing the traveling waves, whereas the other one may be computed implicitly via the impedance.

If the medium is nonhomogeneous, then the admittance varies as a function of the spatial variable. In this case, the energy transfer between the wave components should be computed according to Kirchhoff-type continuity laws, ensuring that the total energy is preserved. These laws may be derived utilizing the irrotational and solenoidal nature of the across and through variables, respectively. In the DWG equivalent, a change in Y across a junction of waveguide sections causes scattering, and the scattering junctions of interconnected ports, with given admittances and wave variables, have to be formulated [5].
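As a minimal illustration of the update (5), the two-delay-line propagation can be sketched in a few lines of Python. All names here are illustrative; the interior shift is exactly (5), while zero-filling the line ends is a simplifying boundary assumption, not part of (5):

```python
def dwg_step(right, left):
    """One time step of the DWG update (5): shift the right-going wave
    one cell to the right and the left-going wave one cell to the left.
    What enters at the line ends (boundaries/terminations) is a modeling
    choice; here it is simply zero-filled."""
    new_right = [0.0] + right[:-1]   # y->(k, n+1) = y->(k-1, n)
    new_left = left[1:] + [0.0]      # y<-(k, n+1) = y<-(k+1, n)
    return new_right, new_left

# The physical variable is the sum of the traveling waves, as in (4):
right = [0.0] * 8
left = [0.0] * 8
right[3] = 1.0                        # right-going impulse
right, left = dwg_step(right, left)
y = [r + l for r, l in zip(right, left)]   # total waveform
```

After one step the right-going impulse has moved one cell to the right, and the total waveform y reflects it there.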
For instance, in a parallel junction of waveguides in the acoustical domain, the Kirchhoff constraints are

P1 = P2 = ··· = PN = PJ,
U1 + U2 + ··· + UN + Uext = 0,  (6)

where Pi and Ui are the total pressure and volume velocity of the ith branch, respectively¹, PJ is the common pressure of the coupled branches, and Uext is an external volume velocity input to the junction. Such a junction is illustrated in Figure 1. When the port pressures are represented by incoming wave components Pi+, the outgoing wave components by Pi−, the admittances attached to each port by Yi, and

Pi = Pi+ + Pi−,  Ui+ = Yi Pi+,  (7)

the junction pressure PJ can be obtained as

PJ = (1/Ytot)(Uext + 2 Σi=1..N Yi Pi+),  (8)

where Ytot = Σi=1..N Yi is the sum of all admittances connected to the junction. Outgoing pressure waves are obtained from (7) to yield Pi− = PJ − Pi+. The resulting junction, a W-node, is depicted in Figure 2. The delay lines or termination admittances (see the appendix) are connected to the W-ports of a W-node.

Figure 1: Parallel junction of admittances Yi with associated pressure waves indicated. A volume velocity input Uext is also attached.

A useful addition to DWG theory is to adopt wave digital filters (WDFs) [10, 19] as discrete-time simulators of lumped parameter elements. Being based on W-modeling, they are computationally compatible with the W-type DWGs [10, 18, 20].

2.2. Finite difference modeling

In the most commonly used way to discretize the wave equation by finite differences, the partial derivatives in (1) are approximated by centered differences. The centered difference approximation to the spatial partial derivative y_x is given by [11]

y_x ≈ [y(x + ∆x/2, t) − y(x − ∆x/2, t)] / ∆x,  (9)

where ∆x is the spatial sampling interval. A similar expression is obtained for the temporal partial derivative, if x is kept constant and t is replaced by t ± ∆t, where ∆t is the discrete-time sampling interval. Iterating the difference approximations, the second-order partial derivatives in (1) are approximated by

y_xx ≈ (y_{x+∆x,t} − 2y_{x,t} + y_{x−∆x,t}) / ∆x²,
y_tt ≈ (y_{x,t+∆t} − 2y_{x,t} + y_{x,t−∆t}) / ∆t²,  (10)

where the short-hand notation y_{x,t} is used instead of y(x, t). By selecting ∆t = ∆x/c and using the index notation k = x/∆x and n = t/∆t, (10) results in

y_{k,n+1} = y_{k−1,n} + y_{k+1,n} − y_{k,n−1}.  (11)

From (11) we can see that a new sample y_{k,n+1} at position k and time index n + 1 is computed as the sum of its neighboring position values minus the value at the position itself one sample period earlier. Since y_{k,n+1} is a physical variable, we will refer to models based on finite differences as K-models, with a reference to Kirchhoff-type physical variables.

3. FORMULATION OF THE FDTD MODEL STRUCTURE

The equivalence of the traveling wave and the finite difference solutions of the ideal wave equation (given in (5) and (11), respectively) has been shown, for instance, in [5]. Based on this functional equivalence, (11) has previously been expanded, without a formal derivation, to a scattering junction with arbitrary port impedances, where (8) is used as a template for the expansion [18]. The resulting FDTD model structure is illustrated in Figure 3 for a three-port junction. A comparison of the FDTD model structure in Figure 3 and the DWG scattering junction in Figure 2 reveals the functional similarities of the two methods. However, a formal, generalized, and unified derivation of the FDTD model structure without an explicit reference to the DWG method remains to be presented. This section presents such a derivation based on the equations of motion of the gas in a tube. Note that, because of the analogy between different physical domains, once the formulation is derived, it can be used in other domains as well. Therefore, the derivation below is not limited to the acoustical domain and the resulting structure can also be used in other domains.
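The homogeneous K-model recursion (11) translates directly into a short time loop. The following Python sketch is a minimal illustration (names are illustrative); only interior points are updated, and the line ends are held at zero as a simplifying boundary assumption:

```python
def fdtd_step(y_now, y_prev):
    """K-model update (11): y[k, n+1] = y[k-1, n] + y[k+1, n] - y[k, n-1].
    Interior points only; the end points are held at zero here, which is
    a simplifying boundary assumption, not part of (11)."""
    y_next = [0.0] * len(y_now)
    for k in range(1, len(y_now) - 1):
        y_next[k] = y_now[k - 1] + y_now[k + 1] - y_prev[k]
    return y_next, y_now   # becomes the new (current, previous) pair

# an impulse splits into left- and right-going parts after one step
y_now = [0.0] * 8
y_prev = [0.0] * 8
y_now[3] = 1.0
y_now, y_prev = fdtd_step(y_now, y_prev)
```

Note that, unlike the W-model sketch of (5), the physical variable itself is stored, and two time frames of it constitute the state.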
3.1. Source terms

In order to explain the excitation Uext and the associated filter H(z) = 1 − z⁻² in Figure 3, we consider a piece of tube of

¹Note that capital letters denote a transform variable. For instance, Pi is the z-transform of the signal pi(n).

Digital Waveguides versus Finite Difference Structures 981
Figure 2: (a) N-port scattering junction (three ports are shown) with port admittances Yi. Incoming and outgoing pressure waves are Pi+ and Pi−, respectively. W-port 1 is terminated by admittance Y1. (b) Abstract representation of the W-node in (a).
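The W-node computation of Figure 2, eqs. (7)-(8), amounts to a few arithmetic operations per sample. A hedged Python sketch follows (function and variable names are illustrative, not from the paper):

```python
def w_node_scatter(p_plus, Y, u_ext=0.0):
    """Parallel N-port W-node scattering, eqs. (7)-(8):
    PJ = (u_ext + 2 * sum(Y_i * P_i_plus)) / sum(Y_i),
    and the outgoing waves P_i_minus = PJ - P_i_plus."""
    y_tot = sum(Y)
    p_j = (u_ext + 2.0 * sum(y * p for y, p in zip(Y, p_plus))) / y_tot
    p_minus = [p_j - p for p in p_plus]
    return p_j, p_minus

# a wave hitting an admittance-matched two-port junction passes through
p_j, p_minus = w_node_scatter([1.0, 0.0], [0.3, 0.3])
```

With matched admittances, an incoming unit wave at port 1 produces the junction pressure 1 and leaves entirely through port 2, as expected for a non-scattering junction.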
Figure 3: (a) Digital filter structure for finite difference approximation of a three-port scattering node with port admittances Yi. Only the total pressure PJ (K-variable) is explicitly available. (b) Abstract representation of the K-node in (a).

constant cross-sectional area A that includes an ideal volume velocity source s(t). The pressure p and volume velocity u (the variables in the acoustical domain, as explained in the previous section) satisfy the following PDE set:

ρ ∂u/∂t + A ∂p/∂x = 0,  (A/ρc²) ∂p/∂t + ∂u/∂x = s,  (12)

where ρ is the gas density and c is the propagation speed. This set may be combined to yield a single PDE in p and the source term:

∂²p/∂t² − c² ∂²p/∂x² = (ρc²/A) ∂s/∂t.  (13)

Defining

s(t) = (1/2)[s(t − ∆t/2) + s(t + ∆t/2)] + O(∆t²),  (14)

using the index notation k = x/∆x and n = t/∆t, and applying centered differences (see Section 2.2) to (13) with ∆x/∆t = c yields the following difference equation:

pk(n+1) = pk+1(n) + pk−1(n) − pk(n−1) + (ρc∆x/2A)[sk(n+1) − sk(n−1)].  (15)

Note that ρc/A is the acoustic impedance that converts the volume velocity source s(t) to a pressure. Since the model output is the pressure at time step n + 1, it follows that the source is delayed two samples, subtracted from its current value, and scaled, corresponding to the filter 1 − z⁻² for Uext in Figure 3.

3.2. Admittance discontinuity and scattering

Now consider an unbounded, source-free tube with a cross-section A(x) that is a smooth real function of the spatial variable x. In this case, the governing PDEs can be combined into a single PDE in the pressure alone [10],

∂²p/∂t² = (c²/A(x)) ∂/∂x [A(x) ∂p/∂x],  (16)

which is the Webster horn equation. Discretizing this equation by centered differences yields the following difference equation:

[pk(n+1) − 2pk(n) + pk(n−1)] / ∆t² = (c²/(Ak ∆x²)) {Ak+1/2 [pk+1(n) − pk(n)] − Ak−1/2 [pk(n) − pk−1(n)]},  (17)

where Ak = A(k∆x). By selecting ∆x = c∆t and using the approximation

Ak = (1/2)(Ak−1/2 + Ak+1/2) + O(∆x²)  (18)

twice, (17) becomes

pk(n+1) + pk(n−1) = (2/(Ak−1/2 + Ak+1/2)) [Ak−1/2 pk−1(n) + Ak+1/2 pk+1(n)].  (19)

Finally, by defining Yk−1 = Ak−1/2/(ρc) and Yk+1 = Ak+1/2/(ρc), we obtain

pk(n+1) + pk(n−1) = (2/Ytot) [Yk−1 pk−1(n) + Yk+1 pk+1(n)],  (20)

where the term Ytot = Yk−1 + Yk+1 may be interpreted as the sum of all admittances connected to the kth cell. This recursion is implemented with the filter structure illustrated in Figure 4. The output of the structure is the junction pressure pJ,k(n). It is worth noting that (20) is functionally the same as the DWG scattering representation given in (8), if the admittances are real. The more general case of complex admittances is considered in the appendix. Whereas the DWG formulation can easily be extended to N-port junctions, this extension is not necessarily possible for a K-model, where the continuity laws are generally not satisfied. In the next subsection, we investigate the continuity laws within the FDTD model structure.

3.3. Continuity laws

We denote the pressure across the impedance 1/Σ Yi as pa(n), and the volume velocity through the same impedance as ut(n), with reference to Figure 4. With these notations, Ohm's law in the acoustical domain yields

pa(n) = ut(n)/Ytot,  (21)

whereas the Kirchhoff continuity laws can be written as

pa(n) = pk(n+1) + pk(n−1),  (22)
ut(n) = 2Yk−1 pk−1(n) + 2Yk+1 pk+1(n).  (23)

Inserting (21) into (23) eliminates ut(n), and the result may be combined with (22) to give the following equation for the combined continuity laws:

pk(n+1) + pk(n−1) = (2/Ytot) [Yk−1 pk−1(n) + Yk+1 pk+1(n)].  (24)

This relation is exactly the recursion of the FDTD model structure given in (20), but obtained here solely from the continuity laws. We thus conclude that the continuity laws are automatically satisfied by the FDTD model structure of Figure 4.

It is worth noting that more ports may be added to the structure without violating the continuity laws for any number of linear, time-invariant (LTI) admittances, as long as Ytot = Σ Yi. For N ports connected to the ith cell, (23) becomes

Ut = 2z⁻¹ Σi=1..N Yi PJ,i  (25)
Figure 4: Digital filter structure for finite difference approximation of an unbounded, source-free tube with a spatially varying cross section.
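The recursion (20) that the structure of Figure 4 computes can be sketched in Python for a chain of cells with position-dependent admittances. Names are illustrative; pinning the end cells to zero is a simplifying boundary assumption (proper terminations are treated in the appendix):

```python
def fdtd_scattering_step(p_now, p_prev, Y):
    """One step of recursion (20) along a chain of cells:
    p_k(n+1) = (2/(Y[k-1]+Y[k+1])) * (Y[k-1]*p_{k-1}(n)
               + Y[k+1]*p_{k+1}(n)) - p_k(n-1).
    Y[k] is the admittance associated with cell k; the end cells are
    held at zero, a simplifying boundary assumption."""
    p_next = [0.0] * len(p_now)
    for k in range(1, len(p_now) - 1):
        y_tot = Y[k - 1] + Y[k + 1]
        p_next[k] = (2.0 / y_tot) * (Y[k - 1] * p_now[k - 1]
                                     + Y[k + 1] * p_now[k + 1]) - p_prev[k]
    return p_next, p_now

# with equal admittances the recursion reduces to the homogeneous (11)
p_now, p_prev = [0.0] * 8, [0.0] * 8
p_now[3] = 1.0
p_now, p_prev = fdtd_scattering_step(p_now, p_prev, [1.0] * 8)
```

With all admittances equal, the coefficient 2/Ytot cancels the sums and the step is identical to the homogeneous update (11), as the derivation above implies.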
Figure 5: FDTD node (left) and a DWG node (right) forming a part of a hybrid waveguide. There is a KW-converter between the K- and W-models. Yi are the wave admittances of the W-lines, K-pipes, and the KW-converter between the junction nodes. P1 and P2 are the junction pressures of the K-node and the W-node, respectively.
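The converter element shown in Figure 5 can be sketched directly from the wave relations (27) and (28) derived in the text below. The following Python sketch (class and attribute names are illustrative, not from the paper) keeps the unit delays as explicit state variables:

```python
class KWConverter:
    """Sketch of the KW-converter of Figure 5, built from (27) and (28):
    it receives the K-node junction pressure P1 and the W-node outgoing
    wave P2^-, and returns the wave P2^+ entering the W-node together
    with the delayed junction pressure z^-1 P2 needed by the K-node."""

    def __init__(self):
        self.z1_p1 = 0.0           # z^-1 applied to P1
        self.z2_p2m = [0.0, 0.0]   # z^-2 applied to P2^-
        self.z1_p2 = 0.0           # z^-1 applied to P2 = P2^+ + P2^-

    def step(self, p1, p2_minus):
        # (27): P2^+ = z^-1 P1 - z^-2 P2^-
        p2_plus = self.z1_p1 - self.z2_p2m[1]
        # (28): z^-1 P2 = z^-1 (P2^+ + P2^-)
        delayed_p2 = self.z1_p2
        # update the delay elements for the next sample
        self.z1_p1 = p1
        self.z2_p2m = [p2_minus, self.z2_p2m[0]]
        self.z1_p2 = p2_plus + p2_minus
        return p2_plus, delayed_p2
```

Driving the converter with a unit impulse in P1 (and P2− held at zero), the wave P2+ appears one sample later and the delayed junction pressure z⁻¹P2 one sample after that, as (27)-(28) predict.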
and the recursion in (24) can be expressed in the z-domain as

PJ,k + z⁻² PJ,k = (2/Σ Yi) z⁻¹ Σi=1..N Yi PJ,i.  (26)

The superposition of the excitation block in (14) and the N-port formulation above completes the formulation of the FDTD model structure. In particular, by setting N = 3 the digital filter structure in Figure 3 is obtained.

3.4. Construction of mixed models

An essential difference between the DWGs of Figure 2 and the FDTD model structures of Figure 3 is that while DWG junctions are connected through two-directional delay lines (W-lines), FDTD nodes have two unit delays of internal memory and delay-free K-pipes connecting the ports between nodes. These junction nodes and ports are thus not directly compatible. The next question is the possibility of interfacing these submodels. The interconnection of a lossy FDTD model structure and a similar DWG has been tackled in [17]. A proper interconnection element (converter) has been proposed for the resulting hybrid model in this special case. A generalization has been proposed in [18], which allows any hybrid model of K-elements (FDTD) and W-elements having arbitrary wave admittances/impedances at their ports (see also [21]).

Here, we derive how a hybrid model (shown in Figure 5) can be constructed in a 1D waveguide between a K-node N1 (left) and a W-node N2 (right), aligned with the spatial grids k = 1 and 2, respectively. The derivation is based on the fact that the junction pressures are available in both types of nodes, but in the DWG case not at the W-ports.

If N1 and N2 were both W-nodes (see Figure 8 in the appendix), the traveling wave entering the node N2 could be calculated as

P2+ = z⁻¹ P1− = z⁻¹ (P1 − P1+) = z⁻¹ P1 − z⁻² P2−.  (27)

Note that P1 is available in the K-node N1 in Figure 5. Conversely, if N1 and N2 were both K-nodes, the junction pressure z⁻¹P2 would be needed for the calculation of P1 (see Figure 10 in the appendix). Although P2 is implicitly available in N2, it can also be obtained by summing up the wave
Figure 6: Part of a 2D waveguide mesh composed of (a) K-type FDTD elements (left bottom): K-pipes (kp) and K-nodes (k), (b) W-type DWG elements (top and right): delay-controllable W-lines (wl), W-nodes (w), and terminating admittances (yt), and (c) converter elements (kw) to connect K- and W-type elements into a mixed model.

components within the converter

z⁻¹ P2 = z⁻¹ (P2+ + P2−).  (28)

Equation (27) may be inserted in (28) to yield the following transfer matrix of the 2-port KW-converter element:

[P2+; z⁻¹P2] = [z⁻¹  −z⁻²; z⁻²  z⁻¹(1 − z⁻²)] [P1; P2−].  (29)

The KW-converter in Figure 5 essentially performs the calculations given in (29) and interconnects the K-type port of an FDTD node and the W-type port of a DWG node. The signal behavior in a mixed modeling structure is further investigated in the appendix.

4. IMPLEMENTATION OF MIXED MODELS

The functional equivalence and the mixed modeling paradigm of DWGs and FDTDs presented above allow for flexible building of physical models from K- and W-type substructures. In this way, it is possible to exploit the advantages of each type. In this section, we explore a simple example of a digital waveguide model that shows how mixed models can be built. Before that, a short discussion of the pros and cons of the different paradigms in practical realizations is presented.

4.1. K-modeling versus W-modeling, pros and cons

An advantage of W-modeling is its numerical robustness. By proper formulation, stability is guaranteed also with fixed-point arithmetic [5, 19]. Another useful property is the relatively straightforward way of using fractional delays [22] when building digital waveguides, which makes, for example, tuning and run-time variation of musical instrument models convenient. In general, it seems that W-modeling is the right choice in most 1D cases.

The advantages of K-modeling by FDTD waveguides are found when realizing mesh-like structures, such as 2D and 3D meshes [7, 8]. In such cases, the number of unit delays (memory positions) per node is two for any dimensionality, while for a DWG mesh it is two times the dimensionality of the mesh. A disadvantage of FDTDs is their inherent lack of numerical robustness and tendency toward instability for signal frequencies near DC and the Nyquist frequency. Furthermore, FDTD junction nodes cannot be made memoryless, which may be a limitation in nonlinear and parametrically varying models.

4.2. 2D waveguide mesh case

Figure 6 illustrates a part of a 2D mixed model structure that is based on a rectangular FDTD waveguide mesh for efficient and memory-saving computation, with DWG elements at the boundaries. Such a model could be, for example, a membrane of a drum or, in a 3D case, a room enclosed by walls. When there is a need to attach W-type termination admittances to the model or to vary the propagation delays within the system, a change from K-elements to W-elements through converters is a useful property. Furthermore, variable-length delays can be used, for example, for passive nonlinearities at the terminations to simulate gongs and other instruments where nonlinear mode coupling takes place [23]. The same principle can be used to simulate shock waves in brass instrument bores [24]. In such cases, the delay lengths are made dependent on the signal value passing through the delay elements.

In Figure 6, the elements denoted by kp are K-type pipes between K-type nodes. Elements kw are K-to-W converters and elements wl are W-lines, where the arrows indicate that they are controllable fractional delays. Elements yt are terminating admittances. In a general case, scattering can be controlled by varying the admittances, although the computational efficiency is improved if the admittances are made equal. On a modern PC, a 2D mesh of a few hundred elements can run in real time at full audio rate. By decimated computation, bigger models can be computed if a lower cutoff frequency is permitted, allowing large physical dimensions of the mesh.

4.3. Mixed modeling in BlockCompiler

The development of the K- and W-models above has led to a systematic formulation of computational elements for both paradigms and for mixed modeling. The W-lines and K-pipes, as well as the related junction nodes, are useful abstractions for a formal specification of model implementation. We have developed a software tool for physical modeling called the BlockCompiler [20] that is designed in particular for flexible modeling and efficient real-time computation of the models.

The BlockCompiler contains two levels: (a) model creation and (b) model implementation. The model creation level is written in the Common Lisp programming language for maximal flexibility in symbolic object-based manipulation of model structures. A set of DSP-oriented and physics-oriented computational blocks is available. New block classes can be created either as macro classes composed of predefined elementary blocks or by writing new elementary blocks. The blocks are connected through ports: inputs and outputs for DSP blocks, and K- or W-type ports for physical blocks. A fully interconnected model is called a patch.

The model implementation level is a code generator that does the scheduling of the blocks, writes C source code into a file, compiles it on the fly, and allows for streaming sound in real time or computation by stepping in a sample-by-sample mode. The C code can also be exported to other platforms, such as the Mustajuuri audio platform [25] and pd [26]. Sound examples of mixed models can be found at http://www.acoustics.hut.fi/demos/waveguide-modeling/.

5. SUMMARY AND CONCLUSIONS

This paper has presented a formulation of a specific FDTD model structure and showed its functional equivalence to the DWGs. Furthermore, an example of mixed models consisting of FDTD and DWG blocks and converter elements is reported. The formulation allows for high flexibility in building 1D or higher-dimensional physical models from interconnected blocks.

The DWG method is used as the primary example of the wave-based methods in this paper. Naturally, the KW-converter formulation is applicable to any W-method, such as the wave digital filters (WDFs) [19]. In the future, we plan to extend our examples to include WDF excitation blocks. Other important future directions are the analysis of the dynamic behavior of parametrically varying hybrid models, as well as benchmark tests for the computational costs of the proposed structures.

Matlab scripts and demos related to DWGs and FDTDs can be found at http://www.acoustics.hut.fi/demos/waveguide-modeling/.

APPENDIX

A. PROOFS OF EQUIVALENCE

The proofs of functional equivalence between the DWG and FDTD formulations used in this article are given below. An approach useful for this can be based on the Thevenin and Norton theorems [27].

A.1. Termination in a DWG network

Passive termination of a DWG junction port by a given admittance Y is equivalent to attaching a delay line of infinite length and wave admittance Y. In the DWG case, this means an infinitely long sequence of admittance-matched unit delay lines. Since there is no back-scattering in finite time, we can use the left-side port termination of Figure 2, with zero volume velocity at the input terminal. Thus, the admittance filter Y1 is not needed in the computation; it only has to be included in making the filter 1/Σ Yi.

A.2. Termination in an FDTD network

Deriving the passive port termination for an FDTD junction is not as obvious as for a DWG junction. We can again apply an infinitely long sequence of admittance-matched FDTD sections, as depicted in Figure 7 on the left-hand side. With the notations given, and with z-transforms of the variables and admittances, we can denote

P0 = (2Y1/Σ Yi) P−1 z⁻¹ + (2/Σ Yi) Σi=2..M Yi Pi z⁻¹ − P0 z⁻²,  (A.1a)
P−1 = P0 z⁻¹ + P−2 z⁻¹ − P−1 z⁻²,  (A.1b)
Pk = Pk+1 z⁻¹ + Pk−1 z⁻¹ − Pk z⁻², for k < −1,  (A.1c)

where Pi, i = 1, ..., M, are the pressures of all M neighboring junctions linked through admittances Yi to junction 0, and Pk, where k = 0, −1, −2, ..., are the pressures in the junctions between the admittance-matched elements chained as the termination of junction 0. By applying (A.1c) to (A.1b) iteratively for
Figure 7: FDTD structure terminated by admittance-matched chain of FDTD elements on the left-hand side.
Figure 8: Structure for derivation of signal behavior in a DWG network.

k = 2, ..., N we get

P−1 = P0 z⁻¹ + P−N−1 z⁻N − P−N z⁻N⁻¹.  (A.2)

When N → ∞, the last two terms cease to have an effect on P−1 in any finite time span and they can thus be discarded. When the result P−1 = P0 z⁻¹ is used in (A.1a), we get

P0 = (2Y1/Σ Yi)(P0 z⁻¹) z⁻¹ + (2/Σ Yi) Σi=2..M Yi Pi z⁻¹ − P0 z⁻²,  (A.3)

where the first term on the right-hand side can be interpreted as a way to implement the termination as a feedback through a unit delay, as illustrated in Figure 3 for the left-hand port of the FDTD junction.

Figure 9: FDTD structure for derivation of the volume velocity source (Uext) to junction pressure (PJ) transfer function.

A.3. Signal behavior in a DWG network

Figure 8 illustrates a case where an arbitrarily large interconnected DWG network is reduced so that only two scattering junctions, connected through a unit delay line of wave admittance Y2, are shown explicitly. A Norton equivalent source Uext feeds junction node 1, and an equivalent termination admittance is Y1. Junction node 2 is terminated by a Norton equivalent admittance Y3. Now, we derive the signal propagation from Uext to the junction pressure P1 and the transmission ratio between the pressures P2 and P1. If these "transfer functions" are equal for the DWG, the FDTD, and the mixed case with the KW-converter, the models are functionally equivalent between these cases for any topologies and parametric values. This is due to the superposition principle and the Norton theorem.
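Before working through the algebra, the claimed equivalence can be checked numerically: the two-node FDTD recursion (of the form (A.10) derived below) and the DWG transfer function (A.7) must produce identical responses for the same drive. The following Python sanity check is not from the paper; the admittance values are arbitrary:

```python
def p2_fdtd(p1, Y2, Y3):
    """Junction pressure P2 of the two-node FDTD structure, computed
    sample by sample from the recursion of form (A.10),
    driven by the sequence p1."""
    a = 2.0 * Y2 / (Y2 + Y3)
    b = 2.0 * Y3 / (Y2 + Y3)
    p2 = [0.0] * len(p1)
    for n in range(len(p1)):
        prev1 = p1[n - 1] if n >= 1 else 0.0
        prev2 = p2[n - 2] if n >= 2 else 0.0
        p2[n] = -prev2 + b * prev2 + a * prev1
    return p2

def p2_dwg(p1, Y2, Y3):
    """The same quantity from the closed-form DWG transfer function (A.7),
    rewritten as (Y2+Y3) p2(n) + (Y2-Y3) p2(n-2) = 2 Y2 p1(n-1)."""
    p2 = [0.0] * len(p1)
    for n in range(len(p1)):
        prev1 = p1[n - 1] if n >= 1 else 0.0
        prev2 = p2[n - 2] if n >= 2 else 0.0
        p2[n] = (2.0 * Y2 * prev1 - (Y2 - Y3) * prev2) / (Y2 + Y3)
    return p2

# impulse response comparison for an arbitrary (mismatched) admittance pair
p1 = [1.0] + [0.0] * 15
ia = p2_fdtd(p1, Y2=1.0, Y3=0.5)
ib = p2_dwg(p1, Y2=1.0, Y3=0.5)
```

The two impulse responses agree to rounding error, which is exactly the functional equivalence that the derivations below establish symbolically.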
Figure 10: FDTD structure for derivation of signal relation between two junction pressures.
Figure 11: Mixed modeling structure for derivation of DWG to FDTD pressure relation.
From Figure 8, we can write directly for the propagation of the equivalent source Uext to the junction pressure P1

P1 = Uext/(Y1 + Y2).  (A.4)

The signal transmission ratio between P2 and P1 can be derived from the following set of equations (A.5a), (A.5b), and (A.5c):

P2 = (2Y2/(Y2 + Y3)) P1− z⁻¹,  (A.5a)
P1− = P1 − P2− z⁻¹,  (A.5b)
P2− = P2 − P1− z⁻¹.  (A.5c)

By eliminating the wave variables P1− and P2−,

P1− = (P1 − P2 z⁻¹) / (1 − z⁻²),
P2− = (P2 − P1 z⁻¹) / (1 − z⁻²),  (A.6)
P2 = (2Y2/(Y2 + Y3)) (z⁻¹/(1 − z⁻²)) (P1 − P2 z⁻¹),

and by solving for P2/P1, we get

P2/P1 = 2Y2 z⁻¹ / [Y2 + Y3 + (Y2 − Y3) z⁻²].  (A.7)

In the special case of an admittance match Y2 = Y3, we get P2/P1 = z⁻¹. Forms (A.4) and (A.7) are now the reference to prove equivalence with the FDTD and mixed modeling cases.

A.4. Signal behavior in an FDTD network

Using the notations in Figure 9, which shows a Norton equivalent for an FDTD network, we can write

PJ = (Uext/(Y1 + Y2))(1 − z⁻²) − PJ z⁻² + (2Y1/(Y1 + Y2)) PJ z⁻² + (2Y2/(Y1 + Y2)) PJ z⁻²,  (A.8)

which after simplification yields

PJ = Uext/(Y1 + Y2),  (A.9)

which is equivalent to the DWG form (A.4). Notice that the form (1 − z⁻²) in feeding Uext to the node has zeros on the unit circle at the angles nπ (n an integer), compensating poles inherent in the FDTD backbone structure. This degrades the numerical robustness of the structure around these frequencies.

For the structure of two FDTD nodes in Figure 10, we can write the equation

P2 = −P2 z⁻² + (2Y3/(Y2 + Y3)) P2 z⁻² + (2Y2/(Y2 + Y3)) P1 z⁻¹,  (A.10)

which simplifies to

P2/P1 = 2Y2 z⁻¹ / [Y2 + Y3 + (Y2 − Y3) z⁻²],  (A.11)

being equivalent to the DWG form (A.7). This completes the proof of the equivalence of the DWG and FDTD structures.

A.5. Signal behavior in a mixed modeling structure

To prove the equivalence of the signal behavior also in the mixed modeling structure of Figure 5 with a KW-adaptor, we have to analyze the junction signal relations in both directions. We first prove the equivalence in the FDTD-to-DWG direction. According to Figure 5, we can write

P2 = (2Y2/(Y2 + Y3)) P1 z⁻¹ − (2Y2/(Y2 + Y3)) P2− z⁻²,
P2− = P2 − P1 z⁻¹ + P2− z⁻².  (A.12)

Eliminating P2− and solving for P2/P1 yields again the form (A.7), proving the equivalence.

According to Figure 11, we can analyze the signal relationship in the DWG-to-FDTD direction by writing

P2 = (2Y3/(Y2 + Y3)) P2 z⁻² − P2 z⁻² + (2Y2/(Y2 + Y3)) (P1− − P1− z⁻² + P2 z⁻¹) z⁻¹,
P1− = P1 − P2 z⁻¹ + P1− z⁻².  (A.13)

By eliminating P1− and solving for P2/P1, we again get the form (A.7). This concludes the proof of the equivalence of the mixed modeling case to the corresponding DWG and thus also to the FDTD structures.

ACKNOWLEDGMENTS

This work is part of the Algorithms for the Modelling of Acoustic Interactions (ALMA) project (IST-2001-33059) and has been supported by the Academy of Finland as a part of the project "Technology for Audio and Speech Processing" (SA 53537).

REFERENCES

[1] J. L. Kelly and C. C. Lochbaum, "Speech synthesis," in Proc. 4th International Congress on Acoustics, pp. 1–4, Copenhagen, Denmark, September 1962.
[2] N. H. Fletcher and T. D. Rossing, The Physics of Musical Instruments, Springer-Verlag, New York, NY, USA, 2nd edition, 1998.
[3] J. D. Markel and A. H. Gray, Linear Prediction of Speech, Springer-Verlag, New York, NY, USA, 1976.
[4] J. O. Smith, "Physical modeling using digital waveguides," Computer Music Journal, vol. 16, no. 4, pp. 74–91, 1992.
[5] J. O. Smith, "Principles of digital waveguide models of musical instruments," in Applications of Digital Signal Processing to Audio and Acoustics, M. Kahrs and K. Brandenburg, Eds., pp. 417–466, Kluwer Academic Publishers, Boston, Mass, USA, 1998.
[6] M. Karjalainen, V. Välimäki, and T. Tolonen, "Plucked-string models: From the Karplus-Strong algorithm to digital waveguides and beyond," Computer Music Journal, vol. 22, no. 3, pp. 17–32, 1998.
[7] S. A. Van Duyne and J. O. Smith, "Physical modeling with the 2-D digital waveguide mesh," in Proc. International Computer Music Conference, pp. 40–47, Tokyo, Japan, September 1993.
[8] L. Savioja, T. J. Rinne, and T. Takala, "Simulation of room acoustics with a 3-D finite difference mesh," in Proc. International Computer Music Conference, pp. 463–466, Aarhus, Denmark, September 1994.
[9] L. Savioja, Modeling Techniques for Virtual Acoustics, Ph.D. thesis, Helsinki University of Technology, Espoo, Finland, 1999.
[10] S. D. Bilbao, Wave and Scattering Methods for the Numerical Integration of Partial Differential Equations, Ph.D. thesis, Stanford University, Stanford, Calif, USA, May 2001.
[11] J. C. Strikwerda, Finite Difference Schemes and Partial Differential Equations, Wadsworth and Brooks/Cole, Pacific Grove, Calif, USA, 1989.
[12] L. Hiller and P. Ruiz, "Synthesizing musical sounds by solving the wave equation for vibrating objects: Part 1," Journal of the Audio Engineering Society, vol. 19, no. 6, pp. 462–470, 1971.
[13] L. Hiller and P. Ruiz, "Synthesizing musical sounds by solving the wave equation for vibrating objects: Part 2," Journal of the Audio Engineering Society, vol. 19, no. 7, pp. 542–551, 1971.
[14] A. Chaigne, "On the use of finite differences for musical synthesis. Application to plucked stringed instruments," Journal d'Acoustique, vol. 5, no. 2, pp. 181–211, 1992.
[15] M. Karjalainen, "1-D digital waveguide modeling for improved sound synthesis," in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 2, pp. 1869–1872, Orlando, Fla, USA, May 2002.
[16] C. Erkut and M. Karjalainen, "Virtual strings based on a 1-D FDTD waveguide model: Stability, losses, and traveling waves," in Proc. Audio Engineering Society 22nd International Conference on Virtual, Synthetic and Entertainment Audio, pp. 317–323, Espoo, Finland, June 2002.
[17] C. Erkut and M. Karjalainen, "Finite difference method vs. digital waveguide method in string instrument modeling and synthesis," in Proc. International Symposium on Musical Acoustics, Mexico City, Mexico, December 2002.
[18] M. Karjalainen, C. Erkut, and L. Savioja, "Compilation of unified physical models for efficient sound synthesis," in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 5, pp. 433–436, Hong Kong, China, April 2003.
[19] A. Fettweis, "Wave digital filters: Theory and practice," Proc. IEEE, vol. 74, no. 2, pp. 270–327, 1986.
[20] M. Karjalainen, "BlockCompiler: Efficient simulation of acoustic and audio systems," in Proc. 114th Audio Engineering Society Convention, Amsterdam, Netherlands, March 2003, preprint 5756.
[21] M. Karjalainen, "Time-domain physical modeling and real-time synthesis using mixed modeling paradigms," in Proc. Stockholm Music Acoustics Conference, vol. 1, pp. 393–396, Stockholm, Sweden, August 2003.
[22] T. I. Laakso, V. Välimäki, M. Karjalainen, and U. K. Laine, "Splitting the unit delay—tools for fractional delay filter design," IEEE Signal Processing Magazine, vol. 13, no. 1, pp. 30–60, 1996.
[23] J. R. Pierce and S. A. Van Duyne, "A passive nonlinear digital filter design which facilitates physics-based sound synthesis of highly nonlinear musical instruments," Journal of the Acoustical Society of America, vol. 101, no. 2, pp. 1120–1126, 1997.
[24] R. Msallam, S. Dequidt, S. Tassart, and R. Caussé, "Physical model of the trombone including nonlinear propagation effects," in Proc. International Symposium on Musical Acoustics, vol. 2, pp. 419–424, Edinburgh, Scotland, UK, August 1997.
[25] T. Ilmonen, "Mustajuuri—an application and toolkit for interactive audio processing," in Proc. International Conference on Auditory Display, pp. 284–285, Espoo, Finland, July 2001.
[26] M. Puckette, "Pure data," in Proc. International Computer Music Conference, pp. 224–227, Thessaloniki, Greece, September 1997.
[27] J. E. Brittain, "Thevenin's theorem," IEEE Spectrum, vol. 27, no. 3, p. 42, 1990.
Matti Karjalainen was born in Hankasalmi, Finland, in 1946. He received the M.S. and Dr.Tech. degrees in electrical engineering from the Tampere University of Technology in 1970 and 1978, respectively. Since 1980, he has been a Professor of acoustics and audio signal processing at the Helsinki University of Technology, Faculty of Electrical Engineering. In audio technology, his interests include digital signal processing (DSP) for sound reproduction, perceptually based signal processing, and music DSP and sound synthesis. In addition to audio DSP, his research activities cover speech synthesis, analysis, and recognition; perceptual auditory modeling and spatial hearing; DSP hardware, software, and programming environments; and various branches of acoustics, including musical acoustics and the modeling of musical instruments. He has written more than 300 scientific and engineering articles and has contributed to organizing several conferences and workshops. Professor Karjalainen is a Fellow of the Audio Engineering Society (AES) and a Member of the Institute of Electrical and Electronics Engineers (IEEE), the Acoustical Society of America (ASA), the European Acoustics Association (EAA), the International Computer Music Association (ICMA), the European Speech Communication Association (ESCA), and several Finnish scientific and engineering societies.
Cumhur Erkut was born in Istanbul, Turkey, in 1969. He received his B.S. and M.S. degrees in electronics and communication engineering from the Yildiz Technical University, Istanbul, Turkey, in 1994 and 1997, respectively, and the Dr.Tech. degree in electrical engineering from the Helsinki University of Technology (HUT), Espoo, Finland, in 2002. Between 1998 and 2002, he worked as a researcher at the Laboratory of Acoustics and Audio Signal Processing of HUT. He is currently a postdoctoral researcher in the same institution, where he contributes to the EU-funded research project “Algorithms for the Modelling of Acoustic Interactions” (ALMA, IST-2001-33059). His primary research interests are model-based sound synthesis and musical acoustics.

EURASIP Journal on Applied Signal Processing 2004:7, 990–1000
© 2004 Hindawi Publishing Corporation
A Digital Synthesis Model of Double-Reed Wind Instruments
Ph. Guillemain
Laboratoire de Mécanique et d’Acoustique, Centre National de la Recherche Scientifique, 31 chemin Joseph-Aiguier, 13402 Marseille cedex 20, France
Email: [email protected]
Received 30 June 2003; Revised 29 November 2003
We present a real-time synthesis model for double-reed wind instruments based on a nonlinear physical model. One specific feature of double-reed instruments, namely the presence of a confined air jet in the embouchure, for which a physical model has recently been proposed, is included in the synthesis model. The synthesis procedure uses the physical variables through a digital scheme expressing the impedance relationship between pressure and flow in the time domain. Comparisons are made between the behavior of the model with and without the confined air jet, both for a simple cylindrical bore and for a more realistic bore whose geometry approximates that of an oboe.

Keywords and phrases: double-reed, synthesis, impedance.
1. INTRODUCTION

The simulation of woodwind instrument sounds has been investigated for many years since the pioneering studies by Schumacher [1] on the clarinet, which did not focus on digital sound synthesis. Real-time-oriented techniques, such as the famous digital waveguide method (see, e.g., Smith [2] and Välimäki [3]) and wave digital models [4], have been introduced in order to obtain efficient digital descriptions of resonators in terms of incoming and outgoing waves, and have been used to simulate various wind instruments.

The resonator of a clarinet can be considered approximately cylindrical as a first approximation, and its embouchure is large enough to be compatible with simple airflow models. In double-reed instruments, such as the oboe, the resonator is not cylindrical but conical, and the size of the air jet is comparable to that of the embouchure. In this case, the dissipation of the air jet is no longer free, and the jet remains confined in the embouchure, giving rise to additional aerodynamic losses.

Here, we describe a real-time digital synthesis model for double-reed instruments based, on one hand, on a recent study by Vergez et al. [5], in which the formation of the confined air jet in the embouchure is taken into account, and, on the other hand, on an extension of the method presented in [6] for synthesizing the clarinet. This method avoids the need for the incoming and outgoing wave decompositions, since it deals only with the relationship between the impedance variables, which makes it easy to transpose the physical model to a synthesis model.

The physical model is first summarized in Section 2. In order to obtain the synthesis model, a suitable form of the flow model is then proposed, a dimensionless version is written, and the similarities with single-reed models (see, e.g., [7]) are pointed out. The resonator model is obtained by associating several elementary impedances, and is described in terms of the acoustic pressure and flow. Section 3 presents the digital synthesis model, which first requires discrete-time equivalents of the reed displacement and the impedance relations. The explicit scheme solving the nonlinear model, which is similar to that proposed in [6], is then briefly summarized. In Section 4, the synthesis model is used to investigate the effects of the changes in the nonlinear characteristics induced by the confined air jet.

2. PHYSICAL MODEL

The main physical components of the nonlinear synthesis model are as follows.

(i) The linear oscillator modeling the first mode of reed vibration.
(ii) The nonlinear characteristics relating the flow to the pressure and to the reed displacement at the mouthpiece.
(iii) The impedance equation linking pressure and flow.

Figure 1 shows a highly simplified embouchure model for an oboe and the corresponding physical variables described in Sections 2.1 and 2.2.
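The three components listed above can be viewed as a per-sample feedback loop. The following Python skeleton is purely an organizational sketch of that structure; the class and method names are my own assumptions, not the paper's, and the wiring in `synthesis_step` is a simplification (in the paper the reed is driven by pm − pj(t), not directly by pm − pr(t)).

```python
# Organizational sketch of the three coupled components of the model.
# Interfaces and names are illustrative assumptions; the paper derives
# the actual relations in Sections 2-3.

class ReedOscillator:
    """Single-mode linear oscillator for the reed displacement y(t), eq. (1)."""
    def step(self, pressure_difference: float) -> float:
        raise NotImplementedError  # discretized in Section 3 of the paper

class NonlinearCharacteristics:
    """Relates the flow to the pressure and reed displacement, eqs. (2)-(6)."""
    def flow(self, y: float, pm: float, pr: float) -> float:
        raise NotImplementedError

class Resonator:
    """Impedance relation linking pressure and flow at the bore entrance."""
    def pressure(self, q: float) -> float:
        raise NotImplementedError

def synthesis_step(reed: ReedOscillator,
                   nonlin: NonlinearCharacteristics,
                   resonator: Resonator,
                   pm: float, pr_prev: float) -> float:
    """One sample of the coupled loop (illustrative wiring only).

    Note: in the physical model the reed is driven by pm - pj(t); using
    pm - pr_prev here is a deliberate simplification of the sketch.
    """
    y = reed.step(pm - pr_prev)        # (i) reed displacement
    q = nonlin.flow(y, pm, pr_prev)    # (ii) nonlinear flow characteristic
    pr = resonator.pressure(q)         # (iii) impedance relation
    return pr
```

Any concrete implementation would fill in the three `step`/`flow`/`pressure` methods with the discrete-time relations derived in Section 3.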
Figure 1: Embouchure model and physical variables (the figure labels the half-openings y/2, the mouth pressure pm, the rest opening H, the jet variables pj and vj, the resonator variables pr and q, and the reeds, backbore, and main bore).

2.1. Reed model

Although this paper focuses on the simulation of double-reed instruments, oboe experiments have shown that the displacements of the two reeds are symmetrical [5, 8]. In this case, a classical single-mode model seems to suffice to describe the variations in the reed opening. The opening is based on the relative displacement y(t) of the two reeds when a difference in acoustic pressure occurs between the mouth pressure pm and the acoustic pressure pj(t) of the air jet formed in the reed channel. If we denote the resonance frequency, damping coefficient, and mass of the reeds by ωr, qr, and µr, respectively, the relative displacement satisfies the equation

d²y(t)/dt² + ωr qr dy(t)/dt + ωr² y(t) = −(pm − pj(t))/µr. (1)

Based on the reed displacement, the opening of the reed channel, denoted Si(t), is expressed by

Si(t) = Θ(y(t) + H) × w (y(t) + H), (2)

where w denotes the width of the reed channel, H denotes the distance between the two reeds at rest (y(t) = 0 and pm = 0), and Θ is the Heaviside function, the role of which is to keep the opening of the reeds positive by canceling it when y(t) + H < 0.

The relationship between the mouth pressure pm, the pressure pj(t) and velocity vj(t) of the air jet, and the volume flow q(t), classically used when dealing with single-reed instruments, is based on the stationary Bernoulli equation rather than on the Backus model (see, e.g., [10] for justification and comparisons with measurements). This relationship, which is still valid here, is

pm = pj(t) + (1/2) ρ vj(t)²,
q(t) = Sj(t) vj(t) = α Si(t) vj(t), (4)

where α, which is assumed to be constant, is the ratio between the cross section of the air jet Sj(t) and the reed opening Si(t).

It should be mentioned that the aim of this paper is to propose a digital sound synthesis model that takes the dissipation of the air jet in the reed channel into account. For a detailed physical description of this phenomenon, readers can consult [5], from which the notation used here was borrowed.

2.2.2. Flow model

In the framework of the digital synthesis model on which this paper focuses, it is necessary to express the volume flow q(t) as a function of the difference between the mouth pressure pm and the pressure at the entrance of the resonator pr(t). From (4), we obtain

vj(t)² = (2/ρ)(pm − pj(t)), (5)

q(t)² = α² Si(t)² vj(t)². (6)
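As a concrete numerical illustration of equations (1), (2), (4), and (5), the sketch below integrates the reed oscillator with a simple explicit finite-difference scheme and then evaluates the opening and the flow. All parameter values (ωr, qr, µr, H, w, α, pm) are invented for demonstration, and the jet pressure pj is frozen at zero, so this is an open-loop toy, not the paper's coupled scheme (which is derived in Section 3).

```python
import numpy as np

# Open-loop illustration of eqs. (1), (2), (4), (5).
# All parameter values are assumptions chosen for demonstration only.
fs = 44100.0              # sampling rate (Hz)
dt = 1.0 / fs
wr = 2 * np.pi * 3000.0   # reed resonance omega_r (rad/s), assumed
qr = 0.05                 # damping coefficient q_r, assumed
mur = 0.05                # reed "mass" mu_r, assumed
H = 4e-4                  # reed opening at rest (m), assumed
w = 1e-2                  # reed-channel width (m), assumed
alpha = 0.6               # jet/opening cross-section ratio, assumed
rho = 1.2                 # air density (kg/m^3)

n = 1000
y = np.zeros(n)           # relative reed displacement y(t)
pm = 2000.0               # constant mouth pressure (Pa), assumed
pj = 0.0                  # jet pressure frozen at 0: open-loop sketch only

# Explicit centered-difference update of eq. (1):
#   y'' + wr*qr*y' + wr^2*y = -(pm - pj)/mur
for k in range(1, n - 1):
    ydd = -(pm - pj) / mur - wr * qr * (y[k] - y[k - 1]) / dt - wr**2 * y[k]
    y[k + 1] = 2 * y[k] - y[k - 1] + dt**2 * ydd

# Reed-channel opening, eq. (2): the Heaviside factor clips it to >= 0.
Si = np.where(y + H > 0, w * (y + H), 0.0)

# Jet velocity, eq. (5), valid while pm >= pj, and volume flow, eq. (4).
vj = np.sqrt(2 * (pm - pj) / rho)
q = alpha * Si * vj
```

With these made-up values the scheme is stable (ωr·dt < 2) and the opening never closes, so the clipping branch of eq. (2) is not exercised; lowering H or raising pm would bring it into play.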
Substituting the value of pj(t) given by (3) into (5) gives