A. Acoustic Theory and Modeling of the Vocal Tract


by H.W. Strube, Drittes Physikalisches Institut, Universität Göttingen

A.1 Introduction

This appendix is intended for readers who want to inform themselves about the mathematical treatment of vocal-tract acoustics and about its modeling in the time and frequency domains. Apart from providing a fundamental understanding, this treatment is required for all applications and investigations concerned with the relationship between geometric and acoustic properties of the vocal tract, such as articulatory synthesis, determination of the tract shape from acoustic quantities, inverse filtering, etc.

Historically, the formants of speech were conjectured to be resonances of cavities in the vocal tract. In the case of a narrow constriction at or near the lips, such as for the vowel [u], the volume of the tract can be considered a Helmholtz resonator (the glottis is assumed almost closed). However, this can only explain the first formant. Also, the constriction, if any, is usually situated farther back. Then the tract may be roughly approximated as a cascade of two resonators, accounting for two formants. But all these approximations by discrete cavities proved unrealistic. Thus researchers have now adopted a more reasonable description of the vocal tract as a nonuniform acoustical transmission line. This can explain an infinite number of resonances, of which, however, only the first 2 to 4 are of phonetic importance.

Depending on the kind of sound, the tube system has a different topology:
• for vowel-like sounds, pharynx and mouth form one tube;
• for nasalized vowels, the tube is branched, with transmission from pharynx through mouth and nose;
• for nasal consonants, transmission is through pharynx and nose, with the closed mouth tract as a "shunt" line.
The situation becomes even more complicated for plosive and fricative consonants, where one must further take into account different places and kinds of excitation. Instead of, or in addition to, glottal oscillation, there is a turbulent-noise source at any narrow constriction, and for plosives, a sudden pressure release after opening a closure.

In this appendix, we will present a fundamental description of a one-dimensional tube in the time and frequency domains, show the connection between tube shape and formants, and present methods for time-domain modeling of sound propagation as well as frequency-domain computation of transfer functions and impedances. The "inverse problem" of how to estimate the tract shape from acoustical data will also be discussed briefly.

A.2 Acoustics of a Hard-Walled, Lossless Tube

To keep the formulas simple, we will present the fundamental equations for the hard-walled, lossless tube only, but at first allowing for a time-varying tube shape. The more general case will only be described in the frequency domain for a time-invariant tube shape. In addition to the well-known representation by pressure and volume velocity, less familiar representations will be introduced that are useful for modeling and computation or show analogies to other fields of physics.

A.2.1 Field Equations

The acoustic field equations are derived from the Navier-Stokes flow equations by linearization, assuming that flow velocity is small compared to the speed of sound, c. Furthermore, the non-zero average (dc) air flow in the vocal tract is neglected (its effects on the formants would be of second order only). But keep in mind that at narrow constrictions, nonlinear effects can become important, e.g. when whistling or in the glottis. Additional approximations are:
(1) The curved vocal tract is treated like a straight tube.
(2) The waves propagate one-dimensionally along the tube axis $x$ and are approximately plane. This requires that the slope of the tube walls be small.
(3) No higher modes with nodes over the cross-section are taken into account [they are removed by the integrals in (A.1) below]. In the vocal tract, higher modes cannot propagate below about 4 kHz.

Thus the tube is entirely described by its "area function" $A(x, t)$. Let $y, z$ be the coordinates in the cross-section plane. The appropriate field quantities are the alternating (ac) pressure averaged over the cross-section, $p(x, t)$, and the volume velocity $q(x, t)$, defined as

$$ p(x,t) = \frac{1}{A(x,t)} \iint_A \tilde{p}(x,y,z,t)\, dy\, dz , \qquad q(x,t) = \iint_A v_x(x,y,z,t)\, dy\, dz . \tag{A.1} $$

Here $\tilde{p}$ is the three-dimensional ac pressure field and $v_x$ the $x$ component of the velocity field. The ac density $\varrho(x, t)$ is defined analogously to $p(x, t)$ and proportional to it (state equation); the constant average density will be denoted by $\varrho_0$. The motion is then described by "Newton's law", the (one-dimensional) continuity equation, and the state equation:

$$ \varrho_0 (q/A)^{\cdot} = -p' , \tag{A.2} $$
$$ (\varrho A)^{\cdot} + \varrho_0 \dot{A} = -\varrho_0 q' , \tag{A.3} $$
$$ p = c^2 \varrho . \tag{A.4} $$

Here, a dot denotes $\partial/\partial t$, a prime means $\partial/\partial x$. The proportionality factor $c^2$ in (A.4) is the square of what will in fact turn out to be the speed of sound (phase velocity), $c$. The second term in (A.3) represents a flow source due to the motion of the tube walls. Since in the vocal tract these move too slowly to generate audible sound, this term will henceforth be neglected. Then the last two equations can be combined into

$$ \frac{(pA)^{\cdot}}{\varrho_0 c^2} = -q' . \tag{A.5} $$

Obviously, the two field equations (A.2), (A.5) are of a form analogous to those of a lossless electrical transmission line, if $p$ is identified with voltage and $q$ with current and

$$ L' = \varrho_0 / A , \qquad C' = A / (\varrho_0 c^2) \tag{A.6} $$

correspond to an inductance and capacitance density, respectively. These are not independent, since $L'C' = c^{-2}$ is constant.
Thus the tube may be completely described by the characteristic impedance:

$$ Z = \sqrt{L'/C'} = \varrho_0 c / A ; \tag{A.7} $$

then $L' = Z/c$, $C' = 1/(Zc)$, and for any derivative or variation "$\delta$" we have

$$ \delta A / A = \delta C' / C' = -\delta L' / L' = -\delta Z / Z . \tag{A.8} $$

Equations (A.2), (A.5) are rewritten as

$$ -p' = (L'q)^{\cdot} = c^{-1} (qZ)^{\cdot} , \tag{A.9a} $$
$$ -q' = (C'p)^{\cdot} = c^{-1} (p/Z)^{\cdot} . \tag{A.9b} $$

The infinitesimal transmission-line element is then that shown in Fig. A.1. Note that in the time-varying case, $L'$ contains a quasi-resistive and $C'$ a quasi-conductive component, since $(L'q)^{\cdot} = L'\dot{q} + \dot{L}'q$, etc.

[Fig. A.1. Electrical equivalent circuit of the infinitesimal lossless transmission-line element, with elements $L'\,dx$ and $C'\,dx$]

The energy balance of the tube can be derived by multiplying (A.9a) and (A.9b) with $q$ and $p$, respectively, and adding them, leading to the continuity equation

$$ (w_p + w_q)^{\cdot} + P' = (w_p - w_q)\, \dot{Z}/Z , \tag{A.10} $$
$$ w_p = C'p^2/2 , \qquad w_q = L'q^2/2 , \qquad P = pq , $$

where $w_p$ and $w_q$ are the potential and kinetic energy densities and $P$ is the power (energy flow). The right-hand side represents an energy source density due to work of the moving walls against the radiation pressure in the tube. This term vanishes in the time-invariant case, so that energy is then conserved.

Note that all equations are invariant under the duality transformation

$$ p \leftrightarrow q , \qquad L' \leftrightarrow C' , \qquad Z \leftrightarrow 1/Z . \tag{A.11} $$

Another familiar representation uses the velocity potential $\Phi$, related to $p$ and $q$ according to

$$ p = \varrho_0 \dot{\Phi} , \qquad q = -A \Phi' . \tag{A.12} $$

These equations imply (A.2) or (A.9a) automatically. Inserting them in (A.5) now yields a second-order wave equation, namely Webster's horn equation, generalized for time-varying $A$:

$$ (A \Phi')' - c^{-2} (A \dot{\Phi})^{\cdot} = 0 , \tag{A.13} $$

which has a completely symmetric form in the space and time derivatives. Another similar form with $A$ replaced by $1/A$, again corresponding to the duality transformation (A.11), can be obtained by using the volume displacement $\int q\, dt$ instead of $\Phi$.
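The transmission-line analogy of (A.6) to (A.8) can be checked numerically. The following is only a minimal sketch: the air density, the speed of sound, the sampled area function, and the helper names `line_parameters` and `relative_change` are illustrative assumptions and do not come from the text.

```python
import numpy as np

RHO0 = 1.2   # assumed mean air density in kg/m^3 (not given in the text)
C = 350.0    # assumed speed of sound in m/s (not given in the text)


def line_parameters(area):
    """Return (L', C', Z) of (A.6), (A.7) for cross-sectional areas in m^2."""
    area = np.asarray(area, dtype=float)
    inductance = RHO0 / area               # L' = rho0 / A
    capacitance = area / (RHO0 * C ** 2)   # C' = A / (rho0 c^2)
    impedance = RHO0 * C / area            # Z  = rho0 c / A
    return inductance, capacitance, impedance


def relative_change(new, old):
    """Relative variation (new - old) / old, element by element."""
    return (new - old) / old


# Hypothetical area function: 8 samples along a 17.5 cm tube, areas in m^2.
x = np.linspace(0.0, 0.175, 8)
area = 1e-4 * (3.0 + 2.0 * np.sin(2.0 * np.pi * x / 0.175))

L1, C1, Z1 = line_parameters(area)
print("L'C' equals c^-2:", np.allclose(L1 * C1, C ** -2))
print("sqrt(L'/C') equals Z:", np.allclose(np.sqrt(L1 / C1), Z1))

# (A.8): a small relative change eps of A gives +eps in C' and -eps in L', Z.
eps = 1e-6
L2, C2, Z2 = line_parameters(area * (1.0 + eps))
print("dC'/C':", relative_change(C2, C1)[0])
print("dL'/L':", relative_change(L2, L1)[0])
print("dZ/Z  :", relative_change(Z2, Z1)[0])
```

Because $L'$, $C'$, and $Z$ all follow from $A$ alone, any one of them (here $Z$) suffices to describe the tube, which is the point made by (A.7).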
When $A$ and thus $L'$, $C'$, $Z$ are constant in time, Webster's horn equation can be written in its familiar form, not only for velocity potential and volume displacement but also for pressure and volume velocity themselves:

$$ p'' + (A'/A)\, p' - c^{-2} \ddot{p} = 0 , \tag{A.14} $$
$$ q'' - (A'/A)\, q' - c^{-2} \ddot{q} = 0 . \tag{A.15} $$

Other Representations.

a) Square root of energy density. By replacing $p$ and $q$ with the corresponding square-root-of-energy-density components,

$$ \psi = p \sqrt{C'/2} , \qquad \varphi = q \sqrt{L'/2} , \tag{A.16} $$

the field equations now read

$$ c^{-1} (Z^{1/2} \varphi)^{\cdot} = -(Z^{1/2} \psi)' , \qquad c^{-1} (Z^{-1/2} \psi)^{\cdot} = -(Z^{-1/2} \varphi)' , \tag{A.17} $$

or equivalently,

$$ c^{-1} (\dot{\varphi} + Q \varphi) = -\psi' - W \psi , \qquad c^{-1} (\dot{\psi} - Q \psi) = -\varphi' + W \varphi , \tag{A.18a} $$

with

$$ Q = \dot{Z}/2Z = -\dot{A}/2A , \qquad W = Z'/2Z = -A'/2A . \tag{A.18b} $$

In the time-invariant case ($Q = 0$), this representation removes the first-order derivatives of the fields from the Webster horn equations (A.14), (A.15), yielding Schrödinger (or rather, Klein-Gordon) type equations instead:

$$ -\psi'' + V_p \psi + c^{-2} \ddot{\psi} = 0 , \tag{A.19} $$
$$ -\varphi'' + V_q \varphi + c^{-2} \ddot{\varphi} = 0 , \tag{A.20} $$

with potentials

$$ V_p = (\sqrt{A})'' / \sqrt{A} = W^2 - W' , \qquad V_q = (1/\sqrt{A})'' \sqrt{A} = W^2 + W' . \tag{A.21} $$

This representation may be useful for eigenvalue problems and inverse problems. In the quantum-mechanical terminology, $V_p$ and $V_q$ have the form of supersymmetric partner potentials with $W$ as superpotential.

b) Wave quantities. As will become apparent in Sect. A.3 below, it is often advantageous to transform $p$ and $q$ to an equivalent representation by "right" and "left" traveling waves, whose dimension may be pressure, volume velocity, or square root of power.
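The link between tube shape and resonances expressed by the time-invariant horn equation (A.14) can be illustrated numerically. For a harmonic time dependence with angular frequency $\omega$, (A.14) multiplied by $A$ becomes $(A p')' + (\omega/c)^2 A p = 0$. The sketch below discretizes this with linear finite elements and a lumped mass matrix. Several ingredients are assumptions made for illustration only and are not part of the text: the boundary conditions (closed glottis end, $p' = 0$; ideal open lip end, $p = 0$), the value of the speed of sound, the discretization itself, and the function name `formants`.

```python
import numpy as np

C = 350.0  # assumed speed of sound in m/s (not given in the text)


def formants(area_fn, length, n_elem=400, n_formants=3):
    """Lowest resonance frequencies (Hz) of a lossless tube from (A.14).

    Assumed boundary conditions: p' = 0 at x = 0 (closed end) and
    p = 0 at x = length (ideal open end).
    """
    h = length / n_elem
    x_mid = (np.arange(n_elem) + 0.5) * h
    a = np.asarray(area_fn(x_mid), dtype=float)  # one area value per element

    # Unknowns are the pressures at nodes 0 .. n_elem-1; the last node
    # (x = length) is fixed to p = 0 and therefore dropped.
    diag = a / h
    diag[1:] += a[:-1] / h
    off = -a[:-1] / h
    stiff = np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)

    mass = a * h / 2.0          # lumped (diagonal) mass matrix
    mass[1:] += a[:-1] * h / 2.0

    # Symmetrize the generalized problem K p = k^2 M p with diagonal M.
    scale = 1.0 / np.sqrt(mass)
    eigvals = np.linalg.eigvalsh(stiff * scale[:, None] * scale[None, :])
    wavenumbers = np.sqrt(eigvals[:n_formants])
    return C * wavenumbers / (2.0 * np.pi)


# Uniform tube of 17.5 cm length and 5 cm^2 cross-section (hypothetical values).
uniform = formants(lambda x: np.full_like(x, 5e-4), 0.175)
print(np.round(uniform, 1))
```

For the uniform tube, the exact resonances under these assumed boundary conditions are the quarter-wavelength values $(2n-1)c/4l$, i.e. 500, 1500, and 2500 Hz for the values chosen here, so the sketch can be checked against them. A non-uniform `area_fn` shifts these frequencies, which is the dependence of formants on tube shape discussed in this appendix.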