HCS 7367 Speech Perception Anatomical, Acoustic, Neural Properties Production and Perception
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Pg. 5 CHAPTER-2: MECHANISM of SPEECH PRODUCTION AND
Acoustic Analysis for Comparison and Identification of Normal and Disguised Speech of Individuals CHAPTER-2: MECHANISM OF SPEECH PRODUCTION AND LITERATURE REVIEW 2.1. Anatomy of speech production Speech is the vocal aspect of communication made by human beings. Human beings are able to express their ideas, feelings and emotions to the listener through speech generated using vocal articulators (NIDCD, June 2010). Development of speech is a constant process and requires a lot of practice. Communication is a string of events which allows the speaker to express their feelings and emotions and the listener to understand them. Speech communication is considered as a thought that is transformed into language for expression. Phil Rose & James R Robertson (2002) mentioned that voices of individuals are complex in nature. They aid in providing vital information related to sex, emotional state or age of the speaker. Although evidence from DNA always remains headlines, DNA can‟t talk. It can‟t be recorded planning, carrying out or confessing to a crime. It can‟t be so apparently directly incriminating. Perhaps, it is these features that contribute to interest and importance of forensic speaker identification. Speech signal is a multidimensional acoustic wave (as seen in figure 2.1.) which provides information about the words or message being spoken, speaker identity, language spoken, physical and mental health, race, age, sex, education level, religious orientation and background of an individual (Schetelig T. & Rabenstein R., May 1998). pg. 5 Acoustic Analysis for Comparison and Identification of Normal and Disguised Speech of Individuals Figure 2.1.: Representation of properties of sound wave (Courtesy "Waves". -
Phonological Use of the Larynx: a Tutorial Jacqueline Vaissière
Phonological use of the larynx: a tutorial Jacqueline Vaissière To cite this version: Jacqueline Vaissière. Phonological use of the larynx: a tutorial. Larynx 97, 1994, Marseille, France. pp.115-126. halshs-00703584 HAL Id: halshs-00703584 https://halshs.archives-ouvertes.fr/halshs-00703584 Submitted on 3 Jun 2012 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Vaissière, J., (1997), "Phonological use of the larynx: a tutorial", Larynx 97, Marseille, 115-126. PHONOLOGICAL USE OF THE LARYNX J. Vaissière UPRESA-CNRS 1027, Institut de Phonétique, Paris, France larynx used as a carrier of paralinguistic information . RÉSUMÉ THE PRIMARY FUNCTION OF THE LARYNX Cette communication concerne le rôle du IS PROTECTIVE larynx dans l'acte de communication. Toutes As stated by Sapir, 1923, les langues du monde utilisent des physiologically, "speech is an overlaid configurations caractéristiques du larynx, aux function, or to be more precise, a group of niveaux segmental, lexical, et supralexical. Nous présentons d'abord l'utilisation des différents types de phonation pour distinguer entre les consonnes et les voyelles dans les overlaid functions. It gets what service it can langues du monde, et également du larynx out of organs and functions, nervous and comme lieu d'articulation des glottales, et la muscular, that come into being and are production des éjectives et des implosives. -
Acoustic-Phonetics of Coronal Stops
Acoustic-phonetics of coronal stops: A cross-language study of Canadian English and Canadian French ͒ Megha Sundaraa School of Communication Sciences & Disorders, McGill University 1266 Pine Avenue West, Montreal, QC H3G 1A8 Canada ͑Received 1 November 2004; revised 24 May 2005; accepted 25 May 2005͒ The study was conducted to provide an acoustic description of coronal stops in Canadian English ͑CE͒ and Canadian French ͑CF͒. CE and CF stops differ in VOT and place of articulation. CE has a two-way voicing distinction ͑in syllable initial position͒ between simultaneous and aspirated release; coronal stops are articulated at alveolar place. CF, on the other hand, has a two-way voicing distinction between prevoiced and simultaneous release; coronal stops are articulated at dental place. Acoustic analyses of stop consonants produced by monolingual speakers of CE and of CF, for both VOT and alveolar/dental place of articulation, are reported. Results from the analysis of VOT replicate and confirm differences in phonetic implementation of VOT across the two languages. Analysis of coronal stops with respect to place differences indicates systematic differences across the two languages in relative burst intensity and measures of burst spectral shape, specifically mean frequency, standard deviation, and kurtosis. The majority of CE and CF talkers reliably and consistently produced tokens differing in the SD of burst frequency, a measure of the diffuseness of the burst. Results from the study are interpreted in the context of acoustic and articulatory data on coronal stops from several other languages. © 2005 Acoustical Society of America. ͓DOI: 10.1121/1.1953270͔ PACS number͑s͒: 43.70.Fq, 43.70.Kv, 43.70.Ϫh ͓AL͔ Pages: 1026–1037 I. -
Vowel Acoustics Reliably Differentiate Three Coronal Stops of Wubuy Across Prosodic Contexts
Vowel acoustics reliably differentiate three coronal stops of Wubuy across prosodic contexts Rikke L. BundgaaRd-nieLsena, BRett J. BakeRb, ChRistian kRoosa, MaRk haRveyc and CatheRine t. Besta,d aMARCS Auditory Laboratories, University of Western Sydney bSchool of Languages and Linguistics, University of Melbourne cSchool of Humanities and Social Science, University of Newcastle dHaskins Laboratories, New Haven Abstract The present study investigates the acoustic differentiation of three coronal stops in the indigenous Australian language Wubuy. We test independent claims that only VC (vowel-into-consonant) transitions provide robust acoustic cues for retroflex as compared to alveolar and dental coronal stops, with no differentiating cues among these three coronal stops evident in CV (consonant-into-vowel) transitions. The four-way stop distinction /t, t̪ , ʈ, c/ in Wubuy is contrastive word-initially (Heath 1984) and by implication utterance-initially, i.e., in CV-only contexts, which suggests that acoustic differentiation should be expected to occur in the CV transitions of this language, including in initial positions. Therefore, we examined both VC and CV formant transition information in the three target coronal stops across VCV (word-internal), V#CV (word-initial but utterance-medial) and ##CV (word- and utterance-initial), for /a / vowel contexts, which provide the optimal environment for investigating formant transitions. Results confirm that these coro- nal contrasts are maintained in the CVs in this vowel context, and in all three posi- tions. The patterns of acoustic differences across the three syllable contexts also provide some support for a systematic role of prosodic boundaries in influencing the degree of coronal stop differentiation evident in the vowel formant transitions. -
Relations Between Speech Production and Speech Perception: Some Behavioral and Neurological Observations
14 Relations between Speech Production and Speech Perception: Some Behavioral and Neurological Observations Willem J. M. Levelt One Agent, Two Modalities There is a famous book that never appeared: Bever and Weksel (shelved). It contained chapters by several young Turks in the budding new psycho- linguistics community of the mid-1960s. Jacques Mehler’s chapter (coau- thored with Harris Savin) was entitled “Language Users.” A normal language user “is capable of producing and understanding an infinite number of sentences that he has never heard before. The central problem for the psychologist studying language is to explain this fact—to describe the abilities that underlie this infinity of possible performances and state precisely how these abilities, together with the various details . of a given situation, determine any particular performance.” There is no hesi tation here about the psycholinguist’s core business: it is to explain our abilities to produce and to understand language. Indeed, the chapter’s purpose was to review the available research findings on these abilities and it contains, correspondingly, a section on the listener and another section on the speaker. This balance was quickly lost in the further history of psycholinguistics. With the happy and important exceptions of speech error and speech pausing research, the study of language use was factually reduced to studying language understanding. For example, Philip Johnson-Laird opened his review of experimental psycholinguistics in the 1974 Annual Review of Psychology with the statement: “The fundamental problem of psycholinguistics is simple to formulate: what happens if we understand sentences?” And he added, “Most of the other problems would be half way solved if only we had the answer to this question.” One major other 242 W. -
Building a Universal Phonetic Model for Zero-Resource Languages
Building a Universal Phonetic Model for Zero-Resource Languages Paul Moore MInf Project (Part 2) Interim Report Master of Informatics School of Informatics University of Edinburgh 2020 3 Abstract Being able to predict phones from speech is a challenge in and of itself, but what about unseen phones from different languages? In this project, work was done towards building precisely this kind of universal phonetic model. Using the GlobalPhone language corpus, phones’ articulatory features, a recurrent neu- ral network, open-source libraries, and an innovative prediction system, a model was created to predict phones based on their features alone. The results show promise, especially for using these models on languages within the same family. 4 Acknowledgements Once again, a huge thank you to Steve Renals, my supervisor, for all his assistance. I greatly appreciated his practical advice and reasoning when I got stuck, or things seemed overwhelming, and I’m very thankful that he endorsed this project. I’m immensely grateful for the support my family and friends have provided in the good times and bad throughout my studies at university. A big shout-out to my flatmates Hamish, Mark, Stephen and Iain for the fun and laugh- ter they contributed this year. I’m especially grateful to Hamish for being around dur- ing the isolation from Coronavirus and for helping me out in so many practical ways when I needed time to work on this project. Lastly, I wish to thank Jesus Christ, my Saviour and my Lord, who keeps all these things in their proper perspective, and gives me strength each day. -
Prestopped Bilabial Trills in Sangtam*
PRESTOPPED BILABIAL TRILLS IN SANGTAM* Alexander R. Coupe Nanyang Technological University, Singapore [email protected] ABSTRACT manner of articulation in an ethnographic description published in 1939: This paper discusses the phonetic and phonological p͜͜ w = der für Nord-Sangtam typische Konsonant, sehr features of a typologically rare prestopped bilabial schwierig auszusprechen; tönt etwa wie pw oder pr. trill and some associated evolving sound changes in Wird jedoch von den Lippen gebildet, durch die man the phonology of Sangtam, a Tibeto-Burman die Luft so preßt, daß die Untelippe einmal (oder language of central Nagaland, north-east India. zweimal) vibriert. (Möglicherweise gibt es den gleichen Konsonanten etwas weicher und wird dann Prestopped bilabial trills were encountered in two mit b͜ w bezeichnet). dozen words of a 500-item corpus and found to be in Translation: pw = the typical consonant for the North phonemic contrast with all other members of the Sangtam language, very difficult to pronounce; sounds plosive series. Evidence from static palatograms and like pw or pr. It is however produced by the lips, linguagrams demonstrates that Sangtam speakers through which one presses the air in a way that the articulate this sound by first making an apical- or lower lip vibrates once (or twice). (Possibly, the same laminal-dental oral occlusion, which is then consonant exists in a slightly softer form and is then 1 explosively released into a bilabial trill involving up termed bw). to three oscillations of the lips. In 2012 a similar sound was encountered in two The paper concludes with a discussion of the dozen words of a 500-word corpus of Northern possible historical sources of prestopped bilabial Sangtam, the main difference from Kauffman’s trills in this language, taking into account description being that the lip vibration is preceded phonological reconstructions and cross-linguistic by an apical- or laminal-dental occlusion. -
Theories of Speech Perception
Theories of Speech Perception • Motor Theory (Liberman) • Auditory Theory – Close link between perception – Derives from general and production of speech properties of the auditory • Use motor information to system compensate for lack of – Speech perception is not invariants in speech signal species-specific • Determine which articulatory gesture was made, infer phoneme – Human speech perception is an innate, species-specific skill • Because only humans can produce speech, only humans can perceive it as a sequence of phonemes • Speech is special Wilson & friends, 2004 • Perception • Production • /pa/ • /pa/ •/gi/ •/gi/ •Bell • Tap alternate thumbs • Burst of white noise Wilson et al., 2004 • Black areas are premotor and primary motor cortex activated when subjects produced the syllables • White arrows indicate central sulcus • Orange represents areas activated by listening to speech • Extensive activation in superior temporal gyrus • Activation in motor areas involved in speech production (!) Wilson and colleagues, 2004 Is categorical perception innate? Manipulate VOT, Monitor Sucking 4-month-old infants: Eimas et al. (1971) 20 ms 20 ms 0 ms (Different Sides) (Same Side) (Control) Is categorical perception species specific? • Chinchillas exhibit categorical perception as well Chinchilla experiment (Kuhl & Miller experiment) “ba…ba…ba…ba…”“pa…pa…pa…pa…” • Train on end-point “ba” (good), “pa” (bad) • Test on intermediate stimuli • Results: – Chinchillas switched over from staying to running at about the same location as the English b/p -
Appendix B: Muscles of the Speech Production Mechanism
Appendix B: Muscles of the Speech Production Mechanism I. MUSCLES OF RESPIRATION A. MUSCLES OF INHALATION (muscles that enlarge the thoracic cavity) 1. Diaphragm Attachments: The diaphragm originates in a number of places: the lower tip of the sternum; the first 3 or 4 lumbar vertebrae and the lower borders and inner surfaces of the cartilages of ribs 7 - 12. All fibers insert into a central tendon (aponeurosis of the diaphragm). Function: Contraction of the diaphragm draws the central tendon down and forward, which enlarges the thoracic cavity vertically. It can also elevate to some extent the lower ribs. The diaphragm separates the thoracic and the abdominal cavities. 2. External Intercostals Attachments: The external intercostals run from the lip on the lower border of each rib inferiorly and medially to the upper border of the rib immediately below. Function: These muscles may have several functions. They serve to strengthen the thoracic wall so that it doesn't bulge between the ribs. They provide a checking action to counteract relaxation pressure. Because of the direction of attachment of their fibers, the external intercostals can raise the thoracic cage for inhalation. 3. Pectoralis Major Attachments: This muscle attaches on the anterior surface of the medial half of the clavicle, the sternum and costal cartilages 1-6 or 7. All fibers come together and insert at the greater tubercle of the humerus. Function: Pectoralis major is primarily an abductor of the arm. It can, however, serve as a supplemental (or compensatory) muscle of inhalation, raising the rib cage and sternum. (In other words, breathing by raising and lowering the arms!) It is mentioned here chiefly because it is encountered in the dissection. -
Saudi Speakers' Perception of the English Bilabial
Sino-US English Teaching, June 2015, Vol. 12, No. 6, 435-447 doi:10.17265/1539-8072/2015.06.005 D DAVID PUBLISHING Saudi Speakers’ Perception of the English Bilabial Stops /b/ and /p/ Mohammad Al Zahrani Taif University, Taif, Saudi Arabia Languages differ in their phoneme inventories. Some phonemes exist in more than one language but others exist in relatively few languages. More specifically, English Language has some sounds that Arabic does not have and vice versa. This paper focuses on the perception of the English bilabial stops /b/ and /p/ in contrast to the perception of the English alveolar stops /t/ and /d/ by some Saudi linguists who have been speaking English for more than six years and who are currently in an English speaking country, Australia. This phenomenon of perception of the English bilabial stops /b/ and /p/ will be tested mainly by virtue of minimal pairs and other words that may better help to investigate this perception. The paper uses some minimal pairs in which the bilabial and alveolar stops occur initially and finally. Also, it uses some verbs that end with the suffix /-ed/, but this /-ed/ suffix is pronounced [t] or [d] when preceded by /p/ or /b/ respectively. Notice that [t] and [d] are allophones of the English past tense morpheme /-ed/ (for example, Fromkin, Rodman, & Hyams, 2007). The pronunciation of the suffix as [t] and [d] works as a clue for the subjects to know the preceding bilabial sound. Keywords: perception, Arabic, stops, English, phonology Introduction Languages differ in their phoneme inventories. -
Illustrating the Production of the International Phonetic Alphabet
INTERSPEECH 2016 September 8–12, 2016, San Francisco, USA Illustrating the Production of the International Phonetic Alphabet Sounds using Fast Real-Time Magnetic Resonance Imaging Asterios Toutios1, Sajan Goud Lingala1, Colin Vaz1, Jangwon Kim1, John Esling2, Patricia Keating3, Matthew Gordon4, Dani Byrd1, Louis Goldstein1, Krishna Nayak1, Shrikanth Narayanan1 1University of Southern California 2University of Victoria 3University of California, Los Angeles 4University of California, Santa Barbara ftoutios,[email protected] Abstract earlier rtMRI data, such as those in the publicly released USC- TIMIT [5] and USC-EMO-MRI [6] databases. Recent advances in real-time magnetic resonance imaging This paper presents a new rtMRI resource that showcases (rtMRI) of the upper airway for acquiring speech production these technological advances by illustrating the production of a data provide unparalleled views of the dynamics of a speaker’s comprehensive set of speech sounds present across the world’s vocal tract at very high frame rates (83 frames per second and languages, i.e. not restricted to English, encoded as conso- even higher). This paper introduces an effort to collect and nant and vowel symbols in the International Phonetic Alphabet make available on-line rtMRI data corresponding to a large sub- (IPA), which was devised by the International Phonetic Asso- set of the sounds of the world’s languages as encoded in the ciation as a standardized representation of the sounds of spo- International Phonetic Alphabet, with supplementary English ken language [7]. These symbols are meant to represent unique words and phonetically-balanced texts, produced by four promi- speech sounds, and do not correspond to the orthography of any nent phoneticians, using the latest rtMRI technology. -
Acoustic Characteristics of Aymara Ejectives: a Pilot Study
ACOUSTIC CHARACTERISTICS OF AYMARA EJECTIVES: A PILOT STUDY Hansang Park & Hyoju Kim Hongik University, Seoul National University [email protected], [email protected] ABSTRACT Comparison of velar ejectives in Hausa [18, 19, 22] and Navajo [36] showed significant cross- This study investigates acoustic characteristics of linguistic variation and some notable inter-speaker Aymara ejectives. Acoustic measurements of the differences [27]. It was found that the two languages Aymara ejectives were conducted in terms of the differ in the relative durations of the different parts durations of the release burst, the vowel, and the of the ejectives, such that Navajo stops are greater in intervening gap (VOT), the intensity and spectral the duration of the glottal closure than Hausa ones. centroid of the release burst, and H1-H2 of the initial In Hausa, the glottal closure is probably released part of the vowel. Results showed that ejectives vary very soon after the oral closure and it is followed by with place of articulation in the duration, intensity, a period of voiceless airflow. In Navajo, it is and centroid of the release burst but commonly have released into a creaky voice which continues from a lower H1-H2 irrespective of place of articulation. several periods into the beginning of the vowel. It was also found that the long glottal closure in Keywords: Aymara, ejective, VOT, release burst, Navajo could not be attributed to the overall speech H1-H2. rate, which was similar in both cases [27]. 1. INTRODUCTION 1.2. Aymara ejectives 1.1. Ejectives Ejectives occur in Aymara, which is one of the Ande an languages spoken by the Aymara people who live Ejectives are sounds which are produced with a around the Lake Titicaca region of southern Peru an glottalic egressive airstream mechanism [26].