Prosody of Spontaneous Speech in Autism
Total Page:16
File Type:pdf, Size:1020Kb
Prosody of Spontaneous Speech in Autism G´ezaKiss M. S. in Computer Science, Budapest University of Technology, 1997 Presented to the Center for Spoken Language Understanding within the Oregon Health & Science University School of Medicine in partial fulfillment of the requirements for the degree Doctor of Philosophy in Computer Science & Engineering September 2017 Copyright c 2017 G´ezaKiss All rights reserved ii Center for Spoken Language Understanding School of Medicine Oregon Health & Science University CERTIFICATE OF APPROVAL This is to certify that the Ph. D. dissertation of G´ezaKiss has been approved. Jan P.H. van Santen, Thesis Advisor Professor Alison Presmanes Hill Associate Professor Xubo Song Professor Eric´ Fombonne Professor Elmar N¨oth Professor iii Dedication To Kl´ariand Peti, who were with me through thick and thin. To Father and Mother, who did all in their power to help me achieve this goal. iv Acknowledgments First of all, I would like to thank my advisor, Jan van Santen, who has been instrumental in making this PhD training come to fruition. He took on the role of mentoring me during my Fulbright scholarship and then for the entire duration of my PhD program. He made it possible for me to move back to my home country before finishing, taking on the inconveniences associated with being a remote advisor. He was kind, encouraging, constructive, and supportive all the time. He let me pursue my research interests, while also providing goals, advice, and feedback. Last but not least, he was also the one who found the financial means for this undertaking. I also want to express my sincere thanks to the professors who were on my advisory commit- tee during these years, namely Alison Presmanes Hill, Esther Klabbers, and Xubo Song, and my external committee members, Eric´ Fombonne and Elmar N¨oth.Their feedback made a vast differ- ence in improving the quality of this dissertation. Kyle Gorman, Eric Morley, and Emily Tucker Prud'hommeaux also provided me with helpful feedback on parts of the manuscript. I am grateful for the guidance of Peter Heeman through the PhD process, and the constant and apt support with administrative issues and other everyday matters from Patricia Dickerson. Steven Bedrick, Jason Brooks, Sean Farrell, Ethan Van Matre, Izhak Shafran, and Robert Stites always helped with issues related to the CSLU computer cluster. I am also grateful to Kim Basney for calling my attention to a tax treaty, which step improved our financial standing substantially. The classes with professors like Steven Bedrick, John-Paul Hosom, Alexander Kain, Brian Roark, Izhak Shafran, and Richard Sproat, the weekly seminars, and the interaction with fellow students all made CSLU a fun and inspriring place to be. Thank you all for sharing some of the good treasures of your heart! It was great to be among such clever and friendly people. I want to thank Pastor Fred and Betty Cason of True Life Fellowship in Beaverton, who were friends to us and helpful in all possible ways during our stay in the US, and provided a home for me during my later visits. The word gratitude is not enough to express what I feel toward my wife, Kl´ari,who loved and supported me faithfully me even when it was not so easy, and my oldest son, Peti, who was with us through it all. Above all, I thank Jesus Christ, without whom this dissertation would never have been completed. v This material is based upon work supported in part by the National Science Foundation under Grant No. NSF IIS-0905095 (van Santen, PI), and the National Institute on Deafness and Other Communication Disorders (NIDCD) of the National Institutes of Health under award numbers NIH-1R01DC007129 (van Santen, PI), NIH-1R01DC012033 (Richard Sproat, PI and van Santen, PI), and NIH-R21DC010239 (Lois Black, PI and van Santen, PI). The content is solely the respon- sibility of the author and does not necessarily represent the official views of the National Science Foundation or the National Institutes of Health. vi Contents Dedication ::::::::::::::::::::::::::::::::::::::::::::: iv Acknowledgments :::::::::::::::::::::::::::::::::::::::: v Contents :::::::::::::::::::::::::::::::::::::::::::::: vii List of Tables ::::::::::::::::::::::::::::::::::::::::::: xi List of Figures :::::::::::::::::::::::::::::::::::::::::: xiii List of Abbreviations :::::::::::::::::::::::::::::::::::::: xiv Abstract :::::::::::::::::::::::::::::::::::::::::::::: xvi 1 Introduction ::::::::::::::::::::::::::::::::::::::::: 1 1.1 Problem Statement . 2 1.2 Aims . 3 1.3 Scientific Contributions . 3 1.4 Organization of the dissertation . 5 2 Background :::::::::::::::::::::::::::::::::::::::::: 6 2.1 Autism Spectrum Disorders (ASD) . 7 2.1.1 History . 7 2.1.2 Characterization . 9 2.1.3 Relationship to other disorders . 10 2.2 Speech Prosody . 11 2.2.1 Prosodic disorders . 12 2.2.2 Screening instruments for prosody . 13 2.2.3 Prosody-Voice Screening Profile (PVSP) . 14 2.2.4 Profiling Elements of Prosodic Systems{Children (PEPS-C) . 15 2.3 Speech Prosody in ASD . 15 2.3.1 Early mentions . 16 2.3.2 Overview of papers on prosody in autism . 18 2.3.3 Perceptual ratings . 19 2.3.4 PVSP studies . 20 2.3.5 PEPS-C studies . 21 vii 2.3.6 Other perceptual studies . 22 2.3.7 Relationship to intellectual disability . 24 2.3.8 Developmental aspects . 24 2.3.9 What is still not known . 25 2.3.10 Clinical practice . 26 3 Technical Background ::::::::::::::::::::::::::::::::::: 27 3.1 F0 tracking . 28 3.1.1 Setting F0 tracker parameters . 29 3.1.2 Project on automatic correction of F0 tracking errors . 29 3.2 Computational models of prosody . 31 3.2.1 Fujisaki model . 31 3.2.2 Generalized Linear Alignment Model . 32 3.2.3 Simplified Linear Alignment Model . 32 3.3 Functional data analysis . 34 3.3.1 FDA for prosody research . 34 3.3.2 Converting intonation curves to a functional form . 34 3.3.3 Functional Principal Component Analysis . 35 3.4 Statistical Techniques . 35 3.4.1 False Discovery Rate correction . 37 3.4.2 Mixed effect linear models . 37 3.4.3 Monte Carlo significance test . 38 3.4.4 Reliability of results . 38 4 Corpora :::::::::::::::::::::::::::::::::::::::::::: 40 4.1 CSLU Cross-modal Corpus . 41 4.2 CSLU ERPA Autism Corpus . 42 4.2.1 The Autism Diagnostic Observation Schedule (ADOS) . 42 4.2.2 Data collection and corpus characteristics . 43 4.2.3 Diagnostic categories . 43 4.2.4 Matched groups . 44 4.2.5 ADOS recordings . 46 4.2.6 Relational Feature Tables from ADOS Corpora . 48 4.2.7 Determining sentence type . 48 5 Matching Diagnostic Groups ::::::::::::::::::::::::::::::: 55 5.1 Motivation for Matching Subject Groups . 56 5.2 Matching as an Optimization Problem . 60 5.3 Matching Algorithms . 61 5.3.1 Random search (random) ............................ 62 5.3.2 LDA-based heuristic search (heuristic1)................... 62 5.3.3 Test-statistic based heuristic search (heuristic2) . 63 5.3.4 Test-statistic based heuristic search with look-ahead (heuristic3 and 4) . 63 viii 5.3.5 Exhaustive search (exhaustive) ........................ 65 5.4 Matching the CSLU ERPA Corpus Subjects . 65 5.4.1 Matching criteria . 66 5.4.2 Approaches to matching subsets of groups on different variables . 66 5.4.3 The matching process . 68 5.4.4 Results . 68 5.5 Discussion of the Matching Algorithms . 70 5.6 Summary of our Work on Matching . 71 6 Acoustic Characterization of Prosody ::::::::::::::::::::::::: 72 6.1 Motivation for the Acoustic Analysis of Speech Prosody . 73 6.2 Acoustic-Prosodic Feature Sets . 74 6.2.1 Statistical features of prosody . 74 6.2.2 Speaker-specific intonation model parameters . 76 6.2.3 Functional Data Analysis for prosody curves . 85 6.3 Results of the Acoustic Analyes . 86 6.3.1 Statistical features of prosody . 86 6.3.2 Speaker-specific intonation model parameters . 93 6.3.3 Functional Data Analysis for prosody curves . 94 6.4 Discussion of the Acoustic Differences . 100 6.4.1 Statistical features of prosody . 100 6.4.2 Speaker-specific intonation model parameters . 102 6.4.3 Functional Data Analysis for prosody curves . 103 6.4.4 Summary . 104 6.5 Summary of the Acoustic Analyses of Prosody . 104 7 Perceptual Ratings of Prosody: Collection and Analysis ::::::::::::: 106 7.1 Motivation for Collecting Perceptual Atypicality Ratings . 107 7.2 Methodology for the Perceptual Rating Collection . 108 7.2.1 Study Design . 108 7.2.2 Tasks . 110 7.2.3 Task interfaces . 111 7.2.4 Stimulus sets . 123 7.2.5 Stimulus set Matched on Expected Prosody (MEP) . 123 7.2.6 Stimulus set that is Maximally Individually Diverse (MID) . 128 7.2.7 Audio normalization . 131 7.2.8 Rating collection . 131 7.2.9 Aggregating ratings . 133 7.2.10 Data analysis . 136 7.3 Analyzing the Perceptual Ratings . 138 7.3.1 Assessing the perceptual ratings data . ..