Peters, H. (2017)

Total Page:16

File Type:pdf, Size:1020Kb

Peters, H. (2017) "Proceedings of the International Conference “Fluency & Disfluency Across Languages and Language Varieties”" Degand, Liesbeth Abstract Within the frame of a research project entitled “Fluency and disfluency markers. A multimodal contrastive perspective”, the universities of Louvain and Namur have been involved in a large-scale usage-based study of (dis)fluency markers in spoken French, L1 and L2 English, and French Belgian Sign Language (LSFB), with a focus on variation according to language, speaker and genre. To close this five-year research project, the (DIS)FLUENCY 2017 International Conference is held in Louvain-la-Neuve on the subject of fluency and disfluency across languages and language varieties. This volume contains the papers presented at the conference. Document type : Rapport (Report) Référence bibliographique Degand, Liesbeth ; et. al. Proceedings of the International Conference “Fluency & Disfluency Across Languages and Language Varieties”. (2017) 125 pages Available at: http://hdl.handle.net/2078.1/195807 [Downloaded 2019/08/20 at 15:48:16 ] Proceedings of the International Conference “Fluency & Disfluency Across Languages and Language Varieties” Université catholique de Louvain February 15-17, 2017 Louvain-la-Neuve, Belgium Preface Fluency and disfluency have attracted a great deal of attention in different areas of linguistics such as language acquisition or psycholinguistics. They have been investigated through a wide range of methodological and theoretical frameworks, including corpus linguistics, experimental pragmatics, perception studies and natural language processing, with applications in the domains of language learning, teaching and testing, human/machine communication and business communication. Spoken and signed languages are produced and comprehended online, with typically very little time to plan ahead. As a result, they are often characterized by features such as (filled and unfilled) pauses, discourse markers, repeats and self-repairs, which can be said to reflect on- going mechanisms of processing and monitoring. The role of these items is ambivalent, as they can be both a symptom of encoding difficulties and a sign that the speaker is trying to help the hearer decode the message. They should thus be interpreted in context to identify their contribution to fluency and/or disfluency, which can be viewed as two faces of the same phenomenon. Within the frame of a research project entitled “Fluency and disfluency markers. A multimodal contrastive perspective”, the universities of Louvain and Namur have been involved in a large-scale usage-based study of (dis)fluency markers in spoken French, L1 and L2 English, and French Belgian Sign Language (LSFB), with a focus on variation according to language, speaker and genre. To close this five-year research project, the (DIS)FLUENCY 2017 International Conference is held in Louvain-la-Neuve on the subject of fluency and disfluency across languages and language varieties. This volume contains the papers presented at the conference. This conference benefits from the support of the ARC-project “A Multi-Modal Approach to Fluency and Disfluency Markers” granted by the Fédération Wallonie-Bruxelles (grant nr.12/17-044). Liesbeth Degand, on behalf of the organizing committee: Cédrick Fairon (UCL) Gaëtanelle Gilquin (UCL) Sylviane Granger (UCL) Laurence Meurant (UNamur) Anne Catherine Simon (UCL) George Christodoulides (UCL) Ludivine Crible (UCL) Amandine Dumont (UCL) Iulia Grosman (UCL) Ingrid Notarrigo (UNamur) Lucie Rousier Vercruyssen (UCL) Table of content What do listeners think of disfluencies? (Martin Corley) .......................................................................................... 6 Do (non-linguistic) variables affect learners’ (dis)fluency? A learner corpus-based perspective (Sandra Götz) ...... 6 Modeling disfluencies across domains (Helena Moniz) ............................................................................................ 7 Signs of (dis)fluency throughout development: The language use of Deaf children who are native users of a signed language considering adult examples of (dis)fluency (David Quinto-Pozos) ................................................. 8 A contrastive analysis of disfluency markers in Hungarian in four different settings (Mária Bakti & Judit Bóna) .. 11 Filled and unfilled pauses in recurrent multiword sequences in L2 presentational talk (Nicole Baumgarten) ......... 12 Synthesized lengthening of function words - The fuzzy boundary between fluency and disfluency (Simon Betz et al.) .......................................................................................................................................................................... 15 Disfluencies in children’s and adolescents’ spontaneous speech: the effect of speech task (Judit Bóna & Timea Vakula) ................................................................................................................................................................... 20 Comparing the evaluation and processing of native and non-native disfluencies (Hans Rutger Bosker) ................. 21 You know in learner English as an aid to fluency and beyond (Lieven Buysse) ....................................................... 23 A preliminary contrastive analysis of the acoustic features of “nasal grunts” in the CID and NECTE corpora (Aurélie Chlebowski) ................................................................................................................................................ 25 Methodologies for modelling silent pause length. Insights on individual and situational variation from a large- scale corpus study (George Christodoulides et al.) .................................................................................................... 28 Recycling discourse: from qualitative repair categories to a formal scale of fluency (Ludivine Crible) ................. 31 (Dis)fluency across spoken and signed languages: applications of an interoperable annotation scheme (Ludivine Crible et al.) ............................................................................................................................................................ 34 Examining fluency in second language speaking from speaker’s perspective: A cognitive approach (Tannistha Dasgupta) ............................................................................................................................................................... 37 Fluency and the use of foreign words in interviews with EFL learners (Sylvie De Cock) ....................................... 38 Profiling French learners’ productive (dis)fluency (Amandine Dumont) ................................................................. 41 Temporal characteristics of speech during fluency-inducing conditions in stuttering (Simone Falk et al.) ............. 43 Between pragmatic function and bad reputation: Partner models mediate the use and interpretation of hesitation markers (Kerstin Fischer & Nathalie Schümchen) ...................................................................................................... 44 Repeats in native and learner English (Tomáš Gráf & Lan-Fen Huang) ................................................................... 47 Prosodic variation of identical repetitions as a function of their properties and editing terms: A large-scale corpus study on French speech (Iulia Grosman et al.) ......................................................................................................... 49 An investigation of stuttering in Persian-speaking children based on the CALMS Assessment (Cognitive, Affective, Linguistic, Motor and Social) (Yasaman Jalilian et al.) .......................................................................... 53 Validation of a multidimensional model of stuttering for Persian-speaking children who stutter (Yasaman Jalilian et al.) ...................................................................................................................................................................... 53 How do Deaf native signers perceive fluency in adult L2/M2 learners of German Sign Language? (Thomas Kaul et al.) .......................................................................................................................................................................... 54 Word production difficulties in oral discourse by people with aphasia and healthy speakers: a qualitative analysis (Mariya V. Khudyakova & Mira B. Bergelson) ............................................................................................................ 57 Disfluency or speech management? A case study of co-constructed fluency with two Japanese learners of English (Steven Kirk) .............................................................................................................................................. 60 Should ‘Uh’ and ‘Um’ be categorized as markers of disfluency? The use of fillers in a challenging conversational context (Loulou Kosmala, Aliyah Morgenstern) ................................................................................. 62 Viewing pause behaviour across languages: Methodological and theoretical concerns (Hege Larsson Ass & Susan Nacey) .................................................................................................................................................................... 64 Stuttered and non-stuttered disfluencies in normally fluent, French-speaking preschool children (A.-L.
Recommended publications
  • Metapragmatics of Playful Speech Practices in Persian
    To Be with Salt, To Speak with Taste: Metapragmatics of Playful Speech Practices in Persian Author Arab, Reza Published 2021-02-03 Thesis Type Thesis (PhD Doctorate) School School of Hum, Lang & Soc Sc DOI https://doi.org/10.25904/1912/4079 Copyright Statement The author owns the copyright in this thesis, unless stated otherwise. Downloaded from http://hdl.handle.net/10072/402259 Griffith Research Online https://research-repository.griffith.edu.au To Be with Salt, To Speak with Taste: Metapragmatics of Playful Speech Practices in Persian Reza Arab BA, MA School of Humanities, Languages and Social Science Griffith University Thesis submitted in fulfilment of the requirements of the Degree of Doctor of Philosophy September 2020 Abstract This investigation is centred around three metapragmatic labels designating valued speech practices in the domain of ‘playful language’ in Persian. These three metapragmatic labels, used by speakers themselves, describe success and failure in use of playful language and construe a person as pleasant to be with. They are hāzerjavāb (lit. ready.response), bāmaze (lit. with.taste), and bānamak (lit. with.salt). Each is surrounded and supported by a cluster of (related) word meanings, which are instrumental in their cultural conceptualisations. The analytical framework is set within the research area known as ethnopragmatics, which is an offspring of Natural Semantics Metalanguage (NSM). With the use of explications and scripts articulated in cross-translatable semantic primes, the metapragmatic labels and the related clusters are examined in meticulous detail. This study demonstrates how ethnopragmatics, its insights on epistemologies backed by corpus pragmatics, can contribute to the metapragmatic studies by enabling a robust analysis using a systematic metalanguage.
    [Show full text]
  • Two Approaches to Metaphor Detection Brian Macwhinney and Davida Fromm Carnegie Mellon University Pittsburgh PA, 15213 USA E-Mail: [email protected], [email protected]
    Two Approaches to Metaphor Detection Brian MacWhinney and Davida Fromm Carnegie Mellon University Pittsburgh PA, 15213 USA E-mail: [email protected], [email protected] Abstract Methods for automatic detection and interpretation of metaphors have focused on analysis and utilization of the ways in which metaphors violate selectional preferences (Martin, 2006). Detection and interpretation processes that rely on this method can achieve wide coverage and may be able to detect some novel metaphors. However, they are prone to high false alarm rates, often arising from imprecision in parsing and supporting ontological and lexical resources. An alternative approach to metaphor detection emphasizes the fact that many metaphors become conventionalized collocations, while still preserving their active metaphorical status. Given a large enough corpus for a given language, it is possible to use tools like SketchEngine (Kilgariff, Rychly, Smrz, & Tugwell, 2004) to locate these high frequency metaphors for a given target domain. In this paper, we examine the application of these two approaches and discuss their relative strengths and weaknesses for metaphors in the target domain of economic inequality in English, Spanish, Farsi, and Russian. Keywords: metaphor, corpora, grammatical relations, selectional restrictions, collocations • Action_Has_Modifier: verb-adverb pairs. 1. The Task (trap softly) Entity_Has_Modifier: noun-adjective pairs. In our project, code-named METAL, the task was to • (grinding poverty) automatically detect metaphors in the target domain Entity_Entity: noun-noun pairs. (Kövecses, 2002) of economic inequality – a subject of • particular current interest in the wake of the financial (wealth ladder) Entity_Has_Entity: noun-noun pairs for possession. collapse of 2008 and growing income disparity within • (poverty s trap, burden of poverty) and between countries caused by globalization and ’ Entity_Is_Entity: noun-noun pairs for identity.
    [Show full text]
  • The Learner Corpora of Spoken English: What Has Been Done and What Should Be Done? Soyeon Yoon† Incheon National University
    The Learner Corpora of Spoken English: What Has Been Done and What Should Be Done? Soyeon Yoon† Incheon National University ABSTRACT The number of spoken learner corpora is smaller than that of written corpora; however, demand for spoken corpora has continuously increased. This study investigates the current state of spoken English-language learner corpora both in the world and in Korea, with a focus on their size, speakers, and genres. Based on this survey, the study discusses factors to consider when building and publishing a spoken learner corpus, especially with respect to the issues of conversation genre, proficiency level, and annotation. Keywords: spoken corpus, learner corpus, conversation, transcription, annotation 1. Introduction Ever since Brown Corpus was constructed in 1967, various types and sizes of corpora have been constructed and widely used as quantitative and qualitative evidence not only for language research but also for foreign/second language education. In language education, the corpora of native language have served as reference that the learners’ language patterns can be compared to or as a model that the learners pursue. On the other hand, learner corpora collect second or foreign language data that the learners use in either written or spoken forms. They have been constructed to examine learners’ linguistic knowledge and how their usage patterns differ from those of native speakers or differ depending on the learners’ age, gender, first language, and fluency. According to Center for English Corpus Linguistics1) (CECL) at Universite Catholique de Louvain, there are 177 learner corpora in the world currently available. Only 40 of them contain spoken data while most are written corpora.
    [Show full text]
  • The Corpus of Contemporary American English As the First Reliable Monitor Corpus of English
    The Corpus of Contemporary American English as the first reliable monitor corpus of English ............................................................................................................................................................ Mark Davies Brigham Young University, Provo, UT, USA ....................................................................................................................................... Abstract The Corpus of Contemporary American English is the first large, genre-balanced corpus of any language, which has been designed and constructed from the ground up as a ‘monitor corpus’, and which can be used to accurately track and study recent changes in the language. The 400 million words corpus is evenly divided between spoken, fiction, popular magazines, newspapers, and academic journals. Most importantly, the genre balance stays almost exactly the same from year to year, which allows it to accurately model changes in the ‘real world’. After discussing the corpus design, we provide a number of concrete examples of how the corpus can be used to look at recent changes in English, including morph- ology (new suffixes –friendly and –gate), syntax (including prescriptive rules, Correspondence: quotative like, so not ADJ, the get passive, resultatives, and verb complementa- Mark Davies, Linguistics and tion), semantics (such as changes in meaning with web, green, or gay), and lexis–– English Language, Brigham including word and phrase frequency by year, and using the corpus architecture Young University.
    [Show full text]
  • And the Causality Detection Benchmark
    1 Persian Causality Corpus (PerCause) and the Causality Detection Benchmark Zeinab Rahimi, Mehrnoush ShamsFard*1 NLP Research Lab., Shahid Beheshti University, Tehran, Iran {z_rahimi, m-shams}@sbu.ac.ir ABSTRACT. Recognizing causal elements and causal relations in text is one of the challenging issues in natural language processing; specifically, in low resource languages such as Persian. In this research we prepare a causality human annotated corpus for the Persian language which consists of 4446 sentences and 5128 causal relations and three labels of cause, effect and causal mark -if possible- are specified for each relation. We have used this corpus to train a system for detecting causal elements’ boundaries. Also, we present a causality detection benchmark for three machine learning methods and two deep learning systems based on this corpus. Performance evaluations indicate that our best total result is obtained through CRF classifier which has F-measure of 0.76 and the best accuracy obtained through BiLSTM-CRF deep learning method with Accuracy equal to %91.4. Keywords: PerCause, Causality annotated corpus, causality detection, deep learning, CRF 1. Introduction Causal relation extraction is a challenging natural language processing (NLP) open problem that mainly involves semantic analysis and is used in many NLP tasks such as textual entailment recognition, question answering, event prediction and narrative extraction. Causation indicates that one event, state or action is the result of the occurrence of the other state, event or action. Consider the following examples: (1) John is poisoned because he ate a poisonous apple. (2) It’s raining. The streets are slippery. (3) Narrow roads are dangerous.
    [Show full text]
  • Download the Processed Data (In Vertical Format), Perhaps for Further Annotation and Re- Uploading
    The Sketch Engine: Ten Years On Adam Kilgarriff, Vít Baisa, Jan Bušta, Miloš Jakubíček, Vojtěch Kovář, Jan Michelfeit, Pavel Rychlý, Vít Suchomel Lexical Computing Ltd., Brighton, UK & Masaryk University, Brno, Czech Republic 1. Introduction The Sketch Engine is a leading corpus tool. It has been widely used in lexicography. It is now ten years since its launch (Kilgarriff et al. 2004). Those ten years have seen dramatic changes. They have seen the near-death of dictionaries on paper, at the hands of electronic dictionaries.1 They have seen the emergence of entire new ecosystems of dictionaries on the web, with many new players (Google, weblio.jp, dictionary.com, Leo, Wordnik.com). Previously, the dominant players had been around for decades, even centuries -- Longman (who published Johnson's dictionary in 1754), Kenkyusha, OUP, Le Robert, Duden, Merriam-Webster). In the world at large, we have seen the invention and world takeover of the smartphone. 1994-2004 saw the switch of most dictionary lookups from paper to electronic: 2004-2014 has seen them nearly all (in percentage terms) switch from computer to phone. (Just think how often your students look up words on their phones, versus how often they look them up in any other way.) Dictionaries are far, far more available and accessible than they were. The sheer number of dictionary lookups has risen many times over (even as --bitter irony-- many dictionary companies have seen their income collapse). This is all at the publishing end of the dictionary business. What about the lexicography end? Here, we have seen the corpus revolution (Hanks 2012).
    [Show full text]
  • Constructing and Analysing an Error-Tagged Learner Corpus of Persian
    UNIVERSITY OF BELGRADE FACULTY OF PHILOLOGY Saeed G. Safari CONSTRUCTING AND ANALYSING AN ERROR-TAGGED LEARNER CORPUS OF PERSIAN Doctoral Dissertation Belgrade, 2017 UNIVERZITET U BEOGRADU FILOLOŠKI FAKULTET Said G. Safari IZRADA I ANALIZA ANOTIRANOG KORPUSA PERSIJSKOG JEZIKA KAO STRANOG Doktorska disertacija Beograd, 2017. УНИВЕРСИТЕТ БЕЛГРАДА ФАКУЛЬТЕТ ФИЛОЛОГИИ Саид Сафари Формирование и анализ аннотированного корпуса персидского языка Докторская диссертация Белград, 2017 г. Podaci o mentoru i članovima komisije Mentor: dr aja iličevi Petrovi , vanredni profesor Filološki fakultet, Beograd Članovi komisije: 1. dr jiljana arkovi redovni ro esor Filološki fakultet, Beograd 2. dr elena ili ovi redovni ro esor Filološki fakultet, Beograd 3. dr. Reza Morad Sahraei Fakultet za persijsku književnost i strane jezike, Teheran (Faculty of Persian Literature and Foreign Languages, Allameh Tabataba’i University, Tehran) Datum odbrane: Beograd, _______________ به نام خداوند جان آفرین حکیم سخن در زبانآفرین I would like to express my sincere gratitude to my mentor, for the continuous support of my thesis research and her advice, comments, guidance and immense knowledge. I would like to thank my esteemed professors, , Dr Julijana Vu and for their constant enthusiasm and encouragement during my doctoral studies. I would also like to thank Reza Morad Sahraei , from Allameh T b b ’ U s y T h for reviewing my research and his valuable comments and feedback. My deepest and endless gratitude goes to my amazing family, to whom this thesis is dedicated, especially to my loving and supportive wife, Solmaz Taghdimi. CONSTRUCTING AND ANALYSING AN ERROR-TAGGED LEARNER CORPUS OF PERSIAN Summary Linguistic corpora constitute reliable sources and empirical means for analyzing linguistic data.
    [Show full text]
  • An Analysis of a Persian Corpus
    International Journal of Modern Language Teaching and Learning Available online at www.ijmltl.com. Vol. 2, Issue 1, 2017, pp.32- 40 ISSN: 2367-9328 An Analysis of a Persian Corpus Ali Arabmofrad Department of English Language and Literature, Golestan University, Iran [email protected] ABSTRACT In the arena of corpus linguistics, Persian is among the languages that lack a thorough and comprehensive corpus of its contemporary use. Therefore, the present study is a preliminary attempt to construct a corpus for Persian language which is equivalent to that of other languages; for example, BNC which has more than 100 million words (spoken or written). To begin, a corpus of five million words of different contemporary newspaper articles that are considered to be, to a great extent, representative of Persian Standard formal use, were gathered. The corpus was analyzed and an overview of the most frequent single words and multi-word strings was found. Furthermore, it was found that the frequency of some multi word strings are more than that of single words that may be considered as a core vocabulary in Persian signaling the fact that such strings and expressions can play much more important role than conceived before. KEY WORDS: Corpus; Corpus Analysis; Corpus Application; Persian Corpus INTRODUCTION CORPUS "A corpus is a collection of texts, written or spoken, usually stored in a computer database" (McCarthy, 2004, p.1). This collection of words can be used in linguistic description or language hypothesis verification. The sources for written corpora are usually books, newspapers and magazines and those of spoken corpora are everyday (casual) conversations, phone calls, radio and TV programs , lectures, interviews and so on that are transcribed.
    [Show full text]
  • Natural Language Processing with Python
    Natural Language Processing with Python Natural Language Processing with Python Steven Bird, Ewan Klein, and Edward Loper Beijing • Cambridge • Farnham • Köln • Sebastopol • Taipei • Tokyo Natural Language Processing with Python by Steven Bird, Ewan Klein, and Edward Loper Copyright © 2009 Steven Bird, Ewan Klein, and Edward Loper. All rights reserved. Printed in the United States of America. Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472. O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://my.safaribooksonline.com). For more information, contact our corporate/institutional sales department: (800) 998-9938 or [email protected]. Editor: Julie Steele Indexer: Ellen Troutman Zaig Production Editor: Loranah Dimant Cover Designer: Karen Montgomery Copyeditor: Genevieve d’Entremont Interior Designer: David Futato Proofreader: Loranah Dimant Illustrator: Robert Romano Printing History: June 2009: First Edition. Nutshell Handbook, the Nutshell Handbook logo, and the O’Reilly logo are registered trademarks of O’Reilly Media, Inc. Natural Language Processing with Python, the image of a right whale, and related trade dress are trademarks of O’Reilly Media, Inc. Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and O’Reilly Media, Inc. was aware of a trademark claim, the designations have been printed in caps or initial caps. While every precaution has been taken in the preparation of this book, the publisher and authors assume no responsibility for errors or omissions, or for damages resulting from the use of the information con- tained herein.
    [Show full text]