PhonBank and Data Sharing: Recent Developments in European Portuguese


Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), pages 6562–6570, Marseille, 11–16 May 2020. © European Language Resources Association (ELRA), licensed under CC-BY-NC

Ana Margarida Ramalho*, M. João Freitas*, Yvan Rose+
*University of Lisbon / CLUL, Lisboa, Portugal; +Memorial University, St. John’s, NL, Canada

Abstract

This paper presents the recently published RAMALHO-EP and PHONODIS corpora. Both include European Portuguese production data from Portuguese children with typical (RAMALHO-EP) and protracted (PHONODIS) phonological development. The data in the two corpora were collected using the phonological assessment tool CLCP-EP, developed in the context of the Crosslinguistic Child Phonology Project, coordinated by Barbara Bernhardt and Joe Stemberger (University of British Columbia (UBC), Canada). Both corpora are part of the PhonBank Project (Brian MacWhinney (Carnegie Mellon, USA) and Yvan Rose (Memorial University of Newfoundland, Canada)), which is the child phonology component of TalkBank, coordinated by Brian MacWhinney. The data at PhonBank are edited in the format of Phon, a software tool designed and built by Yvan Rose and Greg Hedlund (Memorial University of Newfoundland) and widely used by researchers working in the field of phonological acquisition. RAMALHO-EP contains production data from 87 typically developing children, aged 2;11 to 6;04, all monolinguals. PHONODIS includes production data from 22 children diagnosed with different types of speech and language disorders, all EP monolinguals, aged 3;2 to 11;05. Both corpora are open access language resources and help enlarge the amount of production data on the acquisition of European Portuguese available in PhonBank.
Keywords: Phonology, Phonetics, Acquisition, Phon, PhonBank, Data sharing, European Portuguese

1. Introduction

Big Data research enables computer-assisted, broad-based generalizations over rich datasets, which cannot be obtained through traditional methods of observation. However, these generalizations often come with partially or completely untested assumptions about the irrelevance of potential noise in the data, the nature of which may influence results in ways which remain elusive to the researcher. In light of this, and also given the fact that some types of Big Data corpora are not easily obtainable, due to the nature of the data, the need for precisely-analyzable corpora remains across many areas of research.

This is the case for the study of phonological development and speech disorders, two disciplines which require access to phonetically-transcribed sets of data documenting different populations of speakers (e.g. typically-developing vs. disordered) across different languages and, in the context of multilingual populations, across different combinations of languages. Current systems for the automatic transcription of speech data are not reliable enough to be used as a basis for corpus building at the moment, especially given that child, disordered, or accented speech is typically transcribed very poorly through these systems. This implies that corpus building in these areas must rely on human transcribers, whose work can be assisted through computer-assisted methods of data measurement (e.g. acoustic analysis) or data annotation (see below). This requirement in turn poses a major roadblock for the development of Big Data corpora for research.

The PhonBank project, which supports researchers and students through software programs and a data sharing platform[1], has been offering compelling solutions to these two challenges, through a combination of technical innovations in the areas of phonological and phonetic data representation and annotation, assembled within a software package called Phon[2], and the building of a public (web-accessible) database documenting phonological development and speech disorders across a number of different languages and speaker populations. A key feature of PhonBank is that it relies on the commitment of the users of Phon toward data sharing in order to construct the database component of this project: as scholars of phonological development and speech disorders benefit from the functionality available through Phon for their research, their publications and, in particular, the contribution of their corpus data to the PhonBank database enable its construction in ways that make Big Data research possible in the long term.

In this paper, we present an overview of this project and highlight how it offered a framework for corpus building and analysis based on a series of projects recently developed based on European Portuguese (EP) speech data. We begin by situating PhonBank within the larger context of corpus-based research on language development. We then highlight the main components of Phon for corpus building and analysis. This provides the technical basis for our discussion of how these resources have been used for research on European Portuguese, which we highlight through some of the key findings emanating from this research. Finally, we highlight how these results have now been incorporated within PhonBank in ways which contribute to cross-linguistic, Big Data research.

[1] https://phonbank.talkbank.org
[2] https://www.phon.ca

2. CHILDES, TalkBank, PhonBank

PhonBank comes from a rich tradition of corpus-based research in language acquisition, which itself traces its roots all the way back to Charles Darwin’s (1877) description of gestural development by his son. This work provided a model application of Darwin’s descriptive techniques for work on natural selection and evolution to the study of human development (MacWhinney, 2000). Limited by the means of research available to individuals until the democratization of computers in the early 1980s, corpus-based research on language acquisition remained central to virtually all studies of language acquisition since Darwin, albeit in ways that limited both the level of annotation obtained and the analysis of these annotations.

The Child Language Data Exchange System (CHILDES[3]), co-founded by Brian MacWhinney and Catherine Snow in 1983, opened the door for the building of the first large-scale database documenting child language. To this day, CHILDES remains the primary source of data and research tools for the study of child language worldwide, through its combination of publicly-accessible datasets documenting child language across different languages and learning situations, and computer programs for the analysis of these datasets, assembled within the CLAN software package. CHILDES has since provided a model (and inspiration) for the creation of similar databases, for the study of other topics relevant to human language and communication (e.g. AphasiaBank, BilingBank, FluencyBank), which together form the TalkBank consortium[4]. In addition to the rich documentation it provides to scholars of language, language acquisition, pathologies, etc., one of the most central assets of the TalkBank databases is their adherence to relatively strict data formats, in particular the CHAT format supported by CLAN and its corresponding. This cross-corpus consistency greatly facilitates the large-scale study of the different datasets.

PhonBank came into existence first as a subset of the CHILDES database; through successive rounds of funding from the National Institutes of Health, PhonBank subsequently grew into becoming an

[...] study of phonology and phonological development, for example in the area of segmental and syllabic patterns of behaviours (e.g. phones and phonological features; syllable onsets vs. codas, stress, tone, and so on). In addition to this, at the time of PhonBank’s inception, in 2006, recent progress in phonetic science was making acoustic analyses of child speech increasingly compelling for our understanding of phonological development. In order to address this need, Praat libraries for the acoustic analysis of speech[5] were subsequently integrated within Phon. We provide more detail on the functions supported by Phon in the next section.

3. Phon

Phon, an open-source (free) software program, was first released in 2006 (Rose et al., 2006). Among other functions, Phon supports the following:

• Media (audio/video) time-alignment to data transcripts
• Orthographic and phonetic transcription
• Segmental, tonal and stress-level annotation of the phonetic transcriptions
• Segment-by-segment alignment between target (model) and actual (speaker-produced) forms
• Data query and reporting in reference to segmental and prosodic aspects of the transcribed forms
• Acoustic analysis

These general functions are supplemented by a series of additional features within Phon which help the creation and maintenance of structured databases, including format checks and the ability to import and export data from and to third-party applications.

After the researcher has diarized the corpus into speaker-specific utterances (which can range from isolated word forms to full utterances, each associated with a given speaker) and completed the orthographic transcription of the corpus (a task that remains essentially manual in the absence, still today, of reliable and/or easily accessible speech recognition systems), Phon can automatically generate broad phonetic transcriptions (following the standards of the International Phonetic Association[6]) using either built-in dictionaries of IPA-transcribed forms (current