Book of Abstracts



Santiago de Compostela

Institutional sponsors
Unión Europea, Fondo Europeo de Desarrollo Regional: “Una manera de hacer Europa”
Gobierno de España, Ministerio de Economía y Competitividad

Organizing committee
María José López-Couso, Belén Méndez-Naya, Mario Cal-Varela, Teresa Fanego, Xabier Fernández-Polo, Paloma Núñez-Pertejo, Ignacio Palacios-Martínez

Student helpers
Cristina Blanco-García, Zeltia Blanco-Suárez, Tamara Bouso, Yolanda Joy Calvo-Benzies, Eduardo Coto-Villalibre, Tania de Dios, Fátima Faya, Susana Formoso, Vanesa Gil-Vilacoba, Lidia Gómez-García, Aleksandra Kaverina, Tamilla Mamedova, Beatriz Mato-Míguez, Alberto Monteagudo-Buceta, Noelia Ozores-Reboiras, Alba Pérez-González, María Luisa Roca-Varela, Paula Rodríguez-Abruñeiras, Iria-Gael Romay, Mario Serrano-Losada, Iván Tamaredo-Meira, Vera Vázquez-López

Scientific committee
Joan Beal (Sheffield), Douglas Biber (Northern Arizona), Hubert Cuyckens (Leuven), Holger Diessel (Jena), Marina Dossena (Bergamo), Sebastian Hoffmann (Trier), Marianne Hundt (Zurich), Merja Kytö (Uppsala), Julia Lavid (Complutense de Madrid), Hans Lindquist (Malmö), Christian Mair (Freiburg), Gabriella Mazzon (Innsbruck), Britta Mondorf (Mainz), Terttu Nevalainen (Helsinki), Dirk Noël (Hong Kong), Aquilino Sánchez Pérez (Murcia)
and the members of the ICAME Executive Board:
Gisle Andersen (Bergen), Kristin Davidse (Leuven), Stefan Th. Gries (Santa Barbara), Hilde Hasselgård (Oslo), Magnus Huber (Giessen), John M. Kirk (Belfast), María José López-Couso (Santiago), Michaela Mahlberg (Nottingham), Fanny Meunier (Louvain-la-Neuve), Ilka Mindt (Paderborn), Joybrato Mukherjee (Giessen), Pam Peters (Sydney), Paul Rayson (Lancaster), Ute Römer (Atlanta), Irma Taavitsainen (Helsinki), Sali Tagliamonte (Toronto)

ICAME 34: SPONSORS
Financial support for this event has been received from:
Grant no. FFI2011-26693-C02-01: Ministry of Economy and Competitiveness; Principal Investigator: María José López-Couso.
Grant no. CN2012/012: Directorate General for Scientific and Technological Promotion, Autonomous Government of Galicia; Principal Investigator: Teresa Fanego.
Grant no. CN2012/081: Directorate General for Scientific and Technological Promotion, Autonomous Government of Galicia; Principal Investigator: Ignacio Palacios-Martínez.
Institutional sponsors:
Research Unit for Variation, Linguistic Change and Grammaticalization, University of Santiago de Compostela (Ref. GI-1383)
Spoken English Research Team at the University of Santiago de Compostela, SPERTUS (Ref. GI-1762)

Benvida

Dear ICAME 34 participants,

Last year in Leuven we invited you to take the pilgrim’s way to Santiago de Compostela. Now we would like to welcome you warmly to ICAME 34 (22-26 May 2013) and to our old and beautiful town in hopes that you will profit from the very promising conference sessions. ICAME 34 will confirm that English corpus linguistics is indeed on the move and will show its various and wide applications and implications. Your interest and contributions are essential to making this an inspiring and stimulating event.

We also encourage you to find some spare time here and there and do some sightseeing: to enjoy the streets and people, our food and wine, and the friendly and lively atmosphere in Santiago. We do hope that the weather will be “merciful” to us. Just in case, remember the fond old saying that “rain is art in Santiago”.

We wish you an enjoyable conference,
The ICAME 34 organizing committee
María José López-Couso, Belén Méndez-Naya, Mario Cal-Varela, Teresa Fanego, Xabier Fernández-Polo, Paloma Núñez-Pertejo, Ignacio Palacios-Martínez

TABLE OF CONTENTS

Benvida

PLENARY SPEAKERS
Multiethnolects and English corpus linguistics
    Jenny Cheshire (Queen Mary, University of London)
The network metaphor of usage-based construction grammar
    Holger Diessel (Friedrich Schiller University Jena)
Who is the/a/Ø professor of English at your university?
    Marianne Hundt (University of Zurich)
Community and identity: What corpora can tell us about academic discourse
    Ken Hyland (University of Hong Kong)

PRE-CONFERENCE WORKSHOPS

WORKSHOP 1: CROSS-LINGUISTIC STUDIES AT THE INTERFACE BETWEEN LEXIS AND GRAMMAR
Convenors: Karin Aijmer (University of Gothenburg) and Hilde Hasselgård (University of Oslo)
Modal particles in a contrastive perspective – the case of the Swedish väl
    Karin Aijmer (University of Gothenburg)
I am wild about cabbage: Evaluative ‘semantic sequences’ and cross-linguistic (dis)continuities
    Marina Bondi & Corrado Seidenari (University of Modena and Reggio Emilia)
Binominal size noun constructions in English and French: A contrastive corpus-based perspective
    Lot Brems (University of Liège/K.U. Leuven)
A contrastive analysis of downtoners, more or less
    Signe Oksefjell Ebeling & Jarle Ebeling (University of Oslo)
Motion into and out of in English, French and Norwegian
    Thomas Egan & Anne-Line Graedler (Hedmark University College)
The postmodifying structure of noun phrases: A contrastive study of English and Norwegian
    Johan Elsness (University of Oslo)
Cross-linguistic analysis of cohesion: Variation across production types and registers
    Ekaterina Lapshinova-Koltunski & Kerstin Kunz (University of Saarland)
Intersubjective positioning and thematisation in English and Spanish: A contrastive analysis of letters to the editor
    Julia Lavid & Lara Moratón (Universidad Complutense de Madrid)
A comparable-corpus based approach to the expression of obligation across English and French
    Diana Lewis (University of Aix-Marseille I)
English non-finite participial clauses as seen through their Czech counterparts
    Markéta Malá & Pavlína Šaldová (Charles University in Prague)
Quite seen through its translation equivalents: A contrastive corpus-based study
    Michaela Martinková (Palacký University Olomouc)
English positive polarity contexts into Spanish: A corpus-based study
    Rosa Rabadán (University of León)
What Jesus wanted (us) to know
    Teresa Sánchez Roura (University of Santiago de Compostela)
Parallel corpus - a tool for diagnosing multifunctionality across languages: Actually, naturally and in fact vs. their correspondences in Lithuanian
    Aurelija Usoniene (Vilnius University), Jolanta Sinkuniene (Vytautas Magnus University & Vilnius University) & Audrone Soliene (Vilnius University)

WORKSHOP 2: COMPILATION AND ANNOTATION OF SPOKEN CORPORA: TOWARDS BEST PRACTICE
Convenors: Gisle Andersen (NHH Norwegian School of Economics), John Kirk (Belfast) and Susan Lee Nacey (Hedmark University College)
How do you spell yeah/yeh/yea/yah? Assessing orthographic transcription and comparability between spoken corpora
    Gisle Andersen (NHH Norwegian School of Economics)
“Ja, also, can you see me now?” Designing and compiling a corpus of computer-mediated international academic English
    Stefan Diemer (Saarland University)
From sociolinguistic interviews to a spoken corpus of London English: Creating the Linguistic Innovators Corpus (LIC)
    Costas Gabrielatos (Edge Hill University), Sebastian Hoffmann (University of Trier) & Eivind Torgersen (Sør-Trøndelag University College)
The SPICE-Ireland pragmatic annotation scheme: A critical appraisal
    John M. Kirk (Belfast)
The Norwegian component of LINDSEI
    Susan Lee Nacey (Hedmark University College)
Categorising plurilingual user data? Challenges and solutions for POS-tagging VOICE
    Ruth Osimk-Teasdale (University of Vienna)
BeMaTaC: A flexible multilayer spoken dialogue corpus for contrastive SLA analyses
    Simon Sauer & Anke Lüdeling (Humboldt University of Berlin)
Best practices in the compilation, annotation and publication of the Research and Teaching Corpus of Spoken German (FOLK)
    Thomas Schmidt (Institut für Deutsche Sprache, Mannheim)
Curating and maintaining spoken legacy corpora for publication in the scientific community
    Kai Wörner (HZSK University of Hamburg)
Recommended publications
  • Talk Bank: a Multimodal Database of Communicative Interaction
    Talk Bank: A Multimodal Database of Communicative Interaction

    1. Overview
    The ongoing growth in computer power and connectivity has led to dramatic changes in the methodology of science and engineering. By stimulating fundamental theoretical discoveries in the analysis of semistructured data, we can extend these methodological advances to the social and behavioral sciences. Specifically, we propose the construction of a major new tool for the social sciences, called TalkBank. The goal of TalkBank is the creation of a distributed, web-based data archiving system for transcribed video and audio data on communicative interactions. We will develop an XML-based annotation framework called Codon to serve as the formal specification for data in TalkBank. Tools will be created for the entry of new and existing data into the Codon format; transcriptions will be linked to speech and video; and there will be extensive support for collaborative commentary from competing perspectives.
    The TalkBank project will establish a framework that will facilitate the development of a distributed system of allied databases based on a common set of computational tools. Instead of attempting to impose a single uniform standard for coding and annotation, we will promote annotational pluralism within the framework of the abstraction layer provided by Codon. This representation will use labeled acyclic digraphs to support translation between the various annotation systems required for specific sub-disciplines. There will be no attempt to promote any single annotation scheme over others. Instead, by promoting comparison and translation between schemes, we will allow individual users to select the custom annotation scheme most appropriate for their purposes.
  • Child Language
    ABSTRACTS
    14th International Congress for the Study of Child Language (IASCL 2017), Lyon, France, July 17th-21st, 2017

    Summary: Plenary (Days 1-5), Symposia (Days 2-5), Poster (Days 2-4)

    Day 1, Monday 17th, 18:00-19:00, Grand Amphi
    PLENARY TALK
    Bottom-up and top-down information in infants’ early language acquisition
    Sharon Peperkamp, Laboratoire de Sciences Cognitives et Psycholinguistique, Paris, France
    Decades of research have shown that before they pronounce their first words, infants acquire much of the sound structure of their native language, while also developing word segmentation skills and starting to build a lexicon. The rapidity of this acquisition is intriguing, and the underlying learning mechanisms are still largely unknown. Drawing on both experimental and modeling work, I will review recent research in this domain and illustrate specifically how both bottom-up and top-down cues contribute to infants’ acquisition of phonetic categories and phonological rules.

    Day 2, Tuesday 18th, 9:00-10:00, Grand Amphi
    PLENARY TALK
    What do the hands tell us about language development? Insights from development of speech, gesture and sign across languages
    Asli Ozyurek, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
    Most research and theory on language development focus on children’s spoken utterances. However language development starting with the first words of children is multimodal. Speaking children produce gestures accompanying and complementing their spoken utterances in meaningful ways through pointing or iconic gestures.
  • Segmentability Differences Between Child-Directed and Adult-Directed Speech: a Systematic Test with an Ecologically Valid Corpus
    Report
    Segmentability Differences Between Child-Directed and Adult-Directed Speech: A Systematic Test With an Ecologically Valid Corpus
    Alejandrina Cristia (1), Emmanuel Dupoux (1, 2, 3), Nan Bernstein Ratner (4), and Melanie Soderstrom (5)
    (1) Dept d’Etudes Cognitives, ENS, PSL University, EHESS, CNRS; (2) INRIA; (3) FAIR Paris; (4) Department of Hearing and Speech Sciences, University of Maryland; (5) Department of Psychology, University of Manitoba

    Keywords: computational modeling, learnability, infant word segmentation, statistical learning, lexicon

    ABSTRACT
    Previous computational modeling suggests it is much easier to segment words from child-directed speech (CDS) than adult-directed speech (ADS). However, this conclusion is based on data collected in the laboratory, with CDS from play sessions and ADS between a parent and an experimenter, which may not be representative of ecologically collected CDS and ADS. Fully naturalistic ADS and CDS collected with a nonintrusive recording device as the child went about her day were analyzed with a diverse set of algorithms. The difference between registers was small compared to differences between algorithms; it reduced when corpora were matched, and it even reversed under some conditions. These results highlight the interest of studying learnability using naturalistic corpora and diverse algorithmic definitions.

    Citation: Cristia A., Dupoux, E., Ratner, N. B., & Soderstrom, M. (2019). Segmentability Differences Between Child-Directed and Adult-Directed Speech: A Systematic Test With an Ecologically Valid Corpus. Open Mind: Discoveries in Cognitive Science, 3, 13–22. https://doi.org/10.1162/opmi_a_00022
    DOI: https://doi.org/10.1162/opmi_a_00022
    Supplemental Materials: https://osf.io/th75g/
    Received: 15 May 2018

    INTRODUCTION
    Although children are exposed to both child-directed speech (CDS) and adult-directed speech (ADS), children appear to extract more information from the former than the latter (e.g., Cristia, 2013; Shneidman & Goldin-Meadow, 2012).
  • Lexical Ambiguity • Syntactic Ambiguity • Semantic Ambiguity • Pragmatic Ambiguity
    Welcome to the course!
    Introduction to Natural Language Processing (NLP)
    Professors: Marta Gatius Vila, Horacio Rodríguez Hontoria
    Hours per week: 2h theory + 1h laboratory
    Web page: http://www.cs.upc.edu/~gatius/engpln2017.html

    Main goal: Understand the fundamental concepts of NLP
      • Most well-known techniques and theories
      • Most relevant existing resources
      • Most relevant applications

    Content
      1. Introduction to Language Processing
      2. Applications
      3. Language models
      4. Morphology and lexicons
      5. Syntactic processing
      6. Semantic and pragmatic processing
      7. Generation

    Assessment
      • Exams
        – Mid-term exam: November
        – End-of-term exam: final exams period, all the course contents
      • Development of 2 programs, in groups of two or three students
      Course grade = maximum (midterm exam * 0.15 + final exam * 0.45, final exam * 0.6) + assignments * 0.4

    Related (or the same) disciplines:
      • Computational Linguistics, CL
      • Natural Language Processing, NLP
      • Linguistic Engineering, LE
      • Human Language Technology, HLT

    Linguistic Engineering (LE)
      • LE consists of the application of linguistic knowledge to the development of computer systems able to recognize, understand, interpret and generate human language in all its forms.
      • LE includes:
        – Formal models (representations of knowledge of language at the different levels)
        – Theories and algorithms
        – Techniques and tools
        – Resources (Lingware)
        – Applications

    Linguistic knowledge levels
      – Phonetics and phonology. Language models
      – Morphology: Meaningful components of words. Lexicon. Example: “doors” is plural
      – Syntax: Structural relationships between words.
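The course-grade formula quoted in the course description above can be read as "the mid-term only counts if it helps". A minimal Python sketch of that reading follows; the function name `course_grade`, the argument names, and the 0-10 marking scale used in the example are assumptions for illustration and do not come from the course page.

```python
def course_grade(midterm: float, final: float, assignments: float) -> float:
    """Course grade as quoted in the course description.

    The exam component is the better of two weightings:
    (a) mid-term 15% + final 45%, or (b) final alone at 60%.
    The programming assignments always contribute 40%.
    """
    exam_part = max(midterm * 0.15 + final * 0.45, final * 0.6)
    return exam_part + assignments * 0.4


if __name__ == "__main__":
    # Hypothetical marks on an assumed 0-10 scale.
    # Mid-term better than final: weighting (a) wins.
    print(round(course_grade(midterm=8, final=6, assignments=7), 2))
    # Mid-term worse than final: weighting (b) wins.
    print(round(course_grade(midterm=4, final=6, assignments=7), 2))
```

Under this reading, a weak mid-term never lowers the grade below what the final exam alone would give, since the `max` simply ignores the first weighting whenever the mid-term mark is below the final mark.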
  • Multimedia Corpora (Media Encoding and Annotation) (Thomas Schmidt, Kjell Elenius, Paul Trilsbeek)
    Multimedia Corpora (Media encoding and annotation)
    (Thomas Schmidt, Kjell Elenius, Paul Trilsbeek)
    Draft submitted to CLARIN WG 5.7. as input to CLARIN deliverable D5.C-3 “Interoperability and Standards” [http://www.clarin.eu/system/files/clarin-deliverable-D5C3_v1_5-finaldraft.pdf]

    Table of Contents
    1 General distinctions / terminology
      1.1 Different types of multimedia corpora: spoken language vs. speech vs. phonetic vs. multimodal corpora vs. sign language corpora
      1.2 Media encoding vs. Media annotation
      1.3 Data models/file formats vs. Transcription systems/conventions
      1.4 Transcription vs. Annotation / Coding vs. Metadata
    2 Media encoding
      2.1 Audio encoding
      2.2
  • Conference Abstracts
    EIGHTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION
    Held under the Patronage of Ms Neelie Kroes, Vice-President of the European Commission, Digital Agenda Commissioner
    MAY 23-24-25, 2012
    ISTANBUL LÜTFI KIRDAR CONVENTION & EXHIBITION CENTRE, ISTANBUL, TURKEY
    CONFERENCE ABSTRACTS
    Editors: Nicoletta Calzolari (Conference Chair), Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis.
    Assistant Editors: Hélène Mazo, Sara Goggi, Olivier Hamon
    © ELRA – European Language Resources Association. All rights reserved.

    Title: LREC 2012 Conference Abstracts
    Distributed by: ELRA – European Language Resources Association, 55-57, rue Brillat Savarin, 75013 Paris, France
    Tel.: +33 1 43 13 33 33
    Fax: +33 1 43 13 33 30
    www.elra.info and www.elda.org
    Email: [email protected] and [email protected]
    Copyright by the European Language Resources Association
    ISBN 978-2-9517408-7-7
    EAN 9782951740877
    All rights reserved. No part of this book may be reproduced in any form without the prior permission of the European Language Resources Association

    Introduction of the Conference Chair
    Nicoletta Calzolari
    I wish first to express to Ms Neelie Kroes, Vice-President of the European Commission, Digital Agenda Commissioner, the gratitude of the Program Committee and of all LREC participants for her Distinguished Patronage of LREC 2012. Even if every time I feel we have reached the top, this 8th LREC is continuing the tradition of breaking previous records: this edition we received 1013 submissions and have accepted 697 papers, after reviewing by the impressive number of 715 colleagues.
  • Gold Standard Annotations for Preposition and Verb Sense With
    Gold Standard Annotations for Preposition and Verb Sense with Semantic Role Labels in Adult-Child Interactions
    Lori Moon (University of Illinois at Urbana-Champaign), Christos Christodoulopoulos (Amazon Research), Cynthia Fisher (University of Illinois at Urbana-Champaign), Sandra Franco (Intelligent Medical Objects, Northbrook, IL, USA), Dan Roth (University of Pennsylvania)

    Abstract
    This paper describes the augmentation of an existing corpus of child-directed speech. The resulting corpus is a gold-standard labeled corpus for supervised learning of semantic role labels in adult-child dialogues. Semantic role labeling (SRL) models assign semantic roles to sentence constituents, thus indicating who has done what to whom (and in what way). The current corpus is derived from the Adam files in the Brown corpus (Brown, 1973) of the CHILDES corpora, and augments the partial annotation described in Connor et al. (2010). It provides labels for both semantic arguments of verbs and semantic arguments of prepositions. The semantic role labels and senses of verbs follow Propbank guidelines (Kingsbury and Palmer, 2002; Gildea and Palmer, 2002; Palmer et al., 2005) and those for prepositions follow Srikumar and Roth (2011). The corpus was annotated by two annotators. Inter-annotator agreement is given separately for prepositions and verbs, and for adult speech and child speech. Overall, across child and adult samples, including verbs and prepositions, the κ score for sense is 72.6, for the number of semantic-role-bearing arguments, the κ score is 77.4, for identical semantic role labels on a given argument, the κ score is 91.1, for the span of semantic role labels, and the κ for agreement is 93.9.
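The κ scores reported in the abstract above measure agreement between two annotators after correcting for agreement expected by chance. The abstract does not say which κ variant was used, so the following is only a minimal sketch of Cohen's kappa for two annotators, one common choice; the function name `cohen_kappa` and the toy role labels are hypothetical, and the scaling to a 0-100 range is an assumption made to match the way the abstract reports its figures.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(labels_a: Sequence[Hashable], labels_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two annotators labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected if both annotators labelled at
    random according to their own label frequencies.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")

    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label everywhere.
        return 1.0
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Toy semantic-role labels for four arguments (hypothetical data).
    annotator_1 = ["A0", "A1", "A1", "A0"]
    annotator_2 = ["A0", "A1", "A0", "A0"]
    kappa = cohen_kappa(annotator_1, annotator_2)
    # Scaled by 100 on the assumption that the abstract reports percentages.
    print(f"kappa = {kappa:.3f} ({kappa * 100:.1f} on a 0-100 scale)")
```

In the toy data the annotators agree on three of four items, but half of that agreement is expected by chance given their label frequencies, which is why κ comes out lower than the raw agreement rate.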
  • A Massively Parallel Corpus: the Bible in 100 Languages
    Lang Resources & Evaluation, DOI 10.1007/s10579-014-9287-y
    ORIGINAL PAPER
    A massively parallel corpus: the Bible in 100 languages
    Christos Christodouloupoulos · Mark Steedman
    © The Author(s) 2014. This article is published with open access at Springerlink.com

    Abstract
    We describe the creation of a massively parallel corpus based on 100 translations of the Bible. We discuss some of the difficulties in acquiring and processing the raw material as well as the potential of the Bible as a corpus for natural language processing. Finally we present a statistical analysis of the corpora collected and a detailed comparison between the English translation and other English corpora.

    Keywords: Parallel corpus · Multilingual corpus · Comparative corpus linguistics

    1 Introduction
    Parallel corpora are a valuable resource for linguistic research and natural language processing (NLP) applications. One of the main uses of the latter kind is as training material for statistical machine translation (SMT), where large amounts of aligned data are standardly used to learn word alignment models between the lexica of two languages (for example, in the Giza++ system of Och and Ney 2003). Another interesting use of parallel corpora in NLP is projected learning of linguistic structure. In this approach, supervised data from a resource-rich language is used to guide the unsupervised learning algorithm in a target language. Although there are some techniques that do not require parallel texts (e.g. Cohen et al. 2011), the most successful models use sentence-aligned corpora (Yarowsky and Ngai 2001; Das and Petrov 2011).

    C. Christodouloupoulos, Department of Computer Science, UIUC, 201 N.
  • Corpus Linguistics: a Practical Introduction
    Corpus Linguistics: A Practical Introduction
    Nadja Nesselhauf, October 2005 (last updated September 2011)

    1) Corpus Linguistics and Corpora
       - What is corpus linguistics (I)?
       - What data do linguists use to investigate linguistic phenomena?
       - What is a corpus?
       - What is corpus linguistics (II)?
       - What corpora are there?
       - What corpora are available to students of English at the University of Heidelberg?
    2) Corpus Software
       - What software is there to perform linguistic analyses on the basis of corpora?
       - What can the software do?
       - A brief introduction to an online search facility (BNC)
       - A step-by-step introduction to WordSmith Tools
    3) Exercises (I and II)
       - I Using the WordList function of WordSmith
       - II Using the Concord function of WordSmith
    4) How to conduct linguistic analyses on the basis of corpora: two examples
       - Example 1: Australian English vocabulary
       - Example 2: Present perfect and simple past in British and American English
       - What you have to take into account when performing a corpus-linguistic analysis
    5) Exercises (III)
       - Exercise III.1
       - Exercise III.2
    6) Where to find further information on corpus linguistics

    1) Corpus Linguistics and Corpora
    What is corpus linguistics (I)?
    Corpus linguistics is a method of carrying out linguistic analyses. As it can be used for the investigation of many kinds of linguistic questions and as it has been shown to have the potential to yield highly interesting, fundamental, and often surprising new insights about language, it has become one of the most widespread methods of linguistic investigation in recent years.
  • The Relationship Between Transitivity and Caused Events in the Acquisition of Emotion Verbs
    Love Is Hard to Understand: The Relationship Between Transitivity and Caused Events in the Acquisition of Emotion Verbs
    The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters.
    Citation: Hartshorne, Joshua K., Amanda Pogue, and Jesse Snedeker. 2014. Love Is Hard to Understand: The Relationship Between Transitivity and Caused Events in the Acquisition of Emotion Verbs. Journal of Child Language (June 19): 1–38.
    Published Version: doi:10.1017/S0305000914000178
    Accessed: January 17, 2017 12:55:19 PM EST
    Citable Link: http://nrs.harvard.edu/urn-3:HUL.InstRepos:14117738
    Terms of Use: This article was downloaded from Harvard University's DASH repository, and is made available under the terms and conditions applicable to Open Access Policy Articles, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#OAP

    Running head: TRANSITIVITY AND CAUSED EVENTS
    Joshua K. Hartshorne (Massachusetts Institute of Technology; Harvard University), Amanda Pogue (University of Waterloo), Jesse Snedeker (Harvard University)
    In press at Journal of Child Language

    Acknowledgements: The authors wish to thank Timothy O’Donnell for assistance with the corpus analysis as well as Alfonso Caramazza, Susan Carey, Steve Pinker, Mahesh Srinivasan, Nathan Winkler-Rhoades, Melissa Kline, Hugh Rabagliati, members of the Language and Cognition workshop, and three anonymous reviewers for comments and discussion. This material is based on work supported by a National Defense Science and Engineering Graduate Fellowship to JKH and a grant from the National Science Foundation to Jesse Snedeker (0623845).
  • A New Venture in Corpus-Based Lexicography: Towards a Dictionary of Academic English
    A New Venture in Corpus-Based Lexicography: Towards a Dictionary of Academic English
    Iztok Kosem and Ramesh Krishnamurthy

    1. Introduction
    This paper asserts the increasing importance of academic English in an increasingly Anglophone world, and looks at the differences between academic English and general English, especially in terms of vocabulary. The creation of wordlists has played an important role in trying to establish the academic English lexicon, but these wordlists are not based on appropriate data, or are implemented inappropriately. There is as yet no adequate dictionary of academic English, and this paper reports on new efforts at Aston University to create a suitable corpus on which such a dictionary could be based.

    2. Academic English
    The increasing percentage of academic texts published in English (Swales, 1990; Graddol, 1997; Cargill and O’Connor, 2006) and the increasing numbers of students (both native and non-native speakers of English) at universities where English is the language of instruction (Graddol, 2006) testify to the important role of academic English. At the same time, research has shown that there is a significant difference between academic English and general English. The research has focussed mainly on vocabulary: the lexical differences between academic English and general English have been thoroughly discussed by scholars (Coxhead and Nation, 2001; Nation, 2001, 1990; Coxhead, 2000; Schmitt, 2000; Nation and Waring, 1997; Xue and Nation, 1984), and Coxhead and Nation (2001: 254–56) list the following four distinguishing features of academic vocabulary: “1. Academic vocabulary is common to a wide range of academic texts, and generally not so common in non-academic texts.
  • Distributional Properties of Verbs in Syntactic Patterns Liam Considine
    Early Linguistic Interactions: Distributional Properties of Verbs in Syntactic Patterns Liam Considine The University of Michigan Department of Linguistics April 2012 Advisor: Nick Ellis Acknowledgements: I extend my sincerest gratitude to Nick Ellis for agreeing to undertake this project with me. Thank you for cultivating, and partaking in, some of the most enriching experiences of my undergraduate education. The extensive time and energy you invested here has been invaluable to me. Your consistent support and amicable demeanor were truly vital to this learning process. I want to thank my second reader Ezra Keshet for consenting to evaluate this body of work. Other thanks go out to Sarah Garvey for helping with precision checking, and Jerry Orlowski for his R code. I am also indebted to Mary Smith and Amanda Graveline for their participation in our weekly meetings. Their presence gave audience to the many intermediate challenges I faced during this project. I also need to thank my roommate Sean and all my other friends for helping me balance this great deal of work with a healthy serving of fun and optimism. Abstract: This study explores the statistical distribution of verb type-tokens in verb-argument constructions (VACs). The corpus under investigation is made up of longitudinal child language data from the CHILDES database (MacWhinney 2000). We search a selection of verb patterns identified by the COBUILD pattern grammar project (Francis, Hunston, Manning 1996), these include a number of verb locative constructions (e.g. V in N, V up N, V around N), verb object locative caused-motion constructions (e.g.