BAAL2017 Book of Abstracts 0

Total Page:16

File Type:pdf, Size:1020Kb

BAAL2017 Book of Abstracts 0 BAAL2017 Book of Abstracts 0 BAAL2017 Book of Abstracts 1 BAAL2017 Book of Abstracts BAAL2017 would like to thank our Sponsors 2 BAAL2017 Book of Abstracts Contents Welcome message .................................................................................................................................. 4 Local Organising Committee and PG student volunteers ....................................................................... 5 General information ................................................................................................................................. 6 Events and entertainment ....................................................................................................................... 7 Health and first aid .................................................................................................................................. 8 Travel and transport ................................................................................................................................ 9 Banks and shops ................................................................................................................................... 10 Pre-booked campus accommodation ................................................................................................... 11 The British Council ................................................................................................................................ 13 Plenaries ............................................................................................................................................... 14 Papers and posters ............................................................................................................................... 18 Colloquia and roundtable .................................................................................................................... 182 Index of presenters ............................................................................................................................. 198 3 BAAL2017 Book of Abstracts A welcome message from the Local Organising Committee Welcome to BAAL2017! The School of Education and the Centre for Language Education Research are delighted to welcome you to the University of Leeds for the 2017 Annual Meeting of the British Association for Applied Linguistics. We would like to extend a special welcome to our plenary speakers, Bencie Woll, Henry Widdowson and Mohamad Daoud. This volume contains the abstracts for the papers which will be presented over the course of the three days, as well as other important information about our conference, the Leeds campus, and the city of Leeds. We had an unprecedented volume of submissions for BAAL 2017, and have assembled a programme of very high quality papers, some individual, and some organised into thematic tracks associated with the Special Interest Groups of BAAL. There will also be poster presentations, and exhibitions of material from arts-based Applied Linguistics projects. In addition we are pleased to be hosting four colloquia, including a Local Organising Committee invited colloquium on creative inquiry and Applied Linguistics. This, as with many of the presentations, engages with the theme of the conference – Diversity and Applied Linguistics – enabling a conversation about the notion of diversity both of and in Applied Linguistics, in a diversifying world. Also speaking to the theme is the BAAL Executive Committee invited roundtable. The speakers, all former Chairs of our Association, have been asked to focus on Applied Linguistics' engagement with policy and practice. The roundtable and the plenary presentations will be filmed, thanks to the generosity of the British Council, who will also be filming interviews with plenary and roundtable speakers. The conference also includes the familiar range of activities, including the Postgraduate Forum and the announcement of the results of the BAAL Book Prize. 2017 marks the 50th anniversary of the inaugural meeting of the British Association for Applied Linguistics, and we are celebrating with a th social programme which includes a 50 Anniversary party on Friday night. We will be welcoming BAAL PGR/Early Career Scholarship winners to the event, as well as the winner of the Chris Brumfit International Scholarship, Rosmawati. We will also be awarding the prize for the best poster, and the Richard Pemberton prize for the best PGR student presentation. We will be using social media to communicate about the conference as it goes on, on our Facebook page /Baal2017 and our Twitter feed @BAAL2017 – using the hashtag #BAAL2017. Please feel free to join us online. Finally, we would like to take this opportunity to thank the many sponsors and supporters of our conference. We wish you a successful and enjoyable BAAL2017! James Simpson, Chair of the Local Organising Committee University of Leeds August 2017 4 BAAL2017 Book of Abstracts Local Organising Committee and Postgraduate Volunteers Local Organising Committee Postgraduate Volunteers Aisha Walker Taguhi Sahakyan Nindya Utimadini Anthony Lowe Mukrim Thamrin Dwi Nugroho Amanda Howard Natalie Donohue Qian Zhang Clare Wright Fereshteh Goshtasbpour Abdulqader Alawadhi Elisabetta Adami Yan Chu Amal Basheikh Emilee Moore Dung Cao Cristobal Torres James Simpson Jin A Kim Vivian Valdivia Jessica Bradley Sani Yantandu Uba Pamela Manriquez Zuniga Julia Snell Alyona Ryazanova Zhijie Wang Lou Harvey Loreto Aliaga-Salas Katherine Spowage Louise Williams Zuraidah Ismail Neveen Al Saeed Melinda Whong Stephania Albert Dangyi Shu Mohammad Ahmadian Handan Lu Natasha Rahanu Liza Ramli Richard Badger Geoffrey Nsanja Taguhi Sahakyan Rumana Hossain Zuraidah Ismail Syakima Ibrahim Loreto Aliaga-Salas Uma Maniam The Local Organising Committee would like to thank Natasha Rahanu and Anthony Lowe of Meet in Leeds most sincerely for their ongoing and dedicated support of the conference. 5 BAAL2017 Book of Abstracts General Information Conference Registration The Conference is being held at Parkinson Building and The Baines Wing. The full address is: Parkinson Court, Woodhouse Lane, University of Leeds,LS2 9JT Registration will be from 09:00 on Thursday 31st August 2017 in Parkinson Court South. Refreshment Breaks and Lunches Refreshment breaks and lunches during the conference will be served in Parkinson Court South. Lunch will include a finger buffet with sandwiches, savoury items, and a mixture of cake and fruit served with mineral water and juice. WiFi Access Delegates will be provided with a username and password to access the WiFi access upon registration. This can be used throughout the University Campus, including accommodation, and will enable you to access the Meet In Leeds network. Instructions are issued with the Usernames and Passwords. If your institution is a member of Eduroam you will also be able to use that service on the University Campus. Please read the Terms of Service carefully to ensure that your browsing and internet usage complies with University regulations. Photocopying and Printing A media services shop is open between 9.00am and 16.30pm, Monday to Friday, on the ground floor of the Roger Stevens building where there are photocopying facilities and other audio visual services available. Poster Sessions The posters will be displayed in Parkinson Court. Publishers’ Stands Publishers’ stands will be available to view in Parkinson Court South, alongside the Catering provision. Telephone Numbers Conference and Events: +44 (0) 113 343 6106 In the event of any serious problems, or for emergencies, please contact University of Leeds Security on 0113 343 5494 (24-hours). The emergency number in the UK for fire, ambulance or police is 999. 6 BAAL2017 Book of Abstracts Events and Entertainment 50th Anniversary Conference Party The 50th Anniversary Conference party will be held at the University Refectory from 18:00 on Friday 1st September. There will be music to dance to: A Bit of the Other with The Klatsh. The Refectory is located on the Main University Precinct, next door to the Student Union, less than a 5 minute walk from Parkinson Court. Prosecco, Pie & Peas! – Book Prize The book prize will be held on Thursday 31st August in Parkinson Court North from 18:00. A drinks reception, pie & pea dinner will be available. Howarth Trip The post conference trip to Howarth will depart from Parkinson Steps at 13:30 on Saturday 2nd September. It will return from the Bronte Parsonage Museum at 17:30 and arrive back at the University at approximately 18:30. Sports Facilities Delegates staying on campus accommodation can benefit from free access to The Edge, the University’s new pool, gym and fitness suite, for the duration of the conference. Built for use by visiting athletes prior to the London 2012 Summer Olympics, there is a 25m, 8-lane swimming pool, the largest fitness suite of any UK university, 3-designated class studios, squash courts and over 100 classes per week. Please visit sport.leeds.ac.uk for more information 7 BAAL2017 Book of Abstracts Health and First Aid If first aid is required on campus please contact a member of staff in the building or for emergencies call Security via an internal telephone on x32222 or externally on +44(0)113 343 2222 - available 24- hours. Hospital The nearest emergency department is at the Leeds General Infirmary, telephone 0113 2432799, which is situated adjacent to the University. Chemists Lloyds Chemist is situated on Woodhouse Lane across from the Parkinson Building.
Recommended publications
  • FERSIWN GYMRAEG ISOD the National Corpus of Contemporary
    FERSIWN GYMRAEG ISOD The National Corpus of Contemporary Welsh Project Report, October 2020 Authors: Dawn Knight1, Steve Morris2, Tess Fitzpatrick2, Paul Rayson3, Irena Spasić and Enlli Môn Thomas4. 1. Introduction 1.1. Purpose of this report This report provides an overview of the CorCenCC project and the online corpus resource that was developed as a result of work on the project. The report lays out the theoretical underpinnings of the research, demonstrating how the project has built on and extended this theory. We also raise and discuss some of the key operational questions that arose during the course of the project, outlining the ways in which they were answered, the impact of these decisions on the resource that has been produced and the longer-term contribution they will make to practices in corpus-building. Finally, we discuss some of the applications and the utility of the work, outlining the impact that CorCenCC is set to have on a range of different individuals and user groups. 1.2. Licence The CorCenCC corpus and associated software tools are licensed under Creative Commons CC-BY-SA v4 and thus are freely available for use by professional communities and individuals with an interest in language. Bespoke applications and instructions are provided for each tool (for links to all tools, refer to section 10 of this report). When reporting information derived by using the CorCenCC corpus data and/or tools, CorCenCC should be appropriately acknowledged (see 1.3). § To access the corpus visit: www.corcencc.org/explore § To access the GitHub site: https://github.com/CorCenCC o GitHub is a cloud-based service that enables developers to store, share and manage their code and datasets.
    [Show full text]
  • Distributional Properties of Verbs in Syntactic Patterns Liam Considine
    Early Linguistic Interactions: Distributional Properties of Verbs in Syntactic Patterns Liam Considine The University of Michigan Department of Linguistics April 2012 Advisor: Nick Ellis Acknowledgements: I extend my sincerest gratitude to Nick Ellis for agreeing to undertake this project with me. Thank you for cultivating, and partaking in, some of the most enriching experiences of my undergraduate education. The extensive time and energy you invested here has been invaluable to me. Your consistent support and amicable demeanor were truly vital to this learning process. I want to thank my second reader Ezra Keshet for consenting to evaluate this body of work. Other thanks go out to Sarah Garvey for helping with precision checking, and Jerry Orlowski for his R code. I am also indebted to Mary Smith and Amanda Graveline for their participation in our weekly meetings. Their presence gave audience to the many intermediate challenges I faced during this project. I also need to thank my roommate Sean and all my other friends for helping me balance this great deal of work with a healthy serving of fun and optimism. Abstract: This study explores the statistical distribution of verb type-tokens in verb-argument constructions (VACs). The corpus under investigation is made up of longitudinal child language data from the CHILDES database (MacWhinney 2000). We search a selection of verb patterns identified by the COBUILD pattern grammar project (Francis, Hunston, Manning 1996), these include a number of verb locative constructions (e.g. V in N, V up N, V around N), verb object locative caused-motion constructions (e.g.
    [Show full text]
  • The National Corpus of Contemporary Welsh 1. Introduction
    The National Corpus of Contemporary Welsh Project Report, October 2020 1. Introduction 1.1. Purpose of this report This report provides an overview of the CorCenCC project and the online corpus resource that was developed as a result of work on the project. The report lays out the theoretical underpinnings of the research, demonstrating how the project has built on and extended this theory. We also raise and discuss some of the key operational questions that arose during the course of the project, outlining the ways in which they were answered, the impact of these decisions on the resource that has been produced and the longer-term contribution they will make to practices in corpus-building. Finally, we discuss some of the applications and the utility of the work, outlining the impact that CorCenCC is set to have on a range of different individuals and user groups. 1.2. Licence The CorCenCC corpus and associated software tools are licensed under Creative Commons CC-BY-SA v4 and thus are freely available for use by professional communities and individuals with an interest in language. Bespoke applications and instructions are provided for each tool (for links to all tools, refer to section 10 of this report). When reporting information derived by using the CorCenCC corpus data and/or tools, CorCenCC should be appropriately acknowledged (see 1.3). § To access the corpus visit: www.corcencc.org/explore § To access the GitHub site: https://github.com/CorCenCC o GitHub is a cloud-based service that enables developers to store, share and manage their code and datasets.
    [Show full text]
  • Book of Abstracts
    Book of Abstracts - Journée d’études AFLiCo JET Corpora and Representativeness jeudi 3 et vendredi 4 mai 2018 Bâtiment Max Weber (W) Keynote speakers Dawn Knight et Thomas Egan Plus d’infos : bit.ly/2FPGxvb 1 Table of contents DAY 1 – May 3rd 2018 Thomas Egan, Some perils and pitfalls of non-representativeness 1 Daniel Henke, De quoi sont représentatifs les corpus de textes traduits au juste ? Une étude de corpus comparable-parallèle 1 Ilmari Ivaska, Silvia Bernardini, Adriano Ferraresi, The comparability paradox in multilingual and multi-varietal corpus research: Coping with the unavoidable 2 Antonina Bondarenko, Verbless Sentences: Advantages and Challenges of a Parallel Corpus-based Approach 3 Adeline Terry, The representativeness of the metaphors of death, disease, and sex in a TV show corpus 5 Julien Perrez, Pauline Heyvaert, Min Reuchamps, On the representativeness of political corpora in linguistic research 6 Joshua M. Griffiths, Supplementing Maximum Entropy Phonology with Corpus Data 8 Emmanuelle Guérin, Olivier Baude, Représenter la variation – Revisiter les catégories et les variétés dans le corpus ESLO 9 Caroline Rossi, Camille Biros, Aurélien Talbot, La variation terminologique en langue de spécialité : pour une analyse à plusieurs niveaux 10 Day 2 – May 4rd 2018 Dawn Knight, Representativeness in CorCenCC: corpus design in minoritised languages 11 Frederick Newmeyer, Conversational corpora: When 'big is beautiful' 12 Graham Ranger, How to get "along": in defence of an enunciative and corpus- based approach 13 Thi Thu Trang
    [Show full text]
  • Proceedings of the 1St Workshop on Sense, Concept and Entity Representations and Their Applications, Pages 1–11, Valencia, Spain, April 4 2017
    SENSE 2017 EACL 2017 Workshop on Sense, Concept and Entity Representations and their Applications Proceedings of the Workshop April 4, 2017 Valencia, Spain c 2017 The Association for Computational Linguistics Order copies of this and other ACL proceedings from: Association for Computational Linguistics (ACL) 209 N. Eighth Street Stroudsburg, PA 18360 USA Tel: +1-570-476-8006 Fax: +1-570-476-0860 [email protected] ISBN 978-1-945626-50-0 ii Preface Welcome to the 1st Workshop on Sense, Concept and Entity Representations and their Applications (SENSE 2017). The aim of SENSE 2017 is to focus on addressing one of the most important limitations of word-based techniques in that they conflate different meanings of a word into a single representation. SENSE 2017 brings together researchers in lexical semantics, and NLP in general, to investigate and propose sense-based techniques as well as to discuss effective ways of integrating sense, concept and entity representations into downstream applications. The workshop is targeted at covering the following topics: Utilizing sense/concept/entity representations in applications such as Machine Translation, • Information Extraction or Retrieval, Word Sense Disambiguation, Entity Linking, Text Classification, Semantic Parsing, Knowledge Base Construction or Completion, etc. Exploration of the advantages/disadvantages of using sense representations over word • representations. Proposing new evaluation benchmarks or comparison studies for sense vector representations. • Development of new sense representation techniques (unsupervised, knowledge-based or hybrid). • Compositionality of senses: learning representations for phrases and sentences. • Construction and use of sense representations for languages other than English as well as • multilingual representations. We received 21 submissions, accepting 15 of them (acceptance rate: 71%).
    [Show full text]
  • Edited Proceedings 1July2016
    UK-CLC 2016 Conference Proceedings UK-CLC 2016 preface Preface UK-CLC 2016 Bangor University, Bangor, Gwynedd, 18-21 July, 2016. This volume contains the papers and posters presented at UK-CLC 2016: 6th UK Cognitive Linguistics Conference held on July 18-21, 2016 in Bangor (Gwynedd). Plenary speakers Penelope Brown (Max Planck Institute for Psycholinguistics, Nijmegen, NL) Kenny Coventry (University of East Anglia, UK) Vyv Evans (Bangor University, UK) Dirk Geeraerts (University of Leuven, BE) Len Talmy (University at Buffalo, NY, USA) Dedre Gentner (Northwestern University, IL, USA) Organisers Chair Thora Tenbrink Co-chairs Anouschka Foltz Alan Wallington Local management Javier Olloqui Redondo Josie Ryan Local support Eleanor Bedford Advisors Christopher Shank Vyv Evans Sponsors and support Sponsors: John Benjamins, Brill Poster prize by Tracksys Student prize by De Gruyter Mouton June 23, 2016 Thora Tenbrink Bangor, Gwynedd Anouschka Foltz Alan Wallington Javier Olloqui Redondo Josie Ryan Eleanor Bedford UK-CLC 2016 Programme Committee Programme Committee Michel Achard Rice University Panos Athanasopoulos Lancaster University John Barnden The University of Birmingham Jóhanna Barðdal Ghent University Ray Becker CITEC - Bielefeld University Silke Brandt Lancaster University Cristiano Broccias Univeristy of Genoa Sam Browse Sheffield Hallam University Gareth Carrol University of Nottingham Paul Chilton Lancaster University Alan Cienki Vrije Universiteit (VU) Timothy Colleman Ghent University Louise Connell Lancaster University Seana Coulson
    [Show full text]
  • The Spoken British National Corpus 2014
    The Spoken British National Corpus 2014 Design, compilation and analysis ROBBIE LOVE ESRC Centre for Corpus Approaches to Social Science Department of Linguistics and English Language Lancaster University A thesis submitted to Lancaster University for the degree of Doctor of Philosophy in Linguistics September 2017 Contents Contents ............................................................................................................. i Abstract ............................................................................................................. v Acknowledgements ......................................................................................... vi List of tables .................................................................................................. viii List of figures .................................................................................................... x 1 Introduction ............................................................................................... 1 1.1 Overview .................................................................................................................................... 1 1.2 Research aims & structure ....................................................................................................... 3 2 Literature review ........................................................................................ 6 2.1 Introduction ..............................................................................................................................
    [Show full text]
  • The Corpus of Contemporary American English As the First Reliable Monitor Corpus of English
    The Corpus of Contemporary American English as the first reliable monitor corpus of English ............................................................................................................................................................ Mark Davies Brigham Young University, Provo, UT, USA ....................................................................................................................................... Abstract The Corpus of Contemporary American English is the first large, genre-balanced corpus of any language, which has been designed and constructed from the ground up as a ‘monitor corpus’, and which can be used to accurately track and study recent changes in the language. The 400 million words corpus is evenly divided between spoken, fiction, popular magazines, newspapers, and academic journals. Most importantly, the genre balance stays almost exactly the same from year to year, which allows it to accurately model changes in the ‘real world’. After discussing the corpus design, we provide a number of concrete examples of how the corpus can be used to look at recent changes in English, including morph- ology (new suffixes –friendly and –gate), syntax (including prescriptive rules, Correspondence: quotative like, so not ADJ, the get passive, resultatives, and verb complementa- Mark Davies, Linguistics and tion), semantics (such as changes in meaning with web, green, or gay), and lexis–– English Language, Brigham including word and phrase frequency by year, and using the corpus architecture Young University.
    [Show full text]
  • Language, Communities & Mobility
    The 9th Inter-Varietal Applied Corpus Studies (IVACS) Conference Language, Communities & Mobility University of Malta June 13 – 15, 2018 IVACS 2018 Wednesday 13th June 2018 The conference is to be held at the University of Malta Valletta Campus, St. Paul’s Street, Valletta. 12:00 – 12:45 Conference Registration 12:45 – 13:00 Conferencing opening welcome and address University of Malta, Dean of the Faculty of Arts: Prof. Dominic Fenech Room: Auditorium 13:00 – 14:00 Plenary: Dr Robbie Love, University of Leeds Overcoming challenges in corpus linguistics: Reflections on the Spoken BNC2014 Room: Auditorium Rooms Auditorium (Level 2) Meeting Room 1 (Level 0) Meeting Room 2 (Level 0) Meeting Room 3 (Level 0) 14:00 – 14:30 Adverb use in spoken ‘Yeah, no, everyone seems to On discourse markers in Lexical Bundles in the interaction: insights and be saying that’: ‘New’ Lithuanian argumentative Description of Drug-Drug implications for the EFL pragmatic markers as newspaper discourse: a Interactions: A Corpus-Driven classroom represented in fictionalized corpus-based study Study - Pascual Pérez-Paredes & Irish English - Anna Ruskan - Lukasz Grabowski Geraldine Mark - Ana Mª Terrazas-Calero & Carolina Amador-Moreno 1 14:30 – 15:00 “Um so yeah we’re <laughs> So you have to follow The Study of Ideological Bias Using learner corpus data to you all know why we’re different methods and through Corpus Linguistics in inform the development of a here”: hesitation in student clearly there are so many Syrian Conflict News from diagnostic language tool and
    [Show full text]
  • Corpus-Driven Lexicon of MSA 13
    Automatic Lexical Resource Acquisition for Constructing an LMF- Compatible Lexicon of Modern Standard Arabic Mohammed Attia, Lamia Tounsi, Josef van Genabith Abstract We describe the design and implementation of large-scale data processing techniques for the automatic acquisition of lexical resources for Modern Standard Arabic (MSA) from annotated and un-annotated corpora and demonstrate their usefulness for creating a wide-coverage, general- domain lexicon. Modern lexicographic principles (Atkins and Rundell, 2008) emphasise that the corpus is the only viable evidence that a lexical entry still exists in a speech community. Unlike most available Arabic lexicons which are based on previous historical (and dated) dictionaries, our lexicon starts off with corpora of contemporary texts. The lexicon is encoded in Lexical Markup Framework (LMF) which is a metamodel that provides a standardized framework for the construction of electronic lexical resources. The aim of LMF is to optimize the production and extension of electronic lexicons and to facilitate the exchange of data between all aspects of lexical resources and the interoperability with NLP applications. This lexical resource will serve as the core of our Arabic annotation tools: morphological analysis, tokenization, diacritization and base phrase chunking. 1 Introduction A lexicon lies at the heart of most morphological analysers of Arabic (Dichy and Fargaly, 2003; Attia, 2006; Buckwalter. 2002; Beesley, 2001). The quality and coverage of the lexical database will determine the quality and coverage of the morphological analyser, and any limitations found in the database will make their way through to the morphological analyser. The literature abounds with discussions about the design of a morphological analyser, yet little effort has gone into the investigation of the nature of the database at the core of all these systems, and what design decisions have been taken in their development.
    [Show full text]
  • Book of Abstracts
    xx Table of Contents Session O1 - Machine Translation & Evaluation . 1 Session O2 - Semantics & Lexicon (1) . 2 Session O3 - Corpus Annotation & Tagging . 3 Session O4 - Dialogue . 4 Session P1 - Anaphora, Coreference . 5 Session: Session P2 - Collaborative Resource Construction & Crowdsourcing . 7 Session P3 - Information Extraction, Information Retrieval, Text Analytics (1) . 9 Session P4 - Infrastructural Issues/Large Projects (1) . 11 Session P5 - Knowledge Discovery/Representation . 13 Session P6 - Opinion Mining / Sentiment Analysis (1) . 14 Session P7 - Social Media Processing (1) . 16 Session O5 - Language Resource Policies & Management . 17 Session O6 - Emotion & Sentiment (1) . 18 Session O7 - Knowledge Discovery & Evaluation (1) . 20 Session O8 - Corpus Creation, Use & Evaluation (1) . 21 Session P8 - Character Recognition and Annotation . 22 Session P9 - Conversational Systems/Dialogue/Chatbots/Human-Robot Interaction (1) . 23 Session P10 - Digital Humanities . 25 Session P11 - Lexicon (1) . 26 Session P12 - Machine Translation, SpeechToSpeech Translation (1) . 28 Session P13 - Semantics (1) . 30 Session P14 - Word Sense Disambiguation . 33 Session O9 - Bio-medical Corpora . 34 Session O10 - MultiWord Expressions . 35 Session O11 - Time & Space . 36 Session O12 - Computer Assisted Language Learning . 37 Session P15 - Annotation Methods and Tools . 38 Session P16 - Corpus Creation, Annotation, Use (1) . 41 Session P17 - Emotion Recognition/Generation . 43 Session P18 - Ethics and Legal Issues . 45 Session P19 - LR Infrastructures and Architectures . 46 xxi Session I-O1: Industry Track - Industrial systems . 48 Session O13 - Paraphrase & Semantics . 49 Session O14 - Emotion & Sentiment (2) . 50 Session O15 - Semantics & Lexicon (2) . 51 Session O16 - Bilingual Speech Corpora & Code-Switching . 52 Session P20 - Bibliometrics, Scientometrics, Infometrics . 54 Session P21 - Discourse Annotation, Representation and Processing (1) . 55 Session P22 - Evaluation Methodologies .
    [Show full text]
  • Preparation and Analysis of Linguistic Corpora
    Preparation and Analysis of Linguistic Corpora The corpus is a fundamental tool for any type of research on language. The availability of computers in the 1950’s immediately led to the creation of corpora in electronic form that could be searched automatically for a variety of language features and compute frequency, distributional characteristics, and other descriptive statistics. Corpora of literary works were compiled to enable stylistic analyses and authorship studies, and corpora representing general language use became widely used in the field of lexicography. In this era, the creation of an electronic corpus required entering the material by hand, and the storage capacity and speed of computers available at the time put limits on how much data could realistically be analyzed at any one time. Without the Internet to foster data sharing, corpora were typically created, and processed at a single location. Two notable exceptions are the Brown Corpus of American English (Francis and Kucera, 1967) and the London/Oslo/Bergen (LOB) corpus of British English (Johanssen et al., 1978); both of these corpora, each containing one millions words of data tagged for part of speech, were compiled in the 1960’s using a representative sample of texts produced in the year 1961. For several years, the Brown and LOB were the only widely available computer-readable corpora of general language, and therefore provided the data for numerous language studies. In the 1980’s, the speed and capacity of computers increased dramatically, and, with more and more texts being produced in computerized form, it became possible to create corpora much larger than the Brown and LOB, containing millions of words.
    [Show full text]