This article was downloaded by: 10.3.98.104 On: 29 Sep 2021 Access details: subscription number Publisher: Routledge Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: 5 Howick Place, London SW1P 1WG, UK

The Routledge Handbook of English Language and Digital Humanities

Svenja Adolphs, Dawn Knight

Written corpora

Publication details https://www.routledgehandbooks.com/doi/10.4324/9781003031758-3 Sheena Gardner, Emma Moreton Published online on: 05 May 2020

How to cite: Sheena Gardner, Emma Moreton. 05 May 2020, Written corpora, from: The Routledge Handbook of English Language and Digital Humanities, Routledge. Accessed on: 29 Sep 2021. https://www.routledgehandbooks.com/doi/10.4324/9781003031758-3




Introduction

The humanities has always been concerned with written texts, but the digital revolution has allowed us to electronically store, annotate and analyse ever-increasing amounts of data in new and creative ways. Indeed, the digital humanities (using digital tools and techniques, including corpus linguistic methods to address humanities-related research questions) offers an opportunity to bring together scholars from across the disciplines to look at ways of harnessing the power of new technologies in the study of humanities data. Here we focus on corpora of written texts.

Collections of written texts, or archives, which are simply assembled and may or may not be indexed and catalogued, are differentiated from corpora, which involve a careful sampling of texts that are representative, for instance, of written English and are digitally prepared for analysis. Corpora are generally understood to be 'collections of naturally-occurring language data, stored in electronic form, designed to be representative of particular types of text and analysed with the aid of computer software tools' (Nesi 2016: 206). Corpora include not only the actual texts but also metadata, or information about the production of the texts (e.g. who, where, when, why, about what), which is used to associate the text with its social context and to assist in comparative analyses within and across corpora.

Written corpora can be analysed from a range of quantitative perspectives to uncover patterns that would be hard to detect manually. These might be based on frequency, co-occurrence or relative frequency. For example, frequency data might be sought to answer questions such as 'Which nations do Shakespeare and Dickens refer to most in their writing?' Or co-occurrence queries might be used for more complex questions, such as 'Which images are typically associated with darkness in Shakespearean vs. Dickensian writing?' This latter question might be answered quantitatively in terms of most frequent collocating items, which in turn might simply reflect the nature of English. For example, with a noun such as 'night' we can identify which adjectives most frequently occur and compare the collocation findings for Shakespeare and Dickens. Search parameters can be set to count only those that occur immediately before the noun or to include those that occur up to a certain number of words before and/or after the noun. Alternatively, the question could be answered in relative terms by comparing items that are distinctive, or 'key', in the writing of each author when compared with a general English corpus. These central concepts of frequency, collocation and keyness are widely used and explained in all introductions to corpus linguistics (e.g. Hunston 2002). More complex 'query language' would be needed to answer questions such as 'How are women represented in Shakespeare and Dickens?' As the results are unlikely to be as directly comparable, an interpretation of the results will normally involve grouping them into categories, which may be linguistic (e.g. semantic categories or categories of metaphor) or may derive from theories of gender representation or literary criticism.

Research on written corpora involves using tools to access and manipulate the information in corpora in ways that can answer questions, particularly those related to frequency, phraseology and collocation within and across texts, that our intuitions about language or our linear reading of texts cannot. Around this quantitative methodological core, and the more qualitative and interpretive examination of significant or key items (words, phrases or [...]) and the contexts in which they occur, the questions posed and how findings are explained are as varied as research in humanities itself.

Our aim in this chapter is to describe influential written corpora in English. We present a broad categorisation (Table 3.1), followed by an account of historical developments in the field and issues related to corpus contents and the growing statistical sophistication of analytical tools. We then present two small-scale studies: a corpus search for the term humanities in several large, ready-made, freely available corpora and an investigation of features of correspondence in three specialist corpora. The chapter concludes with suggestions for future directions, further reading, related topics and references. Our aim is to demonstrate how written corpora, and corpus methods of analysis, complement and contribute to the multidisciplinary field of digital humanities.

Types of corpus

The categorisation in Table 3.1 is intended to give a broad overview of types of written corpus that exist in English. Similar classifications are found in most introductions to corpus linguistics (e.g. Huber and Mukherjee 2013; Hunston 2002: 14–16; the Lancaster University corpus linguistics MOOC session 2), as well as on websites that offer access to multiple corpora, such as Brigham Young University (http://corpus.byu.edu) and Sketch Engine (www.sketchengine.co.uk/user-guide/user-manual/corpora/corpus-types/).

This classification is not exhaustive; other categories include monolingual (most of the corpora listed are monolingual English), multimodal (e.g. transcripts plus recordings), reference (large general corpora are often used as reference corpora against which smaller bespoke corpora can be compared), pedagogic (e.g. all the receptive texts for a learner: text books, lectures, etc.) and regional (the International Corpus of English [ICE] was designed to compare texts from 20 varieties of English, while the SCOTS corpus focuses on a single variety). Nevertheless, it provides an orientation to the scope of written corpora in English.

The earliest national corpora are the Lancaster-Oslo-Bergen (LOB) corpus of British English and the Brown corpus of American English, both developed in projects that began in the 1960s. These were succeeded by the Bank of English (see information on COBUILD later) in the 1980s and other larger national corpora, specifically the British National Corpus (BNC), developed between 1991 and 1994, and the Corpus of Contemporary American English (COCA), both of which are widely used today. The BNC is valued for its extensive annotation and multiple categories. It reflects an adult educated variety of English with 90 percent written texts, including media, literary and academic sources across domains, and 10 percent transcripts of spoken language. Broadly speaking, there are massive corpora compiled by web crawling. For example, Sketch Engine houses TenTen corpora that aim to be more than 10 billion words each for more than 30 languages and a news corpus of 28 billion words that grows by 800 million words a month (www.sketchengine.eu/timestamped-english-corpus, accessed 17/04/2018). There are large general corpora, such as the BNC and COCA (COCA had over 560 million words in December 2017), that collect a range of genres according to a specific design; there are corpora built on existing archives or collections, such as the Hansard, EUR-LEX, Time Magazine and CLC corpora; and there are corpora that are compiled for specific research purposes, such as the ICLE and British Academic Written English (BAWE) corpora. Corpora also vary in the nature and extent of their annotation or mark-up, and thus how amenable they are to answering specific questions in the humanities. Our two sample analyses later provide some insights into this, as do other chapters in the handbook.

Table 3.1 Types of corpus exemplified

Type: General
Contents: A variety of texts intended to reflect general uses of spoken and written language
Examples: The British National Corpus (BNC), a general corpus of British English; the Corpus of Contemporary American English (COCA), a general corpus of American English

Type: Historical/Diachronic
Contents: From specific periods, or allows comparison by, e.g., decade
Examples: Hansard corpus (1803–2005); Time Magazine corpus (1923–2006); the Siena Bologna/Portsmouth Modern Diachronic Corpus (SiBol/Port), which contains 787,000 articles (385 million tokens) from three UK broadsheet newspapers in 1993, 2005 and 2010

Type: Learner
Contents: Texts produced by learners
Examples: International Corpus of Learner English (ICLE) with 3 million words of argumentative essays; Cambridge Learner Corpus (CLC) with 50 million words from Cambridge exam scripts submitted by 220,000 students from 173 countries

Type: Monitor
Contents: Regularly updated to track changes in a language
Examples: The Bank of English (BoE) (see discussion at http://corpus.byu.edu/coca/compare-boe.asp); News on the Web (NOW) corpus, which dates from 2010 and grows by about 10,000 web articles each day

Type: Parallel
Contents: Texts and their aligned translation(s)
Examples: EUR-LEX, a multilingual parallel corpus of European Union documents translated into the official European languages

Type: Multilingual
Contents: Containing texts in several languages
Examples: European Corpus Initiative (ECI) Multilingual, containing texts (mainly newspapers and fiction) in over 20 languages

Type: Specialist
Contents: Designed for a specific purpose
Examples: British Academic Written English (BAWE), which contains 6.5 million words of assessed university student writing across disciplines and levels of study; Hong Kong Professional Corpora, which specialises in financial services, engineering, governance, etc.; see also 'Small correspondence corpora' in the text

Type: Web
Contents: Created by web crawling, categorised by web domain, typically the most synchronous of corpora
Examples: enTenTen (2013), which contains 19 billion words of English on the web; GloWbE, which contains 1.9 billion words from websites in 20 different English-speaking countries

Background

Research across the humanities has long been informed by the analysis of written texts. Pre-digital studies used techniques that are familiar today. Concordances of texts such as the complete works of Shakespeare and the King James Bible were created manually (see Wisby 1962) by searching for patterns and extracting items with similar functions, references or meanings, and were published as reference books.

In linguistics, Peter Fries (2010) describes Charles Fries's systematic collection, scientific analysis and counting of relative frequencies of paradigmatic features (e.g. intonation patterns associated with yes–no questions) in 'a representative sample of the language of some speech community'. Fries Junior argues that, unlike the transformational linguistics that came to dominate the field, particularly in America, the early twentieth-century linguistics of Firth, Quirk, Fries and others was a natural precursor to current corpus linguistics. This explains Thompson and Hunston's identification of three concerns shared by systemic-functional linguistics and corpus linguistics: a focus on 'naturally occurring language, the importance of context (or contextual variation), and of frequency' (2006: 5).

Similarly in folklore and literature studies, Michael Preston (n.d.) describes the development of computerised concordances:

'Computer-assisted study of folklore and literature was initiated shortly after World War II by Roberto Busa, S.J., who began preparing a concordance to the works of Thomas Aquinas in 1948, and Bertrand Bronson, who made use of the technology to study the traditional tunes of the Child ballads. In the pre-computer era, Bronson had worked with punched cards which he manipulated with a mechanical sorter and a card-printer. Many early efforts at producing concordances by computer were modeled on the punched card/sorter/reader-printer process, itself an attempt at mechanizing the writing of slips by hand which were then manually sorted.'

What have changed are the speed and ease with which we can now collate large quantities of written texts and, when they are digitally transcribed and marked up, the speed and ease with which we can analyse them.

Two of the earliest digital corpus projects were the Brown Corpus of American English (www.helsinki.fi/varieng/CoRD/corpora/BROWN/ accessed 17/04/2018) and its counterpart, the LOB corpus of British English. Developed in the 1960s, these each contained 500 texts of around 2000 words distributed across 15 text categories. This inspired the compilation of further corpora with the same design that enabled ready comparison across varieties of English. Such developments whetted researchers' appetites to develop bigger and more diverse corpora.

An ambitious project was conducted at the University of Birmingham under the leadership of John Sinclair. The project involved not only the construction of a very large corpus of contemporary English (it is currently around 450 million words; www.titania.bham.ac.uk/docs/svenguide.html) but also the development of methods and approaches to corpus analysis that changed the way we work with English text and still pervade the field today. The annual Sinclair Lectures regularly remind us not only of the achievements of the COBUILD project (see later), and the related advances in lexicography in particular, but also that when it started in the 1980s, it was the English Department, and his project, that had the largest and most expensive computer on campus. The project was funded by Collins and created to inform the making of dictionaries, grammar books and related teaching materials.

In the last 50 years, we have moved from mainframe computers, with data entered on file cards, reminiscent of the handwritten cards used by Charles Fries. In the 1980s these were replaced by desktop computers, which allowed individual researchers to develop and analyse their own digital corpora using concordancing software, and today many hundreds of digital corpora of written texts are increasingly available for online use with sophisticated and user-friendly software programmes by students from mobile devices.

Critical issues and topics

Much humanities research involves analysing and interpreting texts, and it is therefore not surprising that researchers are attracted to digital analyses of written corpora to seek evidence in or interpretation of texts to support their theories. There are many decisions to be made in compiling a corpus, however, and when a corpus has been designed for one purpose, it may not be appropriate for use in different contexts. As the number of publicly available corpora increases, the methods used to compile them come under increasing scrutiny. When the COBUILD project began, the aim was to collect large numbers of words, and so novels and academic journals were an easy source of written data. Equally, when an academic corpus was being developed by Averil Coxhead in order to identify items for an academic word list (AWL) (www.victoria.ac.nz/lals/resources/academicwordlist/information/corpus), corpus collection was heavily influenced by the textbooks at one university and is biased more towards business than sciences, for instance. This and other critiques have led to alternative corpora to explore in the production of new AWLs (e.g. see review in Liu 2012). Nevertheless, both the COBUILD and the AWL projects have significantly influenced English language teaching and have inspired the development of more 'representative' corpora.

There are, of course, issues of what is 'representative', and perhaps a more realistic goal for many is to develop a 'balanced' corpus. By identifying particular features of interest in advance (e.g. male vs. female, humanities vs. sciences, quantitative vs. qualitative research, correspondence to subordinates, to superiors, or peers), it should be possible to balance the corpus in these respects. It will never be possible to have corpora that are balanced for all situational and contextual variables. So, corpus developers should be clear about their research goals when they are developing the corpus, and corpus users should seek to understand not only the composition of the corpus but also the rationale for its construction, which features are 'balanced', and in what ways.

A third critical issue relates to the statistical sophistication needed to interrogate and use corpus findings. The statistics exemplified in areas such as multidimensional analysis (Biber 1988) and the use of R (Gries 2013) are crucial to the development of corpus linguistics as a field, but these are not traditional strengths of researchers in humanities, and the extent to which those investigating corpus resources need to understand, or teach, the statistics behind the findings is a moot point.

Downloaded By: 10.3.98.104 At: 12:17 29 Sep 2021; For: 9781003031758, chapter3, 10.4324/9781003031758-3 development anddisseminationofideasthespread political news. and theseventeenth during eighteenth centuries to explore how suchnetworks facilitated, amongst other things, the andAmerica Europe within academies’ ‘scientific between Oxford University’s Cultures ofKnowledge including partners various with collaboration in University, Stanford at project of Letters ence collections can be used for linguistic analysis. For example, the correspond- digitised these work, pre-processing basic some With source. data primary a for explore, to instance, expressionsofpoliteness,spellingvariationandlexical change. material this used have Helsinki) of University the at English, in Change 162). Researchers at VARIENG (the Research Unit for the Study of Variation, Contacts and 2007: Nevalainen and (Raumolin-Brunberg religion and education rank, sex, author’s the in asystematisedway, allowinguserstoexplorelanguageinrelation to variablessuchas excellent example of howsociobiographic and extra-linguistic metadata might be captured 188 letter collections (12,000 letters) dating from c. 1403 to 1800. The resource serves as an of EarlyEnglish Correspondence privatecorrespondence. narrativesand diaries, first-person documents: are ego TheCorpus consisting of349completenewspaperissuescontaining1.6 millionwords. files (369 texts) totalling 463,009 words 185 – (CHELAR) recently released CorpusofHistoricalEnglishLawReports1535–1999 the include corpora Genre-specific century. eighteenth the of end the to 1150 from dating taining arangeofgenres(includingprayers,sermons,treatisesandreligiousbiographies), the and 1810s–2000s, the from dating words lion nelcul eeomn ad oil ewr, u aotVcoin cec ad oit in society general’. 
and Finally, the science Victorian about but network, social and development intellectual own his about only ‘not information providing Darwin, Charles by letters 15,000 roughly win Correspondence Project the and period modern early the from documents and letters 69,000 over containing domains: diachronicEnglishlinguisticsandlanguageteachingmaterials. the asymmetricalwayspeoplearerepresentedinpress’ (2010:99). instance, For and meanings analysis. hidden ‘deconstruct help to corpora use (2010) Moon discourse and ­Caldas-Coulthard (critical) and acquisition et language Frankenberg-Garcia second (e.g. translation, teaching and learning of theories language inform of They and research. language language English of areas many in used are Corpora Current contributionsandresearch (COHA), which includes over 400 over includes include the CorpusofHistorical which (COHA), American English corpora diachronic multi-genre Other 1600–1999’. period the covering English American et (Biber Finegan and Biber by 1990s the Initially constructedin (ARCHER). of Historical English Registers In termsofhistorical written corpora,anotablestartingpointisA Representative Corpus Diachronic/historical corpora Other projects include the as letters using now are projects humanities digital of number growing a Additionally, Arguably, some of the most useful materials for exploring language change and variation Here weillustrate thescopeofwrittencorporaandcorpusresearchthroughtwokey al. 1994), ARCHER is a ‘multi-genre historical corpus of British and British of corpus historical ‘multi-genre a is ARCHER 1994), al. 
Victorian Lives and LettersConsortium,coordinated by the University at Cambridge University, which has collected and digitised and collected has which University, Cambridge at Electronic Enlightenment (CEEC), produced at the University of Helsinki, comprises – and the rjc, as ewrs f correspondence of networks maps project, Zurich English Newspaper Corpus(ZEN), Corpus of English Religious Prose con- project, at the University of Oxford, Mapping the Republic Written corpora l 2011), al. Dar mil 31 - - Downloaded By: 10.3.98.104 At: 12:17 29 Sep 2021; For: 9781003031758, chapter3, 10.4324/9781003031758-3 sity of Nottingham and the University of Birmingham, which ‘demonstrates through corpus words) and forms the basis of a collaborative project – artist friends,includingPaulGauguinandEmileBernard(translatedintoEnglish). Van Gogh LettersProject the and format; (XML) Language Markup Extended in available all 1608 to 1550 c. from Browning; Barrett Elizabeth and Browning tal collections include van Ostade’s (2010)studyofunpublishedcorrespondence of RobertLowth.Otherdigi- Tieken-Boonand network) bluestocking the of letters on research her of part (as Montagu Thomas CarlyleandthediariesofJohnRuskin. Queen of coronation Victoriaof outbreak the Worldto Warby letters include Documents I. of South Carolina, has brought to light samples of life-writing from the period spanning the Sheena GardnerandEmmaMoreton 32 growing interestinwhatscholarshavetermed‘historyfrombelow’ or‘intrahistoria’ a been has there however, decades, past the Over 1977). Sifton (see systems duplication letter polygraph and letterpress the of use his and record research the repair to efforts his recipients. The practices of Thomas Jefferson, for example, have received much attention: process transcribed, edited and published.Eminent persons themselvesare sometimes part of this ures) havelongbeenusedforsocial,historicalandculturalstudies.Suchlettersaresaved, within those plays. Finally, the Finally,plays. 
those within characters all by speeches all for files separate including plays, 37 containing Scott, Mike and American authors dating from 1881 to 1922, and the differences in Twitter andShortMessageService(SMS)(Baronet al.2011). variations in text messages (Tagg et spelling of normalisation the 2011), Baron and (Rayson corpora learner in tagging (error texts for the corpus analysis work. VARD is also useful in other areas of humanities research designed todealwithspelling variations in historical data, has beenusedtostandardisethe tool pre-processing VARD,a detector variant The corpora. drama and text of modern files early plain-text released recently has Library) Shakespeare Folger the and Strathclyde of University the Libraries, Wisconsin-Madison of University the between collaboration enabling linguistic analyses ofvarious kinds; andtheVisualizingproject EnglishPrint morphologically,and syntactically parsed been has that text English Old of words 71,490 representation of dialect; the York-Helsinki Parsed CorpusofOldEnglish Poetry by period,county, genre andauthor, allowing users toexplore vernacular literature and the ation. The Irish English. to exploretopicsandthemesinthediscourse,aswelllinguisticvariationchange and theirfamiliesdatingfromtheseventeenthtotwentiethcenturyhasbeenused Migration Studies in Northern Ireland, which contains over 4,000 letters by Irish migrants the and (1915–1931) is creatingacrowdsourceddigitalcollectionoflettersfromthetimeEasterRising the example, For 2015). 2012, Fairman 2012; Fairman and Auer e.g. et Amador-Moreno for (see, firsthand consequences their and events torical classes popular the of history a is, that mle-cl poet icue aros 20) td o crepnec b Elizabeth by correspondence of study (2009) Sairio’s include projects Smaller-scale Personal and corporate letters of eminent persons (e.g. public political and social fig- social and political public (e.g. 
persons eminent of letters corporate and Personal (CEN) – texts by 25 British 25 by texts – (CEN) of EnglishNovels Other literarycorporainclude theCorpus Historical literary corpora are also being used to investigate language change and vari- – making copies of their own letters before sending them or retrieving letters from contains literary texts from the 1500s to 1900s searchable 1900s to 1500s the from texts literary contains Salamanca Corpus (IED) housed at the Mellon Centre for Centre Mellon the at housed (IED) Irish EmigrationDatabase – 574 letters between Victorian poets Robert Victorianpoets between letters 574 – The Browning Letters – around 800 letters by van Gogh to his brother Theo as well as to contains 14 texts (over 3 (over texts 14 contains Charles Dickens Corpus al. 2012) and the exploration of gender and spelling – ordinary men and women who experienced his- experienced who women and men ordinary – Bess of Hardwick’sdating letters 234 – Letters 1 CLiC Dickens – between the Univer Shakespeare Corpus,producedby project Letters of1916 contains al. 2016; al. 2 million 3 – (a - Downloaded By: 10.3.98.104 At: 12:17 29 Sep 2021; For: 9781003031758, chapter3, 10.4324/9781003031758-3 including (p. frequent more are verbs past-tense different fiction (for in reasons), prose academic and conversation in frequent are and verbs past-tense than common more are verbs present-tense although example, For texts. academic and fiction distinguishing features is the inclusion of register frequency information in spoken, media, its of One corpus. (LSWE) English Writtenand Spoken Longman 40-million-word the by et Quirk the new theoriesabouthowwelearnlanguage(e.g.Hoey’s lexicalpriming, 2005). 
n-grams, orbundles)hasbecomeacornerstoneofcorpuslinguisticresearchandinspired ments of three common grammatical features explanations alternative demonstrates (1994) patterns; Willis verb meaningful 700 over tifies (1974) 25 formal verb patterns, Francis, Hunston and Manning’s Hornby’spattern grammar to (1996) compared iden- instance, For it. teach to how and teach to what challenged of notions traditional that resources teaching produced team COBUILD The patterns. as grammar of but to inspire a step-change in lexicography, a novel way of viewing language and a new theory research ary in 1987, followed by pattern grammars and teaching materials. Here we see corpus-driven diction- COBUILD the produced corpus, English of Bank the by informed and 1980s the in Sinclair University John by started Birmingham 2007) (Moon project Collins (COBUILD) Language International The materials. teaching and books grammar dictionaries, A major strand in corpus research involves corporate collaboration to inform English language English languageteachingmaterials insights intohowreadersperceivefictionalcharacters’ (CLiCDickens). stylistics how computer-assisted methods can be used to study literary texts and lead to new lish (PICAE),with thesupportofmanywell-known corpuslinguists. of Corpus International Pearson 25-million-word the of component written Eng- Academic monographs. The freelyavailable Pearson academic collocates list wascompiled from the of Corpus and articles Oxford undergraduatescholarly includes textbooks, which (OCAE) English Academic 85-million-word the from extracts includes and by informed is 2014) different purposes. For instance, the they publishincorporadevelopedfor and Pearson,allofwhichcanuse theworks Press University Oxford Longman, Press, University Cambridge include examples Notable pora. but insomecasesitispossibletogainaccessonrequest. at university, asassessedin TOEFL. level and area of study. 
[...] Oxford Learner’s Dictionary of Academic English (Lea [...] – the passive, the second conditional and reported statements – [...] corpora being used not to find current examples of grammatical and lexical usage, [...] and reveals the value of lexical phrases. This interest in lexical phrases (or clusters, or [...]

Nowadays, many publishers employ their own corpus linguists to work on in-house corpora [...] As these are essentially commercial projects, access to these corpora tends to be restricted, [...]

The Longman Grammar of Spoken and Written English (Biber et al. 1999) builds on [...] al. (1985) grammars and provides a descriptive grammar of English informed [...] (p. 456). Furthermore, some verbs, [...] bet, know, mean and matter, occur more than 80 percent in the present tense, while others, including exclaim, [...] pause, grin and sigh, occur more than 80 percent in the past tense (p. 459).

A subsequent project was supported by the Educational Testing Service (ETS) to investigate the spoken and written registers that university students in the United States listen to or read, including classroom teaching, textbooks and service encounters (Biber et al. 2002). The data in the 2.7-million-word TOEFL 2000 Spoken and Written Academic Language (T2K-SWAL) corpus was collected from four sites across the United States and stratified by [...] This design is well suited to a focus on the language needed to study [...]

Other influential academic English corpora include those developed by Ken Hyland and used to develop a theory of metalanguage (Hyland 2005) and to provide evidence that supports arguments in favour of the disciplinary distinctiveness of academic English (and therefore for teaching English for specific academic purposes). Hyland (2008), for example, compares lexical bundles across disciplines in a 3.5-million-word corpus of research articles, master’s and PhD theses from Hong Kong.

Freely available academic corpora include the ESRC-funded 6.5-million-word BAWE corpus of successful university student writing (www.coventry.ac.uk/BAWE). It has not only informed a genre classification (Nesi and Gardner 2012) and studies of academic English, including the use of shell nouns (Nesi and Moreton 2012), reporting verbs (Holmes and Nesi 2009), nominal groups (Staples et al. 2016), section headings (Gardner and Holmes 2009), Chinese writers (Leedham 2015) and lexical bundles (Durrant 2017), but it also underpins the Writing for a Purpose materials on the British Council LearnEnglish website (https://learnenglish.britishcouncil.org/en/writing-purpose/).

In this overview of current contributions and research, we have included just some of the written corpora, most of which are accessible online. More corpora are listed in the Further Reading databases and interfaces at the end of this chapter.

Sample analyses

Two contrasting sample analyses of written corpora are now presented.

Humanities in pre-loaded general, academic and historical corpora

Here we illustrate the nature of information that can be freely obtained from pre-loaded corpora on the open Sketch Engine and Brigham Young University (BYU) sites through searches for humanities in several popular corpora of written texts. The choice of humanities is not only appropriate in relation to the contents of this volume but also in that we consider the term to be a semantic shifter (Fludernik 1991); a term that has forms (signifiers) and referents, but whose meaning (signified) is understood according to the contexts in which it is used. The aim is not to suggest that this is how a study should be conducted, but to show through example the kinds of information that can be obtained from different corpora.

BNC

We started with a WordSketch search for the lemma humanity on the BNC at www.SketchEngine.co.uk/open. The results show that the only context in which it is frequent is as a postmodifier with ‘of’, where it occurred 432 times (3.85 pmw), compared to the next most frequent context with ‘in’ with 87 (0.77 pmw). Examples with ‘of’ include ‘a sense of common humanity’, ‘the dregs of humanity’, ‘the whole of humanity’ and ‘in the interests of humanity’. As in these examples, it is widely (396 times, or 3.52 pmw) used as a singular, uncountable, abstract noun, ‘humanity’, rather than a plural form. Among collocates of the plural form ‘humanities’ is a group of educational terms (humanities scholars, student, research, department, course), as in this example, which is worth reading for the ideas it promotes as well as the use of the term ‘humanities’.

    What will the next decades hold? Will it be seen, with hindsight, that the introduction of the computer into the humanities disciplines has made no more substantial change in the eventual output of research than was made by the introduction of the electric typewriter? Or will there be a flowering of research papers, neither pedestrian nor spectacular, containing solid and original results obtained by techniques for which the computer is indispensable? It is hard to make a confident prediction. But the testing time has now arrived; because for the first time posts of leadership in humanities departments are being taken up by a generation of scholars who have been familiar with the computer ...
Although the use of computers in humanities dates back to the 1960s, this extract from a lecture published in the early 1990s shows the state of development 30 years later. Through [...] analysis, the BNC can provide an understanding of the meanings, collocations and grammatical patterns of humanity (and humanities) in a large general English corpus from the last century. These two analyses begin to illuminate the term, and much more could be done with further exploration of a wider range of uses and more sophisticated techniques.

BAWE

The British Academic Written English (BAWE) corpus allows us to search for genre family, level of study and discipline to find out how it is used in student writing in the early twenty-first century. If we filter to discipline and search for collocates, we see that it collocates differently in law, philosophy/ethics, English literature and classics, as in these examples. Nineteen of the 31 instances of ‘humanity’ in the law texts relate to crime(s) against humanity, a phrase which is also found in politics and history, though less frequently. In contrast, philosophy is concerned with humanity in general, with our ‘common humanity’ and how ‘humanity can survive’ together, accounting for 12 of the 28 instances of ‘humanity’ in the philosophy texts where it has a wider range of uses, including concern for the survival/the fall/the future of humanity. In English it is used in relation to specific individuals and their relation to acts of human kindness, as in the example of Mr Knightley, while in classics it tends to have more negative associations around the loss of humanity, as shown in Table 3.2.

‘crimes against humanity’ in law

    [...] individuals at the highest levels of government and military infrastructure could be prosecuted and punished under international law, recognising accountability for the commission of war crimes and introduced new offences of crimes against peace and crimes against humanity.

‘common humanity’ and ‘humanity can survive’ in ethics/philosophy

    Indeed it is the similarities between so-called cultures that allow us to appreciate that there is a common humanity. If there was not then we would be unable to comment on the cultures of the past or of different location because there would be nothing we could relate to. This is shown in ...

‘his/her humanity’ in English literature

    a microcosmic happy ending in Emma prefigures the macrocosmic happy ending of the whole book, for Mr Knightley, being true to his name, moderates Harriet’s embarrassment by dancing with her himself, thus proving his humanity and that he is the ideal husband for Emma.

Table 3.2 Concordance lines of ‘humanity’ in the discipline classics in the BAWE corpus

Classics   As the play goes on we watch as Medea’s          humanity   falls away.
Classics   effectively shutting off the last of her         humanity   as, when
Classics   argue against the games, but not for             humanity   . Instead, these members
Classics   really a lady who does not possess any           humanity   Representing
Classics   in the conventional sense, involves loss of      humanity   ’. The final sacrifice
Classics   good of the future Rome, and so can submit       humanity   . This is not to say however,
Classics   of Caesar, although he tries to diminish his     humanity   Caesar had

Although the numbers of instances are small, such disciplinary variations in form, meaning and use are characteristic of academic language. In contrast with the BNC, expressions such as ‘dregs of humanity’ are not found.

COCA

A search in the BYU Corpus of Contemporary American English (COCA) includes five ‘genres’: spoken, fiction, popular magazine, newspaper and academic, evenly balanced for size over five-year spans from 1990 to 2015. Here ‘humanity’ is used least (10.13 pm) in spoken texts and most (34.85 pm) in academic journals. Among the academic disciplines, ‘humanity’ is used extensively in philosophy and religious studies (179.37 pm), an average amount in humanities (35.05 pm), and perhaps surprisingly it is seldom used in education (8.05 pm) and medicine (1.79 pm). Further investigation might explore whether these are consistent characteristics of the disciplines or accidental findings reflecting topics in the articles selected for the corpus.

The top collocates include against, crime(s), common and mass. Interestingly, the second most frequent collocate is habitat, which introduces ‘Habitat for Humanity’, which occurs in newspapers rather than academic texts. This phrase is also found in one news item in the BNC, but, unsurprisingly, not in the BAWE.

    Give just $35 to Habitat for Humanity at habitat.org/haiti to help repair one damaged home in Port-au-Prince

    here’s a great idea: Volunteer with Habitat for Humanity.

    A good way of recycling cabinets and appliances is to donate them to Habitat for Humanity or repurpose them.

‘Humanity’ occurs throughout the decades in COCA, and most frequently between 2005 and 2009, where it is associated with crimes against humanity, such as genocide or violence against innocent civilians. The examples point to this being a period of news reporting and academic discussion of crimes against humanity in countries such as Sudan, Zimbabwe, the former Yugoslavia, Angola and Iraq.

Further searches show that humanity in COCA has four main uses: crimes against humanity (see crimes, crime, against and genocide in Figure 3.1), Habitat for Humanity (3.87 percent of the instances of habitat occur with humanity with 7.89 mutual information (MI), which indicates the strength of collocation), humanity in contrast with God (God accounts for 0.22 percent with 3.77 MI), and humanity meaning compassion, while humanities in COCA is associated with features of American university culture, such as college, endowment and majors.

Figure 3.1 Frequency and MI of top ten collocates of ‘humanity’ in COCA

COHA

A search in The Corpus of Historical American English (COHA) suggests that humanities is most frequent in the latter part of the twentieth century, occurring (Figure 3.2) more than 80 times in the 1960s–1990s, compared to fewer than 20 times in the 1820s–1890s.

Figure 3.2 Raw frequency of ‘humanities’ in COHA
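The per-million figures (pmw, pm) and MI scores quoted in the BNC and COCA searches above rest on two small calculations, which can be sketched as follows. This is a minimal illustration rather than the formula any particular interface uses: the corpus size `ASSUMED_BNC_TOKENS` is an assumption back-derived from the reported 432 hits at 3.85 pmw, the MI function is the plain pointwise version (online corpus tools may additionally adjust for the collocation window), and the counts passed to it are invented.

```python
import math


def per_million(raw_count: int, corpus_tokens: int) -> float:
    """Normalise a raw frequency to occurrences per million tokens."""
    return raw_count / corpus_tokens * 1_000_000


def mutual_information(pair_count: int, node_count: int,
                       collocate_count: int, corpus_tokens: int) -> float:
    """Pointwise MI: log2 of observed over expected co-occurrence."""
    expected = node_count * collocate_count / corpus_tokens
    return math.log2(pair_count / expected)


# Assumption: a corpus size chosen so that 432 hits correspond to the
# reported 3.85 pmw; it is not a figure given in the chapter.
ASSUMED_BNC_TOKENS = 112_200_000
print(round(per_million(432, ASSUMED_BNC_TOKENS), 2))

# Invented counts: a node seen 16 times, a collocate seen 32 times,
# co-occurring 8 times in a 1,024-token toy corpus.
print(mutual_information(8, 16, 32, 1024))
```

Normalising in this way is what allows counts from corpora or sub-corpora of very different sizes (spoken versus academic texts, or one decade versus another) to be compared directly, while MI rewards pairs that co-occur more often than their individual frequencies would predict.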

Earlier examples, as this one from the 1840s, are invoked in relation to education:

    (and this process might be called humanization, i.e., the complete drawing out and unfolding of his proper nature making him perfectly a man-realizing his ideal character; hence, with singularly beautiful appropriateness, the proper studies of a liberal education used to be called, not only the Arts, but the Humanities).

The strongest collocates at the end of the twentieth century (science, arts, social, national, endowment) are similar in many respects to those in Figure 3.3, reflecting a fully institutionalised meaning of humanities alongside arts and social sciences with departments, professors, and endowments. No collocates are found, however, of digital humanities or research, suggesting these came later.

The change in meaning referred to is also reflected in the relatively strong and enduring collocate ‘so-called’ which suggests humanities is not an established term (Table 3.3).

Table 3.3 ‘So-called humanities’ in COCA

Years and genres of the four lines: 1887 (MAG), 1927 (NF), 1975 (FIC), 2004 (NF)
Concordance lines [their alignment with the years is not recoverable]:
. . . the so-called Humanities, or classics of Ancient Greece and Rome . . .
. . . courses of study, in one of which letters and the so-called humanities . . .
. . . decadence of art, scientific endeavour, the so-called humanities . . .
. . . is especially though not exclusively the case in so-called humanities . . .

Hansard

Hansard is a transcribed record of speeches and debate in the UK House of Commons and House of Lords (https://hansard.parliament.uk/about, accessed 17/04/2018). A search for the term ‘humanities’ in the Hansard corpus points to debate over a Humanities Research Council in the 1990s where we see the humanities decried as a ‘waste of time’ and where ‘so-called humanities’ appears pejoratively.

GloWbE

The various digital humanities movements perhaps aim to establish a clear, contemporary identity for humanities, and a search in the 1.9-billion-word GloWbE corpus of web material from 20 English-speaking countries in 2012–2013 reveals that the term appears extensively on the Web in the UK, Canada, Australia and New Zealand, but to a lesser extent in India, Malaysia and Pakistan, and not at all in Hong Kong, Jamaica or African countries such as Ghana, Kenya, Tanzania and South Africa (Figure 3.3).

Figure 3.3 Collocates of ‘humanities’ in GloWbE

The one GloWbE instance of ‘so-called humanities’ is from an American site, but refers to Europe:

    The European educational system’s current process of so-called reform is marked by de-financing, cuts, job losses, and also by a downsizing of non-rentable disciplinary fields – the so-called humanities – accompanied by the increased support of capital-intensive fields of research. The leading principle of this reform is the assertion of the epistemological primacy of the economic sphere.

These searches for humanities show that while corpus investigations can produce statistical information about frequencies and collocations, it is important that users pay heed to the nature of the data in the corpus to inform interpretations. An understanding of the nature of the texts in the corpus, the contexts of their production and how they were selected all play a role in interpreting the significance of findings. We have glimpsed how pre-loaded corpora can be used to demonstrate the range of grammatical patterns in large general corpora associated with search items (through WordSketches), to suggest lines of enquiry (e.g. to what extent is humanities a contested term?) and to compare data from comparable sources (e.g. from different national web domains or from student writing in law vs. literature). To answer specific research questions using written corpora, however, requires careful attention to corpus selection, preparation and interrogation.

Small correspondence corpora

This study uses three small correspondence corpora to explore some of the linguistic features that are typically found in letters. The Migrant Correspondence Corpus (MCC) contains 188 private letters (176,501 tokens) between Irish migrants and their families, dating from 1819 to 1953; roughly 75 percent of the letters are by male authors (141 letters) and 25 percent (47 letters) by female authors. The British Telecom Correspondence Corpus (BTCC) – sourced from the public archives of BT – contains 612 business letters (130,000 tokens) dating from 1853 to 1982. The letters are predominantly written by men, with only 13 authors identifying themselves as female. Finally, the Letters of Artisans and the Labouring Poor (LALP) corpus contains 272 letters (51,408 tokens) dating from 1720 to 1837 written by individuals appealing for poor relief from their parish or poor law authority; 118 male and 66 female authors are represented in the corpus. For this study letters from Record Offices in Warwickshire, Lancashire and Essex were used. The corpora represent different types of correspondence (personal and professional letters, as well as letters of solicitation), allowing users to explore the discourse of correspondence from a diachronic perspective. This analysis will investigate the register of correspondence, examining the extent to which the type of letter (personal or professional, for instance) predicts variation in the language. It will also explore how the MCC, BTCC and LALP data compare to modern English language registers of other kinds.

The MCC, BTCC and LALP letters are all saved in XML – a format that is compatible with most corpus software. Each collection has an accompanying spreadsheet containing metadata relating to the individual letter files. Metadata provides additional information or knowledge about the text.
Metadata can be situated within the body of the text, providing information or knowledge relating to the structure, layout or content (line breaks, paragraphs or pragmatic features, for instance); and it can be situated outside the body of the text – in the header – [...] (bibliographic information) and what it is (a letter, an academic essay or a poem, for instance). Most corpora will contain some basic header information; however, the type of information captured will vary from project to project depending on the research aims. While the MCC header information includes extensive sociobiographic details about the various participants (their sex, occupation, educational background and family history), BTCC header information focuses more on the professional status of the various authors and recipients and the communicative function of the letter itself. The LALP corpus, in contrast, captures information about the materiality of the letter – whether it was written in pen or pencil and the quality of the paper, as well as details about the authenticity and authorship of the correspondence (whether or not the letter was dictated, for example).

Spreadsheets and databases are ideal for capturing well-structured metadata such as the header information described earlier. However, to use this metadata with corpus tools, it must be encoded in a way that the software will recognise. While there are many ways of encoding metadata, the Text Encoding Initiative (TEI) is the de facto standard for encoding digitised texts in the humanities. Although many of the information categories used in the MCC, BTCC and LALP corpus are varied, they do share some common features: a sender, a recipient, an origin, a destination and a date. The TEI has recently introduced a module for capturing this information in the [...] section of the TEI header (see Figure 3.4), allowing different letter collections to interconnect based on these shared attributes (see Stadler et al. [...]). It is [...] straightforward from a programming perspective for metadata to be extracted from the spreadsheet and rendered as TEI headers within the corresponding XML files. These XML files can then be uploaded into corpus software, allowing the user to refine their search queries based on the header information (searching for all letters from a particular period or location, or by a particular author, for instance).

Figure 3.4 [XML listing; markup lost in extraction. Surviving text: task-force-correspDesc"> / Meelick, Queen’s County / Elizabeth McDonald Lough / Winsted, Connecticut / Elizabeth Lough]
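The spreadsheet-to-header step described above can be sketched in a few lines. The example below is illustrative only and is not the authors' own conversion script: it assumes a hypothetical CSV export with the columns `file`, `sender`, `origin`, `recipient` and `destination`, and it builds a TEI `correspDesc` element (the element the TEI provides for correspondence metadata, with `correspAction` children typed `sent` and `received`). The names are taken from the residue of Figure 3.4, but which person is the sender and which the recipient is an assumption made here, and the date is left out because none survives in the source.

```python
import csv
import io
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)

# Hypothetical spreadsheet export; sender/recipient roles are assumed.
SAMPLE_CSV = """file,sender,origin,recipient,destination
letter_001.xml,Elizabeth Lough,"Winsted, Connecticut",Elizabeth McDonald Lough,"Meelick, Queen's County"
"""


def build_corresp_desc(row):
    """Render one spreadsheet row as a TEI correspDesc element."""
    desc = ET.Element(f"{{{TEI_NS}}}correspDesc")
    actions = (
        ("sent", row["sender"], row["origin"]),
        ("received", row["recipient"], row["destination"]),
    )
    for action_type, person, place in actions:
        action = ET.SubElement(
            desc, f"{{{TEI_NS}}}correspAction", {"type": action_type}
        )
        ET.SubElement(action, f"{{{TEI_NS}}}persName").text = person
        ET.SubElement(action, f"{{{TEI_NS}}}placeName").text = place
    return desc


def sender_of(desc):
    """Read the sender's name back out of a correspDesc element."""
    ns = {"tei": TEI_NS}
    return desc.find("tei:correspAction[@type='sent']/tei:persName", ns).text


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
headers = {row["file"]: build_corresp_desc(row) for row in rows}

# Serialise the generated element and query it, as a corpus tool might.
print(ET.tostring(headers["letter_001.xml"], encoding="unicode"))
print(sender_of(headers["letter_001.xml"]))
```

Because sender, recipient, origin and destination sit in predictable places, the same lookup works across collections whose other header fields differ, which is the kind of interconnection the shared attributes are meant to support.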

To get a sense of how the letters in the MCC, BTCC and LALP corpus compare with other English language registers, the data were uploaded into Nini’s (2015) Multidimensional Analysis Tagger (MAT). MAT draws on Biber (1988) by grammatically annotating the corpus and plotting the distribution of corpus holdings based on Biber’s (1988/1995) dimensions. Dimension 1, ‘Informational versus Involved Production’, represents ‘a fundamental parameter of variation among texts in English’ (Biber 1988: 115). While one end of the pole represents ‘discourse with interaction, affective, involved purposes’ (telephone or face-to-face conversations, for example), the other end represents ‘discourse with highly informational purposes’ (official documents, or academic prose etc.) (ibid.). In terms of correspondence, Biber’s study showed that personal letters had a mean score of 19.5 for dimension 1 (placing them alongside spontaneous speeches and interviews), and professional letters had a mean score of –3.9 (placing them alongside general fiction and broadcasts). In other words, although both types of correspondence are interactional (as evidenced by the high frequency of first- and second-person pronouns), they differ in terms of their functions: while personal letters are more interpersonal, sharing many of the same characteristics as conversations, professional letters are more informational, sharing characteristics typically found in written discourse. Table 3.4 compares the MCC, BTCC and LALP corpus with Biber’s mean scores for personal letters and professional letters.

Table 3.4 Mean scores for dimensions across five datasets

              Personal Letters   Professional Letters   MCC     BTCC    LALP
Dimension 1   19.5               –3.9                   0.36    –7.1    2.17

Dimensions 2 to 6 (values as extracted; their assignment to rows is not recoverable):
Personal Letters: –1.4, –2.8, –3.6, 1.5, 0.3
Professional Letters: –2.2, 1.5, 0.4, 3.5, 6.5
MCC: –0.22, –0.46, 0.34, 1.75, 1.46
BTCC: –1.49, 1.64, 2.59, 6.69, 5.15
LALP: –0.5, –1.21, 0.09, 6.88, 2.02

Similar to Biber’s findings for professional letters, Table 3.4 shows that the BTCC letters have a mean score of –7.1 for dimension 1, placing them at the ‘informational’ end of the ‘Informational versus Involved Production’ spectrum. However, unlike personal letters with a score of 19.5, the MCC letters have a mean score of 0.36, placing them alongside general fiction. In other words, the migrant letters are much less ‘involved’ and ‘affective’ than the personal letters used for Biber’s study. The LALP letters are quite interesting; with a mean score of 2.17, they are more conversation-like in nature than the MCC letters and correspond most closely with Biber’s prepared speeches.

Having looked at the dimension scores for the three corpora, the next stage involved examining some of the linguistic features that are common across all three datasets: namely, first- and second-person pronouns and what Biber describes as private verbs – that is, verbs which ‘express intellectual states or nonobservable intellectual acts’ (ibid.) – typically verbs of cognition such as ‘I think’ or ‘she hopes’. In systemic-functional grammar (e.g. Halliday and Matthiessen 2004) these clauses can function as a type of projection. Projection structures consist of two main components: the projecting clause (I hope) and the projected clause (you will write). In these structures the primary (projecting) clause sets up the secondary (projected) clause as the representation of the content of either what is thought or what is said (Halliday and Matthiessen 2004: 377). Halliday and Matthiessen make a distinction between the projection of propositions and proposals as follows:

    Propositions, which are exchanges of information [typically statements or questions], are projected mentally by processes of cognition – thinking, knowing, understanding, wondering, etc. . . . Proposals, which are exchanges of goods-&-services [typically offers or commands], are projected mentally by processes of desire.
    (2004: 461)

Both propositions and proposals have different response-expecting speech functions, and an analysis of these structures will reveal something about how the author interacts with their intended recipient and the type of response they expect – whether that is a verbal response (to provide information) or a non-verbal response (to carry out an action).

Sketch Engine (Kilgarriff and Kosem 2012) was used to automatically tag the corpus data for part of speech (POS), using the Penn Treebank tagset, making it possible to use Corpus Query Language (CQL) to identify lexico-grammatical patterns within each dataset. The following patterns were extracted (variants, e.g. with other pronouns or intervening adverbs, are not considered here):

• [tag="PP"] to search for all personal pronouns
• [word="I"][tag="V.."] to search for all personal pronoun + verb combinations
• [word="I"][tag="V.."][word="you"] to search for all personal pronoun + verb + personal pronoun combinations

Looking at the normalised frequencies (averaged per million), Table 3.5 shows similar frequencies for personal pronouns in the MCC and LALP (87,794 and 84,033, respectively); however, there are significantly fewer in the BTCC (51,113). It is a similar trend for the patterns I + Verb and I + Verb + You.

Table 3.5 Raw and normalised frequencies (per million) for Pronouns, I + Verb, and I + Verb + You across the MCC, BTCC and LALP corpus

              MCC                    BTCC                  LALP
Pronouns      16,814 (87,794 p/m)    7,511 (51,113 p/m)    4,332 (84,033 p/m)
I + V         3,951 (20,630 p/m)     1,821 (12,326 p/m)    912 (17,728 p/m)
I + V + you   212 (1,106 p/m)        73 (494 p/m)          48 (933.05 p/m)

Having looked at the frequency of the pattern I + V + you, the next stage was to see which verbs typically appear in this structure (see Table 3.6). Not all of the examples in Table 3.6 are types of projection. I thank you, for instance, is a straightforward subject/verb/object construction. The most common verbs of desire (which typically realise the projection of proposals) are hope and wish, occurring across all three corpora. The verb want also occurs in the MCC and the BTCC. The most common verbs of cognition (which typically realise the projection of propositions) are think (across all three corpora), suppose (in the MCC and BTCC) and know (which has a high frequency in the MCC, but only occurs once in the BTCC and not at all in the LALP corpus).
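The three CQL patterns listed above can be mimicked outside a corpus tool to see what each one counts. The sketch below is a simplified stand-in, not Sketch Engine's query engine: the function `match_pattern` and the tiny tagged sample are invented for illustration, and it assumes, as in CQL, that each regular expression must match the whole attribute value (so `V..` matches three-character verb tags such as `VVP` or `VVD`).

```python
import re


def match_pattern(tokens, pattern):
    """Return every token window that satisfies a CQL-like pattern.

    tokens  -- list of (word, tag) pairs
    pattern -- list of dicts mapping "word" or "tag" to a regular
               expression that must match the whole attribute value
    """
    hits = []
    for start in range(len(tokens) - len(pattern) + 1):
        window = tokens[start:start + len(pattern)]
        if all(
            re.fullmatch(regex, {"word": word, "tag": tag}[attr])
            for (word, tag), constraint in zip(window, pattern)
            for attr, regex in constraint.items()
        ):
            hits.append(window)
    return hits


# Invented sample with illustrative Penn-style tags (PP = personal pronoun).
TOKENS = [
    ("I", "PP"), ("hope", "VVP"), ("you", "PP"), ("will", "MD"),
    ("write", "VV"), (".", "SENT"),
    ("I", "PP"), ("thank", "VVP"), ("you", "PP"), (".", "SENT"),
    ("I", "PP"), ("know", "VVP"), ("she", "PP"), ("wrote", "VVD"),
]

PRONOUNS = [{"tag": "PP"}]                                # [tag="PP"]
I_VERB = [{"word": "I"}, {"tag": "V.."}]                  # [word="I"][tag="V.."]
I_VERB_YOU = [{"word": "I"}, {"tag": "V.."}, {"word": "you"}]

for name, pattern in (("Pronouns", PRONOUNS), ("I + V", I_VERB),
                      ("I + V + you", I_VERB_YOU)):
    print(name, len(match_pattern(TOKENS, pattern)))
```

On this sample the third pattern picks up both I hope you and I thank you, which is why the hits still have to be read in context: only the first is a projection structure.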

912 (17,728p/m) 48 (933.05p/m) (2004: 461) want - - Downloaded By: 10.3.98.104 At: 12:17 29 Sep 2021; For: 9781003031758, chapter3, 10.4324/9781003031758-3 (think of the letter (you) is typically required to ance lines1and2inFigure occurrences 37 of out (21 greetings formulaic in used typically is hope you MCC, the In differences. some revealed context in language the of examination closer a change inthefuture’ (1988:242)–beg,solicit some about bring to intentions ‘imply that (verbs verbs suasive as describes 1182–1183) a highfrequency ofverbsdesire(hope, Table 3.6 of + clause intheLALP data(thepoorlaw authority) is almostalwaysrequiredtocarryouta cognitively respond to required also is associate) business (a clause projected the of subject the part, relative or by not feeling abandoned or forgotten. Similarly, in the BTCC data, for the most way,some a in remembering cognitively by respond to required is – member) family close event that is taking place. In the MCC, the subject of the projected clause – involve the recipient of the letter, assigning to them a role to play in the communicative 11) orconsidertheauthor’s request(concordanceline12). (concordance line 10), speakorwriteto somebody on the author’s behalf (concordance line laic greeting. In over half of the occurrences, what follows are the modals ance lines 3 to 6). In the BTCC only 3 out of 13 instances of 20 19 18 17 16 15 14 13 12 11 10 agree (see concordance lines 7 to 9). 
Finally, in the LALP corpus, 32 of the 39 occurrences 9 8 7 6 5 4 3 2 1 I hopeyouarefollowedbywill is the most frequent projection structure across all three corpora; however, corpora; three all across structure projection frequent most the is I hopeyou What is significant about projection structures is their ability to directly addressand directly to their ability is structures projection about significant is What ); additionally,); et Quirk (following Biber what of instances several are there Instances ofI+Verb +You acrossthethreecorpora,organisedbyfrequency – this time, by agreeing to something. In contrast, the subject of the projected the of subject the contrast, In something. to agreeing by time, this – I referyou(2) I remainyou(2) I sawyou(2) I seeyou(2) I thoughtyou(2) I writeyou(2) I assureyou(3) I expectyou(3) I wroteyou(3) I sentyou(4) I leftyou(6) I sendyou(6) I thinkyou(8) I toldyou(8) I wantyou(11) I knowyou(12) I supposeyou(12) I wishyou(13) I tellyou(16) I hopeyou(61) I +VYou MCC 3.5, forinstance).Intheremaining 16 occurrences,therecipient . Heretherecipient is typically required tosendmoney think I knowyou(1) you(1) I observe I offeryou(1) I promiseyou(1) I remindedyou(1) I telegraphedyou(1) I thoughtyou(1) I troubleyou(1) I trustyou(1) I wantedyou(1) I wroteyou(1) I seeyou(2) I sendyou(2) I supposeyou(3) I wishyou(3) I sentyou(5) I thinkyou(5) I toldyou(6) I hopeyou(13) I thankyou(15) I +VYou BTCC wish, andask). , pray, or remember/not forget (see concord- desire), butonlyoneverbofcognition I hopeyouarepartofaformu- may, I askyou(1) I referyou(1) I returnyou(1) I sentyou(1) I solicityou(1) I thinkyou(1) I desireyou(2) I begyou(2) I thankyou(2) I wishyou(3) I trustyou(3) I hopeyou(25) I +VYou LALP you (typically a Written corpora – see concord - see – will or al. 1985: al. 
Additionally, the author–recipient relationship would certainly influence language choice. Moreton (2015), for instance, examining a collection of migrant correspondence, found that letters to siblings, nieces or nephews (i.e. an ‘inferior’ within the notional familial hierarchy) contained more projections of proposals (often realising indirect commands), while letters to parents (i.e. a generational ‘superior’) contained more projections of propositions (typically statements/exchanges of information). Similarly, Sairio’s study of letters by Samuel Johnson to two of his correspondents (Mrs Thrale, a close friend, and Lucy Porter, Johnson’s step-daughter) found that the use of first- and second-person pronouns as well as evidential verbs (such as know, think and believe) are ‘a relevant indicator of the closeness of the relationship’ (2005: 33): the closer the relationship, the more likely it is that these linguistic devices will be used.

Figure 3.5 Sample concordance lines for ‘I hope you’

[Twelve numbered concordance lines, 1) to 12), with the node ‘I hope you’. The left and right contexts survive only as unaligned fragments, listed here in the order extracted.]

Left contexts:
do not alowe me somthing more and
matters would not come to that point.
ce a letter of which I enclose copy.
would otherwise have been assigned.
s Suport the poor – My Dear terese
& morning in my Poor Prayers and I
June 25th 1855 My Dearest Mother I
all are and how you getting along I
I am now in good health thank God I
was not one day sick since I left you
Myself have Sufficient Nesseseries
this morning only it Was so wet I

Right contexts:
will send me something by return of post Benjamin Hewitt
wonte forget your Old Uncle well I will Come to a Close by
do not think that I have forgotten you by My not answering
dont think I have lost all interest in you or what concerns
are well and all friends in Ireland Remember to
and the Children is well that Cap was in good health
will be so good as to speeke thee gentleman about me
would agree that a situation in which BT was force to permit
will agree generally with this assessment. Please let us know
will agree, and that we may rely on your co-operation in appl
will Consider of my Circumstance for I have done as far
will pray often for thy Grandfather and have his name inserte

Future directions

Despite the exciting opportunities offered by the digital humanities with regard to digitising written resources, several constraints may hinder future research. Differing laws and attitudes relating to copyright and intellectual property across disciplines, cultures and countries affect accessibility of resources and are among the barriers to creating and interconnecting corpora. Certainly, when working with historical manuscripts, the only way around the problem, as argued by Honkapohja et al., is to ‘work with repositories and persuade them to either digitise the manuscript material and to publish them under an open-access licence, or to allow scholars to photograph manuscript material themselves’ (2009: 10). Sustainability of digital resources is also an issue, which Millett argues is exacerbated, in part, by ‘the tendency of large electronic projects . . . to use complex custom-built software, which makes them particularly difficult to update’ (2013: 46). And, finally, undocumented and/or differing transcription and encoding practices can lead to the duplication of work and make data interchange more challenging. However, alliances between disciplines are starting to form, largely driven by the need for reliable digital transcriptions of manuscripts that can be used across a range of disciplines. The Digital Editions for Corpus Linguistics (DECL) project, for example, ‘aims to create a framework for producing online editions of historical manuscripts suitable for both corpus linguistic and historical research’ (Honkapohja et al. 2009: 451). Ultimately, what is needed is a cross-disciplinary, cross-cultural and cross-institutional approach to digitisation and annotation of corpus data.
Notes

1 A term coined by the Spanish writer Miguel de Unamuno in 1895. ‘It refers to the value of the humble and anonymous lives experienced by ordinary men and women in everyday contexts which form the essence of normal social interactions, as opposed to the lives of leaders and famous people that are generally accounted for in canonical histories’ (Amador-Moreno et al. 2016).
2 The letters used to create the MCC are from Professor Kerby Miller’s archive of Irish migrant correspondence, housed at the University of Missouri. Miller has published extensively on the topic of Irish migration; see, e.g., Miller 1985; Miller et al. 2003. The authors would like to thank Professor Miller for providing access to the letters.
3 The BTCC data can be accessed from: https://btletters.wordpress.com.
4 For more information about the LALP project see https://lalpcorpus.wordpress.com/.

Further reading

1 Nesi, H. (2016). Corpus studies in EAP. In K. Hyland and P. Shaw (eds.), Routledge handbook of English for academic purposes. Abingdon: Routledge, pp. 206–217.
This provides an overview of corpora, both written and spoken, in English for academic purposes.
2 Creating and digitizing language corpora series (Volumes 1–3), published by Palgrave Macmillan.
This is a useful source of information about corpus design and creation.

More information about written corpora and related publications can be found at:

• The Oxford Text Archive (OTA) http://ota.ox.ac.uk
• The Learner Corpora lists www.uclouvain.be/en-cecl-lcworld.html
• The Corpus Resource Database (CoRD) www.helsinki.fi/varieng/CoRD/
• The Sketch Engine site, www.SketchEngine.co.uk
• The Brigham Young University site, http://corpus.byu.edu

The OTA houses dozens of corpora that can be requested by researchers in a range of formats. In contrast, the Sketch Engine and BYU sites can be used for researcher-compiled corpora, and include dozens of pre-loaded corpora that can be easily analysed by novices. The Sketch Engine site developed initially by the late Adam Kilgarriff includes the British National Corpus; the American National Corpus; the Brown Corpus; literary corpora and the Corpus of Early English Books Online and Project Gutenberg; web corpora such as Web 2008, 2012, 2013, uWAC (British web corpus) and Wikipedia; as well as more specialist corpora that will be of interest to researchers in the humanities, such as web corpora of African and Asian English (Araneum [...]

Baylor University. The browning letters. Available at: http://digitalcollections.baylor.edu/cdm/landingpage/collection/ab-letters.
Cambridge University (2015) The darwin project. Available at: www.darwinproject.ac.uk/darwins-letters
Stanford University (2013) Mapping the republic of letters. Available at: http://republicofletters.stanford.edu/index.html.
University of Glasgow. Bess of Hardwick’s letters: The complete correspondence c.1550–1608. Available at: www.bessofhardwick.org.
University of Oxford (2008) Electronic Enlightenment. Available at: www.e-enlightenment.com/info/about/.
University of Oxford (2009) Cultures of knowledge. Available at: www.culturesofknowledge.org.
University of South Carolina (2011) Victorian lives and letters consortium. Available at: http://tundra.csd.sc.edu/vllc/.
Van Gogh Museum. Vincent van Gogh: The Letters. Available at: http://vangoghletters.org/vg/.

References

Amador-Moreno, C., Corrigan, K.P., McCafferty, K., Moreton, E. and Waters, C. (2016). Irish migra[...]
Auer, A. and Fairman, T. (2012). Letters of artisans and the labouring poor (England, c. 1750–1835). In M. Dossena and G. Del Lungo [...] language. [...]
Baron, A., Tagg, C., Rayson, P., Greenwood, P., Walkerdine, J. and Rashid, A. (2011). Using verifiable [...]
Biber, D. (1988/1995). [...]
Biber, D., Conrad, S., Reppen, R., Byrd, P. and Helt, M. (2002). Speaking and writing in the univer[...]
Biber, D., Finegan, E. and Atkinson, D. (1994). ARCHER and its challenges: Compiling and exploring [...]
Biber, D., Johansson, S., Leech, G., Conrad, S. and Finegan, E. (1999). [...]
Fairman, T. (2015). Language in print and handwriting. In A. Auer, D. Schreier and R.J. Watts (eds.), [...]
Fludernik, M. (1991). Shifters and deixis: Some reflections on Jakobson, Jespersen, and reference. [...]
Francis, G., Hunston, S. and Manning, E. (1996). [...]
Frankenberg-Garcia, A., Flowerdew, L. and Aston, G. (2011). [...]
Fries, P.H. (2010). Charles C. Fries, linguistics and corpus linguistics. [...]
Gries, S.T. (2013). Statistics for linguistics with R: A [...]
Halliday, M.A.K. and Matthiessen, C.M.I.M. (2004). [...]
mechanically-schooled in Letters (2012). T. Fairman, Durrant, P. (2017). Lexical bundles and disciplinary variation in university students’ writing: Mapping Caldas-Coulthard, C.R. and Moon, R. (2010). Curvy, hunky, kinky’: Using corpora as tools for critical Gardner, S. and Holmes, J. (2009). J. Holmes, and Gardner,S. ing COCA,COHA, Time Magazineandcontemporarysoapoperas. Hansard) and one corpus of Canadian English, there are four corpora of American English, includ and BNC (the English British of corpora two to addition In (CORE). English of of Registers Online Corpus a and corpus NOW the GloWbE, includes Davies Mark by developed site BYU The international art(e-flux)andacademicEnglish(BAWE).of reports, law of corpus a corpus, newspaper SiBOL/port the corpus, CHILDS the Anglicum), Norway, 1–5June. at Presented SMS. and in Twitter differences spelling and Gender data: author guistics. Narr: Tübingen, pp. 77–91. P.In R.J. (eds.), and Whitt Scheible S. Durrell, M. Bennett, London: PalgraveMacmillan. (eds.), tion as impact tools in the education and heritage sectors. In K. Corrigan and A. Mearnes Semiotica English languageresearchRodopi. oncomputerizedcorpora,Zurich1993.Amsterdam: Creating andusingEnglishlanguagecorpora:Papersfrom the14thinternational conference on a representative corpus of historical English registers. In U. Fries, P. Schneider and G. Tottie (eds.), de Gruyter. (eds.), and genre families in the BAWE corpus of student writing. In M. Charles, S. Hunston and D. Pecorari learning. London:Continuum. Letter writingandlanguagechange Company, pp. 205–227. (eds.), Camiciotti the territories. analysis. Discourse &Society and writtenEnglish sity: A Arnold. English Linguistics multidimensional comparison. Academic writing: At theinterfaceofcorpusanddiscourse. London: Continuum, pp. Creating and digitizing language corpora –volume 3: Databasesforpublic engagement. 86:193–230.https://doi.org/10.1515/semi.1991.86.3-4.193. Applied Linguistics . 
Amsterdam: John Benjamins Publishing Benjamins John EuropeLetter writinginlatemodern . Amsterdam: 34:89–121. . Harlow:PearsonEducation. Variation acrossspeech andwriting. Cambridge: Cambridge University Press. 21(2):99–133. Section headings, macrostructures Can I useheadingsinmyessay?Sectionheadings,macrostructures 38(2):165–193. . Cambridge:CambridgeUniversityPress,pp. 53–71. TESOL Quarterly 36(1):9–48. Grammar patterns 1:VerbsGrammar . London: HarperCollins. An introduction tofunctionalgrammar (2nd ed.). Berlin: WalterBerlin: ed.). practical introduction(2nd New methodsinhistoricalcorpuslin- New trends incorporaandlanguage ICAME Journal: Computers in Longman grammar of spoken Longman grammar ICAME 32, Oslo, . London: 251–271. - - - Downloaded By: 10.3.98.104 At: 12:17 29 Sep 2021; For: 9781003031758, chapter3, 10.4324/9781003031758-3 Preston, M. (n.d.). M. Preston, (2015). Nini, A. Nesi, H. (2016). Corpus studies in EAP. In K. Hyland and P. Shaw (eds.), Nesi, H.andGardner, S.(2012).Genres across thedisciplines: Student writinginhighereducation I (2015). E. Moreton, Moon, R. (2007). Sinclair, lexicography, and the Cobuild project: The application of theory.of application Sinclair,project: The (2007). lexicography,Cobuild R. the Moon, and Millett, B. (2013). Whatever happened to electronic editing? In V. Gillespie and A. Hudson (eds.), ilr KA, cre,A, oig BD ad ol, .. (2003). D.N. Doyle, and B.D. Boling, A., Schrier, K.A., Miller, Miller, K.A. (1985). Liu, D.(2012). The mostfrequently-usedmulti-wordconstructionsinacademic written English: (2015). M. Leedham, Lea, D.(2014).Oxford learner’s dictionaryofacademicEnglish Raumolin-Brunberg, H. and Nevalainen, T. (2007). Historical sociolinguistics: The corpus of early of corpus The sociolinguistics: Historical (2007). T. Nevalainen, and H. Raumolin-Brunberg, Quirk, R., Greenbaum, S., Leech, G. and Svartvik, J. (1985). Nesi, H. and Moreton, E. (2012). EFL/ESL writers and the use of shell nouns. In R. TangR. In nouns. 
(ed.), shell of use the and EFL/ESLwriters (2012). E. Moreton, and H. Nesi, Huber, M. and Mukherjee, J. (2013). ‘Introduction’ to corpus linguistics and variation in English: in variation and linguistics corpus to ‘Introduction’ (2013). J. Mukherjee, and M. Huber, Hornby, A.S. (1974). linguistics: corpus for editions Digital V.(2009). Marttila, and S. Kaislaniemi, A., Honkapohja, Charles, M. In disciplines. academic Verbal(2009). in H. processes Nesi, mental and and J. Holmes, (2005). M. Hoey, Kilgarriff, A. and Kosem, I. (2012). Corpus tools for lexicographers. In S. Granger and M. Paquot M. and Granger S. In lexicographers. for tools Corpus (2012). I. Kosem, and A. Kilgarriff, Hyland, K.(2008). As canbeseen:Lexical bundles anddisciplinary variation. Hyland, K.(2005).Metadiscourse:Exploringinteractioninwriting Hunston, S.(2002).Corporainappliedlinguistics site/multidimensionaltagger. English foracademicpurposes doi:10.1075/jhp.16.2.06mor. teenth centuryIrishemigrantcorrespondence tional JournalofCorpusLinguistics truth: Editingmedievaltextsfrom Britaininthetwenty-firstcentury. Turnhout: Brepols,pp. 39–54. Oxford UniversityPress. fromCanaan: Lettersandmemoirs York:New colonialandrevolutionary America, 1675–1815. Oxford UniversityPress. A Abingdon: Routledge. guage corpora(Vol. 2).London: Palgrave Macmillan. (eds.), Moisl H. and Corrigan K. Beal, J. In correspondence. English language. London:Longman. Sciences/CCRH/history.html (Accessed15December 2016). writers inhighereducationcontexts demic writing in a secondorforeign language: Issues andchallenges facing ESL/EFL academic Applied LinguisticsSeries.Cambridge:CambridgeUniversityPress. Focus on non-native Englishes. Rodopi. York:New English languageresearchand Amsterdam oncomputerizedcorpora(ICAME29). (eds.), Representing manuscript reality in electronic corpora. In A.H. Jucker, D. Schreier and M. Hundt London: Continuum,pp. 58–72. S. 
HunstonandD.Pecorari (eds.), Academic writing: At the interface of corpusanddiscourse. Routledge. (eds.), Electronic lexicography Purposes 27:4–21. multi-corpus study. EnglishforSpecificPurposes Corpora: Pragmatics and discourse: Papers from the 29th international conference on Multidimensional analysis tagger (Version 1.3). Available at: http://sites.google.com/ Lexical priming: A . Available at: www.colorado.edu/Arts at: AAvailable brief history ofcomputerconcordances. Emigrants andexiles: Ireland andtheIrishExodustoNorth America. New York: Oxford advanced learner’s dictionary (3rd ed.). Oxford: Oxford University Press. Chinese students’ writing in English: Implications from acorpus-drivenstudy. hope you will write: The function of projection structures in a corpus of nine- of corpus a in structures projection of function The write: will you hope . New York: OxfordUniversityPress,pp. 31–56. . Abingdon: Routledge,pp. 206–217. Varieng: StudiesinVariation, ContactsandChangeinEnglish:13. . Abingdon and New York:New and new theory of wordsAbingdon andlanguage. (pp. 126–145).London:Continuum. 12(2):159–181. 16(1): 277–303. 16(1): . JournalofHistoricalPragmatics . Cambridge:CambridgeUniversityPress. 31(1):25–35. A comprehensive grammar of the English . Oxford:OxfordUniversityPress. . London:Continuum. Irish immigrantsinthelandof Creating and digitizing lan- The Routledge handbook of English for Specific for English Written corpora Probable Interna- Aca- 47 .

Rayson, P. and Baron, A. (2011). Automatic error tagging of spelling mistakes in learner corpora. In F. Meunier, S. De Cock, G. Gilquin and M. Paquot (eds.), A taste for corpora. In honour of Sylviane Granger. Studies in Corpus Linguistics, 45. Amsterdam: John Benjamins.
Sairio, A. (2005). ‘Sam of Streatham Park’: A linguistic study of Dr. Johnson’s membership in the Thrale family. European Journal of English Studies 9(1): 21–35.
Sairio, A. (2009). Language and letters of the bluestocking network. Sociolinguistic issues in eighteenth-century epistolary English (Monograph). (Mémoires de la Société Néophilologique de Helsinki 75). Helsinki: Société Néophilologique.
Sifton, P.G. (1977). The provenance of the Thomas Jefferson papers. American Archivist 40(1): 17–30.
Stadler, P., Illetschko, M. and Seifert, S. (2016). Towards a model for encoding correspondence in the TEI: Developing and implementing <correspDesc>. Journal of the Text Encoding Initiative (9). Available at: http://jtei.revues.org/1433.
Staples, S., Egbert, J., Biber, D. and Gray, B. (2016). Academic writing development at the university level: Phrasal and clausal complexity across level of study, discipline, and genre. Written Communication 33(2): 149–183.
Tagg, C., Baron, A. and Rayson, P. (2012). “i didn’t spel that wrong did i. Oops”: Analysis and normalisation of SMS spelling variation. In L.A. Cougnon and C. Fairon (eds.), SMS communication: A linguistic approach. Special issue of Linguisticae Investigationes 35(2): 367–388.
Thompson, G. and Hunston, S. (2006). System and corpus: Two traditions with a common ground. In G. Thompson and S. Hunston (eds.), System and corpus: Exploring connections. London: Equinox, pp. 1–14.
Tieken-Boon van Ostade, I.M. (2010). The Bishop’s grammar: Robert Lowth and the rise of prescriptivism. Oxford: Oxford University Press.
Willis, D. (1994). A lexical approach. In M. Bygate, A. Tonkyn and E. Williams (eds.), Grammar and the language teacher. Hemel Hempstead: Prentice Hall International.
Wisby, R. (1962). Concordance making by electronic computer: Some experiences with the wiener genesis. The Modern Language Review 57(2): 161–172. doi:10.2307/3720960.