The BT Correspondence Corpus 1853-1982: the Development and Analysis of Archival Language Resources

Morton, R.

Submitted version deposited in Coventry University’s Institutional Repository

Original citation: Morton, R. (2016) The BT Correspondence Corpus 1853-1982: The development and analysis of archival language resources. Unpublished PhD Thesis. Coventry: Coventry University

Copyright © and Moral Rights are retained by the author. A copy can be downloaded for personal non-commercial research or study, without prior permission or charge. This item cannot be reproduced or quoted extensively from without first obtaining permission in writing from the copyright holder(s). The content must not be changed in any way or sold commercially in any format or medium without the formal permission of the copyright holders.

Some materials have been removed from this thesis due to third party copyright. Pages where material has been removed are clearly marked in the electronic version. The unabridged version of the thesis can be viewed at the Lanchester Library, Coventry University.

RALPH MORTON

Thesis presented to the Graduate School of Coventry University in partial fulfilment of the requirements for the degree of Doctor of Philosophy in CORPUS AND HISTORICAL LINGUISTICS

COVENTRY UNIVERSITY

MAY 2016

ACKNOWLEDGMENTS

This research couldn’t have happened without the support of lots of people. Firstly, I’d like to thank the New Connections project partners at The National Archives and the BT Archives. Siân Wynn-Jones and David Hay at BT in particular have been very helpful in providing information about their collections and archives in general.
Many thanks to Hilary Nesi, Emma Moreton, and Sheena Gardner for giving me the opportunity to conduct this research in the first place, and for their subsequent supervision and guidance. I was very lucky to have such a strong team supporting me and I very much hope we get the chance to continue working together. I would also like to thank Ramesh Krishnamurthy, whose infectious enthusiasm for corpus linguistics is one of the main reasons I became (and remain) interested in this field of research. Special thanks to Angela Morton and Andrew Morton for many years of love and support. Finally, a massive thank you to Katie Hall who put up with me patiently for the four years it took to complete this study, helped piece my thoughts back together when they were scattered, and is all round a wonderful person. Thanks to all.

ABSTRACT

This thesis reports on the construction and analysis of the British Telecom Correspondence Corpus (BTCC), a searchable database of business letters taken from the archives of British Telecom. The letters in the corpus cover the years 1853-1982. This is a crucial period in the development of business correspondence but is so far underrepresented in available historical corpora. This research contributes knowledge in two main areas. Firstly, a number of methodological issues are highlighted with regard to working with public archives to produce linguistic resources. The way in which archives are typically organised, particularly the lack of item-level metadata, presents a number of challenges in terms of locating relevant material and extracting the sort of metadata that is necessary for linguistic analysis. In this thesis I outline the approach that was taken in identifying and digitising the letters for the BTCC, the issues encountered, and the implications for future projects that make use of public archives as a source of linguistic material.
Secondly, this study contributes new insights into the development of English business correspondence from the nineteenth to the twentieth century. The results show a notable decline in overtly deferential language and an increase in familiar forms. However, these more familiar forms also appear in fixed phrases and conventional patterns. This suggests that there was a move from formalised distance to formalised friendliness in the language of business correspondence in this period. We also see a shift away from the performance of institutional identity through phrases such as ‘I am directed by…’ towards an increased use of the pronoun ‘we’ to represent corporate positions. This shift in corporate identity seems to coincide with the decline in deferential language. Finally, an analysis of moves and strategies used in requests suggests that, as the twentieth century progressed, authors began to use a wider range of moves to contextualise and justify their requests. Furthermore, though the same request strategy types remain popular over the timeline of the BTCC, there is a degree of diversification in terms of how the most popular request strategies are expressed, and indirect strategies that rely more on implicature become somewhat more prevalent.

TABLE OF CONTENTS

1. Chapter 1 - Introduction
1.1. Project background
1.1.1. The BT Archives
1.1.2. New Connections
1.2. The current study
2. Chapter 2 - Literature review
2.1. Aims and Structure of the Chapter
2.2. The development and content of historical and letter corpora
2.2.1. Digital Letter Collections
2.2.2. Historical corpora
2.2.3. Contemporary Letter Corpora
2.2.4. Historical-contemporary gap
2.3. Studies of patterns in general language change since the 19th century
2.3.1. Nineteenth Century Language Change
2.3.2. Twentieth Century Language Change
2.4. Previous Studies of language change in letter writing practice
2.4.1. Historical Personal correspondence
2.4.2. Historical Business correspondence
2.4.3. Historical Personal-Business correspondence comparison
2.4.4. Contemporary correspondence
2.4.5. Post-Business Letter Era
2.5. Previous studies of letter writing guidance
2.6. Research Questions
3. Chapter 3 - Methodology and Data
3.1. Aims and structure of the chapter
3.2. Overview of corpora
3.3. Overview of corpus analysis
3.4. Methods of measuring language change
3.5. Methods of analysis
3.5.1. Letter Writing Manuals (to answer RQ1 - What can be added to our knowledge of letter writing conventions by analysing three letter writing manuals contemporary with the period represented in the BTCC?)
3.5.2. Formal Features (to answer RQ2 - How does practice in the BTCC correspond to advice given in letter writing manuals of the relevant period?)
3.5.3. Corpus techniques (to answer RQ3 - What are some of the linguistic features of the correspondence in the BTCC that can be identified using corpus analysis, and in what respects do these change over time?)
3.5.4. Qualitative analysis of individual request letters (to answer RQ4 - How are requests characterised in the corpus?)
3.6. Methods of corpus creation
3.6.1. The British Telecom Correspondence Corpus
Recommended publications
  • 20181024 Archive Collections Development Policy
    Archive Collections Development Policy Contents 1. Purpose 2. Background 3. Legal basis 4. Scope 4.1 What are archives? 4.2 Dates 4.3 Format 4.4 Condition 4.5 Duplicates 4.6 Philatelic Collections 4.7 Museum Collection 4.8 Data protection 5. How do we collect? 5.1 Ad-hoc or routine transfers from the business 5.2 Records retention schedules 5.3 Registries and Records Centre 5.4 Donations 5.5 Loans 5.6 Purchases 5.7 Films 5.8 Photographs 5.9 Appraisal 5.10 Deaccessioning 6. Approach to collections development 7. Archive of TPM and its predecessors 8. Responsibility for the archive 9. Implementation and Review Appendix A – Royal Mail Philatelic Collections. Requirement of Collections Appendix B – Records held elsewhere relating to postal operations and telecommunications 1 1. Purpose British postal heritage has touched the lives of countless millions throughout history, it has helped to shape the modern world and the heritage that The Postal Museum (TPM) preserves helps tell this story. The Royal Mail Archive together with the Museum and Philatelic Collections are a unique testament to the role played by postal services and the post office network in the development of modern Britain and the world. The archive supports the museum in its vision to be a leading authority on postal heritage and its impact on society, showcasing stories and collections in an engaging, interactive, educational and fun way. The archive also encapsulates the corporate memory of Royal Mail Group (RMG), including Parcelforce Worldwide; and Post Office Limited (POL). It is an important business asset that assists RMG and POL in meeting their informational, legal and regulatory requirements.
  • BT Archives British Phone Books
    January 2013 British Phone Books BT Archives maintains a near complete collection of original phone books for the United Kingdom from 1880, the year after the public telephone service was introduced into the UK. It also holds phone books for Southern Ireland until 1921 and the creation of Eire as a separate state. The collection contains phone books produced by BT and by the predecessor organisations from which BT is directly descended, including Post Office Telecommunications and private telephone companies. The phone books reflect the development of the telephone service in the UK, covering exclusively London when the telephone was first available; they gradually expand to include major provincial centres and are ultimately nationwide. [Image caption: NTC Phone Book, Yorkshire District, January 1888 (TPF/1/3)] Preservation of the collection: Phone books were not intended to be retained permanently, or even beyond their current status, with old phone books returned to be pulped for re-use. This was particularly important during the war and immediate post-war period because of a shortage of paper. The paper used in their production was also of poor quality. As a result many of the earlier phone books are in a fragile condition, and have to [...] damage to the originals, the collection up to 1992 was microfilmed. BT Archives holds the phone book on microfiche for 1993-2000 so access to all phone books from their creation in 1880 to 2000 is through microfilm (reels) or microfiche (sheets) in BT Archives searchroom, greatly assisting preservation of the originals. A 26-month digitisation project was completed in conjunction with Ancestry.co.uk to scan the phone books from 1880 to 1984 and make them available online through a subscription service.
  • Inventing the Communications Revolution in Post-War Britain
    Information and Control: Inventing the Communications Revolution in Post-War Britain Jacob William Ward UCL PhD History of Science and Technology 1 I, Jacob William Ward, confirm that the work presented in this thesis is my own. Where information has been derived from other sources, I confirm that this has been indicated in the thesis. 2 Abstract This thesis undertakes the first history of the post-war British telephone system, and addresses it through the lens of both actors’ and analysts’ emphases on the importance of ‘information’ and ‘control’. I explore both through a range of chapters on organisational history, laboratories, telephone exchanges, transmission technologies, futurology, transatlantic communications, and privatisation. The ideal of an ‘information network’ or an ‘information age’ is present to varying extents in all these chapters, as are deployments of different forms of control. The most pervasive, and controversial, form of control throughout this history is computer control, but I show that other forms of control, including environmental, spatial, and temporal, are all also important. I make three arguments: first, that the technological characteristics of the telephone system meant that its liberalisation and privatisation were much more ambiguous for competition and monopoly than expected; second, that information has been more important to the telephone system as an ideal to strive for, rather than the telephone system’s contribution to creating an apparent information age; third, that control is a more useful concept than information for analysing the history of the telephone system, but more work is needed to study the discursive significance of ‘control’ itself. 3 Acknowledgements There are many people to whom I owe thanks for making this thesis possible, and here I can only name some of them.
  • Corpus Studies in Applied Linguistics
    106 Pietilä, P. & O-P. Salo (eds.) 1999. Multiple Languages – Multiple Perspectives. AFinLA Yearbook 1999. Publications de l’Association Finlandaise de Linguistique Appliquée 57. pp. 105–134. CORPUS STUDIES IN APPLIED LINGUISTICS Kay Wikberg, University of Oslo Stig Johansson, University of Oslo Anna-Brita Stenström, University of Bergen Tuija Virtanen, University of Växjö Three samples of corpora and corpus-based research of great interest to applied linguistics are presented in this paper. The first is the Bergen Corpus of London Teenage Language, a project which has already resulted in a number of investigations of how young Londoners use their language. It has also given rise to a related Nordic project, UNO, and to the project EVA, which aims at developing material for assessing the English proficiency of pupils in the compulsory school system in Norway. The second corpus is the English-Norwegian Parallel Corpus (Oslo), which has provided data for both contrastive studies and the study of translationese. Altogether it consists of about 2.6 million words and now also includes translations of English texts into German, Dutch and Portuguese. The third corpus, the International Corpus of Learner English, is a collection of advanced EFL essays written by learners representing 15 different mother tongues. By comparing linguistic features in the various subcorpora it is possible to find out about non-nativeness generally and about problems shared by students representing different languages. 1 INTRODUCTION 1.1 Corpus studies and descriptive linguistics Corpus-based language research has now been with us for more than 20 years. The number of recent books dealing with corpus studies (cf.
  • Cold War Urbanism the Challenge of Survivable City Infrastructure
    Cold War Urbanism The Challenge of Survivable City Infrastructure Martin Dodge Geography | University of Manchester Richard Brook Manchester School of Architecture International Conference of Historical Geographers 9 July 2015 • Post-war, atomic age Britain, but deep austerity and imperial decline • 1949 shock of speed of Soviet atom bomb development • ‘Civil defence was about the preservation of Government (the State) and not about protecting the general populous’ • Essential national infrastructure • Urban planning, architecture / design, structural engineering, the techno-scientific bureaucracies Cold War • Speaking here, we might Context speculate on the role of geographers and the RGS….. Cold War Urbanism Definitions • What do we mean by urbanism? Summation of the forces shaping urban space and how people experience city life • Yet as Henri Lefebvre notes: Urbanism . masks a situation. It conceals operations. It blocks a view of the horizon, a path to urban knowledge and practice. It accompanies the decline of the spontaneous city and the historical urban core. It implies the intervention of power more than that of understanding. Its only coherence, its only logic, is that of the state – the void. The state can only separate, disperse, hollow out vast voids, the squares and avenues built in its own image – an image of force and restraint Lefebvre, H. (2003 [1970]) The Urban Revolution, trans. R. Bononno. Minneapolis: University of Minnesota Press. pp. 160–1 Cold War Urbanism Definitions • Urbanism in the 1950s and 60s as a military
  • Download (237Kb)
    CHAPTER II REVIEW OF RELATED LITERATURE In this chapter, the researcher presents the result of reviewing related literature which covers Corpus based analysis, children short stories, verbs, and the previous studies. A. Corpus Based Analysis in Children Short Stories In these last five decades the work that takes the concept of using corpus has been increased. Corpus, the plural forms are certainly called as corpora, refers to the collection of text, written or spoken, which is systematically gathered. A corpus can also be defined as a broad, principled set of naturally occurring examples of electronically stored languages (Bennet, 2010, p. 2). For such studies, corpus typically refers to a set of authentic machine-readable text that has been selected to describe or represent a condition or variety of a language (Grigaliuniene, 2013, p. 9). Likewise, Lindquist (2009) also believed that corpus is related to electronic. He claimed that corpus is a collection of texts stored on some kind of digital medium to be used by linguists for research purposes or by lexicographers in the production of dictionaries (Lindquist, 2009, p. 3). Nowadays, the word 'corpus' is almost often associated with the term 'electronic corpus,' which is a collection of texts stored on some kind of digital medium to be used by linguists for research purposes or by lexicographers for dictionaries. McCarthy (2004) also described corpus as a collection of written or spoken texts, typically stored in a computer database. We may infer from the above argument that computer has a crucial function in corpus (McCarthy, 2004, p. 1). In this regard, computers and software programs have allowed researchers to fairly quickly and cheaply capture, store and handle vast quantities of data.
  • 1. Introduction
    This is the accepted manuscript of the chapter MacLeod, N, and Wright, D. (2020). Forensic Linguistics. In S. Adolphs and D. Knight (eds). Routledge Handbook of English Language and Digital Humanities. London: Routledge, pp. 360-377. Chapter 19 Forensic Linguistics 1. INTRODUCTION One area of applied linguistics in which there has been increasing trends in both (i) utilising technology to assist in the analysis of text and (ii) scrutinising digital data through the lens of traditional linguistic and discursive analytical methods, is that of forensic linguistics. Broadly defined, forensic linguistics is an application of linguistic theory and method to any point at which there is an interface between language and the law. The field is popularly viewed as comprising three main elements: (i) the (written) language of the law, (ii) the language of (spoken) legal processes, and (iii) language analysis as evidence or as an investigative tool. The intersection between digital approaches to language analysis and forensic linguistics discussed in this chapter resides in element (iii), the use of linguistic analysis as evidence or to assist in investigations. Forensic linguists might take instructions from police officers to provide assistance with criminal investigations, or from solicitors for either side preparing a prosecution or defence case in advance of a criminal trial. Alternatively, they may undertake work for parties involved in civil legal disputes. Forensic linguists often appear in court to provide their expert testimony as evidence for the side by which they were retained, though it should be kept in mind that standards here are required to be much higher than they are within investigatory enterprises.
  • (By Email) Our Ref: MGLA031218-9660 21 January
    (By email) Our Ref: MGLA031218-9660 21 January 2019 Dear Thank you for your request for information which the GLA received on 3 December 2019. Your request has been dealt with under the Freedom of Information Act (2000) Our response to your request is as follows: I wish to see full copies of any and all meetings and emails with any of the following companies: InLink / LinkUK / Google / BT / Alphabet / Intersection / Primesight Regarding the InLink portals. Please find attached the information we have identified within scope of your request. Please note that some names of members of staff are exempt from disclosure under s.40 (Personal information) of the Freedom of Information Act. This information could potentially identify specific employees and as such constitutes as personal data which is defined by Article 4(1) of the General Data Protection Regulation (GDPR) to mean any information relating to an identified or identifiable living individual. It is considered that disclosure of this information would contravene the first data protection principle under Article 5(1) of GDPR which states that Personal data must be processed lawfully, fairly and in a transparent manner in relation to the data subject. A small amount of information contained within the email of 8 November 2018 (11:48) is exempt from disclosure under the exemption for Commercial Interests at section 43(2) of the FoIA. Section 43(2) provides that information can be withheld from release if its release would, or would be likely to, prejudice the commercial interests of any person. A commercial interest relates to a person’s ability to participate competitively in a commercial activity and in this instance, the information withheld from disclosure refers price estimates for advertisement space which is not publicly available, and disclosure would compromise inlinkuk’s competitiveness in the wider out of home market.
  • Distributed Memory Bound Word Counting for Large Corpora
    Democratic and Popular Republic of Algeria Ministry of Higher Education and Scientific Research Ahmed Draia University - Adrar Faculty of Science and Technology Department of Mathematics and Computer Science A Thesis Presented to Fulfil the Master’s Degree in Computer Science Option: Intelligent Systems. Title: Distributed Memory Bound Word Counting For Large Corpora Prepared by: Bekraoui Mohamed Lamine & Sennoussi Fayssal Taqiy Eddine Supervised by: Mr. Mediani Mohammed In front of President : CHOUGOEUR Djilali Examiner : OMARI Mohammed Examiner : BENATIALLAH Djelloul Academic Year 2017/2018 Abstract: Statistical Natural Language Processing (NLP) has seen tremendous success over recent years and its applications can be found in a wide range of areas. NLP tasks form the core of very popular services such as Google translation, the recommendation systems of big commercial companies such as Amazon, and even the voice recognizers of the mobile world. Nowadays, most NLP applications are data-based. Language data is used to estimate statistical models, which are then used in making predictions about new data which was probably never seen. In its simplest form, computing any statistical model will rely on the fundamental task of counting the small units constituting the data. With the expansion of the Internet and its intrusion in all aspects of human life, textual corpora became available in very large amounts. This high availability is very advantageous performance-wise, as it enlarges the coverage and makes the model more robust both to noise and unseen examples. On the other hand, training systems on large data quantities raises a new challenge to the hardware resources, as it is very likely that the model will not fit into main memory.
  • Private Telegraphy
    Private telegraphy: The path from private wires to subscriber lines in Victorian Britain Jean-François Fava-Verde Submitted in accordance with the requirements for the degree of Doctor of Philosophy The University of Leeds School of Philosophy, Religion and History of Science September 2016 ii The candidate confirms that the work submitted is his own and that appropriate credit has been given where reference has been made to the work of others. This copy has been supplied on the understanding that it is copyright material and that no quotation from the thesis may be published without proper acknowledgement © 2016 The University of Leeds and Jean-François Fava-Verde The right of Jean-François Fava-Verde to be identified as Author of this work has been asserted by him in accordance with the Copyright, Designs and Patents Act 1988. iii Acknowledgements In the first place, I would like to thank my supervisor, Professor Graeme Gooday, for his guidance and encouragement during the production of this thesis. I enjoyed our frank discussions and I am especially grateful to him for sharing his insight into the history of technology. My sincere thanks also to my examiners, Dr Jonathan Topham and Dr Ben Marsden, for their constructive comments on my thesis. It has also been a privilege to work alongside the knowledgeable and friendly members of the telecommunications reading group of the department, especially Dr Michael Kay, Dr John Moyle, and Dr Lee Macdonald, who broadened my vision and provided insights into various themes such as private telephony, telegraphic lines testing or the effect of solar disturbance on telegraphic lines.
  • Unit 3: Available Corpora and Software
    Corpus building and investigation for the Humanities: An on-line information pack about corpus investigation techniques for the Humanities Unit 3: Available corpora and software Irina Dahlmann, University of Nottingham 3.1 Commonly-used reference corpora and how to find them This section provides an overview of commonly-used and readily available corpora. It is also intended as a summary only and is far from exhaustive, but should prove useful as a starting point to see what kinds of corpora are available. The Corpora here are divided into the following categories: • Corpora of General English • Monitor corpora • Corpora of Spoken English • Corpora of Academic English • Corpora of Professional English • Corpora of Learner English (First and Second Language Acquisition) • Historical (Diachronic) Corpora of English • Corpora in other languages • Parallel Corpora/Multilingual Corpora Each entry contains the name of the corpus and a hyperlink where further information is available. All the information was accurate at the time of writing but the information is subject to change and further web searches may be required. Corpora of General English The American National Corpus http://www.americannationalcorpus.org/ Size: The first release contains 11.5 million words. The final release will contain 100 million words. Content: Written and Spoken American English. Access/Cost: The second release is available from the Linguistic Data Consortium (http://projects.ldc.upenn.edu/ANC/) for $75. The British National Corpus http://www.natcorp.ox.ac.uk/ Size: 100 million words. Content: Written (90%) and Spoken (10%) British English. Access/Cost: The BNC World Edition is available as both a CD-ROM or via online subscription.
  • The British Post Office in the Telecommunications Era
    H-Albion The British Post Office in the Telecommunications Era Discussion published by Jacob Ward on Tuesday, July 25, 2017 The British Post Office in the Telecommunications Era Date: Thursday, 31st August 2017 Location: The Dana Library and Research Centre, 165 Queen’s Gate, London SW7 5HD This workshop will explore the many different, and often surprising, sides of the British Post Office during its telecommunications era (1870-1975), from its takeover of the UK’s telegraph network to the privatisation of British Telecom. During this time, the British Post Office did more than sorting post and providing telephones. As a government institution, it had its own engineering research facilities, acted as a public savings bank, and regulated and licensed the nation’s broadcasters. It was the UK’s largest employer until the 1970s, and so was a familiar aspect of life for many. It played an understated but key communications role in both World Wars, and facilitated secretive military endeavours. It created new opportunities and types of work for both men and women, and was a key driver for automation throughout the 20th century. Historians from a range of disciplines and backgrounds will meet to discuss the forgotten activities and untold stories of the British Post Office. By drawing together a variety of narratives we hope to illustrate the vital but often unacknowledged roles that this institution played in twentieth century society. The workshop is open to all (up to capacity), and we particularly encourage attendance from graduate students and early-career scholars. To register for this conference, please use the following link: https://postofficeconference.eventbrite.co.uk This workshop is organized by three AHRC-sponsored doctoral students, in collaboration with the Science Museum and BT Archives.