Bealsetal2020theatlasofdigitise
Total Page:16
File Type:pdf, Size:1020Kb
Edinburgh Research Explorer The Atlas of Digitised Newspapers Citation for published version: Beals, M, Bell, E, Cordell, R, Fyfe, P, Galina Russell, I, Hauswedell, T, Neudecker, C, Nyhan, J, Oiva, M, Pado, S, Peña Pimentel, M, Rose, L, Salmi, H, Terras, M & Viola, L, The Atlas of Digitised Newspapers: Reports from Oceanic Exchanges, 2020, Web publication/site. https://doi.org/10.6084/m9.figshare.11560059 Digital Object Identifier (DOI): 10.6084/m9.figshare.11560059 Link: Link to publication record in Edinburgh Research Explorer Document Version: Publisher's PDF, also known as Version of record General rights Copyright for the publications made accessible via the Edinburgh Research Explorer is retained by the author(s) and / or other copyright owners and it is a condition of accessing these publications that users recognise and abide by the legal requirements associated with these rights. Take down policy The University of Edinburgh has made every reasonable effort to ensure that Edinburgh Research Explorer content complies with UK legislation. If you believe that the public display of this file breaches copyright please contact [email protected] providing details, and we will remove access to the work immediately and investigate your claim. Download date: 24. Sep. 2021 The Atlas of Digitised Newspapers The Atlas of Digitised Newspapers Reports from Oceanic Exchanges Exchanges Oceanic from Reports The Atlas of M. H. Beals and Emily Bell M. H. Beals and Emily Digitised Newspapers Reports from Oceanic Exchanges M. H. Beals and Emily Bell with additional contributions by Ryan Cordell, Paul Fyfe, Isabel Galina Russell, Tessa Hauswedell, Clemens Neudecker, Julianne Nyhan, Mila Oiva, Sebastian Padó, Miriam Peña Pimentel, Lara Rose, Hannu Salmi, Melissa Terras, and Lorella Viola and special thanks to Seth Cayley (Gale), Steven Claeyssens (KB), Huibert Crijns (KB), Nicola Frean (NLNZ), Julia Hickie (NLA), Jussi-Pekka Hakkarainen (NLF), Chris Houghton (Gale), Melanie Lovell-Smith (NLNZ), Minna Kaukonen (NLF), Luke McKernan (BL), Chris McPartland (NLA), Maaike Napolitano (KB), Tim Sherratt (University of Canberra) and Emerson Vandy (NLNZ) DOI: 10.6084/m9.figshare.11560059 Disclaimers: The wider project is funded by the ‘Transatlantic Partnership for Social Sciences and Humanities 2016 Digging Into Data Challenge’. The research conducted by UK Institutions is funded by the Economic and Social Research Council (ESRC) and the Arts and Humanities Research Council (Grant Reference: ES/ R004110/1). The support of the Economic and Social Research Council (ESRC) is gratefully acknowledged. Although we have directly consulted with the various institutions discussed in this report, the final findings, conclusions and recommendations expressed in this publication do not necessarily represent those of the discussed database providers or the contributors’ host institutions. Executive Summary Between 2017 and 2019, Oceanic Exchanges (http://www.oceanicexchanges.org), funded through the Transatlantic Partnership for Social Sciences and Humanities 2016 Digging into Data Challenge (https:// diggingintodata.org), brought together leading efforts in computational periodicals research from six countries—Finland, Germany, Mexico, the Netherlands, the United Kingdom, and the United States—to examine patterns of information flow across national and linguistic boundaries. Over the past thirty years, national libraries, universities and commercial publishers around the world have made available hundreds of millions of pages of historical newspapers through mass digitisation and currently release over one million new pages per month worldwide. These have become vital resources not only for academics but for journalists, politicians, schools, and the general public. However, these digitisation programmes share a critical weakness: the very creation of national newspaper collections obscures the fact that international news exchange was central to the nineteenth-century press. The Atlas of Digitised Newspapers and Metadata is an open access guide to a selection of newspaper databases around the world. Its initial selection is limited in scope, being comprised of the ten databases (including the aggregator Europeana) for which we were able to secure access and licensing to the machine-readable data. Nonetheless, it aims to form the foundation of a wider mapping of collections beyond its current North Atlantic and Anglophone-Pacific focus. It brings together their histories and digitisation choices with a deeper look at the language of the digitised newspaper, the evolution of newspaper terminology and the variety of metadata available in these collections. It explores how machine-readable information about an issue, volume, page, and author is stored in the digital file alongside the raw content or text, and provides a controlled vocabulary designed to be used across disciplines, within academia and beyond. This report draws upon multiple taxonomies: our own open access dataset (doi:10.6084/m9.figshare.11560110), which provides a full catalogue of metadata fields across the collections, academic and industry discussions of the newspaper as a journalistic form and historical artefact, digitisation guidelines and strategies, library websites, annual reports, interviews with librarians and digitisation providers and the data files themselves. The maps of this Atlas explore each of our overarching categories in detail, providing a selection of language variants, the technical definition we employed in the categorisation process, and notes on its usage across the collections and in the wider world of press history. This allows a greater understanding of how the term is currently being used in different ways by different groups, and allows researchers to navigate to the specific type of information they require and ascertain its availability across these collections. Each entry also includes technical information for obtaining this data across the collections, including data types, which often vary considerably, and XPaths for locating the information within that dataset. With this information, researchers should be able to understand the different structures of these collections and develop computational means for robustly comparing datasets to explore deeper and more meaningful research. After using the Atlas, we hope that readers will begin to understand the great wealth of metadata available for digitised newspapers, much of which is comparable across collections, nations and languages. As we explored these collections, we found a sincere effort on the part of librarians, scholars and commercial providers to converge upon a knowledge system that allowed meaningful enquiry and reflected a consistent layer (if not a complete reproduction) of these historical artefacts. However, this seeming convergence does mask important outliers and divergent interpretations of key bibliographic and conceptual categories—divergences that we hope these maps will highlight, and encourage future digitisers to consider when building or expanding their databases. In sum, the rise of digitisation promises great opportunities for those who wish to engage with newspaper archives, but as with all historical archives, digital collections require researchers to be mindful of their shape, provenance and structure before any conclusion can be drawn. It is the responsibility of both digitiser and researcher to understand both the map and also the terrain. i Table of Contents Executive Summary ....................................................................... i Table of Contents ........................................................................... ii List of Abbreviations ...................................................................... iii Introduction .................................................................................... 01 The Project ............................................................................... 01 Aims and Objectives ................................................................. 01 Methodology ............................................................................. 02 Database Histories ................................................................... 04 Metadata Maps ......................................................................... 05 Suggested Citation ................................................................... 05 Database Histories ........................................................................ 06 British Library 19th Century Newspapers ............................... 06 Chronicling America ................................................................ 11 Delpher ..................................................................................... 14 Europeana Newspapers ........................................................... 17 Hemeroteca Nacional Digital de México.................................. 19 Papers Past .............................................................................. 21 Suomen Kansalliskirjaston digitoidut sanomalehdet ............. 25 Times Digital Archive ............................................................... 28 Trove ......................................................................................... 31 ZEFYS ....................................................................................... 36 Metadata Mappings ....................................................................... 38 Content Data ...........................................................................