PRESERVING EBOOKS: PAST, PRESENT AND FUTURE
A Series of Perspectives

Trevor Owens
Library of Congress, USA
[email protected]
https://orcid.org/0000-0001-8857-388X

Maureen Pennock
British Library, UK
[email protected]
https://orcid.org/0000-0002-7521-8536

Tom J. Smyth
Library & Archives Canada, Canada
[email protected]
https://orcid.org/0000-0002-3829-2980

Tobias Steinke
Deutsche Nationalbibliothek, Germany
[email protected]
https://orcid.org/0000-0002-3999-1687

Abstract – This panel will present and discuss different eBook workflows and challenges from four national libraries, considering a range of issues from technical complexities to evolution of the content type and changes in the publishing/collecting landscape.

Keywords – digital preservation, ebooks, ingest, formats, scale, access

Conference Topics – The Cutting Edge: Technical Infrastructure & Implementation; Exploring New Horizons

I. OVERVIEW

eBooks are the backbone of many a National Library collection, constituting a substantial proportion of the digital content our readers expect to be able to access and consult. Our digital preservation activities reflect this, with established infrastructures and workflows for eBook acquisition, ingest, management and access, all at scale. Yet the eBook as a content type is evolving, and user expectations for access are evolving alongside. Dealing with this requires both a responsive framework and an eye on the horizon.

This panel brings together experts from leading national libraries to openly discuss various elements of their respective eBook preservation activities and research programs, and explore where similarities and differences may lie. Below we summarize the eBook collections at each organization, existing challenges, and research activities.

A. eBooks at the British Library

Since 2013 the British Library has collected eBooks under the UK's Non-Print Legal Deposit (NPLD) Regulations. Our preferred formats are EPUB and PDF, though we also have a small number of MOBI files. There are around 400,000 NPLD eBooks in the collection, with access rates at around 5,500 per month. We also have a substantial number of digitized books published under commercial partnerships with Google and Microsoft. Going forwards, we have an interest in Open Access eBooks published outside of the UK and eBooks published as mobile apps.

Current challenges include ensuring an uninterrupted supply to readers during a forthcoming repository migration, and delivering access to all six UK Legal Deposit Libraries in line with regulation requirements for single sequential access. Active research areas include collection and preservation of mobile apps and evolution of the EPUB format.

B. eBooks at the Library of Congress

The U.S. Library of Congress has acquired eBooks through a wide range of different programs and initiatives. For years, the institution has received and acquired eBooks through its Cataloging in Publication Program, special relief agreements for copyright deposit, web archiving, and other routine transfer methods for acquisition.

In support of the digital collecting plan, staff across the institution are currently working to expand these efforts and to pilot acquiring, preserving, and delivering selected open access eBooks. The majority of this content is in PDF and EPUB formats, but the institution has copies of eBooks in a much wider range of formats as well. As outlined in the Library of Congress Digital Strategy, it is necessary to plan for work around eBooks in terms of exponential collection growth. To that end, a key area of focus for the institution is working to scale up and enhance workflows and processes.

C. eBooks at the Deutsche Nationalbibliothek

The German National Library currently holds around 1 million eBooks in the formats PDF and EPUB, equating to approx. 16% of all collected digital publications (excluding digitized objects). The German legal deposit collection has included eBooks since 2006. eBooks are ingested in the digital preservation system of the German National Library. All eBooks are analyzed and validated, resulting in generation of a risk analysis 'ingest level'. Checks include tests for copy protection, especially in PDF files. There is a separate repository for giving access.

In an ongoing internal project, all aspects of the digital workflows are currently being optimized for better performance. This includes using a common workflow engine, replacing the repository for access with something more fitting, and consolidating the different workflows for digital objects including eBooks.

D. eBooks at Library & Archives Canada

LAC has been acquiring eBooks of various different formats since the 1990s. Digital legal deposit legislation came into effect in 2006, though participation in the legal deposit program varies, with commercial/retail publishers and scholarly communities lagging behind government and self-published content.

The current technical platform for eBook acquisition is based on a pilot project created in 1994. In 2018, LAC embarked on an initiative to modernize its systems and, as part of that, procured Preservica as a DAM and a Digital Preservation Solution. New information package specifications for published heritage collections are currently being developed for use within Preservica. In addition, LAC's Published Acquisitions sector is working to implement a collection gap analysis and monitoring framework in order to measure and expand participation in the Legal Deposit program. Another key activity is the development of a seamless platform for publishers and authors to transfer digital content and metadata to LAC. One of the desirable outcomes is streamlined workflows from acquisition to preservation.

II. PANEL STRUCTURE

Following short introductions on the state of the practice to acquire, preserve, and deliver eBooks at each institution, panelists will then move on to discuss a range of questions such as:

- How does your organization staff and support eBook acquisition, preservation and access?
- How have you embedded preservation support into your end-to-end workflows?
- Do you have preferred formats for eBook preservation; if so, what are they and why?
- What are the biggest challenges you have encountered in collecting, preserving and providing access to eBooks?
- What changes have you seen in your eBook collection over the past decade and how have you responded?
- How are you monitoring the publishing landscape for more changes going forwards?

Panelists will discuss answers in advance of the session to ensure answers are representative of the variety in our approaches, thus ensuring we provide sufficient conflicting perspectives to create interesting discussion. Attendees will be encouraged to ask additional questions of the panelists during an open-ended Q&A session.

III. PANELISTS

Maureen Pennock is Head of Digital Preservation at the British Library. She sits on the Digital Preservation Coalition Board of Directors and co-chairs the DPC Special Interest Group for Digital Preservation in National Libraries, Archives and Museums. She is also Chair of the UK Legal Deposit Libraries' Digital Preservation Committee and a member of the UNESCO PERSIST initiative.

Dr. Trevor Owens serves as the first Head of Digital Content Management at the U.S. Library of Congress. In addition, he teaches graduate seminars in digital history for American University's History Department and graduate seminars in digital preservation for the University of Maryland's College of Information, where he is also a Research Affiliate with the Digital Curation Innovation Center.

Tobias Steinke works at the German National Library on the conceptual development of digital preservation and is responsible for the web archiving project of the library. He has been involved in several national and international projects about digital preservation and standardization.

Tom J. Smyth is a senior librarian and manager of the Digital Integration group within the Digital Preservation Division at Library and Archives Canada. His work involves digital transformation of library and archival programs and services in digital curation contexts. He has managed digital library special collections and LAC's Web Archiving Program since 2009.

The panel will be moderated by Paul Wheatley, Head of Research & Practice at the Digital Preservation Coalition. Paul is an experienced panelist and moderator with many years of experience working with digital collections and in digital preservation.

16th International Conference on Digital Preservation, iPRES 2019, Amsterdam, The Netherlands. Copyright held by the author(s). The text of this paper is published under a CC BY-SA license (https://creativecommons.org/licenses/by/4.0/). DOI: 10.1145/nnnnnnn.nnnnnnn

iPRES 2019 - 16th International Conference on Digital Preservation, September 16-20, 2019, Amsterdam, The Netherlands.