Web Archiving and You Web Archiving and Us

Amy Wickner
University of Maryland Libraries
Code4Lib 2018
Slides & Resources: https://osf.io/ex6ny/

Hello, thank you for this opportunity to talk about web archives and archiving. This talk is about what stakes the code4lib community might have in documenting particular experiences of the live web. In addition to slides, I’m leading with a list of material, tools, and trainings I read and relied on in putting this talk together. Despite the limited scope of the talk, I hope you’ll each find something of personal relevance to pursue further.

“the process of collecting portions of the World Wide Web, preserving the collections in an archival format, and then serving the archives for access and use”
International Internet Preservation Consortium

To begin, here’s how the International Internet Preservation Consortium, or IIPC, defines web archiving. Let’s break this down a little. “Collecting portions” means not collecting everything: there’s generally a process of selection. “Archival format” implies that long-term preservation and stewardship are the goals of collecting material from the web. And “serving the archives for access and use” implies a stewarding entity conceptually separate from the bodies of creators and users of archives. It also implies that there is no web archiving without access and use. As we go along, we’ll see examples that both reinforce and trouble these assumptions. A point of clarity about wording: when I use, for example, “critique,” “question,” or “trouble” as a verb, I mean inquiry rather than judgement or condemnation.

we are collectors

So, preambles mostly over. Between us, we here in this room and on the livestream and so on represent collectors of web-based material, subjects of captured websites, and users of web archives. To confirm, let’s do a quick poll. Please raise your hand if you are a collector of web-based material. Thank you.
Web design & development; Labor; Browser technology; Federal policy; Personal tech; Corporate policy; Digital culture; Information ethics; Web archiving technology; Collaboration; Costs & impact of storage; Attention

As individuals, collectives, and agents of institutions building web archives, we manage many moving parts in attempting to document even a small part of the living web. Here’s a short list of areas that influence our practices. Web development affects the archivability of websites. Personal tech influences how people produce and consume web-based material. Regime change leads to federal policy change like the end of net neutrality, content appearing and disappearing, large-scale collaborations like the End of Term archive, and the Internet Archive moving servers to Canada. Corporate policies and practices like terms of service, DRM, startup churn, and data selling influence both archiving and live use of the web, as we heard Wednesday in Mark Matienzo’s overview of IndieWeb. And trends in ethics include growing discussion of privacy as contextual integrity, particularly in online spaces, as well as the right to be forgotten.

What constitutes archival value is, and will always be, “specific to place, time, culture and individual subjectivity. It does not dangle somewhere outside of humanity, immutable, pristine, transcendent. The appraiser creates, or recreates, archival value with every appraisal exercise.”
Harris 1998

As collectors, we also work within the specific contexts of our biases. This is an appraisal practice, in which collectors assign value to material and take actions accordingly. We aren’t always able to articulate these criteria, nor do we always itemize the actions taken -- colloquially lumping it all together in the word “save.” Appraisal as a fundamental archival practice has been hotly and insularly contested for more than a century -- and I have a syllabus to share if you’re at all curious.
This [POINTS TO SLIDE] is one of the more accessible and increasingly relevant approaches, articulated by Verne Harris in 1998. He argues that appraisal is where archivists’ power is most concentrated, and that it’s closer to storytelling than to a science. Trying to articulate that story is one way we can grow as web archivists.

How can we better use web archiving technologies?
Competent, critical, curious; Learn, teach how it works; Foreground labor; Put faces to names behind infrastructure
Blewer 2017; Arquivo.pt 2018

Improving our appraisal also comes down to being not only competent but also critical and curious users of web archiving tools. As starting points, I recommend two blog posts by Ashley Blewer: one that approachably introduces the technical side of popular web archiving frameworks; and one that explains the links between archivability and accessibility. If you find yourself in a position to teach with web archiving, try to scaffold learning around not just how to use things but also how they work. Let’s also look at how labor impacts web archiving: How much time do you spend on different parts of the process? What kind of work is web archiving? Put faces to the names of developers, archivists, and designers behind what you collect and how. The developers at Arquivo.pt, the Portuguese web archive, put out a pretty honest video last week describing the process behind their work, including some recent struggles and decisions around improving services. Documentation like this gives us insight into the care of web archives.

we are subjects

So that’s an overview of how collectors and web archiving mutually shape one another. Next let’s consider how web archiving impacts subjects represented in web archives. And in fact, much of the power archivists wield is in the description or metadata that tells the story of a collection and its subjects. Please raise your hand if you’re a subject represented in web archives. Trick question: it’s all of us.
How are we represented as subjects?
identity; safety; privacy; access; accessibility; exclusion; harm

To get at why representation in web archives even matters, consider how you and I are represented on the internet. Many of us come face to face every day with the reality that the web is not so friendly to us, that it’s not built for us to use, and that it is designed to propagate and privilege limited, harmful representations, all of which have real impacts on our well-being. Spaces nominally designed for participation -- Twitter, Wikipedia, reddit, I don’t really want to go on -- can be some of the most unwelcoming. There’s the tumultuous experience of trying to manage our own identities and safety online. Appeals to and for the right to be forgotten elicit cries of “Shame!”, of government overreach, and of the dangers of censorship for accountability and democracy. Protecting privacy is now treated as an individual rather than a collective responsibility: the admirable work of the Electronic Frontier Foundation, Library Freedom Project, and more only emphasizes that institutions and their policies currently trend towards a lack of respect for privacy. And maybe they always have.

So. We know the power of archival representation. We know that bias, misrepresentation, and underrepresentation are rampant on the web, including in highly participatory spaces. We’re aware of access and accessibility issues in all of the above. So what warrant could we possibly have to assume web archives would be any different?

Future users; Current users; Audience; Designated community

Web archives perpetuate unresolved issues that affect us as subjects and for which we bear responsibility as collectors. Communicating the context of web archiving to people or robots who might use the results is one way to confront these issues.
There were some raging Western archival debates in the mid-to-late 20th century about whether and how archivists should envision a future user or user community when building collections, and I assume there are small fires of this kind smoldering today. In the digital curation world, and in any system even nominally based on the Open Archival Information System (OAIS), we assume a designated community to justify preserving certain data. It can be illuminating to examine one’s assumptions about audience.

we are users

Now, please raise your hand if you use web archives. Thank you.

How do people use web archives?
remix; critique; “plural and heterogeneous archives”; legal evidence; historical evidence; receipts
Post 2017; Taylor 2017; Belovari 2017; Milligan 2016; Zannettou, et al. 2018

How do people use web archives, anyway? Artists use them as source material for remix and critique, as what Colin Post calls “plural and heterogeneous archives.” Courts are starting to understand the legal uses and limits of web archives as evidence, including whose interests such evidence and cases tend to serve. Historians have used web archives to study political discourse and engagement, among other topics, but it’s also been pointed out that today’s web archives are so little conducive to historical research methods that it may not be strictly accurate to refer to them as “the historical record.” And of course, journalists and the general public use them for RECEIPTS. Maybe you see yourself in one of these categories, although I’ve left so many out. Let’s think about lines of inquiry we can take as users of web archives, to critically read, much as we learn to critically build.

EXERCISE: postcolonial critique
● How do web archives reflect or suppress values of the people they represent?
● How are the decisions behind designs, appraisal, and access obscured or revealed?
● What are the labor practices behind web archives and archiving technologies?
● What are the environmental impacts of web archiving?
Anderson 2002

Fundamentally, this means approaching web archives as having been constructed in different ways and for a variety of purposes. Just like other archives, the narrative of how they’re built is closely tied to how they represent the world.