
Local for the Uninitiated Public: The Community Webs Project

Presenters:
Kyrie Whitsett – The Internet
Jacquelyn Oshman – New Brunswick Free Public Library, New Jersey
Emilirose Rasmusson – Marshall-Lyon County Library, Minnesota

What is Web Archiving?

Web Archiving is the process of evaluating, selecting, , cataloging, providing access to, and preserving digital materials for researchers today and in the future.

Source: https://www.loc.gov/webarchiving/

The Internet is really, really big. We can’t archive it all.

Local preserving local

LibTech 2019|Saint Paul, MN|#communitywebs

> Non-profit & Archive founded in 1996
> HQ a former church and it’s awesome (and please visit!)
> Extensive open-source technology development for archiving/preservation
> Archive-It
> Web Archiving Service since 2006
> 625 Organizational Users
> Web-based application
> Includes training and support

Web Archiving in Public Libraries

2015 NDSA Survey results: 2% from Public Libraries

Many public libraries have a long history of collecting analog materials documenting their region. What happens now that these materials are increasingly published exclusively online?

Web Archiving in Public Libraries

Perceived roadblocks

> Lack of funding or institutional buy-in
> Lack of training resources
> Staff time requirements
> Technical requirements

Applicants

110 applications from public libraries across the country
A cohort of 28 small, medium, and large public libraries

15 IMLS Participants
13 IA Participants

Participants

★ Athens Regional Library System (GA)
★ Denver Public Library (CO)
★ Birmingham Public Library (AL)
★ Metropolitan Library System (OK)
★ (NY)
★ New Brunswick Free Public Library (NJ)
★ Buffalo & Erie County Public Library (NY)
★ Patagonia Library (AZ)
★ Cleveland Public Library (OH)
★ Pollard Memorial Library (MA)
★ Columbus Metropolitan Library (OH)
★ (NY)
★ DC Public Library (DC)
★ San Diego Public Library (CA)
★ East Baton Rouge Parish Library (LA)
★ San Francisco Public Library (CA)
★ Forbes Library (MA)
★ Schomburg Center for Research in Black Culture, New York Public Library (NY)
★ Grand Rapids Public Library (MI)
★ Henderson District Public Libraries (NV)
★ Sonoma County Library (CA)
★ Kansas City Public Library (MO)
★ The Urbana Free Library (IL)
★ Lawrence Public Library (KS)
★ West Hartford Public Library (CT)
★ County of Los Angeles Public Library (CA)
★ Westborough Public Library (MA)
★ Marshall-Lyon County Library (MN)

Community Webs Program

> IMLS LB21 grant, National Digital Platform
> Community Webs: Empowering Public Librarians to Create Community History Web Archives
> Two-year project (June 2017 - May 2019)
> Three library “leads”

Community Webs Goals

Education & Training
➔ Establish a cohort network
➔ Professional development activities

Collection Development
➔ Create training materials on community memory web archiving (communitywebs.archive-it.org)
➔ Seed innovative local programming and partnerships

Expanding National Capacity
➔ Provide web archiving services and ongoing storage and access

Community Webs Program

Cohort Activities
➔ Build web archive collections
➔ Local programs and partnerships
➔ In-person and online training

Participants Receive
➔ Web archiving services
➔ Stipends for events, conferences, professional development

Program Outcomes
➔ Cohort of public library leaders with expertise in born-digital local history collecting
➔ ~15TB per year in born-digital local community archives

https://communitywebs.archive-it.org/

> Started in 2016 in response to massive floods and police shootings
> Developed into a more formal program with a development policy
> Working with community artists to identify content
> “Digital companion to physical vertical file”

Sonoma County Library

> Developed out of an immediate need to capture content related to the North Bay Fires of 2017

> Involved community members from the local cultural heritage community

> Social media, blogs, official & community response to the crises

Project Themes

> Think creatively about what collections are and can be, and how they relate to existing practices or start new ones
> Focus on at-risk content liable to disappear quickly
> Involvement of subject experts and community members in collecting and public programming
> Collaboration is key!

Project Outcomes

> 2017: 13% PLs!
> Numerous additional PLs now exploring/doing WA
> Expanded diversity of org, prof, and collection types
> 10s of TBs of local history web resources preserved
> Public/professional events and outreach
> Training material & OERs

THANK YOU!

Kyrie Whitsett

The technical side ...

Intro to Web Archiving - https://communitywebs.archive-it.org/web-archive.html
Archive-It User Guide - https://support.archive-it.org/hc/en-us/categories/201179946-Archive-It-User-Guide

Terminology

Collection - A group of Seed URLs curated around a common theme, topic, or domain.

Seed - The starting point URL for a crawler and access point to archived collections.

Crawl - A web archiving (or "capture") operation that is conducted by an automated agent, called a crawler, a robot, or a spider.

Test Crawl - A crawl that isn’t automatically saved long-term.

Crawler Trap - Part of a site that can generate an infinite number of (often invalid) URLs.
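To make the idea of a crawler trap concrete, here is a minimal Python sketch of a heuristic that flags suspicious URLs. The function name `looks_like_crawler_trap` and all three thresholds are illustrative assumptions, not rules that Archive-It or any particular crawler is known to use.

```python
from collections import Counter
from urllib.parse import urlparse


def looks_like_crawler_trap(url, max_length=200, max_depth=12, max_repeats=3):
    """Return True if a URL shows common signs of a crawler trap.

    The thresholds are hypothetical defaults chosen for illustration only.
    """
    segments = [s for s in urlparse(url).path.split("/") if s]
    # Extremely long URLs are often generated endlessly by calendars or filters.
    if len(url) > max_length:
        return True
    # Very deep paths suggest a loop of relative links.
    if len(segments) > max_depth:
        return True
    # The same path segment repeating many times (e.g. /next/next/next/...).
    if segments and max(Counter(segments).values()) > max_repeats:
        return True
    return False


if __name__ == "__main__":
    print(looks_like_crawler_trap("https://example.org/events/2019/03/"))
    print(looks_like_crawler_trap("https://example.org/cal/next/next/next/next/next"))
```

A check like this is only a rough screen; reviewing test crawl reports is still the practical way to find traps on a given site.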

Scope - What the crawler will capture and what it won’t - can be expanded or limited
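The notion of scope, and of expanding or limiting it, can be sketched in a few lines of Python. This is a simplified assumption about how a scope rule might work (same host and path as the seed, plus optional additions and exclusions); the function `in_scope` and its parameters `extra_hosts` and `blocked_substrings` are hypothetical and do not reproduce Archive-It's actual scoping logic.

```python
from urllib.parse import urlparse


def in_scope(url, seed, extra_hosts=(), blocked_substrings=()):
    """Decide whether a URL falls inside a seed's scope (illustrative rule)."""
    u, s = urlparse(url), urlparse(seed)
    # Limiting scope: anything matching a blocked pattern is excluded.
    if any(pattern in url for pattern in blocked_substrings):
        return False
    # Expanding scope: explicitly allowed extra hosts are included.
    if u.hostname in extra_hosts:
        return True
    # Default: same host as the seed, at or below the seed's path.
    seed_path = s.path if s.path.endswith("/") else s.path + "/"
    return u.hostname == s.hostname and (u.path + "/").startswith(seed_path)


if __name__ == "__main__":
    seed = "https://example.org/library/"
    print(in_scope("https://example.org/library/events", seed))
    print(in_scope("https://example.org/parks/", seed))
```

Passing `extra_hosts=("other.example.com",)` would expand the scope to a second site, while `blocked_substrings=("calendar",)` would limit it by skipping calendar pages.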

Robots.txt - Files that a site owner can add to their site to keep crawlers from accessing all or parts of it.
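As a small illustration of how a robots.txt file affects a crawler, the sketch below uses Python's standard `urllib.robotparser` module to check two URLs against an example rule set. The robots.txt content, the `example-crawler` user agent, and the helper `allowed` are made up for this example; a real crawler would fetch the file from the site itself.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content that blocks every crawler from /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.org/events/",
                "https://example.org/private/report.html"):
        print(url, allowed(ROBOTS_TXT, "example-crawler", url))
```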

Patch Crawl - A crawl to capture and patch in content that may have been missing from your original crawl.

https://support.archive-it.org/hc/en-us/articles/208111686-Glossary-of-Archive-It-and-Web-Archiving-Terms

Challenges in Web Archiving

Logistics

● Getting Administration on board
  ○ Explaining what web archiving is and its importance
● Collection policies for web archives
● Permission to crawl
  ○ Ethical, legal considerations
● Community involvement
● Dedicating time for maintenance & evaluation

https://communitywebs.archive-it.org/curriculum.html

Technical Challenges

Sometimes the crawler doesn’t or can’t do what you want it to ...

Tips and Tricks

● ALWAYS test crawl first …
● Check in frequently
● With pages that pull in a lot of data, decide what you need and how often
● Add one-page crawls for pages that might create crawler traps
● Check the help files for recommendations for types of sites:
  ○ https://support.archive-it.org/hc/en-us/articles/208001336-Scoping-guidance-for-specific-types-of-sites
● Download your collection’s metadata to edit it
  ○ https://support.archive-it.org/hc/en-us/articles/208012996-Upload-and-download-metadata
● You can upload WARC/ARC files from other crawls or crawlers
  ○ https://support.archive-it.org/hc/en-us/articles/360000651246-Integrate-external-WARC-ARC-files-into-Archive-It-collections
● When in doubt, ask for help!

How do I get involved with Web Archiving?

● “Archive-It is a subscription service of the Internet Archive. Subscriptions to AIT are paid annually and amounts vary depending on scope of collecting a new partner is looking to do. Our smallest account level is 3K/year, however we are somewhat flexible with pricing for smaller institutions who want to start WA programs so anyone interested should reach out!” - Maria Praetzellis

● Manually add URLs to the Wayback Machine

● Webrecorder

New Brunswick Free Public Library

City Statistics:

• 2016 Census-estimated population of 56,910
• Middlesex County Seat
• Home of Rutgers University, Johnson and Johnson World HQ, Bristol-Myers Squibb, Robert Wood Johnson and St. Peters Medical Centers
• Racial breakdown:
  – Hispanic 49.93%; White 45.43%; African American 16.04%
  – Spanish and Hungarian languages prominent

What we are saving:

EVERYTHING WE CAN THINK OF!

• Websites
  – Non-profit organizations
  – Churches
  – City Government
  – City-specific English and Spanish newspapers
  – History-related sites
  – New Brunswick Development Corporation (and related articles)
• Instagram Pages
  – Library
  – Windows of Understanding project
• Facebook Pages
  – African American Fireman’s Association
  – Spanish Community organizations that don’t have websites
• Blogs
  – Local Historian/Architect
• Podcast
  – WCTC AM radio show “New Brunswick Speaks”

Policy

Tracking Usage with Analytics

Marshall-Lyon County Library

Local statistics

- City of Marshall, Minnesota
  - Population: 13,616
  - One university, several manufacturing and food processing plants
  - County seat of Lyon County, MN
- Lyon County, Minnesota
  - Southwestern area of Minnesota
  - Population: 25,857
  - Predominantly white (estimated 88%), but with growing international immigrant populations, esp. those with refugee status
  - Spanish, Somali, Hmong, and Karen
- Marshall-Lyon County Library
  - Part of the (federated) Plum Creek Library System
  - Approx. 20 employees, mostly part-time, across 3 branches

Time for Questions

Helpful & Related Links

Community Webs - https://communitywebs.archive-it.org/
Community Webs curriculum - https://communitywebs.archive-it.org/curriculum.html
Internet Archive - https://archive.org/
Archive-It - https://archive-it.org/
Archive-It help files - https://support.archive-it.org/

East Baton Rouge Parish Library collections - https://archive-it.org/organizations/1153
New Brunswick Free Public Library collections - https://archive-it.org/organizations/1326
Marshall-Lyon County Library collections - https://archive-it.org/organizations/1327