Digital Archives Processing Manual Richard B. Russell Library for Political Research and Studies University of Georgia
Note: contact the digital archivist, Adriane Hanson ([email protected]) with questions or to get copies of templates that are not included in the manual. You are welcome to use any portion of this manual in your own institution. Please do indicate the source if you use a significant portion of it.

Purpose

The goal of the processing procedure for digital archives is to:
1. Arrange and describe the files to facilitate access.
2. Identify files that should not be kept at all or that should be restricted.
3. Produce the AIPs (Archival Information Packages) and DIPs (Dissemination Information Packages) as defined by OAIS.
4. Gather information about the file formats to support long term preservation.

Procedure Overview

1. Accession disks
2. Copy files to workspace
3. Survey and processing plan
4. Appraisal
5. Arrangement
6. Description
7. Restrictions
8. File Format Analysis

General Principles and Practices

• Use the same policies, procedures, and tools for papers and digital files where possible.
• Description is DACS minimum compliant.
• Related papers and digital files should be arranged and described together in the same subseries rather than putting digital files in their own series.
• All analysis and appraisal is done on the access copies only. The preservation copies (the original files extracted from media or received from the donor) should remain unchanged. [Note: currently considering how appraisal decisions can be implemented on the preservation copies]
• Apply archival principles of aggregate description, original order, and MPLP to provide access to these files in a timely manner.
• Work should focus on the highest folder level in a file structure, and may occasionally include one or two other levels of folder hierarchy. File level work that is not computer automated should only be undertaken in extraordinary cases.
• Currently retain all files in the original format until we can research the best file formats for preservation and access.

1. Accession Disks Found During Processing

Some digital media will have been removed from the collection during accessioning. Additional disks will likely be found during processing when boxes are examined more closely. Accession these according to the "Digital Archives Accessioning Manual".

2. Copy Files to Workspace on the SAN Server

Note: The SAN Server is maintained by Library Systems for use by all library departments. Russell Arrangement and Description has 2 TB allocated for preservation storage on the SAN, which is also currently used for workspace and for storing access copies. Only two Russell staff members and some Library Systems staff have access to this space. Library Systems backs up the space daily to a server located in another building on campus.

a. Use TeraCopy (see Appendix A) to copy all files in the collection folder in Preservation Storage to the workspace (Access Copy Storage).
b. Rename the folder to include _Access at the end of the folder name (example: RBRL_340_Access). This avoids any confusion when windows are open for both the preservation copies and the access copies at the same time.
c. Add _CLOSED to the end of the folder title while working so no access is provided (example: RBRL_340_Access_CLOSED). This should be removed once the files are open for research.
d. Delete the preservation documentation from the access copies: the manifest, preservation log, and Data Accessioner logs. Do keep the removal sheets.

3. Survey and Processing Plan

Use the "Digital Archives Processing Inventory" spreadsheet template for the survey. The name of the file should be "Creator last name_year_Digital_Archives_Processing_Inventory". Typically, the survey has one row in the spreadsheet per piece of digital media. A partial version of this spreadsheet, with information gathered during accessioning, may already exist in the preservation folder titled "For Processing". Fill in any missing information using the accession records, the manifest, and a sample of the files opened on the workspace. See Appendix B for common Excel commands used to analyze the manifest. When processing is complete, this survey should be saved to the collection folder on the G: Drive as a record of processing decisions.

Use the "Digital Archives Processing Plan" template to record all decisions in steps 4-8. The name of the file should be "Creator last name_year_Digital_Archives_Processing_Plan". When processing is complete, this plan should be saved in the collection folder on the G: Drive as a record of processing decisions.

4. Appraisal

Appraisal is typically done at the disk level or upper folder level. Analyze the survey and manifest, and view a sample of the files if needed, to identify files that do not have permanent research value, including published materials and materials outside of our collecting scope. Also separate software and other program files. Record decisions in the "Digital Archives Processing Inventory" and then delete the files from the Access Copies.

Duplicates

Remove disks or upper level folders that are entirely duplicated elsewhere. Individual files that are duplicated can be left in. To locate duplicate files, use Excel to identify duplicate checksums.

a. Select the checksum column by clicking on the letter of that column.
b. Under Home - Conditional Formatting - Highlight Cell Rules, select "Duplicate Values".
c. In the pop up box, select the formatting to use to show duplicate values and click "ok".
d. To see the duplicates, filter the checksum column for the color used to show duplicates.
e. To remove the color, go to Home - Conditional Formatting - Clear Rules.
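
The manual's method for finding duplicates is the Excel procedure above. For readers who prefer a script, the same check can be sketched in Python. This is only an illustration, not part of the Russell Library workflow: it assumes the manifest has been exported as a CSV file with columns named "path" and "checksum", and those column names, like the function names, are hypothetical.

```python
import csv
from collections import defaultdict


def load_manifest(csv_path):
    """Read a manifest exported as CSV into a list of row dictionaries."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def group_duplicates(rows, path_col="path", checksum_col="checksum"):
    """Return {checksum: [paths]} for every checksum that occurs more than once.

    The column names are assumptions about how the manifest was exported.
    """
    by_checksum = defaultdict(list)
    for row in rows:
        by_checksum[row[checksum_col]].append(row[path_col])
    # Keep only checksums shared by two or more files.
    return {c: paths for c, paths in by_checksum.items() if len(paths) > 1}


# Small in-memory example standing in for a real manifest.
sample_rows = [
    {"path": "Disk_001/speech.doc", "checksum": "aaa111"},
    {"path": "Disk_002/speech.doc", "checksum": "aaa111"},
    {"path": "Disk_002/photo.jpg", "checksum": "bbb222"},
]

for checksum, paths in group_duplicates(sample_rows).items():
    print(checksum, paths)
```

Grouping the paths by checksum makes it easier to see whether an entire disk or upper level folder is duplicated elsewhere, which is the case the manual says to remove; individual duplicated files can still be left in.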

Photograph Sampling

Sample photographs if there are a large number of photographs from a single event or if the photographs are not identified and are similar to each other. The typical sampling rate is 10%, keeping every 10th photograph. A higher sampling rate can be used if 10% would cause unique images to be missed. Preview the files as you highlight them for deletion in the file directory. If a photograph is unique, it may be substituted for the "10th" photo that would have been kept.

If no files are kept from a disk

• Delete the files from that disk from the accession folder in Preservation Storage.
• Note that the disk was discarded in the Archivists' Toolkit Resource Record and the "Digital Archives Processing Inventory".
• Remove the "Digital Media Removal Sheet" from the box of papers, if applicable.
• Securely destroy the disk by sending it to our shredding vendor.

If only a portion of the files from a disk is kept, just delete the separated files from the access copies.

5. Arrangement

Identify series and subseries

Digital files should be added to series and subseries established for paper records, if applicable, so all content about the same subject is grouped together. Add additional series and subseries to the arrangement as needed for subjects that are only present in the digital files.

Level of arrangement

For disks, generally arrange at the disk level. For external hard drives, generally arrange at the highest folder level of the file directory. If there are files that are not inside of folders at the highest level of the file directory, create a folder for them with square brackets around the folder title. Otherwise, don't reorganize the files.

Documentation

Record arrangement decisions in the "Digital Archives Processing Inventory" and the "Digital Archives Processing Plan".

Implement the arrangement

a. In the collection folder in Access Copy Storage, make a folder for each series that has digital files, including the series number (example: Series I. Constituent Services).
b. Within the series folder, make a folder for each subseries that has digital files, including the subseries letter (example: Subseries A. Issue Mail).
c. Use TeraCopy (see Appendix A) to copy the files from Access Copies to the appropriate series or subseries folder. After TeraCopy confirms the copying had no errors, delete the files from the original location on Access Copies.

6. Description

Digital files are described in our finding aids. For hybrid collections, description of digital files is integrated with the description of the papers. See Appendix C for how to create the inventory from a directory print. See Appendix D for how to format the description in Archivists' Toolkit.

Collection, series, and subseries description

a. Scope and content note: If some subjects are only present in the digital files, indicate that they are digital in the scope and content note (example: The digital files include his campaign website and research on abandoned sunken ships). If the subjects of the paper and digital files are the same, it is not necessary to describe the digital files separately.
b. Access note: use standardized text to indicate that digital files are present and instruct the researchers on how to request them.
c. Extent: total file size. If an entire series or subseries is digital, also include the file count.

Folder list

d. If there are also papers in the series or subseries, the folder titles of the digital files go at the end of the list for that series or subseries.
e. Typically only list the highest level of the file directory in the finding aid. Only include additional levels of the file directory if they significantly aid access.
f. If a folder title is generic (example: speeches), add a scope and content note to describe the main subjects in that folder.
g. If the majority of the file names within a folder are descriptive (terms you might search for as opposed to codes or generic terms), create an item-level inventory and link it to the finding aid (see Appendix E). Still include a scope and content note for the folder in the finding aid so that the finding aid is keyword searchable.
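
The extent figures (total file size and file count) and the top-level folder list described in this section could be gathered with a short script rather than by hand. The following is a minimal sketch under stated assumptions, not the procedure in Appendix C: the function name `folder_extents` is hypothetical, and it assumes any loose files at the top level have already been placed in a bracketed folder as described under "Level of arrangement".

```python
import os
from pathlib import Path


def folder_extents(root):
    """Return (folder name, file count, total bytes) for each top-level folder.

    Only the highest level of the directory is listed, matching the finding
    aid guidance; files in nested folders are counted toward their top-level
    folder.
    """
    results = []
    for entry in sorted(Path(root).iterdir()):
        if not entry.is_dir():
            # Assumption: loose top-level files were already foldered.
            continue
        count = 0
        size = 0
        for dirpath, _dirnames, filenames in os.walk(entry):
            for name in filenames:
                count += 1
                size += os.path.getsize(os.path.join(dirpath, name))
        results.append((entry.name, count, size))
    return results
```

Calling `folder_extents` on a subseries folder in Access Copy Storage would give one row per top-level folder. Summing the byte totals gives the extent for the subseries, and the folder names can serve as a starting point for the folder list.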