A Collaboration Between And
American Archive of Public Broadcasting Technical Specifications
Last Modified: June 6, 2016

This document outlines the preferred and acceptable specifications for digital file contributions from donors to the American Archive of Public Broadcasting (AAPB). It covers file format specifications, metadata, and delivery. Our vendor specifications are somewhat different. If you are working with a vendor to digitize your collection, please contact us for our vendor specifications.

Our goal is to make the process of contributing to the AAPB as simple as possible, and we are happy to work with you to answer questions and arrange your submission according to your needs and available resources. If you have any questions about these specifications, please contact Casey Davis, AAPB Project Manager, at [email protected].

1. Media Files

We would prefer that donors deliver both preservation-quality files and access-quality files. If that is not possible, we can accept the original files and make the necessary conversions on our end.

a. Video preservation file

Preferred: 10-bit JPEG2000 reversible 5/3 in a .MXF Op1a wrapper with all audio channels captured and encoded (see below for details)
Acceptable: Original file format

Video preservation file specification details

Image essence coding: 10-bit JPEG2000 reversible 5/3 (aka “mathematically lossless”)
Interlace frame coding: 2 fields per frame, 1 KLV per frame
JPEG2000 tile: single tile
Color space: YCbCr (if the source is analog NTSC (YIQ), PAL or SECAM (YUV), it shall be converted to YPbPr for digitization, which converts to YCbCr in digital)
Video color channel bit depth: 10 bits per channel
Native raster: the archive file shall match the analog original, which maps to 486 lines x 720 pixels for 525-line (NTSC) sourced material, and 576 lines x 720 pixels for 625-line (PAL & SECAM) sourced material.
Aspect ratio: AFD (Active Format Description) values shall be provided. 4:3 material shall use the AFD 4:3 code; 16:9 material shall use the 16:9 code.
Native frame rate: the frame rate of the original shall be preserved in the file with no conversion (29.97 shall remain 29.97, 25 shall remain 25, etc.)
Native color space: If the material is analog sourced, YIQ (NTSC) and YUV (PAL & SECAM) shall be converted to YPbPr before digitization. YPbPr material shall be maintained. RGB analog material shall be maintained as RGB.

b. Audio preservation file

Preferred: BWF (Broadcast WAV) RF64 format (see below for details)
Acceptable: Original file format

Preservation file specification details (audio)

PCM coding, BWF (Broadcast WAV) RF64 format
48 kHz, 24-bit sampling

c. Video proxy file

Video Codec: h.264/AVC
Codec ID: avc1
Alternate Name: Advanced Video Coding
Format profile: [email protected]
Format settings, GOP: M=1, N=30
Bit rate: 711 kbps
Width: 480 pixels
Height: 360 pixels
Display aspect ratio: 4:3
Color: YUV, 4:2:0, 8 bits
Scan type: Progressive

Audio Codec: AAC 48.0 kHz / 128 kbps
Codec ID: 40
Other Name: Advanced Audio Coding
Format profile: LC
Channel(s): 2 channels

Wrapper: MPEG-4 (.mp4) wrapper

d. Audio proxy file

192 kbps MPEG-1 Audio Layer 3 (48 kHz / 16 bits)
Codec ID: 0x55
Channels: 2
Wrapper: mp3

2. Delivery method

Ideally, preservation files would be delivered to the Library of Congress and access files would be delivered separately to WGBH. However, we are open to simplifying the process when possible, such as having the donor deliver only one copy of the files (preservation and proxy, or preservation only) to WGBH, which WGBH would then process and deliver on to the Library of Congress.

a. to the Library of Congress

Preferred: LTO tape, TAR formatted. Provide blocking factor (over 1024 preferred).
Acceptable: USB3 drive, exFAT, NTFS, ext3 or ext4 formatted

If possible, all files should be written to the chosen media using the BagIt specification.
http://blogs.loc.gov/digitalpreservation/2012/01/from-there-to-here-from-here-to-there-digital-content-is-everywhere/
While this method is preferred, we can accept files written to the USB3 drive(s) in a single directory when received directly from the donor.

b. to WGBH

WGBH can provide USB3 drive(s) to the donor. The donor would then place files into a single directory.

3. Metadata Spreadsheet

Prior to receipt of the files, the donor should provide to WGBH a spreadsheet that includes as much descriptive and technical information about the files as possible. The spreadsheet should include, at the very least, the unique identifier (file name), title, date (if known), format, and duration. A template spreadsheet can be downloaded here:
https://s3.amazonaws.com/americanarchive.org/resources/pbcore_excel_template.xls

4. Intellectual Unit List

Upon delivery of files, the donor or vendor will need to provide a complete list of filenames. We call this the "Intellectual Unit List." Please provide this list via email when you ship the files. The list of file names should be in CSV format, semicolon delimited, and must have at least one unique key (index) field. The content of the list depends on the chosen media.

For delivery on hard drive: A list of filenames and the name of the hard drive. If you are not using the BagIt specification, please also include the md5 checksums (if available).
For delivery on LTO tape: A comprehensive list of each GUID on the tape with tape label and filemark (the location of the file on the tape). If you plan to deliver your files to the Library on LTO tape, please contact us to discuss this requirement.

5. File naming conventions

The only requirements are that file names are unique, are included in the metadata spreadsheet delivered to WGBH, and do not include any spaces or special characters other than underscores and hyphens.

6. Checksums

A checksum is used to verify that the files were copied to the storage media without any errors or loss of data. AAPB would like to receive md5 checksums when possible. If you are providing checksums, we would prefer that you include them both on the hard drive and in the Intellectual Unit List. At the very least, checksums should be provided in the Intellectual Unit List.

7. Transcripts

If you have transcripts of the material you are contributing to the AAPB, we would love to have copies. We would prefer time-stamped transcripts in .txt, JSON, XML, SRT or WEBVTT, but if you don't have these formats, plain text without timestamps or PDFs would also be useful.

Please send copies of the transcripts to WGBH on the hard drive using the same file name as the video/audio files with "transcript" appended to the filename. The transcripts can be placed in the same directory as the files or in a separate "Transcripts" folder.

8. Contracts/Releases

If you have production contracts or appearance releases for the material you are contributing to the AAPB, we would love to have copies to help us determine what we can make available online. Please send copies to WGBH on the hard drive using the same file name as the video/audio files themselves with "_contract" appended to the filename. The contracts can be placed in the same directory as the files or in a separate "Contracts" folder.

9. Summary

Type of file             Required
Preservation file        WGBH and the Library
Proxy file               WGBH only
Original file            WGBH and the Library, if no preservation file is available
Transcripts              To WGBH, if available
Contracts/Releases       To WGBH, if available
Checksum                 Strongly preferred (md5) by both WGBH and the Library
QC Report                To WGBH and the Library, if available
Technical metadata       To WGBH and the Library, if available
Preservation metadata    To WGBH and the Library, if available
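
As an illustration of sections 4 through 6 (Intellectual Unit List, file naming, and checksums), the following is a minimal Python sketch of how a donor might prepare a hard-drive delivery. It is not an official AAPB tool. The function names, the column names (filename, drive_name, md5), and the example drive name are hypothetical, and allowing dots in file names (for extensions) is an assumption, since the specification explicitly names only underscores and hyphens.

```python
import csv
import hashlib
import re
from pathlib import Path

# Letters, digits, underscores, hyphens, plus dots for the extension.
# Allowing dots is an assumption; the specification only names
# underscores and hyphens explicitly.
SAFE_NAME = re.compile(r"^[A-Za-z0-9_.-]+$")


def is_safe_name(name: str) -> bool:
    """Return True if the name has no spaces or other special characters."""
    return bool(SAFE_NAME.match(name))


def md5_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Compute an md5 checksum in chunks so large media files are not loaded whole."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_unit_list(directory: Path, drive_name: str, out_csv: Path) -> list[str]:
    """Write a semicolon-delimited file list; return names that break the naming rule.

    File names within one directory are unique, so the filename column
    serves as the unique key field. Column names are hypothetical.
    """
    rejected = []
    files = sorted(p for p in directory.iterdir() if p.is_file())
    with out_csv.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter=";")
        writer.writerow(["filename", "drive_name", "md5"])
        for path in files:
            if not is_safe_name(path.name):
                rejected.append(path.name)  # rename these before delivery
                continue
            writer.writerow([path.name, drive_name, md5_of(path)])
    return rejected
```

A call such as write_unit_list(Path("/media/drive/files"), "AAPB_DRIVE_01", Path("unit_list.csv")) would write the list and return any file names that need renaming. Writing the CSV outside the media directory keeps the list itself out of its own inventory.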