<<

Negotiating the Landscape of Born-Digital (Part 1)

Instructors: Stephen Fletcher & Patrick Cullom

FAQs:
• This live webcast will begin at 1:00 p.m. CT.
• Audio will be heard through your computer or tablet’s speakers.
• Submit questions for the presenters and moderator at any time by using the Q&A box.
• Introduce yourself in the Chat box!
• The presentation recording and slides will be made available to registrants. Request these materials by emailing [email protected].

©2017 Society of American Archivists

Stephen Fletcher
North Carolina Collection Photographic Archivist
Wilson Special Collections Library
University of North Carolina at Chapel Hill

Patrick Cullom
Visual Materials Processing Archivist
Wilson Special Collections Library
University of North Carolina at Chapel Hill

Two statements about born-digital photographs that concern me as a photographer and archivist.

“An image file is just like any other file.”
–A colleague

“Just read dpBestflow. It has all you need to know.”
–Another colleague

Understanding Born-Digital Photographs as digital objects for archival processing

Photographers modify and transform their image files

“The medium is the message.”

–Marshall McLuhan, 1964

What is a born-digital photograph?

“It might seem odd, but one of the greatest difficulties in the field of [digital imaging] is how to identify a [digital image].”
–Howard E. Burdock, Digital Imaging: Theory and Practice (1997)

Digital Graphic vs. Digital Image
A digital graphic exists only in a computer as a mathematical formula; a digital image has been imported from the real world.

Or more narrowly . . .
“A set of data that never exists as anything other than an array of .”
A “discrete array, usually two-dimensional, of pixels, the intensities of which are represented by numbers.”

Burdock, pages 3-4

Digital Graphic = Vector Graphic

draw circle
    center 0.5, 0.5
    radius 0.4
    fill-color yellow
    stroke-color black
    stroke-width 0.05
draw circle
    center 0.35, 0.4
    radius 0.05
    fill-color black
draw circle
    center 0.65, 0.4
    radius 0.05
    fill-color black
draw line
    start 0.3, 0.6
    end 0.7, 0.6
    stroke-color black
    stroke-width 0.1

Digital Image = Raster Image

Image Data: The Heart of Digital Photographs

Quantization
The conversion of a continuous / analog sensor into a discrete digital representation
–Michael W. Burke

“The First TV Image of Mars”

Mariner 4, 1965
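
To make the quantization definition above concrete, here is a minimal Python sketch. It assumes the analog sensor reading has already been normalized to a value between 0.0 and 1.0. The function names `quantize` and `dequantize` are illustrative only; they do not come from the slides or from any camera firmware.

```python
def quantize(value, bits):
    """Map a continuous reading in [0.0, 1.0] to an integer level in [0, 2**bits - 1]."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must be between 0.0 and 1.0")
    top_level = 2 ** bits - 1
    # Scale to the available levels, then round half up to the nearest level.
    return int(value * top_level + 0.5)


def dequantize(level, bits):
    """Map an integer level back to an approximate reading in [0.0, 1.0]."""
    return level / (2 ** bits - 1)


# Compare how closely 1-bit, 8-bit, and 14-bit encodings preserve one reading.
reading = 0.3
for bits in (1, 8, 14):
    level = quantize(reading, bits)
    print(bits, level, dequantize(level, bits))
```

With 1 bit only two levels exist, so most of the original reading is lost. With more bits the decoded value moves closer to the original, but it is never guaranteed to match it exactly.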

Encoding
"the process of putting a sequence of characters (letters, numbers, punctuation, and certain symbols) into a specialized format for efficient transmission or storage"

Decoding
"the conversion of an encoded format back into the original sequence of characters"

Digital image structure
Transforming continuous into discrete

Any Questions on what I just covered?

Photo Sensor
Photo sensors detect light intensity with little or no wavelength specificity, and therefore cannot separate color information.

Image Data

• Proprietary data directly from sensor
  • “raw”
• In-camera processing
  • usually JPEG
• Raw & JPEG
• Proprietary file extension, e.g., .nef for Nikon

Spatial resolution
camera sensor’s dimensions (W x H)

Common camera sensor sizes

Spatial resolution
• Camera sensor’s pixel dimensions (H x W)
• My Pentax K-1’s pixel dimensions are
  • 4926 x 7389 = 36,398,214 pixels
  • expressed as 36.4 megapixels (MP)

Depth resolution
number of bits used to hold a pixel’s value

Bit depth
the number of bits used to store a value

Bit Depth

• the number of possible values grows exponentially with the number of bits

• 1 bit per pixel = bi-tonal (black or white; 0 or 255)
• 8-bit = 256 tonal values per channel
  • JPEGs are 8-bit files
• Most cameras today are 14-bit (16,384 tonal values per channel)
• There are no true 16-bit cameras
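
The arithmetic behind the spatial-resolution and bit-depth slides can be checked with a few lines of Python. This is only a sketch: the helper names are made up for illustration, and the pixel dimensions are the ones quoted on the slide.

```python
def tonal_values(bits):
    """Number of distinct values that a given bit depth can represent."""
    return 2 ** bits


def pixel_count(height, width):
    """Total number of pixels for the given pixel dimensions."""
    return height * width


def megapixels(height, width):
    """Pixel count expressed in millions of pixels."""
    return pixel_count(height, width) / 1_000_000


# Each added bit doubles the number of tonal values.
for bits in (1, 8, 14, 16):
    print(f"{bits}-bit: {tonal_values(bits):,} tonal values per channel")

# Pixel dimensions quoted on the spatial-resolution slide.
print(f"{pixel_count(4926, 7389):,} pixels = {megapixels(4926, 7389):.1f} MP")
```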

Pixels per Inch (PPI) vs Dots per Inch (DPI)
Close, but not the same

PPI
• INPUT or OUTPUT measurement
• Digital sensor (input)
• Computer monitors (output)

DPI
• OUTPUT measurement
• Inkjet printers place drops of ink (i.e. dots) on paper
• Laser printers fuse toner dots

Color Filter Arrays (CFA) or Color Filter Mosaics (CFM)
• Colored filters placed over pixel sensors to capture color information
• Single filter
  • Bayer
• Multi-filter
  • Foveon

Bayer Color Filter Array
Developed by Bryce E. Bayer in 1974
Patent US 3971065 A
Filed March 5, 1975 by Eastman Kodak Company under the title “Color imaging array”
Published July 20, 1976

Bayer Color Filter Array
The reconstruction of a full color image from the incomplete color samples output from an image sensor overlaid with a color filter array.

Unrendered & Rendered
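
The reconstruction step described for the Bayer array (demosaicing) is what separates unrendered from rendered data. A rough Python sketch follows. It assumes an RGGB layout, with red at the top-left site, and fills in the two missing colors at each site by averaging the nearest samples of that color. This simple averaging is for illustration only; it is not the method of any particular camera or raw converter, and all function names are hypothetical.

```python
def bayer_color(row, col):
    """Color recorded at a sensor site, assuming an RGGB layout."""
    if row % 2 == 0:
        return "R" if col % 2 == 0 else "G"
    return "G" if col % 2 == 0 else "B"


def demosaic(mosaic):
    """Return rows of (R, G, B) tuples rebuilt from a one-value-per-pixel mosaic."""
    height, width = len(mosaic), len(mosaic[0])
    if height < 2 or width < 2:
        raise ValueError("need at least a 2 x 2 mosaic")
    image = []
    for r in range(height):
        out_row = []
        for c in range(width):
            sums = {"R": 0, "G": 0, "B": 0}
            counts = {"R": 0, "G": 0, "B": 0}
            # Visit the 3 x 3 neighborhood, clipped at the image edges.
            for rr in range(max(0, r - 1), min(height, r + 2)):
                for cc in range(max(0, c - 1), min(width, c + 2)):
                    color = bayer_color(rr, cc)
                    sums[color] += mosaic[rr][cc]
                    counts[color] += 1
            pixel = {k: sums[k] / counts[k] for k in "RGB"}
            # Keep the value the sensor actually measured at this site.
            pixel[bayer_color(r, c)] = mosaic[r][c]
            out_row.append((pixel["R"], pixel["G"], pixel["B"]))
        image.append(out_row)
    return image


def flat_scene_mosaic(height, width, rgb):
    """Simulate the sensor output for a scene of one uniform color."""
    value = dict(zip("RGB", rgb))
    return [[value[bayer_color(r, c)] for c in range(width)] for r in range(height)]


mosaic = flat_scene_mosaic(4, 4, (200, 100, 50))
print(mosaic[0])               # one number per site: the unrendered data
print(demosaic(mosaic)[1][1])  # three numbers per pixel: the rendered data
```

The point for processing is that the raw file holds one number per sensor site, while the rendered image holds three numbers per pixel, two of them estimated.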

Unrendered
• Raw
• Not usable by graphics editors (e.g., Photoshop) until rendered
  • e.g. Adobe Camera Raw

Rendered
• Has undergone demosaicing

The Body of Digital Photographs
File Formats
Structuring data

Image file formats
designed to store information specific to image files

Types of image file formats
• Unstructured formats
• Chunk-based formats
• Directory-based formats

NDIIPP
National Digital Information Infrastructure and Preservation Program

NDIIPP
"Sustainability of Digital Formats: Planning for Library of Congress Collections"

Prehistory of image file formats
File formats before TIFF

The need for standards

• A Standard Format for Digital Image Exchange
• Published for the American Association of Physicists in Medicine by the American Institute of Physics (11 March 1982), Report No. 10.
• "In 1978 the Science Council of the AAPM formed a Task Force to consider the problem of transferring digital image data between devices. It was apparent that there was a growing need to make such transfers for clinical, research and service reasons."
• "The Task Force believes that it is impractical, at the present time, to adopt a standard internal representation for digital image data acquired and processed by commercially available equipment."
• "The Task Force does, however, believe that it is possible to facilitate the transfer of data by defining a standard format for exchange purposes."

Enter TIFF

• TIFF: Tagged Image File Format

• Introduced by Aldus in autumn of 1986
• "Tagged" refers to file structure
  • Header
  • Followed by "chunks" of data called tags
  • 70+ types

TIFF
• The file header points to one or more Image File Directories (IFDs), which contain the image data and image information
• Fixed information fields (tags)
  • image dimensions, specification, etc.

. . . then GIF
• GIF: Graphics Interchange Format (1987)
• Utilizes the compression algorithm “Lempel–Ziv–Welch” (LZW)
• Limited to 256 colors or shades of gray in an image
• Was proprietary: introduced by CompuServe; all patents have now expired.

Exif: Exchangeable image file format

• Japan Electronic Industries Development Association (JEIDA) introduced Exif as an industry standard in October 1995

• Tag structure borrowed from TIFF

Exif: Exchangeable image file format

• Exif is a file format, BUT it’s more like a schema

• Cameras use Exif to record information about image format, size, resolution and color space, and optionally authoring information such as who made the image, when and where it was made, what camera model and photographic settings were used.

EXIF metadata revealed by ExifTool

[Terminal screenshot: ExifTool output for a Nikon D810 raw file, printed in several columns of tag : value pairs. Selected tags follow.]

ExifTool Version Number : 10.12
File Name : _DSC0963.NEF
File Size : 74 MB
File Type : NEF
File Type Extension : nef
MIME Type : image/x-nikon-nef
Exif Byte Order : Little-endian (Intel, II)
Make : NIKON CORPORATION
Camera Model Name : NIKON D810
Orientation : Horizontal (normal)
Software : Ver.1.10
Image Width : 7380
Image Height : 4928
Bits Per Sample : 14
Compression : Uncompressed
Photometric Interpretation : Color Filter Array
Samples Per Pixel : 1
Strip Byte Counts : 72737280
CFA Pattern : [Red,Green][Green,Blue]
Exposure Time : 1/80
F Number : 4.5
ISO : 100
Date/Time Original : 2016:03:01 12:14:47
Focal Length : 70.0 mm
Color Space : Adobe RGB
Lens ID : AF-S Zoom-Nikkor 24-70mm f/2.8G ED
Image Size : 7380x4928
Megapixels : 36.4

JPEG & JFIF

• Joint Photographic Experts Group (JPEG)

• committee formed to develop standard

• JPEG File Interchange Format (JFIF)

• developed 1991–1992

TIFF/EP

• Tag Image File Format / Electronic Photography (TIFF/EP) introduced in 2001

• Not the same as TIFF

• It's a subset of TIFF and JEITA Exif

• Used as a raw image format, which is minimally processed data from the image sensor

TIFF/EP

• This standard has not been adopted by most camera manufacturers

• However, TIFF/EP provided a basis for the raw image formats of a number of cameras.

• Adobe's DNG (Digital Negative) raw file format was based on TIFF/EP

• Several cameras use DNG as their raw file format

Camera Image File Format (CIFF)

• Raw image format designed by Canon, February 1997

• File extension is .crw

Exif + CIFF → DCF

• 1997: CIFF

• 1998: Exif Version 2.1

• Form the basis of the "Design rule for Camera File system" (1998)

DCF: Design rule for Camera File system
How cameras record image data

Design rule for Camera File system
• DCF is applicable to
  • products for writing image files on an interchangeable medium (removable memory) formatted with the DOS FAT file system
  • products for reading removable memory media
  • printing recorded on removable memory by reader products

Design rule for Camera File system
• Version 1.0 established December 1998
  • JEIDA Specification
• CURRENTLY JEITA specification (number CP-3461) which defines a file system for digital cameras
  • directory structure
  • file naming method
  • character set
  • file format
  • metadata format
• Currently the de facto industry standard for digital still cameras

DCIM: Digital Camera Images

• Digital cameras create a folder on the storage media named DCIM (Digital Camera IMages)

• Directory structure that can contain multiple subdirectories

Sample directory name
DCF directory name format
• 100ABCDE
• 100#####
Examples
• 100CANON
• 100NIKON
• 100_0304

DCIM Directory and File Structure example
• Root
  • DCIM (directory)
    • 100ABCDE (a DCF directory)
      • ABCD0001.JPG (a DCF basic file or DCF optional file)
      • ABCD0002.JPG
      • ABCD0003.TIF (a DCF extended image file)
      • ABCD0003.THM (a DCF thumbnail file for extended image file; it is not allowed for ".JPG" files)
      • ABCD0004.WAV (a DCF object need not include an image file)
      • ABCD0005.JPG
      • ABCD0005.WAV (a DCF object formed by naming non-image file with the same file number as an image file)
      • ...
      • ABCD9999.JPG
      • README.TXT (other file names and extensions may be assigned freely)

DCIM Folder on a memory card
DCIM Folder showing subfolders and image files

DCF and Adobe RGB Color Space

• DCF requires files in Adobe RGB color space to start with an underscore

DCIM Folder
note underscore in front of file names

Digital Still Cameras (DSC)
• Many image filenames begin with DSC
  • "_" can be a character
  • Example DSC_0014.jpg
• Some manufacturers use IMG
• Long list of examples
• Some cameras allow photographer to customize first four characters
  • Example SJF_0014.jpg

Filename prefixes
• "dcp#####.jpg" - Kodak, range of 0 to 4000
• "dsc#####.jpg" - Nikon, range of 0 to 4000
• "dscn####.jpg" - Nikon, range of 0 to 4000
• "mvc-###.jpg" - Mavica
• "dscf####.jpg" - Fuji Finepix
• "mvc#####.jpg" - Sony Mavica
• "pdrm####.jpg" - Toshiba PDR
• "IM######.jpg" - HP Photosmart
• "EX######.jpg" - HP Photosmart timelapse?
• "DSCI####.JPG" - Polaroid PDC2070
• "DC####S.jpg" - Kodak DC-40,50,120; S is (L)arge, (M)edium, (S)mall.
• "####.jpg" - Dimage
• "1##-####.jpg" - Canon 1TH-TH## (thousands, hundreds)
• "MMDD####.JPG" - QV3000 and QV4000n
• "1##-####_IMG.jpg" - Alternate Canon name
• "YYMDD###.JPG" - Casio QV7000 - M is hex
• "IMG_####.jpg" - Canon
• "IMGP####.JPG" - Pentax Optio S
• "_MG_####.jpg" - Canon raw conversion
• "PANA####.JPG" - video camera stills
• "IMG_YYYYMMDD_HHMMSS.JPG" - HTC Desire Z (AKA Tmobile G2)
• "P101####.jpg" - Olympus, Using default camera date of 101
• "Image(##).JPG" - Nokia 3650
• "PMDD####.jpg" - Olympus, M is in hex from 1 to c, DD is 01-31
• "IMG_###.jpg" - Some other camera
• "IMAG####.jpg" - RCA and Samsung
• "P#######.JPG" - Kodak DC290

So what?!

• Finding folders labeled DCIM is a good clue that you have found the in-camera originals

• What if nothing is inside a folder labeled DCIM?
  • Photographer likely moved the original in-camera files into a folder with a different name

And so . . .

• Filenames containing DSC or IMG (and other four-letter codes) are good clues that you have found the in-camera originals
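
These clues can be checked mechanically when surveying a donor's drive. The sketch below is a hedged illustration, not part of the DCF specification text. The two patterns are assumptions generalized from the slide examples: a directory name of three digits plus five characters, and a file name of four characters plus four digits plus a three-character extension. The function name `dcf_clues` is hypothetical.

```python
import re
from pathlib import PurePosixPath

# Assumed patterns, generalized from the examples on the slides.
DCF_DIRECTORY = re.compile(r"[1-9][0-9]{2}[0-9A-Z_]{5}")         # e.g. 100NIKON, 100_0304
DCF_FILENAME = re.compile(r"[0-9A-Z_]{4}[0-9]{4}\.[0-9A-Z]{3}")  # e.g. DSC_0014.JPG


def dcf_clues(path):
    """Report which DCF naming clues appear in a slash-separated file path."""
    parts = [part.upper() for part in PurePosixPath(path).parts]
    if not parts:
        raise ValueError("path must not be empty")
    folders, filename = parts[:-1], parts[-1]
    return {
        "in_dcim_folder": "DCIM" in folders,
        "dcf_directory": any(DCF_DIRECTORY.fullmatch(f) for f in folders),
        "dcf_filename": DCF_FILENAME.fullmatch(filename) is not None,
    }


for candidate in (
    "DCIM/100NIKON/DSC_0014.jpg",
    "Photos/2016/_DSC0963.NEF",
    "Pictures/StephenJFletcher_20160803_IMG_1234.DNG",
):
    print(candidate, dcf_clues(candidate))
```

A result with every clue false does not prove the file is a derivative; it only means the camera's naming no longer survives in the path.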

• HOWEVER . . .

• Photographers often amend or overwrite filenames during ingest using Digital Asset Management (DAM) software
  • IMG_1234.DNG → StephenJFletcher_20160803_IMG_1234.DNG
  • SJF_0014.jpg (example from earlier slide)

Don’t confuse DSC with .dsc
.dsc is a file extension used by Nikon Coolpix cameras, associated with a group of image files

The Embattled History of Embedded Photographic Metadata
It was more than just a TIFF

Camera manufacturer proprietary file formats
The Confusion of Tongues by Gustave Doré

Embedded Metadata
The Soul of Digital Photographs

Embedded Metadata

• Camera information and settings

• Date and time information

• Thumbnail for

• Exif version

• Et cetera

Raw
• “Raw” is not an image file format; it is a generalization for formats, open or proprietary, that encode all the image data captured by a camera's sensor.
• “Raw” refers to the “raw” data off the sensor

Some proprietary raw formats: cameras & scanners
• Arriflex D-20: .ari
• BAY (Casio)
• Camera Image File Format (CIFF) as used by Canon: .crw
• Canon RAW 2: .cr2
• CHDK raw: (Older-style CHDK RAW files)
• DNG (Digital Negative): .dng (Adobe)
• ERF
• Fujifilm RAF
• 3FR
• HDRiRAW
• Imacon 3F: .fff
• Kodak: .dcs, .dcr, .drf, .k25, .kdc
• MOS
• Leica: .raw, .rw2, .dng
• Logitech: .pxn
• Lytro: .lfp
• MEF
• Minolta MDC (Minolta RD-175)
• Minolta MRW
• Mitsubishi DJ-1000: .dat
• Nikon: .nef, .nrw, .ndf
• Nokia digital pictures: .nrw
• Olympus ORF
• Panasonic RAW/RW2: .raw, .rw2
• Pentax PEF
• Phase One: .cap, .iiq, .eip
• Porst
• Rawzor: .rwz
• RED digital pictures: .r3d
• Samsung SRW
• Sony ARW
• Sony SRF: .srf
• Sony SR2
• TIFF/EP (ISO 12234-2)
• X3F (Sigma / Foveon)
• http://fileformats.archiveteam.org/wiki/Cameras_and_Digital_Image_Sensors

File extensions may have multiple sources

• The file extension .srf is used for Sony raw image files.
• HOWEVER, animation software LightWave 3D uses .srf files to store information on how a 3D surface should appear, such as the color, transparency, and shading.
  • LightWave Surface files.
• AND Microsoft's Visual Studio software uses the .srf file extension for Server Response files (also known as a stencil).

International Press Telecommunications Council (IPTC)
“established in 1965 to safeguard the telecommunications interests of the world press.”
• In 1979 IPTC created a set of metadata attributes applied to images.
• The photograph, including caption and other information, was placed on a revolving drum and scanned, then transmitted over telephone lines.
http://www.photometadata.org/META-Resources-Metadata-History-Timeline

Information Interchange Model (IIM)

• Developed by IPTC in 1990.
• First multi-media news exchange format.
• Specifications for metadata fields.
• Never intended specifically for use with digital photographs.
• Adobe adopted the standard in 1995 when it chose 20 photo metadata fields from the IIM to be included in Photoshop.
• IIM schema was updated in 1999.

Information Interchange Model (IIM)
• In 1997 Adobe standardized use of the Image Resource Block (IRB) method of storing metadata, which adds different kinds of image data (including, but not limited to, metadata) to a digital picture.

Image Resource Blocks
• Image resource blocks are the basic building unit of several file formats, including Photoshop's native file format PSD, JPEG, and TIFF.
• Used to store non-pixel data associated with images
From Adobe Photoshop® File Format Specification

Adobe XMP
Extensible Metadata Platform: Successor to Image Resource Blocks

Extensible Metadata Platform (XMP)
• Introduced in 2001, successor to the Image Resource Blocks
• XMP is a method of writing metadata, not a metadata schema
• Adobe XMP represents the same types of metadata as IPTC, but uses Extensible Markup Language (XML) and Resource Description Framework (RDF)

Resource Description Framework (RDF)
• an infrastructure that enables the encoding, exchange and reuse of structured metadata.
• an application of XML that imposes needed structural constraints to provide unambiguous methods of expressing semantics
• First public draft in October 1997
• "provides a means for publishing both human-readable and machine-processable vocabularies designed to encourage the reuse and extension of metadata semantics among disparate information communities."

Extensible Metadata Platform (XMP)
• .xmp files are text files
• "defining feature" of XMP is the format of the text, not the file that holds it
• XMP data has wrappers around each tag to indicate the field to which it belongs
• XMP text can be stored in the file itself, in an XMP block, or in a "sidecar"

IPTC Core

• IPTC/Adobe collaboration
• Introduced in 2005
• Includes and defines most IIM fields previously adopted by Adobe
• New fields added, including new subject, scene and intellectual genre codes

TIFF is a container format
• Exif
• IPTC IIM
• IPTC in XMP
• User applied information and parametric editing instructions in XMP
• Image Data

Adobe DNG
The Digital Negative

Adobe DNG
• Specification 1.0 launched on September 27, 2004
• 1.1.0.0, published February 2005 to correct flaws

DNG: "Digital Negative"
• standardizes basic information structure
• processing instructions
• image metadata
• verification tools
• color profiles
• more

DNG: Three flavors
• In-camera DNG
• Converted DNG, raw
• Converted DNG, linear

DNG optional raw file in Pentax

Leica only uses DNG for raw files

Raw converter software

• Adobe DNG Converter
• Camera raw conversion in various software:
  • Lightroom
  • Aperture
  • Pro
  • AfterShot Pro
  • DXO
  • Iridient Developer
  • PhotoNinja
  • RAW Photo Processor

DNG versus raw: similarities
• Based upon TIFF/EP format
• Can store a raw image
• Can store a JPEG preview
• Can store metadata
  • Exif (camera-generated)
  • Encrypted (camera-generated) proprietary data

DNG versus raw: differences
• DNG has a published specification
  • Documentation allows safe manipulation of image by third-party software
  • Undocumented formats susceptible to damage from third-party software
• The DNG Specification has versions
  • DNG uses version compatibility tags to help manage incompatibility issues with older software
• DNG allows the attachment of processing instructions
• DNG can store nearly unlimited metadata

OpenRAW
"Digital Image Preservation Through Open Documentation"
• In 2005, an emerging problem of "proprietary RAW files" noted by Juergen Specht and members of his mailing list "D1scussion"
• More than 200 proprietary raw formats
• On 10 March started OpenRAW Mailing List
• Founded initiative called OpenRAW in April 2005
• Specht and Michael Reichmann (Luminous-Landscape.com founder) author article "The RAW Problem"
  • Translated into 19 languages
• Created OpenRAW website 25 April 2005
• January 31, 2006, the OpenRAW survey: 19,207 participants

Stock Artists Alliance's Metadata Manifesto
The Other SAA
• July 2006, with three guiding principles:
  • Metadata is essential to identify and track digital images.
  • Ownership metadata must never be removed.
  • Metadata must be written in formats that are understood by all.
• 2007: awarded a $100,000 partnership through Library of Congress
  • investigate industry practices and then develop a program of metadata education for photographers

First International Photo Metadata Conference

• June 2007, Florence, Italy
• http://phmdc.org/index2007post.php
• IPTC Photo Metadata White Paper 2007
  • “The goal of this White Paper is to devise a way for improving photo workflows with the help of consistent use of metadata . . .”

Metadata is essential to . . .
• identify and track digital images
• properly describe the content by natural or formal language, and in this way make it possible to easily search for photographs
• express technical characteristics of photographs in an interoperable way across technical systems
• express rights and licensing terms that pertain to a digital image, hence ownership metadata must never be removed.
• support a seamless photo workflow, so it must be written in formats that can be easily understood by all

Color Model
Turning colors into numbers using a mathematical formula.
• “an abstract mathematical model which simply describes the range of colors as tuples of numbers, typically as 3 or 4 values or color components”

Color Space
A specific implementation of a color model

Color Space
The Trinal Frontier

Color Space

Each color in the system is represented by a single dot. Often shown as a 2-D image but actually 3-D, represented as a wireframe
http://www.arcsoft.com/topics/photostudio- /what-is-color-space.html

Color spaces
• Red, Green, Blue (RGB)
• Cyan, Yellow, Magenta (CYM)
• Hue, Saturation, Intensity (HSI)
• luminance–chrominance (brightness/color)

RGB color spaces
• sRGB
• Adobe RGB (1998)
• ProPhoto RGB

Color Spaces
More than one color space
http://www.digital-photo-secrets.com/tip/2401/what-are-color-spaces-and-which-one-should-i-choose/

Color gamut
• Breadth of color
• Subset of colors which can be accurately represented in a given circumstance, such as within a given color space or by a certain output device.
• The same printer and ink will reproduce a different color gamut for different inkjet media (e.g. paper).
• "Color management is the controlled conversion between the color representations of various devices, such as image scanners, digital cameras, monitors, TV screens, film printers, computer printers, offset presses, and corresponding media." –Wikipedia

Color profile
a specific implementation of a color model

ICC Profiles
“a set of data that characterizes a color input or output device, or a color space, according to standards promulgated by the International Color Consortium”

Rendering intent
• Absolute colorimetric
• Relative colorimetric
• Perceptual
• Saturation

Why is this important?
Color management: the alchemy of

In-camera color space
• If shooting JPEG, typically sRGB or Adobe RGB
• If shooting raw format or .DNG, you assign during ingest or post-processing.
• NOT to be confused with WHITE BALANCE SETTINGS

Photographers’ Workflows

Workflow

• Life cycle informs workflow The five phases of image lifecycle

• Capture • Ingest • Working • Originals • Derivatives • Publish • Anything sent to someone else • Archive

—Dpbestflow.org Computational imaging: photographic techniques that require computer software • Focus stacking • Creating a single image from multiple image files that have varying planes of focus • High imaging (HDR) • Creating a single image from multiple image files that have varying exposures • CAN be in-camera • Keep all image files in order to document technique or recreate Bracketing: the basis of computational imaging • Exposure Bracketing • Flash Bracketing [Variety • Electronic strobes • Focus Bracketing presents • aka Depth-of-field Bracketing opportunity] • White Balance Bracketing • ISO Bracketing Photographic techniques of which you must be aware • Automatic Exposure Bracketing (AEB) [Automated • Automatic Flash Bracketing • Electronic strobes variety • White Balance Bracketing • Auto ISO Bracketing presents • Not common • Automatic Dual-Bracketing opportunity] • Combination of two; not common High Dynamic Range (HDR) Intentional exposure bracketing for blending

Gustave LeGray, Brig Upon the Water, 1855 Pixel shifting

• Camera specific • “Super Resolution” [Automated • Olympus E-M5 Mark II • 8 frames with half-pixel shifts and merges those frames in-camera into variety one higher resolution raw file • Pentax K-3 II and K-1 (more presents advanced, with motion detection) • Pentax embeds all four exposures into a single DNG or PEF raw file opportunity] Focus stacking

• Also known as
  • Focal plane merging
  • Z-stacking
  • (“zedification” in French)

Popular uses

Focus stacking
• Macrophotography
• Microscopy

High Dynamic Range
• Landscape

Raster-based image editors

Early Graphics Editing Software

• Aldus SuperPaint (1973)
• SuperPaint (1985)
• Image Studio (1987)
• First “electronic darkroom” software
• Color Studio
• Adobe Photoshop (1990: Mac only until 1992)

Destructive vs Nondestructive

"More than Save As"

Pixel-based editing

• Editing changes alter the values of individual pixels
  • in the case of cropping, eliminates the pixels

Early Graphics Editing Software

• Matisse
• Fauve Software, 1992
• First imaging software to use layers

Early Graphics Editing Software

• Live Picture (July 1994)
• HSC Software
• John Sculley, CEO
• File format is .
• Still has users today
• Non destructive layers
• 48-bit color
• At $3,995 it was targeted for high-end users

Early Graphics Editing Software

• Adobe Photoshop 3.0
• September 1994
• First version with layers
• possible to save multiple versions of the same image within a single file

Early Graphics Editing Software

• Adobe Photoshop 4.0
• November 1996
• First version with adjustment layers
• wraps up the source image with a set of instructions (or many sets of instructions) for rendering a photograph
• "self-referenced NDI"

Combo power: and DAM

• Aperture (Apple) 2005
• Lightroom (Adobe) 2007

Aperture options

Referenced
• Can add images from multiple locations or drives without duplicating files
• Saves disk space

Managed
• Software keeps master images organized in a managed library

Non-destructive Image Editing

Raster Image Editing
• Destructive
• Changes made to original image file alter the original
• Photographers should keep original as master, work on a copy or copies (derivatives)
• “Save As”
• That doesn’t mean they do/did

Parametric Image Editing
• Nondestructive
• Saves changes to file as a set of instructions or parameters
• Utilizes “reference files”
• Because instructions are saved, well-suited for DAM environments

Apple's 2015 Abandonment of Aperture

Aperture
• Does not allow export of renderings and metadata to DNG
• Aperture no longer being developed
• Import into Photos
  • Brings rendering settings but they cannot be altered

Lightroom
• Does allow export of renderings and metadata to DNG

Aperture: Masters and Versions

• Original image is a master
• Adjustments to a master are performed on versions
• Versions are instructions

All these factors determine the nature of a specific image file

Negotiating the Landscape of Digital Photographs
Patrick Cullom, Visual Materials Processing Archivist
The University of North Carolina at Chapel Hill

What Does Good Arrangement and Description Do For Photographic Collections?

PROCESSING
• Survey Materials
• Preservation
• Arrangement
• Description
  • Formats
  • Subjects
  • Unique identification
• Creation of Finding Aid

LEADS TO ACCESS
• Finding Aid
• Clearly Labeled Enclosures
• Logical Arrangement
• Unique Identifiers
• Access to Analog Materials
• Access to Digital (B-D & digitized*) Materials

Provide: CONTEXT…CONTEXT… CONTEXT

Photography = Constant Advances = Many Formats

[2.6.124-1-1 to 5, Clark, Joe], in the Hugh Morton Photographs and Films #P0081, 1964, North Carolina Collection, University of North Carolina at Chapel Hill Library.

"Hybrid" Analog/Born-Digital Collections?

Photographic Collections with multiple formats?

Traditionally, photographic formats have had a wide variety of:

Capabilities Purposes Limitations

AND issues regarding:

Access Duplication Preservation

Sound familiar?

Arrange and Describe (Analog vs. Digital)

Analog:
Step 1: Perform Conservation Assessment/Survey Materials
Step 2: Identify Formats and Estimate Rehousing Needs
Step 3: Rehouse
Step 4: Gather Contextual Information
Step 5: Arrange and Describe

Digital:
Step 1: Perform Conservation Assessment (Also done at or before point of capture)
Step 2: “Rehouse” (Transfer from original storage to permanent storage done when captured)
Step 3: Generate and Record Basic Technical Metadata
Step 4: Gather Contextual Information
Step 5: Arrange and Describe

DACS Statement of Principles (They stay the SAME)

• Records are unique and organic in nature
• Respect des Fonds (provenance and original order)
• Arrangement means identifying groupings (order can be imposed as with analog materials)

Digital Records Are Records!

Digital Records* (Fundamentally STILL a record…)
-Have Context
-Have Content
-Have Structure

Digital Records (Fundamentally STILL a record)
-Require basic preservation, arrangement, and description

Record: - 2. Data or information that has been fixed on some medium, that has content, context, and structure, and that is used as an extension of human memory or to demonstrate accountability (SAA Glossary: Record http://www2.archivists.org/glossary/terms/r/record#.V4_V_egrK70)

Some terms to be familiar with

Analog: Any physical print or film-based photographic format/technology

Born-digital: Materials originally created in a digital format (1s and 0s)

EAD-XML: Encoded Archival Description - Extensible Markup Language

DACS: Describing Archives: A Content Standard

PREMIS: Preservation Metadata: Implementation Strategies

FITS: File Information Toolset

AIP: Archival Information Package

SIP: Submission Information Package

METS: Metadata Encoding and Transmission Standard

MODS: Metadata Object Description Schema

OAIS: Open Archival Information System

Analog Materials Arrive
Many Different Formats

Born-Digital Materials Arrive
Born-Digital Materials Arrive in Different Containers…
Many Different Formats

Similar Issues (Analog)

Similar Issues (Digital)

Preserving & Transferring “Metadata” From Analog

Metadata: “Sidecar” File .. Circa 1997

Preserving & Transferring Metadata: Born-Digital
View in Admin (logged in as processor):
What users see:

FITS (File Information Toolset) - 1997

FITS records show technical information about files, including information compiled by “Exiftool,” “Droid,” “JHOVE,” and other tools.

FITS (File Information Toolset) - 2012

Sorting Analog Materials

Sorting Born-Digital Images

Creating New Enclosures For Analog Materials

Creating “New Enclosures” For Born-Digital Materials

Creating “Multipurpose” Metadata for Analog Materials

Creating “Multipurpose” Metadata for Born-Digital Materials

Presenting Related Images Together (Analog)

Presenting Related Images Together (Digital)

Basics of a “DIY Repository”
From SAA: Arrangement and Description of Electronic Records

• Preservation, access, and documentation folders
• Record checksums and basic PDI in an AIP
• Descriptive metadata recorded in archival catalog system
• File system storage with replication
• Unique ID links AIP folder and descriptive record
• Use web access systems where possible
• Store files consistently to prep for deposit to “real” repository system

FOUR Types of Metadata Needed to Ensure Authenticity of Born-Digital Files
(From Arrangement and Description of Electronic Records)

Descriptive (MODS/EAD/SIP)
• Keywords
• Filenames
• “Sidecars”

Structural (AIP-METS/Structure Browse/SIP)
(Folders and files)
• How are files related to each other or files outside of the archival packet (related files)

Administrative (PREMIS)
• Provenance
• Custodial history
• Rights
• Date of Ingest, Who Ingested

Technical (FITS/SIP)
• Documents formats and includes record of changes to materials

Submission Information Package (SIP):

The Submission Information Package (SIP) is the content and metadata received from an information producer or by a preservation repository. An Information Package that is delivered by the Producer to the OAIS for use in the construction of one or more AIPs. (OAIS Reference Model)

Uses METS to encode this information. A standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library, expressed using XML.

(Carolina Digital Repository Help: Glossary: https://blogs.lib.unc.edu/cdr/index.php/about/cdr-development-and-collab/technical-documentation/general/glossary/ )

Creating New MODS

MODS

Metadata Object Description Schema (MODS) is a metadata standard developed by the Library of Congress. It is compatible with METS and can be nested within the SIP to include descriptive information at both the object and aggregate object levels.

(Carolina Digital Repository Help: Glossary: https://blogs.lib.unc.edu/cdr/index.php/about/cdr-development-and-collab/technical-documentation/general/glossary/ )

METS for Digital

METS

Metadata Encoding and Transmission Standard (METS) is a metadata standard developed by the Library of Congress used to encode descriptive, administrative, and structural metadata regarding objects within a digital library, expressed using XML. (METS Wikipedia entry)

(Carolina Digital Repository Help: Glossary: https://blogs.lib.unc.edu/cdr/index.php/about/cdr-development-and-collab/technical-documentation/general/glossary/ )

Similar to (EAD-XML) for Analog and Digital

Archival Information Package (AIP)

Can be created at collection AND container levels

An Archival Information Package (AIP) is the set of content and metadata managed by a preservation repository, and organized in a way that allows the repository to perform preservation services. In addition to the data files, the AIP contains metadata that describes the structure, content, and meaning of the data files. The data files and metadata are packaged (encapsulated) either logically or physically as an entity. AIPs are used to transmit and/or store archival objects within a digital repository system.

(Carolina Digital Repository Help: Glossary: https://blogs.lib.unc.edu/cdr/index.php/about/cdr-development-and-collab/technical-documentation/general/glossary/ )

Preservation Metadata Implementation Strategies (PREMIS) Overview

PREMIS Overview

Usually refers to the Data Dictionary and not the XML schema. The Data Dictionary defines a core set of semantic units that repositories should know in order to perform their preservation functions. This generally includes actions to ensure that digital objects remain viable and renderable, as well as to ensure that digital objects in the repository are not inadvertently altered and that legitimate changes to objects are documented. (PREMIS Overview; Understanding PREMIS publication)

Example: University of North Carolina at Chapel Hill: Carolina Digital Repository (CDR)

Standards:
METS (SIP transmission)
MODS (Descriptive schema within CDR)
PREMIS (Document preservation events)

Includes:

Basic Workflow

Curators

Archival Processors

Library Information Technology

Carolina Digital Repository (CDR) Workflow

Transfer files from original media (Bagger - Curator / Electronic Records Archivist)

Ingest through the CDR Admin interface (CDR - Curator / Electronic Records Archivist)

Preliminary survey (CDR – Curator)

Arrange files and set access controls (CDR / Archival Processor)

Describe files (CDR / Processor)

Quality control (CDR / Archival Processor)

Safe to delete “original” files (CDR / Archival Processor / CDR Staff)

Create (or add to existing) finding aid (CDR / Oxygen – Archival Processor)

Retrieve Files From Original Media
Environment: Bagger
Responsibility: Curator / Electronic Records Archivist

Curators have been instructed on how to capture and ingest born-digital materials acquired into the Carolina Digital Repository using Bagger. The Electronic Records Archivist deals with systematic large-scale deposits (like University Records), transferring “obsolete” formats, and managing the “digital storage” across collections.

Bagger:
Creates a “payload” and “tags” for selected data
Serves as the Submission Information Package (SIP)
Creates baseline checksum

Packaging format for storage and transfer of arbitrary digital content. A "bag" has just enough structure to enclose descriptive "tags" and a "payload" but does not require knowledge of the payload's internal semantics.

Ingest Through the CDR Admin Interface
Environment: CDR
Responsibility: Curator / Electronic Records Archivist

Curators / Electronic Records Archivist take the data created by Bagger and ingest it into the CDR using the admin interface.

CDR Admin Interface:

Universally Unique Identifiers (UUIDs) are assigned and recorded
Archival Information Package(s) (AIP) generated
Base checksum originated

What Happens at ingest…(A lot)

Can take a while

• Submission Preparation Tool (Bagger)
• Submission Service (CDR Admin Webapp)
• Validation (Persistence Module)
• Transformation to Ingest Batch (Persistence Module)
• Routine Pre-processing of Ingest Batch (Persistence Module)
• Queues Ingest Batch with Fedora Ingest Service
• Fedora Ingest Service
• First Come, First Serve Batch Ingest
• Handling of Diverse SOAP Faults, Protocol and Service Exceptions
• Verification of Each Ingested Object, Including Fixity
• Container Updates (Persistence Module in Services Webapp)
• Sending a JMS Message When Done
• Emailing Submitter When Done
• Fedora Ingest Sequence
• Replication in iRODS
• Objects and Datastreams Copied to Redundant Storage Systems
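Three ideas from the ingest steps above (assigning a UUID, verifying fixity, and copying objects to redundant storage) can be sketched in a few lines of Python. This is an illustration only: it does not use Fedora, iRODS, or any CDR code, and the choice of SHA-256, the function names, the folder names, and the sample filename are all assumptions made for the example.

```python
import hashlib
import shutil
import tempfile
import uuid
from pathlib import Path

def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def ingest(source, storage_dirs):
    """Copy one file to every storage directory and verify each copy."""
    object_id = str(uuid.uuid4())           # identifier assigned at ingest
    baseline = sha256_of(source)            # baseline checksum
    copies = []
    for storage in storage_dirs:
        target = Path(storage) / object_id / Path(source).name
        target.parent.mkdir(parents=True)
        shutil.copy2(source, target)
        if sha256_of(target) != baseline:   # fixity check on the new copy
            raise ValueError(f"fixity failure for {target}")
        copies.append(target)
    return {"uuid": object_id, "sha256": baseline, "copies": copies}

# Temporary directories stand in for primary and replica storage.
with tempfile.TemporaryDirectory() as work:
    work = Path(work)
    source = work / "P0081_0001.tif"        # hypothetical filename
    source.write_bytes(b"example image bytes")
    primary, replica = work / "primary", work / "replica"
    primary.mkdir()
    replica.mkdir()
    record = ingest(source, [primary, replica])
    all_match = all(sha256_of(c) == record["sha256"] for c in record["copies"])
```

Storing the checksum alongside the UUID is what lets a later fixity check confirm that neither copy has changed since ingest.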

More Detail about Ingest at CDR: http://blogs.lib.unc.edu/cdr/index.php/about/cdr-development-and-collab/technical-documentation/architecture-and-software/ingest/ingest-overview/

Preliminary Survey
Environment: CDR
Responsibility: Curator

The curator looks over the materials, ensures they are what was expected/promised, gets an idea of what is there, and adds to the accession record that is consulted later by the archival processor.

CDR admin interface:

Provides tools to access and analyze materials and to make some decisions about what will stay and what will go.

Arrange Files and Set Access Controls
Environment: CDR
Responsibility: Archival Processor

Archival processors have the ability in the CDR admin interface to add collections and folders; move files; and set access controls for collections, folders, and individual files.

Quality Control
Environment: CDR
Responsibility: Archival Processor

Look at materials in “public” view and make sure they look right.

Safe to Delete “Original” Files
Environment: CDR
Responsibility: Archival Processor / CDR / CDR Staff

While arranging and describing, archival processors have the ability to mark materials for deletion. They are not actually deleted until the processor indicates that arrangement and description have finished.

Describe Files
Environment: CDR
Responsibility: Archival Processor

Archival processors create/edit Metadata Object Description Schema (MODS) records; typically done at folder level and include:

Title
Identifier (Corresponding unique identifier in finding aid)
Related item / location (Link to finding aid)
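A folder-level MODS record with the three elements listed above might look like the output of the following Python sketch. The element names (`titleInfo`, `title`, `identifier`, `relatedItem`, `location`, `url`) and the namespace come from the MODS schema; the function name, the `type="local"` attribute, and all sample values are assumptions for illustration, not the CDR's actual MODS profile.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)

def folder_mods(title, identifier, finding_aid_url):
    """Build a minimal MODS record for one folder."""
    def tag(name):
        return f"{{{MODS_NS}}}{name}"

    mods = ET.Element(tag("mods"))
    title_info = ET.SubElement(mods, tag("titleInfo"))
    ET.SubElement(title_info, tag("title")).text = title
    # Identifier that matches the unique identifier used in the finding aid.
    ident = ET.SubElement(mods, tag("identifier"), {"type": "local"})
    ident.text = identifier
    # Related item / location: link back to the finding aid.
    related = ET.SubElement(mods, tag("relatedItem"))
    location = ET.SubElement(related, tag("location"))
    ET.SubElement(location, tag("url")).text = finding_aid_url
    return mods

record = folder_mods(
    title="Example folder title",
    identifier="P0000_0001",                                   # hypothetical
    finding_aid_url="https://example.org/finding-aid/P0000",   # hypothetical
)
xml_text = ET.tostring(record, encoding="unicode")
```

Printing `xml_text` shows the record as XML, which is the form a METS-based SIP would wrap.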

Preservation Metadata (PREMIS - Events) updated (Folder level only)

Create (or Add to Existing) Finding Aid
Environment: CDR / Oxygen
Responsibility: Archival Processor

Archival processors export a .csv report from the collection they are working with. The .csv file contains the metadata for the objects in the collection.

Object type (folder, file)
Deleted (yes/no)
UUID
Date added/updated
Title (original or updated)
File type
Checksum
Path
File size
Label (original name upon ingest)
Number of children
Depth

Standards Used in the Carolina Digital Repository

These are the standards in use by the CDR.

Unique Identifiers: UUID-based PIDs
Submission Information Packages (SIPs): METS, BagIt File Packaging Format
Descriptive Metadata: MODS
Preservation Metadata: PREMIS
Metadata Harvesting: OAI-PMH
Repository Data Modeling: Fedora Digital Object Model (Fedora 3.8), Resource Description Framework (RDF)
Fedora : REST and SOAP APIs (Fedora 3.8)
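For readers who have not opened a bag, the BagIt File Packaging Format listed above boils down to a `data/` payload folder, a `bagit.txt` declaration, and a checksum manifest. The Python sketch below writes and checks such a bag by hand; it is not the Bagger application, it covers only part of what a full BagIt validator checks, and the function names, bag name, and payload are hypothetical. The version string follows the published BagIt 1.0 specification, which may differ from the version Bagger wrote at the time of this webcast.

```python
import hashlib
import tempfile
from pathlib import Path

def make_bag(bag_dir, payload):
    """Write a minimal bag: bagit.txt, data/ payload, and a SHA-256 manifest."""
    bag_dir = Path(bag_dir)
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True)
    manifest_lines = []
    for name, content in sorted(payload.items()):
        (data_dir / name).write_bytes(content)
        checksum = hashlib.sha256(content).hexdigest()
        manifest_lines.append(f"{checksum}  data/{name}")
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    (bag_dir / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8"
    )
    return bag_dir

def validate_bag(bag_dir):
    """Return payload paths whose current checksum differs from the manifest."""
    bag_dir = Path(bag_dir)
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    failures = []
    for line in manifest.splitlines():
        expected, rel_path = line.split(maxsplit=1)
        actual = hashlib.sha256((bag_dir / rel_path).read_bytes()).hexdigest()
        if actual != expected:
            failures.append(rel_path)
    return failures

with tempfile.TemporaryDirectory() as tmp:
    bag = make_bag(Path(tmp) / "example_bag", {"photo_0001.dng": b"raw bytes"})
    clean = validate_bag(bag)                    # nothing has changed yet
    (bag / "data" / "photo_0001.dng").write_bytes(b"altered")
    damaged = validate_bag(bag)                  # the altered file is reported
```

The manifest is the "baseline checksum" in practice: any later change to a payload file shows up as a mismatch when the bag is checked again.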

Learn more about standards here: http://blogs.lib.unc.edu/cdr/index.php/about/cdr-development-and-collab/technical-documentation/general/standards-used/

Resources:

All images depicting processing or archival materials were taken by Patrick Cullom while processing collections 2015-2017 at The University of North Carolina at Chapel Hill, Wilson Libraries Special Collections.

All screenshots depicting finding aids or the Carolina Digital Repository are from the following collections at UNC-Chapel Hill:

P0011: Bayard Morgan Wootten Photographic Collection
P0081: Morton Photographs and Films
P0105: Durham Herald Co. Newspaper Photograph Collection
P0109: North Carolina Award Collection
SHC-05441: John Kenyon Chapman Papers
SFC-20479: D. Kent Thompson and Sue Meyer Thompson Collection
SFC-20239: Ronald D. Cohen Collection
SFC-20491: Souls Grown Deep Foundation Collection
SFC-20367: William R. Ferris Collection
SHC-20533: Russell Glenwood Baldwin Papers

About the Carolina Digital Repository: http://blogs.lib.unc.edu/cdr/

Carolina Digital Repository Technology Overview (includes how Archival Processors interact with materials): http://blogs.lib.unc.edu/cdr/index.php/about/cdr-development-and-collab/technology-overview/

Resources (Continued):

Further Reading:

Photographs: Archival Care and Management, by Mary Lynn Ritzenthaler and Diane Vogt-O'Connor with Helena Zinkham, Brett Carnell, and Kit Peterson. Chicago: Society of American Archivists, 2006.

Archival Arrangement and Description, edited with an introduction by Christopher J. Prom & Thomas J. Frusciano. Chicago: Society of American Archivists, 2013.

Questions?

Stephen Fletcher
North Carolina Collection Photographic Archivist
Wilson Special Collections Library
University of North Carolina at Chapel Hill

Patrick Cullom
Visual Materials Processing Archivist
Wilson Special Collections Library
University of North Carolina at Chapel Hill