iPRES 2018: The 15th International Conference on Digital Preservation
September 24th, 2018, 9am-12:30pm, Room 216, the Joseph B Martin Conference Centre
PRONOM in Practice: Creating File Format/System Signatures for Submission to PRONOM Technical Registry
David Clipsham, Nick Krabbenhoeft, Shira Peltzman, Justin Simpson, & Carl Wilson
INTRODUCTIONS
Facilitators
● David Clipsham - Digital Archives Systems Manager, PRONOM Lead, National Archives, UK
● Nick Krabbenhoeft - Head of Digital Preservation, New York Public Library
● Shira Peltzman - Digital Archivist, UCLA Library
● Justin Simpson - Archivematica Technical Director, Artefactual Systems, Inc
● Carl Wilson - Technical Lead, Open Preservation Foundation
Agenda
9:15 - 9:35 am: Introduction to file format signatures
How are file formats identified, overview of PRONOM, case studies
9:35 - 10:35 am: Signature development process
Reading bytestreams (why to do it, how to do it), creating signatures
[break]
10:50 - 11:45 am: Signature development process (cont’d) & case studies
Testing signatures, submitting to PRONOM
[break]
12:00 - 12:30 pm: Advanced signature development & open signature development workshop
Container signatures, finding samples, troubleshoot existing signatures
Introduction to file format signatures
Why file format signatures?
Relevancy to digital preservation
● File format identification enables us to know what we’re dealing with
○ This happens early on in most workflows
○ The outcome of this process impacts downstream decision-making around activities like normalization for preservation and access
● File format identification tools are only as good as the file format signatures that have been developed by the community
○ The lack of a file format signature means that file identification cannot meaningfully take place
○ Executing tasks that should be straightforward, like disk image extraction and characterization, is sometimes difficult if not altogether impossible
PRONOM
Image from Flickr via kevandotorg
http://www.nationalarchives.gov.uk/PRONOM/Default.aspx
File format registry for digital preservation planning. Developed in 2001 to meet the National Archives’ digital record preservation planning needs.
● File format research: always ongoing, National Archives research guided primarily by UK Government needs. External contribution always welcome and encouraged
● File format identification signatures (for DROID originally)
● 1670 entries, aka PUIDs - PRONOM Unique Identifiers
● Format extensions, MIME/Media types, links to documentation
PRONOM Timeline
● 2001: Original internal version
● 2004: Opened up as externally browsable resource, aka PRONOM 3
● 2005: DROID launched alongside PRONOM 4; PUIDs introduced
● Ongoing: Continual PRONOM research and signature development
PRONOM Growth
PRONOM Contributors
File Format ID
PRONOM identification mechanisms
● Extension (.doc, .exe, .jpg)
● File format signature
○ Binary pattern matching
○ Created from elements of internal structure
○ May be simple ‘magic numbers’ - “CAFEBABE” for Java Class File
■ http://blog.nationalarchives.gov.uk/blog/cafed00ds-and-cafebabes/
○ May consist of complex patterns of variations, gaps and alternative values.
○ Driven by file format specification where possible
● Container signatures - formats made up of small files contained within a ‘ZIP’ or ‘OLE2’ wrapper (.doc, .xlsx, .odt, .epub)
○ http://openpreservation.org/blog/2016/01/07/droid-container-signature-files-what-they-are-and-how-to-create-them-a-template-and-an-example-or-few/
DROID Pattern Matching
● Scans internal file byte code
● Compares against known signatures in signature file
● Returns a Hit! where it gets a match
● We’re aiming for certainty – there should be an extremely low chance that a file could be of a different type to the format that DROID identifies
● So, signature needs to be strong enough, but doesn’t need to encode all of the characteristics of a format
Magic Numbers (AKA Signatures)
● A specified sequence of characters/bytes that must be present
● Usually at the start of the file (not always)
● Explicitly stated within the format specification:
● Java Class file – https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-4.html Hex 0xCAFEBABE
● PNG - https://www.w3.org/TR/PNG-Structure.html Hex 0x89, then ASCII “PNG”, then hex 0x0D0A1A0A
● Photoshop PSD - https://www.adobe.com/devnet-apps/photoshop/fileformatashtml ASCII “8BPS”
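As a rough illustration of how a magic number check works, the sketch below compares the start of a byte string with the three magic numbers listed above. The dictionary layout and the function name `identify_by_magic` are assumptions made for this example, the PNG value is written as the byte 0x89 followed by ASCII "PNG" and 0x0D0A1A0A, and this is not how DROID is implemented.

```python
from typing import List

# Magic numbers taken from the slide above; the data structure is illustrative.
MAGIC_NUMBERS = {
    "Java Class file": bytes.fromhex("CAFEBABE"),
    "PNG": bytes.fromhex("89") + b"PNG" + bytes.fromhex("0D0A1A0A"),
    "Photoshop PSD": b"8BPS",
}


def identify_by_magic(data: bytes) -> List[str]:
    """Return the names of all formats whose magic number starts `data`."""
    return [name for name, magic in MAGIC_NUMBERS.items() if data.startswith(magic)]


# Synthetic byte strings used only to exercise the check.
print(identify_by_magic(bytes.fromhex("CAFEBABE00000034")))
print(identify_by_magic(b"plain text"))
```

A real signature usually needs more than a prefix comparison, which is where the wildcard and offset syntax introduced later comes in.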
Inferred Signatures
● Sometimes formats may not have clearly defined signatures, but may have characteristics that must be present. This can be a good hook for a signature. This can get really complex!
● Gatan DM3: http://www.er-c.org/cbb/info/dmformat/#dm3 00000003{4}000000(00|01){6}(14|15){2-258}25252525
● Stata DTA 113: http://www.stata.com/help.cgi?dta_113 71(01|02)01{105}00
● ASP ASAX: https://msdn.microsoft.com/en-us/library/es4ac4ek(v=vs.85).aspx 3C2540204170706C69636174696F6E20(436F6465426568696E64|436F6D70696C65724F7074696F6E73|4465736372697074696F6E|496E686572697473|4C616E6775616765)3D
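To show what an inferred signature is checking, here is a hand translation of the Stata DTA 113 pattern `71(01|02)01{105}00` into a Python regular expression. It assumes the pattern is anchored at the beginning of the file, which the slide does not state, and the test data is synthetic rather than a real Stata file.

```python
import re

# Hand translation of the Stata DTA 113 signature 71(01|02)01{105}00.
# re.DOTALL lets "." stand for any byte value, including 0x0A.
STATA_DTA_113 = re.compile(rb"\x71(?:\x01|\x02)\x01.{105}\x00", re.DOTALL)


def looks_like_dta_113(data: bytes) -> bool:
    """True if `data` starts with the DTA 113 pattern (BOF anchoring is assumed)."""
    return STATA_DTA_113.match(data) is not None


# Synthetic headers built only to exercise the pattern.
fake_header = b"\x71\x02\x01" + b"\xAA" * 105 + b"\x00"
wrong_second_byte = b"\x71\x03\x01" + b"\xAA" * 105 + b"\x00"

print(looks_like_dta_113(fake_header))
print(looks_like_dta_113(wrong_second_byte))
```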
Format ‘Subsets’
● Sometimes file formats may be ‘subsets’ or subtypes of other formats. Major examples are:
● PDF/A – subtype of PDF (so is PDF/X)
● DNG – subtype of TIFF (so is NIKON Raw NEF)
● WAV – subtype of RIFF (so is AVI)
● SVG – subtype of XML (so is GML)
We manage these relationships with ‘priorities’ – for example, PDF/A has priority over PDF because it contains a more specific element
Notes on Format Identification
● Not all files are automatically identifiable (based on how PRONOM currently works) – see Wireless Bitmap (.wbmp) – but an extension-only entry is better than nothing!
● A 4 byte (32 bit) sequence has a 1 in ~4 billion chance of a clash with truly random data – this is usually strong enough
● A text editor can be better for viewing XML based formats than a Hex Editor (although you’ll need the hex editor for creating the byte sequences)
● We’re not trying to characterise a format or validate that it is well formed; we’re just trying to give ourselves a reasonable degree of certainty about the outcome
● Files that ID as OLE2 or ZIP are probably container sigs (for later!)
Signature development process
Reading bytestreams
Format signature tools
● Hexadecimal and binary are number systems, Base 16 and Base 2 respectively. We usually work in Base 10 (decimal), i.e. 1, 2, 3 … 10. Binary for 144 is 10010000. In hex this is simply 0x90. Fewer digits to work with helps us read larger numbers more easily.
● Hex editors (hexadecimal editors): allow for manipulation of the fundamental binary data that constitutes a file
● OPF Format Corpus: an openly-licensed corpus of sample files
● DROID/Siegfried/FIDO: tools to match files to PRONOM format signatures
● PRONOM Submission Utility: an online form to submit information about file formats for PRONOM
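The number-system conversion mentioned above (144 in decimal, binary and hex) can be checked in a few lines of Python. The same arithmetic gives the "1 in ~4 billion" figure quoted earlier for a 4-byte sequence. This is only a worked example; the variable names are arbitrary.

```python
value = 144

print(bin(value))  # binary (Base 2) text form of 144
print(hex(value))  # hexadecimal (Base 16) text form of 144

# Both text forms describe the same number.
print(int("10010000", 2) == int("90", 16) == value)

# Every byte has 256 possible values, so a 4-byte sequence has 256**4 of them.
four_byte_combinations = 256 ** 4
print(four_byte_combinations)
```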
Tools
Hex Editors
A program that allows for manipulation of the fundamental binary data that constitutes a file
Also called a binary file editor
For more info see: https://en.wikipedia.org/wiki/Hex_editor
Hex editors
● Windows - HxD https://mh-nexus.de/en/hxd/
● OS X - HexFiend http://ridiculousfish.com/hexfiend/
● Linux - Bless https://apps.ubuntu.com/cat/applications/precise/bless/
Hex editors
Online options:
http://binvis.io
https://hexed.it/
http://icebuddha.com/
Resources
Format specification documents
A document that describes the set of requirements necessary for a given file format
LoC’s Sustainability of Digital Formats is a good place to look for these: http://www.loc.gov/preservation/digital/formats/fdd/browse_list.shtml
GIF specification: https://www.w3.org/Graphics/GIF/spec-gif89a.txt
Hands-on: Examining sample files in a hex editor
Case study: Developing a simple signature
TZX Spectrum Tapes
The TZX Tape Format
Creating a signature
● A format for archiving ZX Spectrum programs
● Used with ZX emulation programs
● Large hobbyist community – lots of information available
● An audio stream of the tape data
● World of Spectrum Archive: 10,000s of examples - https://www.worldofspectrum.org
Creating Signatures
The Format Specification - http://www.worldofspectrum.org/TZXformat.html
Resources
PRONOM terms, basic syntax and data model
BOF = Beginning of File. EOF = End of File. Var = Variable (anywhere in the file)
Offset/Max Offset = Exact or positional range in which a signature starts
Wildcards:
?? = single wildcard byte, e.g. AB??C3
* = 0-many wildcard bytes, e.g. BC*D4
{n} = specific number of wildcard bytes, e.g. A2{5}F3
{n-n} = range of wildcard bytes, e.g. 4D{0-12}E4
Byte range: [hh:hh] = single byte value between range, e.g. [00:FA]
Either/or: (hhhh|hhhh|hh) = either/any of these byte values, e.g. (0D|0A|0D0A)
Not: [!hh] = anything except this byte value, e.g. ABCD[!01]E1
https://www.nationalarchives.gov.uk/aboutapps/fileformat/pdf/automatic_format_identification.pdf
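One way to get a feel for this syntax is to translate it into a regular expression. The sketch below is not how DROID matches signatures. `pronom_to_regex` is a hypothetical helper written for this example; it covers only the operators listed above (plus `{n-*}`), assumes single-byte values inside `[...]`, and assumes the signature is anchored at BOF with offset 0.

```python
import re


def pronom_to_regex(signature: str):
    """Translate a PRONOM byte sequence into a compiled bytes regex (sketch only)."""
    sig = signature.replace(" ", "")
    out = []
    i = 0
    while i < len(sig):
        ch = sig[i]
        if ch == "?":  # ?? = one wildcard byte
            out.append(b".")
            i += 2
        elif ch == "*":  # * = zero or more wildcard bytes
            out.append(b".*?")
            i += 1
        elif ch == "{":  # {n}, {n-m} or {n-*}
            end = sig.index("}", i)
            low, _, high = sig[i + 1:end].partition("-")
            if not high:
                out.append(b".{%d}" % int(low))
            elif high == "*":
                out.append(b".{%d,}?" % int(low))
            else:
                out.append(b".{%d,%d}?" % (int(low), int(high)))
            i = end + 1
        elif ch == "[":  # [hh:hh] byte range or [!hh] negation
            end = sig.index("]", i)
            body = sig[i + 1:end]
            if body.startswith("!"):
                out.append(b"[^" + re.escape(bytes.fromhex(body[1:])) + b"]")
            else:
                low, high = body.split(":")
                out.append(b"[" + re.escape(bytes.fromhex(low)) + b"-"
                           + re.escape(bytes.fromhex(high)) + b"]")
            i = end + 1
        elif ch in "(|)":  # either/or groups
            out.append({"(": b"(?:", "|": b"|", ")": b")"}[ch])
            i += 1
        else:  # a literal byte written as two hex digits
            out.append(re.escape(bytes.fromhex(sig[i:i + 2])))
            i += 2
    # DOTALL so that "." also matches the byte 0x0A.
    return re.compile(b"".join(out), re.DOTALL)


# The examples from the slide, tried against synthetic bytes.
print(bool(pronom_to_regex("AB??C3").match(bytes.fromhex("AB00C3"))))
print(bool(pronom_to_regex("ABCD[!01]E1").match(bytes.fromhex("ABCD01E1"))))
```

Using `match` rather than `search` is what models the BOF anchor here; a Var signature would correspond more closely to `search`.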
Tool
PRONOM Signature Development Utility
http://www.nationalarchives.gov.uk/pronom/sigdev/index.htm
Hands-on: Creating and editing a sample PRONOM signature
Break! Please be back at 10:50am
Signature development process cont’d
Tool
Format characterization tools
The process of file format characterization identifies, validates, and extracts key characteristics of the file formats represented in our preservation collections including format, version, validity, technical metadata, etc.
Tools include:
● DROID
● Siegfried
● Fido
● File Information Tool Set (FITS)
Testing Signatures
Tool
OPF Format Corpus
An openly-licensed corpus of small example files, covering a wide range of formats and creation tools.
Pre-requisites:
✔ GitHub account
❌ CLI experience
https://github.com/openpreserve/format-corpus/wiki/Contributing-Format-Samples-Without-Needing-to-Know-the-Command-Line
Contributing Format Samples
Getting onto GitHub
You’ll need a GitHub account to upload files to the format corpus. It’s free to join: https://github.com/join/
Contributing Format Samples
Your test files
You’ll also need some sample files to contribute. You’ll need:
● to understand the sample files well
● the necessary rights or permissions to upload the samples
● a willingness to waive the rights to your contribution using a CC0 license: https://creativecommons.org/choose/zero/
Contributing Format Samples
Overview
The steps in the process are as follows:
1. Find the format corpus on GitHub and copy it to our account
2. Create a new working area for our contribution
3. Create some basic documentation for the submission
4. Add two disk image samples to the repository
5. Submit the contribution to the corpus as a pull request.
Contributing Format Samples
Find and fork the corpus repository
The OPF corpus can be found using Google. It’s the link to the GitHub repository: https://github.com/openpreserve/format-corpus
It’s best to create a copy of the corpus under your own account. In GitHub speak this copy is known as a “fork”. The fork button is in the top right-hand corner of the screen, beneath the black menu bar.
Contributing Format Samples
Create a new branch to work in
Once you have your own copy of the corpus repo, create a new branch for the contribution. Hit the “Branch” button on the left side of the screen and type a meaningful name. Use “/” to separate branch name parts for tidiness.
Contributing Format Samples
Creating a directory and README file
First we’ll create a new directory containing a single text file for documentation. Convention dictates that these files should be called “README.md”. To create the file hit the “Create new file” button on the right of the screen. Then type the directory and filename in the dialog; adding a “/” creates a new directory.
Contributing Format Samples
README.md and GitHub flavoured markdown
Once you’ve created the file you’ll be able to edit the contents in the web GUI text editor. This can be added as plain text, but the editor also supports GitHub flavoured Markdown (https://guides.github.com/features/mastering-markdown/) for extra formatting features, shown in the “Preview”.
Contributing Format Samples
Time to commit
Once you’re happy with the documentation you can use the Commit dialog at the bottom of the page to submit your changes. Be explicit in your descriptions so that others can easily see your intention as well as what was actually committed. Hit the green “Commit new file” button to save your changes.
Contributing Format Samples
Upload sample data files
We’re now going to add the image samples to our submission. From the same directory use the “Upload files” button in the top right hand corner to open a file upload dialog.
You can drag and drop or use “choose your files” to open a file selection dialog.
Contributing Format Samples
Time to commit again
Once you’ve selected your samples you’ll need to commit again. You can add some detail to the commit dialog; this helps people to understand your intention. When you’re happy with the commit message hit the “Commit changes” button to add your files and comments.
Contributing Format Samples
Contributing our changes to the main corpus repository
Once you’ve committed your changes you’ll be taken back to the home page for your repository fork. Now you are offered the chance to “Compare & pull request” from your new branch. A “Pull Request” is GitHub parlance for contributing your changes. You can go ahead and hit the green “Compare and pull request” button.
Contributing Format Samples
Contributing our changes to the main corpus repository
Once again you’ll be able to add some comments; here I’ve simply cut and pasted our existing comments. For a single commit GitHub would do this for us. Here’s the Pull Request ready to submit. Note: Be sure that you’re comparing and pull requesting to the openpreserve master branch. Submit your contribution using the green “Create pull request” button.
Contributing Format Samples
Wait for a review and merge
Your pull request is now submitted. You won’t have the permissions to merge this with the main repository. This enables a corpus repository admin to review your changes before committing them to the repository. If there’s a problem you’ll get a request for changes that need to be made. Congratulations.
Hands-on: Testing draft PRONOM signature against sample files
Submitting to PRONOM
Talking to the PRONOM team or asking for help
● Submissions range from ‘This file won’t identify and I don’t know what it is,’ through to ‘Here is everything there is to know about this format, including signature and samples!’
● Ideally we’d like format name & samples as a minimum (but don’t let that put you off!)
● We can provide non-disclosure agreements if required by your organisation
● Channels:
○ The PRONOM Mailbox - [email protected] (~10MB file size limit)
○ PRONOM submission form - https://www.nationalarchives.gov.uk/contact-us/submit-information-for-pronom/ (no attachments)
○ PRONOM Google Group - https://groups.google.com/forum/#!forum/pronom (public!)
Case study: Developing a FAT 12 Signature
FAT 12
Agenda
Working with a FAT12 sample
● Using a hex editor
● Identifying components of your ‘FAT12’ file
● Turning that into a signature
● Testing your signature
● Packaging it up for PRONOM
● Developing signatures by format specifications
FAT12 Case Study
Your Files
Working with a FAT12 sample
You have a collection of files, and you know roughly what they are. This is good provenance. Even before we begin to look up specifications, we can take a look at the files one by one.
What do you see?
A sample file:
https://archive.org/details/PTS-DOS6.42German
https://archive.org/download/PTS-DOS6.42German/disk1.img
Using a Hex Editor
Open the sample in binvis.io
Each byte is displayed as a hexadecimal number
Scroll to the top left corner (address 00000000).
You can see some helpful text showing on the right . . . (PTS DOS, DISK_1, FAT12)
Open the sample in hexed.it
We can see patterns by looking at more files
Identifying Components
We can see patterns by looking at more files
There is some consistent data visible across multiple samples. Each sample starts with the hexadecimal number ‘EB’, and the 3rd number is ‘90’.
disk1.img
EB 3D 90 50 54 53 24 44 4F 53 20 00 02 01 01 00
02 E0 00 60 09 F9 07 00 0F 00 02 00 00 00 00 00
00 00 00 00 00 00 29 43 D8 C0 7F 44 49 53 4B 5F
31 00 00 00 00 00 46 41 54 31 32 20 20 20 00 FA
drdosboot.img
EB 3C 90 44 52 44 4F 53 37 2E 30 00 02 01 01 00
02 E0 00 40 0B F0 09 00 12 00 02 00 00 00 00 00
00 00 00 00 00 00 29 38 25 A8 6E 4E 4F 20 4E 41
4D 45 20 20 20 20 46 41 54 31 32 20 20 20 FA FC
We can see patterns by looking at more files
The ASCII characters ‘FAT12’ appear in the fourth row. In hex that is ‘46 41 54 31 32’. On the ms-dos_5 sample, it says ‘FAT16’.
pts-dos_2000_deutsch_disk1.img
EB 3D 90 50 54 53 44 4F 53 36 30 00 02 01 01 00
02 E0 00 40 0B F0 09 00 12 00 02 00 00 00 00 00
00 00 00 00 00 00 29 35 18 8E 53 00 00 00 00 00
00 00 00 00 00 00 46 41 54 31 32 20 20 20 00 FA
ms-dos_5.img
EB 3C 90 27 7D 7D 33 32 49 48 43 00 02 01 01 00
02 E0 00 40 0B F0 09 00 12 00 02 00 00 00 00 00
00 00 00 00 00 00 29 62 24 F5 15 4D 53 2D 44 4F
53 5F 35 20 20 20 46 41 54 31 36 20 20 20 FA 33
A brief note about Hexadecimal and ASCII
ASCII is a character encoding. ASCII directly maps numbers to text.
For example, the hexadecimal numbers: 46 41 54 31 32 converted to ASCII read: F A T 1 2
http://www.asciitable.com/
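The conversion above can be reproduced without a lookup table. This is a small illustration using Python's built-in `bytes` type; the variable names are arbitrary.

```python
hex_text = "46 41 54 31 32"

# bytes.fromhex skips the spaces between the two-digit pairs.
raw = bytes.fromhex(hex_text)
print(raw.decode("ascii"))

# And back again: ASCII text to space-separated hexadecimal numbers.
round_trip = " ".join(f"{byte:02X}" for byte in "FAT12".encode("ascii"))
print(round_trip)
```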
Writing a File Format Signature
Starting at the beginning . . .
A signature attempts to define a series of ‘magic bytes’: some pattern that will always show up in a specific part of a file, and that can be used as a fingerprint to identify that a specific file conforms to a particular format.
Starting at the beginning . . .
A first signature could simply be:
BOF: EB {1} 90
● ‘BOF’ means ‘beginning of file’
● ‘EB’ is the hexadecimal number we expect to see as the very first byte in the file.
● ‘{1}’ means one wildcard byte (the next number can be anything)
● ‘90’ is the 3rd number we expect to see.
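Read literally, this first signature is a check on bytes 0 and 2 of the file. The sketch below spells that out; `matches_first_signature` is a hypothetical helper name, and the inputs are the first three bytes of samples shown on the earlier slides plus one unrelated byte string.

```python
def matches_first_signature(data: bytes) -> bool:
    """BOF: EB {1} 90 -> byte 0 is EB, byte 1 is anything, byte 2 is 90."""
    return len(data) >= 3 and data[0] == 0xEB and data[2] == 0x90


print(matches_first_signature(bytes.fromhex("EB3D90")))  # start of disk1.img
print(matches_first_signature(bytes.fromhex("EB3C90")))  # start of drdosboot.img
print(matches_first_signature(b"\x89PNG"))               # unrelated bytes
```

Three bytes with one of them a wildcard is a weak fingerprint, which is why the following slides lengthen the signature.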
Comparing to an existing signature
There is already a FAT Disk Image signature (fmt/1087) http://www.nationalarchives.gov.uk/PRONOM/fmt/1087
BOF: EB{1}90{10}(01|02|04|08|10|20|40|80)
This matches FAT12, FAT16, and FAT32 Disk Images. We will need to make our new signature more specific to make it useful for matching FAT12, without also matching other FAT Disk Images.
Reviewing our earlier work:
pts-dos_2000_deutsch_disk1.img
EB 3D 90 50 54 53 44 4F 53 36 30 00 02 01 01 00
02 E0 00 40 0B F0 09 00 12 00 02 00 00 00 00 00
00 00 00 00 00 00 29 35 18 8E 53 00 00 00 00 00
00 00 00 00 00 00 46 41 54 31 32 20 20 20 00 FA
The ASCII characters ‘FAT12’ (in hex ‘46 41 54 31 32’) begin at the 55th byte of the file (zero-based offset 54).
The fmt/1087 signature defines 14 numbers. There are 40 numbers between the end of that signature and the beginning of the ‘FAT12’ text:
BOF:EB{1}90{10}(01|02|04|08|10|20|40|80){40}4641543132
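Before opening DROID, the draft can be sanity-checked against the bytes shown on the slides. The sketch below hand-translates the fmt/1087 signature and the new FAT12 draft into Python regular expressions and runs them over the first 64 bytes of two samples. The regex translation is an approximation for illustration and is not DROID's matching engine.

```python
import re

# (01|02|04|08|10|20|40|80) written as a character class.
ALTERNATIVES = rb"[\x01\x02\x04\x08\x10\x20\x40\x80]"

# BOF:EB{1}90{10}(01|02|04|08|10|20|40|80)
FAT_GENERIC = re.compile(rb"\xEB.\x90.{10}" + ALTERNATIVES, re.DOTALL)
# BOF:EB{1}90{10}(01|02|04|08|10|20|40|80){40}4641543132
FAT12_DRAFT = re.compile(rb"\xEB.\x90.{10}" + ALTERNATIVES + rb".{40}FAT12", re.DOTALL)

# First 64 bytes of two samples, copied from the slides.
PTS_DOS = bytes.fromhex(
    "EB 3D 90 50 54 53 44 4F 53 36 30 00 02 01 01 00"
    "02 E0 00 40 0B F0 09 00 12 00 02 00 00 00 00 00"
    "00 00 00 00 00 00 29 35 18 8E 53 00 00 00 00 00"
    "00 00 00 00 00 00 46 41 54 31 32 20 20 20 00 FA"
)
MS_DOS_5 = bytes.fromhex(
    "EB 3C 90 27 7D 7D 33 32 49 48 43 00 02 01 01 00"
    "02 E0 00 40 0B F0 09 00 12 00 02 00 00 00 00 00"
    "00 00 00 00 00 00 29 62 24 F5 15 4D 53 2D 44 4F"
    "53 5F 35 20 20 20 46 41 54 31 36 20 20 20 FA 33"
)

for name, sample in (("pts-dos", PTS_DOS), ("ms-dos_5", MS_DOS_5)):
    print(name,
          "generic FAT:", bool(FAT_GENERIC.match(sample)),
          "FAT12 draft:", bool(FAT12_DRAFT.match(sample)),
          "'FAT' text at offset:", sample.find(b"FAT"))
```

Both samples satisfy the generic signature, but only the one carrying the ‘FAT12’ text satisfies the draft, which is the narrowing the slide is aiming for.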
PRONOM Signature Development Utility
Testing with Tools
DROID
Download and install: http://www.nationalarchives.gov.uk/information-management/manage-information/preserving-digital-records/droid/
Download and install:
Break! Please be back at 12:00 pm
Advanced signature development
Precision and Sensitivity in Signatures
When writing signatures, it is good practice to minimize False Positives and False Negatives.
[Figure labels: Format 1, Format 2, Format 1 Signature]
Precision & Sensitivity
False Negatives in Format Identification
Precise but insensitive signatures do not identify all files with the format
[Figure labels: Format 1, Format 2, False Negative, Format 1 Signature]
False Positives in Format Identification
Imprecise but sensitive signatures identify the wrong files with a format
[Figure labels: Format 1, Format 2, False Positives, Format 1 Signature]
Ideal Signatures
Precise and sensitive signatures identify a file format correctly
[Figure labels: Format 1, Format 2, Format 1 Signature]
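The two properties can be put into numbers once a draft signature has been run over a test collection. The sketch below uses the standard definitions of precision and sensitivity (recall); the counts are made up for illustration and do not come from any real PRONOM test run.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Share of files the signature matched that really are the format."""
    return true_positives / (true_positives + false_positives)


def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Share of files of the format that the signature actually matched."""
    return true_positives / (true_positives + false_negatives)


# Made-up counts: 90 Format 1 files matched, 5 Format 2 files matched by
# mistake, 10 Format 1 files missed.
tp, fp, fn = 90, 5, 10
print(precision(tp, fp))
print(sensitivity(tp, fn))
```

A signature that is too loose loses precision through false positives, while one that is too strict loses sensitivity through false negatives.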
Strategies for Improvement
Add multiple signatures
TIFF’s (fmt/353) two signatures
Add more flexibility to your signature
Signature Options = Comment
?? = Match one byte.
* = Match zero or more bytes.
{j} = Match exactly j bytes.
{j-k} = Match from j up to k bytes.
{j-*} = Match at least j bytes.
[a:b] = Match one byte between a and b inclusive.
(ab|cde) = Match the byte sequence ab or the sequence cde.
Offset, MaxOffset = A pattern may have an Offset or MaxOffset. One or both may be provided.
Offset (no MaxOffset)
MaxOffset (no Offset)
Adam Farquhar, http://openpreservation.org/blog/2010/10/27/closer-look-pronom-signatures/
MP3’s (fmt/134) very complicated signature
Testing specificity of signatures
Skeleton Test Suite
a tool for the automated generation of digital objects based on the digital signatures documented in the PRONOM database
You probably don’t have examples of all 2000 PRONOM formats, so Ross Spencer cooked up some samples.
https://github.com/exponential-decay/pronom-archive-and-skeleton-test-suite
Download the generated skeleton files for the latest version of PRONOM from the Zenodo dataset link
Run your signatures against this test suite. It should not identify any of the skeleton files.
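DROID is the tool to use for the real check. As a quick pre-check, a regex approximation of a draft signature can be run over a folder of files to list unexpected hits. In the sketch below the temporary directory and its two synthetic files stand in for the unpacked skeleton suite, the file names are invented, the draft is the regex form of the FAT12 signature from the case study, and reading only the first 4096 bytes is an assumption that suits BOF signatures.

```python
import re
import tempfile
from pathlib import Path

# Regex approximation of BOF:EB{1}90{10}(01|02|04|08|10|20|40|80){40}4641543132
DRAFT_FAT12 = re.compile(
    rb"\xEB.\x90.{10}[\x01\x02\x04\x08\x10\x20\x40\x80].{40}FAT12", re.DOTALL
)


def files_matching(signature, directory):
    """Return paths under `directory` whose leading bytes match `signature`."""
    hits = []
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file():
            with path.open("rb") as handle:
                header = handle.read(4096)
            if signature.match(header):
                hits.append(path)
    return hits


# Stand-in for the skeleton suite: one harmless file and one deliberate clash.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "skeleton-a.bin").write_bytes(b"\x89PNG\r\n\x1a\n")
    clash = b"\xEB\x00\x90" + bytes(10) + b"\x01" + bytes(40) + b"FAT12"
    Path(tmp, "skeleton-b.bin").write_bytes(clash)
    hit_names = [path.name for path in files_matching(DRAFT_FAT12, tmp)]

print(hit_names)
```

Against the real skeleton files, any name in the resulting list would be a format whose skeleton the draft signature wrongly matches.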
Container Signatures
How are they different from typical signatures
Some file formats use other formats as containers. In these cases, the format signature first needs to identify the container, and then identify the format using that container.
The two most common container formats are
1. OLE2 - Used primarily by Microsoft in the 90s and 00s
2. ZIP - Used by Microsoft and many others from the 00s to the present day
A container signature requires three parts
1. File Format Signature - A PRONOM signature without an internal signature
2. Container Signature - Byte sequences and/or file paths that can be found within the container
3. File Format Container Mapping - a relationship between the File Format Signature and the Container Signature
Container Example: Office Open XML
Office Open XML formats use ZIP as a container for XML
1. Find a DOCX, PPTX, XLSX, or similar. Make a copy so that you don’t destroy the original
2. Change the extension to zip. Not strictly necessary, but this helps the OS default to unzipping programs
3. Unzip and explore. Depending on the file, you can find attachments, embedded office docs, and other data
Case study: Developing a Container Signature
Winamp Skins
Classic and Modern Skins
Winamp skins allowed users to customize the appearance and functionality of the Winamp Media Playback Software. They use either WSZ or WAL as a file extension, but they are actually ZIP files.
“A Winamp skin is composed of 45 files. …
This started getting messy, because not all of the skin developers were creating subfolders when compressing their skins into a ZIP file for distribution. …
However, a new problem then surfaced: The .ZIP file format is a very widely used compression scheme and Winamp was just one of the many dozens of programs available to utilize it. We wanted users to be able to double-click the Skin ZIP file and have Winamp automatically install and load the skin. How do we do that without associating Winamp as the default program for handling skins? Answer: Rename the file extension.” http://wiki.winamp.com/wiki/WSZ_Files#WSZ_History
Creating the Format Signature
XML for Modern Winamp Standard Signature
Creating the Container Signature
A container signature can use both internal byte sequences and file paths to identify the format. File paths are useful for identifying required files that must be included in the file.
Modern Winamp skins require a file named ‘Skin.xml’
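The file-path half of a container signature amounts to asking whether the ZIP holds a member with the required name. The sketch below does that with Python's `zipfile` module on small archives built in memory. The helper names are invented for this example, the member contents are placeholders, and comparing names case-insensitively is an assumption rather than documented DROID behaviour.

```python
import io
import zipfile


def build_zip(members: dict) -> bytes:
    """Build a small in-memory ZIP from a {path: content} mapping (test data only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in members.items():
            archive.writestr(name, content)
    return buffer.getvalue()


def has_member(container: bytes, wanted: str) -> bool:
    """True if `container` is a ZIP holding a path equal to `wanted`, ignoring case."""
    if not zipfile.is_zipfile(io.BytesIO(container)):
        return False
    with zipfile.ZipFile(io.BytesIO(container)) as archive:
        return any(name.lower() == wanted.lower() for name in archive.namelist())


# Placeholder archives: one shaped like a modern skin, one without the file.
modern_like = build_zip({"skin.xml": "<placeholder/>"})
other_zip = build_zip({"main.bmp": b"placeholder"})

print(has_member(modern_like, "Skin.xml"))
print(has_member(other_zip, "Skin.xml"))
```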
XML for Modern Winamp Container Signature based on File Name
Testing the Container Signature
Ross Spencer’s walkthrough of container signatures: http://openpreservation.org/blog/2016/01/07/droid-container-signature-files-what-they-are-and-how-to-create-them-a-template-and-an-example-or-few/
Case study: Developing an HFS Signature
Hierarchical File System (1985-1998)
Thank you
Contact Info: