
iPRES 2018: The 15th International Conference on Digital Preservation
September 24th, 2018, 9am-12:30pm in Room 216, the Joseph B Martin Conference Centre

Practice Creating Format/System Signatures for Submission to PRONOM Technical Registry

David Clipsham, Nick Krabbenhoeft, Shira Peltzman, Justin Simpson, & Carl Wilson

Introductions

Facilitators

● David Clipsham - Digital Archives Systems Manager, PRONOM Lead, National Archives, UK
● Nick Krabbenhoeft - Head of Digital Preservation, New York Public Library
● Shira Peltzman - Digital Archivist, UCLA Library
● Justin Simpson - Archivematica Technical Director, Artefactual Systems, Inc
● Carl Wilson - Technical Lead, Open Preservation Foundation

Agenda

9:15 - 9:35 am: Introduction to signatures
How are file formats identified, overview of PRONOM, case studies

9:35 - 10:35 am: Signature development process
Reading bytestreams (why to do it, how to do it), creating signatures

[break]

10:50 - 11:45 am: Signature development process (cont’d) & case studies
Testing signatures, submitting to PRONOM

[break]

12:00 - 12:30 pm: Advanced signature development & open signature development workshop
Container signatures, finding samples, troubleshooting existing signatures

Introduction to file format signatures

Why file format signatures?

Relevancy to digital preservation

● File format identification enables us to know what we’re dealing with

○ This happens early on in most workflows

○ The outcome of this process impacts downstream decision-making around activities like normalization for preservation and access

● File format identification tools are only as good as the file format signatures that have been developed by the community

○ The lack of a file format signature means that file identification cannot meaningfully take place

○ Executing tasks that should be straightforward, like extraction and characterization, is sometimes difficult if not altogether impossible

PRONOM

Image from Flickr via kevandotorg

PRONOM
http://www.nationalarchives.gov.uk/PRONOM/Default.aspx

File format registry for digital preservation planning. Developed in 2001 to meet the National Archives’ digital record preservation planning needs.

● 1670 entries, aka PUIDs - PRONOM Unique Identifiers
● Format extensions, MIME/Media types, links to documentation
● File format identification signatures (for DROID originally)
● File format research always ongoing, National Archives research guided primarily by UK Government needs. External contribution always welcome and encouraged

PRONOM Timeline

● 2001: Original internal version
● 2004: Opened up as externally browsable resource, aka PRONOM 3
● 2005: DROID launched alongside PRONOM 4, PUIDs introduced
● Ongoing: Continual PRONOM research and signature development

PRONOM Growth

PRONOM Contributors

File Format ID

PRONOM identification mechanisms

● Extension (.doc, .exe, .jpg)
● File format signature
○ Binary pattern matching
○ Created from elements of internal structure
○ May be simple ‘magic numbers’ - “CAFEBABE” for Java Class File
■ http://blog.nationalarchives.gov.uk/blog/cafed00ds-and-cafebabes/
○ May consist of complex patterns of variations, gaps and alternative values.
○ Driven by file format specification where possible
● Container signatures - formats made up of small files contained within a ‘ZIP’ or ‘OLE2’ wrapper (.doc, .xlsx, .odt, .)
○ http://openpreservation.org/blog/2016/01/07/droid-container-signature-files-what-they-are-and-how-to-create-them-a-template-and-an-example-or-few/

File Format ID

DROID Pattern Matching

● Scans internal file code

● Compares against known signatures in signature file

● Returns a Hit! where it gets a match

● We’re aiming for certainty – there should be an extremely low chance that a file could be of a different format to the one that DROID identifies

● So the signature needs to be strong enough, but it doesn’t need to encode all of the characteristics of a format

File Format ID

Magic Numbers (AKA Signatures)

● A specified sequence of characters/bytes that must be present
● Usually at the start of the file (not always)
● Explicitly stated within the format specification:

● Java Class file – https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-4.html
Hex 0xCAFEBABE

● PNG - https://www.w3.org/TR/PNG-Structure.html
Hex 0x89, then ASCII “PNG”, then hex 0x0D0A1A0A

● Photoshop PSD - https://www.adobe.com/devnet-apps/photoshop/fileformatashtml
ASCII “8BPS”
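As a rough illustration of how a magic number check works, the sketch below tests the start of a byte string against the three magic numbers listed above. The table and the function name `identify_by_magic` are made up for this example; this is not how DROID is implemented.

```python
# Magic numbers taken from the three specifications cited on this slide.
MAGIC_NUMBERS = {
    "Java Class file": bytes.fromhex("CAFEBABE"),
    "PNG": bytes.fromhex("89504E470D0A1A0A"),  # 0x89, "PNG", 0x0D0A1A0A
    "Photoshop PSD": b"8BPS",
}


def identify_by_magic(data: bytes) -> list:
    """Return every format whose magic number appears at the start of data."""
    return [name for name, magic in MAGIC_NUMBERS.items() if data.startswith(magic)]


# A made-up byte string that merely begins like a PSD file.
print(identify_by_magic(b"8BPS" + bytes(20)))
```

A check this simple only works for formats whose magic number sits at the very start of the file; the wildcard and offset syntax covered later handles the other cases.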

File Format ID

Inferred Signatures

● Sometimes formats may not have clearly defined signatures, but may have characteristics that must be present. This can be a good hook for a signature. This can get really complex!

● Gatan DM3: http://www.er-c.org/cbb/info/dmformat/#dm3
00000003{4}000000(00|01){6}(14|15){2-258}25252525

● DTA 113: http://www.stata.com/help.cgi?dta_113
71(01|02)01{105}00

● ASP ASAX: https://msdn.microsoft.com/en-us/library/es4ac4ek(v=vs.85).aspx
3C2540204170706C69636174696F6E20(436F6465426568696E64|436F6D70696C65724F7074696F6E73|4465736372697074696F6E|496E686572697473|4C616E6775616765)3D

File Format ID

Format ‘Subsets’

● Sometimes file formats may be ‘subsets’ or subtypes of other formats. Major examples are:

● PDF/A – subtype of PDF (so is PDF/X)
● DNG – subtype of TIFF (so is NIKON Raw NEF)
● WAV – subtype of RIFF (so is AVI)
● SVG – subtype of XML (so is GML)

We manage these relationships with ‘priorities’ – for example, PDF/A has priority over PDF because it contains a more specific element
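One way to picture priorities: when a file matches both a general format and its subtype, the more specific hit suppresses the general one. The sketch below assumes a hand-written priority table built from the examples above; the table, the format labels and `apply_priorities` are illustrative and do not reflect PRONOM's actual data model.

```python
# Hypothetical priority table: each key takes priority over the formats listed.
PRIORITY_OVER = {
    "PDF/A": {"PDF"},
    "DNG": {"TIFF"},
    "WAV": {"RIFF"},
    "SVG": {"XML"},
}


def apply_priorities(hits: set) -> set:
    """Drop any hit that a more specific hit takes priority over."""
    suppressed = set()
    for fmt in hits:
        suppressed |= PRIORITY_OVER.get(fmt, set())
    return hits - suppressed


# A PDF/A file matches both signatures; only the more specific one is kept.
print(apply_priorities({"PDF", "PDF/A"}))
```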

File Format ID

Notes on Format Identification

● Not all files are automatically identifiable (based on how PRONOM currently works) – see Wireless Bitmap (.wbmp) – but an extension-only entry is better than nothing!

● A 4 byte (32 bit) sequence has a 1 in ~4 billion chance of a clash with truly random data – this is usually strong enough

● A text editor can be better for viewing XML based formats than a hex editor (although you’ll need the hex editor for creating the byte sequences)

● We’re not trying to characterise a format or validate that it is well formed, we’re just trying to give us a reasonable degree of certainty about the outcome

● Files that ID as OLE2 or ZIP are probably container sigs (for later!)

Signature development process

Reading bytestreams

Format signature tools

● Binary and hex (hexadecimal) are number systems, Base 2 and Base 16 respectively. We usually work in Base 10 (decimal), i.e. 1, 2, 3 … 10. Binary for 144 is 10010000. In hex this is simply 0x90. Fewer digits to work with helps us read larger numbers more easily.
● Hex editors: allow for manipulation of the fundamental binary data that constitutes a file
● OPF Format Corpus: an openly-licensed corpus of sample files
● DROID/Siegfried/FIDO: tools to match files to PRONOM format signatures
● PRONOM Submission Utility: an online form to submit information about file formats for PRONOM
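The conversion between decimal, binary and hex for the value 144 can be checked with Python's built-in formatting. This is only a small sketch to make the number systems concrete; the variable name is arbitrary.

```python
value = 144

binary_text = format(value, "08b")   # eight binary digits
hex_text = format(value, "#04x")     # two hex digits with the 0x prefix

print(binary_text)                   # the Base 2 form of 144
print(hex_text)                      # the Base 16 form of 144

# Going the other way: parse the text back into the same number.
print(int("10010000", 2), int("90", 16))
```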

Tools

Hex Editors A program that allows for manipulation of the fundamental binary data that constitutes a file

Also called a binary file editor or byte editor

For more info see: https://en.wikipedia.org/wiki/Hex_editor

Hex editors

● Windows - HxD https://mh-nexus.de/en/hxd/

● OS X - HexFiend http://ridiculousfish.com/hexfiend/

● Linux - Bless https://apps.ubuntu.com/cat/applications/precise/bless/

Hex editors

Online options:

http://binvis.io
https://hexed.it/
http://icebuddha.com/

Resources

Format specification documents A document that describes the set of requirements necessary for a given file format

LoC’s Sustainability of Digital Formats is a good place to look for these: http://www.loc.gov/preservation/digital/formats/fdd/browse_list.shtml

GIF specification: https://www.w3.org/Graphics/GIF/spec-gif89a.txt

Hands-on: Examining sample files in a hex editor

Case study: Developing a simple signature

TZX Spectrum Tapes

The TZX Tape Format

Creating a signature

● A format for archiving ZX Spectrum programs

● Used with ZX emulation programs

● Large hobbyist community – lots of information available

● An audio stream of the tape data

● World of Spectrum Archive: 10,000’s of examples - https://www.worldofspectrum.org

Creating Signatures

The TZX Tape Format

The Format Specification - http://www.worldofspectrum.org/TZXformat.html

Resources

PRONOM terms, basic syntax and data model

BOF = Beginning of File
EOF = End of File
Var = Variable (anywhere in the file)
Offset/Max Offset = Exact or positional range in which a signature starts

Wildcards:
?? = single wildcard byte, e.g. AB??C3
* = 0-many wildcard bytes, e.g. BC*D4
{n} = specific number of wildcard bytes, e.g. A2{5}F3
{n-n} = range of wildcard bytes, e.g. 4D{0-12}E4

Byte range:
[hh:hh] = single byte value between range, e.g. [00:FA]

Either/or:
(hhhh|hhhh|hh) = either/any of these byte values, e.g. (0D|0A|0D0A)

Not:
[!hh] = anything except this byte value, e.g. ABCD[!01]E1

https://www.nationalarchives.gov.uk/aboutapps/fileformat/pdf/automatic_format_identification.
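To get a feel for the syntax, the sketch below translates the wildcard notation above into a Python regular expression over bytes. It is an approximation for experimenting only: `pronom_to_regex` is a hypothetical helper, it assumes single-byte values inside `[hh:hh]` and `[!hh]`, it ignores Offset/Max Offset, and it is not DROID's matching engine. Calling `.match()` on the result behaves like a BOF signature at offset 0.

```python
import re


def _byte(hex_pair: str) -> bytes:
    """Render one hex pair as an explicit regex escape for that byte value."""
    if not re.fullmatch(r"[0-9A-Fa-f]{2}", hex_pair):
        raise ValueError(f"not a hex byte: {hex_pair!r}")
    return b"\\x" + hex_pair.encode("ascii")


def pronom_to_regex(signature: str) -> "re.Pattern":
    """Translate a subset of PRONOM byte-sequence syntax into a bytes regex."""
    sig = signature.replace(" ", "")
    out, i = [], 0
    while i < len(sig):
        if sig.startswith("??", i):              # single wildcard byte
            out.append(b".")
            i += 2
        elif sig[i] == "*":                      # 0-many wildcard bytes
            out.append(b".*?")
            i += 1
        elif sig[i] == "{":                      # {n}, {n-m} or {n-*}
            end = sig.index("}", i)
            low, _, high = sig[i + 1:end].partition("-")
            if not high:
                quant = "{%s}" % low
            else:
                quant = "{%s,%s}" % (low, "" if high == "*" else high)
            out.append(b"." + quant.encode("ascii"))
            i = end + 1
        elif sig[i] == "[":                      # [hh:hh] range or [!hh] negation
            end = sig.index("]", i)
            body = sig[i + 1:end]
            if body.startswith("!"):
                out.append(b"[^" + _byte(body[1:]) + b"]")
            else:
                low, high = body.split(":")
                out.append(b"[" + _byte(low) + b"-" + _byte(high) + b"]")
            i = end + 1
        elif sig[i] in "(|)":                    # either/or groups
            out.append({"(": b"(?:", "|": b"|", ")": b")"}[sig[i]])
            i += 1
        else:                                    # a literal byte
            out.append(_byte(sig[i:i + 2]))
            i += 2
    # DOTALL so that a wildcard also matches the newline byte 0x0A.
    return re.compile(b"".join(out), re.DOTALL)


pattern = pronom_to_regex("A2{5}F3")
print(bool(pattern.match(bytes.fromhex("A2 00 00 00 00 00 F3"))))
```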

Tool

PRONOM Signature Development Utility

http://www.nationalarchives.gov.uk/pronom/sigdev/index.htm

Hands-on: Creating and editing a sample PRONOM signature

Break! Please be back at 10:50am

Signature development process cont’d

Tool

Format characterization tools

The process of file format characterization identifies, validates, and extracts key characteristics of the file formats represented in our preservation collections including format, version, validity, technical metadata, etc.

Tools include:

● DROID
● Siegfried
● Fido
● File Information Tool Set (FITS)

Testing Signatures

Tool

OPF Format Corpus

An openly-licensed corpus of small example files, covering a wide range of formats and creation tools.

Pre-requisites:

✔GitHub account

❌ CLI experience

https://github.com/openpreserve/format-corpus/wiki/Contributing-Format-Samples-Without-Needing-to-Know-the-Command-Line

Contributing Format Samples

Getting onto GitHub

You’ll need a GitHub account to upload files to the format corpus. It’s free to join: https://github.com/join/


Contributing Format Samples

Your test files

You’ll also need some sample files to contribute. You’ll need:
● to understand the sample files well
● the necessary rights or permissions to upload the samples
● a willingness to waive the rights to your contribution using a CC0 license: https://creativecommons.org/choose/zero/

Contributing Format Samples

Overview

The steps in the process are as follows:
1. Find the format corpus on GitHub and copy it to our account
2. Create a new working area for our contribution
3. Create some basic documentation for the submission
4. Add two disk image samples to the repository
5. Submit the contribution to the corpus as a pull request.

Contributing Format Samples

Find and fork the corpus repository

The OPF corpus can be found using Google. It’s the link to the GitHub repository: https://github.com/openpreserve/format-corpus

It’s best to create a copy of the corpus under your own account. In GitHub speak this copy is known as a “fork”. The fork button is in the top right-hand corner of the screen, beneath the black menu bar.

Contributing Format Samples

Create a new branch to work in

Once you have your own copy of the corpus repo, create a new branch for the contribution. Hit the “Branch” button on the left side of the screen and type a meaningful name. Use “/” to separate branch name parts for tidiness.

Contributing Format Samples

Creating a directory and README file

First we’ll create a new directory containing a single file for documentation. Convention dictates that these files should be called “README.md”. To create the file hit the “Create new file” button on the right of the screen. Then type the directory and file name in the dialog; adding a “/” creates a new directory.

Contributing Format Samples

README.md and GitHub flavoured markdown

Once you’ve created the file you’ll be able to edit the contents in the web GUI text editor. This can be added as plain text but supports GitHub flavoured Markdown (https://guides.github.com/features/mastering-markdown/) for extra formatting features, shown in the “Preview”.

Contributing Format Samples

Time to commit

Once you’re happy with the documentation you can use the Commit dialog at the bottom of the page to submit your changes. Be explicit in your descriptions so that others can easily see your intention as well as what was actually committed. Hit the green “Commit new file” button to save your changes.

Contributing Format Samples

Upload sample data files

We’re now going to add the image samples to our submission. From the same directory use the “Upload files” button in the top right hand corner to open a file upload dialog.

You can drag and drop or use “choose your files” to open a file selection dialog.

Contributing Format Samples

Time to commit again

Once you’ve selected your samples you’ll need to commit again. You can add some detail to the commit dialog; this helps people to understand your intention. When you’re happy with the commit message hit the “Commit changes” button to add your files and comments.

Contributing Format Samples

Contributing our changes to the main corpus repository

Once you’ve committed your changes you’ll be taken back to the home page for your repository fork. Now you are offered the chance to “Compare & pull request” from your new branch. A “Pull Request” is GitHub parlance for contributing your changes. You can go ahead and hit the green “Compare & pull request” button.

Contributing Format Samples

Contributing our changes to the main corpus repository

Once again you’ll be able to add some comments; here I’ve simply cut and pasted our existing comments. For a single commit GitHub would do this for us. Here’s the Pull Request ready to submit. Note: Be sure that you’re comparing and pull requesting to the openpreserve master branch. Submit your contribution using the green “Create pull request” button.

Contributing Format Samples

Wait for a review and merge

Your pull request is now submitted. You won’t have the permissions to merge this with the main repository. This enables a corpus repository admin to review your changes before committing them to the repository. If there’s a problem you’ll get a request for changes that need to be made. Congratulations.

Hands-on: Testing draft PRONOM signature against sample files

Submitting to PRONOM

Talking to the PRONOM team or asking for help

● Submissions range from ‘This file won’t identify and I don’t know what it is,’ through to ‘Here is everything there is to know about this format, including signature and samples!’
● Ideally we’d like format name & samples as a minimum (but don’t let that put you off!)
● We can provide non-disclosure agreements if required by your organisation

● Channels:
○ The PRONOM Mailbox - [email protected] (~10MB limit)
○ PRONOM submission form - https://www.nationalarchives.gov.uk/contact-us/submit-information-for-pronom/ (no attachments)
○ PRONOM Google Group - https://groups.google.com/forum/#!forum/pronom (public!)

Case study: Developing a FAT 12 Signature

FAT 12

Agenda

Working with a FAT12 sample
● Using a hex editor
● Identifying components of your ‘FAT12’ file
● Turning that into a signature
● Testing your signature
● Packaging it up for PRONOM
● Developing signatures by format specifications

FAT12 Case Study

Your Files

Working with a FAT12 sample

You have a collection of files and you know roughly what they are. This is good provenance. Even before we begin to look up specifications, we can take a look at the files one by one.

What do you see?

Your Files

Working with a FAT12 sample

A sample file:

https://archive.org/details/PTS-DOS6.42German

https://archive.org/download/PTS-DOS6.42German/disk1.img

Using a Hex Editor

Open the sample in binvis.io

Using a Hex Editor

Each byte is displayed as a hexadecimal number

Using a Hex Editor

Scroll to the top left corner (address 00000000).

You can see some helpful text showing on the right . . . (PTS DOS, DISK_1, FAT12)
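What a hex editor shows can be approximated in a few lines of Python: an address column, each byte as a two-digit hexadecimal number, and the printable ASCII characters on the right. This is a minimal sketch; `hex_dump` is a made-up helper, and the sample bytes are the first row of disk1.img as listed elsewhere in this deck.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Format bytes as address, hex values and printable ASCII, one row per line."""
    lines = []
    for offset in range(0, len(data), width):
        row = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in row)
        # Non-printable bytes are shown as '.', as most hex editors do.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in row)
        lines.append(f"{offset:08X}  {hex_part:<{width * 3}} {text}")
    return "\n".join(lines)


sample = bytes.fromhex("EB 3D 90 50 54 53 24 44 4F 53 20 00 02 01 01 00")
print(hex_dump(sample))

# With the downloaded sample on disk, the same view of the first 64 bytes:
# with open("disk1.img", "rb") as handle:
#     print(hex_dump(handle.read(64)))
```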

Using a Hex Editor

Open the sample in hexed.it

Using a Hex Editor

We can see patterns by looking at more files

Identifying Components

We can see patterns by looking at more files

There is some consistent data visible across multiple samples. Each sample starts with the hexadecimal number ‘EB’, and the 3rd number is ‘90’.

disk1.img
EB 3D 90 50 54 53 24 44 4F 53 20 00 02 01 01 00
02 E0 00 60 09 F9 07 00 0F 00 02 00 00 00 00 00
00 00 00 00 00 00 29 43 D8 C0 7F 44 49 53 4B 5F
31 00 00 00 00 00 46 41 54 31 32 20 20 20 00 FA

drdosboot.img
EB 3C 90 44 52 44 4F 53 37 2E 30 00 02 01 01 00
02 E0 00 40 0B F0 09 00 12 00 02 00 00 00 00 00
00 00 00 00 00 00 29 38 25 A8 6E 4E 4F 20 4E 41
4D 45 20 20 20 20 46 41 54 31 32 20 20 20 FA FC

Identifying Components

We can see patterns by looking at more files

The ASCII characters ‘FAT12’ appear in the fourth row. In hex that is ‘46 41 54 31 32’. On the ms-dos_5 sample, it says ‘FAT16’.

pts-dos_2000_deutsch_disk1.img
EB 3D 90 50 54 53 44 4F 53 36 30 00 02 01 01 00
02 E0 00 40 0B F0 09 00 12 00 02 00 00 00 00 00
00 00 00 00 00 00 29 35 18 8E 53 00 00 00 00 00
00 00 00 00 00 00 46 41 54 31 32 20 20 20 00 FA

ms-dos_5.img
EB 3C 90 27 7D 7D 33 32 49 48 43 00 02 01 01 00
02 E0 00 40 0B F0 09 00 12 00 02 00 00 00 00 00
00 00 00 00 00 00 29 62 24 F5 15 4D 53 2D 44 4F
53 5F 35 20 20 20 46 41 54 31 36 20 20 20 FA 33

Identifying Components

A brief note about Hexadecimal and ASCII

ASCII is a character encoding. ASCII directly maps numbers to text.

For example, the hexadecimal numbers: 46 41 54 31 32
converted to ASCII: F A T 1 2

http://www.asciitable.com/
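The mapping between hexadecimal numbers and ASCII text can be tried out directly in Python. The two helper names below are invented for this sketch.

```python
def hex_to_ascii(hex_text: str) -> str:
    """Decode space-separated hex numbers into ASCII text."""
    return bytes.fromhex(hex_text).decode("ascii")


def ascii_to_hex(text: str) -> str:
    """Encode ASCII text as space-separated two-digit hex numbers."""
    return " ".join(f"{b:02X}" for b in text.encode("ascii"))


print(hex_to_ascii("46 41 54 31 32"))
print(ascii_to_hex("FAT16"))
```

The second call is handy when building a signature: type the text seen in the right-hand column of the hex editor and get back the byte values to put in the signature.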

Writing a File Format Signature

Starting at the beginning . . .

A signature attempts to define a series of ‘magic bytes’: some pattern that will always show up in a specific part of a file, and that can be used as a fingerprint to identify that a specific file conforms to a particular format.

Resources

PRONOM terms, basic syntax and data model

BOF = Beginning of File
EOF = End of File
Var = Variable (anywhere in the file)
Offset/Max Offset = Exact or positional range in which a signature starts

Wildcards:
?? = single wildcard byte, e.g. AB??C3
* = 0-many wildcard bytes, e.g. BC*D4
{n} = specific number of wildcard bytes, e.g. A2{5}F3
{n-n} = range of wildcard bytes, e.g. 4D{0-12}E4

Byte range:
[hh:hh] = single byte value between range, e.g. [00:FA]

Either/or:
(hhhh|hhhh|hh) = either/any of these byte values, e.g. (0D|0A|0D0A)

Not:
[!hh] = anything except this byte value, e.g. ABCD[!01]E1

https://www.nationalarchives.gov.uk/aboutapps/fileformat/pdf/automatic_format_identification.pdf

Writing a File Format Signature

Starting at the beginning . . .

A first signature could simply be:

BOF: EB {1} 90

● ‘BOF’ means ‘beginning of file’
● ‘EB’ is the hexadecimal number we expect to see as the very first byte in the file.
● ‘{1}’ means one wildcard number (next number can be anything)
● ‘90’ is the 3rd number we expect to see.

Writing a File Format Signature

Comparing to an existing signature

There is already a FAT Disk Image signature (fmt/1087): http://www.nationalarchives.gov.uk/PRONOM/fmt/1087

BOF: EB{1}90{10}(01|02|04|08|10|20|40|80)

This matches FAT12, FAT16, and FAT32 Disk Images. We will need to make our new signature more specific for it to be useful for matching FAT12, without also matching other FAT Disk Images.

Writing a File Format Signature

Reviewing our earlier work:

pts-dos_2000_deutsch_disk1.img
EB 3D 90 50 54 53 44 4F 53 36 30 00 02 01 01 00
02 E0 00 40 0B F0 09 00 12 00 02 00 00 00 00 00
00 00 00 00 00 00 29 35 18 8E 53 00 00 00 00 00
00 00 00 00 00 00 46 41 54 31 32 20 20 20 00 FA

The ASCII characters ‘FAT12’ appear (in hex as ‘46 41 54 31 32’) starting at the 55th byte, i.e. offset 54 when counting from zero.

The fmt/1087 signature defines the first 14 numbers. There are 40 numbers between the end of that signature and the beginning of the ‘FAT12’ text:

BOF:EB{1}90{10}(01|02|04|08|10|20|40|80){40}4641543132
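The draft signature can be sanity-checked by hand before loading it into a tool. The sketch below tests the same byte positions the signature describes, counting from zero, against two of the samples shown on the slides. The function and constant names are invented for this example, and it stands in for DROID only for the purpose of checking the arithmetic.

```python
# Byte values the draft signature allows at position 13 (counting from zero).
ALLOWED_AT_13 = {0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80}


def matches_draft_fat12(data: bytes) -> bool:
    """Check the positions described by EB{1}90{10}(01|...|80){40}4641543132."""
    return (
        len(data) >= 59
        and data[0] == 0xEB                               # EB
        and data[2] == 0x90                               # {1} then 90
        and data[13] in ALLOWED_AT_13                     # {10} then one of eight values
        and data[54:59] == bytes.fromhex("4641543132")    # {40} then 'FAT12'
    )


# First 64 bytes of two samples, copied from the slides.
PTS_DOS = bytes.fromhex(
    "EB 3D 90 50 54 53 44 4F 53 36 30 00 02 01 01 00 "
    "02 E0 00 40 0B F0 09 00 12 00 02 00 00 00 00 00 "
    "00 00 00 00 00 00 29 35 18 8E 53 00 00 00 00 00 "
    "00 00 00 00 00 00 46 41 54 31 32 20 20 20 00 FA"
)
MS_DOS_5 = bytes.fromhex(
    "EB 3C 90 27 7D 7D 33 32 49 48 43 00 02 01 01 00 "
    "02 E0 00 40 0B F0 09 00 12 00 02 00 00 00 00 00 "
    "00 00 00 00 00 00 29 62 24 F5 15 4D 53 2D 44 4F "
    "53 5F 35 20 20 20 46 41 54 31 36 20 20 20 FA 33"
)

print(matches_draft_fat12(PTS_DOS), matches_draft_fat12(MS_DOS_5))
```

The FAT16 sample is rejected only by the final comparison, which is exactly the extra specificity the longer signature was meant to add over fmt/1087.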

PRONOM Signature Development Utility

[Screenshot: the draft FAT12 signature entered into the utility, showing the values EB, 90, 01 02 04 08 10 20 40 80 and 4641543132, with extension img]

http://www.nationalarchives.gov.uk/pronom/sigdev/index.htm

Testing with Tools

DROID
Download and install: http://www.nationalarchives.gov.uk/information-management/manage-information/preserving-digital-records/droid/

Download and install:

Break! Please be back at 12:00 pm

Advanced signature development

Precision and Sensitivity in Signatures

When writing signatures, it is good practice to minimize False Positives and False Negatives

[Diagram: Format 1, Format 2, Format 1 Signature]

Precision & Sensitivity

False Negatives in Format Identification

Precise and insensitive signatures do not identify all files with the format

[Diagram: Format 1, Format 2, Format 1 Signature; label: False Negative]

False Positives in Format Identification

Imprecise and sensitive signatures identify the wrong files with a format

[Diagram: Format 1, Format 2, Format 1 Signature; label: False Positives]

Ideal Signatures

Precise and sensitive signatures identify a file format correctly

[Diagram: Format 1, Format 2, Format 1 Signature]
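Precision and sensitivity can be put into numbers once a signature has been run over a labelled set of files. The sketch below uses the standard definitions; the counts are invented purely for illustration and do not come from any real test run.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Share of files the signature matched that really are the format."""
    return true_positives / (true_positives + false_positives)


def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Share of files of the format that the signature actually matched."""
    return true_positives / (true_positives + false_negatives)


# Hypothetical run: 100 Format 1 files, of which the signature finds 90,
# plus 5 Format 2 files that it matches by mistake.
tp, fp, fn = 90, 5, 10
print(round(precision(tp, fp), 3), round(sensitivity(tp, fn), 3))
```

An ideal signature scores 1.0 on both: no false positives and no false negatives.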

Strategies for Improvement

Add multiple signatures

Strategies for Improvement

TIFF’s (fmt/353) two signatures

Strategies for Improvement

Add more flexibility to your signature

Signature Options / Comment

?? Match one byte.

* Match zero or more bytes.

{j} Match exactly j bytes.

{j-k} Match from j up to k bytes.

{j-*} Match at least j bytes.

[a:b] Match one byte between a and b inclusive.

(ab|cde) Match the byte sequence ab or the sequence cde.

Offset, MaxOffset A pattern may have an Offset or MaxOffset. One or both may be provided.

Offset (no MaxOffset)

MaxOffset (no Offset)

Adam Farquhar, http://openpreservation.org/blog/2010/10/27/closer-look-pronom-signatures/

Strategies for Improvement

MP3’s (fmt/134) very complicated signature

Testing specificity of signatures

Skeleton Test Suite

a tool for the automated generation of digital objects based on the digital signatures documented in the PRONOM database

You probably don’t have examples of all 2000 PRONOM formats, so Ross Spencer cooked up some samples.

https://github.com/exponential-decay/pronom-archive-and-skeleton-test-suite

Testing specificity of signatures

Skeleton Test Suite

Download the generated skeleton files for the latest version of PRONOM from the Zenodo dataset link

Run your signatures against this test suite. It should not identify any of the skeleton files.
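In practice this check is run with DROID or a similar tool, but the idea can be sketched in Python: walk a folder of skeleton files and list any that a draft signature matches. Everything named here is an assumption for illustration: the helper `false_positives`, the folder name in the usage comment, and the draft FAT12 signature rewritten by hand as a bytes regular expression.

```python
import re
from pathlib import Path


def false_positives(pattern: "re.Pattern", skeleton_dir: str) -> list:
    """Return names of skeleton files that the draft pattern matches at BOF."""
    hits = []
    for path in sorted(Path(skeleton_dir).rglob("*")):
        if path.is_file() and pattern.match(path.read_bytes()):
            hits.append(path.name)
    return hits


# Hand-written regex equivalent of EB{1}90{10}(01|...|80){40}4641543132.
draft = re.compile(
    rb"\xEB.\x90.{10}[\x01\x02\x04\x08\x10\x20\x40\x80].{40}FAT12",
    re.DOTALL,
)

# Usage, assuming the skeleton files were unpacked into ./skeleton-suite:
# print(false_positives(draft, "skeleton-suite"))
```

An empty list is the result to aim for; any file name returned points at another format's skeleton that the draft signature would wrongly identify.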

Container Signatures

How are they different from typical signatures

Some file formats use other formats as containers. In these cases, the format signature first needs to identify the container, and then identify the format using that container.

The two most common container formats are
1. OLE2 - Used primarily by Microsoft in the 90s and 00s
2. ZIP - Used by Microsoft and many others in the 00s to the present day

A container signature requires three parts
1. File Format Signature - A PRONOM signature with or without an internal signature
2. Container Signature - Byte sequences and/or file paths that can be found within the container
3. File Format Container Mapping - a relationship between the File Format Signature and the Container Signature

Container Example: Office Open XML

Office Open XML formats use ZIP as a container for XML

● Find a DOCX, PPTX, XLSX, or similar. Make a copy so that you don’t destroy the original.
● Change the extension to zip. Not strictly necessary, but this helps the OS default to unzipping programs.
● Unzip and explore. Depending on the file, you can find attachments, embedded office docs, and other data.

Case study: Developing a Container Signature

Winamp Skins

Winamp Skins

Classic and Modern Skins

Winamp skins allowed users to customize the appearance and functionality of the Winamp media player. They use either WSZ or WAL as a file extension, but they are actually ZIP files.

“A Winamp skin is composed of 45 files. …

This started getting messy, because not all of the skin developers were creating subfolders when compressing their skins into a ZIP file for distribution. …

However, a new problem then surfaced: The .ZIP file format is a very widely used compression scheme and Winamp was just one of the many dozens of programs available to utilize it. We wanted users to be able to double-click the Skin ZIP file and have Winamp automatically install and load the skin. How do we do that without associating Winamp as the default program for handling skins? Answer: Rename the file extension.” http://wiki.winamp.com/wiki/WSZ_Files#WSZ_History

Creating the Format Signature

XML for Modern Winamp Standard Signature

WAL

Creating the Container Signature

A container signature can use both internal byte sequences and file paths to identify the format. File paths are useful for identifying required files that must be included in the file.

Modern Winamp skins require a file named ‘Skin.xml’
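The file-path part of a container signature boils down to two questions: is the file a ZIP, and does it contain the required path? The sketch below answers both with Python's `zipfile` module. The helper name is invented, the exact-case match on ‘Skin.xml’ is an assumption, and the stand-in skin is built in memory with placeholder content so that no real sample is needed.

```python
import io
import zipfile


def has_required_path(container: bytes, required: str = "Skin.xml") -> bool:
    """True if the data is a ZIP archive with an entry named `required`."""
    if not zipfile.is_zipfile(io.BytesIO(container)):
        return False
    with zipfile.ZipFile(io.BytesIO(container)) as archive:
        return required in archive.namelist()


# Build a tiny stand-in skin in memory (placeholder content, not a real skin).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("Skin.xml", "<placeholder/>")

print(has_required_path(buffer.getvalue()))
```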

Creating the Container Signature

XML for Modern Winamp Container Signature based on File Name

Modern Winamp Skin Skin.xml

Testing the Container Signature

Ross Spencer’s walkthrough of container signatures: http://openpreservation.org/blog/2016/01/07/droid-container-signature-files-what-they-are-and-how-to-create-them-a-template-and-an-example-or-few/

Case study: Developing an HFS Signature

Hierarchical File System (1985-1998)

Thank you

Contact Info:
