Matroska and FFV1: One File Format for Film and Video Archiving?

Reto Kromer

OPEN FORUM | JFP 96 | 04.2017

Reto Kromer was formerly Head of preservation at the Cinémathèque suisse, and lectured at the University of Lausanne and the Academy of Fine Arts, Vienna. He now has his own preservation company and lectures at the Bern Institute of Applied Sciences.

To have a single file format suitable for the preservation of both film and video has been the dream of many archives, especially smaller ones, since the beginning of digital moving images. Today, we can see on the horizon something that will provide exactly this capability, in a free and open-source environment. I made three presentations last year on this topic:

• at the symposium No Time to Wait: Standardising FFV1 and Matroska for Preservation, Berlin, 18–20 July; 1
• with Kieran O’Leary from the IFI Irish Film Archive, at The Reel Thing technical symposium, Hollywood, 18–20 August; 2
• at a meeting of the Memoriav video specialists, Bern, 22 November. 3

The following article is an updated version of these presentations. It presents the situation as at the beginning of 2017, and my goal is to discuss the evident potential of this new system, as well as to address the still-unresolved aspects of the Matroska container and the FFV1 video codec.

DEFINITIONS

I describe film as having single-image-based content, mainly represented in the RGB or R′G′B′ 4 colour space at 4:4:4 chroma sampling, and, at present, usually stored in a folder: for example, TIFF files in a folder, DPX files in an AXF container, and JPEG 2000 files in an MXF container.

I call video stream-based content, mainly in the Y′CBCR 5 colour space at 4:2:2 chroma sub-sampling, currently often stored uncompressed in either a MOV (QuickTime), an AVI, or an MP4 container.

In practice, the choice of container does not matter, because only the file header (and possibly the file footer) are different in different containers, while the stream is bit-by-bit identical for the full image content. The file can be trans-muxed (i.e., the file is de-muxed and then re-muxed) very quickly, because transcoding (i.e., extremely time-consuming decoding and re-encoding) of the file’s content is not required. Re-wrapping can be easily done if needed, e.g., during a data migration, without any additional cost. Therefore, the passionate discussions about the best container choice – MP4, AVI or MOV – should now be relegated to the past. The important factor for an archive is that Y′CBCR 4:2:2 content is often used by the video and broadcast community to achieve the best quality of high-level professional production and post-production. An archive should therefore be able to provide historic content in a format that commercial clients can use, perhaps without any transcoding.

STANDARDISATION

Standardisation is fundamental to every technical field. Different bodies have recently standardised, or are currently standardising, file formats that closely relate to the audiovisual preservation field:

The Society of Motion Picture and Television Engineers (SMPTE) has standardised the CineForm or VC-5 and the ProRes video codecs. ProRes has been one highly relevant de facto standard in post-production, but Apple will soon stop supporting QuickTime on Windows platforms – and probably on macOS in the not-too-distant future. While the popularity of GoPro’s CineForm/VC-5 seems to be increasing at present, sadly, the published standard does not contain all the relevant information needed to implement the codec. 6

A group of scholars, led by the University of Basel in Switzerland, is preparing a proposal for an archival version of the popular TIFF format, which they plan to submit to the International Organization for Standardization (ISO) for approval and inclusion. The format was initially called TIFF/A, like PDF/A, but Adobe, who claim some rights in TIFF, would not agree to this; the new format is therefore called TI/A for Tagged Image for Archival. 7

The standardisation of EBML, Matroska (MKV), FFV1, and FLAC is currently being undertaken by the IETF’s CELLAR group (see below). This is the main topic of my paper.

WHAT DO ALL THESE ACRONYMS MEAN?

The Internet Engineering Task Force (IETF) 8 is the body that governs the internet from the technical point of view, in particular, the TCP/IP Internet protocol suite. It develops and promotes voluntary internet standards, the so-called Requests for Comments (RFCs). It is an open standards organisation, with no formal membership or membership requirements. All participants and managers are volunteers, though their work is usually funded by their employers or by sponsors.

One of IETF’s numerous working groups is called Codec Encoding for LossLess Archiving and Realtime transmission (CELLAR). This group is attempting to standardise a coherent set of open, transparent, self-descriptive, and lossless formats, 9 an important mission for the open-source community to undertake for the archival world. CELLAR is standardising four different elements.

The first element is the Extensible Binary Meta-Language (EBML). 10 You may think of it as a binary equivalent of XML: it allows the encoding of bitstreams, rather than of bytes such as the Unicode characters of XML.

The second element of CELLAR’s standardisation work is Matroska, 11 a container or wrapper with the file extension “.mkv”. It can contain, among many other elements and possible formats, an image stream encoded by the FF Video Codec 1 (FFV1), 12 and one or more audio streams encoded by the Free Lossless Audio Codec (FLAC). 13

1. <https://mediaarea.net/MediaConch/2016/07/26/No-Time-To-Wait-Preservation-FFV1--Symposium/>.
2. <http://www.the-reel-thing.org/program-abstracts-4/>.
3. <…> (in German).
4. The prime (′) indicates that the value is gamma-corrected, i.e., adapted to the human eye and not to physical reality. It allows for the same number of steps on the dark side as on the light side of the so-called medium grey. Note that this is not an apostrophe.
5. Y′CBCR is sometimes written YCbCr, and, often incorrectly, YUV, which is actually the colour space used for PAL video and not for digital video.
6. <…>.
7. <…>.
8. <…>.
9. <…>.
10. <…>.
11. <…details?project_id=15>, and <…>.
12. <…public/projects/project/details?project_id=278>.
13. <…>.
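To make the “binary equivalent of XML” idea a little more concrete, here is a minimal Python sketch of how the header of one EBML element can be read. It assumes the variable-length integer layout described in the EBML specification (the number of leading zero bits in the first byte gives the length); the function names are hypothetical, the example bytes are hand-made rather than taken from a real file, and unknown-size elements are not handled.

```python
def read_vint(data, pos=0, keep_marker=False):
    """Decode one EBML variable-length integer starting at data[pos].

    Returns (value, length_in_bytes). Element IDs keep their marker bit,
    data sizes drop it.
    """
    first = data[pos]
    if first == 0:
        raise ValueError("VINTs longer than 8 bytes are not handled in this sketch")
    # Length = number of leading zero bits in the first byte, plus one.
    length = 8 - first.bit_length() + 1
    if pos + length > len(data):
        raise ValueError("truncated VINT")
    value = int.from_bytes(data[pos:pos + length], "big")
    if not keep_marker:
        # Clear the marker bit (and the zero bits above it) to get the payload.
        value &= (1 << (7 * length)) - 1
    return value, length


def read_element_header(data, pos=0):
    """Return (element_id, data_size, header_length) for the element at pos."""
    element_id, id_len = read_vint(data, pos, keep_marker=True)
    size, size_len = read_vint(data, pos + id_len)
    return element_id, size, id_len + size_len


# 0x1A45DFA3 is the ID of the EBML header element that opens a Matroska file;
# the size byte 0x9F (31 bytes of payload) is an illustrative value only.
sample = bytes.fromhex("1A45DFA39F")
element_id, size, header_len = read_element_header(sample)
print(hex(element_id), size, header_len)  # 0x1a45dfa3 31 5
```

Like an XML tag, each element announces what it is and how much data follows, so a reader can skip elements it does not understand; unlike XML, nothing has to be parsed as text.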

Matroska is actually a fork of an unfinished container called Multimedia Container Format (MCF). Google’s WebM container is technically a fork – mathematically a subset – of Matroska. 14

The third element is FFV1, a simple and efficient lossless intra-frame-only video codec. With it, content can be compressed losslessly to roughly 40% of the uncompressed storage space. This is a similar compression rate to that achieved by the JPEG 2000 video codec, but FFV1’s compression time is less than that of JPEG 2000 because of its much simpler compression algorithm. This is true for both the stream-based Y′CBCR 4:2:2 content, as used in the video and broadcast world, and the single-image-based linear or logarithmic RGB or R′G′B′ 4:4:4 content, as used by the cinema industry. For RGB or R′G′B′ at 4:4:4 with 16 bit per colour channel, the compression rate could be improved a little. Currently the implementation of Bayer-filter-based formats is just an idea; nobody is actively working on it.

The fourth element is FLAC, an audio codec. While the Broadcast WAVE Format (BWF) is a good archival choice for sound, FLAC provides lossless compression as well, though this is less relevant for sound than for image because of their very different sizes. During the first year of CELLAR’s activity, nothing has been done on FLAC standardisation, but, as Google Chrome has just (January 2017) announced that it will support FLAC, I imagine this will become a priority during the year.

When standardised by the IETF, this suite of objects provides the key to a non-proprietary, stable, functional, and trans-generational deep-storage schema for data that can exist on fixed media (tape, HDD or SSD), or on servers, or in a complex and multi-level environment such as that known as “cloud storage”. It allows for the deployment of archive data across many storage environments, and through generations of migration, with a high degree of confidence and interoperability.

WHAT IS INSIDE MY DPX?

One of the current, so-called “raw” formats for scanner output is Digital Picture Exchange, or DPX. Kieran O’Leary offers an in-depth discussion of many aspects of the current situation in an outstanding blog: Introduction to FFV1 and Matroska for Film Scans. 15 I would mention here only the real advantage of storing CRC-32 checksums for every slice of every frame, which FFV1 provides over TIFF or DPX, which do not contain any embedded fixity information. This is a key factor that allows institutions with only a small infrastructure to achieve professional preservation of audiovisual files. O’Leary also notes that FFV1 does not encode or retrieve (decode) all DPX files correctly at present. This is partly related to the fact that DPX can code the RGB information it holds in many different ways – which means the archive must really know what is inside the DPX files.

DPX is a strange construct, an umbrella that groups together many different encodings, which derives from the Cineon format developed by Kodak for digital intermediate workflow in the early 1990s. At that time, films were shot on analogue film and screened in the same format. The format was designed for an interim step, i.e., for post-production purposes, not for conservation. Therefore a .dpx file may contain different encodings of RGB-based information:

• log neg encoding. Examples: Cineon Printing Density (CPD/DPX), ARRI log C.
• log RGB encoding or quasi-log encoding. Examples: FilmStream (log 60), SI-log (Silicon Imaging, log 90), ARRI log F, S-log (Sony), Panalog (Panavision), REDlogFilm.
• gamma encoding or power function encoding. Examples: sRGB, CineGamma, Film Rec (Panasonic), hyper-gamma.
• scene-linear encoding. Example: ACES.

As O’Leary says, at present, it is very hard – maybe even impossible – for an archivist to know exactly what is inside the different files, from different DPX sources, held by his/her archive.

14. <…>.
15. <https://kieranjol.wordpress.com/2016/10/07/introduction-to-ffv1-and-matroska-for-film-scans/>.
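As a first, very partial answer to the question of what is inside a DPX file, the sketch below reads the transfer characteristic that the file itself declares. It is a minimal Python illustration under stated assumptions: the byte offsets (800 for the descriptor, 801 for the transfer characteristic, 803 for the bit depth of the first image element) and the code table follow my reading of SMPTE 268M, and the function name and the file path in the usage note are hypothetical. The declared value can be missing or wrong, which is exactly the difficulty described above.

```python
# Transfer characteristic codes for the first image element (SMPTE 268M, as I read it).
TRANSFER_CHARACTERISTICS = {
    0: "user-defined",
    1: "printing density",
    2: "linear",
    3: "logarithmic",
    4: "unspecified video",
    5: "SMPTE 274M",
    6: "ITU-R BT.709",
    7: "ITU-R BT.601 (625 lines)",
    8: "ITU-R BT.601 (525 lines)",
    9: "composite video (NTSC)",
    10: "composite video (PAL)",
    11: "Z (depth) linear",
    12: "Z (depth) homogeneous",
}


def describe_dpx_header(header):
    """Summarise what the DPX header claims about the first image element."""
    if len(header) < 804:
        raise ValueError("need at least the first 804 bytes of the file")
    magic = header[:4]
    if magic == b"SDPX":
        byte_order = "big-endian"
    elif magic == b"XPDS":
        byte_order = "little-endian"
    else:
        raise ValueError("not a DPX file (bad magic number)")
    transfer = header[801]
    return {
        "byte_order": byte_order,
        "descriptor": header[800],  # 50 means RGB
        "transfer": TRANSFER_CHARACTERISTICS.get(transfer, "unknown (%d)" % transfer),
        "bit_depth": header[803],
    }
```

For a real scan one would pass the first bytes of the file, for example `describe_dpx_header(open("scan_0001.dpx", "rb").read(2048))`. A result of "user-defined" or "logarithmic" still does not say which of the log encodings listed above was used, so the information has to be documented separately by the archive.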


Production and post-production processes don’t give high priority to technical metadata, perhaps because it is not particularly relevant if the colourist has to tweak the controls a little during the creative phase. It is entirely the opposite for the archivist, of course: it is crucial to preserve the document as it is, without any additional creative work.

ARCHIVE MASTER AND MEZZANINE

The Matroska container and the FFV1 video codec are good choices for single-image-based content when making archive masters. Often, a resolution of 2K, or sometimes 4K, an RGB colour space, the 4:4:4 chroma sampling, and a bit-depth of 16 bit per colour channel are canonical choices. For stream-based content, the Matroska container and the FFV1 video codec are also good choices for the archive master. A resolution of HD (with pillarboxing or letterboxing if required), in general, the Y′CBCR colour space, the 4:2:2 subsampling, and a bit-depth of 10 bit are usually considered best practice.

The Matroska container can also be used for audio, with FLAC as the audio codec. Good parameters are a sample rate of 96 kHz for preservation and mezzanine, and 48 kHz for access, 16 with quantisation of 24 bit for preservation and 16 bit for access. The advantage is having one container format for both single-image-based and stream-based content.

Unfortunately, it’s too early to recommend the same format for both the archive master and the mezzanine, because, though this may change in the near future, at present, FFV1 is natively supported by only a few applications.

ACCESS FORMAT

The Matroska container is currently not popular enough for it to be recommended for access. While Matroska’s subset WebM is being used more and more in modern browsers, it is tied to the VP8 or VP9 video codecs. In practice, however, MP4 is currently the better choice. An HD resolution (with pillarboxing or letterboxing if necessary) can be used for screening on a television or computer monitor. The “natural” video codec would be H.264, encoding Y′CBCR with a 4:2:0 chroma subsampling for the image. 17 Unfortunately, AAC (Advanced Audio Coding) is in practice the only audio codec that can be relied upon in the MP4 container. We recommend a sample rate of 48 kHz and a quantisation of 16 bit.

OUTLOOK

Though some issues remain unresolved, Matroska, with FFV1 (and FLAC), is on the way to becoming a solid alternative – especially for small archives or archives with extremely limited resources – for preservation masters and mezzanine files. It is still too early to recommend a change for access.

Both SMPTE and the Library of Congress are evaluating data implementation to accomplish the same goals. It is important for the entire community of archives, from the largest state institutions and media companies to the most modest local repositories, to understand the economic and technical value that collective, open-source solutions can offer. We are designing and implementing systems that will retain data over timespans substantially longer than that of the life of motion picture film.

The author wishes to acknowledge the help provided by Kieran O’Leary and Adrian Wood.

16. I don’t believe the so-called “CD quality” at 44.1 kHz to be a good choice. Its storage economy is minimal, while its sound quality is significantly diminished.
17. While the H.264 codec’s definition allows uncompressed coding, as far as we know, these files can only be handled by FFmpeg-based players. We therefore suggest some slight compression.
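The archive-master and access parameters recommended in the sections above could be expressed, for instance, as FFmpeg command lines. The Python sketch below only builds the argument lists. It assumes FFmpeg is the tool used and that its build includes the libx264 encoder; the function names, the file names, and the slice count of 16 are illustrative choices of mine, not recommendations from the article.

```python
def archive_master_cmd(src, dst, pix_fmt="yuv422p10le"):
    """FFV1 video and FLAC audio in Matroska, for stream-based content.

    The default pixel format is 10-bit Y'CbCr 4:2:2; pass another one
    for single-image-based RGB material.
    """
    return [
        "ffmpeg", "-i", src,
        "-c:v", "ffv1",
        "-level", "3",      # FFV1 version 3
        "-g", "1",          # intra-frame only: every frame is a key frame
        "-slices", "16",    # illustrative slice count
        "-slicecrc", "1",   # store a CRC-32 for each slice of each frame
        "-pix_fmt", pix_fmt,
        "-c:a", "flac",
        dst,
    ]


def access_copy_cmd(src, dst):
    """H.264 at 4:2:0 with AAC audio at 48 kHz in MP4, for access."""
    return [
        "ffmpeg", "-i", src,
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",
        "-c:a", "aac",
        "-ar", "48000",
        dst,
    ]


print(" ".join(archive_master_cmd("tape_0001.mov", "tape_0001.mkv")))
print(" ".join(access_copy_cmd("tape_0001.mkv", "tape_0001.mp4")))
```

Either list can be handed to `subprocess.run(cmd, check=True)` on a machine where FFmpeg is installed. Keeping the commands as data makes it easy to log exactly which parameters produced each file, which is useful provenance information for an archive.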

en

Many archives have long dreamed of a single format that would allow optimal preservation of films whatever the original carrier, film stock or video. Today, the outlines of such a solution are taking shape, and in a free and open context.

Various bodies have recently standardised file formats for audiovisual content, or are working on doing so. One of them is working on the Matroska (MKV) container and the FFV1 codec, which can losslessly compress both the Y′CBCR 4:2:2 image of television and the R′G′B′ or RGB 4:4:4 image of cinema.

Kieran O’Leary offers an overview of the current situation, highlighting certain features of particular interest to audiovisual archives, notably the checksums for each frame embedded in the stream, which make it easy to verify integrity. He also points out that metadata are not always stored correctly in DPX, which is a source format for many scanners.

Not all problems have been solved, but Matroska and FFV1 are on the way to establishing themselves as a solid alternative for producing archive masters, particularly for small archives with limited resources. On the other hand, it seems premature to recommend this choice for the mezzanine as well, since at present FFV1 is natively supported by only a small number of software applications.