Private Draft Document. Not for Publication

Interoperable Master Format (IMF) Specification

December 22, 2009 Version 0.6d DRAFT

Copyright  2009 by Entertainment Technology Center 509 West 29th Street Los Angeles, California 90007 United States of America 213-743-1600 www.etcenter.org

1 Private Draft Document. Not for Publication

NOTICE

The Entertainment Technology Center (ETC) is the author and creator of this specification for the purpose of copyright and other laws in all countries throughout the world. The ETC copyright notice must be included in all reproductions, whether in whole or in part, and may not be deleted or attributed to others. ETC hereby grants to its members and their suppliers a limited license to reproduce this specification for their own use, provided it is not sold. Others should obtain permission to reproduce this specification from Entertainment Technology Center; Attn: CEO and Executive Director; 509 West 29th Street, Los Angeles, California 90007; USA; (213) 743-1600 (voice); (213) 743-1803 facsimile.

This document is a specification developed and adopted by ETC. This document may be revised by ETC. It is intended solely as a guide for companies interested in developing products, which can be compatible with other products, developed using this document. Each ETC Member Company shall decide independently the extent to which it will utilize, or require adherence to, these specifications. ETC shall not be liable for any exemplary, incidental, proximate or consequential damages or expenses arising from the use of this document. This document defines only one approach to compatibility, and other approaches may be available to the industry.

This document is an authorized and approved publication of ETC. Only ETC has the right and authority to revise or change the material contained in this document, and any revisions by any party other than ETC are unauthorized and prohibited.

Compliance with this document may require use of one or more features covered by proprietary rights (such as features which are the subject of a patent, patent application, copyright, mask work right or trade secret right). By publication of this document, no position is taken by ETC with respect to the validity or infringement of any patent or other proprietary right. ETC hereby expressly disclaims any liability for infringement of intellectual property rights of others by virtue of the use of this document. ETC has not and does not investigate any notices or allegations of infringement prompted by publication of any ETC document, nor does ETC undertake a duty to advise users or potential users of ETC documents of such notices or allegations. ETC hereby expressly advises all users or potential users of this document to investigate and analyze any potential infringement situation, seek the advice of intellectual property counsel, and, if indicated, obtain a license under any applicable intellectual property right or take the necessary steps to avoid infringement of any intellectual property right. ETC expressly disclaims any intent to promote infringement of any intellectual property right by virtue of the evolution, adoption, or publication of this document.

2 Private Draft Document. Not for Publication

Revision History

Date Revision Description Contributors 02/9/09 v0.1 Initial draft Howard Lukk 02/13/09 V0.2 1st Rev, Name Change, revise framework chapter Howard Lukk order and headings. Spell check and general clean up 3/13/09 V0.3 Chapter 10 change to Asset Management, Chapter 5 IMF Tech Committee, H. Lukk change to requirements instead of standard, Document clean up, Chapter 1 review and update, new Annex 6/16/09 V0.41 Update Chapter 1 along with formatting, Complete IMF Tech Committee, H. Lukk re-write of Chapter 2, re label chapters, re structure framework of document 8/4/09 V0.42 Update Figures, Add Table of Tables, Update IMF Tech Committee, H. Lukk Chapter 2., Remove “Clip” from document, Update Chapter 7 to reflect Chapter 2., Updated Glossary 11/6/09 V0.5 Update Chapter 2 and 7. Minor updates throughout IMF Tech Committee, H. Lukk document to remove Dynamic Downmixing and Audio Transitions. Inserted new Chapter 8 for Output Profile List 12/16/09 V0.5d Updates to Chapter 3 (audio draft for essence B. Vessa, A. Chang, H. Lukk and chapter + general notes), 7 (Composition), 8 (Output N. Goodkin Profile List), 9 (Packaging), and the addition of Chapter 10 (Annex), plus spelling and other minor corrections.

12/22/09 V0.6a Changes agreed during the Dec 18 document review, S. Levenson, N. Goodkin recorded by Seth Levenson during the meeting. Composition Play List (CPL) example added as Chapter 12. Moved comments out of the body and into comments. Made all "TBD" text red. Added styles for DCI Normal 1, 2 and 3 corresponding to text under paragraphs DCI Heading 1, 2, and 3. Formatted all tables to have similar headings and fonts. 12/24/09 V0.6b Changed Table Captions to "automatically N. Goodkin generated," thus automatically numbered. Changed Table of Tables to an Inserted Reference Table instead of a table generated by styles. Added box borders to all figures. 12/27/09 V0.6c Moved Image Metadata Data Elements table and text N. Goodkin from 3.2.1.1 to 3.2.3.1.

Moved SL12 text to Chapter 13, Annex - Ideal Implementation

Inserted a reference to the Essence Chapter, according to BV15.

Replaced all references to "Audio Configuration" with "Soundfield Configuration"

Rewrote the Audio section using headings and bullets, and using 'shall' and 'may.'

3 Private Draft Document. Not for Publication

Date Revision Description Contributors Changed all instances of "is required to" to "shall."

Minor grammatical and sentence structure changes.

01/11/10 V0.6d Updated the figure numbers and deleted the dynamic down-mixing section.

Reviewed By

Review Type Reviewed By Date Reviewed

Approval / Sign Off

Name Signature Date

4 Private Draft Document. Not for Publication Table of Contents

1. IMF INTRODUCTION...... 21

1.1. INTRODUCTION...... 21 1.1. INTRODUCTION...... 21

1.2. SCOPE...... 21 1.2. SCOPE...... 21

1.3. SYSTEM OBJECTIVES...... 21 1.3. SYSTEM OBJECTIVES...... 21

1.4. DOCUMENT LANGUAGE...... 22 1.4. DOCUMENT LANGUAGE...... 22

2. SYSTEM OVERVIEW...... 23

1.5. FUNCTIONAL FRAMEWORK ...... 23 1.5. FUNCTIONAL FRAMEWORK ...... 23

1.6. OVERVIEW...... 24 1.6. OVERVIEW...... 24 1.6.1. IMF System Architecture...... 24 1.6.1. IMF SYSTEM ARCHITECTURE...... 24 1.6.2. Major System Key Concepts...... 25 1.6.2. MAJOR SYSTEM KEY CONCEPTS...... 25 1.6.2.1. Business to Business Solution...... 25 1.6.2.1. Business to Business Solution...... 25 1.6.2.2. Digital Source Master (DSM)...... 25 1.6.2.2. Digital Source Master (DSM)...... 25 1.6.2.3. Archive not in Scope...... 26 1.6.2.3. Archive not in Scope...... 26 1.6.2.4. File / Frame-Based System...... 26 1.6.2.4. File / Frame-Based System...... 26 1.6.2.5. Essence and Data Essence...... 26 1.6.2.5. Essence and Data Essence...... 26 1.6.2.6. Metadata...... 26 1.6.2.6. Metadata...... 26 1.6.2.7. Security...... 26 1.6.2.7. Security...... 26 1.6.2.8. Wrapping and Track File Creation...... 26 1.6.2.8. Wrapping and Track File Creation...... 26 1.6.2.9. Composition...... 28 1.6.2.9. Composition...... 28 1.6.2.10. Versions...... 28 1.6.2.10. Versions...... 28 1.6.2.11. Sequence...... 28 1.6.2.11. Sequence...... 28 1.6.2.12. Output Profile List...... 29 1.6.2.12. Output Profile List...... 29 1.6.2.13. Packages...... 29 1.6.2.13. Packages...... 29 1.7. IMF ELEMENTS AND PROCESSES...... 31 1.7. IMF ELEMENTS AND PROCESSES...... 31

5 Private Draft Document. Not for Publication

2. ESSENCE...... 32

2.1. OVERVIEW...... 32 2.1. OVERVIEW...... 32 2.1.1. Introduction...... 32 2.1.1. INTRODUCTION...... 32 2.1.2. Essence System Overview...... 32 2.1.2. ESSENCE SYSTEM OVERVIEW...... 32 2.1.3. Major Essence Concepts...... 32 2.1.3. MAJOR ESSENCE CONCEPTS...... 32 2.1.4. Essence Fundamental Requirements...... 32 2.1.4. ESSENCE FUNDAMENTAL REQUIREMENTS...... 32 2.1.4.1. Common File Formats...... 32 2.1.4.1. Common File Formats...... 32 2.1.4.2. Frame Rates...... 32 2.1.4.2. Frame Rates...... 32 2.1.4.3. Synchronization...... 33 2.1.4.3. Synchronization...... 33 2.2. IMAGE SPECIFICATION...... 34 2.2. IMAGE SPECIFICATION...... 34 2.2.1. Image Structure...... 34 2.2.1. IMAGE STRUCTURE...... 34 2.2.1.1. Image Metadata Required Fields...... 34 2.2.1.1. Image Metadata Required Fields...... 34 2.2.1.2. Image Container, Active Image, Pixel Aspect Ratio...... 35 2.2.1.2. Image Container, Active Image, Pixel Aspect Ratio...... 35 2.2.1.3. Raster Format, Frame Rate...... 36 2.2.1.3. Raster Format, Frame Rate...... 36 2.2.1.4. Depth...... 36 2.2.1.4. Bit Depth...... 36 2.2.1.5. Chroma Subsampling...... 36 2.2.1.5. Chroma Subsampling...... 36 2.2.1.6. Color Space, Color Space Gamuts...... 37 2.2.1.6. Color Space, Color Space Gamuts...... 37 2.2.2. Compression Requirements...... 37 2.2.2. COMPRESSION REQUIREMENTS...... 37 2.2.2.1. Image Compression Requirements...... 37 2.2.2.1. Image Compression Requirements...... 37 2.2.2.2. Intra-Frame Compression...... 37 2.2.2.2. Intra-Frame Compression...... 37 2.2.3. Structural Metadata...... 38 2.2.3. STRUCTURAL METADATA...... 38 2.2.3.1. Image Metadata Required Fields...... 38 2.2.3.1. Image Metadata Required Fields...... 38 2.3. AUDIO SPECIFICATION...... 39 2.3. AUDIO SPECIFICATION...... 39 2.3.1. Audio Structure...... 39 2.3.1. AUDIO STRUCTURE...... 39 2.3.1.1. ...... 39 2.3.1.1. Audio File Format...... 39 2.3.1.2. Sampling Rate...... 39 2.3.1.2. Sampling Rate...... 39 2.3.1.3. Frame Rate/Audio Speed...... 40 2.3.1.3. Frame Rate/Audio Speed...... 40 2.3.1.4. Sped up frame rates (would need metadata to convey this):...... 40

6 Private Draft Document. Not for Publication

2.3.1.4. Sped up frame rates (would need metadata to convey this):...... 40 2.3.1.5. Allowable Samples per Frame...... 40 2.3.1.5. Allowable Samples per Frame...... 40 2.3.1.6. Audio Bit Depth...... 40 2.3.1.6. Audio Bit Depth...... 40 2.3.1.7. Audio Track File Content...... 41 2.3.1.7. Audio Track File Content...... 41 2.3.1.8. Audio Track File Content Constraint...... 41 2.3.1.8. Audio Track File Content Constraint...... 41 2.3.1.9. Audio Track File Language Constraint...... 41 2.3.1.9. Audio Track File Language Constraint...... 41 2.3.1.10. Track File Loudspeaker Channel Content Constraint: ...... 41 2.3.1.10. Track File Loudspeaker Channel Content Constraint: ...... 41 2.3.1.11. Channels Per Track File Constraint...... 41 2.3.1.11. Channels Per Track File Constraint...... 41 2.3.1.12. Simultaneous Multiple Audio Track File Availability and Playout...... 41 2.3.1.12. Simultaneous Multiple Audio Track File Availability and Playout...... 41 2.3.1.13. Multiple Audio Track File routing...... 41 2.3.1.13. Multiple Audio Track File routing...... 41 2.3.1.14. Mixing of audio channels...... 41 2.3.1.14. Mixing of audio channels...... 41 2.3.1.15. Audio Elements Examples...... 42 2.3.1.15. Audio Elements Examples...... 42 2.3.1.16. Soundfield Configurations...... 43 2.3.1.16. Soundfield Configurations...... 43 2.3.1.17. Channel layout and mapping...... 43 2.3.1.17. Channel layout and mapping...... 43 2.3.1.18. Additional Audio Items and Concerns ...... 43 2.3.1.18. Additional Audio Items and Concerns ...... 43 2.3.1.19. Partitions...... 43 2.3.1.19. Partitions...... 43 2.3.1.20. Audio Editing Granularity...... 43 2.3.1.20. Audio Editing Granularity...... 43 2.3.1.21. Crossfades...... 44 2.3.1.21. Crossfades...... 44 2.3.1.22. Audio Editing and Transitions...... 44 2.3.1.22. Audio Editing and Transitions...... 44 2.3.1.23. Audio Insert Considerations...... 44 2.3.1.23. Audio Insert Considerations...... 44 2.3.1.24. Audio Processing...... 44 2.3.1.24. Audio Processing...... 44 2.3.1.25. PARKING LOT ITEMS...... 44 2.3.1.25. PARKING LOT ITEMS...... 44 2.3.1.26. Metadata...... 44 2.3.1.26. Metadata...... 44 2.3.1.27. Wrapping...... 44 2.3.1.27. Wrapping...... 44 2.3.1.28. File Naming...... 44 2.3.1.28. File Naming...... 44 2.3.1.29. Composition Play List...... 44 2.3.1.29. Composition Play List...... 44 2.3.2. Structural Metadata...... 44 2.3.2. STRUCTURAL METADATA...... 44 3. DATA ESSENCE...... 45

3.1. OVERVIEW...... 45 3.1. OVERVIEW...... 45

7 Private Draft Document. Not for Publication

3.1.1. Introduction...... 45 3.1.1. INTRODUCTION...... 45 3.1.2. Data Essence System Overview...... 45 3.1.2. DATA ESSENCE SYSTEM OVERVIEW...... 45 3.1.3. Major Data Essence Concepts...... 45 3.1.3. MAJOR DATA ESSENCE CONCEPTS...... 45 3.1.4. Data Essence Fundamental Requirements...... 45 3.1.4. DATA ESSENCE FUNDAMENTAL REQUIREMENTS...... 45 3.1.4.1. Common File Formats...... 45 3.1.4.1. Common File Formats...... 45 3.1.4.2. Frame Rates...... 45 3.1.4.2. Frame Rates...... 45 3.1.4.3. Synchronization...... 45 3.1.4.3. Synchronization...... 45 3.2. SUBTITLE SPECIFICATION...... 46 3.2. SUBTITLE SPECIFICATION...... 46 3.2.1. Sub picture...... 46 3.2.1. SUB PICTURE...... 46 3.2.2. Timed Text...... 46 3.2.2. TIMED TEXT...... 46 3.3. CAPTION SPECIFICATION...... 46 3.3. CAPTION SPECIFICATION...... 46 3.3.1. Sub picture...... 46 3.3.1. SUB PICTURE...... 46 3.3.2. Timed Text...... 46 3.3.2. TIMED TEXT...... 46 4. DYNAMIC METADATA...... 48

4.1. OVERVIEW...... 48 4.1. OVERVIEW...... 48 4.1.1. Introduction...... 48 4.1.1. INTRODUCTION...... 48 4.1.2. Dynamic Metadata System Overview...... 48 4.1.2. DYNAMIC METADATA SYSTEM OVERVIEW...... 48 4.1.3. Major Dynamic Metadata Concepts...... 48 4.1.3. MAJOR DYNAMIC METADATA CONCEPTS...... 48 4.1.4. Dynamic Metadata Fundamental Requirements...... 48 4.1.4. DYNAMIC METADATA FUNDAMENTAL REQUIREMENTS...... 48 4.1.4.1. Common File Formats...... 48 4.1.4.1. Common File Formats...... 48 4.1.4.2. Frame Rates...... 48 4.1.4.2. Frame Rates...... 48 4.2. TIME CODE SPECIFICATION...... 48 4.2. TIME CODE SPECIFICATION...... 48

4.3. PAN AND SCAN SPECIFICATION...... 48 4.3. PAN AND SCAN SPECIFICATION...... 48

4.4. COLOR CORRECTION SPECIFICATION...... 48 4.4. COLOR CORRECTION SPECIFICATION...... 48

5. WRAPPING...... 50

5.1. OVERVIEW...... 50

8 Private Draft Document. Not for Publication

5.1. OVERVIEW...... 50 5.1.1. Introduction...... 50 5.1.1. INTRODUCTION...... 50 5.1.2. Wrapping System Overview...... 50 5.1.2. WRAPPING SYSTEM OVERVIEW...... 50 5.2. WRAPPING FUNDAMENTAL REQUIREMENTS...... 50 5.2. WRAPPING FUNDAMENTAL REQUIREMENTS...... 50 5.2.1. Common File Formats...... 50 5.2.1. COMMON FILE FORMATS...... 50 5.2.2. Frame Rates...... 50 5.2.2. FRAME RATES...... 50 5.3. WRAPPING OVERVIEW...... 51 5.3. WRAPPING OVERVIEW...... 51 5.3.1. Introduction...... 51 5.3.1. INTRODUCTION...... 51 5.3.2. Format Information...... 51 5.3.2. FORMAT INFORMATION...... 51 5.3.3. Sequence...... 52 5.3.3. SEQUENCE...... 52 5.3.4. Track File Replacement...... 52 5.3.4. TRACK FILE REPLACEMENT...... 52 5.3.5. Synchronization...... 52 5.3.5. SYNCHRONIZATION...... 52 5.3.6. Splicing...... 52 5.3.6. SPLICING...... 52 5.3.7. Key Epoch...... 52 5.3.7. KEY EPOCH...... 52 5.3.8. Security...... 52 5.3.8. SECURITY...... 52 5.3.9. Integrity and Authentication...... 52 5.3.9. INTEGRITY AND AUTHENTICATION...... 52 5.3.10. Extensibility...... 53 5.3.10. EXTENSIBILITY...... 53 5.3.11. Random Access and Restarts...... 53 5.3.11. RANDOM ACCESS AND RESTARTS...... 53 5.3.12. Simple Essence...... 53 5.3.12. SIMPLE ESSENCE...... 53 5.4. MXF TRACK FILE ENCRYPTION...... 53 5.4. MXF TRACK FILE ENCRYPTION...... 53 5.4.1. Introduction...... 53 5.4.1. INTRODUCTION...... 53 5.4.2. Encrypted Track File Constraints...... 54 5.4.2. ENCRYPTED TRACK FILE CONSTRAINTS...... 54 5.5. IMAGE TRACK FILE...... 55 5.5. IMAGE TRACK FILE...... 55 5.5.1. Introduction...... 55 5.5.1. INTRODUCTION...... 55 5.5.2. Frame Boundaries...... 55 5.5.2. FRAME BOUNDARIES...... 55 5.5.3. Compression...... 55 5.5.3. COMPRESSION...... 55 5.5.4. Metadata...... 55 5.5.4. METADATA...... 55

9 Private Draft Document. Not for Publication

5.6. AUDIO TRACK FILE...... 55 5.6. AUDIO TRACK FILE...... 55 5.6.1. Introduction...... 55 5.6.1. INTRODUCTION...... 55 5.6.2. Frame Boundaries...... 55 5.6.2. FRAME BOUNDARIES...... 55 5.6.3. Data Packing Format...... 55 5.6.3. DATA PACKING FORMAT...... 55 5.6.4. Metadata...... 55 5.6.4. METADATA...... 55 5.7. SUBTITLE TRACK FILE...... 56 5.7. SUBTITLE TRACK FILE...... 56 5.7.1. Introduction...... 56 5.7.1. INTRODUCTION...... 56 5.7.2. Frame Boundaries...... 56 5.7.2. FRAME BOUNDARIES...... 56 5.7.3. Timed Text...... 56 5.7.3. TIMED TEXT...... 56 5.7.4. Sub picture...... 56 5.7.4. SUB PICTURE...... 56 5.7.5. Metadata...... 56 5.7.5. METADATA...... 56 5.8. CAPTIONS TRACK FILE...... 57 5.8. CAPTIONS TRACK FILE...... 57 5.8.1. Introduction...... 57 5.8.1. INTRODUCTION...... 57 5.8.2. Frame Boundaries...... 57 5.8.2. FRAME BOUNDARIES...... 57 5.8.3. Timed Text...... 57 5.8.3. TIMED TEXT...... 57 5.8.4. Sub picture...... 57 5.8.4. SUB PICTURE...... 57 5.8.5. Metadata...... 57 5.8.5. METADATA...... 57 5.9. TIME CODE TRACK FILES (OPTIONAL)...... 57 5.9. TIME CODE TRACK FILES (OPTIONAL)...... 57 5.9.1. Introduction...... 57 5.9.1. INTRODUCTION...... 57 5.9.2. Frame Boundaries...... 57 5.9.2. FRAME BOUNDARIES...... 57 5.9.3. Metadata...... 57 5.9.3. METADATA...... 57 5.10. PAN AND SCAN TRACK FILES (OPTIONAL)...... 58 5.10. PAN AND SCAN TRACK FILES (OPTIONAL)...... 58 5.10.1. Introduction...... 58 5.10.1. INTRODUCTION...... 58 5.10.2. Frame Boundaries...... 58 5.10.2. FRAME BOUNDARIES...... 58 5.10.3. Metadata...... 58 5.10.3. METADATA...... 58 5.11. COLOR CORRECTION TRACK FILES (OPTIONAL)...... 58 5.11. COLOR CORRECTION TRACK FILES (OPTIONAL)...... 58

10 Private Draft Document. Not for Publication

5.11.1. Introduction...... 58 5.11.1. INTRODUCTION...... 58 5.11.2. Frame Boundaries...... 58 5.11.2. FRAME BOUNDARIES...... 58 5.11.3. Metadata...... 58 5.11.3. METADATA...... 58 7. THE COMPOSITION...... 60

5.12. INTRODUCTION...... 60 5.12. INTRODUCTION...... 60

5.13. FUNCTIONAL FRAMEWORK...... 60 5.13. FUNCTIONAL FRAMEWORK...... 60

5.14. THE COMPOSITION FUNDAMENTAL REQUIREMENTS...... 60 5.14. THE COMPOSITION FUNDAMENTAL REQUIREMENTS...... 60 5.14.1. Open Standard...... 60 5.14.1. OPEN STANDARD...... 60 5.14.2. Interoperable...... 60 5.14.2. INTEROPERABLE...... 60 5.14.3. Scalable...... 60 5.14.3. SCALABLE...... 60 5.14.4. Extensible...... 60 5.14.4. EXTENSIBLE...... 60 5.14.5. Synchronization...... 60 5.14.5. SYNCHRONIZATION...... 60 5.14.6. Human Readable Metadata...... 61 5.14.6. HUMAN READABLE METADATA...... 61 5.14.7. File Format...... 61 5.14.7. FILE FORMAT...... 61 5.15. CPL CONSTRAINTS...... 61 5.15. CPL CONSTRAINTS...... 61 5.15.1. Constant Frame and Sample Rate...... 61 5.15.1. CONSTANT FRAME AND SAMPLE RATE...... 61 5.15.2. Minimum Track File Duration...... 61 5.15.2. MINIMUM TRACK FILE DURATION...... 61 5.15.3. Minimum Sequence Duration...... 61 5.15.3. MINIMUM SEQUENCE DURATION...... 61 5.16. TERMINOLOGY...... 62 5.16. TERMINOLOGY...... 62

5.17. SYNCHRONIZATION...... 62 5.17. SYNCHRONIZATION...... 62

5.18. COMPOSITION PLAYLIST STRUCTURE...... 65 5.18. COMPOSITION PLAYLIST STRUCTURE...... 65 5.18.1. General Information...... 65 5.18.1. GENERAL INFORMATION...... 65 5.18.2. Unique Id...... 66 5.18.2. UNIQUE ID...... 66 5.18.3. AnnotationText [optional]...... 66 5.18.3. ANNOTATIONTEXT [OPTIONAL]...... 66 5.18.4. IssueDate...... 66 5.18.4. ISSUEDATE...... 66 5.18.5. Issuer [optional]...... 66

11 Private Draft Document. Not for Publication

5.18.5. ISSUER [OPTIONAL]...... 66 5.18.6. Creator [optional]...... 66 5.18.6. CREATOR [OPTIONAL]...... 66 5.18.7. ContentTitleText...... 66 5.18.7. CONTENTTITLETEXT...... 66 5.18.8. ContentKind...... 67 5.18.8. CONTENTKIND...... 67 5.18.9. ContentVersion...... 67 5.18.9. CONTENTVERSION...... 67 5.18.9.1. Id...... 67 5.18.9.1. Id...... 67 5.18.9.2. LabelText...... 67 5.18.9.2. LabelText...... 67 5.18.10. ContentDescription...... 67 5.18.10. CONTENTDESCRIPTION...... 67 5.18.10.1. SourceMediaDescription [optional]...... 68 5.18.10.1. SourceMediaDescription [optional]...... 68 5.18.10.2. Id [optional]...... 68 5.18.10.2. Id [optional]...... 68 5.18.10.3. LabelText [optional]...... 68 5.18.10.3. LabelText [optional]...... 68 5.18.10.4. FrameRate...... 68 5.18.10.4. FrameRate...... 68 5.18.10.5. TimecodeType [optional]...... 68 5.18.10.5. TimecodeType [optional]...... 68 5.18.10.6. ImageEncodingStandard [optional]...... 68 5.18.10.6. ImageEncodingStandard [optional]...... 68 5.18.10.7. StandardsBody [optional]...... 68 5.18.10.7. StandardsBody [optional]...... 68 5.18.10.8. Label [optional]...... 68 5.18.10.8. Label [optional]...... 68 5.18.10.9. AudioSamplingRate...... 69 5.18.10.9. AudioSamplingRate...... 69 5.18.10.10. AudioConfig...... 69 5.18.10.10. AudioConfig...... 69 5.18.10.11. AudioBitDepth...... 69 5.18.10.11. AudioBitDepth...... 69 5.18.11. SubtitleFormat [optional]...... 69 5.18.11. SUBTITLEFORMAT [OPTIONAL]...... 69 5.18.12. CaptionFormat [optional]...... 69 5.18.12. CAPTIONFORMAT [OPTIONAL]...... 69 5.18.13. TotalRunningTime...... 69 5.18.13. TOTALRUNNINGTIME...... 69 5.18.14. Language [optional]...... 69 5.18.14. LANGUAGE [OPTIONAL]...... 69 5.18.15. Country [optional]...... 69 5.18.15. COUNTRY [OPTIONAL]...... 69 5.18.16. RatingList [optional]...... 69 5.18.16. RATINGLIST [OPTIONAL]...... 69 5.18.16.1. Agency...... 70 5.18.16.1. Agency...... 70 5.18.16.2. Label...... 70 5.18.16.2. Label...... 70 5.18.17. Encryption...... 70 5.18.17. ENCRYPTION...... 70 5.18.18. SequenceList...... 70 5.18.18. SEQUENCELIST...... 70 5.18.19. Signer [optional]...... 70

12 Private Draft Document. Not for Publication

5.18.19. SIGNER [OPTIONAL]...... 70 5.18.20. Signature [optional]...... 70 5.18.20. SIGNATURE [OPTIONAL]...... 70 5.19. SEQUENCE STRUCTURE...... 71 5.19. SEQUENCE STRUCTURE...... 71 5.19.1. Sequence Information...... 71 5.19.1. SEQUENCE INFORMATION...... 71 5.19.2. SequenceType...... 72 5.19.2. SEQUENCETYPE...... 72 5.19.3. Unique Id...... 72 5.19.3. UNIQUE ID...... 72 5.19.4. AnnotationText [optional]...... 72 5.19.4. ANNOTATIONTEXT [OPTIONAL]...... 72 5.19.5. IssueDate...... 72 5.19.5. ISSUEDATE...... 72 5.19.6. Issuer [optional]...... 72 5.19.6. ISSUER [OPTIONAL]...... 72 5.19.7. AssetList...... 72 5.19.7. ASSETLIST...... 72 5.19.7.1. MainMarkers [optional]...... 72 5.19.7.1. MainMarkers [optional]...... 72 5.19.7.2. MainPicture [optional]...... 72 5.19.7.2. MainPicture [optional]...... 72 5.19.7.3. PictureSpatial [optional]...... 73 5.19.7.3. PictureSpatial [optional]...... 73 5.19.7.4. MainSound [optional]...... 73 5.19.7.4. MainSound [optional]...... 73 5.19.7.5. MainSubtitle [optional]...... 73 5.19.7.5. MainSubtitle [optional]...... 73 5.19.7.6. MainCaption [optional]...... 73 5.19.7.6. MainCaption [optional]...... 73 5.19.7.7. Extensions (New Asset Types)...... 74 5.19.7.7. Extensions (New Asset Types)...... 74 5.20. ASSET STRUCTURES...... 75 5.20. ASSET STRUCTURES...... 75 5.20.1. Asset Information...... 75 5.20.1. ASSET INFORMATION...... 75 5.20.2. GenericAssetType...... 76 5.20.2. GENERICASSETTYPE...... 76 5.20.2.1. Id...... 76 5.20.2.1. Id...... 76 5.20.2.2. AnnotationText [optional]...... 76 5.20.2.2. AnnotationText [optional]...... 76 5.20.2.3. EditRate...... 76 5.20.2.3. EditRate...... 76 5.20.2.4. IntrinsicDuration...... 77 5.20.2.4. IntrinsicDuration...... 77 5.20.2.5. EntryPoint [optional]...... 77 5.20.2.5. EntryPoint [optional]...... 77 5.20.2.6. Duration [optional]...... 77 5.20.2.6. Duration [optional]...... 77 5.20.3. TrackFileAssetType...... 77 5.20.3. TRACKFILEASSETTYPE...... 77 5.20.3.1. KeyId [optional]...... 77 5.20.3.1. KeyId [optional]...... 77 5.20.3.2. Hash [optional]...... 78

13 Private Draft Document. Not for Publication

5.20.3.2. Hash [optional]...... 78 5.20.4. MarkerAssetType...... 79 5.20.4. MARKERASSETTYPE...... 79 5.20.4.1. MarkerList...... 79 5.20.4.1. MarkerList...... 79 5.20.4.2. Label...... 79 5.20.4.2. Label...... 79 5.20.4.3. AnnotationText [optional]...... 79 5.20.4.3. AnnotationText [optional]...... 79 5.20.4.4. Offset...... 80 5.20.4.4. Offset...... 80 5.20.5. PictureTrackFileAssetType...... 80 5.20.5. PICTURETRACKFILEASSETTYPE...... 80 5.20.5.1. PicturePixelMatrix...... 80 5.20.5.1. PicturePixelMatrix...... 80 5.20.5.2. PictureBitDepth...... 80 5.20.5.2. PictureBitDepth...... 80 5.20.5.3. PictureColorEncoding...... 80 5.20.5.3. PictureColorEncoding...... 80 5.20.5.4. PictureFrameRate...... 80 5.20.5.4. PictureFrameRate...... 80 5.20.5.5. ImageAspectRatio...... 80 5.20.5.5. ImageAspectRatio...... 80 5.20.6. PictureSpatialMetadataTrackFileAssetType...... 80 5.20.6. PICTURESPATIALMETADATATRACKFILEASSETTYPE...... 80 5.20.6.1. StandardsBody [optional]...... 81 5.20.6.1. StandardsBody [optional]...... 81 5.20.6.2. Label [optional]...... 81 5.20.6.2. Label [optional]...... 81 5.20.7. SoundTrackFileAssetType...... 81 5.20.7. SOUNDTRACKFILEASSETTYPE...... 81 5.20.7.1. SoundSamplingRate...... 81 5.20.7.1. SoundSamplingRate...... 81 5.20.7.2. SoundBitDepth...... 81 5.20.7.2. SoundBitDepth...... 81 5.20.7.3. SoundChannelNumber...... 81 5.20.7.3. SoundChannelNumber...... 81 5.20.7.4. ChannelConfiguration...... 82 5.20.7.4. ChannelConfiguration...... 82 5.20.7.5. Label...... 82 5.20.7.5. Label...... 82 5.20.7.6. Language [optional]...... 82 5.20.7.6. Language [optional]...... 82 5.20.8. SubtitleTrackFileAssetType...... 83 5.20.8. SUBTITLETRACKFILEASSETTYPE...... 83 5.20.8.1. Language [optional]...... 83 5.20.8.1. Language [optional]...... 83 5.20.9. CaptionTrackFileAssetType...... 83 5.20.9. CAPTIONTRACKFILEASSETTYPE...... 83 5.20.9.1. Language [optional]...... 83 5.20.9.1. Language [optional]...... 83 6. OUTPUT PROFILE LIST...... 85

6.1. INTRODUCTION...... 85 6.1. INTRODUCTION...... 85

6.2. OUTPUT PROFILE LIST OVERVIEW...... 85

14 Private Draft Document. Not for Publication

6.2. OUTPUT PROFILE LIST OVERVIEW...... 85 6.2.1. Functional Framework...... 85 6.2.1. FUNCTIONAL FRAMEWORK...... 85 6.2.2. Output Profile List Fundamental Requirements...... 85 6.2.2. OUTPUT PROFILE LIST FUNDAMENTAL REQUIREMENTS...... 85 6.2.2.1. Introduction...... 85 6.2.2.1. Introduction...... 85 6.2.2.2. Open Standard...... 85 6.2.2.2. Open Standard...... 85 6.2.2.3. Interoperable...... 85 6.2.2.3. Interoperable...... 85 6.2.2.4. Scalable...... 85 6.2.2.4. Scalable...... 85 6.2.2.5. Extensible...... 85 6.2.2.5. Extensible...... 85 6.2.2.6. Human Readable Metadata...... 85 6.2.2.6. Human Readable Metadata...... 85 6.2.3. Output Profile List Concepts...... 85 6.2.3. OUTPUT PROFILE LIST CONCEPTS...... 85 6.3. OUTPUT PROFILE LIST FORMAT...... 85 6.3. OUTPUT PROFILE LIST FORMAT...... 85 6.3.1. Introduction...... 85 6.3.1. INTRODUCTION...... 85 6.3.2. File Format...... 85 6.3.2. FILE FORMAT...... 85 6.3.3. ImageOutputFormat...... 85 6.3.3. IMAGEOUTPUTFORMAT...... 85 6.3.3.1. OutputPixelMatrix...... 86 6.3.3.1. OutputPixelMatrix...... 86 6.3.3.2. OutputColorEncoding...... 86 6.3.3.2. OutputColorEncoding...... 86 6.3.3.3. StandardsBody [optional]...... 86 6.3.3.3. StandardsBody [optional]...... 86 6.3.3.4. Label [optional]...... 86 6.3.3.4. Label [optional]...... 86 6.3.3.5. DisplayAspectRatio...... 86 6.3.3.5. DisplayAspectRatio...... 86 6.3.3.6. OutputPictureOverlay [optional]...... 86 6.3.3.6. OutputPictureOverlay [optional]...... 86 6.3.3.7. TimeCodeWindow [optional]...... 86 6.3.3.7. TimeCodeWindow [optional]...... 86 6.3.3.8. Spoiler [optional]...... 86 6.3.3.8. Spoiler [optional]...... 86 6.3.4. OutputAudioFormat...... 87 6.3.4. OUTPUTAUDIOFORMAT...... 87 6.3.4.1. AudioCompressionStandard [optional]...... 87 6.3.4.1. AudioCompressionStandard [optional]...... 87 6.3.4.2. StandardsBody [optional]...... 87 6.3.4.2. StandardsBody [optional]...... 87 6.3.4.3. Label [optional]...... 87 6.3.4.3. Label [optional]...... 87 6.3.5. OutputSpeedOffset [optional]...... 87 6.3.5. OUTPUTSPEEDOFFSET [OPTIONAL]...... 87 PACKAGING...... 88

6.4. INTRODUCTION...... 88

15 Private Draft Document. Not for Publication

6.4. INTRODUCTION...... 88

6.5. PACKAGING SYSTEM OVERVIEW...... 88 6.5. PACKAGING SYSTEM OVERVIEW...... 88 6.5.1. Functional Framework...... 88 6.5.1. FUNCTIONAL FRAMEWORK...... 88 6.5.2. Packaging Fundamental Requirements...... 88 6.5.2. PACKAGING FUNDAMENTAL REQUIREMENTS...... 88 6.5.2.1. Introduction...... 88 6.5.2.1. Introduction...... 88 6.5.2.2. Open Standard...... 88 6.5.2.2. Open Standard...... 88 6.5.2.3. Interoperable...... 88 6.5.2.3. Interoperable...... 88 6.5.2.4. Scalable...... 88 6.5.2.4. Scalable...... 88 6.5.2.5. Extensible...... 88 6.5.2.5. Extensible...... 88 6.5.2.6. Synchronization...... 89 6.5.2.6. Synchronization...... 89 6.5.2.7. Human Readable Metadata...... 89 6.5.2.7. Human Readable Metadata...... 89 6.5.3. Packaging Concepts...... 89 6.5.3. PACKAGING CONCEPTS...... 89 6.6. DISTRIBUTION PACKAGE...... 89 6.6. DISTRIBUTION PACKAGE...... 89 6.6.1. Introduction...... 89 6.6.1. INTRODUCTION...... 89 6.6.2. Distribution Package...... 89 6.6.2. DISTRIBUTION PACKAGE...... 89 6.6.2.1. General...... 89 6.6.2.1. General...... 89 6.6.2.2. Packing for Transport...... 89 6.6.2.2. Packing for Transport...... 89 6.6.2.3. Flattened Packages...... 89 6.6.2.3. Flattened Packages...... 89 6.6.2.4. Security...... 89 6.6.2.4. Security...... 89 6.6.3. Packing List...... 90 6.6.3. PACKING LIST...... 90 6.6.3.1. File Format...... 90 6.6.3.1. File Format...... 90 6.6.3.2. Fields...... 90 6.6.3.2. Fields...... 90 6.6.4. Asset Map...... 90 6.6.4. ASSET MAP...... 90 6.6.4.1. File Format...... 90 6.6.4.1. File Format...... 90 6.6.4.2. Fields...... 90 6.6.4.2. Fields...... 90 7. ANNEX...... 92

7.1. EXAMPLE WORKFLOWS...... 92 7.1. EXAMPLE WORKFLOWS...... 92

7.2. EDITORIAL SYSTEMS...... 93

16 Private Draft Document. Not for Publication

7.2. EDITORIAL SYSTEMS...... 93 7.2.1. Editorial System Fundamental Requirements...... 93 7.2.1. EDITORIAL SYSTEM FUNDAMENTAL REQUIREMENTS...... 93 7.2.1.1. Easy Assembly of Content...... 93 7.2.1.1. Easy Assembly of Content...... 93 7.2.1.2. Movement of Content...... 93 7.2.1.2. Movement of Content...... 93 7.2.1.3. Ease of Operation...... 94 7.2.1.3. Ease of Operation...... 94 7.2.1.4. Multiple Systems...... 94 7.2.1.4. Multiple Systems...... 94 7.3. PLAYOUT SYSTEMS...... 94 7.3. PLAYOUT SYSTEMS...... 94 7.3.1. Playout System Fundamental Requirements...... 94 7.3.1. PLAYOUT SYSTEM FUNDAMENTAL REQUIREMENTS...... 94 7.3.1.1. Easy Assembly of Content...... 94 7.3.1.1. Easy Assembly of Content...... 94 7.3.1.2. Movement of Content...... 94 7.3.1.2. Movement of Content...... 94 7.4. TRANSCODING SYSTEMS...... 94 7.4. TRANSCODING SYSTEMS...... 94 7.4.1. Transcoding System Fundamental Requirements...... 94 7.4.1. TRANSCODING SYSTEM FUNDAMENTAL REQUIREMENTS...... 94 7.4.1.1. Easy Assembly of Content...... 94 7.4.1.1. Easy Assembly of Content...... 94 7.4.1.2. Movement of Content...... 94 7.4.1.2. Movement of Content...... 94 7.5. ASSET MANAGEMENT INTERFACE...... 94 7.5. ASSET MANAGEMENT INTERFACE...... 94 7.5.1. Interface Fundamental Requirements...... 94 7.5.1. INTERFACE FUNDAMENTAL REQUIREMENTS...... 94 7.5.1.1. Introduction...... 94 7.5.1.1. Introduction...... 94 7.5.1.2. API...... 94 7.5.1.2. API...... 94 7.5.1.3. Protocols...... 95 7.5.1.3. Protocols...... 95 7.5.2. Asset Management Interface Fundamental Concepts...... 95 7.5.2. ASSET MANAGEMENT INTERFACE FUNDAMENTAL CONCEPTS...... 95 7.5.3. Application Programming Interface...... 95 7.5.3. APPLICATION PROGRAMMING INTERFACE...... 95 7.5.4. Protocols...... 95 7.5.4. PROTOCOLS...... 95 8. GLOSSARY OF TERMS...... 96

9. COMPOSITION PLAY LIST (CPL) EXAMPLE...... 100

10. ANNEX - IDEAL IMPLEMENTATION...... 106

10.1. REAL-TIME PLAYBACK...... 106 10.1. REAL-TIME PLAYBACK...... 106

10.2. DECODE COMPRESSED IMAGE FASTER THAN REAL TIME...... 106 10.2. DECODE COMPRESSED IMAGE FASTER THAN REAL TIME...... 106

17 Private Draft Document. Not for Publication

107

18 Private Draft Document. Not for Publication Table of Figures

FIGURE 1 – EXAMPLE THEATRICAL WORKFLOW...... 24

FIGURE 2 – EXAMPLE IMF SYSTEM ARCHITECTURE...... 25

FIGURE 3 – EXAMPLE WRAPPING...... 27

FIGURE 4 – EXAMPLE IMF COMPOSITION...... 28

FIGURE 5 - EXAMPLE IMF HIERARCHICAL STRUCTURE...... 29

FIGURE 6 – EXAMPLE IMP...... 30

FIGURE 7 – EXAMPLE OF AN IMAGE CONTAINER VS. 2.39:1 ACTIVE IMAGE AREA...... 35

FIGURE 8 - EXAMPLE TRACK FILE STRUCTURE...... 51

FIGURE 9 - EXAMPLE OF KLV CODING...... 51

FIGURE 10 - CORRESPONDENCE BETWEEN SOURCE AND ENCRYPTED TRIPLETS...... 54

FIGURE 11 - COMPOSITION PLAYLIST SEQUENCE...... 63

FIGURE 12 - TIMING RELATIONSHIPS WITHIN A SEQUENCE...... 63

FIGURE 13 - SEQUENCE SYNC RELATIONSHIP EXAMPLE...... 64

FIGURE 14 - SEQUENCE STRUCTURE...... 71

FIGURE 15 - ASSET TIMING PARAMETERS...... 77

FIGURE 16 - FUTURE STATE - MASTERING & DISTRIBUTION SERVICING...... 92

FIGURE 17 – EXAMPLE IMF WORKFLOW...... 93

Table of Tables

TABLE 1: IMF BASIC ELEMENTS...... 31

TABLE 2: IMAGE METADATA DATA ELEMENTS...... 34

TABLE 3: REQUIRED NON-STANDARD RESOLUTIONS AND FRAME RATES...... 36

TABLE 4: IMAGE METADATA DATA ELEMENTS...... 38

TABLE 5: ALLOWABLE SAMPLES FOR SPECIFIED FRAME RATES...... 40

TABLE 6: XML NAMESPACES...... 61

TABLE 7: TERMS AND DEFINITIONS...... 62

TABLE 8: EXAMPLES OF CONTENT KIND...... 67

TABLE 9: EXAMPLE RATINGS (INFORMATIVE)...... 70

19 Private Draft Document. Not for Publication

TABLE 10: EXAMPLES OF MARKER LABELS...... 79

TABLE 11: EXAMPLES OF STANDARDS BODY LABELS...... 81

TABLE 12: EXAMPLES CHANNEL LABELS...... 82

TABLE 13: EXAMPLES OF LABEL ELEMENTS...... 86

TABLE 14: GLOSSARY OF TERMS...... 96

20 Private Draft Document. Not for Publication

1. IMF Introduction

1.1. Introduction With the advance of technology within the motion picture post-production industry, a paradigm shift is upon us as we move from the videotape workflow to the file based workflow. It is with this shift that a need has risen for a standardized set of specifications for this file-based workflow. In light of this shift, an organization was sought out by content creators to provide the facility for detailed discussions surrounding this topic. The Entertainment Technology Center stood out as a leading candidate for such discussions and in November of 2008 the Interoperable Master Format or IMF specification effort was initiated.

1.2. Scope The IMF file-based workflow is designed to replace the existing tape-based Distribution Servicing workflow. It will store one master set of file based elements to be assembled for any downstream distribution using multiple Composition Play Lists (Recipes), similar to what is used in present day Digital Cinema Packaging (DCP). The broad concept of a high quality, uniform IMF should lower costs, improve time-to-market, and increase interoperability of existing production processes and needs. This specification is intended to promote improved interoperability and faster implementation of multiple variants on a common video package, while retaining highest quality and appropriate security of the material. It is envisioned that this Final Specification would then be presented to SMPTE (Society of Motion Picture and Television Engineers) to create standards, recommended practices and engineering guidelines to implement this specification throughout the industry as a common interchange method.

1.3. System Objectives At the onset of writing a specification for an Interoperable Master Format, ETC acknowledged certain fundamental requirements, which are:

The Interoperable Master Format (IMF) shall provide for a single set of master files and recipes to allow for easy creation of versions for distribution channels. This should also provide the potential to repurpose existing content. The IMF shall have the capability to present Essence and Metadata that is equal to or better than what one could achieve with current practice (i.e. Videotape). The IMF shall provide mechanisms (i.e. asset management, packing lists) for the exchange of IMF metadata or packages between facilities. This system should be based upon international standards so that content can be interchanged anywhere in the world as can be done today with videotape. These standards should be open published industry standards that are widely accepted and codified by regional and international standards bodies such as: ANSI, SMPTE, ITU, W3C, and ISO/IEC. The system specification and formats should be chosen so that the capital equipment and operational costs are reasonable and exploit, as much as possible, the economies of scale associated with equipment and technology in use in other industries. The hardware and software used in the system should be easily upgraded as advances in technology are made. Upgrades to the format shall be designed in a way so that content may be distributed and compatibly exchanged on the latest IMF-compliant hardware and software, as well as earlier adopted IMF-compliant equipment installations. The Interoperable Master Format shall be based upon a component architecture (e.g., Mastering, Compression, Encryption, Distribution, Storage, Playback) that allows for the components to be replaced or upgraded in the future without the replacement of the complete system. It is the intention of this Digital Video specification to allow for advances in technology and the economics of technology advancement.

21 Private Draft Document. Not for Publication

The document specifies a baseline for the implementation of an Interoperable Master Format. The goal of backwards compatibility in this context is to allow, for example, new content at higher resolution and color space to be understood by the baseline implementation. The Interoperable Master Format shall provide a reliability and availability that is equal to, or better than, current practice. (i.e. videotape) The Interoperable Master Format shall allow for the use of a security method. The system should provide a means to allow the content to use standardized encryption with private/public keys. The IMF should also allow the use of forensic marking of the content for providing traceable forensic evidence in the case of theft.

1.4. Document Language This document consists of normative text and optional informative text. Normative text is text that describes the elements of the design that are indispensable or contains the conformance language keywords: “shall”, “should” or “may”. Informative text is text that is potentially helpful to the user, but not indispensable and can be removed, changed or added editorially without affecting interoperability. Informative text does not contain any conformance keywords. All text in the document is, by default, normative except: any section titled “Introduction”, any section explicitly labeled as “Informative”, or individual paragraphs that start with the word “Note.” Normative references are those external documents referenced in normative text and are indispensable to the user. Informative, or bibliographic, references are those references made from informative text or are otherwise not indispensable to the user. The keywords “shall” and “shall not” indicate requirements that must be strictly followed in order to conform to the document and from which no deviation is permitted. The keywords “should” and “should not” indicate that among several possibilities one is recommended as particularly suitable, without mentioning or excluding others; or that a certain course of action is preferred but not necessarily required. In the negative form, a certain possibility or course of action is deprecated but not prohibited. The keywords “may” and “need not” indicate a course of action permissible within the limits of the document. The keyword “reserved” indicates that a condition is not defined and shall have no meaning. However, it may be defined in the future. The keyword “forbidden” is the same as reserved, except that the condition shall never be defined in the future. A compliant implementation is one that includes all mandatory provisions (“shall”) and, if implemented, all recommended provisions (“should”) as described. A compliant implementation need not implement optional provisions (“may”). Requirements are indicated with the key phrases “is required to”, “is encouraged to” and “can” which represent “shall,” “should” and “may” (had the text been in a separate requirements document). This is necessary in order to distinguish requirements from the specification conformance language. Sentences with the following keywords are italics: shall, shall not, should not, is required, is not required, is not encourage and is encouraged. The names of standards publications and protocols are placed in [bracketed text]. International and industry standards contain provisions, which, through reference in this text, constitute provisions of this specification. At the time of publication, the editions indicated were valid. These referenced standards are subject to revision, and parties to agreements based upon this specification are encouraged to investigate the possibility of applying the most recent editions of the referenced standards. Section 8 is a glossary of technical terms and acronyms used throughout this specification. The reader is encouraged to refer to the glossary for any unfamiliar terms and acronyms. Trademarked names are the property of their respective owners.

22 Private Draft Document. Not for Publication

2. SYSTEM OVERVIEW

1.5. Functional Framework The document defines technical specifications and requirements for the creation and distribution of Interoperable Master Format. The details are in the following sections:

Essence: This section provides specifications for the image and audio essence and its specific inherent metadata. The image essence specification defines a common set of image structures by specifying parameters such as image container, colorimetry and, if required, a mezzanine compression structure for the Interoperable Master Format (IMF). The Audio Essence specifies audio characteristics such as; bit depth, sample rate, minimum channel count, channel mapping and reference levels. Data Essence: This section provides specifications for the subtitle (Timed Text and sub pictures) and captions data essence. The Subtitles Essence specifies the format of a Digital Video subtitle track file. A subtitle file contains a set of instructions for placing rendered text or graphical overlays at precise locations on distinct groups of motion picture frames. A caption file may provide graphical overlays or provide graphical information to a secondary system for display of text. Outside of providing to a secondary system most other parameters are the same as subtitle data essence. Dynamic Metadata: This section provides the specifications for the Metadata that is outside of the associated metadata of the essence or data essence. This would be metadata such as; time code, pan and scan, color correction, dynamic down-mixing and control. Wrapping: This section defines the requirements for wrapping the content and metadata (image, audio and subtitle) files using (where possible) existing Material eXchange Format (MXF) specifications. The output of this process is the Track Files. This section also defines the requirements for encrypting the essence (sound, picture and subtitles) of the IMF if required Composition: This section provides the specifications for the creation of XML using Composition PlayLists (CPL) which are scripts that link the IMF Track Files together into synchronized pieces of content. This section will also touch upon the security requirements for a composition and Composition PlayList. Packaging: Interoperable Master Format Compositions can be Packaged for distribution which is detailed in this section. Provides requirements for all the tools necessary for editorial functions in a typical post-production environment. Annex: This section contains example workflows using the IMF as well as definitions and suggested devices for editorial systems, transcoding and playout systems.

23 Private Draft Document. Not for Publication

1.6. Overview The goal of the IMF project is to establish an integrated hardware and software solution for the processing, storage, and management of digital content intended for downstream distribution to business that will consume such content.

1.6.1. IMF System Architecture Figure 1 below demonstrates an example of the end-to-end data flow from Production to Distribution. The focus of IMF is within the Distribution Servicing phase.

Distribution Mechanisms

Figure 1 – Example Theatrical Workflow

To further demonstrate the concept of the Interoperable Master Format in a system architecture for distribution servicing refer to Figure 2 below. The IMF aggregates image, audio, and other files to support processing for any downstream delivery requirement. It does this using Composition Play Lists to create the deliverables using transcoding technologies. The IMF package will allow for delivery using the image elements stored in either a compressed or uncompressed file. An example of this would be an HD resolution with 709 color space at 24 FPS converted to SD resolution with 601 color space at 29.97 FPS.

24 Private Draft Document. Not for Publication

Figure 2 – Example IMF System Architecture

1.6.2. Major System Key Concepts The following listed concepts below are the Key Concepts of the IMF as a system. These concepts are derived from work done by content creators to create masters for the downstream distribution servicing of theatrical content. This is not to say that these concepts or the IMF cannot be applied to other forms of content. Instead it is noted here so that the reader may better understand its origin.

1.6.2.1. Business to Business Solution The IMF is intended to facilitate internal or a business to business relationship and is not intended to deliver content to the consumer.

1.6.2.2. Digital Source Master (DSM) The IMF is derived from a wide range of sources with a wide range of technical levels. One example is content in its finished state at the end of the theatrical post-production process which is shown above in Figure 1 as a Digital Intermediate or also know as a Digital Source Master (DSM). The DSM can also be used to convert to a film duplication master and/or a master for archival purposes. It is not the intention of this document to, in any way, specify the DSM. This is left to the discretion of the content provider.

25 Private Draft Document. Not for Publication

1.6.2.3. Archive not in Scope It is also not the intention of this specification to design the IMF as an archiveable master. One could conceive this as another use of the IMF, however this is not part of the scope of this specification.

1.6.2.4. File / Frame-Based System The IMF is built upon a data file-based design, i.e., all of the content is made up of data stored in files. These files are organized around the image frames, which means that all of the synchronization references to the image frames and frame rate. The file is the most basic component of the system.

1.6.2.5. Essence and Data Essence The raw image and audio files of the DSM make up what is known as Essence. Also included as part of the IMF there may be subtitle or caption essence. This type of essence is called Data Essence and because of its nature it may require different specifications and hence is described in its own section.

1.6.2.6. Metadata Metadata is essentially data about data. There are many examples of metadata: • Supporting metadata • Descriptive metadata • Dynamic metadata

Supporting and Descriptive metadata are defined by SMPTE. A new concept proposed for the IMF is Dynamic metadata. This metadata is metadata that changes on a frame basis such as; time code, pan and scan, color correction and dynamic down-mixing. This metadata may or may not be applied to the underlying essence. It must be synchronized to the essence and therefore with these requirements will need to be wrapped and contained as Track Files.

1.6.2.7. Security The IMF should not preclude the use of encryption or the use of forensic marks. This may occur at the wrapping, composition or packaging stage.

1.6.2.8. Wrapping and Track File Creation Wrapping is the process of encapsulating the essence, data essence and dynamic metadata files into well understood temporal units called Track Files using a standardized wrapping method such as MXF. Figure 3 below shows a basic example of this process of wrapping Essence and Metadata into Track Files. This process includes a way to uniquely identify each Track File along with its associated essence and metadata. It also provides a method to identify synchronization locations within each of the Track Files.

26 Private Draft Document. Not for Publication

Figure 3 – Example Wrapping

27 Private Draft Document. Not for Publication

1.6.2.9. Composition A Composition represents a complete set of files that may be a feature, an episode, a trailer, an advertisement or any other single piece of content. A composition minimally consists of a Composition PlayList (CPL) and one or more Track Files. Composition PlayLists (CPL’s) are XML structured textual lists that define how elements of an IMF are to be assembled and either transcoded to another format or played out as a presentation.

Figure 4 – Example IMF Composition

1.6.2.10. Versions One of the main reasons for having an IMF is the ability to create many versions of a program without duplicating the common essence used for each version. Instead, one Composition PlayList (CPL) is required for each version of the program, which is much smaller in size as compared to the Track Files themselves.

1.6.2.11. Sequence Within the Composition PlayList (CPL) one shall create sequences where the Track Files are nested within these sequences. The CPL is organized in such a way that, finding and possibly replacing sequences becomes more efficient depending on the number of Track Files. This configuration was selected to give the content creator more choices for the structure of the Composition PlayList. In the IMF, a sequence represents a conceptual period of time having a specific duration chosen by the content provider. For example, a Sequence maybe the same length as a reel or it may be the running time of a television program between commercials. Again this length is determined by the content provider to fit both the particular workflow and content type. Once created, Sequences can then be electronically spliced together within the CPL to create a complete presentation. An example below (see Figure 4) shows the hierarchical structure of the CPL. A sequence may have one or more Track Files nested within it. The IMF allows cuts only to occur between Track Files as well as allowing a minimum duration of a single image frame. Sequences on the other hand only allow a minimum duration of one second.

28 Private Draft Document. Not for Publication

Figure 5 - Example IMF Hierarchical Structure

1.6.2.12. Output Profile List Once versions of a Composition are created, many different distribution formats may be required to be made from that particular version. In order to assist with the automation of this process, one shall use a method to define the transcoder or play out device’s output. The method of communicating the desired output is called, the Output Profile List or OPL. The OPL is a textual list that contains the specification of the output. It is linked to the CPL by using the UUID of the CPL. This allows one to link many OPL’s to a single CPL and automate the transcoding of the Composition into multiple distribution formats.

1.6.2.13. Packages An Interoperable Master Package (IMP) shall minimally consist of a Composition, an Output Profile List, an Asset Map and Packing List. An Asset Map is a text document that describes how the files are distributed across the physical media that is used for transport. A Packing List is a text document that provides a list of all of the files included in that specific IMP. An example of an IMP is shown below in Figure 6.

29 Private Draft Document. Not for Publication

Figure 6 – Example IMP

30 Private Draft Document. Not for Publication

1.7. IMF Elements and Processes The following table provides a list of the identified IMF elements and processes. The table also provides the category they belong to and where you can find more detail in the following sections.

Table 1: IMF Basic Elements Element Category Image Essence Audio Essence Primary Display Subtitles / Captions Data Essence Composition PlayList (CPL) Files Dynamic Metadata Output Profile List (OPL) Supporting Metadata Packing List Supporting Metadata Asset Map Supporting Metadata Timecode Dynamic Metadata Secondary Display Captions/Subtitles Data Essence QC / Picture Reports and Fact Sheets Descriptive Metadata Forensic Marking Data Essence Pan and Scan (Aspect Ratio Conversion) Dynamic Metadata Time Compression/ Expansion Dynamic Metadata HI (Hearing Impaired) Essence VI (Visually Impaired) Essence

31 Private Draft Document. Not for Publication

2. Essence

2.1. Overview

2.1.1. Introduction Essence is the core files that contain the image and audio data. One of the goals of the IMF is to allow for a variety of these files to be included into the IMF. For example the IMF should support multiple image resolutions such as 1920x1080 and 1280x720. This is just one of many different variables that maybe supported by the IMF Essence.

2.1.2. Essence System Overview For the purpose of documenting the specific requirements and specifications for Essence, it is helpful to divide the system into a set of components. The specifications and requirements for each of these components will be described in the following sections:

• Image – The image specification and file format o Compression Requirements o Structural Metadata • Audio – The audio specification and file format o Structural Metadata

2.1.3. Major Essence Concepts The following are the major concepts of the Essence files.

2.1.4. Essence Fundamental Requirements

2.1.4.1. Common File Formats The Essence and Data Essence shall use a common standardized file format for each element (image, audio, subtitles, etc.). The image essence file format shall be an SMPTE-conformant file based on existing SMPTE standards. The audio track file format shall be based on Broadcast Wave.

2.1.4.2. Frame Rates The image and audio formats shall support the following frame rates: • 23.976 F/s Progressive • 24.000 F/s Progressive • 25 F/s Progressive • 29.97 F/s Progressive • 30.00 F/s Progressive • 50 f/s Interlaced • 50 F/s Progressive • 59.94 f/s Interlaced • 59.94 F/s Progressive • 60.00 F/s Progressive

32 Private Draft Document. Not for Publication

The frame rate of any individual IMF source master shall remain constant. Metadata shall be carried in the image and audio data file format to indicate the frame rate.

2.1.4.3. Synchronization Track files within the image and/or audio essence shall carry information to provide for frame- based synchronization between each track file. At a minimum, this information shall include a start of the track file and a continuous frame count. Image is the master reference for synch. Granularity is frame based.

33 Private Draft Document. Not for Publication

2.2. Image Specification

2.2.1. Image Structure

2.2.1.1. Image Metadata Required Fields Specific image metadata shall be required for each image track that describes the native parameters of the image content, in order to allow for proper interchange between different implementation systems. At the time of this writing, however, the actual metadata fields cannot be defined. Instead, the data elements shown in Table 2 shall be the minimum amount of information supported by the IMF for image. These data elements shall be converted into specific metadata fields once the specification is complete.

Table 2: Image Metadata Data Elements Data Element Data Element Definition Examples Image Container Total number of horizontal pixels used for the Image 4096, 2048, 1920, 720, Horizontal Pixels Container could be many Image Container Total number of vertical pixels used for the Image Container 3112, 2160, 1556, 1080, Vertical Pixels 576, 486, 480, could be many Active Image Top Left Start of the Active Image, within the area expressed in an x,y Varies – examples Start Coordinate (x,y) coordinate value, that is placed in relation to the Image include 0,0 or 0,239 Container Top Left Coordinate of (0,0). Active Image End End of Active Image within Image Container area expressed Varies – examples Coordinate (x,y) in an x,y coordinate value that is placed in relation to the include 1919,1079 or Image Container Top Left Coordinate of (0,0). 1679, 1079 Pixel Aspect Ratio Shape of the pixel expressed in a ratio of width divided by 1:1, could be any ratio height of the pixel Raster Format Interlaced (fields/second) or Progressive (frames/second) Interlaced, Progressive Frame Rate Rate that each image is shown; used in conjunction with 23.976, 24, 25, 29.97, 30, Raster Format to express fields/second or frames/second 50, 59.94, 60 Color Channel Bit Number of used to represent the digital image data 8, 10, 12, 16 Depth Color Encoding Ratio Number of samples used per color space component 4:4:4, 4:2:2 expressed in a ratio

Color Encoding Type of color model used for the image expressed in a set of RGB, YCbCr, XYZ components Color Primaries xy chroma coordinates of the tri-stimulus values Rxy,Gxy,Bxy Transfer Function Relationship of code value to brightness value Linear, log, power function Code Value Range Range used to represent zero black and 100% white Full-Range (0-1023), Limited-Range (64-940) Coordinate Origin Top-left,, bottom-right Codec JPEG 2000 Part 1

34 Private Draft Document. Not for Publication

2.2.1.2. Image Container, Active Image, Pixel Aspect Ratio The Image Container is defined as the full canvas of the image area, parts of which, are not necessarily meant to be seen. Image Container is defined in several SMPTE documents including S2046-1-2009 and will be defined here as a rectangular array of pixels that contains the maximum possible image area in a given format. In contrast, the Active Image area is a subset of the Image Container and is defined as the area that contains any part of the image that is meant to be seen. Active Image contains the active image content only and shall not contain letterboxing mattes or side mattes. The Active Image area shall not exceed the size of the Image Container.

Because the Active Image area may be smaller than the Image Container, the Active Image area needs to be defined by a horizontal and vertical position within the Image Container. In order to describe the location and size of the Active Image area with the least amount of information, the Active Image area shall be expressed as two x/y coordinate values: Top Left Coordinate and Bottom Right Coordinate. For example, the Active Image of content that is shown in the 2.39:1 aspect ratio in the highest resolution of HD (Active Image Width of 1920 pixels) would have an Active Vertical size of 803 (1920/2.39). Because the vertical size is an odd number, the placement of the Active Image cannot be centered within the 1920x1080 Image Container; however, by using coordinates to describe the position, the exact location within the Image Container can be made as shown in Figure 7. The Top Left of the Image Container is always (0,0).

Figure 7 – Example of an Image Container vs. 2.39:1 Active Image Area

The IMF shall support an Image Container Horizontal size up to 1920 pixels and a Vertical size up to 1080 pixels and shall support an Active Image area that is up to 1920x1080. The IMF should support an Image Container Horizontal size up to 4096 pixels and a Vertical size up to 3112 pixels and should support an Active Image area that is up to 4096x3112. At the time of this writing, current mastering standards include

35 Private Draft Document. Not for Publication

resolutions up to 1920x1080, so the IMF shall support up to this resolution including resolutions that are considered to be Standard Definition or lower.

The Pixel Aspect Ratio (PAR) defines the shape of the pixel used by the image in a ratio of width versus height. Square PAR is the same as a ratio of 1:1. Depending on the standard used, the PAR can differ – examples include the non-square PAR for NTSC and PAL. Both the Image Container and the Active Image shall share the same Pixel Aspect Ratio.

2.2.1.3. Raster Format, Frame Rate Raster Format (Interlaced or Progressive) and Frame Rate are dependent on each other and to some extent, on the resolution of the image. The IMF shall support the same Raster Formats and Frame Rates that are currently used in mastering today including those listed in the following SMPTE specifications: • SMPTE 274M • SMPTE 296M • SMPTE 259M

Because the file-based world affords some extra flexibility in regards to the mixing of resolution, frame rate and raster formats, particularly with progressive frame rate material, the IMF shall support the resolutions and frame rates listed in Table 3: Required Non-Standard Resolutions and Frame Rates, in addition to the above SMPTE standards:

Table 3: Required Non-Standard Resolutions and Frame Rates System Nomenclature Luma or R’G’B’ samples Active lines per frame Frame Rate (Hz) per active line (S/AL) (AL/F) 720x480/23.98/P 720 480 24 1.001 720x486/23.98/P 720 480 24 1.001 720x576/25/P 720 576 25 Active Image Only Any up to and including Any up to and including All listed in Section 3.1.4.2 1920 1080

In addition to the required resolutions, frame rates and raster formats, the IMF should support up to 24 frames per second for Horizontal resolutions greater than 1920 pixels and up to 4096 pixels. There may need to be additional metadata defined for Interlaced Raster Formats. Because this is a field rate, additional data elements including field order (bottom or top field first) and possibly repeated fields for content with 3:2 pulldown may be needed to thoroughly describe the Interlaced content.

2.2.1.4. Bit Depth In regards to Bit Depth, the IMF shall support 8-bit and 10-bit image content and should support 12-bit and 16-bit content. At the time of this writing, 8-bit and 10-bit content are used considerably more than the higher bit-depths, so the IMF must support 8-bit and 10-bit in order to match current mastering standards.

2.2.1.5. Chroma Subsampling The IMF shall support 4:4:4 (no chroma subsampling) and 4:2:2 chroma subsampling types. Current mastering standards use both of these ratios; therefore the IMF must include the both of these in the format.

36 Private Draft Document. Not for Publication

2.2.1.6. Color Space, Color Space Gamuts The format shall include the Color Space and Color Space Gamuts that are currently used in mastering, which include: • R’G’B’ ITU-R BT.709

• Y’CbCr ITU-R BT.709

• Y’CbCr ITU-R BT.601

The IMF should support: • XYZ [standard? Or CIE 1931]

2.2.2. Compression Requirements Image Compression for Digital Video may use data reduction techniques to decrease the size of the data for economical delivery and storage. The system uses perceptual coding techniques to achieve an image compression that is visually lossless. It is important to note that image compression is typically used to ensure meeting transmission bandwidth or media storage limitations. This results in image quality being dependent on scene content and delivered . Digital Video image compression is much less dependent upon bandwidth or storage requirements, thereby making bit rate dependent on desired image quality rather than the reverse. Uncompressed, Lossless and schemes shall be included in the IMF

2.2.2.1. Image Compression Requirements The following section defines the requirements for a Image Compression format and structure. The Uncompressed, Lossless and Lossy compression schemes shall meet the following requirements: • Shall be an Industry Standard (i.e. SMPTE, ITU, etc.) • Shall be License-Free • Shall support Intra-Frame

In addition to the above requirements, Lossless and Lossy compression schemes shall meet the following requirements: • Shall support Variable and Constant Bit Rates • Shall support Spatial Resolution Layers • Shall support Bit-Depth and Multiple Quality Layers

The compression scheme shall use documented industry standards in order to ensure consistent interoperability between system implementations and to prevent conflicts with intellectual property. In order to allow for the maximum number of participants to adopt the IMF, the compression scheme shall be license-free.

2.2.2.2. Intra-Frame Compression Intra-Frame compression is an important part of the image requirements because it allows for each image frame to be self-contained and edits to occur in one frame increments without any additional decoding being necessary. Both variable and constant bit rates may be used in the compression scheme in order to allow the content owner to balance quality of the image versus overall size.

• The compression scheme shall allow for different spatial resolution and bit-depth layers within the same frame. Note: This allows one set of files to contain multiple resolutions of the image and various bit-depth qualities. For example, the compressed HD image with a resolution of 1920x1080 shall allow

37 Private Draft Document. Not for Publication

extractions of lesser resolutions such as 960x540 (half of the original width and height) and 480x270 (one-quarter of the original width and height), without having to decode and scale the image width and height to the smaller resolutions. This method of extraction allows for smaller proxy versions of the files to be used without having to generate a separate file for a system that requires a smaller resolution. Smaller proxies are used in many situations including editing and as reference files for audio conforming and subtitling creation.

• The bit-depth shall also allow for multiple bit-depth layers. Note: for example, a 10-bit image should contain the information needed to correctly extract the 8- bit version from the 10-bit. This allows for multiple quality images to be extracted from the same set of files, depending on the requirements of the playback system. Using this method of extraction, the lesser quality version (8-bit in this example) would contain parameters that would optimize the playback on an 8-bit system. • IMF shall support JPEG 2000 Part 1 Note: At the time of this writing, the only lossy and lossless CODEC that meets both the image and compression requirements is JPEG2000 Part 1; however, the IMF may use any CODEC that meets the compression requirements.

2.2.3. Structural Metadata

2.2.3.1. Image Metadata Required Fields Specific image metadata shall be required for each image track that describes the native parameters of the image content, in order to allow for proper interchange between different implementation systems. At the time of this writing, however, the actual metadata fields cannot be defined. Instead, the data elements shown in Table 2 shall be the minimum amount of information supported by the IMF for image. These data elements shall be converted into specific metadata fields once the specification is complete.

Table 4: Image Metadata Data Elements Data Element Data Element Definition Examples Image Container Total number of horizontal pixels used for the Image 4096, 2048, 1920, 720, Horizontal Pixels Container could be many Image Container Total number of vertical pixels used for the Image Container 3112, 2160, 1556, 1080, Vertical Pixels 576, 486, 480, could be many Active Image Top Left Start of Active Image within an Image Container area Varies – examples Start Coordinate (x,y) expressed in an (x,y) coordinate value that is placed in include (0,0) and (0,239) relation to the Image Container Top Left Coordinate of (0,0) Active Image End End of Active Image within Image Container area expressed Varies – examples Coordinate (x,y) in an x,y coordinate value that is placed in relation to the include (1919,1079) and Image Container Top Left Coordinate of (0,0) (1679, 1079) Pixel Aspect Ratio Shape of the pixel expressed in a ratio of width divided by 1:1, could be any ratio height of the pixel Raster Format Interlaced (fields/second) or Progressive (frames/second) Interlaced, Progressive Frame Rate Rate that each image is shown; used in conjunction with 23.976, 24, 25, 29.97, 30, Raster Format to express fields/second or frames/second 50, 59.94, 60 Color Channel Bit Number of bits used to represent the digital image data 8, 10, 12, 16 Depth

38 Private Draft Document. Not for Publication

Data Element Data Element Definition Examples Color Encoding Ratio Number of samples used per color space component 4:4:4, 4:2:2 expressed in a ratio

Color Encoding Type of color model used for the image expressed in a set of RGB, YCbCr, XYZ components Color Primaries xy chroma coordinates of the tri-stimulus values Rxy,Gxy,Bxy Transfer Function Relationship of code value to brightness value Linear, log, power function Code Value Range Range used to represent zero black and 100% white Full-Range (0-1023), Limited-Range (64-940) Coordinate Origin Coordinates that define the upper left corner of the Active Top-left, bottom-right Image within the Image Container CODEC Coder Decoder for a digital stream of data JPEG 2000 Part 1 VANC Horizontal Ancillary Data - Ancillary packets located in Embedded Audio the horizontal blanking region.

VANC Vertical Ancillary Data - Ancillary packets located in the Closed Caption data vertical blanking region and VPID

2.3. Audio Specification

2.3.1. Audio Structure

2.3.1.1. Audio File Format Audio Track Files represent audio within the IMF. • An audio track file shall be a single, complete audio element, which may be any soundfield configuration. • An audio track file would typically be wrapped or interleaved into a single file, and shall be limited to a single audio element (such as a composite mix, or dialog track) and a single soundfield configuration (such as 5.1 or mono) per audio track file. • Audio track files shall be further constrained per sections 3.3.1.6 through 3.3.1.x • The format of an audio track file shall be Interleaved Broadcast Wave. Note: Interleave is by single sample. The order of samples is determined by the soundfield configuration (TBD) Note: MXF is not preferred as an audio wrapper due to it’s limitations- e.g. track assignments are not flexible and are strictly pattern based, depending on the number of tracks. Note: AS02 may play a part here. • Data rate coded audio shall not be used (e.g. AC3, DTS-MA, Dolby E). • Matrix encoded audio (e.g. Lt-Rt, Dolby EX) may be used.

Note: discrete inputs to IMF authoring (informative): Broadcast Wave, Wave, PCM

2.3.1.2. Sampling Rate Sampling rates shall include 48k, 96k, 47.952k, and 95.9k.

39 Private Draft Document. Not for Publication

2.3.1.3. Frame Rate/Audio Speed • Native 24, 25, 30, 50 and 60 are equivalent audio speeds, as the image being represented is the same. • Pull down 23.976, 29.97 and 59.94 are equivalent audio speeds, as the image being represented is the same.

2.3.1.4. Sped up frame rates (would need metadata to convey this): • This is “25 fps”, but is sped up from the native speed of 23.976 to 25 fps. The resultant audio is 4.1% faster and higher in pitch than the original captured audio. Metadata should indicate it is “25 fps sped up 4.1%” • This is “25 fps”, but is sped up from the native speed of 24 to 25 fps. The resultant audio is 4.0% faster and higher in pitch than the original captured audio. Metadata should indicate it is “25 fps sped up 4.0%”

2.3.1.5. Allowable Samples per Frame Allowable samples per frame shall be based on frame rate, according to the following table. The required sample (word) clocking rate shall sync to the frame speed of the associated image.

Table 5: Allowable Samples for Specified Frame Rates Sample Rate Frame Rate Frame Rate Frame Rate Frame Rate 47.952k 48k 95.9k 96k 23.976 2000 2002 4000 4004

24 n/a 2000 n/a 4000

25 n/a 1920 n/a 3840

29.97 1600 1601.6 3200 3203.2

30 n/a 1600 n/a 3200

50 n/a 960 n/a 1920

59.94 800 800.8 1600 1601.6

60 n/a 800 n/a 1600

48 n/a 1000 n/a 2000

2.3.1.6. Audio Bit Depth The audio bit depth shall be 24 bits;16 and 20 bit are not allowed - they are padded in the least significant bits to create 24 bit. Note: Metadata should be included to reflect the original bit depth e.g. “24 bit padded from the original 16 bit source.”

40 Private Draft Document. Not for Publication

2.3.1.7. Audio Track File Content

2.3.1.8. Audio Track File Content Constraint There shall be one audio element per track file. Example audio elements are listed in section 2.3.1.15.

2.3.1.9. Audio Track File Language Constraint Only one audio language shall be used in an audio track file.

2.3.1.10. Track File Loudspeaker Channel Content Constraint: A given loudspeaker channel shall only be represented once per track file, i.e. within a track file containing multiple audio channels, there shall be a one-to-one relationship between loudspeakers and audio channels. There shall not be multiple instances of a loudspeaker channel within a track file.

2.3.1.11. Channels Per Track File Constraint An audio track file shall not exceed 16 audio channels.

2.3.1.12. Simultaneous Multiple Audio Track File Availability and Playout The CPL in an IMF shall be capable of pointing to multiple audio track files, which are simultaneously available for use by a transcoder or real-time playout device. IMF shall support simultaneous multiple audio track file playout, as multiple audio elements must be available to play out at once.

2.3.1.13. Multiple Audio Track File routing When accessing multiple track files, the track file individual channels shall be combined (like channel to like channel) for output routing purposes on a unity gain basis. For example, the left channel of audio track file A and the left channel of audio track file B are combined at unity gain as the overall left channel content of the IMF. Note: Track file channels shall not be combined at other than unity gain. Note: Different loudspeaker channels shall not be combined, e.g. Left and Right are never combined to make a mono output.

2.3.1.14. Mixing of audio channels • Audio individual channels within a track file shall not be mixed. • Audio channels between multiple track files shall not be mixed, other than the routing spec stated above. • Different channels shall not be combined (downmixed) to make a narrower soundfield configuration.

41 Private Draft Document. Not for Publication

2.3.1.15. Audio Elements Examples • There shall be only one audio element per track file. • Multiple audio elements shall not be combined in a single audio track file. The following audio elements are typical of what may be carried in the IMF. Other audio elements may be carried in an IMF at the content provider’s discretion. • Printmaster (reels) • Composite Mix (full length or parts) • Music+Effects (one track, two tracks or full multitrack) • M+E optional material (one track, two tracks or full multitrack) • Narration • VI (Visually Impaired) • SAP • Dialog (one track, two tracks, or full multitrack) Often, this is part of a “DME split track” • Music (one track, two tracks, or full multitrack) Often, this is part of a “DME split track” • Effects (one track, two tracks, or full multitrack) Often, this is part of a “DME split track”

42 Private Draft Document. Not for Publication

2.3.1.16. Soundfield Configurations There shall be only one soundfield configuration per track file. Multiple soundfield configurations shall not be combined in a single audio track file. The following soundfield configurations shall be supported in IMF V.1. Soundfield configurations not specified in this section shall not be supported. • 5.1 • 5.0 • L/R (Lo-Ro) “Standard Stereo” • Lt-Rt • Mono • LCR • LCRS • 5.1EX • 6.1 (discrete) • 6.0 (discrete) • 7.1 (L, C, R, Ls, Rs, Rls, Rrs, Sub)

Note: future soundfield configurations must be defined and standardized in order to be supported. The soundfield configuration shall be described in the metadata that is carried with the audio track file. A registry of soundfield configurations is in process in SMPTE 30MR and shall be used for this purpose.

2.3.1.17. Channel layout and mapping Note: SMPTE 428-3 is not applicable to IMF and should not be referenced. 428-3 is specific to digital cinema, is an early document, and will be revised in the future. In addition, SMPTE 31FS- 10 is working on defining channel labels using (future) registered UL’s, and this will play a part in all multichannel audio specs once ratified. IMF V.1 shall have a restricted, pattern based layout such that the audio channel layout is always in the same order for a given soundfield configuration e.g. 5.1 (6 channel) would always be L, R, C, Sub, Ls, Rs). Note: These layouts would need to be defined, possibly in an annex of the IMF spec. IMF V.2 shall include the capability of determining track layout by audio file labeling and/or track file metadata. This shall reference and be compliant with the 31FS-10 multichannel audio spec.

2.3.1.18. Additional Audio Items and Concerns

2.3.1.19. Partitions The following partitions shall be supported • Continuous (full length) • Reels • Parts

2.3.1.20. Audio Editing Granularity • The audio editing granularity shall have frame accuracy. • An IMF authoring tool shall have granularity to the sample.

43 Private Draft Document. Not for Publication

2.3.1.21. Crossfades • There shall be no automatic or automated audio crossfades in the IMF CPL at playout or transcoding. • Rendered audio shall be used to perform difficult crosses between content. Note: these would be created in the audio editing process and rendered in the authoring process. • Audio Fade up or fade outs shall not be supported.

2.3.1.22. Audio Editing and Transitions • Only hard cuts (“butt splices”) shall be supported. • Audio editing and transitions shall support only straightforward cuts and inserts rather than complex editing.

2.3.1.23. Audio Insert Considerations • Audio inserts may be of different length than corresponding picture insert to facilitate proper audio flow across edits.

2.3.1.24. Audio Processing • Audio Dynamic Range Control shall not be supported. • “Normalization” of audio shall not be supported. • Automatic Gain Control and Volume Changes shall not be supported.

2.3.1.25. PARKING LOT ITEMS

2.3.1.26. Metadata • Metadata has yet to be fully addressed in the audio group

2.3.1.27. Wrapping • Wrapping has yet to be addressed in the audio group

2.3.1.28. File Naming • File naming has yet to be addressed in the audio group

2.3.1.29. Composition Play List • The CPL has yet to be addressed by the audio group. Some notes and comments have been made in chapter 7.

2.3.2. Structural Metadata The audio group has not addressed metadata specifically as of yet-however, references to places where metadata would be needed are included in the text of 3.3.1 above.

44 Private Draft Document. Not for Publication

3. Data Essence

3.1. Overview

3.1.1. Introduction The Data Essence…

3.1.2. Data Essence System Overview For the purpose of documenting the specific requirements and specifications for Data Essence, it is helpful to divide the system into a set of components. The specifications and requirements for each of these components will be described in the following sections:

• Subtitles o Sub picture – The pre-rendered open text specification and file format o Timed Text – The Timed Text data specification and file format • Captions - The Captions specification with similar categories as Subtitles above. o Sub picture – The pre-rendered open text specification and file format o Timed Text – The Timed Text data specification and file format

3.1.3. Major Data Essence Concepts Fill in the blanks……

3.1.4. Data Essence Fundamental Requirements

3.1.4.1. Common File Formats The Essence and Data Essence shall use a common standardized file format for each element (image, audio, subtitles, etc.). The image essence file format shall be an SMPTE-conformant file based on existing SMPTE standards. The audio track file format shall be based on Broadcast Wave; defined in the Glossary of Terms (Chapter 8), and referred to in Common File Formats (2.1.4.1) and Audio File Format (2.3.1.1). The Subtitle essence should be based on PNG and XML file formats.

3.1.4.2. Frame Rates The image structure shall support a frame rate of 24.000 and 23.976 Hz. The image essence structure can also support a frame rate of 59.94 Hz. The frame rate of any individual IMF source master shall remain constant. Metadata shall be carried in the image data file format to indicate the frame rate.

3.1.4.3. Synchronization Files within the image and/or audio track shall carry information to provide for frame-based synchronization between each file. At a minimum, they shall include a start of file and a continuous frame count.

45 Private Draft Document. Not for Publication 3.2. Subtitle Specification

3.2.1. Sub picture

3.2.2. Timed Text

3.3. Caption Specification

3.3.1. Sub picture

3.3.2. Timed Text

46 Private Draft Document. Not for Publication

47 Private Draft Document. Not for Publication

4. Dynamic Metadata

4.1. Overview

4.1.1. Introduction The Metadata…

4.1.2. Dynamic Metadata System Overview For the purpose of documenting the specific requirements and specifications for Metadata, it is helpful to divide the system into a set of components. The specifications and requirements for each of these components will be described in the following sections:

• Time code – The specification and file format for Time code Dynamic Metadata • Pan and Scan – The specification and file format for Image Pan and Scan Dynamic Metadata • Color Correction – The specification and file format for Image Color Correction and Scan Dynamic Metadata • Dynamic Down-mixing – The specification and file format for Audio Dynamic Down-mixing Dynamic Metadata shall not be supported.

4.1.3. Major Dynamic Metadata Concepts Fill in the blanks……

4.1.4. Dynamic Metadata Fundamental Requirements

4.1.4.1. Common File Formats The Dynamic Metadata shall use a common standardized file format for each element (Time code, aux data, etc.). The time code file format shall be an MXF-conformant file based on existing SMPTE standards.

4.1.4.2. Frame Rates The time code structure shall support a frame rate of 24.000 (or 23.976) Hz. The time code structure should also support a frame rate of 60.00 (or 59.94) Hz. The frame rate of any individual IMF source master shall remain constant.

4.2. Time code Specification

4.3. Pan and Scan Specification

4.4. Color Correction Specification

48 Private Draft Document. Not for Publication

49 Private Draft Document. Not for Publication

5. Wrapping

5.1. Overview

5.1.1. Introduction The IMF, as stated in the System Overview, is a collection of files, such as picture essence files and audio track files. These files, as they stand by themselves, do not represent a complete presentation. Synchronization tools, asset management tools, Metadata, content protection and other information are required for a complete presentation to be understood and played back as it was intended. This is especially important when the files become compressed and/or encrypted and are no longer recognizable as image essence or audio essence in this state. Wrapping is a way to organize and wrap this material in such a way as to make it suitable for storage and movement to its destination. In seeking a common interchange standard for Digital Video between facilities and equipment, it is understood that there may be multiple sources of content. This will require special consideration to achieve IMF interchange. Thus, an interchange wrapping structure is needed that operates across several domains. The section also provides a set of requirements for the Material eXchange Format (MXF) track file encryption and creation.

5.1.2. Wrapping System Overview For the purpose of documenting the specific requirements and specifications for Wrapping, it is helpful to divide the system into a set of components. The specifications and requirements for each of these components will be described in the following sections:

• Track Files – The specification and file format for the creation of Track Files o Encryption - The specification for encrypting Essence, Data Essence and Dynamic Metadata if required. o Essence – The specific specifications and file format for wrapping Essence files o Data Essence– The specific specifications and file format for wrapping data essence files o Dynamic Metadata– The specific specifications and file format for wrapping Dynamic Metadata files

5.2. Wrapping Fundamental Requirements

5.2.1. Common File Formats The wrapping process shall use a common standardized file format for each type of element (Image, audio, Time code, aux data, etc.). This format shall be an MXF-conformant file based on existing SMPTE standards.

5.2.2. Frame Rates The time code structure shall support a frame rate of 24.000 (or 23.976) Hz. The time code structure should also support a frame rate of 60.00 (or 59.94) Hz. The frame rate of any individual IMF source master shall remain constant.

50 Private Draft Document. Not for Publication 5.3. Wrapping Overview

5.3.1. Introduction The Audio and Image Track Files are the fundamental element in the IMF system. The audio and Image Track File structure and requirements, are defined by the Essence, Data Essence or Dynamic Metadata information they contain. Each of these Essence, Data Essence or Dynamic Metadata containers could be image, sound, subtitle (Timed Text, and/or sub picture), caption or auxiliary data. However, each track file follows the same basic file structure. A track file consists of three logical parts: the File Header, the File Body and the File Footer as shown in Figure 7 below.

Figure 8 - Example Track File Structure

The file structure is further broken down into logical data items as defined in [SMPTE 336M Data Encoding Protocol using Key-Length-Value]. The KLV Coding Protocol is composed of Universal Label (UL) identification Key (UL Key), followed by a numeric Length (Value Length), followed by the data Value as shown below in Figure 8 below. One or more of these data items are combined to form the logical parts shown above.

Length

UL Key Length Value

16 bytes BER Variable Length Octets

Figure 9 - Example of KLV Coding

5.3.2. Format Information Each track file shall be a self-contained element, such that it’s Essence or Metadata can be understood and presented as it was packaged by a compliant decoder. The information shall be located in the predetermined specified area. The Track File is to contain the following minimum information:

• Required Metadata for unique asset identification • Required Metadata for decompression (optional) • Required Metadata for decryption (optional) The following information shall be configured in a human readable format:

• Essence physical format description. (e.g., 4096 x 2160) • Essence title asset information. (e.g., The_Perfect_Movie_English_ R2)

51 Private Draft Document. Not for Publication

5.3.3. Sequence A Sequence is a conceptual period of time having a specific duration.

• Track Files are required to be associated with a particular Sequence. • A Track File shall not cross over a Sequence boundary that is a playable portion of a Track File, between the mark in and mark out points. • Sequence’s are required to be composed of one or more Track Files. • The minimum duration of a Track File within a sequence shall be an integer number of frames, such that the length is greater than or equal to one (1) frame.

5.3.4. Track File Replacement A Track File is the smallest unit that can be managed or replaced as a discrete file in the field.

5.3.5. Synchronization Each Track File shall contain the following synchronization information:

• Start of Essence Data (mark in) • End of Essence Data (mark out) • Track File Frame Count • Frame Rate • Internal Synchronization

5.3.6. Splicing Track Files, of the same Essence type, are required to allow for seamless or click-free splicing to create a continuous data stream for a presentation. The playback system shall be able to perform sample accurate or click free splicing of audio track files.

5.3.7. Key Epoch A Key Epoch is the period of time during which a given Decryption Key is effective. The Key Epoch is one Reel.

5.3.8. Security Each Track File shall provide for encryption and methods to authenticate the data, if the content provider chooses to use such methods. In addition:

• The Essence container shall allow encrypted data, while the rest of the Track File Metadata is left unencrypted. • At any point in the delivery chain, it shall be possible to detect whether any accidental or intentional alteration has occurred.

5.3.9. Integrity and Authentication Each Track File shall provide a method for verification of file integrity that can be easily determined at any step of the delivery process. In addition:

• It is encouraged that missing or corrupted data be easily identified.

52 Private Draft Document. Not for Publication

• Track Files are encouraged to be subdivided into smaller segments, which have individual authenticity/error-check codes. This facilitates a decision as to whether the file is so corrupt it cannot be played, or whether it is safe to proceed with play out while requesting a replacement Track File. • Synchronization with other Track Files is encouraged to be verifiable.

5.3.10. Extensibility The Operational Pattern shall accommodate future extensions within its original scope.

5.3.11. Random Access and Restarts The Operational Pattern shall support random access to the nearest integer minute. Random access to individual frames is neither required nor desired. A restart occurs as a result of a stop or pause in the system while executing a Composition PlayList. The system may be restarted at any frame prior to the frame at which it was stopped or paused. It is required that a restart be logged by the Security Manager, provided the Essence (either image, audio or subtitle) is encrypted.

5.3.12. Simple Essence A track file shall contain essence of a single Essence type (e.g., audio, image, subtitles). While a Track File can, for instance, contain all audio channels for a given language, additional languages are required to be stored in separate track file. The Composition PlayList will select the correct Track Files to play a requested version of the movie (composition).

5.4. MXF Track File Encryption

5.4.1. Introduction The following are requirements for MXF Track File Encryption. For the purpose of this section, a frame is defined as an image frame time, for example 24 FPS or 48 FPS.

• The encryption shall support KLV encoding as specified in SMPTE 336 Data Encoding using Key- Length-Value. • The encrypted sound or picture Track File shall be a valid MXF file. • KLV packets shall be encrypted using the Advanced Encryption Standard (AES) block cipher algorithm using a 128 bit key operating in Cipher Block Chaining (CBC) mode, as defined in NIST SP 800-38A. See National Institute of Standards and Technology FIPS 197 (November 26, 2001). Advanced Encryption Standard. • Each shall use a single cryptographic key for all frames within the sound or picture Track File. • The encryption method shall support random access to any frame of essence within the sound or picture Track File. • The integrity of each frame of sound and picture essence shall be verifiable using the HMAC-SHA1 algorithm. • There shall be a method for verifying that all frames within a sound and picture track are played in correct sequence. • The Track File encryption method shall allow for the common header data within each frame of Essence to be plain text. In other words, the encryption of each frame of essence shall have a programmable offset of “n” bytes such that common header data is left as plaintext.

53 Private Draft Document. Not for Publication

• There shall be a method for verifying that the correct cryptographic key is used and the essence is being decrypted correctly. • A reference decryption model shall be specified. • Track File encryption shall not require encryption of all frames within a file. • Track File encryption shall be independent of the nature of the underlying essence and associated metadata. • No information shall be lost in the Track File encryption process.

5.4.2. Encrypted Track File Constraints Encrypted Track Files shall follow the same specification as plain-text Track Files, with the following additional constraints. The following are included in the Encrypted Track File Constraints:

• Encrypted Essence Track – A single cryptographic key, and hence, Cryptographic Context, shall be used to encrypt any given Essence track. • Cryptographic Framework Descriptive Metadata Track – Track Files may contain one or more Descriptive Metadata Tracks in the MXF File Package that describes the Essence of the Track File. • Index Tables – In a plain-text Track File, each Index Table entry locates a Triplet containing a single frame of picture Essence. Similarly, in an encrypted Track File, each Index Table entry shall point to an Encrypted Triplet wrapping a single Triplet, itself containing a single frame of picture Essence.

K L V

encryption / decryption

Cryptographic Cryptographic K’ L’ K L E(V) information information (opt)

V’

Figure 10 - Correspondence between Source and Encrypted Triplets1

Figure 9 above illustrates the correspondence between a plaintext KLV triplet and an Encrypted KLV Triplet. The value V of a source plaintext KLV Triplet is first encrypted to yield E(V). This encrypted value, E(V), along with K and L, is wrapped in a K’L’V’ Encrypted Triplet. K' is a unique label common to all Encrypted KLV Triplets, independent of their content. L' refers to the full length of V'. V' consists of K, L and E(V) from the source KLV Triplet as well as cryptographic information specific to the Encrypted KLV Triplet. This cryptographic information includes, for instance, the initialization vector used in generating E(V) and the Message Integrity Code used to verify the integrity of the Triplet.

1 Red hatching depicts the encrypted portion of the Encrypted Triplet. Other items are left in the clear. Only the value item of Source Triplet is encrypted, allowing the Essence information to be encrypted prior to wrapping. See “Encrypted Triplet” for a description of the cryptographic information associated with each Encrypted Triplet.

54 Private Draft Document. Not for Publication 5.5. Image Track File

5.5.1. Introduction An Image Track File contains the image Essence data and its associated Metadata. Each Image Track File can contain compressed and encrypted image data. The following are requirements for an Image Track File.

5.5.2. Frame Boundaries The Image Track File shall begin and end with complete frames that allow for splicing. Frames are defined to be image frames such as 24 FPS (1/24 sec) or 48 FPS (1/48 sec).

5.5.3. Compression The Track File shall support Constant Bit Rate (CBR) compression and Variable Bit Rate (VBR) compression, within the constraints of the specified code stream for the reference decoder

5.5.4. Metadata The following Metadata shall be furnished with the Image Track File:

• Unique ID • Unique ID of corresponding plaintext track if encrypted • Track type (i.e., mage) • Total width in pixels • Total height in pixels • Aspect Ratio • Frame Rate • Frame count number (duration)

5.6. Audio Track File

5.6.1. Introduction An Audio Track File contains the audio Essence data and its associated Metadata. The following are requirements for an Audio Track File.

5.6.2. Frame Boundaries The Audio Track File shall begin and end with complete frames that are associated with its Image Track File to allow for a clean transition between reels.

5.6.3. Data Packing Format The Track File shall support uncompressed audio data.

5.6.4. Metadata The following Metadata shall be furnished with the Audio Track File:

• Unique ID

55 Private Draft Document. Not for Publication

• Unique ID of corresponding plaintext track encrypted • Track type (i.e., audio) • Audio Sampling Frequency • Quantization bits (sample size) • Channel Count • Channel Mapping Labels • Data Packing Format • Frame Rate • Audio Frame count number (duration)

5.7. Subtitle Track File

5.7.1. Introduction A Subtitle Track File contains, for example, the Subtitling Essence data and its associated Metadata. Each Subtitle Track File may contain any combination of text, font references, and image references.

5.7.2. Frame Boundaries The Subtitle Track File shall have the same duration as the playable region of its associated Image Track File.

5.7.3. Timed Text Any Timed Text element shall use an Open Type font.

5.7.4. Sub picture Sub picture elements are required to use the PNG file format.

5.7.5. Metadata The following Metadata shall be furnished with the sub picture Track File:

• Unique identification • Track Type (i.e., Timed Text, sub picture) • Total Width In Pixels of the Image Track File (PNG files only) • Total Height In Pixels of the Image Track File (PNG files only) • Aspect Ratio (PNG files only) • Frame Rate • Position • Timing (Temporal)

56 Private Draft Document. Not for Publication 5.8. Captions Track File

5.8.1. Introduction A Caption Track File contains, for example, the Captions Essence data and its associated Metadata. Each Captions Track File may contain any combination of text, font references, and image references.

5.8.2. Frame Boundaries The Captions Track File shall have the same duration as the playable region of its associated Image Track File.

5.8.3. Timed Text Any Timed Text element shall use an Open Type font.

5.8.4. Sub picture Sub picture elements are required to use the PNG file format.

5.8.5. Metadata The following Metadata shall be furnished with the sub picture Track File:

• Unique identification • Track Type (i.e., Timed Text, sub picture) • Total Width In Pixels of the Image Track File (PNG files only) • Total Height In Pixels of the Image Track File (PNG files only) • Aspect Ratio (PNG files only) • Frame Rate • Position • Timing (Temporal)

5.9. Time Code Track Files (Optional)

5.9.1. Introduction A Time Code Track File contains, for example, the Unicode text data or any other data or Metadata that belongs in a separate track for functional purposes. The following are requirements for a Time Code Track File.

5.9.2. Frame Boundaries The Time Code Track File is encouraged to begin and end with complete frames that are associated with its Image Track File to allow for a clean transition between reels.

5.9.3. Metadata The following Metadata shall be furnished with the Time Code Track Files:

• Unique identification

57 Private Draft Document. Not for Publication

• Track Type (i.e., auxiliary) • Frame Count Number • Text Format (If Applicable) • Cue Names (If Applicable)

5.10. Pan and Scan Track Files (Optional)

5.10.1. Introduction A Pan and Scan Track File contains, for example, the Unicode text data or any other data or Metadata that belongs in a separate track for functional purposes. The following are requirements for a Pan and Scan Track File.

5.10.2. Frame Boundaries The Pan and Scan Track File is encouraged to begin and end with complete frames that are associated with its Image Track File to allow for a clean transition between reels.

5.10.3. Metadata The following Metadata shall be furnished with the Pan and Scan Track Files:

• Unique identification • Track Type (i.e., auxiliary) • Frame Count Number • Text Format (If Applicable) • Cue Names (If Applicable)

5.11. Color Correction Track Files (Optional)

5.11.1. Introduction A Color Correction Track File contains, for example, the Unicode text data or any other data or Metadata that belongs in a separate track for functional purposes. The following are requirements for a Color Correction Track File.

5.11.2. Frame Boundaries The Color Correction Track File is encouraged to begin and end with complete frames that are associated with its Image Track File to allow for a clean transition between reels.

5.11.3. Metadata The following Metadata shall be furnished with the Color Correction Track Files:

• Unique identification • Track Type (i.e., auxiliary) • Frame Count Number

58 Private Draft Document. Not for Publication

• Text Format (If Applicable) • Cue Names (If Applicable)

59 Private Draft Document. Not for Publication

7. The Composition

5.12. Introduction The Composition represents a complete self-contained digital media program. This Composition maybe a feature, episode, trailer, advertisement or any other single piece of content. A composition usually consists of at a minimum a Composition PlayList (CPL) and one or more Track Files, which contain the actual program Essence. A very simplified example of the Hierarchical Structure for a Composition is given below.

5.13. Functional Framework For the purpose of documenting the specific requirements for a Interoperable Master Format Packaging system, it is helpful to divide the system into a set of components. The performance requirements for each of these components will be described in the following sections:

• Composition – A self-contained representation of a single complete Digital Video work, such as a motion picture, or a trailer, or an advertisement, etc. • Composition PlayList - The instructions to a play out device or a transcoder on how to assemble the Track Files together to create a version of the program. (Similar to an Edit Decision List) • Composition Security – The specifications and requirements of a Security framework the end users may or may not take advantage of.

5.14. The Composition Fundamental Requirements

5.14.1. Open Standard The Composition standard shall be based upon an open worldwide standard. This format is encouraged to be a license-free technology. It shall be a complete standard that equipment receiving a compliant Composition can process and interpret unambiguously.

5.14.2. Interoperable The Composition format shall have an open framework that accommodates compressed, encrypted files as well as all other files used in Digital Video.

5.14.3. Scalable The Composition format shall accommodate any number of Essence or Metadata components. There is no limit on the number of files included in the package or the size of the files.

5.14.4. Extensible The Composition format shall allow for new Digital Video features (compositions) to be contained within the package.

5.14.5. Synchronization The Composition format shall provide support for synchronization of the Essence and Metadata elements.

60 Private Draft Document. Not for Publication

5.14.6. Human Readable Metadata Human readable Metadata shall be in English (default) but can be provided in other languages as well.

5.14.7. File Format The Composition PlayList shall use the secure (digitally signed) text-based XML file format. More specifically the structures defined in this document are represented using the Extensible Markup Language (XML) [XML 1.0], and specified using XML Schema [XML Schema Part 1: Structures] and [XML Schema Part 2: Datatypes]. This specification shall be associated with a unique XML namespace name [Namespaces in XML]. The namespace name shall be the string value “http://www.tbd.org“. This namespace name conveys both structural and semantic version information, and serves the purpose of a traditional version number field. Table 6: XML Namespaces lists the XML namespace names used in this specification. Namespace names are represented as Uniform Resource Identifier (URI) values [RFC 2396]. These values shall be considered as simple strings, and applications should not attempt to resolve them as URLs.

Table 6: XML Namespaces Qualifier URI cpl http://www.tbd.org/date xs http://www.w3.org/2001/XMLSchema ds http://www.w3.org/2000/09/xmldsig

The namespace qualifier values (namespace prefixes in XML jargon) used in this document (cpl, xs, ds), are not normative values. Implementations shall perform correctly with any XML compliant namespace prefix value that is associated with a URI from Table 6: XML Namespaces, above. Datatypes from other schemas that are used in this document will be prefixed with the appropriate namespace qualifier (e.g. xs:dateTime). See [XML Schema Part 2: Datatypes] and [XML-Signature Syntax and Processing] for further information about these types. The MIME type [IETF RFC 2046] for a document containing a single Composition PlayList element as its root shall be “text/xml”.

5.15. CPL Constraints The following is a list of items that are intended to be constrained within the Composition and therefore reflected in the CPL.

5.15.1. Constant Frame and Sample Rate The Composition shall only consist of similar frame rate and sample rate content in all of the included track files. Another words all track files shall have the same edit rate, sample rate and frame rate.

5.15.2. Minimum Track File Duration The duration of any asset contained in a Sequence, as indicated by the Duration and IntrinsicDuration elements, shall be no less than one frame.

5.15.3. Minimum Sequence Duration The duration of any Sequence shall be no less than one second.

61 Private Draft Document. Not for Publication 5.16. Terminology The following terms are used to describe the features of this specification.

Table 7: Terms and Definitions Term Definition

Sequence A single or contiguous set of Track Files intended to be reproduced sequentially within a Composition

Composition A complete artistic or informational motion picture work, such as a feature, episode, trailer, or an advertisement, etc.

Essence The sound, picture and data resources that ultimately are intended for a viewing and/or listening experience.

Editable Unit The smallest temporal increment of access to Essence, e.g. a frame or a sample.

Edit Rate A number of Editable Units to be reproduced during a temporal interval having a duration of exactly one (1.0) second. Because Edit Rate values are not always integer values and sometimes require many digits of precision, Edit Rate values are expressed as a rational number (the ratio of two integers).

Native Duration The total number of Editable Units in a Track File.

Native The first Editable Unit of a Track File. All Track Files are viewed by a Composition Start Point PlayList as a sequence of Editable Units numbered from 0 (zero). Consequently, the Editable Unit number of the Native Start Point of a Track File will always be 0 (zero).

Native The last Editable Unit of a Track File. End Point

Playable The set of Editable Units within a Track File that are intended to be reproduced as part Region of a Composition. A Track File may contain Editable Units before and/or after the Playable Region.

Track File A file containing a single Essence, such as sound, picture or subtitle essence.

Sample Rate The number of essence samples per second. Sample Rate values are expressed as a rational number (the ratio of two integers)

Frame Rate The number of frames per second. Frame Rate values are expressed as a rational number (the ratio of two integers)

5.17. Synchronization The Composition PlayList defines an idealized playback timeline using XML to provide information to the transcoder or playback device describing the order and parameters used to manipulate the files. As depicted in first in Figure 1 previously, and now below in Figure 10, the timeline consists of a contiguous set of Sequences that are spliced together, which fit within the region of a complete Composition.

62 Private Draft Document. Not for Publication

Figure 11 - Composition PlayList sequence

Continuing with this hierarchical method, as depicted in Figure 11, the expanded Sequence consists of a set of synchronized Track Files along with their associated Metadata Track Files.

Figure 12 - Timing relationships within a Sequence

For example, the CPL identifies a segment of a Track File to be reproduced in parallel with the other Track Files in the same Sequence. The Asset’s Entry Point and Duration parameters define the order of Editable Units within the Track File that is to be reproduced (the Playable Region). Track Files are aligned with offsets from their Native Start Points from which they are then synchronized and shall be reproduced simultaneously. Using this same method Figure 12 illustrates the timing relationship between Track Files within a Sequence.

63 Private Draft Document. Not for Publication

Figure 13 - Sequence Sync Relationship Example

At the start of a given Sequence, playback of all Track Files contained within that Sequence shall start simultaneously at the Entry Point given for each respective Track File. The duration of a Sequence shall be equal to the total duration of the Sequence’s MainPicture Track Files, or the Track Files with the shortest duration if the MainPicture Track File is not present.

64 Private Draft Document. Not for Publication

5.18. Composition PlayList Structure Composition PlayList shall be represented by a unique XML element, the CompositionPlaylist element. The Composition PlayList shall be encoded using the UTF-8 character encoding [XML 1.0]. The Composition PlayList shall include the following fields unless stated as optional in which case they may be included. An example of a Composition PlayList is provided in Chapter 9.

5.18.1. General Information The Composition PlayList should provide the following general information to allow quick access to the information about the composition. This should allow either humans or automated systems to source the general information. Below is the list of the both the informative information and required information to complete the structure of the Composition PlayList. Some items are optional and therefore not required for a compliant Composition PlayList. Other sections are described in more detail in later sections.

• Unique ID • Annotation Text [optional] • Issue Date • Issuer [optional] • Creator [optional] •

• Content Title Text [optional] • Content Kind (e.g., Feature, Trailer, Logo, Advertisement, Episode) • Content Version o ID [optional] o Label Text [optional] • Content Description o Source Media Description [optional] o Frame Rate o Timecode Type [optional] o Image Encoding Type o Audio Sample Rate o Audio Samples/Frame o Soundfield Configuration o Audio Bit Depth o Subtitle Type o Caption Type • Media File Size • Total Running Time • Language

65 Private Draft Document. Not for Publication

• Country • Rating [optional] o Agency o Label • Encryption (yes/no) • Sequence List • Signer [optional] • Signature [optional]

5.18.2. Unique Id The Id element uniquely identifies the Composition PlayList for asset management purposes. It shall not uniquely identify the content represented by the composition. It shall be encoded as a urn:uuid per [RFC 4122].

5.18.3. AnnotationText [optional] The AnnotationText element shall be a free-form, human-readable annotation describing the composition. It is meant strictly as a display hint to the user. The optional language attribute is an xs:language language code and indicates the language used for the text. If the language attribute is not present, the default value en shall be used.

5.18.4. IssueDate The IssueDate element shall be used to define the time and date at which the Composition PlayList was issued. It may be displayed to the user. It shall be encoded as an xs:dateTime.

5.18.5. Issuer [optional] The Issuer element shall be a free-form, human-readable annotation that shall identify the entity that created the Composition PlayList. It is meant strictly for display to the user. The Signer element defined in Section 5.18.19 shall be used to identify the entity that digitally signed the Composition PlayList. The optional language attribute is an xs:language language code and indicates the text language of the content of the element. If the language attribute is not present, the default value en shall be used.

5.18.6. Creator [optional] The Creator element shall be a free-form, human-readable annotation that shall identify; the application used to create the Composition PlayList, the Facility that created the CPL and the operator that created the CPL. It is meant strictly for display to the user. The optional language attribute is an xs:language language code and indicates the text language of the content of the element. If the language attribute is not present, the default value en shall be used.

5.18.7. ContentTitleText The ContentTitleText element shall contain a human-readable title for the composition, e.g. The Jazz Singer. It is strictly meant as a display hint to the user. The optional language attribute is an xs:language language code and indicates the language of the content of the element. If the language attribute is not present, the default value en shall be used.

66 Private Draft Document. Not for Publication

5.18.8. ContentKind The ContentKind element defines the kind of material referred to by the Composition PlayList. It is meant to be both human and machine-readable. Table 8: Examples of Content Kind, below, shows examples of Content Kind.

Table 8: Examples of Content Kind Kind Description feature A theatrical feature. trailer Short (2 to 3 minutes) content promoting an upcoming theatrical feature. test Content used to test, calibrate or setup equipment. teaser Very short (typically less than 1 minute) content promoting an upcoming theatrical feature. rating Slate/still picture indicating the recommended age group permitted to view the content to follow. This rating is generally unique per country. advertisement Content promoting a product or service other than an upcoming feature. short Non advertising/promotional content (3 to 15 minutes) typically before a theatrical feature. transitional Extremely short content (1 to 15 seconds) separating unrelated compositions. psa Public service announcement.

5.18.9. ContentVersion This defines the version of the content referred to by the composition, as opposed to the Composition PlayList Id element which uniquely identifies an instance of the Composition PlayList. Thus there may be two distinct compositions, with distinct CompositionPlaylist Id values, that refer to the same content and thus have the same ContentVersion Id value. This may occur, for example, if a composition is distributed to supersede a previous version. Similarly, while two compositions may share the same content title, they may refer to two different versions, such as French (dubbed) and French (original), and therefore have two distinct ContentVersion Id values. The ContentVersion element is meant to assist both users and software in scheduling and tracking content.

5.18.9.1. Id The Id element shall identify the content contained in the Composition PlayList. It shall be a valid URN, per [RFC 2141].

5.18.9.2. LabelText The LabelText element shall be a human readable label, e.g. “French (1.85 picture, 16.1 sound, dubbed)”, describing the content. The optional language attribute is an xs:language language code and indicates the text language of the content of the element. If the language attribute is not present, the default value en shall be used.

5.18.10. ContentDescription This describes the content referred to by the composition. Most of this information is also contained in the metadata of the Track Files. It is provided here in human readable form for informative purposes only. Machines should not rely on this information for transcoding or Play-Out and in all cases the

67 Private Draft Document. Not for Publication

Track File metadata takes priority over information provided here. The ContentDescription element is meant to assist both users and software in scheduling and tracking content.

5.18.10.1.SourceMediaDescription [optional] The SourceMediaDescription element shall be human readable text, e.g. “HDCamSR (1.85 picture, 16.1 sound, dubbed)”, describing the content source. The optional language attribute is an xs:language language code and indicates the text language of the content of the element. If the language attribute is not present, the default value en shall be used. Further more an element ID and LabelText can be added to the description and is specified below.

5.18.10.2. Id [optional] The Id element shall identify the content contained in the Composition PlayList. It shall be a valid URN, per [RFC 2141].

5.18.10.3. LabelText [optional] The LabelText element shall be a human readable label, e.g. “French (1.85 picture, 16.1 sound, dubbed)”, describing the content. The optional language attribute is an xs:language language code and indicates the text language of the content of the element. If the language attribute is not present, the default value en shall be used.

5.18.10.4.FrameRate The FrameRate element shall contain the number of frames per second of the intended playback rate of the MainPicture element. (ex. 23.98 or 59.94) This is informative information to allow for humans and machine to read and if necessary manipulate the MainPicture asset of the Composition to conform to the FrameRate.

5.18.10.5.TimecodeType [optional] The TimecodeType element shall contain information of the timecode format of the MainPicture element. This is informative information to allow for humans and machine to read and if necessary manipulate the MainPicture asset of the Composition to conform the picture information to the Compression specification.

5.18.10.6.ImageEncodingStandard [optional] The ImageEncodingStandard element shall contain information of the encoding parameters of the MainPicture element. This is informative information to allow for humans and machine to read and if necessary manipulate the MainPicture asset of the Composition to conform the picture information to the encoding specification.

5.18.10.7. StandardsBody [optional] The StandardsBody element shall contain a URI [RFC 2396] that uniquely identifies the Standards Body issuing the Standard.

5.18.10.8. Label [optional] The Label element shall contain a textual representation of the Standards Body, which may be displayed to the user. For each issuing Standards Body, and hence unique URI, there are a number of permissible Label values. An example of such is shown below in Table 4. The specification of this mapping is beyond the scope of this document.

68 Private Draft Document. Not for Publication

5.18.10.9.AudioSamplingRate The AudioSamplingRate element shall contain the integer of the sampling rate of the MainSound element. (ex. 48 or 96 ) This is informative information to allow for humans and machine to read and if necessary manipulate the MainSound asset of the Composition to conform to the AudioSamplingRate.

5.18.10.10. AudioConfig The AudioConfig element shall contain the Channel configuration of the MainSound element. (ex. LtRt, 5.1 etc.) This is informative information to allow for humans and machine to read and if necessary manipulate the MainSound asset of the Composition to conform to the AudioConfiguration.

5.18.10.11. AudioBitDepth The AudioBitDepth element shall contain the integer of bits per sample of the MainSound element. (ex. 16 or 24) This is informative information to allow for humans and machine to read and if necessary manipulate the MainSound asset of the Composition to conform to the AudioBitDepth.

5.18.11. SubtitleFormat [optional] The SubtitleFormat element defines the format of the MainSubtitle of material referred to by the Composition PlayList. It is meant to be both human and machine-readable.

5.18.12. CaptionFormat [optional] The CaptionFormat element defines the format of the MainCaption of material referred to by the Composition PlayList. It is meant to be both human and machine-readable.

5.18.13. TotalRunningTime The TotalRunningTime element shall be used to define the complete running time of the Composition at the instance when Composition PlayList was issued. It may be displayed to the user. It shall be encoded as an xs:hours:minutes:seconds:frames.

5.18.14. Language [optional] The Language element shall reflect the primary spoken or textual language of the Composition. The element value is encoded as an xs:language language code and indicates the primary language of the content.

5.18.15. Country [optional] The Country element shall reflect the primary area of intended distribution of the Composition. This shall use the ISO standard three-letter acronym for it designation.

5.18.16. RatingList [optional] The RatingList element shall contain an ordered list of zero or more Rating elements containing ratings associated with the composition. Each Rating element, shown in Table 4, contains an Agency and a Label element. Each element is meant to be both human and machine-readable. There shall be only one Rating element per given Agency.

69 Private Draft Document. Not for Publication

5.18.16.1.Agency The Agency element shall contain a URI [RFC 2396] that uniquely identifies the agency issuing the rating.

5.18.16.2.Label The Label element shall contain a textual representation of the rating, which may be displayed to the user. For each issuing agency, and hence unique URI, there are a number of permissible Label values. The specification of this mapping is beyond the scope of this document

Table 9: Example Ratings (Informative) Agency Labels http://www.mpaa.org/2003-ratings R, PG, PG-13, G, NC-17 http://rcq.qc.ca/2003-ratings G, 13+, 16+, 18+

5.18.17. Encryption The Encryption element defines the whether or not Encryption used on any portion of the Composition. It is meant to be both human and machine-readable.

5.18.18. SequenceList The SequenceList element shall contain an ordered list of Sequence elements to be reproduced in continuous order. The structure of the Sequence element shall be as defined in Section 5.19.

5.18.19. Signer [optional] The Signer element uniquely identifies the entity, and hence the public-private key pair, that digitally signed the Composition PlayList. It shall be an instance of the KeyInfoType type defined in [XML- Signature Syntax and Processing]. If the Signer element is present, then the Signature element shall also be present. If X.509 certificates are used per [XML-Signature Syntax and Processing], then the Signer element shall contain one X509Data element containing one X509IssuerSerial element, which uniquely identifies the certificate used to sign the Composition PlayList.

5.18.20. Signature [optional] The Signature element shall contain a digital signature authenticating the Composition PlayList. If the Signature element is present, then the Signer element (5.18.19 above) shall also be present. The Signature element shall be an instance of the ds:Signature element defined in [XML- Signature Syntax and Processing]. The digital signature shall be enveloped and apply to the entire Composition PlayList. An enveloped signature is one that is attached to the document being signed. The signature is generated by the signer, as identified by the Signer element, using the signer’s private key.

70 Private Draft Document. Not for Publication

5.19. Sequence Structure Each Sequence shall consist of a number of Assets, each corresponding to a particular aspect of the Digital Video presentation. While this specification defines a number of asset types, additional asset types may be added in the future. The structure is shown below in Figure 13.

Figure 14 - Sequence Structure

5.19.1. Sequence Information The Composition PlayList should provide the following sequence information to allow quick access to the information about the composition. This should allow either humans or automated systems to source the general information. Below is the list of the both the informative information and required information to complete the structure of the Sequence. Some items are optional and therefore not required for a compliant Composition PlayList. Other sections are described in more detail in later sections. . SequenceType o Unique ID o Annotation Text (Sequence Title) [optional] o Sequence Version [optional] o Issue Date [optional] o Issuer [optional] o AssetList . MainMarkers [optional] . MainPicture [optional] . PictureSpatial [optional] . MainSound [optional] . Other Asset [optional]

71 Private Draft Document. Not for Publication

5.19.2. SequenceType The SequenceType describes a Sequence asset intended to be reproduced as part of a Composition PlayList. Individual child elements are defined in the following subsections.

5.19.3. Unique Id The Id element uniquely identifies the Sequence for asset management purposes. It shall be encoded as a urn:uuid per [RFC 4122].

5.19.4. AnnotationText [optional] The AnnotationText element shall be a free-form, human-readable, text annotation associated with the Sequence. It meant strictly as a display hint to the user. The optional language attribute is an xs:languagelanguage code and indicates the text language of the content of the element. If the language attribute is not present, the default value en shall be used.

5.19.5. IssueDate The IssueDate element shall be used to define the time and date at which the Composition PlayList was issued. It may be displayed to the user. It shall be encoded as an xs:dateTime.

5.19.6. Issuer [optional] The Issuer element shall be a free-form, human-readable annotation that shall identify the entity that created the Composition PlayList. It is meant strictly for display to the user.

5.19.7. AssetList The AssetList element shall contain the ordered list of media assets that will be reproduced in parallel during Sequence playback. Each asset shall be uniquely identified by a UUID per [RFC 4122]. The defining specification for each Track File shall specify the location of the identifying UUID. Each child element of the AssetList element shall be derived from the datatype GenericAssetType defined in Section 5.20.2. A number of standard assets are defined in the subsequent sections. As additional assets are defined, e.g. captions, the AssetList element may be extended by introducing new elements.

5.19.7.1. MainMarkers [optional] The MainMarkers element defines markers, e.g. FFOC, LFOC…, associated with the main portion of the presentation, i.e. MainPicture and MainSound assets. Markers shall be referenced from the start of the Sequence they are associated with. The MainMarkers element shall be an instance of MarkersAssetType and its structure is described in more detail in Section 5.20.4.

5.19.7.2. MainPicture [optional] The MainPicture element defines the picture essence to be delivered to the Transcoder or Playback Display device. The actual picture essence is contained in an external Track File. The MainPicture element shall be an instance of PictureTrackFileAssetType and its structure is defined in Section 5.20.5.

72 Private Draft Document. Not for Publication

5.19.7.3. PictureSpatial [optional] The PictureSpatial element defines the picture spatial metadata to be delivered to the Transcoder or Playback Display device. This provides metadata such as Pan and Scan, Zoom, Matte and Cropping information. The actual picture spatial metadata is contained in an external Track File. The PictureSpatial element shall be an instance of PictureMetadataTrackFileAssetType and its structure is defined in Section 5.20.6.

5.19.7.4. MainSound [optional] The MainSound element defines the sound essence to be reproduced in the Transcoder or Playback system. The actual sound essence is contained in an external Track File. The MainSound element shall be an instance of SoundTrackFileAssetType and its structure is defined in Section 5.20.7.

5.19.7.5. MainSubtitle [optional] The MainSubtitle element defines the Subtitle essence to be reproduced on the main screen in the Transcoder or Playback system. The actual Subtitle essence is contained in an external Track File. The MainSubtitle element shall be an instance of SubtitleTrackFileAssetType and its structure is defined in Section 5.20.8.

5.19.7.6. MainCaption [optional] The MainCaption element defines the Captions data essence to be reproduced on the main screen in the Transcoder or Playback system. The actual Captions data essence is contained in an external Track File. The MainCaption element shall be an instance of CaptionTrackFileAssetType and its structure is defined in Section 5.20.9.

73 Private Draft Document. Not for Publication

5.19.7.7. Extensions (New Asset Types) Extension elements shall be used to represent asset types not defined in this document. Zero or more extension elements may be present in the AssetList. When present, extension elements shall be located after any elements defined by this document. When present, extension elements shall have names that belong to a namespace different than the namespace declared by this document. Implementations may ignore extension elements belonging to an unknown namespace. Extension elements shall directly or indirectly extend GenericAssetType. Extension elements may extend any type defined in this document which are derived from GenericAssetType. Informative note: Extension elements should have unique, descriptive names and should appear only once in a given Sequence. Extension specifications that allow multiple instances of an element in a Sequence should provide both a means of differentiating instances within a Sequence and a means of linking related instances in separate Sequences.

74 Private Draft Document. Not for Publication

5.20. Asset Structures Sequence and Clip Asset elements share common attributes, such as Duration, and are therefore specified as a set of types derived from a common structure, namely the GenericAssetType structure. A number of Asset elements also reference external files, in which case they are derived from the TrackFileAssetType structure.

5.20.1. Asset Information The Composition PlayList should provide the following Asset information to allow quick access to the information about the composition. This should allow either humans or automated systems to source the general information. Below is the list of the both the informative information and required information to complete the structure of the Asset. Some items are optional and therefore not required for a compliant Composition PlayList. Other sections are described in more detail in later sections.

• Generic Asset Type o Unique ID o Asset Title (Annotation Text) [optional] o Edit Rate o Intrinsic Duration o Entry Point [optional] o Duration [optional] • Track File Asset Type o Key ID [optional] o Hash File [optional] • Marker Asset Type o Marker List o Label o Annotation Text [optional] o Offset • Picture Track File Asset Type o Pixel Matrix o Bit Depth o Color Encoding o Frame Rate o Image Aspect Ratio • Picture Spatial Metadata Track File Asset Type o Standards Body [optional] o Label [optional] • Sound Track File Asset Type o Sampling Rate

75 Private Draft Document. Not for Publication

o Bit Depth o Number of Channels o Channel Configuration . Channel Labels o Language [optional] • Subtitle Track File Asset Type o Language [optional] • Caption Track File Asset Type o Language [optional]

5.20.2. GenericAssetType The GenericAssetType describes a generic asset intended to be reproduced as part of a Sequence or Clip. Individual child elements are defined in the following subsections.

5.20.2.1. Id The Id element uniquely identifies the Asset. It shall be encoded as a urn:uuid per [RFC 4122]. If the Asset refers to an external resource, such as a Track File, the UUID value shall identify the resource. Mapping of UUID values to actual resources is beyond the scope of this document.

5.20.2.2. AnnotationText [optional] The AnnotationText element shall be a free-form, human-readable text annotation associated with the asset. It meant strictly as a display hint to the user. The optional language attribute is an xs:languagelanguage code and indicates the text language of the content of the element. If the language attribute is not present, the default value en shall be used.

5.20.2.3. EditRate The EditRate element defines the Edit Rate of the Asset. It shall be in units of inverse seconds and represented as a Rational Number. The IntrinsicDuration, EntryPointand Duration parameters shall be expressed in units of 1/EditRate (i.e. as integer values). If the Asset refers to an external resource, EditRate may differ from the actual Edit Rate or Sample Rate of the underlying essence.

76 Private Draft Document. Not for Publication

5.20.2.4. IntrinsicDuration The IntrinsicDuration element shall define the Native Duration of the Asset, as illustrated in Figure 14. It shall not take into account the values of the EntryPoint and Duration elements. Unless the optional EntryPoint and Duration parameters are specified, playback of the asset shall start at the Native Start Point and terminate at the Native End Point of the Track File. IntrinsicDuration shall be expressed in units of 1/EditRate, i.e. as a count of Editable Units.

native native start point end point intrinsic duration

asset

• EntryPoint Duration

Figure 15 - Asset Timing Parameters.

5.20.2.5. EntryPoint [optional] This element shall only be present when the Asset refers to an external resource such as a Track File. The EntryPoint element identifies the Edit Unit where playback shall start (the first editable unit of the playable region). It shall be encoded as an integer number and shall be expressed in units of 1/EditRate, i.e. as a count of Editable Units. This element shall be required if the desired Entry Point is greater than 0 (zero). If this element is not present, a value of 0 shall be assumed and Asset playback shall start at the Native Start Point of the resource.

5.20.2.6. Duration [optional] This element shall only be present when the Asset refers to an external resource such as a Track File. The Duration element defines the duration of the Playable Region of the resource. It shall be encoded as an integer number and shall be expressed as an integer number in units of 1/EditRate, i.e. as a count of Editable Units. If present, this value shall be an integer between 0 (zero) and IntrinsicDuration – EntryPoint (the number of edit units between the EntryPoint and the Native End Point the Track File). If this element is not present, Asset playback shall stop after (IntrinsicDuration – EntryPoint)/EditRate seconds, i.e. at the Native End Point of the Asset.

5.20.3. TrackFileAssetType The TrackFileAssetType element shall be derived from GenericAssetType. It describes an asset based on an external file such as a Picture or Sound Track File.

5.20.3.1. KeyId [optional] The KeyId element uniquely identifies the cryptographic key used to encrypt the underlying track file. This element shall contain a key identifier encoded as a urn:uuid value. KeyId shall be present if any portion of the underlying track file is encrypted. The mapping of key identifiers to actual key values is beyond the scope of this document.

77 Private Draft Document. Not for Publication

5.20.3.2. Hash [optional] The Hash element shall contain the hash (message digest) of the underlying track file computed using the SHA-1 message digest algorithm [RFC 3174]. When authenticated by the digital signature in the Composition PlayList (see 5.18.20), it may be used to verify the integrity and authenticity of the underlying track file. The resulting 160-bit integer shall be encoded using Base64 representation [RFC 2045].

78 Private Draft Document. Not for Publication

5.20.4. MarkerAssetType The MarkerAssetType element shall be derived from the GenericAssetType. It describes the content markers, e.g. FFOC, associated with a Sequence. An instance of the MarkerAssetType is the MainMarkers element. Just as for picture and sound assets, the marker asset has a timeline. The Offset of each Marker (see Figure 6) is the position from the start of the timeline and the Intrinsic Duration of the timeline shall correspond to the Offset of the last Marker.

5.20.4.1. MarkerList The MarkerList element shall contain a list of Marker elements. The members of the Marker element are defined in the following subsections.

5.20.4.2. Label The Label element shall contain a textual representation of the marker. An optional scope attribute with default URI value of “http://www.smpte-ra.org/schemas/TBD/CPL#standard- markers” determines the permissible values of the element.

Table 10: Examples of Marker Labels Marker Description FFOC First Frame of Composition. The first frame of a composition that is intended for display. LFOC Last Frame of Composition. The last frame of a composition that is intended for display. FFTC First Frame of Title Credits. First displayable frame of content that contains any intensity of the Title Credits (a non zero alpha value), which appear at the beginning of a feature. LFTC Last Frame of Title Credits. Last displayable frame of content that contains any intensity of the Title Credits (a non zero alpha value), which appear at the beginning of a feature. FFOI First Frame of Intermission. LFOI Last Frame of Intermission. FFEC First Frame of End Credits. First displayable frame of content that contains any intensity of the End Credits (a non zero alpha value), which appear at the end of a feature. LFEC Last Frame of End Credits. Last displayable frame of content that contains any intensity of the End Credits (a non zero alpha value), which appear at the end of a feature. FFOB First Frame of Ratings Band. First displayable frame of content of the Rating Band, which is usually a slate at the beginning of a feature. LFOB Last Frame of Ratings Band. Last displayable frame of content of the Rating Band, which is usually a slate at the beginning of a feature. FFMC First displayable frame of content that contains any intensity of moving, rolling or scrolling credits (a non-zero alpha value), which appear at the end of the feature. LFMC Last displayable frame of content that contains any intensity of moving, rolling or scrolling credits (a non-zero alpha value), which appear at the end of the feature.

5.20.4.3. AnnotationText [optional] The AnnotationText element shall be a free-form, human-readable annotation associated with the marker. It meant strictly as a display hint to the user. The optional language attribute is a standard XML language code and indicates the text language of the content of the element. If the text language attribute is not present, the default value en shall be used.

79 Private Draft Document. Not for Publication

5.20.4.4. Offset The Offset element defines the absolute position of the marker from the start of the marker asset. It shall be represented as integer number of 1/EditRate units, as inherited from the GenericAssetType.

5.20.5. PictureTrackFileAssetType The PictureTrackFileAssetType, shall be derived from the TrackFileAssetType. It describes a Track File containing picture essence. An instance of the PictureTrackFileAssetType is the MainPicture element The elements defined below replicate values contained in the underlying track file and shall remain consistent with the content of the underlying track file at all times. They are included in the Composition PlayList to alleviate the need for software to access and parse individual track files in order to display the values to users. In the event an inconsistency exists, the values contained in the underlying track file shall take precedence.

5.20.5.1. PicturePixelMatrix The PicturePixelMatrix element shall contain the horizontal and vertical pixel count of the underlying picture track file. As described above, it is included in the Composition PlayList for convenience only. It shall be encoded as a rational number of frames per second.

5.20.5.2. PictureBitDepth The PictureBitDepth element shall contain the number of bits assigned to each picture component of the underlying picture track file. As described above, it is included in the Composition PlayList for convenience only. It shall be encoded as a rational number of frames per second.

5.20.5.3. PictureColorEncoding The PictureColorEncoding element shall contain the identification of the color encoding method of the underlying picture track file. As described above, it is included in the Composition PlayList for convenience only. It shall be encoded as a rational number of frames per second.

5.20.5.4. PictureFrameRate The PictureFrameRate element shall contain the frame rate of the underlying picture track file. As described above, it is included in the Composition PlayList for convenience only. It shall be encoded as a rational number of frames per second.

5.20.5.5. ImageAspectRatio The ImageAspectRatio element shall define the aspect ratio of the picture information contained in the underlying picture track file. As described above, it is included in the Composition PlayList for convenience. It is represented as a rational number and applications may convert the ratio to a decimal number to match current practice.

5.20.6. PictureSpatialMetadataTrackFileAssetType The PictureSpatialMetadataTrackFileAssetType, shall be derived from the TrackFileAssetType. It describes a Track File containing picture spatial metadata. An instance of the PictureSpatialMetadataTrackFileAssetType is the MainPictureMetadata element. The elements defined below replicate values contained in the underlying track file and shall remain consistent with the content of the underlying track file at all times. They are included in the

80 Private Draft Document. Not for Publication

Composition PlayList to alleviate the need for software to access and parse individual track files in order to display the values to users. In the event an inconsistency exists, the values contained in the underlying track file shall take precedence.

5.20.6.1. StandardsBody [optional] The StandardsBody element shall contain a URI [RFC 2396] that uniquely identifies the Standards Body issuing the Standard.

5.20.6.2. Label [optional] The Label element shall contain a textual representation of the Standards Body, which may be displayed to the user. For each issuing Standards Body, and hence unique URI, there are a number of permissible Label values. An example of such is shown below in Table 11: Examples of Standards Body Labels. The specification of this mapping is beyond the scope of this document.

Table 11: Examples of Standards Body Labels Label URI SMPTE STD xxx-y http://www.smpte.org/date W3 http://www.w3.org/2001 ISO http://www.ISO.org/2009

5.20.7. SoundTrackFileAssetType The SoundTrackFileAssetType is derived from TrackFileAssetType. It describes a Track File containing sound essence. An instance of the SoundTrackFileAssetType is the MainSound element. The elements defined below replicate values contained in the underlying track file and shall remain consistent with the content of the underlying track file at all times. They are included in the Composition PlayList to alleviate the need for software to access and parse individual track files when scheduling content. In the event an inconsistency exists, the values contained in the underlying track file shall take precedence.

5.20.7.1. SoundSamplingRate The SoundSamplingRate element shall contain the frequency of the sampling rate of the underlying sound track file. As described above, it is included in the Composition PlayList for convenience only. It shall be encoded as a rational number of kHz.

5.20.7.2. SoundBitDepth The SoundBitDepth element shall contain the number of the bits assigned to each sample of audio of the underlying sound track file. As described above, it is included in the Composition PlayList for convenience only. It shall be encoded as a integer number of bits.

5.20.7.3. SoundChannelNumber The SoundChannelNumber element shall contain the number of the audio channels of the underlying sound track file. As described above, it is included in the Composition PlayList for convenience only. It shall be encoded as an integer number of channels.

81 Private Draft Document. Not for Publication

5.20.7.4. ChannelConfiguration The ChannelConfiguration element shall contain the channel number and label assigned to each channel of the underlying sound track file. As described above, it is included in the Composition PlayList for convenience only. It shall be encoded as a integer channel number along with its corresponding label.

5.20.7.5. Label The Label element shall contain a textual representation of the channel, which may be displayed to the user. For each channel number there are a number of permissible Label values. An example of such is shown in Table 12: Examples Channel Labels. The specification of this mapping is beyond the scope of this document.

Table 12: Examples Channel Labels Channel Label Description No. 1 L Left 2 R Right 3 C Center 4 LFE Low Frequency Effects 5 Ls Left Surround 6 Rs Right Surround 7 Cs Center Surround 8 Lt Left total (matrix encoded) 9 Rt Right total (matrix encoded) 10 Mono Monaural 11 HI Hearing Impaired (Dynamic compressed dialog centric mix) 12 VI-N Visually Impaired Narration (Descriptive Narration)

5.20.7.6. Language [optional] The Language element shall reflect the primary spoken language of the sound material of the underlying sound track file. The element value is encoded as an xs:language language code and indicates the spoken language of the content. The absence of the element shall indicate that no spoken language is associated with the asset.

82 Private Draft Document. Not for Publication

5.20.8. SubtitleTrackFileAssetType The SubtitleTrackFileAssetType element is derived from TrackFileAssetType. It describes the subtitle material associated with the Sequence. An instance of the SubtitleTrackFileAssetType is the MainSubtitle element. The elements defined below replicate values contained in the underlying track file and shall remain consistent with the content of the underlying track file at all times. They are included in the Composition PlayList to alleviate the need for theater management software to access and parse individual track files when scheduling content. In the event an inconsistency exists, the values contained in the underlying track file shall take precedence.

5.20.8.1. Language [optional] The Language element shall reflect the primary text language used by the Subtitle essence. The absence of the element shall indicate that no primary text language is associated with the asset. It is represented as an xs:language value.

5.20.9. CaptionTrackFileAssetType The CaptionTrackFileAssetType element is derived from TrackFileAssetType. It describes the subtitle material associated with the Sequence. An instance of the CaptionTrackFileAssetType is the MainCaption element. The elements defined below replicate values contained in the underlying track file and shall remain consistent with the content of the underlying track file at all times. They are included in the Composition PlayList to alleviate the need for theater management software to access and parse individual track files when scheduling content. In the event an inconsistency exists, the values contained in the underlying track file shall take precedence.

5.20.9.1. Language [optional] The Language element shall reflect the primary text language used by the Subtitle essence. The absence of the element shall indicate that no primary text language is associated with the asset. It is represented as an xs:language value.

83 Private Draft Document. Not for Publication

84 Private Draft Document. Not for Publication

6. Output Profile List

6.1. Introduction

6.2. Output Profile List Overview

6.2.1. Functional Framework For the purpose of documenting the specific requirements for a Interoperable Master Format Output Profile List • TBD – The specifications for the list of elements included in the Output Profile List

6.2.2. Output Profile List Fundamental Requirements

6.2.2.1. Introduction

6.2.2.2. Open Standard

6.2.2.3. Interoperable The blank shall have an open framework that accommodates compressed, encrypted files as well as all other files used in Digital Video.

6.2.2.4. Scalable The blank shall accommodate any number of Essence or Metadata components. There is no limit on the number of files included in the package or the size of the files.

6.2.2.5. Extensible The blank shall allow for blank.

6.2.2.6. Human Readable Metadata Human readable Metadata shall be in English (default) but can be provided in other languages as well.

6.2.3. Output Profile List Concepts

6.3. Output Profile List Format

6.3.1. Introduction

6.3.2. File Format

6.3.3. ImageOutputFormat The ImageOutputFormat element defines the output format of the MainPicture of material referred to by the Composition PlayList. It is meant to be both human and machine-readable.

85 Private Draft Document. Not for Publication

6.3.3.1. OutputPixelMatrix The OutputPixelMatrix element shall contain the horizontal and vertical pixel counts of the MainPicture element. This is informative information to allow for humans and machine to read and if necessary manipulate the MainPicture asset of the Composition to conform to the OutputPixelMatrix requirements.

6.3.3.2. OutputColorEncoding The OutputColorEncoding element shall contain information of the color encoding parameters of the MainPicture element. This is informative information to allow for humans and machine to read and if necessary manipulate the MainPicture asset of the Composition to conform the color information to the OutputColorEncoding specification.

6.3.3.3. StandardsBody [optional] The StandardsBody element shall contain a URI [RFC 2396] that uniquely identifies the Standards Body issuing the Standard.

6.3.3.4. Label [optional] The Label element shall contain a textual representation of the Standards Body, which may be displayed to the user. For each issuing Standards Body, and hence unique URI, there are a number of permissible Label values. An example of such is shown below in Table 13: Examples of Label Elements. The specification of this mapping is beyond the scope of this document.

Table 13: Examples of Label Elements Label URI SMPTE STD xxx-y http://www.smpte.org/date W3 http://www.w3.org/2001 ISO http://www.ISO.org/2009

6.3.3.5. DisplayAspectRatio The DisplayAspectRatio element shall contain the ratio intended display screen of the MainPicture element. (ex. 4:3 or 16:9) This is informative information to allow for humans and machine to read and if necessary manipulate the MainPicture asset of the Composition to conform to the DisplayAspectRatio.

6.3.3.6. OutputPictureOverlay [optional] The OutputPictureOverlay element shall contain information about the auxiliary overlay image elements for the MainPicture element. This is informative information to allow for humans and machine to read and if necessary manipulate the MainPicture asset of the Composition to conform the aspects of the required OutputPictureOverlay.

6.3.3.7. TimeCodeWindow [optional] The TimeCodeWindow element shall contain a designation that uniquely identifies the Timecode type and spatial location for the OutputPictureOverlay.

6.3.3.8. Spoiler [optional] The Spoiler element shall contain a textual representation and spatial location for the OutputPictureOverlay.

86 Private Draft Document. Not for Publication

6.3.4. OutputAudioFormat The OutputAudioFormat element defines the format of the MainSound of material referred to by the Composition PlayList. It is meant to be both human and machine-readable.

6.3.4.1. AudioCompressionStandard [optional] The AudioCompressionStandard element shall contain information of the compression encoding parameters of the MainSound element. This is informative information to allow for humans and machine to read and if necessary manipulate the MainSound asset of the Composition to conform the sound information to the Compression specification.

6.3.4.2. StandardsBody [optional] The StandardsBody element shall contain a URI [RFC 2396] that uniquely identifies the Standards Body issuing the Standard.

6.3.4.3. Label [optional] The Label element shall contain a textual representation of the Standards Body, which may be displayed to the user. For each issuing Standards Body, and hence unique URI, there are a number of permissible Label values. An example of such is shown below in Table 4. The specification of this mapping is beyond the scope of this document.

6.3.5. OutputSpeedOffset [optional] The OutputSpeedOffset element shall contain the rational percentage offset intended to be applied to the Composition. This is informative information to allow for humans and machine to read and if necessary manipulate the Composition to conform to the desired running time of the Composition.

87 Private Draft Document. Not for Publication

Packaging

6.4. Introduction Packaging is defined as the process of combing elements to prepare them for shipping. This generally will consist of one or more compositions along with two more elements which will be specified in the section of the document. Those elements are the Packing List and the Asset Map.

6.5. Packaging System Overview

6.5.1. Functional Framework For the purpose of documenting the specific requirements for a Interoperable Master Format Packaging system, it is helpful to divide the system into a set of components. The performance requirements for each of these components will be described in the following sections:

• Distribution Package – The physical files and the list describing the files and providing a means for authentication as delivered in a Distribution Package (from Distributor to Exhibitor). o Packing List – The specifications for the list of elements included in the Package. o Asset Map – The specification for the directory structure and physical location of those directories and files. o Security –The specifications for the security requirements for the Package.

6.5.2. Packaging Fundamental Requirements

6.5.2.1. Introduction Digital Video presents a challenge to create a versatile packaging system. Throughout this system, some basic requirements are needed and are stated below.

6.5.2.2. Open Standard The Packaging standard shall be based upon an open worldwide standard. This format is encouraged to be a license-free technology. It shall be a complete standard that equipment receiving a compliant package can process and interpret unambiguously.

6.5.2.3. Interoperable The Packaging format shall have an open framework that accommodates compressed, encrypted files as well as all other files used in Digital Video.

6.5.2.4. Scalable The Packaging format shall accommodate any number of Essence or Metadata components. There is no limit on the number of files included in the package or the size of the files.

6.5.2.5. Extensible The Packaging format shall allow for new Digital Video features (compositions) to be contained within the package.

88 Private Draft Document. Not for Publication

6.5.2.6. Synchronization The Packaging format shall provide support for synchronization of the Essence and Metadata elements.

6.5.2.7. Human Readable Metadata Human readable Metadata shall be in English (default) but can be provided in other languages as well.

6.5.3. Packaging Concepts It is common practice to divide content into reels of between 10 and 20 minutes in length for post- production, and distribution. These reels are then assembled, together with other content, to create the modern play lists that are used in play out devices. This concept of reels shall be supported with Digital Video content.

6.6. Distribution Package

6.6.1. Introduction The Distribution Package has two major components. One is the Package itself, which includes all of the Track Files and the other is the Packing List. These are all of the elements required for a complete delivery to the theater Digital Cinema system. It is technically possible to include engagement-specific licenses and keying information in a Package in the form of opaque Metadata, but this is not recommended for general usage. A Distribution Package can contain a complete feature film composition or a set of compositions. Alternatively, it can carry as little as a single file to update one reel’s subtitle or sound track.

6.6.2. Distribution Package

6.6.2.1. General The Distribution Package shall contain a Packing List and one or more Digital Cinema track files. The following requirements apply.

6.6.2.2. Packing for Transport The distribution method shall allow a DCP to be transported via physical media, satellite or network.

6.6.2.3. Flattened Packages TBD

6.6.2.4. Security The distribution method shall provide digital signatures to allow the recipient to verify integrity of the Packing List and the enclosed files. Preparation of Packing Lists is a distribution fulfillment or transport function. Therefore, the digital signatures come from these entities, not the content-owner who mastered the files. Packing List security functions do not verify the authenticity of the content, only the intent of the delivery agent. (Content authenticity is verified through PlayList signatures and digital licenses.)

89 Private Draft Document. Not for Publication

6.6.3. Packing List

6.6.3.1. File Format The Packing List shall use XML data format with XML Signature (digital signature). It should be in English (default) but can be provided in other languages as well.

6.6.3.2. Fields The following data fields are required to be included in the Packing List for each file in the Package:

• Unique identification of each file included in the DCP is encoded as urn:UUID. • Annotation Text parameter (optional), if present, is a free-form, human readable annotation associated with the asset. It is meant strictly as a displayable guidance for the user. • File Integrity check (hash) for each file in the distribution package • Size of the file in bytes • Type (e.g., Packing List, PlayList, Track File, opaque security data) • Original File Name The following fields are required to be included in the digital signature section of the Packing List:

• Signer parameter uniquely identifies the entity, and hence public key that digitally signs the packing list. • Signature parameter contains a digital signature authenticating the packing list.

6.6.4. Asset Map

6.6.4.1. File Format The Asset Map shall use XML data format with XML Signature (digital signature). It should be in English (default) but can be provided in other languages as well.

6.6.4.2. Fields The following data fields are required to be included in the Asset Map for each file in the Package:

90 Private Draft Document. Not for Publication

91 Private Draft Document. Not for Publication

7. Annex

7.1. Example Workflows The following diagrams show an examples of a future state workflows using the concept of the IMF.. This IMF and the file-based workflow will enable Mastering & Distribution Servicing to service both existing and emerging distribution channels. It must be stressed that this is only an example workflow. As one becomes more familiar with the concepts of the IMF one can imagine many different workflows using the IMF. It is not the intention of this document to indentify all of these possibilities.

Figure 16 - Future State - Mastering & Distribution Servicing

92 Private Draft Document. Not for Publication

Figure 17 – Example IMF workflow

7.2. Editorial Systems

7.2.1. Editorial System Fundamental Requirements Digital Video Editorial Systems have some basic requirements that are stated below.

7.2.1.1. Easy Assembly of Content The system shall provide a graphical user interface (GUI) interface for the assembly of content with relative ease in a timely matter.

7.2.1.2. Movement of Content The system shall provide for movement of content within a facility. Blah Blah Blah

93 Private Draft Document. Not for Publication

7.2.1.3. Ease of Operation The Editorial System is encouraged to require only a reasonable level of computer operation knowledge or training for the basic operation of the system. The computer-based user interfaces are required to be simple and intuitive.

7.2.1.4. Multiple Systems Editing shall be supported across many different systems that are common in editorial processes today.

7.3. Playout Systems

7.3.1. Playout System Fundamental Requirements IMF Playout Systems have some basic requirements that are stated below.

7.3.1.1. Easy Assembly of Content The system shall provide a graphical user interface (GUI) interface for the assembly of content with relative ease in a timely matter.

7.3.1.2. Movement of Content The system shall provide for movement of content within a facility. Blah Blah Blah

7.4. Transcoding Systems

7.4.1. Transcoding System Fundamental Requirements IMF Transcoding Systems have some basic requirements that are stated below.

7.4.1.1. Easy Assembly of Content The system shall provide a graphical user interface (GUI) interface for the assembly of content with relative ease in a timely matter.

7.4.1.2. Movement of Content The system is required to provide for movement of content within a facility. Blah Blah Blah

7.5. Asset Management Interface Asset Management in the context of the IMF refers to the system interfaces required to allow for automation of repetitive tasks and tracking of assets for such things as location and version control. This section will concentrate on the interfaces and not all of the different applications required or envisioned for a Asset Management system. Below you should find the requirements and specifications for the interfaces required.

7.5.1. Interface Fundamental Requirements

7.5.1.1. Introduction Interoperable Master Format presents unique opportunities for the automation of content and asset tracking.

7.5.1.2. API The Application Programming Interface for the IMF……Blah Blah Blah.

94 Private Draft Document. Not for Publication

7.5.1.3. Protocols The Protocols to be used for the interfaces shall be….Blah Blah Blah.

7.5.2. Asset Management Interface Fundamental Concepts The interfacing to the Interoperable Master Format can be accomplished in many different ways. Blah, Blah, Blah.

7.5.3. Application Programming Interface TBD

7.5.4. Protocols

95 Private Draft Document. Not for Publication

8. GLOSSARY OF TERMS

Table 14: Glossary of Terms Term Description

AES Acronym for Advanced Encryption Standard

AES Acronym for Audio Engineering Society

ANSI Acronym for American National Standards Institute

API Acronym for Application Programming Interface

Broadcast file format developed and standardized by the EBU Wave

Burned-In Where visual data that is normally supplemental to a motion picture is irrevocably added to the motion-picture image by compositing the data with the underlying image

Captions Text that is a representation, often in the same language, of dialog and audio events occurring during scenes of a motion picture. (Generally associated with a dialog and audio event translation for the deaf and hard of hearing.)

CBC Acronym for Cipher Block Chaining mode

CBR Acronym for Constant Bit Rate for image compression

CIE Acronym for International Commission on Illumination (Commission Internationale de l’Eclairage)

Closed Referring to visual data that is supplemental to a motion picture being displayed off-screen

Composition A motion picture, or a trailer, or an advertisement, etc. Composition consists of a Metadata Composition PlayList along with the Essence and other Metadata track files that define the work.

CPL Acronym for Composition PlayList, the definitive PlayList for specifying how a Composition is played and what track files are required

DCP Acronym for a Digital Cinema Package, the set of files that are the result of the encoding, encryption and packaging process

Distribution The collection of files delivered by the distributor to the exhibitor. A Distribution Package may Package contain pieces of a Composition or several compositions, a complete Composition, replacement/update files, etc.

DM Acronym for Descriptive Metadata

DRM Acronym for Digital Rights Management

DSM Acronym for Digital Source Master, a digital master created in post-production from which different versions and duplication masters may be created.

DVD Acronym for Digital Versatile Disc

EBU Acronym for European Broadcast Union (a standardization organization) e.g. Abbreviation for the Latin phrase exempli gratia, meaning “for example”

96 Private Draft Document. Not for Publication

Term Description

Essence Image, audio, subtitles, or any content that is presented to a human being in a presentation

ETC Acronym for Entertainment Technology Center

FIPS Acronym for Federal Information Processing Standards

FM Acronym for Forensic Marking

Forensic Data embedded in essence to provide forensic tracking information in the event of content Marking theft. Such marking can be visible or non-visible, audible or non-audible.

FPS Acronym for Frames per Second

HD Acronym for High Definition

HI Acronym for Hearing Impaired

HMAC Acronym for Hashing Message Authentication Codes

Hz Abbreviation for Hertz, a unit of frequency expressed in cycles per second i.e. Abbreviation for the Latin phrase id est, meaning “that is”

IEC Acronym for International Electrotechnical Commission

IP Acronym for Intellectual Property

IMF Acronym for Interoperable Master Format

ISO Acronym for International Organization for Standardization

ITU Acronym for International Telecommunications Union

JPEG Acronym for Joint Photographic Experts Group, the international body that developed the JPEG 2000 standard

Key Electronic data used to allow data encryption and decryption

Key Epoch The period of time during which a given decryption key is valid. The key epoch defines a minimum practical time period for use of encrypted track files. kHz Acronym for kilo Hertz, one thousand cycles per second, a measure of frequency

KLV Acronym for Key Length Value – used by the MXF to parse binary data

Localizations Text on screen representing either non-source language dialog or information pertinent to the story such as time and place. This is specifically the text that is absent in text-less masters. This text is localized or translated for various markets either through subtitles or entire image replacement.

LTC Acronym for Linear Time Code

LUT Acronym for Look Up Table

Main Titles A credit sequence generally shown near the beginning of a motion picture

Metadata Data about data or data describing other data. Information that is considered ancillary to or otherwise directly complementary to essence. Information that is useful or of value when

97 Private Draft Document. Not for Publication

Term Description

associated with the essence being provided.

MTBF Acronym for Mean Time Between Failure

MXF Acronym for Material eXchange Format

NDF Acronym for Non Drop Frame (Timecode)

NTSC Acronym for National Television System Committee, which developed the NTSC television broadcasting standard

Open Referring to visual data that is supplemental to a motion picture being displayed on-screen

Operational An MXF construct to define file structures Pattern

Packing List A list describing the files and providing a means for authentication of the files as delivered in a package

PAL Acronym for Phase Alternation by Line, a television broadcasting standard.

PlayList Conceptually, the format and structure of the various lists used to define the playback of content.

PNG Acronym for Portable Network Graphics, an extensible file format for the lossless, portable, well-compressed storage of raster images defined by the PNG Development Group.

QC Acronym for Quality Control

RAND Acronym reasonable and non-discriminatory

Reel A conceptual period of time having a specific duration of generally 10 to 20 minutes. Used primarily in feature film production.

Renewable A software component is renewable if it can be remotely, smoothly and possibly automatically upgraded or replaced without significantly disturbing system operations. A system shutdown and normal restart is acceptable, provided that after the restart, the system can be operated as before.

Replaceable A component is said to be replaceable if it can be upgraded or replaced without significantly disturbing system operations. A system shutdown and restart is acceptable, provided that after the replacement, the system can be operated as before.

SD Acronym for Standard Definition

SHA1 Acronym for Secure Hashing Algorithm 1

SMPTE Acronym for Society of Motion Picture and Television Engineers

Subpicture A multiple-image file format for the transport of visual data supplemental to a motion picture that is intended only for graphic overlay with the main image output of a digital projector

Subtitle Text that is a representation, in a different language, of dialog occurring during scenes of a motion picture. Generally associated with dialog translation for localization of a motion picture in a particular territory.

TCP/IP Acronym for Transmission Control Protocol / Internet Protocol

TDES or 3DES Acronym for Triple Data Encryption Standard. TDES or 3DES was adopted as a federal

98 Private Draft Document. Not for Publication

Term Description

standard in 1998 [FIPS (46-3) and ANSI standard X9.32]

Track File The smallest element of a package that can be managed or replaced as a distinct asset. A track file may contain Essence and/or Metadata, and its duration matches an associated Reel.

UDP Acronym for User Datagram Protocol

UL Acronym for Universal Label used in MXF

Unicode The Universal Multiple-Octet Coded Character set, the [ISO/IEC 10646:2003] standard that defines a single code for representation, interchange, processing, storage, entry and presentation of the written form of the world’s major languages urn Acronym for uniform resource name

USB Acronym for Universal Serial Bus, standardized serial communications connection found on computers

UTC Acronym for Universal Coordinated Time

UUID Acronym for Universal Unique IDentifier

VFX Acronym for Visual Effects

VI Acronym for Visually Impaired

VOD Acronym for Video on Demand

Visually An image compression method is considered visually lossless when the processed image is Lossless indistinguishable from the unprocessed image under normal theatrical viewing conditions.

VPN Acronym for Virtual Private Network.

VBR Acronym for Variable Bit Rate

W3C Acronym for The World Wide Web Consortium, the organization responsible for the development of Internet protocols

XML Acronym for eXtensible Markup Language

99 Private Draft Document. Not for Publication

9. Composition Play List (CPL) Example Composition PlayLists (CPLs) are scripts that link the IMF Track Files together into synchronized pieces of content. CPLs are written in XML.

Composition Play List (CPL) Example urn:uuid:cde69c7b-a055-4373-84f5-e8ffea82f345 Wall-E_FTR_S_EN-XX_US-G_51_2K_DI_20080529_PX 2008-05-30T03:21:28-07:00 Pixar Animation Studios Pixit 0.2.45 Wall-E_FTR_S_EN-XX_US-G_51_2K_DI_20080529_PX feature urn:uuid:cde69c7b-a055-4373-84f5-e8ffea82f345 US English 2.35 BluRay Master HDCamSR(1.85 Picture, 5.1 Sound, Dubbed) French-Directors Cut 23.98 23.98 http://www.ISO.org/2009 48 5.1 24 02:10:15:10 EN US http://www.mpaa.org/2003-ratings urn:uuid:f63fdd78-39b6-413b-a9c5-7231446c6463 PartOne urn:uuid:54e8ef95-26be-854e-bb2e-c95812107c91 PartOne of Show 23.98 urn:uuid:8116c6ef-d870-4038-9f0a-46571d769858 7728 180

100 Private Draft Document. Not for Publication

Composition Play List (CPL) Example 2548 urn:uuid:035b894b-b82c-49f8-abbc-87230e526231 2/B+hACcPMSFRl9WPCjEIoJoeRc= 1920 1080 10 422 1.78 urn:uuid:8116c6ef-d870-4038-9f0a-56571d769858 9728 80 1548 urn:uuid:035b894b-b82c-49f8-abbc-97230e526231 2/B+hACcPMSFRl9WPCjEIoJoeRc= 1920 1080 10 422 1.78 urn:uuid:8116c6ef-d870-4038-9f0a-66571d769858 7728 180 3504 urn:uuid:035b894b-b82c-49f8-abbc-07230e526231 2/B+hACcPMSFRl9WPCjEIoJoeRc= 1920 1080 10 422 1.78 urn:uuid:54e8ef95-26be-854e-bb2e-c95812107c91 24 1 urn:uuid:8116c6ef-d870-4038-9f0a-56571d769858 9728 80 1548 urn:uuid:035b894b-b82c-49f8-abbc-97230e526231 2/B+hACcPMSFRl9WPCjEIoJoeRc= 48000 24 6 urn:uuid:ba79548c-852c-c04b-af1a-9cde6999b846 js/VpADyL4sEQF+YarPI61VJ1pw=

101 Private Draft Document. Not for Publication

Composition Play List (CPL) Example urn:uuid:8116c6ef-d870-4038-9f0a-56571d769858 80 2/B+hACcPMSFRl9WPCjEIoJoeRc= urn:uuid:035b894b-b82c-49f8-abbc-97230e526231 1548 9728 http://www.ISO.org/2009 urn:uuid:8116c6ef-d870-4038-9f0a-56571d769858 9728 80 1548 urn:uuid:035b894b-b82c-49f8-abbc-97230e526231 2/B+hACcPMSFRl9WPCjEIoJoeRc= urn:uuid:8116c6ef-d870-4038-9f0a-56571d769858 9728 80 1548 urn:uuid:035b894b-b82c-49f8-abbc-97230e526231 2/B+hACcPMSFRl9WPCjEIoJoeRc= urn:uuid:3c59dc3a-a247-475c-adbd-3ffd64dd5890 urn:uuid:b187df09-0e92-4b0f-9e51-c4313e32c8ee 23.98 urn:uuid:8116c6ef-d870-4038-9f0a-46571d769858 7728 180 2548 urn:uuid:035b894b-b82c-49f8-abbc-87230e526231 2/B+hACcPMSFRl9WPCjEIoJoeRc= 1920 1080 10 422 1.78

102 Private Draft Document. Not for Publication

Composition Play List (CPL) Example urn:uuid:54e8ef95-26be-854e-bb2e-c95812107c91 24 1 urn:uuid:8116c6ef-d870-4038-9f0a-56571d769858 9728 80 1548 urn:uuid:035b894b-b82c-49f8-abbc-97230e526231 2/B+hACcPMSFRl9WPCjEIoJoeRc= 48000 24 6 urn:uuid:ba79548c-852c-c04b-af1a-9cde6999b846 js/VpADyL4sEQF+YarPI61VJ1pw= dnQualifier=Qf4HGTjMqjGmHQOhA\ +xOqtQbSGY=,CN=.DC.DMS.DC2.INTEROP,OU=DC.DOREMILABS.COM,O=DC2.INTEROP.DOREMILABS.COM 4 dnQualifier=aen9g2oYiD4\+ouix29qpzVOJ7rk=,CN=ME CS SM.DMSJP2K- 70002.DC.DC2.INTEROP,OU=DC.DOREMILABS.COM,O=DC2.INTEROP.DOREMILABS.COM FC2hBD4lwXDqSGk97QrrAb69FWA= pdvBqADkTjzVqbLkO2cbLSn2NQ0mD+keDVr6l2MbApDpA+e+ RGLkr0vXUpul86XpP7zg+X5q711ju3u+tuJIMnnl5TAQ6VnRb71WQuRhtI5M2dRe2Mm+wSsm 95fNVAaTDAWJFwW6dS1iJ6dR9a4YAGOVJEpA8WJCl1zKTe2FCZAYlSZvgMQdTV+7cMeYB086 XO9VkBZ2lG+2Ph5wOH/oAJp28w8BkXgyELJiSAsiDmY0aA2ua9aHbocDv5C3bRfeslr85LyL KCt8b9B3xx+USiZ26FY1vo40N0eLNcQzjYOIaDZe/tyKMwhKKD24hu8wxwepeIHrZTjyvwgS jY66nw==

103 Private Draft Document. Not for Publication

Composition Play List (CPL) Example dnQualifier=Qf4HGTjMqjGmHQOhA\ +xOqtQbSGY=,CN=.DC.DMS.DC2.INTEROP,OU=DC.DOREMILABS.COM,O=DC2.INTEROP.DOREMILABS.COM 4 MIIEjDCCA3SgAwIBAgIBBDANBgkqhkiG9w0BAQUFADCBhjEj MCEGA1UEChMaREMyLklOVEVST1AuRE9SRU1JTEFCUy5DT00xGjAYBgNVBAsTEURDLkRPUkVNSUxB QlMuQ09NMRwwGgYDVQQDExMuREMuRE1TLkRDMi5JTlRFUk9QMSUwIwYDVQQuExxRZjRIR1RqTXFq R21IUU9oQSt4T3F0UWJTR1k9MB4XDTA3MDEwMTAwMDAwMFoXDTI1MTIzMTIzNTk1OVowgZgxIzAh BgNVBAoTGkRDMi5JTlRFUk9QLkRPUkVNSUxBQlMuQ09NMRowGAYDVQQLExFEQy5ET1JFTUlMQUJT LkNPTTEuMCwGA1UEAxMlTUUgQ1MgU00uRE1TSlAySy03MDAwMi5EQy5EQzIuSU5URVJPUDElMCMG A1UELhMcYWVuOWcyb1lpRDQrb3VpeDI5cXB6Vk9KN3JrPTCCASIwDQYJKoZIhvcNAQEBBQADggEP +NZnn+yhj0qvA==

dnQualifier=10IWnvjB9MJSQEipA/m0zVdiUEQ=,CN=.DMS.DC2.INTEROP,OU=DC.DOREMILABS.CO M,O=DC2.INTEROP.DOREMILABS.COM 2 MIIEgzCCA2ugAwIBAgIBAjANBgkqhkiG9w0BAQUFADCBgzEj MCEGA1UEChMaREMyLklOVEVST1AuRE9SRU1JTEFCUy5DT00xGjAYBgNVBAsTEURDLkRPUkVNSUxB QlMuQ09NMRkwFwYDVQQDExAuRE1TLkRDMqcXgPOS7ONYGY/A==

dnQualifier=k2\+ZCfLtrt7LaaHYAWzWbreV/ME=,CN=.PRODUCTS.DC2.INTEROP,OU=DC.DOREMIL ABS.COM,O=DC2.INTEROP.DOREMILABS.COM 4 MIIEgTCCA2mgAwIBAgIBBDANBgkqhkiG9w0BAQUFADCBiDEj MCEGA1UEChMaREMyLklOVEVST1AuRE9SRU1JTEFCUy5DT00xGjAYBgNVBAsTEURDLkRPUkVNSUxB QlMuQ09NMR4wHAYDVQQDExUuUF7WaY7Tq4yzl6nL28=

dnQualifier=K9jID4QvfqS1sS6IjiE8zEqSd/g=,CN=.ROOT.DC2.INTEROP,OU=DC.DOREMILABS.COM, O=DC2.INTEROP.DOREMILABS.COM 2 MIIEgjCCA2qgAwIBAgIBAjANBgkqhkiG9w0BAQUFADCBhDEj MCEGA1UEChMaREMyLklOVEVST1AuRE9SRU1JTEFCUy5DT00xGjAYBgNVBAsTEURDLkRPUkVNSUxB QlMuQ09NMRowGAYDVQQDEx0CTLUW/J9um+eSrj

dnQualifier=K9jID4QvfqS1sS6IjiE8zEqSd/g=,CN=.ROOT.DC2.INTEROP,OU=DC.DOREMILABS.COM, O=DC2.INTEROP.DOREMILABS.COM 1

104 Private Draft Document. Not for Publication

Composition Play List (CPL) Example MIIEfjCCA2agAwIBAgIBATANBgkqhkiG9w0BAQUFADCBhDEj MCEGA1UEChMaREMyLklOVEVST1AuRE9SRU1JTEFCUy5DT00xGjAYBgNVBAsTEURDLkRPUkVNSUxB QlMuQ09NMRowGAYDVQf0AsEwAFJR/9crGysLGnJsI=

105 Private Draft Document. Not for Publication

10. Annex - Ideal Implementation

10.1. Real-time Playback Real-time playback of the compressed image shall be supported in order to allow for the playback of the compressed streams in real-time environments such as broadcast systems and screenings.

10.2. Decode Compressed Image Faster than Real Time The ability to decode the compressed image faster than real time shall be required in order to support transcoding of the image to other file formats in a timely manner.

106 Private Draft Document. Not for Publication

107