Iso/Iec Jtc 1/Sc 29 N s2

INTERNATIONAL ORGANISATION FOR STANDARDISATION
ORGANISATION INTERNATIONALE DE NORMALISATION
ISO/IEC JTC1/SC29/WG11
CODING OF MOVING PICTURES AND AUDIO

ISO/IEC JTC1/SC29/WG11 MPEG2006/N8032 April 2006, Montreux, CH

Source: AHG on MPEG-A Photo Player
Status: Final
Title: Text of ISO/IEC FCD 23000-3 MPEG photo player application format
Editors: Akio Yamada, Miroslaw Bober, Robert O’Callaghan and Wo Chang

Document type: Document subtype: Document stage: Document language: ISO/IEC JTC 1/SC 29

Date: 2006-06-16

ISO/IEC FCD 23000-3

ISO/IEC JTC 1/SC 29/WG 11

Secretariat:

Information technology — Multimedia application format (MPEG-A) — Part 3: MPEG photo player application format

Élément introductif — Élément central — Partie 3: Titre de la partie

Warning

This document is not an ISO International Standard. It is distributed for review and comment. It is subject to change without notice and may not be referred to as an International Standard.

Recipients of this draft are invited to submit, with their comments, notification of any relevant patent rights of which they are aware and to provide supporting documentation.

Copyright notice

This ISO document is a Draft International Standard and is copyright-protected by ISO. Except as permitted under the applicable laws of the user's country, neither this ISO draft nor any extract from it may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, photocopying, recording or otherwise, without prior written permission being secured.

Requests for permission to reproduce should be addressed to either ISO at the address below or ISO's member body in the country of the requester.

ISO copyright office
Case postale 56, CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail [email protected]
Web www.iso.org

Reproduction may be subject to royalty payments or a licensing agreement.

Violators may be prosecuted.

© ISO/IEC 2006 — All rights reserved III ISO/IEC FCD 23000-3

Contents

1 Scope
2 Normative references
3 Terms and definitions
4 Symbols and abbreviated terms
5 File format
6 Resource
7 Metadata
7.1 Collection level descriptive metadata
7.2 Item level descriptive metadata
8 Conformance Testing
8.1 File format conformance points
8.2 Photo player device conformance points
Annex A (informative) Schemas
Annex B (informative) Relevant technologies to create Photo Player metadata
B.1 Technologies to create collections
B.2 Technologies to create metadata on person identities appearing
Annex C (informative) Examples of collection structure
Annex D (informative) Reference software

Foreword

ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form the specialized system for worldwide standardization. National bodies that are members of ISO or IEC participate in the development of International Standards through technical committees established by the respective organization to deal with particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

The main task of the joint technical committee is to prepare International Standards. Draft International Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as an International Standard requires approval by at least 75 % of the national bodies casting a vote.

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.

ISO/IEC 23000-3 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information Technology, Subcommittee SC 29, Coding of Audio, Picture, Multimedia and Hypermedia Information.

ISO/IEC 23000 consists of the following parts, under the general title Information technology — Multimedia application format (MPEG-A):

- Part 1: Purpose for Multimedia Application Formats

- Part 2: MPEG music player application format

- Part 3: MPEG photo player application format

Introduction

ISO/IEC 23000 (also known as “MPEG-A”) is a recent addition to the sequence of standards developed by the Moving Picture Experts Group. This new standard is developed by selecting existing technologies from all published MPEG standards and combining them into so-called “Multimedia Application Formats” or MAFs. MPEG-A aims to serve clearly identified market needs by facilitating the swift development of innovative and standards-based multimedia applications and services. This application-driven process results in normative specifications of multimedia formats, along with reference software implementations, allowing interoperability at the application level.

DRAFT INTERNATIONAL STANDARD ISO/IEC FCD 23000-3

Information technology — Multimedia application format (MPEG-A) — Part 3: MPEG photo player application format

1 Scope

ISO/IEC 23000-3, also known as “Photo Player MAF”, specifies a file format for digital photo library applications. It offers a standardized solution for the carriage of images and associated metadata, to facilitate simple and fully interoperable exchange across different devices and platforms. The set of metadata includes MPEG-7 visual content descriptions, as well as acquisition-based metadata (such as date, time and camera settings). This allows compliant devices to support new, content-enhanced functionality, such as intelligent browsing, content-based search or automatic categorization.

2 Normative references

The following referenced documents are indispensable for the application of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.

ISO/IEC 10918-1 Information technology – Digital compression and coding of continuous-tone still images – Requirements and guidelines

ISO/IEC 10918-2 Information technology – Digital compression and coding of continuous-tone still images – Compliance testing

ISO/IEC 14496-12 Information technology – Coding of audio-visual objects – ISO base media file format

ISO/IEC 14496-14 Information technology – Coding of audio-visual objects – MP4 file format

ISO/IEC 15938-1 Information technology – Multimedia content description interface – Systems

ISO/IEC 15938-2 Information technology – Multimedia content description interface – Description definition language

ISO/IEC 15938-3 Information technology – Multimedia content description interface – Visual

ISO/IEC 15938-5 Information technology – Multimedia content description interface – Multimedia description schemes

ISO/IEC 15938-7 Information technology – Multimedia content description interface – Conformance testing

ISO/IEC 15938-10 Information technology – Multimedia content description interface – Schema definition

ISO/IEC 21000-17 Information technology – Multimedia framework – Fragment identification of MPEG resources

Editor’s comments: need to complete

3 Terms and definitions

For the purposes of this document, the following terms and definitions apply.

3.1 Description Scheme
Entities or relationships pertaining to multimedia content. Description Schemes specify the structure and semantics of their components, which may be Description Schemes, Descriptors, or datatypes.

3.2 JPEG
An image coding format, defined in ISO/IEC 10918.

3.3 MPEG-7
A multimedia content description interface, defined in ISO/IEC 15938.

3.4 MPEG-7 schema
A schema defined in ISO/IEC 15938-10.

3.5 Collection
A grouping, consisting of image resources and/or nested collections, described by an instance of the MPEG-7 ContentCollection Description Scheme.

3.6 Root collection
The collection element directly beneath the mpeg7 root element in the Photo Player collection-level metadata, of which all the other collection elements are nested children.

3.7 RoleCS
The classification scheme defined in Annex B.2.29 of ISO/IEC 15938-5:2002.

3.8 Internal resource
A resource included in the file.

3.9 External resource
A resource which is available outside of the file.

3.10 codestream
An entity of a resource.

4 Symbols and abbreviated terms

4.1 Abbreviations

CS Classification Scheme

DS Description Scheme

BiM Binary format for Metadata

Editor’s comments: need to complete

5 File format

The photo player file format uses the MPEG-4 file format (mp4) to carry both JPEG resources and their associated metadata. Figure 1 shows the basic structure of an mp4 file for a photo collection. Internal JPEG resources are stored in the media data box, and their associated metadata is stored in the movie box. The link between the metadata and each corresponding internal resource (codestream) is specified by the media box. If an additional resource is available outside of the file, the linkage information to this external resource is specified in the meta box of the corresponding track box.

In an mp4 file, descriptive metadata shall be stored in a “meta box”, which can be instantiated in the movie box or in track boxes. The meta box in the movie box shall be used for collection-level information, and those in the track boxes shall be used for item-level information. All the descriptive metadata shall be stored in MPEG-7 BiM format; meta boxes shall therefore have the “mp7b” handler type.

File Type Box (ftyp)
Movie Box (moov)
    Meta Box (meta) #0
    Track Box (trak) #1
        Track Header Box (tkhd)
        Media Box (mdia)
        Meta Box (meta) #1
    ...
    Track Box (trak) #N
        Track Header Box (tkhd)
        Media Box (mdia)
        Meta Box (meta) #N
Media Data Box (mdat)
    JPEG codestream #1
    ...
    JPEG codestream #N
JPEG codestream (external resource, additional entity outside of the file), Type B only

Figure 1 — Basic structure of MPEG photo player application format

The number of meta-boxes for collection-level descriptive metadata shall be exactly one, while the number of meta-boxes for item-level descriptive metadata shall be the same as the number of resources in a file.
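The box layout of Figure 1 and the meta box counts stated above can be illustrated with a short Python sketch. It relies only on the generic box header of the ISO base media file format (a 32-bit big-endian size followed by a four-character type, with size 1 signalling a 64-bit size and size 0 meaning "to the end of the container"). The function names `iter_boxes`, `count_meta_boxes` and `box` are hypothetical, and the sample file is a synthetic skeleton with empty boxes, not a conformant Photo Player file.

```python
import struct

def iter_boxes(data, start=0, end=None):
    """Yield (type, payload_start, payload_end) for each box in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:      # a 64-bit size follows the type field
            size = struct.unpack_from(">Q", data, pos + 8)[0]
            header = 16
        elif size == 0:    # box extends to the end of the enclosing container
            size = end - pos
        if size < header or pos + size > end:
            raise ValueError("malformed box at offset %d" % pos)
        yield box_type.decode("ascii"), pos + header, pos + size
        pos += size

def count_meta_boxes(data):
    """Return (collection_level, item_level) meta box counts found under moov."""
    collection = items = 0
    for btype, start, end in iter_boxes(data):
        if btype != "moov":
            continue
        for ctype, cstart, cend in iter_boxes(data, start, end):
            if ctype == "meta":
                collection += 1
            elif ctype == "trak":
                items += sum(1 for t, _, _ in iter_boxes(data, cstart, cend) if t == "meta")
    return collection, items

def box(box_type, payload=b""):
    """Build a box with a 32-bit size field (helper for the synthetic sample)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

# Synthetic skeleton: one collection-level meta box and two tracks.
trak = box(b"trak", box(b"tkhd") + box(b"mdia") + box(b"meta"))
sample = (box(b"ftyp")
          + box(b"moov", box(b"meta") + trak + trak)
          + box(b"mdat", b"\xff\xd8\xff\xd9"))
print(count_meta_boxes(sample))  # (1, 2)
```

A checker built on this idea would compare the first count against exactly one and the second against the number of resources in the file.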

6 Resource

Only JPEG-compliant codestreams may be used as resources in ISO/IEC 23000-3. Two different types of file format exist:

Type A:

All the resources are stored inside the file. Only internal resources are allowed.

Type B:

At least some of the resources exist outside of the file. A URL reference is used to identify the location of an external resource (see Clause 7.2). When the location of an external resource is specified, the internal resource data in the Media Data Box contains an alternative image to the external resource, such as a thumbnail picture, which may be useful for rendering.
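The distinction between the two types can be sketched in a few lines of Python. The item representation (a dictionary with an optional `external_url` key) and the function name `classify_file_type` are assumptions made for this illustration only; in an actual file the external location would be read from the item-level metadata.

```python
def classify_file_type(items):
    """Return "B" if any item points to an external resource, otherwise "A".

    `items` is a hypothetical list of dicts; an item that has an external
    resource carries its location under the "external_url" key.
    """
    has_external = any(item.get("external_url") for item in items)
    return "B" if has_external else "A"

internal_only = [{"id": "photo1"}, {"id": "photo2"}]
with_external = [{"id": "photo1"},
                 {"id": "photo2", "external_url": "http://www.nec.com/fig1.jpg"}]

print(classify_file_type(internal_only))  # A
print(classify_file_type(with_external))  # B
```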

7 Metadata

7.1 Collection level descriptive metadata

7.1.1 Overview

A BiM encoded ContentCollection DS is used to represent collection-level descriptive metadata. Figure 2 shows an overview of the metadata and Table 1 summarizes the semantics of available tools.

Table 1 — Semantics of tools in collection-level descriptive metadata

Tag Name | Presence | Semantics
DescriptionMetadata/Creator | Optional | The author of the collection definition. Use the term “Creator” registered in RoleCS as his/her role.
DescriptionMetadata/CreationTime | Optional | The time stamp when the collection definition was created.
DescriptionMetadata/LastUpdate | Mandatory | The time stamp of the most recent change to the collection definition.
ContentCollection/Name (attribute) | Optional | The name of collection.
ContentCollection/CreationInformation/Creation/TitleMedia/TitleImage | Optional | The representative thumbnail picture of the collection.
ContentCollection/CreationInformation/Creation/Creator | Optional | The actor who is captured in the collection. Use the term “Actor” registered in RoleCS as his/her role.
ContentCollection/CreationInformation/Creation/Date | Optional | The time (or period in time) when the photos in the collection were captured.
ContentCollection/CreationInformation/Creation/Location | Optional | The location where the photos in the collection were captured.
ContentCollection/TextAnnotation/FreeTextAnnotation | Optional | The summary text about the collection.
ContentCollection/TextAnnotation/KeywordAnnotation | Optional | Any keywords of the collection. It is recommended to use reserved terms (Annex D-1) to specify the collection type.
ContentCollection/ContentRef | Mandatory | The photos which are included in the collection. URL reference shall be employed with MPEG-21 Fragment Identifier compliant format (ISO/IEC 21000-17).
ContentCollection/ContentCollection | Optional | The sub-collections. Note that the root collection shall include all of the photos in the file and each sub-collection shall include at least one of the photos included in the file. Hierarchical representation of collections is allowed.

Mpeg7
    Description Metadata
        Creator (Creator)
            Role
            Agent (Person)
        Creation Time
        Last Update
    Content Collection (name)
        Creation Information
            Creation
                Title Media
                    Title Image
                Creator (Actor)
                    Role
                    Agent (Person)
                Content Creation Date
                Content Creation Location
        Text Annotation
            Keyword Annotation
            Free Text Annotation
        Content References
        Content Collection

Figure 2 — Overview of collection level descriptive metadata

7.1.2 Constraints to MPEG-7 Schema

The collection-level Photo-Player schema is defined with respect to the Version 2 schema as specified in ISO/IEC 15938-10. The namespace of the Version 2 schema providing a basis for the Photo-Player collection-level schema is “urn:mpeg:mpeg7:schema:2004”. The following table lists the ISO/IEC 15938 description tools (global elements, global attributes, attribute groups, complexTypes and simpleTypes) selected to be included and any further constraints imposed on these description tools for the collection-level metadata schema:

Each entry gives a name followed by its elements/attributes; a constraint, where one applies, follows the element/attribute name in parentheses.

Global Elements

Mpeg7: DescriptionUnit (xsi:type="ContentCollectionType"); Description (element excluded)

Complex Types

Mpeg7BaseType
DSType: Header (element excluded); id; timePropertyGrp (attributeGroup excluded); mediaTimePropertyGrp (attributeGroup excluded)
HeaderType: id
DescriptionMetadataType: Confidence (element excluded); Version (element excluded); LastUpdate (minOccurs="1"); Comment (element excluded); PublicIdentifier (element excluded); PrivateIdentifier (element excluded); Creator; CreationLocation (element excluded); CreationTime; Instrument (element excluded); Rights (element excluded); Package (element excluded)
Mpeg7Type: DescriptionProfile; DescriptionMetadata (minOccurs="1"); xml:lang; timePropertyGrp (attributeGroup excluded); mediaTimePropertyGrp (attributeGroup excluded)
CollectionType: CreationInformation (minOccurs="0"); CreationInformationRef (element excluded); UsageInformation (element excluded); UsageInformationRef (element excluded); TextAnnotation; name
ContentCollectionType: VisualFeature (element excluded); GofGopFeature (element excluded); AudioFeature (element excluded); Content (element excluded); ContentRef (minOccurs="0"); ContentCollection (minOccurs="0"); ContentCollectionRef (element excluded)
CreationInformationType: Creation; Classification (element excluded); RelatedMaterial (element excluded)
CreationType: Title; TitleMedia (maxOccurs="1"); Abstract (element excluded); Creator (element excluded); CreationCoordinates (maxOccurs="1"); Location; Date; CreationTool (element excluded); CopyrightString (element excluded)
TimeType: TimePoint; RelTimePoint (element excluded); RelIncrTimePoint (element excluded); Duration (minOccurs="0"); IncrDuration (element excluded)
PlaceType: Name; NameTerm (element excluded); PlaceDescription; Role (element excluded); GeographicPosition; Point; datum; AstronomicalBody (element excluded); Region (element excluded); AdministrativeUnit (element excluded); PostalAddress; AddressLine; PostingIdentifier; xml:lang; StructuredPostalAddress (element excluded); InternalCoordinates (element excluded); StructuredInternalCoordinates (element excluded); ElectronicAddress (element excluded); xml:lang (attribute excluded)
GeographicPointType: longitude; latitude; altitude
TitleType: type (attribute excluded)
TitleMediaType: TitleImage; TitleVideo (element excluded); TitleAudio (element excluded)
TextAnnotationType: KeywordAnnotation; FreeTextAnnotation; StructuredAnnotation (element excluded); DependencyStructure (element excluded); relevance (attribute excluded); confidence (attribute excluded); xml:lang
KeywordAnnotationType: Keyword; type; xml:lang
CreatorType: Character (element excluded); Instrument (element excluded)
PersonNameType: GivenName; FamilyName; Title; Numeration; LinkingName; Salutation; dateFrom (attribute excluded); dateTo (attribute excluded); type (attribute excluded); xml:lang
NameComponentType: initial (attribute excluded); abbrev (attribute excluded)
MediaAgentType: Role; Agent; AgentRef (element excluded)
InlineTermDefinitionType: Name (maxOccurs="1"); preferred (attribute excluded); Definition (element excluded); Term (element excluded)
ControlledTermUseType: href
AgentType: Icon (element excluded)
PersonType: Name (maxOccurs="1"); NameTerm (element excluded); Affiliation; Organization; OrganizationRef (element excluded); PersonGroup (element excluded); PersonGroupRef (element excluded); Citizenship (element excluded); Address (minOccurs="0"); AddressRef (element excluded); ElectronicAddress; PersonDescription (element excluded); Nationality (element excluded)
OrganizationType: Name; type (attribute excluded); NameTerm (element excluded); Kind (element excluded); Contact (element excluded); ContactRef (element excluded); Jurisdiction (element excluded); JurisdictionRef (element excluded); Address (minOccurs="0"); AddressRef (element excluded); ElectronicAddress
ElectronicAddressType: Telephone; type (attribute excluded); Fax; Email; Url
TextualBaseType: xml:lang; phoneticTranscription (attribute excluded); phoneticAlphabet (attribute excluded)
ImageLocatorType: MediaTimePoint (element excluded); MediaRelTimePoint (element excluded); MediaRelIncrTimePoint (element excluded); BytePosition (element excluded)
InlineMediaType: MediaData16; MediaData64; type
MediaLocatorType: MediaURI (element excluded); InlineMedia (minOccurs="0"); StreamID (element excluded)
DescriptionProfileType: profileAndLevelIndication
ReferenceType: mpeg7:referenceGroup
TextualType

Simple Types

termURIReferenceType; termAliasReferenceType; termReferenceType; basicDurationType; durationType; mimeType; basicTimePointType; timePointType

Attribute Groups

referenceGrp: idref (attribute excluded); xpath (attribute excluded); href

7.1.3 Instantiation examples (Informative)

2005-09-03T09:20:25+09:00 Creator Akio Yamada Yuto’s 6th birthday Event #1
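As a rough illustration of how a collection-level description along the lines of Table 1 might be assembled in textual form, the following Python sketch builds a small XML tree with the standard library. It is a sketch under stated assumptions, not a normative instance: the helper `build_collection` is hypothetical, the `#photo1` and `#photo2` references are placeholders that do not follow the ISO/IEC 21000-17 fragment identifier syntax, the output has not been validated against the Annex A schema, and BiM encoding is not covered. The name and time stamp are taken from the example values above.

```python
import xml.etree.ElementTree as ET

MPEG7_NS = "urn:mpeg:mpeg7:schema:2004"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
ET.register_namespace("", MPEG7_NS)      # serialize MPEG-7 as the default namespace
ET.register_namespace("xsi", XSI_NS)

def q(tag):
    """Qualify a tag name with the MPEG-7 namespace."""
    return "{%s}%s" % (MPEG7_NS, tag)

def build_collection(name, last_update, photo_refs):
    """Build a minimal root collection: mandatory LastUpdate plus ContentRef entries."""
    root = ET.Element(q("Mpeg7"))
    meta = ET.SubElement(root, q("DescriptionMetadata"))
    ET.SubElement(meta, q("LastUpdate")).text = last_update
    unit = ET.SubElement(root, q("DescriptionUnit"), {
        "{%s}type" % XSI_NS: "ContentCollectionType",
        "name": name,
    })
    for ref in photo_refs:               # placeholder references, one per photo
        ET.SubElement(unit, q("ContentRef"), {"href": ref})
    return root

root = build_collection("Yuto's 6th birthday",
                        "2005-09-03T09:20:25+09:00",
                        ["#photo1", "#photo2"])
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```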

7.2 Item level descriptive metadata

7.2.1 Overview

A BiM encoded Image DS is used to represent item-level descriptive metadata. Figure 3 shows an overview of the metadata and Table 2 summarizes the semantics of available tools.

Table 2 — Semantics of tools in item-level descriptive metadata

Tag Name | Presence | Semantics
DescriptionMetadata/Creator | Optional | Describing the author of the item description. Use the term “Creator” registered in RoleCS as his/her role.
DescriptionMetadata/CreationTime | Optional | Describing the time stamp when the item description was created.
DescriptionMetadata/LastUpdate | Mandatory | Describing the time stamp of the most recent change to the item description.
DescriptionUnit/Image/MediaInformation/MediaProfile/MediaFormat | Optional | Describing the file property of the original resource. Here original resource means external resource if it is available and internal resource if not; FileSize and Frame elements are used to specify the size of code-stream and pixel dimension of image, respectively. If external resource is available, this field represents the attributes of external one.
DescriptionUnit/Image/MediaInformation/MediaProfile/MediaInstance/MediaLocator | Optional | Indicating the location where an external JPEG resource is available. Only one instance is allowed to be instantiated. Note that this is an optional element. In the case that no MediaLocator instance is included in the item-level metadata, it means that only internal resource is available.
DescriptionUnit/Image/CreationInformation/Creation/Title | Optional | Describing the title of the resource. It might be available by referring to corresponding Exif tags of the resource.
DescriptionUnit/Image/CreationInformation/Creation/Creator | Optional | Describing persons or organizations who relate to the creation process of the resource, such as photographer, publisher and so on. Their roles can be described using RoleCS. A variety of methods can be used to identify the Creators, including electronic address elements such as url or email. Regarding the photographer, such information might be available by referring to corresponding Exif tags of the resource. If the Role is set to “Actor”, this field also describes the identity of persons who appear in the image.
DescriptionUnit/Image/TextAnnotation | Optional | Describing summary text of the resource. It might be available by referring to corresponding Exif tags of the resource.
DescriptionUnit/Image/CreationInformation/Creation/Location | Optional | Describing the location where the resource was captured. GPS location information might be available by referring to corresponding Exif tags of the resource.
DescriptionUnit/Image/CreationInformation/Creation/Date | Optional | Describing the time when the resource was captured. It might be available by referring to corresponding Exif tags of the resource.
DescriptionUnit/Image/VisualDescriptionScheme | Optional | Describing signal-level characteristics of the resource. Several elements can be instantiated. Suggestions appropriate to certain applications are given in Annex C.

Mpeg7
    Description Metadata
        Creator
        Creation Time
        Last Update
    Image
        Media Information
        Creation Information
            Creation
                Title
                Creator
                Abstract
                Location
                Date
        Visual Feature

Figure 3 — Overview of item level descriptive metadata

7.2.2 Constraints to MPEG-7 Schema

The item-level Photo-Player schema is defined with respect to the Version 2 schema as specified in ISO/IEC 15938-10. The namespace of the Version 2 schema providing a basis for the Photo-Player item-level schema is “urn:mpeg:mpeg7:schema:2004”. The following table lists the ISO/IEC 15938 description tools (global elements, global attributes, attribute groups, complexTypes and simpleTypes) selected to be included and any further constraints imposed on these description tools for the item-level metadata schema:

Each entry gives a name followed by its elements/attributes; a constraint, where one applies, follows the element/attribute name in parentheses.

Global Elements

Mpeg7: DescriptionUnit (xsi:type="ImageType"); Description (element excluded)

Complex Types

Mpeg7BaseType
DType
DSType: Header (element excluded); id; timePropertyGrp (attributeGroup excluded); mediaTimePropertyGrp (attributeGroup excluded)
HeaderType: id
DescriptionMetadataType: Confidence (element excluded); Version (element excluded); LastUpdate (minOccurs="1"); Comment (element excluded); PublicIdentifier (element excluded); PrivateIdentifier (element excluded); Creator; CreationLocation (element excluded); CreationTime; Instrument (element excluded); Rights (element excluded); Package (element excluded)
Mpeg7Type: DescriptionProfile; DescriptionMetadata (minOccurs="1"); xml:lang; timePropertyGrp (attributeGroup excluded); mediaTimePropertyGrp (attributeGroup excluded)
DescriptionProfileType: profileAndLevelIndication
MultimediaContentType
ImageType: Image
SegmentType: MediaInformation (minOccurs="0"); MediaInformationRef (element excluded); MediaLocator (element excluded); StructuralUnit (element excluded); CreationInformation (minOccurs="0"); CreationInformationRef (element excluded); UsageInformation (element excluded); UsageInformationRef (element excluded); TextAnnotation; type (attribute excluded); Semantic (element excluded); SemanticRef (element excluded); MatchingHint (element excluded); PointOfView (element excluded); Relation (element excluded)
MediaInformationType: MediaIdentification (element excluded); MediaProfile
MediaProfileType: ComponentMediaProfile (element excluded); MediaFormat; MediaTranscodingHints (element excluded); MediaQuality (element excluded); MediaInstance (maxOccurs="1"); master (attribute excluded)
MediaFormatType: Content; Medium (element excluded); FileFormat (element excluded); FileSize; System (element excluded); Bandwidth (element excluded); BitRate (element excluded); TargetChannelBitRate (element excluded); ScalableCoding (element excluded); VisualCoding; Format (element excluded); Pixel (element excluded); Frame; height; width; aspectRatio (attribute excluded); rate (attribute excluded); structure (attribute excluded); ColorSampling (element excluded); AudioCoding (element excluded); SceneCodingFormat (element excluded); GraphicsCodingFormat (element excluded); OtherCodingFormat (element excluded)
MediaInstanceType: InstanceIdentifier; MediaLocator; LocationDescription
UniqueIDType: type; organization (attribute excluded); authority (attribute excluded); encoding
MediaLocatorType: MediaUri (minOccurs="0"); InlineMedia (element excluded); StreamID (element excluded)
StillRegionType: [name not recoverable] (element excluded); SpatialLocator (element excluded); SpatialMask (element excluded); MediaTimePoint (element excluded); MediaRelTimePoint (element excluded); MediaRelIncrTimePoint (element excluded); VisualDescriptor; VisualDescriptionScheme (xsi:type="StillRegionFeatureType", maxOccurs="1"); GridLayoutDescriptors (element excluded); IlluminationInvariantColor (element excluded); MultipleView (element excluded); SpatialDecomposition (element excluded)
VisualDType
VisualDSType
StillRegionFeatureType: DominantColor; ScalableColor; ColorStructure; ColorLayout; ColorTemperature (element excluded); IlluminationCompensatedColor (element excluded); Edge; HomogeneousPattern; TextureBrowsing (element excluded); ShapeMask (element excluded); Contour (element excluded)
DominantColorType: ColorSpace (element excluded); ColorQuantization (element excluded); SpatialCoherency; Value; Percentage; Index; ColorVariance
ScalableColor: Coeff; numOfCoeff; numOfBitplanesDiscarded
ColorStructure: Values; colorQuant
ColorLayoutType: YDCCoeff; CbDCCoeff; CrDCCoeff; YACCoeff2 (element excluded); YACCoeff5; YACCoeff9; YACCoeff14; YACCoeff20; YACCoeff27; YACCoeff63; CrACCoeff2 (element excluded); CrACCoeff5; CrACCoeff9; CrACCoeff14; CrACCoeff20; CrACCoeff27; CrACCoeff63; CbACCoeff2 (element excluded); CbACCoeff5; CbACCoeff9; CbACCoeff14; CbACCoeff20; CbACCoeff27; CbACCoeff63
HomogeneousTextureType: Average; StandardDeviation; Energy; EnergyDeviation
EdgeHistogramType: BinCounts
CreationInformationType: Creation; Classification (element excluded); RelatedMaterial (element excluded)
CreationType: Title (maxOccurs="1"); TitleMedia (element excluded); Abstract (element excluded); Creator; CreationCoordinates (maxOccurs="1"); Location; Date; CreationTool (element excluded); CopyrightString (element excluded)
TextAnnotationType: KeywordAnnotation; FreeTextAnnotation; StructuredAnnotation (element excluded); DependencyStructure (element excluded); relevance (attribute excluded); confidence (attribute excluded); xml:lang
KeywordAnnotationType: Keyword; type; xml:lang
PlaceType: Name; NameTerm (element excluded); PlaceDescription; Role (element excluded); GeographicPosition; Point; datum; AstronomicalBody (element excluded); Region (element excluded); AdministrativeUnit (element excluded); PostalAddress; AddressLine; PostingIdentifier; xml:lang; StructuredPostalAddress (element excluded); InternalCoordinates (element excluded); StructuredInternalCoordinates (element excluded); ElectronicAddress (element excluded); xml:lang (attribute excluded)
GeographicPointType: Longitude; Latitude; altitude
TitleType: type (attribute excluded)
CreatorType: Character (element excluded); Instrument (element excluded)
PersonNameType: GivenName; FamilyName; Title; Numeration; LinkingName; Salutation; dateFrom (attribute excluded); dateTo (attribute excluded); type (attribute excluded); xml:lang
NameComponentType: Initial (attribute excluded); abbrev (attribute excluded)
MediaAgentType: Role; Agent; AgentRef (element excluded)
InlineTermDefinitionType: Name (maxOccurs="1"); preferred (attribute excluded); Definition (element excluded); Term (element excluded)
ControlledTermUseType: href
AgentType: Icon (element excluded)
PersonType: Name (maxOccurs="1"); NameTerm (element excluded); Affiliation; Organization; OrganizationRef (element excluded); PersonGroup (element excluded); PersonGroupRef (element excluded); Citizenship (element excluded); Address (minOccurs="0"); AddressRef (element excluded); ElectronicAddress; PersonDescription (element excluded); Nationality (element excluded)
OrganizationType: Name; type (attribute excluded); NameTerm (element excluded); Kind (element excluded); Contact (element excluded); ContactRef (element excluded); Jurisdiction (element excluded); JurisdictionRef (element excluded); Address (minOccurs="0"); AddressRef (element excluded); ElectronicAddress
ElectronicAddressType: Telephone; type (attribute excluded); Fax; Email; Url
TextualBaseType: xml:lang; phoneticTranscription (attribute excluded); phoneticAlphabet (attribute excluded)
TextualType
TimeType: TimePoint; RelTimePoint (element excluded); RelIncrTimePoint (element excluded); Duration (element excluded); IncrDuration (element excluded)

Simple Types

termURIReferenceType; termAliasReferenceType; termReferenceType; textureListType; basicTimePointType; timePointType; unsigned1; unsigned3; unsigned5; unsigned6; unsigned8; unsigned12; integerVector

7.2.3 Instantiation examples (Informative)

2001-09-20T03:20:25+09:00 Creator Akio Yamada

Actor John Smith Image 138474 ??? http://www.nec.com/fig1.jpg John at the beach 0 5 0 89 203 14 120 43 74 12 243 212 27 48 34 32 12 10 13 9 10 14 15 8 7 3 16 12 9 6 6 2 6 4 4 2 1 7 5 3 2 1 6 4 2 2 2 5 4 5 3 1 5 5 6 5

2 6 5 4 4 1 6 4 4 4 0 6 3 5 2 1 5 5 6 6 4 2 3 6 7 3 2 5 5 7 3 2 4 4 7 1 5 6 4 6 1 5 7 4 5 1 6 4 6 5 1 3 4 7 6 19 20 103 87 99 130 97 73 112 109 122 132 108 102 105 113 106 141 103 111 78 76 82 117 88 70 69 61 48 68 48 53 106 84 94 130 94 75 107 104 117 128 100 99 97 107 92 132 90 106 76 64 78 110 83 65 64 52 39 72 35 47

8 Conformance Testing

8.1 File format conformance points

Compliant bitstreams (mp4 files) shall satisfy the following conditions.

8.1.1 File Structure

8.1.1.1 General

1) Exactly one collection-level descriptive metadata box and at least one item-level descriptive metadata box shall be present.

2) The number of item-level descriptive metadata boxes shall be the same as the number of internal resources in a file.

8.1.2 Resource

8.1.2.1 General

At least one resource (i.e. image) shall be present. Resources shall conform to the JPEG specification (ISO/IEC 10918-2).

8.1.2.2 Type A specific

The size of the internal resource shall exactly match the description in the corresponding item-level descriptive metadata.

8.1.2.3 Type B specific

1) The size of the internal resource need not match the size described in the corresponding item-level descriptive metadata, but the size of the corresponding external resource shall match the description.

2) The additional external resources specified by MediaLocator instances shall exist, at least when the collection data is created.
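The size rules of 8.1.2.2 and 8.1.2.3 can be sketched as a single check, following the Table 2 convention that the described "original" resource is the external one when it exists and the internal one otherwise. The function names are hypothetical. The marker test only looks for the JPEG start-of-image (FF D8) and end-of-image (FF D9) markers, which is a quick plausibility check and far from a full conformance test against ISO/IEC 10918.

```python
def looks_like_jpeg(data):
    """True if the bytes start with SOI (FF D8) and end with EOI (FF D9)."""
    return len(data) >= 4 and data[:2] == b"\xff\xd8" and data[-2:] == b"\xff\xd9"

def original_size_matches(declared_size, internal, external=None):
    """Compare the declared FileSize with the size of the original resource.

    The original resource is the external one when it is available
    (Type B item), otherwise the internal one (Type A item).
    """
    original = external if external is not None else internal
    return len(original) == declared_size

# Synthetic byte strings standing in for real codestreams.
thumbnail = b"\xff\xd8" + b"\x00" * 10 + b"\xff\xd9"     # 14 bytes
full_image = b"\xff\xd8" + b"\x00" * 100 + b"\xff\xd9"   # 104 bytes

print(looks_like_jpeg(thumbnail))                                  # True
print(original_size_matches(14, thumbnail))                        # True
print(original_size_matches(104, thumbnail, external=full_image))  # True
print(original_size_matches(14, thumbnail, external=full_image))   # False
```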

8.1.3 Metadata

8.1.3.1 General

3) The metadata instances shall be valid against the schemas specified in Annex A. The collection-level metadata shall be valid against the schema defined in Annex A-1, and the item-level metadata shall be valid against the schema defined in Annex A-2.

4) The ID attribute of a DescriptionUnit instance in item-level descriptive metadata shall be unique in a file.

5) The root collection shall include all photos in the file.

6) The following namespaces shall be used: XXXXXXXXX for collection-level descriptive metadata and XXXXXXXXX for item-level descriptive metadata, respectively.
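Condition 4) requires uniqueness across all item-level metadata documents in a file, not just within one document. The following is a minimal sketch of such a check in Python. It assumes the attribute is spelled `id` and matches elements by local name because the namespaces are not yet fixed in this draft. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from typing import Iterable, List


def local_name(tag: str) -> str:
    """Strip an ElementTree '{namespace}' prefix, if any."""
    return tag.rsplit("}", 1)[-1]


def duplicate_description_unit_ids(documents: Iterable[str], attr: str = "id") -> List[str]:
    """Return the DescriptionUnit identifiers that occur more than once in a file.

    `documents` holds the XML text of every item-level metadata instance.
    The attribute name `id` is an assumption of this sketch.
    """
    counts = Counter()
    for xml_text in documents:
        root = ET.fromstring(xml_text)
        for elem in root.iter():
            if local_name(elem.tag) == "DescriptionUnit" and attr in elem.attrib:
                counts[elem.attrib[attr]] += 1
    return sorted(value for value, n in counts.items() if n > 1)
```

An empty result means that no identifier is repeated. Schema validation against Annex A is a separate step that this sketch does not perform.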

8.1.3.2 Type A specific

No MediaLocator elements shall be included in item-level descriptive metadata instances.

8.1.3.3 Type B specific

At least one of the item-level descriptive metadata instances shall include a MediaLocator element to specify the location where the original resource is available.

8.2 Photo player device conformance points

8.2.1 Mandatory behaviours

8.2.1.1 General

Compliant devices shall have the following mechanisms.

1) To update the modification-date/time elements of both collection-level and item-level descriptive metadata when metadata is modified. The modification-date/time related elements include the following:

• Mpeg7/Description Metadata/Last Update (collection-level descriptive metadata)

• Mpeg7/Description Metadata/Last Update (item-level descriptive metadata)

2) To synchronize MPEG-7 visual metadata, specified in an Image/Visual Feature instance of item-level descriptive metadata, with modification of the corresponding resource. The modification date/time information can be described in the Creation Information of item-level descriptive metadata. If a device is not equipped with some of the MPEG-7 visual feature synchronization functions, the corresponding instances in the Image/Visual Feature instance shall be removed.

3) If a device cannot access an external resource, it shall not update or remove the corresponding metadata.

4) To update the list of resources when the resources are removed from or added to the file. The list is specified in the Content Collection/Content References instances.

5) Compliant products that have resource rendering capability shall implement a JPEG-compliant decoder.
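Behaviour 1) amounts to rewriting the Last Update value whenever a metadata document changes. The following is a minimal sketch in Python, assuming the element names are `DescriptionMetadata` and `LastUpdate` and that the time point uses the same form as the instantiation example in 7.2.3 (for example `2001-09-20T03:20:25+09:00`). The function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime


def local_name(tag: str) -> str:
    """Strip an ElementTree '{namespace}' prefix, if any."""
    return tag.rsplit("}", 1)[-1]


def set_last_update(xml_text: str, when: datetime) -> str:
    """Return the document with every DescriptionMetadata/LastUpdate set to `when`.

    `when` should be timezone-aware so that the offset (e.g. +09:00) is written.
    Element names are assumptions of this sketch.
    """
    root = ET.fromstring(xml_text)
    stamp = when.isoformat(timespec="seconds")
    updated = 0
    for elem in root.iter():
        if local_name(elem.tag) != "DescriptionMetadata":
            continue
        for child in elem:
            if local_name(child.tag) == "LastUpdate":
                child.text = stamp
                updated += 1
    if updated == 0:
        raise ValueError("no DescriptionMetadata/LastUpdate element found")
    return ET.tostring(root, encoding="unicode")
```

A device would call such a function on both the collection-level document and the affected item-level document after each metadata modification.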

8.2.2 Optional behaviours

This subclause is informative.

1) The compliant products should

A) copy code-stream properties from the Exif metadata of the resource to the corresponding item-level metadata instances when a file is created from an Exif compliant resource. The corresponding instances include the following:

I. Image/Media Information/Media Profile/Media Format/Frame (Exif: ImageWidth and ImageHeight)

B) add code-stream properties, including the following:

I. Image/Media Information/Media Profile/Media Format/FileSize

C) copy capturing conditions from the Exif metadata of the resource to the corresponding item-level metadata instances when a file is created from an Exif compliant resource. The corresponding instances include the following:

I. Image/Creation Information/Creation/Creation Date (Exif: DateTimeOriginal)

II. Image/Creation Information/Creation/GPS Information (Exif: GPS Info IFD)

D) copy annotations from the Exif metadata of the resource to the corresponding item-level metadata instances when a file is created from an Exif compliant resource. The corresponding instances include the following:

I. Image/Creation Information/Creation/Creator/Name (Exif: Artist)

II. Image/Creation Information/Creation/Title (Exif: ImageDescription)

III. Image/Creation Information/Creation/Abstract (Exif: UserComment)

2) Compliant devices may provide functions to create and update collections by adding or removing Content Collection instances in the Content Collection instance for the root collection. The following instances may be modified:

• Content Collection/Creation Information/Creation/Title

• Content Collection/Content References (only for sub-collections)

3) The compliant devices may have the functionalities to create a JPEG compliant file and an MPEG-7 compliant document from an MPEG Photo Player Application Format compliant file.
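The Exif correspondences listed in item 1) can be held in a lookup table. The following is a minimal sketch in Python, assuming the Exif values have already been read into a plain mapping keyed by tag name. The key `GPSInfo` is a hypothetical stand-in for the contents of the GPS Info IFD, and the function name is not defined by this specification.

```python
from typing import List, Mapping, Tuple

# Exif tag name -> item-level metadata path, following 8.2.2 1) A), C) and D).
EXIF_TO_ITEM_METADATA = {
    "ImageWidth": "Image/Media Information/Media Profile/Media Format/Frame",
    "ImageHeight": "Image/Media Information/Media Profile/Media Format/Frame",
    "DateTimeOriginal": "Image/Creation Information/Creation/Creation Date",
    "GPSInfo": "Image/Creation Information/Creation/GPS Information",  # hypothetical key
    "Artist": "Image/Creation Information/Creation/Creator/Name",
    "ImageDescription": "Image/Creation Information/Creation/Title",
    "UserComment": "Image/Creation Information/Creation/Abstract",
}


def map_exif(exif: Mapping[str, object]) -> List[Tuple[str, str, object]]:
    """Return (metadata path, Exif tag, value) for every tag that has a mapping.

    Tags without a listed correspondence are ignored.
    """
    return [
        (path, tag, exif[tag])
        for tag, path in EXIF_TO_ITEM_METADATA.items()
        if tag in exif
    ]
```

The FileSize property in B) is not copied from Exif, so it is deliberately absent from the table.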

Annex A (informative)

Schemas

A1. Collection-level descriptive metadata

A2. Item-level descriptive metadata

Annex B (informative)

Relevant technologies to create Photo Player metadata

B.1 Technologies to create collections

B.1.1 Situation-based clustering

The idea behind situation-based photo clustering is to use MPEG-7 visual image features to automatically group “similar” pictures. A simple but very efficient collection structure is introduced by grouping images by the occasion or situation in which they were taken. Examples of situations include a family dinner, a trip to a beach, etc. Such a structure is very natural for users, since they often remember the circumstances much better than the event date or filename(s). This provides the user with a simple, intuitive and effective means to browse through their collection, and moreover it is all done automatically based on the visual descriptors and the time information. Figure B.1 shows an example of situation-based photo clustering applied to a small collection of 15 photographs. Six situations were detected by the grouping algorithm, and the user can now assign appropriate labels to the groups, e.g. Situation 1 = school trip, Situation 2 = walk in the park, etc.

Figure B.1 — Situation-based photo clustering
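This annex does not specify the grouping algorithm. As an illustration only, the following Python sketch combines the two cues mentioned above, time information and visual descriptors, in the simplest possible way: photos are sorted by capture time and a new group starts whenever the time gap or the descriptor distance to the previous photo exceeds a threshold. The `Photo` type, the L1 distance, and both thresholds are assumptions of this sketch, and the descriptor is a stand-in for an MPEG-7 visual descriptor.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Sequence


@dataclass
class Photo:
    name: str
    taken: datetime
    descriptor: Sequence[float]  # stand-in for an MPEG-7 visual descriptor


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences between two equal-length descriptors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def group_by_situation(
    photos: Sequence[Photo],
    max_gap: timedelta = timedelta(hours=1),  # assumed threshold
    max_distance: float = 50.0,               # assumed threshold
) -> List[List[Photo]]:
    """Split time-ordered photos into groups of consecutive, similar photos."""
    groups: List[List[Photo]] = []
    for photo in sorted(photos, key=lambda p: p.taken):
        if groups:
            previous = groups[-1][-1]
            close_in_time = photo.taken - previous.taken <= max_gap
            similar = l1_distance(photo.descriptor, previous.descriptor) <= max_distance
            if close_in_time and similar:
                groups[-1].append(photo)
                continue
        groups.append([photo])
    return groups
```

Each resulting group could then be stored as a sub-collection and labelled by the user, as in the example of Figure B.1.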

B.1.2 Content-based categorization

Manually labeling a large collection is a very tedious task and is seldom performed by users. However, even quite generic and simple labels often prove very helpful when searching or browsing through large image collections. Simple labels such as indoors/outdoors, landscape, architecture, waterfront, sunset, etc., can be automatically assigned to images based on MPEG-7 Visual descriptors with good accuracy. Figure B.2 shows example image categories and corresponding photos.

Figure B.2 — Automatic assignment of simple category labels to Image Collections

B.2 Technologies to create metadata on the identities of persons appearing in photos

[To Do] Add Explanation about the usage of MPEG-7 AFR or CE results about personal identity recognition.

Annex C (informative)

Examples of collection structure

[ToDo]

Situation-based collections

Categorization-based collections

Person-Identity-based collections

Anything else

Annex D (informative)

Reference software

The reference software consists of (a) functional components, (b) command-line programs, and (c) a GUI-based layer.

A) The following components are provided

Functional unit                                Software module name
I/O Subroutines and MPEG-4 File system
User Interface (command line or GUI)
BiM encoder/decoder
Visual D extraction engine
Player functions

B) The following command line programs are provided

ProgramName    Functionality
ItemBuilder    Create a photo player file from resources and XML representation of metadata documents
DataParser     Separate resource and metadata documents from a photo player file

C) The following GUI based programs are provided

Annex E

Classification Schemes

E.1 CollectionTypeCS

Event: collections aggregating the contents which are created in a certain time slot

Category: collections aggregating the contents which belong to the same broad category, such as XXXXXXXXXXXX

Person: collections aggregating the contents in which the same person appears
