Pycaption Documentation Release 0.5.0


PBS
Jun 16, 2021

Contents

1 Table of contents
  1.1 Introduction
    1.1.1 Python Usage
  1.2 Supported formats
    1.2.1 SAMI Reader / Writer :: spec
    1.2.2 DFXP/TTML Reader / Writer :: spec
    1.2.3 SRT Reader / Writer :: spec
    1.2.4 WebVTT Reader / Writer :: spec
    1.2.5 SCC Reader :: spec
    1.2.6 Transcript Writer
  1.3 Extensibility

pycaption is a Python library for converting caption formats.

CHAPTER 1
Table of contents

1.1 Introduction

pycaption is a caption reading/writing module. Use one of the given Readers to read content into a CaptionSet object, and then use one of the Writers to output the CaptionSet into captions of your desired format.

Requires Python 2.7.

Turn a caption into multiple caption outputs:

    import pycaption.transcript
    from pycaption import CaptionConverter, SRTReader, SAMIWriter, DFXPWriter

    srt_caps = u'''1
    00:00:09,209 --> 00:00:12,312
    This is an example SRT file, which, while extremely short, is still a valid SRT file.
    '''

    # Read once, then write the same captions in several formats.
    converter = CaptionConverter()
    converter.read(srt_caps, SRTReader())
    print converter.write(SAMIWriter())
    print converter.write(DFXPWriter())
    print converter.write(pycaption.transcript.TranscriptWriter())

Not sure what format the caption is in?
Detect it:

    from pycaption import detect_format, SAMIWriter

    caps = u'''1
    00:00:01,500 --> 00:00:12,345
    Small caption'''

    # detect_format returns a reader class, or a false value if nothing matches.
    reader = detect_format(caps)
    if reader:
        print SAMIWriter().write(reader().read(caps))

Or if you expect to have only a subset of the supported input formats:

    from pycaption import SRTReader, DFXPReader, SCCReader, SAMIWriter

    caps = u'''1
    00:00:01,500 --> 00:00:12,345
    Small caption'''

    if SRTReader().detect(caps):
        print SAMIWriter().write(SRTReader().read(caps))
    elif DFXPReader().detect(caps):
        print SAMIWriter().write(DFXPReader().read(caps))
    elif SCCReader().detect(caps):
        print SAMIWriter().write(SCCReader().read(caps))

1.1.1 Python Usage

Example: Convert from SAMI to DFXP

    from pycaption import SAMIReader, DFXPWriter

    sami = u'''<SAMI><HEAD><TITLE>NOVA3213</TITLE><STYLE TYPE="text/css">
    <!--
    P{ margin-left: 1pt;
       margin-right: 1pt;
       margin-bottom: 2pt;
       margin-top: 2pt;
       text-align: center;
       font-size: 10pt;
       font-family: Arial;
       font-weight: normal;
       font-style: normal;
       color: #ffffff; }
    .ENCC{Name: English; lang: en-US; SAMI_Type: CC;}
    .FRCC{Name: French; lang: fr-cc; SAMI_Type: CC;}
    --></STYLE></HEAD><BODY>
    <SYNC start="9209"><P class="ENCC">
        ( clock ticking )
    </P><P class="FRCC">
        FRENCH LINE 1!
    </P></SYNC>
    <SYNC start="12312"><P class="ENCC">&nbsp;</P></SYNC>
    <SYNC start="14848"><P class="ENCC">
        MAN:<br/>
        <span style="text-align:center;font-size:10">When <i>we</i> think</span><br/>
        of E equals m c-squared,
    </P><P class="FRCC">
        FRENCH LINE 2?
    </P></SYNC>'''

    print DFXPWriter().write(SAMIReader().read(sami))

Which will output the following:

    <?xml version="1.0" encoding="utf-8"?>
    <tt xml:lang="en" xmlns="http://www.w3.org/ns/ttml" xmlns:tts="http://www.w3.org/ns/ttml#styling">
     <head>
      <styling>
       <style id="p" tts:color="#fff" tts:fontfamily="Arial" tts:fontsize="10pt" tts:textAlign="center"/>
      </styling>
     </head>
     <body>
      <div xml:lang="fr-cc">
       <p begin="00:00:09.209" end="00:00:14.848" style="p">
        FRENCH LINE 1!
       </p>
       <p begin="00:00:14.848" end="00:00:18.848" style="p">
        FRENCH LINE 2?
       </p>
      </div>
      <div xml:lang="en-US">
       <p begin="00:00:09.209" end="00:00:12.312" style="p">
        ( clock ticking )
       </p>
       <p begin="00:00:14.848" end="00:00:18.848" style="p">
        MAN:<br/>
        <span tts:fontsize="10" tts:textAlign="center">When</span> <span tts:fontStyle="italic">we</span> think<br/>
        of E equals m c-squared,
       </p>
      </div>
     </body>
    </tt>

1.2 Supported formats

Read:
- DFXP/TTML
- SAMI
- SCC
- SRT
- WebVTT

Write:
- DFXP/TTML
- SAMI
- SRT
- Transcript
- WebVTT

See the examples folder for example captions that can currently be read correctly.

1.2.1 SAMI Reader / Writer :: spec

Microsoft Synchronized Accessible Media Interchange. Supports multiple languages.

Supported Styling:
- text-align
- italics
- font-size
- font-family
- color

If the SAMI file is not valid XML (e.g. it has unclosed tags), the reader will still attempt to read it.

1.2.2 DFXP/TTML Reader / Writer :: spec

The W3C standard. Supports multiple languages.

Supported Styling:
- text-align
- italics
- font-size
- font-family
- color

1.2.3 SRT Reader / Writer :: spec

SubRip captions. If given multiple languages to write, the writer outputs all of them, joined together by a ‘MULTI-LANGUAGE SRT’ line.

Supported Styling:
- None

The reader assumes the input language is English. To change it:

    pycaps = SRTReader().read(srt_content, lang='fr')

1.2.4 WebVTT Reader / Writer :: spec

WebVTT is a W3C standard for displaying timed text in HTML5. Its specification is currently (as of February 2015) in draft stage, so not all features are implemented by major players; the same is true for pycaption.
Styling

Styling in WebVTT can be done via inline tags (e.g. <b>, <i> etc.) or external CSS rules applied to text wrapped in class (<c>) or voice (<v>) tags. pycaption currently only keeps voice tags on conversion.

Example:

    <v Fred>Hi, my name is Fred

is converted to

    Fred: Hi, my name is Fred

The following tags supported by WebVTT are stripped from the cue text:
• <c>, <i>, <b>, <u>, <ruby>, <rt>, <lang> and timestamp tags (<h:mm:ss.sss>)

Non-supported tags are left unchanged as a natural part of the cue text with no special meaning.

Positioning

The WebVTT specs allow customizing the position of cues by configuring a number of cue settings. pycaption currently only maintains positioning information on writing, in which case it supports the following settings:
• A WebVTT line position cue setting.
• A WebVTT text position cue setting.
• A WebVTT size cue setting.
• A WebVTT alignment cue setting.

pycaption does not support:
• A WebVTT vertical text cue setting.
• A WebVTT region cue setting.

Refer to the official WebVTT specification for details about the cue settings.

1.2.5 SCC Reader :: spec

Scenarist Closed Caption format. Assumes Channel 1 input.

Supported Styling:
- italics

By default, the SCC Reader does not simulate roll-up captions. To enable roll-ups:

    pycaps = SCCReader().read(scc_content, simulate_roll_up=True)

The reader also assumes the input language is English. To change it:

    pycaps = SCCReader().read(scc_content, lang='fr')

The reader also accepts an offset (measured in seconds) for the timestamps. For example, if the SCC file is 45 seconds ahead of the video:

    pycaps = SCCReader().read(scc_content, offset=45)

The SCC Reader handles both dropframe and non-dropframe captions, and will auto-detect which format the captions are in.

1.2.6 Transcript Writer

Text stripped of styling, arranged in sentences.
Supported Styling:
- None

The transcript writer uses natural sentence boundary detection algorithms to create the transcript.

1.3 Extensibility

Different readers and writers are easy to add if you would like to:
- Read/Write a previously unsupported format
- Read/Write a supported format in a different way (more styling?)

Simply follow the format of a current Reader or Writer, and edit to your heart’s desire.