APPENDIX

■ ■ ■ Summary and Outlook

It has been a long journey—almost 300 pages about a couple of new elements in HTML. Who would have thought there was this much to learn? Realistically, though, we are only at the start of what will be possible with audio and video in the coming years. Right now we are seeing only the most basic multimedia functionality implemented on the Web. Once the technology stabilizes, publishers and users will follow, and with them businesses, and further requirements and technologies will be developed. For now, though, we have a fairly complete overview of existing functionalities. Before giving you a final summary of everything we have analyzed in this book, let us mention two more areas of active development.

A.1 Outlook

Two further topics deserve a brief mention: metadata and quality of service metrics. Both of these topics sound rather ordinary, but they enable functionalities that are quite amazing.

A.1.1 Metadata API

The W3C has a Media Annotations Working Group[1]. This group has been chartered to create “an ontology and API designed to facilitate cross-community data integration of information related to media objects in the Web, such as video, audio, and images”. In other words: part of the work of the Media Annotations Working Group is to come up with a standardized API to expose and exchange metadata of audio and video resources. The aim is to facilitate interoperability in search and annotation.

In the Audio API chapter we already came across something related: a means to extract key information about the encoding parameters of an audio resource through the properties audio.mozChannels, audio.mozSampleRate, and audio.mozFrameBufferLength. The API that the Media Annotations Working Group is proposing is more generic and higher level. The proposal is to introduce a new Object into HTML that describes a media resource. Without going into too much detail, the Object introduces functions to expose a list of properties. Examples are media resource identifiers, information about the creation of the resource, the type of content, content rights, and distribution channels, and ultimately also technical properties such as frame size, codec, frame rate, sampling rate, and number of channels.

While the proposal is still a bit rough around the edges and could be simplified, the work certainly identifies an interesting list of properties about a media resource that is often carried by the media resource itself. In that respect, it aligns with some of the requests from archival organizations and the media industry, including the captioning industry, to make such information available through an API.

Interesting new applications become possible when such information is made available. An example is the open source Popcorn.js semantic video demo[2]. Popcorn.js is a JavaScript library that dynamically connects a video, its metadata, and its captions with related content from all over the Web. It basically creates a mash-up that changes over time as the video content and its captions change. Figure A–1 shows a screenshot of a piece of content annotated and displayed with Popcorn.js.

[1] See http://www.w3.org/2008/WebVideo/Annotations/
[2] See http://webmademovies.etherworks.ca/popcorndemo/

Figure A–1. A screenshot of a video mash-up example using Popcorn.js
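To give a feel for how a script can already read some technical metadata today, here is a minimal sketch. It uses only properties mentioned in this appendix: the standard duration attribute and Mozilla's experimental prefixed audio properties, which at the time of writing are available in Firefox only. The resource name is a placeholder.

<audio id="audio" src="audio.ogg" controls></audio>
<script type="text/javascript">
  var audio = document.getElementById("audio");
  // the technical properties become readable once the resource's
  // metadata has been parsed
  audio.addEventListener("loadedmetadata", function() {
    alert("duration: " + audio.duration + " seconds");
    if (audio.mozChannels !== undefined) {
      alert("channels: " + audio.mozChannels +
            ", sample rate: " + audio.mozSampleRate + " Hz" +
            ", framebuffer length: " + audio.mozFrameBufferLength);
    }
  }, false);
</script>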

A.1.2 Quality of Service API

A collection of statistics about playback quality will be added to the media element in the near future. Concrete metrics make it possible to monitor the quality of service (QoS) that a user perceives, to benchmark playback, and to help sites determine the bitrate at which their streaming should start. We would have used this functionality to measure the effectiveness of Web Workers in Chapter 7 had it been available. Even more importantly, if statistics about the QoS are continuously available, a JavaScript developer can use them to implement adaptive HTTP streaming.



We have already come across adaptive HTTP streaming in Chapter 2 in the context of protocols for media delivery. We mentioned that Apple, Microsoft, and Adobe offer solutions for MPEG-4, but that no solutions exist yet for other formats. Once playback statistics exist in all browsers, it will be possible to implement adaptive HTTP streaming for any format in JavaScript. This is also preferable to an immediate implementation of support for a particular manifest format in browsers—even though Apple has obviously already done that with Safari and m3u8, the format to support for delivery to the iPhone and iPad.

So, what are the statistics under discussion for a QoS API? Mozilla has an experimental implementation of mozDownloadRate and mozDecodeRate[3] for the HTMLMediaElement API. These capture, respectively, the rate at which a resource is being downloaded and the rate at which it is being decoded, both in bytes per second. Further, there are additional statistics for video called mozDecodedFrames, mozDroppedFrames, and mozDisplayedFrames, which respectively count the number of decoded, dropped, and displayed frames for a media resource. Together, these allow a playback bottleneck to be identified as either a network or a CPU issue. Note that Adobe has a much more extensive interface for Flash[4].

A slightly different set of QoS metrics for use in adaptive HTTP streaming is suggested in the WHATWG wiki[5]:

• downloadRate: The current server-client bandwidth (read-only)
• videoBitrate: The current video bitrate (read-only)
• droppedFrames: The total number of frames dropped for this playback session (read-only)
• decodedFrames: The total number of frames decoded for this playback session (read-only)
• height: The current height of the video element (already exists)
• videoHeight: The current height of the video file (already exists)
• width: The current width of the video element (already exists)
• videoWidth: The current width of the video file (already exists)

These metrics also reveal the bitrate that a video has actually achieved, which can be compared with the requested one to decide whether to switch to a higher or lower bitrate stream; a rough sketch of such logic follows below. We can be sure that we will see adaptive HTTP streaming implementations shortly after such an API has entered the specification and is supported in browsers.

[3] See http://www.bluishcoder.co.nz/2010/08/24/experimental-playback-statistics-for-html-video-audio.html
[4] See http://help.adobe.com/en_US/FlashPlatform/reference/actionscript/3/flash/net/NetStreamInfo.html
[5] See http://wiki.whatwg.org/wiki/Adaptive_Streaming#QOS_Metrics
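To make this concrete, here is a rough sketch of bitrate-switching logic in JavaScript, assuming the property names proposed in the WHATWG wiki. The thresholds, the polling interval, and the switchToBitrate() helper are invented for illustration, and downloadRate is assumed to be reported in bits per second.

var video = document.getElementsByTagName("video")[0];

function adaptBitrate(currentBitrate) {
  var bandwidth = video.downloadRate; // proposed: server-client bandwidth
  var dropRatio = video.droppedFrames / Math.max(1, video.decodedFrames);

  if (dropRatio > 0.2 || bandwidth < currentBitrate) {
    // the network or the CPU cannot keep up: step down
    return switchToBitrate(currentBitrate / 2);
  }
  if (bandwidth > 2 * currentBitrate && dropRatio < 0.01) {
    // plenty of headroom: try the next higher quality level
    return switchToBitrate(currentBitrate * 2);
  }
  return currentBitrate;
}

// switchToBitrate() is hypothetical: it would swap in the media
// resource encoded at the nearest available bitrate and return it.

// re-evaluate the situation every five seconds during playback
var bitrate = 500000; // bits per second
setInterval(function() {
  bitrate = adaptBitrate(bitrate);
}, 5000);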

This concludes the discussion of HTML5 media technologies under active development.

A.2 Summary of the Book

In this book we have taken a technical tour of HTML5 media.



Fallback solutions such as Flash, VLC, and Cortado for Ogg can help a content provider deliver only a single format without excluding audiences on browsers that do not support that format natively.

The Introductory Chapters

In the Audio and Video Elements chapter we had our first contact with creating and publishing audio and video content through the new audio and video elements.

Interacting with other HTML Elements

In HTML5 Media and SVG we used SVG to create further advanced styling. We used SVG shapes, patterns, and manipulated text as masks for video, implemented overlay controls for videos in SVG, placed gradients on top of videos, and applied filters to the image content, such as blur, black-and-white, sepia, or line masks. We finished this section by looking at the inclusion of the video element inside SVG resources.
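As a small taste of these techniques, here is a minimal sketch that applies an SVG-defined black-and-white filter to a video element through a CSS filter reference. It assumes a browser that supports inline SVG and CSS references to SVG filters (Firefox at the time of writing); video.ogv is a placeholder file name.

<svg xmlns="http://www.w3.org/2000/svg" height="0">
  <defs>
    <!-- removing all color saturation turns the video black-and-white -->
    <filter id="bw">
      <feColorMatrix type="saturate" values="0"/>
    </filter>
  </defs>
</svg>
<video src="video.ogv" controls style="filter: url(#bw);"></video>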


A Web Worker is a JavaScript process that runs in parallel to the main process and communicates with it through message passing. Image data can be posted from and to the main process. Thus, Web Workers are a great means to introduce sophisticated Canvas processing of video frames into a web page without putting a burden on the main page; the page stays responsive to user interaction and can continue to play videos smoothly. We experimented with parallelizing motion detection, region segmentation, and face detection.

A limiting factor is the need to pass every single video frame that a Web Worker is to analyze through a message, which massively reduces the Web Worker's efficiency. There are discussions in the WHATWG and W3C about giving Web Workers more direct access to video and image content to avoid this overhead.
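As a reminder, the message-passing pattern looks roughly like the following sketch. The file name worker.js, the canvas ids, and the frame size are made up for illustration; the worker simply gray-scales each frame.

// main page: paint the current video frame into a scratch canvas,
// post its pixels to the worker, and display the processed result
var worker = new Worker("worker.js");
var video = document.getElementsByTagName("video")[0];
var scratch = document.getElementById("scratch").getContext("2d");
var display = document.getElementById("display").getContext("2d");

function postFrame() {
  scratch.drawImage(video, 0, 0, 320, 160);
  worker.postMessage(scratch.getImageData(0, 0, 320, 160));
}
worker.onmessage = function(event) {
  display.putImageData(event.data, 0, 0);
  setTimeout(postFrame, 40); // roughly 25 frames per second
};
video.addEventListener("play", postFrame, false);

// worker.js: compute the luminance of every pixel and post the
// gray-scaled frame back to the main page
onmessage = function(event) {
  var data = event.data.data;
  for (var i = 0; i < data.length; i += 4) {
    var brightness = 0.3 * data[i] + 0.59 * data[i + 1] + 0.11 * data[i + 2];
    data[i] = data[i + 1] = data[i + 2] = brightness;
  }
  postMessage(event.data);
};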

All chapters through Chapter 7 introduced technologies that have been added to the HTML5 specifications and are supported in several browsers. The three subsequent chapters reported on newer developments that have received only trial implementations in browsers but are nonetheless important. Initial specifications exist, but we will need to see more work on these before there are interoperable implementations in multiple browsers.

Recent Developments

In Chapter 8, on the HTML5 Audio API, we introduced two complementary pieces of work for bringing audio data manipulation functionality into HTML5. The first proposal, by Mozilla, creates a JavaScript API to read audio samples directly from an audio or video element. The second is a filter-graph-based API, developed in the W3C Audio Incubator Group, which reads, manipulates, and writes sound through an AudioContext.
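To recall the flavor of the first proposal, here is a minimal sketch of Mozilla's experimental audio data API as implemented in Firefox; the analysis step is left as a placeholder.

var audio = document.getElementsByTagName("audio")[0];
var channels, sampleRate;
// the encoding parameters become readable once the metadata is loaded
audio.addEventListener("loadedmetadata", function() {
  channels = audio.mozChannels;
  sampleRate = audio.mozSampleRate;
}, false);
// during playback, Firefox raises MozAudioAvailable events that
// expose the decoded samples of each framebuffer
audio.addEventListener("MozAudioAvailable", function(event) {
  var samples = event.frameBuffer; // array of floats
  var time = event.time;           // time of the first sample, in seconds
  // ... analyze or visualize the samples here ...
}, false);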

I hope your journey through HTML5 media was enjoyable, and I wish you many happy hours developing your own unique applications with these amazing new elements.
