
Enabling on-the-fly Video Shot Detection on YouTube

Thomas Steiner
Google Germany GmbH
ABC-Str. 19, 20354 Hamburg, Germany

Ruben Verborgh
Ghent University – IBBT, ELIS, Multimedia Lab
9050 Ghent, Belgium

Joaquim Gabarró Vallés
Universitat Politècnica de Catalunya
08034 Barcelona, Spain

Michael Hausenblas
DERI, NUI Galway
IDA Business Park, Lower Dangan, Galway, Ireland

Raphaël Troncy
EURECOM
2229 route des crêtes, BP 193, Sophia Antipolis, France

Rik Van de Walle
Ghent University – IBBT, ELIS, Multimedia Lab
9050 Ghent, Belgium

ABSTRACT
Video shot detection is the processor-intensive task of splitting a video into continuous shots, with hard or soft cuts as the boundaries. In this paper, we present a client-side, on-the-fly approach to this challenge based on modern HTML5-enabled Web APIs. We show how video shot detection can be seamlessly embedded into video platforms like YouTube using browser extensions. Once a video has been split into shots, shot-based video navigation becomes possible and more fine-grained playing statistics can be created.

Categories and Subject Descriptors
I.2.10 [Vision and Scene Understanding]: Video analysis; H.5.1 [Multimedia Information Systems]: Video (e.g., tape, disk, DVI)

General Terms

a tremendous gift”, a caption from Randy Pausch’s famous last lecture Achieving Your Childhood Dreams¹, reveals the video of his lecture. If closed captions are neither available nor can be generated automatically, keyword-based search is still available over tags, video descriptions, and titles. Presented with a potentially long list of results, users rely on preview thumbnails based on video still frames to decide on the most promising result. YouTube uses an unpublished computer vision-based algorithm to generate smart thumbnails and lets video owners choose one out of three automatically suggested thumbnails.

In this paper, we introduce on-the-fly shot detection for YouTube videos as a third means, besides keyword-based search and thumbnail preview, for deciding on a video from the haystack. As a user starts watching a video, we detect shots in the video by visually analyzing its content. We do this with the help of a browser extension, i.e., the whole process runs dynamically on the client side, using modern HTML5 JavaScript APIs of the