The Continuous Media Web: a Distributed Multimedia Information Retrieval Architecture Extending the World Wide Web
Multimedia Systems (2005) 10(6): 544–558
DOI 10.1007/s00530-005-0181-8
REGULAR PAPER

Silvia Pfeiffer · Conrad Parker · André Pang

Published online: 8 August 2005
© Springer-Verlag 2005

S. Pfeiffer (✉) · C. Parker · A. Pang
CSIRO-ICT Centre, PO Box 76, Epping, NSW 1710, Australia
E-mail: {Silvia.Pfeiffer, Conrad.Parker, Andre.Pang}@csiro.au

Abstract The World Wide Web, with its paradigms of surfing and searching for information, has become the predominant system for computer-based information retrieval. Media resources, however information-rich, only play a minor role in providing information to Web users. While bandwidth (or the lack thereof) may be an excuse for this situation, the lack of surfing and searching capabilities on media resources is the real issue. We present an architecture that extends the Web to media, enabling existing Web infrastructures to provide seamless search and hyperlink capabilities for time-continuous Web resources, with only minor extensions. This makes the Web a true distributed information system for multimedia data. The article provides an overview of the specifications that have been developed and submitted to the IETF for standardization. It also presents experimental results with prototype applications.

Keywords Continuous Media Web · Annodex · CMML · Markup of media · Open media metadata standards

1 Introduction

Since the Web’s very foundation by Tim Berners-Lee, academics, entrepreneurs, and users have dreamt of ways to integrate the Web’s various multimedia contents into one distributed system for storage and retrieval. The power of the World Wide Web stems from being a world-wide distributed information storage system architected to be scalable, simple to author for, and simple to use. Information is found mostly through Web search engines such as Google or through Web portals, both of which commonly just provide starting points for hyperlinking (surfing) to other related documents. Information nowadays encompasses heterogeneous Web resources of all sorts, including media. The uniform means of accessing information on the Web is through hyperlinks given in the URI (Uniform Resource Identifier) format (see http://www.w3.org/Addressing/).

The World Wide Web is currently the predominant distributed information retrieval architecture on the Internet [2]. However, it is not yet a distributed information retrieval system for time-continuously sampled data such as audio and video. It is only set up well for information retrieval on HTML resources. Although the amount of audio and video data on the Internet is increasing, these documents cannot be used as intensively as HTML content.

While URIs provide a uniform means to access and explore information, they can only point to a complete audio or video file, not to a specific temporal offset or a specific segment of interest. Also, when viewing an audio or video resource, the hyperlinking functionality of the Web is “left behind” and a dead end is reached. Audio or video resources cannot typically hyperlink to further Web resources, thus interrupting the uniform means of information exploration provided by the Web.

Web information search is also deficient for media resources. While it is nowadays possible to search the content of binary text documents, such as Postscript files or word processor documents, and identify whether they contain information relevant to a particular query, this is not possible for media resources. Media resources do not currently have a textual representation that fits into the text-based indexing paradigm of existing Web search engines, and therefore their content, however information-rich, cannot be searched uniformly on the Web.

Realizing this situation, we have developed an extension to the World Wide Web that addresses these issues. We are proposing a new standard for the way time-continuously sampled data such as audio and video files are placed on Web sites to make media content just as “surfable” as ordinary text files. In this new standard, users can link not only from a text passage to, e.g., a video, but also into a specific time interval containing the information being sought [21]. The time-continuous resource itself is annotated with HTML-like markup and has an XML representation of its content, enabling Web search engines to index it. The markup may contain hyperlinks to other Web resources, which enables search engines to crawl time-continuous resources in the same way as they crawl HTML pages. These hyperlinks also enable the Web user to hyperlink through a time-continuous resource, thus integrating these resources into the “surfing” paradigm of the Web.

At CSIRO we call the Web that is extended to time-continuous resources the Continuous Media Web (CMWeb) [24]. The technology is based upon the observation that the data stream of existing time-continuous data might be temporally subdivided based on a semantic concept, and that this structure enables access to interesting subparts, known as clips, of the stream. This structure is captured in an HTML-like markup language, which we have named Continuous Media Markup Language (CMML) [23]. CMML allows annotations and hyperlinks to be attached to media resources. For synchronized delivery of the markup and the time-continuous data over the Web, we have developed a streamable container format called Annodex [22], which encapsulates CMML together with the time-continuous resource. When consuming (i.e., listening to, viewing, reading) such an annotated and indexed resource, the user experience is such that while, e.g., watching a video, the annotations and links change over time and enable browsing of collections simply by following a link to another Web resource.

The remainder of this article is organized as follows. Section 2 provides some definitions for the key concepts discussed in this article, and Sect. 3 discusses some related research work. Section 4 describes the specifications of the CMWeb architecture and technologies, introducing temporal URI addressing, the Continuous Media Markup Language (CMML), and the Annodex format. Switching the focus to the actual handling of Annodex resources, Sects. 5 and 6 detail how an Annodex resource is produced from an existing media file such as an MPEG-1 encoded video and how it is distributed over the Internet. Section 7 looks at information retrieval through Web search engines. Section 8 provides some experimental results of first implementations. Section 9 compares the CMWeb to other media standards with respect to their capabilities of extending the World Wide Web to a distributed multimedia information retrieval infrastructure. Finally, Sect. 10 concludes the paper with a summary and outlook for future work.

2 Definitions

Before we can discuss work in this field, we need to establish that the terms we use have a well-understood meaning. Therefore, we now introduce some fundamental definitions.

• Time-continuously sampled data are any sequence of numbers that represents an analog-time signal sampled in discrete time steps. [...] Decompression will lead to the higher temporal resolution that the resource was sampled at in the first place. This is the case for all common audio and video compression formats. Other types of time-continuously sampled data may, for example, be physical measurements of natural phenomena, such as the acidity (pH) value of a lake, or the temporally changing value of a stock market. Time-continuously sampled data are also sometimes just called time-continuous data or, in the case of audio or video, just media.
• Time-continuous (Web) resource is a time-continuous data stream that can be distributed by a Web server. The resource may exist as a file or may be created on the fly (live or composed). To retain the ability to handle and display the time-continuous data progressively during delivery over HTTP, a time-continuous Web resource must further be streamable, i.e., it must be possible to decode it in a single linear pass using a fixed amount of memory.
• Clip is an arbitrary temporal section of a time-continuous Web resource as defined by a user. A clip makes sense with respect to some semantic measure of the user. A whole time-continuous Web resource may be regarded as a clip, though the more common case is to describe such a resource as a combination of several clips.
• Annotation is a free-text, unstructured description of a clip.
• Metadata are a set of name–value pairs that provide database-like structured descriptions; e.g., in HTML, metadata are represented in meta elements within the head tag [29]. An existing metadata scheme that is independent of the format of data that it can describe is the Dublin Core [7].
• Hyperlink is a Uniform Resource Identifier that can point to or into any Web resource. In this context, the term media browsing is then mainly used for following hyperlinks into and out of media files rather than looking through a collection of media files, as is the more common use of the term in media management systems.
• Meta-information is a collection of information about a time-continuous data stream, which may include annotations, metadata, and hyperlinks.
• Markup is the collection of tags that are used to provide meta-information to a Web resource. The most common language for providing markup is XML [33].

3 Related research

There have been several approaches to a better integration of media documents into the World Wide Web that have not been standards based. These will be briefly discussed here. A standards-based approach is superior as it enables interoperability between different proprietary solutions and scales over the whole Internet rather than providing a solution lim-