What’s New with MPEG?

An MPEG Standard for Rich Media Services

Jean-Claude Dufourd and Olivier Avaro, Streamezzo
Cyril Concolato, ENST

Lightweight Application Scene Representation (Laser) is the Moving Picture Experts Group's solution for delivering rich media services to mobile, resource-constrained devices. Laser provides easy content creation, optimized rich media data delivery, and enhanced rendering on all devices.

IEEE MultiMedia, October–December 2005. 1070-986X/05/$20.00 © 2005 IEEE

A rich media service is a dynamic, interactive collection of multimedia data such as audio, video, graphics, and text. Services range from movies enriched with vector graphic overlays and interactivity (possibly enhanced with closed captions) to complex multistep services with fluid interaction and different media types at each step. Demand for these services is rapidly increasing, spurred by the development of the next-generation mobile infrastructure and the generalization of TV content to new environments. However, despite long-lasting deployments and significant investment from various industries, mobile and, more generally, embedded interactive services (mobile Internet and interactive mobile TV, for example) have failed to reach the masses. In addition to conjectural (such as economic) and structural (such as a lack of compelling business models) problems, current technologies (see the "Current Technologies" sidebar) have failed to provide an effective user experience.

Using rich media services is more challenging on embedded devices than on PCs, where various interfaces are available and homogeneously implemented (such as a mouse or keyboard), and where ergonomic concepts have been tested and validated. On the move, when it's not always easy to interact and time is limited, users expect to be one click away from the information they need. In addition, end users are accustomed to quality Web interfaces, so to be successful in the embedded domain, service interfaces must leverage the online experience. Finally, because users pay for these services, they also expect a decent level of quality, efficiency, and readability.

The Moving Picture Experts Group (see the "Related Standards Organizations" sidebar for brief descriptions of MPEG and the other groups involved in this effort) has specified MPEG-4 part 20 (formally known as ISO/IEC 14496-20) as the new rich media standard dedicated to the mobile, embedded, and consumer electronics industries. MPEG-4 part 20 defines two binary formats: Lightweight Application Scene Representation (Laser) and Simple Aggregation Format (SAF). Laser enables a fresh and active user experience on constrained networks and devices based on enriched content, including audio, video, text, and graphics. It addresses the requirements of the end-to-end rich media publication chain: ease of content creation, optimized rich media data delivery, and enhanced rendering on all devices. As such, it fulfills the need for an efficient open standard.

Key features and use cases
Four key features distinguish Laser from existing technologies:

❚ In Laser, graphic animations, audio, video, and text are packaged and streamed together. Unlike existing mobile technologies, which are mostly aggregations of various components, Laser's design is based on a single, well-defined, and deterministic component that integrates all of the media (the same design that made Macromedia Flash successful on the Web). This integration ensures the richness and quality of the end-user experience.

❚ Laser provides full-screen interactivity with all streams. It uses vector graphic technology so content can easily fit the screen size; Laser therefore provides optimal content display despite high variations in screen resolution. In addition, it can use virtually all pixels as elements of the user interface, letting developers design rich and user-friendly interfaces.

❚ Laser efficiently delivers real-time content over constrained networks. More specifically, it delivers media content in packaged pieces, letting the device display a piece as it's received (as opposed to download-and-play mechanisms). Laser generalizes the streaming concept already in place for audio and video data to scene description and rich media. As such, content providers can design services that keep some information of interest on the screen at all times.

❚ Finally, Laser delivers rich media services at rates starting from 10 kilobits per second (Kbps), using vector graphic compression and dynamic scene updates. This drastically reduces end users' waiting time compared to standard Web-like approaches, in which the system resends the complete page even if only small changes have been made. This functionality is useful not only in low bit-rate networks such as General Packet Radio Service, but also in higher bit-rate networks, in which rich media services can be sent at low rates, preserving bandwidth to improve audio and video quality.

Three use cases demonstrate the new standard's benefits.

[Figure 1. Laser-based rich media services: (a) rich media portal, (b) interactive mobile TV, and (c) interactive screen saver.]

Rich media portal
In this application (see Figure 1a), a Laser engine enhances an existing Wireless Application Protocol (WAP) or Extensible Hypertext Markup Language (XHTML) service with rich media, much like Flash enhances Web sites. When a user accesses the WAP portal, a hyperlink provides access to the site's rich media part, giving the user a complete, deterministic, and consistent rich media experience. An intuitive interface with a pop-up menu makes navigating the site simpler and the screen feel larger. Figure 1a shows rich text with small (but readable) Arabic fonts. At any time, a hyperlink can bring the user back to the original portal.

Interactive mobile TV
Interactive mobile TV (see Figure 1b) aggregates multiple rich media use cases, from interactive mosaics, electronic program guides, and voting to personalized newscasts. All of these applications require a system that can provide deterministic rendering and behavior of rich media content (including audiovisual, text, graphics, and images, along with streamed TV and radio channels) in the user interface. The system must allow fluid navigation through content in a single application or service, as well as local or remote synchronized interaction for voting and personalization (for example, related menus or submenus, advertising, and content tied to the user profile or service subscription).

Interactive screen saver
The interactive screen saver (illustrated in Figure 1c) is an instance of a larger class of applications that receive content updates in the background (such as fixed or mobile convergent services). The screen saver uses mostly static data (that is, text, graphics, and images) arranged with transitions similar to those in a slide show, with an element of randomness in the presentation. The server adds new elements and removes expired elements from the application stored on the device. Developers should design this application with care to avoid overusing the device's power resources.

Technical aspects
Part 20 of the MPEG-4 standard consists of two specifications: Laser, which specifies the coded representation of multimedia presentations for rich media services; and SAF, which defines tools for fulfilling the requirements of rich media service design at the interface between scene representation and transport mechanisms.

[Figure 2. Laser engine architecture: an application operates on an SVG scene tree with Laser extensions and dynamic updates over audio, video, image, and font media; the scene is binary encoded, multiplexed with the Simple Aggregation Format (SAF), and passed to the transport and network layers.]

MPEG-4 part 20 defines a Laser engine as the viewer for Laser presentations. Such an engine has rich media composition capabilities on top of the capabilities common to classic multimedia players. These composition capabilities are, as a result of the technology selection process, based on Scalable Vector Graphics (SVG) Tiny 1.1 [1]. They rely on the use of an SVG scene tree and are enhanced with key features for mobile services, such as binary encoding, dynamic updates, state-of-the-art font representation, and the stable features of the upcoming SVG Tiny 1.2 specification (described in the "Current Technologies" sidebar), including audio and video support. Figure 2 illustrates the Laser engine architecture.

Laser
In Laser, a multimedia presentation is a collection of a scene description and media (zero, one, or more). A media is an individual content item of one of the following types: image (still picture), video (moving picture), audio, and, by extension, font data. A scene description consists of text, graphics, animation, interactivity, and spatial and temporal layout.

A Laser scene description specifies four aspects of a presentation:

❚ how the scene elements (media or graphics) are organized spatially (for example, the visual elements' spatial layout);

❚ how the scene elements are organized temporally (for example, if and how they're synchronized, or when they start or end);

❚ how users interact with the elements in the scene (for example, when a user clicks on an image); and

❚ how changes occur in a scene.

A Laser scene description changes through animations. The scene's states during the animation can be deterministic (that is, known when the animation starts) or not (for example, when a server sends modifications to the scene). A Laser stream is the scene description's sequence and its timed modifications. The notion of Laser access units is key to streaming Laser content while guaranteeing tight synchronization between the scene and the media assets composing the rich media presentation.

SVG scene tree. Laser uses an SVG scene tree at its core. It imports composition primitives from the World Wide Web Consortium's (W3C) specifications (all of SVG Tiny 1.1, some of SVG 1.1, and Synchronized Multimedia Integration Language, or SMIL, version 2) and uses the SVG rendering model to present the scene tree. Laser specifies hyperlinking capabilities, audio and video media embedding, vector graphics representations, animation, and interactivity features.

Scene tree extensions. After selecting SVG as Laser's core technology, MPEG identified several areas needing extensions to allow the development of efficient services:

❚ Because SVG Tiny doesn't have clipping, MPEG added simple axis-aligned rectangular clipping to let developers create such common user interface widgets as ticker tapes and simple transitions.

❚ SVG lacks a restricted, nonresampling rotation for video and images and a full-screen mode, which MPEG added.

❚ SMIL and SVG only allow the signaling of one synchronization master, whereas MPEG allows the specification of multiple synchronization references.

❚ MPEG defines new events: a long key press, and pause and resume for video, audio, and other timed elements.

Current Technologies
Several technologies are competing to achieve the vision of delivering rich media services to resource-constrained devices.

Flash and Flash Lite
Flash (http://www.macromedia.com/flash) is the current de facto standard for distributing rich media content on the Internet, but it doesn't efficiently address other industry requirements. For example, Flash isn't an open standard, which is critical for massive industry support, especially on mobiles. When using Flash, content creators, service operators, and device manufacturers are tied to Macromedia for the creation, distribution, and playback of rich media content. Because it's difficult to evolve a media infrastructure once it's deployed, organizations tend to avoid proprietary solutions and promote open standards.

Because Flash was designed for PCs, it's unsuitable for the mobile environment, as Flash Lite (the mobile version of Flash) demonstrated. To both remain compatible with its existing PC format and fit constrained device requirements, Macromedia had to compromise on technology. The first version of Flash Lite is a downgraded version of the PC version. Moreover, the problem of having a single vendor and a proprietary format remains.

Therefore, a rich media solution for mobiles must be open and allow easy conversion from the many existing types of Flash and other proprietary content into the new standard. Two standardization groups have tried to specify standards that would satisfy this requirement: the Moving Picture Experts Group (MPEG) and the World Wide Web Consortium (W3C).

MPEG-4 Binary Format for Scenes
MPEG-4 Binary Format for Scenes (BIFS) [1] is MPEG's first attempt in the composition coding field. It features innovative tools that let users create multimedia content mixing 2D and 3D graphics, introduces the notion of incrementally updating the scene, enables streaming of long-running scenes, and ensures a tight synchronization between a scene's audiovisual elements. We attempted to profile MPEG-4 BIFS to create a small enough subset for use on mobile phones, but to no avail. The inherent content and binary encoding structure makes it inappropriate for mobiles. Therefore, instead of compromising on the technology's performance, MPEG decided to create a rich media standard for constrained devices.

SMIL and SVG
The W3C also attempted to define languages for creating rich media content as an alternative to Flash, including the Synchronized Multimedia Integration Language (SMIL) [2] and Scalable Vector Graphics (SVG) [3, 4] standards. Both languages are getting traction in the mobile industry, where both the Third Generation Partnership Project (3GPP) and Open Mobile Alliance (OMA) consortia have adopted their mobile profiles.

However, SMIL and SVG are XML languages and rely on the HTML model for content consumption: download-and-play, or progressive download and rendering. The streaming of SMIL and SVG content is unspecified, making these models inappropriate for fast, dynamic, interactive, and interoperable content. A new standard for rich media content for mobiles must extend SVG consumption scenarios to streaming and broadcast.

Media File Format
Creating rich media content services for mobile systems isn't only about composition coding. To be successful, a mobile service must provide a reactive and fluid user experience, achieved through efficient delivery mechanisms. However, efficiency is difficult to achieve when distributing rich media content made of individual audio, video, and image content. Separately delivering all of these media streams to a mobile device incurs high latency unless the system uses an efficient aggregation mechanism. High-latency networks attach a specific penalty to multimedia content consumed in download-and-play mode because the content's waiting time is the sum of each media's waiting time when requested separately. Receiving all streams in a single package reduces waiting time to a single request delay.

Such aggregation mechanisms must be efficient and simple to implement. The International Organization for Standardization's (ISO) Base Media File Format [5] is a natural candidate for this task because the mobile industry has already adopted it. However, this file format was designed for storing large amounts of media data, easy editing, and streaming operations. It's inefficient for storing small amounts of timed media data. Thus, we need a simple-to-implement yet efficient aggregation format for mobile to complement the ISO Base Media File Format.

References
1. Information Technology - Coding of Audio-Visual Objects - Part 11: Scene Description and Application Engine, ISO/IEC 14496-11, Int'l Organization for Standardization/Int'l Electrotechnical Commission, 2005.
2. Synchronized Multimedia Integration Language (SMIL 2.0), 2nd ed., World Wide Web Consortium recommendation, 7 Jan. 2005; http://www.w3.org/TR/2005/REC-SMIL2-20050107/.
3. Mobile SVG Profiles: SVG Tiny and SVG Basic, World Wide Web Consortium recommendation, 14 Jan. 2003; http://www.w3.org/TR/2003/REC-SVGMobile-20030114/.
4. "Scalable Vector Graphics (SVG) Tiny 1.2 Specification," World Wide Web Consortium working draft, 13 Apr. 2005; http://www.w3.org/TR/2005/WD-SVGMobile12-20050413/.
5. Information Technology - Coding of Audio-Visual Objects - Part 12: ISO Base Media File Format, ISO/IEC 14496-12, Int'l Organization for Standardization/Int'l Electrotechnical Commission, 2004.

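The sidebar's aggregation argument (waiting time as the sum of per-media request delays versus a single request delay) can be made concrete with a toy model. The numbers below are invented for illustration; they are not measurements from any network or deployment:

```python
# Toy latency model: fetching each media with its own request versus
# receiving all media aggregated in a single package. All numbers are
# illustrative, not measurements.
def download_and_play_wait(num_media: int, rtt_s: float, transfer_s: float) -> float:
    # One request round trip plus transfer time per media, sequentially.
    return num_media * (rtt_s + transfer_s)

def aggregated_wait(num_media: int, rtt_s: float, transfer_s: float) -> float:
    # A single request round trip; the transfers themselves still add up.
    return rtt_s + num_media * transfer_s

# Five media items on a high-latency link (2.0 s RTT, 0.5 s transfer each):
print(download_and_play_wait(5, 2.0, 0.5))  # 12.5
print(aggregated_wait(5, 2.0, 0.5))         # 4.5
```

Under these assumptions, aggregation removes four of the five round trips; the transfer time itself is unchanged.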
Related Standards Organizations
The following standards organizations matter in the mobile multimedia services domain.

❚ Internet Engineering Task Force (IETF, http://www.ietf.org). A standards organization responsible for the Internet Protocol (IP), the Real-time Transport Protocol (RTP) for streaming, the Hypertext Transfer Protocol (HTTP) for download, and many others. IETF standards are called requests for comments (RFCs).

❚ International Organization for Standardization/International Electrotechnical Commission (ISO/IEC, http://www.iso.org and http://www.iec.ch). A joint standardization group.

❚ International Telecommunication Union (ITU, http://www.itu.int). A standards organization.

❚ Moving Picture Experts Group (MPEG). A group within ISO/IEC responsible for many standards, including Binary Format for Scenes (BIFS), a binary standard for 2D/3D vector graphics; MPEG-J, a standard Java interface to BIFS scenes; Lightweight Application Scene Representation (Laser) for 2D vector graphics; and Simple Aggregation Format (SAF) for lightweight streaming. Both Laser and SAF are part 20 of MPEG-4.

❚ Open Mobile Alliance (OMA, http://www.openmobilealliance.org). A standards organization for mobile applications, formerly the WAP Forum, responsible for the Wireless Application Protocol (WAP) standard.

❚ Third Generation Partnership Project (3GPP, http://www.3gpp.org). A standards organization for 3G mobile telecommunications.

❚ World Wide Web Consortium (W3C, http://www.w3.org). A standards organization responsible for Scalable Vector Graphics (SVG) with its SVG Tiny (SVGT) mobile profile; Synchronized Multimedia Integration Language (SMIL), an XML-based standard for interactive audiovisual presentations; Cascading Style Sheets (CSS), a standard for document styling; the Document Object Model (DOM), a standard software interface to access XML documents; and Extensible Hypertext Markup Language (XHTML), a reformulation of HTML in XML.

A final scene tree extension concerns text: SVG Tiny doesn't have simple text underlining, so MPEG added a set of font styles imported from digital television captions.

Dynamic updates. SVG currently supports the following content consumption use cases:

❚ In classical download-and-play mode, the user waits until the download ends to start viewing the content.

❚ A progressive rendering mode lets the user view the content while it's downloading. However, the downloaded data only adds new content to the existing content, making it difficult to manage long-running documents.

❚ Using scripting, a Document Object Model (DOM) software interface, and an ad hoc protocol, the server communicates scene modifications to the client.

However, SVG can't satisfy the following use cases, currently permitted by Flash or MPEG-4 Binary Format for Scenes (BIFS), in an efficient and interoperable way:

❚ representation of streamable cartoons;

❚ partitioning scenes into small packets that fit in size-limited delivery mechanisms (such as cell broadcast);

❚ dynamic creation of answers to user requests and their integration into the current scene; and

❚ dynamic push of content into an existing scene.

To enable these use cases, Laser complements SVG with a dynamic updating mechanism that uses Laser commands. Using these commands, a server can, for example, insert or delete graphical elements or modify an object's visual properties. Developers can also use these commands to enable Web cookie-like mechanisms.

Binary encoding. As the W3C specifies, content providers create, store, and transmit SVG content in XML form. Although XML is well suited for Web browsing with powerful PCs and high-bandwidth Internet connections, it incurs severe penalties in performance, code size, and memory requirements for small, predetermined vocabularies. Although the debate still rages over the generality of these statements, MPEG chose to use binary streams in the Laser specification and thus avoid much of the complexity of XML parsing.

Laser's binary format allows SVG content encoding. It uses a compact representation for the SVG elements' structure and specific coding algorithms to encode the SVG elements' attribute values. Because mobile platforms usually lack hardware float processing, compressing these attribute values must be simpler than on a PC.
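The dynamic updating mechanism described earlier can be sketched as small commands applied to a tree of identified elements. This is an illustrative model only: the Insert/Delete/Replace names mirror the spirit of Laser commands, not their normative syntax or encoding:

```python
# Sketch of a dynamic-update mechanism in the spirit of Laser commands:
# the server sends small commands that edit the scene in place instead of
# resending the whole document. Command and field names are illustrative.
scene = {
    "root":   {"tag": "svg",  "children": ["title", "ticker"]},
    "title":  {"tag": "text", "text": "News", "children": []},
    "ticker": {"tag": "text", "text": "...",  "children": []},
}

def apply_command(scene: dict, cmd: dict) -> None:
    if cmd["op"] == "Insert":        # add a new element under a parent
        scene[cmd["id"]] = cmd["element"]
        scene[cmd["parent"]]["children"].append(cmd["id"])
    elif cmd["op"] == "Delete":      # remove an element from the scene
        scene[cmd["parent"]]["children"].remove(cmd["id"])
        del scene[cmd["id"]]
    elif cmd["op"] == "Replace":     # modify one attribute in place
        scene[cmd["id"]][cmd["attr"]] = cmd["value"]

# The server pushes updates; only the changed parts travel.
apply_command(scene, {"op": "Replace", "id": "ticker",
                      "attr": "text", "value": "Breaking: ..."})
apply_command(scene, {"op": "Insert", "parent": "root", "id": "logo",
                      "element": {"tag": "image", "children": []}})
apply_command(scene, {"op": "Delete", "parent": "root", "id": "title"})

print(sorted(scene["root"]["children"]))  # ['logo', 'ticker']
print(scene["ticker"]["text"])            # Breaking: ...
```

The point of the design shows in the message sizes: each command names only the element it touches, so an update stays small no matter how large the scene has grown.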

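The constraint that attribute coding must stay cheap on devices without hardware floats can be illustrated with a fixed-point scheme. The 1/16-unit resolution and the varint-with-zigzag packing below are assumptions chosen for the sketch; they are not the normative Laser attribute coding:

```python
# Sketch of encoding float coordinates without hardware float support:
# quantize to fixed point (here 1/16 of a unit, an assumed resolution)
# and pack as base-128 varints with zigzag sign mapping.
# Illustrative only; not the normative Laser attribute coding.
def encode_coord(value: float, frac_bits: int = 4) -> bytes:
    q = round(value * (1 << frac_bits))   # fixed-point quantization
    z = (q << 1) ^ (q >> 63)              # zigzag: map sign into the LSB
    out = bytearray()
    while True:
        byte = z & 0x7F
        z >>= 7
        if z:
            out.append(byte | 0x80)       # continuation bit
        else:
            out.append(byte)
            return bytes(out)

def decode_coord(data: bytes, frac_bits: int = 4) -> float:
    # Assumes data holds exactly one varint.
    z, shift = 0, 0
    for byte in data:
        z |= (byte & 0x7F) << shift
        shift += 7
    q = (z >> 1) ^ -(z & 1)               # undo zigzag
    return q / (1 << frac_bits)

enc = encode_coord(-12.25)
print(enc.hex(), decode_coord(enc))  # 8703 -12.25
```

Integer-only arithmetic of this kind is what makes decoding cheap on a phone; the cost is the quantization step, which bounds coordinate precision.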
[Figure 3. Transmission of font data with Laser content: an SVG scene with an SVG font is transcoded to OpenType, encoded as a Laser stream plus an OpenType font stream for transport, then decoded and transcoded back into an SVG scene with an SVG font for playback. (SVG = Scalable Vector Graphics.)]

During the standardization process, MPEG rejected complex computations that would slightly improve the compression ratio while doubling the decoding time. The binary encoding of Laser is therefore straightforward, and its quality resides in the balance of complexity and efficiency. MPEG took special care in the encoding of values for some attribute types, such as float coordinates, vector graphics paths, and transformation matrices.

The Laser binary syntax is extensible, letting developers mix private extensions with normal Laser elements and attributes; decoders that don't know how to process these extensions can ignore them.

A generic MPEG-7 Systems (BiM) decoder can decode the Laser binary syntax, which is compatible with a predefined BiM configuration. BiM compatibility lets developers encode other XML syntaxes, such as XHTML, with Laser. However, the Laser standard doesn't mandate the use of BiM: the Laser binary syntax is specified such that a specific, BiM-agnostic decoder can process it.

Audiovisual support. A key feature of rich media is the support of audio and visual information. SVG 1.1 doesn't support audio or video, but SMIL 2 does, and such support will likely be included in SVG Tiny 1.2. The Laser specification includes the SMIL 2 audio and video elements as well, including the additional MediaClipping module for VCR-like media control. Binary identifiers refer to the audio and video streams, which travel alongside the Laser stream.

Use of font information. Both SVG and Laser let content creators embed font information in the content to ensure that it's presented as designed and that the terminal doesn't substitute any fonts at viewing time. However, MPEG deemed the SVG fonts solution too limited for the following reasons:

❚ It doesn't leverage device support of current font-rendering engines.

❚ It doesn't support OpenType fonts. These fonts are more readable at small screen sizes and better equipped for internationalization than SVG fonts.

❚ It doesn't let the SVG engine share fonts with other applications on the device, such as an XHTML browser.

Laser also carries font information alongside scene information as a media stream. The exact format is optional. One option is to use MPEG-4 part 18 [2], which defines a font data stream and can carry OpenType fonts, possibly compressed.

As a functional subset of OpenType fonts, SVG fonts can be transcoded into OpenType font data without losing SVG font information, and then carried with Laser and possibly reconstructed for playback in an SVG player, as Figure 3 shows.

Services as incremental scenes. Many rich media services rely on Laser's incremental scenes, made possible by the specification's append mode. The append mode lets content providers create Laser streams that aren't independent scenes, but add-ons to existing scenes.

The following are typical use cases of incremental scenes:

❚ Streaming style. The scene is designed as a sequence of frames, with a continuous stream of updates changing the current frame into the next frame. Bandwidth usage varies, but never drops to 0. Incremental scenes of this type are usually best transported over streaming protocols like the Real-Time Transport Protocol (RTP) [3].

❚ Interactive style. The scene is interactive, with the server processing user requests. The response to a user request is a change to the existing scene, not a new scene. Such scenarios also require continuous scene updates, but the transmission statistics differ from the streaming style: bandwidth usage is heavy for a short time after a user request and then drops to 0 until the next user request. Given that mobile usage varies greatly, the next user request could come a few seconds or a few hours later.

From the server's viewpoint, interactive transmissions are a series of separate connections, as opposed to the continuous connection of the streaming style. Application developers typically implement the interactive style using separate HTTP connections, because each data burst results from a user request. From a Laser viewer's viewpoint, however, the same scene or service is modified. Hence, the server must be capable of signaling an append mode that says, "This stream doesn't contain a totally new scene, but an improvement to the scene the viewer is currently processing."

Append mode also lets the server create multiple responses to possible user requests in advance. If we model the service as a state machine, each transition of the state machine represents a change to the current scene, and we can implement it as an append component. This requires careful authoring and scope management, particularly to avoid ID clashes between elements added by different append components. Still, this functionality lets servers cache most of their responses to users, and thus could dramatically improve the service's performance.

Simple Aggregation Format
SAF functionalities include

❚ simple aggregation of any type of media stream, resulting in a SAF stream with a low-overhead multiplexing schema for low-bandwidth networks, and

❚ the possibility of caching SAF streams, as described later.

Multiplexing media streams produces a SAF stream that any delivery mechanism (download-and-play, progressive download, streaming, broadcasting, and so on) can carry.

Aggregation mechanism. The SAF specification defines the binary representation of a compound data stream composed of various elementary streams, such as Laser scene, video, audio, image, font, and metadata streams. Multiplexing the data from these streams into one SAF stream gives us simple, efficient, and synchronous delivery. A SAF stream consists of SAF access units carrying

❚ configuration information for the media or Laser decoder;

❚ configuration information for elementary streams not carried in the SAF stream, such as streams that start interactively or that travel over another protocol;

❚ media or Laser access units;

❚ an end-of-stream signal, indicating that no more data will arrive in an elementary stream;

❚ an indication that no more data will arrive in the SAF session; and

❚ cache units, as explained in the next section.

We can use Laser and SAF independently, but MPEG currently only specifies the use of a SAF stream carrying a Laser stream. In this case, the first SAF access unit carries the Laser engine's configuration information.

Caching capabilities. One way to reduce server response time in high-latency networks is to anticipate what the user will need next and send it with the previous request. The CacheUnit is SAF's mechanism for doing this: a package of information attached to a URL. When it receives this package, the Laser viewer stores it in the cache, associated with the provided URL. If a user later requests that URL, the viewer uses the cached version. CacheUnits have expiration dates. This feature, together with the Laser append mode, significantly improves rich media service fluidity.
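The state-machine view of append mode sketches naturally as a table of precomputed transitions, each holding a cacheable append component. The names and fields below are illustrative, not Laser or SAF syntax:

```python
# Sketch of a service modeled as a state machine where each transition is
# a precomputed "append" update to the current scene, not a new scene, so
# the server can cache its responses. Field names are illustrative only.
TRANSITIONS = {
    # (current state, user request) -> (next state, cached append update)
    ("home", "open_news"):  ("news",  {"append": True, "adds": ["news_panel"]}),
    ("news", "open_video"): ("video", {"append": True, "adds": ["video_box"]}),
    ("news", "back"):       ("home",  {"append": True, "adds": ["home_menu"]}),
}

def respond(state: str, request: str):
    """Look up the precomputed append component for this transition."""
    return TRANSITIONS[(state, request)]

state = "home"
state, update = respond(state, "open_news")
print(state, update["adds"])   # news ['news_panel']
state, update = respond(state, "open_video")
print(state, update["adds"])   # video ['video_box']
```

Because every response is an add-on to the scene the viewer already holds, responses can be authored and cached in advance; the authoring care mentioned above corresponds to keeping the element IDs added by different transitions from clashing.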

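The aggregation mechanism described above can be sketched as a toy multiplexer that merges access units from several elementary streams into one stream in timestamp order. The header layout below is invented for the sketch; the normative SAF access-unit syntax differs:

```python
# Toy SAF-like multiplexer: merge access units (AUs) from several
# elementary streams into one stream, ordered by timestamp, each AU
# prefixed with (stream_id, timestamp, length). The header layout is
# invented for illustration; it is not the normative SAF syntax.
import struct

def mux(streams: dict[int, list[tuple[int, bytes]]]) -> bytes:
    """streams: stream_id -> list of (timestamp_ms, payload)."""
    aus = [(ts, sid, payload)
           for sid, units in streams.items() for ts, payload in units]
    out = bytearray()
    for ts, sid, payload in sorted(aus):
        out += struct.pack(">BIH", sid, ts, len(payload)) + payload
    return bytes(out)

def demux(data: bytes) -> list[tuple[int, int, bytes]]:
    units, pos = [], 0
    while pos < len(data):
        sid, ts, length = struct.unpack_from(">BIH", data, pos)
        pos += 7
        units.append((sid, ts, data[pos:pos + length]))
        pos += length
    return units

packed = mux({
    1: [(0, b"laser-config"), (40, b"scene-update")],  # Laser stream
    2: [(0, b"audio-au-0"), (20, b"audio-au-1")],      # audio stream
})
print([(sid, ts) for sid, ts, _ in demux(packed)])
# [(1, 0), (2, 0), (2, 20), (1, 40)]
```

A single request then fetches the whole package, which is the latency advantage argued in the "Current Technologies" sidebar; note how the first access unit here carries the Laser configuration, as the text above requires.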
Relationship with relevant parts of MPEG-4
The Laser standard relates mainly to two existing MPEG-4 specifications:

❚ Part 1: Systems [4, 5], including the synchronization layer and the Object Descriptor Framework (ODF); and

❚ Part 11: BIFS and MPEG-J (a standard Java interface to BIFS scenes) [6].

Currently, no specification defines Laser's use in a complete MPEG-4 systems environment. But Laser can replace BIFS in mobile applications, in applications not requiring integration of 2D and 3D content, and in MPEG-J. Theoretically, we could build an MPEG-4 systems application using a Laser stream instead of a BIFS stream, but with one major difference: a Laser stream doesn't use the ODF.

SAF maintains compatibility with the synchronization layer but doesn't mandate its use. SAF packet syntax follows synchronization layer packet syntax, with a particular configuration. The SAF specification serves as a multiplexing scheme at the synchronization layer level.

SAF and Laser are independent of, but aware of, the rest of the MPEG-4 specification. Therefore, SAF can carry and use MPEG-4 video or audio streams as defined in MPEG-4 parts 2, 3, or 10.

Storing Laser content
We can store Laser content in files compatible with the ISO Base Media File Format (see the "Current Technologies" sidebar). Because Laser streams are timed streams consisting of access units, storing them in ISO files is straightforward and similar to storing audio or video streams. We store each Laser access unit as a sample. All of the samples form a Laser track identified by a four-character code. We store the decoder's configuration as an entry in the sample description box, and Laser streams comprising only one access unit as a primary item of the file using the meta box structure (as in the Third Generation Partnership Project, or 3GPP, SMIL presentation specification).

Although a SAF stream is timed, it's an aggregation of several elementary streams (such as Laser, audio, and video), and it would therefore be inappropriate to store it as-is in an ISO file. However, SAF can serve as a delivery protocol, letting content creators define hint tracks to generate one SAF stream per ISO file, aggregating the file's Laser, audio, and video tracks.

Streaming
MPEG and the Internet Engineering Task Force have jointly specified, in MPEG-4 part 8 [7] and RFC 3640 [8], the transport over IP of any kind of MPEG-4 elementary stream, as long as it's packaged using the synchronization layer syntax. Hence, MPEG-4 part 20 doesn't specify any new elements in this domain. As elementary streams, Laser streams can be wrapped in synchronization layer packets and transported in RTP packets, as RFC 3640 specifies. Currently, no specification exists for carrying SAF packets over RTP.

Conclusion
MPEG will likely extend the Laser specification to support new features on mobile devices. Another direction for future work relates to the link between the Laser specification and the W3C Compound Document Formats (CDF) working group's work on the interactions among W3C standards such as SVG, SMIL, and XHTML. It's therefore a natural extension for MPEG to study how this work will affect Laser.

Finally, the Laser standard's development might affect other MPEG activities, such as the MPEG Multimedia Middleware (M3W) activity. M3W aims to improve the portability of applications and services by defining a series of application programming interfaces (APIs) that can evolve as middleware technology itself evolves. Dedicated APIs for accessing a Laser engine could be an interesting extension of M3W.

Acknowledgments
Cyril Concolato's work was partially funded by the European Commission through the Information Society Technologies (IST) project Dynamic and Distributed Adaptation of Scalable Multimedia Content in a Context-Aware Environment (DANAE, http://danae.rd.francetelecom.com).

References
1. Scalable Vector Graphics (SVG) 1.1 Specification, World Wide Web Consortium recommendation, 14 Jan. 2003; http://www.w3.org/TR/2003/REC-SVG11-20030114/.
2. Information Technology - Coding of Audio-Visual Objects - Part 18: Font Compression and Streaming, ISO/IEC 14496-18, Int'l Organization for Standardization/Int'l Electrotechnical Commission, 2004.
3. H. Schulzrinne, A. Rao, and R. Lanphier, Real Time Streaming Protocol (RTSP), Internet Eng. Task Force, RFC 2326, Apr. 1998; http://www.ietf.org/rfc/rfc2326.txt.
4. Information Technology - Coding of Audio-Visual Objects - Part 1: Systems, ISO/IEC 14496-1, Int'l Organization for Standardization, 2004.
5. Information Technology - Multimedia Content Description Interface - Part 1: Systems, ISO/IEC 15938-1, Int'l Organization for Standardization/Int'l Electrotechnical Commission, 2002.
6. Information Technology - Coding of Audio-Visual Objects - Part 11: Scene Description and Application Engine, ISO/IEC 14496-11, Int'l Organization for Standardization/Int'l Electrotechnical Commission, 2005.
7. Information Technology - Coding of Audio-Visual Objects - Part 8: Carriage of ISO/IEC 14496 Contents over IP Networks, ISO/IEC 14496-8, Int'l Organization for Standardization/Int'l Electrotechnical Commission, 2003.
8. J. van der Meer et al., RTP Payload Format for Transport of MPEG-4 Elementary Streams, Internet Eng. Task Force, RFC 3640, Nov. 2003; http://www.ietf.org/rfc/rfc3640.txt.

Jean-Claude Dufourd is a cofounder and chief scientist of Streamezzo, chair of the MPEG committee's Integration subgroup, and a member of the W3C's Scalable Vector Graphics (SVG) and Compound Document Formats (CDF) working groups. His research interests include mobile multimedia and MPEG-4 systems. Dufourd has a PhD in computer science from Ecole Normale Supérieure de Paris.

Olivier Avaro is a cofounder and vice-president of Streamezzo in charge of business development. Before this, he was CTO of the Hyper-languages and Multimedia Dialogs division of France Telecom R&D. He is chair of the Systems subgroup of the MPEG committee. His interests include mobile multimedia and MPEG-4 systems. Avaro is a graduate of Ecole Nationale Supérieure des Télécommunications de Bretagne.

Cyril Concolato is a research assistant in the Communication and Electronics Department of Ecole Nationale Supérieure des Télécommunications (ENST) in Paris, currently pursuing a PhD on scene description representations. He is a member of the SVG working group of the W3C and of the MPEG standardization group. His research interests include scene description languages, their compression, and efficient rendering.

Readers may contact the authors at Streamezzo, 83 Boulevard du Montparnasse, 75006 Paris, France; [email protected].
