US008239558B2

(12) United States Patent (10) Patent No.: US 8.239,558 B2 Setlur et al. (45) Date of Patent: Aug. 7, 2012

(54) TRANSPORT MECHANISMS FOR DYNAMIC 2002/0141740 A1* 10, 2002 Matsui ...... 386,111 RCH MEDIA SCENES 2004/0028049 A1* 2, 2004 Wan ...... 370,394 2004/0174875 A1* 9/2004 Geagan et al. . 370,389 O O 2004/0181666 A1* 9, 2004 Candelore ...... T13,160 (75) Inventors: Vidya Setlur, Irving, TX. (US); Daidi 2004/0230560 A1 11, 2004 Eliza Zhong, Tampere (FI); Miska 2005, 0141484 A1* 6/2005 Rasanen ...... 370,352 Hannuksela, Ruutana (FI); 2005/0188413 A1* 8, 2005 Mohammed et al...... 725, 112 Ramakrishna Vedantham, Irving, TX 2005/0223098 A1 10, 2005 Rimac et al. Sh city,ity,(S, (US); FOREIGN PATENT DOCUMENTS Olga Uapin, Fo orun, EP 1239646 9, 2002 Michael Ingrassia, San Jose, CA (US) WO WOO2O5089 1, 2002 WO WO O305,2626 6, 2003 (73) Assignee: Core Wireless Licensing, S.A.R.L., WO WO O305,2626 A1 6/2003 Luxembourg (LU) - OTHER PUBLICATIONS (*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 Schulzrinne, H. etal. "Audio-Video Transport Working Group'. RFC U.S.C. 154(b) by 818 days. 1889. Jan. 1996. Flute—FileDelivery Over Unidirectional Transport, Network Work (21) Appl. No.: 11/474,816 ing Group, T. Paila, Oct. 2004, 68 pages. SDP: Session Description Protocol, Network Working Group, (22) Filed: Jun. 26, 2006 Handley et al., Apr. 1998, 82 pages. Continued (65) Prior Publication Data ( ) US 2007/O157283 A1 Ju15, 2007 ul. , Primary Examiner — Kenny Lin Related U.S. Application Data Assistant Examiner — Saad A Waqas (60) Provisional application No. 60/694,440, filed on Jun. (74)74). AttAttorney, Agent, or Firm — AlbertDhandCa LLP 27, 2005. (57) ABSTRACT (51) Int. Cl. G06F 15/16 (2006.01) A transport mechanism for supporting the download of SVG (52) U.S. Cl...... 709/231 over FLUTE or UDP. A RTP payload format is specified that (58) Field of Classification Search ...... 709/231 enables live streaming and the streaming of rich media con See application file for complete search history. tent. According to the present invention, rich media content is encapsulated in RTP packets based upon the payload format (56) References Cited at the sender. With the present invention, an efficient frame work is provided for satisfying several use cases or scenarios U.S. PATENT DOCUMENTS that involve rich media transmission.

6,735,634 B1* 5/2004 Geagan et al...... 709,235 7,246,318 B2 * 7/2007 Debique et al...... 715.734 7.366,241 B2 * 4/2008 Matsui ...... 375,240.27 28 Claims, 6 Drawing Sheets

O 1. 2 3 O 1 2 3 4 5 6 7 8 9 O 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 O 1. ------sequence number ------timestamp ------synchronization source (SSRC) identifier ------contribution source (SSRC) identifiers M------TYPE R PR CTR <-- fixed header fields +------PAD GRP SOD L sS <--variable header fields and flags depending on TYPE U ------N : g : T : : SVG SAMPLE CONTENTS : W V+-+-+-+-+-+-+-+-+------/ US 8,239,558 B2 Page 2

OTHER PUBLICATIONS Office Action for Chinese Patent Application No. 200680030021.1, dated Oct. 9, 2010. Paila et al., Flute-File delivery over unidirectional transport. English translation of Office Action for Chinese Patent Application draft. 040605. Retrieved from internet: URL:http://www.facs.org/ No. 200680030021.1, dated Oct. 9, 2010. ftp/pub/pub/internet-drafts/draft-ietf-rmt-flute-08.txt, Retrieved O61120. International Search Report for PCT/IB2006/001751. * cited by examiner U.S. Patent Aug. 7, 2012 Sheet 1 of 6 US 8,239,558 B2

['OIH

09.I

I

U.S. Patent Aug. 7, 2012 Sheet 3 of 6 US 8,239,558 B2 U.S. Patent Aug. 7, 2012 Sheet 4 of 6 US 8,239,558 B2 £Z

T

0 U.S. Patent Aug. 7, 2012 Sheet 5 of 6 US 8,239,558 B2

U.S. Patent Aug. 7, 2012 Sheet 6 of 6 US 8,239,558 B2

38 O 54 52 44 RADIO 36-O CODEC INTERFACE

46 34 48- CONTROLLER KEYPAD

58 42 32 MEMORY INFRARED PORT DISPLAY

FIG. 9 US 8,239,558 B2 1. 2 TRANSPORT MECHANISMS FOR DYNAMIC The present invention provides for an efficient framework RCH MEDIA SCENES for satisfying several use cases or scenarios that involve rich media transmission. FIELD OF THE INVENTION These and other objects, advantages and features of the invention, together with the organization and manner of This present invention relates to transport mechanisms for operation thereof, will become apparent from the following streaming and downloading rich media content. More par detailed description when taken in conjunction with the ticularly, the present invention relates to transport mecha accompanying drawings, wherein like elements have like nisms for streaming and downloading rich media content numerals throughout the several drawings described below. containing (SVG) over point-to 10 point and broadcast/multicast services. BRIEF DESCRIPTION OF THE DRAWINGS BACKGROUND OF THE INVENTION FIG. 1 is a depiction of an SVG transport mechanism in accordance with the present invention; SVG is a useful format for media presentations. SVG can 15 FIG. 2 is a depiction of the format of the RTP payload fields provide a layout within which other media can be embedded in the transport mechanism of the present invention; and played, for real-time stock and traffic information, and for FIG.3 is a depiction of the RTP payload header for TYPE1 entertainment purposes. In recent years, Mobile Scalable packets; Vector Graphics (Mobile SVG), has been adopted as the new FIG. 4 is a depiction of the RTP payload header for TYPE2 imaging standard by the Third Generation Partnership Project packets; (3GPP) for playing a pivotal role in bringing improved graph FIG.5 is a depiction of the RTP payload header for TYPE3 ics and images to mobile devices. packets; Recently, 3GPP and the Open Mobile Alliance (OMA) FIG. 6 is a depiction of the RTP payload header for TYPE4 have begun work on the streaming of rich media over Portable 25 packets; and Simple Syndication (PSS) and Multimedia Broadcast/ FIG. 7 is a depiction of the RTP payload header for TYPE5 Multicast Service (MBMS). This requires the ability to com packets; bine both raster and vector graphics with existing audio, FIG. 8 is a perspective view of a mobile telephone that can Video, and timed text media. However, unlike existing frame be used in the implementation of the present invention; and based media, SVG follows for declarative animation with a 30 FIG. 9 is a schematic representation of the telephone cir specified presentation start time and duration. All the differ cuitry of the mobile telephone of FIG. 8. ent tracks in the rich media need to be temporally synchro nized and streamed via real-time transport protocol (RTP) DETAILED DESCRIPTION OF THE INVENTION packets using the track information contained within the ISO SVGT version 1.2 supports prefetching for progressive Base Media File Format. Currently, the RTP payload speci 35 downloading. However, for real-time streaming, a scene may fications cater to the packetization of frame-based media and change through animations and changes in scene states. This results in Synchronization problems among frame-based and sequence of Scene description and its spatial/temporal modi non-frame based SVG. fications needs to be streamed from the server to the players Currently, SVG and other media can only be downloaded on the client device. and progressively downloaded via HTTP. There is currently 40 Logical scenes in SVG animations are grouped together no mechanism for permitting the download of SVG over using the element. The SVGT 1.2 specification Suggests FLUTE, which is a Cascading Style Sheets, Level 2 (CSS2) that, for the case of progressive downloading and streaming, parser written in Java that implements SAC. SAC is an event all scene definitions must be children of the root ele based application program interface (API) for CSS parsers. ment in the DOM tree. Each adjacent top level element Additionally, due to the lack of an appropriate RTP payload 45 should contain adjacent chronological scenes in the anima format, there is currently no available mechanism for stream tion. ing SVG contents either out of ISO base media files or Scene updates are incremental updates to the SVG DOM directly from live content. that get sent to the client device one at a time. These updates Previously, there has been work on transport mechanisms include SVG element addition, element deletion, element for media formats Such as audio, video and timed text. Mac 50 replacement and element attribute update. romedia Flash, a proprietary format for vector graphics, does SVG supports media elements similar to Synchronized not have support for real time (RTSP/RTP) streaming. Multimedia Integration Language (SMIL) media elements. Instead, Flash only uses progressive downloading from a web Dynamic or real time media elements define their own time server or http streaming via the Flash Communication Server lines within their time container. All SVG media elements which only runs in a Windows environment. 55 Support the SVG timing attributes and run time synchroniza SUMMARY OF THE INVENTION tion. The real time media elements are audio and video, and are embedded as follows in SVG: The present invention provides for a transport mechanism for supporting the download of SVG over FLUTE or the User 60 Datagram Protocol (UDP). The present invention also pro

TYPE 1 1 4 4 3 3 3 4 4 S 3 3 3 1 2 PR 3 2 3 3 2 2 O 1 1 2 2 1 1 3 CTR3 1 2 3 4 CTR2 1 2 3 4 S CTR1 1 2 3 4 CTRO 1 CTR 1 1 2 3 2 3 1 1 2 4 S 3 4 3

The following sections specify the different payload head The “content script type” (CSCR) field identifies the ers for the TYPE values. TYPE1 headers, depicted in FIG.3, 15 default Scripting language for the given sample. This attribute are used to transport sample descriptions (both static and sets the default Scripting language for all of the instances of dynamic). The RTP transmission always starts with one or Script in the document. The value content type specifies a more than one TYPE1 packet containing static sample media type. If scripting is not enabled, then the value for this descriptions, including the one for the first TYPE2 packet. field is 0. The default value is “ecmascript with a value of 1. The TYPER, PR and CTR fields are used as above, followed A TYPE2 header, depicted in FIG. 4, is used to transport by the sample descriptions. Each sample description contains complete SVG scene samples. This unit is used in the case the following four fields in order: SDID content encoding when the SVG sample is small enough to be transported in |text encoding content Script type. one unit without having to make Smaller scene fragments. The TYPE, PD, SDID and R fields have similar interpretation as The “Sample Description Index Field” (SDID-8bits) is an 25 in TYPE 1. index used to identify the sample descriptions (SD) to help The “Number of Padding Bits Field” (PAD 3 bits) indi decode the payload. There are two types of SDID: static and cates how many padding bits are in the final octet of the real dynamic. The static and dynamic SDS use them respectively. payload. The static sample description remains active during the whole The GRP field (4 bits) indicates to which group the scene presentation, while the dynamic sample description remains 30 belongs. Generally, one scene followed by one or more scene active until the next scene sample is presented. The static and updates constitutes a group. However, the grouping nature is dynamic sample descriptions can be distinguished by their flexible and can be determined at the authoring level. Group associated SDID values. Both the static and dynamic sample ing allows a current list of active elements (TYPE4) to be sent description cannot be modified during their active periods. once per group, and allows for more efficient error correction Both sample descriptions are transmitted in TYPE1 packets. 35 mechanisms. The static SDID is used to identify static sample descriptions. The Sample Description Index Field (SDID) contains a The dynamic SDID is used to identify dynamic SDs. Larger reference to the sample description that must be used to parse values are assigned to the dynamic SDID, while lower values the scene description contained in the particular payload. are assigned to the static SDID. For example, 0-63 are The LNK flag (L–1 bit) indicates whether the current assigned to static SDIDS: 64-255 are assigned to dynamic 40 SVG sample contains an audio, video, animation or image SDIDs. This assignment threshold is transmitted out-of-band tag, indicating that the SVG references external audio, video, in the Session Description Protocol (SDP) before RTP trans SVG or image content. This field can help the client device mission. By default, the first 64 SDIDs are assigned to being 110 decide the decoding order and prepare for new transmis static, while the rest is assigned to be dynamic. However, this sions of external media content. assignment can vary depending on the server, network con 45 The SIM field (3 bits) is used to denote if Sample Dissimi ditions, number of sample descriptions, length of sample larity Information (SDI) is present or not. SDI provides infor descriptions, etc. mation as to how a sample is different from the scene graph The SDID's are connective integers, as discussed in the currently on the client. This information can either compare following example. The allowed value range of static SDIDs two scenes or two scene graphs (i.e., Scenes with scene is from 0 to 63. In other words, the maximum number of 50 updates). The choice depends upon the nature of the server as allowed static Sample Descriptions is 64. It is not necessary well as how the client can handle such information. for them to start from 0, but they must be connective. The The SIM field has several values. SIM=000 if the current allowed value range of dynamic SDIDs is from 64 to 255. packet does not contain any SDI. SIM-001 if the next sample During each active period, the maximum number of allowed and the sample in the current packet are identical. SIM-010 if dynamic sample descriptions is 192. It is not necessary for 55 the next sample is an empty SVG sample. SIM-011 if the two them to start from 64, but they must be connective. samples are different, and the SDI is appended at the end of The length of the version-profile, text-encoding-type and current packet. This occurs when the length of the SDI is compression-type are 16, 8 and 8bits, respectively. The “con small enough to be appended. TYPES2-5 support appending tent encoding field is a null terminated string with possible the SDI. SIM=100 if the two samples are different, and the values being none, bin Xml, gzip, compress and 60 SDI is sent in TYPE 7 or TYPE 9 packets when the length of deflate. This is specified according to www.w3.org/Proto the SDI is long. cols/rfc2616/rfc2616-sec3.html#3.5. The text encoding The S flag (1 bit) indicates whether the current packet field is a null terminated string with possible values selected contains the starting point of the current sample. While the M from the name or alias’ fields (depending upon the appli flag indicates the ending point of a current sample, the S flag cation) in the IANA specifications (which can be found at 65 indicates the starting point. The combination of the M and S www.iana.org/assignments/character-sets) Such aS flags provide the following information. When M=1 and S=1, US-ASCII, BS 4730, etc. the current packet contains a complete sample. When M=0 US 8,239,558 B2 7 8 and S-1, the current packet contains the first fragment of a The SDP session specifies the SVG format, its clock rate sample. When M=1 and S-0, the current packet contains a and the SDID threshold. An example of a media-level final fragment of the sample. When M=0 and S=0, the current description in SDP is shown below. In this case, four H.263 packet contains a middle fragment of a sample. In these Vedio media with the same format are embedded in the pre situations, “sample” refers to an SVG scene, scene update, 5 sentation. Each (“a-fmtp”, “a-rtpmap) pair describes one SDI or an active list of SVG elements. Source location. A TYPE3 header is used to transport whole SVG scene update samples. This unit is used in the case when the SVG scene update is Small enough to be transported in one unit m=svg+ 12345 RTP/AVP 96 without having to fragment into Smaller samples. All of the 10 a=rtpmap:96 X-SVG+XML/100000 fields have similar interpretations as in TYPE2 headers. A a=fmtp:96 scid-threshold=63:version profile=1.2:base profile=1 m=video 49234 RTPAVP9899 100101 scene update contains information for adding, deleting, and a=rtpmap:98 h263-2000/90000 replacing elements and their attributes in the current scene. a=fmtp:98 box=moov:track ID='1';profile="3:level='10'. The TYPE4 header contains definitions of currently active a=rtpmap:99 h263-2000/90000 SVG elements in the scene graph at that particular time 15 a=fmtp:99 file name='video2.h263;profile="3:level='10'. a=rtpmap:100 h263-2000/90000 stamp. This allows the client to better manage elements and a=fmtp: 100 item ID=3;profile="3:level='10'. memory. All of the fields have similar interpretations as the a=rtpmap:101 h263-2000/90000 other types. The GRP field allows TYPE4 packets to be trans a=fmtp:101 item name= “video4.h263;profile="3:level='10'. mitted only once per group for optimization. When to trans mit TYPE4 packets in the group (whether at the beginning, 20 There is no need to specify the source location of the main middle or end) is determined by the server 100 or by any SVG file. However, if there is an embedded SVG file or special instructions accompanying the SVG content. The content in the main SVG file, then the source location of the interpretation of the fields is the same as in the other TYPE file or content should be specified in SDP as an individual headers. An SVG definitions block containing currently media-level SDP description. active SVG elements is present in this TYPE. This allows the 25 A third implementation of the present invention involves client to check its SVG scene graph to make sure that it the transmission of static embedded media. The static embed contains all the SVG elements that are currently active. With ded media files (e.g. images) can be transmitted by either (1) the introduction of this TYPE4 format, errors in the scene sending them to the UE in advance via a FLUTE session; (2) graph due to packet loss may be corrected. In the SVG speci sending the static media to each client on a point-to-point fication, a