IEEE COMSOC MMTC E-Letter

MULTIMEDIA COMMUNICATIONS TECHNICAL COMMITTEE IEEE COMMUNICATIONS SOCIETY http://www.comsoc.org/~mmc

E-LETTER

Vol. 9, No. 2, March 2014

CONTENTS

Message from MMTC Chair ...... 3 EMERGING TOPICS: SPECIAL ISSUE ON CODING ALGORITHMS AND TECHNIQUES FOR MULTIMEDIA NETWORKING ...... 5 Guest Editor: Zongpeng Li, Chuan Wu ...... 5 [email protected], [email protected] ...... 5 Optimizing Media Quality in DASH (Dynamic Adaptive Streaming over HTTP) ...... 7 Sanjeev Mehrotra ...... 7 Microsoft Research, Redmond, WA, USA ...... 7 [email protected] ...... 7 Peer-to-Peer 3D/Multiview Video Distribution with View Synthesis Cooperation ...... 10 Fei Chen 1, Yan Ding 1, Jiangchuan Liu1, and Yong Cui ...... 10 1Simon Fraser University, Canada, {feic, yda15, jcliu}@sfu.ca ...... 10 2Tsinghua University, Beijing, China, [email protected] ...... 10 Considerations for distributed media processing in the cloud ...... 13 Ralf Globisch1, Varun Singh2, Juergen Sienel3, ...... 13 Peter Amon4, Mikko Uitto5, and Thomas Schierl6 ...... 13 1 Technische Universität Berlin (TUB), Berlin, Germany, [email protected] . 13 2 Aalto University, Espoo, Finland, [email protected] ...... 13 3 Alcatel-Lucent, Stuttgart, Germany, [email protected] ...... 13 4 Siemens AG, Munich, Germany, [email protected] ...... 13 5 VTT, Espoo, Finland, [email protected] ...... 13 6 Fraunhofer Heinrich Hertz Institute,Berlin,Germany, [email protected] . 13 Internet Video Multicast via Constrained Space Information Flow ...... 17 Yaochen Hu1, Di Niu1 and Zongpeng Li2 ...... 17 1 University of Alberta, Canada, {yaochen,dniu}@ualberta.ca ...... 17 2 University of Calgary, Canada, [email protected] ...... 17 Compressive Sensing for Video Coding: A Brief Overview ...... 20 Quan Zhou and Liang Zhou ...... 20 Nanjing University of Posts & Telecom, Nanjing, P.R. China ...... 20 {quan.zhou, liang.zhou}@njupt.edu.cn ...... 20 Offloading Policy of Video Transcoding for Green Mobile Cloud ...... 23 Weiwen Zhang and Yonggang Wen ...... 23 http://www.comsoc.org/~mmc/ 1/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter Nanyang Technological University, {wzhang9, ygwen}@ntu.edu.sg ...... 23 INDUSTRIAL COLUMN: SPECIAL ISSUE ON NETWORK CALCULUS ...... 26 Guest Editor: Florin Ciucu ...... 26 University of Warwick, [email protected] ...... 26 A Network Calculus Extension to EvalVid ...... 28 Emanuel Heidinger1, Hyung-Taek Lim2 ...... 28 2BMW Group Research and Technology ...... 28 [email protected], [email protected] ...... 28 Performance Analysis for Wireless Multimedia Sensor Networks Based on Stochastic Network Calculus ...... 31 Nao Wang, Gaocai Wang and Ying Peng ...... 31 Guangxi University, 530004, China, [email protected] ...... 31 A General Stochastic Framework for Low-Cost Design of Multimedia SoCs ...... 34 Balaji Raman1, Ayoub Nouri1, Deepak Gangadharan2, Marius Bozga1, Ananda Basu1 ...... 34 Mayur Maheshwari1, Axel Legay3, Saddek Bensalem1, and Samarjit Chakraborty4 ...... 34 VERIMAG (France) 1, Technical University of Denmark2 ...... 34 INRIA Rennes (France) 3, Technical University of Munich (Germany) 4 ...... 34 [email protected] ...... 34 Copula Analysis for Stochastic Network Calculus ...... 37 Kui Wu, Fang Dong, Venkatesh Srinivasan ...... 37 University of Victoria, BC, Canada ...... 37 {wkui, fdong, venkat}@uvic.ca ...... 37 A Delay Calculus for Streaming Media Subject to Video Transcoding ...... 41 Hao Wang and Jens Schmitt ...... 41 DISCO Lab, University of Kaiserslautern, Germany ...... 41 {wang, jschmitt}@informatik.uni-kl.de ...... 41 MMTC OFFICERS ...... 44

http://www.comsoc.org/~mmc/ 2/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter Message from MMTC Chair

Dear Fellow MMTC Members,

Time flies, and it has been almost two years since I served as the MMTC Chair from June 2012. In this message, I would like to call for nominations of MMTC officers for 2014-2016 and MMTC Service Awards of 2014.

During the past two years, MMTC has continued to flourish through the dedicated services of many volunteers. 13 MMTC Interest Groups have taken lead in organizing numerous special issues in top journals and magazines as well as conferences and workshops on hot multimedia topics. The TC has been keeping providing high quality monthly publications, with E-letters published in odd months and R-letter published in even months. Each issue of E-letter contains a special issue on emerging topics in multimedia as well as an industry column focusing on the recent industry innovations. Each issue of R-letter recommends top recent papers from the community, and includes a new distinguished category through collaborative recommendations with the Interest Groups. MMTC Service & Publicity Boards and Membership Boards work together in providing the best services to the community, through timely updates of the MMTC website and maintaining active MMTC Facebook and LinkedIn pages. MMTC Award Board is playing the key decisive role in selecting MMTC Best Paper Awards (by working with the Review Board) and MMTC Service Awards (by working with the MMTC officers).

What I am particularly proud is the formalization of various voting and selection procedures, regarding the nomination and election of the following important positions/awards:

- IEEE ICC/GLOBECOM Symposium Chairs - IEEE ICME TPC Co-Chair - IEEE ICME Steering Committee Member - IEEE CCNC Steering Committee Member - IEEE ComSoc Distinguished Lecturers - Associate Editors for IEEE Transactions on Multimedia - MMTC Best Paper Award - MMTC Outstanding Leadership Award - MMTC Distinguished Service Award - MMTC Excellent Editor Award

For all the above positions and awards, MMTC has formulated a transparent procedure that usually involves nominations open to MMTC members through the email list, followed by a voting by a committee usually includes the current MMTC officers, and/or IG Chairs/Vice-Chairs as well as Board Directors/Co-Directors. The procedure encourages the maximum participation from the community, and the decisions are made through the collected wisdom of the most active volunteers who are contributing to the success of MMTC on a daily basis.

Moving forward, it is now the time to elect the next edition of the TC leaders. An ad-hoc election committee has been formed, which is chaired by the past TC Chair, Dr. Haohong Wang ([email protected]) and includes a few past TC Chairs and senior officers. If you are interested in taking a leading role in shaping and serving the MMTC community in the next two years, please send your nomination (including self nomination) to Dr. Wang by April 15, for one of the following positions.

- MMTC Chair - MMTC Steering Committee Chair - MMTC Vice Chair - North America - MMTC Vice Chair - South America - MMTC Vice Chair - Asia - MMTC Vice Chair - Europe - MMTC Vice Chair - Letters and Member Communications - MMTC Secretary http://www.comsoc.org/~mmc/ 3/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter - MMTC Standard Liaison

The election on the short-listed candidates will be conducted in ICC 2014, Sydney, Australia.

Next, I would like to call for nominations of the MMTC Service Awards 2014, according to the procedures outlined at http://committees.comsoc.org/mmc/awards.asp.

MMTC has two types of service awards that are approved by ComSoc. The MMTC Distinguished Service Award is given to an individual who has made significant contributions to MMTC as a core leader over a substantial number of years. The MMTC Outstanding Leadership Award is given to an individual who has made significant contributions to MMTC as a current or past IG Chair/Co-Chair or Board Director/Co-Director. Current MMTC officers are not eligible for the awards.

This year, we called for nominations of the service awards through the MMTC email list on February 17, with the deadline of March 22. A nominator should send the nomination to me and copy to MMTC Secretary Dr. Liang Zhou (liang.zhou[at]ieee.org) in a single PDF file in the following format:

- Award name - Nominator name & affiliation, and contact info - Nominee name & affiliation, and contact info - A nomination letter no more than one page

The MMTC officers will discuss the nominations and provide the short-listed candidates to the Award Board for election. The results are again expected to be available by May, and the plaques will be awarded in ICC 2014.

I want to thank all the researchers and volunteers for making MMTC a great community. In particular, I would like to thank all current MMTC officers. It has been a great pleasure working with you and serving the community together.

Regards,

Jianwei Huang Chair, IEEE ComSoc Multimedia Communication Technical Committee http://jianwei.ie.cuhk.edu.hk/ http://www.comsoc.org/~mmc/ 4/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter SPECIAL ISSUE ON CODING ALGORITHMS AND TECHNIQUES FOR MULTIMEDIA NETWORKING

Coding Algorithms and Techniques for Multimedia Networking Guest Editor: Zongpeng Li, Chuan Wu [email protected], [email protected]

Multimedia computing and networking is an ever- “Consideration for Distributed Media Processing in young research direction that has witnessed sustained The Cloud”. The authors discuss the challenges in interest from both academia and industry in the past building distributed media processing platforms in the decade. Multimedia applications naturally benefit from cloud --- a desired direction to go since the cloud a number of coding techniques, including source coding, provides a rich pool of computing resources suitable for network coding, media trans-coding, to name a few. multimedia processing exemplified by transcoding. It Source coding such as layered cods and MDS codes was shown that open standards such as the IETF conducts encoding at the media source and decoding at Internet standards can be used to build a distributed the end clients, preparing the media content in a format media processing pipeline consisting of multiple cloud for in-network transmission that is most appropriate media processing services. according to the network status. Network coding are more general coding techniques that can be applied to Yaochen Hu, Di Niu from the Univiersity of Alberta different data transmission tasks in wireline and and Zongpeng Li from the University of Calgary wireless settings, and have enjoyed particular success in presented a new work on video multicast optimization multimedia networking. Media transcoding translates using a new mathematical model, in the fourth article, media streams between different playback rates, in “Internet Video Multicast via Constrained Space response to network conditions and customer demand. Information Flow”. Space information flow is a relatively new mathematical model for optimization This special issue of E-Letter focuses on the recent problems in network design and data networking. In progresses of multimedia networking, with a particular this work, it is applied to Internet video multicasting. A accent on networking algorithms and systems that are salient feature is that network optimization is related to one of the many coding schemes relevant to formulated and conducted in the Internet delay space multimedia. instead of in a finite network/graph. It is hoped that the space information flow model may prove useful in In the first article titled, “Optimizing Media Quality in future work on information networking and DASH (Dynamic Adaptive Streaming over HTTP)”, communications, including ones that are related to Sanjeev Mehrotra from Microsoft Research review multicast and network coding. solutions for optimal client and/or network adaptation in response to varying network bandwidth constraints, The fifth article is from Quan Zhou and Liang Zhou at for Internet media streaming applications. A utility Nanjing University of Posts and Telecommunications, maximization model is formulated, whose solution is with the title “Compressive Sensing for Video Coding: discussed. The solution framework is known as DASH A Brief Overview”. It’s nice to see an article that relates --- Dynamic Adaptive Streaming over HTTP. compressive sensing, a rather extensively studied theoretical tool for dimension reduction and image The second article, “Peer-to-Peer 3D/Multiview Video processing, connected to multimedia coding. Distribution with View Synthesis Cooperation”, is from Background on compressive sensing as well as existing Fei Chen, Yan Ding and Jiangchuan Liu at Simon and possible future applications are outlined in this Fraser University, as well as Yong Cui at Tsinghua succinct article. University. As evident in its name, this work is on the timely topic of 3D video delivery in a network. New The final article, “Offloading Policy of Video challenges and opportunities in the design and Transcoding for Green Mobile Cloud”, is contributed implementation of 3D video streaming systems are by Weiwen Zhang and Yonggang Wen from Nanyang discussed. Technological University. We are glad that one of the The third article is contributed by Ralf Globisch, Varun articles in this issue addresses green computing, in a Singh, Juergen Sienel, Peter Amon, Mikko Uitto, and time where a growing portion of the general public Thomas Schierl, from Germany and Finland. The title is become concerned in environment and green life styles. http://www.comsoc.org/~mmc/ 5/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter Specifically, this work proposes an energy-efficient Zongpeng was a visitor at the Institute of Network offloading policy for TaaS to minimize the energy Coding, Chinese University of Hong Kong. His consumption of transcoding on both mobile devices at research interests are in networking science and the end user and in the cloud, while striving to maintain network coding. Zongpeng is a senior member of IEEE. low delay latency. Chuan Wu received her This special issue is, of course, not intended to serve as B.Engr. and M.Engr. a comprehensive survey in the field of coding for degrees in 2000 and 2002 multimedia networking. Nonetheless, we hope that each from the Department of of the articles presented is of interest to a group of Computer Science and audience in the multimedia community, including Technology, Tsinghua members of the MMTC. Finally, we would like to University, China, and thank all the authors for their great contribution and the her Ph.D. degree in 2008 E-Letter Board for making this special issue possible. from the Department of Electrical and Computer Zongpeng Li received his Engineering, University B.E. degree in Computer of Toronto, Canada. Science and Technology Between 2002 and 2004, She worked in the Information from Tsinghua University Technology industry in Singapore. Since September (Beijing) in 1999, his M.S. 2008, Chuan Wu has been an Assistant Professor in the degree in Computer Department of Computer Science at the University of Science from University of Hong Kong. Her research is in the areas of cloud Toronto in 2001, and his computing, online and mobile social networks, peer-to- Ph.D. degree in Electrical peer networks and wireless networks. She is a member and Computer Engineering of IEEE and ACM, and the Chair of the Interest Group from University of on Multimedia services and applications over Emerging Toronto in 2005. Since Networks (MEN) of the IEEE Multimedia 2005, he has been with the Communication Technical Committee (MMTC) from Department of Computer 2012 to 2014. Science in the University of Calgary. In 2011-2012,

http://www.comsoc.org/~mmc/ 6/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter Optimizing Media Quality in DASH (Dynamic Adaptive Streaming over HTTP) Sanjeev Mehrotra Microsoft Research, Redmond, WA, USA [email protected]

1. Introduction bitrate, file size, starting time, duration, used, Media streaming over HTTP is seeing rapid adoption etc.. An MPD can be dynamically generated, e.g. for and is becoming the prevalent method for media live content, and also can be progressively sent to the streaming as it allows for the use of standard web client so as to avoid a large start-up time caused by servers and caching architecture (e.g. Content Delivery downloading the MPD. In addition, quality information Networks (CDN)) which is already ubiquitously (or distortion) can be included in the MPD for each present for the delivery of other (non-media) web segment encoding. Figure 1 shows the basic content. architecture for the content present within the serving architecture. In addition, although media streaming has deadline constraints, those constraints are typically at least an order of magnitude larger than network latencies (which are on the order of hundreds of milliseconds) since the client buffers several seconds of content. Therefore the underlying use of TCP inside HTTP does not affect performance as networking latency is of little concern to such applications.

However, since the internet does not typically have any Quality of Service (QoS) guarantees, the network bandwidth can vary dramatically. Therefore, Dynamic Figure 1: Content layout in DASH. The MPD file Adaptive Streaming over HTTP (DASH) has been contains information regarding each encoding for developed to allow for client-side adaptation in segment. Each encoding shown in a block is a response to network variation as well as other client different representation for the content. considerations (e.g. CPU, memory, battery, device capability) [1],[2],[3]. 3. Optimization of Dynamic Adaptive Streaming Across Multiple Users In this paper, we review existing solutions for optimal In [4], the authors formulate the problem of optimizing client and/or network adaptation in response to varying adaptive streaming of media over a shared resource bandwidth constraints by formulating it as a utility such as a mobile cellular network. The network may maximization (or equivalently distortion minimization) contain media streams as well as data streams. The subject to rate constraints imposed by the network. problem is formulated as that of maximizing the The optimization can occur over several users sharing aggregate utility of all streams as the same resource (as in the case of LTE or mobile) or temporally across various segments for a single user. ( )

∑ ∑ ( ) 2. DASH Architecture Overview

In DASH, the serving architecture (datacenter, CDN, ( ) or edge servers inside the ISP) host the content as files which the client requests using HTTP GET calls. The subject to data files hosted in the serving infrastructure includes the encoded media which is broken into independently decodable segments, with each portion coded at ∑ ∑ various bitrates and perhaps even using different codecs. In addition to the files containing the actual where are parameters used in the weighted - media encodings, a manifest or Media Presentation proportional fairness utility function, is a parameter Description (MPD) is present which is an XML based depending on content complexity, is the rate of the file containing information regarding the th video source, is the rate of the th data source, representations or encodings present for each segment along with information regarding the segments, such as and and are the minimal and maximal http://www.comsoc.org/~mmc/ 7/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter allowable rates of the th video source. is the deadline for each segment is chosen to make sure the total resource constraint and for mobile cellular client achieves glitch-free playback as well as the networks the resource consumed is a function of both desired startup latency. The estimated throughput can the rate as well as the Signal-to-Interference Plus Noise either be estimated or computed from the optimization Ratio (SINR), given by , where is the rate and in Section 3 for the multi-user case.

[ ( )] . For other networks, we could simply use for the resource consumed, that is, In order to optimize media quality, each segment is . typically encoded using variable bit-rate (VBR) encoding, that is the entire file or at least most of the Using standard techniques by turning the constrained file has to be downloaded in order to attempt playback problem into an unconstrained one using Lagrange of the segment. However, in order to further reduce multipliers, the authors in [4] derive the optimal rate startup latency, smaller sub-segment or constant bit- allocation which shows that is directly proportional rate (CBR) encodings with a smaller buffer can be to and inversely proportional to . This makes sense available. as a higher (more complex) content is one which has lower utility at the same rate and thus requires more In the optimization presented in [6], the client buffer is rate in order to operate at the same slope point in the assumed to have no constraints, which is typically okay rate-utility curve. Similarly a lower SINR (and thus as most clients have sufficient memory or can cache higher ) implies a receiver which consumes more future segments to disk. However, additional resources when operating at the same rate and thus constraints can be used to prevent decoder buffer- should receive less rate. It is then shown that the rate overflow (or encoder underflow) as ∑ allocation can be accomplished by traditional where is a function of the current client buffer size proportional fair schedulers with guaranteed bit rate and the maximum client buffer size. (GBR) constraints already employed at base stations by simply picking the GBR values appropriately for the Although the optimization here is a classic convex various flows. minimization (concave maximization) optimization problem with multiple linear constraints, it can be 4. Optimization of Dynamic Adaptive Streaming readily be solved in ( ) complexity as the constraints Across Multiple Segments of Single User are nested [5], [6]. The basic idea is to optimize first In many cases, the constrained resource (for example for the final rate constraint using the method of the bottleneck link in the network) is the last hop from Lagrange multipliers. If any earlier constraints are the server to the client. For example, most ISPs, such violated, the problem is split into two optimization as cable modem and DSL providers throttle the rate problems with constraints applied to each segment. available to the user depending on the service level. In such cases, cross-user optimization is not a possibility. The optimization can also be applied for live encodings by simply limiting the horizon of search to the However since the client buffer is usually on the order segments that are already available, which is typically of several seconds, the client can make optimizations on the order of tens of seconds. temporally across segments of the content [5]. In [5], we show that each client attempts to minimize By using this simple optimization, the client can distortion or correspondingly maximize the utility achieve a 3dB to 5dB gain in PSNR for segments across the next segments, that is, which are difficult to encode at the expense of minimal loss for segments which are easy to encode. In Figure

∑ ( ) 2, we show the gains by plotting the improvement in PSNR for segments as a function of the PSNR the subject to multiple ( ) rate constraints segment would obtain when using a simple non- optimized request strategy. The segments with low ∑ PSNR see the most gain, thus making the media more constant quality as opposed to constant bitrate. where is the concave utility function for the th segment as a function of rate, . The multiple rate 5. Challenges and Future Directions constraints, , are derived from the following five Although optimizing for utility and quality across a parameters: i) estimated throughput, ii) the current time, multi-user shared resource (such as a mobile cellular iii) the desired playback deadline of segment , iv) the network with multiple users connected to the same current client buffer size, and v) the desired buffer size base station) is a noble goal, the practicality of it after the download of current segment . The playback somewhat difficult as centralized scheduling is needed. http://www.comsoc.org/~mmc/ 8/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter

Distributed rate control and scheduling is also possible [3] T. Stockhammer, "Dynamic adaptive streaming over using distributed utility maximization algorithms, HTTP --: standards and design principles," in Proceedings of ACM MMSys, 2011. whereby each flow responds to congestion (once the resource is fully utilized) using its own utility scaling [4] D. De Vleeschauwer, H. Viswanathan, A. Beck, S. Benno, [7]. However, even then, the debate about network G. Li, and R. Miller, "Optimization of HTTP adaptive neutrality [8], [9] makes it difficult since it is difficult streaming over mobile cellular networks," in Proceedings of INFOCOM 2013, pp.989-997, April 2013. to calibrate how one user’s utility relates to another. It is much easier to calibrate utility from one flow to [5] S. Mehrotra and W. Zhao, "Rate-distortion optimized another flow of the same user. client side rate control for adaptive media streaming," in Proc. of IEEE International Workshop on Multimedia Signal Processing (MMSP), October 2009.

[6] A. Ortega, "Optimal bit allocation under multiple rate constraints," in Proceedings of Conference (DCC), pp.349-358, Mar/Apr 1996.

[7] F. P. Kelly, A. K. Maulloo, and D. K. H. Tan, “Rate control for communication networks: Shadow prices, proportional fairness and stability,” in Journal Operations (a) (b) Research Society, vol. 49, no. 3, pp. 237–252, Mar. 1998. Figure 2. PSNR improvements for segments as function of un-optimized segment PSNR for [8] Net neutrality: http://en.wikipedia.org/wiki/Network_neutrality. (a) TV drama clip and (b) Sports clip [9] J. Crowcroft, “Net neutrality: the technical side of the By optimizing across various flows for the same user debate: a white paper” in SIGCOMM Computer Communications Rev. 37, 1, pp.49-56, January 2007. and across time in a given flow, we can avoid the need for iterative distributed scheduling or centralized [10] Q. Xu, S. Mehrotra, Z. M. Mao, and J. Li, “PROTEUS: scheduling. Network Performance Forecast for Real-Time, Interactive Mobile Applications,” in Proceedings of ACM MobiSys, June 2013. In addition, one of the key aspects in the optimization across a single user is to estimate the link throughput. Sanjeev Mehrotra is a Principal Architect at Although the application can estimate the throughput Microsoft Research in Redmond, WA, using simple metrics (e.g. file size and time to USA. He received his Ph.D. in download), lower layers of the networking stack may electrical engineering from Stanford have additional information -- e.g. SINR, number of University in 2000. He was users on the network, location, past throughput at the previously the development manager location -- which can further enhance the rate for the audio codecs and DSP team in estimation. Alternatively, rate prediction can also be the core media processing technology used to predict future throughput to use in the team and led the development of optimization [10]. WMA audio . He is the inventor of several audio and technologies in use today and helped 6. Conclusion develop the initial prototype version of Microsoft’s In this paper we have shown methods in which Smooth Streaming. His work has shipped in numerous adaptive media streaming (such as DASH) can be Microsoft products and standard codecs such as H.264 optimized for media quality. This optimization can be AVC. His research interests include multimedia done across multiple users, multiple flows for a given communications and networking, multimedia user, as well as multiple segments for a given stream. compression, and large scale distributed systems. By taking advantage of multiplexing across users, flows, and segments, and taking advantage of the client Dr. Mehrotra is an author on over 30 refereed journal buffer, we can significantly enhance quality. and conference publications and is an author on over 70 US patent applications, out of which 35 have been References issued. He was general co-chair and TPC chair for the [1] ISO/IEC 23009-1, ISO/IEC 23009-2, ISO/IEC 23009-3, Hot Topics in Mobile Multimedia (HotMM) workshop ISO/IEC 23009-4, “Information technology – Dynamic at ICME 2012 and is currently an associate editor for adaptive streaming over HTTP (DASH)” – Parts 1 – 4. the Journal of Visual Communication and Image [2] I. Sodagar, “The MPEG-DASH Standard for Multimedia Representation. He is a senior member of IEEE and a Streaming Over the Internet”, IEEE Multimedia, pp. 62- recipient of the prestigious Microsoft Gold Star Award. 67, Oct.-Dec. 2011.

http://www.comsoc.org/~mmc/ 9/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter Peer-to-Peer 3D/Multiview Video Distribution with View Synthesis Cooperation Fei Chen 1, Yan Ding 1, Jiangchuan Liu1, and Yong Cui 2 1School of Computing Science, Simon Fraser University, Canada 2Department of Computer Science and Technology, Tsinghua University, Beijing, China {feic, yda15, jcliu}@sfu.ca, [email protected]

1. Introduction The recent advances in stereoscopic video capture, compression, and display have made 3-Dimensional (3D) video a visually appealing and costly affordable technology. We have witnessed a series of new releases of stereoscopic 3D products in the past years (e.g., Nintendo 3DS, Fuji W3 3D, and HTC Evo 3D), with much more being just announced. The abundant content, together with the dramatically decreasing price, have become driving forces to the vast growth of 3D-capable disc players and LCD/LED TVs, or even tablets and smartphones in the market, which are quickly sweeping away the conventional 2D only devices.

3D video (also referred as stereo video) is perceived by human beings with additional three-dimensional depth, Figure 1. Peer-to-peer overlays for different views. which is resulting from the spatial disparity of two slightly different streams for left and right eyes’ In this article, we present an initial attempt toward viewpoints respectively. Multiview video further efficient streaming of 3D/multiview videos over peer- extends 3D perception by capturing diverse viewpoints to-peer networks. The remainder of the paper is from a camera array, and enabling a user to organized as follows. We first outline the structure of interactively choose the viewpoints of interest [1]. peer-to-peer 3D/multiview streaming in Section 2. Then There have been a series of pioneer academic works on we will present the inherent partner selection strategy in streaming 3D video over the Internet [2] [3] [4]. Most both the time and space dimensions in Section 3. In of them, like those commercial products, are Section 4, we illustrate the implement issues with view client/server based however. This classical synthesis technology, and the simulation result is communication paradigm has already suffered from presented in Section 5. Finally, we conclude this paper streaming the traffic-intensive 2D videos, and the in Section 6. remarkably increased data volume of 3D videos poses even greater challenges, not to mention multi-view 2. Peer-to-peer 3D streaming: system overview videos. The inherent multi-stream nature of 3D video We advocate a mesh-based peer-to-peer overlay design, unfortunately makes the streaming delivery more which has been widely used in state-of-the-art difficult than that of 2D video [5]. This is because, to commercial systems [5] [6]. In a mesh overlay, each enable stereoscopic perception at the user side, not only newly joined client (peer) will be informed with a list of segments in one stream have to arrive before playback active peers interested in the current video, and it will deadlines, but also they have to be paired with randomly select a subset of them to establish corresponding segments with the same playback time in partnerships. It will then exchange bandwidth and data the other stream. Poor synchronization between streams availability information with these partners and fetch prolongs the playback delay, and would even cause video data through a scheduler, which specifies which spatial displacement of objects in the two views, partner to transmit which segments. The active peer list resulting in false parallax perception. The problem is will be periodically gossiped through the overlay, so severe in a peer-to-peer overlay, given the existence of that existing peers can update their partners to multiple senders and node churns. For multiview video, accommodate peer and network dynamics. since a user is not only interested in a subset of the views but can switch over views, the challenge Fig.1 illustrates the the peer-to-peer overlays with becomes even acuter with such view diversity and separate streams for different views, where the peers in dynamics. the same overlay share the data segments from the same viewpoint. Note that a peer from a adjacent overlay may also be able to access the data in a given http://www.comsoc.org/~mmc/ 10/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter

Figure 2. V-Plane Overlay: U has direct partners C in Figure 3. Dynamic view synthesis cooperation with its priority line, and indirect partners B,D,E. viewpoint changes. overlay, which facilitates potential interactive view distributed, the peer will find interested partners, and switching in the multiview scenario. Each peer establish partnerships for data exchange. We can see maintains a buffer for the received video segments, and that, although the peers have their individual COI, they a window slides over the buffer, covering the segments can still have overlapped content of interest. This is of interest for playback and transmission. The because the peers are usually interested in video content availability of the segments within the sliding window in a period of time, which locates near its COI (in the is represented by a peer’s buffer , which is time scale). Further, adjacent peers will have one same periodically updated and exchanged among partners. view shared (in the space scale).

3. Voronoi diagram in time and space dimentions In the V-Plane, the peers having overlapped interest are Differently from traditional monoscopic 2D video partners. Each peer maintains two partners lists, namely, streaming, two distinct data sequences for a 3D video direct partners (DP) and indirect partners (uDP). The need be distributed, which are correspondingly DP list consists of partners that are watching at the produced by the encoded left and right views. Data same viewpoint and are within the interest time period segments with the same playback time, upon receiving, (in peer’s priority line, Fig.2), while the uDP list has will be combined to produce a stereo frame pair, partners locating at farther range (in peer’s k-nearest enabling depth perception. To ensure that the segments neighborhood). Although the indirect partners may not of different views arrive simultaneously at a destination, exchange the current video content, they could be a simple approach would be mixing the segments into possibly promoted to be direct partners, especially one stream for transmission. It is however quite when the peer changes viewpoint or when DP has inflexible, and we instead suggest that the segments insufficient bandwidth to transmit the segment data of being delivered through two separate streams. Further, adjacent view stream. The priority of segment in the multi-view scenario, not only a user might be transmission can be formulated as the scheduling interested in different part of a video, but also might problem, which is solved by our proposed algorithms of prefer to watch the video from different views as well. the former work in [8]. This constrains resource sharing among peers. Besides the view diversity, the users may change viewpoints 4. Implementation issue with view synthesis during watching. Even though the V-Plane overlay reduces the view switching delay by quickly re-locating new partners to We address these challenges by leveraging clustering to avoid data outage with low overhead, it always takes efficiently locate partners with same/overlapped interest. time to establish the data connections with new partners. As in Fig.2, we use a Voronoi diagram [7] as a The smoothness of playback continuity suffers from virtual plane (called V-Plane) that extends in time and sudden view shift behaviors with wide range, or high space dimensions. The space dimension consists of frequency. To address this issue, we deploy the view discrete viewpoints, each representing a 3D perception synthesis technology to assist the smooth playback of two adjacent views. The time dimension then during the view shift applications. The view synthesis corresponds to playback time slots. Let a peer’s buffer refers to utilize the left and the right views with the status be represented by its value of center of interest depth and texture maps to generate the visual views, (COI), expressed as COI (time, viewpoint). The time is rather than the real views captured by the camera. set as the middle of the sliding window and the Therefore, even though the users do not receive the viewpoint represents which two views constitute the segment data of all the viewpoints, the view synthesis stereoscope perception. After the COI values are can still be proceeded to render the visual views with http://www.comsoc.org/~mmc/ 11/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter absent segments. In recent years, it has been extensively disable it and instead have smoother 2D playback with explored in the multiview streaming system [2] [9]. less bandwidth demand. We thus believe that it is Specifically, we propose an online view synthesis necessary to accommodate these clients in our 3D algorithm with low computation cost in [1]. Here, we will illustrate how it can be integrated in the peer-to- peer 3D/multiview streaming system.

In Fig. 3, there are five different viewpoints, in which the view1 and view5 are the reference views. These two views are shared commonly among the clients for view synthesis, in case the segments for playback are not received. Let the selected views be View ={1, 1, 2, 3, 3, 4, 4, 3} for the time series {1, ..., 8}. In time slots 1, 2, 5, and 7, the view segments are buffered for the playback of these viewpoint streams. Otherwise, the view synthesis is proceeded to generate the visual views. Specifically, in the time slots 3 and 4, the common Figure 4. Rendering quality with viewpoint changes. reference views view1 and view5 are utilized for visual view generation. In the time slots 6 and 8, the buffered streaming system, so as to enable a smooth transition view segments (without being viewed) are performed as toward full 3D streaming. Our solution serves as an the reference views for view synthesis. According to initial attempt towards supporting efficient separate the result of our proposed algorithm, the adjacent view streaming and switching. Given the high degree of reference view can provide a more accurate rendering view diversity and dynamics, more advanced multi- performance. view design in terms of overlay construction and partnership maintenance are to be developed. 5. Performance evaluation We evaluate the performance of our 3D/multiview References video streaming through simulations. The video source [1] F. Chen, J. Liu, Y. Zhao, and E. C.-H. Ngai, we used for simulation is based on the standard “Collaborative view synthesis for interactive multi-view multiview video sequence ”Ballet”. The resolution is video streaming ,” in Proc. of ACM NOSSDAV 2012. 1024*768 with a frame rate of 15 frames/s. For [2] G. Park, J. Lee, G. Lee, and K. Kim, “Efficient 3D comparison, we also implement Rarest First (RF) adaptive HTTP streaming scheme over internet TV,” in Proc. of IEEE BMSB 2012. scheduling and Random (Random) scheduling, which [3] Z. Zhou, L. Zhuo, J. Zhang, and X. Li, “A user-driven have been widely used in existing P2P streaming. interactive 3D video streaming transmission system with low network bandwidth requirements,” in Proc. of ICACT In Fig. 4 the viewpoint is changed at 30th second and 2012. 40 second, respectively. The PSNR is 40.2dB on [4] S. Savas, C. G. Gurler, A. M. Tekalp, E. Ekmekcioglu, S. average after compression. The view synthesis Worrall, and A. Kondoz, “Adaptive streaming of multi- performance deteriorates as the range increases, 25.9dB view video over P2P networks,” Signal Processing: Image for 1 visual view, 24.7dB for 3 visual views and 20.3dB Communication, 27(5): 522–531, 2012. for 5 visual views on average. In Fig. 5, we can observe [5] J. Liu, S. G. Rao, B. Li, and H. Zhang, “Opportunities and challenges of Peer-to-Peer Internet video broadcast,” an obviously rendering quality gain when the view Proceedings of the IEEE, 96(1): 11–24, 2008. synthesis strategy is deployed during the viewpoint [6] PPLive http://www.pplive.com/. changes. If the viewpoint keeps constant, the RF [7] D. Lee, “On k-nearest neighbor Voronoi diagrams in the strategy has a little bit better performance as it does not plane,” IEEE Transactions on Computers, 100(6): 478– need to transmit the reference views as well. 487, 1982. [8] Y. Ding and J. Liu, “Efficient stereo segment scheduling 6. Conclusion and future disscussion in Peer-to-Peer 3D/multi-view video streaming,” in Proc. As 3D video remains in its early stage, many of the of IEEE P2P 2011. existing client video playback platforms support [9] N.-M. Chueng, A. Ortega, and G. Cheung, “Rate- distortion based reconstruction optimization in distributed monoscopic 2D video only, even though a client may source coding for interactive multiview video streaming,” have sufficient bandwidth to receiving 3D streams. On in Proc. of IEEE ICIP 2010. the other hand, some 3D-capable client may want to

http://www.comsoc.org/~mmc/ 12/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter Considerations for distributed media processing in the cloud Ralf Globisch1, Varun Singh2, Juergen Sienel3, Peter Amon4, Mikko Uitto5, and Thomas Schierl6 1 Technische Universität Berlin (TUB), Berlin, Germany 2 Aalto University, Espoo, Finland 3 Alcatel-Lucent, Stuttgart, Germany 4 Siemens AG, Munich, Germany 5 VTT, Espoo, Finland 6 Fraunhofer Heinrich Hertz Institute, Berlin, Germany [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]

Introduction Consumed content will be customizable according to The cloud infrastructure provides additional network user preferences e.g. a user in a video conference is able and computational resources to devices which have to select which participants they wants to view, and even limited processing or bandwidth. Consequently, the control aspects of the video such as the position and size applications offload computationally-intensive tasks to of each participant. The recording, sharing and the cloud infrastructure. Currently, cloud-based systems customizability of video content, places new burdens on provide multimedia processing services such as such as the network infrastructure, but also opens the door to a transcoding [1], [2], speech recognition [3], and deliver new generation of multimedia services and applications. live media stream to millions of viewers [4], [5]. In this paper, we present a distributed media processing

platform in the cloud and discuss the approach taken as Video traffic is expected to account for a significant well as the challenges that exist. part of Internet traffic over the next few years. The demand for more capacity is only going to increase Case study: a distributed media processing pipeline with the forthcoming deployments of Web-based Real- We focus on the following use-case: a user wants to host Time Communication (WebRTC) [6] and telepresence. a multi-party video conference. A traditional multi-party In the future Internet it is foreseen that an increasing video conferencing uses a Multipoint Control Unit number of users will not only consume media content, (MCU) that is responsible for mixing the audio, but will also help produce it. For example, a live event multiplexing video, and sending the resulting stream to is broadcast online using live footage recorded by the participants of the conference. Leveraging the cloud multiple spectators. The trend towards increased media to provide the function of a Conference Bridge has production by ordinary users is already evident today in several advantages including that the user only pays for the content that users upload onto YouTube [7], and in the time that the bridge is needed, no hardware device is Google+'s [8] Auto Awesome feature and Hangouts. necessary, and the location of the conference bridge can This is driven by social media trends as well as by be selected based on locations of available cloud nodes mobile phone trends, as illustrated in Figure 2. The and the participants. sales of mobile devices have already overtaken In this paper, we consider the entire media processing computer and laptop sales, and these devices typically pipeline, from video capture at the end-user, to have built-in recording capabilities. While most of the content recorded today follows a Video-on-Demand encoding, transport over the network for processing by model, this may change once the technological multiple cloud services, and last mile delivery to the end-user. Further, in our experiments, the content was challenges have been solved. delivered in real-time to active participants, and with a

short delay (2-10 seconds) also available to a larger number of passive participants, i.e. participants that are just consuming the media and not interacting with the other participants. Our system uses three cloud services to fulfil the features of an MCU in the multi-party video conferencing use-case. In the future such services could be provided by one provider, or by different providers and the main reason is that open standards provide an easy interface for integration. Furthermore, the Figure 2 Global communication over IP emergence of WebRTC is resulting in further standardization of discrete components of the media processing pipeline [6], [9]. http://www.comsoc.org/~mmc/ 13/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter Our proposed distributed media processing pipeline uses NIST’s cloud definition [19] it provides a Platform-as-a- open standards for the purpose of interoperability and Service (PaaS) approach on top of a cloud infrastructure. communication between the various components. We Its main objective is to provide scalability beyond a use H.264/AVC [10] as the video codec, and use the single physical resource e.g. for multimedia and other following Internet Engineering Task Force (IETF) data intensive real-time applications. Therefore the protocols; the Real-time Transport Protocol (RTP) [11] platform offers capabilities for automated placement of for media transport, the H.264 RTP payload format [12] processing elements (realized as self-contained service for encapsulating media, the Real-time Streaming components) and taking care of their scaling and load- Protocol (RTSP) [13] for signaling between the various balancing behavior. The framework relieves the components. It should be noted that the Session application developer to take care of these issues. The Initiation Protocol (SIP) [14] is typically used for MediaCloud framework mixes multiple video streams conference call setup. RTSP was used instead to by plugging in several service components together simplify the integration of the various cloud services. In (RTP-handlers, video stitching, codecs and scaler). The particular, the endpoints use Multipath RTP (MPRTP) demonstrator shows gains of using the MediaCloud protocol [15], [16] to stream media to the conference approach by optimizing resource utilization, automated bridge, mainly to aggregate capacity across available handling of scaling and placement in the infrastructure. interfaces and provide robustness to failure (fail-over), which is useful in distributed cloud scenarios where it provides load balance without requiring signaling. The cloud nodes take care of playout dejittering and delivery. The use of IETF standards simplified integration between different cloud media processing components and also allows a user with a standard- conformant media player to consume the processed Figure 3 Cloud-based real-time stitching media media stream. pipeline The video-mixing function is carried out by two separate The media streams are coordinated using RTSP for real-time transcoding and video-mixing services. Of session setup and media is delivered via RTP. In the these, one service uses the as a Service (SaaS) integrated demonstrator the input streams are received approach, while the other is a Platform as a Service from the RTSP server, and the mixed stream is sent back (PaaS). Lastly, the conference bridge provides an RTP to the same server. From there the mixed content can be to Dynamic Adaptive Streaming over HTTP (DASH) distributed to end-devices. [17] protocol translation service to deliver media to Service 3: Translation/Distribution Service passive participants. The RTP to DASH (Dynamic Adaptive Streaming over Service 1: Real-time Transcoding/Video-Mixing HTTP [17]) conversion service retrieves the stitched The video stitching service follows the SaaS model. video streams from the other cloud media processing Such a service is typically deployed into the cloud using services using RTSP/RTP. Incoming RTP media packets virtual machines. The service is setup using RTSP and are converted to DASH media segments. The RTP to receives the live media from the user (e.g., client devices DASH conversion machine also acts as an HTTP web of video conferencing users or IP cameras) over RTP. server that hosts the DASH files and hence provides The RTP streams are decompressed by an H.264/AVC wider client coverage to the service. In general, the RTP decoder, scaled to the appropriate resolutions if to DASH conversion module outputs the actual video necessary, and then multiple streams are stitched segments and a playlist file representing the order of the together in the uncompressed domain in a configurable video segments. and flexible layout. The newly-created compound For the conversion process, we identify live and non- pictures are re-encoded in the H.264/AVC format, live playlist file types. In the live case, a client can start packetized in the RTP format and then made available to the playout from the current point of the conversion the other participants via an RTSP server. The media process with a 2-10 second processing/TCP delay pipeline is shown in Figure 3.The service provider needs compared to RTP. For the non-live option, the client can to make sure that this stream is available to a wider start the playout only from the beginning of the playlist, audience by using a content delivery network. in other words, from the point when the RTP to DASH Service 2: Distributed Media processing Platform service was launched. Naturally, the non-live playlist MediaCloud [18] realizes a distributed system that requires extensive disk space for long recordings, exposes multiple physical computing resources as a because all the converted segments need to be saved. unified cloud execution environment. In terms of http://www.comsoc.org/~mmc/ 14/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter On the client side, VLC and OSMO4 were used as Media service discovery is another challenging topic. DASH clients. The RTP to DASH module is currently How will the user find out what services are available? validated with a Linux OS PC, and the DASH client Conclusion with an Android OS Tablet client. In the service setup In this article we have discussed the challenges in the RTSP/RTP packet losses before the DASH building distributed media processing platforms in the conversion module can decrease the end video quality. cloud. The demonstrator showed how open Standards Furthermore, e.g. severe network congestion between such as the IETF Internet standards can be used to build the DASH server and DASH client(s) can result in a distributed media processing pipeline consisting of playback pauses, which is a result of using TCP. multiple cloud media processing services. We have

however only taken the first steps to investigate how Demonstrator standardized interfaces enable multi-cloud multimedia The entire distributed media pipeline was assembled as a demonstrator. Figure 4 shows the end-to-end jitter services and have outlined the challenges faced in variation of a participant streaming from Aalto delivering the next-generation multimedia services. University, Helsinki to TU Berlin. REFERENCES Figure 5 shows a screenshot of the media players [1] Zencoder. [Online]. http://zencoder.com playing the final streams. The demonstrator showed that [2] . [Online]. it is feasible to use cloud media processing platforms for http://www.sorensonmedia.com/video-cloud- real-time applications (within tight real-time solutions guarantees), and that the cloud media services can offer added value enabling a new range of media services in [3] Siri [online]. http://www.apple.com/ios/siri/ the future Internet. Further, the MediaCloud approach [4] R. Sweha, V. Ishakian, and A. Bestavros, demonstrated that scalability can be realized from within "AngelCast: cloud-based peer-assisted live the service. Conversely, media services that use the streaming using optimized multi-tree construction," SaaS approach also have to consider scaling and in ACM Proceedings of the 3rd Multimedia resource allocation, as well as node placement, which Systems Conference, 2012. are challenging tasks in themselves, and even more [5] A. H Payberah, H. Kavalionak, V. Kumaresan, A. challenging for real-time services with continuous media Montresor, and S. Haridi, "CLive: Cloud-Assisted flows and tight end-to-end latency constraints. Further, P2P Live Streaming," in IEEE Proc. of the 12th how can cloud resources be optimized to support real- IEEE P2P Conference on Peer-to-Peer Computing time multimedia services in the cloud, especially in the (P2P’12), Tarragona, Spain, 2012. case where multiple service providers exist in the cloud? [6] C. Jennings, T. Hardie, and M. Westerlund, “Real- Average delay 0.25 time communications for the web,” IEEE Comm.

] 0.2

s [

Magazine, vol. 51, no. 4, April 2013.

y

a l

e 0.15

d

d

e [7] YouTube [online] http://www.youtube.com/ v

r 0.1

e

s b O 0.05 [8] Google+ [online] https://plus.google.com/

0 13:21:30 13:22:00 13:22:30 13:23:00 13:23:30 13:24:00 13:24:30 13:25:00 13:25:30 13:26:00 13:26:30 [9] C. Perkins, M. Westerlund and J. Ott, “Web Real- Figure 4 Variation in packetTime [s] jitter (milliseconds) Time Communication (WebRTC): Media Transport and Use of RTP”, October, 2013 http://tools.ietf.org/html/draft-ietf-rtcweb-rtp- usage-10 [10] ITU-T, " for generic audiovisual services," ITU-T, Standard H.264, 2012. [11] H. Schulzrinne, S. Casner, R. Frederick and V. Jacobson, "RTP: A transport protocol for real-time applications," IETF, Standard, RFC3550, 2003. [12] Y.-K. Wang, R. Even, T. Kristensen and R. Jesup, “RTP Payload Format for H.264 Video”, IETF, Standard, RFC6184, 2011. [13] H. Schulzrinne, A. Rao and R. Lanphier, “Real Time Streaming Protocol (RTSP)”, IETF, Figure 5 Project demonstrator Standard, RFC2326, 1998.

http://www.comsoc.org/~mmc/ 15/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter

[14] J. Rosenberg, H. Schulzrinne, G. Camarillo, A. being part of Bell Labs. His current interest within the Johnston, J. Peterson, R. Sparks, M. Handley and IP Platforms research program is focused on cloud E. Schooler, “SIP: Session Initiation Protocol”, computing platforms for carrier grade multimedia IETF, Standard, RFC3261, 2002. services. [15] V. Singh, T. Karkkainen, J. Ott, S. Ahsan and L. Peter Amon received his diploma (Dipl.-Ing.) degree in Eggert, "Multipath RTP (MPRTP)", draft-singh- electrical engineering from Friedrich-Alexander- avtcore-mprtp-06 (work in progress), January Universität Erlangen-Nürnberg, Germany, in 2001. In 2013. 2001 he joined Siemens Corporate Technology, [16] V. Singh, S. Ahsan and J.Ott, “MPRTP: multipath Munich, where he is currently working as a Senior considerations for real-time media”. In MMSys’13: Research Scientist in the Imaging and Computer Vision Proceedings of the 4th ACM Multimedia Systems department. His research field include video coding and Conference, New York, USA, 2013. transmission, image/video processing and analytics, and future internet technologies. In these areas, he [17] T. Stockhammer, “Dynamic Adaptive Streaming over HTTP – Design Principles and Standards,” in published several conference and journal papers and also actively contributed to the standardization at MMSys ’11: Proceedings of the second annual ISO/IEC MPEG and ITU-T VCEG. Currently, he ACM conference on Multimedia systems, New contributes to the standardization of High Efficiency York, USA, 2011. Video Coding (HEVC). He has been and still is actively [18] P. Domschitz and M. Bauer, “MediaCloud - a involved in several European research projects. framework for real-time media processing in the Mikko Uitto received his M.Sc. (Tech.) degree in network”. In Proceedings of EuroView electrical engineering from the University of Oulu, 2012,Wuerzburg, Germany, July 2012. Finland, in 2009. He is currently working as a research [19] P. Mell and T. Grance, The NIST Definition of scientist in VTT Technical Research Centre of Finland Cloud Computing. (800-145), National Institute of in the Network Resource Management and Control Standards and Technology (NIST), Gaithersburg, research team and aims for his PhD. The main focus in MD (2011). his research work is adaptive video streaming in mobile networks. Ralf Globisch is a Ph.D. candidate and guest researcher Thomas Schierl received the Dr.-Ing. degree in at TU Berlin. He is currently working as a Electrical Engineering (passed with distinction) from developer/researcher in the Real-time Video Coding Berlin University of Technology (TUB) in October group of the CSIR Meraka Institute. Since 2009, he has 2010. He is head of the research group Multimedia been a guest researcher in the Multimedia Communications in the Image Processing Department Communications group of the Fraunhofer Heinrich at Fraunhofer Heinrich Hertz Institute (HHI), Berlin. Hertz Institute (HHI). Thomas is the co-editor of various IETF RFCs and Varun Singh is a final year doctoral student at Aalto various MPEG standards. In 2007, Thomas visited the University, Finland. His research centres around Image, Video, and Multimedia Systems group of Prof. congestion control and resource management of Bernd Girod at Stanford University, CA, USA for multimedia applications. He is the co-author of several different research activities..Thomas' research interests Internet drafts related, including Multipath RTP include system integration of video codecs, delivery of (MPRTP) and RTP circuit breakers. real-time media over mobile IP networks such as Jürgen Sienel received his diploma degree in Computer mobile media content delivery over HTTP, and real- Science from the University of Stuttgart in 1992 and is time multimedia processing in cloud infrastructures. working since then in Alcatel's research branch now

http://www.comsoc.org/~mmc/ 16/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter Internet Video Multicast via Constrained Space Information Flow Yaochen Hu1, Di Niu1 and Zongpeng Li2 1. University of Alberta, Canada 2. University of Calgary, Canada {yaochen,dniu}@ualberta.ca, [email protected]

1. Introduction to a higher cost. Neither does [1] consider such cost, Many multimedia streaming and video webcasting nor is it able to solve the problem under a relay applications can be modeled as a multicast session, number constraint. Second, the solution in [1] where a single source node multicasts the same approximates the geometric problem with a graph multimedia content to all the participating terminals, version of the problem by dividing the space with possibly with the help of relay servers. Since video grids. This introduces a large number of intermediate content consumes a lot of network resources, it is variables, as such grids need to be fine-grained to desirable to minimize the congestion that the video increase accuracy. Although the overall mean session imposes on the Internet. The degree of complexity of the grid-based approach appears to be congestion imposed on each link can be modeled by polynomial (depending on the optimization accuracy), the bandwidth delay product, i.e., the network volume it still has a large complexity especially when the on it. A min-cost video multicast network should be dimension of the space is high. constructed to minimize the sum of bandwidth delay products over all the links in the multicast network. In this paper, we propose the Constrained Space Information Flow problem, which aims to find the The min-cost multicast problem is traditionally solved min-cost multicast network under a constraint on the on a given graph formed by the terminals, the source number of relay servers, allowing operators to adjust and all the available candidate relay servers in the the cost of using relay servers through such a Internet to find the optimal topology and flow constraint. We propose an effective EM algorithm to assignments. Such an approach appears to be efficient solve the problem directly in the geometric space, when the number of candidate relay servers is which can converge to the local optimal solutions with relatively small. However, recent years have witnessed high efficiency. a rapid growth of the utilizable server pool including CDN nodes and small to medium datacenters, making 2. Problem Formulation and Algorithms traditional approaches incapable of handling the large In this section, we formulate the constrained space scale of the graph at hand. information flow problem, show some important properties of the optimal solutions to it, based on An alternative approach is to map the nodes onto a which we present our EM algorithm. delay space using a network coordinate system, in which the distance between two nodes estimates their Constrained space information flow problem. pairwise delay [2]. As such, the min-cost multicast Although our idea can be extend to a general space, in network can first be constructed via geometric this letter we focus our work on the min-cost multicast optimization in a delay space, where we can insert problem in a Euclidean space. Given N terminal nodes relay servers at arbitrary positions. Such optimal relay T1, T2,..., TN with coordinates in a space (e.g., in a positions found in the delay space can then be mapped delay space where the distance between two nodes back to the closest real Internet servers. As long as the models their pairwise delay on the Internet) and a servers are densely distributed, this geometric multicast session from one source node S to the N optimization approach can yield a good approximation terminals as sinks, the objective is to construct a min- to the original min-cost multicast problem on a graph. cost network in the space, allowing the introduction of at most M extra relay nodes, and allowing any form of The remaining geometric problem in the delay space is coding including network coding to be performed. We similar to the Space Information Flow (SIF) problem define the total cost of the network as [1], which aims to minimize the sum of bandwidth- å w(e) f (e) , distance products in a (geographic) space, allowing e where f(e) is the information flow rate on link e, and network coding and free insertion of relay nodes. The w(e) is the weight of the link. In a delay space, we set work in [1] presents a heuristic solution to the space w(e) as the link length ||e||, i.e., the delay on link e. information flow problem. However, it has two main drawbacks in practice. First, in real applications, The network cost is determined by two types of introducing more relay server nodes will clearly lead http://www.comsoc.org/~mmc/ 17/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter variables, one being the positions of the relay nodes, the problem once we fix one set of variables. and the other being the flow assignments on the links. We call these two factors positions and flow When the positions of the relay server nodes are fixed, assignments. Note that the flow assignments will also the proposed optimization problem (1)-(7) is reduced determine the connection topology of all nodes, since to a simple linear program (LP). The number of a link with a zero rate indicates that the link does not variables is N+1 times the number of links, i.e., exist. Our problem is to tune these two sets of O(N(M+N)2). The number of linear constraints is also variables with no more than M relay servers to achieve O(N(M+N)2). Therefore, we can solve it efficiently the minimum cost. Denote VR as the set of M candidate with common LP solvers. relay nodes to be found and V as the set of all nodes. Our optimization problem can be stated as When the flow assignment of the network is fixed, the Minimize ||x x || f ( uv ) (1) cost function in (1) is the sum of norms and all the u, v V uv constraints in (2) to (6) are irrelevant to the position Subject to : variables. The optimization problem now reduces to a  f( vu ) f ( uv ),  i ,  u  V , (2) convex optimization problem. There are many v Vii v V  efficient algorithms to solve such kind of problems.  f( T S ) r , i , (3)  ii More specifically, for the sum of norms in this  f( uv ) f ( uv ),  i ,  u , v  V , (4) problem, an Equilibrium method has been proposed in  i  f( uv ) c ( uv ),  u , v  V , (5) [1], which can efficiently converge to the optimum.  f( uv ) 0, f ( uv )  0,  i ,  u , v  V , (6)  i Based on these observations, we propose our EM  |VMR | . (7) heuristic algorithm. In the EM algorithm, the above two local optimizations for relay positions and flow In (1), xu is the position of node u. The positions of the assignments are alternately performed. source node and the terminal nodes are fixed input vectors, while the positions of the relay server nodes An EM Algorithm. are variables to be optimized. For every network Our proposed EM algorithm is shown in Algorithm 1. information flow S→Ti, there is a conceptional flow Initially, Step 1 randomly assigns the positions of the fi(uv). We call it conceptional because different relay server nodes in the smallest box region conceptional flows share the bandwidth on the same containing all the terminals and the source. The link. As stated in (4), the final flow rate f(uv) of a link following steps are iterative operations. In each uv should be no less than the maximum conceptional iteration, there are three major steps. We first solve the rate, which will directly affect the total cost. The LP in (1)-(7) with the relay positions fixed to obtain constraint (2) guarantees the conceptional flow the flow rate assignments. Then with these flow rates equilibrium property for every node and every fixed, we solve a convex optimization problem for the conceptional flow i. The assigned “feedback” flow in relay server node positions. Finally, for each relay (3) characterizes the desired receiving rate at each node that has no throughput on it, we randomly terminal. (5) is the trivial link capacity constraint. For reassign a new position to it and repeat the iterations. every pair of nodes, we have both fi(uv) and fi(vu) to The ε in Step 5 is a small positive threshold to exclude indicate the flows in two directions, so that (6) gives the fake non-zeros, since in our LP solver there is another trivial bound. Finally, (7) indicates the always a small non-zero value on an actually zero- constraint on the maximum number of relay server valued variable. nodes. As for the termination condition in Step 9, we We need to solve this problem over both the variables introduce a counter to help us monitor the termination xu (relay positions) for u in VR and all the conceptional condition. In each iteration, we first calculate the ratio flow assignments on all the links (flow assignment). between the cost in the previous iteration and the cost Note that any feasible flow assignment satisfying (2)- in the current iteration, and increase the counter if the (7) can be achieved with linear network coding in a ratio is less than some threshold. Once the counter single multicast session [3]. reaches some number, the whole algorithm terminates. Finally, we delete the relay nodes that have no Properties of the Solutions. throughput and output solution.. It is hard to simultaneously obtain the optimal values for both relay positions and flow assignments, since it Algorithm 1 EM Heuristic Algorithm is not hard to verify that the problem (1)-(7) is non- Input: N terminals, the source node, the constraint M on the convex. However, we have some good properties for number of relay servers http://www.comsoc.org/~mmc/ 18/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter Output: a solution to the Constrained SIF problem Relay problem using an EM algorithm, that can take into positions, flow assignments account the constraint on the number of relay servers 1: Randomly generate M relays in the smallest box and can scale to a large pool of candidate servers. containing the source and all the terminals; Preliminary simulations show that our algorithm can 2: (Flow Assignment) Fix the relay positions, and solve the yield fast convergence to the local optimal solutions to LP in (1)-(7) to get the flow rates; 3: (Relay Position Tuning) Fix the flow rates assigned in the non-convex combinatorial optimization problem. Step 2, and solve the convex optimization problem for the relay positions, e.g., using the Equilibrium method in [1]. (Random Seed Generation) 4: for i = 1 to M, do 5: if the total flow on Relay i < ε then 6: Randomly generate a new relay i in the smallest box containing the source and all the terminals; 7: end if 8: end for 9: if the termination condition does not hold, go to Step 2; 10: Delete the relay server nodes that have no flow on them.

3. Simulation and Performance Fig. 1 The simulation result for randomly chosen 10 We have simulated the EM algorithm in a 2-D terminals and 1 source. The delay node number Euclidean space. We choose 11 nodes uniformly at constraint is 4. random in the unit box. 10 of them are set as the terminals, while the remaining one is set as the source. We set 4 as the constraint on the number of relays. References Since the multicast cost will be proportional to the  Jiaqing Huang, Xunrui Yin, Xiaoxi Zhang, Xu Du and receiving rate, we assume the receiving rate is 1 at Zongpeng Li, “On Space Information Flow: Single each terminal and assign a large capacity 10 to each Multicast,” in the Proc. of Network Coding (NetCod), link. For the termination condition, we will increment 2013 International Symposium on 7-9 June 2013. the counter if the former cost is no greater than 1.05 times the current cost. We stop the algorithm when the  F. Dabek, R. Cox, F. Kaashoek, and R. Morris, “Vivaldi: counter reaches 1000. Fig. 1 shows the resulted relay A decentralized network coordinate system,” in Proc. of server positions and topology. The final total cost in ACM SIGCOMM, 2004. terms of the sum of bandwidth-delay products is 6.72.  R. Ahlswede, N. Cai, S.T.R Li, and R. W. Yeung, 4. Conclusions Network Information Flow, IEEE Trans. On Information In this letter, we present a solution for the min-cost Theory, 46(4):1204-1216, 2000. video multicast problem by considering the problem in a delay space based on network coordinates. We solve  S. Boyd and L. Vandenberghe, Convex Optimization (7e), Cambridge University Press, 2007. the resulted Constrained Space Information Flow

http://www.comsoc.org/~mmc/ 19/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter Compressive Sensing for Video Coding: A Brief Overview Quan Zhou and Liang Zhou College of Telecom & Inf Eng, Nanjing University of Posts & Telecom, Nanjing, P.R. China {quan.zhou, liang.zhou}@njupt.edu.cn

1. Introduction spatial-temporal domain of video data, the CS theory Conventional approaches for video coding follow the seems to be potentially applied using only few samples major standard, such as MPEG [1] and H.26X [2], with good fidelity. Our intent of this letter is to briefly which are well advanced and widely employed in last overview the basic CS theory in current literatures, few decades. These developed standards underlie present the underlying mathematical insights of this nearly all the video compressive protocols used in theory, survey a couple of important results of applying consumer devices and visual electronics, such as VCDs CS theory for video coding, and specify some possible and DVDs. Since the video needs to be compressed research directions in the future. only once in the encoder, while the decompression in decoder is required to be performed many times, it is 2. Basic Theory of CS expected that the decoding should be implemented as The famous Shannon’s sampling theorem, the so-called simple and quickly as possible. To this end, the Nyquist sampling rate, asserts: If one needs to recover existing video coding paradigms, such as MPEG as a certain signal without error, the sampling rate must well as H.26X, essentially design a complex encoder be at least twice the maximum frequency present in the and a simple decoder. In order to investigate the signal [6]. On the contrary, the CS theory tells us: If a spatial-temporal redundancies for video data, the certain signal is projected to a feature space spanned by encoder has to be restricted with specially designed some predefined orthogonal basis, for example, hardware, leading to the increasing cost of the cameras. basis and Fourier basis, then a large fraction of projected coefficients can be “thrown away”, while In the past ten years, with the emergence of smart- the perceptual loss is hardly noticeable from the phones and continued growth of laptops, netbooks, reconstructive signal. To make this possible, CS theory tablets, and sensors, there is a huge increase in a depends on two major principles: sparsity, which number of mobile devices able to support new video involves the signals of interest, and incoherence, which signal applications. For example, the video surveillance involves the sensing modality. and sports broadcasting, these mobile devices are in  fact video cameras. However, delivering video data and Sparsity. For the continuous time signal, sparsity conveying visual information from mobile devices over expresses the idea that the significant signal content, cellular or mobile broadband networks confronts many so-called “information rate”, may be much smaller than challenges, e.g., limited channel bandwidth; high what is suggested by signal bandwidth. On the other required quality and reliability; and most importantly, hand, for the discrete time signal, sparsity implies that the insecure, time varying, and unplanned operating the number of freedom degrees for the original signal, environments. This casts an emergent problem that is comparably much smaller than its finite signal length. needs to be well addressed for the video coding scheme: More precisely, the CS theory investigates the common due to the limited computational and energy resources, fact that due to the sparse structure, most nature signals the traditional video coding standard involved complex are redundant and thus compressible. In other words, encoders may not be suitable for recent environment of the nature signals always have concise representations video delivery and transmission. with proper selected basis.

Recent years have witnessed substantial research and Mathematically, suppose we are given a discrete signal developments of a new theory called Compressive f ∈ Rn sampled from a certain continuous signal, which Sampling, also known as Compressive Sensing (CS) [3, can be linearly represented by a series of orthonormal 4, 5], which may show great promise to solve above problem. The most striking characteristic of CS theory basis Ψ [1,2 ,...,n ] as follows: is that one can recover certain signals only using very n fewer samples than traditional methods use, such as f (t)   xii (t) (1) Shannon’s sampling theory [6]. A lower sampling rate i1 implies less computation and energy requirement for where x [x , x ,..., x ] is the coefficient vector of f data processing, resulting in lower power requirements 1 2 n for the special designed hardware. For video coding, and defined as xi  f ,i  . Then the sparsity implies due to the substantial amounts of redundancy in the that when a signal has the form of sparse expansion, http://www.comsoc.org/~mmc/ 20/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter most coefficients with small magnitude is able to be CS theory is mobile broadcasting [15], where the video discarded. Formally, one can only keep top S terms data is divided into self-contained video cubes with with largest xi in Eqn. (1) and set the rests as zero, thus some proper selected random matrices as sensing basis. defines the sparse vector xs with at most S nonzero The proposed method is scalable with channel capacity, entries. If S << n, we will call f as S-sparse, and xs is particularly in wireless broadcast situations. sparse in a strict sense.  ■ Rate-energy-distortion in video streaming. Unlike Of course, the above principle underlies the essence of the previous coding standard, such as MPEG [1], the current , such as JPEG-2000 [7] as concept of rate-energy-distortion [16] is proposed to well as many other coding standards: one could encode trade-off coding efficiency, energy resource and the indexes i and the corresponding magnitude of the S recovery quality. In this application, an empirical significant coefficients. The main advantage of such model is designed when limited energy is available for process lies in that it requires no prior knowledge of all video compression and transmission. the coefficients in x, as well as the indexes that imply the significant pieces of information in advance. 4. Possible Applications in Video Coding  The applications of CS theory suggest that a large Incoherence. Suppose we have a pair of orthogonal amount of mathematical and computational methods basis (Ψ, Φ) of Rn, where the first basis Ψ is used for could have an enormous impact in areas where the sampling or sensing and the second basis Φ is used to conventional multimedia information processing has represent signal f. According to [8], then the coherence significant limitations, especially in the field of video between these two bases can be defined as: acquisition/coding. This section presents some possible research directions that could significantly expand the (Ψ,Φ)  n  max |k , j | (2) 1k, jn ability of traditional coding scheme using CS theory. The implication of incoherence is now clear: It  Video compression/coding. In some cases, the sparse measures the largest correlation between any two basis Ψ may be unknown in the initialization of video elements of sensing basis Ψ and representation basis Φ compression. Furthermore, the Ψ may have no sparse [9]. If there exists correlated elements in Ψ and Φ, the structure, leading to the impractical implement at the coherence thus tend to be large, otherwise, it tend to be decoder. To address this problem, the random design of small. Regarding to how large and how small, it basis Φ, also known as “Random Sensing” [17], can be follows the linear algebra that: seen as a universal coding paradigm, as it can be (Ψ,Φ)[1, n] (3) designed without considering the structure of basis Ψ. For CS theory, the pair of two bases is always designed In multi-signal setting such as sensor networks, this to achieve low coherence. universality may be particularly suitable for distributed source coding [12]. In summary, sparsity and incoherence together quantify  Channel coding. In order to protect from random the compressibility of a certain signal. The higher interference during video transmission, the important sparsity and incoherence a nature signal has, the more principles of CS theory, such as sparsity, randomness compressible this signal is. and convex optimization, can be applied to design fast

correcting codes to rectify the transmission errors, as 3. CS based Video Coding explained in [18]. Research on the application of CS theory for video  system has been started from last decade. We start by Video data acquisition and sensing. The traditional briefly reviewing the open literature in recent reports. video coding schemes depend on the fact that the video data have been acquired using physical cameras. In ■ Distributed video compression/coding. The pioneer some important situations, however, it may be very work using CS in distribution video coding (DVC) is hard to obtain discrete-time video sequence from an proposed in [11]. Unlike conventional approach that analog signals, resulting in the difficulty in the exploits source statistics in encoder, DVC methods subsequently compression. According to the CS theory, investigate video redundancy at the decoder. Two or instead to perform analog-digital conversion, the most more independent encoders are cooperative to encode feasible approach is to design the sampling hardware statistically dependent source. The statistical dependent that directly record discrete and low-rate measurements bit-streams from each encoder are sent to a common of the incident analog signals. We refer the reader to [8] decoder for joint decompression. Some other DVC by Candes and Wakin elsewhere in this issue for related literatures can be referred to [12, 13, 14]. discussions. 5. Conclusion ■ Mobile broadcasting. Another implementation of http://www.comsoc.org/~mmc/ 21/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter

In summary, the characteristic of sparse structure for [13] L. Zhou, X. B. Wang, T. Wei, G.M. Muntean, and B. the multimedia data makes CS a very useful and Geller, “Distributed Scheduling Scheme for Video powerful mathematical tool for a broad scope of streaming over Multi-Channel Multi-Radio Multi-Hop natural science and engineering applications. CS theory, Wireless Networks”, in IEEE Journal on Selected Areas in Communications, vol. 28, no. 3, pp. 409-419, 2010. however, is still a fresh field and it is even more recent [14] T. T. Do, C. Yi, D. T. Nguyen, N. Nguyen, G. Lu, and to apply it into video systems. As a result, there are T. D. Tran, “Distributed Compressed Video Sensing,” in many avenues for the future research and the thorough Information Sciences and Systems, pp. 1-2, 2009. quantitative/qualitative analyses are still lacking. For [15] L. Chengbo, H. Jiang, P. Wilford, Y. Zhang and M. example, the relationship between the underlying Scheutzow, “A new compressive video sensing features and the sparse characteristics of videos, as well framework for mobile broadcast”, in IEEE Transactions as the involved visual-based quantitative assessment on Broadcast, vol. 59, no. 1, pp. 197-205, 2013. principles are still far from well understood. Even so, [16] S. Pudlewski and T. Melodia, “Compressive Video CS methodology will undoubtedly play a more Streaming: Design and Rate-Energy-Distortion Analysis”, in IEEE Transactions on Multimedia, vol. 15, important role in the society of advanced multimedia no. 8, pp. 2072-2086, 2013. signal processing. All the related CS-based theories, [17] E. Candes and J. Romberg, “Sparsity and incoherence in techniques, and practical applications are encouraged compressive sampling,” in Inverse Problems, vol. 23, to be in-depth investigated in the near future. no. 3, pp. 969–985, 2007. [18] E. Candès and T. Tao, “Decoding by linear References programming,” in IEEE Transactions on Information [1] P. Symes, “Digital Video Compression,” McGraw-Hill, Theory, vol. 51, no. 12, pp. 4203-4215, 2005. 2004.

[2] T.Wiegand, G. Sullivan, G. Bjontegaard, and A. Luthra, “Overview of the H.264/AVC video coding standard,” Quan Zhou received the B.S. in IEEE Transactions on Circuits and System for Video degree in Elect and Inf Eng from Technology, vol. 13, no. 7, pp. 560-576, 2003. China University of Geosciences, [3] E. Candès, J. Romberg, and T. Tao, “Robust uncertainty Wuhan, China, in 2002, and the principles: Exact signal reconstruction from highly M.S. and Ph.D. degree in Elect. incomplete frequency information,” in IEEE and Inf. Eng from Huazhong Transactions on Information Theory, vol. 52, no. 2, pp. University of Sci. and Tech., 48-509, 2006. Wuhan, China in 2006 and 2013, [4] E. Candès and T. Tao, “Near optimal signal recovery from random projections: Universal encoding respectively. He is now the strategies?” in IEEE Transactions on Information assistant professor in college of Theory, vol. 52, no. 12, pp. 5406-5425, 2006. Telecom and Inf. Eng, Nanjing University of Posts and [5] D. Donoho, “Compressed sensing,” in IEEE Telecom, Nanjing China. His major research interests Transactions on Information Theory, vol. 52, no. 4, pp. include image processing, computer vision, pattern 1289-1306, 2006. recognition, and multimedia communication. [6] C. Shannon, “Classic paper: Communication in the presence of noise,” in Proceedings of the IEEE, vol. 86, Liang Zhou received his B.S. no. 2, pp. 447–457, 1998. degree and M.S. degree (with [7] D.S. Taubman and M.W. Marcellin, “JPEG 2000: Fundamentals, Standards and Practice,” honors) both major at Elect Eng Norwell, MA: Kluwer, 2001. from Nanjing University of Posts [8] E. Candes and M. Wakin, “An introduction to and Telecom, Nanjing, China in compressive sampling,” in IEEE Signal Processing 2003 and 2006, respectively. In Magazine, pp. 21-30, 2008. March 2009, he received his Ph.D. [9] M. Y. Baig, E. M. K. Lai and A. Punchihewa, degree major at Elect Eng both “Compressive Video Coding: A Review of the State-Of- from Ecole Normale Superieure The-Art,” Video Compression, 2013 (E.N.S.), Cachan, France and Shanghai Jiao Tong [10] R. Coifman, F. Geshwind, and Y. Meyer, “Noiselets,” in University, Shanghai, China. From Mar. 2009. His Applied and Computational Harmonic Analysis, vol. 10, no. 1, pp. 27–44, 2001. research interests are in the area of multimedia [11] J. Slepian and J. Wolf, “Noiseless coding of correlated communications and services, in particular, resource information sources,” in IEEE Transactions on allocation, scheduling, and cross-layer design. He is the Information Theory, vol. 19, no. 4, pp. 471–480, 1973. editor board of IEEE Transactions on Circuits and [12] D. Baron, M.B. Wakin, M.F. Duarte, S. Sarvotham, and System for Video Technology, IEEE Communications R.G. Baraniuk, “Distributed compressed sensing,” 2005, Surveys & Tutorial, and Guest Editor for other journals. Preprint.

http://www.comsoc.org/~mmc/ 22/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter Offloading Policy of Video Transcoding for Green Mobile Cloud Weiwen Zhang and Yonggang Wen School of Computer Engineering, Nanyang Technological University {wzhang9, ygwen}@ntu.edu.sg

1. Introduction In Figure 1, the system of green mobile cloud consists Video transcoding [1] is an enabling technology to of mobile devices, a dispatcher and a set of service support the growing demand of video consumption on engines. The dispatcher receives offloading requests mobile devices. According to Cisco VNI report [2], from mobile devices and dispatches these requests to mobile video consumption will increase 16-fold service engines for computation. The service engine between 2012 and 2017. This trend poses challenges on can be a physical server or a virtual machine. If the the design of mobile application platform, because video requested by users is cached in service engines or mobile devices can support limited video formats and stored at the back-end storage, the video can be resolutions. Transcoding can allow videos to be played rendered immediately to users without transcoding; on diverse mobile devices, by adapting videos into a otherwise, video transcoding is performed either on the particular format (e.g., mp4) along with a resolution mobile device or one of the service engines in the reduction. This transcoding process, however, is cloud. Particularly, if the service engine has an computation-intensive, which can drain the battery identical image of the mobile device, and transcoding lifetime of mobile devices. Therefore, a new computing can be conducted on that service engine without the paradigm is increasingly demanded for mobile devices mobile user sending the input file. to consume video content. In this architecture, the dispatcher makes the offloading Cloud computing [3] offers a natural way to decision for both the mobile device and the cloud. The accomplish transcoding tasks. Mobile devices can dispatcher can have the information of mobile devices upload videos to the cloud for transcoding and thus, (i.e., profile of transcoding tasks) and service engines computation is shifted from the mobile device to the in the cloud (i.e., queue length). Based on this cloud, which is referred to computation offloading [4]. information, we can design a policy for the dispatcher. By computation offloading, significant energy In this paper, leveraging optimization theory, we adopt consumption can be saved on mobile devices, enabling a model-based approach to the offloading policy. We more media applications [5]. can also adopt a rule-based approach using machine learning; that approach remains our future work. In this paper, we first present a generic green mobile cloud system to provide Transcoding as a Service (TaaS) to mobile devices. We then propose an optimization framework of offloading policy to reduce the energy consumption on both the mobile device and the cloud. For the mobile device, we formulate offloading policy as a constrained optimization problem, in order to minimize the energy consumption on the mobile device while satisfying a delay deadline. We find the operational region on which execution mode, i.e., mobile execution or cloud execution, is Figure 1. Architecture of green mobile cloud. more energy efficient. For the cloud, using the framework of Lyapunov optimization, we propose an 3. Optimization framework. online algorithm to dispatch transcoding tasks to In this section, we propose an optimization framework service engines. This algorithm can reduce energy of offloading policy for green mobile cloud. consumption while achieving the queue stability in the cloud. The optimization framework of offloading As illustrated in Figure 2, a transcoding task can be policy can jointly reduce the energy consumption on executed in two alternative modes: both the mobile device and the cloud, which provides guidelines for the design of green mobile cloud.  Mobile execution: a transcoding task is executed locally on the mobile device; 2. System architecture for green mobile cloud  Cloud execution: a transcoding task is In this section, we present a generic green mobile cloud offloaded and scheduled by the dispatcher to system to provide TaaS to mobile devices. one of the service engines for execution. http://www.comsoc.org/~mmc/ 23/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter In this section, we present the energy-efficient offloading policy for the mobile device and the cloud, under the optimization framework.

Offloading policy for mobile devices. We can adapt the results in [6] to obtain the minimum energy consumption on the mobile device. Specifically, the minimum energy consumption of the mobile device * 3 2 Figure 2. Execution modes for transcoding tasks. by the mobile execution is ξm =ML /Td , where M is a constant depending on the chip architecture on the To save energy consumption on the mobile device, we mobile device. The minimum energy consumption on * formulate delay-constrained optimization problems for the mobile device by the cloud execution is ξc =C(n) n n-1 the mobile execution and the cloud execution, L /(Td-L'/r'-T0) +P'L'/r', where the first term refers to respectively. Given the profile of the transcoding task, the energy consumption of transmitting the input data we determine which execution is more energy-efficient. and the second term refers to the energy consumption First, for the mobile execution, its energy consumption of receiving the output data. In addition, C(n) is a ξm is minimized by optimally configuring the clock function of n for the cloud execution, and P' and r' are frequency via dynamic voltage scaling (DVS), i.e., the power and rate of receiving the output data, * , where ψ is the set of all xm = min{xm (L,Td,y)} respectively. yÎY feasible clock-frequency vectors ψ, L is the input data We consider the application of transcoding FLV files size and Td is the time deadline of video transcoding. with 1920x1080 resolution size into mp4 files with Second, for the cloud execution, its energy 480x360 resolution size. We can determine which consumption ξc is minimized by optimally setting the execution is more energy-efficient for the mobile * * transmission rate on the mobile device, i.e., device, by comparing ξm and ξc . We model L′ as a * , where Φ is the set linear function of L, using data fitting over the output xc = min{xtran (L,Ttran ,f)} + xrecv (L') fÎF data size, i.e., L′= aL + b, where a = 0.02658 and b = of all feasible data scheduling vectors, T is the tran 0.3316. We also set r'=500KB/s, n=2 and T0=0.5s, transmission time, L’ is the output data size, ξ and tran where T0 is the queueing delay based on the dispatching ξrecv are the energy consumed by transmitting and decision in the cloud. Figure 3 shows there is an receiving data, respectively. Particularly, Ttran = Td- operational region that separates the mobile execution Trecv -T0, where Trecv is time of receiving output data and the cloud execution for input data size L and and T is queueing delay determined by queue length of 0 specified time delay Td. Given (Td, L), if it is above the the service engine that executes the transcoding task curve, then the cloud execution is optimal; otherwise, depending on the dispatching decision. Then, the the mobile execution is optimal. offloading policy for mobile devices is obtained by comparing the optimal energy consumption of the mobile execution and the cloud execution.

To save energy consumption on service engines, we formulate the dispatching decision as a stability- constrained optimization problem. We aim to minimize the time average energy consumption while satisfying the queue stability. Particularly, we consider a discrete time model and assume that transcoding time can be estimated and queue length of each service engine can be observed for each time slot (i.e., A(t)={A(t)} and Q(t)={Q(t)}). We define the time average energy consumption E and the time average queue length Q as the average of summation of energy consumption and remaining transcoding time by all the service Figure 3. Operational region of the mobile execution engines over a long period of time, respectively. The and the cloud execution for mobile devices. energy consumption for each service engine is a product of its power P and the transcoding time A(t). In Offloading policy for cloud. Using the framework of Lyapunov optimization [7], we addition, the queue stability is defined as Q < ¥. propose an online algorithm to dispatch transcoding 4. Energy-efficient offloading policy http://www.comsoc.org/~mmc/ 24/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter tasks to the cloud. Upon receiving the transcoding task References at time slot t, the dispatcher estimates its transcoding [1] A. Vetro, C. Christopoulos, and H. Sun, “Video time for each service engine, A(t), and observes the transcoding architectures and techniques: an queue length on each service engine, Q(t). Then, the overview,” Signal Processing Magazine, IEEE, vol. transcoding task is dispatched to the service engine 20, no. 2, pp. 18–29, 2003. [2] Cisco Visual Networking Index: Forecast and with the minimum value of A(t)(Q(t)+VP), where V is a Methodology, 2012 - 2017, Cisco, 2013. control variable. The queue length of the chosen [3] M. Armbrust, A. Fox, R. Griffith, A. Joseph, R. Katz, A. service engine is the queueing delay of offloading for Konwinski, G. Lee, D. Patterson, A. Rabkin, I. Stoica, mobile devices. In addition, the queue length is and M. Zaharia, “A view of cloud computing,” updated at every time slot for each service engine. Communication of the ACM, vol. 53, no. 4, pp. 50–58, 2010. The online algorithm can reduce energy consumption [4] K. Kumar and Y. H. Lu, “Cloud computing for mobile while achieving the queue stability. We do the users: Can offloading computation save energy?” IEEE simulation, assuming there are 10 physical servers in Computer, vol. 43, no. 4, pp. 51–56, 2010. [5] Y. Xu and S. Mao, “A survey of mobile cloud computing the cloud. Figure 4 shows the time average energy for rich media applications,” IEEE Wireless consumption and the time average queue length that Communications, vol. 20, no. 3, pp. 46–53, 2013. are normalized and calculated over 50000 time slots [6] W. Zhang, Y. Wen, K. Guan, D. Kilper, H. Luo, and D. under different V. With the increase of V, the time Wu, “Energy-efficient mobile cloud computing under average energy consumption decreases and converges stochastic wireless channel,” in IEEE Transactions on to the optimal value. However, with the increase of V, Wireless Communications, 2013, pp. 4569–4581. the time average queue length grows linearly. Hence, [7] M. J. Neely, “Stochastic network optimization with the cloud operator can dynamically tune the variable V application to communication and queueing systems,” for the energy-delay tradeoff. Synthesis Lectures on Communication Networks, vol. 3, no. 1, pp. 1–211, 2010.

Weiwen Zhang received a Bachelor’s Degree in Software Engineering and Master’s Degree in Computer Science from South China University of Technology, in 2008 and 2011, respectively. He is a PhD student in Nanyang Technological University in Singapore. His research interests include mobile computing, cloud computing and green media.

Yonggang Wen received his PhD degree in Electrical Engineering and Computer Science (minor in Western Literature) Figure 4. Tradeoff between time average energy from Massachusetts Institute of Technology, consumption and time average queue length. Cambridge, USA. He is an assistant professor with school of computer engineering at Nanyang 5. Conclusion Technological University, Singapore. He has published In this paper we presented a generic mobile cloud over 80 papers in top journals and prestigious system. We proposed energy-efficient offloading conferences. His latest work in multi-screen cloud policy for TaaS to minimize the energy consumption of social TV has been featured by transcoding on both the mobile device and the cloud global media (more than 1600 while achieving low delay. For mobile devices, we news articles from over 29 obtained the operational region of the optimal countries) and recognized with execution for mobile devices. For the cloud, we ASEAN ICT Award 2013 (Gold proposed an online algorithm to dispatch transcoding Medal) and IEEE Globecom 2013 tasks to service engines, which can reduce energy Best Paper Award. His research consumption while achieving the queue stability. In the interests include cloud computing, future, we will have VM resource management such green data center, big data that CPU and memory resource can be dynamically analytics, multimedia network and mobile computing. allocated. We will also use machine learning method to design the offloading policy. http://www.comsoc.org/~mmc/ 25/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter SPECIAL ISSUE ON NETWORK CALCULUS

Applications of Network Calculus to Multimedia Communications Guest Editor: Florin Ciucu (University of Warwick) {[email protected]}

Network calculus emerged in the early 1990s as an The paper illustrates the elegance of the stochastic alternative to the classical queueing theory. Motivated network approach to derive end-to-end results for by the need to analyze complex network scenarios with bursty data carried over a wireless environment. The non-Poisson arrivals, network calculus has evolved as underlying analytical challenges are addressed using an elegant analytical framework attractive to computer the convenient representations of arrivals and service scientists and engineers. In this sense, a key principle by envelopes and service processes. lies in the abstraction of cumbersome mathematical technicalities related to arrival processes, scheduling, The third paper, titled “A General Stochastic and multi-node analysis through the innovative Framework for Low-Cost Design of Multimedia SoCs”, concepts of envelopes and service processes. This is co-authored by Raman et al. The authors consider ability to analyze systems in a simple and yet intuitive the problem of better designing resource-constrained manner has made network calculus applicable in System-on-Chips (SoCs) in multimedia systems diverse areas such as switched Ethernets, systems-on- through analytically understanding the required chip, avionic network, or the smart grid. resources. Concretely, the authors apply the stochastic network calculus to dimension play-out buffers in The goal of this issue of the E-Letter is to illustrate the video decoders. A key message is that, in contrast to its applicability of network calculus in the area of deterministic counterpart, the stochastic network multimedia communications. The issue consists of five calculus can lend itself to better dimensioning of papers, co-authored by network calculus experts, resources in the presence of randomness in both the covering both applications and theory. In particular, streaming data and processing. the issue concerns topics such as the evaluation of the perceived quality of video streaming, the analytical The fourth paper, titled “Copula Analysis for computation of latencies in multimedia sensor Stochastic Network Calculus”, is co-authored by Wu, networks, the design of System-on-Chips in Dong, and Srinivasan from the University of Victoria, multimedia systems, improving the accuracy of Canada. This paper addresses a key theoretical network calculus performance metrics, and the challenge in the stochastic network calculus, i.e., the modelling and analytical understanding of transcoders tightness of the produced probabilistic bounds. In in end-to-end multimedia streaming. particular, the authors integrate the copula theory into the framework of the stochastic network calculus to The first paper, titled “A Network Calculus Extension improve performance bounds when arrival processes to EvalVid”, is co-authored by Heidinger and Lim from are not necessarily independent. The obtained results BMW Group Research and Technology, Germany. The have the potential to better design scheduling systems paper overviews the integration of network calculus in multimedia systems. into EvalVid -- a framework to assess the perceived quality of video streaming. In particular, by relying on The fifth paper, titled “A Delay Calculus of Streaming network calculus delay bounds, the paper illustrates the Media with Video Transcoding”, is co-authored by variability of the perceived quality (i.e., as the Mean Wang and Schmitt from University of Kaiserslautern, Opinion Score) as a function of the play-out buffer. Germany. The paper addresses the problem of end-to- The results are shown for various scenarios such as end delay bounds of video streaming in a network with automotive in-car and cabin networks. transcoders. The authors demonstrate in particular the suitability of the stochastic network calculus to model The second paper, titled “Performance Analysis for the lossy behavior of transcoders, and show that end- Wireless Multimedia Sensor Networks Based on to-end latencies grow in the number of transcoders. Stochastic Network Calculus”, is co-authored by Wang, Wang, and Peng from Guangxi University, China. The We hope that the five contributions convincingly authors address the derivation of probabilistic end-to- demonstrate the potential and suitability of network end delay bounds in a multi-hop multimedia sensor calculus to better understand and design multimedia network, whereby flows are assigned certain weights. communications systems. http://www.comsoc.org/~mmc/ 26/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter Florin Ciucu was Warwick. His research interests are in the stochastic educated at the Faculty of analysis of communication networks and the smart grid, Mathematics, University resource allocation, and randomized algorithms for of Bucharest (B.Sc. in large systems. He is a recipient of the ACM Sigmetrics Informatics, 1998), George 2005 Best Student Paper Award. Mason University (M.Sc. in Computer Science, 2001), and University of Virginia (Ph.D. in Computer Science, 2007). Between 2007 and 2008 he was a Postdoctoral Fellow in the Electrical and Computer Engineering Department at the University of Toronto. Between 2008 and 2013 he was a Senior Research Scientist at Telekom Innovation Laboratories (T-Labs) and TU Berlin. Currently he is an Assistant Professor in the Computer Science Department at the University of

http://www.comsoc.org/~mmc/ 27/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter A Network Calculus Extension to EvalVid Emanuel Heidinger1, Hyung-Taek Lim2 2 BMW Group Research and Technology [email protected], [email protected]

1. Introduction come from real measurements via TCPDump[15] or Network Calculus has shown good value for network simulation with OPNET network simulator guaranteeing worst case performance bounds in the [16]. field of industrial Ethernet networks. E.g., those networks are found in the automotive industry [19][22], Play-Out buffer Video Source aeronautics industry [11], and automation industry[13]. Network In those networks, the considered requirements usually Delay / Loss

range from 1ms for synchronization, over 10ms for

EvalVid API EvalVid API EvalVid worst case audio latency to 100ms for worst case Traces Traces signaling delay (cf. [23]). The performance guarantees are required to provide robustness to safety-relevant EvalVid functions, such as audio announcements in the aircraft Evaluation cabin, or acoustic warning signals due to distance alerts in the automotive scenario. PSNR / MOS

Figure 1: Schematic of EvalVid [14] In addition to the briefly explained benefit of Network

Calculus in safety-relevant use cases, we now provide The approach given in this letter is as follows: We insights on how the Network Caluculus results can be derive deterministic Network Calculus bounds for applied for video streaming use cases. Since video video streams over sample onboard networks. These streaming is frequently used for entertainment, it is Network Calculus bounds are then provided to believed that video streaming does not have that EvalVid’s API in order to dimension the play-out stringent requirements in terms of reliability. However, buffers on the receiver’s side. video streaming may also play a role in scenarios, that are closer to safety-relevance than this is the case for The given approach cannot only be used for standard in-car-entertainment or in-flight-entertainment, e.g., switched Ethernet, but also for AVB networks [12].

However, shaping algorithms as inherent parts of AVB • video surveillance in aircraft cabins, or may result in higher worst case delays which in turn • rear view camera for parking assistance. may require higher play-out buffer reserves.

For those scenarios, a simulative approach is certainly 3. Network Calculus Results helpful. However, when it comes to terms, completely Network Calculus is a framework for determining omitting analytically methods is not always possible. worst case bounds in queuing networks that has

emerged since the early nineties. The basic Network In this letter we provide the following insights: Section Calculus is primarily based on the work of Cruz [9][10]. 2 gives an overview on the video evaluation framework One decade later, Le Boudec [18] added nice algebraic EvalVid [14], which is used to determine quality of operations to this theory by shifting towards the video streaming using Network Calculus delays. (min,+)-algebra. The mentioned frameworks target at Section 3 recalls on Network Calculus results in the deterministic performance bounds, i.e., the determined field of automotive in-car networks as well as bounds hold under any circumstance. Today, the aeronautic cabin networks. Section 4 gives insights in Network Calculus community has moved towards how Network Calculus results can be employed on stochastic extensions of the Network Calculus. The EvalVid quality metrics. Section 5 draws the work of Ciucu et al. [8] summarizes the basic conclusions and gives an outlook on ongoing work. movements and stochastic extensions to the

deterministic calculus. 2. Video Evaluation Framework EvalVid

EvalVid [14] is a video evaluation framework by Klaue This work addresses wired video transmission such that et al. that determines quality metrics such as PSNR and it shall suffice to determine deterministic performance MOS. Figure 1 gives a simplified overview on EvalVid. bounds. We address the following scenarios and The EvalVid-API allows to add network delay to video topologies: (a) In-Car-Entertainment as discussed in packets traversing the network. These delays usually http://www.comsoc.org/~mmc/ 28/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter [21] by Lim et al., and (b) In-Flight-Entertainment as As a first approach, we assume values being uniformly discussed in [11] by Heidinger et al. The in-car distributed, and employ the given approach to the deterministic Network Calculus results were presented Akiyo video in CIF format [17]. In our setup, the in [20] and are summarized in Table 1 together with Akiyo video has the MOS value of 3.17 before the cabin results given in [11]. transmission. Figure 2 shows the impact of shifting the length of play-out buffer towards the analytical delay Topology Video Flow bound. We see that not only the network delay has to Maximum Delay in ms be addressed while dimensioning the play-out buffer, In-Flight Cabin Scenario 7.5560 but also the packetization of the video stream. In-Car Star-based 1.3550 In-Car Daisy chain-based 1.7588 In-Car Tree-based 1.7581

Table 1: Network Calculus Deterministic Bounds

4. Network Calculus and Video Evaluation As a matter of fact, there is no single frame loss in the deterministic world of Network Calculus as long as enough buffer space is available. As a result, if the receiver provides a large play-out buffer, the video transmission itself shall have no negative side effect on the video quality. However we saw scenarios in the introduction, e.g., video surveillance or rear view camera, where a smart play-out level in the lower range Figure 2: Mean Opinion Score is desired. 5. Conclusion The question is, how can we apply the Network There is a definite trend that today’s onboard networks Calculus results to give predictions about the perceived move towards switched Ethernet not least due to the video quality. A very first idea that immediately comes availability of low-cost switching ASICs. However, up is delaying each frame by the maximum worst case. there exists no built-in mechanism in standard switched But in the field of video streaming, other worst case Ethernet to predict exact arrival of transmitted frames. scenarios play an even greater role. We now consider Due to the non-preemptiveness, frames may even be high variability in the transmission time of each frame: delayed by frames with lower priority. One frame might be delivered as fast as possible and no interfering flow has negative impact on this frame. Network Calculus provides an analytical framework to Imagine now that the succeeding frame is interfered by capture worst case arrivals, and shows good acceptance all other imaginable flows, i.e., it experiences the in certification. Ongoing work targets at an worst case. As a consequence, we have to dimension a improvement of the engineering process, i.e., how can play-out buffer that it is able to handle that variability. we use the results from the Network Calculus analysis to dimension the onboard network and its components. In order to do so, we do not only require the worst case transmission time as provided by a standard Network In this letter we proposed to combine standard video Calculus analysis, we also calculate the minimal evaluation with analytical Network Calculus methods. transmission time. The results can be directly used to give close upper bounds for the play-out buffers without degrading We propose the following approach: Values ranging user’s perceived quality yet achieving low latency from minimum delay to maximum delay are feed to the play-outs. EvalVid API. These values follow a distribution that does not necessarily have to correspond to the expected Acknowledgments: Portions of this research were distribution. In fact, our intention is to choose a done while Emanuel Heidinger was a PhD student at distribution with a worst, negative impact in terms of the Department of Computer Science, Technische variability. Universität München.

http://www.comsoc.org/~mmc/ 29/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter

References bounds in in-car and aeronautic networks. [8] F. Ciucu and J. Schmitt. Perspectives on Network [21] Hyung-Taek Lim, Lars Volker, and Daniel Herrscher. Calculus: No Free Lunch, but Still Good Value. In Challenges in a future IP/ethernet-based in-car network Proceedings of the Conference on Applications, for real-time applications. In Design Automation Technologies, Architectures, and Protocols for Computer Conference (DAC), 2011 48th ACM/EDAC/IEEE, pages Communications, SIGCOMM 2012, pages 311–322. 7–12. IEEE, 2011. ACM, 2012. [22] Martin Manderscheid and Falk Langer. Network [9] R.L. Cruz. A Calculus for Network Delay, Part II, calculus for the validation of automotive ethernet in- Network Analysis. IEEE Transactions on Information vehicle network configurations. In Cyber-Enabled Theory, 37(1):132–141, 1991. Distributed Computing and Knowledge Discovery [10] R.L. Cruz. A Calculus for Network Delay, Part I, (CyberC), 2011 International Conference on, pages 206– Network Elements in Isolation. IEEE Transactions on 211. IEEE, 2011. Information Theory, 37(1):114–131, 1991. [23] Radio Technical Commission for Aeronautics. DO- [11] E. Heidinger, N. Kammenhuber, A. Klein, and G. Carle. 214/SC-164: Audio Systems Characteristics and Network Calculus and Mixed-Integer LP Applied to a Minimum Operational Performance Standards for Switched Aircraft Cabin Network. In Proceedings of the Aircraft Audio Systems and Equipment Systems and 20th International Workshop on Quality of Service, Equipment. 1993. IWQoS 2012, pages 1–4, 2012. [12] IEEE 802.1 AV Bridging Task Group. Website, Emanuel Heidinger received Accessed 2011-01-25. his diploma (equiv. to MSc) http://www.ieee802.org/1/pages/avbridges.html. and doctoral degree from TU [13] J. Imtiaz, J. Jasperneite, and L. Han. A performance München, all in computer study of Ethernet Audio Video Bridging (AVB) for science, in 2006, and 2013 Industrial real-time communication. In Proceedings of the 14th Conference on Emerging Technologies & respectively. During his studies, Factory Automation, ETFA 2009, pages 1–8. IEEE, 2009. he worked in the PhD program [14] Jirka Klaue, Berthold Rathke, and Adam Wolisz. at EADS Innovation works. His Evalvid–a framework for video transmission and quality research interests lie in the field evaluation. In Computer Performance Evaluation. of Network Calculus, Discrete Modelling Techniques and Tools, pages 255–272. Event Simulation and Switched Springer, 2003. Queuing Networks for Aeronautic Cabin Networks. [15] TCPDUMP/LIBPCAP public repository. Website, Accessed 2014-02-13. http://www.tcpdump.org/. [16] OPNET. Application and Network Performance with Hyung-Taek Lim received his OPNET. Website, Accessed 2011-07-04. http://www.opnet.com/ diploma (equiv. to MSc) from [17] Jirka Klaue. EvalVid - A Video Quality Evaluation TU Berlin in computer Tool-set. Website, Accessed 2014-01-30. engineering in 2009. During his http://www2.tkn.tu-berlin.de/research/evalvid/. study, he worked in the PhD [18] J.Y. Le Boudec. Network Calculus: A Theory of program at BMW Group Deterministic Queuing Systems for the Internet. Research and Technology for Springer-Verlag, Berlin, 2004. the topic Ethernet AVB. His [19] H.T. Lim, D. Herrscher, L. Volker, and M.J. Waltl. research interests lie in the field IEEE 802.1 AS time synchronization in a switched of discrete event simulation and Ethernet based in-car network. In Proceedings of the Conference on Vehicular Networking Conference, VNC Quality-of-Service in wired and 2011, pages 147–154. IEEE, 2011. wireless communication in an in-car network. [20] Hyung-Taek Lim and Emanuel Heidinger. Performance

http://www.comsoc.org/~mmc/ 30/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter Performance Analysis for Wireless Multimedia Sensor Networks Based on Stochastic Network Calculus Nao Wang, Gaocai Wang and Ying Peng School of Computer and Electronics and Information, Guangxi University, 530004, China [email protected]

Consequently, the content and nature of the sensed data 1. Introduction also varies. As an example that highlights the need for In recent years, wireless multimedia sensor networks network level QoS, consider the task of bandwidth (WMSNs) are able to ubiquitously obtain multimedia assignment for multimedia mobile medical calls, which content such as video, audio streams, images, and include patients’ sensing data, voice, pictures and video scalar sensor data from the environment because of the data. Unlike the typical source-to-sink multi-hop availability of inexpensive hardware. In a WMSN communication used by classical sensor networks, the system, wireless channels have time-varying location WMSN architecture maybe use 3G/4G cellular system dependent characteristics and different wireless sensors in which individual nodes forward the sensed data to a experience different channel conditions at a given time. cellular phone or a specialized information collecting These stochastic wireless channels are affected entity. Different priorities are assigned to video data inherently by sensors’ shadowing, interference and originating from sensors on ambulances, audio traffic path losses due to changing environments and possible from elderly people, and images returned by sensors mobility. In a WMSN, on one hand, the multimedia placed on the body. In order to achieve this, parameters content should be delivered with predefined levels of like hand-off dropping rate, latency tolerance and Quality of Service (QoS) under resource and desired amount of wireless effective bandwidth are performance constraints such as bandwidth, energy and taken into consideration. delay. On the other hand, the traffic generated by In this paper, we mainly focus on some performance sensors is highly burstiness, and various kinds of data metrics of WMSNs. We firstly limit traffic arrival input and output sensors by occupying stochastic process by using exponentially bounded burstiness wireless channel to satisfy different performance (EBB) model of SNC and establish stochastic arrival demands and guarantee. Some uncertain factors curve. Then, we give a new suitable stochastic service existing in stochastic wireless channel become curve which can provide better QoS performance and challenges for performance guarantee in a WMSN. obtain stochastic service curve. We give delay bounds Traditional deterministically performance guarantee of single node and the end-to-end delay bounds. methods based on queuing theory and effective 2. Network model bandwidth are relatively weak in making performance analysis of stochastic networks and lead to low In Figure 1, we introduce reference architecture for utilization of network resource. WMSNs, where three sensor networks with different Deterministic network calculus (DNC) has already characteristics are shown, possibly deployed in become an elegant and effective theory for worst case different physical locations. performance analysis in network dimensioning by two tools: arrival curve and service curve, since it was proposed by Cruz and Chang. On the basis of DNC, some related works have been proposed to derive deterministically performance guarantee (such as deterministic delay and backlog bounds) in some traditional and fashionable networks. As an improved version of DNC, a novel theory called stochastic network calculus (SNC) in which extends DNC two curves with probability meanings, has been proposed systematically. One can provide stochastic performance guarantee in network dimensioning and get higher network utilization at the expense of minor Figure 1 A network architecture for WMSN performance degradation under SNC framework. We mainly focus on performance based on popular We define edge nodes (EN) which sense information SNC in wireless multimedia and data networks. More and data from environment. Sense nodes (SN) can especially, sensed data may originate from various sense information and relay data for edge nodes. Sink types of events that have different levels of importance. node deliver data to control center. http://www.comsoc.org/~mmc/ 31/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter

the flow gets service Si(s, t) in time interval (s, t) S(,) s t u ii,i , j  1,2,3  N (3) Sjj(,) s t u

And the flow gets minimal service rate Ri is R  (u u )C (4) i i iE i Where, E is the set of flow for waiting service in time Figure 2 wireless sensor network traffic model interval (s, t). We give some traffic process and scheduling rules 3. Performance evaluation for Figure 1 and 2 models. In this section, we give some performance metrics for (1) Edge nodes send sensing data to sense nodes by WMSNs according to above results. Posisson process with parameter r. 3.1 A delay bound for a single node (2) There exist arrival multi-flows in a sense node. In buffer of a sense node, data will be forwarded to the R(t) denotes arrival data and x is instantaneous next sense node according first come first serve burstiness. mechanism. On the other hand, we suppose the sense (3) Sense nodes buffer arrival data. node serves data by using token bucket model (r, x). So, (4) Sense nodes provide FCFS service for arrival we can compute a delay bound of single node. flows. The service capability is C, and satisfies service f g() x  Pr{() D t  h (  x ,)} curve β(t). Departure curve is R*(t). Pr{sup {inf{D ( t )  0 : ( t )  x  ( t )}}} * t0 (5) R (t) can be regarded as arrival flows for next x sense node. Pr{supt0 {inf{D ( t )  0 : rt  x  C ( t  )}}} Stochastic arrival curve C We give a suitable stochastic arrival curve for sensed Pr{supt0 {inf{D ( t )  0 : ( C  r ) t  x  x }}} data in a WMSN. In SNC, EBB traffic model is a 2x Pr{sup {inf{D ( t )  0 : t  }}} typical stochastic arrival traffic model, it has already t0 Cr been proved that many types of input traffic processes 2x such as Poisson, Bernoulli and exponential ON/OFF Pr{supt0 {D ( t )  0 : t  }} (5) have EBB. So, given the bursty nature of voice and Cr video data, queueing disciplines are needed that can According to the delay of single node, we can accommodate sudden peaks, we abstract arrival traffic deduce the end-to-end delay of a path in a WMSN. A process of all flows and adopt EBB model bound. We weight ui is set to every arrival flow because of a have the following Theorem 1. limited service capability C, when the flows arrive at Theorem 1: Stochastic Arrival Curve: Suppose all the sense node. sensed data arrive at sensor node according to M/M/1 2x Pr{D ( t ) h (  x , )}  Pr{supt0 { D ( t )  0 : t  }} (6) stochastic process and arrival curve satisfies R(t), R(t) Rri  and α(t) denote arrival data and arrival curve, 3.2 The end-to-end delay bound respectively. α(t)=rt + x, r is arrival rate and x is Next, we will compute the end-to-end delay bound by -xt instantaneous burstiness. f(x)=Me is violation using convolution formula. probability. For 0≤s≤t, if x≥0, we have Theorem 3: Suppose a path of a WMSN has n Pr{R ( s , t ) rt  x }  f ( x ) (1) (n=1,2,…,N) sensor nodes. If every sensor node 1 2 n Stochastic service curve provides service curve β (t), β (t),…, β (t), the end-to- The performance of WMSNs depends on different end service curveβnet(t) can be expressed by stochastic service curve. 12 n net ()()tt      (7) Theorem 2: Stochastic Service Curve: Suppose An illustration for the end-to-end delay is given in flows have arrival process A(t) and departure process * Figure 3. R (t), a sensor node said to provide a stochastic service curve β(t)=C(t-T) with violation probability g(x)=1- Me-xt for flows if for all t≥0 and x ≥ 0. Pr{()R* t R () t  ( rt  x )0}   g () x (2) Where, T denotes the delay of traffic in a sensor node. For a single sense node, the sense node will schedules arrival flow according FCFS if there are multiple arrival flows. Suppose N (N=1,2,…) arrival flows, for Figure 3 An illustration for the end-to-end delay any arrival flow i, if the i flow has priority weight ui, http://www.comsoc.org/~mmc/ 32/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter Theorem 4: Suppose a path of a WMSN has n in part by the Natural Science Foundation of Guangxi (n=1,2,…,N) sensor nodes, and the flows transmit to Province under Grant No.2010GXNSFC013013 sense node from h edge nodes with rate r. The service capability is C, and hr≤C. Then, the end-to-end delay References DE2E satisfies the following [24] I. Akyildiz, T. Melodia, K. Chowdhury, A survey on nn( 3) wireless multimedia sensor networks, Computer nn( 3) M n 2 Networks 51 (4) (2007) 921–960. DEE2  log( ) (8) 2x ( Ri  hr )  [25] Y. Liu, Ch. K. Tham, Y.M. Jiang, A calculus for Here, ε is violation probability of the path. stochastic QoS analysis. Performance Evaluation, 64(6), pp.547-572, 2008. 3.3 Simulation results [26] J. Debardelaben, Multimedia sensor networks for ISR We give some numerical results based on above applications, in: 37th Conference on Signals, Systems theoretical results. Figure 4 shows that the delay of and Computers, vol. 2, 2003, pp. 2009–2012. single node changes with the increase of the priority [27] I. Akyildiz, T. Melodia, K. Chowdhury, Wireless weight ui. Figure 5 shows the end-to-end delay bound multimedia sensor networks: applications and testbeds, under different node sizes. Proceedings of the IEEE 96 (10) (2008) 1588–1605. [28] J. Boudec and P. Thiran. Network calculus: A theory of deterministic queuing systems for the internet. Springer, LNCS, 2008. [29] R. L. Cruz. A Calculus for Network Delay, Part I: Network Elements in Isolation.IEEE Transactions on Information Theory, 36(2): 114-131. 1991 [30] R.L.Cruz.A Calculus for Network Delay , Part II : Network Analysis.IEEE Transactions on Information Theory, 37(1):132-141. 1991. [31] Florin Ciucu , Almut Burchard , Jorg Liebeherr . A Network Service Curve Approach for the Stochastic Analysis of Networks.In:Proceedings of the ACM International Conference on Measurement and Modeling Figure 4 Different delay bounds for single node of Computer Systems(SIGMETRICS’2005) , Banff , Alberta,Canada,2005. 279-290. [32] Florin Ciucu , Almut Burchard , Jorg Liebeherr.Scaling Properties of Statistical End-to-End Bounds in the Network Calculus.IEEE Transactions on Information Theory,2006,52(6):2300-2312.

Nao Wang received her B.E., M.E., Degree from Central South University, China. Currently, she is a lecture in Guangxi University. Her research interests include networks optimization. Gaocai Wang received his B.E., M.E., and Ph.D. degrees in computer science from Central South University, China, in 2004. Afterward, he went to Figure 5 The end-to-end delay bounds Tsinghua University, University of Toledo and Texas 4. Conclusion Southern University as a Postdoctor. Currently, he is a This paper mainly focuses on some performance professor in the Department of Computer Science, metrics of WMSNs. We firstly limit traffic arrival Guangxi University. His research interests include process by using exponentially bounded burstiness computer networks, routing algorithms, performance (EBB) model of SNC and establish stochastic arrival evaluation. He has published more than 60 journal and curve. Then, we give a new suitable stochastic service conference papers in these areas. curve which can provide better QoS performance and Ying Peng received her B.E., M.E., Degree from obtain stochastic service curve. We give delay bounds Huazhong Normal University, China. Currently, she is of single node and the end-to-end delay bounds. a Ph.D candidate and lecture in Guangxi University. Acknowledgment: This research is supported in part Her research interests include energy consumption by the National Natural Science Foundation of China optimization. under Grant Nos. 61262003, 61063045 and 61103245,

http://www.comsoc.org/~mmc/ 33/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter A General Stochastic Framework for Low-Cost Design of Multimedia SoCs Balaji Raman1, Ayoub Nouri1, Deepak Gangadharan2, Marius Bozga1, Ananda Basu1, Mayur Maheshwari1, Axel Legay3, Saddek Bensalem1, and Samarjit Chakraborty4 VERIMAG (France) 1, Technical University of Denmark2, INRIA Rennes (France) 3, Technical University of Munich (Germany) 4 [email protected]

1. Introduction scenarios of the analytical framework. An apt choice of a modeling framework is essential to design resource-constrained System-on-Chips (SoCs) in multimedia systems (such as video/audio players, etc.). Such a modeling framework must exploit the inherent stochastic nature of the multimedia applications to design low-cost systems. The uncertainty in such systems is due to high variability present in the input multimedia stream, in terms of number and complexity of items that arrive per unit time to the system. Consequently, the variability is exhibited both in arrival and in processing time.

Many of the existing analytical frameworks proposed Figure 6: Display buffer in a generic video decoder so far are either incompatible or inflexible to capture of a SoC. the key characteristics of the system being modeled:  Worst-case execution time modeling and analysis framework cannot capture behavior 2. Case study: Video Decoder of soft real-time systems, leading to We analyze an abstraction of a video decoder SoC pessimistic designs with exorbitant use of (shown in Figure 1) with the analytical approach. In the hardware resources (such as buffer size). abstract SoC model, the input video stored is fed to the  Average-case execution time analysis input buffer in terms of stream objects such as framework cannot provide QoS guarantees macroblocks, frames, etc. A pipeline of functional units and are thus hardly trustworthy. process the input stream. Processed items are To address the above limitations, we sought a temporarily stored in the output buffer before their framework for analyzing multimedia systems that display. account for the stochastic nature of the streaming application. We need a model characterizing input Now, we will state the problem addressed in this paper. stream and execution of the multimedia stream as We assume that the multimedia SoC contains a single stochastic; instead of capturing event arrivals and processing unit and two buffers. executions with worst or average cases. Problem Statement: To estimate the probability that Recently, stochastic network calculus based the buffer underflow (U(t)) is less than two consecutive approaches have been proposed for performance frames in 30 frames. We are given the following: analysis of multimedia systems. These approaches,  a set of video clips of certain bit-rate (r) and however, used the probabilistic calculus only partially: resolution, the input stream objects of a multimedia stream (e.g.,  maximum frequency of the processing unit of frames) and their execution time are assumed to be the multimedia SoC (f), deterministic. Another study used probabilistic real-  consumption rate of the output device (c), time calculus to analyze hard real-time systems. This  start-up values for the initial delay (d), input- later research, however, did not focus on any specific buffer size (b), and playout buffer size (B). application domain. Nonetheless, if stochastic network calculus is fully adopted for system-level design, then 2.1. Analytical Model of the Multimedia SoC design of multimedia embedded platforms can benefit, too. The analysis can provide probabilistic bounds on In this subsection, first, we give an overview of our the output quality of the system. The worst-case and analytical approach for this case study. Second, we average case analysis of the system is special-case present formulation for evaluating the QoS constraints. http://www.comsoc.org/~mmc/ 34/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter Therefore, the system designer can choose an initial We can estimate the maximum size of the output buffer delay based on his resource and quality constraints. using our stochastic analytical framework. Previous In what follows, we formulate the probability works that estimate the output buffer size using distribution of the output buffer size using the deterministic real-time calculus proceed as follows. For stochastic real-time calculus model. all given video clips, Maxiaguine et al. construct upper bounds on the number of items that arrive to the input In practice, the designer is typically given a set of input buffer and that execute in the processor. These two clips to design the SoC with given display constraints. bounds together yield another upper bound on the So,in our problem setting, the designer can construct an number of items that arrive to the display buffer. Thus, upper and lower bound using synthetic traces (which given a rate at which items are consumed from the are obtained from actual traces) and use the actual output buffer, Maxiaguine et al. estimate the maximum traces to construct the bounding functions. Now, we buffer size required. explain the construction of these synthetic traces.

We, too, compute deterministic upper bounds on Let the given set of input video clips be partitioned into arrival, execution, and output, however, only for a sub- two sets, SA and SB, based on the designer’s set of given video clips; the remaining video clips can requirement that all clips in SB must be processed with violate the deterministic bounds. For example, the no loss in video quality. The deterministic upper bound number of items arriving to the output buffer over a on the arrival and the output, introduced in the previous time interval, for a certain clip, can be larger than the section, are constructed using clips in SB. Now, we deterministic output bound. Now, we explain how to discuss how to synthetically generate clips in case we quantify this deviation from the deterministic bound in do not have a set SB. a stochastic setting (as we are using probabilistic network calculus). The information we have about the clips in set SA are the number of bits and number of cycles corresponding Assume that the stream objects arrival and execution to each macroblock. For certain macroblocks, we are stochastic—an apt characterization of multimedia modify the number of bits and cycles, assuming we are streams. So, what is the probability that the number of given lower bounds on these parameters. Any number stream objects that arrive to the output buffer over a of bits lower than the actual bound in the input traces time interval is larger than the deterministic output are replaced with a value of the lower bound. Thus we bound? obtain synthetic traces forming clips in set SB. Now we explain how we estimate stochastic bounds using the First, we estimate the maximum probability of actual clips (i.e. clips from set SA) and synthetic traces violating the deterministic bounds for the arrival and (or if available clips from set SB). execution; then, using these two bounds, we estimate it for the output. Using this stochastic bound on the The upper and lower bounds on the arrival given from output, given the constant consumption rate, we can the formula in the previous subsection are calculated compute the probabilistic distribution of the buffer for the synthetic traces. That is, from the modified size. inverse ϕ function, we compute αl(Δ) and αu(Δ). This leads to the definition of the stochastic arrival curve. We specify tolerable loss in video quality as: less than two consecutive frame loss within 30 frames, less than Assume the output arrival curve is an arrival process to 17 aggregate frame loss within 100 frames, etc. the playout buffer. The probability distribution at the Previous work models the loss of stream objects as playout buffer can be computed using the bounding buffer underflows, which occurs whenever the display function h. device finds insufficient items to read from the output P (U(t) > a) ≤ h(a – (α* Ø ct))(0), t ≥ 0 buffer. Raman et al. show that the amount of buffer In the above equation α* is the output arrival function underflow can be controlled using an application given by αuØ βu (t). parameter, namely, the initial playout delay, the delay after which the video starts to display. 3. Results

We, too, tune the initial delay parameter to restrict the This section sketches QoS probabilities estimated from maximum buffer underflow, albeit, in a stochastic the stochastic network calculus approach presented in setting. We obtain probability values for a QoS the previous section. property to hold for a range of initial delays. The buffer size corresponding to each such delay can be estimated. http://www.comsoc.org/~mmc/ 35/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter We implemented the analytical framework described in MATLAB. The experiments were conducted for a low- 4. Conclusion This paper proposed an application of stochastic network calculus for the design of multimedia system- on-chips. There are two main advantages of choosing this framework (i.e. stochastic network calculus) for modeling multimedia systems: (1) for characterizing multimedia streams as stochastic and thus leading to efficient SoCs (in terms of hardware resources, etc.,), and (2) for providing QoS guarantees. We illustrated these advantages with applying the network calculus framework for designing video decoders. We also proved the accuracy of our methodology using a Figure 7: Probabilistic bounds for mobile.m2v. simulation technique whose designs can be expressed in formal semantics. bit rate and low resolution clips (352 x 240) obtained from an open source. The bit-rate of the input video is References 1:5 Mbits per second and the frame output rate is [1] B. Raman, A. Nouri, D. Gangadharan, M. Bozga, A. Basu, 30fps. We used an MPEG2 implementation optimized M. Maheshwari, J. Milan, A. Legay, S. Bensalem, and S. for speed. The MPEG2 source was annotated to get the Chakraborty. A General Stochastic Framework for Low- number of bits corresponding to each compressed Cost Design of Multimedia SoCs. Technical Report TR- 2012-7, Verimag Research Report, 2012 (also published macroblock. The execution cycles for each macroblock in IEEE SAMOS XIII). is obtained from the software simulator SimpleScalar. [2] N. H. Zamora, X. Hu, and R. Marculescu. System-level Recapitulate that the number of bits and execution performance/ power analysis for platform-based design of cycles per macroblock are inputs to the analytical multimedia applications, ACM Transactions on Design framework. We chose the video files cact.m2v, Automation of Electronic Systems, (TODAES), 12(1):1– mobile.m2v, and tennis.m2v for our experiments. 29, January 2007. To construct the synthetic clips we set the lower bound [3] Y. Jiang and Y. Liu. Stochastic Network Calculus. for bits (e.g to 60) and the lower bound for execution Springer, 2008. cycles (e.g to 9000). Then from the actual trace [4] L. Santinelli and L. Cucu-Grosjean. Toward probabilistic real-time calculus. ACM SIGBED Review, 8(1):54–61, containing the number of bits and execution cycles per March 2011. macroblocks, synthetic traces are obtained; any value [5] B. Raman, G. Quintin, W. T. Ooi, D. Gangadharan, J. below the lower bound is modified to the lower bound. Milan, and S. Chakraborty. On buffering with stochastic Figures show the probability that the buffer underflow guarantees in resource constrained media players. In Proc. is greater than two consecutive frames over any time of the IEEE/ACM/IFIP International Conference on interval. Figure 2 also show the probability estimates Hardware/Software Codesign and System Synthesis from the statistical model checking framework. (CODES+ISSS), pages 169–178, September 2011. Following are the main observations: [6] A. Maxiaguine, S. Kunzli, S. Chakraborty, and L. Thiele.  Increase in playout delay decreases the Rate analysis for streaming applications with on-chip buffer constraints. In Proc. Of the Asia and South Pacific amount of buffer underflow, so, probability Design Automation Conference (ASP-DAC), pages 131– that the buffer underflow is more than two 136, January 2004. consecutive frames decreases. [7] D. Wijesekera and J. Srivastava. Quality of Service (QoS)  The estimates from stochastic real-time Metrics for Continuous Media. Multimedia Tools and calculus upper bound the statistical model Applications, 3(2):127–166, July 1996. checking results as the analytical framework captures the worst-case behavior.

http://www.comsoc.org/~mmc/ 36/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter Copula Analysis for Stochastic Network Calculus Kui Wu, Fang Dong, Venkatesh Srinivasan Computer Science Department University of Victoria BC, Canada V8W 3P6 {wkui, fdong, venkat}@uvic.ca

1. Introduction interval . For any 0 st, let Since its introduction in early 1990s [1], network    A(,)()() s t A t A s , A(,)()() s t A t A s and calculus has been widely adopted to analyze complex By default, queueing systems, such as multimedia networks, where S( s , t ) S ( t ) S ( s ). the Markovian property of arrivals generally does not AAS(0) (0)  (0)  0. hold and thus traditional queueing theory is hard to apply. Network calculus has evolved along two tracks We denote by F the set of non-negative wide-sense – deterministic [2, 3] and stochastic [4]. In the domain increasing functions, i.e., of stochastic network calculus (SNC), due to some F {():0f    x  y ,0  f () x  f ()}, y difficulties specific to stochastic networks, it is only in recent years that critical network calculus properties, and by F the set of non-negative wide-sense such as concatenation property [4, 5] have been proved. decreasing functions, i.e., F {():0f    x  y ,0  f () y  f ()}. x The practical use of SNC has been questioned due to the lingering problem in deriving tight stochastic For any random variable X , its distribution function, performance bounds [6, 7]. In [7], the problem has denoted by been implicitly raised, and caution has been advised if F( x ) Prob { X x }, model transform is used to derive performance bounds. X belongs to , and its complementary distribution In [6], the problem has been analyzed in more detail and has been given an explicit term, called the quasi- function, denoted by deterministic problem. FX ( x ) Prob { X x }, belongs to . While substantial efforts have been devoted to improving the bounds [6, 8], the problem is tackled The (min, ) convolution of functions f and g is only for special types of traffic and service models, useful for SNC, and is defined under the using probability inequalities, e.g., Chernoff bounds and martingale inequalities. In general the algebra [1-3]: independence assumption is required, e.g., the (f g )() t inf {() f s  g ( t  s )}. (1) 0st independence of the traffic arrivals and the independence between the arrivals and the service. Stochastic traffic arrival curve and stochastic service curve are core concepts in stochastic network calculus, This paper points out the potential of copula theory in with the former used for traffic modeling and the latter SNC. With copula analysis and a simple case study, we used for service modeling. In the literature, there are clearly show the region where copulas can be helpful different definitions of stochastic arrival curve and and the best possible bound that SNC can achieve. stochastic service curve [4, 9]. We use simple models in this paper for illustration purpose. 2. Background A. Stochastic network calculus Definition 1: The t.a.c model [4]: A flow is said We introduce the notation and key concepts of stochastic network calculus [4, 9, 10]. We assume that to have a traffic-amount-centric (t.a.c.) stochastic all arrival curves and service curves are non-negative arrival curve  F with bounding function f F , and wide-sense increasing functions. Conventionally, denoted by  At() and At() are used to denote the cumulative Ata  f,, a traffic that arrives and departs in time interval (0,t ] , if for all ts0 and all x  0 , it holds respectively, and St() is used to denote the cumulative Prob{ A ( s,t ) ( t - s )  x }  f ( x ). (2) amount of service provided by the system in time http://www.comsoc.org/~mmc/ 37/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter Note that the above model is actually quite general and Sklar theorem implies that we can construct a copula if covers some broadly-used models. For example, the the joint distribution and the marginal distributions are stochastically bounded burstiness (BSS) model [11] is given. Similarly, we can calculate the joint distribution a special case of the t.a.c model by setting if we know the copula and the marginal distributions. (t s )   ( t  s ). Following the same notation, This seems to suggest that the joint distribution and when the traffic arrival At() follows the BSS model copula are equivalent. The following theorem, however, shows a special feature of copulas that the joint with upper rate of  and bounding function of f , we distribution function does not possess. denote it as AfSBB ,. Theorem 2: The invariant property of copulas [12]: For service models, the following model is normally Let X and Y be continuous random variables with adopted: copula CXY . If  and  are strictly increasing on the Definition 2: The w.s. model [4]: A server is said to range of X and the range of Y , respectively, then provide a flow with a weak stochastic (w.s.) CC()()X Y XY . In other words, is invariant service curve  F with bounding function g F , under strictly increasing transformations of and . denoted by 3. Stochastic network calculus with copulas Sgws ,, if for all t  0 and all x  0 , it holds A. Basic Lemmas For simplicity, we denote [xx ] min{ ,1} and ProbA{ ( t )  At ( )  x }  gx ( ). (3) 1  [xx ] max{ ,0} in the following. B. Copulas A copula is a function that links univariate marginals to In stochastic network calculus, we are often interested their multivariate distribution. In practice, it is in the complementary distribution function of normally easy to find the univariate marginals, but it is ZXY, i.e., Prob{ Z z }. The following two hard to obtain the multivariate distribution. With a lemmas have been widely used in the derivation of copula, we can capture the joint distribution of random stochastic bounds. variables once their marginal distributions are obtained. Lemma 1: General case [4]: For the sum of two Definition 3: An N-dimensional copula is a function C random variables and , , no matter having the following properties [12]: whether and are independent or not, NN 1) Dom CI[0,1] ; FZXY( z ) F F ( z ). 2) If any argument of C is zero, then C  0 . In addition, C is nondecreasing in each argument1; Lemma 2: Independent case: Assume that non- negative random variables and are independent 3) C has margins Cn which satisfy and FX ()() x f x and FX ()() x g x , where fg,.F. C( u ) C (1, ,1, u ,1, ,1) u for all u in I. n Then, for all x0, ProbXYx {   }  1  ( fgx  )( ),

Sklar's theorem [12] is the core of the copula theory, where f() x 1[()],()  f x11 g x  1[()]  g x , and  is because it provides a way to analyze the dependence the Stieltjes convolution operation. structure of multivariate distributions. It also builds a link between the joint distribution and the marginal The following lemma, from copula analysis, is useful distributions. for SNC: Lemma 3: Copula case [12]: Let Z be the sum of two Theorem 1: (Sklar's theorem) Let F be an N- random variables and . Then dimensional distribution function with continuous FZZZ( z ) F ( z ) F ( z ), (5) margins FFF12,,.N Then F has a unique copula representation: where

Fxx(,,,1 2 xNNN ) CFxFx ((),(),, 1 1 2 2 Fx ()) (4) FZXY( z ) 1sup { W ( F ( x ), F ( y ))}, (6) x y z

FZXY( z ) 1inf { W ( F ( x ), F ( y ))}, (7) x y z 1 We slightly modify the definition for ease of W( u , v ) [ u  v  1] , understanding. Refer to [12] for a more mathematically (8) rigorous definition. http://www.comsoc.org/~mmc/ 38/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter It is easy to show that the superposition of W(,)[]. u v u v 1 (9) and follows

B. Copula analysis Ag~,SBB 12   , where g can be calculated with Markov modulated processes have been extensively Equations (10) (general bound), (11) (independent used for representing multimedia traffic [13]. It has bound), (12) (copula upper bound), and (13) (copula been shown that Markov modulated traffic could be lower bound), respectively. The copula lower bound captured with the stochastically bounded burstiness indicates the tightest possible bound that we can obtain (SBB) model. Assume that we are given two Markov with SNC when the upper rate is 12 . modulated processes and Af1~,SBB  1 1

Af2~,SBB  2 2 As a concrete example, we assume Figs. 1 and 2 show two numerical examples. We have two interesting observations from the figures: that both bounding functions, f1 and f2 , have the  The general bound is the same as the upper copula exponential form [11] with mean values of  and  , bound, indicating that the general bound is actually respectively. We are interested in modeling the the worst bound that we can get with SNC. superposition of A1 and A2 , AAA12.  There is a clear gap between the independent bound and the lower copula bound. This gap implies that Let X and Y be exponentially distributed random if the dependence of random variables is unknown, variables with mean values of  and  , respectively, independent case analysis does not always lead to and let ZXY. With Lemma 1, we have the the best bound. There is much room for us to following bound, denoted as the general bound since it explore how to improve stochastic bounds with holds for any and : copulas. 1, z    Fz() z (10) Z   4. Conclusion and future work   ez,   With a simple case study, we illustrated the benefit of where (    )ln(    )   ln(  )   ln(  ). applying copula theory in SNC for tighter performance bounds. Our preliminary analysis, while by no means If we know that and are independent, we have comprehensive, sheds light on several important issues the following bound, denoted as the independent in SNC, including the region where we can take bound: advantage of dependency of random processes and the z tightest bound that SNC can possibly achieve. This   z  [(1 )e ]1 ,      knowledge is fundamental for designing better   scheduling and multiplexing strategies for multimedia FzZ ()  z z (11)   systems.   ee  [],1    Our future work includes constructing and validating copula models for network traffic with real-world Based on Lemma 3, we have the following copula experiments, and investigating the achievability of the upper and lower bounds, whose proof is omitted to lower copula bound. save space. 1 Theorem 3: Let and be exponentially distributed general random variables with means and , respectively. upper copula 0.8 independent ˆ lower copula Let Fz and Fz be as in Lemma 3. Then 1, z   0.6 ˆ  z Fz    (12)

 Prob ez,   0.4 and 1,z  0  0.2 z Fz    (13) ezmax(,) ,0  0 where (    )ln(    )   ln(  )   ln(  ). 0 2 4 6 8 10 z Figure 1. Different Bounds with 0.5, 1. http://www.comsoc.org/~mmc/ 39/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter [10] C. Li, A. Burchard and J. Liebeherr, “A Network Calculus With Effective Bandwidth,” in IEEE/ACM 1 Trans. Netw., vol. 15, no. 6, pp. 1442–1453, 2007. general upper copula [11] D. Starobinski and M. Sidi, “Stochastically bounded 0.8 independent lower copula burstiness for communication networks,” in IEEE Trans. Information Theory, vol. 46, no.1, pp. 206-212, 2000. 0.6 [12] R. Nelsen, An Introduction to Copulas. Springer, 2006.

Prob 0.4 [13] M. Schwartz, Broadband Integrated Networks. Prentice Hall PTR New Jersey, 1996. 0.2 Kui Wu received the Ph.D. degree in Computing Science 0 0 2 4 6 8 10 from the University of Alberta, z Canada, in 2002. He joined the Figure 2. Different Bounds with 2, 2. Department of Computer Science at the University of References Victoria, Canada in 2002 and is [1] R. L. Cruz, “A calculus for network delay. I. Network currently a Professor there. His elements in isolation,” in IEEE Trans. Information. current research interests Theory, vol. 37, no. 1, pp. 114–131, 1991. include network performance evaluation, network security, and cloud computing. He [2] C.-S. Chang, Performance Guarantees in Communication m Networks. Springer-Verlag, 2000. is an IEEE senior ember.

[3] J. Le Boudec, and P.Thiran, Network calculus: a theory of Fang Dong received Master deterministic queuing systems for the internet. Springer- degree in Electronics and Verlag, 2001. Communication Engineering and Bachelor degree in Physics [4] Y. Jiang, Stochastic network calculus. Springer, 2008. at Wuhan University, China. She is now working towards [5] F. Ciucu, A. Burchard, and J. Liebeherr, “A network her Master degree in Computer service curve approach for the stochastic analysis of networks,” in IEEE Trans. Information Theory, vol. 52, Science at the University of no.6, pp. 2300-2312, 2006. Victoria, Canada. Her current research interest is stochastic network calculus. [6] F. Ciucu and J. Schmitt, “Perspectives on Network Calculus - No Free Lunch but Still Good Value,” in ACM Venkatesh Srinivasan is an Sigcomm, pp. 311–322, 2012. Associate Professor in the Computer Science Department [7]K. Wu, Y. Jiang, and J. Li, “On the model transform in at the University of Victoria. stochastic network calculus,” in Quality of Service Prior to that, he was a (IWQoS), 2010 18th International Workshop on, pp. 1–9, 2010. postdoctoral fellow at the Max- Planck Institute for Informatics, [8] Y. Jiang, “Network Calculus and Queueing Theory: Two Center for Discrete Sides of One Coin,” in Proceedings of VALUETOOLS Mathematics and Theoretical 2009, 2009. Computer Science at Rutgers University and the Institute for [9] Y. Jiang, “A Basic Stochastic Network Calculus,” in Advanced Study. He received his PhD degree in Proceedings of ACM Sigcomm 06, pp. 123–134, 2006. Computer Science from the Tata Institute of Fundamental Research. His research interests are in Algorithms and Complexity of Computing.

http://www.comsoc.org/~mmc/ 40/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter A Delay Calculus for Streaming Media Subject to Video Transcoding Hao Wang and Jens Schmitt DISCO Lab, University of Kaiserslautern, Germany {wang, jschmitt}@informatik.uni-kl.de

1. Introduction Streaming video to a diverse range of target devices across a network is enabled by video transcoding. A transcoder converts the video stream to be delivered Fig. 1. A network scenario with video transcoding. from one quality level to another, either using the same standard, or even switching to another one. Both of these options, often called homogeneous and heterogeneous video transcoders [1], can be deployed on one or multiple intermediate nodes along the path Fig. 2. Model of transcoding with network calculus. from the video provider to the consumers. One simple For the arrival (departure) process A(t) (D(t)), which is scenario is to deliver, e.g., HD videos, to a certain set the amount of traffic in [0, t] (A(s,t) := A(t) - A(s) for 0 of desktop PCs, laptops, tablets, and smart phones (see ≤ s ≤ t), we define: Figure 1). The transcoding techniques can be very diverse: some packets may be randomly discarded, or Definition 1 (Packetizer). Consider a packet process L, information in the stream is used to get the transcoded which is a sequence of cumulative packet lengths L(n) flow. In general, transcoding is a lossy process. = l0 + … + ln, n = 0, 1, 2, … and l0 = 0, an L-packetizer L P is a network element satisfying for all A(t), t ≥ 0

To achieve a certain Quality-of-Experience (QoE) we ( ( )) ∑ (1) usually need to control the end-to-end delay performance of a video stream. Network calculus can where Nt = max{m: ∑ ( )}. We say that a provide much insight in the network system flow A(t) is L-packetized if A(t) = PL(A(t)) for all t ≥ 0. performance: for example, an optimal smoother based Definition 2 (Packet Delay). A process W(t) is called on network calculus is introduced in [2], yet it cannot packet delay (process), if for all t ≥ 0 capture the lossy transcoding behavior. In previous ( ) { ( ( )) ( ( ))} work [3] of ours, we developed a scaling element for the network calculus to model loss processes (though Next we define a dynamic server, which provides a not limited to these). Therein the model applies at the lower bound on the service of a (sub-)system. It is a granularity of an equal sized data unit (e.g., one bit). random process S(s,t) such that the convolution in- Now, we investigate the scaling element at the equality ( ) ( ) { ( ) ( )} granularity of variable data units as they typically arise holds for all t ≥ 0. When the inequality holds with in streaming media, where packets with variable equality, the dynamic server is said to be exact. Note lengths are often encountered. We call this new scaling that the convolution of two concatenated dynamic element packet scaling. The packet scaling decides servers is still a dynamic server (concept of whether a whole packet can be forwarded or must be convolution-form network). We define a packetized discarded. This happens after the processed bits have server consisting of a dynamic server S and a been re-packetized with the (same) packetizer. In this packetizer L and providing a dynamic server SL. When paper, we assume that the packetizers after each fluid L lmax exists, from [6] we know a possible S , ( ) service remain the same. This is suitable to the case [ ( ) ] . Next, we define the packet scaling where streaming media is homogeneously transcoded. element. The heterogeneous case could be based on an extension of [3]. See also Figure 2. For such a network, when Definition 3 (Packet Scaling Element). A packet analyzing the end-to-end delay the mathematical scaling element consists of an L-packetized arrival challenge is to preserve the convolution-form network process ( ) ∑ , a packet scaling process X [4] taking into account the influence of packetization. taking non-negative integer values and a scaled packetized flow defined for all t ≥ 0 as ( ) 2. Model of transcoding with network calculus ∑ The time model is discrete. Before defining the packet scaling element we first give the definitions of When we analyze concatenated packet scaling packetizer [4, 5], packet delay, and packetized server. elements, we cannot easily obtain ( ) ( )

http://www.comsoc.org/~mmc/ 41/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter

( ) ( )( ) ∑ Thus we get ( ) [ ] On one ∑ . For this case, we use different subscripts hand we know that for those packet sequences after each round of scaling. ( )( ) ( ) [ ] ∑ Assume an L-packetized ( ) , Nt is given in Eq. (1). We denote the packets resp. number On the other hand, we have

( ) of packets in the arrivals after each round of scaling as ( ) ( ) ∑ l resp. m(k), clearly ( ) , k,i Let ( ) ( ) , this completes the ( ) ( ) ∑ proof. { }, k > 0. Further, we denote the concatenated scaled process for all t ≥ 0 as

( )( ) ( ( ) ) ( )

( ) ( ) ( ) (2) Fig. 3. Commuting packetized server and packet scaling. 3. End-to-end delay of a network with transcoders In this section, we compute the end-to-end packet delay for the case with multiple transcoders. According to Fig. 4. A network with multiple streaming transcodings. our previous work, there are two methods to compute the end-to-end delay: one is to commute the service Next, we consider Figure 4 and provide an end-to-end and scaling elements, the other is to get the leftover delay bound and the corresponding order of growth. service for the flow of interest if the server has FIFO scheduling [7]. In this paper, we use the first. Theorem 1 (End-to-end Delays in a Network with Transcoding). Consider the network scenario from Lemma 1 (Commutation). Consider system (a) and (b) Figure 4 where an L-packetized arrival process A(t) = in Figure 3, we define ( ) ∑ as the L P (A(t)) traverses a series of alternate stationary and (mutually) independent bit level service elements exact service curve in (b), where ( ) ∑ , together with an L-packetizer and scaling elements ( ) ( ) ∑ . If A, S, X, and L are denoted by S , S , …, S and i.i.d. loss processes X , independent, then for all t ≥ 0, ( ) ( ). 1 2 n 1 X2, …, Xn-1, respectively. Assume the packets of L li are Before we present the end-to-end delay bound, we need i.i.d. Assume the MGF bounds ( )( ) ( )( ) to show two useful lemmas. We omit the proofs of and ( )( ) , for k = 1, …, n, Lemma 1, 2 due to space restrictions. and some . We also assume that the maximal Lemma 2 (Stationarity Bound). Assume that the packet length of L is lmax. Under a stability condition, packets li of a packet process L are i.i.d., the Xi’s of a given in the proof, for , we have the scaling element X are also i.i.d., A and B are two L- following end-to-end steady state delay bounds for all packetized arrival processes, then for all s, t , x > 0, d ≥ 0

(∑ ) (3) ( ( ) ( ) ) (( ( ) ( )) ) ( ) where the constants K and b are given in the proof as Lemma 3 (Recursive MGF Bound of Scaled Process). well. Moreover, the ε-quantiles scale as ( ), for some Assume that A is an arrival process, is a packetized ε > 0. server, li’s are i.i.d. with maximal length lmax, and the Proof. Due to space restrictions we only sketch the Xi’s are Bernoulli processes. If we denote ( ) as proof. The proof follows Theorem 1 in [3]. We use

Lemma 1, 2, 3 and Eq. (2) here, and letting ( ( ( ) ( )) ( )) [ ] { } we compute the end-to-end

delay bound on Wn(t) as ( ( ) ) then for all , and n > 1 (∑ ) ( ( ))( ) ( ) ( ) ( ) ∑ where is given.

∑ Proof. Because li’s are i.i.d. and Xn-1 is Markov process, we can view ( ) as the MGF of (∑ (∑ )

∑ ) , where P, Q are r.v.’s. Thi is a Markov- modulated process. It has -envelope ( ) ([5]). http://www.comsoc.org/~mmc/ 42/44 Vol.9, No.2, March 2014

IEEE COMSOC MMTC E-Letter

In this paper, we applied network calculus to the end- Here we let ( ) ⁄( ) and used to-end delay analysis of streaming video across networks subject to in-network transcoding. We have ( ) as the stability condition. Taking proves the result. Finally, the order of growth of the ε- shown that the delays can be bounded and grow in the quantiles for some 0 < ε < 1 follows directly as ( ). order of number of transcoders. The analysis can be applied to a simple scenario where a data unit from the streaming video is randomly selected, or, can be 4. Evaluation extended to more complicated scenarios of transcoding To evaluate the results, we use the following example if we specialize the scaling for specific transcoding. settings. First, we let the packet sizes be discrete uniformly distributed i.i.d. r.v.’s, l ~ U[a,b]. Then we References ( ) [1] I. Ahmad, X. Wei, Y. Sun, and Y. Zhang, “Video know ( ) . Let a = 1, b = 16 for ( )( ) transcoding: an overview of various techniques and illustration. Clearly, l = 16. Next, we use a Bernoulli research issues,” in IEEE Trans. Multimedia, vol. 7, no. 5, max pp. 793-804, 2005. scaling process X ~ B(p) for all transcoders, so that we [2] P. Thiran, J.-Y. Le Boudec, and F. Worm, “Network know ( ) ( ( )) . Further we calculus applied to optimal multimedia smoothing,” in assume the original arrival traffic of homogeneous data Proc. of INFOCOM, pp. 1474-1483, 2001. units (e.g., bits) to be a Poisson process Poi(λ), while [3] F. Ciucu, J. Schmitt, and H. Wang, “On expressing the service for this traffic at the servers is work- networks with flow transformations in convolution-form,” in Proc. of INFOCOM, pp. 1979-1987, 2011. conserving with constant rate Ci. Further, let λ = 1; the number of transcoders varies from 1 to 9; the capacities [4] J.-Y. Le Boudec and P. Thiran, Network Calculus, Springer Verlag, Lecture Notes in Computer Science, at the rest of the nodes are statically set as C1..C10 = [1.25, 1.15, 1.05, 0.95, 0.85, 0.80, 0.75, 0.70, 0.65, LNCS 2050, 2001. 0.60]; ε is 10-3. We use Omnet++ 4.3 to do simulations. [5] C.-S. Chang, Performance Guarantees in Communication We measure 106 packet delays at the end receiver node. Networks, Springer Verlag, 2000. [6] A. Burchard, J. Liebeherr, and F. Ciucu, “On superlinear scaling of network delays,” in IEEE/ACM Trans. Networking, vol. 19, no. 4, pp. 1043-1056, 2011. [7] H. Wang, F. Ciucu and J. Schmitt, “A leftover service curve approach to analyze demultiplexing in queueing networks,” in Proc. of VALUETOOLS, pp. 168-177, 2012. [8] Y. Jiang, “Stochastic service curve and delay bound analysis: a single node case,” in Proc. of ITC 25, pp. 1-9, 2013. Hao Wang received his B.Sc. degree in Computer Science and

Fig. 5. Delay bounds with Theorem 1 and simulation. Technology from Northeastern University, China and his M.Sc. Figure 5 compares the stochastic end-to-end delay degree in Computer Science bounds obtained with Theorem 1 and simulation. Both from University of curves exhibit the ( ) order of growth. It is also Kaiserslautern, Germany. His illustrated that the delay bounds are increasing in research interests include probability p. That means if the streaming data are performance modeling of longer kept at a node due to transcoding, the burstier distributed computer systems, the load of the next node becomes. Clearly, the bounds especially using stochastic network calculus. from Eq. (3) of Theorem 1 are not tight when compared to simulations. That is because we consider Jens B. Schmitt is professor for the increased latency for each packet after being served Computer Science at the TU Kaiserslautern. Since 2003 he by the packetized server as , while actually most has been the head of the packets have smaller increased latency. To avoid using Distributed Computer Systems lmax we would have to consider the inherent correlation Lab (disco). His research among arrivals, service and packet scaling elements, interests are broadly in which makes it very difficult to analyze the delay performance and security aspects bounds even for a single node without scaling element of networked and distributed (see [8]). systems. He received his Ph.D.

5. Conclusion from TU Darmstadt in 2000. http://www.comsoc.org/~mmc/ 43/44 Vol.9, No.2, March 2014

(b)

IEEE COMSOC MMTC E-Letter

MMTC OFFICERS

CHAIR STEERING COMMITTEE CHAIR

Jianwei Huang Pascal Frossard The Chinese University of Hong Kong EPFL, Switzerland China

VICE CHAIRS

Kai Yang Chonggang Wang Bell Labs, Alcatel-Lucent InterDigital Communications USA USA

Yonggang Wen Luigi Atzori Nanyang Technological University University of Cagliari Singapore Italy

SECRETARY

Liang Zhou Nanjing University of Posts and Telecommunications China

E-LETTER BOARD MEMBERS

Shiwen Mao Director Aburn University USA Guosen Yue Co-Director NEC labs USA Periklis Chatzimisios Co-Director Alexander Technological Educational Institute of Greece Thessaloniki Florin Ciucu Editor TU Berlin Germany Markus Fiedler Editor Blekinge Institute of Technology Sweden Michelle X. Gong Editor Intel Labs USA Cheng-Hsin Hsu Editor National Tsing Hua University Taiwan Zhu Liu Editor AT&T USA Konstantinos Samdanis Editor NEC Labs Germany Joerg Widmer Editor Institute IMDEA Networks Spain Yik Chung Wu Editor The University of Hong Kong Hong Kong Weiyi Zhang Editor AT&T Labs Research USA Yan Zhang Editor Simula Research Laboratory Norway

http://www.comsoc.org/~mmc/ 44/44 Vol.9, No.2, March 2014 Vol.8, No.4, July 2013