IEEE COMSOC MMTC E-Letter

IEEE MULTIMEDIA COMMUNICATIONS TECHNICAL COMMITTEE E-LETTER

Vol. 4, No. 10, November 2009

CONTENTS

Message from Editor-in-Chief
HIGHLIGHT NEWS & INFORMATION
  IEEE MMTC Meeting Agenda
  Call For Nominations: Editor-In-Chief IEEE MMTC E-Letter
  IEEE GLOBECOM 2010 Call for Tutorial Proposals
  IEEE GLOBECOM 2010 Call for Workshop Proposals
SPECIAL ISSUE ON NEW RESEARCH TRENDS
  Toward Next Generation of Multimedia Computing and Networking
    Guest Editor: Guan-Ming Su, Marvell Semiconductor, USA
  Trends in Multimedia Communications Over Mobile Networks
    Hamid Gharavi (IEEE Fellow), National Institute of Standards & Technology
  Searching Music in the Emotion Plane
    Yi-Hsuan Yang and Homer H. Chen (IEEE Fellow), National Taiwan University, Taiwan
  Distributed Optimization for Wireless Visual Sensor Networks
    Yifeng He and Ling Guan (IEEE Fellow), Ryerson University, Canada
  A New Generation of Wireless Multimedia Link-Layer Protocols
    Hayder Radha (IEEE Fellow), Michigan State University, USA
TECHNOLOGY ADVANCES
  Four Suggestions for Research on Multimedia QoE Using Subjective Evaluations
    Greg Cermak, Verizon Labs, USA
  From Cross-Layer Optimization to Cognitive Source Coding for Multimedia Transmission: Adapting Content Formats to the Network
    Simone Milani, University of Padova, Italy
  Focused Technology Advances Series
  Application Layer QoS Provisioning for Wireless Multimedia Networks with Cognitive Radios
    F. Richard Yu, Carleton University, Canada
MMTC COMMUNICATIONS & EVENTS
  Call for Papers of Selected Journal Special Issues
  Call for Papers of Selected Conferences

http://www.comsoc.org/~mmc/ 1/44 Vol.4, No.10, November 2009 IEEE COMSOC MMTC E-Letter

Message from Editor-in-Chief

Welcome to the November Issue of E-Letter!

First, I would like to call your attention to a few notices posted in the Highlight News & Information section. The winter and holiday seasons are coming soon, and it is time for many members to plan their family vacations. Hawaii could be a warm and wonderful place to consider, not to mention that IEEE GLOBECOM 2009 will be held there from Nov. 30 to Dec. 4. Our MMTC meeting will be held during the conference, and all members are invited to attend (please see the message from our MMTC Chair, Dr. Qian Zhang). It is also time to nominate the new Editor-in-Chief for this E-Letter; please send your nomination to Qian by Nov. 15 (see the Call for Nominations).

Next, I would like to thank Dr. Guan-Ming Su (Marvell Semiconductor, USA), who continues his earlier efforts in the September Issue by putting together a second wonderful Special Issue on New Research Trends, with four invited position papers contributed by top scientists in the field. Please check out this special issue, starting from Dr. Su's Guest Editorial.

In the following article, Dr. Greg Cermak (Verizon Labs, USA) gives positions, directions and suggestions on consumer research and subjective quality evaluation for multimedia communication applications, based on his many years of experience in VQEG and other quality standardization efforts. After that, Dr. Simone Milani (University of Padova, Italy) introduces a new term, "Cognitive Source Coding", in his paper. It refers to a cognitive-radio-like technology that receives a description of the network conditions from the lowest layers in the protocol stack and adopts the most appropriate source coding solution from a set of possible choices. It differs from cross-layer design in that the latter jointly tunes the transmission parameters at different layers without changing the structure of the coding architecture involved, while cognitive source coding implies reconfiguring the architecture of the source coder depending on network status. Please read more details in his paper.

In the focused technology column, Dr. F. Richard Yu (Carleton University, Canada) demonstrates an integrated approach to optimizing application layer QoS for wireless multimedia communication networks. In this work, the multimedia intra-refreshing rate is jointly optimized with the access strategy and spectrum sensing for media transmission in a cognitive radio network.

As always, I thank all Editors of the E-Letter and our authors for making this issue successful.

Thank you very much.

Haohong Wang
Editor-in-Chief, MMTC E-Letter

HIGHLIGHT NEWS & INFORMATION

IEEE MMTC Meeting Agenda
Nov. 30 – Dec. 4, 2009, Honolulu, Hawaii

Dear MMTC members,

I am excited that we will soon have another MMTC meeting, at Globecom 2009 in beautiful Hawaii, USA, from Nov. 30 to Dec. 4. I am looking forward to seeing all of you there for our MMTC meeting, which has the following draft agenda.

0. Informal discussion and networking time
1. Welcome new members / introduction
2. Last meeting minutes approval (ICC 2009)
3. MMTC Best Paper Award 2009 winner announcement
   Two papers received this award this year. The authors of the following paper have decided to receive their plaques at Globecom 2009:
   B. Li, S.-S. Xie, G. Y. Keung, J.-C. Liu, I. Stoica, H. Zhang and X.-Y. Zhang, "An Empirical Study of the Coolstreaming System," IEEE Journal on Selected Areas in Communications, Special Issue on Advances in Peer-to-Peer Streaming System, 25(9):1627-1639, December 2007.
4. Report on Conferences activities
   CCNC 2010
   Globecom 2009
   ICC 2010
   Globecom 2010
5. Recent change for ICME
6. MMTC IGs Reports - all IG chairs
7. Sub-committees Report
8. Report for News Letter activity and call for nomination for future EiC
9. Suggestions & discussions - everyone
10. Adjourn

Looking forward to seeing you in Hawaii, USA soon.

Cheers,

Qian Zhang IEEE MMTC Chair

Call For Nominations: Editor-In-Chief IEEE MMTC E-Letter

Thanks to the great efforts of Haohong (current EiC), our MMTC has continued our E-Letter in an impressive way. You can check all the E-Letters at http://www.comsoc.org/~mmc/index.asp.

The term of the current editor-in-chief (EIC) of IEEE MMTC E-Letter is coming to an end by Jan. 2010, and we have set up a nominating committee to assist in selecting the next EIC. The EIC is responsible for maintaining the highest editorial quality, for setting the technical direction of the papers published in E-Letter, and for maintaining a reasonable pipeline of articles for publication.

Nominating committee member list:
Heather Yu, [email protected]
Wenjun Zeng, [email protected]
Madjid Merabti, [email protected]
Rob Fish, [email protected]
Shueng-Han Gary Chan, [email protected]
Haohong Wang, [email protected]
Qian Zhang, [email protected]

I would like to take this opportunity to invite all members to send nominations to any of the committee members. Your help in identifying a capable MMTC member to take our E-Letter to the next level is much appreciated.

Best,
Qian Zhang
IEEE MMTC Chair

IEEE GLOBECOM 2010 Call for Tutorial Proposals

IEEE GLOBECOM 2010 opens all tutorial/lecture sessions to the conference attendees for FREE. We invite submission of tutorial proposals for either 3.5-hour length (speak on Dec. 6th and 10th) or 1.5-hour length (speak on Dec. 7-9) overview presentations on topics of interest of the conference. No more than ONE speaker is recommended for each 1.5-hour lecture session. The proposals will be evaluated using the criteria of importance, timeliness, and conference coverage of the topic, track record of the instructor, and previous history for instructing tutorials. The final decisions on accepting tutorial proposals will also reflect space limitations at the conference venue.

Required information in the proposal (up to 3 pages):
• Title and Abstract of the lecture
• Full contact information of speakers
• Detailed outline of topics covered
• Biography of speakers
• Preferred length of the lecture
• History of the tutorial presentation

Important Dates:
• Submission: 15 December 2009
• Decision Notification: 15 January 2010

All proposals must be submitted in PDF format via EDAS, which requires registering the proposal first with its title, keywords, authors' names, and an abstract. Only complete proposals with all required information will be considered. Once a proposal is accepted, we will work with the speaker(s) on a contract, which defines the remuneration, copyright, cancellation policy and so on. Please address all questions regarding tutorials to the GLOBECOM 2010 TPC Vice-Chair:

Dr. Khaled El-Maleh

IEEE GLOBECOM 2010 Call for Workshop Proposals

The IEEE GLOBECOM 2010 features advanced workshops on December 6th and 10th, 2010 to explore special topics and provide international forums for scientists, engineers, and users to exchange and share their experiences, new ideas, and research results. The proceedings of the workshop program will be published by IEEE Communications Society and IEEE Digital Library.

Topics covered in workshops will include, but are NOT limited to:
• Cloud Computing and Communications
• Wireless Networking
• Cognitive communications and networks
• Internet Quality of Service
• Smart Grids
• Security of Communication Networks
• Next Gen communications & networks
• Multimedia Communications & Services
• Social networking
• Wearable and Pervasive Computing
• Satellite and space communications
• Ubiquitous and Intelligent Services
• Vehicular communications and networks
• Distributed and Mobile Computing
• Service-oriented Internet
• Internet Architectures & Services
• Broadband Communications
• Grid and P2P Computing
• Mobile and Ad Hoc Networks
• Cyber-physical computing

Required information in the proposal (up to 4 pages):
• Title of the workshop
• Expected number of paper submissions
• Workshop scope and dates
• Draft Call for paper of the workshop
• Full contact of workshop organizer(s)
• Tentative list of TPC members
• Track record of workshop organizer(s)

Important Dates:
• Submission: 15 December 2009
• Decision Notification: 15 January 2010

Early-bird proposals (submitted before November 15, 2009) are highly encouraged; their proposers will be notified of the decision within a month of receipt of the proposal. For each approved workshop, at least one organizer must commit to registering and monitoring the onsite workshop operation. All proposals, as well as any questions, should be sent to the GLOBECOM 2010 TPC Vice-Chair:

Prof. Xiaobo Zhou < [email protected]>

SPECIAL ISSUE ON NEW RESEARCH TRENDS

Toward Next Generation of Multimedia Computing and Networking Guest Editor: Guan-Ming Su, Marvell Semiconductor, USA [email protected]

With the rapid growth of computation power and communication systems, multimedia computing and networking have become vibrantly attractive, since the underlying foundations provide new capabilities to express natural human presentation. However, multimedia exhibits characteristics dramatically different from those of traditional data in many respects, including the presentation form/timing, computing/coding elements, and transmission mechanism. To provide satisfactory quality of experience for multimedia service, we face more difficult and diverse challenges than we ever faced in the data domain. Continuing our special issue of this September, in this issue we invite top-notch researchers to analyze the new research trends in multimedia computing and networking and to provide their valuable suggestions.

The first article, "Trends in multimedia communications over mobile networks" by Hamid Gharavi, discusses the challenges and corresponding solutions in transmitting multimedia over mobile networks. More specifically, the author starts by examining the impact of channel fading, co-channel interference, and packet loss on video quality. Then, the author presents recent research trends, such as MIMO technologies and error control coding methods in the OSI protocol stack (including cyclic redundancy check, forward error correction, and unequal error protection), to resolve the aforementioned problems. The author also addresses the research potential of network coding in unreliable multi-hop environments. Finally, the need to consider the tradeoff and cross-layer optimization among overhead, latency, and performance for multimedia over mobile networks is stated.

Multimedia retrieval has become an important topic owing to the fast growing amount of multimedia content. The second article, "Searching music in the emotion plane" by Yi-Hsuan Yang and Homer Chen, introduces a new music search method based on the emotion plane. Unlike the traditional method of categorizing emotion into discrete classes, the authors propose to use a 2-D real plane to represent emotion. Further applications based on this approach are addressed. The authors also indicate the open issues and future research directions in music emotion recognition.

The wireless visual sensor network has become an important research topic owing to its wide range of new applications. The third article, "Distributed optimization for wireless visual sensor networks" by Yifeng He and Ling Guan, gives an overview of the design considerations and tradeoffs (such as power, lifetime, video quality, and time-varying channel) in such a network, and formulates the whole network as an optimization problem. Since each node only knows its neighbors' information, the authors suggest that a distributed optimization framework is a more efficient and effective solution.

As the level of heterogeneity and the high-bandwidth requirements of multimedia applications increase radically, the widely adopted link-layer protocols, mainly ARQ, cannot provide satisfactory quality of experience to the end user. The fourth article, "A new generation of wireless multimedia link-layer protocols" by Hayder Radha, highlights the importance of achieving both reliability and stability in the wireless link layer for both real-time and delay-insensitive applications. The author first analyzes the disadvantages of existing ARQ-based protocols and reviews the solutions to overcome the shortcomings. Based on the discussion, the author presents the framework of a next generation wireless link layer protocol to satisfy both the reliability and stability requirements.

As illustrated in this special issue, research in multimedia computing and networking has gained significant interest. There are more new challenges that will require ground-breaking solutions for those emerging applications. We would like to thank all the authors for their contributions and hope these articles can stimulate further research in the area of multimedia computing and networking.

Guan-Ming Su received the B.S.E. degree in electrical engineering from National Taiwan University, Taipei, Taiwan, in 1996 and the M.S. and Ph.D. degrees in electrical engineering from the University of Maryland, College Park, USA, in 2001 and 2006, respectively.

He was with the R&D Department, Qualcomm, Inc., San Diego, CA, during the summer of 2005, and with ESS Technology, Fremont, CA, in 2006. He is currently with the video R&D department at Marvell Semiconductor, Inc., Santa Clara, CA. His research interests are multimedia communications and multimedia signal processing.

Dr. Su is an associate editor of Journal of Communications and a guest editor of the Journal of Communications special issue on Multimedia Communications, Networking, and Applications. He serves as the Publicity Co-Chair of IEEE GLOBECOM 2010.

Trends in Multimedia Communications Over Mobile Networks
Hamid Gharavi (IEEE Fellow), National Institute of Standards & Technology
[email protected]

Increasing demand for high quality multimedia services has been a driving force in the technological evolution of high bandwidth wireless/mobile communications systems and standards. The most challenging aspect is the support of higher quality video, which requires a higher bandwidth. Although the operational third generation (3G) wireless systems may be capable of handling low data rate video [1], [2], the problem is that mobile cellular technology is not ready to offer reliable real-time video services. In addition, as the proliferation of mobile video accelerates, the next generation wireless communication systems must aim at providing higher per-user data rate services to support higher quality real-time audio/video services, especially as new applications, such as ubiquitous online journalism and live citizen reporting, are emerging.

It is important to note that in mobile environments a higher per-user bandwidth does not necessarily guarantee higher video quality at reception. Thus, the major technical challenge will be to cope with frequency-selective fading due to the use of larger bandwidths. As far as transmission at the physical layer is concerned, schemes such as OFDM (Orthogonal Frequency Division Multiplexing) and MC-CDMA (Multi-Carrier Code Division Multiple Access), which are capable of providing frequency diversity, can indeed enhance robustness to frequency-selective fading.

In addition, recent technological breakthroughs in Multiple Input Multiple Output (MIMO) techniques [3], [4], i.e. providing space-time diversity, have already made a significant impact on transmission over mobile channels in terms of reliability and throughput. Needless to say, a deep fade in all the wireless channels may still erase some of the information and consequently, the use of error control coding, particularly for real-time multimedia, remains a major research topic in mobile communications.

Error control coding methods, such as error detection and error correction, have traditionally been applied at the physical layer in most digital cellular communication systems (also referred to as channel coding). In the case of the IEEE 802 standard for Wireless Local Area Networks (WLAN), for instance, the physical and link layers are responsible for handling error control coding for IP packets. This includes a 16-bit CRC (cyclic redundancy check) error detection field at the physical layer and packet retransmissions via the MAC (Medium Access Control) sub-layer.

However, in a packet based wireless network environment where communications may need to be integrated into the framework of the OSI (Open Systems Interconnection) model, the error control coding strategy is a challenging issue. This is mainly a consequence of the layering structure in the OSI protocol stack, where the protection against packet loss is handled separately and independently for each layer. For instance, in addition to the link and physical layers, error control coding in the form of Forward Error Correction (FEC) is also applied at the application layer. For video transmission particularly, a combination of multi-layer coding and FEC with a differing level of protection for each layer (also known as unequal error protection, UEP) has been an effective approach for transmitting video over multipath fading mobile channels [5], [6].

Nonetheless, when the channel condition is unknown and there are multiple receivers, the use of fixed-rate error control coding could be wasteful and unreliable. Recently, a new generation of rateless codes, such as Raptor codes [7], has been considered for file download in the Digital Video Broadcast for Handheld (DVB-H) standard [8]. In the Raptor code, which is a category of Fountain code, the encoder can generate as many encoded symbols as needed from a block of data on the fly. Raptor codes have the advantage over traditional fixed-rate erasure codes, such as Reed-Solomon codes, in their ability to manage the overhead when channel conditions are unknown.

The flexibility and reliability of the Raptor code for UEP have been studied by a number of researchers in recent years [9], [10]. IETF RFC 5053 [11] defines procedures for generating the Raptor FEC and its application for the reliable delivery of data objects. In addition, for Digital Video Broadcast, two-layer protection for real-time multimedia data has recently been proposed, where the first layer (base layer) is protected by the 1-D interleaved parity code, and the enhancement layer is produced by the Raptor code [12]. It should be noted that in this AL-FEC approach, the source packets are carried in separate RTP (Real-time Transport Protocol) streams. The next layer in the protocol stack is the transport layer. Between the two popular transport protocols, TCP and UDP (User Datagram Protocol), UDP is the preferred protocol. Nonetheless, this protocol, unlike the connection-oriented TCP, is a best-effort transport protocol. Since UDP cannot provide reliable packet transmission, additional error control coding may be needed to prevent any significant loss of video quality.

Indeed, further coding can be accomplished at the next lower layer in the protocol stack, which is the network layer. This layer is responsible for routing RTP/UDP/IP packets to their destinations. In the case of mobile ad-hoc networks (MANET), these autonomous networks are not quite capable of reliably distributing RTP/UDP/IP packets [13], [14]. One major obstacle is the dynamically changing network topology. This manifests itself in frequent route changes, consequently causing a potentially long delay [15]. Co-channel interference from other users is another important factor that is problematic and can severely impact the end-to-end throughput performance. While mitigating the effect of interference continues to be an active research topic, a new approach known as network coding has recently emerged [15]. Its concept is based on performing coding operations on packets in the interior of the network, rather than having intermediate nodes (routers) just receive and forward them. By intelligently mixing packets in multicast routing it is possible to enhance the network throughput performance. Although network coding appears to work well for wired networks, the debate on its suitability for real-time multimedia services in unreliable multihop environments is now becoming a hot research topic.

Finally, it is worth noting that in contrast to fixed wireless communications, existing mobile networks are still incapable of supporting reliable, interactive, and high quality video services. This is mainly due to a number of factors, such as increasing demands for live video transmissions, bandwidth limitations, co-channel interference, and ever-changing channel conditions. Developing methods such as space-time diversity for cooperative transmission in multihop networks is becoming the most active research topic in combating multipath fading [16], [17]. In addition, in such environments error control coding techniques still continue to play a major role in supporting high quality multimedia services for the next generation of wireless/mobile networks. As far as their deployment is concerned, the major problem is that they are separately applied within each OSI layer without any attention to the tradeoffs between overhead, latency and performance. This will involve the issue of cross-layer optimization in order to maximize their efficiency in accordance with service quality requirements.

References

1. H. Gharavi and S. M. Alamouti, "Multipriority Video Transmission for Third Generation Wireless Communication Systems," Proceedings of the IEEE, vol. 87, no. 10, pp. 1751-1763, October 1999.
2. L. Hanzo, P. Cherriman, and J. Streit, Video Compression and Communications: H.261, H.263, H.264, MPEG4 and Proprietary Codecs as well as HSDPA-Style Adaptive Turbo-Transceivers, John Wiley and IEEE Press, September 2007.
3. S. M. Alamouti, "A simple transmit diversity technique for wireless communications," IEEE Journal on Selected Areas in Communications, vol. 16, no. 8, pp. 1451-1458, 1998.
4. V. Tarokh, H. Jafarkhani, and A. R. Calderbank, "Space-time block codes from orthogonal designs," IEEE Transactions on Information Theory, vol. 45, no. 5, pp. 1456-1467, 1999.
5. R. Stedman, H. Gharavi, L. Hanzo, and R. Steele, "Transmission of Coded Images via Mobile Channels," IEEE Transactions on Circuits and Systems for Video Technology (CSVT), vol. 3, no. 1, pp. 15-26, February 1993.
6. H. Gharavi, "Pilot Assisted 16-QAM for Video Communication," IEEE Transactions on CSVT, vol. 12, no. 2, pp. 77-89, February 2002.
7. A. Shokrollahi, "Raptor Codes," IEEE Trans. Inf. Theory, vol. 52, no. 6, pp. 2551-2567, June 2006.
8. ETSI TS 102 472 V1.2.1, "Digital Video Broadcasting (DVB); IP Datacast over DVB-H: Content Delivery Protocols," Dec. 2006.
9. N. Rahnavard, B. N. Vellambi, and F. Fekri, "Rateless codes with unequal error protection property," IEEE Trans. Inf. Theory, vol. 53, no. 4, pp. 1521-1532, April 2007.
10. D. Sejdinovic, D. Vukobratovic, A. Doufexi, V. Senk, and R. Piechocki, "Expanding window fountain codes for unequal error protection," Proc. 41st Asilomar Conf., Pacific Grove, pp. 1020-1024, 2007.
11. M. Luby, A. Shokrollahi, M. Watson, and T. Stockhammer, "Raptor Forward Error Correction Scheme for Object Delivery," IETF RFC 5053, October 2007. http://www.rfc-editor.org/rfc/rfc5053.txt
12. A. Begen and T. Stockhammer, "DVB Application-Layer Hybrid FEC Protection," draft-ietf-fecframe-dvb-al-fec-02, August 11, 2009. http://tools.ietf.org/search/draft-ietf-fecframe-dvb-al-fec-02
13. H. Gharavi and K. Ban, "Multihop Sensor Network Design for Wideband Communications," Proceedings of the IEEE, vol. 91, no. 8, pp. 1221-1234, August 2003.
14. H. Gharavi, "Control Based Mobile Ad-hoc Networks For Video Communications," IEEE Transactions on Consumer Electronics, vol. 52, no. 2, pp. 383-391, May 2006.
15. R. Ahlswede, N. Cai, S. Li, and R. Yeung, "Network information flow," IEEE Transactions on Information Theory, vol. 46, pp. 1204-1216, July 2000.
16. S. Katti, H. Rahul, W. Hu, D. Katabi, M. Médard, and J. Crowcroft, "XORs in the air: practical wireless network coding," in Proc. ACM SIGCOMM'06, Pisa, Italy, Sep. 11-15, 2006, pp. 243-254.
17. C. Fragouli, J. Widmer, and J.-Y. L. Boudec, "A network coding approach to energy efficient broadcasting: from theory to practice," in Proceedings of IEEE INFOCOM, Apr. 2006, pp. 1-11.

Hamid Gharavi received the Ph.D. degree from Loughborough University, Loughborough, U.K., in 1980. He joined AT&T Bell Laboratories, Holmdel, in 1982. He was then transferred to Bell Communications Research (Bellcore) after the AT&T-Bell divestiture, where he became a Consultant on video technology and a Distinguished Member of Research Staff. In 1993, he joined Loughborough University as Professor and Chair of Communication Engineering. Since September 1998, he has been with the National Institute of Standards and Technology (NIST), US Department of Commerce, Gaithersburg, MD.

Dr Gharavi was a core member of the Study Group XV (Specialist Group on Coding for Visual Telephony) of the International Communications Standardization Body CCITT (ITU-T). He was selected as one of the six university academics to be appointed to the U.K. Government's Technology Foresight Panel in Communications to consider the future through 2015 and make recommendations for allocation of key research funds. His research interests include video/image transmission, wireless multimedia, mobile communications and third generation wireless systems, and mobile ad-hoc networks. He holds eight U.S. patents related to these topics.

Dr Gharavi received the Charles Babbage Premium Award from the Institute of Electronics and Radio Engineering in 1986, and the IEEE CAS Society Darlington Best Paper Award in 1989. He has been a Distinguished Lecturer of the IEEE Communication Society. In 1992 Dr Gharavi was elected a Fellow of IEEE for his contributions to low bit-rate video coding and research in subband coding for image and video applications. He has been a Guest Editor for a number of special issues. Dr Gharavi served as a member of the Editorial Board of the PROCEEDINGS OF THE IEEE from January 2003 to December 2008. He is currently a member of the Editorial Board of IET Image Processing. He served as an Associate Editor for the IEEE Transactions on CAS for Video Technology (CSVT) from 1996 to 2006. He then became the Deputy Editor-in-Chief of this Transactions through December 31, 2009. Dr Gharavi was recently appointed to serve as the new Editor-in-Chief for the IEEE Transactions on CSVT.
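
As a small illustration of the network coding idea raised in the article above (intermediate nodes mix packets instead of only forwarding them), the following Python sketch shows the classic two-way relay exchange with XOR coding. It is a toy model under stated assumptions, not the scheme of any cited paper: the function names `xor_packets` and `relay_exchange` are hypothetical, packets are assumed to have equal length, and the channel is assumed loss-free. With plain store-and-forward the exchange needs four transmissions (A to relay, relay to B, B to relay, relay to A); with XOR coding the relay broadcasts one mixed packet, so three suffice.

```python
from typing import Tuple


def xor_packets(a: bytes, b: bytes) -> bytes:
    """Return the bitwise XOR of two equal-length packets."""
    if len(a) != len(b):
        # Assumption of this sketch: callers pad packets to a common length.
        raise ValueError("packets must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def relay_exchange(pkt_a: bytes, pkt_b: bytes) -> Tuple[bytes, bytes, int]:
    """Simulate nodes A and B swapping packets through a relay.

    Returns (packet recovered at A, packet recovered at B, transmissions used).
    """
    # Transmissions 1 and 2: A and B each send their packet to the relay.
    # Transmission 3: the relay broadcasts a single XOR-coded packet.
    coded = xor_packets(pkt_a, pkt_b)
    # Each end node removes its own packet to recover the other one.
    recovered_at_a = xor_packets(coded, pkt_a)
    recovered_at_b = xor_packets(coded, pkt_b)
    return recovered_at_a, recovered_at_b, 3


if __name__ == "__main__":
    at_a, at_b, used = relay_exchange(b"video-A1", b"video-B7")
    print(at_a, at_b, used)
```

The saving comes from the broadcast nature of the wireless medium: one coded transmission serves both receivers. On a lossy channel the coded packet is only useful to a node that still holds its own original packet, which is one reason the suitability of network coding for unreliable multihop links remains open, as the article notes.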

Searching Music in the Emotion Plane
Yi-Hsuan Yang and Homer H. Chen (IEEE Fellow), National Taiwan University, Taiwan
[email protected], [email protected]

There have been tremendous efforts and significant progress in providing media streaming over the Internet given its great potential. Music plays an important role in human history, even more so in the digital age. Never before has such a large collection of music been created and accessed daily by people. Because almost all music is created to convey emotion, music organization and retrieval by emotion is a meaningful way of accessing music information. The proliferation of tiny mobile devices and the like also calls for content-based retrieval of music through a small display space.

Music emotion recognition (MER) aims at recognizing the affective content of music signals. A typical approach is to categorize emotions into a number of classes (e.g., happy, angry, sad and relaxing) and apply machine learning techniques to train a classifier [1]–[3]. This approach, though widely adopted, faces the granularity issue in practice, because classifying emotions into only a handful of classes cannot meet the user demand for effective information access. Using a finer granularity for emotion description does not necessarily address the issue, since language is inherently ambiguous and the description of the same emotion varies from person to person.

Instead, we propose to view emotions from a dimensional perspective and define emotions in a 2-D plane in terms of arousal (how exciting or calming) and valence (how positive or negative), the two emotion dimensions found to be most fundamental by cognitive study [4]. In this way, MER becomes the prediction of the arousal and valence (AV) values of a song, corresponding to a point in the emotion plane [5]–[8]. The granularity and ambiguity issues associated with emotion classes no longer exist since no categorical classes are needed. Moreover, because the 2-D emotion plane provides a simple means for user interface, novel emotion-based music organization, browsing, and retrieval can be easily created for mobile devices.

1. Emotion-Based Retrieval

The advantages of the emotion-based approach are that each music sample can be represented as a point in the emotion plane and that the similarity between music samples can be measured by Euclidean distance. As shown in Fig. 1, a user can retrieve music of a certain emotion by simply specifying a point in the emotion plane. The system then returns the music samples whose AV values are close to the point. A user can also generate an emotion-based playlist by drawing a trajectory in the emotion plane. This way, songs of various emotions corresponding to different points on the trajectory are added to the playlist and played back in order.

One can also couple other musical metadata such as artist name, genre, or lyrics with emotion to narrow down the search range. For example, one can specify an artist, and the system would display all songs of the artist in the emotion plane. It is also possible to play back music that matches the user's mood as detected from physiological, prosodic, or facial cues [9]. This retrieval paradigm is functionally powerful since people's criterion is often related to their emotional state at the moment of music selection [10].

Fig. 1. With emotion-based music retrieval, a user can retrieve music of certain emotions by specifying a point or drawing a trajectory in the 2-D emotion plane [5], [6]

2. Emotion Recognition

MER can be formulated as a regression problem [5] by viewing arousal and valence as real values in [-1, 1]. Then a regression model can be trained to predict the AV values. More specifically, given N inputs (x_i, y_i), 1 ≤ i ≤ N, where x_i is a feature vector of the i-th input sample and y_i is the real value to be predicted, a regression model (regressor) R(·) is created by minimizing the mismatch (i.e., mean squared difference) between the predicted and the ground truth values. Many good regression algorithms, such as support vector regression (SVR) or Gaussian process regression [11], are readily available. In [5], two SVR models are trained for arousal and valence respectively. A schematic diagram of this MER system is shown in Fig. 2.

Usually, timbral, rhythmic, melodic, and harmonic features of music are extracted to represent the acoustic property of a song. Because of its ability to model auditory sensation based on psychoacoustic models, the computer program PsySound [12] is often employed for feature extraction. The use of mid-level features such as chord progression or genre metadata has also been explored [13], [14]. Many features such as loudness (loud/soft), tempo (fast/slow), and pitch (high/low) have been found relevant to arousal, but only a few features are relevant to valence. Thus, valence recognition is more challenging than arousal recognition.

Typically a subjective test is conducted to collect the ground truth needed for model training. The subjects are asked to annotate the music pieces by rating their emotion perception of the music pieces using either the standard ordinal rating scale or the graphic rating scale [5], [15]. Because emotion perception is subjective, each music piece is annotated by multiple subjects and the ground truth is set to the average rating.

Fig. 2. The schematic diagram of a MER system [5]

3.

Emotion perception is intrinsically under the influence of many factors such as cultural background, generation, sex, and personality. Developing a general retrieval model that performs equally well for everyone is a challenging task. This can be explained via Fig. 3, where each circle corresponds to the annotation of a song in the emotion plane by a subject. Obviously, simply assigning one emotion value to each song in a deterministic manner does not work well in practice because emotion perception varies greatly from person to person.

The subjectivity issue can be addressed by personalizing the MER system [7], [15]. We can ask a user to annotate a small number of songs and use the annotations to train a personalized model. A two-stage personalization scheme is proposed in [7]. Two models are trained: one for predicting the general perception of a song, and the other for predicting the difference between the general perception and a user's individual perception. This is a simple personalization process because the music content and the individuality of the user are treated separately. To make it more sophisticated, one can take into account the demographic property, music preference, or listening context of the user in the process.

3.2 Difficulty of Emotion Annotation

The emotion annotation process of MER requires the subjects to rate the emotion in a continuum. But it has been found that such rating imposes a heavy cognitive load on the subjects [8]. In addition, it is difficult to ensure a consistent rating scale between and within the subjects [16]. As a result, the quality of the ground truth varies, which in turn degrades the accuracy of MER.

To address this issue, ranking-based emotion annotation is proposed [8]. A subject is asked to compare the affective content of two songs and determine, for example, which song has a higher arousal value, instead of giving the exact emotion values. The rankings of music emotion are then converted to numerical values by a greedy algorithm [17]. Empirical evaluation shows that this scheme relieves the burden of emotion
Challenges annotation on the subjects and enhances the quality of the ground truth. It is also possible to As MER is still in its infancy, there are many use an online game to harness the so-called open issues. Some major issues and proposed human computation and make the annotation solutions are discussed in this section. process more engaging [18]. 3.1 Subjectivity of Emotion Perception http://www.comsoc.org/~mmc/ 14/44 Vol.4, No.10, November 2009 IEEE COMSOC MMTC E-Letter Acknowledgments This work was supported by the National Science Council of Taiwan under the contract number NSC 97-2221-E-002-111-MY3.

References Fig. 3. Emotion annotations in the emotion plane [1] T. Li and M. Ogihara, “Detecting emotion in for four songs: (a) Smells like teen spirit by music,” in Proc. ISMIR, 2003. Nirvana, (b) A whole new world by Peabo Bryson and Regina Belle, (c) The rose by Janis [2] L. Lu et al, “Automatic mood detection and Joplin, and (d) Tell Laura I love her by Ritchie tracking of music audio signals,” IEEE Valens. Each circle corresponds to the annotation Trans. Audio, Speech and Language of a song by a subject [7] Processing, vol. 14, no. 1, pp. 5–18, 2006. [3] X. Hu et al, “The 2007 MIREX audio mood classification task: Lessons learned,” in 3.3 Semantic Gap Between Audio Signal and Proc. ISMIR, pp. 462–467, 2008. Human Perception [4] R. E. Thayer, The Biopsychology of Mood The viability of an MER system largely lies in and Arousal. New York, Oxford University the accuracy of emotion recognition. However, Press, 1989. due to the semantic gap between the object [5] Y.-H. Yang et al, “A regression approach to feature level and the human cognitive level of music emotion recognition,” IEEE Trans. emotion perception, it is difficult to accurately Audio, Speech and Language Processing, compute the emotion values, especially the vol. 16, no. 2, pp. 448–457, 2008. valence values. What intrinsic element of music, [6] Y.-H. Yang et al, “Mr. Emo: Music retrieval if any, causes a listener to create a specific in the emotion plane,” in Proc. ACM emotional response is still far from well- Multimedia, pp. 1003–1004, 2008. understood. While mid-level audio features such as chord, rhythmic patterns, and instrumentation [7] Y.-H. Yang et al, “Personalized music carry more semantic information, robust emotion retrieval,” in Proc. ACM SIGIR, pp. techniques for extracting such features need to 748–749, 2009. be developed. [8] Y.-H. Yang and H. H. Chen, “Music Available data for MER are not limited to the emotion ranking,” in Proc. ICASSP, pp. raw audio signal. 
Complementary to music 1657–1660, 2009. signal, lyrics are semantically rich and have [9] T.-L. Wu et al, “Interactive content profound impact on human perception of music presenter based on expressed emotion and [19]. It is often easy for us to tell from the lyrics physiological feedback,” in Proc. ACM whether a song expresses sadness or happiness. Multimedia, pp. 1009–1010, 2008. Incorporating lyrics to MER is feasible because [10] P. N. Juslin and J. A. Sloboda, Music and most popular songs sold in the market come with Emotion: Theory and Research. Oxford: lyrics [20]. One can analyze lyrics using natural Oxford University Press, 2001. language processing to generate textual feature descriptions of music. It has been shown that [11] A. Sen and M. Srivastava, Regression using lyrics indeed improves valence recognition Analysis: Theory, Methods, and [21], [22]. Applications. New York, Springer, 1990. [12] D. Cabrera, “PSYSOUND: A computer 4. Conclusion program for psycho-acoustical analysis,” in The past decade has witnessed a growing interest Proc. Australian Acoustic Society Conf., pp. in analyzing the affective content of music. In 47–54, 1999. http://psysound.wikidot.com/. this article, we have described a new music [13] H.-T. Cheng et al, “Automatic chord retrieval paradigm that allows users to search recognition for music classification and music in the emotion plane. It opens up a new retrieval,” in Proc. ICME, pp. 1505–1508, playground for advanced research on music 2008. emotion recognition and understanding. http://www.comsoc.org/~mmc/ 15/44 Vol.4, No.10, November 2009 IEEE COMSOC MMTC E-Letter

[14] Y.-C. Lin et al, “Exploiting genre for music National Taiwan University. His research emotion classification,” in Proc. ICME, pp. interests include multimedia information 618–621, 2009. retrieval and analysis, machine learning, and [15] Y.-H. Yang et al, “Music emotion affective computing. He has published over 20 recognition: The role of individuality,” in technical papers in the above areas. Proc. ACM Int. Workshop on Human- Mr. Yang is a Microsoft Research Asia Centered Multimedia, pp. 13–21, 2007. Fellowship recipient 2008–2009. [16] S. Ovadia, “Ratings and rankings: Reconsidering the structure of values and their measurement,” Int. J. Social Research Methodology, vol. 7, no. 5, pp. 403–414, 2004. [17] W. W. Cohen et al, “Learning to order things,” J. Artificial Intelligence Research, vol. 10, pp. 243–270, 1999. [18] Y. E. Kim et al, “Moodswings: A collaborative game for music mood label collection,” in Proc. ISMIR, 2008. [19] S. Omar Ali et al, “Songs and emotions: Are lyrics and melodies equal partners,” Psychology of Music, vol. 34, no. 4, pp. 511–534, 2006. [20] J. Fornäs, “The words of music,” Popular Homer H. Chen (S’83-M’86-SM’01-F’03) Music and Society, vol. 26, no. 1, pp. 37–53, received the Ph.D. degree in Electrical and 2003. Computer Engineering from University of Illinois at Urbana- Champaign, Urbana. [21] Y.-H. Yang et al, “Toward multi-modal music emotion classification,” in Proc. Since August 2003, he has been with the PCM, pp. 70–79, 2008. College of Electrical Engineering and Computer Science, National Taiwan University, Taiwan, [22] C. Laurier et al, “Multimodal music mood R.O.C., where he is Irving T. Ho Chair classification using audio and lyrics,” in Professor. Prior to that, he held various R&D Proc. ICMLA, pp. 1–6, 2008. management and engineering positions with US companies over a period of 17 years, including AT&T Bell Labs, Rockwell Science Center, iVast, and Digital Island. 
He was a US delegate for ISO and ITU standards committees and contributed to the development of many new interactive multimedia technologies that are now part of the MPEG-4 and JPEG-2000 standards. His professional interests lie in the broad area of multimedia signal processing and communications.

Dr. Chen is an Associate Editor of IEEE Transactions on Circuits and Systems for Video Technology. He served as Associate Editor of IEEE Transactions on Image Processing from 1992 to 1994, Guest Editor of IEEE Transactions on Circuits and Yi-Hsuan Yang received the B.S degree in Systems for Video Technology in 1999, and Electrical Engineering from National Taiwan an Associate Editorial of Pattern University, Taiwan, in 2006. He is currently Recognition from 1989 to 1999. working toward the Ph.D. degree in the Graduate Institute of Communication Engineering, http://www.comsoc.org/~mmc/ 16/44 Vol.4, No.10, November 2009 IEEE COMSOC MMTC E-Letter
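
As an illustration of the retrieval scheme described in Section 1 of the article above (Emotion-Based Retrieval), the following is a minimal sketch of point queries and trajectory playlists in the emotion plane. The song titles and their AV values are hypothetical, and the rule of picking the nearest not-yet-used song for each trajectory point is an assumption made for the example, not necessarily the scheme used in [5], [6].

```python
import math

# Hypothetical catalogue: title -> (valence, arousal), each value in [-1, 1].
SONGS = {
    "song_a": (0.8, 0.7),
    "song_b": (-0.6, 0.8),
    "song_c": (-0.7, -0.5),
    "song_d": (0.6, -0.6),
    "song_e": (0.1, 0.0),
}

def nearest_songs(point, songs, k=1, exclude=()):
    """Return the k titles whose AV values are closest (Euclidean) to `point`."""
    candidates = [title for title in songs if title not in exclude]
    candidates.sort(key=lambda title: math.dist(point, songs[title]))
    return candidates[:k]

def playlist_from_trajectory(trajectory, songs):
    """For each point on the trajectory, add the nearest song not yet in the playlist."""
    playlist = []
    for point in trajectory:
        playlist.extend(nearest_songs(point, songs, k=1, exclude=playlist))
    return playlist

# Point query near the "positive, exciting" corner of the plane.
print(nearest_songs((0.9, 0.9), SONGS))
# Trajectory drawn from the "negative, calm" corner to the "positive, exciting" corner.
print(playlist_from_trajectory([(-0.8, -0.8), (0.0, 0.0), (0.8, 0.8)], SONGS))
```

Metadata filters such as artist or genre could be applied by restricting the `songs` dictionary before the query.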

Distributed Optimization for Wireless Visual Sensor Networks
Yifeng He and Ling Guan (IEEE Fellow), Ryerson University, Canada

A wireless visual sensor network (WVSN) consists of geographically distributed video sensors that communicate with each other over wireless channels. Different from conventional wireless sensor networks, each video sensor in the WVSN has a camera component to capture the video and a processing component to compress the video. WVSNs have a wide range of applications, such as video surveillance, emergency response, environmental tracking, and health monitoring [1].

Sensor nodes are typically battery powered, and battery replacement is infrequent or even impossible in many sensing applications. Hence, much research in WVSNs has been focused on maximization of a utility function (e.g., the network lifetime, the video quality) by optimizing the power allocation at each sensor [2].

In a mesh-based WVSN, each video sensor transmits the compressed video stream via the relays of other sensors to a sink for further analysis and decision making. The total power dissipation at a video sensor mainly consists of the encoding power consumption, the transmission power consumption and the reception power consumption. The encoding power consumption takes a major part of the total power consumption [2]. Based on the power-rate-distortion (P-R-D) analytical model [1], the video sensor can either increase the encoding power or increase the source rate to achieve an encoding distortion requirement. However, increasing the encoding power raises the power consumption of the source node. On the other hand, increasing the source rate causes the downstream nodes to consume more power in relaying the traffic. How to allocate the power at each sensor depends on the pre-defined utility function. For example, in order to maximize the network lifetime, which is defined as the minimum node lifetime, the video sensor far away from the sink should encode the video at a lower source rate with a higher encoding power, and the video sensor close to the sink should encode the video at a higher source rate with a lower encoding power. Such allocation enables each sensor to use up its energy almost at the same time, thus prolonging the network lifetime to its maximum.

In a WVSN, each node only knows about its neighbors, and does not have global knowledge. Therefore, a centralized algorithm is not appropriate for WVSNs. Instead, distributed algorithms require only local message exchange, matching well with the distributed nature of WVSN. Distributed optimization provides efficient solutions for many resource allocation problems in WVSNs. For example, the network lifetime maximization for a WVSN has been studied in [2], in which the problem is formulated to maximize the network lifetime by jointly optimizing the source rate and the encoding power at each video sensor, and the link rates for each session, subject to the constraint of flow conservation and the requirement of the collected video quality. The formulation is a convex optimization problem [3]. Therefore, a distributed solution can be developed using the properties of Lagrangian duality [4].

The formulation of the optimization problem in a WVSN is time-varying due to the channel dynamics and the content dynamics. The channel dynamics are caused by channel fading or interference. The content dynamics mean that the P-R-D characteristics are different for different segments of the video [5]. For example, some segments may contain many object motions and require a larger number of bits to encode, while others may contain only static scenes which require relatively fewer bits to encode. To deal with the channel dynamics and content dynamics, each video sensor needs to adaptively learn the channel model and the P-R-D model by using the channel statistics and source statistics collected in the recent past. Based on the estimated models, a new optimization problem is formulated. Each video sensor performs distributed optimization, and adjusts its outputs (e.g., bit rate, power) toward the optimal solution to the new formulation.

In summary, a wireless visual sensor network is a distributed system, in which each sensor node has only a local view. Therefore, distributed optimization can provide an efficient solution to the resource allocation problem in a WVSN.

REFERENCES:

[1] Z. He and D. Wu, “Resource allocation and performance analysis of wireless video sensors,” IEEE Trans. Circuits Syst. Video Technol., vol. 16, no. 5, pp. 590–599, May 2006.
[2] Y. He, I. Lee, and L. Guan, “Distributed algorithms for network lifetime maximization in wireless visual sensor networks,” IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 5, pp. 704–718, May 2009.
[3] S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge, U.K.: Cambridge Univ. Press, 2004.
[4] M. Chiang, S. H. Low, A. R. Calderbank, and J. C. Doyle, “Layering as optimization decomposition: A mathematical theory of network architectures,” Proc. IEEE, vol. 95, no. 1, pp. 255–312, Jan. 2007.
[5] Z. He, W. Cheng, and X. Chen, “Energy Minimization of Portable Video Communication Devices Based on Power-Rate-Distortion Optimization,” IEEE Trans. Circuits Syst. Video Technol., vol. 18, no. 5, pp. 596–608, May 2008.

Ling Guan (S’88-M’90-SM’96-F’08) received his Ph.D. Degree in Electrical Engineering from the University of British Columbia, Canada in 1989. He is currently a professor and a Tier I Canada Research Chair in the Department of Electrical and Computer Engineering at the Ryerson University, Toronto, Canada. He held visiting positions at British Telecom (1994), Tokyo Institute of Technology (1999), Princeton University (2000) and Microsoft Research Asia (2002). He has published extensively in multimedia processing and communications, human-centered computing, machine learning, and adaptive image and signal processing. He is a recipient of 2005 IEEE Transactions on Circuits and Systems Best Paper Award.

Yifeng He (M’09) received his Ph.D. degree in Electrical Engineering from Ryerson University, Canada in 2008. He is currently an assistant professor at Ryerson University, Canada. His research interests include wireless video streaming, wireless visual sensor networks, peer-to-peer streaming, and distributed optimizations for multimedia communications. He is the recipient of 2008 Governor General’s Gold Medal in Canada, and 2007 Pacific-rim Conference on Multimedia (PCM) best paper award.
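
The WVSN article above notes that a convex formulation admits a distributed solution through Lagrangian duality. The sketch below illustrates that mechanism on a deliberately simple toy problem, not on the lifetime-maximization problem of [2]: several nodes share one link of fixed capacity and maximize the sum of log-rates. The problem, the function name, the step size, and all numbers are assumptions chosen for illustration. Each node solves its own subproblem using only a broadcast price, and the price is updated by a projected subgradient step.

```python
def dual_decomposition(num_nodes=3, capacity=6.0, step=0.05, iterations=500):
    """Toy dual decomposition: maximize sum(log(x_i)) subject to sum(x_i) <= capacity."""
    price = 1.0  # Lagrange multiplier of the shared capacity constraint
    rates = [0.0] * num_nodes
    for _ in range(iterations):
        # Local step: node i maximizes log(x) - price * x, whose solution is x = 1 / price.
        rates = [1.0 / price for _ in range(num_nodes)]
        # Price step: raise the price if the link is overloaded, lower it otherwise.
        price = max(1e-6, price + step * (sum(rates) - capacity))
    return rates, price

rates, price = dual_decomposition()
print(rates, price)
```

For this toy problem the optimum is known in closed form (equal rates of capacity / num_nodes), which makes it easy to check that the iteration converges.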

A New Generation of Wireless Multimedia Link-Layer Protocols
Hayder Radha (IEEE Fellow), Michigan State University, USA

1. Introduction

Despite the unprecedented success and proliferation of wireless LANs over the past decade, there are a few, arguably major, shortcomings in the underlying link-layer protocols of well-established wireless systems. These shortcomings are expected to be exacerbated as the level of heterogeneity and high-bandwidth requirements of emerging multimedia applications increase dramatically. In particular, popular wireless link-layer protocols, such as the retransmission (ARQ) based approach employed by the IEEE 802.11 standard suite [58], are designed to achieve some level of reliability by discarding corrupted packets at the receiver and by performing one or more retransmission attempts until a packet is received error-free or a maximum number of retransmission attempts is reached. In addition to our contributions [1-26] in this area, many other leading research efforts (see for example [27-57]) have highlighted the inefficiencies of the retransmission approach used by the current IEEE link-layer protocol and proposed a variety of remedy solutions. Many of these remedy solutions focus on a variety of ARQ-based packet combining schemes and cooperative decoding [52-56]; and others employ cross-layer strategies with some form of channel coding that is usually implemented at higher layers, especially at the application multimedia layer 568912-1657.

Although these and other recently proposed remedies for the wireless link-layer focus on some aspects of the reliability issue, they largely ignore the stability dimension, and they especially ignore the heterogeneous nature/demands of data/applications at higher layers. Meanwhile, we believe that emerging and future wireless networks supporting high-end heterogeneous applications cannot afford piecemeal solutions. Ideally, the wireless link-layer protocol must meet the reliability and stability requirements of all applications (realtime or not) while maximizing throughput. Here, stability can be simply (and coarsely) defined for legacy and realtime multimedia applications as the set of condition(s) that ensure continuous availability of content for processing/decoding at the receiver [1]. Simply stated, instability can be synonymous with the well-known underflow event in multimedia applications; and under limited buffer constraints, overflow events can naturally contribute to the instability of the application as well.

Recently, a new wireless framework, Automatic Code Embedding (ACE) [1], that achieves reliability and stability while maximizing bandwidth efficiency, has been developed. ACE is designed to support a broad range of applications, both legacy (i.e., TCP-based) and realtime, including high-end wireless multimedia such as HDTV over wireless, remote presence services, gaming, and immersive applications. We believe that successful wireless link-layer protocols have to (a) address the reliability and stability issues (jointly) for the broad range of heterogeneous applications (real-time and non-real-time) that ride on these protocols; (b) take advantage of “side information” in conjunction with intelligent feedback to maximize throughput; and (c) be flexible and adaptive in the face of changing channel conditions and traffic demands.

To that end, ACE is a good starting point for a new generation of wireless link layer protocols that employ rate-adaptive channel coding with some intelligent feedback and possibly with ARQ (when needed). More importantly, ACE is built based on the philosophy of “fixing the problem at the source” with the understanding that the link-layer is the lowest layer where one can address the reliability and stability issues while trying to maximize capacity utilization and associated throughput. It is important to note that one can envision some form of a “collaborative” physical/link-layer design that achieves the aforementioned reliability and stability requirements for heterogeneous applications. In this article, however, we focus on the wireless link-layer design issues with the understanding that many of the ideas presented here could be done with some cooperation with the physical layer (see Footnote 1).

Footnote 1: Many researchers may argue for a more stringent separation between the physical and link layers with the caveat that such separation is necessary for adherence to the traditional OSI layer model, and hence, for the sake of maintaining flexibility in the design and development of these two layers separately and independently.

In this article, we briefly outline the major shortcomings of the current ARQ based link-layer and highlight some examples of leading remedies to overcome those shortcomings. This motivates the need for a new generation of a wireless link-layer. We then provide a chronological account of key protocol designs that led to the ACE framework. Subsequently, we focus on highlighting the salient features of the ACE protocol and its architecture, as a representative of a new generation of wireless link-layer that is designed with reliability and stability considerations in mind.

2. Why do we need a new wireless link-layer protocol?

To motivate the need for a new generation of wireless link layer protocols, we highlight the key issues with the 802.11 link-layer protocol. In particular, one can identify two major shortcomings with such ARQ-based protocols:

2.1 Inefficient reliability

The 802.11 ARQ approach discards corrupted packets that are mostly error-free, even when there is only a single bit error in a corrupted packet. Hence, the effective throughput of 802.11 systems can be significantly improved. This issue led many efforts to propose new link-layer and cross-layer protocols that utilize corrupted packets (or partial packets) instead of discarding them [5-51]. In addition to Hybrid ARQ (HARQ) based methods [40-42], examples of recent efforts for combating inefficiencies of ARQ-based wireless protocols include Cross-Layer Design with Side-information (CLDS) [5-11], packet combining [29-44], Partial Packet Recovery (PPR) [51], ZipTx [39], and Automatic Code Embedding (ACE) [1]. Some of these approaches, such as PPR and packet combining, exploit physical layer information regarding the quality of individual bits to improve the probability of recovering corrupted packets. Others, such as CLDS, ZipTx, and ACE, utilize information available in current 802.11 link-layer protocols in conjunction with error correcting codes to recover corrupted packets.

In particular, Cross-Layer Design with Side-information (CLDS) [5-11] demonstrated a significant increase in throughput by utilizing corrupted packets under current 802.11 systems. More importantly, under CLDS, it can be shown that the mere use of binary side information (packet is corrupted or not), which is available in the current 802.11 link layer protocol, can increase the effective information-theoretic capacity significantly [5-7]. More details will be provided about CLDS below.

2.2 Largely absent stability

The ARQ approach is designed to provide “reliability in the long run”, where information could eventually be delivered to the destination. Even then, the link-layer does not guarantee delivery and the reliability burden (due to wireless errors) is carried by higher layers, especially for applications that require guaranteed delivery. More importantly, the ARQ-based 802.11 link-layer and other recent protocols largely ignore the stability aspect of data communications in terms of maintaining a sustainable flow, which is critical for a dynamic and heterogeneous ubiquitous wireless environment. Although many leading efforts have addressed the reliability and associated throughput inefficiency shortcoming of the current 802.11 link-layer (as highlighted above), current ARQ and many emerging link-layer protocols rely on (or arguably shift the problem to) higher layers to provide reliable and stable flow control for both realtime and non-realtime traffic. In conjunction with the inefficient reliability approach, this design strategy has led to a great deal of inefficiency in throughput and to other major technical issues and challenges at higher layers. A well-known example is the TCP-over-wireless performance degradation phenomenon, which led to major research efforts and numerous studies in an attempt to mitigate the shortcomings of the lower layers.

3. The evolution from cross-layer protocols toward a reliable and stable link-layer protocol

The inception of the ACE protocol has been in the making since the early 2000s 8 222324. Several research tracks have contributed to developing an insightful protocol design. For example, comprehensive studies regarding the measurement, analysis, and modeling of the “MAC-to-MAC channel” error process [17-24] provided an underlying foundation for designing better reliable wireless protocols at the MAC/link layer and above. Here, the “MAC-to-MAC channel” or simply the “MAC channel” (or the “link-layer channel”) is an abstraction where one can include the physical layer and its underlying channel within such abstraction. Hence, errors detected at the MAC/link layer represent the error characteristics of such an abstract channel. These MAC channel errors are referred to as residue errors, and they represent errors that are not corrected by the error-correcting capabilities of the physical layer.

As highlighted above, an important conclusion of recent studies in this area is that the simple ARQ strategy is very inefficient under realistic channel conditions. This led to the intuition that wireless systems are better off by simply utilizing corrupted packets instead of dropping them [7]. One major early direction under this intuition is the idea of passing corrupted packets to higher layers, where further and more efficient reliability functions (e.g., application-layer FEC) can be applied to the corrupted packets 592375. This opens the door for a variety of cross-layer protocols that perform significantly better when compared with the conventional ARQ protocol.

More specifically, one can consider different “MAC-to-MAC” channel models as shown in Figure 1. Each of these channel models maps a MAC/link layer reliability protocol to a corresponding abstract channel. The first channel is based on the ARQ protocol, which is simply mapped into an erasure channel model due to packet drops that are induced by the underlying wireless channel errors. Let this channel have an information-theoretic capacity $C_{\mathrm{ARQ}}$, which represents the maximum amount of information one can convey per transmitted symbol. Now let’s consider a second channel model as shown in Figure 1b. Under this channel model, the receiver (i) does not drop a corrupted packet when the errors are within the payload only; but (ii) drops a packet when one or more errors impact the header (regardless of whether the payload is error-free or not). This channel model, which we refer to as a Cross-Layer Design (CLD) channel, is a hybrid error-erasure channel since some packets are corrupted (with errors) and some packets are dropped (due to errors in the header) [5-7]. Let the capacity of this second channel be $C_{\mathrm{CLD}}$. It is important to note that under the CLD channel model, the MAC/link layer does not provide any side information about the status of the packet (corrupted or not) to the receiver. In other words, the receiver is blind about any received packet; and hence, under CLD, the receiver does not know if that packet has errors or if it is error-free. Consequently, one can define a third channel, which is basically CLD in conjunction with binary side-information about the status of the packet (corrupted or not). Let’s assume that this third channel, CLDS, has a capacity $C_{\mathrm{CLDS}}$.

Now, one can compare the information theoretic capacities among the three channels, $C_{\mathrm{ARQ}}$, $C_{\mathrm{CLD}}$, and $C_{\mathrm{CLDS}}$, based on the corresponding models. In particular, it can be shown that under realistic channel conditions, the capacity of the cross-layer design channel $C_{\mathrm{CLD}}$ (without side information) is “usually” higher than the capacity of the traditional channel $C_{\mathrm{ARQ}}$. Here, the ARQ channel may have a higher capacity only if the corrupted packets are severely corrupted. More importantly, one can show that the capacity $C_{\mathrm{CLDS}}$ of the cross-layer design with side information is “always” higher than the capacity $C_{\mathrm{ARQ}}$ of the conventional channel: $C_{\mathrm{CLDS}} \ge C_{\mathrm{ARQ}}$ [5-7]. And under realistic channel conditions $C_{\mathrm{CLDS}}$ is significantly higher than $C_{\mathrm{ARQ}}$. This basic and fundamental result clearly leads to the conclusion that any viable wireless link layer should not adopt a plain ARQ scheme, which drops corrupted packets. This conclusion can probably be stretched to the following extent: “regardless how badly corrupted these packets are, do not drop them”.

Figure 1: Three channel models that represent (a) the conventional ARQ (erasure) channel; (b) a cross-layer design (CLD) hybrid error-erasure channel; and (c) a CLD with side-information (CLDS) channel. Here, “residue errors” are errors that are not corrected by the physical layer and hence are observed at the MAC/link layer.
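
To make the comparison among the three channels of Figure 1 concrete, the sketch below evaluates their capacities under one simple parameterization. The parameterization is an assumption made for illustration and is not taken from [5-7]: a packet is dropped because of header errors with probability `delta`, a surviving packet has a corrupted payload with probability `lam`, and bits inside a corrupted payload flip independently with probability `eps`. For the CLD channel without side information, the payload bits are further assumed to be ideally interleaved, so the receiver sees a binary symmetric channel with average crossover `lam * eps`.

```python
import math

def binary_entropy(p):
    """Binary entropy function h(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def capacities(delta, lam, eps):
    """Return (C_ARQ, C_CLD, C_CLDS) in information bits per transmitted bit.

    delta: probability that a packet is dropped due to header errors
    lam:   probability that a surviving packet has payload errors
    eps:   bit error rate inside a corrupted payload
    """
    # ARQ: only error-free packets are kept (erasure channel).
    c_arq = (1 - delta) * (1 - lam)
    # CLD: corrupted packets are kept, but the receiver cannot tell which ones they are.
    c_cld = (1 - delta) * (1 - binary_entropy(lam * eps))
    # CLDS: the receiver additionally knows whether each kept packet is corrupted.
    c_clds = (1 - delta) * ((1 - lam) + lam * (1 - binary_entropy(eps)))
    return c_arq, c_cld, c_clds

# Mildly corrupted packets: keeping them pays off.
print(capacities(delta=0.05, lam=0.3, eps=0.01))
# Severely corrupted packets: CLD falls below ARQ, while CLDS never does.
print(capacities(delta=0.05, lam=0.1, eps=0.5))
```

Under these assumptions the side information can only help: since the entropy term is at most one, the CLDS expression is never smaller than the ARQ expression, which matches the "always" statement in the text, while CLD drops below ARQ only when corrupted payloads are close to random.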

From the above discussion, the cross-layer design with side-information (CLDS) protocol model is the best among the three models shown in Figure 1. The CLDS protocol employs a hybrid error-erasure channel coding scheme to jointly correct errors in corrupted packets and recover lost packets (due to errors in the header) [5, 6]. Such cross-layer design protocols can be further enhanced by reducing the number of packet drops due to errors in the header. Novel methods based on relatively simple detection and estimation theoretic tools can be adopted to "estimate" a corrupted header of a packet and hence determine whether that packet belongs to the receiver or not [12-16]. Under such a strategy, if the probability that "the packet belongs to the receiver" is very close to one, the receiver keeps the packet. These header detection/estimation techniques can benefit greatly from accurate modeling of the MAC channel under consideration [17-22].

In summary, one can achieve significant improvements over conventional ARQ methods by employing cross-layer designs that are based on (i) preserving corrupted packets (i.e., not dropping them due to errors in the payload); (ii) employing a form of error-correction and erasure-recovery capabilities at a higher layer; and (iii) utilizing header detection/estimation to reduce the number of packet drops due to errors in the header.

4. Fixing the problem at the source: The next generation wireless link-layer

The above discussion of cross-layer designs that can achieve improvements over traditional ARQ based link-layers raises the following question: why not fix the problem at the source? In particular, we have seen that keeping (not dropping) corrupted packets, in conjunction with some form of channel coding, can provide significant improvements. A natural question is, why not include a channel coding based scheme within the link-layer that can achieve reliable communication? By following such a strategy, all types of traffic and data (not only realtime multimedia) can benefit from a simple and robust link-layer architecture, while preserving the integrity of the link-layer itself and the higher layers above it. This new thinking of a reliable link-layer that is based on a channel coding scheme with feedback was a first major step toward a next generation wireless link-layer. This major step led to a new family of reliable wireless link-layer protocols that we refer to as Packet Embedded Error Control (PEEC) protocols [2-4].

The second important question that can be raised is the following: can the link-layer not only provide reliable communication, but also provide stability for higher layers? As mentioned above, stability implies that higher layers should not be starved for data at any time during a session. This concept can be applied to both realtime and non-realtime data [1]. For one thing, realizing a reliable and stable link layer can virtually eliminate all issues with the well-known wireless TCP problem, which has occupied the attention of many leading research efforts mainly due to the unreliable and inefficient nature of the current ARQ based link-layer. Below, we outline the main features of the PEEC (reliable link-layer) and ACE (reliable and stable link-layer) protocol architectures.

4.1 A reliable wireless link-layer: Packet Embedded Error Control (PEEC) protocol

PEEC is based on the simple idea that each packet should protect itself with error correcting codes embedded in it. If the link-layer at the receiver side can correct all errors, the packet is moved to the higher layers; otherwise, the corrupted packet is kept in a buffer at the link-layer receiver while waiting for more redundant bits from the transmitter. Hence, there is a feedback mechanism between the receiver and transmitter that is similar to the feedback currently used by wireless link-layer protocols. In this case, however, the feedback can be used to request additional redundancy, and the transmitter can also use it to adjust the channel coding rate.

Consequently, there are two types of redundant symbols that can be carried in each packet: Type-I parity and Type-II parity symbols. Type-I parity symbols are redundant symbols for the current packet that is being transmitted. Type-II parity symbols are redundant symbols for previously transmitted packets that are waiting in the receiver buffer (since they could not be corrected based on their own Type-I parity symbols). The basic architecture and a simple scenario of the PEEC protocol are shown in Figure 2.

[Figure 2 panels: (1) The transmitter sends the 1st packet (data plus Type-I parity); the receiver's channel decoder fails to decode it and sends feedback. (2) The transmitter sends the 2nd packet with extra (Type-II) parity for the 1st packet; the receiver decodes the 2nd packet successfully and uses the extra Type-II parity to decode the 1st packet.]

Figure 2: The Packet Embedded Error Control (PEEC) protocol.
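The two-packet scenario of Figure 2 can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not the PEEC implementation: no real channel code is used, and a packet is treated as decodable when it has at least two parity symbols per symbol error (a Reed-Solomon-style bound chosen here for illustration). The class and parameter names (`PeecReceiver`, `type1_parity`, `type2_parity`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BufferedPacket:
    errors: int   # symbol errors in the stored copy (visible only to this simulation)
    parity: int   # parity symbols accumulated so far for this packet


def decodable(errors, parity):
    # Assumed rule, not from the article: t symbol errors need 2t parity symbols.
    return 2 * errors <= parity


class PeecReceiver:
    def __init__(self):
        self.buffer = {}      # seq -> BufferedPacket awaiting more parity
        self.delivered = []   # sequence numbers passed to the higher layer

    def receive(self, seq, errors, type1_parity, type2_parity=None):
        """Process one packet and return the feedback: buffered sequence numbers."""
        # Type-II parity belongs to earlier packets still waiting in the buffer.
        for old_seq, extra in (type2_parity or {}).items():
            pkt = self.buffer.get(old_seq)
            if pkt is None:
                continue
            pkt.parity += extra
            if decodable(pkt.errors, pkt.parity):
                self.delivered.append(old_seq)
                del self.buffer[old_seq]
        # Type-I parity protects the packet that carries it.
        if decodable(errors, type1_parity):
            self.delivered.append(seq)
        else:
            self.buffer[seq] = BufferedPacket(errors, type1_parity)
        return sorted(self.buffer)


rx = PeecReceiver()
feedback_1 = rx.receive(seq=1, errors=3, type1_parity=4)   # too few parity symbols
feedback_2 = rx.receive(seq=2, errors=1, type1_parity=4, type2_parity={1: 2})
```

In this sketch the feedback is simply the list of packets still waiting in the buffer; a sender could use it to decide how much Type-II parity to embed in the next packet.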

Figure 3: Design architecture of the ACE protocol (ACE sender and ACE receiver).

4.2 A reliable and stable wireless link-layer: The Automatic Code Embedding (ACE) Framework

ACE is built on the reliable PEEC protocol described above. The most basic type of wireless link-layer communication is contention-free point-to-point communication comprising a single type of traffic flow. In this communication scheme, the sender has a single task, which is to transmit information packets reliably to the receiver. However, depending on the nature of the traffic flow (realtime or non-realtime), the sender should avoid throughput instability at the receiver. In [1], we proposed a paradigm shift where both reliability and stability are ensured using an Automatic Code Embedding (ACE) wireless link-layer protocol. An important conclusion of this work is that various traffic demands (in terms of reliability and stability requirements) can be met using a packet-by-packet code embedding rate constraint that is independent of traffic type. Our results show the feasibility of designing a stable and reliable link layer over 802.11 channels [1] and, more importantly, provide clear evidence of the feasibility of achieving significantly improved throughput by using this type of link-layer.

The design architectures of the ACE sender and receiver are illustrated in Figure 3. The ACE sender has two components. The first component is Channel State Prediction, where the link-layer wireless channel condition for the next transmission interval is predicted based on the receiver feedback (provided to the sender by the acknowledgment packet (ACK)). The second component is Parity Allocation, where a new codeword is generated and an appropriate number of parity bits is added to a packet for the next transmission. On the other hand, upon the reception of the link-layer packet, an ACE receiver first attempts to decode a codeword embedded in the packet. If the decoding is successful, the information symbols in the codeword are immediately sent to the higher layer. But if the decoding fails, the codeword is sent to the receiver buffer for future recovery.

The decoding operations and buffer management of the ACE receiver are performed in a Packet Decoding and Buffer Management component. In addition, the second component in the ACE receiver, Channel State Estimation, is designed to estimate the channel condition by utilizing the physical and link-layer side-information embedded in the received packet (channel state inference). It is important to note that accurate estimation and prediction of the channel condition has a critical impact on the performance of the ACE framework. This is due to the fact that ACE employs Low-Density-Parity-Check (LDPC) codes for decoding link-layer packets, and LDPC codes use soft decision decoding (an iterative belief propagation method), which requires knowledge of the channel bit error rate (BER). Therefore, it is essential to identify practically observable variables that can be used for reasonably robust channel state inference/prediction (CSI/CSP).

5. Conclusion

In this article, we provided an overview of a variety of solutions that have been developed recently to overcome the shortcomings of the ARQ based wireless link layer. These solutions range from packet combining approaches to cross-layer methods. We believe that all of these efforts should be utilized to develop a new framework for the wireless link layer. This new link layer should provide both reliability and stability for higher layers (instead of shifting the problem to the higher layers). More importantly, the research, academic and industrial communities have to collaborate to develop a new wireless link-layer standard that harnesses all of the advancements made in this area over the past decade.

References

[1] S. Soltani, K. Misra, and H. Radha, "On Link-Layer Reliability and Stability for Wireless Communication," ACM MOBICOM, 2008.
[2] S. Soltani and H. Radha, "PEEC: A Channel-Adaptive Feedback-Based Error Control Protocol for Wireless MAC Layer," IEEE JSAC Special Issue on Exploiting Limited Feedback in Tomorrow's Wireless Communication Networks, vol. 26, no. 8, pp. 1376-1385, 2008.
[3] Sohraab Soltani and Hayder Radha, "Delay Constraint Error Control Protocol for Real-Time Video Communication," IEEE Transactions on Multimedia, vol. 11, no. 4, pp. 742-751, June 2009.
[4] Sohraab Soltani and Hayder Radha, "Performance Evaluation of Error Control Protocols over Finite-State Markovian Channels," Proceedings of the Conference of Information Sciences and Systems (CISS'08), Princeton University, NJ, USA, March 2008.
[5] Shirish Karande and Hayder Radha, "Hybrid Erasure-Error Protocols for Wireless Video," IEEE Transactions on Multimedia, vol. 9, no. 2, pp. 307-319, February 2007.
[6] Shirish Karande and Hayder Radha, "The Utility of Hybrid Error Erasure LDPC (HEEL) Codes for Wireless Multimedia," IEEE International Conference on Communications (ICC), May 2005.
[7] Shirish Karande and Hayder Radha, "Does Relay of Corrupted Packets Lead to Capacity Improvement?," IEEE Wireless Communications and Networking Conference (WCNC), March 2005.
[8] Syed Ali Khayam, Shirish S. Karande, Michael Krappel, and Hayder Radha, "Cross-Layer Protocol Design for Real-Time Multimedia Applications over 802.11b Networks," IEEE International Conference on Multimedia and Expo (ICME), July 2003.
[9] Y. Cho, S. Karande, K. Misra, H. Radha, J. Yoo, and J. Hong, "On Channel Capacity Estimation and Prediction for Rate Adaptive Wireless Video," IEEE Transactions on Multimedia, vol. 10, no. 7, Nov. 2008.
[10] S. Karande, S. A. Khayam, Y. Cho, K. Misra, H. Radha, J. Kim and J. Hong, "On Channel State Inference and Prediction Using Observable Variables in 802.11b Networks," IEEE International Conference on Communications (ICC), Glasgow, UK, June 2007.
[11] Shirish Karande, Utpal Parrikar, Kiran Misra, and Hayder Radha, "Utilizing Signal to Silence Ratio indications for improved Video Communication in presence of 802.11b Residue Errors," IEEE International Conference on Multimedia & Expo (ICME), July 2006.
[12] Syed A. Khayam and Hayder Radha, "Maximum-Likelihood Header Estimation: A Cross-Layer Methodology for Wireless Multimedia," IEEE Transactions on Wireless Communications, vol. 6, no. 11, pp. 3946-3954, November 2007.
[13] Syed Ali Khayam and Hayder Radha, "Comparison of Conventional and Cross-Layer Multimedia Transport Schemes for Wireless Networks," Springer Journal of Wireless Personal Communications (WPC), pp. 535-548, July 2009.
[14] Syed Ali Khayam, Shirish Karande, Muhammad Usman Ilyas, and Hayder Radha, "Header Detection to Improve Multimedia Quality over Wireless Networks," IEEE Transactions on Multimedia, vol. 9, no. 2, pp. 377-385, February 2007.
[15] Syed Ali Khayam, Shirish Karande, Muhammad Usman Ilyas, and Hayder Radha, "Improving Wireless Multimedia Quality using Header Detection with Priors," IEEE International Conference on Communications (ICC), June 2006.
[16] Syed Ali Khayam, Muhammad U. Ilyas, Klaus Pörsch, Shirish Karande, and Hayder Radha, "A Statistical Receiver-based Approach for Improved Throughput of Multimedia Communications over Wireless LANs," IEEE International Conference on Communications (ICC), May 2005.
[17] Syed A. Khayam and Hayder Radha, "Constant-Complexity Models for Wireless Channels," IEEE INFOCOM, April 2006.
[18] Syed Ali Khayam, Hayder Radha, Selin Aviyente, and John R. Deller, Jr., "Markov and Multifractal Wavelet Models for Wireless MAC-to-MAC Channels," Elsevier Performance Evaluation Journal, vol. 64, no. 4, pp. 298-314, May 2007.
[19] Shirish Karande, U. Parrikar, Kiran Misra, and Hayder Radha, "On Modeling of 802.11b Residue Errors," Conference on Information Sciences & Systems (CISS), March 2006.
[20] Syed Ali Khayam and Hayder Radha, "Linear-Complexity Models for Wireless MAC-to-MAC Channels," ACM/Kluwer Wireless Networks (WINET) Journal - Special Issue on Selected Papers from MSWiM'03, vol. 11, no. 5, pp. 543-555, September 2005.
[21] Syed Ali Khayam, Selin Aviyente, and Hayder Radha, "On Long-Range Dependence in High-Bitrate Wireless Residual Channels," Conference on Information Sciences and Systems (CISS), March 2005.
[22] Syed Ali Khayam and Hayder Radha, "Markov-based Modeling of Wireless Local Area Networks," ACM Mobicom International Workshop on Modeling, Analysis and Simulation of Wireless and Mobile Systems (MSWiM), September 2003.
[23] Syed Ali Khayam, Shirish Karande, Hayder Radha, and Dmitri Loguinov, "Analysis and Modeling of Errors and Losses over 802.11b LANs for High-Bitrate Real-Time Multimedia," EURASIP Signal Processing: Image Communication, vol. 18, no. 7, pp. 575-595, August 2003.
[24] Shirish Karande, Syed Ali Khayam, Michael Krappel, and Hayder Radha, "Analysis and Modeling of Errors at the 802.11b Link-Layer," IEEE International Conference on Multimedia and Expo (ICME), July 2003.
[25] Muhammad U. Ilyas and Hayder Radha, "Measurement Based Analysis and Modeling of the Error Process in IEEE 802.15.4 LR-WPANs," Proceedings of the 27th IEEE International Conference on Computer Communications (INFOCOM'08), Phoenix, AZ, United States, April 2008.
[26] Muhammad U. Ilyas, Moonseong Kim, and Hayder Radha, "Reducing Packet Losses in Networks of Commodity IEEE 802.15.4 Sensor Motes Using Cooperative Communication and Diversity Combination," Proceedings of the 28th IEEE International Conference on Computer Communications (INFOCOM'09), Rio de Janeiro, Brazil, April 19-25, 2009.
[27] D. Aguayo, J. Bicket, S. Biswas, G. Judd, and R. Morris, "Link-level Measurements from an 802.11b Mesh Network," in SIGCOMM, 2004.
[28] J. G. Kim and M. M. Krunz, "Delay analysis of selective repeat ARQ for a Markovian source over a wireless channel," IEEE Trans. Veh. Technol., vol. 49, no. 5, pp. 1968-1981, Sep. 2000.
[29] J. G. Kim and M. M. Krunz, "Delay analysis of selective repeat ARQ for a Markovian source over a wireless channel," IEEE Trans. Veh. Technol., vol. 49, no. 5, pp. 1968-1981, Sep. 2000.
[30] P. S. Sindhu, "Retransmission error control with memory," IEEE Transactions on Communications, vol. COM-25, no. 5, pp. 473-479, May 1977.
[31] S. S. Chakraborty, E. Yli-Juuti, and M. Liinaharja, "An adaptive ARQ scheme with packet combining," IEEE Communications Letters, vol. 2, no. 7, pp. 200-202, July 1998.
[32] M. Gidlund, "Receiver-based packet combining in IEEE 802.11a wireless LAN," in Proc. IEEE Radio and Wireless Conference (RAWCON), August 2003, pp. 47-50.
[33] Y. Liang and S. S. Chakraborty, "ARQ and packet combining with post-reception selection diversity," in Proc. 60th IEEE Semiannual Vehicular Technology Conference (VTC Fall), 2004.
[34] Q. Zhang and S. A. Kassam, "Hybrid ARQ with selective combining for fading channels," IEEE Journal on Selected Areas in Communications, vol. 17, no. 5, pp. 867-874, May 1999.
[35] T. W. A. Avudainayagam, J. M. Shea and L. Xin, "Reliability Exchange Schemes for Iterative Packet Combining in Distributed Arrays," Proc. of the IEEE WCNC, vol. 2, pp. 832-837, 2003.
[36] S. S. Chakraborty, E. Yli-Juuti, and M. Liinaharja, "An ARQ Scheme with Packet Combining," IEEE Comm. Letters, 1998.
[37] H. Yomo, S. S. Chakraborty, and R. Prasad, "IEEE 802.11 WLAN with Packet Combining," International Conference on Computer and Device 2004 (CODEC-04), Kolkata, India, January 2004.
[38] Grace Woo, Pouya Kheradpour, Dawei Shen, and Dina Katabi, "Beyond the Bits: Cooperative Packet Recovery Using Physical Layer Information," ACM MOBICOM, 2007.
[39] K. C. Lin, N. Kushman, and D. Katabi, "Ziptx: Harnessing partial packets in 802.11 networks," in Mobicom'08, September 2008.
[40] E. Soljanin, "Hybrid ARQ in Wireless Networks," DIMACS Workshop on Network Inform. Theory, March 2003.
[41] E. C. Strinati, S. Simoens, and J. Boutros, "Performance evaluation of some Hybrid ARQ schemes in IEEE 802.11a Networks," IEEE VTC, 4(4):2735-2739, 2003.
[42] G. Caire and D. Tuninetti, "The throughput of Hybrid-ARQ protocols for the Gaussian collision channel," IEEE Trans. Inform. Theory, July 2001.
[43] S. Cheng and M. C. Valenti, "Macrodiversity packet combining for the IEEE 802.11a uplink," in IEEE WCNC, 2005.
[44] M. C. Valenti, "Improving uplink performance by macrodiversity combining packets from adjacent access points," IEEE WCNC, pp. 636-641, 2003.
[45] S. Lin and D. J. Costello Jr., "Error Control Coding: Fundamentals and Applications," Englewood Cliffs, NJ: Prentice-Hall, 1983.
[46] S. Lin and P. S. Yu, "A hybrid ARQ scheme with parity retransmission for error control of satellite channels," IEEE Trans. Commun., vol. 30, pp. 1701-1719, July 1982.
[47] Y. Wang and S. Lin, "A modified selective-repeat type-II hybrid ARQ system and its performance analyses," IEEE Transactions on Communications, 31(5), pp. 124-133, 1983.
[48] G. Caire and D. Tuninetti, "The throughput of Hybrid-ARQ protocols for the Gaussian collision channel," IEEE Trans. Inform. Theory, 47:1971-1988, July 2001.
[49] D. Chase, "Code-combining: A maximum likelihood decoding approach for combining an arbitrary number of noisy packets," IEEE Trans. Commun., vol. COMM-33, no. 5, pp. 385-393, May 1985.
[50] J. C. Bolot, S. Fosse-Parisis, and D. Towsley, "Adaptive FEC-based error control for internet telephony," in Proc. IEEE INFOCOM '99, 1999, vol. 3, pp. 1453-1460.
[51] K. Jamieson and H. Balakrishnan, "PPR: Partial Packet Recovery for Wireless Networks," in ACM SIGCOMM, Kyoto, Japan, August 2007.
[52] M. Gidlund, "Receiver-based packet combining in IEEE 802.11a wireless LAN," in Proc. IEEE Radio and Wireless Conference (RAWCON), August 2003, pp. 47-50.
[53] T. W. A. Avudainayagam, J. M. Shea and L. Xin, "Reliability Exchange Schemes for Iterative Packet Combining in Distributed Arrays," Proc. of the IEEE WCNC, vol. 2, pp. 832-837, 2003.
[54] Y. Liang and S. S. Chakraborty, "ARQ and packet combining with post-reception selection diversity," in Proc. 60th IEEE Semiannual Vehicular Technology Conference (VTC Fall), 2004.
[55] H. Yomo, S. S. Chakraborty, and R. Prasad, "IEEE 802.11 WLAN with Packet Combining," International Conference on Computer and Device 2004 (CODEC-04), Kolkata, India, January 2004.
[56] Grace Woo, Pouya Kheradpour, Dawei Shen, and Dina Katabi, "Beyond the Bits: Cooperative Packet Recovery Using Physical Layer Information," ACM MOBICOM, 2007.
[57] K. C. Lin, N. Kushman, and D. Katabi, "Ziptx: Harnessing partial packets in 802.11 networks," in Mobicom'08, September 2008.
[58] IEEE Computer Society LAN MAN Standard Committee, "Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications," IEEE Std. 802.11-1999, New York, 1999.
[59] The Lightweight User Datagram Protocol (UDP-Lite), http://www.ietf.org/rfc/rfc3828.txt.

Hayder Radha received the Ph.M. and Ph.D. degrees from Columbia University in 1991 and 1993, the M.S. degree from Purdue University in 1986, and the B.S. degree (with honors) from Michigan State University (MSU) in 1984 (all in electrical engineering). Currently, he is a Professor of Electrical and Computer Engineering (ECE) at MSU, the Associate Chair for Research of the ECE Department, and the Director of the Wireless and Video Communications Laboratory. Professor Radha was with Philips Research (1996-2000), where he worked as a Principal Member of Research Staff and then as a Consulting Scientist in the Video Communications Research Department. He was a Member of Technical Staff at Bell Laboratories, where he worked between 1986 and 1996 in the areas of digital communications, image processing, and broadband multimedia.

Professor Radha is a Fellow of the IEEE, and he was appointed as a Philips Research Fellow in 2000 and a Bell Labs Distinguished Member of Technical Staff in 1992. He is an elected member of the IEEE Technical Committee on Image, Video, and Multidimensional Signal Processing (IVMSP) and the IEEE Technical Committee on Multimedia Signal Processing (MMSP). He served as Co-Chair and Editor of a Video Coding Experts Group of the International Telecommunications Union – Telecommunications Section (ITU-T) between 1994 and 1996. He served on the Editorial Board of IEEE Transactions on Multimedia and the Journal on Advances in Multimedia. He also served as a Guest Editor for the special issue on Network-Aware Multimedia Processing and Communications of the IEEE Journal on Selected Topics in Signal Processing. Professor Radha is a recipient of the Bell Labs Distinguished Member of Technical Staff Award, the AT&T Bell Labs Ambassador Award, the AT&T Circle of Excellence Award, the MSU College of Engineering Withrow Distinguished Scholar Award for outstanding contributions to engineering, and the Microsoft Research Content and Curriculum Award. He is a recipient of National Science Foundation (NSF) grants under the Theoretical Foundation, Communications Research, Research in Networking Technology and Systems (NeTS), and Cyber-Trust programs. His current research areas include wireless communications and networking, video communications, image processing, compressed sensing, sensor networks, and network coding. He has more than 150 peer-reviewed papers and 30 US patents in these areas.

TECHNOLOGY ADVANCES

Four Suggestions for Research on Multimedia QoE Using Subjective Evaluations
Greg Cermak, Verizon Labs, USA

The suggestions that follow are based on my experience doing consumer research and on my contact with video and multimedia practitioners in T1A1, VPQM, VQEG, and QoMEX. There have been many hours of debates about consumer QoE and how to measure it.

Suggestion 1: Measure the consumer experience, but do not try to understand the consumer. The root cause of much distress and fruitless discussion regarding subjective evaluation may be that engineers have the idea that improving customer experience requires understanding why consumers act as they do. Understanding why consumers act as they do is a practical impossibility, and is unnecessary for the kind of product and service evaluations reported in the organizations listed above. Instead, measuring consumers’ behavior while they are interacting with products or product descriptions provides observable data about the products, and avoids potential sources of argument and confusion.

The distinction between (a) understanding why people perceive, feel, and act as they do, and (b) measuring behavior when a person interacts with some stimulus may seem overly subtle. It has been the subject of academic debates at least since the beginnings of experimental psychology. The important point is that the distinction has practical consequences for doing good research on quality of customer experience (QoE).

Understanding why a single person judges something pleasant or unpleasant (e.g., a packet loss artifact is more/less annoying than a compression artifact) can require much time and effort on the part of the experimenter. If the testing program has on the order of two dozen consumer “subjects,” the amount of time required to understand the judgments of each one is a practical impossibility in most industrial labs. Furthermore, even if very detailed data on two dozen subjects were obtained, finding common threads among them can be difficult and subjective; seeing the common threads depends on the judgment of the experimenter.

The alternative proposed here is to concentrate on properties of the product first, properties of the interaction of the customer with the product second, and properties of the customer last or not at all. For example, in the case of multimedia, the idea would be to spend most time and effort creating a set of multimedia examples that capture the important elements of the multimedia products/services of interest. (In the case of video quality, the important elements such as bit rate and packet loss are known in advance.) Expose consumers to the multimedia examples, and collect some sort of more or less objective rating of the QoE for each example. If the study is designed carefully (see below), the result will be that the relative importance of each of the products’ elements will be revealed. Also, the relative improvement in judged QoE may be revealed as each individual element is improved separately. The engineer will have clear direction for how to improve the overall QoE of the product.

Naturally, there are exceptions to every rule, and the author himself has expended some effort to understand individual differences in perception of QoE for VoIP and videoconferencing [1, 2]. However, in general, and especially for group projects, keep in mind the suggestion that good project results can be achieved without trying to understand the consumer in any depth. Understanding the consumer is actually useful as background information; it makes it possible to proceed on a consumer research project confidently and efficiently. However, background understanding of the consumer does not usually lead to the kind of actionable results that multimedia product engineers need. Some good sources for background information on consumer needs regarding communication are [3, 4, 5]; and for entertainment [6]. A current example of work to understand the end-user is [7].
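As a rough illustration of how ratings alone can reveal the relative importance of product elements, the sketch below builds a full factorial set of stimuli for two elements (bit rate and packet loss), each at a low and a high level coded as -1 and +1, and estimates one weight per element from the ratings. It assumes a weighted additive rating model of the kind discussed under Suggestion 2; the ratings are made-up numbers, and the function and variable names are hypothetical.

```python
from itertools import product
from statistics import mean

# Hypothetical attributes, each with coded levels: -1 = low, +1 = high.
ATTRIBUTES = ("bit_rate", "packet_loss")


def full_factorial(n_attributes):
    """Every combination of low/high levels, so the attributes vary independently."""
    return list(product((-1, 1), repeat=n_attributes))


def estimate_weights(design, ratings):
    """Estimate b0 and each b_i in QoE = b0 + sum(b_i * x_i) + e.

    Because the coded full factorial design is orthogonal, the least-squares
    weight of attribute i reduces to the mean of x_i * rating.
    """
    b0 = mean(ratings)
    weights = [mean(x[i] * y for x, y in zip(design, ratings))
               for i in range(len(design[0]))]
    return b0, weights


design = full_factorial(len(ATTRIBUTES))
# Made-up mean ratings on a 1-5 scale, one per stimulus, in design order:
# (low, low), (low, high), (high, low), (high, high) for (bit_rate, packet_loss).
ratings = [2.0, 1.0, 4.0, 3.0]
b0, weights = estimate_weights(design, ratings)
importance = dict(zip(ATTRIBUTES, weights))
```

With these invented ratings, the estimated weight for bit rate is twice the size of the weight for packet loss and has the opposite sign, which is the kind of "relative importance" statement the engineer can act on without any model of why individual consumers responded as they did.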

Suggestion 2: Think of products, services, and lab stimuli as multiattribute “objects.” Of course, it is obvious that products and services are complicated, and that overall QoE depends on many elements. Nevertheless, it often happens in meetings that discussions get turned so that only the influence of a single attribute, or of a few attributes, is considered. A multiattribute framework may counteract this tendency. Also, by thinking of products as multiattribute objects, machinery for experimental design and data analysis follows naturally.

The kind of argument that suffers from lack of a multiattribute perspective might go like this: “If we improve attribute X of our product/service, then QoE will certainly go up.” In fact, that is only true if it is possible to improve attribute X without negatively affecting other attributes, and if QoE increases monotonically with increases in attribute X. Very often changing one attribute in fact also changes other attributes, and the effect of the other attributes compensates for the improvement in the first attribute. For practical use in industrial lab situations, we are assuming a “compensatory” model of utility. Compensatory models are not always as accurate as “noncompensatory” models, but they are almost always good approximations, and they provide a much more experimenter-friendly context for discussing experimental designs and analyses.

That is, a good general-purpose approximation is that overall utility or QoE follows a weighted additive form:

QoE = b_0 + b_1 x_1 + b_2 x_2 + … + b_n x_n + e

where x_i refers to a measurement of the ith attribute, b_i refers to the relative weight of the attribute in contributing to the overall utility or QoE of the product/service, and e refers to random error. That is, different attributes have potentially different effects, but they all count (unless the weight is zero). And, they can all be accounted for in this general model. Further, the effects are potentially separable, just as the b weights are distinct.

Another kind of argument that can be forestalled with the multiattribute model might go like this: “We can’t tell what the effect of the codec is, because it also depends on the scene.” True enough for a single scene and codec, but if one has a collection of scenes processed by codecs, then the effect of the scenes can be separated from the effect of the codec(s). Which brings up the topic, referred to earlier, of experimental machinery and data analysis.

Experimental design and data analysis have gone hand-in-hand since the time of Ronald Fisher (see Wikipedia article). Both are based on a generalization of the additive compensatory model sketched above. In order to actually achieve a situation (such as a study of QoE) in which the simple additive model applies, it is necessary to arrange for the elements of the stimuli (e.g., multimedia recordings) to be logically independent. That is, in the collection of stimuli as a whole, the various elements that vary need to do so independently, and all other elements should be held constant. This situation is achieved in a “full factorial” experimental design in which every combination of the elements or attributes is present. Such a design is expensive and not strictly necessary; fractional factorial designs exist, and random sampling of attribute combinations produces good results. See Wikipedia on “factorial design” and “fractional factorial design.”

The usual statistical tool for analyzing data from designed experiments is analysis of variance (ANOVA) or its cousin, the general linear model. The main point is that thinking about products or stimuli as collections of attributes with additive effects leads both to a way of talking about the products, and to a way of designing studies and analyzing data. It also leads to a way of thinking about consumers and research subjects (below). Professional societies of interest include the Psychometric Society, the Society for Judgment and Decision Making, and the INFORMS Society for Marketing Science.

Suggestion 3: Think of consumers and research subjects as having different weights for the importance of product attributes. In the expression for the additive attribute model above, the attribute weights depend only on the attribute, not on the individual consumer or subject. Think of these weights as being the average across a sample of consumers. However, one can also think of the weights as being particular to the individual subjects (representable by adding another subscript). After all, consumers notoriously do weight different product attributes differently – for example, I may weight fuel economy very highly and you may weight acceleration very highly. These differences in taste are often quite stable, not due to inconsistent responding or random noise in the data.

The multiattribute way of thinking of consumers and product attributes immediately accommodates individual differences among consumers. Various data analysis tools for representing individual differences in consumers’ preferences have been adopted by the marketing science community over the past 40 years. Some key words are individual differences multidimensional scaling, preference mapping, and latent class analysis. Examples of the application of multiattribute methods to consumer research can be found in Paul E. Green’s work [8].

If consumers differ in their tastes, and if we have a natural way to represent those differences, then there is less reason to do research with a small homogeneous sample of consumers – such as college students or lab employees. Of course, it could be that a particular product is intended only for a very restricted segment of the market, such as college students or lab employees, but that is rarely the case. If larger and more representative samples of consumers are considered at least theoretically desirable, then the practical issues arise: how to recruit them, how to pay them, how to deal with them in the lab. Search on Green Book for recruiting and paying subjects. Consider hiring experimental psychologists or human factors specialists for dealing with human subjects in the lab.

Another consequence of thinking of consumers as having legitimate reasons for being different from each other – and of these differences being captured in a multiattribute model – is that there is less reason for discarding consumers’ data. Consider the model

QoE = b_{0j} + b_{1j} x_1 + b_{2j} x_2 + … + b_{nj} x_n + e_j

where the subscript j refers to an individual consumer, the b’s again refer to weights, the x’s refer to values of attributes such as bit rate, and e refers to random error. Then two consumers’ data can correlate poorly either because their error terms e are very large, or because their b weights are quite different. It is the large error term that indicates “bad” data, not possible differences in b-weights. Proper experimental design can make it possible to distinguish between the two cases with ANOVA. The correlation tests that are frequently used in video quality research do not distinguish between the two cases [9].

Suggestion 4: Do not spend your time arguing about the proper rating scale. No rating scale is the “correct” rating scale [10], and most rating scales produce results that are reasonable approximations of each other, especially for aggregate data [11]. Better uses of effort are in producing multimedia stimuli according to a good orthogonal design, or in understanding more about the processes involved in interpersonal communication [e.g., 3, 4, 5].

References

[1] Cermak, G. W. “Verbal descriptors for VoIP speech sounds.” International Journal of Speech Technology, 7, 81-91, 2004.

[2] Cermak, G. W. “Multimedia quality as a function of bandwidth, packet loss, and latency.” International Journal of Speech Technology, 8, 259-270, 2005.

[3] Clark, H. H. Using Language. Cambridge University Press, 1996.

[4] Nofsinger, R. E. Everyday Conversation. Sage Publications, 1991.

[5] Short, J., Williams, E., & Christie, B. The Social Psychology of Telecommunications. New York: Wiley, 1976.

[6] Cermak, G. W. “An approach to mapping entertainment alternatives.” In R. R. Dholakia, N. Mundorf, and N. Dholakia (Eds.), New Infotainment Technologies in the Home (pp. 115-134). Mahwah, NJ: L. Erlbaum Associates, 1996.

[7] Aaltonen, V., Takatalo, J., Hakkinen, J., Lehtonen, M., Nyman, G., and Schrader, M. “Measuring mediated communication experience.” First International Workshop on Quality of Multimedia Experience. San Diego, July, 2009.

[8] (http://marketing.wharton.upenn.edu/people/faculty/green/green.cfm)

[9] K. Brunnstrom, G. Cermak, D. Hands, M. Pinson, F. Speranza, and A. Webster. Draft Final Report From the Video Quality Experts Group On the Validation of Objective Models of Multimedia Quality Assessment, Phase I. ©2008 VQEG.

[10] Shepard, R. N. “Psychological relations and psychophysical scales: On the status of ‘direct’ psychophysical measurement.” Journal of Mathematical Psychology, 24, 21-57, 1981.

[11] Cox, E. P. “The optimal number of response alternatives for a scale: a review.” Journal of Marketing Research, vol. XVII, Nov., 1980, 407-422.

Gregory W. Cermak received a B.A. in psychology from the University of California, Santa Barbara, in 1968, and a Ph.D. in psychology from Stanford University in 1972. He worked at the General Motors Research Laboratories from 1972 through 1986, at Information Resources, Inc. from 1987 to 1988, and at the GTE/Verizon Laboratories in Waltham, MA from 1988 to the present. He has
published in psychophysics, acoustics, air quality, market research, speech quality, and video quality. He has recently been working with the Video Quality Experts Group on validating objective measures of video quality.
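The distinction drawn in the article above between large error terms e and differing b weights in the per-consumer model (QoE = b0j + b1jx1 + … + ej) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two attributes, the coded 2x2 orthogonal design, the weights, and the ratings are invented and are not data from the article. With an orthogonal, centered design, each consumer's least-squares weights reduce to simple averages, and the RMS residual stands in for the error term.

```python
from math import sqrt

# Hypothetical 2x2 orthogonal design with coded levels (-1 = low, +1 = high).
X1 = [-1, -1, +1, +1]  # attribute 1, e.g. bit rate (illustrative)
X2 = [-1, +1, -1, +1]  # attribute 2, e.g. frame rate (illustrative)


def ratings(b0, b1, b2, errors):
    """Generate QoE = b0 + b1*x1 + b2*x2 + e for one consumer."""
    return [b0 + b1 * x1 + b2 * x2 + e for x1, x2, e in zip(X1, X2, errors)]


def fit_consumer(y):
    """Return least-squares weights (b0, b1, b2) and the RMS residual.

    Because the coded design is orthogonal and centered, each weight
    reduces to a simple average of x*y over the design points.
    """
    n = len(y)
    b0 = sum(y) / n
    b1 = sum(x * v for x, v in zip(X1, y)) / n
    b2 = sum(x * v for x, v in zip(X2, y)) / n
    resid = [v - (b0 + b1 * x1 + b2 * x2) for v, x1, x2 in zip(y, X1, X2)]
    return (b0, b1, b2), sqrt(sum(r * r for r in resid) / n)


def pearson(u, v):
    """Pearson correlation between two rating vectors."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    du = [a - mu for a in u]
    dv = [b - mv for b in v]
    return sum(a * b for a, b in zip(du, dv)) / sqrt(
        sum(a * a for a in du) * sum(b * b for b in dv))


no_error = [0.0, 0.0, 0.0, 0.0]
consumer_a = ratings(3.0, 1.0, 0.2, no_error)  # weights attribute 1 heavily
consumer_b = ratings(3.0, 0.2, 1.0, no_error)  # weights attribute 2 heavily
consumer_c = ratings(3.0, 1.0, 0.2, [1.5, -1.5, -1.5, 1.5])  # same taste as A, noisy

for name, y in [("A", consumer_a), ("B", consumer_b), ("C", consumer_c)]:
    weights, rms = fit_consumer(y)
    print(name, [round(w, 2) for w in weights], "RMS residual:", round(rms, 2))
print("corr(A, B):", round(pearson(consumer_a, consumer_b), 2))
```

In this toy setup, consumers A and B correlate only weakly with each other although both fit the model with essentially zero residual, whereas consumer C shares A's weights but has a large residual. A screening rule based on correlation alone would not separate these two situations; the per-consumer fit does.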

From Cross-Layer Optimization to Cognitive Source Coding for Multimedia Transmission: Adapting Content Formats to the Network
Simone Milani, University of Padova, Italy
[email protected]

The advent of wireless multimedia communications has made it evident that the traditional configurations of network protocol stacks are not adequate for delivering multimedia content over heterogeneous and time-varying networks. The massive amount of data that characterizes multimedia signals, together with the strict Quality of Service (QoS) requirements on bandwidth, delay, and delay jitter, makes it difficult to deliver multimedia content to end users at a satisfactory quality. These difficulties are further exacerbated by the introduction of wireless channels, which are characterized by high data loss rates and varying transmission conditions, and by the limits of traditional transmission protocols and infrastructures.

In order to mitigate these problems, several optimization strategies have been proposed to increase the QoS level of multimedia transmissions by adapting each layer to the transmitted information and the network conditions. However, the modularization of traditional layered architectures can lead to significant inefficiencies when the transmission parameters are set up blindly [1].

In recent years, considerable research effort has been devoted to efficient cross-layer (CL) solutions that aim at maximizing the quality of the video signal transmitted to the end user by allowing a synergetic interaction between different protocol layers [1]. The main goal of these architectures is to improve the performance of the transmission by jointly tuning the parameters of each layer according to holistic algorithms.

Some of the proposed solutions aim at jointly tuning the parameters of source and channel coders according to the transmission conditions and the characteristics of the video sequence [2]. Other solutions [3, 4] differentiate the priorities and the retransmission policies of packets according to the significance of the contained data in the decoding process. Moreover, other solutions accurately control the transmission power in order to vary the Signal-to-Noise Ratio as needed [5].

All these solutions can be jointly combined to optimize the final performance [6], but the computational complexity becomes critical because of the large number of parameters that are involved in the optimization process. Moreover, most of the proposed solutions are focused on finding the optimal parameter setting given a fixed set of source and channel coding solutions at the different layers.

In this scenario, Cognitive Source Coding schemes widen the set of possible cross-layer solutions and significantly improve the performance of traditional schemes. The term Cognitive Source Coding (CSC) has been introduced in analogy with the Cognitive Radio [7] architectures adopted for radio transmissions. As defined by Haykin in [8], “Cognitive radio is an intelligent wireless communication system that is aware of its surrounding environment (i.e., outside world), and uses the methodology of understanding-by-building to learn from the environment and adapt its internal states to statistical variations in the incoming RF stimuli by making corresponding changes in certain operating parameters (e.g., transmit-power, carrier-frequency, and modulation strategy) in real-time.”

Similarly, it is possible to change the coding format of multimedia signals according to the available network resources in order to transmit them effectively to a remote user. CSC schemes receive a description of the network conditions from the lowest layers in the protocol stack (e.g., available bandwidth, number of transmission paths, packet loss probability, average delay, jitter) and adopt the most appropriate source coding solution from a set of possible choices.

As a matter of fact, CSC schemes need to be designed in an appropriate way in order to satisfy specific requirements:
• providing robust multimedia communications anywhere and anytime while granting a certain level of Quality-of-Experience (QoE) to the end user;
• using the available transmission capacity effectively;
• limiting the required computational load, the involved hardware resources, and the complexity of the transmission architecture.

As for Cognitive Radio systems, reconfigurability is one of the key elements that permit satisfying these requirements. The possibility of orchestrating the different functional blocks of the coding architecture permits improving the effectiveness of the transmission in terms of the perceptual quality experienced by the end user. The control unit of a CSC solution enables and reconfigures the available functional blocks, selecting those that prove to be the most suitable to the network status and to the characteristics of the signal to be transmitted. As a matter of fact, an effective CSC scheme needs to identify the key elements that are common to the implemented source coding solutions and design an effective interconnecting network that can be easily reconfigured. Moreover, efficient optimization algorithms must adapt the values of the coding parameters to the features of the coded video signal and to the available data rate. In this way, the coded bit stream fully exploits the transmission capacity available to the terminal, avoiding bandwidth waste or underutilization. Finally, reconfigurability permits limiting the size of coding devices while increasing the number of implemented coding solutions.

From these premises, the set of CSC schemes lies within the range of cross-layer coding solutions, but the two differ in that most CL solutions jointly tune the transmission parameters at different layers without changing the structure of the involved coding architecture, while CSC implies reconfiguring the architecture of the source coder depending on network status.

As an example, the solution in [9] reconfigures a standard H.264/AVC video coder in order to support both single description (SD) and multiple description (MD) coding. The proposed architecture switches between the SD coder and the MD coder according to the characteristics of the channel, which are inferred from a set of control messages received at MAC level.

Other examples are offered by those solutions that dynamically adopt traditional video coding and Distributed Video Coding (DVC) solutions in order to match both the video signal characteristics and the need for robust video coding. Many DVC schemes [10] adaptively combine Wyner-Ziv video coding with traditional non-predictive source coding depending on the reliability of the channel. Whenever multiple channels are available, it is possible to differentiate the adopted source coding solution according to packet loss probabilities measured from the data carried by RTCP packets [11]. In these cases, strong similarities between source coding solutions of different nature permit reusing a great number of functional units, and therefore the required device size and implementation costs are significantly reduced.

Following this trend, research is focusing on finding more efficient low-complexity CSC solutions that enable a stronger reuse of available units and maximize the quality of the video sequence reconstructed at the decoder. Many efforts are concentrated on assigning the most appropriate source coding solution for given transmission conditions. Moreover, video designers are investigating effective optimization strategies that process the information about the state of the network to infer the most appropriate configuration. Finally, significant research work is also devoted to identifying novel video coding schemes that can be easily integrated with the previous ones.

References

[1] M. van der Schaar and S. Shankar, “Cross-layer wireless multimedia transmission: challenges, principles, and new paradigms,” IEEE Wireless Commun., vol. 12, no. 4, pp. 50–58, Aug. 2005.
[2] Q. Qu, Y. Pei, J. W. Modestino, X. Tian, and B. Wang, “Cross-Layer QoS Control for Video Communication over Wireless Ad Hoc Networks,” EURASIP Journal on Wireless Communications and Networking, vol. 5, no. 5, pp. 743–756, Oct. 2005.
[3] B. Girod and N. Färber, “Feedback-based error control for mobile video transmission,” Proc. of the IEEE, vol. 87, no. 10, pp. 1707–1723, Oct. 1999.
[4] A. Ksentini, M. Naimi, and A. Guéroui, “Toward an improvement of H.264 video transmission over IEEE 802.11e through a cross-layer architecture,” IEEE Commun. Mag., vol. 44, no. 1, pp. 107–114, Jan. 2006.
[5] Y. Eisenberg, C. E. Luna, T. Pappas, R. Berry, and A. K. Katsaggelos, “Joint Source Coding and Transmission Power Management for Energy Efficient Wireless Video Communication,” IEEE Trans. Circuits Syst. Video Technol., vol. 12, no. 6, pp. 411–424, Jun. 2002.
[6] A. K. Katsaggelos, Y. Eisenberg, F. Zhai, R. Berry, and T. N. Pappas, “Advances in Efficient Resource Allocation for Packet-Based Real-Time Video Transmission,” Proc. of IEEE, vol. 93, no. 1, pp. 135–147, Jan. 2005.
[7] J. Mitola and G. Q. Maguire Jr., “Cognitive Radio: Making Software Radios More Personal,” IEEE Personal Commun. Mag., vol. 6, no. 6, pp. 13–18, Aug. 1999.
[8] S. Haykin, “Cognitive Radio: Brain-Empowered Wireless Communications,” IEEE J. Sel. Areas Commun., vol. 23, no. 2, pp. 201–220, Feb. 2005, (Invited).
[9] S. Milani, G. Calvagno, R. Bernardini, and P. Zontone, “Cross-Layer Joint Optimization of FEC Channel Codes and Multiple Description Coding for Video Delivery over IEEE 802.11e Links,” in Proc. of IEEE FMN 2008 (co-located with NGMAST2008), Cardiff, Wales, GB, Sep. 17–18, 2008, pp. 472–478.
[10] F. Pereira, C. Brites, J. Ascenso, and M. Tagliasacchi, “Wyner-Ziv video coding: A review of the early architectures and further developments,” in Proc. of ICME 2008, Hannover, Germany, Jun. 23–26, 2008, pp. 625–628.
[11] S. Milani and G. Calvagno, “A Distributed Video Coding Approach for Multiple Description Video Transmission over Lossy Channels,” in Proc. of EUSIPCO 2009, Glasgow, Scotland, UK, Aug. 24–28, 2009.

Simone Milani was born in Camposampiero (PD), Italy, in 1978. From the University of Padova, Italy, he received the Laurea degree in Telecommunication Engineering in 2002, and the Ph.D. degree in Electronics and Telecommunication Engineering in 2007. In 2006 he was a visiting Ph.D. student at the University of California-Berkeley under the supervision of Prof. K. Ramchandran, while in 2007 he was a post-doc researcher at the University of Udine, Italy, collaborating with Prof. R. Rinaldo. He has also worked with STMicroelectronics, Agrate Brianza, Italy, as a consulting engineer.

He is currently a research associate at the University of Padova, working on the research project "Analysis and implementation of a scalable video coder for transmission over heterogeneous unreliable networks based on distributed coding principles" under the supervision of Prof. Giancarlo Calvagno.

His main research topics are digital signal processing, source coding, joint source-channel coding, robust video transmission over lossy packet networks, distributed source coding, and cognitive source coding.

He is also a member of the IEEE Information Theory and Signal Processing Societies and has served as a reviewer for several magazines and international conferences.
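The CSC control unit described in the article above receives a description of the network from the lower layers and selects a source coding solution accordingly. The following Python sketch illustrates that idea for the SD/MD switching case only. It is not the architecture of [9]: the `NetworkStatus` and `CoderConfig` types, the loss threshold, the choice of two descriptions, and the even rate split are all hypothetical, and other network parameters the article mentions (number of paths, delay, jitter) are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class NetworkStatus:
    """Network description reported by the lower layers (illustrative fields)."""
    bandwidth_kbps: float  # available bandwidth
    loss_prob: float       # packet loss probability, between 0 and 1


@dataclass
class CoderConfig:
    mode: str              # "SD" (single description) or "MD" (multiple description)
    descriptions: int      # number of descriptions produced by the coder
    rate_kbps_each: float  # target bit rate assigned to each description


def select_coder(status, md_loss_threshold=0.05):
    """Pick a source coding configuration for the reported network status.

    The threshold and the even rate split are assumptions of this sketch,
    not values taken from the article or its references.
    """
    if status.loss_prob >= md_loss_threshold:
        # Unreliable channel: trade coding efficiency for robustness.
        return CoderConfig("MD", 2, status.bandwidth_kbps / 2)
    # Reliable channel: spend the whole available rate on a single stream.
    return CoderConfig("SD", 1, status.bandwidth_kbps)


print(select_coder(NetworkStatus(bandwidth_kbps=1000.0, loss_prob=0.01)))
print(select_coder(NetworkStatus(bandwidth_kbps=1000.0, loss_prob=0.10)))
```

In both branches the total target rate equals the reported bandwidth, reflecting the article's point that the coded bit stream should use the available capacity without waste. A real control unit would presumably also account for the redundancy overhead of MD coding and for the characteristics of the signal.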

Focused Technology Advances Series

Application Layer QoS Provisioning for Wireless Multimedia Networks with Cognitive Radios
F. Richard Yu, Carleton University, Canada
[email protected]

Abstract

Most previous work on wireless multimedia communication networks concentrates on lower layer quality of service (QoS), such as blocking probability, throughput and radio resource utilization, as design criteria. However, from a user’s point of view, application layer QoS, such as multimedia distortion, is more important than that at other layers. In addition, recent studies show that maximizing lower layer QoS does not necessarily benefit QoS at the multimedia application layer. The problem is more severe in cognitive radio (CR) networks, where CR-based secondary users would have a strictly lower QoS than radio services that enjoy guaranteed spectrum access. Therefore, it is necessary to take an integrated approach to jointly optimize application layer QoS for wireless multimedia communication networks.

1. Introduction

Recently, there has been significant growth in the use of wireless multimedia communication services. With the growing demand of resource-intensive multimedia applications in wireless networks, quality of service (QoS) provisioning is one of the major challenges in designing wireless multimedia communication networks.

Although several schemes have recently been proposed for QoS provisioning [1], [2], most previous work concentrates on lower layer QoS, such as blocking probability, throughput and radio resource utilization, as design criteria. As a consequence, other QoS measures, such as distortion for multimedia applications, are mostly ignored in the literature. However, recent studies in cross-layer design show that schemes that are optimal from the lower layers' perspective (e.g., maximizing throughput) do not necessarily benefit QoS at the application layer for some multimedia applications, such as video [3]-[5]. Moreover, from a user's point of view, QoS at the application layer is more important than that at other layers. The problem is more severe in cognitive radio (CR) networks [6], where CR-based secondary users would have a strictly lower QoS than radio services that enjoy guaranteed spectrum access. Therefore, if the application layer QoS is not carefully considered in wireless multimedia communication networks, the perceived reduction in application layer QoS may impede the success of wireless multimedia technologies with cognitive radios.

Multimedia applications such as video telephony, conferencing, and video surveillance are being targeted for wireless networks, including CR networks. Lossy video compression standards, such as MPEG-4 and H.264, exploit the spatial and temporal redundancy in video streams to reduce the bandwidth required to transmit video. Compressed video comprises intra- and inter-coded frames. The intra refreshing rate is an important application layer parameter [7]. Adaptively adjusting the intra refreshing rate for online video encoding applications can improve error resilience to the time-varying wireless channels available to secondary users in CR networks.

In this letter, we take an integrated design approach to jointly optimize application layer QoS for multimedia transmission over cognitive radio networks. Based on the sensed channel condition, secondary users can adapt the intra refreshing rate at the application layer, in addition to the parameters at other layers.

2. Rate-Distortion (R-D) Model for Multimedia Applications

Highly compressed video data is vulnerable to packet losses, where a single bit error may cause severe distortion [8]. This vulnerability makes error resilience at the video encoder essential. Intra update, also called intra refreshing, of macroblocks (MBs) is one approach for video error resilience and protection [9]. An intra-coded MB does not need information from previous frames, which may have already been corrupted by channel errors. This makes intra-coded MBs an effective way to mitigate error propagation. Alternatively, with inter-coded MBs, channel errors from previous frames may still propagate to the current frame along the motion compensation path [10].

Given a source-coding bit rate R_s and the intra refreshing rate, we need a model to estimate the corresponding source distortion D_s. The authors in [7] provide a closed-form distortion model taking into account varying characteristics of the input video, the sophisticated data representation scheme of the coding algorithm, and the intra refreshing rate. Based on the statistical analysis of the error propagation, error concealment, and channel decoding, a theoretical framework is developed to estimate the channel distortion, D_c. Coupled with the R-D model for source coding and time-varying wireless channels, an adaptive mode selection is proposed for wireless video coding and transmission.

We will use the rate-distortion model described in [7] in our study. The R-D model facilitates adaptive intra-mode selection and joint source-channel rate control. The total end-to-end distortion comprises D_s, the quantization distortion introduced by the lossy video encoder to meet a target bit rate, and D_c, the distortion resulting from channel errors. For DCT-based video coding, intra coding of an MB or a frame usually requires more bits than inter coding, since inter coding removes the temporal redundancy between two neighboring frames. Inter coding of MBs has much better R-D performance than intra mode. Decreasing the intra refreshing rate decreases the source distortion for a target bit rate. However, inter coding relies on information in previous frames. Packet losses due to channel errors result in error propagation along the motion-compensation path until the next intra-coded MB is received. Increasing the intra refreshing rate decreases the channel distortion. Thus we have a tradeoff between source and channel distortion when selecting the intra refreshing rate. We aim to find the optimal intra refreshing rate to minimize the total end-to-end distortion given the channel bandwidth and packet loss ratio.

3. Multimedia Transmission over Cognitive Radio Networks

The system time is slotted. At the beginning of a slot, the transmitter of secondary users will select a set of channels to sense. Based on the sensing outcome, the transmitter will decide whether or not to access a channel. If the transmitter decides to access a channel, some application layer parameters will be selected and the video content will be transmitted. At the end of the slot, the receiver will acknowledge the transfer by sending the perceived channel gain back to the transmitter. We will assume a system for real-time multimedia applications where packets are discarded if a primary user is using the slot or if the channel is not accessed.

4. Solving the Application Layer QoS Provisioning Problem in Cognitive Radio Networks

In wireless multimedia networks with cognitive radios, we need to determine the optimal policy for channel sensing selection, sensor operating point, access decision, and intra refreshing rate to minimize application layer distortion subject to the system probability of collision. With channel sensing and CSI errors, the system state cannot be directly observed. We formulate the whole system as a partially observable Markov decision process (POMDP). Deriving a single POMDP formulation for all policies under the probability of collision constraint would result in a constrained POMDP. However, constrained POMDPs require randomized policies to achieve optimality, which is often intractable. Therefore, we use the separation principle in [11] for the sensor operating point and the access decision. The spectrum sensor operating point is set such that the probability of miss detection of the busy channel used by primary users is the same as the required probability of collision.

At the beginning of the slot, the system transitions to a new state. Using a POMDP-derived policy, a channel is selected for spectrum sensing. An access decision is then made based on the sensing observation. Using the belief of the channel state, an intra refreshing rate is selected. The receiver acknowledges the transfer by sending the quantized perceived channel gain back to the secondary transmitter. The immediate cost for the time slot is derived based on the previous operations in the slot.

5. Conclusions

In wireless multimedia communication networks, application layer QoS, such as multimedia distortion, should be taken into consideration. In this letter, we took an integrated design approach to jointly optimize the multimedia intra refreshing rate, an application layer parameter, together with access strategy and spectrum sensing for multimedia transmission in a CR network.

REFERENCES

[1] F. R. Yu, V. W. S. Wong, and V. C. M. Leung, “A new QoS provisioning method for adaptive multimedia in wireless networks,” IEEE Trans. Veh. Tech., vol. 57, pp. 1899–1909, May 2008.
[2] C.-F. Tsai, C.-J. Chang, F.-C. Ren, and C.-M. Yen, “Adaptive radio resource allocation for downlink OFDMA/SDMA systems with multimedia traffic,” IEEE Trans. Wireless Commun., vol. 7, pp. 1734–1743, May 2008.
[3] M. van Der Schaar and S. S. N, “Cross-layer wireless multimedia transmission: challenges, principles, and new paradigms,” IEEE Wireless Comm., vol. 12, pp. 50–58, Aug. 2005.
[4] S. Khan, Y. Peng, E. Steinbach, M. Sgroi, and W. Kellerer, “Application-driven cross-layer optimization for video streaming over wireless networks,” IEEE Comm. Mag., vol. 44, pp. 122–130, Jan. 2006.
[5] Z. Han, G.-M. Su, A. Kwasinski, M. Wu, and K. J. R. Liu, “Multiuser distortion management of layered video over resource limited downlink multicode-cdma,” IEEE Trans. Wireless Commun., vol. 5, no. 11, pp. 3056–3067, 2006.
[6] S. Haykin, “Cognitive radio: Brain-empowered wireless communications,” IEEE J. Sel. Areas Commun., vol. 23, pp. 201–220, Feb. 2005.
[7] Z. He, J. Cai, and C. Chen, “Joint source channel rate-distortion analysis for adaptive mode selection and rate control in wireless video coding,” IEEE Trans. Circ. Sys. Video Tech., vol. 12, pp. 511–523, June 2002.
[8] K. Stuhlmuller, N. Farber, M. Link, and B. Girod, “Analysis of video transmission over lossy channels,” IEEE J. Sel. Areas Commun., vol. 18, pp. 1012–1032, Jun. 2000.
[9] J. Y. Liao and J. Villasenor, “Adaptive intra block update for robust transmission of H.263,” IEEE Trans. Circ. Sys. Video Tech., vol. 10, pp. 30–35, Feb. 2000.
[10] G. Cote, S. Shirani, and F. Kossentini, “Optimal mode selection and synchronization for robust video communications over error-prone networks,” IEEE J. Sel. Areas Commun., vol. 18, pp. 952–965, June 2000.
[11] Y. Chen, Q. Zhao, and A. Swami, “Joint design and separation principle for opportunistic spectrum access in the presence of sensing errors,” IEEE Trans. Inform. Theory, vol. 54, May 2008.

F. Richard Yu (S’00-M’04-SM’08) received the PhD degree in electrical engineering from the University of British Columbia (UBC) in 2003. From 2002 to 2004, he was with Ericsson (in Lund, Sweden), where he worked on the research and development of 3G cellular networks. From 2005 to 2006, he was with a start-up in California, USA, where he worked on the research and development in the areas of advanced wireless communication technologies and new standards. He joined Carleton School of Information Technology and the Department of Systems and Computer Engineering at Carleton University in 2007, where he is currently an Assistant Professor. He received the Leadership Opportunity Fund Award from Canada Foundation of Innovation in 2009 and best paper awards at IEEE/IFIP TrustCom 2009 and Int’l Conference on Networking 2005. His research interests include cross-layer design, security and QoS provisioning in wireless networks. He has served on the Technical Program Committee (TPC) of numerous conferences and as the Co-Chair of ICUMT-CWCN'2009, TPC Co-Chair of IEEE IWCMC'2009, VTC'2008F Track 4, WiN-ITS'2007. He is a senior member of the IEEE.
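The tradeoff described in Section 2 of the letter above (a higher intra refreshing rate raises source distortion at a fixed bit rate but lowers channel distortion) can be sketched numerically. The letter relies on the closed-form model of [7], which is not reproduced in the text, so the two distortion functions below are toy stand-ins with made-up constants; only the shape of the tradeoff is meant to carry over. The function names and the grid search are likewise assumptions of this sketch.

```python
def source_distortion(beta, d0=5.0, k_s=40.0):
    """Toy D_s: grows with the intra refreshing rate beta, since intra-coded
    MBs cost more bits and leave fewer bits for quantization at a fixed rate."""
    return d0 + k_s * beta


def channel_distortion(beta, loss_ratio, k_c=100.0, floor=0.05):
    """Toy D_c: shrinks as beta grows, since intra-coded MBs stop error
    propagation along the motion-compensation path."""
    return k_c * loss_ratio / (beta + floor)


def best_intra_rate(loss_ratio, steps=20):
    """Grid-search beta in [0, 1] for the minimum of D_s + D_c."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(
        candidates,
        key=lambda b: source_distortion(b) + channel_distortion(b, loss_ratio),
    )


for p in (0.01, 0.10):
    beta = best_intra_rate(p)
    total = source_distortion(beta) + channel_distortion(beta, p)
    print(f"loss ratio {p:.2f}: beta = {beta:.2f}, total distortion = {total:.2f}")
```

With these toy functions the selected rate moves from 0.10 at a 1% packet loss ratio to 0.45 at 10%, which matches the qualitative behavior the letter describes: a worse channel justifies spending more of the bit budget on intra refresh. In the letter's setting, the loss ratio would presumably be replaced by a quantity derived from the belief over the channel state.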

MMTC COMMUNICATIONS & EVENTS

Call for Papers of Selected Journal Special Issues

Ad Hoc Networks (Elsevier) Special Issue on Multimedia Ad Hoc and Sensor Networks

Guest Editors: Tommaso Melodia, Martin Reisslein
Paper Submission deadline: December 15, 2009
Target Publishing Issue: 3rd Quarter, 2010
CfP Weblink: http://www.elsevierscitech.com/pdfs/cfp_adhoc0709.pdf

Multimedia System Journal Special Issue on Wireless Multimedia Transmission Technology and Application

Guest Editors: Gabriel-Miro Muntean, Pascal Frossard, Haohong Wang, Yan Zhang, Liang Zhou
Paper Submission deadline: Jan. 15, 2010
Target Publishing Issue: 4th Quarter, 2010
CfP Weblink: http://www.ifi.uio.no/MMSJ/CFP_SI_Wireless_MMSJ.pdf

Call for Papers of Selected Conferences

IEEE GLOBECOM 2010
Website: http://www.ieee-globecom.org/2010/
Dates: December 6-10, 2010
Location: Miami, USA
Submission Due: March 15, 2010

Next Issue Partial Content Preview

Scaling P2P Content Delivery Systems Reliably by Exploiting Unreliable System Resources
Kannan Ramchandran et al., University of California, Berkeley, USA

Unified Reliable and Secure Media Transmission: Challenges and Approaches
Chang Wen Chen, University at Buffalo, State University of New York, USA

Peer-to-Peer Streaming of Scalable Coded Video
Mohammed Ghanbari, University of Essex, UK

Locality Aware P2P Delivery: The Way to Scale Internet Video
Jin Li, Microsoft Research, USA

Context-aware Multimedia Services in Ambient-enhanced Collaborative Environments
Min Chen, Seoul National University, Korea

Cooperative Multimedia Communications
Andres Kwasinski, Rochester Institute of Technology, USA

E-Letter Editorial Board

EDITOR-IN-CHIEF

Haohong Wang, TCL-Thomson Electronics, USA

EDITOR

Philippe Roose, IUT of Bayonne, France
Chonggang Wang, NEC Laboratories America, USA
Guan-Ming Su, Marvell Semiconductors, USA
Shiguo Lian, France Telecom R&D Beijing, China
Antonios Argyriou, Philips Research, Netherlands

MMTC Officers

CHAIR

Qian Zhang, Hong Kong University of Science and Technology, China

VICE CHAIRS

Wenjun Zeng, University of Missouri, Columbia, USA
Madjid Merabti, Liverpool John Moores University, UK
Zhu Li, Hong Kong Polytechnic University, China
Nelson Fonseca, Universidade Estadual de Campinas, Brazil

SECRETARY

Bin Wei, AT&T Labs Research, USA
