A System, Tools and Algorithms for Adaptive HTTP-Live Streaming on Peer-To-Peer Overlays
Total Page:16
File Type:pdf, Size:1020Kb
A System, Tools and Algorithms for Adaptive HTTP-live Streaming on Peer-to-peer Overlays ROBERTO ROVERSO Doctoral Thesis in Information and Communication Technology KTH - Royal Institute of Technology Stockholm, Sweden, December 2013 TRITA-ICT/ECS AVH 13:18 KTH School of Information and ISSN 1653-6363 Communication Technology ISRN KTH/ICT/ECS/AVH-13/18-SE SE-100 44 Stockholm ISBN 978-91-7501-915-4 SWEDEN Akademisk avhandling som med tillstånd av Kungl Tekniska högskolan framlägges till offentlig granskning för avläggande av Doktorandexamen i Informations- och Kommu- nikationsteknik Torsdagen den 12 Dec 2013 klockan 13.00 i Sal D, ICT School, Kungl Tekniska högskolan, Forum 105, 164 40 Kista, Stockholm. © ROBERTO ROVERSO, December 2013 Tryck: Universitetsservice US AB i Abstract In recent years, adaptive HTTP streaming protocols have become the de facto standard in the industry for the distribution of live and video-on-demand content over the Internet. In this thesis, we solve the problem of distributing adaptive HTTP live video streams to a large number of viewers using peer-to-peer (P2P) overlays. We do so by assuming that our solution must deliver a level of quality of user expe- rience which is the same as a CDN while trying to minimize the load on the content provider’s infrastructure. Besides that, in the design of our solution, we take into consideration the realities of the HTTP streaming protocols, such as the pull-based approach and adaptive bitrate switching. The result of this work is a system which we call SmoothCache that provides CDN-quality adaptive HTTP live streaming utilizing P2P algorithms. Our experi- ments on a real network of thousands of consumer machines show that, besides meeting the the CDN-quality constraints, SmoothCache is able to consistently de- liver up to 96% savings towards the source of the stream in a single bitrate scenario and 94% in a multi-bitrate scenario. In addition, we have conducted a number of pilot deployments in the setting of large enterprises with the same system, albeit tailored to private networks. Results with thousands of real viewers show that our platform provides an average offloading of bottlenecks in the private network of 91.5%. These achievements were made possible by advancements in multiple research areas that are also presented in this thesis. Each one of the contributions is novel with respect to the state of the art and can be applied outside of the context of our application. However, in our system they serve the purposes described below. We built a component-based event-driven framework to facilitate the develop- ment of our live streaming application. The framework allows for running the same code both in simulation and in real deployment. In order to obtain scalability of simulations and accuracy, we designed a novel flow-based bandwidth emulation model. In order to deploy our application on real networks, we have developed a net- work library which has the novel feature of providing on-the-fly prioritization of transfers. The library is layered over the UDP protocol and supports NAT Traver- sal techniques. As part of this thesis, we have also improved on the state of the art of NAT Traversal techniques resulting in higher probability of direct connectivity be- tween peers on the Internet. Because of the presence of NATs on the Internet, discovery of new peers and col- lection of statistics on the overlay through peer sampling is problematic. Therefore, we created a peer sampling service which is NAT-aware and provides one order of magnitude fresher samples than existing peer sampling protocols. Finally, we designed SmoothCache as a peer-assisted live streaming system based on a distributed caching abstraction. In SmoothCache, peers retrieve video frag- ments from the P2P overlay as quickly as possible or fall back to the source of the stream to keep the timeliness of the delivery. In order to produce savings, the caching system strives to fill up the local cache of the peers ahead of playback by prefetching content. Fragments are efficiently distributed by a self-organizing overlay network that takes into account many factors such as upload bandwidth capacity, connec- tivity constraints, performance history and the currently being watched bitrate. iii Acknowledgements I am extremely grateful to Sameh El-Ansary for the way he supported me during my PhD studies. This thesis would not have been possible without his patient supervision and constant encouragement. His clear way of thinking, methodical approach to problems and enthusiasm are of great inspiration to me. He also taught me how to do research properly given extremely complex issues and, most importantly, how to make findings clear for other people to understand. I would like to acknowledge my supervisor Seif Haridi for giving me the opportunity to work under his supervision. His vast knowledge of the field and experience was of much help to me. In particular, I take this opportunity to thank him for his guidance on the complex task of combining the work as a student and as an employee of Peerialism AB. I am grateful to Peerialism AB for funding my studies and to all the company’s very talented members for their help: Riccardo Reale, Mikael Högqvist, Johan Ljungberg, Magnus Hedbeck, Alexandros Gkogkas, Andreas Dahlström, Nils Franzen, Amgad Naiem, Mohammed El-Beltagy, Magnus Hylander and Giovanni Simoni. Each one of them has contributed to this work in his own way and has made my experience at the company very fruitful and enjoyable. I would also like to acknowledge my colleagues at the Swedish Institute of Computer Science and at the Royal Institute of Technology-KTH: Cosmin Arad, Tallat Mahmood Shaffaat, Joel Höglund, Vladimir Vlassov, Ahmad Al-Shishtawy, Niklas Ekström, Fate- meh Rahimian, Amir Payberah and Sverker Janson. In particular, I would like to express my gratitude to Jim Dowling for providing me with invaluable feedback on the topics of my thesis and passionate support of my work. A special thanks goes to all the people that have encouraged me in the last years and have made my life exciting and cherishable: Christian Calgaro, Stefano Bonetti, Tiziano Piccardo, Matteo Lualdi, Jonathan Daphy, Andrea Ferrario, Sofia Kleban, to name a few and, in particular, Maren Reinecke for her love and continuous moral support. Finally, I would like to dedicate this work to my parents, Sergio and Odetta, and close relatives, Paolo and Ornella, who have, at all times, motivated and helped me in the journey of my PhD. To my family This work was conducted at and funded by CONTENTS Contents ix List of Figures xv List of Tables xix I Thesis Overview 1 1 Introduction 3 1.1 Motivation . 5 1.2 Contribution . 6 1.3 Publications . 10 2 Background 13 2.1 Live Streaming . 14 2.2 Peer-to-peer Live Streaming . 15 2.2.1 Overlay Construction . 16 2.2.2 Data Dissemination . 20 2.2.3 Infrastructure-aided P2P Live Streaming . 21 2.3 Peer-to-peer Connectivity . 23 2.4 Peer Sampling . 25 2.4.1 NAT-resilient PSSs . 26 2.5 Transport . 27 2.6 Development and Evaluation of Peer-To-Peer Systems . 29 2.6.1 Emulation of Bandwidth Allocation Dynamics . 30 ix x CONTENTS Bibliography 31 II Framework and Tools 43 3 Mesmerizer: An Effective Tool for a Complete Peer-to-peer Software Devel- opment Life-cycle 45 3.1 Introduction . 47 3.2 The Mesmerizer framework . 50 3.3 Implementation . 50 3.3.1 Scala Interface . 53 3.4 Execution modes . 54 3.4.1 Simulation Mode . 54 3.4.2 Deployment Mode . 55 3.5 Network Layer . 56 3.5.1 Deployment Configuration . 56 3.5.2 Simulation Configuration . 57 3.6 Conclusion & Future Work . 63 3.7 References . 64 4 Accurate and Efficient Simulation of Bandwidth Dynamics for Peer-To-Peer Overlay Networks 69 4.1 Introduction . 71 4.2 Related Work . 73 4.3 Contribution . 75 4.4 Proposed Solution . 75 4.4.1 Bandwidth allocation . 75 4.4.2 Affected subgraph algorithm . 78 4.5 Evaluation . 82 4.5.1 Methodology . 82 4.5.2 Scalability . 83 4.5.3 Accuracy . 86 4.6 Conclusion & Future Work . 89 4.7 References . 90 III Network Layer 93 5 NATCracker: NAT Combinations Matter 95 5.1 Introduction . 97 5.2 Related Work . 98 5.3 NAT Types as Combinations of Policies . 98 5.3.1 Mapping Policy . 99 5.3.2 Allocation Policy. 100 CONTENTS xi 5.3.3 Filtering Policy. 100 5.3.4 The Set of NAT Types . 101 5.4 NAT Type Discovery . 101 5.5 NAT Traversal Techniques . 102 5.6 Simple hole-punching (SHP) . 102 5.6.1 Traversal Process . 102 5.6.2 SHP Feasibility . 102 5.6.3 Coverage of SHP . 103 5.7 Prediction . 103 5.7.1 Prediction using Contiguity (PRC) . 103 5.7.2 Prediction using Preservation (PRP) . 105 5.7.3 Prediction-on-a-Single-Side Feasibility . 105 5.7.4 Coverage of PRP & PRC . 106 5.8 Interleaved Prediction on Two Sides . 106 5.8.1 Traversal Process . 107 5.8.2 Interleaved Prediction Feasibility . 108 5.8.3 Interleaved Prediction Coverage . 108 5.9 Evaluation . 108 5.9.1 Distribution of Types . 109 5.9.2 Adoption of BEHAVE RFC . 109 5.9.3 Success Rate . 109 5.9.4 Time to traverse . 110 5.10 Conclusion & Future Work . 110 5.11 Acknowledgments . 111 5.12 References . 111 6 DTL: Dynamic Transport Library for Peer-To-Peer Applications 113 6.1 Introduction . 115 6.2 Related Work . 117 6.3 Contribution . 118 6.4 Dynamic Transport Library (DTL) . 119 6.4.1 Polite Mode .