<<

Efficient and scalable video conferences with selective forwarding units and WebRTC
Boris Grozev

To cite this version:

Boris Grozev. Efficient and scalable video conferences with selective forwarding units and WebRTC. Networking and Internet Architecture [cs.NI]. Université de Strasbourg, 2019. English. NNT: 2019STRAD055. tel-02917712

HAL Id: tel-02917712 https://tel.archives-ouvertes.fr/tel-02917712 Submitted on 19 Aug 2020

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

THESIS
submitted to obtain the degree of Doctor of the Université de Strasbourg, specialty: Computer Science

Presented by Boris GROZEV
Supervised by Emil Ivov and Thomas Noel

Defended on 20/12/2019

Prepared at the ICube laboratory of the Université de Strasbourg and at the company 8x8 in London, within the doctoral school Mathématiques, Sciences de l'Information et de l'Ingénieur.

Efficient and Scalable Video Conferences With Selective Forwarding Units and WebRTC

The thesis jury was composed of:
Marcelo Dias De Amorim, Research Director at CNRS, Examiner
Dr. Emil Ivov, Director of Product Development, Video Collaboration, at 8x8, Thesis Advisor
Nicolas Montavont, Professor at IMT Atlantique, Reviewer
Thomas Noel, Professor at the Université de Strasbourg, Thesis Director
Hervé Rivano, Professor at INSA de Lyon, Reviewer

Acknowledgements

During the last five years I have been very fortunate to have people around me who supported and helped me in various ways, and I would like to recognize them here.

This work would not have been even started, let alone completed, without the help of Emil Ivov. He encouraged me to pursue a Ph.D. in the first place, and supported me through the whole process as a research supervisor, as a manager at work, and on a personal level. He often believed in me more than I did myself, and inspired me to set higher goals and work harder. Finding the right balance between engineering and research was one of the hardest challenges I faced, and Emil went to great lengths to find compromises that allowed me to do both. Thank you, Emil!

Thomas Noel guided me into the world of academia and publishing, and was instrumental in shaping this thesis. He read early versions of the manuscript and offered invaluable advice, always delivered in a practical and constructive way. He did more than I could ever expect to enable me to complete my work remotely: understanding and translating my broken French, communicating with administrative staff on my behalf, making sure I met all the deadlines, finding and organizing the jury, and arranging to use video conferencing for the defence at very short notice. Merci, Thomas!

I would also like to thank my jury for their efforts: Marcelo Dias De Amorim, Nicolas Montavont, and Hervé Rivano.

My colleagues Damian Minkov, Brian Baldino, Aaron van Meerten, Saúl Ibarra Corretgé, George Politis, Yana Stamcheva, Pawel Domas, Lyubomir Marinov, and everyone else on the team have been part of countless technical discussions and contributed ideas and criticism, for which I am grateful. I learned and continue to learn from them. George Politis has been my main collaborator for the research papers this thesis is based upon. He contributed ideas, performed experiments, and helped shape the text. The other co-authors include Thomas Noel, Emil Ivov, Varun Singh, Pawel Domas, Alexandre Gouaillard, and Lyubomir Marinov, and I am thankful for the opportunity to work with them.

Varun Singh listened to my ideas and problems, and offered a fresh perspective and advice at some points in my journey, and I am very grateful for it.

Pierre David participated in my mid-term and pre-defence presentations. Both times he asked insightful and revealing questions and suggested better ways to organize my work.

Erin Cashel meticulously went through the manuscript and made it more understandable (or often just understandable), clear, and correct. Thank you!

I am also grateful to the companies that provided employment to me during the last five years: SIP Communicator/BlueJimp, Atlassian, and 8x8. Yana Stamcheva has been the best manager I could ask for, giving me freedom in my work and making sure I get everything I need, as well as being a friend to me. Damian Minkov and Eliza Dikova have been my friends and welcomed me in their home on more than one occasion. Thanks, guys!

Erin Cashel provided support and encouragement while I was writing this thesis. Thank you for sharing your life with me!

Finally, I would like to thank my family back home for always unquestioningly supporting me and being understanding of my decisions. Mom, Dad, Nia, I love you!

Contents

Acknowledgements

1 Résumé
1.1 Applications of Video Conferencing
1.2 Challenges and Motivations
1.3 Contributions
1.4 Thesis Structure
1.5 Video Conferencing Architectures: State of the Art
1.5.1 Full-mesh Networks
1.5.2 Multipoint Control Units
1.5.3 Selective Forwarding Units
1.5.4 Other Topologies
1.6 WebRTC
1.7 Jitsi Meet
1.8 Last N
1.8.1 Endpoint Selection Based on Audio Activity
1.8.2 Performance Evaluation
1.9 Simulcast
1.10 A Hybrid Approach: Selective Forwarding Unit and Full-mesh
1.11 Performance Optimizations for SFUs
1.11.1 Optimizing Computational Performance
1.11.2 Optimizing Memory Performance
1.12 Conferences with Cascaded SFUs
1.13 End-to-End Encrypted Conferences with SFUs

2 Introduction
2.1 Applications of Video Conferencing
2.2 Challenges and Motivations
2.3 Contributions
2.4 Thesis Structure

3 Video Conferencing Architectures: State of the Art
3.1 Video Conference Topologies
3.1.1 One-to-one Communication
3.1.2 Full-mesh Networks
3.1.3 Multipoint Control Units
3.1.4 Selective Forwarding Units
3.1.5 Other Topologies
3.1.6 Summary
3.2 WebRTC
3.3 Jitsi Meet

4 Last N: Relevance-Based Selectivity
4.1 Real-time Transport Protocol
4.2 The Selective Forwarding Unit Implementation
4.3 Endpoint Selection Based on Audio Activity
4.4 Dominant Speaker Identification
4.5 Performance Evaluation
4.5.1 Testbed
4.5.2 CPU Usage in a Full-star Conference
4.5.3 Full-star vs LastN
4.5.4 Video Pausing
4.6 Conclusion

5 Simulcast

5.1 Simulcast in WebRTC
5.1.1 Standardization
5.1.2 Simulcast in Jitsi Meet
5.2 Metrics
5.3 Preemptive Keyframe Requests
5.4 Results
5.4.1 Experimental Testbed
5.4.2 Peak Signal-to-Noise Ratio (PSNR)
5.4.3 Frame Rate
5.4.4 Stream Switching
5.4.5 Sender Load
5.4.6 Receiver Load
5.4.7 Server Load
5.5 Conclusion

6 A Hybrid Approach: Selective Forwarding Unit and Full-mesh
6.1 Implementation
6.1.1 Topology
6.1.2 Features
6.1.3 Dynamic Switching
6.2 Experimental Evaluation
6.2.1 Infrastructure Resource Use
6.2.2 Quality of Service
6.3 Conclusion

7 Performance Optimizations for Selective Forwarding Units
7.1 Introduction to Selective Video Routing
7.1.1 Receive Pipeline
7.1.2 Selective Forwarding
7.1.3 Send Pipeline
7.2 Optimizing Computational Performance
7.2.1 Evaluation and Optimizations for Simulcast
7.2.2 Profiling
7.3 Optimizing Memory Performance
7.3.1 Garbage Collection in Java
7.3.2 Introducing a Memory Pool
7.3.3 Evaluation
7.4 Conclusion
7.4.1 Further Research

8 Conferences with Cascaded Selective Forwarding Units
8.1 Considerations for Geo-located Video Conferences
8.2 Preliminary Analysis
8.2.1 Round Trip Time Between Servers
8.2.2 Round Trip Time Between Endpoints and Servers
8.2.3 Distribution of Users
8.2.4 Analysis
8.3 Implementation
8.3.1 Signaling
8.3.2 Selection Strategy
8.3.3 Server-to-server Transport: Octo
8.3.4 Packet Retransmissions
8.3.5 Simulcast
8.3.6 Dominant Speaker Identification
8.4 Results
8.5 Conclusion and Future Work

9 End-to-End Encrypted Conferences with Selective Forwarding Units
9.1 PERC Double Implementation
9.1.1 Original Header Block Extension
9.1.2 Frame Marking
9.1.3 Injecting Video Frames
9.1.4 Retransmissions with RTX
9.1.5 Terminating Bandwidth Estimation
9.1.6 RTP Timestamps
9.1.7 Signaling Double Encryption
9.2 Evaluation
9.2.1 CPU Performance
9.2.2 Overhead Analysis
9.3 Related Work
9.4 Conclusion and Future Work

10 Conclusions and Perspectives
10.1 Conclusion
10.2 Perspectives
10.2.1 Memory management
10.2.2 Simulcast, Scalable Video Coding, and Different Video Codecs
10.3 Cascaded Selective Forwarding Units

References

Appendices

A List of publications

B Patents
B.1 Dynamic Adaptation to Increased SFU Load by Disabling Video Streams
B.1.1 Algorithm
B.1.2 Example
B.2 Multiplexing Sessions in Telecommunications Equipment Using Interactive Connectivity Establishment
B.2.1 Example
B.3 Rapid Optimization of Media Stream Bitrate
B.3.1 Problem Description
B.3.2 Proposed Solution
B.3.3 Database
B.3.4 Signaling Server Procedures
B.3.5 Procedures
B.3.6 Media Server Procedures
B.3.7 Matching Entries in the Database
B.4 Selective Internal Forwarding in Conferences with Distributed Media Servers
B.4.1 Description
B.5 On Demand In- Signaling for Conferences
B.5.1 Example

List of Figures

3.1 The components of an audio or video real-time communication pipeline.
3.2 A three-endpoint conference using the most common topologies: full-mesh, MCU, and SFU.
3.3 A summary of the characteristics of the different topologies. SFUs and Cascaded SFUs have the potential to be very efficient if we can manage the network resources at the server and endpoints.
3.4 A screen shot from a 10-participant conference in the Jitsi Meet application. Most of the space is occupied by the active speaker, while the remaining speakers are displayed in "thumbnail" size. Some speakers have disabled their video, so an avatar is displayed instead.

4.1 The CPU usage as it relates to total bitrate in a full-star conference. The error bars represent one standard deviation. A clear linear correlation.
4.2 The outbound bitrate at the SFU, as the number of endpoints grows. Shown are four LastN configurations. The error bars represent one standard deviation.
4.3 The aggregate bitrate at the SFU for a fixed number of endpoints (30) and varying value of N. Compared are the two modes of LastN: with and without video pausing.
4.4 The CPU usage at the SFU for a fixed number of endpoints (30) and varying value of N. Compared are the two modes of LastN: with and without video pausing.

5.1 An illustration of the use of simulcast with an SFU. Note that the SFU rewrites the SSRC in order to produce a single continuous stream with SSRC=3 while switching between the input streams.
5.2 One of the frames from the sequence with a unique identifier encoded as a QR code in the upper left corner.
5.3 PSNR on stage for the duration of one input sequence (10 s, 302 frames). The large-scale variation is due to the specific input. Simulcast produces lower PSNR due to motion vector reuse.
5.4 Cumulative distribution function of the stream switching delay measured on the client.
5.5 Sender CPU usage. We have two parameters in this experiment: simulcast and rendering the local video. The error bars represent one standard deviation.
5.6 CPU usage at the receiver as it grows with the number of participants. The dashed lines show the best linear fit, and the error bars represent one standard deviation.
5.7 Network usage at the server as it grows with the number of participants. The dashed lines show the incoming bitrate, and the solid lines show the outgoing bitrate. The outgoing bitrate with simulcast grows quadratically, but with a small coefficient.
5.8 CPU usage on the server as the number of participants in the conference increases. The error bars represent one standard deviation.

6.1 The topology used for the SFU and peer-to-peer modes. The relays are optional and only used when a direct connection fails. The SFU connection is always maintained, while the P2P connection is only established when there are two endpoints.
6.2 The cumulative distribution of round-trip-time for direct vs. relayed connections. The dashed lines indicate the means. Direct calls have an overall lower RTT than relayed calls.
6.3 The cumulative distribution of connection establishment time for direct vs. relayed connections, further split by the endpoints' initiator/responder role. The dashed lines indicate the means. For all cases, direct calls have a slightly lower establishment time than relayed calls.

7.1 The primary task of an SFU is to route packets from a source endpoint to a set of destination endpoints. The accept() check implements selective forwarding.
7.2 CPU usage before and after the simulcast optimizations. The error bars represent one standard deviation.
7.3 The distribution of memory allocation requests by number of bytes. The group of small requests (<100 bytes) comes from audio packets. There is a spike at 220 coming from padding-only, fixed-size packets that some clients use for bandwidth probing. The majority of requests come from video packets, with a wide distribution of approximately 650 to 1240 bytes.
7.4 The distribution of stop-the-world pauses due to garbage collection using the default GC algorithm (a) and CMS with the memory pool (b). Minor GC events are shown in blue and Major GC events in red. CMS pauses less frequently and for much shorter periods. Cleaning the old generation causes a long pause for PGC, but not for CMS. We can also see a similarity in how the frequency changes with time, due to the conference pattern in the test scenario. The beginning (5 min) and ending (3 min) use two 8-endpoint conferences, causing higher load. The middle period uses more conferences with fewer endpoints, causing lower load.
7.5 Longest pause duration, total pause duration, and efficiency for the six different configurations tested.

8.1 Centralized conferences with three endpoints in different regions, with different servers selected. Endpoints are represented in green, and servers in red.
8.2 Conferences with cascaded servers where each endpoint is connected to a local server. Endpoints are represented in green, and servers in red, and server-to-server connections with a thick line and marked "S2S".
8.3 The Round Trip Time in milliseconds between the four AWS regions that we use.
8.4 The distribution of conference-time on our service by conference size.
8.5 The average RTT between users in Europe and our servers in Europe and the US, and the RTT between our two servers. The direct connection's latency is 14 ms lower.
8.6 The infrastructure used to enable cascaded SFUs. Each SFU connects to signaling servers in each region. The signaling servers can use any of the SFUs.
8.7 The retransmission algorithm on the SFU needs to be modified to avoid sending packets unnecessarily.
8.8 The round-trip-time from endpoints to their SFU in milliseconds, split by the endpoint's region, and combined.
8.9 The end-to-end round-trip-time between endpoints in milliseconds, split by the reporting endpoint's region, and combined.
8.10 A situation in which connecting endpoints to the nearest server is harmful. Each endpoint (in green) is equally close to the two servers (in red). Using both servers increases the end-to-end delay.

9.1 The PERC procedure for End-to-End and Hop-By-Hop encryption for RTP packets (below) and RTCP packets (above).

B.1 Using ICE multiplexing with SFUs in a video conferencing service. The set of SFUs have only private IP addresses, which they encode in their ICE username fragments.
B.2 The components and types of connections used for optimization of the initial bitrate.

List of Tables

7.1 The GC pauses, CPU usage, and bitrate for the seven different configurations.
8.1 The average RTT in milliseconds from users in a given region to a server in a given region. Vertically: user region; horizontally: server region.
9.1 The fields encoded in the Original Header Block header extension for different values of the length field and the R-bit.

Chapter 1

Résumé

The concept of a remote communication device incorporating both image and sound originated in the late 19th century, not long after the invention of the telephone1. But the technology to make it first possible and then practical took many decades to develop.

The first systems were built in the late 1920s and combined early television technology with a two-way telephone line [56]. Initially excessively bulky and impractical (a room full of equipment), with time came the reduction in size necessary for commercial viability.

Throughout the 1960s and 1970s, there were multiple telephone devices with video capability on the market, which worked with an analog signal over the telephone network. However, they failed to gain popularity or commercial success.

Advancements in video compression in the 1980s and 1990s led to significant progress. They made it possible to transport a digital signal over the existing computer networks [25]. This led to the development of telepresence systems, usually a dedicated room (sometimes with a dedicated network) equipped with multiple displays, cameras, and microphones. While fidelity increased, the price was still very high (upwards of $100,000). Despite the high cost, these systems had finally found their place in the market: large companies with the need for such rooms and the capital to purchase them.

1http://www.terramedia.co.uk/Chronomedia/years/Edison_Telephonoscope.htm

With the rapid improvement of computing devices and networks during the 1990s and early 2000s, it became increasingly feasible to use consumer-grade devices and the public internet for audio and video communication. Various technologies and standards, most notably SIP [66], were developed to facilitate telephone calls over the internet and to cross-connect them to the public telephone network, and the term "Voice over the Internet Protocol", or VoIP, was coined. While video capabilities were also available, they did not immediately gain popularity.

Online video communication applications started becoming more popular in the late 2000s. With the proliferation of powerful personal mobile devices around 2008, the hardware required for audio and video communication became accessible to a significant and growing portion of the world's population. This revealed a multitude of potential new applications.

Today, in 2019, video communication tools are an integral part of the daily lives of millions of people, both personally and professionally. There are many more millions with access to devices capable of video conferencing, that is, potential users in an untapped market. The technology has made enormous progress, but it is still a rapidly evolving field, and there are many ways to improve it. This great potential to affect a considerable number of people by advancing the technology is what ultimately makes video conferencing worth studying.

1.1 Applications of Video Conferencing

Video conferencing is becoming ubiquitous. As the technology develops, fidelity, reliability, cost, and accessibility improve, leading to new applications. Some of the applications video conferencing is currently used for are:

• Personal communication. People talking informally with their friends or family, sometimes in remote locations. Popular applications such as Messenger and WhatsApp are often used this way.

• Remote collaboration. Teams of geographically dispersed professionals collaborating on the same project [84]. This is becoming increasingly popular with software development teams, which hold scheduled or ad hoc remote meetings.

• Telemedicine. Providing remote healthcare services such as consultations with doctors and other medical professionals [37], patient monitoring, diagnostics, and others. This application was envisioned a long time ago2, but its unique reliability and privacy requirements make it very hard to adopt in practice. It has the potential to transform the way medical care is provided [42].

• Webinars. Interactive online presentations, with active participation from the attendees [90].

• Remote interviews. Conducting job interviews remotely.

• Personal productivity. Services aiming to increase productivity by pairing participants in video sessions3.

• Customer service. Providing customer service and support remotely.

• Consultations with professionals. Conducting remote consultations with experts in various fields.

• Remote education. Providing teaching and training remotely. Students can take part in an online class with a teacher [48].

• Business meetings. Scheduled company meetings between different offices. This is increasingly used both for private company "all hands" meetings and for public "earnings call" style meetings.

2The American Telemedicine Association was founded in 1993.
3https://www.focusmate.com/

1.2 Challenges and Motivations

Today people use a wide variety of devices capable of video conferencing. This includes personal desktop and laptop computers, mobile phones and tablets, smart TV devices, and dedicated conference rooms. These devices have very different characteristics, such as computing power, hardware codec support, cameras, microphones, and display sizes. They are also connected to a variety of networks, ranging from high-bandwidth, low-latency Ethernet links to constrained and unstable cellular networks, or even direct satellite links. This presents a challenge for applications, which need to use the available resources efficiently while adapting to changing conditions.

Real-time video transmission is widely known to require considerable computation and bandwidth; this is even more true for conferencing, when more than two endpoints are involved. Traditional video conferencing systems use a large amount of resources, either on their endpoint devices or offloaded to remote servers.

The development of more efficient architectures has a double effect. It serves users better, because it improves the quality of service on non-ideal networks, widening the potential pool of consumers. It also benefits service owners, because it reduces the cost of operating the service.

Recent developments in the field of video communication have led to the introduction of architectures based on Selective Forwarding Units (SFUs [20]). These systems have the potential to improve efficiency and offer new features, but they present their own challenges, requiring new approaches and solutions. This thesis is dedicated to the study of SFU-based systems, with a focus on making them more efficient, improving the quality of service, and reducing costs.

In the next section we describe the contributions of this thesis, before continuing with a review of the state of the art in the next chapter.

1.3 Contributions

This section presents the contributions of the thesis and the research it builds upon. To keep the description concise, we assume familiarity with some of the concepts described later in the text. Readers unfamiliar with the terminology can find an introduction to the related topics in Chapter 3.

The main goal of this thesis is to increase the efficiency of video conferencing systems while improving the overall user experience. We start by proposing an efficient algorithm which uses voice activity to select a subset of the endpoints in a conference. Forwarding video only for the endpoints in this set significantly reduces the bandwidth required in a conference, making it more efficient. Next, we present a thorough evaluation of the use of simulcast for video conferencing, which further reduces the required bandwidth. We then propose, analyze, and evaluate a hybrid of a peer-to-peer and a server-based architecture, which improves the user experience and reduces the cost of the service.

We also explore server performance in terms of CPU usage and memory management. We propose memory management optimizations aimed at reducing the pressure on the garbage collector. We develop a methodology for measuring the effects of application pauses due to garbage collection, and evaluate the proposed solution in different configurations. The results are positive and uncover areas for further research.

Next, we propose a distributed server approach with the goal of improving the users' connection quality by using servers in geographical proximity to each endpoint. We present a preliminary analysis, an implementation, and a randomized experiment measuring connection quality. The results show that, with regard to connection quality, the approach offers improvements in some cases, but in others the effects are negative. We propose further research to understand how to apply the technique optimally.

Another goal of the thesis is to explore how systems based on Selective Forwarding Units can be modified to support features available in other architectures. We propose an extension to the existing specifications for end-to-end encrypted conferences, which allows the use of simulcast and certain stream repair mechanisms. We implement a proof-of-concept system and show that the overhead is acceptable.

This thesis is partly based on the author's previous research, published in several peer-reviewed international journals and conferences. A full list of publications is available in Appendix A.

1.4 Thesis Structure

This manuscript is organized in ten chapters, and this first chapter serves as a short introduction. Chapter 3 presents current practice and the state of the art in video conferencing, and introduces the technology stack used throughout the thesis.

Starting with Chapter 4, each chapter presents one of the contributions of this thesis.

Chapters 4, 5, and 6 examine three different techniques for reducing the bandwidth requirements of a conference. Chapter 4 presents the LastN scheme for selecting a set of endpoints based on audio level indications. Chapter 5 details the use of simulcast for video conferences and presents a thorough experimental evaluation. Chapter 6 proposes a system which switches dynamically between a peer-to-peer and a server-based architecture.

Chapter 7 examines the performance of Selective Forwarding Unit implementations in terms of CPU usage and garbage collector pauses.

Chapter 8 presents the idea of using a set of distributed servers in order to improve the connection quality of the endpoints.

Chapter 9 proposes extensions to a system for end-to-end encrypted conferences in order to support simulcast and improve the stream repair mechanisms.

Chapter 10 concludes this thesis with final remarks and a discussion of some perspectives.

The thesis also includes two appendices. Appendix A lists the publications which form the basis of this thesis. Appendix B details several patents which resulted from the author's research.

1.5 Video Conferencing Architectures: State of the Art

In this thesis we are interested in a specific type of system for video conferencing: those which allow a group of people to communicate over the internet in real time. We use the terms "video conference" or simply "conference" interchangeably to refer to this specific type of conference. Consequently, systems which only allow two endpoints, or systems for live streaming which are not suitable for real-time communication, are not of immediate interest. More specifically, we are interested in video conferences with a particular architecture: those using a Selective Forwarding Unit (SFU). In the following sections we present the different topologies that have been used for multi-party conferences, including SFUs, and compare their characteristics.

Video conferencing systems can be classified according to how their endpoints connect, and the type of media processing performed by the server components (if any). In this section we discuss the most widely used architectures, in particular those based on Selective Forwarding Units, which are the subject of this thesis. We take a comparative approach and examine the characteristics of the different topologies which make them more or less suitable for use in an efficient and scalable system.

It is worth taking a moment to explain what we mean by efficient and scalable. We apply these terms to two different concepts: an individual conference, and a service as a whole. A conference is scalable when it can support a large number of endpoints. It is efficient when it provides a given experience to the users while using as few resources as possible. A service is scalable when it can support a large number of conferences. In some cases it is easy to make a service scalable, simply by adding more servers. A service is efficient when it supports a given number of conferences and users while using as few resources as possible. We want to be as efficient as possible considering the overall cost of the service, and the network and computing resources of both the clients and the service's devices.

See Chapter 3 for details.

1.5.1 Full-mesh Networks

If we have a solution for one-to-one audio and video communication, one way to realize a multipoint conference is to connect each pair of endpoints directly, in a full-mesh topology [46]. This is illustrated in Figure 3.2.a.

See Section 3.1.2 for details.

1.5.2 Multipoint Control Units

The second topology we examine uses a central server which mixes the content at the media level. These servers are known as Multipoint Control Units (MCUs [85]). This is illustrated in Figure 3.2.b.

MCUs completely terminate the media connections with the endpoints. They decode the streams received from all endpoints, and perform audio mixing and video compositing. For audio, they create a separate mix for each endpoint in order to exclude the audio coming from that endpoint itself. Similarly for video, they compose and encode one stream for each endpoint.

See Section 3.1.3 for details.

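To make the per-endpoint ("minus one") mixing idea concrete, the following sketch in Java (the language used for the server components discussed in this thesis) shows how an MCU could combine already-decoded audio frames into a mix that excludes the receiver's own audio. It is an illustration only, not an implementation used in this work; the class and method names are invented, and a real mixer additionally handles timing, resampling, codecs, and packet loss.

    /**
     * Toy illustration of per-endpoint audio mixing in an MCU: each endpoint
     * receives the sum of everyone else's decoded audio, so that it does not
     * hear itself. Real mixers are considerably more involved.
     */
    public class MinusOneMixer {
        /**
         * @param decodedFrames one decoded 16-bit PCM frame per endpoint
         * @param receiverIndex index of the endpoint the mix is produced for
         */
        public short[] mixFor(short[][] decodedFrames, int receiverIndex) {
            int frameLength = decodedFrames[0].length;
            short[] mix = new short[frameLength];
            for (int ep = 0; ep < decodedFrames.length; ep++) {
                if (ep == receiverIndex) continue;        // exclude own audio
                for (int i = 0; i < frameLength; i++) {
                    int sample = mix[i] + decodedFrames[ep][i];
                    // Clamp to the 16-bit range to avoid overflow.
                    mix[i] = (short) Math.max(Short.MIN_VALUE,
                                              Math.min(Short.MAX_VALUE, sample));
                }
            }
            return mix;
        }
    }

Because this mixing (and the corresponding video compositing) requires decoding and re-encoding every stream, it is the main source of the MCU's computational cost.
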
1.5.3 Selective Forwarding Units

The third topology we discuss also uses a central server with a star topology, but it uses routing instead of mixing/compositing. Figure 3.2.c illustrates it. The important idea which allows these systems to work efficiently is that of selective routing [20]: instead of forwarding all packets of a video stream, the server can split an encoded stream and produce a set of streams with different characteristics, without the need to re-encode. It then forwards only a selected subset of these streams to a receiver. We use the terms Selective Forwarding Unit (SFU) or Selective Forwarding Middlebox [85] to refer to this type of server.

This architecture is particularly advantageous. With no decoding, and therefore no jitter buffer required, the end-to-end latency is low. The processing required by the server is minimal and does not involve any manipulation at the media level. This increased efficiency of the servers with regard to CPU usage allows them to scale well.

See Section 3.1.4 for details.

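The core routing task of an SFU can be summarized with a small sketch. The Java code below is illustrative only (the interfaces do not come from Jitsi Videobridge): every packet received from one endpoint is offered to every other endpoint and forwarded only if that endpoint currently accepts it, which is where selective forwarding (the LastN policy, simulcast layer selection) plugs in.

    import java.util.List;

    /** Illustrative types; a real SFU also rewrites headers and re-encrypts. */
    interface Endpoint {
        boolean accepts(RtpPacket packet);   // the selective-forwarding decision
        void send(RtpPacket packet);         // rewrite, protect, and transmit
    }

    class RtpPacket { /* header fields and payload omitted in the sketch */ }

    class Router {
        void route(Endpoint source, RtpPacket packet, List<Endpoint> conference) {
            for (Endpoint destination : conference) {
                if (destination == source) continue;   // never echo back to the sender
                if (destination.accepts(packet)) {     // e.g. LastN or simulcast check
                    destination.send(packet);
                }
            }
        }
    }
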
1.5.4 Other Topologies

In addition to the three topologies discussed so far, there are many others which have been proposed or used. Some hybrid models forward the video while mixing the audio. In other models, some endpoints receive forwarded streams while others receive mixed/composited streams. Some serverless models use a centralized topology in which one of the endpoints serves as an MCU or SFU (this has been used by Jitsi and Skype [10]).

Furthermore, it is possible to switch dynamically between different topologies, using the one best suited to the given situation. In Chapter 6 we present a system which switches between a full mesh and an SFU based on the size of the conference (although we only use a full mesh when there are two participants).

1.6 WebRTC

Web Real-Time Communications (WebRTC) is an effort to bring high-quality, cross-platform, interoperable, peer-to-peer real-time audio, video, and data capabilities to web browsers without the use of plugins. It is a large inter-organizational effort which started in 2012 within the Internet Engineering Task Force (IETF). There are two groups of WebRTC standards: a set of JavaScript APIs implemented by web browsers and available to web applications, and a set of network protocols covering connection establishment, media transport, security, media formats, congestion control, and other aspects. Together, they serve as a basis for developers to build rich, interactive web applications with audio and video.

See Section 3.2 for details.

1.7 Jitsi Meet

The Jitsi project4 (at the time known as SIP Communicator) was started in 2003 by Emil Ivov during his studies at the Université Louis Pasteur in Strasbourg [35]. Initially only a SIP client, over the years it has evolved into a wide range of software components for real-time communication. The projects under Jitsi5 are open-source software licensed under the Apache 2 License.

As an open-source project, hundreds of people around the world have contributed to Jitsi over the years. However, there is a core development team which maintains all the resources and drives development. In 2015-2018 this team was employed and supported by Atlassian Plc, and since late 2018 it has been part of the company 8x8; the author is employed full-time as a software engineer on this team.

4See the home page at https://jitsi.org
5The projects are now hosted at: https://github.com/jitsi

See Section 3.3 for details.

1.8 Last N

Here we describe the first step in reducing the bandwidth required for a conference: forwarding video from only a selected subset of the endpoints. We propose a way to make this selection automatic, based on speaker activity, while keeping the selected content relevant to the users. We present an efficient algorithm for performing dominant speaker identification without having to decode the audio streams, making it suitable for use in SFUs. We detail the implementation of the proposed approach in the Jitsi Videobridge SFU. Finally, we perform experiments to validate the approach and quantify the improvements in terms of network resources and CPU usage. This work was originally published in [Pub5].

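As a simplified illustration of how a speaker can be selected without decoding the audio, the sketch below maintains a smoothed loudness value per endpoint from the audio levels carried in an RTP header extension (RFC 6464, where 0 is the loudest level and 127 is silence) and picks the loudest endpoint. The dominant speaker identification algorithm actually used in this work (Chapter 4) is considerably more robust; this only conveys the idea, and the class and method names are invented.

    import java.util.HashMap;
    import java.util.Map;

    /** Naive speaker selection from per-packet audio levels (no decoding). */
    public class NaiveSpeakerSelector {
        private static final double ALPHA = 0.05; // smoothing factor (assumed value)
        private final Map<String, Double> smoothedLevel = new HashMap<>();

        /** Feed the audio level carried in the header extension of one packet. */
        public void onAudioLevel(String endpointId, int levelDbov) {
            double loudness = 127 - levelDbov; // invert so that higher means louder
            smoothedLevel.merge(endpointId, loudness,
                    (old, cur) -> (1 - ALPHA) * old + ALPHA * cur);
        }

        /** Returns the endpoint with the loudest smoothed level, or null. */
        public String currentDominantSpeaker() {
            return smoothedLevel.entrySet().stream()
                    .max(Map.Entry.comparingByValue())
                    .map(Map.Entry::getKey)
                    .orElse(null);
        }
    }
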
1.8.1 Endpoint Selection Based on Audio Activity

Conferences deployed in a full-star topology do not scale well when the number of participants is substantial, because each endpoint connected to the conference server sends streams to, and receives streams from, every other endpoint.

We identified three problems for the receiving endpoints. First, the network requirements grow proportionally to the number of participants. Second, the CPU requirements grow in the same way, because the endpoint has to decrypt, decode, scale, and render all streams separately. Finally, the user interface cannot usefully display a large number of video streams, due to insufficient screen real estate.

The conference server also runs into CPU and network usage problems. This situation is even more problematic, because resource usage grows quadratically with the number of endpoints: the streams of each new endpoint have to be encrypted and transmitted independently to each other endpoint. For example, with 30 endpoints the server forwards each of the 30 streams to the other 29 endpoints, that is, 870 streams in total. Our evaluation in Section ?? shows that, depending on the hardware and the available network resources, either one (CPU or network) can be the bottleneck.

Here we propose LastN: a general scheme defining a policy for endpoint selection based on audio activity. It is used by a Selective Forwarding Unit to choose only a subset of the streams to forward at any given time, mitigating the previously identified limitations of the conference server.

See Chapter 4 for details.

1.8.2 Performance Evaluation

In this section we present the experimental results. We performed controlled experiments measuring resource usage on the clients and the server. See Section 4.5 for a detailed description of the experiments and the results.

In this chapter we propose a scheme (LastN) which uses an SFU to select which video streams to forward and which to drop. We present an implementation and evaluate it in terms of network usage and computing resources. We examine the effects of varying the number of forwarded video streams (N), and of the option to temporarily disable the unused streams. Our results show that the expected performance gains are achievable in practice. In particular, we observe a drop from quadratic to linear growth (with respect to the total number of endpoints) of the outbound bitrate used by the SFU. Moreover, for a fixed number of endpoints, tweaking the value of N results in corresponding changes in bitrate, which makes the system adaptable to different situations. Based on the results, we expect LastN to be useful in practice in two scenarios: allowing small and medium-sized conferences to run more efficiently (thus allowing a single server to handle more conferences), and increasing the maximum number of endpoints in a conference with the same resources.

LastN is quite general and allows various modifications and improvements on top of it. In particular, one interesting modification, which we intend to study, maintains a different value of N for each receiver, while keeping the global list L for the conference. In Chapter 7 we examine another use case for LastN: dynamic adaptation of the server configuration to the experienced load.

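To make the LastN selection policy concrete, here is a minimal sketch of the bookkeeping it involves: a list of endpoints ordered by how recently they were the dominant speaker, with video forwarded only for the first N. This is a simplification of the behavior described in Chapter 4, not the Jitsi Videobridge code; names are illustrative.

    import java.util.ArrayDeque;
    import java.util.Deque;

    /** Video is forwarded only for the N endpoints that spoke most recently. */
    public class LastNController {
        private final int n;
        private final Deque<String> speakerOrder = new ArrayDeque<>(); // most recent first

        public LastNController(int n) { this.n = n; }

        /** Called when the dominant-speaker detector elects a new speaker. */
        public synchronized void onDominantSpeakerChanged(String endpointId) {
            speakerOrder.remove(endpointId);   // drop the old position, if any
            speakerOrder.addFirst(endpointId); // promote to the front of the list
        }

        /** Decides whether video from this endpoint should be forwarded. */
        public synchronized boolean isForwarded(String endpointId) {
            int i = 0;
            for (String id : speakerOrder) {
                if (i++ >= n) break;
                if (id.equals(endpointId)) return true;
            }
            return false;
        }
    }
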
1.9 Simulcast

The LastN scheme described in the previous chapter significantly reduces the bandwidth required for a conference, but it does not solve all problems. Used on its own, this solution has a few drawbacks. First, it drops some streams completely, even when the available bandwidth would be sufficient for a lower-quality stream which would still benefit the user. Second, it is inefficient in many use cases, because it only forwards streams at full resolution, which the client may then scale down due to user interface constraints. Third, because it only uses the high-bitrate, high-resolution streams, there is little flexibility in the SFU's ability to adapt to changes in the available bandwidth: entire streams have to be dropped or resumed.

Conferencing systems typically use windows of different sizes for different endpoints, as shown in Figure 3.4. Using LastN alone is not optimal in such cases, because the same result could be achieved with fewer resources if the streams displayed in a small window were scaled down before being encoded. By scaling down the streams displayed in a small window we can achieve a lower overall bitrate; or, equivalently, we can achieve the same overall bitrate with a larger value of N, which means that more streams can be displayed.

Scaling the streams sent over the network to match the windows displaying them is not straightforward, for several reasons. First, the window sizes can vary from endpoint to endpoint. Endpoints may have displays of varying sizes, they may have selected different streams to show in their large display window (notably, if all endpoints display the current speaker, the active speaker itself will likely be showing a different endpoint), or they may use a completely different user interface layout. These situations are especially likely to occur when the conference has a mix of participants using mobile and desktop devices. Second, the configuration can change at any time, for example when the speaker changes. Adapting to this would require the SFU to control the encoding bitrate of the senders, which is complex and introduces delay. Moreover, it does not account for the different requirements of each receiver.

One way to solve these problems is to use simulcast. Simulcast consists of simultaneously encoding and transmitting the video stream of each endpoint using different resolutions and bitrates. This mechanism allows an SFU to dynamically change the output bitrate, by switching to a different encoding of the stream, in response to changes in network conditions, and it allows for more flexible congestion control.

In this chapter we perform an analysis and an experimental evaluation of the use of simulcast in a WebRTC-based conferencing system. We evaluate the resource usage of the sending and receiving devices and of the server components, as well as the end-to-end image quality. We also explore the problem of the change in image quality that users experience when a switch between different encodings occurs, and we propose a method for reducing the delay before the switch.

See Chapter 5 for full details.

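The following sketch illustrates the kind of decision an SFU makes with simulcast: for each receiver, pick the highest encoding of a sender that fits both the estimated available bandwidth and the size at which the receiver displays the stream. The encodings, bitrates, and the selection rule are illustrative assumptions, not the values or the algorithm used by Jitsi Meet.

    /** Picks one of a sender's simulcast encodings for a given receiver. */
    public class SimulcastSelector {
        enum Encoding {
            LOW(180, 150_000), MEDIUM(360, 500_000), HIGH(720, 1_500_000);

            final int height;        // encoded height in pixels
            final long bitrateBps;   // approximate bitrate of the encoding
            Encoding(int height, long bitrateBps) {
                this.height = height;
                this.bitrateBps = bitrateBps;
            }
        }

        /** Highest encoding that fits both the bandwidth and the display size. */
        public Encoding select(long availableBps, int displayHeightPx) {
            Encoding best = Encoding.LOW; // always keep at least the lowest layer
            for (Encoding e : Encoding.values()) {
                // Do not send a layer much larger than the rendering window.
                if (e.bitrateBps <= availableBps && e.height <= displayHeightPx * 2) {
                    best = e;
                }
            }
            return best;
        }
    }

Because each receiver can be given a different encoding, the SFU can react to a drop in a receiver's bandwidth simply by switching that receiver to a lower encoding, without touching the sender or the other receivers.
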
1.10 A Hybrid Approach: Selective Forwarding Unit and Full-mesh

The use of LastN and simulcast significantly reduces the bandwidth required for a conference in an SFU-based system. However, there are two characteristics for which the full-mesh model still outperforms the SFU model. The first is latency, which is lower because the endpoints can connect directly. The second is the network traffic at the servers, because they are not always used.

By far the most important parameter affecting resource usage is the size of a conference. Large conferences are not possible with a full mesh, though small ones are, and they can perform better than with an SFU. Therefore, a service can benefit if we choose the architecture based on the size of the conference. The simplest way to achieve this is to create the conference with the appropriate configuration from the start. In practice, however, the size of the conference is rarely known in advance. Furthermore, the size of the conference can change over time, so the best option may involve using different architectures during different parts of a conference. For these reasons, we chose to switch the architecture dynamically during a conference. Specifically, we use a full-mesh architecture for conferences of size P or less, and switch to an SFU when the size reaches P + 1. Conversely, we switch back to the full mesh when an endpoint leaves and the size drops to P.

We added support for switching to a full mesh in the Jitsi Meet system. Because it only supports P = 2, we call this feature "peer-to-peer for one-to-one"; for brevity, we use "P2P" instead of "peer-to-peer". The implementation is detailed in Section 6.1. We then enabled the feature on the meet.jit.si service and compared its performance to the SFU-only mode in terms of quality of service for the users and server traffic. We present the results in Section ??.

To do this, we start with an SFU connection and maintain it at all times. We switch to the P2P connection opportunistically, when we know that there are exactly two endpoints in the conference and that a P2P connection between them exists. Figure ?? illustrates this. Starting the conference with the SFU connection also allows us to minimize the start-up time, because in the majority of cases the connection is established using local ICE candidates and does not have to wait for the gathering of STUN or TURN candidates.

To quantify the effects of the P2P mode, we performed experiments and measurements on the meet.jit.si service.

Our results quantify a significant reduction in the total cost of the infrastructure. Furthermore, on average, direct calls have significantly lower latency. Throughput efficiency is severely degraded when the call is routed through an SFU, because of the use of features specific to the group mode, such as simulcast and RTCP termination. Thanks to the use of the GCC algorithm, packet loss is negligible regardless of the topology. Call establishment time remains practically the same for direct and server-relayed calls.

See Chapter 6 for more details.

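A minimal sketch of the switching rule, under the assumptions described above (the SFU connection is kept at all times and P = 2), could look as follows. It is illustrative only and omits signaling, ICE, and failure handling; the class and method names are invented.

    /** Switches between the P2P and SFU media paths based on conference size. */
    public class TopologyController {
        private static final int P = 2;     // largest size handled peer-to-peer
        private boolean p2pActive = false;

        /** Called whenever an endpoint joins or leaves the conference. */
        public void onConferenceSizeChanged(int size, boolean p2pConnectionReady) {
            boolean shouldUseP2p = size <= P && p2pConnectionReady;
            if (shouldUseP2p && !p2pActive) {
                p2pActive = true;   // route media directly, stop using the SFU path
            } else if (!shouldUseP2p && p2pActive) {
                p2pActive = false;  // fall back to the always-available SFU path
            }
        }

        public boolean isP2pActive() { return p2pActive; }
    }
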
1.11 Performance Optimizations for SFUs

When it comes to video conferencing servers, Selective Forwarding Units are generally expected to offer better computational efficiency than MCUs, because they neither decode nor encode the audio or video. However, their good performance should not be taken for granted, as it depends on the specific software implementation.

An SFU is a complex software system which has to be designed with performance as a goal. Poor performance can lead to a degraded user experience, as well as to inefficiency and higher service operating costs.

In this chapter we explore the factors affecting performance and propose ways to improve it. We start by describing the function of an SFU, and of Jitsi Videobridge in particular, in Section 7.1. In the next section (7.2) we propose an efficient algorithm for selective forwarding, and we evaluate its effects on CPU usage. Finally, in Section 7.3 we explore the topic of memory management, in particular in systems with automatic garbage collection. We propose a technique for reducing the strain on the garbage collector, develop a methodology for evaluating its effects, and perform experiments to find the most suitable configuration to use for an SFU.

1.11.1 Optimizing Computational Performance

In Chapter 4 we observed that, without simulcast, CPU usage is proportional to the total bitrate. With the addition of simulcast, however, there is an increased potential for CPU usage not to scale linearly: some encodings are received and processed (contributing to CPU usage) but are not forwarded (and do not contribute to the network bitrate).

While it is not possible to completely eliminate the processing of these encodings, we found significant performance improvements when applying two optimizations. First, if an encoding is not selected by any receiving endpoint, we do not perform decryption of the payload (though we still perform the authentication step). We can do this because the information in the headers is not encrypted. The second is the implementation of the selective forwarding algorithm (see Section 7.1.2) in place of the naive approach to packet processing.

See Section 7.2 for details.

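The first optimization can be sketched as an early check in the receive pipeline: authenticate every SRTP packet, but skip payload decryption when no receiver has selected the encoding the packet belongs to. The code below is a simplified illustration with invented types, not the Jitsi Videobridge pipeline.

    /** One step of a receive pipeline that avoids unnecessary decryption. */
    class OptimizedReceiveStep {
        /** Returns true if the packet should continue down the pipeline. */
        boolean process(Packet packet, SrtpContext srtp, ForwardingPlan plan) {
            if (!srtp.authenticate(packet)) {
                return false;                   // the integrity check always runs
            }
            if (!plan.isForwardedToAnyReceiver(packet.ssrc())) {
                return false;                   // drop without decrypting the payload
            }
            srtp.decryptPayload(packet);        // pay the decryption cost only here
            return true;
        }
    }

    // Minimal supporting types for the sketch.
    class Packet { long ssrc() { return 0L; } }
    class SrtpContext {
        boolean authenticate(Packet p) { return true; }
        void decryptPayload(Packet p) { /* no-op in the sketch */ }
    }
    class ForwardingPlan { boolean isForwardedToAnyReceiver(long ssrc) { return false; } }
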
1.11.2 Optimizing Memory Performance

Selective Forwarding Units do not require a large amount of memory, but their memory allocation pattern can be problematic when combined with automatic garbage collection. The high allocation rate results in frequent activation of the GC, which can freeze the application for a significant amount of time with GC algorithms optimized for low CPU overhead, such as the default algorithms shipped with the Java Development Kit.

We proposed using a separate memory pool, implemented in the Java application, to reduce the strain on the garbage collector. We then developed a methodology for evaluating the performance of different configurations of the garbage collector and the memory pool. We performed experiments using the default GC algorithm in the JDK, as well as CMS and G1, and quantified their performance for the Jitsi Videobridge application.

We found that the best performance is obtained using the CMS algorithm together with our proposed memory pool. It leads to acceptable performance, defined as better than Jitsi Videobridge version 1.0 using the default GC configuration. The longest stop-the-world pause (the most important metric) dropped from 384 to 82 milliseconds. This trade-off came at the cost of an 11% drop in efficiency. The memory pool improved all metrics in all tests.

See Section 7.3 for details.

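The idea of the memory pool can be illustrated with a few lines of Java: reuse fixed-size packet buffers instead of allocating a new byte array for every packet, so that fewer short-lived objects reach the garbage collector. The sketch below is intentionally minimal (a single buffer size and a bounded queue); the pool evaluated in Section 7.3 is more elaborate, and the constants are assumptions.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    /** Minimal application-level buffer pool to reduce GC pressure. */
    public class BufferPool {
        private static final int BUFFER_SIZE = 1500;   // roughly one Ethernet MTU
        private final BlockingQueue<byte[]> pool;

        public BufferPool(int capacity) {
            pool = new ArrayBlockingQueue<>(capacity);
        }

        /** Get a buffer from the pool, or allocate one if the pool is empty. */
        public byte[] acquire() {
            byte[] buf = pool.poll();
            return buf != null ? buf : new byte[BUFFER_SIZE];
        }

        /** Return a buffer; if the pool is full, it is simply left to the GC. */
        public void release(byte[] buf) {
            if (buf.length == BUFFER_SIZE) {
                pool.offer(buf);
            }
        }
    }
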
1.12 Conferences with Cascaded SFUs

So far we have discussed conferences which use a Selective Forwarding Unit in a centralized (star) topology. In this chapter we explore the possibility of using a set of interconnected SFUs in a conference. There are two motivations for this: it could allow conferences to scale to a larger size, and it could improve the users' connection quality if they connect to a nearby server. We focus on the second goal, but we also take the first into account in the design of the system.

Modern cloud services such as Amazon AWS6 make it easy to deploy a system comprising servers in different geographic areas, and they provide well-developed tools, such as Route537, to route users to a server close to them. However, they are primarily designed for web applications and the scenario in which a consumer accesses a certain resource in the cloud.

Video conferencing has the potential to benefit greatly from geo-location features, because it is interactive in nature and very sensitive to network conditions such as bandwidth, packet loss, round-trip time (RTT), and variation in RTT (jitter). However, this case is not as simple as connecting a single consumer to the nearest server, because it requires the interconnection of two or more endpoints, and the problem of optimal selection of the server locations is not well explored.

The use of cascaded SFUs has the potential to improve the user experience, as evidenced by the reduced RTT to the local SFU, and by the end-to-end RTT in the US West region. However, the selection strategy of connecting each endpoint to an SFU in its local region has the unintended side effect of increasing the overall endpoint-to-endpoint delay.

We propose a model for the use of cascaded Selective Forwarding Units and present an analysis and an experimental evaluation. See Chapter 8 for details.

6https://aws.amazon.com
7https://aws.amazon.com/route53/

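A sketch of the selection strategy discussed above is shown below: each endpoint is assigned an SFU from its own region, and endpoints of the same conference in the same region share the same SFU, so that at most one SFU per region participates in a conference. The region names, the fallback rule, and the hash-based assignment are illustrative assumptions, not the exact logic deployed on the service.

    import java.util.List;
    import java.util.Map;

    /** Assigns an SFU to an endpoint, preferring the endpoint's own region. */
    public class RegionSelector {
        private final Map<String, List<String>> sfusByRegion;

        public RegionSelector(Map<String, List<String>> sfusByRegion) {
            this.sfusByRegion = sfusByRegion;   // assumed non-empty
        }

        public String selectSfu(String endpointRegion, String conferenceId) {
            List<String> candidates = sfusByRegion.getOrDefault(
                    endpointRegion,
                    sfusByRegion.values().iterator().next()); // arbitrary fallback region
            // Stable assignment: all endpoints of a conference in the same region
            // hash onto the same SFU, so one SFU per region serves the conference.
            int index = Math.floorMod(conferenceId.hashCode(), candidates.size());
            return candidates.get(index);
        }
    }
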
16 1.13 Conférences chiffrées de bout en bout avec les SFUs

Les institutions financières ont besoin d’une sécurité renforcée pour protéger leurs act- ifs. Historiquement, cela a été réalisé grâce à des canaux de communication internes ou des lignes physiques dédiées avec des partenaires de confiance. En interne, haut niveau le chiffrement, couplé à une gestion intelligente des clés de sécurité ajoute la dernière toucher à des systèmes aussi sécurisés que vous le pouvez.

This approach has two drawbacks. Establishing communication channels between institutions takes a lot of time and money. Moreover, it is difficult or even impossible to leverage the power of cloud infrastructure and Platform-as-a-Service (PaaS) providers, because it is unacceptable for data to leave the domain controlled by the institution.

In this chapter, we study the feasibility of implementing the PERC double encryption specifications ([38]) in conjunction with WebRTC. We find that the system does not support simulcast as currently used in WebRTC, and we propose modifications that enable simulcast support.

We implemented a proof-of-concept system by modifying the Jitsi Videobridge SFU and the Chromium browser, using the unmodified Jitsi Meet web application. The system implements the proposals described above and uses end-to-end encryption keys that are pre-generated and shared among the endpoints. At this stage we evaluate only the performance characteristics and the feasibility of the "double" encryption system, not the key exchange and verification aspects of a complete system.

We measured CPU usage and analysed the overhead incurred due to the additional header extensions that we propose. For the intended use case, in which end-to-end encryption is required for compliance reasons, the additional bandwidth is entirely acceptable. See Chapter 9 for details.


Chapter 2

Introduction

The concept of a remote communication device incorporating both image and sound originated back in the late 19th century, not long after the invention of the telephone1. But the technology to make it first possible and then practical took many decades to develop.

The first systems were built in the late 1920s and combined early television technology with a two-way telephone line [56]. Initially they were excessively bulky and impractical (a room full of equipment), but with time came the reduction in size necessary for commercial viability.

Throughout the 1960s and 1970s, there were multiple telephone devices with video capability on the market, which worked with an analog signal over the telephone network. However, they failed to gain popularity or commercial success.

Advancements in video compression in the 1980s and 1990s led to significant progress. They made it possible to transport a digital signal over existing computer networks [25]. This led to the development of telepresence systems, usually a dedicated room (sometimes with a dedicated network) equipped with multiple displays, cameras, and microphones. While fidelity increased, the price was still very high (upwards of $100,000). Despite the high cost, these systems had finally found their place in the market: large companies with the need for such rooms and the capital to purchase them.

With the rapid improvement of computing devices and networks during the 1990s and early 2000s, it became increasingly feasible to utilize consumer-grade devices and public internet for audio and video communication. Various technologies and standards,

1http://www.terramedia.co.uk/Chronomedia/years/Edison_Telephonoscope.htm

most notably SIP [66], were developed to facilitate telephone calls over the internet and to cross-connect them to the public telephone network, and the term "Voice over the Internet Protocol", or VoIP, was coined. While video capabilities were also available, they did not immediately gain popularity.

Applications for online video communication started to become more popular in the late 2000s, most notably with the rise of Skype. With the proliferation of powerful personal mobile devices circa 2008, the hardware required for audio and video communication became available to a significant and increasing portion of the world's population. This uncovered a host of potential new applications.

Today, in 2019, video communication tools are an integral part of the daily lives of millions of people, both personally and professionally. There are still many more millions with access to devices capable of video conferencing; that is, potential users in an unexplored market. The technology has made tremendous progress, but it is still an actively evolving area, and there are many ways it can be improved. This great potential to affect a considerable number of people by advancing the technology is what ultimately makes studying video conferencing worthwhile.

2.1 Applications of Video Conferencing

Video conferencing is becoming ubiquitous. As the technology develops, the fidelity, reliability, cost and accessibility improve, leading to new applications. Some of the applications that video conferencing is currently used for are:

• Personal communication. People talking informally with their friends or family, sometimes in remote locations. Popular applications such as WhatsApp are often used this way.

• Remote collaboration. Geographically dispersed teams of professionals collaborating on the same project [84]. This is becoming more and more popular for software development teams, conducting scheduled or ad-hoc remote meetings.

• Telemedicine. Providing remote health care services such as consultation with physicians and other medical professionals [37], patient monitoring, diagnostics, and others. This application was envisioned a long time ago2, but its unique reliability requirements make it very challenging to adopt in practice. It has the potential to transform the way medical care is provided [42].

2The American Telemedicine Association was founded in 1993.

• Webinars. Interactive online presentations, with active participation from attendees [90].

• Remote interviewing. Conducting job interviews remotely.

• Personal productivity. Services aiming to increase productivity by pairing participants in video sessions3.

• Customer service. Providing customer service and support remotely.

• Consultations with professionals. Conducting consultations with experts in various fields remotely.

• Remote education. Providing teaching and training remotely. Students can participate in an online classroom with a teacher [48].

• Business meetings. Scheduled company meetings between different offices. This is increasingly used for both company private “all hands” meetings, as well as public “earnings call” type meetings.

2.2 Challenges and Motivations

Today, people use a wide variety of devices capable of video conferencing. This includes personal desktop and laptop computers, mobile phones and tablets, smart television devices, and dedicated conferencing rooms. These devices have very different characteristics such as computing power, hardware codec support, cameras, microphones, and display sizes. They are also connected to diverse networks ranging from high-bandwidth and low-latency Ethernet links, to limited and unstable cellular networks or even direct satellite links. This presents a challenge for applications that have to use the available resources efficiently while also adapting to changing conditions.

Real-time video transmission is widely known to require considerable computation and bandwidth. This is even more so for conferencing, when more than two endpoints are involved. Traditional video conferencing systems consume substantial resources, either on the endpoint devices themselves or offloaded to remote servers.

Developing more efficient architectures has a two-fold effect. It serves users better because it improves the service quality in non-ideal networks, widening the potential pool of consumers. It is also beneficial for service owners as it lowers the service operating cost.

3https://www.focusmate.com/

Recent developments in the space of video communication led to the introduction of architectures based on Selective Forwarding Units (SFUs [20]). These systems have the potential to significantly improve efficiency and offer new functionality, but they present challenges of their own, requiring new approaches and solutions. This thesis is dedicated to the study of SFU systems, with a focus on making them more efficient, improving the quality of service, and lowering the costs.

In the following section we outline the contributions of this thesis, before proceeding with a review of the state of the art in the next chapter.

2.3 Contributions

This section outlines the contributions of the thesis and the research it is based on. To keep the description concise, we assume familiarity with some of the concepts described later in the text. Readers unfamiliar with the terminology can find an introduction to the related subjects in Chapter 3.

The main goal of this dissertation is to increase the efficiency of video conferencing systems while also improving the overall user experience. We begin by proposing an efficient algorithm that uses speech activity to select a subset of endpoints in a conference. Forwarding video only for endpoints in this set significantly reduces the required bandwidth in a conference, making it more efficient. Next, we present a thorough evaluation of the use of simulcast for video conferencing, which further reduces the required bandwidth. Then, we propose, analyse, and evaluate a hybrid of a peer-to-peer and server-based architecture, which improves user experience and reduces service cost.

We also explore server performance in terms of CPU usage and memory management. We propose memory management optimizations aiming to reduce the strain on the garbage collector. We develop a methodology to measure the effects of application pauses due to garbage collection, and evaluate the proposed solution in different configurations. The results are positive and uncover areas for further research.

Next, we propose a distributed server approach with the goal of improving user connection quality by using servers in close geographical proximity to each endpoint. We present an initial analysis, an implementation, and a randomized experiment measuring connection quality. The results show that, with regard to connection quality, the approach offers improvements in some cases, but overall the effects are negative. We propose further research to understand how to apply the technique optimally.

Another goal of the thesis is to explore how systems based on Selective Forwarding Units can be modified to support features available in other architectures. We propose an extension of the existing specifications for end-to-end encrypted conferences, enabling the use of simulcast and certain stream repair mechanisms. We implement a proof-of-concept system and show acceptable overhead.

This thesis is in part based on previous research by the author, published in several peer-reviewed international journals and conferences. A full list of publications is available in Appendix A.

2.4 Thesis Structure

This manuscript is organized into nine chapters. This first chapter serves as a short introduction. Chapter 3 presents the current practices and state of the art in video conferencing, and introduces the technology stack used throughout the thesis.

Beginning with Chapter 4, each chapter presents one of the contributions of this thesis.

Chapters 4, 5, and 6 examine three different techniques for lowering the bandwidth requirements of a conference. Chapter 4 introduces the LastN scheme for selecting a set of endpoints based on audio level indications. Chapter 5 details the use of simulcast for video conferences and presents a thorough experimental evaluation. Chapter 6 proposes a system that dynamically switches between a peer-to-peer and server-based architecture.

Chapter 7 examines the performance of Selective Forwarding Unit implementations in terms of CPU usage and garbage collector pauses.

Chapter 8 introduces the idea of using a set of geographically distributed servers in order to improve endpoints’ connection quality.

Chapter 9 proposes extensions to a system for end-to-end encrypted conferences in order to support simulcast and to improve the mechanisms for stream repair.

Chapter 10 concludes this thesis by presenting concluding remarks and discussing some perspectives.

The thesis also includes two appendices. Appendix A lists the author’s publications, which are the basis of this thesis. Appendix B details several patents resulting from the author’s research.

Chapter 3

Video Conferencing Architectures: State of the Art

In this thesis, we are interested in a specific type of system for video conferencing — those which allow a group of people to communicate over the internet in real-time. We use the terms "video conference" or "conference" interchangeably to refer to this specific type of conference. Therefore, systems that only allow for two endpoints, or systems for live-streaming which are unsuitable for real-time communication, are not of immediate interest. More specifically, we are interested in video conferences with a particular architecture — those using a Selective Forwarding Unit (SFU). In the next section we introduce the different topologies that have been used for multiparty conferencing, including SFUs, and compare their characteristics.

3.1 Video Conference Topologies

Video conferencing systems can be categorized by the way their endpoints connect and the type of media processing performed by server components (if any). In this section we discuss the most commonly used architectures, particularly those based on Selective Forwarding Units, which is the focus of this thesis. We take a comparative approach and examine the characteristics of the different topologies that make them more or less suitable for use in an efficient and scalable system.

It is worth taking a moment to explain what we mean by efficient and scalable. We apply those terms to two different concepts: an individual conference and a service as a whole. A conference is scalable when it can support a large number of endpoints.

It is efficient when it provides a given experience to the users using as few resources as possible. A service is scalable when it can support a large number of conferences. In some cases it is easy to make a service scalable, simply by adding more servers. A service is efficient when it supports a given number of conferences and users while using as few resources as possible. We aim to be as efficient as possible by considering the overall cost of the service, and the network and computing resources of both client and server devices.

When comparing different topologies we consider the following:

• The latency, by which we mean the end-to-end delay that people experience when they use the system. Latency is critically important; if it is too high, users experience problems communicating. For conversational speech, the International Telecommunication Union recommends not exceeding a one-way delay of 400ms; however, some users are dissatisfied even with a one-way delay of 300ms [32].

• The amount of resources required for a conference. There are computing resources and network resources, and we look at them separately for servers and for client endpoints.

• The flexibility of receivers to customize the user interface (UI). For example, controlling the volume of individual audio streams, rendering interactive UI elements on top of individual video streams such as audio level indications, tooltips with additional information, etc. Also, the ability to control which remote participants are visible as well as the size of their viewports. For brevity, we refer to this concept as UI flexibility.

• The possibility of end-to-end encrypted conferences. Encrypting all multimedia in such a way that only the participating endpoints, but not any intermediary servers, can decrypt it is another topic of interest of this thesis. Chapter 9 explores this topic.

In the next section, we introduce the way real-time video communication works in the point-to-point case, which is a building block for the multipoint (or conference) case. In the following sections, we introduce the main topologies used for conferencing and discuss their characteristics.

3.1.1 One-to-one Communication

In a system for two-way communication between two endpoints, both endpoints simultaneously perform the functions of a receiver (receiving and interpreting a signal from

the remote endpoint) and a sender (sending a signal to the remote endpoint). However, it is often useful to consider the two roles separately, because this reduces the complexity. Here we consider the one-way communication between a sending endpoint and a receiving endpoint, with the understanding that the complete system also uses the same ideas with the roles reversed to allow for two-way communication. We utilize this convention throughout the thesis without explicitly stating so.

Virtually all modern solutions for audio and/or video communication use the following components [24]: a media source, an encoder, a network layer, a jitter buffer, a decoder, and a renderer (see figure 3.1).

Figure 3.1: The components of an audio or video real-time communication pipeline.

The media source is usually a camera or a microphone. It produces the digital input signal for the encoder in the form of a stream of raw (uncompressed) frames 1. This can also be a virtual device, combining input from other sources, or applying filters or other transformations. For our purposes we consider any pre-encoding transformations, for example de-noise filters and rescaling, to be part of the media source.

The stream of frames coming from the source has a very high bitrate2, making it unsuitable for transfer over a network uncompressed. This is where the encoder comes in.

1For video, a frame constitutes a single image, while for audio a frame is a group of consecutive samples with a given duration.
2A typical video stream with 1280x720 pixels at 30 frames per second in YUV420 format takes 166Mbps, while a typical audio stream with a sample rate of 48 kHz in PCM format with 16-bit samples takes 768 Kbps.

The encoder takes as input a stream of raw frames and produces a stream of compressed frames. There are many different encoding standards, such as the VP8 [9] and H264 [34] codecs3 for video, and Opus [80] and G711 [33] for audio.

In general there are two types of codecs – lossless and lossy. With lossless compression, the decoder can reconstruct an exact copy of the original signal. In this case there is a trade-off between two variables – the encoding complexity (i.e., how much computation the encoding process takes), and the compression rate (i.e., the ratio of the uncompressed size to the compressed size). This trade-off varies between different codecs, and most encoders can be configured with different complexity settings, but in general more computation yields a higher compression rate.

With lossy compression the decoder is not able to reconstruct the original signal, but only an approximation of it. This allows audio and video codecs to achieve a much higher compression ratio at the cost of increased error (the difference between the original and reconstructed signal), adding a third variable to the trade-off. Lossy audio and video codecs are designed to maximize the perceptual quality (that is, how a human observer perceives the difference), at a given complexity and compression rate. Most widely used codecs can be configured with different complexity settings and a specific target bitrate for the stream, which effectively controls the compression rate. Lossy codecs can achieve perceptually lossless compression at a much lower bitrate than truly lossless codecs [87].

Real-time communication systems use lossy codecs4, because they generally produce a lower bitrate and the target bitrate can be re-configured dynamically during a session.

The process of encoding tends to be very CPU intensive, especially for video. This remains true even with the extremely powerful processors of today. This is due in part to ever increasing resolutions and frame rates, and because modern video codecs are specifically designed to take advantage of the available computing resources to achieve higher perceived quality and compression ratio [13].

The network layer transports the encoded stream from the sending endpoint to the receiving endpoint. It has all of the usual limitations of computer networks: limited bandwidth, latency, packet-to-packet jitter and packet loss. Any of these conditions may change at any time during a session due to changes in the load of the network or in the routing configuration.

3The word codec is a portmanteau of coder-decoder. 4There are rare exceptions [22].

Reliable transport protocols such as TCP [59] are sometimes used, though not preferred, because unless the round-trip time (RTT) is very low, the latency they introduce hinders real-time communication [14]. State-of-the-art systems use protocols on top of UDP [58] and specialized algorithms for stream repair based on explicit retransmission requests for some packets and forward error correction [30]. For congestion control they use different mechanisms to produce an estimation of the available bandwidth, which is then used to configure an appropriate target bitrate for the encoder.

The jitter buffer is used to deal with the packet-to-packet jitter (i.e. difference in transit time of individual packets) introduced in the network layer in order to ensure smooth playout. We consider the jitter buffer a separate component instead of part of the network layer because it is closely associated with the decoding process and only needed when decoding is performed. Some jitter buffers use knowledge about the specific codec to decide whether to wait until a missing frame is complete (all packets are received), drop a missing frame because it is not strictly needed, or indicate to the decoder that Packet Loss Concealment (PLC) needs to be used [89].

To accommodate packets which were lost in the network, then reported lost by the receiver, and then retransmitted by the sender, the jitter buffer may need to be longer than one RTT. In order to not always incur this extra latency, state-of-the-art systems use adaptive jitter buffers, which change their size according to network conditions [24]. Still, the jitter buffer always adds some extra delay [7].

The decoder essentially undoes what the encoder does. It takes as input a stream of compressed frames and produces a stream of raw frames. With lossy codecs (and because the jitter buffer may drop some of the encoded frames), the decoded stream is not exactly the same as the pre-encoded stream.

The process of decoding is usually much faster than that of encoding [13], but depending on the receiving endpoint's characteristics its cost may still be significant. For instance, even if the endpoint's processor is fast enough to decode the stream, doing so may lead to higher energy use, which is undesirable especially for devices running on battery power.

The renderer consumes the raw stream. This may mean displaying video on a screen or part of a screen, or playing audio through a speaker.

Each of these components has some characteristics which are important for the design of a conference system. In summary, the most important things to consider are:

• The encoding process is costly and we need to make sure that encoding devices are not overloaded.

• The network has limited and changing bandwidth, and the system needs to adapt to it and use it efficiently.

• All components, particularly the network and jitter buffer, add to the end-to-end latency. We should always strive to minimize latency.

Now that we’ve covered the basics of how one-to-one communication works, in the next section we introduce the first of the multi-point topologies: the full-mesh.

3.1.2 Full-mesh Networks

If we have a solution for one-to-one audio and video communication, one way to achieve a multi-point conference is to connect each pair of endpoints directly in a full-mesh topology [46]. This is illustrated in figure 3.2.a.

This model has some important advantages. First, since there are no intermediary servers between the endpoints, the end-to-end latency is minimized. Second, since no servers are necessary for handling multimedia in the conference5, this makes it both easier to develop such a solution and cheaper to deploy it in production.

Since each receiver receives individual streams from the remote endpoints, it is free to customize the user interface. Also, since the multimedia streams do not pass through a server, end-to-end encryption is relatively easy to achieve [3].

However, there are some considerable disadvantages of this model when it comes to scale. First, if a sending endpoint is to use the available bandwidth efficiently it has to encode its stream multiple times, which is extremely CPU-intensive. Second, it has to send its stream to multiple other endpoints. This requires a lot of bandwidth, which is often a very limited resource for end users on a network [17, 2]. Third, even with sufficient bandwidth, competing flows to multiple destinations makes congestion control harder [73] and produces inconsistent quality for the receivers [74].
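For illustration only (the per-stream bitrate below is an assumed figure, not a measurement from this thesis): in a 10-endpoint full-mesh conference each endpoint must upload nine separate copies of its video; at 1 Mbps per encoded copy, that is already 9 Mbps of upstream bandwidth per endpoint, before accounting for audio.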

5Of course, some servers are needed for signaling, that is to help the clients establish the connections with each other.

The effects of these problems increase dramatically with conference size. Small conferences of up to about 5 participants may perform well given proper hardware and good network conditions, even though the resource usage is far from optimal. This is one of the reasons some services use this model. However, for larger conferences the model just does not work6.


(a) The full-mesh topology connects endpoints directly without a central server. (b) The Multipoint Control Unit topology uses a mixing server. (c) The Selective Forwarding Unit topology uses a routing server.

Figure 3.2: A three-endpoint conference using the most common topologies: full-mesh, MCU, and SFU.

3.1.3 Multipoint Control Units

The second topology we look at uses a central server which mixes content on the media level. Such servers are known as Multipoint Control Units (MCU [85]). This is illustrated in figure 3.2.b.

MCUs completely terminate the media connections with endpoints. They decode the streams received from all endpoints, perform audio mixing and video composing. For audio, they create a separate mix for each endpoint in order to exclude the audio coming from the endpoint itself. Similarly for video, they compose and encode a stream for each endpoint.

This model has the advantage of simplicity and efficiency for the endpoints. Sending endpoints only have to encode and send a single stream (technically one audio and one video, though we do not make this distinction every time) to one destination (the MCU). Receiving endpoints receive a single stream, and this is all they need to handle. In fact, endpoints need no more functionality than they need for one-to-one communication.

6There is a joke in the field that as the size increases the topology changes from full-mesh to a full mess.

Congestion control also functions like in the one-to-one case, because the server controls the encoder and can configure the desired bitrate.

However, what facilitates this efficiency for the endpoints is extra processing on the server, which has to perform multiple transcoding operations. This is extremely CPU intensive, and because of it MCUs typically run on powerful dedicated machines, and are able to host only a small number of conferences.

The simplicity for the receiving endpoints comes at the cost of flexibility. They receive a single pre-mixed audio stream, and a single pre-composed video stream. Therefore they are not able to control the volume per-participant, render interactive UI elements, or directly control the remote endpoints they wish to display. Video re-encoding can be reduced if the MCU produces a single encoded video stream, common for everyone in the conference, but this is a shortcoming from a usability perspective and prevents the above-mentioned use of congestion control per-endpoint.

Critically, the additional decoding process on the server necessitates the use of a jitter buffer, increasing latency.

Finally, end-to-end encryption is simply not possible with an MCU, because it has to decode all incoming streams.

Multipoint control units have been used with great success in online services such as BlueJeans7 and Cisco’s Webex8. There are also dedicated hardware MCU devices on the market, such as those offered by Avaya9 and Polycom10, as well as software solutions such as TrueConf11 and IBM’s Sametime MCU12.

3.1.4 Selective Forwarding Units

The third topology we discuss also uses a central server with a star topology, but it uses routing instead of mixing/compositing. Figure 3.2.c illustrates it. The important idea which allows such systems to work efficiently is that of selective forwarding [20], which means that instead of forwarding all packets of a video stream, the server can split an encoded stream and produce a set of streams with different characteristics without the need to re-encode. It then forwards only a selected subset of these streams to a

7https://www.bluejeans.com/ 8https://webex.com 9https://avaya.com 10https://polycom.com 11https://trueconf.com/ 12https://www.ibm.com/support/knowledgecenter/en/SSKTXQ_9.0.0

receiver. We use the terms Selective Forwarding Unit (SFU) or Selective Forwarding Middlebox [85] to refer to this type of server.

This architecture is distinctively advantageous. With no need to decode, and therefore no jitter buffer required, end-to-end latency is low. The processing required by the server is minimal and does not involve manipulation on the media level. This increased efficiency of servers with regard to CPU usage allows them to scale well.

Endpoints receive individual streams, which allows them to use a personalized layout. However, this comes with a higher bitrate and an increased cost for decoding. There is also increased complexity for the clients, because they need to handle multiple streams.

Implementing end-to-end encryption is challenging, but it is possible. To achieve selective forwarding, the SFU needs to access, and in some cases modify, some of the stream's meta-information. Chapter 9 of this thesis discusses this problem and shows how advanced selectivity features can be implemented with end-to-end encrypted media.

One of the main disadvantages of SFUs is the increased demands they place on network resources. Without any selectivity, as the conference size increases, so too does the amount of bandwidth required (it grows as N²). Some form of selectivity is essential, but it is also not trivial. This is one of the main topics in this thesis and we explore it in the following chapters.

Congestion control is also challenging with an SFU because it does not directly control the encoders and consequently cannot control the encoding bitrate. When adapting to changes in available bandwidth, it must enable or disable entire streams. This does not allow for fine control over the bitrate; in fact, the bitrate of the forwarded streams may also change over time.

Selective Forwarding Units were introduced relatively recently [20], but they have gained popularity and are successfully being used in many popular products including Vidyo13, Hangouts and Google Meet14, Facebook Messenger, and 8x8 Virtual Office15. Additionally, there are multiple open-source SFU implementations (or frameworks

13https://vidyo.com 14https://meet.google.com 15https://8x8.com

which include an SFU), including Janus16, Kurento17, Mediasoup18, Medooze19 and Jitsi20. The author is employed by 8x8 and is one of the authors of the Jitsi Videobridge SFU.

3.1.5 Other Topologies

In addition to the three topologies discussed thus far, there are many more which have been proposed or used. Some hybrid models forward video while mixing audio. In other models, some endpoints receive forwarded streams while others receive mixed/composited streams. Some serverless models utilize a centralized topology in which one endpoint serves as an MCU or SFU (this has been used by Jitsi and Skype [10]).

Additionally, it is possible to switch between different topologies dynamically using whichever is most appropriate for the given situation. In chapter 6, we present a system that switches between a full-mesh and an SFU based on conference size (though we only use a full-mesh when there are two participants).

Some models use multiple servers. For example, in a single conference, multiple SFUs can be used to decrease end-to-end latency. We examine one such system in chapter 8 of this thesis.

3.1.6 Summary

There are various topologies for multi-endpoint conferences. In the sections above, we examined the three most common. We compared them based on characteristics that impact the cost and quality of a conference service. Table 3.3 summarizes the results.

We can see that the SFU model does not have any of the unavoidable disadvantages of the MCU model, including inherent latency, high cost of server processing, inflexibility of the UI, and inability to support end-to-end encryption. It also does not have the problem of high resource use and congestion control on the sending endpoints which full-mesh conferences have at scale. The disadvantages of the SFU model are in managing the network use on the server and endpoints, and the complexity of end-to-end encryption. While the SFU model is not without its challenges, if they are approached and addressed sufficiently well, it has the potential to be superior in service cost and quality. For this reason, we have chosen to study the SFU model in detail in this thesis.

16https://janus.conf.meetecho.com/ 17https://kurento.org 18https://mediasoup.org 19https://github.com/medooze 20https://jitsi.org

                 Latency                Server       Server      Endpoint    UI           E2E
                                        Computing    Network     Resources   Flexibility  Encryption
                                        Resources    Resources

  Full Mesh      Very Low               N/A          N/A         Very High   Good         Easier
  MCU            High                   Very High    Low         Low         Bad          Not possible
  SFU            Low                    Low          High        High        Good         Hard
  Cascaded SFUs  Potentially Very Low   Low          High        High        Good         Hard

Figure 3.3: A summary of the characteristics of the different topologies. SFUs and Cascaded SFUs have the potential to be very efficient if we can manage the network resources at the server and endpoints.

Now that we have introduced the SFU architecture, which is the central topic of this thesis, we move on to present the technology stack we have chosen to use: WebRTC.

3.2 WebRTC

Web Real-Time Communications (WebRTC) is an effort to bring high quality, cross-platform, interoperable, peer-to-peer real-time audio, video, and data capabilities to web browsers without the use of plugins. It is a large cross-organization effort which started in 2012 at the Internet Engineering Task Force (IETF). There are two groups of WebRTC standards: a set of JavaScript APIs implemented by web browsers and available to web applications, and a set of network protocols covering connection establishment, media transport, security, media formats, congestion control, and other aspects. Together they serve as the basis for developers to create rich, interactive web applications with audio and video.

There is also an open-source project21, led by Google, which has become the de-facto reference implementation for the network protocols. To avoid confusion between the standards and the software, we refer to the software as webrtc.org.

The JavaScript APIs [12] are developed within the World Wide Web Consortium (W3C)22. They allow for media acquisition [15] (gaining access to the user's microphone, camera or the contents of their screen), establishing a secure connection with a remote peer (another browser or a non-browser endpoint), and sending and receiving audio, video, or application-defined data through the peer connection.

In order for two endpoints to connect, they first need to exchange some information about the session (such as available IP addresses). This is commonly referred to as “signaling”, and it necessarily involves a server. The APIs describe this information using Session Description Protocol (SDP) [26, 54]. The standards do not specify how to signal the SDP block, and different web applications use different signaling protocols such as SIP [66], XMPP [68], or custom protocols.

More important for this thesis are the underlying network protocols WebRTC uses. Developed within the RTCWEB working group of the IETF23, they specify that Interactive Connectivity Establishment (ICE [64]) is used to establish the connection, and that the Secure Real-time Transport Protocol (SRTP [70, 11]) is used to transport the encoded streams. There are mandatory-to-implement codecs for audio [79] (Opus [80, 75] and G.711 [33]) and video [86] (VP8 [9, 86], and the H.264 Constrained Baseline profile [34]), but additional codecs are optionally supported through a negotiation process [54].

After more than 7 years, the RTCWEB working group recently closed after completing its charter in August 201924. At the time of this writing all unpublished drafts are in the editor's queue to be published as official Requests For Comments (RFCs). The specifications are relatively mature, and significant changes are unlikely.

When the video element was introduced in the HTML5 standard [31], streaming technology had already been developed and made available through third-party plugins in a non-standards-based way [73]. The standardization of the video element and its adoption by browsers facilitated the growth of media streaming, allowing it to become ubiquitous. WebRTC has the potential to do the same for real-time communication. Natively supported by most major web browsers25, it is the base for immensely popular and successful

21https://webrtc.org 22W3C: https://www.w3.org/ 23https://tools.ietf.org/wg/rtcweb/ 24https://mailarchive.ietf.org/arch/msg/rtcweb/4cj95edGFtfjZkUjozTybOJiMcA 25http://iswebrtcreadyyet.com/

services such as Facebook Messenger and others26. In 2016 Google announced that the Chrome browser was installed on over two billion mobile and desktop devices27, and since then Chrome's market share has continued to grow28. It is hard to quantifiably measure the use of the WebRTC technology as a whole, because no single entity controls it. However, we do know that millions of people already use it, and there are multiple billions of people with access to it via their web browser. That is to say, there are billions of potential users.

The WebRTC standards and implementations provide a modern, state-of-the-art stack for real-time communication. They allow for many different systems to be built on top of them, including systems for video conferencing. This feature, as well as the great potential to develop and improve systems that can be relatively quickly released to and utilized by such a vast number of users, is why we chose the WebRTC technology for this thesis. While most of our results are generic, in some cases we restrict our solutions to those that can be built with WebRTC and run on modern browsers, with most experiments performed on WebRTC systems. This allows us to stay practical and remain relevant.

In the next section, we limit our discussion to a specific video conferencing system.

3.3 Jitsi Meet

In this final section of the introduction, we present a specific system for video conferencing which is used for study and experimentation for the remainder of the thesis: Jitsi Meet.

The Jitsi project29 (at the time known as SIP Communicator) was started in 2003 by Emil Ivov during his studies in Louis Pasteur University in Strasbourg [35]. Initially only a SIP client, it evolved over the years into a broad set of software components for real-time communication. The projects under Jitsi30 are open-source software licensed with the Apache 2 license [1].

As an open-source project, hundreds of people around the world have contributed to Jitsi over the years. However, there is a core development team that maintains all resources and drives development. During 2015-2018 this team was employed and supported by Atlassian Plc, and since the end of 2018 it is part of the 8x8 company; the author is employed as a full-time software engineer on this team.

26https://bloggeek.me/massive-applications-using-webrtc/ 27https://www.youtube.com/watch?v=eI3B6x0fw9s 28https://gs.statcounter.com/browser-market-share#monthly-201602-201909 29See the homepage at https://jitsi.org 30The projects are now hosted on github: https://github.com/jitsi

Figure 3.4: A screen shot from a 10-participant conference in the Jitsi Meet application. Most of the space is occupied by the active speaker, while the remaining speakers are displayed in "thumbnail" size. Some speakers have disabled their video, so an avatar is displayed instead.

Jitsi Meet is a conferencing system built for WebRTC. It consists of client applications for both web browsers and mobile devices, as well as various server components. Notably, it uses the Jitsi Videobridge31 Selective Forwarding Unit, which is used in many of the experiments in later chapters. Figure 3.4 shows what a typical video conference looks like in the web browser interface.

The system is designed to be feature complete, but extensible and customizable, scalable, and interoperable with SIP and PSTN networks. It is in active development, but is already used in multiple commercial products, including 8x8’s Virtual Office,

31See https://github.com/jitsi/jitsi-videobridge

the Rendez-Vous service by Renater32, Rocket Chat, Matrix33, and Symphony's video conferencing solutions34.

In addition to maintaining the software, the 8x8 team runs a free-to-use video conferencing service that is open to the public. It is available at https://meet.jit.si and we refer to it as the meet.jit.si service. It requires no registration and supports both desktop browsers and mobile devices. It has been steadily gaining popularity and, at the time of this writing, hosts approximately 20,000 conferences per week. It is separate from the company's commercial products, and one of the main reasons for its existence is to test new experimental features.

We have chosen to use this particular conferencing system for most of the experiments for a few reasons. First, as one of the principal authors of the Jitsi Videobridge SFU, the author is intricately familiar with its workings, saving time when modifications are needed. Second, the project is open-source, which allows us to share our contributions with the community not only as research but as running code. Lastly, the access to the meet.jit.si service that the author has through his position at 8x8 allows us to perform some of the experiments at scale in a production service. The service has proven an invaluable tool for research.

This concludes the introduction of the state of the art. The next section discusses the author's contributions and details the structure of the remainder of the thesis.

32https://www.renater.fr/RENDEZ-VOUS 33https://matrix.org 34https://symphony.com


Chapter 4

Last N: Relevance-Based Selectivity

In this chapter, we describe the first step in reducing the required bandwidth for a conference: forwarding video from a selected subset of endpoints. We propose a way to make this selection automatic, based on speaker activity, while keeping the selected content relevant for the users. We present an efficient algorithm for performing dominant speaker identification without the need to decode the audio streams, making it appropriate for use in SFUs. We detail the implementation of the proposed approach in the Jitsi Videobridge SFU. Finally, we perform experiments to validate the approach and quantify the improvements in terms of network resource and CPU usage. This work was originally published in [Pub5].

In the next section, we discuss the transport protocol used in WebRTC in more detail.

4.1 Real-time Transport Protocol

WebRTC uses the Interactive Connectivity Establishment (ICE [64]) protocol to establish the connection between endpoints. This protocol is designed for Network Address Translation (NAT [76]) traversal. Each endpoint generates a list of local candidates, which represent an IP address and port which could possibly be used to reach the endpoint. The two endpoints exchange their candidates and generate a list of pairs of local and remote candidates. Then they systematically attempt to establish a connection using each pair. ICE uses STUN [65] in an attempt to connect the endpoints directly even in the presence of a router performing NAT. As a fallback it uses a relay server using TURN [50]. It supports both UDP and TCP (using additional framing [43], since ICE is packet-based) as the underlying transport, but UDP is preferred.

After a connection is established using ICE, the two endpoints start a Datagram Transport Layer Security (DTLS [61]) session. This secure session is used to exchange the security context needed for audio and video, as well as to transport application-defined data.

The actual audio and video is transported using Secure Real-time Transport Protocol (SRTP [27, 70]). The following diagram illustrates the protocol stack used in WebRTC (see [40]):

                           +------+
                           |RTCWEB|
                           | DATA |
                           +------+
                           | SCTP |
             +------+------+------+
             | STUN | SRTP | DTLS |
             +------+------+------+
             |        ICE         |
             +--------------------+
             | UDP1 | UDP2 | ...  |
             +------+------+------+

The Real-time Transport Protocol (RTP [70]) is primarily designed to satisfy the needs of multi-participant multimedia conferences. It is typically used on top of UDP and makes use of its multiplexing and checksum services. It does not guarantee delivery or prevent out-of-order delivery, nor does it assume that the underlying network is reliable and delivers packets in sequence.

It is designed to provide the common functionality required for any of the intended use cases, including audio and video conferencing, storage of continuous data, and interactive control and measurement applications. The structure of an RTP packet consists of a fixed-size header, followed by an optional extended header, followed by the payload:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P|X|  CC   |M|     PT      |        sequence number        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           timestamp                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           synchronization source (SSRC) identifier            |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|            contributing source (CSRC) identifiers             |
|                             ....                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   optional header extension                   |
|                             ....                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            payload                            |
|                             ....                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The protocol allows for multiplexing of different streams via the synchronization source (SSRC) field. Support for different media formats is available via the Payload Type (PT) field. The Sequence Number (SEQ) and Timestamp (TS) fields enable the correct ordering and timely playback of received packets.
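To make the header layout above concrete, the following sketch reads the fixed header fields from a received packet. It is illustrative only: the class and method names are not taken from Jitsi Videobridge or any other library.

import java.nio.ByteBuffer;

// Illustrative sketch: parses the 12-byte fixed RTP header described above.
public final class RtpFixedHeader {
    public final int version;        // V: 2 for the RTP version used here
    public final boolean padding;    // P
    public final boolean extension;  // X: an optional header extension follows
    public final int csrcCount;      // CC: number of CSRC identifiers
    public final boolean marker;     // M
    public final int payloadType;    // PT: identifies the media format
    public final int sequenceNumber; // SEQ: ordering and loss detection
    public final long timestamp;     // TS: media clock for timely playback
    public final long ssrc;          // synchronization source identifier

    private RtpFixedHeader(int version, boolean padding, boolean extension,
                           int csrcCount, boolean marker, int payloadType,
                           int sequenceNumber, long timestamp, long ssrc) {
        this.version = version;
        this.padding = padding;
        this.extension = extension;
        this.csrcCount = csrcCount;
        this.marker = marker;
        this.payloadType = payloadType;
        this.sequenceNumber = sequenceNumber;
        this.timestamp = timestamp;
        this.ssrc = ssrc;
    }

    /** Reads the fixed header from the start of an RTP packet. */
    public static RtpFixedHeader parse(byte[] packet) {
        ByteBuffer buf = ByteBuffer.wrap(packet);
        int b0 = buf.get() & 0xFF;
        int b1 = buf.get() & 0xFF;
        return new RtpFixedHeader(
                (b0 >> 6) & 0x03,            // version
                ((b0 >> 5) & 0x01) != 0,     // padding
                ((b0 >> 4) & 0x01) != 0,     // extension
                b0 & 0x0F,                   // CSRC count
                ((b1 >> 7) & 0x01) != 0,     // marker
                b1 & 0x7F,                   // payload type
                buf.getShort() & 0xFFFF,     // sequence number
                buf.getInt() & 0xFFFFFFFFL,  // timestamp
                buf.getInt() & 0xFFFFFFFFL); // SSRC
    }
}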

RTP is designed to be used in conjunction with an additional "profile" specification, which defines among other things the payload type mapping and the available header extensions. WebRTC uses the Extended Profile with Real-time Transport Control Protocol (RTCP)-Based Feedback (RTP/AVPF [57]). This profile is designed specifically for audio and video. It defines a set of constant PT numbers, and supports additional media formats negotiated per-session. It uses RTP's companion control protocol (RTCP [70]), whose primary function is to provide feedback on the quality of the data distribution. This is used for control of adaptive encodings, i.e., setting the desired encoding bitrate. Additionally, it allows for inter-stream synchronization for streams coming from the same source by providing a mapping from the timestamp space of each stream to a common source clock. This is typically used to achieve synchronous playback of audio and video (lipsync).

WebRTC also utilizes the Generic Mechanism for RTP Header Extensions [72]. This allows for a set of extensions to be negotiated for a particular session, and adds a set of optional variable-length extension headers to RTP packets. This can be used for specifying additional payload information such as the type of a video stream or the screen orientation, for using a common inter-stream sequence number space [29], and for transporting an audio level indication [44].

There is a secure version of the protocol: Secure RTP (SRTP [11]). It is mandatory for WebRTC [60] and provides encryption and authentication of the payload of RTP packets.
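Returning to the audio level header extension mentioned above [44]: assuming its one-byte form (a voice activity flag plus a 7-bit level in -dBov, as in RFC 6464), an intermediary can interpret it without decoding the audio. The sketch below is illustrative only and assumes the extension's single data byte has already been located in the packet; this ability is what later makes SFU-side dominant speaker identification possible (see section 4.3).

// Illustrative sketch: interpreting the one-byte audio level extension value.
public final class AudioLevelReader {
    /** True if the sender set the voice activity (V) flag. */
    public static boolean voiceActivity(byte levelByte) {
        return (levelByte & 0x80) != 0;
    }

    /** The audio level in -dBov: 0 is the loudest possible level, 127 is silence. */
    public static int levelMinusDbov(byte levelByte) {
        return levelByte & 0x7F;
    }
}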

There are a few types of RTCP packets used in WebRTC. Receiver Reports (RRs) are periodically sent by receiving endpoints and include information about the packet loss, interarrival jitter, and a timestamp allowing computation of the round-trip time between the sender and receiver. Sender Reports include the number of packets and bytes sent, and a pair of timestamps facilitating inter-stream synchronization. Negative Acknowledgement (NACK [57]) packets can be used to explicitly indicate that a packet or set of packets have not been received (it is up to the application to optionally retransmit some of the referenced packets).

Full Intra Request (FIR [83]) and Picture Loss Indication (PLI [83]) packets are used for video to indicate that there is a need for the sender to produce a refresh point in the stream. We refer to these as keyframe requests, because with the codecs we are interested in, a refresh point is equivalent to a keyframe.

Finally, there are RTCP messages related to congestion control. Receiver-Estimated Maximum Bitrate (REMB [5]) feedback packets signal to a sender the maximum bitrate a receiver wishes to receive [73]. Transport-wide Congestion Control (TCC) feedback packets [29] are used to provide detailed packet-by-packet reception information from a receiver to the sender.

In the next section we’ll discuss how these protocols function with SFUs, and in our implementation in particular.

4.2 The Selective Forwarding Unit Implementation

Below we describe the architecture and features implemented in our conferencing system, and the SFU in particular.

General overview: Jitsi Videobridge is an SFU implementation, mainly targeted at serving multi-party conferences with WebRTC-enabled endpoints. Presently it is deployed in conjunction with Jitsi Meet, a JavaScript WebRTC application for real-time multi-party audio and video conferencing. The use-cases range from casual personal meetings to business meetings with shared remote presentations.

The system uses XMPP/Jingle [49] for signaling, with a separate entity acting as conference focus [67] and initiating a Jingle session with all endpoints. The focus allocates channels and resources to each participant and also controls the SFU through the COnferences with LIghtweight BRIdging (COLIBRI) protocol [36]. In addition to instances of Jitsi Meet running in WebRTC-capable browsers, Jitsi Videobridge can interact with other full-fledged client software, gateways to legacy networks (e.g., using SIP), and provides enhanced server-side services (e.g., recording the participants in a conference).

In addition to establishing channels for audio and video, each endpoint maintains a control channel to the SFU. Based on either WebRTC's data channels or a separate WebSocket [23] connection, this bi-directional and reliable transport channel allows for notification and control messages in cases where RTCP cannot be used. This is necessary because the WebRTC APIs do not provide a mechanism for the JavaScript application to directly send or receive RTCP messages; thus, protocol extensions which are not directly supported by the browser cannot be implemented as RTCP. The messages themselves have a custom JSON-based format. Relevant to the experiments in this chapter are the "dominant speaker" notifications. They are sent from the SFU to endpoints to indicate that the main speaker in the conference has changed. There are also "start" and "stop" messages which the SFU can use to instruct an endpoint to start or stop its video stream (see section 4.3).
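As an illustration of the control channel mechanism only (the field names below are hypothetical and do not reproduce the actual Jitsi message format), such notifications could be serialized as small JSON messages:

// Illustrative only: a hypothetical shape for control channel messages.
// The real message format used by Jitsi is not shown here.
public final class ControlMessages {
    /** Builds a JSON notification indicating that the dominant speaker changed. */
    public static String dominantSpeakerChanged(String endpointId) {
        return "{\"type\":\"dominant-speaker-changed\","
                + "\"endpoint\":\"" + endpointId + "\"}";
    }

    /** Builds a JSON message instructing an endpoint to pause or resume its video. */
    public static String videoControl(boolean start) {
        return "{\"type\":\"" + (start ? "start-video" : "stop-video") + "\"}";
    }
}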

From a protocol standpoint, implementation of a Multipoint Control Unit is simple. It acts like any other client, completely terminating all functions of the RTP protocol (similar to how back-to-back user agents (B2BUA [66]) work for SIP). In contrast, Selective Forwarding Units cannot do the same. They need special handling for both RTP and RTCP.

RTP Header changes: For the features discussed in this chapter, the SFU only needs to change one field in the RTP headers: the abs-send-time RTP header extension [5] for video packets. It replaces the 24-bit timestamp in each outgoing RTP packet with a locally generated timestamp. This corrects the receiver's packet inter-arrival time calculation, so that only the path from the SFU to the endpoint is taken into account (but not the path from the sending endpoint to the SFU).
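A minimal sketch of this rewrite, under the assumption that abs-send-time is a 24-bit value in 6.18 fixed-point seconds and that the offset of its three value bytes within the packet has already been determined (locating the negotiated extension is omitted):

// Illustrative sketch: replacing the 24-bit abs-send-time value with a locally
// generated timestamp. extOffset is assumed to point at the three value bytes.
public final class AbsSendTimeRewriter {
    /** Converts a wall-clock time in milliseconds to the 24-bit abs-send-time format. */
    static int toAbsSendTime(long timeMs) {
        // 6.18 fixed point: multiply seconds by 2^18 and keep the low 24 bits.
        return (int) (((timeMs << 18) / 1000) & 0xFFFFFF);
    }

    /** Overwrites the three abs-send-time bytes in an outgoing RTP packet. */
    static void rewrite(byte[] rtpPacket, int extOffset, long nowMs) {
        int value = toAbsSendTime(nowMs);
        rtpPacket[extOffset]     = (byte) (value >> 16);
        rtpPacket[extOffset + 1] = (byte) (value >> 8);
        rtpPacket[extOffset + 2] = (byte) value;
    }
}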

In later chapters, we discuss other RTP header modifications that an SFU may perform for more advanced features.

45 RTCP Termination The SFU terminates RTCP and generates RTCP packets for the RTP session. Incoming RTCP packets are handled according to their type. The SFU does not propagate Receiver Reports (RRs), generating new RRs for each incoming RTP stream instead. It propagates Sender Reports (SRs) to all endpoints (in later chapters we discuss how SRs need to be handled if the streams are modified).

Negative Acknowledgements are terminated, and if the specified packets are available in the SFU's cache they are retransmitted.

If Receiver Estimated Maximum Bitrate is used, then feedback packets need to be terminated. Otherwise, the senders receive contradictory reports from different receivers, leading to rate instability [74]. The SFU consumes incoming REMB packets for its bandwidth estimation and produces REMB packets for each sender.

Other types of RTCP packets are passed through, including RTCP Extended Reports and keyframe requests (PLI, FIR).

Security: Although the SFU does not change the payload of RTP packets in any way, it needs to change the RTP headers and generate RTCP packets. For this reason, it is not currently possible for the endpoints to exchange end-to-end encrypted media without the SFU being able to decrypt it (we discuss solutions to this problem in chapter 9). WebRTC mandates that all media is transported over Secure Real-time Transport Protocol (SRTP), with session keys exchanged over Datagram Transport Layer Security (DTLS) [60]. This forces the SFU to establish a separate SRTP context with each endpoint, necessitating the re-encryption of every RTP packet individually.

A different scheme discussed for use with SFUs is Encrypted Key Transport (EKT) [52]. With EKT, the SFU and all endpoints share a single SRTP context; therefore, the SFU is not required to re-encrypt RTP packets (unless the headers were modified). As a result, the proposals discussed in this chapter can be applied without any modifications.

In the following sections, we discuss our proposals for endpoint selection based on speaker order, and for identifying the dominant speaker(s) based on audio-level information.

46 4.3 Endpoint Selection Based on Audio Activity

Media conferences deployed in a full-star topology do not scale well when the number of participants is substantial, as each endpoint connected to the conferencing server sends and receives streams from each other endpoint.

We identified three issues for the receiving endpoints. First, network requirements grow in proportion to the number of participants. Second, CPU requirements grow the same way because the endpoint has to decrypt, decode, scale, and render all streams separately. Finally, the user interface cannot display a large number of video streams efficiently, due to insufficient screen real-estate.

The conferencing server experiences issues with CPU and network usage as well. Its situation is more problematic, however, because the resource use grows quadratically with the number of endpoints. This is because the streams from each new endpoint have to be encrypted and forwarded independently to each other endpoint. Our evaluation in section 4.5 shows that, depending on the hardware and the available network resources, either of these (CPU or network) can be the bottleneck.

Here we propose LastN: a general scheme defining a policy for endpoint selection based on audio activity. It is used by a Selective Forwarding Unit to choose only a subset of streams to forward at any given time, alleviating the previously identified limitations of the endpoints and conferencing server.

Last N: The SFU only forwards a fixed number of video streams (N) to each endpoint, changing the set of forwarded streams dynamically according to audio activity. Additionally, the receiving endpoint continues to receive streams that may have been chosen by the participant but are currently outside of the LastN set (a set of “pinned” endpoints). The LastN scheme only applies to forwarding video streams and not to audio; audio from all endpoints is always forwarded.

The integer N is a constant configured for the whole conference. We denote the total number of endpoints in a conference as K. Then, the SFU sends K × N video streams instead of the K × (K − 1) streams sent in the case of a full-star topology. This enables the SFU to scale down the requirements on network capacity and CPU, allowing for more efficient handling of large conferences compared to the full-star topology.
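For example, in a conference with K = 20 endpoints and N = 5, the SFU forwards 20 × 5 = 100 video streams instead of 20 × 19 = 380, a reduction of roughly 74%.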

From a user experience perspective, video from only a subset of all endpoints is displayed, but as soon as a participant of the conference starts to speak, their video is displayed automatically. To implement this scheme, the SFU relies heavily on identifying the dominant speaker. We discuss the details involved with this in the next section. The SFU maintains a list (L) with all endpoints currently in the conference, ordered by the last time that an endpoint was identified as the dominant speaker; thus, the endpoint currently identified as the dominant speaker is always at the top of the list.

Suppose that an endpoint E has selected a set P of endpoints that it wants to permanently receive (P could be empty). Then the SFU forwards to E the selected endpoints and the first of the remaining endpoints, up to a total of at most N endpoints. That is, E receives P and the first (at most) N − |P| endpoints from the list L \ P \ {E}.
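To make the rule concrete, the following sketch (illustrative only; the class and method names are ours, not the actual implementation) computes the set of endpoints whose video is forwarded to a receiver E, given the speaker-ordered list L, the pinned set P, and the limit N.

import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch of the LastN selection rule described above.
final class LastNSelector {

    /**
     * Returns the endpoints whose video is forwarded to receiver 'e': the pinned
     * set 'p' plus the most recent speakers from 'l', up to 'n' endpoints in total.
     * Pinned endpoints are always included, even if 'p' alone reaches the limit.
     *
     * @param l endpoints ordered by the time they were last the dominant speaker
     * @param p endpoints pinned by the receiver (may be empty)
     * @param e the receiving endpoint itself
     * @param n the LastN limit
     */
    static Set<String> forwardedTo(List<String> l, Set<String> p, String e, int n) {
        Set<String> result = new LinkedHashSet<>(p);
        for (String endpoint : l) {
            if (result.size() >= n) {
                break;
            }
            if (!endpoint.equals(e) && !p.contains(endpoint)) {
                result.add(endpoint);
            }
        }
        return result;
    }
}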

Pausing video streams: When a stream from an endpoint is not forwarded by the SFU to any other participant, it is unnecessary for the endpoint to send the stream to the SFU (the SFU is effectively ignoring the stream). The SFU sends control messages to endpoints instructing them to temporarily pause, or to resume transmitting their video stream. These messages can be sent over the control channel or encoded as an RTCP extension report. Since WebRTC does not currently support such RTCP extensions, we implement them over the control channel.

This approach can be used with any algorithm the SFU might use to change its forwarding policy, as long as the SFU keeps track of whether a stream is being forwarded to at least one endpoint or not. With LastN and no selected endpoints, it is easy to see which streams are being forwarded because they are listed in the order they last spoke (L). All streams in L after position N + 1 are not forwarded and can be paused. Hence, there are always K − N − 1 streams paused. The performance evaluation and results are further discussed in section 4.5.
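For instance, with K = 30 endpoints and N = 5, only the endpoints in the first six positions of L can be forwarded to someone, so 30 − 5 − 1 = 24 streams are paused at any given time.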

Keyframes: When a paused stream is resumed, the endpoints require a keyframe (an I-frame) in order to begin decoding. Typically, when endpoints do not have a valid keyframe, they generate an RTCP Full Intra Request (FIR) message; senders generate a keyframe in response to such messages. Instead of waiting for the receivers to generate a FIR request, the SFU preemptively sends a FIR message when it instructs an endpoint to resume the video stream. Additionally, when an endpoint is elected dominant speaker and was not in the LastN set, its stream starts to be forwarded to all receivers. This allows a single FIR message to be sent after a dominant speaker change, even though there is often more than one receiver, effectively reducing the time to correctly decode and render a stream by one RTT (i.e., instead of waiting for the receivers to send FIRs).

Further Improvements: The LastN scheme proposed here can be easily extended with additional policies, providing more complex solutions suitable for some use-cases, while preserving many of the features. One example, which may prove useful in remote presentations, is to have a global set of streams that are always forwarded (note that this is not the same as the endpoint-selected streams, which are per-endpoint). This can be implemented simply by reordering L and keeping the desired streams at the top. Another is allowing N to vary per-endpoint, perhaps dynamically, according to the device and network conditions of the endpoint.

For remote presentations, always forwarding a given endpoint’s video may be desirable. We can accomplish this by keeping the endpoint (or endpoints) at the top of the list L.

Participants in the conference would have the ability to individually select a set of endpoints to always receive video from. This would require ordering L differently for each receiver.

In the next section, we discuss how an SFU detects which endpoint is the active speaker, an integral part of the LastN scheme.

4.4 Dominant Speaker Identification

An essential feature for a conferencing system from a user experience perspective is the ability to dynamically switch focus to the currently speaking participant of the conference; further, as an optimisation, the receiving endpoint may render not just the current dominant speaker but also a list of LastN speakers and/or a set of selected speakers. SFUs enabling the LastN scheme perform Dominant Speaker Identification (DSI). Traditionally DSI is performed using raw audio streams [81], but this implementation is very inefficient for an SFU because it has to decode all incoming streams.

The Client-to-Mixer Audio Level Indication [44] RTP extension defines a way for audio senders to include the audio level of the audio sample carried in the RTP packet, encoded as a 7-bit value. This value is available to the SFU without the need to decode the packet.
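The extension element is a single byte: per the specification, the most significant bit carries the sender’s voice activity flag, and the remaining 7 bits carry the level in −dBov. A minimal sketch of reading it (the class and method names are illustrative) follows.

// Illustrative sketch: reading the one-byte audio level extension element defined by
// the client-to-mixer audio level indication [44]. 'b' is assumed to be the single
// payload byte of the extension, already located in the RTP header by a parser.
final class AudioLevelReader {

    /** True if the sender's voice activity detector flagged the packet as speech. */
    static boolean voiceActivity(byte b) {
        return (b & 0x80) != 0;
    }

    /** The audio level in -dBov, from 0 (loudest) to 127 (silence). */
    static int level(byte b) {
        return b & 0x7F;
    }
}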

Here we present an algorithm that performs DSI using solely the audio level (7 bits per packet) instead of the raw audio samples (typically 960 16-bit samples per packet). It is an adaptation of the state-of-the-art algorithm by Volfin [81].

The desired behavior of a dominant speaker identification algorithm is as follows:

• No false switching should occur during a dominant speech burst. Both transient noise occurrences and single words said in response to or in agreement with the dominant speaker are considered transient occurrences. These should not cause a speaker switch.

• A speaker switch event cannot occur during a break in speech between two dominant speakers. It must be triggered by the beginning of a speech burst.

• A tolerable delay in the transition from one speaker to another in a speaker switch event is up to one second.

• When simultaneous speech occurs on more than one channel, whoever spoke first is considered the dominant speaker.

• The relative loudness of a speaker’s voice should not influence their chance of being identified as the dominant speaker.

A speech burst is defined as a speech event composed of three sequential phases: initiation, steady-state and termination. In the initiation phase, speech activity builds up. During the steady-state, speech activity is mostly high, though it may include breaks in activity due to pauses in speech. Finally, in the termination phase, speech activity declines and then stops. Typically, a dominant speech activity is composed of one or more consecutive speech bursts. We refer to the point where a change in dominant speaker occurs as a speaker switch event.

The algorithm implemented for dominant speaker identification uses speech activity information from time intervals of varying lengths. It consists of two stages: a local processing and a global decision. In the first stage, each participant’s audio signal is processed independently while speech activity scores are evaluated for the immediate, medium, and long time-intervals. The lengths of the time intervals are chosen to capture basic speech events such as a few phonemes, a word or two, or a short sentence. In the second stage, the dominant speaker is identified based on the speech activity scores obtained in the first stage, and speaker switch events are detected. We assume that a speaker switch event can be inferred from a rise in the three speech activity scores on a certain channel, relative to the scores of the dominant channel.

Sequences and combinations of basic speech events may indicate either the presence or absence of dominant speech activity. This method distinguishes between isolated transient audio occurrences and those occurring within a speech burst.

We use long-term information to determine whether speech is present in a currently observed time-frame, since dominant speech activity in a given time-frame is better inferred from a preceding time interval than from any instantaneous signal property.

Local Processing: In this stage, the signal in each channel is processed separately to place each signal frame into a broader context than its instantaneous audio activity.

This is accomplished by processing the currently observed frame by itself, in the context of a medium-length preceding time interval, and in the context of a long time interval that precedes it. Thus, each time we move up to a longer time interval, the speech activity obtained in the previous step is analysed again in a broader context.

We consider each time interval as composed of smaller sub-units. We determine the speech activity in each time interval according to the number of active sub-units, by attributing a speech activity score to this number. The score is obtained from the likelihood ratio between the hypothesis of speech presence and the hypothesis of speech absence. The speech activity evaluation process consists of three sequential steps, referred to as immediate, medium, and long. The input into each step is a sequence of the numbers of active sub-units acquired in the previous step. A thresholding approach allows measurement of the amount of speech activity while suppressing isolated high-energy noise spikes.

For the step of immediate speech activity evaluation, we use the client-to-mixer audio level indication of the frame [44] rather than a frequency representation of the frame. This is made possible by thresholding, as long as the replacement input provides a speech activity estimation comparable to that of the original frequency sub-band test.

Audio levels are expressed in dBov (with values from −127 to 0), which is the level, in decibels, relative to the overload point of the system, i.e., the highest-intensity signal. In order to reuse the heuristically derived parameters of the original algorithm, such as the thresholds, we divide the audio level range into the same number of sub-bands as the number of frequency sub-bands analysed by the original algorithm [81].
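As an illustration of this mapping (the exact partition used in the implementation may differ, so the linear split below is only an assumption for the example), a level reported in −dBov can be assigned to one of B sub-bands as follows.

// Illustrative sketch: mapping a 7-bit audio level (in -dBov) to one of 'numBands'
// sub-bands, mirroring the number of frequency sub-bands of the original algorithm.
final class LevelBands {
    static int subBand(int levelDbov /* 0..127, 0 = loudest */, int numBands) {
        // Convert so that higher values mean more energy, then scale linearly to [0, numBands).
        int energy = 127 - levelDbov;
        return Math.min(numBands - 1, (energy * numBands) / 128);
    }
}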

Additionally, the algorithm accounts for differences in the relative loudness of the background noise present in the audio signal of a participant. We apply a separate, individual initial thresholding of the input based on the history of the minimum audio level indication reported by the participant [51]. The history maintains the minimum by considering its recency, for the purposes of reflecting medium- and long-term changes in the level of the background noise.

Global Decision: The objective of this stage is to identify the dominant speaker. This stage is activated in time steps of a certain interval, referred to as the decision-interval. It utilizes the scores obtained in the local processing stage for dominant speaker identification. We approach this stage by detecting speaker switch events, rather than selecting a dominant speaker in every decision-interval. Once a dominant speaker is identified, they remain dominant until the speech activity on one of the other channels justifies a speaker switch. We refer to non-dominant channels as competing (for dominance).

4.5 Performance Evaluation

In this section, we present the experimental results. We performed controlled experiments measuring the resource use on both clients and servers. The following paragraphs describe the testing environment, the experimental setup, and the results from all experiments performed.

4.5.1 Testbed

The SFU is running on a dedicated machine with a quad-core Xeon E5-1620 v2 @ 3.70GHz processor. All tests consist of a single conference with a varying number of participants. One endpoint in the conference is a Chrome instance running Jitsi Meet, with the microphone and camera disabled. The rest of the endpoints come from a Jitsi Hammer1 instance (also see [Pub13]). This is an application developed specifically for load-testing the SFU. It creates multiple endpoints, each streaming one audio and one video stream from pre-recorded files. The two streams have a combined average bitrate of about 515 Kbps. These endpoints do not implement any congestion control mechanism and their RTCP support is limited. They only emit Sender Reports and do not react to REMB, NACK, or FIR messages. The streaming video includes keyframes every 1.3 seconds on average, which allows decoders to relatively quickly regain decoding state; thus, the state of the conference can be monitored without the need to handle FIR or NACK requests.

To trigger changes to the dominant speaker in the conference, the participating endpoints (from Jitsi Hammer) add audio levels in the format used by the DSI algorithm [44]. At any point in time, one endpoint sends levels associated with human speech, while the others send silence. The endpoint that sends speech levels changes every 4 seconds, with the new one being chosen at random. Note that we change only the audio-level field and not the RTP payload of the audio streams. The “start” and “stop” messages are implemented with RTCP packets.

We denote the number of load-generating endpoints as K, so, including the Chrome instance, there are always K + 1 endpoints in the conference. We denote the LastN value as N, so each endpoint receives (at most) N video streams. With N = −1 we denote that LastN is not used, and all streams are forwarded to everyone. The number of video streams sent by the SFU is either (K + 1) × N, or (K + 1) × (K − 1) (full-star, N = −1). The number of streams received is either K or N + 1, depending on whether video pausing is used.

1https://github.com/jitsi/jitsi-hammer

Figure 4.1: The CPU usage as it relates to total bitrate in a full-star conference. The error bars represent one standard deviation. A clear linear correlation is visible.

All data is gathered in intervals of one second. On all graphs, the values are means over an interval of at least 120 seconds, and the error bars indicate one standard deviation of the sample. The measured CPU usage constitutes the fraction of the last 1-second interval which the CPU spent in either User, Nice, System or IOWait state (what the top command shows on the “CPU(s)” line), and 100% indicates that all 8 logical cores were in use.

4.5.2 CPU Usage in a Full-star Conference

In this test, we measured the CPU and bandwidth used in a conference with a full-star topology (N = −1). We varied the number of endpoints K (10, 15, 20, 25, 29, and 33). Figure 4.1 shows the variation in CPU usage compared to the aggregate bitrate. We observe linear growth, with the best linear fit having R² > 0.999. This is to be expected because the most computationally intensive part of the SFU’s operation is encryption and decryption, which is a very predictable process and depends almost entirely on the amount of data processed.

Based on these values, we conclude that in reasonable practical scenarios, both the computing and the network resources might constitute a bottleneck. For example, an average machine with a 100Mbps link may hit the network bottleneck first, while a less powerful machine cannot cope with 1Gbps of traffic. Assuming that about 500 Kbps is a typical bitrate for a video stream in a conference, we observe that, as expected, the required resources increase rapidly: a conference with 33 participants requires on the order of 500 Mbps capacity for the SFU.

4.5.3 Full-star vs LastN

In this test we measured the output bitrate of the SFU with K = 10, 15, 20, 25, 30, in four configurations: full-star, and LastN with N = 3, 5, 8. Video pausing was enabled. Figure 4.2 shows that when LastN is used, the bitrate grows linearly with K, with a coefficient depending on N, while without LastN it grows quadratically. This constitutes a significant improvement when LastN is used. For K = 30, the gain is 72%, 82%, and 89% for N = 8, 5, and 3, respectively. Counting the aggregate bitrate, the gain is almost the same: 72%, 82%, and 88%. We also see that different compromises between bitrate and the number of visible videos can be achieved by simply varying the value of N.

4.5.4 Video Pausing

In this test, we observed the differences in the total bitrate and CPU usage between the two modes of LastN: with video pausing on and off. We fixed K = 30 and changed N to 25, 20, 15, 10, 8, 5, 3, 2, 1. The graphs also show values for N = 30, for which LastN is disabled; we plot them as N = 30 because it is effectively the same (i.e., in both cases everything is being forwarded). Figure 4.3 shows the variation in the bitrate, and figure 4.4 shows the variation in CPU usage. We observe that for high values of N there is little or no gain, but it increases when N is lower. This is expected because 1) the total bitrate is lower, and 2) there are more endpoints with paused video. When N is set to 8,2 we see an 8% decrease in bitrate and a 16% decrease in CPU usage. We expect that pausing video streams is beneficial in practice because it has the additional advantage of decreasing CPU usage for the endpoints with paused video by eliminating the need to encode video.

Figure 4.2: The outbound bitrate at the SFU, as the number of endpoints grows. Shown are four LastN configurations. The error bars represent one standard deviation.

4.6 Conclusion

In this chapter, we propose a scheme (LastN) that uses an SFU to select which video streams to forward and which to drop. We present an implementation and evaluate it in terms of network usage and computing resources. We examine the effects of varying the number of forwarded video streams (N), and the possibility of temporarily turning unused streams off. Our results show that the expected performance gains are achievable in practice. Notably, we observe a drop from a quadratic to a linear (with respect to the total number of endpoints) growth of the outbound bitrate used by the SFU. Additionally, for a fixed number of endpoints, tweaking the value of N results in gradual changes to the bitrate, making the system adaptable to different situations. Based on the results, we expect that LastN is useful in practice in two scenarios: allowing small-to-medium conferences to work more efficiently (thus allowing a single server to handle more conferences), and increasing the maximum number of endpoints in a conference, given the same resources.

2We assume this to be a reasonable value for practical use, because of user interface considerations.

Figure 4.3: The aggregate bitrate at the SFU for a fixed number of endpoints (30) and a varying value of N. Compared are the two modes of LastN: with and without video pausing.

LastN is quite general and allows for different modifications and improvements on top of it. In particular, one interesting modification, which we intend to study, is maintaining different values of N for each receiver, while keeping the list L global for the conference. In chapter 7, we look at another use-case for LastN: dynamically adapting server configuration to the experienced load.


Figure 4.4: The CPU usage at the SFU for a fixed number of endpoints (30) and varying value of N. Compared are the two modes of LastN: with and without video pausing.

In the next chapter we introduce another technique which can be used to further decrease the resources required for a conference.


Chapter 5

Simulcast

The LastN scheme described in the previous chapter significantly reduces the bandwidth required for a conference, but it does not solve all problems. Used by itself, this solution has a few disadvantages. First, it completely drops certain streams, despite the available bandwidth being sufficient for a lower quality stream that would still benefit the user. Second, it is inefficient in many use cases, because it forwards only full-resolution streams, which the client may render down-scaled because of user interface constraints. Third, because it only uses full-resolution high-bitrate streams, there is little flexibility in the SFU’s ability to adapt to changes in the available bandwidth – entire streams must be either dropped or resumed.

Conferencing systems commonly use viewports of varying sizes for the different endpoints, as depicted in figure 3.4. Simply using LastN is not optimal in such cases, because the same result can be achieved with fewer resources if the streams displayed in a small viewport were scaled down before they are encoded. By scaling down the streams displayed in a small viewport we can achieve a lower overall bitrate; or, equivalently, we can achieve the same overall bitrate with a larger value of N, which means that more streams can be displayed.

Scaling the streams sent over the network to match the viewports displaying them is not straightforward, for a number of reasons. Firstly, the viewports’ sizes may vary from endpoint to endpoint. Endpoints may have displays of varying size, they may have selected different streams to display in their large viewport (notably, if all endpoints show the current active speaker, the active speaker itself likely shows another endpoint), or they may use a different user interface layout altogether. These situations are especially likely to occur when the conference has a mix of participants using mobile and desktop devices. Secondly, the configuration may change at any time, such as when the speakers change. Adapting to this requires the SFU to control the encoding bitrate of senders, which is complex and incurs a delay. Additionally, it does not account for the different requirements of each receiver.

One way to address these problems is with the use of simulcast. Simulcast consists of the simultaneous encoding and transmission of each endpoint’s video stream at different resolutions and bitrates. This mechanism allows an SFU to dynamically change the output bitrate by switching to a different encoding of the stream in response to changes in the network conditions, which allows for more flexible congestion control.

In this chapter, we perform an analysis and an experimental evaluation on the use of simulcast in a WebRTC based conferencing system. We evaluate the resource usage of sending and receiving devices and server components, as well as the end-to-end image quality. We also explore the problem of the image quality change the user experiences when a switch between different encodings occurs, and propose a method of reducing the delay before the switch happens.

The remainder of this chapter is structured as follows: Section 5.1 describes how simulcast is used in WebRTC, and particularly in the Jitsi Meet system. Section 5.2 describes the metrics that we are interested in obtaining. Section 5.3 describes one improvement we propose, and section 5.4 presents our experimental evaluation. Finally, section 5.5 concludes the discussion and presents ideas for further research.

5.1 Simulcast in WebRTC

Google Chrome was the first major browser to support simulcast. It works through a limited and non-standard JavaScript API, and it was only available for the VP8 codec. The technique requires modifying the SDP blobs passed on to the PeerConnection API. In the local description, an “ssrc-group” [45] with semantics “SIM” is added, which contains two or three SSRCs, depending on the desired number of resolutions to send. In the remote description, the special media-line level attribute “x-conference-flag” is added. With this configuration, the sender produces and transmits additional encodings for its video stream. Each additional encoding halves the resolution of the previous one in each dimension, and uses a lower encoding target bitrate. For example, if the main stream is 1280x720 and there are two additional encodings, their resolutions are 640x360 and 320x180.

On the RTP level, each encoding uses a separate SSRC (see figure 5.1), and the encodings are entirely independent. That is, each encoding represents a complete stream, which can be decoded on its own.

5.1.1 Standardization

Simulcast is one of the less mature parts of the WebRTC standards, though there has been recent work addressing some of its aspects. At the signaling level, the Javascript Session Establishment Protocol (JSEP) internet draft [78] includes support for RTP Payload Format Constraints (RID) [77, 63], a framework for identifying RTP streams within an RTP session, as well as restricting their payload format parameters. RID adds to the JSEP specification the expressive power required for signaling simulcast and temporal and spatial scalability. When using JSEP to signal multiple encodings, the “m=” section for the RTP sender includes an “a=simulcast” attribute with a “send” simulcast stream description that lists each desired encoding, and an “a=rid” attribute for each encoding. The use of RID identifiers allows for disambiguation of the individual encodings, even when they are all part of the same “m=” section.
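As a hypothetical illustration (the attribute syntax changed across draft revisions, so this is only one possible form, and the RID identifiers are ours), a sender offering three encodings could include lines such as the following in its “m=video” section:

m=video 9 UDP/TLS/RTP/SAVPF 96
a=rid:hi send
a=rid:mid send
a=rid:lo send
a=simulcast:send hi;mid;lo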

At the transport level, the Frame Marking RTP Header Extension draft [19] describes a payload-type agnostic mechanism for conveying information about video frames, which may be necessary for the operation of SFUs (e.g. the ID of the frame’s temporal layer). The draft has not been adopted by major implementations, meaning that current SFUs need to be payload-type aware and extract the information they need from the RTP payload.

As of March 2017, the W3C WebRTC 1.0 editor’s draft [12] has been extended with support for sending multiple RTP encodings from a single RTCRtpSender, which naturally enables sending simulcast in the objectified parts of the WebRTC API.

At the time of this writing, one gray area in all of the standardization efforts is receipt of simulcast. A simulcast sender transmits the different versions of its video source as separate RTP streams, each with a unique Synchronization Source (SSRC) identifier. There is more than one way to handle this at the receiver side. One option is to make the receiver simulcast-aware and expose it to multiple independent incoming streams (any of which may be paused or resumed throughout the session). It then needs to decide which stream to render at any time. However, neither the JSEP draft nor the objectified parts of the WebRTC API provide an API to configure receipt of simulcast. This means that if simulcast is offered by the remote endpoint, the answer generated by a JSEP endpoint does not indicate support for receipt of simulcast, resulting in the remote endpoint only sending a single encoding per “m=” section. Thus, with the currently available standards and browser implementations, the implementation has to be done in the JavaScript layer, and this requires having multiple HTML “video” elements and switching the one being displayed. Since the JavaScript code has very limited control over what is being rendered (e.g., it does not have access to the frames that a “video” element renders and their timestamps), it is not clear whether this solution can be implemented reliably and without noticeable glitches when a video element is changed.

Figure 5.1: An illustration of the use of simulcast with an SFU. Note that the SFU rewrites the SSRC in order to produce a single continuous stream with SSRC=3 while switching between the input streams.

Another option is to move this logic to the SFU, producing a single RTP stream for each sender-receiver pair, effectively making simulcast transparent to the receiver. To correctly construct an RTP stream out of the multiple streams that it receives, the SFU modifies the SSRC, RTP sequence number, and RTP timestamp fields in RTP packets. This is illustrated in figure 5.1.
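The following sketch illustrates the idea (it is not the actual SFU code, and the RtpHeader type is a stand-in for the packet’s rewritable fields): sequence numbers and timestamps of the currently forwarded encoding are shifted by per-switch offsets, so that the output appears as one continuous stream with a fixed SSRC.

// Illustrative sketch of producing one continuous output stream while switching
// between simulcast encodings, as described above.
final class SimulcastSsrcRewriter {

    /** Minimal stand-in for the rewritable RTP header fields of a packet. */
    static final class RtpHeader {
        long ssrc;          // 32-bit
        int sequenceNumber; // 16-bit
        long timestamp;     // 32-bit
    }

    private final long outputSsrc;
    private int seqOffset;          // added (mod 2^16) to incoming sequence numbers
    private long tsOffset;          // added (mod 2^32) to incoming timestamps
    private int lastOutputSeq = -1; // last sequence number sent on the output stream

    SimulcastSsrcRewriter(long outputSsrc) {
        this.outputSsrc = outputSsrc;
    }

    /** Called when the SFU starts forwarding a different encoding of the source. */
    void onSwitch(int firstInputSeq, long firstInputTs, long desiredOutputTs) {
        // Continue numbering right after the last packet forwarded on the output stream.
        seqOffset = (lastOutputSeq + 1 - firstInputSeq) & 0xFFFF;
        tsOffset = (desiredOutputTs - firstInputTs) & 0xFFFFFFFFL;
    }

    /** Rewrites one packet of the currently selected encoding, in place. */
    void rewrite(RtpHeader h) {
        h.ssrc = outputSsrc;
        h.sequenceNumber = (h.sequenceNumber + seqOffset) & 0xFFFF;
        h.timestamp = (h.timestamp + tsOffset) & 0xFFFFFFFFL;
        lastOutputSeq = h.sequenceNumber;
    }
}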

We have chosen to use the second option, as it allows us to guarantee that the implementation works well for all clients (including mobile), while keeping the simulcast-related logic in one place.

5.1.2 Simulcast in Jitsi Meet

The Jitsi Meet interface shows one remote video in full size (we refer to the associated endpoint as “on-stage” or “main”), and all other remote videos scaled down to thumbnail size in a “filmstrip” overlaid on top of the main video. This is clearly demonstrated in figure 3.4.

This layout works very well with simulcast because most streams can be scaled down to a lower resolution. Subject to bandwidth restrictions, the SFU sends the on-stage video with the highest available resolution and the rest with the lowest available resolution.

There are two methods of selecting the on-stage endpoint, and different endpoints in a conference can use either of them. The default is to track the current active speaker (with the active speaker itself showing the previous active speaker on-stage). The second is actively selecting a specific endpoint; the user can initiate this by clicking on one of the thumbnail videos. Both selection strategies are client-driven. That is, the client makes a decision for the desired on-stage video and sends a message to the SFU notifying it of any changes.

In a typical case, the source streams are 1280x720 (720p) pixels at 30 frames per second (fps). With simulcast this results in two additional encodings with resolutions of 640x360 (360p) and 320x180 (180p), both at 30 fps. The encoding bitrates for the streams are limited to 2.5 Mbps, 0.5 Mbps, and 0.15 Mbps, respectively. As the available bandwidth drops, senders first reduce the target bitrate of the large stream, and then turn it off completely (and they do the same with the middle stream if the available bandwidth drops further).
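As a rough sketch of the per-receiver choice described above (illustrative only; the real allocation logic also has to handle congestion control and the temporary unavailability of encodings), an encoding could be picked for each sender as follows, using the bitrates given above.

import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch: for one receiver, pick which simulcast encoding of each
// sender to forward, given the on-stage endpoint and a bandwidth budget in Mbps.
final class EncodingSelector {
    enum Encoding { HIGH_720P, MID_360P, LOW_180P }

    static Map<String, Encoding> select(List<String> senders, String onStage, double budgetMbps) {
        Map<String, Encoding> result = new LinkedHashMap<>();
        double remaining = budgetMbps;
        for (String sender : senders) {
            Encoding e;
            if (sender.equals(onStage) && remaining >= 2.5) {
                e = Encoding.HIGH_720P;
                remaining -= 2.5;
            } else if (sender.equals(onStage) && remaining >= 0.5) {
                e = Encoding.MID_360P;
                remaining -= 0.5;
            } else {
                // Thumbnails always get the lowest encoding in this sketch.
                e = Encoding.LOW_180P;
                remaining -= 0.15;
            }
            result.put(sender, e);
        }
        return result;
    }
}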

The video codec in use is VP8, because as of this writing it is the most widely supported video codec among WebRTC-enabled browsers (supported by Google Chrome, Mozilla Firefox and other browsers natively, as well as by Safari via third party plugins), and because webrtc.org includes an efficient VP8 simulcast implementation.

The system also supports a non-simulcast mode, which is normally used for browsers which do not support sending simulcast. In the evaluation below we intentionally use this mode with the Chrome browser to make a comparison. Without simulcast, clients typically send a VP8-encoded 720p stream. The exact target bitrate depends on the network conditions and is limited to at most 2.5 Mbps, the same as the high resolution simulcast stream.

5.2 Metrics

Ultimately, we want to understand the effect of simulcast on user experience. In this study, we examine certain related variables under controlled conditions, and we compare the simulcast case with the non-simulcast case. We look at the following metrics: objective image quality, stream-switching delay, rendered frame rate, CPU utilization and network usage (on both a client and a server). The rest of this section discusses these metrics in further detail.

Image quality Peak Signal-to-Noise Ratio (PSNR [82]) is a direct measure of the difference between two images. When applied end-to-end (comparing the image before encoding at the sender and the rendered image at the receiver) it gives an objective metric of the image quality. We also experimented with the Structural Similarity (SSIM [82]) index, but we found that the results, when normalized to fit in the same range, are nearly identical to PSNR. Based on the results of Dosselmann et al. [18], these results were expected.

Client resource usage Client-side CPU usage is of critical concern. If system resources are overused, the user can be negatively impacted in many ways: the application, or other applications, may be slow to respond; encoding may be degraded (for example, the browser may choose to limit the encoding resolution if it detects high CPU load); and video streams may fail to render all frames, resulting in freezing or “choppy” video. It is hard to determine precisely when these effects may manifest; it depends on factors such as hardware and applications running in the background, among others.

End-users of a service often have limited network bandwidth. The uplink restrictions are usually stricter [17, 2], but as participants in a conference receive multiple video streams, they may reach their downlink limits first. Therefore, a conferencing system’s ability to optimize network bandwidth usage in both directions is vitally important.

Rendered frame rate One metric reflective of degraded user experience is the rendering frame rate for remote videos. If a receiver’s rendering rate is less than the rate of the received streams, it may indicate the overuse of the receiver’s resources.

Server resource usage On the server side, reducing CPU usage makes it possible to host more conferences per SFU, decreasing the infrastructure costs of running an SFU-based conferencing service. Reducing the network usage further decreases these costs.

Stream switching latency In our implementation, the client decides which participant to show on-stage and then notifies the SFU of its choice. This means that there is a non-zero time interval between the moment a client selects to promote a new on-stage participant, and the moment the high-quality stream for this participant becomes available and ready to render. We refer to the length of this period as the “switching latency”.

During the switching period, clients display the low resolution (180p) stream scaled up to the available space. If the latency is low, the transition from low resolution to high resolution is not visible, because the switch is accompanied by a fade-in effect. When the latency gets higher, there is a visible flicker as the resolution changes, which may be distracting for the user. Our observations indicate that this phenomenon becomes noticeable when the switching latency approaches 500 milliseconds.

This effect may occur every time that the active speaker of the conference changes, and data from the meet.jit.si service shows that these changes happen frequently: once every 22 seconds on average. It also occurs when the user manually selects another endpoint. Because of all this, we strive to minimize the switching latency.

5.3 Preemptive Keyframe Requests

To minimize the stream switching latency, we propose using logic in the SFU to anticipate clients’ switching behaviour and initiate a possible switch earlier.

When the SFU switches from forwarding one version of a stream to another (for example, from the 180p stream to the 720p stream), it needs to make sure that the first frame from the new stream is independently decodable (in the case of the VP8 codec this means that it is a keyframe [9]). When a keyframe is necessary, the SFU requests one from the sender with an RTCP Full Intra Request (FIR) [83] message, and must wait for at least one RTT before the frame arrives. This means that there is always a certain delay before the SFU can switch to another stream.

The decision of which stream to forward in high quality is driven by the receiver. When it elects a new “on stage” participant, it signals this to the SFU (via a message over WebRTC data channels), initiating a switch to the high quality stream. The switch requires a keyframe, so the SFU requests one from the sender. This means there is a delay of at least two round trip times (one receiver-to-SFU and one sender-to-SFU) between when the receiver requests the high-quality stream and the stream becoming available.

The SFU performs dominant speaker identification (see chapter 4) and notifies clients (again, via a data channel message) about the change. When a speaker change occurs, we can expect that some of the receiving clients may want to soon switch their selection. This allows us to request a keyframe earlier.

We propose the following: suppose that the dominant speaker in a conference changes, and let us refer to the new dominant speaker as “sender”. Let d_s be the round trip time for the sender, and d_i be the round trip time for the i-th receiver. Then the SFU schedules the sending of an FIR after a delay d = max_i(d_i) − d_s + e (or no delay, if d ≤ 0). This scheme aims to have the keyframe arrive soon after the last receiver indicates a switch to the new stream (hence the max_i(d_i) term). If the keyframe arrives too early for a particular receiver, it is discarded by the SFU and a new request has to be sent. Since the webrtc.org engine generates a keyframe at most once every 300 ms, this significantly increases the delay in switching to the new stream.

The term e is an additional delay added to reduce the probability of the keyframe arriving too early (incurring the 300 ms penalty) due to e.g., network jitter. We use a value of 10 ms.
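A minimal sketch of this scheduling logic follows (the class and method names are ours, not the actual implementation; the FIR-sending action is passed in as a placeholder).

import java.util.Collection;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Illustrative sketch of scheduling the preemptive FIR: d = max_i(d_i) - d_s + e.
final class PreemptiveFir {
    private static final long EXTRA_DELAY_MS = 10; // the term 'e' above

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    void onDominantSpeakerChange(long rttToSenderMs, Collection<Long> rttToReceiversMs,
                                 Runnable sendFirToSender) {
        long maxReceiverRtt =
                rttToReceiversMs.stream().mapToLong(Long::longValue).max().orElse(0);
        long delay = maxReceiverRtt - rttToSenderMs + EXTRA_DELAY_MS;
        if (delay <= 0) {
            sendFirToSender.run(); // no delay: the sender's RTT dominates
        } else {
            scheduler.schedule(sendFirToSender, delay, TimeUnit.MILLISECONDS);
        }
    }
}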

In the section below, we evaluate the effectiveness of the approach by measuring the stream switching delay directly.

5.4 Results

5.4.1 Experimental Testbed

We set up a testbed consisting of seven physical machines connected to the same LAN via 1 Gbps Ethernet. We use one of the machines as a server (for both signaling and media forwarding). It runs Ubuntu 15.10 on an Intel Core i7-5557U dual-core processor running at 3.1 GHz.

Another machine is used for the client-side measurements. It runs Ubuntu 16.04 on an Intel Core i7-4712HQ quad-core processor running at 2.3 GHz. Both of these machines have CPU frequency scaling effectively disabled by using the performance governor for cpufreq.

Figure 5.2: One of the frames from the sequence with a unique identifier encoded as a QR code in the upper left corner.

The remaining machines are used solely for load-generation, with no measurements taken there, apart from some sanity checks on the load and output bitrate. Each of them runs up to four conference participants, each within a separate Chrome tab.

CPU usage measurements are taken using the iostat utility. Network usage measurements are taken using ifstat with a window of 1 second.

To measure end-to-end PSNR, we encode a frame number identifier as a QR code and overlay it in the input sequence (see figure 5.2). On the receiver side, we render frames in an HTML canvas element and capture its contents in png format with lossless compression. We then scan the QR code in the rendered frame, find the corresponding input frame, and compute the PSNR using the compare utility from the imagemagick package. Obviously this kind of processing has effects on the receiver performance, so these measurements are performed in a separate experiment.

We use the well-known Four People sequence1, with the original 1280x720 resolution, reduced to 30 frames per second. The structure and motion of this particular video sequence are similar to a typical video-conferencing scene, making it a good fit for our scenario. Each frame has an embedded QR code used to match output frames to input frames. The QR code is in the top-left corner and is 350x350 pixels in size. The reason for the relatively large size is that it needs to be recognizable after scaling down to 320x180 and encoding. The resulting sequence has 302 frames and is just over 10 seconds long.

1https://media.xiph.org/video/derf/

Given a fixed target bitrate, a VP8 encoder (and likely any other video encoder) produces encoded frames whose PSNR relative to the input depends significantly on the input, so we take care to always compare PSNR from corresponding frames of the input sequence (or from the exact same sequence of frames, where we show mean values).

We use version 51 of the Google Chrome web browser as the client, and the VP8 codec.

For the non-simulcast tests, the senders are sending a 1280x720 stream, encoded with a target bitrate (t.b.) of 2.5 Mbps. For simulcast, they are sending one 1280x720 stream with a t.b. 2.5 Mbps, one 640x360 stream with a t.b. of 0.5 Mbps, and one 320x180 stream with a t.b. of 0.15 Mbps.

We use Chrome’s webrtc-internals tool to measure the received and rendered frame rates.

We use two simple scenarios in this evaluation. The first, used for measuring the effects on the senders, has one receive-only endpoint and one isolated sender endpoint. The second scenario, used for all other measurements, has a single conference with a variable number of participants. Each participant is both a sender and a receiver, and one of the receivers runs on a dedicated machine where the effects on the receiver are measured. We were able to test conferences with up to 16 participants with simulcast, but only up to 10 without simulcast, due to the client machines’ inability to handle the load with 11 participants.

5.4.2 Peak Signal-to-Noise Ratio (PSNR)

Figure 5.3 shows the frame-by-frame PSNR score for the duration of one input sequence. We observe consistent long term variations, which are explained by features of the input stream. We also see that the temporal scalability feature results in high frame-by-frame variation. This is because when it is enabled, different temporal layers are allocated different bitrate budgets. Since we do not make use of temporal scalability in the SFU, and since the mean values were the same, we disabled this feature in the rest of the tests.

More importantly, we see that with simulcast enabled, the encoder produces a consistently lower PSNR. This is explained by the motion vector reuse in the simulcast encoder in libvpx2. This feature allows the motion vectors calculated by one of the encoders to be reused by the other encoders, thus providing gains in terms of CPU usage. We verified this by disabling the feature, resulting in matching PSNR between the simulcast and non-simulcast encoders. However, in all other tests we have left the feature enabled.


Figure 5.3: PSNR on stage for the duration of one input sequence (10 s, 302 frames). The large-scale variation is due to the specific input. Simulcast produces lower PSNR due to motion vector reuse.

To better quantify the effect size, we measure the average PSNR across all frames of two full loops of the input sequence. In the full resolution (1280x720) video we observe an average PSNR of 37.83 dB with simulcast, and 38.21 dB without simulcast.

This difference of 0.38 dB corresponds to a difference in bitrate of about 300 Kbps or about 12% of the total bitrate. That is, in experiments with the same setup, we observe a PSNR of 37.79 (very close to the result of simulcast above) with a non-simulcast encoder with a target bitrate of 2.2 Mbps.

For the thumbnail streams, we compare the low quality simulcast stream (320x180, encoded at 0.15 Mbps) against the full non-simulcast stream (1280x720, encoded at 2.5 Mbps) but down-scaled and rendered in a 320x180 canvas. Doing so ensures we compare exactly what the user sees in the interface of our application. We observe an average PSNR of 23.12 dB with simulcast, and 23.62 dB without simulcast.

2We would like to thank Mo Zanaty (Cisco) and Marco Paniconi (Google) for this particular insight.

5.4.3 Frame Rate

We observe no difference between the received and rendered frame rate for either simulcast or non-simulcast. Even in the largest conferences, the receiver machine is able to render at the full transmitted rate. However, with simulcast, senders use a frame rate of 15 frames per second for the low-quality stream. This allows them to achieve the relatively high PSNR using only 0.15 Mbps. All full-resolution streams use 30 fps.

5.4.4 Stream Switching

To evaluate the preemptive keyframe request mechanism, we measure the stream switching delay directly on the receiving client. Specifically, we measure the time between when the request for the high quality stream is sent, and when the HTML “video” element changes its resolution.

The length of this delay depends on many variables, including the round trip time between the SFU, receivers, and senders; the time needed for the sender to generate a keyframe; the length of the receiver’s jitter buffer. We measure the delay directly from clients running in a production environment3. We took measurements over the course of about one week, once immediately prior to introducing preemptive keyframe requests, and once immediately after.

Figure 5.4 shows the cumulative distribution of the delay. The reason that the graphs level out so low (below 80%) is that the data contains events that were significantly delayed by the sender failing to transmit a high quality stream when the receiver requests it. In this case, the SFU continues sending the available low- or mid-quality stream until the high-quality stream becomes available. We observe that the median delay is lower when preemptive keyframe requests are used (435 ms vs. 502 ms).

5.4.5 Sender Load

As expected because of the configuration of the streams, with simulcast enabled, the sender transmits 26% more data than without simulcast (at 3.15 Mbps vs. 2.5 Mbps).

Next, we observe the sender as it participates in a conference with one more receive-only endpoint, and examine the CPU usage. For this experiment, we identify two parameters: simulcast (on and off) and rendering of the local video (enabled or disabled). We sampled the CPU usage once a second for 30 seconds.

3https://meet.jit.si

Figure 5.4: Cumulative distribution function of the stream switching delay measured on the client.

A summary of the results is in Figure 5.5. When the local video is not rendered, the CPU usage for the simulcast case is very close to the non-simulcast case. The means are 11.7% for non-simulcast and 12.2% for simulcast. Rendering the local video significantly increases the CPU usage in both cases. The means for the rendered video are 15.7% for non-simulcast and 17.9% for simulcast.

These results, while highly variable, show that the added cost due to simulcast is not significant. In fact, any overhead incurred by using simulcast is likely less than the cost of rendering the local video in a browser. One of the reasons for this low overhead is the motion vector reuse feature of the VP8 simulcast encoder in libvpx.

5.4.6 Receiver Load

We examine the incoming bitrate on a receiver in a conference when varying the number of senders. As expected based on the configuration of the streams, we observe that the bitrate grows linearly with the number of participants for both simulcast and non-simulcast, but at a significantly lower rate when simulcast is enabled. The benefits of simulcast are already substantial in a 3-way conference. In particular, without simulcast the receiver needs to be able to handle 5 Mbps, compared to 2.8 Mbps with simulcast, which is 45% less. For a conference of 15 participants, without simulcast the receiver needs to be able to handle 37.5 Mbps, and only 4.8 Mbps with simulcast.

Figure 5.5: Sender CPU usage. We have two parameters in this experiment: simulcast and rendering the local video. The error bars represent one standard deviation.

Next, we examine the CPU usage on the receiver. The summarized results are shown in figure 5.6. We observe that, when simulcast is disabled, CPU usage at the receiver grows by approximately 5.5% per participant. With simulcast enabled, it only grows by 1% per participant, resulting in 50% less CPU usage for a conference of 9.

This becomes important when the receiver is also a media sender, because the webrtc.org engine’s CPU budgeting can limit the encoding resolution when it detects the system is overloaded.

5.4.7 Server Load

We examine both the incoming and outgoing bitrate on the SFU. Figure 5.7 summarizes the results. We see that, as expected, simulcast adds to the incoming bitrate, due to the extra streams each sender transmits. This overhead is equal to roughly 0.65 Mbps per sender, or 26%.

For the outgoing bitrate, simulcast offers a significant advantage. In both cases the bitrate grows quadratically with the number of participants, but with simulcast, the quadratic coefficient is the bitrate of the lower quality streams (0.15 Mbps) instead of the high-quality streams (2.5 Mbps). As a result, for a conference of 10 participants (the most we could achieve without simulcast), the value drops from 228 Mbps to 39 Mbps with the use of simulcast, a decrease of 83%.

Figure 5.6: CPU usage at the receiver as it grows with the number of participants. The dashed lines show the best linear fit, and the error bars represent one standard deviation.

For a conference with two participants, the combined network usage is higher for the simulcast mode (11.5 Mbps vs. 10 Mbps), because of the overhead of the unused streams. In this case, simulcast offers no advantages and should not be used. For three participants the combined network usage is lower with simulcast (18.2 Mbps vs. 23.4 Mbps), and the difference grows with the conference size.

Figure 5.8 shows the CPU usage results for the non-simulcast mode and the simulcast mode. The values are the average of 60 samples taken over a one minute period during a conference with the given size. For a conference with 10 participants, the simulcast mode uses 50% less CPU than non-simulcast.

When taking both CPU usage and bitrate into account, we observe that simulcast has a lower efficiency: 8.5 vs 14.9 Mbps per 1% CPU. This means that while the total number of conferences that a single server instance can handle is increased with simulcast, the total bitrate that it can handle is decreased.


Figure 5.7: Network usage at the server as it grows with the number of participants. The dashed lines show the incoming bitrate, and the solid line shows the outgoing bitrate. The outgoing bitrate with simulcast grows quadratically, but with a small coefficient.

5.5 Conclusion

In this chapter, we performed various experiments to evaluate different aspects of a sample simulcast implementation and its impact on senders, servers and receivers in a WebRTC-based video conferencing environment. Our results show that, for senders, simulcast causes an average increase of 26% in outgoing bitrate, which was expected due to the additional encodings that are transmitted. However, we also found that due to the various optimizations implemented in the libvpx encoder, the CPU overhead for senders is insignificant.

We also showed that when endpoints receive and render a stream from a simulcast sender, its quality is consistently worse than that of an equivalent resolution non-simulcast stream (i.e., on average simulcast PSNR scores are 0.38 dB lower than those for non-simulcast). The difference is due to the optimizations mentioned in the previous paragraph, and we estimate that, on average, one can compensate for it by increasing the target bitrate at the sender by roughly 12%; alternatively, one can disable the optimizations and achieve the same PSNR values at the cost of increased CPU usage. From a bandwidth utilization perspective, using simulcast greatly reduces the amount of incoming traffic on the receiver, almost halving it in a three-way conference and decreasing it by more than 80% in a ten-participant conference. In addition to encoding specifics, we found that the overall receiver-side experience can be impacted by the process of switching between different streams, as it requires a refresh of the decoding context (a keyframe). We demonstrated how to mitigate this effect by tracking dominant speaker changes and using them to anticipate stream switches.

Figure 5.8: CPU usage on the server as the number of participants in the conference increases. The error bars represent one standard deviation.

Similarly, we found significant gains on the SFU servers themselves when using simulcast. The experiments show that, when using simulcast, the SFU consumes less bandwidth for roughly equivalent quality in conferences with three or more participants. The gain is negligible for three participants, but becomes very significant for larger conference sizes: it reaches 83% for a conference with 10 participants. On the flip side, we found that simulcast increases incoming bandwidth utilisation by 26% regardless of conference size.

Incidentally, we outlined one pitfall we consider likely to impact simulcast implementations. Specifically, we showed that simulcast’s positive performance impact on SFUs is significantly lower if the decision to not forward certain streams is made only after they have gone through decryption, data copying, encryption, and the rest of the usual SFU processing.

Future Work

Another technique which can be used to allow an SFU to access different encodings of a stream is Scalable Video Coding (SVC [28, 20]). With SVC, the sender produces a single stream with multiple interdependent layers, which can be combined additively to increase different quality aspects. The various layers depend on each other, forming a hierarchy. The lowest layer, i.e., the layer that does not depend on any other layer, is called the base layer and offers the lowest quality level. Each additional layer improves the quality of the signal in any one of the three dimensions: spatial, temporal, or signal-to-noise ratio. When used for conferencing, the SFU can select which layers to include in the output stream for each receiver.

SVC is part of the MPEG-DASH specification [4] and is widely used for adaptive streaming of live and recorded content. One major advantage that it offers is efficient storage of multiple quality levels, which is not directly relevant to the conferencing case.

Simulcast and SVC have been compared for use in IPTV [8] and adaptive streaming [41], but not for real-time conferencing, though this topic certainly warrants further exploration. Simulcast obviously has overhead in terms of the bitrate between the sender and the SFU, but it is not propagated to receivers because the streams are independent. With SVC, the encoding overhead [21] is incurred on the SFU-to-receiver links as well, so the overall overhead varies with conference size. Performing such a comparison is challenging because of codec support. Simulcast can be used with any codec, although the low encoding complexity overhead observed in our experiments depends on techniques specific to VP8. Scalable Video Coding must be supported by the codec. Currently, there are no complete implementations that support both simulcast and SVC.

Another topic worthy of further exploration is SFU-driven endpoint selection, that is, choosing which endpoint(s) to forward in high quality at the SFU instead of at the receiving endpoint. This eliminates the need for the preemptive keyframe requests discussed earlier and virtually eliminates the stream switching latency. A hybrid approach must be taken to also support user-initiated selection.

This concludes our discussion on simulcast. In the next chapter, we look at reducing the total server traffic in a service by using peer-to-peer connections whenever possible.

Chapter 6

A Hybrid Approach: Selective Forwarding Unit and Full-mesh

The use of LastN and simulcast greatly reduces the required bandwidth for a conference in an SFU-based system. However, there are two characteristics for which the full-mesh model still outperforms the SFU model. The first is latency, which is lower because endpoints can connect directly. The second is the network traffic on the servers, which is lower because they are not always in use.

By far, the most significant parameter affecting resource use is conference size. Large conferences are not possible with a full-mesh, though small ones are and might perform better than with an SFU. Thus, a service can benefit if we choose an architecture based on the size of the conference. The simplest way of achieving this is to create the conference with the appropriate configuration from the start. However, in practice, the conference size is rarely known in advance. Additionally, conference size can change over time, so the best option may involve using different architectures during different parts of a conference. For these reasons, we have chosen to switch the architecture dynamically during a conference. Specifically, we use a full-mesh architecture for conferences of size P or less and switch to an SFU when the size reaches P + 1. Conversely, we switch back to full-mesh when an endpoint leaves and the size drops down to P. This leaves us with a choice for the parameter P.

We chose to use P = 2; that is, we only use full-mesh for the trivial case of two endpoints in a conference. Other services, such as Google Hangouts, do the same. There are a few reasons for this decision. First, we analyzed the conference size on the meet.jit.si service and found that over 77% of the time is spent in one-to-one conferences, meaning that P = 2 should be sufficient to gain most of the benefits of reduced traffic.

Second, since we started with an SFU system, we wanted to ensure that introducing the full-mesh mode does not have any adverse effects in any situation. We cannot guarantee this when P ≥ 3. For example, in situations with limited upstream bandwidth, an endpoint benefits from an SFU because it only needs to send one copy of its stream. Additionally, if a direct connection is not possible between a pair of endpoints, they need to use a relay server. In this case, the full-mesh solution uses two or more relay servers, leading to strictly increased traffic as compared to an SFU.

The third reason is the need for seamless switching between the two modes, and the practical desire to keep the overall complexity of the system low. Users should not experience any disruption when the conference changes from an SFU to full-mesh or vice versa. All endpoints must agree on which mode to use, and do the switch at nearly the same time; otherwise, users experience a disruption. With two endpoints, we found that we can achieve a seamless transition by switching to the direct connection based on the ready-state of the transport layer connection (ICE). With three endpoints this is insufficient; endpoint A must first consider the ready-state for the connection between endpoints B and C before switching to direct connections.

We added support for switching to a full-mesh in the Jitsi Meet system. Because it only supports P = 2, we refer to this feature as “peer-to-peer for one-to-one”; for brevity, we use “P2P” instead of “peer-to-peer”. The implementation is detailed in Section 6.1. We then enabled the feature in the meet.jit.si service and compared its performance to the SFU-only mode in terms of quality of service for users and server traffic. We present the results in Section 6.2.

6.1 Implementation

6.1.1 Topology

The Interactive Connectivity Establishment (ICE [64]) protocol WebRTC uses is designed for NAT traversal, that is, establishing a direct connection between two endpoints even in the presence of Network Address Translation (NAT). Endpoints contact a STUN [65] server to determine their translated address, and advertise it as a potential address candidate. However, in the presence of firewalls or certain types of NATs a direct connection is not possible. For such cases, ICE uses a tunnel through a TURN [50] relay server. A direct connection is preferred whenever possible: TURN candidates have lower priority in the ICE check list.

[Figure 6.1 diagram: Endpoint A and Endpoint B each maintain an SFU connection (optionally through a relay), and a direct P2P connection between them (optionally through relays).]

Figure 6.1: The topology used for the SFU and peer-to-peer modes. The relays are optional and only used when a direct connection fails. The SFU connection is always maintained, while the P2P connection is only established when there are two endpoints.

WebRTC does not make a distinction between a server and another browser for the remote peer; it uses ICE in both cases. Since we control the SFU server, we can ensure it is not behind a restrictive NAT, making a direct UDP connection possible in the vast majority of cases. Some restrictive firewalls block UDP traffic; to support endpoints in such networks we provide TURN relay servers. Endpoints establish a TLS connection to the relay server, bypassing most firewalls because it is indistinguishable from regular HTTPS traffic, and tunnel the ICE and multimedia traffic through the relay. The full topology for the SFU mode is shown in figure 6.1.a. The relay server can be co-hosted on the same machine as the SFU, but we chose to use separate servers as this allows us to deploy them in different geographical regions. We observe that in normal operation only 0.2% of endpoints using meet.jit.si connect to the SFU via a relay, with the rest connecting directly over UDP.

For peer-to-peer connections the endpoints also use ICE, but the probability of failure to connect directly is higher because there may be NATs on both sides. They use the same TURN relay servers with the additional option to create the tunnel over UDP. In rare cases two relays may be necessary. Figure 6.1.b shows the topology.

6.1.2 Features

With the SFU system we use LastN and simulcast as described in chapters 4 and 5. Senders transmit three encodings of their stream to the SFU. Typically, these are 720p, 360p and 180p versions encoded at up to 2.5Mbps, 0.5Mbps, and 0.15Mbps, respectively. Simulcast requires the VP8 codec, as at the time of this writing it is the only codec for which WebRTC clients support simulcast.

For the peer-to-peer mode simulcast is counter-productive. Since the receiver only makes use of one of the encodings, it is best to allocate all available bandwidth to it. Therefore, we disable simulcast for the peer-to-peer connection, allowing the use of any codec supported by both endpoints, regardless of simulcast support. We chose to use H264 because of the wider hardware decoding and encoding support in various devices. Regardless of the selected codec, senders transmit a single 720p stream at up to 2.5Mbps.

Another difference between the two modes is that we use Forward Error Correction [47, 30] for the peer-to-peer connection, while the SFU does not currently offer support for FEC.

6.1.3 Dynamic Switching

Our goal is to make the switch between the SFU connection and the direct connection seamless. That is, the user should not experience any disruption when the switch happens. Ideally the user should not even notice this event.

To accomplish this we start with an SFU connection, and maintain it at all times. We switch to the P2P connection opportunistically when we know that there are exactly two endpoints in the conference, and that a P2P connection exists between them. Figure 6.1 illustrates this. Starting the conference with the SFU connection also allows us to minimize the start-up time, because in the majority of cases the connection is established using local host [64] candidates and does not have to wait for gathering of STUN or TURN candidates.

Switching to P2P Mode

In WebRTC terms, we use separate PeerConnections for the SFU and P2P modes. A conference starts as soon as there are at least two participants. First, the signaling server initiates an XMPP/Jingle session [68, 49] with each endpoint. Each endpoint then creates a PeerConnection for the SFU connection. The typical WebRTC offer/answer

takes place over the signaling channel so that the ICE connectivity process starts. Each participant has all of its MediaStream [15] objects added to the PeerConnection, and the media direction is set to sendrecv [78], allowing media playback to begin as soon as the connection is established.

If at any time we transition to a state with exactly two endpoints in the conference (i.e., when the second endpoint joined or the third endpoint left), we create a new PeerConnection and initiate a session with the remote endpoint. The session also uses XMPP/Jingle, but this time directly between the two endpoints, bypassing the centralized signaling server. The two endpoints compare their IDs to determine which one acts as the Jingle initiator, the other one taking the responder role. We add all local MediaStream objects to this peer connection as well, and we set the media direction to sendrecv.

After the two connections start their connectivity checks, events can unfold in one of two flows. If the SFU connection reaches the ICE connected [12] state, we add the remote SFU MediaStream objects to the user interface and a normal conference starts with audio and video in both directions. As soon as the P2P connection succeeds, provided there are still exactly two endpoints in the conference, we replace the remote SFU MediaStream objects with the P2P objects and set the media direction for the SFU objects to inactive. Doing so keeps the ICE connection running, and since no media is exchanged, bandwidth is saved. The user interface now renders audio and video from the P2P connection.

On the other hand, if the P2P connection is established first, we add its remote MediaStream objects to the user interface immediately and set the media direction of the SFU connection to inactive. When the SFU connection is established, we take no action; that is, we keep it inactive and do not display any of its streams.

Switching to SFU Mode

We switch back to the SFU connection in two cases: when a third participant joins the conference and when the ICE connection for the P2P mode enters the failed state [12]. First, we adjust the media direction of the SFU connection to sendrecv, then we replace the P2P MediaStream objects in the user interface with the corresponding objects from the SFU connection. Finally, we close the P2P PeerConnection (if we need to switch to P2P again later, we create a new one).

To prevent unnecessary switching between the two modes when there are frequent changes in conference sizes1, we take a threshold approach. Attempts to re-establish the P2P connection only happen after a threshold of 5 seconds has passed since the last conference size change event (e.g., from three endpoints to two).
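As an illustration only, the debouncing logic amounts to remembering the time of the last conference size change and refusing to attempt P2P until the threshold has elapsed. The sketch below uses hypothetical Java names; it is not the actual Jitsi Meet code.

// Hypothetical sketch of the 5-second debounce before attempting to switch (back) to P2P.
final class ModeSwitchGuard {
    private static final long SWITCH_DELAY_MS = 5_000; // threshold described above
    private volatile long lastSizeChangeMs;

    /** Called whenever an endpoint joins or leaves the conference. */
    void onConferenceSizeChanged() {
        lastSizeChangeMs = System.currentTimeMillis();
    }

    /** A P2P attempt is allowed only with exactly two endpoints and no recent size change. */
    boolean mayAttemptP2p(int conferenceSize) {
        return conferenceSize == 2
                && System.currentTimeMillis() - lastSizeChangeMs >= SWITCH_DELAY_MS;
    }
}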

6.2 Experimental Evaluation

To quantify the effects of the P2P mode we perform experiments and measurements on the meet.jit.si service.

The service runs on the Amazon Web Services2 (AWS) cloud and is distributed across four regions: N. Virginia (us-east-1), Oregon (us-west-2), Ireland (eu-west-1), and Australia (ap-southeast-2). In each region, we have a TURN server and a fixed number of “shards” used for failover. Each shard contains a signaling server and an autoscaling group of at least two SFUs for load balancing. If there is a surge of conferences on a specific shard, we increase the number of SFUs in that shard. If an SFU fails, we move the ongoing conferences to another SFU in the same region, and if a whole shard fails, we move the conferences to another healthy shard.

Each SFU and TURN server runs on an AWS c4.xlarge instance3 with 4 vCPUs, 7.5 GiB of RAM, and high network performance. The infrastructure is behind a cluster of geographically distributed HAProxy4 instances. When a client attempts to join a conference room, it contacts a HAProxy, which checks whether or not this is an existing room. If it is a new room, the HAProxy assigns it to the shard closest to the geographic location of the user. If the room already exists, it retrieves the shard hosting the room and provides the client with the appropriate configuration to connect to that shard.

6.2.1 Infrastructure Resource Use

One of the main benefits of using direct connections for one-on-one conversations is that traffic is offloaded from the server-side infrastructure. We use the following metrics to quantify this:

• Average per-participant bitrate

1This commonly happens when one endpoint refreshes the page in the browser, resulting in a re-join.
2https://aws.amazon.com/
3https://aws.amazon.com/ec2/instance-types/
4http://www.haproxy.org

• Total data transferred

• CPU usage and system load

• Thread count

Note that the effects of introducing the dynamic switching extension depend significantly on the fraction of one-on-one conferences on the service. Obviously, if there were no one-on-one calls, there would be no effect.

Because we wanted to measure the impact on the whole service, it was not possible to enable the feature for only a subset of the conferences. Instead, we compare metrics for a period of two weeks just prior to the introduction with the same metrics taken in the two weeks following the introduction of the feature. During the first period all calls, including one-on-one, go through the SFU. During the second period one-on-one calls go peer-to-peer if possible, otherwise going through the SFU; TURN relays were disabled in order to observe the net effect of peer-to-peer offloading. Calls with three or more endpoints always go through the SFU.

We chose a period of two weeks because it is long enough to produce sufficient data while remaining unaffected by daily and weekly patterns, and short enough to not be affected by other changes in the environment (like other features being introduced and the growth of the service itself). During the test period the load of the machines never exceeded the threshold needed to scale them up; therefore, the number of servers for both tests was always the same.

The first period includes 23470 conferences with a combined duration of 689273 minutes, and the second one includes 23320 conferences with a combined duration of 696311 minutes. The differences between the two periods are within 1% (+0.6% in number of conferences and −1.0% in combined duration), so we compare the rest of the metrics directly without applying normalization.

The distribution of conference sizes is as follows: 77.8% with size 2 (one-on-one), 12.2% with size 3, 5.4% with size 4, 2.0% with size 5, and the remaining 2.6% with size 6 or more (see 8.4 above). There are no significant differences between the two periods. We derive this distribution by sampling the active conferences once per minute.

The total data transferred towards the servers during the two periods is 7.1TB and 3.6TB respectively, a reduction of 49%. From servers to clients it is 6.0TB and 2.7TB respectively, a reduction of 55%. The combined reduction is 52%. Given the number of participants, this translates to an average combined bitrate of 1.1Mbps and 0.56Mbps per participant, respectively for the two periods.

Note that this reduction of 52% does not match the 77.8% of one-on-one conferences. This is in part due to the failure of some endpoints to connect directly, and in part because the resource consumption of conferences grows faster than linearly with the conference size. That is, a single conference of size N is not equivalent in terms of bandwidth to N/2 conferences of size 2.

The average CPU usage on the SFUs is 4.7% and 3.2% respectively during the two periods; a reduction of 32%. The average memory usage is 3740MB and 2420MB; a reduction of 35%.

The average number of Java Virtual Machine (JVM) threads used by the SFU process is 392 and 378 respectively, hardly any reduction. This is because clients are connected to the SFU at all times and most of the necessary threads on the SFU are already allocated (though not used as much, as we can see from the CPU usage result). The average system load as reported by the top command is 0.28 and 0.15 respectively, a reduction of 46%.

The average round trip time measured by the SFUs is 110ms and 114ms, respectively, for the two periods. This shows that the decreased load and network traffic on the servers do not result in a reduced RTT, at least not at the low load levels we observed.

6.2.2 Quality of Service

Another potential benefit of using direct connections for one-on-one calls is higher quality of service (QoS). To quantify this we measure RTT, call establishment time, packet loss, and download bitrate. We collected the data from the endpoints via the WebRTC statistics API [6].

We took measurements from the meet.jit.si service for five weeks, from a total of 58489 conferences. During this period, TURN was enabled to compare the performance of TURN-relayed calls to SFU-routed calls. This period did not coincide with the infrastructure measurements sampling period, to avoid skewing of the infrastructure results due to having TURN enabled.

We only examine UDP-based calls because bandwidth estimations over TCP are problematic as TCP runs its own congestion control. Additionally, TCP connections only work with the SFU, since browsers do not support receiving incoming TCP connections. Therefore, TCP is not relevant in our comparative evaluation.

We examine the us-east-1 and eu-west-1 regions separately because we found that different regions have very different characteristics, partly due to differences in the ISP

peering agreements [55]. We only consider same-region calls, i.e., calls where all of the participants are in the same region. We chose these two regions because they host the majority of conferences. The other regions were not included in our analysis due to the small number of calls.

RTT and Call Establishment Time

For the RTT and call establishment time measurements, we distinguish two groups: direct and server-relayed calls, i.e., calls that are either relayed through TURN or routed through the SFU.

Packet loss is around 0.5%, and is independent of the group. These low levels of packet loss are a consequence of using the Google Congestion Control (GCC [16]) algorithm, which uses delay measurements to reduce the sending rate before actual congestion and packet loss occur.

Round-trip time Next we look at the RTT for relayed and direct calls. Figure 6.2 summarizes the results. On average, calls in the US have 27.5ms longer RTT than EU calls. This difference may be due to the longer average distances between endpoints in the US, and differences in the network infrastructure.

Direct calls in the EU have an average RTT of 92ms while the average for relayed calls is 132ms. This 40 ms difference may be due to an increased number of hops between the peers, due to the use of a server. In the US, direct calls have an average RTT of 122ms vs. 157 ms (35 ms longer) for relayed calls.

Connection establishment time We define the connection establishment time as the duration from when the gathering of ICE candidates begins until ICE transitions to the connected state. The values that the two endpoints measure differ based on which endpoint initiates the session, and which endpoint has the ICE controlling role (that is, it nominates the selected candidate pair). In our case, the initiator always assumes the controlling role, so we group the results by initiator/responder. Figure 6.3 summarizes the results. The average connection establishment duration in the EU is 1700 ms and 1950 ms in the US. This 12% increase fits our observations about the generally longer RTT in the US. When comparing direct vs. relayed calls, we observe a slight decrease in the establishment time for direct connections.


(a) Users in the US (the us-east-1 region). (b) Users in Europe (the eu-west-1 region)

Figure 6.2: The cumulative distribution of round-trip-time for direct vs. relayed con- nections. The dashed lines indicate the means. Direct calls have an overall lower RTT than relayed calls.


(a) Users in the US (the us-east-1 region). (b) Users in Europe (the eu-west-1 region)

Figure 6.3: The cumulative distribution of connection establishment time for direct vs. relayed connections, further split by the endpoints’ initiator/responder role. The dashed lines indicate the means. For all cases, direct calls have a slightly lower establishment time than relayed calls.

6.3 Conclusion

In this chapter, we described an extension of an SFU-based system, which switches to a full-mesh topology when a conference has only two endpoints. We detailed the implementation using WebRTC. We then introduced the extension to the meet.jit.si service and evaluated the proposed system in terms of infrastructure cost and user experience.

Our results quantify a significant reduction in the total infrastructure costs. Furthermore, on average, direct calls have significantly lower latency. Throughput efficiency is severely degraded when the call is routed through an SFU because of the usage of group-mode specific features such as simulcast and RTCP termination. As a result of using the GCC algorithm, the packet loss is negligible regardless of the topology. The call establishment time remains practically the same for direct and server-relayed calls.


Chapter 7

Performance Optimizations for Selective Forwarding Units

When it comes to video conferencing servers, Selective Forwarding Units are usually expected to offer better computation efficiency than MCUs because they do not decode or encode audio or video. However, their good performance should not be taken for granted, as it depends on the specific software implementation.

An SFU is a complex software system which has to be engineered with performance as a goal. Poor performance can lead to degraded user experience, as well as inefficiency and higher service operating costs.

In this chapter, we explore the factors affecting performance and propose ways to improve them. We start by describing the function of an SFU, and the Jitsi Videobridge implementation in particular, in Section 7.1. In the next section (7.2), we propose an efficient algorithm for selective forwarding, and we evaluate its effects on CPU usage. Finally, in Section 7.3 we explore the topic of memory management, specifically in systems with automatic garbage collection. We propose a technique to reduce the strain on the garbage collector, develop a methodology to evaluate its effects, and perform experiments to find the most suitable configuration to use for an SFU.

7.1 Introduction to Selective Video Routing

An SFU’s primary task is to route RTP packets, and this constitutes most of its workload. The task consists of the following steps:

1. Receive a packet from a source endpoint via a network socket.

2. Process the packet in a context specific to the source endpoint (the receive pipeline1).

3. Select the set of endpoints to receive the packet, and for each of them:

(a) Process the packet in a context specific to the destination endpoint (the send pipeline).
(b) Send the packet to the destination endpoint via a socket.

This process is illustrated in figure 7.1. Both the send and receive pipelines may modify some of the header fields in order to produce a continuous stream for each destination, so it may be necessary to create copies of the packet.

[Figure 7.1 diagram: socket.receive() feeds the receive pipeline; for each destination endpoint X, an X.accept()? check decides whether the packet enters that endpoint’s send pipeline and is sent via socketX.send().]

Figure 7.1: The primary task of an SFU is to route packets from a source endpoint to a set of destination endpoints. The accept() check implements selective forwarding.

There are numerous secondary tasks that SFU implementations perform, but they usually do not contribute significantly to their performance. These include dominant speaker identification, connectivity establishment (ICE and DTLS), signaling and control logic, keeping track of stream statistics and overall statistics, self-health monitoring, bandwidth estimation, and logging.

1The “receive” and “send” pipelines are named from the point of view of the SFU.

Jitsi Videobridge2 is written mostly in Java, with certain components written in C specifically to improve performance. Some of the underlying libraries use Kotlin; however, as Kotlin is compiled to JVM bytecode, its characteristics are similar to Java’s.

It uses the ice4j library3: an implementation of the Interactive Connectivity Establishment protocol [64] that offers an abstraction layer with a virtual DatagramSocket interface for receiving packets from an endpoint. Packets are received using a synchronous receive() call into a buffer (a Java byte[]) managed by the application.
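A minimal sketch of this receive loop is shown below. The PacketReader and ReceivePipeline names are illustrative, and the standard java.net.DatagramSocket type stands in for ice4j’s virtual socket; this is not the actual Jitsi Videobridge code.

import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;

// Illustrative reader loop: receive packets synchronously into application-managed
// byte[] buffers and hand them to the endpoint's receive pipeline.
final class PacketReader implements Runnable {
    interface ReceivePipeline {                 // placeholder for the pipeline entry point
        void process(byte[] buffer, int length);
    }

    private final DatagramSocket socket;        // stands in for ice4j's virtual socket
    private final ReceivePipeline pipeline;

    PacketReader(DatagramSocket socket, ReceivePipeline pipeline) {
        this.socket = socket;
        this.pipeline = pipeline;
    }

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            byte[] buffer = new byte[1500];     // one buffer per packet, managed by the application
            DatagramPacket datagram = new DatagramPacket(buffer, buffer.length);
            try {
                socket.receive(datagram);       // synchronous, blocking receive()
            } catch (IOException e) {
                break;                          // socket closed; stop the reader
            }
            pipeline.process(buffer, datagram.getLength());
        }
    }
}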

7.1.1 Receive Pipeline

RTP packets enter the application through an ice4j socket. Once received from a socket, a packet passes through the receive pipeline associated with the remote endpoint. The pipeline is a series of steps, each performing a specific action and possibly modifying the packet (a minimal sketch of such a chain follows the list below). The steps are:

1. Basic statistics tracking. Here we count the combined number of packets and bytes received.

2. RTP/RTCP pre-parsing. Here we recognize the type of the encrypted packet (RTP or RTCP) and read some basic fields (SSRC, sequence number) required for the next step.

3. SRTP or SRTCP [11] authentication and decryption. Here we perform the SRTP “unprotect” procedure. First, we compute a hash for the packet’s contents and compare it to the packet’s authentication tag. If the results do not match, or if we have already received a packet with this sequence number, we discard the packet. Otherwise, we perform decryption of the packet’s payload. This step modifies the contents of the packet’s payload.

4. Audio level extraction. This step applies only for audio packets. We read the RTP header extension for the audio level [44] and fire an event. The audio level is then used for dominant speaker identification.

5. Per-stream stats tracking. Here we calculate the statistics for individual streams, including packet rate, bitrate, packet loss, and jitter.

2https://github.com/jitsi/jitsi-videobridge
3https://github.com/jitsi/ice4j

6. Silence discarding. Here we discard audio packets that contain only silence and repair the resulting stream by adjusting the packet’s sequence number and timestamps to hide previously discarded packets. This step modifies the RTP headers.

7. RTX [62] and padding [70] termination. Here we repair the stream by removing the RTX encapsulation of re-transmitted packets and discard packets containing only padding (and no payload). This step modifies the RTP payload and headers.

8. Video parsing and bitrate calculation. Here we perform further parsing of the video packets, reading specific fields such as a frame sequence number, temporal layer ID, keyframe flag, etc. We read this information either from the payload (e.g., the VP8 Payload Descriptor [86]), or from the framemarking RTP header extension [19].

9. Stream repair. Here we monitor the stream for lost packets and send requests for retransmission using RTCP NACK packets [83] if necessary.

10. RTCP termination. Here we handle all incoming RTCP packets. This includes retransmission requests, bandwidth estimation, and keyframe requests.
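The sketch below illustrates the general shape of such a pipeline: a chain of steps, each of which may modify the packet or drop it. The Packet, PipelineStep, and Pipeline names are placeholders rather than the actual Jitsi Videobridge classes.

import java.util.List;

// Minimal sketch of a packet pipeline as a chain of steps.
final class Packet {
    byte[] buffer;
    int length;
}

interface PipelineStep {
    /** Returns the (possibly modified) packet, or null if the packet should be dropped. */
    Packet process(Packet packet);
}

final class Pipeline {
    private final List<PipelineStep> steps;

    Pipeline(List<PipelineStep> steps) {
        this.steps = steps;
    }

    Packet process(Packet packet) {
        for (PipelineStep step : steps) {
            packet = step.process(packet);
            if (packet == null) {
                return null; // dropped, e.g. a failed SRTP check or a silence-only audio packet
            }
        }
        return packet;
    }
}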

After packets go through the receive pipeline, they pass to a Conference object, which is responsible for their further handling in the context of a specific conference.

7.1.2 Selective Forwarding

The Conference object implements selective forwarding. That is, for each received packet and each remote endpoint, it decides whether to forward the packet to the endpoint. The decision considers the following factors: LastN and the list of speakers in the conference, congestion control and simulcast (which determine the encodings currently enabled for each endpoint), and whether the endpoint supports the given media type.

A video packet is only forwarded to a destination endpoint if:

• The destination endpoint supports audio or video (depending on the packet media type).

• The source endpoint is in the LastN set.

• The packet’s encoding is currently enabled for the destination endpoint. Simulcast uses three different encodings for each stream, and at any time at most one of them is enabled. Encodings are disabled for congestion control when the bandwidth estimation indicates that there is insufficient bandwidth for all encodings.

Algorithm 1 Algorithm for selective forwarding.
procedure FORWARD(packet, sourceEndpoint)                       ▷ Handles an incoming packet.
    handler ← null
    for endpoint in endpoints do
        if endpoint == sourceEndpoint then
            continue
        if ACCEPT(packet, sourceEndpoint, endpoint) then        ▷ This step must not modify the packet.
            if handler != null then
                handler.send(clone(packet))                     ▷ The packet is cloned only if necessary.
            handler ← endpoint
    if handler != null then
        handler.send(packet)                                    ▷ Use the original packet, not a clone.
    else
        free(packet)                                            ▷ No endpoint accepted the packet.

procedure ACCEPT(packet, sourceEndpoint, destinationEndpoint)
    mediaType ← getMediaType(packet)                            ▷ Audio or video.
    encoding ← getEncoding(packet)                              ▷ The packet’s simulcast encoding.
    if mediaType == AUDIO then return true
    else if !supportsVideo(destinationEndpoint) then return false
    else if !isInLastN(sourceEndpoint) then return false
    else if !isEncodingEnabled(encoding, destinationEndpoint) then return false
    else return true


Because packets forwarded to an endpoint are always modified in the send pipeline (see the next section for details on this pipeline), they must be cloned4. A naïve implementation passes a cloned packet to each destination endpoint, discarding unwanted packets in the send pipeline. This results in unnecessary cloning for endpoints that do not accept the packet, and for the last endpoint that accepts it since it could just use the original. We use algorithm 1 to avoid these unnecessary operations.

7.1.3 Send Pipeline

We have a separate send pipeline for each endpoint. It handles packets coming from all other endpoints in the conference, and consists of the following steps.

4To clone a packet, we create a new buffer and copy the packet’s contents.

1. Simulcast. Performs simulcast-related logic, such as rewriting the SSRC, Timestamp, and Sequence Number, producing a continuous stream even when the accepted encoding changes. This is also where the internal simulcast state is updated and the enabled encodings are changed if necessary.

2. Packet caching. We save outgoing packets in case they need to be retransmitted later. Since packets are modified later in the pipeline, we create and save a copy of each packet.

3. Statistics tracking. Here we track statistics for individual streams, such as number of packets and bytes sent, and packet loss reported by the receiver.

4. Transport-wide congestion control [29] tagging. Here we tag outgoing packets with a sequence number common for all streams (used for congestion control). This step modifies the RTP headers of packets.

5. SRTP or SRTCP [11] encryption and tagging. Here we perform the SRTP “protect” procedure. The packet’s payload is encrypted and an authentication tag is computed and attached. This step modifies the RTP payload of packets.

6. Basic statistics tracking. Here we count the combined number of packets and bytes sent.

7.2 Optimizing Computational Performance

7.2.1 Evaluation and Optimizations for Simulcast

In Chapter 4, we observed that without simulcast CPU usage is proportional to total bitrate. However, with the addition of simulcast there is an increased potential for CPU usage not to scale linearly. Some encodings are received and processed (contributing to CPU usage) but are not forwarded (and do not contribute to the network bitrate).

While it is not possible to entirely eliminate the processing of unused encodings, we found significant improvements in performance when applying two optimizations. First, if an encoding is not selected by any receiving endpoint, we do not perform decryption of the payload (though we still do the authentication step). We can do this because the information in the headers is not encrypted. The second is the implementation of the selective forwarding algorithm (see section 7.1.2) instead of the naïve approach of processing packets.
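A rough sketch of the first optimization follows; the SrtpContext interface is an assumed helper type used for illustration, not the actual implementation.

// Always authenticate the SRTP packet (and check for replays), but only decrypt the
// payload when at least one receiver currently has the packet's encoding selected.
final class SrtpUnprotectStep {
    interface SrtpContext {
        boolean authenticate(byte[] packet, int length); // false for bad auth tags or replays
        void decryptPayload(byte[] packet, int length);  // decrypts the payload in place
    }

    /** Returns false if the packet failed authentication and should be dropped. */
    static boolean unprotect(byte[] packet, int length, SrtpContext context,
                             boolean encodingSelected) {
        if (!context.authenticate(packet, length)) {
            return false;
        }
        if (encodingSelected) {
            context.decryptPayload(packet, length); // skip this cost for encodings nobody receives
        }
        return true;
    }
}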

We compared the performance before and after these optimizations with the same testbed and procedure used in Chapter 5. We can see a consistent decrease in CPU usage, which is more significant for large conferences (since more of the high-bitrate streams are not selected). For a 16-participant conference, the CPU usage decreased by 40%. Figure 7.2 shows the full results.


Figure 7.2: CPU usage before and after the simulcast optimizations. The error bars represent one standard deviation.

7.2.2 Profiling

To better understand which codepaths contribute most to the CPU usage, we performed sampling profiling on a test machine while generating representative load. That is, we created test conferences with sizes matching the distribution observed in the production service (see Chapter 6).

We observed the following results. SRTP transformations (encryption and decryption) account for 33% of the processing time, network operations such as sending to or receiving from a socket 31%, internal JVM operations 13%, and various other RTP handling such as parsing, modifying headers, etc. 5%. Bandwidth estimation and RTCP related functions account for 5% of time, caching 2%, simulcast-related logic 9%, and all other functions 3%.

We see that operations directly proportional to bitrate (network ops and SRTP) account for over 64% of the processing time. We also see that, as expected, the vast majority of CPU time is spent in the packet pipelines. This is certainly over 84%, and probably above 95% since most of the internal JVM functions are most likely related to the pipeline.

The SRTP calculations have already been optimized using a separate library. There are different implementations, and the best choice depends on hardware, availability of the openssl lib, and JVM version. Because of this we perform a benchmark and select the best implementation at runtime. Further optimizations in this layer are unlikely to have any significant effect.

The results also confirm that the dominant speaker selection algorithm is efficient, as expected. It is included in the 3% of “other”.

We expect that further optimizations of the socket layer in ice4j, simulcast, and the bandwidth estimation code may yield good results.

7.3 Optimizing Memory Performance

SFU applications do not require large amounts of memory. Their primary task is forwarding packets, and apart from the memory used to represent the contents of the packet itself, there is little context that is preserved. Once a packet is forwarded the memory is no longer needed. There is one important exception – the packet cache.

The packet cache saves packets so that retransmission requests can be satisfied. Since packets may be modified in the send pipelines, they need to be cached after the modifications5. This requires more memory as the outgoing bitrate is usually higher than the incoming. A simple calculation shows that the total amount of memory used by the cache is not much by modern standards. If packets are cached for two round-trip times, the maximum RTT is 1 second, and the SFU handles 1 Gbps of traffic (all generous assumptions), it needs just 2 s × 1 Gbps = 256 MB for the cache.

On the other hand, memory allocation throughput can be challenging, particularly on systems using automatic garbage collection (GC). The application continually allocates small blocks of memory6, uses them for a short period of time and then releases them. This pattern of memory allocation puts a significant strain on the garbage collector.

5Technically, they are cached before the SRTP transformation, but after all other transformations.
6RTP packets are sized to fit the network PMTU, and webrtc.org limits their size to under 1300 bytes.

The penalty and processing overhead of garbage collection is a concern that is often raised in engineering circles, especially in the context of Java based applications. There is a trade-off between processing overhead and the duration of the pauses necessary to perform GC, and the default algorithms favor low processing overhead at the price of long pauses.

We discovered this during development of Jitsi Videobridge, when seemingly unrelated changes resulted in periodic freezing of the video playback. It turned out that Java application threads were being paused for garbage collection for a relatively long period of time, on the order of hundreds of milliseconds. We started an investigation to understand why the pauses are so long, and how to reduce them.

7.3.1 Garbage Collection in Java

The purpose of Garbage Collection (GC) is to automatically reclaim memory no longer needed by the application. This requires the system to know which objects are currently in use and which can be reclaimed. In Java, this is based on reachability from certain types of objects known as Garbage Collection Roots (GCR). These include active threads, local variables, and static fields. Objects not reachable from any GCRs may be reclaimed.

To avoid incorrectly marking objects as unreachable, the Java Virtual Machine (JVM) has to temporarily pause all application threads. This is achieved with a Stop-The-World (STW) pause, during which no application threads are allowed to run.

The Java Development Kit (JDK) offers several algorithms for garbage collection, with different characteristics. We examine three of them here: Parallel GC (PGC, the default), Concurrent Mark and Sweep (CMS), and Garbage First (G1). More accurately, these are combinations of JVM options, which configure different GC behavior. The details of the JDK implementation are out of scope for this thesis, therefore we simplify some concepts. For a more comprehensive discussion, we refer readers to the Java Garbage Collection handbook [69] by Plumbr and the Java Garbage Collection Basics tutorial by Oracle7.
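For reference, on JDK 8 these behaviors are selected with standard HotSpot command-line options; the jar name below is only a placeholder for the application being started.

# Parallel GC (the JDK 8 default)
java -XX:+UseParallelGC -jar sfu.jar
# Concurrent Mark and Sweep
java -XX:+UseConcMarkSweepGC -jar sfu.jar
# Garbage First
java -XX:+UseG1GC -jar sfu.jar
# Verbose GC logging, used later to extract pause durations from the log file
java -XX:+UseConcMarkSweepGC -verbose:gc -XX:+PrintGCDetails \
     -XX:+PrintGCApplicationStoppedTime -Xloggc:gc.log -jar sfu.jar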

Parallel GC (PGC)

This algorithm is based on the generational hypothesis, which is the observation that in most applications objects are in use either briefly or for extended periods, with very

7https://www.oracle.com/webfolder/technetwork/tutorials/obe/java/gc01/index.html

few objects having “medium life expectancy”. With PGC, the memory space is divided into a young generation and an old generation. New objects are allocated in the Eden space, a region of the young generation space with a fixed size. When Eden is filled up, this triggers a Minor GC.

Minor GC is used to clean the young generation. With PGC this is accomplished using a mark and copy operation. The mark phase traverses the objects in the young generation, marking those that are reachable. It assumes that any object in the old generation is a GCR. The copy phase copies the reachable objects into a separate survivor space, incrementing each object’s generation. When an object reaches a certain number of generations (15 by default), it is promoted to the old generation.

After copying the reachable objects out of Eden, it is marked as free and may be used again. This does not require explicit handling of unreachable objects, making the operation quick, especially when most of the objects are unreachable. However, it requires separate memory pools in the young generation.

Major GC is used to clean the old generation. Usually, the old generation is significantly larger and occupied by objects less likely to be garbage. This operation occurs less frequently than minor GC, and usually takes more time.

PGC uses a mark-sweep-compact strategy to clean the old generation. The mark phase is similar to that of the young generation. The sweep and compact phase defragments the memory in-place. That is, it moves the reachable objects to occupy a contiguous region at the beginning of the old generation space. This operation can be quite slow.

The “parallel” in PGC refers to the fact that the mark phases execute on multiple threads in parallel. The mark operations trigger an STW pause. There is a Serial version of the algorithm, which performs mark in a single thread, but we do not consider it here because it offers no advantages.

Concurrent Mark and Sweep (CMS)

The CMS algorithm is designed to avoid long STW pauses while cleaning the old generation. It uses the same separation into young/old spaces, and the same mark and copy procedure for the young generation as PGC.

It handles the old generation in a different way. First, it uses free lists instead of compacting the reachable objects into the beginning of the space. Second, it performs most of the mark phase concurrently with the application threads. The mark phase is divided into three stages: initial mark, concurrent stage, and a final re-mark8. The initial mark and final re-mark both require STW pauses, though they are relatively short. Most of the computation work happens in the concurrent stage, while application threads are also running.

Garbage First (G1)

The G1 algorithm is newer than PGC and CMS, and is designed to achieve predictable and configurable durations of the STW pauses. It utilizes a different model for dividing the memory space, using a larger number of smaller heap regions, each of which can serve the young or the old generation as needed. This allows it to perform garbage collection incrementally.

We refrain from a more detailed discussion of G1 here as it is not required to understand our work, and refer to the Java Garbage Collection handbook [69].

Choosing the Right GC configuration

Different garbage collection algorithms offer a trade-off between three variables: the cost of computation, the total duration of STW pauses, and the distribution or maximum length of STW pauses.

In general, the expectation is that the PGC algorithm is the simplest and has the lowest computation cost. However, since it requires a STW pause for the whole duration of the major GC operation, it is expected to sometimes trigger long pauses.

There is no “best” algorithm, and the most appropriate one depends on the application for two reasons. Firstly, applications have different objectives. Some favour minimizing pauses, while others favour minimizing computation overhead. Secondly, the results vary based on the nature of the application and its workload.

Because of this, the general recommendation is to experiment to find the best configuration for the given application [69]. This is what we did for Jitsi Videobridge, and we describe the process in the sections below.

8This is a simplification; for a more detailed description refer to the Java Garbage Collection handbook [69].

7.3.2 Introducing a Memory Pool

One possible way to improve garbage collection performance is to perform some of the memory management in the application, so it can take advantage of application-specific knowledge. In the case of our SFU, we know that most memory allocation comes from the need for byte[]s used for RTP packets (see section 7.1 above).

We propose the use of a pool of byte[] instances, which are used instead of allocating new ones from the Java heap. This should reduce the allocation rate of heap memory, reducing the strain on the garbage collector.

We face two main problems, the first of which is concurrent access by multiple threads. We have a potentially large set of threads running the packet pipelines, and acquiring a lock for each allocation may degrade performance. We solve this using striping. We have N independent partitions and choose one randomly for each allocation. We choose N based on the number of cores at runtime9.
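A minimal Java sketch of the striping scheme follows; the class and method names are illustrative and do not correspond to the actual Jitsi Videobridge code.

import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ThreadLocalRandom;

// Striped buffer pool: N independent partitions, each with its own lock-free queue,
// so that concurrent allocations rarely contend on the same partition.
final class StripedBufferPool {
    private final int bufferSize;
    private final ConcurrentLinkedQueue<byte[]>[] partitions;

    @SuppressWarnings("unchecked")
    StripedBufferPool(int bufferSize, int numPartitions) {
        this.bufferSize = bufferSize;
        this.partitions = new ConcurrentLinkedQueue[numPartitions];
        for (int i = 0; i < numPartitions; i++) {
            partitions[i] = new ConcurrentLinkedQueue<>();
        }
    }

    byte[] getBuffer() {
        int p = ThreadLocalRandom.current().nextInt(partitions.length);
        byte[] buffer = partitions[p].poll();
        return buffer != null ? buffer : new byte[bufferSize]; // fall back to the Java heap
    }

    void returnBuffer(byte[] buffer) {
        int p = ThreadLocalRandom.current().nextInt(partitions.length);
        partitions[p].offer(buffer);
    }
}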

The second problem is selecting the size for the byte[] buffers. The simplest solution is to use a sufficiently large value to accommodate any RTP packet, for example 1500 bytes. However, this would use more memory than necessary, placing more strain on the garbage collector.

To solve this, we analyzed the request sizes the application makes under representative load. We show the distribution in figure 7.3. Based on this distribution we calculate that using 1500-byte buffers leads to an overhead of 71.5%. We propose segmenting the pool into groups with different buffer sizes. For example, we can use one pool with buffers of size 705, and one with buffers of size 1230 (the maximum observed request size). This lowers the overhead to 43.0%. Choosing a threshold of 705 optimizes the overhead.

Similarly, we calculate the optimal choice of thresholds when using more than two groups. With three groups we can reach an overhead of 24.4% (t1 = 220, t2 = 775). With four groups we start to see diminishing returns, with an optimal overhead of 15.2% (t1 = 220, t2 = 753, t3 = 1166). Therefore, we segment the pool into three groups, with the above thresholds. That is, we use one pool of byte[] of size 220, which is used to satisfy requests for sizes of up to 220. A second pool of buffers with size 775 satisfies requests from 221 to 775, and a third pool with buffers of size 1240 satisfies requests from 776 to 1240. For larger requests (if any), we fall back to allocating new memory from the JVM. The procedures for allocating and returning a buffer are described in Algorithm 2.

9Currently we just set it equal to the number of cores, but it is not clear if this is optimal.

Figure 7.3: The distribution of memory allocation requests by number of bytes. The group of small requests (<100 bytes) comes from audio packets. There is a spike at 220 coming from padding-only, fixed-size packets that some clients use for bandwidth probing. The majority of requests come from video packets with a wide distribution of approximately 650 to 1240 bytes.

We hypothesise that introducing this memory pool improves garbage collection performance, but the results are hard to predict theoretically. We expect fewer requests to the Java heap (as the pool satisfies most of them), causing the Eden space to fill up more slowly, leading to less frequent Minor GC. However, the objects from the pool eventually end up in the old generation. With PGC, we expect this to lead to longer pauses, due to the de-fragmentation phase. The only way to determine the effects is with experimentation, which we present in the next section.

7.3.3 Evaluation

We set out to evaluate the effects of different GC algorithms and configurations, as well as the memory pool introduced in the previous section, on the performance of Jitsi Videobridge. Specifically, we seek to quantify the trade-off between CPU overhead and STW pauses due to garbage collection.

This proved challenging for several reasons.

Algorithm 2 Algorithm for allocating and returning a buffer from the memory pool. If any of the queues are empty, we allocate a new buffer on the heap.
procedure INITIALIZE
    T1 ← 220
    T2 ← 775
    MAX_SIZE ← 1240
    N ← numberOfCores()
    poolA ← Queue[N]
    poolB ← Queue[N]
    poolC ← Queue[N]

procedure GETBUFFER(size)                       ▷ Allocate a buffer of size size.
    r ← randomInt(N)
    if size <= T1 then return poolA[r].head()
    else if size <= T2 then return poolB[r].head()
    else if size <= MAX_SIZE then return poolC[r].head()
    else return new byte[size]                  ▷ Fall back to the Java heap.

procedure RETURNBUFFER(buffer)                  ▷ Return a specific buffer to the pool.
    r ← randomInt(N)
    if buffer.size <= T1 then
        poolA[r].add(buffer)
    else if buffer.size <= T2 then
        poolB[r].add(buffer)
    else if buffer.size <= MAX_SIZE then
        poolC[r].add(buffer)

• Defining outcomes. We know that long pauses are unacceptable as they lead to packet delays, which in turn cause audio and video playback freezes. However, we do not know what the exact limit of acceptability is. Jitsi Videobridge version 1.0 has been used in production systems for long enough that we can consider whatever it achieves acceptable. We reason that delaying packets results in receivers’ jitter buffers being extended, increasing end-to-end latency. Therefore, we favour shorter and fewer pauses at the expense of CPU overhead.

• Repeatable and representative load. We found that the results varied greatly with the type of load we introduced to the system. We see different results based on the number of participants, conference size distribution, whether or not the test includes creating new conferences (as opposed to taking measurements of existing conferences only), and the conference duration. We had to define a representative test and find a way to make it repeatable.

• The need for a “warmup”. We found that performance characteristics change when the JVM runs for a longer time, due in part to the nature of the garbage collection system, and perhaps also to the just-in-time compiler. We had to find when the results began to converge, and use a “warmup” procedure before each test.

Test Scenario

We set out to define a test scenario with the following characteristics:

1. Representative of the type and size of conferences we see in production.

2. Includes creating new conferences and stopping conferences.

3. Can be scripted to make it repeatable.

4. Produces a sufficient load on the machine.

5. Short enough to perform all experiments in a reasonable timeframe.

6. Uses a limited number of machines, so it can run on the grid we have available.

We defined the following sequence, which uses 16 machines in the cloud with one endpoint on each machine. First, we form two conferences of size 8, and stay in this configuration for 5 minutes. Then we form four conferences of size 4, and hold for 3 minutes. Then, we form one conference of size 4 and six conferences of size 2, and hold for 3 minutes. Then we form eight conferences of size 2 and stay for 3 minutes. Finally, we form two conferences of size 8, and hold for 3 minutes.

The sequence lasts 17 minutes, plus approximately one minute for initialization, and is implemented in a shell script.

Warmup Procedure

Before each test we perform the following warmup procedure. First, we restart the JVM process. Then we run the test scenario described above twice. The process takes approximately 35 minutes.

We verified that this is sufficient by running the test scenario four times, observing the differences in CPU usage and pauses. There is a significant difference between the first and second runs, a minor but measurable difference between the second and third runs, and no discernible difference between the third and fourth runs.

Measurements

We use the Amazon Web Services cloud for our experiment. The SFU runs on a dedicated c5.xlarge instance. The endpoints run on machines in the same Virtual Private Cloud (VPC) to ensure optimal network conditions between them. They use Google Chrome 72.0.3626.119-1 and stream a 720p 30 fps version of the well known FourPeople sequence10 using simulcast.

The most important measurement is the duration of the STW pauses due to garbage collection. We enable verbose logging for the JVM and extract the pauses directly from its log file.

We also measured CPU usage and network bitrate, allowing us to calculate an efficiency score in terms of megabits per percent of CPU usage. We calculate this as e = bitrate/(cpu_usage − 1.5), where cpu_usage is the average CPU usage in percent, 1.5 is the baseline CPU usage we measured while the system is idle, and bitrate is the average network bitrate in megabits per second. The averages are taken over the course of the test scenario, and sampled every 5 seconds.
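For example, using the values reported in Table 7.1 below for the PGC + pool configuration, e = 110.3/(18.0 − 1.5) ≈ 6.68 Mbps per percent of CPU.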

Comparing the bitrate allows us to validate that the test was run successfully. We expect the same bitrate in all tests since the machines are in the same network and use the same configuration. If one of the grid machines fails, which happened on a few occasions, we observe a dip in the bitrate for this test run, and we repeat it.

Results

We performed tests with six different configurations. First, we tested Jitsi Videobridge version 1.0 with the default Parallel GC algorithm in JDK 8, which is our definition of acceptable performance. All other tests were performed with version 2.0 of Jitsi Videobridge because it is the one we are interested in optimizing. We tested the PGC and CMS algorithms for garbage collection, with and without the additional memory pool. Finally, we tested the default configuration of the G1 algorithm with the memory pool.

Figure 7.4 details the distribution of GC pauses for two of the tests. We report the summarized results from all tests in table 7.1. For each configuration we present:

• pps The average number of STW pauses due to garbage collection per second (“pauses per second”).

10https://media.xiph.org/video/derf/

Table 7.1: The GC pauses, CPU usage, and bitrate for the six different configurations.

        v1.0    PGC     PGC + pool   CMS     CMS + pool   G1 + pool
pps     0.29    0.16    0.08         0.21    0.14         0.27
mp      384     264     248          106     82           153
pf      0.92    0.91    0.46         0.98    0.51         1.10
cpu     24.7    17.9    18.0         20.4    20.0         19.5
br      108.5   109.4   110.3        109.2   110.1        109.9
eff     4.68    6.67    6.68         5.77    5.96         6.10

• mp The length of the longest STW pause due to garbage collection, in milliseconds (“max pause”).

• pf The percent of the total time spent in STW pauses due to garbage collection (“percent frozen”).

• cpu The average CPU usage in percent.

• br The average bitrate in megabits per second.

• eff The efficiency in Mbps per percent CPU, calculated as described above.

Interpretation

The tests with version 1.0 of Jitsi Videobridge give us a baseline for performance. We see a pause of 384 milliseconds, much longer than expected. It is surprising that such pauses do not lead to noticeable freezing in the video playback.

When using the default PGC algorithm we failed to reproduce the extremely long pauses that prompted us to start this investigation. This could be because these tests use a more powerful machine, or because it was better warmed up than in previous experiments. While the longest pauses are shorter than in version 1.0, they are still significantly long. As expected, they come from Major GC events (see figure 7.4), i.e., cleaning the old generation. Contrary to our initial expectation, the introduction of the memory pool does not increase their length; in fact there is a very slight decrease (from 264 to 248). We also notice that the frequency of pauses and the total duration have dropped by half with the addition of the memory pool. Finally, we see improved efficiency compared to version 1.0, which is expected because version 2.0 contains many CPU optimizations. The addition of the memory pool improves all metrics.

105 (a) The default Parallel GC algorithm, without the application memory pool.

(b) The Concurrent Mark and Sweep algorithm, with the application memory pool.

Figure 7.4: The distribution of stop-the-world pauses due to garbage collection using the default GC algorithm (a) and CMS with the memory pool (b). Minor GC events are shown in blue and Major GC events in red. CMS pauses less frequently and for much shorter periods. Cleaning the old generation causes a long pause for PGC, but not for CMS. We can also see a similarity in how the frequency changes with time due the conference pattern in the test scenario. The beginning (5 min) and ending (3 min) use two 8-endpoint conferences, causing higher load. The middle period uses more conferences with fewer endpoints, causing lower load. 106 Max pause (ms) Total paused (%) Efficiency (Mbps/%CPU)


Figure 7.5: Longest pause duration, total pause duration, and efficiency for the six different configurations tested.

Using the CMS algorithm, the addition of the memory pool again improves all metrics: the longest pause drops from 106 to 82 milliseconds, the pause frequency drops from 0.21 to 0.14 per second, the total pause duration drops from 0.98% to 0.51%, and efficiency increases from 5.77 to 5.96 Mbps/%CPU.

Comparing CMS against PGC with the memory pool enabled, we see the expected drop in efficiency: from 6.68 to 5.96, approximately 11%. However, the longest pause is significantly reduced, from 248 to 82 milliseconds. There is a slight increase in the total pause duration (from 0.46% to 0.51%). Overall, since SFUs are sensitive to long pauses, CMS performs much better than PGC.

The performance of G1 is underwhelming. Compared to CMS it pauses more often, with pauses as long as 153 ms, and its total paused time of 1.1% is the highest of all tests. It offers only a slight gain in efficiency. It is worth experimenting further with different G1 configurations, but based on the current results we prefer CMS.

7.4 Conclusion

Selective Forwarding Units do not require a large amount of memory, but their pattern of memory allocation can be problematic when used with automatic garbage collection. The high allocation throughput leads to frequent GC activation, which can freeze the application for a significant time with GC algorithms optimized for low CPU overhead, such as the default algorithms that come with the Java Development Kit.

We proposed using a separate memory pool implemented inside the Java application to reduce the strain on the garbage collector. We then developed a methodology to evaluate the performance of different configurations of the garbage collector and memory pool. We performed experiments using the default GC algorithm in the JDK as well as CMS and G1, and quantified their performance for the Jitsi Videobridge application.

We found that the best performance is achieved using the CMS algorithm together with our proposed memory pool. It leads to acceptable performance, defined as better than our baseline of Jitsi Videobridge version 1.0 using the default GC configuration. The longest stop-the-world pause (the most important metric) was reduced from 384 to 82 milliseconds. This trade-off was possible with an 11% drop in efficiency. The memory pool improved all metrics in all tests.

7.4.1 Further Research

Our experiment did not test different configurations of G1, some of which may offer better performance. Additionally, a new garbage collector for Java, Shenandoah11, is currently in development. It is optimized for low pause durations for very large heaps. Further experiments should be performed to measure their performance.

Another way to reduce the strain on the garbage collector is to move the memory used for RTP packets out of the garbage collected heap. In Java, this is possible using direct memory buffers. Since the total amount of memory is not a concern and the chunks are relatively large12, we can implement this efficiently using, for example, a pre-allocated portion of memory along with bitmaps to indicate the state of each chunk. However, performing this type of memory management comes with the risk of memory leaks. With the heap-backed memory pool we automatically fall back to the Java garbage collector: if the application fails to return a buffer, the GC will eventually reclaim it. With direct memory this would instead result in a leak.

11See https://wiki.openjdk.java.net/display/shenandoah/Main
12The main reason for segmenting the memory pool is the strain on the GC; since that concern no longer applies with direct memory, we can simply use 1500-byte chunks.
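To make the direct-memory idea concrete, here is a minimal sketch (not the implementation used in Jitsi Videobridge) of a pre-allocated off-heap pool with a bitmap tracking which chunks are in use; the class and method names are hypothetical.

import java.nio.ByteBuffer;
import java.util.BitSet;

/** Hypothetical sketch: a pool of fixed-size chunks backed by one direct
 *  (off-heap) buffer, with a bitmap recording which chunks are in use. */
public class DirectChunkPool {
    private static final int CHUNK_SIZE = 1500; // one chunk fits one RTP packet

    private final ByteBuffer backing; // off-heap memory, invisible to the GC
    private final BitSet used;        // bit i set <=> chunk i is allocated
    private final int numChunks;

    public DirectChunkPool(int numChunks) {
        this.numChunks = numChunks;
        this.backing = ByteBuffer.allocateDirect(numChunks * CHUNK_SIZE);
        this.used = new BitSet(numChunks);
    }

    /** A chunk and its index in the pool (the index is needed to release it). */
    public static final class Chunk {
        public final int index;
        public final ByteBuffer buffer;
        Chunk(int index, ByteBuffer buffer) { this.index = index; this.buffer = buffer; }
    }

    /** Returns a free 1500-byte chunk, or null if the pool is exhausted. */
    public synchronized Chunk acquire() {
        int i = used.nextClearBit(0);
        if (i >= numChunks) {
            return null;
        }
        used.set(i);
        backing.clear(); // reset position/limit before slicing out chunk i
        backing.position(i * CHUNK_SIZE);
        backing.limit((i + 1) * CHUNK_SIZE);
        return new Chunk(i, backing.slice());
    }

    /** Marks a chunk as free. Forgetting to call this leaks the chunk:
     *  unlike the heap-backed pool, the GC will never reclaim it. */
    public synchronized void release(Chunk chunk) {
        used.clear(chunk.index);
    }
}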

Further research is also needed to better understand the overall effects of GC pauses. It is easy to see that extremely long pauses result in video freezing, but we do not know at what length this starts to occur. We do not know the effects of shorter pauses either. If a pause does not cause a freeze it is likely due to the receiver’s jitter buffer being long enough to accommodate the delay. How much do garbage collection pauses affect the jitter buffer? Does the pause frequency have an effect, or just the maximum length? More research is needed to answer these questions and to find what pause duration distributions are acceptable for video conferencing.


Chapter 8

Conferences with Cascaded Selective Forwarding Units

So far we have discussed conferences which use one Selective Forwarding Unit in a centralized topology (a star). In this chapter we explore the possibility of using a set of interconnected SFUs in a conference. There are two motivations for this. It could allow conferences to scale to a larger size, and it could improve users’ connection quality if they connect to a nearby server. We focus on the second goal, but we also consider the first one for the design of the system.

8.1 Considerations for Geo-located Video Conferences

Modern cloud services such as Amazon AWS1 make it easy to deploy a system comprising servers in different geographic locations, and provide well-developed tools, like Route532, for routing users to a server close to them. However, they are mostly designed for traditional web applications and the scenario in which one consumer accesses a certain resource in the cloud.

Video conferencing has the potential to greatly benefit from geo-location features, because it is interactive in nature, and it is very sensitive to network conditions such as bandwidth, packet loss, round-trip-time (RTT), and variations in RTT (jitter). However, this case is not as simple as connecting one consumer to the closest server as it requires

1https://aws.amazon.com 2https://aws.amazon.com/route53/

the interconnection of two or more endpoints, and the problem of optimal selection of the server locations is not well explored.

Most systems, including Jitsi Meet (and the meet.jit.si service) and Teams3, use a simple but limited approach. They deploy a set of independent servers in different geographic regions, and make the server selection when the conference is initially allocated. At this stage they have information only about the first (or first two) endpoints, and they select a server near the first endpoint. Once selected, the server is used for the whole duration of the conference, regardless of how the remaining endpoints are distributed geographically.

(a) The server is selected in endpoint C’s region, which is suboptimal. (b) The server is selected in the region of endpoints A and B, reducing the end-to-end delay between them.

Figure 8.1: Centralized conferences with three endpoints in different regions, with dif- ferent servers selected. Endpoints are represented in green, and servers in red.

This is relatively straightforward to implement, and works well in most cases. In particular, it is optimal any time all endpoints are in the same region, which is common. However, with endpoints from different regions we begin to see some problems. Consider the three-endpoint conference depicted in figure 8.1.a. Even though endpoints B and C are close to each other, their communication goes through a distant server, leading to an increased end-to-end delay between them. In this trivial example the problem can be solved by selecting a different server, as shown in figure 8.1.b. But in general there could be more than one pair of close endpoints, and the problem cannot be avoided using a single server.

There is a more subtle issue, which is present any time there are endpoints in different regions, regardless of where the server is. This is the latency between an endpoint and the server it is connected to. It is important, in addition to the end-to-end network latency between endpoints, because of stream repair based on packet retransmission.

3https://tomtalks.blog/2019/06/where-in-the-world-will-my-microsoft-teams-meeting-by-hosted/

The longer this latency is, the larger the jitter buffer has to be in order to accommodate retransmitted packets, leading to increased end-to-end media delay.

Both issues can be addressed by introducing a set of servers, and connecting each endpoint to a local server. This is shown in figure 8.2. End-to-end latency is reduced in the problematic cases because communication goes through a server in the local region, instead of a remote region. The latency to each endpoint’s server is reduced, because it is always in a local region.

However, there are also some disadvantages. While the RTT between endpoints in the same region decreases, it may increase for some pairs of endpoints. We can illustrate this using figure 8.1.b. The connection from endpoint A to B goes through server S2. If we use both servers, the connection goes through S1 first and then S2. The direct path from A to S2 is likely shorter than the path from A to S1 and then to S2.

Another disadvantage is the additional internal bandwidth required between the servers, and the complexity that it adds to the system.

(a) The case of just two servers. (b) The more general case with more than two servers, connected in a mesh.

Figure 8.2: Conferences with cascaded servers where each endpoint is connected to a local server. Endpoints are represented in green, servers in red, and server-to-server connections with a thick line marked “S2S”.

The overall effect of using geo-located cascaded SFUs depends on usage patterns of the service. For example, if it tends to be used for inter-continental conferences, the effects are more likely to be positive. It also depends on the available server locations, conference size, and other factors. In short, these are hard to predict.

8.2 Preliminary Analysis

The implementation of a distributed SFU system requires a substantial effort. Before undertaking this venture, we performed an analysis of the meet.jit.si service (for which the feature was first intended), attempting to predict the effects we should expect.

At the time the service was distributed in four AWS regions: us-east-1 (North Virginia), us-west-2 (Oregon), eu-west-1 (Ireland) and ap-southeast-2 (Sydney).

Each conference is allocated a single server, with its region being based on the location of the first participant in the conference. This means that as new participants join, they do not always connect to a server in the region closest to them. This gives us an opportunity to measure the round trip time from users detected as being located in one region, to a server in another region.

8.2.1 Round Trip Time Between Servers

We measured the RTT between servers in the four available regions. We used ICMP ping, sampling once a minute for a week. We found that the packet loss and the variation in RTT were both negligible. Figure 8.3 shows the results.

Figure 8.3: The Round Trip Time in milliseconds between the four AWS regions that we use.

8.2.2 Round Trip Time Between Endpoints and Servers

We measured the RTT between endpoints and the server they are connected to. For each user session, we obtained the RTT periodically during the conference, using the WebRTC statistics API [6]. We took the mean across time as the RTT for the session. We then grouped the sessions by the user region and the server region, and we show the means in table 8.1. The dataset comprises 40214 sessions, and the smallest group (users from us-west-2 connecting to ap-southeast-2) has 833 sessions.

Table 8.1: The average RTT in milliseconds from users in a given region to a server in a given region. Vertically: user region; horizontally: server region.

                      server region
user region      ap     eu    us-e   us-w
  ap            329    291    242    292
  eu            378    110    171    216
  us-e          307    189    107    116
  us-w          241    213    137     81

8.2.3 Distribution of Users

We analyzed the sizes of the conferences on meet.jit.si in a period of 6 weeks, during which we recorded 56383 conferences. Since the size of a given conference changes with participants joining and leaving, we looked at the total amount of time that conferences of a given size were active. The results are shown in Figure 8.4.

Additionally, we observe that 72% of all conferences never grow to a size of 3 or more for more than two minutes (we introduce this threshold, because if a user reloads the web-page this is incorrectly registered as a temporary third participant, until the old session times out). We refer to the remaining 28%, which have grown to a size of 3 or more for at least two minutes, as multi-party conferences.

Next, we search for situations in which a conference hosted on a server in one region has at least two participants from the same region, but different from the server region. This is similar to the situation depicted in figure 8.1.a, but generalized to any number of total endpoints.

We find that 33% of all multi-party conferences reach such a state at some point in their life cycle. Split by time, 13% of the time spent in multi-party conferences is spent in this situation.


[Figure 8.4 data: the one-to-one slice accounts for 72.6% of conference-time; the remaining time is split among 3-way, 4-way, 5-way, and 6-way-or-more conferences.]

Figure 8.4: The distribution of conference-time on our service by conference size.

8.2.4 Analysis

The first thing we notice is that in almost all cases, the average RTT from a user to a server increases if we introduce an intermediary server in the user’s region. We can calculate this based on the server-to-server and endpoint-to-server RTTs we measured above. As an example, the average RTT from users in Europe to our server in eu-west-1 is 110 ms, and to us-east-1 it is 171 ms. Taking into account the 75 ms RTT between our servers in eu-west-1 and us-east-1, if we route European users through a server in eu-west-1 we should expect to see an average RTT (to the server in us-east-1) of 185 ms, which is an increase of 14 ms. Figure 8.5 illustrates this. We see this pattern for all pairs of regions, with the sole exception of users in us-east-1 connecting to a server in eu-west-1, in which case we see a decrease in the average of 3 ms.

This suggests that using distributed SFUs in a case like the one in figure 8.1.b slightly increases the average RTT between pairs of participants in the conference; i.e., it is counterproductive in addition to being costly.

This effect is strongest in the ap-southeast-2 region, where it turns out that users from within the region have a better RTT to servers in any other region than they do to the server in their own region. We suspect that this is because ap-southeast-2 covers a much larger geographic area, including both countries in Asia as well as Australia. In this case introducing an intermediary server in ap-southeast-2 (Sydney) would significantly increase the RTT for many users, to the point where it might exceed the ITU recommendations and start to inhibit the ability of users to communicate.

Figure 8.5: The average RTT between users in Europe and our servers in Europe and the US, and the RTT between our two servers. The direct connection’s latency is 14 ms lower.

We expect that placing servers in more regions (we currently use four regions, out of the 16 that AWS offers) will have a significant impact on the users’ average RTT, decreasing it for local users, and it will also decrease the negative effect of using an intermediary server. However, this also brings higher infrastructure costs.

We also notice that the majority of conferences on our service only have two participants, and would be unaffected by the introduction of cascaded SFUs. Multi-party conferences are one of the core features of the system, and as such we consider them an important use-case. The situation in which two endpoints close to each other connect to a distant server (figure 8.1.a) happens relatively often – in 33% of multi-party conferences – which means that distributed SFUs have the potential to provide benefits for user experience.

In conclusion, while we have gained some insight, we are unable to accurately predict the overall effect. Next, we describe the system we implemented and present its results.

8.3 Implementation

Our goal is to extend the Jitsi Meet system with support for cascaded SFUs. The system already had a sharded architecture which allowed geo-location based on the first participant in the conference, and both clients and servers were aware of their geographical location (“region”).

8.3.1 Signaling

The Jitsi Meet system uses two separate servers for signaling and multimedia.

The Jicofo4 component is used for signaling. It maintains a Jingle [49] session with each endpoint, allocating the necessary media processing resources on a separate Jitsi Videobridge SFU instance. Since it can handle a very large number of conferences, we only have one Jicofo instance in each region.

For the SFUs we use an AWS autoscaling group in each region, all connected to the local Jicofo server. This allows us to easily scale up by bringing up new machines.

This architecture makes it relatively easy for us to enable cascaded SFUs, because we can keep the centralized signaling server, which already has a mechanism to communicate with SFU instances through COLIBRI [36]. We chose to keep signaling only between Jicofo and the SFUs, and not between any two SFUs. This significantly simplifies the system, because most decisions are made centrally in the signaling server, and each SFU has a single signaling link.

We had to do three things to make the signaling work:

• Cross-connect SFUs and signaling servers from all regions.

• Make the signaling server region-aware in its SFU selection strategy.

• Extend COLIBRI with descriptions of the additional SFUs in a conference.

Figure 8.6 shows the resulting infrastructure.

4https://github.com/jitsi/jicofo

Figure 8.6: The infrastructure used to enable cascaded SFUs. Each SFU connects to signaling servers in each region. The signaling servers can use any of the SFUs.

8.3.2 Selection Strategy

By “selection strategy” we refer to the process that a signaling server uses to choose which SFUs to use for a given conference. With single-SFU conferences, this is relatively simple. A Jicofo server has access to a set of SFUs in the local region, and it only has to consider the load, i.e., to perform load balancing. It does this by selecting the least loaded SFU each time.

With cascaded SFUs the problem is more complex, as it has to attempt to use a localized SFU for each client, in addition to balancing the load.

We propose algorithm 3, which is designed to always select an SFU in the client region if available, then minimize the number of servers used, and finally balance the load. The algorithm is invoked any time a new endpoint joins a conference.

We see that load-balancing is only a third priority, which may lead to overloading. To avoid this we use relatively low thresholds for autoscaling. That is, once all SFUs in a region exceed a certain load level, we add new machines, which are immediately used for new conferences because of their initial low load.

Algorithm 3 Algorithm for SFU selection for a new client. We assume that all_sfus is a list of all available SFUs ordered by load.

 1: procedure SELECT(conference, endpoint)    ▷ Selects an SFU for a new endpoint in a conference.
 2:     if conference.getSFUs().isEmpty() then return SELECTINITIAL(conference, endpoint)
 3:     conference_sfus_in_region ← conference.getSFUs().filterByRegion(endpoint.getRegion())
 4:     if conference_sfus_in_region.isNotEmpty() then return conference_sfus_in_region.first()
 5:     else
 6:         sfus_in_region ← all_sfus.filterByRegion(endpoint.getRegion())
 7:         if sfus_in_region.isNotEmpty() then return sfus_in_region.first()
            return all_sfus.first()
 8: procedure SELECTINITIAL(conference, endpoint)    ▷ Makes the initial selection of an SFU for a conference. Attempts to use the least loaded SFU in the endpoint’s region, otherwise the least loaded SFU in any region.
 9:     sfus_in_region ← all_sfus.filterByRegion(endpoint.getRegion())
10:     if sfus_in_region.isNotEmpty() then return sfus_in_region.first()
11:     else return all_sfus.first()
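For illustration, the following is a rough Java rendering of the same selection logic; it is only a sketch, not the Jicofo implementation. The Bridge, Conference and Endpoint types are hypothetical stand-ins, and the list of all SFUs is assumed to be non-empty and sorted by load (least loaded first).

import java.util.List;
import java.util.Optional;

/** Hypothetical sketch of Algorithm 3: prefer an SFU already used by the
 *  conference in the endpoint's region, then any SFU in that region,
 *  and fall back to the least loaded SFU overall. */
public class SfuSelector {
    private final List<Bridge> allSfus; // assumed non-empty, sorted by load

    public SfuSelector(List<Bridge> allSfus) { this.allSfus = allSfus; }

    public Bridge select(Conference conference, Endpoint endpoint) {
        if (conference.getSfus().isEmpty()) {
            return selectInitial(endpoint);
        }
        Optional<Bridge> inConferenceAndRegion = conference.getSfus().stream()
                .filter(b -> b.getRegion().equals(endpoint.getRegion()))
                .findFirst();
        if (inConferenceAndRegion.isPresent()) {
            return inConferenceAndRegion.get();
        }
        // No conference SFU in the endpoint's region: take the least loaded
        // SFU from that region, or fall back to the least loaded SFU overall.
        return leastLoadedInRegion(endpoint.getRegion()).orElse(allSfus.get(0));
    }

    private Bridge selectInitial(Endpoint endpoint) {
        return leastLoadedInRegion(endpoint.getRegion()).orElse(allSfus.get(0));
    }

    private Optional<Bridge> leastLoadedInRegion(String region) {
        // Because allSfus is sorted by load, the first match is the least loaded.
        return allSfus.stream().filter(b -> b.getRegion().equals(region)).findFirst();
    }

    // Minimal stand-ins for the real signaling-server abstractions.
    interface Bridge { String getRegion(); }
    interface Endpoint { String getRegion(); }
    interface Conference { List<Bridge> getSfus(); }
}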

8.3.3 Server-to-server Transport: Octo

We chose to create a new encapsulation protocol for the server-to-server media transport, named Octo. It works on top of UDP and consists of a simple fixed-size header and payload. The payload is either RTP/RTCP or a “message” type used to carry additional messages between the endpoints of a conference5.

Each SFU binds to a socket and receives remote Octo traffic on it. When a conference uses more than one SFU, Jicofo signals to each SFU the addresses of the other SFUs in the conference. In effect, the SFUs in a conference form a full-mesh topology; however, this is a limitation of the current Jicofo implementation and not of the Octo protocol itself.

The Octo header has the following format:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|R| M | S | res |                 Conference ID                 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          Endpoint ID                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

5For the endpoint-to-SFU connections, these messages go over either WebRTC DataChannels or a custom WebSocket based protocol.

The M field specifies the media type (audio, video, or message). The S field specifies the simulcast encoding ID. The Conference ID field is used to multiplex different conferences on the same UDP port. The Endpoint ID identifies the source endpoint within the conference. The R and res fields are currently not used. The former is intended to allow topologies other than full-mesh, by instructing an SFU to either forward the packet to other SFUs or not.
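To make the layout concrete, here is a hedged parsing sketch in Java. The exact bit widths of the R, M, S and res fields are not given above, so the widths assumed below (R: 1 bit, M: 2 bits, S: 2 bits, res: 3 bits, Conference ID: 24 bits, Endpoint ID: 32 bits) are illustrative assumptions and may differ from the actual Octo implementation.

import java.nio.ByteBuffer;

/** Hypothetical parser for the 8-byte Octo header described above.
 *  Field widths are assumed (R:1, M:2, S:2, res:3, ConfID:24, EndpointID:32). */
public final class OctoHeader {
    public final boolean relay;     // R bit (currently unused)
    public final int mediaType;     // M: audio, video, or message
    public final int simulcastId;   // S: simulcast encoding ID
    public final long conferenceId; // multiplexes conferences on one UDP port
    public final long endpointId;   // source endpoint within the conference

    private OctoHeader(boolean relay, int mediaType, int simulcastId,
                       long conferenceId, long endpointId) {
        this.relay = relay;
        this.mediaType = mediaType;
        this.simulcastId = simulcastId;
        this.conferenceId = conferenceId;
        this.endpointId = endpointId;
    }

    public static OctoHeader parse(ByteBuffer packet) {
        int first = packet.get() & 0xFF;
        boolean relay = (first & 0x80) != 0;       // bit 7
        int mediaType = (first >> 5) & 0x03;       // bits 6..5
        int simulcastId = (first >> 3) & 0x03;     // bits 4..3 (res: bits 2..0)
        long conferenceId = ((packet.get() & 0xFFL) << 16)
                          | ((packet.get() & 0xFFL) << 8)
                          |  (packet.get() & 0xFFL);
        long endpointId = packet.getInt() & 0xFFFFFFFFL;
        return new OctoHeader(relay, mediaType, simulcastId, conferenceId, endpointId);
    }
}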

8.3.4 Packet Retransmissions

Stream repair is performed hop-by-hop, that is between an endpoint and the SFU.

The current implementation does not perform stream repair for the SFU-to-SFU connections, because the network conditions we observed in the AWS network are very stable. Packets lost between the SFUs are eventually requested by the endpoints. The intention is to add stream repair between the SFUs at a later stage.

We needed to make one small modification to the retransmission algorithm in the SFU to avoid a scenario which would result in unnecessary retransmissions. This is illustrated in figure 8.7. We solve this by ignoring retransmission requests which arrive less than one round-trip-time after the packet was sent.

Figure 8.7: The retransmission algorithm on the SFU needs to be modified to avoid sending packets unnecessarily.
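A minimal sketch of that check, assuming the SFU records the send time of each forwarded packet and keeps an RTT estimate per receiver; the class and method names are hypothetical and not from the Jitsi Videobridge code.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: ignore NACK-triggered retransmission requests that
 *  arrive less than one round-trip-time after the packet was (re)sent. */
public class RetransmissionFilter {
    private final Map<Integer, Long> lastSentMs = new ConcurrentHashMap<>(); // seq -> send time

    public void onPacketSent(int sequenceNumber) {
        lastSentMs.put(sequenceNumber, System.currentTimeMillis());
    }

    /** @param rttMs the current RTT estimate to the requesting receiver. */
    public boolean shouldRetransmit(int sequenceNumber, long rttMs) {
        Long sent = lastSentMs.get(sequenceNumber);
        if (sent == null) {
            return false; // we never sent (or no longer cache) this packet
        }
        // If the request arrives sooner than one RTT after sending, the packet
        // was most likely still in flight when the request was generated.
        return System.currentTimeMillis() - sent >= rttMs;
    }
}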

8.3.5 Simulcast

The current system always forwards all simulcast encodings between the SFUs. This simplifies the implementation, because an SFU does not need to consider other SFUs or the endpoints connected to them when making the decision. In fact, limiting the forwarded encodings requires some form of extra signaling. This solution is inefficient in terms of the server-to-server traffic, but does not have any effect on quality of service.

The intention is for a future version of the system to only send the simulcast encodings which are necessary.

8.3.6 Dominant Speaker Identification

Each SFU performs dominant speaker identification independently. Even though the inputs are technically different because of the different relative time that packets are processed, this does not have a significant impact on the output of the algorithm.

8.4 Results

After deploying the Octo system described above to the meet.jit.si service, we performed an experiment to measure the overall effects on connection quality. The conferences were hosted on the same set of servers, but each was randomly selected with a 50% probability to either enable the cascaded SFU mode or disable it, in which case the previous single-SFU mode (selected based on the first participant in the conference) was used. The experiment was run over three weeks and included over 28000 conferences.

We also note that at this point (based on the results of the preliminary analysis) we had extended the meet.jit.si service with servers in two additional regions: eu-central-1 (Frankfurt, Germany) and ap-se-1 (Singapore). This makes a total of six regions.

We took two measurements: round trip time from an endpoint to its SFU, and end-to-end round trip time between endpoints.

The endpoint-to-SFU round trip time was taken from the WebRTC stats API [6], and is measured on the RTP level.

The end-to-end round trip time was measured for any pair of endpoints. It was based on a request originating from the JavaScript application of one endpoint, passing

through the WebRTC DataChannel to the SFU, passing to the remote endpoint’s SFU (when cascaded SFUs are enabled) via the Octo protocol, and finally to the remote endpoint over the DataChannel. The remote endpoint’s JavaScript application receives the message and generates a response that travels through the same path in reverse. Using the JavaScript layer often results in delays, which means that this measurement does not accurately capture the network round trip time between the endpoints (i.e., it adds some positive noise). However, since the conferences are randomized, the noise in the two groups is equal; we can use this measurement to compare the network round trip times.

Figures 8.8 and 8.9 show the results. The combined values are averaged over all conferences, while the per-region values are grouped according to the reporting endpoint’s local region. As an example, an endpoint in Sydney reporting end-to-end RTT to an endpoint in Europe is counted towards the Sydney region.

[Figure 8.8 data: bar chart of the endpoint-to-SFU RTT in milliseconds, cascaded SFUs vs. single SFU, combined and per region.]

Figure 8.8: The round-trip-time from endpoints to their SFU in milliseconds, split by the endpoint’s region, and combined.

We can see that, as expected, the latency to an endpoint’s SFU is decreased when we use cascaded SFUs, because each endpoint connects to a local SFU. However, the effect is small, only 7 ms overall. In the Singapore region the RTT actually increased by 7 ms. We do not know what the reason is, but we speculate that it may indicate incorrect detection of the region by clients, i.e., some clients may be sent to the Singapore servers even though they have a lower RTT to another server.

[Figure 8.9 data: bar chart of the end-to-end RTT in milliseconds, cascaded SFUs vs. single SFU, combined and per region.]

Figure 8.9: The end-to-end round-trip-time between endpoints in milliseconds; split by the reporting endpoint’s region, and combined.

However, introducing cascaded SFUs significantly increases the end-to-end round trip time between endpoints. Overall the increase is 83 ms, and it holds in all regions except for us-west, where we see a decrease of 83.5 ms. For the Singapore region, where RTTs are already very high, we see a further increase of 144 ms.

8.5 Conclusion and Future Work

Geo-location for video conferencing is more complicated than it is for simple resource-consumer systems. Following the common wisdom and connecting all users to the server nearest to them is not always optimal, and in some cases has a negative impact on users. It is also technically challenging to implement and has other disadvantages, such as increased traffic for the infrastructure.

The simplest approach for geo-location – using a single server and choosing a region based on the first participant in the conference – is easy to implement and works reasonably well. The effect of adding geo-located cascaded SFUs to a video conferencing system depends on the hosting/cloud provider and the usage patterns of the service, including conference size and distribution of users. An in-depth analysis should be performed for the specific service under consideration, in order to get the most out of cascaded SFUs and not introduce harm.

Using cascaded SFUs has the potential to improve the user experience, as is evidenced by the reduced RTT to the local SFU and the reduced end-to-end RTT in the us-west region. However, the selection strategy of connecting every endpoint to an SFU in its local region has the unintended side effect of significantly increasing overall endpoint-to-endpoint delay.

One hypothesis to explain this effect is that a significant fraction of the endpoints are “in the middle” of two regions. That is, there is more than one region with comparable round-trip-time. In this case using cascaded SFUs leads to only slight improvements of the endpoint-to-SFU delay, while significantly increasing the endpoint-to-endpoint delay (since the server-to-server delay is incurred as well). Figure 8.10 illustrates the idea. Further research is needed to find out whether this is the correct explanation in our case.

Further research is also needed to better understand in which situations cascaded SFUs are beneficial, neutral, or harmful. This may depend on conference size, e.g., there may be benefits for conferences over a certain size. It may also depend on the region. The us-west region saw an overall benefit of cascaded SFUs. Which of its characteristics contribute to this effect? With this knowledge we could design a better selection strategy to use cascaded SFUs optimally.

An additional question which arose from the results in the Singapore region is how well the region selection performs. Ideally, an endpoint should be identified as belonging to the region which minimizes its round trip time. However, in practice we cannot perform RTT measurements for all available regions prior to each endpoint joining a conference, so we rely on geo-location systems with pre-configured (although dynamically updated) network mappings. In our case this is Amazon’s Route53 DNS based routing. We can measure how well it performs. Once an endpoint enters a conference, we can measure RTT to all available regions and find out whether the initial choice was optimal. We have now started an experiment to answer this question.

Figure 8.10: A situation in which connecting endpoints to the nearest server is harmful. Each endpoint (in green) is equally close to the two servers (in red). Using both servers increases the end-to-end delay.

Chapter 9

End-to-End Encrypted Conferences with Selective Forwarding Units

Financial institutions require enhanced security to protect their assets. Historically, this has been achieved through internal communication channels or physical dedicated lines with trusted partners. Internally, high-level encryption, coupled with smart management of security keys, adds the final touch to as-secure-as-you-can-be systems.

This approach has two drawbacks. Establishing communication channels between institutions takes significant time and expense. Furthermore, it is difficult, if not impossible, to leverage the power of cloud infrastructures and Platform As A Service (PaaS) providers, because it is unacceptable for the data to leave the domain controlled by the institution.

In 2015, the Internet Engineering Task Force (IETF) formed a new working group called Privacy Enhanced RTP Conferencing (PERC1). Part of its charter addresses these problems; specifically, it aims to provide a way to use RTP conferencing with a centralized media distributor (such as a Selective Forwarding Unit), while keeping the multimedia and metadata private and only available to a set of invited participants. This allows usage of media servers in the public cloud without compromising security.

In this chapter, we investigate the feasibility of implementing the PERC double encryption specifications [38] in conjunction with WebRTC. We find that the system does not support simulcast as currently used in WebRTC and propose modifications to enable support for simulcast.

1https://datatracker.ietf.org/wg/perc/about/

9.1 PERC Double Encryption Implementation

WebRTC was designed to be encrypted end-to-end; however, it was also designed to be peer-to-peer. Most video conferencing applications use a server component acting as the remote WebRTC peer and terminating the encryption. In the most recent version of the WebRTC 1.0 specifications [12], specific APIs and features support the sender simulcast use case. In that use case, a sending endpoint streams multiple copies of media from the same source, but at different resolutions, to a media server. The media server then selectively forwards one of the incoming media streams to a remote receiving endpoint, depending on the receiver’s capacity (bandwidth, CPU, display size, etc.) or user interface configuration. In a multiparty conference, this allows a media server to forward different streams to different peers (i.e., to function as a Selective Forwarding Unit or SFU).

Current implementations of simulcast for WebRTC, including Jitsi Videobridge, manage incoming simulcast, allowing receiving clients to be completely unaware that the technique is used on the sender side. In these implementations, the SFU switches the stream forwarded to a given receiver, modifying certain RTP header fields in order to mask the operation for the receiver and output one consistent RTP stream. It might, for example, switch from sending a high-resolution stream to a lower resolution stream (and back) as the available bandwidth fluctuates. Specifically, SFUs modify the Sequence Number (from now on SEQ), Synchronization Source (SSRC), and Timestamp (TS) fields in the RTP header.

In this configuration, the media stream is only encrypted between the sender and the SFU, and between the SFU and the receiver, not within the server itself. Thus, anyone tampering with the media server can access unencrypted media content. At the system level, the PERC working group calls this Hop-by-Hop (HBH) encryption, as opposed to the End-to-End (E2E) encryption protecting media all the way between the sender and receiver(s).

The PERC working group produced a draft [38] describing a mechanism for systemic end-to-end encryption in conferences that use a Media Distributor (MD). In this context, the role of an MD can be served by an SFU and we henceforth use the two terms interchangeably. The PERC system consists of endpoints using two separate cryptographic contexts: end-to-end (E2E, or “inner”) and hop-by-hop (HBH, or “outer”). The MD has access to the HBH context, but not the E2E context. Media packets are encrypted twice, once with the E2E context, and once with the HBH context. The MD then re-applies the HBH encryption with the correct context for each receiver. Control packets (RTCP) are not E2E encrypted. Figure 9.1 illustrates how this “double” system works. Note that RTCP packets are only handled HBH.



Figure 9.1: The PERC procedure for End-to-End and Hop-By-Hop encryption for RTP packets (below) and RTCP packets (above).

The PERC draft uses two RTP header extensions [71] to allow an MD to perform selective forwarding. The Frame Marking extension [19] is added by media senders, and includes additional information about an RTP packet, which an MD needs to make a forwarding decision. The Original Header Block (OHB) extension [38] is added by the MD whenever it modifies any of the RTP header fields. This allows the receiver to reconstruct an exact copy of the original E2E-encrypted RTP packet, and thus decrypt and verify it.

In this chapter, we discuss an implementation of the scheme proposed in the “double” draft [38], with some modifications. We do not examine the problem of distributing the keys needed for E2E encryption or that of identity verification.

The following two sections (9.1.1 and 9.1.2) describe the RTP Header Extensions and the modifications we propose. The rest of the sections (9.1.3 through 9.1.7) discuss the main problems we identified during the implementation, and the solutions we propose.

9.1.1 Original Header Block Extension

The Original Header Block (OHB) RTP header extension is defined in the “double” draft [38]. Its purpose is to allow an MD to modify a subset of the RTP header fields while still allowing receivers to perform packet authentication end-to-end. This is accomplished by having the MD add the original values of the modified fields in an OHB extension. The receiver removes the extension and performs authentication using the original values. It then processes the packet further using the MD-modified values. The current version of the draft defines a structure which can contain at most the original RTP Payload Type (from now on PT) and SEQ fields, restricting the MD to only modifying these two fields. However, as previously explained, in the simulcast use case the MD may also need to modify the SSRC and TS.

Table 9.1: The fields encoded in the Original Header Block header extension for different values of the length field and the R-bit.

len   R-bit set            R-bit not set
 0    PT
 1    SEQ
 2    PT, SEQ
 3    SSRC
 4    PT, SSRC             PT, TS
 5    SEQ, SSRC
 6    PT, SEQ, SSRC        PT, SEQ, TS
 7    SSRC, TS
 8    PT, SSRC, TS
 9    SEQ, SSRC, TS
10    PT, SEQ, SSRC, TS

We propose to keep the general scheme, and preserve the semantics of the OHB extension, while extending its structure to contain the additional fields needed for the WebRTC 1.0 simulcast use case. To optimize the overhead we keep using the 4-bit length field of the header extension header [71] to encode the structure of the remaining bytes. Specifically, we use it to encode whichever of the four fields are present in the following bytes: Payload Type (PT, 1 byte, including an extra R-bit), Sequence Number (SEQ, 2 bytes), SSRC (4 bytes), Timestamp (TS, 4 bytes).

If the length field has a value of 0, 1, or 2 (indicating that there are, respectively, 1, 2 or 3 more bytes), then the format is identical to the current draft. This provides backward compatibility. That is, an endpoint implementing the extended format can also work with the current format without modification.

Table 9.1 lists the encoded fields (and the order in which they are included) for the different values of the length field and the R-bit. The R-bit is the most significant bit (MSB) of the PT field, and is present if and only if the PT field is present. If the extension includes the PT field, it consists of an R-bit (MSB) and 7 bits for the original RTP Payload Type. The R-bit is used to disambiguate which fields are included when the length alone is ambiguous.
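As an illustration, the sketch below serializes the body of the extended OHB extension in the field order of Table 9.1; the class name and the byte-packing helper are our own and only indicative, not the normative encoding.

import java.io.ByteArrayOutputStream;

/** Hypothetical sketch: serialize the body of the extended OHB extension.
 *  Fields are written in the order PT, SEQ, SSRC, TS; the 4-bit length field
 *  of the header extension (not written here) then identifies which fields
 *  are present, per Table 9.1, with the R-bit resolving the ambiguous cases. */
public class OhbWriter {
    public static byte[] writeBody(Integer pt, boolean rBit,
                                   Integer seq, Long ssrc, Long ts) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (pt != null) {
            // MSB is the R-bit, the remaining 7 bits the original payload type.
            out.write((rBit ? 0x80 : 0) | (pt & 0x7F));
        }
        if (seq != null) {
            out.write((seq >> 8) & 0xFF);
            out.write(seq & 0xFF);
        }
        if (ssrc != null) {
            writeUint32(out, ssrc);
        }
        if (ts != null) {
            writeUint32(out, ts);
        }
        return out.toByteArray();
    }

    private static void writeUint32(ByteArrayOutputStream out, long value) {
        out.write((int) ((value >> 24) & 0xFF));
        out.write((int) ((value >> 16) & 0xFF));
        out.write((int) ((value >> 8) & 0xFF));
        out.write((int) (value & 0xFF));
    }
}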

9.1.2 Frame Marking

To achieve selective forwarding, the MD needs to know whether the packet is part of a keyframe or not. This information is not included in the RTP header of a packet and is usually part of the payload. Existing implementations extract this from the payload in a codec-specific way. With PERC this is no longer possible, because the payload is E2E encrypted and therefore unavailable to the MD. The framemarking draft[19] provides a solution to this problem. It defines an RTP header extension containing additional information about a packet (such as whether the packet is part of an independent frame). RTP header extensions are not E2E encrypted and thus, always remain available to the MD.

The details of the header extension are as follows; we implemented it in our solution.

 0                   1
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  ID=? | L=0   |S|E|I|D|0 0 0 0|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Note that we use the first of the remaining 4 bits to signal padding packets, see section 9.1.5.
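A minimal decoding sketch for this one-byte body, including the padding-only bit described in section 9.1.5; the class name is hypothetical.

/** Hypothetical sketch: decode the one-byte frame-marking extension body.
 *  Bit layout (MSB to LSB): S, E, I, D, then the first of the remaining bits,
 *  which we reuse as the padding-only flag (see section 9.1.5). */
public final class FrameMarking {
    public final boolean start;        // S: first packet of a frame
    public final boolean end;          // E: last packet of a frame
    public final boolean independent;  // I: part of an independent frame (keyframe)
    public final boolean discardable;  // D: the frame may be discarded if necessary
    public final boolean paddingOnly;  // our extra bit: the packet carries no media

    public FrameMarking(byte b) {
        start       = (b & 0x80) != 0;
        end         = (b & 0x40) != 0;
        independent = (b & 0x20) != 0;
        discardable = (b & 0x10) != 0;
        paddingOnly = (b & 0x08) != 0;
    }
}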

9.1.3 Injecting Video Frames

Obviously one of the goals of the PERC architecture is to prevent an MD from modifying or injecting media content in a conference that it hosts. However, there are certain scenarios when injecting media may be useful.

For example, due to a long-standing bug in the Chrome / Chromium browser2, the audio from a MediaStream which contains both audio and video tracks is never played back until some video packets are received. To work around this in the case where a participant sends no video, our SFU injects empty video frames in the beginning of a stream. We had to disable this behaviour and fall back to using two separate MediaStream objects for the audio track and the video track on the receiver side. This removes the audio playback problem, but disables synchronization between the playback of audio and video on the receiver side.

9.1.4 Retransmissions with RTX

One of the mechanisms WebRTC uses to handle packet loss is packet retransmissions [30], and its effectiveness depends highly on the delay [53]. Thus, handling retransmissions in the SFU is desirable, because it can reduce the delay of retransmitted packets by a half on average (since the SFU is in between any two peers on the transport path). Usually, the SFU implements two mechanisms. It requests missing packets from the senders if it itself failed to receive them, and it responds to retransmission requests from receivers.

The RTX format [62] is used for RTP packet retransmission. It uses its own RTP payload type with its own SSRC and thus has its own sequence number space. When a packet is retransmitted, it inherits the original packet headers with the exception of these three fields (PT, SSRC and SEQ). The payload type and SSRC are paired via signaling with those of the corresponding media stream, so the original values can be reconstructed on the receiving end. However, the sequence number is encoded in the Original Sequence Number (OSN) field, which is the first two bytes of the RTP payload. As part of the payload it is encrypted by SRTP.

In the case of the double encryption scheme, if RTX packets are handled like regular RTP packets, the MD cannot read the OSN in incoming RTX packets or create its own RTX packets (as this involves modifying the payload by inserting the OSN).

We propose a solution to this problem by using RTX hop-by-hop only. That is, the RTX transformation is applied after the E2E SRTP transformation, but before the HBH. This is the same way that PERC “double” handles RTCP packets, except of course that the input is an RTX packet which encapsulates an already E2E encrypted media packet.

Specifically, the procedure for a sender to retransmit a packet is the following: first, it finds the unencrypted packet to be retransmitted and applies the E2E SRTP transformation. Then, it applies the RTX transformation, consisting of: replacing the Payload

2https://bugs.chromium.org/p/webrtc/issues/detail?id=5254

Type and SSRC with those of the RTX stream, inserting the OSN as the first two bytes of the payload, and generating a new sequence number. Lastly, it applies the HBH SRTP transformation.

When an MD receives an RTX packet, it first applies the reverse HBH SRTP transformation. It then restores the original E2E-encrypted packet by replacing the Payload Type and SSRC fields with those of the associated media stream, and restores the sequence number from the OSN (first two bytes of the payload). At this point it can treat the restored packet as usual: send it to other endpoints after optionally encapsulating it in RTX and encrypting it with the corresponding HBH context. The same procedure applies to receiving endpoints, though after restoring the RTP packet it has to be E2E verified and decrypted.
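A sketch of this restore step on the MD, assuming the RTX-to-media payload type and SSRC mapping from signaling is available; the RtpPacket abstraction and helper names are hypothetical, not the Jitsi Videobridge API.

/** Hypothetical sketch: undo the RTX encapsulation after HBH decryption,
 *  recovering the original E2E-encrypted packet. */
public class RtxRestorer {
    public RtpPacket restore(RtpPacket rtx, int mediaPayloadType, long mediaSsrc) {
        // The Original Sequence Number is the first two bytes of the RTX payload.
        byte[] payload = rtx.getPayload();
        int osn = ((payload[0] & 0xFF) << 8) | (payload[1] & 0xFF);

        RtpPacket restored = rtx.copy();
        restored.setPayloadType(mediaPayloadType); // paired via signaling
        restored.setSsrc(mediaSsrc);               // paired via signaling
        restored.setSequenceNumber(osn);
        restored.setPayload(payload, 2, payload.length - 2); // strip the OSN
        return restored;                            // still E2E encrypted
    }

    /** Minimal stand-in for an RTP packet abstraction. */
    interface RtpPacket {
        byte[] getPayload();
        RtpPacket copy();
        void setPayloadType(int pt);
        void setSsrc(long ssrc);
        void setSequenceNumber(int seq);
        void setPayload(byte[] buf, int offset, int length);
    }
}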

The above scheme allows an MD to insert packets into an RTX stream, while the MD remains unable to insert packets into a media stream, as packets transported in RTX are still protected with the E2E context.

9.1.5 Terminating Bandwidth Estimation

Part of the mechanism used in webrtc.org for bandwidth estimation is actively probing for additional available bandwidth [16] by sending extra padding packets. An MD should terminate the bandwidth estimation, as the available bandwidth normally varies per endpoint. Thus, packets which are sent only for bandwidth estimation and contain no actual media should not be forwarded to other endpoints.

The RTP padding mechanism uses the last byte of the payload to encode the padding length [70], and so the field is encrypted by SRTP as part of the payload [11]. This means that an MD is not able to read the padding length field for E2E encrypted packets, and it cannot differentiate between padding-only packets and those that contain (some) media. Thus, in the context of PERC, if additional data needs to be sent for the purposes of bandwidth estimation, it should not be sent as RTP padding.

There are multiple ways of solving this problem. We propose using the RTX stream along with a padding-only indication in a header extension in order to allow probing-only packets to be sent HBH (and HBH only). Using the RTX format on its own with no other changes does not work for two reasons: the last byte of the RTX payload is still E2E encrypted, and the padding bit in the RTP header must match between the inner and outer SRTP transformations, because the “double” draft does not include a way for the original value to be recovered. Using a separate padding-only bit in the header also

has the advantage of allowing padding-only packets with more than 256 bytes of data to be sent.

We propose using the frame marking header extension to carry the padding-only bit. As this bit is not intended for use in packets containing actual media, we only define it in the one-byte format used for non-scalable streams (and not in the 3-byte format for scalable streams; see section 9.1.2 and also Sections 3.1 and 3.2 from the frame marking draft [19]). We use the first of the remaining bits (originally defined as 0) as the “padding-only” bit. We do not reuse the “discardable frame” bit because it has different semantics (the packet contains media, though it may be discarded if necessary).

9.1.6 RTP Timestamps

The current implementation of simulcast in webrtc.org (used in Chrome, Firefox, and Safari) uses different initial offsets for the RTP timestamp for each simulcast stream. This forces the SFU to rewrite the timestamps of the single RTP stream it outputs.

However, in the context of PERC, allowing an MD to modify the RTP timestamp has undesirable security implications, namely the possibility of the MD performing replay attacks. These attacks are prevented by the verification of the RTP sequence number.

We choose to allow rewriting of the timestamp field by the MD as described in sections 9.1 and 9.1.1 above. Our reasoning is the ease of implementation in existing software and the limited impact on security during local tests. This should not be allowed in real systems.

One potentially viable solution is for simulcast senders to use the same offset for all simulcast streams coming from a unique source. This removes the need for timestamp modification by the MD, while also allowing the receiver to verify the timestamps of all simulcast encodings end-to-end, preventing a form of attack where the MD does not initially forward one of the input streams, and then replays it at a later time.

Since RTP timestamps use a 32-bit integer field that wraps around, some form of replay attacks may still be possible. To prevent this, we suggest the conference be re-keyed often enough. In the case of the usual 90 kHz RTP clock rate, it takes approximately 13 hours for the RTP timestamps to cycle, so re-keying should happen at least that often.
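The 13-hour figure follows directly from the clock rate:

\[
\frac{2^{32}}{90\,000\ \text{Hz}} \approx 47\,722\ \text{s} \approx 13.3\ \text{hours}.
\]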

9.1.7 Signaling Double Encryption

An SFU needs two pieces of information to function in the double encryption scheme described above. These are the ID numbers for the two header extensions: frame marking and Original Header Block.

If an SFU needs to differentiate between the cases of double encryption being enabled or not, it can do so solely based on the presence of the OHB header extension. That is, if OHB is signaled, it always adds the OHB extension to RTP packets and follows the semantics defined in the “double” draft for handling other RTP header extensions. Otherwise, it should not add an OHB extension and is free to add, remove or modify other header extensions.
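In code, the check can be as simple as the following sketch; the extension URI shown is a placeholder of our own, not necessarily the identifier defined by the draft, and the extension-map type is hypothetical.

import java.util.Map;

/** Hypothetical sketch: an SFU decides whether PERC double encryption is in
 *  use purely from whether the OHB header extension has been signaled. */
public class DoubleEncryptionMode {
    // Placeholder URI; the real identifier comes from the signaled extension map.
    private static final String OHB_URI = "urn:example:ohb";

    public static boolean isEnabled(Map<String, Integer> signaledExtensions) {
        // Keyed by extension URI; the value is the negotiated extension ID.
        return signaledExtensions.containsKey(OHB_URI);
    }
}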

9.2 Evaluation

We implemented a proof-of-concept system by modifying the Jitsi Videobridge SFU and the Chromium browser. We used unmodified Jitsi Meet. The system implements the proposals described in the sections above and uses pre-generated end-to-end encryption keys shared between the endpoints. At this stage, we are only assessing the performance characteristics and feasibility of the “double” system, and not the key distribution and verification aspects of any full-fledged system.

We performed measurements of CPU usage and analyzed the overhead incurred due to the additional header extensions we propose. We present the results in the sections below.

9.2.1 CPU Performance

As we saw in Chapter 7, the process of encryption/decryption (including signing and verification) can constitute a significant portion of an SFU’s workload. However, the “double” system does not change the performance characteristics of an SFU, because it still only performs a single decrypt/encrypt operation, it just happens that the payload remains end-to-end encrypted.

Endpoints, however, must perform two encryption and decryption operations for each packet. We measured the CPU usage during participation in a one-to-one call with and without double encryption on a modern laptop (a 2015 15" MacBook Pro). We were unable to detect any difference between the two cases, on either the sending or receiving

endpoint. This is explained by the encoding/decoding process being much more CPU intensive than encryption/decryption.

9.2.2 Overhead Analysis

The original “double” SRTP encryption scheme used in PERC adds some overhead. At minimum, 4 extra bytes per RTP packet come from the additional SRTP [11] transformation and the fields added by the EKT format [39]. The Original Header Block RTP Header Extension adds between 0 (when no headers were modified) and 8 bytes (when both the Payload Type and Sequence Number were modified, and the original packet contains no header extensions) to each RTP packet sent from the MD to receivers.

Our extended OHB format adds between 0 and 16 bytes (when all of PT, SEQ, TS and SSRC were modified and the original packet has no header extensions). However, it does not change the overhead compared to the current OHB format unless the extra features (modifying the SSRC and/or TS fields) are used.

Together, the overhead is between 4 and 20 bytes. In our implementation it is always 4 for audio packets and 16 for most video packets. This is because header extensions are already in use, the MD does not need to modify any headers for the audio streams, and modifies the SEQ, SSRC and TS fields for most video packets.

In a typical WebRTC call scenario using the Opus codec with 20 ms frames and the default bitrate of 40 Kbps, we observe an average packet size of around 100 bytes. Thus the overhead between sending endpoints and the MD is 4%, and can be as high as 20% between the MD and receiving endpoints, though in most cases it is 4%.

For video, the average packet size is usually between 600 and 1200 bytes, depending on the bitrate. Thus, the overhead between sending endpoints and the MD is less than 0.7%, while the overhead between the MD and receiving endpoints is up to 4% (and 2.7% on average in our system).

For a combined audio and video call, the overhead is closer to that of video, as video comprises most of the packets.

9.3 Related Work

The PERC working group at the IETF continues to be active, and there have been other proposals addressing some of the same problems that we discuss (some of them submitted after our work was originally written).

One idea is to replace the complex OHB mechanism and instead encapsulate the entire end-to-end encrypted RTP packet (together with its headers and authentication block) when applying the hop-by-hop encryption. This results in some additional overhead, but it is likely insignificant.

The question of handling RTX and other stream repair mechanisms such as Forward Error Correction (FEC) [88] has been the topic of much discussion. While it is agreed that there is a need for such mechanisms to be used hop-by-hop, there is no consensus on the best way to achieve it.

One proposal is to simply send packets for stream repair without additional protection. This is not viable, because lacking the E2E context an MD cannot authenticate incoming packets, which may enable certain DoS attacks when used with FEC.

Another proposal, referred to as “triple”, consists of applying the hop-by-hop encryption to stream repair packets twice. This adds extra complexity, because the mechanism is different from the way the encryption of RTP and RTCP packets is handled by PERC. The second encryption round, since it uses the same context, does not have any useful cryptographic properties. It also requires modifications to the way SRTP is performed, because with normal operation the same context cannot be used twice for a packet. This is because SRTP replay protection is based on the RTP sequence number of the packet, which does not change. This proposal is now included in the latest version of the double draft [38].

9.4 Conclusion and Future Work

In this chapter, we illustrated a proof-of-concept implementation of the proposed PERC standard. As far as we are aware, this is the first such implementation. Out of this work it is clear that much of what is currently implemented in the browser is still quite proprietary, and supporting Chrome in our case required extra work. However, solutions to almost all the problems we faced already existed in some specifications, e.g., framemarking [19] or “diet” EKT [39], and this chapter provides the details necessary to achieve an implementation. We also showed that while the current OHB format is insufficient,

it can be easily extended to achieve what is needed. Finally, we demonstrated that the performance overhead is negligible in terms of CPU usage and very low (< 4%) in terms of bandwidth. For the intended use case where E2E encryption is necessary for compliance reasons, the additional bandwidth is completely acceptable.

Chapter 10

Conclusions and Perspectives

This chapter concludes the thesis, summarizing the addressed problems and pointing out the contributions. It also offers perspectives for further research.

10.1 Conclusion

The primary goal of this thesis was to propose and evaluate ways of increasing the efficiency of video conferencing services using Selective Forwarding Units and the WebRTC framework, as well as to improve the overall user experience.

We began by considering the bandwidth requirements for a conference, which are generally higher when using an SFU, as compared to other approaches such as using mixing servers (MCUs).

In Chapter 4, we proposed a selective forwarding scheme using speech activity to automatically follow the active speaker in a conference. We proposed an algorithm for “dominant speaker identification” that works using audio level indications as opposed to raw audio streams, making it suitable for use in an SFU. We compared the selective forwarding system to a full-star topology in which all streams are forwarded, and observed a significant reduction in the required bandwidth – from quadratic to linear growth with the number of endpoints.

Next, in Chapter 5, we presented a thorough evaluation of the use of WebRTC simulcast for video conferencing. The feature further reduces the bandwidth required for a conference, at the price of a slight decrease in image quality. The computational overhead was negligible.

139 In Chapter 6, we proposed a hybrid architecture using a peer-to-peer topology for conferences of size two, and an SFU for larger conferences. We evaluated the effects in a production system and observed significantly reduced server traffic as well as improved connection quality.

Chapter 7 focused on the performance of SFU software implementations. We detailed an efficient algorithm for selective forwarding and observed lowered CPU usage. We proposed a technique for memory management aimed at reducing the strain on the Java garbage collector (GC). We developed a methodology to measure the side effects of GC (temporary pausing of all application threads), and performed an experiment to determine the GC configuration most suitable for use in an SFU. We found that the Concurrent Mark and Sweep algorithm performs best, and that our proposed technique improves the results regardless of which GC algorithm is used.

In Chapter 8, we explored a distributed server architecture aimed at reducing the latency in a conference by connecting endpoints to servers in close geographical proximity. We performed a randomized experiment and found that, overall, the latency to the immediate server was slightly lower, but that the approach had the unintended effect of significantly increasing end-to-end latency. Depending on the distribution of users in a conference and other factors, the system can have either a positive or a negative effect, and the challenge is in understanding these factors in order to perform server selection in an optimal way. To this end, we propose further research.

Chapter 9 diverged from the main topic of efficiency and pursued another goal of this thesis – supporting novel features using SFU architectures. It detailed the implementation of an end-to-end encrypted conference system with WebRTC and simulcast. We proposed the extensions to the existing PERC specification necessary to support simulcast, and we showed that the overhead is negligible.

10.2 Perspectives

Several of the results in this thesis open opportunities for further research. This section lists some of them.

10.2.1 Memory management

One of the challenges we faced in Chapter 7 was the lack of knowledge about what constitutes acceptable performance for the distribution of garbage collection pauses.

The effects of these pauses on video conferencing have not been explored. We observe very long pauses triggering freezes in the video playback, but we don’t know at what pause length the effect appears. We hypothesise that longer and more frequent pauses will have an effect on receivers’ jitter buffers, forcing them to grow larger and increasing end-to-end delay. Both of these questions can be answered experimentally by artificially delaying packets in the SFU and observing the jitter buffer length.

In our experiments we tested only the default configuration of the G1 algorithm. Further testing with different G1 constraints, and with other algorithms (like Shenandoah), might reveal better configurations.

Finally, moving the buffers used for RTP packet contents out of the Java heap has the potential to significantly reduce the strain on the garbage collector and thus the GC pause duration and frequency. The risk with this approach is that memory leaks are not caught by the Java garbage collector, and instead accumulate until the application runs out of memory. Quantifying the effects can only be done experimentally.
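A minimal sketch of this direction, in Java, is shown below. It assumes a simple pool of reusable direct buffers; the class and its names are hypothetical and do not correspond to the implementation discussed in Chapter 7.

import java.nio.ByteBuffer;
import java.util.ArrayDeque;

// Hypothetical sketch: a pool of direct (off-heap) buffers for RTP packet
// contents. Direct buffers store their contents outside the Java heap, so
// packet data no longer contributes to the work done by the garbage collector.
// Leaked buffers are not reclaimed automatically, which is the risk noted above.
public class OffHeapPacketPool {
    private static final int MTU = 1500; // assumed maximum RTP packet size
    private final ArrayDeque<ByteBuffer> free = new ArrayDeque<>();

    public synchronized ByteBuffer acquire() {
        ByteBuffer buf = free.poll();
        // allocateDirect places the buffer's storage outside the Java heap
        return (buf != null) ? buf : ByteBuffer.allocateDirect(MTU);
    }

    public synchronized void release(ByteBuffer buf) {
        buf.clear();
        free.push(buf); // reuse instead of leaving reclamation to the GC
    }
}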

10.2.2 Simulcast, Scalable Video Coding, and Different Video Codecs

Scalable Video Coding (SVC) is a technique that provides features similar to simulcast. An SFU receiving a scalable stream can produce different encodings by selecting a subset of the stream’s layers. As far as we are aware, no comparison between the two approaches for use in video conferencing has been published.

A comparison would have to consider the video codec(s) to use. Simulcast is codec-independent, but some codecs offer native implementations with enhanced performance. As we showed in Chapter 5, the VP8 implementation of simulcast in webrtc.org is very efficient in terms of encoding complexity, but this does not automatically translate to other codecs. In contrast, SVC needs explicit support at the codec level. The VP9 codec supports SVC with temporal, spatial and signal-to-noise ratio scalability. However, it is currently not as widely supported as VP8, which is in fact mandatory to implement in WebRTC.

The VP8 simulcast implementation also supports temporal scalability, which can be used to further lower the bitrate when there is insufficient bandwidth.

There are multiple studies evaluating video codec efficiency, but most of them focus on the storage or live streaming scenarios. The new generation of codecs like HEVC and VP9 offer a much better compression ratio at the price of increased encoding complexity. Video conferencing requires near real-time encoding, which makes the case significantly different.

10.3 Cascaded Selective Forwarding Units

The server selection strategy we proposed in Chapter 8 was disappointing and ultimately unsuccessful in lowering end-to-end latency. However, the distributed server architecture as a whole still has the potential to improve connection quality, if we can find a better selection strategy.

The first step in this direction would be to understand the situations in which the simple strategy fails. One hypothesis is that the region detection itself is not working correctly. We have started a project to test this, by directly measuring endpoints’ round-trip time to all of the available regions.

Another hypothesis is that many conferences have endpoints with similar proximity to more than one server (as illustrated in Figure 8.10). This can also be confirmed or refuted by the experiment mentioned above.

The positive results we observe in the us-west region (see Section 8.4) suggest that some region-wide characteristics affect the outcome. Perhaps this region is more isolated, with few of its endpoints also being close to other regions. Studying the characteristics of this region will offer further insight. If this is the case, limiting distributed server conferences to one server per continent may be a good solution.

As we mention in Section 8.1, lower RTT to the next-hop server is beneficial for stream repair. Quantifying this effect would be useful when evaluating different strategies. With distributed servers we observe a decrease of 8 ms on average, but we don’t know how significant this effect is.

Our aim is to improve connection quality, but so far our tests have concentrated on measuring the delay. This is because delay is an easy-to-measure proxy for connection quality. Measuring other metrics, such as packet loss, jitter, and achieved stable bitrate, may offer additional insight.

One area that we haven’t explored so far is optimizing the server-to-server traffic. A good first step may be to apply the LastN and simulcast techniques from Chapters 4 and 5, adapted for the server-to-server use case.

Finally, the distributed server architecture has a completely different use case that may prove more impactful – supporting very large conferences. A single server can only handle a limited number of endpoints, and by using an appropriate server-to-server topology we can significantly increase that number. This is an area that we plan to explore in the future.

Bibliography

[1] Apache License, Version 2.0, 2004.

[2] UK Home Broadband Performance: The performance of fixed-line broadband delivered to UK residential consumers, May 2018.

[3] security whitepaper, August 2018.

[4] ISO/IEC 23009-1:2019: Dynamic adaptive streaming over HTTP (DASH) - Part 1: Media presentation description and segment formats, August 2019.

[5] H. Alvestrand. RTCP message for Receiver Estimated Maximum Bitrate. https://tools.ietf.org/html/draft-alvestrand-rmcat-remb-03, October 2013.

[6] H. Alvestrand and V. Singh. Identifiers for WebRTC’s Statistics API. https://www.w3.org/TR/2018/CR-webrtc-stats-20180703/, 2018.

[7] Peter Amon, Madhurani Sapre, and Andreas Hutter. Compressed domain stitching of hevc streams for video conferencing applications. In Packet Video Workshop (PV), 2012 19th International, pages 36–40. IEEE, 2012.

[8] Zlatka Avramova, Danny De Vleeschauwer, Kathleen Spaey, Sabine Wittevrongel, Herwig Bruneel, and Chris Blondia. Comparison of simulcast and scalable video coding in terms of the required capacity in an IPTV network. In Packet Video 2007, pages 113–122. IEEE, 2007.

[9] J. Bankoski, J. Koleszar, L. Quillio, J. Salonen, P. Wilkins, and Y. Xu. VP8 Data Format and Decoding Guide. RFC 6386, IETF, November 2011.

[10] Salman Baset and Henning Schulzrinne. An analysis of the skype peer-to-peer internet telephony protocol. Proceedings - IEEE INFOCOM, 01 2005.

[11] M. Baugher, D. McGrew, M. Naslund, E. Carrara, and K. Norrman. The Secure Real-time Transport Protocol (SRTP). RFC 3711, IETF, March 2004.

[12] Adam Bergkvist, Daniel C. Burnett, Cullen Jennings, Anant Narayanan, Bernard Aboba, Taylor Brandstetter, and Jan-Ivar Bruaroey. WebRTC 1.0: Real-time Communication Between Browsers. https://www.w3.org/TR/2018/CR-webrtc-20180927/, 2018.

[13] Frank Bossen, Benjamin Bross, Karsten Suhring, and David Flynn. HEVC complexity and implementation analysis. IEEE Trans. Cir. and Sys. for Video Technol., 22(12):1685–1696, December 2012.

[14] E. Brosh, S.A. Baset, D. Rubenstein, and H. Schulzrinne. The delay-friendliness of TCP. In Proceedings of the 2008 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, SIGMETRICS ’08, pages 49–60, New York, NY, USA, 2008. ACM.

[15] Daniel C. Burnett, Adam Bergkvist, Cullen Jennings, Anant Narayanan, Bernard Aboba, Jan-Ivar Bruaroey, and Henrik Boström. Media Capture and Streams. https://www.w3.org/TR/2019/CR-mediacapture-streams-20190702/, 2019.

[16] Gaetano Carlucci, Luca De Cicco, Stefan Holmer, and Saverio Mascolo. Analysis and design of the Google congestion control for web real-time communication (WebRTC). In Proceedings of the 7th International Conference on Multimedia Systems, MMSys ’16, pages 13:1–13:12, New York, NY, USA, 2016. ACM.

[17] Marcel Dischinger, Andreas Haeberlen, Krishna P. Gummadi, and Stefan Saroiu. Characterizing residential broadband networks. In Proceedings of the 7th ACM SIGCOMM Conference on Internet Measurement, IMC ’07, pages 43–56, New York, NY, USA, 2007. ACM.

[18] Richard Dosselmann and Xue Dong Yang. A comprehensive assessment of the structural similarity index. Signal, Image and Video Processing, 5(1):81–91, 2011.

[19] E. Berger, S. Nandakumar and M. Zanaty. Frame Marking RTP Header Extension, March 2017. IETF Internet Draft.

[20] Alexandros Eleftheriadis, Reha Civanlar, and Ofer Shapiro. Multipoint videoconferencing with scalable video coding. Journal of Zhejiang University SCIENCE A, 7(5):696–705, May 2006.

[21] Jeroen Famaey, Steven Latré, Niels Bouten, Wim Van de Meerssche, Bart Vleeschauwer, Werner Van Leekwijck, and Filip Turck. On the merits of SVC-based HTTP adaptive streaming. 05 2013.

[22] Yao-Chun Fang, Chun-Yi Lee, Yin-Ming Wang, Wang Chung-Neng, and Tihao Chiang. Low complexity lossless video compression. In International Conference on Image Processing, Singapore, Singapore, October 2004. IEEE Press.

[23] I. Fette and A. Melnikov. The WebSocket Protocol. RFC 6455, IETF, December 2011.

[24] Scott Firestone, Thiya Ramalingam, and Steve Fry. Voice and Video Conferencing Fundamentals. Cisco Press, first edition, 2007.

[25] M. Ghanbari. Standard Codecs: Image Compression to Advanced Video Coding. Institution of Electrical Engineers, 2003.

[26] M. Handley, V. Jacobson, and C. Perkins. SDP: Session Description Protocol. RFC 4566, IETF, July 2006.

[27] R. Harrison and K. Zeilenga. The Lightweight Directory Access Protocol (LDAP) Intermediate Response Message. RFC 3771, IETF, April 2004.

[28] P. Helle, H. Lakshman, M. Siekmann, J. Stegemann, T. Hinz, H. Schwarz, D. Marpe, and T. Wiegand. A Scalable Video Coding Extension of HEVC. pages 201–210. IEEE, March 2013.

[29] S. Holmer, M. Flodman, and E. Sprang. RTP Extensions for Transport-wide Congestion Control. https://tools.ietf.org/html/draft-holmer-rmcat-transport-wide-cc-extensions-01, October 2015.

[30] Stefan Holmer, Mikhal Shemer, and Marco Paniconi. Handling packet loss in WebRTC. In International Conference on Image Processing (ICIP 2013), pages 1860–1864, 2013.

[31] Ian Hickson and David Hyatt. HTML5: A vocabulary and associated APIs for HTML and XHTML. http://www.w3.org/TR/html5/, September 2010.

[32] ITU-T. ITU-T recommendation G.114. Technical report, International Telecommunication Union, 2003.

[33] ITU-T Rec. G.711. G.711: Pulse code modulation (PCM) of voice frequencies.

[34] ITU-T Rec. H.264. Advanced video coding for generic audiovisual services.

[35] Emil Ivov. Optimizing real-time communications over the internet protocol, 2008.

[36] Emil Ivov, Lyubomir Marinov, and Philipp Hancke. XEP-0340: Conferences with Lightweight Bridging (COLIBRI), January 2014.

[37] Julian Jang-Jaccard, Surya Nepal, Branko Celler, and Bo Yan. WebRTC-based video conferencing service for telehealth. Computing, 98(1-2):169–193, January 2016.

[38] C. Jennings, P. Jones, R. Barnes, and A. Roach. SRTP Double Encryption Procedures, 2017. IETF Internet Draft.

[39] C. Jennings, J. Mattsson, D. McGrew, D. Wing, and F. Andreasen. Encrypted Key Transport for Secure RTP, 2016. IETF Internet Draft.

[40] R. Jesup, S. Loreto, and M. Tuexen. WebRTC Data Channels. https://tools.ietf.org/html/draft-ietf-rtcweb-data-channel-13, January 2015.

[41] Hari Kalva, Velibor Adzic, and Borko Furht. Comparing MPEG AVC and SVC for adaptive HTTP streaming. pages 158–159, 2012.

[42] Gregorij Kurillo, Allen Y. Yang, Victor Shia, Aaron Bair, and Ruzena Bajcsy. New emergency medicine paradigm via augmented telemedicine. In Stephanie Lackey and Randall Shumaker, editors, Virtual, Augmented and Mixed Reality, pages 502–511, Cham, 2016. Springer International Publishing.

[43] J. Lazzaro. Framing Real-time Transport Protocol (RTP) and RTP Control Protocol (RTCP) Packets over Connection-Oriented Transport. RFC 4571, IETF, July 2006.

[44] J. Lennox, E. Ivov, and E. Marocco. A Real-time Transport Protocol (RTP) Header Extension for Client-to-Mixer Audio Level Indication. RFC 6464, IETF, December 2011.

[45] J. Lennox, J. Ott, and T. Schierl. Source-Specific Media Attributes in the Session Description Protocol (SDP). RFC 5576, IETF, June 2009.

[46] Jonathan Lennox and Henning Schulzrinne. A protocol for reliable decentralized conferencing. In Proceedings of the 13th International Workshop on Network and Operating Systems Support for Digital Audio and Video, NOSSDAV ’03, pages 72–81, New York, NY, USA, 2003. ACM.

[47] A. Li. RTP Payload Format for Generic Forward Error Correction. RFC 5109, IETF, December 2007.

[48] Yuan Jiang Liao, Zhonghua Wang, and Yanfen Luo. The design and implementation of a WebRTC-based online video teaching system. 2016 IEEE Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC), pages 137–140, 2016.

[49] Scott Ludwig, Joe Beda, Peter Saint-Andre, Robert McQueen, Sean Egan, and Joe Hildebrand. XEP-0166: Jingle, 2005.

[50] R. Mahy, P. Matthews, and J. Rosenberg. Traversal Using Relays around NAT (TURN): Relay Extensions to Session Traversal Utilities for NAT (STUN). RFC 5766, IETF, April 2010.

[51] Lyubomir Marinov. Equalization of silence audio levels in packet media conferencing systems, 2015. US Patent 9325853.

[52] J. Mattsson, D. McGrew, D. Wing, and F. Andreasen. Encrypted Key Transport for Secure RTP, October 2014. IETF Internet Draft.

[53] Marcin Nagy, Varun Singh, Jörg Ott, and Lars Eggert. Congestion control using FEC for conversational multimedia communication. In Proceedings of the 5th ACM Multimedia Systems Conference, MMSys ’14, pages 191–202. ACM, 2014.

[54] S. Nandakumar and C. Jennings. Annotated Example SDP for WebRTC. https://datatracker.ietf.org/doc/draft-ietf-rtcweb-sdp/, October 2018.

[55] William B Norton. Internet service providers and peering. In Proceedings of NANOG, volume 19, pages 1–17, 2001.

[56] Orrin E. Dunlap. The Outlook for Television. Harper & Brothers Publishers, New York, USA, 1932.

[57] J. Ott, S. Wenger, N. Sato, C. Burmeister, and J. Rey. Extended RTP Profile for Real-time Transport Control Protocol (RTCP)-Based Feedback (RTP/AVPF). RFC 4585, IETF, July 2006.

[58] J. Postel. User Datagram Protocol. RFC 768, IETF, August 1980.

[59] J. Postel. Transmission Control Protocol. RFC 793, IETF, September 1981.

[60] E. Rescorla. WebRTC Security Architecture. https://tools.ietf.org/html/draft-ietf-rtcweb-security-arch-20, July 2019.

[61] E. Rescorla and N. Modadugu. Datagram Transport Layer Security Version 1.2. RFC 6347, IETF, January 2012.

[62] J. Rey, D. Leon, A. Miyazaki, V. Varsa, and R. Hakenberg. RTP Retransmission Payload Format. RFC 4588, IETF, July 2006.

[63] Roach, A., Nandakumar, S., and P. Thatcher. RTP Stream Identifier (RID) Source Description (SDES), October 2016. IETF Internet Draft.

[64] J. Rosenberg. Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal for Offer/Answer Protocols. RFC 5245, IETF, April 2010.

[65] J. Rosenberg, R. Mahy, P. Matthews, and D. Wing. Session Traversal Utilities for NAT (STUN). RFC 5389, IETF, October 2008.

[66] J. Rosenberg, H. Schulzrinne, G. Camarillo, A. Johnston, J. Peterson, R. Sparks, M. Handley, and E. Schooler. SIP: Session Initiation Protocol. RFC 3261, IETF, June 2002.

[67] J. Rosenberg, H. Schulzrinne, and O. Levin. A Session Initiation Protocol (SIP) Event Package for Conference State. RFC 4575, IETF, August 2006.

[68] P. Saint-Andre. Extensible Messaging and Presence Protocol (XMPP): Core. RFC 6120, IETF, March 2011.

[69] Nikita Salnikov-Tarnovski and Gleb Smirnov. Plumbr Handbook Java Garbage Collection. Plumbr.

[70] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson. RTP: A Transport Protocol for Real-Time Applications. RFC 3550, IETF, July 2003.

[71] D. Singer and H. Desineni. A General Mechanism for RTP Header Extensions. RFC 5285, IETF, July 2008.

[72] D. Singer, H. Desineni, and R. Even. A General Mechanism for RTP Header Extensions. RFC 8285, IETF, October 2017.

[73] Varun Singh. Protocols and Algorithms for Adaptive Multimedia Systems. School of Electrical Engineering, Aalto University, Helsinki, Finland, 2015.

[74] Varun Singh, Albert Abello Lozano, and Jörg Ott. Performance analysis of receive-side real-time congestion control for WebRTC. In Proc. of IEEE Workshop on Packet Video, PV ’13, 2013.

[75] J. Spittka, K. Vos, and JM. Valin. RTP Payload Format for the Opus Speech and Audio Codec. RFC 7587, IETF, June 2015.

[76] P. Srisuresh and M. Holdrege. IP Network Address Translator (NAT) Terminology and Considerations. RFC 2663, IETF, August 1999.

[77] Thatcher, P., Zanaty, M., Nandakumar, S., Burman, B., Roach, A., and B. Campen. RTP Payload Format Restrictions, March 2017. IETF Internet Draft.

[78] J. Uberti, C. Jennings, and E. Rescorla. JavaScript Session Establishment Protocol. https://tools.ietf.org/html/draft-ietf-rtcweb-jsep, February 2019. IETF Internet Draft.

[79] JM. Valin and C. Bran. WebRTC Audio Codec and Processing Requirements. RFC 7874, IETF, May 2016.

[80] JM. Valin, K. Vos, and T. Terriberry. Definition of the Opus Audio Codec. RFC 6716, IETF, September 2012.

[81] Ilana Volfin and Israel Cohen. Dominant speaker identification for multipoint videoconferencing. Computer Speech and Language, 27(4):895–910, 2013.

[82] Yubing Wang. Survey of objective video quality measurements. 2006.

[83] S. Wenger, U. Chandra, M. Westerlund, and B. Burman. Codec Control Messages in the RTP Audio-Visual Profile with Feedback (AVPF). RFC 5104, IETF, February 2008.

[84] M. Wenzel and C. Meinel. Full-body WebRTC video conferencing in a web-based real-time collaboration system. In 2016 IEEE 20th International Conference on Computer Supported Cooperative Work in Design (CSCWD), pages 334–339, May 2016.

[85] M. Westerlund and S. Wenger. RTP Topologies. RFC 7667, IETF, November 2015.

[86] P. Westin, H. Lundin, M. Glover, J. Uberti, and F. Galligan. RTP Payload Format for VP8 Video. RFC 7741, IETF, March 2016.

[87] H. R. Wu and K. R. Rao. Digital Video Image Quality and Perceptual Coding (Signal Processing and Communications). CRC Press, Inc., Boca Raton, FL, USA, 2005.

[88] M. Zanaty, V. Singh, A. Begen, and G. Mandyam. RTP Payload Format for Flexible Forward Error Correction (FEC), July 2019.

[89] Fan Zhai, Y. Eisenberg, T. N. Pappas, R. Berry, and A. K. Katsaggelos. Rate-distortion optimized hybrid error control for real-time packetized video transmission. Trans. Img. Proc., 15(1):40–53, January 2006.

[90] Virginie Zoumenou, Madeleine Sigman-Grant, Gayle Coleman, Fatemeh Malekian, Julia M. K. Zee, Brent J. Fountain, and Akela Marsh. Identifying best practices for an interactive webinar. 2015.

Appendices


Appendix A

List of publications

This thesis is in part based on previous research by the author, published in several peer-reviewed international journals and conferences, and described in several US patents and IETF internet drafts. These publications are listed below.

International journals

[Pub1] Experimental Evaluation of Simulcast for WebRTC, Boris Grozev, George Politis, Emil Ivov, Thomas Noël, IEEE Communications Standards Magazine (Volume 1, Issue 2, 2017). This paper details the implementation of simulcast in a video conferencing system based on WebRTC and presents a thorough evaluation of its effects on image quality, server load, and client load. The author of this dissertation is the main author of this paper. He contributed to the ideas and concepts discussed in the paper. He also implemented the algorithm for preemptive keyframe requests, designed and conducted the experiments, analyzed the results, and edited the manuscript.

International Conferences

[Pub2] Experimental Evaluation of Dynamic Switching between One-on-One and Group Video Calling, George Politis, Boris Grozev, Pawel Domas, Emil Ivov, Thomas Noël, 2018 Principles, Systems and Applications of IP Telecommunications (IPTComm), Chicago, IL, USA, 16-18 Oct. 2018. This paper proposes a system which switches dynamically between a server-based and a peer-to-peer connection during a conference, for the purpose of improving

user experience and reducing service cost. It presents an experimental evaluation of both goals. The author of this dissertation was one of the co-authors of this paper. His contributions consisted of designing and performing the experiments for the infrastructure resource use, and co-editing the manuscript.

[Pub3] Considerations for deploying a geographically distributed video conferencing system, Boris Grozev, George Politis, Emil Ivov, Thomas Noël, 2018 IEEE 8th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, USA, 8-10 Jan. 2018. This paper presents a preliminary study of a multi-server video conferencing architecture, connecting each user to a server in their local region, with the goal of reducing the end-to-end next-hop latency. The author of this dissertation was the main author of this paper. He contributed the ideas and concepts discussed in the paper, designed and conducted the experiments, analyzed the results, and edited the manuscript.

[Pub4] PERC double media encryption for WebRTC 1.0 sender simulcast, Boris Grozev, Emil Ivov, Arnaud Budkiewicz, Ludovic Roux, Alexandre Gouaillard, 2017 Principles, Systems and Applications of IP Telecommunications (IPTComm), Chicago, IL, USA, 25-28 Sept. 2017. This paper proposes an extension to the PERC specification for end-to-end encrypted video conferences, which enables the use of simulcast. It also proposes a novel way of handling stream repair. It details the proposed extension header format, and analyzes the overhead. The author of this dissertation was the main author of this paper. He contributed the ideas discussed in the paper, implemented a proof-of-concept for the proposed new format, performed the overhead analysis, and edited the manuscript.

[Pub5] Last N: Relevance-Based Selectivity for Forwarding Video in Multimedia Conferences, Boris Grozev, Lyubomir Marinov, Varun Singh, Emil Ivov, Proceedings of the 25th ACM Workshop on Network and Operating Systems Support for Digital Audio and Video, Portland, Oregon, March 18-20, 2015. This paper proposes a scheme for endpoint selection based on speech activity. It presents an efficient algorithm for dominant speaker identification, which works using audio level indications instead of decoding audio streams, making it usable in a Selective Forwarding Unit architecture. The proposed scheme is experimentally compared to a full-star architecture in terms of bandwidth usage.

The author of this dissertation was the main author of this paper. He contributed to the ideas and concepts discussed in the paper. He implemented an early proof-of-concept version of the dominant speaker detection algorithm, performed the experiments and analyzed the results.

Internet Drafts

[Pub6] Using RTX with Privacy Enhanced RTP Conferencing (PERC), Boris Grozev, Emil Ivov, Alexandre Gouaillard, Work in Progress, Internet Engineering Task Force, draft-grozev-perc-double-rtx-00 (https://tools.ietf.org/html/draft-grozev-perc-double-rtx-00), March 27, 2017. This draft proposes an alternative way of handling stream repair, and retransmission using the RTX format in particular, in the PERC specification for end-to-end encrypted conferencing. The author of this dissertation contributed the idea, and co-edited the text.

[Pub7] Allowing Synchronisation Source (SSRC) and Timestamp Rewriting in Privacy Enhanced RTP Conferencing (PERC), Boris Grozev and Emil Ivov, Work in Progress, Internet Engineering Task Force, draft-grozev-perc-ssrc-00 (https://tools.ietf.org/html/draft-grozev-perc-ssrc-00), March 28, 2017. This draft extends the PERC specification to allow a Media Distributor to modify RTP packets’ Synchronization Source (SSRC) and Timestamp fields, in order to support simulcast. The author of this dissertation contributed the idea, and co-edited the text.

Patents

[Pub8] Rapid optimization of media stream bitrate, Emil Ivov, Boris Grozev, Georgios Politis, United States patent US20190245945A1, January 2019 This publication describes a system for optimal selection of initial bitrate for a video conferencing session. A database is used to save a history of achieved stable bitrates between any two given endpoints. The author of this dissertation contributed to the initial concept, and editing of the description.

[Pub9] On demand in-band signaling for conferences, Emil Ivov, Boris Grozev, United States patent US10171526B1, January 2019


This publication describes a system that uses an additional signaling channel, minimizing the number of signaling exchanges required in large conferences. The author of this dissertation conceived of the idea and contributed to its description.

[Pub10] Selective internal forwarding in conferences with distributed media servers, Emil Ivov, Boris Grozev, United States patent application US20180375906A1, December 2018. This publication describes a multi-server video conferencing system, which applies the Last N scheme proposed in [Pub5] for the server-to-server connections in order to reduce internal bandwidth use. The author of this dissertation contributed to the initial concept, and editing of the description.

[Pub11] Dynamic adaptation to increased SFU load by disabling video streams, Emil Ivov, Boris Grozev, Georgios Politis, United States patent US9712570B1, July 2017. This publication describes a system which uses the Last N scheme proposed in [Pub5] and simulcast in order to adapt to system load and avoid overloading a server. The author of this dissertation contributed to the initial concept, and editing of the description.

[Pub12] Multiplexing sessions in telecommunications equipment using interactive connectivity establishment, Emil Ivov, Boris Grozev, Georgios Politis, United States patent US9674140B1, June 2017. This publication describes a system which allows Interactive Connectivity Establishment (ICE) sessions to be multiplexed on a single IP address and port pair, by encoding a server identifier in the “username fragment” field of ICE packets. The author of this dissertation conceived of the idea, created a proof-of-concept implementation, and contributed to its description.

Appendix B

Patents

In this appendix we present several US patents that came about from the author’s work on this thesis, and which were pursued during his employment at Atlassian Plc.

B.1 Dynamic Adaptation to Increased SFU Load by Disabling Video Streams

This invention [Pub11] allows a Selective Forwarding Unit to dynamically adapt to overuse of the system resources (CPU, network bandwidth, or other), with the goal of improving the user experience, by adjusting the Last-N value, or the set of enabled simulcast or SVC encodings for a specific conference, or a set of conferences.

The SFU detects when it is overusing its available resources (e.g. it is using too much bandwidth). This indicates a potentially degraded experience for all users of the SFU (possibly hundreds), because packets are being lost on the network, resulting in video freezes or other adverse effects. It reacts by selecting a set of conference participants S and either lowering their LastN, or selecting a set of high-bitrate simulcast or SVC encodings currently being sent, temporarily disabling them, and forwarding instead a lower-quality and lower-bitrate version of the stream. This has two effects:

1. Lowering the resource usage. The goal is to lower it enough so that users do not have a degraded experience.

2. For the participants in the set S, it forwards less data, and (in the case of LastN adjustments) fewer video streams. For example, if LastN was lowered from 5 to 4 for a participant, he or she will see 4 streams instead of 5. This is a negative effect for this participant, but it does not affect other participants in the conference.

The set S is chosen in such a way as to lower the resource usage enough (1), and minimize the negative effect from (2).

B.1.1 Algorithm

We use a monitoring function L which estimates the load L(t) of an SFU at a given time t. This can be, for example, the network utilization, CPU usage, or another metric or combination of metrics. The values of L are monitored, and events of overuse are detected when the value increases above a given threshold T_OVERUSE. The threshold can be pre-configured (e.g. set to 100 Mbps), or calculated dynamically. The value O = L(t) − T_OVERUSE is interpreted as the degree of overuse.

When an overuse event is detected, the SFU adapts by selecting a subset of the conference participants, and either

1. Decreasing the LastN value for each of them by a specific number

2. Switching some of the high-bitrate encodings sent to them to lower-bitrate encodings

The behavior then depends on whether it is using the LastN or simulcast/SVC variant.

LastN

In this case the SFU selects a set S of pairs (P, i), where P is a participant and i is an integer (the semantics being that the LastN value for participant P is lowered by i). The set S is selected with an optimization procedure, which aims to minimize the negative effect on users while sufficiently lowering the system load.

Given a participant P and a number i, we use the following two concepts: EL(P, i) and EU(P, i). The first, EL(P, i), describes the expected effect on the system load if the LastN value for P is reduced by i. The second, EU(P, i), describes the expected effect on the participant P if their LastN value is reduced by i. The exact definitions of EL and EU are not specified; an example is given below.

Algorithm 4 Algorithm for lowering LastN and recovery.

procedure MONITOR
    while true do
        x ← L(now)
        if x > T_OVERUSE then
            S ← choose_set_to_decrease(x − T_OVERUSE)
            decrease(S)
        else if x < T_UNDERUSE then
            S ← choose_set_to_restore(T_UNDERUSE − x)
            restore(S)
        wait(TIME_WAIT)
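The control flow of Algorithm 4 could be rendered in Java roughly as follows. The load function, the thresholds and the set-selection steps are placeholders; only the loop structure is taken from the algorithm.

// Illustrative sketch of the monitoring loop in Algorithm 4 (not an actual
// SFU implementation). load(), decreaseBy() and restoreBy() are placeholders.
public class LoadMonitor {
    private final double tOveruse;   // T_OVERUSE
    private final double tUnderuse;  // T_UNDERUSE
    private final long waitMillis;   // TIME_WAIT

    public LoadMonitor(double tOveruse, double tUnderuse, long waitMillis) {
        this.tOveruse = tOveruse;
        this.tUnderuse = tUnderuse;
        this.waitMillis = waitMillis;
    }

    // Placeholder for the monitoring function L(t), e.g. egress bitrate or CPU usage.
    double load() { return 0.0; }

    void decreaseBy(double overuse) { /* choose the set S and apply decrease(S) */ }
    void restoreBy(double headroom) { /* choose a subset of S_decreased and restore it */ }

    public void run() throws InterruptedException {
        while (true) {
            double x = load();
            if (x > tOveruse) {
                decreaseBy(x - tOveruse);   // degree of overuse O
            } else if (x < tUnderuse) {
                restoreBy(tUnderuse - x);   // available headroom
            }
            Thread.sleep(waitMillis);
        }
    }
}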

We extend EU (and EL) to sets of participants by taking the sum: EU(S) = Σ_{(P,i)∈S} EU(P, i). The set S is chosen as a subset of the set of all pairs (P, i), with i lower than the LastN value for P, so that the following hold:

1. The desired decrease in system load will be achieved, that is, EL(S) >= O.

2. S contains pairs with distinct participants. This is just a sanity check.

3. The negative effect on the users is minimized: the choice minimizes EU(S).

The participants for which LastN was decreased are stored as S_decreased (which again contains pairs (P, i)).

Recovery happens when the SFU detects underuse, indicated by L(t) < T_UNDERUSE. The SFU reacts by restoring the LastN value for some of the participants for which it was reduced. It chooses the set to restore in a similar way as above. Specifically, it chooses S as a subset of S_decreased, so that:

1. The expected increase in load is tolerable: EL(S) < T_UNDERUSE − L(t)

2. The choice maximizes the effect on users: EU(S)

Simulcast or SVC

In the simulcast or Scalable Video Coding case the SFU selects a set of streams which it is currently forwarding at a high bitrate, and switches to a lower-bitrate version of the same stream. If the SFU is using simulcast, this involves stopping forwarding of the high-bitrate simulcast stream and starting to forward the low-bitrate stream. If the SFU is using Scalable Video Coding, this involves dropping the additional layers. The set of streams is selected with an optimization procedure, aiming to achieve the desired effect on load while minimizing the negative effect on users.

Given a stream S, we use the concept of expected load decrease, E(S), which is the expected effect on the load of the SFU if it switches to streaming a lower-bitrate version of S. The set of streams to disable is chosen as follows: first, all streams are ordered according to E in descending order. Then the SFU selects the first k streams whose total effect (sum of E(S)) is at least O. This minimizes the number of affected streams, and thus the effect on users.
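A small Java sketch of this greedy selection is given below; the Stream type and the way E(S) is obtained are assumptions made for illustration only.

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch: order streams by expected load decrease E(S), descending, and pick
// the smallest prefix whose total effect reaches the overuse O.
public class StreamSelection {
    public static class Stream {
        final String id;
        final double expectedLoadDecrease; // E(S), e.g. high-bitrate minus low-bitrate
        Stream(String id, double expectedLoadDecrease) {
            this.id = id;
            this.expectedLoadDecrease = expectedLoadDecrease;
        }
    }

    public static List<Stream> selectStreamsToLower(List<Stream> streams, double overuse) {
        List<Stream> sorted = new ArrayList<>(streams);
        sorted.sort(Comparator.comparingDouble((Stream s) -> s.expectedLoadDecrease).reversed());

        List<Stream> selected = new ArrayList<>();
        double total = 0;
        for (Stream s : sorted) {
            if (total >= overuse) break;
            selected.add(s);
            total += s.expectedLoadDecrease;
        }
        return selected; // streams to switch to their lower-bitrate version
    }
}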

B.1.2 Example

Let the monitoring function L(t) be network utilization measured as the average egress bitrate on the server for the last 5 seconds. Let us use pre-configured values for the thresholds:

T_OVERUSE = 200 Mbps, T_UNDERUSE = 180 Mbps

Let us assume a constant bitrate B = 2 Mbps for each video stream (this is a naive assumption; in a real-world implementation we would calculate the current bitrate of each stream).

Let us define EL(P, i) = B ∗ i. This is reasonable, because we will send i fewer streams than before.

Let us define EU as follows: EU(P, i) = i/LastN(P). This definition takes into account the fraction of streams that the user will stop receiving, and so captures the idea that a decrease from 10 to 9 has less of a negative effect than a decrease from 2 to 1.

Let the SFU currently host the following three conferences: C1 with 20 participants and LastN = 3, C2 with 10 participants and LastN = 5, and C3 with 3 participants and LastN = 2.

Suppose the SFU measures L(t) = 210 Mbps. We then have O = 210 Mbps − 200 Mbps = 10 Mbps. This constitutes overuse. According to the procedure described above, the SFU will search for a set S which will result in an expected decrease in load of 10 Mbps or more, while minimizing EU(S).

For the participants P with LastN = 2 we have: EU(P, i) = i/2.

For the participants P with LastN = 3 we have: EU(P, i) = i/3.

For the participants P with LastN = 5 we have: EU(P, i) = i/5.

The general problem of choosing S can be modelled as a knapsack problem. With the example definitions it can be simplified. In this case the solution will have 5 participants from conference C2, with i = 1 for each. That is, the SFU will choose 5 participants from C2 and decrease their LastN from 5 to 4. This has an expected effect on load of EL(S) = 5 ∗ B = 10 Mbps, which covers the degree of overuse O.
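Under the simplified definitions of this example (EL(P, i) = B ∗ i and EU(P, i) = i/LastN(P)), lowering LastN by one for the participants with the largest LastN values first gives the smallest negative effect per megabit saved. The Java sketch below illustrates this simplified selection; the Participant type and parameters are illustrative assumptions.

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch of the simplified selection from the example: decrease LastN by one,
// starting with the participants that have the largest LastN, until the
// expected load decrease covers the overuse.
public class LastNSelection {
    public static class Participant {
        final String id;
        int lastN;
        Participant(String id, int lastN) { this.id = id; this.lastN = lastN; }
    }

    // Returns the participants whose LastN should be decreased by one.
    public static List<Participant> choose(List<Participant> all,
                                           double overuseMbps,
                                           double perStreamBitrateMbps) {
        List<Participant> byLastN = new ArrayList<>(all);
        byLastN.sort(Comparator.comparingInt((Participant p) -> p.lastN).reversed());

        List<Participant> chosen = new ArrayList<>();
        double expectedDecrease = 0;
        for (Participant p : byLastN) {
            if (expectedDecrease >= overuseMbps) break;
            if (p.lastN > 0) {
                chosen.add(p); // decrease this participant's LastN by i = 1
                expectedDecrease += perStreamBitrateMbps;
            }
        }
        return chosen;
    }
}

With the numbers above (B = 2 Mbps, O = 10 Mbps), this picks five participants from C2, as in the example.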

B.2 Multiplexing Sessions in Telecommunications Equipment Using Interactive Connectivity Establishment

This invention [Pub12] allows routers performing Network Address Translation (NAT) to create and maintain NAT mappings for Interactive Connectivity Establishment (ICE) sessions without taking an active part in the ICE exchange and without the need for communication between the NAT routers and the ICE endpoints. It allows a single public IP address and transport port number to be used for ICE sessions terminating in multiple endpoints.

In the case where each ICE endpoint has a publicly accessible IP address, the client can just connect to it using this address. But when one IP address and port number pair is shared between multiple ICE endpoints, the entity doing the multiplexing (the NAT router in this case) needs to somehow select the ICE endpoint to be used. This is a problem when a specific ICE endpoint must be used for a given connection (e.g. so that the client is connected to the correct conference). Other solutions include terminating ICE at the entry point that holds the IP address (i.e. at the element we call a “NAT router”).

The system includes a set of ICE endpoints (e.g. media servers such as Selective Forwarding Units) without public IP addresses, and a NAT router (or a set of NAT routers, possibly sharing the same public address and sharing their NAT table), which provide NAT services for the set of ICE endpoints. One or more signaling servers are used to set up an ICE session between clients on the Internet and a specific ICE endpoint. One example use case is connecting a client to a multimedia conference hosted on one of the ICE endpoints.

The ICE endpoints (which may be running either full ICE or ICE Lite) generate their part of the username (the local ice-ufrag, or ICE local username fragment) in a way that encodes an identifier for the ICE endpoint in it. Any scheme that encodes

an identifier for the ICE endpoint in a way that the NAT router can decode can be used. One example of an encoding method is to encode the IP address of the ICE endpoint as a string using dotted-decimal notation, and append to it a randomly generated 8-character string. The NAT router uses the “remote username fragment” from the USERNAME attribute of STUN packets coming from clients in order to decide which ICE endpoint is the destination. It extracts the encoded identifier of the ICE endpoint and uses this endpoint for subsequent packets from the given (source IP address, source port number) pair (i.e. it creates an entry in its NAT table for the given client address). The NAT routers use the following procedures to create and maintain entries in their NAT table:

1. When a datagram is received on the public or private address, which matches one of the existing entries in the NAT table, it is translated and forwarded accordingly (i.e., it acts like a regular NAT router).

2. When a datagram is received on the public address, which does not match any existing entry in the NAT table:

(a) The NAT router attempts to interpret the datagram as a STUN packet and looks for a USERNAME attribute.

   i. If such an attribute is found, the “remote u-frag” is extracted [64], the identifier for the ICE endpoint is extracted from the “remote u-frag”, and a destination address is constructed based on the ICE endpoint identifier. If the example encoding scheme given above is used, the NAT router will remove the last 8 characters from the remote username fragment and parse the resulting string as an IP address. This will be the destination IP address.

(b) If a destination address is not found in (a) above, the packet is dropped without further processing.

(c) If a destination address is found, the NAT router chooses (for example depending on configuration) a protocol to use for the connection to the ICE endpoint (UDP or TCP), and adds an entry to its NAT table, mapping the source IP address and port number of the packet to the IP address of the NAT router and the destination port number of the packet (or a pre-configured, or otherwise obtained (e.g. also encoded in the remote username fragment) port number).

   i. If TCP was chosen as the protocol, it initiates a TCP connection towards the destination address.

(d) The datagram is forwarded (after possibly being held until the TCP connection is open) to the destination address.

162 3. NAT table entries are timed out if no packets are received for a certain configured amount of time. This is also normal NAT router operation.

4. NAT table entries are removed if a STUN packet with response code 403 (Forbid- den) is detected (indicating denied consent).
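The example encoding scheme described before the list above (private IP address in dotted-decimal notation followed by 8 random characters) could be implemented along the following lines; the class and method names are illustrative.

import java.security.SecureRandom;

// Sketch of the example ufrag encoding scheme: the ICE endpoint encodes its
// private IP address in its local username fragment, followed by 8 random
// characters; the NAT router strips the last 8 characters to recover the
// destination address.
public class UfragCodec {
    private static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789";
    private static final int RANDOM_SUFFIX_LENGTH = 8;
    private static final SecureRandom RANDOM = new SecureRandom();

    // Used by the ICE endpoint when generating its local ice-ufrag.
    public static String encode(String privateIpAddress) {
        StringBuilder sb = new StringBuilder(privateIpAddress);
        for (int i = 0; i < RANDOM_SUFFIX_LENGTH; i++) {
            sb.append(ALPHABET.charAt(RANDOM.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }

    // Used by the NAT router on the remote username fragment extracted from
    // the STUN USERNAME attribute; returns null if the fragment is too short.
    public static String decode(String remoteUfrag) {
        if (remoteUfrag.length() <= RANDOM_SUFFIX_LENGTH) {
            return null; // not encoded with this scheme; the packet would be dropped
        }
        return remoteUfrag.substring(0, remoteUfrag.length() - RANDOM_SUFFIX_LENGTH);
    }
}

For example, encode("10.0.0.3") might produce "10.0.0.3k3q9x7a2", from which decode() recovers "10.0.0.3".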

The scheme can be used with UDP, with TCP using framing [43], or other protocols. When used with TCP:

1. The NAT router terminates the TCP connection, and interprets the payload as framed in RFC4571 format, with the exception of a possible initial SSL handshake, which the NAT router also terminates.

2. RFC4571-framed datagrams are handled the same way as datagrams received over UDP (see above).

STUN packets have a well-known format and are recognized (differentiated from RTP/RTCP/DTLS packets) by the magic cookie [65].

More than one NAT router can be used, with a shared NAT table. The determination of the ICE endpoint is deterministic, based on the STUN USERNAME attribute, so in the absence of a mapping all NAT routers will select the same ICE endpoint.

B.2.1 Example

As an example we use a video conferencing service hosted on a set of SFUs, as illustrated in Figure B.1. The SFUs have no public IP addresses and they are fronted by a router.

When a client joins a conference, it first contacts a signaling server which provides it with remote ICE candidates to be used to connect to a specific SFU. In this case this is the address of the router, shared between the three SFUs. The signaling information also includes a username fragment, in which the SFU used for the session encodes its private IP address (10.0.0.3 in this case).

The client then attempts to establish an ICE session with 1.2.3.4. The router examines the first packet of the session, a STUN Binding Request, reads the IP address encoded in the username fragment of the USERNAME attribute, and maps this session to the specific SFU (SFU 3 in this case). Subsequent packets in this session are forwarded to SFU 3.



Figure B.1: Using ICE multiplexing with SFUs in a video conferencing service. The set of SFUs have only private IP addresses, which they encode in their ICE username fragments.

B.3 Rapid Optimization of Media Stream Bitrate

This invention [Pub8] allows endpoints connecting to a real-time media distribution system, such as a Selective Forwarding Unit, to optimally initialize their stream bitrate based on performance statistics from previous connections originating from the same network.

In this section we refer to the SFU or other server component used to host the conference as the “Media Server” (MS).

B.3.1 Problem Description

In the context of a video conference, middleboxes often analyze incoming traffic sent by endpoints and determine that they are either overusing or underusing the available bandwidth. That process is often referred to as “bandwidth estimation” and it produces a rough estimate for the capacity of the link between the endpoint and the middlebox. Once it has calculated the estimate, the middlebox often indicates it to the endpoint. At this stage, after taking into account the bitrate it currently uses, the amount of loss it is sustaining, and the indication from the middlebox, the endpoint adjusts its sending bitrate, in some cases ramping it up to higher levels indicated by the middlebox.

An important constraint for the above process comes from the nature of the underuse estimations produced by middleboxes. In short, underuse is detected by the lack of anomalies in incoming traffic. To avoid causing network congestion, algorithms for congestion control start with a relatively low value, and increase it after underuse is detected (this is often referred to as “ramp-up”). The amount of time necessary for the ramp-up to the stable bitrate that can be used for the session depends on different factors such as the round-trip time (RTT) between the endpoints and the receiver’s (middlebox’s) estimation. The ramp-up speed is usually limited to a fraction of the current bitrate per second (Google Congestion Control uses 8% per second as the maximum rate of increase [16]). As a result, the ramp-up speed is sensitive to the initial value, and using a lower-than-necessary initial value manifests as prolonged periods of poor user experience in the early stages of a video conference.

B.3.2 Proposed Solution

In order to shorten the duration of the initial capacity discovery process, a service will maintain a database which maps an IP address or network to a history of previously achieved stable bitrates. The definition of a “network” is configurable by the middlebox administrators and can vary from a single IP address to a /X group of IP addresses. Every time an endpoint connects to a middlebox using this invention, the database is consulted and an initial value for the bitrate estimation is chosen based on the endpoint’s current network. After the bandwidth estimation has stabilized, the database will be updated with the new stable bitrate, which can be used for future sessions. Figure B.2 illustrates this.

B.3.3 Database

The database stores the following information: timestamp, endpoint/client IP address, MS IP address, the stable bitrate. Example record:

(timestamp, client_ip, server_ip, stable_bitrate)

(147274358684062, 192.168.0.1, 10.10.0.1, 1Mbps)

The specific format of the data, the software used, and the type of the database are not substantial.

Figure B.2: The components and types of connections used for optimization of the initial bitrate.

B.3.4 Signaling Server Procedures

When a new client joins a call, it first contacts the signaling server. The signaling server selects the MS to be used (if the client joins an existing conference, the MS is already determined; if it is a new conference, an MS is allocated). At this point it knows the IP addresses of both the client and the MS. It queries the database for records matching the IP addresses (see the paragraph on matching below). If there are matching results, it selects the most recent one (based on the timestamp field). If it is “recent enough” (the timestamp is at most T seconds in the past, where T is a parameter to be chosen), the server extracts the value from the “stable bitrate” field, and passes that value to the client. If there are no appropriate results, the signaling server proceeds with the conference normally (without providing a “start bitrate” value to the client).

It is worth noting that, in practice, the “Signaling Server” and the “Media Server” may be co-located, part of the same process, or even indistinguishable from an implementation perspective.

B.3.5 Client Procedures

During session initiation or in the very early stages of the session itself, the client learns from the signaling server the target value that it feeds to its bitrate estimation algorithm. If the signaling server does not indicate a specific value, the client uses a default (which is standard client behavior).

B.3.6 Media Server Procedures

During a session with a client, the MS keeps estimating the available bandwidth (obtaining a bwe value (bandwidth estimation)) from the client to the MS (this is standard MS behavior). Periodically (for example every X minutes), it checks for a stable rate. If it finds a rate considered stable, it updates the database with the current timestamp, the client IP address, its own IP address (used for the connection to the client), and the stable rate. A rate R is considered stable if (a sketch of this check follows the list below):

1. bwe >= R for the last C1 seconds.

2. The actual receiving rate is >= R for the last C2 seconds.

3. bwe has been non-decreasing for the last C3 seconds.
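The sketch below shows one way the MS could track these three conditions over periodic samples; the class, its parameters and the sampling model are assumptions for illustration.

// Sketch of the stability check: a candidate rate R is reported stable once
// the three conditions above have held for C1, C2 and C3 seconds respectively.
public class StableRateDetector {
    private final double r;              // candidate stable rate R
    private final long c1Ms, c2Ms, c3Ms; // C1, C2, C3 in milliseconds

    private long bweBelowRSince;         // last time bwe < R was observed
    private long rateBelowRSince;        // last time the receive rate < R was observed
    private long bweDecreasedSince;      // last time bwe decreased
    private double lastBwe = -1;

    public StableRateDetector(double r, long c1Ms, long c2Ms, long c3Ms) {
        this.r = r; this.c1Ms = c1Ms; this.c2Ms = c2Ms; this.c3Ms = c3Ms;
    }

    // Call periodically with the current time, bandwidth estimate and receive rate.
    public boolean update(long nowMs, double bwe, double receiveRate) {
        if (lastBwe < 0) {
            // first sample: start all three timers now
            bweBelowRSince = rateBelowRSince = bweDecreasedSince = nowMs;
        }
        if (bwe < r) bweBelowRSince = nowMs;
        if (receiveRate < r) rateBelowRSince = nowMs;
        if (lastBwe >= 0 && bwe < lastBwe) bweDecreasedSince = nowMs;
        lastBwe = bwe;

        return nowMs - bweBelowRSince >= c1Ms      // 1. bwe >= R for the last C1 seconds
            && nowMs - rateBelowRSince >= c2Ms     // 2. receive rate >= R for the last C2 seconds
            && nowMs - bweDecreasedSince >= c3Ms;  // 3. bwe non-decreasing for the last C3 seconds
    }
}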

B.3.7 Matching Entries in the Database

This describes how to decide whether a pair of IP addresses IPc (client) and IPs (server) matches an entry in the database with values Ec and Es for the client and server addresses.

The simplest way is to match exactly, that is to match the pair if and only if IPc = Ec and IPs = Es. More complex options may also be used, such as matching only a network prefix with a given length for either the client or server address (or both).

Example: exact matching for the client IP address, and matching with a 24-bit prefix for the server address. The pairs match if and only if: IPc = Ec and the first 24 bits of IPs and Es are the same (i.e. IPs and Es are in the same /24 network). Any of these options can be used.
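A Java sketch of this example matching rule (exact client match, /24 prefix match for the server), for IPv4 addresses only, is shown below; the names are illustrative.

import java.net.InetAddress;
import java.net.UnknownHostException;

// Sketch: match the client address exactly and the server address by its
// first 24 bits (i.e. the same /24 network).
public class EntryMatcher {
    public static boolean matches(String clientIp, String serverIp,
                                  String entryClientIp, String entryServerIp)
            throws UnknownHostException {
        return clientIp.equals(entryClientIp)
            && samePrefix(serverIp, entryServerIp, 24);
    }

    private static boolean samePrefix(String a, String b, int prefixBits)
            throws UnknownHostException {
        int x = toInt(InetAddress.getByName(a).getAddress());
        int y = toInt(InetAddress.getByName(b).getAddress());
        int mask = prefixBits == 0 ? 0 : -1 << (32 - prefixBits);
        return (x & mask) == (y & mask);
    }

    private static int toInt(byte[] addr) {
        return ((addr[0] & 0xff) << 24) | ((addr[1] & 0xff) << 16)
             | ((addr[2] & 0xff) << 8) | (addr[3] & 0xff);
    }
}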

B.4 Selective Internal Forwarding in Conferences with Distributed Media Servers

This invention [Pub10] reduces the amount of data transferred between SFUs in a conference with cascaded SFUs.

We are interested in a conferencing system based on a Selective Forwarding Unit. In order to minimize the traffic towards a client the media server does not forward every packet of every video stream, but selects only a subset of the packets to forward to clients. Traditionally a conference uses a single SFU, with all clients connected to it. This limits the number of participants in a single conference (to somewhere between a few dozen and a few hundred at most). One way to scale this further is to distribute the SFU among multiple servers. We will refer to a conference which uses multiple SFUs in this way as a “distributed conference”.

B.4.1 Description

Each client in the conference is connected to only one SFU. The SFUs (there may be one, two, or more) are connected together, and they forward the clients’ media streams between themselves. They can be directly connected to each other in a full-mesh topology, or in another way (e.g. in a multi-level hierarchy). This invention applies the ideas of selective forwarding to the streams being sent between the SFUs in a distributed conference. Specifically, each SFU in a distributed conference only forwards a subset of the streams coming from its clients to other SFUs. There can be different ways in which to decide which streams/packets should be forwarded. The examples below illustrate this.

Example 1: Explicit Configuration Based on Signaling

The set of streams to be forwarded is explicitly signaled. One way to do this is to use a centralized conference controller (a “conference focus”) which makes the decision of which streams should be forwarded, and then signals this information to each SFU. This configuration changes dynamically based on activity in the conference, so the controller must constantly send updates to the SFUs.

Example 2: Forwarding Based on Subscription

SFUs signal to each other the streams that they want to receive.

Example 3: Forwarding Based on LastN

Each SFU performs dominant speaker detection independently, based on all the audio streams which it receives, in order to decide which video streams to forward to the clients connected to it.

While performing dominant speaker detection, a given SFU S1 constructs an ordered list of all endpoints in the conference (some of them on S1 and some of them on other SFUs): Lglobal = <e1, e2, e3, ..., ek>

Here e1 is the current dominant speaker, e2 is the previous, etc. Let us define the list Llocal as the list of all endpoints connected to S1 ordered according to Lglobal. We propose to extend the idea of LastN and apply it to streams being sent between the SFUs. Specifically, SFU S1 will forward to all other SFUs in the conference:

1. The audio for the first N endpoints from the list Llocal.

2. The video for the local endpoints which are also part of the list <e1, e2, e3, ..., eN>, that is, the first N endpoints in the list Lglobal (a sketch of these two rules is given at the end of this example).

The aim of this specific selection is to:

1. Allow each SFU to perform dominant speaker detection independently (in order to not have to use a centralized node, or implement a complex distributed solution).

2. Send all audio streams which may potentially be needed for LastN by another SFU.

3. Send the minimum amount of video streams which are expected to be needed by other SFUs.

Since audio streams consume far fewer resources, and because in large conferences a significant percentage of the participants are likely to be muted, another viable solution is to always send all non-silence audio streams between SFUs. This has the advantage of simplicity.
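A sketch of the two forwarding rules from this example is given below. The global speaker order (most recent dominant speaker first) and the set of locally connected endpoints are inputs; the names are illustrative.

import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Sketch: which audio and video streams SFU S1 forwards to the other SFUs,
// given the global dominant-speaker ordering and its local endpoints.
public class InterSfuForwarding {
    // Audio: the first N endpoints of L_local (the global order restricted to local endpoints).
    public static List<String> audioToForward(List<String> globalOrder,
                                              Set<String> localEndpoints, int n) {
        List<String> result = new ArrayList<>();
        for (String endpoint : globalOrder) {
            if (result.size() >= n) break;
            if (localEndpoints.contains(endpoint)) result.add(endpoint);
        }
        return result;
    }

    // Video: the local endpoints which are among the first N endpoints of L_global.
    public static List<String> videoToForward(List<String> globalOrder,
                                              Set<String> localEndpoints, int n) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(n, globalOrder.size()); i++) {
            String endpoint = globalOrder.get(i);
            if (localEndpoints.contains(endpoint)) result.add(endpoint);
        }
        return result;
    }
}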

Example 4: Forwarding Based on LastN with the Option of Explicit Subscriptions

SFUs use the scheme described in Example 3, but in addition they also allow explicit subscriptions. For example, if a client has specifically selected to see the video of someone in the conference who is not in the LastN set, then this client’s SFU will explicitly subscribe to the desired stream.

Other ways to make the selections are possible; some might be based on one or more of the examples above.

B.5 On Demand In-band Signaling for Conferences

This invention [Pub9] allows participants in a video conference to receive metadata about the audio and video streams that they are receiving on demand (whenever a new stream is available) and in-band from the media server (as opposed to through the signaling path). This allows video conferences to scale to a larger number of participants.

In a typical online video conference scenario a participant is connected to two servers. One is a signaling server, which manages authentication, authorization, session establishment, identification of the other endpoints in the conference, and the exchange of metadata about the audio and video streams (such as SSRC identifiers, Media Stream identifiers, etc.). The other one is a media server (such as a Multipoint Control Unit (MCU) or a Selective Forwarding Unit (SFU)), which handles the audio and video streams.

One reason for this separation is that the processing of multimedia is much more expensive than the processing of signaling, and therefore a system scaled optimally needs a larger number of media servers than signaling servers. That is, it is often the case that one signaling server works with a set of multiple media servers.

When clients join a conference and receive a media stream, they need some metadata in order to know how to display the stream (e.g. which participant in the conference the stream belongs to). This information is usually exchanged through signaling. That is, the signaling server provides to clients in the conference a mapping of all other clients to their stream identifiers (SSRCs).

For large conferences (e.g. with hundreds of participants) this becomes impractical, because the list is large and current WebRTC implementations often show performance issues processing it. A large set of participants also implies frequent changes and hence significantly increased signaling traffic, which may be a problem of its own. Further, in a large conference, a given client will only receive media from a subset of the other clients (e.g. it will receive audio only for the people who unmute, and video for the last 5 speakers), and the set of streams which are sent to a given client changes dynamically, as decided by the media server.

In our solution, the signaling server leaves out the part of signaling which describes the mapping from client to media stream identifier. Instead, this information is provided to clients by the media server, over an in-band channel. Further, the server does not provide the full list of mappings, but adds new ones dynamically as it starts to send new streams to a particular client.

B.5.1 Example

In a multimedia conference based on WebRTC, the signaling contains two types of information. One is information about the media types (i.e. codecs) in use, and about session establishment (i.e. ICE credentials and candidates, and DTLS certificate fingerprints). The second is a set of mappings from a Synchronization Source (SSRC) to a structure containing information about the stream, such as: the Media Stream ID (MSID, used to identify flows which originate from the same source and share the same clock, and which should therefore be played out synchronized, e.g. audio and video coming from an endpoint), and an identifier of the conference endpoint (e.g. “this is John”).

With our solution the signaling server will exchange the first type of information, which will allow the client to establish a connection to the media server. Once the connection is established, the client and the media server open a WebRTC DataChannel, over which they can communicate structured data. When the media server begins to send a new audio or video stream to a client, it also sends the SSRC to MSID mapping over the DataChannel, which the client adds to its list of mappings.
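As a purely illustrative example of what such an in-band message could look like, the sketch below builds a small JSON payload announcing a new SSRC-to-MSID mapping; the message format and field names are assumptions, not part of any specification.

// Hypothetical in-band message sent by the media server over the DataChannel
// when it starts forwarding a new stream to a client.
public class StreamMappingMessage {
    public static String build(long ssrc, String msid, String endpointId) {
        return String.format(
            "{\"type\":\"stream-added\",\"ssrc\":%d,\"msid\":\"%s\",\"endpoint\":\"%s\"}",
            ssrc, msid, endpointId);
    }

    public static void main(String[] args) {
        // e.g. announce that SSRC 12345 belongs to John's media stream "a1b2c3"
        System.out.println(build(12345L, "a1b2c3", "john"));
    }
}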
