
Comparative Study of WebRTC Open Source SFUs for Video Conferencing

Emmanuel André∗, Nicolas Le Breton∗§, Augustin Lemesle∗§, Ludovic Roux∗ and Alexandre Gouaillard∗
∗CoSMo Software, Singapore, Email: {emmanuel.andre, ludovic.roux, alex.gouaillard}@cosmosoftware.io
§CentraleSupélec, France, Email: {nicolas.lebreton, augustin.lemesle}@supelec.fr

Abstract—WebRTC capable media servers are ubiquitous, and among them, Selective Forwarding Units (SFU) seem to generate more and more interest, especially as a mandatory component of WebRTC 1.0 Simulcast. The two most represented use cases implemented using a WebRTC SFU are video conferencing and broadcasting. To date, there has not been any scientific comparative study of WebRTC SFUs. We propose here a new approach based on the KITE testing engine. We apply it to the comparative study of five main open-source WebRTC SFUs, used for video conferencing, under load. The results show that such an approach is viable, and provide unexpected and refreshingly new insights on the scalability of those SFUs.

Index Terms—WebRTC, Media Server, Load Testing, Real-Time Communications

I. INTRODUCTION

Nowadays, most WebRTC applications, systems and services support much more than the original one-to-one peer-to-peer WebRTC use case, and use at least one media server to implement them.

For interoperability with pre-existing technologies like SIP for Voice over IP (VoIP), Public Switched Telephone Network (PSTN), or Flash (RTMP), which can only handle one stream of a given type (audio, video) at a time, one requires media mixing capacities and will choose a Multipoint Control Unit (MCU) [1]. However, most of the more recent media servers are designed as Selective Forwarding Units (SFU) [2], a design that is not only less CPU intensive on the server, but that also allows for advanced bandwidth adaptation with multiple encodings (simulcast) and Scalable Video Coding (SVC) codecs. The latter allow for even better resilience against network quality problems like packet loss.

Even when solely focusing on use cases that are implementable with an SFU, many remain. Arguably, the two most popular are video conference (many-to-many, all equally receiving and sending) and streaming / broadcasting (one-to-many, with one sending and many receiving).

While most of the open-source (and closed-source) WebRTC media servers have implemented testing tools (see section II-A), most of those tools are specific to the server they test, and cannot be reused to test others or to make a comparative study. Moreover, the published benchmarks differ so much in terms of methodology that direct comparison is impossible. So far, there has not been any single comparative study of media servers, even from frameworks that claim to be media server and signalling agnostic.

In this paper, we will focus on scalability testing of a video conference use case using a single WebRTC SFU media server. The novelty here is the capacity to run exactly the same test scenario in the same conditions against several different media servers installed on the same instance type. To compare the performance of each SFU for this use case, we report the measurements of their bit rates and of their latency all along the test.

The rest of the paper is structured as follows: section II provides a quick overview of the state of the art of WebRTC testing. Section III describes, in detail, the configurations, metrics, tools and test logic that were used to generate the results presented in section IV, and analyzed in section V.

II. BACKGROUND

Verification and Validation (V&V) is the set of techniques that are used to assess software products and services [3]. For a given software, testing is defined as observing the execution of the software on a specific subset of all possible inputs and settings, and providing an evaluation of the output or behavior according to certain metrics.

Testing of web applications is a subset of V&V, for which testing and quality assurance is especially challenging due to the heterogeneity of the applications [4]. In their 2006 paper, Di Lucca and Fasolino [5] categorize the different types of testing depending on their non-functional requirements. According to them, the most important are performance, load, stress, compatibility, accessibility, usability and security.

A. Specific WebRTC Testing Tools
A lot of specific WebRTC testing tools exist; for instance, tools that assume something about the media server (e.g. signalling) or the use cases (e.g. broadcasting) they test.

WebRTCBench [6] is an open-source benchmark suite introduced by the University of California, Irvine in 2015. This framework aims at measuring the performance of WebRTC peer-to-peer (P2P) connection establishment and data channel / video calls. It does not provide for the testing of media servers.

The Jitsi team developed Jitsi-Hammer [7], an ad-hoc traffic generator dedicated to testing the performance of their open source Jitsi Videobridge [8].

The creators of the Janus gateway [9] have developed an ad-hoc tool to assess their gateway performance in different configurations. From this initial assessment, they proposed Jattack [10], an automated stressing tool for the analysis of performance and scalability of WebRTC-enabled servers. In their paper, they claim that Jattack is generic and that it can be used to assess the performance of WebRTC gateways other than Janus. However, without the code being freely available, this could not be verified.

Most of the tools coming from the VoIP world assume SIP as the signalling protocol and will not work with other signalling protocols (XMPP, MQTT, JSON/WebSockets).

WebRTC-test [11] is an open source framework for functional and load testing of WebRTC on RestComm, a cloud platform aimed at developing voice, video and text messaging applications.

Finally, Red5 has re-purposed an open source RTMP load test tool called "bees with machine guns" to support WebRTC [12].

B. Generic WebRTC Testing

In the past few years, several research groups have addressed the specific problem of generic WebRTC testing, i.e. having a testing framework or engine that would be agnostic to the OS, browser, application, network, signalling, or media server used. Specifically, the Kurento open source project (Kurento Testing Framework) and the KITE project have generated quite a few articles on this subject.

1) Kurento Testing Framework a.k.a. ElasTest: In [13], the authors introduce the Kurento Testing Framework (KTF), based on Selenium and WebDriver. They mix load testing, quality testing, performance testing, and functional testing. They apply it on a streaming/broadcast use case (one-to-many) on a relatively small scale: only the Kurento Media Server (KMS) was used, with one server, one room and 50 viewers.

In [14], the authors add limited network instrumentation to KTF. They provide results on the same configuration as above with only minor modifications (NUBOMEDIA is used to install the KMS). They reach 200 viewers at the cost of using native applications (fake clients that implement only the WebRTC parts responsible for negotiation and transport, not the media processing pipeline). Using fake clients generates different traffic and behavior, introducing a de-facto bias in the results.

In [15], the authors add to KTF (renamed ElasTest) support for testing mobile devices through Appium. It is not clear whether they support mobile browsers and, if they do, which browsers and on which OS, or mobile apps. They now recommend installing KMS through OpenVidu, and propose to extend the WebDriver protocol with several APIs. While WebDriver protocol modifications are easy on the Selenium side, on the browser side they would require the browser vendors to modify their (sometimes closed-source) WebDriver implementations, which has never happened in the past.

2) Karoshi Interoperability Test Engine: KITE: The KITE project, created and managed by companies actively involved in the WebRTC standard, has also been very active in the WebRTC testing field, with an original focus on compliance testing for the WebRTC standardization process at W3C [16]. KITE is running around a thousand tests across 21+ browsers / browser revisions / operating systems / OS revisions on a daily basis. The results are reported on the official webrtc.org page (https://webrtc.org/testing/kite/).

In the process of WebRTC 1.0 specification compliance testing, simulcast testing required an SFU to test against (see the specification, https://www.w3.org/TR/webrtc/, chapter 5.1 for simulcast and RID, and section 11.4 for the corresponding example).
Since there was no reference WebRTC SFU implementation, it was decided to run the simulcast tests in the Chrome browser against the most well-known of the open source SFUs.

3) Comparison and Choice: KTF has understandably been focused on testing the KMS from the start, and has only ever exhibited results in its publications about testing KMS in the one-to-many use case. KITE has been designed with scalability and flexibility in mind. The work done in the context of WebRTC 1.0 simulcast compliance testing paved the way for generic SFU testing support.

We decided to extend that work to comparatively load test most of the open-source WebRTC SFUs, in the video conference use case, with a single server configuration.

III. SYSTEM UNDER TEST AND ENVIRONMENT

A. Cloud and Network Settings

All tests were done using Amazon Web Services (AWS) Elastic Compute Cloud (EC2). Each SFU and each of its connecting web client apps were run on separate Virtual Machines (VMs) in the same AWS Virtual Private Cloud (VPC) to avoid network fluctuations and interference. The instance types for the VMs used are described in Table I.

TABLE I: VMs for the media servers and the clients.

                    | Media Server VM                 | Client VM
AWS instance type   | c4.4xlarge                      | c4.xlarge
Rationale           | Cost effective instance with    | Cost effective instance with
                    | dedicated computing capacity    | dedicated computing capacity (4 vCPU)
RAM                 | 30 GB                           | 7.5 GB
Dedicated bandwidth | 2 Gbps                          | 750 Mbps

B. WebRTC Open Source Media Servers

We set up the following five open-source WebRTC SFUs, using the latest source code downloaded from their respective public GitHub repositories (except for Kurento/OpenVidu, for which the Docker container was used), each in a separate AWS EC2 Virtual Machine and with default configuration:

• Jitsi Meet (JVB version 0.1.1077, https://github.com/jitsi/jitsi-meet)

• Janus Gateway (version 0.4.3, https://github.com/meetecho/janus-gateway) with plugin/client web app (https://janus.conf.meetecho.com/videoroomtest.html)
• Medooze (version 0.32.0, https://github.com/medooze/mp4v2.git) with plugin/client web app (https://github.com/medooze/sfu/tree/master/www)
• Kurento/OpenVidu (from Docker container, Kurento Media Server version 6.7.0, https://openvidu.io/docs/deployment/deploying-ubuntu/) with plugin/client web app (https://github.com/OpenVidu/openvidu-tutorials/tree/master/openvidu-js-node)
• Mediasoup (version 2.2.3, https://www.npmjs.com/package/mediasoup)

Note that on Kurento, the min and max bit rates are hard-coded at 600,000 bps in MediaEndpoint.java at line 276. The corresponding limit is 1,700,000 bps for Jitsi, Janus, Medooze and Mediasoup, as seen in Table IV column (D). We did not modify the source code of the SFUs, so these sending limits remained active for Kurento and Jitsi when running the tests.

The five open-source WebRTC SFUs tested use the same signalling transport protocol (WebSockets) and do not differ enough in their signalling protocols to induce any measurable impact on the chosen metrics. They all implement WebRTC, which means they all proceed through discovery, handshake and media transport establishment in exactly the same standard way, using ICE [17], JSEP [18] and DTLS-SRTP [19] respectively.

C. Web Client Applications

To test the five SFUs with the same parameters and to collect useful information (getStats, see section III-E, and full-size screen captures), we made the following modifications to the corresponding web client apps:

• increase the maximum number of participants per meeting room to 40
• support for multiple meeting rooms, including roomId and userId in the URL
• increase the sending bit rate limit to 5,000,000 bps (for Janus only, as it was configurable in the client web app)
• support for displaying up to 9 videos with the exact same dimensions as the original test video (540×360 pixels)
• removal or re-positioning of all the text and image overlays added by the client web app, so that they are displayed above each video area
• expose the JavaScript RTCPeerConnection objects to call getStats() on.

Each client web app is run on a dedicated VM, which has been chosen to ensure there will be more than enough resources to process the 7 videos (1 sent and 6 received). A Selenium node, instrumenting Chrome 67, is running on each client VM. KITE communicates with each node, through a Selenium hub, using the WebDriver protocol (see Fig. 1).

Fig. 1: Configuration of the tests.

Fig. 2 shows a screenshot of the modified Janus Video Room web client application with the changes described above. The sender video is displayed at the top left corner of the window. The videos received from each of the six remote clients are displayed as shown in Fig. 2. The original dimensions of the full image with the 7 videos are 1980×1280 pixels. The user id, dimensions and bit rates are displayed above each video, leaving it free of obstruction for quality analysis.

Fig. 2: Screenshot of the Janus Video Room Test web app after modifications.

D. Controlled Media for Quality Assessment

As the clients are joining a video conference, they are supposed to send video and audio. To control the media that each client sends, and in order to make quality measurements, we use the Chrome fake media functionality. The sender and each client play the same video. Chrome is launched with the following options to activate the fake media functionality:

• allow-file-access-from-files
• use-file-for-fake-video-capture=e-dv548_lwe08_christa_casebeer_003.y4m
• use-file-for-fake-audio-capture=e-dv548_lwe08_christa_casebeer_003.wav
• window-size=1980,1280

The last option (window-size) was set to be large enough to accommodate 7 videos on the screen with original dimensions of 540×360 pixels.

Credits for the video file used: "Internet personality Christa Casebeer, aka Linuxchic on Alternageek.com, Take One 02," by Christian Einfeldt, DigitalTippingPoint.com (https://archive.org/details/e-dv548_lwe08_christa_casebeer_003.ogg). The original file, 540×360 pixels, encoded with H.264 Constrained Baseline Profile at 30 fps for the video part, has been converted using ffmpeg to YUV4MPEG2 format, keeping the same resolution of 540×360 pixels, colour space 4:2:0, 30 fps, progressive. The frame number has been added as an overlay at the top left of each frame, while time in seconds has been added at the bottom left. The original audio part, MPEG-AAC 44100 Hz 128 kbps, has been converted using ffmpeg to WAV 44100 Hz 1411 kbps.
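For concreteness, here is a minimal sketch of how one Chrome client could be launched with these options through the Selenium hub, using the Node.js selenium-webdriver package. This is an illustration only: KITE itself is Java-based, the hub address is hypothetical, and the two fake-device flags at the end are our assumption (Chrome requires them to substitute fake capture devices; they are not part of the list above).

    // Sketch (Node.js, selenium-webdriver): launch one Chrome client through
    // a Selenium hub with the fake media options listed above.
    const { Builder } = require('selenium-webdriver');
    const chrome = require('selenium-webdriver/chrome');

    async function openClient(roomUrl) {
      const options = new chrome.Options().addArguments(
        'allow-file-access-from-files',
        'use-file-for-fake-video-capture=e-dv548_lwe08_christa_casebeer_003.y4m',
        'use-file-for-fake-audio-capture=e-dv548_lwe08_christa_casebeer_003.wav',
        'window-size=1980,1280',
        // Assumption: these two flags are also needed for fake capture,
        // although they are not part of the list given in the paper.
        'use-fake-device-for-media-streams',
        'use-fake-ui-for-media-stream'
      );
      const driver = await new Builder()
        .forBrowser('chrome')
        .setChromeOptions(options)
        .usingServer('http://selenium-hub.example:4444/wd/hub') // hypothetical hub
        .build();
      await driver.get(roomUrl); // client app URL carrying roomId and userId
      return driver;
    }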
TABLE II: Load test parameters per SFU.

                                          | Jitsi | Janus | Medooze | Kurento | Mediasoup
Number of rooms (7 participants per room) |  40   |  70   |   70    |   20    |   70
Number of client VMs                      |  280  |  490  |   490   |   140   |   490

TABLE III: Test results. Page loaded: true if the page can load. Sender video check: true if the video of the sender is displayed and is not a still or blank image. All videos check: true if all the videos received by the six clients from the SFU passed the video check.

     | SFU       | Page loaded | Sender video check | All videos check
PASS | Jitsi     | 280   100%  | 280   100%         | 53    19%
     | Janus     | 448   100%  | 448   100%         | 256   57%
     | Medooze   | 481   100%  | 481   100%         | 434   90%
     | Kurento   | 118   100%  | 118   100%         | 69    58%
     | Mediasoup | 488   100%  | 488   100%         | 476   97%
FAIL | Jitsi     | 0     0%    | 0     0%           | 227   81%
     | Janus     | 2     0%    | 0     0%           | 195   43%
     | Medooze   | 0     0%    | 0     0%           | 47    10%
     | Kurento   | 0     0%    | 0     0%           | 49    42%
     | Mediasoup | 2     0%    | 0     0%           | 15    3%

We began by executing several dry run tests before deciding on the achievable target loads. We realized that not all SFUs were able to support the same kind of load. As a result, we decided to limit the tests on Kurento to 20 rooms, as that SFU was not able to support a higher load.

E. Metrics and Probing

1) Client-side: getStats() function: The test calls getStats() to retrieve all of the statistics values provided by Chrome. The bit rates for the sent video, and all received videos, are computed by KITE by calling getStats twice (during ramp-up) or 8 times (at load reached), using the bytesReceived and timestamp values of the first and last getStats objects.

2) Client-side: Video Verification: Once a
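To illustrate the bit-rate computation of section III-E1, here is a minimal sketch, assuming the standard promise-based RTCPeerConnection.getStats() API and one of the RTCPeerConnection objects exposed by the modified client apps; Chrome 67-era stats reports named some fields differently (mediaType instead of kind), hence the fallback.

    // Sketch: received-video bit rate from two getStats() samples.
    async function sampleInboundVideo(pc) {
      let bytes = 0, timestamp = 0;
      const report = await pc.getStats();
      report.forEach(stat => {
        if (stat.type === 'inbound-rtp' &&
            (stat.kind || stat.mediaType) === 'video') {
          bytes += stat.bytesReceived;
          timestamp = stat.timestamp; // DOMHighResTimeStamp, in milliseconds
        }
      });
      return { bytes, timestamp };
    }

    async function averageBitrate(pc, intervalMs) {
      const s1 = await sampleInboundVideo(pc);
      await new Promise(resolve => setTimeout(resolve, intervalMs));
      const s2 = await sampleInboundVideo(pc);
      // bytes -> bits (x8) and per-millisecond -> per-second (x1000): x8000
      return ((s2.bytes - s1.bytes) * 8000) / (s2.timestamp - s1.timestamp);
    }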

Fig. 3: Transmission bit rates. (a) Jitsi; (b) Janus; (c) Medooze; (d) Kurento; (e) Mediasoup.

B. Bit Rates Measurements

Bit rates measured for the sender's video and the average of the 6 valid receivers' videos in a room during the whole experiment are presented separately per SFU in Fig. 3, while Fig. 4a gives the graphs of the average reception bit rates of the 6 valid receivers. Zero or null bit rates are not included in the computation of the average; e.g. if the client only received 3 videos out of the 6, the values in Fig. 3 and in Fig. 4a are the average of those 3 videos.

It is clearly visible in Fig. 4a that Kurento has a hard-coded maximum bit rate of 600,000 bps, while the four other SFUs are able to use all the available bandwidth. Apart from Kurento, all the SFUs have a rather similar behaviour regarding the average bit rate measured at the receiver side. At some point, when the number of users to be served keeps increasing, the bit rate starts to decrease. This reduction in receiving bit rate happens first for Medooze at about 195 users, then for both Janus and Mediasoup at about 270 users. Jitsi simply stops sending any video when the number of users reaches 245, so at that point the bit rate drops to zero.

In Fig. 3 we can see more finely the difference in behaviour of each SFU as the number of rooms and participants increases. In these charts, the blue graph "number of participants" shows the increase of users over time. Each step means the addition of one room and 7 users to the test. Ideally, this blue graph should have a very regular linear pattern. In reality, we can see some steps that are very long, which means that for a long time no new room has been added to the test. This is due to a failing video check. The creation of a new room happens only when the video check is successful for the current room (see validation check in subsection III-F2) or after a timeout set at 7 minutes (one minute timeout per video); a sketch of this ramp-up loop is given at the end of this subsection.

Fig. 3a is about Jitsi. The number of rooms and users increases very smoothly up to 34 rooms and 238 users. But when the 35th room is added, Jitsi seems to collapse: the bit rate drops to zero, and the check for validating the videos of the 35th room lasts for a very long time, until a timeout.

For Janus, Fig. 3b, the number of rooms and users increases regularly for most of the test. However, from room 39 onwards, it takes a little more time to get all the 7 videos of a room displayed. It takes a very long time for room 60 only; then Janus is able to manage the additional rooms up to the target load of 70 rooms.

Medooze, Fig. 3c, manages to handle the load increase up to the target load. However, at the beginning of the test it has difficulties from time to time getting all of the 7 participants in a room up and active, as one or more videos are missing.

Kurento transmission bit rates are reported in Fig. 3d. Most of the time, the check for videos in the latest created room takes a long time, until timeout, except at the beginning of the test for the first 10 rooms.

At last, Mediasoup, Fig. 3e, exhibits a very regular pattern for the number of participants graph, except for the second and the thirtieth rooms, where a missing video delays the creation of the next room until a timeout occurs.

Table IV gives an overview of the sender's video statistics collected on each client during the ramp-up phase of the load test.
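The room-creation rule referenced above can be summarized as a short sketch. This is a minimal sketch with hypothetical helper names (startClients, videoElements, checkVideo), not actual KITE APIs; it only restates that a new room is created once the current room's videos pass the check, or after the 7-minute timeout (1 minute per video).

    // Sketch of the ramp-up loop (illustrative names, not actual KITE code).
    const ONE_MINUTE_MS = 60 * 1000;

    async function rampUp(targetRooms) {
      for (let room = 1; room <= targetRooms; room++) {
        // hypothetical helper: start the 7 participants of this room
        const clients = await startClients(room, 7);
        for (const client of clients) {
          // hypothetical helper: the 7 video elements (1 sent + 6 received)
          for (const video of await client.videoElements()) {
            // verify the video is neither still nor blank, with a 1-minute
            // timeout per video (7 minutes per room in the worst case);
            // a timeout is recorded as a failed check and ramp-up continues
            await checkVideo(video, ONE_MINUTE_MS).catch(() => {});
          }
        }
      }
    }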
Fig. 4: Results over the number of participants. (a) Average reception bit rates for the 6 valid receivers; (b) average round-trip time for the 6 valid receivers (logarithmic scale); (c) average video quality scores.

Looking at the average sending bit rate over time (column (E)), all the SFUs have a high average sending bit rate, ranging from 1 Mbps (Medooze) to 1.6 Mbps (Janus). Only Kurento has a much lower bit rate of about 350 kbps. This can be explained in part by its hard-coded bandwidth limitation of 600,000 bps, and also because more than half of the video checks fail (see Table III).

The statistics about the bit rates for the 6 videos received by each client during ramp-up are reported in Table V. Here again, we notice a large difference between Kurento, with a bit rate of roughly 300 kbps for the clients, and the other four SFUs, for which the receiving bit rate ranges from about 1 Mbps (Medooze) to about 1.6 Mbps (Janus).

When target load is reached (Table VI), there are no data for Jitsi, as videos are no longer transmitted once the number of participants is above 245. On average, for Janus, the sending bit rate is 1.18 Mbps and the receiving bit rate is almost 1 Mbps at target load. For Mediasoup, the sending bit rate is 700 kbps and the receiving bit rate is about 680 kbps. For Medooze, both sending and receiving bit rates are much lower, at about 215 kbps. At last, Kurento has a sending bit rate of 250 kbps and a receiving bit rate of about 120 kbps.

C. Latency Measurements

Fig. 4b gives an overview of RTT for the whole duration of the test. Note that we had to use a logarithmic scale for this figure. As with the bit rates, the variation of RTT is relatively similar for all the SFUs except Kurento.

Detailed figures for the sender's RTT are reported during ramp-up in Table IV column (F) and at target load in Table VI column (A). During ramp-up, Jitsi, Medooze and Mediasoup have a latency below 20 ms, for Janus it is 61 ms, while for Kurento it is already above half a second. At target load, both Medooze and Mediasoup keep a low latency, below 50 ms; for Janus it is 268 ms, and for Kurento it is slightly above one second.

D. Video Quality Assessment

Estimation of video quality scores is presented in Fig. 4c. One may expect video quality to deteriorate as the measured average bit rate falls, but the graphs of video quality remain remarkably flat until the end of the test. This counter-intuitive result is explained by the ability of modern video codecs to make a video weigh 15 to 30 times less than the original uncompressed video while still keeping a quality that is perceived as very good. For our test, each Chrome browser has encoded the video with VP8 before sending it to the SFU. After several experiments with ffmpeg, we noticed that the quality of the video we used for this load test starts to be visibly damaged when the bit rate is set at about 150 kbps or lower. At target load, the average bit rate for Medooze is about 215 kbps. This is still enough to transfer the selected video with a good perceived quality.

Kurento video quality varies awkwardly with the load. It deteriorates almost immediately and reaches its lowest image quality at about 100 participants. Surprisingly, the image quality improves as more participants join the test, up to about 130 participants, before dropping again.

V. ANALYSIS

This study exhibits interesting behaviours of the five SFUs that have been evaluated when they have to handle an increasing number of rooms and peers in a video conference use case.

Kurento is plagued by both a hard-coded maximum bandwidth and a high RTT problem. When the number of participants is above 42, RTT rises sharply to reach and stay at about one second. The bit rate, already very low at the start of the test because of the hard-coded limitation, falls quickly above 49 participants to a very low value before improving a little above 100 participants.
Interestingly, video quality, which had been worsening since the beginning of the test, starts to get better at about 100 participants, just when the bit rates improve. But this improvement lasts only until 130 participants have joined the test.

Jitsi has some internal problem that makes it suddenly stop transmitting videos when there are more than 245 peers in the test.

The three other SFUs tested behave roughly in a similar way. Bit rate is maintained at the same level while the number of peers increases, then at some point the bit rate starts to decrease. It happens first to Medooze at about 190 peers. Janus and Mediasoup are able to keep the maximum level of bit rate for a higher load, as the decrease of bit rate starts above 280 peers. We note also that the decrease of bit rate is sharper for Medooze.

VI. CONCLUSION AND FUTURE WORK

We have shown that it is now possible to comparatively test WebRTC SFUs using KITE. Several bugs and oddities have been found and reported to their respective teams in the process.

This work was focused on the testing system, and not on the tests themselves. In the future we would like to add more metrics and tests on the client side, for example to assess audio quality as well, and to run the tests on all supported browsers to check how browser-specific WebRTC implementations make a difference.
On the server side, we would like to add CPU, RAM and bandwidth estimation probes to assess the server scalability under load.

We would like to extend this work to variations of the video conferencing use case by making the number of rooms and the number of users per room a variable of the test run. That would allow us to reproduce the results of [7] and [10].

We would also like to extend this work to different use cases, for example broadcasting / streaming. That would allow, among other things, to reproduce the results from the Kurento Research Team.

ACKNOWLEDGMENT

We would like to thank Boris Grozev (Jitsi), Lorenzo Miniero (Janus), Iñaki Baz Castillo (Mediasoup), Sergio Murillo (Medooze), Lennart Schulte (callstats.io), Lynsey Haynes () and other media server experts who provided live feedback on early results during CommCon UK 2018. We would also like to thank Boni García for discussions around the Kurento Team Testing research results.

REFERENCES

[1] M. Westerlund and S. Wenger, RFC 7667: RTP Topologies, IETF, Nov. 2015. [Online]. Available: https://datatracker.ietf.org/doc/rfc7667/
[2] B. Grozev, L. Marinov, V. Singh, and E. Ivov, "Last N: relevance-based selectivity for forwarding video in multimedia conferences," in Proceedings of the 25th ACM Workshop on Network and Operating Systems Support for Digital Audio and Video, 2015.
[3] A. Bertolino, "Software testing research: Achievements, challenges, dreams," in Future of Software Engineering, 2007.
[4] Y.-F. Li, P. K. Das, and D. L. Dowe, "Two decades of Web application testing — a survey of recent advances," Information Systems, 2014.
[5] G. A. Di Lucca and A. R. Fasolino, "Testing web-based applications: The state of the art and future trends," Information and Software Technology, 2006.
[6] S. Taheri, L. A. Beni, A. V. Veidenbaum, A. Nicolau, R. Cammarota, J. Qiu, Q. Lu, and M. R. Haghighat, "WebRTCBench: A benchmark for performance assessment of WebRTC implementations," in 13th IEEE Symposium on Embedded Systems for Real-time Multimedia, 2015.
[7] Jitsi-Hammer, a traffic generator for Jitsi Videobridge. [Online]. Available: https://github.com/jitsi/jitsi-hammer
[8] B. Grozev and E. Ivov, Jitsi Videobridge performance evaluation. [Online]. Available: https://jitsi.org/jitsi-videobridge-performance-evaluation/
[9] A. Amirante, T. Castaldi, L. Miniero, and S. P. Romano, "Performance analysis of the Janus WebRTC gateway," in EuroSys 2015, Workshop on All-Web Real-Time Systems, 2015.
[10] ——, "Jattack: a WebRTC load testing tool," in IPTComm 2016, Principles, Systems and Applications of IP Telecommunications, 2016.
[11] WebRTC-test, framework for functional and load testing of WebRTC. [Online]. Available: https://github.com/RestComm/WebRTC-test
[12] Red5Pro load testing: WebRTC and more. [Online]. Available: https://blog.red5pro.com/load-testing-with-webrtc-and-more/
[13] B. García, L. Lopez-Fernandez, F. Gortázar, and M. Gallego, "Analysis of video quality and end-to-end latency in WebRTC," in IEEE Globecom 2016, Workshop on Quality of Experience for Multimedia Communications, 2016.
[14] B. García, L. Lopez-Fernandez, F. Gortázar, M. Gallego, and M. Paris, "WebRTC testing: Challenges and practical solutions," IEEE Communications Standards Magazine, 2017.
[15] B. García, F. Gortázar, M. Gallego, and E. Jiménez, "User impersonation as a service in end-to-end testing," in MODELSWARD 2018, 6th International Conference on Model-Driven Engineering and Software Development, 2018.
[16] A. Gouaillard and L. Roux, "Real-time communication testing evolution with WebRTC 1.0," in IPTComm 2017, Principles, Systems and Applications of IP Telecommunications, 2017.
[17] A. Keranen, C. Holmberg, and J. Rosenberg, RFC 8445: Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal, IETF, July 2018. [Online]. Available: https://datatracker.ietf.org/doc/rfc8445/
[18] J. Uberti, C. Jennings, and E. Rescorla, JavaScript Session Establishment Protocol [IETF draft], IETF, Oct. 2017. [Online]. Available: https://tools.ietf.org/html/draft-ietf-rtcweb-jsep-24/
[19] D. McGrew and E. Rescorla, RFC 5764: Datagram Transport Layer Security (DTLS) Extension to Establish Keys for the Secure Real-time Transport Protocol (SRTP), IETF, May 2010. [Online]. Available: https://datatracker.ietf.org/doc/rfc5764/
[20] A. Lemesle, A. Marion, L. Roux, and A. Gouaillard, "NARVAL, a no-reference video quality tool for real-time communications," in Proc. of Human Vision and Electronic Imaging, Jan. 2019, [submitted].
[21] A. Mittal, A. K. Moorthy, and A. C. Bovik, "No-reference image quality assessment in the spatial domain," IEEE Transactions on Image Processing, 2012.
[22] N. D. Narvekar and L. J. Karam, "A no-reference image blur metric based on the cumulative probability of blur detection (CPBD)," IEEE Transactions on Image Processing, 2011.

TABLE IV: Ramp-up phase: overview of the sender's video statistics collected on each web client. Values in columns (A) to (F) as per getStats. Columns: (A) Available send bandwidth (bps); (B) Actual encoder bit rate sent (bps); (C) Transmit bit rate sent (bps); (D) Target encoder bit rate sent (bps); (E) Average video bit rate sent (bps); (F) Sender googRtt (ms).

                    | SFU       | (A)       | (B)       | (C)       | (D)       | (E)       | (F)
MIN                 | Jitsi     | 113,118   | 153,952   | 156,744   | 113,118   | 130,881   | 1
(value > 0 only)    | Janus     | 744,557   | 676,624   | 690,904   | 744,557   | 611,518   | 1
                    | Medooze   | 92,471    | 79,440    | 83,184    | 92,474    | 94,565    | 1
                    | Kurento   | 34,476    | 28,088    | 32,272    | 34,476    | 35,501    | 1
                    | Mediasoup | 93,617    | 91,488    | 95,576    | 93,617    | 93,860    | 1
MAX                 | Jitsi     | 8,342,878 | 2,397,504 | 2,544,144 | 1,700,000 | 2,218,333 | 168
                    | Janus     | 5,000,000 | 2,081,072 | 2,108,224 | 1,700,000 | 1,963,089 | 435
                    | Medooze   | 9,130,857 | 2,782,136 | 4,660,656 | 1,700,000 | 4,052,553 | 103
                    | Kurento   | 600,000   | 893,912   | 1,003,728 | 600,000   | 676,504   | 2,371
                    | Mediasoup | 7,184,000 | 2,049,872 | 2,088,776 | 1,700,000 | 2,131,814 | 688
Average             | Jitsi     | 2,830,383 | 1,360,703 | 1,392,607 | 1,362,722 | 1,388,670 | 18
                    | Janus     | 4,210,276 | 1,647,705 | 1,677,538 | 1,641,714 | 1,682,062 | 61
                    | Medooze   | 2,309,558 | 1,000,979 | 1,045,173 | 1,009,288 | 1,044,573 | 10
                    | Kurento   | 369,775   | 335,393   | 359,979   | 356,457   | 359,540   | 576
                    | Mediasoup | 2,326,640 | 1,385,142 | 1,416,096 | 1,401,947 | 1,414,368 | 18
% bit rate > 1 Mbps | Jitsi     | 65.4%     | 65.4%     | 65.7%     | 65.4%     | 65.7%     | 6.4%
(columns (A)-(E)) / | Janus     | 98.9%     | 98.4%     | 98.4%     | 98.4%     | 98.9%     | 27.8%
% RTT > 50 ms       | Medooze   | 47.4%     | 47.4%     | 47.8%     | 47.8%     | 48.6%     | 1.9%
(column (F))        | Kurento   | 0.0%      | 0.0%      | 0.8%      | 0.0%      | 0.0%      | 59.3%
                    | Mediasoup | 77.6%     | 76.1%     | 76.9%     | 77.6%     | 76.3%     | 6.5%

TABLE V: Ramp-up phase: average bit rates (in bps) for the 6 videos received on each of the 6 clients in a room.

                    | SFU       | client 1  | client 2  | client 3  | client 4  | client 5  | client 6
Average             | Jitsi     | 1,256,874 | 1,266,195 | 1,294,939 | 1,282,719 | 1,239,218 | 1,264,753
                    | Janus     | 1,612,145 | 1,599,621 | 1,609,458 | 1,597,015 | 1,576,674 | 992,048
                    | Medooze   | 1,044,738 | 1,034,351 | 1,002,854 | 1,052,769 | 966,064   | 984,617
                    | Kurento   | 284,624   | 293,876   | 286,889   | 326,107   | 344,997   | 329,671
                    | Mediasoup | 1,384,331 | 1,367,255 | 1,378,114 | 1,378,644 | 1,375,112 | 1,312,928
% bit rate > 1 Mbps | Jitsi     | 54.6%     | 58.9%     | 60.4%     | 60.4%     | 58.6%     | 57.9%
                    | Janus     | 96.7%     | 95.6%     | 96.2%     | 96.7%     | 94.4%     | 58.0%
                    | Medooze   | 50.7%     | 49.1%     | 47.8%     | 51.4%     | 42.6%     | 41.4%
                    | Kurento   | 0.0%      | 0.0%      | 0.0%      | 0.0%      | 0.0%      | 0.0%
                    | Mediasoup | 73.9%     | 72.7%     | 74.9%     | 74.5%     | 73.7%     | 69.6%

For client i, the average bit rate is computed from two getStats samples taken at times t1 and t2:

    avgBitrate = ((bytesReceived at t2) − (bytesReceived at t1)) × 8000 / ((timestamp at t2) − (timestamp at t1))
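To make the formula concrete with illustrative numbers (not measured values): if bytesReceived grows from 1,500,000 to 2,500,000 between two getStats samples whose timestamps are 5,000 ms apart, then avgBitrate = (2,500,000 − 1,500,000) × 8000 / 5,000 = 1,600,000 bps, i.e. 1.6 Mbps, of the order of the Janus averages above. The factor 8000 converts bytes per millisecond into bits per second.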

TABLE VI: Target load reached: sender's RTT (in ms) and transmit bit rate (in bps), and average bit rates (in bps) for the 6 videos received, at the end of the test, when all the meeting rooms have been created and all clients are connected to the SFU. Columns: (A) Sender googRtt (ms); (B) Sender transmit bit rate (bps); (C)-(H) Video receive, clients 1 to 6 (bps). Values for Jitsi are n/a, as it no longer transmits videos at target load.

                     | SFU       | (A)    | (B)       | (C)       | (D)       | (E)       | (F)       | (G)       | (H)
MIN                  | Jitsi     | n/a    | n/a       | n/a       | n/a       | n/a       | n/a       | n/a       | n/a
(value > 0 only)     | Janus     | 13     | 564,512   | 575,744   | 514,087   | 411,512   | 533,318   | 596,490   | 584,226
                     | Medooze   | 1      | 86,864    | 85,139    | 88,002    | 90,285    | 87,321    | 96,191    | 66,047
                     | Kurento   | 144    | 65,560    | 33,544    | 42,949    | 43,964    | 38,414    | 44,892    | 31,274
                     | Mediasoup | 1      | 30,272    | 25,031    | 25,931    | 28,374    | 39,130    | 22,948    | 25,088
MAX                  | Jitsi     | n/a    | n/a       | n/a       | n/a       | n/a       | n/a       | n/a       | n/a
                     | Janus     | 1,794  | 2,254,864 | 1,700,033 | 1,603,051 | 1,508,647 | 1,542,904 | 1,539,257 | 1,524,721
                     | Medooze   | 131    | 627,328   | 425,070   | 599,465   | 504,749   | 419,465   | 505,301   | 426,254
                     | Kurento   | 2,344  | 596,336   | 476,430   | 471,115   | 278,106   | 472,683   | 231,422   | 225,831
                     | Mediasoup | 803    | 1,933,720 | 1,608,261 | 1,624,194 | 1,619,031 | 1,604,730 | 1,616,282 | 1,631,401
Average              | Jitsi     | n/a    | n/a       | n/a       | n/a       | n/a       | n/a       | n/a       | n/a
                     | Janus     | 268    | 1,239,144 | 1,007,965 | 1,015,112 | 1,006,940 | 1,014,453 | 1,019,429 | 1,019,508
                     | Medooze   | 19     | 222,004   | 224,558   | 222,902   | 220,621   | 214,895   | 218,703   | 215,345
                     | Kurento   | 1,052  | 253,297   | 126,523   | 124,644   | 120,520   | 134,441   | 118,047   | 122,425
                     | Mediasoup | 48     | 787,675   | 706,256   | 693,924   | 701,539   | 705,602   | 698,421   | 693,795
% RTT > 50 ms        | Jitsi     | n/a    | n/a       | n/a       | n/a       | n/a       | n/a       | n/a       | n/a
(column (A)) /       | Janus     | 93.7%  | 99.5%     | 92.0%     | 90.8%     | 93.2%     | 91.8%     | 94.1%     | 79.3%
% bit rate > 1 Mbps  | Medooze   | 4.2%   | 58.4%     | 65.5%     | 63.4%     | 60.7%     | 63.4%     | 59.2%     | 56.7%
(columns (B)-(H))    | Kurento   | 100.0% | 63.6%     | 5.9%      | 5.9%      | 5.1%      | 7.6%      | 3.4%      | 1.7%
                     | Mediasoup | 23.1%  | 91.6%     | 93.5%     | 93.7%     | 93.9%     | 93.7%     | 92.9%     | 89.4%