MULTI-VIDEOCONFERENCING BASED ON WebRTC TECHNOLOGIES

Yago Alonso García

Master’s Thesis presented to the Telecommunications Engineering School Master’s Degree in Telecommunications Engineering

Supervisors Jorge García Duque

2016

Multi-videoconferencing based on WebRTC technologies

Master’s Thesis presented on June 8th, 2016 to the Telecommunications Engineering School

Master’s Degree in Telecommunications Engineering

Mr. Yago Alonso García

Accepted by the thesis committee:
Prof. Manuel Fernández Veiga, President
Prof. Jorge García Duque, Thesis director
Prof. Mª Edita de Lorenzo Rodríguez, Committee member
Prof. Rebeca P. Díaz Redondo, Committee member

You cannot put a limit on anything. The more you dream, the farther you get. — Michael Phelps

To my family and friends, for their support throughout the period that culminates in this work.

Acknowledgements

This master thesis was supervised by Dr. Jorge García Duque, professor at Universidade de Vigo. I really want to thank him for his time in the supervision and direction of this project.

I also want to express my gratitude to Alberto Doval Iglesias, CTO in Councilbox, and Agustín Tourón Gil, CEO of the StreamingGalicia start-up company, for providing me technical support and directing me to perform this work.

Finally, I would like to thank as well all the workmates I had in StreamingGalicia, who have made me feel part of the group since the very beginning and supported me throughout this project.

Thank you all.

Vigo, May 2016


Abstract

Videoconferencing allows bidirectional communication of both audio and video, rendering possible group meetings with people from distant places. It also has many advantages, including its ability to reduce time and costs, which makes videoconferencing one of the fastest growing segments in the area of telecommunications.

This project presents a videoconference application developed as a web application based on WebRTC technologies. The application has a One-to-Many architecture in which the users play different roles, one of them acting as the presenter and the others as the viewers. The application allows both the viewers to participate momentarily in videoconferencing and video recording, among other features. The application developed herein could be used, for example, in enterprise environments (shareholders meetings) or tele-teaching.

In the current market, there are some solutions built with WebRTC but none meet the desired characteristics. The solution achieved herein is a scalable One-to-Many system which supports an undefined number of users who can act either actively or passively, as well as recording. The application also offers the possibility that one user acts as a moderator – in almost every solution all the participants display the same features. These are the main advantages compared to the existent solutions. All of this might be accomplished without registration and through a browser.

Keywords: audio, Java, JavaScript, Kurento Media Server, media, One-to-Many, participant, presenter, room, streaming, video, videoconferencing, viewer, Web Application, WebRTC, Wowza Streaming Engine.


Contents

Acknowledgements

Abstract

List of Figures

List of Tables

1 Introduction

2 Objectives

3 State of the art

4 Development
4.1 Architecture
4.1.1 Server
4.1.1.1 Server Module
4.1.1.2 SDK Module
4.1.2 Client
4.2 Wowza Streaming Engine integration
4.3 Operation
4.3.1 Room management
4.3.2 Client operation
4.3.3 Overall working

5 Results

6 Application testing
6.1 Functional testing
6.2 Load testing
6.2.1 WSE load testing
6.2.2 KMS load testing

7 Conclusions

8 Outlook

A Kurento Media Server

B Wowza Streaming Engine

C Wowza configuration

D Server and client application code sample

E Device/Browser study

Bibliography

List of Figures

2.1 Videoconference users topology

3.1 Features of KMS against common WebRTC servers
3.2 Mesh topology

4.1 General architecture of the application
4.2 Server-side architecture of the application
4.3 Client-side architecture of the application
4.4 Wowza integration flow
4.5 Notification Room Manager
4.6 Client domain request flow

5.1 Login screen
5.2 Simple videoconferencing example
5.3 Presenter screen
5.4 Participant screen example
5.5 Videoconferencing example. A viewer having the floor
5.6 Players website

6.1 Example of the test elements
6.2 Two users accessing the room test
6.3 Turn mechanism test
6.4 Turn mechanism test with two concurrent floor requests
6.5 WSE load testing of 100 users (5 minutes)
6.6 KMS load testing structure
6.7 Subjective and objective measurements of the simulations through a browser
6.8 Fragment of the webrtc-internals dumping
6.9 testRTC analyzing tool

A.1 WebRTC with Kurento Media Server
A.2 High-level architecture of Kurento

B.1 Wowza Streaming Engine architecture

C.1 Live application configuration
C.2 Transcoders’ configuration
C.3 Example of the 360p transcoder configuration
C.4 nDVR configuration
C.5 Configuration of the VOD application

E.1 Ranking of devices
E.2 Ranking of desktop OS
E.3 Ranking of desktop browsers
E.4 Ranking of mobile OS
E.5 Ranking of mobile browsers

List of Tables

6.1 Recording test
6.2 WSE load testing
6.3 KMS load testing scenarios

E.1 Windows browsers compatibility
E.2 Mac browsers compatibility
E.3 Linux (Ubuntu) browsers compatibility
E.4 Android browsers compatibility
E.5 iOS browsers compatibility


1 Introduction

Videoconferencing consists of simultaneous bidirectional audio and video communication, which makes it possible to hold meetings between groups of people in distant locations. Additionally, it may offer other telematic facilities, such as graphics and image exchange or file transfer.

The core technology used in a videoconferencing system is the real-time digital compression of audio and video streams. Its implementation therefore provides important benefits, such as collaborative work between geographically distant people and better integration among working groups.

In enterprise environments, videoconferencing solutions provide organizations with the tools they need to reduce distances, time and costs; promote and increase the productivity of working teams; strengthen participation and relationships between people; improve the information and communication systems of the company; and accelerate decision making and problem solving.

Videoconferencing generally takes place over the Internet, so it is worth mentioning that it sometimes faces technical issues, such as poor image or sound quality. These drawbacks are usually linked to slow connections, which hinder optimal data transmission. Despite this, and in the absence of those limitations, it is possible to set up high-quality videoconferencing with audio, video and file transfer at a cost affordable for most users.

All those advantages mentioned above make videoconferencing one of the fastest growing segments in the area of telecommunications.

Among the possible technological solutions for web conferencing, WebRTC [1] stands out. This technology aims to provide web browsers with the ability to manage real-time communications natively, so that no additional component or plugin needs to be installed.


WebRTC is a project promoted by Google and the Mozilla Foundation, among others, which has native support on popular browsers such as Chrome, Firefox and Opera. This support gives it a competitive position to establish itself as the standard technology for managing such communications.

In addition, this Application Programming Interface (API) is an open framework for the web. It can manage instant text messaging as well as audio and video communications. It is designed to operate in complex network environments, including those with firewalls and NAT. For audio, WebRTC uses the Opus codec (IETF RFC 6716), which is royalty-free and ensures high communication quality. Furthermore, it allows a more effective use of bandwidth.

One of the advantages of developing WebRTC-compatible applications is that no protocols or knowledge beyond those required for typical web development are needed, since it offers a JavaScript-based API.

In general, WebRTC technologies have the following advantages: they are free, so anyone with a laptop or desktop device can use them; they are device- and platform-independent, being a browser-based technology that runs on any operating system; they allow calls, video calls, conferences and file sharing; audio and video quality is very high, which facilitates communication; and they adapt easily to different network types.

In short, videoconferencing based on WebRTC allows real-time communications through the web, by only loading a page.
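As a minimal illustration of this plugin-free model, the following browser-side sketch captures the local camera and microphone with the standard getUserMedia API and attaches the stream to a video element. The helper name and element id are illustrative, not taken from this project:

```javascript
// Minimal browser-side sketch of WebRTC media capture (illustrative only).
// buildConstraints() is a hypothetical helper; getUserMedia is the
// standard WebRTC capture API.
function buildConstraints(withVideo, withAudio) {
  return { video: !!withVideo, audio: !!withAudio };
}

async function startLocalPreview(videoElement) {
  // Ask the browser for camera and microphone access; no plugin is needed.
  const stream = await navigator.mediaDevices.getUserMedia(
    buildConstraints(true, true)
  );
  videoElement.srcObject = stream; // show the local capture in a <video> tag
  return stream;
}

// In a page one would call, e.g.:
// startLocalPreview(document.getElementById('localVideo'));
```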

2 Objectives

The aim of this project is the development of a web application based on WebRTC technologies that allows videoconferencing through a browser. Such videoconferencing could be used as a tool in different scenarios, such as meetings (board meetings, assemblies, shareholders’ meetings, etc.) or tele-teaching, enabling remote audience participation.

The application developed herein shall be capable of videoconferencing with a One-to-Many architecture, in which one user acts as the presenter and the others are viewers who watch and/or listen to the presenter (in other words, it shall be an implementation of a video broadcasting web application, in which the presenter sends its stream to the viewers).

Among the desired general functionalities (shared by all users, both the presenter and viewers/participants1), the application shall provide the ability to mute audio and video and to expand the video to full screen. The viewers must be able to watch the presenter, to request the floor or end their turn in case they want to participate, and to know which user has the floor at any time. Concerning the presenter, the application must allow them to give the floor to, or end the turn of, the viewers who have previously requested it (only one viewer should be able to participate at a time). It shall also provide a thumbnail of the participant’s video (if any), so that the presenter can watch them if desired (or even watch themselves); a counter indicating how many viewers are attending the videoconference; and an information label showing which viewer requested the floor first. The presenter must also be able to record the parts of the videoconference they consider relevant (the application must include mechanisms to start/pause recording).
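The single-floor rule described above can be sketched as a small piece of state. This is a hypothetical illustration; the class and method names are not taken from the actual implementation:

```javascript
// Illustrative floor-control state: viewers queue requests, the presenter
// grants the floor to the first requester, and only one participant holds
// the floor at a time. Names are assumptions, not project code.
class FloorControl {
  constructor() {
    this.queue = [];    // viewers waiting for the floor, in request order
    this.holder = null; // the participant currently having the floor
  }
  request(user) {
    if (!this.queue.includes(user)) this.queue.push(user);
  }
  grant() {
    // only grant when nobody holds the floor; the first requester wins
    if (this.holder === null && this.queue.length > 0) {
      this.holder = this.queue.shift();
    }
    return this.holder;
  }
  endTurn() {
    this.holder = null; // the floor returns to the presenter
  }
}
```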

To provide scalability for those users who do not wish to participate actively in the conference

1Note that participants refers to viewers having the floor.

but only to attend it, the application should include a web page. This web page shall contain a player in which users can watch the live conference. The recorded videoconference should be accessible through another player on the website.

The application scheme is shown in Figure 2.1. The presenter, represented by the central node (orange), sends their stream to all the viewers (blue). The orange node may temporarily move to the user who has the floor, returning to the original presenter at the end of their turn.

Figure 2.1 – Videoconference users topology.

After reviewing the state of the art (chapter 3), some solutions built with WebRTC were found in the market, but none meets the desired characteristics. Almost every solution has a Many-to-Many architecture (group call), they only allow videoconferences of up to fifteen participants (upon payment of a fee), and they do not allow recording. The desired solution is a scalable system which supports an undefined number of users who can act actively or passively, and which also supports recording. All of this must be attained without registration and through the browser. The application shall also offer the possibility that one user acts as a moderator – in almost every solution all the participants are equal.

In the aforementioned chapter, it was also decided to use the Kurento Media Server (KMS) [2], a WebRTC media server with a set of client APIs, as the centerpiece of the project. This platform simplifies the development of advanced video applications for web and smartphone platforms. Once the main server was decided, two applications based on Kurento were studied. Among them, the Kurento Room Demo project [3] was chosen as the basis of development for the web application, since it is a more stable and powerful solution than the other one.

In order to perform the streaming and recording functions, Wowza Streaming Engine (WSE) [4] is used. This is a server for live and on-demand video and/or audio streaming over IP networks to desktop, laptop, mobile and other network-connected devices. The server is a Java application that can be deployed on most operating

systems. Appendix A and Appendix B contain detailed information about the Kurento and Wowza servers, respectively, on which the developed application is based.

The application developed in this project is based on WebRTC technologies. Furthermore, Java, HTML, CSS, XML, JavaScript (AngularJS) and streaming technologies are required. The application should be compatible with most desktop and mobile devices.


3 State of the art

This section describes the main videoconferencing solutions that have been reviewed and the central server selection. WebRTC is an emerging standard; however, there are already some applications based on it (or that will be in the near future). The most relevant ones are mentioned here:

1. Skype web [5]. Skype is an application maintained by Microsoft that allows video calling, message sending and file sharing with other people in different locations. Skype web is a basic version of Skype primarily used for instant messaging from the browser. To make video/audio calls, it is necessary to download a plug-in or extension, depending on the operating system and browser being used. As Skype for Web is still in beta, not all browsers are fully supported yet for video/audio calling [6]. Microsoft is working with Google and others to create the ORTC (Object Real-Time Communications) specifications to support a version of Skype without the need for any plugins or extensions. That will not be immediate, so for now it is necessary to install a plug-in. Another disadvantage is that it is not totally free: up to ten people can join a group video chat, but at least one person must have a Skype Premium paid subscription. Screen sharing is allowed.

2. Google Hangouts [7]. It is a video chat service from Google that enables both One-on-One chats and group chats with up to ten people at a time. Google+ Hangouts no longer requires a separate plugin to be installed in Chrome for video and voice chat to work. Using the WebRTC API and Native Client (NaCl), Google is able to provide a native video chat experience out of the box in Chrome. Although NaCl is a Chrome-only technology, the key aspects of the service (audio and video) are served by WebRTC. Other browsers, such as Firefox, should be able


to support Hangouts without a plugin at some point in the near future as well [8]. It is not an entirely free service: with a subscription, talks are limited to fifteen participants, and no duration limit applies. Live Hangouts can also be used to broadcast a Hangout to as many people as wished; the limit remains fifteen active participants (ten if not subscribed). Screen sharing is allowed.

3. Talky.io [9]. Talky is built on open-source technologies of both SimpleWebRTC and their own Otalk platform. It allows up to fifteen people in the conversation, screen sharing, custom links, password-protected sessions and it is mobile friendly.

In addition to the main systems mentioned above, there are some WebRTC-based applications [10], for example:

4. Appear.in [11]. It is considered as one of the best free video calling services. It allows customizable links, lock rooms, screen sharing (browser extension required), group calls up to eight people, chat and mobile browser support.

5. vLine [12]. It is a simple video chat client with a modern interface that is a breeze to use. It is necessary to enter a name and the videoconference will start (up to four people). Regarding the advantages it provides: mobile browser support and full-screen mode. As for the disadvantages: no extra features, no text chat and no ability to customize the link.

6. Firefox Hello [13]. It is for Firefox users. It does not have any extra features such as text chat, but it is simple and easy to use. Concerning the advantages, it provides mobile browser support and full-screen mode. With regards to the disadvantages: no extra features, no text chat and no more than two rooms allowed.

The following describes all the solutions that have been considered as a possible central server for the application development.

1. Kurento Media Server. A WebRTC media server and a set of client APIs that simplify the development of advanced video applications for web and smartphone platforms. KMS features include group communications, transcoding, recording, mixing, broadcasting and routing of audiovisual flows. As a differential feature, KMS also provides advanced media processing capabilities involving computer vision, video indexing, augmented reality and speech analysis. Kurento’s modular architecture makes the integration of third-party media processing algorithms simple (e.g. speech recognition, sentiment analysis, face recognition,

etc.), which can be transparently used by application developers like the rest of Kurento’s built-in features.

2. Videobridge [14]. VideoBRIDGE is a Selective Forwarding Unit (SFU; an SFU is able to receive multiple multimedia streams and then decide which of them should be sent to which participant) compatible with WebRTC that allows multi-user videoconferencing. It does not mix the video channels (or transcode them) into a composite video stream. It only relays each received video stream to all participants in the call. This makes it scalable. While it should run on a server with good network bandwidth, CPU power is not critical for performance.

3. Licode [15]. It is based on WebRTC technologies and compatible with the latest stable versions of Google Chrome and Firefox. It offers the possibility of videoconferencing without installing anything and provides fast development of videoconference features based on HTML5. It allows the user to implement streaming, recording and other real-time features. It is an MCU (Multipoint Control Unit), which connects two or more audiovisual endpoints together in a single videoconference call. The MCU holds information about the capabilities of the systems at each endpoint of the videoconference and establishes the conference on the lowest common denominator, so that all users can participate.

4. Janus [16]. It is a WebRTC gateway (implemented in C) developed by Meetecho. As such, it does not provide any functionality other than implementing the means to set up a WebRTC communication with a browser, exchanging JSON messages with it, and relaying RTP/RTCP and messages between browsers and the server-side application logic they are attached to. Any specific feature is provided by server-side plugins.

5. Red5 [17]. Open-source media server. It is designed to be flexible, with a plugin architecture that allows customizing live streaming and Video On-Demand (VOD) scenarios. Support for WebRTC is currently being built to enable streaming through the browser without a plugin.
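The SFU behaviour described for Videobridge above can be illustrated with a toy forwarding rule. This is a sketch, not Videobridge code: every received stream is relayed unmodified to all participants except its sender.

```javascript
// Toy illustration of SFU forwarding: no mixing or transcoding, each
// incoming stream is relayed to every participant except its sender.
function forwardingPlan(senders, participants) {
  const plan = {};
  for (const s of senders) {
    plan[s] = participants.filter(p => p !== s);
  }
  return plan;
}

// With one sender A and participants A, B, C, the stream from A is
// forwarded to B and C only.
```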

After analyzing the capabilities of each of the servers, it was decided to select Kurento because of the following reasons:

(a) It provides more flow-processing capabilities; for example, it allows stream mixing, transcoding and recording of both individual and group flows.

(b) No plugins are required.

(c) It has a good community of users.

(d) It has many tutorials, developed in both Node.js and Java technologies.


(e) Lots of documentation.

(f) It is a more mature and stable solution than most of the others studied.

As the main drawback of KMS, its lower efficiency compared to other solutions could be highlighted – there is a trade-off between processing capabilities and efficiency.

The capabilities offered by Kurento compared with common WebRTC servers are summarized in Figure 3.1.

Figure 3.1 – Features of KMS against common WebRTC servers.

Once the KMS was chosen, the state of the art was reviewed again to find web applications based on Kurento offering videoconferencing features with the same or a similar architecture to the desired one. Two solutions were found:

1. Kurento tutorial: One-to-Many video call [18]. This web application consists of a One-to-Many video call using WebRTC technologies, i.e. it is an implementation of a video broadcasting web application. There are two types of users in this application: one peer sends its media stream (presenter) and N peers receive the media from the presenter (viewers). It is a web application, and therefore it displays a client-server architecture. On the client side, the logic is implemented in JavaScript. On the server side, the Kurento Java Client is used to communicate with the KMS.

The high-level architecture of this demo is three-tier. To communicate these entities, two WebSockets are used. First, a WebSocket is created between the client and the server side to implement a custom signaling protocol. Second, another WebSocket is used for the communication between the Kurento Java Client and the KMS. This communication is implemented by the Kurento Protocol.

2. Kurento Room Demo project [3]. It is a framework developed using the KMS and WebRTC. Its main goal is to help developers deploy applications for multimedia group communication. A web demo application is available, called Kurento Room Demo, which uses this API to allow users to simultaneously establish multiple connections to other users connected to the same session or room. The main module is the Room SDK, but developers can choose any component they need from the framework to build their application. This application allows one to create videoconferences with an N-to-N architecture in which users are connected, as shown in Figure 3.2, forming a mesh in which all participants are equal and each one sends its stream to every other.

Figure 3.2 – Mesh topology.
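The scalability difference between the mesh and One-to-Many topologies can be made concrete by counting directed streams. This is an illustrative calculation, not taken from the cited projects:

```javascript
// In an N-to-N mesh every participant sends its stream to the other
// n - 1 peers, so the number of directed streams grows quadratically.
function meshStreams(n) {
  return n * (n - 1);
}

// In the One-to-Many topology only the presenter uploads, one stream
// per viewer.
function oneToManyStreams(n) {
  return n - 1;
}

console.log(meshStreams(10));      // 90 directed streams in a 10-user mesh
console.log(oneToManyStreams(10)); // 9 streams with a single presenter
```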


4 Development

4.1 Architecture

Videoconferencing takes place through an application with a client-server architecture. The application communicates with two servers: the Kurento Media Server, which can reside on the same machine as the web application, and the Wowza Streaming Engine, which resides on a different machine but in the same subnet (and physically connected) as the KMS one, to minimize delay. Figure 4.1 shows the integration of the components and the communication channels between them.

Figure 4.1 – General architecture of the application.


The application developed within this project, based on the Kurento Room Demo module of the Room API [19], is a single-page application, i.e. a web application that fits on a single page with the goal of providing a more fluid user experience, akin to a desktop application. The application allows users to connect to a videoconference room with a One-to-Many architecture (presenter and viewers), based on the KMS.

The presenter stream of every room is published on the WSE for two reasons. On the one hand, it allows a large number of users to watch the stream (providing scalability to the system); on the other hand, and most importantly, it allows recording, storing in a multimedia file the parts of the videoconference that the presenter decides.

The recording and live streaming from Wowza can be accessed from any device through two web pages containing two players.

4.1.1 Server

The server-side application is developed in Java and has a modular architecture, depicted in Figure 4.2. The figure shows that the application is made of two modules: the Kurento Room Server and the SDK. Both modules communicate using the Java API.

Figure 4.2 – Server-side architecture of the application.

4.1.1.1 Server Module

The Room Server component provides an API based on WebSockets for communication between the clients and the core of the application (the SDK module). To this end, it exposes a WebSocket with a URI whose host name or IP and port are configurable.


This WebSocket enables the server to receive client messages and to push events to the clients as soon as they happen.

The messages exchanged between the server and clients are requests and responses, based on the JSON-RPC 2.0 format [20]. Events are sent from the server in the same way as a server request, but without requiring an answer (they do not include an identifier).

Considering the message format, requests and responses are distinguished. The format for requests (and events) is as follows:

• jsonrpc: a string specifying the version of the JSON-RPC protocol. It must be “2.0”.

• id: unique identifier established by the client which contains a string or number. The server must reply with the same value in the response message.

• method: a string containing the name of the method to be invoked.

• params: a structure that holds the parameter values to be used during the invocation of the method.

The following JSON shows a sample request:

{"jsonrpc":"2.0","method":"joinRoom","params":{"user":"USER1","room":"ROOM1"},"id":0}

Regarding the format of the responses, they contain (like the requests) the "id" field and the "result" field, whose value is determined by the method invoked.

The following JSON shows a sample response:

{"id":0,"result":{"value":[{"id":"USER0","streams":[{"id":"webcam"}]}], "sessionId":"dv41ks9hj761rndhcc8nd8cj8q"},"jsonrpc":"2.0"}

On error, the response message has the following fields (apart from "jsonrpc"):

• id: this field is mandatory and it must match the value of the id field in the request message. If there was an error detecting the id in the request message (e.g. Parse Error/Invalid Request), it would be equal to null.

• error: message which describes the error through the following fields:

– code: an integer number which indicates the error type.

– message: a string which provides a brief description of the error.


– data: a structure which holds additional information about the error. It may be omitted.

The following JSON shows a typical error message:

{"jsonrpc": "2.0", "id": 1, "error": {"code": 33, "message": "Invalid parameter format"}}
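A client-side sketch of this exchange could look like the following. The dispatch helper is illustrative; only the JSON-RPC 2.0 framing and the sample method name come from this section:

```javascript
// Minimal JSON-RPC 2.0 client-side framing, matching the message formats
// described above. The dispatch logic is an illustrative sketch.
const pending = new Map(); // request id -> callback awaiting the response
let nextId = 0;

function buildRequest(method, params) {
  // every request carries a unique id that the server must echo back
  return { jsonrpc: '2.0', method, params, id: nextId++ };
}

function handleMessage(raw, onEvent) {
  const msg = JSON.parse(raw);
  if (msg.id !== undefined && (msg.result !== undefined || msg.error !== undefined)) {
    // a response: hand it to the request with the matching id
    const cb = pending.get(msg.id);
    pending.delete(msg.id);
    if (cb) cb(msg.error || null, msg.result);
  } else {
    // no result/error field: a server event, which needs no answer
    onEvent(msg);
  }
}
```

In the real application these messages travel over the WebSocket exposed by the Room Server; here the transport is left out so only the framing is shown.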

Besides the WebSocket API, clients can also interact with the Room Server through a RESTful API. There are two main methods:

1. Get all rooms (URL: GET /getAllRooms). It returns a set with the available room names.

2. Close room (URL: GET /close?room=roomName). It closes the room with the indicated name.
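These two methods can be invoked with plain HTTP requests; the sketch below uses the standard fetch API, with the host and port as placeholders for the configured Room Server address:

```javascript
// Sketch of calling the two RESTful methods exposed by the Room Server.
// BASE is a placeholder for the configured host and port.
const BASE = 'http://localhost:8080';

function closeRoomUrl(roomName) {
  // GET /close?room=roomName closes the room with the indicated name
  return `${BASE}/close?room=${encodeURIComponent(roomName)}`;
}

async function getAllRooms() {
  // GET /getAllRooms returns the set of available room names
  const res = await fetch(`${BASE}/getAllRooms`);
  return res.json();
}

async function closeRoom(roomName) {
  await fetch(closeRoomUrl(roomName));
}
```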

The Room Server runs through the Spring Boot framework [21], a solution for creating applications that can "just run". Tomcat is embedded directly, and there is no need to deploy WAR files to run the web application.

4.1.1.2 SDK Module

The API’s core module, the Room SDK, is a server-side library whose main functionality is managing multi-conference sessions.

This module defines a programming model for applications developed in the Java language. Among other functionalities, this library allows control over the life cycle of a multimedia conference (room), access to the typical operations required for managing participants (join, leave, publish or receive multimedia streams, etc.) and multimedia signaling.

This component requires access to, at least, one instance of a Kurento Media Server for WebRTC multimedia handling.

The Room API [19] is based on the concept of the Room Manager. This administrator can organize and control group calls with the help of the Kurento technologies. This Java API handles both the room and the multimedia-specific details. This frees the programmer from low-level or repetitive tasks (inherent to each multi-conferencing application) and allows them to focus on the functionality or business logic of the application.


The Room Manager deals with two fundamental concepts:

• Rooms: groups of virtual peers, with the limitation that a user may only belong to one group at a time. Rooms are identified by their name.

• Participants: virtual representation of an end user, identified by an id assigned by the application. Using this SDK, the application’s task is to receive and translate messages from the end-user side into requests to an instance of the Room Manager.

4.1.2 Client

The client part is described in Figure 4.3. It has been implemented using AngularJS [22] and is based on the Kurento Room JS Client module for web applications of the Room API (the browser must support WebRTC).

Figure 4.3 – Client-side architecture of the application.

The Room JS Client library is contained in the JavaScript file KurentoRoom.js of the Kurento Room JS Client module. The main classes of this library are the following:

• KurentoRoom: main class which initializes the room and the local stream. It is also used to communicate with the server.

• KurentoRoom.Room: abstraction of the room, it provides access to the local and remote participants and their streams. This class contains methods that send requests to the server regarding the rooms, e.g. "joinRoom", "viewRoom", "startRecording" or "stopRecording". It also contains the definition of the events: "onParticipantJoined", "onParticipantLeft", and so on.

• KurentoRoom.Participant: a peer (local or remote) in the room. Each peer has a flag to distinguish between the presenter and viewers.

• KurentoRoom.Stream: wrapper for streams published in the room.


The web interface is implemented in AngularJS, which follows a Model-View-Controller (MVC) pattern. The library reads the HTML, which contains custom tag attributes. It then obeys the directives in those custom attributes and binds the input or output parts of the page to a model represented by standard JavaScript variables. The web interface has three views (HTML), each of which has an associated controller (js): the login screen, the presenter screen and the viewers screen.

• Login screen: its HTML has two fields for the user to enter their name and the name of the room they want to access. The associated controller is loginController.js, which is responsible for redirecting the user to either the presenter or participant screen, depending on whether the room entered by the user exists or not.

• Presenter screen: the HTML defines the page layout and is associated with roomController.js, which performs the functions necessary to create a room. This controller also offers different features, such as buttons to record, mute audio/video, enter full-screen and give the floor to (or end the turn of) the viewers who have requested it. It also shows the number of viewers in the room and, if applicable, the viewer who first asked for the floor.

• Participant screen: the HTML defines the page layout and is associated with the participantController.js controller. This controller provides the functions necessary to receive the media stream from the presenter and to send the participant's own stream to the room. It also has buttons that allow the user to request the floor, mute the audio/video of the presenter and expand the application to full-screen. Moreover, it shows information such as the user who has the floor or the topic that will be treated in the videoconference.

4.2 Wowza Streaming Engine integration

The WSE is used for two reasons: on the one hand, to provide scalability to the application and, on the other hand, to avoid overloading the KMS with the videoconference recording. Its use makes possible an unlimited number of potential videoconference viewers and offers the possibility of recording without burdening the KMS.

In order to obtain the required features, it is necessary to create two applications in WSE; the process is detailed in Appendix C. The basis of the integration with Kurento is a WSE live application that receives the video stream from the KMS. This application performs stream processing such as transcoding, and provides DVR and recording functionalities.

For the media stream to reach the WSE from Kurento, a negotiation between them is necessary [23]. This negotiation (signaling protocol) is carried out by using the Session Description Protocol (SDP) [24], through which the parties agree, among other parameters, on the content type and format of the stream that will be exchanged.

In this case, the structure of the SDP used is as follows:

v=0
o=- 0 3641150070 IN IP4 192.168.0.26
s=Wowza Media Server
c=IN IP4 192.168.0.26
t=0 0
m=audio 40000 RTP/AVP 0
a=rtpmap:0 PCMU/8000
a=recvonly
m=video 40002 RTP/AVP 101
a=rtpmap:101 H264/90000
a=recvonly

Where:

• 192.168.0.26: IPv4 address of the WSE. It should be a parameter in case of server change.

• 40000: audio port. It changes from one room to another: it starts at the first free port on the WSE machine and increases by four for each new room.

• 40002: video port. It also changes from one room to another: it is obtained by adding two to the audio port and likewise increases by four for each new room.

• a=rtpmap:0 PCMU/8000: this parameter specifies the audio codec. The 0 is the value of the "Payload type" field of the RTP packets, which contain fragments of PCMU audio within a session. 8000 indicates the clock rate of the codec (8 kHz). Several rtpmap attributes could be added; they would be assigned in order.

• a=rtpmap:101 H264/90000: this parameter specifies the video codec. The 101 is the value of the "Payload type" field of the RTP packets, which contain frames of H.264 video within a session. H264 indicates the video codec and 90000 its clock rate (90 kHz). Several rtpmap attributes could be added; they would be assigned in order.

When a room is closed, its ports are kept in memory so that they can be reused by new rooms.
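The SDP generation and the port-reuse policy described above can be sketched as follows (an illustrative JavaScript model; the real Room Manager does this in Java, and the base port value and all function names are assumptions):

```javascript
// Illustrative sketch of per-room SDP generation with a reusable port pool.
const BASE_PORT = 40000; // first free port on the WSE machine (assumption)

const freePorts = [];    // ports released by closed rooms, kept for reuse
let nextPort = BASE_PORT;

function allocatePorts() {
  // Each room consumes four ports: audio RTP/RTCP and video RTP/RTCP.
  const audio = freePorts.length > 0 ? freePorts.shift() : (nextPort += 4) - 4;
  return { audio, video: audio + 2 }; // video port = audio port + 2
}

function releasePorts(ports) {
  // Freed ports are kept in memory so that new rooms can reuse them.
  freePorts.push(ports.audio);
}

function buildSdp(wseIp, ports) {
  // Mirrors the SDP structure shown above, parameterized by IP and ports.
  return [
    "v=0",
    "o=- 0 3641150070 IN IP4 " + wseIp,
    "s=Wowza Media Server",
    "c=IN IP4 " + wseIp,
    "t=0 0",
    "m=audio " + ports.audio + " RTP/AVP 0",
    "a=rtpmap:0 PCMU/8000",
    "a=recvonly",
    "m=video " + ports.video + " RTP/AVP 101",
    "a=rtpmap:101 H264/90000",
    "a=recvonly",
  ].join("\r\n");
}
```

Each room thus gets a deterministic SDP whose media ports never collide with those of other open rooms.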


Figure 4.4 depicts the process which occurs every time a presenter enters (establishing stream) or exits (stopping stream) a room.

Figure 4.4 – Wowza integration flow.

The SDP file (with the structure shown above) is generated "on the fly" within the Room Manager to be published automatically in the WSE application. Java methods upload the SDP file to the WSE over an SFTP connection. Through this connection the file is uploaded to the content folder of the Wowza application. With the proper configuration (listed in Appendix C), the application detects the SDP file and the stream is automatically published, awaiting multimedia reception.

After negotiation with the SDP, the media stream is sent from Kurento to the Wowza application by using RTP (Real-time Transport Protocol)/RTCP (RTP Control Protocol).

This process is only performed with the streams of videoconference presenters. When a participant has the floor, the presenter and participant streams are mixed and the resulting stream is sent to the WSE application. When the participant's turn ends, their stream is removed and only the presenter stream is sent again. When the presenter closes the room, the Room Manager reconnects to the WSE and deletes the SDP file, ending the streaming. Once completed, the freed ports are saved for reuse.

Once Wowza receives the media stream, recording controls for the stream are implemented. There is a record button on the presenter screen of the application that allows the presenter to manage the recording during the videoconference.

Recording follows exactly the same process as any other room functionality. Its activation produces a message exchange using JSON-RPC v2 between the client and the application server. When the button is pressed, the JavaScript client sends the application a request message to start/stop the recording. This message contains the name of the room as a parameter. When the application receives a message of this type (start/stop recording), it uses a Wowza REST API (whose operation is detailed in Appendix C) with the proper server credentials (username and password) to start/stop the recording according to different options.

In this particular case, the following URL is used to set up the recordings so that they can be paused and resumed:

http://[username]:[password]@[wowza-ip-address]:8086/livestreamrecord?app=yagotfm&streamname=roomName.sdp&action=startRecording&option=append&startOnKeyFrame=true&segmentSize=0&segmentDuration=0

The starting or stopping of the recording is controlled with the action attribute of the URL. If action is set to "startRecording", the recording starts; if it is set to "stopRecording", the recording stops.
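A sketch of how such a control URL could be assembled from its parameters (the application name yagotfm and the query parameters mirror the URL above; the function name, credentials and IP are placeholders):

```javascript
// Illustrative builder for the Wowza livestreamrecord control URL.
// opts: { username, password, wseIp, roomName, start } (all hypothetical names).
function buildRecordUrl(opts) {
  const action = opts.start ? "startRecording" : "stopRecording";
  return "http://" + opts.username + ":" + opts.password +
    "@" + opts.wseIp + ":8086/livestreamrecord" +
    "?app=yagotfm" +
    "&streamname=" + opts.roomName + ".sdp" +
    "&action=" + action +
    "&option=append" +          // append mode so recordings can be paused/resumed
    "&startOnKeyFrame=true" +
    "&segmentSize=0&segmentDuration=0";
}
```

The application server would issue an HTTP request to this URL whenever the presenter presses the record button.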

Besides performing the recording, the WSE is also used for transcoding and DVR.

1. Transcoding: the received stream is transcoded into different video resolutions, such as 160p, 360p and 720p, and into an audio-only stream, in case the viewer has a poor connection or is simply not interested in the video.

2. DVR: the video stream is recorded to allow rewinding, in case the viewer wishes to watch again some parts of the conference in which they are interested.

Once the integration is in place, both the live stream and the recorded video stored on the server are available for each videoconference. To access this content it is necessary to create a video-on-demand application (Appendix C).

Both videos are available on separate web pages, each embedding a Flowplayer player [25] (based on HTML5). Both simple web pages are served by a Tomcat server. These pages are accessed through a URL which has the name of the room as a parameter.

The live stream player has a playlist, which allows playing the live stream (by default), the transcoded streams in different qualities (160p, 360p, 720p), the audio-only stream, and the stream that allows rewinding (DVR).

WSE can deliver content through different protocols (RTMP, RTP and HLS, for instance). HLS, or HTTP Live Streaming, is an HTTP-based [26] media streaming protocol developed by Apple, Inc. It works by breaking a media resource into a sequence of small chunks that may be encoded at a variety of different data rates. Since HLS is an HTTP-based protocol, it is able to bypass any firewall or proxy server that lets through standard HTTP traffic, unlike other protocols such as RTMP. This allows content to be delivered by standard HTTP servers and from a wide range of CDNs and streaming engines such as Wowza. This protocol is used to deliver content to both the passive viewers and the viewers of the recordings because of its compatibility and NAT traversal capabilities.

4.3 Operation

4.3.1 Room management

With the aim of managing the rooms and their users, the server uses the Room SDK library. The core element of this library is the NotificationRoomManager API. This API considers two kinds of methods:

• Server domain. It comprises methods designed to be used in the application logic and in the integration with the Room SDK. These methods are executed synchronously. They can be understood as management methods, and they provide direct control over the rooms.

• Client domain. Methods invoked as a result of users' requests. They may be executed asynchronously.

The NotificationRoomManager comprises two components:

• NotificationRoomHandler: through this interface, the Room API invokes methods to be executed as a result of events that happen outside the scope of user requests, such as the closure of a room or a user leaving it.

• KurentoClientProvider: this service was designed so that the Room Manager can acquire a Kurento client instance at any time, without knowledge of the location of the KMS instances. The location of the instances is given as URIs in the application settings.


The diagram in Figure 4.5 shows the components that build up the system when the NotificationRoomManager is used.

Figure 4.5 – Notification Room Manager.

To interact with clients the application uses the Room Server module. Figure 4.6 depicts the flow of a client domain request.

Figure 4.6 – Client domain request flow.

To process the incoming JSON-RPC messages, the RoomJsonRpcHandler handler (which extends DefaultJsonRpcHandler) is defined; it processes each request depending on the name of the method. The application stores both the session and the transaction associated with each user so that the implementation of UserNotificationService (the JsonRpcNotificationService) can respond or send events back to the clients.

The handler delegates the message processing to an instance of the class JsonRpcUserControl. This class extracts the different parameters of the message and invokes the appropriate method of the NotificationRoomManager.

The NotificationRoomManager runs the code that contains the application logic and uses the notificationRoomHandler (which extends DefaultNotificationRoomHandler) to build the response message and send it to the client.

After the response message is built, the JsonRpcNotificationService sends it back to the client. This class stores all the open WebSocket sessions in a map, from which the corresponding transactions are obtained in order to send a response back to a request. The session object is also used to send JSON-RPC events (notifications) to the clients.

When a reply (or an error) is sent for a given request, the corresponding transaction object is used and then removed from memory (a new request implies a new transaction).

To send a notification, the session object is used; it is retained until the session is closed because of a user exit, a timeout or an error.
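The request path just described (handler, then user control, then room manager, then notification service) can be condensed into the following toy model (plain JavaScript; the actual implementation is Java, and every identifier here is illustrative, not the Room SDK API):

```javascript
// Toy model of the server-side JSON-RPC request path (illustrative only).
const sessions = new Map();      // open WebSocket sessions, keyed by user
const transactions = new Map();  // pending transactions, keyed by request id

const notificationService = {
  respond(requestId, result) {
    const tx = transactions.get(requestId);
    transactions.delete(requestId); // a new request implies a new transaction
    return { jsonrpc: "2.0", id: tx.id, result };
  },
  notify(user, method, params) {
    // Session objects are retained until the user exits, so events can be pushed.
    return { jsonrpc: "2.0", method, params, to: sessions.get(user) };
  },
};

const roomManager = {
  rooms: new Set(),
  joinRoom(user, room) {
    this.rooms.add(room); // application logic would go here
    return { room, user, joined: true };
  },
};

function handleRequest(msg, user) {
  // Handler role: store the session and transaction, dispatch by method name,
  // delegate to the room manager, and reply via the notification service.
  sessions.set(user, "ws-session-" + user);
  transactions.set(msg.id, { id: msg.id });
  if (msg.method === "joinRoom") {
    const result = roomManager.joinRoom(user, msg.params.room);
    return notificationService.respond(msg.id, result);
  }
  return notificationService.respond(msg.id, { error: "unknown method" });
}
```

Note how the transaction is dropped once the reply is built, while the session survives for later notifications.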

Appendix D contains a server-code sample of the whole process.

4.3.2 Client operation

This section describes the operation from the AngularJS application. The Angular-specific code will not be explained, as the goal herein is to understand the room mechanism.

When a user (presenter or viewer) joins a room, the initialization function of KurentoRoom is called, providing the URI at which the server listens for JSON-RPC requests. The room name and the user name must also be provided.

After the WebSocket initialization, the client can create/view a room and create the local stream objects.

The choice of whether to join the room is left to the application; if it does, access to the webcam and the microphone must first be obtained before calling the create/view room method.

The client contains an event emitter which sends events under certain circumstances (such as media errors or when a new stream is added). There is a parameter through which we subscribe to the events emitted by the room. For example, the user is prompted to grant access to the media resources on their system; depending on their response, the stream object will emit the "access-accepted" or the "access-denied" event (Appendix D contains a sample of client code handling an event).
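A minimal sketch of this event mechanism (an illustrative emitter written for this document, not the actual KurentoRoom.js API):

```javascript
// Toy event emitter mimicking how a client reacts to media-access events.
function createStream() {
  const handlers = {};
  return {
    on(event, cb) { (handlers[event] = handlers[event] || []).push(cb); },
    emit(event, data) { (handlers[event] || []).forEach((cb) => cb(data)); },
  };
}

const stream = createStream();
const log = [];
stream.on("access-accepted", () => log.push("media granted, joining room"));
stream.on("access-denied", () => log.push("media denied, showing error"));

// Simulate the user granting access to the webcam and microphone:
stream.emit("access-accepted");
```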

The operations of a client in a room are based on the message exchange between the client and the application server (WebSockets + JSON-RPC).

4.3.3 Overall working

The overall working of the application can be explained as follows:

1. A user who enters the application accesses the login page. They have to introduce their username and the name of the room they want to access. The application checks whether the room already exists. If the room does not exist yet, the client module sends a "CreateRoom" request. The server creates a room with this user, who acts as the presenter, and sends the "onParticipantJoined" event to the presenter. The application automatically sends the presenter's stream to Wowza.

2. Every time a user enters an already created room, the client module sends a "ViewRoom" request and the application responds with an "onViewerJoined" message to every user in the room. The viewer receives the presenter stream.

3. The turn mechanism is based on a message exchange. The application only permits one participant at a time. When floor requests are received by the presenter, the system puts them in a queue. The presenter gives the floor to the first one in the queue (the username of the first viewer who requested the floor is shown to the presenter). The turn mechanism works as follows:

• When a user presses the button to request the floor, the client module sends a "Request floor" message to the application server, which forwards it to the presenter (the request is added to the queue). A user can cancel their request by sending a "Cancel request" to the presenter.

• When the presenter gives the floor to the viewer (with a "Give word" message), the viewer makes a "joinRoom" request and publishes their stream in the room (until the presenter or the participant ends the turn). The application sends an "onParticipantJoined" message to every user in the room. As for the WSE streaming, the participant stream is mixed with the presenter's and sent to the WSE.

• The turn can be ended by the presenter, by sending a "Presenter ending" message, or by the participant, by sending a "Participant ending" message. After that, the participant performs a "viewRoom" operation. Concerning the WSE streaming, the participant stream is unpublished and the application only streams the presenter stream.
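The queue-based turn mechanism of step 3 can be sketched as follows (illustrative JavaScript; the real logic lives in the Java application server, and all names here are hypothetical):

```javascript
// Toy sketch of the presenter's floor-request queue: only one participant may
// have the floor at a time; requests are served in arrival order.
function createFloorQueue() {
  const queue = [];
  let current = null; // participant currently holding the floor
  return {
    request(user) {
      if (!queue.includes(user) && user !== current) queue.push(user);
    },
    cancel(user) {
      const i = queue.indexOf(user);
      if (i !== -1) queue.splice(i, 1); // "Cancel request"
    },
    giveFloor() {
      // The presenter gives the floor to the first viewer who requested it.
      if (current === null && queue.length > 0) current = queue.shift();
      return current;
    },
    endTurn() { current = null; }, // ended by presenter or by participant
    next() { return queue[0] || null; }, // name shown to the presenter
  };
}
```

A usage example: two viewers request the floor; the presenter serves them in order, ending each turn before granting the next.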

5 Results

The developed web application, based on WebRTC technologies, enables One-to-Many videoconferencing in a scalable fashion through a web browser. The videoconferences have a presenter and a large number of attending viewers who can temporarily participate.

When a user accesses the application, they find the login screen, in which they are asked for a username and the name of the room they want to enter, as shown in Figure 5.1.

Figure 5.1 – Login screen.

After entering the access data, the user is redirected to a webpage that depends on whether the entered room already exists. If the room does not exist, the application redirects the user to the presenter screen; otherwise, to the participant screen. A simple example of videoconferencing is shown in Figure 5.2, where a videoconference with one presenter (Figure 5.2(a)) and one viewer (Figure 5.2(b)) is illustrated. In these figures it can also be observed (on the top-right of each figure) the general features shared between the presenter and the viewers: the capability to mute the audio and/or video of the presenter and to expand to full-screen.

(a) Presenter screen. (b) Participant screen.

Figure 5.2 – Simple videoconferencing example.

As for the presenter's own functionalities, they can give the floor to (or end the turn of) viewers who request it, and start/pause the videoconference recording. The interface provides (on the top-left of the screen) a recording button which allows the presenter to start/pause/resume/end the videoconference recording. The blue button on the right side of the record button allows the presenter to give the floor to (or end the turn of) the viewer who has requested it. The green label informs the presenter of the number of viewers attending the videoconference, and the white one shows which viewer was the first to request the floor (Figure 5.3).

Figure 5.3 – Presenter screen.

Regarding the viewers, as depicted in Figure 5.4, they are able to request the floor to take temporary part in the videoconference as they deem appropriate (with the green button on the top-left), eventually ending their turn. The interface also provides a text box with the description of the topics the videoconference will deal with (at the bottom of the screen) and a text label which shows who has the floor at any time. The aforementioned figure illustrates viewer connections from both a desktop (Figure 5.4(a)) and a mobile device (Figure 5.4(b)).

(a) Desktop screen. (b) Mobile screen.

Figure 5.4 – Participant screen example.

Furthermore, Figure 5.5 shows an example of videoconferencing in which a viewer has requested the floor and the presenter has given it. The presenter screen (Figure 5.5(a), bottom) has a thumbnail of the video of the participant holding the floor, allowing the presenter to select the video they want to watch as the main video.

(a) Presenter screen. (b) Participant screen.

Figure 5.5 – Videoconferencing example. A viewer having the floor.

Moreover, through a Tomcat server, access to watch the videoconference is provided for people who do not want to participate actively. Using WSE transcoding and DVR capabilities, a player allows both watching the presenter's live stream in different qualities and rewinding it (Figure 5.6(a)). Besides knowing the player URL, it is also necessary to know the name of the room one wants to attend.

There is another player which provides access to the recorded conferences after they have ended. Access to this player is similar to the other one, and it only allows playing the stream in the default quality (Figure 5.6(b)).

(a) Live player screen.

(b) VOD player screen.

Figure 5.6 – Players website.

6 Application testing

In this chapter the test plan is described. In order to test both the functioning and the scalability of the developed application, two kinds of tests are performed: functional and load testing.

6.1 Functional testing

These tests are unit tests performed with the aim of testing the application's general operations (login, "viewRoom", "createRoom", "joinRoom" and so on). They can be considered integration tests because they check all the modules of the Room application. If desired, they can be executed automatically every time the application is deployed. These tests are based on the Kurento Room API tests.

The integration tests extend the class RoomTest.java, which contains the code necessary to run the Chrome browser automatically and perform operations such as accessing the application webpage, filling in the login data for distinct users, waiting for the connection establishment, clicking the desired buttons and leaving the application. First of all, this class calls an external one which detects the system details (OS and architecture) and downloads the appropriate version of chromedriver [27], the open-source tool used for the automatic testing of applications.

Figure 6.1 illustrates a test output example. In each test, chromedriver is executed with a parameter to use a fake video source, which eases the testing (no webcam or microphone is needed) and skips the alert for granting access to the webcam and microphone. This fake video is represented as the green spinner shown in the figure. If the spinner is present in the browser, it means that the video is being received correctly by the users. With regard to the presenter screen, besides the green spinner, attention should be paid to three labels: the blue (0 requests)/yellow (some pending requests) one, which indicates the floor requests pending for the presenter; the green label, which informs of the number of pending requests; and the white one, which informs of which viewer requested the floor first. When a participant has the floor, a thumbnail may appear at the bottom-left of the screen. Concerning the participant screen, there are two important labels on which to focus during the test: the label which indicates the state of a floor request (green to participate; blue to wait/cancel the request; and red to indicate that the user is taking part in the conference) and the blue label underneath, indicating which participant has the floor.

Figure 6.1 – Example of the test elements.

The complete functional test suite that the application must pass is described below:

• Presenter accessing a room.

– Class Name: AvailabilityRoomTfmTest. – Objectives: to check that both "createRoom" and "leaveRoom" succeed. It also checks whether the browser is receiving the presenter stream and how long (in ms) the application takes to establish the connection for the presenter (5.524 ms on average). – Description: the test accesses the application and fills in the access data (username and room name). It creates a room with one presenter. After a time defined by the user, the presenter leaves the room.

• Two users accessing the room.

– Class Name: TwoUsersTfmTest. – Objectives: to check that the first user acts as the presenter and sends their stream to the viewer. It confirms that the viewer's browser receives the presenter stream correctly. It also verifies that "viewRoom" works as expected and that the viewer does not publish their stream when entering the room. Additionally, it tests whether the counter of online users works as desired. – Description: the test launches two browsers and accesses the application. After filling in the access data, it creates a room with two users. The first user is the presenter (who performs a "createRoom") and the other one is a viewer (who performs a "viewRoom"). Then, the test checks whether the presenter stream is received correctly by the viewer. After the desired videoconference time (5 seconds in this case), both users leave the room and the presenter closes the room. Figure 6.2 shows the test during its execution, when both users have entered the room.

Figure 6.2 – Two users accessing the room test.

• Requesting/giving the floor and ending the turn by the presenter/participant with two users (presenter and viewer/participant).

– Class Name: AskFloorTfmTest. – Objectives: to verify that the turn mechanism ("Request floor", "Cancel request", "End the turn by the presenter", "End the turn by the participant" and "joinRoom" operations) works as expected. It also checks that the participant stream is published within the room correctly. Moreover, it tests whether the label informing of which participant has the floor works correctly. – Description: the test launches two browsers and accesses the room with two users after filling in the access data. The first user is the presenter and the other one (whose username is "user2") is a viewer. Afterwards, the test simulates the viewer sending a floor request (Figure 6.3(a)), cancelling it and sending it again. Every time a new request arrives at the presenter, they give the floor to the viewer (Figure 6.3(b)). Soon thereafter, depending on the value of a Boolean variable, either the presenter ends the participant's turn or the participant ends it themselves.

(a) Viewer requesting the floor.

(b) Participant having the floor.

Figure 6.3 – Turn mechanism test.

• Requesting/giving the floor and ending the turn by the presenter/participant with three users (presenter and two viewers/participants).

– Class Name: AskFloorTwoUsersTfmTest. – Objectives: to confirm whether the turn mechanism works as expected with two users requesting the floor simultaneously. – Description: the test launches three browsers and accesses the room with three users. The first user is the presenter and the others are viewers. First, viewer 1 (whose username is "user2") sends a floor request and then viewer 2 (whose username is "user3") does as well.


Therefore, the presenter has two pending requests. First, they give the floor to viewer 1. The test checks whether the stream is published properly. Afterwards, the presenter ends participant 1's turn and gives the floor to the second viewer. Finally, the presenter ends participant 2's turn. Figure 6.4 depicts an example of this test.

(a) Two users requesting the floor and waiting for it.

(b) Presenter gives the floor to the first viewer who asked for it.

Figure 6.4 – Turn mechanism test with two concurrent floor requests.

6.2 Load testing

The load testing is divided into two independent parts, both concerned with the load on a single server. The first tests measure the system load of the WSE machine under different test cases, and the second tests measure the system load of the KMS machine. The tests can be performed independently because the KMS and the WSE are supposed to be set up on different machines (in the same subnet, to minimize the streaming delay between them).

The number of concurrent users for the developed application depends on many conditions, such as hardware configuration and system load (both on the server and on the clients), network bandwidth and latency.


It would be interesting to test it under real-world conditions (i.e. inviting a large number of people from different geographic locations and with different network access to use the application). Since this was not possible, some scenarios are simulated which give an approximation of the number of concurrent users the application supports.

In order to assess the application behavior in a subjective fashion, one real viewer accesses the application while the tests are performed.

The first step, before simulating the scenarios, is to obtain the steady state of the WSE/KMS server (the RAM and CPU used by the WSE/KMS without any connection). After that, the desired scenario is simulated and the RAM and CPU used by the WSE/KMS are measured again (to obtain the difference from the steady state). After the simulation, the WSE/KMS machine is allowed to reach the steady state again.
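The measurement procedure thus boils down to subtracting the steady-state usage from the usage observed under load. A small illustrative helper (the function name is hypothetical; the example figures used in the test below are taken from the WSE measurements of the next subsection):

```javascript
// Load attributable to a scenario = usage under load minus steady-state usage.
function loadDelta(steady, underLoad) {
  return {
    cpu: +(underLoad.cpu - steady.cpu).toFixed(2),       // CPU percentage points
    ramGb: +(underLoad.ramGb - steady.ramGb).toFixed(2), // GB of RAM
  };
}
```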

6.2.1 WSE load testing

This test plan is carried out with the aim of observing the response of the WSE under certain scenarios (numbers of concurrent passive viewers). The chosen tool for this purpose is Apache JMeter [28], an Apache project that can be used as a load-testing tool for analyzing and measuring the performance of a variety of services, with a focus on web applications.

As explained in the previous chapters, WSE uses the HLS streaming protocol to deliver content to the passive viewers and to viewers of the recordings. The protocol works by breaking a media resource into a sequence of small chunks that may be encoded at a variety of different data rates.

The process of simulating the desired number of users requesting an HLS stream with JMeter is described in ref. [29]. The threads make HTTP requests to the WSE in order to simulate the HLS behavior and obtain the media packets.
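The ramp-up behavior of these scenarios (N users entering at a rate of r users/s) can be sketched as a start-time schedule, similar to what a JMeter thread group does during its ramp-up period (illustrative code, not JMeter configuration):

```javascript
// With a rate of r users/s, simulated user i starts i / r seconds after the
// test begins; a rate equal to the user count puts everyone in the first second.
function rampUpSchedule(users, usersPerSecond) {
  const starts = [];
  for (let i = 0; i < users; i++) {
    starts.push(i / usersPerSecond); // seconds from test start
  }
  return starts;
}
```

For example, 20 users at 0.2 users/s are spread over roughly the first 100 seconds of the test, while 20 users at 20 users/s all connect within the first second.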

The test plan is carried out (like the KMS load testing) under different scenarios, with different numbers of concurrent users accessing the stream at the same/different time, to obtain the WSE behavior.

Before the test execution, the recording load is measured. The recording of a room involves an almost negligible increase of both CPU and RAM, as illustrated in Table 6.1. This table shows cases with different numbers of rooms (1, 2, 3, 4 and 10), with the CPU and RAM usage with and without room recording. From the data in the table, it can be asserted that running the recording does not make a significant difference. Besides, all the rooms that one KMS can handle (see the next subsection) could be recorded.


Table 6.1 – Recording test

            No recording          Recording
Rooms    CPU (%)  RAM (GB)    CPU (%)  RAM (GB)
  1       16.1     1.39        16.5     1.41
  2       25.3     1.59        26.7     1.69
  3       39.3     1.76        41.6     1.82
  4       49.1     1.97        50.8     2.1
 10       93.9     3.16        99.1     3.39

Concerning the WSE load testing, for the different test cases the current throughput and the CPU and RAM usage are also measured. The WSE machine has the following characteristics: Ubuntu 14.04 LTS, 16 GB of RAM, 1 TB of disk and an 8-core processor (which leads to a maximum CPU usage of 800%). Before running the tests, the RAM, CPU and throughput in the steady state are: CPU 1%, RAM 1.12 GB and an available throughput of about 250 Mbits/s.

Table 6.2 sums up the performed tests.

Table 6.2 – WSE load testing

Users   Time (s)   Users rate (users/s)   CPU (%)   RAM (GB)   Throughput (Mbits/s)   Subjective evaluation
 20      300        0.2                    27        1          20.6                   OK
 20      300        20                     28        1          20.8                   OK
 30      300        0.2                    30        1.48       30.8                   OK
 30      300        30                     31        1.56       30.5                   OK
 50      300        0.1                    33        1.73       50.5                   OK
 50      300        50                     33        1.91       52.5                   OK
 50      3600       0.033                  33        1.98       52.8                   OK
100      300        0.83                   35        2.01       91.2                   OK
100      300        100                    35        2.03       91.95                  OK

The table above shows that the tests have been executed for 20, 30, 50 and 100 concurrent users who access the HLS stream at the same/different time. All the tests except one (50 users over one hour) have a five-minute duration. It can be observed that the CPU, whose maximum value is about 35% of the 800% that the system can hold, and the RAM, whose maximum value is 2.03 GB of the available 16 GB, do not increase significantly. Thus, the network throughput is the limiting factor. The subjective results of all the test cases were positive.

The WSE tests are limited to 100 users because of the bandwidth of the machine that runs the test (100 Mbits/s). The theoretical bandwidth of the WSE machine is about 250 Mbits/s, so, assuming that an HLS connection consumes about 1 Mbit/s, the WSE might support about 250 concurrent users accessing the stream (theoretically, once again).

Figure 6.5 shows the test with 100 concurrent users during 5 minutes, both the users rate (Figure 6.5(a)) and the throughput (Figure 6.5(b)).

(a) Number of users over time.

(b) Throughput over time.

Figure 6.5 – WSE load testing of 100 users (5 minutes).

Additionally, within these tests the minimum delay obtained by setting up the KMS and WSE machines on the same subnet and physically connected is also measured. This delay was about 5 ms (measured in a subjective fashion).


6.2.2 KMS load testing

This section shows the test plan performed to test the KMS under different scenarios. The different test cases are performed with different numbers of concurrent users accessing one or several rooms, to obtain the KMS behavior.

Research was carried out to find a tool that allows performing this kind of test with a WebRTC application. Few testing solutions turned out to be available for these technologies and, in addition, the tool that was found (testRTC [30]) is a paid solution. For that reason, a programmatic mode had to be developed: some test classes (based again on the Kurento Room Demo test classes) emulate the desired number of concurrent users accessing the application.

During test execution, a new user is connected via a browser to obtain the WebRTC statistics and a subjective evaluation of the received audio/video streams.

The procedure to perform the load testing of the KMS is shown in Figure 6.6.

Figure 6.6 – KMS load testing structure.

The procedure shown in the figure is the following:

1. Configuration file. This file sets the input parameters of the simulator: the number of rooms, the number of users in each room, the maximum number of participants in each room, the KMS URI and the fake source video to use in the test.
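Such a configuration file might look like the following sketch (the property names and values are illustrative, not the ones actually used by the simulator):

```properties
# Hypothetical simulator configuration (names and values are illustrative)
rooms=2
usersPerRoom=25
maxParticipantsPerRoom=4
kms.uri=ws://192.0.2.10:8888/kurento
fake.video=file:///opt/test/fake-user-video.y4m
```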

2. Simulator. The simulator comprises a set of classes that create the desired number of rooms and the desired number of users, each of which joins a particular room. After all the users have joined, the videoconference runs for the desired time (10 minutes). Afterwards, all the users leave the room.

3. Application Server.

4. Real user connected through the Chrome browser. This user enters one room to observe the video/audio behavior in a subjective way. Another tab is opened with the webrtc-internals page of the browser. Figure 6.7 depicts what the user can watch during the tests: the browser of the videoconferencing viewer (Figure 6.7(a)) and the webrtc-internals statistics of the videoconference (Figure 6.7(b)).

(a) Viewer screen of the simulated videoconferencing.

(b) WebRTC-internals of the viewer browser during the test.

Figure 6.7 – Subjective and objective measurements of the simulations through a browser.


5. WebRTC-internals dump file. Once the videoconference is finished, a dump file of the webrtc-internals statistics is created (Figure 6.8).

Figure 6.8 – Fragment of the webrtc-internals dump.

6. TestRTC. The webrtc-internals file is analyzed with the free testRTC analysis tool, obtaining the output data. Figure 6.9 shows an extract from the testRTC analyzing tool.

Figure 6.9 – testRTC analyzing tool.

For the different tests, the percentage of both CPU and RAM usage is measured. The KMS machine has the following characteristics: Ubuntu 14.04 LTS, 16 GB RAM and an 8-core processor (which leads to a maximum CPU usage of 800%). Before running the tests, the RAM and CPU used by the KMS in the steady state were checked to be negligible in both cases.

Table 6.3 shows the different scenarios that were tested, where several variables are measured. The input data are the number of rooms, the number of participants who have the floor, and the number of users connected to the application concurrently (brackets indicate the users in each room). Regarding the output data, besides CPU and RAM usage, factors such as the bitrate, the percentage of packet loss, the jitter (audio/video) and the delay are also measured.

Table 6.3 – KMS load testing scenarios.

(Input columns: Rooms, Partic., Users; the remaining columns are output.)

Rooms  Partic.  Users     CPU (%)  RAM (%)  Bitrate (kbit/s)  Packet loss (%)  Jitter (ms)   Delay (ms)  Subjective evaluation
1      0        10        90       1        724.49            0.005            39.2/39.3     220         OK
1      0        20        130      1.1      733.71            0.006            41.7/38.4     225         OK
1      0        30        180      0.8      731.44            0.06             40.9/41.8     232         OK
1      0        50        260      1.1      736.53            0.06             39.8/45.3     235         OK
1      0        100       471      2.2      742.08            0.008            46.9/64.2     242         OK
1      0        120       533      2.5      740.58            0.089            47.2/48       243         OK
1      0        130       -        -        -                 -                -             -           ERROR
1      1        30        316      1.6      1485.06           0.012            42.7/48       221         OK
1      1        50        486      2.1      1502.02           0.031            43.5/56.8     223         OK
1      1        75        600      2        1379.81           0.015            45.3/53       342         OK
1      1        80        -        -        -                 -                -             -           ERROR
2      0        50 (25)   300      1.4      738.65            0.044            40.3/44.2     242         OK
2      0        120 (60)  580      3        735.36            0.091            47.8/58.2     247         OK
2      1        70 (35)   593      2.6      1508.62           0.078            41.3/53.5     238         OK
2      1        80 (40)   -        -        -                 -                -             -           ERROR
2      2        60 (30)   612      3.2      1522.6            0.62             48.9/51.5     228         OK
2      2        80 (40)   655      7.3      406.23            0.01             238.3/1458    430         ERROR
8      0        120 (15)  675      3.6      640.43            0.056            53.6/155      250         OK
8      4        120 (15)  742      7.5      1517.68           0.062            37.3/41.2     200         OK
8      5        120 (15)  748      7.7      816.19            0.065            175.6/181.8   900         ERROR

The table above shows the different test cases that were probed. Different numbers of rooms (1, 2 and 8) were checked, with different numbers of concurrent users, some of them acting as participants. The tests show the approximate number of concurrent users that the application supports in these situations (the rows previous to the errors are the limits for each scenario). The limits range from one room with only the presenter sending their stream, which supports 75 concurrent viewers and a possible participant, to two rooms, which support up to 35 users per room with one participant and up to 30 users per room with two participants. Tests with 8 rooms with 15 viewers and 4 concurrent participants (at least as many as the solutions found in the state of the art) were also fully completed. All these data are approximate because of the simulation conditions: issues such as different locations and network accesses have not been considered. Within the range of the test cases, the application herein developed improves on the behavior of the state-of-the-art applications.

With regard to the output parameters: concerning packet loss, acceptable rates range from 0.1 to 2%; otherwise, picture loss and audio dropouts result. As for the jitter, the term "jitter" refers to the variation in the timing of the picture caused as packets are received, buffered and delivered to the screen while the available bandwidth changes. Most manufacturers recommend a jitter below 100 milliseconds for an uninterrupted videoconference. An increase in jitter caused by an underpowered network connection can cause "skipping" or "freezing" of the picture, resulting in noticeable disruptions. Concerning the latency, values should typically be lower than 300 milliseconds so that the desynchronization of the packets is not noticeable, although values up to 400 ms are admissible [31]. It can therefore be seen that the simulated scenarios marked as correct have their statistics within these ranges and work well.
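These thresholds can be captured in a small check like the following sketch (the threshold values are the ones cited above; the classification helper itself is illustrative, not part of the test tooling):

```java
/** Sketch: classify a test run against the QoS thresholds cited in the text
 *  (packet loss <= 2%, jitter < 100 ms, delay <= 400 ms). */
public class QosCheck {

    static boolean acceptable(double packetLossPct, double jitterMs, double delayMs) {
        return packetLossPct <= 2.0 && jitterMs < 100.0 && delayMs <= 400.0;
    }

    public static void main(String[] args) {
        // Row "1 room, 1 participant, 75 users" of Table 6.3: within all thresholds.
        System.out.println(acceptable(0.015, 53.0, 342.0));   // true  -> OK
        // Row "2 rooms, 2 participants, 80 (40) users": jitter and delay exceeded.
        System.out.println(acceptable(0.01, 1458.0, 430.0));  // false -> ERROR
    }
}
```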

In addition, those simulated scenarios also show the characteristics that differentiate this application from other similar ones. When a user accesses the room after a videoconference has started, they can only perform two actions: watching the presenter/participant who has the floor and asking the presenter for the floor. In the applications reviewed in the state of the art, when a user enters a room they can access every participant's stream in the videoconference. As for the presenter, they are allowed to control whether a viewer speaks or not and for how long. Moreover, unlike in other applications, the videoconference can be recorded (the desired parts of it).


7 Conclusions

This project has covered various aspects of the life cycle of a software project. A web application has been developed that enables, in a scalable fashion, a One-to-Many videoconference through a browser. This videoconferencing could be the tool for the development of meetings (boards of directors, assemblies or shareholders meetings, among others) or tele-teaching, enabling remote audience participation.

Among the general features (shared between the presenter and the viewers), the application provides the capability to mute the audio and video of the presenter and to expand the application to full screen. In addition, the presenter is allowed to give the floor to, or end the turn of, those viewers who have requested it. Moreover, the presenter is also offered the possibility to start/pause the conference recording. As for the viewers, they are able to visualize the videoconference presenter, to request the floor and to know which user has the floor at any time. The videoconferences can be recorded and watched via live streaming or VOD on a website.

All these features are offered in a scalable manner and, as noted in the study (Appendix E), the market for the application includes most of the devices, browsers and operating systems most commonly used by Internet users.

The different simulated scenarios show that the application is reasonably scalable. Regarding the WSE integration, it can be seen in the load testing section that the number of concurrent users accessing the stream is limited by the network throughput of the WSE machine rather than by its CPU or RAM. The actual throughput of the server allows more than 100 concurrent total users (divided among the number of concurrent rooms). On the other hand, as noted in the KMS load testing, the WebRTC application itself supports about 75 concurrent users (in only one room) and up to about 15 users in each of eight rooms (with a maximum of 4 concurrent participants), which shows an improvement (within this range) compared to the existing solutions. Among the solutions reviewed in the state of the art, the free ones allow videoconferences of up to ten people and the paid ones up to fifteen. The number of concurrent users of the WebRTC application is limited by the CPU of the KMS machine. However, this could easily be improved by setting up N KMS machines and a load balancer, which may allow incrementing the number of concurrent users by a factor of N. If the KMS instances are located in different places, fault tolerance also improves (the KMS is the central point of failure of the application, as the WSE is for the recordings). The load testing data are approximate because of the conditions of the simulated scenarios.

Concerning the architecture, the solution herein exposed is the only one with a One-to-Many architecture where the users play different roles (presenter, viewer and participant) and in which one user can decide over the rest. In spite of not supporting screen sharing or chatting, the technologies on which the application is based allow adding these two features in the future. Besides, unlike existing applications, it is plugin-free and allows the possibility of recording.

The web application highlights the advantages of videoconferencing and its operation, and its development has allowed me to learn about new technologies such as WebRTC, AngularJS and video-streaming technologies, and to expand my knowledge of others, for instance HTML, CSS, XML and JavaScript. It has also allowed me to learn how to work with multimedia and streaming servers (KMS and WSE), widely used in business entities oriented to streaming.

8 Outlook

The application features are extensible. New videoconferencing features such as desktop sharing or authentication services for users could be added. It might be interesting to allow all the active users in the videoconference (presenter and participants) to share their desktop screen when needed. A reconnection mechanism could be added for when a user loses their connection, as well as another one to alert users when their bandwidth decreases below a predefined limit. As for the authentication services, new methods could be incorporated into the REST API in order to enter a room in an authenticated fashion.

An email alert system could also be included, sending the schedule of the planned videoconference and the access key. The presenter would be allowed to introduce the emails of the viewers (if known) so that the system automatically sends them an email with the aforementioned information.

Regarding access to information, a database system could also be introduced to store information such as the server IP addresses, usernames and required passwords. Therefore, it would not be necessary to stop the application if a replacement is needed.

In the mobile area, an iOS mobile app might be developed to access the room; the API used provides a Java client to access the application. Concerning the Android OS, an app is not necessary because, as can be seen in Appendix E, the application is compatible with the main browsers on this OS. However, it would be necessary to create an iOS app because, nowadays, WebRTC does not work in iOS browsers, and iOS represents approximately 20% of the market.

It might also be quite interesting to secure all the streams sent to the WSE so that nobody outside the system can access them. Nevertheless, KMS does not currently support SRTP. It could also be necessary to hide the IP addresses on all websites so that nobody could attack them.

Finally, to add scalability to the system, several (N) KMS instances could be set up so that the application would support N times the users it currently supports. It would be necessary to deploy a module that handles how to balance the load between the N servers. It would be desirable for the load balancer to consider the number of WebRTC peers and also to dispatch all the peers in the same room to the same KMS. The goal of this last feature is that the users in the same room have roughly the same conditions. The mechanisms to handle the floor and so on would be transparent because there is a Java application between the clients and the N servers. With this solution, the WSE machine could be used exclusively for recording.
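The room-affinity dispatching described above might be sketched as follows (the `KmsBalancer` class and its instance URIs are illustrative, not part of the actual application):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch of room-affinity load balancing: every peer of the same room
 *  must be dispatched to the same KMS instance. URIs are illustrative. */
public class KmsBalancer {

    private final List<String> kmsUris;
    private final Map<String, String> roomToKms = new HashMap<>();

    KmsBalancer(List<String> kmsUris) {
        this.kmsUris = kmsUris;
    }

    /** The first peer of a room picks an instance; later peers reuse it. */
    synchronized String kmsForRoom(String room) {
        return roomToKms.computeIfAbsent(room,
                r -> kmsUris.get(Math.floorMod(r.hashCode(), kmsUris.size())));
    }

    public static void main(String[] args) {
        KmsBalancer lb = new KmsBalancer(List.of("ws://kms1:8888", "ws://kms2:8888"));
        // All peers of "roomA" are dispatched to the same KMS instance.
        System.out.println(lb.kmsForRoom("roomA").equals(lb.kmsForRoom("roomA"))); // true
    }
}
```

A production balancer would also weigh the current number of WebRTC peers on each instance, as suggested above; the hash-based pick here only guarantees the room-affinity property.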

A Kurento Media Server

WebRTC is enough for creating basic applications, but features such as group communications, media stream recording, media broadcasting or media transcoding are difficult to implement on top of it. For this reason, many applications require using a media server.

Conceptually, a WebRTC media server is just a kind of "multimedia middleware" (it sits in the middle of the communicating peers) through which media traffic passes when moving from sources to destinations.

Kurento is a WebRTC media server and a set of client APIs that simplify the development of advanced video applications for the Web and smartphone platforms. Kurento features include group communications, transcoding, recording, mixing, broadcasting and routing of audiovisual flows.

Kurento also provides advanced media processing capabilities (Figure A.1) involving computer vision, video indexing, augmented reality and speech analysis. Kurento's modular architecture simplifies the integration of third-party media processing algorithms (e.g. speech recognition, sentiment analysis, face recognition, etc.), which can be used transparently by application developers.


Figure A.1 – WebRTC with Kurento Media Server.

Kurento’s core element is the KMS, responsible for media transmission, processing, loading and recording. It is implemented with low-level technologies based on GStreamer [32] to optimize resource consumption. It provides the following features:

1. Networked streaming protocols, including HTTP, RTP and WebRTC.

2. Group communications (MCUs and SFUs functionality) supporting both media mixing and media routing/dispatching.

3. Generic support for computer vision and augmented reality filters.

4. Media storage, supporting writing operations for WebM and MP4 and playback of all formats supported by GStreamer.

5. Automatic media transcoding between any of the codecs supported by GStreamer, including VP8, H.264, H.263, AMR, OPUS, Speex, G.711, etc.

Kurento Client libraries are available in Java and JavaScript to control the KMS from applications. If another programming language is preferred, the Kurento Protocol, based on WebSocket and JSON-RPC, can be used.

Kurento is open source, released under the terms of the LGPL version 2.1 license.

Regarding the architecture, Kurento, like most multimedia communication technologies, is constructed from two layers (called planes) that abstract key functions common to all interactive communication systems:

1. Signaling plane. The parts of the system in charge of managing communications, that is, the modules that provide functions for media negotiation, QoS parameterization, call establishment, user registration, user presence, etc., belong to the signaling plane.

2. Media plane. Features such as media transport, media encoding/decoding and media processing comprise the media plane, which is responsible for handling the media. The distinction comes from the telephony differentiation between the handling of voice and the handling of meta-information such as tones, billing, etc.

Figure A.2 shows a conceptual representation of the high-level architecture of Kurento.

Figure A.2 – High level architecture of Kurento.

The right side of the image shows the application that handles the signaling plane and contains the logic and connectors of the multimedia application. It can be built with any programming technology, such as Java or Node.js. Signaling protocols are used on the client side to command the creation of media sessions and to negotiate the desired features. Therefore, this is the part of the architecture that is in contact with application developers and, for this reason, it has to be designed pursuing simplicity and flexibility.

On the left side we have the KMS, which implements the media plane capabilities, providing access to low-level media features: media transport, media encoding/decoding, transcoding, media mixing, media processing, etc. The Kurento Media Server must be able to manage multimedia streams with minimal latency and maximum throughput, and must therefore be optimized for efficiency and flexibility.

As for the Kurento protocol, the KMS can be controlled by two clients (Java or JavaScript). These clients use the Kurento protocol to communicate with the KMS. This protocol is based on WebSockets and uses JSON-RPC v2.0 messages to make requests and send responses.
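As an illustration, a request/response pair of this protocol might look like the following sketch (the `create` method follows the Kurento protocol conventions; the identifier values are placeholders):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "create",
  "params": {
    "type": "MediaPipeline"
  }
}
```

and a possible response from the KMS:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "value": "pipeline-id-returned-by-kms",
    "sessionId": "session-id-assigned-by-kms"
  }
}
```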

B Wowza Streaming Engine

Wowza Streaming Engine (known as Wowza Media Server in versions prior to 4) is a piece of software developed by Wowza Media Systems. The server is used for streaming live and on-demand video, audio and Internet applications over IP networks to desktops, laptops, tablets, mobile devices and other devices connected to the network. The server is a Java application and can be deployed on most operating systems.

The architecture of the operation is described in Figure B.1.

Figure B.1 – Wowza Streaming Engine architecture.

Among the advantages of this system are the following:

1. Streaming of high-quality live and on-demand content. It is a robust and customizable media server which provides reliable, high-quality video and audio streaming.

2. Platform independent, multi-format and multi-screen. It supports any format, transcodes it once, and delivers it to any device.

3. Transcoding for adaptive live streaming. It converts video formats and creates multiple streams at different bit rates "on the fly" to allow adaptive streaming.

4. Easily scalable as application needs change.


5. Support for the latest standards: MPEG-DASH, Apple HLS, Adobe HDS and Microsoft Smooth Streaming, the VP8 and VP9 video codecs, the Vorbis and Opus audio codecs, and more.

C Wowza configuration

To ingest the media stream originating in Kurento, it is necessary to create a live application in Wowza with certain capabilities. This application performs streaming, transcoding and recording. The main steps to configure the live application are the following:

1. Setting up the server into the chosen machine.

2. Setting up a live application. The purpose of this application is to receive the video stream from the KMS and perform media processing functions such as transcoding, DVR and recording. The general settings of this application are shown in Figure C.1.

Figure C.1 – Live application configuration.


3. Configuring the server and the application to monitor the content folder of the application, as shown in ref. [33]. This requires including a listener on the content folder so that, when a new file is detected, the streaming starts and, when the file is removed, the streaming stops.
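The start/stop behaviour of such a listener can be sketched as a folder diff (a simplified model: the real Wowza listener reacts to file-system events rather than explicit scans, and the class below is illustrative):

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Sketch of the content-folder listener behaviour: diff the folder
 *  against the previously seen files to decide which streams to start. */
public class ContentFolderMonitor {

    private final Set<String> known = new HashSet<>();

    /** Returns the files that appeared since the last scan (streams to start);
     *  files that vanished are forgotten, so their streams stop. */
    public List<String> update(Collection<String> currentFiles) {
        List<String> started = new ArrayList<>();
        for (String name : currentFiles) {
            if (known.add(name)) {
                started.add(name);   // newly detected file -> start streaming it
            }
        }
        known.retainAll(new HashSet<>(currentFiles)); // removed file -> streaming stops
        return started;
    }

    public static void main(String[] args) {
        ContentFolderMonitor monitor = new ContentFolderMonitor();
        System.out.println(monitor.update(List.of("myStream.stream"))); // [myStream.stream]
        System.out.println(monitor.update(List.of("myStream.stream"))); // []
    }
}
```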

4. Transcoding. To enable the transcoder it is necessary to select the option Enable transcoder in the Wowza application. Afterwards, the group of transcoders to use must be selected, as shown in Figure C.2.

Figure C.2 – Transcoders' configuration.

Within each template group it is necessary to edit the settings. As an example, the 360p transcoder configuration is shown in Figure C.3.

Figure C.3 – Example of the 360p transcoder configuration.

5. DVR. The media stream is recorded, allowing the viewer to start/pause and rewind the live stream if interested.


To configure the DVR option, it is only necessary to select parameters such as the length of the window or the directory in which to save the files, as shown in Figure C.4. After configuration, activation is required.

Figure C.4 – nDVR configuration.

Wowza has a REST API [34] that allows live-stream recording. The main steps for recording the videoconference are presented below:

1. In Wowza Streaming Engine, the parameters of the HTTP GET method and a URL can be used to record live streams. The following URL shows the minimum parameters required for live recording:

http://[wowza-ip-address]:8086/livestreamrecord?app=live&streamname=myStream&action=startRecording

Where the required URL parameters are:

• app = live application's name.
• streamname = stream's name (must be a live stream).
• action = startRecording / stopRecording / splitRecordingNow / startRecordingSegmentByDuration / startRecordingSegmentBySize / startRecordingSegmentBySchedule.

As optional parameters, the following ones could be used:

(a) option = version / append / overwrite. The default value is version.
(b) startOnKeyFrame = true / false. The default value is false.
(c) format = 1 / 2. 1 = FLV and 2 = MP4. The default value is 2 (MP4).
(d) segmentSize = [megabytes]. The default value is 10 (10 megabytes).
(e) segmentDuration = [seconds]. The default value is 900 (15 minutes).

As an authentication method requiring a username and password was configured, these credentials have to be added to the URL as follows:

http://[username]:[password]@[wowza-ip-address]:8086/livestreamrecord?app=live&streamname=myStream&action=startRecording
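As an illustration, the recording URL can be assembled programmatically before issuing the HTTP GET (the helper class below is a sketch; host and credentials are placeholders):

```java
/** Sketch: assemble the livestreamrecord URL used above. */
public class RecordUrl {

    static String recordUrl(String user, String pass, String host,
                            String app, String stream, String action) {
        return "http://" + user + ":" + pass + "@" + host
                + ":8086/livestreamrecord?app=" + app
                + "&streamname=" + stream + "&action=" + action;
    }

    public static void main(String[] args) {
        String url = recordUrl("admin", "secret", "192.0.2.10",
                "live", "myStream", "startRecording");
        System.out.println(url);
        // An HTTP GET on this URL (e.g. with java.net.HttpURLConnection)
        // would start the recording on the server.
    }
}
```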


2. To serve the recorded videos it is necessary to set up a VOD application whose content folder is the folder in which the recordings of the streams are saved (Figure C.5).

Figure C.5 – Configuration of the VOD application.

D Server and client application code sample

This appendix provides an example of both the server application and the client side code.

• Server application. The following code shows an example of the room management operation flow shown in Figure 4.6. The "CreateRoom" request is used throughout the example. The following listing illustrates the main method of RoomJsonRpcHandler, which processes every incoming message to the server.

public final void handleRequest(Transaction transaction, Request request)
    throws Exception {
  sessionId = transaction.getSession().getSessionId();
  notificationService.addTransaction(transaction, request);
  ParticipantRequest participantRequest =
      new ParticipantRequest(sessionId, Integer.toString(request.getId()));
  transaction.startAsync();
  switch (request.getMethod()) {
    case "createRoom":
      userControl.createRoom(transaction, request, participantRequest);
      break;
    default:
      log.error("Unrecognized request {}", request);
      break;
  }
}


The next step is to extract the parameters of the incoming message. The following code shows the main method of JsonRpcUserControl.

public void joinRoom(Transaction transaction, Request request,
    ParticipantRequest participantRequest)
    throws IOException, InterruptedException, ExecutionException {
  String roomName = getStringParam(request, ProtocolElements.JOINROOM_ROOM_PARAM);
  String userName = getStringParam(request, ProtocolElements.JOINROOM_USER_PARAM);

  ParticipantSession participantSession = getParticipantSession(transaction);
  participantSession.setParticipantName(userName);
  participantSession.setRoomName(roomName);

  roomManager.joinRoom(userName, roomName, true, participantRequest);
}

The JsonRpcUserControl invokes the appropriate method of the Notification- RoomManager.

public void createRoom(String userName, String roomName, boolean webParticipant,
    ParticipantRequest request) {
  Set<UserParticipant> existingParticipants = null;
  try {
    KurentoClientSessionInfo kcSessionInfo =
        new DefaultKurentoClientSessionInfo(request.getParticipantId(), roomName);
    existingParticipants = internalManager.createRoom(userName, roomName,
        webParticipant, kcSessionInfo, request.getParticipantId());
  } catch (RoomException e) {
    log.warn("PARTICIPANT {}: Error joining/creating room {}", userName, roomName, e);
    notificationRoomHandler.onParticipantJoined(request, roomName, userName, null, e);
  }
  if (existingParticipants != null) {
    notificationRoomHandler.onParticipantJoined(request, roomName, userName,
        existingParticipants, null);
  }
}

The RoomManager contains the application logic of the requested method:

public Set<UserParticipant> createRoom(String userName, String roomName,
    boolean webParticipant, KurentoClientSessionInfo kcSessionInfo,
    String participantId) throws RoomException {
  log.debug("Request [JOIN_ROOM] user={}, room={}, web={}, kcSessionInfo.room={} ({})",
      userName, roomName, webParticipant,
      (kcSessionInfo != null ? kcSessionInfo.getRoomName() : null), participantId);
  Room room = rooms.get(roomName);
  if (room == null && kcSessionInfo != null) {
    createRoom(kcSessionInfo);
  }
  room = rooms.get(roomName);
  if (room == null) {
    log.warn("Room '{}' not found", roomName);
    throw new RoomException(Code.ROOM_NOT_FOUND_ERROR_CODE,
        "Room '" + roomName + "' was not found, must be created before '"
            + userName + "' can join");
  }
  if (room.isClosed()) {
    log.warn("'{}' is trying to join room '{}' but it is closing", userName, roomName);
    throw new RoomException(Code.ROOM_CLOSED_ERROR_CODE,
        "'" + userName + "' is trying to join room '" + roomName + "' but it is closing");
  }
  Set<UserParticipant> existingParticipants = getParticipants(roomName);
  room.join(participantId, userName, webParticipant, true);
  return existingParticipants;
}

After the application logic is executed, the event linked to the request is raised. The DefaultNotificationRoomHandler builds the response. In this case:

public void onParticipantJoined(ParticipantRequest request, String roomName,
    String newUserName, Set<UserParticipant> existingParticipants,
    RoomException error) {
  if (error != null) {
    notifService.sendErrorResponse(request, null, error);
    return;
  }
  JsonArray result = new JsonArray();
  for (UserParticipant participant : existingParticipants) {
    JsonObject participantJson = new JsonObject();
    participantJson.addProperty(ProtocolElements.JOINROOM_PEERID_PARAM,
        participant.getUserName());
    if (participant.isStreaming()) {
      JsonObject stream = new JsonObject();
      stream.addProperty(ProtocolElements.JOINROOM_PEERSTREAMID_PARAM, "webcam");
      stream.addProperty("presenter", participant.isPresenter());
      JsonArray streamsArray = new JsonArray();
      streamsArray.add(stream);
      participantJson.add(ProtocolElements.JOINROOM_PEERSTREAMS_PARAM, streamsArray);
    }
    result.add(participantJson);

    JsonObject notifParams = new JsonObject();
    notifParams.addProperty(ProtocolElements.PARTICIPANTJOINED_USER_PARAM, newUserName);
    notifService.sendNotification(participant.getParticipantId(),
        ProtocolElements.PARTICIPANTJOINED_METHOD, notifParams);
  }
  notifService.sendResponse(request, result);
}

In order to send both the response and the notifications back to the client, the JsonRpcNotificationService is used:

@Override
public void sendResponse(ParticipantRequest participantRequest, Object result) {
  Transaction t = getAndRemoveTransaction(participantRequest);
  if (t == null) {
    log.error("No transaction found for {}, unable to send result {}",
        participantRequest, result);
    return;
  }
  try {
    t.sendResponse(result);
  } catch (Exception e) {
    log.error("Exception responding to user", e);
  }
}

@Override
public void sendNotification(final String participantId, final String method,
    final Object params) {
  SessionWrapper sw = sessions.get(participantId);
  if (sw == null || sw.getSession() == null) {
    log.error("No session found for id {}, unable to send notification {}: {}",
        participantId, method, params);
    return;
  }
  Session s = sw.getSession();
  try {
    s.sendNotification(method, params);
  } catch (Exception e) {
    log.error("Exception sending notification to user", e);
  }
}

• The client defines the events as follows:

localStream.addEventListener("access-denied", function () {
  // alert of error and go back to login page
});

localStream.addEventListener("access-accepted", function () {
  // register for room-emitted events
  room.connect();
});

In this example, these definitions indicate what the application must do when an event is raised, depending on whether access to the microphone and camera is granted or not.


E Device/Browser study

A study was carried out in order to build a compatibility table for the developed application. The aim is to assess the application's market prospects. The study focuses on the types of devices from which users access the Internet, analyzing the browsers used among today's wide offer.

The conditions of the study are as follows:

• The data analyzed refer to the recorded sessions from December 2014 to December 2015.

• Traffic worldwide is considered.

• The statistic source is ref. [35].


1. Analysis of the devices used to access the Internet. Figure E.1 shows a comparative ranking of user preferences when choosing a device to access the network.

Figure E.1 – Ranking of devices.

From this graph it can be drawn that more than half of the users prefer to access from desktop devices and the rest use mobile devices, with tablets having little significance.

2. Analysis of desktop browsers. Within these devices, as can be seen in Figure E.2 and Figure E.3, Windows users (versions 7, 8 and XP) predominate, followed by Mac OS X users. As for favorite browsers, Google Chrome leads, followed by Internet Explorer and Firefox.

Figure E.2 – Ranking of desktop OS.

Figure E.3 – Ranking of desktop browsers.


3. Analysis of mobile browsers. First, the percentage of users of the different operating systems is studied. Figure E.4 shows that approximately 63.81% are Android users, while 20.45% opt for iOS. The share of the remaining operating systems is almost negligible.

Figure E.4 – Ranking of mobile OS.

As for the browsers preferred by mobile users, Figure E.5 shows that the five major browsers are Chrome, Safari, Android, UC Browser and Opera.

Figure E.5 – Ranking of mobile browsers.

From the study, the compatibility tables can be developed, covering the operating systems and browsers most commonly used on both desktop and mobile devices.

The tables from Table E.1 to Table E.3 illustrate that the application is supported on the most common desktop browsers.

Table E.1 – Windows browsers compatibility

OS WINDOWS

Browser (version)               WebRTC   Wowza streaming and recording   Player
Chrome 47.0.2526.106 m          Yes      Yes                             Yes
Firefox 42.0 (Mozilla/5.0)      Yes      Yes                             Yes
Internet Explorer 11            No       No                              Yes
Opera 34.0.2036.47              Yes      Yes                             Yes
Edge 25.10586.63.0              No       No                              Yes

Table E.2 – MAC browsers compatibility

OS MAC

Browser (version)               WebRTC   Wowza streaming and recording   Player
Chrome 47.0.2526.106 m          Yes      Yes                             Yes
Firefox 42.0 (Mozilla/5.0)      Yes      Yes                             Yes
Safari 9.0.2                    No       No                              Yes
Opera 34.0.2036.47              Yes      Yes                             Yes


Table E.3 – Linux (Ubuntu) browsers compatibility

OS Linux (Ubuntu)

Browser (version)               WebRTC   Wowza streaming and recording   Player
Firefox 42.0 (Mozilla/5.0)      Yes      Yes                             Yes
Chromium 37.0.2062.120          No       No                              Yes
Opera 9.80 (Version/12.16)      No       No                              Yes
Epiphany 3.18                   No       No                              Yes
Chrome 46.0.2490.86             Yes      Yes                             Yes

Regarding mobile devices, Table E.4 and Table E.5 show the application compatibility on both the Android and iOS operating systems. On iOS the application does not work properly, but iOS only accounts for about 20% of mobile OS usage, and smartphones are nowadays less used for videoconferencing.

Table E.4 – Android browsers compatibility

OS: Android

Browser              | Chrome       | Firefox                  | Opera           | UC Browser | Internet (stock)
Browser version      | 47.0.2526.83 | Mozilla/5.0 Firefox/43.0 | 34.0.2044.98679 | 10.8.0.718 | 1.5.18
WebRTC compatibility | X            | X                        | X               | –          | –
Player compatibility | X            | X                        | X               | X          | X

Table E.5 – iOS browsers compatibility

OS: iOS

Browser              | Safari       | Chrome        | Firefox | Dolphin
Browser version      | Safari 601.1 | 47.0.2526.107 | 1.4     | 9.51
WebRTC compatibility | –            | –             | –       | –
Player compatibility | X            | X             | X       | X
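Since every tested iOS browser fails the WebRTC check yet passes the player check, a viewer-side fallback can be chosen when the page loads. The helper below is a hypothetical sketch, not code from the thesis: the function name `chooseViewerMode` and its string return values are assumptions. It reflects the tables' conclusion that HLS playback works on all tested browsers, including every iOS browser of Table E.5, whereas WebRTC reception does not.

```javascript
// Hypothetical helper (names are illustrative, not from the thesis):
// pick the viewing mode for a client based on its capabilities.
//   "webrtc" -> low-latency WebRTC reception
//   "hls"    -> HTTP Live Streaming playback (works on all tested browsers)
function chooseViewerMode(userAgent, webrtcSupported) {
  // iOS devices never pass the WebRTC check in Table E.5.
  const isIOS = /iPad|iPhone|iPod/.test(userAgent);
  if (webrtcSupported && !isIOS) {
    return "webrtc";
  }
  return "hls";
}
```

In this sketch the user-agent test is only a safety net; the decisive input is the feature-detection result, so the fallback also covers desktop browsers without WebRTC support such as Internet Explorer.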

