Statistical Learning Based Congestion Control for Real-Time Video Communication Tongyu Dai, Xinggong Zhang, Yihang Zhang, and Zongming Guo, Member, IEEE

1 Statistical Learning Based Congestion Control for Real-time Video Communication Tongyu Dai, Xinggong Zhang, Yihang Zhang, and Zongming Guo, Member, IEEE Abstract—With the increasing demands on interactive video Sender Receiver applications, how to adapt video bit rate to avoid network congestion has become critical, since congestion results in self- video source inflicted delay and packet loss which deteriorate the quality of Congestion Congestion Encoder packets Network Decoder real-time video service. The existing congestion control is hard control control to simultaneously achieve low latency, high throughput, good feedback packets adaptability and fair bandwidth allocation, mainly because of the hardwired control strategy and egocentric convergence objective. Fig. 1: The architecture of real-time video transmission. To address these issues, we propose an end-to-end statistical learning based congestion control, named Iris. By exploring the underlying principles of self-inflicted delay, we reveal that congestion delay is determined by sending rate, receiving rate of HTTP will essentially bring high latency [12], which is not and network status, which inspires us to control video bit rate suitable for real-time video transmission. using a statistical-learning congestion control model. The key The second kind of rate adaptation related work is conges- idea of Iris is to force all flows to converge to the same queue load, and adjust the bit rate by the model. All flows keep a small tion control based on transport layer. As shown in Figure 1, and fixed number of packets queuing in the network, thus the fair congestion control plays an important role in the real-time bandwidth allocation and low latency are both achieved. Besides, video transmission, which adapts video bit rate to avoid self- the adjustment step size of sending rate is updated by online inflicted delay and packet loss. Many rate adaptation schemes learning, to better adapt to dynamically changing networks. [13–18] have been proposed for video calling or video con- We carried out extensive experiments to evaluate the performance of Iris, with the implementations of transport layer ferencing. Most of them focus on rate allocation upon given (UDP) and application layer (QUIC) respectively. The testing bandwidth [13, 14] or relay server selection [17, 18]. Seldom environment includes emulated network, real-world Internet and of them considers how to reduce end-to-end congestion delay. commercial LTE networks. Compared against TCP flavors and In addition to the conventional TCP-like algorithms [19, 20], state-of-the-art protocols, Iris is able to achieve high bandwidth there are also some end-to-end congestion control designed for utilization, low latency and good fairness concurrently. Especially over QUIC, Iris is able to increase the video bitrate up to 25%, real-time video streaming or low-latency transmission, which and PSNR up to 1dB. can limit the self-inflicted delay [15, 21, 22] or achieve TCP fairness [16]. Some methods also adjust video bit rate with Index Terms—Congestion control, real-time video streaming, low latency, statistical learning, adaptive adjustment. online learning [23, 24]. But these existing algorithms still have some flaws, mainly including the following two aspects. • Non-coexistence of high throughput and low latency. I. INTRODUCTION Considering only packet loss will lead to high queuing ITH the widespread deployments of LTE/WiFi wire- delay, but the algorithms overly concerned about delay less networks and the forthcoming 5G [1], inter- also lead to low throughput [25]. arXiv:1905.05998v2 [cs.MM] 16 May 2019 W active video applications are growing exponentially, from • Hardwired control strategy. Most methods adjust send- the mobile video chatting, such as Skype [2], FaceTime, ing rate with fixed step size or multiplier [26]. This to AR/VR streaming [3] and cloud gaming [4, 5]. These manual mapping cannot always be optimal in changing video applications require not only higher bandwidth but networks, resulting in performance degradation. also lower transmission delay. However, real-world network • Egocentric convergence objective. Some algorithms capacity is constrained, especially in wireless networks with based on objective function are self-centered and lack unpredictable dynamics (e.g. random packet loss, channel communication between concurrent data streams, thus fading, etc) [6, 7]. It imposes great challenges on nowadays thye can not keep the same convergence goal for different video bitrate adaptation, which adjusts video streaming bit clients, which leads to unfair bandwidth allocation. rate to reduce self-inflicted delay and loss. Motivated by these issues, to obtain higher video transmis- The existing work related to rate adaptation can be mainly sion quality, we probe into the essence of congestion control divided into two categories. The first is the research of and consider whether it is possible to design an algorithm that adaptive bit rate (ABR) algorithms over application layer, achieves the goals of low latency, high channel utilization, including [8–11]. They usually use HTTP as the transport good adaptability to changing networks and fair bandwidth protocol and adjust the video bit rate according to bandwidth allocation. Some recently proposed algorithms [27, 28] have estimation and buffer state [9]. However, the underlying TCP enlightened us: a congestion control with low-latency fairness 2 objective and a learned rate adjustment strategy. utilization of bandwidth [25]. Moreover, they will be starved In this paper, we start with an in-depth investigation into when sharing a bottleneck link with concurrent loss-based network congestion in LTE, WiFi and Internet, to explore the flows [34]. Therefore, these conventional methods are not underlying principles of congestion delay. The data-driven suitable for real-time video streaming. analysis reveals that there is a strong correlation between There are also many studies focusing on special cases of transmission latency, sending rate and receiving rate. In- network environments, including the algorithm customized for spired by the observations, we have designed Iris, an end- datacenters [35–37], cellular networks [38, 39], Web applica- to-end learning-based congestion control algorithm for real- tions [40] and so on. These solutions yield good performance time video streaming. It mainly consists of two components: under the specific network conditions, but can not improve the low-latency fairness model and learning-based rate control performance of video streaming transmission. The algorithms strategy. The fairness model forces all streams to keep a small specially designed for real-time video transmission mainly and fixed number of packets queuing in network, achieving include [15, 21, 41]. They use packet loss and delay as fairness and low congestion delay. The learning-based rate congestion metrics, and empirically set some fixed thresholds. control builds a statistical function between round-trip time When the congestion metrics are higher or lower than these (RTT), sending rate and receiving rate, based on online linear thresholds, the sending rate is adjusted accordingly. As we regression learning. Then it is used to determine the proper mentioned earlier, this hardwired control strategy can not sending rate, which avoids fixed adjustment step size and maintain effectiveness in changing networks, resulting in converges to the fairness objective more quickly. performance degradation. The contributions of this paper can be summed up in the Over the past decade, more and more researchers have following three aspects: abandoned the TCP-like hardwired mapping rate control • We explore the underlying reasons for congestion delay which maps events to fixed reactions (e.g. using fixed thresh- in video transmission and reveal the correlation between olds or step sizes). They prefer to generate effective control transmission latency, sending rate and receiving rate, strategies through algorithms rather than handicraft, such as which inspires us to design a low-latency algorithm with PCC [42], Verus [43], Remy [27] and Vivace [23]. Especially, statistical learning. Remy replaces the human designed algorithm with an offline • An adaptive bitrate adjustment scheme is introduced. optimization scheme that searches for the best scheme within According to the learning model, the rate adjustment step its assumed network scenario. But the performance drastically size can be periodically updated online, which avoids the degrades when the actual network conditions deviate from hardwired control strategy to better adapt to dynamically its assumptions [44]. PCC and Vivace propose to empirically changing networks. observe and adopt actions that result in higher performance, • A low-latency fairness model which can be proved the- but they have to try many times before deciding on the best oretically is introduced. With the estimation of network action. The lag selection will affect the delay performance status, the fairness model indicates the direction and size of real-time video streaming. Google proposes BBR [22] to of delay adjustment, so that a fair share of bandwidth and address the limitations of conventional TCP, which tries to low latency are guaranteed. make the number of inflight packets (i.e. data sent but not yet acknowledged) converge to bandwidth-delay product (BDP). The paper is organized as follows: Section II introduces the However, it has poor performance in TCP-fairness, and its

Load more