Movidiff: Enabling Service Differentiation for Mobile Video Apps
Total Page:16
File Type:pdf, Size:1020Kb
MoViDiff: Enabling Service Differentiation for Mobile Video Apps Satadal Sengupta∗, Vinay Kumar Yadavy, Yash Sarafz, Harshit Gupta∗, Niloy Ganguly∗, Sandip Chakraborty∗, Pradipta Dex ∗Dept. of CSE, IIT Kharagpur, Kharagpur 721302, India; yDept. of CSE, IIT Patna, Patna 801103, India zDept. of CSE, IIT BHU, Varanasi 221005, India; xDept. of CS, Georgia Southern University, GA 30458 USA Email: [email protected], fvk7055,[email protected], fniloy,[email protected], [email protected] Abstract—Among the mobile applications contributing to the User on Skype call User watching a video on surging Internet traffic, video applications are some of the biggest on her Smart-watch YouTube simultaneously contributors. Most of these video applications use HTTP/HTTPS tunneling making it difficult to apply port based or packet Gateway Device data based identification of flows. This makes it challenging for Internet network operators to enforce bandwidth regulation policies for PERSONAL app based service differentiation due to lack of flow identification MOBILE HUB mechanisms for mobile apps. We explore a packet data agnostic feature of video flows, namely packet-size, to identify the flows. We show that it is possible to train a classifier that can distinguish Fig. 1: Use Case: Multiple mobile devices connected to the packets from streaming and interactive video apps with high network through a Personal Mobile Hub. The video applica- accuracy. We design and implement a system, called MoViDiff, tions running simultaneously on the mobile devices must share with this classifier at the core, that allows bandwidth regulation between video traffic of two different categories, streaming and the bandwidth. interactive. We show that we can achieve an average accuracy of 96% in classifying the traffic, with the maximum accuracy reaching as high as 98%. the preferred application, if bandwidth requirement for both I. INTRODUCTION cannot be supported? Unfortunately, with current techniques, it is difficult for the network to distinguish among traffic from Video applications on different mobile devices, including different applications originating from a single device since wearables, are steadily adding to the volume of Internet traffic. typically the traffic is tunneled over HTTP/HTTPS. In addi- To overcome limited connectivity of the devices, often the tion, often the data traffic is encrypted making it impossible traffic is routed through a mobile hub, like a smartphone [1]. to use traditional techniques to identify traffic sources. When all the video traffic is funneled through a single device, Traffic identification relies on identifying unique character- identifying the source of the traffic can be challenging. Con- istics of the application traffic which can be used as signatures. sider a scenario as shown in Fig. 1, where our smartphone acts Common signatures used to identify desktop applications are as a personal mobile hub that interconnects multiple devices, the port number used by an application, host address of the such as smart-watches, fitness monitoring devices (e.g., Fitbit), server, or any other identifier in the network packets [5]. This smart switches for door control and so on. All of these devices assumes unencrypted traffic that allows deep packet inspection. are connected to the Internet through the mobile hub via Since majority of mobile app traffic is tunneled over HTTP wireless tethering. In this setting, while watching a streaming [6], [7], simple port based classification does not work. For video from YouTube on the smartphone, one can chat with a most mobile video applications, where the data packets are friend via Skype on the smartwatch [2]; this is prevalent in encrypted and sent over HTTPS, packet inspection based many scenarios, for example, in remote medical help, remote approaches will fail. tech support, a long-distance couple trying to emulate the Although inspection based approaches may fail, other side- experience of watching a movie together [3], etc. 1 channel information can be exploited to identify the traffic Unfortunately, in such scenarios, the limited wireless band- categories. For example, streaming and interactive traffic, width of the smartphone to the Internet now must be shared by which are our major concerns in this paper as discussed multiple bandwidth hungry video applications. This can hurt with the use case illustrated in Fig. 1, will have different the Quality of Service (QoS) for each application. Assume that traffic patterns due to inherent differences in the characteristics the user can pre-configure preference between YouTube and of streaming video and interactive video traffic. Assuming Skype. Then, would it be possible to ensure QoS for at least realistically that a user cannot engage in multiple video apps of 1Applications such as Rabbit [4] have also been developed, which combine same category simultaneously (like watching videos over two streaming and video-chatting experience. YouTube instances), it is possible to identify the source type YouTube Skype 1e+07 HotStar 1e+07 Google Hangouts 1e+06 1e+06 100000 100000 10000 10000 1000 1000 100 100 TERMINAL EMULATOR 10 10 BANDWIDTH INTERNET Bytes received (log-scale) 1 Bytes received (log-scale) 1 THROTTLING 0 20 40 60 80 100 120 140 0 20 40 60 80 100 120 140 TCPDUMP Time (seconds) Time (seconds) MOBILE WIFI ROUTER (a) Streaming Video (b) Interactive Video Moto X 2nd Gen ASUS RT-3200AC Fig. 2: Traffic distribution in bytes/sec for two types of mobile Fig. 3: In-Laboratory Testbed Setup for Data video applications showing the difference in traffic characteristics. Collection when the video traffic categories are different. In a preliminary TABLE I: Traffic Trace Collection Scenarios study, we analyzed the temporal characteristics of traffic from App Application/s Total Resolution Trace Band- Category Trace Dura- width streaming video and interactive video sources running on Vol- tion (Mbps) mobile devices. As shown in Fig. 2, video apps within the ume Range same category show similar characteristics, while differing (MB) (mins) YouTube 304 360p, 720p Streaming significantly from apps of another category. Such properties HotStar 645 SD, HD 5-40 0.5 Video can be used to distinguish between data packets originating TED 188 HQ-off, on 1.0 Skype 442 Default 4.0 Interactive from video apps of different categories. Google Hangouts 230 Default 5-60 16.0 Video In this work, we investigate a simple packet-level feature of ooVoo 356 Default video traffic that can be used to identify the sources (x IV) – packet-size. Packet-size varies significantly between streaming the technique works well for popular apps. Other indirect and interactive videos. The feature is used to train a classifier information, like in-app advertisement sources, can also be to identify packets from pre-defined video sources, in presence used to reveal the flow identity, as shown by [10]. Statistical of mixed traffic flows. We design and implement a system methods can be used to identify patterns in app traffic. Dai et called MoViDiff, which can identify packets from different al. showed that by executing different apps in an emulator and video sources, and apply pre-configured bandwidth control gathering network traces from these executions, it is possible policies (x V). The core of MoViDiff is the classification engine to build signatures for identifying specific traffic sources [11]. which runs at the gateway connecting the end devices to the Yao et al. reduced the overhead of building signatures in a ISP’s network. We demonstrate the efficacy of the system recent work where they used context of signatures to improve using the scenario, as shown in Fig. 1 (x VI). The classification identification accuracy and scalability [12]. Automatic gener- results exhibit an accuracy of at least 92%, while reaching ation of classifiers for traffic identification has been presented up to 98% in some cases, thus propelling MoViDiff to work in [13], [14]. AppPrint also uses extensive traffic observations successfully as a bandwidth regulator. to build signatures based on header fields [15]. Since most of The key contributions of this work are, (1) We show that, these works rely on information that must be read from the using packet data agnostic properties of video traffic, it is packets, they are ill suited for classifying encrypted traffic. possible to identify sources of video traffic on a mobile Video traffic tends to exhibit more strongly identifiable flow device from mixed flows. (2) We implement a system, called patterns as compared to any generic traffic class. SkyTracer is MoViDiff, that demonstrates the efficacy of the packet data a tool that shows the patterns in Skype traffic [16]. Jesudasan et agnostic features in video traffic differentiation. MoViDiff can al. have identified generic features across different versions of be used by network operators, and enterprise IT managers for Skype that can be used as features for statistical identification resource optimization and policy enforcement. of Skype flows [17]. Similarly, there have been characteriza- tion of YouTube traffic for PC versions [18] and YouTube II. RELATED WORK mobile application [19]. Recently, Shi et al. also showed that Classifying traffic from mobile applications has spurred video source can be identified for encrypted traffic just by active research recently due to its practical necessity, as analyzing only few timing features of the packet trace [20]. well as, the challenges due to lack of features suitable for Our work builds on these works by carefully analyzing groups traffic classification. If the traffic is unencrypted, deep packet of mobile video applications, and finding statistical properties inspection (DPI) techniques, surveyed in [8], can help in clas- that can enable classification of flows from multiple video sification. For example, port number can help in identifying sources simultaneously. application groups, like browsing, email, maps [6]. Similarly, by peering into the HTTP header, one can read the HTTP III. EXPERIMENTAL IN-LABORATORY SETUP User-agent field that shows the application name.