An Optimized Hybrid Remote Display Protocol Using GPU-Assisted M-JPEG Encoding and Novel High-Motion Detection Algorithm
J Supercomput (2013) 66:1729–1748
DOI 10.1007/s11227-013-0972-1

Biao Song · Wei Tang · Tien-Dung Nguyen · Mohammad Mehedi Hassan · Eui Nam Huh

Published online: 6 July 2013
© Springer Science+Business Media New York 2013

Abstract In this paper, we design a novel hybrid remote display protocol for mobile thin-client systems. The remote frame buffer (RFB) protocol and the motion JPEG (M-JPEG) protocol are assigned to handle remote display tasks in the slow-motion region and the high-motion region, respectively. Graphics processing units (GPUs) perform part of the real-time JPEG compression task. A novel quality of experience (QoE)-based high-motion detection algorithm is also proposed to reduce network bandwidth consumption and server-side computing resource consumption. Continuity of screen delivery is preserved whenever JPEG compression is applied to different screen regions. The benefits of the proposed hybrid remote display approach are supported by comprehensive simulation studies.

Keywords Remote display protocol · RFB · M-JPEG · High-motion detection · GPU

B. Song · M.M. Hassan
College of Computer and Information Sciences, King Saud University, Riyadh, Kingdom of Saudi Arabia
B. Song e-mail: [email protected]
M.M. Hassan e-mail: [email protected]

W. Tang · T.-D. Nguyen · E.N. Huh
Innovative Cloud and Security Lab, Department of Computer Engineering, Kyung Hee University, 1 Seocheon-dong, Giheung-gu, Yongin-si, Gyeonggi-do 446-701, South Korea
e-mail: [email protected]
W. Tang e-mail: [email protected]
T.-D. Nguyen e-mail: [email protected]

1 Introduction

The rapid development of mobile networks and devices is driving research into mobile services.
During the last decade, the processing power of mobile devices (smartphones) and portable devices (laptops, tablet PCs) has improved greatly. Despite these advances in mobile hardware and local applications, remote mobile services delivered through thin-client computing are still attracting particular interest in mobile research and development. According to [1], the first advantage of mobile thin-client computing is that applications need not be tailored individually for each mobile platform. It has the potential to break the device-specific/OS-specific barrier for mobile and PC applications. Second, mobile thin-client computing handles the “heterogeneous degree of capability” problem for mobile and portable devices. For example, mobile phones still cannot provide enough local processing and storage resources to execute 3D virtual environments that require advanced graphical hardware [2], or applications that operate on large data sets, such as medical imaging applications [3]. Although portable devices may have enough capability to process 3D tasks locally, battery consumption limits their performance and service time.

Using thin-client computing, users can remotely access their services via a local viewer application and delegate the actual information processing to the remote server. The viewer application transfers user input to the server and renders the display updates received from the server. Many thin-client technologies have client versions for different mobile devices, such as Microsoft Remote Desktop Services (RDP) [4] and Virtual Network Computing (VNC) [5]. However, current technologies do not provide an efficient and widely applicable remote display solution for mobile thin-client computing.

All existing remote display solutions can be roughly categorized into structured and pixel-based encoding.
In the structured encoding domain, the remote display server intercepts elementary drawing commands from the application, OS, or graphics card, translates them into a format that is executable on the client device, and sends the translated instructions to the client. The client receives and executes these instructions to generate the screen display. Citrix XenApp [6], Windows RDP [4], THINC [7], and the MPEG-4 BiFS semantic remote display framework [1] belong to this category. These solutions perform well only for a limited range of applications or operating systems, since most of them (except THINC) need application/OS-level structure information, which is usually inaccessible to an external application such as the encoder application. Besides, these solutions require client-side hardware support to execute 3D virtual environments even if the 3D structure information is available. Thus, structured encoding technologies are not widely applicable on either the server or the client side.

On the other hand, pixel-based encoding solutions are more general than structured encoding. The remote display server intercepts the screen pixels from the hardware framebuffer, encodes the screen updates with different compression techniques, and sends the encoded screen updates to the client. Because of the variety of compression techniques, the performance of existing pixel-based encoding solutions varies. The original VNC, which adopts RFB as the encoding technique, suffers from high bandwidth consumption when transmitting high-motion screen updates. As the RFB protocol provides no adequate solution to support pixel-based encoding, the existing works [8, 9] divide the display into slow- and high-motion regions, which are encoded, respectively, by means of VNC drawing primitives and MPEG-4 AVC (a.k.a. H.264) frames.
This solution fully eliminates the high bandwidth consumption problem caused by VNC encoding. However, MPEG decoding still results in high computing resource consumption on the client device. Meanwhile, the motion detection approach is not fully optimized, since the switch from RFB to MPEG cannot be done transparently.

RFB-based JPEG compression was used in TightVNC [10] and TurboVNC [11] to reduce bandwidth and client CPU consumption. They use JPEG compression to encode the difference between neighboring frames rather than the frame itself. If JPEG compression is applied to each frame, the whole process can be viewed as a Motion JPEG (M-JPEG) encoding process. According to [12], M-JPEG has the following advantages: (i) minimum latency in image processing, and (ii) flexibility of splicing and resizing. Even if a few network packets containing frame information are lost during transmission, M-JPEG can still provide continuous streaming, whereas RFB-based JPEG compression may not work properly. Besides, existing thin-client applications using JPEG-based remote display protocols suffer from low frame rates, since real-time JPEG encoding is a challenging task for the CPU. The experiments in [13] show that it is impossible to guarantee 20–30 frames per second for large-screen JPEG compression without using parallel processing units.

In this paper, we propose a novel hybrid remote display design for the mobile thin-client system. The RFB protocol is chosen to handle the slow-motion remote display task. We adopt M-JPEG as our protocol for high-motion display. To further improve the encoding efficiency and reduce the response time, graphics processing units (GPUs) are used to perform part of the JPEG compression task.
We use an NVIDIA graphics card and NVIDIA CUDA, a parallel computing platform and programming model that enables dramatic increases in computing performance by harnessing the power of the GPU [14]. By using GPU-assisted M-JPEG compression, our remote display system is capable of providing real-time M-JPEG streaming with low latency and a high frame rate, which we consider the first contribution of this paper.

The second contribution is that the motion detection algorithm in our display protocol reduces network bandwidth consumption and server-side computing resource consumption. As M-JPEG is an intra-frame approach, the resizing and slicing of M-JPEG frames can be done transparently. Whenever the motion detection algorithm changes the size or position of the high-motion region, the continuity of screen delivery is preserved. Our proposed algorithm can detect several high-motion regions while considering the following factors: (i) the number of changed pixels, (ii) the network environment, (iii) the resource utilization on the server side, and (iv) the video quality preference of the client. The motion detection algorithm first assigns each 8 × 8 display region a QoE value indicating how much M-JPEG compression outperforms RFB encoding for that region. The QoE values then form a QoE matrix, and the high-motion region detection problem is solved with a dynamic programming algorithm that finds the sub-matrix with the maximum sum.

We also present a four-module, four-thread implementation method. The multi-thread design greatly reduces the overall compression time by running tasks in parallel across memory, CPU, GPU, and network I/O. The proposed display technology with the novel motion detection algorithm is compared with existing solutions, and experimental results are presented to support the claims.

The rest of this paper is organized as follows.
Section 2 discusses related work in the thin-client domain. Section 3 describes the proposed hybrid remote display protocol. Section 4 details the motion detection algorithm. We present our implementation method and performance results in Sect. 5 and conclude with suggestions for future work in Sect. 6.

2 Related work

Based on the position of interception, remote display protocols can be categorized into three distinct groups. At the application/OS layer, Microsoft RDP [4] is a well-known and widely used solution that provides a user with a graphical interface to another computer.