MobiStreams: A Reliable Distributed Stream Processing System for Mobile Devices

Huayong Wang, MIT, Massachusetts, USA
Li-Shiuan Peh, MIT, Massachusetts, USA

Abstract: Multi-core phones are now pervasive. Yet, existing applications rely predominantly on a client-server computing paradigm, using phones only as thin clients, sending sensed information via the cellular network to servers for processing. This makes the cellular network the bottleneck, limiting overall application performance. In this paper, we propose MobiStreams, a Distributed Stream Processing System (DSPS) that runs directly on smartphones. MobiStreams can offload computing from remote servers to local phones and thus alleviate the pressure on the cellular network. Implementing DSPS on smartphones faces significant challenges: 1) multiple phones can readily fail simultaneously, and 2) the phones' ad-hoc WiFi network has low bandwidth. MobiStreams tackles these challenges through two new techniques: 1) token-triggered checkpointing, and 2) broadcast-based checkpointing. Our evaluations driven by two real world applications deployed in the US and Singapore show that migrating from a server platform to a smartphone platform eliminates the cellular network bottleneck, leading to 0.78∼42.6X throughput increase and 10%∼94.8% latency decrease. Also, MobiStreams' fault tolerance scheme increases throughput by 230% and reduces latency by 40% vs. prior state-of-the-art fault-tolerant DSPSs.

Keywords: stream computing; mobile computing; reliability

I. INTRODUCTION

Smartphones have become powerful. Nowadays, it is common for a smartphone to have a quad-core CPU of 1.5∼2.0GHz, such as Samsung Galaxy S4, Lenovo K860i and Huawei Ascend P6. Besides, octa-core 2GHz CPUs for mobile phones will come to market soon [1]. These phones' computation capability is comparable to that of servers 7 years ago in terms of the total processor frequency on a single chip (Intel's first quad-core CPU was released in 2006). Meanwhile, smartphone penetration rates in the U.S. and Singapore have reached 60% and 90% [2, 3]. Yet, existing systems and applications typically use smartphones as data-collecting devices or thin clients, transferring all sensed data from the phones to back-end servers for computing.

These systems and applications have several shortcomings stemming from the client-server computing paradigm which relies on the slow, overloaded, costly cellular network. The 3G cellular network is already at capacity in many countries. Furthermore, it is projected that global monthly mobile data traffic will increase 18-fold from 597 petabytes in 2011 to 10.8 exabytes in 2016 [4]. While the 4G/LTE cellular network can increase the bandwidth by 20X, the projected demand will still exceed capacity in 2016.

A popular client-server computing paradigm used for real-time data analytics today is Distributed Stream Processing Systems (DSPSs), such as IBM's InfoSphere Streams [5], Yahoo's S4 [6] and Streambase [7]. DSPSs have been applied to many industries, ranging from transportation to healthcare, energy management to finance [8]. DSPSs improve software development productivity by hiding the complexity of fault tolerance, workload balance and network management while providing the scalable computing power of a cluster of servers. A DSPS reads in live data streams from end user devices (e.g. smartphones or sensors) and distributes processing tasks to multiple servers in a data center. The servers process these data streams and return the computed results back to the user devices. The underlying hardware platform for DSPSs is thus usually a cluster comprising reliable servers interconnected by high-speed Ethernet. In this paper, we propose harnessing the increasingly powerful and pervasive smartphones as a distributed computing platform, processing sensed data directly on the phones themselves, thus avoiding the shortcomings of the client-server computing paradigm discussed. Specifically, we propose MobiStreams, a DSPS that runs directly on smartphones. We show that MobiStreams can increase the throughput of our two example applications by 0.78∼42.6X and reduce the latency by 10%∼94.8% vs. traditional server-based DSPSs (Section IV-A).

Porting a DSPS from reliable servers interconnected by high-bandwidth Ethernet to a collection of smartphones interconnected by ad-hoc WiFi brings about immense fault tolerance challenges. Smartphones can readily fail due to limited battery, mobility and/or the unreliable wireless network. Previously proposed DSPS fault tolerance schemes [9–15] are all for servers. They do not work well on smartphones for two reasons. First, the failure model of a smartphone platform is quite different from the assumptions made by previous work, which assumes small-scale (usually single node) failures and is based on the belief that such 1-safety guarantee is sufficient for most real-world applications [16]. However, in a smartphone platform, it is common that several phones fail simultaneously. For example, multiple phone users may move out of a communication region at the same time, losing WiFi connectivity. Second, prior fault-tolerance schemes impose substantial overhead on the performance of DSPS applications, as they require sending a large amount of extra data over the network, while the phone-to-phone ad-hoc WiFi network in a smartphone platform has limited bandwidth. Therefore, porting the existing fault tolerance schemes naively to smartphone platforms leads to poor resilience and performance. An ideal DSPS should be resilient to many faults, while adding little runtime overhead for fault tolerance.

MobiStreams is a checkpoint-based reliable DSPS tailored for smartphones. It can overcome large scale burst failures while providing good performance. MobiStreams comprises two new approaches to reduce the checkpointing overhead. 1) Token-triggered checkpointing: MobiStreams introduces tokens to coordinate the checkpointing activities. The tokens are generated by source operators. The tokens trickle down the stream graph, triggering each operator to checkpoint its own state. Token-triggered checkpointing avoids the redundant data saving in prior fault tolerance schemes and hence reduces the amount of the data to be saved during a checkpoint. 2) Broadcast-based checkpointing: When sending checkpoint data over the network, MobiStreams splits up the data transmission into multiple phases. In the first few phases, it uses unreliable UDP broadcasts. The UDP broadcasting avoids redundant data transmission over the network and reduces network overhead.

II. BACKGROUND AND MOTIVATION

A. Distributed Stream Processing System

A DSPS consists of operators and connections between operators. Fig. 1.a illustrates a DSPS and a stream application running on the DSPS. The application contains ten operators. Each operator is a piece of program code executed repeatedly to process its input data. Whenever an operator finishes processing a unit of input data, it produces output data and sends them to the next operator in the graph. Each unit of the data passed between two operators is called a tuple. The tuples sent in a connection between two operators form a data stream. A directed acyclic graph, termed query network, specifies the producer-consumer relations between operators. The query network starts from source operators, which are responsible for fetching data from external data sensors, and terminates at sink operators, which publish results to the end users. Multiple operators can run on the same node, and a group of operators on a node can be treated as a single super operator. Without loss of generality, we assume each node has only one operator. Consequently, an operator is the smallest unit of work that can be checkpointed and recovered independently.

The structure of a stream application can be represented at a higher level based on the interaction between the nodes, as shown in Fig. 1.b. The nodes that run source and sink operators are called source and sink nodes respectively. The nodes that run other operators in the DSPS are called computing nodes. In Fig. 1.b, node 1 is the upstream node of node 3, while node 5 is node 3's downstream node.

Fig. 1.c illustrates the deployment of a conventional server-based DSPS. Smartphones are only used as data-collecting devices. The uplink cellular network is usually a bottleneck because of its low bandwidth. MobiStreams aims to offload most of the computing onto smartphones, as shown in Fig. 1.d.

Figure 1. An example of DSPS. (a) DSPS represented by operators and a query network. (b) DSPS represented by nodes and a high-level query network. (c) A DSPS deployment on servers, using smartphones as data-collecting devices. (d) A DSPS deployment on smartphones.

S0: data from previous bus stop, N: noise filter, A: prediction model for bus arrival time, L: prediction model for alighting passengers, S1:

S0: data from previous intersection, S1: data from smartphone cameras, C: color filter, A: shape filter, M: motion filter, V: voting filter, G: group, P: SVM prediction, K: sink (to next intersection).
Figure 3. The DSPS at each intersection of the SignalGuru application. The operators with the same color are on the same node.
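
The operator and query-network model of Section II-A, together with the token-triggered checkpointing idea outlined in the introduction, can be illustrated with a small single-process sketch. This is not the MobiStreams implementation: the class names `Operator` and `QueryNetwork`, the tuple-counting state, and the in-memory queue standing in for network connections are all hypothetical. In particular, the rule that an operator checkpoints only after a token has arrived on every one of its input streams is an assumption made for this sketch; the text above states only that tokens are generated by source operators and trickle down the stream graph.

```python
from collections import defaultdict, deque


class Operator:
    """Hypothetical stateful operator: it counts the tuples it has seen."""

    def __init__(self, name):
        self.name = name
        self.state = 0          # running count of processed tuples
        self.checkpoint = None  # last saved copy of the state

    def process(self, tup):
        self.state += 1
        return tup              # pass the tuple through unchanged

    def save_checkpoint(self):
        self.checkpoint = self.state


class QueryNetwork:
    """Directed acyclic graph of operators; each edge is a data stream."""

    def __init__(self):
        self.ops = {}
        self.downstream = defaultdict(list)
        self.upstream = defaultdict(list)

    def add_operator(self, name):
        self.ops[name] = Operator(name)

    def connect(self, src, dst):
        self.downstream[src].append(dst)
        self.upstream[dst].append(src)

    def sources(self):
        # Source operators are those with no upstream operator.
        return [n for n in self.ops if not self.upstream[n]]

    def inject(self, source, tup):
        """Push one tuple from a source operator through the graph."""
        queue = deque([(source, tup)])
        while queue:
            name, item = queue.popleft()
            out = self.ops[name].process(item)
            for nxt in self.downstream[name]:
                queue.append((nxt, out))

    def run_checkpoint(self):
        """Sources emit a token; an operator checkpoints once tokens have
        arrived on all of its input streams (assumption of this sketch)."""
        tokens_seen = defaultdict(int)
        order = []
        queue = deque(self.sources())
        while queue:
            name = queue.popleft()
            self.ops[name].save_checkpoint()
            order.append(name)
            for nxt in self.downstream[name]:
                tokens_seen[nxt] += 1
                if tokens_seen[nxt] == len(self.upstream[nxt]):
                    queue.append(nxt)
        return order


def demo():
    # Two sources feed one computing operator, which feeds one sink.
    net = QueryNetwork()
    for name in ("src1", "src2", "filter", "sink"):
        net.add_operator(name)
    net.connect("src1", "filter")
    net.connect("src2", "filter")
    net.connect("filter", "sink")
    net.inject("src1", {"v": 1})
    net.inject("src2", {"v": 2})
    net.inject("src1", {"v": 3})
    order = net.run_checkpoint()
    states = {n: op.checkpoint for n, op in net.ops.items()}
    return order, states
```

Calling `demo()` builds a four-operator query network, streams three tuples through it, and then runs one checkpoint round. Each operator saves only its own state, and no operator checkpoints before its upstream operators have, which is the ordering property the token is meant to convey.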