
COVER FEATURE EMBEDDED DEEP LEARNING

Deep Learning for the Internet of Things

Shuochao Yao and Yiran Zhao, University of Illinois Urbana-Champaign (UIUC); Aston Zhang, Amazon AI; Shaohan Hu, IBM Thomas J. Watson Research Center; Huajie Shao and Chao Zhang, UIUC; Lu Su, State University of New York, Buffalo; Tarek Abdelzaher, UIUC

How can the advantages of deep learning be brought to the emerging world of embedded IoT devices? The authors discuss several core challenges in embedded and mobile deep learning, as well as recent solutions demonstrating the feasibility of building IoT applications that are powered by effective, efficient, and reliable deep learning models.

The proliferation of internetworked mobile and embedded devices leads to visions of the Internet of Things (IoT), giving rise to a sensor-rich world where physical things in our everyday environment are increasingly enriched with computing, sensing, and communication capabilities. Such capabilities promise to revolutionize the interactions between humans and physical objects.

Indeed, significant research efforts have been spent toward building smarter and more user-friendly applications on mobile and embedded devices. At the same time, recent advances in deep learning have greatly changed the way that computing devices process human-centric content such as images, video, speech, and audio. Applying deep neural networks to IoT devices could thus bring about a generation of applications

32 PUBLISHED BY THE IEEE COMPUTER SOCIETY 0018-9162/18/$33.00 © 2018 IEEE

capable of performing complex sensing and recognition tasks to support a new realm of interactions between humans and their physical surroundings. This article discusses four key research questions toward the realization of such novel interactions between humans and (deep-) learning-enabled physical things, namely: What deep neural network structures can effectively process and fuse sensory input data for diverse IoT applications? How to reduce the resource consumption of deep learning models such that they can be efficiently deployed on resource-constrained IoT devices? How to compute confidence measurements in the correctness of deep learning predictions for IoT applications? Finally, how to minimize the need for labeled data in learning?

To elaborate on the above challenges, first observe that IoT applications often depend on collaboration among multiple sensors, which requires designing novel neural network structures for multisensor data fusion. These structures should be able to model complex interactions among multiple sensory inputs over time and effectively encode features of sensory inputs that are pertinent to desired recognition and other tasks. We review a general deep learning framework for this purpose, called DeepSense,1 that provides a unified yet customizable solution for the learning needs of various IoT applications. It demonstrates that certain combinations of deep neural network topologies are particularly well-suited for learning from sensor data.

Second, IoT devices are usually low-end systems with limited computational, energy, and memory resources. One key impediment in deploying deep neural networks on IoT devices therefore lies in the high resource demand of trained deep neural network models. While existing neural network compression algorithms can effectively reduce the number of model parameters, not all of these models lead to matrix representations that can be efficiently implemented on IoT devices. Recent work describes a particularly effective deep learning compression algorithm, called DeepIoT,2 that can directly compress the structures of commonly used deep neural networks. The compressed model can be deployed on commodity devices. A large proportion of execution time, energy, and memory can be reduced with little effect on the final prediction accuracy.

Third, reliability assurances are important in cyber-physical and IoT applications. The need for offering such assurances calls for well-calibrated estimation of uncertainty associated with learning results. We present a simple method, called RDeepSense,3 for generating well-calibrated uncertainty estimates for the predictions computed in deep neural networks. It achieves accurate and well-calibrated estimations by changing the objective function to faithfully reflect prediction correctness.

Finally, labeling data for learning purposes is time-consuming. One must teach sensing devices to recognize objects and concepts without the benefit of (many) examples where ground truth values for such objects and concepts are given. Unsupervised and semisupervised solutions are needed to solve the challenge of learning with limited labeled (and mostly unlabeled) samples, while approaching the performance of learning from fully labeled data.

We elaborate on these core problems and their emerging solutions to help lay a foundation for building IoT systems enriched with effective, efficient, and reliable deep learning models.

ON DEEP LEARNING MODELS FOR SENSOR DATA

A key research challenge toward the realization of learning-enabled IoT systems lies in the design of deep neural network structures that can effectively estimate outputs of interest from noisy time-series multisensor measurements.1

Despite the large variety of embedded and mobile sensing tasks in IoT contexts, one can generally categorize them into two common subtypes: estimation tasks and classification tasks, depending on whether prediction results are continuous or categorical, respectively. The question therefore becomes whether or not a general neural network architecture exists that can effectively learn the structure of models needed for estimation and classification tasks from sensor data. Such a general deep learning neural network architecture would, in principle, overcome disadvantages of today's approaches that are based on analytical model simplifications or the use of hand-crafted engineered features.

Traditionally, for estimation-oriented problems such as tracking and localization, sensor inputs are processed based on the physical models of the phenomena involved. Sensors generate measurements of physical quantities such as acceleration and angular velocity. From these measurements, other physical quantities are derived (such as displacement through double integration of acceleration over time). However, measurements of commodity sensors are noisy.
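To make the physical-model approach concrete, the following sketch derives displacement from acceleration samples by trapezoidal double integration. The trace and noise level are invented for illustration and are not data from the article:

```python
import random

def double_integrate(accel, dt):
    """Derive displacement from acceleration samples by double
    (trapezoidal) integration, the classical physical-model approach."""
    velocity, displacement = 0.0, 0.0
    prev_a = accel[0]
    for a in accel[1:]:
        new_v = velocity + 0.5 * (prev_a + a) * dt   # integrate a -> v
        displacement += 0.5 * (velocity + new_v) * dt  # integrate v -> d
        prev_a, velocity = a, new_v
    return displacement

# Ideal constant acceleration of 2 m/s^2 for 1 s, sampled at 100 Hz:
dt = 0.01
print(round(double_integrate([2.0] * 101, dt), 6))  # d = a*t^2/2 = 1.0 m

# The same trace with zero-mean sensor noise: the error in the derived
# displacement accumulates over time rather than averaging out.
random.seed(0)
noisy = [2.0 + random.gauss(0.0, 0.5) for _ in range(101)]
print(round(double_integrate(noisy, dt), 3))
```

The second print illustrates the point made above: even unbiased noise, once integrated twice, produces a drifting displacement estimate.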

MAY 2018 33

[Figure 1 depicts K sensor inputs over T time intervals of width τ, each passing through three individual convolutional layers, a flatten-and-concatenation step, three merge convolutional layers, two stacked GRU recurrent layers, and a type-specific output layer producing single or multiple outputs.]
FIGURE 1. Main architecture of the DeepSense framework.

The noise in measurements is nonlinear and might be correlated over time, which makes it hard to model. It is therefore challenging to separate signal from noise, leading to estimation errors and bias.

For classification-oriented problems, such as activity and context recognition, a typical approach is to compute appropriate features derived from raw sensor data. These hand-crafted features are then fed into a classifier for training. Designing good hand-crafted features can be time-consuming; it requires extensive experiments to generalize well to diverse settings such as different sensor noise patterns and heterogeneous user behaviors.

A general deep learning framework can effectively address both of the aforementioned challenges by automatically adapting the learned neural network to complex correlated noise patterns while, at the same time, converging on the extraction of maximally robust signal features that are most suited for the task at hand. A recent framework, called DeepSense, demonstrates a case for the feasibility of such a general solution.

As shown in Figure 1, DeepSense integrates convolutional neural networks (CNNs) and recurrent neural networks (RNNs). Sensory inputs are aligned and divided into time intervals for processing time-series data. For each interval, DeepSense first applies an individual CNN to each sensor, encoding relevant local features within the sensor's data stream. Then, a (global) CNN is applied on the respective outputs to model interactions among multiple sensors for effective sensor fusion. Next, an RNN is applied to extract temporal patterns. At last, either an affine transformation or a softmax output is used, depending on whether we want to model an estimation or a classification task.

This architecture solves the general problem of learning multisensor fusion tasks for purposes of estimation or classification from time-series data. For estimation-oriented problems, DeepSense learns the physical system and noise models to yield outputs from noisy sensor data directly. The neural network acts as an approximate transfer function. For classification-oriented problems, the neural network acts as an automatic feature extractor encoding local, global, and temporal information.
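The data flow just described can be sketched at a purely structural level. The toy operations below (a moving average standing in for the learned convolutions, an exponential blend standing in for the GRUs) are stand-ins of our own and not the actual DeepSense layers; only the wiring follows the architecture:

```python
import math

def conv1d(xs, k=3):
    # moving average stands in for a learned convolution
    return [sum(xs[i:i + k]) / k for i in range(len(xs) - k + 1)]

def recurrent(states):
    # exponential blend stands in for a GRU over time intervals
    h = 0.0
    for s in states:
        h = 0.5 * h + 0.5 * s
    return h

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def deepsense_sketch(intervals, n_classes=3):
    per_interval = []
    for sensors in intervals:                    # T time intervals
        local = [conv1d(s) for s in sensors]     # individual CNN per sensor
        merged = conv1d([sum(col) for col in zip(*local)])  # merge CNN
        per_interval.append(sum(merged))         # flattened interval feature
    h = recurrent(per_interval)                  # RNN over intervals
    # softmax head for classification; an affine head would serve estimation
    return softmax([h * (c + 1) for c in range(n_classes)])

# Two time intervals, each with K = 2 sensors of 5 samples:
intervals = [[[0.1, 0.2, 0.3, 0.2, 0.1], [0.0, 0.1, 0.0, 0.1, 0.0]],
             [[0.2, 0.2, 0.2, 0.2, 0.2], [0.1, 0.1, 0.1, 0.1, 0.1]]]
probs = deepsense_sketch(intervals)
print(len(probs), round(sum(probs), 6))  # class probabilities summing to 1
```

Swapping the final softmax for an affine transformation yields the estimation variant, mirroring the type-specific output layer in Figure 1.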

As a unified model, DeepSense can be easily customized for a specific IoT application. The application designer needs only to decide on the number of sensory inputs, the input/output dimensions, and the training objective function. The detailed mathematical formulation of DeepSense can be found in a related article.1

Encouraging results were reported on applying DeepSense in two representative sensing tasks: heterogeneous human activity recognition (HHAR) and user identification with biometric motion analysis (UserID). HHAR is a motion-sensor-based activity-recognition task. It is tested on new users who have not appeared in the training set. In contrast, UserID uses motion sensors for user identification from activities such as walking, biking, and climbing stairs.

To understand the contributions of different architectural components, variants of the DeepSense model were introduced by removing some design component(s) from the general architecture. DS-singleGRU simplifies the RNN by replacing its two-layer stacked GRU architecture with a single-layer GRU of a larger dimension, while keeping the number of parameters the same. DS-noIndvConv skips the convolutional subnets for individual sensors, keeping a single CNN that merges data from all sensors in each time window. Finally, DS-noMergeConv skips the global convolutional subnet that merges sensor data. Instead, it flattens the output of each individual convolutional subnet and concatenates them into a single vector as the input to the RNN.

These models (together with the overall DeepSense model) were compared to various custom-designed or hand-crafted baselines for each application, including HAR-RF,4 HAR-SVM,4 HAR-RBM, and HAR-MultiRBM5 for activity recognition, and GaitID6 and IDNet7 for user identification.

FIGURE 2. Performance metrics of the heterogeneous human activity recognition (HHAR) task with the DeepSense framework.

FIGURE 3. Performance metrics of the UserID task with the DeepSense framework.

Accuracy results in performing the HHAR and UserID tasks are illustrated in Figures 2 and 3, respectively. The DeepSense-based algorithms (including DeepSense and its three variants) outperform the other baseline algorithms by a large margin (that is, at least 10 percent for HHAR and at least 20 percent for UserID). The results offer anecdotal evidence that a general deep learning architecture can beat hand-crafted solutions designed for the individual application spaces. Although current work is by no means a consummate proof of generalizability, this property (if true) would be very important, because a main appeal of applying deep learning in IoT contexts lies in obviating the need for per-application customization of theoretical derivations and hand-crafted features. More research is needed to substantiate or refute the early evidence and to understand the limits of generalizability of learning models across IoT systems.
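Figures 2 and 3 report accuracy alongside macro and micro F1, which weight classes differently under the class imbalance common in sensing data. As a self-contained refresher, these metrics can be computed as follows (the labels are made up for illustration, not drawn from the HHAR data):

```python
from collections import Counter

def f1_scores(y_true, y_pred):
    """Return (macro F1, micro F1) for single-label classification."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1
    per_class = []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        per_class.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    macro = sum(per_class) / len(labels)      # unweighted mean over classes
    # micro F1 pools counts over classes; for single-label tasks it
    # coincides with accuracy
    t_tp, t_fp, t_fn = sum(tp.values()), sum(fp.values()), sum(fn.values())
    micro = 2 * t_tp / (2 * t_tp + t_fp + t_fn)
    return macro, micro

y_true = ["walk", "walk", "bike", "stairs", "bike", "walk"]
y_pred = ["walk", "bike", "bike", "stairs", "bike", "walk"]
macro, micro = f1_scores(y_true, y_pred)
print(round(macro, 3), round(micro, 3))  # 0.867 0.833
```

Macro F1 treats the rare "stairs" class as equal in weight to "walk", which is why the two averages diverge here.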


COMPRESSING NEURAL NETWORK STRUCTURES

Resource constraints of IoT devices remain an important impediment toward deploying deep learning models. A key question is therefore whether it is possible to compress deep neural networks, such as those described in the previous section, to a point where they fit comfortably on low-end embedded devices, enabling real-time "intelligent" interactions with their environment. Can a unified approach compress commonly used deep learning structures, including fully connected, convolutional, and recurrent neural networks, as well as their combinations? To what degree does the resulting compression reduce energy, execution time, and memory needs in practice?2

An illustration of such a compression framework, called DeepIoT,2 is shown in Figure 4. DeepIoT borrows the idea of dropping hidden elements from a widely used deep learning regularization method called dropout. The dropout operation gives each hidden element a dropout probability. During the dropout process, hidden elements can be pruned according to their dropout probabilities. A "thinned" network structure can thus be generated. The challenge is to set these dropout probabilities in an informed manner to generate the optimal slim network structure that preserves the accuracy of sensing applications while maximally reducing their resource consumption. An important purpose of DeepIoT is thus to find the optimal dropout probability for each hidden element in the neural network.

FIGURE 4. Overall DeepIoT system framework. Orange boxes represent dropout operations. Green boxes represent parameters of the original neural network.

To obtain the optimal dropout probabilities for nodes in the neural network, DeepIoT exploits the network parameters themselves. From the perspective of model compression, an element that is more redundant should have a higher probability of being dropped. A contribution of DeepIoT lies in exploiting a novel compressor neural network to solve this problem. It takes the model parameters of each layer as input, learns parameter redundancies, and generates the dropout probabilities accordingly. The compressor neural network is optimized jointly with the original neural network to be compressed, in an iterative manner that tries to minimize the loss function of the original IoT application.

FIGURE 5. The tradeoff between testing accuracy and memory consumption by models.
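The structural effect of dropout-probability-based pruning can be seen on a toy fully connected layer. The probabilities below are invented stand-ins for what a compressor network might emit; the point is that, unlike zeroing individual weights, removing whole hidden units shrinks the weight matrix itself (and the input dimension of the following layer):

```python
def prune_layer(weights, drop_prob, threshold=0.5):
    """Keep only the hidden units whose dropout probability is below
    threshold. weights: one row of input weights per hidden unit."""
    kept = [i for i, p in enumerate(drop_prob) if p < threshold]
    return [weights[i] for i in kept], kept

# A toy layer with 4 hidden units, each taking 3 inputs. Units the
# (hypothetical) compressor deems redundant get high dropout probability.
weights = [[0.2, -0.1, 0.4],
           [0.0,  0.0, 0.1],
           [0.5,  0.3, -0.2],
           [0.1,  0.0, 0.0]]
drop_prob = [0.1, 0.9, 0.2, 0.8]   # hypothetical compressor outputs
slim, kept = prune_layer(weights, drop_prob)
before = sum(len(row) for row in weights)
after = sum(len(row) for row in slim)
print(kept, f"{after}/{before} parameters kept")  # [0, 2] 6/12 parameters kept
```

Because the surviving units form a dense, smaller matrix, the savings translate directly into memory and multiply-accumulate reductions on commodity hardware, which is the property the article emphasizes.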

Evaluation shows that the DeepIoT compression algorithm is able to greatly reduce the network size, execution time, and energy consumption without hurting the prediction accuracy.2 We continue to use UserID as the running application example, and compare compression efficacy to that of several baselines, namely DyNS,8 SparseSep,9 and DyNS-Ext.

DyNS is a magnitude-based network pruning algorithm that prunes weights in convolutional kernels and fully connected layers based on their magnitude. SparseSep simplifies the fully connected layer by the sparse-coding technique, and compresses the convolutional layer with matrix factorization. DyNS-Ext extends the magnitude-based method used in DyNS to recurrent layers. Just like DeepIoT, DyNS-Ext can be applied to all commonly used deep network modules, including fully connected layers, convolutional layers, and recurrent layers. All models use 32-bit floats without quantization. Experiments are conducted on the Edison platform.

The detailed tradeoff between testing accuracy and memory consumption of the resulting models is illustrated in Figure 5. We compress the original DeepSense neural network with different compression ratios and observe the final testing accuracy. DeepIoT achieves the best tradeoff. The tradeoff between execution time and testing accuracy is shown in Figure 6. Similarly, the tradeoff between energy consumption and testing accuracy is shown in Figure 7. DeepIoT offers the best reduction in execution time (approximately 80.8 percent) as well as the best reduction in energy consumption (approximately 83.3 percent) without apparent loss in accuracy.

FIGURE 6. The tradeoff between testing accuracy and execution time.

FIGURE 7. The tradeoff between testing accuracy and energy consumption.

The ability of compression algorithms to significantly reduce network size without affecting accuracy suggests that the underlying models of IoT applications are inherently low-dimensional, thus allowing for significant simplifications of the learned neural network structures. This is good news in terms of feasibility of implementation on resource-limited hardware, such as the Edison board used in the above evaluation.

ESTIMATING UNCERTAINTY

The next problem concerns the reliability of deep learning models. In particular, how to offer principled uncertainty estimates that can faithfully reflect the correctness of model predictions? Principled uncertainty estimation is critical when deep learning is used to support IoT applications that require quantified reliability assurances. Recent work focused on two related challenges:


››how to develop methods that provide accurate uncertainty estimates in prediction results obtained from deep learning models, and
››how to develop resource-efficient solutions for the uncertainty estimation problem, such that they can be implemented on resource-limited IoT devices.

In this section, we introduce a simple, well-calibrated, and efficient uncertainty estimation algorithm for a multilayer perceptron (MLP), called RDeepSense.3 RDeepSense enables uncertainty estimation with theoretically proven error bounds for IoT applications.

There are only two steps in computing uncertainty for an arbitrary fully connected neural network. First, insert dropout operations into each fully connected layer. Second, adopt a proper scoring rule as the loss function and emit a distribution estimate instead of a point estimate at the output layer.

Intuitively speaking, the dropout operations convert a traditional (deterministic) neural network into a Bayesian neural network model with random variables, which equates a neural network to a statistical model. Proper scoring rules (based on the loss function) then measure the accuracy of probabilistic predictions.

The loss function has a large effect on the final results. Taking a regression problem as an example, using the mean square error as the loss function tends to underestimate the uncertainties. This is so because the training process is focused on predicting an accurate mean value without concerning itself with the variance. At the same time, using negative log-likelihood as the loss function tends to overestimate the uncertainties. The reason is that, during the early phase of training a neural network with log-likelihood loss, it is relatively hard to generate an accurate estimate of the mean. Increasing the value of the estimated variance can consistently decrease the negative log-likelihood loss with a high probability. Therefore, the predicted uncertainty tends to favor a larger variance that overestimates the true uncertainty.

RDeepSense instead applies a tunable function, based on a weighted sum of negative log-likelihood and mean square error, as the loss function. The underestimation effect of mean square error and the overestimation effect of negative log-likelihood are thus balanced by tuning the weighted sum. RDeepSense was shown to generate well-calibrated uncertainty estimates.

Regarding resource efficiency, since RDeepSense emits a distribution estimate instead of a point estimate at the output layer, it can do the uncertainty estimation in a single run. Compared with sampling-based and ensemble-based methods that require running a model k times for k samples, RDeepSense results in much reduced execution time and energy consumption.

We evaluate the accuracy of uncertainty estimation of RDeepSense and related baselines on the NYCommute task, which predicts commute times in New York City based on a data set of taxi-cab pick-up/drop-off times and locations.

TABLE 1. Mean absolute error (MAE) and negative log-likelihood (NLL) for the NYCommute task.

Deep learning algorithm | MAE   | NLL
RDeepSense              | 5.64  | 7.7
SSP-1                   | 8.15  | 4.86
SSP-3                   | 7.90  | 4.67
SSP-5                   | 7.51  | 4.84
SSP-10                  | 7.03  | 4.81
MCDrop-3                | 5.69  | 19,995.6
MCDrop-5                | 5.64  | 1,335.73
MCDrop-10               | 5.61  | 640.35
MCDrop-20               | 5.61  | 640.35
Gaussian Process        | 11.84 | 7.46

We compare RDeepSense to three baseline algorithms, called MCDrop,10 SSP,11 and Gaussian Process (GP). All deep-learning-based algorithms use a four-layer fully connected neural network with 500 hidden dimensions. MCDrop is based on Monte Carlo dropout. Compared with RDeepSense, the main difference is that MCDrop is not optimized by a proper scoring rule. MCDrop requires running the neural network multiple times to generate samples for uncertainty estimation. We use MCDrop-k to represent MCDrop with k samples. SSP trains the neural network with proper scoring methods. Compared with RDeepSense, the main difference is that SSP uses the ensemble method instead of the dropout operation in each layer. SSP requires training multiple neural networks for the ensemble. We use SSP-k to represent SSP with an ensemble of k individual neural networks. GP is a Gaussian-process-based algorithm, used to illustrate the quality of uncertainty estimation generated by a statistical model.
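The pull between the two objectives described above can be seen in a toy computation. Assume a Gaussian output layer; the weight alpha below is a hypothetical setting of ours, not a value from the RDeepSense paper:

```python
import math

def gaussian_nll(y, mean, var):
    """Negative log-likelihood of y under a Gaussian prediction."""
    return 0.5 * (math.log(2 * math.pi * var) + (y - mean) ** 2 / var)

def weighted_loss(y, mean, var, alpha=0.5):
    """Weighted sum of mean square error and NLL, in the spirit of the
    tunable RDeepSense objective (alpha is a hypothetical weight)."""
    return alpha * (y - mean) ** 2 + (1 - alpha) * gaussian_nll(y, mean, var)

# Early in training the predicted mean is poor (here off by 3.0):
y, mean = 5.0, 2.0
print(round(gaussian_nll(y, mean, 0.5), 3))   # small variance: 9.572
print(round(gaussian_nll(y, mean, 9.0), 3))   # inflated variance: 2.518
# NLL alone rewards inflating the variance while the mean is still wrong;
# the MSE term ignores the variance and anchors the combined loss to the
# quality of the mean.
print(round(weighted_loss(y, mean, 9.0), 3))  # 5.759
```

Tuning alpha trades the variance-inflating pressure of NLL against the variance-blind pressure of MSE, which is exactly the balancing act the text describes.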

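A simple way to check whether uncertainty estimates are well calibrated is to measure how often the ground truth falls inside each prediction's central confidence interval. A sketch assuming Gaussian predictive distributions (the residuals are toy values, not the NYCommute data):

```python
import math

def normal_quantile(p):
    """Inverse CDF of the standard normal via bisection on erf."""
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if 0.5 * (1 + math.erf(mid / math.sqrt(2))) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def coverage(y, means, sigmas, z):
    """Fraction of observations inside each prediction's central
    z-interval. Well-calibrated uncertainty gives coverage close to z."""
    k = normal_quantile(0.5 + z / 2)   # half-width in standard deviations
    inside = sum(1 for yi, m, s in zip(y, means, sigmas)
                 if abs(yi - m) <= k * s)
    return inside / len(y)

# Toy check: residuals of exactly 0, +/-1, and +/-2 standard deviations.
y      = [0.0, 1.0, -1.0, 2.0, -2.0]
means  = [0.0] * 5
sigmas = [1.0] * 5
print(coverage(y, means, sigmas, 0.95))  # |residual| <= 1.96*sigma: 3 of 5
```

Sweeping z from 0 to 1 and plotting coverage against z produces exactly the kind of calibration curve reported in Figures 8 and 9; the diagonal is the well-calibrated ideal.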
In testing, we compute the z% confidence interval based on the predicted mean and variance of each algorithm. We then measure the fraction of the testing data that falls into this confidence interval. For a well-calibrated uncertainty estimation, the fraction of testing data that falls into the confidence interval should be similar to z%.

The comparison result is shown in Table 1. MCDrop-k shows low MAE and high NLL, while SSP-k shows high MAE and low NLL. MCDrop-k tries to minimize the mean square error, while SSP-k tries to minimize the negative log-likelihood. Therefore, MCDrop-k focuses more on the mean of the predictive distribution, and SSP-k focuses more on the overall likelihood. RDeepSense combines the two objective functions, mean square error and negative log-likelihood, to find a balance between the two.

The calibration curves are illustrated in Figures 8 and 9. Both MCDrop-k and SSP-k fail to generate high-quality uncertainty estimates, either underestimating or overestimating the uncertainty. However, RDeepSense provides uncertainty estimates of good quality, outperforming GP by a significant margin. The results offer a path toward accurate estimation of uncertainty in the outputs of deep learning models.

FIGURE 8. The calibration curves of RDeepSense, GP, and MCDrop-k.

FIGURE 9. The calibration curves of RDeepSense, GP, and SSP-k.

MINIMIZING LABELED DATA

A general disadvantage of deep learning methods lies in the need for large amounts of labeled data. To learn well from empirical measurements, the neural network must be given a sufficient number of labeled examples from which network parameters are to be estimated. Since the number of parameters is large, so is the required number of labeled examples. This need for labeling poses a significant practical impediment to the use of deep learning in IoT contexts, where labeling cannot be easily done.

Recently, generative adversarial networks (GANs) have been proposed as a promising deep learning technique for unsupervised and semisupervised learning.12 The GAN training strategy is to define a game between two competing networks. The generator network maps a source of noise to the input space. The discriminator network receives either a generated sample or a true data sample and must distinguish between the two. The generator is trained to fool the discriminator. Here, we define the input probabilistic space as the joint probabilistic distribution of input sensory data and classification label.
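One common way to wire a semisupervised GAN discriminator (a generic sketch; the article does not spell out the exact SenseGAN formulation, so the recipe below is an assumption) is to give it K real classes plus one extra "fake" class. Labeled samples supervise their true class, unlabeled samples only need to land somewhere among the K real classes, and generated samples should land in the fake class:

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def disc_loss(logits, kind, label=None, num_classes=3):
    """Cross-entropy targets for the three kinds of discriminator input
    in a (K+1)-class semisupervised GAN."""
    p = softmax(logits)
    if kind == "labeled":
        return -math.log(p[label])          # supervise the true class
    real_mass = sum(p[:num_classes])        # probability of "any real class"
    if kind == "unlabeled":
        return -math.log(real_mass)         # real, but class unknown
    return -math.log(1 - real_mass)         # generated sample: fake class

logits = [2.0, 0.5, 0.1, -1.0]              # K = 3 real classes + 1 fake
print(round(disc_loss(logits, "labeled", label=0), 3))
print(round(disc_loss(logits, "unlabeled"), 3))
print(round(disc_loss(logits, "generated"), 3))
```

The unlabeled term is what lets the 90 percent of unlabeled samples shape the classifier's decision boundaries, which is the mechanism behind the gains reported for SenseGAN below.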


The GAN training strategy leverages the unlabeled data to increase the capacity of the generator and discriminator networks, which explicitly improves the discriminating ability of the classifier in return.

Evaluation shows that the resulting semisupervised strategy, called SenseGAN, greatly reduces the requirements for labeled data. We continue to use HHAR with the DeepSense framework1 as the running application example, where we take p% of the overall dataset as labeled data.

TABLE 2. Semisupervised training of HHAR with the DeepSense framework.

p%        | 10%   | 5%    | 3%    | 2%    | 1%
SenseGAN  | 94.8% | 92.5% | 91.4% | 90.4% | 88.3%
DeepSense | 92.0% | 89.3% | 85.3% | 83.6% | 79.1%

As shown in Table 2, the semisupervised training can preserve the classification accuracy with only 10 percent of labeled data by leveraging the remaining 90 percent of unlabeled data. However, extensive studies are still needed to explore the possibility of training with fewer labeled as well as unlabeled data in IoT contexts.

We introduced challenges and emerging solutions that suggest the feasibility of building effective, efficient, and reliable IoT systems enriched with deep learning techniques. More studies are needed to further verify the applicability of the results. Can one build a unified deep learning framework for largely heterogeneous sensory inputs, such as audio signals, Wi-Fi signals, and motion inputs? What is the impact of neural network compression on system performance, such as execution time and energy consumption? Can one extend uncertainty measurements to other deep learning models besides MLPs? How does one learn in highly dynamic environments where it is impossible to collect a large number of data samples? More investigation is needed to address these questions.

ACKNOWLEDGMENTS

Research reported in this article was sponsored in part by NSF under grants CNS 16-18627 and CNS 13-20209 and in part by the Army Research Laboratory under Cooperative Agreements W911NF-09-2-0053 and W911NF-17-2-0196. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Laboratory, NSF, or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.

REFERENCES

1. S. Yao et al., "DeepSense: A Unified Deep Learning Framework for Time-Series Mobile Sensing Data Processing," Proc. 26th Int'l Conf. World Wide Web (WWW 17), 2017, pp. 351–360.
2. S. Yao et al., "DeepIoT: Compressing Deep Neural Network Structures for Sensing Systems with a Compressor-Critic Framework," Proc. 15th ACM Conf. Embedded Network Sensor Systems (SenSys 17), 2017; https://arxiv.org/abs/1706.01215.
3. S. Yao et al., "RDeepSense: Reliable Deep Mobile Computing Models with Uncertainty Estimations," Proc. ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, vol. 1, no. 4, 2018, p. 173.
4. A. Stisen et al., "Smart Devices are Different: Assessing and Mitigating Mobile Sensing Heterogeneities for Activity Recognition," Proc. 13th ACM Conf. Embedded Network Sensor Systems (SenSys 15), 2015, pp. 127–140.
5. V. Radu et al., "Towards Multimodal Deep Learning for Activity Recognition on Mobile Devices," Proc. ACM Int'l Joint Conf. Pervasive and Ubiquitous Computing: Adjunct (UbiComp 16), 2016, pp. 185–188.
6. H.M. Thang et al., "Gait Identification Using Accelerometer on Mobile Phone," Proc. Int'l Conf. Control, Automation and Information Sciences (ICCAIS 12), 2012; https://doi.org/10.1109/ICCAIS.2012.6466615.
7. M. Gadaleta and M. Rossi, "IDNet: Smartphone-Based Gait Recognition with Convolutional Neural Networks," 2016; https://arxiv.org/abs/1606.03238.
8. Y. Guo, A. Yao, and Y. Chen, "Dynamic Network Surgery for Efficient DNNs," Proc. 30th Int'l Conf. Neural Information Processing Systems (NIPS 16), 2016, pp. 1387–1395.
9. S. Bhattacharya and N.D. Lane, "Sparsification and Separation of Deep Learning Layers for Constrained Resource Inference on Wearables," Proc. 14th ACM Conf. Embedded Network Sensor Systems (SenSys 16), 2016, pp. 176–189.
10. Y. Gal and Z. Ghahramani, "Dropout

ABOUT THE AUTHORS

SHUOCHAO YAO is a PhD student in the Department of Computer Science at the University of Illinois Urbana-Champaign (UIUC). His research interests include deep learning on the Internet of Things (IoT), cyber-physical systems, and crowd and social sensing. Yao received a BS in information engineering from Shanghai Jiao Tong University. Contact him at [email protected].

YIRAN ZHAO is a PhD student in the Department of Computer Science at UIUC. His research interests include cyber-physical systems and IoT applications. Zhao received a BS in information engineering from Shanghai Jiao Tong University. Contact him at [email protected].

ASTON ZHANG is an applied scientist at Amazon AI. His research focus is on deep learning. Zhang received a PhD in computer science from UIUC. He previously interned with Yahoo Labs and UBS, among others, and has served on program committees for WWW, KDD, SIGIR, and WSDM. He is a coauthor and coinstructor of the deep learning tutorial with Apache MXNet/Gluon. Contact him at [email protected].

SHAOHAN HU is a research staff member at the IBM Thomas J. Watson Research Center. His research interests include cyber-physical systems, crowd and social sensing, and quantum computing. Hu received a PhD in computer science from UIUC. Contact him at shaohan.hu@.com.

HUAJIE SHAO is a PhD student in the Department of Computer Science at UIUC. His research interests include data analysis in social networks, applied machine learning, sensor networks, and distributed data centers. Shao received an MS from Zhejiang University. Contact him at [email protected].

CHAO ZHANG is a PhD student in the Department of Computer Science at UIUC. His research interests include social media analysis, spatiotemporal data mining, text mining, graph mining, and urban computing. Zhang received an MS from Zhejiang University. Contact him at [email protected].

LU SU is an assistant professor in the Department of Computer Science and Engineering at the State University of New York, Buffalo. He has also worked at the IBM T. J. Watson Research Center and the National Center for Supercomputing Applications. Su received a PhD in computer science from UIUC. His research interests include the general areas of mobile and crowd sensing systems, the Internet of Things, and cyber-physical systems. He is the recipient of an NSF CAREER Award, the University at Buffalo Young Investigator Award, the ICCPS 17 Best Paper Award, and the ICDCS 17 Best Student Paper Award. He is a member of ACM and IEEE. Contact him at [email protected].

TAREK ABDELZAHER is a professor and Willett Faculty Scholar in the Department of Computer Science at UIUC. His research interests include understanding and influencing the performance and temporal properties of networked embedded, social, and software systems in the face of increasing complexity, distribution, and degree of interaction with an external physical environment. Abdelzaher received a PhD from the University of Michigan for work on adaptation in real-time systems. He has authored or coauthored more than 200 refereed publications in real-time computing, distributed systems, sensor networks, and control. He is a member of IEEE and ACM. Contact him at [email protected].

as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning," Proc. 33rd Int'l Conf. Machine Learning (ICML 16), 2016, pp. 1050–1059.
11. B. Lakshminarayanan, A. Pritzel, and C. Blundell, "Simple and Scalable Predictive Uncertainty Estimation Using Deep Ensembles," 2016; https://arxiv.org/abs/1612.01474.
12. I. Goodfellow et al., "Generative Adversarial Nets," 2014; https://arxiv.org/abs/1406.2661.
