Nestdnn: Resource-Aware Multi-Tenant On-Device Deep Learning for Continuous Mobile Vision

Total Page:16

File Type:pdf, Size:1020Kb

Nestdnn: Resource-Aware Multi-Tenant On-Device Deep Learning for Continuous Mobile Vision NestDNN: Resource-Aware Multi-Tenant On-Device Deep Learning for Continuous Mobile Vision Biyi Fangy, Xiao Zengy, Mi Zhang Michigan State University yCo-primary authors ABSTRACT KEYWORDS Mobile vision systems such as smartphones, drones, and Mobile Deep Learning Systems; Deep Neural Network Model augmented-reality headsets are revolutionizing our lives. Compression; Scheduling; Continuous Mobile Vision These systems usually run multiple applications concur- rently and their available resources at runtime are dynamic ACM Reference Format: y y due to events such as starting new applications, closing ex- Biyi Fang , Xiao Zeng , Mi Zhang. 2018. NestDNN: Resource-Aware isting applications, and application priority changes. In this Multi-Tenant On-Device Deep Learning for Continuous Mobile Vision. In The 24th Annual International Conference on Mobile Com- paper, we present NestDNN, a framework that takes the dy- puting and Networking (MobiCom ’18), October 29-November 2, 2018, namics of runtime resources into account to enable resource- New Delhi, India. ACM, New York, NY, USA, 13 pages. https://doi. aware multi-tenant on-device deep learning for mobile vi- org/10.1145/3241539.3241559 sion systems. NestDNN enables each deep learning model to offer flexible resource-accuracy trade-offs. At runtime, it dynamically selects the optimal resource-accuracy trade-off 1 INTRODUCTION for each deep learning model to fit the model’s resource de- Mobile systems with onboard video cameras such as smart- mand to the system’s available runtime resources. In doing so, phones, drones, wearable cameras, and augmented-reality NestDNN efficiently utilizes the limited resources in mobile headsets are revolutionizing the way we live, work, and inter- vision systems to jointly maximize the performance of all the act with the world. By processing the streaming video inputs, concurrently running applications. Our experiments show these mobile systems are able to retrieve visual information that compared to the resource-agnostic status quo approach, from the world and are promised to open up a wide range of NestDNN achieves as much as 4:2% increase in inference new applications and services. For example, a drone that can accuracy, 2:0× increase in video frame processing rate and detect vehicles, identify road signs, and track traffic flows 1:7× reduction on energy consumption. will enable mobile traffic surveillance with aerial views that traditional traffic surveillance cameras positioned at fixed CCS CONCEPTS locations cannot provide [31]. A wearable camera that can • Human-centered computing → Ubiquitous and mo- recognize everyday objects, identify people, and understand bile computing; • Computing methodologies → Neu- the surrounding environments can be a life-changer for the ral networks; blind and visually impaired individuals [2]. The key to achieving the full promise of these mobile vi- sion systems is effectively analyzing the streaming video frames. However, processing streaming video frames taken Permission to make digital or hard copies of all or part of this work for in mobile settings is challenging in two folds. First, the pro- personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear cessing usually involves multiple computer vision tasks. This this notice and the full citation on the first page. Copyrights for components multi-tenant characteristic requires mobile vision systems to of this work owned by others than ACM must be honored. Abstracting with concurrently run multiple applications that target different credit is permitted. To copy otherwise, or republish, to post on servers or to vision tasks [24]. Second, the context in mobile settings can redistribute to lists, requires prior specific permission and/or a fee. Request be frequently changed. This requires mobile vision systems permissions from [email protected]. MobiCom ’18, October 29-November 2, 2018, New Delhi, India to be able to switch applications to execute new vision tasks © 2018 Association for Computing Machinery. encountered in the new context [12]. ACM ISBN 978-1-4503-5903-0/18/10...$15.00 In the past few years, deep learning (e.g., Deep Neural https://doi.org/10.1145/3241539.3241559 Networks (DNNs)) [21] has become the dominant approach in computer vision due to its capability of achieving impres- of each other, this approach is not scalable and becomes in- sively high accuracies on a variety of important vision tasks feasible when the mobile system concurrently runs multiple [19, 37, 41]. As deep learning chipsets emerge, there is a deep learning models, with each of which having multiple significant interest in leveraging the on-device computing model variants. (ii) Selecting which resource-accuracy trade- resources to execute deep learning models on mobile systems off for each of the concurrently running deep learning models without cloud support [3–5]. Compared to the cloud, mobile is not trivial. This is because different applications have dif- systems are constrained by limited resources. Unfortunately, ferent goals on inference accuracies and processing latencies. deep learning models are known to be resource-demanding Taking a traffic surveillance drone as an example: an applica- [35]. To enable on-device deep learning, one of the common tion that counts vehicles to detect traffic jams does not need techniques used by application developers is compressing high accuracy but requires low latency; an application that the deep learning model to reduce its resource demand at a reads license plates needs high plate reading accuracy but modest loss in accuracy as trade-off [12, 39]. Although this does not require real-time response [39]. technique has gained considerable popularity and has been To address the first challenge, NestDNN employs a novel applied to developing state-of-the-art mobile deep learning model pruning and recovery scheme which transforms a deep systems [8, 14, 17, 20, 38], it has a key drawback: since appli- learning model into a single compact multi-capacity model. cation developers develop their applications independently, The multi-capacity model is comprised of a set of descendent the resource-accuracy trade-off of the compressed model is models, each of which offers a unique resource-accuracy predetermined based on a static resource budget at applica- trade-off. Unlike traditional model variants that are inde- tion development stage and is fixed after the application is pendent of each other, the descendent model with smaller deployed. However, the available resources in mobile vision capacity (i.e., resource demand) shares its model parameters systems at runtime are always dynamic due to events such as with the descendent model with larger capacity, making it- starting new applications, closing existing applications, and self nested inside the descendent model with larger capacity application priority changes. As such, when the resources without taking extra memory space. In doing so, the multi- available at runtime do not meet the resource demands of capacity model is able to provide various resource-accuracy the compressed models, resource contention among concur- trade-offs with a compact memory footprint. rently running applications occurs, forcing the streaming To address the second challenge, NestDNN encodes the video to be processed at a much lower frame rate. On the inference accuracy and processing latency of each descen- other hand, when extra resources at runtime become avail- dent model of each concurrently running application into a able, the compressed models cannot utilize the extra available cost function. Given all the cost functions, NestDNN employs resources to regain their sacrificed accuracies back. a resource-aware runtime scheduler which selects the optimal In this work, we present NestDNN, a framework that takes resource-accuracy trade-off for each deep learning model the dynamics of runtime resources into consideration to en- and determines the optimal amount of runtime resources to able resource-aware multi-tenant on-device deep learning allocate to each model to jointly maximize the overall infer- for mobile vision systems. NestDNN replaces fixed resource- ence accuracy and minimize the overall processing latency accuracy trade-offs with flexible resource-accuracy trade-offs, of all the concurrently running applications. and dynamically selects the optimal resource-accuracy trade- Summary of Experimental Results. We have conducted off for each deep learning model at runtime to fit the model’s a rich set of experiments to evaluate the performance of resource demand to the system’s available runtime resources. NestDNN. To evaluate the performance of the multi-capacity In doing so, NestDNN is able to efficiently utilize the lim- model, we evaluated it on six mobile vision applications that ited resources in the mobile system to jointly maximize the target some of the most important tasks for mobile vision sys- performance of all the concurrently running applications. tems. These applications are developed based on two widely Challenges and our Solutions. The design of NestDNN used deep learning models – VGG Net [33] and ResNet [13]– involves two key technical challenges. (i) The limitation of and six datasets that are commonly used in computer vision existing approaches is rooted in the constraint where the community. To evaluate the performance of the resource- trade-off between resource demand and accuracy of acom-
Recommended publications
  • Zencam: Context-Driven Control of Continuous Vision Body Cameras
    UI*OUFSOBUJPOBM$POGFSFODFPO%JTUSJCVUFE$PNQVUJOHJO4FOTPS4ZTUFNT %$044 ZenCam: Context-Driven Control of Continuous Vision Body Cameras Shiwei Fang Ketan Mayer-Patel Shahriar Nirjon UNC Chapel Hill UNC Chapel Hill UNC Chapel Hill [email protected] [email protected] [email protected] Abstract—In this paper, we present — ZenCam, which is an this design suits the purpose in an ideal scenario, in many always-on body camera that exploits readily available information situations, the wearer (e.g., a law-enforcement officer) may in the encoded video stream from the on-chip firmware to not be able to predict the right moment to turn the camera on classify the dynamics of the scene. This scene-context is further combined with simple inertial measurement unit (IMU)-based or may completely forget to do so in the midst of an action. activity level-context of the wearer to optimally control the camera There are some cameras which automatically turns on at an configuration at run-time to keep the device under the desired event (e.g., when a gun is pulled), but they miss the “back- energy budget. We describe the design and implementation of story,” i.e., how the situation had developed. ZenCam and thoroughly evaluate its performance in real-world We advocate that body cameras should be always on so that scenarios. Our evaluation shows a 29.8-35% reduction in energy consumption and 48.1-49.5% reduction in storage usage when they are able to continuously capture the scene for an extended compared to a standard baseline setting of 1920x1080 at 30fps period.
    [Show full text]
  • Devices, the Weak Link in Achieving an Open Internet
    Smartphones, tablets, voice assistants... DEVICES, THE WEAK LINK IN ACHIEVING AN OPEN INTERNET Report on their limitations and proposals for corrective measures French République February 2018 Devices, the weak link in achieving an open internet Content 1 Introduction ..................................................................................................................................... 5 2 End-user devices’ possible or probable evolution .......................................................................... 7 2.1 Different development models for the main internet access devices .................................... 7 2.1.1 Increasingly mobile internet access in France, and in Europe, controlled by two main players 7 2.1.2 In China, mobile internet access from the onset, with a larger selection of smartphones .................................................................................................................................. 12 2.2 Features that could prove decisive in users’ choice of an internet access device ................ 14 2.2.1 Artificial intelligence, an additional level of intelligence in devices .............................. 14 2.2.2 Voice assistance, a feature designed to simplify commands ........................................ 15 2.2.3 Mobile payment: an indispensable feature for smartphones? ..................................... 15 2.2.4 Virtual reality and augmented reality, mere goodies or future must-haves for devices? 17 2.2.5 Advent of thin client devices: giving the cloud a bigger role?
    [Show full text]
  • Electronic 3D Models Catalogue (On July 26, 2019)
    Electronic 3D models Catalogue (on July 26, 2019) Acer 001 Acer Iconia Tab A510 002 Acer Liquid Z5 003 Acer Liquid S2 Red 004 Acer Liquid S2 Black 005 Acer Iconia Tab A3 White 006 Acer Iconia Tab A1-810 White 007 Acer Iconia W4 008 Acer Liquid E3 Black 009 Acer Liquid E3 Silver 010 Acer Iconia B1-720 Iron Gray 011 Acer Iconia B1-720 Red 012 Acer Iconia B1-720 White 013 Acer Liquid Z3 Rock Black 014 Acer Liquid Z3 Classic White 015 Acer Iconia One 7 B1-730 Black 016 Acer Iconia One 7 B1-730 Red 017 Acer Iconia One 7 B1-730 Yellow 018 Acer Iconia One 7 B1-730 Green 019 Acer Iconia One 7 B1-730 Pink 020 Acer Iconia One 7 B1-730 Orange 021 Acer Iconia One 7 B1-730 Purple 022 Acer Iconia One 7 B1-730 White 023 Acer Iconia One 7 B1-730 Blue 024 Acer Iconia One 7 B1-730 Cyan 025 Acer Aspire Switch 10 026 Acer Iconia Tab A1-810 Red 027 Acer Iconia Tab A1-810 Black 028 Acer Iconia A1-830 White 029 Acer Liquid Z4 White 030 Acer Liquid Z4 Black 031 Acer Liquid Z200 Essential White 032 Acer Liquid Z200 Titanium Black 033 Acer Liquid Z200 Fragrant Pink 034 Acer Liquid Z200 Sky Blue 035 Acer Liquid Z200 Sunshine Yellow 036 Acer Liquid Jade Black 037 Acer Liquid Jade Green 038 Acer Liquid Jade White 039 Acer Liquid Z500 Sandy Silver 040 Acer Liquid Z500 Aquamarine Green 041 Acer Liquid Z500 Titanium Black 042 Acer Iconia Tab 7 (A1-713) 043 Acer Iconia Tab 7 (A1-713HD) 044 Acer Liquid E700 Burgundy Red 045 Acer Liquid E700 Titan Black 046 Acer Iconia Tab 8 047 Acer Liquid X1 Graphite Black 048 Acer Liquid X1 Wine Red 049 Acer Iconia Tab 8 W 050 Acer
    [Show full text]
  • Prof. Feng Liu
    Prof. Feng Liu Spring 2021 http://www.cs.pdx.edu/~fliu/courses/cs510/ 03/30/2021 Today Course overview ◼ Admin. Info ◼ Computational Photography 2 People Lecturer: Prof. Feng Liu ◼ Room: email for a Zoom appointment ◼ Office Hours: TR 3:30-4:30pm ◼ [email protected] Grader: Zhan Li ◼ [email protected] 3 Web and Computer Account Course website ◼ http://www.cs.pdx.edu/~fliu/courses/cs510/ ◼ Class mailing list [email protected] [email protected] Everyone needs a Computer Science department computer account ◼ Get account at CAT ◼ http://cat.pdx.edu 4 Recommended Textbooks & Readings Computer Vision: Algorithms and Applications ◼ By R. Szeliski ◼ Available online, free Learning OpenCV 3: Computer Vision in C++ with the OpenCV Library ◼ By Adrian Kaehler and Gary Bradski Or its early version Learning OpenCV: Computer Vision with the OpenCV Library ◼ By Gary Bradski and Adrian Kaehler Papers recommended by the lecturers 5 Grading 30%: Readings 20%: In-class paper presentation 50%: Project ◼ 10%: final project presentation ◼ 40%: project quality 6 Readings About 2 papers every week ◼ Write a brief summary for one of the papers Totally less than 500 words 1. What problem is addressed? 2. How is it solved? 3. The advantages of the presented method? 4. The limitations of the presented method? 5. How to improve this method? Submit to [email protected] by 4:00 pm every Thursday ◼ Write in the plain text format in your email directly ◼ No attached document 7 Paper Presentation One paper one student ◼
    [Show full text]
  • Manovich.Ai Aestheti
    AI Aesthetics Lev Manovich Publication data: Strelka Press, December 21, 2018. ASIN: B07M8ZJMYG Abstract: AI plays a crucial role in the global cultural ecosystem. It recommends what we should see, listen to, read, and buy. It determines how many people will see our shared content. It helps us make aesthetic decisions when we create media. In professional cultural production, AI has already been adapted to produce movie trailers, music albums, fashion items, product and web designs, architecture, etc. In this short book, Lev Manovich offers a systematic framework to help us think about cultural uses of AI today and in the future. He challenges existing ideas and gives us new concepts for understanding media, design, and aesthetics in the AI era. “[The Analytical Engine] might act upon other things besides number...supposing, for instance, that the fundamental relations of pitched sounds in the science of harmony and of musical composition were susceptible of such expression and adaptations, the engine might compose elaborate and scientific pieces of music of any degree of complexity or extent.” - Ada Lovelace, 1842 Итак, кто же я такой? С известными оговорками, я и есть то, что люди прошлого называли «искусственным интеллектом». (Виктор Пелевин, iPhuck 10, 2017.) “So who am I exactly? With known caveats, I am what people of the past have called "artificial intelligence."” (Victor Pelevin, iPhuck 10, 2017.) Writing in 1842, Ada Lovelace imagines that in the future, Babbage’s Analytical Engine ( general purpose programmable computer) will be able to create complex music. In the 2017 novel by famous Russian writer Victor Pelevin, set in the late 21th century, the narrator is an algorithm solving crimes and writing novels about them.
    [Show full text]
  • AUTHORS and MACHINES Jane C
    AUTHORS AND MACHINES Jane C. Ginsburg† & Luke Ali Budiardjo†† ABSTRACT Machines, by providing the means of mass production of works of authorship, engendered copyright law. Throughout history, the emergence of new technologies tested the concept of authorship, and courts in response endeavored to clarify copyright’s foundational principles. Today, developments in computer science have created a new form of machine, the “artificially intelligent” (AI) system apparently endowed with “computational creativity.” AI systems introduce challenging variations on the perennial question of what makes one an “author” in copyright law: Is the creator of a generative program automatically the author of the works her process begets, even if she cannot anticipate the contents of those works? Does the user of the program become the (or an) author of an output whose content the user has at least in part defined? This Article frames these and similar questions that generative machines provoke as an opportunity to revisit the concept of copyright authorship in general and to illuminate its murkier corners. This Article examines several fundamental relationships (between author and amanuensis, between author and tool, and between author and co-author) as well as several authorship anomalies (including the problem of “accidental” or “indeterminate” authorship) to unearth the basic principles and latent ambiguities which have nourished debates over the meaning of the “author” in copyright. This Article presents an overarching and internally consistent model of authorship based on two basic pillars: a mental step (the conception of a work) and a physical step (the execution of a work), and defines the contours of these basic pillars to arrive at a cohesive definition of authorship.
    [Show full text]
  • PDF (Thumbnails)
    Introducing iPhone 13 Pro | Apple () The Future of Communication in the Metaverse () A brain-computer interface that evokes tactile sensations improves robotic arm control () Nathan Copeland Feels Again with Blackrock Technology () Introducing Scribe — Transform meetings into instantly searchable and shareable video & text () It's a Big Year for ‘Clean Tech’ at CES () [CES 2021] Personalized Services | Samsung () Targus CES 2021 - Antimicrobial Solutions () LG Teases Rollable Phone! () LG Display's transparent OLED screens and bendable TV at CES 2021 () AR Cloud: What Is An Augmented Reality Cloud And Why Does It Matter? () ‘AR cloud’ will drive the next generation of immersive experiences () Everyone you know is a Disney princess, which means AR is queen () How Google Lens is priming the world for an augmented reality takeover () OrCam’s AI-based reading device wins prestigious CES 2021 innovation award () Niantic AR tech () From Foldable Phones to Stretchy Screens () The robot shop worker controlled by a faraway human () Introducing Amazon Halo: A new way to help you measure and improve your health () FM POC with Telexistence Model-T () Robot friends: Why people talk to chatbots in times of trouble () Star Wars inspires new smart skin () The ID-Cap™ System - Revolutionizing Remote Patient Monitoring () Institutional Literacies: Engaging Academic IT Contexts for Writing and Communication () Together mode in Microsoft Teams () Cortana in Microsoft Teams | Easily collaborate with voice assistance () The rise and fall of Adobe Flash () Student
    [Show full text]
  • Big Wave Shoulder Surfing
    Big Wave Shoulder Surfing Neil Patil Brian Cui∗ Hovav Shacham [email protected] [email protected] [email protected] University of Texas at Austin University of Texas at Austin University of Texas at Austin ABSTRACT Camera technology continues to improve year over year with ad- vancements in both hardware sensor capabilities and computer vision algorithms. The ever increasing presence of cameras has opened the door to a new class of attacks: the use of computer vi- sion techniques to infer non-digital secrets obscured from view. We prototype one such attack by presenting a system which can recover handwritten digit sequences from recorded video of pen motion. We demonstrate how our prototype, which uses off-the-shelf computer vision algorithms and simple classification strategies, can predict a set of digits that dramatically outperforms guessing, challenging the belief that shielding information in analog form is sufficient to maintain privacy in the presence of camera surveillance. We conclude that addressing these new threats requires a new method of thinking that acknowledges vision-based side-channel attacks Figure 1: An example scenario. Even though the writing is against physical, analog mediums. not visible, the pen’s motion can be observed and the writ- ing reconstructed by a digital camera. Our prototype gener- CCS CONCEPTS alizes this attack to less optimal camera angles, distances, • Security and privacy → Human and societal aspects of se- and image quality. curity and privacy; Security in hardware; • Computing method- ologies → Computer vision. Meanwhile, analog mediums continue to be trusted platforms KEYWORDS for sensitive information. Writing on paper, physical ID cards, and face-to-face meetings are considered more trustworthy and less video camera, computer vision, pen and paper, side-channel attack, accessible to foreign adversaries when compared to digital com- image analysis, mobile devices, motion tracking, handwriting munication.
    [Show full text]
  • Chameleon: Scalable Adaptation of Video Analytics
    Chameleon: Scalable Adaptation of Video Analytics Junchen Jiang†◦, Ganesh Ananthanarayanan◦, Peter Bodik◦, Siddhartha Sen◦, Ion Stoica? yUniversity of Chicago ◦Microsoft Research ?UC Berkeley, Databricks Inc. ABSTRACT 1 INTRODUCTION Applying deep convolutional neural networks (NN) to video Many enterprises and cities (e.g., [2, 6]) are deploying thou- data at scale poses a substantial systems challenge, as im- sands of cameras and are starting to use video analytics for proving inference accuracy often requires a prohibitive cost a variety of 24×7 applications, including traffic control, se- in computational resources. While it is promising to balance curity monitoring, and factory floor monitoring. The video resource and accuracy by selecting a suitable NN configura- analytics are based on classical computer vision techniques tion (e.g., the resolution and frame rate of the input video), as well as deep neural networks (NN). This trend is fueled by one must also address the significant dynamics of the NN con- the recent advances in computer vision (e.g., [17, 18]) which figuration’s impact on video analytics accuracy. We present have led to a continuous stream of increasingly accurate Chameleon, a controller that dynamically picks the best con- models for object detection and classification. figurations for existing NN-based video analytics pipelines. A typical video analytics application consists of a pipeline The key challenge in Chameleon is that in theory, adapting of video processing modules. For example, the pipeline of a configurations frequently can reduce resource consumption traffic application that counts vehicles consists of a decoder, with little degradation in accuracy, but searching a large followed by a component to re-size and sample frames, and space of configurations periodically incurs an overwhelming an object detector.
    [Show full text]
  • Photoshop/Lightroom Tips
    Here’s What I Think! This article covers different Adobe Photoshop/Lightroomtips and techniques. The views expressed in this article are those of the author and do not necessarily reflect the views of Redlands Camera Club. By John Williams Need help? If you have any questions about processing an image using Adobe Lightroom or Photoshop, email me at [email protected] (for RCC members only) and I will try to assist you. AI (Artificial Intelligence) in Photography/Editing: There are two main areas where AI is used in photography: 1. Camera:AI technology is able to anticipate what your photographing and automatically adjust the exposure, sharpness, etc. Google has released an AI-powered camera called Google Clips, that can judge when the composition or lighting is aesthetically pleasing. Although the Goggle Clips reviews are less than positive, it is only time that this type of technology will be optimized in all cameras. Apple makes use of this technology to drive its dual-camera phones in portrait mode. 2. Post Processing: In Adobe Camera Raw and Basic Panel in Lightroom, AI technology is used to analyze your image and make optimum adjustments. Adobe calls their AI technology Adobe Sensei. Lightroomnow contains two AI features. The first is Enhance Detail that is reported to increase detail in images by as much as 30%. Secondly, the Auto button in the Develop Module Basic Panel usesAItechnology. The Auto feature does a very good job of adjusting your photo and is agood starting point when you adjust your images.Photoshop employs AI in the object selection function (Select Subject button).
    [Show full text]
  • Calling BS, Far and Wide
    Fall 2017 we make information work ischool.uw.edu Meet the Dean — Page 2 Donor Honor Roll — Pages 8 & 9 Alumni Updates — Page 10 Distinguished Alumna has earned many accolades Patricia Cutright, the iSchool 2017 Distinguished Alumna, has had a long and distinguished career in libraries. She is currently dean of libraries at Central Washington University and served as director of libraries at Pratt Institute in Brooklyn, New York, and Eastern Oregon University. Cutright has earned more than 25 honors, Assistant Professor Jevin West teaches “Calling Bullshit in the Age of Big Data.” (Photo courtesy of Columns magazine) including the 2002 Oregon Librarian Calling BS, far and wide of the Year; 2003 LITA/ Gaylord iSchool class makes an impact across the country Award for Achieve- hen Jevin West and Carl Bergstrom “They’re fed with so much BS in all aspects of ment in Patricia Cutright quietly rolled out CallingBullshit.org their lives,” West said. “Every time in history, Library and Win January, they hoped a few of their everyone says there’s more bullshit than in prior Information Technology presented by ideas would find their way into classrooms. times, but I think now you could make an argu- the American Library Association and At best, maybe the University of Washington ment there really is, and I think people are sick the Library Information and Technol- would let them teach a class. and tired of it.” ogy Association; 2002 Jean McKenzie / Woman of the Year Award; “We would have been happy if a couple of our West and Bergstrom, a professor in the UW Soroptimist International of La colleagues and friends would have said, ‘Cool Biology department, wanted to teach students Grande; and the 2002 Excellence in idea, we should pass that along,’ said West, an how to think critically about the data that’s Telecommunications – Applications, assistant professor at the Information School.
    [Show full text]
  • Edgeeye - an Edge Service Framework for Real-Time Intelligent Video Analytics
    EdgeEye - An Edge Service Framework for Real-time Intelligent Video Analytics Peng Liu Bozhao Qi Suman Banerjee University of Wisconsin-Madison University of Wisconsin-Madison University of Wisconsin-Madison [email protected] [email protected] [email protected] ABSTRACT such as speech recognition, language translation, image recogni- Deep learning with Deep Neural Networks (DNNs) can achieve tion, and recommendation. DNN-based approaches have achieved much higher accuracy on many computer vision tasks than classic big improvement in accuracy over traditional machine learning machine learning algorithms. Because of the high demand for both algorithms on computer vision tasks. They even beat humans at computation and storage resources, DNNs are often deployed in the image recognition task [12]. In this paper, we present our work the cloud. Unfortunately, executing deep learning inference in the on building a real-time intelligent video analytics framework based cloud, especially for real-time video analysis, often incurs high on deep learning for services deployed at the Internet edge. bandwidth consumption, high latency, reliability issues, and privacy Nowadays, camera sensors are cheap and can be included in concerns. Moving the DNNs close to the data source with an edge many devices. But the high computation and storage requirements computing paradigm is a good approach to address those problems. of DNNs hinder their usefulness for local video processing applica- The lack of an open source framework with a high-level API also tions in these low-cost devices. For example, the GoogLeNet model complicates the deployment of deep learning-enabled service at the for image classification is larger than 20MB and requires about Internet edge.
    [Show full text]