UNIVERSITY OF CALIFORNIA SAN DIEGO

Efficient Learning in Heterogeneous Internet of Things Ecosystems

A dissertation submitted in partial satisfaction of the requirements for the degree Doctor of Philosophy

in

Computer Science (Computer Engineering)

by

Yeseong Kim

Committee in charge:

Professor Tajana Simunic Rosing, Chair
Professor Chung-Kuan Cheng
Professor Ryan Kastner
Professor Farinaz Koushanfar
Professor Dean Tullsen

2020

Copyright
Yeseong Kim, 2020
All rights reserved.

The dissertation of Yeseong Kim is approved, and it is acceptable in quality and form for publication on microfilm and electronically:

Chair

University of California San Diego

2020

DEDICATION

To my wife, Jeeyoon,

my daughter, Olivia Daon, and my family, Youngju, Haeja, Arie, and Evie.

EPIGRAPH

O give thanks unto the Lord; for he is good: because his mercy endureth for ever.

— Psalm 118:1

TABLE OF CONTENTS

Signature Page ...... iii

Dedication ...... iv

Epigraph ...... v

Table of Contents ...... vi

List of Figures ...... ix

List of Tables ...... xi

Acknowledgements ...... xii

Vita ...... xiv

Abstract of the Dissertation ...... xviii

Chapter 1 Introduction ...... 1
    1.1 Cross-Platform Energy Prediction and Task Allocation ...... 4
    1.2 Efficient Learning Based on HD Computing ...... 5
    1.3 Collaborative Learning with HD Computing ...... 6
    1.4 Beyond Classical Learning: DNA Pattern Matching using HD Computing ...... 7

Chapter 2 Intelligent Cross-Platform Task Characterization and Allocation ...... 9
    2.1 Introduction ...... 10
    2.2 Related Work ...... 12
    2.3 Overview of P4 ...... 14
    2.4 Automated System Modeling ...... 15
        2.4.1 Full-System Power Modeling ...... 15
        2.4.2 Cross-Platform Application Phase Recognition ...... 19
    2.5 Cross-Platform Prediction ...... 21
        2.5.1 Phase-Based Training Data Generation ...... 22
        2.5.2 Cross-Platform Prediction Model Training ...... 24
        2.5.3 Online Prediction ...... 26
    2.6 Cross-Platform Management for ML tasks ...... 28
        2.6.1 Cross-Platform Management Framework ...... 28
        2.6.2 Application Task Extraction ...... 30
        2.6.3 Task Allocation Case Study ...... 31
    2.7 Experimental Setup ...... 32
        2.7.1 Benchmarks ...... 33

        2.7.2 Model Training Parameters ...... 34
        2.7.3 Overhead ...... 35
    2.8 Evaluation of P4 Models ...... 36
        2.8.1 Full-System Power Estimation ...... 36
        2.8.2 Cross-Platform Prediction ...... 41
    2.9 Evaluation of Model-Based ML Task Allocation ...... 43
        2.9.1 Energy Use Optimization ...... 44
        2.9.2 Energy Cost Reduction ...... 45
    2.10 Conclusion ...... 46

Chapter 3 Hyperdimensional Computing for Efficient Learning in IoT Systems ...... 48
    3.1 Introduction ...... 49
    3.2 Related Work ...... 51
    3.3 HD Computing Primitives ...... 52
    3.4 HD-Based Classification ...... 54
        3.4.1 Design Overview ...... 54
        3.4.2 Sensor Data Encoding ...... 55
        3.4.3 Model Training ...... 56
        3.4.4 Model-Based Inference ...... 58
    3.5 Evaluation ...... 59
        3.5.1 Experimental Setup ...... 59
        3.5.2 Classification Accuracy ...... 61
        3.5.3 Efficiency Comparison ...... 61
    3.6 Conclusion ...... 63

Chapter 4 Collaborative Learning with Hyperdimensional Computing ...... 64
    4.1 Introduction ...... 65
    4.2 Motivational Scenario ...... 68
    4.3 Related Work ...... 69
    4.4 Secure Learning in HD Space ...... 71
        4.4.1 Security Model ...... 71
        4.4.2 Proposed Framework ...... 71
        4.4.3 Secure Key Generation and Distribution ...... 72
    4.5 SecureHD Encoding and Decoding ...... 74
        4.5.1 Encoding in HD Space ...... 75
        4.5.2 Decoding in HD Space ...... 78
    4.6 Collaborative Learning in HD Space ...... 82
        4.6.1 Hierarchical Learning Approach ...... 82
        4.6.2 HD Model-Based Inference ...... 85
    4.7 Evaluation ...... 85
        4.7.1 Experimental Setup ...... 85
        4.7.2 Encoding and Decoding Performance ...... 86
        4.7.3 Evaluation of SecureHD Learning ...... 87

        4.7.4 Data Recovery Trade-offs ...... 89
        4.7.5 Metadata Recovery Trade-offs ...... 92
    4.8 Conclusion ...... 93

Chapter 5 HD Computing Beyond Classical Learning: DNA Pattern Matching ...... 94
    5.1 Introduction ...... 95
    5.2 Related Work ...... 96
    5.3 GenieHD Overview ...... 97
    5.4 DNA Pattern Matching Using HD Computing ...... 98
        5.4.1 DNA Sequence Encoding ...... 99
        5.4.2 Pattern Matching ...... 102
    5.5 Hardware Acceleration Design ...... 105
        5.5.1 Acceleration Architecture ...... 105
        5.5.2 Implementation on Parallel Computing Platforms ...... 107
    5.6 Evaluation ...... 108
        5.6.1 Experimental Setup ...... 108
        5.6.2 Efficiency Comparison ...... 109
        5.6.3 Pattern Matching for Multiple Queries ...... 110
        5.6.4 Dimensionality Exploitation ...... 112
    5.7 Conclusion ...... 113

Chapter 6 Summary and Future Work ...... 114
    6.1 Thesis Summary ...... 115
    6.2 Future Directions ...... 117
        6.2.1 Efficient Cognitive Processing with HD Computing ...... 117
        6.2.2 Software Infrastructure for HD Computing ...... 118

Bibliography ...... 119

LIST OF FIGURES

Figure 1.1: Computing Nodes on Heterogeneous IoT Systems ...... 2

Figure 2.1: An overview of P4 framework ...... 14
Figure 2.2: Power and performance with PMC events for Linpack benchmark on Intel SR1560SF server at maximum frequency ...... 20
Figure 2.3: Application phases for multi-threaded bzip2 benchmark independently identified for Intel SR1560SF and Sun X4270 servers at the maximum frequency ...... 21
Figure 2.4: Cross-platform phase matching (Intel SR1560SF to SUN X4270, splash2x.lu cb) ...... 22
Figure 2.5: Identified phases of four benchmarks running for 60 seconds (IntelH, IntelM and IntelL: Intel server running at highest, medium, and lowest frequency settings. SunH, DellH, and A15H: Sun server, Dell server, and ARM Cortex-A15 processor running at highest frequency.) ...... 23
Figure 2.6: Cumulative distribution of instructions for two clusters ...... 24
Figure 2.7: Feed-forward neural networks for online prediction ...... 27
Figure 2.8: Overview of model-driven management on Spark environment ...... 28
Figure 2.9: Task group identification for two Spark applications ...... 31
Figure 2.10: Cumulative distribution of execution times for two task groups ...... 31
Figure 2.11: Cross-platform NN model configurations ...... 35
Figure 2.12: Overhead of model-driven management ...... 36
Figure 2.13: Processor power estimation errors (Intel SR1560SF) ...... 37
Figure 2.14: Power estimation error of subcomponents (Intel SR1560SF) ...... 38
Figure 2.15: Runtime subcomponent power estimation (Intel SR1560SF) ...... 38
Figure 2.16: Average error of single-machine supply power estimation. ARM A15 and A7 represent either the Cortex A15 or A7 processor, respectively ...... 40
Figure 2.17: Summary of time-variant power prediction accuracy ...... 40
Figure 2.18: Time-variant power level prediction for four heterogeneous platform combinations ...... 41
Figure 2.19: Cross-platform energy prediction accuracy. The error for each case shown in (a) is the average error cross-validated for all benchmark applications. ...... 43
Figure 2.20: Summary of energy use optimization ...... 44
Figure 2.21: Energy breakdown comparison between Spark default and model-driven policy ...... 45
Figure 2.22: Summary of cluster-level energy cost reduction ...... 46
Figure 2.23: Energy breakdown over different price ratios between clusters ...... 46

Figure 3.1: Overview of HD-Based Classification (Example: Human Activity Recognition) ...... 54
Figure 3.2: Encoding of Sensor Measurements ...... 57
Figure 3.3: Accuracy Comparison for Different Modeling Methods ...... 60
Figure 3.4: Efficiency Comparison for Training and Inference ...... 61

Figure 4.1: Motivational scenario ...... 68

Figure 4.2: Execution time of homomorphic encryption and decryption over MNIST dataset ...... 69
Figure 4.3: Overview of SecureHD ...... 70
Figure 4.4: MPC-based key generation ...... 72
Figure 4.5: Illustration of SecureHD encoding and decoding procedures ...... 73
Figure 4.6: Value extraction example ...... 76
Figure 4.7: Iterative error correction procedure ...... 79
Figure 4.8: Relationship between the number of metavector injections and segment size ...... 81
Figure 4.9: Illustration of the classification in SecureHD ...... 82
Figure 4.10: Comparison of SecureHD efficiency to homomorphic algorithm in encoding and decoding ...... 85
Figure 4.11: SecureHD classification accuracy ...... 90
Figure 4.12: Scalability of SecureHD classification ...... 90
Figure 4.13: Data recovery accuracy of SecureHD ...... 91
Figure 4.14: Example of image recovery ...... 91
Figure 4.15: Data recovery rate for different settings of metavector injection ...... 93

Figure 5.1: Overview of GenieHD ...... 98
Figure 5.2: Illustration of Encoding. For (a), (b), and (c), the window size is 6. (d) illustrates the reference encoding steps described in Algorithm 2. ...... 101
Figure 5.3: Similarity Computation in Pattern Matching. (a) and (b) are computed using Equation 5.2. The histograms shown in (c) and (d) are obtained by testing 1,000 patterns for each of the existing and non-existing cases when encoded for a random DNA sequence using D = 100,000 and P = 5,000. ...... 103
Figure 5.4: Hardware Acceleration Design. The dotted boxes in (a) show the hypervector components required for the computation in the first stage of the reference encoding. Recall that t is the index of the iteration. ...... 105
Figure 5.5: Performance and Energy Comparison of GenieHD to State-of-the-art Methods. All results are compared and normalized to Bowtie2. ...... 110
Figure 5.6: Scalability of GenieHD. (a) shows the execution time breakdown to process a single query and reference. (b)-(d) show how the speedup changes as the number of queries for a reference increases. ...... 111
Figure 5.7: Accuracy Loss over Dimension Size ...... 112

LIST OF TABLES

Table 2.1: Comprehensive system models built on P4 ...... 15
Table 2.2: Selected PMC events for processor power estimation (e_pmc) ...... 17
Table 2.3: Evaluated heterogeneous platforms ...... 33
Table 2.4: Overhead of P4 models ...... 35

Table 3.1: Evaluated Dataset (F: the number of features, K: the number of activity classes, Ntrain: the number of samples in the training dataset, Ntest: the number of samples in the testing dataset) ...... 60

Table 4.1: Datasets (n: feature size, K: number of classes) ...... 87
Table 4.2: Overhead for key generation and distribution ...... 87

Table 5.1: Evaluated DNA Sequence Datasets ...... 109
Table 5.2: GenieHD-ASIC Designs under Loss ...... 113

ACKNOWLEDGEMENTS

I would like to first thank my advisor, Prof. Tajana Rosing, for her encouragement, support, and guidance during my Ph.D. I am extremely grateful for her understanding in all aspects of my life, both research and life. I would like to also thank my committee members, Prof. Dean Tullsen, Prof. Ryan Kastner, Prof. CK Cheng, and Prof. Farinaz Koushanfar, for their feedback and discussions related to this Ph.D. work.

A special thanks to Prof. Jihong Kim for his guidance and for encouraging me to pursue a Ph.D. Under his guidance, I learned many skills that helped me perform well in my research. I would like to thank all my lab colleagues in SEELab for their active collaboration, help, and all the good memories. I would also like to give special thanks to Pietro Mercati, Mohsen Imani, and Saransh Gupta. I sincerely thank Intel for all its support, as well as my previous manager, Michael Kishinevski, and my supervisors, Ankit More, Emily Shriver, and Mahesh Ketkar. My research was made possible by funding from National Science Foundation (NSF) Grants 1527034, 1619261, 1730158, 1826967, and 1911095, Intel Corporation, CRISP, one of six centers in JUMP, an SRC program sponsored by DARPA, and an SRC-Global Research Collaboration grant.

Most importantly, I owe so much to my wife, Jeeyoon, my daughter, Olivia Daon, my parents, and all my family. I would like to thank them for their endless love and support every day. I could overcome all the difficulties thanks to their understanding, patience, and love.

Chapter 2 contains material from "P4: Phase-Based Power/Performance Prediction of Heterogeneous Systems via Neural Networks", by Yeseong Kim, Pietro Mercati, Ankit More, Emily Shriver, and Tajana S. Rosing, which appears in International Conference on Computer-Aided Design, November 2017. The dissertation author was the primary investigator and author of this paper.

Chapter 3 contains material from "Efficient Human Activity Recognition Using Hyperdimensional Computing", by Yeseong Kim, Mohsen Imani, and Tajana S. Rosing, which appears in IEEE Conference on Internet of Things, October 2018. The dissertation author was the primary investigator and author of this paper.

Chapter 4 contains material from "A Framework for Collaborative Learning in Secure High-Dimensional Space", by Mohsen Imani, Yeseong Kim, Sadegh Riazi, John Merssely, Patrick Liu, Farinaz Koushanfar and Tajana S. Rosing, which appears in IEEE Cloud Computing, July 2019. The dissertation author was the primary investigator and author of this paper.

Chapter 5 contains material from "GenieHD: Efficient DNA Pattern Matching Using Hyperdimensional Computing", by Yeseong Kim, Mohsen Imani, Niema Moshiri and Tajana Rosing, which appears in IEEE/ACM Design Automation and Test in Europe Conference, March 2020. The dissertation author was the primary investigator and author of this paper.

VITA

2011 B. S. in Computer Science and Engineering Cum Laude, Seoul National University, Korea

2020 Ph. D. in Computer Science (Computer Engineering), University of California San Diego, US

PUBLICATIONS

Yeseong Kim, Mohsen Imani, Niema Moshiri and Tajana Rosing, "GenieHD: Efficient DNA Pattern Matching Accelerator Using Hyperdimensional Computing", IEEE/ACM Design Automation and Test in Europe Conference (DATE), Mar 2020

Mohsen Imani, Mohammad Samragh, Yeseong Kim, Saransh Gupta, Farinaz Koushanfar, Tajana Rosing, "Deep Learning Acceleration with Neuron-to-Memory Transformation", IEEE International Symposium on High-Performance Computer Architecture (HPCA), Feb 2020

Mohsen Imani, Yeseong Kim, Sadegh Riazi, John Merssely, Patrick Liu, Farinaz Koushanfar and Tajana S. Rosing, “A Framework for Collaborative Learning in Secure High-Dimensional Space”, IEEE Cloud Computing (CLOUD), Jul 2019 (M. Imani and Y. Kim contributed equally)

Mohsen Imani, Saransh Gupta, Yeseong Kim, and Tajana S. Rosing, “FloatPIM: In-Memory Acceleration of Deep Neural Network Training with High Precision”, International Symposium on Computer Architecture (ISCA), Jun 2019

Joonseop Sim, Saransh Gupta, Mohsen Imani, Yeseong Kim, and Tajana S. Rosing, "UPIM: Unipolar Switching Logic for High Density Processing-in-Memory Applications", ACM Great Lakes Symposium on VLSI (GLSVLSI), May 2019

Anthony Thomas, Yunhui Guo, Yeseong Kim, Baris Aksanli, Arun Kumar, and Tajana S. Rosing, “Hierarchical and Distributed Inference Beyond the Edge”, IEEE International Conference on Networking, Sensing and Control (ICNSC), May 2019

Dongwon Park, Ilgweon Kang, Yeseong Kim, Sicun Gao, Bill Lin, and Chung-Kuan Cheng, "ROAD: Routability Analysis and Diagnosis Framework Based on SAT Techniques", International Symposium on Physical Design (ISPD), Apr 2019

Joonseop Sim, Minsu Kim, Yeseong Kim, Saransh Gupta, Behnam Khaleghi, Tajana Rosing, "MAPIM: Mat Parallelism for High Performance Processing in Non-volatile Memory Architecture", International Symposium on Quality Electronic Design (ISQED), Mar 2019

Yeseong Kim, Ankit More, Emily Shriver, and Tajana S. Rosing, "Application Performance Prediction and Optimization Under Cache Allocation Technology", IEEE/ACM Design Automation and Test in Europe Conference (DATE), Mar 2019

Yeseong Kim, Mohsen Imani, and Tajana S. Rosing, "Image Recognition Accelerator Design Using In-Memory Processing", IEEE MICRO, IEEE Computer Society, Jan/Feb 2019

Mohsen Imani, Yeseong Kim, Thomas Worley, Saransh Gupta, and Tajana S. Rosing, “HDCluster: An Accurate Clustering Using Brain-Inspired High-Dimensional Computing”, IEEE/ACM Design Automation and Test in Europe Conference (DATE), Mar 2019

Minxuan Zhou, Mohsen Imani, Saransh Gupta, Yeseong Kim, and Tajana S. Rosing, “GRAM: Graph Processing in a ReRAM-based Computational Memory”, IEEE Asia and South Pacific Design Automation Conference (ASP-DAC), Jan 2019

Yeseong Kim, Mohsen Imani, and Tajana S. Rosing, “Efficient Human Activity Recognition Using Hyperdimensional Computing”, IEEE Conference on Internet of Things (IoT), Oct 2018

Joonseop Sim, Mohsen Imani, Woojin Choi, Yeseong Kim, and Tajana S. Rosing, “LUPIS: Latch-up Based Ultra Efficient Processing in-Memory System”, 2018 International Symposium on Quality Electronic Design (ISQED), March 2018

Yeseong Kim, Pietro Mercati, Ankit More, Emily Shriver, and Tajana S. Rosing, "P4: Phase-Based Power/Performance Prediction of Heterogeneous Systems via Neural Networks", 2017 International Conference on Computer-Aided Design (ICCAD), November 2017

Yeseong Kim, Mohsen Imani, and Tajana S. Rosing, “ORCHARD: Visual Object Recognition Accelerator Based on Approximate In-Memory Processing”, 2017 International Conference on Computer-Aided Design (ICCAD), November 2017

Mohsen Imani, Yeseong Kim, and Tajana S. Rosing, "Brain-Inspired Hyperdimensional Computing: An Efficient Classifier for Embedded Devices", 2017 International Conference on Computer-Aided Design (ICCAD), November 2017

Mohsen Imani, Yeseong Kim, and Tajana S. Rosing, “NNgine: Ultra-Efficient Nearest Neighbor Accelerator Based on In-Memory Computing”, IEEE International Conference on Rebooting Computing (ICRC), November 2017

Joonseop Sim, Mohsen Imani, Yeseong Kim, and Tajana S. Rosing, “Enabling Efficient System Design Using Vertical Nanowire Transistor Current Mode Logic”, 25th IEEE International Conference on Very Large Scale Integration (VLSI-SoC), October 2017

Mohsen Imani, Daniel Peroni, Yeseong Kim, Abbas Rahimi, and Tajana S. Rosing, “Efficient Neural Network Acceleration on GPGPU using Content Addressable Memory”, 20th Design Automation and Test in Europe (DATE), March 2017

Mohsen Imani, Yeseong Kim, and Tajana S. Rosing, "MPIM: Multi-Purpose In-Memory Processing Using Configurable Resistive Memory", 22nd Asia and South Pacific Design Automation Conference (ASPDAC), January 2017

Wanlin Cui, Yeseong Kim, and Tajana S. Rosing, "Cross-Platform Machine Learning Characterization for Task Allocation in IoT Ecosystems", 7th IEEE Annual Computing and Communication Workshop and Conference (CCWC), January 2017 (Best Paper)

Mohsen Imani, Yeseong Kim, Abbas Rahimi and Tajana S. Rosing, “Acam: Approximate computing based on adaptive associative memory with online learning”, 2016 International Symposium on Low Power Electronics and Design (ISLPED), August 2016

Mohsen Imani, Abbas Rahimi, Yeseong Kim and Tajana S. Rosing, “A Low-Power Hybrid Magnetic Cache Architecture Exploiting Narrow-Width Values”, 5th Non-Volatile Memory Systems and Applications Symposium (NVMSA), August 2016

Mohsen Imani, Yeseong Kim, Abbas Rahimi and Tajana S. Rosing, “Associative Memory with Online Learning for Approximate Computing”, IEEE/ACM Design Automation Conference (DAC), June 2016

Yeseong Kim, Boyeong Jeon, and Jihong Kim, "A Personalized Network Activity-Aware Approach to Reducing Radio Energy Consumption of Smartphones", IEEE Transactions on Mobile Computing (IEEE TMC), March 2016

Shruti Patil, Yeseong Kim, Kunal Korgaonkar, Ibrahim Awwal, and Tajana S. Rosing, "Characterization of User's Behavior Variations for Design of Replayable Mobile Workloads", 7th EAI International Conference on Mobile Computing, Applications and Services (MobiCASE), November 2015

Yeseong Kim, Francesco Paterna, Sameer Tilak, and Tajana S. Rosing, "Smartphone Analysis and Optimization based on User Activity Recognition", 2015 International Conference on Computer-Aided Design (ICCAD), November 2015

Yeseong Kim, Mohsen Imani, Shruti Patil, and Tajana S. Rosing, "CAUSE: Critical Application Usage-Aware Memory System using Non-volatile Memory for Mobile Devices", 2015 International Conference on Computer-Aided Design (ICCAD), November 2015

Wook Song, Yeseong Kim, Hakbong Kim, Jehun Lim, and Jihong Kim, "Personalized Optimization for Android Smartphones", ACM Transactions on Embedded Computing Systems (ACM TECS), January 2014

Yeseong Kim, Qingqing Zhang, Nosub Sung, and Jihong Kim, “A Mobile Network Emulation Environment for Repeatable Performance Evaluations of Smartphones”, Journal of Korean Institute of Information Scientists and Engineers : Computer Practices (KIISE), December 2013

Yeseong Kim, Qingqing Zhang, Nosub Sung, and Jihong Kim, “A Mobile Network Emulation Environment for Repeatable Smartphone Performance Evaluations”, 2012 Korea Computer Congress (KCC), June 2013 (Best Presentation Paper)

Yeseong Kim, Wook Song, and Jihong Kim, "A Personalized Network Tail Energy Optimization Technique Based on Smartphone Network Usage Behavior", Journal of Korean Institute of Information Scientists and Engineers : Computer Systems and Theory (KIISE), December 2012

Yeseong Kim, Wook Song, and Jihong Kim, “A Smartphone Network Energy Optimization Technique Using Personalized Network Usage Behavior”, 2012 Korea Computer Congress (KCC), June 2012 (Best Paper)

Yeseong Kim, Jongwook Choi, Sungjin Lee, and Jihong Kim, “A Fast File Search Technique Using Direct Access of Metadata Area”, 2011 Korea Computer Congress (KCC), June 2011

ABSTRACT OF THE DISSERTATION

Efficient Learning in Heterogeneous Internet of Things Ecosystems

by

Yeseong Kim

Doctor of Philosophy in Computer Science (Computer Engineering)

University of California San Diego, 2020

Professor Tajana Simunic Rosing, Chair

The Internet of Things (IoT) is a growing network of heterogeneous devices, combining various sensing and computing nodes at different scales, which creates a large volume of data. Many IoT applications use machine learning (ML) algorithms to analyze this data. The high computational complexity of ML workloads poses significant challenges to IoT computing platforms, which tend to be less-powerful, resource-constrained devices. Transmitting such large volumes of data to the cloud also causes various issues such as scalability, security, and privacy. In this dissertation, we propose efficient solutions to perform ML tasks while decreasing power consumption and improving performance. We first leverage the heterogeneous and interconnected nature of IoT systems, where IoT applications run on many different architectures (e.g., an x86 server or an ARM-based edge device) while communicating with each other. We present a cross-platform power and performance prediction technique for intelligent task allocation. The proposed technique estimates time-variant energy consumption with only 7% error across completely different architectures, enabling intelligent task allocation that reduces energy consumption by 16.5% for state-of-the-art ML workloads. We next show how to further advance the learning procedures towards real-time and online processing by distributing the learning tasks onto the hierarchy of IoT devices. Our solution leverages brain-inspired high-dimensional (HD) computing to derive a new class of learning algorithms that can easily run on IoT devices while providing high accuracy comparable to the state of the art. We show that the HD-based learning algorithms can cover various real-world problems, from conventional classification to cognitive tasks beyond classical ML such as DNA pattern matching. We demonstrate that HD-based learning can enable secure, collaborative learning by efficiently distributing a large volume of learning tasks onto heterogeneous computing nodes. We have implemented the proposed learning solution on various platforms while offering superior computing efficiency. For example, our solution achieves approximately 486× and 7× performance improvements for the training and inference phases, respectively, on a low-power ARM processor, as compared to state-of-the-art deep learning.

Chapter 1

Introduction

Interconnected devices in the "Internet of Things" (IoT) are generating an ever-increasing amount of data. In the year 2020, it is expected that about 1.7 megabytes of new information are created every second for each human on the planet [1]. IoT applications collect large amounts of data from various devices and apply machine learning (ML) algorithms to transform that data into actionable knowledge. Running ML algorithms requires significant computational power and storage, resulting in systems that stream most or all of the data to the cloud for analysis. However, transmitting all data to the cloud leads to scalability, security, and privacy concerns [2, 3]. A promising solution is to distribute the learning tasks onto the IoT hierarchy; however, effective learning in IoT environments is still an open question. Figure 1.1 illustrates the computing nodes in typical heterogeneous IoT systems, which include various sensors (on things) and intermediate computing devices (on gateways) beyond traditional servers (on the cloud). We highlight the main technical challenges in IoT systems as follows.

Figure 1.1: Computing Nodes on Heterogeneous IoT Systems

• Heterogeneity of computing platforms: The emergence of IoT increases the complexity and heterogeneity of computing systems. In IoT systems, applications, including ML workloads, can potentially run on any device – from resource-constrained sensory devices to powerful servers, which normally have different architectures. There is already a high degree of heterogeneity even among datacenter servers, e.g., from legacy to brand-new architectures. State-of-the-art solutions only estimate average power and performance, not their instantaneous behavior [4]. This makes workload balancing and task allocation difficult.

• Complexity of ML algorithms: Training ML models requires a large cluster of application-specific integrated circuits (ASICs), e.g., deep learning on the Google TPU [5], or is very slow on traditional systems. Distributing the training to devices with heterogeneous data is very difficult due to large communication and computational overhead.

• Limited computing power and resources: IoT edge devices often do not have sufficient resources to support the heavy workloads of state-of-the-art ML in real time. For example, many IoT devices use low-power processors such as ARM cores and Intel architectures, which have limited computing speed and capability. They also often use batteries as their primary energy source. There is a pressing need for alternative learning paradigms which can run directly on IoT devices for both training and inference, thus enabling real-time and adaptive learning.

• Variety of learning data: IoT applications deal with varying types of data generated by many different sensors. One example is workloads that handle large nucleotide sequences with high runtime and computation costs. We need a holistic approach that efficiently processes this variety of learning data [6].

In this dissertation, we propose efficient learning solutions for IoT systems. Our approach covers the architecture, application, and IoT hierarchy levels. At the architecture level, we propose a power/performance prediction technique to enable task allocation across heterogeneous IoT platforms. The core technology is a cross-machine power/performance prediction technique. The technique extracts key profiles of target applications running on heterogeneous computing architectures and builds prediction models which accurately estimate the time-variant power/performance of workloads across different machines and platforms. To ensure the generality of the technique and sufficient coverage of various ML algorithms, the proposed models were developed and verified with diverse industry-standard benchmark applications including SPEC, PARSEC, SPLASH, NERSC, Intel BigDL, and SparkBench. We demonstrate that the proposed technique can enable intelligent task allocation for ML tasks while improving the energy efficiency and costs by 16% over the state-of-the-art framework.

Next, at the application level, we design a new class of learning procedures in order to achieve real-time performance with high energy efficiency on IoT devices. We utilize hyperdimensional (also called high-dimensional) computing, HD computing for short [7], which is a computing strategy that more closely models the ultimate efficient learning machine: the human brain. HD computing mimics several desirable properties of the human brain, including robustness to noise and hardware failure, and single-pass fast learning. These features make HD computing a promising solution for IoT devices with limited storage, battery, and resources. We design light-weight algorithms that learn from data by mapping sensor inputs to high-dimensional vectors. We show that the HD-based learning algorithm can solve diverse classification problems in practice with quality comparable to state-of-the-art deep learning models.

At the hierarchy level, we enable an HD computing-based collaborative learning framework which efficiently distributes a large volume of learning tasks onto heterogeneous computing nodes. Our evaluation shows that we can achieve superior computing efficiency with the proposed HD-based learning. For example, we achieve 486× and 7× speedup compared to deep learning for training and inference, respectively, when running on a low-power ARM processor. We also show that HD computing can address other challenging tasks beyond classical ML problems. As an example, we focus on a bioinformatics problem: DNA pattern matching. In the rest of this chapter, we discuss the contributions and related work of this thesis in more detail.

1.1 Cross-Platform Energy Prediction and Task Allocation

The emergence of the Internet of Things increases the heterogeneity of computing platforms. Migrating workloads between various platforms is one way to improve both energy efficiency and performance. Recent cloud systems not only utilize x86-class processors but also employ large arrays of low-power ARM processors [8]. With the emergence of edge computing, task allocation decisions for learning procedures also need to be made across different scales of devices, e.g., high-performance servers vs. mobile devices [9]. Effective task allocation and migration requires accurate estimates of its costs and benefits. To date, these estimates rely on domain experts and computer architects analyzing the relationship of power and performance to system events. Most previous research focused on estimating power and performance only for individual machines [10, 11, 12]. Moreover, predicting the costs across different architectures has not been accurate for time-series prediction and requires application source code for instrumentation [4].

To enable intelligent task allocation, we propose P4, a new Phase-based Power and Performance Prediction framework which identifies application power and performance across heterogeneous computing platforms. P4 utilizes machine learning techniques to automate power and performance modeling procedures for various system components, while identifying key system events such as the best-suited performance counters. It then extracts machine-independent application phases by characterizing computing platforms with a set of benchmarks. Given the application phases, it builds neural network-based models which identify the costs and benefits if an arbitrary program procedure were running on a completely different platform, without ever having to run it there. We integrate the models trained in P4 with state-of-the-art ML tasks running on a distributed computing environment, Apache Spark, to serve diverse optimization goals, e.g., energy demand and energy costs in hierarchical systems.

We evaluate the proposed framework on four commercial heterogeneous platforms, ranging from x86 servers to a mobile ARM-based architecture, with 154 industry-standard benchmarks. Our experimental results show that P4 can predict power, performance, and energy changes with only 6.8%, 5.6%, and 6.1% error, respectively, even for architectures completely different from the ones the applications ran on. The model-based framework effectively allocates ML tasks and achieves energy savings of 16.5% and energy cost reduction of 16.8%. This is presented in Chapter 2.

1.2 Efficient Learning Based on HD Computing

A key task of many IoT applications is to understand the underlying context and react to the environment based on sensed data. Machine learning is widely used for this. However, ML algorithms are often too complex to run on less-powerful IoT devices, even when tasks are appropriately allocated as we discuss in Chapter 2. A key focus of this dissertation is to develop an alternative approach which efficiently supports the learning tasks using brain-inspired HD computing. HD computing is based on a short-term human memory model, sparse distributed memory, which emerged from theoretical neuroscience [13]. It leverages the understanding that the human brain operates on high-dimensional representations of data, originating from the large size of brain circuits [14]. It thereby models the human memory using points of a high-dimensional space, that is, with hypervectors. The hyperspace typically comprises tens of thousands of dimensions. HD computing incorporates important functionalities of the human memory model with vector operations which are useful for designing a new class of efficient learning solutions. HD computing-based learning is also robust to noise, computationally tractable, and mathematically rigorous in describing human cognition. We show how HD computing can be applied to learning problems in IoT systems while improving accuracy and efficiency. To verify the idea for practical IoT applications, we evaluate the developed learning algorithm using three human activity recognition datasets collected from the sensor data of mobile/embedded devices. We show that the proposed design speeds up model training and inference by up to 486× and 7×, respectively, as compared to state-of-the-art neural network training [15]. This is presented in Chapter 3.
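To give a flavor of how learning with hypervectors works, the sketch below shows a minimal HD classification flow in Python: feature vectors are encoded into bipolar hypervectors with a random projection, class hypervectors are built by bundling (summing) encoded training samples in a single pass, and inference picks the most similar class by cosine similarity. The dimensionality, projection-based encoder, and toy data are illustrative assumptions only; Chapter 3 details the actual encoding and training used in this dissertation.

```python
import numpy as np

D = 10000                      # hypervector dimensionality (assumed, tens of thousands)
rng = np.random.default_rng(0)

def encode(x, proj):
    """Map a feature vector to a bipolar hypervector via a random projection (assumed encoder)."""
    return np.sign(proj @ x)

def train(X, y, proj, num_classes):
    """Single-pass training: bundle (sum) the encoded samples of each class."""
    model = np.zeros((num_classes, D))
    for xi, yi in zip(X, y):
        model[yi] += encode(xi, proj)
    return model

def infer(x, proj, model):
    """Return the class whose hypervector is most similar (cosine) to the encoded query."""
    h = encode(x, proj)
    sims = model @ h / (np.linalg.norm(model, axis=1) * np.linalg.norm(h) + 1e-9)
    return int(np.argmax(sims))

# Toy usage with random data standing in for real sensor features.
F, K = 64, 3
proj = rng.standard_normal((D, F))
X = rng.standard_normal((30, F)); y = rng.integers(0, K, 30)
model = train(X, y, proj, K)
pred = infer(X[0], proj, model)
```

Because training is a single additive pass and inference is a similarity search, both map naturally onto resource-constrained devices, which is the property exploited throughout Chapters 3 and 4.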

1.3 Collaborative Learning with HD Computing

As the amount of data generated by Internet of Things (IoT) devices keeps increasing, many applications offload computation to the cloud. However, this often entails risks due to security and privacy concerns. Encryption and decryption methods have been proposed and used in practice, e.g., Homomorphic Encryption [16], but they add a significant computational burden. We utilize the HD-based learning algorithm presented in Chapter 3 to build a collaborative and secure learning solution. In Chapter 4, we present SecureHD, which encodes original data into secure, high-dimensional vectors, while the training is performed with the encoded vectors. Thus, applications can send their data to the cloud with no security concerns, while the cloud can classify the data without additional decryption. We also show how SecureHD can recover the encoded data in a lossless manner.

In our evaluation, our proposed SecureHD framework performs the encoding and decoding tasks 145.6× and 6.8× faster than a state-of-the-art encryption/decryption library running on a contemporary CPU [17]. In addition, our learning method achieves a high accuracy of 95% on average for diverse practical classification tasks, including cloud-scale datasets.

1.4 Beyond Classical Learning: DNA Pattern Matching using HD Computing

IoT applications handle varying data types collected from diverse sensors and devices. Although many of them can be represented as the feature vectors that most classical ML algorithms consider, much IoT data cannot be easily mapped to this form [6]. We examine the potential of HD computing for a wider range of learning tasks by focusing on a key bioinformatics procedure – DNA pattern matching. Accelerating bioinformatics applications is key to enabling personalized IoT-based healthcare [18] and on-site disease detection [19]. Although previous research proposed various accelerator designs on GPUs [20] and FPGAs [21], the increasing volume of DNA data exacerbates the runtime and power consumption of DNA pattern matching. In Chapter 5, we present GenieHD, which efficiently parallelizes the DNA pattern matching task using the idea of HD computing. We transform the inherently sequential processes of DNA pattern matching into highly-parallelizable computation tasks. The proposed technique first encodes the whole genome sequence and the target DNA pattern into high-dimensional vectors. Once encoded, a light-weight operation on the high-dimensional vectors can identify whether the target pattern exists in the whole sequence. We also design an accelerator which effectively parallelizes the HD-based DNA pattern matching while significantly reducing the number of memory accesses. The architecture can be implemented on various parallel computing platforms to meet target system requirements, e.g., FPGA or ASIC. We evaluate GenieHD on practical large-size DNA datasets such as the human and Escherichia coli genomes. Our evaluation shows that GenieHD significantly accelerates the DNA matching procedure, e.g., 44.4× speedup and 54.1× higher energy efficiency as compared to a state-of-the-art FPGA-based design [21].

Chapter 2

Intelligent Cross-Platform Task Characterization and Allocation

2.1 Introduction

With the emergence of the IoT, application tasks can run on many heterogeneous computing nodes (e.g., an x86 Xeon server vs. an ARM-based mobile device) and at varying operating conditions (e.g., CPU frequency and sleep states) [22]. In these computing ecosystems consisting of heterogeneous devices, dynamic management and task mapping of learning applications to meet diverse objectives cannot be done without accurate estimates of the performance/power costs and benefits [23].

Prior work has built power and performance models targeting a single machine [24, 11, 12]. A basic assumption of the existing modeling techniques is that the power level is proportional to the workload intensity, such as the amount of computation and memory accesses. Despite much research over the last decades, predicting power consumption across multiple heterogeneous machines still remains a difficult problem since application behavior varies significantly as a function of CPU architecture, platform design, and runtime conditions. For example, given a C program, architecture differences, e.g., x86 (CISC) vs. ARM (RISC), generate incompatible instructions which require significantly different cycles and hardware usage. In our experiments, the power consumption to run the same application also varies by more than 100x between the two CPU architectures.

In this chapter, we propose a novel characterization framework called P4 (Phase-based Power and Performance Prediction) for the intelligent allocation of ML tasks. P4 identifies (i) machine-independent application behavior at a fine granularity and (ii) cross-platform models that characterize power and performance relationships. We utilize application phases to characterize the fine-grained system behavior on different platforms. An application phase is defined as an execution period during which homogeneous system usage behavior is observed.

P4 first identifies the same application phases across different platforms using a set of benchmark applications offline, and trains neural network models which capture per-phase power, performance, and energy characteristics across machines. The characterization procedures are fully automated with a machine learning process, which allows the characterization to be performed for various computing platforms without significant system-to-system tuning. We deploy the cross-platform modeling technique in a practical distributed ML computing framework running on Apache Spark [25]. With all components implemented in a single place, we can optimize the energy usage and energy costs of a hierarchy of computing nodes running ML tasks. To summarize, we make the following contributions:

• P4 automatically creates power and performance models of various hardware (HW) components, including the processor, memory, fan, disk, and networking devices. During the model training, P4 automatically identifies key system events, e.g., Performance Monitoring Counters (PMC), based on a statistical feature selection method [26]. This eliminates the manual event selection procedure for which earlier work relied on the knowledge of domain experts [10, 24, 11, 12, 27].

• P4 recognizes the cross-platform application phases using the key system events as the machine-independent workload profiles. The recognition task is performed in a non-intrusive way, unlike earlier work [28, 4, 29, 30] that depends on either source code or binary instrumentation.

• With the application phases, we present how to train neural network models that generalize the cross-platform workload characteristics for each phase. The neural network models accurately predict the power, performance, and energy that an application would have if it were executed on a platform completely different from the one it is currently running on.

• We integrate the cross-platform models trained by P4 with the Apache Spark framework to enable predictive task allocation and energy optimization of state-of-the-art ML procedures. At runtime, the framework makes a model-based decision on where to allocate a Spark task for optimized energy usage and costs, even when the task has not been seen during the offline characterization.

We evaluate the proposed technique on four completely different platforms/architectures (e.g., x86 Xeon E5440 vs. x86 Westmere vs. ARM Cortex A15) and scales (e.g., servers vs. mobile devices) with 154 industry-standard benchmarks. The experimental results show that P4 provides accurate power estimates of HW components and total systems with less than 7.5% error, while automatically identifying the application phases across computing machines. P4 also predicts time-variant power, performance, and energy consumption with only 6.8%, 5.6%, and 6.1% error across different machines, including cross-architecture cases. The predictive task allocation technique integrated with Spark achieves energy savings of 16.5% and energy cost reduction of 16.8%.

2.2 Related Work

System power modeling: Most power models in the literature assume a linear relationship between power and system events [10, 27, 31, 32, 33, 12], as also summarized in the following surveys [11, 12]. A number of publications have explored estimates of power consumption when a machine changes C and P power states, which rarely change within a single machine, and developed models to estimate power and performance. Similar techniques have been proposed for frequency changes [34] and for cores in heterogeneous multicore systems [35]. Due to the increasing complexity and heterogeneity of architectures [36], cross-platform workload behavior cannot be estimated by relying only on the assumption that PMC events are linearly related to power consumption; a more sophisticated approach is required to reach the desired level of accuracy.

Application phase analysis: A promising way to better understand workload behavior across architectures is to exploit fine-grained application phases. Different sections of a program execution show distinct power characteristics [29, 37]. The phase detection technique presented in [38] identified the application status by tracking function calls of Java mobile applications to design a phase-driven power management scheme which applies different CPU frequencies to improve energy efficiency. Work in [39] inferred the phases from system events of mobile device subcomponents, such as the CPU and GPS, to detect abnormal power consumption. Another work modeled the phases of HPC applications based on MPI-specific events to automatically generate parallel benchmarks [40]. Based on their phase identification algorithm, they developed a phase-aware power management mechanism for multicore systems. Work in [41] proposed a thread scheduling technique which finds stable phases based on performance counters and migrates threads when a stable phase finishes to reduce long memory latencies. Work in [28, 4] recently showed that phase information is useful to predict cross-platform power levels; they proposed a technique which identifies the phases at compile time at the level of basic blocks.

Energy management for distributed systems: Management techniques for distributed systems have also been developed for diverse optimization purposes, e.g., peak power management for data centers [42], resource scheduling based on application classification [43], and management with renewable energy sources [44]. Many of these techniques utilize energy estimates as the key variable of the management policy, where the estimates are assumed to be available from different sources [45], e.g., CPU speed-based approximation in virtual machines [46], Hadoop/Spark logs [47], and application-specific analysis [48].

In this chapter, unlike the previous work that relies on a priori knowledge of applications, such as previous system traces and offline analysis of program basic blocks, we focus on how to accurately estimate fine-grained power and energy across machines in an automated, non-intrusive way for arbitrary applications. Our technique accomplishes this goal by recognizing cross-platform application phases based on key system events which are available on existing system architectures. We also show how the fine-grained prediction can be used to improve the energy efficiency of a practical distributed computing environment.

Figure 2.1: An overview of the P4 framework

2.3 Overview of P4

Figure 2.1 shows an overview of our proposed P4 framework. The framework characterizes the power/performance tradeoffs of each machine of interest, and develops models to estimate the power consumption of the machine and the application phases observed in the benchmarks. It then formulates the model for predicting power consumption and performance across different HW configurations. The offline characterization is done by running a set of benchmarks while collecting PMC data and measuring the power consumption on the multiple HW platforms of interest. Table 2.1 summarizes the models automatically built in P4 during the offline stage. The first characterization step is full-system power modeling (Section 2.4.1), which takes power measurements of HW components and various system events as inputs. P4 automatically selects strongly power-related events among all collected events and builds a single-machine power estimation model for each platform. The selected events are used as key parameters for the further learning stages, i.e., application phase extraction and cross-platform prediction modeling. The application phase extraction module identifies application phases from the selected events by utilizing an automated unsupervised clustering procedure (Section 2.4.2). The goal of the cross-platform prediction is to train power/performance prediction models that are later used online.

Table 2.1: Comprehensive system models built on P4

Single-Machine Power Models
Subcomponent | Key Metrics | Machine Learning for Automation | Model Notation
Processor | Performance monitoring counters (e_pmc) | Lasso with linear regression estimator | P_processor
Memory | Memory bandwidth (e_bandwidth) | Second-order polynomial regression | P_mem
Disk | Number of accessed sectors (e_diskIO) | Isotonic regression | P_disk
Network | Number of packets (e_networkIO) | Isotonic regression | P_network
Fan (only for x86 servers) | Fan speed (e_RPM) | Third-order polynomial regression | P_fan
Total System | All subcomponent events | N/A | P_system

Cross-Platform Prediction Models
Model Type | Key Metrics | Machine Learning for Automation | Model Notation
Phase | All subcomponent events | k-means++ | C_phase
Performance | All subcomponent events | Neural networks | NN_perf
Power | All subcomponent events | Neural networks | NN_power
Energy | All subcomponent events | Neural networks | NN_energy

P4 utilizes the phases as a basic unit to understand cross-platform relationships of application tasks. Since the application phases are identified for a single machine in the first step, we further identify the same phases across platforms through a phase matching procedure. Then, we can relate per-phase event behavior between platforms, and generalize the relationship by training neural network (NN) models (Section 2.5), which are designed to predict cross-platform power and performance behaviors. Combining the neural network models with the power estimation model, we can predict the cross-machine power, performance, and energy consumption of arbitrary programs at runtime. In Section 2.6, we show how to integrate the models trained by P4 with the Apache Spark distributed computing environment for online management.

2.4 Automated System Modeling

2.4.1 Full-System Power Modeling

The proposed framework builds estimation models for system components such as the processor, memory, disk, and fans. In this section, we discuss the challenges of the modeling procedure and describe which machine learning technique is suitable to automatically and accurately capture the power consumption of each component.

CPU: Much prior work has used a linear regression model for CPU power modeling, based on the observation that there is a linear relationship between the processor power consumption and a set of performance counters [49, 34]. In the past, system engineers would select the right PMCs for the models by leveraging domain knowledge. However, increasing system heterogeneity makes this manual event selection difficult. For example, modern computing systems have more than a hundred PMC events, but only a few can be collected at the same time due to the limited number of hardware counters and monitoring overhead. Thus, a key challenge is how to select the minimum number of power-related performance counters among all available ones. The P4 framework fully automates the PMC selection procedure by using Lasso statistical analysis (Least Absolute Shrinkage and Selection Operator) [26]. The Lasso method exploits l1-regularization to perform feature selection while building a regression model. Let e_pmc^{app_j} = ⟨e_{1,t_i}^{app_j}, e_{2,t_i}^{app_j}, ..., e_{k,t_i}^{app_j}⟩ be a vector of k PMC events e at a time interval t_i for an application j, called an event vector. Then, the collected data for a computing platform C_A can be represented by a set of event vectors, D_{C_A} = {V_{t_0}^{app_0}, V_{t_1}^{app_0}, ..., V_{t_L}^{app_N}}. For a general event vector e_pmc = ⟨e_{1,t_0}, e_{2,t_0}, ..., e_{k,t_0}⟩, a linear power model for C_A is represented by

P_processor(e_{i,t_0}) = ∑_{i=1}^{k} β_i e_{i,t_0} + β_0    (2.1)

where β_i is the coefficient corresponding to each event and β_0 is the intercept. The linear regression finds the parameters using a least-squares solution. Unlike standard linear regression, the coefficients of less power-related events are set to zero by Lasso, and thus we can automatically exclude them and build the power model using only the selected events. Table 2.2 shows the list of PMC events which are selected by the Lasso method in the P4 framework. For the three tested servers Lasso selected 12 PMCs, while for ARM it selected 11. We evaluate the accuracy of the selected performance counters on event-based power estimation in Section 2.8.1.

Table 2.2: Selected PMC events for processor power estimation (e_pmc)

Event | Description | Availability: x86 | ARM
CLOCK CYCLES | Clock cycles | • | •
INSTRUCTIONS | Instructions | • | •
RS UOPS DISPATCHED | Micro operations dispatched | • |
FP COMP OPS EXE | Floating point operations | • |
BR INSTS | Branch instructions retired | • | •
BR MISSES | Branch instructions missed | • | •
L1-DCACHE LOADS | L1 data cache loaded | • | •
L1-DCACHE STORES | L1 data cache stored | • | •
LLC REFERENCES | Last level cache referenced | • | •
LLC MISSES | Last level cache missed | • | •
BUS CYCLE | Bus cycles | • | •
RESOURCE STALLS | Resource stalls | • |
DP SPEC | Speculative integer operations | | •
UNALIGNED LDST SPEC | Speculative ld/st operations | | •
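The Lasso-based selection step described above can be sketched as follows with scikit-learn, assuming a matrix of collected PMC event vectors and measured processor power; the event names, coefficients, and synthetic data are placeholders rather than the exact P4 pipeline.

```python
import numpy as np
from sklearn.linear_model import LassoCV

# Rows: sampling intervals; columns: candidate PMC events (synthetic placeholder data).
event_names = ["CLOCK_CYCLES", "INSTRUCTIONS", "BR_INSTS", "LLC_MISSES", "BUS_CYCLE"]
rng = np.random.default_rng(1)
E = rng.random((500, len(event_names)))                       # normalized event counts per interval
p_measured = 40 + E @ np.array([30, 20, 0, 8, 5]) + rng.normal(0, 1, 500)  # measured CPU power (W)

# Lasso (l1-regularized linear regression) fits the coefficients beta_i and drives the
# coefficients of weakly power-related events to zero, which performs the event selection.
lasso = LassoCV(cv=5).fit(E, p_measured)
selected = [name for name, beta in zip(event_names, lasso.coef_) if abs(beta) > 1e-6]
print("Selected PMC events:", selected)

def p_processor(e):
    """Linear power model of Equation 2.1: sum(beta_i * e_i) + beta_0."""
    return float(np.dot(lasso.coef_, e) + lasso.intercept_)
```

In practice the same fit is repeated per platform, so each machine ends up with its own selected event set and coefficients, matching the per-platform P_processor models in Table 2.1.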

Memory: Capturing detailed accesses of the memory subsystem involves high monitoring overheads. Instead, we capture high-level memory activity by using an event, BUS_TRANS_MEM, in addition to the events selected for the processor. This event counts all microarchitecture-level activities initiated on the memory bus, and thus correlates highly with memory bandwidth utilization [50]. We evaluated different regression techniques with the event as the input, and found that it has a non-linear relationship to the power consumption. P4 in particular exploits a second-order polynomial regression model for the memory bandwidth utilization, e_bandwidth, denoted by P_mem(e_bandwidth).

Disk and Network: The relationship between power and IO devices (e.g., networking and disk) has been commonly described by either stateful models [24] or linear regression models [27]. The stateful model can account for the power consumption accurately; however, it is not an appropriate solution for our automated modeling process, since the internal states of the disk (e.g., idle and multiple active states depending on IO traffic) are usually not exposed to the software level. On the other hand, although the linear regression-based method can be applied without that knowledge, we observe that the model is often underfit when multiple states exist.

To automate the modeling procedure without knowledge of the internal states, P4 exploits Isotonic regression [51]. Isotonic regression automatically generates a piece-wise regression model without specifying breakpoints. We have implemented an IO monitoring tool to collect the amount of traffic from the sysfs interface with negligible monitoring overhead. The key metrics for the disk and network components are the number of sectors read or written, e_diskIO, and the number of transferred packets, e_networkIO, respectively. With the collected data, the P4 framework identifies where the best breakpoint happens, indicating the different states according to the amount of traffic, and in turn creates the piece-wise linear regression model for each state, denoted by P_disk(e_diskIO) and P_network(e_networkIO).

Fan: In server systems, the fan subsystems also contribute a significant portion of the power consumption [52]. The fan power consumption is highly related to the fan speed, usually represented by revolutions per minute (RPM). This event can be monitored from a side-band processor which controls the fan speed. In our environment, we measure the fan speed using the Intelligent Platform Management Interface (IPMI) protocol. We adopt a third-order polynomial regression model, P_fan(e_RPM), which takes the fan RPM as the input.
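To make the isotonic and polynomial fits concrete, the sketch below builds a piece-wise, monotonic disk-power model directly from IO traffic counts without manually specifying breakpoints, and a second-order polynomial memory model over bandwidth utilization. The synthetic two-state data stands in for the sysfs and BUS_TRANS_MEM measurements that P4 actually collects.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(2)

# Synthetic disk traffic (sectors read/written per interval) and measured power:
# an idle-like state below ~2000 sectors and an active state above it.
e_diskio = rng.uniform(0, 10000, 400)
p_disk_measured = (np.where(e_diskio < 2000, 3.0, 8.0)
                   + 0.0003 * e_diskio + rng.normal(0, 0.2, 400))

# Isotonic regression yields a monotone piece-wise model; the state transition
# emerges from the data rather than from a manually chosen breakpoint.
p_disk = IsotonicRegression(out_of_bounds="clip").fit(e_diskio, p_disk_measured)
print("Estimated disk power at 500 and 6000 sectors:", p_disk.predict([500, 6000]))

# The memory model is a second-order polynomial fit over bandwidth utilization.
e_bw = rng.uniform(0, 1, 400)
p_mem_measured = 5 + 4 * e_bw + 6 * e_bw**2 + rng.normal(0, 0.1, 400)
mem_coeffs = np.polyfit(e_bw, p_mem_measured, deg=2)   # coefficients of P_mem(e_bandwidth)
```

The same isotonic fit is reused for the network model with e_networkIO as input, and a third-order np.polyfit would play the analogous role for the fan model.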

Once all the power models for each subcomponent are created, we can combine them to estimate the total power consumption of the system, measured as the supply power P_system, as follows:

P_system = P_processor(e_pmc) + P_mem(e_bandwidth) + P_disk(e_diskIO) + P_network(e_networkIO) + P_fan(e_RPM) + C

where C is the power consumed by the rest of the system. In our experiment, since P4 models all the major subcomponents whose power exhibits substantive dynamics in the entire system, C is estimated as a constant value for each machine.
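Assuming per-component models like the hypothetical ones sketched earlier, the full-system estimate is simply their sum plus the per-machine constant C, mirroring the equation above.

```python
def estimate_system_power(e_pmc, e_bandwidth, e_diskio, e_networkio, e_rpm, C,
                          p_processor, p_mem, p_disk, p_network, p_fan):
    """Combine the subcomponent models of Table 2.1 into the total supply power estimate."""
    return (p_processor(e_pmc) + p_mem(e_bandwidth) + p_disk(e_diskio)
            + p_network(e_networkio) + p_fan(e_rpm) + C)
```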

2.4.2 Cross-Platform Application Phase Recognition

In order to associate the total power estimates with actual application workloads running on the system, we build another model which explains how a program interacts with the systems in a high level. The application phase extraction module is responsible to identify application phases, i.e., clusters of homogeneous system usage behavior. With this model, P4 regards an application execution as a sequence of multiple application phases. To better explain the concept of the phases, Figure 2.2 shows power measurements and two representative PMC events for a Linpack benchmark [53] running on Intel SR1560SF server. As shown in the figure, the power consumption and PMC events have similar patterns of changes over time, e.g., (A,C) and (B,D). In addition, similar performance characteristic, e.g., the number of instructions for an interval (INSTRUCTIONS), is observed for each labeled period. By extracting the application phases, P4 identifies that the workloads executed from 10 to 50 seconds are the sequence of the four phases. P4 automatically relates the similar event behaviors to different application phases with k-means++ clustering algorithm1 [54]. The phase extraction procedure uses the set of event vectors DCl , as the input data set of the k-means algorithm. Then, the algorithm assigns a cluster

index $\rho_{i,j}$ to each vector $V_{t_i}^{app_j} \in D_{C_l}$, where $0 \le \rho_{i,j} < k$. This algorithm requires two parameters, the number of clusters $k$ and the initial center of each cluster. We determine the best $k$ using the kNeedle algorithm [55] and the initial cluster centers using the k-means++ approach [54]. Based on the application phases characterized for each machine, P4 performs cross-platform workload characterization. Figure 2.3 shows an example of the power levels and identified phases, where each phase is denoted by a different color. In this experiment, a multi-threaded bzip2 benchmark was executed on two different servers, Intel SR1560SF and Sun X4270.

¹We also tested other clustering algorithms, such as DBSCAN and hierarchical clustering, and chose k-means since it identified the phases sufficiently well for most benchmarks compared to the other algorithms.

Figure 2.2: Power and performance with PMC events for Linpack benchmark on Intel SR1560SF server at maximum frequency

We observe that, when running the same application, the pattern of the identified phases is very similar across machines. The bzip2 execution is divided into three periods in a very similar fashion on the two platforms, i.e., A, B, and C for the Intel server and A', B', and C' for the Sun server. This means that the high-level workload characteristics are independent of the running platform; e.g., a compute-intensive phase on one platform is likely to be compute-intensive on another platform as well. Thus, if we can identify which phases are the same across platforms, it is possible to automatically relate the workloads of a black-box application without having the source code for instrumentation. The proposed framework identifies this relationship by associating the application phases extracted on a single machine with machine-independent phases on other platforms. This is done with a modified k-means algorithm which takes the application phases identified while running on a single machine. Figure 2.4 illustrates the cross-platform phase matching procedure. Let $V_{C_A}^{app_p} = \{\upsilon_{\rho_1,p}, \ldots, \upsilon_{\rho_k,p}\}$ be the set of cluster centers for the phases obtained for $app_p$ on a

Figure 2.3: Application phases for multi-threaded bzip2 benchmark independently identified for Intel SR1560SF and Sun X4270 servers at the maximum frequency

platform $C_A$. Also, let $D_{C_B}^{app_p} (\subset D_{C_B})$ be the dataset of $app_p$ executed on another platform $C_B$. To identify the clusters of $D_{C_B}^{app_p}$ while keeping the identified phase indexes, we apply the k-means algorithm again with the initial cluster centers $V_{C_A}^{app_p}$. Then, the $k$ cluster centers are moved by the k-means procedure so that each phase is adjusted to fit the new application dataset. The clusters newly identified for all benchmarks on $C_B$ represent how each cluster on $C_A$ behaves on the different platform $C_B$.
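As an illustration of this matching step, the sketch below seeds scikit-learn's k-means with the phase centers found on $C_A$ to re-cluster the $C_B$ samples of the same application; the function and array names are assumptions, not the framework's actual code.

```python
# Illustrative sketch of cross-platform phase matching with seeded k-means.
import numpy as np
from sklearn.cluster import KMeans

def match_phases(centers_CA, events_CB):
    """centers_CA: (k, n_events) phase centers found on C_A.
       events_CB:  (n_samples, n_events) event vectors of the same app on C_B."""
    k = centers_CA.shape[0]
    km = KMeans(n_clusters=k, init=centers_CA, n_init=1)
    labels_CB = km.fit_predict(events_CB)   # phase index of each C_B sample
    return labels_CB, km.cluster_centers_   # adjusted centers on C_B
```

Keeping n_init=1 matters here: re-running k-means with fresh random seeds would discard the correspondence between the new clusters and the phase indexes found on $C_A$.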

2.5 Cross-Platform Prediction

We build multi-layer neural network (NN) models to predict the power, performance, and energy of applications across machines. Using the power estimation models and the application phase model identified for a single machine, the neural network models learn the per-phase PMC event behavior relationships across different machines and settings. Once all models are learned, P4 can predict the power consumption of different architectures at runtime without actually running the applications on those architectures.

Figure 2.4: Cross-platform phase matching (Intel SR1560SF to SUN X4270, splash2x.lu cb)

We first describe how to utilize the application phases to capture cross-platform event behaviors in Section 2.5.1. We then present the details of our prediction modeling technique for general applications in Section 2.5.2. Finally, we describe how the models perform online predictions in Section 2.5.3.

2.5.1 Phase-Based Training Data Generation

We utilize the application phase as a basic unit to understand event behavior changes across machines and to train the cross-platform prediction NN models. Figure 2.5 presents a qualitative comparison of the cross-platform phase behavior for four benchmarks, with different colors denoting different clusters (phases). IntelH (the Intel server running at the highest frequency) is used as the reference platform to detect the baseline phases. The phases of the top two benchmarks, spec.bzip2 and spec.gcc, are from single-threaded benchmarks, while the bottom two are from multi-threaded ones. The comparison includes different frequencies, i.e., IntelH vs. IntelL (the Intel server at the lowest frequency), different platforms, i.e., IntelH vs. SunH, and completely different CPU architectures, i.e., IntelH vs. A15H. The result shows that the phase recognition and matching techniques accurately recognize the phases across different platform configurations.

Figure 2.5: Identified phases of four benchmarks running for 60 seconds (IntelH, IntelM and IntelL: Intel server running at highest, medium, and lowest frequency settings. SunH, DellH, and A15H: Sun server, Dell server, and ARM Cortex-A15 processor running at highest frequency.)

For example, the spec.bzip2 benchmark has a dominant phase, denoted in red, and an intermediate phase in pink. The pink phase is relatively short on A15H, since the benchmark terminates before completing this phase. Similar findings are also observed for parsec.blackscholes in the multi-threaded case. P4 also identifies the cross-platform phases accurately for more complex benchmarks, e.g., spec.gcc and parsec.facesim. Since the phases represent the same workloads across machines, we can utilize them to capture the event changes between platforms for the prediction model training. Figure 2.6 shows the cumulative distribution function (CDF) graphs of the INSTRUCTIONS event for two representative clusters, running on the Intel Xeon E5440 processor at 2.8 GHz and the ARM Cortex A15 at 1.8 GHz. The results show that the per-cluster event distributions have a non-linear relationship between different platforms. In addition, although different clusters on each platform can have very different trends, the events in the same cluster behave very similarly across platforms. We use the per-phase distribution patterns as the dataset to train the neural network model.

Let $V_t^{app_p,C_A}$ be an event vector on a computing platform $C_A$ for a benchmark $app_p$. In the cluster distribution of each event of $C_A$, we take the percentile of $e_{i,t}$ of $V_t^{app_p,C_A}$. Then, we identify $\hat{e}_{i,t}$, which is the event value at the same percentile in $C_B$'s distribution for the same benchmark. For example, in Figure 2.6, the two arrows respectively describe the estimation of INSTRUCTIONS at the 0.2 percentile for Cluster 1 and the 0.5 percentile for Cluster 2. By performing this procedure for all the events, we create an event vector $\hat{V}_t^{app_p,C_B}$ on $C_B$ where each element of the vector is $\hat{e}_{i,t}$.

Figure 2.6: Cumulative distribution of instructions for two clusters
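A small sketch of this percentile-based translation, assuming the per-cluster event samples of both platforms are available as 1-D numpy arrays (names are illustrative):

```python
# Map one event value from C_A's cluster distribution to the value at the
# same percentile of the matching cluster's distribution on C_B.
import numpy as np
from scipy import stats

def translate_event(value_CA, cluster_samples_CA, cluster_samples_CB):
    pct = stats.percentileofscore(cluster_samples_CA, value_CA)  # 0..100
    return np.percentile(cluster_samples_CB, pct)
```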

2.5.2 Cross-Platform Prediction Model Training

Once the cross-platform phases are identified, we train three NN models for cross-machine power, performance and energy prediction, respectively.

Power prediction model: The first model, called $NN_{power}$, infers the key events across platforms to predict the time-variant power levels. $NN_{power}$ has multiple hidden layers and an output layer which corresponds to the performance counters of the predicted platform. We train the $NN_{power}$ model using the per-phase event distributions across machines discussed in Section 2.5.1, i.e., using $V_t^{app_j,C_A}$ as the input and $\hat{V}_t^{app_j,C_B}$ as the output. Some events are available only on $C_B$, e.g., UNALIGNED_LDST_SPEC of the ARM processor. In that case, $\hat{e}_{i,t}$ is computed by averaging the event values over the samples selected using the other common events. Since the output layer of this model produces the estimated events on the other platform, we connect the output layer to the single-machine power prediction regression model of $C_B$, $P_{system}^{C_B}$. It then estimates the power consumption on the machine $C_B$ with the event prediction results of $NN_{power}$.
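For illustration, a minimal Keras-style regressor with the layer pattern described in Section 2.7.2 (a hyperbolic tangent first layer and a linear output layer) might look as follows. The layer sizes, event-vector widths, and random training data are placeholder assumptions, not the exact P4 configuration.

```python
# Sketch of an NN_power-style event regressor (illustrative sizes).
import numpy as np
from tensorflow import keras

n_events_CA, n_events_CB = 12, 10   # assumed event-vector widths

model = keras.Sequential([
    keras.layers.Dense(64, activation="tanh", input_shape=(n_events_CA,)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(n_events_CB, activation="linear"),  # predicted C_B events
])
model.compile(optimizer="adam", loss="mse")

# X: event vectors observed on C_A; Y: percentile-matched vectors for C_B.
X = np.random.rand(5000, n_events_CA)
Y = np.random.rand(5000, n_events_CB)
model.fit(X, Y, epochs=50, batch_size=128, verbose=0)
# The predicted C_B events would then feed C_B's regression power model.
```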

Performance prediction model: The second NN model, called $NN_{perf}$, has a similar structure to $NN_{power}$, but identifies how many instructions will be executed on a computing platform $C_B$ using the PMC events observed on another platform $C_A$. Let $T_{C_A}^{app_p}$ and $T_{C_B}^{app_p}$ be the execution times of a benchmark that runs on each platform, $C_A$ and $C_B$. When the events are sampled at the interval $I_{sample}$, P4 collects $N_{C_A} = T_{C_A}/I_{sample}$ and $N_{C_B} = T_{C_B}/I_{sample}$ samples, respectively, for each platform. If the benchmark executes a homogeneous workload during its execution, a sample collected on $C_A$ corresponds to the cumulative sum of $R$ samples on $C_B$, where $R = N_{C_B}/N_{C_A}$. We use this estimation for the samples in each phase, since an identified phase represents such a homogeneous workload.

Let us recall $V_t^{app_j,C_A}$ and $\hat{V}_t^{app_j,C_B}$, which are calculated with the per-phase event distributions. From $\hat{V}_t^{app_j,C_B}$, we extract the estimated number of instructions on $C_B$, say $\widehat{Inst}_t^{C_B}$. Then, $NN_{perf}$ is trained with $V_t^{app_j,C_A}$ as the input and $R \cdot \widehat{Inst}_t^{C_B}$ as the output, where $R$ is the ratio of the numbers of samples for the phase between the two platforms.

Energy prediction model: The two aforementioned models perform detailed sample-by-sample predictions at millisecond-level granularity. To efficiently predict cross-platform energy consumption over a longer time horizon, e.g., at second/minute-level granularity, we train another model, called $NN_{energy}$. Let $E_{P_i}^{app_j,C_A}$ be a set of multiple event vectors corresponding to a phase $P_i$ for a benchmark $app_j$. P4 adds the event values of each vector in $E_{P_i}^{app_j,C_A}$ to create a cumulative event vector, $V_{P_i}^{app_j,C_A}$, while calculating its total execution time. Once we perform this accumulation procedure for all the benchmark applications and phases, we obtain multiple cumulative event vectors, $V_{C_A} (\in V_{P_i}^{app_j,C_A})$, with the execution times as the input variables of the training dataset. To train the model with diverse combinations of phase sequences, P4 also creates another set of training inputs by randomly selecting an event vector from $V_{C_A}$ and adding it with $V_{P_i}^{app_j,C_A}$ in order. The output variable of the training dataset is the energy consumed on $C_B$ for the same phase. Then, the $NN_{energy}$ model performs the energy prediction for $C_B$ with a sequence of samples monitored on $C_A$.
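The following sketch illustrates how such training inputs could be assembled, i.e., per-phase cumulative event vectors plus randomly combined phase sequences; the data structures and helper names are assumptions.

```python
# Sketch of assembling NN_energy training inputs from per-phase samples.
import numpy as np

def accumulate(phase_event_vectors):
    """Sum the event vectors of one phase into a cumulative vector."""
    return np.sum(phase_event_vectors, axis=0)

def augment(cumulative_vectors, n_extra, rng=np.random.default_rng(0)):
    """Create extra inputs by adding randomly chosen cumulative vectors,
    mimicking longer sequences of phases."""
    extra = []
    for _ in range(n_extra):
        a, b = rng.choice(len(cumulative_vectors), size=2, replace=True)
        extra.append(cumulative_vectors[a] + cumulative_vectors[b])
    return np.array(extra)
```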

2.5.3 Online Prediction

P4 utilizes the three NN models online in two fashions: i) time-variant power level prediction and ii) per-task energy prediction.

Time-variant power level prediction: The time-variant power consumption is predicted at runtime using $NN_{power}$, $NN_{perf}$, and the single-machine power estimation model described in Section 2.4.1. The predicted event vector, $\hat{V}_t^{app_j,C_B}$, is used as an input of the regression model, since $NN_{power}$ predicts how the events will behave on a different platform for the same application. The execution time of a sampling period on a different platform, $\tau_{C_B}$, can be predicted using that on the current platform ($\tau_{C_A}$), $NN_{perf}$, which estimates the number of instructions executed on $C_B$ ($I_{C_B}$), and an output neuron of $NN_{power}$ which produces the speed in Instructions Per Sampling interval (IPS) on $C_B$ ($IPS_{C_B}$), with the following equation:²

$$\tau_{C_B} = \tau_{C_A} \times \frac{I_{C_B}}{IPS_{C_B}} \qquad (2.2)$$

Figure 2.7a illustrates the online prediction procedure as a feed-forward network for a platform pair, $C_A$ and $C_B$. Once the PMC events are collected as event vectors for an interval $\tau_{C_A}$ on $C_A$, the sampled event vector is input into the two neural networks. The number of instructions on $C_B$, i.e., $I_{C_B}$, identified by $NN_{perf}$, is delivered to the execution time conversion function, $\pi$, which computes $\tau_{C_B}$ based on Equation 2.2. When P4 predicts power consumption for different frequencies on the same platform, $NN_{perf}$ is not activated, since the number of instructions required to run the workload is the same regardless of the frequency at which it is run. Thus, the IPS of

²Due to space limitations, we do not include the detailed proof steps. It can be derived from $\tau_{C_i} \times IPS_{C_i} = I_{C_i}$ and $I_{C_A} = IPS_{C_A}$ in a straightforward way.

Figure 2.7: Feed-forward neural networks for online prediction

$C_A$ is provided to the execution time conversion function. At the same time, the power is predicted using the regression-based power estimation model, whose input is given by the output of $NN_{power}$.
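A one-line sketch of the execution-time conversion function $\pi$ implied by Equation 2.2 (argument names are mine):

```python
def convert_interval(tau_CA, I_CB, IPS_CB):
    """Predicted duration on C_B of the work sampled during tau_CA on C_A."""
    return tau_CA * I_CB / IPS_CB
```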

Per-task energy prediction: The energy prediction model, $NN_{energy}$, is trained to predict from the accumulated event values, unlike $NN_{power}$ and $NN_{perf}$, which are built on a per-sample basis. In this case, P4 keeps adding the monitored events online and performs the prediction whenever it is needed. In our experiments, this model requires more layers and neurons for accurate estimates, as it internally combines two different prediction goals as well as the single-machine power prediction model. However, it predicts the energy for multiple collected samples at once, resulting in lower computation overhead than the time-variant power level prediction. In Section 2.7, we discuss the model complexity and runtime overheads of the online prediction in detail.

Figure 2.8: Overview of model-driven management on Spark environment

2.6 Cross-Platform Management for ML tasks

The proposed P4 framework can be utilized for diverse cross-platform analysis and management problems. In this section, we present such a predictive management framework, which automatically allocates tasks over hierarchical systems in a distributed computing environment, Apache Spark.

2.6.1 Cross-Platform Management Framework

Figure 2.8 shows an overview of the P4-based task management framework integrated with Spark, which distributes tasks of state-of-the-art learning algorithms. We modified the Spark scheduler to track the task life cycle and to change scheduling decisions based on the outputs of our modeling framework. The Spark scheduler communicates with a daemon, called the master daemon, which decides the best task allocation based on the energy predictions. The cross-machine prediction results are provided by a slave daemon which runs on each Spark slave node while collecting the key events at runtime.

Modified Spark scheduler: An application running on the Spark framework initiates parallelizable functions. The Spark framework is responsible for performing each function in a distributed fashion. To this end, a directed acyclic graph (DAG) scheduler interprets the function into multiple stages which have to be executed sequentially, and executes the Spark tasks of each stage on the slave nodes. The default Spark scheduling policy is a randomized round-robin which provides a fair task distribution across the slave machines in terms of the number of tasks. In our design, we modify two parts of the Spark scheduler. First, when a task is allocated to a node, we track the start/end of each task along with other required information (e.g., the allocated node). The monitored information is sent to the master daemon running on the same machine. Second, during the task allocation decision, the scheduler asks the master daemon whether prediction results are available for the task to be allocated. When a prediction has been made for the target task, instead of using the randomized policy, we assign the task to the optimal machine, e.g., the slave which is expected to consume the lowest energy.

Master daemon: The master daemon relays the task life cycle sent by the modified scheduler to the slave nodes. Once a task is finished, it collects the cross-platform energy prediction results from the slave nodes, and then computes the priority of the slave machines for the task based on a cost function, e.g., energy consumption or energy price. When the scheduler requests the prediction information for a task, it looks up the computed priorities to select the most efficient machine among all the slaves which have not been assigned any other task.

Slave daemon: The slave daemon is responsible for providing cross-machine prediction results to the master daemon. The three prediction models, $NN_{power}$, $NN_{perf}$, and $NN_{energy}$, can be used for the management; in this work, we use the $NN_{energy}$ model to predict the entire energy consumption of each task. The daemon monitors the key events of the running tasks in the background, and calculates the accumulated event vector using the task life cycle information provided by the master daemon. It then performs the energy prediction using the $NN_{energy}$ model with the event vector. The energy prediction results are sent to the master daemon so that it can update the priorities for

the next task run as a closed-loop control.
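As an illustration of the master daemon's priority computation, the sketch below ranks idle slaves by their predicted task energy plus an estimated communication cost; the function signature and data layout are assumptions, not the daemon's actual interface.

```python
# Rank candidate slaves by predicted cost (cheapest first).
def rank_slaves(predictions, busy, comm_cost):
    """predictions: {slave: predicted task energy (J)},
       busy: set of slaves already assigned a task,
       comm_cost: {slave: estimated network energy (J)}."""
    candidates = {s: e + comm_cost.get(s, 0.0)
                  for s, e in predictions.items() if s not in busy}
    return sorted(candidates, key=candidates.get)
```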

2.6.2 Application Task Extraction

Since traditional benchmark suites such as SPEC2006 [56] and PARSEC [57] run the same tasks on every run, we can easily obtain the event traces for multiple machines by executing them on each platform of interest. In contrast, the ML applications running on distributed systems such as Apache Spark [25] are automatically parallelized across machines, and as a result, each machine has a different amount of work to run. To perform the prediction for a parallelized task in Spark, in short a Spark task, we monitor the task distribution traces to extract when and where each task starts and finishes. Since a Spark task is the minimal execution unit initiated by a user-defined function call, we can regard it as a single workload, like an application of a traditional benchmark suite. A remaining issue is that the workload behavior can be affected by the input data of the distributed tasks. Figure 2.9 shows the execution time as a function of the input size for two Spark benchmarks. The plots show that, although the execution times may vary for different input sizes, tasks which have similar input sizes exhibit very similar behaviors. P4 identifies the same tasks across machines based on this finding. We first group all the tasks into multiple task groups by using the DBSCAN clustering algorithm [58] on the input sizes, and then calculate the distribution of the execution time for each task group. Figure 2.10 shows the execution time distributions of task groups for two benchmarks as examples. The results show that each task group has a unique pattern in its distribution, even though the task groups may have different characteristics from each other. We exploit the percentile-based method discussed in Section 2.5 to extract the most similar workload pairs for each task group. The workload pairs are in turn regarded as the cross-machine application runs in the prediction procedure.
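As a sketch of the task-group identification step, the snippet below clusters task input sizes with scikit-learn's DBSCAN; the eps/min_samples values and the example input sizes are illustrative assumptions.

```python
# Group Spark tasks into task groups by input size using DBSCAN.
import numpy as np
from sklearn.cluster import DBSCAN

input_sizes = np.array([1.1e6, 1.2e6, 5.0e6, 5.1e6, 5.2e6])  # bytes, assumed
labels = DBSCAN(eps=2e5, min_samples=2).fit_predict(input_sizes.reshape(-1, 1))
# Tasks sharing a label form one task group; each group's execution-time
# distribution is then matched across machines with the percentile method.
```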

Figure 2.9: Task group identification for two Spark applications

Figure 2.10: Cumulative distribution of execution times for two task groups

2.6.3 Task Allocation Case Study

The proposed management framework eventually decides the task allocation with a cost function. In this section, we describe two general cost functions in the hierarchical setting: i) total energy use and ii) cluster-level energy cost. Although it is beyond the scope of this dissertation, it is worth noting that the cost function can be combined with other well-known control knobs available on individual machines, e.g., dynamic voltage and frequency scaling (DVFS).

Energy Use Optimization: In this case, we prioritize the worker machine whose predicted energy is lower than the others'. Since a set of the same tasks is usually distributed to different slaves, the master daemon may have multiple prediction results for each machine and task. Thus, we calculate the cost function as the average predicted energy. Then, we add to it

the communication energy cost, which is estimated using the $P_{network}$ model with the network bandwidth and the data size of the task. This is because the cross-machine prediction results only account for the energy use of the slaves, without considering the cost on the master side of distributing the data. Eventually, the scheduler allocates the task to the machine which consumes the least total energy for both computation and communication. This policy is effective when the scheduler has multiple choices of slave machines in allocating a task.

Energy Cost Reduction: Modern data centers may be deployed as multiple clusters located at different places [59]. Each cluster typically has long-term power contracts, and is charged market prices when exceeding the contract. The overages can be five times more expensive than the contracted price [60]. In addition, the clusters may have asymmetric energy pricing due to the contracts and energy sources [61]. For this case, we compute the second cost function by multiplying the energy price, e.g., the price per Wh, with the server energy prediction, which is estimated in the same way as described above. This policy allocates more tasks to cheaper clusters, as long as slave machines belonging to them are available.
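For illustration, the two cost functions could be expressed as follows; the variable names are assumptions, and the energy values stand for the averaged cross-machine predictions described above.

```python
def energy_cost(avg_task_energy, comm_energy):
    """Case 1: total energy (computation + data distribution), in joules."""
    return avg_task_energy + comm_energy

def price_cost(avg_task_energy_wh, price_per_wh):
    """Case 2: cluster-level energy bill, weighting energy by the local price."""
    return avg_task_energy_wh * price_per_wh
```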

2.7 Experimental Setup

The proposed P4 framework has been implemented using Python 2.7 with the Scikit-learn 0.17.1 library for the statistical analysis [62]. We also use the Tensorflow framework [63] to process the neural network models on a GPGPU. The offline modeling procedure was run on a system with an Intel i7-6700k quad-core CPU and an Nvidia GTX 1080 Ti. We conducted the measurements on three servers and a mobile platform: Intel SR1560SF, Sun X4270, Dell PowerEdge R810, and Odroid XU3. Table 2.3 summarizes the specifications of each platform. For the distributed workload evaluation, we deployed the Spark framework on a hierarchy of the servers using Docker [64] and Weave [65], running Apache Spark 2.0 [25] and the Hadoop 2.7.3 environment [66]. Each cluster is separated by a network switch whose bandwidth is

Table 2.3: Evaluated heterogeneous platforms

Type           Model                 Hardware component description
Server         Intel SR1560SF        Intel Xeon E5440, 1.99 GHz ∼ 2.83 GHz, L1 cache sizes: 64KB, DRAM size: 8GB
Server         Sun Fire X4270        Intel Xeon E5500, 1.6 GHz ∼ 2.93 GHz, L1 cache sizes: 64KB, DRAM size: 24GB
Server         Dell PowerEdge R810   Intel E7 4870 Westmere, 1.06 GHz ∼ 2.39 GHz, L1 cache sizes: 32KB, DRAM size: 128GB
Mobile (ARM)   Odroid XU3            Exynos5422 ARM big.LITTLE, Cortex-A15 (big): 1.0 GHz ∼ 1.4 GHz, Cortex-A7 (LITTLE): 1.2 GHz ∼ 2.0 GHz, L1 cache sizes: 32KB, DRAM size: 2GB

100 Mbps, while the intra-cluster bandwidth is 1 Gbps. For the server systems, we measure the supply power using a HIOKI 3334 power meter, while the power consumption of the Odroid XU3 is measured by reading the embedded sensors on each core. The server systems have also been instrumented to measure the power consumption of subcomponents by reading the voltage drop across two 0.1 Ω shunt resistors. All the power measurements are sampled every 100 ms. The experimental results for the estimation and prediction are cross-validated using the "leave-one-out" strategy, which evaluates each benchmark by separating the tested program from the training set. We pick one benchmark for testing the online identification stage while all other benchmarks are used to build the models in the offline learning stage. This cross-validation was performed for all benchmarks. The accuracy of the estimation and the prediction is evaluated using the Mean Absolute Percentage Error (MAPE) [67].
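A sketch of this evaluation protocol, with MAPE and leave-one-benchmark-out cross-validation (the function names and data layout are assumptions):

```python
# Leave-one-benchmark-out evaluation scored with MAPE.
import numpy as np

def mape(y_true, y_pred):
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

def leave_one_out(benchmarks, train_fn, predict_fn):
    """benchmarks: {name: (X, y)}; returns the per-benchmark MAPE."""
    errors = {}
    for held_out in benchmarks:
        train = {n: d for n, d in benchmarks.items() if n != held_out}
        model = train_fn(train)
        X_test, y_test = benchmarks[held_out]
        errors[held_out] = mape(y_test, predict_fn(model, X_test))
    return errors
```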

2.7.1 Benchmarks

Computing-variety benchmarks: We use industry-standard benchmarks which represent a wide range of computing workloads. Benchmarks are executed on each platform with

varying numbers of threads and at various processor frequencies. For each platform, we execute the benchmarks at three processor frequency levels, i.e., lowest (L), medium (M), and highest (H). The benchmark set has both single-machine benchmarks and distributed benchmarks. The single-machine benchmarks include SPEC2006 [56], PARSEC/SPLASH2x [57] with native inputs, NERSC datacenter benchmarks [68], and Linpack [53], which has been used for TOP500 runs [69]. The distributed benchmarks are run on Spark environments, and include IBM SparkBench 2.0 [70], which has machine learning, graph computation, SQL, and streaming workloads, and Intel BigDL [71], which runs popular deep learning models. To execute the same workloads on the low-power ARM machine, we cross-compiled the SPEC2006 suite and the PARSEC benchmarks. Due to the ISA difference and library dependencies, the other benchmark suites could not be ported to the Odroid ARM processor. In total, we could execute 154 benchmarks on the three server platforms, and 88 benchmarks on the ARM platform.

IO benchmarks: The aforementioned benchmarks are useful for testing target systems with diverse types of computing workloads. However, we observe that they typically exhibit limited variety in IO usage. To train accurate models for the disk and network, we use IOZone [72] and iperf [73] to create a synthetic mix of various disk and networking usage patterns, along with the TPC-H [74] benchmark, which is a business-oriented DB benchmark. All the benchmark applications were run on each target system for more than 3 hours in total, while collecting the performance counter events using the perf tool and the system IO events through sysfs at a 250 ms sampling interval.

2.7.2 Model Training Parameters

The neural network models for the cross-machine prediction have tunable hyperparameters that affect the model accuracy and computation overhead. We trained the models following the standard ML practice of grid search with cross-validation. Figure 2.11a summarizes the parameters used for each model. The first layer of each model uses the hyperbolic tangent activation function to convert the event values to nonlinear hyperplanes; the last layer uses a linear activation function, which combines multiple neuron outputs produced by ReLU units to perform the desired regression task.

Figure 2.11: Cross-platform NN model configurations

As discussed in Section 2.5.3, the $NN_{energy}$ model needs a more complex structure and more training epochs. Figure 2.11b shows how the training loss changes during the $NN_{energy}$ training procedure. Since the loss converges after 5000 epochs, we train for 10000 epochs to evaluate with a sufficiently learned model.

2.7.3 Overhead

Table 2.4 shows the overhead of processing the models supported in the P4 framework. We report the average processing time of each event vector for the online stage and the model training time for each platform pair for the offline stage. In the online stage, P4 computes the feed-forward networks and only requires the selected event counters. The runtime overhead to process each event vector is less than 676 µs. Compared to the PMC sampling interval of 250 ms, the runtime overhead is negligible.

Table 2.4: Overhead of P4 models

           Psystem    Cphase    NNperf    NNpower    NNenergy
Online     1.8 µs     N/A       88 µs     34 µs      252 µs
Offline    4.6 s      161 s     27.7 s    24.5 s     36.9 m

Most of the overhead of the offline learning stage comes from the benchmark execution. In our evaluation, the offline learning stage took 270 minutes including the benchmark application execution. Since the learning happens only once for each machine offline, the overhead of the modeling stage is negligible.

Figure 2.12: Overhead of model-driven management

We also evaluate the overhead of integrating P4 with the Spark framework for the management described in Section 2.6. Figure 2.12 shows the histograms of the execution time overhead for the master and slave daemons. The master daemon obtains and reorders the machine priorities for each task with a minimal overhead of 0.27 ms on average. It also takes 3.9 ms to update the machine priorities in a separate thread in the background. On the slave side, the runtime overhead of making each prediction is 0.7 ms, including the system event monitoring. The CPU utilization of the slave daemon is always less than 0.1%.

2.8 Evaluation of P4 Models

2.8.1 Full-System Power Estimation

The learning procedure of P4 trains the power estimation models for each system component and for the system total power by monitoring the key system events. In this section, we evaluate each power model described in Section 2.4.1.

Figure 2.13: Processor power estimation errors (Intel SR1560SF)

CPU: The $P_{processor}$ model is built by the Lasso method, which performs automated event selection while building a regression model. We compare it against linear regression, which state-of-the-art methods have used. The linear regression method uses 10 additional performance counters, including TLB misses, thermal trip, SSE execution, and snoop-related events, on top of the event counters that P4 selected. We also compare the results with two state-of-the-art processor models published by Su et al. [49] and Lee et al. [34]. These models exploit 8 and 3 events, respectively, selected based on domain knowledge of their architectures. Figure 2.13 compares the estimation accuracy for the processor power of the Intel server for the single-threaded and multi-threaded cases running at the highest frequency. Because of the limited space, we show 19 representative benchmarks and the average error over all tested benchmarks. The results show that Lasso estimates processor power accurately with 4.3% error on average, even though Lasso uses only a subset of the available performance counters. The

Figure 2.14: Power estimation error of subcomponents (Intel SR1560SF)

Figure 2.15: Runtime subcomponent power estimation (Intel SR1560SF)

Lasso estimation error is similar to that of the linear regression (LR) model, which uses 22 events, with only a 0.2% difference. The Lasso model has better accuracy than the two published models, since P4 automatically selects strongly power-related events for the given target platforms. In addition, it shows results comparable to the NN-based model. Thus, we conclude that the events statistically selected by P4 provide very accurate power estimates without the need for the domain knowledge used by Su et al. [49] and Lee et al. [34].

Subcomponents: As discussed in Section 2.4.1, the P4 framework also creates power models specialized for each system subcomponent, such as memory, disk, network, and fans,

using different automated modeling strategies beyond linear regression. Figure 2.14a shows the power estimation error for the memory component. The results show that the $P_{mem}$ model accurately models the memory subcomponent with less than 6.5% error in MAPE. In contrast, the linear regression-based power model produces inaccurate predictions for most benchmarks, since it cannot capture the non-linear relationship to the memory bandwidth utilization.

Figure 2.14b reports the estimation error of $P_{disk}$ for the disk power consumption as compared to the linear regression-based model [27] and a second-order polynomial regression method (2nd-Poly). We trained the models using the IO benchmarks explained in Section 2.7.1.

The $P_{disk}$ model utilizes the Isotonic regression method to automatically identify the internal disk states, which can change with the disk IO. The results show that the proposed model outperforms the other stateless regression models. The estimation error is 3.77% for HDD and 2.01% for SSD on average over all the benchmarks. In our evaluation, the Isotonic modeling method also accurately identifies the networking power model. The key event is identified as the Ethernet connection, as the network consumes 2 W while Ethernet is connected for packet transfer, i.e., when $e_{networkIO} > 0$. However, network bandwidth utilization does not have an observable effect on the power usage of the tested machines. Figure 2.15 shows the runtime power estimation results for the memory, HDD, and fan subcomponents. The proposed component models accurately estimate the power of the major subcomponents by only monitoring the relevant key events. For example, $P_{mem}$ produces accurate estimates that follow the large power fluctuations of the memory component during the benchmark execution, and $P_{disk}$ automatically identifies the low-power state, unlike the LR and 2nd-Poly models. $P_{fan}$ also accurately estimates the fan power consumption with 0.8% error for diverse fan speeds controlled by IPMI.

System Supply Power: Figure 2.16 summarizes the power estimation error of $P_{system}$ for the tested platforms. For this evaluation, we use the computing-variety benchmarks described in Section 2.7.1, and report the average of the MAPE values cross-validated for each benchmark application.

Figure 2.16: Average error of single-machine supply power estimation. ARM A15 and A7 denote the Cortex A15 and Cortex A7 processors, respectively

The result shows that P4 accurately estimates the total power consumption for the different target platforms. The power estimates for multicore and higher-frequency cases are more challenging due to the larger fluctuations in power levels. Nevertheless, P4 estimates power with average errors of 5.4%, 3.23%, 4.98%, 4.28%, and 7.5% for the Intel, Sun, and Dell servers, Cortex A15, and Cortex A7, respectively. The error on the Cortex A7 is a bit higher than the others, since the processor has relatively low static power, making it highly sensitive even to small absolute errors. However, even for the worst-case benchmark, the model estimates within 13% error.

Figure 2.17: Summary of time-variant power prediction accuracy

40 Figure 2.18: Time-variant power level prediction for four heterogeneous platform combinations

2.8.2 Cross-Platform Prediction

One of our key contributions is the generalized power prediction capability across heterogeneous computing platforms and system configurations. In the following, we evaluate the cross-machine prediction scenarios discussed in Section 2.5.3: i) detailed time-variant power levels using $NN_{power}$ and $NN_{perf}$, and ii) task energy using $NN_{energy}$.

Time-variant power level prediction: Figure 2.17 reports the prediction results for the six different cases. In the evaluation, we observed that our methodology can accurately predict performance and power consumption. P4 achieves 5.2% error on average for predicting time-varying power consumption on servers for multi-threaded benchmarks. Similarly, when predicting across completely different architectures for multi-threaded benchmarks, P4 achieves 7.2% and 6.8% error on average for the A15H-to-IntelH and IntelH-to-A15H cases, respectively. Note that in this case the number of threads is also different, i.e., 8 on Intel vs. 4 on ARM A15, as well as the frequency levels. When predicting power consumption for the big-LITTLE example (A7-to-A15), the error is 6.9%. We also compute the performance prediction error with the IPS metric. The results show that the average error is less than 6% even for the most challenging

cross-architectural prediction cases such as A15H-to-IntelH and IntelH-to-A15H. Thus, P4 can accurately predict the power and performance for complex combinations, including changes in the number of threads, CPU frequencies, platforms (mobile to server), and CPU architectures (x86 to ARM). Figure 2.18 shows how the proposed P4 predicts power consumption using the online prediction network described in Section 2.5.3. The results show four heterogeneous platform combinations, (a) a cross-architecture case from Intel x86 to ARM Cortex A15, (b) a cross-server case from IntelH to SunH, (c) a big-to-LITTLE example moving from A7 to A15, and (d) a frequency change from IntelH to IntelL, for four representative multi-threaded benchmarks. The results show that, based on the trained neural networks, P4 can accurately predict power changes over time for all the heterogeneous platform combinations. The execution time of each benchmark is also predicted for each of the platforms. For example, the prediction of the 60 seconds of parsec.canneal on Cortex A15 is made with the events of 18.5 seconds observed on IntelH. This means that P4 can predict both instantaneous power and performance changes using monitored events.

Task energy prediction: P4 also performs the cross-machine energy prediction for a long-term interval with a single NN structure. To evaluate the task energy prediction in a practical scenario, we used a distributed Spark environment deployed with six servers, two for each of the Intel, Sun, and Dell server types. Figure 2.19a shows the average prediction errors of the Spark benchmark applications for all the 30 cross-machine cases. The result shows that the $NN_{energy}$ model accurately predicts the energy of Spark tasks with 6.1% error on average over the 30 combinations. We also compare the prediction results to linear regression models trained with the same dataset used for the $NN_{energy}$ modeling. When predicting the cross-machine energy consumption across the same machine settings (e.g., 'Dell1-to-Dell2' and 'Intel1-to-Intel2'), the error of the linear regression model is 8.3% on average. However, the linear regression model shows relatively high error when predicting across completely different

Figure 2.19: Cross-platform energy prediction accuracy. The error for each case shown in (a) is the average error cross-validated over all benchmark applications.

machines (e.g., 'Dell1-to-Intel1' and 'Sun1-to-Dell1'), showing an average error of 13.0% and a worst-case error of 29.8%. In contrast, the deep learning-based model, $NN_{energy}$, provides stable predictions for both the same-machine-setting and different-machine cases, with 5.3% and 6.3% error, respectively. Figure 2.19b shows the detailed evaluation results for the representative case of

'Dell1-to-Intel1'. The results show that the $NN_{energy}$ model accurately predicts all the benchmark applications by capturing the non-linear relationship between the events and the cross-machine energy consumption, while the linear regression model often fails to provide accurate cross-machine energy estimates.

2.9 Evaluation of Model-Based ML Task Allocation

In this section, we evaluate the predictive management framework for the ML task allocation described in Section 2.6. The evaluation has been done on a hierarchy of six servers running Apache Spark. This experimental hierarchical system has two clusters, where each cluster has three servers, i.e., one each of Intel SR1560SF, Sun Fire X4270, and Dell PowerEdge R810. In the following, we evaluate the two case studies for the task allocation discussed in Section 2.6.3: i) energy use optimization and ii) energy cost reduction.

Figure 2.20: Summary of energy use optimization

2.9.1 Energy Use Optimization

In this case study, the policy in the management framework allocates tasks to the machine which is predicted to use the least energy. We compute how much energy can be saved by the model-based management policy as compared to the default policy, which chooses a random machine for each task. For this evaluation, we define the number of parallelized tasks for each benchmark application as ω. Since we have six servers, the optimization policy has a chance to save energy when ω < 6. We vary ω from 2 to 5 for each benchmark. Figure 2.20 shows the evaluation results for different Spark benchmarks. The model-based management policy saves 16.5% energy when ω = 3, i.e., when each of three tasks is allocated to one of the six servers. When ω is small, there is more chance to choose more energy-efficient machines compared to the default randomization policy, resulting in higher energy savings. Figure 2.21 compares the breakdown of the energy consumption for each server. The results show that the policy achieves high energy efficiency by utilizing the energy-efficient servers. For example, the Intel and Sun servers processed more tasks, since they are likely to be more energy-efficient for many applications in our setting. In contrast, in the default policy, each server

executes a similar number of tasks, since the default policy randomly distributes them to all machines.

Figure 2.21: Energy breakdown comparison between Spark default and model-driven policy

2.9.2 Energy Cost Reduction

In this evaluation, to understand how the policy balances Spark tasks, we vary the energy costs of the two geographically distributed clusters, cluster 1 (C1) and cluster 2 (C2). We define the price ratio $F = \$B_{C2}/\$B_{C1}$, where $\$B_{C1}$ and $\$B_{C2}$ are the energy costs (price per Wh) of each cluster, respectively. We compare the estimated energy bill of the model-based policy with that of the Spark default policy for different F values. For the experiments, the number of parallelized tasks (ω) is set to 3, i.e., half of the servers can be utilized. Figure 2.22 summarizes the evaluation of energy cost reduction. The price-aware policy successfully allocates the jobs between the two clusters by considering the local price differences. Our estimate shows savings of 16.8% (F = 0.5) and 12.5% (F = 2), i.e., when the energy price of one cluster is two times more expensive than that of the other cluster. Figure 2.23 shows the breakdown of the server energy for different price ratios. The results show that the policy allocates more tasks to the cluster that has lower energy prices. For example, with a high F value, the energy price of C1 is much cheaper than that of C2, resulting in more tasks being allocated to C1.

Figure 2.22: Summary of cluster-level energy cost reduction

Figure 2.23: Energy breakdown over different price ratios between clusters

2.10 Conclusion

In this chapter, we propose P4, which characterizes diverse workloads for heterogeneous computing ecosystems and accurately predicts power and performance across different CPU architectures and computing platform configurations. Our technique automatically selects the event counters strongly related to the power consumption and extracts application phases, which represent groups of similar system usage behavior, without a priori knowledge. Then, it automatically trains neural networks to predict power and performance across different platforms

at runtime with negligible overhead. In our evaluation conducted on four heterogeneous computing platforms, we showed that our framework successfully recognizes the distinct application power states, and accurately predicts power consumption with less than 7.2% error across all the diverse configuration changes, including frequency levels, platforms, and CPU architectures. Based on the prediction technique, we also propose a predictive management technique which performs intelligent task allocation for ML workloads. Our evaluation results show that it can save energy and cost by 16%. In the next chapter, we present how we can further advance the efficiency of learning by designing a new class of ML algorithms.

This chapter contains material from "P4: Phase-Based Power/Performance Prediction of Heterogeneous Systems via Neural Networks", by Yeseong Kim, Pietro Mercati, Ankit More, Emily Shriver, and Tajana S. Rosing, which appears in the International Conference on Computer-Aided Design, November 2017. The dissertation author was the primary investigator and author of this paper.

Chapter 3

Hyperdimensional Computing for Efficient Learning in IoT Systems

3.1 Introduction

In the previous chapter, we presented how we can improve the efficiency of state-of-the-art learning algorithms by intelligently predicting their resource usage and carefully allocating their tasks. Our predictive management framework primarily aims to distribute the tasks to clusters of powerful machines, since the focus is on the allocation of computationally expensive ML algorithms. However, the emergence of the IoT raises several other issues. The amount of data created by billions of distributed devices adds a significant computation burden to the centralized cloud. In addition, sending sensitive user information may pose privacy and security concerns. An alternative solution is to run these tasks in a more localized way, e.g., on the IoT gateways at the edge [9, 75]. The local IoT devices typically have fewer computing resources than the cloud servers and run on low-power processors, such as ARM or Intel Atom. Today, ML algorithms are too complex to be trained on IoT devices [76, 77, 78]. We need a new ML technique that can be efficiently processed even on embedded devices. To achieve this goal, we have developed a new methodology which can efficiently perform learning tasks based on hyperdimensional (HD) computing. HD computing was recently developed as an alternative computing method inspired by the human brain [79]. It represents the brain's memory using data encoded into vectors of large dimensionality, called hypervectors. Earlier works show that HD computing can offer high efficiency for many classification tasks, e.g., voice recognition [80] and language identification [81]. HD computing is particularly suitable for sensor-based classification tasks like human activity recognition in IoT devices, since it is robust against most hardware failure mechanisms and thrives on the noisy and incomplete data that IoT sensors often provide [82]. In this chapter, we describe how HD computing can be applied to solve classification problems, focusing on human activity recognition as an example of IoT applications. Human-aware system design has been widely investigated to offer high interactivity and enhanced

efficiency under limited device resources in IoT environments. Earlier researchers recognized that understanding human behavior is an important task to accomplish such goals. For example, diverse techniques have exploited human activities and contexts as key control knobs of various system management schemes, including mobile systems [83] and smart homes [84]. Human activity recognition, such as motion detection, is a key part of these techniques. Machine learning (ML) techniques are often used to automatically identify the activities from various information, where devices in the loop need to collect the data using sensors, e.g., accelerometers and GPS. Our approach encodes the collected sensor samples into hypervectors, and combines the samples of each class into a single hypervector using robust algebra in HD space. Once the modeling is completed, we identify the human activity class for newly observed data encoded as a hypervector. To this end, we match the sample to the most similar hypervector in the model. We design different variants of the HD computing-based classification method for higher efficiency and classification accuracy. In this chapter, we present two key approaches, hypervector retraining and hypervector binarization. The hypervector retraining refines the models to achieve higher classification accuracy. Unlike previous work [85], during the retraining step, we exploit non-binarized hypervectors to get higher accuracy. The hypervector binarization then converts the trained hypervectors back to hypervectors of bitstreams, making the HD computation more suitable for less powerful IoT devices. In our evaluation, we compare our approach with state-of-the-art ML solutions. Our experimental results show that the proposed method can provide high accuracy and computing efficiency for popular human activity recognition problems. For example, as compared to neural network-based modeling [63], the HD computing method is 486x faster when running on an x86 processor. In addition, our design improves the performance of HD model-based inference tasks by up to 7x on a low-power ARM processor as compared to the deep learning model, while providing comparable classification accuracy.

3.2 Related Work

Hyperdimensional (HD) computing was first introduced in the field of neuroscience [79]. Prior researchers recognized that HD computing is effective for pattern-based cognitive tasks, and demonstrated diverse applications, such as language recognition [81], text classification [86], prediction from multimodal sensor fusion [87, 88], and speech recognition [80]. The work in [89] showed that bio-signal sensory data can be represented with hyperdimensional data. Some works have also shown that HD tasks can be efficiently performed on diverse computing devices. For example, hardware accelerator designs have been proposed to efficiently compute binarized hypervectors. Other works presented new memory architectures that perform HD operations inside memory arrays [90, 82]. Digital circuits for HD computing have also been designed, e.g., for the computation of Hamming distance search [82]. In this chapter, we focus on an example problem: "how can the human activity recognition problem be effectively mapped using HD computing?" In addition, we show how HD computing can be further optimized for IoT devices. Prior researchers have investigated how to understand and identify human activities and contexts. For example, the work in [91] presented a monitoring framework for human activity recognition which collects data from inertial measurement units (IMU). Some works have shown that daily activities can be captured by the sensors equipped in smartphone systems [92, 93]. Another line of research has focused on how to exploit the human activity and context information for diverse problems. For example, prior research has shown that understanding users' behavior and exploiting the behavioral characteristics can be used to improve system efficiency. In this context, earlier work proposed diverse system optimization techniques which identify user behaviors and interactions for mobile systems [83] and smart homes [84]. Prior work often utilized ML techniques to identify the activities, while relying on the computing capability of clouds through offloading, e.g., [94]. However, due to the massive data streams created in IoT systems, more lightweight alternatives are considered as

a key requirement in the system design.

3.3 HD Computing Primitives

In this section, we discuss the primitives of HD computing.

Data type: Unlike conventional computing methods, the basic data type of HD computing is the hypervector, which often has many elements, e.g., more than one thousand. We denote the dimensionality of the hypervector by D. For example, collected data are converted to hypervectors for subsequent HD procedures, e.g., classification tasks. In earlier HD work, e.g., [80], each element of a hypervector is assumed to be a bit. In contrast, since a hypervector containing numbers may include richer information, some recent HD applications choose this data type for their implementations [95]. We call these two types binary hypervectors and non-binary hypervectors. An element of a binary hypervector can be either 0 or 1. For the non-binary hypervector, the elements can be any real numbers.

Property of hypervectors: An important characteristic used in HD computing is the orthogonality of hypervectors. Let us assume that there are two hypervectors, A and B. Two non-binary hypervectors are defined to be orthogonal if the cosine similarity of A and B is zero. For binary hypervectors, we can define the orthogonality by mapping the hypervector element of 0 to -1. Since a hypervector has a large number of elements, we can easily find many pairs of orthogonal hypervectors by randomly selecting their elements. For example, let us assume that we randomly choose the elements of two non-binary hypervectors, A and B, from {-1, 1}. In the cosine similarity computation, the element-wise multiplication makes each element either -1 or 1 with 50% chance, and the summation of all elements is very close to zero, i.e., the vectors are nearly orthogonal. In contrast, if two hypervectors are computed so as to be similar, the cosine similarity has a high value.

HD arithmetic operations: HD arithmetic operations make it possible to associate multiple hypervectors. In this chapter, we utilize three major operations.

• Binding: Two hypervectors A and B are combined into a single hypervector. We denote this operation by A × B. For binary hypervectors, element-wise XORing accomplishes this procedure; element-wise multiplication is used for non-binary hypervectors. The binding operation preserves the orthogonality of hypervectors. For example, when we have three randomly created hypervectors, say X, Y, and Z, the hypervector X is still near-orthogonal to the binding of the other two, Y × Z.

• Bundling: This operation is denoted by the + symbol. Component-wise addition implements the bundling for non-binary hypervectors. Since bundling binary hypervectors yields a non-binary hypervector, a majority function is applied afterward. For example, when n binary hypervectors are bundled, we first apply the element-wise addition, and then set each element whose value is greater than n/2 to 1, and to 0 otherwise.

We denote this operation by $[A_0 + A_1 + \cdots + A_n]$. The bundling operation preserves similarity with the combined hypervectors. For example, for two randomly generated hypervectors A and B, the cosine similarity between A and A + B is cos(π/4), i.e., greater than zero.

• Detaching: This is the counter operation of bundling for non-binary hypervectors. Component-wise subtraction implements this operation, and we denote it by the − symbol. Detaching makes the cosine similarity between the two operand hypervectors either smaller or negative.

Associative search: The binary and non-binary hypervectors use Hamming distance and cosine distance, respectively, as their distance metrics. For simplicity, we denote the metric appropriate for each case by δ(A,B). When we have multiple hypervectors, the associative search is used to find the most similar hypervector using this metric.

Figure 3.1: Overview of HD-Based Classification (Example: Human Activity Recognition)

For example, when we have m hypervectors, $H_1, \cdots, H_m$, the associative search for another hypervector A looks for the hypervector $H_i$ whose $\delta(H_i, A)$ is the highest.
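To make these primitives concrete, the following is a minimal numpy sketch for non-binary hypervectors with elements in {-1, +1}; the dimensionality and helper names are illustrative assumptions, not the implementation used in this chapter.

```python
# Minimal numpy sketch of the HD primitives (non-binary, -1/+1 elements).
import numpy as np

D = 10000
rng = np.random.default_rng(0)

def random_hv():
    return rng.choice([-1, 1], size=D)          # near-orthogonal to others

bind   = lambda a, b: a * b                     # element-wise multiplication
bundle = lambda hvs: np.sum(hvs, axis=0)        # element-wise addition
detach = lambda a, b: a - b                     # element-wise subtraction

def cosine(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def associative_search(query, model_hvs):
    """Return the index of the most similar class hypervector."""
    return int(np.argmax([cosine(query, m) for m in model_hvs]))
```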

3.4 HD-Based Classification

3.4.1 Design Overview

Figure 3.1 describes our design that performs classification tasks, such as human activity recognition, based on HD computing. We collect raw data from the external sensors in IoT devices, e.g., IMUs of wireless embedded devices and accelerometers in smartphones. Instead of using real numbers, we convert each collected sample, which includes multiple measurements, to a hypervector. We call this step encoding. With the encoded hypervectors and their original activity labels, e.g., walking, running, and standing, we train the hypervector model. To classify K classes, the trained model includes K hypervectors, one for each class. The training procedure consists of three parts: one-shot learning, retraining, and model binarization. In the one-shot learning,

our design reads and processes the hypervectors of the samples one by one. Then, the retraining refines the hypervector models by considering the samples again over multiple iterations. In the next step, we convert the model to binarized hypervectors for performance improvements. With the trained model, we perform the inference of the class. The goal of the inference procedure is to classify a collected sample with an unknown label into an activity class. Our design accomplishes the inference by performing the associative search over the model hypervectors.

3.4.2 Sensor Data Encoding

To enable HD computing, we encode the collected raw data into hypervectors. Let us assume that a sample collected at a given time includes F values, i.e., $S = \langle v_1, \cdots, v_F \rangle$, where each $v_i$ is the raw value measured by one sensor. To find the patterns of sample hypervectors for each human activity, the encoding procedure considers the impact of i) the value of each sensor measurement and ii) the differences among all the sensors in the system.

The first step of the encoding is to convert a measurement value, $v_i$, into a hypervector. As discussed in the background section, the similarity between two hypervectors, A and B, is determined with a metric, i.e., δ(A,B). Thus, we encode each value so that the corresponding hypervector keeps the relative difference across the measurement values of different samples under the distance metric. To this end, we utilize the measurement range of each sensor. For example, if a sensor produces a value in a range $[V_{min}, V_{max}]$, the minimum and maximum values correspond to two hypervectors, $L_{min}$ and $L_{max}$, where $L_{min}$ and $L_{max}$ are orthogonal to each other.

We represent any measurement value using these two hypervectors. $L_{min}$, with dimensionality D, is first created by randomly choosing its elements. Using $L_{min}$, we create another hypervector, say $L_1$, by flipping $D/(2Q)$ elements, where Q is a configurable value. We repeat this procedure Q times to obtain $L_1, L_2, \cdots, L_Q$; e.g., flipping elements of $L_1$ creates $L_2$. Note that $L_Q$ is orthogonal to $L_{min}$, thus $L_Q = L_{max}$. We call these created hypervectors level hypervectors. A

level hypervector corresponds to each measurement value while preserving the relative difference of the measurement values. To this end, the measurement range is quantized into Q levels, and each quantized subrange is mapped to a level hypervector. In the second step of the encoding, we combine the different sensor values of a sample to represent it with a single hypervector. To distinguish different sensors in the hypervector representation, we utilize another set of hypervectors, $B_1, \cdots, B_F$, called base hypervectors, whose elements are randomly chosen for orthogonality. Let us assume that each $v_i$ value corresponds to a level hypervector, $L_i$. The encoded hypervector for the sample is computed by:

H = L1 × B1 + ··· + LF × BF .

Since the Bi hypervectors are orthogonal, even though we use the same set of level hypervectors for different sensors, our training step still distinguishes the impact of different sensors within the encoded hypervector. All of the random hypervectors, i.e., Lmin and Bi, need to be created only once and are reused for the entire recognition procedure. Note that the elements of the encoded hypervectors, called sample hypervectors, are 0 or 1 when using binary hypervectors, and -1 or 1 in the non-binary case.
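To make the two encoding steps concrete, the following NumPy sketch builds the level hypervectors by successive bit flips and binds each quantized measurement to its sensor's base hypervector. It assumes bipolar ({-1, +1}) hypervectors and uses illustrative helper names; it is a minimal sketch, not the implementation evaluated in this chapter.

```python
import numpy as np

rng = np.random.default_rng(0)
D, Q = 1000, 8          # dimension and quantization levels used in Section 3.5

def make_level_hvs(D, Q):
    """Create Q+1 level hypervectors: start from a random bipolar L_min and
    flip D/(2Q) fresh positions per step, so L_Q (= L_max) ends up
    near-orthogonal to L_min."""
    levels = [rng.choice([-1, 1], size=D)]
    flip = D // (2 * Q)
    order = rng.permutation(D)          # flip disjoint positions at each step
    for q in range(Q):
        hv = levels[-1].copy()
        hv[order[q * flip:(q + 1) * flip]] *= -1
        levels.append(hv)
    return np.array(levels)             # shape (Q+1, D)

def encode_sample(values, v_min, v_max, level_hvs, base_hvs):
    """H = L_1 x B_1 + ... + L_F x B_F for one sample of F measurements."""
    Q = level_hvs.shape[0] - 1
    H = np.zeros(level_hvs.shape[1])
    for v, B in zip(values, base_hvs):
        q = int(np.clip((v - v_min) / (v_max - v_min) * Q, 0, Q))
        H += level_hvs[q] * B           # bind the value level to its sensor
    return H

F = 5                                    # number of sensor measurements
base_hvs = rng.choice([-1, 1], size=(F, D))   # one random base HV per sensor
level_hvs = make_level_hvs(D, Q)
sample = rng.uniform(-1.0, 1.0, size=F)       # raw sensor readings
H = encode_sample(sample, -1.0, 1.0, level_hvs, base_hvs)
```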

3.4.3 Model Training

In this procedure, our design trains the model by combining the sample hypervectors. The goal is to learn the patterns of sensor values that exist within a class. Let us assume that the training dataset includes N samples, which are encoded into N hypervectors, H1, ..., HN.

Each sample hypervector corresponds to an activity class, say ci. One-shot training: The first step of the training is to bundle the hypervectors of each class. We call this computation one-shot training. For example, let us assume that there are l hypervectors, H1, H2, ..., Hl, all of which belong to the same class. The bundling operation makes another hypervector, M = H1 + ··· + Hl.

Figure 3.2: Encoding of Sensor Measurements

Now, let us assume that we have another hypervector Htest which is very similar to H1 under the distance metric. In this case,

δ(M, Htest) is likely to be a positive value. Furthermore, if Htest is similar to the majority of the hypervectors combined into M, δ(M, Htest) yields a much higher value. Based on this observation, we create the one-shot model, say M1, ..., MK, by bundling all sample hypervectors included in each of the K activity classes. Retraining: An issue of the one-shot model is that, although the bundled hypervectors capture the major similarity within each class, it does not account for hypervector differences across classes. In addition, bundling a large number of hypervectors may degrade classification quality when a large variety of patterns exists in each class. Thus, we refine the model to i) better identify the discrepancy between different classes and ii) recognize the common pattern existing in each class. Algorithm 1 illustrates our retraining procedure to reduce the misclassification rate of the activity recognition. Starting from the one-shot model, our design verifies the classification result for each sample using the associative search. If a sample is wrongly classified, we modify two model hypervectors, i.e., the hypervector of the target class and that of the misclassified class.

Algorithm 1: Pseudocode of the retraining procedure
    t ← 0
    while t < # of iterations do
        t ← t + 1
        for each sample hypervector Hi do
            ρ ← associative search for Hi in the model
            if ρ ≠ ci then
                Mci ← Mci + Hi
                Mρ ← Mρ − Hi
            end
        end
    end

We first bundle the sample hypervector once more into the correct class so that the model hypervector converges faster toward the misclassified sample. The second task is detaching the hypervector from the wrong class to enlarge the difference between the two model hypervectors. We repeat this updating process multiple times over the training dataset, and the accuracy converges with sufficient iterations. Model Binarization: Since our model retraining algorithm exploits element-wise addition and subtraction in the bundling and detaching operations, it consequently creates non-binary hypervectors as the model. Even though this makes the model more accurate, the model size and the computation costs of the inference also increase. Since many devices in IoT environments that run the activity recognition are less powerful, we optimize the model by converting it to binary hypervectors. We update the model depending on the sign of each hypervector element, i.e., choosing 1 if the element value is positive and 0 if it is negative.
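The training procedure described above can be summarized with the following sketch, which performs the one-shot bundling, the retraining loop of Algorithm 1 with a cosine-based associative search, and the final binarization. The function names are illustrative, and the sketch assumes the encoded sample hypervectors and labels are already available.

```python
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def train_hd_model(samples, labels, num_classes, iters=20):
    """One-shot bundling followed by the retraining loop of Algorithm 1."""
    model = np.zeros((num_classes, samples.shape[1]))
    for H, c in zip(samples, labels):          # one-shot learning
        model[c] += H
    for _ in range(iters):                     # retraining
        for H, c in zip(samples, labels):
            pred = max(range(num_classes), key=lambda k: cosine(model[k], H))
            if pred != c:
                model[c] += H                  # bundle into the correct class
                model[pred] -= H               # detach from the wrong class
    return model

def binarize(model):
    """Model binarization: keep 1 where an element is positive, else 0."""
    return (model > 0).astype(np.uint8)
```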

3.4.4 Model-Based Inference

Once the model is trained, it is ready to perform inference for samples whose labels are unknown. We first encode the sample values using the same level and base hypervectors used in the training step.

Then, our design finds which model hypervector is the most similar to the given sample hypervector using the associative search. Note that, in the associative search, we use different distance metrics based on the data type of the model. In general, the non-binarized model provides better accuracy. In contrast, the binarized model processes the inference more efficiently, since the Hamming distance can be computed with bitwise XOR operations over the smaller model, unlike the element-wise integer additions required for the cosine distance computation.
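A minimal sketch of the associative search for both model types is shown below; it assumes the query hypervector is binarized with the same sign rule when the binary model is used, and the helper names are illustrative.

```python
import numpy as np

def classify_nonbinary(sample_hv, model):
    """Associative search with cosine similarity on the integer model."""
    sims = model @ sample_hv / (
        np.linalg.norm(model, axis=1) * np.linalg.norm(sample_hv) + 1e-12)
    return int(np.argmax(sims))

def classify_binary(sample_hv_bits, bin_model):
    """Associative search with Hamming distance on the binarized model:
    XOR plus a population count gives the number of mismatching dimensions,
    so the class with the smallest distance wins."""
    dists = np.count_nonzero(bin_model ^ sample_hv_bits, axis=1)
    return int(np.argmin(dists))
```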

3.5 Evaluation

3.5.1 Experimental Setup

To evaluate how the proposed design works in a heterogeneous IoT environment, we utilize two different devices running on a 2.8 GHz Intel Core i7 (x86) and a 1.4 GHz ARM Cortex-A53 (ARM) processor. In both cases, we execute the same code implemented with Python 2.7 and NumPy, which uses a C++ backend. We compare our approach with state-of-the-art deep neural network (DNN) models implemented using Google TensorFlow. Since our design can create binarized hypervector models, for a fair comparison, we also evaluate binarized neural network (BNN) models. The neural network models have three hidden layers of 512 neurons, and the DNN and BNN models are trained with the ADAM optimizer for 10 and 100 epochs, respectively, so that the accuracy converges. For the efficiency comparison, we measure the execution time of the training and testing procedures. We evaluate our approach using three practical datasets as follows. UCIHAR: This dataset includes the sensor measurements of the accelerometers and gyroscopes of a smartphone worn on the waist of users. The goal is to classify twelve activity classes, e.g., walking, walking up/downstairs, sitting, and standing. PAMAP2: The dataset contains data measured from three IMUs located at the wrist, chest, and ankle of users, along with a heart rate monitor. The goal is to classify five basic activities, e.g., walking and sitting. We exploit the feature extraction method suggested by the authors.

Table 3.1: Evaluated Datasets (F: the number of features, K: the number of activity classes, Ntrain: the number of samples in the training dataset, Ntest: the number of samples in the testing dataset)

Name          Data Size   F     K    Ntrain    Ntest
UCIHAR [92]   10MB        561   12   6,213     1,554
PAMAP2 [91]   240MB       75    5    611,142   101,582
EXTRA [93]    140MB       225   4    146,869   16,343

Figure 3.3: Accuracy Comparison for Different Modeling Methods

EXTRA: The dataset has measurements of heterogeneous sensors from smartphones and smartwatches. We choose to classify the activity labels of phone locations, e.g., whether the phone is located on a table, in a pocket, in a bag, or in a hand. Note that these activities are related to diverse device control problems, e.g., thermal management of mobile devices [96]. Table 3.1 summarizes the dataset sizes. In our evaluation, we set the quantization level to 8, the number of retraining iterations to 20, and the dimension of hypervectors to 1,000, since there is no accuracy gain with larger values.

Figure 3.4: Efficiency Comparison for Training and Inference

3.5.2 Classification Accuracy

Figure 3.3 compares the accuracy of the different modeling methods. The results show that the proposed retraining method improves the classification accuracy. For example, when using the non-binary hypervector models, the accuracy improvement is 3% on average. We observe higher accuracy improvements for the binarized hypervector models, 4% on average. Through the retraining procedure, we train HD models that have accuracy comparable to the DNN and BNN models. For example, for the UCIHAR dataset, the accuracy difference between the non-binary model and the DNN is only 0.2%. The accuracy difference between the binary and non-binary models is 8% on average. In the next section, we evaluate how much performance can be improved by the model binarization.

3.5.3 Efficiency Comparison

Training Efficiency We evaluate the efficiency of the different modeling methods. Figure 3.4(a) compares the efficiency of our design with the state-of-the-art DNN and BNN models. The results are reported for the non-binarized models, since the overhead of the model binarization is negligible.1 In this comparison, the HD modeling and the neural network training were both executed on the x86 processor. The results show that the proposed method offers higher performance efficiency than the neural network training. For example, for the UCIHAR dataset, training the HD model with the retraining procedure is 4x and 56x faster than training the DNN and BNN models, respectively.

1The model binarization requires updating the trained hypervectors only once, after the entire retraining procedure.

Note that the accuracy difference between the two models is only 0.2%, as presented in the previous section. In addition, when a small amount of accuracy loss is acceptable, our design can also train the model without retraining. In that case, we observe a speedup of up to 486x compared to the BNN approach. Figure 3.4(b) compares the execution time of the training procedure on the two different processors. The results suggest that the proposed design can efficiently train the hypervector model even on the low-power processor. For example, for the PAMAP2 dataset, the training time including the retraining takes only 26 seconds on the ARM processor. Training the one-shot model takes only 4 seconds. Thus, we conclude that the proposed design can efficiently process the activity recognition tasks in IoT systems, since many IoT devices in the loop are expected to run on low-power processors with tight resource budgets. Inference Efficiency With the trained model, our design performs the inference task for each collected sample. Figure 3.4(c) shows the execution time needed to process the inference procedure for each sample. In this evaluation, we compare the non-binary model to the binary model. The result shows that the model binarization significantly improves the inference procedure. The speedup is 8.4x and 7.1x for the x86 and ARM cases, respectively. For the ARM processor, the inference based on the non-binarized model takes 2 ms on average, while the binarized model takes only 0.28 ms. In IoT systems, the sensors are often attached to devices running on such low-power processors. Thus, when a small amount of accuracy loss is acceptable, the binarized model is preferable, e.g., for serving the real-time needs of the activity recognition.

3.6 Conclusion

In this chapter, we present a new algorithm that utilizes HD computing to enable efficient learning on low-power embedded devices. We also show optimization techniques that improve the accuracy of the HD-based classification and the performance efficiency of the inference procedure. In our evaluation conducted with practical human activity recognition tasks, the proposed design is up to 486x faster for training compared to the neural network models [63]. The algorithm proposed in this chapter runs the learning on a single IoT device. In the next chapter, we discuss how to extend and deploy the proposed learning method into a hierarchy of multiple IoT nodes. This chapter contains material from “Efficient Human Activity Recognition Using Hyperdimensional Computing”, by Yeseong Kim, Mohsen Imani, and Tajana S. Rosing, which appears in IEEE Conference on Internet of Things, October 2018. The dissertation author was the primary investigator and author of this paper.

Chapter 4

Collaborative Learning with Hyperdimensional Computing

4.1 Introduction

In the previous chapter, we proposed an HD computing-based learning method that encodes the data into hypervectors and performs the rest of the learning procedure on a single node. In practice, the learning in many IoT systems is done with data that is held by a large number of devices. To analyze the collected data using machine learning algorithms, IoT systems typically send the data to a centralized location, e.g., local servers, cloudlets, and data centers [76, 77, 78]. However, sending the data not only consumes significant bandwidth and battery power, but is also undesirable due to privacy and security concerns [97, 98, 99, 100]. Many machine learning models require unencrypted data, e.g., original images, to train models and perform inference. When offloading the computation tasks, sensitive information is exposed to the untrustworthy cloud system, which is susceptible to internal and external attacks [101, 102]. The users may also be unwilling to share the original data with the cloud and other users [103, 104, 105, 106]. An existing strategy applicable to this scenario is to use Homomorphic Encryption (HE). HE encrypts the raw data and allows certain operations to be performed directly on the ciphertext without decryption [16]. However, this approach significantly increases the computation burden. For example, in our evaluation with SEAL, a state-of-the-art homomorphic encryption library [17], it takes around 14 days to encrypt all of the 28x28 pixel images in the entire MNIST dataset, and the encryption increases the data size 28 times. More recently, Google presented a protocol for secure aggregation of high-dimensional data that can be used in a federated learning model [107]. This approach trains Deep Neural Networks (DNNs) when data is distributed over different users. In this technique, the users' devices run the DNN training task locally to update the global model. However, IoT edge devices often do not have enough computation resources to perform such complex DNN training. In this chapter, we design SecureHD, an efficient, scalable, and secure collaborative learning framework for distributed computing in the IoT hierarchy. HD computing does not require

complete knowledge of the original data that the conventional learning algorithms need – it runs with a mapping function that encodes data into a high-dimensional space. The original data cannot be reconstructed from the mapped data without knowing the mapping function, resulting in secure computation. We address several technical challenges to enable HD-based trustworthy, collaborative learning. To map the original data into hypervectors, HD computing uses a set of randomly-generated base hypervectors as described in Chapter 3. Since the base hypervectors can be used to estimate the original data, every user has to have different base hypervectors to ensure the confidentiality of the data. However, in this case, the HD computation cannot be performed on the data provided by different users. SecureHD fills the gap between the existing HD computing and trustworthy, collaborative learning by providing the following contributions: i) We design a novel secure collaborative learning protocol that securely generates and distributes public and secret keys. SecureHD utilizes Multi-Party Computation (MPC) techniques, which are proven to be secure even when each party is untrusted [108]. With the generated keys, the user data are not revealed to the cloud server, while the server can still learn a model based on the data encoded by users. Since MPC is an expensive protocol, we carefully optimize it by replacing a part of the tasks with two-party computation. In addition, our design leverages MPC only for a one-time key generation operation. The rest of the operations, such as encoding, decoding, and learning, are performed without using MPC. ii) We propose a new encoding method that maps the original data with the secret key assigned to each user. Our encoding method significantly improves classification accuracy as compared to the state-of-the-art HD work [109, 82]. Unlike existing HD encoding functions, the proposed method encodes both the data and the metadata, e.g., data types and color depths, in a recovery-friendly manner. Since the secret key of each user is not disclosed to anyone, even if one obtains the encoded data of other users, it cannot be decoded.

iii) SecureHD provides a robust decoding method for the authorized user who has the secret key. We show that the cosine similarity metric widely used in HD computing is not suitable for recovering the original data. We propose a new decoding method which recovers the encoded data in a lossless manner through an iterative procedure. iv) We present scalable HD-based classification methods for many practical learning problems which need the collaboration of many users, e.g., human activity and face image recognition. We propose two collaborative learning approaches: cloud-centric learning for the case that end-node devices do not have enough computing capability, and edge-based learning in which all the user devices participate in secure distributed learning. v) We also show a hardware accelerator design that significantly minimizes the costs paid for security. This enables secure HD computing on less-powerful edge devices, e.g., gateways, which are responsible for data encryption/decryption. We design and implement the proposed SecureHD framework on diverse computing devices in IoT systems, including a gateway-level device, a high-performance system, and our proposed hardware accelerator. In our evaluations, we show that the proposed framework can perform the encoding and decoding tasks 145.6× and 6.8× faster, respectively, than a state-of-the-art homomorphic encryption library when both are running on the Intel i7-8700K. The hardware accelerator further improves the performance efficiency by 35.5× and 20.4× as compared to the CPU-based encoding and decoding of SecureHD. In addition, our classification method presents high accuracy and scalability for diverse practical problems. It successfully performs learning tasks with 95% average accuracy on six real-world workloads, ranging from datasets collected in a small IoT network, e.g., human activity recognition, to a large dataset which includes hundreds of thousands of images for the face recognition task. Our decoding method also provides high quality in the data recovery. For example, SecureHD can recover the encoded data in a lossless manner, where the size of the encoded data is 4 times smaller than the data encrypted by the state-of-the-art homomorphic encryption library [110].

Figure 4.1: Motivational scenario

4.2 Motivational Scenario

Figure 4.1 shows the scenario that we focus on in this chapter. The clients, e.g., user devices, send either their sensitive data or partially trained models in an encrypted form to the cloud. The cloud performs a learning task by collecting the encrypted information received from multiple clients. In our security model, we assume that a client can trust neither the cloud nor the other clients. When requested by the user, the cloud sends the encrypted data back to the clients. The client then decrypts the data with its private key. As an existing solution, homomorphic encryption enables processing on the encrypted version of the data [16]. Figure 4.2 shows the execution time of a state-of-the-art homomorphic encryption library, Microsoft SEAL [17], for the MNIST training dataset, which includes 60,000 images of 28×28 pixels. We execute the library on two platforms that a client in IoT systems may use, a high-performance computer (Intel i7-8700K) and a Raspberry Pi 3 (ARM Cortex-A53). The result shows that, even with this simple dataset of 47 MBytes, it takes a significantly long execution time, e.g., more than 13 days on ARM to encrypt the data.

Figure 4.2: Execution time of homomorphic encryption and decryption over the MNIST dataset (encryption: 13.8 days on ARM, 20.8 hours on Intel i7; decryption: 2.7 days on ARM, 1.6 hours on Intel i7)

Another approach is to utilize secure Multi-Party Computation (MPC) techniques [111, 108]. In theory, any function that can be represented as a Boolean circuit with inputs from multiple parties can be evaluated securely without disclosing each party's input to anyone else. For example, by describing the machine learning algorithm as a Boolean circuit with the learning data as inputs to the circuit, one can securely learn the model. However, such solutions are very costly in practice and are computation and communication intensive. In SecureHD, we only use MPC to securely generate and distribute users' private keys, which is orders of magnitude less costly than performing the complete learning task using MPC. The key generation step is a one-time operation, so the small cost associated with it is quickly amortized over time for future tasks.

4.3 Related Work

Privacy-preserving deep learning and classification has been an active research area in recent years [107, 112, 113, 114, 115, 116, 117]. Shokri and Shmatikov [113] have proposed a solution for collaborative deep learning where the training data is distributed among many parties.

Figure 4.3: Overview of SecureHD

Each party locally trains its model and sends the parameter updates to the server. However, it has been shown that Generative Adversarial Networks (GANs) can be used to attack this method [118]. SecureML [115] is a framework for secure training of machine learning models. All of the computation of SecureML is performed by two servers using MPC protocols, whereas SecureHD only relies on the MPC protocol for secure key generation and distribution. Chameleon [112] is a privacy-preserving machine learning framework that utilizes different cryptographic protocols for different operations within the machine learning task. In contrast to SecureML and Chameleon, our solution does not require two non-colluding servers and involves only one server. Google has also proposed a federated learning approach [107] for collaborative learning. In that approach, each client needs to learn a local model based on its private training data to update the central model in the cloud. In contrast, our solution is lightweight enough to run on less-powerful IoT devices and is also applicable to other cloud-oriented tasks, e.g., data storage services.

70 4.4 Secure Learning in HD Space

4.4.1 Security Model

In SecureHD, we consider the server and other clients to be untrusted. More precisely, we consider the Honest-but-Curious (HbC) adversary model, where each party, whether the server or a client, is untrusted but follows the protocol. Neither the server nor the other clients are able to extract any information from the data that they receive and send during the secure computation protocol. For the task of key generation and distribution, we utilize a secure MPC protocol which is proven to be secure in the HbC adversary model [108]. We also use the two-party Yao's Garbled Circuits (GC) protocol, which is likewise proven to be secure in the HbC adversary model [119]. The intermediate results are stored as unique additive shares of the PKeys held by each client and the server.

4.4.2 Proposed Framework

In this section, we describe the proposed SecureHD framework, which enables trustworthy, collaborative HD computing. Figure 4.3 illustrates the overview of SecureHD. The first step is to create different keys for each user and the cloud based on an MPC protocol. To perform an HD learning task, the data are encoded with a set of base hypervectors. The MPC protocol creates the base hypervectors for the learning application, called global keys (GKeys). Instead of sharing the original GKeys with clients, the server distributes permutations of each GKey, i.e., hypervectors whose dimensions are randomly shuffled. Since each user has different permutations of the GKeys, called personal keys (PKeys), no one can decode the encoded data of others. The cloud holds the dimension indexes used in the GKey shuffling, called shuffling keys (SKeys). Since the cloud does not have the GKeys, it cannot decrypt the encoded data of clients. This MPC-based key generation runs only once. After the key generation, each client can encode its data with its PKeys. SecureHD securely injects a small amount of information into the encoded data. We exploit this technique to store the metadata, e.g., data types, which are important to recover the entire original data.

Figure 4.4: MPC-based key generation

Once the encoded data is sent to the cloud, the cloud reshuffles the encoded data with the SKey of the client. This allows the cloud to perform the learning task without needing to access the GKeys and PKeys. With the SecureHD framework, the client can also decode the data from the encoded hypervectors. For example, once a client fetches the encoded data from the cloud storage service, it can exploit the framework to recover the original data using its own PKeys. Each client may also utilize specialized hardware to accelerate both the encoding and decoding procedures.

4.4.3 Secure Key Generation and Distribution

Figure 4.4 illustrates how our protocol securely creates the key hypervectors. The protocol runs in two phases: Phase 1, in which all clients and the cloud participate, and Phase 2, in which only two parties, a single client and the cloud, participate. Recall that in order for the cloud server to be able to learn the model, all data have to be projected based on the same base hypervectors. At the same time, given the base hypervectors and the encoded result, one can reconstruct the plaintext data.

Figure 4.5: Illustration of SecureHD encoding and decoding procedures

Therefore, all clients have to use the same key without anyone having access to the base hypervectors. We satisfy these two constraints at the same time with a novel hybrid secure computation solution. In the first phase, we generate the base hypervectors, which we denote by GKey. The main idea is that the base hypervectors are generated collaboratively inside the secure Multi-Party Computation (MPC) protocol. At the beginning of the first phase, each party i inputs two sets

of random strings, called $S_i$ and $S_i^*$. Each stream has length D, where D is the dimension size of a hypervector. The MPC protocol computes the element-wise XOR (⊕) of all the provided bitstreams, and the resulting bitstream of D elements represents the global base hypervector, i.e., the GKey. Then, it

XORs the GKey again with the $S_i^*$ provided by each client. At the end of the first MPC protocol phase, the cloud receives $S_i^* \oplus GKey$ corresponding to each user i and stores these secret keys. Note that since $S_i$ and $S_i^*$ are inputs from each user to the MPC protocol, they are not revealed to any other party during the joint computation. It can be seen that the server has a unique XOR-share of the global key GKey for each user. This, in turn, enables the server and each party to continue their computation in a point-to-point manner without involving other parties during the second phase. Our approach has the strong property that even if all other clients are dishonest and provide zero vectors as their share to generate the GKey, the security of our system is not hindered. The

reason is that the GKey is generated as the XOR of the $S_i$ of all clients. That is, if even one client generates its seed randomly, the global key will have a uniformly random distribution. In addition, the server only receives an XOR-share of the global key. The XOR-sharing technique is equivalent to One-Time Pad encryption and is information-theoretically secure, which is superior to the security against computationally-bounded adversaries in standard encryption schemes such as the Advanced Encryption Standard (AES). We only use XOR gates in MPC, which are considerably less costly than non-XOR gates [120]. In the second phase, the protocol distributes the secret key to each user. Each party engages in a two-party secure computation using the GC protocol. The server's inputs are $SKey_i$ and $S_i^* \oplus GKey$, while the client's input is $S_i^*$. The global key GKey is securely reconstructed inside the GC protocol by the XOR of the two shares: $GKey = S_i^* \oplus (S_i^* \oplus GKey)$. The global key is then shuffled based on the unique permutation bits held by the server ($SKey_i$). In order to avoid costly random accesses inside the GC protocol, we use the Waksman permutation network with $SKey_i$ as the permutation bits [121]. The shuffled global key is sent back to the user, and we perform a single rotational shift of the GKey to generate the next base hypervector. We repeat this n times, where n is the required number of base hypervectors, e.g., the feature size. The permuted base hypervectors serve as the user's personal keys, called PKeys, for the projection. Once a user performs the projection with the PKeys, she can send the result to the server, and the server permutes it back based on $SKey_i$ for the learning process.
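The share arithmetic behind this protocol can be illustrated with the following sketch. Note that it performs every step in the clear purely to show why the XOR-shares reconstruct the GKey and how the SKey permutation yields a PKey; in SecureHD these steps run inside the MPC and GC protocols, and the rotational shift that derives the remaining base hypervectors is omitted here. All names are illustrative.

```python
import numpy as np

D, n_clients = 10_000, 3
rng = np.random.default_rng(1)

# Phase 1 (run inside MPC in SecureHD): every party contributes S_i; the GKey
# is the XOR of all contributions, and the cloud stores S_i* XOR GKey per user.
S      = rng.integers(0, 2, size=(n_clients, D), dtype=np.uint8)
S_star = rng.integers(0, 2, size=(n_clients, D), dtype=np.uint8)
gkey = np.bitwise_xor.reduce(S, axis=0)
server_shares = S_star ^ gkey                 # what the cloud keeps per client

# Phase 2 (run inside GC in SecureHD): client i and the cloud jointly rebuild
# the GKey from the two shares and shuffle it with the server-held SKey_i.
i = 0
skey_i = rng.permutation(D)                   # server's permutation bits
rebuilt = S_star[i] ^ server_shares[i]        # = S_i* XOR (S_i* XOR GKey)
assert np.array_equal(rebuilt, gkey)
pkey_i = rebuilt[skey_i]                      # client i's personal base key

# The cloud can later align data encoded with PKey_i back onto the GKey base
# by inverting the permutation, without ever holding the GKey itself.
inverse = np.argsort(skey_i)
assert np.array_equal(pkey_i[inverse], gkey)
```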

4.5 SecureHD Encoding and Decoding

Figure 4.5 shows how the SecureHD framework performs the encoding and decoding of a client with the generated PKeys. The example is shown for an image input with n pixel values, {f1, ..., fn}. Our design encodes each input into a high-dimensional vector from the feature values (A). It exploits the PKeys, i.e., the set of base hypervectors for the client,

where 0 and 1 in the PKeys correspond to -1 and 1, respectively, to form a bipolar hypervector ($\{-1,+1\}^D$).

We denote them by PKeys = {B1, ..., Bn}. To store the metadata with negligible impact on the encoded hypervector, we devise a method which injects several pieces of metadata into small segments of an encoded hypervector. This method exploits another set of base vectors, {M1, ..., Mk} (B). We call them metavectors. The encoded data are sent to the cloud to perform HD learning. Once the encoded data is received from the cloud, SecureHD can also decode it back to the original domain. This is useful for other cloud services, e.g., cloud storage. This procedure starts with identifying the injected metadata (C). Based on the injected metadata, it figures out the base hypervectors that will be used in the decoding. Then, it reconstructs the original data from the decoded data (D). The key to the data recovery procedure is the value extraction algorithm, which retrieves both the metadata and the data.

4.5.1 Encoding in HD Space

Data Encoding The first step of SecureHD is to encode the input data into a hypervector, where an original data point has n features. We associate each feature with a hypervector. The features can take discrete values (e.g., alphabet letters in text), in which case we perform a direct mapping to hypervectors, or they can have a continuous range, in which case the values can be quantized and then mapped in the same way as discrete features. Our goal is to encode each feature vector into a hypervector that has D dimensions, e.g., D = 10,000.

To differentiate the features, we exploit a PKey for each feature, i.e., {B1, B2, ..., Bn}, where n is the feature size of an original data point. Since the PKeys are generated from random bit streams, different base hypervectors are nearly orthogonal [122]:

$$\delta(B_i, B_j) \simeq 0 \quad (0 < i, j \leq n,\; i \neq j).$$

The orthogonality of the feature hypervectors is ensured as long as the hypervector dimension, D, is large enough compared to the number of features in the original data ($D \gg n$).

Figure 4.6: Value extraction example

Different features are combined by multiplying each feature value with the corresponding base hypervector, $B_i \in \{-1,+1\}^D$, and adding the results over all the features. For example, where $f_i$ is a feature value, the following equation represents the encoded hypervector, H:1

$$H = f_1 * B_1 + f_2 * B_2 + \cdots + f_n * B_n.$$

If two original feature values are similar, their encoded hypervectors are also similar, thus providing the learning capability to the cloud without any knowledge of the PKeys. Please note that, with this encoding scheme, even if an attacker intercepts a sufficient number of hypervectors, the upper bound of the information leakage is the distribution of the data. This is because the hypervector does not preserve any information about the feature order, e.g., pixel positions in an image, and the number of possible combinations of values in the hypervector elements grows exponentially as n increases. In the case that n is small, e.g., n < 20, we can simply add extra features drawn from a uniform random distribution, which does not affect the data recovery accuracy or the HD computation results.

1The scalar multiplication, denoted by *, can produce a hypervector with integer elements, i.e., $H \in \mathbb{N}^D$.
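A minimal sketch of this encoding step is shown below, assuming the PKeys have already been mapped to a bipolar matrix; the names and sizes are illustrative.

```python
import numpy as np

def encode(features, pkeys):
    """H = f_1*B_1 + ... + f_n*B_n, where pkeys is an (n, D) bipolar matrix
    ({-1, +1}) derived from the user's PKeys and features has length n."""
    return features @ pkeys                 # hypervector of length D

rng = np.random.default_rng(2)
n, D = 784, 10_000                          # e.g., an MNIST image, R = D/n > 7
pkeys = rng.choice([-1, 1], size=(n, D)).astype(np.float64)
features = rng.integers(0, 256, size=n).astype(np.float64)   # 8-bit pixels
H = encode(features, pkeys)
```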

Metadata Injection A client may receive an encoded hypervector in a setting where SecureHD processes multiple data types. In this case, to identify the base hypervectors used in the prior encoding, it needs to embed additional information, namely the data identifier and metadata such as the data type (e.g., image or text) and color depth. One naive way is to store this metadata as bits attached to the original hypervector. However, this does not keep the metadata secure. To embed the additional metadata into hypervectors, we exploit the fact that HD computing is robust to small modifications of hypervector elements. Let us consider a data hypervector as a concatenation of several partial vectors. For example, a single hypervector with D dimensions can be viewed as the concatenation of N different d-dimensional vectors, A1, ..., AN:

$$H = A_1 \frown A_2 \frown \cdots \frown A_N$$

where D = N × d, and each Ai vector is called a segment. We inject the metadata into a minimal number of segments. Figure 4.5 shows the division of a hypervector into N = 200 segments with d = 50 dimensions each. We first generate a random d-dimensional vector with bipolar values, Mi, i.e., a metavector. A metavector corresponds to a metadata type. For example, M1 and M2 can correspond to the image and text types, while M3, M4, and M5 correspond to color depths, e.g., 2-bit, 8-bit, and 32-bit. Our design injects each Mi into one of the segments of the data hypervector. We add the metavector multiple times to better distinguish it from the values already stored in the segment. For example, if we inject the metavectors into the first segment, the following equation denotes the metadata injection procedure:

$$A_1' = A_1 + C * M_1 + C * M_2 + \cdots + C * M_k$$ where C is the number of injections of each metavector.
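The injection itself can be sketched as follows, using the C and d values chosen for our evaluation (C = 128, d = 50); the variable names are illustrative.

```python
import numpy as np

def inject_metadata(H, metavectors, C=128, d=50):
    """A_1' = A_1 + C*M_1 + ... + C*M_k: add each d-dimensional metavector
    C times into the first segment of the encoded hypervector H."""
    H = H.copy()
    for M in metavectors:            # each M is a bipolar vector of length d
        H[:d] += C * M
    return H

rng = np.random.default_rng(3)
H = rng.integers(-500, 500, size=10_000).astype(np.float64)  # encoded data
meta_image = rng.choice([-1, 1], size=50)    # e.g., metavector for "image"
meta_8bit  = rng.choice([-1, 1], size=50)    # e.g., metavector for 8-bit depth
H_meta = inject_metadata(H, [meta_image, meta_8bit])
```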

77 4.5.2 Decoding in HD Space

Value Extraction In many of today's applications, the cloud is used as storage, so the clients should be able to recover the original data from the encoded data. The key component of the decoding procedure is a new data recovery method that extracts the feature values stored in the encoded hypervectors. Let us consider an example of $H = f_1 * B_1 + f_2 * B_2 + f_3 * B_3$, where

Bi is a base hypervector with D dimensions and fi is a feature value. The goal of the decoding procedure is to find fi for a given Bi and H. A possible way is to exploit the cosine similarity metric, δ. For example, if we measure the cosine similarity of the H and B1 hypervectors, δ(H, B1), a higher δ value represents a higher chance of the existence of B1 in H. Thus, one method might iteratively subtract instances of B1 from H and check when the cosine similarity reaches zero, i.e., $\delta(H', B_1)$ where $H' = H - m * B_1$.

Figure 4.6a shows an example of the cosine similarity for each Bi when f1 = 50, f2 = 26, and f3 = 77, as m changes from 1 to 120. The result shows that the similarity decreases as more instances of B1 are subtracted from H. For example, the similarity is zero when m is close to fi, as expected, and it takes negative values for further subtractions, since H' then contains a term of −B1. Regardless of the initial similarity of H with Bi, the cosine similarity is around zero when m is close to each feature value fi. However, there are two main issues in the cosine similarity-based value search. First, finding the feature values in this way needs iterative procedures, slowing down the runtime of data recovery. In addition, it is more challenging when feature values are represented in floating point. Second, the cosine similarity metric may not give accurate results in the recovery. In our earlier example, the similarity for each fi reaches zero when m is 49, 29, and 78, respectively.

To efficiently estimate fi values, we exploit another approach that utilizes the random distribution of the hypervector elements. Let us consider the following equation:

Figure 4.7: Iterative error correction procedure

$$H \cdot B_i = f_i * (B_i \cdot B_i) + \sum_{j,\, \forall j \neq i} f_j * (B_i \cdot B_j).$$

$B_i \cdot B_i$ is D, since each element of the base hypervector is either 1 or -1, while $B_i \cdot B_j$ is almost zero due to their near-orthogonal relationship. Thus, we can estimate fi with the following equation, called the value discovery metric:

$$f_i \simeq H \cdot B_i / D.$$

This metric yields an initial estimate of all feature values, say $F^1 = \{f_1^1, \ldots, f_n^1\}$. Starting with this initial estimate, SecureHD minimizes the error through an iterative procedure. Figure 4.7 shows the iterative error correction mechanism. We encode the estimated feature

vector, $F^1$, into the high-dimensional space, $H^1 = \{h_1^1, \ldots, h_D^1\}$. We then compute $\Delta H^1 = H - H^1$ and apply the value extraction metric to $\Delta H^1$. Since this yields the estimated error, $E^1$, in the original domain, we add it to the estimated feature vector to obtain a better estimate of the actual features, i.e., $F^2 = F^1 + E^1$. We repeat this procedure until the estimated error converges. To determine the termination condition, we compute the variance of the error hypervector, $\Delta H^i$, at the end of each iteration.
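A compact sketch of the value discovery metric together with the iterative error correction is shown below; it uses a fixed iteration count instead of the variance-based termination condition, and the names are illustrative.

```python
import numpy as np

def extract_values(H, pkeys, iters=30):
    """Estimate the features of H = sum_i f_i*B_i with f_i ~ (H . B_i)/D,
    then refine by re-encoding the estimate and decoding the residual."""
    D = pkeys.shape[1]
    F = H @ pkeys.T / D                # initial estimate F^1
    for _ in range(iters):
        dH = H - F @ pkeys             # residual hypervector Delta H
        F = F + dH @ pkeys.T / D       # F^{t+1} = F^t + E^t
    return F

rng = np.random.default_rng(4)
n, D = 1000, 10_000                    # R = 10, as in the Figure 4.6b setup
pkeys = rng.choice([-1, 1], size=(n, D)).astype(np.float64)
f_true = rng.uniform(0, 255, size=n)
H = f_true @ pkeys
f_hat = extract_values(H, pkeys)
print(np.max(np.abs(f_hat - f_true)))  # shrinks toward zero as iters grows
```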

Figure 4.6b shows how the variance changes when decoding four example hypervectors. For this experiment, we used two feature vectors whose size is either n = 1200 or n = 1000, where the feature values are generated uniformly at random. We encoded each feature vector into two hypervectors with either D = 7,000 or D = 10,000. As shown in the results, the number of iterations required for accurate recovery depends on both the number of features in the original domain and the hypervector dimension. In the rest of the chapter, we use the ratio of the hypervector dimension to the number of features in the original domain, i.e., R = D/n, to evaluate the quality of the data recovery for different feature sizes; a smaller R ratio requires more iterations to sufficiently recover the data. Metadata Recovery We utilize the value extraction method to recover the metadata. We calculate how many times each metavector {M1, ..., Mk} is present in a segment. If the extracted number of instances of a metavector is similar to the actual C value that we injected, the metavector is considered to be in the segment. However, since the metavector has a small number of elements, i.e., $d \ll D$ dimensions, the extraction might have a large error in finding the exact C value. Let us assume that, when injecting a metavector C times, the value extraction method identifies a value, $\hat{C}$, in a range $[C_{min}, C_{max}]$ that also includes C. If the metavector does not exist, the value $\hat{C}$ will be approximately zero, i.e., in a range $[-\varepsilon, \varepsilon]$. The amount of ε depends on the other information stored in the segment.

Figure 4.8a shows the distribution of extracted values, $\hat{C}$, when injecting 5 metavectors 10 times each (C = 10) into a single segment of a hypervector. These distributions are reported using a Monte Carlo simulation with 1,500 randomly generated metavectors. The results show that the distributions of the existing and non-existing cases overlap, making the estimation difficult. However, as shown in Figure 4.8b, when using C = 128, there is a clear margin between these two distributions, which identifies the existence of a metavector. Figure 4.8c shows the distributions when we inject 8 metavectors into a single segment with C = 128. In that case, the two distributions overlap, i.e., there are a few cases in which we cannot fully recover the metadata.

Figure 4.8: Relationship between the number of metavector injections and segment size ((a) 4 metadata, C = 10; (b) 4 metadata, C = 128; (c) 15 metadata, C = 128; (d) segment size)

We determine C so that the distance between $C_{min}$ and ε is larger than 0. We define this distance as the noise margin, $NM = C_{min} - \varepsilon$. Figure 4.8d shows how many metavectors can be injected for different C values. The results show that the number of metavectors that we can inject saturates for larger C values. Since a large C and a large segment size, d, are also more likely to influence the accuracy of the data recovery, we choose C = 128 and d = 50 for our evaluation. In Section 4.7.5, we present a detailed evaluation of different settings of the metavector injection. Data Recovery After recovering the metadata, SecureHD can recognize the data types and choose the base hypervectors for decoding. We subtract the metadata from the encoded hypervector and start decoding the main data. SecureHD utilizes the same value extraction method to identify the value for each base hypervector. The quality of data recovery depends on the dimension of hypervectors in the encoded domain (D) and the number of features in the original space (n), i.e., the ratio R = D/n defined in Section 4.5.2.

Intuitively, with a larger R value, we can achieve higher accuracy during the data recovery at the expense of a larger encoded data size. For instance, when storing an image with n = 1000 pixels in a hypervector with D = 10,000 dimensions (R = 10), we expect to achieve high accuracy in the data recovery. In our evaluation, we observed that R = 7 is enough to ensure lossless data recovery in the worst case. In Section 4.7.4, we provide a more detailed discussion of how R impacts the accuracy of the recovery procedure.

Figure 4.9: Illustration of the classification in SecureHD ((a) centralized training, (b) federated training)

4.6 Collaborative Learning in HD Space

4.6.1 Hierarchical Learning Approach

Figure 4.9 shows the HD-based collaborative learning in the high-dimensional space. In this chapter, we present two training approaches, centralized and federated training, which perform classification learning with a large amount of data provided by many clients. The cloud can perform the training procedures using the encoded hypervectors without explicit decoding. It

only needs to permute the encoded data using the SKey of each client. Note that the permutation aligns the encoded data on the same GKey base, even though the cloud does not have the GKeys. This reduces the cost of the learning procedure, and the data can be securely classified even on the untrustworthy cloud. The training procedure creates multiple hypervectors as the trained model, where each hypervector represents the pattern of data points in one class. We refer to them as class hypervectors. Approach 1: Centralized Training In this approach, the clients send the encoded hypervectors to the cloud. The cloud permutes them with the SKeys, and a trainer module combines the permuted hypervectors. The training is performed with the following sub-procedures. (i) Initial training: At the initial stage, it creates a class hypervector for each class. As an example, for a face recognition problem, SecureHD creates two hypervectors representing “face” and “non-face”. These hypervectors are generated with element-wise addition of all encoded inputs which belong to the same class, i.e., one for “face” and the other one for “non-face”. (ii) Multivector expansion: After training the initial HD model, we expand the initial model with cross-validation, so that each class has multiple hypervectors, ρ of them. The key idea is that, when training with more data, the model may need to capture more distinct patterns with different hypervectors. To this end, we first check the cosine similarity of each encoded hypervector again against the trained model. If an encoded hypervector does not correctly match its corresponding class, it means that the encoded hypervector has a distinct pattern as compared to the majority of the inputs in the class. For each class, we create a set that includes such mismatched hypervectors and the original model. We then choose the two hypervectors whose similarity is the highest among all pairs in the set, and update the set by merging the selected two into a new hypervector. This is repeated until the set includes only ρ hypervectors. (iii) Retraining: As the last step, we iteratively adjust the HD model over the same dataset to give higher weights to misclassified samples, which may often occur in a large dataset. We check the similarity of each encoded hypervector again with all existing classes. Let us assume that $C_k^p$ is

one of the class hypervectors belonging to the kth class, where p is the index among the multiple hypervectors

in the class. If an encoded hypervector Q belonging to the ith class is incorrectly classified to $C_j^{miss}$, we update the model by

$$C_j^{miss} = C_j^{miss} - \alpha Q \quad \text{and} \quad C_i^{\tau} = C_i^{\tau} + \alpha Q$$

where $\tau = \arg\max_t \delta(C_i^t, Q)$ and α is a learning rate in the range [0.0, 1.0]. In other words, in the case of misclassification, we subtract the encoded hypervector from the class hypervector to which it is incorrectly classified, while adding it to the class hypervector of the correct class that has the highest similarity. This procedure is repeated for a predefined number of iterations, and the final class hypervectors are used for future inference. Approach 2: Federated Training The clients may not have enough network bandwidth to send every encoded hypervector. To address this issue, we present the second approach, called federated training, as an edge-computing alternative. In this approach, the clients individually train initial models, i.e., one hypervector for each class, using only their own encoded hypervectors. Once the cloud receives the initial models of all the clients, it permutes the models with the SKeys and

performs element-wise additions to create a global model, $C_k$, for each kth class. Since the cloud only knows the initial models of each client, the multivector expansion procedure is not performed in this approach, but we can still execute the retraining procedure explained in Section 4.6.1. To this end, the cloud re-permutes the global model and sends it back to each client. With the global model, each client performs the same retraining procedure. Let us assume that $\tilde{C}_k^i$ is the model retrained by the ith client. After the cloud aggregates all $\tilde{C}_k^i$ with the permutation, it updates the global models by $C_k = \sum_i \tilde{C}_k^i - (n-1) * C_k$. This is repeated for the predefined number of iterations. This approach allows the clients to send only the trained class hypervectors in each retraining iteration, thus significantly reducing the network usage.
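The global-model update used in the federated approach can be sketched as follows; the permutation handling and the client-side retraining are abstracted away, and the names are illustrative.

```python
import numpy as np

def federated_update(global_model, client_models):
    """C_k = sum_i C~_k^i - (n-1)*C_k for every class k, where each of the n
    clients retrains the re-permuted global model locally and returns it."""
    n = len(client_models)
    return np.sum(client_models, axis=0) - (n - 1) * global_model

rng = np.random.default_rng(5)
K, D, n = 5, 10_000, 4                           # classes, dimension, clients
global_model = rng.normal(size=(K, D))
# Each client's locally retrained copy differs from the global model only by
# the sample hypervectors it bundled in or detached during its own pass.
client_models = np.stack([global_model + rng.normal(scale=0.1, size=(K, D))
                          for _ in range(n)])
global_model = federated_update(global_model, client_models)
```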

84 105 105

103 103

10 10

10-1 10-1

-3 -3

Execution Time (ms) 10 Execution Time (ms) 10 MNIST ISOLET UCIHAR PAMPA EXTRA FACE MNIST ISOLET UCIHAR PAMPA EXTRA FACE (a) Encoding (b) Decoding

Figure 4.10: Comparison of SecureHD efficiency to homomorphic algorithm in encoding and decoding

4.6.2 HD Model-Based Inference

With the class hypervectors generated by either approach, we can perform the inference on any device, including the cloud and the clients. For example, the cloud can receive an encoded hypervector from a client and permute its dimensions with the SKey in the same way as in the training procedure. Then, it checks the cosine similarity of the permuted hypervector to all trained class hypervectors and assigns the label of the most similar class hypervector. In the case of client-based inference, once the cloud sends the re-permuted class hypervectors to a client, the client can perform the inference for its encoded hypervector with the same similarity check.

4.7 Evaluation

4.7.1 Experimental Setup

We have implemented the SecureHD framework, including the encoding, decoding, and learning in high-dimensional space, using C++. We evaluated the system on three different platforms: an Intel i7 7600 CPU with 16GB memory, a Raspberry Pi 3, and a Kintex-7 FPGA KC705. We also exploit a network simulator, NS-3 [123], for large-scale simulation. We verify the FPGA timing and the functionality of the encoding and decoding by synthesizing Verilog using the Xilinx Vivado Design Suite [124]. The synthesized design has been implemented on the Kintex-7 FPGA

KC705 Evaluation Kit. We compare the efficiency of the proposed SecureHD with Microsoft SEAL [110], a state-of-the-art C++ implementation of a homomorphic encryption library. For SEAL, we used the default parameters: a polynomial modulus of n = 2048, a 128-bit coefficient modulus q, a plain modulus of t = 1 << 8, a noise standard deviation of 3.9, and a decomposition bit count of 16. We evaluate the proposed SecureHD framework with real-world datasets including human activity recognition, phone position identification, and image classification. Table 4.1 summarizes the evaluated datasets. The tested benchmarks range from relatively small datasets collected in a small IoT network, e.g., PAMAP2, to a large dataset which includes hundreds of thousands of images of facial and non-facial data. We also compare the classification accuracy of SecureHD on these datasets with the state-of-the-art learning models shown in the table.

4.7.2 Encoding and Decoding Performance

As explained in Section 4.4.3, SecureHD performs a one-time key generation to distribute the PKeys to each user using the MPC and GC protocols. Table 4.2 lists the number of logic gates evaluated in the protocol and the amount of communication required between clients. This overhead comes mostly from the first phase of the protocol, since the second phase has been simplified with the two-party GC protocol. The cost of the protocol is dominated by network communication. In our simulation conducted on our in-house 100 Mbps network, it takes around 9 minutes to create D = 10,000 keys for 100 participants. Note that this runtime overhead is negligible since the key generation happens only once before all future computation. We have also evaluated the encoding and decoding procedures running on each client. We compare the efficiency of SecureHD with Microsoft SEAL [110]. We run both the SecureHD framework and the homomorphic library on the ARM Cortex-A53 and Intel i7 processors. Figure 4.10 shows the execution time of SecureHD and the homomorphic library to process a single data point. For SecureHD, we used R = 7 to ensure a 100% data recovery rate for all benchmark datasets. Our evaluation shows that SecureHD achieves on average 133× and 14.7× (145.6× and 6.8×) speedup for the encoding and decoding, respectively, as compared to the homomorphic technique running on the ARM architecture (Intel i7).

Table 4.1: Datasets (n: feature size, K: number of classes)

Name     n    K   Data Size  Train Size  Test Size  Description / State-of-the-art Model
MNIST    784  10  220MB      60,000      10,000     Handwritten recognition / DNN [125, 126]
ISOLET   617  26  19MB       6,238       1,559      Voice recognition / DNN [127, 128]
UCIHAR   561  12  10MB       6,213       1,554      Activity recognition (mobile) / DNN [129, 128]
PAMAP2   75   5   240MB      611,142     101,582    Activity recognition (IMU) / DNN [91]
EXTRA    225  4   140MB      146,869     16,343     Phone position recognition / AdaBoost [93]
FACE     608  2   1.3GB      522,441     2,494      Face recognition / AdaBoost [130]

Table 4.2: Overhead for key generation and distribution

                             Phase 1                         Phase 2
# of Clients                 10         50         100
D=1000     # of Gates        11K        51K        101K      8.9K
           Communication     7.1MB      160MB      650MB     284MB
D=5000     # of Gates        55K        255K       505K      56.4K
           Communication     35MB       813MB      3.24GB    1.8MB
D=10,000   # of Gates        110K       510K       101K      122.9K
           Communication     70.34MB    1.64GB     6.46GB    3.93MB

The encoding of SecureHD running on the embedded device (ARM) is still 8.1× faster than the homomorphic encryption running on the high-performance client (Intel i7). We also evaluate the SecureHD efficiency of the FPGA implementation. We observe that the encoding and decoding of SecureHD achieve 626.2× and 389.4× (35.5× and 20.4×) faster execution as compared to the SecureHD execution on the ARM (Intel i7). For example, the proposed FPGA implementation is able to encode 2,600 data points and decode 1,335 data points per second for the MNIST images.

4.7.3 Evaluation of SecureHD Learning

Learning Accuracy Based on the proposed SecureHD, clients can share the information with the cloud in a secure way, such that the cloud cannot understand the original data while

still performing the learning tasks. Along with the two proposed learning approaches, we also evaluate the state-of-the-art HD classification approach, called the one-shot HD model, which trains the model using a single hypervector per class with no retraining [82, 109]. For the centralized training, we trained two models, one that has 64 class hypervectors for each class and another that has 16 for each class. We call them Centralized-64 and Centralized-16. The retraining procedure was performed for 100 iterations with α = 0.05, since the classification accuracy converged with this configuration for all the benchmarks. Figure 4.11 shows the classification accuracy of SecureHD for the different benchmarks. The results show that the centralized training approach achieves high classification accuracy comparable to state-of-the-art learning methods such as DNN models. We also observed that, by training more hypervectors per class, it can provide higher classification accuracy. For example, for the federated training approach, which does not use multivectors, the classification accuracy is 90% on average, which is 5% lower than that of Centralized-64. As compared to the state-of-the-art one-shot HD model, which does not retrain models, Centralized-64 achieves 15.4% higher classification accuracy on average. Scalability of SecureHD Learning As discussed in Section 4.6.1, the proposed learning method is designed to effectively handle a large amount of data. To understand the scalability of the proposed learning method, we evaluate how the accuracy changes when the training data come from different numbers of clients, using simulation with NS-3 [123]. In this experiment, we use three datasets, PAMAP2, EXTRA, and FACE, which include information about where the data points originated. For example, PAMAP2 and EXTRA are gathered from 7 and 56 individual users, respectively. Similarly, the FACE dataset includes 100 clients that have facial images different from each other. Figures 4.12a and b show the accuracy changes for the centralized and federated training approaches. The results show that increasing the number of clients improves classification accuracy by training with more data. Furthermore, as compared to the one-shot HD model, the two proposed approaches show better scalability in terms of accuracy. For example,

the accuracy difference between the proposed approach and the one-shot model grows as more clients engage in the training. Considering the centralized training, the accuracy difference for the FACE dataset is 5% when trained with one client, while it is 14.7% for the 60-client case. This means that the multivector expansion and retraining techniques are effective for learning with a large amount of data.

We also verify how the SecureHD learning methods work under the constrained network conditions that often arise in IoT systems. In our network simulation, we assume the worst-case network condition, i.e., all clients share the bandwidth of a standard WiFi 802.11 network. Note that this is a worst-case scenario; in practice, the embedded devices may not all share the same network. Figure 4.12c shows that the network bandwidth limits the number of hypervectors that can be sent each second as more clients participate in the learning task. For example, a network with 100 clients can send 23.6× fewer hypervectors than in the single-client case. As discussed before, federated learning can be exploited to overcome the limited network bandwidth at the expense of accuracy loss. Another solution is to use a reduced dimension in the centralized learning. As shown in Figure 4.12c, when D = 1,000, clients can send data to the cloud at 353 samples per second, which is 10 times higher than the case of D = 10,000. Figure 4.12d shows how the learning accuracy changes for different dimension settings. The results show that reducing the hypervector dimensions to 4,000 and 1,000 has less than 1.4% and 5.3% impact on the classification accuracy, respectively. This strategy offers another trade-off between accuracy and network communication cost.
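To make the dimension-reduction trade-off concrete, the sketch below trains a toy HD classifier with different hypervector dimensions D and reports the accuracy alongside the approximate per-sample payload a client would transmit. This is only a minimal illustration under assumed details: it uses a simple random-projection encoding and one bundled hypervector per class, not the actual SecureHD encoding, multivector training, or network stack.

import numpy as np

rng = np.random.default_rng(0)

def make_encoder(n_features, D):
    # Client-side key: a random bipolar base matrix shared only with authorized users.
    return rng.choice([-1.0, 1.0], size=(n_features, D))

def encode(x, key):
    return np.sign(x @ key)            # bipolar hypervector of dimension D

def train(X, y, key, n_classes):
    model = np.zeros((n_classes, key.shape[1]))
    for xi, yi in zip(X, y):
        model[yi] += encode(xi, key)   # bundle encoded samples per class
    return model

def predict(X, key, model):
    H = np.array([encode(xi, key) for xi in X])
    return (H @ model.T).argmax(axis=1)   # dot-product similarity to each class

# Toy 3-class Gaussian data standing in for IoT sensor features.
n_feat, n_cls = 32, 3
centers = rng.normal(size=(n_cls, n_feat)) * 2
X = np.vstack([rng.normal(c, 1.0, size=(300, n_feat)) for c in centers])
y = np.repeat(np.arange(n_cls), 300)

for D in (10000, 4000, 1000):
    key = make_encoder(n_feat, D)
    model = train(X[::2], y[::2], key, n_cls)
    acc = (predict(X[1::2], key, model) == y[1::2]).mean()
    payload_kb = D / 1024              # assuming roughly one byte per component sent to the cloud
    print(f"D={D:6d}  accuracy={acc:.3f}  per-sample payload ~{payload_kb:.1f} KB")

Smaller D shrinks what each client must send per sample at the cost of some accuracy, which is the same trade-off shown in Figure 4.12c and d.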

4.7.4 Data Recovery Trade-offs

As discussed in Section 4.5.2, the proposed SecureHD framework provides a decoding method for the authorized user that has the original Pkeys used in the encoding. Figure 4.13a shows the data recovery rate on images with different pixel sizes. To verify the proposed recovery method in the worst case scenario, we created 1000 images whose pixel values are randomly

chosen, and report the average error when we map the 1000 images to D = 10,000 dimensions. The x-axis shows the ratio R = D/n, i.e., the number of hypervector dimensions (D) to the number of pixels (n) in an image. The data recovery rate depends on the precision of the pixel values. Using high-resolution images, SecureHD requires a larger R value to ensure 100% accuracy. For instance, for images with 32-bit pixel resolution, SecureHD can achieve 100% data recovery using R = 7, while lower-resolution images (e.g., 16-bit and 8-bit) require R = 6 to ensure 100% data recovery. Our evaluation shows that our method can decode any input image with a 100% data recovery rate using R = 7. This means that we can securely encode data with a 4× smaller size compared to the homomorphic encryption library, which increases the data size by 28 times through the encryption.

We also evaluate the SecureHD framework with a text dataset written in three different European languages [131]. Figure 4.13b shows the accuracy of data recovery for the three languages. The x-axis is the ratio between the length of the hypervectors and the number of characters in the text when D = 10,000. Our method assigns a single value to each alphabet letter and encodes the texts with the hypervectors. Since the number of characters in these languages is less than 49, we require at most 6 bits to represent each letter. In terms of data recovery, this is equivalent to encoding an image of the same size with 6-bit pixel resolution. Our evaluation shows that SecureHD can provide a 100% data recovery rate with R = 6.

Figure 4.11: SecureHD classification accuracy

Figure 4.12: Scalability of SecureHD classification: (a) Centralized, (b) Federated, (c) Sample Rate Simulation, (d) Dimension Reduction

Figure 4.13: Data recovery accuracy of SecureHD: (a) Image, (b) Text

Figure 4.14: Example of image recovery

Figure 4.14 shows the quality of the data recovery for two example images. The Lena and MNIST images have 100 × 100 pixels and 28 × 28 pixels, respectively. Our encoding maps the input data to hypervectors with different dimensions; for example, the Lena image with R = 6 means that the image has been encoded with D = 60,000 dimensions. Our evaluation shows that SecureHD achieves lossless data recovery on the Lena photo when R ≥ 6, while with R = 5 and R = 4 the data recovery rates are 93% and 68%, respectively. Similarly, R = 5 and R = 4 provide 96% and 56% data recovery for the MNIST images.

4.7.5 Metadata Recovery Trade-offs

As discussed in Section 4.5.2, the metadata injection method needs to be performed such that it ensures 100% metadata recovery and has minimal impact on the original hypervector for the learning and data recovery. The solid line in Figure 4.15a shows the noise margin when injecting multiple metavectors into a single segment of the hypervector, where the number of elements in the segment is chosen as 50 (= d). We report the results based on the worst case over 5,000 Monte Carlo simulations. The results show that each segment can store at most 6 metavectors while keeping a positive noise margin that ensures 100% metadata recovery. The dotted line shows the data recovery error rate for different numbers of metavectors injected into a single segment. Our evaluation shows that adding 6 metavectors has less than a 0.005% impact on the data recovery rate. Since the number of metavectors which can be injected into one segment is limited, we may need to distribute the metadata across different segments. Figure 4.15b presents the impact of the metadata injection on the data recovery error rate. When we inject 6 metadata into each of all 200 segments, i.e., 1,200 metavectors in total, the impact on the recovery accuracy is still minimal, i.e., less than 0.12%.

Figure 4.15: Data recovery rate for different settings of metavector injection: (a) Single Segment, (b) Multiple Segments

4.8 Conclusion

In this chapter, we presented a novel framework, called SecureHD, which provides secure data encoding and learning based on HD computing. With our framework, clients can securely send their data to an untrustworthy cloud, while the cloud can perform the learning tasks without knowledge of the original data. Our proof-of-concept implementation demonstrates that the proposed SecureHD framework successfully performs the encoding and decoding tasks with high efficiency, e.g., 145.6× and 6.8× faster than the state-of-the-art encryption/decryption library [17]. Our learning method achieves an accuracy of 95% on average for diverse practical learning tasks, which is comparable to the state-of-the-art learning methods [125, 128, 91, 93, 130]. In addition, SecureHD provides lossless data recovery with a 4× reduction in the data size compared to the existing encryption solution. In the next chapter, we show how we can solve other cognitive tasks, such as DNA pattern matching, based on HD computing.

This chapter contains material from "A Framework for Collaborative Learning in Secure High-Dimensional Space", by Mohsen Imani, Yeseong Kim, Sadegh Riazi, John Merssely, Patrick Liu, Farinaz Koushanfar and Tajana S. Rosing, which appears in IEEE Cloud Computing, July 2019. The dissertation author was the primary investigator and author of this paper.

Chapter 5

HD Computing Beyond Classical Learning: DNA Pattern Matching

5.1 Introduction

In Chapters 3 and 4, we discussed how to solve classification problems using HD computing by mapping feature vectors into hypervectors. IoT applications often need to process other data types, such as text data. For other learning algorithms, such as deep learning, various techniques have been developed to handle such data types, e.g., [6]. In this chapter, we show how we can describe such non-feature-vector data using HD computing, focusing on the DNA sequence data used in diverse bioinformatics applications [132, 133]. Our key focus is the acceleration of the pattern matching problem, which is an important ingredient of many DNA alignment techniques, enabling personalized IoT-based healthcare [18] and on-site disease detection [19]. In general, a DNA sequence is a special case of text data, i.e., a string which consists of four nucleotide characters, A, C, G, and T. The pattern matching problem is to examine the occurrence of a given query string in a reference string. For example, the technique can discover possible diseases by identifying which reads (short strings) match a reference human genome consisting of 100 million DNA bases [19]. The efficient acceleration of DNA pattern matching is still an open question. Although prior research has developed acceleration systems on parallel computing platforms, e.g., GPU [20] and FPGA [21], they offer only limited improvements. The primary reason is that the existing pattern matching algorithms they rely on, e.g., the Boyer-Moore (BM) and Knuth-Morris-Pratt (KMP) algorithms [19], are at heart sequential processes. Their acceleration strategies parallelize the workloads by either scheduling multiple DNA searching tasks or streaming long-length DNA sequences, consequently resulting in high memory requirements and runtime. In this context, the pattern matching problem should be revisited not only to accelerate the existing algorithms on parallel computing platforms, but also to redesign a hardware-friendly algorithm itself. In this chapter, we propose a novel hardware-software codesign of a genome identity

extractor using hyperdimensional computing, in short GenieHD. Our work includes a new pattern matching algorithm and an accelerator design. Based on HD computing, which is specialized for pattern-based computations, GenieHD transforms the inherently sequential processes of the pattern matching task into highly parallelizable computations. The following summarizes the contributions of this chapter: 1) We propose a novel hardware-friendly pattern matching algorithm based on HD computing. GenieHD encodes DNA sequences to hypervectors and discovers multiple patterns with a light-weight HD operation. Besides, we can reuse the encoded hypervectors to query many newly sampled DNA sequences, which is common in practice. 2) We show an acceleration architecture to execute the proposed algorithm efficiently on general parallel computing platforms. The proposed design significantly reduces the number of memory accesses needed to process the HD operations, while fully utilizing the available parallel computing resources. We also present how to implement the proposed acceleration architecture on three parallel computing platforms: GPGPU, FPGA, and ASIC. 3) We evaluate GenieHD with practical datasets, human and Escherichia coli (E. coli) genome sequences. The experimental results show that GenieHD significantly accelerates the DNA matching algorithm, e.g., 44.4× speedup and 54.1× higher energy efficiency when comparing our FPGA-based design to a state-of-the-art FPGA-based design. As compared to an existing GPU-based implementation, our ASIC design with a similar die size outperforms it by 122× in performance and 707× in energy efficiency. We also show that the power consumption can be further reduced by 50% by allowing a minimal accuracy loss of 1%.

5.2 Related Work

Hyperdimensional Computing HD computing originated from a human memory model, called sparse distributed memory, developed in neuroscience [7]. Recently, computer

scientists recast the memory model as a cognitive, pattern-oriented computing method. For example, prior researchers showed that the HD computing-based classifier is effective for diverse applications, e.g., text classification, multimodal sensor fusion, speech recognition, and human activity classification [88, 134, 15, 135, 136]. The work in [137] recently uses HD computing for DNA sequence classification. Prior work also shows application-specific accelerators on different platforms, e.g., FPGA [138, 139, 140, 141] and ASIC [82]. Processing in-memory chips were also fabricated based on 3D VRRAM technology [90]. The previous works mostly utilize HD computing as a solution for classification problems. In this chapter, we show that HD computing is an effective method for other pattern-centric problems, and propose a novel DNA pattern matching algorithm.

DNA Pattern Matching Acceleration Efficient pattern matching is an important task in many bioinformatics applications, e.g., single nucleotide polymorphism (SNP) identification, on-site disease detection, and precision medicine development [19]. Many acceleration systems have been proposed on diverse platforms, e.g., multiprocessor [142] and FPGA [143]. For example, the work in [21] proposes an FPGA accelerator that parallelizes partial matches for a long DNA sequence based on the KMP algorithm. The work in [20] proposed a parallel pattern matching method that streams the long-length reference into different CUDA cores. Our work is different in that we accelerate a new HD computing-based algorithm which is specialized for parallel systems and also scales effectively with the number of queries to process.

5.3 GenieHD Overview

Figure 5.1 illustrates the overview of the proposed GenieHD design. GenieHD exploits HD computing to design an efficient DNA pattern matching solution (Section 5.4). During the offline stage, we convert the reference genome sequence into hypervectors and store them in the HV database. In the online stage, we also encode the query sequence given as an input. GenieHD in turn identifies whether the query exists in the reference, using a light-weight HD operation that computes hypervector similarities between the query and reference hypervectors. All three processing engines perform their computations with highly parallelizable HD operations; thus, many parallel computing platforms can accelerate the proposed algorithm. We present implementations on GPGPU, FPGA, and ASIC based on a general acceleration architecture (Section 5.5). Nowadays, raw DNA sequences are publicly downloadable in standard formats, e.g., FASTA for references [144]. Likewise, HV databases can provide reference hypervectors encoded in advance, so that users can efficiently examine different queries without repeatedly performing the offline encoding procedure. For example, it is typical to perform the pattern matching for billions of queries streamed by a DNA sequencing machine. In this context, we also evaluate how GenieHD scales better than state-of-the-art methods when handling multiple queries (Section 5.6).

Figure 5.1: Overview of GenieHD (reference encoding engine, query encoding engine, and pattern discovery engine; the reference encoding is a one-time offline procedure)

5.4 DNA Pattern Matching Using HD Computing

The major difference between HD and conventional computing is the data elements being computed. Instead of booleans and numbers, HD computing performs computations on ultra-wide words, i.e., hypervectors, where a datum is represented in a distributed manner across the whole word. HD computing mimics important functionalities of the human memory [7]. For example,

the brain efficiently aggregates/associates different data and understands similarity between data. HD computing implements the aggregation and association using hypervector addition and multiplication, while measuring the similarity based on a distance metric between hypervectors. The HD operations can be effectively parallelized at the granularity of individual dimensions. In this work, we represent DNA sequences with hypervectors, and perform the pattern matching procedure using the similarity computation. To encode a DNA sequence to hypervectors,

GenieHD uses four hypervectors corresponding to each base alphabet in Σ = {A,C,G,T}. We

We call the four hypervectors base hypervectors, and denote them with Σ_HV = {A, C, G, T}.¹ Each of the hypervectors has D dimensions where each component is either -1 or +1 (bipolar), i.e., {−1,+1}^D. The four hypervectors should be uncorrelated to represent their differences in sequences. For example, δ(A, C) should be nearly zero, where δ is the dot-product similarity. The base hypervectors can be easily created, since any two hypervectors whose components are randomly selected in {−1,+1} have almost zero similarity, i.e., they are nearly orthogonal.
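As a quick illustration of this near-orthogonality property, the short sketch below (my own, not the dissertation's implementation) draws four random bipolar hypervectors and checks that the normalized dot product between distinct bases stays close to zero:

import numpy as np

rng = np.random.default_rng(0)
D = 10_000
bases = {b: rng.choice([-1, 1], size=D) for b in "ACGT"}  # random bipolar base hypervectors

for a in "ACGT":
    row = []
    for b in "ACGT":
        sim = np.dot(bases[a], bases[b]) / D  # 1.0 for identical bases, ~0 otherwise
        row.append(f"delta({a},{b})/D={sim:+.3f}")
    print("  ".join(row))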

5.4.1 DNA Sequence Encoding

DNA pattern encoding: GenieHD maps a DNA pattern by combining the base hypervectors. Let us consider a short query string, 'GTACG'. We represent the string with G × ρ^1(T) × ρ^2(A) × ρ^3(C) × ρ^4(G), where ρ^n(H) is a permutation function that shuffles the components of H (∈ Σ_HV) with n-bit(s) rotation. For the sake of simplicity, we denote ρ^n(H) as H^n. H^n is nearly orthogonal to H = H^0 if n ≠ 0, since the components of a base hypervector are randomly selected and independent of each other. Hence, the hypervector representations for any two different strings, H_α and H_β, are also nearly orthogonal, i.e., δ(H_α, H_β) ≃ 0. The hyperspace of D dimensions can represent 2^D possibilities. The enormous representations are sufficient to map different DNA patterns to near-orthogonal hypervectors. Since the query sequence is typically short, e.g., 100 to 200 characters, the cost for

1In this chapter, we use bold Latin symbols to represent hypervectors.

the online query encoding step is negligible. In the following, we discuss how GenieHD can efficiently encode the long-length reference sequence.

Reference encoding: The goal of the reference encoding is to create hypervectors that include all combinations of patterns. In practice, the approximate lengths of the query sequences are known, e.g., the DNA read length of the sequencing technology. Let us define the lengths of the queries to be in a range [⊥,>]. The length of the reference sequence, R, is denoted by N. We also use the following notations: (i) B_t denotes the base hypervector for the t-th character in R (0-based indexing), and (ii) H(a,b) denotes the hypervector for a subsequence, B_a^0 × B_{a+1}^1 × ··· × B_{a+b−1}^{b−1}. Let us first consider a special case that encodes every substring of size n from the reference sequence, i.e., n = ⊥ = >. Each substring can be extracted using a sliding window of size n to encode H(t,n). Figure 5.2(a) illustrates the encoding method for the first substring, i.e., t = 0, when n = 6. A naive way to encode the next substring, H(1,n), is to rerun the permutations and multiplications for each base, as shown in Figure 5.2(b). Figure 5.2(c) shows how GenieHD optimizes this, based on HD computing operations specialized to remove and insert information. We first multiply T^0 with the previously encoded hypervector, T^0C^1T^2A^3G^4A^5. The multiplication of two identical base hypervectors yields the hypervector whose elements are all 1s. Thus, it removes the first base from H(0,n), producing C^1T^2A^3G^4A^5. After performing the rotational shift (ρ^{−1}) and the element-wise multiplication for the new base of the sliding window (T^5), we obtain the desired hypervector, C^0T^1A^2G^3A^4T^5. This scheme only needs two permutations and multiplications regardless of the substring size n. Algorithm 2 describes how GenieHD encodes the reference sequence in this optimized fashion; Figure 5.2(d) shows how the algorithm runs for the first two iterations when ⊥ = 3 and > = 6. The outcome is R, i.e., the reference hypervector, which combines all substrings whose sizes are in [⊥,>]. The algorithm starts by creating three hypervectors, S, F, and L (Lines 1∼3). S includes all patterns of [⊥,>] in each sliding window; F and L keep track of the first and last

patterns, i.e., the ⊥-length and >-length patterns, respectively. Intuitively, this initialization needs O(>) hypervector operations, and each of the N − > loop iterations needs only a constant number of operations. The main loop implements the sliding window scheme for multiple pattern lengths. It computes the next L using the optimized scheme (Line 5). In Line 6, it subtracts F, i.e., the shortest pattern in the previous iteration, and multiplies B_t^{−1} to remove the first base from all patterns combined in S; after adding the new L, S includes the patterns in the range [⊥,>] for the current window, and we accumulate S into R. Lastly, we update the first pattern F in the same way for the current window (Line 7). The loop cost holds regardless of the pattern length range, thus the total complexity of this algorithm² is O(N + >).

Figure 5.2: Illustration of Encoding: (a) Initial Encoding, (b) Naïve Sliding Computation, (c) Optimized Sliding Computation, (d) Reference Sequence Encoding. For (a), (b), and (c), the window size is 6; (d) illustrates the reference encoding steps described in Algorithm 2.

Algorithm 2: Reference Encoding Algorithm
1: S ← H(0,⊥) + H(0,⊥+1) + ··· + H(0,>)
2: F ← H(0,⊥); L ← H(0,>)
3: R ← S
4: for t ← 0 to N − > do
5:     L ← B_t^{−1} × L^{−1} × B_{t+>}^{>−1}
6:     S ← B_t^{−1} × (S − F)^{−1} + L;  R ← R + S
7:     F ← B_t^{−1} × F^{−1} × B_{t+⊥}^{⊥−1}
8: end for

Finally, R includes all the hypervector representations of the desired lengths existing in the reference.
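The listing below is my own minimal Python rendering of this encoding flow, under the assumptions that ρ^n is an n-position rotation (np.roll) and that D is roughly ten times the number of accumulated patterns; it illustrates Algorithm 2 together with the similarity-based decision rule that Section 5.4.2 formalizes, and is not the GenieHD implementation.

import numpy as np

rng = np.random.default_rng(2)
D = 100_000                                  # hypervector dimensionality
bases = {c: rng.choice([-1, 1], size=D).astype(np.int64) for c in "ACGT"}
rho = lambda H, n: np.roll(H, n)             # rho^n: n-position rotation

def encode_pattern(seq):
    # H = B[s0]^0 x B[s1]^1 x ... : element-wise product of permuted base hypervectors
    H = np.ones(D, dtype=np.int64)
    for k, c in enumerate(seq):
        H *= rho(bases[c], k)
    return H

def encode_reference(ref, lo, hi):
    # Algorithm 2: accumulate every substring of length lo..hi into one hypervector R
    S = sum(encode_pattern(ref[:n]) for n in range(lo, hi + 1))
    F, L = encode_pattern(ref[:lo]), encode_pattern(ref[:hi])
    R = S.copy()
    for t in range(len(ref) - hi):
        Bt = bases[ref[t]]
        L = rho(Bt * L, -1) * rho(bases[ref[t + hi]], hi - 1)   # Line 5
        S = rho(Bt * (S - F), -1) + L; R = R + S                # Line 6
        F = rho(Bt * F, -1) * rho(bases[ref[t + lo]], lo - 1)   # Line 7
    return R

def decision_score(R, query):
    return np.dot(R, encode_pattern(query)) / D                 # delta(R, Q) / D

ref = "".join(rng.choice(list("ACGT"), size=200))
R = encode_reference(ref, lo=8, hi=12)                           # ~950 patterns, well below D
print(decision_score(R, ref[100:110]))    # substring of the reference: score near 1
print(decision_score(R, "ACGTACGTAC"))    # query almost surely absent: score near 0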

5.4.2 Pattern Matching

GenieHD performs the pattern matching by computing the similarity between R and Q.

Let us assume that R is the addition of P hypervectors (i.e., P distinct patterns), H_1 + ··· + H_P. The dot-product similarity is computed as follows:

δ(R,Q) = δ(H_λ,Q) + Σ_{i=1, i≠λ}^{P} δ(H_i,Q),   where the summation term is the noise.

If H_λ is equal to Q, then δ(H_λ,Q) = D, since the similarity between two identical bipolar hypervectors is D. The similarity between any two different patterns is nearly zero, i.e., δ(H_i,Q) ≃ 0 for each term of the noise. Thus, the following criterion checks whether Q exists in R:

δ(R,Q) / D > T    (5.1)

where T is a threshold. We call δ(R,Q)/D the decision score. The accuracy of this decision process depends on (i) the amount of noise and (ii) the threshold value, T.

2Due to the limited space, we omit the finalization step which combines the patterns for t > N − >; it can be implemented in a straight-forward way by modifying the main loop so that it does not use L.

Figure 5.3: Similarity Computation in Pattern Matching: (a) Probability (T = 0.5), (b) Probability (T = 0.9), (c) Decision Score (initial encoding), (d) Decision Score (3 epochs refining). (a) and (b) are computed using Equation 5.2. The histograms shown in (c) and (d) are obtained by testing 1,000 patterns for each of the existing and non-existing cases when R is encoded for a random DNA sequence using D = 100,000 and P = 5,000.

To precisely identify patterns in GenieHD, we develop a concrete statistical method that estimates the worst-case accuracy. The similarity metric computes how many components of Q are the same as the corresponding components of each H_i in R. There are P · D component pairs between Q and the H_i (0 ≤ i < P). The probability that each pair is the same is 1/2 for all components if Q is a random hypervector. The similarity, δ(R,Q), can then be viewed as

a random variable, X, which follows a binomial distribution, X ∼ B(P · D, 1/2). Since D is large enough, X can be approximated with the normal distribution:

X ∼ N(P · D / 2, P · D / 4).

When x component pairs of R and Q have the same value, (P · D − x) pairs have different values, thus δ(R,Q) = 2x − P · D. Hence, the probability of satisfying Equation 5.1 is Pr(X > (T + P) · D / 2). We can convert X to the standard normal distribution, Z:

Pr(Z > T · √(D/P)) = (1/√(2π)) ∫_{T·√(D/P)}^{∞} e^{−t²/2} dt    (5.2)

In other words, Equation 5.2 represents the probability of mistakenly determining that Q exists in R, i.e., a false positive. Figure 5.3(a) and (b) visualize the probability of the error for different D and P combinations.

For example, when D = 100,000 and T = 0.5, we can identify P = 10,000 patterns with 5.7% error using a single similarity computation operation. The results also show that using larger D values can improve the accuracy. However, the larger dimensionality requires more hardware resources. Another option to improve the accuracy is using a larger similarity threshold, T; however, it may increase false negatives. GenieHD uses the following two techniques to address this issue.

Hypervector refinement The first technique is to refine the reference hypervector. Let us recall Algorithm 2. In the refining step, GenieHD reruns the algorithm to update R. Instead of accumulating S into R (Line 6), we add S × (1 − δ(S,R)/D). The refinement is performed for multiple epochs. Figure 5.3(c) and (d) show how the distribution of the decision scores changes for the existing and non-existing cases through the refinement. The results show that the refinement makes the decision scores of the existing cases close to 1. Thus, we can use a larger T for higher accuracy. The successful convergence depends on i) the number of patterns included in R with D dimensions, i.e., D/P, and ii) the number of training epochs. In our evaluation, we observe that, when R includes P = D/10 patterns and we use T = 0.9, we only need five epochs, and GenieHD can find all patterns with an error of less than 0.003%.

Multivector generation To precisely discover the patterns of the reference sequence, we also use multiple hypervectors so that they cover every pattern existing in the reference without loss. During the initial encoding, whenever R reaches the maximum capacity, i.e., after accumulating P distinct patterns, we store the current R and reset its components to 0s to start computing a new R. GenieHD accordingly fetches the stored R during the refinement. Even though it needs to compute the similarity values for the multiple R hypervectors, GenieHD can still fully utilize the parallel computing units by setting D to a sufficiently large number.
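To make the error estimate concrete, the following small sketch (mine, using the standard normal tail via math.erfc rather than any GenieHD code) evaluates Equation 5.2 for a few configurations; the first line reproduces the ~5.7% figure quoted above.

import math

def false_positive_prob(D, P, T):
    # Equation 5.2: Pr(Z > T * sqrt(D / P)) for the standard normal Z
    z = T * math.sqrt(D / P)
    return 0.5 * math.erfc(z / math.sqrt(2.0))

print(false_positive_prob(D=100_000, P=10_000, T=0.5))   # ~0.057, i.e., 5.7% error
print(false_positive_prob(D=100_000, P=10_000, T=0.9))   # far smaller with a larger T
print(false_positive_prob(D=50_000,  P=10_000, T=0.9))   # fewer dimensions -> more error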

Figure 5.4: Hardware Acceleration Design: (a) Acceleration Architecture, (b) Encoding Implementation on GPGPU (CUDA), (c) Similarity Computation Implementation on FPGA. The dotted boxes in (a) show the hypervector components required for the computation in the first stage of the reference encoding. Recall that t is the index of the iteration.

5.5 Hardware Acceleration Design

5.5.1 Acceleration Architecture

Encoding Engine The encoding procedure runs i) element-wise addition/multiplication and ii) permutation. The parallelized implementation of the element-wise operations is straightforward, i.e., computing each dimension on a different computing unit. For example, if a computing platform can compute d dimensions (out of D) independently in parallel, a single operation can be calculated in ⌈D/d⌉ stages. In contrast, the permutation is more challenging due to memory accesses. For example, a naive implementation may access all hypervector components from memory, but on-chip caches usually have no such capacity.

The proposed method significantly reduces the amount of memory accesses. Figure 5.4a illustrates our acceleration architecture for the initial reference encoding procedure as an example. The acceleration architecture represents typical parallel computing platforms which have many computing units and memory. As discussed in Section 5.4.1, the encoding procedure uses the permuted bipolar base hypervectors, B^{−1}, B^{⊥} and B^{>}, as the inputs. Since there are four DNA alphabets, the inputs are 12 near-orthogonal hypervectors. It calculates the three intermediate hypervectors, F, L and S, while accumulating S into the output reference hypervector, R. Consider that the permuted base hypervectors and the initial reference hypervector are pre-stored in the off-chip memory. To compute all components of R, we run the main loop of the reference encoding ⌈D/d⌉ times by dividing the dimensions into multiple groups, called chunks. In the first iteration, the base buffer stores the first d components of the 12 input hypervectors (❶). The same d dimensions of F, L and S for the first chunk are stored in the local memory of each processing unit, e.g., the registers of each GPU core (❷). For each iteration, the processing units compute the dimensions of the chunk in parallel, and accumulate to the reference buffer that stores the d components of R (❸). Then, the base buffer fetches the next elements of the 12 input hypervectors from the off-chip memory. Similarly, the reference buffer flushes its first element to the off-chip memory and reads the next element. When it needs to reset R for the multivector generation, the reference buffer is stored to the off-chip memory and filled with zeros. The key advantage of this method is that we do not need to know the entire D components of F, L and S for the permutation. Instead, we can regard them as the d components starting from the

τ-th dimension where τ = t mod D, and accumulate them in the reference buffer which already has the corresponding dimensions. Every iteration only needs to read a single element for each base and a single read/write for the reference, while fully utilizing the computing units for the HD operations. Once completing N iterations, we repeat the same procedure for the next chunk until covering all dimensions. The similar method is generally applicable for the other procedures, the query encoding

and refinement. For example, for the query encoding, we compute each chunk of Q by reading one element of each base hypervector and multiplying the d components.

Similarity Computation The pattern discovery engine and the refinement procedure use the similarity computation. The dot product is decomposed into the element-wise multiplication and the grand sum of the multiplied components. The element-wise multiplication can be parallelized on the different computing units, and then we can compute the grand sum by adding multiple pairs in parallel with O(log D) steps. The implementation depends on the parallel platform; we explain the details in the following section.
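As a plain-Python illustration of this decomposition (not the GPU/FPGA code), the sketch below multiplies matching chunks element-wise and then combines partial results with a pairwise, tree-shaped reduction whose depth grows logarithmically in the vector length:

import numpy as np

def tree_sum(values):
    # Pairwise (tree) reduction: about log2(len(values)) levels of parallelizable additions.
    values = list(values)
    while len(values) > 1:
        if len(values) % 2:                 # pad odd-length levels
            values.append(0)
        values = [values[i] + values[i + 1] for i in range(0, len(values), 2)]
    return values[0]

def dot_product(R, Q, chunk=1024):
    # delta(R, Q): multiply matching chunks element-wise, then tree-sum the partial results.
    partials = []
    for s in range(0, len(R), chunk):       # each chunk could run on a different compute unit
        prod = R[s:s + chunk] * Q[s:s + chunk]
        partials.append(tree_sum(prod))
    return tree_sum(partials)

rng = np.random.default_rng(3)
R, Q = rng.integers(-5, 6, size=8192), rng.choice([-1, 1], size=8192)
assert dot_product(R, Q) == int(np.dot(R, Q))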

5.5.2 Implementation on Parallel Computing Platforms

GenieHD-GPU We implement the encoding engine by utilizing the parallel cores and different memory resources in CUDA systems (refer to Figure 5.4b.) The base buffer is stored in the constant memory, which offers high bandwidth for read-only data. Each streaming core stores the intermediate hypervector components of the chunk in their registers; the reference buffer is located in the global memory (DRAM on GPU card). The data reading and writing to the constant and global memory are implemented with CUDA streams which concurrently copy data during computations. We implement the similarity computation using the parallel reduction technique [145]. Each stream core fetches and adds multiple components into the shared memory which provide high performance for inter-thread memory accesses. We then perform the tree-based reduction in the shared memory. GenieHD-FPGA We implement the FPGA encoding engine by using Lookup Table (LUT) resources. We store the base hypervectors into block RAMs (BRAM), the on-chip FPGA memory. The base hypervectors are loaded to a distributed memory designed by the LUT resources. Depending on the reading sequence, GenieHD loads the corresponding base hypervector and combines them using LUT resources. In the pattern discovery, we use the DSP blocks of FPGA to perform the multiplications of the dot product and a tree-based adder

to accumulate the multiplication results (refer to Figure 5.4c). Since the query encoding and discovery use different FPGA resources, we implement the whole procedure in a pipeline structure to handle multiple queries. Depending on the available FPGA resources, it can process a different number of dimensions in parallel. For example, for a Kintex-7 FPGA with 800 DSPs, we can parallelize the computation of 320 dimensions.

GenieHD-ASIC The ASIC design has three major subcomponents: SRAM, interconnect, and computing block. We used SRAM-based memory to keep all base hypervectors. The memory is connected to the computing block with the interconnect. To reduce the memory writes to SRAM, the interconnect implements n-bit shifts to fetch the hypervector components to the computing block in a single cycle. The computing units parallelize the element-wise operations. For the query discovery, it forwards the execution results to the tree-based adder structure located in the computing block, in a similar way to the FPGA design. The efficiency depends on the number of parallel computing units. We design GenieHD-ASIC with the same die size as the evaluated GPU, 471 mm². In this setting, our implementation parallelizes the computations for 8,000 components.

5.6 Evaluation

5.6.1 Experimental Setup

We evaluate GenieHD on various parallel computing platforms. We implement GenieHD-GPU on an NVIDIA GTX 1080 Ti (3584 CUDA cores) and an Intel i7-8700K CPU (12 threads) and measure power consumption using a Hioki 3334 power meter. GenieHD-FPGA is synthesized on a Kintex-7 FPGA KC705 using the Xilinx Vivado Design Suite. We use the Vivado XPower tool to estimate the device power. We design and simulate GenieHD-ASIC using RTL SystemVerilog. For the synthesis, we use Synopsys Design Compiler with the TSMC 45 nm technology library and the general-purpose process with high-VTH cells. We estimate the power consumption

using Synopsys PrimeTime at the (1 V, 25 °C, TT) corner. The GenieHD family is evaluated using D = 100,000 and P = 10,000 with five refinement epochs.

Table 5.1 summarizes the evaluated DNA sequence datasets. We use E. coli DNA data (MG1655) and the human reference genome, chromosome 14 (CHR14) [144]. We also create a random synthetic DNA sequence (RND70) with a length of 70 million characters. The query sequence reads, with lengths in [⊥,>], are extracted from the FASTQ format using the SRA toolkit. The total size of the generated hypervectors for each sequence (HV size) is linearly proportional to the length of the reference sequence. Note that state-of-the-art bioinformatics tools also have a peak memory footprint of up to two orders of gigabytes for the human genome [133].

Table 5.1: Evaluated DNA Sequence Datasets

Dataset             Description            Length   ⊥, >       HV size
E.Coli (MG1655)     Escherichia coli       4.6M     199, 201   53MB
Human (CHR14)       Human chromosome 14    107M     99, 101    1.2GB
Synthetic (RND70)   Random sequence        70M      99, 101    0.8GB

5.6.2 Efficiency Comparison

We compare the efficiency of GenieHD with state-of-the-art programs and accelerators: i) Bowtie2 [132] running on the Intel i7-8700K CPU; ii) minimap2 [133], which runs on the same CPU but is tens of times faster than previous mainstream tools such as BLASR and GMAP; iii) a GPU-based design (ADEY) [20]; and iv) an FPGA-based design (SCADIS) [21] evaluated on the same chip as GenieHD-FPGA. Figure 5.5 shows that GenieHD outperforms the state-of-the-art methods. For example, even when including the overhead of the offline reference encoding, GenieHD-ASIC achieves up to 16× speedup and 40× higher energy efficiency as compared to Bowtie2. GenieHD can offer higher improvements if the references are encoded in advance. For example, when the encoded hypervectors are available, by eliminating the offline encoding costs, GenieHD-ASIC is 199.7× faster and 369.9× more energy efficient than Bowtie2. When comparing the same platforms, GenieHD-FPGA (GenieHD-GPU) achieves 11.1× (10.9×)

109 1000 100 10 1 E.Coli (MG1655) Human (CHR14) Synthetic (RND70)

minimap2 (CPU) SCADIS (FPGA) ADEY (GPU) GenieHD-GPU GenieHD-FPGA GenieHD-ASIC GenieHD-GPU (wo/ enc.) GenieHD-FPGA (wo/ enc.) GenieHD-ASIC (wo/ enc.) 1000 1000

100 100

Speedup 10 10

Improvement Energy Efficiency Efficiency Energy 1 1 E.Coli Human Synthetic E.Coli Human Synthetic (MG1655) (CHR14) (RND70) (MG1655) (CHR14) (RND70)

Figure 5.5: Performance and Energy Comparison of GenieHD for State-of-the-art Methods. All results are compared and normalized to Bowtie2.

speedup and 13.5× (10.6×) higher energy efficiency as compared to SCADIS running on FPGA (ADEY on the GPGPU).

5.6.3 Pattern Matching for Multiple Queries

Figure 5.6(a) shows the breakdown of the GenieHD procedures. The results show that most execution costs come from the reference encoding procedure, e.g., more than 97.6% on average. It is because i) the query sequence is relatively very short and ii) the discovery procedure examines multiple patterns using a single similarity computation in a highly parallel manner. As discussed in Section 5.3, GenieHD can reuse the same reference hypervectors for different queries newly sampled. Figure 5.6(b)-(d) shows the speedup of the accumulated execution time for multiple queries over the state-of-the-art counterparts. For fair comparison, we evaluate the performance of GenieHD based on the total execution costs including the reference/query encoding and query discovery engines. The results show that, by reusing the encoded reference hypervector, GenieHD achieves higher speedup as the number of queries increases. For example,

when comparing the designs running on the same platform, we observe 43.9× and 44.4× speedup on average for 10^6 queries on (b) GPU and (c) FPGA, respectively. The energy-efficiency improvement for each case is 42.5× and 54.1×. As compared to ADEY, GenieHD-ASIC offers 122× speedup and 707× energy-efficiency improvement with the same area (d). This is because GenieHD incurs much less cost from the second run onward. The speedup converges at around 10^3 queries, as the query discovery takes a larger portion of the execution time for a larger number of queries.

Figure 5.6: Scalability of GenieHD: (a) Execution time breakdown to process a single query and reference (average for all datasets), (b) GenieHD-GPU vs. ADEY, (c) GenieHD-FPGA vs. SCADIS, (d) GenieHD-ASIC vs. ADEY. (b)-(d) show how the speedup changes as the number of queries for a reference increases.

Figure 5.7: Accuracy Loss over Dimension Size

5.6.4 Dimensionality Exploitation

In practice, higher efficiency is often more desirable than perfect discovery, since DNA sequences are often error-prone [19]. The statistical nature of GenieHD facilitates such optimization. Figure 5.7 shows how much additional error occurs, beyond the baseline error of 0.003%, as the dimensionality decreases. As anticipated by the estimation model shown in Section 5.4.2, the error increases with lower dimensionality. Note that we do not need to encode the hypervectors again; instead, we can use only a subset of the components in the similarity computation. The results suggest that we can significantly improve the efficiency with minimal accuracy loss. For example, we can achieve 2× speedup for the entire GenieHD family with 2% loss, as it only needs the computation for half of the dimensions. We can also exploit this characteristic for power optimization. Table 5.2 shows the power consumption of the hardware components of GenieHD-ASIC, the SRAM, interconnect (ITC), and computing block (CB), along with the throughput. We evaluated two power optimization schemes: i) Gating, which does not use half of the resources, and ii) voltage over-scaling (VoS), which uses all resources at a lower frequency. The frequency is set to obtain the same throughput of 640K/sec (the number of similarity computations per second). The results show that VoS is the more effective method since the frequency non-linearly influences the speed. GenieHD-ASIC with VoS saves 49.6% and 60.6% power with accuracy losses of 1% and 2%, respectively.

Table 5.2: GenieHD-ASIC Designs under Accuracy Loss

Accuracy loss             0%       1%               2%
                          Base     Gating   VoS     Gating   VoS
Power (W)   SRAM          3.4      2.5      3.4     1.8      3.4
            ITC           0.6      0.4      0.6     0.3      0.6
            CB            21.5     13.7     8.8     10.9     6.0
            Total         25.4     16.6     12.8    13.1     10.0
Power Saving (%)          -        34.6     49.6    48.4     60.6
Throughput                640K / sec (all configurations)

5.7 Conclusion

In this chapter, we described the GenieHD algorithm, which performs DNA pattern matching using HD computing. The proposed technique maps DNA sequences to hypervectors, and accelerates the pattern matching in a highly parallelized way. We also showed how to optimize the memory access patterns and perform pattern matching tasks with dimension-independent operations in parallel. The experimental results show that GenieHD significantly accelerates the pattern matching procedure, e.g., 44.4× speedup with 54.1× energy-efficiency improvements when compared to the existing design on the same FPGA [21]. In the next chapter, we summarize our contributions and discuss the future work.

This chapter contains material from "GenieHD: Efficient DNA Pattern Matching Accelerator Using Hyperdimensional Computing", by Yeseong Kim, Mohsen Imani, Niema Moshiri and Tajana Rosing, which appears in IEEE/ACM Design Automation and Test in Europe Conference, March 2020. The dissertation author was the primary investigator and author of this paper.

Chapter 6

Summary and Future Work

In IoT ecosystems, many applications run machine learning algorithms to assimilate the data generated in the environment. However, sensors and embedded devices generate massive data streams, which pose huge technical challenges due to limited device resources. For example, although deep learning models have provided high classification accuracy for complex learning tasks, their high computational complexity and memory requirements hinder their usability for a broad variety of real-life embedded applications where the device resources and power budget are limited. In this thesis, we focused on novel solutions which can enable efficient learning for IoT ecosystems. In general, we address the issue of device heterogeneity with an intelligent cross-platform characterization and task allocation technique. We then utilize HD computing to enable an efficient learning solution for less-powerful, resource-constrained IoT devices. Our new learning solution was applied to the hierarchy of IoT systems, addressing security and privacy concerns. Lastly, we show that HD-based applications can cover the variety of data types that IoT devices generate. Our solutions span from the architecture level up to the hierarchy of IoT systems. The following sections summarize the contributions of this dissertation and outline future directions.

6.1 Thesis Summary

This thesis proposes novel solutions for efficient learning in heterogeneous IoT systems, covering the architecture, application, and IoT hierarchy levels. At the architecture level, in Chapter 2 we propose a cross-platform power/performance estimation technique to optimize for a given system power and performance objective of learning tasks. Our technique identifies latent states of hardware and software behavior over different configurations and on different platforms that potentially exist in IoT systems. The proposed framework, P4, a new Phase-based Power and Performance Prediction framework, intelligently utilizes refined data from performance counters to characterize application power consumption at runtime. We use this information for predictive allocation of learning tasks. Unlike existing power estimation techniques, our framework automatically recognizes distinct application phases at a fine-grained level, without a priori knowledge of the program source code. The framework utilizes machine learning to identify the phases, which represent key application profiles related to system usage characteristics. Then, it performs what-if analysis to predict how application tasks behave on a different system and allocates the learning tasks among distributed systems. The proposed framework can predict the power levels of diverse applications with less than 7% error on platforms completely different from the ones the applications were characterized on. The model-based task allocation technique, integrated with Apache Spark, reduces the energy consumption and costs by 16%.

At the application level, in Chapter 3 we show how to enable light-weight learning algorithms which are suitable for less-powerful IoT devices. In IoT systems, sending all the data to the cloud cannot guarantee scalability and real-time response. It is also often undesirable due to privacy and security concerns. This leads to the need for alternative computing methods that can run a large amount of data at least partly on the less-powerful IoT devices. Brain-inspired Hyperdimensional (HD) computing is such an alternative. Our HD-based learning algorithm converts data collected from IoT sensors to hypervectors. With the hypervectors, we perform

light-weight training and inference on IoT devices. The experimental results show that we can improve the learning performance by 486× and 6× for the training and inference, respectively, as compared to state-of-the-art deep learning [63].

We apply the HD learning method to the IoT hierarchy in Chapter 4. We utilize the fact that HD computing does not require the complete knowledge of the original data that conventional learning algorithms need. The proposed method uses a secure mapping function that encodes given data to a high-dimensional space. In that way, the original data cannot be recovered from the hypervectors without knowing the mapping function. Since the HD learning and inference procedures only depend on the encoded data, the framework performs learning tasks based on the data encoded by users, while the user data are not revealed to the cloud server. We present scalable HD-based methods which collaboratively learn diverse classification problems: a cloud-centric method for the case that end-node devices do not have enough computing capability, and an edge-based method in which all the user devices participate in secure distributed learning. In our evaluation, we show that the proposed framework can perform the encoding and decoding tasks 145.6× and 6.8× faster than a state-of-the-art homomorphic encryption library when both are running on the Intel i7-8700K. In addition, our classification method presents high accuracy and scalability for diverse practical problems. It successfully performs learning tasks with 95% average accuracy for six real-world workloads, ranging from datasets collected in a small IoT network, e.g., human activity recognition, to a large dataset which includes hundreds of thousands of images for the face recognition task.

In Chapter 5, we examine the potential of HD computing-based learning for other data types, such as those in bioinformatics applications, with a focus on DNA pattern matching. GenieHD converts the DNA sequences into hypervector databases to effectively parallelize the pattern matching task. We also show optimization techniques to implement the algorithm on various platforms. The experimental results show that GenieHD outperforms the state-of-the-art DNA pattern matching procedures on various platforms, GPU [20] and FPGA [21]. For example,

the proposed design delivers 44.4× speedup with 54.1× energy-efficiency improvements when implemented on the same FPGA.

6.2 Future Directions

IoT systems and applications continue to evolve while introducing new problems such as automated online learning and edge-based computing. We are actively working to enable new learning solutions based on the HD computing. In this section, we provide future directions for further improvement on the topics discussed in this thesis: hyperdimensional processing system and software infrastructure for the HD computing.

6.2.1 Efficient Cognitive Processing with HD Computing

HD computing is an attractive solution for efficient learning. This thesis showed that many classification problems can be efficiently solved using HD computing. HD computing-based classification drastically reduces the number of operations as compared to deep learning, resulting in improved performance and high energy efficiency [80, 15]. It can also be accelerated efficiently on parallel computing platforms since the hypervector operations are at heart parallelizable. As future work, we plan to integrate HD computing into today's machines to support other learning tasks. In this system, the hypervector and related operations will be supported as native data types and instructions so that user-level programs can implement diverse learning solutions with an augmented programming model. We also plan to design a co-processor, called the hyperdimensional processing unit (HPU), using emerging hardware technology. In particular, we consider in-memory processing technology to (i) reduce the data movement overheads of hypervectors between the processing core and memory and (ii) drastically parallelize the HD operations such as the element-wise addition and multiplication, similarity computation, etc. The interface between the CPU and the HPU could be implemented based on existing technologies, e.g., PCI

Express and non-uniform memory architecture (NUMA).

6.2.2 Software Infrastructure for HD Computing

Implementing HD applications from scratch is difficult as there is no software infrastructure. Current testbeds are implemented with existing libraries such as Scikit-learn [62] and Tensorflow [63], with various manual optimizations of the source code for different hardware. The ideal software library should bridge the hardware and applications so that such optimizations are done automatically for different applications and platforms. Our current plan is the design of a software infrastructure which has two components: an HD algorithm package and a native HD compiler. The algorithm package will include various learning algorithms using HD computing. Currently, we are designing a new class of learning algorithm suites, such as regression. They will be implemented in Python with interfaces similar to existing machine learning libraries for better usability. The compiler translates the Python implementation to generate HD operations suitable for the target hardware such as GPU and FPGA. The compiler will also optimize the source code in an automated way. For example, to optimize memory usage, the compiler will consider the different characteristics of the hypervectors used. If a set of hypervectors is frequently used in an application, our compiler identifies such hypervectors through static program analysis so that we can proactively allocate them to a faster memory area in the hardware accelerator, e.g., a performance-efficient cache.

Bibliography

[1] D. Reinsel, J. Gantz, and J. Rydning, “Data age 2025: The evolution of data to life-critical don’t focus on big data,” Framingham: IDC Analyze the Future, 2017.

[2] I. Lee and K. Lee, “The internet of things (iot): Applications, investments, and challenges for enterprises,” Business Horizons, vol. 58, no. 4, pp. 431–440, 2015.

[3] T. Yashiro, S. Kobayashi, N. Koshizuka, and K. Sakamura, “An internet of things (iot) architecture for embedded appliances,” in 2013 IEEE Region 10 Humanitarian Technology Conference. IEEE, 2013, pp. 314–319.

[4] X. Zheng, L. K. John, and A. Gerstlauer, “Lacross: Learning-based analytical cross- platform performance and power prediction,” International Journal of Parallel Program- ming, vol. 45, no. 6, pp. 1488–1514, 2017.

[5] N. P. Jouppi, C. Young, N. Patil, D. Patterson, G. Agrawal, R. Bajwa, S. Bates, S. Bhatia, N. Boden, A. Borchers, R. Boyle, P. Cantin, C. Chao, C. Clark, J. Coriell, M. Daley, M. Dau, J. Dean, B. Gelb, T. V. Ghaemmaghami, R. Gottipati, W. Gulland, R. Hagmann, C. R. Ho, D. Hogberg, J. Hu, R. Hundt, D. Hurt, J. Ibarz, A. Jaffey, A. Jaworski, A. Kaplan, H. Khaitan, D. Killebrew, A. Koch, N. Kumar, S. Lacy, J. Laudon, J. Law, D. Le, C. Leary, Z. Liu, K. Lucke, A. Lundin, G. MacKean, A. Maggiore, M. Mahony, K. Miller, R. Nagara- jan, R. Narayanaswami, R. Ni, K. Nix, T. Norrie, M. Omernick, N. Penukonda, A. Phelps, J. Ross, M. Ross, A. Salek, E. Samadiani, C. Severn, G. Sizikov, M. Snelham, J. Souter, D. Steinberg, A. Swing, M. Tan, G. Thorson, B. Tian, H. Toma, E. Tuttle, V. Vasudevan, R. Walter, W. Wang, E. Wilcox, and D. H. Yoon, “In-datacenter performance analysis of a tensor processing unit,” in 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA), June 2017, pp. 1–12.

[6] Y. Goldberg and O. Levy, “word2vec explained: deriving mikolov et al.’s negative-sampling word-embedding method,” arXiv preprint arXiv:1402.3722, 2014.

[7] P. Kanerva, “Hyperdimensional computing: An introduction to computing in distributed representation with high-dimensional random vectors,” Cognitive Computation, vol. 1, no. 2, pp. 139–159, 2009.

[8] Z. Ou, B. Pang, Y. Deng, J. K. Nurminen, A. Yla-Jaaski, and P. Hui, "Energy- and cost-efficiency analysis of ARM-based clusters," in Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2012). IEEE Computer Society, 2012, pp. 115–123.

[9] W. Cui, Y. Kim, and T. S. Rosing, “Cross-platform machine learning characterization for task allocation in iot ecosystems,” in Computing and Communication Workshop and Conference (CCWC), 2017 IEEE 7th Annual. IEEE, 2017, pp. 1–7.

[10] R. Bertran, M. Gonzalez, X. Martorell, N. Navarro, and E. Ayguade, “Decomposable and responsive power models for multicore processors using performance counters,” in Proceedings of the 24th ACM International Conference on Supercomputing. ACM, 2010.

[11] R. Bertran, M. Gonzàlez, X. Martorell, N. Navarro, and E. Ayguadé, "Counter-based power modeling methods: Top-down vs. bottom-up," The Computer Journal, 2013.

[12] S. Reda and A. N. Nowroz, “Power modeling and characterization of computing devices: a survey,” Foundations and Trends in Electronic Design Automation, 2012.

[13] P. Kanerva, Sparse distributed memory. MIT press, 1988.

[14] B. Babadi and H. Sompolinsky, "Sparseness and expansion in sensory representations," Neuron, vol. 83, no. 5, pp. 1213–1226, Sep. 2014. [Online]. Available: http://view.ncbi.nlm.nih.gov/pubmed/25155954

[15] Y. Kim, M. Imani, and T. S. Rosing, “Efficient human activity recognition using hyperdimensional computing,” in Proceedings of the 8th International Conference on the Internet of Things, 2018, pp. 1–6.

[16] M. Van Dijk, C. Gentry, S. Halevi, and V. Vaikuntanathan, “Fully homomorphic encryption over the integers,” in Annual International Conference on the Theory and Applications of Cryptographic Techniques. Springer, 2010, pp. 24–43.

[17] H. Chen, K. Han, Z. Huang, A. Jalali, and K. Laine, “Simple encrypted arithmetic library v2.3.0,” in Microsoft, 2017.

[18] M. M. Alam, H. Malik, M. I. Khan, T. Pardy, A. Kuusik, and Y. Le Moullec, “A survey on the roles of communication technologies in iot-based personalized healthcare applications,” IEEE Access, vol. 6, pp. 36 611–36 631, 2018.

[19] P. Compeau and P. Pevzner, Bioinformatics algorithms: an active learning approach. Active Learning Publishers La Jolla, 2015, vol. 1.

[20] S. P. Adey, “Gpu accelerated pattern matching algorithm for dna sequences to detect cancer using cuda,” hgpu, 2013.

[21] S. Lei, C. Wang, H. Fang, X. Li, and X. Zhou, “Scadis: A scalable accelerator for data-intensive string set matching on fpgas,” in 2016 IEEE Trustcom/BigDataSE/ISPA. IEEE, 2016, pp. 1190–1197.

[22] F. Bonomi, R. Milito, P. Natarajan, and J. Zhu, “Fog computing: A platform for internet of things and analytics,” in Big Data and Internet of Things: A Roadmap for Smart Environments. Springer, 2014, pp. 169–186.

[23] R. Buyya, A. Beloglazov, and J. Abawajy, “Energy-efficient management of data center resources for cloud computing: A vision, architectural elements, and open challenges,” arXiv preprint arXiv:1006.0308, 2010.

[24] L. Zhang, B. Tiwana, Z. Qian, Z. Wang, R. P. Dick, Z. M. Mao, and L. Yang, “Accurate online power estimation and automatic battery behavior based power model generation for smartphones,” in Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis. ACM, 2010.

[25] M. Zaharia, R. Xin, P. Wendell, T. Das, M. Armbrust, A. Dave, X. Meng, J. Rosen, S. Venkataraman, M. Franklin, A. Ghodsi, J. Gonzalez, S. Shenker, and I. Stoica, “Apache spark: a unified engine for big data processing,” Communications of the ACM, vol. 59, no. 11, pp. 56–65, 2016.

[26] R. Tibshirani, “Regression shrinkage and selection via the lasso,” Journal of the Royal Statistical Society. Series B (Methodological), pp. 267–288, 1996.

[27] D. Economou, S. Rivoire, C. Kozyrakis, and P. Ranganathan, “Full-system power analysis and modeling for server environments,” in International Symposium on Computer Architecture-IEEE, 2006.

[28] X. Zheng, L. K. John, and A. Gerstlauer, “Accurate phase-level cross-platform power and performance estimation,” in Proceedings of the 53rd Annual Design Automation Conference, 2016, pp. 1–6.

[29] E. Perelman, M. Polito, J.-Y. Bouguet, J. Sampson, B. Calder, and C. Dulong, “Detecting phases in parallel applications on shared memory architectures,” in Proceedings 20th IEEE International Parallel & Distributed Processing Symposium. IEEE, 2006.

[30] T. Sherwood, E. Perelman, and B. Calder, “Basic block distribution analysis to find periodic behavior and simulation points in applications,” in Parallel Architectures and Compilation Techniques, 2001. Proceedings. 2001 International Conference on. IEEE, 2001.

[31] G. Tang, W. Jiang, Z. Xu, F. Liu, and K. Wu, “Zero-cost, fine-grained power monitoring of datacenters using non-intrusive power disaggregation,” in Proceedings of the 16th Annual Middleware Conference. ACM, 2015.

[32] S. S. Bhargav, A. Kolb, and Y. H. Cho, “Accelerating physical level sub-component power simulation by online power partitioning,” in 2016 17th International Symposium on Quality Electronic Design (ISQED). IEEE, 2016.

[33] S. Li, J. H. Ahn, R. D. Strong, J. B. Brockman, D. M. Tullsen, and N. P. Jouppi, “Mcpat: an integrated power, area, and timing modeling framework for multicore and manycore architectures,” in Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture. ACM, 2009.

[34] Y.-H. Lee and J. Kim, “Fast and accurate on-line prediction of performance and power consumption in multicore-based systems,” in 2013 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications. IEEE, 2013.

[35] V. Petrucci, O. Loques, and D. Mosse, “Lucky scheduling for energy-efficient heterogeneous multi-core systems,” in Presented as part of the 2012 Workshop on Power-Aware Computing and Systems (HotPower), 2012.

[36] J. C. McCullough, Y. Agarwal, J. Chandrashekar, S. Kuppuswamy, A. C. Snoeren, and R. K. Gupta, “Evaluating the effectiveness of model-based power characterization,” in USENIX Annual Technical Conf, 2011.

[37] A. S. Dhodapkar and J. E. Smith, “Comparing program phase detection techniques,” in Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture. IEEE Computer Society, 2003.

[38] Y. Kim, F. Paterna, S. Tilak, and T. S. Rosing, “Smartphone analysis and optimization based on user activity recognition,” in Computer-Aided Design (ICCAD), 2015 IEEE/ACM International Conference on. IEEE, 2015.

[39] X. Ma, P. Huang, X. Jin, P. Wang, S. Park, D. Shen, Y. Zhou, L. K. Saul, and G. M. Voelker, “edoctor: automatically diagnosing abnormal battery drain issues on smartphones,” in Presented as part of the 10th USENIX Symposium on Networked Systems Design and Implementation (NSDI 13), 2013.

[40] Y. Jin, X. Ma, M. Liu, Q. Liu, J. Logan, N. Podhorszki, J. Y. Choi, and S. Klasky, “Combining phase identification and statistic modeling for automated parallel benchmark generation,” ACM SIGMETRICS Performance Evaluation Review, 2015.

[41] A. Annamalai, R. Rodrigues, I. Koren, and S. Kundu, “An opportunistic prediction-based thread scheduling to maximize throughput/watt in amps,” in Proceedings of the 22nd international conference on Parallel architectures and compilation techniques. IEEE Press, 2013.

[42] B. Aksanli and T. Rosing, “Providing regulation services and managing data center peak power budgets,” in Proceedings of the conference on Design, Automation & Test in Europe. European Design and Automation Association, 2014.

[43] L. Luo, W. Wu, D. Di, F. Zhang, Y. Yan, and Y. Mao, “A resource scheduling algorithm of cloud computing based on energy efficient optimization methods,” in Green Computing Conference (IGCC), 2012 International. IEEE, 2012, pp. 1–6.

[44] Í. Goiri, W. Katsak, K. Le, T. D. Nguyen, and R. Bianchini, “Parasol and greenswitch: Managing datacenters powered by renewable energy,” in ACM SIGARCH Computer Architecture News, vol. 41, no. 1. ACM, 2013, pp. 51–64.

[45] M. Dayarathna, Y. Wen, and R. Fan, “Data center energy consumption modeling: A survey,” IEEE Communications Surveys & Tutorials, vol. 18, no. 1, pp. 732–794, 2016.

[46] A. Beloglazov, J. Abawajy, and R. Buyya, “Energy-aware resource allocation heuristics for efficient management of data centers for cloud computing,” Future generation computer systems, vol. 28, no. 5, pp. 755–768, 2012.

[47] I. Mavridis and H. Karatza, “Performance evaluation of cloud-based log file analysis with apache hadoop and apache spark,” Journal of Systems and Software, vol. 125, pp. 133–151, 2017.

[48] P. Li, L. Dong, H. Xu, and T. F. Lau, “Spark’s operation time predictive in cloud computing environment based on src-wsvr,” Journal of High Speed Networks, vol. 24, no. 1, pp. 49–62, 2018.

[49] B. Su, J. Gu, L. Shen, W. Huang, J. L. Greathouse, and Z. Wang, “Ppep: Online performance, power, and energy prediction framework and dvfs space exploration,” in Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture. IEEE Computer Society, 2014.

[50] J. Reinders, VTune (TM) Performance Analyzer Essentials: Measurement and Tuning Techniques for Software Developers. Intel Press, 2004.

[51] T. S. Shively, T. W. Sager, and S. G. Walker, “A bayesian approach to non-parametric monotone function estimation,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 71, no. 1, pp. 159–175, 2009.

[52] C. S. Chan, Y. Jin, Y.-K. Wu, K. Gross, K. Vaidyanathan, and T. Rosing, “Fan-speed-aware scheduling of data intensive jobs,” in Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design. ACM, 2012, pp. 409–414.

[53] “Linpack benchmarks,” https://software.intel.com/en-us/articles/intel-mkl-benchmarks-suite, 2019.

[54] D. Arthur and S. Vassilvitskii, “k-means++: The advantages of careful seeding,” in Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms. Society for Industrial and Applied Mathematics, 2007.

[55] V. Satopaa, J. Albrecht, D. Irwin, and B. Raghavan, “Finding a kneedle in a haystack: Detecting knee points in system behavior,” in 2011 31st International Conference on Distributed Computing Systems Workshops. IEEE, 2011.

[56] “Spec2006,” https://www.spec.org/cpu2006/, 2006.

[57] C. Bienia, S. Kumar, J. P. Singh, and K. Li, “The parsec benchmark suite: characterization and architectural implications,” in Proceedings of the 17th international conference on Parallel architectures and compilation techniques. ACM, 2008.

[58] D. Birant and A. Kut, “St-dbscan: An algorithm for clustering spatial–temporal data,” Data & Knowledge Engineering, vol. 60, no. 1, pp. 208–221, 2007.

[59] Q. Wu, Q. Deng, L. Ganesh, C.-H. Hsu, Y. Jin, S. Kumar, B. Li, J. Meza, and Y. J. Song, “Dynamo: Facebook’s data center-wide power management system,” in Computer Architecture (ISCA), 2016 ACM/IEEE 43rd Annual International Symposium on. IEEE, 2016, pp. 469–480.

[60] S. Govindan, A. Sivasubramaniam, and B. Urgaonkar, “Benefits and limitations of tapping into stored energy for datacenters,” in Computer Architecture (ISCA), 2011 38th Annual International Symposium on. IEEE, 2011, pp. 341–351.

[61] B. Aksanli, J. Venkatesh, I. Monga, and T. S. Rosing, “Renewable energy prediction for improved utilization and efficiency in datacenters and backbone networks,” in Computational Sustainability. Springer, 2016, pp. 47–74.

[62] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay, “Scikit-learn: Machine learning in python,” Journal of Machine Learning Research, vol. 12, no. Oct, pp. 2825–2830, 2011.

[63] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. J. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Józefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. G. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. A. Tucker, V. Vanhoucke, V. Vasudevan, F. B. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng, “Tensorflow: Large-scale machine learning on heterogeneous distributed systems,” CoRR, vol. abs/1603.04467, 2016.

[64] C. Boettiger, “An introduction to docker for reproducible research,” ACM SIGOPS Operating Systems Review, vol. 49, no. 1, pp. 71–79, 2015.

[65] “Weave net,” https://www.weave.works/, 2019.

[66] K. Shvachko, H. Kuang, S. Radia, and R. Chansler, “The hadoop distributed file system,” in Mass storage systems and technologies (MSST), 2010 IEEE 26th symposium on. IEEE, 2010, pp. 1–10.

[67] “Wikipedia, mean absolute percentage error,” https://en.wikipedia.org/wiki/Mean_absolute_percentage_error, 2019.

[68] “Nersc benchmarks,” http://www.nersc.gov/users/computational-systems/cori/nersc-8-procurement/trinity-nersc-8-rfp/nersc-8-trinity-benchmarks/, 2015.

[69] “Top 500 supercomputer sites,” https://www.top500.org/, 2019.

[70] M. Li, J. Tan, Y. Wang, L. Zhang, and V. Salapura, “Sparkbench: a comprehensive benchmarking suite for in memory data analytic platform spark,” in Proceedings of the 12th ACM International Conference on Computing Frontiers. ACM, 2015, p. 53.

[71] J. J. Dai, Y. Wang, X. Qiu, D. Ding, Y. Zhang, Y. Wang, X. Jia, C. L. Zhang, Y. Wan, Z. Li, J. Wang, S. Huang, Z. Wu, Y. Wang, Y. Yang, B. She, D. Shi, Q. Lu, K. Huang, and G. Song, “Bigdl: A distributed deep learning framework for big data,” arXiv preprint arXiv:1804.05839, 2018.

[72] D. Capps and W. Norcott, “Iozone filesystem benchmark,” 2008.

[73] A. Tirumala, “Iperf: The tcp/udp bandwidth measurement tool,” http://dast.nlanr.net/Projects/Iperf/, 1999.

[74] T. P. P. Council, “Tpc-h benchmark specification,” Published at http://www.tcp.org/hspec.html, vol. 21, pp. 592–603, 2008.

[75] S. Yi, C. Li, and Q. Li, “A survey of fog computing: concepts, applications and issues,” in Proceedings of the 2015 workshop on mobile big data. ACM, 2015, pp. 37–42.

[76] M. Li, D. G. Andersen, J. W. Park, A. J. Smola, A. Ahmed, V. Josifovski, J. Long, E. J. Shekita, and B.-Y. Su, “Scaling distributed machine learning with the parameter server.” in OSDI, vol. 14, 2014, pp. 583–598.

[77] J. Andriessen, M. Baker, and D. D. Suthers, Arguing to learn: Confronting cognitions in computer-supported collaborative learning environments. Springer Science & Business Media, 2013, vol. 1.

[78] A. Papadimitriou, R. Bhagwan, N. Chandran, R. Ramjee, A. Haeberlen, H. Singh, A. Modi, and S. Badrinarayanan, “Big data analytics over encrypted datasets with seabed.” in OSDI, 2016, pp. 587–602.

[79] P. Kanerva, “Hyperdimensional computing: An introduction to computing in distributed representation with high-dimensional random vectors,” Cognitive Computation, vol. 1, no. 2, pp. 139–159, 2009.

[80] M. Imani, D. Kong, A. Rahimi, and T. Rosing, “Voicehd: Hyperdimensional computing for efficient speech recognition,” in International Conference on Rebooting Computing (ICRC). IEEE, 2017, pp. 1–6.

[81] A. Joshi, J. Halseth, and P. Kanerva, “Language geometry using random indexing,” Quantum Interaction 2016 Conference Proceedings, In press.

[82] M. Imani, A. Rahimi, D. Kong, T. Rosing, and J. M. Rabaey, “Exploring hyperdimensional associative memory,” in High Performance Computer Architecture (HPCA), 2017 IEEE International Symposium on. IEEE, 2017, pp. 445–456.

[83] Y. Kim, F. Paterna, S. Tilak, and T. S. Rosing, “Smartphone analysis and optimization based on user activity recognition,” in Computer-Aided Design (ICCAD), 2015 IEEE/ACM International Conference on. IEEE, 2015, pp. 605–612.

[84] B. Aksanli, A. S. Akyurek, and T. S. Rosing, “User behavior modeling for estimating residential energy consumption,” in Smart City 360. Springer, 2016, pp. 348–361.

[85] A. Rahimi, A. Tchouprina, P. Kanerva, J. d. R. Millán, and J. M. Rabaey, “Hyperdimensional computing for blind and one-shot classification of eeg error-related potentials,” Mobile Networks and Applications, pp. 1–12, 2017.

[86] F. R. Najafabadi, A. Rahimi, P. Kanerva, and J. M. Rabaey, “Hyperdimensional computing for text classification,” Design, Automation Test in Europe Conference Exhibition (DATE), University Booth, 2016.

[87] O. Räsänen and S. Kakouros, “Modeling dependencies in multiple parallel data streams with hyperdimensional computing,” IEEE Signal Processing Letters, vol. 21, no. 7, pp. 899–903, 2014.

[88] O. Rasanen and J. Saarinen, “Sequence prediction with sparse distributed hyperdimensional coding applied to the analysis of mobile phone use patterns,” IEEE Transactions on Neural Networks and Learning Systems, vol. PP, no. 99, pp. 1–12, 2015.

[89] A. Rahimi, S. Benatti, P. Kanerva, L. Benini, and J. M. Rabaey, “Hyperdimensional biosignal processing: A case study for emg-based hand gesture recognition,” in Rebooting Computing (ICRC), IEEE International Conference on. IEEE, 2016, pp. 1–8.

[90] H. Li, T. F. Wu, A. Rahimi, K. Li, M. Rusch, C. Lin, J. Hsu, M. M. Sabry, S. B. Eryilmaz, J. Sohn, W. Chiu, M. Chen, T. Wu, J. Shieh, W. Yeh, J. M. Rabaey, S. Mitra, and H. P. Wong, “Hyperdimensional computing with 3d vrram in-memory kernels: Device-architecture co-design for energy-efficient, error-resilient language recognition,” in Electron Devices Meeting (IEDM), 2016 IEEE International. IEEE, 2016, pp. 16–1.

[91] A. Reiss and D. Stricker, “Introducing a new benchmarked dataset for activity monitoring,” in Wearable Computers (ISWC), 2012 16th International Symposium on. IEEE, 2012, pp. 108–109.

[92] D. Anguita, A. Ghio, L. Oneto, X. Parra, and J. L. Reyes-Ortiz, “A public domain dataset for human activity recognition using smartphones.” in ESANN, 2013.

[93] Y. Vaizman, K. Ellis, and G. Lanckriet, “Recognizing detailed human context in the wild from smartphones and smartwatches,” IEEE Pervasive Computing, vol. 16, no. 4, pp. 62–74, 2017.

[94] T. Yan, D. Chu, D. Ganesan, A. Kansal, and J. Liu, “Fast app launching for mobile devices using predictive user context,” in Proceedings of the 10th international conference on Mobile systems, applications, and services. ACM, 2012, pp. 113–126.

[95] M. Imani, C. Huang, D. Kong, and T. Rosing, “Hierarchical hyperdimensional computing for energy efficient classification,” in Proceedings of the 55th Annual Design Automation Conference. ACM, 2018, p. 108.

[96] F. Paterna and T. Š. Rosing, “Modeling and mitigation of extra-soc thermal coupling effects and heat transfer variations in mobile devices,” in Proceedings of the IEEE/ACM International Conference on Computer-Aided Design. IEEE Press, 2015, pp. 831–838.

[97] S. Suthaharan, “Big data classification: Problems and challenges in network intrusion prediction with machine learning,” ACM SIGMETRICS Performance Evaluation Review, vol. 41, no. 4, pp. 70–73, 2014.

[98] A. L. Buczak and E. Guven, “A survey of data mining and machine learning methods for cyber security intrusion detection,” IEEE Communications Surveys & Tutorials, vol. 18, no. 2, pp. 1153–1176, 2016.

[99] N. Papernot, P. McDaniel, A. Sinha, and M. Wellman, “Towards the science of security and privacy in machine learning,” arXiv preprint arXiv:1611.03814, 2016.

[100] M. I. Jordan and T. M. Mitchell, “Machine learning: Trends, perspectives, and prospects,” Science, vol. 349, no. 6245, pp. 255–260, 2015.

[101] T. Ristenpart, E. Tromer, H. Shacham, and S. Savage, “Hey, you, get off of my cloud: exploring information leakage in third-party compute clouds,” in Proceedings of the 16th ACM conference on Computer and communications security. ACM, 2009, pp. 199–212.

[102] M. Taram, A. Venkat, and D. Tullsen, “Context-sensitive fencing: Securing speculative execution via microcode customization,” in International Conference on Architectural Support for Programming Languages and Operating Systems. ACM, 2019.

[103] A. Baumann, M. Peinado, and G. Hunt, “Shielding applications from an untrusted cloud with haven,” ACM Transactions on Computer Systems (TOCS), vol. 33, no. 3, p. 8, 2015.

[104] G. Zhao, C. Rong, J. Li, F. Zhang, and Y. Tang, “Trusted data sharing over untrusted cloud storage providers,” in Cloud Computing Technology and Science (CloudCom), 2010 IEEE Second International Conference on. IEEE, 2010, pp. 97–103.

[105] A. J. Feldman, W. P. Zeller, M. J. Freedman, and E. W. Felten, “Sporc: Group collaboration using untrusted cloud resources.” in OSDI, vol. 10, 2010, pp. 337–350.

[106] S. Choi, G. Ghinita, H.-S. Lim, and E. Bertino, “Secure knn query processing in untrusted cloud environments,” IEEE Transactions on Knowledge and Data Engineering, vol. 26, no. 11, pp. 2818–2831, 2014.

[107] K. Bonawitz, V. Ivanov, B. Kreuter, A. Marcedone, H. B. McMahan, S. Patel, D. Ramage, A. Segal, and K. Seth, “Practical secure aggregation for privacy-preserving machine learning,” in CCS. ACM, 2017.

[108] A. Ben-Efraim, Y. Lindell, and E. Omri, “Optimizing semi-honest secure multiparty computation for the internet,” in Proceedings of the ACM SIGSAC Conference on Computer and Communications Security (CCS). ACM, 2016, pp. 578–590.

[109] A. Rahimi, P. Kanerva, and J. M. Rabaey, “A robust and energy-efficient classifier using brain-inspired hyperdimensional computing,” in Proceedings of the 2016 International Symposium on Low Power Electronics and Design. ACM, 2016, pp. 64–69.

[110] H. Chen, K. Laine, and R. Player, “Simple encrypted arithmetic library - seal v2.1,” in International Conference on Financial Cryptography and Data Security. Springer, 2017, pp. 3–18.

[111] W. Du and M. J. Atallah, “Secure multi-party computation problems and their applications: a review and open problems,” in Proceedings of the 2001 workshop on New security paradigms. ACM, 2001, pp. 13–22.

[112] M. S. Riazi, C. Weinert, O. Tkachenko, E. M. Songhori, T. Schneider, and F. Koushanfar, “Chameleon: A hybrid secure computation framework for machine learning applications,” in ASIACCS. ACM, 2018.

[113] R. Shokri and V. Shmatikov, “Privacy-preserving deep learning,” in CCS. ACM, 2015.

[114] M. S. Riazi and F. Koushanfar, “Privacy-preserving deep learning and inference,” in Proceedings of the International Conference on Computer-Aided Design. ACM, 2018, p. 18.

[115] P. Mohassel and Y. Zhang, “SecureML: A system for scalable privacy-preserving machine learning,” in IEEE S&P, 2017.

[116] M. S. Riazi, B. D. Rouhani, and F. Koushanfar, “Deep learning on private data,” in IEEE Security and Privacy Magazine. IEEE, 2019.

[117] M. S. Riazi, M. Samragh, H. Chen, K. Laine, K. Lauter, and F. Koushanfar, “XONN: XNOR-based oblivious deep neural network inference,” USENIX Security, 2019.

[118] B. Hitaj, G. Ateniese, and F. Pérez-Cruz, “Deep models under the GAN: information leakage from collaborative deep learning,” in CCS. ACM, 2017.

[119] A. C.-C. Yao, “How to generate and exchange secrets,” in Foundations of Computer Science, 1986., 27th Annual Symposium on. IEEE, 1986, pp. 162–167.

[120] V. Kolesnikov and T. Schneider, “Improved garbled circuit: Free XOR gates and applications,” in Automata, Languages and Programming. Springer, 2008.

[121] A. Waksman, “A permutation network,” Journal of the ACM (JACM), vol. 15, no. 1, pp. 159–163, 1968.

[122] P. Kanerva, J. Kristofersson, and A. Holst, “Random indexing of text samples for latent semantic analysis,” in Proceedings of the 22nd annual conference of the cognitive science society, vol. 1036. Citeseer, 2000.

[123] T. R. Henderson, M. Lacage, G. F. Riley, C. Dowell, and J. Kopena, “Network simulations with the ns-3 simulator,” SIGCOMM demonstration, vol. 14, no. 14, p. 527, 2008.

[124] T. Feist, “Vivado design suite,” White Paper, vol. 5, 2012.

[125] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.

[126] D. Ciregan, U. Meier, and J. Schmidhuber, “Multi-column deep neural networks for image classification,” in Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2012, pp. 3642–3649.

[127] “Uci machine learning repository,” http://archive.ics.uci.edu/ml/datasets/ISOLET, 1994.

[128] M. S. Razlighi, M. Imani, F. Koushanfar, and T. Rosing, “Looknn: Neural network with no multiplication,” in 2017 Design, Automation & Test in Europe Conference & Exhibition (DATE). IEEE, 2017, pp. 1775–1780.

[129] D. Anguita, A. Ghio, L. Oneto, X. Parra, and J. L. Reyes-Ortiz, “Human activity recognition on smartphones using a multiclass hardware-friendly support vector machine,” in International workshop on ambient assisted living. Springer, 2012, pp. 216–223.

[130] Y. Kim, M. Imani, and T. Rosing, “Orchard: Visual object recognition accelerator based on approximate in-memory processing,” in Computer-Aided Design (ICCAD), 2017 IEEE/ACM International Conference on. IEEE, 2017, pp. 25–32.

[131] U. Quasthoff, M. Richter, and C. Biemann, “Corpus portal for search in monolingual corpora,” in Proceedings of the fifth international conference on language resources and evaluation, vol. 17991802, 2006, p. 21.

[132] B. Langmead and S. L. Salzberg, “Fast gapped-read alignment with bowtie 2,” Nature methods, vol. 9, no. 4, p. 357, 2012.

[133] H. Li, “Minimap2: pairwise alignment for nucleotide sequences,” Bioinformatics, vol. 34, no. 18, pp. 3094–3100, 2018.

[134] A. Rahimi, S. Datta, D. Kleyko, E. P. Frady, B. Olshausen, P. Kanerva, and J. M. Rabaey, “High-dimensional computing as a nanoscalable paradigm,” IEEE Transactions on Circuits and Systems I: Regular Papers, 2017.

[135] M. Imani, Y. Kim, S. Riazi, J. Messerly, P. Liu, F. Koushanfar, and T. Rosing, “A framework for collaborative learning in secure high-dimensional space,” in 2019 IEEE 12th International Conference on Cloud Computing (CLOUD). IEEE, 2019, pp. 435–446.

[136] M. Imani, J. Morris, J. Messerly, H. Shu, Y. Deng, and T. Rosing, “Bric: Locality-based encoding for energy-efficient brain-inspired hyperdimensional computing,” in 2019 56th ACM/IEEE Design Automation Conference (DAC). IEEE, 2019, pp. 1–6.

[137] M. Imani, T. Nassar, A. Rahimi, and T. Rosing, “Hdna: Energy-efficient dna sequencing using hyperdimensional computing,” in IEEE BHI. IEEE, 2018, pp. 271–274.

[138] M. Imani, S. Salamat, S. Gupta, J. Huang, and T. Rosing, “Fach: Fpga-based acceleration of hyperdimensional computing by reducing computational complexity,” in ASP-DAC. IEEE, 2019.

[139] M. Imani, S. Salamat, B. Khaleghi, M. Samragh, F. Koushanfar, and T. Rosing, “Sparsehd: Algorithm-hardware co-optimization for efficient high-dimensional computing,” in 2019 IEEE 27th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM). IEEE, 2019, pp. 190–198.

[140] S. Salamat, M. Imani, B. Khaleghi, and T. Rosing, “F5-hd: Fast flexible fpga-based framework for refreshing hyperdimensional computing,” in FPGA. ACM, 2019, pp. 53–62.

[141] M. Imani, J. Messerly, F. Wu, W. Pi, and T. Rosing, “A binary learning framework for hyperdimensional computing,” in DATE. IEEE/ACM, 2019.

[142] S. Memeti and S. Pllana, “Analyzing large-scale dna sequences on multi-core architectures,” in 2015 IEEE 18th International Conference on Computational Science and Engineering. IEEE, 2015, pp. 208–215.

[143] E. B. Fernandez, W. A. Najjar, S. Lonardi, and J. Villarreal, “Multithreaded fpga acceleration of dna sequence mapping,” in 2012 IEEE Conference on High Performance Extreme Computing. IEEE, 2012, pp. 1–6.

[144] “National center for biotechnology information support center,” https://www.ncbi.nlm.nih.gov, 2019.

[145] M. Harris, “Optimizing parallel reduction in cuda,” Nvidia developer technology, vol. 2, no. 4, p. 70, 2007.
