UNIVERSITY OF CALIFORNIA SAN DIEGO

Breaking the ISA Barrier in Modern Computing

A dissertation submitted in partial satisfaction of the requirements for the degree Doctor of Philosophy

in

Computer Science

by

Ashish Venkat

Committee in charge:

Professor Dean M. Tullsen, Chair
Professor Andrew B. Kahng
Professor Sorin Lerner
Professor Timothy Sherwood
Professor Deian Stefan

2018

Copyright
Ashish Venkat, 2018
All rights reserved.

The dissertation of Ashish Venkat is approved, and it is acceptable in quality and form for publication on microfilm and electronically:

Chair

University of California San Diego

2018

DEDICATION

To Suchetha.

EPIGRAPH

If we knew what it was we were doing, it would not be called research, would it?

—Albert Einstein

TABLE OF CONTENTS

Signature Page ...... iii

Dedication ...... iv

Epigraph ...... v

Table of Contents ...... vi

List of Figures ...... ix

List of Tables ...... xi

Acknowledgements ...... xii

Vita ...... xv

Abstract of the Dissertation ...... xvii

Chapter 1 Introduction ...... 1
1.1 Breaking the ISA Barrier ...... 2
1.1.1 Cross-ISA Process Migration ...... 3
1.1.2 Design of a Heterogeneous-ISA Chip Multiprocessor ...... 4
1.1.3 HIPStR: Security Defense via ISA Diversification ...... 5
1.1.4 Composite-ISA Architectures ...... 6
1.2 Overview of Dissertation ...... 8

Chapter 2 Background ...... 10
2.1 Heterogeneous Architectures ...... 10
2.2 The Path to Multi-ISA Heterogeneity ...... 13
2.3 Return-Oriented Programming ...... 15
2.4 ROP Mitigations ...... 17

Chapter 3 Cross-ISA Process Migration ...... 21
3.1 Symmetrical Fat Binary ...... 22
3.2 Multi-ISA Compilation ...... 23
3.3 State Transformation ...... 25
3.4 Binary Translation ...... 26
3.4.1 Translation Block Chaining ...... 27
3.4.2 ISA-Specific Challenges ...... 29
3.5 Experimental Methodology ...... 31
3.6 Results ...... 33
3.6.1 Steady State Performance ...... 33
3.6.2 Migration Cost ...... 33
3.7 Conclusion ...... 39

Chapter 4 Design of a Heterogeneous-ISA Chip Multiprocessor ...... 41
4.1 Harnessing ISA Diversity ...... 42
4.2 Design Space Exploration ...... 46
4.3 Programming Environment and Memory Layout ...... 48
4.4 Experimental Methodology ...... 49
4.5 Results ...... 51
4.5.1 Evaluation of the Heterogeneous-ISA Architecture ...... 52
4.5.2 Framework for ISA-Microarchitecture co-design ...... 55
4.5.3 ISA Affinity ...... 57
4.6 Conclusion ...... 60

Chapter 5 HIPStR: Security Defense via ISA Diversification ...... 61
5.1 Background and Motivation ...... 61
5.2 Architectural Overview ...... 63
5.2.1 Security and Performance Guarantees ...... 63
5.2.2 Instruction Set Randomization ...... 64
5.2.3 Program State Relocation ...... 64
5.2.4 Heterogeneous-ISA PSR ...... 66
5.3 Assumptions and Threat Model ...... 68
5.4 Design and Implementation ...... 69
5.4.1 Program State Relocation ...... 69
5.4.2 PSR-aware Execution Migration ...... 71
5.4.3 Execution Scenarios ...... 72
5.4.4 Performance Optimizations ...... 73
5.4.5 Prototype Implementation of PSR ...... 74
5.5 Experimental Methodology ...... 75
5.6 Evaluation ...... 80
5.6.1 Security Evaluation ...... 80
5.6.2 Performance Evaluation ...... 87
5.7 Conclusion ...... 90

Chapter 6 Composite-ISA Architectures ...... 93
6.1 ISA Feature Set Derivation ...... 94
6.2 Compiler and Runtime Strategy ...... 99
6.2.1 Compiler Toolchain Development ...... 99
6.2.2 Migration Strategy ...... 101
6.3 Decoder Design and Implementation ...... 102
6.3.1 Instruction Encoding ...... 102
6.3.2 Decoder Analysis ...... 103
6.4 Experimental Methodology ...... 105
6.5 Results ...... 108
6.5.1 Performance and Energy Efficiency ...... 108
6.5.2 Feature Sensitivity Analysis ...... 113
6.5.3 Feature Affinity ...... 116
6.5.4 Migration Cost Analysis ...... 118
6.6 Conclusion ...... 121

Chapter 7 Concluding Remarks ...... 123

Bibliography ...... 126

LIST OF FIGURES

Figure 2.1: Return-oriented Programming ...... 16

Figure 3.1: Symmetrical Fat Binary ...... 22
Figure 3.2: Operation of the State Transformer ...... 25
Figure 3.3: Target-to-Source Ratio ...... 34
Figure 3.4: Optimizations made during Binary Translation from ARM to MIPS ...... 35
Figure 3.5: Register and Immediate Caching during Binary Translation from MIPS to ARM ...... 35
Figure 3.6: Total-to-Source Ratio ...... 36
Figure 3.7: Comparison of Binary Translation with native performance ...... 37
Figure 3.8: Total Migration Overhead in microseconds ...... 38
Figure 3.9: Performance vs. migration frequency ...... 39

Figure 4.1: Instruction Mix ...... 44
Figure 4.2: Performance comparison under different peak power budgets for two different execution phases of bzip2 ...... 45
Figure 4.3: Multi-programmed Workload Performance comparison under different peak power and area budgets ...... 52
Figure 4.4: Energy-Delay-Product comparison for multi-programmed workloads under different peak power and area budgets ...... 53
Figure 4.5: Single Thread Performance and EDP evaluation using the dynamic multicore topology ...... 54
Figure 4.6: Single Thread Performance and EDP evaluation under different area budgets ...... 55
Figure 4.7: ISA-Microarchitecture co-design ...... 56
Figure 4.8: Core Selection ...... 58
Figure 4.9: Split Overhead ...... 59

Figure 5.1: Program State Relocation Architecture ...... 65
Figure 5.2: Heterogeneous-ISA Program State Relocation ...... 67
Figure 5.3: Attack Surface of a Victim Program ...... 75
Figure 5.4: Classic ROP Attack Surface ...... 80
Figure 5.5: Brute Force Attack Surface ...... 81
Figure 5.6: JIT-ROP Attack Surface on (a) PSR, (b) HIPStR ...... 82
Figure 5.7: Percentage of Migration-Safe Basic Blocks ...... 83
Figure 5.8: Entropy Comparison ...... 84
Figure 5.9: Effect of diversification on attack surface ...... 85
Figure 5.10: Performance Optimizations for PSR ...... 86
Figure 5.11: EDQ ...... 87
Figure 5.12: RAT Miss Penalty ...... 88
Figure 5.13: Migration Overhead ...... 89
Figure 5.14: Code Cache ...... 89
Figure 5.15: Integrated Performance ...... 90

Figure 6.1: Feature Sets ...... 98
Figure 6.2: Instruction Mix ...... 98
Figure 6.3: Encoding ...... 103
Figure 6.4: Decoder ...... 104
Figure 6.5: Multi-programmed workload throughput comparison (higher is better) ...... 109
Figure 6.6: Multi-programmed workload EDP comparison (lower is better) ...... 110
Figure 6.7: Single Thread Performance (higher is better) and EDP (lower is better) comparison under Peak Power Budget ...... 112
Figure 6.8: Single Thread Performance (higher is better) and EDP (lower is better) comparison under Area Budget ...... 112
Figure 6.9: Performance Degradation over Composite-ISA Designs optimized for multi-programmed workload throughput at 48mm2 budget, and under different Feature Constraints ...... 113
Figure 6.10: Transistor Investment by Processor Area normalized over that of Composite-ISA Designs optimized for multi-programmed workload throughput at 48mm2 budget, and under different Feature Constraints ...... 114
Figure 6.11: Processor Energy Breakdown normalized over that of Fully Custom Designs optimized for multi-programmed workload throughput at 48mm2 budget, and under different Feature Constraints ...... 115
Figure 6.12: Execution Time Breakdown on the best composite-ISA CMP optimized for Single Thread Performance under a Peak Power Budget of 10W ...... 117
Figure 6.13: Execution Time Breakdown on the best composite-ISA CMP optimized for Multi-programmed Throughput under an Area Budget of 48mm2 ...... 118
Figure 6.14: Feature Downgrade Cost ...... 119
Figure 6.15: Multi-threaded Workload Throughput with Downgrade Cost ...... 121

LIST OF TABLES

Table 3.1: Architecture detail for ARM and MIPS cores ...... 32
Table 3.2: Percentage of instructions used from code cache ...... 37

Table 4.1: Design space of a Heterogeneous-ISA architecture ...... 46
Table 4.2: Pruned design space for faster navigation ...... 47
Table 4.3: Memory Management on Thumb, Alpha and x86-64 ...... 48

Table 5.1: Attack Surface: Symbols and Definitions ...... 76
Table 5.2: Architecture detail for ARM and x86 cores ...... 78
Table 5.3: Inferences from Brute Force Simulation ...... 81
Table 5.4: Performance Optimizations for PSR ...... 86
Table 5.5: Entropy Levels for PSR ...... 87

Table 6.1: Feature Exploration Space ...... 106
Table 6.2: x86-ized versions of Thumb, Alpha, and x86-64 ...... 108

ACKNOWLEDGEMENTS

First and foremost, I owe immense thanks to my advisor Dean Tullsen, who has been an inspiring teacher, a prolific research mentor, and a father figure who has shaped my life in many profound ways. Dean’s tireless enthusiasm for pursuing new and ambitious research ideas and his incredible dedication to science have been a constant source of motivation and inspiration to me as a young researcher. He has always encouraged me to set challenging goals for myself, and has never settled for anything but the best in everything that I do. He has provided me with a plethora of opportunities that have had a significant role to play in my growth as a researcher and an academic. I could never thank him enough for his words of wisdom and reassurance during turbulent times, and for the invaluable life lessons of ethics, empathy, and sensitivity he has given me through an exemplary lifestyle, which I will strive to emulate for the rest of my life.

My many thanks are due to my PhD committee members Tim Sherwood, Sorin Lerner, Andrew Kahng, and Deian Stefan for providing me with valuable feedback from time to time. I’d like to thank Hovav Shacham for his many useful insights and suggestions during my thesis proposal. I’d also like to take this opportunity to thank Jason Mars and Hadi Esmaeilzadeh for believing in me and for their support and encouragement throughout my PhD career.

I’d like to thank many research colleagues who have made this dissertation possible. In particular, I’d like to thank Matt DeVuyst for his mentorship during my first year. I’d also like to acknowledge Arvind Krishnaswamy, Rajan Palanivel, Koichi Yamada, and other colleagues at Intel for the many brainstorming sessions during the initial days of formulating HIPStR. I’d like to thank Doug Burger and Aaron Smith for their mentorship while I was at Microsoft. I’d also like to thank Sriskanda Shamasunder for being a diligent hacker, and for so many fun-filled conversations in the lab amidst all the craziness. I’d like to thank Harsha Basavaraj for taking up the onerous task of the RTL design of the x86 decoder. Finally, I’d like to thank my first mentee, Kazem Taram, for all his enthusiasm, hard work, and dedication that have made my mentorship an enjoyable experience.

I’d like to take this opportunity to thank my lab mates – Hung-Wei Tseng, Manish Gupta, Sam Wasmundt, Andreas Prodromou, Brian Tsui, Jinghao Jia, Yishin Shih, and Zinsser Zhang. I’d also like to thank my friends from college, Freescale, Brocade, and graduate school, and my family members including my brother and sister-in-law, my parents-in-law, and my brother-in-law for always believing in me.

I owe so much to Suchetha for being an incredibly understanding and supportive wife. I thank her for an extraordinarily fun-filled life journey together that has preserved my sanity through the many trials and tribulations of graduate school and the academic job search. We have been through many life adventures together, and honestly none of this would matter if not for her. The successful completion of this dissertation is due, in large part, to her unconditional love and unwavering support.

Finally, I’d like to thank my parents for being strong pillars of support throughout my life. They have always encouraged me to explore and pursue opportunities, go above and beyond my potential, think big, and never give up. I’d also like to acknowledge that my mom is my first teacher – she has taught me to be hard working and meticulous in whatever I do. My dad has had a significant influence in my life as an academic, in honing my writing and speaking skills, and in inducing, promoting, and reinforcing my love for Computer Science.

Chapter 3, in part, is a reprint of the material as it appears in proceedings of ASPLOS 2012. DeVuyst, Matthew; Venkat, Ashish; Tullsen, Dean M., Execution Migration in a Heterogeneous-ISA Chip Multiprocessor, 17th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), March, 2012. The dissertation author was the primary investigator and author of this paper.

Chapter 4, in full, is a reprint of the material as it appears in proceedings of ISCA 2014. Venkat, Ashish; Tullsen, Dean M., Harnessing ISA Diversity: Design of a Heterogeneous-ISA Chip Multiprocessor, 41st ACM International Symposium on Computer Architecture (ISCA), June, 2014. The dissertation author was the primary investigator and author of this paper.

Chapter 5, in full, is a reprint of the material as it appears in proceedings of ASPLOS 2016. Venkat, Ashish; Shamasunder, Sriskanda; Shacham, Hovav; Tullsen, Dean M., HIPStR: Heterogeneous-ISA Program State Relocation, 21st ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), April, 2016. The dissertation author was the primary investigator and author of this paper.

Chapter 6, in full, is currently being prepared for submission for publication of the material. Venkat, Ashish; Basavaraj, Harsha; Tullsen, Dean M., Composite-ISA Cores: Enabling Multi-ISA Heterogeneity using a Single ISA. The dissertation author was the primary investigator and author of this material.

VITA

2008 Bachelor of Engineering, NIE Mysuru

2008-2009 Freescale Semiconductor, Bengaluru

2009-2010 Brocade Communications, Bengaluru

2011 Software Development Intern, Amazon.com, Inc., Seattle, WA

2012 Graduate Technical Intern, Intel Corporation, Santa Clara, CA

2014 Master of Science, Computer Science, University of California, San Diego

2014 C. Phil. in Computer Science, University of California, San Diego

2015 Research Intern, Microsoft Research, Redmond, WA

2016 Research Intern, IBM Research Labs, Haifa, Israel

2011-2017 Teaching Assistant, Department of Computer Science and Engineering, University of California, San Diego

2011-2018 Research Assistant, Department of Computer Science and Engineering, University of California, San Diego

2018 Doctor of Philosophy, University of California, San Diego

PUBLICATIONS

Ashish Venkat, Harsha Basavaraj, Dean M. Tullsen. “Composite-ISA Cores: Enabling Multi-ISA Heterogeneity using a Single ISA”, Under Review, 2018.

Andreas Prodromou, Ashish Venkat, and Dean M. Tullsen. “Deciphering Predictive Schedulers for Heterogeneous-ISA Architectures”, Under Review, 2018.

Mohammadkazem Taram, Ashish Venkat, and Dean M. Tullsen. “Context-Sensitive Fencing: Securing Speculative Execution via Customization”, In Preparation, 2018.

Mohammadkazem Taram, Ashish Venkat, and Dean M. Tullsen. “Mobilizing the Micro-Ops: Exploiting Context-Sensitive Decoding for Security and Energy Efficiency”, ISCA, 2018.

Manish Gupta, Vilas Sridharan, David Roberts, Andreas Prodromou, Ashish Venkat, Dean M. Tullsen, and Rajesh Gupta. “Reliability-Aware Data Placement for Heterogeneous Memory Architecture”, HPCA, 2018.

Ashish Venkat, Sriskanda Shamasunder, Hovav Shacham, and Dean M. Tullsen. “HIPStR: Heterogeneous-ISA Program State Relocation”, ASPLOS, 2016.

Ashish Venkat, Arvind Krishnaswamy, Koichi Yamada, and Rajan Palanivel. “Binary Translation-Driven Program State Relocation”, US Patent, 2015.

Ashish Venkat and Dean M. Tullsen. “Harnessing ISA Diversity: Design of a Heterogeneous-ISA Chip Multiprocessor”, ISCA, 2014.

Matthew DeVuyst, Ashish Venkat, and Dean M. Tullsen. “Execution Migration in a Heterogeneous-ISA Chip Multiprocessor”, ASPLOS, 2012.

ABSTRACT OF THE DISSERTATION

Breaking the ISA Barrier in Modern Computing

by

Ashish Venkat

Doctor of Philosophy in Computer Science

University of California San Diego, 2018

Professor Dean M. Tullsen, Chair

In recent years, the computing landscape has witnessed a shift towards hardware specialization in response to the rapid growth and expansion of software, changing market risks, and fundamental technological limitations. However, the largest barrier to full exploitation of heterogeneity has by far been the difficulty of programming for heterogeneous hardware. There is a pressing need for systems that allow the exploitation of highly heterogeneous platforms without creating additional programmer burden. The goal of this dissertation is to empower the hardware/software interface, specifically the Instruction Set Architecture (ISA) and the runtime system, with diverse capabilities to enable the seamless adoption of heterogeneous hardware, without breaking the traditional models of programming.

Existing heterogeneous designs either constrain all CPU cores to a single ISA, or employ multiple ISAs but assign distinct jobs to distinct cores or, at best, statically partition work, resulting in a tight coupling of an application to the underlying ISA. This dissertation challenges the assumption that the single-ISA constraint is necessary, and further enables programs to cross a heretofore forbidden boundary – the ISA. In particular, this dissertation describes a compiler and runtime strategy for swift and seamless process migration across diverse ISAs, and further showcases results from a massive core architecture optimization process that demonstrates the performance and energy efficiency benefits of multi-ISA heterogeneous architectures. In addition to these performance and energy efficiency benefits, this dissertation also explores and demonstrates the security potential of multi-ISA architectures to thwart several evasive variants of the Return-Oriented Programming (ROP) attack. This dissertation further alleviates the complexity concerns of multi-vendor ISA heterogeneity by studying the effect of introducing composite-ISA heterogeneity.

Chapter 1

Introduction

The modern computing landscape is characterized by the rapid evolution of software, changing market risks, rising security threats, and technological limitations. The microprocessor industry, fraught with these challenges, has witnessed consistently diminishing rates of improvement in the execution efficiency of the high-end general-purpose CPUs that drive a considerable chunk of the world’s computational demands. Consequently, the high cost of the one-size-fits-all computational model has now become increasingly apparent – general-purpose processors perform well on the average case, but they allow no individual application to run as efficiently as it would on specialized hardware. Thus, hardware specialization, or heterogeneity, has played and will continue to play a crucial role in modern processor and system design.

Modern processor architectures employ hardware specialization in two key dimensions. While some architectures employ specialized cores to accelerate the performance of certain domain-specific workloads [san08, fus08, teg10, Qua11, PCC+14, JYP+17], others take advantage of microarchitectural heterogeneity by combining large high-performance cores and small power-efficient cores on the same chip, to create efficient designs that cater to the diverse execution characteristics of general-purpose mixed workloads [teg11, Gre11, HM08, KFJ+03, KTR+04, KTJ06, VCJE+12, cut17]. Multiple commercial offerings exist today, in the general-purpose, embedded, and server markets, that exploit both dimensions.

However, the largest barrier to full exploitation of heterogeneity has been the difficulty of adapting that heterogeneity to traditional programming and execution models, causing several otherwise efficient hardware designs to be discarded as not viable. This dissertation introduces a new dimension of heterogeneity that exploits a fundamental abstraction of computing – the Instruction Set Architecture (ISA) – while preserving the traditional models of programming and execution.

Early work on single-ISA heterogeneous multicore processors [Gre11, HM08, KFJ+03, KTR+04, KTJ06] constrained CPU cores to a single ISA in order to maximize efficiency by allowing a thread to dynamically identify, and migrate to, the core to which it is most suited during a particular phase and under the current operating conditions. This dissertation challenges the assumption that the single-ISA boundary is necessary and further demonstrates that limiting an architecture to a single ISA sacrifices a critical dimension of heterogeneity. By pulling down the boundary wall, this dissertation unlocks several previously unexplored heterogeneous-ISA architectures that offer greater gains in terms of performance, energy efficiency, and security. These architectures synergistically combine microarchitectural heterogeneity with ISA heterogeneity to realize more efficient designs that can effectively exploit the inherent ISA affinity of an application. In addition, these architectures have the potential to boost the overall entropy and resilience of the system to provide a formidable defense against state-of-the-art code reuse attacks.

1.1 Breaking the ISA Barrier

This dissertation explores several novel and programmer-transparent hardware architecture design, compiler, and runtime techniques to enable the seamless adoption of heterogeneous-ISA architectures. First, it establishes the viability of multi-ISA heterogeneity by proposing a low-cost cross-ISA process migration infrastructure that is orders of magnitude faster than prior art.

Second, it showcases the performance and energy savings potential of multi-ISA heterogeneity via a massive core architecture optimization process. Third, it proposes a security defense that takes advantage of ISA diversification to thwart many evasive variants of the Return-Oriented Programming (ROP) attack [Sha07, RBSS12]. Fourth, it significantly alleviates the complexity concerns of multi-ISA heterogeneity by proposing hardware and software techniques that recreate, and in many cases supersede, the gains of multi-ISA heterogeneity using a single composite ISA derived from a large superset.

1.1.1 Cross-ISA Process Migration

Modern architectures allow us to instantly configure the frequency, voltage, cache size, and other microarchitectural parameters to increase efficiency. Yet our ISA choice is typically constrained by a decision made when our phone, laptop, or server was purchased, even though the choice of ISA can have a significant impact on execution efficiency. One of the primary goals of this dissertation is to allow us to now make that choice, not just individually for each program, but every few milliseconds within the execution of a single program.

However, process migration across heterogeneous ISAs is a non-trivial problem, because the runtime program state of an application is always kept in ISA-specific form, potentially requiring expensive state transformation at the time of migration. This dissertation proposes novel compiler and runtime mechanisms that allow for seamless and instantaneous cross-ISA process migration at less than 5% degradation in overall performance. The key components of the migration infrastructure are (1) a multi-ISA compilation framework that emits a symmetrical fat binary containing multiple ISA-specific text sections, a common stack frame layout, and a common ISA-agnostic data section, and (2) a migration runtime that performs dynamic binary translation until execution reaches a compiler-marked equivalence point, at which program state (registers and stack objects) can be safely transformed to a different ISA.

To assist the creation of a symmetrical fat binary, we take advantage of a powerful architecture-independent intermediate representation that can act as a bridge between the ISAs, and provide hints for transforming program state at the time of migration. This work shows that careful and consistent multi-ISA compilation can enable faster and more frequent migrations by significantly minimizing the amount of runtime program state to be transformed, while simultaneously enabling most, if not all, ISA-specific transformations. Furthermore, due to the relatively high frequency of compiler-marked equivalence points, we find that binary translation for the specific use case of cross-ISA migration calls for a different modus operandi: minimize translation time rather than optimize the translated code. Overall, this work crosses a critical threshold by allowing processes to migrate across ISAs potentially every timer interrupt, while advancing the prior state-of-the-art by orders of magnitude.
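To make the runtime's control flow concrete, the following is a minimal sketch of the migration path in C, under an entirely hypothetical interface (names such as at_equivalence_point and transform_state are illustrative and do not correspond to the actual implementation):

```c
#include <stdbool.h>

/* Opaque execution context (registers, PC, stack pointer). */
typedef struct regs regs_t;

/* Illustrative interfaces only -- not the dissertation's actual API. */
bool at_equivalence_point(const regs_t *r);      /* compiler-marked point?     */
void translate_and_run_block(regs_t *r);         /* dynamic binary translation */
void transform_state(regs_t *r, int target_isa); /* registers + stack objects  */
void resume_native(regs_t *r, int target_isa);   /* enter target text section  */

/* On a migration request, keep executing on the target core via binary
 * translation until the next compiler-marked equivalence point, then
 * transform the program state exactly once and resume native execution. */
void migrate(regs_t *r, int target_isa)
{
    while (!at_equivalence_point(r))
        translate_and_run_block(r);
    transform_state(r, target_isa);
    resume_native(r, target_isa);
}
```

Because equivalence points occur frequently, the translation loop runs only briefly, which is why the translator favors fast translation over optimized translated code.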

1.1.2 Design of a Heterogeneous-ISA Chip Multiprocessor

By decoupling the execution binary from historical ISA choices, the cross-ISA process migration strategy establishes the viability of multi-ISA heterogeneous architectures that show promising potential in terms of performance and energy efficiency. A critical step in the design of a multi-ISA heterogeneous architecture is choosing a diverse set of ISAs. While ISAs seem to converge over time (RISC ISAs adding complex operations, CISC ISAs translated to RISC micro-ops internally), we find that there remains sufficient diversity in existing modern ISAs to provide useful heterogeneity. This dissertation examines some key aspects that characterize ISA diversity, including code density, decode and instruction complexity, register pressure, native floating-point arithmetic vs emulation, and SIMD processing.

The design of a heterogeneous-ISA chip multiprocessor involves navigating a complex search space, made larger by the additional dimension of freedom. The design space we study in this work encompasses 72 software workloads, 600 single-core configurations, and 128 billion distinct heterogeneous-ISA multicore configurations that harness the diversity offered by three modern ISAs: (1) ARM's ultra-low-power Thumb, (2) the traditionally RISC Alpha, and (3) the high-performance CISC x86-64.

By co-designing the hardware architectures and the ISAs to provide the best aggregate architecture, we arrive at a more effective and efficient design than one composed of homogeneous cores, or even heterogeneous cores that share a single ISA. Specifically, we show that for a given peak power/area budget constraint, multi-ISA heterogeneous architectures can outperform single-ISA heterogeneous architectures by an average of 21% and save 23% in energy at no loss in performance.

The design space exploration reveals two key insights. First, different applications exhibit a natural affinity for one ISA or another, and that affinity can change as the application progresses into a different execution phase. For example, in a homogeneous-ISA setting, Thumb is not a serious candidate because it performs so poorly for certain codes; however, as part of a multi-ISA solution, it shines for certain code regions. As a result, ISA heterogeneity consistently offers superior performance and energy savings, even in scenarios where hardware heterogeneity alone provides diminishing returns. Second, the ISA has a significant influence on the microarchitectural design choices that enable efficient transistor investment in the available silicon real estate, calling for tighter ISA-microarchitecture co-design. In fact, by observing the results of the design space exploration, we provide the CPU architect with a set of tools to enable ISA-microarchitecture co-design and thereby better streamline the search process.

1.1.3 HIPStR: Security Defense via ISA Diversification

In addition to its potential for greater performance and energy efficiency, this dissertation demonstrates that ISA heterogeneity can be seamlessly leveraged to provide a strong security defense against buffer overflow exploits such as Return-Oriented Programming (ROP) [Sha07, RBSS12]. Buffer overflow vulnerabilities form a major class of security exposures that plague the Internet today. They rank third amongst the common vulnerability types reported by the National Vulnerability Database, finishing just behind cross-site scripting and cryptographic vulnerabilities [nvd].

These vulnerabilities have been systematically exploited by code reuse attacks such as ROP to perform arbitrary malicious computation without injecting malicious code. ROP attacks hinge on the attacker being able to chain together short code snippets in the program (called gadgets) that end with a return instruction, by overflowing the stack with a carefully constructed sequence of return addresses and other malicious data. ROP has been shown to be Turing-complete for multiple ISAs (both RISC and CISC) and for a wide range of applications.

This dissertation introduces HIPStR (Heterogeneous-ISA Program State Relocation), a security defense that has the potential to radically transform the attack landscape of state-of-the-art Return-Oriented Programming. The primary motivation for HIPStR is the fact that ROP attacks thrive on two fundamental properties. First, knowledge of the underlying ISA is critical to constructing a successful exploit. Owing to its unique ability to perform seamless and instantaneous cross-ISA process migration, HIPStR significantly inhibits several code reuse attacks, including the notorious JIT-based ROP attacks [SMD+13], by forcing the attacker to chain ROP gadgets across different ISAs. Second, any program, including a ROP program, requires some amount of program state (in the form of registers and stack objects) to perform computation. To this end, HIPStR employs dynamic binary translation to continuously randomize the register and stack state to an extent that brute force attacks [BBM+14] are rendered practically infeasible on current, or even distant-future, microprocessors. Overall, HIPStR offers a formidable defense against several variants of ROP attacks, and reduces their overall attack surface to such an extent that it is difficult to construct a four-gadget shellcode exploit, let alone achieve Turing-completeness.

1.1.4 Composite-ISA Architectures

Despite their potential for greater performance, energy efficiency, and security, the deployment of heterogeneous ISAs on a single chip is non-trivial due to a number of practical concerns. First, integration of multiple vendor-specific commercial ISAs on a single chip is fraught with significant licensing, legal, and verification costs and barriers. Second, process migration in a heterogeneous-ISA CMP necessitates the creation of fat binaries, and involves expensive binary translation and state transformation costs due to the differences in the encoding schemes and application binary interfaces (ABIs) of fully disjoint ISAs.

This dissertation significantly alleviates these concerns via yet another design space exploration to identify composite-ISA architectures that can recreate the effects of multi-ISA heterogeneity using a single composite ISA. A composite ISA is derived by leveraging a large superset ISA that resembles the Intel x86 and offers customization along five different axes of diversity: (1) register depth (8 to 32 programmable registers), (2) register width (32 vs 64-bit), (3) instruction complexity (1:1 vs 1:n micro-op encoding), (4) predication (full vs partial), and (5) specialized support (vector vs scalar). This provides the hardware designer and the compiler with far more control over the choice of ISA, with the ability to make fine-grained choices about the features of importance, maximizing overall execution efficiency.

Due to the constraint of a single baseline superset ISA, we find that the derived custom ISAs can never incorporate all traits of distinct vendor-specific ISAs (such as the code compression of Thumb). However, the greater flexibility and composability of these designs offer substantial new ISA-affinity advantages. Composite-ISA heterogeneous architectures match, and in many cases supersede, the efficiency gains of multi-ISA heterogeneous architectures, and further enhance existing gains due to hardware heterogeneity by an average of 19% in performance and 31% in energy savings. Furthermore, owing to the overlapping nature of the feature sets that make up the composite ISAs, the overall cost of migration drastically drops, to just 0.42%. By combining the ISA-affinity advantages of multi-ISA heterogeneity with the simplicity of single-ISA heterogeneity, this research brings together the best of both worlds.
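As an illustration only (the actual encoding and naming are defined in Chapter 6), each core, and each compiled code region, in such a design can be summarized by a small feature vector over these five axes; migration is cheapest when the code's feature needs are a subset of the target core's features:

```c
#include <stdint.h>

/* Hypothetical per-core feature descriptor; field names and encodings are
 * illustrative, not the dissertation's actual Chapter 6 encoding. */
struct isa_features {
    uint8_t reg_depth;    /* 8, 16, or 32 programmable registers */
    uint8_t reg_width;    /* 32 or 64 bits                       */
    uint8_t complex_ops;  /* 1 = 1:n micro-op encoding, 0 = 1:1  */
    uint8_t predication;  /* 1 = full, 0 = partial               */
    uint8_t simd;         /* 1 = vector support, 0 = scalar only */
};

/* Overlapping feature sets are what make migration cheap: if a's features
 * are a subset of b's, code compiled for a runs on b without a feature
 * downgrade. */
static inline int is_feature_subset(struct isa_features a, struct isa_features b)
{
    return a.reg_depth   <= b.reg_depth   && a.reg_width   <= b.reg_width &&
           a.complex_ops <= b.complex_ops && a.predication <= b.predication &&
           a.simd        <= b.simd;
}
```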

1.2 Overview of Dissertation

Chapter 2 gives background information on heterogeneous architectures. It discusses single-ISA heterogeneous architectures in embedded, general-purpose, and server environments. It also briefly discusses some early work that evaluates the benefits, trade-offs, and complexities of multi-ISA heterogeneity, as well as migration techniques for discrete heterogeneous-ISA machines. It further provides a primer on return-oriented programming and existing architectural and runtime support to mitigate code reuse attacks.

Chapter 3 lays out the proposed cross-ISA process migration strategy. In particular, it details our multi-ISA compilation methodology that leverages a powerful architecture-independent intermediate representation provided by LLVM [LA04a] to create a symmetrical fat binary that sports multiple ISA-specific text sections but a single ISA-agnostic data section. It also discusses our runtime strategy that includes ISA-specific state transformation and binary translation for instantaneous cross-ISA process migration.

Chapter 4 outlines the design space navigation process geared toward identifying an optimal heterogeneous-ISA multicore processor in terms of performance and/or energy efficiency under specific peak power and area budget constraints. It also discusses several inferences from the design space exploration that could potentially equip CPU architects with a set of tools for tighter ISA-microarchitecture co-design.

Chapter 5 proposes HIPStR (Heterogeneous-ISA Program State Relocation), a security defense that leverages ISA diversification and program state relocation to defend against several variants of the Return-Oriented Programming attack. It describes many binary code transformations for program state relocation that boost the entropy of the system, and details a security-aware migration policy that maximizes security with limited impact on performance.

Chapter 6 describes the design of a composite-ISA architecture that has the potential to recreate, and in many cases supersede, the gains of multi-ISA heterogeneity by implementing composite feature sets derived from a single large superset ISA, exploiting greater flexibility in ISA choice.

It presents our compiler and runtime strategy that extends the x86 backend to support and exploit the underlying composite ISAs, and enables seamless migration between them. It also outlines our changes to the Intel x86 decoder, both to support the decoding of the superset ISA and to allow the decoder to be customized and reduced for the subset ISAs, and further studies the effect of these customizations on decoder area and peak power.

Chapter 7 summarizes the contributions of this dissertation.

Chapter 2

Background

This chapter provides background information related to this thesis. Section 2.1 gives a brief overview of modern heterogeneous architectures and their flavors. Section 2.2 discusses contemporary work on the benefits, trade-offs, and system design implications of ISA heterogeneity. Section 2.3 provides the background on return-oriented programming (ROP) attacks, and Section 2.4 discusses existing mitigations against ROP.

2.1 Heterogeneous Architectures

Prior research has shown that heterogeneous chip multiprocessors (CMPs) are capable of higher performance and energy efficiency as compared to homogeneous processors. Kumar, et al. [KFJ+03, KTR+04, KTJ06] introduced single-ISA heterogeneous multicore architectures. These architectures employ cores of different sizes, organizations, and capabilities, allowing an application to dynamically identify and migrate to the most efficient core, thereby maximizing both performance and energy efficiency. In addition, these architectures also employ cores from different process generations that may each operate at a different voltage/frequency domain and/or a different power state. Since its inception, several microarchitectural [LPD+12, PLDM15, LPD+16, LPD+14, SKKK16] and scheduling techniques [VCJE+12, VCE13, VCAH+13, ASC+16, NEE17, MNU+15, MNM+15, NAM+17, SWTB11, AEJE17, DK13] have been proposed in the literature to better harness the gains due to single-ISA heterogeneity.

Multiple commercial offerings in the embedded, GPU, and consumer markets have exploited this technology. ARM's big.LITTLE processor [Gre11] couples a high-performance out-of-order 3-way superscalar Cortex-A15 core with a low-power dual-issue in-order Cortex-A7 core. Many Qualcomm Snapdragon [Qua11] and Samsung Exynos [KKCL13] chipsets employ a variation of the big.LITTLE processor, combining a high-performance Cortex-A57/Cortex-A73 with a low-power Cortex-A53 on the same chip to maximize energy efficiency. NVidia's Tegra-3 processor [teg11] employs a variable symmetric multiprocessing companion CPU core built using a low-power silicon process that operates at a lower frequency than the rest of the four cores on chip. Apple's A11 SoC [cut17] features a six-core CPU with two high-performance Monsoon cores and four energy-efficient Mistral cores.

Yet another class of heterogeneous chip multiprocessors makes use of specialized hardware to accelerate the performance of certain types of workloads. These include integrated CPU-GPU architectures such as Intel's Sandy Bridge [san08] and AMD's Fusion [fus08]. However, these architectures do not allow migration between core types at arbitrary places in the code. Current industry offerings of heterogeneous-ISA CMPs include MPSoCs in the embedded market [Tex], and GPUs and accelerators in the HPC market [teg10]. Though IBM's Cell microprocessor [KDH+05] is a heterogeneous-ISA CMP geared towards general-purpose computing, it suffers from two major concerns that make it unsuitable for general-purpose mixed workloads. First, the Synergistic Processing Elements (SPEs) use a special-purpose ISA that is suitable for only those workloads that exhibit SIMD parallelism. Second, the lack of a common address space makes dynamic task migration infeasible.

Interestingly, modern datacenters already employ servers from different generations, and even different vendors, as a result of routine upgrades and competitive vendor markets [MT13, Mor15, Mat16].

This shifting trend away from traditionally homogeneous hardware designs is further evidenced by the latest OpenPower venture of Google and Rackspace [Nic16, Mor15] that capitalizes on finer-grained and cost-effective, albeit heterogeneous, design options. While the proclivity to keep the task management runtime design relatively simple has traditionally favored the deployment of architecturally homogeneous commodity servers in datacenters [Höl10, DB13], the literature provides overarching evidence that intelligent QoS-aware management systems [WA12, MT13, DK14, PLD+15, LCG+14, LCG+15, HZL+15] that exploit microarchitectural heterogeneity can significantly improve energy efficiency while meeting the strict QoS requirements of latency-critical datacenter workloads. In fact, these strategies not only take advantage of the “unintentional” heterogeneity due to server upgrades, but call for a microarchitecturally heterogeneous datacenter by design, due to its ability to cater to the diverse execution characteristics of constantly evolving datacenter workloads in a cost-effective manner.

Several researchers have proposed design space exploration methodologies for heterogeneous architectures. Strozek, et al. [SB09] describe a process flow for automatic synthesis and evaluation of heterogeneous CMPs based on runtime profiles of certain embedded applications, given different area and power budgets. Intel's QuickIA [CSH+12] research prototype allows researchers to explore heterogeneous architectures consisting of multiple generations of Intel processors and FPGAs. Open source tools like FabScalar [CWS+11, CWS+12], OpenPiton [BMF+16, MFN+17], and Aladdin [SRWB14, SRWB15] further allow researchers to explore and analyze heterogeneous architectures of varying complexity. The search methodology we employ in this work is similar to the one described by Kumar, et al. [KTJ06]. However, our goal is not only to identify the optimal heterogeneous-ISA multicore designs, but also to lay out the first principles for ISA-microarchitecture co-design in such an architecture.

2.2 The Path to Multi-ISA Heterogeneity

A major contribution of this thesis is a detailed processor architecture design and compiler methodology for heterogeneous-ISA architectures [DVT12, VT14, VSST16, BSR+16]. These architectures allow cores that are already microarchitecturally heterogeneous to further implement diverse instruction sets. By exploiting ISA affinity, where different code regions within an application inherently prefer a particular ISA, they realize substantial performance and efficiency gains over hardware heterogeneity alone.

In contrast, Blem, et al. [BMS13b] claim that modern ISAs such as ARM and x86 have a rather similar impact in terms of performance and energy efficiency. However, that work compares two similarly register-pressure-constrained ISAs (ARM-32 and x86-32), keeps target-independent optimizations on while turning off machine-specific tuning, ignores feature set differences (e.g., Thumb), and makes homogeneous hardware assumptions, unlike the work on heterogeneous-ISA architectures [VT14, BSR+16, BLJ+17, NR16]. Akram and Sawalha [AS17, Akr17] perform extensive validation of the conflicting claims and conclude that the ISA does indeed have a significant impact on performance. More contemporary studies in the literature show that ISA affinity is beneficial not just in general-purpose environments, but could potentially enable significant energy savings in datacenter environments [BLJ+17, NR16, ope16]. Lustig, et al. [LTPM15] describe mechanisms for cross-ISA memory consistency model translation. Wang, et al. [WYZ+17] enable offloading of binary code regions in a heterogeneous-ISA client/server environment.

Furthermore, there is a considerable amount of work that studies and addresses the system implications of ISA heterogeneity, such as differences in page table structure and organization, system call ABI, and POSIX compatibility, via replicated OS kernel support for heterogeneous-ISA and overlapping-ISA architectures [BSA+15, BLJ+17, LBK+10]. The Tui system [SH98] describes a process migration strategy for heterogeneous-ISA machines in the context of wide-area computing.

The main idea of that work is to transform the runtime program state to an intermediate form and then re-compile it to the required ISA at the time of migration. Ferrari, et al. [FCG00] describe process introspection, a process state-capture and recovery mechanism initiated by a running process at periodic poll points, at which each subroutine in the call stack recursively captures and transforms its own state to suit the ISA the process is being migrated to. We borrow some techniques from both these works; however, our compiler methodology and runtime strategy are geared towards a more diverse set of ISAs in a chip multiprocessor environment, which makes the problem significantly harder and requires the additional techniques and optimizations presented in this thesis.

More recently, multiple studies have advocated for ISA-affinity-driven live process/container migration in a heterogeneous-ISA datacenter environment [BLJ+17, NR16], which copies a minimal set of memory pages across the heterogeneous servers during migration and then proactively pushes the rest once execution is resumed. Such a mechanism not only considerably reduces system downtime, but also accelerates the convergence to steady state by minimizing the number of remote page faults.

Finally, binary translators have long been used to port/emulate legacy binaries on heterogeneous ISAs [BSGG13]. Chen, et al. [CYH+08] describe techniques to translate ARM binaries for execution on a MIPS-like architecture. While the binary translation technique described in this thesis deals with similar challenges of ISA diversity, that work employs static translation while our work proposes dynamic translation, requiring different approaches in many cases. Managed runtimes and browsers employ dynamic binary translation to perform profile-guided optimization [HS04] of hot code regions, program shepherding, and JIT hardening [VKYP15]. Some examples of dynamic translators used for emulation include Digital's FX!32 [RH97] (which translates x86 applications to Alpha) and HP's Aries [ZT00] (which translates PA-RISC applications to IA-64). These translators use a two-phase translation, where the first phase does emulation and collects runtime profile information, and the second performs optimization.

We cannot afford a two-phase translation, as binary translation in our case typically runs for far fewer instructions, rather than over the entire program; the extra time for profiling and a two-phase translation process cannot be amortized. QEMU [Bel05] is the closest dynamic translator to the one described in our work; however, it is optimized for system emulation, whereas our binary translator is optimized for the migration use case.

2.3 Return-Oriented Programming

Chapter 5 of this thesis demonstrates the security potential of heterogeneous-ISA architectures to defend against Return-Oriented Programming attacks. This section gives the necessary background information on Return-Oriented Programming (ROP).

Buffer overflow vulnerabilities have been systematically exploited by code injection attacks for many decades. These attacks, in their simplest form, inject malicious code into an application and hijack its control flow, resulting in rogue behavior. To prevent the injection and subsequent execution of malicious code, most modern processors and operating systems now employ Executable Space Protection [PT03b, VdV04] (dubbed Data Execution Prevention by Windows), which ensures that a memory page is either writable or executable, but not both. With the advent of Executable Space Protection, classical code injection has been slowly replaced by a more evasive form of attack, called Code Reuse. Typically, these attacks reuse existing code in the memory image of a process to perform malicious computation. An early example of such attacks is the return-into-libc attack [SD97], which exploits a buffer overflow on the stack to return into a C library function. Despite being able to subvert the control flow of an application without injecting malicious code, return-into-libc is inherently limited to the C library, and thus incapable of performing arbitrary malicious computation. In recent years, return-into-libc attacks have evolved into a more general and flexible scheme of attacks called Return-oriented Programming [RBSS12, Sha07].

Figure 2.1: Return-oriented Programming. (Original figure: a buffer-overflowed stack whose sequence of gadget addresses and data words, e.g., pop %eax; ret, 0xb, pop %ecx; ret, xor %edx,%edx; ret, mov %ebx,4(%esp); ret, and int $0x80, drives the dynamic execution stream to spawn a command shell, "sh".)

Return-oriented Programming (ROP) typically involves chaining together short code snippets in the program (called gadgets) that end with a return or an indirect jump instruction, by overflowing the stack with a carefully constructed sequence of return addresses and other data required for malicious computation. Figure 2.1 shows a ROP attack that spawns a command shell. The attack begins with the attacker injecting an exploit payload onto the stack by exploiting a buffer overflow. The payload is crafted to overwrite the return address with the address of a short code snippet within the program, called a gadget, that ends in a return instruction. Once the gadget has executed and the instruction pointer has reached the return instruction, the stack pointer points to the address of the next gadget, and the exploit continues. ROP hinges on the attacker being able to control the stack pointer and use it as the instruction pointer. Several evasive variants of ROP have been described in the literature that use indirect jumps in place of returns (Jump-oriented programming (JOP) [BJFL11, CDD+10, JTL14]), hijack control flow using chains of existing C++ virtual functions [STL+15, CCD+15], corrupt data variables to perform arbitrary malicious computation while staying on legitimate control-flow paths [CW14, CBP+15, HSA+16], and provide Turing-completeness on different instruction set architectures [BRSS08, Kor10].
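To make the payload structure concrete, the words written past the vulnerable buffer are nothing more than an array of gadget addresses interleaved with data. The following sketch mirrors Figure 2.1; the gadget addresses are hypothetical placeholders, since in a real exploit they are harvested from the victim's code pages:

```c
#include <stdint.h>

/* Hypothetical gadget addresses -- a real exploit harvests these from the
 * victim binary; the gadgets themselves mirror Figure 2.1. */
#define GADGET_POP_EAX  0x080bb1c6u  /* pop %eax; ret        */
#define GADGET_POP_ECX  0x080e9101u  /* pop %ecx; ret        */
#define GADGET_XOR_EDX  0x080a5fc0u  /* xor %edx,%edx; ret   */
#define GADGET_MOV_ARG  0x080c9a4du  /* mov %ebx,4(%esp); ret */
#define GADGET_SYSCALL  0x0806f2f5u  /* int $0x80            */

/* Words written past the end of the vulnerable buffer. The first word
 * overwrites the saved return address; every subsequent "ret" pops the
 * next word, so the stack pointer effectively becomes the instruction
 * pointer. */
static const uint32_t payload[] = {
    GADGET_POP_EAX, 0x0b,  /* %eax = 11, the execve system call number */
    GADGET_POP_ECX, 0x0,   /* %ecx = NULL argv                         */
    GADGET_XOR_EDX,        /* %edx = NULL envp                         */
    GADGET_MOV_ARG,        /* stage the pointer to the "sh" string     */
    GADGET_SYSCALL,        /* trap into the kernel: spawn the shell    */
};
```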

Moreover, in JIT environments such as browsers and Adobe Flash, JIT-spraying techniques exploit the just-in-time compilation functionality to generate predictable chunks of exploit code in the text section, using carefully crafted JavaScripts or ActionScripts called GaJITS [RI11]. The advent of automated exploit hardening has further made ROP a formidable attack technique to defend against [SAB11].

2.4 ROP Mitigations

ROP thrives on three fundamental assumptions. First, the attacker should be able to subvert the control flow of the victim to a specific gadget by exploiting an existing buffer overflow vulnerability. This requires the victim system to be devoid of any hardware or software control-flow integrity enforcement. Second, the attacker should have prior knowledge of gadget locations in the process memory image, in order to overflow the stack with an appropriate sequence of return addresses. Third, the victim program must contain ample gadgets, enough to form, for example, a Turing-complete set. Not surprisingly, mitigation techniques often exploit these assumptions to defend against return-oriented programming.

Several control flow integrity (CFI) techniques have been proposed in the literature. Abadi, et al. [ABEL05] first formalized the idea of CFI. The main idea of that work is to constrain the execution of the program to a predefined control flow graph (CFG) by instrumenting the program to perform ID-checks before every indirect jump. Any jump to an invalid destination instruction (a jump target not defined in the may-point-to set) is flagged as a violation of control flow integrity. They also observe that it is difficult to implement ideal CFI statically without a runtime mechanism to track function calls and indirect jumps.
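The ID-check itself is a small guard inserted before each indirect transfer. A minimal sketch of the idea in C follows; the label value and helper are illustrative, since real CFI instrumentation embeds the ID directly in the instruction stream at each valid target:

```c
#include <stdlib.h>

#define CFI_LABEL 0x12345678u  /* illustrative ID shared by valid targets */

typedef void (*fn_t)(void);

/* Before an indirect call, verify that the destination carries the ID
 * assigned to its equivalence class in the control flow graph; any other
 * destination is flagged as a violation. */
static inline void cfi_checked_call(fn_t target, const unsigned *id_at_target)
{
    if (*id_at_target != CFI_LABEL)
        abort();  /* control-flow integrity violation */
    target();
}
```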

There has been significant follow-up work on CFI at the hardware, runtime, and compiler levels [CPM+98, DSW11, VPMPADK13, CBD+99, OVB+06, SLZD04, Ven01]. More recent work such as CCFIR [ZWC+13], bin-CFI [ZS13], branch regulation [KOAGP12], code pointer integrity [KSP+14], and practical context-sensitive CFI [vdVAG+15] has made significant strides in reducing the attack surface by employing more fine-grained CFI, in the absence of any source or debug information, and at an acceptable degradation in performance. However, several backdoor attacks [EFG+15, CBP+15, GABP14, GAP+14, CW14, LDDLARS14, DLSM14, STL+15] have been described in the literature to bypass these techniques, thereby exposing the need for a stricter enforcement of CFI.

Numerous hardware and software techniques have been proposed to prevent stack smashing, i.e., overwriting the return address on the stack with a spurious jump target to subvert control flow. StackGuard [CPM+98] is a compiler transformation that places a canary right after the return address on the stack. The validity of the canary is checked before returning control to the caller function, thereby detecting corruption of the return address before the exploit takes control. The major problem with StackGuard is that an adversary who can guess the canary value can overwrite the return address while preserving the integrity of the canary itself. Cowan, et al. [CBD+99] address this drawback using randomly generated and null-terminated canaries. While the attack remains probabilistic, the defense comes at the expense of about 6% more CPU time. Interestingly, GCC implements stack smashing protection as an optional transformation called ProPolice [Eto03]. Most software packages in standard distributions, including Fedora and FreeBSD [FT09, Sun13, UT06], have been compiled with ProPolice. StackShield [Ven01] is yet another compiler solution that uses two separate stacks – a data stack and a control stack. The control stack is exclusively used for maintaining return addresses, and the data stack is used to maintain fixed stack variables, register spills, and temporaries. While this technique reports little to no performance overhead, the memory used for the control stack remains vulnerable.
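The canary-based defenses above reduce to the following prologue/epilogue pattern, shown here as a hand-written C approximation: a real compiler places the canary word between the local buffers and the saved return address, and loads it from a per-process random value such as glibc's __stack_chk_guard (assumed here):

```c
#include <stdlib.h>
#include <string.h>

/* Per-process random canary, initialized at load time (assumed here). */
extern unsigned long __stack_chk_guard;

void vulnerable(const char *input)
{
    unsigned long canary = __stack_chk_guard;  /* prologue: plant canary */
    char buf[64];

    strcpy(buf, input);  /* an overflow that reaches the return address
                            must first clobber the canary */

    if (canary != __stack_chk_guard)  /* epilogue: verify before return */
        abort();                      /* stack smashing detected */
}
```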

Inspired by StackShield, multiple hardware mechanisms [OVB+06, PZL06, XKPI02] have been proposed to exploit the hardware Return Address Stack (RAS) in order to secure the return address on the program stack. These solutions come with significant hardware complexity. First, they should detect a return address compromise with 100% accuracy in spite of speculative execution. Second, they should handle RAS overflow and underflow scenarios with little performance impact. Finally, they should handle exceptional software scenarios such as setjmp and longjmp. ROPDefender [DSW11] attempts to address the above intricacies by emulating the secure return address stack using binary instrumentation. In a similar spirit of detecting anomalies at the RAS level, Pappas, et al. [VPMPADK13] describe a detection technique that leverages the Last Branch Recording (LBR) feature of Intel processors to monitor and detect abnormal control transfer patterns.

Yet another class of ROP defenses randomizes the location of gadgets in the process image, making the attack only probabilistic. Several gadget location randomization techniques have been proposed in the literature, at various granularities: module (ASLR [PT03a]), basic block [WMHL12], instruction [HNTC+12], and byte [SKIH12] levels. Furthermore, several binary rewriting and gadget obfuscation mechanisms [KKP03, PPK12, BS08, CAC+08, OBL+10, PLPI13] have been proposed to restrict the number of useful gadgets in a program. These techniques can be applied orthogonally to gadget location randomization, in order to boost the entropy of a system. The performance overhead incurred by these techniques is typically proportional to the level of code obfuscation. The effectiveness of these solutions depends on the amount of entropy (number of randomizable states) they provide and the extent to which they can resist entropy-exhausting attacks. In the presence of a memory disclosure vulnerability, these randomization techniques can be bypassed by simple brute-force attacks [SPP+04, BBM+14] in just a matter of a few thousand attempts. The load-time nature of state-of-the-art randomization techniques makes them highly susceptible to just-in-time code reuse (JIT-ROP) attacks that exploit a single leaked memory disclosure to read code pages in memory, disassemble them, and reconstruct the control flow graph on-the-fly. Snow, et al. [SMD+13] show that JIT-ROP can bypass a combination of fine-grained randomization techniques in a matter of 23 seconds. Several periodic randomization and software diversification techniques [MBSN14, DLS+15, LHBF14, CHB+15, BHR+15, BDL+16, LLNB16, HHD16] have claimed immunity to these types of attacks at varying levels of performance.

Several hardening techniques have been employed to counter JIT-spraying attacks in browsers and other JIT environments. These systems invariably have to bypass Executable Space Protection and use RWX pages in order to generate and execute code just-in-time. The cost of dynamically changing permissions (from WX to RX) for such pages is often extremely high, thereby leaving such systems vulnerable to code injection attacks. In fact, Internet Explorer is the only JIT-based system to implement secure page permissions. Some JIT environments such as the Chrome V8 engine employ a low-cost solution called guard pages to prevent code injection. The guard pages separate heap pages from JIT pages in the application's address space. Any attempt to overwrite a guard page will be flagged as a security breach.

Both IE11 and Chrome V8 employ several fine-grained randomization techniques to secure JIT pages. Page Randomization provides 16 bits of entropy by randomizing the location of JIT pages. Constant Blinding eliminates gaJITs by randomizing the values of constant literals used in JavaScripts and ActionScripts. Random NOP Insertion, a technique inspired by G-Free [OBL+10], eliminates more gaJITs by randomly scattering NOPs across the code generated by the JIT compiler. Random NOP insertion is also employed by Adobe Flash. Finally, in order to suppress heap spraying attacks [Wev04], both IE11 and Chrome V8 enforce a cap on the number of heap pages that can be allocated. Interestingly, the Java Virtual Machine and Firefox's JägerMonkey do not employ any of the above-mentioned randomization techniques, in order to avoid performance penalties.

Chapter 3

Cross-ISA Process Migration

Prior work on single-ISA heterogeneous multicore processors has demonstrated the criticality of process migration in reaping the full benefits of the underlying heterogeneity. First, process migration allows an application to adapt to phase changes by dynamically identifying and migrating execution to the core it prefers, maximizing performance and energy efficiency. Second, it allows processes to be moved to high-performance or low-power cores as the operating condition and/or power state of the processor changes. Third, it allows migrating processes to cooler parts of the chip in the event of a thermal emergency. Fourth, process migration has traditionally enabled load balancing in environments ranging from general-purpose multicore processors to grid computing. Cross-ISA process migration, in particular, is a well-known, difficult problem [FCG00, SH98, VBSS94]: all runtime state of a program (data and code) is kept in an ISA-specific form, and migration to a different ISA could potentially involve expensive program state transformation. This chapter seeks to address these challenges and consequently establish the viability of a heterogeneous-ISA architecture. In particular, this chapter describes a multi-ISA compilation strategy and a low-overhead, programmer-transparent runtime mechanism that allow applications to seamlessly and instantaneously migrate across heterogeneous-ISA cores.

Figure 3.1: Symmetrical Fat Binary. (The figure shows the process memory layout: kernel space; a common stack layout of function arguments, return address, callee saves, and locals and temporaries; heap with padding; ISA-agnostic data sections (.data, .bss, .tbss, .sdata) holding aggregate objects; and separate x86 and ARM code sections. Regions are labeled ISA-agnostic, ISA-context-sensitive, reserved, or ISA-specific.)

3.1 Symmetrical Fat Binary

The central piece of our cross-ISA process migration strategy is a symmetrical fat binary that contains multiple ISA-specific code sections, a common stack frame organization, and a common set of ISA-agnostic data and heap sections. Figure 3.1 illustrates a symmetrical fat binary that is capable of running on a heterogeneous-ISA CMP that implements both ARM and x86 cores. The symmetrical fat binary is created by a multi-ISA compiler that enforces the following set of consistency rules. Global Data Consistency. In order to keep the data section ISA-agnostic, the multi-ISA compiler enforces common endianness, basic data type size, and alignment rules. This ensures all global data objects are consistently referenced at the same virtual address by both ISAs, and thereby avoids expensive pointer transformations at the time of migration. Code Section Consistency. Although the fat binary contains multiple code sections, with minor changes to the page table, both code sections can be mapped such that they begin at the same address. This is possible because a core only loads its own code into its private instruction cache. Furthermore, to ensure function pointer consistency, the linker aligns functions in such a

way that they are seen at the same address in both ISAs. Heap Consistency. Libraries that are responsible for dynamic memory allocation must ensure that a consistent view of the heap memory is maintained across both ISAs. Stack Consistency. To avoid handling pointer inconsistencies on the stack, a common stack frame organization is enforced, as shown in Figure 3.1. In particular, the direction of stack growth, size, alignment, and organization of each stack frame must remain consistent across both ISAs. In doing so, we do not add any additional instructions, since we at most change the relative position of a stack object from the stack/frame pointer. However, each ISA is free to use the calling conventions and register allocation strategies that it finds most beneficial.
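To make the global data consistency rule concrete, the following C snippet, our illustration rather than output of the dissertation's toolchain, shows the kind of layout invariant the multi-ISA compiler must establish; once common endianness, type size, and alignment rules are enforced, compile-time checks like these hold under every target backend.

    #include <stdint.h>
    #include <stddef.h>

    /* A global aggregate that must have one layout on every ISA, so that
     * pointers into it remain valid across a migration. */
    struct packet {
        uint64_t id;      /* 8 bytes on every target */
        int32_t  status;  /* 4 bytes on every target */
        char     tag[4];
    };

    /* With common type size and alignment rules, these checks pass under
     * every backend, so &pkt.status denotes the same virtual address in
     * the ARM code section and in the x86 code section. */
    _Static_assert(sizeof(struct packet) == 16, "common size rule");
    _Static_assert(offsetof(struct packet, status) == 8, "common offset rule");

    struct packet pkt;  /* emitted once, into the ISA-agnostic .bss */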

3.2 Multi-ISA Compilation

Our compilation strategy is to start with a common intermediate representation, and then perform consistent backend compilation for multiple targets, to generate target-specific code for each ISA along with a common set of target-independent data sections. A by-product of the multi-ISA compilation is a set of transformation rules that are to be applied at the time of process migration, to convert the program state from one ISA-form to another. In the next few paragraphs, we describe our compilation strategy in greater detail. Common Intermediate Representation. To enforce the above consistency rules, we rely on a well-defined architecture-independent intermediate representation that acts as a bridge between the different ISAs and provides hints to the runtime at the time of migration. In this work, we leverage the LLVM compiler framework [LA04b] and the Clang front-end [Lat08] to generate a common intermediate representation (LLVM bitcode), and perform target-specific backend compilation thereafter. To keep the front-end compilation ISA-agnostic, we make use of the target-triple functionality of Clang to specify the data types of a generic target, for all ISAs.

Target-Independent Type Legalization. To minimize the amount of program state to be transformed, we enforce common rules for promotion, truncation, expansion and type conversion. We allow certain exceptions during type legalization where common rules would interfere with ISA diversity, e.g., vector widening/scalarizing on x86-64, and long mode/floating point emulation on Thumb. For the most part, target-independent type legalization ensures a consistent view of global data and bitcode-level variables across all ISAs, during every stage of compilation. This is critical for generating a single version of target-independent global data sections. Intermediate Name Propagation. Once the intermediate representation has been generated, we provide each bitcode-level variable with a unique name. During the subsequent code generation and optimization passes, we ensure that each target-level machine operand (both registers and fixed stack slots) is associated with its corresponding intermediate name, if any. This gives us the ability to distinguish between bitcode-level and target-level variables, which plays a key role at the time of program state transformation. Hints for State Transformation. At the time of task migration, the runtime transforms the stack in such a way that the program appears to be executing on the migrated-to ISA from the time it was instantiated. To facilitate such a stack transformation, the multi-ISA compiler generates metadata that can be easily incorporated into a symbol-table-like data structure within the executable. The compiler generates one such table for each ISA. The table holds a record for each basic block or function call site in the executable at which program state can be safely transformed and native execution can be resumed on the migrated-to ISA. Each record of the table contains a mapping from a live register or a stack object to its corresponding source/intermediate-level variable name.
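As a sketch of what one such record might contain, consider the following C declarations; the field names and encoding are hypothetical, chosen only to mirror the description above.

    #include <stdint.h>

    /* Where a value lives in the machine state of one ISA. */
    enum loc_kind { LOC_REGISTER, LOC_STACK_SLOT };

    /* One mapping entry: a live machine location and the unique
     * intermediate (bitcode-level) name it carries at this point. */
    struct state_map_entry {
        enum loc_kind kind;
        int32_t  reg_or_offset;  /* register number, or offset from frame base */
        uint32_t ir_name;        /* unique ID assigned during name propagation */
    };

    /* One record per migration-safe point: the runtime looks up the record
     * for the current PC on the old ISA, and the matching record for the
     * same call site on the new ISA. */
    struct transform_record {
        uint64_t call_site_addr;           /* address of the equivalence point */
        uint32_t num_entries;
        struct state_map_entry entries[];  /* live registers and stack objects */
    };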


Figure 3.2: Operation of the State Transformer. (Flowchart: Pass 1 walks up the stack, identifying live registers and stack variables to transform; each value is obtained from constants or global data, copied from a live register or stack object whose IR name matches, or computed from already-transformed objects, until the outermost frame of the startup routine is reached. Pass 2 then walks back down the stack, fixing each frame's callee saves using the live register sets gathered in Pass 1.)

3.3 State Transformation

To reap the full benefits of ISA heterogeneity, it is critical that we don't turn off any target-specific compiler optimization. However, due to a number of architecture-specific transformation passes such as code motion, not all points of execution in a symmetrical fat binary are migration-safe. Therefore, the migration runtime either stalls migration or performs dynamic binary translation until execution reaches an equivalence point at which the program state can be safely transformed. The goal of the state transformer is to transform all inconsistent state on the stack (e.g., spilled registers, temporaries, etc.) and create the final architectural register state on the migrated-to core. Figure 3.2 illustrates the working of the state transformer. In the first pass, the state transformer walks up the stack, transforming return addresses, function arguments, and other stack temporaries, simultaneously creating the live register state for each function invocation since the libc startup routine. The transformability of a live register or a stack object is decided based on one of the following scenarios.

• Its value is known at compile time, load time, or link time - e.g., constant literals.

• Its value can be found at a specific location - e.g., globals, immutable objects (aggregates, alloca variables, and variables whose addresses have been taken).

• Cross-referencing compiler-generated metadata could reveal that its intermediate name refers to a live register or stack object on the ISA from which we migrated.

• Its value can be computed using the already transformed live registers and stack slots. This involves a reverse traversal of the def-use chain to find a sequence of instructions that can re-compute the required value.

Owing to the bottom-up nature of the first pass, the state transformer transforms all inconsistent state except callee-saved registers since that information comes from live registers at ancestral function invocations. In the second pass, the state transformer walks down the stack fixing the callee-saved register spill area, by making use of the live register information obtained during the first pass. In addition, it also passes on unchanged live register values across function call sites, ultimately culminating in the construction of the target CPU register state.
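The two passes can be outlined in code as follows; this is a simplified sketch, with hypothetical helper functions standing in for the metadata lookups described above.

    #include <stdbool.h>
    #include <stddef.h>

    /* Illustrative stand-ins for the runtime's structures and lookups. */
    struct frame;                    /* one stack frame being rewritten */
    struct value { int ir_name; };   /* a live register or stack object */

    extern struct frame *innermost_frame(void);
    extern struct frame *parent_frame(struct frame *f);
    extern bool   is_startup_routine(struct frame *f);
    extern size_t live_values(struct frame *f, struct value **out);
    extern bool   known_at_compile_time(struct value *v);
    extern bool   has_fixed_location(struct value *v);   /* globals, aggregates */
    extern bool   mapped_to_old_state(struct frame *f, struct value *v);
    extern void   write_constant(struct frame *f, struct value *v);
    extern void   copy_from_fixed_location(struct frame *f, struct value *v);
    extern void   copy_from_old_reg_or_slot(struct frame *f, struct value *v);
    extern void   recompute_from_def_use_chain(struct frame *f, struct value *v);
    extern void   record_live_registers(struct frame *f);

    /* Pass 1: walk up the stack, innermost frame outward, transforming
     * return addresses, arguments, and temporaries, and recording the
     * live register values each invocation expects. Pass 2 (not shown)
     * walks back down, patching callee-saved spill areas from these
     * recorded register sets. */
    void transform_pass1(void) {
        for (struct frame *f = innermost_frame(); !is_startup_routine(f);
             f = parent_frame(f)) {
            struct value *vals;
            size_t n = live_values(f, &vals);
            for (size_t i = 0; i < n; i++) {
                struct value *v = &vals[i];
                if (known_at_compile_time(v))        write_constant(f, v);
                else if (has_fixed_location(v))      copy_from_fixed_location(f, v);
                else if (mapped_to_old_state(f, v))  copy_from_old_reg_or_slot(f, v);
                else                                 recompute_from_def_use_chain(f, v);
            }
            record_live_registers(f);  /* consumed by pass 2 and by callers */
        }
    }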

3.4 Binary Translation

As a feature that allows instantaneous migration, the runtime performs binary translation on a migrated process until it reaches an equivalence point, at which point the state transformer described in the previous section transforms program state for native execution. This section describes the design of our binary translator for migration across cores implementing the ARM and MIPS ISAs. Our binary translator uses the following scheme of classic just-in-time (JIT) [DS84] dynamic translation:

• Starting from the instruction at the point of migration, each instruction is translated to the ISA of the migrated-to core and placed in a code cache until we encounter a function call

site or an indirect/conditional jump instruction (whose target address is not known until execution).

• Next, a stub (a short group of additional instructions) is added to the end of this translated block of instructions. The stub contains a jump to the stack transformer if we’ve reached a function call site. Otherwise, the stub saves the target address at a known location and jumps to a translator core function called the translation engine.

• Control is then transferred to the translated code in the code cache. If the code eventually relinquishes control back to the translation engine, we repeat the above steps from the instruction at the target address until we finally reach a function call site.
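Putting the three steps together, the control loop of such a translator can be sketched as follows; this is our simplification, with illustrative names, not the dissertation's actual implementation.

    #include <stdint.h>

    typedef uint64_t guest_pc;
    extern void *code_cache_lookup(guest_pc pc);     /* NULL if untranslated */
    extern void *translate_until_exit(guest_pc pc);  /* emit TB + exit stub  */
    extern guest_pc run_translated(void *host_code); /* returns next guest PC,
                                                        or 0 at a call site  */
    extern void transform_state_and_resume_native(void);

    /* Translate and execute guest code block by block until execution
     * reaches an equivalence point (a function call site), then hand off
     * to the state transformer for native execution. */
    void translation_engine(guest_pc pc) {
        for (;;) {
            void *tb = code_cache_lookup(pc);
            if (!tb)
                tb = translate_until_exit(pc);  /* stops at a call site or an
                                                   indirect/conditional jump */
            pc = run_translated(tb);            /* exit stub returns here */
            if (pc == 0) {                      /* reached a call site */
                transform_state_and_resume_native();
                return;
            }
        }
    }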

3.4.1 Translation Block Chaining

The translation engine, before translating the next block of instructions, checks if the block is already available in the code cache. If it is available, it links the end of the previous block to the beginning of the next block with a direct branch instruction. This process is known as translation block (TB) chaining and has been extensively applied in emulators and virtual machines. We extend this idea by allowing translation block chaining from any instruction in the middle of a TB to any instruction in another TB. This allows a TB to be chained to more than one TB at different instructions (the most common case is a conditional branch). For example, TB X can be chained with TB Y at X.i, with Z at X.j, and with itself at X.k, where X.i, X.j, and X.k represent three different instructions in X. The converse is also true: TBs X and Y can both chain to Z, at Z.i and Z.j respectively. In this case, however, Z.i and Z.j can be the same. A merge point is a classic example of such a scenario, wherein the "if" part can be in TB X, the "else" part in TB Y, and they both chain to TB Z at their merge instruction Z.i. We call this Multiple-Entry Multiple-Exit (MEME) translation block chaining. The

following issues need to be addressed by such a design: Condition Codes. In a MEME TB, any instruction can have multiple entry points, including those that check condition codes. To improve performance, our binary translator performs lazy condition code evaluation, which defers evaluation of a condition code until it is checked. Any instruction that checks a condition code (CC) evaluates it first, if it has not already been evaluated by a prior instruction in the same TB. With MEME chaining, we can never be sure whether a CC has been evaluated or not, due to multiple entry points. Also, we do not know which instruction modified the CC in the first place, to perform lazy evaluation accordingly. To overcome these issues, we update a dirty CC map register at every exit point. The dirty map can be used by instructions to check if a CC has already been evaluated. In addition to this, we also store the opcode of the last CC-modifying instruction in a register, so that at the time of lazy evaluation, we know which instruction modified the CC. Program Counter Updates. ARM allows instructions to use the program counter as a general-purpose register. For performance reasons, the (virtual) program counter is not updated after every block of target instructions that emulates a source instruction; it is instead updated whenever necessary. We further optimize this by adding an offset from the last-calculated PC rather than moving the entire 32-bit address (a 32-bit move takes at least two MIPS instructions). With MEME chaining, the last-calculated PC can be different for different entry points. To overcome this, we maintain a map of the last-calculated PC at every instruction in a TB. Using this map, the last-calculated PC is updated at the end of every exit point. Instruction Scheduling. Instructions within a translation block might be reordered. We do not perform instruction scheduling optimizations, but we do fill branch delay slots. Branch delay slots are filled with an independent instruction prior to the branch in the same TB. This is not always correct if the branch has multiple entry points. We handle branch delay slots in ARM to MIPS translation as follows:

• All register indirect branches trap into the translation engine.

• All direct branches are evaluated by the translator, which translates instructions at the target address and inserts them inline. If they are already translated, we do MEME chaining, with the lazy PC update instruction in the branch delay slot, which has to be executed regardless of the branch outcome.

• All conditional branches are only dependent on condition codes which are taken care of by lazy CC evaluation. So the instruction just before the branch is not dependent on the branch and hence always goes into the delay slot. When MEME chaining happens at a conditional branch, we move back any instruction in the delay slot and insert a NOP into the delay slot instead. Performance reduction due to this is negligible compared to the significant gain in performance due to TB chaining.

Delay slots are not a problem in MIPS to ARM translation because ARM does not have delay slots. As a further optimization, we have the ability to preserve the code cache across migrations, so that if the process is migrated to the same core type again, there is a good chance that the code we need is already present in the cache and can be directly used. However, we do not employ this optimization in our presented results.
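A minimal sketch of the chaining step itself, under the assumption of fixed-size (4-byte) guest instructions as on ARM and MIPS; all names here are illustrative.

    #include <stdint.h>

    /* With MEME chaining, the branch target may be any instruction inside
     * an existing TB, not just its first one, so each TB keeps a per-
     * instruction table of host-code offsets. */
    struct tb {
        uint64_t  guest_start;  /* first guest PC covered by this TB     */
        uint8_t  *host_code;    /* translated code in the code cache     */
        uint32_t *host_off;     /* host_off[i]: offset of i-th instruction */
        uint32_t  n_insns;
    };

    extern void patch_branch(uint8_t *branch_site, uint8_t *target);

    /* Overwrite the exit stub at 'stub' with a direct branch into the
     * translated code for guest address 'gpc' inside TB 'to'. */
    void chain(uint8_t *stub, struct tb *to, uint64_t gpc) {
        uint32_t idx = (uint32_t)((gpc - to->guest_start) / 4); /* 4-byte insns */
        patch_branch(stub, to->host_code + to->host_off[idx]);
    }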

3.4.2 ISA-Specific Challenges

Despite high-level similarities, ARM and MIPS exhibit significant diversity. ARM has many features that MIPS lacks: condition codes and an abundance of predicated instructions, load-multiple and store-multiple instructions, a program counter that is accessible as a general-purpose register, and finally PC-relative load instructions to access data embedded in the text section. MIPS, on the other hand, has double the number of integer registers accessible to programmers, and allows 16-bit immediates as opposed to the 8-bit immediate restriction in ARM. Finally, each ISA has a different application binary interface (ABI); they use different

system call numbers and follow different conventions to make system calls. We discuss several of these challenges in this section. Register Allocation. Mapping from ARM to MIPS: ARM has 16 general-purpose registers visible to the programmer, while MIPS has 31 general-purpose registers (excluding R0). Hence, all 16 registers in ARM are easily mapped to registers in MIPS. In addition, we reserve four MIPS registers for the ARM condition codes (Zero, Negative, Carry, and Overflow) and an extra register for inverse carry. Of the remaining 10 registers, four are used for lazy condition code evaluation as described below, three are used as temporary registers, and the remaining three are reserved for future use. Mapping from MIPS to ARM: ARM cannot accommodate all 31 MIPS general-purpose registers. Hence, we map the most frequently used MIPS registers (R1–R7, global pointer, stack pointer, frame pointer and link register, collectively called the "mapped" registers) to registers in ARM. One register points to an in-memory register context block where the "unmapped" MIPS registers are placed. In addition to this, we reserve three registers as cache registers that hold the three most frequently used unmapped registers for faster access. Condition Codes and Predicated Instructions. Most translators use a global data flow analysis technique to perform lazy evaluation of condition codes. However, since we perform binary translation for a relatively small number of instructions until we reach an equivalence point, a data flow analysis would increase migration overhead significantly. Hence, we use a lazy condition code evaluation scheme similar to the one used in QEMU [Bel05]: for every instruction that updates a condition code, we store the opcode, operands, and result in temporary registers reserved for the lazy evaluation, and compute the condition codes from this information whenever required. In addition to this, a dirty map register is used to support MEME TB chaining as described above. Once the condition codes necessary for a predicated instruction are evaluated, a branch instruction is used to test the condition, which skips the operation performed by the instruction if

the condition is false. The conditional move instruction in MIPS is used to translate conditional move instructions in ARM, but we use a conditional branch around arithmetic instructions for more complex predicated instructions. Immediate Instructions. MIPS restricts the size of immediates to 16 bits while ARM limits them to 8 bits. This necessitates two ARM instructions to construct a MIPS immediate, store it in a register, and then perform the actual operation. The problem worsens with 32-bit immediate updates like link register or program counter updates, where four ARM instructions are needed to perform the move. To overcome this, we extend the register context block to also include a "cache of immediates", so that one load instruction suffices for the entire operation, as opposed to four mutually dependent shift-OR instructions. System Calls. In this work, we assume a single operating system instance running on both cores. This guarantees that a system call works in the same way on both cores. However, the system call numbers and calling conventions are dictated by each ISA's ABI, which requires a remapping step when in binary translation mode. One approach would be to provide system call emulation during translation. As an alternative, it is a minor change to our system to also make system calls equivalence points, like function calls. This eliminates the need for system call emulation, and also increases the frequency of equivalence points, which is itself a useful feature. To support migration while executing a system call, we would need to apply our techniques and methodology to the operating system itself.
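The lazy condition-code scheme can be made concrete with a short C sketch; the record layout and the set of cases are illustrative simplifications (for example, the carry produced by ARM's shifter operand is ignored here).

    #include <stdint.h>

    /* Translated code for a flag-setting ARM instruction just records
     * what happened; flags are materialized only when a later
     * instruction actually reads them. */
    enum cc_op { CC_ADD, CC_SUB, CC_LOGIC };

    struct lazy_cc {
        enum cc_op op;        /* opcode of the last CC-modifying instruction */
        uint32_t src1, src2;  /* its operands                                */
        uint32_t result;      /* its result                                  */
        uint32_t dirty;       /* dirty map: flags not yet materialized       */
    };

    /* Materialize the carry flag on demand. */
    static int eval_carry(const struct lazy_cc *cc) {
        switch (cc->op) {
        case CC_ADD: return cc->result < cc->src1;  /* unsigned wraparound   */
        case CC_SUB: return cc->src1 >= cc->src2;   /* ARM: carry = no borrow */
        default:     return 0;                      /* logic ops: simplified */
        }
    }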

3.5 Experimental Methodology

We use the SPEC2000 Integer C benchmarks to evaluate our migration strategy. We exclude the gcc benchmark because it uses the alloca library function to dynamically allocate memory on the stack, creating variable-size stack frames, a feature that is not supported by our infrastructure. All benchmarks are compiled with all optimizations turned on using the multi-ISA

Table 3.1: Architecture detail for ARM and MIPS cores

ARM core:   Frequency 833 MHz;  Fetch/commit width 2;  Branch predictor local;
            I-cache 32 KB, 4-way;  D-cache 32 KB, 4-way;  L2 cache 2 MB, 8-way
MIPS core:  Frequency 2 GHz;  Fetch/commit width 4;  Branch predictor tournament;
            I-cache 64 KB, 4-way;  D-cache 64 KB, 4-way;  L2 cache 4 MB, 8-way

compilation strategy described above. All experiments are performed on CPU cores modeled after the low-power Cortex-A8 core for ARM, and the high-performance R10000 for MIPS. We use the gem5 architectural simulator [BDH+06] to model the CPU cores. The details of each core are given in Table 3.1. To evaluate the steady-state performance degradation due to multi-ISA compilation, we simulate each program phase (simpoint) of a benchmark compiled for both single-ISA execution and multi-ISA execution. Migration cost consists of two major components: dynamic binary translation and program state transformation. On each ISA, we take 10 samples of the benchmark's dynamic execution state, one every 100 million instructions, after fast-forwarding execution past the first one billion instructions [SPHC02]. We then simulate heterogeneous-ISA migration scenarios for each sample, by performing dynamic binary translation until an equivalence point is reached, and program state transformation thereafter.

3.6 Results

3.6.1 Steady State Performance

A key goal of this work is to enable fast migration without compromising runtime performance, that is, performance when no migration is occurring. Throughout the toolchain changes made to ensure a nearly identical memory image across architectures, care must be taken that performance is not compromised. Among our benchmarks, no performance is lost due to the changes that make the memory image consistent, including the addition of padding (which is too little to cause additional instruction cache misses). Furthermore, the symmetrical fat binary created by our compiler contains multiple code sections, but a core only loads its own code into its private instruction cache; the larger binary can therefore only impact a shared cache. If we change our design to have a shared 16MB LLC, and incorporate the increased working set size, we measure zero performance loss. Overall, we observe no performance degradation due to multi-ISA compilation, up to four decimal places of IPC. This negligible impact stems from the fact that we do not disable any optimization or ISA-specific behavior to enable multi-ISA compilation.

3.6.2 Migration Cost

Migration cost is composed of two major components: binary translation and stack transformation. We characterize the performance of our binary translator based on the following three metrics:

• Target-to-Source Ratio: The ratio of the number of dynamic target instructions executed on the migrated-to core to the number of dynamic source instructions executed on the native core. Most static binary translators report a target-to-source ratio between one and two. Dynamic binary translators tend to have a higher target-to-source ratio because they have less scope for optimization.

33 Figure 3.3: The ratio of the number of target instructions executed during binary translation to the number of source instructions during native execution.

• Total-to-Source Ratio: The ratio of the number of dynamic instructions, inclusive of both the target instructions and the instructions used for translation (as executed on the migrated-to core), to the number of dynamic source instructions executed on the native core. This is the ratio we address with our translation block chaining algorithms.

• Overhead due to binary translation: The time taken for binary translation compared against the time taken on the native core, in microseconds.

Figure 3.3 shows the target-to-source ratio for each benchmark in both directions of migration. The target-to-source ratio for MIPS to ARM translation is generally less than that for ARM to MIPS, due to the more complex instructions in ARM. One exception to this is gap, which has a higher ratio for MIPS to ARM translation. This is because gap is characterized by tight loops with multiply instructions, which take additional instructions in ARM to perform stores to the "hi" and "lo" memory locations. In both directions, bzip2, perlbmk, and vpr have high target-to-source ratios because of a large number of branches in MIPS code and a large number of predicated instructions in ARM code. Figure 3.4 shows the target-to-source ratio for ARM to MIPS translation without optimization and then with each optimization applied incrementally. The leftmost bar shows the

34 Figure 3.4: Target-to-Source ratio during Binary Translation from ARM to MIPS—without optimization, with lazy condition code evaluation and with full optimization.

performance of a naïve binary translator without any optimizations. The middle bar shows the performance of the binary translator doing lazy condition code evaluation. This optimization significantly reduces the dynamic instruction count. The rightmost bar shows the binary translator with all optimizations enabled. This includes grouping predicate instructions and certain constant-folding optimizations. Lazy condition code evaluation contributes the most to the overall translation speedup.

Figure 3.5: Target-to-Source ratio during Binary Translation from ARM to MIPS—with and without Register and Immediate caches

Figure 3.5 shows the performance of the MIPS to ARM translator with and without the use of the register cache and immediate cache. Dynamic instruction count is significantly lower with these optimizations. Our register cache is made up of only three temporary registers. We

expect that a larger cache with an adaptive register allocation strategy would give even greater speedups.

Figure 3.6: The ratio of the number of dynamic instructions executed (including the ones used for translation) during binary translation to the number of source instructions during native execution.

Figure 3.6 shows the total-to-source ratio for each benchmark in both directions of migration. Again the MIPS to ARM translator has a lower translation cost than ARM to MIPS, because it does not have to make complex decisions like lazy condition code evaluation and lazy PC update during translation. However, there is high irregularity in the total-to-source ratios of different benchmarks: some are as high as 300 while some are close to one. This is heavily impacted by the number of instructions to the next equivalence point (call site). If it is 84 instructions (the average for vpr), the cost of translation is not amortized. If it is hundreds of millions of instructions, the vast majority of execution is in the code cache and the translation cost is insignificant. The latter is due, in large part, to the MEME chaining, which allows execution to remain in the code cache once a steady state is reached. Table 3.2 shows the percentage of dynamic instructions executed in the code cache during binary translation.

Table 3.2: Percentage of instructions used from code cache

Benchmark    ARM to MIPS    MIPS to ARM
bzip2        99.9992        99.993
crafty       0.0            0.0
gap          85.9           94.0
gzip         18.1           99.9
mcf          99.96          99.1
parser       15.7           30.7
perlbmk      99.997         99.99
twolf        58.9           65.4
vortex       0.0            0.0
vpr          36.6           67.1

We next compare the performance of our binary translator with native execution in Figure 3.7. Within the migration points we sample, from the time migration is requested until an equivalence point is reached, an ARM binary executes for an average of 284 microseconds on an ARM core, while binary translating the same ARM code on a MIPS core takes an average of 2745 microseconds. Likewise, MIPS code executes for an average of 1981 microseconds on a MIPS core, while its binary translation on an ARM core takes 7240 microseconds. Thus, binary translation costs us 2461 microseconds while migrating from ARM to MIPS and 5259 microseconds while migrating from MIPS to ARM. All benchmarks except bzip2, perlbmk, and mcf complete binary translation in tens or hundreds of microseconds. These three benchmarks show a high binary translation overhead because they have to translate millions of instructions before reaching an equivalence point.

Figure 3.7: Comparison of Binary Translation time with Native Execution time in microseconds

Figure 3.8: Migration Overhead due to Binary Translation and Stack Transformation in microseconds

Finally, we evaluate the performance of our migration strategy by looking at the total migration overhead: the time taken by both binary translation and stack transformation. This is shown in Figure 3.8. The average migration overhead for ARM to MIPS migration is 2734 microseconds, while it is 5602 microseconds for MIPS to ARM migration. It should be noted that this average is dominated by a few outliers; if we ignore bzip2, mcf, and perlbmk, the average overhead drops by about a factor of 10. Figure 3.9 shows performance (relative to native execution without migration) at different migration frequencies (when migrating back and forth between cores at fixed time intervals), assuming average cost values. It breaks down the costs due to compilation for migratability, state transformation, and binary translation. With all costs considered, execution is 95% as fast as native execution when migration occurs, on average, every 87 milliseconds. This would be an extremely high rate of migration for most foreseeable applications. Also recall that much of that 5% lost performance is an artifact of the compiler not being designed from the ground up to emit multi-ISA code. We believe this level of performance crosses a critical threshold. Unless the code is migrating between cores nearly every timer interrupt, the cost of migration is very small. This means that

38 1

0.95

0.9

0.85 Performance

Compilation overhead 0.8 Compilation + state transformation overheads Compilation + state transformation + binary translation overheads 0.75 0 100 200 300 400 500 Migration frequency (milliseconds)

Figure 3.9: Performance vs. migration frequency when migrating back and forth between an ARM core and a MIPS core. Performance includes overheads due to compilation for migratabiltity, state transformation, and binary translation. the difference in migration cost between a single-ISA heterogeneous CMP and a multi-ISA CMP is negligible for most reasonable assumptions about desired migration frequency. Thus, there is no significant performance barrier to fully exploiting heterogeneity in a multicore architecture, including both microarchitecture heterogeneity and ISA heterogeneity. For comparison, recall that prior work typically measured migration time in hundreds of milliseconds [FCG00], if not worse, and that migration could not occur at an arbitrary point in execution.
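As a rough consistency check of Figure 3.9 (our arithmetic, using only the average overheads reported above and ignoring the smaller compilation and state transformation terms): if each migration costs c on average and migrations occur every T milliseconds, relative performance is approximately T/(T + c):

\[
\bar{c} \approx \tfrac{1}{2}(2734 + 5602)\ \mu\text{s} \approx 4.2\ \text{ms},
\qquad
\frac{T}{T + \bar{c}} = \frac{87}{87 + 4.2} \approx 0.95,
\]

in line with the 95% figure quoted for an 87 ms migration interval.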

3.7 Conclusion

This chapter establishes the viability of heterogeneous-ISA chip multiprocessors by presenting a cross-ISA process migration technique that is orders of magnitude faster than prior art. The chapter describes a multi-ISA compilation technique that leverages LLVM's intermediate representation to create a symmetrical fat binary, which ensures memory consistency across all ISAs by using the same endianness, data type size, alignment and padding rules in global data, heap and text sections, and certain portions of the stack. The chapter also describes a runtime mechanism to transform inconsistent state on the stack, and to compute the target CPU registers

post migration. To support instantaneous migration, the runtime also performs dynamic binary translation until execution reaches a point, called an equivalence point, at which program state can be safely transformed. The proposed technique incurs an average binary translation cost of 2.75 milliseconds for ARM to MIPS migration and 7.24 milliseconds for MIPS to ARM. Finally, this chapter shows that even at a frequent migration interval of every few hundred milliseconds, the total loss in performance is well under 5%.

Acknowledgements

Chapter 3, in part, is a reprint of the material as it appears in the proceedings of ASPLOS 2012. DeVuyst, Matthew; Venkat, Ashish; Tullsen, Dean M., Execution Migration in a Heterogeneous-ISA Chip Multiprocessor, 17th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), March, 2012. The dissertation author was the primary investigator and author of this paper. The dissertation author would also like to thank Matthew DeVuyst for his useful insights and early contributions to the multi-ISA compiler framework based on GCC.

Chapter 4

Design of a Heterogeneous-ISA Chip Multiprocessor

The cross-ISA process migration infrastructure described in Chapter 3 now enables programs to cross a heretofore forbidden boundary: the Instruction Set Architecture. Existing processor architectures either feature a single ISA, or feature cores with multiple ISAs but assign distinct jobs to different cores, or at best statically partition the work. This chapter demonstrates not only that this restriction is unnecessary, but that it limits the potential heterogeneity, sacrificing performance and energy efficiency gains. It motivates the need for ISA diversity, explores the design space of heterogeneous-ISA CMPs characterized by three diverse ISAs (ARM's low-power Thumb, the traditionally RISC Alpha, and the high-performance x86-64) and a multitude of microarchitectural parameters, and further demonstrates the effectiveness of a heterogeneous-ISA architecture in terms of its performance and energy efficiency.

4.1 Harnessing ISA Diversity

To keep both the design-space exploration and the compiler development tractable, we select our target ISAs a priori; considering more ISAs, and even the possibility of custom ISAs, would only increase the potential gains. This section describes our three target ISAs, Thumb, Alpha, and x86-64, with respect to several axes of diversity: code density, dynamic instruction count, register pressure, and support for specialized operations. Code Density. High code density reduces the number of instruction cache misses, uses less energy and memory bandwidth for instruction fetch, and conserves power by enabling the use of smaller microarchitectural structures. Weaver, et al. [WM09] evaluate a wide range of ISAs for code density. They find that RISC ISAs with fixed-length instructions such as Alpha and SPARC show the lowest code density, while embedded ISAs like Thumb and AVR32 exhibit the highest density owing to a technique called code compression. This technique packs two 16-bit instructions into one 32-bit instruction, which is then unpacked at the decode stage and executed as two instructions. CISC ISAs such as x86-64 and VAX sit in the middle of the code density spectrum by virtue of variable-length instruction encoding. Dynamic Instruction Count. While code compression achieves about 32.5% memory savings in Thumb, it increases the dynamic instruction count by 30% [KG05]. This is a direct consequence of using simpler 2-operand instructions to fit Thumb's 16-bit instruction format. Thumb instructions also lack the shift-modifier and predication support that ARM instructions enjoy. Alpha employs 3-operand instructions, but is a load-store architecture, meaning that no arithmetic instruction can directly operate on memory. While x86-64 also restricts instructions to the 2-operand format, it implements a number of complex addressing modes that allow instructions to directly operate on memory. x86-64 instructions are decoded into one or more simpler RISC-like µops, thereby increasing the number of dynamic instructions (µops) by about a factor of 1.3 [BMS13a] (as compared to the native x86-64 instruction count).

Register Pressure. Thumb uses a reduced register set, allowing only eight 32-bit programmable registers for integer operations. Thus, all 64-bit integer computation is performed using software emulation. Software emulation is discussed in greater detail in Section 4.3. Alpha, on the other hand, has two banks of thirty-two 64-bit programmable registers, for integer and floating-point computation. x86-64 offers sixteen 64-bit registers for integer operations and sixteen 128-bit registers for floating-point and SIMD operations. The number of programmable registers is inversely related to the amount of register pressure, and thus the number of register spills, for any ISA. Therefore, Thumb suffers from extremely high register pressure, while Alpha enjoys low register pressure. Interestingly, x86-64 enjoys the lowest register pressure among the three ISAs, despite the fact that it has a smaller architectural register file than Alpha. This is a manifestation of the following addressing modes and optimizations: Absolute memory addressing allows instructions to directly access memory operands, eliminating the need to allocate registers for temporary storage of loaded values. Sub-register addressing allows programmers to address 48 sub-registers to store and operate on smaller data types, which can be further exploited by aggressive sub-register coalescing strategies to reduce the number of register spills. Program counter relative addressing enables position-independent code without the overhead (both in performance and allocated registers) of a Global Offset Table. Lastly, register-to-register spills allow compilers to spill general-purpose registers to XMM registers, thereby minimizing the number of register spills into memory. Figure 4.1 shows Thumb instructions and x86-64 µops normalized to Alpha instructions, for the SPEC2006 integer benchmarks, compiled using the LLVM/Clang framework [LA04b, Lat08]. The average number of dynamic instructions on Thumb is 43.4% more than that of Alpha. This is due, in large part, to the high register pressure in Thumb. In fact, Thumb makes 91.1% more memory references than Alpha. Other factors include 64-bit emulation and the use of simpler 2-operand instructions.

43 '-90)" )/3-(" :FE>ED" C" B=F" B=E" B=D" B=C" B" A=F" A=E" A=D" A=C" $240(/.;,+""15649*721"%(725" A" )4.6-0,7*" #2(+5" &624,5" !21+.721(/" '26(/" 4(1*-,5" "15649*7215" Figure 4.1: Instruction mix (normalized to Alpha) for SPEC2006

x86-64 makes 8.3% fewer memory references than Alpha due to lower register pressure. However, we observe only a very small (0.8%) reduction in the number of stores, while the number of loads drops by 10.8%. The compiler generally seeks to spill variables that won't be updated for a long period of time; therefore, the number of reads from the spill area is much higher than the number of writes. Interestingly, Alpha makes 10.9% fewer memory references on the high-ILP benchmarks bzip2 and hmmer, where the compiler does not find enough opportunity to utilize the complex addressing modes and optimizations offered by x86-64. The 2-operand restriction contributes to x86-64 executing 24.7% more arithmetic instructions, resulting in an overall increase in the number of dynamic instructions of 15.4%. Floating-point and SIMD Support. One consequence of code compression is that floating-point instructions are not supported in Thumb. Floating-point operations are emulated in software. While emulation results in slower execution, Thumb cores need not include floating-point instruction windows, register files, and functional units, resulting in up to a 19.5% reduction in peak power and 30% savings in area. In a heterogeneous-ISA architecture, any program or phase with significant floating-point activity will likely quickly switch to an ISA that executes natively. x86-64 also provides SIMD support through its SSE/AVX extensions, making vectorization

44 1.8 2

1.6

1.4 1.5 1.2

1 1 0.8

0.6 0.5 0.4 Performance (Normalized IPC) Performance (Normalized IPC) Thumb Thumb 0.2 Alpha Alpha x86-64 x86-64 0 0 0 5 10 15 20 25 0 5 10 15 20 25 Peak Power (W) Peak Power (W)

Figure 4.2: Performance comparison under different peak power budgets for two different execution phases of bzip2 tion of loops and basic blocks possible. Alpha’s MVI extension allows for only pack, unpack, max, and min operations. Due to the very primitive nature of the MVI extension, we forgo SIMD units in Alpha cores. To illustrate the benefits of heterogeneity, even on a single application, we examine the performance of bzip2 during two different phases of its execution (in Figure 4.2), under varying power constraints. We identify program phases using SimPoint [PHVB+03]. The detailed methodology is described in Section 5.5. We see two key results in this graph. First, we see that the most effective ISAs differ between phases of the same application; e.g., at 15 W, Phase 1 prefers x86 and Phase 2 prefers Alpha. Second, we see that even in a single phase, the best ISA varies depending on the design constraints or the operating condition of the processor. For example, in Phase 2, we might prefer Alpha unless we are operating unplugged or perhaps in a low battery state, in which case we’d prefer Thumb because at low power budgets, Thumb provides the highest performance.

Table 4.1: Design space of a Heterogeneous-ISA architecture

Design Parameter                    Design Choices
ISA                                 Thumb, Alpha, x86-64
Execution Semantics                 In-order, Out-of-order
Issue width                         1, 2, 4
Branch Predictor                    local, tournament
Reorder Buffer Size                 64, 128 entries
Architectural Register File         ISA-specific
Physical Register File (Integer)    96, 160
Physical Register File (FP/SIMD)    64, 96
Integer ALUs                        1, 3, 6
Integer Multiply/Divide Units       1, 2
Floating-point ALUs                 1, 2, 4
FP Multiply/Divide Units            1, 2
SIMD Units                          1, 2, 4
Load/Store Queue Sizes              16, 32 entries
Instruction Cache                   32KB 4-way, 64KB 4-way
Private Data Cache                  32KB 4-way, 64KB 4-way
Shared Last Level (L2) Cache        4-banked 4MB 4-way, 4-banked 8MB 8-way

4.2 Design Space Exploration

The possible design space of a heterogeneous-ISA CMP is characterized by a diverse set of ISAs and a multitude of microarchitectural parameters. Navigating such a design space is a difficult problem. That difficulty can be substantially reduced if we understand some of the principles that govern the effective co-design of heterogeneous-ISA, heterogeneous-hardware processors. To do this, we execute an exhaustive design space exploration of an architecture with fairly limited, tractable options, and observe the characteristics of the best designs. The design space we explore in this work includes the three ISAs, Thumb, Alpha, and x86-64, along with a set of microarchitectural parameters that represent a wide range of performance and power control points. The goal of the design-space exploration is to find the optimal 4-core heterogeneous-ISA CMP for varying power and area budgets, considering all permutations of applications in the workload sharing the cores.

Table 4.2: Pruned design space for faster navigation

Design Parameter                Design Choices
ISA                             Thumb, Alpha, x86-64
Execution Semantics             In-order, Out-of-order
Branch Predictor                local, tournament
Reorder Buffer-Register File    64-96-64, 128-160-96 entries
Issue Width-Functional Units    1-1-1-1-1-1, 1-3-2-2-2-2, 2-3-2-2-2-2, 4-3-2-2-2-2, 4-6-2-4-2-4
Load/Store Queue Sizes          16, 32 entries
Cache Hierarchy                 32K/4-32K/4-4M/4, 32K/4-32K/4-8M/8, 64K/4-64K/4-4M/4, 64K/4-64K/4-8M/8
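For reference, the configuration counts quoted below follow directly from this pruned space (our arithmetic, treating each of the four core slots as an independent choice):

\[
120 + 480 = 600 \ \text{pruned core designs},
\qquad
600^4 = 1.296 \times 10^{11} \approx 129.6\ \text{billion 4-core configurations}.
\]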

Table 4.1 enumerates the variables in our design space. The Cartesian product of this design space consists of 750 thousand single-core combinations, making an exhaustive search impractical. To reduce the size of the Cartesian product, we prune the design space by establishing correlations between different variables. While some correlations can be inferred by intuition, others are dictated by specific ISA characteristics: (1) the size of the reorder buffer can be correlated to the physical register file size, as they together establish the window size; (2) the number of functional units varies with the issue width; (3) Thumb cores need not include a floating-point instruction window, retirement units, register files, or functional units; and (4) neither Alpha nor Thumb need include SIMD functional units. The resulting pruned design space, shown in Table 4.2, contains 120 in-order cores and 480 out-of-order cores. However, the number of possible 4-core configurations in the pruned design space is still very high (129.6 billion configurations). To make this problem tractable, we use the following result from prior research on single-ISA heterogeneous architectures: modeling cores with private LLCs instead of a shared n-banked LLC in an n-core configuration yields the same performance ordering across all n-core configurations [KTJ06], and does so across different scheduling/migration strategies [VCJE+12]. Therefore, we model cores using 1MB 4-way or 2MB 8-way private last-level caches,

instead of a single shared 4MB or 8MB cache, respectively. Thus, the combined performance of a 4-core configuration with private LLCs can be computed as the sum of the performances of the individual cores. While the design space exploration still involves finding the optimal 4-core configuration out of 129.6 billion different configurations, we can now find the best design with 600 simulations of the single-core permutations and a software search over the 130 billion sums.

Table 4.3: Memory Management on Thumb, Alpha and x86-64

                        Thumb                         Alpha                          x86-64
Page Table Hierarchy    2-level                       3-level                        4-level
Page Size               4KB, 64KB                     8KB                            4KB, 2MB, 1GB
Page Table Size         16KB first-level,             8KB                            4KB
                        1KB second-level
TLB Update Mechanism    hardware page table walker    low-level firmware (PALcode)   hardware page table walker
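Returning to the software search described above: since per-core performance is additive with private LLCs, scoring one candidate 4-core design reduces to table lookups over the single-core simulation results. A minimal C sketch; the names, and the scoring rule of trying all 4! assignments of a 4-application workload to the 4 cores, are illustrative.

    #define NCORES 600  /* pruned single-core designs (Table 4.2) */
    #define NAPPS    4

    /* perf[c][a]: simulated performance of application a on core c,
     * obtained once from the 600 single-core simulations. */
    extern double perf[NCORES][NAPPS];

    /* The 24 permutations of {0,1,2,3}; filled in elsewhere. */
    extern const int perms[24][NAPPS];

    /* Best throughput of candidate design c[0..3] on one 4-app workload:
     * try every assignment of apps to cores and keep the best sum. */
    double score(const int c[NAPPS]) {
        double best = 0.0;
        for (int p = 0; p < 24; p++) {
            double s = 0.0;
            for (int i = 0; i < NAPPS; i++)
                s += perf[c[i]][perms[p][i]];
            if (s > best) best = s;
        }
        return best;
    }

    /* The outer search enumerates the candidate designs that fit the
     * power/area budget and keeps the best average score. */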

4.3 Programming Environment and Memory Layout

The programming environment for a heterogeneous CMP is dictated by one of the first design choices: separate address spaces [teg10, teg11] vs. a unified address space [Gre11, KFJ+03]. We contend that the full benefits of a heterogeneous multicore architecture can be reaped only through dynamic core selection, which requires process migration. Separate address spaces impose a significant cost on process migration in terms of program state transfer. Therefore, we choose the unified address space model in our design. However, this presents a unique challenge to compilation and process migration, because the memory layout and runtime state of a program is always architecture-specific. The process migration problem has been largely addressed in Chapter 3, in terms of compilation and runtime techniques. This chapter presents a memory management strategy to facilitate a unified address space. Address Translation. A common address translation mechanism is required to ensure a unified address space in a heterogeneous-ISA environment. From Table 4.3, Alpha emerges as

the lowest common denominator due to its 8KB page size. While it is possible to use software virtualization for 8KB page management on Thumb and x86-64, it necessitates the use of multiple page table structures, one for each ISA. Furthermore, software virtualization cannot enable 64-bit virtual address translation on the 32-bit Thumb architecture. Therefore, we use a common page table structure and an MMU based on the 4-level page table walker of x86-64, for all three ISAs. Long mode emulation on Thumb. Long mode (64-bit) computation in Thumb is performed using software emulation. Most compilers already support this to perform arithmetic and memory operations on the "long long" data type. The general procedure is to use multiple 32-bit registers to construct 64-bit values and compute on them. To support memory operations using 64-bit pointers (virtual addresses), we extend the Thumb ISA to include special instructions, LD64 and ST64, which instruct the MMU to look for the higher-order 32 bits of the 64-bit virtual address in a special register R8. However, the memory footprint of most general-purpose workloads seldom exceeds 4GB; in fact, we observe that no SPEC CPU2006 benchmark acquires more than 4GB of memory. Therefore, wherever possible, we use the regular load/store instructions with 32-bit pointers, which are zero-extended by the MMU during address translation.
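To make the flavor of this emulation concrete, the following C sketch shows how a compiler might lower a 64-bit addition onto two 32-bit registers; the function name and representation are our illustration, not the dissertation's toolchain.

    #include <stdint.h>

    /* A 64-bit value split across two 32-bit "registers", as a 32-bit
     * ISA such as Thumb would hold it. */
    typedef struct { uint32_t lo, hi; } u64_pair;

    /* Emulated 64-bit add: add the low halves, detect the carry from
     * unsigned wraparound, and fold it into the high halves. Compilers
     * emit essentially this pattern for 'long long' arithmetic. */
    static u64_pair add64(u64_pair a, u64_pair b) {
        u64_pair r;
        r.lo = a.lo + b.lo;
        uint32_t carry = (r.lo < a.lo);  /* wraparound => carry out */
        r.hi = a.hi + b.hi + carry;
        return r;
    }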

4.4 Experimental Methodology

This section describes our experimental methodology. Our four-core design space consists of 600 homogeneous processors, 1.5 billion single-ISA heterogeneous processors, and 128.3 billion heterogeneous-ISA chip multiprocessors, all composed from the same pool of 600 distinct CPU cores. Although prior work on ISA characterization [BMS13b, IJJ09, Ter11] chooses to model cores based on actual commercial offerings for each ISA, we seek to remove any non-ISA biases

and start each design with a clean slate. Thus, we assume the same basic pipeline design (number of stages and latency), based on the Alpha 21264 [Kes99], across all ISAs (with the exception of instruction decode, which will be the primary stage(s) that depend on the ISA). Additionally, we assume a total-store-order (strictest of all) memory consistency model for all ISAs. All cores are modeled using 32nm technology and the clock rate is fixed at 1.67GHz. Owing to ISA diversity and the multitude of microarchitectural parameters we consider in this work, the heterogeneous-ISA CMPs are distributed over significant peak power (8.32-80.69 W) and area (32.97-129.87 mm2) ranges. We use the gem5 [BDH+06] simulator to model CPU core performance, and McPAT [LAS+09a] to model power and area. Our design methodology selects the optimal multicore configuration over the entire set of workloads (all possible permutations), for different peak power and area budgets. The design space explorations are optimized for two types of workloads: (1) multi-programmed mixed workloads, and (2) single-threaded workloads. The former helps us evaluate the throughput of a conventional CMP; the latter gives us insight into a "Dark Silicon" implementation, where it is expected that only one core (out of a heterogeneous cluster) will be powered up at once [EBA+11, KFJ+03, VSG+10]. In the latter case, a thread will always be assigned its best core, but in the former case, the assignment will depend on the threads with which it is co-scheduled. We will also examine both the case where threads are placed based on overall execution characteristics (assuming minimal migration), and the case where threads can migrate to other cores at phase changes. That is, in the first case we find the best assignment of applications to cores; in the second, we find the best assignment of phases to cores. We use the SPEC CPU2006 integer and floating-point C benchmarks to evaluate the proposed architecture. We exclude h264ref and perlbench from this set because they use ISA-specific programming constructs (e.g., inline assembly), either directly or through library function calls. All benchmarks are compiled at the -O3 optimization level, using the multi-ISA compilation methodology described in Chapter 3. We use SimPoint [PHVB+03] to identify program phases.

Specifically, we obtain multiple simulation points for a program's execution on Alpha, with an interval size of 100 million dynamic instructions. We modify the atomic CPU (instruction emulation mode) of gem5 to emit the start and end basic blocks for each simulation point, and their cumulative frequency, which serve as the start and end markers for the corresponding program phase on the other two ISAs, namely Thumb and x86-64. Our workloads include a total of 72 different program phases across the 10 applications we study. When searching for an optimal (e.g., 4-core) design, we consider all permutations of the benchmark set. Note that these experiments seek to find the best designs (best combination of cores) without considering migration cost. To measure the cost of migration, we consider a phase-based scheduling scenario, where migration happens only when phase transitions demand switching to a different core. To identify phase transitions, we rely on SimPoint metadata and profiling information from oracle experiments. This models a system where the compiler is directing migration (or at least migration preferences), or a runtime or hardware system that has been observing execution long enough to accurately detect phase behavior.

4.5 Results

This section seeks to identify the best heterogeneous designs for a given workload. This not only enables us to quantify the potential gains of ISA heterogeneity, but also to identify trends and insights from the actual designs that get tagged as optimal. Because the nature of the design exploration is relatively independent of the cost of migration, only results later in this section account for the specific costs of migration, and the extent to which they mitigate the potential gains.

51 -+-'&,&-3/" "),'*&7!"9" "),'*&7!"9"84)0("+)'.$1-,/9" &0&.-'&,&-3/7!":" &0&.-'&,&-3/7!":"84)0("+)'.$1-,/9" =6@" =6@" =6?" =6?" =6>" =6>" =" ="

<6A" !'$$#*'& <6A"

'$$#)'& <6@" <6@" <6?" <6?" <6>" <6>" <" <" ><" ?<" @<" #,*)+)0&%" ?A" @?" A<" #,*)+)0&%" '$"%&'&*$(&+!,& '($"& *#%$)&+&& -,&

Figure 4.3: Multi-programmed Workload Performance comparison under different peak power and area budgets

4.5.1 Evaluation of the Heterogeneous-ISA Architecture

Processor designs today are as likely to be constrained by power dissipation as they are by area. Thus, in this section, we examine the top-performing designs under both area and power constraints. In addition to finding the best heterogeneous-ISA design, we also find the best homogeneous design (best single configuration for any ISA) and the best single-ISA heterogeneous design (best heterogeneous design for which all cores have the same ISA). We consider designs optimized for both multi-programmed workload throughput and single-thread performance. Multi-programmed workloads. Figure 4.3 compares three architectures: homogeneous, single-ISA heterogeneous, and heterogeneous-ISA CMPs, all optimized for multi-programmed workload performance under different peak power and area constraints. We make several important observations here. First, there are significant gains available from ISA heterogeneity, matching or exceeding the gains from hardware heterogeneity. Second, hardware heterogeneity alone is less effective under tight constraints (all cores have to be small) or liberal constraints (all cores free to be big), because both endpoints tend toward homogeneous designs. Heterogeneous-ISA designs, in contrast, are still effective in those regions, because we can still gain from ISA heterogeneity even when the hardware is homogeneous. There are two reasons for this advantage. First, different code regions have a natural affinity for one ISA or another, irrespective of the hardware implementation. Second, the ISA

52 +)+&%*%+/-" "'*&(%1!"9" %.%,+&%*%+/-1!"9" +)+&%*%+/-" "'*&(%1!"9" %.%,+&%*%+/-1!"9" 506" 506" 5" 5" 409" 409" 408" 408" 407" 407" !+,*$)'.&%+-"+ "*+)$(',&%+! #+ 406" 406" 4" 4" 64" 74" 84" #*(')'.%$" 79" 87" 94" #*(')'.%$" "&$(+"+-&,+/#0+ /+&$+-)) /.+

Figure 4.4: Energy-Delay-Product comparison for multi-programmed workloads under different peak power and area budgets options give the architect more opportunities to create area-effective or power-effective cores. For instance, at a peak power budget of 20W, the single-ISA heterogeneous CMP manages to employ only 3 out-of-order cores with smaller 32KB L1 caches, while the heterogeneous-ISA CMP sports all out-of-order cores with 64KB L1 caches. This is possible because the area-efficient and power-efficient Thumb cores free up space that is put to good use by the other cores. We find that heterogeneous-ISA CMPs can provide 15.8% better throughput on multi-programmed workloads than the best single-ISA heterogeneous CMPs. Furthermore, we can achieve a greater speedup if applications are allowed to migrate between the cores at phase boundaries. This is well-documented in the case of single-ISA heterogeneous CMPs [KFJ+03, KTR+04]. On a heterogeneous-ISA CMP, this effect is further enhanced due to ISA affinity. We observe an additional speedup of 11.2% due to migration alone on a heterogeneous-ISA CMP, in contrast to the 4.6% speedup due to migration on a single-ISA heterogeneous CMP. In all subsequent experiments in this chapter, migrations are always enabled. In order to evaluate energy efficiency, we instead optimize the design space exploration to find energy-efficient cores by identifying the processor configurations that minimize energy- delay product (EDP). Figure 4.4 compares the energy efficiency for the three architectures under different peak power and area budgets. Heterogeneous-ISA CMPs achieve an average energy savings of 21.5% and an average reduction of 27.8% in the EDP over single-ISA heterogeneous


Figure 4.5: Single Thread Performance and EDP evaluation using the dynamic multicore topology

Thus we see that the energy efficiency gains of ISA heterogeneity actually exceed the potential performance gains (of the performance-optimized experiments). Maximizing heterogeneity in this way can be particularly effective in a power-constrained environment. In a homogeneous-ISA general-purpose processor, Thumb is not a serious candidate because it performs so poorly for certain codes; as part of a heterogeneous solution, however, it shines for certain code regions.

Single-threaded workloads. To evaluate designs optimized for single-threaded workloads under different peak power budgets, we assume the dynamic multicore topology described by Esmaeilzadeh et al. [EBA+11], in which idle cores are turned off to reduce power consumption. Figure 4.5 shows performance and EDP measurements for the three architectures constructed when searching the design space for multicore architectures that provide optimal performance or energy efficiency over our benchmark set. We apply lower peak power constraints in this scenario, since we are optimizing for the single-powered-core execution scenario. We observe that in a highly peak-power-constrained environment, the heterogeneous-ISA CMP still manages to achieve a speedup of 16.9% over a single-ISA heterogeneous CMP, and as the peak power budget becomes slightly higher (at 15 W), it provides a consistent speedup of 18.8%.


Figure 4.6: Single Thread Performance and EDP evaluation under different area budgets

However, the maximum speedup that a single-ISA heterogeneous CMP can provide over a homogeneous CMP in such a topology is just 1.5%. So again we see that ISA heterogeneity continues to provide gains in regions where hardware heterogeneity is less effective.

Figure 4.6 shows the performance and EDP evaluation of designs optimized for single-threaded workloads under different area budgets. Such designs are typically composed of multiple small cores and one large core optimized to provide high single-thread performance. When highly power constrained, single-ISA heterogeneous CMPs use three small in-order Alpha cores and one powerful out-of-order core. Combining the dual benefits of ISA affinity and the area efficiency of Thumb, the best heterogeneous-ISA CMPs provide more balanced cores (to better exploit ISA affinity) yet still include the same large Alpha core as the single-ISA design. That configuration contains two small Thumb cores, the same out-of-order Alpha core, and a medium-end x86-64 core. We observe that a heterogeneous-ISA CMP can improve single-thread performance by 20.8% over a single-ISA heterogeneous CMP, or achieve 23% more energy savings and a 31.8% reduction in EDP, again with no loss in performance.

4.5.2 Framework for ISA-Microarchitecture co-design

In this section, we present inferences from our design space exploration that can serve as a framework for future ISA-microarchitecture co-design in the context of a heterogeneous-ISA CMP. We consider the best designs from all experiments carried out in the previous section.

Figure 4.7: Inferences from the Design Space Exploration

Figure 4.7 shows the frequency of occurrence of different micro-architectural parameters in a heterogeneous-ISA design. We analyze the influence of the ISA on each of these micro-architectural parameters.

Execution Semantics. We find that out-of-order execution semantics are favorable in general. However, due to conservative peak power budgets in some designs, we select in-order cores 10.8% of the time on Alpha, and 25% of the time on x86-64. Due to the enormous peak power and area benefits of Thumb, we always select out-of-order Thumb cores because they are so cheap. This impacts a number of our results, because it means even our smallest designs always have an out-of-order core available.

Issue Width. In general, higher fetch width enables higher issue width. Since the instruction fetch units of Thumb and Alpha dissipate less power than that of x86-64, we seldom choose a small issue width for these ISAs. In fact, only 2.7% of our designs choose a single-issue Alpha core, and none of our designs uses a single-issue core for Thumb. Because code compression ensures our minimum fetch bandwidth is 2 instructions with Thumb, a scalar Thumb processor would always have fetch and issue out of balance.

ROB Size. We find that the number of ROB entries and the register file size are highly influenced by the register pressure of an ISA. The low register pressure of x86-64 (see Section 4.1) results in small ROBs and physical register files being configured. Conversely, the high register pressure of Thumb has the opposite effect, while Alpha finds a middle ground between the two.

Load/Store Queue Size. Although we do not see a direct correlation between load/store queue sizes and ISA traits, ISAs more likely to be configured out-of-order and with wide issue, not surprisingly, also demand large LSQs.

Number of Functional Units. Because the x86-64 cores we model have more basic functional unit types (integer, floating-point, and SIMD), the cost of going from the low to the high configuration is higher, and that step is taken less often.

Branch Predictor. Interestingly, all ISAs almost always choose the tournament branch predictor. This implies that it is always worthwhile to invest transistors in the branch predictor. ISA traits such as predication support have little influence on the selection of the branch predictor type.

L1 Cache Size. The size of the L1 cache largely depends on the working set of applications, rather than on a specific trait of an ISA. In general, all ISAs favor larger L1 cache sizes.

4.5.3 ISA Affinity

To determine the ISA affinity of each application, we simulate both single-threaded and multi-threaded workloads on two types of designs: (1) optimized for performance, and (2) optimized for EDP. In all scenarios, designs are constrained by a peak power budget of 40 W. Each scenario provides some interesting insights. The design optimized for single-threaded performance is the true indicator of ISA affinity, since only one application is in execution at a time, and each application is allowed to freely migrate between the cores. In the case of multi-programmed workloads, due to contention between applications, some applications may execute on ISAs of second preference. On designs optimized for energy efficiency, applications may choose to execute on ISAs that provide energy efficiency but do not maximize performance.


Figure 4.8: ISA affinity for different applications on designs optimized for (left to right): (a) Single-thread performance, (b) Multi-programmed workload performance, (c) Single-threaded workload EDP, (d) Multi-programmed workload EDP

Figure 4.8 shows that each application exhibits a different degree of ISA affinity, and most use all ISAs. In our experiments, benefits arise from a combination of ISA factors, some synergistic, some interacting negatively; trying to separate those effects is difficult and not always intuitive. However, we are able to make a few high-level observations. (a) No floating-point benchmark prefers execution on Thumb, due to floating-point emulation. (b) The floating-point benchmark lbm prefers execution on Alpha instead of x86-64, because Alpha requires about 34% fewer dynamic floating-point instructions. (c) The high-ILP benchmarks bzip2, hmmer, and sjeng prefer execution on Alpha over x86-64, because Alpha offers lower register pressure during phases of high instruction-level parallelism (see Section 4.1). (d) bzip2 prefers execution on the 32-bit Thumb ISA during phases that involve 32-bit unsigned integer arithmetic; Alpha incurs 27% more dynamic instructions to emulate 32-bit arithmetic using 64-bit registers. In such phases, x86-64 emerges as the ISA of second preference due to sub-register addressing. (e) The benchmarks libquantum, milc, and sphinx3 take advantage of x86-64's SIMD functionality during different execution phases, and revert back to Alpha/Thumb during the scalar phases.

Figure 4.9: Overall Speedup due to migration.

Not immediately clear from the results so far is to what extent the gains are a result of broad differences in feature sets (e.g., SIMD vs. no SIMD support) as opposed to more subtle differences. Further experiments show that the former are a surprisingly small component. For example, if we consider x86-64 with vs. without SSE, that heterogeneity provides a gain of only 1.3% over the best single-ISA configuration, significantly lower than the 15.8% speedup from a fully heterogeneous-ISA design. Finally, we note that there is little deviation in ISA affinity due to contention amongst multi-programmed workloads, or due to optimization for EDP instead of performance.

The prior results, primarily concerned with the discovery of the best core configurations, do not account for the cost of migration. We next account for the cost of the actual migrations encountered in an earlier experiment. That is, we incur the cost of migration between two ISAs whenever a phase change causes a new core/ISA combination to be preferred. Figure 4.9 shows the result of this experiment. Here we see that we sacrifice negligible performance (about 0.4-0.7%) to migration, meaning that virtually all of the performance gain from heterogeneous ISAs is retained. This comes from two factors. First, in most cases, migration overhead is very low. Second, phase changes are relatively infrequent, infrequent enough that even our few cases of high migration overhead are not significant.

4.6 Conclusion

This chapter explores the design space of heterogeneous-ISA chip multiprocessors. It shows that adding an extra axis of heterogeneity by considering multiple ISAs significantly increases the performance and energy efficiency of a heterogeneous processor. Specifically, a heterogeneous design that allows cores with distinct ISAs outperforms the optimal heterogeneous single-ISA design by as much as 20.8% and improves energy efficiency over the most efficient single-ISA design by 23%. This work opens the door for more diverse, and therefore more efficient, architectures, greatly expanding the tools available to hardware and system architects.

Acknowledgements

Chapter 4, in full, is a reprint of the material as it appears in the proceedings of ISCA 2014. Venkat, Ashish; Tullsen, Dean M., Harnessing ISA Diversity: Design of a Heterogeneous-ISA Chip Multiprocessor, 41st ACM International Symposium on Computer Architecture (ISCA), June, 2014. The dissertation author was the primary investigator and author of this paper.

Chapter 5

HIPStR: Security Defense via ISA Diversification

The previous chapter describes the design of a heterogeneous-ISA CMP that synergistically complements architectural heterogeneity with micro-architectural heterogeneity, and allows an application (compiled to each ISA as a fat binary) to dynamically identify the ISA of its preference and migrate execution at any point in time. In this chapter, we leverage this architecture to demonstrate significant new security benefits, and in particular, showcase its ability to defend against an evasive class of buffer overflow exploits called Return-oriented Programming (ROP) [RBSS12, Sha07].

5.1 Background and Motivation

Return-oriented Programming chains together short code snippets in the program (called gadgets) that end with a return or an indirect jump instruction, by overflowing the stack with a carefully constructed sequence of return addresses and other data required for the malicious computation. ROP has been shown to be Turing-complete for multiple ISAs and over a wide range of applications [RBSS12, BJFL11, BRSS08, CDD+10, CF09, Kor10].

Several exploit mitigation techniques have been described in the literature to thwart ROP. These mitigations can be broadly classified as (a) control flow integrity (CFI) techniques [ABEL05, CPM+98, DSW11, Eto03, KOAGP12, ZS13, ZWC+13, KSP+14] that constrain execution to a predefined control flow graph, or (b) randomization techniques [MBSN14, DLS+15, HNTC+12, KKP03, PPK12, PT03a, SKIH12, WMHL12] that enable a system to exist in one of many random states, such that it is hard to predict the exact location or manifestation of a gadget.

The success of randomization techniques is directly proportional to the entropy (number of randomizable states) they provide, and to the extent to which they are resistant to entropy reduction attacks [RMPB09, SPP+04, Wev04]. In their most powerful form, entropy reduction attacks called just-in-time return-oriented programming (JIT-ROP) [SMD+13] completely bypass all randomization using a single leaked memory disclosure. Therefore, it is critical to design robust and performance-efficient randomization techniques that provide entropy beyond the reach of state-of-the-art exploit generation.

In this work, we find that the low overhead of execution migration in a heterogeneous-ISA CMP makes it a natural candidate to repel such attacks, and we therefore propose a novel defense mechanism called Heterogeneous-ISA Program State Relocation (HIPStR), which performs dynamic randomization of run-time program state, both within and across ISAs. First, we leverage a heterogeneous-ISA CMP composed of an ARM core and an x86 core, and non-deterministically migrate execution of a vulnerable process between the two ISAs, in such a way that we render JIT-ROP attacks extremely hard to execute, while still retaining the inherent performance gains offered by the heterogeneous-ISA architecture. Consequently, we remove one of the last remaining “constants” available to the attacker: knowledge of the ISA the program is executing on. Second, we note that any program, including a return-oriented program, requires a certain amount of program state, in the form of registers and memory, to perform any computation in the target ISA.

To this end, we employ a dynamic binary translation engine on both cores that moves the run-time program state of an application to random, attacker-unknown locations, such that legitimate execution that preserves control flow is guaranteed to function as expected, but an attacker-crafted malicious exploit is highly unlikely to function as intended.

5.2 Architectural Overview

In this section, we lay out our security and performance guarantees and discuss strategies to harness and re-purpose the cross-ISA process migration techniques described in Chapter 3 as a security defense against ROP.

5.2.1 Security and Performance Guarantees

Security. One of the main goals of HIPStR is to defend against, and reduce the attack surface of, a wide array of attacks, including but not limited to return-into-libc, ROP, JOP, brute force attacks, JIT-ROP, and JIT spraying. For any program in execution, HIPStR dynamically randomizes the location of its program state (registers and stack objects) in order to render brute-force attacks infeasible. Furthermore, HIPStR has the ability to detect a potential break-in attempt via JIT-ROP, and when one is detected, probabilistically migrates execution to a different ISA, thereby imposing serious limitations on JIT-ROP attacks.

Performance. HIPStR makes several careful performance-related decisions in order to provide these security guarantees at an acceptable degradation in performance, and to outperform Isomeron, the only other JIT-ROP defense in the literature. First, unlike Isomeron, which diversifies execution at every function call and return, HIPStR migrates execution to a different ISA only when a potential security breach is detected, thereby enjoying full security benefits at virtually zero performance overhead due to migration. Second, HIPStR implements several optimizations, described in Section 5.4.4, that speed up the underlying dynamic binary translation framework.

63 5.2.2 Instruction Set Randomization

From a security standpoint, heterogeneous-ISA CMPs have two major advantages. First, ROP attacks are highly target-ISA dependent. An application that migrates between multiple heterogeneous-ISA cores executes instructions from different instruction sets. If a migration is forced upon execution of every ROP gadget, a successful attack would require chaining gadgets from different ISAs and yet producing a meaningful result (e.g., spawning a shell). Furthermore, if we make migration probabilistic, we remove the most fundamental assumption of the attacker: knowledge of what ISA the gadget will execute on. The second advantage is that execution migration in a heterogeneous-ISA CMP requires stack transformation. This especially constrains ROP gadgets to save all intermediate state in locations that are immune to run-time stack transformation (e.g., heap memory), thereby significantly reducing the attack surface.

Several fine-grained randomization techniques proposed in prior work have been shown to be broken by JIT-ROP [SMD+13], which exploits a single leaked memory disclosure to reconstruct the entire memory image of the process, and thereby bypass all randomization. Instruction Set Randomization in a heterogeneous-ISA CMP, however, severely inhibits JIT-ROP. This is because the decision to migrate execution to a different ISA is made probabilistically at run-time, thereby limiting an attacker's ability to chain gadgets reliably. While randomization across heterogeneous ISAs systematically removes the attacker's knowledge of what architecture a gadget executes on, in the next section we show how randomization within an ISA further extends the effectiveness of our technique.

5.2.3 Program State Relocation

Program State Relocation (PSR) comprises a set of dynamic binary code transformations that can be easily deployed in any JIT-based system. The major goal of program state relocation is to shuffle program state (registers and memory) such that it is always found at the expected location during legitimate execution, but is highly unlikely to be found by a ROP gadget that strays from the legitimate control flow path.


Figure 5.1: Program State Relocation Architecture

As shown in Figure 5.1, the PSR runtime operates in a classic just-in-time dynamic translation mode, processing one basic block at a time. For each basic block in translation, it gathers information about the parent function, which is available from static analysis. Irrespective of the point of entry, the PSR runtime constructs a relocation map for every function that is being entered for the first time. The relocation map specifies the randomized calling conventions to be followed while calling the function, along with a set of randomized register allocation and stack slot coloring rules to be followed within that function. In Section 5.4.1, we describe each transformation in detail. The figure shows an example PSR transformation: BB#2 in the code section is transformed into the basic block shown in the code cache. Note that PSR requires just a simple change in the addressing mode of each instruction in order to relocate its original operands dl and bl to al and [esp+0x80c] respectively, as indicated by the relocation map. Note that the return address is also moved to a random location on the stack.
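To make the relocation map concrete, the following C sketch (our own simplification; the struct layout and names are hypothetical, not the dissertation's code) shows how a translator might consult a per-function relocation map when rewriting an operand:

/* Hypothetical operand descriptor: an original register either stays in a
 * (possibly different) register or moves to a stack slot at a randomized
 * offset from the stack pointer. */
enum loc_kind { LOC_REG, LOC_STACK };

struct location {
    enum loc_kind kind;
    int reg;        /* valid when kind == LOC_REG   */
    int sp_offset;  /* valid when kind == LOC_STACK */
};

/* One entry of a function's relocation map. In the Figure 5.1 example,
 * %edx -> %eax (register-to-register) and %ebx -> [esp+0x80c]
 * (register-to-stack). */
struct reloc_entry {
    int orig_reg;
    struct location new_loc;
};

/* Rewrite one register operand of an instruction being translated: a
 * register-to-register relocation renames the operand, while a
 * register-to-stack relocation changes the addressing mode to a
 * stack-relative memory operand. */
static struct location relocate_operand(const struct reloc_entry *map,
                                        int map_len, int orig_reg) {
    for (int i = 0; i < map_len; i++)
        if (map[i].orig_reg == orig_reg)
            return map[i].new_loc;
    /* Registers absent from the map keep their original location. */
    return (struct location){ .kind = LOC_REG, .reg = orig_reg };
}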

As with classic DBT [SN05, Bel05], translation is performed until an indirect or conditional jump is reached, at which point control is transferred to the translated code in the code cache. If a translation for the jump target is not available (a code cache miss), the necessary transformations are applied as described above, and control is relinquished to the translated code. To ensure the code cache does not get compromised, we mandate that all return addresses stored on the stack point to the original code instead of the translated version. Furthermore, we make minor changes to the call and return instructions (macro-ops) to perform an extra-cycle lookup in a hardware-maintained Return Address Table (RAT), in order to translate the source-level address to its corresponding translated version before making the actual control transfer.

The effect of program state relocation is that an object previously found in a register may be relocated to a different register or to a random location on the stack, and vice versa. Due to the sheer number of stack locations available for relocating an object, the number of possible dynamic code transformations (entropy) explodes, thereby rendering classic brute force attacks such as Blind-ROP [BBM+14] practically impossible on a system implementing PSR. Moreover, since the transformations happen at run-time rather than load-time, a PSR system will always re-randomize upon a crash or reboot, further strengthening its effectiveness.

5.2.4 Heterogeneous-ISA PSR

Instruction Set Randomization and Program State Relocation each represent a strong defense independently. However, we find that there is significant synergy between the two techniques, and each amplifies the effectiveness of the other. Therefore, we combine them into one solid defense called “Heterogeneous-ISA Program State Relocation” (HIPStR). Figure 5.2 shows the high-level architecture of Heterogeneous-ISA PSR. The defense leverages, in this particular implementation, a heterogeneous-ISA CMP composed of a low-power ARM core and a high-performance x86 core, each running a virtual machine capable of performing program state relocation.

Figure 5.2: Heterogeneous-ISA Program State Relocation

To continue to reap the full performance/energy benefits of the heterogeneous-ISA CMP, we perform task migration only when an application phase change profits from migration to a different ISA. Additionally, we perform non-deterministic execution migration between the two ISAs only when the PSR runtime detects a possible attempt to compromise security. In our evaluation, we find that a code cache miss resulting from an indirect control transfer (including returns) is one of the key characteristics of a possible security breach. A code cache miss could result from one of two scenarios. In the legitimate execution scenario, the jump target is valid, but has not been translated yet (a compulsory miss), or its translation was previously evicted from the code cache (a capacity miss). In an attack scenario, the jump target points to a ROP gadget, and therefore a mapping does not exist in the PSR data structures. The PSR virtual machines make no effort to distinguish between the two scenarios. They instead migrate execution to a different ISA (with some probability) on every indirect control transfer that misses the code cache.
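A minimal sketch of this decision logic (our own rendering in C; the helper functions and the value of the migration probability are assumptions, not specified by the design):

#include <stdint.h>
#include <stdlib.h>

enum isa { ISA_ARM, ISA_X86 };

/* Assumed helpers provided by the PSR virtual machine. */
extern int  code_cache_lookup(uint64_t target);           /* 1 on hit */
extern void migrate_to(enum isa target_isa, uint64_t target);
extern void translate_block(uint64_t target);

static const double MIGRATE_PROB = 0.5;  /* tunable; not fixed by the design */

/* Invoked on every indirect control transfer (including returns). A code
 * cache hit proceeds normally; a miss (compulsory, capacity, or malicious;
 * the VM does not distinguish) triggers a probabilistic ISA switch. */
static void on_indirect_transfer(enum isa current, uint64_t target) {
    if (code_cache_lookup(target))
        return;                                   /* legitimate steady state */
    if ((double)rand() / RAND_MAX < MIGRATE_PROB)
        migrate_to(current == ISA_ARM ? ISA_X86 : ISA_ARM, target);
    else
        translate_block(target);                  /* translate on this core */
}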

Like any JIT system with a sufficiently large code cache, one would expect code cache misses to be infrequent once the application reaches a steady state of execution. Therefore, legitimate execution should experience no meaningful degradation in steady-state performance. Furthermore, we perform multiple translations, one for each ISA, when an indirect control transfer results in a compulsory miss, further reducing miss events.

In theory, an attacker could avoid migrating to a different ISA by using gadgets that begin at already-translated indirect jump targets or function call sites, for which the PSR virtual machines already have a mapping in their internal data structures. In our evaluation, we find that the number of such gadgets is insufficient even for the simplest execve exploit.

5.3 Assumptions and Threat Model

JIT Engine. We model a JIT engine, such as a browser environment, that performs dynamic binary transformations. Like most browser environments and other code randomization defenses [DLS+15, HNTC+12] that employ dynamic binary instrumentation, we assume that the JIT engine is checked for vulnerabilities and that its address space is protected by memory protection mechanisms such as code signing [MSD], sandboxing [AME+11], and Intel Software Guard Extensions (SGX) [Int14].

Fine-grained Randomization. We require no fine-grained randomization techniques (including ASLR) to protect our system, although they would only further strengthen it, since HIPStR is orthogonal to most existing defense mechanisms.

Complete Disclosure. We assume that the attacker has full knowledge of the inner workings of our defense mechanisms. We also assume that the attacker has unfettered access to the binary, the source code, and the complete control flow graph of the program in execution. Consequently, the attacker has a complete list of all potential ROP/JOP gadgets in the binary, and is capable of mounting attacks ranging from classic ROP [Sha07] to just-in-time code reuse (JIT-ROP) [SMD+13] attacks.

Just-in-time Code Reuse. In addition to the ability to snoop into a program's memory, we assume the program in execution exhibits one or more vulnerabilities that allow an attacker to (a) write to memory (by means of a stack- or heap-based overflow), and (b) read an arbitrary number of bytes from any memory location, using a single leaked memory disclosure.

Brute Force Attacks. We also assume that the system is susceptible to brute force attacks such as Blind-ROP [BBM+14]. To this end, we model a system as described by Shacham et al. [SPP+04], which assumes a program executing as a child thread whose parent re-spawns it upon a crash. We do not assume any defense mechanism that monitors the frequency of such events to detect ROP attacks. We instead use it as a metric to demonstrate the effectiveness of PSR against brute force.

5.4 Design and Implementation

In this section, we present the design and implementation details of Program State Relocation, discuss how our system behaves under different execution scenarios, and finally describe techniques to optimize our system for performance.

5.4.1 Program State Relocation

As discussed in Section 5.2, Program State Relocation is a set of transformations that relocate program state (registers and stack objects) within the same ISA. In our implementation, these transformations essentially randomize calling conventions, register allocation, and stack slot coloring. While most of these transformations can be accomplished by a mere change in the addressing mode, some transformations (e.g., procedure call/return) are slightly more involved and might require the insertion of a small number of move instructions.

Addressing Mode Transformation. Each instruction in a basic block is modified to access its source and destination operands at their new locations, as specified by the function's relocation map.

In most cases, this transformation is rather trivial and involves merely changing addressing modes. If the ISA does not expose a certain addressing mode, the PSR virtual machine emulates it using additional instructions and register temporaries. For example, owing to the variety of addressing modes in x86, we use additional instructions only when more than one operand of an instruction is relocated to memory.

Procedure Call Transformation. The PSR virtual machine instruments all procedure call instructions to perform argument relocation and register spill/restore, as specified by the callee's relocation map and the target ABI, respectively. As an optimization, the PSR virtual machine eliminates any redundant caller/callee register save and restore instructions. Furthermore, the virtual machine allocates 2 to 16 pages of randomization space on the stack in addition to the space already used by the callee's locals, temporaries, and spills, effectively providing 13 to 16 bits of entropy for every register or memory access. Note that return addresses are also relocated to random offsets, and therefore even a nop gadget that just performs a return incurs an entropy of at least 13 bits.

One of the biggest challenges with the procedure call transformation is to preserve the live-ins and live-outs across function call sites, and to correctly compute the caller/callee saves upon every function invocation. We take advantage of a single-basic-block look-ahead liveness analysis to accurately compute this information, and incorporate it into the randomized calling convention. A major source of ROP gadgets is the callee restore sequence, which pops a series of callee-save registers before returning to the caller. To circumvent this, we perform a randomized scatter of callee saves (spraying the callee saves to random locations on the stack) at the function call site, and a randomized gather after return, as sketched below.
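The scatter/gather idea can be illustrated as follows (our own code under assumed names; a real implementation would pick distinct slots and record them in the relocation map rather than in an array):

#include <stdlib.h>

#define NUM_CALLEE_SAVED 4   /* e.g., ebx, esi, edi, ebp on 32-bit x86 */

/* Scatter: at the call site, each callee-saved register is sprayed to an
 * independently chosen random slot inside the frame's randomization space,
 * instead of the contiguous push sequence a ROP gadget would expect.
 * Slot collisions are ignored here for brevity. */
static void scatter_callee_saves(long *frame, int frame_slots,
                                 const long regs[NUM_CALLEE_SAVED],
                                 int slots_out[NUM_CALLEE_SAVED]) {
    for (int i = 0; i < NUM_CALLEE_SAVED; i++) {
        slots_out[i] = rand() % frame_slots;
        frame[slots_out[i]] = regs[i];
    }
}

/* Gather: after the callee returns, registers are restored from those random
 * slots, so no conventional "pop; pop; ret" restore sequence exists for an
 * attacker to reuse. */
static void gather_callee_saves(const long *frame,
                                const int slots[NUM_CALLEE_SAVED],
                                long regs_out[NUM_CALLEE_SAVED]) {
    for (int i = 0; i < NUM_CALLEE_SAVED; i++)
        regs_out[i] = frame[slots[i]];
}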

Indirect Control Transfer. Like any DBT system [SN05, Bel05], the PSR virtual machine traps all indirect jumps into the translator. This ensures that there exist absolutely no indirect jumps translated into the code cache. As a software fault isolation measure, we terminate the process if we find an indirect jump target within the code cache's address range. Similarly, we disallow pointers to the code cache from existing as function pointers or return addresses on the stack. We handle function pointers in the same way as indirect jumps. For function returns, however, we always push the source return address on the stack, and take advantage of the Return Address Table (RAT), which contains a mapping from the source address (the address of the function call site in the native binary) to the target address (the address of the function call site in the code cache). The call macro-op in the processor is modified to update the RAT with the right mapping, while the return macro-op is modified to perform return address translation as an extra step with a 1-cycle penalty. Upon a RAT miss, we conclude that there was a code cache miss and trap into the translator for re-translation of that basic block.
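Functionally, the call/return macro-op changes amount to the following (a software sketch of hardware behavior, with an assumed direct-mapped organization and table size; the text does not specify these details):

#include <stdint.h>

#define RAT_ENTRIES 512   /* assumed size; the evaluation varies RAT sizing */

/* Maps a source-binary return address (what legitimately sits on the stack)
 * to its translated counterpart in the code cache. */
struct rat_entry {
    uint64_t src_addr;
    uint64_t cc_addr;
    int      valid;
};

static struct rat_entry rat[RAT_ENTRIES];

/* call macro-op: record the mapping for the return site. */
static void rat_update(uint64_t src_addr, uint64_t cc_addr) {
    struct rat_entry *e = &rat[src_addr % RAT_ENTRIES];
    e->src_addr = src_addr;
    e->cc_addr  = cc_addr;
    e->valid    = 1;
}

/* return macro-op: translate the popped source address with a 1-cycle
 * lookup; a miss is treated as a code cache miss and traps into the PSR
 * virtual machine for re-translation. */
static uint64_t rat_translate(uint64_t src_addr, int *miss) {
    struct rat_entry *e = &rat[src_addr % RAT_ENTRIES];
    if (e->valid && e->src_addr == src_addr) {
        *miss = 0;
        return e->cc_addr;
    }
    *miss = 1;
    return 0;
}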

5.4.2 PSR-aware Execution Migration

Our migration policy allows execution migration across heterogeneous ISAs in two specific scenarios. First, we migrate execution whenever an application phase change or the processor's current operating condition demands migration to another core. This is essential because it preserves the performance and energy advantages of a heterogeneous-ISA CMP. On the other hand, we also migrate execution, albeit probabilistically, when the PSR virtual machine suspects a security breach (specifically, when an indirect control transfer results in a code cache miss).

Prior work on heterogeneous-ISA execution migration suggests that we can be migration-safe at only 45% of basic blocks [VT14]. To support instantaneous migration, that work employs dynamic binary translation until a point of execution is reached where the stack can be safely transformed. This implies that a ROP exploit composed entirely of the remaining 55% of basic blocks could completely bypass instruction set randomization. To circumvent this, we re-purpose the original multi-ISA compilation infrastructure to support on-demand execution migration. In essence, we transform only those objects on the stack that are absolutely necessary for executing instructions until the next control transfer (jump, call, or return), and then revert back to the original ISA to execute the next basic block.

By doing so, we manage to be migration-safe 78% of the time. Furthermore, we completely avoid jumps to unintentional gadgets upon a code cache miss. We do this by means of an attack detection unit that disassembles from the last-seen nearest address (or function boundary) up to the program counter itself. This is a minor change to the PSR virtual machine, which already performs sophisticated liveness analysis. Finally, we ensure that our migration strategy is PSR-aware, which means we not only transform an object from one ISA form to another, but also fetch the object from its randomized location on one ISA and move it to its new randomized location on the other ISA.
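One plausible reading of this check, sketched in C (our illustration; insn_length() stands in for an x86 instruction-length decoder, which the text does not name):

#include <stdint.h>

/* Assumed decoder: returns the byte length of the instruction at p. */
extern int insn_length(const uint8_t *p);

/* Disassemble forward from the last known instruction boundary (or the
 * function start) toward the jump target. The target is legitimate only if
 * it lands exactly on a decoded instruction boundary; otherwise it points
 * into the middle of an instruction, i.e., an unintentional gadget. */
static int is_intended_target(const uint8_t *boundary, const uint8_t *target) {
    const uint8_t *p = boundary;
    while (p < target)
        p += insn_length(p);
    return p == target;
}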

5.4.3 Execution Scenarios

Legitimate execution. In a legitimate execution scenario, the procedure call transformation ensures that functions are always presented with relocated arguments. Furthermore, basic blocks are also presented with relocated live-ins, since execution starts at the intended entry point of the function, thereby preserving the integrity of legitimate program execution.

Stack Unwinding. Libraries such as libunwind rely on compiler-generated stack frame layout information to unwind the stack in exceptional scenarios such as setjmp and longjmp, and C++ exceptions. PSR seamlessly works with setjmp and longjmp due to the temporary register spill/restore performed as a part of the procedure call transformation. However, C++ exceptions and other debugger routines unwind the stack frame by frame, inspecting stack objects at each frame, until the unwind target is reached. Performing PSR on such routines might lead to inconsistent program state. To prevent such inconsistencies, the PSR virtual machine instruments these unwind routines to use the same relocation map as the function that owns the frame being processed. This guarantees that frame objects are always accessed from their appropriate relocated addresses, irrespective of the control flow. Furthermore, we force migration (and thus stack transformation) in the rare event that a longjmp is taken but the corresponding setjmp was performed on a different ISA.

ROP attack. In the event of a ROP attack, the buffer overflow itself happens at a relocated stack address. Therefore, there is no guarantee that the return address is overwritten with the gadget address. In case the attacker manages to successfully overwrite the return address, she will find that the gadget at that address fails to work as intended. This is because the PSR virtual machine dynamically transforms every instruction in that gadget to access data from randomized locations. Note that this is not just true for ROP attacks, but also holds for jump-oriented programming, v-table hijacking, and other variants. PSR inherently defeats return-into-libc because of the randomized calling conventions.

Crash/Reboot scenarios. To guarantee high quality of service and robustness, most servers re-spawn worker threads upon a crash or a reboot. Several brute force attacks such as Blind-ROP exploit this property of servers to mount repeated attacks until the server is compromised. These attacks typically bank on using information leaked in a previous attempt in order to reduce the overall time-to-attack. This is possible because a process randomized at load-time typically does not get re-randomized every time it spawns a thread. However, a PSR virtual machine performs randomization at run-time, which means we have the ability to re-randomize upon re-spawn. Note that this extends to the PSR virtual machines on both ISAs. Therefore, each time a worker thread re-spawns, the attacker is presented with a re-randomized version of the code cache on both ISAs.

5.4.4 Performance Optimizations

Machine Block Placement. As with any JIT engine, we take measures to carefully place translated basic blocks in the code cache, so that we incur as few conflict misses in the instruction cache as possible. To further improve instruction cache performance and fetch bandwidth, we align tight single-entry single-exit loops to cache block boundaries.

Branch Inlining and Superblock Formation. Next, we compose our translated basic blocks into superblocks that have a single entry point but multiple exit points.

We form superblocks in two steps. First, we fold branches whenever possible. This includes both direct unconditional branches and fall-through cases of conditional branches. Second, we avoid backward branches by inlining a direct branch instruction. Note that this results in code duplication, but it both improves locality in the instruction cache and reduces pressure on the branch predictor.

Global Register Cache. While PSR provides extremely high entropy, the sheer number of stack operations could potentially cause severe performance degradation. To optimize for performance, we use a global register cache that holds the most frequently used registers that have been relocated to stack objects. We mandate that this cache be only three entries long, so that we provide high performance in tight loops while at the same time preserving security guarantees by still spilling frequently to random locations.

PSR with a Register Bias. In this final optimization, we perform PSR with a register bias, i.e., at all times we ensure that at least three registers are relocated to other registers, albeit randomized for each function.
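The global register cache above might be approximated by logic like the following (a sketch under our own naming; the actual replacement policy is not specified in the text, so a simple hotness-based eviction is assumed, and entries are assumed to start out invalid):

#define REG_CACHE_ENTRIES 3

struct reg_cache_entry {
    int guest_reg;       /* guest register relocated to the stack (-1 if empty) */
    int host_reg;        /* host register currently caching it                  */
    unsigned long uses;  /* access count used as the hotness metric             */
};

static struct reg_cache_entry cache[REG_CACHE_ENTRIES];

/* Return the host register caching guest_reg, caching it on a miss. */
static int reg_cache_lookup(int guest_reg) {
    int victim = 0;
    for (int i = 0; i < REG_CACHE_ENTRIES; i++) {
        if (cache[i].guest_reg == guest_reg) {
            cache[i].uses++;
            return cache[i].host_reg;   /* hit: access stays in a register */
        }
        if (cache[i].uses < cache[victim].uses)
            victim = i;
    }
    /* Miss: evict the coldest mapping. The translator spills the victim back
     * to its randomized stack slot, then caches guest_reg in the freed host
     * register for subsequent accesses. */
    cache[victim].guest_reg = guest_reg;
    cache[victim].uses = 1;
    return cache[victim].host_reg;
}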

5.4.5 Prototype Implementation of PSR

Owing to the complex addressing modes in x86 and the possibility of unintentional gadgets (unaligned byte sequences that end with the byte 0xc3, the ret opcode), x86 not only exhibits greater susceptibility to vulnerabilities, but also presents greater challenges in terms of design and implementation. Specifically, we note that the attack space on ARM is 52X smaller than on x86 (measured using Galileo [Sha07] ported to ARM), since ARM enforces strict alignment of instructions. Moreover, the simplicity of the instruction set and the lower register pressure on ARM facilitate a smaller engineering effort and better opportunity for performance optimization. Therefore, we choose to implement our more complete PSR prototype, and do most of our PSR measurements, in x86. By doing so, we not only demonstrate high coverage (as the vast majority of the gadgets exist in the x86 code), but also report conservative estimates in both the performance and security evaluations.

Figure 5.3: Attack Surface of a Victim Program

5.5 Experimental Methodology

Security Evaluation. An important characteristic of a security attack is that it requires the victim program to expose a reasonable attack surface to exploit. In the context of ROP, the attack surface is represented by the number of gadgets available in a program that facilitate the construction of a successful exploit. The goal of every randomization defense is to reduce the attack surface (both in terms of availability and functionality), in order to limit the attacker's ability to construct meaningful exploits. In our evaluation, we not only subject our defense to state-of-the-art attack mechanisms [RBSS12, BJFL11, SMD+13, Moo10, SD97], but also measure its effectiveness against potential attacks that are computationally beyond the reach of today's attacker. For each attack, we report the degree to which the attack surface is reduced by our technique. Figure 5.3 depicts a victim program's attack surface for different types of attacks while running on our architecture. Table 5.1 introduces the symbols and definitions that we use to represent key elements of an attack surface through the rest of this section.

75 Table 5.1: Attack Surface: Symbols and Definitions

Symbol      Definition
G_ROP       Size of the attack surface for a classic ROP attack.
G_mod       Number of gadgets modified by PSR.
G_new       Number of gadgets introduced by PSR.
G_JIT-ROP   Size of the attack surface for a JIT-ROP attack.
G_JIT-ISA   Size of the attack surface for a JIT-ROP attack in Heterogeneous-ISA PSR.

We use the ‘Galileo’ algorithm described by Shacham et al. [Sha07] to mine a benchmark for every possible instruction sequence that ends with a return instruction. Since every exploit requires some program state in the form of either registers or stack objects, we designate as viable any gadget that successfully populates a register with an attacker-supplied value from the stack. We evaluate every gadget for its viability on a system without and with PSR, to measure the attack surface for four major classes of attacks: (a) classic ROP, (b) brute force, (c) JIT-ROP, and (d) tailored heterogeneous-ISA attacks designed to defeat HIPStR.

Owing to the sheer number of stack locations available for program state relocation, the number of possible manifestations of a gadget explodes. To evaluate the system against brute force attacks while keeping the experiment tractable, we analyze each gadget to gather data about every perturbation it produces on the state of the program, at a randomly chosen point in its execution. We then simulate a brute force attack by running this data through Algorithm 1. Cheng et al. [CZY+14] showed that the shortest aligned gadget chain generated by gadget compilers such as Q [SAB11] is 17 gadgets long, but to establish the effectiveness of PSR, we consider a much smaller four-gadget shellcode exploit that performs the system call execve(), which in theory should be easier to brute force by several orders of magnitude. Although the run-time nature of PSR transformations implies re-randomization upon a crash, to keep the experiment tractable we make the conservative assumption that a failed attempt does not result in re-randomization, thereby tipping the scales in the attacker's favor.

76 Algorithm 1 Brute Force Simulation

1: R = {r_1, r_2, ..., r_m}  /* Set of m registers to load. */
2: P = ∅  /* Set of successfully populated registers. */
3: X = ()  /* List of chosen gadgets for the attack. */
4: Y = ()  /* List of return address locations for chosen gadgets. */
5: A(g) is the randomized return address for gadget g

6: for all i = 1 to m do
7:     r_i is the register to populate
8:     find g_j in G s.t. g_j populates register r_i, does not clobber any register in P, and A(g_j) = min_{k=1..n} A(g_k)
9:     P = P + {r_i}
10:    X = X + {j}
11:    Y = Y + {A(g_j)}
12: end for

13: Let B be the number of attempts needed to populate all registers; then, for an average frame size of f,
14:    B = Y[0] + f·X[0] + n·f·Y[1] + n·f^2·X[1] + ... + n^3·f^4·X[3]
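Reading the recurrence on line 14 as the pattern Y[i] weighted by n^i·f^i and X[i] by n^i·f^(i+1) (our interpretation of the original, which elides the middle terms), the attempt count can be computed as:

/* Sketch (our own interpretation of line 14 of Algorithm 1): estimated
 * number of brute-force attempts B for m = 4 registers.
 * X[i] = index (in discovery order) of the gadget chosen for register i,
 * Y[i] = randomized return-address offset for that gadget,
 * n    = total number of gadgets mined by Galileo,
 * f    = average stack frame size in bytes (e.g., 8 KB).
 * double is used because the reported results reach ~10^34. */
static double brute_force_attempts(const double X[4], const double Y[4],
                                   double n, double f) {
    double B = 0.0;
    double coeff = 1.0;            /* n^i * f^i */
    for (int i = 0; i < 4; i++) {
        B += coeff * Y[i];         /* locate the return-address slot */
        B += coeff * f * X[i];     /* locate a viable gadget */
        coeff *= n * f;
    }
    return B;
}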

Algorithm 1 simulates a brute force attack to populate the four registers (eax, ebx, ecx, and edx) necessary to perform the execve() system call with attacker-provided values on the stack. On a system protected by PSR, all program state (registers and stack objects, including the return address) is relocated to a random register or stack location. Therefore, such an attack must brute force three independent variables in the system: (a) the gadget to execute, (b) the relative position(s) of data on the stack, as required by the gadget, and (c) the relative position of the return address on the stack, required to chain the next gadget. The attacker must brute force the gadget itself, because it is difficult to determine the potential viability of a gadget that will inevitably be subject to PSR. Therefore, we brute force every gadget discovered by the Galileo algorithm. The data for each gadget (the value to load into a register) and the return address both share the same stack frame. In an unsecured system their locations can easily be determined, but with PSR, they can lie anywhere within a stack frame. To maximize the success of a gadget, our attack sprays the data for the gadget over the entire stack frame and brute forces the location of the return address within the frame.

77 Table 5.2: Architecture detail for ARM and x86 cores

ARM core
  Frequency: 2 GHz            I cache: 32 KB, 2-way
  Fetch width: 2              D cache: 32 KB, 2-way
  Issue width: 4              ROB size: 20 entries
  LQ/SQ size: 16/16 entries
  Functional units: Int ALU (2), Int Mult/Div (1), FP ALU/Mult/Div (2)

x86 core
  Frequency: 3.3 GHz          I cache: 32 KB, 2-way
  Fetch width: 4              D cache: 32 KB, 2-way
  Issue width: 4              ROB size: 128 entries
  LQ/SQ size: 48/96 entries
  Functional units: Int ALU (6), Mult/Div (1), FP ALU/Mult/Div (2), SIMD (2)

We model our attack to populate one register at a time, spraying an entire stack frame with the data for one register, thereby increasing its chances of being read by a gadget. Since the attacker is assumed to have insight into the inner workings of PSR, we use a frame size of 8KB, at which PSR provides substantial security benefits at an acceptable degradation in performance. In our algorithm, we also account for register and stack clobbering, to ensure that a gadget does not destroy state previously established by an earlier gadget in the exploit. The algorithm stops searching for more viable gadgets as soon as it finds a four-gadget shellcode exploit.

It is worth noting that our method of simulating brute force loosely resembles the Blind-ROP algorithm [BBM+14], which finds viable gadgets when an attacker has no knowledge of the binary or source code. The key difference is that Blind-ROP relies on the target binary respecting traditional calling conventions and stack layout, whereas in a PSR-protected system, we can make no such assumptions. Therefore, a Blind-ROP attack on a system secured with PSR essentially translates into a version of our brute force attack, and will require a similar number of attempts to succeed.

Performance Evaluation. We use the SPEC CPU2006 integer and floating-point C benchmarks to evaluate the proposed defense. We exclude gcc and sjeng from this set because they perform dynamic memory allocation on the stack, either using the alloca library function or by passing variable-length array parameters. While our multi-ISA compilation and runtime infrastructure is capable of working with variable-size stack frames, our PSR implementation does not support this feature yet. All benchmarks are compiled using an LLVM-based multi-ISA compiler at the -O3 optimization level. To model a heterogeneous-ISA CMP, we use the gem5 [BDH+06] architectural simulator. The processor model of the ARM core is based on the low-power Cortex-A9, while the x86 core is modeled after the high-performance Intel Xeon. Table 5.2 shows the details of each core.

Correctness. To ensure that we preserve the semantics of a program at all times, we perform two types of sanity checks. First, we periodically examine the register and stack contents of a randomized program in execution, and compare them against the unrandomized version. Our test infrastructure has the ability to tune the frequency of this sanity check at the function, basic block, and instruction levels. Second, we ensure that the migration runtime has appropriately transformed the program's architectural state by comparing it against a checkpoint of the same program that has been executing on the migrated-to ISA from the time of its instantiation.

To evaluate steady-state performance and study the effect of various contributing factors, we simulate a portion of the program's execution at different optimization and entropy levels, with different code cache and hardware Return Address Table (RAT) sizes, as follows. We fast-forward execution for the first one billion instructions to skip initialization code, and perform cycle-accurate simulation for another one billion instructions [SPHC02], while running in the context of a PSR virtual machine and with heterogeneous-ISA migrations enabled. To evaluate the migration overhead at random execution points, we skip the initialization phase, fast-forward execution of each benchmark to a random checkpoint, and force migration to a different ISA; we report results averaged across ten random checkpoints.

79 #'-+$",&%"2"%(&,+" !)*#'-+$",&%"2"%(&,+" 2." 13" 1." 03" 0." /3" /."

'0(*'.'+/-!*#-1' 3"

/)"$,'+%'*!#&$.-' ."

Figure 5.4: Classic ROP Attack Surface

5.6 Evaluation

5.6.1 Security Evaluation

Classic ROP-Style Attacks. Figure 5.4 shows the extent to which PSR reduces the attack surface for classic ROP-style attacks, including return-into-libc, jump-oriented programming, and v-table hijacking. We observe that the sheer amount of randomization that each gadget undergoes guarantees that only a very small portion of the attack surface remains unaltered. To be precise, PSR reduces the attack surface of classic ROP (G_ROP) by an average of 98.04%. We note that although the remaining 1.96% is unobfuscated by PSR, the attacker has no way of determining beforehand which gadgets these are, since their randomized versions are only generated on execution, thereby rendering classic ROP attacks infeasible.

Brute Force Attacks. As illustrated in Figure 5.4, PSR modifies a majority of the gadgets that were previously available for ROP. These gadgets (G_mod), by virtue of PSR's transformations, have either been obfuscated such that they no longer perform the attacker-intended action, or have been eliminated entirely. The former are viable candidates for a brute force attack, since they perform useful computation, just not what an attacker expects of them. Even under the assumption of full memory disclosure, it is impossible to determine the transformations that will be applied to a gadget without executing it. Also viable for a brute force attack are any gadgets introduced by the randomization itself (G_new).

80 ('&(&)",$#" "#%$,+" !-*.&.&)%" "#%$,+" 241" 231" 211" 61" 51" 41"

'0(*'.'+/-!*#-1' 31"

/)"$,'+%'*!#&$.-' 1"

Figure 5.5: Brute Force Attack Surface

Table 5.3: Inferences from Brute Force Simulation

Benchmark    Randomizable Params (avg)   Entropy   Attempts (no reg-bias)   Attempts (reg-bias)
bzip2        6.76                        88        9.11 x 10^33             2.34 x 10^33
gobmk        6.53                        85        2.34 x 10^34             2.87 x 10^34
hmmer        6.69                        87        1.37 x 10^34             1.16 x 10^34
lbm          6.92                        90        3.33 x 10^34             3.90 x 10^34
libquantum   6.76                        88        1.05 x 10^34             6.45 x 10^33
mcf          6.69                        87        3.10 x 10^33             1.71 x 10^34
milc         6.46                        84        1.86 x 10^34             2.92 x 10^34
sphinx3      6.92                        90        1.14 x 10^34             8.68 x 10^33

The attack surface for brute force comprises every gadget available in the program, since there is no way to ascertain which ones will transform into viable gadgets. Note that the set of viable gadgets for brute force includes G_ROP, G_mod (transformed gadgets only), and G_new. As shown in Figure 5.5, we observe that a sizable portion (an average of 15.83%) of all gadgets is viable for brute force, and therefore requires thorough evaluation. Although some existing ROP defenses dismiss brute force as impossible, assuming that an operating system would detect multiple crashes or that the user would not re-run a crashing application, it has been proven that brute force remains a viable option if the application is vulnerable to repeated attacks [BBM+14, SPP+04]. Table 5.3 shows the results of our brute force simulation described in Section 5.5.

81 25" 24" 23" 21" 6" 5" 7-/(3,052'/)28( 4" ! &6$"#()4'(.(%51+'(*( !

Just-In-Time Code Reuse. Figure 5.6 shows the reduction in attack surface for each benchmark under both single-ISA and heterogeneous-ISA PSR. Owing to the just-in-time nature of PSR, only the steady-state program code that has already been randomized by PSR and is present in the code cache remains vulnerable to JIT-ROP. We find that the gadgets already randomized by PSR account for only 1.45% of all classic ROP gadgets and 1.92% of those viable for brute force, thereby severely constraining the attack surface. Figure 5.3 shows this reduction in attack surface for JIT-ROP. Note that a majority of gadgets are now undiscoverable, since they lie outside the code cache.

82 $,+##"#,! # ,! ##"#$,+# (''/# ,'/# +'/# *'/# )'/# '/# &%+!-*)0,!$#-.!,&"-.(*"',-

Figure 5.7: Percentage of Migration-Safe Basic Blocks

Although the attack surface has been considerably reduced, the surviving 294 gadgets could potentially be enough to mount a JIT-ROP attack. Recall from Section 5.2 that the PSR virtual machines suspect a security violation when an indirect control transfer (including returns) misses the code cache, and subsequently migrate execution to a different ISA, albeit probabilistically. Note that the PSR virtual machine can find in its internal structures only those indirect jump targets and function call sites that have been translated so far, and will incur a code cache miss for all others. Any surviving gadget that is viable for JIT-ROP must therefore avoid migration to a different ISA, and hence must begin at an already-translated indirect jump target or function call site. This imposes serious limitations on a JIT-ROP attack surface that has already been weakened by PSR. Figure 5.3 represents this as G_JIT-ISA, the true size of the attack surface for a JIT-ROP attack on heterogeneous-ISA PSR.

We find that, of the 294 gadgets surviving PSR, 267 gadgets cause a suspected security breach in the PSR virtual machine, thereby triggering a probabilistic migration to a different ISA. This leaves the attacker with only 27 gadgets, on average, that do not flag a violation and could potentially bypass migration to a different ISA. Furthermore, as shown in Figure 5.7, we note that our infrastructure is capable of being migration-safe an average of 78% of the time, in either direction. This implies that gadgets in the remaining 22% of the basic blocks are still viable candidates for JIT-ROP. However, we find that these remaining gadgets are insufficient to even construct a four-gadget shellcode exploit, let alone complex exploits.

83 !+)'%*)(" %,%*)&%(%)-+.!$9" "$#" 2135" 623" "$#;!+)'%*)(" !"$,#" 367" 239" 75" 43" 27" 9" !*/-+,0(1)*(")/.2( 5" 3" 2" 2" 3" 4" 5" 6" 7" 8" 9" :" 21" 22" 23" %*'/((+&('!$'%/(#(!)*(

Figure 5.8: Entropy Comparison

Tailored attacks. To further explore the synergy of the HIPStR components, we compare the combined entropy of HIPStR against each of its two individual components (PSR and heterogeneous-ISA migration) alone, as well as against a hybrid of Isomeron and our PSR approach (see Figure 5.8). We make two important observations. First, any system that implements only Isomeron or only heterogeneous-ISA migration suffers from extremely low entropy, which cannot be amortized unless the gadget chain is long enough. For example, as few as one in 256 attempts will succeed against a gadget chain that is 8 gadgets long. Second, the just-in-time nature of PSR inherently enables re-randomization upon a crash. Therefore, the attacker is presented with a re-randomized version of the code cache on both ISAs for every brute force attempt. Although PSR by itself is susceptible to JIT-ROP, this characteristic makes brute-forcing a JIT-ROP attack significantly harder on a system that implements both diversification and PSR.

This might suggest that a system implementing a combination of PSR and Isomeron is as effective as HIPStR. However, we note that entropy as a metric does not by itself completely capture the effect of randomizing the instruction sets. To observe this effect, we turn to tailored attacks that bypass both same-ISA (Isomeron) and heterogeneous-ISA diversification.

84 3246" 734" 478" 34:" 86" 54" !+)'%*)(" 38"

)"%)&21( "$#" :" %,%*)&%(%)-+/!$<" 6" "$#"<"!+)'%*)(" 4" !"$,#" 7/&$2&%(-5,#&0(.'(1506**6*-)(( 3" 2" 2.3" 2.4" 2.5" 2.6" 2.7" 2.8" 2.9" 2.:" 2.;" 3" 5*6&01*($"3.-(!0.#"#*+*28( Figure 5.9: Effect of diversification on attack surface attacks that bypass both same-ISA (Isomeron) and heterogeneous-ISA diversification. An attacker who is aware of the diversification could construct exploits that interleave gadgets from both the original and the diversified versions. For example, on HIPStR, one could craft an exploit that alternates gadgets between x86 and ARM, such that the all meaningul computation is performed by x86 gadgets, while the ARM gadgets are all nops that switch execution to x86 without clobbering already established state. Another example of such a tailored attack would be to use those gadgets that are unaffected by diversification, i.e, those gadgets that perform the same intended malicious operation regardless of what ISA/software version they execute. Figure 5.9 examines the JIT-ROP attack surface of each technique in the face of such tailored attack mechanisms. When the probability of diversification (the probablility of switching ISAs or program variants between gadgets) is zero, such an attack surface will include all gadgets present in the code cache. As the diversification probability increases, the attacker would find it increasingly harder to brute force a JIT-ROP attack. We note that the system implementing both PSR and Isomeron is as effective as HIPStR when the diversification probability is zero, but the two systems rapidly diverge in their effective- ness as the probability increases. In fact, at a probability of one, at which we diversify execution for every gadget, HIPStR manages to have an average of just two surviving gadgets in its attack

85 !#"0 4# !#"0 5# !#"0 6# /$,.&#!&+'*+($)%&# 433<# 93<# 83<# 73<# 53<# 3<# #%!*,#'(#)$()&!'"#'

Figure 5.10: Performance at different optimization levels

Table 5.4: Performance Optimizations for PSR

Level Optimizations -O0 No Optimization. -O1 Machine Block Placement, Branch Inlining and Superblock Formation. -O2 -O1 optimizations, Global Register Cache. -O3 -O2 optimizations, PSR with a register bias. surface, while the attack surface of the system implementing PSR and Isomeron still comprises hundreds of gadgets. It is worth noting here that on HIPStR, we failed to find any surviving gadgets in five out of the eight applications we benchmark, thereby completely thwarting such tailored attacks. Aside from these eight SPEC applications, we also evaluate the effectiveness of HIPStR on the network facing daemon httpd, a classic target of ROP attacks. Our evaluation shows that the attack surface of httpd is composed of 169,272 gadgets. PSR successfully obfuscates 99.7% of the gadgets, requiring 1.8x1039 attempts to brute-force (still computationally infeasible). Furthermore, while 84 gadgets are available for JIT-ROP, only two survive heterogeneous-ISA migration. These are insufficient to generate even the simplest shellcode exploit. In our evaluation, we find that it is more likely to find large gadgets that populate multiple registers at a time, and are unaffected by diversification on the same ISA, rather than different ISAs. Note that our experiments only measure the number of surviving gadgets in the face

86 )! "!+# )! "!&*# )! "!('# )! "!*)# &%%.# +%.# *%.# )%.# '%.# %.# #%!*,#'(#)$()&!'"#'

Figure 5.11: Effect of additional stack memory overhead

Table 5.5: Entropy Levels for PSR

Level Description -S8 Stack frame size is expanded by 8KB. -S16 Stack frame size is expanded by 16KB. -S32 Stack frame size is expanded by 32KB. -S64 Stack frame size is expanded by 64KB. of such tailored attacks. We expect that chaining together these gadgets to craft an exploit payload is a much more daunting task, because not only is brute-forcing such an attack under PSR extremely hard, but heterogeneous-ISA migration involves stack transformation that could potentially clobber the exploit payload on the stack. The strength of PSR lies in its ability to defeat brute force attacks, while simultaneously reducing the attack surface. It amplifies the entropy of heterogeneous-ISA migration making a brute force attack on the combined defense infeasible. Conversely, heterogeneous-ISA migration has the ability to shield PSR from a JIT-ROP attack attempting to bypass randomization. Together, they form a formidable defense.
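To see quantitatively why the combination is so much stronger than either component, the sketch below (our own illustration, not part of the original evaluation) compares the brute-force search space for a gadget chain under migration-only diversification (roughly one bit per gadget, matching the 1-in-256 figure for an 8-gadget chain) against diversification combined with PSR's roughly 87 bits per gadget:

    import math

    def log10_attempts(chain_len, bits_per_gadget):
        # log10 of the expected brute-force search space for the whole chain
        return bits_per_gadget * chain_len * math.log10(2)

    for n in (4, 8, 12):
        solo = log10_attempts(n, 1)        # ISA migration alone: ~1 bit per gadget
        combined = log10_attempts(n, 88)   # migration plus PSR: ~88 bits per gadget
        print(f"{n}-gadget chain: ~10^{solo:.1f} vs ~10^{combined:.0f} attempts")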

5.6.2 Performance Evaluation

Steady State Performance. Figure 5.10 shows the steady state performance overhead of PSR and the effect of each performance optimization. We do not gain a significant performance boost from the O1 level of optimizations that we apply to improve instruction cache performance. However, we observe an average performance improvement of 13% due to a small global register cache that holds just 3 registers. This is, in large part, due to the fact that short, tight loops operate on a small set of input registers but dominate most of a program's execution. Finally, we find that operating PSR in a register-bias mode further improves performance by an average of 5.5%, reducing the overall performance degradation from native execution to just 13.14%.

Since PSR relies on large stack frames to provide higher entropy, it is important to measure the performance effects of the extra stack memory used. Figure 5.11 shows the steady state performance overhead at different entropy levels. Performance drops by an average of only 2.96%, even after expanding individual stack frame sizes by as much as 64KB. This is because the stack frames become very sparse, and large empty spaces between items (e.g., larger than a cache line) do not place pressure on the cache.

PSR takes advantage of a return address table (Section 5.2) to protect the code cache from becoming compromised. Figure 5.12 shows the effect of the RAT size on performance. We incur an average performance overhead of 0.37% even with a RAT as small as 32 entries. In fact, we observe no noticeable degradation in performance with a RAT that holds just 512 entries. This is because the distance between a call and a return instruction is generally so short that we seldom incur a RAT miss.

Figure 5.12: Effect of RAT size on Performance
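The RAT behavior measured above can be pictured with a minimal sketch. The structure below is our own illustration (the class name and LRU eviction policy are assumptions; the actual mechanism is defined in Section 5.2); it shows why a small bounded table suffices when calls and returns are close together:

    from collections import OrderedDict

    class ReturnAddressTable:
        """Bounded map from native return addresses to their code-cache
        translations, with LRU eviction at a fixed capacity."""

        def __init__(self, capacity=512):
            self.capacity = capacity
            self.entries = OrderedDict()

        def on_call(self, native_ret, translated_ret):
            # Record the translation when the call site is emitted.
            self.entries[native_ret] = translated_ret
            self.entries.move_to_end(native_ret)
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # evict the least recently used

        def on_return(self, native_ret):
            # Hit: dispatch directly to the translated target.
            # Miss (None): fall back to the slower translation lookup.
            return self.entries.get(native_ret)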

Figure 5.13: Migration Overhead

Figure 5.14: Effect of code cache size on Performance

Migration Overhead. Figure 5.13 shows the state transformation overhead due to heterogeneous-ISA process migration. We report an average migration overhead of 909 microseconds when migrating from ARM to x86, and 1.287 ms in the other direction, for an overall baseline migration overhead of 0.32% from the migrations initiated to improve performance. Our migration policy ensures that we switch ISAs only when a program's ISA preference changes as it enters a new program phase, or when the PSR virtual machine suspects a security breach, i.e., when we encounter an indirect control transfer that results in a code cache miss. Figure 5.14 shows the effect of code cache size on migration overhead. We record zero indirect control transfers that miss a code cache as small as 768 KB, resulting in no measurable overhead for security-induced migrations. In steady state execution, it is highly unlikely that we return to a function whose translation has been evicted, or make an indirect jump or function pointer call to an evicted region. For example, gobmk makes 65,746 function pointer calls within a span of one second, but none miss the code cache.

Figure 5.15: Performance Comparison with Isomeron

Finally, Figure 5.15 compares the performance of HIPStR against Isomeron, the only other technique that defends against JIT-ROP, as well as against a system that implements both PSR and Isomeron, for the six common applications that we benchmark. Since Isomeron invokes execution path diversification for each function call and return, it incurs a higher performance overhead. In fact, its authors note that this degradation in performance is expected, since their program shepherding renders CPU optimizations like branch prediction ineffective [DLS+15]. As the diversification probability increases, HIPStR performance decreases only slightly with the smaller code cache, but still provides higher performance than both Isomeron and the combination of PSR and Isomeron. Overall, HIPStR outperforms Isomeron by an average of 15.6% while simultaneously providing significantly higher entropy, and thereby higher resistance to brute force, JIT-ROP, and tailored attacks that bypass diversification.

5.7 Conclusion

Heterogeneous-ISA CMPs have been shown to provide significant performance and energy gains. In this chapter, we showcase their security benefits through a novel defense called HIPStR

(Heterogeneous-ISA Program State Relocation) that combines dynamic randomization of program state with non-deterministic process migration between heterogeneous ISAs. To demonstrate the full potential of the proposed security defense, we subject it to a series of malicious ROP-style attacks, including classic return-into-libc, jump-oriented programming, simple brute force, and just-in-time code reuse. Consequently, we make the following major observations:

• The sheer amount of entropy provided by our defense renders brute force attacks such as Blind-ROP [BBM+14] practically impossible for current or even distant-future microprocessors.

• Our ability to perform seamless and instantaneous execution migration across heterogeneous ISAs significantly inhibits just-in-time code reuse attacks, forcing an attacker to construct heterogeneous-ISA exploits that are extremely hard to execute, as shown in Section 5.6.1.

• Our performance-focused migration policy and our optimization techniques help us outperform Isomeron [DLS+15], the only other JIT-ROP defense, by an average of 15.6%, while simultaneously providing greater security guarantees against brute force, JIT-ROP, and more tailored attacks geared to break execution path diversification.

• We defend against all variants of classic ROP-style attacks [RBSS12, BJFL11, Moo10, SD97], and reduce their overall attack surface to such an extent that it is difficult to construct a four-gadget shellcode exploit, let alone achieve Turing-completeness.

Acknowledgements

Chapter 5, in full, is a reprint of the material as it appears in the proceedings of ASPLOS 2016. Venkat, Ashish; Shamasunder, Sriskanda; Shacham, Hovav; Tullsen, Dean M., HIPStR: Heterogeneous-ISA Program State Relocation, 21st ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), April, 2016. The dissertation author was the primary investigator and author of this paper. The security evaluation portion of Chapter 5 (Section 5.6.1) is joint work with Sriskanda Shamasunder. The dissertation author is also particularly grateful to Arvind Krishnaswamy, Koichi Yamada, and Rajan Palanivel for their numerous suggestions during the early days of PSR's formulation, and for their contributions to United States Patent US009135435B2 [VKYP15].

Chapter 6

Composite-ISA Architectures

Chapter 4 demonstrates substantial advantages from the synergistic combination of microarchitectural [KFJ+03, KTR+04, KTJ06] and ISA heterogeneity [DVT12, VT14, VSST16]. The advantage of varying the ISA is that we can tune not just microarchitectural features (cache sizes, window sizes, etc.), but also features such as register count and width, vector support, addressing mode availability, etc., to the needs of the executing code, improving single-thread performance, multithreaded workload throughput, and overall energy efficiency. However, despite their potential for greater performance and energy efficiency, the deployment of heterogeneous ISAs on a single chip is non-trivial due to a number of practical concerns. First, heterogeneous-ISA multicores incur particular overheads owing to fat binaries and a complex process migration scheme that includes binary translation and stack transformation. Second, the integration of multiple vendor-specific commercial ISAs on a single chip is fraught with significant licensing, legal, and verification costs and barriers.

This chapter significantly alleviates these concerns by presenting a new design paradigm, the composite-ISA architecture, which can recreate the effects of multi-ISA heterogeneity using a single composite ISA. The composite ISAs are derived by leveraging a large superset ISA that resembles Intel x86 and offers customization along five axes of diversity: (1) register depth (8 to 64 programmable registers), (2) register width (32 vs 64-bit), (3) instruction complexity (1:1 vs 1:n micro-op encoding), (4) predication (full vs partial), and (5) specialized support (vector vs scalar). This chapter also features a suite of efficient code generation techniques that can take advantage of the underlying composite-ISA features, and migration strategies to switch between composite ISAs with overlapping feature sets in a programmer-transparent manner. In this way, we can achieve far greater ISA diversity, enabling the full performance and energy benefits of a multi-ISA design without the issues of multi-vendor licensing, fat binaries, and complex migration schemes. Furthermore, this work features a comprehensive analysis of the hardware implications of the custom feature set options, including a full synthesized RTL design of multiple versions of the x86 decoder.

6.1 ISA Feature Set Derivation

In this section, we describe our superset ISA. It resembles x86, but with an additional set of features that can be customized along 5 different dimensions: register depth, register width, opcode and addressing mode complexity, predication, and data-parallel execution. While we construct the superset ISA using extensions and mechanisms completely consistent and compatible with the existing Intel x86 ISA, we note that greater levels of customization could be achieved by creating a new (superset) ISA from scratch. We start with x86 because it not only already employs a large set of the features we want, but also has a clear history and process for adding extensions. We further study, in this section, the code generation impact, processor performance, power, and area implications of each dimension.

Register Depth. The number of programmable registers exposed by the ISA to the compiler/assembly programmer constitutes an ISA's register depth. The importance of register depth as an ISA feature is well established, due to its close correlation with the actual register pressure (number of registers available for use) in any given code region [CAC+81, AH82, CH90, CFR+91]. While most compiler intermediate representations, including GCC's Register Transfer Language [DF80, SD09] and LLVM's bitcode [LA04a], allow for a large (potentially infinite) number of virtual registers, the number of architectural registers is limited, resulting in spills and refills of temporary values to memory, and limiting the overall instruction-level parallelism [AS17, Akr17].

Register depth not only affects efficient code generation, but also significantly impacts machine-independent code optimizations due to (register-pressure-sensitive) redundancy elimination and rematerialization (re-computation of a temporary value to avoid spills/refills) [BCT92, BGM+89, GB99, CS98, ZA03]. For example, decreasing the register depth from 32 to 16 registers in our custom feature sets results in an increase of 3.7% in stores (spills), 10.3% in loads (refills), 3.5% in integer instructions, and 2.7% in branch instructions (rematerialization) on the SPEC CPU2006 benchmarks compiled using the methodology described in Section 6.2. Furthermore, the backend area and power are strongly correlated with the ISA's register depth, potentially impacting the nature and size of microarchitectural structures such as the reorder buffer and the physical register file. In processors that support register renaming (e.g., dynamically scheduled processors), power and area will scale with the physical register file size rather than directly with the architectural register depth; the physical register file size, however, itself scales with the architectural register depth. In our superset ISA, we allow the register depth to be customized to 8, 16, 32, and 64 registers. A composite-ISA design that customizes each core with a different register depth alleviates the register pressure of impacted code regions by migration to a core with greater register depth, and at the same time saves power by enabling smaller microarchitectural structures in other cores.

Register Width. Like register depth, the register width of an architecture impacts performance and efficiency in several different ways. First, wider data types imply wider pointers, allowing access to a larger virtual memory and avoiding unnecessary memory mapping of files. However, wide pointers potentially expand the cache working set when stored in memory as part of an aggregate object, thereby negatively impacting performance on applications that otherwise have smaller working sets [ARM].

Second, wider registers can often be addressed as individual sub-registers, enhancing the overall register depth of the ISA. Most compilers' register allocators take advantage of sub-registers (e.g., eax, ax, al) and perform aggressive live range splitting and sub-register coalescing [CS98, BGM+89, GA96, LGAT00, BDEO97, LA04a]. Third, emulating data types wider than the underlying ISA/core's register width not only requires more dynamic instructions, but could also potentially use up more registers and thereby adversely impact register pressure. Finally, wider registers imply larger physical register files in the pipeline, which impacts both core die area and overall power consumption. In our experiments, doubling the register width from 32 bits to 64 bits can impact processor power by as much as 6.4% across different register depth organizations. Our superset ISA supports both 32-bit and 64-bit wide registers like x86, but we modify the instruction encoding to eliminate any restrictions on the addressing of a particular register, sub-register, or combination thereof.

Instruction Complexity. The variety of opcodes and addressing modes offered by an instruction set controls the mix of dynamic instructions that enter and flow through the pipeline. Incorporating a reduced set of opcodes and addressing modes into the instruction set can significantly simplify the instruction decode engine, if chosen carefully. In particular, we strive for a 1:1 instead of a 1:n macro-op to micro-op translation, saving as much as 9.8% in peak power and 15.1% in area. However, for some code regions, such a scheme could increase the overall code size, potentially impacting both the overall instruction cache accesses and instruction fetch energy.

To derive such a reduced feature set, we carve out a subset of opcodes and addressing modes from our superset ISA that can be implemented using a single micro-op, essentially creating custom cores that implement the x86 micro-op ISA, albeit with variable-length encoding. The reduced feature set, called microx86 in this work, adheres to the load-compute-store philosophy followed by most RISC architectures. As a result, we could view this option as RISC vs CISC support. While one could conceive a more aggressive low-power implementation of microx86 that implements fewer opcodes and further recycles opcodes for a more compact representation [LAR+15, BBWA07, BABJW11, BABW14, Bau17], we keep all the same opcodes, and thus follow x86's existing variable-length encoding and 2-phase decoding scheme. This not only maintains consistency with existing implementations of x86, but also spares us the binary translation costs associated with multi-vendor heterogeneous-ISA designs. However, it does mean we cannot completely replicate the instruction memory footprint of a theoretically minimal representation.

Predication. Predication converts control dependences into data dependences in order to eliminate branches from the instruction stream, and consequently takes pressure off the branch predictor and associated structures [AKPW83, MLC+92, MHB+94, MHM+95, AHM97], while also removing constraints on the compiler's instruction scheduler. Modern ISA implementations of predication can be classified into three categories: (a) partial predication, which allows only a subset of the ISA's instructions to be predicated, (b) full predication, which allows any instruction to be predicated using a predefined set of predicate registers, and (c) conditional execution, which allows any instruction to be predicated using one condition code register. The x86 ISA already implements partial predication via conditional move instructions that are predicated on condition codes. In this work, we add full predication support to our superset ISA, allowing any instruction to be predicated using any available general-purpose register, via the if-conversion strategy described in Section 6.2. While predication eliminates branch dependences, aggressive if-conversion typically increases the number of dynamic instructions, thus placing more pressure on the instruction fetch unit and the instruction queue. In our custom feature sets that offer predication, we observe an average increase of 0.6% in the number of dynamic instructions, with a reduction of 6.5% in branches.

Data-Parallel Execution. Most modern instruction sets offer primitives to perform SIMD operations [Fly72, Dun90, ARM, Int] to take advantage of the inherent data-level parallelism in specific code regions. The x86 ISA already supports multiple feature sets that implement a variety of SIMD operations. We include the SSE2 feature set in our superset ISA, which can compute on data types as wide as 128 bits, as implemented in the gem5 simulator [BDH+06]. Furthermore, we constrain our microx86 implementations to exclude SSE2, since more than 50% of SIMD operations rely on a 1:n encoding of macro-ops to micro-ops. In our composite-ISA design, cores that do not implement SSE2 save 7.4% in peak power and 17.3% in area. They execute a precompiled scalarized version of the code when available, and in most cases migrate code regions that enter intense vector activity to cores with full vector support.

In summary, we derive 26 different custom feature sets along the five dimensions described above, as shown in Figure 6.1. We exclude full predication from our 32-bit feature sets that have fewer than 16 registers, since they already suffer from extremely high register pressure. Similarly, we constrain our 64-bit ISAs to have a register depth greater than or equal to 16.

Figure 6.1: Derivation of Composite Feature Sets from a Superset ISA. (The figure depicts a tree that branches by instruction complexity and SIMD support (x86+SSE vs microx86), then register width (32-bit vs 64-bit), then register depth (8/16/32/64), and finally predication (partial vs full).)

Figure 6.2: Instruction Mix (normalized to x86-64) for SPEC2006

Figure 6.2 shows the dynamic instruction (micro-op) breakdown for the SPEC CPU2006 benchmarks on three different custom ISAs: (a) the 32-bit version of microx86 with a register depth of 8 and no additional features (the smallest feature set in our exploration), (b) the x86-64 ISA with SSE and no other customizations, and (c) the superset ISA, which implements all of the features described above. Due to the high register pressure in microx86-32, it incurs an average of 28% more memory references than x86-64 and an overall expansion of 11% in the number of micro-ops. Also compared to x86-64, we find that the superset ISA, owing to the diverse set of custom features added, sees an average reduction of 8.5% in loads (spill elimination), 6.3% in integer instructions (aggressive redundancy elimination), and 3.2% in branches (predication).
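The register-pressure trend quantified above can be reproduced with a toy experiment. The sketch below is a deliberately simplified linear-scan spiller over synthetic live ranges (not our actual LLVM-based toolchain); it merely illustrates how shrinking register depth converts live values into spills:

    def count_spills(live_ranges, num_regs):
        """Greedy linear scan: each range is (start, end); a value spills
        when no register is free at the start of its live range."""
        active, spills = [], 0
        for start, end in sorted(live_ranges):
            active = [e for e in active if e > start]  # expire dead ranges
            if len(active) < num_regs:
                active.append(end)
            else:
                spills += 1                            # no free register
        return spills

    # Synthetic overlapping live ranges for a hot loop body (illustrative only).
    ranges = [(i, i + 40) for i in range(0, 80, 2)]
    for depth in (8, 16, 32, 64):
        print(f"{depth:2d} registers -> {count_spills(ranges, depth)} spilled values")

With these made-up ranges, the 8- and 16-register configurations spill while the deeper ones do not, mirroring the spill/refill increases reported above when moving from 32 to 16 registers.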

6.2 Compiler and Runtime Strategy

One of the major goals of this chapter is to provide the benefits of ISA diversity without incurring binary translation and state transformation costs, by leveraging custom feature sets of the same ISA. In this section, we describe our compilation strategy that generates code to efficiently take advantage of the underlying custom feature sets, and our runtime migration strategy that allows code regions to seamlessly migrate back and forth between different custom feature sets, without the overhead of full binary translation and/or state transformation.

6.2.1 Compiler Toolchain Development

Compilation to a superset ISA or a combination of custom feature sets allows different code regions to take advantage of the variety of custom feature sets implemented by the underlying hardware. For example, code regions with high register pressure could be compiled to execute on a feature set with greater register depth, and code regions with too many branches could be compiled to execute predicated code. We leverage the LLVM MC infrastructure [LA04a] to efficiently encode the right set of features supported by the underlying custom design and propagate it through the various instruction selection, code generation, and machine-dependent optimization passes. We further take advantage of the MC code emitter framework to encode feature sets, such as register depth and predication, that require an extra prefix.

In order to convert the existing x86 backend to that of the superset ISA, we first include the additional 48 registers in the ISA's target description and associate code density costs with them. This enables the register allocator to always prioritize the allocation of a register that requires fewer prefix bits to encode. Furthermore, we allow each of these registers to be addressed as a byte, a word, a doubleword, and a quadword register, thereby smoothly blending into the register pressure tracking and subregister coalescing framework.

We next implement full predication in x86 by re-purposing the existing machine-dependent if-conversion framework of LLVM, which performs if-conversion in three scenarios: (a) diamond, when a true basic block and a false basic block split from an entry block and rejoin at the tail, (b) triangle, when the true block falls through into the false block, and (c) simple, when the basic blocks split but do not rejoin, such as at a break statement within a conditional. For every such pattern in the control flow graph, if-conversion is performed if determined profitable. The profitability of if-conversion is based on branch probability analysis, the approximate latency of the machine instructions that occur on each path, and the configured pipeline depth. We add the if-conversion pass as a pre-scheduling pass for the X86 target; this allows the scheduling pass to take full advantage of the large blocks of unscheduled code created by if-conversion. (A simplified sketch of the diamond case appears at the end of this subsection.)

To implement microx86, we modify the existing x86 backend to exclude all opcodes, addressing modes, and prefixes that decode into more than one micro-op during machine instruction selection. While LLVM's instruction selector, for the most part, identifies a replacement ld-compute-st combination for an excluded addressing mode, certain IR instructions, such as tail jumps and tail call returns, require explicit instruction lowering.

For each code region (where we roughly define a region as the set of code that dominates a simpoint phase), the compiler must now make a global (or regional) decision about which features to use and which to skip. Further, the compiler should make these decisions with some knowledge of the features of the cores of the processor on which the code will run. Because we examine so many core combinations in our design space exploration, we assume the compiler makes good decisions about which features to include in compilation in all cases. So despite the fact that the features included in compilation are not static for a region (across experiments), we can still see clear trends in code affinity for features. For example, the highly register-pressure-constrained benchmark hmmer is consistently compiled to use all 64 registers across all code regions. In contrast, only one phase of the benchmark bzip2 is compiled with a register depth of 64, with the remaining seven regions typically compiled with a register depth of 32. Similarly, the compiler employs predication in four regions of the benchmark milc, but does not find it profitable in two other regions of the same benchmark. Furthermore, due to the high register pressure and lack of free registers in hmmer, the compiler harnesses the full suite of complex addressing modes offered by x86 and seldom employs predication. These decisions propagate through the rest of the optimization passes, resulting in carefully optimized code for the underlying feature set.
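As referenced above, the diamond case can be sketched as a toy pass. The instruction tuples and the one-line profitability test below are our simplifications of what the LLVM machine-level if-converter actually reasons about (branch probabilities, per-path latencies, and pipeline depth):

    def if_convert_diamond(cond_reg, true_blk, false_blk,
                           mispredict_penalty=14, inst_latency=1):
        """Replace a branch plus two blocks with one predicated stream
        when the extra fetched instructions cost less than a mispredict."""
        extra_cost = (len(true_blk) + len(false_blk)) * inst_latency
        if extra_cost > mispredict_penalty:
            return None  # keep the branch: if-conversion not profitable
        stream = [(f"p.{cond_reg}", inst) for inst in true_blk]        # if cond
        stream += [(f"p.not.{cond_reg}", inst) for inst in false_blk]  # if !cond
        return stream

    # Diamond: if (rax) rbx = rcx; else rbx = rdx;
    print(if_convert_diamond("rax", ["mov rbx, rcx"], ["mov rbx, rdx"]))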

6.2.2 Migration Strategy

Process migration across overlapping custom feature sets can involve two scenarios. In an upgrade scenario, a process is compiled to use only a subset of the features implemented by the core to which it is migrated, and can therefore resume native execution immediately after migration (no binary translation or state transformation). Conversely, in a downgrade scenario, the core to which a process is migrated implements only a subset of the features being used by the running code, which necessitates minimal binary translation of the unimplemented features. We outline the following low-overhead mechanisms to handle feature downgrades. Owing to the overlapping nature of the feature sets (same opcodes and instruction format), feature emulation entails only a small set of binary code transformations, in contrast to full-blown cross-ISA binary translation. First, when we downgrade from x86 to microx86, we perform simple addressing mode transformations by translating any instruction that directly operates on memory into a set of simpler instructions that adhere to the ld-compute-st format. Second, we downgrade to a feature set with a smaller register depth by translating higher (unimplemented) registers into memory operands using a register context block [SH98, Bel05, DVT12], a technique commonly used when binary-translating to register-pressure-constrained ISAs. Third, we perform long-mode emulation [Bel05, VT14] and use fat pointers implemented with xmm registers in order to emulate wider types on a 32-bit core. Finally, we employ simple reverse if-conversions to translate predicated code back into control dependences. The sketch below illustrates the first two of these transformations.
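The first two downgrade transformations can be illustrated with a toy textual rewriter (our own sketch, not the actual translator; the scratch register name and the 8-byte register-context-block layout are assumptions):

    def downgrade_mem_alu(op, mem, reg, scratch="rtmp"):
        """x86 -> microx86: split an ALU op on memory into ld-compute-st."""
        return [f"mov {scratch}, {mem}",   # load
                f"{op} {scratch}, {reg}",  # compute
                f"mov {mem}, {scratch}"]   # store

    def downgrade_register(reg_num, depth, rcb_base="rcb"):
        """Map a register absent on the target core to its slot in the
        register context block (a per-thread spill area in memory)."""
        if reg_num < depth:
            return f"r{reg_num}"                 # implemented: use directly
        return f"[{rcb_base} + {8 * reg_num}]"   # unimplemented: memory slot

    print(downgrade_mem_alu("add", "[rbp-8]", "rax"))
    print(downgrade_register(48, depth=16))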

6.3 Decoder Design and Implementation

In this section, we describe our customizations to the x86 instruction encoding and decoder implementation in order to support the 26 feature sets derived in Section 6.1. We show that, due to the extensible nature of the x86 ISA, the decoder implementation requires minimal changes to support the new feature sets and has a small impact in terms of overall peak power and area.

6.3.1 Instruction Encoding

Feature extensions to the x86 instruction set are not uncommon. In accordance with its code density and backward compatibility goals, major feature set additions to x86 (e.g., REX, VEX, and EVEX) have been encoded by exploiting unused opcodes and/or by adding new (optional) prefix bytes. We use similar mechanisms to encode the specific customizations we propose, as shown in Figure 6.3. In order to double/quadruple the register depth of x86-64, we add a new prefix, REXBC, similar to the addition of the REX (register extension) prefix that doubled both the register width and depth of x86-32, giving rise to the x86-64 ISA. In particular, the REXBC prefix encodes 2 extra bits for each of the 3 register operands (input/output register, base, and index), which are further combined with 4 bits from the REX, MODRM, and SIB bytes to address any of the 64 programmable registers. Furthermore, we use the remaining 2 bits of the REXBC prefix to lift restrictions in x86 that do not allow certain combinations of registers and subregisters to be used as operands in the same instruction. Finally, we exploit an unused opcode, 0xd6, to mark the beginning of a REXBC prefix. Similarly, to support predication, we use a combination of an unused opcode, 0xf1, and a predicate prefix. In order to efficiently support diamond predication (described in Section 6.2), we use the predicate prefix to encode both the nature (true/not-true) of the conditional (bit 7) and the register (bits 0-6) the instruction is predicated on. All of the insights in this chapter apply equally (if not more so) to a new superset ISA designed from scratch; such an ISA would allow much tighter encoding of these options.

Figure 6.3: Customizations to the x86 Instruction Encoding. (Optional legacy, predicate, REXBC, and REX prefixes precede the opcode, ModR/M, SIB, displacement, and immediate bytes.)
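The prefix packing just described can be sketched as follows. The marker bytes (0xd6 and 0xf1) and the field meanings come from the text above, but the exact bit positions within each payload byte are our assumption, made for illustration only:

    def rexbc_prefix(reg_hi2, base_hi2, index_hi2, lift2=0):
        """REXBC: 0xd6 marker, then 2 extra bits for each of the three
        register operands plus 2 bits that lift subregister addressing
        restrictions (field order assumed)."""
        assert all(0 <= f <= 3 for f in (reg_hi2, base_hi2, index_hi2, lift2))
        payload = (reg_hi2 << 6) | (base_hi2 << 4) | (index_hi2 << 2) | lift2
        return bytes([0xD6, payload])

    def predicate_prefix(pred_reg, on_true=True):
        """Predicate prefix: 0xf1 marker, then bit 7 encoding true/not-true
        and bits 0-6 naming the predicating general-purpose register."""
        assert 0 <= pred_reg < 64
        return bytes([0xF1, (int(on_true) << 7) | pred_reg])

    print(rexbc_prefix(2, 1, 3).hex())               # 'd69c'
    print(predicate_prefix(5, on_true=False).hex())  # 'f105'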

6.3.2 Decoder Analysis

Figure 6.4 shows the step-by-step decoding process of a typical x86 instruction. Owing to its variable-length encoding, x86 instructions typically go through a 2-phase decode process. In the first phase, an instruction-length decoder fetches and decodes raw bytes from a prefetch buffer, performs length validation, and queues the decoded macro-ops into a macro-op buffer. These macro-ops are fused into a single macro-op when viable, and are then fed as input into one of the instruction decoders, which decode them into one or more micro-ops. The decoded micro-ops are subjected to fusion again, in order to store them in the micro-op queue and the micro-op cache in a compact representation, and are later unfused and dispatched as multiple micro-ops. The micro-op cache is both a performance and a power optimization that allows the decode engine to stream decoded (and potentially fused) micro-ops directly from the micro-op cache, turning off the rest of the decode pipeline until there is a micro-op miss.

Figure 6.4: x86 Fetch/Decode Engine

In our RTL implementation, we model an instruction length decoder based on the parallel instruction length decoder described by Madduri et al. [MST]. The instruction length decoder has three components: (a) an instruction decoder that decodes each byte of an incoming 8-byte chunk as the start of an instruction, decoding both prefixes and opcodes, (b) a speculative length calculator that speculatively computes the length of the instruction based on the decoded prefixes and opcodes, and (c) an instruction marker that checks the validity of the computed lengths, marks the beginning and end of each instruction, and detects overflows into the next chunk.

Since our customizations affect the prefix part of the instruction, we modify the eight decode subunits of the instruction decoder to include comparators that generate extra decode signals to represent the custom register depth and predicate prefixes. These decode signals propagate through the speculative instruction length calculator and the instruction marker, requiring wider multiplexers in the eight length subunits, the length control select unit, and the valid begin unit. These modifications to the instruction length decoder result in an increase of 0.87% in total peak power and 0.65% in area for our superset ISA. Furthermore, we increase the width of the macro-op queue by 2 bytes to account for the extra prefixes. Since predication support and greater register depth in our superset ISA could potentially require wider micro-op encodings, we increase the width of the micro-op cache and the micro-op queue by 2 bytes. Finally, for our microx86 implementations, we replace the complex 1:4 decoder with a simple 1:1 decoder and forgo the microsequencing ROM. From our analysis, a decoder that implements our simplest feature set, microx86-32, consumes 0.66% less peak power and takes up 1.12% less area than the x86-64 decoder, while our superset decoder consumes 0.3% more peak power and takes up 0.46% more area than the x86-64 decoder. These variances do not include the increases or savings from the instruction length decoder.

6.4 Experimental Methodology

In this section, we describe our experimental methodology for the design space exploration and process migration strategy. As shown in Table 6.1, our design space consists of 5 dimensions of ISA customization and 19 microarchitectural dimensions. After careful pruning of configurations that are not viable (e.g., 4-issue cores with a single INT/FP ALU) or unlikely to be useful (full predication with 8 registers), this results in 26 custom ISA feature sets, 180 microarchitectural configurations, and 4680 distinct single-core design points, each spread across a wide range of peak power (4.8W to 23.4W) and area (9.4mm2 to 28.6mm2). The goal of our feature set exploration is to find an optimal 4-core multicore configuration using the fully custom feature sets derived from the superset ISA.

Table 6.1: Feature Exploration Space

ISA Parameter: Options
Register depth: 8, 16, 32, 64 registers
Register width: 32-bit, 64-bit registers
Instruction/Addressing mode complexity: 1:1 macro-op to micro-op encoding (load-store x86 micro-op ISA) vs 1:n macro-op to micro-op encoding (fully CISC x86 ISA)
Predication Support: Full predication (as in IA-64/Hexagon) vs partial (cmov) predication
Data Parallelism: Scalar vs vector (SIMD) execution

Microarchitectural Parameter: Options
Execution Semantics: Inorder vs out-of-order designs
Fetch/Issue Width: 1, 2, 4
Decoder Configurations: 1-3 1:1 decoders, 1 1:4 decoder, MSROM
Micro-op Optimizations: Micro-op cache, micro-op fusion
Instruction Queue Sizes: 32, 64
Reorder Buffer Sizes: 64, 128
Physical Register File Configurations: (96 INT, 64 FP/SIMD), (64 INT, 96 FP/SIMD)
Branch Predictors: 2-level local, gshare, tournament
Integer ALUs: 1, 3, 6
FP/SIMD ALUs: 1, 2, 4
Load/Store Queue Sizes: 16, 32
Instruction Cache: 32KB 4-way, 64KB 4-way
Private Data Cache: 32KB 4-way, 64KB 4-way
Shared Last Level (L2) Cache: 4-banked 4MB 4-way, 4-banked 8MB 8-way

Our objective functions for evaluating optimality include both performance and energy-delay product (EDP), for both multithreaded and single-threaded workloads. Our workloads include eight SPEC CPU2006 benchmarks, further broken down into 49 different application phases using the SimPoint [SPHC02, PHVB+03] methodology. In order to create equivalent phases for 26 different feature sets across the 8 applications, we first create simpoints on the commonly used x86-32 ISA with a simpoint interval of 100 million dynamic instructions, and then find equivalent start and end basic blocks using a combination of IR-level and MC-level data obtained using LLVM, and simpoint metadata obtained using the fast (atomic) simulation mode of the gem5 architectural simulator [BDH+06].
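For reference, the EDP objective used throughout the exploration is the conventional energy-delay product (this is the standard definition, not a formula specific to this dissertation):

    \mathrm{EDP} = E \times D = \overline{P} \times D^{2}

where E is the energy consumed while running the workload, D is its execution time, and P-bar is the average power; minimizing EDP rewards designs that save energy without proportionally sacrificing speed.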

We use the gem5 [BDH+06] simulator to measure performance on both our inorder and out-of-order cores. We modify the gem5 simulator to include micro-op cache and micro-op fusion support, in order to measure the impact of our customizations in light of existing micro-op optimizations. However, we do not employ micro-op fusion in our microx86 ISA, because each instruction decomposes into exactly one micro-op and the micro-op fusion unit does not yet combine micro-ops from different macro-ops. Our implementations of the micro-op cache and micro-op fusion are consistent with the guidelines in the Intel Architecture Optimization Manual [Int09]. We perform a full RTL synthesis using the Synopsys Design Compiler to measure the decoder area and power overheads of each of the customizations we employ in our feature sets, as described in Section 6.3. We also use the McPAT [LAS+09b] power modeling framework to evaluate the power and area of the rest of the pipeline. The peak power and area measurements we obtain from McPAT are key parameters in our exploration of this massive design space, which involves 196,560 gem5+McPAT simulations consuming 49,733 core hours on a large 2-petaflop cluster. Furthermore, our multicore design search involves finding an optimal 4-core multicore out of 102.5 trillion combinations that run all permutations of simpoint regions, for several different objective functions and constraints. As of this writing, the results we report for the fully custom feature set design are local optima, and therefore conservative estimates. Finally, process migration in a composite-ISA architecture can involve binary translation of unimplemented features in the case of a feature downgrade. We measure this cost by performing feature emulation for each checkpointed code region that represents a simpoint, on artificially-constrained cores that implement only a subset of the features the simpoint was compiled to. We also report the overall cost of migration on our designs optimized for multiprogrammed workload throughput, where threads often contend for the cores of their preference and therefore may not always run on the core of their choice.

107 Table 6.2: x86-ized versions of Thumb, Alpha, and x86-64

microx86-8D-32W (Thumb-like Features): Load/Store Architecture; Register Depth: 8; Register Width: 32; No SIMD support; CMOV Support. Exclusive Features: FP Support. Thumb-specific Features: Code Compression, Fixed-length instructions (one-step decoding).

microx86-32D-64W (Alpha-like Features): Load/Store Architecture; Register Depth: 32; Register Width: 64; No SIMD support; CMOV Support. Exclusive Features: None. Alpha-specific Features: 2-address instructions, Fixed-length instructions (one-step decoding), More FP Registers.

x86-16D-64W (x86-64-like Features): CISC Architecture; Register Depth: 16; Register Width: 64; SIMD support. Exclusive Features: None. x86-specific Features: None.

6.5 Results

In this section, we present detailed results from our design space exploration that seeks to identify optimal combinations of ISA and microarchitectural features.

6.5.1 Performance and Energy Efficiency

This section elaborates on the findings of our design space exploration. We identify custom multicore designs that benefit from both hardware heterogeneity and feature set diversity, providing significant gains over designs that exploit only hardware heterogeneity. Furthermore, these designs recreate the effects of, or in most cases surpass, the gains offered by a fully heterogeneous-ISA CMP that implements a completely disjoint set of vendor-specific ISAs and requires sophisticated OS/runtime support. We conduct multiple searches through our design space to model different execution scenarios and different budget-constrained environments. In each search, we identify three optimal 4-core multicore designs: (1) homogeneous x86-64 CMPs that employ cores implementing the same ISA and the same microarchitecture, (2) x86-64 CMPs that exploit hardware heterogeneity alone, and (3) composite-ISA x86-64 CMPs that exploit hardware heterogeneity and full feature diversity. In addition, we also identify two intermediate design points of interest: (1) heterogeneous-ISA CMPs [VT14] that implement three fixed, disjoint, vendor-specific ISAs (x86-64, Alpha, Thumb), and (2) composite-ISA CMPs that exploit hardware heterogeneity and a limited form of feature diversity via three x86-based fixed feature sets that resemble the above vendor-specific ISAs. We use the latter design as a vehicle to demonstrate that vendor-specific ISA heterogeneity can be recreated to a large extent by carving out custom feature sets from a single sufficiently diverse superset ISA. Table 6.2 offers a more detailed comparison.

Thus, we compare against a number of interesting configurations, but two are most revealing. Since our goal is to replicate the advantage of multi-ISA heterogeneity over single-ISA heterogeneity, the single-ISA heterogeneous result is our primary baseline for comparison. The multi-vendor (x86, Thumb, Alpha) result, however, represents the "goal" result that we strive to match, yet with essentially a single ISA.

Figure 6.5: Multi-programmed workload throughput comparison (higher is better)

Figure 6.5 compares the performance and energy efficiency of the five optimal designs listed above, optimized to provide multi-programmed workload throughput when constrained under different peak power and area budgets. There are four major takeaways from this experiment. First, the designs that exploit feature diversity alongside hardware heterogeneity consistently and significantly outperform the designs that exploit only hardware heterogeneity. This is because hardware heterogeneity tends to diminish when the amount of available chip real estate becomes too small or too generous. Second, we find in the resulting optimal architectures that, in an especially tightly power/area-constrained environment, every feature present in the superset ISA is implemented by at least one core in the composite-ISA design. When the constraints become more relaxed, the composite-ISA designs continue to implement at least 10 out of the 12 features described in Section 6.1. Third, the composite-ISA designs that implement the x86-ized versions of the vendor-specific ISAs trail slightly, but generally match, the gains of the fully heterogeneous-ISA designs. Fourth, the composite-ISA design with full ISA feature set diversity not only matches but frequently outperforms the fully heterogeneous-ISA design. This indicates that any loss due to the lack of ISA-specific encodings and simplified hardware (e.g., decoders) is more than compensated for by the increased flexibility of the composable ISA features. Overall, composite-ISA designs that exploit both hardware heterogeneity and full feature diversity outperform single-ISA heterogeneous designs that exploit only hardware heterogeneity by 17.6% on average, and by 30% under tight power constraints.

Figure 6.6: Multi-programmed workload EDP comparison (lower is better)

In Figure 6.6, we compare designs optimized to provide multi-threaded workload energy efficiency, measured as energy-delay product (EDP), while constrained under different peak power and area budgets. We observe significant energy savings due to full ISA customization: an average of 31% savings in energy and a 34.6% reduction in EDP over single-ISA heterogeneous multicore designs. This result was not necessarily expected, as the Thumb architecture still provides significant advantages over our most conservative microx86 core. However, many codes cannot use Thumb because of its limited features, while the composite-ISA architecture can combine microx86 with a variety of other features and make use of it far more often.

We next evaluate our designs optimized to provide high single-thread performance and energy efficiency. That is, while the prior results optimized for four threads running on four cores, this exploration optimizes for one thread utilizing four cores via migration. When the designs are constrained by peak power budgets, we model them after the dynamic multicore topology [EBA+11], where only one core is active at any given point in time while the rest of the cores are powered off. Figure 6.7 compares designs optimized to provide high single-thread performance and energy efficiency. Note that the peak power budgets are tighter in this case, since we assume only one core is powered on at a time. Due to the low power constraints, hardware heterogeneity provides only marginal improvements in performance and EDP. However, we observe that every feature of the superset ISA again manifests in at least one of the feature sets implemented by the composite-ISA design, allowing applications to migrate across different cores in order to take advantage of any required ISA feature. We observe an average speedup of 19.5% and an average EDP reduction of 27.8% over single-ISA heterogeneous designs. Moreover, owing to the many low-power and feature-rich microx86 options available, we manage to outperform the fully heterogeneous-ISA design that implements vendor-specific ISAs by 14.6%, and reduce EDP by 3.5%, under tight (5W) power constraints. The x86-ized versions of the fully heterogeneous-ISA designs again trail, but generally match up well to, the performance levels offered by vendor-specific ISA heterogeneity. Once again, the composite-ISA design overall matches the performance results of the fully heterogeneous-ISA design, while trailing slightly behind in terms of EDP.

Figure 6.7: Single Thread Performance (higher is better) and EDP (lower is better) comparison under Peak Power Budget

Figure 6.8: Single Thread Performance (higher is better) and EDP (lower is better) comparison under Area Budget

When designs are constrained by an area budget, the optimal multicore is typically composed of multiple small cores and one large core that maximizes single-thread performance. In Figure 6.8, we evaluate the single-thread performance and energy efficiency of the optimal designs under different area budgets. We observe that the composite-ISA designs sport two out-of-order microx86 cores even under the most tightly area-constrained environment, implementing different register depth and width features in each of them, allowing applications to migrate across cores and take advantage of the specific ISA features. While the fully heterogeneous-ISA design offers similar capabilities due to its area-efficient Thumb cores, we note that migration across Thumb and x86-64 cores is non-trivial and incurs significant overhead in comparison to the simpler migration across the two overlapping x86-based ISAs in our case. Moreover, under tight constraints, we are able to design more effective composite-ISA architectures due to the greater number of design options available (e.g., more efficient combinations of 32-bit and 64-bit designs), saving an extra 13.2% in EDP when compared to fully heterogeneous-ISA designs. Overall, we find that the composite-ISA design consistently outperforms the single-ISA heterogeneous design, resulting in an average speedup of 20% and a reduction of 21% in EDP.

Figure 6.9: Performance Degradation over Composite-ISA Designs optimized for multi-programmed workload throughput at 48mm2 budget, and under different Feature Constraints

6.5.2 Feature Sensitivity Analysis

Owing to their feature-rich nature, composite-ISA CMPs consistently offer significant performance and energy efficiency benefits, even in scenarios where hardware heterogeneity provides diminishing returns. One of the major goals of this design space exploration is to identify the specific ISA features that contribute to these benefits, and further help architects make more efficient design choices. However, since ISA features typically manifest as components of a larger feature set that a core implements, it is generally non-trivial to measure the effect of a specific ISA feature in isolation. In this section, we perform additional searches through our design space in order to understand the impact an ISA feature has on performance, energy, and transistor investment, by removing one axis of feature diversity at a time.

As an example, consider the search for an optimal composite-ISA CMP that optimizes for multi-programmed workload performance under an area budget of 48mm2, but is constrained to include only designs that limit the number of architectural registers to 16 in all cores. If register depth is an important feature, the optimal design from this search is expected to perform worse than the one chosen through an unconstrained search.

Figure 6.9 shows the result of this experiment. We make several inferences here. First, constraining all cores to implement fewer than 32 architectural registers negates a significant chunk of the performance gain due to feature diversity. Most optimal designs typically employ two or more cores with a register depth greater than or equal to 32, and seldom employ cores with fewer than 16 registers. Second, the best performing designs typically include a mix of both 32-bit and 64-bit cores. While 64-bit cores are more efficient at computing on wider data types, 32-bit cores employ smaller hardware structures, saving area for other features. Designs that exclude either one of them incur a 3-7% loss in performance. Third, most optimal designs employ both microx86 and x86 cores. While constraining cores to include only microx86 cores marginally affects performance, excluding them limits performance considerably. Finally, most optimal designs include both partially predicated and fully predicated cores.

Figure 6.10: Transistor Investment by Processor Area normalized over that of Composite-ISA Designs optimized for multi-programmed workload throughput at 48mm2 budget, and under different Feature Constraints

114 !+&-'2'4#,-*$)(3&%'/,'"0))'"&$/0-&' (1&-.(/25 ? ?

E

@D

BA %,-/67,59!,47.

DC

BA

DC %,-/67,59(/+7.

1/*53:ED "&$/0-&'8,+./-$(+/ #26758*7/329

3140,:/7; :ED

$)57/)0

"800 $5,+/*)7/32 "8009 ",)785,9 !/9,56/7; ",7*. !,*3+, D5)2*.9$5,+/*735 &*.,+80,5 %,-/67,59"/0, "82*7/32)09'2/76

Figure 6.11: Processor Energy Breakdown normalized over that of Fully Custom Designs optimized for multi-programmed workload throughput at 48mm2 budget, and under different Feature Constraints

We will examine these 10 constrained-optimal designs further. For example, this allows us to compare the four-core design where all cores are microx86 with the design where all cores are x86. Figure 6.10 shows the transistor investment in the processor part of the real estate for each of the best designs from the above experiment. These designs were all optimized for the same area budget, but here we plot combined core area, without caches; therefore, longer bars imply that the design needed to spend more transistors on cores, sacrificing cache area, to get maximum performance. We make the following observations. First, the design that constrains all cores to microx86 takes up the least combined core area. However, owing to the area efficiency of microx86, it is the only design among the 10 that employs all out-of-order cores, each sporting a tournament branch predictor. Second, the design constrained to exclude microx86 takes up the largest processor area, investing most of its transistors in functional units. Note that we always combine SIMD units with x86 cores, and that the microx86 cores lack any SIMD units. Third, the 64-bit-only optimal design spends more transistors on the register file and the scheduler than any other design. In that design, two of the four 64-bit cores are configured with a register depth of 64, and the remaining two with 32.

Figure 6.11 shows the processor energy breakdown by stage for each of the best designs from the above experiment. We find that the energy breakdown deviates significantly from the corresponding area breakdown. While the decoder requires a greater portion of processor area than the fetch unit, it is the fetch unit that expends more energy at run time, since the decode pipeline is only triggered upon a micro-op cache miss, and instructions are streamed out of the micro-op cache for the most part. Interestingly, the design that constrains all cores to a register depth of 8 spends significant energy in the fetch stage. This is due to the artificial instruction bloat caused by spills, refills, and rematerializations, a direct consequence of high register pressure. Furthermore, although the x86-only designs invested significantly in SIMD units, the energy spent by the functional units is not nearly as proportional. This is due to relatively infrequent vector activity. Finally, the 64-bit-only design continues to dominate in terms of register file and scheduler energy.

6.5.3 Feature Affinity

In this section, we study the feature affinity of the eight benchmark applications in two specific execution scenarios. In the first scenario, we consider a composite-ISA heterogeneous design optimized for single thread performance under a peak power budget of 10W. Recall that in such a design, hardware heterogeneity provides only marginal benefits; most of the gains come from the fact that an application is free to migrate across different cores that each implement a diverse feature set. Therefore, such a design captures the true ISA affinity of an application.


Figure 6.12: Execution Time Breakdown on the best composite-ISA CMP optimized for Single Thread Performance under a Peak Power Budget of 10W

Figure 6.12 shows the feature affinity in terms of the fraction of time an application spends executing on a particular feature set. First, we find that the multicore design that optimizes single thread performance exhibits significant feature diversity. In fact, by analyzing the component features of the multicore, we find that all features from our superset ISA are used. Second, we find that there is significant variance in feature set preference across different applications; there is no single best feature set that is preferred by all applications. Third, we observe that there is some variance in feature set preference even within a single application's internal phases. We find that most applications migrate to a core with a different feature set at least once. Fourth, the benchmark hmmer, which exhibits significant register pressure, almost always executes on a feature set with a register depth of 64. Interestingly, we found that phases with considerable irregular branch activity, due to indirect branches and function pointer calls, prefer full predication since it eases the pressure on the branch predictor by converting some of the control flow into data flow. This is evidenced by the benchmarks sjeng and gobmk. In the second scenario, we consider a composite-ISA design optimized for multi-programmed workload throughput under an area budget of 48mm2, in which case applications typically contend for the best feature set, and may sometimes execute on feature sets of second preference.


Figure 6.13: Execution Time Breakdown on the best composite-ISA CMP optimized for Multi-programmed Throughput under an Area Budget of 48mm2

Figure 6.13 shows the results of this experiment. In sharp contrast to the design optimized for single thread performance, where applications had clear preferences, we find that all applications in the multi-programmed workload execute on all feature sets at some point. However, we are still able to make some high-level inferences about feature affinity. For example, the benchmark sjeng continues to show a clear preference for x86 over microx86, and both sjeng and gobmk prefer to execute on fully predicated ISAs during phases of irregular branch activity.
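The breakdowns in Figures 6.12 and 6.13 can be computed from a per-application migration trace. The sketch below assumes a hypothetical trace format of (feature-set label, cycles) pairs; the labels are stand-ins, not our instrumentation's actual output.

```python
# Hypothetical sketch: computing an execution-time breakdown by feature
# set from a migration trace of (feature_set_label, cycles) entries.
from collections import defaultdict

def affinity_breakdown(trace):
    """Fraction of total execution time spent on each feature set."""
    cycles = defaultdict(float)
    for feature_set, dt in trace:
        cycles[feature_set] += dt
    total = sum(cycles.values())
    return {fs: c / total for fs, c in cycles.items()}

# Toy trace: an hmmer-like application that spends most of its time on a
# 64-register feature set, with one phase on a fully predicated core.
trace = [("x86_64regs", 7e9), ("x86_fullpred", 2e9), ("x86_64regs", 1e9)]
print(affinity_breakdown(trace))  # {'x86_64regs': 0.8, 'x86_fullpred': 0.2}
```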

6.5.4 Migration Cost Analysis

Process migration across composite-ISA cores can involve two scenarios. In a feature upgrade scenario, the core to which a process migrates already implements a superset of the features the process was compiled for, in which case there is no binary translation or state transformation cost. In a feature downgrade scenario, on the other hand, the core to which the process migrates implements only a subset of the features the process was compiled for, necessitating minimal translation of the unimplemented features. We first discuss the cost of a feature downgrade for any arbitrary code region, and then measure its performance impact on a design optimized for multi-programmed workload throughput.
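To make the upgrade/downgrade distinction concrete, the sketch below models feature sets as bitmasks and classifies a migration by checking whether the destination core covers the binary's compiled-for features. This is a minimal illustration under assumed feature names; the encoding and the sets themselves are hypothetical, not the actual runtime's representation.

```python
# Hypothetical sketch: classifying a cross-core migration as a feature
# upgrade (no translation needed) or a feature downgrade (missing
# features must be emulated). Feature names and sets are illustrative.
FEATURES = {
    "addr64":    1 << 0,  # 64-bit addressing and data
    "regs32":    1 << 1,  # register depth of 32
    "regs64":    1 << 2,  # register depth of 64
    "full_pred": 1 << 3,  # full predication
    "x86_mem":   1 << 4,  # full x86 addressing modes (vs. microx86)
}

def feature_mask(names):
    mask = 0
    for name in names:
        mask |= FEATURES[name]
    return mask

def classify_migration(binary_features, core_features):
    """('upgrade', 0) if the core implements a superset of the binary's
    features; otherwise ('downgrade', mask of features to emulate)."""
    missing = binary_features & ~core_features
    return ("upgrade", 0) if missing == 0 else ("downgrade", missing)

binary = feature_mask(["addr64", "regs32"])
big_core = feature_mask(["addr64", "regs32", "full_pred", "x86_mem"])
small_core = feature_mask(["regs32"])  # hypothetical 32-bit-only core
print(classify_migration(binary, big_core))    # ('upgrade', 0)
print(classify_migration(binary, small_core))  # ('downgrade', 1): emulate addr64
```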
Figure 6.14: Feature Downgrade Cost

We measure feature downgrade costs by running each code region that corresponds to a simpoint on an artificially constrained core that implements only a subset of the features the simpoint was compiled for. Figure 6.14 shows the result of this experiment. We make several important observations here. First, when we downgrade from 64-bit to 32-bit cores, most of the emulation cost is negated by the cache-efficient 32-bit cores. In fact, we achieve a speedup for some applications when we downgrade them from 64-bit to 32-bit feature sets. Second, since most applications use 32 or fewer registers, there is little emulation cost incurred by a register depth downgrade from 64 to 32 registers. While we incur some overhead (an average of 2.7%) when we downgrade to a feature set that implements only 16 registers, there is significant overhead (an average of 33.5%) in migrating to a feature set that implements only 8 registers. In all cases, we find that the benchmark hmmer incurs the highest emulation overhead due to a register depth downgrade, concurring with our prior feature affinity analysis. Third, we incur an average of 5.5% overhead when we downgrade to a feature set without full predication. We note that this number is heavily dominated by the outlier libquantum, which in our best designs always executes on a partially predicated feature set, since the compiler tends to overestimate the cost of diamond predication for this benchmark. Finally, a downgrade from x86 to microx86 costs 4.2% on average, which can be attributed to the emulation of almost every arithmetic instruction that uses an indirect addressing mode.

However, this experiment does not tell us how often these downgrades are necessary. In our design space analysis, we needed to assume optimal selection of compiler features to make the search tractable. In the next experiment, we explore a single instantiation of one of our optimal core configurations, and a single instance (single set of features) of each compiled binary. The set of features chosen for the binary is the most common one selected for that application (among all possible scheduling permutations). We then run an experiment (again, for all permutations of our benchmark set) with four cores and four applications for 500 billion instructions. Every time one of the applications experiences a phase change that would cause us to re-shuffle job-to-core assignments, we assume a migration cost for each application that moves, and a possible downgrade cost over the next interval for each job that moves to a core that does not fully support the compiled features. Migration cost is measured via simulation for each binary and set of features not supported.
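The accounting in this experiment can be sketched in a few lines. The toy model below charges a fixed instruction-count penalty per moved job and a fractional slowdown for any interval a job spends on a core missing one of its compiled-for features; the schedule, penalty, and slowdown values are placeholders, not our measured costs.

```python
# Hypothetical sketch of the per-interval accounting described above.
# Effective work is counted in instructions; a migration charges a fixed
# instruction-count penalty, and a downgraded (job, core) pairing runs
# slower by the given fractional slowdown for that interval.
def throughput_with_migrations(schedule, migration_cost,
                               downgrade_slowdown, interval_insts):
    """schedule: list of per-interval {job: core} assignments."""
    work = 0.0
    prev = schedule[0]
    for assignment in schedule:
        for job, core in assignment.items():
            if core != prev[job]:
                work -= migration_cost  # one-time cost per moved job
            slowdown = downgrade_slowdown.get((job, core), 0.0)
            work += interval_insts / (1.0 + slowdown)
        prev = assignment
    return work

# Toy inputs: two jobs swap cores once; hmmer pays the ~33.5%
# register-depth penalty whenever it lands on the 8-register core.
schedule = [{"hmmer": "c_64regs", "mcf": "c_8regs"},
            {"hmmer": "c_8regs", "mcf": "c_64regs"}]
penalty = {("hmmer", "c_8regs"): 0.335}
print(throughput_with_migrations(schedule, migration_cost=1e6,
                                 downgrade_slowdown=penalty,
                                 interval_insts=1e9))
```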

Figure 6.15: Multi-threaded Workload Throughput with Downgrade Cost

Figure 6.15 compares the designs optimized for multi-programmed workload throughput with migration cost included. Recall that in such a design, threads often contend for the best core and may not always run on their core of first preference. We observe that the performance degradation due to migrations across composite ISAs is a negligible 0.42% on average (0.75% at maximum), virtually preserving all of the performance gains due to feature diversity. We attribute such a small migration cost to the fact that feature downgrades are infrequent, and when one does occur, the cost of emulating the missing features in software is minimal, both owing to the overlapping nature of the feature sets. In summary, composite-ISA heterogeneous designs consistently outperform and use far less energy than single-ISA heterogeneous designs, generally matching or exceeding multi-vendor heterogeneous-ISA designs.

6.6 Conclusion

This chapter presents a composite-ISA architecture and compiler/runtime infrastructure that replicates the advantages of multi-vendor heterogeneous-ISA architectures. It does so along two dimensions. First, it enables the full performance and energy benefits of multi-ISA design, without the issues of multi-vendor licensing, binary translation, and state transformation. Second, it gives both the processor designer and the compiler a much richer set of ISA design choices, enabling them to select and combine features that match the expected workload. This provides richer gains in efficiency than highly optimized but inflexible existing-ISA based designs. Under certain design scenarios, this architecture gains 30% in performance and over 30% in energy-delay product over single-ISA heterogeneous designs. Further, it matches and in many cases outperforms the multi-vendor heterogeneous-ISA design, and consistently runs at lower levels of energy-delay product.

Acknowledgements

Chapter 6, in full, is currently being prepared for submission for publication of the material. Venkat, Ashish; Basavaraj, Harsha; Tullsen, Dean M., Composite-ISA Cores: Enabling Multi-ISA Heterogeneity using a Single ISA. The dissertation author was the primary investigator and author of this material.

Chapter 7

Concluding Remarks

The computing landscape already includes abundant heterogeneity. A single manufacturer targets different processes over time (different core sizes, different performance targets), and also targets different markets (high performance, embedded, mobile). Different manufacturers have targeted different niches of the market and developed ISAs and microarchitectures accordingly. Yet each of these architectures has its own benefits and drawbacks. A heterogeneous system that could utilize these different compute engines seamlessly and on demand offers substantial benefits. To make this vision a reality, we must (1) exploit the heterogeneity that already exists, and (2) design future systems that proactively create the right heterogeneity to maximize performance and energy efficiency, without abandoning the traditional models of programming and execution.

This dissertation demonstrates that substantial benefits arise by strengthening the hardware/software interface, specifically the ISA and the runtime system, with diverse capabilities. Although the advantages of single-ISA heterogeneity have been well established, there was no evidence, or even speculation, that extending that heterogeneity to the ISAs supported by the cores could be profitable. This work is the first to challenge the assumption that this boundary is necessary. By pulling down that boundary wall, this dissertation allows the CPU architect and the system architect to finally harness the most underutilized configuration parameter for execution efficiency: the choice of the ISA. In particular, the cross-ISA process migration strategy proposed by this dissertation has all but eliminated the tight coupling between the software application and the underlying instruction set, without any impact on programmability, opening up a number of key opportunities in terms of performance, energy efficiency, and security.

The design space exploration effort described in this dissertation allows the CPU architect to design more efficient heterogeneous chip multiprocessors with greater flexibility of ISA choices and microarchitectural options. The proposed multi-ISA heterogeneous chip multiprocessor design benefits substantially from a tighter ISA-microarchitecture co-design, resulting in greater single thread performance, multi-programmed workload throughput, and efficiency gains, especially under both tight and relaxed power/area constraints, where microarchitectural heterogeneity alone provides diminishing returns.

By leveraging the seamless and instantaneous cross-ISA process migration capability offered by multi-ISA heterogeneous architectures, the proposed security defense HIPStR (Heterogeneous-ISA Program State Relocation) radically transforms the Return-Oriented Programming attack landscape. The defense benefits substantially from increased entropy due to program state relocation both within and across the heterogeneous ISAs, rendering several existing brute-force attacks computationally infeasible. It also imposes serious restrictions on the highly evasive just-in-time code reuse attacks that can bypass fine-grained randomization, to the extent that it is difficult to construct even a simple shellcode exploit, let alone achieve Turing-completeness.

Finally, in addition to demonstrating the performance, energy efficiency, and security potential of heterogeneous-ISA architectures, this dissertation significantly alleviates the complexity concerns of multi-vendor ISA heterogeneity via the composite-ISA heterogeneous chip multiprocessor design. These architectures have the potential to recreate the gains of multi-ISA heterogeneity while essentially using a single set of overlapping composite ISAs, thereby eliminating the need for multi-vendor licensing, addressing differences in application binary interfaces, and significantly minimizing the need for binary translation.

This dissertation has unlocked several new heterogeneous-ISA processor architecture designs, and has further showcased substantial benefits over prior work on single-ISA heterogeneous architectures, while preserving the traditional programming and execution models. However, the expanding gap between increasingly diverse software and the underlying heterogeneous hardware offers significant additional potential that is yet to be harnessed, and there still exist many underexploited levers in the hardware/software interface that could transform the computing landscape in numerous worthwhile dimensions. The research that exploits these levers will not only enable greater levels of performance, energy efficiency, and security in state-of-the-art computing systems, but will further drive emerging technologies.

Bibliography

[ABEL05] Martín Abadi, Mihai Budiu, Úlfar Erlingsson, and Jay Ligatti. Control-flow integrity. In Proceedings of the 12th ACM conference on Computer and Communications Security, 2005.

[AEJE17] Almutaz Adileh, Stijn Eyerman, Aamer Jaleel, and Lieven Eeckhout. Mind the power holes: Sifting operating points in power-limited heterogeneous multicores. IEEE Computer Architecture Letters, 16(1):56–59, 2017.

[AH82] Marc Auslander and Martin Hopkins. An overview of the PL.8 compiler. In ACM SIGPLAN Notices, volume 17, pages 22–31. ACM, 1982.

[AHM97] David I August, Wen-mei W Hwu, and Scott A Mahlke. A framework for balancing control flow and predication. In Microarchitecture, 1997. Proceedings., Thirtieth Annual IEEE/ACM International Symposium on, pages 92–103. IEEE, 1997.

[AKPW83] John R Allen, Ken Kennedy, Carrie Porterfield, and Joe Warren. Conversion of control dependence to data dependence. In Proceedings of the 10th ACM SIGACT-SIGPLAN symposium on Principles of programming languages, pages 177–189. ACM, 1983.

[Akr17] Ayaz Akram. A study on the impact of instruction set architectures on processor's performance. 2017.

[AME+11] Jason Ansel, Petr Marchenko, Úlfar Erlingsson, Elijah Taylor, Brad Chen, Derek L Schuff, David Sehr, Cliff L Biffle, and Bennet Yee. Language-independent sandboxing of just-in-time compilation and self-modifying code. ACM SIGPLAN Notices, 2011.

[ARM] ARM Limited. ARM Cortex-A Series Programmer's Guide for ARMv8-A.

[AS17] Ayaz Akram and Lina Sawalha. The impact of ISAs on performance. In Proceedings of the 14th Annual Workshop on Duplicating, Deconstructing and Debunking (WDDD) associated with ISCA, June 2017.

[ASC+16] Shoaib Akram, Jennifer B Sartor, Kenzo Van Craeynest, Wim Heirman, and Lieven Eeckhout. Boosting the priority of garbage: Scheduling collection on heterogeneous multicore processors. ACM Transactions on Architecture and Code Optimization (TACO), 13(1):4, 2016.

[BABJW11] Edson Borin, Guido Araujo, Mauricio Breternitz Jr, and Youfeng Wu. Structure-constrained microcode compression. In Computer Architecture and High Performance Computing (SBAC-PAD), 2011 23rd International Symposium on, pages 104–111. IEEE, 2011.

[BABW14] Edson Borin, Guido Araujo, Mauricio Breternitz, and Youfeng Wu. Microcode compression using structured-constrained clustering. International Journal of Parallel Programming, 42(1):140–164, 2014.

[Bau17] Andrew Baumann. Hardware is the new software. In Proceedings of the 16th Workshop on Hot Topics in Operating Systems, HotOS ’17, pages 132–137, New York, NY, USA, 2017. ACM.

[BBM+14] Andrea Bittau, Adam Belay, Ali Mashtizadeh, David Mazières, and Dan Boneh. Hacking Blind. In Security and Privacy, July 2014.

[BBWA07] Edson Borin, Mauricio Breternitz, Youfeng Wu, and Guido Araujo. Clustering-based microcode compression. In Computer Design, 2006. ICCD 2006. International Conference on, pages 189–196. IEEE, 2007.

[BCT92] Preston Briggs, Keith D Cooper, and Linda Torczon. Rematerialization. ACM SIGPLAN Notices, 27(7):311–321, 1992.

[BDEO97] Peter Bergner, Peter Dahl, David Engebretsen, and Matthew O’Keefe. Spill code minimization via interference region spilling. ACM SIGPLAN Notices, 32(5):287–295, 1997.

[BDH+06] Nathan L Binkert, Ronald G Dreslinski, Lisa R Hsu, Kevin T Lim, Ali G Saidi, and Steven K Reinhardt. The M5 Simulator: Modeling Networked Systems. Micro, IEEE, 2006.

[BDL+16] Kjell Braden, Lucas Davi, Christopher Liebchen, Ahmad-Reza Sadeghi, Stephen Crane, Michael Franz, and Per Larsen. Leakage-resilient layout randomization for mobile devices. In NDSS, 2016.

[Bel05] Fabrice Bellard. QEMU, a fast and portable dynamic translator. In USENIX Technical Conference, April 2005.

[BGM+89] D. Bernstein, M. Golumbic, Y. Mansour, R. Pinter, D. Goldin, H. Krawczyk, and I. Nahshon. Spill code minimization techniques for optimizing compilers. In Programming Language Design and Implementation, volume 24, pages 258–263. ACM, 1989.

[BHR+15] David Bigelow, Thomas Hobson, Robert Rudd, William Streilein, and Hamed Okhravi. Timely rerandomization for mitigating memory disclosures. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pages 268–279. ACM, 2015.

[BJFL11] Tyler Bletsch, Xuxian Jiang, Vince W Freeh, and Zhenkai Liang. Jump-oriented programming: a new class of code-reuse attack. In Proceedings of the 6th ACM Symposium on Information, Computer and Communications Security, 2011.

[BLJ+17] Antonio Barbalace, Robert Lyerly, Christopher Jelesnianski, Anthony Carno, Ho-Ren Chuang, Vincent Legout, and Binoy Ravindran. Breaking the boundaries in heterogeneous-ISA datacenters. In Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS '17, pages 645–659, 2017.

[BMF+16] Jonathan Balkind, Michael McKeown, Yaosheng Fu, Tri Nguyen, Yanqi Zhou, Alexey Lavrov, Mohammad Shahrad, Adi Fuchs, Samuel Payne, Xiaohua Liang, Matthew Matl, and David Wentzlaff. Openpiton: An open source manycore research framework. ACM SIGOPS Operating Systems Review, 50(2):217–232, 2016.

[BMS13a] Emily Blem, Jaikrishnan Menon, and Karthikeyan Sankaralingam. A Detailed Analysis of Contemporary ARM and x86 Architectures. Technical report, University of Wisconsin - Madison, 2013.

[BMS13b] Emily Blem, Jaikrishnan Menon, and Karthikeyan Sankaralingam. Power Struggles: Revisiting the RISC vs. CISC Debate on Contemporary ARM and x86 Architectures. In International Symposium on High Performance Computer Architecture, February 2013.

[BRSS08] Erik Buchanan, Ryan Roemer, Hovav Shacham, and Stefan Savage. When good instructions go bad: generalizing return-oriented programming to RISC. In Proceedings of the 15th ACM conference on Computer and Communications Security, 2008.

[BS08] Sandeep Bhatkar and R Sekar. Data space randomization. In Detection of Intrusions and Malware, and Vulnerability Assessment. 2008.

[BSA+15] Antonio Barbalace, Marina Sadini, Saif Ansary, Christopher Jelesnianski, Akshay Ravichandran, Cagil Kendir, Alastair Murray, and Binoy Ravindran. Popcorn: Bridging the Programmability Gap in heterogeneous-ISA Platforms. In Proceedings of the 10th European Conference on Computer Systems, April 2015.

[BSGG13] Aleksandar Branković, Kyriakos Stavrou, Enric Gibert, and Antonio González. Performance analysis and predictability of the software layer in dynamic binary translators/optimizers. In Proceedings of the ACM International Conference on Computing Frontiers, CF '13, pages 15:1–15:10, New York, NY, USA, 2013. ACM.

[BSR+16] Sharath K Bhat, Ajithchandra Saya, Hemedra K Rawat, Antonio Barbalace, and Binoy Ravindran. Harnessing energy efficiency of heterogeneous-ISA platforms. ACM SIGOPS Operating Systems Review, 49(2):65–69, 2016.

[CAC+81] Gregory J Chaitin, Marc A Auslander, Ashok K Chandra, John Cocke, Martin E Hopkins, and Peter W Markstein. Register allocation via coloring. Computer Languages, 6(1):47–57, 1981.

[CAC+08] Cristian Cadar, Periklis Akritidis, Manuel Costa, Jean-Philippe Martin, and Miguel Castro. Data randomization. Technical Report MSR-TR-2008-120, Microsoft Research, 2008.

[CBD+99] Crispin Cowan, Steve Beattie, Ryan Finnin Day, Calton Pu, Perry Wagle, and Erik Walthinsen. Protecting systems from stack smashing attacks with StackGuard. In Proceedings of the 5th Linux Expo, 1999.

[CBP+15] Nicholas Carlini, Antonio Barresi, Mathias Payer, David Wagner, and Thomas R Gross. Control-flow bending: On the effectiveness of control-flow integrity. In Proceedings of the 24th USENIX Security Symposium, 2015.

[CCD+15] Mauro Conti, Stephen Crane, Lucas Davi, Michael Franz, Per Larsen, Marco Negro, Christopher Liebchen, Mohaned Qunaibit, and Ahmad-Reza Sadeghi. Losing control: On the effectiveness of control-flow integrity under stack attacks. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pages 952–963. ACM, 2015.

[CDD+10] Stephen Checkoway, Lucas Davi, Alexandra Dmitrienko, Ahmad-Reza Sadeghi, Hovav Shacham, and Marcel Winandy. Return-oriented programming without returns. In Proceedings of the 17th ACM conference on Computer and Communications Security, 2010.

[CF09] Stephen Checkoway and Edward W Felten. Can DREs provide long-lasting security? The case of return-oriented programming and the AVC Advantage. 2009.

[CFR+91] Ron Cytron, Jeanne Ferrante, Barry K Rosen, Mark N Wegman, and F Kenneth Zadeck. Efficiently computing static single assignment form and the control dependence graph. ACM Transactions on Programming Languages and Systems (TOPLAS), 13(4):451–490, 1991.

[CH90] Fred C Chow and John L Hennessy. The priority-based coloring approach to register allocation. ACM Transactions on Programming Languages and Systems (TOPLAS), 12(4):501–536, 1990.

[CHB+15] Stephen Crane, Andrei Homescu, Stefan Brunthaler, Per Larsen, and Michael Franz. Thwarting cache side-channel attacks through dynamic software diversity. In NDSS, pages 8–11, 2015.

[CPM+98] Crispin Cowan, Calton Pu, Dave Maier, et al. StackGuard: Automatic adaptive detection and prevention of buffer-overflow attacks. In Proceedings of the 7th USENIX Security Symposium, 1998.

[CS98] Keith D Cooper and L Taylor Simpson. Live range splitting in a graph coloring register allocator. In International Conference on Compiler Construction, pages 174–187. Springer, 1998.

[CSH+12] Nagabhushan Chitlur, Ganapati Srinivasa, Scott Hahn, PK Gupta, Dheeraj Reddy, David Koufaty, Paul Brett, Abirami Prabhakaran, Li Zhao, Nelson Ijih, et al. QuickIA: Exploring Heterogeneous Architectures on Real Prototypes. In International Symposium on High Performance Computer Architecture, February 2012.

[cut17] Apple 2017: The iPhone X (ten) announced. 2017.

[CW14] Nicholas Carlini and David Wagner. ROP is still dangerous: Breaking modern defenses. In Proceedings of the 23rd USENIX Security Symposium, August 2014.

[CWS+11] Niket K Choudhary, Salil V Wadhavkar, Tanmay A Shah, Hiran Mayukh, Jayneel Gandhi, Brandon H Dwiel, Sandeep Navada, Hashem H Najaf-abadi, and Eric Rotenberg. FabScalar: Composing synthesizable RTL designs of arbitrary cores within a canonical superscalar template. In ACM SIGARCH Computer Architecture News, volume 39, pages 11–22. ACM, 2011.

[CWS+12] Niket Choudhary, Salil Wadhavkar, Tanmay Shah, Hiran Mayukh, Jayneel Gandhi, Brandon Dwiel, Sandeep Navada, Hashem Najaf-abadi, and Eric Rotenberg. FabScalar: Automating superscalar core design. IEEE Micro, 32(3):48–59, 2012.

[CYH+08] Jiunn-Yeu Chen, Wuu Yang, Tzu-Han Hung, Hong-Men Su, and Wei-Chung Hsu. A static binary translator for efficient migration of ARM-based applications. In Workshop on Optimizations for DSP and Embedded Systems, April 2008.

[CZY+14] Yueqiang Cheng, Zongwei Zhou, Miao Yu, Xuhua Ding, and Robert H Deng. ROPecker: A generic and practical approach for defending against ROP attacks. In Symposium on Network and Distributed System Security (NDSS), 2014.

[DB13] Jeffrey Dean and Luiz André Barroso. The tail at scale. Communications of the ACM, 56(2):74–80, 2013.

[DF80] Jack W Davidson and Christopher W Fraser. The design and application of a retargetable peephole optimizer. ACM Transactions on Programming Languages and Systems (TOPLAS), 2(2):191–202, 1980.

[DK13] Christina Delimitrou and Christos Kozyrakis. Paragon: Qos-aware scheduling for heterogeneous datacenters. In ACM SIGPLAN Notices, volume 48, pages 77–88. ACM, 2013.

[DK14] Christina Delimitrou and Christos Kozyrakis. Quasar: resource-efficient and qos-aware cluster management. In ACM SIGPLAN Notices, volume 49, pages 127–144. ACM, 2014.

[DLS+15] Lucas Davi, Christopher Liebchen, Ahmad-Reza Sadeghi, Kevin Z Snow, and Fabian Monrose. Isomeron: Code randomization resilient to (just-in-time) return-oriented programming. July 2015.

[DLSM14] Lucas Davi, Daniel Lehmann, Ahmad-Reza Sadeghi, and Fabian Monrose. Stitching the gadgets: On the ineffectiveness of coarse-grained control-flow integrity protection. In Proceedings of the 23rd USENIX Security Symposium, August 2014.

[DS84] L. Peter Deutsch and Allan M. Schiffman. Efficient implementation of the Smalltalk-80 system. In Symposium on Principles of Programming Languages, January 1984.

[DSW11] Lucas Davi, Ahmad-Reza Sadeghi, and Marcel Winandy. ROPdefender: A detection tool to defend against return-oriented programming attacks. In Proceedings of the 6th ACM Symposium on Information, Computer and Communications Security, 2011.

[Dun90] Ralph Duncan. A survey of parallel computer architectures. Computer, 23(2):5–16, 1990.

[DVT12] Matt DeVuyst, Ashish Venkat, and Dean M. Tullsen. Execution migration in a heterogeneous-ISA chip multiprocessor. In Proceedings of the International Symposium on Architectural Support for Programming Languages and Operating Systems, 2012.

[EBA+11] Hadi Esmaeilzadeh, Emily Blem, Renee St Amant, Karthikeyan Sankaralingam, and Doug Burger. Dark Silicon and the End of Multicore Scaling. In International Symposium on Computer Architecture, June 2011.

[EFG+15] Isaac Evans, Sam Fingeret, Julián González, Ulziibayar Otgonbaatar, Tiffany Tang, Howard Shrobe, Stelios Sidiroglou-Douskos, Martin Rinard, and Hamed Okhravi. Missing the point(er): On the effectiveness of code pointer integrity. In Proceedings of the 36th IEEE Symposium on Security and Privacy, 2015.

[Eto03] Hiroaki Etoh. GCC extension for protecting applications from stack-smashing attacks. 2003.

[FCG00] Adam Ferrari, Steve J. Chapin, and Andrew Grimshaw. Heterogeneous Process State Capture and Recovery through Process Introspection. Cluster Computing, 2000.

[Fly72] Michael J Flynn. Some computer organizations and their effectiveness. IEEE Transactions on Computers, 100(9):948–960, 1972.

[FT09] FreeBSD Team. FreeBSD 8.0 Release Notes. 2009.

[fus08] The future is fusion: The Industry-Changing Impact of Accelerated Computing. Technical report, AMD, 2008.

[GA96] Lal George and Andrew W Appel. Iterated register coalescing. ACM Transactions on Programming Languages and Systems (TOPLAS), 18(3):300–324, 1996.

[GABP14] Enes Göktaş, Elias Athanasopoulos, Herbert Bos, and Georgios Portokalidis. Out of control: Overcoming control-flow integrity. In Proceedings of the 35th IEEE Symposium on Security and Privacy, May 2014.

[GAP+14] Enes Göktaş, Elias Athanasopoulos, Michalis Polychronakis, Herbert Bos, and Georgios Portokalidis. Size does matter: Why using gadget-chain length to prevent code-reuse attacks is hard. In Proceedings of the 23rd USENIX Security Symposium, August 2014.

[GB99] Rajiv Gupta and Rastislav Bodík. Register pressure sensitive redundancy elimination. CC, 99:107–121, 1999.

[Gre11] Peter Greenhalgh. big.LITTLE Processing with ARM Cortex-A15 & Cortex-A7. Technical report, ARM, 2011.

[HHD16] William H Hawkins, Jason D Hiser, and Jack W Davidson. Dynamic canary randomization for improved software security. In Proceedings of the 11th Annual Cyber and Information Security Research Conference, page 9. ACM, 2016.

[HM08] M.D. Hill and M.R. Marty. Amdahl's Law in the Multicore Era. Computer, July 2008.

[HNTC+12] Jason Hiser, Anh Nguyen Tuong, Michele Co, Matthew Hall, and Jack W Davidson. ILR: Where'd My Gadgets Go? In Proceedings of the 33rd IEEE Symposium on Security and Privacy, 2012.

[Höl10] Urs Hölzle. Brawny cores still beat wimpy cores, most of the time. IEEE Micro, 30(4), 2010.

[HS04] Shiliang Hu and James E. Smith. Using dynamic binary translation to fuse dependent instructions. In Proceedings of the International Symposium on Code Generation and Optimization: Feedback-directed and Runtime Optimization, CGO '04, pages 213–, Washington, DC, USA, 2004. IEEE Computer Society.

[HSA+16] Hong Hu, Shweta Shinde, Sendroiu Adrian, Zheng Leong Chua, Prateek Saxena, and Zhenkai Liang. Data-oriented programming: On the expressiveness of non-control data attacks. In Security and Privacy (SP), 2016 IEEE Symposium on, pages 969–986. IEEE, 2016.

[HZL+15] Chang-Hong Hsu, Yunqi Zhang, Michael A. Laurenzano, David Meisner, Thomas Wenisch, Lingjia Tang, Jason Mars, and Ron Dreslinski. Adrenaline: Pinpointing and reining in tail queries with quick voltage boosting. In Proceedings of the 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA), HPCA '15, Washington, DC, USA, 2015. IEEE Computer Society.

[IJJ09] Ciji Isen, Lizy K John, and Eugene John. A Tale of Two Processors: Revisiting the RISC-CISC Debate. In Computer Performance Evaluation and Benchmarking. 2009.

[Int] Intel. Intel 64 and IA-32 Architectures Software Developer's Manual.

[Int09] Intel Corporation. Intel® 64 and IA-32 Architectures Optimization Reference Manual. Number 248966-018. March 2009.

[Int14] Intel. Software guard extensions programming reference. 2014.

[JTL14] Dongseok Jang, Zachary Tatlock, and Sorin Lerner. SAFEDISPATCH: Securing C++ virtual calls from memory corruption attacks. In Proceedings of the 21st International Symposium on Network and Distributed System Security, February 2014.

[JYP+17] Norman P. Jouppi, Cliff Young, Nishant Patil, David Patterson, Gaurav Agrawal, Raminder Bajwa, Sarah Bates, Suresh Bhatia, Nan Boden, Al Borchers, Rick Boyle, Pierre-luc Cantin, Clifford Chao, Chris Clark, Jeremy Coriell, Mike Daley, Matt Dau, Jeffrey Dean, Ben Gelb, Tara Vazir Ghaemmaghami, Rajendra Gottipati, William Gulland, Robert Hagmann, C. Richard Ho, Doug Hogberg, John Hu, Robert Hundt, Dan Hurt, Julian Ibarz, Aaron Jaffey, Alek Jaworski, Alexander Kaplan, Harshit Khaitan, Daniel Killebrew, Andy Koch, Naveen Kumar, Steve Lacy, James Laudon, James Law, Diemthu Le, Chris Leary, Zhuyuan Liu, Kyle Lucke, Alan Lundin, Gordon MacKean, Adriana Maggiore, Maire Mahony, Kieran Miller, Rahul Nagarajan, Ravi Narayanaswami, Ray Ni, Kathy Nix, Thomas Norrie, Mark Omernick, Narayana Penukonda, Andy Phelps, Jonathan Ross, Matt Ross, Amir Salek, Emad Samadiani, Chris Severn, Gregory Sizikov, Matthew Snelham, Jed Souter, Dan Steinberg, Andy Swing, Mercedes Tan, Gregory Thorson, Bo Tian, Horia Toma, Erick Tuttle, Vijay Vasudevan, Richard Walter, Walter Wang, Eric Wilcox, and Doe Hyun Yoon. In-datacenter performance analysis of a tensor processing unit. In Proceedings of the 44th Annual International Symposium on Computer Architecture, ISCA '17, pages 1–12, New York, NY, USA, 2017. ACM.

[KDH+05] J. A. Kahle, M. N. Day, H. P. Hofstee, C. R. Johns, T. R. Maeurer, and D. Shippy. Introduction to the Cell multiprocessor. IBM Journal of Research and Development, July 2005.

[Kes99] Richard E Kessler. The Alpha 21264 Microprocessor. Micro, IEEE, 1999.

[KFJ+03] Rakesh Kumar, Keith I. Farkas, Norman P. Jouppi, Parthasarathy Ranganathan, and Dean M. Tullsen. Single-ISA Heterogeneous Multi-core Architectures: The Potential for Processor Power Reduction. In International Symposium on Microarchitecture, December 2003.

[KG05] Arvind Krishnaswamy and Rajiv Gupta. Efficient Use of Invisible Registers in Thumb Code. In International Symposium on Microarchitecture, December 2005.

[KKCL13] M Kim, H Kim, Hyunkwon Chung, and Kyoungmook Lim. Samsung Exynos 5410 processor: Experience the ultimate performance and versatility. White Paper, 2013.

[KKP03] Gaurav S Kc, Angelos D Keromytis, and Vassilis Prevelakis. Countering code-injection attacks with instruction-set randomization. In Proceedings of the 10th ACM conference on Computer and Communications Security, 2003.

[KOAGP12] Mehmet Kayaalp, Meltem Ozsoy, Nael Abu-Ghazaleh, and Dmitry Ponomarev. Branch regulation: Low-overhead protection from code reuse attacks. In Proceedings of the 39th Annual International Symposium on Computer Architecture, 2012.

[Kor10] Tim Kornau. Return oriented programming for the ARM architecture. Master's thesis, Ruhr-Universität Bochum, 2010.

[KSP+14] Volodymyr Kuznetsov, László Szekeres, Mathias Payer, George Candea, R Sekar, and Dawn Song. Code-pointer integrity. In USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2014.

[KTJ06] Rakesh Kumar, Dean M. Tullsen, and Norman P. Jouppi. Core Architecture Optimization for Heterogeneous Chip Multiprocessors. In International Conference on Parallel Architectures and Compilation Techniques, September 2006.

[KTR+04] Rakesh Kumar, Dean M. Tullsen, Parthasarathy Ranganathan, Norman P. Jouppi, and Keith I. Farkas. Single-ISA Heterogeneous Multi-core Architectures for Multithreaded Workload Performance. In International Symposium on Computer Architecture, June 2004.

[LA04a] Chris Lattner and Vikram Adve. LLVM: A compilation framework for lifelong program analysis & transformation. In Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization, page 75. IEEE Computer Society, 2004.

[LA04b] Chris Lattner and Vikram Adve. LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation. In International Symposium on Code Generation and Optimization, March 2004.

[LAR+15] Bruno Cardoso Lopes, Rafael Auler, Luiz Ramos, Edson Borin, and Rodolfo Azevedo. Shrink: Reducing the ISA complexity via instruction recycling. In ACM SIGARCH Computer Architecture News, volume 43, pages 311–322. ACM, 2015.

[LAS+09a] Sheng Li, Jung Ho Ahn, Richard D Strong, Jay B Brockman, Dean M Tullsen, and Norman P Jouppi. McPAT: an Integrated Power, Area, and Timing Modeling Framework for Multicore and Manycore Architectures. In International Symposium on Microarchitecture, December 2009.

[LAS+09b] Sheng Li, Jung Ho Ahn, Richard D. Strong, Jay B. Brockman, Dean M. Tullsen, and Norman P. Jouppi. McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures. In Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 42, pages 469–480, New York, NY, USA, 2009. ACM.

[Lat08] Chris Lattner. LLVM and Clang: Next Generation Compiler Technology. In The BSD Conference, May 2008.

[LBK+10] Tong Li, Paul Brett, Rob Knauerhase, David Koufaty, Dheeraj Reddy, and Scott Hahn. Operating system support for overlapping-ISA heterogeneous multi-core architectures. In Proceedings of the 16th International Symposium on High Performance Computer Architecture, January 2010.

[LCG+14] David Lo, Liqun Cheng, Rama Govindaraju, Luiz André Barroso, and Christos Kozyrakis. Towards energy proportionality for large-scale latency-critical workloads. In Proceedings of the 41st Annual International Symposium on Computer Architecture, 2014.

[LCG+15] David Lo, Liqun Cheng, Rama Govindaraju, Parthasarathy Ranganathan, and Christos Kozyrakis. Heracles: Improving resource efficiency at scale. In ACM SIGARCH Computer Architecture News, volume 43, pages 450–462. ACM, 2015.

[LDDLARS14] Lucas Davi, Daniel Lehmann, and Ahmad-Reza Sadeghi. The Beast is in Your Memory: Return-Oriented Programming Attacks Against Modern Control-Flow Integrity Protection Techniques. In BlackHat USA, Aug 2014.

[LGAT00] Guei-Yuan Lueh, Thomas Gross, and Ali-Reza Adl-Tabatabai. Fusion-based register allocation. ACM Transactions on Programming Languages and Systems (TOPLAS), 22(3):431–470, 2000.

[LHBF14] Per Larsen, Andrei Homescu, Stefan Brunthaler, and Michael Franz. SoK: Automated software diversity. In 2014 IEEE Symposium on Security and Privacy (SP), pages 276–291. IEEE, 2014.

[LLNB16] Kangjie Lu, Wenke Lee, Stefan Nürnberger, and Michael Backes. How to make ASLR win the clone wars: Runtime re-randomization. In NDSS, 2016.

[LPD+12] Andrew Lukefahr, Shruti Padmanabha, Reetuparna Das, Faissal M Sleiman, Ronald Dreslinski, Thomas F Wenisch, and Scott Mahlke. Composite cores: Pushing heterogeneity into a core. In Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture, pages 317–328. IEEE Computer Society, 2012.

[LPD+14] Andrew Lukefahr, Shruti Padmanabha, Reetuparna Das, Ronald Dreslinski Jr, Thomas F Wenisch, and Scott Mahlke. Heterogeneous microarchitectures trump voltage scaling for low-power cores. In Proceedings of the 23rd international conference on Parallel architectures and compilation, pages 237–250. ACM, 2014.

[LPD+16] Andrew Lukefahr, Shruti Padmanabha, Reetuparna Das, Faissal M Sleiman, Ronald G Dreslinski, Thomas F Wenisch, and Scott Mahlke. Exploring fine-grained heterogeneity with composite cores. IEEE Transactions on Computers, 65(2):535–547, 2016.

[LTPM15] Daniel Lustig, Caroline Trippel, Michael Pellauer, and Margaret Martonosi. ArMOR: Defending Against Memory Consistency Model Mismatches in Heterogeneous Architectures. In Proceedings of the 42nd International Symposium on Computer Architecture, June 2015.

[Mat16] Craig Matsumoto. The case for ARM in the data center. SDXCentral, 2016.

[MBSN14] Michael Backes and Stefan Nürnberger. Oxymoron: Making Fine-Grained Memory Randomization Practical by Allowing Code Sharing. In Proceedings of the 23rd USENIX Security Symposium, Aug 2014.

[MFN+17] Michael McKeown, Yaosheng Fu, Tri Nguyen, Yanqi Zhou, Jonathan Balkind, Alexey Lavrov, Mohammad Shahrad, Samuel Payne, and David Wentzlaff. Piton: A manycore processor for multitenant clouds. IEEE Micro, 37(2):70–80, 2017.

[MHB+94] Scott A Mahlke, Richard E Hank, Roger A Bringmann, John C Gyllenhaal, David M Gallagher, and Wen-mei W Hwu. Characterizing the impact of predicated execution on branch prediction. In Proceedings of the 27th annual international symposium on Microarchitecture, pages 217–227. ACM, 1994.

[MHM+95] Scott A Mahlke, Richard E Hank, James E McCormick, David I August, and Wen-Mei W Hwu. A comparison of full and partial predicated execution support for ILP processors. ACM SIGARCH Computer Architecture News, 23(2):138–150, 1995.

[MLC+92] Scott A Mahlke, David C Lin, William Y Chen, Richard E Hank, and Roger A Bringmann. Effective compiler support for predicated execution using the hyperblock. In ACM SIGMICRO Newsletter, volume 23, pages 45–54. IEEE Computer Society Press, 1992.

[MNM+15] Nikola Markovic, Daniel Nemirovsky, Veljko Milutinovic, Osman Unsal, Mateo Valero, and Adrian Cristal. Hardware round-robin scheduler for single-isa asymmetric multi-core. In European Conference on Parallel Processing, pages 122–134. Springer, 2015.

[MNU+15] Nikola Markovic, Daniel Nemirovsky, Osman Unsal, Mateo Valero, and Adrian Cristal. Thread lock section-aware scheduling on asymmetric single-ISA multi-core. IEEE Computer Architecture Letters, 14(2):160–163, 2015.

[Moo10] H. D. Moore. Microsoft Internet Explorer data binding memory corruption. 2010.

[Mor15] Timothy Prickett Morgan. Google will do anything to beat Moore's law. The Next Platform, 2015.

[MSD] MSDN. Introduction to code signing.

[MST] Venkateswara Madduri, Ross Segelken, and Bret Toll. Method and apparatus for variable length instruction parallel decoding. In United States Patent Application 10/331,335.

[MT13] Jason Mars and Lingjia Tang. Whare-map: heterogeneity in homogeneous warehouse-scale computers. In ACM SIGARCH Computer Architecture News, volume 41, pages 619–630. ACM, 2013.

[NAM+17] Daniel Nemirovsky, Tugberk Arkose, Nikola Markovic, Mario Nemirovsky, Osman Unsal, and Adrian Cristal. A machine learning approach for performance prediction and scheduling on heterogeneous CPUs. In Computer Architecture and High Performance Computing (SBAC-PAD), 2017 29th International Symposium on, pages 121–128. IEEE, 2017.

[NEE17] Ajeya Naithani, Stijn Eyerman, and Lieven Eeckhout. Reliability-aware scheduling on heterogeneous multicore processors. In High Performance Computer Architecture (HPCA), 2017 IEEE International Symposium on, pages 397–408. IEEE, 2017.

[Nic16] James Niccolai. IBM's POWER chips hit the big time at Google. PC World, 2016.

[NR16] Joel Nider and Mike Rapoport. Cross-isa container migration. In Proceedings of the 9th ACM International on Systems and Storage Conference, SYSTOR ’16, pages 24:1–24:1, New York, NY, USA, 2016. ACM.

[nvd] National Vulnerability Database.

[OBL+10] Kaan Onarlioglu, Leyla Bilge, Andrea Lanzi, Davide Balzarotti, and Engin Kirda. G-Free: defeating return-oriented programming through gadget-less binaries. In Proceedings of the 26th Annual Computer Security Applications Conference, 2010.

[ope16] IBM's POWER chips hit the big time at Google. Technical report, 2016.

[OVB+06] Hilmi Ozdoganoglu, TN Vijaykumar, Carla E Brodley, Benjamin A Kuperman, and Ankit Jalote. SmashGuard: A hardware solution to prevent security attacks on the function return address. IEEE Transactions on Computers, 2006.

[PCC+14] Andrew Putnam, Adrian M Caulfield, Eric S Chung, Derek Chiou, Kypros Constantinides, John Demme, Hadi Esmaeilzadeh, Jeremy Fowers, Gopi Prashanth Gopal, Jan Gray, et al. A reconfigurable fabric for accelerating large-scale datacenter services. In Proceedings of the 41st International Symposium on Computer Architecture, June 2014.

[PHVB+03] Erez Perelman, Greg Hamerly, Michael Van Biesbrouck, Timothy Sherwood, and Brad Calder. Using SimPoint for Accurate and Efficient Simulation. In ACM SIGMETRICS Performance Evaluation Review, June 2003.

[PLD+15] Vinicius Petrucci, Michael A Laurenzano, John Doherty, Yunqi Zhang, Daniel Mosse, Jason Mars, and Lingjia Tang. Octopus-man: QoS-driven task management for heterogeneous multicores in warehouse-scale computers. In 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA), pages 246–258. IEEE, 2015.

[PLDM15] Shruti Padmanabha, Andrew Lukefahr, Reetuparna Das, and Scott Mahlke. DynaMOS: Dynamic schedule migration for heterogeneous cores. In Proceedings of the 48th International Symposium on Microarchitecture, pages 322–333. ACM, 2015.

[PLPI13] Antonis Papadogiannakis, Laertis Loutsis, Vassilis Papaefstathiou, and Sotiris Ioannidis. ASIST: Architectural support for instruction set randomization. In Proceedings of the 2013 ACM SIGSAC conference on Computer & Communications Security, 2013.

[PPK12] Vasilis Pappas, Michalis Polychronakis, and Angelos D Keromytis. Smashing the gadgets: Hindering return-oriented programming using in-place code randomization. In Proceedings of the 33rd IEEE Symposium on Security and Privacy, 2012.

[PT03a] PaX Team. PaX address space layout randomization. 2003.

[PT03b] PaX Team. PaX non-executable pages design and implementation. 2003.

[PZL06] Yong-Joon Park, Zhao Zhang, and Gyungho Lee. Microarchitectural protection against stack-based buffer overflow attacks. IEEE Micro, 2006.

[Qua11] Qualcomm. Snapdragon S4 Processors: System on Chip Solutions for a New Mobile Age. Technical report, October 2011.

[RBSS12] Ryan Roemer, Erik Buchanan, Hovav Shacham, and Stefan Savage. Return-oriented programming: Systems, languages, and applications. ACM Transactions on Information and System Security, 2012.

[RH97] Anton Chernoff and Ray Hookway. DIGITAL FX!32 running 32-bit x86 applications on Alpha NT. In USENIX Windows NT Workshop, August 1997.

[RI11] Chris Rohlf and Yan Ivnitskiy. Attacking clientside JIT compilers. Black Hat, USA, 2011.

[RMPB09] Giampaolo Fresi Roglia, Lorenzo Martignoni, Roberto Paleari, and Danilo Bruschi. Surgically returning to randomized lib(c). In Proceedings of the 25th Annual Computer Security Applications Conference, 2009.

[SAB11] Edward J Schwartz, Thanassis Avgerinos, and David Brumley. Q: Exploit Hardening Made Easy. In Proceedings of the 20th USENIX Security Symposium, 2011.

[san08] 2nd Generation Intel Core vPro Processor Family. Technical report, Intel, 2008.

[SB09] Lukasz Strozek and David Brooks. Energy- and Area-Efficient Architectures through Application Clustering and Architectural Heterogeneity. ACM Transactions on Architecture and Code Optimization, 2009.

[SD97] Solar Designer. Getting around non-executable stack (and fix). 1997.

[SD09] Richard M Stallman and the GCC Developer Community. Using the GNU Compiler Collection: A GNU Manual for GCC Version 4.3.3. CreateSpace, 2009.

[SH98] Peter Smith and Norman C Hutchinson. Heterogeneous Process Migration: The Tui System. Software-Practice and Experience, 1998.

[Sha07] Hovav Shacham. The geometry of innocent flesh on the bone: Return-into-libc without function calls (on the x86). In Proceedings of the 14th ACM conference on Computer and Communications Security, 2007.

[SKIH12] Eitaro Shioji, Yuhei Kawakoya, Makoto Iwamura, and Takeo Hariu. Code shredding: Byte-granular randomization of program layout for detecting code-reuse attacks. In Proceedings of the 28th Annual Computer Security Applications Conference, 2012.

[SKKK16] Sudarshan Srinivasan, Nithesh Kurella, Israel Koren, and Sandip Kundu. Exploring heterogeneity within a core for improved power efficiency. IEEE Transactions on Parallel and Distributed Systems, 27(4):1057–1069, 2016.

[SLZD04] G Edward Suh, Jae W Lee, David Zhang, and Srinivas Devadas. Secure program execution via dynamic information flow tracking. In Proceedings of the 11th International Conference on Architectural Support for Programming Languages and Operating Systems, 2004.

[SMD+13] Kevin Z Snow, Fabian Monrose, Lucas Davi, Alexandra Dmitrienko, Christopher Liebchen, and Ahmad-Reza Sadeghi. Just-in-time code reuse: On the effectiveness of fine-grained address space layout randomization. In Proceedings of the 34th IEEE Symposium on Security and Privacy, 2013.

[SN05] James Smith and Ravi Nair. Virtual Machines: Versatile Platforms for Systems and Processes. Morgan Kaufmann Publishers Inc., June 2005.

[SPHC02] Timothy Sherwood, Erez Perelman, Greg Hamerly, and Brad Calder. Automatically Characterizing Large Scale Program Behavior. In Proceedings of the 7th International Conference on Architectural Support for Programming Languages and Operating Systems, October 2002.

[SPP+04] Hovav Shacham, Matthew Page, Ben Pfaff, Eu-Jin Goh, Nagendra Modadugu, and Dan Boneh. On the effectiveness of address-space randomization. In Proceedings of the 11th ACM conference on Computer and Communications Security, 2004.

[SRWB14] Yakun Sophia Shao, Brandon Reagen, Gu-Yeon Wei, and David Brooks. Aladdin: A pre-RTL, power-performance accelerator simulator enabling large design space exploration of customized architectures. In ACM SIGARCH Computer Architecture News, volume 42, pages 97–108. IEEE Press, 2014.

[SRWB15] Yakun Sophia Shao, Brandon Reagen, Gu-Yeon Wei, and David Brooks. The Aladdin approach to accelerator design and modeling. IEEE Micro, 35(3):58–70, 2015.

[STL+15] Felix Schuster, Thomas Tendyck, Christopher Liebchen, Lucas Davi, Ahmad-Reza Sadeghi, and Thorsten Holz. Counterfeit object-oriented programming, May 2015.

[Sun13] Rahul Sundaram. Fedora 20 Security Features - Stack Smash Protection, Buffer Overflow Detection, and Variable Reordering. 2013.

[SWTB11] Lina Sawalha, Sonya Wolff, Monte P Tull, and Ronald D Barnes. Phase-guided scheduling on single-isa heterogeneous multicore processors. In Digital System Design (DSD), 2011 14th Euromicro Conference on, pages 736–745. IEEE, 2011.

[teg10] The Benefits of Multiple CPU Cores in Mobile Devices. Technical report, NVidia, 2010.

[teg11] Variable SMP - A Multi-Core CPU Architecture for Low Power and High Performance. Technical report, NVidia, 2011.

[Ter11] Steve Terpe. Why Instruction Sets No Longer Matter. 2011.

[Tex] Texas Instruments Inc. OMAP5912 Multimedia Processor Device Overview and Architecture Reference Guide.

[UT06] Ubuntu Team. Ubuntu 6.10 Security Features - Stack Protector. 2006.

[VBSS94] David G. Von Bank, Charles M. Shub, and Robert W. Sebesta. A Unified Model of Pointwise Equivalence of Procedural Computations. ACM Transactions on Programming Languages and Systems, 1994.

[VCAH+13] Kenzo Van Craeynest, Shoaib Akram, Wim Heirman, Aamer Jaleel, and Lieven Eeckhout. Fairness-aware scheduling on single-ISA heterogeneous multi-cores. In Parallel Architectures and Compilation Techniques (PACT), 2013 22nd International Conference on, pages 177–187. IEEE, 2013.

[VCE13] Kenzo Van Craeynest and Lieven Eeckhout. Understanding fundamental design choices in single-isa heterogeneous multicore architectures. ACM Transactions on Architecture and Code Optimization (TACO), 9(4):32, 2013.

[VCJE+12] Kenzo Van Craeynest, Aamer Jaleel, Lieven Eeckhout, Paolo Narvaez, and Joel Emer. Scheduling heterogeneous multi-cores through performance impact estimation (pie). In Proceedings of the 39th Annual International Symposium on Computer Architecture, June 2012.

[VdV04] Arjan Van de Ven. New Security Enhancements in Red Hat Enterprise Linux v.3, update 3. 2004.

[vdVAG+15] Victor van der Veen, Dennis Andriesse, Enes Göktaş, Ben Gras, Lionel Sambuc, Asia Slowinska, Herbert Bos, and Cristiano Giuffrida. Practical context-sensitive CFI. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pages 927–940. ACM, 2015.

[Ven01] Vendicator. StackShield: A Stack Smashing Technique Protection Tool for Linux. 2001.

[VKYP15] Ashish Venkat, Arvind Krishnaswamy, Koichi Yamada, and Rajan Palanivel. Binary Translation driven Program State Relocation. In United States Patent Grant US009135435B2, 2015.

[VPMPADK13] Vasilis Pappas, Michalis Polychronakis, and Angelos D. Keromytis. Transparent ROP Exploit Mitigation Using Indirect Branch Tracing. In Proceedings of the 22nd USENIX Security Symposium, 2013.

[VSG+10] Ganesh Venkatesh, Jack Sampson, Nathan Goulding, Saturnino Garcia, Vladyslav Bryksin, Jose Lugo Martinez, Steven Swanson, and Michael Bedford Taylor. Conservation Cores: Reducing the Energy of Mature Computations. In Proceedings of the 15th International Conference on Architectural Support for Programming Languages and Operating Systems, March 2010.

[VSST16] Ashish Venkat, S Shamasunder, Hovav Shacham, and Dean M. Tullsen. HIPStR: Heterogeneous-ISA program state relocation. In Proceedings of the International Symposium on Architectural Support for Programming Languages and Operating Systems, 2016.

[VT14] Ashish Venkat and Dean M. Tullsen. Harnessing ISA diversity: Design of a heterogeneous-ISA chip multiprocessor. In Proceedings of the International Symposium on Computer Architecture, 2014.

[WA12] Daniel Wong and Murali Annavaram. KnightShift: Scaling the energy proportionality wall through server-level heterogeneity. In 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture, pages 119–130. IEEE, 2012.

[Wev04] Berend-Jan Wever. Internet Explorer IFRAME src&name parameter BoF remote compromise. 2004.

[WM09] Vincent M Weaver and Sally A McKee. Code Density Concerns for New Architectures. In International Conference on Computer Design, October 2009.

[WMHL12] Richard Wartell, Vishwath Mohan, Kevin W Hamlen, and Zhiqiang Lin. Binary stirring: Self-randomizing instruction addresses of legacy x86 binary code. In Proceedings of the 2012 ACM conference on Computer and Communications Security, 2012.

[WYZ+17] Wenwen Wang, Pen-Chung Yew, Antonia Zhai, Stephen McCamant, Youfeng Wu, and Jayaram Bobba. Enabling cross-ISA offloading for COTS binaries. In Proceedings of the 15th Annual International Conference on Mobile Systems, Applications, and Services, pages 319–331. ACM, 2017.

[XKPI02] Jun Xu, Zbigniew Kalbarczyk, Sanjay Patel, and Ravishankar K Iyer. Architecture support for defending against buffer overflow attacks. Workshop on Evaluating and Architecting Systems for Dependability, 2002.

[ZA03] Peng Zhao and José Nelson Amaral. To inline or not to inline? Enhanced inlining decisions. In International Workshop on Languages and Compilers for Parallel Computing, pages 405–419. Springer, 2003.

[ZS13] Mingwei Zhang and R Sekar. Control flow integrity for COTS binaries. In Proceedings of the 22nd USENIX Security Symposium, 2013.

[ZT00] Cindy Zheng and Carol Thompson. PA-RISC to IA-64: Transparent execution, no recompilation. Computer, March 2000.

[ZWC+13] Chao Zhang, Tao Wei, Zhaofeng Chen, Lei Duan, László Szekeres, Stephen McCamant, Dawn Song, and Wei Zou. Practical control flow integrity and randomization for binary executables. In Proceedings of the 34th IEEE Symposium on Security and Privacy, 2013.
