
Designing Practical Bug Detectors Using Commodity Hardware and Common Programming Patterns

Tong Zhang

Dissertation submitted to the Faculty of the Virginia Polytechnic Institute and State University in partial fulfillment of the requirements for the degree of

Doctor of Philosophy in Computer Science and Applications

Dongyoon Lee, Co-chair
Changhee Jung, Co-chair
Kirk Cameron
Danfeng Yao
Weidong Cui

December 9, 2019 Blacksburg, Virginia

Keywords: Software Bug Detection, Commodity Hardware, Data Race Detection, Permission Check Placement Analysis

Copyright 2019, Tong Zhang

Designing Practical Software Bug Detectors Using Commodity Hardware and Common Programming Patterns

Tong Zhang

(ABSTRACT)

Software bugs can cost millions of dollars and affect people’s daily lives. However, many bug detection tools are not practical in reality, which hinders their wide adoption. There are three main concerns regarding existing bug detectors: 1) run-time overhead in dynamic bug detectors, 2) space overhead in dynamic bug detectors, and 3) scalability and precision issues in static bug detectors. With those in mind, we propose to: 1) leverage commodity hardware to reduce run-time overhead, 2) reuse metadata maintained by one bug detector to detect other types of bugs, reducing space overhead, and 3) apply programming idioms to static analyses, improving scalability and precision. We demonstrate the effectiveness of the three approaches on data race bugs, memory safety bugs, and permission check bugs, respectively. First, we leverage commodity hardware transactional memory (HTM) to invoke the dynamic data race detector selectively, only when necessary, thereby reducing the overhead from 11.68x to 4.65x. We then present a production-ready data race detector, which incurs only a 2.6% run-time overhead, by using performance monitoring units (PMUs) for online memory access sampling and offline reconstruction of unsampled memory accesses. Second, for memory safety bugs, which are more common than data races, we provide practical temporal memory safety on top of the spatial memory safety of Intel MPX in a memory-efficient manner without additional hardware support. We achieve this by reusing the metadata and checks already available in Intel MPX-instrumented applications, thereby offering full memory safety at only a 36% memory overhead. Finally, we design a scalable and precise function pointer analysis tool leveraging indirect call usage patterns in the Linux kernel. We applied the tool to the detection of permission check bugs; the detector found 14 previously unknown bugs within a limited time budget.
Designing Practical Software Bug Detectors Using Commodity Hardware and Common Programming Patterns

Tong Zhang

(GENERAL AUDIENCE ABSTRACT)

Software bugs have caused many real-world problems, e.g., the 2003 Northeast blackout and the Facebook stock price mismatch. Finding bugs is critical to solving those problems. Unfortunately, many existing bug detectors suffer from high run-time and space overheads as well as scalability and precision issues. In this dissertation, we address the limitations of bug detectors by leveraging commodity hardware and common programming patterns. In particular, we focus on improving the run-time overhead of dynamic data race detectors, the space overhead of a memory safety bug detector, and the scalability and precision of a Linux kernel permission check bug detector. We first present a data race detector built upon commodity hardware transactional memory that achieves a 7x overhead reduction compared to the state-of-the-art solution (Google’s TSAN). We then present a very lightweight sampling-based data race detector which re-purposes performance monitoring hardware features for lightweight sampling and uses a novel offline analysis for better race detection capability. Our results show a very low overhead (2.6%) with a 27.5% detection probability at a sampling period of 10,000. Next, we present a space-efficient temporal memory safety bug detector that builds on a hardware spatial memory safety bug detector without additional hardware support. According to experimental results, our full memory safety solution incurs only a 36% memory overhead with a 60% run-time overhead. Finally, we present a permission check bug detector for the Linux kernel. This bug detector leverages indirect call usage patterns in the Linux kernel for scalable and precise analysis. As a result, within a limited time budget (scalable), the detector discovered 14 previously unknown bugs (precise).

Dedication

Dedicated to my wife Wei Song, my parents, and my boy Ethan.

Acknowledgments

First of all, I would like to thank my advisors, Drs. Dongyoon Lee and Changhee Jung, and express my highest appreciation and deepest gratitude to them. They are great advisors and mentors. They introduced me to systems research, gave me a great deal of guidance, and shared their profound knowledge with me to make me a better researcher. They care about me and have also taught me the wisdom of life to make me a better person. Without their guidance, I would not have been able to make such an achievement. I feel very lucky to have Dr. Dongyoon Lee and Dr. Changhee Jung as my advisors, and I sincerely hope that we will continue our academic collaborations in the future. I would also like to thank all my committee members, Drs. Kirk Cameron, Danfeng Yao, and Weidong Cui, and express my sincere gratitude to them for their valuable feedback and insightful comments on my research. I am also very grateful to my mentors, Drs. Wenbo Shen and Ahmed Azab, and my collaborators, Drs. Ruowen Wang and Tongping Liu, as well as Hongyu Liu and Sam Silvestro, for their collaborative efforts in our joint projects. It was a great pleasure to make many friends at Virginia Tech: Drs. Qingrui Liu, Ke Tian, Zheng Song, Hao Zhang, Xiaokui Shu, Fang Liu, Xiaodong Yu, Bo Li, Run Yu, and Yue Cheng, as well as Xinwei Fu, Spencer Lee, Peeratham Techapalokul, Dong Chen, Ye Wang, Xuewen Cui, Shengzhe Xu, Peng Peng, Da Zhang, Hu, and many other friends. I have had so many happy and enjoyable moments with you all. I would like to thank my family for the continuous support they have given me throughout my time in graduate school; I could not have done it without their support. This thesis was supported in part by the National Science Foundation under grants CCF-1527463, CSR-1750503, and CSR-1814430, Google Faculty Research Awards, and a Pratt Fellowship.

Contents

List of Figures xiii

List of Tables xvi

1 Introduction 1

1.1 Three Focused Software Bugs ...... 1

1.1.1 Data Race Bugs ...... 2

1.1.2 Memory Safety Bugs ...... 2

1.1.3 Permission Check Bugs ...... 3

1.2 Problem Statements ...... 4

1.2.1 Time and Space Overheads of Dynamic Bug Detectors ...... 5

1.2.2 Scalability and Precision Issues in Static Bug Detectors ...... 6

1.3 Thesis Statement ...... 8

1.4 Contributions ...... 11

1.4.1 Reducing Run-time Overhead of Dynamic Data Race Bug Detectors 11

1.4.2 Reducing Space Overhead of Dynamic Memory Safety Bug Detectors 12

1.4.3 Solving Scalability and Precision Issues in Static Linux Kernel Permission Bug Detectors ...... 13

1.5 Organization ...... 14

2 Literature Review 15

2.1 Data Races ...... 15

2.1.1 Lockset-based Approaches ...... 16

2.1.2 Overlap-based Approaches ...... 16

2.1.3 Hardware Data Race Detectors ...... 17

2.1.4 Sampling-based Approaches ...... 18

2.1.5 Hybrid Static/Dynamic Approaches ...... 19

2.1.6 Other Approaches to Reduce Dynamic Data Race Detector Overhead 19

2.1.7 Other Related Works ...... 20

2.2 Memory Safety ...... 21

2.2.1 Spatial Memory Safety ...... 21

2.2.2 Temporal Memory Safety ...... 22

2.3 Permission Check Bugs in Linux Kernel ...... 24

2.3.1 Permission Checks in Linux ...... 24

2.3.2 Hook Verification and Placement ...... 26

2.3.3 Kernel Static Analysis Tools ...... 27

2.3.4 Permission Check Analysis Tools ...... 28

3 Efficient Data Race Detection Using Hardware Transactional Memory 30

3.1 Introduction ...... 31

3.2 Background and Challenges ...... 34

3.2.1 Hardware Transactional Memory ...... 34

3.2.2 Challenges in Using HTM for Race Detection ...... 35

3.3 Overview ...... 36

3.4 Fast Path HTM-based Race Detection ...... 39

3.4.1 Transactionalization ...... 39

3.4.2 Handling Transactional Aborts ...... 41

3.4.3 Optimization ...... 43

3.5 Slow Path Software-based Race Detection ...... 45

3.6 False Negatives ...... 47

3.7 Implementation ...... 48

3.8 Evaluation ...... 50

3.8.1 Methodology ...... 50

3.8.2 Performance Overhead ...... 51

3.8.3 False Negatives ...... 54

3.8.4 Cost-Effectiveness of Data Race Detection ...... 55

3.9 Summary ...... 58

4 Practical Data Race Detection for Production Use 59

4.1 Introduction ...... 60

4.2 Overview ...... 64

4.3 Lightweight Program Tracing ...... 65

4.3.1 PEBS-based Memory Access Sampling ...... 66

4.3.2 PT-based Control-flow Tracing ...... 69

4.3.3 Synchronization Tracing ...... 69

4.4 Recovering Unsampled Memory Accesses ...... 70

4.4.1 Forward Replay ...... 71

4.4.2 Backward Replay ...... 73

4.5 Implementation ...... 75

4.6 Evaluation ...... 76

4.6.1 Methodology ...... 76

4.6.2 Performance Overhead ...... 78

4.6.3 Trace Size ...... 80

4.6.4 Race Detection ...... 82

4.6.5 Memory Operation Reconstruction ...... 83

4.6.6 Offline Analysis Overhead ...... 84

4.7 Summary ...... 85

5 Memory Efficient Temporal Memory Safety Solution for MPX 86

5.1 Introduction ...... 87

5.2 Overview of BOGO ...... 89

5.3 BOGO Approach Details ...... 93

5.3.1 Hot Bound Table Page Tracking ...... 93

5.3.2 Combine PartialScan and PageFaultScan to Achieve Low Overhead and No False Negative ...... 93

5.3.3 Combine PageFaultScan and RedundancyPredication to Achieve Low Overhead and No False Positive ...... 94

5.4 Optimization ...... 98

5.4.1 No PageFaultScan Optimization ...... 98

5.4.2 FullScan Optimization ...... 99

5.5 Dynamic Adaptation of Queue Size ...... 100

5.5.1 Scan Cost Analysis ...... 100

5.5.2 Impact of HPQ and FAQ Sizes ...... 100

5.5.3 Scan Cost-based HPQ Adaptive Scheme ...... 102

5.6 Discussion ...... 102

5.7 Implementation ...... 103

5.7.1 Spatial Memory Safety ...... 104

5.7.2 Temporal Memory Safety ...... 105

5.8 Evaluation ...... 105

5.8.1 Methodology ...... 105

5.8.2 Security Evaluation ...... 109

5.8.3 SPEC CPU 2006 Benchmark ...... 111

5.8.4 Malloc/Free Benchmark ...... 115

5.8.5 Real-World Applications ...... 116

5.9 Summary ...... 117

6 A Permission Check Analysis Framework for Linux Kernel 118

6.1 Introduction ...... 119

6.2 Examples of Permission Check Errors ...... 122

6.2.1 Capability Permission Check Errors ...... 124

6.2.2 LSM Permission Check Errors ...... 125

6.3 Challenges ...... 126

6.3.1 Indirect Call Analysis in Kernel ...... 126

6.3.2 The Lack of Full Permission Checks, Privileged Functions, and Their Mappings ...... 129

6.4 KIRIN Indirect Call Analysis ...... 130

6.4.1 Indirect Call Target Collection ...... 131

6.4.2 Indirect Callsite Resolution ...... 132

6.5 Design of PeX ...... 134

6.5.1 Call Graph Generation and Partition ...... 135

6.5.2 Permission Check Wrapper Detection ...... 136

6.5.3 Privileged Function Detection ...... 137

6.5.4 Non-privileged Function Filter ...... 138

6.5.5 Permission Check Error Detection ...... 139

6.6 Implementation and Evaluation ...... 140

6.6.1 Evaluation Methodology ...... 141

6.6.2 Evaluation of KIRIN ...... 142

6.6.3 PeX Result ...... 144

6.6.4 Manual Review of Warnings ...... 146

6.6.5 Discussion of Findings ...... 147

6.7 Summary ...... 150

7 Conclusion 152

Bibliography 154

List of Figures

3.1 TxRace: Transactionalization ...... 37

3.2 TxRace Overview ...... 37

3.3 TxRace Runtime Example ...... 38

3.4 (a) Race detected with long transactions (b) Race missed with short transactions ...... 41

3.5 Detecting data races between fast and slow paths using the strong isolation property of HTM ...... 42

3.6 Tracking the happens-before order of synchronizations on the fast path eliminates false warnings on the slow path ...... 46

3.7 Breakdown of runtime overhead...... 49

3.8 Scalability of TxRace ...... 49

3.9 Effectiveness of loop-cut optimization ...... 53

3.10 The number of detected distinct data races across multiple runs for vips .. 53

3.11 Cost-Effectiveness of TxRace vs. Sampling ...... 56

3.12 Runtime overhead for bodytrack ...... 57

3.13 Recall for bodytrack ...... 57

4.1 Overview of the ProRace Architecture...... 64

4.2 The vanilla Linux PEBS driver ...... 66

4.3 The ProRace PEBS driver ...... 66

4.4 Forward and Backward Replays...... 70

4.5 Example for Forward and Backward Replay ...... 73

4.6 Performance overhead for PARSEC benchmarks ...... 77

4.7 Performance overhead for real applications ...... 77

4.8 Space overhead for PARSEC benchmarks ...... 78

4.9 Space overhead for real applications ...... 78

4.10 Performance overhead comparison ...... 79

4.11 Memory Recovery Ratio ...... 83

4.12 Offline analysis overhead ...... 83

5.1 BOGO: FullScan/PartialScan and PageFaultScan ...... 90

5.2 BOGO handler algorithms...... 95

5.3 Redundancy prediction: (a) shows a false positive case, and (b) shows how BOGO removes the redundant scans and eliminates the false positive. Actions of BOGO appear above the bar, while the status of HPQ/FAQ appears underneath the code. Each color represents a different redundant scan. ... 96

5.4 Example of sound PartialScan. No further PageFaultScan is required. .... 98

5.5 MPX compilers overheads: geomean of SPEC 2006...... 106

5.6 Performance Overhead...... 108

5.7 BOGO Performance Overhead Breakdown...... 108

5.8 Performance Overhead of Full Memory Safety Solutions...... 110

5.9 Memory Overhead...... 110

5.10 Sensitivity study: varying HPQ, fixed-size FAQ...... 111

5.11 Sensitivity study: varying FAQ, fixed-size HPQ...... 111

5.12 Malloc-free benchmark performance. The bar graph shows the normalized overhead (left y-axis). The line graph shows the free frequency (right y-axis). 115

5.13 Real-world application performance...... 115

6.1 Capability check errors discovered by PeX...... 123

6.2 LSM check errors discovered by PeX...... 125

6.3 Indirect call examples via the VFS kernel interface...... 127

6.4 Indirect callsite resolution for vfs_write...... 132

6.5 Fixing container_of missing struct type problem...... 133

6.6 PeX static analysis architecture. PeX takes the kernel and permission checks as input, and reports permission check errors as output. PeX also produces mappings between identified permission checks and privileged functions as output...... 134

6.7 Permission check wrapper examples...... 137

List of Tables

2.1 Commonly used permission checks in Linux...... 24

3.1 TxRace Execution Statistics and Performance...... 49

3.2 Cost-Effectiveness of TxRace vs. TSan ...... 56

4.1 Evaluation Setup ...... 76

4.2 Data Race Detection ...... 81

5.1 Impacts of increasing HPQ and FAQ on the number and the cost of Partial, Page Fault, and Full Scans...... 98

5.2 llvm-mpx and BOGO validation...... 106

5.3 Frequency of Partial, Full, and Page Fault Scans (per second). The sum of Partial and Full Scans represents the free frequency...... 109

5.4 Real-world application test methods. The top four applications are utilities/clients, while the bottom five are servers...... 115

6.1 Input Statistics for Kernel v4.18.5...... 141

6.2 Indirect Call Pointer Analysis...... 142

6.3 PeX Results...... 144

6.4 Comparison of PeX warnings when used with different indirect call analyses. 144

6.5 Bugs Reported By PeX. Confirmed or Ignored...... 147

Chapter 1

Introduction

Software bugs are defects in a computer program that often arise from mistakes and errors in the program’s source code or design. They make the program produce incorrect results or cause unintended behavior. Unfortunately, some software bugs have resulted in critical real-world disasters, e.g., human deaths [30, 72, 124, 170, 185] and huge financial losses [6, 12, 20, 155, 158, 162]. A study conducted by the National Institute of Standards and Technology in 2002 points out that software bugs cost the US economy an estimated $59 billion in losses every year, or about 0.6% of GDP [194]. In 2018, that number jumped to $1.7 trillion [26], which is about 8% of GDP.

To address software bug problems, we design, develop, and evaluate new practical program analysis tools to help software developers detect and debug such software bugs, thereby improving software reliability.

1.1 Three Focused Software Bugs

In particular, this thesis focuses on three important classes of software bugs: (1) data race bugs, (2) memory safety bugs, and (3) permission check bugs. The following subsections briefly introduce the three classes of bugs and their real-world impacts.


1.1.1 Data Race Bugs

A data race is a critical software bug that may happen in shared-memory multithreaded programs, which became popular with the advent of multicore systems. In multithreaded programs, the interleavings among threads are nondeterministic, i.e., the program may produce different outputs for a given input across multiple runs. Thus, programmers have to enforce the correct order using synchronizations such as locks so that the program behaves as intended. However, this is not a trivial task, and it leads to data races in many cases.

A data race bug happens when the following three conditions are met: (1) two or more threads access the same memory location, (2) at least one of them performs a write operation, and (3) their relative execution order is not explicitly enforced by synchronization primitives such as locks [46, 134, 152]. Such a bug in multithreaded applications has caused many real-world problems. For example, data races were the root cause of the Northeast US blackout in 2003 [179], which affected around 55 million people, and the mismatched NASDAQ Facebook share prices in 2012 [158], which caused $13 million in losses. Therefore, it is critical to address data race problems to improve the reliability of multithreaded programs.

1.1.2 Memory Safety Bugs

Many software programs, such as webservers, databases, and operating systems, are often developed using C/C++ programming languages that allow arbitrary pointer manipulations and unmanaged memory accesses. While this is efficient and flexible, C/C++ programmers are responsible for avoiding invalid memory accesses. If programmers are not careful, C/C++ software can have two forms of memory safety bugs: spatial and temporal memory safety violations.

A spatial memory safety violation occurs when a program accesses a memory region beyond the object’s designated bound (e.g., buffer overflow or underflow), which may lead to illegally overwriting another object or reading potentially sensitive data without permission. On the other hand, a temporal memory safety violation happens when a program accesses a de-allocated object (e.g., use-after-free or use-after-return).

Unfortunately, many security exploits take advantage of memory safety violations as a first step in seizing control of a program. The first Internet worm, the Morris worm [155], exploits a buffer overflow bug in the UNIX finger tool. The infamous Blaster worm [176], which damaged millions of Windows workstations, also exploits a buffer overflow bug. A use-after-free bug in the Darwin kernel is used to jailbreak iOS and achieve local privilege escalation on macOS [107]. Consequently, there is an urgent need to design an efficient and effective tool for detecting such memory safety bugs.

1.1.3 Permission Check Bugs

Access control [174] is an essential security service in system software, especially in operating systems. Access control allows only the entities with proper permissions to access and use privileged resources (e.g., files, hardware). In many cases, access control is implemented in the form of permission checks. For example, the discretionary access control (DAC) mechanism [17] in Linux operating systems assigns different permissions to owners, groups, and other users. A write request to a file is first checked against the file permissions of that user. The access is granted if the user has proper permissions; otherwise, the access is denied.

There are three forms of permission check bugs: missing, inconsistent, and redundant checks. A missing permission check bug happens when there is a program path in which a malicious user can access critical resources without going through permission checks. An inconsistent

check bug happens when there are multiple program paths to the use of the same privileged resources but they require different permissions, making it hard to determine which one is correct. Finally, a redundant check bug occurs when access control unnecessarily checks the same permissions multiple times.

These permission check bugs have caused many security vulnerabilities over the years. For instance, CVE-2011-4080 [3] exposes the Linux kernel message buffer to non-privileged users through a latent path bypassing the permission check in the kernel, which should otherwise deny the access. The message buffer leaks kernel memory mapping information. A more

severe and easier-to-exploit bug, CVE-2006-1856 [2], allows an attacker to use the writev/readv syscalls to bypass intended access restrictions and gain read and write access to security-sensitive files. What’s more, a permission check bug can even affect cloud services. CVE-2017-17450 [4] enables a non-privileged adversary to escape a container’s isolation and gain control of system-wide settings.

1.2 Problem Statements

To address the aforementioned problems, researchers and practitioners have designed a variety of dynamic and static bug detection tools. While many of them have been shown to be very useful in detecting data race, memory safety, and permission check bugs, there are three critical challenges still hindering the wide adoption of such tools in practice: (1) run-time overhead in dynamic bug detectors, (2) space overhead in dynamic bug detectors, and (3) scalability and precision issues in static bug detectors.

In this section, we elaborate on these problems and discuss the relevant state-of-the-art solutions for each category. More discussion on other related work is deferred to Chapter 2.

1.2.1 Time and Space Overheads of Dynamic Bug Detectors

Dynamic bug detectors identify a software bug by performing correctness checks along with the program execution at run time. One of the key advantages of dynamic program analysis is that any reported bug is likely to be an actual bug, as it is detected in a live execution context where the execution is proven to be feasible. In other words, dynamic bug detectors rarely produce false positives, which would otherwise require a time-consuming bug validation process. Dynamic bug detectors can also provide ample information (e.g., call stacks, branch decisions, or variable values), helping developers reason about the root cause of a detected bug.

However, many existing dynamic bug detectors, especially ones designed to detect data races and memory safety bugs (of our interest), incur high run-time performance (time) and memory (space) overheads. Such high overheads make it hard to adopt those tools frequently in a regular development cycle. What’s more, due to stringent quality-of-service requirements, the overhead budget is even more limited in a production environment, making many existing heavyweight tools unusable therein. The following two problem statements (PS) summarize these two overhead problems.

PS 1: Dynamic bug detectors have a high performance overhead for run-time checks.

Additional run-time checks, often performed with instrumented code, may slow down the original program’s execution significantly. In particular, this disadvantage stands out in dynamic data race detection. For instance, Google’s ThreadSanitizer, the state-of-the-practice data race detector, may slow down program execution by 30x [180]. FastTrack, the state-of-the-art research prototype, reports an 8.5x slowdown for Java applications [79] and a 57x slowdown for C/C++ applications [75]. Even worse, Intel’s Inspector XE, a commercial product, may incur a 200x performance overhead [173] for some benchmarks. Run-time overhead is also a critical concern for dynamic memory safety bug detectors [192], and we discuss them in detail in Section 2.2.

PS 2: Dynamic bug detectors have a high space overhead for metadata management.

Dynamic bug detectors often need to maintain additional metadata for run-time checks, causing a significant space (memory) overhead. For instance, many memory safety bug detectors keep track of pointer information. One spatial memory safety solution [144] maintains upper and lower bounds for each pointer and incurs a 3x space overhead. DangSan [199], a temporal memory safety solution, maintains a pointer graph to detect dangling pointers and incurs a 2.4x overhead. Meanwhile, even the most widely used memory safety bug detector [181] reports a 3.37x memory overhead.

1.2.2 Scalability and Precision Issues in Static Bug Detectors

Unlike dynamic bug detectors, static bug detectors analyze a program’s source code and identify potential bugs without running the program. In theory, a static bug detector can detect all the bugs of its kind in a program by exploring all possible program states (with potentially many false positives). Two key merits of static bug detection are that developers have a chance to fix the bugs before deployment, and that, unlike dynamic bug detection, it does not incur run-time overhead.

However, in practice, many static bug detectors suffer from scalability and precision issues. The scalability issue is caused by the numerous program states that static tools must explore. That is why static bug detectors often cannot complete an analysis within a limited testing budget, raising practical usability concerns. Besides, static tools make conservative assumptions for soundness due to the lack of run-time information (e.g., memory states, branch decisions, memory alias information), leading to precision problems. Therefore, static tools often end up producing a large number of false warnings. Unfortunately, false reports require tremendous engineering effort while yielding no results.

To address the scalability and precision issues in static bug detectors, we are particularly interested in programs with large code bases, such as OS kernels, for which static analysis is a daunting challenge due to these issues. In particular, we focus on a static analysis tool that can detect kernel permission bugs. In the following, we summarize the problem statement (PS) for static bug detectors and discuss it in the context of static analysis tools for the Linux kernel.

PS 3: Static bug detectors often have scalability and precision issues.

The large search space of program states makes static bug detectors unscalable. This is especially true for huge and complex codebases such as the Linux kernel (15.8 MLoC). For kernel bug detection, many static analysis tools run their analysis on the call graph of the kernel code. Existing methods are problematic in two ways. The first is the scalability issue. For example, K-Miner [82], a state-of-the-art static memory bug detector for the Linux kernel, integrates a generic inter-procedural pointer analysis tool, SVF [189], for call graph and indirect call analyses. However, there are around 115 thousand static instances of indirect calls in the latest Linux kernel. Even though SVF is optimized using advanced sparse analysis, it cannot handle such a large number of indirect calls. As a workaround, K-Miner applies the SVF analysis only to a small portion of the kernel code at a time, limiting the analysis scope. This brings about the second problem, the precision issue: the lack of a whole picture of the kernel code makes the analysis limited (partial), thereby trading precision for scalability.

For the same kernel call graph analysis, some other static kernel bug detectors take a scalable but imprecise approach. For example, Check-It-Again [207], a Linux kernel lacking-recheck bug detector, uses function types to match indirect call targets, which is imprecise and may lead to wrong call targets that are not feasible at run time. Other recent kernel analysis works such as Dr. Checker [133] and Double-Fetch [216] also use the same imprecise technique to build a kernel call graph for scalability purposes. To produce accurate results within a limited time budget, it is critical for a practical kernel bug detector to be both scalable and precise.

1.3 Thesis Statement

The goal of this thesis is to address the aforementioned limitations in dynamic and static program analysis tools and to design and develop more time- and space-efficient dynamic bug detectors and more scalable and precise static bug detectors. The main contributions of this thesis are summarized as the following thesis statements (TS):

TS1: To address performance overhead problem in dynamic bug detectors (PS1), we leverage hardware support available in commodity hardware (originally designed for other purposes) to accelerate dynamic program analysis.

Researchers have proposed custom hardware support to reduce the run-time overhead of dynamic bug detectors [37, 66, 210]. Unfortunately, such research prototypes are not readily available on the market. Instead, we look at relevant hardware features in commodity processors. We find that it is possible to leverage existing hardware features, originally designed for other purposes, to speed up dynamic bug detection. For instance, a processor feature called hardware transactional memory (HTM) [7] was added to Intel processors to simplify concurrent programming. HTM allows a group of read and write instructions to be executed in an atomic manner. We see an opportunity to use HTM to detect data races in (non-TM) multi-threaded programs. In Chapter 3, we demonstrate how HTM can be used to accelerate existing dynamic data race detectors. Furthermore, we also study program tracing and sampling features in many commodity processors. These features were initially designed for low-overhead performance monitoring. We realized they are also useful in implementing a sampling-based, lightweight, yet effective data race detector. In Chapter 4, we show how to use program tracing and sampling features for data race detection.

TS2: To address space overhead problems (PS2), we propose to reuse metadata maintained by one bug detector to detect another kind of bugs.

In order to perform run-time checks, dynamic bug detectors maintain a collection of their own metadata. When different bug detectors are used together, the space overhead simply adds up because their metadata are managed separately. We investigate a method to reuse one set of metadata for multiple purposes. To give an example, Intel MPX [7] is a hardware extension designed to detect spatial memory safety bugs. It maintains bound metadata for each pointer and runs explicit bound checks at the time of pointer dereference to detect a bug. If one would like to detect both spatial and temporal memory safety bugs, the existing option is to run a temporal memory safety bug detector (e.g., DangSan [199]) along with Intel MPX. However, this naïvely combined approach would simply require more memory space for metadata management. To reduce the memory overhead of full memory safety, we propose a solution that reuses Intel MPX’s bound metadata to detect temporal memory safety bugs as well, removing the need to maintain separate metadata for temporal memory safety. More details are explained in Chapter 5.

TS3: To improve scalability and precision in static bug detectors (PS3), we adopt domain-specific programming practices during static program analysis.

Static analyses often make conservative assumptions about program states. To achieve precision, one has to consider numerous program states, which raises scalability issues and makes it difficult to analyze large codebases. To address scalability issues, one can analyze only a small portion of program states, but this hurts precision. In addition to program source code, domain-specific programming practice can serve as another input for static analysis tools. Domain-specific programming practice can be viewed as a summary of run-time information obtained directly from developers. A generic static analysis tool usually does not consider this knowledge, thus spending more time or making unsound approximations when analyzing the same amount of code. Knowledge of domain-specific programming practice can help static analysis tools make correct assumptions about the program faster, avoiding complex analysis of large amounts of code. In other words, domain-specific knowledge can reduce the analysis scope and improve precision by pruning unnecessary search space significantly.

We study common programming patterns used in software development and apply them to design scalable and precise tools. In particular, we study Linux kernel call graph analysis, which is the foundation of many kernel bug detectors. Kernel call graph analysis is tricky because finding the correct targets of the kernel's widely used indirect calls is not trivial. Fortunately, we discovered a common usage pattern of indirect calls in the kernel. By applying it, we can not only eliminate false targets but also avoid running a generic pointer analysis tool that does not scale to the kernel. In Chapter 6, we present a Linux kernel permission check bug detector built upon this analysis.

1.4 Contributions

In this thesis, we focus on solving three problems: (PS1) the run-time overhead issue of dynamic data race bug detectors, (PS2) the space overhead issue of dynamic memory safety bug detectors, and (PS3) the scalability and precision issues of a static bug detector for Linux kernel permission check bugs. We summarize our efforts to solve these problems as contributions in the following subsections. We envision that the contributions made in this thesis may be applied to other dynamic and static bug detectors to improve their efficiency.

1.4.1 Reducing Run-time Overhead of Dynamic Data Race Bug Detectors

In TS1, to address the run-time overhead problem of dynamic bug detectors, we propose to repurpose commodity hardware features to accelerate dynamic program analysis. We demonstrate this idea on data race detection and present two dynamic data race detectors as follows:

We first present TxRace, a dynamic data race detector using commodity hardware transactional memory (HTM), in Chapter 3. HTM was originally designed to simplify concurrent programming. It executes a bundle of load and store instructions in an atomic manner, and aborts and rolls back transactions once it detects conflicting memory accesses from concurrent threads. Our approach runs the program with HTM for fast data race detection and occasionally falls back to slow but precise software-based detection. This design helps us solve the challenges in designing an HTM-based data race detector: our design can pinpoint racy instructions, remove false warnings caused by false sharing, and handle non-conflict transaction aborts efficiently. We compared our approach with a state-of-the-art

happens-before-based data race detector and a random sampling-based approach to show its cost-effectiveness.

We then present ProRace, a dynamic data race detector for production use, in Chapter 4. Our approach is lightweight, transparent, and effective. It reduces overhead by using the tracing and sampling features of commodity processors. These processor features can generate a trace or sample at low overhead and require no change to program code, minimizing perturbation of the original program execution. Furthermore, our novel forward and backward replay method can reconstruct unsampled memory addresses from the collected trace. Our new driver further lowers the tracing overhead compared to the original Linux kernel driver. Finally, we compared our approach with RaceZ [184], another sampling-based dynamic data race detector, using production software such as Apache and MySQL; the results show that our approach can detect more bugs at a much lower overhead.

1.4.2 Reducing Space Overhead of Dynamic Memory Safety Bug Detectors

In TS2, to address the space overhead problem of dynamic bug detectors, we propose to reuse metadata managed by one bug detector for detecting another type of bug. We focus on memory safety bug detectors and present a temporal memory safety bug detector as follows:

In Chapter 5, we present BOGO, a memory-efficient temporal memory safety solution (dynamic detector) for Intel MPX [7], a hardware-assisted spatial memory safety solution. Our design does not track its own metadata; instead, it reuses the spatial memory safety checks and the bound metadata tracked for spatial memory safety. It works by scanning and invalidating the bound metadata of freed pointers using an efficient algorithm, so that use-after-free bugs are detected by the existing spatial memory safety checks. Furthermore, our design can add temporal memory safety to MPX's spatial memory safety without recompilation. We also implemented an LLVM-MPX pass that performs sound bound checking and outperforms existing MPX compilers. The evaluation shows that our design adds temporal memory safety with comparable run-time overhead and less memory overhead than state-of-the-art solutions.

1.4.3 Solving Scalability and Precision Issues in Static Linux Kernel Permission Bug Detectors

Finally, we study permission check bugs in the Linux kernel and leverage common programming patterns (TS3) to address scalability and precision issues in static bug detectors.

We present PeX, a scalable static bug detector for detecting permission check placement errors in the Linux kernel, in Chapter 6. We first solved the problem of scalable and precise inter-procedural control flow graph generation by using a common indirect call usage pattern in the Linux kernel. Our study shows that most indirect calls are used to glue drivers to the kernel framework; function pointers are usually defined inside a struct that serves as an interface. By matching indirect call targets against such interfaces, we can achieve scalable and precise indirect call analysis. We show its effectiveness by comparing it with the existing type-based approach and a generic pointer analysis-based approach (SVF [189]). We further automated the process of identifying permission check functions and privileged functions. By running the detector on the latest Linux kernel (v4.18.5), we found 36 new permission check bugs, 14 of which have been confirmed by kernel developers.

1.5 Organization

The remainder of the thesis is organized as follows. Chapter 2 discusses related work on data race bugs, memory safety bugs, and permission check bugs. Chapters 3 and 4 present TxRace and ProRace, data race detectors leveraging commodity hardware transactional memory and PMUs, respectively, to accelerate dynamic data race detection. Chapter 5 details BOGO, a full memory safety solution that detects both spatial and temporal memory safety bugs using Intel MPX's metadata only. Chapter 6 discusses PeX, an efficient and precise permission check analysis framework for the Linux kernel. We conclude in Chapter 7.

Chapter 2

Literature Review

In this chapter, we focus on three kinds of bugs (data race bugs, memory safety bugs, and permission check bugs) and discuss related work. We first discuss existing solutions for data race detection, including lockset-based approaches, overlap-based approaches, hardware data race detectors, sampling-based approaches, and hybrid static-dynamic approaches. We then discuss existing solutions for memory safety bug detection, including spatial memory safety solutions and temporal memory safety solutions. Finally, we discuss the different permission checks in the Linux kernel and permission check bugs, along with existing kernel static analysis work and work related to user application permission checks.

2.1 Data Races

A data race occurs when two or more threads access the same memory location [46, 134, 152], at least one of them writes to it, and their relative order is not explicitly enforced by synchronization primitives such as locks. Data races have caused many real-world problems, e.g., the Northeast blackout [179], mismatched NASDAQ Facebook share prices [158], and security vulnerabilities [219]. Researchers have designed many tools to detect data races; however, such dynamic tools often incur too much run-time overhead. We discuss them as follows.


2.1.1 Lockset-based Approaches

Eraser [177] introduced the lockset-based approach, which infers data races from violations of a locking discipline. As lockset-based algorithms do not consider non-mutex synchronization operations such as condition variables, they generate false positives compared to approaches that track the happens-before order of synchronization operations using vector clocks. FastTrack [79] is known to be the most optimized algorithm in this category, reducing runtime overhead significantly compared to prior work such as MultiRace [160]. Google's ThreadSanitizer also tracks the happens-before order, similar to FastTrack. However, high runtime overhead is still a major concern.

2.1.2 Overlap-based Approaches

The recent advances in overlap-based data race detectors [43, 56, 75, 78] have shown their cost-effectiveness and practical benefits by trading soundness for better performance. DataCollider [78] and Kivati [56] use the hardware code/data breakpoint mechanism in processors to detect data races: they set a data breakpoint to catch conflicting accesses from other threads. This is similar to the conflict detection mechanism in HTM, which detects concurrent conflicting accesses. After setting up the breakpoint, they insert a short time delay into the thread to increase the detection probability. The detection window in HTM spans the whole length of a transaction, so the detection probability is likely to be higher than in the breakpoint-based approach, at the cost of false positives. Moreover, the small number of breakpoints (four on x86 hardware) limits coverage. As another overlap-based detector, IFRit [75] exploits static analysis to form interference-free regions, in which data races can be detected when the regions overlap. The scope of an IFRit region may be long in some cases, but it could be very short (e.g., a basic block) in other cases, since each region may start only after the variable is defined in the

SSA form. For performance reasons, IFRit gives up data race detection for those short-scope regions.

2.1.3 Hardware Data Race Detectors

Data race detection has also been the subject of intense research in the hardware community. In general, hardware-assisted data race detectors store metadata (e.g., locksets, vector clocks) in the cache, piggyback it on coherence protocol messages, and compare it to detect data races.

HARD [235] is a hardware-based implementation of the lockset algorithm, whereas ReEnact [164] and CORD [163] implement happens-before-based algorithms in hardware. RADISH [67] proposes a hardware-software co-design that enables always-on sound and complete data race detection, in which hardware performs the vast majority of race checks and software backs up hardware resource limitations. Conflict Exceptions [131] extends a standard coherence protocol and caches to detect data conflicts between synchronization-free regions. SigRace [143] employs custom hardware address signatures, in which the memory addresses accessed by a processor are hash-encoded and accumulated, and uses them to detect the outcome of potential data races rather than the races themselves; SigRace then relies on checkpointing/rollback to identify the actual racing instructions.

RaceTM [88] proposes to use hardware transactional memory to detect data races. This approach is a hardware-only solution that requires additional hardware extensions to conventional HTMs like LogTM [139]. For example, RaceTM requires two additional bits (debug read/write bits) for every cache line. It also adds support for reporting the conflict address and the responsible racy instructions.

Finally, although TxIntro [126] is not a data race detector, it combines hardware performance counters (such as HITM, cache miss, and PEBS) with Intel's TSX to infer the conflicting data's linear address.

2.1.4 Sampling-based Approaches

LiteRace [135] and Pacer [47] pioneered the use of sampling to reduce the overhead of dynamic data race detection. LiteRace focuses on sampling more accesses in infrequently exercised code regions, based on the heuristic that for a well-tested application, data races are likely to occur in such cold regions. Pacer, on the other hand, uses random sampling, so its coverage is approximately proportional to the sampling rate used. However, these code instrumentation-based race detectors cause an unaffordable slowdown for some applications, and their detection coverage is limited to the sampled accesses only. For example, though LiteRace shows a low 2-4% overhead for Apache, it makes CPU-intensive applications 2.1-2.4x slower and incurs a 1.47x slowdown on average across its tested applications. Similarly, Pacer reports an average overhead of 1.86x at a 3% sampling frequency.

DataCollider [78] and RaceZ [184] avoid code instrumentation and thus incur a very low overhead, but suffer from low detection coverage. DataCollider [78] makes use of hardware debug breakpoints. After sampling a code/memory location, it sets a data breakpoint and inserts a time delay. A trap during this delay indicates a conflicting access from another thread. Though longer time delays increase the likelihood of overlapping data races, they also increase the overhead. In addition, hardware restrictions limit the number of concurrently monitored memory locations to four in the latest x86 hardware [101].

RaceZ leverages Intel’s PEBS to sample memory accesses. However, due to its reliance on the inefficient Linux PEBS driver, RaceZ has to use a low sampling frequency for performance, thereby compromising detection coverage. RaceZ also attempts to reconstruct unsampled memory accesses, but its scope is limited to a single basic block.

2.1.5 Hybrid Static/Dynamic Approaches

Another line of work takes a hybrid static-dynamic approach. RaceMob [111], a recent low-overhead solution, employs static analysis [204] to compute potential data races, and crowdsources runtime race checks across thousands of users. To limit the overhead each user may experience, RaceMob requires a large number of runs to distribute checks, and the number of runs required depends on the precision of the static analysis. Elmas et al. [76], Choi et al. [58] and Chimera [120] are other examples that make use of static data race analysis to reduce runtime cost. In spite of its benefits, static analysis often suffers from precision and scalability issues for large-scale applications, and the recompilation requirement is often not a viable option in production settings.

2.1.6 Other Approaches to Reduce Dynamic Data Race Detector Overhead

Several strategies other than sampling have been explored to reduce the overhead of dynamic data race detection. Overlap-based data race detectors [43, 75, 131] focus on detecting races only when racy instructions or code regions overlap at runtime. Wester et al. [211] parallelize data race detection. Frost [201] compares multiple replicas of the program running under different schedules.

Greathouse et al. [85] present a demand-driven race detector. They use hardware performance counters in modern processors to monitor cache events that indicate data sharing, turning on race detection only then. Due to limitations in current hardware, they could identify W→R data sharing events only, and not all cache sharing causes data races.

Matar et al. [137] also exploit Intel TSX (as we do) to speed up data race detection, but they leveraged HTM simply to replace the locks used to provide atomicity for metadata updates and checks.

2.1.7 Other Related Works

The idea of combining low-cost tracing with high-cost (offline) analysis has been used for program runtime monitoring [54, 59], especially in the deterministic replay domain [38, 95, 118, 119, 157]. The lowest-overhead solution would be recording only synchronizations and program input, as in RecPlay [171], which guarantees detecting the first race.

A large body of work leverages PMUs to reduce the runtime overhead of program monitoring for various purposes. For debugging, Gist [112] uses Intel's PT to track program execution paths for root cause diagnosis of failures, while CCI [106] uses Intel's Last Branch Record (LBR) to collect the branch trace and return values for cooperative concurrency bug isolation. For security, FlowGuard [127] uses Intel's PT to achieve transparent and efficient Control Flow Integrity (CFI), while CFIMon [215] uses Intel's Branch Trace Store (BTS) for the same goal. For performance, Brainy [109] leverages Intel's PEBS to understand the effect of the underlying hardware for effective selection of data structures, while Jung et al. [108, 121] use PEBS to characterize the cache behavior of OpenMP [64] programs for dynamic parallelism adaptation.

2.2 Memory Safety

Memory safety bugs are a common source of real-world security vulnerabilities, and they are even more prevalent than data race bugs [5]. Programs written in memory-unsafe languages such as C/C++ are often vulnerable to memory safety bugs [192]. Spatial memory safety bugs happen when a memory access falls outside an object's bounds (e.g., buffer overflow). Temporal memory safety bugs happen when dereferencing a dangling pointer, which points to a previously deallocated object (e.g., use-after-free). Existing spatial and temporal memory safety solutions maintain their own metadata in order to detect such bugs. To achieve full memory safety, one can combine a spatial memory bug detector with a temporal memory bug detector. However, this naïve approach would simply double the memory overhead. We discuss existing memory safety solutions and how they maintain metadata as follows.

2.2.1 Spatial Memory Safety

Per-pointer Metadata Solutions

These solutions maintain bounds information (base and bound) for each pointer, and an explicit bound check is performed at the time of pointer dereference. We further distinguish these works into two types: fat-pointer solutions and disjoint-metadata solutions. Fat-pointer solutions [105, 150] augment the pointer and store bounds information adjacent to the original pointer. The downside of fat-pointer approaches is that they change the memory layout, so they can easily break programs that rely on a particular memory object layout; in particular, they can break binary compatibility. Disjoint metadata is used by SoftBound [144] and MPX [7]. They maintain bounds information for each pointer in a disjoint memory location and thus do not modify the memory layout, which keeps them binary compatible with existing code. The problem with per-pointer metadata solutions is that when interacting with non-protected

code, pointer metadata is not maintained, rendering the protection ineffective.

Per-object Metadata Solutions

Some solutions [151, 165, 181, 202, 223] place guard blocks at the beginning and end of memory objects to detect out-of-bound memory accesses, often relying on the page protection mechanism. Alternatively, the use of per-object "red zones" [181] requires an explicit check at the time of pointer dereference to detect out-of-bound accesses. Other works [36, 68, 172] allocate disjoint metadata for each memory object and perform explicit checking on pointer manipulation. Per-object metadata solutions usually cannot guarantee that a pointer points only to its one intended object; accesses to other objects remain valid, so there can be false negatives.

2.2.2 Temporal Memory Safety

Special Allocators

Dinakar et al. [69] mark the pages of freed objects as non-accessible and leverage the page protection mechanism to detect memory accesses through dangling pointers. Cling [35] mitigates the use of dangling pointers by allowing address space to be reused only for objects of the same type and alignment.

Pointer-graph Based Approaches

Nullification [117] achieves temporal safety by maintaining pointer sets and nullifying the pointers at the time of free(), which is expensive and may change the behavior of programs that keep dangling pointers without dereferencing them. This approach violates the C/C++ standard and has high overhead.

Undangle [49] aims to solve temporal memory errors by tracking the creation and use of dangling pointers, which can also be very expensive.

DangSan [199] is a temporal memory error detector with an enhanced per-thread pointer logging data structure that enables fast on-the-fly checking.

FreeSentry [225] tracks points-to information and invalidates pointers when an object is freed.

Identifier-based Schemes

CETS [145] also maintains a designated data structure for temporal safety, but as separate metadata for each pointer and memory object. CETS invalidates the metadata associated with a memory object directly, instead of nullifying each pointer; a dangling pointer dereference is then discovered by an explicit check. CETS is added on top of SoftBound to offer temporal safety. The main idea is to add another field to the pointer metadata (a key) and to the allocated memory (a lock), which together carry version information; at the time of dereference, the key and lock are compared to enforce temporal safety.

Watchdog [146] and WatchdogLite [147] take the ideas of SoftBound and CETS and implement them as ISA extensions to achieve low-overhead full memory safety. However, these approaches require a new ISA, which is not available in current processors.

Other Related Works

CFI approaches protect a program's control flow, which may be hijacked through memory corruption attacks. StackGuard [62] aims to prevent control flow hijacking via on-stack buffer overflow attacks. ARM Pointer Authentication attaches a signature to pointers to prevent pointers crafted by a malicious program from being dereferenced. These techniques can mitigate memory safety issues but cannot fully prevent them. Finally, SoK [192] is a comprehensive survey of prior work that tackles the memory safety problem.

Table 2.1: Commonly used permission checks in Linux.

Type          Total #   Permission Checks
DAC           3         generic_permission, sb_permission, inode_permission
Capabilities  3         capable, ns_capable, avc_has_perm_noaudit
LSM           190       security_inode_readlink, security_file_ioctl, etc.

2.3 Permission Check Bugs in Linux Kernel

Previously, we discussed dynamic data race detectors and dynamic memory safety bug detectors. In this section, we focus on another important category of bug detectors: static bug detectors. We are especially interested in static bug detectors dealing with a large codebase, the Linux kernel. We describe the deficiencies of existing kernel analysis works in the following subsections, and we focus our discussion on Linux kernel permission checks [174], an essential security enforcement scheme in operating systems. Permission checks assign users (or processes) different access rights, called permissions, and enforce that only those who have appropriate permissions can access critical resources (e.g., files, sockets). In the kernel, access control is often implemented in the form of permission checks placed before the use of privileged functions that access the critical resources. Without proper permission check placement, adversaries can bypass security checks and gain access to critical resources [32].

2.3.1 Permission Checks in Linux

This section introduces DAC, Capabilities, and LSM in the Linux kernel. Table 2.1 lists the practically known permission checks in Linux. Unfortunately, the full set is not well-documented.

Discretionary Access Control (DAC)

DAC restricts access to critical resources based on the identity of subjects or the group to which they belong [166, 198]. In Linux, each user is assigned a user identifier (uid) and a group identifier (gid). Correspondingly, each file has properties including the owner, the group, and the rwx (read, write, and execute) permission bits for the owner, the group, and all other users. When a process wants to access a file, DAC grants the access permissions based on the process's uid and gid as well as the file's permission bits. For example, in Linux, inode_permission (as listed in Table 2.1) is often used to check the permissions of the current process on a given inode. More precisely speaking, however, it is a wrapper of posix_acl_permission, which performs the actual check.

In a sense, DAC is a coarse-grained access control model. Under the Linux DAC design, the "root" user bypasses all permission checks. This motivates fine-grained access control schemes, such as Capabilities, that reduce the attack surface.

Capabilities

Capabilities, available since Linux kernel v2.2 (1999), enable fine-grained access control by dividing the root privilege into small sets. For example, for users with the CAP_NET_ADMIN capability, the kernel allows the use of ping without granting the full root privilege. Currently,

Linux kernel v4.18.5 supports 38 Capabilities, including CAP_NET_ADMIN, CAP_SYS_ADMIN, and so on. The functions capable and ns_capable are the most commonly used permission checks for Capabilities (as listed in Table 2.1). Both determine whether a process has a particular capability or not, while ns_capable performs an additional check against a given user namespace. They internally use security_capable as the basic permission check.

Capabilities are supposed to be fine-grained and distinct [16]. However, due to the lack of clear scope definitions, the choice of a specific Capability for protecting a privileged function has in practice been made based on kernel developers' own understanding. Unfortunately, this leads to the frequent use of CAP_SYS_ADMIN (451 out of 1167 uses, more than 38%), which is just treated as yet another root [15]; grsecurity points out that 19 Capabilities are indeed equivalent to the full root [14].

Linux Security Module (LSM)

LSM [214], introduced in kernel v2.6 (2003), provides a set of fine-grained pluggable hooks that are placed at various security-critical points across the kernel. System administrators can register customized permission-checking callbacks to the LSM hooks so as to enforce diverse security policies. The latest Linux kernel v4.18.5 defines 190 LSM hooks. One common use of LSM is to implement Mandatory Access Control (MAC) [23] in Linux (e.g., SELinux [186, 187], AppArmor [11]). MAC enforces stricter, non-overridable access control policies, controlled by system administrators. For example, when a process tries to read the file path of a symbolic link, security_inode_readlink is invoked to check whether the process has read permission on the symlink file. The SELinux callback of this hook checks whether a policy rule grants this permission (e.g., allow domain_a type_b:lnk_file read). It is worth noting that the effectiveness of LSM and its MAC mechanisms highly depends on whether the hooks are placed correctly and soundly at all security-critical points. If a hook is missing at any critical point, there is no way for MAC to enforce a permission check.

2.3.2 Hook Verification and Placement

There is a series of studies on checking kernel LSM hooks. Automated LSM hook verification [234] verifies the complete mediation of LSM hooks, relying on manually specified security rules, while [80] automates LSM hook placement using manually written specifications of security-sensitive operations. However, the required manual processes are error-prone when applied to the huge Linux codebase. Edwards et al. [74] proposed using dynamic analysis to detect LSM hook inconsistencies; PeX, by using static analysis, can achieve better code coverage.

AutoISES [193] is the work most closely related to PeX. AutoISES regards data structures, such as structure fields and global variables, as privileged, applies static analysis to extract security check usage patterns, and validates the protection of these data structures. The difference between AutoISES and PeX is three-fold. First, PeX is privileged-function oriented while AutoISES is data-structure oriented. Second, AutoISES is designed only for LSM, whose permission checks (hooks) are clearly defined; it is therefore not applicable to DAC and Capabilities due to their various permission check wrappers. In contrast, PeX works for all three types of permission checks. Third, AutoISES uses type-based pointer analysis to resolve indirect calls, while PeX uses KIRIN to resolve indirect calls in a more precise manner.

There are also works [81, 141, 142] that extend authorization hook analysis to userspace programs, including the X server and PostgreSQL. However, their approaches cannot be applied at the huge scale of the kernel. Moreover, all of the above works either do not analyze indirect calls at all or apply over-approximate indirect call analysis techniques, such as type-based or field-insensitive approaches. In contrast, PeX uses KIRIN, a precise and scalable indirect call analysis technique, which could also benefit these works by finding more accurate indirect call targets.

2.3.3 Kernel Static Analysis Tools

Coccinelle [156] is a tool that detects bugs of pre-defined patterns based on text pattern matching over a symbolic representation of bug cases; it is basically an intra-procedural analysis. Building upon Coccinelle, Wang et al. proposed another pattern-matching-based static tool that detects potential double-fetch vulnerabilities in the Linux kernel [206].

Sparse [27] is designed to detect the problematic use of pointers belonging to different address spaces (kernel space or userspace). Later, Sparse was used to build a static analysis framework called Smatch [25] for detecting various sorts of kernel bugs. However, Smatch is also based on intra-procedural analysis, so it can only find shallow bugs.

Double-Fetch [216] and Check-it-again [207] focus on detecting time-of-check-to-time-of-use (TOCTTOU) bugs. Dr. Checker [133] is designed for analyzing Linux kernel drivers; it adopts a modular design, allowing new bug detectors to be plugged in easily. KINT [208] applies taint analysis to detect integer errors in the Linux kernel, while UniSan [128] leverages the same analysis to detect uninitialized kernel memory leaks to userspace. Chucky [217] also uses taint analysis to find missing checks in userspace programs and the Linux kernel; however, Chucky can handle only kernel file system code due to its lack of pointer analysis. Note that to resolve indirect call targets, all of these works leverage a type-based approach, which is not as accurate as KIRIN, and thus they suffer from false positives.

MECA [218] is an annotation-based static analysis framework that can detect security rule violations in Linux. APISan [228] aims at finding API misuse: it infers the correct API usage through analysis of existing codebases and performs intra-procedural analysis to find bugs. For the former, APISan relies on relaxed symbolic execution, which is complementary to the techniques used in PeX.

2.3.4 Permission Check Analysis Tools

Engler et al. propose to use beliefs to automatically extract checking information from the source code, and apply that information to detect missing checks [77]. RoleCast [188] leverages patterns to detect missing security checks in web applications. TESLA [39] implements temporal assertions based on LLVM instrumentation, in which FreeBSD hooks are checked by dynamically inserted assertions. Different from TESLA, PeX uses KIRIN to statically analyze the jump targets of all kernel function pointers, achieving a better resolution rate. JIGSAW [203] is a system that can automatically derive programmer expectations on resource access and enforce them at deployment. It is designed for analyzing userspace programs and cannot be applied to the kernel directly.

JUXTA [138] is a tool designed for detecting semantic bugs in file systems, while PScout [40] is a static analysis tool for validating Android permission checking mechanisms. Kratos [183] is a static security check framework designed for the Android framework. It builds a call graph using LLVM and tries to discover inconsistent check paths in the framework. However, Android has well-documented permission check specifications [9]; i.e., privileged functions and the permissions they require are both clearly defined. In contrast, the Linux kernel has no such documentation, which makes it impossible to apply PScout and Kratos to Linux kernel permission checks.

Efficient Data Race Detection Using Hardware Transactional Memory

Detecting data races is important for debugging shared-memory multithreaded programs, but high runtime overhead prevents the wide use of dynamic data race detectors. In this chapter, we present TxRace, a new software data race detector that leverages commodity hardware transactional memory (HTM) to speed up dynamic data race detection (PS1). TxRace instruments a multithreaded program to transform synchronization-free regions into transactions, and exploits the conflict detection mechanism of HTM for lightweight data race detection at runtime. However, the limitations of current best-effort commodity HTMs expose several challenges in using them for data race detection: (1) the inability to pinpoint racy instructions, (2) false positives caused by the cache-line granularity of conflict detection, and (3) transactional aborts for non-conflict reasons (e.g., capacity or unknown). To overcome these challenges, TxRace performs lightweight HTM-based data race detection first, and occasionally switches to slow yet precise data race detection only for the small fraction of execution intervals in which potential races are reported by HTM. According to the experimental results, TxRace reduces the average runtime overhead of dynamic data race detection from 11.68x to 4.65x with only a small number of false negatives.


3.1 Introduction

A data race occurs when two or more threads access the same memory location, at least one of them is a write, and their relative order is not explicitly enforced by synchronization primitives such as locks [46, 134, 152].

Data races often lie at the root of other concurrency bugs such as unintended sharing, atomicity violation, and order violation [130]. There are many real-world examples showing the severity of data races, including the northeastern blackout [179], mismatched Nasdaq Facebook share prices [158], and security vulnerabilities [219]. Moreover, data races make it difficult to reason about the possible behaviors of programs. The C/C++11 standards [46, 102, 103] give no semantics to programs with data races, and the data race semantics for Java programs [134] is considered to be too complex [205].

To address this problem, a variety of dynamic data race detectors [76, 79, 98, 160, 180, 227] have been proposed to help programmers write more reliable multithreaded programs. However, such dynamic tools often add too much runtime overhead. For example, FastTrack, a state-of-the-art happens-before-based detector, incurs an 8.5x slowdown for Java programs [79] and a 57x slowdown for C/C++ programs [75]. For a different set of benchmarks, Intel's Inspector XE incurs a 200x slowdown [173], and Google's ThreadSanitizer incurs a 30x slowdown [180]. Such high overhead hinders the widespread use of dynamic data race detectors in practice, despite their good detection precision.

We designed TxRace (Chapter 3), a novel dynamic data race detector that leverages hardware transactional memory to accelerate data race detection.

Transactional Memory (TM) [94] was proposed as a new programming paradigm to simplify concurrent programming, and hardware support for transactional memory has recently become available in commodity processors such as Intel's Haswell [96, 97]. This work exploits the observation that the conflict detection mechanism of HTM can be repurposed for lightweight data race detection in conventional multithreaded programs not originally developed for TM.

However, naively leveraging HTM does not automatically guarantee efficient data race detection. Rather, the limitations of current commodity HTMs expose three challenges in using them for data race detection: (1) as HTMs are designed to transparently guarantee atomicity and isolation between concurrent transactions, they do not provide a way to pinpoint racy instructions or conflicting addresses; (2) since data conflicts are detected at cache-line granularity, false alarms can be reported due to false sharing; and (3) due to the best-effort nature of existing commodity HTMs, transactions may abort for reasons other than data conflicts, including exceeding the hardware capacity, interrupts, and exceptions.

To overcome these challenges, TxRace first instruments a multithreaded program to transform all code regions between synchronizations (including critical sections) into transactions (Figure 3.1). At runtime, it then performs two-phase data race detection comprising fast and slow paths. During the initial fast path, it detects potential data races using the low-overhead data conflict detection mechanism of HTM. In this stage, the detected races are only potential races, as the conflict might be due to false sharing in the cache line. When a data conflict is detected, current HTMs do not identify the instruction that caused the transaction to abort, the conflicting address, or the other conflicting transaction. TxRace addresses this problem by artificially aborting all the (in-flight) concurrent transactions, rolling them back to the state before the data conflict occurred, and then performing software-based sound (no false negative) and complete (no false positive) data race detection [79, 98, 180] for the concurrent code regions. This work refers to the rollback and subsequent re-execution with software-based data race detection as the slow path, which enables TxRace not only to pinpoint racy instructions but also to filter out any false positives. TxRace also relies on the slow path to cover the code regions that cannot be monitored by transactions due to the limitations of existing commodity HTMs. This conservative approach reduces the chance of missing data races at the cost of runtime overhead, and TxRace includes an optimization technique to avoid repeated capacity aborts.

The experimental results show that, using Intel's Restricted Transactional Memory (RTM) and Google's ThreadSanitizer (TSan) for the fast and slow paths respectively, TxRace reduces the runtime overhead of dynamic data race detection from 11.68x (TSan) to 4.65x (this work) on average. Using an HTM-based detector during the fast path may lead to missing data races if they do not overlap in concurrent transactions (among other reasons).

Nevertheless, TxRace incurs only a few false negatives (a high recall of 0.95) for the tested applications.

This work makes the following contributions:

• To the best of our knowledge, TxRace is the first software scheme that demonstrates how commodity hardware transactional memory can be used to build a lightweight dynamic data race detector.

• TxRace proposes novel solutions to address the challenges in designing an HTM-based data race detector. They enable TxRace to pinpoint racy instructions, remove false data race warnings caused by false sharing, and handle non-conflict transactional aborts efficiently.

• The paper presents experimental results showing the cost-effectiveness of TxRace compared to a state-of-the-art happens-before-based data race detector and its random-sampling-based variant.

3.2 Background and Challenges

This section briefly introduces hardware transactional memory systems and discusses the challenges in using them for data race detection.

3.2.1 Hardware Transactional Memory

Transactional Memory (TM) [94] provides programmers with transparent support for executing a group of memory operations in a user-defined transaction in an atomic (all or nothing) and isolated (the partial state of a transaction is hidden from others) manner. Hardware support for transactional memory has been implemented in IBM's Blue Gene/Q supercomputers [92] and System z mainframes [104], Sun's (canceled) Rock processor [71], and Azul's Vega processor [60]. Recently, HTM has become available in commodity desktop processors such as Intel's Haswell [96, 97].

TxRace leverages Intel's Transactional Synchronization Extensions (TSX) introduced in the Haswell processors. Intel TSX includes Hardware Lock Elision (HLE) and Restricted Transactional Memory (RTM) support, where the latter enables general-purpose transactional memory. Intel RTM provides a new set of instructions comprising xbegin, xend, xabort, and xtest to let programmers initiate, complete, and abort a transaction, and check its status, respectively. Intel RTM uses the first-level (L1) data cache to track transactional state, and leverages the cache coherence protocol to detect transactional conflicts [83, 224]. Intel RTM supports strong isolation, which guarantees transactional semantics between transactions and non-transactional code [136]. For conflict management, Intel RTM uses a simple requester-wins policy: on a conflicting request, the requester always succeeds and the conflicting transactions abort [45].
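The requester-wins policy can be illustrated with a small simulation. The sketch below is not Intel's implementation: it is a toy Python model (class and function names are our own) in which each transaction tracks the cache lines in its read and write sets, and a conflicting access by a new requester aborts the transaction that already holds the line.

```python
LINE = 64  # cache-line size in bytes (Haswell L1)

class Txn:
    """A toy transaction with read/write sets tracked at line granularity."""
    def __init__(self, tid):
        self.tid = tid
        self.read_set = set()
        self.write_set = set()
        self.aborted = False

def access(txns, txn, addr, is_write):
    """One transactional access; conflicting peers abort (requester wins)."""
    line = addr // LINE
    for other in txns:
        if other is txn or other.aborted:
            continue
        # write/write or read/write sharing a line is a conflict
        if line in other.write_set or (is_write and line in other.read_set):
            other.aborted = True  # the holder loses; the requester proceeds
    (txn.write_set if is_write else txn.read_set).add(line)

t1, t2 = Txn(1), Txn(2)
txns = [t1, t2]
access(txns, t1, 0x1000, is_write=True)   # T1 writes a line
access(txns, t2, 0x1000, is_write=True)   # T2 requests the same line
print(t1.aborted, t2.aborted)  # True False: T2 (requester) wins, T1 aborts
```

Note how the earlier writer, not the requester, is the one rolled back; this asymmetry is what TxRace later exploits to decide which thread triggers the slow path.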

3.2.2 Challenges in Using HTM for Race Detection

At first glance, it might seem that HTMs can trivially provide lightweight data race detection. However, commercial HTMs, including Intel RTM, share limitations that hinder their adoption for data race detection.

First, though HTMs can detect the presence of data conflicts and abort, HTMs including Intel RTM do not report to programmers the problematic instructions that caused the transaction to abort, or the affected memory addresses. Moreover, the concurrent transactions to which the competing instructions belong may have been successfully committed according to TM semantics. This implies that programmers cannot reason about the pairs of memory accesses involved in the data race.

Second, HTMs detect data conflicts by leveraging the cache coherence mechanism. Conflicts are therefore discovered at cache-line granularity (64 bytes in the Intel Haswell processor). This may produce false warnings in data race detection due to non-conflicting operations on variables that share a cache line. By comparison, traditional dynamic data race detectors identify data races at word (or byte) granularity, significantly reducing the likelihood of false positives.
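The granularity gap can be made concrete with two comparison functions (a simplified sketch; the 8-byte word size is an illustrative choice, not a property of any particular detector):

```python
LINE = 64  # Haswell cache-line size in bytes

def htm_conflict(addr_a, addr_b):
    """HTM's view: any two accesses in the same 64-byte line conflict."""
    return addr_a // LINE == addr_b // LINE

def word_conflict(addr_a, addr_b):
    """A word-granularity detector's view (8-byte words)."""
    return addr_a // 8 == addr_b // 8

a, b = 0x2000, 0x2008          # two distinct 8-byte variables on one line
print(htm_conflict(a, b))      # True: HTM flags a potential race
print(word_conflict(a, b))     # False: word-level check filters it out
```

This is exactly the kind of false-sharing-induced false positive that TxRace's slow path exists to winnow out.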

Third, HTMs have bounded resources, and irrevocable I/O operations are not supported. Intel RTM does not support arbitrarily long transactions, simply aborting any transaction exceeding the capacity of the hardware buffer for transactional state [93, 169]. Moreover, changing the privilege level always forces a transaction abort. The implication is that a transaction should not include any system call.

Finally, a transaction in best-effort (non-ideal) commodity HTMs including Intel RTM may be aborted for an unknown reason (neither a data conflict nor a capacity overflow). Intel's reference manuals [96, 97] illustrate some causes of unknown aborts, such as operating system context switches for interrupt or exception handling. However, neither the exact abort reason nor the problematic instruction is reported to programmers, which makes it hard to work around such a transaction abort.

3.3 Overview

TxRace is a lightweight software dynamic data race detector that leverages hardware transactional memory support in modern commodity processors. The key insight is that TxRace can detect potential data races with very low runtime overhead by initially relying on the conflict detection mechanism of HTM. The detected races are only potential because a data conflict between transactions might be caused by false sharing in the same cache line. When transactions commit without data conflicts, TxRace incurs only the small runtime overhead of transactional execution. In this sense, this HTM-based check is called the fast path, which TxRace takes first to quickly identify potential data races. To address the aforementioned challenges of HTM-based detectors, on a data conflict, TxRace artificially aborts all concurrent (in-flight) transactions, rolls them back to the state before the data conflict occurred, and performs software-based sound and complete data race detection in an on-demand manner. Such re-execution with software-based detection is called the slow path, which allows TxRace not only to pinpoint racy instructions but also to filter out false positives caused by false sharing.

Figure 3.2 shows an overview of TxRace. It consists of compile-time instrumentation and two-phase data race detection at runtime. TxRace inserts fast-path transactional code (e.g., xbegin, xend) and slow-path sound and complete data race detection code (e.g., FastTrack [79], ThreadSanitizer [180]) into the original program at compile time. Then, TxRace makes use of the two-phase data race detection at runtime. This allows TxRace to selectively perform sound and complete (but slow) data race detection for only a small fraction of the whole execution, leading to a significant runtime overhead reduction compared to traditional dynamic data race detection.

Figure 3.1: TxRace: Transactionalization

Figure 3.2: TxRace Overview

As illustrated in Figure 3.1, TxRace transforms program regions between synchronizations (synchronization-free regions) into transactions. Then, TxRace performs HTM-based race detection between program regions that overlap in parallel at runtime. For example, for an execution where program regions ① and ③ overlap, TxRace checks potential data races between those two regions. Similarly, program regions ② and ③, or ② and ④, are checked when they run concurrently. On the other hand, the original synchronization lock L prevents the program regions ① and ④ (and the corresponding transactions) from overlapping at runtime. Therefore, the HTM will never observe conflicting memory accesses in critical sections protected by the same lock. In the following figures, a white circle marks the beginning of a transaction and a black circle its end.

Figure 3.3: TxRace Runtime Example

After compile-time instrumentation, TxRace detects data races at runtime using the fast and slow paths as follows. Figure 3.3 shows an example with three threads, T1, T2, and T3, each performing one transaction. Suppose the shared variable X is not protected by a common lock, so that the transactions of T2 and T3 may run concurrently. During the fast path, each transaction begins by first reading a shared global flag named TxFail (as part of the instrumented code together with xbegin). Then, TxRace relies on HTM to detect data conflicts. Suppose a transaction in T3 causes another transaction in T2 to abort by accessing the shared variable X (step ①). Intel RTM employs a requester-wins conflict resolution strategy: on a conflicting request, the requester always succeeds and the conflicting transactions abort [45]. Thus, the concurrent transactions of T1 (no conflict) and T3 (winner) may proceed further. To pinpoint the precise data race condition, TxRace immediately aborts the in-flight transactions by making the aborted transaction in T2 update TxFail (step ③) right after its rollback. Intel RTM supports strong isolation, which guarantees transactional semantics between transactions and non-transactional code [136]. Together with the requester-wins policy, the strong isolation property of Intel RTM causes a transaction to abort if there is a conflicting access from non-transactional code. Therefore, the update to TxFail artificially aborts all the concurrent transactions (step ④), as they read TxFail at the beginning of the transaction. When all the concurrent transactions have been rolled back (step ⑤), they resume execution on the slow path, in which HTM is no longer used (step ⑥) and software-based sound and complete data race detection is performed (step ⑦) instead.
When the slow path finishes for the program regions where potential data races are detected, TxRace switches back to the fast path in which HTM is used for the next program regions.
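The TxFail protocol can be sketched as a small state machine. This is an illustrative Python model (class and method names are ours, not TxRace's API): every transaction "reads" TxFail at xbegin by joining the in-flight set, so a single non-transactional write to TxFail, performed by the transaction that just aborted on a conflict, knocks every in-flight peer out of the fast path via strong isolation.

```python
class Runtime:
    """Fast-path state: the TxFail flag plus the in-flight transactions."""
    def __init__(self):
        self.txfail = 0
        self.inflight = set()

    def tx_begin(self, tid):
        # reading TxFail at xbegin puts it into this transaction's read set
        self.inflight.add(tid)

    def conflict_abort(self, tid):
        """HTM aborted `tid` on a data conflict; knock out its peers."""
        self.inflight.discard(tid)       # tid has already been rolled back
        self.txfail += 1                 # non-transactional write to TxFail...
        victims = sorted(self.inflight)  # ...aborts every peer that read it
        self.inflight.clear()
        return victims                   # all of these re-run on the slow path

rt = Runtime()
for tid in (1, 2, 3):
    rt.tx_begin(tid)
print(rt.conflict_abort(2))  # [1, 3]: T1 and T3 are forced onto the slow path
```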

3.4 Fast Path HTM-based Race Detection

This section first describes how TxRace instruments original programs to detect potential data races and how it handles different types of transactional aborts, and then discusses the optimization techniques used to reduce performance overhead.

3.4.1 Transactionalization

To exploit HTM for potential data race detection, TxRace transforms a code region between synchronization operations (including a critical section) into a transaction as illustrated in Figure 3.1. To be specific, at compile-time, TxRace inserts transaction begin instructions (xbegin) at thread entry points and after synchronization operations; and transaction end instructions (xend) at thread exit points and before synchronization operations. System calls require special consideration due to HTM limitations. Intel RTM, for example, aborts a transaction if a change in privilege level takes place. Consequently, TxRace ends the current transaction prior to each system call and begins a new transaction immediately after the system call in order to guarantee forward progress.
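The cutting rule can be summarized as a toy transformation over an abstract instruction trace (a sketch of the placement policy only, with made-up op names; the real pass operates on LLVM IR): end the current transaction before every synchronization operation and system call, and begin a new one right after, so no transaction ever spans either.

```python
def transactionalize(ops):
    """Split a linear op trace into transactional regions.

    ops: list of abstract operations; "sync" and "syscall" are cut points,
    everything else is a memory operation kept inside a transaction.
    """
    regions, cur = [], []
    for op in ops:
        if op in ("sync", "syscall"):
            if cur:
                regions.append(cur)  # xend before the cut point
            cur = []                 # xbegin after it
        else:
            cur.append(op)
    if cur:
        regions.append(cur)          # xend at the thread exit point
    return regions

trace = ["ld", "st", "sync", "ld", "syscall", "st", "st"]
print(transactionalize(trace))  # [['ld', 'st'], ['ld'], ['st', 'st']]
```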

Furthermore, TxRace instruments each transaction to read the shared flag TxFail immediately after xbegin. As discussed in Section 3.3, Intel RTM uses the requester-wins policy, which allows the requester to succeed on a conflicting request and aborts the conflicting transactions [45], and supports strong isolation, which guarantees transactional semantics between transactions and non-transactional code [136]. These two properties cause an in-flight transaction to abort on a conflicting access from non-transactional code. Because TxRace makes transactions read TxFail when they start, when a data conflict is detected, the aborted transaction can artificially abort other in-flight transactions by writing to the flag TxFail. For example, in Figure 3.3, when the aborted transaction in T2 writes to TxFail, the concurrent in-flight transactions in T1 and T3, which have read TxFail, get aborted. Similar techniques have been used to abort in-flight transactions in hybrid TM systems [51, 65, 123] and to enable transactional lock elision [34].

HTMs can only detect those data races that result in conflicts between concurrent transactions. This suggests that maximizing the size of the transactions inserted at compile time will minimize the likelihood of false negatives. It would be ideal to transform each synchronization-free region into a single transaction. However, as mentioned above, TxRace inevitably cuts transactions at system calls. A performance optimization called loop-cut, discussed further in Section 3.4.3, may also cut a transaction originally formed for a synchronization-free region.

Figure 3.4: (a) Race detected with long transactions (b) Race missed with short transactions

Figure 3.4 (a) and (b) show that the length of transactions can affect the detection of data races. Both (a) and (b) show unsynchronized and potentially concurrent writes to the shared variable X from threads T1 and T2: a race condition. In (a), each thread executes a single lengthy transaction. The long transaction length increases the likelihood that the two transactions will overlap, and in this case the data race on X is detected. In (b), each thread executes two transactions, and the data race on X is more likely to be missed (a false negative) as a result: suppose the first transaction in T1, which includes X=1, successfully commits before the second transaction in T2, which includes X=2, begins. Because the two transactions do not overlap, neither aborts. For this reason, similar to other overlap-based data race detectors [43, 56, 75, 78], TxRace may miss data races if they happen far apart in time. Though perfect soundness is thus out of reach, the experimental results in Section 3.8 show that TxRace trades only a few false negatives (recall of 0.95) for excellent performance.
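The overlap condition itself is just an interval intersection test. A minimal sketch (intervals are hypothetical timestamps, not measured data) shows how cutting transactions turns the detected race of Figure 3.4(a) into the missed race of Figure 3.4(b):

```python
def overlaps(tx_a, tx_b):
    """True iff two half-open time intervals (start, end) intersect."""
    (s1, e1), (s2, e2) = tx_a, tx_b
    return s1 < e2 and s2 < e1

long_t1, long_t2 = (0, 10), (5, 15)    # Figure 3.4(a): long transactions
short_t1, short_t2 = (0, 4), (5, 9)    # Figure 3.4(b): cut transactions
print(overlaps(long_t1, long_t2))      # True  -> race on X detected
print(overlaps(short_t1, short_t2))    # False -> race on X missed
```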

3.4.2 Handling Transactional Aborts

The best-effort Intel RTM does not guarantee that a transaction will eventually commit and make progress. In addition to data conflicts, there are many architectural and micro-architectural conditions that may cause a transaction to abort. When a transaction is aborted, Intel RTM rolls the transaction back to the point where it began and reports the abort type(s) in a register. TxRace handles transaction aborts according to the abort reason as follows, while juggling the competing goals of reducing false negatives, decreasing overhead, and offering a forward progress guarantee.

Conflict. A transaction aborted due to a data conflict indicates a potential data race. To conduct precise data race detection, TxRace updates the shared flag TxFail that every transaction begins by reading, forcing all the concurrent (in-flight) transactions to abort and roll back (Figure 3.3). TxRace then performs slow-path, software-based, sound and complete data race detection among the code regions that overlapped with the aborted code region. Once a potential data race is detected by the fast path, the software-based slow-path detector winnows out the false positives and finds the true data races, if any exist.

Retry. A transaction aborted with "retry" status might succeed if retried. If this flag is set in conjunction with the conflict flag described above, TxRace treats the case as a conflict and follows the slow path. Otherwise, TxRace retries the transaction.

Figure 3.5: Detecting data races between fast and slow paths using the strong isolation property of HTM

Capacity. When a transaction is aborted due to overflow, TxRace makes only the thread that observed the capacity abort fall back to the slow path. Unlike the case of data conflicts, TxRace does not artificially abort the other concurrent transactions (it does not update the shared flag TxFail), since there is no indication of a potential data race with concurrent transactions. Running slow- and fast-path executions concurrently minimizes performance overhead while still giving TxRace high detection coverage (fewer false negatives) and a guarantee of forward progress. Figure 3.5 demonstrates how TxRace can detect data races when both fast and slow paths run at the same time. Here, threads T1 and T2 are on the fast path, while threads T3 and T4 are on the slow path due to capacity aborts. In this case, data races between threads T1 and T2 can be detected using HTM-based fast-path detection, and data races between threads T3 and T4 can be detected using precise slow-path detection. The interesting case is a race condition between fast-path and slow-path threads. If T2 is on the fast path and T3 is on the slow path, as shown in the example, the strong isolation property of Intel RTM ensures that the transaction in T2 will be aborted in the event that T3 makes a conflicting access to the variable X that T2 has accessed. In this case, TxRace handles the conflict abort as described above. Because T3 is already on the slow path, the precise data race condition can be identified once TxRace puts T2 on the slow path.

Unknown. A transaction may abort for an unknown (unspecified) reason. As TxRace enforces that a transaction does not include a system call (Section 3.4.1), this is most likely due to unexpected context switches to handle interrupts, exceptions, etc. To guarantee forward progress while achieving high detection coverage (fewer false negatives), TxRace treats this case the same as a capacity abort.

Debug/Nested. The debug bit is set when a transaction aborts upon encountering a debug breakpoint, while the nested bit is set when a transaction is aborted inside a nested transaction. Neither condition can arise from TxRace's transactionalization process: no debug breakpoints are used, and TxRace does not introduce nested transactions. TxRace simply ignores these cases.
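The abort-handling policy above can be condensed into a small dispatcher. This Python sketch mirrors Section 3.4.2's rules; the bit values follow the layout of Intel's `_XABORT_*` status flags, but the function and the returned action names are our own illustration, not TxRace's code.

```python
# RTM-style abort status bits (positions as in Intel's _XABORT_* flags)
RETRY, CONFLICT, CAPACITY, DEBUG, NESTED = 0x2, 0x4, 0x8, 0x10, 0x20

def handle_abort(status):
    """Map an abort status word to TxRace's response."""
    if status & CONFLICT:
        return "slow-path-all"   # write TxFail, abort peers, run TSan
    if status & RETRY:
        return "retry"           # transient abort: just re-execute
    if status & CAPACITY:
        return "slow-path-self"  # only this thread falls back
    if status & (DEBUG | NESTED):
        return "ignore"          # never produced by TxRace's instrumentation
    return "slow-path-self"      # unknown reason: treat like capacity

print(handle_abort(CONFLICT | RETRY))  # slow-path-all (conflict wins)
print(handle_abort(RETRY))             # retry
print(handle_abort(CAPACITY))          # slow-path-self
print(handle_abort(0))                 # slow-path-self (unknown)
```

Checking CONFLICT before RETRY encodes the rule that a conflict+retry abort is treated as a conflict.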

3.4.3 Optimization

To reduce performance overhead at runtime, TxRace applies several optimizations to the fast path. First, TxRace checks whether the program is in single-threaded mode (e.g., at the very beginning of program execution before spawning child threads). If so, there can be no races, so TxRace simply does not use HTM to monitor program execution, avoiding the unnecessary cost of transactions. If a function is profiled to be invoked in both single-threaded and multi-threaded modes, then at compile time TxRace clones the function and instruments only the version that is called in multithreaded mode.

Second, TxRace reuses the same static analysis algorithm that Google TSan uses to avoid unnecessary data race checks. If a memory operation is statically proven to be data-race-free, then TSan does not instrument it. TxRace likewise does not insert transaction code for those code regions that TSan does not instrument to hook memory accesses for data race detection.

Third, for regions containing a small number of memory operations, the overhead associated with HTM exceeds the cost of the software-based slow path. If a code region contains fewer than K memory operations, TxRace favors the slow path. In our experiments we chose K = 5.
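The small-region heuristic amounts to a one-line cost test; the sketch below uses K = 5 as in the evaluation (the function name is ours):

```python
K = 5  # minimum memory operations for the fast path to pay off

def pick_path(num_mem_ops, k=K):
    """Regions with fewer than K memory ops skip HTM entirely."""
    return "slow" if num_mem_ops < k else "fast"

print(pick_path(3))   # slow: transaction begin/end would dominate
print(pick_path(50))  # fast: transactional cost is amortized
```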

Finally, TxRace leverages a so-called loop-cut optimization for transactions that include loops with a large number of iterations; such loops are a frequent cause of capacity aborts. By default, TxRace falls back to the slow path when a transaction experiences a capacity abort to obtain better detection capability for those code regions, at the cost of some performance overhead. To reduce this overhead, the loop-cut optimization aims to end the long transaction prior to a capacity abort. TxRace first profiles an application with representative input to identify candidate loops for the loop-cut optimization. In this study, TxRace leverages the Last Branch Record (LBR), a branch tracing facility in Intel processors [86], which allows TxRace to identify the last branch taken before a transaction aborts. TxRace then inserts the following loop-cut logic at the end of the candidate loop body.

The high-level idea is for TxRace to keep track of the number of loop iterations a transaction can hold (called the loop-cut-threshold). When a transaction experiences a capacity abort in the loop, TxRace takes the slow path at first (the default behavior), but the next time the same loop executes, TxRace uses the loop-cut-threshold to cut the current transaction early, in the middle of the loop iterations, placing the rest of the iterations into another transaction to avoid capacity aborts. As discussed above, the short transactions produced by the loop-cut optimization may lead to false negatives.

HTM semantics make this implementation slightly tricky. Note that because the loop is part of a transaction, it is not possible to use a counter incremented per loop iteration to obtain the precise loop-cut-threshold value; updates to the counter will not survive a transaction abort. TxRace addresses this problem by setting a small initial estimate of the loop-cut-threshold (two in our experiments) and by incrementing or decrementing the estimate (outside the transaction) when the transaction commits or aborts, respectively. This approach enables TxRace to estimate the largest loop-cut-threshold that still allows the transaction to commit. This work calls this scheme, which dynamically learns the loop-cut-threshold at runtime, TxRace-DynLoopcut.

As an alternative scheme, TxRace-ProfLoopcut profiles an application with representative input to determine the initial loop-cut-threshold value beforehand. This approach allows TxRace to avoid even the very first capacity abort. Similar to TxRace-DynLoopcut, TxRace-ProfLoopcut handles misprofiling by adjusting the threshold accordingly when a transaction commits or aborts. Section 3.8.2 evaluates the effectiveness of the two loop-cut optimization schemes.
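The DynLoopcut adjustment rule can be sketched in a few lines. This is a simplified model, not the instrumented C code: the threshold lives outside the transaction (so it survives aborts), grows by one on each commit, and shrinks by one on each capacity abort, converging on the largest iteration count that still fits the hardware buffer.

```python
def tune_threshold(events, initial=2):
    """events: sequence of 'commit' / 'capacity-abort' transaction outcomes.

    Returns the learned loop-cut-threshold (iterations per transaction);
    the initial estimate of 2 follows the paper's experiments.
    """
    t = initial
    for e in events:
        t = t + 1 if e == "commit" else max(1, t - 1)
    return t

history = ["commit", "commit", "capacity-abort", "commit"]
print(tune_threshold(history))  # 4: grew to 4, shrank to 3, grew back to 4
```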

3.5 Slow Path Software-based Race Detection

For software-based data race detection during the slow path, TxRace uses Google's ThreadSanitizer (TSan) [173, 180], an open-source state-of-practice data race detector. Similar to the well-known sound and complete FastTrack algorithm [79], TSan keeps track of the happens-before order for each memory location using shadow memory. TSan then detects data races when accesses to shared locations are not ordered. This process requires instrumenting (1) synchronization operations, to track the happens-before order; and (2) memory operations, to look up shadow memory and compare their happens-before order. By design, TSan is complete (no false positives). However, to bound memory overhead, TSan maintains N (by default 4) shadow cells per 8 application bytes and replaces one random shadow cell when all shadow cells are filled. This may affect the soundness (no false negatives) of data race detection. Thus, this work configured TSan with a sufficient number of shadow cells to be sound as well.
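The happens-before check underlying FastTrack and TSan can be illustrated with a heavily pared-down vector-clock model. This sketch is ours, not TSan's implementation: it tracks only the last write per location, whereas the real detectors track reads as well and compress clocks into epochs.

```python
class HB:
    """Minimal vector-clock happens-before check for write/write races."""
    def __init__(self, nthreads):
        self.vc = [[0] * nthreads for _ in range(nthreads)]
        for t in range(nthreads):
            self.vc[t][t] = 1          # each thread starts in epoch 1
        self.last_write = {}           # addr -> (writer tid, writer epoch)

    def write(self, tid, addr):
        """Record a write; return True if it races with the previous one."""
        race = False
        if addr in self.last_write:
            wt, wc = self.last_write[addr]
            # race unless the prior write happens-before this one
            race = wt != tid and self.vc[tid][wt] < wc
        self.last_write[addr] = (tid, self.vc[tid][tid])
        return race

    def sync(self, frm, to):
        """A release (frm) / acquire (to) pair, e.g. unlock then lock."""
        self.vc[to] = [max(a, b) for a, b in zip(self.vc[to], self.vc[frm])]
        self.vc[frm][frm] += 1         # the releaser starts a new epoch

hb = HB(2)
hb.write(0, "X")
print(hb.write(1, "X"))   # True: unordered concurrent writes -> data race

hb2 = HB(2)
hb2.write(0, "X")
hb2.sync(0, 1)            # thread 0 signals, thread 1 waits
print(hb2.write(1, "X"))  # False: the sync orders the two writes
```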

Figure 3.6: Tracking the happens-before order of synchronizations on the fast path eliminates false warnings on the slow path

The potential interplay of fast- and slow-path threads necessitates additional overhead during the fast path. A naive fast path could rely solely on HTM to detect potential data races, obviating the need to track the happens-before order imposed by synchronization operations. However, threads may alternate between fast and slow paths for precise data race detection. When TSan finishes on the slow path, TxRace resumes the fast path to monitor the next regions, achieving better performance. This design requires TxRace to keep track of the happens-before order of synchronization operations even during the fast path, in order to remove false warnings during the slow path. Figure 3.6 describes this feature in more detail. Suppose that threads T1 and T2 have executed the slow path, fast path, and slow path in turn for some reason other than conflicting accesses to X, and that a happens-before order between signal and wait appeared during the fast path. If TxRace did not track this happens-before order during the fast path, the slow-path data race detector would report a data race between X=1 and X=2, which is a false warning. The performance overhead breakdown in Section 3.8.2 shows that tracking synchronization operations is not that expensive during the fast path.
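The Figure 3.6 scenario can be replayed with a toy checker (event encoding and function name are our own illustration): if the signal/wait edge from the fast-path interval is dropped, the slow-path detector sees two writes to X with no ordering between them and raises a false warning; recording the edge suppresses it.

```python
def slow_path_race(events, track_fast_path_sync):
    """events: ('write', tid, addr, phase) or ('sync', frm, to, phase),
    where phase is 'fast' or 'slow'. Returns True if a race is reported."""
    ordered = set()               # (frm, to) happens-before edges recorded
    last_writer = {}
    for ev in events:
        if ev[0] == "sync":
            _, frm, to, phase = ev
            if phase == "slow" or track_fast_path_sync:
                ordered.add((frm, to))
        else:
            _, tid, addr, phase = ev
            if (addr in last_writer and last_writer[addr] != tid
                    and (last_writer[addr], tid) not in ordered):
                return True       # unordered write/write -> report a race
            last_writer[addr] = tid
    return False

# T1 writes X (slow path), signals T2 on the fast path, T2 writes X (slow path)
trace = [("write", 1, "X", "slow"), ("sync", 1, 2, "fast"),
         ("write", 2, "X", "slow")]
print(slow_path_race(trace, track_fast_path_sync=False))  # True: false alarm
print(slow_path_race(trace, track_fast_path_sync=True))   # False: suppressed
```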

3.6 False Negatives

TxRace is complete (no false positives) but unsound (some false negatives). There are four main reasons why TxRace could miss data races. First, fast-path HTMs do not detect data conflicts between transactions that do not overlap in time, as discussed in Section 3.4.1. This is different from sound (happens-before based) data race detectors such as FastTrack or Google's TSan, which identify races by tracking the happens-before order of synchronization operations. In this sense, TxRace resembles overlap-based data race detectors [43, 56, 75, 78].
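The overlap-in-time limitation can be made concrete with a toy model of how an HTM observes conflicts; the interval and access-set representation below is purely illustrative, not a model of the actual Haswell conflict-detection hardware.

```python
# Toy model of overlap-based conflict detection, as an HTM sees it:
# a conflict is visible only if two transactions overlap in time AND
# touch a common location with at least one write.

def overlaps(tx_a, tx_b):
    return tx_a["start"] < tx_b["end"] and tx_b["start"] < tx_a["end"]

def conflicts(tx_a, tx_b):
    shared = (tx_a["writes"] & (tx_b["reads"] | tx_b["writes"])) | \
             (tx_b["writes"] & (tx_a["reads"] | tx_a["writes"]))
    return bool(shared)

def htm_detects(tx_a, tx_b):
    return overlaps(tx_a, tx_b) and conflicts(tx_a, tx_b)

t1 = {"start": 0,  "end": 10, "reads": set(), "writes": {"X"}}
t2 = {"start": 5,  "end": 15, "reads": set(), "writes": {"X"}}  # overlapping
t3 = {"start": 20, "end": 30, "reads": set(), "writes": {"X"}}  # far apart

print(htm_detects(t1, t2))   # True: conflicting writes that overlap in time
print(htm_detects(t1, t3))   # False: the same race, but missed (false negative)
```

The second pair is exactly the case described above: a real race on X that a happens-before detector would catch but an overlap-based detector cannot.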

Second, when a transaction is aborted due to a data conflict and TxRace writes the shared flag TxFail to abort the others, some already-running transactions may commit before they observe the write. In this case, even though TxRace triggers the slow path, the race will not occur again and thus cannot be detected.

Third, race detection between the fast and slow paths (Figure 3.5) only works in one direction. If the slow path thread makes a shared memory access before the fast path thread makes a conflicting access, then the HTM's strong isolation guarantee does not apply. As a result, when the opposite of the situation in Figure 3.5 happens, TxRace will not trigger the slow path, and the race will not be detected.

Finally, TxRace by nature shares the limitations of the underlying HTM system. As of now, the Intel Haswell processor does not support more concurrent transactions than the total number of hardware threads available. This implies that the number of threads that can be monitored by HTM during the fast path is limited.

During evaluation, the thread count was restricted to be smaller than the hardware thread count, ruling out the fourth reason. All of the observed false negatives were due to non-overlapping transactions.

3.7 Implementation

The TxRace instrumentation framework is implemented in the LLVM compiler framework [115]. As the first step, we translate application source code into LLVM IR (Intermediate Representation) and perform instrumentation as a custom transformation pass. During this process, we do not include external libraries such as standard libc, libc++, libm, etc., assuming that such libraries are thread safe and that users are interested in detecting data races in application code. External libraries may be included in our scope when their LLVM IR is provided. It is worth noting that we included in our analysis all the internal libraries shipped with the core application code in the PARSEC benchmark suite [41] (e.g., gsl, libjpeg, glib, libxml2).

Instrumentation for the fast path needs to intercept synchronization operations and the program points before/after system calls so that transaction begin/end code can be inserted. For example, a new transactional region starts after a new thread starts or after each system call. As we do not include standard C/C++ libraries in our scope, we instrument system calls at the library call boundary; i.e., before and after calls to library functions that may invoke system calls, such as synchronization (e.g., the PThread library); standard I/O (e.g., read, write); and dynamic memory management (e.g., malloc, free). For third-party libraries whose source code is not available, dynamic binary instrumentation tools [48, 132, 151] can be used to profile the program with representative input and to identify a list of external library functions invoking system calls. Misprofiling would result in unknown aborts caused by undetected system calls. TxRace falls back to the slow path in case of unknown aborts (by default), so misprofiling only adds runtime overhead and does not harm detection coverage.
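The transaction-cutting policy around syscall-invoking library calls can be sketched as follows; the function set and the flat trace format are hypothetical simplifications of what the actual LLVM pass does at the IR level.

```python
# Sketch of the fast-path instrumentation policy: end the current transaction
# before any library call that may invoke a system call, run the call
# untransactionalized, and begin a new transaction afterwards.

SYSCALL_LIBS = {"pthread_mutex_lock", "pthread_mutex_unlock",
                "read", "write", "malloc", "free"}

def instrument(trace):
    out, in_tx = [], False
    for op in trace:
        if op in SYSCALL_LIBS:
            if in_tx:
                out.append("xend")          # cut the transaction before the call
                in_tx = False
            out.append(op)                  # the call runs outside any transaction
        else:
            if not in_tx:
                out.append("xbegin")        # open a new transactional region
                in_tx = True
            out.append(op)
    if in_tx:
        out.append("xend")
    return out

trace = ["load A", "store B", "write", "load C", "store C"]
print(instrument(trace))
# ['xbegin', 'load A', 'store B', 'xend', 'write', 'xbegin', 'load C', 'store C', 'xend']
```

A trace dominated by such calls yields many tiny transactions, which is exactly the management-cost effect later observed for swaptions and streamcluster.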

For the slow path, we use off-the-shelf Google's ThreadSanitizer (TSan) [173, 180], an open-source state-of-practice happens-before based data race detector. For each memory location and synchronization variable, TSan keeps track of happens-before order information in shadow memory. TSan supports compile-time instrumentation for data race detection using the Clang frontend [195] and LLVM passes [115]. For simplicity, we instrument the fast/slow path code together into the original program. For example, the same memory access hook is instrumented for both fast and slow paths. Depending on the path, the hook either performs TSan data race detection (slow path) or does nothing (fast path). For better performance, it would be ideal to clone the code into separate fast/slow path versions to remove this redundancy, similar to [113, 191]. We leave this optimization as future work.

Figure 3.7: Breakdown of runtime overhead.

Figure 3.8: Scalability of TxRace
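The shared fast/slow path hook can be sketched as a per-thread mode dispatch; tsan_check below is a placeholder for TSan's real shadow-memory check, not its actual API.

```python
# Sketch of the single instrumented memory-access hook shared by both paths:
# on the slow path it invokes the (placeholder) TSan check, on the fast path
# it does nothing, because the HTM hardware detects conflicts for free.

import threading

mode = threading.local()            # per-thread mode: "fast" or "slow"
slow_path_checks = []

def tsan_check(addr, is_write):     # stand-in for TSan's shadow-memory lookup
    slow_path_checks.append((addr, is_write))

def on_memory_access(addr, is_write):
    if getattr(mode, "path", "fast") == "slow":
        tsan_check(addr, is_write)  # slow path: full happens-before check
    # fast path: no-op

mode.path = "fast"
on_memory_access(0x1000, True)      # fast path: no software check recorded
mode.path = "slow"
on_memory_access(0x1000, True)      # slow path: checked
print(len(slow_path_checks))        # 1
```

Cloning the code into separate fast/slow versions, as the text notes, would remove the per-access branch that this dispatch pays.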

application    committed   conflict  capacity  unknown   TSan   TxRace  original  TSan      TxRace    TSan      TxRace
               txns        aborts    aborts    aborts    races  races   time(ms)  time(ms)  time(ms)  overhead  overhead
blackscholes   131105      2         0         7         0      0       253       467       460       1.85x     1.82x
fluidanimate   17778944    696789    10321     36614     1      1       539       8217      3724      15.23x    6.9x
swaptions      160640076   2599      557497    54317     0      0       868       5875      3446      6.77x     3.97x
freqmine       84          0         3         26        0      0       3973      55611     4569      14x       1.15x
vips           707547      16793     23403     14985     112    79(*)   953       1139087   60320     1195x     63.28x
raytrace       143         12        0         14        2      2       4546      23130     12203     5.09x     2.68x
ferret         208052      379       2413      4263      1      1       1060      11390     5852      10.74x    5.52x
x264           36808       245       423       5358      64     64      595       3837      3332      6.45x     5.6x
bodytrack      9950991     36004     47050     2004723   8      6(*)    503       6429      4479      12.78x    8.9x
facesim        12827334    1611      3372      38563     9      8(*)    2439      89242     28027     36.59x    11.49x
streamcluster  756908      170805    230       832       4      4       1430      39042     4253      25.9x     2.97x
dedup          2185219     106618    13889     40177     0      0       2748      13292     11513     4.84x     4.19x
canneal        3200570     25187     2896      106419    1      1       3499      15367     10375     4.39x     2.97x
apache         310781      227       446       9793      0      0       6916      21089     13600     3.05x     1.97x
geo.mean                                                                                              11.68x    4.65x

Table 3.1: TxRace Execution Statistics and Performance.

3.8 Evaluation

Our evaluation answers the following questions:

• What is the overhead of TxRace data race detection? Is it efficient?

• Does TxRace effectively detect data races? How many false negatives are there?

• Is TxRace cost-effective compared to other approaches? Is it better than a sampling-based approach or a full happens-before based detector?

3.8.1 Methodology

We ran experiments on a 3.6GHz quad-core Intel Core™ i7-4790 processor with 16GB of RAM, running Gentoo Linux (kernel 4.0.4). Intel's Restricted Transactional Memory (RTM) is used for the HTM-based fast path data race detection. The Intel Haswell processor supports the same number of concurrent transactions as the hardware threads available, which is four (eight with hyperthreading) in our case. Google's ThreadSanitizer (TSan) is used for the software-based slow path data race detection.

TxRace was evaluated using 1) the PARSEC benchmark suite [41], which is designed to be representative of next-generation shared-memory programs including emerging workloads; and 2) Apache [1]. We used the simlarge input for all 13 applications in PARSEC, and tested Apache using ab (ApacheBench) by sending 300,000 requests from 20 concurrent clients over a local network. Performance is reported as overhead with respect to the original execution time without data race detection. We compare our system (named TxRace in the results) with off-the-shelf Google's ThreadSanitizer (named TSan). All results are the mean of five trials with four worker threads (except the scalability analysis).

3.8.2 Performance Overhead

Table 3.1 shows the TxRace execution statistics, the number of detected data races, and the overall performance results. The first column provides the application name. The next four columns show transaction statistics during the HTM-based fast path data race detection: the number of total committed transactions, the number of data conflict aborts, the number of capacity (overflow) aborts, and the number of unknown (unspecified) aborts. The next two columns give the number of races detected by TSan and TxRace. The applications in which TxRace cannot detect all the races reported by TSan are marked with an asterisk. The next three columns show the original, TSan, and TxRace execution times. The first is the execution time of the original application (with no transactions, memory hooks, etc.); the second is the execution time when TSan is used; the third is the execution time of our system TxRace. The last two columns show TSan's and TxRace's overhead with respect to the original execution.

Examining the results, we see that TxRace's overhead is generally low. On average, TxRace reduces the runtime overhead of dynamic data race detection from 11.68x to 4.65x (geometric mean), a 60% reduction. For some applications such as vips and streamcluster, TxRace achieves more than a 10x speedup over TSan.

Figure 3.7 shows a breakdown of the overhead normalized to the original execution time (baseline) for all benchmarks. The black portion in each bar (xbegin/xend) represents the pure fast path overhead, in which transactions are executed but no slow path is taken even when they get aborted (the aborted region simply runs untransactionalized). For most applications, this overhead is quite low (a geometric mean of 17%), except for swaptions and streamcluster. Upon further investigation, we found that these two applications have tight loops with system calls in the loop body. In this case, TxRace ends and begins new transactions around the system calls. This results in tight, short transactions whose management cost becomes dominant. The next overhead comes from handling aborts due to data conflicts (157%), which includes running the slow path software-based data race detection. This low-overhead result shows the benefit of using HTM-based potential data race detection beforehand, which allows TxRace to selectively perform software-based dynamic data race detection only for a small fraction of execution intervals. Finally, the remaining performance overhead comes from handling capacity and unknown aborts (126% and 66%, respectively). To keep false negatives low, TxRace takes the conservative approach of using the slow path software-based detector to monitor program regions that the fast path HTM cannot cover. We envision that with an ideal HTM, in which a transaction aborts only on a data conflict and there are no capacity/unknown aborts, the runtime overhead of TxRace would improve significantly, as TxRace would fall back to the slow path only when necessary (only when a data conflict occurs).

Figure 3.8 shows the scalability of TxRace. We varied the number of worker threads from 2, 4 to 8, and measured the runtime overhead normalized to the original execution time with 2, 4, and 8 worker threads, respectively. Comparing the two and four thread cases, some applications such as swaptions and streamcluster show lower runtime overhead with four threads, but most of the remaining applications show small differences in normalized runtime overhead. Upon further investigation, we found that the number of capacity aborts (geometric mean of 644 vs. 474) and unknown aborts (geometric mean of 8377 vs. 4651) decreases from two to four threads. The reduction in capacity aborts makes sense as many applications in the PARSEC benchmark take advantage of data parallelism, and thus each worker thread is likely to have a smaller dataset with more worker threads. On the other hand, the number of conflict aborts (geometric mean of 244 vs. 1242) increases from two to four threads (likely due to increased concurrency). After all, the mixture of the increase and the decrease in transactional aborts causes different applications to show variation in performance overhead.

Figure 3.9: Effectiveness of loop-cut optimization

Figure 3.10: The number of detected distinct data races across multiple runs for vips

Another interesting result of this experiment is the high overhead incurred for some applications with eight threads (e.g., fluidanimate, swaptions, streamcluster, and dedup). Examining the results, we found that the number of unknown aborts increases significantly in the eight thread case (geometric mean of 42251, which is 5x and 9x more than the two and four thread cases, respectively). As the Intel Haswell processor does not provide additional information regarding unknown aborts, further investigation was not possible, but we suspect that eight concurrent transactions enabled by hyperthreading might lead to increased unknown aborts.

Finally, we evaluate the effectiveness of the loop-cut optimization discussed in Section 3.4.3. Figure 3.9 presents the normalized runtime overhead of TSan and three different configurations of TxRace. They differ in how they handle a transaction that includes a loop with a large number of iterations, causing frequent capacity aborts. TxRace-NoOpt stands for the basic scheme without optimization, in which TxRace simply falls back to the slow path every time a transaction gets aborted for the capacity reason. TxRace-DynLoopcut represents the optimized scheme in which, for a transaction including a loop, TxRace dynamically learns at runtime the loop iteration count (called the loop-cut-threshold) that does not cause a capacity abort. When the transaction gets aborted, TxRace falls back to the slow path at first. However, when the same loop is executed the next time, TxRace uses the loop-cut-threshold to terminate the transaction early in the middle of the loop iterations and starts a new transaction to avoid capacity aborts. TxRace-ProfLoopcut is similar to the above dynamic scheme, but it profiles the program with representative input to collect the initial loop-cut-threshold, avoiding even the very first capacity aborts. Figure 3.9 shows the benefits of the loop-cut optimization. In all cases, TxRace is more efficient than TSan. On average, TxRace-ProfLoopcut shows the best result (4.65x) in terms of performance overhead, and TxRace-DynLoopcut, which does not require profiling the threshold, also performs better than TSan (5.34x).
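The TxRace-DynLoopcut idea can be sketched as follows; the capacity model (a fixed budget of cache lines per transaction) and the numbers are hypothetical simplifications of the real hardware's read/write-set limits.

```python
# Sketch of dynamic loop-cut learning: on a capacity abort, remember how many
# loop iterations fit in one transaction, and on later executions of the same
# loop cut the transaction early (xend + xbegin) to avoid the abort.

TX_CAPACITY = 100                   # hypothetical: max cache lines per transaction

class LoopCut:
    def __init__(self):
        self.threshold = None       # learned loop-cut-threshold

    def run_loop(self, iters, lines_per_iter):
        aborts, footprint, count = 0, 0, 0
        for _ in range(iters):
            if self.threshold and count == self.threshold:
                footprint, count = 0, 0          # cut early: xend + xbegin
            footprint += lines_per_iter
            count += 1
            if footprint > TX_CAPACITY:          # capacity abort
                aborts += 1
                self.threshold = max(1, count - 1)  # learn a safe iteration count
                footprint, count = 0, 0             # fall back, restart transaction
        return aborts

lc = LoopCut()
print(lc.run_loop(1000, 4))   # first run: a capacity abort, threshold learned
print(lc.run_loop(1000, 4))   # next run: transactions cut early, no aborts
```

TxRace-ProfLoopcut corresponds to constructing LoopCut with the threshold pre-populated from a profiling run, so even the first capacity abort is avoided.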

3.8.3 False Negatives

In this section, we study the false negatives of TxRace. HTMs detect data conflicts between transactions that overlap in time. As a result, similar to other overlap-based data race detectors [43, 56, 75, 78], TxRace may miss data races if they happen far apart in time. The sixth and seventh columns of Table 3.1 report the average number of data races reported by the happens-before based TSan and our overlap-based TxRace. Here, each race is in the form of a racy instruction pair, and we count the number of static instances.

There are three applications (vips, bodytrack, and facesim) for which TxRace detects fewer data races than TSan. It turns out that the three missed cases in bodytrack and facesim are due to the common initialization idiom, in which a data structure is allocated within a thread and initialized without any synchronization while the structure is still local to the thread; it then becomes accessible to other threads by being added to a global data structure. For example, in facesim, a structure is initialized when a thread pool is created at the beginning of program execution, then it becomes shared at a later time. TxRace missed such races because the conflicting accesses do not overlap.

On the other hand, for vips, though the number of data races found in each test run remains about the same (an average of 79), we observed that TxRace actually finds different sets of data races across different runs. This makes sense because TxRace's overlap-based detection makes it sensitive to the underlying OS scheduler. Figure 3.10 shows that when we accumulate the distinct data races detected, TxRace can find all the data races (112) found by TSan after seven runs. Note that for vips, TxRace (63.3x) is an order of magnitude faster than TSan (1195x).

3.8.4 Cost-Effectiveness of Data Race Detection

TxRace is complete (no false positives) but unsound (some false negatives). In essence, TxRace aims to be a cost-effective solution that exploits a critical tradeoff of soundness for performance. To quantitatively evaluate how cost-effective TxRace is, we rely on a popular economic analysis term called the cost-effectiveness ratio, where the denominator is the effectiveness and the numerator is the cost. The original ratio is inverted and redefined for the context of data race detection to quantify how cost-effective it is, as follows:

CostEffectiveness = Race_Detection_Effectiveness / Race_Detection_Cost

application    overhead  recall  cost-effectiveness
blackscholes   0.99      1       1.02
fluidanimate   0.45      1       2.21
swaptions      0.59      1       1.7
freqmine       0.08      1       12.17
vips           0.05      0.71    13.32
raytrace       0.53      1       1.9
ferret         0.51      1       1.95
x264           0.87      1       1.15
bodytrack      0.7       0.75    1.08
facesim        0.31      0.89    2.83
streamcluster  0.11      1       8.71
dedup          0.87      1       1.15
canneal        0.68      1       1.48
apache         0.65      1       1.55
geo.mean       0.38      0.95    2.38

Table 3.2: Cost-Effectiveness of TxRace vs. TSan

Figure 3.11: Cost-Effectiveness of TxRace vs. Sampling

As a metric to evaluate the data race detection effectiveness, we use recall, which is commonly used to measure the quality of classifiers in the information retrieval and bug detection communities [42, 110, 122, 213]. Intuitively, high recall leads to fewer false negatives (undetected data races). In the context of data race detection, recall is defined as follows:

recall = |Reported_Data_Races ∩ Real_Data_Races| / |Real_Data_Races|

For comparison to TSan, Real_Data_Races is defined as the data races reported by TSan. To calculate the cost-effectiveness (CE), we use TxRace's runtime overhead normalized to TSan's. Table 3.2 summarizes how much more cost-effective TxRace is than TSan for each benchmark application (here TSan's CE is 1). TxRace turns out to be 2.38x (geometric mean) more cost-effective than TSan across the benchmark applications. This is mainly because in TxRace, only a small portion of memory accesses is investigated by the software-based data race detection (slow path); the majority of memory accesses are handled by transactional execution (fast path) at a very low runtime cost.
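As a worked example of the two metrics, the vips numbers from Tables 3.1 and 3.2 can be plugged in directly:

```python
# Worked example of recall and cost-effectiveness using vips:
# TSan reports 112 races, TxRace 79; runtime overheads are 1195x vs. 63.28x.

tsan_races, txrace_races = 112, 79          # TxRace's reports are a subset of TSan's
tsan_overhead, txrace_overhead = 1195.0, 63.28

recall = txrace_races / tsan_races          # |Reported ∩ Real| / |Real|
cost = txrace_overhead / tsan_overhead      # overhead normalized to TSan's
ce = recall / cost                          # CostEffectiveness (TSan's CE is 1)

# Matches Table 3.2's vips row: overhead 0.05, recall 0.71, CE 13.32.
print(round(recall, 2), round(cost, 2), round(ce, 2))
```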

To justify the high cost-effectiveness of TxRace, we also compare TxRace with TSan with sampling. Sampling memory operations is an intuitive way to reduce the runtime overhead of dynamic data race detection. However, it also comes with false negative issues because some data races might be missed at a low sampling rate. To study whether TxRace is more cost-effective than sampling, we vary the sampling rate and measure the resulting runtime overhead and the recall of TSan for every benchmark. As a representative application, we present the result of bodytrack in detail. Figure 3.12 shows the runtime overhead normalized to 100% sampling (full coverage), and Figure 3.13 shows the recall at different sampling rates while treating 100% as an oracle. As expected, both the runtime overhead and the recall increase as the sampling rate increases. For TxRace, the normalized runtime overhead and the recall are 0.69 and 0.75, respectively. This implies that TxRace adds an overhead equivalent to sampling about 25.5% of memory operations, but its recall is equivalent to 47.2% sampling, which shows its cost-effectiveness.

Figure 3.12: Runtime overhead for bodytrack

Figure 3.13: Recall for bodytrack

Lastly, Figure 3.11 presents the cost-effectiveness of TxRace compared to TSan with sampling across the nine applications in which TxRace/TSan detect at least one data race. For some applications such as fluidanimate, vips, and facesim, 10% sampling turns out to be more efficient than the 50% or 100% sampling cases. It turns out that the data races in such applications manifest often at runtime, so they get detected even at the low sampling frequency. In other words, the number of dynamic instances of the data race is quite high even though the number of static instances (unique race conditions) is small (e.g., one for fluidanimate), thus reaching a recall of almost 1 (no false negatives) at the low frequency. After all, for almost all applications except x264, TxRace outperforms TSan with sampling in terms of the cost-effectiveness.

3.9 Summary

The spread of shared-memory multiprocessor architectures has spurred the development of multithreaded programs. However, such programs are subject to concurrency bugs including data races. Unfortunately, traditional dynamic data race detectors are too slow to use in many cases. In this chapter, we address the run-time overhead of dynamic data race detectors (PS1). We present TxRace, a new software dynamic data race detector that exploits commodity hardware transactional memory support to enable dynamic data race detection with a low run-time overhead (TS1). Leveraging existing HTM support allows TxRace to use a precise but expensive dynamic data race detector only for a small fraction of the whole execution in an on-demand manner, leading to performance improvement. The experimental results show that TxRace reduces the run-time overhead of dynamic data race detection by 60% on average with only a few false negatives (a high recall of 0.95). TxRace is published in [230].

Chapter 4

Practical Data Race Detection for Production Use

In the previous chapter, we presented a dynamic data race detector leveraging commodity hardware transactional memory, which has a 4.65x run-time overhead. However, due to the nondeterministic behavior of data races, it is often hard to find them all during testing, and they often manifest in a production environment. Therefore, there is interest in deploying dynamic data race detectors in production. In this chapter, we present ProRace, a dynamic data race detector practical for production runs. It is lightweight, but still offers high race detection capability. To track memory accesses, ProRace leverages instruction sampling using the performance monitoring unit (PMU) in commodity processors. Our PMU driver enables ProRace to sample more memory accesses at a lower cost compared to the state-of-the-art Linux driver. Moreover, ProRace uses PMU-provided execution contexts, including register states and the program path, and reconstructs unsampled memory accesses offline. This technique allows ProRace to overcome inherent limitations of sampling and improve the detection coverage by performing data race detection on a trace with not only sampled but also reconstructed memory accesses. Experiments using racy production software including apache and mysql show that, with a reasonable offline cost, ProRace incurs only 2.6% overhead at runtime with 27.5% detection probability at a sampling period of 10,000.


4.1 Introduction

Despite extensive in-house testing, races often exist in deployed software and manifest in customer usage [111, 180, 184]. These test escapes occur because data races are highly sensitive to thread interleavings, program inputs, and other execution environments that testing cannot completely cover [43]. For the same reasons, data races are notoriously difficult to reproduce and fix after being observed in a production run. Consequently, there is an urgent need for a lightweight data race detector that can monitor production runs.

In production settings, it makes sense to trade off soundness (i.e., potentially missing data races) for performance. Sampling [47, 78, 135, 184] has been proposed as a promising technique to address the problem. However, LiteRace [135] and Pacer [47] still incur unaffordable slowdowns for some applications (e.g., Pacer [47] adds 86% overhead at a 3% sampling ratio) due to their code-instrumentation-based runtime checks. Though DataCollider [78] uses hardware breakpoint support instead, its detection coverage is limited to sampled accesses only. RaceZ [184] pioneered the use of the hardware performance monitoring unit (PMU) to sample memory accesses, but it has to keep the sampling frequency low for performance, thereby compromising the detection coverage.

LiteRace [135] and Pacer [47] pioneered the use of sampling for reducing the overhead of dynamic data race detection. LiteRace focuses on sampling more accesses in infrequently-exercised code regions, based on the heuristic that, for a well-tested application, data races are likely to occur in such cold regions. On the other hand, Pacer uses random sampling, and thus its coverage is approximately proportional to the sampling rate used. However, these code-instrumentation-based race detectors cause unaffordable slowdowns for some applications, and their detection coverage is limited to the sampled accesses only. For example, though LiteRace shows a low 2-4% overhead for Apache, it makes CPU-intensive applications 2.1-2.4x slower, and incurs a 1.47x slowdown on average for their tested applications. Similarly, Pacer reports an average of 1.86x overhead at the 3% sampling frequency.

DataCollider [78] and RaceZ [184] avoid code instrumentation and thus incur a very low overhead, but suffer from low detection coverage. DataCollider [78] makes use of hardware debug breakpoints. After sampling a code/memory location, it sets a data breakpoint and inserts a time delay. A trap during this delay indicates a conflicting access from another thread. Though longer timing delays increase the likelihood of overlapping data races, they also increase the overhead. In addition, hardware restrictions limit the number of concurrently monitored memory locations to four on the latest x86 hardware [101].

RaceZ leverages Intel’s PEBS to sample memory accesses. However, due to its reliance on the inefficient Linux PEBS driver, RaceZ has to use a low sampling frequency for performance, thereby compromising the detection coverage. RaceZ also attempts to reconstruct unsampled memory accesses, but its scope is limited to a single basic block. This work shows that ProRace has much less overhead, but detects significantly more data races compared to RaceZ.

Another line of work takes a hybrid static-dynamic approach. RaceMob [111], a recent low-overhead solution, employs static analysis [204] to compute potential data races, and crowdsources runtime race checks across thousands of users. To limit the overhead each user may experience, RaceMob requires a large number of runs to distribute the checks, and the number of runs required depends on the precision of the static analysis. Elmas et al. [76] and Choi et al. [58] are other examples that make use of static data race analysis to reduce runtime cost. In spite of its benefits, static analysis often suffers from precision and scalability issues for large-scale applications, and the recompilation requirement is often not a viable option in production settings.

We design ProRace (Chapter 4), a new practical sampling-based data race detector for production runs. It is lightweight, minimally affecting the application execution; transparent, requiring neither recompilation nor static analysis; and effective, ensuring high race detection coverage.

ProRace consists of online program tracing and offline trace-based data race analysis. Though offline analysis is required, the principal advantage of this design is that the very low runtime overhead of the online part enables ProRace to monitor real-time, interactive, or networked applications at nearly full speed.

ProRace makes use of the hardware PMU in commodity processors to monitor an unmodified program at a very low overhead. Specifically, ProRace samples memory accesses using Intel's Precise Event Based Sampling (PEBS) [99]. Our newly designed PEBS driver avoids unnecessary kernel-to-user copying and sampled data processing, reducing overhead by more than half compared to the latest Linux PEBS driver. This allows ProRace to take many more samples for a given performance budget, enhancing its detection coverage.

During the offline phase, ProRace reconstructs unsampled memory accesses to overcome the inherent limitation of sampling and to further increase data race detection coverage. The key idea is to replay the program from each sample and reconstruct the addresses of other memory instructions. With each sample, PEBS provides not only the sampled instruction but also its architectural execution context (e.g., register states) at sample time. ProRace re-executes the program binary starting from each sampled instruction with the register states, and re-calculates the addresses of unsampled memory operations while emulating register and memory states.
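The reconstruction step can be illustrated with a toy emulator; the mini instruction set and register names are hypothetical stand-ins, whereas the real system decodes and re-executes x86 instructions.

```python
# Sketch of offline address reconstruction: starting from a PEBS sample that
# carries register values, emulate a straight-line slice of instructions and
# recompute the effective addresses of the unsampled loads/stores in it.

def reconstruct(regs, slice_):
    regs = dict(regs)               # PEBS-provided register snapshot
    addresses = []
    for op, *args in slice_:
        if op == "add":             # add rd, imm
            rd, imm = args
            regs[rd] += imm
        elif op == "mov":           # mov rd, rs
            rd, rs = args
            regs[rd] = regs[rs]
        elif op in ("load", "store"):   # access [base + offset]
            base, off = args
            addresses.append(regs[base] + off)   # recovered target address
    return addresses

# A sample taken at some instruction with rax=0x7f0010, rbx=0x400000:
snapshot = {"rax": 0x7F0010, "rbx": 0x400000}
slice_ = [("load", "rax", 0),       # the sampled access itself
          ("add", "rax", 8),
          ("load", "rax", 0),       # unsampled, but now recoverable
          ("mov", "rcx", "rbx"),
          ("store", "rcx", 16)]     # unsampled store, recovered too
print([hex(a) for a in reconstruct(snapshot, slice_)])
# ['0x7f0010', '0x7f0018', '0x400010']
```

Which slice of instructions to feed the emulator is exactly what the PT control-flow trace determines, as described next.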

Furthermore, to recover more memory accesses around each sample, ProRace collects the complete control-flow trace at runtime using Intel's Processor Trace (PT) [100], a new feature in the Intel processor's PMU. The control-flow information guides which path to take during the offline replay, enabling ProRace to reproduce many other unsampled memory operations preceding and following each sample along the observed program path.

Finally, ProRace analyzes the recovered memory trace and the synchronization trace to detect data races using the happens-before based race detection algorithm [79].

This work makes the following contributions:

• ProRace presents a lightweight, transparent, and effective data race detector that can be easily deployed to monitor production runs.

• ProRace proposes a new methodology to reconstruct unsampled memory addresses using the control-flow trace collected at runtime. To the best of our knowledge, ProRace is the first software scheme that demonstrates how commodity hardware support for control-flow tracing can be used to enable the forward and backward reconstruction of unsampled memory trace. The proposed solution can benefit future research on runtime monitoring beyond race detection.

• This work describes a PEBS driver that is far more efficient than the state-of-the-art Linux PEBS driver.

• The experiments using production software including apache and mysql show that ProRace can detect significantly more races than RaceZ, a PEBS-based race detector, at a much lower overhead.

4.2 Overview

The goal of ProRace is to provide lightweight yet effective race detection for practical use in a production environment. We envision a production environment similar to Google's or Facebook's real-world datacenters, in which various traces of production applications are already collected for monitoring purposes, and dedicated analysis machines exist in the datacenter to process the collected traces [110, 168]. In such an environment, runtime monitoring overhead is a much more critical concern than the size of the trace and the offline analysis overhead. Production and analysis machines share a separate network, and thus writing a trace has a minimal impact on the QoS of production applications that use another network. Analysis machines can periodically process the trace and delete the parts analyzed in prior periods.


Figure 4.1: Overview of the ProRace Architecture.

Figure 4.1 shows an overview of ProRace's two-phase architecture: online program tracing and offline data race detection. The online stage leverages the hardware PMU to trace a program execution at low overhead. Specifically, PEBS is used to collect the sampled memory access trace. PEBS provides both the sampled instruction and the architectural execution context (e.g., register states) at the sample time. PT is used to obtain the complete control-flow trace. The online stage also tracks the synchronization operations for later use in data race detection.

The offline stage first combines the memory access and control flow traces into a time-synchronized trace. Next, it reconstructs unsampled memory operations. This is the critical step that allows ProRace to achieve higher detection coverage than other sampling-based approaches. Using the sampled instruction, register states, and control-flow information, ProRace replays the program and recomputes the addresses of unsampled memory accesses around each sample. The unsampled memory instructions whose target addresses can be reconstructed during this step are included in an extended memory access trace. Combining this with the synchronization trace, ProRace performs happens-before based data race detection using the FastTrack [79] algorithm.
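The happens-before check at the core of this offline phase can be illustrated with plain vector clocks. The sketch below is deliberately simplified: it is not the epoch-optimized FastTrack algorithm that ProRace actually uses, and the class and method names are hypothetical.

```python
# Minimal vector-clock happens-before race check on write/write conflicts.
# Illustrative only -- not FastTrack [79] and not ProRace's implementation.

def vc_join(a, b):
    return [max(x, y) for x, y in zip(a, b)]

class HBDetector:
    def __init__(self, nthreads):
        # each thread's own component starts at 1 so its events are "fresh"
        self.clock = [[1 if i == t else 0 for i in range(nthreads)]
                      for t in range(nthreads)]
        self.locks = {}       # lock address -> vector clock at last release
        self.last_write = {}  # variable address -> (tid, clock at write)

    def acquire(self, tid, lock):
        if lock in self.locks:
            self.clock[tid] = vc_join(self.clock[tid], self.locks[lock])

    def release(self, tid, lock):
        self.locks[lock] = list(self.clock[tid])
        self.clock[tid][tid] += 1

    def write(self, tid, addr):
        race = False
        if addr in self.last_write:
            wtid, wvc = self.last_write[addr]
            # racy unless the previous write happens-before this access
            race = wtid != tid and not (wvc[wtid] <= self.clock[tid][wtid])
        self.last_write[addr] = (tid, list(self.clock[tid]))
        return race
```

Two writes to the same address from different threads are reported only when no release/acquire chain orders them, which is why the online stage must log the synchronization operations alongside the memory samples.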

ProRace improves existing PMU-sampling-based data race detection in three ways. First, ProRace presents a PEBS driver much more efficient than the latest Linux PEBS driver. The improved design allows ProRace to take more samples for a given performance budget, enhancing its race detection coverage. Second, ProRace recovers unsampled memory accesses. ProRace re-executes the program binary starting from each sampled instruction with the PEBS-provided register states, reconstructing the unsampled memory accesses while emulating register and memory states. Third, ProRace uses the PT-collected control-flow trace to choose which path to take during the offline binary re-execution. This permits ProRace to recover many other unsampled memory operations around each sample along the observed program path.

4.3 Lightweight Program Tracing

This section presents how ProRace traces a program execution at low overhead. At runtime, ProRace collects three types of traces: memory access samples, control flows, and synchronization operations.


Figure 4.2: The vanilla Linux PEBS driver

Figure 4.3: The ProRace PEBS driver

4.3.1 PEBS-based Memory Access Sampling

ProRace samples memory accesses using PEBS [99]. PEBS users can specify the types of events to monitor, such as retired memory instructions and taken branches, as well as whether to sample user-level or both user- and kernel-level events. ProRace tracks only user-level retired load and store instruction events because it is interested in application memory accesses for data race detection.

PEBS enables users to set a sampling period k for each monitored event type. After every k events of a given type, PEBS delivers the sampled event, along with its architectural execution context at the sample time (such as register values and the time stamp counter (TSC)¹, but not memory states), to the corresponding listener.

Care must be taken when choosing the sampling period. Small values of k yield more samples but higher performance overhead. In addition, samples may be dropped if the kernel finds that too much time has been spent on interrupt handling.

¹In old Intel processors, the PEBS samples did not include the time stamp, and the OS interrupt handler logged its wall-clock time during the processing. As a result, there was a small timing gap between the actual hardware sample time and the interrupt handler logging time. However, this is no longer an issue in recent processors such as Skylake and Broadwell.

The Current Linux PEBS Driver

While the previous version of the Linux PEBS driver delivered every event using an overflow interrupt, support for a mechanism called Debug Store (DS) was added in Linux kernel 4.2 to reduce the interrupt frequency. Figure 4.2 illustrates the interactions between the hardware PEBS, the OS interrupt handler, and the user-level perf tool.

DS permits PEBS to automatically store samples in a kernel-space buffer referred to as the DS save area, whose default size is 64 KB (step 1). An interrupt is delivered only when the DS buffer is nearly full, reducing the frequency of interrupts.

On each interrupt, the OS interrupt handler processes the raw 'PEBS events', adding additional information such as wall-clock time, sample size, and sample period (step 2), yielding 'perf events'. It then copies the perf events into another buffer, a ring buffer shared with the user-land perf tool, and resets the DS save area for further PEBS events (step 3).

Finally, the perf tool polling on the ring buffer commits the perf events to a file (step 4). Since the user-land perf tool may be configured to monitor incoming data from different cores and store them in the same file, the events in the file may not be ordered sequentially. Thus, it re-reads the entire file before its exit to sort all events and include other information.

Though DS support reduces the runtime overhead of using PEBS compared to the naive interrupt-per-sample mechanism, our experimental results show that a sampling period more frequent than 10K-100K still incurs slowdowns approaching 10%.

ProRace’s New PEBS Driver

ProRace presents a new PEBS driver that significantly lowers the performance overhead in using PEBS. The new design makes it possible to collect more samples for the same performance cost. Figure 4.3 shows our new design incorporating the following changes:

First, ProRace eliminates expensive kernel-to-user copying by maintaining a single ring buffer named aux-buffer. The ring buffer is partitioned into multiple 64 KB segments. Initially, ProRace provides PEBS with one segment of the ring buffer; when PEBS finds it full and raises an interrupt, the OS interrupt handler simply proffers the aux-buffer’s next available segment. The user-level perf tool eventually comes into play, dumping the segment filled with records into the file and making it available for further tracing. In this design, the interrupt handler need only swap the segment locations for PEBS similar to conventional double-buffering. The Linux driver for (newer) Intel’s PT incorporates a similar single buffer design, but it is not used in the PEBS driver.
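The segment handoff can be modeled as follows. This is a toy Python simulation of the double-buffering idea, assuming a fixed number of equally sized segments; all names are illustrative, and the real driver of course operates on raw kernel memory and hardware interrupts, not Python lists.

```python
# Toy model of a single ring buffer split into segments: the "hardware"
# fills one segment; on overflow the "interrupt handler" only swaps in
# the next free segment, and the user-level "perf" drains full ones.
from collections import deque

class AuxBuffer:
    def __init__(self, nsegments, segsize):
        self.free = deque(range(nsegments))   # indexes of empty segments
        self.full = deque()                   # segments waiting for perf
        self.segs = [[] for _ in range(nsegments)]
        self.segsize = segsize
        self.active = self.free.popleft()     # segment given to PEBS

    def record(self, sample):                 # PEBS stores a record
        self.segs[self.active].append(sample)
        if len(self.segs[self.active]) == self.segsize:
            self.overflow_interrupt()

    def overflow_interrupt(self):             # cheap: just swap segments
        self.full.append(self.active)
        self.active = self.free.popleft()

    def drain(self):                          # user-level perf tool
        out = []
        while self.full:
            i = self.full.popleft()
            out.extend(self.segs[i])
            self.segs[i] = []
            self.free.append(i)               # recycle for further tracing
        return out
```

The point of the design is visible in `overflow_interrupt`: the in-kernel work per interrupt is a constant-time pointer swap, while all copying to the file is pushed out to the user-level drain.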

Second, ProRace skips data processing irrelevant to data race detection during PEBS sample handling. Specifically, ProRace does not add the metadata information (step 2 in Figure 4.2).

Lastly, given a sampling period P, the sampling period is initially set to a random value between one and P; after the first event, it is changed to P. This enables ProRace to start sampling at a random location per thread on each run, increasing its sampling diversity and ultimately improving its race detection capability.
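The randomized start can be sketched as follows, assuming per-thread event counting; `sample_points` is a hypothetical helper for illustration, not part of the driver.

```python
# Sketch of the per-thread randomized first sampling period: the first
# sample fires after a random 1..P events, then every P events thereafter.
import random

def sample_points(period, nevents, rng):
    first = rng.randint(1, period)   # random start per thread and run
    points, nxt = [], first
    while nxt <= nevents:
        points.append(nxt)
        nxt += period                # back to the fixed period P
    return points
```

Across runs, the random offset shifts which instructions fall on sample boundaries, so repeated runs cover different parts of the execution even at the same period P.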

Experimental results in Section 4.6.2 show that the new driver reduces runtime overhead significantly, making it possible for applications to use a small sampling period.

4.3.2 PT-based Control-flow Tracing

ProRace uses Intel’s PT [100] to collect program control flows. PT is an extension to the PMU architecture for Intel’s Broadwell and Skylake processors. At runtime, PT records the executed control-flow of the program in a highly-compressed format. Unlike event- based PEBS, PT keeps track of complete control-flow information including (indirect) branch target and call/return information without loss of precision. Nonetheless, PT incurs only a very small overhead because the tracking is done off the critical path and by hardware. This is significant improvement over previous (relatively) high overhead and limited tracking features such as Branch Trace Store (BTS) and Last Branch Record (LBR) in old processors.

ProRace’s PT driver also implements the code-region based control-flow tracing feature. The PT hardware allows users to specify four memory regions of interest from which to collect the program control-flow. ProRace is configured to monitor only main executable memory regions because of its interests in detecting application data races (assuming no Just-In-Time compilation). Depending on use cases, dynamic library code regions may be included, or static library code regions may be excluded, by examining the symbol table.

The memory access trace collected by PEBS and the control flow trace collected by PT can be easily combined for offline processing because both traces include the per-core TSC value.
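Assuming each trace is already time-ordered per source, combining them is a standard merge by timestamp; a minimal sketch (the function name is illustrative):

```python
# Sketch: merging the PEBS sample stream and PT control-flow packets into
# one time-ordered trace by their per-core TSC values.
import heapq

def merge_by_tsc(pebs, pt):
    # each trace is a list of (tsc, payload), already sorted per source
    return list(heapq.merge(pebs, pt, key=lambda rec: rec[0]))
```

Because both hardware units stamp records with the same invariant TSC, no extra clock synchronization step is needed before the merge.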

4.3.3 Synchronization Tracing

ProRace uses happens-before based data race detection [79] for precision (no false positives), but offloads the expensive vector-clock processing to the offline phase. At runtime, ProRace collects per-thread synchronization logs, each with its type (e.g., lock/unlock), variable (e.g., lock variable address), and TSC value. The per-thread logs can be easily synchronized offline because recent processors support an invariant TSC [86] that is synchronized among cores and runs at a constant rate.

Figure 4.4: Forward and Backward Replays.

For transparency, ProRace uses LD_PRELOAD to redirect standard pthread functions to ProRace-instrumented functions. In addition, ProRace tracks dynamic memory allocation/deallocation. Suppose that one object is freed, and another object happens to be allocated at the same memory location. There can be no race condition between two different objects, but a data race detector may falsely report one because their memory addresses are the same. To avoid this kind of false positive, many data race detection tools keep track of malloc and free, and so does ProRace.
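One way to realize such malloc/free tracking is to key race detection on an (address, allocation generation) pair instead of the raw address. The sketch below is an assumption about one possible bookkeeping scheme, not ProRace's actual data structure:

```python
# Sketch: tagging each allocation with a generation number so that two
# accesses to the same address from different malloc lifetimes are never
# compared as accesses to the "same" object.
class AllocTracker:
    def __init__(self):
        self.gen = {}      # base address -> current generation
        self.counter = 0

    def on_malloc(self, addr):
        self.counter += 1
        self.gen[addr] = self.counter

    def on_free(self, addr):
        self.gen.pop(addr, None)

    def key(self, addr):
        # race detection keys on (address, generation), not address alone
        return (addr, self.gen.get(addr, 0))
```

After a free/malloc pair recycles an address, `key` yields a different tuple, so accesses to the old and new objects can never be paired into a false race report.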

4.4 Recovering Unsampled Memory Accesses

ProRace leverages PMU-based instruction sampling to collect memory accesses. As with all sampling-based race detectors, it might end up with false negatives due to unsampled memory accesses. To overcome this inherent limitation of sampling, ProRace reconstructs unsampled memory accesses offline by re-executing the program binary around each PEBS-sampled instruction with forward replay (Section 4.4.1) and backward replay (Section 4.4.2). In addition, ProRace leverages the full control-flow information recorded by PT to guide which path to execute during both replays.

For each PEBS sample, ProRace alternates forward and backward replays following the observed program path, as shown in Figure 4.4. Basically, the forward replay corresponds to the re-execution of the unsampled instructions between the current and the next samples, while the backward replay corresponds to the re-execution of those preceding the current sample, dealing with the instructions missed by the forward replay. ProRace repeats the replays until there are no more PEBS samples to be processed. The rest of this section details the path-guided binary re-execution and how it can reconstruct unsampled memory accesses.

4.4.1 Forward Replay

When an event is sampled, PEBS not only offers the precise instruction location of the event, but also provides the architectural state, such as the entire register file contents, at the sample time. By leveraging these execution contexts as inputs, ProRace re-executes the program binary from each PEBS sample point over the program path, reconstructing the addresses of the memory operations. Such path-guided binary re-execution is called forward replay.

For each PEBS-sampled instruction, ProRace restores the register file contents and attempts to execute every following instruction over the program path until the next PEBS sample point is reached. For each instruction being executed, ProRace checks whether its operands are available at the time of the instruction's execution. For this purpose, ProRace keeps track of the architectural state by bookkeeping all the register and memory values in a special hash table called the program map.

ProRace simply treats every memory location as unavailable initially. The destination register of a load instruction becomes unavailable when it reads from an unavailable memory location. If any operand of an instruction being replayed is unavailable, ProRace simply skips the instruction, setting all its outputs as unavailable. Otherwise,

ProRace executes the instruction, updating the resultant architectural state, such as registers and memory locations, in the program map. Note that the memory emulation requires special care for correctness, and thus it is used in a limited fashion. By default, when any available register is written to a certain memory location, ProRace records the value in the program map for later access during the replay and treats the location as available. However, when ProRace hits a system call or an unavailable instruction, it conservatively invalidates the emulated memory states. Moreover, the memory emulation might lead to incorrect memory address reconstruction after a racy access (i.e., a conflicting write) from another thread. To address this problem, when a race is detected on an emulated memory location in a later phase, ProRace invalidates the memory location and regenerates the trace from that racy point (i.e., the conflicting read) with the register value marked unavailable. Thus, ProRace is safe, as it never uses a racy memory location during the trace regeneration.

As the forward replay progresses further, more registers become unavailable due to load instructions reading from unavailable memory locations. Thus, at some point, ProRace may end up in a situation where no register is available. One might think that the forward replay cannot proceed anymore because no more instructions can be executed due to the lack of available operands. However, continuing the replay even across the point where all registers become unavailable can capture some unsampled memory accesses that would otherwise be impossible to reconstruct. For example, if memory instructions use PC-relative addressing, e.g., mov offset(%rip) in x86-64, ProRace can figure out the memory location by adding the offset to %rip, which is always available as the instruction pointer (PC). By taking advantage of the full control-flow trace recorded by PT, ProRace performs the forward replay across basic block boundaries until it reaches the very next PEBS-sampled instruction.
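The availability tracking described above can be sketched over a toy two-operation instruction model. This is an illustrative model only (the real engine replays actual x86-64 binaries through PIN); the instruction encoding and names are assumptions made for the sketch.

```python
# Sketch of forward replay: registers start from the PEBS snapshot,
# memory starts unavailable, and an address is reconstructed only when
# the operands it depends on are available.
UNAVAIL = None

def forward_replay(insns, regs):
    """insns: list of (op, reg, base, offset); op is 'load' or 'store'.
    load:  reg = [base + offset]    store: [base + offset] = reg"""
    mem = {}          # emulated memory: address -> value (the program map)
    recovered = []    # indexes of instructions whose address was rebuilt
    for i, (op, reg, base, off) in enumerate(insns):
        baseval = regs.get(base)
        if baseval is UNAVAIL:
            if op == "load":
                regs[reg] = UNAVAIL    # poison the destination register
            continue
        addr = baseval + off
        recovered.append(i)
        if op == "load":
            regs[reg] = mem.get(addr, UNAVAIL)
        else:
            mem[addr] = regs.get(reg)  # address known even if value isn't
    return recovered
```

A %rip-based access would always be reconstructible in this model, since the PC is known at every step from the PT path even when all general-purpose registers have been poisoned.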

Figure 4.5 shows how ProRace reconstructs many unsampled memory accesses using forward replay, with a real-world example extracted from Apache.

0:  mov %rax,0x18(%rsp)
1:  movslq 0x0(%rbp,%rbx,4),%rdx
2:  mov (%r15,%rbx,8),%rsi
3:  mov 0x8(%rsi),%rax
4:  mov %r10,%rdi
5:  mov 0x8(%r14),%rax
6:  add %rax,%r13
7:  xor %eax,%eax
8:  mov %r13,0x8(%r14)
9:  mov 0x18(%rsp),%rcx
10: mov (%r15,%r12,8),%rsi

Figure 4.5: Example for Forward and Backward Replay

Suppose ProRace sampled the mov at line 0 and recorded the register state at the sample time. After restoring all the register values, ProRace performs the forward replay for the following instructions. Here, the forward replay can successfully reconstruct the memory addresses of the instructions at lines 1, 2, 5, 8, 9, and 10, since the registers used for their address calculation are all available.

However, the memory address of the instruction at line 3, i.e., mov 0x8(%rsi),%rax, cannot be reconstructed because %rsi is loaded from an unavailable memory location by the instruction at line 2, i.e., mov (%r15,%rbx,8),%rsi. To solve this problem, ProRace performs the backward replay right after the forward replay.

4.4.2 Backward Replay

Forward replay cannot reconstruct the address of a memory operation if the register operand of the memory instruction is unavailable and the address is not obtained by PC-relative addressing. This motivates ProRace to leverage two forms of backward replay to reconstruct the memory addresses skipped by the forward replay: backward propagation and reverse execution.

Backward Propagation

The key observation is that many unavailable registers can be recovered by consulting the next PEBS-provided execution context, where all register values are available. More precisely, the backward replay can reconstruct a memory access whose register operand became unavailable during the forward replay, provided the register has not been updated before the next PEBS-sampled instruction. Fortunately, according to empirical results, the registers used for memory address calculation often have a long live range [140] after they become unavailable during the forward replay.

In light of this, ProRace back-propagates all the register values restored at the very next PEBS sample to the instructions where each register was most recently updated. For this purpose, the forward replay marks such instructions, checkpointing the register file at the time each register is updated. In addition, the forward replay keeps track of the youngest of these instructions as an entry point for the later backward replay. Once all the register back-propagation is performed, ProRace simply jumps to the youngest instruction and resumes the re-execution there. In a sense, the backward replay can be considered yet another forward replay starting from a different location, i.e., the youngest instruction rather than the current PEBS-sampled instruction.
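One simplified reading of this bookkeeping, with hypothetical names: during forward replay, record where each register was last written; at the next sample, propagate the sampled values back to those sites and resume where all of them are simultaneously valid.

```python
# Sketch of backward propagation. last_update maps a register to the index
# of the instruction that last wrote it (recorded during forward replay);
# next_sample_regs is the register file from the next PEBS sample.
def backward_propagate(last_update, next_sample_regs):
    restored = {r: next_sample_regs[r] for r in last_update
                if r in next_sample_regs}
    # every restored value is simultaneously valid only after the youngest
    # (latest) update site, so re-execution resumes from there
    resume_at = max(last_update[r] for r in restored) if restored else None
    return restored, resume_at
```

In the Figure 4.5 example, %rsi is last written at line 2, so the %rsi value captured at the line-10 sample propagates back to line 2, which is what lets the address at line 3 be rebuilt.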

Figure 4.5 also shows how the backward replay reconstructs an unsampled memory access that the forward replay cannot deal with. Suppose ProRace sampled the instruction at line 10. This allows ProRace's backward analysis to restore the value of %rsi, which the forward replay could not. In this way, ProRace can successfully reconstruct the memory address of the instruction at line 3 using the restored register.

Reverse Execution

The second type of ProRace's backward replay is based on reverse execution [52, 63, 125]. In its simplest form, such as a register-to-register copy, reverse execution can restore either register value based on their equality as long as at least one of them is known. It is also possible to restore a register used as an operand of an arithmetic instruction, provided the other operand (register) is known during the backward replay. For example, reverse execution can restore the %rdx operand of an instruction (%rax = %rdx + $offset), if the other operand (%rax) is already available, by subtracting the $offset from %rax. ProRace's backward replay engine currently supports reverse execution of integer arithmetic instructions such as additions and subtractions.
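The arithmetic inversion itself is straightforward; a minimal sketch (hypothetical helper, addition and subtraction only, ignoring integer wraparound):

```python
# Sketch of reverse execution for integer arithmetic: if the result and
# one source operand of an add/sub are known, the missing operand can be
# recovered by inverting the operation.
def reverse_arith(op, result, known_operand):
    """Recover the unknown source of `dst = src + k` (op='add') or
    `dst = src - k` (op='sub'), given dst and the known operand k."""
    if op == "add":
        return result - known_operand
    if op == "sub":
        return result + known_operand
    raise ValueError("not reversible: " + op)
```

Destructive operations with no inverse (e.g., xor %eax,%eax at line 7 of Figure 4.5) cannot be reversed this way, which is why reverse execution is combined with backward propagation rather than used alone.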

Note that once an unavailable register is restored by reverse execution, ProRace can restore others that have a dependence on that register. As PT provides the program path, ProRace only needs to track the data dependencies, triggering forward and backward replays iteratively until they reach the fixed point [140] where no further restoration is found. This simple yet effective technique allows the backward replay to go further backward, possibly reconstructing more unsampled memory accesses.

4.5 Implementation

The online tools for ProRace consist of two parts: the kernel-level PMU drivers and the user-land perf tool. The new PEBS driver is implemented based on Linux kernel version 4.5.0. The four PT hardware filters are used to collect branch traces only from the regions of interest.

The offline tool is comprised of four parts: 1) the dynamic standard C library (glibc version 2.21) to intercept synchronization and memory allocation operations; 2) the modified perf tool to decode raw PT data; 3) the forward-and-backward replay engine that reconstructs memory traces, implemented using Intel's PIN [132] dynamic binary instrumentation tool; and 4) the FastTrack-based data race detector.

              Threads  Workload
apache         14      ApacheBench. 100K requests, 8 clients, 128KB file size
cherokee       38      ApacheBench. 100K requests, 8 clients, 128KB file size
mysql          20      SysBench. 10K requests, 32 clients, 10 million records
memcached       5      YCSB. 200K requests, all ABCDE workloads
transmission    4      1.48GB file
pfscan          4      6.8GB file
pbzip2          4      1GB file
aget            4      2.1GB file

Table 4.1: Evaluation Setup

The PMU drivers and perf tool include 4579 lines of C and assembly code. The offline tools contain 7024 lines of C/C++ code, 793 lines of code, 105 lines of Python code, and 623 lines of bash code.

4.6 Evaluation

This section evaluates ProRace’s runtime overhead, trace size, data race detection effective- ness, memory reconstruction ratio, and offline analysis overhead.

4.6.1 Methodology

We ran experiments on a 4.0GHz quad-core Intel Core™ i7-6700K (Skylake) processor with 16GB of RAM, running Gentoo Linux with kernel 4.5.0. ProRace was evaluated using (1) the PARSEC benchmark suite and (2) seven real-world applications: the apache web server, mysql database server, cherokee web server, pbzip2 parallel compressor, pfscan parallel file scanner, transmission BitTorrent client, and aget parallel web downloader. We use the simlarge input for all applications in the PARSEC suite and set the thread count to four (equal to the number of cores). The evaluation setup for the real-world applications is listed in Table 4.1. All network and database applications were tested over a gigabit local area network.

Figure 4.6: Performance overhead for PARSEC benchmarks

Figure 4.7: Performance overhead for real applications

For data race detection analysis, ProRace was evaluated using 12 data race examples in real-world applications from a previous study [226]. The 12 cases include three data races in apache, three in mysql, two in cherokee, two in pbzip2, one in pfscan, and one in aget. Some other cases in [226] are excluded because they do not include a data race or are not well documented.


Figure 4.8: Space overhead for PARSEC benchmarks


Figure 4.9: Space overhead for real applications

4.6.2 Performance Overhead

Figure 4.6 shows the performance overhead of ProRace for PARSEC benchmarks, with the PEBS sampling period varying from 10 to 100K. As expected, a small sampling period results in more samples, leading to high overhead. The geometric mean of the performance overhead over all 13 applications in the PARSEC suite goes up from 4%, 7%, 31%, 2.85x, to 7.52x for the decreasing sampling periods of 100K, 10K, 1K, 100, and 10, respectively. Four applications (bodytrack, canneal, dedup, and streamcluster) incur a small 5-9% runtime overhead for the sampling period of 1K. Setting the sampling period to 10K keeps the overhead of 12 of the 13 applications below 10%. A user of ProRace can perform a similar sensitivity analysis to find the lowest sampling period given a performance overhead budget. Assuming a 10% budget, our experiment shows that the sampling period should be set between 1K and 10K for such CPU-intensive applications.


Figure 4.10: Performance overhead comparison

Figure 4.7 shows the performance overhead of ProRace for real-world applications, with the PEBS sampling period varying from 10 to 100K. Some applications, including mysql, transmission, pfscan, and pbzip2, show a similar trend of high overhead for a small sampling period. However, the other applications show negligible (<1%) overhead even with the very small sampling period of 10. The applications in this second category are network-I/O-dominant applications (with not much file I/O), so the runtime overhead of ProRace can be hidden by network I/O. However, ProRace cannot hide its overhead behind file I/O well because it has to perform many writes to a file during tracing. On geometric average, the runtime overhead goes up from 0.8%, 2.6%, 8%, 34%, to 80% for the decreasing sampling periods of 100K, 10K, 1K, 100, and 10, respectively. Assuming a 10% budget, our experiment shows that the sampling period may be set smaller than 1K (even 10) for real (I/O-bound) applications.

The next study evaluates the efficiency of ProRace's new PEBS driver over the vanilla Linux driver. Figure 4.10 shows a side-by-side runtime overhead comparison for each sampling period from 10 to 100K. For clarity, the figure only presents the geometric means for PARSEC and the real applications, respectively. As can be seen in the figure, ProRace's new driver outperforms the vanilla Linux driver. For the extreme case of a period of 10,

the vanilla driver incurs a 50x slowdown, while ProRace shows a 7.5x slowdown for the PARSEC benchmarks. As another data point, with a relatively large period of 100K, the vanilla driver incurs a 20% slowdown, whereas ProRace reports only 4% for the same benchmarks.

The overhead of RaceZ can be estimated to be about the same as the vanilla driver's because RaceZ depends on the stock Linux driver. RaceZ indeed reports similar performance figures: 2.8% for the sampling period of 200K and 30% for 20K. The experimental results show that ProRace has much lower overhead than RaceZ. For example, with the period of 1K, RaceZ results in a 3.4x slowdown, whereas ProRace only incurs 31% overhead for the PARSEC suite.

As the last experiment for performance evaluation, we study a breakdown of the runtime overhead among PEBS overhead, PT overhead, and synchronization tracing overhead. We find that the PT overhead is very small, contributing only a 3% slowdown at most. The result shows the benefits of PT's hardware support for trace compression and memory-range based filtering. Similarly, the synchronization tracing overhead also has a very small impact on performance (<1%). Finally, the PEBS overhead dominates the overall ProRace performance, ranging from 97% to 99%. The result makes sense because PEBS events (memory operations) are much more frequent than PT records (branches), and PEBS events require collecting rich information such as register states.

4.6.3 Trace Size

ProRace uses PEBS and PT to collect memory access samples and control-flow information at runtime. Figure 4.8 shows the trace size generated per second during program execution of the PARSEC benchmarks, with the PEBS sampling period varying from 10 to 100K. The PT trace size remains constant across different PEBS configurations, and its size is measured before decompression. As PT records are highly compressed by hardware, the PEBS trace dominates the overall trace size (∼99%). As expected, a small sampling period results in more samples, leading to a larger trace. Note that the y-axis is logarithmic. On geometric average, the trace size per second (in MB/s) goes up from 26, 69, 321, 597, to 463 for the decreasing sampling periods of 100K, 10K, 1K, 100, and 10, respectively. One outlier is that the trace size for the sampling period of 10 turns out to be less than that of 100 (though it incurs higher overhead, as shown in the above experiment). Further investigation shows that with a very small sampling period, though the hardware may sample more, these samples may be dropped if the kernel finds that too much time has been spent on interrupt handling. This implies that there is no benefit to setting the sampling period smaller than a certain (application-specific) threshold.

                                                           RaceZ                         ProRace
Bug                  Manifestation        Access Type       Period:100 Period:1000 Period:10000  Period:100 Period:1000 Period:10000
apache-21287         double free          memory indirect      6          0           0             50         3           0
apache-25520         corrupted log        register indirect   14          3           0             57        52          15
apache-45605         assertion            register indirect    0          0           0             60        11           1
mysql-3596                                memory indirect      0          0           0              5         1           0
mysql-644            crash                memory indirect     20          1           0             21         6           1
mysql-791            missing output       memory indirect     12          0           0             59         2           0
cherokee-0.9.2       corrupted log        register indirect   43         11           2             63        29           8
cherokee-bug1        corrupted log        register indirect    7          3           0             57        19           5
pbzip2-0.9.4-crash   crash                memory indirect      0          0           0              0         0           0
pbzip2-0.9.4-benign  -                    pc relative          2          0           0            100       100         100
pfscan                                    pc relative          0          0           0            100       100         100
aget-bug2            wrong record in log  pc relative          0          0           0            100       100         100
(average)                                                    8.7        1.5         0.2             56      35.3        27.5

Table 4.2: Data Race Detection

Figure 4.9 shows the trace size per second (in MB/s) for the real-world applications, with the PEBS sampling period varying from 10 to 100K. The results show a similar trend but much lower space overhead compared to the PARSEC benchmarks. On geometric average, the trace size per second (in MB/s) goes up from 0.2, 1.2, 7.9, 40.8, to 99.5 for the decreasing sampling periods of 100K, 10K, 1K, 100, and 10, respectively.

4.6.4 Race Detection

To evaluate the ProRace’s effectiveness in data race detection, we used 12 real-world data race bugs [226]. For each race bug, we fed a buggy input as documented in the previous study [226], and did not control the thread schedules. We collected 100 traces for each PEBS sampling period: 100, 1K and 10K; and counted how many times ProRace can report the data race among the 100 traces. In effect, the resulting number can be regarded as an approximate detection probability. For comparison, we also measured the number of data races detected by RaceZ. Note that RaceZ enables memory trace reconstruction within one basic block, and for backward replay, it only supports a trivial form of backward propagation within that single basic block. On the other hand, ProRace includes PT-based full forward- and-backward replay across basic blocks; and supports backward propagation and reverse execution based backward replay.

Table 4.2 summarizes ProRace's data race detection effectiveness. The first column gives the application name and its bug-tracking number, if one exists, while the second describes how the bug manifests during a program execution. The third column describes the access characteristics that we analyzed manually. The next six columns show the number of traces (out of 100) in which RaceZ and ProRace detect the data races (i.e., the detection probability) for each sampling period of 100, 1K, and 10K, respectively.

It is important to note that ProRace detects data races probabilistically. As expected, in general, the detection probability increases as the sampling period decreases. On the other hand, some race bugs, in pbzip2-0.9.4, pfscan, and aget-bug2, are detected every time (100%). Examining the results, we see that the racy variable is accessed with PC-relative addressing in the program. Thus, reproducing the address of such racy memory accesses is easy because the %rip register is always available as the PC, i.e., the instruction pointer. Here, to detect such race bugs, ProRace only needs to know which basic blocks contain the racy memory accesses, which is obtained from PT's control-flow trace, without understanding the PEBS-provided execution contexts.

Figure 4.11: Memory Recovery Ratio

Figure 4.12: Offline analysis overhead

As can be seen, for a given sampling period, ProRace detects many more data races than RaceZ. For example, ProRace improves the average detection probability (arithmetic mean) from 0.2% to 27.5% for the sampling period of 10K, which incurs only 2.6% runtime overhead (Figure 4.7). For the low sampling period of 100, ProRace detects almost all cases (11/12), but RaceZ misses many. It also turns out that RaceZ cannot effectively detect races even in simple PC-relative addressing cases, because RaceZ requires a sample in the exact basic block containing the racy access. Overall, the results show that ProRace's PT-guided forward-and-backward replays are very helpful in detecting data races.

4.6.5 Memory Operation Reconstruction

ProRace leverages the forward and backward replays to reconstruct unsampled memory operations. RaceZ also tries to recover other memory accesses, but its scope is limited to the one basic block that the sampled instruction belongs to. This section shows the benefit of ProRace's forward and backward replays in terms of the memory reconstruction ratio.

Figure 4.11 shows the memory instruction recovery ratio (i.e., the number of recovered and sampled memory operations normalized to the number of original PEBS-sampled instructions) for the six buggy applications with the sampling period of 10K.

The left-most bar shows how many more memory operations can be reconstructed within one basic block (equivalent to RaceZ's approach). The results show that the basic-block granularity recovery scheme reconstructs only 1.3x-11.9x memory operations, with an average (arithmetic mean) ratio of 5.4x. Upon further investigation, we found that apache, which shows a good 9.53x recovery ratio, has many simple memory instructions that use PC-relative addressing within a basic block. However, that was not the case for other applications like mysql, which shows only a 1.6x recovery ratio.

The next two bars show the benefit of forward replay alone and forward+backward replays in ProRace. On average, the forward replay recovers 134x more memory accesses than the baseline (PEBS samples). The backward replay provides additional benefits: combined with the forward replay, it achieves a higher recovery ratio of 164x on average. The results show that ProRace's race detection coverage (which is approximately proportional to the number of recovered and sampled memory operations) is more than 30 times better than RaceZ's limited basic-block level reconstruction.
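The coverage claim follows directly from the reported averages; a quick sanity check of the arithmetic:

```python
# Sanity check of the reported average recovery ratios, each normalized to
# the number of PEBS-sampled instructions (figures quoted in the text above).
basic_block = 5.4    # RaceZ-style recovery, within one basic block
forward = 134.0      # ProRace, forward replay only
combined = 164.0     # ProRace, forward + backward replay

# Backward replay adds coverage on top of forward replay ...
assert combined > forward
# ... and the combined ratio is more than 30x the basic-block scheme.
assert combined / basic_block > 30
```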

4.6.6 Offline Analysis Overhead

Lastly, Figure 4.12 shows the offline analysis overhead when tracing with the sampling period of 10K. The results show that analyzing one second of program execution takes 54.5 seconds of offline analysis for apache and 35.3 seconds for mysql. pfscan shows the worst analysis overhead, as it generates a very large trace for a short amount of program execution time.

We also present the breakdown of the offline analysis overhead into 1) PT trace decoding, 2) memory trace reconstruction, and 3) data race detection, which take 33.7%, 64.7%, and 1.6% of the total offline analysis cost, respectively (note the logarithmic y-axis). Note that we conducted this experiment using a single machine. However, the PT trace decoding and trace reconstruction parts, which contribute >98% of the total cost, can be easily parallelized: PT records are independent of each other, and the forward-and-backward replay can also be performed region by region, making it suitable for multiple analysis machines. Moreover, the result includes the overhead of PIN-based dynamic binary instrumentation; the same features can be implemented using static instrumentation tools [116, 229] for better performance.

4.7 Summary

In this chapter, we repurpose PMU hardware, originally designed for performance profiling (TS1), for lightweight data race detection (PS1). We present ProRace, a novel PMU sampling-based data race detector that can be deployed in production settings. Its new kernel driver, which eliminates unnecessary copying and data processing, significantly lowers the run-time overhead of using PEBS to sample memory accesses. Furthermore, ProRace introduces a novel technique to reconstruct unsampled memory operations with the PT-guided forward and backward replays, thereby enhancing the data race detection coverage. The experimental results highlight ProRace's high data race detection capability on the 12 real-world data race bugs. ProRace is published at [231].

Chapter 5

Memory Efficient Temporal Memory Safety Solution for MPX

In previous chapters (Chapter 3 and Chapter 4), we discussed leveraging commodity hardware to reduce the run-time overhead of dynamic data race detectors. In this chapter, we focus on memory safety bugs, which are more common than data races, and address the space overhead problem of memory safety bug detectors (PS2). Memory safety bugs often cause security vulnerabilities, and recent Intel processors support hardware-accelerated bounds checks, called Memory Protection Extensions (MPX). Unfortunately, MPX provides no temporal safety. In this chapter, we present BOGO, a lightweight full memory safety enforcement scheme that transparently guarantees temporal safety on top of MPX's spatial safety. Instead of tracking separate metadata for temporal safety, BOGO reuses the bounds metadata maintained by MPX for both spatial and temporal safety. On free, BOGO scans the MPX bound tables to invalidate the bounds of dangling pointers; any following use-after-free error is then detected by MPX as an out-of-bound error. Since scanning the entire MPX bound tables could be expensive, BOGO tracks a small set of hot MPX bound table pages to check on free, and relies on the page fault mechanism to detect any potentially missed dangling pointer, ensuring sound temporal safety protection.

Our evaluation shows that BOGO provides full memory safety at 60% runtime overhead and 36% memory overhead for the SPEC CPU 2006 benchmarks. We also show that BOGO incurs a reasonable 2.7x slowdown for the worst-case malloc-free intensive benchmarks, and a moderate 1.34x overhead for real-world applications.

5.1 Introduction

Memory-unsafe languages such as C/C++ are prone to bugs leading to memory safety violations [192]. A spatial safety violation occurs when a memory access falls outside the accessed object's bounds (e.g., buffer overflow), while a temporal safety violation¹ happens when a deallocated object is accessed (e.g., use-after-free).
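The two violation classes can be illustrated with a toy allocator model (a hypothetical sketch, not MPX itself): each allocation records a base and size, and every access is checked against both the object's bounds and its liveness.

```python
# Toy heap model (hypothetical, for illustration): distinguishes spatial
# violations (out-of-bound access) from temporal ones (use-after-free).
class ToyHeap:
    def __init__(self):
        self.next = 0x1000
        self.objects = {}            # base -> (size, alive)

    def malloc(self, size):
        base = self.next
        self.next += size
        self.objects[base] = (size, True)
        return base

    def free(self, base):
        size, _ = self.objects[base]
        self.objects[base] = (size, False)

    def access(self, base, offset):
        size, alive = self.objects[base]
        if not (0 <= offset < size):
            return "spatial violation"      # e.g., buffer overflow
        if not alive:
            return "temporal violation"     # e.g., use-after-free
        return "ok"

h = ToyHeap()
p = h.malloc(8)
assert h.access(p, 4) == "ok"
assert h.access(p, 8) == "spatial violation"   # one past the bound
h.free(p)
assert h.access(p, 4) == "temporal violation"  # dangling access
```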

Many security breaches are caused by memory safety violations [5]. While buffer overflow vulnerabilities have been exploited for return oriented programming [161] and other code reuse attacks [44, 53, 197], use-after-free vulnerabilities have also been exploited to corrupt control flow: e.g., virtual function table hijacking [175].

Keeping pace with a broad range of research, hardware support for security has been adopted in mainstream commodity processors; notably, Intel's Memory Protection Extensions (MPX) provide hardware-accelerated bounds checking. MPX keeps track of per-pointer bounds metadata (the base and bound of heap/stack objects) in bound tables. On a pointer dereference, MPX checks that the pointer value remains within the bounds, ensuring spatial memory safety. Unfortunately, MPX does not support temporal memory safety. Thus, full memory safety can only be achieved by augmenting MPX with a separate temporal safety solution. However, existing solutions for temporal memory safety require their own metadata tracking and checking. DangSan [199], DangNull [117], Undangle [49], and FreeSentry [225] maintain designated data structures for temporal safety and perform pointer nullification at the time of free(); the dereference of a dangling pointer is then discovered as a segmentation fault. CETS [145] also maintains separate metadata for each pointer and memory object, and performs detection through explicit checks. Simply combining existing temporal memory safety solutions with Intel MPX would double the time and space overhead.

¹We use "temporal memory safety" and "no use-after-free vulnerabilities" interchangeably, though the former subsumes the latter. The same is true for "spatial memory safety" and "no out-of-bound vulnerabilities".

We propose a lightweight full memory safety enforcement scheme, BOGO, that works with MPX (Intel Skylake onwards). This work demonstrates a novel software solution that transparently extends MPX to support both spatial and temporal memory safety, without additional hardware support or significant performance degradation.

The key insight behind this "promotion" is that the MPX bound table can be searched for dangling pointers to an object when the object is freed. Since the bound table entry already maintains the bound information for each pointer, dangling pointers can be identified by checking for bounds enclosing the address being freed. For each dangling pointer p found, BOGO invalidates the bound information of the bound table entry indexed by that pointer p. On a later dereference of p (use-after-free), the MPX instrumentation will find an invalid bound (as a part of its bound checking) and raise an exception. In effect, BOGO achieves temporal safety by transforming it into spatial safety.
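The scan-and-invalidate step can be sketched as follows (a simplified model, not the MPX layout: the bound "table" here is a plain dict from pointer name to a (base, bound) pair):

```python
# Simplified model of BOGO's core idea: on free, scan bound-table entries
# for bounds enclosing the freed address and invalidate them, so that a
# later use-after-free surfaces as an out-of-bound (spatial) violation.
INVALID = (1, 0)                      # empty range: every access fails

bounds = {}                           # pointer name -> (base, bound)

def on_malloc(ptr, base, size):
    bounds[ptr] = (base, base + size - 1)

def on_assign(dst, src):              # dst = src propagates the bound
    bounds[dst] = bounds[src]

def on_free(ptr):
    base, bound = bounds[ptr]
    for p, (b, e) in bounds.items():  # scan for enclosing bounds
        if b <= base <= e:
            bounds[p] = INVALID       # invalidate dangling pointers

def check(ptr, addr):                 # MPX-style bound check on dereference
    b, e = bounds[ptr]
    return b <= addr <= e

on_malloc("a", 0x10, 8)
on_assign("c", "a")                   # c aliases a
on_free("c")
assert not check("a", 0x10)           # *a is now caught as out-of-bound
```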

This approach relieves BOGO of the burden of maintaining and checking separate temporal memory safety metadata, reducing the time overhead and, more drastically, the space overhead. BOGO introduces a new synergistic way to enforce spatial and temporal memory safety by repurposing one for the other. However, scanning the entire MPX bound tables on each free could lead to significant performance overhead. BOGO leverages a novel page-protection-based technique to address this performance challenge. It tracks the working set of MPX bound table pages and, for performance, only searches those hot pages on free. To track dangling pointers potentially residing in the remaining cold pages, BOGO makes the cold MPX bound table pages non-accessible, so any subsequent access to a cold page is always preceded by a page fault. BOGO's page fault handler scans the faulted MPX page and invalidates any dangling pointers therein, guaranteeing soundness.

This work makes the following contributions:

• To the best of our knowledge, BOGO is the first temporal memory safety protection solution that does not maintain its own metadata, but seamlessly reuses bound metadata tracked for spatial memory safety.

• BOGO transparently provides an MPX-enabled binary with full memory safety without application change or other hardware support.

• We implement llvm-mpx, an LLVM-based MPX pass, with sound bound-checking optimizations, outperforming existing MPX compilers.

• The experimental results show that BOGO can support full memory safety at comparable (in many cases, better) runtime overhead and much less memory overhead, compared to the state-of-the-art solutions.

• We stress-test BOGO with the worst-case malloc-free intensive benchmarks, and also evaluate BOGO’s interoperability and scalability for real-world multithreaded applications.

5.2 Overview of BOGO

The goal of BOGO is to provide full memory safety on top of MPX-enabled processors without significant overheads. With BOGO, users can buy such processors for spatial memory safety and get temporal memory safety (almost) for free, relieving the burden of compiler and architectural support for a temporal memory safety guarantee. More precisely, this work focuses on temporal memory safety for heap objects (use-after-free); BOGO in its current form does not provide temporal safety for stack objects (use-after-return). The additional support required for stack objects is discussed in Section 5.6.

Figure 5.1: BOGO: FullScan/PartialScan and PageFaultScan. (a) FullScan; (b) PartialScan only; and (c) PartialScan and PageFaultScan. (b) misses the use-after-free error, and (c) solves the problem with PageFaultScan. For brevity, (b) and (c) omit the bound creation and propagation. A bold up-arrow represents a BTP fault. Put and Pop insert/evict a BTP to/from the HPQ. Inv invalidates a BTE.

Threat Model and Assumptions. BOGO relies on MPX for spatial memory safety and adds temporal memory safety on top of it. Therefore, BOGO assumes that the underlying MPX-enabled processors can be trusted and that there are no hardware security bugs in the processor circuit fabrication [209, 220]. This work also assumes that adversaries cannot corrupt the MPX metadata by using non-memory-safety-related attacks, such as row hammer attacks [178, 200], or by illegitimately gaining higher (root) privilege. Any attempt to corrupt the MPX metadata by exploiting memory safety vulnerabilities will be detected by BOGO itself. As BOGO relies on the soundness of the MPX metadata, we further assume that MPX instrumentation is applied to all the source code when soundness is required.

Overview. BOGO takes a binary compiled with MPX instrumentation and transparently achieves temporal memory safety by reusing MPX metadata. At a high level, on free of a pointer p, BOGO searches the BTs for entries whose bounds overlap with the object being deallocated. The existence of such a BTE implies that another pointer, say q, also pointing to the same object, becomes a dangling pointer. When found, BOGO invalidates the metadata of such dangling pointers. A later dereference of pointer q will be checked by MPX (for the default spatial memory safety), leading to an OutOfBound exception because of the invalidated bound. The beauty of BOGO is that it enforces temporal memory safety by triggering a violation of spatial memory safety. Users can differentiate temporal from spatial violations by checking a special value in the bound register.

BOGO attempts to eliminate dangling pointers on free like the aforementioned pointer graph-based temporal memory safety solutions. However, there is one big difference. BOGO does not maintain additional metadata (e.g., pointer graph) for temporal memory safety. Instead, it reuses MPX metadata as is, and scans the BTs to invalidate dangling pointers.

FullScan. Consider the example in Fig. 5.1(a). On free(c), a naive FullScan checks all BTs, finds the aliased pointer a, and invalidates BTE[a] so that a later use of the dangling pointer a results in an MPX exception. However, searching the whole MPX BTs can lead to unacceptable performance degradation due to the large search space, so care must be taken to minimize the cost.

PartialScan. To bound the scan cost on free, BOGO tracks a small set of hot, recently used BTPs (Bound Table Pages) using a page protection mechanism, keeps them in the Hot BTP Queue (HPQ), and performs PartialScan that looks for dangling pointers only on the hot BTPs in the current HPQ.

Fig. 5.1(b) shows an example. Suppose the bounds of pointers a, b, and c are stored in different BTPs: BTP[a], BTP[b], and BTP[c]. Further assume that the size of the HPQ is 2. Each page fault on the first pointer access causes the corresponding BTP to be inserted into the HPQ. When the HPQ becomes full on the BTP[c] page fault (at the statement c=a), BOGO evicts the cold (least recently added) BTP[a] from the HPQ and inserts the hot BTP[c]. Then, free(c) checks for dangling pointers only against BTP[b] and BTP[c] in the current HPQ, a small subset of all BTPs.

However, this PartialScan may lead to a false negative (i.e., a missed use-after-free error) when a dangling pointer happens to be in a cold BTP. In Fig. 5.1(b), as BTP[a] is not in the HPQ, BTE[a] was not scanned/invalidated by free(c), even though a and c are aliased. Thus, the following dereference of the dangling pointer, *a, remains undetected.

PageFaultScan. BOGO introduces PageFaultScan to guarantee sound temporal memory safety. When a BTP is evicted from the HPQ, BOGO marks the cold BTP not-readable and not-writable so that a later access to the cold BTP is trapped by a page fault. Note that when a pointer is dereferenced, MPX always accesses its BTE for the spatial memory safety check, resulting in a page fault. Meanwhile, on free, BOGO also tracks the freed addresses in the Freed Address Queue (FAQ) if they were only partially checked over the hot BTPs, i.e., by PartialScan rather than FullScan. Upon a page fault on a cold BTP, BOGO checks the freed addresses in the FAQ to see whether the cold BTP contains any dangling pointer to those addresses. Then, BOGO restores the page access permissions of the BTP and puts it into the HPQ.

Using the same example, Fig. 5.1(c) illustrates how BOGO guarantees the detection of all use-after-free errors using both PartialScan and PageFaultScan. Though PartialScan on free(c) did not invalidate the bound in BTE[a], the dereference of pointer a results in a page fault, on which PageFaultScan detects the use-after-free error by scanning BTP[a] for the freed address c stored in the FAQ.

5.3 BOGO Approach Details

This section presents how BOGO tracks hot BTPs to bound the scan cost on free (Section 5.3.1); how PartialScan and PageFaultScan together ensure no false negatives (Section 5.3.2); and how to achieve a redundancy-free and false-positive-free PageFaultScan (Section 5.3.3).

5.3.1 Hot Bound Table Page Tracking

Fig. 5.2 (Lines 2-12) shows how BOGO uses the page protection mechanism to track hot BTPs at a low cost. Upon a BTP fault, BOGO restores the read/write permissions (Line 3) and puts this "hot" (most recently accessed) BTP into the bounded Hot BTP Queue (HPQ) (Line 9). When the HPQ is full, BOGO evicts the "coldest" (least recently added) BTP in a FIFO manner and makes it not-readable and not-writable (Lines 9-12). The rest of the BTP fault handler is discussed later in this section.
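The tracking policy can be simulated in a few lines (a sketch of the policy only: real page protection via mprotect is modeled here as a `protected` set):

```python
# Simulation of BOGO's hot-BTP tracking (page protection modeled as a
# `protected` set instead of real mprotect calls).
from collections import deque

HPQ_SIZE = 2
hpq = deque()          # FIFO of hot BTPs
protected = set()      # BTPs currently marked non-accessible (cold)
faults = []            # record of BTP "page faults"

def access_btp(btp):
    """Access a pointer's BTP; a cold BTP faults and becomes hot."""
    if btp in hpq:
        return                      # already hot: no fault
    faults.append(btp)              # page fault on cold BTP
    protected.discard(btp)          # restore read/write permissions
    hpq.append(btp)
    if len(hpq) > HPQ_SIZE:
        evicted = hpq.popleft()     # evict least recently added BTP
        protected.add(evicted)      # mprotect(evicted, NONE)

# Replay the accesses of Fig. 5.1(b): a=malloc(); b=malloc(); c=a
for btp in ["BTP[a]", "BTP[b]", "BTP[c]"]:
    access_btp(btp)

assert list(hpq) == ["BTP[b]", "BTP[c]"]   # BTP[a] was evicted
assert protected == {"BTP[a]"}             # evicted page is protected
```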

5.3.2 Combining PartialScan and PageFaultScan to Achieve Low Overhead and No False Negatives

On free, BOGO scans the BTs and invalidates the BTEs of dangling pointers. Fig. 5.2 (Lines 14-21) presents how BOGO instruments free. BOGO can safely rely on PartialScan and PageFaultScan as long as it can hold the freed addresses in the FAQ. In some cases, the FAQ can be configured to be large enough to avoid FullScan. Otherwise, if the FAQ becomes full, BOGO falls back to FullScan, which checks all the freed addresses in the FAQ over all the BTs (Line 20). We discuss FullScan optimization in Section 5.4.2. After FullScan, BOGO may reset the FAQ (Line 21), as there are no longer pending temporal memory safety checks to perform; any unused BTs can be safely reclaimed at this point. This approach trades performance for soundness.

For performance, BOGO favors PartialScan (Line 18), which looks for dangling pointers only in the hot BTPs in the HPQ. To ensure soundness (see the difference between Fig. 5.1(b) and Fig. 5.1(c)), the freed addresses that are checked only via PartialScan are collected in the FAQ along with their free time² (Line 16). These freed addresses remain in the FAQ until the next FullScan (Lines 20-21); meanwhile, they are checked over the (cold) BTPs on page faults via PageFaultScan (Lines 4-8). The evict_time and free_time are discussed in the next section.
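The interplay of the two scans can be made concrete with a small executable model (a simplification of the Fig. 5.2 algorithms: one BTE per pointer, one pointer per BTP, and page protection modeled as HPQ membership):

```python
# Executable sketch of PartialScan + PageFaultScan (one BTE per pointer,
# one pointer per BTP; a simplification of the Fig. 5.2 algorithms).
from collections import deque

INVALID = (1, 0)

class Bogo:
    def __init__(self, hpq_size=2):
        self.bte = {}                 # ptr -> (base, bound)
        self.hpq = deque()            # hot BTPs (here: pointer names)
        self.hpq_size = hpq_size
        self.faq = []                 # freed addresses pending FullScan

    def _touch(self, ptr):
        """MPX touches BTE[ptr]; a cold BTP faults -> PageFaultScan."""
        if ptr in self.hpq:
            return
        self._scan([ptr], self.faq)   # PageFaultScan over faulted BTP
        self.hpq.append(ptr)
        if len(self.hpq) > self.hpq_size:
            self.hpq.popleft()        # evict coldest BTP (protect it)

    def malloc(self, ptr, base, size):
        self.bte[ptr] = (base, base + size - 1)
        self._touch(ptr)

    def assign(self, dst, src):       # dst = src
        self._touch(src)
        self.bte[dst] = self.bte[src]
        self._touch(dst)

    def free(self, ptr):
        self._touch(ptr)
        base, _ = self.bte[ptr]
        self.faq.append(base)         # remember for later PageFaultScans
        self._scan(list(self.hpq), [base])   # PartialScan over hot BTPs

    def deref(self, ptr, addr):
        self._touch(ptr)              # may trigger PageFaultScan first
        b, e = self.bte[ptr]
        return b <= addr <= e         # MPX bound check

    def _scan(self, ptrs, faddrs):
        for p in ptrs:
            b, e = self.bte[p]
            if any(b <= f <= e for f in faddrs):
                self.bte[p] = INVALID

# Replay Fig. 5.1(c): a=malloc(); b=malloc(); c=a; free(c); *a
bogo = Bogo(hpq_size=2)
bogo.malloc("a", 0x10, 8)
bogo.malloc("b", 0xff, 64)
bogo.assign("c", "a")                 # evicts BTP[a] from the HPQ
bogo.free("c")                        # PartialScan misses cold BTP[a]
assert not bogo.deref("a", 0x10)      # PageFaultScan catches the UAF
```

Without the PageFaultScan call in `_touch`, the final dereference would succeed, reproducing the false negative of Fig. 5.1(b).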

5.3.3 Combining PageFaultScan and Redundancy Prediction to Achieve Low Overhead and No False Positives

Though PageFaultScan ensures no false negatives, it may lead to a false positive. Consider the following code:

a=malloc(8); b=malloc(64); c=a; free(c);

a=malloc(8); *b; c=b; free(b); *a;

The pointer a was once a dangling pointer after free(c), but it is reassigned by the second a=malloc(8) of the same size, rendering *a legal. However, if this malloc reuses the freed object for locality, as modern allocators do, PageFaultScan may raise a false alarm.

Figure 5.3(a) illustrates the case with the heap snapshot change over time. At time t1, a=malloc(8) returns 0x10 and sets BTE[a]. At t3, BTP[a] is evicted from the HPQ. At t4, on free(c), PartialScan does not check BTP[a] and puts the freed address (0x10) in the FAQ. At t5, the second a=malloc(8) happens to return the same location 0x10, as shown in the third heap snapshot. At t7, BTP[a] becomes cold again. As a result, the last *a at t9 results in a page fault on BTP[a]. PageFaultScan finds an overlap between BTE[a] and 0x10 in the FAQ. However, this is a false alarm.

²This is an abstract time that we implement as the FAQ index, to avoid the cost of using real timestamps.

1  /* [btp] and [faddr] form a single-element list with the parameter */
2  OnBoundTablePageFault(btp)
3      mprotect(btp, RW)
4      evict_time = get_evict_time(btp)
5      for each faddr in FAQ
6          free_time = get_free_time(faddr)
7          if (evict_time < free_time)
8              scan([btp], [faddr])      // PageFaultScan
9      evicted_btp = insert(HPQ, btp)
10     if (evicted_btp != NULL)
11         set_evict_time(evicted_btp)
12         mprotect(evicted_btp, NONE)
13
14 OnFree(faddr)
15     free(faddr)                       // actual free
16     insert(FAQ, faddr)                // index serves as free time
17     if (FAQ.length != MAX)
18         scan(HPQ, [faddr])            // PartialScan
19     else
20         scan(ALL, FAQ)                // FullScan
21         reset(FAQ)
22
23 scan(btp_list, faddr_list)
24     for each btp in btp_list
25         for each faddr in faddr_list
26             // scan and invalidate if overlaps
27             for each bte in btp
28                 if (bte.base <= faddr && faddr <= bte.bound)
29                     bte.base = INVALID
30                     bte.bound = INVALID

Figure 5.2: BOGO handler algorithms.

One naive workaround would be to not release the memory to the system on free, so that the freed object cannot be reused for a later allocation until BOGO performs FullScan. However, this would obviously increase the memory footprint significantly.

Figure 5.3: Redundancy prediction: (a) shows a false positive case, and (b) shows how BOGO removes the redundant scans and eliminates the false positive. Actions of BOGO appear above the bar, while the status of the HPQ/FAQ appears underneath the code. Each color represents a different redundant scan.

Redundancy Prediction. We present a novel redundancy prediction technique. The crux of the problem is redundant checks. Focus on three points in time: t4, when PartialScan adds the freed address 0x10 into the FAQ; t7, when BTP[a] becomes cold; and t9, when PageFaultScan performs a check on the freed address 0x10. Recall that we originally introduced PageFaultScan because PartialScan did not check the cold BTPs at free time. However, in this scenario, BTP[a] becomes cold after the PartialScan, implying that PageFaultScan does not need to check the freed address 0x10 on BTP[a]. More formally, it is safe for PageFaultScan to skip the scanning of a BTP for a given freed address faddr in the FAQ if Time_evict(BTP) is greater (i.e., later) than Time_freed(faddr). We provide the proof at the end of this section.

Based on this observation, BOGO keeps track of Time_freed(faddr) when faddr is added to the FAQ (Line 16), and Time_evict(BTP) when a BTP is evicted from the HPQ (Line 11). On a PageFaultScan, BOGO compares these times (Lines 4-7) and avoids redundant checks, leading to better performance and no false positives.

To illustrate, Figure 5.3(b) shows how BOGO deals with the false positive using redundancy prediction. Since the eviction time of BTP[a] (t7) is greater than the freed time of 0x10 (t4), PageFaultScan (t9) safely skips Scan(a,0x10) and avoids the false alarm. For the same reason, BOGO skips PageFaultScan at times t6 and t7 for BTP[b] and BTP[c], respectively. However, PageFaultScan at t5 cannot be skipped, and it needs to check the faulted page BTP[a] against the freed address 0x10. Note that every scan that can be safely skipped is paired with a corresponding previous scan that performs the same bound check. Such pairs are shown using the same color in Fig. 5.3(b), where three pairs exist (green, blue, and red).

Now we prove that this redundancy elimination is safe.

Theorem 1. On a BTP fault, it is safe (no missed detection) for PageFaultScan to skip the scanning of the BTP for a given freed address faddr in the FAQ if Time_evict(BTP) is larger than Time_freed(faddr).

Proof. We provide a direct proof sketch. Recall that BOGO has two scan methods: PageFaultScan, and the PartialScan performed at free time, which checks the HPQ. Thus, we need to prove that one of the two methods has already scanned any BTP that satisfies the time condition, i.e., Time_freed(faddr) < Time_evict(BTP). First, if the BTP was part of the HPQ at Time_freed(faddr), the free must obviously have scanned the BTP. This case therefore makes it redundant to scan the BTP in the current PageFaultScan (referred to as currPFS).

Second, if the PartialScan on free did not scan the BTP (i.e., the BTP was not in the HPQ at Time_freed(faddr)), the BTP must have been evicted before; let us refer to that time as Time_pastEvict(BTP). The implication is that there must be a PageFaultScan (referred to as pastPFS) between the two evictions, and it must have happened after the free, which would otherwise have scanned the BTP. We thus obtain the following:

Time_pastEvict(BTP) < Time_freed(faddr) < Time_pastPFS(BTP) < Time_evict(BTP) < Time_currPFS(BTP)    (5.1)

Then, we investigate whether the pastPFS scanned the BTP. As the above inequality shows, for the pastPFS, the eviction time of the BTP (Time_pastEvict(BTP)) is not larger than the free time (Time_freed(faddr)). Thus, the pastPFS must have scanned the BTP, making it redundant to scan the BTP in currPFS. Consequently, Theorem 1 must be true.

In sum, with PageFaultScan and redundancy prediction, BOGO can eliminate unnecessary scans, achieving better performance and no false positives.
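The skip condition of Theorem 1 maps directly onto the time comparison in the fault handler (Fig. 5.2, Line 7); a small sketch replaying the Fig. 5.3 timestamps:

```python
# Sketch of redundancy prediction: PageFaultScan skips a (BTP, faddr) pair
# when the BTP's eviction time is later than the address's free time.
def page_fault_scan(evict_time, faq, predict=True):
    """Return the freed addresses actually scanned on this BTP fault.
    `faq` is a list of (faddr, free_time) pairs."""
    scanned = []
    for faddr, free_time in faq:
        # Theorem 1: if the BTP was evicted after the free, an earlier scan
        # already covered this (BTP, faddr) pair -- skip it.
        if predict and evict_time >= free_time:
            continue
        scanned.append(faddr)
    return scanned

# Fig. 5.3 scenario: 0x10 was freed at t4; BTP[a] was evicted at t7.
faq = [(0x10, 4)]
# Without prediction, the fault at t9 re-scans 0x10 -> false alarm possible.
assert page_fault_scan(evict_time=7, faq=faq, predict=False) == [0x10]
# With prediction, the redundant (and unsafe-to-repeat) scan is skipped.
assert page_fault_scan(evict_time=7, faq=faq, predict=True) == []
# The fault at t5 (BTP[a] evicted at t3, before the free at t4) must scan.
assert page_fault_scan(evict_time=3, faq=faq, predict=True) == [0x10]
```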

5.4 Optimization

Figure 5.4: Example of a sound PartialScan (p=malloc(); q=p; free(p), with no BTP fault in between). No further PageFaultScan is required.

Table 5.1: Impacts of increasing the HPQ and FAQ sizes on the number and the per-scan cost of Partial, Page Fault, and Full Scans.

             Partial Scan         Page Fault Scan      Full Scan
             number  per-scan     number  per-scan     number  per-scan
             of scans  cost       of scans  cost       of scans  cost
                     O(|HPQ|)             O(|FAQ|)             O(|ALL| * |FAQ|)
  HPQ ↑        -       ↑            ↓       -            -       -
  FAQ ↑        ↑       -            -       ↑            ↓       ↑

5.4.1 No PageFaultScan Optimization

On free, when the FAQ is not full, BOGO performs PartialScan, which checks the hot BTPs and stores the freed address into the FAQ so that later PageFaultScans can detect dangling pointers that PartialScan might miss (Section 5.3.2). This PageFaultScan backup mechanism is necessary because dangling pointers may reside in cold BTPs. If BOGO can prove the absence of dangling pointers in the cold BTPs, then it does not need to add the freed address into the FAQ, bringing two benefits: 1) it saves FAQ space (triggering FullScan less often), and 2) it avoids subsequent PageFaultScans against the freed address.

Consider the example in Fig. 5.4, where there is no BTP fault between the memory allocation and deallocation. The absence of BTP faults implies that any potential copy of pointer p, the necessary condition for dangling pointers, must have happened only through pointers whose BTEs lie in the hot BTPs; otherwise, a BTP fault would have been triggered and the HPQ would have been altered. In this case, on free, BOGO only needs to check the freed address against the hot BTPs in the current HPQ, and there is no need to add it to the FAQ for later PageFaultScans. In practice, applications often have short-lived heap objects whose malloc and free are adjacent in time.

To support this optimization, BOGO maintains a small hash table which tracks the addresses of objects that have been allocated since the last BTP fault. BOGO stores the address being allocated on malloc, and checks if the hash table holds the address being deallocated on free. When there is a match, PartialScan applies the proposed optimization by not putting the freed address into the FAQ. BOGO resets the hash table on a BTP fault (as a part of HPQ maintenance).
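A sketch of this bookkeeping (the hash table is modeled as a Python set; the PartialScan itself is unchanged and omitted here):

```python
# Sketch of the no-PageFaultScan optimization: track addresses allocated
# since the last BTP fault; freeing such an address needs no FAQ entry.
allocated_since_fault = set()
faq = []

def on_malloc(addr):
    allocated_since_fault.add(addr)

def on_btp_fault():
    allocated_since_fault.clear()     # HPQ changed: table no longer valid

def on_free(addr):
    # PartialScan over the HPQ happens in both cases (omitted here).
    if addr in allocated_since_fault:
        allocated_since_fault.discard(addr)
        return                        # no dangling pointer can be cold
    faq.append(addr)                  # must be re-checked by PageFaultScan

on_malloc(0x10)
on_free(0x10)                         # malloc/free with no fault in between
assert faq == []                      # optimization: FAQ stays empty

on_malloc(0x20)
on_btp_fault()                        # a BTP fault invalidates the table
on_free(0x20)
assert faq == [0x20]                  # conservative path: goes to the FAQ
```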

5.4.2 FullScan Optimization

When the FAQ is full, BOGO performs FullScan over the entire BTs (Fig. 5.2, Line 20). A naive implementation would iterate over the (1st-level) BD to find all the valid (2nd-level) BTs. Scanning the huge BD leads to severe performance degradation; even for a valid BT, many of its BTPs may never have been accessed, and scanning them would cause unnecessary page faults. To avoid scanning the BD and all BTs, which would be an order of magnitude slower, BOGO uses a custom syscall to obtain only the accessed BTPs and scans them directly. The syscall looks up the per-process memory descriptor for the virtual memory areas (VMAs) reserved for MPX and returns those pages whose accessed bit is set in the page table.
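The effect of the accessed-bit filter can be sketched as follows (a toy model: the custom syscall is stood in for by a simple filter over per-page flags, and each "page" holds a list of bounds):

```python
# Sketch of the FullScan optimization: scan only BTPs whose accessed bit
# is set (the custom syscall is modeled by a filter over page flags).
pages = {                      # page -> {"accessed": bit, "btes": [...]}
    "BTP0": {"accessed": True,  "btes": [(0x10, 0x17)]},
    "BTP1": {"accessed": False, "btes": [(0x20, 0x27)]},
    "BTP2": {"accessed": True,  "btes": [(0x30, 0x37)]},
}

def accessed_btps():           # stands in for the custom syscall
    return [p for p, info in pages.items() if info["accessed"]]

def full_scan(faq):
    """Scan only the accessed BTPs; return the pages scanned."""
    scanned = accessed_btps()
    for p in scanned:
        pages[p]["btes"] = [
            bte for bte in pages[p]["btes"]
            if not any(bte[0] <= f <= bte[1] for f in faq)
        ]                      # drop (invalidate) overlapping bounds
    return scanned

assert full_scan([0x30]) == ["BTP0", "BTP2"]   # BTP1 skipped entirely
assert pages["BTP2"]["btes"] == []             # dangling bound invalidated
```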

5.5 Dynamic Adaptation of Queue Size

BOGO maintains two queues: HPQ and FAQ. Their sizes have a significant impact on the number and cost of Partial, Page Fault, and Full Scans, which determine the overall performance. This section first analyzes the cost of each scan (Section 5.5.1) and the impact of the HPQ and FAQ sizes (Section 5.5.2). Then, we introduce a scan-cost-based dynamic adaptation scheme that adjusts the size of HPQ at runtime for optimal performance (Section 5.5.3).

5.5.1 Scan Cost Analysis

In general, the cost of each scan is the product of the number of BTPs to scan and the number of freed addresses to scan (Fig. 5.2, Lines 24-30). Note that the innermost loop in Line 27 iterates over a constant number of 128 BTEs (of size 32B each) in a BTP (of size 4KB). Therefore, the cost of PartialScan is O(|HPQ|), as it scans the hot BTPs in the HPQ against a single pointer address being freed. The cost of PageFaultScan is O(|FAQ|), as it checks the freed addresses in the FAQ against a single (once cold, now hot) BTP being page-faulted. Lastly, the cost of FullScan is O(|ALL| ∗ |FAQ|), as it checks the freed addresses in the FAQ against all BTPs.
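The loop structure behind these costs can be sketched as below. This is a hypothetical model for illustration only; the types and names are invented (the real Fig. 5.2 pseudocode is not reproduced here), but it shows why the 128-BTE inner loop is a constant factor and why PartialScan is O(|HPQ|).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Structural sketch of PartialScan: for one freed address, visit every
 * hot BTP in the HPQ and invalidate any BTE whose bounds cover that
 * address. Cost is O(|HPQ|); the inner loop over the 128 BTEs per 4KB
 * BTP is a constant factor. */
enum { BTES_PER_BTP = 128 };

typedef struct { uintptr_t lower, upper; } bte_t;  /* 32B each in real MPX */
typedef struct { bte_t bte[BTES_PER_BTP]; } btp_t;

static size_t partial_scan(btp_t **hpq, size_t hpq_len, uintptr_t freed) {
    size_t invalidated = 0;
    for (size_t i = 0; i < hpq_len; i++)             /* O(|HPQ|) */
        for (size_t j = 0; j < BTES_PER_BTP; j++) {  /* constant 128 */
            bte_t *e = &hpq[i]->bte[j];
            if (e->lower <= freed && freed < e->upper) {
                /* Make the bound empty: any later use of this pointer
                 * fails the ordinary MPX (spatial) check. */
                e->lower = 1;
                e->upper = 0;
                invalidated++;
            }
        }
    return invalidated;
}
```

PageFaultScan inverts the roles (one BTP against the whole FAQ), and FullScan nests both loops, giving the O(|ALL| ∗ |FAQ|) cost stated above.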

5.5.2 Impact of HPQ and FAQ Sizes

Table 5.1 summarizes the impact of increasing the sizes of HPQ and FAQ on the number and cost of each scan; decreasing them has the opposite effect.

Increasing HPQ. HPQ keeps track of hot BTPs for PartialScan, so increasing the size of HPQ increases the cost of each PartialScan. The number of PartialScans, however, is independent of the HPQ size; to be precise, it depends on the free frequency and the size of FAQ (which determines whether a Partial or Full Scan is taken on free). On the other hand, a larger HPQ decreases the number of PageFaultScans because it can hold more hot BTPs. Note that the total cost of each scan is proportional to the number of scans times the per-scan cost, and the HPQ size affects the two scans in opposite directions. Therefore, for the common case where only PartialScan and PageFaultScan are used (without FullScan), the HPQ size should be tuned to strike a good balance between the two. Our sensitivity study on the HPQ size in Section 5.8.3 shows that each application has a different optimal HPQ size, motivating our adaptive scheme in Section 5.5.3.

Increasing FAQ. Unlike HPQ, a bigger FAQ generally leads to better performance. First, its impact on PartialScan is small because the number of PartialScans varies only slightly. Second, increasing the FAQ may at first glance appear to harm PageFaultScan, which iterates over the freed addresses in the FAQ. In reality, however, this is not the case: PageFaultScan stops scanning when it finds a freed address whose free time is earlier than the eviction time of the BTP being page-faulted, as discussed in Section 5.3.3 and Fig. 5.3. In other words, even with a very large FAQ, the cost of PageFaultScan is effectively bounded. Lastly, as FullScan is far more expensive than the other two scans, it is better to keep its number low by making the FAQ big enough. Therefore, in the next section, we focus on tuning the HPQ size at runtime.

5.5.3 Scan Cost-based HPQ Adaptive Scheme

Based on the above observations, BOGO dynamically adjusts the size of HPQ to balance PartialScans and PageFaultScans. To this end, BOGO divides a program execution into regular intervals, called quanta. At runtime, BOGO measures and compares the cost (execution time) of PartialScans and PageFaultScans in a quantum, and then decides whether to shrink or enlarge the HPQ for the next quantum. If the cost of PageFaultScans is higher than that of PartialScans, BOGO increases the HPQ, and vice versa. This scan-cost-based adaptive scheme allows BOGO to adapt to application-specific characteristics (e.g., free frequency) and to program phase changes even within the same application. Our experimental results in Section 5.8.3 show that BOGO with the adaptive HPQ outperforms the one with profile-based manual configuration. By default, BOGO uses a quantum of 100 ms, sets the initial HPQ size to 16, and changes the HPQ size exponentially.
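The per-quantum decision can be sketched as a single resizing function. This is a minimal sketch under stated assumptions: the bounds HPQ_MIN/HPQ_MAX and the function name are hypothetical, and only the exponential grow/shrink policy comes from the text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch of the scan-cost-based adaptation: at the end of
 * each 100ms quantum, compare the time spent in PageFaultScans vs.
 * PartialScans and grow or shrink the HPQ capacity exponentially. */
enum { HPQ_INIT = 16, HPQ_MIN = 16, HPQ_MAX = 4096 };

static size_t next_hpq_size(size_t cur,
                            double partial_cost,    /* time in PartialScans  */
                            double pagefault_cost)  /* time in PageFaultScans */
{
    if (pagefault_cost > partial_cost) {
        /* Too many hot-BTP evictions: a larger HPQ holds more hot BTPs
         * and thus reduces the number of PageFaultScans. */
        return cur * 2 > HPQ_MAX ? HPQ_MAX : cur * 2;
    }
    /* PartialScans dominate: a smaller HPQ makes each of them cheaper. */
    return cur / 2 < HPQ_MIN ? HPQ_MIN : cur / 2;
}
```

Because the costs are measured rather than predicted, the scheme needs no prior profiling and naturally follows phase changes within one run.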

5.6 Discussion

Free-after-free. Consider the following code: a = malloc(8); b = a; free(a); a = malloc(8); free(b);

Suppose the two mallocs return the same region. Then free(b) would erroneously free a's new buffer. If one treats free(b) as a use of b, BOGO can perform a bound check on free(b). Since BOGO invalidates BTE[b] on free(a), it can detect such a case.
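The scenario above can be modeled concretely. This is an illustrative model only, not the real BOGO runtime: a `bound_valid` flag stands in for a pointer's BTE, and `checked_free` treats a free as a use of the pointer, checking it against the (possibly invalidated) bound first. All names are invented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-in for a per-pointer MPX bound (BTE). */
typedef struct { char *p; bool bound_valid; } guarded_ptr;

/* free() is treated as a use: check the bound before releasing memory. */
static bool checked_free(guarded_ptr *gp) {
    if (!gp->bound_valid)
        return false;              /* stale bound: temporal error caught */
    free(gp->p);
    gp->bound_valid = false;
    return true;
}

/* The free-after-free scenario from the text; returns true if the
 * second, erroneous free is caught. */
static bool free_after_free_demo(void) {
    guarded_ptr a = { malloc(8), true };
    guarded_ptr b = a;             /* b aliases a's buffer              */
    checked_free(&a);              /* free(a): the PartialScan would... */
    b.bound_valid = false;         /* ...invalidate BTE[b] too (modeled
                                      by hand here)                     */
    a.p = malloc(8);               /* may land in the same region       */
    a.bound_valid = true;
    bool caught = !checked_free(&b);  /* free(b) checked like a use of b */
    free(a.p);
    return caught;
}
```

The key point is that no extra temporal metadata is needed: invalidating b's bound on the first free turns the second free into an ordinary failed spatial check.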

Use-after-return. This refers to the dereference of a deallocated stack object [84]. BOGO can be extended to detect it by invalidating the bounds belonging to the current stack frame upon return. Static analysis can help avoid such a check in many cases, thus supporting the detection at a low cost. For instance, static analysis can tell whether any pointer into the current stack frame escapes the function, either because (1) it is returned or (2) it is propagated across the function boundary. Such escapes are expected to be rare, implying that the scan could be avoided for most returns.
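The escape conditions can be made concrete with small examples. These are hypothetical illustrations (not from the dissertation's codebase) of when the return-time scan would be required versus elided.

```c
#include <assert.h>
#include <stddef.h>

/* Case (1), a pointer to the frame is returned, would look like
 *     int *f(void) { int local; return &local; }
 * which most compilers already warn about, so it is omitted here.
 * Case (2): a pointer to the frame is propagated out of the function. */
static int *g_sink = NULL;

static void escape_by_store(void) {
    int local = 7;
    g_sink = &local;     /* escapes: g_sink dangles after return, so the
                            frame's bounds must be scanned/invalidated */
}

/* The common case: nothing escapes, so the return-time scan is elided
 * and the frame's bounds can simply be dropped. */
static int no_escape(void) {
    int local[4] = {1, 2, 3, 4};
    int sum = 0;
    for (int i = 0; i < 4; i++)
        sum += local[i];
    return sum;          /* only the value leaves the frame */
}
```

A standard escape analysis over the IR suffices to separate the two cases, which is why the expected overhead of the extension is low.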

Multithreading. Metadata-based memory error detectors may report false positives or miss errors when a pointer operation and its metadata updates/checks do not happen atomically. For solutions like CETS [145] that only look up, update, and check per-pointer metadata one at a time, if a program is data-race-free, atomicity can be achieved by placing the instrumentation code in the same critical section as the original pointer operation; for a program with data races, this remains a challenge. On the other hand, for solutions like DangSan [199] and BOGO that access other pointers' metadata (for invalidation), the problem gets worse because concurrent metadata updates from independent pointers may form a race condition: e.g., while one thread scans on free, another may update the metadata. DangSan chooses to favor performance over soundness without additional support, and the current BOGO prototype shares the same limitation. However, it is possible to mark the BTPs being scanned as non-accessible during scanning and make a concurrent thread wait at the page granularity (instead of stopping the world); this design remains future work. During our experiments with multithreaded applications (Section 5.8.5), we did not observe false warnings (false negatives are unknown).

Custom Memory Allocators. Custom allocators need to be patched to invoke BOGO's scan; otherwise, false negatives may result. They can be identified by using techniques like [55].

5.7 Implementation

The llvm-mpx pass consists of 9240 LoC, along with a 1605 LoC diff to the LLVM framework. The custom syscall consists of a 419 LoC kernel diff.

5.7.1 Spatial Memory Safety

We implemented the llvm-mpx pass using LLVM [115] to support spatial memory safety on MPX. It instruments at the IR level, protecting heap, address-taken stack, and global objects. It follows the same per-pointer bounds checking convention used in SoftBound [144] and gcc-mpx, yet it instruments more instructions than SoftBound: e.g., atomic, vectorization, and invoke instructions. Moreover, llvm-mpx models the same set of libc functions as gcc-mpx: e.g., malloc, memcpy, and strcpy. As the address of each pointer being checked is always taken (e.g., bndldx(&p)) in the IR, pointer variables are not promoted to registers, and llvm-mpx checks them all. llvm-mpx keeps the BT up to date, which BOGO relies on, whereas gcc-mpx and icc-mpx often store bound information on the stack instead of in the BT.

Optimizations. llvm-mpx performs three optimizations during instrumentation. (1) Bound check elimination: if a memory access can be statically verified, llvm-mpx elides the MPX checks. This is analogous to the bound check optimization in the pioneering work of Gupta [87]; gcc-mpx has the same form of optimization. (2) Dead bound elimination: the lack of bound checks can make the corresponding bound and the related instructions (e.g., bndmk/bndldx/bndstx) dead if they are not "used" by others. llvm-mpx identifies such dead code by following the use-def chain [140] and eliminates it. (3) Bound check consolidation: if llvm-mpx can statically calculate the range of the accesses in a loop or vectorized code, it consolidates the checks into a single check and pays the overhead only once. This is a simple form of the optimization proposed in Gupta's work [87] and WPBound [222].
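Optimization (3) is easiest to see in code. The sketch below is illustrative only: `bound_t`, `in_bounds`, and the function names are invented stand-ins for the MPX bound registers and checks, not llvm-mpx's real API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for an MPX bound covering [lo, hi). */
typedef struct { uintptr_t lo, hi; } bound_t;

static int in_bounds(bound_t b, uintptr_t addr, size_t len) {
    return b.lo <= addr && addr + len <= b.hi;
}

/* Unoptimized: one check per access, N checks for N iterations. */
static long sum_checked(const int *a, size_t n, bound_t b) {
    long s = 0;
    for (size_t i = 0; i < n; i++) {
        if (!in_bounds(b, (uintptr_t)&a[i], sizeof a[i]))
            return -1;                 /* bound violation */
        s += a[i];
    }
    return s;
}

/* Consolidated: the loop touches exactly a[0..n), so one up-front
 * range check covers every access. */
static long sum_consolidated(const int *a, size_t n, bound_t b) {
    if (!in_bounds(b, (uintptr_t)a, n * sizeof *a))
        return -1;                     /* single range check */
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}
```

Both versions accept and reject the same accesses; the consolidated one simply pays the check cost once per loop instead of once per iteration.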

Since the above optimizations are safe for spatial memory safety [87, 222], they do not compromise BOGO's temporal memory safety guarantee. For example, optimization (1) only deals with local arrays or globals, not the heap objects that are BOGO's target. Optimization (2) will not trigger if pointers are copied, returned, etc. (i.e., if the bound is "used").

Bound narrowing. The current prototype does not implement bound narrowing [13]. When a program accesses a specific field of a struct object, the compiler can shrink the bound to that field, rather than the full object, for fine-grained bounds checking. However, this causes compatibility issues, breaking some SPEC 2006 applications that rely on certain C idioms, due to the resulting false positives [57, 154]. Bound narrowing is optional in gcc-mpx and not supported in many other tools [70, 146, 147, 149, 182], including SoftBound [144, 148].
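One C idiom that breaks under bound narrowing is the old-style oversized trailing array. The struct below is hypothetical, chosen to mirror the kind of pattern cited for SPEC: the code deliberately writes past the declared field, which is fine under a full-object bound but a false positive under a field-narrowed one.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical struct using the pre-C99 "flexible array" idiom:
 * name[] is declared with size 1 but allocated larger. */
struct entry {
    int  id;
    char name[1];
};

static struct entry *make_entry(int id, const char *name) {
    /* Allocate room for the string past name[1]. */
    struct entry *e = malloc(sizeof *e + strlen(name));
    e->id = id;
    /* Writes beyond name[1]: in bounds for the *object*, but out of
     * bounds for a BTE narrowed to the 1-byte field, so bound
     * narrowing would report a false positive here. */
    strcpy(e->name, name);
    return e;
}
```

Keeping the full-object bound (as the prototype does) accepts this idiom while still catching overflows past the allocation itself.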

5.7.2 Temporal Memory Safety

BOGO is built upon LLVM-4.0, glibc-2.23, and linux-4.10. The kernel is modified to support the following: (1) BTP permission initialization; (2) the FullScan optimization (Section 5.4.2); (3) a custom mprotect that avoids touching unrelated kernel data structures; and (4) signal delivery when a BTP is reclaimed (becomes unavailable), so that it can be removed from the HPQ, avoiding a potential fault during scanning.

5.8 Evaluation

5.8.1 Methodology

We used three sets of benchmarks for evaluation. SPEC CPU 2006 is used in Section 5.8.3 for detailed performance evaluation. The malloc-free benchmark [110, 153] is used in Section 5.8.4 for stress testing. Finally, 9 real-world (multithreaded) applications are tested in Section 5.8.5. The setup is a 4GHz quad-core Intel i7-6700K CPU with 16GB RAM. The performance numbers are the average of 5 runs. Unless otherwise mentioned, all experiments are done with the following configuration: (1) the size of FAQ is 65535, and FullScan is used when it becomes full; (2) the initial size of HPQ is 16, with the dynamic adaptation scheme (Section 5.5.3) enabled; (3) the reference input is used for SPEC.

Figure 5.5: MPX compilers overheads: geomean of SPEC 2006 (normalized overhead of icc-mpx, gcc-mpx, and llvm-mpx under the Inst→O2, Inst→O2→LTO, O2→Inst, and O2→LTO→Inst orderings).

Spatial
  BugBench [129]
    bc-1.06            storage.c:177, util.c:577
    gzip-1.2.4         gzip.c:828
    man-1.5h1          main.c:977
    ncompress          compress42.c:892
    polymorph-0.4.0    polymorph.c:120,195,198,200,202
  CVE-2004-2167        latex2rtf-1.9.15         definitions.c:155
  CVE-2007-4060        corehttp-0.5.3alpha      http.c:32
  CVE-2011-4971        memcached-1.4.5          memmove()
  CVE-2016-6289        php-7.1Git-2016-06-29    zend_virtual_cwd.c:1243
  CVE-2016-6297        php-7.1Git-2016-06-30    zip_stream.c:289
  CVE-2017-9928        lrzip-0.631              lrzip.c:979
  CVE-2017-9929        lrzip-0.631              lrzip.c:1074
  CVE-2018-5268        OpenCV-3.3.1             grfmt_jpeg2000.cpp:343
  CVE-2018-6187        MuPDF-1.12.0             pdf-write.c:2901
  SPEC 2006
    400.perlbench      perlio.c:748, sv.c:4124
    450.soplex         islist.h:287,357, svector.h:351
    464.h264ref        mv-search.c:1016
Temporal
  NIST/Juliet [8]      test cases 2151, 102205, 102225, 102226, 102247, 102248, 102267, 102287, 102289, 102307, 102308, 102311, 102321, 102367, 102411, 102444, 102468, 102528, 102577, 102609, 102610, 102611, 102612, 102613, 102614, 102615, 102616, 102617, 102618, 102619, 102663, 152889
  CVE-2014-9661        FreeType 2.5.3           ftstream.c:182
  CVE-2015-7801        optipng-0.6.4            opngoptim.c:977
  CVE-2017-10686       nasm-2.14rc0             dereferences of free'd Token obj
  CVE-2017-15642       sox-v14.4.2              formats.c:245

Table 5.2: llvm-mpx and BOGO validation.

Patching Spatial Safety Errors. With llvm-mpx, we found the same set of bounds errors in the original SPEC applications as reported in Oleksenko et al.'s work [154] (see their Section 4.4). Thus, we patched [28] them to perform a bound-error-free performance evaluation.³

Instrumentation before or after Optimizations. It is worth noting that the llvm-mpx pass is applied after standard optimizations, including LTO (link-time optimization). That is, we apply -O2 to each bitcode file, then perform -O2 LTO to create a single file, and apply llvm-mpx last. The same convention of applying all possible optimizations before instrumentation was adopted in SoftBound [144] and others [145, 148, 190, 221].

We investigated the high runtime overhead of icc-mpx and gcc-mpx. By scrutinizing the order of the applied compiler passes in gcc-mpx and icc-mpx, we noticed that they perform MPX instrumentation first, thus preventing other optimizations. For example, gcc-mpx's instrumentation happens very early in the compiler pass order, i.e., the 12th among 174 passes, and the passes before the instrumentation are not actually optimization passes. Therefore, all the optimizations can be significantly restricted; e.g., dead code elimination can be suppressed by the inserted MPX bounds checking code.

Figure 5.5 highlights the impact of the optimize-before-instrument convention on the performance overhead of llvm-mpx; each bar represents the average overhead over all SPEC 2006 applications, normalized to the baseline with no spatial safety support. By following the convention, our llvm-mpx incurs a 1.26x slowdown (6th bar: O2→LTO→Inst). When we instrument before optimizations, as gcc-mpx and icc-mpx do, the overhead increases significantly, to a 3.35x slowdown (4th bar: Inst→O2→LTO).

We also found that Oleksenko et al. [154] did not apply LTO to either gcc-mpx or icc-mpx. However, it turns out that applying LTO after the instrumentation does not improve performance significantly, as confirmed by the small height gap between the 3rd (Inst→O2) and the 4th (Inst→O2→LTO) bars in Figure 5.5. We believe the same phenomenon would be observed in gcc-mpx and icc-mpx because LTO is restricted anyway by the inserted MPX bounds checks. Thus, we conclude that the poor performance of gcc-mpx and icc-mpx is mainly due to their nonconformance to the optimize-before-instrument convention.

³For soplex, we manually modified the pointer manipulations that violate the standard memory model, and made their bounds checking always succeed. See the discussion in [154], Section 4.4.

Figure 5.6: Performance overhead on SPEC CPU 2006 (DangSan, llvm-mpx, BOGO (static), and BOGO; normalized to the baseline).

Figure 5.7: BOGO performance overhead breakdown (time spent in PartialScan, PageFaultScan, and FullScan).

Application       Partial     Full   Page Fault
400.perlbench     816926.10   0.49   2875.50
401.bzip2         0.34        0      0.04
403.gcc           29474.74    0.25   201114.61
429.mcf           0.03        0      18.30
433.milc          18.41       0      0.01
445.gobmk         1808.81     0      1386.05
456.hmmer         4426.50     0      0.28
458.sjeng         0.01        0      0.01
462.libquantum    0.45        0      0.01
464.h264ref       514.04      0      599.12
470.lbm           0.03        0      0.01
444.namd          4.74        0      0.15
447.dealII        770810.54   9.63   0.05
450.soplex        1432.80     0      641.12
453.povray        17648.83    0      5318.45
471.omnetpp       73419.43    0.35   214350.09
473.astar         13613.71    0.10   1321.86
482.sphinx3       38908.55    0.58   985.62
483.xalancbmk     607803.72   0.44   0.25

Table 5.3: Frequency of Partial, Full, and Page Fault Scans (per second). The sum of Partial and Full Scans represents free frequency.

5.8.2 Security Evaluation

llvm-mpx's Spatial Memory Safety. BOGO's ability to detect use-after-free hinges on the underlying spatial memory safety solution, so it is critical to validate that llvm-mpx is sound. For a fair and accurate comparison, we picked the LLVM-based SoftBound [144] as the baseline, instead of comparing across different compilers (e.g., llvm-mpx vs. gcc-mpx). To this end, we collected the number of dynamic bounds checks performed on a subset of the tested applications with reference input; the open source version of SoftBound [144] currently works for only 6 SPEC applications, all of which we tested. We confirmed that llvm-mpx performs a higher number of bounds checks than SoftBound because the former supports more instructions (Section 5.7.1). Oleksenko et al. [154] also report that gcc-mpx leads to a much higher instruction count than icc-mpx (∼3x vs. ∼1.5x; see their Figure 10). To further validate llvm-mpx, we tested real-world applications with buffer overflow bugs.⁴ As shown in Table 5.2, the first 5 cases are from BugBench [129], followed by 9 CVEs. llvm-mpx detected all of them without false positives. Note that llvm-mpx also detected known bugs in SPEC [28].

Based on the above validation steps, we conclude that our llvm-mpx implementation is credible and thus able to serve as a solid basis for BOGO's temporal safety enforcement.

⁴RIPE [212] is not used as it does not support 64-bit systems.

BOGO’s Temporal Memory Safety. We empirically evaluated BOGO’s implementation for temporal memory safety. First, we inspected BOGO’s detection capability for 32 cases from NIST/Juliet (CWE416, Use-After-Free) [8], as listed in Table 5.2. BOGO soundly detected them all. Second, BOGO detected all use-after-free vulnerabilities in 4 tested CVEs.

Figure 5.8: Performance overhead of full memory safety solutions (SoftBound+CETS, ASan, and BOGO; normalized).

Figure 5.9: Memory overhead (DangSan, llvm-mpx, and BOGO; normalized).

Figure 5.10: Sensitivity study: varying HPQ size (16 to 1024) with a fixed-size FAQ.

Figure 5.11: Sensitivity study: varying FAQ size (2047 to 262143) with a fixed-size HPQ.

5.8.3 SPEC CPU 2006 Benchmark

Performance Overhead.

Fig. 5.6 shows the performance overhead normalized to the baseline without memory safety. For each application, there are four bars to compare: DangSan (temporal-only), llvm-mpx (spatial-only), BOGO (static), and BOGO. The first bar is for DangSan, the state-of-the-art temporal-only memory safety solution; it does not support use-after-return. We evaluated it using the open source version from GitHub [18]. On average, DangSan incurs a 1.41x slowdown. The second bar is for llvm-mpx, which incurs a 1.26x slowdown on average. The next two bars show the performance overhead of BOGO (static) and BOGO, which provide both spatial (via llvm-mpx) and temporal memory safety. The third bar, BOGO (static), reports the best per-application result selected by profiling with varying HPQ sizes (the sensitivity study on HPQ is shown in Section 5.8.3). The last bar, BOGO, shows that its scan-cost-based dynamic adaptation of the HPQ size (Section 5.5.3) outperforms the best static configuration, i.e., BOGO (static). We observed significant improvements for perlbench and gcc, where BOGO dynamically adapts to phase changes in runtime behavior. Note that the dynamic adaptation scheme can offer similar or better performance without prior profiling or knowledge of program behavior. On average, BOGO incurs a 1.6x slowdown for spatial and temporal safety.

We then present a detailed performance overhead analysis for BOGO. First, Table 5.3 shows the frequencies of PartialScan, FullScan, and PageFaultScan (per second). Overall, the frequency of FullScan is very small, which shows the benefit of the PartialScan+PageFaultScan combination. We found that a naive FullScan-only approach incurs more than a 10x slowdown (not shown in Fig. 5.6). Note that the sum of the Partial and Full Scan frequencies represents the free frequency, which varies significantly across applications (up to 817K/sec). This implies that the SPEC benchmarks cover a broad spectrum of deallocation behaviors, which affect BOGO's performance overhead. Later, we stress-test BOGO with malloc/free-intensive benchmarks (Section 5.8.4) and evaluate it with real-world applications (Section 5.8.5).

Fig. 5.7 reports the breakdown of BOGO's performance overhead into the time spent in the three scans. Two applications incur relatively high BOGO overhead: gcc and omnetpp. It turns out that both have frequent Partial and Page Fault Scans, as shown in Table 5.3. In general, applications with higher scan frequencies (e.g., perlbench, xalancbmk) incur higher overhead than the others. gcc and omnetpp also suffer from scanning larger datasets in the HPQ and FAQ. According to Fig. 5.7, gcc and omnetpp spend roughly equal time on PartialScan and PageFaultScan, showing the effectiveness of the dynamic HPQ scheme for applications with such high overhead.

Other Full Memory Safety Techniques

Overheads would add up when combining a temporal-only memory safety solution (e.g., DangSan) with a separate spatial memory safety solution (e.g., llvm-mpx). As shown in Fig. 5.6, DangSan and llvm-mpx incur 1.41x and 1.26x slowdowns, respectively. When combined, the total runtime overhead would be similar to BOGO's (1.6x), yet BOGO is more memory efficient (Section 5.8.3).

SoftBound+CETS [145, 148] keeps separate per-pointer metadata for spatial memory safety (SoftBound) and temporal memory safety (CETS). Fig. 5.8 highlights the performance of BOGO compared to other full memory safety solutions on 6 of the 19 SPEC 2006 applications; the latest open source version of SoftBound+CETS [24] is broken for the remaining 13. For those 6 applications, BOGO (1.25x slowdown) significantly outperforms SoftBound+CETS (1.94x). Their paper reports a 1.75x overhead for 9/19 SPEC 2006 and 8/16 SPEC 2000 applications, so the open source version may be less optimized. Although SoftBound+CETS supports use-after-return detection, we disabled it for a fair comparison. AddressSanitizer [181] maintains per-object metadata. Its paper reports a 1.73x slowdown for SPEC 2006, higher than BOGO (1.6x). When we ran it, with use-after-return detection disabled, on the same 6 applications in Fig. 5.8, it incurred a 1.57x slowdown, still higher than BOGO (1.25x) but lower than SoftBound+CETS (1.94x). AddressSanitizer quarantines freed memory and defers actual reclamation to support use-after-free detection, causing memory bloat. To avoid high memory overhead, it performs the actual free periodically, thereby sacrificing soundness.

Memory Overhead

Figure 5.9 shows the memory usage of DangSan (temporal-only), llvm-mpx (spatial-only), and BOGO, measured as the average resident memory (VmRSS) and normalized to the baseline without memory safety. On average, DangSan incurs a 2.84x memory overhead, slightly higher than what its paper reports (2.4x) [199], while llvm-mpx and BOGO incur 1.16x and 1.17x, respectively. Thus, BOGO adds very little memory overhead for full memory safety. When DangSan is combined with MPX for full safety, the total space overhead would add up. For omnetpp, DangSan suffers from a 186.62x (134.65x according to the paper [199]) memory overhead, while BOGO incurs only 1.92x, compared to llvm-mpx's 1.91x. Although DangSan maintains a huge pointer graph for performance, it still incurs a 7x slowdown for omnetpp, as shown in Fig. 5.6.

Sensitivity Study

Dynamic adaptation is disabled in this study. Figures 5.10 and 5.11 show the performance sensitivity with respect to the sizes of HPQ and FAQ, respectively. Figure 5.10 shows the result of varying the HPQ size with a fixed-size (65535) FAQ. As discussed in Section 5.5.2, the size of HPQ affects PartialScan and PageFaultScan in opposing directions, and each application favors a different HPQ size. For example, omnetpp prefers a small HPQ, mcf favors a larger HPQ, and gcc works best with a middle-sized HPQ. This justifies BOGO's dynamic HPQ size adaptation, and Fig. 5.6 confirms its effectiveness. On the other hand, a bigger FAQ is in general preferable as it reduces the number of FullScans. For example, omnetpp in Figure 5.11 clearly favors a larger FAQ. We found that the remaining applications are not very sensitive to the size of FAQ, though there is some fluctuation. For this reason, BOGO does not currently adjust the FAQ size on the fly.

Figure 5.12: Malloc-free benchmark performance. The bar graph shows the normalized overhead (left y-axis); the line graph shows the free frequency (right y-axis).

Application     LOC      Test method                                Free Freq.
aget            1K       download 4GB file, 8 threads               0.45
pfscan          2K       search 4GB file, 8 threads                 27
pbzip2          6K       compress 4GB file, 8 threads               5,280
transmission    116K     download 3.8GB file                        13,354
memcached       18K      YCSB [61], workload ABCDEF                 0.35
cherokee        102K     ab [10], 8 conc. clients, 100K req.        27,509
nginx           166K     ab [10], 8 conc. clients, 100K req.        115,113
apache          270K     ab [10], 8 conc. clients, 100K req.        486
mysql           1,473K   sysbench [29], 8 conc. clients, 100K req.  677,774

Table 5.4: Real-world application test methods. The top four applications are utilities/clients, while the bottom five are servers.

Figure 5.13: Real-world application performance.

5.8.4 Malloc/Free Benchmark

Stress-testing BOGO with malloc/free-intensive applications [110, 153] shows higher runtime overhead (Fig. 5.12) than SPEC: on average, a 2.7x slowdown for BOGO, and 1.27x for llvm-mpx alone. The reason is two-fold: (1) the huge number of malloc/free calls (up to 5.8M/s) puts significant pressure on BOGO's page scan mechanism; and (2) the execution time of these applications is very short (less than 2 seconds) even with the largest input, and the majority of it is spent allocating/deallocating numerous objects. Thus, BOGO's activity ends up taking a significant portion. However, except for perl (5.8M frees/s) and espresso (3.9M frees/s), the overhead of the remaining applications is under 80%, and the overhead BOGO adds on top of llvm-mpx is only 34%.

5.8.5 Real-World Applications

We evaluated BOGO with 9 real-world applications using the test cases listed in Table 5.4.

The five servers, including apache and mysql, are set up with the default configuration. While nginx is a single-threaded multi-process server, the other 8 applications are multithreaded. As discussed in Section 5.6, BOGO does not guarantee soundness for multithreaded applications, as with prior work [144, 145, 154, 199]; thus, this experiment serves performance evaluation and compatibility demonstration purposes. We note that all instrumented applications behave correctly. Despite no soundness guarantee, as reported in Section 5.8.2, BOGO (llvm-mpx) could detect a buffer overflow bug in memcached-1.4.5. Given the workloads, the free frequency varies up to 678K per second. As shown in Fig. 5.13, the runtime overhead of BOGO ranges from 1x to 1.83x, with a geomean of 1.34x, which is less than that of the more CPU-intensive and malloc/free-frequent SPEC applications.

5.9 Summary

In this chapter, we tackle the space overhead problem of dynamic memory safety bug detectors (PS2). We present BOGO, which seamlessly adds temporal memory safety on top of the spatial memory safety of Intel MPX. BOGO scans bound metadata to find dangling pointers, invalidates their bounds, and detects temporal memory safety bugs as spatial safety bugs. This frees BOGO from maintaining separate metadata for temporal memory safety, saving both runtime and space overhead (TS2). Our evaluation shows that BOGO supports full memory safety at comparable runtime overhead and much lower memory overhead than other state-of-the-art solutions. BOGO was published in [232].

Chapter 6

A Permission Check Analysis Framework for Linux Kernel

So far, we have discussed how to improve the run-time and space overheads of dynamic bug detectors. In this chapter, we focus on static bug detectors and propose to use domain-specific knowledge to improve both their scalability and precision (PS3). In particular, we study permission check bugs in the Linux kernel and apply common programming patterns to kernel analysis in order to achieve scalability and precision.

Permission checks play an essential role in operating system security by providing access control to privileged functionalities. However, it is particularly challenging for kernel developers to correctly apply new permission checks and to scalably verify the soundness of existing checks, due to the large code base and complexity of the kernel. In fact, the Linux kernel contains millions of lines of code with hundreds of permission checks, and even worse, its complexity is growing fast.

In this chapter, we present PeX, a static Permission check error detector for LinuX, which takes kernel source code as input and reports any missing, inconsistent, or redundant permission checks. PeX uses KIRIN (Kernel InteRface based Indirect call aNalysis), a novel, precise, and scalable indirect call analysis technique that leverages the common programming paradigm used in kernel abstraction interfaces. Over the interprocedural control flow graph built by KIRIN, PeX automatically identifies all permission checks and infers the mappings


between permission checks and privileged functions. For each privileged function, PeX examines all possible paths to it to check whether the necessary permission checks are correctly enforced before it is called.

We evaluated PeX on the latest stable Linux kernel v4.18.5 for three types of permission checks: Discretionary Access Controls (DAC), Capabilities, and Linux Security Modules (LSM). PeX reported 36 new permission check errors, 14 of which have been confirmed by the kernel developers.

6.1 Introduction

Access control [174] is an essential security enforcement scheme in operating systems. It assigns users (or processes) different access rights, called permissions, and enforces that only those who have the appropriate permissions can access critical resources (e.g., files, sockets). In the kernel, access control is often implemented in the form of permission checks placed before the use of privileged functions that access the critical resources.

Over the course of its evolution, the Linux kernel has employed three different access control models: Discretionary Access Controls (DAC), Capabilities, and Linux Security Modules (LSM). DAC distinguishes privileged users (a.k.a. root) from unprivileged ones. The unprivileged users are subject to various permission checks, while the root bypasses them all [16]. Linux kernel v2.2 divided the root privilege into small units and introduced Capabilities to allow more fine-grained access control. From kernel v2.6, Linux adopted LSM, in which various security hooks are defined and placed on critical paths of privileged operations. These security hooks can be instantiated with custom checks, facilitating different security model implementations such as SELinux [187] and AppArmor [11].

Unfortunately, whenever a new feature or vulnerability is found, these access controls have been applied to the Linux kernel code in an ad-hoc manner, leading to missing, inconsistent, or redundant permission checks. Given the ever-growing complexity of the kernel code, it is becoming harder to manually reason about the mapping between permission checks and privileged functions. In reality, kernel developers rely on their own judgment to decide which checks to use, often leading to over-approximation issues. For instance, Capabilities were originally introduced to solve the “super” root problem, but it turns out that more than 38% of Capability checks indeed check CAP_SYS_ADMIN, rendering it yet another root [15].

Even worse, there is no systematic, sound, and scalable way to examine whether all privileged functions (via all possible paths) are indeed protected by correct permission checks. The lack of tools for checking the soundness of existing or new permission checks can jeopardize kernel security, putting the privileged functions at risk. For example, DAC, Capabilities, and LSM introduce hundreds of security checks scattered over millions of lines of kernel code, and it is an open problem to verify whether all code paths to a privileged function encounter its corresponding permission check before reaching the function. Given the distributed nature of kernel development and the significant amount of daily updates, chances are that some parts of the code may miss checks on some paths or introduce inconsistencies between checks, weakening operating system security.

We designed PeX, a static permission check analysis framework for Linux kernel. Our approach makes it possible to soundly and scalably detect any missing, inconsistent, and redundant permission checks in the kernel code. At a high level, it statically explores all possible program paths from user-entry points (e.g., system calls) to privileged functions and detects permission check errors therein. Suppose it finds a path in which a privileged function, say PF, is protected (preceded) by a check, say Chk. If any other path to PF bypasses Chk, then that is a strong indication of a missing check. Similarly, it can detect inconsistent and redundant permission checks. While conceptually simple, it is very challenging to realize sound and precise permission check error detection at the scale of the Linux kernel.

In particular, there are two daunting challenges to address. First, Linux kernel uses indirect calls very frequently, yet static call graph analysis of them is notoriously difficult. The latest Linux kernel (v4.18.5) contains 15.8M LOC, 247K functions, and 115K indirect callsites, rendering existing precise solutions (e.g., SVF [189]) unscalable. The only workaround available to date is either to apply the solutions unsoundly (e.g., only on a small code partition, as in K-Miner [82]) or to rely on naive imprecise solutions (e.g., type-based analysis). Either way leads to undesirable results: false negatives (K-Miner) or false positives (the type-based analysis).

For precise and scalable indirect call analysis, we introduce a novel solution called KIRIN (Kernel InteRface based Indirect call aNalysis), which leverages kernel abstraction interfaces to resolve indirect calls both precisely and scalably. Our experiment with Linux v4.18.5 shows that KIRIN allows us to detect many previously unknown permission check bugs, while other existing solutions either miss many of them or introduce too many false warnings.

Second, unlike Android, which was designed with the permission-based security model in mind [9], Linux kernel does not document the mapping between a permission check and a privileged function. More importantly, the huge Linux kernel code base makes it practically impossible to review them all manually for permission check verification.

To tackle this problem, we present a new technique which takes as input a small set of known permission checks and automatically identifies all other permission checks, including their wrappers. Moreover, our dominator analysis [140] automates the process of identifying mappings between permission checks and their potentially privileged functions as well. Our experiment with Linux kernel v4.18.5 shows that, starting from a small set of well-known checks (3 DAC, 3 Capabilities, and 190 LSM checks), our approach automatically (1) identifies 19, 16, and 53 additional checks, respectively, and (2) derives 9243 pairs of permission checks and privileged functions.

This work makes the following contributions:

• New Techniques: We proposed and implemented PeX, a static permission check analysis framework for Linux kernel. We also developed new techniques that can perform scalable indirect call analysis and automate the process of identifying permission checks and privileged functions.

• Practical Impacts: We analyzed DAC, Capabilities, and LSM permission checks in the latest Linux kernel v4.18.5 using PeX, and discovered 36 new permission check bugs, 14 of which have been confirmed by kernel developers.

• Community Contributions: We will release PeX as an open source project, along with the identified mapping between permission checks and privileged functions. This will allow kernel developers to validate their code with PeX and to contribute to PeX by refining the mappings with their own domain knowledge.

6.2 Examples of Permission Check Errors

This section illustrates different kinds of permission check errors found by PeX and confirmed by the Linux kernel developers. We refer to those functions that validate whether a process (a user or a group) has proper permission to perform certain operations as permission checks. Similarly, we define privileged functions to be those functions which only a privileged process can access and which thus require permission checks.

1 int scsi_ioctl(struct scsi_device *sdev, int cmd, void __user *arg)
2 {
3   ...
4   case SCSI_IOCTL_SEND_COMMAND:
5     if (!capable(CAP_SYS_ADMIN) || !capable(CAP_SYS_RAWIO))
6       return -EACCES;
7     return sg_scsi_ioctl(sdev->request_queue, NULL, 0, arg);
8   ...
9 }

(a) sg_scsi_ioctl (Line 7) is called with CAP_SYS_ADMIN and CAP_SYS_RAWIO capability checks (Line 5). arg is user space controllable.

1 int scsi_cmd_ioctl(struct request_queue *q, ..., void __user *arg)
2 {
3   ...
4   case SCSI_IOCTL_SEND_COMMAND:
5     ...
6     if (!arg)
7       break;
8     err = sg_scsi_ioctl(q, bd_disk, mode, arg);
9     break;
10   ...
11   return err;
12 }

(b) sg_scsi_ioctl (Line 8) is called without capability checks. arg is user space controllable.

1 int sg_scsi_ioctl(struct request_queue *q, struct gendisk *disk,
      fmode_t mode, struct scsi_ioctl_command __user *sic)
2 {
3   ...
4   err = blk_verify_command(req->cmd, mode);
5   ...
6   return err;
7 }
8
9 int blk_verify_command(unsigned char *cmd, fmode_t mode)
10 {
11   ...
12   if (capable(CAP_SYS_RAWIO))
13     return 0;
14   ...
15   return -EPERM;
16 }

(c) sg_scsi_ioctl calls blk_verify_command, which checks the CAP_SYS_RAWIO capability.

Figure 6.1: Capability check errors discovered by PeX.

6.2.1 Capability Permission Check Errors

Figure 6.1 shows real code snippets of Capability permission check errors in Linux kernel v4.18.5. Figure 6.1a shows the kernel function scsi_ioctl, in which sg_scsi_ioctl (Line 7) is safeguarded by two Capability checks, CAP_SYS_ADMIN and CAP_SYS_RAWIO (Line 5). In contrast, scsi_cmd_ioctl in Figure 6.1b calls the same function sg_scsi_ioctl (Line 8) without any Capability check. These two functions share three similarities. First, both are reachable from the userspace via the ioctl system call. Second, both call sg_scsi_ioctl with a userspace parameter, void __user *arg. Last, there is no preceding Capability check on the paths to them (though scsi_ioctl itself performs two checks).

The kernel is supposed to sanitize userspace inputs and check permissions to ensure that only users with appropriate permissions can conduct certain privileged operations. As SCSI (Small Computer System Interface) functions manipulate the hardware, they should be protected by Capabilities. At first glance, scsi_ioctl seems to be correctly protected (while scsi_cmd_ioctl misses two Capability checks).

However, delving into sg_scsi_ioctl leads to a different conclusion. As shown in Figure 6.1c, sg_scsi_ioctl calls blk_verify_command, which in turn checks CAP_SYS_RAWIO. Considering all of this together, scsi_ioctl checks CAP_SYS_ADMIN once but CAP_SYS_RAWIO “twice”, leading to a redundant permission check. On the other hand, scsi_cmd_ioctl checks only CAP_SYS_RAWIO, resulting in a missing permission check for CAP_SYS_ADMIN. In particular, PeX detects this bug as an inconsistent permission check because the two paths disagree with each other; further investigation shows that one check is redundant and the other is missing.

1 static int do_readlinkat(int dfd, const char __user *pathname,
      char __user *buf, int bufsiz)
2 {
3   ...
4   error = security_inode_readlink(path.dentry);
5   if (!error) {
6     touch_atime(&path);
7     error = vfs_readlink(path.dentry, buf, bufsiz);
8   }
9   ...
10 }

(a) Kernel LSM usage in the system call readlinkat. vfs_readlink (Line 7) is protected by security_inode_readlink (Line 4). Both pathname and buf (Line 1 and Line 7) are user controllable.

1 int ksys_ioctl(unsigned int fd, unsigned int cmd, unsigned long arg)
2 {
3   ...
4   error = security_file_ioctl(f.file, cmd, arg);
5   if (!error)
6     error = do_vfs_ioctl(f.file, fd, cmd, arg);
7   ...
8 }
9
10 int xfs_readlink_by_handle(struct file *parfilp, xfs_fsop_handlereq_t *hreq)
11 {
12   ...
13   error = vfs_readlink(dentry, hreq->ohandle, olen);
14   ...
15 }

(b) Kernel LSM usage in the system call ioctl. It calls security_file_ioctl (Line 4) to protect do_vfs_ioctl (Line 6). hreq->ohandle and olen are also user controllable.

Figure 6.2: LSM check errors discovered by PeX.

6.2.2 LSM Permission Check Errors

The example of LSM permission check errors is related to how LSM hooks are instrumented for the two different system calls readlinkat and ioctl.

Figure 6.2a shows the LSM usage in the readlinkat system call. On its call path, vfs_readlink (Line 7) is protected by the LSM hook security_inode_readlink (Line 4) so that an LSM-based MAC mechanism, such as SELinux or AppArmor, can be realized to allow or deny the vfs_readlink operation.

Figure 6.2b presents two sub-functions of the system call ioctl. Similar to the above case, ioctl calls ksys_ioctl, which includes its own LSM hook security_file_ioctl (Line 4) before do_vfs_ioctl (Line 6). This is proper design, and there is no problem so far. However, it turns out that there is a path from do_vfs_ioctl to xfs_readlink_by_handle (Line 10), which eventually calls the same privileged function vfs_readlink (see Line 7 in Figure 6.2a and Line 13 in Figure 6.2b). While this function is protected by the security_inode_readlink LSM hook in readlinkat, that is not the case for the path to the function going through xfs_readlink_by_handle. The problem is that SELinux maintains separate ‘allow’ rules for read and ioctl. With the security_inode_readlink LSM check missing, a user with only the ‘ioctl allow rule’ may exploit the ioctl system call to trigger the vfs_readlink operation, which should only be permitted by the separate ‘read allow rule’.

The above Capability and LSM examples show how challenging it is to ensure correct permission checks. There are no tools available for kernel developers to rely on to figure out whether a particular function should be protected by a permission check, and, if so, which permission checks should be used.

6.3 Challenges

This section discusses two critical challenges in designing a static analysis for detecting permission check errors in Linux kernel.

6.3.1 Indirect Call Analysis in Kernel

The first challenge lies in the frequent use of indirect calls in Linux kernel and the difficulty of statically analyzing them in a scalable and precise manner. To achieve a modular design, the kernel provides a diverse set of abstraction layers that specify common interfaces to different concrete implementations. For example, the Virtual File System (VFS) [31] abstracts a

1 struct file_operations {
2   ...
3   ssize_t (*read_iter) (struct kiocb *, struct iov_iter *);
4   ssize_t (*write_iter) (struct kiocb *, struct iov_iter *);
5   ...
6 }

(a) The Virtual File System (VFS) kernel interface.

[Figure: a user-space write(fd, buffer, count) issues syscall(1, fd, buffer, count); in kernel space, the syscall dispatcher invokes SyS_write(fd, buffer, count), which calls vfs_write(fd.file, buffer, count, fd.pos), which makes the indirect call file->f_op->write_iter(kio, iter). The call dispatches to one of the per-file-system instantiations:
  const struct file_operations ext4_file_operations { ..., .read_iter = ext4_file_read_iter, .write_iter = ext4_file_write_iter, ... }
  const struct file_operations nfs_file_operations { ..., .read_iter = nfs_file_read, .write_iter = nfs_file_write, ... }]

(b) VFS indirect calls in Linux kernel.

Figure 6.3: Indirect call examples via the VFS kernel interface.

file system, thereby providing a unified and transparent way to access local (e.g., ext4) and network (e.g., nfs) storage devices. Under this kernel programming paradigm, an abstraction layer defines an interface as a set of indirect function pointers, while a concrete module initializes these pointers with its own implementations. For example, as shown in Figure 6.3a, VFS abstracts all file system operations in a kernel interface struct file_operations that contains a set of function pointers for different file operations. When a file system is initialized, it fills the VFS interface with the concrete function addresses of its own implementations. For instance, Figure 6.3b shows that the ext4 file system sets the write_iter function pointer to ext4_file_write_iter, while nfs sets the pointer to nfs_file_write.

However, the kernel's large code base challenges the resolution of these numerous function pointers within kernel interfaces. For example, the kernel used in our evaluation (v4.18.5) includes 15.8M LOC, 247K functions, and 115K indirect callsites. This huge code base makes existing precise pointer analysis techniques [89, 90, 91, 159, 189] unscalable. In fact, Static Value Flow (SVF) [189], the state-of-the-art analysis that uses flow- and context-sensitive value flow for high precision, failed to scale to the huge Linux kernel. That is because SVF is essentially a whole-program analysis, and its indirect call resolution thus requires tracking all objects such as functions and variables, making the value flow analysis unscalable to the large Linux kernel. In our experiment running SVF on the kernel on a machine with 256GB of memory, SVF crashed with an out-of-memory error.1

Alternatively, one may opt for a simple “type-based” function pointer analysis, which does scale to Linux kernel. However, type-based indirect call analysis suffers from serious imprecision with too many false targets, because function pointers in the kernel often share the same type. For example, in Figure 6.3a, the two function pointers read_iter and write_iter share the same function type. A type-based pointer analysis would even falsely link write_iter to ext4_file_read_iter, which may lead to false permission check warnings.

PeX addresses this problem with a new kernel-interface aware indirect call analysis technique, detailed in Section 6.4.

1 SVF internally uses LLVM SparseVectors to save memory by storing only the set bits. However, it still blows up in both memory and computation time due to the expensive insert, expand, and merge operations.

6.3.2 The Lack of Full Permission Checks, Privileged Functions, and Their Mappings

The second challenge lies in soundly enumerating a set of permission checks and inferring correct mappings between permission checks and privileged functions in Linux kernel.

Though some commonly used permission checks for DAC, Capabilities, and LSM are known (Table 2.1), kernel developers often devise custom permission checks (wrappers) that internally use basic permission checks. Unfortunately, the complete list of such permission checks has never been documented. For example, ns_capable is a commonly used permission check for Capabilities, but it calls ns_capable_common and security_capable in sequence. It is the last, security_capable, that performs the actual capability check. In other words, all the others are “wrappers” of the “basic” permission check security_capable. This example shows how hard it is for a permission check analysis tool to keep up with all permission checks.

To make matters worse, Linux kernel has no explicit documentation that specifies which privileged function should be protected by which permission checks. This is different from Android [9], which has been designed with the permission-based security model in mind from the beginning. Take the Android LocationManager class as an example: for the getLastKnownLocation method, the API document states explicitly that the permission ACCESS_COARSE_LOCATION or ACCESS_FINE_LOCATION is required [22].

Unfortunately, existing static permission error checking techniques are not readily applicable to these problems. Automated LSM hook verification [193] works only with clearly defined LSM hooks and would miss many wrappers in the kernel setting. Many other tools require heavy manual effort, such as user-provided security rules [80, 234], authorization constraints [142], or annotations on sensitive objects [81]. These manual processes are particularly error-prone when applied to the huge Linux code base. Alternatively, some works, such as [74, 141], rely on dynamic analysis. However, such run-time approaches may significantly limit the code coverage being analyzed, thereby missing real bugs.

Moreover, none of the above existing works can detect permission checks soundly. Their inability to recognize permission checks or wrappers leads to missing privileged functions or false warnings for those that are in fact protected by wrappers. Since the huge Linux kernel code base makes it practically impossible to review them all manually, reasoning about the mapping is a daunting challenge.

In light of this, PeX presents a novel static analysis technique that takes as input a small set of known permission checks to identify their basic permission checks and leverages them as a basis for finding other permission check wrappers (Section 6.5.2). In addition, PeX proposes a dominator analysis based solution to automatically infer the mappings between permission checks and privileged functions (Section 6.5.3).

6.4 KIRIN Indirect Call Analysis

PeX proposes a precise and scalable indirect call analysis technique, called KIRIN (Kernel InteRface based Indirect call aNalysis), built on top of the LLVM [115] framework. KIRIN is inspired by two key observations: (1) almost all (95%) indirect calls in the Linux kernel originate from kernel interfaces (Section 6.3.1), and (2) the type of a kernel interface is preserved both at its initialization site (where a function pointer is defined) and at the indirect callsite (where a function pointer is used) in LLVM IR. For example, in Fig. 6.3b, the kernel interface object ext4_file_operations of the type struct file_operations is statically initialized, with ext4_file_write_iter assigned to the field write_iter. At the indirect callsite file->f_op->write_iter, one can identify that f_op is of the type struct file_operations and infer that ext4_file_write_iter is one of the potential call targets. Based on this observation, PeX first collects indirect call targets at kernel interface initialization sites (Section 6.4.1) and then resolves them at indirect callsites (Section 6.4.2).

6.4.1 Indirect Call Target Collection

In Linux kernel, a kernel interface is often defined as a C struct comprised of function pointers (Section 6.3.1), e.g., struct file_operations in Fig. 6.3a. Many kernel interfaces (C structs) are statically allocated and initialized, as with ext4_file_operations and nfs_file_operations in Fig. 6.3b. Other interfaces may be dynamically allocated and initialized at run time for reconfiguration.

For the former, KIRIN scans all Linux kernel code linearly to find all statically allocated and initialized struct objects with function pointer fields. Then, for each struct object, KIRIN keeps track of which function address is assigned to which function pointer field, using the field's offset as its key. For instance, Fig. 6.4a shows the LLVM IR of the statically initialized ext4_file_operations. KIRIN finds that the kernel interface type is struct file_operations (Line 1) and that ext4_file_write_iter is assigned to the 5th field, write_iter (Line 7). Therefore, KIRIN determines that write_iter may point to ext4_file_write_iter, but not to ext4_file_read_iter (even though they have the same function type).

For the remaining, dynamically initialized kernel interfaces, KIRIN performs a data flow analysis to collect any assignment of a function address to a function pointer inside a kernel interface. KIRIN's field-sensitive analysis allows the collected targets to be associated with the individual fields of a kernel interface.

1 @ext4_file_operations = dso_local local_unnamed_addr constant %struct.file_operations {
2   %struct.module* null,
3   i64 (%struct.file*, i64, i32)* @ext4_llseek,
4   i64 (%struct.file*, i8*, i64, i64*)* null,
5   i64 (%struct.file*, i8*, i64, i64*)* null,
6   i64 (%struct.kiocb*, %struct.iov_iter*)* @ext4_file_read_iter,
7   i64 (%struct.kiocb*, %struct.iov_iter*)* @ext4_file_write_iter,

(a) LLVM IR of ext4_file_operations initialization.

1 %25 = load %struct.file_operations*, %struct.file_operations** %f_op, align 8
2 %write_iter.i.i = getelementptr inbounds %struct.file_operations, %struct.file_operations* %25, i64 0, i32 5
3 %26 = load i64 (%struct.kiocb*, %struct.iov_iter*)*, i64 (%struct.kiocb*, %struct.iov_iter*)** %write_iter.i.i, align 8
4 %call.i.i = call i64 %26(%struct.kiocb* nonnull %kiocb.i, %struct.iov_iter* nonnull %iter.i) #10

(b) LLVM IR of the callsite file->f_op->write_iter in vfs_write.

Figure 6.4: Indirect callsite resolution for vfs_write.

6.4.2 Indirect Callsite Resolution

KIRIN stores the result of the above first pass in a key-value map in which the key is a pair of a kernel interface type and an offset (a field), and the value is a set of call targets. At each indirect callsite, KIRIN retrieves the type of the kernel interface and the offset from the LLVM IR, looks up the map using them as a key, and obtains the matched call targets. For example, Fig. 6.4b shows the LLVM IR snippet in which an indirect call file->f_op->write_iter is made inside vfs_write. When the indirect call is made (Line 4), KIRIN finds that the kernel interface type is struct file_operations (Line 1) and the offset is 5 (Line 2). In this way, KIRIN reports that ext4_file_write_iter (assigned at Line 7 in Fig. 6.4a) is one of the potential call targets indirectly called by dereferencing write_iter.

When applying KIRIN to Linux kernel, we found that at certain callsites the kernel interface

1 struct usb_driver *driver = container_of(intf->dev.driver,
      struct usb_driver, drvwrap.driver);
2 retval = driver->unlocked_ioctl(intf, ctl->ioctl_code, buf);

(a) C code of a container_of usage, followed by an indirect call.

1 #define container_of(ptr, type, member) ({ \
2   void *__mptr = (void *)(ptr); \
3   ((type *)(__mptr - offsetof(type, member))); })
4 %unlocked_ioctl = getelementptr inbounds i8*, i8** %add.ptr76, i64 3

(b) Original container_of and the LLVM IR for the callsite.

1 #define container_of(ptr, type, member) ({ \
2   type *__res; \
3   void *__mptr = ((void *)((void *)(ptr) - offsetof(type, member))); \
4   memcpy(&__res, &__mptr, sizeof(void *)); \
5   (__res); })

6 %unlocked_ioctl = getelementptr inbounds %struct.usb_driver, %struct.usb_driver* %20, i64 0, i32 3

(c) Modified container_of and the LLVM IR for the callsite.

Figure 6.5: Fixing the container_of missing struct type problem.

type is not present in the LLVM IR, making their resolution impossible. For example, the macro container_of is commonly used to obtain the starting address of a struct object from a pointer to one of its member fields. Fig. 6.5a shows an example use of container_of (Line 1): it calculates the starting address of usb_driver from its member drvwrap.driver. Based on that address, the code at Line 2 makes an indirect call through the function pointer unlocked_ioctl, another member of the struct usb_driver object.

Fig. 6.5b shows the original container_of macro (Lines 1-3) and the resulting LLVM IR (Line 4). The problem with this macro is that it involves a pointer manipulation whose LLVM IR loses the struct type information, i.e., the second argument of the macro. To solve this problem, KIRIN redefines container_of in a way that preserves the struct type in the LLVM IR (on which KIRIN works), as in Fig. 6.5c (Lines 1-5). This adds back the kernel interface type struct.usb_driver in the LLVM IR (Line 6), thereby enabling KIRIN to infer the correct type of driver and resolve the targets of unlocked_ioctl.

Our experiment (Section 6.6.2) shows that KIRIN resolves 92% of the total indirect callsites for allyesconfig. As a result, PeX constructs a more sound (fewer missing edges) and more precise (fewer false edges) call graph than other existing workarounds (e.g., [82]).

[Figure: the PeX pipeline. Kernel source (IR) feeds KIRIN indirect call pointer analysis, then call graph generation and ICFG partitioning, then privileged function detection, a non-privileged function filter, and finally permission check error detection, which reports the errors. In parallel, permission check wrapper detection, seeded with the known permission checks (Table 2.1), produces all permission checks used by privileged function detection; the identified privileged functions and permission checks are also emitted as output.]

Figure 6.6: PeX static analysis architecture. PeX takes as input kernel source code and per- mission checks, and reports as output permission check errors. PeX also produces mappings between identified permission checks and privileged functions as output.

6.5 Design of PeX

Fig. 6.6 shows the architecture of PeX. It takes as input the kernel source code (in the LLVM bitcode format) and common permission checks (Table 2.1), and analyzes and reports all detected permission check errors, including missing, inconsistent, and redundant permission checks. In addition, PeX produces the mapping of permission checks and privileged functions, which has not been formally documented.

At a high level, PeX first resolves indirect calls with our new technique called KIRIN (Section 6.4). Next, PeX builds an augmented call graph (in which indirect callsites are connected to their possible targets) and cuts out only the portion reachable from user space (Section 6.5.1). Based on the partitioned call graph, PeX then generates the interprocedural control flow graph (ICFG), where each callsite is connected to the entry and the exit of the callee [73]. Then, starting from a small set of (user-provided) permission checks, PeX automatically detects their wrappers (Section 6.5.2). After that, for a given permission check, PeX identifies its potentially privileged functions on top of the ICFG (Section 6.5.3), followed by a heuristic-based filter that prunes obviously non-privileged functions (Section 6.5.4). Finally, for each privileged function, PeX examines all user space reachable paths to it to detect any permission check error on those paths (Section 6.5.5). The following sections describe these steps in detail.

6.5.1 Call Graph Generation and Partition

PeX generates the call graph leveraging the result of KIRIN (Section 6.4), and then partitions it into two groups.

User Space Reachable Functions: Starting from functions with the common prefix SyS_ (indicating system call entry points), PeX traverses the call graph, marks all visited functions, and treats them as user space reachable functions. The functions in this partition are investigated for possible permission check errors.

Kernel Initialization Functions: Functions that are used only during booting are collected to detect redundant checks. The Linux kernel boots from the start_kernel function and calls a list of functions with the common prefix __init. PeX performs multiple call graph traversals, starting from start_kernel and each of the __init functions, to collect them.

Other functions, such as IRQ handlers and kernel thread functions, are not used in later analysis since they cannot be directly called from user space. The partitioned call graph serves as the basis for building the interprocedural control flow graph (ICFG) [140] used to infer the mapping between permission checks and privileged functions (Section 6.5.3).

6.5.2 Permission Check Wrapper Detection

Sound and precise detection of permission check errors requires a complete list of permission checks, but such a list is not readily available (Section 6.3.2). One may name some commonly used permission checks, as in Table 2.1. However, these are often wrappers of basic permission checks, which actually perform the low-level access control, and even worse, there could be further wrappers of those wrappers.

PeX solves this by automating the process of identifying all permission checks, including wrappers. PeX takes an incomplete list of user-provided permission checks as input. Starting from them, PeX detects basic permission checks by performing forward call graph slicing [114, 167, 196] over the augmented call graph. For a given permission check function, PeX searches all call instructions inside the function for one that passes an argument of the function to the callee. In other words, PeX identifies the callees of the permission check function that take its actual parameter as their own formal parameter. Similarly, PeX then conducts backward call graph slicing [114, 167, 196] from these basic permission checks to detect the list of their wrappers. PeX treats as wrappers only those callers that pass permission parameters, excluding other callers that merely use the permission checks.

Fig. 6.7 shows an example of permission check wrapper detection. Given the known permission check ns_capable (Lines 10-13), PeX first finds security_capable (Line 4) as a basic permission check, and then, based on it, PeX detects another permission check wrapper, has_ns_capability (Lines 14-20). Note that the parameter cap is passed from both callers, ns_capable_common and has_ns_capability, to the callee security_capable, and the result of security_capable is returned to them. Our evaluation (Section 6.6.3) shows that, based on the 196 permission checks in Table 2.1, PeX detects 88 wrappers.

 1 static bool ns_capable_common(struct user_namespace *ns, int cap, bool audit)
 2 {
 3     ...
 4     capable = audit ? security_capable(current_cred(), ns, cap) :
 5                       security_capable_noaudit(current_cred(), ns, cap);
 6     if (capable == 0)
 7         return true;
 8     return false;
 9 }
10 bool ns_capable(struct user_namespace *ns, int cap)
11 {
12     return ns_capable_common(ns, cap, true);
13 }
14 bool has_ns_capability(struct task_struct *t,
15                        struct user_namespace *ns, int cap)
16 {
17     ...
18     ret = security_capable(__task_cred(t), ns, cap);
19     ...
20 }

Figure 6.7: Permission check wrapper examples.

6.5.3 Privileged Function Detection

It is important to understand the mappings between permission checks and privileged functions for effective detection of any permission check errors therein. However, the lack of a clear mapping in the Linux kernel complicates the detection of permission check errors (Section 6.3.2).

To address this problem, PeX leverages an interprocedural dominator analysis [140] that can automatically identify the privileged functions protected by a given permission check. PeX conservatively treats all targets (callees) of those call instructions that are dominated by each permission check (Section 6.5.2) on top of the ICFG (Section 6.5.1) as its potential privileged functions. The rationale behind the dominator analysis is the following observation: since there is no path that reaches the dominated call instruction without visiting the dominator (i.e., the permission check), the callee is likely one that should be protected by the check on all paths.2

Algorithm 1 shows how PeX uses the dominator analysis to find potential privileged functions

2 This does not necessarily mean that the permission check dominates all call instructions in the ICFG that invoke the resulting privileged function. As long as some call instructions are dominated by the check, their callees are treated as privileged functions.

Algorithm 1 Privileged Function Detection
INPUT: pcfuncs - all permission check functions
OUTPUT: pvfuncs - privileged functions
 1: procedure Privileged Function Detection
 2:   for f ← pcfuncs do
 3:     for u ← User(f) do
 4:       CallInst ← CallInstDominatedBy(u)   ▷ interprocedural analysis, over full program paths
 5:       callee ← getCallee(CallInst)
 6:       pvfuncs.insert(callee)
 7:     end for
 8:   end for
 9:   return pvfuncs
10: end procedure

pvfuncs for a given list of permission check functions pcfuncs. For each permission check function f (Line 2), PeX finds all users of f, i.e., the callsites invoking f (Line 3). For each user (callsite) u, PeX performs an interprocedural dominator analysis over the ICFG to find all dominated call instructions (Line 4). All their callees are then added to pvfuncs (Lines 5-6).

Note that the call graph generated by KIRIN (Section 6.4) resolves most of the indirect calls, which allows PeX to perform more sound privileged function detection on top of the resulting ICFG. For example, our experiment (Section 6.6.3) shows that KIRIN can identify ecryptfs_setxattr (reachable via indirect calls over the ICFG) as a privileged function and detect its missing permission check bug (Table 6.5, LSM-17). Had an unsound workaround such as [82] been used instead, this bug could not have been detected.

6.5.4 Non-privileged Function Filter

The conservative approach in Section 6.5.3 may report too many potential privileged functions. In this step, PeX applies heuristic-based filters to prune out false privileged functions. In the current prototype, the filter contains a set of kernel library functions that are not privileged functions, e.g., kmalloc, strcmp, and kstrtoint. Though PeX is currently designed to avoid false negatives (and thus leverages only a small set of library filters), one can add more

aggressive filters to purge more false privileged functions. With the release of PeX, we expect a good opportunity for the kernel development community to contribute to the design of non-privileged function filters, where domain knowledge is helpful.

Algorithm 2 Permission Check Error Detection
INPUT: pc−pv - permission check function to privileged function mapping
       pcfuncs - all permission check functions
       kinitfuncs - kernel init functions
 1: procedure Permission Check Error Detection
 2:   for pair ← pc−pv do
 3:     pvfuncs ← pair.pv                  ▷ privileged functions
 4:     pcfunc ← pair.pc                   ▷ permission check function
 5:     for f ← pvfuncs do
 6:       allpath ← getAllPathUseFunc(f)   ▷ get all user-reachable paths that call the privileged function f
 7:       for p ← allpath do
 8:         pvcall ← PrivilegeFunctionCallInPath(p)
 9:         if pvcall not preceded by pcfunc then
10:           if pvcall not preceded by any pcfuncs then
11:             report(p)                  ▷ report missing check
12:           else
13:             report(p)                  ▷ report inconsistent check
14:           end if
15:         else if pvcall preceded by multiple same pcfunc then
16:           report(p)                    ▷ report redundant check
17:         end if
18:       end for
19:     end for
20:   end for
21:   for f ← kinitfuncs do
22:     if f uses any pcfuncs then
23:       report(f)                        ▷ report unnecessary checks during kernel boot
24:     end if
25:   end for
26: end procedure

6.5.5 Permission Check Error Detection

This last step validates the use of privileged functions to detect any potential permission check errors. For a given mapping between a permission check and a privileged function, PeX performs a backward traversal of the ICFG, starting from the privileged function, with the corresponding permission check in mind. Note that PeX validates every possible path to each privileged function of interest.

Algorithm 2 shows PeX's permission check error detection algorithm. Recall that PeX treats user-reachable kernel functions and kernel initialization functions separately and detects different forms of errors (Section 6.5.1). Lines 2-20 show how PeX detects missing, redundant, and inconsistent checks in user-reachable kernel functions. For each privileged function f (Line 5) in a mapping, PeX finds all possible paths allpath from user entry points to that privileged function f over the ICFG (Line 6). Lines 7-18 check each path p for a preceding permission check function, the lack of which should be reported as a bug. If the call to the privileged function (pvcall) is preceded neither by the corresponding permission check function (pcfunc) nor by any other check function (those in pcfuncs) over a given path p, then PeX reports a missing check (Lines 10-11). If pvcall is preceded not by the corresponding check (pcfunc) but by some other check in pcfuncs, PeX reports an inconsistent check (Lines 12-13). Finally, if PeX discovers that pvcall is indeed preceded by the pcfunc check but multiple times, it reports a redundant check (Lines 15-17). In addition, Lines 21-25 show how PeX detects redundant checks in kernel initialization functions. As kinitfuncs includes a conservative list of functions that can only be executed during booting (thus obviating the need for any checks), all detected permission checks therein are marked as redundant (Lines 22-24).

6.6 Implementation and Evaluation

PeX was implemented using LLVM [115]/Clang-6.0. It contains about 7K lines of C/C++ code. Clang was modified to preserve the kernel interface type at allocation/initialization sites by using an identified struct type instead of an unnamed literal struct type. We also automated the generation of the single-file whole-kernel LLVM bitcode vmlinux.bc using wllvm [33]. This avoids building each kernel module separately or changing the kernel build infrastructure, as observed in prior kernel static analysis works [82, 207]. We evaluated PeX on the latest stable Linux kernel v4.18.5. In summary, KIRIN resolves 86%–92% of indirect

Table 6.1: Input Statistics for Kernel v4.18.5.

                               defconfig    allyesconfig
# of yes(=y) config            1284         9939
# of compiled LOC              2,414,772    15,881,692
vmlinux size                   481 MB       3.8 GB
vmlinux.bc size                387 MB       3.3 GB
# of total functions           42,264       247,465
# of syscall entries           857          1,027
# of init functions            1,570        9,301
# of indirect callsites (ICS)  20,338       115,537

callsites, depending on the kernel compilation configuration. PeX reported 36 permission check error warnings to the Linux community, 14 of which have been confirmed as real bugs.

6.6.1 Evaluation Methodology

We evaluated PeX with two different kernel configurations: (1) defconfig, the commonly used default configuration, and (2) allyesconfig, with all non-conflicting configuration options enabled. The use of allyesconfig not only stress-tests PeX (including KIRIN) with a larger code base than defconfig, but also covers the majority of the kernel code, allowing PeX to detect more bugs.

In addition, we used 3 DAC, 3 Capabilities, and 190 LSM permission checks (Table 2.1) as input permission checks, from which PeX finds other wrappers. For the non-privileged function filter, we collected 1,827 library functions from the lib directory in the kernel source code. All experiments were carried out on a machine running Ubuntu 16.04 with two Intel Xeon E5-2650 2.20GHz CPUs and 256GB of DRAM.

Table 6.2: Indirect Call Pointer Analysis.

                      defconfig             allyesconfig
                      KIRIN  TYPE  KM       KIRIN  TYPE  KM
% of ICS resolved     86     100   1        92     100   na
# of avg targets      3.6    10K   3.6      6.2    81K   na
analysis time (min)   1      1     9,869    6.6    1     na

6.6.2 Evaluation of KIRIN

We compared the effectiveness and efficiency of KIRIN with the type-based approach and the SVF-based K-Miner approach.

K-Miner [82] works around the scalability problem in SVF by analyzing the kernel on a per-system-call basis, rather than taking the entire kernel code for analysis. K-Miner generates a (small) partition of the kernel code that can be reached from a given system call, and (unsoundly) applies SVF to that partition. For comparison, we took K-Miner's implementation from [19] and added the logic to count the number of resolved indirect callsites and the average number of targets per callsite. As K-Miner was originally built on LLVM/Clang-3.8.1, we recompiled the same kernel v4.18.5 using wllvm with the same kernel configurations.

Table 6.2 summarizes the evaluation results of KIRIN, comparing it to the type-based and K-Miner approaches in terms of the percentage of indirect callsites (ICS) resolved, the average number of targets per ICS, and the total analysis time.

Resolution Rate

For K-Miner, we observe a somewhat surprising result: it resolves only 1% of all indirect callsites. After further investigation, we noticed that because SVF runs on each partition, whose code base is smaller than the whole kernel, its analysis scope is significantly limited, and it is unable to resolve function pointers in other partitions, leading to the poor resolution rate.

In addition, we found that K-Miner does not work for allyesconfig, which contains a much larger code base than defconfig. Note that K-Miner evaluated its approach only for defconfig in the original paper [82]. The K-Miner approach turns out not to scale to allyesconfig, encountering an out-of-memory error even when analyzing a single system call.

Resolved Average Targets

For KIRIN, the average number of indirect call targets per resolved indirect callsite is much smaller than that of the type-based approach, as shown in the second row of Table 6.2. The reason is that the type-based approach classifies all functions (not only address-taken functions) into different sets based on the function type, which implies that all functions in a set are regarded as possible call targets of a function pointer of that type. For example, as shown in Fig. 6.3a, the two functions ext4_file_read_iter and ext4_file_write_iter share the same type. As a result, the type-based approach incorrectly identifies both functions as possible call targets of the function pointer f_ops→write_iter.

Analysis Time

The total analysis time of each ICS resolution approach is shown in the last row of Table 6.2. As expected, the type-based approach is the fastest, finishing the analysis in 1 minute for both configurations. KIRIN runs slower than the type-based approach; however, its analysis time for defconfig (≈1 minute) is comparable to that of the type-based approach, while it takes 6.6 minutes for allyesconfig.

For a fair comparison with K-Miner, care must be taken when collecting its indirect call analysis time. For a given system call, we measured K-Miner's running time from the

Table 6.3: PeX Results.

                             defconfig          allyesconfig
                             DAC  CAP  LSM      DAC  CAP   LSM
# of input checks            3    3    190      3    3     190
# of detected wrappers       11   13   34       19   16    53
# of priv func detected      174  869  2030     631  3770  10915
# of priv func after filter  116  582  1635     537  3245  10260
# of warnings grouped
  by priv func               72   210  853      221  850   1017
total time (min)             6    8    11       83   247   169

Table 6.4: Comparison of PeX warnings when used with different indirect call analyses.

        defconfig                 allyesconfig
        DAC  CAP  LSM   Bugs      DAC  CAP  LSM   Bugs
KIRIN   72   210  853   21        221  850  1017  36
TYPE    218  348  1319  21        164  964  4364  19 (PeX timeout)
KM      54   196  241   6         na   na   na    na (SVF timeout)

beginning until it produces the SVF points-to result, which does not include the later bug detection time. To obtain the total analysis time of K-Miner, we summed up the running times over all system calls. The result shows that the SVF-based K-Miner takes about 9,869 minutes to finish analyzing all system calls of defconfig, which is much slower than KIRIN.

6.6.3 PeX Result

Table 6.3 summarizes PeX's intermediate program analyses. As allyesconfig subsumes defconfig in static analysis, we focus on the allyesconfig results here. Overall, PeX finishes all analyses within a few hours and reports about two thousand groups of warnings, classified by privileged function. One may implement a multi-threaded version of PeX to further reduce the analysis time.

Given the small number of input DAC, CAP, and LSM permission checks (3, 3, and 190, respectively), PeX's permission check wrapper detection (Section 6.5.2) was able to identify 19, 16, and 53 permission check wrappers, respectively. For example, PeX detects wrappers such as nfs_permission and may_open for DAC; sk_net_capable and netlink_capable for Capabilities; and key_task_permission and __ptrace_may_access for LSM.

Table 6.3 also shows the number of potentially privileged functions detected by PeX (Section 6.5.3) and the number of remaining privileged functions after kernel library filtering (Section 6.5.4). We found that there is typically a 1-to-1 or 2-to-1 mapping between permission checks and privileged functions. Overall, PeX reports 221, 850, and 1017 warnings (grouped by privileged functions) for DAC, CAP, and LSM, respectively.

Table 6.5 shows the list of 36 bugs we reported, 14 of which have been confirmed by Linux kernel developers. Kernel developers ignored some bugs and decided not to make changes because they believe those bugs are not exploitable. We discuss them in detail in Section 6.6.5.

Comparison. To highlight the effectiveness of KIRIN, we repeated the end-to-end PeX analysis using the type-based (PeX+TYPE) and K-Miner-style (PeX+KM) indirect call analyses. Table 6.4 shows the resulting number of warnings and detected bugs when the 36 bugs shown in Table 6.5 are used as an oracle for false negative comparison.

For allyesconfig, PeX+TYPE and PeX+KM could not complete the analysis within the 12-hour experiment limit. PeX+TYPE generated too many (false) edges in the ICFG and suffered from path explosion during the last phase of the PeX analysis; only 19 bugs were reported before the timeout. Meanwhile, PeX+KM timed out in an earlier pointer analysis phase, thereby failing to report any bugs.

When defconfig is used for comparison, PeX+TYPE and PeX+KM were able to complete the analysis. In this setting, PeX+KIRIN (the original) and PeX+TYPE both report 21 bugs (a subset of the 36 bugs detected with allyesconfig). Though PeX+TYPE can capture them all (as type-based analysis is sound yet imprecise), it generates up to 3x more warnings, placing a high burden on users for manual review. On the other hand, as an unsound solution, PeX+KM produces a limited number of warnings, resulting in the detection of only 6 bugs and missing the rest.

6.6.4 Manual Review of Warnings

The manual review process for reported warnings is to determine whether a privileged function identified by PeX (Section 6.5.3) is a true privileged function or not. As long as one can confirm that a function is indeed privileged, the reported warnings regarding its missing, inconsistent, and redundant permission checks should be true positives from PeX's point of view.

Though kernel developers with domain knowledge may be able to discern them without complication, we (as a third party) tried to understand whether a given function can be used to access critical resources (e.g., devices, file systems, etc.). As a result, we conservatively reported 36 bug warnings to the community; we suspect that there could be more true warnings that we missed due to our lack of domain knowledge. We plan to release PeX and the list of potential privileged functions, hoping kernel developers will contribute to identifying privileged functions and fixing more true permission check errors.

Certain static paths reported by PeX may not be feasible dynamically during program execution, resulting in false positives. One may devise a solution that solves path constraints, as in symbolic execution engines [50], to address this problem; PeX currently does not do so.

Table 6.5: Bugs Reported by PeX. Confirmed (C) or Ignored (I).

Type-#  File                                  Function                                Description                                   Status
DAC-1   fs/btrfs/send.c                       btrfs_send                              missing DAC check when traversing a snapshot  C
DAC-2   fs/ecryptfs/inode.c                   ecryptfs_removexattr(), _setxattr()     missing xattr_permission()                    C
DAC-3   fs/ecryptfs/inode.c                   ecryptfs_listxattr()                    missing xattr_permission()                    C
CAP-4   drivers/char/random.c                 write_pool(), credit_entropy_bits()     missing CAP_SYS_ADMIN                         C
CAP-5   drivers/scsi/sg.c                     sg_scsi_ioctl()                         missing CAP_SYS_ADMIN or CAP_RAW_IO           I
CAP-6   drivers/block/pktcdvd.c               add_store(), remove_store()             missing CAP_SYS_ADMIN                         I
CAP-7   drivers/char/nvram.c                  nvram_write()                           missing CAP_SYS_ADMIN                         I
CAP-8   drivers/firmware/efi/efivars.c        efivar_entry_set()                      missing CAP_SYS_ADMIN                         C
CAP-9   net/rfkill/core.c                     rfkill_set_block(), rfkill_fop_write()  missing CAP_NET_ADMIN                         C
CAP-10  block/scsi_ioctl.c                    mmc_rpmb_ioctl()                        missing verify_command or CAP_SYS_ADMIN       I
CAP-11  drivers/platform/x86/thinkpad_acpi.c  acpi_evalf()                            missing CAP_SYS_ADMIN                         I
CAP-12  drivers/md/dm.c                       dm_blk_ioctl()                          missing CAP_RAW_IO                            I
CAP-13  block/bsg.c                           bsg_ioctl                               inconsistent/missing CAP_SYS_ADMIN            C
CAP-14  kernel/sys.c                          prctl_set_mm_exe_file                   inconsistent capability check                 I
CAP-15  kernel/sys.c                          prctl_set_mm_exe_file                   inconsistent capability and namespace check   I
CAP-16  block/scsi_ioctl.c                    blk_verify_command                      redundant check CAP_SYS_RAWIO                 I
LSM-17  fs/ecryptfs/inode.c                   ecryptfs_removexattr(), _setxattr()     missing security_inode_removexattr()          C
LSM-18  mm/mmap.c                             remap_file_pages                        missing security_mmap_file()                  I
LSM-19  fs/binfmt_elf.c                       load_elf_binary()                       missing security_kernel_read_file             I
LSM-20  fs/binfmt_elf.c                       load_elf_library()                      missing security_kernel_read_file             I
LSM-21  fs/xfs/xfs_ioctl.c                    xfs_file_ioctl()                        missing security_inode_readlink()             C
LSM-22  kernel/workqueue.c                    wq_nice_store()                         missing security_task_setnice()               C
LSM-23  fs/ecryptfs/inode.c                   ecryptfs_listxattr()                    missing security_inode_listxattr              C
LSM-24  include/linux/sched.h                 comm_write()                            missing security_task_prctl()                 C
LSM-25  fs/binfmt_misc.c                      load_elf_binary()                       missing security_bprm_set_creds()             I
LSM-26  drivers/android/binder.c              binder_set_nice                         missing security_task_setnice()               I
LSM-27  fs/ocfs2//tcp.c                       o2net_start_listening()                 missing security_socket_bind                  I
LSM-28  fs/ocfs2/cluster/tcp.c                o2net_start_listening()                 missing security_socket_listen                I
LSM-29  fs/dlm/lowcomms.c                     tcp_create_listen_sock                  missing security_socket_bind                  I
LSM-30  fs/dlm/lowcomms.c                     tcp_create_listen_sock                  missing security_socket_listen                I
LSM-31  fs/dlm/lowcomms.c                     sctp_listen_for_all                     missing security_socket_listen                I
LSM-32  net/socket.c                          kernel_bind                             missing security_socket_bind                  I
LSM-33  net/socket.c                          kernel_listen                           missing security_socket_listen                I
LSM-34  net/socket.c                          kernel_connect                          missing security_socket_connect               I
LSM-35  fs/ocfs2/cluster/tcp.c                o2net_start_listening()                 redundant security_socket_create              C
LSM-36  fs/ocfs2/cluster/tcp.c                o2net_open_listening_sock()             redundant security_socket_create              C

6.6.5 Discussion of Security Bug Findings

Missing Check

Fig. 6.2b shows one of the confirmed missing LSM checks (LSM-21). We discuss two more confirmed cases.

The CAP-4 missing check in the kernel random device driver is particularly critical and triggered active discussion in the kernel developer community (including Torvalds). The random number generator serves as the foundation of many cryptography libraries, including OpenSSL, and thus the quality of the random numbers is critical. This security bug allows attackers to manipulate the entropy pool, which can potentially corrupt many applications using cryptography libraries. Specifically, a problematic path starts from evdev_write and reaches the privileged function credit_entropy_bits, which can control the entropy in the entropy pool, while bypassing the required CAP_SYS_ADMIN permission check.

The LSM-21 missing check in xfs_file_ioctl led to another interesting discussion among kernel developers [21]. With this interface, a userspace program may perform low-level file system operations, but the security_inode_readlink LSM hook was missing. An adversary could exploit this interface and gain access to parts of the file system that the LSM policy does not allow. Interestingly, however, the privileged function did perform a CAP_SYS_ADMIN Capability permission check. This created disagreement among kernel developers: one group argued that the LSM hook is necessary, while another thought that CAP_SYS_ADMIN is sufficient. We agree with the former because LSM is designed to limit the damage of a compromised process to the system, even a process of the root user [186]. We believe that LSM permission checks should still be enforced as always for better security, even when the current user is root.

Kernel developers decided not to fix 9 of our reports because they believe these bugs are not exploitable. As discussed earlier, PeX in its current form neither validates whether a suspicious static path is dynamically reachable nor generates a concrete exploit to demonstrate the security issue; we believe both are good future work. Nonetheless, we have one complaint to share.

For the LSM-19 and LSM-20 cases, PeX found that the LSM hooks security_kernel_read_file and security_kernel_post_read_file were used to protect the privileged functions kernel_read_file and kernel_post_read_file on some program paths. We reported missing LSM hooks in load_elf_binary and load_elf_library for these privileged functions. However, the kernel developers responded that those hooks are used to monitor the loading of firmware/kernel modules only (not other files), and thus no check is required. Here, the implication we found is three-fold. First, the permission check names are ambiguous and misleading. Second, we were not able to find any documentation of such an LSM specification regarding the protection of firmware/kernel modules. Last, PeX actually found a counter-example in IMA, where the same checks are indeed used for loading other files (neither firmware nor kernel modules). Consequently, we suggest changing the function names and documenting the intention clearly to avoid any confusion and to prevent system administrators from creating an LSM policy that does not work.

Inconsistent Check

The CAP-13 inconsistent check has been discussed in Fig. 6.1. One program path in Figs. 6.1a and 6.1c has two CAP_SYS_RAWIO checks and one CAP_SYS_ADMIN check, while another path in Figs. 6.1b and 6.1c has only one CAP_SYS_ADMIN check. PeX detects this bug as an inconsistent check; but from the viewpoint of correction, which requires adding CAP_SYS_RAWIO, it may also be viewed as a missing check. There is also a separate redundant check error on CAP_SYS_RAWIO.

Upon further investigation, we were interested in learning the practices of using multiple capabilities together. scsi_ioctl in Fig. 6.1a checks both CAP_SYS_ADMIN and CAP_SYS_RAWIO. However, in a different network subsystem (not shown), we found that too_many_unix_fds performs a weaker permission check with the condition CAP_SYS_ADMIN or CAP_SYS_RAWIO. We believe this OR-based weaker check is not a good practice because it in effect makes CAP_SYS_ADMIN too powerful (like root), diminishing the benefit of fine-grained capability-based access control.

The CAP-14 and CAP-15 inconsistent check reports were acknowledged but ignored by the kernel developers for the following reason. For the same privileged function prctl_set_mm_exe_file, which is used to set an executable file, PeX discovered one case requiring CAP_SYS_RESOURCE in the user namespace and another case checking CAP_SYS_ADMIN in the init namespace. Kernel developers responded that each case is fine by design for its specific context. PeX does not consider the precise context in which prctl_set_mm_exe_file is used (similar to the aforementioned security_kernel_read_file used for loading kernel modules), leading to imprecise reports, but we believe that both CAP-14 and CAP-15 are worthwhile for further investigation.

Redundant Check

A redundant check occurs in two forms. First, for user-reachable functions, it occurs when a privileged function is covered by the same permission check multiple times. We reported three such cases. The CAP-16 case was discussed in Figs. 6.1a and 6.1c with its two CAP_SYS_RAWIO checks; it was ignored by kernel developers. On the other hand, for the LSM-35 and LSM-36 cases found in the ocfs2 file system, another kernel developer group confirmed the bugs and promised to fix them. Second, any permission check in kernel initialization functions is marked as redundant because the boot thread executes as root. PeX detected tens of such cases, but we did not report them as they are less critical.

6.7 Summary

In this chapter, we tackle the scalability and precision problems in static kernel bug detectors (PS3). We present PeX, a static permission check analysis framework for the Linux kernel, which can automatically infer mappings between permission checks and privileged functions, as well as detect missing, inconsistent, and redundant permission checks for any privileged function. PeX relies on KIRIN, our novel call graph analysis based on kernel interfaces, a common programming pattern in the Linux kernel, to resolve indirect calls precisely and efficiently (TS3).

We evaluated both KIRIN and PeX on the latest stable Linux kernel v4.18.5. The experiments show that KIRIN can resolve 86%-92% of all indirect callsites in the kernel within 7 minutes. In particular, PeX reported 36 permission check bugs in DAC, Capabilities, and LSM, 14 of which have already been confirmed by the kernel developers. PeX is published in [233].

Chapter 7

Conclusion

Addressing software bugs is becoming an ever more urgent task. Over the years, software bugs have not only caused more and more damage to the economy but also affected people's daily lives. Software developers use bug detectors to help locate and fix bugs. However, many bug detectors are not practical, which limits their wide adoption. In this thesis, we study and address common problems of dynamic and static bug detectors, thereby helping developers improve software quality.

First, we study dynamic bug detectors and address their run-time overhead (PS1) and space overhead (PS2) problems. To reduce run-time overhead, we propose repurposing commodity hardware (TS1) to improve dynamic bug detectors. In particular, we study data race bugs, which are increasingly common in multi-threaded shared-memory programs. We design TxRace, which demonstrates that commodity hardware transactional memory can be leveraged to accelerate a dynamic data race detector. The experimental results show that TxRace reduces the average run-time overhead from 11.68x to 4.65x, with only a small number of false negatives.

Due to the nondeterministic nature of data races, it is hard to find all data races during testing. Thus, there is interest in deploying dynamic data race detectors in production environments, where programs are better exercised and more data races can be caught. We present a production-ready sampling-based dynamic data race detector, ProRace, which further reduces the run-time overhead by leveraging the PMU for lightweight tracing. The evaluation shows that ProRace incurs only a 2.6% run-time overhead with a 27.5% detection probability at a sampling period of 10,000.

To address the space overhead problem, we propose reusing existing metadata maintained by one bug detector to detect other kinds of bugs (TS2), reducing the space overhead of dynamic bug detectors. We study memory safety bugs, which are more common than data race bugs. We present a memory-efficient solution, BOGO, which adds temporal memory safety on top of Intel MPX's spatial memory safety. BOGO reuses Intel MPX's bounds metadata as well as its run-time checks, reducing space overhead. The evaluation shows that BOGO incurs 60% run-time overhead and 36% memory overhead while providing full memory safety.

Finally, we study static bug detectors and address their scalability and precision problems (PS3). To improve scalability and precision, we propose applying common programming patterns to static analyses (TS3). We focus on static analysis of a large code base, the Linux kernel, and improve the scalability and precision of a common analysis, kernel call graph generation, using the function pointer usage patterns in the Linux kernel. Furthermore, to demonstrate its effectiveness, we designed a kernel permission check bug detector based on the aforementioned analysis. Our detector can help kernel developers find missing, inconsistent, and redundant permission checks, which are critical to operating system security. We evaluated it on the latest Linux kernel, and it detected 14 previously unknown bugs within a limited time budget.

Bibliography

[1] The apache http server. http://httpd.apache.org.

[2] Cve-2006-1856. https://cve.mitre.org/cgi-bin/cvename.cgi?name= CVE-2006-1856,.

[3] Cve-2011-4080. https://cve.mitre.org/cgi-bin/cvename.cgi?name= CVE-2011-4080,.

[4] CVE-2017-17450 Missing Capability Check in Linux Kernel. https://www. cvedetails.com/cve/CVE-2017-17450/,.

[5] 2011 cwe/sans top 25 most dangerous software errors. http://cwe.mitre.org/top25/.

[6] Child support it failures savaged. https://www.zdnet.com/article/ child-support-it-failures-savaged/.

[7] Intel mpx explained – performance evaluation. https://intel- mpx.github.io/performance/.

[8] Nist software assurance reference dataset project. https://samate.nist.gov/SARD.

[9] Android Permission Overview. https://developer.android.com/guide/topics/ permissions/overview.

[10] ab - apache http server benchmarking tool. https://httpd.apache.org/docs/2.4/programs/ab.html.

[11] Apparmor. https://gitlab.com/apparmor/apparmor/wikis/home/.

[12] Hack flashback: The mt.gox hack – the most iconic exchange hack. https://www.ledger.com/hack-flasback-the-mt-gox-hack-the-most-iconic-exchange-hack/.

[13] Struct bound narrowing. https://gcc.gnu.org/wiki/Intel MPX support in the GCC compiler#Narrowing.

[14] False boundaries and arbitrary code execution. https://forums.grsecurity.net/viewtopic.php?f=7&t=2522.

[15] CAP_SYS_ADMIN: the new root. https://lwn.net/Articles/486306/.

[16] capabilities - overview of linux capabilities. http://man7.org/linux/man-pages/man7/capabilities.7.html.

[17] Discretionary access control. https://en.wikipedia.org/wiki/Discretionary_access_control.

[18] Dangsan open source implementation. https://github.com/vusec/dangsan.

[19] K-miner: Data-flow analysis for the linux kernel. https://github.com/ssl-tud/k-miner.

[20] Knight shows how to lose $440 million in 30 minutes. https://www.bloomberg.com/news/articles/2012-08-02/knight-shows-how-to-lose-440-million-in-30-minutes.

[21] Re: Leaking path in xfs's ioctl interface (missing lsm check) by stephen smalley. https://lkml.org/lkml/2018/9/26/668.

[22] LocationManager. https://developer.android.com/reference/android/location/LocationManager#getLastKnownLocation(java.lang.String).

[23] Mandatory access control. https://en.wikipedia.org/wiki/Mandatory_access_control.

[24] Softbound+cets open source implementation. https://github.com/santoshn/softboundcets-34.

[25] Smatch: pluggable static analysis for c. https://lwn.net/Articles/691882/.

[26] Software fail watch: 5th edition. https://www.tricentis.com/resources/software-fail-watch-5th-edition/.

[27] Sparse. https://www.kernel.org/doc/html/v4.14/dev-tools/sparse.html.

[28] Spec2006 addresssanitizer patch. https://github.com/google/sanitizers/blob/master/address-sanitizer/spec/spec2006-asan.patch.

[29] Scriptable database and system performance benchmark. https://github.com/akopytov/sysbench.

[30] Toyota's killer firmware: Bad design and its consequences. https://www.edn.com/design/automotive/4423428/Toyota-s-killer-firmware--Bad-design-and-its-consequences.

[31] Virtual file system. https://en.wikipedia.org/wiki/Virtual_file_system.

[32] Forgot your windows 98 password? no problem. https://imgur.com/fqjnK.

[33] Whole Program LLVM: a wrapper script to build whole-program llvm bitcode files. https://github.com/travitch/whole-program-llvm.

[34] Yehuda Afek, Amir Levy, and Adam Morrison. Software-improved hardware lock elision. In Proceedings of the 2014 ACM Symposium on Principles of Distributed Computing, PODC '14, pages 212–221, 2014. ISBN 978-1-4503-2944-6.

[35] Periklis Akritidis. Cling: A memory allocator to mitigate dangling pointers. In USENIX Security Symposium, pages 177–192, 2010.

[36] Periklis Akritidis, Manuel Costa, Miguel Castro, and Steven Hand. Baggy bounds checking: An efficient and backwards-compatible defense against out-of-bounds errors. In Proceedings of the 18th Conference on USENIX Security Symposium, SSYM’09, pages 51–66, 2009.

[37] A. Alomary et al. PEAS-I: A hardware/software co-design system for ASIPs. pages 2–7, 1993.

[38] Gautam Altekar and Ion Stoica. Odr: output-deterministic replay for multicore debugging. In Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles, pages 193–206. ACM, 2009.

[39] Jonathan Anderson, Robert NM Watson, David Chisnall, Khilan Gudka, Ilias Marinos, and Brooks Davis. Tesla: temporally enhanced system logic assertions. In Proceedings of the Ninth European Conference on Computer Systems, page 19. ACM, 2014.

[40] Kathy Wain Yee Au, Yi Fan Zhou, Zhen Huang, and David Lie. Pscout: analyzing the android permission specification. In Proceedings of the 2012 ACM conference on Computer and communications security, pages 217–228. ACM, 2012.

[41] Christian Bienia, Sanjeev Kumar, Jaswinder Pal Singh, and Kai Li. The PARSEC Benchmark Suite: Characterization and Architectural Implications. In Proc. of the 17th PACT, October 2008.

[42] C.M. Bishop et al. Pattern recognition and machine learning. Springer New York, 2006.

[43] Swarnendu Biswas, Minjia Zhang, Michael D. Bond, and Brandon Lucia. Efficient, software-only data race exceptions. In Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications, OOPSLA '15, 2015.

[44] Tyler Bletsch, Xuxian Jiang, Vince W Freeh, and Zhenkai Liang. Jump-oriented programming: a new class of code-reuse attack. In Proceedings of the 6th ACM Symposium on Information, Computer and Communications Security, pages 30–40. ACM, 2011.

[45] Jayaram Bobba, Kevin E. Moore, Haris Volos, Luke Yen, Mark D. Hill, Michael M. Swift, and David A. Wood. Performance pathologies in hardware transactional memory. In Proceedings of the 34th Annual International Symposium on Computer Architecture, ISCA '07, pages 81–91, 2007. ISBN 978-1-59593-706-3.

[46] Hans-J. Boehm and Sarita V. Adve. Foundations of the c++ concurrency memory model. In Proceedings of the 2008 ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '08, pages 68–78, 2008.

[47] Michael D. Bond, Katherine E. Coons, and Kathryn S. McKinley. Pacer: Proportional detection of data races. In Proceedings of the 2010 ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '10, pages 255–268, 2010. ISBN 978-1-4503-0019-3.

[48] Derek Bruening, Timothy Garnett, and Saman Amarasinghe. An infrastructure for adaptive dynamic optimization. In Proceedings of the International Symposium on Code Generation and Optimization: Feedback-directed and Runtime Optimization, CGO ’03, pages 265–275, 2003. ISBN 0-7695-1913-X.

[49] Juan Caballero, Gustavo Grieco, Mark Marron, and Antonio Nappa. Undangle: early detection of dangling pointers in use-after-free and double-free vulnerabilities. In Proceedings of the 2012 International Symposium on Software Testing and Analysis, pages 133–143. ACM, 2012.

[50] Cristian Cadar, Daniel Dunbar, Dawson R Engler, et al. Klee: Unassisted and automatic generation of high-coverage tests for complex systems programs. In OSDI, volume 8, pages 209–224, 2008.

[51] Irina Calciu, Justin Gottschlich, Tatiana Shpeisman, Gilles Pokam, and Maurice Herlihy. Invyswell: A hybrid transactional memory for haswell's restricted transactional memory. In Proceedings of the 23rd International Conference on Parallel Architectures and Compilation, PACT '14, pages 187–200, 2014.

[52] Christopher D. Carothers, Kalyan S. Perumalla, and Richard M. Fujimoto. Efficient optimistic parallel simulations using reverse computation. ACM Trans. Model. Comput. Simul., 9(3):224–253, July 1999. ISSN 1049-3301. doi: 10.1145/347823.347828. URL http://doi.acm.org/10.1145/347823.347828.

[53] Stephen Checkoway, Lucas Davi, Alexandra Dmitrienko, Ahmad-Reza Sadeghi, Hovav Shacham, and Marcel Winandy. Return-oriented programming without returns. In Proceedings of the 17th ACM Conference on Computer and Communications Security, CCS '10, pages 559–572, New York, NY, USA, 2010. ACM. ISBN 978-1-4503-0245-6. doi: 10.1145/1866307.1866370. URL http://doi.acm.org/10.1145/1866307.1866370.

[54] Shimin Chen, Babak Falsafi, Phillip B. Gibbons, Michael Kozuch, Todd C. Mowry, Radu Teodorescu, Anastassia Ailamaki, Limor Fix, Gregory R. Ganger, Bin Lin, and Steven W. Schlosser. Log-based architectures for general-purpose monitoring of deployed code. In Proceedings of the 1st Workshop on Architectural and System Support for Improving Software Dependability, ASID '06, pages 63–65, New York, NY, USA, 2006. ACM. ISBN 1-59593-576-2. doi: 10.1145/1181309.1181319. URL http://doi.acm.org/10.1145/1181309.1181319.

[55] Xi Chen, Asia Slowinska, and Herbert Bos. On the detection of custom memory allocators in c binaries. Empirical Softw. Engg., 21(3):753–777, June 2016. ISSN 1382-3256. doi: 10.1007/s10664-015-9362-z. URL http://dx.doi.org/10.1007/s10664-015-9362-z.

[56] Lee Chew and David Lie. Kivati: Fast detection and prevention of atomicity violations. In Proceedings of the 5th European Conference on Computer Systems, EuroSys ’10, pages 307–320, 2010. ISBN 978-1-60558-577-2.

[57] David Chisnall, Colin Rothwell, Robert NM Watson, Jonathan Woodruff, Munraj Vadera, Simon W Moore, Michael Roe, Brooks Davis, and Peter G Neumann. Beyond the pdp-11: Architectural support for a memory-safe c abstract machine. In ACM SIGPLAN Notices, volume 50, pages 117–130. ACM, 2015.

[58] Jong-Deok Choi, Keunwoo Lee, Alexey Loginov, Robert O'Callahan, Vivek Sarkar, and Manu Sridharan. Efficient and precise datarace detection for multithreaded object-oriented programs. In Proceedings of the ACM SIGPLAN 2002 Conference on Programming Language Design and Implementation, PLDI '02, pages 258–269, 2002. ISBN 1-58113-463-0.

[59] Jim Chow, Tal Garfinkel, and Peter M. Chen. Decoupling dynamic program analysis from execution in virtual environments. In USENIX 2008 Annual Technical Conference, ATC'08, pages 1–14, Berkeley, CA, USA, 2008. USENIX Association. URL http://dl.acm.org/citation.cfm?id=1404014.1404015.

[60] Cliff Click. Azul's experiences with hardware transactional memory. In HP Labs - Bay Area Workshop on Transactional Memory, 2009.

[61] Brian F. Cooper, Adam Silberstein, Erwin Tam, Raghu Ramakrishnan, and Russell Sears. Benchmarking cloud serving systems with ycsb. In Proceedings of the 1st ACM Symposium on Cloud Computing, SoCC '10, pages 143–154, New York, NY, USA, 2010. ACM. ISBN 978-1-4503-0036-0. doi: 10.1145/1807128.1807152. URL http://doi.acm.org/10.1145/1807128.1807152.

[62] Crispan Cowan, Calton Pu, Dave Maier, Jonathan Walpole, Peat Bakke, Steve Beattie, Aaron Grier, Perry Wagle, Qian Zhang, and Heather Hinton. Stackguard: automatic adaptive detection and prevention of buffer-overflow attacks. In Usenix Security, volume 98, pages 63–78, 1998.

[63] Weidong Cui, Marcus Peinado, Sang Kil Cha, Yanick Fratantonio, and Vasileios P. Kemerlis. Retracer: Triaging crashes by reverse execution from partial memory dumps. In Proceedings of the 38th International Conference on Software Engineering, ICSE '16, pages 820–831, New York, NY, USA, 2016. ACM. ISBN 978-1-4503-3900-1. doi: 10.1145/2884781.2884844. URL http://doi.acm.org/10.1145/2884781.2884844.

[64] L. Dagum and R. Menon. OpenMP: an industry standard API for shared-memory programming. IEEE Computer Science and Engineering, 5(1):46–55, 1998.

[65] Luke Dalessandro, François Carouge, Sean White, Yossi Lev, Mark Moir, Michael L. Scott, and Michael F. Spear. Hybrid norec: A case study in the effectiveness of best effort hardware transactional memory. In Proceedings of the Sixteenth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS XVI, pages 39–52, 2011. ISBN 978-1-4503-0266-1.

[66] Joe Devietti, Colin Blundell, Milo M. K. Martin, and Steve Zdancewic. Hardbound: Architectural support for spatial safety of the c programming language. In Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS XIII, pages 103–114, 2008. ISBN 978-1-59593-958-6.

[67] Joseph Devietti, Benjamin P. Wood, Karin Strauss, Luis Ceze, Dan Grossman, and Shaz Qadeer. Radish: Always-on sound and complete race detection in software and hardware. In Proceedings of the 39th Annual International Symposium on Computer Architecture, ISCA '12, pages 201–212, 2012. ISBN 978-1-4503-1642-2.

[68] Dinakar Dhurjati and Vikram Adve. Backwards-compatible array bounds checking for c with very low overhead. In Proceedings of the 28th International Conference on Software Engineering, ICSE ’06, pages 162–171, 2006. ISBN 1-59593-375-1.

[69] Dinakar Dhurjati and Vikram Adve. Efficiently detecting all dangling pointer uses in production servers. In Dependable Systems and Networks, 2006. DSN 2006. International Conference on, pages 269–280. IEEE, 2006.

[70] Dinakar Dhurjati, Sumant Kowshik, and Vikram Adve. Safecode: Enforcing alias analysis for weakly typed languages. In Proceedings of the 27th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '06, pages 144–157, New York, NY, USA, 2006. ACM. ISBN 1-59593-320-4. doi: 10.1145/1133981.1133999. URL http://doi.acm.org/10.1145/1133981.1133999.

[71] Dave Dice, Yossi Lev, Mark Moir, and Daniel Nussbaum. Early experience with a commercial hardware transactional memory implementation. In Proceedings of the 14th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS XIV, pages 157–168, 2009. ISBN 978-1-60558-406-5.

[72] Mark Dowson. The ariane 5 software failure. ACM SIGSOFT Software Engineering Notes, 22(2):84, 1997.

[73] Evelyn Duesterwald, Rajiv Gupta, and Mary Lou Soffa. Demand-driven computation of interprocedural data flow. In Proceedings of the 22nd ACM SIGPLAN-SIGACT symposium on Principles of programming languages, pages 37–48. ACM, 1995.

[74] Antony Edwards, Trent Jaeger, and Xiaolan Zhang. Runtime verification of authorization hook placement for the linux security modules framework. In Proceedings of the 9th ACM Conference on Computer and Communications Security, pages 225–234. ACM, 2002.

[75] Laura Effinger-Dean, Brandon Lucia, Luis Ceze, Dan Grossman, and Hans-J. Boehm. Ifrit: Interference-free regions for dynamic data-race detection. In Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages and Applications, OOPSLA ’12, pages 467–484, 2012. ISBN 978-1-4503-1561-6.

[76] Tayfun Elmas, Shaz Qadeer, and Serdar Tasiran. Goldilocks: A race and transaction-aware java runtime. In Proceedings of the 2007 ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '07, pages 245–255, 2007. ISBN 978-1-59593-633-2.

[77] Dawson Engler, David Yu Chen, Seth Hallem, Andy Chou, and Benjamin Chelf. Bugs as deviant behavior: A general approach to inferring errors in systems code. In ACM SIGOPS Operating Systems Review, volume 35, pages 57–72. ACM, 2001.

[78] John Erickson, Madanlal Musuvathi, Sebastian Burckhardt, and Kirk Olynyk. Effective data-race detection for the kernel. In Proceedings of the 9th USENIX conference on Operating systems design and implementation, OSDI '10, 2010.

[79] Cormac Flanagan and Stephen N. Freund. Fasttrack: Efficient and precise dynamic race detection. In Proceedings of the 2009 ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '09, pages 121–133, 2009. ISBN 978-1-60558-392-1.

[80] Vinod Ganapathy, Trent Jaeger, and Somesh Jha. Automatic placement of authorization hooks in the linux security modules framework. In Proceedings of the 12th ACM conference on Computer and communications security, pages 330–339. ACM, 2005.

[81] Vinod Ganapathy, Trent Jaeger, and Somesh Jha. Towards automated authorization policy enforcement. In Proceedings of Second Annual Security Enhanced Linux Symposium. Citeseer, 2006.

[82] David Gens, Simon Schmitt, Lucas Davi, and Ahmad-Reza Sadeghi. K-miner: Uncovering memory corruption in linux. In Proceedings of the 2018 Annual Network and Distributed System Security Symposium (NDSS), San Diego, CA, 2018.

[83] Bhavishya Goel, Ruben Titos-Gil, Anurag Negi, Sally A. McKee, and Per Stenstrom. Performance and energy analysis of the restricted transactional memory implementation on haswell. In Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, IPDPS '14, pages 615–624, Washington, DC, USA, 2014. ISBN 978-1-4799-3800-1.

[84] Google. AddressSanitizerUseAfterReturn, 2017. URL https://github.com/google/sanitizers/wiki/AddressSanitizerUseAfterReturn.

[85] Joseph L. Greathouse, Zhiqiang Ma, Matthew I. Frank, Ramesh Peri, and Todd Austin. Demand-driven software race detection using hardware performance counters. In Proceedings of the 38th Annual International Symposium on Computer Architecture, ISCA '11, pages 165–176, 2011. ISBN 978-1-4503-0472-6.

[86] Intel. Intel® 64 and ia-32 architectures software developer's manual. Volume 3B: System Programming Guide, Part 2, 2011.

[87] Rajiv Gupta. Optimizing array bound checks using flow analysis. ACM Letters on Programming Languages and Systems (LOPLAS), 2(1-4):135–150, 1993.

[88] Shantanu Gupta, Florin Sultan, Srihari Cadambi, Franjo Ivancic, and Martin Roetteler. Racetm: Detecting data races using transactional memory. In Proceedings of the Twentieth Annual Symposium on Parallelism in Algorithms and Architectures, SPAA '08, pages 104–106, 2008. ISBN 978-1-59593-973-9.

[89] Ben Hardekopf and Calvin Lin. The ant and the grasshopper: fast and accurate pointer analysis for millions of lines of code. In ACM SIGPLAN Notices, volume 42, pages 290–299. ACM, 2007.

[90] Ben Hardekopf and Calvin Lin. Exploiting pointer and location equivalence to optimize pointer analysis. pages 265–280, 2007.

[91] Ben Hardekopf and Calvin Lin. Flow-sensitive pointer analysis for millions of lines of code. In Proceedings of the 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization, pages 289–298. IEEE Computer Society, 2011.

[92] Ruud Haring, Martin Ohmacht, Thomas Fox, Michael Gschwind, David Satterfield, Krishnan Sugavanam, Paul Coteus, Philip Heidelberger, Matthias Blumrich, Robert Wisniewski, Alan Gara, George Chiu, Peter Boyle, Norman Christ, and Changhoan Kim. The ibm blue gene/q compute chip. IEEE Micro, 32(2):48–60, March 2012. ISSN 0272-1732.

[93] William Hasenplaugh, Andrew Nguyen, and Nir Shavit. Quantifying the capacity limitations of hardware transactional memory. In WTTM '15: 7th Workshop on the Theory of Transactional Memory, July 2015.

[94] Maurice Herlihy and J. Eliot B. Moss. Transactional memory: Architectural support for lock-free data structures. In Proceedings of the 20th Annual International Symposium on Computer Architecture, ISCA '93, pages 289–300, 1993. ISBN 0-8186-3810-9.

[95] Jeff Huang, Charles Zhang, and Julian Dolby. Clap: Recording local executions to reproduce concurrency failures. In Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’13, pages 141–152, 2013. ISBN 978-1-4503-2014-6.

[96] Intel. Intel architecture instruction set extensions programming reference. chapter 8: Intel transactional synchronization extensions, 2012. https://software.intel.com/sites/default/files/m/9/2/3/41604.

[97] Intel. Intel 64 and ia-32 architectures optimization reference manual. chapter 12: Intel tsx recommendations, 2014. http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf.

[98] Intel. Intel inspector xe, 2015. http://software.intel.com/en-us/intel-inspector-xe.

[99] Intel® Microarchitecture Codename Nehalem Performance Monitoring Unit Programming Guide. Intel Corporation, 2010.

[100] 6th Generation Intel® Processor Datasheet for S-Platforms. Intel Corporation, 2015.

[101] Intel® 64 and IA-32 Architectures Software Developers' Manual. Intel Corporation, Santa Clara, CA, 2016.

[102] International Organization for Standardization. ISO/IEC 14882:2011: Information technology – Programming languages – C++, 2011.

[103] International Organization for Standardization. ISO/IEC 9899:2011: Information technology – Programming languages – C, 2011.

[104] Christian Jacobi, Timothy Slegel, and Dan Greiner. Transactional memory architecture and implementation for ibm system z. In Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO-45, pages 25–36, 2012. ISBN 978-0-7695-4924-8.

[105] Trevor Jim, J Gregory Morrisett, Dan Grossman, Michael W Hicks, James Cheney, and Yanling Wang. Cyclone: A safe dialect of c. In USENIX Annual Technical Conference, General Track, pages 275–288, 2002.

[106] Guoliang Jin, Aditya V. Thakur, Ben Liblit, and Shan Lu. Instrumentation and sampling strategies for cooperative concurrency bug isolation. In William R. Cook, Siobhán Clarke, and Martin C. Rinard, editors, OOPSLA, 2010.

[107] jndok. Analysis and exploitation of pegasus kernel vulnerabilities. http://jndok.github.io/2016/10/04/pegasus-writeup/.

[108] C. Jung, D. Lim, J. Lee, and S. Han. Adaptive execution techniques for SMT multiprocessor architectures. In PPoPP '05, pages 236–246, 2005.

[109] Changhee Jung, Silvius Rus, Brian P. Railing, Nathan Clark, and Santosh Pande. Brainy: effective selection of data structures. In Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation, PLDI '11, pages 86–97, New York, NY, USA, 2011. ACM.

[110] Changhee Jung, Sangho Lee, Easwaran Raman, and Santosh Pande. Automated memory leak detection for production use. In Proceedings of the 36th International Conference on Software Engineering, 2014.

[111] Baris Kasikci, Cristian Zamfir, and George Candea. Racemob: Crowdsourced data race detection. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, SOSP ’13, pages 406–422, 2013. ISBN 978-1-4503-2388-8.

[112] Baris Kasikci, Benjamin Schubert, Cristiano Pereira, Gilles Pokam, and George Candea. Failure sketching: A technique for automated root cause diagnosis of in-production failures. In Proceedings of the 25th Symposium on Operating Systems Principles, SOSP '15, pages 344–360, New York, NY, USA, 2015. ACM. ISBN 978-1-4503-3834-9. doi: 10.1145/2815400.2815412. URL http://doi.acm.org/10.1145/2815400.2815412.

[113] Kirk Kelsey, Tongxin Bai, Chen Ding, and Chengliang Zhang. Fast track: A software system for speculative program optimization. In Proceedings of the 7th Annual IEEE/ACM International Symposium on Code Generation and Optimization, CGO '09, pages 157–168, 2009. ISBN 978-0-7695-3576-0.

[114] Bogdan Korel and Juergen Rilling. Program slicing in understanding of large programs. In Program Comprehension, 1998. IWPC’98. Proceedings., 6th International Workshop on, pages 145–152. IEEE, 1998.

[115] Chris Lattner and Vikram Adve. Llvm: A compilation framework for lifelong program analysis & transformation. In Proceedings of the International Symposium on Code Generation and Optimization: Feedback-directed and Runtime Optimization, CGO ’04, pages 75–, 2004. ISBN 0-7695-2102-9.

[116] Michael A. Laurenzano, Mustafa M. Tikir, Laura Carrington, and Allan Snavely. Pebil: Efficient static binary instrumentation for linux. In International Symposium on the Performance Analysis of Systems and Software, 2010.

[117] Byoungyoung Lee, Chengyu Song, Yeongjin Jang, Tielei Wang, Taesoo Kim, Long Lu, and Wenke Lee. Preventing use-after-free with dangling pointers nullification. In NDSS, 2015.

[118] Dongyoon Lee, Mahmoud Said, Satish Narayanasamy, Zijiang Yang, and Cristiano Pereira. Offline symbolic analysis for multi-processor execution replay. In Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 42, pages 564–575, 2009. ISBN 978-1-60558-798-1.

[119] Dongyoon Lee, Mahmoud Said, Satish Narayanasamy, and Zijiang Yang. Offline symbolic analysis to infer total store order. In Proceedings of the 2011 IEEE 17th International Symposium on High Performance Computer Architecture, HPCA '11, pages 357–358, 2011. ISBN 978-1-4244-9432-3.

[120] Dongyoon Lee, Peter M. Chen, Jason Flinn, and Satish Narayanasamy. Chimera: Hybrid program analysis for determinism. In Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’12, pages 463–474, 2012. ISBN 978-1-4503-1205-9.

[121] Jaejin Lee, Jung-Ho Park, Honggyu Kim, Changhee Jung, Daeseob Lim, and SangYong Han. Adaptive execution techniques of parallel programs for multiprocessors. J. Parallel Distrib. Comput., 70(5):467–480, May 2010. ISSN 0743-7315.

[122] Sangho Lee, Changhee Jung, and Santosh Pande. Detecting memory leaks through introspective dynamic behavior modelling using machine learning. In Proceedings of the 36th International Conference on Software Engineering, 2014.

[123] Yosef Lev, Mark Moir, and Dan Nussbaum. PhTM: Phased transactional memory. In TRANSACT '07: 2nd Workshop on Transactional Computing, August 2007.

[124] Nancy G. Leveson and Clark S. Turner. An investigation of the therac-25 accidents. IEEE Computer, 26(7):18–41, July 1993.

[125] Qingrui Liu, Changhee Jung, Dongyoon Lee, and Devesh Tiwari. Compiler-directed lightweight checkpointing for fine-grained guaranteed recovery. In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC), Nov 2016.

[126] Yutao Liu, Yubin Xia, Haibing Guan, Binyu Zang, and Haibo Chen. Concurrent and consistent virtual machine introspection with hardware transactional memory. In Proceedings of the 2014 IEEE 20th International Symposium on High Performance Computer Architecture, HPCA ’14, 2014.

[127] Yutao Liu, Peitao Shi, Xinran Wang, Haibo Chen, Binyu Zang, and Haibing Guan. Transparent and efficient cfi enforcement with intel processor trace. In Proceedings of the 2017 IEEE 23rd International Symposium on High Performance Computer Architecture, HPCA '17, 2017.

[128] Kangjie Lu, Chengyu Song, Taesoo Kim, and Wenke Lee. Unisan: Proactive kernel memory initialization to eliminate data leakages. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pages 920–932. ACM, 2016.

[129] Shan Lu, Zhenmin Li, Feng Qin, Lin Tan, Pin Zhou, and Yuanyuan Zhou. Bugbench: Benchmarks for evaluating bug detection tools. In Workshop on the evaluation of software defect detection tools, volume 5, 2005.

[130] Shan Lu, Soyeon Park, Eunsoo Seo, and Yuanyuan Zhou. Learning from mistakes: A comprehensive study on real world concurrency bug characteristics. In Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS XIII, pages 329–339, 2008. ISBN 978-1-59593-958-6.

[131] Brandon Lucia, Luis Ceze, Karin Strauss, Shaz Qadeer, and Hans-J. Boehm. Conflict exceptions: Simplifying concurrent language semantics with precise hardware exceptions for data-races. In Proceedings of the 37th Annual International Symposium on Computer Architecture, ISCA '10, pages 210–221, 2010. ISBN 978-1-4503-0053-7.

[132] Chi-Keung Luk, Robert Cohn, Robert Muth, Harish Patil, Artur Klauser, Geoff Lowney, Steven Wallace, Vijay Janapa Reddi, and Kim Hazelwood. Pin: Building customized program analysis tools with dynamic instrumentation. In Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '05, pages 190–200, 2005. ISBN 1-59593-056-6.

[133] Aravind Machiry, Chad Spensky, Jake Corina, Nick Stephens, Christopher Kruegel, and Giovanni Vigna. Dr. checker: A soundy analysis for linux kernel drivers. In 26th USENIX Security Symposium (USENIX Security 17), pages 1007–1024. USENIX Association, 2017.

[134] Jeremy Manson, William Pugh, and Sarita V. Adve. The java memory model. In Proceedings of the 32nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL '05, pages 378–391, 2005. ISBN 1-58113-830-X.

[135] Daniel Marino, Madanlal Musuvathi, and Satish Narayanasamy. Literace: Effective sampling for lightweight data-race detection. In Proceedings of the 2009 ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '09, pages 134–143, 2009. ISBN 978-1-60558-392-1.

[136] Milo Martin, Colin Blundell, and E. Lewis. Subtleties of transactional memory atomicity semantics. IEEE Comput. Archit. Lett., 5(2):17–17, July 2006. ISSN 1556-6056.

[137] Hassan Salehe Matar, Ismail Kuru, Serdar Tasiran, and Roman Dementiev. Accelerating precise race detection using commercially available hardware transactional memory support. In 5th Workshop on Determinism and Correctness in Parallel Programming, WoDet '14, 2014.

[138] Changwoo Min, Sanidhya Kashyap, Byoungyoung Lee, Chengyu Song, and Taesoo Kim. Cross-checking semantic correctness: The case of finding file system bugs. In Proceedings of the 25th Symposium on Operating Systems Principles, pages 361–377. ACM, 2015.

[139] Kevin E. Moore, Jayaram Bobba, Michelle J. Moravan, Mark D. Hill, and David A. Wood. LogTM: Log-based transactional memory. In Proceedings of the 2006 IEEE 12th International Symposium on High Performance Computer Architecture, pages 254–265, February 2006.

[140] S.S. Muchnick. Advanced Compiler Design Implementation. Morgan Kaufmann Publishers, 1997.

[141] Divya Muthukumaran, Trent Jaeger, and Vinod Ganapathy. Leveraging choice to automate authorization hook placement. In Proceedings of the 2012 ACM conference on Computer and communications security, pages 145–156. ACM, 2012.

[142] Divya Muthukumaran, Nirupama Talele, Trent Jaeger, and Gang Tan. Producing hook placements to enforce expected access control policies. In International Symposium on Engineering Secure Software and Systems, pages 178–195. Springer, 2015.

[143] Abdullah Muzahid, Dario Suárez, Shanxiang Qi, and Josep Torrellas. Sigrace: Signature-based data race detection. In Proceedings of the 36th Annual International Symposium on Computer Architecture, ISCA '09, pages 337–348, 2009.

[144] Santosh Nagarakatte, Jianzhou Zhao, Milo M.K. Martin, and Steve Zdancewic. Softbound: Highly compatible and complete spatial memory safety for c. In Proceedings of the 2009 ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '09, pages 245–258, 2009. ISBN 978-1-60558-392-1.

[145] Santosh Nagarakatte, Jianzhou Zhao, Milo MK Martin, and Steve Zdancewic. Cets: compiler enforced temporal safety for c. In ACM Sigplan Notices, volume 45, pages 31–40. ACM, 2010.

[146] Santosh Nagarakatte, Milo M. K. Martin, and Steve Zdancewic. Watchdog: Hardware for safe and secure manual memory management and full memory safety. In Proceedings of the 39th Annual International Symposium on Computer Architecture, ISCA ’12, pages 189–200, 2012. ISBN 978-1-4503-1642-2.

[147] Santosh Nagarakatte, Milo M. K. Martin, and Steve Zdancewic. Watchdoglite: Hardware-accelerated compiler-based pointer checking. In Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization, CGO ’14, pages 175:175–175:184, 2014. ISBN 978-1-4503-2670-4.

[148] Santosh Nagarakatte, Milo MK Martin, and Steve Zdancewic. Everything you want to know about pointer-based checking. In LIPIcs-Leibniz International Proceedings in Informatics, volume 32. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2015.

[149] Santosh Ganapati Nagarakatte. Practical low-overhead enforcement of memory safety for c programs. PhD thesis, University of Pennsylvania, 2012.

[150] George C Necula, Jeremy Condit, Matthew Harren, Scott McPeak, and Westley Weimer. Ccured: Type-safe retrofitting of legacy software. ACM Transactions on Programming Languages and Systems (TOPLAS), 27(3):477–526, 2005.

[151] Nicholas Nethercote and Julian Seward. Valgrind: A framework for heavyweight dynamic binary instrumentation. In Proceedings of the 2007 ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI '07, pages 89–100, 2007. ISBN 978-1-59593-633-2.

[152] Robert H. B. Netzer and Barton P. Miller. What are race conditions?: Some issues and formalizations. ACM Lett. Program. Lang. Syst., 1(1):74–88, March 1992. ISSN 1057-4514.

[153] Gene Novark, Emery D. Berger, and Benjamin G. Zorn. Efficiently and precisely locating memory leaks and bloat. In Proceedings of the 2009 ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’09, 2009.

[154] Oleksii Oleksenko, Dmitrii Kuvaiskii, Pramod Bhatotia, Pascal Felber, and Christof Fetzer. Intel MPX explained: An empirical study of Intel MPX and software-based bounds checking approaches. CoRR, abs/1702.00719, 2017. URL http://arxiv.org/abs/1702.00719.

[155] Hilarie Orman. The morris worm: A fifteen-year perspective. IEEE Security & Privacy, 1(5):35–43, 2003.

[156] Yoann Padioleau, Julia Lawall, René Rydhof Hansen, and Gilles Muller. Documenting and automating collateral evolutions in Linux device drivers. In ACM SIGOPS Operating Systems Review, volume 42, pages 247–260. ACM, 2008.

[157] Soyeon Park, Yuanyuan Zhou, Weiwei Xiong, Zuoning Yin, Rini Kaushik, Kyu H Lee, and Shan Lu. PRES: Probabilistic replay with execution sketching on multiprocessors. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, pages 177–192. ACM, 2009.

[158] PCWorld. Nasdaq’s Facebook glitch came from race conditions, May 2012. http://www.pcworld.com/article/255911/nasdaqs_facebook_glitch_came_from_race_conditions.html.

[159] Fernando Magno Quintao Pereira and Daniel Berlin. Wave propagation and deep propagation for pointer analysis. In Proceedings of the 2009 International Symposium on Code Generation and Optimization, CGO ’09, pages 126–135. IEEE, 2009.

[160] Eli Pozniansky and Assaf Schuster. MultiRace: Efficient on-the-fly data race detection in multithreaded C++ programs: Research articles. Concurr. Comput. : Pract. Exper., 19(3):327–340, March 2007. ISSN 1532-0626.

[161] Marco Prandini and Marco Ramilli. Return-oriented programming. IEEE Security & Privacy, 10(6):84–87, 2012.

[162] Dick Price. Pentium FDIV flaw: Lessons learned. IEEE Micro, 15(2):86–88, 1995.

[163] Milos Prvulovic. CORD: Cost-effective (and nearly overhead-free) order-recording and data race detection. In Proceedings of the 2006 IEEE 12th International Symposium on High Performance Computer Architecture, HPCA ’06, 2006.

[164] Milos Prvulovic and Josep Torrellas. ReEnact: Using thread-level speculation mechanisms to debug data races in multithreaded codes. In Proceedings of the 30th Annual International Symposium on Computer Architecture, ISCA ’03, pages 110–121, 2003. ISBN 0-7695-1945-8.

[165] Feng Qin, Shan Lu, and Yuanyuan Zhou. SafeMem: Exploiting ECC-memory for detecting memory leaks and memory corruption during production runs. In High-Performance Computer Architecture, 2005. HPCA-11. 11th International Symposium on, pages 291–302. IEEE, 2005.

[166] Department of Defense. Trusted Computer System Evaluation Criteria. DoD 5200.28-STD, 1985.

[167] Sanjay Rawat, Laurent Mounier, and Marie-Laure Potet. LiSTT: An investigation into unsound-incomplete yet practical result yielding static taintflow analysis. In Availability, Reliability and Security (ARES), 2014 Ninth International Conference on, pages 498–505. IEEE, 2014.

[168] Gang Ren, Eric Tune, Tipp Moseley, Yixin Shi, Silvius Rus, and Robert Hundt. Google-wide profiling: A continuous profiling infrastructure for data centers. IEEE Micro, 30(4):65–79, 2010.

[169] Carl G. Ritson and Frederick R. M. Barnes. An evaluation of Intel’s restricted transactional memory for CPAs, 2013.

[170] Simon Rogerson. The chinook helicopter disaster. IMIS Journal, 12(2), 2002.

[171] Michiel Ronsse and Koen De Bosschere. RecPlay: A fully integrated practical record/replay system. ACM Trans. Comput. Syst., 17(2):133–152, May 1999. ISSN 0734-2071.

[172] Olatunji Ruwase and Monica S Lam. A practical dynamic buffer overflow detector. In NDSS, volume 2004, pages 159–169, 2004.

[173] Paul Sack, Brian E. Bliss, Zhiqiang Ma, Paul Petersen, and Josep Torrellas. Accurate and efficient filtering for the Intel Thread Checker race detector. In Proceedings of the 1st Workshop on Architectural and System Support for Improving Software Dependability, ASID ’06, pages 34–41, 2006. ISBN 1-59593-576-2.

[174] Ravi S Sandhu and Pierangela Samarati. Access control: Principles and practice. IEEE Communications Magazine, 32(9):40–48, 1994.

[175] Pawel Sarbinowski, Vasileios P Kemerlis, Cristiano Giuffrida, and Elias Athanasopoulos. VTPin: Practical vtable hijacking protection for binaries. In Proceedings of the 32nd Annual Conference on Computer Security Applications, pages 448–459. ACM, 2016.

[176] Madalina-Ioana Sas. Snowwall: A visual firewall for the surveillance society. 2017.

[177] Stefan Savage, Michael Burrows, Greg Nelson, Patrick Sobalvarro, and Thomas Anderson. Eraser: A dynamic data race detector for multithreaded programs. ACM Trans. Comput. Syst., 15(4):391–411, November 1997. ISSN 0734-2071.

[178] Mark Seaborn and Thomas Dullien. Exploiting the DRAM Rowhammer bug to gain kernel privileges. Black Hat, pages 7–9, 2015.

[179] SecurityFocus. Software bug contributed to blackout, Feb. 2004. http://www.securityfocus.com/news/8016.

[180] Konstantin Serebryany and Timur Iskhodzhanov. ThreadSanitizer: Data race detection in practice. In Proceedings of the Workshop on Binary Instrumentation and Applications, WBIA ’09, pages 62–71, 2009. ISBN 978-1-60558-793-6.

[181] Konstantin Serebryany, Derek Bruening, Alexander Potapenko, and Dmitriy Vyukov. AddressSanitizer: A fast address sanity checker. In USENIX Annual Technical Conference, pages 309–318, 2012.

[182] Konstantin Serebryany, Derek Bruening, Alexander Potapenko, and Dmitriy Vyukov. AddressSanitizer: A fast address sanity checker. In Proceedings of the 2012 USENIX Conference on Annual Technical Conference, USENIX ATC’12, pages 28–28, Berkeley, CA, USA, 2012. USENIX Association. URL http://dl.acm.org/citation.cfm?id=2342821.2342849.

[183] Yuru Shao, Qi Alfred Chen, Zhuoqing Morley Mao, Jason Ott, and Zhiyun Qian. Kratos: Discovering inconsistent security policy enforcement in the android framework. In NDSS, 2016.

[184] Tianwei Sheng, Neil Vachharajani, Stephane Eranian, Robert Hundt, Wenguang Chen, and Weimin Zheng. RACEZ: A lightweight and non-invasive race detection tool for production applications. In Proceedings of the 33rd International Conference on Software Engineering, ICSE ’11, pages 401–410, 2011. ISBN 978-1-4503-0445-0.

[185] Robert Skeel. Roundoff error and the patriot missile. SIAM News, 25(4):11, 1992.

[186] Stephen Smalley and Robert Craig. Security Enhanced (SE) Android: Bringing flexible MAC to Android. In NDSS, volume 310, pages 20–38, 2013.

[187] Stephen Smalley, Chris Vance, and Wayne Salamon. Implementing SELinux as a Linux security module. NAI Labs Report, 1(43):139, 2001.

[188] Sooel Son, Kathryn S McKinley, and Vitaly Shmatikov. RoleCast: Finding missing security checks when you do not know what checks are. In ACM SIGPLAN Notices, volume 46, pages 1069–1084. ACM, 2011.

[189] Yulei Sui and Jingling Xue. SVF: Interprocedural static value-flow analysis in LLVM. In Proceedings of the 25th International Conference on Compiler Construction, pages 265–266. ACM, 2016.

[190] Yulei Sui, Ding Ye, Yu Su, and Jingling Xue. Eliminating redundant bounds checks in dynamic buffer overflow detection using weakest preconditions. IEEE Transactions on Reliability, 65(4):1682–1699, 2016.

[191] Martin Susskraut, Thomas Knauth, Stefan Weigert, Ute Schiffel, Martin Meinhold, and Christof Fetzer. Prospect: A compiler framework for speculative parallelization. In Proceedings of the 8th Annual IEEE/ACM International Symposium on Code Generation and Optimization, CGO ’10, pages 131–140, 2010.

[192] Laszlo Szekeres, Mathias Payer, Tao Wei, and Dawn Song. SoK: Eternal war in memory. In Security and Privacy (SP), 2013 IEEE Symposium on, pages 48–62. IEEE, 2013.

[193] Lin Tan, Xiaolan Zhang, Xiao Ma, Weiwei Xiong, and Yuanyuan Zhou. AutoISES: Automatically inferring security specification and detecting violations. In USENIX Security Symposium, pages 379–394, 2008.

[194] Gregory Tassey. The economic impacts of inadequate infrastructure for software testing. National Institute of Standards and Technology, 2002.

[195] The Clang Team. Clang 3.8 ThreadSanitizer, 2015. http://clang.llvm.org/docs/ThreadSanitizer.html.

[196] Frank Tip. A survey of program slicing techniques. Centrum voor Wiskunde en Informatica, 1994.

[197] Minh Tran, Mark Etheridge, Tyler Bletsch, Xuxian Jiang, Vincent Freeh, and Peng Ning. On the expressiveness of return-into-libc attacks. In Proceedings of the 14th International Conference on Recent Advances in Intrusion Detection, RAID’11, pages 121–141, Berlin, Heidelberg, 2011. Springer-Verlag. ISBN 978-3-642-23643-3. doi: 10.1007/978-3-642-23644-0_7. URL http://dx.doi.org/10.1007/978-3-642-23644-0_7.

[198] National Computer Security Center (US). A guide to understanding discretionary access control in trusted systems, volume 3. National Computer Security Center, 1987.

[199] Erik van der Kouwe, Vinod Nigade, and Cristiano Giuffrida. DangSan: Scalable use-after-free detection. In Proceedings of the Twelfth European Conference on Computer Systems, pages 405–419. ACM, 2017.

[200] Victor van der Veen, Yanick Fratantonio, Martina Lindorfer, Daniel Gruss, Clémentine Maurice, Giovanni Vigna, Herbert Bos, Kaveh Razavi, and Cristiano Giuffrida. Drammer: Deterministic Rowhammer attacks on mobile platforms. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, CCS ’16, pages 1675–1689, New York, NY, USA, 2016. ACM. ISBN 978-1-4503-4139-4. doi: 10.1145/2976749.2978406. URL http://doi.acm.org/10.1145/2976749.2978406.

[201] Kaushik Veeraraghavan, Dongyoon Lee, Benjamin Wester, Jessica Ouyang, Peter M. Chen, Jason Flinn, and Satish Narayanasamy. DoublePlay: Parallelizing sequential logging and replay. In Proceedings of the Sixteenth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS XVI, pages 15–26, 2011. ISBN 978-1-4503-0266-1.

[202] Guru Venkataramani, Brandyn Roemer, Yan Solihin, and Milos Prvulovic. MemTracker: Efficient and programmable support for memory access monitoring and debugging. In High Performance Computer Architecture, 2007. HPCA 2007. IEEE 13th International Symposium on, pages 273–284. IEEE, 2007.

[203] Hayawardh Vijayakumar, Xinyang Ge, Mathias Payer, and Trent Jaeger. Jigsaw: Protecting resource access by inferring programmer expectations. In USENIX Security Symposium, pages 973–988, 2014.

[204] Jan Wen Voung, Ranjit Jhala, and Sorin Lerner. RELAY: Static race detection on millions of lines of code. In Proceedings of the 6th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering, ESEC-FSE ’07, pages 205–214, 2007. ISBN 978-1-59593-811-4.

[205] Jaroslav Ševčík and David Aspinall. On validity of program transformations in the Java memory model. In Proceedings of the 22nd European Conference on Object-Oriented Programming, ECOOP ’08, pages 27–51, 2008. ISBN 978-3-540-70591-8.

[206] Pengfei Wang, Jens Krinke, Kai Lu, Gen Li, and Steve Dodier-Lazaro. How double-fetch situations turn into double-fetch vulnerabilities: A study of double fetches in the Linux kernel. In USENIX Security Symposium, 2017.

[207] Wenwen Wang, Kangjie Lu, and Pen-Chung Yew. Check it again: Detecting lacking-recheck bugs in OS kernels. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. ACM, 2018.

[208] Xi Wang, Haogang Chen, Zhihao Jia, Nickolai Zeldovich, and M Frans Kaashoek. Improving integer security for systems with KINT. In OSDI, volume 12, pages 163–177, 2012.

[209] Xiaoxiao Wang, Mohammad Tehranipoor, and Jim Plusquellic. Detecting malicious inclusions in secure hardware: Challenges and solutions. In Hardware-Oriented Security and Trust, 2008. HOST 2008. IEEE International Workshop on, pages 15–19. IEEE, 2008.

[210] Robert NM Watson, Jonathan Woodruff, Peter G Neumann, Simon W Moore, Jonathan Anderson, David Chisnall, Nirav Dave, Brooks Davis, Khilan Gudka, Ben Laurie, et al. CHERI: A hybrid capability-system architecture for scalable software compartmentalization. In 2015 IEEE Symposium on Security and Privacy, pages 20–37. IEEE, 2015.

[211] Benjamin Wester, David Devecsery, Peter M. Chen, Jason Flinn, and Satish Narayanasamy. Parallelizing data race detection. In Proceedings of the Eighteenth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS ’13, pages 27–38, 2013. ISBN 978-1-4503-1870-9.

[212] John Wilander, Nick Nikiforakis, Yves Younan, Mariam Kamkar, and Wouter Joosen. RIPE: Runtime intrusion prevention evaluator. In Proceedings of the 27th Annual Computer Security Applications Conference, ACSAC ’11, pages 41–50, New York, NY, USA, 2011. ACM. ISBN 978-1-4503-0672-0. doi: 10.1145/2076732.2076739. URL http://doi.acm.org/10.1145/2076732.2076739.

[213] R.R. Wilcox. Introduction to Robust Estimation and Hypothesis Testing. Elsevier Science & Technology, 2012. ISBN 9780123869838.

[214] Chris Wright, Crispin Cowan, James Morris, Stephen Smalley, and Greg Kroah-Hartman. Linux security module framework. In Ottawa Linux Symposium, volume 8032, pages 6–16, 2002.

[215] Yubin Xia, Yutao Liu, Haibo Chen, and Binyu Zang. CFIMon: Detecting violation of control flow integrity using performance counters. In Proceedings of the 2012 42nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), DSN ’12, pages 1–12, Washington, DC, USA, 2012. IEEE Computer Society. ISBN 978-1-4673-1624-8. URL http://dl.acm.org/citation.cfm?id=2354410.2355130.

[216] Meng Xu, Chenxiong Qian, Kangjie Lu, Michael Backes, and Taesoo Kim. Precise and scalable detection of double-fetch bugs in OS kernels. In 2018 IEEE Symposium on Security and Privacy (SP). IEEE, 2018.

[217] Fabian Yamaguchi, Christian Wressnegger, Hugo Gascon, and Konrad Rieck. Chucky: Exposing missing checks in source code for vulnerability discovery. In Proceedings of the 2013 ACM SIGSAC Conference on Computer & Communications Security, pages 499–510. ACM, 2013.

[218] Junfeng Yang, Ted Kremenek, Yichen Xie, and Dawson Engler. MECA: An extensible, expressive system and language for statically checking security properties. In Proceedings of the 10th ACM Conference on Computer and Communications Security, pages 321–334. ACM, 2003.

[219] Junfeng Yang, Ang Cui, Sal Stolfo, and Simha Sethumadhavan. Concurrency attacks. In The 4th USENIX Workshop on Hot Topics in Parallelism, Berkeley, CA, 2012. USENIX. URL https://www.usenix.org/conference/hotpar12/concurrency-attacks.

[220] Kaiyuan Yang, Matthew Hicks, Qing Dong, Todd Austin, and Dennis Sylvester. A2: Analog malicious hardware. In Security and Privacy (SP), 2016 IEEE Symposium on, pages 18–37. IEEE, 2016.

[221] Ding Ye, Yu Su, Yulei Sui, and Jingling Xue. WPBound: Enforcing spatial memory safety efficiently at runtime with weakest preconditions. In Proceedings of the 2014 IEEE 25th International Symposium on Software Reliability Engineering, ISSRE ’14, pages 88–99, Washington, DC, USA, 2014. IEEE Computer Society. ISBN 978-1-4799-6033-0. doi: 10.1109/ISSRE.2014.20. URL http://dx.doi.org/10.1109/ISSRE.2014.20.

[222] Ding Ye, Yu Su, Yulei Sui, and Jingling Xue. WPBound: Enforcing spatial memory safety efficiently at runtime with weakest preconditions. In Software Reliability Engineering (ISSRE), 2014 IEEE 25th International Symposium on, ISSRE ’14, pages 88–99, Washington, DC, USA, 2014. IEEE Computer Society. ISBN 978-1-4799-6033-0. doi: 10.1109/ISSRE.2014.20. URL http://dx.doi.org/10.1109/ISSRE.2014.20.

[223] Suan Hsi Yong and Susan Horwitz. Protecting C programs from attacks via invalid pointer dereferences. In ACM SIGSOFT Software Engineering Notes, volume 28, pages 307–316. ACM, 2003.

[224] Richard M. Yoo, Christopher J. Hughes, Konrad Lai, and Ravi Rajwar. Performance evaluation of Intel® transactional synchronization extensions for high-performance computing. In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, SC ’13, pages 19:1–19:11, 2013. ISBN 978-1-4503-2378-9.

[225] Yves Younan. FreeSentry: Protecting against use-after-free vulnerabilities due to dangling pointers. In NDSS, 2015.

[226] Jie Yu and Satish Narayanasamy. A case for an interleaving constrained shared-memory multi-processor. In Proceedings of the 36th Annual International Symposium on Computer Architecture, ISCA ’09, pages 325–336, 2009. ISBN 978-1-60558-526-0.

[227] Yuan Yu, Tom Rodeheffer, and Wei Chen. RaceTrack: Efficient detection of data race conditions via adaptive tracking. In Proceedings of the Twentieth ACM Symposium on Operating Systems Principles, SOSP ’05, pages 221–234, 2005. ISBN 1-59593-079-5.

[228] Insu Yun, Changwoo Min, Xujie Si, Yeongjin Jang, Taesoo Kim, and Mayur Naik. APISan: Sanitizing API usages through semantic cross-checking. In USENIX Security Symposium, pages 363–378, 2016.

[229] Mingwei Zhang, Rui Qiao, Niranjan Hasabnis, and R. Sekar. A platform for secure static binary instrumentation. In Proceedings of the 10th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, VEE ’14, pages 129–140, New York, NY, USA, 2014. ACM. ISBN 978-1-4503-2764-0. doi: 10.1145/2576195.2576208. URL http://doi.acm.org/10.1145/2576195.2576208.

[230] Tong Zhang, Dongyoon Lee, and Changhee Jung. TxRace: Efficient data race detection using commodity hardware transactional memory. ACM SIGARCH Computer Architecture News, 44(2):159–173, 2016.

[231] Tong Zhang, Changhee Jung, and Dongyoon Lee. ProRace: Practical data race detection for production use. ACM SIGOPS Operating Systems Review, 51(2):149–162, 2017.

[232] Tong Zhang, Dongyoon Lee, and Changhee Jung. BOGO: Buy spatial memory safety, get temporal memory safety (almost) free. In Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 631–644. ACM, 2019.

[233] Tong Zhang, Wenbo Shen, Dongyoon Lee, Changhee Jung, Ahmed M Azab, and Ruowen Wang. PeX: A permission check analysis framework for Linux kernel. In 28th USENIX Security Symposium (USENIX Security 19), pages 1205–1220, 2019.

[234] Xiaolan Zhang, Antony Edwards, and Trent Jaeger. Using CQUAL for static analysis of authorization hook placement. In USENIX Security Symposium, pages 33–48, 2002.

[235] Pin Zhou, Radu Teodorescu, and Yuanyuan Zhou. HARD: Hardware-assisted lockset-based race detection. In Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture, HPCA ’07, pages 121–132, 2007. ISBN 1-4244-0804-0.