A Performance Evaluation of Platform-Independent Methods to Search for Hidden Instructions on RISC Processors
Total Page:16
File Type:pdf, Size:1020Kb
Opleiding Informatica A Performance Evaluation of Platform-Independent Methods to Search for Hidden Instructions on RISC Processors Rens Dofferhoff Supervisors: Dr. E. van der Kouwe & Dr. K.F.D. Rietveld BACHELOR THESIS Leiden Institute of Advanced Computer Science (LIACS) www.liacs.leidenuniv.nl 02/08/2019 Abstract Undocumented and faulty instructions can pose a security risk, as shown in the past with the infamous f00f-bug. A scanner that searches for undocumented instructions can be a useful tool to help verify the secure operation of processors. This research proposes two methods for finding undocumented instructions on RISC processors. These methods attempt the execution of a single instruction and analyze the resulting signal information to determine if the instruction is seen as valid by the processor. The results are compared to the behavior specified by the ISA the processor implements. The resulting scanner program is used to scan multiple ARMv8 and RISC-V systems. Various flaws were discovered in the used disassemblers and the QEMU emulator. An undocumented instruction was found on a RISC-V chip. Within this research the performance of the resulting scanner including multi-core scaling behavior is analyzed. Both methods allow the scanning of 32-bit instruction spaces in less than a day. Good multi-core scaling is seen on systems with less than 16 cores. Contents 1 Introduction 1 1.1 Contributions.......................................1 1.2 Thesis overview......................................1 2 Background 3 2.1 Instruction set architecture................................3 2.2 Hidden instructions....................................4 2.3 Exceptions & signal handling..............................4 3 Related work 5 4 Overview 7 5 Design 10 5.1 Program state protection................................. 10 5.2 Analysis stage....................................... 10 5.2.1 Resolving privileged instructions......................... 11 5.3 Instruction fetch stage.................................. 12 5.4 Manager.......................................... 12 5.5 Ptrace method...................................... 12 5.6 Memcage method..................................... 14 5.6.1 Program state recovery.............................. 16 5.6.2 Hangs....................................... 17 5.7 Hybrid length encoding.................................. 17 5.7.1 Instruction space traversal............................ 17 5.7.2 Length detection................................. 18 6 Implementation 19 6.1 ISA specific variables & functions............................ 19 6.2 Blacklist.......................................... 20 6.2.1 Blacklist using sparse hash set.......................... 21 6.2.2 Blacklist using list of (value, care mask)-pairs................. 21 6.3 Disassembler........................................ 21 6.4 Self modifying code & Cache coherency......................... 22 6.5 Self-attach ptrace..................................... 23 7 Evaluation 24 7.1 Test system descriptions................................. 24 7.2 RISC-V Verification.................................... 24 7.2.1 Instruction Length detection test........................ 25 7.3 Performance........................................ 25 7.3.1 Multicore Scaling................................. 25 7.3.2 Effects of processor affinity............................ 30 7.3.3 Comparison memcage and ptrace method.................... 30 7.3.4 System comparisons............................... 31 7.3.5 Performance profile................................ 31 7.4 Scan results........................................ 32 7.4.1 ARMv8 A64.................................... 32 7.4.2 RISC-V...................................... 33 8 Limitations 36 9 Conclusion 37 10 Future work 38 References 40 Appendix A Labor division 41 1 Introduction Computer systems are of paramount importance to our modern society. Secure operation of these systems depends on both software and hardware. Software is generally mistrusted, both software verification and malware detection are extensively researched topics. Hardware however often seems to get a pass. Many programmers and analysts used to treat processors like trusted black boxes. In recent years we have seen some serious hardware security vulnerabilities like Meltdown [11] and Spectre [10] that affect many modern processors. This thesis is inspired by previous research done by Christopher Domas [4]. In this research a tool is built and used to scan x86 processors for undocumented and faulty instructions. Using this tool various such undocumented or `hidden' instructions were found. These hidden instructions can pose serious security risks. For example Domas was able to find a previously unknown halt and catch fire instruction similar to the infamous f00f-bug on a x86 processor [3]. With the rise of mobile devices the usage of RISC processors has grown. RISC processors make up a large majority of smartphone processors and are beginning to make their way into the server market. A possible security vulnerability on these systems brought about by a hidden instruction could therefore have a large impact. A program to search for hidden instructions on RISC processors could provide a useful tool to find possible security vulnerability in these processors. The goal of this research is to develop methods to scan RISC processors for hidden instructions. The aim is that these methods are generic and as such can be used on a myriad of different RISC instruction sets and processors. This research will describe the effectiveness and quantitative performance of the developed methods. 1.1 Contributions This research will deliver the following items: • A description and implementation of two methods to search for hidden instructions on RISC processors • Scan results of multiple RISC processors based on the ARM AArch64 and RV64GC architec- tures • A performance evaluation of the developed implementations of the instruction scanner 1.2 Thesis overview This section contains the introduction of this thesis; Section2 contains some background information needed to understand this thesis; Section3 describes previous research related to this subject; Section4 gives a high level overview of the developed programs components; Section5 discusses the overall program design and methods developed in this research in depth; Section6 gives 1 information on how the methods and design are implemented and what issues were encountered in their implementation; Section7 shows the results of system scans and performance analysis; Section8 describes the limitations on the methods, their implementations and any gathered results; Section 10 makes suggestions for further research. This is a bachelor thesis, by R. Dofferhoff for the Leiden Institute of Advanced Computer Science, supervised by Dr. E. van der Kouwe and Dr. K.F.D. Rietveld. A lot of the work was done in collaboration with M. G¨oebel, this includes development of the memcage method, basic analysis stage and basic fetch stage. All ptrace and RISC-V related components were developed by R. Dofferhoff. M. G¨oebel developed all analysis tools and the low memory blacklist. M. G¨oebel also verified the memcage method. AppendixA contains a table with a more detailed division of labor. For the thesis of M. G¨oebel we refer to [7]. 2 2 Background In this section the necessary background knowledge needed to understand this thesis is discussed. Definitions and explanations of various terms surrounding instruction set architecture and exception handling are given. Our definition of hidden instructions is also provided. 2.1 Instruction set architecture An instruction set architecture (ISA) is the interface between hardware and software. It defines the various attributes a processor implementing the ISA should have. Examples of such attributes include the instruction set that it should support and the state it must contain, like the size and number of general purpose registers. For software development an ISA is effectively a compiler target. Software that runs on one implementation of the ISA should also run on a different implementation of the same ISA. The instruction set is defined within the ISA. It is a set of operations called instructions and their binary encoding. Examples of instructions are integer operations such as add and subtract or memory operations such as loads and stores. A software program is a collection of encoded instructions that are executed sequentially. Operands of an instruction such as register numbers or immediate values are encoded along with the operation. An example of an instruction encoding is shown in Figure1. Aside from the few fixed bits that define the operation there are fields within the encoding that specify its various operands. An encoded instruction is often referred to as an `instruction' as well, context should make clear if the binary encoding or the operation itself is meant. Instructions use certain addressing modes, these addressing modes define how the instruction operands are retrieved using data in registers and immediate data within the instruction encoding. Important terms within this research are register and program counter relative addressing. Using register relative addressing the address of an operand stored in main memory is determined by the value in a specified base register and a (signed) offset encoded in the instruction encoding. The address a of an operand in main memory, resulting from register relative addressing relative to register r and encoded immediate i, is given by a = r + i. Using register