DEFENDING OPERATING SYSTEMS FROM MALICIOUS PERIPHERALS

By JING TIAN

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2019

© 2019 Jing Tian

For my mom, who gave up her chance to go to college for her family but firmly believes “Knowledge is power”. For my dad, who knows nothing about but bought me one in 1998.

ACKNOWLEDGMENTS

I am extremely grateful to my family, who always love and support me unconditionally; my advisor Dr. Kevin Butler, who inspired me to pursue a faculty career and keeps inspiring me; and Xie and Fubao, who accompanied me in the past five years. I am also indebted to Dr. Patrick Traynor, who provided guidance and support all along the way. I am also grateful to Dr. Adam Bates, Dr. Bradley Reaves, Dr. Benjamin Mood, and Dr. Nolen Scaife – it is my honor to work with you guys. I would like to thank Dr. Adam Bates, Dr. Patrick McDaniel, Dr. Michael Bailey, Dr. Prabhat Mishra, Dr. Raju Rangaswami, Dr. Tom Shrimpton, and Dr. Vincent Bindschaedler, who gave strong support during my job hunting. Special thanks to Dr. Emily Rine Butler, who taught me how to write an academic paper.

I had the great pleasure of working with all the talented and motivated students of the FICS research. Many thanks to Grant Hernandez and Joseph Choi for their consistent support whenever my paper is on fire. I would like to thank all my co-authors on the projects described in this work – these projects would not have happened without your contribution. I wish to thank our graduate coordinator Adrienne Cook, who always does her best to make sure everything is on track. I am grateful to Gary and Margurite, who keep encouraging and praying for me whenever I am blue. I owe Dr. Dejing Dou a debt of gratitude for his support during my first year in Oregon.

Lastly, I would also like to thank my committee members, Dr. Swarup Bhunia, Dr. Prabhat Mishra, and Dr. Patrick Traynor, who have been accommodating in spite of tight schedules and helpful in providing feedback that helped shape this work.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT

CHAPTER

1 INTRODUCTION

2 BACKGROUND
2.1 USB Security
2.1.1 USB Protocol
2.1.2 USB Attacks and Defenses
2.2 Bluetooth and NFC Security
2.3 BPF/eBPF

3 GOODUSB
3.1 Design
3.1.1 Threat Model and Assumptions
3.1.2 Mediating USB Interfaces and Drivers
3.1.3 Identifying USB Devices
3.1.4 Profiling Malicious USB Devices
3.2 Implementation
3.2.1
3.2.2 USB Honeypot
3.2.3 Device Class Identifier
3.2.4 Limited HID Driver
3.3 Evaluation
3.3.1 Attack Analysis
3.3.1.1 HID-based attacks
3.3.1.2 Other USB interfaces and composite devices
3.3.1.3 -based USB attacks
3.3.2 Performance Analysis
3.4 Discussion

4 USBFILTER
4.1 Design
4.1.1 Threat and Trust Models
4.1.2 Design Goals
4.1.3 Design and Implementation
4.1.3.1 Packet filtering rules
4.1.3.2 Traceback
4.1.3.3 Userspace control
4.1.4 Deployment
4.1.4.1 Platform integrity
4.1.4.2 Runtime integrity
4.2 Security Analysis
4.3 Evaluation
4.3.1 Case Studies
4.3.2 Benchmarks
4.3.2.1 Microbenchmark
4.3.2.2 Macrobenchmark
4.3.3 Real-world Workloads
4.3.4 Summary
4.4 Discussion
4.4.1 Process Table
4.4.2 System Caching
4.4.3 Packet Analysis From USB Devices
4.4.4 Malicious USB Drivers and USB Covert Channels
4.4.5 Usability Issues

5 (E)BPF MODULES
5.1 Design
5.1.1 Security Model
5.1.2 Goals: Beyond A Reference Monitor
5.1.3 LBM Kernel Infrastructure
5.1.4 LBM User Space
5.2 Implementation
5.2.1 LBM Kernel Space
5.2.2 LBM User Space
5.3 Evaluation
5.3.1 Case Studies
5.3.2 Benchmark Setup
5.3.3 Micro-Benchmark
5.3.4 Macro-Benchmark
5.3.5 Scalability
5.4 Discussion
5.4.1 LBM vs. USBFILTER vs. USBFirewall
5.4.2 L2CAP Signaling in Bluetooth
5.4.3 BPF Memory Write
5.4.4 BPF Helper Kernel Modules
5.4.5 LLVM Support
5.5 Limitations
5.5.1 Stateless vs. Stateful Policy
5.5.2 DMA-Oriented Protocols
5.5.3 Operating Systems Dependency
5.5.4 Lbmtool Limitations

6 USB TYPE-C AUTHENTICATION
6.1 Authentication Protocol
6.1.1 USB Certificate Authorities
6.1.2 Authentication Protocol
6.1.3 Secure Key Storage and Processing
6.1.4 Security Policy
6.2 Formal Verification
6.3 Other Issues

7 REFLECTIONS ON PERIPHERAL SECURITY
7.1 Future Work
7.2 Conclusion

APPENDIX

A A LUM EXAMPLE TO BLOCK SCSI WRITES
B LBMTOOL FRONTEND GRAMMAR
C LBMTOOL COMPILATION EXAMPLE
D LMBENCH RESULTS FOR LBM

REFERENCES
BIOGRAPHICAL SKETCH

LIST OF TABLES

2-1 Notable real-world attacks on the USB/Peripheral ecosystem, grouped by the layer at which they operate and the offensive primitive of which they are an instance.
3-1 Microbenchmarking GoodUSB operation (in microseconds) averaged over 20 runs.
4-1 Prolog reasoning time (µs) averaged over 100 runs.
4-2 Rule adding operation time (ms) averaged over 100 runs.
4-3 USB enumeration time (ms) averaged over 20 runs.
4-4 Packet filtering time (µs) averaged over 1500 packets.
4-5 Latency (ms) of the fileserver workload with different mean file sizes.
5-1 LBM compared to USBFILTER and USBFirewall. LBM unifies USBFILTER and USBFirewall, providing a superset of their properties via extensible protocol support.
5-2 LBM vs. USBFILTER vs. USBFirewall, specifically with respect to the filter design of each.
5-3 LBM statistics per subsystem, including # of fields exposed to the user space, # of BPF helpers implemented, and # of lines of code changes.
5-4 The number of lines added to support NFC.
5-5 Details about the five LBM rules used during the benchmarks.
5-6 LBM overhead in µs based on processing 10K packets on the RX path. For each subsystem, the 1st row is for normal LBM and the 2nd row is for LBM-JIT. In most cases, the overhead is within 1 µs when JIT is enabled.
D-1 lmbench results for a Vanilla kernel, LBM, and LBM-JIT.

LIST OF FIGURES

2-1 Peripheral vulnerabilities can be classified by the abstracted communications layer at which they operate. A successful attack involves violating a design assumption or implementation error at a given layer.
2-2 A USB device containing two configurations. Configuration 1 contains two interfaces, and configuration 2 contains one interface. Each interface supports two unidirectional communication channels (In/Out) with the host machine. Each channel may contain more than one endpoint (EP), which is the sink of the communication.
2-3 USB Enumeration Procedure.
3-1 During USB enumeration, the host discovers the device and the drivers (interfaces) that need to be loaded in order for it to operate. In the BadUSB attack, marked by a red dotted line, the device requests additional, unexpected interfaces that allow it to perform covert activities on the system.
3-2 GoodUSB introduces a mediator in the USB stack of the host. The mediator restricts USB devices' (subjects) access to USB drivers (objects) according to policy.
3-3 GoodUSB cannot trust what the device claims to be during enumeration; however, the device’s claims can be verified by checking them against the user’s expectation as to what the device is and how it should operate. If verification fails, the device is flagged as potentially malicious, and is redirected to a honeypot virtual machine.
3-4 The GoodUSB architecture. Components that are introduced by GoodUSB are colored orange and bordered by dashed lines. The kernel hub thread is also modified to interoperate with our system components.
3-5 Screenshots from GoodUSB user interface.
3-6 GoodUSB’s profiling tool, HoneyUSB, captures injected keystrokes from a USB storage device maliciously exposing a keyboard (HID) interface.
4-1 USBFILTER implements a USB-layer reference monitor within the kernel, by filtering USB packets to different USB devices to control the communications between applications and devices based on rules configured.
4-2 The architecture of USBFILTER.
4-3 The output of “usbtables -h”. The permitted conditions are divided into 4 tables: the process table, the device table, the packet table, and the Linux USBFILTER Module (LUM) table.
4-4 Filebench throughput (MB/s) using fileserver workload with different mean file sizes.
4-5 Iperf bandwidth (MB/s) using TCP with different time intervals.
4-6 Iperf bandwidth (MB/s) using UDP with different time intervals.
4-7 Performance comparison of real-world workloads.
5-1 LBM Architecture.
5-2 LBM hooks inside the USB subsystem.
5-3 The flow of lbmtool in compiling LBM rules to eBPF programs and loading them into the running kernel.
5-4 LBM core component.
5-5 Pseudo-code of lbm filter pkt.
5-6 LBM hooks inside the Bluetooth subsystem.
5-7 filebench across different kernel configurations. All configurations achieve similar throughputs, meaning a minimal performance impact from LBM.
5-8 RTT of l2ping in milliseconds (lower is better) based on 10K pings, across different kernel configurations. All configurations achieve similar throughputs, meaning a minimal performance impact from LBM.
5-9 LBM overhead in µs based on varying numbers of rules. While the general overhead increases as the number of rules increases, the overhead of going through each individual rule decreases, thus the total overhead is essentially amortized.
5-10 LBM vs. USBFILTER benchmark using filebench with the same 10 rules loaded in each. LBM introduces minimal overhead compared to the stock kernel and performs better than USBFILTER in general.
5-11 LBM vs. USBFILTER vs. USBFirewall benchmark using dd with the same 10 rules loaded in each. Compared to their stock versions, all the solutions show minimal overhead. USBFirewall does not vary much based on the block size. LBM performs better than USBFirewall and USBFILTER when block size is beyond 16 KB in general.
6-1 The USB Type-C Authentication Protocol.
6-2 The USB Type-C Authentication challenge (request) and response messages with payloads.
6-3 USB device internal architecture with secure storage and hardware to support Type-C Authentication.
A-1 An example Linux USBFILTER Module that blocks writes to USB removable storage.
B-1 The Extended Backus-Naur Form (EBNF) of our constructed LBM expression grammar.

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

DEFENDING OPERATING SYSTEMS FROM MALICIOUS PERIPHERALS

By

Jing Tian

August 2019

Chair: Kevin R. B. Butler
Major: Computer Science

Modern computer peripherals are diverse in their capabilities and functionality, ranging from keyboards and printers to and external GPUs. In recent years, peripherals increasingly connect over a small number of standardized communication protocols, including USB, Bluetooth, and NFC. The host is responsible for managing these devices; however, malicious peripherals can request additional functionality from the OS resulting in system compromise, such as BadUSB attacks, or can craft data packets to exploit vulnerabilities within OS software stacks, such as BlueBorne attacks.

The focus of this dissertation is on building security mechanisms within operating systems to defend against malicious peripherals, which includes three major efforts to explore the nature of malicious peripherals and to build a reference monitor for peripherals. We first present GoodUSB, a systematic USB security solution working at the layer. We then look at USBFILTER, the first reference monitor designed for the USB subsystem within an operating system working at the packet layer. We then introduce Linux (e)BPF Modules (LBM), the first generic security framework for all peripherals within an operating system. We finally formally verify the most recent USB Type-C Authentication protocol and demonstrate its design flaws.

By addressing key security and performance challenges, these works pave the way for hardening modern operating systems against attacks from untrusted peripherals.

CHAPTER 1
INTRODUCTION

Computer peripherals provide critical features to facilitate system use. The broad adoption of can be traced not only to the reduction in cost and size from mainframe to microcomputer, but to the interactivity afforded by devices such as keyboards and mice. Displays, printers, and scanners have become integral parts of the modern office environment. Nowadays, smartphones and tablets can not only act as peripherals to a host computer, but can themselves support peripherals that attach to them.

The scope of functionality that peripherals can contain is almost limitless, but the methods of connecting them to host computers have converged to a few select standards, such as USB [1] for wired connections and Bluetooth [2] for wireless. As a result, most modern operating systems provide support for these standards (and the peripherals that use them) by default, implementing the respective software stacks inside the kernel and running different device drivers to support various classes of peripherals.

However, with this virtually unconstrained functionality comes the threat of malicious devices that can compromise computer systems in myriad ways. The BadUSB attack [3] allows attackers to add functionality allowed by the USB protocol to device firmware with malicious intent. For example, a BadUSB flash drive presents not only expected behavior of a storage device when plugged into a computer, but also registers keyboard functionality to allow it to inject malicious keystrokes with the aim of gaining administrative privilege. Other examples of malicious USB functionality include chargers that can inject into iOS devices [4], or take control of Android devices via AT commands [5].

Excerpts from this chapter previously appeared in “LBM: A Security Framework for Peripherals within the Linux Kernel”, originally published in the proceedings of the 2019 IEEE Symposium on Security and Privacy (SP) [6].

Bluetooth peripherals can also be malicious: the BlueBorne attack [7] allows remote adversaries to craft Bluetooth packets that will cause a kernel stack overflow within the host operating system and enable privilege escalation, while BleedingBit [8] exploits a stack overflow within the Bluetooth Low Energy (BLE) stack.

We observe that malicious peripherals launch attacks in one of two ways, either by (1) sending unexpected packets (I/O requests or responses) to activate extra functionality enabled by the operating system, or by (2) crafting specially formed packets (either legitimate or malformed) to exploit vulnerabilities within the operating system’s protocol software stack. Based on these observations, the central statement of this dissertation is thus:

Software-based attacks from malicious peripherals such as USB devices and Bluetooth gadgets that abuse the protocol design or exploit software stack vulnerabilities can be defended by building packet-layer firewalls for those I/O subsystems within operating systems.

The goal of this work is to understand how we could secure our host machines when the modern peripherals plugged in are untrusted or even malicious. More specifically, this work explores how we could build security solutions within operating systems to defend against attacks from peripherals. We take a step-by-step approach to prove and go beyond the dissertation statement by asking the following questions:

1. How can we defend against BadUSB attacks from malicious USB peripherals?

2. Does the packet conceptualization work for USB? If so, what can we do with it?

3. How can we build a generic security framework for all peripherals such as USB and Bluetooth?

4. What else do we need to defend against malicious peripherals?

In the process of answering each question above, we provide final results and deliverables, including:

1. GoodUSB: A USB security solution bridging the semantic gap between operating systems and end users.

2. USBFILTER: The first packet-layer firewall designed for the USB subsystem within the operating system.

3. Linux (e)BPF Modules: The first generic security framework for all peripherals within the Linux kernel.

4. Formal verification of the USB Type-C Authentication: Demonstrating the design flaws of the first USB standard to incorporate security.

By addressing key security and performance challenges, these works pave the way for hardening modern operating systems against attacks from untrusted and even malicious peripherals, and shed some light on how to build secure and trusted peripherals.

CHAPTER 2
BACKGROUND

In this chapter, we explore the current attack surface of peripherals. Given the myriad work in this space, we first classify existing attacks in terms of the functionality that they target. We thus categorize peripheral functionality into abstract communication layers. As seen in Figure 2-1, the layers represent the various entities involved across both the host and peripherals. At the highest level, the human layer involves actions and communications between human stakeholders. The application layer represents user-level programs on the host and capabilities on the device. The transport layer encompasses both device firmware and host operating systems containing the peripheral stack, such as USB stack or . Finally, the physical layer represents the communication over the peripheral .

By grouping functionality into layers, we can more easily identify commonalities in approaches and derive sub-groupings, called primitives. In the case of attacks, these primitives encompass both the mechanism (i.e., how the attack is accomplished) and the outcome (e.g., forgery, eavesdropping, or denial of service). In the case of defenses, the layered approach also helps readers understand the scope of security solutions. This work focuses on building defensive solutions that target the transport layer.

2.1 USB Security

2.1.1 USB Protocol

The true flexibility of USB comes from composite devices, which can contain multiple configurations and interfaces, each of which is a standalone entity. For instance, a USB headset may contain one configuration, which in turn contains four interfaces, including a keyboard (for volume control), a microphone, and two speakers. An example of a two-configuration USB device is shown in Figure 2-2.

Two mechanisms are necessary to support composite devices, one to define different kinds of peripherals, and another to connect to them. Beginning in USB 1.0, the notion of Common Class Specifications [10, 11] was introduced to codify different kinds of peripherals. A USB class is a grouping of one or more interfaces that combine to provide more complex functionality. Examples of classes that feature a single interface include the Human Interface Device (HID) class that enables the USB host controller to interact with keyboards and mice, and the USB Mass Storage Class [12, 13] that defines how to transfer data between the host and storage devices. A composite device can then combine different classes to create a useful product, such as a USB headset leveraging both the HID class and Audio class. As we will see, the notion of designing USB peripherals through a composition of multiple functionalities continues to affect the state of USB security today.

[Figure 2-1 diagram: four communication layers spanning host and device. Human Layer; Application Layer (host: Client, device: Function); Transport Layer (host: Operating System, device: Firmware); Physical Layer (host: Bus Interface, device: Bus Interface).]

Figure 2-1. Peripheral vulnerabilities can be classified by the abstracted communications layer at which they operate. A successful attack involves violating a design assumption or implementation error at a given layer.

Excerpts from this chapter previously appeared in “SoK: “Plug & Pray” Today - Understanding USB Insecurity in Versions 1 through C”, originally published in the proceedings of the 2018 IEEE Symposium on Security and Privacy (SP) [9].

[Figure 2-2 diagram: a USB Device containing Configuration 1 (Interface 0, Interface 1) and Configuration 2 (Interface 0); each interface has an In and an Out channel holding endpoints EP 0 through EP n.]

Figure 2-2. A USB device containing two configurations. Configuration 1 contains two interfaces, and configuration 2 contains one interface. Each interface supports two unidirectional communication channels (In/Out) with the host machine. Each channel may contain more than one endpoint (EP), which is the sink of the communication.
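The hierarchy in Figure 2-2 (device, configurations, interfaces, endpoints) can be modeled with a few nested records. The following is a minimal sketch rather than any real USB stack's data model: the class names, the `classes` helper, and the endpoint numbers and directions given to the headset are assumptions made for illustration. Only the one-configuration, four-interface headset itself comes from the text above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Endpoint:
    number: int
    direction: str  # "in" (device to host) or "out" (host to device)


@dataclass
class Interface:
    number: int
    usb_class: str  # e.g. "HID" or "Audio"
    endpoints: List[Endpoint] = field(default_factory=list)


@dataclass
class Configuration:
    number: int
    interfaces: List[Interface] = field(default_factory=list)


@dataclass
class UsbDevice:
    name: str
    configurations: List[Configuration] = field(default_factory=list)

    def classes(self, config_number: int) -> List[str]:
        """Return the class of every interface in the chosen configuration."""
        for config in self.configurations:
            if config.number == config_number:
                return [iface.usb_class for iface in config.interfaces]
        raise ValueError(f"no configuration {config_number}")


def make_headset() -> UsbDevice:
    """Build the headset example: one configuration, four interfaces."""
    # Endpoint numbers and directions are illustrative assumptions.
    interfaces = [
        Interface(0, "HID", [Endpoint(1, "in")]),     # volume-control keys
        Interface(1, "Audio", [Endpoint(2, "in")]),   # microphone
        Interface(2, "Audio", [Endpoint(3, "out")]),  # speaker
        Interface(3, "Audio", [Endpoint(4, "out")]),  # speaker
    ]
    return UsbDevice("headset", [Configuration(1, interfaces)])


headset = make_headset()
print(headset.classes(1))
```

Each interface carries its own class, which is why a single physical product can combine HID and Audio functionality.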

After a device is plugged into the host machine, the USB host controller detects its presence and speed by checking the voltage change on data pins. Enumeration then begins (shown in Figure 2-3) with the GetDeviceDescriptors command, whereby the host asks the device for its identifying information including manufacturer, Vendor ID (VID), Product ID (PID), and serial number. The host controller resets the device and assigns an address to it for future communication. A GetConfigDescriptors request obtains all configurations available within the device. USB devices can have one or more configurations, though only one may be active at a time. Each configuration can include one or more interfaces, which are obtained with the GetInterfaceDescriptors request and represent the essential functional entities served by different drivers within the operating system. After GetInterfaceDescriptors completes, drivers are loaded on behalf of the device and class-specific subsets of the USB protocol (e.g., HID, Storage) begin operation.

[Figure 2-3 diagram: message sequence between Host and Device. GetDeviceDescriptors; ResetDevice/AssignAddress; GetConfigDescriptors; GetInterfaceDescriptors; Load Drivers; USB Communications.]

Figure 2-3. USB Enumeration Procedure.

2.1.2 USB Attacks and Defenses

Based on this examination of attacks, we identify several offensive primitives that are leveraged in USB-based attacks. Note that we exclude DMA attacks from USB devices, which are an example of I/O attacks against host machines and peer devices [54–56]. Table 2-1 provides a mapping of notable attacks surveyed above to their respective layers and primitives.

Table 2-1. Notable real-world attacks on the USB/Peripheral ecosystem, grouped by the layer at which they operate and the offensive primitive of which they are an instance.

Human layer
  Outsider Threats: Social engineering the USB way [14]; U.S. government USB study [15]; USB attack vector study [16]; Users really do plug in USB [17]
  Insider Threats: USB sticks cause data breach [18]; Australian defense loss [19]; Manning infiltration (Wikileaks) [20]; Snowden smuggled documents [21]
Application layer
  Code Injection: The BRAIN virus [22]; Stuxnet [23, 24]; Conficker [25]; [26]; user-mode rootkit [27]
  Data Extraction: Webcam Extraction [28–30]; Audio Extraction [31]; USBee [32]; TURNIPSCHOOL [33]
Transport layer
  Protocol Masquerading: USB Rubber Ducky [34]; USBdriveby [35]; TURNIPSCHOOL [33]; USB Bypassing Tool [36]; BadUSB [3]
  Protocol Corruption: FaceDancer [37]; UMAP2 [38]; Syzkaller [39]; Exploiting smartphone USB connectivity [40]
Physical layer
  Signal Eavesdropping: Power/EM Side-channels [41, 42]; BadUSB Hubs [43]; USB Fingerprinting [44–46]; USBSnoop [47]; CottonMouth [48, 49]; USB GPS locator [50, 51]
  Signal Injection: USBKiller [52]; Bad-quality USB cables [53]; USBee [32]; TURNIPSCHOOL [33]

Abuse at the human layer involves social engineering or human error, as performed by outsiders as well as privileged members within an organization. Both Manning [20] and Snowden [21] are typical examples of using USB flash drives to steal sensitive information.

Application layer attacks involve user-space processes on the host and their interactions with the functionality of a device. Attacks in this layer typically fall into two categories: code injection attacks, where the attacker injects malicious code into the host, such as Stuxnet [23, 24], and data exfiltration attacks, in which the device accesses data from the host without authorization, such as TURNIPSCHOOL [33].

Attacks on the USB transport layer fall into two general categories: those that perform masquerading through additional interfaces and those that send maliciously crafted packets/messages to compromise the host operating system. BadUSB [3] attacks allow attackers to reprogram the firmware of a USB device to add extra functionality, e.g., adding a keyboard functionality to a USB flash drive. This USB flash drive can then inject malicious keystrokes to compromise the host operating system. FaceDancer [37] is a programmable USB microcontroller that sends out malicious USB packets.

Physical layer attacks consist of attacks against confidentiality and integrity in the communication across the USB bus. In this context, signal refers to activities that occur over the USB bus. For instance, USBSnoop [47] leverages the signal leakage from adjacent USB ports to snoop on the traffic on the USB bus. USBKiller [52] is a USB flash drive equipped with a bank of capacitors. After being plugged in, it draws power from the USB bus. Once fully charged, it discharges to burn out the host machine.

Across all communications layers, a common characteristic of attacks is that they abuse the trust-by-default assumption that pervades the USB ecosystem. While security training and antivirus software could mitigate attacks within the human and application layers, and we could improve the quality of USB hardware to reduce some of the physical attacks, defenses at the transport layer are limited. USBFirewall [57] focuses on protecting the host USB stack by detecting malformed USB packets, e.g., generated by FaceDancer, based on a formal model of the protocol syntax. Cinch [58] leverages virtualization to decrease the host’s attack surface – the host operating system is hoisted into a VM to isolate it from the USB host controller, and then all USB traffic is tunneled via IOMMU through a sacrificial gateway VM. Unfortunately, USBFirewall cannot defend against attacks such as BadUSB, and the overhead of Cinch prevents it from being used in practice. For a more detailed analysis of USB security, please refer to [9].

In Summary. Across all communications layers, a common characteristic of attacks is that they abuse the trust-by-default assumption that pervades the USB ecosystem. This trust model is inextricably linked to the “Plug & Play” philosophy that led to USB’s ubiquity, making popular the notion that peripherals should work instantly upon connection without any additional configuration. Violations at the human layer are the result of misplaced trust in the intentions of devices and other humans.
Within the application layer, host machines blindly trust the integrity of the contents of portable media, and devices assume that all transactions emanate from a trustworthy agent. At the transport layer, USB protocols assume that kernel drivers will only be requested for legitimate purposes. Finally, at the physical layer, USB host controllers supporting the USB 1.x and 2.x protocols broadcast messages downstream, assuming that they will only be read by the intended recipient.

2.2 Bluetooth and NFC Security

Just as USB dominates wired connections for peripherals, Bluetooth [2] is the de facto standard for connecting peripherals wirelessly. Being a short-distance Radio Frequency (RF) technology, Bluetooth usually allows data transmission within 10 meters. After Bluetooth 4.0, Bluetooth Low Energy (BLE) and Bluetooth Mesh were introduced to support lower-power consumption devices (e.g., IoT) and sensor networks.

Bluetooth, like USB, is also susceptible to a wide variety of attacks [59] due to software implementation vulnerabilities and malicious Bluetooth peripherals. BlueBug [60] allows an attacker to send AT commands to take control of the victim’s phone, for example from a malicious Bluetooth headset. Blueprinting [61] and BlueBag [62] identify and collect statistics on all discoverable devices in the area. BlueSnarf and BlueSnarf++ [60] allow an adversary to acquire files from a victim device without being authenticated. BlueDump [63] causes a victim device to dump its stored link keys associated with connection events. CarWhisperer [64] allows an adversary to eavesdrop on and inject audio into a car over Bluetooth. BlueBorne [7] attacks craft specially formed Bluetooth packets to exploit certain vulnerabilities within the software stack implementation, causing, for example, privilege escalation. BleedingBit [8] attacks exploit another stack overflow within TI’s BLE stack.

While pairing is used to prevent unidentified devices from being connected via Bluetooth, many attacks happen before the pairing procedure. Also, pairing does not work for simple devices without a means to input PINs. Unlike the case for USB, there is no systematic solution available that defends against malicious Bluetooth peripherals. Even worse, BadBluetooth [65] attacks replicate BadUSB attacks in Bluetooth, allowing a malicious Bluetooth device to change its profile after it is paired. The most effective defense seems to be turning off Bluetooth or physically unplugging the Bluetooth module.

Near Field Communication (NFC) [66] is another short-range wireless communication protocol based on RFID technology. The operation range is usually within 4 to 5 centimeters. Smartphones (e.g., Androids and ) commonly use NFC as a quick means to exchange information, such as when downloading a poster or making a payment. Similarly, these NFC software stacks are also vulnerable. An NFC feature that unknowingly invokes a Bluetooth connection can install malware on phones [67]. “Exploring the NFC Attacking Surface” [68] lists four possible attacks enabled by bugs within the Android and N9 software stacks. A recent bug within the Linux kernel NFC software stack [69] allows a malicious NFC device to inject a malformed packet to launch out-of-bounds writes in kernel memory. The most recent “Tap’n Ghost” attack [70] can compromise smartphones by using a malicious NFC device.

In Summary. Whether wired or wireless, these peripheral communication protocols often refer to their communication unit as a “packet” (e.g., USB packets or Bluetooth packets). The OS further instantiates the abstraction of these “packets” within the context of a given I/O subsystem. This gives us an opportunity to treat these peripheral security issues as we would treat networking security issues: by building firewalls for these peripherals and applying rules to filter unwanted (malicious) packets.

2.3 BPF/eBPF

The BSD Packet Filter (BPF) [71] is a high-performance RISC-based virtual machine running inside the OS. Since its creation, it has been the standard mechanism for packet filtering in kernel space. The best-known BPF user is probably tcpdump, which compiles filtering rules into BPF instructions and loads them into the kernel via socket . Extended BPF (eBPF) [72, 73] is a new ISA based on classic BPF. Compared to the old ISA, eBPF increases the number of registers from 2 to 10 and the register width from 32 bits to 64 bits. eBPF also introduces a JIT compiler that maps eBPF instructions to native CPU instructions on architectures including x86, x86-64, ARM, PowerPC, and Sparc. A new syscall, bpf, added in Linux kernel 3.18, supports loading eBPF programs from user space.

Besides the ISA extensions, eBPF provides new ways to communicate between user and kernel space and to call kernel APIs from within BPF programs [74]. eBPF maps are a generic data structure for sharing data between user and kernel space. A typical usage is to have the kernel update certain values (e.g., the number of IP packets received) inside a map while a user space program picks up the changes. BPF helpers are special functions that bridge eBPF programs and kernel APIs. The newly added CALL instruction triggers predefined BPF helpers, which usually wrap kernel APIs to implement functionality that cannot be achieved with eBPF instructions alone. eBPF also includes a verifier, which checks the safety of a given eBPF program via a directed acyclic graph (DAG) check (to ensure bounded execution) and by checking for memory violations. The purpose of the verifier is to ensure that an eBPF program cannot affect the kernel’s integrity. In the later part of this work, we demonstrate how to leverage BPF/eBPF for peripheral security.
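To make the DAG check concrete, the following is a toy sketch in Python of how bounded execution can be established by rejecting any program whose control flow graph contains a back-edge. It is not the kernel verifier: the four-opcode instruction set ("alu", "jmp", "jcond", "exit") and the function names are assumptions made purely for illustration, and memory-safety checks are omitted.

```python
from typing import List, Tuple

# Hypothetical toy instruction: (opcode, jump offset). The offset is relative
# to the next instruction and is ignored for "alu" and "exit".
Insn = Tuple[str, int]


def successors(prog: List[Insn], pc: int) -> List[int]:
    """Return the instruction indices that may execute after prog[pc]."""
    op, off = prog[pc]
    if op == "exit":
        return []
    if op == "alu":
        return [pc + 1]
    if op == "jmp":
        return [pc + 1 + off]
    if op == "jcond":
        return [pc + 1, pc + 1 + off]
    raise ValueError(f"unknown opcode {op!r}")


def is_bounded(prog: List[Insn]) -> bool:
    """Accept only programs whose control flow graph is a DAG.

    Uses an iterative depth-first search: reaching a node that is still on
    the DFS stack (GREY) means a back-edge, i.e. a potential loop.
    """
    if not prog:
        return False
    WHITE, GREY, BLACK = 0, 1, 2
    color = [WHITE] * len(prog)
    color[0] = GREY
    stack = [(0, iter(successors(prog, 0)))]
    while stack:
        pc, pending = stack[-1]
        nxt = next(pending, None)
        if nxt is None:              # all successors explored
            color[pc] = BLACK
            stack.pop()
            continue
        if not 0 <= nxt < len(prog):
            return False             # jump or fall-through leaves the program
        if color[nxt] == GREY:
            return False             # back-edge: execution may not terminate
        if color[nxt] == WHITE:
            color[nxt] = GREY
            stack.append((nxt, iter(successors(prog, nxt))))
    return True


forward_only = [("alu", 0), ("jcond", 1), ("alu", 0), ("exit", 0)]
with_loop = [("alu", 0), ("jmp", -2), ("exit", 0)]
print(is_bounded(forward_only), is_bounded(with_loop))
```

A program with only forward jumps is accepted, while one that jumps back to an earlier instruction is rejected, which is the property the DAG check relies on to guarantee termination.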

CHAPTER 3 GOODUSB

In the BadUSB attack [3, 75], a malicious USB device registers as multiple device types, allowing the device to take covert actions on the host. For example, a USB flash drive could register itself as both a storage device and a keyboard, enabling it to inject malicious scripts, as shown in Figure 3-1. This functionality is present in the Rubber Ducky penetration testing tool [34], which is now available for public sale. Unfortunately, because USB device firmware cannot be scanned by the host, antivirus software is not positioned to detect or defend against this attack. This problem is not limited to dubious flash drives: any device that communicates over USB is susceptible to this attack.
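As an illustration of what "registering as multiple device types" looks like on the wire, the sketch below builds a simplified configuration descriptor in which a flash drive claims both a Mass Storage interface and a HID boot keyboard interface, then walks the descriptor chain to list the claimed interface classes. The descriptor layout and class codes follow the standard USB descriptor format; the helper names are hypothetical, and endpoint and HID class descriptors are omitted for brevity.

```python
import struct

USB_DT_CONFIG = 0x02             # bDescriptorType for configuration descriptors
USB_DT_INTERFACE = 0x04          # bDescriptorType for interface descriptors
USB_CLASS_HID = 0x03             # bInterfaceClass: Human Interface Device
USB_CLASS_MASS_STORAGE = 0x08    # bInterfaceClass: Mass Storage


def interface_descriptor(number, cls, subclass, protocol):
    """Build a 9-byte standard interface descriptor (no endpoints listed)."""
    return bytes([9, USB_DT_INTERFACE, number, 0, 0, cls, subclass, protocol, 0])


def configuration(interfaces):
    """Prefix the given interface descriptors with a configuration descriptor."""
    body = b"".join(interfaces)
    header = struct.pack("<BBHBBBBB", 9, USB_DT_CONFIG, 9 + len(body),
                         len(interfaces), 1, 0, 0x80, 50)
    return header + body


def claimed_interface_classes(blob):
    """Walk the descriptor chain and collect every bInterfaceClass value."""
    classes, offset = [], 0
    while offset + 2 <= len(blob):
        length, dtype = blob[offset], blob[offset + 1]
        if length < 2 or offset + length > len(blob):
            raise ValueError("malformed descriptor")
        if dtype == USB_DT_INTERFACE and length >= 9:
            classes.append(blob[offset + 5])   # bInterfaceClass is byte 5
        offset += length
    return classes


# A device shaped like a flash drive that also claims a keyboard interface.
badusb_config = configuration([
    interface_descriptor(0, USB_CLASS_MASS_STORAGE, 0x06, 0x50),  # SCSI, bulk-only
    interface_descriptor(1, USB_CLASS_HID, 0x01, 0x01),           # boot keyboard
])

print([hex(c) for c in claimed_interface_classes(badusb_config)])
```

Nothing in the descriptor format itself distinguishes the expected storage interface from the unexpected keyboard interface; the host simply sees two interfaces to bind drivers to.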

We observe that the root cause of the BadUSB attack is a lack of access control within the enumeration phase of the USB protocol. Devices are free to request that any number of device drivers be loaded on their behalf. However, existing USB security solutions, such as whitelisting individual devices by their serial number, are not adequate when considering malicious firmware that can make spurious claims about its identity during device enumeration. Standard USB devices are too simplistic to reliably authenticate, and secure devices with signed firmware that could permit authentication are rare, leaving it unclear how to defend ourselves against this new attack.

Our key insight in this work is that the most reliable source of information about a device’s identity is the end user’s expectation of the device’s functionality. For example, when a user plugs in a flash drive, they are aware that they have not plugged in a keyboard. We use this insight to design and implement GoodUSB, a host-side defense for operating systems against BadUSB attacks. GoodUSB features a graphical interface that prompts users to describe their devices, and a kernel enforcement mechanism that denies access to features that fall outside of that description. Our system also features a security image system to simplify device administration using security pictures, and a novel USB Honeypot mechanism for profiling BadUSB attacks. GoodUSB even provides an added layer of protection for “secure” devices with signed firmware, ensuring that BadUSB attacks will still fail even if the manufacturer’s signing key falls into the wrong hands.

Excerpts from this chapter previously appeared in “Defending Against Malicious USB Firmware with GoodUSB”, originally published in the proceedings of the 2015 Annual Computer Security Applications Conference (ACSAC) [76].

Figure 3-1 (diagram: message exchange between Host and Device). SetAddress(n), answered by ACK; GetDescriptor(Device), answered by “MNF: Kingston, Product: Flash Drive”; GetDescriptor(Interface), answered by Storage and Human Interface.

Figure 3-1. During USB enumeration, the host discovers the device and the drivers (interfaces) that need to be loaded in order for it to operate. In the BadUSB attack, marked by a red dotted line, the device requests additional, unexpected interfaces that allow it to perform covert activities on the system.

3.1 Design

In this section, we identify the key challenges to the design of a security mechanism for USB enumeration. In considering BadUSB, we observe that the root cause of this threat is that USB drivers effectively represent a set of system privileges, and yet the USB protocol does not provide a means of restricting devices’ access to these privileges. Therefore, solving the BadUSB problem requires the introduction of a security layer to the enumeration phase of the USB protocol. We find that the plug-and-play nature of USB poses a number of unique challenges that make traditional access control mechanisms unsuitable in this environment, and we propose solutions to each of these obstacles. The technical details of our solution can be found in Section 3.2.

3.1.1 Threat Model and Assumptions

The goal of our system is to provide a defense against BadUSB attacks in a security-conscious enterprise environment where myriad different USB devices are used each day. Due to the sensitive nature of such organizations, they have already deployed advanced USB malware scanning kiosks that effectively detect malicious storage payloads [77, 78]; hence, traditional USB attacks are not a concern. We assume the use of standard commodity devices that lack advanced security features such as signed firmware. While signed firmware can be employed to defeat BadUSB, such features are costly, and to our knowledge are only available for USB storage devices [79]. We assume that employees in our operating environment are required to participate in a security orientation.

We consider an Advanced Persistent Threat (APT) that is attempting to further its presence in the enterprise by distributing USB devices with malicious firmware (i.e., BadUSB). The malicious devices have entered the physical premises via supply-chain compromise or social engineering. We conservatively assume that these devices are subject to Byzantine faults during participation in the USB protocol. The device may make any claim about its identity during enumeration, and can attempt to confuse or evade the device identification mechanism that our system introduces; for example, the device can lie about its manufacturer and product ID. The device may also alter its responses each time it enumerates. Moreover, the adversary may have changed the physical casing of the device so that its functionality is not apparent through visual inspection.

Finally, we make the following assumptions about the state of the host system on which our security mechanism is being deployed. We assume the host is in a correct state prior to connecting to any USB devices. We also assume that the host’s USB software stack is correct, and does not contain any exploitable software flaws. Conceivably, a BadUSB device could send malformed messages that exploit a software vulnerability (e.g., a buffer overflow) in the host controller or driver. This is an important problem in itself, and fuzzing techniques have been proposed elsewhere in the literature to detect such faults [80]; however, it is orthogonal to our goal of addressing a fundamental vulnerability in the USB protocol.

Figure 3-2 (diagram labels). Subjects (Devices) on Port 1 and Port 2; Host Controller; USB Mediator with Device Identifier, Policy Engine, and Policy; Objects (Interfaces): Storage Driver, HID Driver, Audio Driver.

Figure 3-2. GoodUSB introduces a mediator in the USB stack of the host. The mediator restricts the access of USB devices (subjects) to USB drivers (objects) according to policy.

3.1.2 Mediating USB Interfaces and Drivers

The fundamental vulnerability in USB that gives rise to BadUSB attacks is that arbitrary USB interfaces can be enumerated, comprising a set of unrestricted privileges provided to a USB device. In response, we propose the introduction of a permission validation mechanism that authorizes a device’s access requests to individual USB interfaces. The proposed mediator is shown at a high level in Figure 3-2. During the USB enumeration phase, a Device Identifier authenticates the connected device and provides a subject ID. When the host queries the device for its interfaces, the device’s response represents an access request. The subject ID and access request are passed into a Policy Engine. Based on the Policy, the engine then individually authorizes the requested interfaces prior to loading the drivers on behalf of the device.

While restricting device activity at the driver granularity is a significant improvement over the status quo, it would also be desirable to restrict device actions at finer granularities. For example, a device that registers as a Human Interface Device (HID) but only makes use of the Volume Up and Volume Down keys is dangerously over-capable; while use of those two keys alone is harmless, with the full HID driver the device can effectively take any action on the host. Unfortunately, finer-grained restriction requires instrumentation of individual USB drivers, so it is not a general solution to the BadUSB problem. However, our mediator must be extensible, supporting security-enhanced USB drivers as they are made available. In Section 3.2, we instrument the general USB HID driver to provide access to volume controls only, preventing a USB headset from running in an over-capable state, such as running as a keyboard.

3.1.3 Identifying USB Devices

We now describe the Device Identifier component of our USB mediator. A fundamental requirement of any access control system is authenticating the subject. The device descriptor passed by the USB device during enumeration contains information such as the manufacturer, product, and a unique serial number for the device. However, the problem of identifying the device is actually much more complicated. As we assume devices are subject to Byzantine faults, we cannot trust any message that we receive from the device during enumeration. If an adversary has rewritten a device’s firmware, it can change its response to any message in the enumeration, including lying about its manufacturer and model number. When the device’s reported descriptor and even its physical appearance are potentially false, how can we identify the device?

We assert that the most reliable source of information about a device’s identity is the end user’s expectation of the device’s functionality. The purposes of most USB interfaces are intuitive, especially in an environment where all users are computer literate and have been instructed on security procedures. We propose that the simplest means of enforcing least privilege on USB devices is to ask the users what they expect their device to do, having the user serve as a verifier for the claims that the device makes during enumeration. This verification concept is visualized in Figure 3-3.

Figure 3-3 (diagram labels). User Expectations; USB Device Claims; USB Mediator; Policy; Storage Driver; Honeypot VM; “If correct, load driver…”; “If incorrect, redirect device to honeypot…”.

Figure 3-3. GoodUSB cannot trust what the device claims to be during enumeration; however, the device’s claims can be verified by checking them against the user’s expectation as to what the device is and how it should operate. If verification fails, the device is flagged as potentially malicious, and is redirected to a honeypot virtual machine.

When a device first connects to the host, the user is notified of the connection through a graphical dialog box on the host. The dialog prompts the user to select the features (i.e., interfaces) that they wish to enable on the device. Note that, since the host is trusted, this constitutes a trusted path from the host controller to the user. The user’s settings are stored in a policy database, with each record being a tuple (Subject ID, Authorized Interfaces). The subject ID includes Manufacturer, Product, Serial Number, etc.¹ On subsequent connections, one of three scenarios may occur:

1. The device’s claim is consistent with the prior connection.

2. The device makes a different claim that matches an entry in the device database.

3. The device makes a different claim that does not match an entry in the device database.

In Scenarios (1) and (2), the user will be presented with an entry from the database and asked whether the information is correct. In Scenario (1), the user confirms that the information is correct, and enumeration is permitted to continue for the authorized interfaces. In Scenario (2), the user reports that the information is incorrect, and the device is flagged as potentially malicious. In Scenario (3), the user is presented with the initial device registration dialog again. However, since the user knows that the device has already been registered, he or she can report the anomaly and the device will be flagged as potentially malicious.

Given the above description, readers may find themselves understandably wary of the burden that our system places on the end user. However, given the capabilities of our attacker, and the lack of a core root of trust for measuring USB firmware, we assert that our solution is the only option for deterministically authenticating devices, which is a prerequisite to defending against BadUSB. In Section 3.2, we present the technical details of the GoodUSB graphical interface, which makes use of visual cues to dramatically simplify the process of administration for normal users with limited technical background.

1 The record format and subject ID are simplified here for better illustration.

3.1.4 Profiling Malicious USB Devices

Once a device has been flagged as potentially malicious, what actions can we take? Unfortunately, it is not possible to block the device from further interactions with the system. On subsequent connections, the device can make different claims about its identity, so we have no means of blacklisting it. We determined that the most valuable action our system could take is to redirect the device to a virtualized honeypot, allowing the device to be observed while simultaneously protecting the host. Within the virtual honeypot, the actions taken by the device can be profiled, which could prove valuable in the ensuing forensic investigation. The honeypot’s interactions with other system components are shown in Figure 3-3. We identify the following types of information as valuable to an investigator: device information after enumeration, device drivers loaded for the device, and all communication at the USB layer, including all keystrokes and all IP packets sent/received over the network. This information could potentially be passed to high-level forensic tools for detailed inspection and to an intrusion detection system (IDS), which would provide a means of remediation in the event that a device is incorrectly flagged as malicious.

Figure 3-4 (diagram labels). User space: GoodUSB Daemon (gud) with Policy Engine, Graphical Interface, and Device Database; USB Honeypot (HoneyUSB) with USB Profiler and USB Monitor; QEMU; KVM; VirtIO. Kernel space: Kernel Hub Thread; Kernel Virtual Machine; USB Device Class Identifier; Host Ctrl Passthrough; Interface Drivers; Limited HID; Host Ctrl 0, Host Ctrl 1, Host Ctrl 2; Ports 1 to 8.

Figure 3-4. The GoodUSB architecture. Components that are introduced by GoodUSB are colored orange and bordered by dashed lines. The kernel hub thread is also modified to interoperate with our system components.

3.2 Implementation

In this section, we present GoodUSB, our fully-implemented security architecture for the Linux USB stack. While our BadUSB defense is general enough to apply to any operating system, we have implemented GoodUSB for Ubuntu 14.04 LTS (kernel version 3.13.11). The full architecture of GoodUSB, shown in Figure 3-4, introduces four components. First, a user space daemon handles the graphical interface and policy management, and also includes the logic for the USB mediator. A second user space component, a KVM honeypot, profiles potentially malicious USB devices. In kernel space, we introduce a device class identifier, as well as a limited USB HID driver that secures human interface devices by restricting them to particular kinds of keystrokes. The kernel hub thread is minimally modified to interoperate with the components of the GoodUSB architecture.

3.2.1 User Space Daemon

Most of GoodUSB’s functions are handled by a user space daemon (a.k.a. gud). As shown in Figure 3-4, gud includes three subsystems: a policy engine that implements the USB mediator logic, a graphical interface that features a security picture recognition system, and a device database that associates a device’s claimed identity and functionality with the user’s expectation of the device. To allow gud to interact with the rest of the USB stack, we use the new netlink socket created in the kernel hub thread (a.k.a. khub), which communicates with gud to perform USB device detection, enumeration and driver matching/loading. The subsystems of gud are detailed below.

Policy Engine. The policy engine is responsible for determining whether the requested interfaces of a newly connected USB device match the user’s expectation of device functionality, and subsequently enforcing that expectation. It notifies the kernel space components to block the loading of particular interfaces, or to redirect the device to a honeypot when it is potentially malicious. To improve the user experience, the policy engine maintains a mapping between low-level interface types and a high-level summary of common USB devices. Some example mappings are shown below:²

2 There are 17 mappings in total. Only 4 are presented here.

USB_DEV_STORAGE =>
    USB_CLASS_MASS_STORAGE
    USB_CLASS_CSCID
    USB_CLASS_VENDOR_SPEC
USB_DEV_CELLPHONE =>
    USB_CLASS_MASS_STORAGE
    USB_CLASS_VENDOR_SPEC
USB_DEV_HEADSET =>
    USB_CLASS_AUDIO
    USB_CLASS_HID (LIMITED)
    USB_CLASS_VENDOR_SPEC
USB_DEV_CHARGER =>
    {0}

Reading the above mappings, the policy states that storage devices can only register the following interfaces: MASS STORAGE (for flash drives), CSCID (for smart cards) and/or VENDOR SPEC interfaces. Storage devices cannot register the HID interface, preventing the most widely recognized form of BadUSB attacks. In addition to the AUDIO interface, some USB headsets require the HID interface for volume control. GoodUSB introduces a limited HID interface that restricts the permissible keystrokes of non-keyboard USB devices, which prevents malicious HID devices from taking control of the system. Another interesting example is a CHARGER device, which does not contain any interfaces. As a matter of fact, these chargers should never be detected as USB devices, because the charging procedure does not need to involve any USB layer communication. Through enforcing this permission mapping, GoodUSB is able to defend against BadUSB attacks. Note that vendor specific interfaces are allowed in most devices. This is a tradeoff between security and usability, as devices that require a vendor specific driver are likely to break if denied this interface. We discuss this limitation in Section 3.4.

Based on GoodUSB’s configuration, the policy engine can operate in either basic or advanced mode. In basic mode, the graphical interface features high-level device summaries, as shown in Figure 3-5A, and the user selects a single option that maps to low-level interfaces. In advanced mode, the graphical interface instead shows the low-level interfaces, and allows the user to make multiple selections. The advanced mode allows the user to exercise finer control over device functionalities, and also supports devices that require uncommon interface sets.

Figure 3-5. Screenshots from GoodUSB user interface. A. Device registration screen. B. Recognized device notification. C. Security picture screen.

Graphical Interface. When a USB device connects to the host, the policy engine loads one of several dialog boxes depending on whether the device is recognized from a previous session. If a device is not recognized, GoodUSB prompts the user with the device registration box shown in Figure 3-5A.³ The text field at the top of the box allows the user to confirm that the device’s claimed identity (i.e., Manufacturer and Product) matches the device that was just plugged into the host. The remainder of the box provides a set of device descriptions for the user to select. Each device description maps to a set of permissible interfaces. Immediately after the device registration screen, the user is asked to select a security image to associate with the device, as shown in Figure 3-5C. Security images are widely used as an anti-phishing mechanism by banking websites [81]. In GoodUSB, the security image component is introduced to simplify device administration, and also provides a visual cue for the presence of a potential BadUSB attack.

Recall that BadUSB devices can spoof any message in their device descriptors; an adversary who is aware of our GoodUSB defense may therefore attempt to masquerade as a known device that has the desired interface, e.g., the HID interface. When GoodUSB recognizes a device, it is either a legitimate occurrence or evidence of an attack. The dialog box for recognized devices is shown in Figure 3-5B. Here, the option for selecting a device type has been removed. The user can verify the device by either reading the descriptive text or checking that the presented security image is correctly associated with the device. If the presented information is incorrect, the device is flagged as potentially malicious and is redirected to the USB honeypot. Otherwise the user approves the device and driver loading continues.

Device Database. Once gud obtains user expectations and a security picture is selected, this information is recorded alongside the output of the Device Class Identifier in a database. The database is implemented as a binary file, and is synchronized with kernel space whenever a new USB device is plugged in. When the machine is rebooted, gud re-transmits the device database to kernel space via the netlink socket, making sure that previously classified devices will be recognized on subsequent connections. If needed, users can also clear the database in gud, which provides a clean base in the kernel space as well once the machine is restarted.

3 A user study of the prompt’s effectiveness is out of the scope of the paper.

3.2.2 USB Honeypot

In the event of a potential attack, administrators will undertake forensic investigation to determine the nature of the attack and identify likely culprits. To observe the activities of potentially malicious devices, GoodUSB features a honeypot virtual machine mechanism. While honeypots for malicious USB devices have been previously proposed by Poeplau and Gassen [82], we realized that these systems are actually incapable of observing the BadUSB attack vector. The reason is that their system emulates a device, as opposed to a host, and attempts to catch host-based malware as it infects the device. BadUSB is not a host-to-device attack, but rather a device-to-host attack. In BadUSB, once the host is compromised, the adversary will have to rely on other attack vectors to extend their presence in the devices, as it is very difficult (if not impossible) to infect the firmware of USB devices simply by having them connect to an infected host. Therefore, it is necessary to design a new USB honeypot framework that is suited to observing BadUSB. Our system, HoneyUSB, is a QEMU-KVM virtualized Linux machine containing multiple device profiling services. HoneyUSB supports two modes of device observation (profiling). In the first, HoneyUSB reserves an entire USB controller device on the host, and the host controller device (HCD) is hoisted directly into KVM using pass-through technology. The advantage of this profiling mode is that the potentially infected device never operates directly within the host OS, and is effectively physically separated from the host machine. Using this profiling mode is helpful when out-of-band knowledge has been used to flag a device as potentially malicious, e.g., it was found lying in the company parking lot. In a second mode, gud automatically redirects devices to HoneyUSB after the user flags them as potentially malicious.

The honeypot VM, which also runs Ubuntu Linux, is preconfigured as follows. We enabled usbmon in the VM’s kernel [83], which acts as a general USB layer monitor, capturing all the USB packets transmitted by the device. In the user space, we created a USB profiling application, usbpro, which aggregates device information from , lsusb, -devices and device activities from usbmon and tcpdump. Moreover, a new rule with high priority is associated with usbpro, guaranteeing that usbpro is loaded prior to device enumeration. Thus, the report generated by usbpro is an exhaustive description of the device’s reported information as well as the actions taken by its associated drivers. Excerpts of a report for a HID device generated by usbpro can be found in the evaluation section.

HoneyUSB also contains an instrumented version of the GIO Virtual Filesystem (gvfs), the user-space driver used by USB-enabled cellphones such as Android. We have extended gvfs to collect file-level data provenance, constituting a detailed description of the read and write operations performed by the device. Currently, file-level provenance has been added to the MTP backend to support Android, which means usbpro is able to record all the file-based I/O operations that occur on an Android phone operating in MTP mode.

3.2.3 Device Class Identifier

The Device Class Identifier is a kernel space component that summarizes the claims made by the device during enumeration. This summary contains both the device descriptor fields that are presented to the user in Figure 3-5, and all the descriptors transmitted by the device during enumeration, plus the current active configuration. This includes the device information and requested interfaces in the active configuration, and also other configurations supported but not used by the device. A SHA1 digest is then computed over the summary. Our threat model assumes that the digest can be forged, and GoodUSB tolerates this. There is therefore no need to pursue the strongest secure hash function, and SHA1 is usually optimized for better performance in the kernel. The digest is used as a device identifier in both the gud device database, allowing gud to recognize devices that were previously registered, and the kernel device database, keeping it synchronized with the former.

After the USB enumeration, the kernel knows all the interfaces requested by the device, as well as the SHA1 digest. If the digest does not match a prior entry from the kernel device database, the kernel notifies gud to present the device registration screen to the user (Figure 3-5A). The user’s response is transmitted to the kernel by gud before the requested interface drivers are loaded. After receiving instructions from gud, the kernel first creates an entry for the newly registered device in its device database, if the device is to be enabled. The permitted drivers are then loaded, while other requested drivers are ignored, thus ensuring that the device cannot interact with the system in ways that were not expected by the user. When there is a match in the kernel device database, meaning the device is recognized as a known one, the kernel notifies gud and asks for permission to enable the device (Figure 3-5B). The requested drivers are not loaded until after the user has approved the device. If the user disapproves the device, the kernel disallows any interfaces requested by the device by not loading any drivers, and gud helps redirect the device into the USB honeypot.

3.2.4 Limited HID Driver

The GoodUSB architecture is designed primarily to enforce least privilege on USB devices at the granularity of device drivers. Unfortunately, devices that have been approved for a particular interface are free to operate with the full capabilities of the associated driver. In Section 3.1.2, we mentioned that this is particularly troubling for the Human Interface Device (HID) interface, which allows a BadUSB device to take nearly any action on the system [3]. While GoodUSB allows users to disable the HID interface of the USB device completely, there are cases where the HID interface is a legitimate and functioning part of the device, e.g., the HID interface of a USB headset controlling the volume of the internal speaker.

To mitigate the danger of HID-based BadUSB attacks, we have instrumented a copy of the Linux USB HID driver to restrict the keystrokes that can be injected by USB devices. The Linux USB HID driver is widely used by many USB devices because it bridges the USB and Input layers in the kernel. As USB Request Blocks (URBs) are a common abstraction for USB packets within the kernel, we instrumented the USB HID driver at the URB level, which saved us from having to perform packet inspection. We modified the original USB HID driver to restrict the kinds of URBs the driver can report to the higher-level input driver. The current limited USB HID driver supports only three different keystrokes, corresponding to volume increase, volume decrease, and the mute button, as are commonly found on USB headsets.

Exercising control of device activity above the interface level requires instrumenting the various USB drivers to support access control, as grsecurity [84] does, which is tedious, potentially error-prone, and fragile in the face of driver changes and new drivers. However, our limited USB HID driver demonstrates that our approach can dramatically reduce the scope of BadUSB attacks by limiting the general USB HID driver without touching any specific drivers.

3.3 Evaluation

We now evaluate the GoodUSB architecture. We first provide a security case study in Section 3.3.1, where GoodUSB is tested against a variety of malicious and benign devices. In Section 3.3.2, we provide a performance evaluation of our system.

3.3.1 Attack Analysis

The authors of BadUSB have published a proof-of-concept implementation online [75] with reverse engineered firmware for a particular USB storage device that adds a malicious HID interface. Rather than use this highly specific instance of BadUSB, we use several popular penetration and development tools to launch a variety of attacks in order to demonstrate the range of defenses provided by GoodUSB.

usbpro HID analyzer started:
======
F2 xterm ENTER pwd ENTER id ENTER cat SPACE /etc/passwd ENTER
======
usbpro HID analyzer done

Figure 3-6. GoodUSB’s profiling tool, HoneyUSB, captures injected keystrokes from a USB storage device maliciously exposing a keyboard (HID) interface.

3.3.1.1 HID-based attacks

To demonstrate GoodUSB’s resistance to attacks that expose human interface device (HID) interfaces (e.g., keyboard functionality), we use the Rubber Ducky penetration testing device. The Ducky provides a user-friendly scripting language enabling different HID-based attacks. We load a basic Ubuntu terminal command script [85] into the Ducky, which opens an xterm window once the Ducky is plugged into the victim’s computer. It then issues several commands, including checking the /etc/passwd file. The first time we plug in the Ducky, GoodUSB pops up the device registration GUI, asking users for their expectations of the device’s functionality. Since the Ducky appears to be a USB thumb drive, we choose “USB Storage” and register it with a security image selected from a list, as shown in Figure 3-5C. The attack fails because GoodUSB does not allow USB HID interfaces for “USB Storage” devices. However, the Ducky continues to function in its capacity as a storage device. When the Ducky is plugged in again, GoodUSB recognizes it and shows the correct security picture. Rather than enabling this device as a USB storage device, we click “This is NOT my device!”, which redirects the Ducky into HoneyUSB. Using the usbpro utility, we can easily see all the information and activities of the Ducky, including reconstructing its injected keystrokes, as shown in Figure 3-6.
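The outcome of this case study follows directly from the device-type mappings of Section 3.2.1. The sketch below is a minimal Python model of that check, not the gud implementation: the table reproduces only the four example mappings given earlier, the "full"/"limited" labels and the authorize function are hypothetical names, and the set of interfaces requested by the Ducky is an assumption based on the behavior described above.

```python
# Device type -> permitted interface classes, from the four example mappings
# in Section 3.2.1. "limited" marks the restricted HID driver (assumed label).
POLICY = {
    "USB_DEV_STORAGE": {
        "USB_CLASS_MASS_STORAGE": "full",
        "USB_CLASS_CSCID": "full",
        "USB_CLASS_VENDOR_SPEC": "full",
    },
    "USB_DEV_CELLPHONE": {
        "USB_CLASS_MASS_STORAGE": "full",
        "USB_CLASS_VENDOR_SPEC": "full",
    },
    "USB_DEV_HEADSET": {
        "USB_CLASS_AUDIO": "full",
        "USB_CLASS_HID": "limited",
        "USB_CLASS_VENDOR_SPEC": "full",
    },
    "USB_DEV_CHARGER": {},   # chargers may not register any interface
}


def authorize(device_type, requested):
    """Split the requested interfaces into granted (with driver mode) and denied."""
    allowed = POLICY[device_type]
    granted = {cls: allowed[cls] for cls in requested if cls in allowed}
    denied = [cls for cls in requested if cls not in allowed]
    return granted, denied


# Assumed request: the Ducky asks for storage plus a keyboard (HID) interface,
# while the user registered it as "USB Storage".
ducky_request = ["USB_CLASS_MASS_STORAGE", "USB_CLASS_HID"]
granted, denied = authorize("USB_DEV_STORAGE", ducky_request)
print("granted:", granted)
print("denied:", denied)
```

Under these assumptions the storage interface is granted and the HID interface is denied, matching the observation that the keystroke injection fails while the Ducky keeps working as a storage device. The same HID request from a device registered as a headset would be granted only in limited mode.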

3.3.1.2 Other USB interfaces and composite devices

We demonstrate more robust interface attacks using a Teensy USB development board [35]. Unlike the Rubber Ducky, Teensy is able to simulate not only USB HID devices but also USB Serial devices, USB MIDI devices and others. Moreover, Teensy is also able to combine different interfaces together to make a composite device, which is how devices such as USB headsets and smartphones present themselves to hosts.

First, we consider a scenario where a Teensy presents a USB storage form factor but is acting as a serial device to transmit messages (e.g., shell scripts) to a trojan residing on the host machine. To accomplish this, we program a Teensy 3.1 board to expose a serial terminal at /dev/ttyACM0. When the board is plugged in, it attempts to communicate over the serial interface to the trojan listening on the tty interface. Based on its form factor, however, the user registers the Teensy as a USB storage device with GoodUSB. Consequently, the serial interface is not exposed and the trojan cannot receive its commands.

Next, we use the Teensy to demonstrate GoodUSB’s ability to handle composite devices. We program the Teensy to simultaneously register itself as a keyboard, a joystick, a mouse and a serial port. Each interface is controlled by a separate task on the board; for example, one task instructs the mouse to move around the screen, while an independent task controls the joystick. With the help of the advanced mode of gud, GoodUSB displays all the interfaces requested by the Teensy before any drivers are loaded. This allows the user to whitelist individual interfaces; for example, we can enable mouse functionality while disabling all other input types. The result is that GoodUSB is able to enforce least privilege over the composite device by disabling other undesired functionalities requested by the device.

3.3.1.3 Smartphone-based USB attacks

The authors of BadUSB released a shell script called BadAndroid that emulates a DNS-based Man in the Middle (MitM) attack on the host machine using a rooted Android phone.4 The basic functionality required by this attack is USB Tethering, which allows a USB device to present itself as an Ethernet card to the host. In this experiment, we connect a Nexus S phone to GoodUSB and register it as “USB Cellphone.” GoodUSB only permits the smartphone to use the mass storage and vendor-specific interfaces. At first, the Nexus S registers only the storage interface. However, when we enable USB Tethering on the phone, GoodUSB detects the new interface request and pops up the registration window again, asking for the user’s permission. Only if the user explicitly selects “USB Cellphone with Tethering” will the network interface be available. If the standard “USB Cellphone” description is selected again, tethering over USB, and with it the potential DNS MitM attack, is thwarted.

Alternatively, when GoodUSB presented the second device registration window, we could have flagged the device as potentially malicious. The Nexus S would then have been redirected to HoneyUSB and granted permission to register the USB Tethering interface, where usbpro would have observed the IP packets sent and received by the phone. If there is a legitimate need for additional interfaces such as the network interface for tethering, GoodUSB can provide this support through the advanced interface menu or by adding a device-to-interface mapping (e.g., tethering-enabled phones) to the basic menu.

4 The malicious phone sends the host false DNS information, e.g., the IP address of an attacker-controlled server for a banking web site to steal credit card information.

3.3.2 Performance Analysis

The utility of GoodUSB depends on its imposing minimal overhead on the host. Below, we provide a microbenchmark of the different operations of GoodUSB. Our host machine is a ThinkCentre desktop with a 3GHz Intel(R) Core(TM)2 Duo CPU (2 cores) and 4 GB of RAM. HoneyUSB, which executes inside a KVM virtual guest, runs on the same host with 2 virtual CPUs and 2 GB of memory. Both are running Ubuntu Linux 14.04 LTS with kernel 3.13. The testing USB device is a Logitech ClearChat USB headset H390, containing 4 interfaces (3 audio + 1 HID). To precisely measure the overhead imposed by the core system rather than user interactions, we bypass the measurement of the GUI component by hard-coding messages to the kernel from the user daemon. All measurements are based on 20 enumerations using the same device plugged into the same USB port on the test machine.

Table 3-1. Microbenchmarking GoodUSB operation (in microseconds) averaged over 20 runs.

Action                     Min      Avg      Max      Mdev   Overhead
Normal Enumeration         140266   140424   141001   126    N/A
GoodUSB Steps:
  Device Identification    8.0      9.0      10.0     0.2    N/A
  First Enumeration        146308   147675   149336   609    5.2%
  Second Enumeration       146306   147463   149268   558    5.0%
  Honeypot Redirect        248951   262057   295444   6842   N/A

Table 3-1 provides the results of our measurements. Normal Enumeration displays the time required to add a new USB device by the original khub thread in the kernel without GoodUSB enabled. Device Identification shows the overhead of our device class identifier, which measures all the descriptors from the USB device and the current configuration using SHA1. The average overhead for this step is 9 µs, which is almost negligible compared to the whole USB enumeration, which takes about 140 ms. First Enumeration covers the case where GoodUSB is enabled and the device is plugged in for the first time (in user space, both the device registration and the security picture selection GUIs would pop up). Compared to the original device adding procedure, GoodUSB introduces only 5.2% overhead. Second Enumeration covers the case where the device is recognized by GoodUSB (in user space, only the device recognition GUI would show up). Compared with the original procedure, GoodUSB introduces only 5.0% overhead. Finally, we measure the overhead of HoneyUSB redirection in Honeypot Redirect. Note that HoneyUSB is already started in our evaluation; starting it usually takes 5–10 seconds on our host machine. Once HoneyUSB is running, the whole redirection needs only 262 ms to allow the device to re-enumerate.

We performed similar tests with a Kingston 2GB USB thumb drive and a Nexus S phone, with and without GoodUSB. The enumeration times are comparable: the overhead is 5.1% for the USB storage and 7.3% for the phone. The phone appears to have larger overhead because it enumerates more quickly in our testing, taking 2275 µs on average without GoodUSB enabled. Because USB is a master-slave protocol, the device’s ability to modify the speed of enumeration is limited. The speed is dictated by USB 2.0 bus speeds and the processing delay on the host. The enumeration of a headset is slower than that of a flash drive because it registers more interfaces, which causes more processing on the host and more data to be sent over the USB interface. GoodUSB’s overhead is thus virtually negligible during the USB device enumeration phase. There is no impact at all on regular device operations (e.g., file transfer, mouse movement, etc.) after the enumeration phase.

3.4 Discussion

Does selectively disabling interfaces break USB devices? We tested GoodUSB against a number of commonly used devices found in our laboratory, including USB keyboards, mice, flash drives, headsets, wireless adaptors, webcams, smart phones and chargers, and can anecdotally report that selective authorization of interfaces does not prevent benign USB devices from performing the authorized functions in most cases. For example, we tested GoodUSB against a Logitech USB headset that requested interfaces for Audio (Input), Audio (Output), and Human Interface Device (HID). Each feature was able to work in isolation when the others were disabled, e.g., a headset with the HID interface disabled. This is an exciting potential application for GoodUSB, as some enterprise environments may wish to selectively disable certain features (e.g., the microphone found in a headset) for fear that they be misused by malware. We expect that compatibility issues will arise in instances where USB device developers make unexpected use of interfaces. One example may be the YubiKey [86], an authentication aid that is both a USB smart card and a HID keyboard. USB quirks will always exist, and a serious USB device survey is needed to tell how diverse the combinations of interfaces are. For these unusual cases, GoodUSB can be easily extended to support the special devices by adding new device-to-interface mappings.

Can GoodUSB authenticate individual USB device units? Because devices can lie about their identity, GoodUSB relaxes the concept of authentication, instead seeking to identify classes of devices at the granularity of the product type under the same manufacturer. This is sufficient for the goals of our system, which only seeks to restrict the interface set available to certain kinds of devices; all USB devices of the same model should require access to the same interfaces.

Can GoodUSB protect against malicious smart phones? Smart phones are troublesome for GoodUSB due to the use of the vendor-specific interface. To minimize compatibility issues, GoodUSB allows the vendor-specific interface to be loaded for most kinds of devices. Many smart phones, including Android and iPhone devices, request the vendor-specific interface during enumeration. Because the phone’s actions are ultimately dependent on a user space driver, GoodUSB cannot make a determination as to the device’s potential actions until after the device has loaded. To provide some confidence as to the intent of the device, we recommend plugging smart phones into HoneyUSB via passthrough, where usbpro is able to profile the phone in the sandbox. To demonstrate, we profiled the Nexus 5 and iPhone 3GS in HoneyUSB. usbpro reported that the Nexus 5 used the vendor-specific interface to load the usbfs kernel driver. Unlike other USB kernel drivers, the only functionality provided by usbfs is to expose the device node to user space on the host and to enable file I/O operations. From there, the Nexus 5 loads gvfsd-mtp to perform the Media Transfer Protocol (MTP) over USB connections.
The iPhone 3GS uses two vendor-specific interfaces for loading the usbfs and ipheth kernel drivers. In user space, the usbmuxd driver allows data synchronization between the host machine and the iOS device. This serves to demonstrate that HoneyUSB can be used independently of the rest of our system to profile potentially malicious smart phones.

Can GoodUSB be used as an IDS? Though GoodUSB rests on the final decision of users, it is possible to extend GoodUSB into an IDS for USB devices, assisting users in identifying anomalous combinations of interfaces (storage + keyboard) as well as anomalous device behaviors (delayed registration of a keyboard, for instance). For the former, GoodUSB can pop up a warning window that notifies the user of the device’s policy violation, rather than disabling the anomalous interface silently. For the latter, usbpro could be used to learn the normal behaviors of devices and to detect abnormal behaviors in the future. Machine learning based techniques using timing side channels [45] can also be integrated into GoodUSB, helping detect abnormal behaviors of devices early in the USB enumeration phase.

Is GoodUSB easily deployable? GoodUSB was designed with consideration for users with limited technical knowledge. The GoodUSB daemon provides a basic mode to abstract away low-level interface decisions, simplifying device administration for regular users. Additionally, the daemon’s security image component speeds up the process of authorizing devices on subsequent connections. One obstacle to the deployment of GoodUSB is the requirement of a custom kernel. Instrumenting the kernel is necessary to introduce a security mechanism into the USB stack. To ease the installation and configuration of GoodUSB, we will be releasing GoodUSB in multiple formats upon publication. In addition to a kernel patch, we will also publish a prebuilt x86-64 GoodUSB kernel image for Ubuntu Linux users. Additionally, we will provide a preconfigured GoodUSB KVM image, as well as a separate KVM image for HoneyUSB, in order to make deployment of GoodUSB feasible and straightforward.

GoodUSB is a first step in hardening the USB stack against sophisticated attacks. In future work, we intend to move up the stack to explore USB drivers. While best practices in software engineering encourage drivers to support as many devices as possible, this inherently violates the principle of least privilege, providing a device with more abilities than it actually needs. We plan to perform a driver analysis that explores this problem in depth. We also intend to analyze some of the more popular user space drivers, such as usbmuxd, and instrument them to provide file-level provenance so their actions on the system may be better understood. We also hope to add more features to the GoodUSB architecture. One such feature is to let the profiling phase of HoneyUSB inform the available device-to-interface mappings in the gud graphical interface, thereby automating the process of adding new mappings to the policy engine. We also hope to use HoneyUSB profiles to improve GoodUSB’s ability to predict the purpose of vendor-specific interface requests, allowing gud to display the actual expected driver to be loaded to the user.

CHAPTER 4 USBFILTER

While GoodUSB bridges the semantic gap between operating systems and end users by including humans in the loop of USB device authentication, it relies on the knowledge of end users and needs user interaction whenever a USB device is plugged into the host machine. A natural question to ask is how to remove the human interaction while maintaining the security guarantees, e.g., enforcing security policies on different USB devices within the host machine.

The USB protocol works in a master-slave fashion, where the host USB controller is responsible for polling the device both for requests and responses. When a USB device is attached to a host machine, the host USB controller queries the device to obtain its configurations and activates a single configuration supported by the device. For instance, when a smartphone is connected to a host machine via USB, users can choose whether it acts as a storage or a networking device. By parsing the current active configuration, the host operating system identifies all the interfaces contained in the configuration and loads the corresponding device driver for each interface. Once a USB device driver starts, it first parses the endpoint information embedded within this interface, as shown in Figure 2-2.

While the interface provides the basic information for the host operating system to load the driver, the endpoint is the communication unit when a driver talks with the USB device hardware. Per specification, endpoint 0 (EP0) should be supported by default, enabling Control (packet) transfer from a host to a device to further probe the device, prepare for data transmission, and check for errors. All other endpoints are optional, though there is usually at least EP1, providing Isochronous, Interrupt, or Bulk (packet) transfers, which are used by audio/video, keyboard/mouse, and storage/networking devices respectively. All endpoints are grouped into either In pipes, where transfers are from the device to the host, or Out pipes, where transfers are from the host to the device. This in/out pipe determines the transmission direction of a USB packet. With all endpoints set up, the driver is able to communicate with the device hardware by submitting USB packets with different target endpoints, packet types, and directions. These packets are delivered to the host controller, which calls the controller hardware to encode USB packets into electrical signals and send them to the device.

Excerpts from this chapter previously appeared in “Making USB Great Again with USBFILTER”, originally published in the proceedings of the 2016 USENIX Security Symposium [87].

Observing that USB is essentially a packetized protocol like networking transport protocols, we propose usbfilter, the first USB packet firewall enforcing security policies on each USB packet within the operating system. usbfilter is different from previous works in this space because it enables the creation of rules that explicitly allow or deny functionality based on a wide range of features. GoodUSB relies on the user to explicitly allow or deny specific functionality based on what the device reports, but cannot enforce that the behavior of a device matches what it reports. SELinux [88] policies and PinUP [89] provide mechanisms for pinning processes to filesystem objects, but usbfilter expands this by allowing individual USB packets to be associated with processes. This allows our system to pin not only devices to processes, but also individual interfaces of composite devices. The policies can be applied to differentiate individual devices by identifiers presented during device enumeration. These identifiers, such as the serial number, provide a stronger measure of identification than simple product and vendor codes. While not a strong authentication mechanism, usbfilter is able to perform filtering without additional hardware. The granularity and extensibility of usbfilter allow it to perform the existing functions of GoodUSB while permitting much stronger control over USB devices.
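The configuration, interface, and endpoint hierarchy described in this chapter introduction can be sketched as a small data model. This is an illustration under stated assumptions rather than kernel code: the class names, the `DRIVERS` table, and the example device are hypothetical, while the bit layout of the endpoint address and attributes and the three interface class codes follow the USB 2.0 specification.

```python
from dataclasses import dataclass, field

# bmAttributes bits 0-1 encode the transfer type (USB 2.0 endpoint descriptor).
TRANSFER_TYPES = {0: "control", 1: "isochronous", 2: "bulk", 3: "interrupt"}

# Hypothetical driver table keyed by bInterfaceClass
# (0x01 audio, 0x03 HID, 0x08 mass storage are standard class codes).
DRIVERS = {0x01: "audio", 0x03: "hid", 0x08: "mass-storage"}


@dataclass
class Endpoint:
    address: int     # bEndpointAddress
    attributes: int  # bmAttributes

    @property
    def number(self) -> int:
        return self.address & 0x0F

    @property
    def direction(self) -> str:
        # Bit 7 set means device-to-host (In); clear means host-to-device (Out).
        return "in" if self.address & 0x80 else "out"

    @property
    def transfer_type(self) -> str:
        return TRANSFER_TYPES[self.attributes & 0x03]


@dataclass
class Interface:
    index: int
    class_code: int
    endpoints: list = field(default_factory=list)


@dataclass
class Configuration:
    value: int
    interfaces: list = field(default_factory=list)


def bind_drivers(config: Configuration) -> dict:
    """Pick one driver per interface of the active configuration."""
    return {i.index: DRIVERS.get(i.class_code, "unbound") for i in config.interfaces}


# Made-up composite device: a storage interface plus a keyboard-like HID interface.
config = Configuration(1, [
    Interface(0, 0x08, [Endpoint(0x81, 0x02), Endpoint(0x02, 0x02)]),
    Interface(1, 0x03, [Endpoint(0x83, 0x03)]),
])

for index, driver in bind_drivers(config).items():
    print(index, driver)
```

A composite device of this shape is exactly the case where per-interface policy matters: the storage interface and the HID interface belong to one physical device but are served by different drivers.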

4.1 Design

The complex nature of the USB protocol and the variety of devices that can be attached to it make developing a robust and efficient access control mechanism challenging. Layers in the operating system between the process and the hardware device create difficulties when identifying processes. Accordingly, developing a system such as usbfilter is not as simple as intercepting USB packets and dropping those that match rules. In this section, we discuss our security goals, design considerations, and implementation of usbfilter while explaining the challenges of developing such a system.

4.1.1 Threat and Trust Models

We consider an adversary against our system who has restricted external physical or full network access to a given host. The adversary may launch physical attacks such as attaching unauthorized USB devices to the host system or tampering with the hardware of previously-authorized devices to add additional functionality. The physically-present adversary may not open the device or tamper with the internal storage, firmware, or any other hardware. This type of adversary might (for example) be present in an or retail location, where devices have exposed USB ports, but tampering with the chassis of the device would raise suspicion or sound alarms. The adversary may also launch network attacks in order to enable or access authorized devices from unauthorized processes or devices. In either case, the adversary may attempt to exfiltrate data from the host system via both physical and virtual USB devices. We consider the following actions by an adversary:

• Device Tampering: The adversary may attempt to attach or tamper with a previously-authorized device to add unauthorized functionality (e.g., BadUSB [3]).

• Unauthorized Devices: Unauthorized devices attached to the system either physically or virtually [90] can be used to discreetly interact with the host system or to provide data storage for future exfiltration.

• Unauthorized Access: The adversary may attempt to enable or access authorized devices on a host (e.g., webcam, microphone, etc.) via unauthorized software to gain access to information or functionality that would otherwise be inaccessible.

[Figure 4-1 diagram labels: User Space (App1, App2, App3; I/O operation); Kernel Space (USBFILTER; Rule DB); USB packet; devices (mouse, camera, keyboard, headset, storage, wireless).]

Figure 4-1. usbfilter implements a USB-layer reference monitor within the kernel, by filtering USB packets to different USB devices to control the communications between applications and devices based on rules configured.

We assume that as a kernel component, the integrity of usbfilter depends on the integrity of the operating system and the host hardware (except USB devices). Code running in the kernel space has unrestricted access to the kernel’s memory, including our code, and we assume that the code running in the kernel will not tamper with usbfilter. We discuss how we ensure runtime and platform integrity in our experimental setup in Section 4.1.4.

4.1.2 Design Goals

Inspired by the Netfilter [91] framework in the Linux kernel, we designed usbfilter to enable administrator-defined rule-based filtering for the USB protocol. To achieve this, we first designed our system to satisfy the concept of a reference monitor [92], shown in Figure 4-1. While these goals are not required for full functionality of usbfilter, we chose to design for stronger security guarantees to ensure that processes attempting to access hardware USB devices directly would be unable to circumvent our system. We define the specific goals as follows:

G1 Complete Mediation. All physical or virtual USB packets must pass through usbfilter before delivery to the intended destination.

G2 Tamperproof. usbfilter may not be bypassed or disabled as long as the integrity of the operating system is maintained.

G3 Verifiable. The user-defined rules input into the system must be verifiably correct. These rules may not conflict with each other.

While the above goals support the security guarantees that we want usbfilter to provide, we expand upon these to provide additional functionality:

G4 Granular. Any mutable data in a USB packet header must be accessible by a user-defined rule. If the ultimate destination of a packet is a userspace process, usbfilter must permit the user to specify the process in a rule.

G5 Modular. usbfilter must be extensible and allow users to provide submodules to support additional types of analysis.

4.1.3 Design and Implementation

The core usbfilter component is statically compiled and linked into the Linux kernel image. It hooks the flow of USB packets before they reach the USB host controller that serves the USB device drivers, as shown in Figure 4-2. Like Netfilter, this USB firewall checks a user-defined rule database for each USB packet that passes through it and takes the action defined in the first matching rule. A user-space program, usbtables, provides mediated read/write access to the rule database. Since usbfilter intercepts USB packets in the kernel, it can control access to both physical and virtual devices.

4.1.3.1 Packet filtering rules

[Figure 4-2 diagram labels: User Space (App1, App2, App3; usbtables; I/O operation); netlink; Kernel Space (Storage Driver, Input Driver, Video Driver; URB; Host Controller; USBFILTER; Rule DB; usbfilter modules); USB packet; USB Devices.]

Figure 4-2. The architecture of usbfilter.

To access external USB devices, user-space applications request I/O operations which are transformed into USB request blocks (URBs) by the operating system. The communication path involves the process, the device, and the I/O request itself (USB packet). Similarly, a usbfilter rule can be described using the process information, the device information, and the USB packet information.
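As a rough illustration, a rule built from process, device, and packet information, together with the first-match lookup described in Section 4.1.3, might be modeled as follows. This is a sketch, not the usbfilter rule format: the field names (`comm`, `device`, `direction`), the example rules, and the default action applied when no rule matches are all assumptions made here.

```python
from dataclasses import dataclass

ALLOW, DROP = "ALLOW", "DROP"


@dataclass
class Rule:
    name: str
    conditions: dict  # hypothetical field name -> required value
    action: str       # ALLOW or DROP


def matches(rule, packet):
    """A rule matches when every one of its conditions holds for the packet."""
    return all(packet.get(key) == value for key, value in rule.conditions.items())


def filter_packet(rules, packet, default=ALLOW):
    """Return the action of the first matching rule (first-match semantics).

    The fallback `default` is an assumption of this sketch.
    """
    for rule in rules:
        if matches(rule, packet):
            return rule.action
    return default


# Made-up policy: only a process named "videochat" may talk to the webcam.
rules = [
    Rule("cam-allow", {"comm": "videochat", "device": "webcam"}, ALLOW),
    Rule("cam-deny", {"device": "webcam"}, DROP),
]

print(filter_packet(rules, {"comm": "videochat", "device": "webcam", "direction": "in"}))
print(filter_packet(rules, {"comm": "malware", "device": "webcam", "direction": "in"}))
```

Because the lookup stops at the first match, rule order matters: swapping the two rules above would drop the webcam traffic of every process.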

A usbfilter rule R can be expressed as a triple (N, C, A), where N is the name of the rule, C is a set of conditions, and A ∈ {ALLOW, DROP} is the action that is taken when all of the conditions are satisfied. As long as the values in the conditions, action, and name are valid, the rule is valid, but it may not be correct considering other existing rules. We discuss verifying the correctness of rules in Section 4.2.

4.1.3.2 Traceback

USB packets do not carry attribution data that can be used to determine the source or destination process of a packet. We therefore need to perform traceback to attribute packets to interfaces and processes.

Interfaces. As discussed before, a USB device can have multiple interfaces, each with a discrete functionality served by a device driver in the operating system. Once a driver is bound to an interface, it is able to communicate with that interface using USB packets. Determining the driver responsible for receiving or sending a given USB packet is useful for precisely controlling device behaviors. However, identifying the responsible driver is not possible at the packet level, since the packets are already in transit and do not contain identifying information. While we could infer the responsible driver for simple USB devices, such as a mouse, this becomes unclear with composite USB devices with multiple interfaces (some of which may be served by the same driver). To recover this important information from USB packets without changing each driver and extending the packet structure, we save the interface index into the kernel endpoint structure during USB enumeration. This reverse mapping of interface to driver needs to be performed only once per device. The interface index distinguishes interfaces belonging to the same physical device and USB packets submitted by different driver instances. Once the mapping has been completed, the USB host controller is able to easily trace USB packets back to their originating interface.

Processes. Similarly, tracking the destination or source process responsible for a USB packet is not trivial due to the way modern operating systems abstract device access from applications. For example, when communicating with USB storage devices, the operating system provides several abstractions between the application and the device, including a filesystem, block layer, and I/O scheduler. Furthermore, applications generally submit asynchronous I/O requests, causing the kernel to perform the communications task on a separate background thread.

This problem also appears when inspecting USB network device packets, including both wireline (e.g., Ethernet) dongles and wireless (e.g., WiFi) adapters. It is common for these USB device drivers to have their own RX/TX queues to boost system performance using asynchronous I/O. In these cases, USB is an intermediate layer that encapsulates IP packets into USB packets for processing by the USB networking hardware.

These cases are problematic for usbfilter because a naïve traceback approach will often only identify the kernel thread as the origin of a USB packet. To recover the process identifier (PID) of the true origin, we must ensure that this information persists between all layers within the operating system before the I/O request is transformed into a USB packet.1
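The two traceback ideas above, recording the interface index once at enumeration and carrying the originating PID along with the I/O request, can be sketched in user-space Python. This is a simplified model, not the kernel implementation: every name here (`record_enumeration`, `IORequest`, `UsbPacket`, `KWORKER_PID`) is hypothetical, and the "worker" is simulated by a constant rather than a real thread.

```python
from dataclasses import dataclass

# (device id, endpoint address) -> interface index, filled once per device.
ENDPOINT_TO_INTERFACE = {}

KWORKER_PID = 17  # made-up PID standing in for a kernel background thread


def record_enumeration(device_id, interfaces):
    """interfaces maps an interface index to the endpoint addresses it owns."""
    for index, endpoints in interfaces.items():
        for address in endpoints:
            ENDPOINT_TO_INTERFACE[(device_id, address)] = index


@dataclass
class IORequest:
    payload: bytes
    origin_pid: int  # tagged while still running in the caller's context


@dataclass
class UsbPacket:
    device_id: str
    endpoint: int
    payload: bytes
    origin_pid: int   # copied from the request, survives the layer crossing
    submitter_pid: int  # whoever actually dispatched the packet


def submit_io(payload, pid):
    return IORequest(payload, pid)


def to_usb_packet(request, device_id, endpoint, submitter_pid=KWORKER_PID):
    # Runs later on behalf of the request; the origin PID is copied, not re-read.
    return UsbPacket(device_id, endpoint, request.payload,
                     request.origin_pid, submitter_pid)


def trace(packet):
    """Return (interface index, originating PID) for a packet."""
    interface = ENDPOINT_TO_INTERFACE.get((packet.device_id, packet.endpoint))
    return interface, packet.origin_pid


record_enumeration("dev1", {0: [0x81, 0x02], 1: [0x83]})
packet = to_usb_packet(submit_io(b"data", pid=4242), "dev1", 0x02)
print(trace(packet), "naive origin would be", packet.submitter_pid)
```

The contrast between `origin_pid` and `submitter_pid` is the point of the sketch: inspecting only the dispatching context yields the background thread, while the tagged request still names the process that issued the I/O.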

usbfilter instruments the USB networking driver (usbnet), the USB wireless driver (rt2x00usb), the USB storage driver (usb-storage), as well as the block layer and I/O schedulers. Changes to the I/O schedulers are needed to avoid the potential merging of two block requests from different processes. By querying the rule database and usbfilter modules, usbfilter sets up a filter for all USB packets right before they are dispatched to the devices.

4.1.3.3 Userspace control

usbtables manages usbfilter rules added in the kernel and saves all active rules in a database. Using udev, saved rules are flushed into the kernel automatically upon reboot. usbtables is also responsible for verifying the correctness of rules, as we will discuss in Section 4.2. Once verified, new rules will be synchronized with the kernel and saved locally. If no user-defined rules are present, usbfilter enforces default rules that are designed to prevent impact on normal kernel activities (e.g., USB hot-plugs). These rules can be overridden or augmented by the user as desired.

4.1.4 Deployment

We now demonstrate how we use existing security techniques in the deployment of usbfilter. Attestation and MAC policy are necessary for providing complete mediation and tamperproof reference monitor guarantees, but not for the functionality of the system. The technologies we reference in this section are illustrative examples of how these goals can be met.

1 usbfilter does not overlap with Netfilter or any other IP packet filtering mechanisms which work along the TCP/IP stack.

4.1.4.1 Platform integrity

We deployed usbfilter on a physical machine with a Trusted Platform Module (TPM). The TPM provides a root of trust that allows for a measured boot of the system and provides the basis for remote attestations to prove that the host machine is in a known hardware and software configuration. The BIOS’s core root of trust for measurement (CRTM) bootstraps a series of code measurements prior to the execution of each platform component. Once booted, the kernel then measures the code for user-space components (e.g., provenance recorder) before launching them using the Linux Integrity Measurement Architecture (IMA) [93]. The result is then extended into TPM PCRs, which form a verifiable chain of trust that shows the integrity of the system via a digital signature over the measurements. A remote verifier can use this chain to determine the current state of the system using TPM attestation. Together with TPM, we also use Intel’s Trusted Boot (tboot).2

4.1.4.2 Runtime integrity

After booting into the usbfilter kernel, the runtime integrity of the TCB (defined in Section 4.1.1) must also be assured. To protect the runtime integrity of the kernel, we deploy a Mandatory Access Control (MAC) policy, as implemented by . We enable SELinux’s MLS policy, the security of which was formally modeled by Hicks et al. [94]. We also ensure that usbtables executes in a restricted environment and that access to the rules database saved on the disk is protected by defining an SELinux Policy Module and compiling it into the SELinux Policy.

2 See http://sf.net/projects/tboot

4.2 Security Analysis

In this section, we demonstrate that usbfilter meets the security goals outlined in Section 4.1 using the deployment and configurations described in that section.

Complete Mediation (G1). As we previously discussed, usbfilter must mediate all USB packets between devices and applications on the host. In order to ensure this, we have instrumented usbfilter into the USB host controller, which is the last hop for USB packets before leaving the host machine and the first when entering it. Devices cannot initiate USB packet transmission without permission from the controller. We also instrument the virtual USB host controller (vhci) to cover virtual USB devices (e.g., USB/IP). To support other non-traditional USB host controllers such as Wireless USB [95] and Media Agnostic USB [96], usbfilter support is easily added via a simple kernel API call and the inclusion of a header file.

Tamperproof (G2). usbfilter is statically compiled and linked into the kernel image to avoid being unloaded as a kernel module. The integrity of this runtime, the associated database, and user-space tools is assured through the SELinux policy as described in Section 4.1.4.2. Tampering with the kernel or booting a different kernel is the only way to bypass usbfilter, and platform integrity measures provide detection capabilities for this scenario (Section 4.1.4.1).

Formal Verification (G3). The formal verification of usbfilter rules is implemented as a logic engine within usbtables using GNU Prolog [97]. Instead of trying to prove that an abstract model of rule semantics is correctly implemented by the code, which is usually intractable for the Linux kernel, we limit our focus to rule correctness and consistency checking. Each time usbtables is invoked to add a new rule, the new rule and the existing rules are loaded into the logic engine for formal verification. This process needs to be performed only once when adding a new rule, and usbfilter continues to run while the verification takes place.

The verification checks for rules with the same conditions but different actions. These rules are considered conflicting, and usbtables will terminate with an error when this occurs. We define the correctness of a rule:

$$
\begin{aligned}
\mathit{is\_correct}(R, \mathcal{R}) \leftarrow{} & \mathit{is\_name\_unique}(R) \wedge {} \\
& \mathit{are\_condition\_values\_in\_range}(R) \wedge {} \\
& \mathit{has\_no\_conflict\_with\_existing\_rules}(R, \mathcal{R}).
\end{aligned}
$$

where $R$ is a new usbfilter rule and $\mathcal{R}$ denotes all other existing rules maintained by usbfilter. If the new rule has a unique name, all the values of its conditions are in range, and it does not conflict with any existing rule, the rule is correct. While the name and the value checks are straightforward, there are different conflicting cases between the conditions and the action, particularly when a rule does not contain all conditions. For example, a rule can be contradictory with, a sub-rule of, or the same as another existing rule. As such, we define the general conflict between two rules as follows:

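$$
\begin{aligned}
\mathit{general\_conflict}(R_a, R_b) \leftarrow{} \forall C_i \ni C : \;
& (\exists C_i^a \ni R_a \wedge \exists C_i^b \ni R_b \wedge \mathit{value}(C_i^a) \neq \mathit{value}(C_i^b)) \vee {} \\
& (\exists C_i^a \ni R_a \wedge \nexists C_i^b \ni R_b) \vee {} \\
& (\nexists C_i^a \ni R_a \wedge \nexists C_i^b \ni R_b).
\end{aligned}
$$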

A rule $R_a$ is generally conflicted with another rule $R_b$ if all conditions used by $R_a$ are a subset of the ones specified in $R_b$. We consider a general conflict to occur if the new rule and an existing rule would fire on the same packet.

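Based on the general conflict, we define weak conflict and strong conflict as follows:

$$
\begin{aligned}
\mathit{weak\_conflict}(R_a, R_b) &\leftarrow \mathit{general\_conflict}(R_a, R_b) \wedge \mathit{action}(R_a) = \mathit{action}(R_b). \\
\mathit{strong\_conflict}(R_a, R_b) &\leftarrow \mathit{general\_conflict}(R_a, R_b) \wedge \mathit{action}(R_a) \neq \mathit{action}(R_b).
\end{aligned}
$$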

While a weak conflict shows that the new rule could be a duplicate of an existing rule, a strong conflict indicates that the new rule would not work. Depending on the requirement and the implementation, however, a weak conflict may be allowed temporarily to shrink the scope of an existing rule while avoiding a time gap between removing the old rule and adding the new one. For instance, rule A drops any USB packet writing data into any external USB storage device. Later on, the user decides to block write operations only for the Kingston thumb drive by writing rule B, which weakly conflicts with rule A, since both rules have the same destination and action. When the user wants to unblock the Kingston storage by writing rule C, rule C strongly conflicts with both rules A and B, since rule C has a different action, and will never work as expected because of rules A and B. By relying on the logic reasoning of Prolog, we are able to guarantee that a rule is formally verified to have no conflict with existing rules before it is added.³
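The rule A/B/C example can be traced with a short Python sketch. This is not the Prolog engine inside usbtables; the dictionary layout and the names `general_conflict` and `classify` are hypothetical. It also assumes one particular reading of the definitions, chosen to agree with the worked example: a new rule generally conflicts with an existing rule when every condition the existing rule specifies appears in the new rule with an equal value, so that any packet the new rule fires on also fires the existing rule.

```python
from typing import Any, Dict, List, Optional

# Hypothetical in-memory form of a rule: only the conditions a rule actually
# specifies appear in "conditions"; an unspecified condition is simply absent.
Rule = Dict[str, Any]


def general_conflict(new: Rule, old: Rule) -> bool:
    """True when `new` is the same as, or a narrower version of, `old`.

    Assumed reading: every condition `old` specifies is also specified by
    `new` with an equal value, so each packet matched by `new` also
    matches `old`.
    """
    new_conds = new["conditions"]
    old_conds = old["conditions"]
    return all(
        key in new_conds and new_conds[key] == value
        for key, value in old_conds.items()
    )


def classify(new: Rule, existing: List[Rule]) -> Optional[str]:
    """Return "strong", "weak", or None for a candidate rule."""
    verdict = None
    for old in existing:
        if general_conflict(new, old):
            if new["action"] != old["action"]:
                return "strong"  # same packets, different action
            verdict = "weak"     # same packets, same action
    return verdict


# Rules A, B, and C from the example, in the assumed representation.
rule_a = {"name": "A", "action": "drop",
          "conditions": {"lum": "block_scsi_write"}}
rule_b = {"name": "B", "action": "drop",
          "conditions": {"lum": "block_scsi_write", "manufacturer": "Kingston"}}
rule_c = {"name": "C", "action": "allow",
          "conditions": {"lum": "block_scsi_write", "manufacturer": "Kingston"}}

print(classify(rule_b, [rule_a]))          # B against A
print(classify(rule_c, [rule_a, rule_b]))  # C against A and B
```

Under these assumptions, rule B is reported as a weak conflict with rule A, and rule C as a strong conflict, matching the discussion of the example.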

³ Note that all rules are monotonic by design, which means rules to be added cannot override existing ones. Future work will add general rules, which can be overwritten by new rules.

Granular (G4). A usbfilter rule can contain 21 different conditions, excluding the name and action fields. We further divide these conditions into 4 tables: the process, device, packet, and Linux usbfilter Module (LUM) tables, as shown in Figure 4-3. The process table lists conditions specific to target applications; the device table contains details of USB devices in the system; the packet table includes important information about USB packets; and the LUM table determines the name of the LUM to be used if needed. Note that all LUMs should be loaded into the kernel before being used in usbfilter rules.

    -d|--debug      enable debug mode
    -c|--config     path to configuration file (TBD)
    -h|--help       display this help message
    -p|--dump       dump all the rules
    -a|--add        add a new rule
    -r|--remove     remove an existing rule
    -s|--           synchronize rules with kernel
    -e|--enable     enable usbfilter
    -q|--disable    disable usbfilter
    -b|--behave     change the default behavior
    -o|--proc       process table rule
    -v|--dev        device table rule
    -k|--pkt        packet table rule
    -l|--lum        LUM table rule
    -t|--act        table rule action
    ---------------------------------
    proc: pid,ppid,pgid,uid,euid,gid,egid,comm
    dev: busnum,devnum,portnum,ifnum,devpath,product,manufacturer,serial
    pkt: types,direction,endpoint,address
    lum: name
    behavior/action: allow|drop

Figure 4-3. The output of “usbtables -h”. The permitted conditions are divided into 4 tables: the process table, the device table, the packet table, and the Linux usbfilter Module (LUM) table.

Module Extension (G5). To support customized rule construction and deep USB packet analysis, usbfilter allows system administrators to write Linux usbfilter Modules (LUMs) and load them into the kernel as needed. To write a LUM, developers need only include the <linux/usbfilter.h> header file in the kernel module, implement the callback lum_filter_urb(), and register the module using usbfilter_register_lum(). Once registered, the LUM can be referenced by its name in the construction of a rule. When a LUM is encountered in a rule, besides other condition checking, usbfilter calls the lum_filter_urb() callback within this LUM, passing the USB packet as the sole parameter. The callback returns 1 if the packet matches the target of this LUM, 0 otherwise. Note that the current implementation supports only one LUM per rule.

4.3 Evaluation

The usbfilter host machine is an Optiplex 7010 with an Intel quad-core 3.20 GHz CPU and 8 GB of memory, running Ubuntu Linux 14.04 LTS with kernel version 3.13. The machine has two USB 2.0 controllers and one USB 3.0 controller, provided by the Intel 7 Series/C210 Series chipset. To demonstrate the power of usbfilter, we first examine different USB devices and provide practical use cases that are non-trivial for traditional access control mechanisms. We then measure the overhead introduced by usbfilter. The default behavior of usbfilter on our host machine is to allow a USB packet if no rule matches it. A more constrained setting is to change the default behavior to drop, requiring an allow rule for each permitted USB device. In this setting, malicious devices have to impersonate benign devices to communicate at all, and their communications are still regulated by the rules, e.g., no HID traffic is allowed for a legitimate USB storage device. All tests use the same front-end USB 2.0 port on the machine.

4.3.1 Case Studies

Listen-only USB headset. The typical USB headset is a composite device with multiple interfaces including speakers, microphone, and volume control. Sensitive working environments may ban the use of USB headsets due to possible eavesdropping using the microphone [98]. Physically disabling the headset microphone is often the only mechanism for permanently removing it, as there is no other way to guarantee the microphone stays off. Users can mute or unmute the microphone using the desktop audio controls at any time after login. However, with usbfilter, the system administrator can guarantee that the headset’s microphone remains disabled and cannot be enabled or accessed by users. We use a Logitech H390 Headset to demonstrate how to achieve this guarantee on the usbfilter host machine:

    usbtables -a logitech-headset -v ifnum=2,product="Logitech USB Headset",manufacturer=Logitech -k direction=1 -t drop

This rule drops any incoming packets from the Logitech USB headset’s microphone. By adding the interface number (ifnum=2), we avoid breaking other functionality in the headset.
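How such a rule interacts with the default behavior can be illustrated with a small Python model. It is a sketch rather than the kernel implementation: the dictionary layout, the function names, and the first-match evaluation order are assumptions, and the interface number used for the speaker packet is hypothetical.

```python
from typing import Any, Dict, List

Packet = Dict[str, Any]
Rule = Dict[str, Any]


def rule_matches(rule: Rule, packet: Packet) -> bool:
    """A rule matches when every condition it specifies equals the
    corresponding attribute of the packet; unspecified conditions are
    treated as wildcards."""
    return all(packet.get(key) == value
               for key, value in rule["conditions"].items())


def decide(rules: List[Rule], packet: Packet, default: str = "allow") -> str:
    """Return the action of the first matching rule (assumed order),
    falling back to the default behavior when no rule matches."""
    for rule in rules:
        if rule_matches(rule, packet):
            return rule["action"]
    return default


# The listen-only headset rule, with direction=1 standing for incoming packets.
headset_rule = {
    "name": "logitech-headset",
    "action": "drop",
    "conditions": {"ifnum": 2, "product": "Logitech USB Headset",
                   "manufacturer": "Logitech", "direction": 1},
}

mic_packet = {"ifnum": 2, "product": "Logitech USB Headset",
              "manufacturer": "Logitech", "direction": 1}
# Hypothetical speaker traffic on a different interface, going to the device.
speaker_packet = dict(mic_packet, ifnum=1, direction=0)

print(decide([headset_rule], mic_packet))
print(decide([headset_rule], speaker_packet))
```

In this model the microphone packet is dropped, while the speaker packet matches no rule and falls through to the default allow behavior, which is why pinning the rule to one interface leaves the rest of the headset usable.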

Customizing devices. To further show how usbfilter can filter functionalities provided by USB devices, we use a Teensy 3.2 [99] to create a complex USB device with five interfaces including a keyboard, a mouse, a joystick, and two serial ports. The keyboard continually types commands in the terminal, while the mouse continually moves the cursor. We can write usbfilter rules to completely shut down the keyboard and mouse functionalities:

    usbtables -a teensy1 -v ifnum=2,manufacturer=Teensyduino,serial=1509380 -t drop
    usbtables -a teensy2 -v ifnum=3,manufacturer=Teensyduino,serial=1509380 -t drop

In these rules, we use the conditions “manufacturer” and “serial” (serial number) to limit the Teensy’s functionality. Different interface numbers represent the keyboard and the mouse respectively. After these rules are applied, both the keyboard and the mouse return to normal.

Default-deny input devices. Next, we show how to defend against HID-based BadUSB attacks using usbfilter. These devices are a type of Trojan horse; they appear to be one device, such as a storage device, but secretly contain hidden input functionality (e.g., keyboard or mouse). When attached to a host, the device can send keystrokes to the host and perform actions as the current user. First, we create a BadUSB storage device using a Rubber Ducky [34], which looks like a USB thumb drive but opens a terminal and injects keystrokes. Then we add the following rules to the host machine:

    usbtables -a mymouse -v busnum=1,devnum=4,portnum=2,devpath=1.2,product="USB Optical Mouse",manufacturer=PixArt -k types=1 -t allow
    usbtables -a mykeyboard -v busnum=1,devnum=3,portnum=1,devpath=1.1,product="Dell USB Entry Keyboard",manufacturer=DELL -k types=1 -t allow
    usbtables -a noducky -k types=1 -t drop

The first two rules whitelist the existing keyboard and mouse on the host machine; the last rule drops any USB packets from other HID devices. After these rules are inserted into the kernel, reconnecting the malicious device does nothing. Attackers may try to impersonate the keyboard or mouse on the host machine. However, we have leveraged information about the physical interface (busnum and portnum) to write the first two rules, which would require the attacker to unplug the existing devices, plug the malicious device in, and impersonate the original devices, including the device’s VID/PID and serial number. We leave authenticating individual USB devices to future work; however, usbfilter is extensible, so authentication can be added and used in rules.

Data exfiltration. To prevent data exfiltration from the host machine to USB storage devices, we write a LUM (Linux usbfilter Module) to block the SCSI write command from the host to the device, as shown in Figure A-1 in the Appendix. The LUM then registers itself with usbfilter and can be referenced by its name in rule constructions. In this case study, we use a Kingston DT 101 II 2G USB flash drive, and insert the following rule:

    usbtables -a nodataexfil -v manufacturer=Kingston -l name=block_scsi_write -t drop

This rule prevents modification of files on the storage device. Interestingly, vim reports files on the device to be read-only, despite the filesystem reporting that the files are read-write. Since usbfilter is able to trace packets back to the applications initiating I/O operations at the Linux kernel block layer, we are able to write rules blocking (or allowing) specific users or applications from writing to the flash drive:

    usbtables -a nodataexfil2 -o uid=1001 -v manufacturer=Kingston -l name=block_scsi_write -t drop
    usbtables -a nodataexfil3 -o comm=vim -v manufacturer=Kingston -l name=block_scsi_write -t drop

The first rule prevents the user with uid=1001 from writing anything to the USB storage; the second blocks vim from writing to the storage. We can also block any writes to USB storage devices:

    usbtables -a nodataexfil4 -l name=block_scsi_write -t drop
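The block_scsi_write LUM used in these rules is the kernel module given in Figure A-1. As a rough userspace illustration of the kind of check such a LUM could perform, the Python sketch below inspects a transfer buffer for a USB mass-storage Command Block Wrapper (CBW) carrying a SCSI WRITE opcode. It assumes the device uses Bulk-Only Transport; the function names are hypothetical, and the actual LUM may differ.

```python
import struct

CBW_SIGNATURE = 0x43425355  # bytes "USBC" read as a little-endian 32-bit value
CBW_LENGTH = 31             # fixed CBW size in Bulk-Only Transport
CBWCB_OFFSET = 15           # first byte of the embedded SCSI command block
# SCSI opcodes for WRITE(6), WRITE(10), WRITE(12), and WRITE(16).
SCSI_WRITE_OPCODES = {0x0A, 0x2A, 0xAA, 0x8A}


def is_scsi_write(transfer_buffer: bytes) -> int:
    """Return 1 if the buffer is a CBW wrapping a SCSI WRITE, else 0,
    mirroring the 1/0 convention of a LUM callback."""
    if len(transfer_buffer) != CBW_LENGTH:
        return 0
    (signature,) = struct.unpack_from("<I", transfer_buffer, 0)
    if signature != CBW_SIGNATURE:
        return 0
    opcode = transfer_buffer[CBWCB_OFFSET]
    return 1 if opcode in SCSI_WRITE_OPCODES else 0


def make_cbw(opcode: int, data_length: int = 512) -> bytes:
    """Build a minimal CBW for illustration: signature, tag, data length,
    flags (0x00 = host-to-device), LUN, command length, then a 16-byte
    command block whose first byte is the opcode."""
    header = struct.pack("<IIIBBB", CBW_SIGNATURE, 1, data_length, 0x00, 0, 10)
    command_block = bytes([opcode]) + bytes(15)
    return header + command_block


print(is_scsi_write(make_cbw(0x2A)))  # WRITE(10)
print(is_scsi_write(make_cbw(0x28)))  # READ(10)
```

A rule that references such a check with the drop action discards only the packets that carry write commands, which is consistent with reads from the device continuing to work.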

usbfilter logs dropped USB packets, and these logs can easily be used in a centralized alerting system, notifying administrators of unauthorized access attempts.

Webcam pinning. Webcams can easily be enabled and accessed by attackers exploiting vulnerable applications. Once access has been established, the attacker can listen to or watch the environment around the host computer. In this case study, we show how to use usbfilter to restrict the use of a Logitech Webcam C310 to specific users and applications.

    usbtables -a skype -o uid=1001,comm=skype -v serial=B4482A20 -t allow
    usbtables -a nowebcam -v serial=B4482A20 -t drop

The serial number of the Logitech webcam is specified in the rules to differentiate it from any others that may be attached to the system, as well as to prevent other webcams from being attached. The first rule allows USB communication with the webcam only if the user is uid=1001 and the application is Skype. The following nowebcam rule drops all other USB packets to the webcam. As expected, the user can use the webcam from Skype but not from Pidgin, and other users cannot start video calls even with Skype.

USB charge-only. Another form of BadUSB attack is DNS spoofing using smartphones. Once plugged into the host machine, the malicious phone automatically enables USB tethering, is recognized as a USB NIC by the host, and then injects spoofed DNS replies into the host. The resulting man-in-the-middle attack gives the attacker access to the host’s network communications without the authorization of the user. To prevent this attack, we use usbfilter to block all USB packets from a Google smartphone:

    usbtables -a n4-charger -v product="Nexus4" -t drop

This rule drops any USB packets to/from the phone, which enforces the phone as a pure charging device without any USB functionality. The phone cannot be used for storage or tethering after the rule is applied. We can construct a more specific charge-only rule:

    usbtables -a charger -v busnum=1,portnum=4 -t drop

This rule targets a specific physical port on the host, so that the port can only be used for charging. This type of rule is useful where USB ports may be exposed (e.g., on a terminal) and cannot be physically removed. It is also vital for defending against malicious devices whose firmware can be reprogrammed to forge the VID/PID, such as BadUSB, since this type of rule only leverages physical information on the host machine. usbfilter can partition all physical USB ports and limit the USB traffic on each port.

4.3.2 Benchmarks

We first measure the performance of the user-space tool, usbtables. We then measure the overhead imposed by usbfilter. The measurement host is loaded with the rules mentioned in the case studies above before benchmarking begins. When coupled with the default rules provided by usbfilter, there are 20 total rules loaded in the kernel. We chose 20 because we believe that a typical enterprise host’s USB devices (e.g., keyboard, mouse, removable storage, webcam, etc.) will total fewer than 20. We then load 100 rules in the kernel to understand the scalability of usbfilter.

Table 4-1. Prolog reasoning time (µs) averaged over 100 runs.

    Prolog Engine       Min     Avg     Med     Max     Dev
    Time (20 rules)     128.0   239.8   288.0   329.0   73.2
    Time (100 rules)    132.0   251.7   298.0   485.0   78.6

Table 4-2. Rule adding operation time (ms) averaged over 100 runs.

    Rule Adding         Min     Avg     Med     Max     Dev
    Time (20 rules)     5.1     5.9     6.1     6.6     0.3
    Time (100 rules)    4.9     5.9     6.1     6.8     0.4

4.3.2.1 Microbenchmark

usbtables Performance. We measure the time used by the Prolog engine to formally verify a rule before it is added into the kernel. We loaded the kernel with 20 and 100 rules and measured the time to process the rules. For each new rule, the Prolog engine needs to go through the existing rules and check for conflicts. We measured 100 trials of each test. The performance of the Prolog engine is shown in Table 4-1. The average time used by the Prolog engine is 239.8 µs with 20 rules and 251.7 µs with 100 rules. This speed is the result of using the GNU Prolog compiler (gplc) to compile Prolog into assembly for acceleration. We also measure the overhead for usbtables to add a new rule to the kernel space. This includes loading existing rules into the Prolog engine, checking for conflicts, saving the rule locally, passing the rule to the kernel, and waiting for the acknowledgment. As shown in Table 4-2, the average time of adding a rule using usbtables stays at around 6 ms in both cases, which is a negligible one-time cost.

Table 4-3. USB enumeration time (ms) averaged over 20 runs.

    USB Enumeration         Min     Avg     Med     Max     Dev     Cost
    Stock Kernel            32.0    33.9    34.1    34.8    0.6     N/A
    usbfilter (20 rules)    33.2    34.4    34.3    35.8    0.7     1.5%
    usbfilter (100 rules)   33.9    34.8    34.6    36.0    0.5     2.7%

Table 4-4. Packet filtering time (µs) averaged over 1500 packets.

    Packet Filtering    Min     Avg     Med     Max     Dev
    Time (20 rules)     2.0     2.6     3.0     5.0     0.5
    Time (100 rules)    2.0     9.7     10.0    15.0    1.0

USB Enumeration Overhead. For this test, we used the Logitech H390 USB headset, which has 4 interfaces. We manually plugged the headset into the host 20 times. We then compare the results between the usbfilter kernel with varying numbers of rules loaded and the stock Ubuntu kernel, where usbfilter is fully disabled, as shown in Table 4-3. The average USB enumeration time is 33.9 ms for the stock kernel, and 34.4 ms and 34.8 ms for the usbfilter kernel with 20 and 100 rules preloaded respectively. Compared to the stock kernel, usbfilter only introduces 1.5% and 2.7% overheads, or less than 1 ms even with 100 rules preloaded.
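The reported enumeration overheads follow directly from the average times in Table 4-3. The short Python check below recomputes them; the helper name `relative_overhead` is introduced here only for illustration.

```python
def relative_overhead(baseline: float, measured: float) -> float:
    """Overhead of `measured` relative to `baseline`, in percent."""
    return (measured - baseline) / baseline * 100.0


# Average enumeration times (ms) from Table 4-3.
stock_avg = 33.9
usbfilter_20_avg = 34.4
usbfilter_100_avg = 34.8

for label, avg in [("20 rules", usbfilter_20_avg),
                   ("100 rules", usbfilter_100_avg)]:
    extra_ms = avg - stock_avg
    percent = relative_overhead(stock_avg, avg)
    print(f"{label}: +{extra_ms:.1f} ms, {percent:.1f}% overhead")
```

The absolute differences are 0.5 ms and 0.9 ms, which is the basis for the statement that the added enumeration cost stays below 1 ms.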

Packet Filtering Overhead. The overhead of USB enumeration introduced by usbfilter is the result of packet filtering and processing performed on each USB packet, since there may be hundreds of packets during USB enumeration, depending on the number of interfaces and endpoints of the device. To capture this packet filtering overhead, we plug in a Logitech M105 USB Optical Mouse and move it around to generate enough USB packets. We then measure the time used by usbfilter to determine whether the packet should be filtered/dropped or not for 1500 packets, as shown in Table 4-4. The average cost per packet is 2.6 µs and 9.7 µs respectively, including the time to traverse all the 20/100 rules in the kernel, and the time used by the benchmark itself to get the timing and print the result. The 100-rule case shows that the overhead of usbfilter roughly quadruples (9.7 µs versus 2.6 µs) when the number of rules increases by a factor of five. As we mentioned before, most common USB usages could be covered within 20 rules. We assume it is rare for a system to have 100 rules for different USB devices. To search hundreds of rules efficiently, we can set up a hash table using, e.g., USB port numbers as keys to store rules, instead of the linear array (list) currently implemented.
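The hash-table idea can be sketched in Python as follows. This is one possible design under stated assumptions, not the implemented data structure: rules are keyed by the `portnum` condition, rules that do not pin a port go into a wildcard bucket that is always consulted, and the original rule order is preserved among the candidates. All names are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple

Rule = Dict[str, Any]
WILDCARD = None  # bucket key for rules that do not specify a port


def build_index(rules: List[Rule]) -> Dict[Any, List[Tuple[int, Rule]]]:
    """Group rules by their portnum condition, remembering each rule's
    position so the original ordering can be restored later."""
    index: Dict[Any, List[Tuple[int, Rule]]] = defaultdict(list)
    for position, rule in enumerate(rules):
        key = rule["conditions"].get("portnum", WILDCARD)
        index[key].append((position, rule))
    return index


def candidates(index: Dict[Any, List[Tuple[int, Rule]]],
               packet: Dict[str, Any]) -> List[Rule]:
    """Return only the rules that could match this packet: those pinned
    to the packet's port plus all wildcard rules, in original order."""
    port = packet.get("portnum")
    merged = list(index.get(WILDCARD, []))
    if port is not None:
        merged += index.get(port, [])
    merged.sort(key=lambda pair: pair[0])
    return [rule for _, rule in merged]


rules = [
    {"name": "mymouse", "action": "allow",
     "conditions": {"busnum": 1, "portnum": 2, "types": 1}},
    {"name": "noducky", "action": "drop", "conditions": {"types": 1}},
    {"name": "charger", "action": "drop",
     "conditions": {"busnum": 1, "portnum": 4}},
]
index = build_index(rules)
print([r["name"] for r in candidates(index, {"busnum": 1, "portnum": 4})])
```

The benefit depends on how many rules pin a port: rules in the wildcard bucket are still scanned for every packet, so a rule set made mostly of port-independent rules would see little improvement.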

Table 4-5. Latency (ms) of the fileserver workload with different mean file sizes.

    Configuration   1K      10K     100K    1M      10M     100M
    Stock           97.6    98.1    99.2    105.5   741.7   5177.7
    usbfilter       97.7    98.2    99.6    106.3   851.5   6088.4
    Overhead        0.1%    0.1%    0.4%    0.8%    14.8%   17.6%

Figure 4-4. Filebench throughput (MB/s) using the fileserver workload with different mean file sizes. [Plot: MB/second versus mean file size, stock versus usbfilter.]

Figure 4-5. Iperf bandwidth (MB/s) using TCP with different time intervals. [Plot: MB/second versus Nth interval, stock versus usbfilter, with averages.]

Figure 4-6. Iperf bandwidth (MB/s) using UDP with different time intervals. [Plot: MB/second versus Nth interval, stock versus usbfilter, with averages.]

Figure 4-7. Performance comparison of real-world workloads. [Plot: time/scores for KVM, Chrome, ClamAV, and wget, stock versus usbfilter.]

4.3.2.2 Macrobenchmark

We use filebench [100] and iperf [101] to measure throughputs and latencies of file operations, and bandwidths of network activities, under the stock kernel and the usbfilter kernel, using different USB devices. The usbfilter kernel is loaded with the 20 rules introduced in the case studies before benchmarking.

Filebench. We choose the fileserver workload in filebench, with the following settings: the number of files in operation is 20; the number of working threads is 1; the run time for each test case is 2 minutes; the mean file size in operation ranges from 1 KB to 100 MB; all other settings are the defaults provided by filebench. These settings emulate a typical usage of USB storage devices, where users plug in flash drives to copy or edit some files. All file operations happen on a SanDisk Cruzer Fit 16 GB flash drive. The throughputs under the stock kernel and the usbfilter kernel are shown in Figure 4-4. When the mean file size is less than 1 MB, the throughput of usbfilter is close to that of the stock kernel. Since there is at most 20 × 1 MB of data involved in block I/O operations, both the stock kernel and usbfilter can handle this data size smoothly. When the mean file size is greater than 1 MB, usbfilter shows lower throughput compared to the stock kernel, as the result of rule matching for each USB packet. Compared to the stock kernel, usbfilter imposes 14.7% and 18.4% overheads when the mean file sizes are 10 MB and 100 MB respectively. That is, when there is 20 × 100 MB (2 GB) involved in block I/O operations, the throughput decreases from 8.7 MB/s to 7.1 MB/s when usbfilter is enabled.

The corresponding latencies are shown in Table 4-5. The latency of usbfilter is higher than that of the stock kernel. Following the throughput model, the latencies of the two kernels are close when the mean file size is less than 1 MB, where the overhead introduced by usbfilter is less than 1.0%. When the mean file sizes are 10 MB and 100 MB, usbfilter imposes 14.8% and 17.6% overheads in latency compared to the stock kernel. That is, to deal with 20 × 100 MB of data, users need about one more second to finish all the operations with usbfilter enabled, which is acceptable for most users.

iperf. We use iperf to measure bandwidths of upstream TCP and UDP communications, where the host machine acts as a server, providing local network access via a Ralink RT5372 300 Mbps USB wireless adapter. The time interval for each transmission is 10 seconds, and each test runs 5 minutes (30 intervals). For TCP, we use the default TCP window size of 64 KB; for UDP, we use the default available UDP bandwidth size of 10 MB. The TCP bandwidths of the two kernels are shown in Figure 4-5, where we aggregate each two intervals into one, reducing the number of sampling points from 30 to 15; the average bandwidths are shown as dotted lines. Though the two kernels have different transmission patterns, their average bandwidths are close, with the stock kernel at 2.75 Mbps and usbfilter at 2.52 Mbps. Compared to the stock kernel, usbfilter introduces 8.4% overhead. The UDP benchmarking result closely resembles TCP, as shown in Figure 4-6. Regardless of transmission patterns, the average bandwidths of the two kernels are similar, with the stock kernel at 3.48 Mbps and usbfilter at 3.27 Mbps. Compared to the TCP transmission, UDP transmission is faster due to the simpler design/implementation of UDP, and usbfilter introduces 6.0% overhead. In both cases, usbfilter has a low impact on the original networking component.

4.3.3 Real-world Workloads

To better understand the performance impact of usbfilter, we generate a series of real-world workloads to measure typical USB use cases. In the KVM [102] workload, we create and install a KVM virtual machine automatically from the Ubuntu 14.04 ISO image file (581 MB) saved on USB storage. In the Chrome workload, we access the web browser benchmark site [103] via a USB wireless adapter. In the ClamAV [104] workload, we scan the unzipped Ubuntu 14.04 ISO image saved on the USB storage for viruses using ClamAV. In the wget workload, we download the Linux kernel 4.4 (83 MB) via the USB wireless adapter using wget. The USB storage is the SanDisk 16 GB flash drive, and the USB wireless adapter is the Ralink 300 Mbps wireless card. All time measurements are in seconds except for the Chrome workload, where scores are given and are divided by 10 to fit into the figure. Figure 4-7 shows the comparison between the two kernels when running these workloads. In all workloads, usbfilter either performs slightly better than the stock kernel or imposes a small overhead compared to the stock kernel in our test. usbfilter thus closely approximates the original system performance.

4.3.4 Summary

In this section, we showed how usbfilter can help administrators prevent access to unauthorized (and unknown) device interfaces, restrict access to authorized devices using application pinning, and prevent data exfiltration. Our system introduces between 3 and 10 µs of latency on USB packets while checking rules, introducing minimal overhead on the USB stack.

4.4 Discussion

4.4.1 Process Table

We have successfully traced each USB packet to its originating application for USB storage devices by passing the PID information along the software stack from the VFS layer, through the block layer, to the USB layer within the kernel. However, it is not always possible to find the PID for each USB packet received by the USB host controller. One example is HID devices, such as keyboards and mice. Keystrokes and mouse movements are handled in the interrupt (IRQ) context, where the currently interrupted process has nothing to do with the USB packet. All these packets are delivered to the Xorg server in user space, which then dispatches the inputs to different applications registered for different events. usbfilter is able to make sure that only Xorg can receive inputs from the keyboard and mouse. To guarantee that a USB packet is delivered to the desired application, we can enhance the Xorg server to understand usbfilter rules. The other example comes from USB networking devices. Though we have enhanced the general USB wireline driver usbnet to pass the PID information into each USB packet, unlike USB storage devices sharing the same usb-storage driver, many USB Ethernet dongles have their own drivers instead of using the general one. Even worse, there is no general USB wireless driver at all. Depending on the device type and model, one may need to instrument the corresponding driver to obtain the PID information, as we did for the rt2800usb driver. Future work will introduce a new USB networking driver framework to be shared by specific drivers, providing a unified interface for passing PID information into USB packets.

Another issue with using the process table in usbfilter rules is TOCTTOU (Time Of Check To Time Of Use) attacks. A malicious process can submit a USB packet to the kernel and exit. When the packet is finally handled by the host controller, usbfilter is no longer able to find the corresponding process given the PID. Fortunately, these attacks do not impact rules without process tables. When process information is crucial to the system, we recommend using usbtables to change the default behavior to “drop”, making sure that no packet gets through without an explicit matching rule.

4.4.2 System Caching

usbfilter is able to completely shut down any write operations to external USB storage devices, preventing any form of data exfiltration from the host machine. Similarly, one can also write a “block scsi read” LUM to stop read operations from storage devices. Nevertheless, this LUM may not be desired or work as expected in reality. To correctly mount the filesystem in the storage device, the kernel has to read the metadata saved in the storage. One solution would be to delay the read blocking until the filesystem is mounted. However, for performance considerations, the Linux kernel also reads ahead some data in the storage and brings it into the system cache (page cache). All following I/O operations will happen in memory rather than in the storage. While memory protection is out of scope for this paper, we rely on the integrity of the kernel to enforce the MAC model it applies. Write operations, even if performed in memory, will be flushed into the storage, where usbfilter is able to provide a strong and useful guarantee.

4.4.3 Packet Analysis From USB Devices

Because of the master-slave nature of the USB protocol, we do not set up usbfilter in the response path, which is from the device to the host, due to performance considerations. However, enabling usbfilter in the response path provides new opportunities to defend against malicious devices and users, since the response packet could be inspected with the help of usbfilter. For example, one can write a LUM to limit the capability of a HID device, such as allowing only three different key actions from a headset’s volume control button, which is implemented by GoodUSB as a customized keyboard driver, or disabling sudo commands for unknown keyboards. Another useful case is to filter spoofed DNS reply messages embedded in USB packets sent by malicious smartphones or network adapters, to defend against DNS cache poisoning. We are planning to investigate these new case studies in future work.

4.4.4 Malicious USB Drivers and USB Covert Channels

While BadUSB is the most prominent attack that exploits the USB protocol, we observe that using USB communication as a side channel to steal data from host machines, or to inject malicious code into hosts, is another technically mature and plausible threat. On the Linux platform, with the development of libusb [105], more USB drivers run within user space and can be delivered as binaries. On the Windows platform, PE has been a common format for device drivers. To use these devices, users have to run these binary files without knowing whether the drivers are doing something else in the meantime.⁴ For instance, USB storage devices should use bulk packets to transfer data per the USB specification. However, a malicious storage driver may use control packets to stealthily exfiltrate data as long as the malicious storage device is able to decode the packets. This works because control transfers are mainly used during the USB enumeration process. With the help of usbfilter, one can examine each USB packet and filter unrecognized ones without breaking the normal functionality of the device.

⁴ Note that there are ways to instrument DLL files on the Windows platform, though this does not appear to be commonly done with drivers.

4.4.5 Usability Issues

To write usbfilter rules, one needs some knowledge about the USB protocol in general, as well as the target USB device. The lsusb command under Linux provides a lot of useful information that can be mapped directly into rule construction. Another tool, usb-devices, also helps users understand USB devices. Windows has a GUI program, USBView, to visualize the hierarchy and configuration of USB devices plugged into the host machine. While users can write some simple rules, we expect that developers will provide useful LUMs, which may require a deep understanding of the USB protocol and domain-specific knowledge (e.g., SCSI), and will share these LUMs with the community. We will also provide more useful LUMs in the future.
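As an illustration of mapping device information into rule construction, the Python sketch below assembles a usbtables argument list from per-table conditions, using only the flags listed in Figure 4-3. The helper `build_usbtables_args` is hypothetical and not part of usbtables; it assumes the caller has already collected values such as the product and manufacturer strings from a tool like lsusb.

```python
import shlex
from typing import Any, Dict, List

# Flags for the four condition tables, as listed in the usbtables help output.
TABLE_FLAGS = {"proc": "-o", "dev": "-v", "pkt": "-k", "lum": "-l"}
ACTIONS = ("allow", "drop")


def build_usbtables_args(name: str, action: str,
                         **tables: Dict[str, Any]) -> List[str]:
    """Build the argument list for adding one rule.

    `tables` maps a table name (proc, dev, pkt, lum) to a dict of
    condition=value pairs for that table.
    """
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    unknown = set(tables) - set(TABLE_FLAGS)
    if unknown:
        raise ValueError(f"unknown tables: {sorted(unknown)}")

    args = ["usbtables", "-a", name]
    for table in ("proc", "dev", "pkt", "lum"):
        conditions = tables.get(table)
        if conditions:
            joined = ",".join(f"{key}={value}"
                              for key, value in conditions.items())
            args += [TABLE_FLAGS[table], joined]
    args += ["-t", action]
    return args


args = build_usbtables_args(
    "logitech-headset", "drop",
    dev={"ifnum": 2, "product": "Logitech USB Headset",
         "manufacturer": "Logitech"},
    pkt={"direction": 1},
)
# shlex.join quotes arguments containing spaces for safe display in a shell.
print(shlex.join(args))
```

Passing the list form to a process launcher avoids shell quoting issues for values that contain spaces, such as product strings.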

CHAPTER 5 LINUX (E)BPF MODULES

Current defenses against malicious peripherals are not comprehensive and are limited in scope. USBFILTER [87] applies user-defined rules to USB packet filtering within the Linux kernel, but fails to prevent exploitation from malformed packets. USBFirewall [57], on the other hand, provides bit-level protection by parsing individual incoming USB packets, but offers limited support for user-defined filtering rules. Apple recently added USB restricted mode in iOS 11.4, shutting down USB data connections after the device stays locked for an hour [106], but this restriction can be bypassed [107]. Not only do these defenses lack comprehensive coverage, but they often focus primarily or solely on USB, providing no protection against peripherals using other protocols, such as Bluetooth.

In this chapter, we propose Linux (e)BPF Modules (LBM), a general security framework that provides a unified API for enforcing protection against malicious peripherals within the Linux kernel. LBM requires only a single hook for incoming and outgoing peripheral data to be placed in each peripheral subsystem, and modules for filtering specific peripheral packet types (e.g., USB request blocks or Bluetooth socket buffers) can then be developed. Importantly for performance and extensibility, we leverage the Extended BSD Packet Filter (eBPF) mechanism [72], which supports loading of filter programs from user space. Unlike previous solutions, LBM is designed to be a general framework suitable for any peripheral protocol. As a result, existing solutions such as USBFILTER and USBFirewall can be easily instantiated using LBM. Moreover, new peripherals can be easily supported by adding extensions into the LBM core framework. To demonstrate the generality and flexibility of LBM, we have fully instantiated USBFILTER and USBFirewall using the LBM framework, developed hooks for the Bluetooth Host Controller Interface (HCI) and Logical Link Control and Adaptation Protocol (L2CAP) layers, and demonstrated a hook mechanism for the Near-Field Communication (NFC) protocol. Our evaluation shows that the general overhead introduced by LBM is within 1 µs per packet across different peripherals in most cases; the application and system benchmarks demonstrate a negligible overhead from LBM; and LBM performs better than other state-of-the-art solutions.

Excerpts from this chapter previously appeared in “LBM: A Security Framework for Peripherals within the Linux Kernel”, originally published in the proceedings of the 2019 IEEE Symposium on Security and Privacy (SP) [6].

5.1 Design

We first describe the security model we consider, outline the goals we set for our solution, and finally show how we achieve these goals through different aspects of the design.

5.1.1 Security Model

We consider attacks from peripherals that require physical access to the host machine (e.g., plugging into a USB port) or that use wireless channels to connect to the host (e.g., over Bluetooth). These malicious peripherals usually try to achieve privilege escalation by claiming unexpected functionalities (e.g., BadUSB [3]) or by exploiting the kernel protocol stack via specially crafted packets (e.g., BlueBorne [7]). Note that we do not consider DMA-based attacks [108], where an IOMMU [109] is needed to stop arbitrary memory writes from the peripheral. Our Trusted Computing Base (TCB) is made up of the Linux kernel and the software stack below it. We assume that trusted boot or measured boot, such as Intel TXT [110], is deployed to protect system integrity. We also assume that Mandatory Access Control (MAC), such as SELinux [111], is enforced across the whole system.

5.1.2 Goals: Beyond A Reference Monitor

The first three goals we set (G1 through G3) are drawn from the classic reference monitor concept [112], and are needed to build a secure kernel. The remaining goals (G4 through G7) draw inspiration from existing security frameworks, such as Linux Security Modules (LSM) [113], and consider practical issues surrounding usage and deployment.

G1 Complete Mediation – For each kind of supported peripheral, we need to guarantee that all inputs from the device and all outputs from the host are mediated.

G2 Tamper-proofness – Assuming the system TCB is not compromised, we need to defend against any attacks originating from outside the TCB.

G3 Verifiability – While a whole-system formal verification may be infeasible, we should mandate formal guarantees for security-sensitive components.

G4 Generality – The solution should provide a general framework that seamlessly incorporates the features of existing security solutions.

G5 Flexibility/Extensibility – The addition of support for new kinds of peripherals should be a straightforward and non-intrusive process.

G6 Usability – The solution should be easy to use.

G7 High Performance – The solution should introduce minimal overhead.

Bearing these goals in mind, we design Linux (e)BPF Modules (LBM), as shown in Figure 5-1. Within the kernel space, LBM interposes on different peripheral subsystems (such as USB, Bluetooth, and NFC) at the bottom level, covering both TX and RX paths. Before a packet can be sent out or reach the corresponding protocol stack for parsing, LBM applies the loaded filtering rules (eBPF programs) and LBM kernel modules to the packet. In the user space, we introduce a new filter language for peripherals. Filters written in this language are compiled into eBPF programs and loaded into the kernel by lbmtool. In short, LBM provides a general peripheral firewall framework, running eBPF instructions as the packet filtering mechanism. We instantiate LBM on USB, Bluetooth, and NFC to cover the most common peripherals.

5.1.3 LBM Kernel Infrastructure

We design LBM as a standalone kernel component/subsystem statically linked into the kernel image. We rely on TPM and IMA [93] to guarantee the boot time integrity of the kernel and load time integrity of user-space dependencies. We further use MAC such as SELinux [111] to make sure LBM cannot be disabled without root permission. Since LBM cannot be unloaded/reloaded as a kernel module, disabled, or bypassed from the user space, we achieve G2 – tamper-proofness.

Figure 5-1. LBM Architecture. [Diagram: in user space, a rule such as "if usb.devnum == 7: drop" is compiled by lbmtool (LLVM/Clang) and loaded through the bpf syscall and the lbm sysfs interface; in kernel space, the LBM framework (BPF/eBPF plus modules lbm1 through lbm3) places LBM TX and LBM RX hooks in the USB, Bluetooth, and NFC subsystems to filter USB, BT, and NFC packets.]

For each kind of peripheral that LBM supports, we need to place “hooks” on both the TX and RX paths to mediate each packet being sent to and received from the peripheral. While different peripheral subsystems may structure their software stacks differently within the kernel, we follow two general rules for the placement of LBM hooks. First, these hooks should be placed as close as possible to the real hardware controlling the corresponding peripherals. This helps reduce the potential impact from vulnerabilities within the upper layer of the software stack (e.g., by packets bypassing the hooks). Second, these hooks should be general enough without relying on the implementation of certain hardware. As a result, we place LBM hooks beneath the core implementation of a peripheral’s protocol stack, and above a specific peripheral controller driver. Take USB as an example. As shown in Figure 5-2, LBM hooks are deployed just above the host controller device and its driver, which communicates with USB peripherals directly. At the same time, the hooks are deployed below the USB core and other USB device drivers, preventing third-party USB drivers from bypassing these hooks. Through this careful placement of LBM hooks, we achieve G1 – complete mediation.

Figure 5-2. LBM hooks inside the USB subsystem. [Diagram: USB device drivers (storage, input, video) sit above the USB core; the LBM TX and LBM RX hooks sit below the USB core and above the host controller device driver, the host controller device, and the USB peripherals.]

Since LBM allows the loading of eBPF programs into the kernel space and the execution of these programs for peripheral packet filtering, special care is needed to make sure these programs do not introduce new vulnerabilities into the kernel or bypass security mechanisms enforced by the kernel. We leverage the eBPF verifier [114] to examine each eBPF program before it can be loaded. Unlike normal eBPF programs (mainly used by the networking subsystem) loaded by the bpf syscall, we forbid both bounded loops [115] and packet rewriting (e.g., changing the port number of a TCP packet) in LBM. Once a program passes verification, we can be sure that the program halts after a limited number of state transitions, that each program state is valid (e.g., no stack overflow occurs), and that no instruction changes the kernel memory (besides the program’s own stack). We achieve G3 – verifiability for programs executed by LBM.

LBM draws inspiration from state-of-the-art solutions including USBFILTER [87] and USBFirewall [57], and improves on them, as shown in Table 5-1. Similarly to USBFILTER, LBM supports kernel module plugins. As depicted in Figure 5-1, different LBM kernel modules (e.g., lbm1-lbm3) can be plugged into the LBM framework and essentially hook into the TX and/or RX paths for different peripherals. As we will later show in Section 5.3.1, it takes fewer than 20 lines of changes to convert a LUM (Linux USBFILTER Module) into an LBM module. To protect protocol stacks from malformed packets, we derive packet field constraints from specifications. Rather than translating these constraints into C and compiling them into the kernel image like USBFirewall does, we transform them into eBPF programs and load them on the RX paths for malformed packet filtering. In short, we achieve G4 – generality, by incorporating all the features provided by existing solutions. Additionally, we extend support beyond USB to other peripherals, such as Bluetooth and NFC.

Table 5-1. LBM compared to USBFILTER and USBFirewall. LBM unifies USBFILTER and USBFirewall, providing a superset of their properties via extensible protocol support. (X = supported, - = not supported)

    Feature              USBFILTER   USBFirewall   LBM
    Plugin Modules       X           -             X
    Stack Protection     -           X             X
    User-defined Rules   X           -             X
    TX Path Mediation    X           -             X
    RX Path Mediation    -           X             X
    Multiple Protocols   -           -             X

To ease support for a new kind of peripheral, we design a unified API used by different subsystems to hook into LBM:

    int lbm_filter_pkt(int subsys, int dir, void *pkt)

subsys determines the index of a certain peripheral subsystem (e.g., 0 for USB and 1 for Bluetooth); dir specifies the direction of the I/O path: TX or RX; and pkt points to the core kernel data structure used to encapsulate the I/O packet, which depends on the subsystem (e.g., urb for USB and skb for Bluetooth). Once this LBM hook is placed into a peripheral subsystem, developers can write an LBM module to filter packets using typical C programming, by implementing the TX and/or RX callbacks:

    int (*lbm_ingress_hook)(void *pkt)
    int (*lbm_egress_hook)(void *pkt)

Table 5-2. LBM vs. USBFILTER vs. USBFirewall, specifically with respect to the filter design of each.

    Feature            USBFILTER       USBFirewall   LBM
    Filter Mechanism   C               C             eBPF
    User-space DSL     CNF             N/A           PCAP
    DSL Acceleration   Short Circuit   N/A           JIT

A more useful extension is to expose some packet fields to the user space, and implement BPF helpers as backends to provide data access to these fields if needed (as we have done for USB and Bluetooth). As a result, lbmtool can generate a new dialect for the new peripheral based on a PCAP-like packet filtering language. Users can then write filtering rules as they would for tcpdump instead of directly crafting eBPF instructions. Through the design of the LBM framework and the introduction of a domain-specific language (DSL), we achieve G5 – flexibility/extensibility.

Besides the verifiability of eBPF programs, we choose eBPF as the filtering mechanism in LBM to strike a balance between performance and programmability. As shown in Table 5-2, both USBFILTER and USBFirewall rely on hardcoded C compiled into the kernel to implement the filter mechanism. Although USBFirewall leverages the Haskell description of the specification to generate the C code, it lacks support for a user-space DSL. USBFILTER only supports a limited DSL following the conjunctive normal form (CNF). As we will elaborate in the following section, the LBM DSL is more expressive and powerful. Instead of implementing a filtering mechanism directly, LBM builds an eBPF running environment for peripherals and executes eBPF programs as filters. Thanks to JIT compilation of eBPF code, LBM is able to run filters as fast as native instructions; thus, we achieve G7 – high performance.

Figure 5-3. The flow of lbmtool in compiling LBM rules to eBPF programs and loading them into the running kernel. [Diagram stages: Expr, Parse, CST, Tree Shaping, Semantic Analysis, AST, IRGen, IR, CodeGen, eBPF Program, Loader (write sysfs, call sys_bpf).]

5.1.4 LBM User Space

To interact with an LBM-enabled kernel, we design lbmtool, a frontend utility whose primary purpose is to compile, load, and manage LBM programs resident in the kernel. To create a unified, simple, and expressive way of describing peripheral filtering rules, we develop a custom Domain Specific Language (DSL) modeled on Wireshark and tcpdump filter expressions. These LBM rules are processed by lbmtool using a custom compiler that outputs eBPF filter programs, as shown in Figure 5-3. Compiled filters are loaded into the LBM framework via an extension to the sys_bpf syscall. Programs are then loaded into a specific subsystem: USB, Bluetooth, or NFC.

The filter syntax we develop is concisely described by the grammar shown in Appendix B. Filter rules are effectively stateless expressions that abstract away from the eBPF language syntax. For example, if we want to match on a specific USB device’s vendor and product ID, such as a Dell optical mouse, we would write:

    usb.idVendor == 0x413c && usb.idProduct == 0x3010

If we want to include more than one Dell product, we could write multiple rules, or we could consolidate them into a larger expression. To match on a Dell mouse, keyboard, printer, and Bluetooth adapter, we would write:

    usb.idVendor == 0x413c && (
        usb.idProduct == 0x3010 ||  // Mouse
        usb.idProduct == 0x2003 ||  // Keyboard
        usb.idProduct == 0x5300 ||  // Printer
        usb.idProduct == 0x8501     // Bluetooth adapter
    )

The lbmtool compiler supports multi-line nested sub-expressions while following the C89 standard operator precedence rules [116].
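To illustrate what C-style precedence means for such rules, the following Python sketch parses a small subset of the rule language (field names, integer literals, ==, !=, &&, ||, parentheses, and // comments) and evaluates it against a dictionary of field values. It is a hand-written recursive-descent parser for illustration only, not lbmtool's compiler; the names tokenize and evaluate and the dictionary representation of a device are assumptions.

```python
import re

# Tokens of a small, hypothetical subset of the LBM rule language.
TOKEN = re.compile(r"\s*(0x[0-9a-fA-F]+|\d+|[A-Za-z_][\w.]*|==|!=|&&|\|\||[()])")


def tokenize(text):
    """Split a rule into tokens, dropping // line comments."""
    text = re.sub(r"//[^\n]*", "", text).rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected input at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def evaluate(rule, fields):
    """Evaluate a rule against a dict of field values; True means 'match'."""
    tokens = tokenize(rule)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        pos += 1
        return tok

    def operand():
        tok = take()
        if tok == "(":
            value = or_expr()
            take(")")
            return value
        if tok[0].isdigit():
            return int(tok, 0)       # decimal or 0x-prefixed literal
        return fields[tok]           # field lookup, e.g. usb.idVendor

    def equality():                  # == and != bind tightest here
        left = operand()
        while peek() in ("==", "!="):
            op, right = take(), operand()
            left = (left == right) if op == "==" else (left != right)
        return left

    def and_expr():                  # && binds tighter than ||, as in C
        left = equality()
        while peek() == "&&":
            take()
            right = equality()
            left = bool(left) and bool(right)
        return left

    def or_expr():                   # || has the lowest precedence
        left = and_expr()
        while peek() == "||":
            take()
            right = and_expr()
            left = bool(left) or bool(right)
        return left

    result = or_expr()
    if peek() is not None:
        raise SyntaxError(f"trailing token {peek()!r}")
    return bool(result)


mouse = {"usb.idVendor": 0x413C, "usb.idProduct": 0x3010}
rule = "usb.idVendor == 0x413c && (usb.idProduct == 0x3010 || usb.idProduct == 0x2003)"
print(evaluate(rule, mouse))  # True
# Without parentheses, a || b && c groups as a || (b && c).
print(evaluate("usb.idProduct == 0x3010 || usb.idProduct == 0x2003 && usb.idVendor == 0", mouse))  # True
```

The second call shows why the parentheses in the multi-product rule matter: without them, the vendor check would be combined only with the first product comparison.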

lbmtool is able to load a compiled LBM program into a target subsystem TX (OUTPUT) or RX (INPUT) path and specify a match action (i.e., ACCEPT or DROP).

The following usage has lbmtool compile and load a filter rule:

    lbmtool --expression "usb.idProduct == 0x3010" -o mouse.lbm
    lbmtool --load mouse.lbm -t usb -A INPUT -j ACCEPT

By providing descriptive error-checking in lbmtool and developing a custom DSL that is easy to write in and reason about, we achieve G6 – usability.

5.2 Implementation

5.2.1 LBM Kernel Space

We divide the implementation of the LBM kernel space into three parts: core, USB implementation, and Bluetooth implementation. All LBM-specific code is located under the security/lbm directory of the Linux kernel source tree, as a new security component for the Linux kernel.

LBM Core: To load an eBPF program into LBM, we extend the existing bpf syscall, sys_bpf, as shown in Figure 5-4. We define a new program type BPF_PROG_LOAD_LBM to distinguish LBM calls from other typical BPF usage. Unlike typical eBPF programs, which normally only persist for the lifetime of the loading process, LBM filters must persist after lbmtool exits. To extend the lifetime of these programs, we pin them using the BPF filesystem [117], essentially using the filesystem to increase the reference count of the object. Before a program is saved by the LBM core, the eBPF verifier checks every instruction of the program for any security violations. Depending on the subsystem (USB or Bluetooth) of the program, LBM provides different verifier callbacks, such as LBM USB or LBM Bluetooth (as we will detail later), thus making sure every memory access of the program is meaningful, aligned, and safe.

Figure 5-4. LBM core component. [Diagram: the bpf syscall (through the BPF verifier), the create_module syscall, and the lbm sysfs interface feed the LBM core, which contains the LBM filter engine running BPF/eBPF together with the LBM FDB and LBM MDB for the TX and RX paths.]

Inside LBM, all eBPF programs are organized based on the relevant subsystem and the direction of the filtering path (i.e., TX or RX). We allow the same program to apply to both the TX and RX paths when it is loaded using the BPF syscall, and duplicate the program on the TX and RX queues, respectively. The separation of TX and RX paths is mainly for performance, since it allows us to bypass programs that do not interpose on a certain path during filtering. Additionally, to avoid expensive locking, each program is protected by the read-copy-update (RCU) [118] mechanism to enable concurrent reads by different LBM components. LBM modules are also organized according to subsystem and filter path, and protected by RCU. The pseudo-code of lbm_filter_pkt, previously mentioned in Section 5.1.3, is presented in Figure 5-5.

To ease the management of LBM filters and modules, we expose ten entries under /sys/kernel/security/lbm/, including a global switch to enable/disable LBM; per-subsystem switches to enable/disable debugging, profiling, and statistics; and per-subsystem-per-path controls to view/remove loaded filters and modules. The whole implementation of LBM core is around 1.6K lines of code.

    int lbm_filter_pkt(int subsys, int dir, void *pkt)
    {
        check_subsystem(subsys);
        check_path(dir);
        check_pkt(pkt);
        res = ALLOW;
        if (dir == TX) {
            /* eBPF filters run first; the first DROP ends processing */
            for each ebpf in db[subsys][dir] {
                if (ebpf(subsys, dir, pkt) == DROP) {
                    res = DROP;
                    goto RET;
                }
            }
            /* then the loaded LBM kernel modules */
            for each kmod in db[subsys][dir] {
                if (kmod(subsys, dir, pkt) == DROP) {
                    res = DROP;
                    goto RET;
                }
            }
        } else { /* Ditto for the RX */ }
    RET:
        return res;
    }

Figure 5-5. Pseudo-code of lbm_filter_pkt.
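The dispatch logic of Figure 5-5 can also be modeled in user space. The Python sketch below is not kernel code: the FilterEngine class, its method names, and the numeric constants are hypothetical stand-ins, filters are plain callables taking only the packet, and RCU protection is not modeled. It only illustrates the organization by subsystem and direction and the first-DROP-wins evaluation, with filters consulted before modules.

```python
from collections import defaultdict

ALLOW, DROP = 0, 1
TX, RX = 0, 1
USB, BLUETOOTH = 0, 1   # subsystem indices, following the example in Section 5.1.3


class FilterEngine:
    """User-space model of per-subsystem, per-direction filter lists."""

    def __init__(self):
        # (subsys, direction) -> list of callables returning ALLOW or DROP
        self.filters = defaultdict(list)   # stands in for loaded eBPF programs
        self.modules = defaultdict(list)   # stands in for LBM kernel modules

    def load_filter(self, subsys, directions, prog):
        # A program attached to both paths is duplicated on each queue.
        for direction in directions:
            self.filters[(subsys, direction)].append(prog)

    def load_module(self, subsys, directions, hook):
        for direction in directions:
            self.modules[(subsys, direction)].append(hook)

    def filter_pkt(self, subsys, direction, pkt):
        if direction not in (TX, RX) or pkt is None:
            raise ValueError("invalid direction or packet")
        key = (subsys, direction)
        # Only programs registered for this path are consulted.
        for prog in self.filters[key] + self.modules[key]:
            if prog(pkt) == DROP:
                return DROP
        return ALLOW


engine = FilterEngine()
engine.load_filter(USB, [RX], lambda pkt: DROP if pkt["devnum"] == 7 else ALLOW)
print(engine.filter_pkt(USB, RX, {"devnum": 7}))   # 1 (DROP)
print(engine.filter_pkt(USB, TX, {"devnum": 7}))   # 0 (ALLOW): no TX filter is loaded
```

Keeping separate lists per path means a TX packet never pays for filters that were loaded only for RX, which is the performance argument made above.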

Table 5-3. LBM statistics per subsystem, including # of fields exposed to the user space, # of BPF helpers implemented, and # of lines of code changes.

    Subsystem         # of Fields   # of BPF-helpers   # of Lines
    USB               34            31                 621
    Bluetooth-HCI     30            29                 683
    Bluetooth-L2CAP   28            27                 744
    TOTAL             92            87                 2048

LBM USB: As shown in Figure 5-2, LBM hooks into the Host Controller Device (HCD) core implementation to cover both TX and RX paths. These hooks eventually call lbm_filter_pkt before the packet reaches the USB core, as demonstrated below:

    lbm_filter_pkt(LBM_SUBSYS_INDEX_USB, LBM_DIR_TX, (void *)urb);
    lbm_filter_pkt(LBM_SUBSYS_INDEX_USB, LBM_DIR_RX, (void *)urb);

Every USB packet (urb) then needs to go through the LBM core for filtering before being sent to or received from USB peripherals.

To support writing rules in lbmtool, we expose packet metadata maintained by the kernel and packet fields defined by the USB specification to the user space. To achieve this, a naive approach would be to mirror the urb structure to the user space, while providing every field explicitly in the filter DSL. Unfortunately, exposing raw kernel structures to the user space is a security risk, as doing so will leak sensitive kernel pointer values, which can be used to break KASLR [119]. Explicitly supporting every field is infeasible as well, given the complexity of the protocol suites. As a trade-off, we expose the most commonly recognized and used fields, while providing special BPF helpers for accessing the rest of the fields. These helpers allow LBM filters to support array accesses to urb structures, thus enabling them to access every field within a USB packet. As shown in Table 5-3, we expose 34 fields and implement 31 BPF helpers for the USB subsystem.

Besides the special BPF helpers mentioned above for accessing packet fields, additional helpers are implemented for returning the length of a buffer or string, or for providing access to the indirect members of the urb structure. For fields that are direct members, no helper is needed since we can access them using an offset from within the urb. We group these fields together in a struct and expose it to the user space, as listed below:

    struct lbm_usb {
        u32 pipe;
        u32 stream_id;
        u32 status;
        u32 transfer_flags;
        u32 transfer_buffer_length;
        u32 actual_length;
        u32 setup_packet;
        u32 start_frame;
        u32 number_of_packets;
        u32 interval;
        u32 error_count;
    };
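Because every member of this struct is a 32-bit value, the offset of each field follows directly from its position in the list. The Python sketch below mirrors the struct with ctypes to compute those offsets and adds a toy access check of the kind a verifier callback could apply to such a context struct. The check_access function and its specific rules (aligned 4-byte reads within the struct, no writes) are illustrative assumptions, not the kernel's implementation.

```python
import ctypes


class LbmUsb(ctypes.Structure):
    """Mirror of struct lbm_usb: eleven consecutive 32-bit fields."""
    _fields_ = [(name, ctypes.c_uint32) for name in (
        "pipe", "stream_id", "status", "transfer_flags",
        "transfer_buffer_length", "actual_length", "setup_packet",
        "start_frame", "number_of_packets", "interval", "error_count")]


def field_offset(name):
    """Byte offset of a field inside the mirrored struct."""
    return getattr(LbmUsb, name).offset


def check_access(offset, size, is_write):
    """Hypothetical check: only aligned 4-byte reads inside the struct pass."""
    if is_write:
        return False
    if size != 4 or offset % 4 != 0:
        return False
    return 0 <= offset and offset + size <= ctypes.sizeof(LbmUsb)


print(field_offset("actual_length"))                         # 20
print(ctypes.sizeof(LbmUsb))                                 # 44
print(check_access(field_offset("error_count"), 4, False))   # True
print(check_access(44, 4, False))                            # False: past the end of the struct
```

A rule compiler that knows only this layout can emit loads at fixed offsets without ever seeing the layout of the kernel's urb.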

Instead of exposing urb itself to the user space and using the corresponding offsets, lbmtool only needs to know the lbm_usb struct and use offsets against it to directly access these fields. LBM handles the translation of struct member access within lbm_usb into one within the kernel urb. To help the BPF verifier understand the security constraints of LBM and the scope of the USB subsystem, we implement three callbacks within the bpf_verifier_ops struct used by the verifier. We first explicitly enumerate all legal BPF helpers for the verifier, including the 31 LBM USB BPF helpers mentioned above as well as other common BPF map helpers. We exclude any existing BPF helpers designed for the networking subsystem. Therefore, the verifier rejects any LBM USB filter that uses BPF helpers other than the ones specified. We then validate that every member access of lbm_usb falls within the range of the struct, and forbid any memory write operations. Finally, we rewrite the instructions accessing lbm_usb and map them into corresponding urb accesses.

Figure 5-6. LBM hooks inside the Bluetooth subsystem. [Diagram: within the Bluetooth core, an LBM RX hook sits at L2CAP and an LBM TX hook at ACL (alongside SCO); a second pair of LBM TX and LBM RX hooks sits above the Host Controller Interface, the Bluetooth module, and the Bluetooth peripherals.]

LBM Bluetooth: The implementation for Bluetooth follows the same procedure as for USB. We place hooks into the Host Controller Interface (HCI) layer of the Bluetooth subsystem, as HCI talks to the Bluetooth hardware directly. While HCI provides the lowest level of packet abstraction for the upper layers, it is not easy for normal users to interact with this layer since it lacks support for high-level protocol elements, such as connections and device addresses, which are better known to Bluetooth users. To bridge this semantic gap, we add another set of hooks into the Logical Link Control and Adaptation Protocol (L2CAP) layer right above HCI, as shown in Figure 5-6. These hooks are effectively calls to lbm_filter_pkt, as demonstrated below:

    lbm_filter_pkt(LBM_SUBSYS_INDEX_BLUETOOTH, LBM_DIR_TX, (void *)skb);
    lbm_filter_pkt(LBM_SUBSYS_INDEX_BLUETOOTH, LBM_DIR_RX, (void *)skb);
    lbm_filter_pkt(LBM_SUBSYS_INDEX_BLUETOOTH_L2CAP, LBM_DIR_TX, (void *)skb);
    lbm_filter_pkt(LBM_SUBSYS_INDEX_BLUETOOTH_L2CAP, LBM_DIR_RX, (void *)skb);

The Bluetooth packet is encapsulated in a socket buffer, or skb in kernel parlance, for both the HCI and the L2CAP layers. During development, we encountered two challenges while hooking the TX path of L2CAP. Unlike for the RX path, the L2CAP layer provides multiple functions for sending out L2CAP packets. Even worse, because of different Maximum Transmission Unit (MTU) sizes between HCI and L2CAP, an L2CAP packet is usually fragmented during packet construction before being sent to the lower layer. One possible solution would be to place LBM hooks inside every function on the TX path and reassemble the packet there. Besides the resulting code duplication, the major fault in this solution is the maintenance burden of adding hooks to new TX functions. To solve these challenges, we deploy only one LBM hook at the Asynchronous Connection-Less (ACL) layer within HCI and reassemble the original L2CAP packet there, while fully covering all TX cases used by the L2CAP layer. Note that the RX path still has the LBM hook inside the L2CAP layer, as the kernel has already handled the packet reassembly.

As shown in Table 5-3, we expose 30 and 28 protocol fields from the HCI and L2CAP layers, respectively. Note that both layers share the same 12 fields related to connections. For an HCI packet, a BPF helper is provided to check if a connection is established (indicated by the availability of these fields). For L2CAP, a connection is always established. We also implement 29 and 27 BPF helpers for HCI and L2CAP, respectively, which can retrieve the value of exposed fields. As with the USB subsystem, we enumerate all the legal BPF helpers that can be called within the Bluetooth subsystem, and restrict the memory write operations in the verifier.

5.2.2 LBM User Space

lbmtool is responsible for compiling LBM rules to eBPF programs and loading them into the kernel. Rules/filters pass through standard compilation stages before ending up in the kernel as compiled eBPF. To begin, we tokenize and parse the input LBM filter. To simplify these initial steps we use Lark, a dependency-free Python library that supports LALR(1) grammars written in EBNF syntax. Lark processes our LBM rule grammar and creates a working standalone parser. Once filters are lexed, they are parsed into a Concrete Syntax Tree (CST), also known as a parse tree [120]. The raw parse tree is then shaped and canonicalized over multiple steps into a friendlier representation known as an Abstract Syntax Tree (AST). These steps include symbol (e.g., usb.idProduct) resolution, type checking, and expression flattening. After processing, the AST more accurately represents the LBM language semantics and is flattened into a low-level Intermediate Representation (IR) for backend processing. Our IR is modeled on Three-Address Code (TAC) [120], and it has a close mapping to the DSL semantics. Additionally, we ensure that our IR conforms to Static Single Assignment (SSA) form to simplify register allocation and any late IR optimization passes.

Once we have optimized our IR, it moves to the eBPF instruction generator. There, we allocate registers and translate each IR instruction into corresponding eBPF instructions. Our register allocator maps an unbounded number of virtual registers from our SSA IR to a fixed number of eBPF physical registers. To do this, it builds an interference graph [121] of the IR statements in the program. This graph encodes the lifetime of each virtual register throughout the program and aids in quickly selecting appropriate physical registers during the allocation process. With registers allocated, each IR statement is processed in order by the eBPF instruction generation backend to produce assembly instructions. With machine code produced, any remaining control transfer labels are resolved by a final two-pass assembly step. The resulting eBPF instructions are packaged into an LBM object file with metadata for loading into the kernel. For an example of the compiler’s output at each stage, see Appendix C.

5.3 Evaluation

To evaluate LBM, we first demonstrate how users can write simple LBM rules to protect protocol stacks and defend against known attacks through case studies. These case studies center around the USB and Bluetooth stacks, ending with a proof-of-concept implementation of NFC support in LBM. We divide the cases between specific attacks from malicious peripherals and general host system hardening against potential peripheral threats. The next part of our evaluation focuses on benchmarking the performance of LBM. We divide the benchmarking into our testing setup, micro-benchmark (providing LBM overhead per packet), macro-benchmark (showing LBM overhead on the application and system level), and scalability (covering 100 LBM rules and comparing LBM with previous solutions).

5.3.1 Case Studies

Kernel Protocol Stack Protection: To protect the kernel’s USB protocol stack similar to USBFirewall, we extract protocol constraints from the USB specification and translate them to LBM rules for loading via lbmtool. For example, to ensure the response of a Get Descriptor request is well-formed during the enumeration phase, we write:

    ((usb.setup_packet != 0) &&        /* For enumeration */
     (usb.request[0] == 0x80) &&       /* Get Descriptor */
     (usb.request[1] == 0x06) &&
     /* Make sure response contains at least 2 bytes */
     ((usb.actual_length < 2) ||
      /* Make sure the descriptor type matches */
      ((usb.request[3] != usb.data[1]) ||
       /* Device descriptor */
       ((usb.request[3] == 1) &&
        ((usb.data[0] != 18) || (usb.actual_length != 18))) ||
       /* Configuration descriptor */
       ((usb.request[3] == 2) &&
        ((usb.data[0] < 9) || (usb.actual_length < 9))) ||
       /* String descriptor */
       ((usb.request[3] == 3) &&
        ((usb.data[0] < 4) || (usb.actual_length < 4))))))
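For readers who prefer an executable form, the same checks can be written as a Python predicate over the 8-byte setup packet and the bytes actually received. This is a user-space rendering for illustration only: the function name and argument layout are assumptions, the presence of a setup packet is modeled by its length, and the descriptor type is read from byte 3 of the setup packet (the high byte of wValue in a standard Get Descriptor request).

```python
def is_malformed_get_descriptor(request, data):
    """Return True when a Get Descriptor response should be dropped.

    request: the 8-byte setup packet; data: the bytes actually received.
    """
    # Only responses to standard device-to-host Get Descriptor requests match.
    if len(request) != 8 or request[0] != 0x80 or request[1] != 0x06:
        return False
    actual_length = len(data)
    if actual_length < 2:          # need both the length and the type byte
        return True
    desc_type = request[3]         # high byte of wValue
    if desc_type != data[1]:       # type mismatch between request and response
        return True
    if desc_type == 1:             # device descriptor: exactly 18 bytes
        return data[0] != 18 or actual_length != 18
    if desc_type == 2:             # configuration descriptor: at least 9 bytes
        return data[0] < 9 or actual_length < 9
    if desc_type == 3:             # string descriptor: at least 4 bytes
        return data[0] < 4 or actual_length < 4
    return False


setup = bytes([0x80, 0x06, 0x00, 0x01, 0x00, 0x00, 0x12, 0x00])  # device descriptor request
good = bytes([18, 1]) + bytes(16)
print(is_malformed_get_descriptor(setup, good))      # False
print(is_malformed_get_descriptor(setup, good[:8]))  # True: response shorter than 18 bytes
```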

We first make sure the response has at least 2 bytes, for extracting the length (usb.data[0]) and type (usb.data[1]) of the response. We reject the packet if there is a type mismatch between request and response. Depending on the descriptor type, we then make sure the response has the minimum length required by the specification. To fully cover all the responses during USB enumeration, we also check the response returned by Get Status in a similar fashion. We use FaceDancer [122] and umap2 [38] to emulate a malicious hub device fuzzing the host USB stack. Our stack protection filters are able to drop all malformed packets during USB enumeration.

To protect the Bluetooth stack within the kernel, we extract the constraints from the Bluetooth specification and rewrite them as LBM rules for lbmtool as follows:

    /* HCI-CMD */
    ((bt.hci.type == 1) && (bt.hci.len < 3)) ||
    /* HCI-ACL */
    ((bt.hci.type == 2) && (bt.hci.len < 4)) ||
    /* HCI-SCO */
    ((bt.hci.type == 3) && (bt.hci.len < 3)) ||
    /* HCI-EVT */
    ((bt.hci.type == 4) && (bt.hci.len < 2))

This rule provides basic protection for the HCI layer. Depending on the packet type, we make sure the packet has the minimum length required by the specification. We also implemented similarly styled protection for the L2CAP layer.

Preventing Data Leakage: In addition to propagating malware, USB storage devices are also used to steal sensitive information from a computer. To tackle this threat, USBFILTER implemented a plugin to drop the SCSI write command on the TX path, thus preventing any data from being written into a connected USB storage device; this plugin mechanism is referred to as Linux USBFILTER Module (LUM). Recall that LBM is designed to support the features of existing solutions. We are able to port the SCSI-write-drop LUM to LBM with only around 10 lines of code changes (primarily adjusting the naming of callbacks and header files). In fact, any LUM can be ported to LBM with similarly minimal changes, because LUMs can be treated as a special case on USB in LBM. As they are essentially kernel modules, neither LUMs nor LBM module extensions are as constrained as the LBM filter DSL, given that they are written in C and call kernel APIs directly.

Trusted Input Devices: One of the most common BadUSB attacks relies on the Human Interface Device (HID) class, in which a malicious USB device behaves like a keyboard, injecting keystrokes into the host machine. With LBM, we can write a rule specifying a trusted input device, such that keystrokes from all other input devices are dropped, as follows:

    ((usb.pipe == 1) &&                /* INT (Keystroke) */
     ((usb.manufacturer != "X") ||
      (usb.product != "Y") ||
      (usb.serial != "Z") ||
      (usb.plugtime != 12345)))

For all keystrokes, we check against the expected manufacturer, product, and serial number of the trusted input device. This rules out any devices from different vendors or different device models, and only permits keystrokes from the trusted input device without completely disabling the USB keyboard functionality. Similarly to writing udev rules, system administrators can plug in their trusted input devices to collect the device information before writing and loading LBM filters into the kernel. In case of a BadUSB device spoofing its identity, we extend the USB hub thread to report the initial timestamp when a device was plugged in, and expose this field to user space. Sysadmins can discover this timestamp in dmesg and include it as part of an LBM rule.¹ As such, even if a malicious device were able to mimic the identity of the trusted input device, the malicious keystrokes would be dropped because the initial timestamp would differ.

¹ We assume these trusted input devices do not get unplugged and replugged very often. Using this field alone is also possible, although then we cannot limit the USB packet type to include only keystrokes.
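Returning to the trusted-input rule, its decision can be modeled with a few lines of Python. The sketch treats a packet as a dictionary carrying the same fields the rule inspects; the dictionary representation and the should_drop helper are assumptions made for illustration, and the identity values are the placeholders X, Y, Z, and 12345 used in the rule.

```python
# Placeholder identity of the trusted device, copied from the rule above.
TRUSTED = {"manufacturer": "X", "product": "Y", "serial": "Z", "plugtime": 12345}
INT_PIPE = 1   # value the rule uses to select interrupt (keystroke) transfers


def should_drop(pkt):
    """Model of the rule: match (and drop) keystrokes from any other device."""
    if pkt["pipe"] != INT_PIPE:
        return False          # the rule only matches keystroke packets
    return any(pkt[key] != value for key, value in TRUSTED.items())


keyboard = dict(TRUSTED, pipe=INT_PIPE)
spoofed = dict(keyboard, plugtime=99999)   # same identity, plugged in at another time
print(should_drop(keyboard))   # False
print(should_drop(spoofed))    # True
```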

Securing USB Charging: A well-known defense against BadUSB attacks by USB chargers is the “USB condom” [123], which physically disconnects the USB data pins (D+/-) from the USB bus. Unfortunately, this prevents phones that support USB Battery Charging [124] from drawing extra power via the data wires. As a result, fully charging a phone may take 15 times as long due to the lower amperage. Additionally, a comparable device is not available for USB Type-C. Using LBM, we could instead implement a software USB data blocker:

    ((usb.busnum == 1) && (usb.portnum == 1))

After applying this LBM rule to the RX path, we are able to drop any data transmission from the physical USB port 1 under bus 1, thus making the port charge-only for any connections. This LBM rule does not interfere with USB Battery Charging, since the data wires are still physically connected, and can be applied to any physical USB port, regardless of whether or not it is Type-C.

Securing Bluetooth Invisible Mode: To prevent a Bluetooth device from being scanned by another (potentially) malicious device, such as during a Blueprinting [61] or BlueBag [62] attack, Bluetooth introduces discoverable and non-discoverable modes to devices. A device in non-discoverable mode does not respond to inquiries from other devices, thus hiding its presence from outsiders. On one hand, the toggling of this mode can be controlled from the user space (e.g., using bluetoothctl, which should require root permission). On the other hand, any vulnerabilities within these user-space daemons and tools, once exploited, might put the device into discoverable mode again. To prevent this, we could define an LBM rule as follows:

((bt.hci.type == 1) &&           /* HCI-CMD */
 (bt.hci.command.ogf == 3) &&    /* Discoverable */
 (bt.hci.command.ocf == 58))

This rule detects the HCI command used to enable the discoverable mode on the device. Once applied to the TX path, the rule drops any request from the user space attempting to put the device into discoverable mode. We could write a similar rule to enforce non-connectable mode, which is used to prevent any Bluetooth connection to the device, even if its MAC address is known beforehand.

Controlling Bluetooth/BLE Connections: With the rise of IoT devices, which often rely on Bluetooth Low Energy (BLE), Android added BLE support in version 4.3 [125], and iOS has supported it since the iPhone 4S. The Linux kernel Bluetooth stack (BlueZ [126]) also supports both classic Bluetooth and BLE at the same time. Although it is not uncommon to see a dual-mode device supporting both classic Bluetooth and BLE, it is surprisingly challenging (if not impossible) to enable only one of them while disabling the other [127]. With LBM, enabling/disabling Bluetooth or BLE connections is just a one-liner:

((bt.hci.conn == 1) &&           /* A link exists */
 (bt.hci.conn.type == 0x80))     /* BLE link */

This LBM rule checks the connection type for each Bluetooth or BLE packet, and drops the packet if the connection is BLE, thus preventing unfamiliar IoT devices from establishing a connection while still allowing classic Bluetooth connections. It also provides a quick workaround for BleedingBit attacks [8] without waiting for firmware updates. Simply changing == 0x80 to != 0x80 achieves the opposite effect, only permitting BLE connections and thus providing a temporary defense against BlueBorne attacks [7].

Defending Against BlueBorne: BlueBorne attacks exploit vulnerabilities within Bluetooth protocol stack implementations by sending either malformed or specially crafted Bluetooth packets. Within the Linux kernel, this vulnerability resulted from a missing check before using a local buffer. As a result, a crafted packet could cause a kernel stack overflow, potentially leading to remote code execution. Although the fix was straightforward (adding the missing checks [128]), applying patches to existing devices still requires the additional steps of rebuilding the kernel and flashing new firmware.

With LBM, we can write a simple rule to defend against the potential kernel stack overflow:

((bt.l2cap.cid == 0x1) &&             /* L2CAP Signaling */
 (bt.l2cap.sig.cmd.code == 0x5) &&    /* Configuration Response */
 (bt.l2cap.sig.cmd.len >= 66))
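The threshold in this rule can be reproduced with a short Python sketch, using the buffer and header sizes given in the explanation that accompanies the rule (a 64-byte local buffer with a 4-byte header, and a 6-byte command header). The constant and function names are hypothetical; the sketch only mirrors the arithmetic and the predicate, not the kernel code.

```python
L2CAP_SIGNALING_CID = 0x1   # channel ID matched by the rule
CONF_RSP_CODE = 0x5         # Configuration Response command code

LOCAL_BUF_SIZE = 64         # size of the vulnerable local buffer
BUF_HEADER_SIZE = 4         # bytes of that buffer used for its header
CMD_HEADER_SIZE = 6         # command header not counted as payload


def drop_threshold() -> int:
    """Smallest cmd.len that must be dropped."""
    data_capacity = LOCAL_BUF_SIZE - BUF_HEADER_SIZE   # 60 bytes of options
    # The payload (cmd.len - 6) must stay strictly below the capacity,
    # so the first rejected length is capacity + header size.
    return data_capacity + CMD_HEADER_SIZE


def should_drop(cid: int, code: int, cmd_len: int) -> bool:
    """Python rendering of the rule's predicate."""
    return (cid == L2CAP_SIGNALING_CID
            and code == CONF_RSP_CODE
            and cmd_len >= drop_threshold())
```

With these values `drop_threshold()` evaluates to 66, which matches the constant in the rule.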

We first pinpoint where the vulnerability was triggered, which is at the L2CAP layer during configuration response. Because the local buffer is 64 bytes and the first 4 bytes are used for the header, the actual data buffer to hold configuration options is 60 bytes. In the rule above, bt.l2cap.sig.cmd.len denotes the total length of an L2CAP command packet. Without counting the 6-byte header, the actual payload size of a command packet is cmd.len - 6. To defend against BlueBorne attacks, all we need is to make sure (cmd.len - 6) < 60. Therefore, our rule, which is written to drop any configuration response of 66 bytes or more, will put a stop to BlueBorne. The above two rules demonstrate that LBM provides a dynamic patching capability to protocol stacks within the kernel, without waiting for official kernel patches or firmware updates to be upstreamed.

NFC Support: To further show the generality of LBM, we extend LBM to support NFC. Unlike Bluetooth, NFC has three different standards (software interfaces) for communicating with NFC modules, namely HCI [129], NCI [130], and Digital [131]. As a proof-of-concept, we focus on NCI, exposing two protocol fields and implementing one BPF helper. The number of additional lines of code added to the kernel and lbmtool to make LBM support NFC is shown in Table 5-4.

Step 1: Placing LBM hooks. NCI provides unique interfaces to cover both TX and RX transmission: nci_send_frame and nci_recv_frame. As in other networking subsystems, skb is used to carry NFC packets. We place the following LBM hooks at the two interfaces:

lbm_filter_pkt(LBM_SUBSYS_INDEX_NFC, LBM_DIR_TX, (void *)skb);
lbm_filter_pkt(LBM_SUBSYS_INDEX_NFC, LBM_DIR_RX, (void *)skb);

Table 5-4. The number of lines added to support NFC.

NFC          Kernel   lbmtool   Total
# of lines   85       12        97

Table 5-5. Details about the five LBM rules used during the benchmarks.

LBM Rule   Purpose            # of Insn   Scope
USB-1      Stack Protection   72          Micro/Macro BM
USB-2      Stack Protection   25          Micro/Macro BM
USB-3      User Defined       22          Scalability BM
HCI-1      Stack Protection   81          Micro/Macro BM
L2CAP-1    Stack Protection   76          Micro/Macro BM

Step 2: Exposing protocol fields. We expose the packet length (nfc.nci.len) and message type (nfc.nci.mt) fields to the user space. The packet length is a member of the struct lbm_nfc exposed in the LBM user-space header file. The message type is implemented as a BPF helper calling other NCI APIs.

Step 3: Enhancing lbmtool. lbmtool is easily extensible for new protocols, as we demonstrate with NFC. The internal LBM-rule code generation backend is abstracted from the specific subsystem the rules will apply to. As such, the only changes required to support NFC are to include a symbol descriptor table for each variable exposed to the user space by the kernel. Once these changes are incorporated, lbmtool accepts LBM filters with NFC protocol fields and compiles them into eBPF instructions.

5.3.2 Benchmark Setup

We performed all of our benchmarks on a workstation with a 4-core Intel i5 CPU running at 3.2 GHz and 8 GB of memory. The peripherals used during testing include a 300 Mbps USB 2.0 WiFi adapter, a Bluetooth 4.0 USB 2.0 adapter, and a 500 GB USB 3.0 external storage device. Depending on the benchmark, some subset of these devices was connected. We list all the LBM rules used during the benchmarks in Table 5-5. We deploy all the rules on the RX path, since our protection target is the host machine. In addition to the “Stack Protection” rules mentioned in the case studies, we include “USB-3”, a user-defined rule similar to usb.serial == "7777", which drops the USB packet if the sending device’s serial number is 7777. As no devices that we test have a serial number matching this pattern, we mainly use this rule for the scalability benchmark.

Table 5-6. LBM overhead in µs based on processing 10K packets on the RX path. For each subsystem, the 1st row is for normal LBM and the 2nd row is for LBM-JIT. In most cases, the overhead is within 1 µs when JIT is enabled.

Subsystem         Min    Max     Avg    Med    Dev
USB               0.29   11.18   1.26   1.83   0.44
                  0.12    8.87   0.55   0.28   0.33
Bluetooth-HCI     1.16   17.87   2.81   2.70   0.62
                  0.27   15.67   0.98   0.77   0.47
Bluetooth-L2CAP   1.32   25.87   2.93   2.99   0.67
                  0.44   23.76   1.15   1.26   0.53

5.3.3 Micro-Benchmark

For USB testing, we load LBM rules “USB-1” and “USB-2” into the system. We then capture 10K USB packets on the RX path from the WiFi adapter. As shown in the first two rows of Table 5-6, the average overhead is 1.26 µs per packet. When JIT is enabled, the overhead is reduced to 0.55 µs.

For Bluetooth testing, we load LBM rules “HCI-1” and “L2CAP-1” into the system. We implement a simple L2CAP client/server protocol based on PyBluez [132] to generate 10K packets on the RX path for the HCI and L2CAP layers, respectively. As shown in the last four rows of Table 5-6, the average overheads are 2.81 µs for HCI and 2.93 µs for L2CAP. Again, with the help of JIT, we can reduce the overhead to around 1 µs.

Takeaway: the general overhead introduced by LBM is around 1 µs for most cases.

5.3.4 Macro-Benchmark

For USB, we load the rules “USB-1” and “USB-2” and use filebench [100] to measure the throughput of the USB 3.0 external storage device. We chose the “fileserver” workload model with 10K files, 128KB and 1MB mean file sizes, 10 working threads, and a 10-minute running time. This workload generates roughly 1 GB and 10 GB of files, respectively, on the storage device. As shown in Figure 5-7, all kernel configurations achieve similar throughput during our testing. When the mean file size is 128KB, the total file size (1 GB) can easily fit into the system page cache. Thus, we are able to achieve close to 500 MB/s throughput (faster than the hard drive’s maximum speed of 150 MB/s). When the mean file size is 1MB, the total file size (10 GB) cannot completely fit into the page cache, thus resulting in much lower throughput.

Figure 5-7. filebench across different kernel configurations. All configurations achieve similar throughputs, meaning a minimal performance impact from LBM. (Bar chart: throughput in MB/s for the Vanilla, LBM, and LBM-JIT kernels, with 128KB and 1MB mean file sizes.)

For Bluetooth, we load the rules “HCI-1” and “L2CAP-1” and use l2ping [133] to benchmark the Round-Trip-Time (RTT) for 10K pings. As with the USB testing, all kernel configurations achieve similar RTTs of around 5 ms, as shown in Figure 5-8. Because the overhead of LBM is around 1 µs in general (subsection 5.3.3), its contribution to the RTT measurement is negligible.

Figure 5-8. RTT of l2ping in milliseconds (lower is better) based on 10K pings, across different kernel configurations. All configurations achieve similar RTTs, meaning a minimal performance impact from LBM. (Bar chart: RTT in ms for the Vanilla, LBM, and LBM-JIT kernels.)

To double-check that LBM introduces a minimal overhead across the whole system, we use lmbench [134] to benchmark the whole system across different kernel configurations. The complete summary is available in Appendix D. In short, LBM achieves comparable performance with the vanilla kernel.

Takeaway: the overhead introduced by LBM is negligible for applications and for the system as a whole.

5.3.5 Scalability

To understand the scalability of LBM, we load the rule “USB-3” into the RX path once, 10 times, and 100 times. As in the micro-benchmark, we record 10K USB packets generated by the USB WiFi adapter and compute the overhead of LBM going through these rules for each packet. As shown in Figure 5-9, while the total overhead increases as the number of rules increases, the average overhead of checking individual rules decreases. The average overhead was 0.83 µs when there was only one rule loaded. It decreased to 0.32 µs when there were 100 rules loaded. Under JIT, the overhead was further reduced to 0.23 µs. This might be the result of increased cache hits from accessing the same rule in a loop. Even for different rules, it is possible to observe this amortization effect, as long as each rule occupies a different cache line. In general, more complicated rules will also induce more runtime overhead.

Figure 5-9. LBM overhead in µs based on varying numbers of rules. While the total overhead increases as the number of rules increases, the overhead of going through each individual rule decreases, thus the total overhead is essentially amortized. (Plot: overhead in µs versus the number of LBM rules (1, 10, 100) for LBM and LBM-JIT.)

We then compare LBM with USBFILTER using filebench.2 Except for the difference in kernel versions,3 we ran LBM and USBFILTER on the same physical machine. To set up the benchmark, we load “USB-3” into the RX path 10 times on LBM and load an equivalent rule the same number of times into USBFILTER. As shown in Figure 5-10, both LBM and LBM-JIT show minimal overhead compared to the stock kernel, and provide better throughput than USBFILTER regardless of the mean file size. This could be the result of both kernel code improvements across versions and the design of LBM (e.g., due to its use of eBPF). The throughput boost is even clearer when the mean file size is 1MB and JIT is enabled. Compared to USBFILTER, LBM-JIT improves the throughput by roughly 60%.

2 Due to a kernel bug within USBFILTER, the front USB 3.0 ports could not support USB 3.0 devices. We switched to the rear USB 3.0 ports in this testing. We also tried to run USBFirewall. Unfortunately, FreeBSD does not support filebench or the EXT4 filesystem used by our external drive.

3 LBM is running Linux kernel 4.13 while USBFILTER runs 3.13.

Figure 5-10. LBM vs. USBFILTER benchmark using filebench with the same 10 rules loaded in each. LBM introduces minimal overhead compared to the stock kernel and performs better than USBFILTER in general. (Bar chart: throughput in MB/s for USBFILTER, Stock-LBM, LBM, and LBM-JIT, with 128KB and 1MB mean file sizes.)

Finally, we compare LBM with USBFILTER and USBFirewall using dd on a VFAT filesystem with direct I/O enabled to bypass the page cache. Since USBFirewall does not support loading rules from the user space directly, we statically built these 10 rules when compiling USBFirewall. As shown in Figure 5-11, compared to their stock versions, all the solutions show minimal overhead. The throughput of USBFirewall does not vary much with the block size. We tried both the native FreeBSD version of dd and the GNU version. Both demonstrate similar throughput regardless of the block size. We double-checked this by increasing the block size to 1 MB. When the block size is beyond 16 KB, both LBM and USBFILTER show better throughput than USBFirewall. Similarly, both LBM and LBM-JIT have better throughput than USBFILTER.

Figure 5-11. LBM vs. USBFILTER vs. USBFirewall benchmark using dd with the same 10 rules loaded in each. Compared to their stock versions, all the solutions show minimal overhead. USBFirewall does not vary much based on the block size. LBM performs better than USBFirewall and USBFILTER when the block size is beyond 16 KB in general. (Plot: throughput in MB/s versus block size in KB (4 to 128) for Stock-USBFirewall, USBFirewall, Stock-USBFILTER, USBFILTER, Stock-LBM, LBM, and LBM-JIT.)

Takeaway: compared to other state-of-the-art solutions, LBM provides better scalability and performance.

5.4 Discussion

5.4.1 LBM vs. USBFILTER vs. USBFirewall

The LBM filter DSL is more expressive than the USBFILTER policy, which only supports concatenating equality checks using logical AND. The LBM filter DSL supports different arithmetic and logical operations, as well as changing operator precedence using parentheses. Compared to USBFILTER, LBM USB also doubles the number of protocol fields exposed to the user space, although LBM does not support pinning applications to peripherals.4 Nevertheless, LBM enables more complicated and powerful filtering rules than USBFILTER. Besides, any LUM can be converted into an LBM module without much hassle. LBM USB has also fully replicated the functionality provided by USBFirewall, which required a kernel recompile and reboot to make any rule changes.

5.4.2 L2CAP Signaling in Bluetooth

Unlike L2CAP signaling in BLE, where each L2CAP packet only carries a single command, L2CAP signaling in classic Bluetooth may have packets containing multiple commands. As we saw in the BlueBorne defense case study, if there is a malicious configuration response command contained in an L2CAP signaling packet, the entire payload will be dropped, including other “innocent” commands if they exist. One possible solution to such coarse-grained drops is to separate each command from the same L2CAP signaling packet into standalone packets. This requires packet parsing and duplication at an early stage. Another solution is to add a new customized hook in the place where each command is extracted by the L2CAP stack. Our current implementation does not apply either solution, for performance and simplicity considerations. From a security perspective, if one command from a certain device is recognized as malicious, it seems reasonable to drop other commands from the same device.
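The first solution, separating the commands of one signaling packet, can be sketched in Python as follows. The sketch assumes the standard L2CAP signaling command layout of a 1-byte code, a 1-byte identifier, and a 2-byte little-endian data length, followed by the data. The function name `split_commands` and the example payload are hypothetical; this is an illustration of the parsing step, not part of LBM.

```python
import struct

CMD_HEADER = struct.Struct("<BBH")  # code, identifier, data length


def split_commands(payload: bytes):
    """Split a signaling payload into (code, identifier, data) tuples.

    Parsing stops at the first truncated command.
    """
    commands = []
    offset = 0
    while offset + CMD_HEADER.size <= len(payload):
        code, ident, length = CMD_HEADER.unpack_from(payload, offset)
        end = offset + CMD_HEADER.size + length
        if end > len(payload):
            break  # truncated command: ignore the remainder
        commands.append((code, ident, payload[offset + CMD_HEADER.size:end]))
        offset = end
    return commands


# Two commands in one packet: a Configuration Response (0x05) carrying
# two data bytes, followed by an Echo Request (0x08) with no data.
example = CMD_HEADER.pack(0x05, 1, 2) + b"\x00\x00" + CMD_HEADER.pack(0x08, 2, 0)
parsed = split_commands(example)
```

Once the commands are separated like this, a per-command decision could drop only the offending command, at the cost of the extra parsing that the paragraph above mentions.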

4 USBFILTER instrumented some USB device drivers to support application pinning. This approach is ad hoc rather than a generic method.

5.4.3 BPF Memory Write

For security considerations, we forbid memory writes in LBM eBPF programs. While this restriction improves the kernel’s security posture towards user-loaded code, we also lose a powerful feature provided by eBPF and BPF helpers: packet mangling, which allows fields to be changed on the fly. This feature has been employed by the networking subsystem, e.g., for changing the source IP address and/or the destination port number. For LBM, one potential use of memory writes is removing only the malicious commands while keeping others within the same L2CAP signaling packet intact. As an intermediate step to enable memory writes in LBM programs, we can restrict the memory write ability to certain BPF helpers. As long as these BPF helpers are safe, the BPF verifier can still verify these programs by rejecting store instructions as before.

5.4.4 BPF Helper Kernel Modules

Ideally, we should allow BPF helpers for each subsystem to be implemented as a standalone kernel module, which can be plugged in when needed. Unfortunately, this is forbidden by the current eBPF design, and we follow the same design principles for similar reasons. First of all, BPF helpers are like syscalls in a system. The number of a BPF helper is like the syscall number, which is part of the Application Binary Interface (ABI) of the system. Although by introducing LBM we have essentially namespaced LBM BPF helpers from other general and networking-specific helpers, these helpers still share the same LBM namespace regardless of their respective subsystems. As a result, the number of an LBM BPF helper implemented within a kernel module cannot be decided until all used numbers are known, including the ones defined by LBM internals and those defined in other BPF helper modules. A possible solution here is to further namespace LBM BPF helpers per subsystem, e.g., have USB helpers always start with 100, Bluetooth helpers with 200, etc. Note that this solution would consequently limit the number of helpers each subsystem could have.
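The per-subsystem numbering proposed above, and the limit it implies, can be sketched in Python. The bases for USB and Bluetooth come from the text; the NFC base and the names `SUBSYS_BASE`, `SLOTS_PER_SUBSYS`, and `helper_number` are assumptions made for illustration only.

```python
# Base helper number per subsystem. USB and Bluetooth follow the example
# in the text; the NFC entry is an assumed continuation of that pattern.
SUBSYS_BASE = {"usb": 100, "bluetooth": 200, "nfc": 300}

# With bases spaced 100 apart, each subsystem gets at most 100 helpers.
SLOTS_PER_SUBSYS = 100


def helper_number(subsys: str, index: int) -> int:
    """Map a subsystem-local helper index to a stable global number."""
    if not 0 <= index < SLOTS_PER_SUBSYS:
        raise ValueError("helper index outside the subsystem's range")
    return SUBSYS_BASE[subsys] + index
```

A module for one subsystem can then pick its numbers without knowing what other modules define, which is the property the flat namespace lacks. The `ValueError` branch is the limitation noted in the text: a subsystem cannot grow past its reserved range.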

5.4.5 LLVM Support

LLVM began to support eBPF as an architectural backend in early 2015 [135]. A typical workflow involves writing an eBPF filter in C and compiling it using Clang. eBPF loaders such as tc are able to parse the generated ELF file and load it into the kernel [136]. While LLVM brings C into eBPF programming, easing filter writing for C developers, we realized that eBPF programming might still be challenging for sysadmins, who need an easy and intuitive way to write eBPF filters; we designed the LBM filter DSL with this in mind. We are planning to support LLVM as well by adding a new eBPF loader into LBM.

5.5 Limitations

5.5.1 Stateless vs. Stateful Policy

LBM filters are designed to be policy-independent, although a large part of the case studies presented stateless policies. Whether the policy is stateless or stateful essentially depends on what protocol fields and packet data are exposed to the user space. For example, USB does not have a “session” concept, and we could write useful LBM filters based on just the device information (i.e., a stateless policy). Bluetooth has the “connection” concept in the L2CAP layer (like TCP connections), so we could write LBM filters using this field (i.e., a stateful policy). Besides protocol fields defined by standards, the Linux kernel also maintains some bookkeeping data structures, e.g., counters. Exposing these kernel fields would also help to create stateful policies. The current LBM USB and Bluetooth implementations focus on exposing basic protocol fields rather than stateful variables. Nevertheless, we have noticed the potential of stateful policies. For instance, we could write a stateful policy to detect BleedingBit [8] attacks by observing a sequence of multiple BLE advertising packets with a certain bit off followed by another BLE advertising packet with that bit on.
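The kind of state such a policy would need can be sketched in Python. This is a hypothetical illustration of the pattern described above (a run of advertising packets with a bit off, followed by one with the bit on); the class name, the `min_off_run` threshold, and the boolean input are assumptions, and the current LBM implementation does not expose the state required to express it.

```python
class BitFlipDetector:
    """Flag an 'on' packet that follows a run of 'off' packets."""

    def __init__(self, min_off_run: int = 2):
        self.min_off_run = min_off_run  # assumed meaning of "multiple"
        self.off_run = 0                # state carried across packets

    def observe(self, bit_on: bool) -> bool:
        """Process one advertising packet; return True if suspicious."""
        if bit_on:
            suspicious = self.off_run >= self.min_off_run
            self.off_run = 0
            return suspicious
        self.off_run += 1
        return False


detector = BitFlipDetector(min_off_run=2)
verdicts = [detector.observe(b) for b in (False, False, True, True, False, True)]
```

The counter `off_run` is what makes the policy stateful: the verdict for a packet depends on the packets seen before it, not only on its own fields.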

5.5.2 DMA-Oriented Protocols

We have not instantiated LBM on Thunderbolt, HDMI, or DisplayPort, although it is indeed possible to support these DMA-oriented protocols using LBM.5 Since LBM works at the packet layer, we are able to filter packets for these protocols as long as the concept of a packet, for a given protocol, is defined by the standard and implemented by the kernel. For example, DisplayPort defines different packets to carry different payloads such as stream and audio [137], implemented as such within the kernel. Thunderbolt, however, is a proprietary standard. It is not clear whether the protocol itself is packetized, and the only packet-level message available within the kernel is the Thunderbolt control request/response rather than data transfer. Another challenge to supporting these protocols comes from determining the proper hook placement for complete mediation. DisplayPort is not a standalone subsystem but rather a component of the Direct Rendering Manager (DRM) inside the kernel. Thunderbolt does not have a core layer but only provides a few drivers due to the limited hardware devices.

5.5.3 Operating Systems Dependency

Although LBM is built upon the Linux kernel, it is possible to apply LBM to other operating systems. To achieve that, we need the target operating system to support a generic in-kernel packet filtering mechanism such as eBPF. The classic BPF is not enough because LBM relies on calling kernel APIs within filters to access different kernel data. While it is non-trivial to extend the classic BPF to eBPF, some porting effort has been done for FreeBSD to support eBPF [138]. The other requirement is a software architecture enabling complete-mediation hook placement for different peripherals. For instance, it is possible to mediate all USB packets within the FreeBSD USB subsystem, as proven by USBFirewall. Nevertheless, it might be challenging to port LBM to Windows, since it has a different packet filtering mechanism [139] and it is closed-source.

5 USB is also DMA-oriented.

5.5.4 Lbmtool Limitations

lbmtool currently does not support LBM filter consistency checking, meaning it is possible to have two LBM filters conflict with each other. Regarding eBPF instruction generation, lbmtool does not support stack allocations when the return value of BPF helpers is beyond 8 bytes (the width of an eBPF register). Manual assembly is needed to manipulate the stack for those BPF helpers. lbmtool also does not support lazy evaluation of BPF helpers. They are always called first to retrieve the values of all protocol fields needed before the actual evaluation of the LBM filter DSL expression. These are merely the current limitations of the custom compiler itself and could be eliminated with additional code.
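The difference between calling all helpers up front and evaluating them lazily can be illustrated with a small Python sketch. It is only a model of the evaluation order: the functions `eager_eval` and `lazy_eval`, the helper names, and the example expression are hypothetical and do not reflect how lbmtool generates eBPF.

```python
def eager_eval(helpers, expr):
    """Call every helper first, then evaluate the expression."""
    values = {name: fn() for name, fn in helpers.items()}
    return expr(values.__getitem__)


def lazy_eval(helpers, expr):
    """Call a helper only when the expression first needs its value."""
    cache = {}

    def get(name):
        if name not in cache:
            cache[name] = helpers[name]()
        return cache[name]

    return expr(get)


calls = []  # records which helpers were actually invoked


def make_helper(name, value):
    def helper():
        calls.append(name)
        return value
    return helper


helpers = {"a": make_helper("a", 0), "b": make_helper("b", 2)}


def expr(get):
    # Short-circuits: "b" is not needed when the first comparison fails.
    return get("a") == 1 and get("b") == 2


eager_result = eager_eval(helpers, expr)
eager_calls = list(calls)
calls.clear()
lazy_result = lazy_eval(helpers, expr)
lazy_calls = list(calls)
```

Both strategies return the same verdict, but the eager one invokes both helpers while the lazy one skips the helper whose value the short-circuited expression never reads.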

CHAPTER 6 USB TYPE-C AUTHENTICATION

In the previous chapters, we designed and implemented different security solutions within the operating system, including GoodUSB, USBFILTER, and LBM, to constrain what a peripheral can do at different granularities. Essentially, these systems provide authorization mechanisms for peripherals. As a result, they need to trust either the user’s judgement or the device’s identification information to enforce authorization within the system. While both USBFILTER and LBM can enforce security policies on certain USB ports without trusting any information from USB peripherals, the deeper problem is that peripherals do not have trusted identities. Attackers can modify the device firmware to change the identification information as needed, which reduces the effectiveness of authorization systems. In this chapter, we go beyond authorization and look at authentication by studying the recent USB Type-C Authentication protocol. Note that although this protocol is designed for USB, it provides guidelines and lessons learned on how to build trust anchors for other peripherals.

6.1 Authentication Protocol

Although the research community has proposed many different solutions for addressing weaknesses in USB security, none have reached widespread commercial adoption. In this section, we evaluate the industry’s proposed solution, USB Type-C Authentication [140]. Type-C Authentication (TCA) is the first attempt by the USB 3.0 Promoter Group and USB-IF to address issues related to security. However, the security properties of TCA are not yet widely understood by the security community.1 We begin with a description of the features and assumptions of TCA. Then, using the Type-C Authentication revision 1.0 specification (released on Feb 2, 2017), we formally model and verify the protocol using ProVerif [143], demonstrate multiple attacks, and discuss other issues within the spec. We finally evaluate TCA using the findings from our systematization, and show that TCA is a step in the right direction for USB security in general, but that its design flaws and its disregard of modern USB attacks render its efforts in vain.

Excerpts from this chapter previously appeared in “SoK: “Plug & Pray” Today - Understanding USB Insecurity in Versions 1 through C”, originally published in the proceedings of the 2018 IEEE Symposium on Security and Privacy (SP) [9].

Figure 6-1. The USB Type-C Authentication Protocol. (Message sequence between Host and Device: Digest Query (GetDigest), Certificate Read (GetCertificate), and Authentication Challenge (Challenge).)

1 At the time of writing, the only commercial products supporting TCA are software from Siliconch [141] and a USB PD controller from Renesas [142].

Figure 6-2. The USB Type-C Authentication challenge (request) and response messages with payloads. (Request: ReqHeader (slot#), Nonce. Response: ResHeader (slot#, others), CertChain Hash, Salt, Context Hash, Sig.)

6.1.1 USB Certificate Authorities

The TCA protocol is built over a certificate authority (CA) hierarchy, mimicking the current CA model used by SSL/TLS. The USB-IF owns and operates a default self-signed root certificate, and permits other organizations to use their own root certificates. The specification places no requirements on third-party roots (e.g., organizational vetting or issuance processes). USB device manufacturers control intermediate certificates signed by the USB-IF, and devices are issued their own certificates by the manufacturers. The final USB product is capable of storing at most 8 certificate chains and associated private keys, each with separate roots.

6.1.2 Authentication Protocol

In this protocol, the initiator is the USB host controller and the responder is the USB device. The protocol defines three operations the initiator can perform, shown in Figure 6-1:

Digest Query: In this operation, the host controller issues a GetDigest request to the device. The device responds with digests for all of its certificate chains. According to the specification, the intent of this operation is to accelerate the certificate verification process in cases where the certificate chain has already been cached and verified.

Certificate Read: This operation allows the host to retrieve a specific certificate chain using the GetCertificate request.

Challenge: As shown in Figure 6-2, this operation defines a challenge-response protocol that the host initiates by sending a Challenge request. The request contains a slot identifier in the request header and a 32-byte nonce. The response echoes the same slot identifier in the response header and contains a 32-byte SHA256 hash of the chosen certificate chain, a 32-byte salt, a 32-byte SHA256 hash of all USB descriptors (for USB devices) or all zeros (for PD devices), and a 64-byte ECDSA digital signature over the challenge message and the response message, computed with the corresponding private key of the device.

6.1.3 Secure Key Storage and Processing

To protect certificate private keys, a non-volatile secure enclave is needed, as shown in Figure 6-3. As discussed above, this storage is partitioned into 8 slots supporting 8 private keys. Similarly, the certificate chain region also has 8 slots, each containing the corresponding certificate chain if there is a private key in the associated slot. The TCA specification does not specify whether certificate chains should also be secured. To support the authentication protocol, a hardware cryptographic engine supporting ECDSA is also required. Presumably, this should be the only component that can access the secure storage. Other hardware components, besides the basic MCU, may be needed for both security and performance reasons, including TRNG and SHA256.

6.1.4 Security Policy

Following device authentication, the TCA specification suggests the introduction of a policy mechanism for peripheral management. The specification explains that “Policy defines the behavior of Products. It defines the capabilities a Product advertises, its Authentication requirements, and resource availability with respect to unauthenticated Products” (Page 14, Section 1.4) and “USB Type-C Authentication allows an organization to set and enforce a Policy with regard to acceptable Products.” (Page 11, Section 1). Unfortunately, beyond this description a concrete definition for policy is not provided; all implementation details are left to the OEM.

Figure 6-3. USB device internal architecture with secure storage and hardware to support Type-C Authentication. (Components: MCU, SHA256, ECDSA, TRNG; Private Keys slot0 to slot7; Certificate Chains slot0 to slot7; Config; Firmware.)
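To make the challenge and response payloads of subsection 6.1.2 concrete, the following Python sketch assembles them with the field sizes given there. Several simplifications are assumptions: the request and response headers are reduced to a single slot byte, the function names are hypothetical, and the `sign` argument stands in for the device's ECDSA engine, which the standard library does not provide. The stand-in signer only produces 64 bytes of the right length and offers no security.

```python
import hashlib
import os

NONCE_LEN = 32  # nonce in the Challenge request
HASH_LEN = 32   # SHA256 output size
SALT_LEN = 32   # salt in the response
SIG_LEN = 64    # ECDSA signature size


def build_challenge(slot: int) -> bytes:
    """Host side: slot identifier (header simplified) plus a random nonce."""
    return bytes([slot]) + os.urandom(NONCE_LEN)


def build_response(challenge, cert_chain, descriptors, sign, is_pd=False):
    """Device side: echo the slot, add hashes and salt, then sign."""
    slot = challenge[:1]
    chain_hash = hashlib.sha256(cert_chain).digest()
    salt = os.urandom(SALT_LEN)
    # Context hash: all USB descriptors for USB devices, zeros for PD.
    context = bytes(HASH_LEN) if is_pd else hashlib.sha256(descriptors).digest()
    unsigned = slot + chain_hash + salt + context
    # The signature covers the challenge message and the response message.
    signature = sign(challenge + unsigned)
    if len(signature) != SIG_LEN:
        raise ValueError("signature must be 64 bytes")
    return unsigned + signature


def stand_in_sign(message: bytes) -> bytes:
    """Placeholder that is NOT ECDSA: just a 64-byte digest for layout tests."""
    return hashlib.sha512(message).digest()
```

For example, `build_response(build_challenge(0), chain, descriptors, stand_in_sign)` yields 1 + 32 + 32 + 32 + 64 = 161 bytes under these simplifications. Nothing in the message binds the signing key to the secure slot, which is the gap the formal analysis in the next section examines.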

6.2 Formal Verification

To discover possible vulnerabilities in the design, in this section we formally verify the TCA protocol using ProVerif [143], which has been applied to Signal [144] and the TLS 1.3 draft [145]. ProVerif uses the concept of channels to model an untrusted communication environment (e.g., the Internet) where adversaries may attack the protocol. However, because the USB communication channel does not provide confidentiality by default and is trusted in most cases,2 we instead model the device firmware as our channel. This accurately models attacks such as BadUSB [3], where the attacker is either a malicious USB device or a non-root hub trying to spoof the authentication protocol. In ProVerif, we define this firmware channel as free fw:channel.

2 We do not consider side-channel or hardware attacks against the USB bus.

We also need to define the security properties we wish to prove. For example, since the private keys inside USB devices should never be leaked, we seek to understand if attackers can learn the key from eavesdropping or participating in the protocol. The Type-C Authentication spec clearly states (Page 11, Section 1.2) that “it permits assurance that a Product is

1. Of a particular type from a particular manufacturer with particular characteristics

2. Owned and controlled by a particular organization”.

This means the authentication protocol should guarantee both the original configuration and the true identity of the device. The original configuration should be the one designed by the vendor for this product (e.g., a webcam). The true identity combines the usage of certificate chains (tied to a particular organization) and private keys baked into the device to provide the ability to cryptographically verify the original configuration. We abstract these security goals in ProVerif:

free slot_key: pri_key [private].
free slot_cert_chain: cert_chain.
free orig_conf: usbpd_config.
query attacker(slot_key).
query d:usbpd; event(goodAuth(d, true)) ==> event(useConfig(d, orig_conf)).
query d:usbpd; event(goodAuth(d, true)) ==>
  (event(useCert(d, slot_cert_chain)) && event(usePrivkey(d, slot_key))).

To simplify the abstraction, we model one private key and the corresponding certificate chain rather than implementing all 8 slots. We also make the following assumptions:

• We ignore the verification process for a certificate chain, which is critical to the security of the entire protocol but out of the scope of the protocol.

• We assume the verification process to be successful by default.

Our modeling is based on the communication between the USB host and the USB device. PD products share the same procedure via different signaling mappings. To mimic the caching behavior involved in the protocol, we use a “table” on the host side, supporting reading and writing a certificate chain: table cert_chain_cache(cert_chain, digest).

Unsurprisingly, attackers cannot obtain the private key inside the USB device by protocol messages alone since none of these messages are designed to transmit the key. However, this protocol fails to meet its goals; neither the original configuration nor the true identity of the device could be guaranteed even if the authentication protocol succeeds, due to certificate chain caching inside the USB host:

    get cert_chain_cache(chain, =dig) in known_device(config) else new_device(config).
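The cached path of the model can be pictured with a small Python sketch. It mirrors only the cache lookup above, not the full protocol; the Host class, the byte-string chains, and the configuration strings are hypothetical names introduced for illustration, and SHA-256 is assumed as the digest function.

```python
import hashlib


def digest(chain: bytes) -> str:
    """SHA-256 digest of a certificate chain (chains are public data)."""
    return hashlib.sha256(chain).hexdigest()


class Host:
    """Hypothetical host that caches verified chains, keyed by digest."""

    def __init__(self):
        self.cert_chain_cache = {}  # digest -> chain

    def full_auth(self, chain: bytes) -> None:
        # Assumption: chain verification and the challenge succeeded once.
        self.cert_chain_cache[digest(chain)] = chain

    def on_attach(self, reported_digest: str, config: str) -> str:
        # Mirrors: get cert_chain_cache(chain, =dig)
        #          in known_device(config) else new_device(config)
        if reported_digest in self.cert_chain_cache:
            return f"known_device({config})"
        return f"new_device({config})"


legit_chain = b"vendor-root|intermediate|webcam-leaf"
host = Host()
host.full_auth(legit_chain)

# The malicious device holds no private key; it only hashes the public chain
# and reports a configuration of its own choosing.
spoofed = host.on_attach(digest(legit_chain), "webcam+keyboard")
fresh = host.on_attach(digest(b"some unknown chain"), "webcam")
print(spoofed)
print(fresh)
```

In this sketch the host never compares the reported configuration with the one it saw during the first, full authentication, which is the gap the model exposes.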

Since certificate chains are not secret, a malicious device can compute the digest of the expected chain, send it in response to the GetDigest request, and thereby impersonate the legitimate device. Unless the host saves the configuration of the legitimate device and compares it with the current configuration, a malicious device can claim any functionality it wants. Thus, the certificate chain cache is vulnerable to spoofing attacks.

We then remove the certificate chain cache from the host, forcing every device to go through a complete certificate request. Again, the private key is secure. Unfortunately, the authentication can still be spoofed, as shown in this attack trace:

    attacker(sign((non_1883, hash(chain_1877), sal_d_1881, config_d_1879), prik_1876)).

To exploit this vulnerability, the attacker hardcodes a certificate chain and a private key in the firmware rather than using the ones in the slot, and modifies the original configuration (e.g., by adding a malicious HID functionality). This means that, without firmware verification, BadUSB-style firmware attacks also circumvent the TCA protocol, rendering it useless for its stated goals.

To demonstrate how firmware verification corrects this issue, we then assume firmware is trusted (e.g., signed by the vendor and verified by the MCU before flashing). We model this in ProVerif by marking the firmware channel as private: free fw:channel [private]. We assume that valid, legitimate firmware will use the certificate chains and private keys

inside the slots during authentication and that the original configuration of the device does not contain malicious functionality. Using this model, ProVerif confirms that successful authentication guarantees both the original configuration and the true identity of the device:

    RESULT event(goodAuth(d, true)) ==> (event(useCert(d, slot_cert_chain[])) && event(usePrivkey(d, slot_key[]))) is true.
    RESULT event(goodAuth(d_2076, true)) ==> event(useConfig(d_2076, orig_conf[])) is true.
    RESULT not attacker(slot_key[]) is true.

These results show that correct authentication using the TCA protocol is possible only when the firmware is verified.

6.3 Other Issues

While our formal verification of the authentication protocol uncovered major flaws, our manual analysis of the TCA specification uncovered other serious and systemic design flaws. These flaws reflect both a lack of understanding of secure protocol design and a lack of awareness of the present state of threats to peripheral devices. Responsibility for solving the most difficult security challenges raised by Type-C, such as a USB Certificate Authority system or a rich language for expressing security policies, is delegated wholesale to the OEMs. As a result, we are left to conclude that Type-C is based on an intrinsically broken design. Below, we catalog these issues:

1. No Binding for Identification with Functionality: In addition to the VID, PID, and serial number of the device, a device’s leaf certificate also carries Additional Certificate Data (ACD). ACD contains physical characteristics of PD products (e.g., peak current and voltage regulation) but no functionality (interface) information for other USB products.³ One explanation is that the protocol was designed to address low-quality Type-C cables that were damaging host machines [53] but was later extended to support other USB products. For PD, the specification clearly states

3 Note that using a self-signed root certificate from the vendor itself may not solve the problem, especially when the vendor is not trusted.

that it does not consider alternative modes. As a result, a successful authentication does not specify the device’s original configuration (e.g., storage device, keyboard, normal charging cable).

2. Volatile Context Hash: As shown in Figure 6-2, the challenge response contains the context hash, which is all zeros for PD products but a SHA256 hash of all descriptors for USB products. This seems intended to solve the functionality binding issue for USB products mentioned above, but it breaks when the firmware is not trusted: the firmware can provide its own set of USB descriptors and feed them into the hardware ECDSA signing module to generate the challenge response, as shown in Figure 6-3. As a result, BadUSB attacks are still possible.

3. Unidirectional Authentication: For PD products, either a PD sink or a PD source can initiate an authentication challenge. The authentication between PD devices is thus mutual. However, the TCA specification only allows USB host controllers to initiate an authentication challenge for USB devices. This is unfortunate, as our survey of defensive solutions demonstrates that host authentication is an essential feature for device self-protection. As a result, the TCA specification does not provide a way for smart devices such as mobile phones to make informed trust decisions.

4. Nebulous Policy Component: Following device authentication, the TCA specification calls for the creation of a security policy to handle different connected products, but does not adequately describe what a policy is or how to create one. The specification does not define the security policy language, encoding, installation method, or how it interacts with the USB host controller. Policies are only described anecdotally, indicating a lack of forethought as to how TCA policy can appreciably enhance security.

5. Impractical Key Protection Requirement: The private keys in the slots are the most important property a product needs to protect besides the firmware. Although the specification does not detail how to secure private keys, it does list more than 10 attacks that a product must defend against to avoid leaking keys, including side-channel attacks, power analysis, and micro-probing. It is unlikely that a $10 USB product [142] could stop advanced invasive attacks, e.g., using a Focused Ion Beam (Appendix C, TCA Spec), which makes certificate revocation critical when a private key is leaked.

6. No Revocation: The specification states that the validity time of a product certificate is ignored, suggesting that once the certificate is loaded onto the device, there is no way to revoke it. The use of certificate chain caching to accelerate the authentication process is also based on the fact that all certificates along a chain stay legitimate forever once the chain is verified.

7. No Support for Legacy Products: With the help of converters, Type-C can be fully compatible with legacy USB devices, and the specification leaves it to the end user to set a security policy that blacklists devices that cannot participate in the authentication protocol.

As breaking backwards compatibility is in direct conflict with USB’s core design principle of universality, very few organizations will elect to set such a policy. As a result, TCA is likely to be trivially bypassed by applying a converter to a Type-C device.

Not surprisingly, TCA works best against signal injection attacks, since it was designed to solve the problem of low-quality charging cables. All of its other, limited defensive effects result from trusting the identity and the firmware once the device passes the authentication protocol, and from assuming that security policies based on the identity of the device are deployed on the host machine. On one hand, TCA is aware of some urgent issues in USB security and takes initial steps to fix them. For example, TCA introduces certificates and a CA model, providing a way to build trust for USB products, and embeds private keys into USB products to provide trust anchors. On the other hand, as we show in the TCA weakness column, its design flaws and limitations make TCA a vulnerable and incomplete solution for USB security.
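The Volatile Context Hash issue listed above can be illustrated with a short Python sketch. This is a simplified model under stated assumptions, not the TCA message format: HMAC stands in for the hardware ECDSA module only so that the example stays within the Python standard library, and every name (hw_sign, host_accepts, the descriptor strings) is hypothetical.

```python
import hashlib
import hmac
import os

SLOT_KEY = os.urandom(32)  # stand-in for the private key held in a slot


def hw_sign(message: bytes) -> bytes:
    """Stand-in for the hardware signer: it signs whatever firmware feeds it."""
    return hmac.new(SLOT_KEY, message, hashlib.sha256).digest()


def host_verify(message: bytes, signature: bytes) -> bool:
    """Stand-in for signature verification with the certified public key."""
    return hmac.compare_digest(hw_sign(message), signature)


def context_hash(descriptors) -> bytes:
    """SHA-256 over the descriptors the device presents."""
    return hashlib.sha256("|".join(descriptors).encode()).digest()


def challenge_response(nonce: bytes, descriptors):
    """Firmware picks the descriptors; the hardware only signs the result."""
    ctx = context_hash(descriptors)
    return ctx, hw_sign(nonce + ctx)


def host_accepts(nonce, enumerated, ctx, sig) -> bool:
    # The host can only compare the signed hash with what the device enumerates.
    return host_verify(nonce + ctx, sig) and ctx == context_hash(enumerated)


original = ["webcam"]
malicious = ["webcam", "hid-keyboard"]  # descriptors chosen by modified firmware

nonce = os.urandom(32)
ctx, sig = challenge_response(nonce, malicious)
accepted = host_accepts(nonce, malicious, ctx, sig)
matches_original = ctx == context_hash(original)
print(accepted, matches_original)
```

The response verifies even though the signed context hash no longer corresponds to the vendor's original descriptors, because nothing in the exchange pins the hash to the original configuration.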

CHAPTER 7 REFLECTIONS ON PERIPHERAL SECURITY

Decades ago, when the Internet was invented and only five computers were connected, security concerns were off the table because everyone knew who they were talking to on the other end. Interestingly, peripheral security nowadays replicates the early days of the Internet: security is still off the table, although the number of peripherals in the world is far greater than five and we do not really know what we plug into or connect with our machines. While it is always easy to blame end users, the fundamental issue of peripheral security lies in the “Trust-by-default” perception of peripherals and the resulting lack of authentication and authorization mechanisms. Accordingly, to solve the peripheral security problem, we need an “Untrusted-by-default” or even “Adversarial-by-default” mindset for peripherals, and we need to build authentication and authorization mechanisms within hardware and system software.

In chapter 3, we presented GoodUSB, a security solution to defend against malicious USB devices by including end users in the authentication loop. While this solution bridges the semantic gap between end users and operating systems, we found limitations in relying on end users and in coarse-grained control based on USB functionality. In chapter 4, we introduced USBFILTER, a systematic solution within the operating system to defend against untrusted and malicious USB peripherals by filtering USB packets. Borrowing the firewall concept widely used in network security, we have shown that software-based attacks from malicious USB devices can be defeated by building a packet-layer firewall for the USB subsystem within operating systems. In chapter 5, we went beyond USB and generalized the packet-layer firewall to all peripherals, including Bluetooth and NFC, by building LBM.
As a generic security framework for peripherals within the Linux kernel, LBM has proven the dissertation statement in chapter 1: Software-based attacks from malicious peripherals such as USB devices and Bluetooth gadgets that abuse the protocol design or exploit software stack vulnerabilities can be

defended by building packet-layer firewalls for those I/O subsystems within operating systems. We went beyond peripheral authorization issues and looked at peripheral authentication issues in chapter 6 by formally verifying the USB Type-C Authentication protocol. We have demonstrated that while the protocol provides promising steps to solve peripheral security, there is still a long way to go to enable a secure and trusted computing environment for the usage of modern peripherals.

7.1 Future Work

Solution Integration and Extension. Because most existing peripheral defense solutions focus on a single layer, it is natural to investigate how to combine different solutions covering multiple layers. For instance, combining ProvUSB [146], GoodUSB, and FirmUSB [147] provides a comprehensive defense from the Human to the Transport Layer, defeating most software-based attacks. Similarly, USBFirewall can act in combination with USBFILTER to provide a powerful USB packet firewall for controlling USB device behavior while defending against exploits from malformed packets. LBM currently supports USB, Bluetooth, and NFC. We can also extend it to cover other peripherals, such as HDMI, I2C, and SPI.

Type-C Authentication Products Evaluation. While we have shown design flaws in TCA, it is unlikely that we will see a new version of the specification in the near future, given that it has just been finalized.¹ There is therefore an urgent need to evaluate the security of these new products, since real-world attacks may provide the impetus for a specification update. It is also possible that vendor-specific implementations have considered those pitfalls in the spec and have offered mitigations which, once verified, will prove convincing.

1 At the time of writing, the only vendor we are aware of that produces Type-C Authentication products is Renesas, and only for PD rather than normal USB.

Bi-directional Authentication and Mutual Authentication. While the trust anchor for USB hosts is missing in TCA, a short-term fix is to leverage the trusted hardware available on the host, such as a TPM, and implement a host authentication protocol like Kells [148] and ProvUSB. The possibility of bidirectional authentication also opens the door to mutual authentication, where the USB host and peripheral authenticate each other. Moreover, we need to take the lessons learned from USB and apply them to other peripherals, such as Bluetooth and NFC. Together with clear key protection and revocation requirements, this may provide a comprehensive solution that covers the different stakeholders within the whole host-peripheral ecosystem.

Legacy Device Authentication. To authenticate legacy devices, which lack trust anchors, two techniques are promising and solve the problem in different ways. Host fingerprinting has shown that host machines can be fingerprinted from peripherals, e.g., via the USB interface, using machine learning algorithms. The same idea could be applied to peripheral fingerprinting, although with the pitfalls of building a robust machine learning system in an adversarial environment. FirmUSB is able to understand USB device firmware behavior and provides a stronger security guarantee than fingerprinting when the firmware is available. This combination of fingerprinting and firmware verification can potentially mitigate most attacks from legacy devices.

Security Policy Instantiation. Although security policies have been designed and used in existing solutions such as USBFILTER and LBM, we need a new policy design that is general enough to be adopted by most vendors and expressive enough to ease creating rich rules.
The new design should enumerate a set of subject, object, and access primitives to provide an intuitive mediation abstraction, and define a common data marshaling format (e.g., XML, JSON) through which policies can be shared between deployments. It should also describe best practices for policy design, including how policies can preserve security in the presence of legacy devices. This will not only concretize TCA, which assumes an existing

policy instantiation, but also promote peripheral security as part of systems security solutions, such as SELinux.

7.2 Conclusion

In this dissertation, we have demonstrated how to build security solutions within operating systems to defend against malicious peripherals. Moreover, we have shown that software-based attacks from malicious peripherals such as USB devices and Bluetooth gadgets that abuse the protocol design or exploit software stack vulnerabilities can be defended by building packet-layer firewalls for those I/O subsystems within operating systems. Besides peripheral authorization mechanisms, we have also looked into peripheral authentication issues, such as the USB Type-C authentication, and demonstrated its design flaws via formal verification. These combined efforts have enabled further peripheral security research on hardening operating systems and building trust anchors within peripherals.

APPENDIX A A LUM EXAMPLE TO BLOCK SCSI WRITES

/*
 * lbsw - A LUM kernel module
 * used to block SCSI write command within USB packets
 */
#include <linux/module.h>
#include <linux/usbfilter.h>
#include <scsi/scsi.h>

#define LUM_NAME          "block_scsi_write"
#define LUM_SCSI_CMD_IDX  15

static struct usbfilter_lum lbsw;
static int lum_registered;

/*
 * Define the filter function
 * Return 1 if this is the target packet
 * Otherwise 0
 */
int lbsw_filter_urb(struct urb *urb)
{
    char opcode;

    /* Has to be an OUT packet */
    if (usb_pipein(urb->pipe))
        return 0;

    /* Make sure the packet is large enough */
    if (urb->transfer_buffer_length <= LUM_SCSI_CMD_IDX)
        return 0;

    /* Make sure the packet is not empty */
    if (!urb->transfer_buffer)
        return 0;

    /* Get the SCSI cmd opcode */
    opcode = ((char *)urb->transfer_buffer)[LUM_SCSI_CMD_IDX];

    /* Currently only handles WRITE_10 for Kingston */
    switch (opcode) {
    case WRITE_10:
        return 1;
    default:
        break;
    }

    return 0;
}

static int __init lbsw_init(void)
{
    pr_info("lbsw: Entering: %s\n", __func__);
    snprintf(lbsw.name, USBFILTER_LUM_NAME_LEN, "%s", LUM_NAME);
    lbsw.lum_filter_urb = lbsw_filter_urb;

    /* Register this lum */
    if (usbfilter_register_lum(&lbsw))
        pr_err("lbsw: registering lum failed\n");
    else
        lum_registered = 1;

    return 0;
}

static void __exit lbsw_exit(void)
{
    pr_info("exiting lbsw module\n");
    if (lum_registered)
        usbfilter_deregister_lum(&lbsw);
}

module_init(lbsw_init);
module_exit(lbsw_exit);

MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("lbsw module");
MODULE_AUTHOR("dtrump");

Figure A-1. An example Linux usbfilter Module that blocks writes to USB removable storage.
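As a rough illustration of the check performed by the module in Figure A-1, the following Python sketch applies the same three tests to a simulated packet. It assumes the packet is a 31-byte Bulk-Only Transport Command Block Wrapper, in which the SCSI opcode sits at byte offset 15, and that WRITE(10) has opcode 0x2A. The helpers build_cbw and lbsw_filter are hypothetical names used only here; they are not part of USBFILTER.

```python
import struct

LUM_SCSI_CMD_IDX = 15  # offset of the SCSI opcode inside the wrapper
WRITE_10 = 0x2A        # SCSI WRITE(10) opcode
READ_10 = 0x28         # SCSI READ(10) opcode, used for contrast


def build_cbw(opcode, tag=1, data_len=512):
    """Build a simplified 31-byte Command Block Wrapper (hypothetical helper)."""
    # signature "USBC", tag, transfer length, flags (0 = OUT), LUN, CB length
    header = struct.pack("<IIIBBB", 0x43425355, tag, data_len, 0x00, 0, 10)
    command_block = bytes([opcode]) + bytes(15)  # opcode plus zero padding
    return header + command_block


def lbsw_filter(is_in, buf):
    """Return 1 for an OUT packet whose SCSI opcode is WRITE(10), else 0."""
    if is_in:                        # has to be an OUT packet
        return 0
    if buf is None:                  # packet must carry a buffer
        return 0
    if len(buf) <= LUM_SCSI_CMD_IDX:  # packet must be large enough
        return 0
    return 1 if buf[LUM_SCSI_CMD_IDX] == WRITE_10 else 0


write_packet = build_cbw(WRITE_10)
read_packet = build_cbw(READ_10)
print(lbsw_filter(False, write_packet), lbsw_filter(False, read_packet))
```

A rule that drops packets matched by such a filter would block writes while leaving reads untouched, which is the behavior the figure describes.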

APPENDIX B LBMTOOL FRONTEND GRAMMAR

<expr>          ::= <logical-or>
<logical-or>    ::= <logical-and> ('||' <logical-and>)*
<logical-and>   ::= <comparison> ('&&' <comparison>)*
<comparison>    ::= <atom> (<comparison-op> <atom>)*
<comparison-op> ::= '<' | '>' | '<=' | '>=' | '==' | '!='
<access>        ::= '[' <number> ':' <number> ']'
<attribute>     ::= '.' <IDENTIFIER>
<struct>        ::= <IDENTIFIER> <attribute>* <access>?
<number>        ::= <DEC_NUMBER> | <HEX_NUMBER>
<string>        ::= <STRING>
<atom>          ::= <number> | '-' <number> | <struct> | <string> | '(' <expr> ')'
<DEC_NUMBER>    ::= <DIGIT>+
<HEX_NUMBER>    ::= '0x' <HEXDIGIT>+
<LETTER>        ::= 'a' ... 'z' | 'A' ... 'Z'
<STRING>        ::= '"' ('\"' | /[^"]/)* '"'
<DIGIT>         ::= '0' ... '9'
<HEXDIGIT>      ::= 'a' ... 'f' | 'A' ... 'F' | <DIGIT>
<IDENTIFIER>    ::= ('_' | <LETTER>) ('_' | <LETTER> | <DIGIT>)*

Figure B-1. The Extended Backus-Naur Form (EBNF) of our constructed LBM expression grammar.
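To show how the grammar in Figure B-1 reads in practice, the sketch below is a small recursive-descent evaluator with one method per nonterminal. It is an illustration under our own assumptions, not the LBMTOOL frontend: field values come from a plain dictionary keyed by the dotted path, and the access form [a:b] is treated as a Python slice, which may differ from the tool's semantics.

```python
import operator
import re

TOKEN = re.compile(r"""\s*(?:
      (?P<hex>0x[0-9a-fA-F]+)
    | (?P<dec>[0-9]+)
    | (?P<str>"(?:\\"|[^"])*")
    | (?P<id>[_A-Za-z][_A-Za-z0-9]*)
    | (?P<op>\|\||&&|<=|>=|==|!=|[<>\[\]:.()-])
    )""", re.VERBOSE)

CMP = {"<": operator.lt, ">": operator.gt, "<=": operator.le,
       ">=": operator.ge, "==": operator.eq, "!=": operator.ne}


def tokenize(text):
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected input: {text[pos:]!r}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, tokens, fields):
        self.toks, self.i, self.fields = tokens, 0, fields

    def peek(self):
        return self.toks[self.i] if self.i < len(self.toks) else (None, None)

    def take(self, value=None):
        kind, val = self.peek()
        if kind is None or (value is not None and val != value):
            raise SyntaxError(f"expected {value!r}, got {val!r}")
        self.i += 1
        return kind, val

    def expr(self):  # <expr> ::= <logical-or>
        left = self.logical_and()
        while self.peek() == ("op", "||"):
            self.take()
            right = self.logical_and()
            left = bool(left) or bool(right)
        return left

    def logical_and(self):
        left = self.comparison()
        while self.peek() == ("op", "&&"):
            self.take()
            right = self.comparison()
            left = bool(left) and bool(right)
        return left

    def comparison(self):
        left = self.atom()
        while self.peek()[0] == "op" and self.peek()[1] in CMP:
            op = self.take()[1]
            left = CMP[op](left, self.atom())
        return left

    def number(self):
        kind, val = self.take()
        if kind not in ("hex", "dec"):
            raise SyntaxError(f"number expected, got {val!r}")
        return int(val, 16) if kind == "hex" else int(val)

    def atom(self):
        kind, val = self.peek()
        if kind in ("hex", "dec"):
            return self.number()
        self.take()
        if kind == "str":
            return val[1:-1].replace('\\"', '"')
        if (kind, val) == ("op", "-"):
            return -self.number()
        if (kind, val) == ("op", "("):
            inner = self.expr()
            self.take(")")
            return inner
        if kind == "id":
            return self.struct(val)
        raise SyntaxError(f"unexpected token {val!r}")

    def struct(self, name):  # <IDENTIFIER> <attribute>* <access>?
        path = [name]
        while self.peek() == ("op", "."):
            self.take()
            kind, attr = self.take()
            if kind != "id":
                raise SyntaxError("identifier expected after '.'")
            path.append(attr)
        value = self.fields[".".join(path)]
        if self.peek() == ("op", "["):
            self.take()
            lo = self.number()
            self.take(":")
            hi = self.number()
            self.take("]")
            value = value[lo:hi]  # assumption: slice semantics
        return value


def evaluate(text, fields):
    parser = Parser(tokenize(text), fields)
    result = parser.expr()
    if parser.peek()[0] is not None:
        raise SyntaxError("trailing input")
    return result


device = {"usb.idVendor": 0x413C, "usb.idProduct": 0x3010}
rule = "usb.idVendor == 0x413c && usb.idProduct == 0x3010"
print(evaluate(rule, device))
```

The precedence in the grammar (comparison binds tighter than &&, which binds tighter than ||) falls out of the call order expr, logical_and, comparison, atom.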

APPENDIX C LBMTOOL COMPILATION EXAMPLE

LBM Program

    usb.idVendor == 0x413c && usb.idProduct == 0x3010

Intermediate Representation

    0: t1 := call(lbm_usb_get_idVendor)
    1: t0 := binop(EQ, t1, 16700)
    2: t3 := call(lbm_usb_get_idProduct)
    3: t2 := binop(EQ, t3, 12304)
    4: t4 := binop(AND, t0, t2)
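To make the intermediate representation concrete, the sketch below interprets the five instructions above in Python. The tuple encoding, the BINOPS table, and run_ir are our own illustrative assumptions rather than the LBMTOOL implementation, and the two helper functions are stubs that return fixed IDs. Note that the immediates 16700 and 12304 are simply 0x413c and 0x3010 in decimal.

```python
import operator

# Three-address code from the listing above, encoded as tuples (assumed format).
IR = [
    ("t1", "call", "lbm_usb_get_idVendor"),
    ("t0", "binop", "EQ", "t1", 16700),
    ("t3", "call", "lbm_usb_get_idProduct"),
    ("t2", "binop", "EQ", "t3", 12304),
    ("t4", "binop", "AND", "t0", "t2"),
]

BINOPS = {
    "EQ": operator.eq,
    "AND": lambda a, b: bool(a) and bool(b),
}


def run_ir(ir, helpers):
    """Execute the instructions in order and return the last temporary."""
    temps = {}

    def value(operand):
        # Temporaries are named by strings; immediates are plain integers.
        return temps[operand] if isinstance(operand, str) else operand

    result = None
    for dest, kind, *args in ir:
        if kind == "call":
            temps[dest] = helpers[args[0]]()
        elif kind == "binop":
            op, lhs, rhs = args
            temps[dest] = BINOPS[op](value(lhs), value(rhs))
        else:
            raise ValueError(f"unknown instruction kind: {kind}")
        result = temps[dest]
    return result


# Stub helpers standing in for the kernel-side accessors.
matching = {
    "lbm_usb_get_idVendor": lambda: 0x413C,
    "lbm_usb_get_idProduct": lambda: 0x3010,
}
other = dict(matching, lbm_usb_get_idProduct=lambda: 0x3011)
print(run_ir(IR, matching), run_ir(IR, other))
```

The final temporary t4 plays the role of the program's return value: true only when both the vendor and the product ID match.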

eBPF Assembly

LSTART:
    MOV64_REG(REG_9, REG_1)
    MOV64_REG(REG_1, REG_9)
    CALL_FUNC(FUNC_lbm_usb_get_idVendor)
    MOV64_REG(REG_1, REG_0)
    MOV64_IMM(REG_6, 1)
    JMP_IMM(JEQ, REG_1, 16700, L1)
    MOV64_IMM(REG_6, 0)
L1:
    MOV64_REG(REG_1, REG_9)
    CALL_FUNC(FUNC_lbm_usb_get_idProduct)
    MOV64_REG(REG_2, REG_0)
    MOV64_IMM(REG_3, 1)
    JMP_IMM(JEQ, REG_2, 12304, L2)
    MOV64_IMM(REG_3, 0)
L2:
    JMP_IMM(JEQ, REG_6, 0, L3)
    JMP_IMM(JEQ, REG_3, 0, L3)
    MOV64_IMM(REG_4, 1)
    JMP_A(L4)
L3:
    MOV64_IMM(REG_4, 0)
L4:
    JMP_IMM(JNE, REG_4, 0, L5)
L6:
    MOV64_IMM(REG_0, 0)
    EXIT_INSN()
L5:
    MOV64_IMM(REG_0, 1)
LEND:
    EXIT_INSN()

APPENDIX D LMBENCH RESULTS FOR LBM

Table D-1. lmbench results for a Vanilla kernel, LBM, and LBM-JIT.

Processor & Processes (ns)
         Null call  Null I/O  Stat  Open/Close  Select TCP  Signal install  Signal Handle  Fork  Execute  Exec. Shell
Vanilla  0.23       0.32      0.65  1.39        6.26        0.27            0.81           151.  497.     1425
LBM      0.22       0.32      0.66  1.38        5.65        0.27            0.80           141.  400.     1411
LBM-JIT  0.22       0.32      0.66  1.38        5.65        0.27            0.80           92.6  415.     1446

Basic integer operations (ns)
         bit     add     div     mod
Vanilla  0.2800  0.1400  6.1100  6.5700
LBM      0.2800  0.1400  6.0200  6.4900
LBM-JIT  0.2800  0.1400  6.0300  6.5300

Basic uint64 operations (ns)
         bit    div   mod
Vanilla  0.280  12.0  11.7
LBM      0.280  12.1  11.7
LBM-JIT  0.280  12.1  11.7

Basic float operations (ns)
         add     mul     div     bogo
Vanilla  0.8400  1.3900  3.7800  1.9500
LBM      0.8400  1.3900  3.6800  1.9500
LBM-JIT  0.8400  1.3900  3.6800  1.9600

Basic Double Operations (ns)
         add     mul     div     bogo
Vanilla  0.8400  1.3900  5.6200  3.9000
LBM      0.8400  1.3900  5.6300  3.9000
LBM-JIT  0.8400  1.3900  5.6500  3.9100

Context Switching (ns)
         2p/0K   2p/16K  2p/64K  8p/16K  8p/64K  16p/16K  16p/64K
Vanilla  1.7300  1.6600  2.4000  4.2000  5.0700  4.24000  5.79000
LBM      1.6500  1.5800  2.1900  3.3800  4.9100  4.11000  7.77000
LBM-JIT  1.6100  1.5000  2.2600  3.2200  7.5000  3.28000  7.55000

Local Communication Latencies (us)
         2p/0K context switch  Pipe   AF    UDP    TCP   TCP/connection
Vanilla  1.730                 5.028  6.97  9.127  11.5  17.
LBM      1.650                 4.998  6.31  8.973  11.3  17.
LBM-JIT  1.610                 5.068  7.27  8.966  11.4  17.

File & VM system latencies (us)
         0K File Cre.  0K File Del.  10K File Cre.  10K File Del.  Mmap Latency  Prot. Fault  Page Fault  100 FD Select
Vanilla  5.7323        3.8630        13.3           6.8787         6493.0        0.501        0.22380     1.609
LBM      5.7247        3.8566        13.2           7.0278         6518.0        0.502        0.22080     1.602
LBM-JIT  5.7531        3.8511        13.7           6.8543         6523.0        0.500        0.22310     1.613

Local Communication bandwidths (MB/s), Larger is better
         Pipe  AF UNIX  TCP   File Reread  Mmap Reread  Bcopy (libc)  Bcopy (custom)  Memory Read  Memory Write
Vanilla  5597  12.K     7539  7455.9       15.0K        8126.0        5886.8          14.K         8528.
LBM      5606  12.K     7365  7473.6       15.0K        8193.2        5911.6          14.K         8535.
LBM-JIT  5686  12.K     7466  7494.9       15.0K        8169.2        5909.9          14.K         8542.

Memory latencies (ns)
         Mhz   L1 Cache  L2 Cache  Main memory  Random memory
Vanilla  3192  1.1140    3.3420    15.2         84.1
LBM      3192  1.1140    3.3420    14.6         84.9
LBM-JIT  3192  1.1140    3.3430    15.2         83.9

REFERENCES

[1] Apple, Hewlett-Packard, Intel, , Renesas, STMicroelectronics, and Texas Instruments, “Universal Serial Bus 3.2 Specification: Revision 1.0,” Tech. Rep., Sep. 2017.
[2] Bluetooth SIG, Inc., “Bluetooth Core Specification v5.0,” Tech. Rep., Dec. 2016.
[3] K. Nohl, S. Krißler, and J. Lell, “BadUSB - On accessories that turn evil,” BlackHat, 2014.
[4] B. Lau, Y. Jang, C. Song, T. Wang, P. Chung, and P. Royal, “Mactans: Injecting Malware into iOS Devices via Malicious Chargers,” Proceedings of the Black Hat USA Briefings, Las Vegas, NV, August 2013, 2013.
[5] D. J. Tian, G. Hernandez, J. I. Choi, V. Frost, C. Ruales, P. Traynor, H. Vijayakumar, L. Harrison, A. Rahmati, M. Grace, and K. R. B. Butler, “ATtention spanned: Comprehensive vulnerability analysis of AT commands within the Android ecosystem,” in 27th USENIX Security Symposium (USENIX Security 18), 2018. [Online]. Available: https://www.usenix.org/conference/usenixsecurity18/presentation/tian
[6] D. J. Tian, G. Hernandez, J. I. Choi, V. Frost, P. C. Johnson, and K. R. Butler, “Lbm: A security framework for peripherals within the linux kernel,” in 2019 IEEE Symposium on Security and Privacy (SP), 2019.
[7] Armis Inc., “BlueBorne,” https://www.armis.com/blueborne/, 2017.
[8] Armis Inc., “Bleeding Bit,” https://armis.com/bleedingbit/, 2018.
[9] D. J. Tian, N. Scaife, D. Kumar, M. Bailey, A. Bates, and K. R. B. Butler, “SoK: “Plug & Pray” Today - Understanding USB Insecurity in Versions 1 through C,” in Proceedings of the IEEE Symposium on Security and Privacy (S&P), 2018.
[10] SystemSoft Corporation and Intel Corporation, “Universal Serial Bus Common Class Specification, Revision 1.0,” December 1997.
[11] The USB Device Working Group, “USB Class Codes,” http://www.usb.org/developers/defined_class, 2015.
[12] USB Implementers Forum, Inc., “USB Mass Storage Class Specification Overview,” http://www.usb.org/developers/docs/devclass_docs/Mass_Storage_Specification_Overview_v1.4_2-19-2010.pdf, 2010.
[13] USB Implementers Forum, Inc., “USB Mass Storage Class CBI Transport,” http://www.usb.org/developers/docs/devclass_docs/usb_msc_cbi_1.1.pdf, 2003.
[14] S. Stasiukonis, “Social engineering, the USB way,” Dark Reading, 2006.

[15] Paul Sewers, “US Govt. plant USB sticks in security study, 60% of subjects take the bait,” http://thenextweb.com/insider/2011/06/28/us-govt-plant-usb-sticks-in-security-study-60-of-subjects-take-the-bait/, 2011.
[16] J. R. Jacobs, “Measuring the effectiveness of the USB flash drive as a vector for social engineering attacks on commercial and residential computer systems,” Master’s thesis, Embry-Riddle Aeronautical University, 2011.
[17] M. Tischer, Z. Durumeric, S. Foster, S. Duan, A. Mori, E. Bursztein, and M. Bailey, “Users Really Do Plug in USB Drives They Find,” in Proceedings of the 37th IEEE Symposium on Security and Privacy (S&P ’16), San Jose, California, USA, May 2016.
[18] Mathew J. Schwartz, “How USB Sticks Cause Data Breach, Malware Woes,” http://www.pcworld.com/article/237600/companies_lose_2_5_million_from_missing_memory_sticks_study_says.html, 2011.
[19] Darren Pauli, “Secret defence documents lost to foreign intelligence,” http://www.itnews.com.au/news/secret-defence-documents-lost-to-foreign-intelligence-278961, 2011.
[20] Kevin Poulsen and Kim Zetter, “U.S. Intelligence Analyst Arrested in Wikileaks Video Probe,” http://www.wired.com/2010/06/leak/, 2010.
[21] Alex Washburn, “Snowden Smuggled Documents From NSA on a Thumb Drive,” https://www.wired.com/2013/06/snowden-thumb-drive/, 2013.
[22] H. J. Highland, “The BRAIN virus: fact and fantasy,” Computers & Security, vol. 7, no. 4, pp. 367–370, 1988.
[23] Common Vulnerabilities and Exposures, “CVE-2010-2568,” https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2010-2568, 2010.
[24] Falliere, Nicolas and Murchu, Liam O and Chien, Eric, “W32.Stuxnet Dossier,” https://www.symantec.com/content/en/us/enterprise/media/security_response/whitepapers/w32_stuxnet_dossier.pdf, 2011.
[25] S. Shin and G. Gu, “Conficker and Beyond: A Large-scale Empirical Study,” in Proceedings of the 26th Annual Computer Security Applications Conference, ser. ACSAC ’10, 2010. [Online]. Available: http://doi.acm.org/10.1145/1920261.1920285
[26] K. Zetter, “Meet “Flame”, The Massive Spy Malware Infiltrating Iranian Computers,” Wired, 28 May 2012, https://www.wired.com/2012/05/flame/.
[27] P. Szor, “Duqu–threat research and analysis,” McAfee Labs, 2011, https://scadahacker.com/library/Documents/Cyber_Events/McAfee%20-%20W32.Duqu%20Threat%20Analysis.pdf.
[28] P. Oliveira, Jr., “FBI can turn on your web cam, and you’d never know it,” http://nypost.com/2013/12/08/fbi-can-turn-on-your-web-cam/, 8 Dec. 2013, accessed: 2016-11-10. [Online]. Available: http://nypost.com/2013/12/08/fbi-can-turn-on-your-web-cam/
[29] CBS/AP, “BlackShades malware hijacked half a million computers, FBI says,” http://www.cbsnews.com/news/blackshades-malware-hijacked-half-a-million-computers-fbi-says/, 2014, accessed: 2016-11-10. [Online]. Available: http://www.cbsnews.com/news/blackshades-malware-hijacked-half-a-million-computers-fbi-says/
[30] M. Brocker and S. Checkoway, “iSeeYou: Disabling the MacBook webcam indicator LED,” in 23rd USENIX Security Symposium (USENIX Security 14), 2014, pp. 337–352.
[31] Tal Ater, “Chrome Bugs Allow Sites to Listen to Your Private Conversations,” https://www.talater.com/chrome-is-listening/, 2014.
[32] M. Guri, M. Monitz, and Y. Elovici, “USBee: air-gap covert-channel via electromagnetic emission from USB,” in Privacy, Security and Trust (PST), 2016 14th Annual Conference on. IEEE, 2016, pp. 264–268.
[33] “TURNIPSCHOOL - NSA playset,” http://www.nsaplayset.org/turnipschool.
[34] Hak5, “Episode 709: USB Rubber Ducky Part 1,” http://hak5.org/episodes/episode-709, 2013.
[35] S. Kamkar, “USBdriveby,” http://samy.pl/usbdriveby/, 2014.
[36] J. Bang, B. Yoo, and S. Lee, “Secure USB bypassing tool,” Digital Investigation, vol. 7, pp. S114–S120, 2010.
[37] GoodFET, “Facedancer21,” http://goodfet.sourceforge.net/hardware/facedancer21/, 2016.
[38] NCCGROUP, “Umap2,” https://github.com/nccgroup/umap2, 2018.
[39] Google, “Found Linux kernel USB bugs,” https://github.com/google/syzkaller/blob/master/docs/linux/found_bugs_usb.md, 2017.
[40] Z. Wang and A. Stavrou, “Exploiting Smart-phone USB Connectivity for Fun and Profit,” in Proceedings of the 26th Annual Computer Security Applications Conference, ser. ACSAC ’10. New York, NY, USA: ACM, 2010, pp. 357–366.
[41] K. Sridhar, S. Prasad, L. Punitha, and S. Karunakaran, “EMI issues of universal serial bus and solutions,” in Electromagnetic Interference and Compatibility, 2003. INCEMIC 2003. 8th International Conference on. IEEE, 2003, pp. 97–100.
[42] D. Oswald, B. Richter, and C. Paar, “Side-channel attacks on the Yubikey 2 one-time password generator,” in International Workshop on Recent Advances in Intrusion Detection. Springer, 2013, pp. 204–222.

[43] K. Nohl, “BadUSB Exposure: Hubs,” https://opensource.srlabs.de/projects/badusb/wiki/Hubs, November 2014.
[44] L. Letaw, J. Pletcher, and K. Butler, “Host Identification via USB Fingerprinting,” 2011 IEEE 6th International Workshop on Systematic Approaches to Digital Forensic Engineering (SADFE), May 2011.
[45] A. Bates, R. Leonard, H. Pruse, K. R. Butler, and D. Lowd, “Leveraging USB to Establish Host Identity Using Commodity Devices,” in Proceedings of the 2014 Network and Distributed System Security Symposium, ser. NDSS ’14, February 2014.
[46] A. Davis, “Revealing Embedded Fingerprints: Deriving Intelligence from USB Stack Interactions,” in Blackhat USA, Jul. 2013.
[47] Y. Su, D. Genkin, D. Ranasinghe, and Y. Yarom, “USB Snooping Made Easy: Crosstalk Leakage Attacks on USB Hubs,” in 26th USENIX Security Symposium (USENIX Security 17), Vancouver, BC, 2017. [Online]. Available: https://www.usenix.org/conference/usenixsecurity17/technical-sessions/presentation/su
[48] “COTTONMOUTH-I,” https://nsa.gov1.info/dni/nsa-ant-catalog/usb/index.html#COTTONMOUTH-I, 2008.
[49] “COTTONMOUTH-II,” https://nsa.gov1.info/dni/nsa-ant-catalog/usb/index.html#COTTONMOUTH-II, 2008.
[50] mich, “Inside a low budget consumer hardware espionage implant,” https://ha.cking.ch/s8_data_line_locator/, 2017.
[51] ALLOYSEED, “GIM Answer Monitor USB Charging Data Cable GPS Locator,” https://www.aliexpress.com/item/1m-GPS-Positioning-Pick-up-Line-Tracker-Remote-Tracking-Cable-GIM-Answer-Monitor-USB-Charging-Data/32813314360.html?trace=msiteDetail2pcDetail, 2017.
[52] USBKiller, “Usbkiller,” https://www.usbkill.com/, 2016.
[53] Benson Leung, “Surjtech’s 3M USB A-to-C cable completely violates the USB spec. Seriously damaged my ,” https://www.amazon.com/review/R2XDBFUD9CTN2R/ref=cm_cr_rdp_perm, 2016.
[54] F. L. Sang, V. Nicomette, and Y. Deswarte, “I/O attacks in Intel PC-based architectures and countermeasures,” in SysSec Workshop (SysSec), 2011 First. IEEE, 2011, pp. 19–26.
[55] Z. Zhou, M. Yu, and V. D. Gligor, “Dancing with giants: Wimpy kernels for on-demand isolated I/O,” in Proceeding of the 2014 IEEE Symposium on Security and Privacy (S&P), 2014.

[56] A. T. Markettos, C. Rothwell, B. F. Gutstein, A. Pearce, P. G. Neumann, S. W. Moore, and R. N. Watson, “Thunderclap: Exploring vulnerabilities in operating system iommu protection via dma from untrustworthy peripherals,” in Network and Distributed Systems Security (NDSS) Symposium, 2019.
[57] P. C. Johnson, S. Bratus, and S. W. Smith, “Protecting Against Malicious Bits On the Wire: Automatically Generating a USB Protocol Parser for a Production Kernel,” in Proceedings of the 33rd Annual Computer Security Applications Conference (ACSAC), 2017.
[58] S. Angel, R. S. Wahby, M. Howald, J. B. Leners, M. Spilo, Z. Sun, A. J. Blumberg, and M. Walfish, “Defending against Malicious Peripherals with Cinch,” in Proceedings of the 25th USENIX Security Symposium, 2016.
[59] trifinite.group, “trifinite,” https://trifinite.org/, 2004.
[60] A. Laurie, M. Holtmann, and M. Herfurt, “Hacking Bluetooth enabled mobile phones and beyond - Full Disclosure,” BlackHat Europe, 2005.
[61] M. Herfurt and C. Mulliner, “Blueprinting: Remote Device Identification based on Bluetooth Fingerprinting Techniques,” in 21st Chaos Communication Congress (21C3), Dec. 2004.
[62] L. Carettoni, C. Merloni, and S. Zanero, “Studying Bluetooth malware propagation: The Bluebag project,” IEEE Security & Privacy, vol. 5, no. 2, 2007.
[63] A. Laurie, M. Holtmann, and M. Herfurt, “Bluetooth Hacking: The State of the Art,” BlackHat Europe, 2006.
[64] M. Herfurt, “Bluetooth Security,” What the Hack Conference, 2005.
[65] F. Xu, W. Diao, Z. Li, J. Chen, and K. Zhang, “Badbluetooth: Breaking android security mechanisms via malicious bluetooth peripherals,” in Proceedings of the 26th Annual Network and Distributed System Security Symposium (NDSS19), San Diego, CA, 2019.
[66] Near Field Communication Forum, Inc., “Core Protocol Technical Specifications,” https://nfc-forum.org/our-work/specifications-and-application-documents/specifications/protocol-technical-specifications/, 2018.
[67] R. Verdult and F. Kooman, “Practical Attacks on NFC Enabled Cell Phones,” in Proceedings of the 3rd International Workshop on Near Field Communication (NFC), 2011.
[68] C. Miller, “Exploring the nfc attack surface,” Proceedings of Blackhat, 2012.
[69] S. Baghdasaryan, “[v3,2/4] NFC: Fix possible memory corruption when handling SHDLC I-Frame commands,” https://patchwork.kernel.org/patch/10378895/, May 2018.

134 [70] S. Maruyama, S. Wakabayashi, and T. Mori, “Tap ’n ghost: A compilation of novel attack techniques against smartphone touchscreens,” in 2019 IEEE Symposium on Security and Privacy (SP). Los Alamitos, CA, USA: IEEE Computer Society, may 2019, pp. 628–645. [Online]. Available: https://doi.ieeecomputersociety.org/10.1109/SP.2019.00037 [71] S. McCanne and V. Jacobson, “The BSD Packet Filter: A New Architecture for User-level Packet Capture,” in USENIX winter, vol. 93, 1993. [72] J. Corbet, “Extending extended BPF,” Linux Weekly News, 2014. [73] J. Kicinski and N. Viljoen, “eBPF Hardware Offload to SmartNICs: cls bpf and XDP,” Proceedings of netdev, vol. 1, 2016. [74] J. Schulist, D. Borkmann, and A. Starovoitov, “Linux Socket Filtering aka (BPF),” https://www.kernel.org/doc/Documentation/networking/filter. txt, 2018. [75] A. Caudill and B. Wilson, “Phison 2251-03 (2303) Custom Firmware & Existing Firmware Patches (BadUSB),” GitHub, vol. 26, Sep. 2014. [76] D. J. Tian, A. Bates, and K. Butler, “Defending Against Malicious USB Firmware with GoodUSB,” in Proceedings of the 31st Annual Computer Security Applications Conference (ACSAC), 2015. [77] OLEA Kiosks, Inc., “Malware Scrubbing Cyber Security Kiosk,” http://www.olea. com/product/cyber-security-kiosk/, 2015. [78] OPSWAT, “Metascan,” https://www.opswat.com/products/metascan, 2013. [79] Imation, “Ironkey,” http://www.ironkey.com/en-US/resources/, 2013. [80] S. Schumilo, R. Spenneberg, and H. Schwartke, “Don’t trust your USB! How to find bugs in USB device drivers,” in Blackhat Europe, Oct. 2014. [81] J. Lee, L. Bauer, and M. Mazurek, “The effectiveness of security images in internet banking,” Internet Computing, IEEE, vol. 19, no. 1, pp. 54–62, Jan 2015. [82] S. Poeplau and J. Gassen, “A Honeypot for Arbitrary Malware on USB Storage Devices,” in 7th International Conference on Risk and Security of Internet and Systems, ser. CRiSIS ’12, Oct. 2012. [83] P. 
Zaitcev, “The usbmon: USB Monitoring Framework,” http://people.redhat.com/ zaitcev/linux/OLS05 zaitcev.pdf, 2005. [84] Open Source Security,Inc, “grsecurity,” https://grsecurity.net/, 2013. [85] Hak5, “USB Rubber Ducky Payloads,” https://github.com/hak5darren/ USB-Rubber-Ducky/wiki/Payloads, 2013.

135 [86] yubico, “yubikey,” https://www.yubico.com/products/yubikey-hardware/, 2015. [87] D. J. Tian, N. Scaife, A. Bates, K. R. B. Butler, and P. Traynor, “Making USB Great Again with USBFILTER,” in Proceedings of the 25th USENIX Security Symposium, 2016. [88] S. Smalley, C. Vance, and W. Salamon, “Implementing SELinux as a Linux security module,” NAI Labs Report, vol. 1, p. 43, 2001. [89] W. Enck, P. McDaniel, and T. Jaeger, “PinUP: Pinning user files to known applications,” in Computer Security Applications Conference, 2008. ACSAC 2008. Annual. ieeexplore.ieee.org, Dec. 2008, pp. 55–64. [Online]. Available: http://dx.doi.org/10.1109/ACSAC.2008.41 [90] T. Hirofuchi, E. Kawai, K. Fujikawa, and H. Sunahara, “USB/IP-A peripheral bus extension for device sharing over IP network,” in Proceedings of the annual conference on USENIX Annual Technical Conference, 2005, pp. 42–42. [Online]. Available: https://www.usenix.org/legacy/event/usenix05/tech/freenix/hirofuchi/hirofuchi.pdf [91] The Netfilter Core Team, “The Netfilter Project: Packet Mangling for Linux 2.4,” http://www.netfilter.org/, 1999. [Online]. Available: http://crypto.stanford.edu/∼cao/lineage.html [92] J. P. Anderson, “Computer Security Technology Planning Study,” Air Force Electronic Systems Division, Tech. Rep. ESD-TR-73-51, 1972. [93] R. Sailer, X. Zhang, T. Jaeger, and L. van Doorn, “Design and Implementation of a TCG-based Integrity Measurement Architecture,” in Proceedings of the USENIX Security Symposium, 2004. [94] B. Hicks, S. Rueda, L. St Clair, T. Jaeger, and P. McDaniel, “A logical specification and analysis for selinux mls policy,” ACM Transactions on Information and System Security (TISSEC), vol. 13, no. 3, p. 26, 2010. [95] Hewlett-Packard, Intel, LSI, Microsoft, NEC, , and ST-Ericsson, “Wireless Universal Serial Bus Specification 1.1,” September 2010. [96] U. I. Forum, “Media Agnostic Universal Serial Bus Specification, Release 1.0a,” July 2015. [97] D. 
Diaz et al., “The GNU Prolog web site,” http://gprolog.org/. [98] D. Genkin, A. Shamir, and E. Tromer, “RSA key extraction via Low-Bandwidth acoustic cryptanalysis,” in Advances in Cryptology – CRYPTO 2014, ser. Lecture Notes in Computer Science. Springer Berlin Heidelberg, 17 Aug. 2014, pp. 444–461. [Online]. Available: http://link.springer.com/chapter/10.1007/978-3-662-44371-2 25 [99] PJRC, “Teensy 3.1,” https://www.pjrc.com/teensy/teensy31.html, 2013.

136 [100] E. Kustarz, S. Shepler, and A. Wilson, “The New and Improved FileBench Benchmarking Framework,” in Proceedings of the USENIX Conference and File and Storage Technologies (FAST), 2008, wiP. [101] A. Tirumala, F. Qin, J. Dugan, J. Ferguson, and K. Gibbs, “Iperf: The tcp/udp bandwidth measurement tool,” htt p://dast. nlanr. net/Projects, 2005. [102] A. Kivity, Y. Kamay, D. Laor, U. Lublin, and A. Liguori, “kvm: the linux virtual machine monitor,” in Proceedings of the Linux symposium, vol. 1, 2007, pp. 225–230. [103] Basemark, Inc., “Basemark browsermark,” http://web.basemark.com/, 2015. [104] T. Kojm, “Clamav,” 2004. [105] J. Erdfelt and D. Drake, “Libusb homepage,” Online, http://www. libusb. org. [106] C. Welch, “Apple’s USB Restricted Mode: how to use your iPhones latest security feature,” https://www.theverge.com/2018/7/10/17550316/ apple-iphone-usb-restricted-mode-how-to-use-security, Jul. 2018. [107] O. Afonin, “This $39 Device Can Defeat iOS USB Restricted Mode,” https: //blog.elcomsoft.com/2018/07/this-9-device-can-defeat-ios-usb-restricted-mode/, Jul. 2018. [108] P. Stewin and I. Bystrov, “Understanding DMA Malware,” in Proceedings of the Conference on Detection of Intrusions and Malware & Vulnerability Assessment (DIMVA), 2012.

[109] Intel Corporation, “Intel R Virtualization Technology for Directed I/O: Architecture Specification,” Tech. Rep., Jun. 2018.

[110] J. Greene, “Intel R Trusted Execution Technology,” Intel Corporation, Tech. Rep., 2012. [111] S. Smalley, C. Vance, and W. Salamon, “Implementing SELinux as a Linux Security Module,” Tech. Rep., Dec. 2001, nAI Labs Report 01-043. [112] J. P. Anderson, “Computer Security Technology Planning Study, ESD-TR-73-51, Vol. 1,” Air Force Systems Command: Electronic Systems Division, Tech. Rep., Oct. 1972. [113] C. Wright, C. Cowan, S. Smalley, J. Morris, and G. Kroah-Hartman, “Linux Security Modules: General Security Support for the Linux Kernel,” in Proceedings of the 11th USENIX Security Symposium, 2002. [114] A. Staravoitov, “[RFC,net-next,08/14] bpf: add eBPF verifier,” https://lore.kernel. org/patchwork/patch/477364/, Jun. 2014, kernel Patch. [115] E. Cree, “[RFC/PoC PATCH bpf-next 00/12] bounded loops for eBPF,” https: //www.mail-archive.com/[email protected]/msg218182.html, Feb. 2018.

137 [116] American National Standards Institute (ANSI), “ANSI X3.159-1989: Programming Language C,” Tech. Rep., 1989. [117] D. Borkmann, “[PATCH net-next 3/4] bpf: add support for persistent maps/progs,” Oct. 2015, LKML Archive. [Online]. Available: https://lore.kernel.org/lkml/ ab1fceb2d68876d89bb2ebb3d2b45486d3cf2388.1444956943.git.daniel@iogearbox.net/ [118] T. E. Hart, P. E. McKenney, A. D. Brown, and J. Walpole, “Performance of memory reclamation for lockless synchronization,” Journal of Parallel and Distributed Computing, vol. 67, no. 12, pp. 1270–1285, Dec. 2007. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S074373150700069X [119] K. Cook, “Linux kernel aslr (kaslr),” Linux Security Summit, vol. 69, 2013. [120] A. V. Aho, R. Sethi, and J. D. Ullman, “Compilers: principles, techniques, and tools,” Addison Wesley, vol. 7, no. 8, p. 9, 1986. [121] G. J. Chaitin, “Register allocation & spilling via graph coloring,” in Proceedings of the SIGPLAN Symposium on Compiler Construction, 1982. [Online]. Available: http://doi.acm.org/10.1145/800230.806984 [122] GoodFET, “Facedancer21,” http://goodfet.sourceforge.net/hardware/facedancer21, 2018. [123] SyncStop, “The Original USB Condom,” https://shop.syncstop.com/products/ usb-condom?variant=35430087052, 2018. [124] T. Remple and A. Burns, “Battery Charging Specification: Revision 1.2,” Tech. Rep., Dec. 2010. [125] Android Developers, “Bluetooth low energy overview,” https://developer.android. com/guide/topics/connectivity/bluetooth-le, Apr. 2018. [126] M. Krasnyansky and M. Holtmann, “BlueZ: Official Linux Bluetooth protocol stack,” http://www.bluez.org/, 2002. [127] A. Borg, S. N, and P. Uttarwar, “Can BLE be turned on while Bluetooth Classic is off on an Android device?” https://www.quora.com/ Can-BLE-be-turned-on-while-Bluetooth-Classic-is-off-on-an-Android-device, 2016. [128] B. 
Seri, “Bluetooth: Properly check L2CAP config option output buffer length,” https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id= e860d2c904d1a9f38a24eb44c9f34b8f915a6ea3, Sep. 2017, kernel Patch. [129] ETSI Technical Committee Smart Card Platform (SCP), “ETSI TS 102 622 V10.2.0: Smart Cards; UICC-Contactless Front-end (CLF) Interface; Host Controller Interface (HCI) (Release 10),” Tech. Rep., Mar. 2011.

138 [130] Near Field Communication Forum, Inc., “NFC Controller Interface (NCI) Specification: NCI 1.0,” Tech. Rep., Nov. 2012. [131] ——, “NFC Digital Protocol: Digital 1.0,” Tech. Rep., Nov. 2010. [132] “PyBluez: Bluetooth Python extension module,” https://github.com/pybluez/ pybluez, 2018. [133] M. Krasnyansky and M. Holtmann, “l2ping.c,” https://github.com/pauloborges/ bluez/blob/master/tools/l2ping.c, 2002. [134] L. McVoy and C. Staelin, “lmbench: Portable tools for performance analysis,” in Proceedings of the USENIX Annual Technical Conference (ATC), 1996. [135] A. Starovoitov, “BPF in LLVM and kernel,” Linux Plumbers Conference, 2015. [136] D. Borkmann, “On getting tc classifier fully programmable with cls bpf.” tc, no. 1/23, 2016. [137] A. Kobayashi, “Displayport (tm) ver. 1.2 overview.” [138] Y. Hayakawa, “eBPF Implementation for FreeBSD,” https://www.bsdcan.org/2018/ schedule/track/Hacking/963.en.html, 2018. [139] Windows Dev Center, “Windows Filtering Platform,” https://docs.microsoft.com/ en-us/windows/desktop/fwp/windows-filtering-platform-start-page, 2018. [140] USB 3.0 Promoter Group, “Universal Serial Bus Type-C Authentication Specification, Revision 1.0,” March 2016. [141] Siliconch Systems, “USB Type-C Authentication IP,” http://www.siliconch.com/ authentication.html, 2017. [142] Renesas, “Renesas Electronics Delivers R9J02G012 Controller That Enables Device-to-Device Authentication in Support of Safer USB Power Delivery Ecosystem,” https://www.renesas.com/en-hq/about/press-center/news/2017/news20170530.html, 2017. [143] B. Blanchet, V. Cheval, X. Allamigeon, and B. Smyth, “ProVerif: Cryptographic protocol verifier in the formal model,” URL http://prosecco. gforge. inria. fr/personal/bblanche/proverif, 2010. [144] N. Kobeissi, K. Bhargavan, and B. 
Blanchet, “Automated verification for secure messaging protocols and their implementations: A symbolic and computational approach,” in Proceeding of the 2017 IEEE European Symposium on Security and Privacy (EuroS&P). IEEE, 2017, pp. 435–450. [145] K. Bhargavan, B. Blanchet, and N. Kobeissi, “Verified models and reference implementations for the TLS 1.3 standard candidate,” in Proceeding of the 2017 IEEE Symposium on Security and Privacy (S&P), 2017.

139 [146] D. J. Tian, A. Bates, K. R. B. Butler, and R. Rangaswami, “ProvUSB: Block-level provenance-based data protection for USB storage devices,” in Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (CCS’16), 2016. [147] G. Hernandez, F. Fowze, D. J. Tian, T. Yavuz, and K. R. B. Butler, “FirmUSB: Vetting USB device firmware using domain informed symbolic execution,” in Proceedings of the 2017 ACM Conference on Computer and Communications Security (CCS’17), 2017. [148] K. R. B. Butler, S. E. McLaughlin, and P. D. McDaniel, “Kells: a protection framework for portable data,” in Proceedings of the 26th Annual Computer Security Applications Conference (ACSAC’10), 2010.

BIOGRAPHICAL SKETCH

Dave (Jing) Tian received his Ph.D. from the Computer and Information Science and Engineering (CISE) department at the University of Florida in 2019. He is a founding member of FICS (Florida Institute for Cybersecurity) Research, supervised by Dr. Kevin Butler. His research involves system infrastructure, security, and storage. His interests are Linux kernel hacking, compilers, and machine learning systems. Before this, he was a Ph.D. student in the Computer and Information Science (CIS) department at the University of Oregon and a member of the OSIRIS lab, supervised by Dr. Kevin Butler. He also spent a year working on artificial intelligence, machine learning, data mining, and the Semantic Web as a member of the AIM lab, supervised by Dr. Dejing Dou. Before that, he worked as a software engineer in the R&D (former Lucent Technologies) Linux Control Platform group in Qingdao, China, for around 4 years. He received his B.Sc. degree from Qingdao University of Technology and his M.E. degree from Ocean University of China. Following graduation, Dave accepted an appointment to a tenure-track position as Assistant Professor in the Department of Computer Science at Purdue University.
