Apple Picking in the Ios Bluetooth Stack
Total Page:16
File Type:pdf, Size:1020Kb
ToothPicker: Apple Picking in the iOS Bluetooth Stack Dennis Heinze Jiska Classen Secure Mobile Networking Lab Secure Mobile Networking Lab TU Darmstadt / TU Darmstadt ERNW GmbH Matthias Hollick Secure Mobile Networking Lab TU Darmstadt Abstract thus, rendering feedback-based fuzzing difficult. Moreover, crafting arbitrary packets and transmitting them for Bluetooth Bluetooth enables basic communication prior to pairing as Low Energy (BLE) as well as Classic Bluetooth has limited well as low-energy information exchange with multiple de- tool support. As of now, there is no full-stack open-source vices. The Apple ecosystem is extensively using Bluetooth Software-Defined Radio (SDR) implementation. Thus, secu- for coordination tasks that run in the background and enable rity researchers need to use commercial tools by Ellisys start- seamless device handover. To this end, Apple established ing at US$ 10 k or extend InternalBlue [22], which is based proprietary protocols. Since their implementation is closed- on reverse-engineered firmware on off-the-shelf smartphones. source and over-the-air fuzzers are very limited, these proto- During our research, InternalBlue proofed as powerful tool cols are largely unexplored and not publicly tested for security. for bug verification but the overall nature of Apple’s Bluetooth In this paper, we summarize the current state of Apple’s Blue- stack requires a custom fuzzing solution. tooth protocols. Based on this, we build the iOS in-process We design and implement ToothPicker, an in-process Blue- fuzzer ToothPicker and evaluate the implementation security tooth daemon fuzzer running on iOS 13. To this end, we solve of these protocols. We find a zero-click Remote Code Execu- various harnessing challenges, such as creating virtual connec- tion (RCE) that was fixed in iOS 13.5 and simple crashes. tions, partial chip abstraction, selecting protocol handlers, ob- taining suitable corpora for undocumented protocols as well 1 Introduction as getting crash and coverage feedback of this closed-source target. ToothPicker is based on F R IDA and radamsa [16, 28] In the past, Bluetooth was mainly used as an audio-only tech- and harnesses the Bluetooth daemon while leaving the re- nology. Nowadays, it is used by integral operating system ser- maining interaction with iOS components intact. The main vices that are permanently running in the background. Within contributions of this paper are as follows: the Apple ecosystem, most of these are part of the Continuity • We implement ToothPicker, an iOS in-process Bluetooth framework [5]. Examples are searching for devices before fuzzer running on a physical iPhone. sharing files via AirDrop [31], the initial detection of an Apple • We provide an up-to-date Bluetooth protocol overview Watch by macOS to start the Auto Unlock protocol and unlock- for iOS, macOS, and RTKit. ing a Mac, or beginning to write an email on a mobile device and later continuing this on a desktop via Handoff [14, 23]. • We uncover various issues in Apple’s implementations Further features outside of the Continuity framework include that we can reproduce over-the-air. Based on our find- pairing AirPods once and seamlessly using them on all de- ings, Apple fixed the Bluetooth RCE CVE-2020-9838 in vices that are logged into the same iCloud account [15]. With iOS 13.5 and the Denial of Service (DoS) CVE-2020- Exposure Notifications introduced due to SARS-CoV-2, Apple 9931 in iOS 13.6. expanded their rather closed ecosystem to work with Google Our results indicate that Apple never systematically fuzzed devices [12]. Even users with just an iPhone who do not de- their Bluetooth stack, with issues going back to iOS 5 that still pend on the Continuity framework might enable Bluetooth exist in iOS 13. However, fuzzing BLE/GATT, which is used for exposure notifications or audio streaming. by most Internet of Things (IoT) devices and supported by Testing these services over-the-air is cumbersome. Apple’s various Nordic Semiconductor based testing frameworks [9, Bluetooth stacks almost immediately disconnect upon receiv- 25], did not reveal any findings. Thus, ToothPicker indeed ing invalid packets. A disconnect is hard to distinguish from closes the gap between what is possible with over-the-air a crash because the Bluetooth service restarts within seconds, testing and fast in-process fuzzing for proprietary protocols. This paper is structured as follows. We provide background achieved by adding a trust-threshold on code that is not modi- information in Section 2. In Section 3, we start with a Tooth- fied during runtime, and, thus, has reusable blocks. The stalker Picker design overview, identify and prioritize various target observes basic block coverage, as required for feedback-based protocols, and then harness the iOS Bluetooth daemon for fuzzing. If exceptions occur while running the code copy, fuzzing. We evaluate ToothPicker in Section 4. The fuzzing F R IDA catches these and the original program does not crash. results and malicious payloads are provided in Section 5. Fi- nally, we conclude our work in Section 6. Even though a detailed understanding of Apple’s proprietary protocols is not 3 ToothPicker required for fuzzing per se, they have not been documented before and we provide descriptions in the Appendix. In the following, we design and implement ToothPicker. An overview of the ToothPicker architecture and its specifics is depicted in Figure 1. First, we provide an intuition about its 2 Background on Fuzzing Apple Bluetooth design in Section 3.1. Then, we analyze various Bluetooth- based protocols and prioritize them as fuzzing targets in Sec- In the following, we explain why we chose the iOS Bluetooth tion 3.2. In Section 3.3, we select corpora for these protocols stack and provide a background on the fuzzing options. and describe adaptions required for harnessing these. Finally, we describe the fuzzer operation in Section 3.4, including its 2.1 Selecting a Bluetooth Stack virtual connection management in Section 3.5. Apple implements three different Bluetooth stacks. One is for iOS and its derivates. macOS uses another stack with dupli- 3.1 General Design and Concepts cate protocol implementations that behave slightly different. Embedded devices, such as the AirPods, use the RTKit stack. The main challenge in fuzzing the iOS Bluetooth daemon is The embedded RTKit stack is the least accessible. In con- harnessing it while maintaining sufficient state to find realis- trast, the macOS stack has already been tested recently with se- tic bugs. ToothPicker achieves this by attaching itself to the vere security issues uncovered [11]. We find that the iOS stack running daemon with F R IDA [28]. However, fuzzing requires experienced only little research but implements the majority further extensions, as running on a physical iPhone has var- of Apple’s proprietary protocols. With the checkra1n jail- ious side effects due to the remaining interactions with the break [18], it becomes fairly accessible on research iPhones. Bluetooth chip, other daemons, and apps. To this end, Tooth- Picker creates virtual connections, partially abstracts between 2.2 iOS Fuzzing Options the physical Bluetooth chip and the changed behavior within bluetoothd, and gets coverage and crash feedback during Apple internally tests their software including fuzzing but it fuzzing. Moreover, as fuzzing speed is limited, we have to is unknown how exactly they do this. With neither the source understand Apple’s proprietary protocols sufficiently to obtain code nor the fuzzing methods and targets being public, secu- meaningful corpora and select interesting protocol handlers. rity of Apple’s software should be tested by independent re- Chip interaction can be harmful for the overall fuzzing searchers. Apple keeps bluetoothd closed-source, meaning process. For example, if a protocol that requires an active that only they are able to rebuild it with fuzzing instrumenta- connection is fuzzed but the chip is not aware of such a con- tion such as edge coverage and memory sanitization. As this nection, it reports the connection to be terminated. Yet, it is binary is rather complex, statically patching it without source complicated to fully abstract from the chip, as bluetoothd code is out of scope, as there is currently no such tool for iOS requires it during startup and other complex operations. Thus, ARM64e binaries without symbols in Mach-O format. ToothPicker uses virtual connections and filters communica- In general, there are multiple tools for dynamic iOS analy- tion with the chip concerning these connections manually. sis. The iOS 12 kernel can be booted into an interactive shell Another important factor are the daemons and apps that using Quick Emulator (QEMU) but without any daemons remain interacting with bluetoothd while it is being fuzzed. running [1]. Moreover, KTRW enables iOS 13 kernel debug- First of all, they might crash as well due to the fuzzing, with ging at runtime [7]. However, bluetoothd runs in user space. exceptions that cannot be caught by F R IDA but are still cap- In contrast to the previous options, F R IDA is a dynamic in- tured within the iOS crash reports. Second, anything happen- strumentation framework that enables function hooking and ing on the iPhone that also affects Bluetooth changes the coverage collection by rewriting instructions of user-space fuzzing behavior. Simple instructions such as opening the programs during runtime [28]. Thus, ToothPicker is based on Bluetooth device dialog already crash bluetoothd immedi- F R IDA and the fuzzing-specific frizzer extension [21]. ately. Moreover, the chip might still receive BLE advertise- The F R IDA stalker follows program flow during run- ments or connection requests from other devices and forward time [27, 29]. To this end, F R IDA copies code prior to execu- them to bluetoothd. All this parallelism and statefulness of- tion, modifies that copy, and runs it.