CAB-Fuzz: Practical Concolic Testing Techniques for COTS Operating Systems

CAB-Fuzz: Practical Concolic Testing Techniques for COTS Operating Systems Su Yong Kim, The Affiliated Institute of ETRI; Sangho Lee, Insu Yun, and Wen Xu, Georgia Tech; Byoungyoung Lee, Purdue University; Youngtae Yun, The Affiliated Institute of ETRI; Taesoo Kim, Georgia Tech https://www.usenix.org/conference/atc17/technical-sessions/presentation/kim This paper is included in the Proceedings of the 2017 USENIX Annual Technical Conference (USENIX ATC ’17). July 12–14, 2017 • Santa Clara, CA, USA ISBN 978-1-931971-38-6 Open access to the Proceedings of the 2017 USENIX Annual Technical Conference is sponsored by USENIX. CAB-FUZZ: Practical Concolic Testing Techniques for COTS Operating Systems Su Yong Kim∗§ Sangho Lee† Insu Yun† Wen Xu† Byoungyoung Lee¶ Youngtae Yun∗ Taesoo Kim† ∗The Affiliated Institute of ETRI †Georgia Institute of Technology ¶Purdue University Abstract explores all feasible states in a program and thoroughly checks whether a certain security property can be vio- Discovering the security vulnerabilities of commercial lated. In particular, it has shown its effectiveness for off-the-shelf (COTS) operating systems (OSes) is chal- small applications and/or applications with source code. lenging because they not only are huge and complex, but For example, Avgerinos et al. [1] found more than 10,000 also lack detailed debug information. Concolic testing, bugs in about 4,000 small applications. Also, Ramos which generates all feasible inputs of a program by using and Engler [40] found 67 bugs in various open-source symbolic execution and tests the program with the gen- projects, such as BIND, OpenSSL, and the Linux kernel. erated inputs, is one of the most promising approaches to solve this problem. Unfortunately, the state-of-the-art However, concolic testing does not scale well for com- concolic testing tools do not scale well for testing COTS plex and large software [5,8, 13, 48], such as commer- OSes because of state explosion. Indeed, they often fail cial off-the-shelf (COTS) operating systems (OSes). The to find a single bug (or crash) in COTS OSes despite their complete concolic execution of COTS OSes would never long execution time. terminate in a reasonable amount of time due to the well- known limitation of the symbolic execution, state (or In this paper, we propose CAB-FUZZ (Context-Aware and Boundary-focused), a practical concolic testing tool path) explosion, where the number of feasible program to quickly explore interesting paths that are highly likely states increases exponentially (e.g., once reaching a loop triggering real bugs without debug information. First, statement). Since the COTS OSes have massive implementation complexity, testing using symbolic execution CAB-FUZZ prioritizes the boundary states of arrays and loops, inspired by the fact that many vulnerabilities ends up exploring a very small portion of program states, originate from a lack of proper boundary checks. Sec- i.e., it cannot test deep execution paths. ond, CAB-FUZZ exploits real programs interacting with Moreover, a proprietary nature of COTS OSes prevents COTS OSes to construct proper contexts to explore deep concolic testing from exploring program states with pre- and complex kernel states without debug information. We contexts. Unlike the open-source kernel for which the applied CAB-FUZZ to Windows 7 and Windows Server internal documentation and all test suites are publicly 2008 and found 21 undisclosed unique crashes, includ- available [41,50], COTS OSes do not provide such com- ing two local privilege escalation vulnerabilities (CVE- prehensive information. Although manual annotation on 2015-6098 and CVE-2016-0040) and one information the interface can help increase code coverage and detect disclosure vulnerability in a cryptography driver (CVE- logical bugs [27], it also does not scale. For these rea- 2016-7219). CAB-FUZZ found vulnerabilities that are sons, concolic execution on COTS OSes cannot explore non-trivial to discover; five vulnerabilities have existed program states that are only reachable after undergoing for 14 years, and we could trigger them even in the initial complex runtime operations. version of Windows XP (August 2001). In this paper, we propose CAB-FUZZ (Context-Aware 1 Introduction and Boundary-focused), a practical system specialized to detect vulnerabilities in COTS OSes based on concolic Concolic testing is a well-known approach to automati- testing. First, to overcome the scalability limitation of cally detect software vulnerabilities [8]. Empowered by concolic testing, CAB-FUZZ prioritizes states likely hav- its symbolic interpretation of the input, it generates and ing vulnerabilities. This prioritization is based on the §This work is done while this author was a visiting scholar in observation that a majority of critical security bugs (e.g., Georgia Institute of Technology. memory corruption and information disclosure) originate USENIX Association 2017 USENIX Annual Technical Conference 689 from a lack of proper boundary checks. This is why com- untrusted interface for every product, and their fuzzing pilers and even hardware have adopted boundary-check solution has been running 24/7 since 2007 for a total of mechanisms, such as SoftBound [37], SafeStack [28], and over 500 machine years [3]. However, despite this ef- Intel Memory Protection Extensions (MPX) [20]. There- fort, CAB-FUZZ was able to discover 14-year-old bugs fore, we instruct CAB-FUZZ to generate and explore the in Windows’ kernel device drivers (§5.3). boundary states of arrays and loops first, thereby detect- This paper makes the following contributions. ing vulnerabilities as early as possible before exploding in terms of program states. • Practical Techniques. CAB-FUZZ makes concolic Second, to construct pre-contexts of COTS OSes with- testing practical by addressing its two important out detailed debug information, CAB-FUZZ refers to real challenges: state explosion and missing execution programs as a concolic-execution template. Since such contexts. CAB-FUZZ prioritizes boundary condi- a program frequently interacts with the COTS OSes to tions to trigger a crash before explosion and refers perform a certain operation, it embodies sufficient infor- to a real application to construct proper execution mation and logic that constructs pre-contexts for using OS contexts. functions. Thus, if CAB-FUZZ runs a real program until • Evaluation and In-depth Analysis. We analyzed it calls any target OS function that we are interested in, the implementation of COTS OSes in detail to fig- CAB-FUZZ is able to prepare with proper pre-contexts ure out why CAB-FUZZ was able to detect their to initiate concolic testing correctly. vulnerabilities effectively compared to conventional techniques. We implemented CAB-FUZZ based on a popular concolic testing tool, S2E [10], and evaluated it with two • Real-world Impact. CAB-FUZZ discovered 21 popular COTS OSes, Windows 7 and Windows Server unique crashes of device drivers for Windows 7 and 2008, especially for the 274 device drivers shipped with Windows Server 2008. We reported all reproducible them. Since our approaches are general and independent crashes to the vendors. They confirmed that four of the OS, we believe they can be applied to currently of the reported crashes were critical and fixed them. unsupported OSes in the future. Specifically, two of them were privilege escalation vulnerabilities and one was an information disclo- In total CAB-FUZZ discovered 21 unique crashes of sure vulnerability in a cryptography driver. six device drivers developed by Microsoft and ESET (§5). Among them we reported six reproducible crashes to Mi- The rest of this paper is organized as follows. §2 de- crosoft and one reproducible crash to ESET. Microsoft scribes the challenges of performing concolic testing for confirmed that three of them were undisclosed vulnera- COTS OSes. §3 depicts CAB-FUZZ and §4 describes its bilities and could be abused by a guest account for lo- implementation. §5 evaluates CAB-FUZZ’s vulnerability- cal privilege escalation (CVE-2015-6098 and CVE-2016- finding effectiveness. §6 discusses the various aspects of 0040) and information disclosure in a cryptography driver CAB-FUZZ including its limitations, and §7 presents re- (CVE-2016-7219). Especially, the later vulnerability even lated work. §8 concludes the paper. existed in the latest versions of Windows (Windows 10 and Windows Server 2016). Microsoft acknowledged the 2 Challenges for COTS OSes other three reports demanding administrator privilege and This section elaborates on the challenges involved in ESET fixed the bug we reported. performing concolic testing for COTS OSes to clearly This evaluation result arguably demonstrates the ef- motivate our proposed system, CAB-FUZZ. fectiveness of CAB-FUZZ in finding vulnerabilities in 2.1 Binary COTS OSes despite its lack of completeness. CAB-FUZZ Automated binary analysis is necessary for production may not be able to trigger sophisticated bugs unrelated to software (e.g., COTS OS) because (1) it usually contains boundary states. However, because of the fundamental third-party binaries and libraries without source code, (2) scalability limitation of concolic testing, complete con- its behavior can be changed due to compiler optimization colic testing is infeasible especially for large software. or linking, and (3) its code can be written with multiple One of the contributions of CAB-FUZZ is that it changes programming languages, making source code analysis the way we think

Load more