S3library: Automatically Eliminating C/C++ Buffer Overflow Using

S3Library: Automatically Eliminating C/C++ Buffer Overflow using Compatible Safer Libraries Kang Sun, Daliang Xu, Dongwei Chen, Xu Cheng, Dong Tong Department of Computer Science and Technology, Peking University {ajksunkang, xudaliang, chendongwei, tongdong}@pku.edu.cn, [email protected] Abstract buffer overflows. These contiguous overflows still cover over 40 percent in real-world exploitation [3]. Early designers of Annex K of C11, bounds-checking interfaces, recently in- standard C (specifically ANSI) over-trusted the programmers. troduced a set of alternative functions to mitigate buffer over- A host of vulnerable functions calls they provide, such as flows, primarily those caused by string/memory functions. gets, strcpy and memcpy, neither make bounds checking to However, poor compatibility limits their adoption. Failure determine whether the destination buffer is big enough nor oblivious computing can eliminate the possibility that an at- have the size information needed to perform such checks. tacker can exploit memory errors to corrupt the address space and significantly increase the availability of systems. Annex K of C11, Bounds-checking interfaces [19, 37, 45], In this paper, we present S3Library (Saturation-Memory- recently proposed a set of new, optional alternative library Access Safer String Library), which is compatible with functions that promote safer, more secure programming. The the standard C library in terms of function signature. Our apparent difference (we will explain in later section) is that s size technique automatically replaces unsafe deprecated memo- the APIs have _ suffix and take an additional argu- ry/string functions with safer versions that perform bounds ment explicitly passed by programmers. Intuitively, adopting checking and eliminate buffer overflows via boundless mem- these APIs in an existing code requires non-trivial modifi- ory. S3Library employs MinFat, a very compact pointer rep- cations leading to poor compatibility and guidance. This is resentation following the Less is More principle, to encode the main reason why the new APIs continue to be controver- metadata into unused upper bits within pointers. In addition, sial [12,26,38], despite almost a decade since its introduction. S3Library utilizes Saturation Memory Access to eliminate Furthermore, there is almost no viable conforming implemen- illegal memory accesses into boundless padding area. Even if tation [24] applying the bounds-checking interfaces without an out-of-bounds access is made, the faulty program will not considerable origin code changes. be interrupted. We implement our scheme within the LLVM Various modern techniques have been proposed to enforce framework on X86-64 and evaluate our approach on correct- memory safety, both statically and dynamically. Static analy- ness, security, runtime performance and availability. sis [25,41 –43] that automatically transforming C programs at source code level is hard to obtain a complete coverage because a certain type of size information is only available at run- arXiv:2004.09062v1 [cs.CR] 20 Apr 2020 1 Introduction time. Dynamic defense mechanisms [7, 13, 21, 22, 28, 30, 39] augment the original unmodified program with metadata Buffer overflows remain a major threat to the security of de- (bounds of live objects or allowed memory regions) and insert pendable computer systems today [44]. Applications written bounds checking against this metadata before every memory in low-level languages like assembly or C/C++ are prone to access for runtime detection. They all leave libraries uninstru- buffer overflow bugs. From 2008 to 2014, nearly 23 percent of mented and introduce manually written wrapper to maintain all severe software vulnerabilities were buffer errors, and 72 the compatibility, performing simple bounds checking before percent of buffer errors were serious [16]. In 2018, among the calling a real legacy function. However, all existing software- 16,556 security vulnerabilities recorded by the NIST National based bounds-checking solutions exhibit high performance Vulnerability Database [4], 2,368 (14.3%) were overflow vul- overhead (50-150%), preventing them from wide adoption in nerabilities. Meanwhile, buffer overflow is listed as the first production runs. Address Sanitizer [39] is currently better in position in Weaknesses CWE Top 25(2019) [1]. terms of usability, but it is built for debugging purposes and The danger inherent in the use of unsafe standard C library suffers from detecting non-contiguous out-of-bounds viola- calls, especially the string/memory functions, presents classic tions. Intel MPX [30] provides a promising hardware-assisted 1 full-stack technique, but its implementation is proved not as (a) Look-up Table (b) Fat Pointer (c) Tagged Pointer good enough as expected. Most of these approaches provide complete protection for buffer overflow violations, detecting second trie both contiguous and non-contiguous overflows, but have rela- LB UB key lock tively high runtime overhead. primary trie Performing bounds checking is costly due to large amounts of metadata management: the metadata describing the object bounds must be recorded, propagated and retrieved to check Object Object Object numerous times. Among these processes, the checking is the × × bottleneck. For each pointer dereference, metadata must be √ √ loaded from memory/in-pointer to verify the validation of the pointer. Once the pointer is out of bounds, it also gives Pointer Pointer Pointer A much pressure on the branch predictor and pipeline to handle Base exceptions. As a result, the checking process accounts for the Bound vast majority of execution time. In this paper, we propose an interesting idea in the explo- ration space to concentrate only on buffer overflows caused by Figure 1: Pointer-based approach. highly-critical memory/string functions, rather than bounds checking on each memory reference. With this “partial” memory safety, we wonder what trade-off we can achieve between 2 Background security and overhead. Meanwhile, we propose a feasible implementation of Safe C Library without any modification to 2.1 Memory Safety existing C programs. We present MinFat and S3Library, an interesting approach Enforcing Memory Sa f ety stops all memory corruption ex- to automatically replace unsafe deprecated functions like ploits. Existing runtime techniques that guarantee spatial and strcpy with safer versions that perform bounds checking and temporal memory safety can be broadly categorized into two eliminate buffer overflows via boundless memory. MinFat classes: object-based approach and pointer-based approach. is based on the tagged pointer [20, 40] scheme that trans- Object-based approaches [18, 35, 39] associate metadata with parently encodes bounds meta information of buffer (stack, the location of the object in memory, not with each pointer. heap and global variables) within the pointer itself. We fol- The significant drawback is that its implementations are gener- low the principle of Less is More and adopts a very compact ally incomplete because they are unable to provide an accurate encoding scheme inspired by BaggyBounds [7] with the min- bound information for each object. However, an accurate size imum bit-width that allows an effective way to retrieve object information is necessary for the implementation of Safe C bounds. S3Library retains the APIs compatibility with legacy Library to perform bounds checks. Besides, object-based ap- functions, and performs the same bounds checking as Safe proaches are not suitable for non-contiguous buffer overflow C Library. The property of MinFat trades memory for per- detection also due to the lack of accurate bounds information. formance and adds boundless padding [11, 32–34] to every In this section, we focus on eliminating spatial memory object. These boundless memory blocks support Saturation errors (i.e., buffer overflows) using the pointer-based ap- Memory Access (SMA) to isolate the memory errors within proach, which is considered as the only way to support com- S3Library in case the runtime-constraint violation occurs. prehensive memory safety [27]. Pointer-based approaches Overall, this paper makes the following contributions: [7, 13, 21, 22, 28, 30, 47] attach metadata with every pointer, • To the best of our knowledge, S3Library is the first run- bounding the region of memory that pointer can legitimately time solution that applies implementations of Safe C dereference. As presented in Figure1, pointer-based ap- Library without any modification of origin codes. proaches can be categorized into the following three classes according to how current designs store and use metadata. • A thorough analysis and evaluation of the overhead using Look-up Table scheme: SoftBound [28, 29]. This class, MinFat Pointer on selective functions. such as SoftBound, records base and bound metadata in a • We present a buffer overflow elimination mechanism disjoint metadata facility that is accessed by explicit table within Safe C Library and evaluate its performance in look-ups. Figure1(a) shows the way how SoftBound+CETS Section 5.3. organizes pointer metadata in a two-level trie for quick search- ing purpose. With table look-ups, the accurate bound infor- • An LLVM-based prototype of our design implemented in mation can be obtained in time whenever the pointer needs. X86-64 architecture environment, evaluated with respect Unfortunately, this look-up table scheme cannot be considered to security, availability and runtime performance. safe in multithread environments.

Load more