Runtime Checking C Programs

Runtime Checking C Programs Reed Milewicz Rajesh Vanka James Tuck University of Alabama at Matlab North Carolina State Birmingham 3 Apple Hill Dr University Department of Computer and [email protected] Department of Electrical and Information Sciences Computer Engineering Birmingham, AL 35294 [email protected] [email protected] Daniel Quinlan Peter Pirkelbauer Lawrence Livermore National University of Alabama at Laboratory Birmingham Livermore, CA 94550 Department of Computer and [email protected] Information Sciences Birmingham, AL 35294 [email protected] ABSTRACT Many programming languages allow the use of unsafe pro- The C Programming Language is known for being an effi- gramming constructs in order to attain a high degree of cient language that can be compiled on almost any architec- flexibility and performance. This makes the construction ture and operating system. However the absence of dynamic of correct large-scale software difficult. Unsafe language safety checks and a relatively weak type system allows pro- features allow programmer oversights to introduce software grammer oversights that are hard to spot. In this paper, we flaws which can create security hazards that can compromise present RTC, a runtime monitoring tool that instruments an entire system. Eliminating these software flaws can be unsafe code and monitors the program execution. RTC is addressed on many levels in the software engineering pro- built on top of the ROSE compiler infrastructure. RTC finds cess. Rigid coding standards, restricting the use of a large memory bugs and arithmetic overflows and underflows, and language to a safer subset, peer review, and the use of static run-time type violations. Most of the instrumentations are or dynamic analysis tools are some means that can all reduce directly added to the source file and only require a mini- the exposure to software flaws. mal runtime system. As a result, the instrumented code Analysis tools can be categorized by how they analyze remains portable. In tests against known error detection software. Static tools analyze software without running it. benchmarks, RTC found 98% of all memory related bugs They target source code (occasionally binary) to apply for- and had zero false positives. In performance tests conducted mal analysis techniques such as dataflow analysis, abstract with well known algorithms, such as binary search and MD5, interpretation, and model checking; such techniques often we determined that the unoptimized overhead rate is be- use approximations to arrive at sound but imprecise con- tween a factor of 1.8 and a factor of 77 respectively. clusions about the behavior of programs. Dynamic analysis tools find bugs by observing the behavior of running programs. This is typically accomplished by code instrumen- 1. INTRODUCTION tation (source or binary) or by replacing (built-in) library One major trend in computing is the continuing increase functions (e.g., malloc and free) with custom implementa- in the complexity of software systems. Such an increase is tions. Dynamic analysis operates with concrete values and is allowed by the expectation of increasingly powerful hard- not prone to combinatorial state explosion. The downside of ware (faster processors, larger memory and disks) and the dynamic analysis tools is that monitoring running programs increasing diversity of environments in which the software impacts performance. The quality of the results depends on runs. This increase in complexity is expensive; the National the tests' input data and covered program paths. Hybrid Institute for Standards and Technology (NIST) estimated analyses combine static and dynamic techniques. Hybrid that inadequate infrastructure for software testing costs the tools can improve performance by eliminating checks that US economy $22.2 billion annually [26]. a static analyzer considers safe in all possible scenarios, or they can improve test coverage by producing input values Permission to make digital or hard copies of all or part of this work for personal or that cover all possible paths. classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation The history of program analysis research now spans sev- on the first page. To copy otherwise, to republish, to post on servers or to redistribute eral decades, and dozens of tools and techniques have pushed to lists, requires prior specific permission and/or a fee. the envelope on our ability to detect and correct bugs. How- SAC ’15 ,April 13 - 17 2015, Salamanca, Spain (c) 2014 Association for Computing ever, in the area of dynamic analysis, certain fundamental Machinery. ACM acknowledges that this contribution was authored or co-authored challenges remain, namely high overhead costs and lack of by an employee, contractor or affiliate of the United States government. As such, portability. Neither of these challenges have gone unnoted, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only. but have been relatively low priority targets for researchers. ACM 978-1-4503-3196-8/15/04 ...$15.00. http://dx.doi.org/10.1145/2578726.2578744 Severe overhead, in many respects, has been seen as the toring for C99 programs; (2) lightweight runtime monitoring cost of doing business, a burden that software developers implementation. have been willing to tolerate. For dynamic analysis tools The paper is outlined as follows: x2 presents background that target binaries for instrumentation (the most popular information and related work. x3 describes our implementa- choice), high overhead costs are difficult to avoid. The most tion in detail, and x4 discusses how we tested our tool and efficient way to reduce overhead is to selectively reduce in- the obtained results. x5 summarizes the paper and discusses strumentation where bugs are unlikely or are rendered impossible future research directions. possible, but without access to high-level information (e.g. type information) to guide the pruning process, this can be a 2. BACKGROUND risky proposition. Meanwhile, the computing landscape has In this section, we present earlier work on error checking, been historically dominated by only a handful of operating and an overview of the ROSE source-to-source transforma- systems and varieties of architecture; meeting the portabil- tion system. ity requirements of developers meant being able to operate in two or three popular environments. 2.1 Related Work However, as we move towards a future where computing Run-time error checking tools have been designed for a pervades every aspect of our daily lives, these challenges variety of reasons, ranging from bug detection and security become more substantial and more must be done to ad- to software verification. dress them. The most visible signal of the shift to ubiq- SafeC [1] introduced the notion of using source code in- uitous computing has been the proliferation of smartphones strumentation to detect temporal and spatial memory er- and the thriving ecosystem of services that interface with rors. CCured [18], Cyclone [10], and MSCC [28] are all them. However, the most influential transformations have works derived from or inspired by SafeC. Of particular in- come from the progressive infiltration of embedded comput- terest to us is CCured, which introduced the notion of using systems, from critical control devices in medical care and ing lightweight, disjoint metadata facilities to cut down on avionics, to smart televisions and coffee machines. Many of the massive overhead inherent in the approach taken by the most popular languages for development in these bur- SafeC. This idea inspired Hardbound [5], which attempts geoning environments are also the least safe (e.g. C and to tackle the issue by giving hardware support for bounds C++), the improper use of which introduce vulnerabilities checking pointers and pointer management. A software- that threaten reliability and security. At the same time, based approach analogous to HardBound was explored in burdensome overhead costs and portability limitations ren- SoftBound [17]. MemSafe [25] extends this idea by using der in situ dynamic analysis impractical if not impossible. static analysis to prove memory accesses safe. ConSeq [29] The imperative is a simple one: adapt or die. identifies code locations, such as assertions or reads of key In this paper, we present RTC (Run-Time error check for global variables. Then ConSeq extracts slices to determine C programs), a dynamic analysis tool for C99 programs. instructions that contribute to violations. Finally, ConSeq RTC instruments source code with safety checks and pro- uses dynamic analysis to generate valid, bug-free executions duces another C source file. The resulting source file is of the program, and then see whether any legal deviations portable and can be compiled on any platform and any com- from that execution could lead to the potential errors that piler that can handle C99 and linked to a small runtime sys- were revealed by the static analysis. tem. Choosing to instrument source code instead of binary RTED [23] is a dynamic analysis tool developed at the code has a number of advantages. First, the tool is portable, Lawrence Livermore National Laboratory. RTED is a first because the systems where the code is instrumented and the proof of concept implementation for sequential C, a large system where the code runs can be different architectures. sequential subset of C++, and UPC[6], a parallel language The only requirement is that there is a C99 compiler avail- for the partitioned global address space model. RTED finds able for the target system. Second, by instrumenting source temporal and spatial memory violations on stack and heap, code, the tool processes the code as written by the program- signature mismatch in declaration and definition, erroneous mer, and not some code that was generated and possibly library calls, and reads from uninitialized memory. RTED optimized by a compiler.

Runtime Checking C Programs

A Status Update of BEAMJIT, the Just-In-Time Compiling Abstract Machine

HALO: Post-Link Heap-Layout Optimisation

Architectural Support for Scripting Languages

Using JIT Compilation and Configurable Runtime Systems for Efficient Deployment of Java Programs on Ubiquitous Devices

CDC Build System Guide

Runtime Detection of C-Style Errors in UPC Code

EXE: Automatically Generating Inputs of Death

Wind River® Vxworks® 7 Third Party License Notices

An Opencl Runtime System for a Heterogeneous Many-Core Virtual Platform Kuan-Chung Chen and Chung-Ho Chen Inst

Matrox Imaging Library (MIL) 10.0 MIL 10 Processing Pack 3 Release Notes (C) Copyright Matrox Electronic Systems Ltd., 1992-2018

DETECTING BUFFER OVERFLOWS USING TESTCASE SYNTHESIS and CODE INSTRUMENTATION by MICHAEL A

C/C++ Runtime Error Detection Using Source Code Instrumentation