Runtime Checking C Programs


Reed Milewicz
University of Alabama at Birmingham
Department of Computer and Information Sciences
Birmingham, AL 35294
[email protected]

Rajesh Vanka
Matlab
3 Apple Hill Dr
[email protected]

James Tuck
North Carolina State University
Department of Electrical and Computer Engineering
[email protected]

Daniel Quinlan
Lawrence Livermore National Laboratory
Livermore, CA 94550
[email protected]

Peter Pirkelbauer
University of Alabama at Birmingham
Department of Computer and Information Sciences
Birmingham, AL 35294
[email protected]

ABSTRACT

The C Programming Language is known for being an efficient language that can be compiled on almost any architecture and operating system. However, the absence of dynamic safety checks and a relatively weak type system allow programmer oversights that are hard to spot. In this paper, we present RTC, a runtime monitoring tool that instruments unsafe code and monitors the program execution. RTC is built on top of the ROSE compiler infrastructure. RTC finds memory bugs, arithmetic overflows and underflows, and run-time type violations. Most of the instrumentation is added directly to the source file and requires only a minimal runtime system. As a result, the instrumented code remains portable. In tests against known error detection benchmarks, RTC found 98% of all memory-related bugs and had zero false positives. In performance tests conducted with well-known algorithms, such as binary search and MD5, we determined that the unoptimized overhead rate is between a factor of 1.8 and a factor of 77, respectively.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
SAC '15, April 13-17, 2015, Salamanca, Spain. (c) 2014 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.
ACM 978-1-4503-3196-8/15/04 ...$15.00. http://dx.doi.org/10.1145/2578726.2578744

1. INTRODUCTION

One major trend in computing is the continuing increase in the complexity of software systems. Such an increase is allowed by the expectation of increasingly powerful hardware (faster processors, larger memory and disks) and the increasing diversity of environments in which the software runs. This increase in complexity is expensive; the National Institute of Standards and Technology (NIST) estimated that inadequate infrastructure for software testing costs the US economy $22.2 billion annually [26].

Many programming languages allow the use of unsafe programming constructs in order to attain a high degree of flexibility and performance. This makes the construction of correct large-scale software difficult. Unsafe language features allow programmer oversights to introduce software flaws, which can create security hazards that can compromise an entire system. Eliminating these software flaws can be addressed on many levels in the software engineering process. Rigid coding standards, restricting the use of a large language to a safer subset, peer review, and the use of static or dynamic analysis tools are some means that can all reduce the exposure to software flaws.

Analysis tools can be categorized by how they analyze software. Static tools analyze software without running it. They target source code (occasionally binary) to apply formal analysis techniques such as dataflow analysis, abstract interpretation, and model checking; such techniques often use approximations to arrive at sound but imprecise conclusions about the behavior of programs. Dynamic analysis tools find bugs by observing the behavior of running programs. This is typically accomplished by code instrumentation (source or binary) or by replacing (built-in) library functions (e.g., malloc and free) with custom implementations. Dynamic analysis operates with concrete values and is not prone to combinatorial state explosion. The downside of dynamic analysis tools is that monitoring running programs impacts performance. The quality of the results depends on the tests' input data and covered program paths. Hybrid analyses combine static and dynamic techniques. Hybrid tools can improve performance by eliminating checks that a static analyzer considers safe in all possible scenarios, or they can improve test coverage by producing input values that cover all possible paths.

The history of program analysis research now spans several decades, and dozens of tools and techniques have pushed the envelope on our ability to detect and correct bugs. However, in the area of dynamic analysis, certain fundamental challenges remain, namely high overhead costs and lack of portability. Neither of these challenges has gone unnoticed, but both have been relatively low-priority targets for researchers. Severe overhead, in many respects, has been seen as the cost of doing business, a burden that software developers have been willing to tolerate. For dynamic analysis tools that target binaries for instrumentation (the most popular choice), high overhead costs are difficult to avoid. The most efficient way to reduce overhead is to selectively reduce instrumentation where bugs are unlikely or are rendered impossible, but without access to high-level information (e.g. type information) to guide the pruning process, this can be a risky proposition. Meanwhile, the computing landscape has been historically dominated by only a handful of operating systems and varieties of architecture; meeting the portability requirements of developers meant being able to operate in two or three popular environments.

However, as we move towards a future where computing pervades every aspect of our daily lives, these challenges become more substantial and more must be done to address them. The most visible signal of the shift to ubiquitous computing has been the proliferation of smartphones and the thriving ecosystem of services that interface with them. However, the most influential transformations have come from the progressive infiltration of embedded computing systems, from critical control devices in medical care and avionics, to smart televisions and coffee machines. Many of the most popular languages for development in these burgeoning environments are also the least safe (e.g. C and C++), the improper use of which introduces vulnerabilities that threaten reliability and security. At the same time, burdensome overhead costs and portability limitations render in situ dynamic analysis impractical if not impossible. The imperative is a simple one: adapt or die.

In this paper, we present RTC (Run-Time error check for C programs), a dynamic analysis tool for C99 programs. RTC instruments source code with safety checks and produces another C source file. The resulting source file is portable and can be compiled on any platform and with any compiler that can handle C99, and linked to a small runtime system. Choosing to instrument source code instead of binary code has a number of advantages. First, the tool is portable, because the system where the code is instrumented and the system where the code runs can be different architectures. The only requirement is that there is a C99 compiler available for the target system. Second, by instrumenting source code, the tool processes the code as written by the programmer, and not some code that was generated and possibly optimized by a compiler. [...] monitoring for C99 programs; (2) lightweight runtime monitoring implementation.

The paper is outlined as follows: §2 presents background information and related work. §3 describes our implementation in detail, and §4 discusses how we tested our tool and the obtained results. §5 summarizes the paper and discusses possible future research directions.

2. BACKGROUND

In this section, we present earlier work on error checking, and an overview of the ROSE source-to-source transformation system.

2.1 Related Work

Run-time error checking tools have been designed for a variety of reasons, ranging from bug detection and security to software verification.

SafeC [1] introduced the notion of using source code instrumentation to detect temporal and spatial memory errors. CCured [18], Cyclone [10], and MSCC [28] are all works derived from or inspired by SafeC. Of particular interest to us is CCured, which introduced the notion of using lightweight, disjoint metadata facilities to cut down on the massive overhead inherent in the approach taken by SafeC. This idea inspired HardBound [5], which attempts to tackle the issue by providing hardware support for bounds-checking pointers and pointer management. A software-based approach analogous to HardBound was explored in SoftBound [17]. MemSafe [25] extends this idea by using static analysis to prove memory accesses safe. ConSeq [29] identifies code locations, such as assertions or reads of key global variables. Then ConSeq extracts slices to determine instructions that contribute to violations. Finally, ConSeq uses dynamic analysis to generate valid, bug-free executions of the program, and then checks whether any legal deviations from that execution could lead to the potential errors that were revealed by the static analysis.

RTED [23] is a dynamic analysis tool developed at the Lawrence Livermore National Laboratory. RTED is a first proof-of-concept implementation for sequential C, a large sequential subset of C++, and UPC [6], a parallel language for the partitioned global address space model. RTED finds temporal and spatial memory violations on stack and heap, signature mismatch in declaration and definition, erroneous library calls, and reads from uninitialized memory. RTED [...]
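To make the idea of source-level instrumentation concrete, the following is a minimal sketch in C99 of what an instrumented function and its small runtime system might look like. It is not RTC's actual output or API: the names rt_report, rt_check_index, rt_checked_add, rt_violations and sum_instrumented are hypothetical, and the array bound is passed explicitly, whereas a real tool would have to track bounds metadata for each pointer. The sketch only illustrates the two kinds of checks named in the abstract, a spatial memory check and a signed overflow check, inserted next to the statements they guard.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical minimal runtime system: it only reports and counts
 * violations. These names are illustrative and are not RTC's API. */
int rt_violations = 0;

void rt_report(const char *kind, const char *file, int line) {
    fprintf(stderr, "%s:%d: runtime check failed: %s\n", file, line, kind);
    ++rt_violations;
}

/* Spatial check: returns 1 if index i is valid for an array of n elements. */
int rt_check_index(size_t i, size_t n, const char *file, int line) {
    if (i >= n) {
        rt_report("out-of-bounds access", file, line);
        return 0;
    }
    return 1;
}

/* Arithmetic check: stores a + b in *out only if the sum is representable
 * as an int. The test is done before adding, so no undefined behavior
 * (signed overflow) is ever executed. */
int rt_checked_add(int a, int b, int *out, const char *file, int line) {
    assert(out != NULL);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        rt_report("signed integer overflow", file, line);
        return 0;
    }
    *out = a + b;
    return 1;
}

/* Code as the programmer wrote it (note the off-by-one loop bound):
 *
 *     int sum(const int *a, size_t n) {
 *         int s = 0;
 *         for (size_t i = 0; i <= n; ++i) s += a[i];
 *         return s;
 *     }
 *
 * Instrumented version: each array read and each addition is preceded by
 * a call into the runtime. On a violation the loop stops instead of
 * performing the unsafe operation. */
int sum_instrumented(const int *a, size_t n) {
    int s = 0;
    for (size_t i = 0; i <= n; ++i) {
        if (!rt_check_index(i, n, __FILE__, __LINE__)) break;
        if (!rt_checked_add(s, a[i], &s, __FILE__, __LINE__)) break;
    }
    return s;
}
```

Calling sum_instrumented on the three-element array {1, 2, 3} with n = 3 returns 6 and records one violation for the off-by-one read, rather than reading past the end of the array. Because the checks are ordinary C99 function calls, the instrumented file stays compilable by any C99 compiler, which is the portability argument made in the introduction; the cost is one or two extra calls per guarded operation, which is where the runtime overhead comes from.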