A Differential Approach to Undefined Behavior
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Undefined Behaviour in the C Language
FAKULTA INFORMATIKY, MASARYKOVA UNIVERZITA Undefined Behaviour in the C Language BAKALÁŘSKÁ PRÁCE Tobiáš Kamenický Brno, květen 2015 Declaration Hereby I declare, that this paper is my original authorial work, which I have worked out by my own. All sources, references, and literature used or excerpted during elaboration of this work are properly cited and listed in complete reference to the due source. Vedoucí práce: RNDr. Adam Rambousek ii Acknowledgements I am very grateful to my supervisor Miroslav Franc for his guidance, invaluable help and feedback throughout the work on this thesis. iii Summary This bachelor’s thesis deals with the concept of undefined behavior and its aspects. It explains some specific undefined behaviors extracted from the C standard and provides each with a detailed description from the view of a programmer and a tester. It summarizes the possibilities to prevent and to test these undefined behaviors. To achieve that, some compilers and tools are introduced and further described. The thesis contains a set of example programs to ease the understanding of the discussed undefined behaviors. Keywords undefined behavior, C, testing, detection, secure coding, analysis tools, standard, programming language iv Table of Contents Declaration ................................................................................................................................ ii Acknowledgements .................................................................................................................. iii Summary ................................................................................................................................. -
Defining the Undefinedness of C
Technical Report: Defining the Undefinedness of C Chucky Ellison Grigore Ros, u University of Illinois {celliso2,grosu}@illinois.edu Abstract naturally capture undefined behavior simply by exclusion, be- This paper investigates undefined behavior in C and offers cause of the complexity of undefined behavior, it takes active a few simple techniques for operationally specifying such work to avoid giving many undefined programs semantics. behavior formally. A semantics-based undefinedness checker In addition, capturing the undefined behavior is at least as for C is developed using these techniques, as well as a test important as capturing the defined behavior, as it represents suite of undefined programs. The tool is evaluated against a source of many subtle program bugs. While a semantics other popular analysis tools, using the new test suite in of defined programs can be used to prove their behavioral addition to a third-party test suite. The semantics-based tool correctness, any results are contingent upon programs actu- performs at least as well or better than the other tools tested. ally being defined—it takes a semantics capturing undefined behavior to decide whether this is the case. 1. Introduction C, together with C++, is the king of undefined behavior—C has over 200 explicitly undefined categories of behavior, and A programming language specification or semantics has dual more that are left implicitly undefined [11]. Many of these duty: to describe the behavior of correct programs and to behaviors can not be detected statically, and as we show later identify incorrect programs. The process of identifying incor- (Section 2.6), detecting them is actually undecidable even rect programs can also be seen as describing which programs dynamically. -
The Art, Science, and Engineering of Fuzzing: a Survey
1 The Art, Science, and Engineering of Fuzzing: A Survey Valentin J.M. Manes,` HyungSeok Han, Choongwoo Han, Sang Kil Cha, Manuel Egele, Edward J. Schwartz, and Maverick Woo Abstract—Among the many software vulnerability discovery techniques available today, fuzzing has remained highly popular due to its conceptual simplicity, its low barrier to deployment, and its vast amount of empirical evidence in discovering real-world software vulnerabilities. At a high level, fuzzing refers to a process of repeatedly running a program with generated inputs that may be syntactically or semantically malformed. While researchers and practitioners alike have invested a large and diverse effort towards improving fuzzing in recent years, this surge of work has also made it difficult to gain a comprehensive and coherent view of fuzzing. To help preserve and bring coherence to the vast literature of fuzzing, this paper presents a unified, general-purpose model of fuzzing together with a taxonomy of the current fuzzing literature. We methodically explore the design decisions at every stage of our model fuzzer by surveying the related literature and innovations in the art, science, and engineering that make modern-day fuzzers effective. Index Terms—software security, automated software testing, fuzzing. ✦ 1 INTRODUCTION Figure 1 on p. 5) and an increasing number of fuzzing Ever since its introduction in the early 1990s [152], fuzzing studies appear at major security conferences (e.g. [225], has remained one of the most widely-deployed techniques [52], [37], [176], [83], [239]). In addition, the blogosphere is to discover software security vulnerabilities. At a high level, filled with many success stories of fuzzing, some of which fuzzing refers to a process of repeatedly running a program also contain what we consider to be gems that warrant a with generated inputs that may be syntactically or seman- permanent place in the literature. -
Intermediate Representation and LLVM
CS153: Compilers Lecture 6: Intermediate Representation and LLVM Stephen Chong https://www.seas.harvard.edu/courses/cs153 Contains content from lecture notes by Steve Zdancewic and Greg Morrisett Announcements •Homework 1 grades returned •Style •Testing •Homework 2: X86lite •Due Tuesday Sept 24 •Homework 3: LLVMlite •Will be released Tuesday Sept 24 Stephen Chong, Harvard University 2 Today •Continue Intermediate Representation •Intro to LLVM Stephen Chong, Harvard University 3 Low-Level Virtual Machine (LLVM) •Open-Source Compiler Infrastructure •see llvm.org for full documentation •Created by Chris Lattner (advised by Vikram Adve) at UIUC •LLVM: An infrastructure for Multi-stage Optimization, 2002 •LLVM: A Compilation Framework for Lifelong Program Analysis and Transformation, 2004 •2005: Adopted by Apple for XCode 3.1 •Front ends: •llvm-gcc (drop-in replacement for gcc) •Clang: C, objective C, C++ compiler supported by Apple •various languages: Swift, ADA, Scala, Haskell, … •Back ends: •x86 / Arm / PowerPC / etc. •Used in many academic/research projects Stephen Chong, Harvard University 4 LLVM Compiler Infrastructure [Lattner et al.] LLVM llc frontends Typed SSA backend like IR code gen 'clang' jit Optimizations/ Transformations Analysis Stephen Chong, Harvard University 5 Example LLVM Code factorial-pretty.ll define @factorial(%n) { •LLVM offers a textual %1 = alloca %acc = alloca store %n, %1 representation of its IR store 1, %acc •files ending in .ll br label %start start: %3 = load %1 factorial64.c %4 = icmp sgt %3, 0 br %4, label -
Ryan Holland CSC415 Term Paper Final Draft Language: Swift (OSX)
Ryan Holland CSC415 Term Paper Final Draft Language: Swift (OSX) Apple released their new programming language "Swift" on June 2nd of this year. Swift replaces current versions of objective-C used to program in the OSX and iOS environments. According to Apple, swift will make programming apps for these environments "more fun". Swift is also said to bring a significant boost in performance compared to the equivalent objective-C. Apple has not released an official document stating the reasoning behind the launch of Swift, but speculation is that it was a move to attract more developers to the iOS platform. Swift development began in 2010 by Chris Lattner, but would later be aided by many other programmers. Swift employs ideas from many languages into one, not only from objective-C. These languages include Rust, Haskell, Ruby, Python, C#, CLU mainly but many others are also included. Apple says that most of the language is built for Cocoa and Cocoa Touch. Apple claims that Swift is more friendly to new programmers, also claiming that the language is just as enjoyable to learn as scripting languages. It supports a feature known as playgrounds, that allow programmers to make modifications on the fly, and see results immediately without the need to build the entire application first. Apple controls most aspects of this language, and since they control so much of it, they keep a detailed records of information about the language on their developer site. Most information within this paper will be based upon those records. Unlike most languages, Swift has a very small set of syntax that needs to be shown compared to other languages. -
LLVM Overview
Overview Brandon Starcheus & Daniel Hackney Outline ● What is LLVM? ● History ● Language Capabilities ● Where is it Used? What is LLVM? What is LLVM? ● Compiler infrastructure used to develop a front end for any programming language and a back end for any instruction set architecture. ● Framework to generate object code from any kind of source code. ● Originally an acronym for “Low Level Virtual Machine”, now an umbrella project ● Intended to replace the GCC Compiler What is LLVM? ● Designed to be compatible with a broad spectrum of front ends and computer architectures. What is LLVM? LLVM Project ● LLVM (Compiler Infrastructure, our focus) ● Clang (C, C++ frontend) ● LLDB (Debugger) ● Other libraries (Parallelization, Multi-level IR, C, C++) What is LLVM? LLVM Project ● LLVM (Compiler Infrastructure, our focus) ○ API ○ llc Compiler: IR (.ll) or Bitcode (.bc) -> Assembly (.s) ○ lli Interpreter: Executes Bitcode ○ llvm-link Linker: Bitcode (.bc) -> Bitcode (.bc) ○ llvm-as Assembler: IR (.ll) -> Bitcode (.bc) ○ llvm-dis Disassembler: Bitcode (.bc) -> IR (.ll) What is LLVM? What is LLVM? Optimizations History History ● Developed by Chris Lattner in 2000 for his grad school thesis ○ Initial release in 2003 ● Lattner also created: ○ Clang ○ Swift ● Other work: ○ Apple - Developer Tools, Compiler Teams ○ Tesla - VP of Autopilot Software ○ Google - Tensorflow Infrastructure ○ SiFive - Risc-V SoC’s History Language Capabilities Language Capabilities ● Infinite virtual registers ● Strongly typed ● Multiple Optimization Passes ● Link-time and Install-time -
Improving Program Reconstruction in LLDB Using C++ Modules
Beyond Debug Information: Improving Program Reconstruction in LLDB using C++ Modules Master’s Thesis in Computer Science and Engineering RAPHAEL ISEMANN Department of Computer Science and Engineering CHALMERS UNIVERSITY OF TECHNOLOGY UNIVERSITY OF GOTHENBURG Gothenburg, Sweden 2019 Master’s thesis 2019 Beyond Debug Information: Improving Program Reconstruction in LLDB using C++ Modules Raphael Isemann Department of Computer Science and Engineering Chalmers University of Technology University of Gothenburg Gothenburg, Sweden 2019 Beyond Debug Information: Improving Program Reconstruction in LLDB using C++ Modules Raphael Isemann © Raphael Isemann, 2019. Supervisor: Thomas Sewell, Department of Computer Science and Engineering Examiner: Magnus Myreen, Department of Computer Science and Engineering Master’s Thesis 2019 Department of Computer Science and Engineering Chalmers University of Technology and University of Gothenburg SE-412 96 Gothenburg Telephone +46 31 772 1000 Cover: The LLVM logo, owned by and royality-free licensed from Apple Inc. Typeset in LATEX Gothenburg, Sweden 2019 iv Beyond Debug Information: Improving Program Reconstruction in LLDB using C++ Modules Raphael Isemann Department of Computer Science and Engineering Chalmers University of Technology and University of Gothenburg Abstract Expression evaluation is a core feature of every modern C++ debugger. Still, no C++ debugger currently features an expression evaluator that consistently supports advanced language features such as meta-programming with templates. The under- lying problem is that the debugger can often only partially reconstruct the debugged program from the debug information. This thesis presents a solution to this problem by using C++ modules as an additional source of program information. We devel- oped a prototype based on the LLDB debugger that is loading missing program components from the C++ modules used by the program. -
Speed Performance Between Swift and Objective-C
International Journal of Engineering Applied Sciences and Technology, 2016 Vol. 1, Issue 10, ISSN No. 2455-2143, Pages 185-189 Published Online August - September 2016 in IJEAST (http://www.ijeast.com) SPEED PERFORMANCE BETWEEN SWIFT AND OBJECTIVE-C Harwinder Singh Department of CSE DIET, Regional Centre PTU, Mohali, INDIA Abstracts: The appearance of a new programming Swift is a new programming language for iOS and OS X language gives the necessity to contrast its contribution apps that builds on the best of C and Objective-C, without with the existing programming languages to evaluate the the constraints of C compatibility. Swift adopts safe novelties and improvements that the new programming programming patterns and adds modern features to make language offers for developers. Intended to eventually programming easier, more flexible, and more fun. Swift’s replace Objective-C as Apple’s language of choice, Swift clean slate, backed by the mature and much loved Cocoa needs to convince developers to switch over to the new and Cocoa Touch frameworks, is an opportunity to language. Apple has promised that Swift will be faster reimaging how software development works. than Objective-C, as well as offer more modern language features, be very safe, and be easy to learn and use. In II. LITERATURE SURVEY this thesis developer test these claims by creating an iOS application entirely in Swift as well as benchmarking By Christ Lattner(2015)Released in June of 2014 by two different algorithms. Developer finds that while Apple Swift is a statically typed language and Swift is faster than Objective-C, it does not see the compiled language that uses the LLVM compiler speedup projected by Apple. -
Xcode Static Analyzer
1 Developer Tools Kickoff Session 300 Andreas Wendker These are confidential sessions—please refrain from streaming, blogging, or taking pictures 2 3 14 Billion Apps Downloaded 4 Xcode 4 Released March 2011 5 6 7 1 Single Window IB Inside Assistant Version Editor 8 1 Single Window LLVM Compiler 2 IB Inside Fix-It ImprovedBranchingLiveProjectAutomaticUnitSchemesAppArchivesCode QuickIssuesBehaviors Testing JumpValidation& Tabs Code & SnippetsTarget HelpMerging Provisioning Bar Completion Editor One-ClickC++WorkspacesLLDBSubversionGit inBlame LLVM Filtering Assistant Debugger Version Editor Instruments 9 LLVM Compiler 2 Schemes Improved Code Completion Automatic Provisioning Archives Debugger Single Window Subversion Branching & Merging Git Jump Bar IB Inside Version Editor Live Issues Behaviors LLDB Unit Testing Project & Target Editor Instruments Fix-It One-Click Filtering C++ in LLVM Blame Assistant Tabs Workspaces Quick Help App Validation Code Snippets 10 Smaller Free on Packages Lion Xcode in Mac App Store 11 Xcode 4.1 Xcode 4.2 12 Xcode 4.1 13 Built for Lion 14 15 Modernize Your Project 16 Assembly & Preprocessing 17 View-based tables New Cocoa controls Mac push notifications Entitlements Editor Custom behaviors 18 Auto Layout 19 Auto Layout Max Drukman 20 21 22 Demo 23 Xcode 4.2 24 Data Sync 25 Unit Tests 26 System Trace for iOS 27 Networking Activity 28 Simulate Locations 29 30 31 32 33 34 35 36 37 Storyboarding 38 Storyboarding Jon Hess 39 40 Scenes 41 SeguesScenes 42 43 Demo 44 45 OpenGL ES Performance Detective 46 OpenGL ES Analyzer -
Lenient Execution of C on a JVM How I Learned to Stop Worrying and Execute the Code
Lenient Execution of C on a JVM How I Learned to Stop Worrying and Execute the Code Manuel Rigger Roland Schatz Johannes Kepler University Linz, Austria Oracle Labs, Austria [email protected] [email protected] Matthias Grimmer Hanspeter M¨ossenb¨ock Oracle Labs, Austria Johannes Kepler University Linz, Austria [email protected] [email protected] ABSTRACT that rely on undefined behavior risk introducing bugs that Most C programs do not strictly conform to the C standard, are hard to find, can result in security vulnerabilities, or and often show undefined behavior, e.g., on signed integer remain as time bombs in the code that explode after compiler overflow. When compiled by non-optimizing compilers, such updates [31, 44, 45]. programs often behave as the programmer intended. However, While bug-finding tools help programmers to find and optimizing compilers may exploit undefined semantics for eliminate undefined behavior in C programs, the majority more aggressive optimizations, thus possibly breaking the of C code will still contain at least some occurrences of non- code. Analysis tools can help to find and fix such issues. Alter- portable code. This includes unspecified and implementation- natively, one could define a C dialect in which clear semantics defined patterns, which do not render the whole program are defined for frequent program patterns whose behavior invalid, but can still cause surprising results. To address would otherwise be undefined. In this paper, we present such this, it has been advocated to come up with a more lenient a dialect, called Lenient C, that specifies semantics for behav- C dialect, that better suits the programmers' needs and ior that the standard left open for interpretation. -
In Using the GNU Compiler Collection (GCC)
Using the GNU Compiler Collection For gcc version 6.1.0 (GCC) Richard M. Stallman and the GCC Developer Community Published by: GNU Press Website: http://www.gnupress.org a division of the General: [email protected] Free Software Foundation Orders: [email protected] 51 Franklin Street, Fifth Floor Tel 617-542-5942 Boston, MA 02110-1301 USA Fax 617-542-2652 Last printed October 2003 for GCC 3.3.1. Printed copies are available for $45 each. Copyright c 1988-2016 Free Software Foundation, Inc. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with the Invariant Sections being \Funding Free Software", the Front-Cover Texts being (a) (see below), and with the Back-Cover Texts being (b) (see below). A copy of the license is included in the section entitled \GNU Free Documentation License". (a) The FSF's Front-Cover Text is: A GNU Manual (b) The FSF's Back-Cover Text is: You have freedom to copy and modify this GNU Manual, like GNU software. Copies published by the Free Software Foundation raise funds for GNU development. i Short Contents Introduction ::::::::::::::::::::::::::::::::::::::::::::: 1 1 Programming Languages Supported by GCC ::::::::::::::: 3 2 Language Standards Supported by GCC :::::::::::::::::: 5 3 GCC Command Options ::::::::::::::::::::::::::::::: 9 4 C Implementation-Defined Behavior :::::::::::::::::::: 373 5 C++ Implementation-Defined Behavior ::::::::::::::::: 381 6 Extensions to -
Reenix: Implementing a Unix-Like Operating System in Rust
Reenix: Implementing a Unix-Like Operating System in Rust Alex Light (alexander [email protected]) Advisor: Tom Doeppner Reader: Shriram Krishnamurthi Brown University, Department of Computer Science April 2015 Abstract This paper describes the experience, problems and successes found in implementing a unix-like operating system kernel in rust. Using the basic design and much of the lowest-level support code from the Weenix operating system written for CS167/9 I was able to create a basic kernel supporting multiple kernel processes scheduled cooperatively, drivers for the basic devices and the beginnings of a virtual file system. I made note of where the rust programming language, and its safety and type systems, helped and hindered my work and made some, tentative, performance comparisons between the rust and C implementations of this kernel. I also include a short introduction to the rust programming language and the weenix project. Contents 1 Introduction 1 Introduction 1 Ever since it was first created in 1971 the UNIX operat- 1.1 The Weenix OS . .2 ing system has been a fixture of software engineering. 1.2 The Rust language . .2 One of its largest contributions to the world of OS engineering, and software engineering in general, was 2 Reenix 7 the C programming language created to write it. In 2.1 Organization . .7 the 4 decades that have passed since being released, C 2.2 Booting . .8 has changed relatively little but the state-of-the-art in 2.3 Memory & Initialization . .8 programming language design and checking, has ad- 2.4 Processes . .9 vanced tremendously.