Program Analysis of Temporal Memory Mismanagement

Program Analysis of Temporal Memory Mismanagement by Hua Yan B.Sc., Peking University, China, 2007 M.Sc., Peking University, China, 2010 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY IN THE SCHOOL OF Computer Science and Engineering THE UNIVERSITY OF NEW SOUTH WALES SYDNEY· AUSTRALIA Monday 22nd January, 2018 All rights reserved. This work may not be reproduced in whole or in part, by photocopy or other means, without the permission of the author. © Hua Yan 2018 PLEASE TYPE THE UNIVERSITY OF NEW SOUTH WALES Thesis/Dissertation Sheet Surname or Family name: Yan First name: Hua Other name/s: Abbreviation for degree as given in the University calendar: PhD School: School of Computer Science and Engineering Faculty: Faculty of Engineering Title: Program Analysis of Temporal Memory Mismanagement Abstract 350 words maximum: (PLEASE TYPE) For CIC++ programs, the performance benefits obtained from flexible low-level memory access and management sacrifice language-level support for memory safety and garbage collection. Memory-related programming mistakes are introduced as a result, rendering CIC++ programs prone to memory errors. A common category of programming mistakes is defined by the misplacement of deallocation operations, also known as temporal memory mismanagement, which can generate two types of bugs: (1) use-after-free (UAF) bugs and (2) memory leaks. The former are severe security vulnerabilities that expose programs to both data and control-flow exploits, while the latter are critical performance bugs that compromise availability and reliability. In the case of UAF bugs, existing solutions, which almost exclusively rely on dynamic analysis, suffer from limitations, including low coverage, binary incompatibility and high overheads. In the case of memory leaks, detectiontechniques are abundant; however, fixing techniques have been poorly investigated. This thesis presents three novel program analysis frameworks to address temporal memory mismanagement in C/C++. First, we introduce Tac, the first static UAF detection framework to combine typestate analysis with machine learning. Tac identifies representative features to train an SVM to classify likely true/false UAF candidates, and thereby providing guidance for typestate analysis used to locate bugs with precision. We then present CRed, a pointer-analysis-based framework for UAF detection with a novel context-reduction technique and a new demand-driven path-sensitive pointer analysis to boost scalability and precision. A major advantage of CRed is its ability to substantially and soundly reduce search space without losing bug-finding ability. This is achieved by utilizing must-not-alias information to truncate off unnecessary segments of calling contexts. Finally, we propose AutoFix, an automated memory leak fixing framework based on value-flow analysis and static instrumentation that can fix memory leaks safely and precisely with negligible overheads. The contribution of this thesis is threefold. First, we advance existing state-of-the-art solutions to temporal memory mismanagement by proposing a series of novel program analysis techniques. Second, corresponding prototype tools are fully implemented in the LLVM compiler framework. Third, an extensive evaluation of open-source benchmarks is conducted to validate the effectiveness of the proposed techniques. Declaration relating to disposition of project thesis/dissertation I hereby grant to the University of New South Wales or its agents the right to archive and to make available my thesis or dissertation in whole or in part in the University libraries in all forms of media, now or here after known, subject to the provisions of the Copyright Act 1968. I retain all property rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation. I also authorise University Microfilms to use the 350 word abstract of my thesis in DissertationAbstracts International (this is applicable to doctoral theses only). The University recognises that there may be exceptional circumstances requiring restrictions on· copying or conditions on use. Requests for restriction for a period of up to 2 years must be made in writing. Requests for a longer period of restriction may be considered in exceptional circumstances and require the approval of the Dean of Graduate Research. FOR OFFICE USE ONLY Date of completion of requirements for Award: ORIGINALITY STATEMENT 'I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at UNSW or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UNSW or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project's design and conception or in style, presentation and linguistic expression is acknowledged.' Signed Date COPYRIGHT STATEMENT 'I hereby grant the University of New South Wales or its agents the right to archive and to make available my thesis or dissertation in whole or part in the University libraries in all forms of media, now or here after known, subject to the provisions of the Copyright Act 1968. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation. I also authorise University Microfilms to use the 350 word abstract of my thesis in Dissertation Abstract International (this is applicable to doctoral theses only). I have either used no substantial portions of copyright material in my thesis or I have obtained permission to use copyright material; where permission has not been granted I have applied/will apply for a partial restriction of the digital copy of my thesis or dissertation.' Signed Date AUTHENTICITY STATEMENT 'I certify that the Library deposit digital copy is a direct equivalent of the final officially approved version of my thesis. No emendation of content has occurred and if there are any minor variations in formatting, they are the result of the conversion to digital format.' Signed Date Abstract For C/C++ programs, the performance benefits obtained from flexible low-level memory access and management sacrifice language-level support for memory safety and garbage collection. Memory-related programming mistakes are introduced as a result, rendering C/C++ programs prone to memory errors. A common category of programming mistakes is defined by the misplacement of deallocation operations, also known as, temporal memory mismanagement, which can generate two types of bugs: (1) use-after-free (UAF) bugs and (2) memory leaks. The former are severe security vulnerabilities that expose programs to both data and control-flow exploits, while the latter are critical performance bugs that compromise software availability and reliability. In the case of UAF bugs, existing solutions, which almost exclusively rely on dynamic analysis, suffer from limitations, including low code coverage, binary incompatibility and high overheads. In the case of memory leaks, detection techniques are abundant; however, fixing techniques have been poorly investigated. In this thesis, we present three novel program analysis frameworks to address temporal memory mismanagement in C/C++. First, we introduce Tac, the first static UAF detection framework to combine typestate analysis with machine learning. Tac identifies representative features to train a Support Vector Machine to classify likely true/false UAF candidates, thereby providing guidance for typestate i ii analysis used to locate bugs with precision. We then present CRed, a pointer-analysis-based framework for UAF detection with a novel context-reduction technique and a new demand-driven path-sensitive pointer analysis to boost scalability and precision. A major advantage of CRed is its ability to substantially and soundly reduce search space without losing bug- finding ability. This is achieved by utilizing must-not-alias information to truncate unnecessary segments of calling contexts. Finally, we propose AutoFix, an automated memory leak fixing framework based on value-flow analysis and static instrumentation that can fix all leaks re- ported by any front-end detector with negligible overheads safely with precision. AutoFix tolerates false leaks with a shadow memory data structure carefully de- signed to keep track of the allocation and deallocation of potentially leaked memory objects. The contribution of this thesis is threefold. First, we advance existing state- of-the-art solutions to temporal memory mismanagement by proposing a series of novel program analysis techniques. Second, corresponding prototype tools are fully implemented in the LLVM compiler framework. Third, an extensive evaluation of open-source C/C++ benchmarks is conducted to validate the effectiveness of the proposed techniques. Publications • Hua Yan, Yulei Sui, Shiping Chen and Jingling Xue. Spatio-Temporal Context Reduction: A Pointer-Analysis-Based Static Approach for Detecting Use-After-Free Vulnerabilities. International Conference on Software Engi- neering (ICSE '18). • Hua

Program Analysis of Temporal Memory Mismanagement

Semantic Patches for Java Program Transformation

Coccinelle: Reducing the Barriers to Modularization in a Large C Code Base

Inferring Semantic Patches for the Linux Kernel

Automating Patching of Vulnerable Open-Source Software Versions in Application Binaries

SED 1214 Transcript EPISODE 1214

Automated Secure Code Review for Webapplications

Detect Complex Code Patterns Using Semantic Grep

Introducing Semgrep

Towards Generating Transformation Rules Without Examples for Android API Replacement

Effective Source Code Analysis with Minimization

Clang and Coccinelle: Synergising Program Analysis Tools for CERT C Secure Coding Standard Certification

Aalborg Universitet Coccinelle Tool Support for Automated