Address Space Randomization for Mobile Devices


Hristo Bojinov, Stanford University, Stanford, CA, USA
Dan Boneh, Stanford University, Stanford, CA, USA
Rich Cannings, Google, Inc., Mountain View, CA, USA
Iliyan Malchev, Google, Inc., Mountain View, CA, USA

ABSTRACT

Address Space Layout Randomization (ASLR) is a defensive technique supported by many desktop and server operating systems. While smartphone vendors wish to make it available on their platforms, there are technical challenges in implementing ASLR on these devices. Pre-linking, limited processing power and restrictive update processes make it difficult to use existing ASLR implementation strategies even on the latest generation of smartphones. In this paper we introduce retouching, a mechanism for executable ASLR that requires no kernel modifications and is suitable for mobile devices. We have implemented ASLR for the Android operating system and evaluated its effectiveness and performance. In addition, we introduce crash stack analysis, a technique that uses crash reports locally on the device, or in aggregate in the cloud, to reliably detect attempts to brute-force ASLR protection. We expect that retouching and crash stack analysis will become standard techniques in mobile ASLR implementations.

Categories and Subject Descriptors

D.4.6 [Operating Systems]: Security and Protection

General Terms

Security, Experimentation, Performance

Keywords

ASLR, control flow hijacking, return-to-libc, mobile devices, smartphones, Android

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. WiSec'11, June 14–17, 2011, Hamburg, Germany. Copyright 2011 ACM 978-1-4503-0692-8/11/06 ...$10.00.

1. INTRODUCTION

Over the last few years Address-Space Layout Randomization (ASLR) has become mainstream, with various levels of support in Linux [25], Windows [20], and Mac OS X [19]. ASLR randomizes the base points of the stack, heap, shared libraries, and base executables. The goal of ASLR is to make certain classes of control-hijacking attacks more difficult: with executable code residing at unknown locations, garden variety buffer or stack overflow attacks are made significantly harder to develop and execute [14]. In conjunction with OS mechanisms that only allow writing to non-executable memory (e.g. DEP in Windows), ASLR prevents many network-based native code control hijacking attacks from completing [24].

Implementation challenges. Although there has been much work on implementing and evaluating ASLR for general-purpose PCs, none of the major smartphones currently use it. In principle, the same ASLR techniques should carry over to mobile devices; however, there are several practical obstacles that make this difficult. Smartphone operating systems spend considerable effort to minimize boot and application launch time, power consumption, and memory footprint. These optimizations make existing ASLR implementation strategies insufficient. We give two examples of challenges in implementing ASLR in Android:

• The Android OS prelinks shared libraries to speed up the boot process. Prelinking takes place during the build process and results in hard-coded memory addresses written in the library code. This prevents relocating these libraries in process memory. Android also uses a custom dynamic linker that cannot self-relocate at run-time (unlike ld.so). Recent attack techniques against ASLR clearly demonstrate the need to randomize the whole process address space, including base executables and shared libraries [18].

• During normal operation the filesystem on the device is mounted read-only for security reasons. This prevents binary editing tools [12] from modifying images on the device or in file-backed memory.

Our contributions. We propose retouching, a novel mechanism for randomizing prelinked code for deployment on mobile devices. Retouching can randomize all executable code including libraries (and prelinked libraries), base executables, and the linker. Unlike traditional ASLR implementations, retouching requires no kernel modifications. We implement the mechanism for Android, evaluate its effectiveness, and measure its impact on performance at build and runtime and on memory footprint. Our conclusion is that retouching is an effective approach to ASLR and is particularly well-suited in situations where performance is an issue, or when there are incentives to avoid kernel changes.

Our second contribution is a cloud-based approach to detecting and preventing ASLR brute-forcing [23]. We introduce crash stack analysis, a technique that analyzes crash reports from mobile devices and reliably detects attempts to bypass ASLR by guessing the random offset used on each device. We evaluate crash stack analysis using real crash data as well as simulated attacks, and conclude that the approach can effectively detect attacks and, in addition, can help pinpoint the OS code being targeted.

Brute forcing mobile ASLR can be very effective and difficult to detect locally. By making a single attempt on every mobile user the attacker can compromise 1/256 of mobile devices (assuming 8 bits for ASLR randomness as in Windows). Given the billions of phones in use, this fraction gives the attacker control of a large number of devices.

In the rest of the paper, Sections 2 and 3 give some background information on ASLR and the Android OS, and Section 4 presents the threat model that we address with our design and implementation in Section 5. Section 6 evaluates the implementation and discusses it in the context of other related and future work. Section 7 introduces crash stack analysis and evaluates its effectiveness. Sections 8 and 9 discuss future and related work, and Section 10 concludes.

2. OVERVIEW OF ADDRESS-SPACE RANDOMIZATION

Before discussing our system we first survey traditional strategies for implementing ASLR and their limitations on Android. The first ASLR implementation, PaX [25], was designed for Linux. Subsequently, ASLR was implemented in Windows Vista and Mac OS X. We briefly describe these implementations focusing primarily on user-space randomization (as opposed to kernel randomization, which is a separate topic).

2.1 PaX

PaX implements ASLR by generating three different random offsets (delta values) that apply to different areas of the address space:

• delta_mmap controls the randomization of areas allocated via mmap(), which includes shared libraries, as well as the main executable when compiled and linked as an ET_DYN ELF file

• delta_exec is the offset of the base executable, followed by the heap, when the base executable is of type ET_EXEC (not position-independent)

• delta_stack is the offset for the user-space stack

PaX ASLR is complemented by data and stack execution [...]

2.2 Windows

Following the implementation of DEP (Data Execution Prevention: marking the stack and heap as non-execute) in Windows XP SP2, Microsoft implemented ASLR in Windows Vista. The two mechanisms work together to prevent control-flow hijacking attacks (such as return-to-libc), as well as injected code from being executed on the stack or heap. In the Windows ASLR implementation, executable code randomization happens on every reboot, when a global image offset is selected randomly out of 256 possibilities. Additionally, every process is launched with an individually randomized stack, heap, and Process Environment Block (PEB) [26]. The 8 bits of entropy used for selecting the offsets render the Windows ASLR implementation vulnerable to guessing attacks [23], but this is still better than no randomization at all.

Adopting the Windows ASLR approach directly in Android would increase boot time of the device substantially by eliminating library prelinking.

2.3 Mac OS X

Apple introduced ASLR in the Leopard release of Mac OS X. Currently, OS X only randomizes the offsets of shared libraries. This randomization is performed at the time libraries are prelinked, effectively prelinking them at a different address on each system. In addition to ASLR, the operating system protects stack and heap data from being executed (heap protection is only available for 64-bit binaries) [19, 15].

Retouching, the technique we have developed, is conceptually similar to randomization during prelinking. The additional benefits of retouching are in significantly reducing the amount of work performed on the target device by performing the prelinking during the build process and retaining the minimum information needed for randomization. Additionally, no ELF manipulation code needs to be installed on the target.

3. OVERVIEW OF ANDROID

Android [1] is an operating system for mobile devices developed by Google, Inc. While Android borrows much platform code from other open-source operating systems, its security model was built from the start with the assumption that the device will be running a variety of untrusted (or partially trusted) applications. A manifestation of this approach is the execution of each installed application in a separate process running under a unique user identifier (UID): any damage that the application can cause will be contained within the resources that are dedicated to this UID, disjoint from those of any other UID or application.

Android is built on top of the Linux kernel. The system includes many device drivers and native system libraries, including a customized implementation of libc. Applications in Android are written in Java and execute in a virtual machine called Dalvik, in the form of Dalvik bytecode. After boot, the system runs many services (such as the media service, telephony service) each in its own process and having a unique UID.