On Callgraphs and Generative Mechanisms


J Comput Virol
DOI 10.1007/s11416-007-0057-x

ORIGINAL PAPER

Daniel Bilar

Received: 15 February 2007 / Accepted: 14 June 2007
© Springer-Verlag France 2007

D. Bilar, Department of Computer Science, Wellesley College, Wellesley, MA 02481, USA. e-mail: [email protected]

Uncorrected proof (Journal: 11416, MS: 57, Disp.: 2007/7/4, Pages: 13)

Abstract

This paper examines the structural features of callgraphs. The sample consisted of 120 malicious and 280 non-malicious executables. Pareto models were fitted to indegree, outdegree and basic block count distribution, and a statistically significant difference shown for the derived power law exponent. A two-step optimization process involving human designers and code compilers is proposed to account for these structural features of executables.

1 Introduction

All commercial antivirus (AV) products rely on signature matching, the bulk of which constitutes strict byte-sequence pattern matching. For modern, evolving polymorphic and metamorphic malware, this approach is unsatisfactory. Clementi [9] recently checked fifteen state-of-the-art, updated AV scanners against ten highly polymorphic malware samples and found false negative rates from 0 to 90%, with an average of 48%. This development was already predicted in 2001 [51].

Polymorphic malware contain decryption routines which decrypt encrypted constant parts of the malware body. The malware can mutate its decryptors in subsequent generations, thereby complicating signature-based detection approaches. The decrypted body, however, remains constant. Metamorphic malware generally do not use encryption, but are able to mutate their body in subsequent generations using various techniques, such as junk insertion, semantic NOPs, code transposition, equivalent instruction substitution and register reassignments [8,49]. For a recent formalization of these code mutation techniques, the technical reader is referred to [17]. The net result of these techniques is a shrinking usable "constant base" for strict signature-based detection approaches.

Since signature-based approaches are quite fast (but show little tolerance for metamorphic and polymorphic code) and heuristics such as emulation are more resilient (but quite slow and may hinge on environmental triggers), a detection approach that combines the best of both worlds would be desirable. This is the philosophy behind a structural fingerprint. Structural fingerprints are statistical in nature, and as such are positioned as 'fuzzier' metrics between static signatures and dynamic heuristics. The structural fingerprint investigated in this paper for differentiation purposes is based on some properties of the executable's callgraph.

The rest of this paper is structured as follows. Section 2 describes the setup, data, procedures and results. Section 3 gives a short overview of related work on graph-based classification. Section 4 sketches the proposed generative mechanism.

2 Generating the callgraph

Primary tools used are described in more detail in the Acknowledgements.

2.1 Samples

For non-malicious software, henceforth called 'goodware', sampling followed a two-step process: we inventoried all PEs (the primary 32-bit Windows file format) on a Microsoft XP Home SP2 laptop, drew 300 samples uniformly at random, and discarded overly large and small files, yielding 280 samples. For malicious software (malware), seven classes of interest were fixed: backdoor, hacking tools, DoS, trojans, exploits, virus, and worms. The worm class was further divided into Peer-to-Peer (P2P), Internet Relay Chat/Instant Messenger (IRC/IM), Email and Network worm subclasses. For a non-specialist introduction to malicious software, see [48]; for a canonical reference, see [50].

Each class (subclass) contained at least 15 samples. Since AV vendors were hesitant for liability reasons to provide samples, we gathered them from herm1t's collection [24] and identified compiler and (potential) packer metadata using PEiD. Practically all malware samples were identified as having been compiled by MS C++ 5.0/6.0, MS Visual Basic 5.0/6.0 or LCC, and about a dozen samples were packed with various versions of UPX (an executable compression program). Malware was run through best-of-breed, updated open- and closed-source AV products, yielding a false negative rate of 32% (open-source) and 2% (closed-source), respectively. Overall file sizes for both mal- and goodware ranged from $\Theta$(10kb) to $\Theta$(1MB) (see footnote 1). A preliminary file size distribution investigation yielded a log-normal distribution; for a putative explanation of the underlying generative process, see [38] and [31].

All 400 samples were loaded into the de-facto industry standard disassembler (IDA Pro [22]), inter- and intra-procedurally parsed and augmented with symbolic meta-information gleaned programmatically from the binary via FLIRT signatures (when applicable). We exported the identified structures via IDAPython into a MySQL database. These structures were subsequently parsed by a disassembly visualization tool (BinNavi [13]) to generate and investigate the callgraph.

Footnote 1: A function $f(n)$ is $\Theta(g(n))$ if there are positive constants $c_1$, $c_2$, and $n_0$ such that $0 \le c_1 g(n) \le f(n) \le c_2 g(n)$, $\forall n \ge n_0$. See [26] for a readable discussion of asymptotic notation etymology.

2.2 Callgraph

Following [14], we treat an executable as a graph of graphs. This follows the intuition that in any procedural language, the source code is structured into functions (which can be viewed as a flowchart, i.e. a directed graph which we call flowgraph). These functions call each other, thus creating a larger graph where each node is a function and the edges are calls-to relations between the functions. We call this larger graph the callgraph. We recover this structure by disassembling the executable into individual instructions. We distinguish between short and far branch instructions: short branches do not save a return address while far branches do. Intuitively, short branches are normally used to pass control around within one function of the program, while far branches are used to call other functions.

A sequence of instructions that is continuous (i.e. has no branches jumping into its middle and ends at a branch instruction) is called a basic block. We consider the graph formed by having each basic block as a node, and each short branch an edge. The connected components in this directed graph correspond to the flowgraphs of the functions in the source code. For each connected component in the previous graph, we create a node in the callgraph. For each far branch in the connected component, we add an edge to the node corresponding to the connected component this branch is targeting. Figures 7 and 8 in the Appendix illustrate a function's flow- and callgraph, respectively.

Formally, denote a callgraph CG as $CG = G(V, E)$, where $G(\cdot)$ stands for 'Graph'. Let $V = F$, where $F \in \{\text{normal}, \text{import}, \text{library}, \text{thunk}\}$. This just says that each function in CG is either a 'library' function (from an external library statically linked in), an 'import' function (dynamically imported from a dynamic library), a 'thunk' function (mostly one-line wrapper functions used for calling convention or type conversion) or a 'normal' function (can be viewed as the executable's own function). The following metrics were programmatically collected from CG:

– $|V|$ is the number of nodes in CG, i.e. the function count of the callgraph.
– For any $f \in V$, let $f = G(V_f, E_f)$ where $b \in V_f$ is a block of code, i.e. each node in the callgraph is itself a graph, a flowgraph, and each node on the flowgraph is a basic block.
– Define $IC : B \to \mathbb{N}$ where $B$ is defined to be the set of blocks of code, and $IC(b)$ is the number of instructions in $b$. We denote this function shorthand as $|b|_{IC}$, the number of instructions in basic block $b$.
– We extend this notation $|\cdot|_{IC}$ to elements of $V$ by defining $|f|_{IC} = \sum_{b \in V_f} |b|_{IC}$. This gives us the total number of instructions in a node of the callgraph, i.e. in a function.
– Let $d_G^{+}(f)$, $d_G^{-}(f)$ and $d_G^{bb}(f)$ denote the indegree, outdegree and basic block count of a function, respectively.

2.3 Correlations

We calculated the correlation between in- and outdegree of functions. Prior analysis of static class collaboration networks [40,44] suggests an anti-correlation, characterizing some functions as sources or sinks. We found no significant correlation between in- and outdegree of functions in the disassembled executables (Fig. 1). Correlation intuitively is unlikely to occur except in the '0 outdegree' case (the BinNavi toolset does not generate the flowgraph for imported functions, i.e. an imported function automatically has outdegree 0, but will be called from many other functions).

[Fig. 1: Correlation coefficient $r_{in,out}$. Panels: (a) Malware: $p$ versus $r_{in,out}$; (b) Goodware: $p$ versus $r_{in,out}$.]

[Table 1: Correlation, IQR for instruction count. Column headers: Class, Metric, (10), (100), (1000).]

[...] $|V|$, which may serve as a rough proxy for the executable's size.
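
The callgraph construction of Sect. 2.2 and the degree correlation of Sect. 2.3 can be illustrated with a minimal Python sketch. It is not the paper's toolchain (which used IDA Pro, IDAPython, MySQL and BinNavi). The toy inputs `blocks`, `short_branches` and `far_branches` and all function names are hypothetical. The sketch further assumes that "connected components" means weakly connected components, that repeated calls between the same two functions collapse into one edge, and that the correlation is Pearson's coefficient, none of which the text above states explicitly.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy input: basic block name -> instruction count |b|_IC.
blocks = {"a0": 4, "a1": 2, "a2": 3, "b0": 5, "b1": 1, "c0": 2}
# Short branches stay inside a function; far branches call other functions.
short_branches = [("a0", "a1"), ("a0", "a2"), ("b0", "b1")]
far_branches = [("a1", "b0"), ("a2", "c0"), ("b1", "c0")]


def flowgraphs(blocks, short_branches):
    """Group blocks into weakly connected components, one per function."""
    parent = {b: b for b in blocks}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for src, dst in short_branches:
        parent[find(src)] = find(dst)
    comps = defaultdict(set)
    for b in blocks:
        comps[find(b)].add(b)
    # Name each function after its lexicographically smallest block.
    return {min(members): members for members in comps.values()}


def callgraph(funcs, far_branches):
    """One node per flowgraph; one edge per pair linked by a far branch."""
    owner = {b: f for f, members in funcs.items() for b in members}
    return {(owner[s], owner[d]) for s, d in far_branches}


def metrics(funcs, edges, blocks):
    """Indegree, outdegree, basic block count and |f|_IC per function."""
    return {
        f: {
            "indegree": sum(1 for _, d in edges if d == f),
            "outdegree": sum(1 for s, _ in edges if s == f),
            "bb_count": len(members),
            "instr_count": sum(blocks[b] for b in members),
        }
        for f, members in funcs.items()
    }


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5 if vx and vy else 0.0


funcs = flowgraphs(blocks, short_branches)
edges = callgraph(funcs, far_branches)
m = metrics(funcs, edges, blocks)
r = pearson([v["indegree"] for v in m.values()],
            [v["outdegree"] for v in m.values()])
print(len(funcs), m, round(r, 3))
```

In this toy input the three functions form a chain of sources and sinks, so indegree and outdegree are perfectly anti-correlated; the paper reports no significant correlation on real executables. A real pipeline would also have to handle indirect branches and imported functions without flowgraphs, which this sketch ignores.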