ISAdetect: Usable Automated Detection of CPU Architecture and Endianness for Executable Binary Files and Object Code

This is a self-archived version of an original article. This version may differ from the original in pagination and typographic details.

Author(s): Kairajärvi, Sami; Costin, Andrei; Hämäläinen, Timo
Title: ISAdetect : Usable Automated Detection of CPU Architecture and Endianness for Executable Binary Files and Object Code
Year: 2020
Version: Accepted version (Final draft)
Copyright: © 2020 ACM
Rights: In Copyright
Rights url: http://rightsstatements.org/page/InC/1.0/?language=en

Please cite the original version:
Kairajärvi, S., Costin, A., & Hämäläinen, T. (2020). ISAdetect : Usable Automated Detection of CPU Architecture and Endianness for Executable Binary Files and Object Code. In CODASPY '20 : Proceedings of the 10th ACM Conference on Data and Application Security and Privacy (pp. 376-380). ACM. https://doi.org/10.1145/3374664.3375742

ISAdetect: Usable Automated Detection of CPU Architecture and Endianness for Executable Binary Files and Object Code

Sami Kairajärvi∗, Andrei Costin, Timo Hämäläinen
University of Jyväskylä, Jyväskylä, Finland

∗This paper is based on: author's MSc thesis [21] and extended pre-print version [22].

CODASPY '20, March 16–18, 2020, New Orleans, LA, USA. © 2020 Copyright held by the owner/author(s). Publication rights licensed to ACM. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in Tenth ACM Conference on Data and Application Security and Privacy (CODASPY'20), March 16–18, 2020, New Orleans, LA, USA, https://doi.org/10.1145/3374664.3375742.

ABSTRACT

Static and dynamic binary analysis techniques are actively used to reverse engineer software's behavior and to detect its vulnerabilities, even when only the binary code is available for analysis. To avoid analysis errors due to misreading op-codes for a wrong CPU architecture, these analysis tools must precisely identify the Instruction Set Architecture (ISA) of the object code under analysis. The variety of CPU architectures that modern security and reverse engineering tools must support is ever increasing due to the massive proliferation of IoT devices and the diversity of firmware and malware targeting those devices. Recent studies concluded that falsely identifying the binary code's ISA alone caused about 10% of failures of IoT firmware analysis. The state-of-the-art approaches to detecting the ISA of executable object code look promising, and their results demonstrate effectiveness and high performance. However, they lack the support of publicly available datasets and toolsets, which makes the evaluation, comparison, and improvement of those techniques, datasets, and machine learning models quite challenging (if not impossible). This paper bridges multiple gaps in the field of automated and precise identification of architecture and endianness of binary files and object code. We develop from scratch the toolset and datasets that are lacking in this research space. As such, we contribute a comprehensive collection of open data, open source, and open API web-services. We also attempt experiment reconstruction and cross-validation of effectiveness, efficiency, and results of the state-of-the-art methods. When training and testing classifiers using solely code sections from executable binary files, all our classifiers performed equally well, achieving over 98% accuracy. The results are consistent and comparable with the current state of the art, which supports the general validity of the algorithms, features, and approaches suggested in those works.

CCS CONCEPTS

• Security and privacy → Software reverse engineering; • Computing methodologies → Machine learning; • Computer systems organization → Embedded and cyber-physical systems.

KEYWORDS

Binary code analysis, Firmware analysis, Instruction Set Architecture (ISA), Supervised machine learning, Reverse engineering, Malware analysis, Digital forensics

ACM Reference Format:
Sami Kairajärvi, Andrei Costin, and Timo Hämäläinen. 2020. ISAdetect: Usable Automated Detection of CPU Architecture and Endianness for Executable Binary Files and Object Code. In Tenth ACM Conference on Data and Application Security and Privacy (CODASPY'20), March 16–18, 2020, New Orleans, LA, USA. ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3374664.3375742

1 INTRODUCTION

Reverse engineering and analysis of binary code have a wide spectrum of applications [24, 35, 39], ranging from vulnerability research [4, 9, 10, 34, 43] to binary patching and translation [37], and from digital forensics [8] to anti-malware and Intrusion Detection Systems (IDS) [25, 41]. For such applications, various static and dynamic analysis techniques and tools are constantly being researched, developed, and improved [3, 6, 15, 24, 30, 38, 42].

Regardless of their end goal, one of the important steps in these techniques is to correctly identify the Instruction Set Architecture (ISA) of the op-codes within the binary code. Some techniques can perform the analysis using architecture-independent or cross-architecture methods [16, 17, 32]. However, many of those techniques still require exact knowledge of the binary code's ISA. For example, recent studies concluded that falsely identifying the binary code's ISA caused about 10% of failures of IoT/embedded firmware analysis [9, 10].

Sometimes the CPU architecture is available in the executable format's header sections, for example in the ELF file format [29]. However, this information is not guaranteed to be universally available for analysis. There are multiple reasons for this and we will detail a few of them. The advances in Internet of Things (IoT) technologies bring to the game an extreme variety of hardware and software, in particular new CPU architectures and new OSs or OS-like systems [27]. Many of those devices are resource-constrained and the binary code comes without sophisticated headers and OS abstraction layers. At the same time, digital forensics and IDSs may sometimes have access only to a fraction of the code, e.g., from an exploit (shell-code), malware trace, or a memory dump. For example, many shell-codes are exactly this: a quite short, formatless and headerless sequence of CPU op-codes for a target system (i.e., a combination of hardware, operating system, and abstraction layers) that performs a presumably malicious action on behalf of the attacker [18, 33]. In such cases, though possible in theory, it is quite unlikely that the full code including the headers specifying the CPU ISA will be available for analysis. Finally, even in the traditional computing world (e.g., x86/x86_64), there are situations where the hosts contain object code for CPU architectures other than the one of the host itself. Examples include firmware for network cards [2, 13, 14], various management co-processors [26], and device drivers [5, 20] for USB [28, 40] and other embedded devices that contain their own specialized processors and provide specific services (e.g., encryption, data codecs) that are implemented as peripheral firmware [11, 23]. Even worse, more often than not the object code for such peripheral firmware is stored using less traditional or non-standard file formats and headers (if any), or embedded inside the device drivers themselves, resulting in mixed-architecture code streams.

Currently, several state-of-the-art works try to address the challenge of accurately identifying the CPU ISA for executable object code sequences [8, 12]. Their approaches look promising as the results demonstrate effectiveness and high performance. However, they lack the support of publicly available datasets and toolsets, which makes the evaluation, comparison, and improvement of those techniques, datasets, and machine learning models quite challenging (if not impossible).

With this paper, we bridge multiple gaps in the field of automated and precise identification of architecture and endianness of binary files and object code. We develop from scratch the toolset and datasets that are lacking in this research space. To this end, we release a comprehensive collection of open data, open source, and open API web-services. We attempt experiment reconstruction and cross-validation of effectiveness, efficiency, and results of the state-of-the-art methods [8, 12], as well as propose and experimentally validate new approaches to the main classification challenge where we obtain

2 DATASETS AND EXPERIMENTAL SETUP

2.1 Datasets

We started with the dataset acquisition challenge. Despite the existence of several state-of-the-art works, unfortunately neither their datasets nor the toolsets are publicly available. To overcome this limitation we had to develop a complete toolset and pipeline that are able to optimally download a dataset that is both large and representative enough. We release our data acquisition toolset as open source. The pipeline used to acquire our dataset is depicted in Figure 1. We chose the Debian Linux repositories for several

Figure 1: ISAdetect general workflow of the toolset. (Stages shown: crawl Debian repository; Jigdo file check (yes/no); Jigdo to ISO; unpack Debian packages; extract selected packages; metadata and code sections from binaries; features from code sections.)

Table 1: ISAdetect dataset summary.

Type                 Approx. # of files in dataset   Approx. total size
.iso files           ~1600                           ~1843 GB
.deb files           ~79000                          ~36 GB
ELF files            ~96000                          ~27 GB
ELF code sections    ~96000                          ~17 GB
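
The Introduction notes that the CPU architecture is sometimes available in the executable's header, as in the ELF format, and sometimes not, as with shell-code or raw firmware. The following is a minimal sketch of that header-based case, not part of the ISAdetect toolset: the function name read_elf_isa and the EM_NAMES table are hypothetical, and the field offsets and machine codes are taken from the ELF specification (only a small subset of machine codes is listed).

```python
import struct

# Subset of e_machine codes from the ELF specification (illustrative, not exhaustive).
EM_NAMES = {
    2: "sparc",
    3: "i386",
    8: "mips",
    20: "powerpc",
    21: "powerpc64",
    22: "s390",
    40: "arm",
    62: "x86_64",
    183: "aarch64",
}


def read_elf_isa(header: bytes):
    """Return (machine, word_size_bits, endianness) from an ELF header, or None.

    None means the bytes carry no usable ELF header (e.g. shell-code or a raw
    firmware blob), which is the case where content-based detection is needed.
    """
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return None
    ei_class, ei_data = header[4], header[5]  # EI_CLASS and EI_DATA bytes of e_ident
    if ei_class not in (1, 2) or ei_data not in (1, 2):
        return None
    endianness = "little" if ei_data == 1 else "big"
    # e_machine is a 16-bit field at offset 18, stored in the file's own byte order.
    fmt = "<H" if ei_data == 1 else ">H"
    (e_machine,) = struct.unpack_from(fmt, header, 18)
    machine = EM_NAMES.get(e_machine, f"unknown({e_machine})")
    return machine, 32 if ei_class == 1 else 64, endianness
```

Calling read_elf_isa on the first 20 bytes of a file is enough for this check. A None result signals that no header is present, so the ISA would have to be inferred from the code bytes themselves, which is the problem the paper addresses.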