Porting to 64-Bit ARM
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
ARM Architecture
ARM Architecture Comppgzuter Organization and Assembly ygg Languages Yung-Yu Chuang with slides by Peng-Sheng Chen, Ville Pietikainen ARM history • 1983 developed by Acorn computers – To replace 6502 in BBC computers – 4-man VLSI design team – Its simp lic ity comes from the inexper ience team – Match the needs for generalized SoC for reasonable power, performance and die size – The first commercial RISC implemenation • 1990 ARM (Advanced RISC Mac hine ), owned by Acorn, Apple and VLSI ARM Ltd Design and license ARM core design but not fabricate Why ARM? • One of the most licensed and thus widespread processor cores in the world – Used in PDA, cell phones, multimedia players, handheld game console, digital TV and cameras – ARM7: GBA, iPod – ARM9: NDS, PSP, Sony Ericsson, BenQ – ARM11: Apple iPhone, Nokia N93, N800 – 90% of 32-bit embedded RISC processors till 2009 • Used especially in portable devices due to its low power consumption and reasonable performance ARM powered products ARM processors • A simple but powerful design • A whlhole filfamily of didesigns shiharing siilimilar didesign principles and a common instruction set Naming ARM •ARMxyzTDMIEJFS – x: series – y: MMU – z: cache – T: Thumb – D: debugger – M: Multiplier – I: EmbeddedICE (built-in debugger hardware) – E: Enhanced instruction – J: Jazell e (JVM) – F: Floating-point – S: SthiiblSynthesizible version (source code version for EDA tools) Popular ARM architectures •ARM7TDMI – 3 pipe line stages (ft(fetc h/deco de /execu te ) – High code density/low power consumption – One of the most used ARM-version (for low-end systems) – All ARM cores after ARM7TDMI include TDMI even if they do not include TDMI in their labels • ARM9TDMI – Compatible with ARM7 – 5 stages (fe tc h/deco de /execu te /memory /wr ite ) – Separate instruction and data cache •ARM11 ARM family comparison year 1995 1997 1999 2003 ARM is a RISC • RISC: simple but powerful instructions that execute within a single cycle at high clock speed. -
Pwny Documentation Release 0.9.0
pwny Documentation Release 0.9.0 Author Nov 19, 2017 Contents 1 pwny package 3 2 pwnypack package 5 2.1 asm – (Dis)assembler..........................................5 2.2 bytecode – Python bytecode manipulation..............................7 2.3 codec – Data transformation...................................... 11 2.4 elf – ELF file parsing.......................................... 16 2.5 flow – Communication......................................... 36 2.6 fmtstring – Format strings...................................... 41 2.7 marshal – Python marshal loader................................... 42 2.8 oracle – Padding oracle attacks.................................... 43 2.9 packing – Data (un)packing...................................... 44 2.10 php – PHP related functions....................................... 46 2.11 pickle – Pickle tools.......................................... 47 2.12 py_internals – Python internals.................................. 49 2.13 rop – ROP gadgets........................................... 50 2.14 shellcode – Shellcode generator................................... 50 2.15 target – Target definition....................................... 79 2.16 util – Utility functions......................................... 80 3 Indices and tables 83 Python Module Index 85 i ii pwny Documentation, Release 0.9.0 pwnypack is the official CTF toolkit of Certified Edible Dinosaurs. It aims to provide a set of command line utilities and a python library that are useful when playing hacking CTFs. The core functionality of pwnypack -
Protecting Million-User Ios Apps with Obfuscation: Motivations, Pitfalls, and Experience
Protecting Million-User iOS Apps with Obfuscation: Motivations, Pitfalls, and Experience Pei Wang∗ Dinghao Wu Zhaofeng Chen Tao Wei [email protected] [email protected] [email protected] [email protected] The Pennsylvania State The Pennsylvania State Baidu X-Lab Baidu X-Lab University University ABSTRACT ACM Reference Format: In recent years, mobile apps have become the infrastructure of many Pei Wang, Dinghao Wu, Zhaofeng Chen, and Tao Wei. 2018. Protecting popular Internet services. It is now fairly common that a mobile app Million-User iOS Apps with Obfuscation: Motivations, Pitfalls, and Experi- ence. In ICSE-SEIP ’18: 40th International Conference on Software Engineering: serves a large number of users across the globe. Different from web- Software Engineering in Practice Track, May 27–June 3, 2018, Gothenburg, based services whose important program logic is mostly placed on Sweden. ACM, New York, NY, USA, 10 pages. https://doi.org/10.1145/3183519. remote servers, many mobile apps require complicated client-side 3183524 code to perform tasks that are critical to the businesses. The code of mobile apps can be easily accessed by any party after the software is installed on a rooted or jailbroken device. By examining the code, skilled reverse engineers can learn various knowledge about the 1 INTRODUCTION design and implementation of an app. Real-world cases have shown During the last decade, mobile devices and apps have become the that the disclosed critical information allows malicious parties to foundations of many million-dollar businesses operated globally. abuse or exploit the app-provided services for unrightful profits, However, the prosperity has drawn many malevolent attempts to leading to significant financial losses for app vendors. -
Realtime Capabilities of Low-End Powerpc and ARM Boards for Embedded Systems
Realtime capabilities of low-end PowerPC and ARM boards for embedded systems Alexander Bauer PHYTEC Messtechnik Gmbh Robert-Koch-Str.39, 55129 Mainz, Germany [email protected] Abstract With the stepwise integration of the Realtime Preemption Patches (RT-Preempt) into the Mainline Linux kernel and their support for architectures other than Intel and AMD, there are now a number of choices which board to use for a particular embedded realtime project running Mainline Linux. In order to select the appropriate processor and clock frequency, it would be desirable to have some generally applicable ranges of worst-case latencies that can be obtained using the various processor types and conditions. We, therefore, determined the internal worst-case latency of PowerPC and ARM boards running Linux 2.6.20 and above patched with RT-Preempt. The PowerPC-board (Phytec phyCORE-MPC5200B) was running at 266 and 400 MHz, the ARM board (Phytec phyCORE-PXA270) was running at 266 and 520 MHz. This article will provide the details of the various measurement set-ups, present the results and discuss them with respect to the design differences between PowerPC and ARM. 1 Introduction This paper presents the results of the latency tests and discusses the results with respect of the different In the embedded market there is a wide range of processor designs. processors to choose from. A processor is typically selected for a customer design because of it features, e.g. video interface and peripherals, and the clock 2 Latency Tests frequency. With the growing importance of Linux and especially realtime Linux for customer designs For the latency tests based on MPC5200 we used in the embedded market, it is also essential to choose the PHYTEC phyCORE MPC5200 board with 400 the right processor that will cope with the realtime MHz as a reference platform. -
Using Arm Scalable Vector Extension to Optimize OPEN MPI
Using Arm Scalable Vector Extension to Optimize OPEN MPI Dong Zhong1,2, Pavel Shamis4, Qinglei Cao1,2, George Bosilca1,2, Shinji Sumimoto3, Kenichi Miura3, and Jack Dongarra1,2 1Innovative Computing Laboratory, The University of Tennessee, US 2fdzhong, [email protected], fbosilca, [email protected] 3Fujitsu Ltd, fsumimoto.shinji, [email protected] 4Arm, [email protected] Abstract— As the scale of high-performance computing (HPC) with extension instruction sets. systems continues to grow, increasing levels of parallelism must SVE is a vector extension for AArch64 execution mode be implored to achieve optimal performance. Recently, the for the A64 instruction set of the Armv8 architecture [4], [5]. processors support wide vector extensions, vectorization becomes much more important to exploit the potential peak performance Unlike other SIMD architectures, SVE does not define the size of target architecture. Novel processor architectures, such as of the vector registers, instead it provides a range of different the Armv8-A architecture, introduce Scalable Vector Extension values which permit vector code to adapt automatically to the (SVE) - an optional separate architectural extension with a new current vector length at runtime with the feature of Vector set of A64 instruction encodings, which enables even greater Length Agnostic (VLA) programming [6], [7]. Vector length parallelisms. In this paper, we analyze the usage and performance of the constrains in the range from a minimum of 128 bits up to a SVE instructions in Arm SVE vector Instruction Set Architec- maximum of 2048 bits in increments of 128 bits. ture (ISA); and utilize those instructions to improve the memcpy SVE not only takes advantage of using long vectors but also and various local reduction operations. -
Anatomy of Cross-Compilation Toolchains
Embedded Linux Conference Europe 2016 Anatomy of cross-compilation toolchains Thomas Petazzoni free electrons [email protected] Artwork and Photography by Jason Freeny free electrons - Embedded Linux, kernel, drivers - Development, consulting, training and support. http://free-electrons.com 1/1 Thomas Petazzoni I CTO and Embedded Linux engineer at Free Electrons I Embedded Linux specialists. I Development, consulting and training. I http://free-electrons.com I Contributions I Kernel support for the Marvell Armada ARM SoCs from Marvell I Major contributor to Buildroot, an open-source, simple and fast embedded Linux build system I Living in Toulouse, south west of France Drawing from Frank Tizzoni, at Kernel Recipes 2016 free electrons - Embedded Linux, kernel, drivers - Development, consulting, training and support. http://free-electrons.com 2/1 Disclaimer I I am not a toolchain developer. Not pretending to know everything about toolchains. I Experience gained from building simple toolchains in the context of Buildroot I Purpose of the talk is to give an introduction, not in-depth information. I Focused on simple gcc-based toolchains, and for a number of examples, on ARM specific details. I Will not cover advanced use cases, such as LTO, GRAPHITE optimizations, etc. I Will not cover LLVM free electrons - Embedded Linux, kernel, drivers - Development, consulting, training and support. http://free-electrons.com 3/1 What is a cross-compiling toolchain? I A set of tools that allows to build source code into binary code for -
RISC-V Geneology
RISC-V Geneology Tony Chen David A. Patterson Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No. UCB/EECS-2016-6 http://www.eecs.berkeley.edu/Pubs/TechRpts/2016/EECS-2016-6.html January 24, 2016 Copyright © 2016, by the author(s). All rights reserved. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission. Introduction RISC-V is an open instruction set designed along RISC principles developed originally at UC Berkeley1 and is now set to become an open industry standard under the governance of the RISC-V Foundation (www.riscv.org). Since the instruction set architecture (ISA) is unrestricted, organizations can share implementations as well as open source compilers and operating systems. Designed for use in custom systems on a chip, RISC-V consists of a base set of instructions called RV32I along with optional extensions for multiply and divide (RV32M), atomic operations (RV32A), single-precision floating point (RV32F), and double-precision floating point (RV32D). The base and these four extensions are collectively called RV32G. This report discusses the historical precedents of RV32G. We look at 18 prior instruction set architectures, chosen primarily from earlier UC Berkeley RISC architectures and major proprietary RISC instruction sets. Among the 122 instructions in RV32G: ● 6 instructions do not have precedents among the selected instruction sets, ● 98 instructions of the 116 with precedents appear in at least three different instruction sets. -
Cross-Compiling Linux Kernels on X86 64: a Tutorial on How to Get Started
Cross-compiling Linux Kernels on x86_64: A tutorial on How to Get Started Shuah Khan Senior Linux Kernel Developer – Open Source Group Samsung Research America (Silicon Valley) [email protected] Agenda ● Cross-compile value proposition ● Preparing the system for cross-compiler installation ● Cross-compiler installation steps ● Demo – install arm and arm64 ● Compiling on architectures ● Demo – compile arm and arm64 ● Automating cross-compile testing ● Upstream cross-compile testing activity ● References and Package repositories ● Q&A Cross-compile value proposition ● 30+ architectures supported (several sub-archs) ● Native compile testing requires wide range of test systems – not practical ● Ability to cross-compile non-natively on an widely available architecture helps detect compile errors ● Coupled with emulation environments (e.g: qemu) testing on non-native architectures becomes easier ● Setting up cross-compile environment is the first and necessary step arch/ alpha frv arc microblaze h8300 s390 um arm mips hexagon score x86_64 arm64 mn10300 unicore32 ia64 sh xtensa avr32 openrisc x86 m32r sparc blackfin parisc m68k tile c6x powerpc metag cris Cross-compiler packages ● Ubuntu arm packages (12.10 or later) – gcc-arm-linux-gnueabi – gcc-arm-linux-gnueabihf ● Ubuntu arm64 packages (13.04 or later) – use arm64 repo for older Ubuntu releases. – gcc-4.7-aarch64-linux-gnu ● Ubuntu keeps adding support for compilers. Search Ubuntu repository for packages. Cross-compiler packages ● Embedded Debian Project is a good resource for alpha, mips, -
Design of the RISC-V Instruction Set Architecture
Design of the RISC-V Instruction Set Architecture Andrew Waterman Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No. UCB/EECS-2016-1 http://www.eecs.berkeley.edu/Pubs/TechRpts/2016/EECS-2016-1.html January 3, 2016 Copyright © 2016, by the author(s). All rights reserved. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission. Design of the RISC-V Instruction Set Architecture by Andrew Shell Waterman A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Computer Science in the Graduate Division of the University of California, Berkeley Committee in charge: Professor David Patterson, Chair Professor Krste Asanovi´c Associate Professor Per-Olof Persson Spring 2016 Design of the RISC-V Instruction Set Architecture Copyright 2016 by Andrew Shell Waterman 1 Abstract Design of the RISC-V Instruction Set Architecture by Andrew Shell Waterman Doctor of Philosophy in Computer Science University of California, Berkeley Professor David Patterson, Chair The hardware-software interface, embodied in the instruction set architecture (ISA), is arguably the most important interface in a computer system. Yet, in contrast to nearly all other interfaces in a modern computer system, all commercially popular ISAs are proprietary. -
Computer Architectures an Overview
Computer Architectures An Overview PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information. PDF generated at: Sat, 25 Feb 2012 22:35:32 UTC Contents Articles Microarchitecture 1 x86 7 PowerPC 23 IBM POWER 33 MIPS architecture 39 SPARC 57 ARM architecture 65 DEC Alpha 80 AlphaStation 92 AlphaServer 95 Very long instruction word 103 Instruction-level parallelism 107 Explicitly parallel instruction computing 108 References Article Sources and Contributors 111 Image Sources, Licenses and Contributors 113 Article Licenses License 114 Microarchitecture 1 Microarchitecture In computer engineering, microarchitecture (sometimes abbreviated to µarch or uarch), also called computer organization, is the way a given instruction set architecture (ISA) is implemented on a processor. A given ISA may be implemented with different microarchitectures.[1] Implementations might vary due to different goals of a given design or due to shifts in technology.[2] Computer architecture is the combination of microarchitecture and instruction set design. Relation to instruction set architecture The ISA is roughly the same as the programming model of a processor as seen by an assembly language programmer or compiler writer. The ISA includes the execution model, processor registers, address and data formats among other things. The Intel Core microarchitecture microarchitecture includes the constituent parts of the processor and how these interconnect and interoperate to implement the ISA. The microarchitecture of a machine is usually represented as (more or less detailed) diagrams that describe the interconnections of the various microarchitectural elements of the machine, which may be everything from single gates and registers, to complete arithmetic logic units (ALU)s and even larger elements. -
The Arms Race to Trustzone
The ARMs race to TrustZone Jonathan Levin http://Technologeeks.com (C) 2016 Jonathan Levin & Technologeeks.com - Share freely, but please cite source! `whoami` • Jonathan Levin, CTO of technologeeks[.com] – Group of experts doing consulting/training on all things internal • Author of a growing family of books: – Mac OS X/iOS Internals – Android Internals (http://NewAndroidbook.com ) – *OS Internals (http://NewOSXBook.com ) (C) 2016 Jonathan Levin – Share freely, but please cite source! For more details – Technologeeks.com Plan • TrustZone – Recap of ARMv7 and ARMv8 architecture • iOS Implementation – Apple’s “WatchTower” (Kernel Patch Protector) implementation • Android Implementations – Samsung, Qualcomm, Others (C) 2016 Jonathan Levin & Technologeeks.com - Share freely, but please cite source! TrustZone & ELx (C) 2016 Jonathan Levin & Technologeeks.com - Share freely, but please cite source! TrustZone • Hardware support for a trusted execution environment • Provides a separate “secure world” 安全世界 – Self-contained operating system – Isolated from “non-secure world” • In AArch64, integrates well with Exception Levels (例外層級) – EL3 only exists in the secure world – EL2 (hypervisor) not applicable in secure world. • De facto standard for security enforcement in mobile world (C) 2016 Jonathan Levin & Technologeeks.com - Share freely, but please cite source! TrustZone Regular User mode TrustZone (Untrusted) User mode, root (Untrusted, Privileged) Kernel mode (Trusted, Privileged) Hardware (C) 2016 Jonathan Levin & Technologeeks.com - Share freely, but please cite source! Trust Zone Architecture (Aarch32) 非安全世界 安全世界 Source: ARM documentation (C) 2016 Jonathan Levin & Technologeeks.com - Share freely, but please cite source! Android uses of TrustZone • Cryptographic hardware backing (keystore, gatekeeper) – Key generation, storage and validation are all in secure world – Non secure world only gets “tokens” – Public keys accessible in non-secure world – Secret unlocking (e.g. -
The ARM Instruction Set Architecture
EE382N-4 Embedded Systems Architecture The ARM Instruction Set Architecture Mark McDermott With help from our good friends at ARM Fall 2008 8/22/2008 EE382N-4 Embedded Systems Architecture Main features of the ARM Instruction Set All instructions are 32 bits long. Most instructions execute in a single cycle. Most instructions can be conditionally executed. A load/store architecture – Data processing instructions act only on registers • Three operand format • Combined ALU and shifter for high speed bit manipulation – Specific memory access instructions with powerful auto‐indexing addressing modes. • 32 bit and 8 bit data types – and also 16 bit data types on ARM Architecture v4. • Flexible multiple register load and store instructions Instruction set extension via coprocessors Very dense 16‐bit compressed instruction set (Thumb) 8/22/2008 2 EE382N-4 Embedded Systems Architecture Coprocessors – Up to 16 coprocessors can be defined – Expands the ARM instruction set – Each coprocessor can have up to 16 private registers of any reasonable size – Load‐store architecture 3 EE382N-4 Embedded Systems Architecture Thumb Thumb is a 16‐bit instruction set – Optimized for code density from C code – Improved performance form narrow memory – Subset of the functionality of the ARM instruction set Core has two execution states –ARM and Thumb – Switch between them using BX instruction Thumb has characteristic features: – Most Thumb instruction are executed unconditionally – Many Thumb data process instruction use a 2‐address format – Thumb instruction