Porting the LHCb Stack from x86 (Intel) to aarch64 (ARM) and ppc64le (PowerPC)


EPJ Web of Conferences 214, 05016 (2019), CHEP 2018
https://doi.org/10.1051/epjconf/201921405016

Laura Promberger (1,*), Marco Clemencic (2), Ben Couturier (2), Aritz Brosa Iartza (3), and Niko Neufeld (2), on behalf of the LHCb collaboration

(1) Fakultät für Informatik und Wirtschaftsinformatik - Fachgebiet Informatik, Hochschule Karlsruhe - Technik und Wirtschaft, Karlsruhe, Germany
(2) EP, CERN, Meyrin, Switzerland
(3) Escuela de Ingeniería Informática, Universidad de Oviedo, Oviedo, Asturias, Spain

Abstract. LHCb is undergoing major changes in its data selection and processing chain for the upcoming LHC Run 3 starting in 2021. With this in mind, several initiatives have been launched to optimise the software stack. This contribution discusses porting the LHCb stack from the x86_64 architecture to both aarch64 and ppc64le, with the goal of evaluating the performance and the cost of the computing infrastructure for the High Level Trigger (HLT). This requires porting a stack with more than five million lines of code and finding working versions of the external libraries provided by LCG. Across all software packages the biggest challenge is the growing use of vectorisation, as many vectorisation libraries are specialised for the x86 architecture and do not support other architectures. In spite of these challenges, we have successfully ported the LHCb High Level Trigger code to aarch64 and ppc64le. This contribution discusses the status and plans for the porting of the software as well as the LHCb approach for tackling code vectorisation in a platform-independent way.

1 Introduction

In 2021 the LHCb experiment will undergo a major upgrade for Run 3.
With the increased luminosity provided by the LHC and the introduction of new sub-detectors, the LHCb experiment will be able to research new phenomena and study known ones in more detail. At the same time this results in an increase of the raw detector data output by a factor of 100, from 50 GB/s to approximately 4 TB/s. The output data rate of the final selection of events written to disk will increase from 0.7 GB/s to 2-10 GB/s.

To cope with the large increase in data volume, a combination of upgrading the hardware resources of the HLT computing farm and increasing the performance of the software stack is necessary. The performance can be increased by optimising the logic of algorithms and by exploiting techniques that maximise hardware efficiency, chiefly multi-threading and vectorisation.

The upgrade of the computing farm will include a new data center and new compute nodes. For the most competitive cost-performance solution it is important to have several options. For this reason LHCb decided to extend the architecture support from Intel x86_64 to ARM aarch64 and PowerPC ppc64le.

*e-mail: [email protected]
© The Authors, published by EDP Sciences. This is an open access article distributed under the terms of the Creative Commons Attribution License 4.0 (http://creativecommons.org/licenses/by/4.0/).

1.1 The LHCb software stack

The LHCb software stack is made up of multiple large projects, which can be divided into three groups. The LCG project provides external dependencies (e.g. Oracle, ROOT, Python). On top of these is the experiment-independent project Gaudi, which is used by different experiments at CERN. Last, there are the experiment-specific projects LHCb, Lbcom, Rec and Brunel, which alone sum up to about five million lines of code.

For this study we used the LHCb stack with Brunel v53r1 [1].
It was first built with the predefined LCG version 91 [2]; LCG was later upgraded to version 92 [3]. This version of the stack was selected because vectorisation is considered to be the largest challenge for porting the code, and the selected version contains less vectorised code than the software stack currently being developed for Run 3. At the same time this version has the disadvantage that it is not yet multi-threaded.

2 Vectorisation

Before porting the software stack, the vectorisation libraries used by LHCb were evaluated for their cross-platform support. The two libraries used are Vc and Vcl. An overview of their features is shown in Table 1. Neither library supports aarch64 (ARM) or ppc64le (PowerPC). However, Vcl is a light, low-level wrapper on top of the intrinsic functions, which allows the missing architectures to be implemented in a limited time. Vc, in contrast, takes a more high-level approach, which makes it easier to use but gives it a more complex internal structure. Therefore it was decided for the port to add cross-platform support to Vcl for all needed functions and to replace the usage of Vc by either Vcl or a generic scalar implementation.

Table 1: Vectorisation libraries

Feature | Vcl | Vc
Main developer | Agner Fog | Matthias Kretz
Vectorisation style | Wrapper for intrinsics | Targeted for horizontal vectorisation
Problems | Only supports Intel architecture; few masked functions | Does not cover all use cases; hard to implement vertical vectorisation; vector width not always identifiable
Expandability for new intrinsics | Medium (no unit tests) | Complex
Future | - | Version 2 will be integrated in the C++ standard
AVX2 | Yes | Yes
AVX-512 | Yes | In development
AltiVec | No | No
NEON | No | In development
Documentation | Yes | Yes
Examples | Yes | Yes
Masked functions | Shuffle, blend and permute | Nearly every function (where construct)
Unsupported functions and architectures | No | Officially yes, but not implemented
License | GPL3, or paid license for use in proprietary software | BSD-3-Clause
Distribution | Own website [4] | GitHub [5]

3 Porting to aarch64 (ARM)

The LHCb stack is first ported to aarch64. For LCG this requires changing compile flags and versions of the external dependencies. Some optional dependencies, like Oracle, are not supported on the architecture, so they can be deactivated without creating any problems.
For the other projects (Gaudi, LHCb, ...) compile flags also have to be changed. Additionally, all usage of Vc is replaced either by Vcl or a generic scalar implementation.

During the port two major problems occurred. First, there are platform-specific differences when casting double to unsigned int. Intel does not have a specific instruction to cast from double to unsigned int. Instead, Intel casts from double to signed int and then reinterprets the result as unsigned int. ARM, on the other hand, has an instruction to cast from double to unsigned int. As a result, casting the number -2.3 to unsigned int is not valid on ARM: it results in a floating-point exception stating that the operation is invalid, as the value is out of range for unsigned int. To solve this problem the behavior of Intel is mimicked explicitly: cast double to signed int and then to unsigned int. This approach was chosen because enforcing the proper data range would have required an unreasonable amount of code-breaking changes for this study.

The second problem is also platform-specific and concerns the default signedness of char. On Intel a plain char is treated as signed char, whereas on ARM it is treated as unsigned char. Within the LHCb stack this resulted in a wrongly calculated hash function.
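As a rough illustration of how the signedness of char can change a hash value, the sketch below uses a toy hash, not the LHCb hash function. The names toy_hash and hashes_agree are hypothetical. The template parameter stands in for what plain char means on a given platform: signed char mimics the Intel default and unsigned char the ARM default.

```cpp
#include <cstdint>
#include <string>

// Hypothetical toy hash, for illustration only (not the LHCb hash function).
// Each byte is widened to 32 bits before it is mixed in, so the signedness
// of the byte type influences the result.
template <typename CharT>
std::uint32_t toy_hash(const std::string& key) {
    std::uint32_t h = 0;
    for (char c : key) {
        // Interpret the byte as CharT, then widen it. With signed char,
        // bytes >= 0x80 sign-extend (0xE9 becomes 0xFFFFFFE9); with
        // unsigned char they zero-extend (0xE9 stays 0x000000E9).
        const auto widened = static_cast<std::uint32_t>(static_cast<CharT>(c));
        h = h * 31u + widened;
    }
    return h;
}

// True if the "Intel-like" and "ARM-like" interpretations give the same hash.
inline bool hashes_agree(const std::string& key) {
    return toy_hash<signed char>(key) == toy_hash<unsigned char>(key);
}
```

For keys that contain only 7-bit ASCII bytes, such as "Brunel", both interpretations agree. A key containing a byte of 0x80 or above, such as "caf\xE9", produces two different hash values, which is the kind of mismatch that only shows up once the code runs on the other platform.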
To solve the char signedness problem, the GCC compile flag -fsigned-char has to be used, forcing ARM to behave like Intel by treating char as signed char.

After successfully building the LHCb stack on aarch64, the results of the full reconstruction test (Brunel) were validated for their numerical accuracy, as the values fluctuate depending on the compiler version and the platform being used.
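The double to unsigned int workaround described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the LHCb stack: the helper name to_unsigned_like_x86 is hypothetical, and fixed-width 32-bit types stand in for int and unsigned int. In C++, converting a negative double directly to an unsigned integer type is undefined behaviour, which is why platforms are free to differ here.

```cpp
#include <cstdint>

// Hypothetical helper, for illustration only (not taken from the LHCb code).
// Step 1: double -> signed 32-bit integer. This is well defined as long as
//         the truncated value fits into std::int32_t; the fractional part
//         is dropped (truncation towards zero).
// Step 2: signed -> unsigned 32-bit integer. This is always well defined
//         and wraps modulo 2^32.
// Values outside the std::int32_t range remain undefined behaviour and
// would still need an explicit range check.
inline std::uint32_t to_unsigned_like_x86(double value) {
    const auto as_signed = static_cast<std::int32_t>(value);  // -2.3 -> -2
    return static_cast<std::uint32_t>(as_signed);             // -2 -> 4294967294
}
```

With this helper, to_unsigned_like_x86(-2.3) yields 4294967294 regardless of the platform, while non-negative inputs such as 7.9 simply truncate to 7. The trade-off matches the one named in the text: the out-of-range input is tolerated rather than rejected, which avoids touching every call site.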