ECEN 5623 RT Embedded Systems


CEC 320 and 322 Microprocessor Systems Class and Lab
Lecture 13 - MCU Platforms

Exam #2 Results and Solutions
Ave=68.2, High=94
– Exam-1 (Ave=75) and Exam-2 (Ave=65) = 30% (15% each)
– Final Exam = 20% (40% Part-1, 40% Part-2, 20% Ch. 4)
– Ex #1…5 = 30% (Ex #6 canceled - more review & lab-10 time)
– 6 Quizzes = 20% (2 left - Ch. 4 and Final Review)
– Canvas weights and policies, curve applied after final (help only)
Solutions Posted on Canvas
– Solutions Walk-through
– Q&A

Remaining Grade Events (32.67%)
– 2 Quizzes (6.7%) - Complete before 11/26, 12/9 (on Canvas)
– Ex #5 (6%) - Due 12/2
– Final (20%) - 12:00-02:30pm, Tues, 12/10 (Schedule 2019)

Lab #9 Demo - Video
Re-do Lab #4 Test ISR
Board Configuration & OLED Display - Verify ASM and C Version of ISR
3+ Clock Rates
Re-compile, Reset between tests
PD7 Analog Input
Goals:
1) Hand optimize ASM
2) Compare C and ASM
3) ARM Example
Menu Procedure Call and Commands

Standard ARM ISA and Platform Documents
ARM Architecture (like x86, MIPS, PowerPC, etc.)
– ARM Infocenter
– ARM Developer
– ARM ABI
– Azeria Labs - ARM Platform Security
– ARM University - Overview of Resources
– ARM Ltd.
Platform Documentation is Vendor Specific
– E.g. Broadcom - bcm58712 (Raspberry Pi BCM2837, BCM2711)
– TI Sitara (A-Series) and Tiva TM4C (M-Series used in CEC 320)
– NVIDIA Tegra Series
– Marvell (XSC)
– Cypress MCUs
– ST Micro MCUs
– Altera FPGA SoC
– Silicon Labs MCUs
© Sam Siewert

Continuation of MCU Related Studies
Diagram topics: Purpose-built MCUs; Processor Scale-Up, Scale-Down; RTOS, OS + RT extensions; Network processors
SE program
– CEC 470 Comp. Arch.
– CEC 450 RT Systems
– CEC 460 Telecomm
CE program
– CEC 460 Telecomm
– CEC 450 RT Systems
– CEC 470 Comp. Arch.

Life-long Study of Embedded Systems
SoC platforms and/or CPU core design - ALU with an FPGA or Sim
System on a Chip and embedded MCU platforms
– Altera DE SoC (DE2-115) - Nios II Soft Core
– Xilinx Digilent - MicroBlaze Soft Core
– NVIDIA Jetson Nano, Xavier NX
– Texas Instruments - Launch Pads (TM4C123GXL LP, TM4C1294XL Connected LP)
Useful for real-time systems (CEC 450)
– Concepts such as WCET (pipeline performance) for RMA
– Resource view of Platforms for HAL or OS (CPU, I/O, Memory, Power)
– RTOS introduction (e.g. FreeRTOS, Zephyr, TI RTOS, VxWorks, ARM Univ., Micrium, etc.)
– RT Services
1. FPGA VHDL or SoC (CEC 330, CEC399 Special Topics)
2. Bare Metal CE / Main+ISR (CEC 320/322)
3. RTOS or IoT (URI, CEC399 Special Topics)
4. OS+RT extensions (CEC 450)
Self-Study Continuation after Micro, before CEC450 (Real-Time)
– MIPS with Simulation of the ALU that is Cycle Accurate or Approximate
– Hennessy and Patterson - MIPS Comp Org Book, 5th Ed., Cortex-A8, NVIDIA, ARM v7, v8, x86
– ARM MCU or SoC with ETM / KEIL CoreSight (IAR Tools, p. 262, Code Composer)
– Intel x86, x64 PMU with VTune (to see chip-level events in Windows or Linux)
Tools: QtSpim, MARS, QEMU, Quartus-II and ModelSim
URI to learn and work on Comp Org for ICARUS (or CEC330 DB, CEC330 PC)
– Between CEC 320 and CEC 450 with embedded FPGA, SoC, IoT, GP-GPU and RTOS/OS experience
– Participate in research as an option before/after industry internships (e.g. summer after 2nd year)
Platform comparison, 40 years apart
– IMSAI “Workstation”: Intel 8080, one 2 MHz core, 64 KB, von Neumann Arch., $931 in 1979 assembled
– Jetson Xavier NX: six 1.4 GHz ARM Cortex A cores, 8 GB, 384 Co-processors, $399 in 2019

Recall ARM M & A Series
ARM M Series - MCU (ARM Cortex-M4)
– TIVA TM4C123G (M4), NXP, Cypress, Silicon Labs
– The Cortex-M4 processor is developed to address digital signal control markets that demand an efficient, easy-to-use blend of control and signal processing capabilities.
ARM A Series - Adv. Mobile (ARM Cortex-A15)
– Smart Phone – Qualcomm, Broadcom, NVIDIA
– Harvard Split L1, Unified L2, L3, Multi-core
– The processor cluster has one to four cores. Each core has its own L1 instruction and data caches, together with a single shared L2 unified cache.

Recall ARM R Series
ARM R Series - Real-Time (ARM Cortex-R52)
– Redundancy (no SPOFs) - Lock-step MISD
– Predictable / Deterministic response (TCM)
– Resilience - recovery and fail-safe
– ECC memory
– Flash memory with data protection
– Software sanity monitoring
– RT critical services
– Best-effort services
– The Cortex-R52 processor meets the rising performance needs of advanced real-time embedded systems.

Assignment #5
Final Assignment - Ex1 … Ex5
Explore ARM MCU Platforms (Do, Observe, Explain)
– Jetson TK1 - King 112 lab
– Raspberry Pi 3b+ (Broadcom) - borrow, remote login
Compare ARM MCU SoC Platforms (on paper)
– Jetson Nano - remote login
– DE2-115
Provides concrete examples to motivate CAC Ch. 4
Bridge to CEC 450, CS 415, Capstone

From MCUs to Platforms
1980’s - early 1990’s - Multi-chip, TTL logic, complex PCBs
– von Neumann (no split L1 cache), no pipeline, zero or low wait-state memory
– predictable - ASM clocks per instruction in x86 86/88, 186/188 User’s Manual - HW Ref
– Introduction of 32-bit MCUs
– 8-bit, 16-bit MCUs common (still widely used for deeply embedded today)
– E.g. 8051, 68HC11 used in robotics, automotive, etc. (RS-485, RS232, Token Ring, etc.)
– Today - Microchip/Atmel 8-bit, 16-bit AVR, TI for Scale-down (subsumption - CAN, I2C, SPI, BLE)
1990’s - MIPS, ARM, PowerPC, Alpha (3.3 V)
– Introduction of Pipelines and L1 cache (split cache Harvard architecture)
– 32-bit MCUs common, 64-bit for Workstations (e.g. DEC)
– Vector processing (SIMD) introduced - Altivec (PPC), MMX (Intel), (ARM NEON - 2009)
Early 2000’s - Super-pipelines (XSC 7/8 stage, ARM-11), Superscalar (AMD Opteron, Intel P6/Xeon x64), Dual-core (ARM, XSC)
Current Decade (2010’s) - Many Core, MICA, FPGA & GP-GPU SoC
– ARM Cortex M-Series (embedded), A-Series (mobile), R-Series (real-time)
– Many new ARM SoCs
Next - IoT (Scale down), Visual (Scale-up), Neuromorphic (Purpose built)
– Google TPU (Machine learning)
– NVIDIA GP-GPU (Visual processing and ML)
– Intel Neural Compute Stick (ML)
– ARM NXP, TI, ST-Micro, Cypress, Silicon Labs, etc. (IoT)

Scaling MCUs - 8, 16, to 32-bit
Figure: Motorola 68K (https://en.wikipedia.org/wiki/Motorola_68000)
Early MCUs were not pipelined and had zero wait-state memory access (or single wait-state worst case)
– Today, this is Tightly Coupled Memory
– TCM can be emulated with pipelined modern MCU with cache load and lock in L1
– Examples: Motorola 68000 (Mac, 16/32-bit), Intel 8088 (IBM PC, 8/16-bit)
– L1 split cache (Harvard) and unified L2/L3 minimizes wait-state slow down today
Cady, Frederick M. Microcontrollers and Microcomputers principles of software and hardware engineering. Oxford University Press, Inc., 2009.

Simplify, Speed-Up - RISC Pipelined MCUs
MIPS - R2000, R3000 (Late 1980’s - Early 1990’s) (https://en.wikipedia.org/wiki/R3000)
– Harris Radiation Hardened RH3000 (NASA New Horizons)
– Mongoose-V (1993)
– AAS 2017 Presentation on Modern RH MCUs (Siewert)
– Competition in 1993 was 64-bit DEC Alpha, 32-bit PowerPC (Mac), 32-bit 80486/Pentium P5 (Wintel PC), 32-bit ARM7 (von Neumann arch.)
Other RISC MCUs - ARM, PowerPC
Board level solutions became System on Chip and MCU Solutions on Chip
Lower part count, fewer issues with signal integrity, simpler, but sometimes more than you need

Current Scale-down, Scale-up MCUs
Scale-up (e.g. Cavium MIPS) - ARM A/R Series
– Many new 64-bit MCUs
MIPS 64 (Cavium Octeon, etc.)
ARM 64 A-Series
– Multi-core and Many-core MCUs
– Co-processor SoCs - FPGA and GP-GPU
Scale-down (e.g. Microchip/Atmel AVR) - ARM M Series
– Simple 32-bit IoT (BLE, 802.11, 5G) for predictive maintenance and consumer IoT (e.g. smart home)
– Continuation of 8-bit and 16-bit MCUs (Sensor networks, robotics)
– Subsumption architecture, Sensor networks

Cortex M-Series is Scale Down
TIVA TM4C123G Dev Board
TM4C123G Dev board uses the TM4C123GH6PGE MCU
Includes a number of demonstration devices
– I2C devices (e.g. MPU9150 Motion Tracker)
– GPIO LED, Switches, Pins (Multi-function)
– Analog inputs (Temp sensor)
– CAN bus interface
– 96x64 color OLED (Synch. Serial Interface)
– MicroSD (Synch. Serial)
TM4C123G has lower part count with MCU
Computers as Components 4e © 2016 Marilyn Wolf, Updated by SBS

Computing Platforms
Platform organization.
– MCU processor, peripherals (on-chip), peripherals (off-chip), on-chip memory, off-chip memory, on-chip/off-chip Nand flash, etc.
Busses.
– Local bus (AMBA) and I/O bus (e.g. PCIe)
Memory devices.
– Don’t confuse a Memory Controller (MCU) with a Microcontroller Unit (MCU) – Overloaded acronym
– MMU - Memory Management Unit used for memory mapping and access control

Computing platform architecture
DMA provides direct memory access.
Timers used by OS, devices.
Multiple busses connect CPU, memory to devices.
DMA Request queue - Request
• Src starting address
• Dst starting address
• Length
• Interrupt on done
• Return request tag
DMA Completion queue - Completion
• Request tag
• Status
For TIVA TM4C123G we used Programmed MMIO
– Read, Write FIFO or MMIO Registers (e.g. 16x8 UART FIFO)
– ADC Channel Reads
– GPIO Reads and Writes
– I2C Bus Writes (Function Generator)
– Exception is Motion Tracker - Data Filled in and Completion indicated by Call-back

Platform software
Platform software provides core functions, utilities.
Low-level functions depend on architecture - TI interrupt vectors, etc.
CE Main+ISR - e.g. Texas Instruments PDL
RTOS - e.g. Wind River VxWorks Wind kernel, Zephyr micro-kernel, FreeRTOS
OS + Extensions - e.g. Embedded Linux with POSIX RT

Example 4 GB System Memory Map
Top address   Bottom address  Size          Contents
0xFFFF_FFFF   0xFFF0_0000     1 Mbyte       Boot ROM (Flash) device (reset vector address @ high address)
0xFFEF_FFFF   0x0500_0000     4015 Mbytes   unused
0x04FF_FFFF   0x0400_0000     16 Mbytes     Memory Mapped IO (MMIO: PCI BARs for Device Function Registers)
0x03FF_FFFF   0x0200_0000     32 Mbytes     unused (space left for memory upgrades)
0x01FF_FFFF                                 Main Working Memory for OS/Apps (e.g.
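
The example memory map can be read as an address decoder: every 32-bit address falls in exactly one region. The sketch below is one way to model that in Python and to check the region sizes quoted on the slide. The names `Region`, `MEMORY_MAP`, `size_mbytes` and `decode` are made up for illustration, and the working-memory base of 0x0000_0000 is an assumption, since the slide text is cut off before that address appears.

```python
from typing import NamedTuple


class Region(NamedTuple):
    base: int  # lowest address in the region (inclusive)
    top: int   # highest address in the region (inclusive)
    name: str


# Regions as listed on the memory map slide, highest address first.
MEMORY_MAP = [
    Region(0xFFF0_0000, 0xFFFF_FFFF, "Boot ROM (Flash)"),
    Region(0x0500_0000, 0xFFEF_FFFF, "unused"),
    Region(0x0400_0000, 0x04FF_FFFF, "MMIO"),
    Region(0x0200_0000, 0x03FF_FFFF, "unused (space left for memory upgrades)"),
    # ASSUMPTION: the base address of working memory is not visible in the
    # source; 0x0000_0000 is used here so the map covers the whole space.
    Region(0x0000_0000, 0x01FF_FFFF, "Working Memory"),
]

MBYTE = 1 << 20  # 1 Mbyte = 2**20 bytes


def size_mbytes(region: Region) -> int:
    """Size of a region in Mbytes (both end addresses are inclusive)."""
    return (region.top - region.base + 1) // MBYTE


def decode(addr: int) -> Region:
    """Return the region that contains a 32-bit physical address."""
    if not 0 <= addr <= 0xFFFF_FFFF:
        raise ValueError("address is outside the 32-bit address space")
    for region in MEMORY_MAP:
        if region.base <= addr <= region.top:
            return region
    raise LookupError("address is not covered by the memory map")


if __name__ == "__main__":
    for region in MEMORY_MAP:
        print(f"0x{region.base:08X}-0x{region.top:08X} "
              f"{size_mbytes(region):5d} Mbytes  {region.name}")
```

Under that assumption the five regions add up to 4096 Mbytes, which is the full 32-bit address space, and an address near the top of memory such as 0xFFFF_FFFC decodes to the Boot ROM region, in line with the slide's note that the reset vector sits at a high address.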