ECEN 5623 RT Embedded Systems

CEC 320 and 322 Microprocessor Systems Class and Lab
Lecture 13 - MCU Platforms

Exam #2 Results and Solutions
Ave=68.2, High=94
– Exam-1 (Ave=75) and Exam-2 (Ave=65) = 30% (15% each)
– Final Exam = 20% (40% Part-1, 40% Part-2, 20% Ch. 4)
– Ex #1…5 = 30% (Ex #6 canceled - more review & lab-10 time)
– 6 Quizzes = 20% (2 left - Ch. 4 and Final Review)
– Canvas weights and policies, curve applied after final (help only)
Solutions Posted on Canvas
– Solutions Walk-through
– Q&A
Remaining Grade Events (32.67%)
– 2 Quizzes (6.7%) - Complete before 11/26, 12/9 (on Canvas)
– Ex #5 (6%) - Due 12/2
– Final (20%) - 12:00-02:30pm, Tues, 12/10 (Schedule 2019)

Lab #9 Demo - Video
Re-do Lab #4 Test ISR
Board Configuration & OLED Display - Verify
ASM and C Version of ISR
3+ Clock Rates
Re-compile, Reset between tests
PD7 Analog Input
Goals:
1) Hand optimize ASM
2) Compare C and ASM
3) ARM Example
Menu Procedure Call and Commands

Standard ARM ISA and Platform Documents
ARM Architecture (like x86, MIPS, PowerPC, etc.)
– ARM Infocenter
– ARM Developer
– ARM ABI
– Azeria Labs - ARM Platform Security
– ARM University - Overview of Resources
– ARM Ltd.
Platform Documentation is Vendor Specific
– E.g. Broadcom - bcm58712 (Raspberry Pi BCM2837, BCM2711)
– TI Sitara (A-Series) and Tiva TM4C (M-Series used in CEC 320)
– NVIDIA Tegra Series
– Marvell (XSC)
– Cypress MCUs
– ST Micro MCUs
– Altera FPGA SoC
– Silicon Labs MCUs
© Sam Siewert

Continuation of MCU Related Studies
SE program
– CEC 470 Comp. Arch.
– CEC 450 RT Systems
– CEC 460 Telecomm
CE program
– CEC 460 Telecomm
– CEC 450 RT Systems
– CEC 470 Comp. Arch.
[Diagram labels: Purpose-built MCUs; Processor Scale-Up, Scale-Down; RTOS; OS + RT extensions; Network processors; CE Program; SE Program]

Life-long Study of Embedded Systems
SoC platforms and/or CPU core design - ALU with an FPGA or Sim
System on a Chip and embedded MCU platforms
– Altera DE SoC (DE2-115) - Nios II Soft Core
– Xilinx Digilent - MicroBlaze Soft Core
– NVIDIA Jetson Nano, Xavier NX
– Texas Instruments - Launch Pads (TM4C123GXL LP, TM4C1294XL Connected LP)
Useful for real-time systems (CEC 450)
– Concepts such as WCET (pipeline performance) for RMA
– Resource view of Platforms for HAL or OS (CPU, I/O, Memory, Power)
– RTOS introduction (e.g. FreeRTOS, Zephyr, TI RTOS, VxWorks, ARM Univ., Micrium, etc.)
– RT Services
1. FPGA VHDL or SoC (CEC 330, CEC399 Special Topics)
2. Bare Metal CE / Main+ISR (CEC 320/322)
3. RTOS or IoT (URI, CEC399 Special Topics)
4. OS+RT extensions (CEC 450)
Self-Study Continuation after Micro, before CEC450 (Real-Time)
– MIPS with Simulation of the ALU that is Cycle Accurate or Approximate
– Hennessy and Patterson - MIPS Comp Org Book, 5th Ed., Cortex-A8, NVIDIA, ARM v7, v8, x86
– ARM MCU or SoC with ETM / KEIL CoreSight (IAR Tools, p. 262, Code Composer)
– Intel x86, x64 PMU with VTune (to see chip-level events in Windows or Linux)
Tools: QtSpim, MARS, QEMU, Quartus-II and ModelSim
URI to learn and work on Comp Org for ICARUS (or CEC330 DB, CEC330 PC)
– Between CEC 320 and CEC 450 with embedded FPGA, SoC, IoT, GP-GPU and RTOS/OS experience
– Participate in research as an option before/after industry internships (e.g. summer after 2nd year)
[Figure: IMSAI “Workstation” - Intel 8080, one 2 MHz core, 64KB, von Neumann Arch., $931 in 1979 assembled]
[Figure: 40 years later, Jetson Xavier NX - six 1.4 GHz ARM Cortex A cores, 8GB, 384 co-processors, $399 in 2019]

Recall ARM M & A Series
ARM M Series - MCU
[Figure: ARM Cortex-M4]
– TIVA TM4C123G (M4), NXP, Cypress, Silicon Labs
– The Cortex-M4 processor is developed to address digital signal control markets that demand an efficient, easy-to-use blend of control and signal processing capabilities.
ARM A Series - Adv. Mobile
[Figure: ARM Cortex-A15]
– Smart Phone
– Qualcomm, Broadcom, NVIDIA
– Harvard Split L1, Unified L2, L3, Multi-core
– The processor cluster has one to four cores. Each core has its own L1 instruction and data caches, together with a single shared L2 unified cache.

Recall ARM R Series
ARM R Series - Real-Time
[Figure: ARM Cortex-R52]
– Redundancy (no SPOFs) - Lock-step MISD
– Predictable / Deterministic response (TCM)
– Resilience - recovery and fail-safe
– ECC memory
– Flash memory with data protection
– Software sanity monitoring
– RT critical services
– Best-effort services
– The Cortex-R52 processor meets the rising performance needs of advanced real-time embedded systems.

Assignment #5
Final Assignment - Ex1 … Ex5
Explore ARM MCU Platforms (Do, Observe, Explain)
– Jetson TK1 - King 112 lab
– Raspberry Pi 3b+ (Broadcom) - borrow, remote login
Compare ARM MCU SoC Platforms (on paper)
– Jetson Nano - remote login
– DE2-115
Provides concrete examples to motivate CAC Ch. 4
Bridge to CEC 450, CS 415, Capstone

From MCUs to Platforms
1980’s - early 1990’s - Multi-chip, TTL logic, complex PCBs
– von Neumann (no split L1 cache), no pipeline, zero or low wait-state memory
– predictable - ASM clocks per instruction in x86 86/88, 186/188 User’s Manual - HW Ref
– Introduction of 32-bit MCUs
– 8-bit, 16-bit MCUs common (still widely used for deeply embedded today)
– E.g. 8051, 68HC11 used in robotics, automotive, etc. (RS-485, RS232, Token Ring, etc.)
– Today - Microchip/Atmel 8-bit, 16-bit AVR, TI for Scale-down (subsumption - CAN, I2C, SPI, BLE)
1990’s - MIPS, ARM, PowerPC, Alpha (3.3 V)
– Introduction of Pipelines and L1 cache (split cache Harvard architecture)
– 32-bit MCUs common, 64-bit for Workstations (e.g. DEC)
– Vector processing (SIMD) introduced - Altivec (PPC), MMX (Intel), (ARM NEON - 2009)
Early 2000’s - Super-pipelines (XSC 7/8 stage, ARM-11), Superscalar (AMD Opteron, Intel P6/Xeon x64), Dual-core (ARM, XSC)
Current Decade (2010’s) - Many Core, MICA, FPGA & GP-GPU SoC
– ARM Cortex M-Series (embedded), A-Series (mobile), R-Series (real-time)
– Many new ARM SoCs
Next - IoT (Scale down), Visual (Scale-up), Neuromorphic (Purpose built)
– Google TPU (Machine learning)
– NVIDIA GP-GPU (Visual processing and ML)
– Intel Neural Compute Stick (ML)
– ARM NXP, TI, ST-Micro, Cypress, Silicon Labs, etc. (IoT)

Scaling MCUs - 8, 16, to 32-bit
Early MCUs were not pipelined and had zero wait-state memory access (or single wait-state worst case)
– Today, this is Tightly Coupled Memory
– TCM can be emulated on a pipelined modern MCU with cache load and lock in L1
– Examples: Motorola 68000 (Mac, 16/32-bit), Intel 8088 (IBM PC, 16-bit with an 8-bit data bus)
– L1 split cache (Harvard) and unified L2/L3 minimize wait-state slow down today
[Figure: Motorola 68K, https://en.wikipedia.org/wiki/Motorola_68000]
Cady, Frederick M. Microcontrollers and Microcomputers: Principles of Software and Hardware Engineering. Oxford University Press, Inc., 2009.

Simplify, Speed-Up - RISC Pipelined MCUs
MIPS - R2000, R3000 (Late 1980’s - Early 1990’s)
– Harris Radiation Hardened RH3000 (NASA New Horizons)
– Mongoose-V (1993)
– AAS 2017 Presentation on Modern RH MCUs (Siewert)
– Competition in 1993 was 64-bit DEC Alpha, 32-bit PowerPC (Mac), 32-bit 80486/Pentium P5 (Wintel PC), 32-bit ARM7 (von Neumann arch.)
Other RISC MCUs - ARM, PowerPC
Board level solutions became System on Chip and MCU Solutions on Chip
– Lower part count, fewer issues with signal integrity, simpler, but sometimes more than you need
[Figure: R3000, https://en.wikipedia.org/wiki/R3000]

Current Scale-down, Scale-up MCUs
Scale-up (e.g. Cavium MIPS) - ARM A/R Series
– Many new 64-bit MCUs
  • MIPS 64 (Cavium Octeon, etc.)
  • ARM 64 A-Series
– Multi-core and Many-core MCUs
– Co-processor SoCs - FPGA and GP-GPU
Scale-down (e.g. Microchip/Atmel AVR) - ARM M Series
– Simple 32-bit IoT (BLE, 802.11, 5G) for predictive maintenance and consumer IoT (e.g. smart home)
– Continuation of 8-bit and 16-bit MCUs (Sensor networks, robotics)
– Subsumption architecture, Sensor networks

Cortex M-Series is Scale Down
[Figure: TIVA TM4C123G Dev Board]
TM4C123G Dev board uses the TM4C123GH6PGE MCU
Includes a number of demonstration devices
– I2C devices (e.g. MPU9150 Motion Tracker)
– GPIO LED, Switches, Pins (Multi-function)
– Analog inputs (Temp sensor)
– CAN bus interface
– 96x64 color OLED (Synch. Serial Interface)
– MicroSD (Synch. Serial)
TM4C123G has lower part count with MCU
Computers as Components 4e © 2016 Marilyn Wolf, Updated by SBS

Computing Platforms
Platform organization.
– MCU processor, peripherals (on-chip), peripherals (off-chip), on-chip memory, off-chip memory, on-chip/off-chip NAND flash, etc.
Busses.
– Local bus (AMBA) and I/O bus (e.g. PCIe)
Memory devices.
– Don’t confuse a Memory Controller (MCU) with a Microcontroller Unit (MCU)
– Overloaded acronym
– MMU - Memory Management Unit used for memory mapping and access control

Computing platform architecture
DMA provides direct memory access.
Timers used by OS, devices.
Multiple busses connect CPU, memory to devices.
DMA Request queue - Request
• Src starting address
• Dst starting address
• Length
• Interrupt on done
• Return request tag
DMA Completion queue - Completion
• Request tag
• Status
For TIVA TM4C123G we used Programmed MMIO
– Read, Write FIFO or MMIO Registers (e.g. 16x8 UART FIFO)
– ADC Channel Reads
– GPIO Reads and Writes
– I2C Bus Writes (Function Generator)
– Exception is Motion Tracker - Data Filled in and Completion indicated by Call-back

Platform software
Platform software provides core functions, utilities.
Low-level functions depend on architecture: TI interrupt vectors, etc.
CE Main+ISR - e.g. Texas Instruments PDL
RTOS - e.g. Wind River VxWorks Wind kernel, Zephyr micro-kernel, FreeRTOS
OS + Extensions - e.g. Embedded Linux with POSIX RT

Example 4 GB System Memory Map
Address range               Size          Use
0xFFF0_0000 - 0xFFFF_FFFF   1 Mbyte       Boot ROM device, Boot ROM (Flash) (reset vector address @ high address)
0x0500_0000 - 0xFFEF_FFFF   4015 Mbytes   unused
0x0400_0000 - 0x04FF_FFFF   16 Mbytes     Memory Mapped IO, MMIO (PCI BARs for Device Function Registers)
0x0200_0000 - 0x03FF_FFFF   32 Mbytes     unused (space left for memory upgrades)
up to 0x01FF_FFFF                         Main Working Memory for OS/Apps, Working Memory (e.g.
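
The address decoding implied by the example memory map can be sketched in a few lines of Python. This is only an illustration, not code from the lecture: the names `MEMORY_MAP`, `region_size_mbytes`, and `decode` are hypothetical, and the working memory region is assumed to start at 0x0000_0000 because the slide text is cut off before giving its lower bound.

```python
# Regions of the example 4 GB memory map, highest addresses first.
# ASSUMPTION: working memory is taken to start at 0x0000_0000; the slide
# text is truncated before its lower bound is stated.
MEMORY_MAP = [
    (0xFFF0_0000, 0xFFFF_FFFF, "Boot ROM (Flash)"),
    (0x0500_0000, 0xFFEF_FFFF, "unused"),
    (0x0400_0000, 0x04FF_FFFF, "MMIO"),
    (0x0200_0000, 0x03FF_FFFF, "unused (memory upgrades)"),
    (0x0000_0000, 0x01FF_FFFF, "Working Memory"),
]

MBYTE = 1 << 20  # 2**20 bytes


def region_size_mbytes(start, end):
    """Size in Mbytes of an inclusive address range [start, end]."""
    return (end - start + 1) // MBYTE


def decode(address):
    """Return the name of the region that contains a 32-bit address."""
    if not 0 <= address <= 0xFFFF_FFFF:
        raise ValueError("address is outside the 32-bit address space")
    for start, end, name in MEMORY_MAP:
        if start <= address <= end:
            return name
    raise LookupError("memory map does not cover this address")


if __name__ == "__main__":
    for start, end, name in MEMORY_MAP:
        size = region_size_mbytes(start, end)
        print(f"0x{start:08X} - 0x{end:08X}  {size:5d} Mbytes  {name}")
```

Checking each range against its stated size this way is a quick sanity test when laying out a linker script or MMU table: the five regions should add up to the full 4096 Mbytes with no gaps or overlaps.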

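The DMA request and completion queues listed on the computing platform architecture slide can be modeled in software to show how the request tag ties a completion back to its request. The sketch below is a simplified simulation under stated assumptions, not a driver for any real DMA controller: the class and field names (`DmaEngine`, `DmaRequest`, `DmaCompletion`, `submit`, `service_one`) are hypothetical, memory is a plain `bytearray`, and "interrupt on done" is modeled as a counter.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class DmaRequest:
    src: int                 # source starting address
    dst: int                 # destination starting address
    length: int              # number of bytes to copy
    interrupt_on_done: bool  # raise an interrupt when finished
    tag: int                 # request tag returned to the submitter


@dataclass
class DmaCompletion:
    tag: int     # tag of the request this completion belongs to
    status: str  # "OK" or "ERROR"


class DmaEngine:
    """Toy DMA engine: a request queue in, a completion queue out."""

    def __init__(self, memory):
        self.memory = memory
        self.requests = deque()
        self.completions = deque()
        self.interrupts = 0  # stands in for "interrupt on done"
        self._next_tag = 0

    def submit(self, src, dst, length, interrupt_on_done=True):
        """Queue a transfer and return its request tag."""
        tag = self._next_tag
        self._next_tag += 1
        self.requests.append(
            DmaRequest(src, dst, length, interrupt_on_done, tag))
        return tag

    def service_one(self):
        """Perform the oldest queued transfer and post its completion."""
        if not self.requests:
            return
        req = self.requests.popleft()
        size = len(self.memory)
        in_range = (req.length >= 0
                    and 0 <= req.src and req.src + req.length <= size
                    and 0 <= req.dst and req.dst + req.length <= size)
        if in_range:
            block = self.memory[req.src:req.src + req.length]
            self.memory[req.dst:req.dst + req.length] = block
            status = "OK"
        else:
            status = "ERROR"
        self.completions.append(DmaCompletion(req.tag, status))
        if req.interrupt_on_done:
            self.interrupts += 1


if __name__ == "__main__":
    mem = bytearray(16)
    mem[0:4] = b"ABCD"
    engine = DmaEngine(mem)
    tag = engine.submit(src=0, dst=8, length=4)
    engine.service_one()
    print(tag, engine.completions[0], bytes(mem[8:12]))
```

With programmed MMIO, as used on the TM4C123G labs, the CPU performs each read or write itself; in this model the CPU only submits a request and later matches the completion tag, which is the pattern the Motion Tracker call-back follows.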