ECEN 5623 RT Embedded Systems

CEC 450 Real-Time Systems
Lecture 9 – Device Interfaces and I/O
October 20, 2020
Sam Siewert

View to End of Semester
• Exam 1
  – Ave 77.74, High 98
  – Exam #1 Solutions - Go over in class (Canvas T/F, Multiple Choice)
  – Code Q&A
  – Grades Posted on Canvas
  – Discuss, Q&A
• Assignment #4
  – Last Practice Oriented Exercise
• Assignment #5
  – Propose In-Depth Creative RT Project, or Outline Requirements for Standard Project (and team if you wish)
  – Or Standard External Clock / Metronome Machine Vision Project
• Exam #2 - Extension to I/O, Memory and Mission Critical RT
  – Can do on Canvas rather than in class
  – Will be take home, open notes, book, etc., with some simple coding
• Assignment #6
  – Complete Project
• Final Oral Exam
  – Present Project

LUB vs. N&S (CT/SP) vs. Safe Criteria
• LUB - always safe, but fails some workable service sets
• N&S (CT/SP) - exact, but could have low to zero margin (harmonic), and therefore, potentially unsafe
• Safe Criteria - Must pass N&S test AND have say 10% margin
  – Margin criteria s.t. margin < LUB required for m services
  – E.g. margin < 30%, but more than 0
  – Compromise that tolerates some WCET and T error, jitter, drift

Test            Has margin?                    Is feasible?
LUB             yes, f(m)                      some false failures
N&S CT/SP       some zero margin (harmonic)    exact
Safe Criteria   yes, fixed                     fails if insufficient margin

Bus I/O
Interfacing to Off-Chip Devices

Recall: Conceptual View of HW/SW Interface
• Three-Space View of Utilization Requirements
  – CPU Margin?
  – IO Latency (and Bandwidth) Margin?
  – Memory Capacity (and Latency) Margin?
• Upper Right Front Corner – Low-Margin
• Origin – High-Margin
[Figure: three-axis utilization space with axes CPU-Util, IO-Util, Memory-Util]

VITA VME/VXS/VXI vs. PCI / PCIe

VITA VME (VESA Module Expansion - History)    PCI 2.1, 2.2, 2.3 (Peripheral Component Interconnect)
Asynchronous 20 MHz                           Synch Clock 33/66 MHz
A32, A24, A16 Addr Bus                        Muxed 32/64 bit A/D Bus
D32, D24, D16 Data Bus
Word or Block Transfer                        Burst Transfer Always
Daisy-Chained Prio Interrupts                 Int A-D Routed to APIC, MSI added
Interrupt Data Cycle                          Map onto IRQ 0…15
                                              Built-in Hidden Arbiter
Device Designed in MMIO                       Plug ‘n’ Play Configuration Space
Custom Bus Integration on 6U                  PCI-to-PCI Bridge Scalability
3U/6U D-shell form factor                     CPCI, PMC, PC/104+, PCI-X
VME, VXS Bus, VXI Bus                         PCI-Express

Card / Backplane I/O Expansion
• Scalable Embedded Systems
• DoD, Commercial Aviation, etc.

PCI Revisions Compared

Bus                        Frequency   Potential Bandwidth   Number of Devices
PCI 2.x 32-bit             33 MHz      133 Mbytes/sec        4-5
PCI 2.x 32-bit             66 MHz      266 Mbytes/sec        1-2
PCI-X 1.0a                 133 MHz     533 Mbytes/sec        1-2
PCI-X 2.0                  266 MHz     1066 Mbytes/sec       1 Point-to-Point Bus
PCI-E x8 bi-directional    2.5 GHz     4 GBytes/sec          Switched Scalable Differential Serial Byte Lanes

Original PCI System - Intel Reference
Key concepts:
1) NB - Memory Controller, CPU core(s) and cache, integrated graphics
2) SB - I/O Controller interfaced to Memory Controller (MMIO) with low rate programmed I/O devices and high-rate DMA
Evolved into PCIe gen3/4
• 8/16 G-transfers/sec
• Differential serial byte lanes
FSB - Front side bus evolved into QPI (Quick Path Interconnect)
[Figure: block diagram with CPU, FSB, North Bridge, Graphics Adapter (AGP), SDRAM/DDR, PCI 2.x Bus, South Bridge, IDE, Ethernet, Expansion Slots, ISA Bus, Super IO (COM-A, COM-B), Audio]

x86 and ARM SoCs - with PCIe Bus
• Key Distinctions between MCUs and an SoC
  – Both are Single Chip Solutions
  – SoC has more processing, memory, and scalable I/O
• x86 PC System Architecture for Memory and I/O
• Bus Interfaces to Peripherals
• Multi-Core CPU
• Memory Controller (Local Bus)
  – On-chip Memory (e.g. SRAM)
  – Off-chip Memory Expansion (e.g. DRAM)
  – On-chip and Off-chip Persistent Memory (NAND, NOR Flash)
• I/O Bus (e.g. PCIe - Express)
  – Expansion I/O Bus (PCIe)
  – On-chip I/O Bus
[Figures: NVIDIA Tegra K1, Quad Cortex-A15; Intel Altera Cyclone V HPS, Cyclone SoC, Dual Core ARM Cortex-A9]

Tegra K1 SoC Detailed Block Diagram
[Figure: Tegra K1 SoC block diagram]

PCI Key Concepts
• PCI SIG Industry Consortium for Standard
  – PCI 2.1, 2.2, 2.3
  – PCI-X 1.0a/b, PCI-X 2.0
  – PCI-Express
• North Bridge: CPU, Memory, AGP, PCI Bus
• South Bridge: PCI Bus, APIC, ISA Bus
  – Legacy I/F for x86 IRQs and SuperIO ISA Chipsets
  – Not Required, but PCI NB/SB Often on a Single Chip
• PCI-to-PCI Bus Bridges for Scalability
  – Type 0 Configuration Transaction for Bus 0
  – Type 1 Configuration Transaction for Bridged Bus

PCI Key Concepts
• Plug ‘n’ Play (Resource Config Space)
  – OS and/or BIOS can probe configuration space registers at well known port address
• For Each Bus (256), Probe to Find all Devices (32)
  – Vendor ID and Device ID
  – Setup Each Device Function (8)
  – Setup Interrupts A-D for Each Function
• Program Command Register for MMIO, IO, and Mastering
• Program BAR 0-5 for MMIO or IO
• Program Int A-D if Applicable
• Hidden Arbitration
  – Req/Gnt During Master to Target Bursts
  – Master Latency Timer is Minimum Burst

PCI Form Factors
• PC Edge Connected 32 bit and 64 bit slots
• CPCI Backplane Pin Connectors
• PC/104+ PCI and ISA Stackable
• PMC PCI Mezzanine Cards
• PCI-Express Scalable Differential Serial
  – PCI Compatible Message Transport Protocol
  – x1 to x32 PCI Express Byte Lanes (8 to 256 bits)
  – Root Complex (Replaces North Bridge)
  – Byte Lanes are 2.5 Gb/s/direction Differential Serial 8b/10b Encoded
  – Switched Architecture
  – Slots, Embedded, Cables
• MiniPCI

PCI-Express
• Design Goals:
  – Highest Bandwidth / pin (2.5 Gb/sec/direction)
    PCI-Express: [(2.5 Gb/s/dir × 8b/dir) × (1B/8b)] / 40 pins = 100 MB/s/pin
    PCI: [(32b × 33 MHz) × (1B/8b)] / 84 pins = 1.58 MB/s/pin
    PCI-X 2.0 (DDR): [(64b × 266 MHz) × (1B/8b)] / 150 pins = 7.09 MB/s/pin
  – De-couple Physical Signaling from Protocol
  – Switched Architecture
  – Message Protocol with Minimized Side-Band Signals
    MSI and MSI-X Message-Based Interrupts
    Data Transport Management
    Power Management Side-Band Signals
  – Compatible with PCI-2.x, PCI-X 1.0a/b, PCI-X 2.0
    PCI buses Bridged with PCI-Express Switches on I/O Bridge
• North Bridge Becomes Root Complex
  – Memory Bridge
  – I/O Bridge

PCI-Express
• Each Tx/Rx Differential Pair (HSOp,n and HSIp,n) forms a Byte Lane
  – Byte Lanes Ganged x1, x2, x4, x8, x12, x16, x32
  – Serializer/Deserializer on Each Byte Lane
  – Driver, Buffering, and PLL on Each Byte Lane
  – Lane-to-Lane Deskewing Done in Phy
• Data Tx/Rx with Packet Protocol
[Figure: packet layering with PCI PnP Driver on top; transaction layer adds HDR; link layer adds Seq# and CRC; physical layer adds Frame at both ends]

I/O Trends
• High Speed Differential Serial Overtaking Parallel Buses?
  – USB 2.x, 3.x
  – PCI-Express
  – 1/10G Ethernet (1G Cat-5 UTP Copper, 10G Fiber)
  – SAS
• Parallel Buses Hit Signal Integrity Limits (Skew, Crosstalk)
  – DDR (266, 400 MHz)
  – PCI-X 2.0 – Quad-Rate

Single Board Computer SoCs
• SBC = Single Board Computer (Instead of Backplane)
• For RT Systems, 2 Boards are Used for High Rate I/O (with Co-Processing)
  – Jetson TK-1 – Multi-Core CPU + GPU Co-Processor
  – DE1-SoC – Multi-Core CPU + FPGA Co-Processor
• For Low Rate, Texas Instruments Tiva TM4C is also an Option
• SBCs are Less Scalable than a CPCI or VXS/VXI Backplane, But SoC Packs Multiple Cores and I/O onto a Single Chip!
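
The bandwidth-per-pin figures quoted under the PCI-Express design goals can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, the pin counts (84, 150, 40) are the ones quoted on the slide, and the PCI-Express case assumes an x8 gen1 link counted in both directions after 8b/10b coding, which is the reading that matches both the 4 GBytes/sec table entry and the 100 MB/s/pin result.

```python
def parallel_bus_mb_per_s(width_bits, clock_mhz):
    """Peak rate of a parallel bus in MB/s, assuming one transfer per clock."""
    return width_bits * clock_mhz / 8.0


def pcie_gen1_mb_per_s(lanes, both_directions=True):
    """Peak data rate of a PCIe gen1 link in MB/s after 8b/10b coding."""
    raw_gbps_per_lane = 2.5                            # line rate per lane, per direction
    data_gbps_per_lane = raw_gbps_per_lane * 8 / 10    # 8 data bits per 10 line bits
    mb_per_s = data_gbps_per_lane * 1000 / 8 * lanes   # Gb/s -> MB/s, times lane count
    return 2 * mb_per_s if both_directions else mb_per_s


def per_pin(mb_per_s, pins):
    """Spread a link's peak rate over its pin count."""
    return mb_per_s / pins


if __name__ == "__main__":
    rows = [
        ("PCI 32-bit / 33 MHz", parallel_bus_mb_per_s(32, 33), 84),
        ("PCI-X 2.0 as written (64b x 266 MHz)", parallel_bus_mb_per_s(64, 266), 150),
        ("PCI-Express x8, both directions", pcie_gen1_mb_per_s(8), 40),
    ]
    for name, rate, pins in rows:
        print(f"{name}: {rate:.0f} MB/s over {pins} pins "
              f"= {per_pin(rate, pins):.2f} MB/s/pin")
```

Evaluating the PCI-X 2.0 expression exactly as written gives roughly twice the quoted 7.09 MB/s/pin. The quoted figure corresponds to about 1064 MB/s over 150 pins, which is close to the 1066 Mbytes/sec entry in the revision table, so one operand in that expression appears to be off by a factor of two.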
Embedded MCUs – Texas Instruments TIVA
• ARM Cortex M4 Microprocessor, IAR or Code Composer IDE, Cyclic Executive or FreeRTOS
• ARM M-Series SBC: TIVA TM4C123G DEV BOARD (used in lab), TM4C123G MCU (used in lab), ARM Cortex M4 (used in lab), TM4C123GXL Launch Pad
• ARM M-Series IoT Boards: IoT Enabled TM4C129X Connected Development Kit, TM4C1294XL Connected Launch Pad

Embedded GP-GPU SoCs - Jetson
• Older Jetson TK1 CPU+GPU
  – NVIDIA "4-Plus-1" 2.32GHz ARM quad-core Cortex-A15
  – NVIDIA Kepler "GK20a" GPU with 192 SM3.2 CUDA cores (up to 326 GFLOPS)
• Newer Jetson Nano
  – Competitive with R-Pi, TI OMAP, etc. in terms of price, no fan, etc.
  – Same Tegra K1 SoC
  – Much more compact
  – Good for student projects involving machine vision, AI
https://developer.nvidia.com/embedded/jetson-nano-developer-kit
[Figure: Tegra K1]

Embedded FPGA SoC Devices – DE1-SoC
• Reconfigurable SoC with FPGA Co-processing
• Dual-Core ARM Cortex A9, Linux or FreeRTOS

Comparison of FPGA and GP-GPU SoC
• Both use PCIe to interface to co-processing, both < 12 Watts peak, 2-5 Watts idle
• FPGA
  – FPGA is power efficient, with scaling limited by gates (LE logic elements, or LUTs Look-Up Tables)
  – Power consumption is more constant
• GP-GPU
  – GP-GPU scales with number of synergistic processors and gridding of workload
  – Power consumption is efficient, but scales up to saturation of 4 GP cores + 192 SP cores
• Full comparison here - for image proc
[Figures: Tegra K1; DE1-SoC FPGA]

Processor Trends
• Yesterday’s Board, Today’s Chipset, Tomorrow’s ASIC
• System on a Chip - SoC
  – Core(s) + IO (PowerPC 8xx, 82xx)
  – Reconfigurable (Virtex II)
  – Configurable (Tensilica)
  – IP Modules (CPU Cores, Memory Controller, Local Bus)
• Offloading to HW (Today’s SW is Tomorrow’s HW)
• Multi-Core SoCs
  – Cache Coherency – e.g.
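
Looking back at the Plug ‘n’ Play enumeration under PCI Key Concepts (256 buses, 32 devices per bus, 8 functions per device), the probe loop can be sketched as follows. This is a minimal illustration rather than a driver. It assumes the legacy x86 port-based access method, where an address is written to port 0xCF8 and the selected register is read through port 0xCFC. The names `config_address`, `enumerate_functions`, and `read_config_dword` are hypothetical, the last being a caller-supplied function that stands in for the privileged port I/O, and the device ID in the demo is made up.

```python
CONFIG_ADDRESS_PORT = 0xCF8  # x86 I/O port: write the target address here
CONFIG_DATA_PORT = 0xCFC     # x86 I/O port: read/write the selected register here


def config_address(bus, device, function, register):
    """Encode bus/device/function/register into a configuration address word."""
    if not (0 <= bus < 256 and 0 <= device < 32
            and 0 <= function < 8 and 0 <= register < 256):
        raise ValueError("bus/device/function/register out of range")
    # bit 31 = enable, 23..16 = bus, 15..11 = device, 10..8 = function,
    # 7..2 = dword-aligned register offset
    return 0x80000000 | bus << 16 | device << 11 | function << 8 | (register & 0xFC)


def enumerate_functions(read_config_dword):
    """Yield (bus, device, function, vendor_id, device_id) for each responder.

    read_config_dword(address) is a hypothetical callback that performs the
    actual port access and returns the 32-bit register value.
    """
    for bus in range(256):
        for device in range(32):
            for function in range(8):
                value = read_config_dword(config_address(bus, device, function, 0x00))
                vendor_id = value & 0xFFFF
                if vendor_id == 0xFFFF:  # all ones: nothing answered at this slot
                    continue
                yield bus, device, function, vendor_id, value >> 16


if __name__ == "__main__":
    # Simulated system: one function at bus 0, device 3, function 0.
    # 0x8086 is Intel's vendor ID; the device ID 0x1234 is made up.
    fake_bus = {config_address(0, 3, 0, 0x00): 0x12348086}
    for found in enumerate_functions(lambda addr: fake_bus.get(addr, 0xFFFFFFFF)):
        print("bus %d dev %d fn %d vendor 0x%04X device 0x%04X" % found)
```

A production probe would normally read function 0 first and check the multi-function bit in the header type before scanning functions 1 to 7, and would then go on to program the command register, BARs, and interrupt routing as the slide lists. The brute-force loop here only shows the address arithmetic.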
