Class 413 Rapidly Developing Embedded Systems Using Configurable Processors
Steven Knapp ([email protected])
Triscend Corporation www.triscend.com (Booth 160)
© Copyright 1998-99, Triscend Corporation. All rights reserved. Agenda
• What is a Configurable Processor? • Configurable Processors architectures • Building custom peripherals • Hardware/software trade-offs, algorithmic acceleration • Comparing the alternatives • Configurable Processor technical challenges – Communication – Software development environment – Debugging • Summary Terminology Used in this Session
• Configurable Processor – Embedded processor core – Programmable logic to build peripherals – Dedicated System Bus – On-chip memory • User-Definable Processor – Parameterized or changeable processor hardware description – Typically synthesizable from VHDL/Verilog – Usually targeted to ASIC or system-on-a-chip – Examples: ARC, Tensilica, etc. Trends Toward the Configurable Processor • Decreasing Time-to-Market – Fast iterations – Fast in-system, real-time debugging – Fast component availability • More Adaptability – During design and debug – In the field or in the application • Higher Performance – Match the architecture to the problem – Fast response to real-time events – Parallel operations • Increased Differentiation • Ever-improving Process Technology Toward a Configurable Processor
"System on a Configurable Chip" Processors
MCU ASIC Derivative Configurable FLASH Processor EPROM/
MCU Derivatives and Programmable Logic
RAM Discrete Solutions MCU Derivative EPROM/ RAM MCU TTL FLASH "Roll-Your-
CPLD/ Own" MCU in EPROM EPROM FPGA FPGA PAL Periph- TTL eral
Periph- Periph-
TTL eral eral Increasing Functionality, Density, and Performance Density, Increasing Functionality, 1980 1990 2000 Year What Is a Configurable Processor?
• Industry-standard Programmable I/O Programmable I/O
processor Dedicated RAM • Dedicated bus Processor • Programmable Dedicated Programmable Peripherals Logic Logic Bus System Soft User Programmable I/O Programmable – Soft peripherals I/O Programmable Hardware peripherals logic Breakpoint – User-defined functions Programmable I/O – Hardware acceleration • On-Chip Memory Configurable Processors Vendor/ Dedicated Programmable Embedded Bus Family Processor Status Resources Resources Structure 2-channel DMA Triscend Triscend/ 8032 8K-64K bytes RAM 8-bit Data Sampling Coarse-grained, E5 "Turbo" Hardware debug 32-bit Address bus oriented JTAG Triscend ARM In 32-bit Data Triscend Coarse-grained, 7TDMI Development 32-bit Address bus oriented 2-channel DMA Multiple busses Motorola/ 3K bytes RAM Motorola MPA ColdFire Cancelled Unknown CORE+ DRAM controller Fine-grained format Hardware debug 16K RAM National/ Compact- In 8x256 RAM Concurrent NAPA RISC Development Timer Fine-grained 1000 JTAG debugger Plans Gatefield Siemens TriCore Announced Fine-grained Plans Atmel AT40K Atmel ? Announced Coarse-grained None, SIDSA/ Programmable SIDSA 8031 Sampling? Memory- FIPSOC analog Coarse-grained mapped Motorola CORE+
Clock DRAM Controller JTAG Chip System Selects 8Kbyte Bus Interrupt Instruction Controller Cache Controller External 8Kbyte Bus SRAM Interface Shared Pins Debug Module FPGA I/O Motorola M H/W ColdFire A Divide CPU Core C
DMA Master Bus Fine- Controller Interface Grained 1Kbyte FPGA Parallel Embedded Array Port RAM
Timers Slave Bus DUART Interface External 2 3Kbyte I C Embedded Bus Controller RAM Interface National NAPA
ToggleBus Transceiver System Port National Reconfigurable Adaptive CompactRISC Pipeline Logic CPU Controller Processor
Bus Interface Pipeline (fine-grain Unit Memory Array FPGA
architecture) I/O Configurable External Memory Peripheral Scratchpad Interface Devices Memory Array Triscend E5 Configurable Processor
Memory Selector PIO Power Interface Control Unit Selector PIO Configurable PIO System Logic Clock and Selector (CSL) PIO Crystal 8032 Matrix Oscillator PIO "Turbo" Control Micro- controller PIO Power-On Reset CSI Socket Data Bus Address Mappers Byte-wide Bus Address Bus System Arbiter RAM Two- channel DMA Hardware Controller Breakpoint Unit
JTAG Interface
Configurable System Interconnect (CSI) bus Configurable Processor Applications
• Custom Peripheral Set – Practically any digital function – Matched specifically to the application – Derivative on demand • Hardware Acceleration – Algorithms in hardware – Handling odd-size math – Faster real-time response – Multiple operations in parallel – Bit manipulation Custom Peripherals (Hardware/Software Trade-Offs) • Software Solution (µs to ms) – Slow peripherals (serial ports, etc.) – Limited by CPU performance (Scenix, Teragen) – Easy to modify – Cheap, re-use existing silicon • Hardware Solution (ns to µs) – Standard derivative (no differentiation) – CPU + ASIC/FPGA (difficult to modify) – Configurable Processor (easy to modify) – Additional silicon/cost Example: SPI Interface
• Find a processor derivative that matches your requirements – It has SPI, but does it have everything else you need? – Availability? Software support? • Implement your peripheral in software (ex. Scenix) • Build your peripheral in an external ASIC or FPGA • Use a SPI soft peripheral in your configurable processor Design Techniques
Peripherals in Software Peripherals in Hardware • ‘C’ language • Schematic capture • Assembly • VHDL/Verilog entry • Instruction-set • Digital logic simulator simulator • Soft macros available • Function library
Software Hardware Hardware Acceleration Example
• Calculate the instantaneous average of four 8-bit values ( + + + DCBA ) Z = 4
• Issues – Concurrency (I/O, processing requirements) – Handling overflow (accumulator width) – Performance (processing time) Two Solutions
• Processor Solution • Logic Solution
Move PortA to Accumulator A Add PortB to Accumulator
B Add PortC to Accumulator ÷4 Z
Add PortD to Accumulator C
Shift Accumulator Right Twice (divide by four) D
Move Accumulator to RegZ More instances require More instances require additional time additional logic Comparing the Alternatives
Device Development Solution Issues When to Use It Cost Time/Cost Availability, software Processor Quick/ Lowest cost, if your $1 - $15 support, Derivative Low application fits differentiation Acquiring cores, System-on-a- $5 - $50 Long/ Volume, complexity, + development verification, NRE, Chip High performance justify it. cost vendor selection Fast Moderate/ Creating ‘soft’ If it fits and it’s fast $5 - $50 Processor Low peripherals enough, use it! Multi-chip solution, For applications where inter-chip CPU + Moderate/ a configurable $10 - $100 communication, ASIC/FPGA Moderate processor does not yet debugging support, exist. multiple CAE tools Fast time to market, Configurable Moderate/ $8 - $80 New technology complete embedded Processor Moderate system Configurable Processor Technical Challenges • Communication between the processor and programmable logic functions • Maintaining a standard development flow • Debugging a system with both processor and programmable logic Communication between the Processor and Programmable Logic • Distributing the data and address bus to the programmable logic • Decoding/controlling bus transactions • Multi-master bus support • Register intimacy • Debugging support One Solution: CSI Bus Socket (Configurable System Interconnect) • Distributes address and Side-band Signals
CSI Socket Interface data to CSL matrix – Full access to data and Data Write 8032 address bus "Turbo" Data Read Microcontroller – Dedicated address Address decoding – Predictable, synchronous timing ?Selectors 2-Channel Bus Clock – Forward compatible with DMA Controller DMA Request/ Acknowledge (CSL) Matrix future configurable
Wait-State processors Control
Configurable System Logic – Wait-state and Breakpoint Hardware Configurable Interconnect (CSI) Bus System Control breakpoint control Breakpoint Unit – Contention-free bussing Decoding Bus Transactions CSI Selector Style CSI Bus Address A2 A1 A0
An Fast address decoding • Any address range Match0 READ RDSEL Match1 • Access type Code
WRITE WRSEL Data
BCLK Exported Special Function Registers (SFR) • Three modes Selector Bus Clock Chip Select RdSel DMA Control Register Data Read [7:0] DATA • Up to 200 selectors in a single device Decode delay is constant (less than 5 ns after clock) Connecting the Two Development Worlds
Hardware Development Software Development • Design and “soft” • Compiler/assembler module libraries support • Passing register • Function libraries addresses to • Instruction-set compiler/assembler simulator • Vendor place and • System-wide in- route software system debugging • Device programming support support • System-wide in- system debugging support Preserving Existing Tool Flow
Header Configurable Processor Tools File MCU Development Tools 1 2 System Program Configuration Development
Soft Module Object Source Library File Code 4 3 Library System Test Program and Debug Debug
5 Device Designer’s Programming standard tool flow System on a Chip Flow
Configurable Processor MCU Development Tools Development Software 3rd Party EDA Tools Program System Development Configuration Capture or Synthesis Instruction Source Code Simulation Library System Test Netlist and Debug Functional In-System Simulation Debug Device Designer’s Programming standard tool flow Example System: FastChip Software “Soft” Module Library
Dedicated Resources
“Soft” peripherals dragged into CSL matrix
Resources Used Indicators Real-Time, In-System Debugging • Difficult in most ASIC or system-on-a-chip designs – Must rely on simulation before completing design • Most debuggers only support the processor – Monitor bus activity – Monitor processor registers – Break on event and single-step • Additional debugging desired for programmable logic functions – Monitor the state of logic and flip-flops in “soft” peripherals – Monitor or force a breakpoint from programmable logic Configurable Processor Debugging Capabilities
Access to all Commands from 3rd address mapped party debuggers 8032 RAM Test and and other key Control translated to JTAG processor resources instructions CSI Bus
All sequential and Configurable combinatorial logic System Logic Breakpoint unit nodes have snoops the internal complete bus, providing Triscend E5 observability complex runtime control features Summary
• Configurable Processors offer benefits in embedded system design: – Faster time to market than ASIC/SOC designs – Higher performance compared to most processors – Higher product differentiation • Configurable processors are a new class of single-chip programmable devices designed for embedded systems applications – Industry-standard processor – Dedicated, high-performance internal bus – Programmable logic, connected to internal bus – On-chip, high-density memory