Assembly Languages II Last Time Digital Signal Processor Apps

Assembly Languages II Last Time Digital Signal Processor Apps

Last Time General model of assembly language Assembly Languages II • Undifferentiated sequence of instructions • Arithmetic instructions (ADD, SUB) • Control-flow (JMP, CALL, RET) Four main types: CISC, RISC, DSP, and VLIW Prof. Stephen A. Edwards CISC • Few, special-purpose registers • Complex addressing modes with contributions from Prof. Brian Evans, • Powerful instruction (e.g., string move) Niranjan Damera-Venkata and RISC Magesh Valliappan, UT Austin • Many general-purpose registers • Few addressing modes • Arithmetic operations don’t touch memory • Simple instructions Copyright © 2001 Stephen A. Edwards All rights reserved Copyright © 2001 Stephen A. Edwards All rights reserved Digital Signal Processor Apps. Conventional DSP Architecture Low-cost embedded systems Harvard architecture • Modems, cellular telephones, disk drives, printers • Separate data memory/bus and program memory/bus • Three reads and one or two writes per instruction cycle High-throughput applications Deterministic interrupt service routine latency • Halftoning, base stations, 3-D sonar, tomography Multiply-accumulate in single instruction cycle PC based multimedia Special addressing modes supported in hardware • Compression/decompression of audio, graphics, video • Modulo addressing for circular buffers for FIR filters Embedded processor requirements • Bit-reversed addressing for fast Fourier transforms • Inexpensive with small area and volume Instructions to keep the pipeline (3-4 stages) full • Deterministic interrupt service routine latency • Zero-overhead looping (one pipeline flush to set up) • Low power: ~50 mW (TMS320C54x uses 0.36 µA/MIPS) • Delayed branches Copyright © 2001 Stephen A. Edwards All rights reserved Copyright © 2001 Stephen A. Edwards All rights reserved Conventional DSPs Conventional DSPs Fixed-Point Floating-Point ¡ Market share: 95% fixed-point, 5% floating-point Cost/Unit $5 - $79 $5 - $381 ¡ Architecture Accumulator load-store or Each processor comes in dozens of configurations memory-register • Data and program memory size Registers 2-4 data, 8 address 8-16 data, 8-16 address • Peripherals: A/D, D/A, serial, parallel ports, timers Data Words 16 or 24 bit 32 bit ¡ Chip Memory 2-64K data and program 8-64K data and program Drawbacks Address 16-128K data, 16M – 4Gdata, • No byte addressing (needed for image and video) Space 16-64K program 16M – 4G program • Limited on-chip memory Compilers Bad C Better C, C++ • Limited addressable memory on most fixed-point DSPs Examples TI TMS320C5x; TI TMS320C3x; • Non-standard C extensions to support fixed-point data Motorola 56000 Analog Devices SHARC Copyright © 2001 Stephen A. Edwards All rights reserved Copyright © 2001 Stephen A. Edwards All rights reserved 1 DSP Example 56001 Programmer’s Model ¡ Finite Impulse Response filter (FIR) 55 48 47 24 23 0 15 0 x1 x0 Source Program counter ¡ Can be used for lowpass, highpass, bandpass, etc. y1 y0 Registers Status Register ¡ a2 a1 a0 Loop Address Basic DSP operation Accumulators b2 b1 b0 Loop Count Pointer Offset Modifier ¡ 15 For each sample, computes 15 0 15 0 15 0 … -1 -1 -1 PC Stack z z z r7 n7 m7 0 k r6 n6 m6 y = Σ a x r5 n5 m5 15 n i n+i … i=0 r4 n4 m4 Address SR Stack r3 n3 m3 Registers 0 ¡ a0 … ak are filter coefficients r2 n2 m2 ¡ Stack Pointer x and y are the nth input and output sample r1 n1 m1 n n r0 n0 m0 Copyright © 2001 Stephen A. Edwards All rights reserved Copyright © 2001 Stephen A. Edwards All rights reserved 56001 Datapath 56001 Memory Spaces x1 ¡ Three memory regions, each 64K: x0 • 24-bit Program memory y0 • 24-bit X data memory y1 • 24-bit Y data memory multiplier Y bus X bus ¡ Idea: enable simultaneous access of program, shifter alu sample, and coefficient memory ¡ Three on-chip memory spaces can be used this way a2 a1 a0 ¡ b2 b1 b0 One off-chip memory pathway connected to all three memory spaces shifter/limiter ¡ Only one off-chip access per cycle maximum Copyright © 2001 Stephen A. Edwards All rights reserved Copyright © 2001 Stephen A. Edwards All rights reserved 56001 Address Generation FIR Filter in 56001 ¡ Addresses come from pointer register r0 … r7 n equ 20 Define symbolic constants ¡ Offset registers n0 … n7 can be added to pointer start equ $40 samples equ $0 ¡ Modifier registers cause the address to wrap around coefficients equ $0 Addresses of ¡ Zero modifier causes reverse-carry arithmetic input equ $ffe0 memory-mapped I/O Address Notation Next value of r0 output equ $ffe1 “Locate this in program r0 (r0) r0 memory at $40” org p:start r0 + n0 (r0+n0) r0 “Initialize pointers to r0 (r0)+ (r0 + 1) mod m0 move #samples, r0 samples and r0 – 1 –(r0) r0 – 1 mod m0 move #coefficients, r4 coefficients” r0 (r0)– (r0 – 1) mod m0 move #n-1, m0 r0 (r0)+n0 (r0 + n0) mod m0 move m0, m4 “Prepare to treat these as r0 (r0)-n0 (r0 – n0) mod m0 circular buffers of size n” Copyright © 2001 Stephen A. Edwards All rights reserved Copyright © 2001 Stephen A. Edwards All rights reserved 2 FIR Filter in 56001 FIR Filter in 56001 “Load a sample from an I/O movep y:input, x:(r0) device in Y data memory” movep y:input, x:(r0) “Load a sample from X memory clr a x:(r0)+, x0 y:(r4)+, y0 “Clear accumulator A” into x0, advance the pointer” “Repeat the next instruction n-1 times” clr a x:(r0)+, x0 y:(r4)+, y0 “Fetch next sample and coefficient” rep #n-1 “Load a coefficient from Y memory into y0, advance the pointer” mac x0,y0,a x:(r0)+, x0 y:(r4)+, y0 rep #n-1 “a = a + x0 * y0” mac x0,y0,a x:(r0)+, x0 y:(r4)+, y0 macr x0,y0,a (r0)- macr x0,y0,a (r0)- movep a, y:output movep a, y:output Copyright © 2001 Stephen A. Edwards All rights reserved Copyright © 2001 Stephen A. Edwards All rights reserved FIR Filter in 56001 TI TMS320C6000 VLIW DSP movep y:input, x:(r0) ¡ Eight instruction units dispatched by one very long instruction word clr a x:(r0)+, x0 y:(r4)+, y0 ¡ Designed for DSP applications ¡ Orthogonal instruction set rep #n-1 ¡ Big, uniform register file (16 32-bit registers) mac x0,y0,a x:(r0)+, x0 y:(r4)+, y0 ¡ Better compiler target than 56001 macr x0,y0,a (r0)- “Get ready for the ¡ Deeply pipelined (up to 15 levels) next sample” ¡ Complicated, but more regular, datapath movep a, y:output “a = a + x0 * y0 and round the result” “Write the filtered result to an I/O device in Y data memory” Copyright © 2001 Stephen A. Edwards All rights reserved Copyright © 2001 Stephen A. Edwards All rights reserved Pipelining on the C6 ’C6 Datapath ¡ One instruction issued per clock cycle A0 B0 … … ¡ Very deep pipeline • 4 fetch cycles A15 B15 • 2 decode cycles • 1-10 execute cycles ¡ Branch in pipeline disables interrupts ¡ Conditional instructions avoid branch-induced stalls .L1 .S1 .M1 .D1 .D2 .M2 .S2 .L2 ¡ No hardware to protect against hazards • Assembler or compiler’s responsibility D A A D Copyright © 2001 Stephen A. Edwards All rights reserved Copyright © 2001 Stephen A. Edwards All rights reserved 3 ’C6 Datapath FIR in ’C6 Assembly ¡ Two identical halves B0 “Load a halfword (16 bits)” … ¡ Each has “Do this on unit D1” FIRLOOP: • 16 32-bit registers B15 LDH .D1 *A1++, A2 ; Fetch next sample • Logical/Arithmetic (.L) || LDH .D2 *B1++, B2 ; Fetch next coefficient • Shifter/Branching (.S) || [B0] SUB .L2 B0, 1, B0 ; Decrement loop count • Multiplier (.M) || [B0] B .S2 FIRLOOP ; Branch if non-zero • Data/Memory (.D) .D2 .M2 .S2 .L2 || MPY .M1X A2, B2, A3 ; Sample * Coefficient || ADD .L1 A4, A3, A4 ; Accumulate result ¡ One cross path X: “Use the cross path” predicated instruction: “Execute only if B0 is non-zero” “Run all of these A D instructions in parallel” Copyright © 2001 Stephen A. Edwards All rights reserved Copyright © 2001 Stephen A. Edwards All rights reserved Peripherals Example: 56001 Port C ¡ Often the whole point of the system Nine pins each usable in one of two ways • Simple parallel I/O ¡ Memory-mapped I/O • Serial interface • Magical memory locations that make something happen or change on their own Parallel Serial PC0 RxD ¡ Typical meanings: PC1 TxD Serial Communication Interface (SCI) • Configuration (write) PC2 SCLK • Status (read) • Address/Data (access more peripheral state) PC3 SC0 PC4 SC1 PC5 SC2 PC6 SCK Synchronous Serial Interface (SSI) PC7 SRD PC8 STD Copyright © 2001 Stephen A. Edwards All rights reserved Copyright © 2001 Stephen A. Edwards All rights reserved Port C Registers for Parallel Port Port C Registers for Parallel Port ¡ ¡ Port C Control Register Port C Data Register • Selects mode (parallel or serial) of each pin • Returns input data or sets output state of parallel port X: $FFE1 X: $FFE5 0 = parallel I/O Read: pin state 1 = serial I/O Write: set output pin state ¡ Port C Data Direction Register • I/O direction when used in parallel mode X: $FFE3 0 = Input 1 = Output Copyright © 2001 Stephen A. Edwards All rights reserved Copyright © 2001 Stephen A. Edwards All rights reserved 4 Port C SCI Port C SCI Registers ¡ Three-pin interface X: $FFF0 ¡ 422 Kbit/s NRZ asynchronous interface (RS-232-like) Word select bits ¡ ¡ 3.375 Mbit/s synchronous serial mode SCI Control Shift direction Send break ¡ Register Multidrop mode for multiprocessor systems Wakeup mode select ¡ Two Wakeup modes Receiver wakeup enable Wired-OR mode select • Idle line Receiver Enable • Address bit Transmitter Enable ¡ Wired-OR mode Idle line interrupt enable ¡ Receive interrupt enable On-chip or external baud rate generator Transmit interrupt enable ¡ Four interrupt priority levels Timer interrupt enable Clock polarity Copyright © 2001 Stephen A. Edwards All rights reserved Copyright © 2001 Stephen A. Edwards All rights reserved Port C SCI Registers Port C SCI Registers X: $FFF1 X: $FFF2 Transmitter Empty ¡ ¡ SCI Status Transmitter Reg Empty SCI Clock Clock Divider Bits Register (read- Receive Data Full Control Register Clock Output Divider only) Idle Line Clock Prescaler Overrun Error Receive Clock Source Parity Error Transmit Clock Source Framing Error Received bit 8 Copyright © 2001 Stephen A.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    5 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us