IN3160 FPGA circuit technologies and configuration Outline
• Circuit technologies • FPGA configuration • Case study: Mars Rover
IN3160 Circuit technologies and FPGA configuration 2 The difference between a microprocessor and programmable logic
• A micro processor is programmed with instructions (which are stored in RAM/ROM) • A programmable logic circuit is programmed by a circuit description • A programmable circuit consist of configurable blocks of logic and configurable connections between these block
IN3160 Circuit technologies and FPGA configuration 3 FPGA
IN3160 Circuit technologies and FPGA configuration 4 Circuit technologies
E2PROM / Feature SRAM Antifuse FLASH
One or more One or more Technology node State-of-the-art generations behind generations behind Yes Yes (in-system Reprogrammable No (in system) or offline) Reprogramming 3x slower speed (inc. Fast ---- than SRAM erasing) Volatile (must No be programmed Yes No (but can be if required) on power-up) Requires external Yes No No configuration file Good for Yes Yes No prototyping (very good) (reasonable)
Instant-on No Yes Yes
Acceptable IP Security (especially when using Very Good Very Good bitstream encryption) Size of Large Medium-small Very small configuration cell (six transistors) (two transistors) Power Medium Low Medium consumption
Rad Hard No Yes Not really
IN3160 Circuit technologies and FPGA configuration 5 SRAM-based FPGAs
• Working principle: – SRAM memory inside the FPGA saves the configuration • Advantages: – Can be programmed an unlimited number of times – Space for a lot of logic – Can easily change the functionality of the system – Does not need a special programmer or process • Disadvantages: – Takes a lot of space (SRAM-cell with 5 transistors) – Volatile memory (configuration has to be saved externally) – Relatively high power usage • We use SRAM-based FPGAs from Xilinx, which has SRAM-based FPGAs with the Artix, Kintex, Virtex and Zynq families. Intel (Altera) has similar FPGAs in their FPGA families.
IN3160 Circuit technologies and FPGA configuration 6 Antifuse
Logic 1 Unprogrammed antifuses a Pull-up resistors
NOT & y = 1 (N/A) b AND Logic 1 Programmed antifuses NOT a Pull-up resistors
NOT & y = !a & b b AND
NOT
IN3160 Circuit technologies and FPGA configuration 7 Antifuse Amorphous silicon column Polysilicon via
Metal Oxide Metal Substrate
(a) Before programming (b) After programming Logic 1 Programmed antifuses
a Pull-up resistors
NOT & y = !a & b b AND
NOT
IN3160 Circuit technologies and FPGA configuration 8 Antifuse
• Working principle: – Configuration is saved in the FPGA by making shorts using high voltage • Advantages – Low impedance when fuse is “on” (small delay) – Low power usage – Compact technology (low space requirements) – Extra robust technology (high radiation resistance) • Disadvantages – Has to be programmed with dedicated programmer – High programming voltage and power – Permanent programming (one-time programming)
IN3160 Circuit technologies and FPGA configuration 9 Logic block complexity in FPGA
• Fine grained: – The blocks can be fully Programmable utilized in the design, interconnect
but requires large Programmable routing resources logic blocks • Coarse grained: – A block can implement any arbitrary function (LUT), but the resources can often not be fully exploited
IN3160 Circuit technologies and FPGA configuration 10 Coarse grained logic block
• The complexity of the coarse grained logic block increases with technological progress • Example of a traditional coarse grained logic block: – Four 4-input LUT for combinatorial logic – Four multiplexers – 4 D-flip flops – Carry logic for efficient arithmetic (+ and -)
IN3160 Circuit technologies and FPGA configuration 11 Example implementation in coarse grained logic block • Implementation of the function y = (a AND b) OR c
MUX-based AND a & OR b | y c y = (a & b) | c LUT-based Required function Truth table 0 0 MUX AND a b c y b 1 0 MUX a OR 0 0 0 0 a y & 1 b 0 0 1 1 1 0 MUX | y c 0 1 0 0 x 1 0 1 1 1 0 y = (a & b) | c 1 0 0 0 MUX 1 0 1 1 0 0 1 1 0 1 1 1 1 1 1 1 c
IN3160 Circuit technologies and FPGA configuration 12 Typical LUT implementation
Transmission gate 11 (active low) 0 0 Transmission gate (active high) 11
SRAM 11 y cells 11
00
11
11
c b a
IN3160 Circuit technologies and FPGA configuration 13 Typical LUT implementation cont.
Transmission gate 11 (active low) 0 0 Transmission gate (active high) 11
SRAM 11 y=1 cells 11
00
11
11
c=0 b=0 a=1
IN3160 Circuit technologies and FPGA configuration 14 Different uses of LUTs
16-bit SR
16 x 1 RAM
4-input LUT
IN3160 Circuit technologies and FPGA configuration 15 FPGA LUTs
Normal FPGA LUT in Shift LUT Register Mode
A
A D
d
d 0
a
r B t
e 1 a
s
C s 0 1 1 0 1 0 1 F 1 C
l F(A,B,C) o 0 0 c k Address D 1
a
t
a
C 0
l
o
c 1 k 0 A B C (Shift Register Length)
IN3160 Circuit technologies and FPGA configuration 16 Xilinx terminology
Configurable logic block (CLB) Slice Slice
CLB CLB Logic cell Logic cell
Logic cell Logic cell
Slice Slice 16-bit SR Logic CLB CLB Logic cell Logic cell 16x1 RAM
Logic cell Logic cell a 4-input Cell LUT b y c mux d flip-flop q e clock clock enable set/reset
IN3160 Circuit technologies and FPGA configuration 17 Additional functions in modern FPGAs
• Clock management and distribution • Carry chains for fast arithmetic (+ and -) • RAM blocks (in addition to distributed RAM with LUTs or through external memory interfaces) • Function blocks (multipliers, DSP functions like multiply & accumulate, Ethernet MAC, PCI-E, etc.) • Processor cores (ARM, RISC-V, MicroBlaze or other types) • High speed serial transceivers
• This is in addition to LUTs and registers which should be utilized as efficiently as possible!
IN3160 Circuit technologies and FPGA configuration 18 RAM blocks (Block RAM / BRAM)
Columns of embedded RAM blocks Arrays of programmable logic blocks
IN3160 Circuit technologies and FPGA configuration 19 Function blocks
RAM blocks
Multipliers
Logic blocks
IN3160 Circuit technologies and FPGA configuration 20 Multiply-and-accumulate (MAC)
Multiplier Adder Accumulator
A[n:0]
xx
B[n:0] ++ Y[(2n - 1):0]
MAC
IN3160 Circuit technologies and FPGA configuration 21 DSPs (DSP48E1)
• Contains functionality for different arithmetic functions: – 25*18 two’s-complement multiplier – 48-bit accumulator – 25-bit pre-adder – Dual 24-bit add/ subtract/accumulate – Pattern detector – Pipelining and cascading buses
IN3160 Circuit technologies and FPGA configuration 22 Processor cores
• Processor cores can be integrated with the FPGA logic • Many designs require a processor, and adding it to the FPGA can remove the need for an external CPU • Two different types – Soft core • Programmable logic is used in the FPGA to implement a simple microprocessor that interacts directly with the rest of the FPGA • Examples are Xilinx MicroBlaze, Intel (Altera) Nios II and RISC-V – Hard core • The processor is implemented on the silicon with the FPGA chip at production • High speed interfaces are used between the CPU and the FPGA fabric • Xilinx Zynq 7000 family have a dual core ARM Cortex-A9 (up to 1.0 GHz) • Xilinx Zync Ultrascale+ family have a quad core ARM Cortex-A53 (up to 1.5GHz), dual core ARM Cortex-R5 (up to 600MHz) and a Mali-400 MP2 graphic processor
IN3160 Circuit technologies and FPGA configuration 23 Clock generation and distribution
• Global clock trees distribute clock signals to synchronous elements across the device – Ensures that the edges of the clock signal arrives at almost the same time across the device (setup/hold time) • A global clock is often generated externally, and an internal clock manager generates a number of derived clocks
Clock signal from outside world Daughter clocks Clock used to drive internal clock trees Manager or output pins etc.
Special clock pin and pad
IN3160 Circuit technologies and FPGA configuration 24 Configuration of the FPGA
Programmable interconnect
Programmable logic blocks
Configuration bit string
0100010111010110001010010011101
IN3160 Circuit technologies and FPGA configuration 25 What do you need to configure?
a b 4-input y c LUT mux d flip-flop q e clock
Inputs Type of edge triggering, 16 bit function initial value etc.
IN3160 Circuit technologies and FPGA configuration 26 Xilinx Vivado example with two input LUT (i.e. LUT2)
IN3160 Circuit technologies and FPGA configuration 28 Xilinx Vivado example with four input LUT (i.e. LUT4)
IN3160 Circuit technologies and FPGA configuration 29 Configuration methods
• Configuration methods – Serial load with FPGA as master – Serial load with FPGA as slave – Parallel load with FPGA as master – Parallel load with FPGA as slave • JTAG can also be used
IN3160 Circuit technologies and FPGA configuration 30 Serial download
Configuration data in Configuration data out
= I/O pin/pad = SRAM cell
IN3160 Circuit technologies and FPGA configuration 31 Serial download with FPGA as master
Reset + clock from FPGA Control FPGA y e r c o i Configuration data in v m Cdata In e e D M Cdata Out
Configuration data out
IN3160 Circuit technologies and FPGA configuration 32 Daisy chaining FPGAs
Control FPGA FPGA y e r c o i v m Cdata In Cdata In e e D M Cdata Out Cdata Out
etc.
IN3160 Circuit technologies and FPGA configuration 33 Parallel download with FPGA as master
Control FPGA y e r
c Address o i v m e e D
M Configuration data [7:0] Cdata In[7:0]
IN3160 Circuit technologies and FPGA configuration 34 Parallel download with FPGA as slave , l . a y c r e t r e c e o i
h , v t m p r e i e r o D M e P P r
o Control s s e
c Address o
r FPGA p
o Data r
c Cdata In[7:0] i M
IN3160 Circuit technologies and FPGA configuration 35 Download using the JTAG port
JTAG data out JTAG data in
From previous JTAG filp-flop Input pad
To internal Input pin from logic outside world
JTAG flip-flops
From internal Output pin to logic outside world
Output pad To next JTAG filp-flop
IN3160 Circuit technologies and FPGA configuration 36 Case: Mars Rover
Among NASA's Curiosity avionics contractors are Wind River (VxWorks real-time operating system) and Microsemi Semiconductor (RTAX-S and RTSX-SU FPGAs, high- and low-voltage power supplies, high reliability diodes and signal and power transistors).
IN3160 Circuit technologies and FPGA configuration 37