IN3160 FPGA circuit technologies and configuration

Outline

• Circuit technologies
• FPGA configuration
• Case study: Mars Rover

The difference between a microprocessor and programmable logic

• A microprocessor is programmed with instructions (which are stored in RAM/ROM)
• A programmable logic circuit is programmed by a circuit description
• A programmable circuit consists of configurable blocks of logic and configurable connections between these blocks

Circuit technologies

| Feature | SRAM | Antifuse | E2PROM / FLASH |
|---|---|---|---|
| Technology node | State-of-the-art | One or more generations behind | One or more generations behind |
| Reprogrammable | Yes (in system) | No | Yes (in-system or offline) |
| Reprogramming speed (inc. erasing) | Fast | ---- | 3x slower than SRAM |
| Volatile (must be programmed on power-up) | Yes | No | No (but can be if required) |
| Requires external configuration file | Yes | No | No |
| Good for prototyping | Yes (very good) | No | Yes (reasonable) |
| Instant-on | No | Yes | Yes |
| IP Security | Acceptable (especially when using bitstream encryption) | Very Good | Very Good |
| Size of configuration cell | Large (six transistors) | Very small | Medium-small (two transistors) |
| Power consumption | Medium | Low | Medium |
| Rad Hard | No | Yes | Not really |

SRAM-based FPGAs

• Working principle:
  – SRAM memory inside the FPGA holds the configuration
• Advantages:
  – Can be programmed an unlimited number of times
  – Space for a lot of logic
  – Can easily change the functionality of the system
  – Does not need a special programmer or process
• Disadvantages:
  – Takes a lot of space (SRAM cell with 5 transistors)
  – Volatile memory (configuration has to be saved externally)
  – Relatively high power usage
• We use SRAM-based FPGAs from Xilinx, which has SRAM-based FPGAs in the Artix, Kintex, Virtex and Zynq families. Intel (Altera) has similar FPGAs in their FPGA families.

Antifuse

[Figure: antifuse-programmed logic with inputs a and b, NOT gates, pull-up resistors and an AND gate. With unprogrammed antifuses, y = 1 (N/A). With programmed antifuses, y = !a & b.]

[Figure: antifuse cross-section on metal, oxide, metal and substrate layers. (a) Before programming: amorphous silicon column. (b) After programming: polysilicon via.]

• Working principle:
  – Configuration is saved in the FPGA by making shorts using high voltage
• Advantages:
  – Low impedance when fuse is “on” (small delay)
  – Low power usage
  – Compact technology (low space requirements)
  – Extra robust technology (high radiation resistance)
• Disadvantages:
  – Has to be programmed with a dedicated programmer
  – High programming voltage and power
  – Permanent programming (one-time programming)
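The antifuse figure can be mimicked in a few lines of Python. This is only a sketch of the idea: the function name `antifuse_and` and the signal names `"a"`, `"!a"`, `"b"`, `"!b"` are made up for illustration. A programmed antifuse connects a signal to the AND gate, while every unconnected AND input is held at logic 1 by its pull-up resistor.

```python
def antifuse_and(a, b, programmed):
    """Evaluate an AND gate whose inputs arrive through antifuses.

    `programmed` holds the (hypothetical) names of the signals whose
    antifuse has been shorted. Inputs without a programmed antifuse are
    pulled up to logic 1, so they do not affect the AND result.
    """
    signals = {"a": a, "!a": 1 - a, "b": b, "!b": 1 - b}
    y = 1  # pull-ups only: an AND gate with no connected inputs gives 1
    for name in programmed:
        y &= signals[name]
    return y


# Unprogrammed device: y = 1 regardless of the inputs.
print(antifuse_and(0, 1, set()))
# Antifuses for !a and b programmed: y = !a & b.
print(antifuse_and(0, 1, {"!a", "b"}))
```

Because a programmed antifuse is a permanent short, the set passed as `programmed` can only ever grow in a real device; the sketch does not enforce that.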

Logic block complexity in FPGA

• Fine grained:
  – The blocks can be fully utilized in the design, but require large routing resources
• Coarse grained:
  – A block can implement any arbitrary function (LUT), but the resources can often not be fully exploited

[Figure: programmable logic blocks surrounded by programmable interconnect.]

Coarse grained logic block

• The complexity of the coarse grained logic block increases with technological progress
• Example of a traditional coarse grained logic block:
  – Four 4-input LUTs for combinatorial logic
  – Four D flip-flops
  – Carry logic for efficient arithmetic (+ and -)

Example implementation in coarse grained logic block

• Implementation of the function y = (a AND b) OR c

[Figure: the required function y = (a & b) | c, drawn as an AND gate followed by an OR gate, next to a MUX-based implementation and a LUT-based implementation.]

Truth table:

a b c   y
0 0 0   0
0 0 1   1
0 1 0   0
0 1 1   1
1 0 0   0
1 0 1   1
1 1 0   1
1 1 1   1
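A LUT-based block stores the y column of the truth table and uses the inputs as an address. The sketch below shows this for y = (a & b) | c. The helper names `lut_contents` and `lut_read` are illustrative, and the choice of a as the most significant address bit simply follows the row order of the truth table.

```python
def lut_contents(func, n_inputs):
    """Return the truth-table bits of func, one per input combination."""
    bits = []
    for index in range(2 ** n_inputs):
        # Unpack the row index into input bits, most significant first (a, b, c).
        inputs = [(index >> (n_inputs - 1 - i)) & 1 for i in range(n_inputs)]
        bits.append(func(*inputs))
    return bits


def lut_read(bits, *inputs):
    """Look up the stored bit addressed by the inputs (first input is the MSB)."""
    index = 0
    for bit in inputs:
        index = (index << 1) | bit
    return bits[index]


table = lut_contents(lambda a, b, c: (a & b) | c, 3)
print(table)                     # [0, 1, 0, 1, 0, 1, 1, 1]
print(lut_read(table, 1, 1, 0))  # 1
```

Changing the function only changes the stored bits, not the structure of the lookup, which is why a LUT can implement any function of its inputs.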

IN3160 Circuit technologies and FPGA configuration 12 Typical LUT implementation

Transmission gate 11 (active low) 0 0 Transmission gate (active high) 11

SRAM 11 y cells 11

00

11

11

c b a

IN3160 Circuit technologies and FPGA configuration 13 Typical LUT implementation cont.

Transmission gate 11 (active low) 0 0 Transmission gate (active high) 11

SRAM 11 y=1 cells 11

00

11

11

c=0 b=0 a=1
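The transmission-gate tree behaves like a cascade of 2-to-1 multiplexers. The following sketch assumes that input a controls the first stage (next to the SRAM cells) and c the last stage, so the selected cell index is 4c + 2b + a. That ordering and the names `mux2` and `lut3_mux_tree` are assumptions, since the cell order cannot be read from the figure.

```python
def mux2(sel, in0, in1):
    """2-to-1 multiplexer: a pair of complementary transmission gates."""
    return in1 if sel else in0


def lut3_mux_tree(cells, a, b, c):
    """Select one of eight SRAM cells through three multiplexer stages."""
    # Stage 1: input a picks one cell out of each adjacent pair.
    stage1 = [mux2(a, cells[i], cells[i + 1]) for i in range(0, 8, 2)]
    # Stage 2: input b picks one result out of each pair from stage 1.
    stage2 = [mux2(b, stage1[i], stage1[i + 1]) for i in range(0, 4, 2)]
    # Stage 3: input c picks the final output.
    return mux2(c, stage2[0], stage2[1])


# Cell contents for y = (a & b) | c under the assumed ordering (index = 4c + 2b + a).
cells = [0, 0, 0, 1, 1, 1, 1, 1]
print(lut3_mux_tree(cells, a=1, b=1, c=0))  # 1
```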

IN3160 Circuit technologies and FPGA configuration 14 Different uses of LUTs

16-bit SR

16 x 1 RAM

4-input LUT
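The three uses differ only in how the 16 cells are written and read. A minimal sketch, assuming a hypothetical `Lut4` class in which bit i of `init` is the cell at address i:

```python
class Lut4:
    """16 storage cells usable as a LUT, a 16 x 1 RAM or a 16-bit shift register."""

    def __init__(self, init=0):
        # Bit i of init becomes the cell at address i (assumed ordering).
        self.cells = [(init >> i) & 1 for i in range(16)]

    def read(self, addr):
        """LUT and RAM mode: the four inputs form the read address."""
        return self.cells[addr & 0xF]

    def write(self, addr, bit):
        """RAM mode: store one bit at the given address."""
        self.cells[addr & 0xF] = bit & 1

    def shift_in(self, bit):
        """SR mode: every cell takes the value of its predecessor."""
        self.cells = [bit & 1] + self.cells[:-1]


and4 = Lut4(0x8000)   # LUT mode: only address 15 (all inputs 1) holds a 1
print(and4.read(15))  # 1

sr = Lut4()
for bit in (1, 0, 0):
    sr.shift_in(bit)
print(sr.read(2))     # 1: the first bit has been delayed by three shifts
```

In shift register mode the read address acts as a tap selector, so it sets the effective length of the shift register.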

IN3160 Circuit technologies and FPGA configuration 15 FPGA LUTs

Normal FPGA LUT in Shift LUT Register Mode

A

A D

d

d 0

a

r B t

e 1 a

s

C s 0 1 1 0 1 0 1 F 1 C

l F(A,B,C) o 0 0 c k Address D 1

a

t

a

C 0

l

o

c 1 k 0 A B C (Shift Register Length)

IN3160 Circuit technologies and FPGA configuration 16 Xilinx terminology

Configurable logic block (CLB) Slice Slice

CLB CLB Logic cell Logic cell

Logic cell Logic cell

Slice Slice 16-bit SR Logic CLB CLB Logic cell Logic cell 16x1 RAM

Logic cell Logic cell a 4-input Cell LUT b y c mux d flip-flop q e clock clock enable set/reset

IN3160 Circuit technologies and FPGA configuration 17 Additional functions in modern FPGAs

• Clock management and distribution • Carry chains for fast arithmetic (+ and -) • RAM blocks (in addition to distributed RAM with LUTs or through external memory interfaces) • Function blocks (multipliers, DSP functions like multiply & accumulate, Ethernet MAC, PCI-E, etc.) • Processor cores (ARM, RISC-V, MicroBlaze or other types) • High speed serial transceivers

• This is in addition to LUTs and registers which should be utilized as efficiently as possible!

IN3160 Circuit technologies and FPGA configuration 18 RAM blocks (Block RAM / BRAM)

Columns of embedded RAM blocks Arrays of programmable logic blocks

IN3160 Circuit technologies and FPGA configuration 19 Function blocks

RAM blocks

Multipliers

Logic blocks

IN3160 Circuit technologies and FPGA configuration 20 Multiply-and-accumulate (MAC)

Multiplier Accumulator

A[n:0]

xx

B[n:0] ++ Y[(2n - 1):0]

MAC
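The behavior of the MAC block can be sketched as a register that adds one product per clock cycle. The class name `Mac` and its `step` method are illustrative only, and the sketch ignores the fixed output width shown in the figure.

```python
class Mac:
    """Multiply-and-accumulate: acc <- acc + a * b on every step."""

    def __init__(self):
        self.acc = 0  # accumulator register

    def step(self, a, b):
        """Model one clock cycle and return the new accumulator value."""
        self.acc += a * b
        return self.acc


mac = Mac()
for a, b in [(1, 2), (3, 4), (5, 6)]:
    mac.step(a, b)
print(mac.acc)  # 44, the dot product of (1, 3, 5) and (2, 4, 6)
```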

IN3160 Circuit technologies and FPGA configuration 21 DSPs (DSP48E1)

• Contains functionality for different arithmetic functions: – 25*18 two’s-complement multiplier – 48-bit accumulator – 25-bit pre-adder – Dual 24-bit add/ subtract/accumulate – Pattern detector – Pipelining and cascading buses
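To see how the listed widths interact, the sketch below chains a 25-bit pre-adder, a 25 x 18 multiplier and a 48-bit accumulator using two's-complement wrap-around. It is a simplified behavioral model, not the DSP48E1 itself: the names `wrap_signed` and `dsp_step` and the port names `a`, `d`, `b` are assumptions, and pipelining, operating modes and the pattern detector are left out.

```python
def wrap_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):  # sign bit set
        value -= 1 << bits
    return value


def dsp_step(acc, a, d, b):
    """One accumulate step: acc + (a + d) * b with fixed widths."""
    pre = wrap_signed(a + d, 25)           # 25-bit pre-adder
    product = pre * wrap_signed(b, 18)     # 25 x 18 multiplier
    return wrap_signed(acc + product, 48)  # 48-bit accumulator


acc = dsp_step(0, a=3, d=4, b=5)
print(acc)  # 35
acc = dsp_step(acc, a=-2, d=1, b=7)
print(acc)  # 28
```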

IN3160 Circuit technologies and FPGA configuration 22 Processor cores

• Processor cores can be integrated with the FPGA logic • Many designs require a processor, and adding it to the FPGA can remove the need for an external CPU • Two different types – Soft core • Programmable logic is used in the FPGA to implement a simple microprocessor that interacts directly with the rest of the FPGA • Examples are Xilinx MicroBlaze, Intel (Altera) Nios II and RISC-V – Hard core • The processor is implemented on the silicon with the FPGA chip at production • High speed interfaces are used between the CPU and the FPGA fabric • Xilinx Zynq 7000 family have a dual core ARM Cortex-A9 (up to 1.0 GHz) • Xilinx Zync Ultrascale+ family have a quad core ARM Cortex-A53 (up to 1.5GHz), dual core ARM Cortex-R5 (up to 600MHz) and a Mali-400 MP2 graphic processor

IN3160 Circuit technologies and FPGA configuration 23 Clock generation and distribution

• Global clock trees distribute clock signals to synchronous elements across the device – Ensures that the edges of the clock signal arrives at almost the same time across the device (setup/hold time) • A global clock is often generated externally, and an internal clock manager generates a number of derived clocks

Clock signal from outside world Daughter clocks Clock used to drive internal clock trees Manager or output pins etc.

Special clock pin and pad

IN3160 Circuit technologies and FPGA configuration 24 Configuration of the FPGA

Programmable interconnect

Programmable logic blocks

Configuration bit string

0100010111010110001010010011101

IN3160 Circuit technologies and FPGA configuration 25 What do you need to configure?

a b 4-input y c LUT mux d flip-flop q e clock

Inputs Type of edge triggering, 16 bit function initial value etc.
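The settings above can be packed into a handful of configuration bits. The sketch below derives the 16-bit LUT function from a Python function and appends three more bits. The bit layout (bits 0 to 15 for the LUT, bit 16 for the mux, bit 17 for the clock edge, bit 18 for the initial value) is invented for illustration and does not describe any real device, and input a is assumed to be the least significant address bit.

```python
def lut4_init(func):
    """Return the 16-bit LUT contents; bit i is func at input combination i."""
    init = 0
    for i in range(16):
        # Assumed ordering: a is address bit 0, d is address bit 3.
        a, b, c, d = i & 1, (i >> 1) & 1, (i >> 2) & 1, (i >> 3) & 1
        init |= (func(a, b, c, d) & 1) << i
    return init


def logic_cell_config(func, use_flip_flop, rising_edge, ff_init):
    """Pack one logic cell's settings into an integer (hypothetical layout)."""
    return (
        lut4_init(func)         # bits 0-15: LUT function
        | (use_flip_flop << 16)  # bit 16: mux selects the flip-flop output
        | (rising_edge << 17)    # bit 17: type of edge triggering
        | (ff_init << 18)        # bit 18: flip-flop initial value
    )


print(hex(lut4_init(lambda a, b, c, d: a & b & c & d)))  # 0x8000
print(hex(lut4_init(lambda a, b, c, d: a ^ b)))          # 0x6666
```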

IN3160 Circuit technologies and FPGA configuration 26 example with two input LUT (i.e. LUT2)

IN3160 Circuit technologies and FPGA configuration 28 Xilinx Vivado example with four input LUT (i.e. LUT4)

IN3160 Circuit technologies and FPGA configuration 29 Configuration methods

• Configuration methods – Serial load with FPGA as master – Serial load with FPGA as slave – Parallel load with FPGA as master – Parallel load with FPGA as slave • JTAG can also be used

IN3160 Circuit technologies and FPGA configuration 30 Serial download

Configuration data in Configuration data out

= I/O pin/pad = SRAM cell
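Serial download can be pictured as clocking the bitstream through one long shift register made of the configuration cells. A minimal sketch, with the function name `serial_load` and the list representation of the cells as assumptions:

```python
def serial_load(bitstream, n_cells):
    """Shift a bitstream into a chain of n_cells configuration cells.

    Returns the final cell contents and the bits seen on 'data out'.
    """
    cells = [0] * n_cells
    data_out = []
    for bit in bitstream:
        data_out.append(cells[-1])  # the last cell's value leaves the chain
        cells = [bit] + cells[:-1]  # every other cell takes its neighbor's value
    return cells, data_out


cells, data_out = serial_load([1, 0, 1, 1], n_cells=4)
print(cells)  # [1, 1, 0, 1]: the first bit sent ends up in the last cell
```

Once more bits than cells have been shifted in, the earliest bits appear on data out, which is what makes it possible to chain devices.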

IN3160 Circuit technologies and FPGA configuration 31 Serial download with FPGA as master

Reset + clock from FPGA Control FPGA y e r c o i Configuration data in v m Cdata In e e D M Cdata Out

Configuration data out

IN3160 Circuit technologies and FPGA configuration 32 Daisy chaining FPGAs

Control FPGA FPGA y e r c o i v m Cdata In Cdata In e e D M Cdata Out Cdata Out

etc.

IN3160 Circuit technologies and FPGA configuration 33 Parallel download with FPGA as master

Control FPGA y e r

c Address o i v m e e D

M Configuration data [7:0] Cdata In[7:0]
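In parallel master mode the FPGA walks through the memory addresses itself and receives one byte per read. The sketch below models this with a Python `bytes` object as the memory device. The function name `parallel_master_load` and the choice to unpack each byte most significant bit first are assumptions for illustration.

```python
def parallel_master_load(memory, n_bytes):
    """Read n_bytes from memory and return the configuration bits in order."""
    config_bits = []
    for address in range(n_bytes):     # the FPGA drives the address bus
        byte = memory[address] & 0xFF  # memory answers on Configuration data [7:0]
        # Unpack the byte into single bits, most significant bit first (assumed).
        config_bits.extend((byte >> i) & 1 for i in range(7, -1, -1))
    return config_bits


memory_device = bytes([0xA5, 0x01])
print(parallel_master_load(memory_device, 2))
```

Compared with serial download, each read delivers eight configuration bits, at the cost of more pins between the FPGA and the memory device.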

IN3160 Circuit technologies and FPGA configuration 34 Parallel download with FPGA as slave , l . a y c r e t r e c e o i

h , v t m p r e i e r o D M e P P r

o Control s s e

c Address o

r FPGA p

o Data r

c Cdata In[7:0] i M

IN3160 Circuit technologies and FPGA configuration 35 Download using the JTAG port

JTAG data out JTAG data in

From previous JTAG filp-flop Input pad

To internal Input pin from logic outside world

JTAG flip-flops

From internal Output pin to logic outside world

Output pad To next JTAG filp-flop

IN3160 Circuit technologies and FPGA configuration 36 Case: Mars Rover

Among NASA's Curiosity avionics contractors are Wind River (VxWorks real-time operating system) and Microsemi Semiconductor (RTAX-S and RTSX-SU FPGAs, high- and low-voltage power supplies, high reliability diodes and signal and power transistors).
