
Field Programmable Gate Arrays

Mandakini Patil, EHEP group, DHEP, TIFR

SERC school, NISER, Bhubaneshwar, Nov 7-27, 2017

Outline

• Digital electronics
• Short history of programmable devices
• The internals of an FPGA
• Programming techniques
• Example applications
• Design flow

What is a Field Programmable Gate Array?

A quick answer for the impatient

• An FPGA is an integrated circuit
  • Mostly digital electronics
• An FPGA is programmable in the field (= outside the factory), hence the name “field programmable”
  • Design is specified by schematics or with a hardware description language
  • Tools compute a programming file for the FPGA
  • The FPGA is configured with the design
  • Your circuit is ready to use

With an FPGA you can build electronic circuits …
… without using a soldering iron
… without plugging together existing modules
… without having a chip produced at a factory

Introduction to digital electronics

The building blocks: logic gates

Gate                           C equivalent
AND gate                       q = a && b;
OR gate                        q = a || b;
Exclusive OR gate (XOR gate)   q = a != b;

Combinatorial logic (asynchronous)

Outputs are determined by the inputs only.

Example: Full adder with carry-in, carry-out
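The full adder example can be sketched as a small behavioural model. This is Python used purely as a simulation aid, not a hardware description, and all names are illustrative. The first function uses only AND, OR and XOR operations; the second stores the same truth table in a list, which is the idea behind a look-up table.

```python
from itertools import product


def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Return (sum, carry_out) for three 1-bit inputs, using only gate operations."""
    partial = a ^ b                               # XOR gate
    total = partial ^ carry_in                    # XOR gate
    carry_out = (a & b) | (partial & carry_in)    # two AND gates feeding an OR gate
    return total, carry_out


# The same function as a look-up table: one stored (sum, carry_out) entry
# per input combination, indexed by the 3-bit number (a, b, carry_in).
FULL_ADDER_LUT = [full_adder(a, b, c) for a, b, c in product((0, 1), repeat=3)]


def full_adder_lut(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Look the result up instead of computing it."""
    return FULL_ADDER_LUT[(a << 2) | (b << 1) | carry_in]


if __name__ == "__main__":
    print(" a b cin | sum cout")
    for a, b, c in product((0, 1), repeat=3):
        s, cout = full_adder_lut(a, b, c)
        print(f" {a} {b}  {c}  |  {s}    {cout}")
```

Both versions give the same outputs for all eight input combinations; only the way the result is obtained differs.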

Combinatorial logic may be implemented using Look-Up Tables (LUTs).

(Synchronous) sequential logic

Outputs are determined by the inputs and their history (sequence).
The logic has an internal state.

Example: 2-bit binary counter
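A minimal behavioural sketch of a 2-bit binary counter, as in the example named above, shows what "internal state" means: the output after a clock edge depends on the stored value, not only on the current inputs. The class name and the `enable` input are assumptions made for illustration.

```python
class TwoBitCounter:
    """Behavioural model of a 2-bit binary counter."""

    def __init__(self) -> None:
        self.state = 0  # internal state: two stored bits, values 0..3

    def clock_edge(self, enable: int = 1) -> int:
        """Model one active clock edge and return the new state."""
        if enable:
            self.state = (self.state + 1) & 0b11  # wraps from 3 back to 0
        return self.state


if __name__ == "__main__":
    counter = TwoBitCounter()
    for _ in range(6):
        print(f"{counter.clock_edge():02b}")
```

Calling `clock_edge` with the same input twice returns different values, which a purely combinatorial function could not do.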

D flip-flop: samples the data at the rising (or falling) edge of the clock
[Figure: D flip-flop with set, data and clock inputs; output and inverted output]

Synchronous sequential logic

Using Look-Up Tables and flip-flops, any kind of digital electronics may be implemented.
Of course, there are some details to be learnt about electronics design …

Programmable digital electronics

Long, long time ago …

Ever-decreasing feature size

• Higher capacity
• Higher speed
• Lower power consumption

130 nm: Xilinx Virtex-2
28 nm: Xilinx Virtex-7 / Altera Stratix V
16 nm: Xilinx UltraScale
14 nm: Altera Stratix 10

Slowing down
5.5 million logic cells
4 million logic cells

Programmable Logic Devices

Simple PLDs (sPLDs): PROMs
Programmable Read-Only Memory

[Figure: Unprogrammed PROM (fixed AND array, programmable OR array)]

Simple PLDs (sPLDs): PLAs
Programmable Logic Arrays

[Figure: Unprogrammed PLA (programmable AND and OR arrays)]
1975: most flexible, but slower

Simple PLDs (sPLDs): PALs

[Figure: Unprogrammed PAL (programmable AND array, fixed OR array)]

Complex PLDs (CPLDs)

[Figure label, truncated: “… and flip-flops”]

Coarse grained: 100’s of blocks, restrictive structure

FPGAs

FPGA
• You can get started with a $200 development board and free software
• Rapid development cycle (minutes / hours)
• May be reprogrammed in the field (firmware upgrade)
  • New features
  • Bug fixes

ASIC
• Extremely high development cost
• ASICs are produced at a fabrication facility (“fab”) according to your design
• Long development cycle (weeks / months)
• Design cannot be changed once it is produced
• Better radiation hardness
• Analog designs possible
• Lower cost per device in a large volume production

Components in a modern FPGA

LUT-based Fabrics

Typical LUT-based Logic Cell

Xilinx: logic cell, Altera: logic element

• Flip-flop registers the LUT output
• May use only the LUT or only the flip-flop
• LUT may alternatively be configured as a shift register
• Additional elements (not shown): fast carry logic

[Figure: Inside an FPGA]

Embedded RAM blocks
Today: up to several MB of RAM

Embedded Multipliers

Digital Signal Processing (DSP)
DSP block (Xilinx 7-series)
Up to several 1000 per chip

Soft and Hard Processor Cores

Clock Trees

Clock Managers

System-On-a-Chip (SoC) FPGAs

Xilinx Zynq
Altera Stratix 10
CPU(s) + Peripherals + FPGA in one package

Trends

• Increased capacity and speed (> 1 M logic cells)

• Look-up tables with more inputs

DIFFERENT FEATURES of FPGA

• Hard macro cores (Ethernet MAC, memory interfaces, …)
• Embedded CPUs
• Sophisticated interfaces (PCI Express, …)
• Domain-specific devices
  • Ultra-low-power FPGAs
  • Mixed-signal FPGAs
  • ...

How FPGA Technology evolved

Fusible Links (not used in FPGAs)

Antifuse Technology

SRAM-Based Devices
• Used in most FPGAs
• Design Considerations (SRAM Config.)

EEPROM Technology
Erasable Programmable Read-Only Memory (EPROM), 1971

Major Manufacturers

• Xilinx
  • First company to produce FPGAs in 1985
  • About 50% market share, today
  • SRAM based CMOS devices
• Altera
  • About 30% market share
  • SRAM based CMOS devices
• Actel
  • Anti-fuse FPGAs
  • Flash based FPGAs
  • Mixed signal
• Lattice Semiconductor

Hardware Description Language

• Looks similar to a programming language
  • BUT be aware of the difference
  • Programming language => translated into machine instructions that are executed by a CPU
  • HDL => translated into gateware (logic gates & flip-flops)
• Common HDLs
  • VHDL
  • Verilog
  • AHDL (Altera specific)

• Newer trends
  • C-like languages (Handel-C, SystemC)
  • LabVIEW

Development Tools

Different Steps in Implementation

-- std_logic and std_logic_vector come from this package
library ieee;
use ieee.std_logic_1164.all;

entity DelayLine is
  generic (
    n_halfcycles : integer := 2);
  port (
    x         : in  std_logic_vector;
    x_delayed : out std_logic_vector;
    clk       : in  std_logic);
end entity DelayLine;
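The entity above only declares an interface; its architecture is not shown. Assuming, from the port and generic names alone, that `x_delayed` is meant to reproduce `x` after `n_halfcycles` half clock cycles, the behaviour can be sketched in Python as a chain of storage stages. The class and method names are hypothetical, and this is a simulation model rather than a translation of any actual architecture.

```python
from collections import deque


class DelayLineModel:
    """Behavioural model: the output is the input from n_halfcycles steps earlier."""

    def __init__(self, n_halfcycles: int = 2, initial: int = 0) -> None:
        if n_halfcycles < 1:
            raise ValueError("n_halfcycles must be at least 1")
        # One storage stage per half clock cycle of delay (assumed behaviour).
        self.stages = deque([initial] * n_halfcycles, maxlen=n_halfcycles)

    def half_cycle(self, x: int) -> int:
        """Advance by one half clock cycle: shift x in, return the oldest value."""
        x_delayed = self.stages[0]
        self.stages.append(x)  # maxlen drops the oldest entry automatically
        return x_delayed


if __name__ == "__main__":
    line = DelayLineModel(n_halfcycles=2)
    for sample in (1, 2, 3, 4):
        print(sample, "->", line.half_cycle(sample))
```

In hardware the stages would be flip-flops clocked on alternating edges; the deque plays that role here only for illustration.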

Schematics
• Graphical overview
• Can draw entire design
• Can also include blocks in hardware description language

Hardware Description Language
• Can generate blocks using loops
• Can synthesize algorithms
• Independent of design tool
• May use tools used in SW development (CVS, …)

Mostly a personal choice depending on previous experience

Design flow

Configuration at power-up

[Figure: Flash PROM (stores single or multiple designs) → serial bit-stream (may be encrypted) → FPGA (SRAM based)]

Typical FPGA configuration time: milliseconds

Programming from a host PC

[Figure: Flash PROM and FPGA (SRAM based) on the JTAG bus; host access via PCI, VME]

The JTAG bus may be driven by an FPGA which contains an interface to a host PC via PCI or VME. Firmware can then be updated from the PC.

FPGA APPLICATIONS in HEP and Other Domains

Why are FPGAs ideal for First-Level Triggers?

• They are fast
  • Much faster than discrete electronics (shorter connections)
  • Low latency
• Many parallel inputs
  • Data from many parts of the detector has to be combined
  • Can send parallel data rather than serial
• All operations are performed in parallel
  • Can build pipelined logic
  • High performance

FPGAs in Data Acquisition

• Frontend electronics
  • Pedestal subtraction
  • Zero suppression
  • Compression
  • …
• Custom data links
  • E.g. SLINK-64
  • Several serial LVDS links in parallel
  • Up to 400 MB/s
• Interface from custom hardware to commercial electronics
  • PCI bus, VME bus, Myrinet, etc.

Example 3: CMS Front-end Readout Link

Trigger algorithms implemented in FPGAs

• Peak finding
• Pattern finding
• Track finding
• Energy summing
• Sorting
• Topological algorithms
• Trigger
• Fast signal merging
• Many more …

Example: CMS Global Muon Trigger

• Input: ~1000 bits @ 40 and 80 MHz
• Output: ~50 bits @ 80 MHz
• Processing time: 250 ns
• Pipelined logic: one new result every 25 ns
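The two timing figures fit together if one assumes one pipeline stage per 25 ns clock period. The slide does not state the stage count, so this is an assumption: 250 ns / 25 ns = 10 stages. The sketch below models only the timing (values are passed through unchanged) to show that latency and throughput are independent. Function and constant names are illustrative.

```python
CLOCK_PERIOD_NS = 25      # one new result every 25 ns (from the slide)
PROCESSING_TIME_NS = 250  # total processing time (from the slide)
N_STAGES = PROCESSING_TIME_NS // CLOCK_PERIOD_NS  # 10, assuming one stage per clock


def run_pipeline(inputs, n_stages=N_STAGES):
    """Shift inputs through n_stages registers; yield (clock tick, output) pairs.

    Inputs must not be None, because None marks an empty stage.
    """
    stages = [None] * n_stages
    for tick, item in enumerate(list(inputs) + [None] * n_stages):
        done = stages[-1]                 # value leaving the last stage on this tick
        stages = [item] + stages[:-1]     # every stage hands its value to the next one
        if done is not None:
            yield tick, done


if __name__ == "__main__":
    for tick, value in run_pipeline(["bx0", "bx1", "bx2"]):
        print(f"t = {tick * CLOCK_PERIOD_NS} ns: result for {value}")
```

The first result appears after 10 ticks (250 ns), and each following result arrives one tick (25 ns) later, although every input spends the full 250 ns inside the pipeline.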

• 10 Xilinx Virtex-II FPGAs
  • Up to 500 user I/Os per chip
  • Up to 25000 LUTs per chip used
  • Up to 96 x 18 kbit RAM used

• The CMS Global Muon Trigger receives 16 muon candidates from the three muon systems of CMS
• It merges different measurements for the same muon and finds the best 4 overall muon candidates

[Figure: CMS Global Muon Trigger main FPGA]

Example 4: CMS Readout Link for Run-2

In use since 2015

Myrinet NIC replaced by custom-built card (“FEROL”)
10 Gb/s TCP/IP

Cost effective solution (need many boards):
rather inexpensive FPGA + commercial chip to combine 3 Gb/s links to 10 Gb/s

SLINK-64 input: LVDS / copper

FEROL (Front End Readout Optical Link)
Input:
• 1x or 2x SLINK (copper)
• 1x or 2x 5 Gb/s optical
• 1x 10 Gb/s optical
Output: 10 Gb/s Ethernet optical
TCP/IP sender in FPGA

Example 2: CMS Fast Merging Modules

[Figure label: bi-color LEDs]

These modules merge the status of all detector front-ends in CMS in order to throttle the trigger when buffers fill up.
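The merging idea can be sketched as "the most severe state wins". The state names and their ordering below are hypothetical and chosen only for illustration; they are not the actual CMS status encoding, and the real modules do this in parallel logic rather than in software.

```python
from enum import IntEnum


class Status(IntEnum):
    """Hypothetical front-end states, ordered so that a higher value is more severe."""
    READY = 0
    WARNING = 1   # buffers are filling up
    BUSY = 2      # buffers are full


def merge_status(front_end_states):
    """Return the most severe state reported by any front-end."""
    return max(front_end_states, default=Status.READY)


def trigger_allowed(merged: Status) -> bool:
    """Throttle policy assumed here: stop triggers only while some front-end is BUSY."""
    return merged < Status.BUSY


if __name__ == "__main__":
    states = [Status.READY, Status.WARNING, Status.READY]
    merged = merge_status(states)
    print(merged.name, "-> trigger allowed:", trigger_allowed(merged))
```

A single BUSY front-end is enough to throttle the trigger, which is why the merged value is taken as the maximum over all inputs.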

Additionally, these modules monitor all status changes of detector front-ends.

• 1 FPGA (Altera): PCI interface
• 1 FPGA (Xilinx):
  • Merging logic (1 µs latency)
  • Monitoring logic
  • Interface to SRAM

• SLINK Sender Mezzanine Card: 400 MB/s
  • 1 FPGA (Altera)
  • CRC check
  • Automatic link test
• Commercial Myrinet Network Interface Card on internal PCI bus
• Front-end Readout Link Card
  • 1 main FPGA (Altera)
  • 1 FPGA as PCI interface
  • Custom Compact PCI card
  • Receives 1 or 2 SLINK64
  • 2nd CRC check
  • Monitoring, histogramming
  • Event spy

Example 2: New µTCA board for CMS trigger upgrade based on Virtex 7

MP7, Imperial College

• Virtex 7 with 690k logic cells
• 80 x 10 Gb/s transceivers, bi-directional
  • 72 of them as optical links on front panel
• Input/output: up to 14k bits per 40 MHz clock
• 0.75 + 0.75 Tb/s
• Same board used for different functions (different gateware)
  • Separation of framework + algorithm fw
• Being used in the CMS trigger since 2015

[Figure: MP7 board; labels: Rx, Tx, 36 x 10 Gb/s = 360 Gb/s]

FPGAs in Server Processors and the Cloud

• New in 2016: Intel Xeon Server Processor with FPGA in socket
  • Intel acquired Altera in 2015

• FPGAs in the cloud
  • Amazon Elastic Compute Cloud (EC2) F1 instances
    • 8 CPUs / 1 Xilinx UltraScale FPGA
    • 64 CPUs / 8 Xilinx UltraScale FPGAs

FPGAs in other domains

• Set-top boxes
• Processing
• Medical imaging
• ASIC prototyping
• vision
• High performance computing
  • Computations performed by FPGA: FFT, convolution
• Speech recognition
• Cryptography
• Bioinformatics
• Reconfigurable computing
• Software-Defined Radio
• Aerospace
• Defense

FPGAs in other domains

• Medical imaging
• ASIC prototyping
• Advanced Driver Assistance Systems (image processing)
• High performance computing
• Accelerator cards
• Speech recognition
• Cryptography
• Bioinformatics
• Aerospace / Defense
• Server processors w. FPGA
• Bitcoin mining

[Figure label: 3 TFlop]

You are going to design the digital electronics using this FPGA board!

THANK YOU

VHDL has roots in the Ada programming language in both concept and syntax, while Verilog's roots can be traced back to an early HDL called Hilo and the C programming language.