ECE 545

Digital System Design with VHDL

Fall 2015

Kris Gaj

Research and teaching interests:
• computer arithmetic
• cryptography
• network security

Contact: The Engineering Building, room 3225
[email protected]

Office hours: Thursday, 6:00-7:00 PM, Tuesday, 6:00-7:00 PM, and by appointment

Course Web Page

ECE web page → Courses → Digital System Design with VHDL (or “Kris Gaj”)

ECE 545

Part of:

MS in Computer Engineering
• One of five core courses (must be passed with B or better)
• Fundamental course for the specialization areas:
  - Digital Systems Design
  - Digital Signal Processing
• Elective course in the remaining specialization areas

MS in Electrical Engineering
• Elective

ECE 545

Part of:

PhD in Electrical and Computer Engineering

• Knowledge tested at the Technical Qualifying Exam (TQE)
• Topic 2: Digital Design and Computer Organization

[Diagram with three columns: "I am interested in…", "I want to specialize primarily in…", "Recommended program & specialization"]
Programs shown: MS CpE, MS EE
Specializations shown: Digital Systems Design, Microelectronics/Nanoelectronics
Areas shown: CAD tools & Design Automation; VLSI; Hardware Description Languages; FPGAs & Reconfigurable computing; ASICs & FPGAs; Computer Arithmetic; VHDL; CAD Tools; Front-end ASIC Design (algorithmic downto gate level); Back-end ASIC Design (circuit and mask layout levels); Reconfigurable Computing; Analog & Digital Circuit Design; Microelectronics; VLSI Fabrication; Nanoelectronics; Semiconductor Devices

Design level / Courses

[Diagram: courses arranged by design level]
Design levels: algorithmic, register-transfer, gate, transistor, layout, devices
Courses shown:
• ECE 545 Digital System Design with VHDL
• ECE 645 Computer Arithmetic
• ECE 699 SW/HW Codesign
• ECE 681 VLSI Design for ASICs
• ECE 682 VLSI Test Concepts
• ECE 586 Digital Integrated Circuits
• ECE 680 Physical VLSI Design
• ECE 584 Semiconductor Device Fundamentals
• ECE 684 MOS Device Electronics

Pre-Approved Electives

CpE – Digital Systems Design
• ECE 545 Digital System Design with VHDL
• ECE 586 Digital Integrated Circuits
• ECE 645 Computer Arithmetic
• ECE 681 VLSI Design for ASICs
• ECE 682 VLSI Test Concepts
• ECE 699 SW/HW Codesign
• ECE 740 DSP HW Architectures

CpE – Microprocessors and Embedded Systems
• ECE 510 Real-Time Concepts
• ECE 511 Microprocessors
• ECE 611 Advanced Microprocessors
• ECE 612 Real-Time Emb. Systems
• ECE 641 Computer System Arch.
• ECE 699 SW/HW Codesign
• ECE 699 Green Computing and Heterogeneous Architectures

Suggested Electives
• ECE 545, 645, 681 (digital design)
• ECE 584, 684, … (technology)
• ECE 511, 611, … (microprocessors)
• ECE 535, 537, 646, … (applications: DSP, image processing, crypto, etc.)
• ECE 542, 642, 742 (networks)
• ECE 548 (sequential mach. theory)
• CS 571 (operating systems)
• CS 540, 583 (languages, algorithms)
• CS 580 (artificial intelligence)

Professors
• Digital Systems Design: K. Gaj, H. Homayoun, J-P. Kaps, T. Storey, A. Cohen
• Microprocessors and Embedded Systems: H. Homayoun, J. Kaps, P. Pachowicz, C. Sabzevari

DIGITAL SYSTEMS DESIGN

1. ECE 545 Digital System Design with VHDL – K. Gaj, project, FPGA design with VHDL

2. ECE 699 Software/Hardware Codesign – K. Gaj, homework, SoC design with VHDL and C

3. ECE 645 Computer Arithmetic – K. Gaj, project, FPGA design with VHDL or Verilog

4. ECE 681 VLSI Design for ASICs – H. Homayoun, project/lab, front-end and back-end ASIC design with tools

5. ECE 586 Digital Integrated Circuits – D. Ioannou, R. Mulpuri, homework

6a. ECE 682 VLSI Test Concepts – T. Storey, homework

6b. ECE 740 Digital Signals Processing Hardware Architectures – A. Cohen, project, FPGA design with VHDL and Matlab/Simulink

MICROPROCESSOR AND EMBEDDED SYSTEMS

1. ECE 510 Real-Time Concepts – P. Pachowicz, project, design of real-time systems

2. ECE 511 Microprocessors – J.P. Kaps, project, system based on MSP430 microcontroller

3. ECE 611 Advanced Microprocessors – H. Homayoun, project, computer architecture simulation tools

4. ECE 612 Real-Time Embedded System – C. Sabzevari, project, programming distributed real-time systems

5. ECE 641 Computer System Architecture – H. Homayoun, project, computer architecture simulation tools

6. ECE 699 Software/Hardware Codesign – K. Gaj, homework, SoC design with VHDL and C

7. ECE 699 Heterogeneous Architectures and Green Computing – H. Homayoun, project, computer architecture simulation tools

TA

Sanjay Deshpande
MS Thesis Student in the Cryptographic Engineering Research Group (CERG)

• help with the installation and configuration of CAD tools

• help with understanding of tutorials and the operation of tools

• help with VHDL and tool-oriented homework assignments

• limited help with debugging your project codes

Getting Help Outside of Office Hours

• System for asking questions 24/7
• Answers can be given by students and instructors
• Student answers endorsed (or corrected) by instructors
• Average response time in Fall 2014 = 1.5 hours
• You can submit your questions anonymously
• You can ask private questions visible only to the instructors

Grading Scheme

• Homework - 15%

• Project - 35%

• Midterm Exam - 20%

• Final Exam - 30%

• Class Activity - Bonus 5%

Bonus Points for Class Activity

• Based on class exercises during lecture
• “Small” points earned each week posted on Blackboard
• Up to 5 “big” bonus points
• Scaled based on the performance of the best student

For example:

      Student    Small points   Big points
  1.  Alice      40             5
  2.  Bob        36             4.5
  …   …          …              …
  28. Charlie    8              1

In this example the best student's 40 small points map to 5 big points, so big points = 5 × small points / 40.

Midterm exam 1

✓ 2 hours 40 minutes

✓ in class

✓ design-oriented

✓ open-books, cheat sheet

✓ practice exams available on the web

Tentative date: Last week of October

Final exam

✓ 2 hours 45 minutes

✓ in class

✓ design-oriented

✓ open-books, cheat sheet

✓ practice exams available on the web

Date: Thursday, December 17, 7:30-10:15pm

Textbooks

Required Textbook

Pong P. Chu, RTL Hardware Design Using VHDL, Wiley-Interscience, 2006.

[Back cover of the textbook]

THE SKILLS AND GUIDANCE NEEDED TO MAST…

This book teaches readers how to systematically design efficient, portable, and scalable Register Transfer Level (RTL) digital circuits using the VHDL hardware description language and synthesis software. Focusing on the module-level design, which is composed of functional units, routing circuit, and storage, the book illustrates the relationship between the VHDL constructs and the underlying hardware components, and shows how to develop codes that faithfully reflect the module-level design and can be synthesized into efficient gate-level implementation.

Several unique features distinguish the book:
• Coding style that shows a clear relationship between VHDL constructs and hardware components
• Conceptual diagrams that illustrate the realization of VHDL codes
• … procedures, and techniques
• Two chapters on realizing sequential algorithms in hardware
• Two chapters on scalable and parameterized designs and coding
• One chapter covering the synchronization and interface between multiple clock domains

Although the focus of the book is RTL synthesis, it also examines the synthesis task from the perspective of the overall development process. Readers learn good design practices and guidelines to ensure that an RTL design can accommodate future simulation, verification, and testing needs, and can be easily incorporated into a larger system or reused. Discussion is independent of technology and can be applied to both ASIC and FPGA devices.

With a balanced presentation of fundamentals and practical examples, this is an excellent textbook for upper-level undergraduate or graduate courses in advanced digital logic. … should also refer to this book.

PONG P. CHU, PHD, is Associate Professor in the Department of …

Supplementary Textbook – Basics Refresher

Stephen Brown and Zvonko Vranesic, Fundamentals of Digital Logic with VHDL Design, McGraw-Hill, 3rd or 2nd Edition

Supplementary Textbook – Advanced

Hubert Kaeslin, Digital Integrated Circuit Design: From VLSI Architectures to CMOS Fabrication, Cambridge University Press; 1st Edition, 2008.

Technology & Tools

What is an FPGA?

[Figure: FPGA floorplan with I/O Blocks, Block RAMs, and Configurable Logic Blocks (CLB) / Adaptive Logic Modules (ALM)]

Modern FPGA

[Figure: RAM blocks, Multipliers/DSP units, Logic resources (CLBs or ALMs)]

(#Logic resources, #Multipliers/DSP units, #RAM_blocks)

Graphics based on The Design Warrior’s Guide to FPGAs Devices, Tools, and Flows. ISBN 0750676043 Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)

General structure of an FPGA

Programmable interconnect

Programmable logic blocks

The Design Warrior’s Guide to FPGAs Devices, Tools, and Flows. ISBN 0750676043 Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)

ECE 448 – FPGA and ASIC Design with VHDL

4-input LUT (Look-Up Table)
(used in earlier families of FPGAs)

• Look-Up tables are primary elements for logic implementation
• Each LUT can implement any function of 4 inputs

Two example truth tables for a LUT with inputs x1, x2, x3, x4 and output y:

  x1 x2 x3 x4 | y (first example) | y (second example)
   0  0  0  0 |         1         |          0
   0  0  0  1 |         1         |          1
   0  0  1  0 |         1         |          0
   0  0  1  1 |         1         |          0
   0  1  0  0 |         1         |          0
   0  1  0  1 |         1         |          1
   0  1  1  0 |         1         |          0
   0  1  1  1 |         1         |          1
   1  0  0  0 |         1         |          0
   1  0  0  1 |         1         |          1
   1  0  1  0 |         1         |          0
   1  0  1  1 |         1         |          0
   1  1  0  0 |         0         |          1
   1  1  0  1 |         0         |          1
   1  1  1  0 |         0         |          0
   1  1  1  1 |         0         |          0

[Figures: gate-level circuits with inputs x1, x2 and output y]
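A LUT is simply a small memory addressed by its inputs. The following is a minimal behavioral sketch of that idea, not a vendor primitive: the entity name `lut4_sketch`, the generic `INIT`, and its default value are assumptions made for illustration. Bit i of `INIT` holds the output for the truth-table row whose binary value (x1 as the most significant bit) equals i.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Behavioral model of a 4-input LUT (illustrative; names are hypothetical).
entity lut4_sketch is
  generic (
    -- Truth-table contents: INIT(i) is the output for row i.
    -- Assumed default x"0FFF": rows 0..11 give '1', rows 12..15 give '0',
    -- i.e. y = not (x1 and x2).
    INIT : std_logic_vector(15 downto 0) := x"0FFF"
  );
  port (
    x1, x2, x3, x4 : in  std_logic;
    y              : out std_logic
  );
end entity lut4_sketch;

architecture behavioral of lut4_sketch is
  signal addr : std_logic_vector(3 downto 0);
begin
  -- The four inputs form the row address, x1 being the MSB.
  addr <= x1 & x2 & x3 & x4;

  -- The output is the stored bit selected by that address.
  y <= INIT(to_integer(unsigned(addr)));
end architecture behavioral;
```

Changing only the generic changes the implemented function. For instance, a function that is '1' only for rows 1, 5, 7, 9, 12, and 13 would correspond to `generic map (INIT => x"32A2")`.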

6-Input LUT of Spartan-6

Two competing implementation approaches

ASIC (Application Specific Integrated Circuit)
• designed all the way from behavioral description to physical layout
• designs must be sent for expensive and time consuming fabrication in semiconductor foundry

FPGA (Field Programmable Gate Array)
• no physical layout design; design ends with a bitstream used to configure a device
• bought off the shelf and reconfigured by designers themselves

FPGAs vs. ASICs

ASICs
• High performance
• Low power
• Low cost (but only in high volumes)

FPGAs
• Off-the-shelf
• Low development costs
• Short time to the market
• Reconfigurability

Major FPGA Vendors

SRAM-based FPGAs
• Xilinx, Inc. ~ 51% of the market
• Altera Corp. ~ 34% of the market
(together ~ 85% of the market)

Flash & antifuse FPGAs
• Microsemi SoC Products Group (formerly Actel Corp.)
• QuickLogic Corp.

Xilinx FPGA Families

  Technology    Low-cost                   High-performance
  220 nm                                   Virtex
  180 nm        Spartan-II, Spartan-IIE
  120/150 nm                               Virtex-II, Virtex-II Pro
  90 nm         Spartan-3                  Virtex-4
  65 nm                                    Virtex-5
  45 nm         Spartan-6
  40 nm                                    Virtex-6
  28 nm         Artix-7                    Virtex-7
  20 nm                                    Virtex UltraScale
  16 nm                                    Virtex UltraScale+

Altera FPGA Devices

  Technology    Low-cost       Mid-range    High-performance
  130 nm        Cyclone                     Stratix
  90 nm         Cyclone II                  Stratix II
  65 nm         Cyclone III    Arria I      Stratix III
  40 nm         Cyclone IV     Arria II     Stratix IV
  28 nm         Cyclone V      Arria V      Stratix V
  20 / 14 nm                   Arria 10     Stratix 10

FPGA Family

Spartan-6 FPGA Family

FPGA Design process (1)

Specification / Pseudocode

  Design and implement a simple unit permitting to speed up encryption with RC5-similar cipher with fixed key set on 8031 microcontroller. Unlike in the experiment 5, this time your unit has to be able to perform an encryption algorithm by itself, executing 32 rounds…..

On-paper hardware design (Block diagram & ASM chart)

VHDL description (Your Source Files)

  library ieee;
  use ieee.std_logic_1164.all;
  use ieee.std_logic_unsigned.all;

  entity RC5_core is
    port(
      clock, reset, encr_decr : in  std_logic;
      data_input  : in  std_logic_vector(31 downto 0);
      data_output : out std_logic_vector(31 downto 0);
      out_full    : in  std_logic;
      key_input   : in  std_logic_vector(31 downto 0);
      key_read    : out std_logic
    );
  end RC5_core;

Functional simulation

Synthesis

Post-synthesis simulation

FPGA Design process (2)

Implementation

Timing simulation

Configuration

On chip testing

Results

Levels of design description

Levels supported by HDL

Algorithmic level

Register Transfer Level (level of description most suitable for synthesis)

Logic (gate) level

Circuit (transistor) level

Physical (layout) level

Register Transfer Level (RTL) Design Description

[Figure: blocks labeled Combinational Logic and Registers]
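As an illustration of an RTL description, the sketch below separates the two kinds of elements named on this slide: combinational logic written as a concurrent assignment, and a register written as a clocked process. The entity name `rtl_sketch`, the port names, the 8-bit width, and the synchronous reset are all assumptions chosen for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Illustrative RTL description: an adder followed by a register.
entity rtl_sketch is
  port (
    clk, rst : in  std_logic;
    a, b     : in  std_logic_vector(7 downto 0);
    sum_q    : out std_logic_vector(7 downto 0)
  );
end entity rtl_sketch;

architecture rtl of rtl_sketch is
  signal sum_d : unsigned(7 downto 0);  -- output of the combinational logic
  signal sum_r : unsigned(7 downto 0);  -- output of the register
begin
  -- Combinational logic: depends only on current inputs, holds no state.
  -- The 8-bit result wraps around modulo 256.
  sum_d <= unsigned(a) + unsigned(b);

  -- Register: captures sum_d on every rising clock edge
  -- (synchronous reset is an assumption of this sketch).
  reg : process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        sum_r <= (others => '0');
      else
        sum_r <= sum_d;
      end if;
    end if;
  end process reg;

  sum_q <= std_logic_vector(sum_r);
end architecture rtl;
```

Keeping the combinational signal (`sum_d`) and the registered signal (`sum_r`) under separate names makes it easy to match each line of code to a block in a diagram of this kind.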

Synthesis

George Mason University

Logic Synthesis

VHDL description → Circuit netlist

  architecture MLU_DATAFLOW of MLU is
    signal A1 : STD_LOGIC;
    signal B1 : STD_LOGIC;
    signal Y1 : STD_LOGIC;
    signal MUX_0, MUX_1, MUX_2, MUX_3 : STD_LOGIC;
  begin
    A1 <= A when (NEG_A='0') else not A;
    B1 <= B when (NEG_B='0') else not B;
    Y  <= Y1 when (NEG_Y='0') else not Y1;

    MUX_0 <= A1 and B1;
    MUX_1 <= A1 or B1;
    MUX_2 <= A1 xor B1;
    MUX_3 <= A1 xnor B1;

    with (L1 & L0) select
      Y1 <= MUX_0 when "00",
            MUX_1 when "01",
            MUX_2 when "10",
            MUX_3 when others;
  end MLU_DATAFLOW;
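The architecture above needs a matching entity declaration, which the slide does not show. A minimal sketch follows; the port names are taken from the signals the architecture uses, while the port modes and the choice of single-bit `std_logic` for every port are assumptions inferred from how each name is read or assigned.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Assumed entity for the MLU_DATAFLOW architecture (not shown in the source).
entity MLU is
  port (
    NEG_A, NEG_B, NEG_Y : in  std_logic;  -- optional inversion of A, B, and Y
    A, B                : in  std_logic;  -- data inputs
    L1, L0              : in  std_logic;  -- select the operation: and/or/xor/xnor
    Y                   : out std_logic   -- result
  );
end entity MLU;
```

One portability note: some compilers in strict VHDL-93 mode may reject `with (L1 & L0) select`, because the subtype of a concatenation is not locally static. If that happens, a common workaround is to assign `L1 & L0` to an intermediate two-bit signal and select on that signal instead.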

Circuit netlist (RTL view)

Implementation

Mapping

[Figure: netlist elements mapped to LUT0, LUT1, LUT2 and flip-flops FF1, FF2]

Placing

[Figure: FPGA with CLB slices]

Routing

[Figure: FPGA with programmable connections]

Configuration

• Once a design is implemented, you must create a file that the FPGA can understand • This file is called a bitstream: a BIT file (.bit extension)

• The BIT file can be downloaded directly to the FPGA, or can be converted into a PROM file which stores the programming information

Simulation Tools

FPGA Synthesis Tools

XST

FPGA Implementation

• After synthesis the entire implementation process is performed by FPGA vendor tools

Design Process control from Active-HDL

Xilinx FPGA Tools – ECE Labs

  Step             Aldec Active-HDL Design Flow               Xilinx ISE Design Flow
  simulation       Aldec Active-HDL (IDE)                     ISim or ModelSim
  synthesis        Xilinx XST or Synopsys Synplify Premier    Xilinx XST or Synopsys Synplify Premier
  implementation   Xilinx ISE Design Suite                    Xilinx ISE Design Suite (IDE)

Xilinx FPGA Tools – Home

  Step             Aldec Active-HDL Design Flow               Xilinx ISE Design Flow
  simulation       Aldec Active-HDL Student Edition (IDE)     ISim
  synthesis        Xilinx XST (restricted)                    Xilinx XST (restricted)
  implementation   Xilinx ISE WebPACK (restricted)            Xilinx ISE WebPACK (IDE) (restricted)

Altera FPGA Tools – ECE Labs

  Step                          Altera Design Flow
  simulation                    Mentor Graphics ModelSim-Altera
  synthesis & implementation    Altera Quartus II Subscription Edition

Altera FPGA Tools – Home

  Step                          Altera Design Flow
  simulation                    Mentor Graphics ModelSim-Altera Starter (restricted)
  synthesis & implementation    Altera Quartus II Web Edition (restricted)

Lab Access Rules and Behavior Code

Please refer to the ECE Labs website, and in particular to "Access rules & behavior code".

ATHENa – Automated Tool for Hardware EvaluatioN

Supported in part by the National Institute of Standards & Technology (NIST)

GMU ATHENa Team

• Venkata “Vinny”: MS CpE student
• Ekawat “Ice”: PhD CpE student
• Marcin: PhD ECE student
• John: MS CpE student
• Rajesh: PhD ECE student
• Michal: PhD exchange student from Slovakia

ATHENa – Automated Tool for Hardware EvaluatioN
http://cryptography.gmu.edu/athena

Open-source benchmarking tool, written in Perl, aimed at AUTOMATED generation of OPTIMIZED results for MULTIPLE hardware platforms

Currently under development at George Mason University.

Why Athena?

“The Greek goddess Athena was frequently called upon to settle disputes between the gods or various mortals. Athena Goddess of Wisdom was known for her superb logic and intellect. Her decisions were usually well-considered, highly ethical, and seldom motivated by self-interest.”

from "Athena, Greek Goddess of Wisdom and Craftsmanship"

Basic Dataflow of ATHENa

[Diagram: numbered dataflow among the Designer, the User, the ATHENa Server, and the database of designs. Labels: Interfaces + Testbenches; Download scripts and configuration files; HDL + scripts + configuration files; HDL + FPGA Tools; FPGA Synthesis and Implementation; Result Summary + Database Entries; Database Entries; Database query; Ranking of designs]

[Diagram: ATHENa inputs and outputs. Inputs: configuration files, constraint files, testbench, synthesizable source files. Outputs: database entries (machine-friendly), result summary (user-friendly)]

ATHENa Major Features (1)

• synthesis, implementation, and timing analysis in batch mode
• support for devices and tools of multiple FPGA vendors
• generation of results for multiple families of FPGAs of a given vendor
• automated choice of a best-matching device within a given family

ATHENa Major Features (2)

• automated verification of designs through simulation in batch mode
• support for multi-core processing
• automated extraction and tabulation of results
• several optimization strategies aimed at finding
  – optimum options of tools
  – best target clock frequency
  – best starting point of placement

Generation of Results Facilitated by ATHENa

• batch mode of FPGA tools vs. [figure not recovered]
• ease of extraction and tabulation of results
  - Text Reports, Excel, CSV (Comma-Separated Values)
• optimized choice of tool options
  - GMU_optimization_1 strategy

Relative Improvement of Results from Using ATHENa
Virtex 5, 256-bit Variants of Hash Functions

[Bar chart: ratios for Area, Thr, and Thr/Area; vertical axis from 0 to 2.5]

Ratios of results obtained using ATHENa suggested options vs. default options of FPGA tools

Other (Somewhat) Similar Tools

Vivado

Design Space Explorer (DSE)

Boldport Flow

EDAx10 Cloud Platform

Distinguishing Features of ATHENa

• Support for multiple tools from multiple vendors

• Optimization strategies aimed at the best possible performance rather than design closure

• Extraction and presentation of results

• Seamless integration with the ATHENa database of results

Benchmarking Goals Facilitated by ATHENa

Comparing multiple:
1. cryptographic algorithms
2. hardware architectures or implementations of the same cryptographic algorithm
3. hardware platforms from the point of view of their suitability for the implementation of a given algorithm (e.g., choice of an FPGA device or FPGA board)
4. tools and languages in terms of quality of results they generate (e.g., Verilog vs. VHDL, Synplicity Synplify Premier vs. Xilinx XST, ISE v. 14.7 vs. ISE v. 14.6)

Project

Cryptography Project

✓ related to the research project conducted by Cryptographic Engineering Research Group (CERG) at GMU

✓ supporting NIST (National Institute of Standards and Technology) and the CAESAR Contest Committee in the evaluation of candidates for a new cryptographic standard

Cryptography Project

✓ RTL VHDL implementation of an authenticated cipher based on the
  • algorithm specification
  • reference implementation in C
  • Hardware API specification

✓ a different cipher for each student

✓ two students working on similar ciphers can work closely together and exchange source code

✓ each student graded based on the deliverables for his/her own cipher

Combining Projects from Two Different Courses

• ECE 545 & ECE 646
  - ECE 545 project can be extended into an ECE 646 hardware project by adding additional ciphers, architectures, modes of operation, etc.
  - ECE 646 students must write a final report and submit deliverables (one submission per group); ECE 545 students submit only deliverables (separate for each member of a group)
  - Students forming a two-member group in ECE 646 will receive the same score for the ECE 646 project, but possibly different scores for their respective ECE 545 projects
• ECE 545 & ECE 797/798/799/998
  - ECE 545 project can be extended into a Scholarly Paper, Research Project, Master’s Thesis, PhD Thesis

Project Organization

• Project divided into phases

• Deliverables for each phase submitted using Blackboard at selected checkpoints and evaluated by the instructor and/or TA

• Feedback provided to the students on a best-effort basis

• Periodic individual/group meetings devoted to the discussion of each phase's deliverables and encountered difficulties

• Final deliverables submitted using Blackboard at the end of the semester

• Final project score based only on the final deliverables

Honor Code Rules

• All students are expected to write and debug their project codes individually or in groups of two
• All homework assignments should be done individually
• Students are encouraged to help and support each other in all problems related to the
  - operation of the CAD tools
  - understanding of an investigated algorithm and existing implementations
  - understanding of the project tasks

ECE 545 Questionnaire