Implementing and Benchmarking Seven Round 2 Lattice-Based Key Encapsulation Mechanisms Using a Software/Hardware Codesign Approach

Total Page:16

File Type:pdf, Size:1020Kb

Implementing and Benchmarking Seven Round 2 Lattice-Based Key Encapsulation Mechanisms Using a Software/Hardware Codesign Approach Implementing and Benchmarking Seven Round 2 Lattice-Based Key Encapsulation Mechanisms Using a Software/Hardware Codesign Approach Farnoud Farahmand, Viet B. Dang, Michał Andrzejczak, Kris Gaj George Mason University Co-Authors GMU PhD Students Visiting Scholar Farnoud Viet Michał Farahmand Ba Dang Andrzejczak Military University of Technology in Warsaw, Poland 2 Hardware Benchmarking 3 Round 2 Candidates in Hardware #Round 2 Implemented Percentage candidates in hardware AES 5 5 100% SHA-3 14 14 100% CAESAR 29 28 97% PQC 26 ? ? 4 Software/Hardware Codesign 5 Software/Hardware Codesign Software Hardware Most time-critical operation 6 SW/HW Codesign: Motivational Example 1 Software Software/Hardware Other Other Major 9% 9% ~1% Major Time saved 91% 90% speed-up ≥ 100 91% major operation(s) ~1% major operation(s) in HW 9% other operations 9% other operations in SW Total Speed-Up ≥ 10 7 SW/HW Codesign: Motivational Example 2 Software Software/Hardware Other Major Other 1% ~1% 1% Major Time saved 99% 98% speed-up ≥ 100 99% major operation(s) ~1% major operation(s) in HW 1% other operations 1% other operations in SW Total Speed-Up ≥ 50 8 SW/HW Codesign: Advantages ❖ Focus on a few major operations, known to be easily parallelizable ▪ much shorter development time (at least by a factor of 10) ▪ guaranteed substantial speed-up ▪ high-flexibility to changes in other operations (such as candidate tweaks) ❖ Insight regarding performance of future instruction set extensions of modern microprocessors ❖ Possibility of implementing multiple candidates by the same research group, eliminating the influence of different ▪ design skills ▪ operation subset (e.g., including or excluding key generation) ▪ interface & protocol ▪ optimization target ▪ platform 9 SW/HW Codesign: Potential Pitfalls ❖ Performance & ranking may strongly depend on A. features of a particular platform o Software/hardware interface o Support for cache coherency o Differences in max. clock frequency B. selected hardware/software partitioning C. optimization of an underlying software implementation ❖ Limited insight on ranking of purely hardware implementations First step, not the ultimate solution! 10 Two Major Types of Platforms FPGA Fabric FPGA Fabric, including & Hard-core Processors Soft-core Processors Processor Soft-core w/ Memory Processor & I/O FPGA FPGA Fabric Fabric Examples: Examples: • Xilinx Zynq 7000 System on Chip (SoC) Xilinx Virtex UltraScale+ FPGAs • Xilinx Zynq UltraScale+ MPSoC Intel Stratix 10 FPGAs, including • Intel Arria 10 SoC FPGAs • Xilinx MicroBlaze • Intel Stratix 10 SoC FPGAs • Intel Nios II • RISC-V, originally UC Berkeley 11 Two Major Types of Platform Feature FPGA Fabric and FPGA Fabric with Hard-core Processor Soft-core Processor Processor ARM MicroBlaze, NIOS II, RISC-V, etc. Clock frequency >1 GHz max. 200-450 MHz Portability similar FPGA SoCs various FPGAs, FPGA SoCs, and ASICs Hardware accelerators Yes Yes Instruction set extensions No Yes Ease of design Easy Dependent on a particular (methodology, tools, OS soft-core processor and support) tool chain Xilinx Zynq UltraScale+ MPSoC 1.2 GHz ARM Cortex-A53 + UltraScale+ FPGA logic 12 Choice of a Platform for Benchmarking Embedded Processor: FPGA Architecture: In NIST presentations to date: ARM Cortex-M4 Artix-7 Our recommendation: ARM Cortex-A53 UltraScale+ • No FPGA SoC with ARM Cortex-M4 and Artix-7 on a single chip • Cortex-M4 and Artix-7 more suitable for lightweight designs, Cortex-A53 and UltraScale+ for high performance • Zynq UltraScale+: • capability to compare SW/HW implementations with fully-SW and fully-HW implementations realized using the same chip • likely in use in the first years of the new standard deployments 13 Experimental Setup AXI Lite Main Clock Zynq Processing System Interface AXI Timer e e l e l c c t i u a a f L f F Q r r I I R e e I X X t t A n A n I I AXI Stream AXI Stream Interface AXI DMA Interface e e c t i e a e f L c t r I i a e f L X t r I A n e I X t A n I Hardware Input FIFO FIFO FIFO Output FIFO Interface Accelerator Interface wr_clk rd_clk clk wr_clk rd_clk UUT_clk Clocking wizard All elements located on a single chip 14 Code Release • Full Code & Configuration of the Experimental Setup • Software/Hardware Codesign of Round 1 NTRUEncrypt to be made available at https://cryptography.gmu.edu/athena under PQC by August 31, 2019 15 Our Case Study 16 SW/HW Codesign: Case Study 7 IND-CCA*-secure Lattice-Based Key Encapsulation Mechanisms (KEMs) representing 5 NIST PQC Round 2 Submissions LWE (Learning with Error)-based: NTRU-based: NTRU FrodoKEM • NTRU-HPS RLWR (Ring Learning with Rounding)-based: • NTRU-HRSS Round5 NTRU Prime Module-LWR-based: • Streamlined NTRU Prime • NTRU LPRime Saber * IND-CCA = with Indistinguishability under Chosen Ciphertext Attack 17 SW/HW Partitioning Top candidates for offloading to hardware From profiling: ❖ Large percentage of the execution time ❖ Small number of function calls From manual analysis of the code: ❖ Small size of inputs and outputs ❖ Potential for combining with neighboring functions From knowledge of operations and concurrent computing: ❖ High potential for parallelization 18 Operations Offloaded to Hardware • Major arithmetic operations • Polynomial multiplications • Matrix-by-vector multiplications • Vector-by-vector multiplications • All hash-based operations • (c)SHAKE128, (c)SHAKE256 • SHA3-256, SHA3-512 19 Example: LightSaber Decapsulation GenSecret Other Hash 2.30% 2.40% 3.30% GenMatrix 5.03% InnerProduct 43.52% MatrixVectorMul 43.44% 20 LightSaber Decapsulation Hash GenSecret Other Hardware Other 3.30% 2.30% 2.40% Accelerator 2.40% GenMatrix 8.77% Execution time 5.03% remaining 11.17% InnerProduct 43.52% MatrixVectorMul Execution time saved 43.44% 88.83% Execution time of functions to be moved to hardware 97.60% Accelerator Speed-Up = 97.60/8.77=11.1 Execution time of functions remaining in software Total Speed-Up = 100/11.17=9.0 2.40% 21 Tentative Results 22 Software Implementations Used FrodoKEM, NTRU-HPS, NTRU-HRSS, Saber: Round 2 submission packages – Optimized_Implementation Round5: https://github.com/r5embed/r5embed (2019-07-28) Streamlined NTRU Prime, NTRU LPRime: supercop-20190811 : factored Changes made after the submission of the paper! Results substantially different! New version of the paper available on ePrint soon! 23 Total Execution Time in Software [휇s] Encapsulation 34,609 62,076 16,192 4,500 4,000 3,500 3,000 2,500 2,000 1,500 1,000 500 0 Round5 Saber Str NTRU NTRU NTRU-HRSS NTRU-HPS FrodoKEM Prime LPRime Level 1 Level 2 Level 3 Level 4 Level 5 24 Total Execution Time in Software/Hardware [휇s] Encapsulation 7 1,642 2,186 4⇒6 1,223 600 500 3⇒4 6⇒5 400 300 5⇒3 200 1 2 100 0 Round5 Saber NTRU-HRSS Str NTRU NTRU-HPS NTRU FrodoKEM Prime LPRime Level 1 Level 2 Level 3 Level 4 Level 5 25 Total Speed-ups: Encapsulation 30.0 28.4 25.0 21.1 20.0 17.8 18.1 15.0 13.2 12.7 10.6 10.2 9.3 10.0 8.3 7.9 7.1 5.0 3.4 3.6 2.6 3.2 2.4 2.5 0.0 NTRU-HRSS FrodoKEM Round5 NTRU-HPS Saber NTRU LPRime Str NTRU Prime Level 1 Level 2 Level 3 Level 4 Level 5 26 Accelerator Speed-ups: Encapsulation 200.0 192.2 180.0 160.0 146.4 140.4 140.0 120.0 100.0 80.0 60.0 44.3 46.1 43.5 32.5 40.0 27.5 20.4 15.317.7 21.3 11.1 14.5 20.0 12.3 8.7 10.6 8.5 0.0 NTRU-HRSS NTRU-HPS FrodoKEM NTRU Str NTRU Round5 Saber LPRime Prime Level 1 Level 2 Level 3 Level 4 Level 5 27 SW Part Sped up by HW[%]: Encapsulation 99.59 99.36 98.49 99.55 98.93 99.14 100.00 97.45 98.62 95.03 94.62 89.71 90.00 88.03 80.00 71.44 73.26 70.35 70.00 65.00 63.43 62.53 60.00 50.00 Round5 Saber NTRU-HRSS FrodoKEM NTRU-HPS NTRU Str NTRU LPRime Prime Level 1 Level 2 Level 3 Level 4 Level 5 28 Total Execution Time in Software [휇s] Decapsulation 34,649 62,377 5 6 16,192 12,000 Order reversed 11,000 compared to 10,000 encapsulation 9,000 8,000 3 4 7,000 6,000 5,000 4,000 3,000 2,000 1,000 0 Round5 Saber NTRU Str NTRU NTRU-HPS NTRU-HRSS FrodoKEM LPRime Prime Level 1 Level 2 Level 3 Level 4 Level 5 29 Total Execution Time in Software/Hardware [휇s]: Decapsulation 7 1,866 3,120 1,319 700 3⇒6 600 500 400 300 5⇒3 4 6⇒5 200 1 2 100 0 Round5 Saber NTRU-HPS Str NTRU NTRU-HRSS NTRU FrodoKEM Prime LPRime Level 1 Level 2 Level 3 Level 4 Level 5 30 Total Speed-ups: Decapsulation 130.0 119.3 120.0 110.0 100.0 90.0 80.0 77.4 74.1 70.0 60.0 54.8 45.5 50.0 40.0 38.3 30.0 18.6 20.0 17.9 20.0 12.3 13.4 4.1 9.0 8.1 9.4 9.8 4.5 10.0 3.9 0.0 NTRU-HPS NTRU-HRSS Str NTRU FrodoKEM Saber Round5 NTRU Prime LPRime Level 1 Level 2 Level 3 Level 4 Level 5 31 Accelerator Speed-ups: Decapsulation 250.0 235.7 225.0 200.0 188.1 182.8 175.0 150.0 132.7 112.0 125.0 100.0 86.2 75.0 46.0 46.0 50.0 44.4 32.8 37.7 27.5 24.6 17.7 25.0 11.1 8.1 9.4 9.8 0.0 NTRU-HRSS NTRU-HPS Str NTRU FrodoKEM NTRU Saber Round5 Prime LPRime Level 1 Level 2 Level 3 Level 4 Level 5 32 SW Part Sped up by HW[%]: Decapsulation 100.00 99.25 98.69 98.92 100.00 100.00 99.58 99.18 98.10 98.41 97.11 100.00 98.53 97.60 96.78 93.96 90.00 79.22 77.46 80.00 76.47 70.00 60.00 50.00 Round5 NTRU-HPS NTRU-HRSS Str NTRU Saber FrodoKEM NTRU Prime LPRime Level 1 Level 2 Level 3 Level 4 Level 5 33 Conclusions ❖ Total speed-ups ▪ for encapsulation from 2.4 (Str NTRU Prime) to 28.4 (FrodoKEM) ▪ for decapsulation from 3.9 (NTRU LPRime) to 119.3 (NTRU-HPS) ❖ Total speed-up dependent on the percentage of the software execution time taken by functions offloaded to hardware and the amount of acceleration itself ❖ Hardware accelerators thoroughly optimized using Register-Transfer Level design methodology ❖ Determining optimal software/hardware partitioning
Recommended publications
  • Configurable RISC-V Softcore Processor for FPGA Implementation
    1 Configurable RISC-V softcore processor for FPGA implementation Joao˜ Filipe Monteiro Rodrigues, Instituto Superior Tecnico,´ Universidade de Lisboa Abstract—Over the past years, the processor market has and development of several programming tools. The RISC-V been dominated by proprietary architectures that implement Foundation controls the RISC-V evolution, and its members instruction sets that require licensing and the payment of fees to are responsible for promoting the adoption of RISC-V and receive permission so they can be used. ARM is an example of one of those companies that sell its microarchitectures to participating in the development of the new ISA. In the list of the manufactures so they can implement them into their own members are big companies like Google, NVIDIA, Western products, and it does not allow the use of its instruction set Digital, Samsung, or Qualcomm. (ISA) in other implementations without licensing. The RISC-V The main goal of this work is the development of a RISC- instruction set appeared proposing the hardware and software V softcore processor to be implemented in an FPGA, using development without costs, through the creation of an open- source ISA. This way, it is possible that any project that im- a non-RISC-V core as the base of this architecture. The plements the RISC-V ISA can be made available open-source or proposed solution is focused on solving the problems and even implemented in commercial products. However, the RISC- limitations identified in the other RISC-V cores that were V solutions that have been developed do not present the needed analyzed in this thesis, especially in terms of the adaptability requirements so they can be included in projects, especially the and flexibility, allowing future modifications according to the research projects, because they offer poor documentation, and their performances are not suitable.
    [Show full text]
  • A Superscalar Out-Of-Order X86 Soft Processor for FPGA
    A Superscalar Out-of-Order x86 Soft Processor for FPGA Henry Wong University of Toronto, Intel [email protected] June 5, 2019 Stanford University EE380 1 Hi! ● CPU architect, Intel Hillsboro ● Ph.D., University of Toronto ● Today: x86 OoO processor for FPGA (Ph.D. work) – Motivation – High-level design and results – Microarchitecture details and some circuits 2 FPGA: Field-Programmable Gate Array ● Is a digital circuit (logic gates and wires) ● Is field-programmable (at power-on, not in the fab) ● Pre-fab everything you’ll ever need – 20x area, 20x delay cost – Circuit building blocks are somewhat bigger than logic gates 6-LUT6-LUT 6-LUT6-LUT 3 6-LUT 6-LUT FPGA: Field-Programmable Gate Array ● Is a digital circuit (logic gates and wires) ● Is field-programmable (at power-on, not in the fab) ● Pre-fab everything you’ll ever need – 20x area, 20x delay cost – Circuit building blocks are somewhat bigger than logic gates 6-LUT 6-LUT 6-LUT 6-LUT 4 6-LUT 6-LUT FPGA Soft Processors ● FPGA systems often have software components – Often running on a soft processor ● Need more performance? – Parallel code and hardware accelerators need effort – Less effort if soft processors got faster 5 FPGA Soft Processors ● FPGA systems often have software components – Often running on a soft processor ● Need more performance? – Parallel code and hardware accelerators need effort – Less effort if soft processors got faster 6 FPGA Soft Processors ● FPGA systems often have software components – Often running on a soft processor ● Need more performance? – Parallel
    [Show full text]
  • Intel® Quartus® Prime Design Suite Version 18.1 Update Release Notes
    Intel® Quartus® Prime Design Suite Version 18.1 Update Release Notes Updated for Intel® Quartus® Prime Design Suite: 18.1.1 Standard Edition Subscribe RN-01080-18.1.1.0 | 2019.04.17 Send Feedback Latest document on the web: PDF | HTML Contents Contents 1. Intel® Quartus® Prime Design Suite Version 18.1 Update Release Notes........................ 3 2. Issues Addressed in Update 1......................................................................................... 4 2.1. Intel Quartus Prime Pro Edition Software.................................................................. 4 2.2. Intel Quartus Prime Standard Edition Software.......................................................... 7 2.3. IP and IP Cores..................................................................................................... 8 2.4. DSP Builder for Intel FPGAs...................................................................................12 2.5. Intel High Level Synthesis Compiler........................................................................12 2.6. Intel FPGA SDK for OpenCL*................................................................................. 13 3. Issues Addressed in Update 2....................................................................................... 15 3.1. Intel Quartus Prime Pro Edition Software.................................................................15 3.2. IP and IP Cores................................................................................................... 15 3.3. Intel FPGA SDK for OpenCL..................................................................................
    [Show full text]
  • Intel® Industrial Iot Workshop Security for Industrial Platforms
    Intel® Industrial IoT workshop Security for industrial platforms Gopi K. Agrawal Security Architect IOTG Technical Sales & Marketing Intel Corporation Legal © 2018 Intel Corporation No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document. Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade. This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest Intel product specifications and roadmaps. Intel technologies' features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at www.intel.com. Intel, the Intel logo, are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest Intel product specifications and roadmaps No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document. Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure.
    [Show full text]
  • FPGA Architecture: Survey and Challenges Full Text Available At
    Full text available at: http://dx.doi.org/10.1561/1000000005 FPGA Architecture: Survey and Challenges Full text available at: http://dx.doi.org/10.1561/1000000005 FPGA Architecture: Survey and Challenges Ian Kuon University of Toronto Toronto, ON Canada [email protected] Russell Tessier University of Massachusetts Amherst, MA USA [email protected] Jonathan Rose University of Toronto Toronto, ON Canada [email protected] Boston – Delft Full text available at: http://dx.doi.org/10.1561/1000000005 Foundations and Trends R in Electronic Design Automation Published, sold and distributed by: now Publishers Inc. PO Box 1024 Hanover, MA 02339 USA Tel. +1-781-985-4510 www.nowpublishers.com [email protected] Outside North America: now Publishers Inc. PO Box 179 2600 AD Delft The Netherlands Tel. +31-6-51115274 The preferred citation for this publication is I. Kuon, R. Tessier and J. Rose, FPGA Architecture: Survey and Challenges, Foundations and Trends R in Elec- tronic Design Automation, vol 2, no 2, pp 135–253, 2007 ISBN: 978-1-60198-126-4 c 2008 I. Kuon, R. Tessier and J. Rose All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, mechanical, photocopying, recording or otherwise, without prior written permission of the publishers. Photocopying. In the USA: This journal is registered at the Copyright Clearance Cen- ter, Inc., 222 Rosewood Drive, Danvers, MA 01923. Authorization to photocopy items for internal or personal use, or the internal or personal use of specific clients, is granted by now Publishers Inc for users registered with the Copyright Clearance Center (CCC).
    [Show full text]
  • Demystifying Internet of Things Security Successful Iot Device/Edge and Platform Security Deployment — Sunil Cheruvu Anil Kumar Ned Smith David M
    Demystifying Internet of Things Security Successful IoT Device/Edge and Platform Security Deployment — Sunil Cheruvu Anil Kumar Ned Smith David M. Wheeler Demystifying Internet of Things Security Successful IoT Device/Edge and Platform Security Deployment Sunil Cheruvu Anil Kumar Ned Smith David M. Wheeler Demystifying Internet of Things Security: Successful IoT Device/Edge and Platform Security Deployment Sunil Cheruvu Anil Kumar Chandler, AZ, USA Chandler, AZ, USA Ned Smith David M. Wheeler Beaverton, OR, USA Gilbert, AZ, USA ISBN-13 (pbk): 978-1-4842-2895-1 ISBN-13 (electronic): 978-1-4842-2896-8 https://doi.org/10.1007/978-1-4842-2896-8 Copyright © 2020 by The Editor(s) (if applicable) and The Author(s) This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Open Access This book is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. The images or other third party material in this book are included in the book’s Creative Commons license, unless indicated otherwise in a credit line to the material.
    [Show full text]
  • (10) Patent No.: US 9037807 B2
    US009037807B2 (12) United States Patent (10) Patent No.: US 9,037,807 B2 Vorbach (45) Date of Patent: May 19, 2015 (54) PROCESSOR ARRANGEMENT ON A CHIP Sep. 17, 2001 (DE) .................................. 101 45 792 INCLUDING DATA PROCESSING, MEMORY, Sep. 17, 2001 (DE) ... ... 101 45795 AND INTERFACE ELEMENTS Sep. 19, 2001 (DE) .................................. 101 46132 Sep. 30, 2001 (WO). ... PCT/EPO1/11299 (75) Inventor: Martin Vorbach, Munich (DE) Oct. 8, 2001 (WO) ....................... PCT/EPO1/11593 Nov. 5, 2001 (DE) .................................. 101 54. 259 (73) Assignee: srecinologies AG, Nov. 5, 2001 (DE) ... ... 101 54 260 Dec. 14, 2001 (EP) ..................................... O1129923 (*) Notice: Subject to any disclaimer, the term of this Jan. 18, 2002 (EP) ..................................... O2OO1331 patent is extended or adjusted under 35 Jan. 19, 2002 (DE). 102 O2 044 U.S.C. 154(b) by 0 days. Jan. 20, 2002 (DE) 102 O2 175 Feb. 15, 2002 (DE) 102 O2 653 (21) Appl. No.: 12/944,068 Feb. 18, 2002 (DE) ... ... 102 O6856 Feb. 18, 2002 (DE) ... ... 102 O6857 (22) Filed: Nov. 11, 2010 Feb. 21, 2002 (DE) ... ... 102 O7 224 Feb. 21, 2002 (DE) ... ... 102 O7 225 (65) Prior Publication Data Feb. 21, 2002 (DE) .................................. 102 O7 226 US 2011 FOO60942 A1 Mar. 10, 2011 (51) Int. Cl. O O G06F 3/4 (2006.01) Related U.S. Application Data G06F II/20 (2006.01) (60) Division of application No. 12/496.012, filed on Jul. 1, G06F 3/16 (2006.01) 2009, now abandoned, which is a continuation of G06F 12/00 (2006.01) application No. 10/471.061, filed as application No.
    [Show full text]
  • Intel® Stratix® 10 General Purpose I/O User Guide
    Intel® Stratix® 10 General Purpose I/O User Guide Updated for Intel® Quartus® Prime Design Suite: 21.2 Subscribe UG-S10GPIO | 2021.07.07 Send Feedback Latest document on the web: PDF | HTML Contents Contents 1. Intel® Stratix® 10 I/O Overview..................................................................................... 4 1.1. Intel Stratix 10 I/O and Differential I/O Buffers..........................................................5 1.2. Intel Stratix 10 I/O Migration Support...................................................................... 6 2. Intel Stratix 10 I/O Architecture and Features............................................................... 8 2.1. I/O Standards and Voltage Levels in Intel Stratix 10 Devices....................................... 8 2.1.1. Intel Stratix 10 I/O Standards Support......................................................... 9 2.1.2. Intel Stratix 10 I/O Standards Voltage Support............................................ 10 2.2. I/O Element Structure in Intel Stratix 10 Devices..................................................... 12 2.2.1. I/O Bank Architecture in Intel Stratix 10 Devices..........................................13 2.2.2. I/O Buffer and Registers in Intel Stratix 10 Devices...................................... 14 2.3. Programmable IOE Features in Intel Stratix 10 Devices............................................. 15 2.3.1. Programmable Output Slew Rate Control.....................................................17 2.3.2. Programmable IOE Delay.........................................................................
    [Show full text]
  • Product Change Notification
    Product Change Notification Change Notification #: 117176 - 00 Change Title: Intel® Stratix® 10, PCN 117176-00, Documentation, Intel® Stratix® 10 Device Datasheet Update Date of Publication: September 27, 2019 Key Characteristics of the Change: Documentation Forecasted Key Milestones: September 27, 2019 Availability of Intel Stratix 10 device datasheet update: Description of Change to the Customer: Intel’s Network & Custom Logic Group (formerly known as the Programmable Solutions Group) is notifying customers of an important documentation update for Intel Stratix® 10 devices. It is necessary to update the datasheet with the new specifications, as the previous specifications were determined to be inaccurate. There is no change to the Intel® Stratix 10 product silicon and materials. Please review the revision history in the Intel Stratix 10 device datasheet for the complete history of updates. The Intel Stratix 10 device datasheet can be found here: https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/hb/stratix-10/s10_datasheet.pdf Customer Impact of Change and Recommended Action: Customers are requested to take note of the changes and determine the impact on their designs. For more information, please contact your local Field Applications Engineer (FAE) or submit a Service Request at the My Intel support page. Products Affected / Intel Ordering Codes: All Intel Stratix 10 devices. The list of affected part numbers (OPNs) can be downloaded in Excel form: https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/pcn/adv1915-opn-list.xlsx PCN Revision History: Date of Revision: Revision Number: Reason: September 27, 2019 00 Originally Published PCN Page 1 of 2 PCN #117176 - 00 Product Change Notification 117176 - 00 INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS.
    [Show full text]
  • Xilinx Vivado – „EDK” Embedded Development) 4
    EFOP-3.4.3-16-2016-00009 A fels őfokú oktatás min őségének és hozzáférhet őségének együttes javítása a Pannon Egyetemen EMBEDDED SYSTEM DEVELOPMENT (MISAM154R) Created by Zsolt Voroshazi, PhD [email protected] Updated: 02 March. 2020. 2. FPGAS & PLATFORMS Embedded Systems Topics covered 1. Introduction – Embedded Systems 2. FPGAs, Digilent ZyBo development platform 3. Embedded System - Firmware development environment (Xilinx Vivado – „EDK” Embedded Development) 4. Embedded System - Software development environment (Xilinx VITIS – „SDK”) 5. Embedded Base System Build (and Board Bring-Up) 6. Adding Peripherals (from IP database) to BSB 7. Adding Custom (=own) Peripherals to BSB 8. Design and Development of Complex IP cores and applications (e.g. camera/video/ audio controllers) 3 Further references • XILINX official website: http://www.xilinx.com • EE Journal – Electronic Engineering: http://www.eejournal.com/design/embedded • EE Times - News: http://www.eetimes.com/design/embedded 4 PLD & FPGA CIRCUITS General description PAST … • Before ’80s, designing logic networks for digital circuits, modern development tools were not yet available as today. • The design of high complexity (multi I/O) logical combination and sequential networks was therefore slow and cumbersome, often coupled with paper- based design, multiple manual checks, and calculations. • We could not talk about advanced design and simulation tools (CAD) either, so there was a high probability of error in a prototype design. 6 … AND PRESENT • Today, all of these are available in an automated way (EDA - Electronic Design Automation), which, in addition to the use of Programmable Logic Devices (PLD), is relatively fast for both Printed Circuit Boards (PCB) and Application-specific Integrated Circuits and Processors (ASIC / ASSP).
    [Show full text]
  • CHERI Concentrate: Practical Compressed Capabilities
    1 CHERI Concentrate: Practical Compressed Capabilities Jonathan Woodruff, Alexandre Joannou, Hongyan Xia, Anthony Fox, Robert Norton, Thomas Bauereiss, David Chisnall, Brooks Davis, Khilan Gudka, Nathaniel W. Filardo, A. Theodore Markettos, Michael Roe, Peter G. Neumann, Robert N. M. Watson, Simon W. Moore Abstract—We present CHERI Concentrate, a new fat-pointer compression scheme applied to CHERI, the most developed capability-pointer system at present. Capability fat pointers are a primary candidate to enforce fine-grained and non-bypassable security properties in future computer systems, although increased pointer size can severely affect performance. Thus, several proposals for capability compression have been suggested elsewhere that do not support legacy instruction sets, ignore features critical to the existing software base, and also introduce design inefficiencies to RISC-style processor pipelines. CHERI Concentrate improves on the state-of-the-art region-encoding efficiency, solves important pipeline problems, and eases semantic restrictions of compressed encoding, allowing it to protect a full legacy software stack. We present the first quantitative analysis of compiled capability code, which we use to guide the design of the encoding format. We analyze and extend logic from the open-source CHERI prototype processor design on FPGA to demonstrate encoding efficiency, minimize delay of pointer arithmetic, and eliminate additional load-to-use delay. To verify correctness of our proposed high-performance logic, we present a HOL4 machine-checked proof of the decode and pointer-modify operations. Finally, we measure a 50% to 75% reduction in L2 misses for many compiled C-language benchmarks running under a commodity operating system using compressed 128-bit and 64-bit formats, demonstrating both compatibility with and increased performance over the uncompressed, 256-bit format.
    [Show full text]
  • Stratix II Vs. Virtex-4 Power Comparison & Estimation Accuracy
    White Paper Stratix II vs. Virtex-4 Power Comparison & Estimation Accuracy Introduction This document compares power consumption and power estimation accuracy for Altera® Stratix® II FPGAs and Xilinx Virtex-4 FPGAs. The comparison addresses all components of power: core dynamic power, core static power, and I/O power. This document uses bench-measured results to compare actual dynamic power consumption. To compare power estimation accuracy, the analysis uses the vendor-recommended power estimation software tools. The summary of these comparisons are: Altera’s Quartus® II PowerPlay power analyzer tool is accurate (to within 20%), while Xilinx’s tools are significantly less accurate. Stratix II devices exhibit lower dynamic power than Virtex-4 devices, resulting in total device power that is equal. Having an accurate FPGA power estimate is important to avoid surprises late in the design and prototyping phase. Inaccurate estimates can be costly and cause design issues, including: board re-layout, changes to power-management circuitry, changes cooling solution, unreliable FPGA operation, undue heating of other components, and changes to the FPGA design. Furthermore, without accurate power estimates, it is impossible for the designer and FPGA CAD software to optimize design power. This white paper contains the following sections: Components of total device power Power estimation and measurement methodology Core dynamic power comparison – power tool accuracy and bench measurements Core Static power comparison I/O power comparison Total device power summary For competitive comparisons on performance and density between Stratix II and Virtex-4 devices, refer to the following white papers from the Altera web site: Stratix II vs.
    [Show full text]