Netlist Security Algorithm Acceleration Using OpenCL on FPGAs

NETLIST SECURITY ALGORITHM ACCELERATION USING OPENCL ON FPGAS

Thesis
Submitted to
The School of Engineering of the
UNIVERSITY OF DAYTON
In Partial Fulfillment of the Requirements for
The Degree of
Master of Science in Computer Engineering

By
Nicholas Michael Pelini

UNIVERSITY OF DAYTON
Dayton, Ohio
August, 2017

NETLIST SECURITY ALGORITHM ACCELERATION USING OPENCL ON FPGAS

Name: Pelini, Nicholas Michael

APPROVED BY:

Eric Balster, Ph.D.
Advisor Committee Chairman
Associate Professor, Department of Electrical and Computer Engineering

Frank Scarpino, Ph.D.
Committee Member
Professor, Department of Electrical and Computer Engineering

John Weber, Ph.D.
Committee Member
Professor, Department of Electrical and Computer Engineering

Robert J. Wilkens, Ph.D., P.E.
Assoc. Dean for Research & Innovation, Professor
School of Engineering

Eddy M. Rojas, Ph.D., M.A., P.E.
Dean, School of Engineering

© Copyright by
Nicholas Michael Pelini
All rights reserved
2017

ABSTRACT

NETLIST SECURITY ALGORITHM ACCELERATION USING OPENCL ON FPGAS

Name: Pelini, Nicholas Michael
University of Dayton
Advisor: Dr. Eric Balster

Integrated circuits continue to grow in number of transistors and design complexity. Production of many of these components is also outsourced to facilities in a number of countries. Therefore, there is a need to ensure that all parts within a system are reliable and free from modification. Verification tools must be able to assess circuits down to the gate level while remaining scalable enough to assess complex designs. In response to this problem, an accelerated version of verification software is proposed to determine whether a manufacturer’s design is the same as a known reference design by comparing the circuits’ netlists. Optimizations are made to the Python code, and an FPGA hardware-accelerated version of the code is created using OpenCL. Results of the OpenCL implementation show an 18x to 24x speedup across various netlists.
Additionally, a netlist previously too large for the verification tools to run can be tested by the OpenCL algorithm.

ACKNOWLEDGMENTS

I want to thank everyone who supported me over the years in my life and education. I would especially like to thank the following people:

• Dr. Eric Balster: For serving as my adviser during graduate school and serving on my thesis committee.
• Kerry Hill, Chris Taylor, and Air Force Research Laboratory: For providing funding and research to make this thesis possible.
• Andrew Kordik, Jonathon Skeans, and everyone at University of Dayton Research Institute Sensor APEX: For introducing me to OpenCL and answering
• Dr. Frank Scarpino: For serving on my thesis committee.
• Dr. John Weber: For serving on my thesis committee.
• My family: For always supporting me.

TABLE OF CONTENTS

ABSTRACT
ACKNOWLEDGMENTS
LIST OF FIGURES
LIST OF TABLES

I. INTRODUCTION
    1.1 Starting Code and Approach
    1.2 Thesis Objective and Organization

II. CIRCUIT BACKGROUND
    2.1 Integrated Circuit Security
    2.2 Netlist Description
    2.3 Circuit Component Background

III. SOFTWARE AND HARDWARE BACKGROUND
    3.1 Python
    3.2 Stratix V FPGA
    3.3 OpenCL
    3.4 Ctypes Package

IV. INTEGRATED CIRCUIT VERIFICATION SOFTWARE
    4.1 Read Netlists and DFFs
    4.2 Fan In
    4.3 Fan Out
    4.4 Comparison of Fan In and Fan Out Signatures

V. PYTHON OPTIMIZATIONS
    5.1 Flatten Gate List into Separate Tuple Lists
    5.2 Hashing Strings into Integers
    5.3 Python Optimization Results

VI. OPENCL IMPLEMENTATION
    6.1 General Algorithm Overview
    6.2 Storing Inputs / Outputs in Memory
    6.3 Fan In OpenCL
    6.4 Fan Out OpenCL
    6.5 Matching Golden Netlist to Manufacturer Netlist

VII. RESULTS
    7.1 Timings of FPC Netlist
    7.2 Timings of FPU Netlist
    7.3 Timings of RISC Processor Netlist
    7.4 Timings of Four Core RISC Processor Netlist

VIII. CONCLUSIONS AND FUTURE WORK
    8.1 Conclusions
    8.2 Future Work

BIBLIOGRAPHY

LIST OF FIGURES

2.1 Diagram of One Gate in a Test Netlist
2.2 Circuit Symbol for a D-type Flip-flop [1]
3.1 Diagram of a Stratix V FPGA [2]
4.1 High Level Overview of the Algorithm
4.2 Diagram of the Original Fan In Function
4.3 Diagram of the Original Fan Out Function
5.1 Flattening the Netlist into Separate Lists Improves Efficiency by Removing Nested List Access
5.2 Integer Comparisons Improve Memory Bandwidth and Comparison Speed
5.3 Python Optimizations Resulted in a 9.7x Speedup of the Algorithm
6.1 High Level Overview of the OpenCL Version of the Algorithm
6.2 Diagram of the Fan In Function in the OpenCL Version of the Algorithm
6.3 Diagram of the FPGA Portion of the Fan In Function
6.4 Diagram of the Fan Out Function in the OpenCL Version of the Algorithm
6.5 Diagram of the FPGA Portion of the Fan Out Function
7.1 Timing Comparison of All ICVS Algorithms Running the FPC Netlist
7.2 Timing Comparison of All ICVS Algorithms Running the FPU Netlist
7.3 Timing Comparison of All ICVS Algorithms Running the RISC Processor Netlist
7.4 Timing of the OpenCL ICVS Implementation Running the Four Core RISC Processor Netlist

LIST OF TABLES

7.1 Average Runtime and Speedup of All ICVS Algorithms Running the FPC Netlist
7.2 Average Runtime and Speedup of All ICVS Algorithms Running the FPU Netlist
7.3 Average Runtime and Speedup of All ICVS Algorithms Running the RISC Netlist
7.4 Average Runtime of the OpenCL ICVS Algorithm Running the Four Core RISC Processor Netlist

CHAPTER I
INTRODUCTION

Electronic systems perform integral roles in many sectors, such as technology, medicine, and government. These systems are responsible for handling sensitive data that must remain secure.
There is a great need to verify that electronic components within a system are reliable and free from alteration. Verification tools must be able to assess circuits down to a low level to certify electronics as reliable and free from modifications. These tools must run faster than current tools, which require manual support on complex designs. Improved tools must also scale to analyze exponentially growing circuit designs.

1.1 Starting Code and Approach

The first implementation of the Integrated Circuit Verification Software (ICVS) is a Python application supplied by the Air Force Research Laboratory (AFRL). The program parses and maps two netlist files: a golden reference design and an unknown design commissioned from a manufacturer. The netlists are text files that detail every component in an electronic circuit along with each component’s input and output connections. Bioinformatics algorithms used in deoxyribonucleic acid (DNA) sequencing are applied to accomplish the mapping of the netlists. Two functions named fan in and fan out generate signatures for each data (delay) flip-flop (DFF) by performing a breadth-first search of depth two in the netlist. This search begins at each DFF and builds a signature containing all gates connected to the DFF within two levels. The golden and unknown netlists’ signatures for each DFF are then compared to provide an initial mapping of the two netlists.

1.2 Thesis Objective and Organization

The objective of this thesis is to accelerate the current Python program and implement a faster, more scalable version using the Open Computing Language (OpenCL). The field-programmable gate array (FPGA) chosen to run the OpenCL portion of the code is the Intel Stratix V. Chapter II of this thesis provides an overview of the necessity of circuit security, test netlists, and relevant circuit components. Chapter III provides an overview of the programming languages used.
Chapter IV walks through the starting code provided by AFRL. Chapter V details the Python optimizations performed to improve the application’s performance. Chapter VI describes the OpenCL-accelerated version of the program. Chapter VII presents the results, comparing the runtime and scalability of all versions of the application. Lastly, Chapter VIII offers conclusions and proposes future work to continue the netlist mapping.

CHAPTER II
CIRCUIT BACKGROUND

The ICVS provides a method to ensure a circuit’s reliability and security. As the need for and availability of more complex circuit designs continue to rise, the integrity of these parts must be maintained. Four netlists with information about every component in a circuit are used to test the ICVS. DFFs in the netlist are the basis of the signatures generated to map the golden and unknown netlists to each other.

2.1 Integrated Circuit Security

Modern integrated circuits (ICs) are extremely complex and are composed of billions of transistors [3]. Vulnerabilities potentially exist within these intricate designs that could leak information, allow unauthorized access, or disable a device without the user’s knowledge [4]. Complicating this issue has been the rise of overseas production. These facilities prove difficult to monitor for trust and security [5].

In many cases, functional testing can reveal a flaw in a modified circuit through an incorrect output. However, testing functionality alone does not reveal the precise component in the circuit that is the root cause of the modified behavior. Other vulnerabilities are not detectable by functional testing at all. In these cases, a modified circuit often produces the expected output, which makes identifying compromised circuits a major challenge. For example, a NOR gate inserted into a full adder circuit can generate a superfluous output along with the expected output.
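
The full adder example can be illustrated with a small Python sketch. It is a hypothetical model, not code from the ICVS: the function names are invented for illustration, and the choice to drive the inserted NOR gate from inputs a and b is an assumption, since the text does not say which nets the gate taps. The sketch shows that a functional test comparing only the sum and carry outputs passes for every input combination even though the modified circuit drives an additional output.

```python
from itertools import product


def full_adder(a, b, cin):
    """Reference (golden) full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def modified_full_adder(a, b, cin):
    """Full adder with one inserted NOR gate driving an extra output.

    Assumption: the NOR gate taps inputs a and b. The expected
    outputs (sum, carry_out) are left untouched.
    """
    s, cout = full_adder(a, b, cin)
    extra = 1 - (a | b)  # NOR of a and b
    return s, cout, extra


def functional_test_passes():
    """Compare only the expected outputs over all eight input vectors."""
    return all(
        full_adder(*bits) == modified_full_adder(*bits)[:2]
        for bits in product((0, 1), repeat=3)
    )


# Values of the superfluous output for inputs 000, 001, ..., 111.
extra_values = [
    modified_full_adder(*bits)[2] for bits in product((0, 1), repeat=3)
]
```

Because the expected outputs match on every input vector, a purely functional comparison cannot distinguish the two circuits in this model; a structural comparison of the netlists, which would list the additional gate, can.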
