
Development of Trust Metrics for Quantifying Design Integrity and Error Implementation Cost

DISSERTATION

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By

Adam G. Kimura

Graduate Program in Electrical & Computer Engineering

The Ohio State University

2017

Dissertation Committee:

Dr. Steve Bibyk, Advisor

Dr. Lisa Fiorentini

Dr. Ayman Fayed

Copyright by

Adam G. Kimura

©2017

Abstract

One of the major concerns in the Integrated Circuit (IC) industry today is the issue of Hardware Trust. This problem has arisen as a result of increased outsourcing and the integration of more third-party Intellectual Property (IP) into designs. Trusted Microelectronics is a new field of research that has emerged to address these hardware assurance concerns. Trojan Detection, Design for Security and Trust, Trusted Supply Chain Management, Trusted Design Verification, Anti-Counterfeiting, and Vulnerability & Attack Mitigation are the major sub-fields of Trusted Microelectronics where research progress is being made. There is, however, currently a lack of well-defined metrics for quantifying Hardware Trust. As such, developing a portfolio of Trust Metrics is a needed contribution in the Trusted Microelectronics space that will also bring value to the other sub-fields of Trust.

In this work, a Trust Metric Solution Space is defined in order to establish a roadmap for developing Trust Metrics. The Solution Space also creates a coalescing point into which the metrics work being conducted in other communities can integrate. An Error Implementation Cost (EIC) measure is developed as a technique to quantify errors and to allow error ranking and rating. Four MIPS Processor test cases containing embedded errors are utilized to show that the EIC scoring can create quantifiable differentiation between errors of varying severity. Errors 1 and 2 were shown to be the least severe, with System Payloads of 0.0181 and 0.0010 respectively. Errors 3 and 4 were shown to be more severe, with System Payloads of 0.5140 and 0.3216 respectively.


The EIC scoring is then used to assist in developing Test Articles (TA) for example case scenarios that contain embedded errors. A 32-bit Floating Point Adder, a Fixed to Floating Point Converter, a MIPS Processor, and a Full System TA were developed in order to apply the techniques for evaluating hardware integrity. The Design Integrity (DI) analysis parses the design into five sub-domain profiles (Logical Equivalence, Power Consumption, Signal Activity Rate, Structural Architecture, and Functional Correctness) in order to track deviation away from its intended reference profile. The deviation measurements were unique to each domain on a [0, 1] scale and were aggregated together to produce a DI metric on a [0, 5] scale that can be correlated to Hardware Trust. The DI analysis was then applied to a more complex test article with embedded errors in order to show its effectiveness in quantifying integrity. The result of the analysis demonstrated how several different TAs of varying integrity could be quantitatively ranked with higher or lower Trust. TA0 had the highest Trust at 1.00/1.00, followed by TA3 and TA4, both at 0.88/1.00. TA1 had a measure of 0.65/1.00 and TA2 had the worst Trust at 0.59/1.00. A metric for Reference Quality is also proposed since the DI analysis hinges heavily on the utilization of design references. The Reference Quality metric is utilized in conjunction with the DI metric to arrive at a final Trust Measure Figure of Merit.

Finally, future work and recommendations are made for continuing the progress this work has made. Recommendations for future work would involve exploring nonlinear distance measures to track deviation from the expected reference profile, as well as developing enough design cases to define a probabilistic model of Trust. Additional work could also be directed towards determining the non-uniformity of the Correlation Factor weighting of the aggregated domains and exploring new sub-domain profiles of Design Integrity.


Dedication

I dedicate the collective work represented by this document to my wonderful wife, Heather Rachel and to my precious daughter, Eden Megumi.

Eden, the things in life that are worth pursuing never come easy and will always have a cost that requires your unrelenting determination and perseverance.


Acknowledgments

This accomplishment would not have been possible without the support and encouragement of many people. Truly, I could not have reached this milestone in my life without each of them.

To Dr. Steve Bibyk, my wonderful advisor; I will be forever indebted to you for the support you’ve given me and for your belief in me over these past several years. The odds were not in my favor in the beginning, yet you still saw something in me that convinced you to take me on as your advisee. The impact you’ve had in this season of my life cannot be overstated. There were many times that I struggled with internal doubts regarding my ability, and your words were always a refreshing surge of encouragement that I needed to keep going. Particularly, the conversation we had in the lobby of the old ESL building after I was officially admitted into the PhD program will be forever etched into my memory. Thank you. You have been an inspiration and are someone whom I have come to regard with the highest respect and honor. I will whole-heartedly miss our conversations regarding life, politics, and philosophy.

To all the sponsors of my research, you made this accomplishment financially possible. To Dr. Greg Creech, thank you for believing in me and bringing me into the Electro Science Laboratory to be part of this world class lab. Your ability to articulate the big picture vision, as well as the way you connect people and organizations to work together, is something that I am constantly impressed by. It has been an honor to work with you these last several years. To the DAGSI organization, thank you for awarding me with the research fellowship that supported the last two years of my research. Finally, thank you to Dr. John Merrill and the rest of the EEIC staff for their support in the years of my Master of Science studies.

To my PhD committee members, thank you for investing in my research and challenging me to make it the best it could be. Dr. Joanne DeGroat, thank you for teaching me the fundamentals of VHDL and Verification. The foundation you laid has served me very well over these years. Dr. Ümit Çatalyürek, thank you for the knowledge you imparted to me regarding Computer Architecture and Embedded Systems. It has likewise served me well when my research was in the nascent stages of development. To Dr. Lisa Fiorentini, thank you for your endorsement of both my PhD and fellowship applications. I cannot overstate my gratitude to you for your support as well as your service on my PhD committees. To Dr. Ayman Fayed, your teaching of Analog VLSI has made a lasting impression on me. Your ability to teach complex subject matter in a story-like fashion keeps students engaged and is inspiring to anyone who has a passion for teaching.

To Dr. Brian Dupaix, you are someone that many people in our lab look to as an older brother and mentor. Thank you for your insights into my work and for your candid corrections when I’ve needed them. I have found in life, it is the people who tell you what you do not want to hear, no matter how difficult, that truly value you and are invested in your success.

To Gus Fragasse, thank you for your contribution of the IP-XACT work. You were one of the best Masters students I had the pleasure of working with. You are an exceptionally hard worker who always delivered on what you promised. The traits you possess have become rare to find these days in young engineers. As such, thank you for being such a breath of fresh air.

To Dr. Steven Sebo, Phil Davids, John Daulton, Joe Moore, Jim Worman, Tom Lease, and David Herron, thank you for your endorsement and encouragement to pursue higher education. The decision to leave my well-established career in the industry and come back to school was nerve-racking to say the least. Thank you for helping me process my goals at the start and for the application support that helped get me admitted into the OSU Graduate Program.

To the ECE department guidance counselors, thank you for being an outside source of encouragement to talk to. You are not shown near enough appreciation for the work you do and the support you offer to students. Susan Noble, you have been someone very dear to me, since the adversity I faced as an undergraduate. And to Tricia Toothman, I have appreciated all of our discussions and the guidance you’ve given me through the ups and downs of my time here. Thank you for your belief in me since the beginning of my graduate studies.

To Mark Scott, you have become a dear friend in my time here at OSU. Your devotion to your research and to excellence is something I admire about you as it has challenged me in my own work. I’ve appreciated all the discussions we’ve had over the years regarding our research, life, and God.

To Kai-wei Lui, Siddharth Prabhu, Adithya Jayakumar, Mariant Gutierrez Soto, Adit Joshi, Rolland Tallos, Luke Duncan, Chris Taylor, Shane Smith, Daron DiSabato, Monica Okon, Miriam Simon, and Kadri Parris, your personalities have added a richness to this experience that I will think back fondly on for many years to come.

To Gary Nickrand, Ben Lloyd, and Brandon Henderson, you three have become dear friends whose commiserations through the challenges and pressures that come with higher education and life have served me well and will never be forgotten. I cherish all three of you and am grateful for the advice and support you offered early on when I decided to come back to the university.

To Jenny Barton and Kate Holland, I would not have been able to continue or finish my studies if you had not cared for my precious daughter while I was in the lab and my wife was at work. I can't express enough how knowing that Eden was at home safe and being cared for enabled me to push through the long days and nights without worry. Thank you.

To the best mother-in-law one could ask for, Bethann Silvey; I will be forever indebted to you for your presence around our home over these years, filling in the gaps when I was not around. You were instrumental in helping to keep our family going. There were many times that I questioned if the family could handle "just one more semester." You certainly played a role that undeniably supported us until the end. To Steve Silvey, thank you for your words of encouragement and prayers. There were many times that I found myself amazed how difficult situations worked themselves out with the best outcome.

To Mom and Dad, thank you for bearing with me through the holidays where I was not completely present due to project and paper deadlines. Thank you for instilling the value of education and for pushing me to be disciplined and not choose the easy paths in life.

And to Heather, the most supportive wife I could have asked for; I will be indebted to you for as long as I live for the amount of sacrifice you have willingly and selflessly made for me. We both know that I could not have finished this without you being by my side. Thank you for putting your dreams on hold in order for me to pursue mine.

Finally, and most of all, I am thankful to the Lord Jesus for the continued grace I have experienced since coming to know Him in a deeply personal way. He is the reason I am here today as a changed person. My hope and prayer is that He would be glorified by all of the work and research I have conducted in this field.


Vita

2005 ...... Electrical Designer, Apex Machine Company, Ft. Lauderdale, FL
2006 - 2007 ...... Electrical Designer, K & H Energy LLC, Dublin, OH
2007 ...... B.S. Electrical & Computer Engineering, German Language & Literature - The Ohio State University
2007 - 2011 ...... Electrical Engineer, Cockerill Maintenance & Ingénierie Groupe, Westerville, OH
2012 - 2014 ...... Graduate Teaching Associate, Fundamentals of Engineering, The Ohio State University
2014 - 2017 ...... Graduate Research Associate, OSU Electro Science Lab
2017 – present ...... Cyber Embedded Systems Engineer, Battelle Memorial Institute, Columbus, OH

Publications

A. Kimura, S. Bibyk, B. Dupaix, M. Casto, G. Creech, Metrics for Analyzing Quantifiable Differentiation of Designs with Varying Integrity for Hardware Assurance, ©2017 GOMAC Tech Conference

A. Kimura, S. Bibyk, Quantifying Metrics for Analyzing Integrated Circuit Design Integrity, Midwest Symposium on Circuits and Systems, ©2016 IEEE

A. Kimura, S. Bibyk, M. Casto, B. Dupaix, G. Creech, Quantifying Error Payload and Error Implementation Cost for Hardware Assurance, ©2016 GOMAC Tech Conference

M. Barber, A. Kimura, K. Sexton, S. Bibyk, Assured Hardware/Software Design for Integrated Circuit Design Trust Flow, ©2015 GOMAC Tech Conference

A. Kimura, K. Liu, S. Prabhu, S. Bibyk, G. Creech, Trusted Verification Test Bench Development for Phase-Locked Loop (PLL) Hardware Insertion, Midwest Symposium on Circuits and Systems, no. 6674871, pp. 1208-1211, ©2013 IEEE

W. Kearns, A. Kimura, A. Seibert, Optimized Wire Coil Batch Pickling Plant Design via Computer-aided Modeling, ©2011 The Wire Association International Inc.

Fields of Study

Major Field: Electrical and Computer Engineering

TABLE OF CONTENTS

Abstract ...... ii
Dedication ...... iv
Acknowledgments ...... v
Vita ...... ix
Publications ...... ix
Fields of Study ...... ix
1. INTRODUCTION ...... 1
1.1. Deviation from the Design Specification ...... 1
1.2. The Rising Concern of Hardware Trust ...... 2
1.3. The Trusted Microelectronics Space ...... 4
1.4. The Need for Trust Metrics ...... 7
1.5. Previous Research ...... 8
1.5.1. Scoring Overt Hardware Attacks ...... 8
1.5.2. Quantifying Supplier Trustworthiness ...... 10
1.5.3. Quantifying System Level Trustworthiness ...... 11
1.5.4. Quantifying Vulnerability ...... 12
1.5.5. Quantifying Trojan Presence ...... 14
1.6. Problem Statement and Document Organization ...... 14
2. CONSTRAINING THE TRUST METRIC SOLUTION SPACE ...... 17
2.1. Reference Establishment Domain ...... 20
2.2. Design Integrity Analysis Domain ...... 20
2.3. Test Article Development Domain ...... 22
2.4. Error Analysis Domain ...... 23
2.5. Hardware Error Taxonomy Domain ...... 25
2.6. Vulnerability Analysis Domain ...... 25
2.7. Produced Metrics ...... 26
3. ERROR IMPLEMENTATION COST ...... 27
3.1. Error Scoring by Inspection ...... 28
3.1.1. Error Scoring of Trojan Inserted into an ALU with EIC ...... 30
3.2. Error Scoring Through Objective Measurement Techniques ...... 32
3.2.1. Error Payload ...... 32
3.2.2. Payload for ALU Controller Embedded Error ...... 33
3.3. Error Realization ...... 34
3.4. Framework for Test Article Development ...... 38
4. TEST ARTICLE DEVELOPMENT ...... 39
4.1. Establishing a Reference for Developing Metrics ...... 40
4.2. Taxonomy of Hardware Errors ...... 45
4.2.1. Action Characteristics ...... 45
4.2.2. Activation Characteristics ...... 46
4.2.3. Physical Characteristics ...... 46
4.3. 32-Bit Floating Point Adder Test Article ...... 48
4.3.1. Reference Floating Point Adder Model ...... 53
4.3.2. Error Insertion into Floating Point Adder Models ...... 55
4.4. Fixed Point to Floating Point Converter Test Article ...... 56
4.5. Floating Point Adder with Fixed Point Conversion (Full System) Test Article ...... 57
4.5.1. Reference Floating Point Adder with Fixed Point Conversion Model ...... 58
4.5.2. Error Insertion into Full System ...... 60
4.6. MIPS Processor Test Article ...... 61
4.6.1. Operation of MIPS Processor ...... 62
4.6.2. Arithmetic Logic Unit (ALU) and ALU Controller Test Article ...... 65
4.6.3. Integrating Corrupted ALU Controllers into Larger MIPS Processor System ...... 66
4.6.4. Error Realization in the ALU Controller ...... 71
5. QUANTIFYING DESIGN INTEGRITY ...... 77
5.1. Multi-Pass Approach to Trust Verification ...... 78
5.2. Discretized Design Integrity Model ...... 80
5.3. Logical Equivalence Integrity ...... 83
5.3.1. Utilization of Formal Verification for Determining LEintegrity ...... 83
5.3.2. Logical Equivalence Integrity Domain Test Article ...... 87
5.4. Power Consumption Integrity ...... 90
5.4.1. Power Consumption Integrity Domain Test Article ...... 91
5.5. Signal Activity Rate ...... 93
5.5.1. Signal Activity Rate Integrity Domain Test Article ...... 94
5.6. Functional Integrity ...... 96
5.6.1. Functional Integrity Domain Test Article ...... 97
5.7. Structural Architecture Integrity ...... 98
5.8. Aggregation of DI Techniques on Simple Test Cases ...... 100
6. DESIGN INTEGRITY METRICS APPLIED TO TEST CASE EMBEDDED SYSTEM ...... 102
6.1. Full System Test Cases ...... 103
6.2. Quantifying the Reference Quality ...... 104
6.3. Distance Measures Relating Trust Measures to Error Implementation Cost ...... 107
7. IP-XACT ...... 112
7.1. IP-XACT Hierarchy ...... 113
7.1.1. Use Model of IP-XACT ...... 114
7.1.2. Acceptance in the EDA Community ...... 117
7.2. Analyzing Trust with IP-XACT ...... 118
8. CONCLUSION ...... 121
8.1. Contribution ...... 123
8.2. Recommendations for Future Work ...... 124
8.3. The Broader Impact ...... 127


LIST OF TABLES

Table 1 – Component Scoring Rubric for the Error Implementation Cost Function ...... 29
Table 2 – Description of Errors Inserted into ALU Controllers ...... 32
Table 3 – Special Case Test Vectors for FPA Coverage ...... 50
Table 4 – Test Vectors for Boundary Coverage ...... 51
Table 5 – Full System Test Article Sample Data ...... 59
Table 6 – Error Insertion into Full System Test Article ...... 60
Table 7 – MIPS Instruction Set ...... 62
Table 8 – ALU Controller Operations ...... 66
Table 9 – Measured System Payload on MIPS from Corrupted Controller ...... 69
Table 10 – Realization of Errors 1 and 2 Embedded in ALU Controller ...... 72
Table 11 – Realization of Errors 3 and 4 Embedded in ALU Controller ...... 73
Table 12 – Error Realization Results from Embedded MIPS Errors ...... 75
Table 13 – Results of Power Consumption Integrity Evaluation on Test Article ...... 93
Table 14 – Results of Signal Rate Integrity Evaluation on Test Article ...... 95
Table 15 – Results of Functional Integrity Evaluation on Test Article ...... 97
Table 16 – Results of Structural Architecture Integrity Evaluation on Test Article ...... 99
Table 17 – Results of Design Integrity Analysis on Test Case ...... 101
Table 18 – Design Integrity Results for the Full System Test Cases ...... 104
Table 19 – Description of References ...... 105
Table 20 – Comparison of Different References Types ...... 106
Table 21 – Test Article Error Scoring and Trust Measure across References ...... 108
Table 22 – Distance Matrix of Normalized EIC and TM for each Reference ...... 108
Table 23 – Distance Matrix of Normalized Payload and TM for each Reference ...... 110


LIST OF FIGURES

Figure 1 – Insertion Points for Hardware Errors in Design Flow ...... 4
Figure 2 – The Trusted Microelectronics Space ...... 6
Figure 3 – Quantifying Overt Hardware Attacks [17] ...... 9
Figure 4 – Relationship between Metric Attributes, Assessments, and Threats [20] ...... 12
Figure 5 – Vulnerability Analysis Flow [22] ...... 13
Figure 6 – Different Domains for Trust Analysis ...... 18
Figure 7 – Trust Abstraction Levels ...... 18
Figure 8 – Constrained Trust Metric Solution Space ...... 19
Figure 9 – Waveform Showing Trojan Activation ...... 31
Figure 10 – Test Bench for ALU Controller Testing ...... 33
Figure 11 – Netlist Objects [29] ...... 36
Figure 12 – Cell Objects [29] ...... 37
Figure 13 – Comparison of Black Box Test Article (left) to Independently Developed ...... 41
Figure 14 – Training Model Data and Application to Test Set ...... 41
Figure 15 – Test Article Development for Actual and Expected Design Scenarios ...... 43
Figure 16 – FPGA Boards Utilized in Test Article Development [30] ...... 44
Figure 17 – Hardware Error Taxonomy [3] ...... 47
Figure 18 – Scoring ALU Trojan Error ...... 47
Figure 19 – Single Precision 32-Bit Floating Point Adder ...... 48
Figure 20 – Generalized Verification Testbench ...... 49
Figure 21 – Testbench Assertion for Evaluating Expected and Actual Result ...... 53
Figure 22 – Waveform of FPA Dataflow Architecture Displaying Error Flag ...... 54
Figure 23 – Detail Waveform of FPA Dataflow Architecture ...... 54
Figure 24 – Small Errors Added at the Netlist Level ...... 55
Figure 25 – Errors Added at the RTL ...... 56
Figure 26 – Macro Block Diagram for the Fixed Point to Floating Point Converter ...... 57
Figure 27 – Fixed to Floating Point Conversion Input Bit Definition ...... 57
Figure 28 – Fixed to Floating Point Conversion Output Bit Definition ...... 57
Figure 29 – Block Diagram of Floating Point Adder with Fixed Point Conversion Test Article ...... 59
Figure 30 – Waveform of Full System Test Article ...... 60
Figure 31 – 8-bit MIPS Processor Top Level [33] ...... 62
Figure 32 – MIPS Encoding Formats ...... 63
Figure 33 – Block Diagram of MIPS Controller and Datapath [33] ...... 63
Figure 34 – Block Diagram of MIPS Processor Datapath and Control Unit [33] ...... 65
Figure 35 – Block Diagram of 8-Bit ALU with Controller ...... 66
Figure 36 – Test Setup for MIPS Processor ...... 67
Figure 37 – Error Propagation through MIPS Processor ...... 68
Figure 38 – Observed Errors on ALU Controller ...... 70
Figure 39 – Observed Errors on MIPS ...... 70
Figure 40 – Error Ranking Based on Detectability, System Payload, and EIC ...... 76
Figure 41 – DI Analysis True/False Positives and Negatives ...... 79
Figure 42 – Generalization of Equivalence Checks across Design Path ...... 79
Figure 43 – Parsing Design into Sub-domain Profiles ...... 81
Figure 44 – Generalized Deviation of Actual Away from Expected Character Profile ...... 81
Figure 45 – Design Integrity Scale for Single Design Profile ...... 83
Figure 46 – Design Integrity Scale – Aggregated Profiles ...... 83
Figure 47 – Trusted Design Path Flow ...... 85
Figure 48 – Logical Equivalence Check between the Reference and Untrusted Design ...... 85
Figure 49 – Embedded Errors into Netlist of Floating Point Adder ...... 88
Figure 50 – Equivalence at each Key Point (left) and Design Results (right) for Test Article 1 ...... 88
Figure 51 – Deviation of Actual Logical Equivalence from Expected Domain ...... 89
Figure 52 – Comparison Point Diagnosis to Uncover Non-equivalence in Test Article 1 ...... 89
Figure 53 – Schematic Diagnosis of Comparison Point within Conformal ...... 90
Figure 54 – Trojan Inserted into Test Article for Evaluating Power Consumption Integrity ...... 92
Figure 55 – Deviation of Actual Power Consumption from Expected Domain Profile ...... 93
Figure 56 – Deviation of Actual Signal Activity from Expected Domain Profile ...... 96
Figure 57 – Test Setup for Evaluating Questionable Design Functionality ...... 97
Figure 58 – Deviation of Functionality from Expected Domain Profile ...... 98
Figure 59 – Deviation of Structural Architecture from Expected Domain Profile ...... 100
Figure 60 – Test Case Block Diagram ...... 103
Figure 61 – Correlating the Trust Measure to Error Implementation Cost ...... 109
Figure 62 – Correlating the Trust Measure to Error Payload ...... 111
Figure 63 – Example of Hierarchical Structure of the IP-XACT Standard ...... 114
Figure 64 – Packaging Step for both User and Supplier End of the IP Transfer ...... 115
Figure 65 – High-Level Block Diagram of Typical IP-XACT Use Model ...... 117
Figure 66 – Example IP with Malicious Circuitry Added ...... 119
Figure 67 – Comparison of Checksum Values ...... 120
Figure 68 – Constrained Trust Metric Solution Space ...... 125
Figure 69 – Expanding the Full System into the Analog Domain ...... 126


Chapter 1: INTRODUCTION

1.1. Deviation from the Design Specification

In engineering, the design of a product is tied very closely to its design specification. Specifications outline details such as performance metrics as well as operating conditions and constraints for the product design. The design specification therefore ends up being one of the main driving forces in the detail design, since the specification is typically defined by the end user or customer requirements. As such, deviations away from the design specification can be a positive or negative concern depending on the context in which the deviation is observed.

Positive deviation from the design specification is done with the intention of making the product more reliable or of guaranteeing a specific performance margin. Many companies introduce overdesign as a way to ensure that their products have a minimal chance of ever dropping below a specific performance metric. A product can be overdesigned by adding redundancy or overcompensating certain design specifications in order to reduce the likelihood of the design falling outside critical margins. In the Integrated Circuit (IC) industry, this could mean hardening the chip against degradation effects such as Negative Bias Temperature Instability (NBTI) and Hot Carrier Injection (HCI) in order to improve circuit reliability [1]. In the automobile industry, this could mean overdesigning certain critical parts such that they outlast the warranty guarantees.

By contrast, negative deviation from the design specification results in an undesirable deviation in the performance and behavior profiles of the design. This could be caused by poor engineering, subpar manufacturing and fabrication processes, or through intentional means of a malicious actor. Counterfeiting is a major problem where the deviation effects can be observed in a performance drift over time as a result of the quality assurance issues created by the counterfeit part. Another concern is in the area of malicious tampering, involving intentional insertion of malicious elements into a design. The functionality of a design can be dramatically altered by small modifications to the circuit. Effects of the malicious tampering can be observed as it deviates away from how the design was intended to perform or behave. All of these factors alter the design from its originally intended profile. Regardless of the kind of deviation being observed, the question that arises is "How does one quantify these deviations from the originally intended reference specification?"

A large amount of research remains to be done in the area of quantifying reference specification deviation. Different deviation contexts require different measurement techniques. For example, a major component to quantifying overdesign would include measuring specification overshoots as well as the resource costs and tradeoffs made as a result of the overdesign efforts. This work focuses on quantifying reliability, and specifically, the deviations of design integrity observed in questionable chips received from untrusted suppliers.

1.2. The Rising Concern of Hardware Trust

As IC chips continue to advance in complexity, economics and time-to-market pressures have driven hardware developers into distributed design processes and overly complex supply chains. This has led to more opportunistic points in the design and manufacturing flow for error insertion by adversarial or dishonest agents inside a supplier. A hardware error is defined as any construct that causes deviation from the intended specification. Hardware errors are typically categorized as either faults or hardware Trojans. Hardware Trojans are inserted into the design with malicious intent to compromise a design's functionality and reliability. Other aims for hardware Trojans could be to grant control to an adversary for monitoring or stealing information. A fault is a quality control occurrence usually created as a result of poor fabrication processes and is not typically malicious in nature.

With the globalization of the IC industry, hardware untrustworthiness (i.e. the concern for hardware error insertion) has become a growing issue as the Internet of Things continues to expand. There are approximately 10 billion devices connected to the World Wide Web today. This number is projected to more than triple by the year 2020 to 35 billion devices [2]. The rising concern for hardware Trust has therefore led to the emergence of a new field of research to address these concerns - Trusted Microelectronics.

Figure 1 illustrates a generalized design path flow for a hardware IC. The design path begins at the highest level of abstraction, the System Level, and moves down into the behavioral Register Transfer Level (RTL) and Gate Level Netlist. Finally, the path ends at the lowest abstraction, the Layout Level. As depicted in Figure 1, the two main vulnerability points in the design path lie in the integration of outside supplier designs and in the fabrication conducted at untrusted foundries.


Figure 1 – Insertion Points for Hardware Errors in Design Flow

1.3. The Trusted Microelectronics Space

As mentioned previously, Trusted Microelectronics is an emerging field of research that encompasses a variety of different sub-field areas in its research scope. Figure 2 illustrates these different sub-fields within Trusted Microelectronics and shows how the sub-fields relate to one another.

The Trojan Detection area is one of the largest sub-fields and makes up a large portion of the current Trust research being conducted. Since it is a major component of Trust, it overlaps with many of the other sub-fields. One of the main focuses of this area is to standardize and develop new methods for detecting malicious insertions or hardware Trojans. In [3], Tehranipoor and Koushanfar present a survey of the latest state-of-the-art Trojan detection techniques that have been widely accepted within the Trojan Detection community. Highlights of these techniques include Outlier Analysis, Path Delay Analysis, Pattern Generation for Transient Power and Switching Analysis, and Kullback-Leibler Divergence. In order to analyze proposed detection methods, the Trojan Design and Test Systems space is where sample Trojans are designed and inserted into embedded systems for the purpose of exercising and validating new detection techniques. The Trust Hub is a repository of benchmark test case examples that can be leveraged for exploring, comparing, and vetting new detection methods [4].

The Trojan Detection space intuitively overlaps into the Design for Security and Trust, as well as the Vulnerability and Attack Mitigation spaces. As Trojan detection work continues to progress, establishing defenses against attacks and anticipating threats is a critical response to combatting the hardware Trust concerns. Design for Security and Trust involves proactively introducing measures at the design stage in order to achieve increased hardware security and to make the insertion of hardware errors easier to detect. Design for Security and Trust work also presents methods for IC authentication that can help identify counterfeit or tampered components. The work done in [5] shows examples of designing for Security and Trust by inserting dummy scan flip flops into areas of the design that are below a specified gate transition probability. Since these are points in the design with low observability, it can be very difficult to identify a Trojan's presence. By adding dummy scan flip flops, one effectively increases design observability at these points, making the detection of Trojan activity much easier. In [6], Abramovici and Bradley employ reconfigurable DEsign For ENabling SEcurity (DEFENSE) logic, which deploys countermeasures when malicious circuitry is detected. One can also employ design obfuscation techniques such as Camouflaging or Split-Fabrication Obfuscation [7] [8] as a means for realizing a more secure design.

Vulnerability and Attack Mitigation efforts directly impact Design for Security and Trust. As one is able to determine the areas of the design that are more vulnerable, mitigation strategies can be implemented which in turn help to achieve a more secure and attack-resistant design. The work done in [9] looks at ways to detect vulnerabilities within behavioral designs at the RTL captured in Hardware Description Languages. [10] expands on this work by extending the behavioral level vulnerability analysis down to identifying the vulnerabilities at the Gate and Layout Levels of abstraction.

Figure 2 – The Trusted Microelectronics Space

Attack Detection is largely a subset of Trojan Detection when considering internally based attacks; however, it also incorporates attacks that are made external to the component. Bhunia et al. provide a survey of state-of-the-art Trojan attacks and mitigation or countermeasure strategies in [11]. The work in [12] and [13] also looks at multi-level attack models as well as various attack schemes.

The Trusted Supply Chain Management sub-field focuses on obtaining trusted parts and components by maximizing the supply chain trustworthiness. Trusted Supply Chain Management abstracts away from the component level and seeks to verify Trust from the Supplier Level. This involves developing models of Supplier Network Security and Trust such as presented in [14] and [15] in order to aid in supplier selection. Computation Models of Trustworthy Degree also play an important role in quantitative analysis for trustworthy networks [16].

The Trusted Design Verification sub-field focuses on developing new Verification Science that can be extended towards vetting components that have gone through untrusted paths in the supply chain. Regardless of the untrusted point locations (e.g. the foundry, IP supplier, etc.), the Trusted Design Verification space focuses on developing a vetting protocol for parts that are of questionable integrity. Anti-Counterfeiting addresses the problems observed with chip cloning and counterfeiting and seeks to identify counterfeit components and prevent them from being passed off as authentic parts.

1.4. The Need for Trust Metrics

The Trust Metrics space ties together all of the other sub-fields and therefore overlaps with each of them. The ability to add quantifying measures to each of the sub-fields grants insight into each respective field, allowing measurable evaluation of things such as coverage, thoroughness of analysis, and testing confidence. Within the Trusted Design Verification space, Trust Metrics can be integrated into the verification protocol as a way to map progress towards a Trusted Design. Metrics can be developed and utilized in the Trusted Supply Chain Management space as a way to compare and rank different supply chains and assist with choosing the most trustworthy supplier path. The overlap into Vulnerability and Attack Mitigation includes developing hardware vulnerability metrics that could also be integrated into Computer-Automated Design (CAD) toolsets as a means for identifying the most vulnerable areas of the design as well as for comparing and evaluating different mitigation strategies. By the same token, showing quantifiable improvements in Design for Security and Trust techniques can create a contrast between different security philosophies and show one method as superior to another. Finally, Trust Metrics could become valuable measures in the Trojan Detection and Attack Detection communities, where the amount of deviation away from a reference point could provide valuable insight into the presence of something malicious in the design.

Developing a portfolio of Trust Metrics that spans a variety of use cases is a critical component to Trusted Microelectronics research work, and it remains largely underdeveloped. As standards for Trusted Microelectronics begin to be developed, metrics such as integrity figures of merit, coverage, and distance measures may be instrumental in providing the framework for benchmarking industry Trusted Part certifications.

1.5. Previous Research

This sub-section will provide a review of previous and recent metrics research. This will include the scoring of overt hardware attacks, quantifying supplier and system trustworthiness, as well as quantifying vulnerability and metrics for Trojan detection.

1.5.1. Scoring Overt Hardware Attacks

In [17], Moein and Gebali developed a framework for scoring overt hardware attacks. They define an overt hardware attack as an attack where the attack scheme is not hidden and can be observed by the victim. The intention of the adversary's attack may include any one or more of the following:

1. Disrupting the normal functionality of the system such that it deviates from the expected performance.

2. Preventing the system from working altogether.

3. Reverse engineering the system to attack later by covert attack techniques or to copy the system.

They quantify the accessibility an adversary has to the hardware system, the time required for the Trojan to activate within the system, and the resources required for the attack. The quantifying metrics evaluate the hardware attacks from an external perspective after fabrication. Their metrics take the accessibility to the system (a), the monetary resource cost (r), and the time for the attack to be observed (t) into consideration. Equation (1) gives the scoring range for Least Demanding Attacks (LDA), Equation (2) the scoring for Demanding Attacks (DA), and Equation (3) that for the Most Demanding Attacks (MDA). Equation (4) generalizes the attack level scorings. Depending on how the attack is scored, it is cataloged as LDA, DA, or MDA.

Figure 3 – Quantifying Overt Hardware Attacks [17]

LDA = \{ 2 < a + r + t \le 4 \}   (1)

DA = \{ 5 \le a + r + t \le 7 \}   (2)

MDA = \{ 8 \le a + r + t < 10 \}   (3)

AL = \begin{cases} LDA & \text{when } L_1 + \Delta \le 4 \\ DA & \text{when } 5 \le L_1 + \Delta \le 7 \\ MDA & \text{when } L_1 + \Delta \ge 8 \end{cases}, \quad \text{where } \Delta \in \{-1, 0, 1\}   (4)
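As a rough illustration of Equations (1) through (3), the sketch below classifies an attack from its three component scores. The assumption that a, r, and t are integers on a small (roughly 1 to 3) scale is inferred here from the score ranges and is not stated explicitly in [17].

```python
def attack_level(a: int, r: int, t: int) -> str:
    """Classify an overt hardware attack from its accessibility (a),
    resource (r), and time (t) scores, following Equations (1)-(3).

    Assumes each component has already been scored on a small integer
    scale (e.g., 1-3) so that the sum falls between 3 and 9.
    """
    s = a + r + t
    if 2 < s <= 4:
        return "LDA"   # Least Demanding Attack
    elif 5 <= s <= 7:
        return "DA"    # Demanding Attack
    elif 8 <= s < 10:
        return "MDA"   # Most Demanding Attack
    raise ValueError(f"score {s} falls outside the defined ranges")

# Hypothetical example: an easily accessible board-level attack that
# needs modest resources and little time.
print(attack_level(a=1, r=2, t=1))   # -> "LDA"
```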

1.5.2. Quantifying Supplier Trustworthiness

Attempts have been made to quantify supplier trustworthiness through probabilistic models. In [18], probabilistic measures are proposed for the confidentiality (C), integrity (G), and availability (Ao) of system components such that the degree of system trust (IC components, printed circuit boards, or embedded software components) can be quantified as Equation (5).

P(T_{system}) = \prod_{i=1}^{n} P(T_i) = \prod_{i=1}^{n} P(G_i)\,P(C_i \mid G_i)\,P(A_{o_i} \mid G_i \cap C_i)   (5)

The integrity of the IC is defined as the measure of confidence in its ability to meet the application specification (S), its authenticity (A), whether it is new (N), and whether it is benign (B) (i.e., not maliciously altered). The system level probability of integrity can be written as:

P(G) = P(A)\,P(N \mid A)\,P(S \mid A \cap N)\,P(B \mid S \cap A \cap N)   (6)

P(Q) = P(A)\,P(N \mid A)\,P(S \mid A \cap N)   (7)

P(G_{system}) = \prod_{i=1}^{n} P(G_i) = \prod_{i=1}^{n} P(G_i \mid Q_i)\,P(Q_i)   (8)
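The sketch below evaluates Equation (5) for a hypothetical three-component system; the individual probability values are invented for illustration and are not taken from [18].

```python
from math import prod

# Hypothetical per-component probabilities:
#   P_G            - P(G_i), integrity of component i
#   P_C_given_G    - P(C_i | G_i), confidentiality given integrity
#   P_Ao_given_GC  - P(Ao_i | G_i and C_i), availability given both
components = [
    {"P_G": 0.98, "P_C_given_G": 0.99, "P_Ao_given_GC": 0.97},
    {"P_G": 0.95, "P_C_given_G": 0.96, "P_Ao_given_GC": 0.99},
    {"P_G": 0.90, "P_C_given_G": 0.98, "P_Ao_given_GC": 0.95},
]

# Equation (5): system trust is the product of the per-component terms.
p_t_system = prod(
    c["P_G"] * c["P_C_given_G"] * c["P_Ao_given_GC"] for c in components
)
print(f"P(T_system) = {p_t_system:.4f}")
```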

1.5.3. Quantifying System Level Trustworthiness

In [19], Paul et al. introduce a Trustworthiness Ontology constructed to describe dependencies and correlation of various components of Trust from a System Level. The ontology seeks to capture several dimensions of high assurance systems with the goal of providing thorough information to aid in trustworthiness analysis and data collection [19]. Cho et al. expand on this work in [20] by introducing a metric framework to measure the quality of trustworthy systems with a structure they propose called TRAM (Trust, Resilience, and Agility Metrics). TRAM leverages the ontology methodologies of [19] to describe the hierarchical structure of metrics to measure the trustworthiness of a system. Figure 4 presents a Petri Net representation of the relationships between the various trustworthiness attributes they discuss. An oval is representative of the state of a given system, whereas a bar is representative of the transition from one state to another. The diagram shows the relationships between metric attributes, assessments, and threats. U is the Uncertainty determined by the unknown vulnerabilities and attacks, I indicates the Importance of an asset, and R refers to the Risk of the system. Equation (9) gives an expression for the Trustworthiness of the System, Tw.

T_w = (T_r, T, R_s, A)   (9)

T_r is the degree of Threat, T is perceived Trust, R_s is Resilience, and A is the Agility of the system. Each measurement x maps to a [0, 1] scale.
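As a minimal sketch of the tuple in Equation (9), the following hypothetical container simply holds the four attributes and enforces the [0, 1] mapping; the attribute names and example values are assumptions made here for illustration, not part of TRAM itself.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Trustworthiness:
    """System trustworthiness tuple Tw = (Tr, T, Rs, A) from Equation (9).

    Each attribute is assumed to be pre-normalized onto [0, 1]; values
    outside that range are rejected.
    """
    threat: float      # Tr - degree of Threat
    trust: float       # T  - perceived Trust
    resilience: float  # Rs - Resilience
    agility: float     # A  - Agility

    def __post_init__(self):
        for name, value in vars(self).items():
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name}={value} is outside [0, 1]")

# Hypothetical assessment of a system of interest.
tw = Trustworthiness(threat=0.2, trust=0.8, resilience=0.7, agility=0.6)
print(tw)
```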

11

Figure 4 – Relationship between Metric Attributes, Assessments, and Threats [20]

1.5.4. Quantifying Vulnerability

Methods for quantifying design vulnerability have also been proposed recently and are valuable for quantifying transient faults injected into a system by an adversary [21]. Single and multiple signal line state flips are considered and test patterns are applied in order to make the faults observable. Silent Data Corruption (SDC) is the scenario where a fault is effective but undetected and can be expressed as Equation (10), where I_SDC(f) is the set of test patterns for which the fault f is effective but undetected. The Attack Success Rate (ASR) displayed in Equation (11) quantifies the conditional probability that a fault with an effect is not detected. I_DET(f) is the set of test patterns for which the fault f is detected.

SDC(f) = \frac{|I_{SDC}(f)|}{|I|}, \quad \text{where } |I| = 2^n \text{ if all inputs are legitimate}   (10)

ASR(f) = \frac{|I_{SDC}(f)|}{|I_{DET}(f)| + |I_{SDC}(f)|}   (11)
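A small sketch of Equations (10) and (11), assuming the pattern counts |I_SDC(f)| and |I_DET(f)| have already been obtained from fault simulation; the circuit size and counts below are hypothetical.

```python
def sdc_and_asr(n_inputs: int, i_sdc: int, i_det: int):
    """Compute the Silent Data Corruption rate (Eq. 10) and the Attack
    Success Rate (Eq. 11) for a single fault f.

    n_inputs - number of input bits (|I| = 2**n if all inputs are legitimate)
    i_sdc    - |I_SDC(f)|: patterns where the fault is effective but undetected
    i_det    - |I_DET(f)|: patterns where the fault is detected
    """
    total_patterns = 2 ** n_inputs
    sdc = i_sdc / total_patterns
    asr = i_sdc / (i_det + i_sdc) if (i_det + i_sdc) else 0.0
    return sdc, asr

# Hypothetical 8-input circuit: the fault corrupts the output silently
# for 12 patterns and is caught by 52 others.
sdc, asr = sdc_and_asr(n_inputs=8, i_sdc=12, i_det=52)
print(f"SDC = {sdc:.4f}, ASR = {asr:.4f}")   # SDC ~ 0.0469, ASR = 0.1875
```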

12

In [22], the authors propose a Vulnerability Analysis Flow as shown in Figure 5 which can be utilized to identify hard-to-detect areas of the circuit that are more vulnerable to Trojan insertion. Power, Delay, and Structural Analyses are conducted on the design in order to determine the locations of the hard-to-detect areas. The Power Analysis looks at nets with a very low transition probability as locations containing low observability. In the Delay Analysis, nets that are on non-critical paths are identified as more susceptible to Trojan insertion since their delay change is harder to detect. Finally, the Structural Analysis looks for untestable nets that are either blocked or unreachable for test.

Figure 5 – Vulnerability Analysis Flow [22]


1.5.5. Quantifying Trojan Presence

The authors of [22] also propose a metric for detecting a Trojan based on the number of transitions in the Trojan circuit and the extra capacitance induced by the gates from its implementation. Equations (12) and (13) determine the Trojan detectability, T_detectability.

T_{detectability} = |t|   (12)

t = \left( \frac{A_{Trojan}/S_{Trojan}}{A_{Tj\_Free}/S_{Tj\_Free}}, \; \frac{TIC}{C_{Tj\_Free}} \right)   (13)

A_Trojan represents the number of transitions of the Trojan circuit. S_Trojan is the size of the Trojan with regard to cells required for implementation. A_Tj_Free is the number of transitions in a Trojan-free circuit and S_Tj_Free the Trojan-free circuit size. TIC represents the added capacitance by the Trojan as a Trojan-induced capacitance and C_Tj_Free is the Trojan-affected path with the largest capacitance in the corresponding Trojan-free circuit [22].
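The following sketch evaluates Equations (12) and (13) for a hypothetical Trojan; the use of the Euclidean norm for |t| and the numeric values are assumptions made here for illustration.

```python
from math import hypot

def trojan_detectability(a_trojan, s_trojan, a_tj_free, s_tj_free, tic, c_tj_free):
    """Trojan detectability per Equations (12)-(13).

    t is a two-element vector: the transition density of the Trojan
    relative to the Trojan-free circuit, and the Trojan-induced
    capacitance (TIC) relative to the largest-capacitance affected path
    in the Trojan-free circuit. T_detectability = |t|, taken here as the
    Euclidean norm.
    """
    transition_ratio = (a_trojan / s_trojan) / (a_tj_free / s_tj_free)
    capacitance_ratio = tic / c_tj_free
    return hypot(transition_ratio, capacitance_ratio)

# Hypothetical small Trojan: 40 transitions across 8 cells, inserted into
# a circuit with 5000 transitions across 400 cells.
print(trojan_detectability(a_trojan=40, s_trojan=8,
                           a_tj_free=5000, s_tj_free=400,
                           tic=0.02, c_tj_free=0.15))
```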

In [23], the authors review several metrics for quantifying Trojan detection techniques from a probabilistic perspective. They define the Probability of Detection as the ratio of the number of Trojans detected by the technique to the total number of Trojans in the design. The Probability of False Alarm is defined as the ratio of the number of Trojan-free designs that are incorrectly classified as Trojan to the number of Trojan-free designs. Finally, they discuss the amount of time required to find the Trojan as a third factor to the detection metric [23] [24] [3].

1.6. Problem Statement and Document Organization

The current state of metrics research in Trusted Microelectronics has primarily focused on employing a top-down philosophy for evaluating Trust and quantifying attacks. These quantification techniques are generally applicable to higher abstraction levels (e.g., the supplier and greater system levels). The scoring of overt hardware attacks discussed in [17] is useful for measuring the context in which the adversary executes the attack; however, information about the attack must be known and is realistically very difficult to acquire in practice. The probabilistic models from [18] and ontology models from [19] [20] are likewise only valid for supplier or system level trustworthiness evaluation and do not afford the resolution into lower levels of hierarchy that span into the design level of components. The vulnerability measuring techniques presented in [21] only target transient faults injected by the adversary at the system level and do not address vulnerabilities in the fabrication or design that can be taken advantage of.

The contribution that these metrics offer for quantifying Trust leaves one with questions such as “How can one quantify the integrity of the design in scenarios where no information about the supply chain is known?” and “How does one obtain greater resolution into a supplier with the granularity to evaluate a questionable component?” Techniques that can address these Trust concerns at the design level remain to be developed.

Developing metrics that can evaluate the trustworthiness of the component at the design level is an approach that will allow one to quantify a design independent of the supply chain. A roadmap that constrains the scope while still defining the Problem Space is needed to guide the metric development process. In addition, a Trust Metric roadmap can establish a Solution Space to which future metric work and advancements can be mapped. The process of developing metrics will require establishing a dataset of test articles with embedded errors that mimic real world scenarios of manufacturing faults and adversary intrusion. As such, a method of error scoring that ranks and rates errors to establish quantifiable differentiation needs to be developed. The references developed and utilized for this process can present a wide range of utility and variability in quality. Developing a measure for reference quality is needed to convey confidence in the Trust analytics.

This dissertation addresses the outlined problem by defining a Solution Space that will serve as a roadmap for developing new measuring techniques for establishing novel Trust Metrics. These Trust Metrics will quantify Design Integrity and provide a means for creating measurable differentiation between various error types. Finally, a Trust Metric Figure of Merit will be discussed that evaluates the quality of the Trust Metrics themselves and sets a benchmark against which other metrics can be compared.

This document has been organized to help the reader follow the progression of developments in a logical manner. Background concepts and topics will be reviewed when necessary in order to add context for areas where it is deemed useful to the reader. Chapter 2 will discuss the process of constraining the Trust Metric Solution Space and review the subsequent domains of the Trust Metric space. From there, Chapter 3 will discuss the Error Implementation Cost measures that were developed to quantify errors to allow error ranking and rating. The process of developing Test Articles for test scenarios will be discussed in Chapter 4. Chapter 5 will review the Design Integrity techniques and metrics that were developed for tracking deviation of a design away from its intended reference profile. In Chapter 6, several test cases containing inserted errors will be analyzed and have the integrity metrics applied. Chapter 7 will discuss the IP-XACT Standard and opportunities it offers for leveraging new metrics in the world of IP sharing. Chapter 8 will be the conclusion of this document, where recommendations for future work will be given for continuing the progress this work has made.


Chapter 2: CONSTRAINING THE TRUST METRIC SOLUTION SPACE

Previously, the Trust Metric Problem Space had been loosely defined. This created a variety of different objectives for developing Trust Metrics, thereby making collaborative efforts in the Trusted Microelectronics community difficult to realize or gain traction. Chapter 2 seeks to address this problem by constraining the Trust Metric Problem Space into six different regions or domains of metric development: Reference Establishment, Vulnerability Analysis, Design Integrity Analysis, Test Article Development, Error Analysis, and Hardware Error Taxonomy. By constraining the Trust Metric Space and representing it in a modular fashion, a central reference point for all Trust Metrics work is established to which various research groups can contribute. New metric developments as well as new domains can easily be integrated into the Solution Space.

Figure 6 represents the scope of the design space by identifying the two domains that encompass any modern design: Discrete and Continuous Domains. The intersection between the Discrete and Continuous Domains is considered to be the Mixed Domain, which contains combinations of both discrete and continuous elements. A comprehensive portfolio of Trust Metrics will require metrics from all three domains. Figure 7 expands on Figure 6 by adding dimensionality to characterize the Hardware Trust Problem in terms of abstraction layers.


Hardware Trust can be discretized into three main Trust Abstraction Levels. The highest level is Trust at the Supplier Level, which includes the larger external system attacks, overt hardware attacks, and supplier trustworthiness. The middle abstraction is Trust at the Design and Component Level, with a scope spanning the Register Transfer Level design, the Gate Level Netlist, and the Layout of the hardware component. The lowest abstraction is Trust at the Device Level, which targets the physical silicon as well as the transistor physics. This work focuses on the Design and Component Trust Abstraction Level as indicated in Figure 7.

Figure 6 – Different Domains for Trust Analysis

Figure 7 – Trust Abstraction Levels

Figure 8 presents the Constrained Trust Metric Solution Space that exists within the confines of the highlighted Discrete Domain shown in Figure 7. Each of the six aforementioned domains is displayed, showing its interaction and relationship to the other domains. The pertinent metrics produced from the six domains can be observed as the input into the Trust Metrics area at the bottom. The remainder of Chapter 2 will review each of the six domains as they lay the groundwork for the deeper analysis that will be conducted in later chapters.

Figure 8 – Constrained Trust Metric Solution Space


2.1. Reference Establishment Domain

When developing metrics for Trust, one of the most fundamental components to conducting a thorough analysis and developing new techniques revolves around having a well-defined reference specification. Since there are many different types of references, they can vary in quality as well as utility. References can be highly detailed such as a behavioral design that is synthesizable or a netlist that shows all of the gates and structures being utilized in the design. At the other end of the spectrum, there are references that have much less utility, such as a high level datasheet or executable specification that covers only functional profiles of the design. There can be gaps or holes in the information provided by the reference which becomes problematic in the analyses that require greater amounts of detail. In addition, IP suppliers will often intentionally abstract away details of the design as a means of protecting their IP. The Reference Establishment Domain is the area where the design reference is defined. The process items in this domain include using whatever information is available about the particular design to develop a richer reference which can be utilized for other domain processes. This can involve tasks that range from developing executable specifications or behavioral models in HDL to implementing the RTL into bit streams or synthesized gate netlists. The best case scenario for a high quality reference is having the full design itself. Finally, a reference quality metric is produced which quantifies the usefulness of the reference.

2.2. Design Integrity Analysis Domain

Design Integrity is a crucial component to the Trust Metrics as it provides quantitative insight into how closely the actual hardware matches the expected version of the design. As mentioned earlier, the majority of the existing metrics are applicable only at the supplier level of abstraction and do not address Trust problems with the design itself. The Design Integrity analysis determines 20

the integrity of an untrusted design by vetting the reliability, identifying extra or modified circuitry, and considering any behavioral or operational anomalies. The analysis further breaks down the design into five smaller sub-domain profiles for characterization: Logical Equivalence, Structural

Architecture, Power Consumption, Signal Activity Rate, and Functional Correctness. The reference design can be utilized as a baseline for establishing what an expected profile of the design would look like. The design in question provides an actual profile of the design which can be compared to the expected profile. This comparison between the actual and expected profile lends itself to developing a normalized distance measurement that can be used to provide correlation between an embedded error and the design integrity of each of the sub-domain profile categories.

The distance measures can vary for each of the categories depending on how a particular error affects the design. By looking at multiple profiles of the design such as Power Consumption or

Logical Equivalence and quantifying each, one can acquire greater resolution into different perspectives of the design that can then be correlated to the integrity of the design. The sub-domain profile measurements can then be aggregated together to arrive at a final metric or distance measure that is indicative of its integrity.
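As a minimal sketch of this aggregation step (not the dissertation's exact formulation), the snippet below maps five hypothetical per-domain deviation measures on [0, 1] into an aggregate DI value on a [0, 5] scale, assuming uniform Correlation Factor weights and an integrity contribution of (1 - deviation) per domain.

```python
# Hypothetical normalized deviation measurements for one design under
# test, one value per sub-domain profile (0 = matches the reference,
# 1 = maximum observed deviation).
deviations = {
    "logical_equivalence": 0.00,
    "power_consumption":   0.12,
    "signal_activity":     0.05,
    "structural":          0.20,
    "functional":          0.00,
}

# Uniform correlation-factor weights are assumed here; non-uniform
# weighting is treated later as future work.
weights = {domain: 1.0 for domain in deviations}

# Per-domain integrity is taken as (1 - deviation) on [0, 1]; the
# aggregate Design Integrity then lands on a [0, 5] scale when all five
# weights are 1.
di = sum(weights[d] * (1.0 - dev) for d, dev in deviations.items())
print(f"Design Integrity = {di:.2f} / {sum(weights.values()):.0f}")
```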

The Power, Signal Rate, Functional, Structural, and Logical Equivalence Integrity blocks represented in the Design Integrity Domain indicate the measurement processes and techniques that are specific and unique to the respective sub-domain. Each technique realizes a measurement that arrives at a final normalized distance measure for the sub-domain. Highest Design Integrity can be defined as minimum deviation (within a specified performance margin) from the original design specification (i.e., smallest distance measure). Lowest Design Integrity would indicate high deviation from the specification (i.e., large distance measures). In order to develop the measuring techniques necessary for exploring the deviation distance measurements, design examples or test articles needed to be developed to allow the techniques to be exercised and validated. The process of creating test articles is conducted in the Test Article Development Domain.

2.3. Test Article Development Domain

The development of test articles for exercising the Design Integrity sub-domain regions and exploring error cost measures is one of the central components to the metrics research. Test articles are the foundational design cases that mimic scenarios where error insertion has been executed through accidental or intentional means. They are the source for data generation and feed into the other domain processes that are developing domain-specific Trust models.

The Design Corruption Process shown in the Test Article Development Domain of Figure 8 encompasses the selection and insertion process of an error into a reference design that has already been functionally verified. The term functionally verified is utilized here to indicate that the design has gone through the traditional model checking and verification processes and has been vetted for any design bugs. Design bugs are not considered to be trust concerns since they are not intentionally inserted with the goal of corrupting the design. The process of error insertion utilizes the errors developed in the Error Analysis Domain and inserts them into strategic locations within the design. The error insertion process may be conducted manually or semi-automatically. The semi-automated insertion of errors affords an increased ability to remove the human bias element of insertion when developing corrupted test articles. Semi-automated insertion can also assist in the mass creation of test articles that will be needed in order to analyze deterministic trends. Large quantities of test articles also become necessary for establishing a data corpus that can assist in moving beyond deterministic models and into probabilistic domains of Trust.
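The toy sketch below illustrates the flavor of semi-automated error insertion on a hypothetical gate-level netlist fragment; the real Design Corruption Process draws its error types and target locations from the Error Analysis Domain rather than choosing them at random.

```python
import random

def insert_gate_error(netlist_lines, seed=None):
    """Toy illustration of semi-automated error insertion: randomly swap
    one AND gate for an OR gate in a (hypothetical) structural netlist.
    Returns the corrupted netlist and the index of the modified line.
    """
    rng = random.Random(seed)
    candidates = [i for i, line in enumerate(netlist_lines)
                  if line.lstrip().startswith("and ")]
    if not candidates:
        return netlist_lines, None
    target = rng.choice(candidates)
    corrupted = list(netlist_lines)
    corrupted[target] = corrupted[target].replace("and ", "or ", 1)
    return corrupted, target

# Hypothetical three-gate netlist fragment.
netlist = [
    "and g1 (n1, a, b);",
    "or  g2 (n2, n1, c);",
    "and g3 (y, n2, d);",
]
corrupted, line_no = insert_gate_error(netlist, seed=7)
print(f"corrupted line {line_no}:", corrupted[line_no])
```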

When a test article is finalized, it can be utilized in the Design Integrity Domain as a compromised design example. The techniques developed in each sub-domain of Design Integrity can then be used to analyze it in order to measure the deviation distance from the expected design characteristic. For example, once a design has been developed with an embedded Trojan error, thus compromising the expected design behavior, the test article is analyzed in each of the sub-domain profile areas of Design Integrity by running tests for Logical Equivalence, Power Consumption, Functional Correctness, Signal Activity, and Structural Architecture. After these tests have been applied and the normalized deviation measurements have been obtained, one can arrive at a final Design Integrity metric. This metric is utilized with the Reference Quality metric to arrive at a final Trust Measure Figure of Merit.

Lastly, feedback from the Test Article Development Domain goes back into the Error Analysis Domain, where the implementation cost of the error insertion into the test article can be quantified. This establishes measurable differentiation between each error that can be used for improving the error insertion process into test articles, as well as defines a framework through which the errors can be ranked and rated.

2.4. Error Analysis Domain

The Error Analysis Domain comprises two main parts. The first part is where the development and modeling of the error occurs. In this part, a wide range of error types is designed for insertion into the test articles in the Test Article Development Domain. These errors target functional corruption in the design as well as structural architecture, logical equivalence, and power consumption, and will assist in developing the measurement techniques for the DI analysis.

Sophisticated Trojans made to evade these evaluation mechanisms are also designed. The goal is to have a spectrum of different error scenarios that closely mimic those inserted intentionally or by accident in a real design.


The second part is where the error is analyzed according to the Error Implementation Cost (EIC) function. In error development, fault models such as those discussed in [25] are utilized in order to model fabrication flaws created by poor manufacturing processes observed at the foundry. Other errors are developed which end up having an impact on certain design performance measurables such as timing accuracy or power consumption. "Smart" errors can be designed to resemble sophisticated Trojans which activate and cause a malicious function. Human intuition can easily classify the impact of an error as severe or not severe, but how does one quantitatively show that one error is measurably more severe than another? A fabrication flaw that is benign is clearly less severe than a hard-to-detect embedded Trojan which has the capability to shut a system down upon activation. The EIC function scores these errors such that measurable differentiation can be used to show quantitatively that one error is more severe than another.

The error scoring analysis is determined by the EIC function, the cost of implementing the error into the design. The EIC function utilizes three components for the scoring process: Payload, Triggering Mechanism, and Detectability. After each of these components has been quantified, they can be aggregated together to arrive at a final cost of implementation measure. The Payload, P, of the error determines how much damage the error is capable of doing to the circuit. The Triggering Mechanism, T, of the error quantifies how the error is activated. Finally, the Detectability, D, of the error is determined and measures how easy the embedded error is to detect. Since each component is normalized, a weighting, βi, is used to take the normalization into account.

The output of the error scoring feeds directly into the Hardware Error Taxonomy which utilizes the scoring system to rank and rate errors, similar to what has been established in the software community regarding the Common Weakness Enumeration (CWE) and Software Fault Patterns (SFP) [26].


2.5. Hardware Error Taxonomy Domain

The Hardware Error Taxonomy establishes a classification system that can be applied to the various error types observed in a design. The scoring system that is developed in the Error Analysis Domain becomes the framework by which the errors can be ranked, rated, and classified. The taxonomy deepens the resolution and differentiation between error types such that a more strategic and thorough approach can be taken when creating or using errors to realize compromised test articles. This classification network is broad enough that every error type can map to it regardless of its structure or payload [3].

2.6. Vulnerability Analysis Domain

The Vulnerability Analysis Domain seeks to establish techniques that measure the vulnerabilities of the design. These measurements are significant as they help identify areas of a design that are more susceptible to error insertion. By knowing the locations of the most vulnerable points within the design, different mitigation strategies can then be explored as a means for improving design security. In addition, they highlight the areas of the design that should be held under higher scrutiny in the test phase prior to insertion into a larger system.

Three design abstractions are considered in the analysis (RTL, Gate Netlist Level, and Layout Level.) Techniques pertinent to each abstraction level must be developed in order to maximize the resolution of the vulnerability checks. The points of error insertion can also be further analyzed in this domain such that new or different mitigation strategies can be developed and evaluated. The final metric is a Vulnerability Figure of Merit that gauges how secure the design is by the sum of each aggregated abstraction level.


2.7. Produced Metrics

The Trust Metric Solution Space presented in Figure 8 will produce three categories of metrics to include in the Trusted Microelectronics portfolio. The first set is the Design Vulnerability metrics, which quantify the vulnerability of the design as a whole yet contain the resolution to identify the highly vulnerable points at the various design abstraction levels. A unique metric is produced for each abstraction boundary (i.e. RTL, Gate, and Layout) and can be aggregated to arrive at a final figure of merit value for the entire design.

The second set of metrics revolves around quantifying the integrity of the design. Design Integrity is parsed into five separate sub-domains with separate distance measurements for each quantifying the deviation away from the expected design reference. This distance measurement correlates to the integrity of each of the sub-domain profiles of the design and is expressed as a normalized value (LE_integrity, P_integrity, S_integrity, F_integrity, and SR_integrity.) These five sub-domain measurements can then be aggregated to arrive at a final Design Integrity metric. These measurements are also utilized to determine the quality of the reference being used in the analysis, RQ, and produce a final Trust Measure Figure of Merit that can be assigned to the design in question.

The last set of metrics focuses on quantifying characteristics of the error and its cost of implementation into a system. Each of these error characteristic metrics (Payload, Triggering Mechanism, and Detectability) is determined independently before being aggregated into a final cost metric, the EIC.

Future metrics and Trust Domains can be modularly added into the Solution Space. To address this, a Future Domain area is included as a placeholder for future developments and advancements in new metric areas.


Chapter 3: ERROR IMPLEMENTATION COST

When considering the wide range of hardware error types, there is a tendency to generalize them as undesirable design anomalies or flaws. This perspective, however, doesn't offer an approach for differentiating errors from one another. Hardware Trojans can be easily differentiated from normal hardware faults since they carry a designed behavior. This differentiation, however, is made by inspection or through different Trojan detection techniques. Although it is understood that hardware errors are different when compared to one another, the inclination is to simply abstract away from their differences and identify them as a design anomaly. The reality is that hardware errors do not stack up evenly when compared to one another. Different errors can inflict varying degrees of damage on the design performance. One error type (e.g. a malicious Trojan) may require only a small amount of additional circuitry; however, it may result in much greater system-level damage capacity than that of benign faults or fabrication flaws carelessly inserted into the design.

The question then becomes, "How does one quantify these differences such that one error type is measurably more or less severe than another? And what added benefits does error quantification bring to the Trust Metrics work?" Much of the Trusted Microelectronics work revolves around the development of test articles to exercise the domain-specific measuring techniques. As such, test article development requires the insertion of errors into the design. The scoring framework provides a means to assist with the insertion process. Furthermore, as Trusted Microelectronics research moves into more probabilistic models of trust, a large data corpus of test articles will be required to fit the probability models to. Automating the error insertion process minimizes the human bias element as well as enables much faster development of the necessary data corpus. With errors being quantifiable, the scoring framework becomes an invaluable aid to the automated error insertion process.

One way the hardware community can establish measurable differentiation in the diversity of error types is by scoring them with a standardized methodology. By doing so, each error can have a value assigned to it which will bring all errors to the same reference point for comparison. The error scoring in this chapter focuses on the error cost of implementation and integrates components of the Hardware Error Taxonomy in Figure 17 as attributes to quantify the error. The Action, Activation, and Physical Characteristics of the taxonomy directly relate to the Payload, Trigger, and Detectability of the scoring function respectively. This chapter will discuss the error scoring methodology starting with a generalized scoring process applied by inspection. It will then move on to measuring the errors with objective techniques that were developed and present examples that demonstrate the scoring effectiveness.

3.1. Error Scoring by Inspection

The hardware error scoring can be generalized as an expression that is determined by inspection of the error. Equation (14) quantifies the hardware Error Implementation Cost (EIC.) In order to score the error in this way, one must have an understanding of the error design nearly to the level of actually designing the error (e.g. how the error was implemented, the behavior of the error, etc.) The resource cost can be reduced to three levels (Low, Medium, and High) as expressed in the inequality in Equation (15).

$\text{Error Implementation Cost (EIC)} = \beta_1 P + \beta_2 T + \beta_3 D$   (14)

No Cost: $EIC = 0$;  Low Cost: $0 < EIC < 3$;  Medium Cost: $3 \le EIC < 6$;  High Cost: $6 \le EIC \le 9$   (15)

Equation (14) takes three aspects of each error into consideration [27] [28]. First, the Payload (P) of the error, or how much damage the error is capable of doing to the circuit or system, is considered. Secondly, the Triggering Mechanism (T) of the error is evaluated. Finally, the level of Detectability (D), or the difficulty of implementing the error such that it is stealthy and untraceable, is considered. Each component has a weight coefficient, βi, which gives different weighting relative to the other components. The higher the error cost, the more difficult it would be for an adversary to implement it into a design due to the higher resources required. A scaling of 0 to 3 is used, described in Table 1, for evaluating the three components of Equation (14).

Table 1 - Component Scoring Rubric for the Error Implementation Cost Function
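As a concrete illustration of Equations (14) and (15), the sketch below scores an error from its P, T, and D components (each rated on the 0 to 3 rubric of Table 1) and maps the result onto the cost levels. The function name and the default unity weights are illustrative assumptions, not part of the dissertation's tooling.

    def error_implementation_cost(P, T, D, betas=(1.0, 1.0, 1.0)):
        """Score an error per Equation (14): EIC = b1*P + b2*T + b3*D."""
        eic = betas[0] * P + betas[1] * T + betas[2] * D
        # Map the score to the cost levels of Equation (15)
        if eic == 0:
            cost = "No Cost"
        elif eic < 3:
            cost = "Low Cost"
        elif eic < 6:
            cost = "Medium Cost"
        else:
            cost = "High Cost"
        return eic, cost

    # The ALU Trojan of Section 3.1.1: P=1, T=2, D=2 -> (5.0, "Medium Cost")
    print(error_implementation_cost(1, 2, 2))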


3.1.1. Error Scoring of Trojan Inserted into an ALU with EIC

To demonstrate how the EIC can be applied, an example case will be used. An 8-bit Arithmetic Logic Unit (ALU) was corrupted with an embedded Trojan circuit. When the ALU is in operation, if an input combination of all bit-wise high values (i.e. Input A = Input B = "11111111") is received at the ALU input ports, the logical AND operation will be mutated into an XOR operation. For all other input combinations, both operations function as expected. Figure 9 displays the waveform that confirms the functional corruption that occurs when the Trojan is activated. One can observe that the vector input combination "11111111" at both inputs activated the Trojan, forcing an XOR operation when an AND operation was received. The Trojan cost of implementation was valued at 5, indicating medium cost (assuming unity weighting of βi = 1.) The Payload was valued as P=1 since the damage capability was very limited (i.e. minor functional modification rather than system failure, data output stream, etc.) The Trigger was evaluated as T=2 since the Trojan was passive until a specific combination of inputs triggered it to be active. The Detectability was rated as D=2 since the Trojan was only detectable at a behavioral level by one unique vector combination in a 2^16 test space, hence making it highly unlikely to detect through quasi-random functional testing. In order to make the Trojan observable, techniques of Formal Verification (i.e. logical equivalence checking) would need to be utilized.


Figure 9 – Waveform Showing Trojan Activation

Table 2 details a description of four different errors inserted into separate ALU Controllers. The EIC assessment is applied by inspection. Error 1 simply forced a logical OR statement to execute as Set on Less and vice versa. Error 2 was a stuck-at fault that forced the Controller operation to select Set on Less for conditions where the logical OR operation is desirable, effectively creating one false function. Error 3 performed a bitwise invert that corrupted every output result of the Controller. Error 4 performed the same corruption as Error 3; however, it was triggered only in special cases. In Table 2, one can observe each of the components for the EIC function (i.e. P, T, and D.) Errors 1 and 2 are clearly less severe than 3 and 4 from a resource cost perspective. Although Error 4 carries a smaller Payload than Error 3, it maintains the highest cost due to the complex triggering mechanism required for implementation.


Error Number | Error Location | P | T | D | EIC | Cost | Description of Inserted Error
Error 1 | ALU Controller | 1 | 1 | 0 | 2 | Low | Logical OR executes as Set on Less; Set on Less executes as Logical OR
Error 2 | ALU Controller | 1 | 1 | 0 | 2 | Low | Stuck-at fault creates double function for operation select
Error 3 | ALU Controller | 3 | 1 | 1 | 5 | Medium | Bitwise invert for operation output; corrupts every function
Error 4 | ALU Controller | 2 | 3 | 1 | 6 | High | Bitwise invert triggered for special case controller outputs

Table 2 – Description of Errors Inserted into ALU Controllers

3.2. Error Scoring Through Objective Measurement Techniques

The scoring by inspection provides a good means for acquiring a generalized quantitative assessment of the error. It is desirable, however, to move to a more objective way of error scoring. As such, techniques for measuring the EIC components needed to be established. The first technique that will be addressed is the measuring of the error Payload.

3.2.1. Error Payload

In order to acquire an objective measurement of Payload, the scope of error Payload is constrained to include only functional corruptions caused by the error. Equation (16) presents an expression for observing the error Payload across an output boundary line, ρ_boundary, which encircles the input and output ports of the design architecture. ρ_boundary is expressed as the errors observed, ε_observed, over a given set of tests, T_total. When possible, the test scheme can be designed to be exhaustive. For much larger test spaces, the coverage for boundary and corner conditions can be maximized along with basic functional testing. The higher the value of ρ (approaching 1), the more functional damage will be observed from the error across the boundary surrounding the component.

$\rho_{boundary} = \left[\dfrac{\varepsilon_{observed}}{T_{total}}\right]$ such that $0 \le \rho_{boundary} \le 1$   (16)

3.2.2. Payload for ALU Controller Embedded Error

In order to demonstrate this with an example, a reference ALU Controller was corrupted with strategically inserted errors in order to analyze the Payload of each error on the Controller functionality. Figure 10 displays the test bench setup utilized. The corrupted ALU Controller is identical to the reference ALU Controller design with regard to expected functionality and performance. The only difference is that the corrupted Controller has an error embedded in it. A boundary is drawn around the corrupted Controller to define the scope of the Payload calculation.

Figure 10 – Test Bench for ALU Controller Testing


Both the reference ALU Controller and the corrupted Controller architectures are injected with a test vector set to stimulate all functions of the design. The reference ALU Controller produces an expected output which is used as a reference and compared to the actual result produced by the corrupted Controller. For any vectors where the expected and actual result are not equal, an error flag is sent to the test bench indicating a functional error occurrence. In order to quantify the Payload of the error as it propagates through larger boundary regions, one can apply Equation (16) over a set of boundaries, i, as shown in Equation (17), where m is the number of signals across each respective boundary. The boundary Payloads are then aggregated together to arrive at a final System Payload, P_system, that is indicative of the error damage relative to the system or largest boundary.

$P_{system} = \prod_{i=1}^{n}\left[\dfrac{1}{m}\cdot\sum_{j=1}^{m}\left(\dfrac{\varepsilon_{observed}}{T_{total}}\right)_j\right]_i$ such that $0 < P_{system} \le 1$   (17)
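A minimal sketch of how Equations (16) and (17) could be evaluated is shown below; the function names and the nested-list data layout are assumptions made purely for illustration.

    def boundary_payload(errors_observed, tests_total):
        """Equation (16): rho = errors observed / total tests for one signal."""
        return errors_observed / tests_total

    def system_payload(boundaries):
        """Equation (17): product over boundaries of the mean per-signal payload.

        `boundaries` is a list of boundaries; each boundary is a list of
        (errors_observed, tests_total) pairs, one pair per monitored signal.
        """
        p_system = 1.0
        for signals in boundaries:
            mean_rho = sum(boundary_payload(e, t) for e, t in signals) / len(signals)
            p_system *= mean_rho
        return p_system

    # Single-signal boundaries at the Controller, ALU, and MIPS outputs
    # (the Error 1 counts reported later in Table 9) -> ~0.018
    print(system_payload([[(4, 20)], [(196094, 524288)], [(195, 808)]]))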

3.3. Error Realization

The Detectability and Triggering Mechanism components of the EIC are considered to be subsets of error Realization, R. Instead of considering the Triggering Mechanism of the error only, the amount of additional circuitry required for implementation of both the error and Trigger is analyzed. The detectability of these modifications is then evaluated in the context of the full design to determine how easy the error is to detect. A modified EIC function can then be expressed by Equation (18), where the Implementation, I, and Detectability, D, are both components of the error Realization.

$EIC = P_{system} + R(D, I)$ where $0 \le EIC \le 2$   (18)

The error Realization can be evaluated objectively by observing the amount of structural change that was made to the design in order to implement the error. The structural changes include wiring that was added or removed, as well as new or modified architecture components that were added to the design. The Realization measure is constrained to the Gate Level Netlist, although a similar analysis could be conducted at the RTL or Layout Levels. Once the design is synthesized from RTL into gates, the analysis revolves around possible modifications made to the design's Nets and Leaf Cells from the expected synthesis reference.

A Net is a set of interconnected pins, ports, and wires. Wires sharing a common Net are at the same electrical potential as one another. Figure 11 displays the Net objects and the other class objects of the design as specified by Xilinx when performing FPGA synthesis. The Netlist objects include Logic Cells, Pins, Ports, and Nets. The relationship between objects is conveyed with arrows connecting two objects. A double headed arrow indicates that the relationship can be queried from either direction. A single ended arrow reflects a relationship that can only be queried in the direction of the arrow.


Figure 11 – Netlist Objects [29]

Figure 12 displays the relationship of the Cell objects to the other objects. Cells are typically Leaf Cells, which are primitives. The Leaf Cells have pins which are connected to the Nets in order to define the external Netlist. Hierarchical cells contain ports that are associated with hierarchical pins and connect internally to the Nets in order to define the internal Netlist. The Cells can be placed onto Basic Elements (BEL) or onto the SITE object which makes up larger, more complex logic cells [29].


Figure 12 – Cell Objects [29]

When the synthesis process is executed, the design Nets and Leaf Cells are represented hierarchically as subcomponent architectures. As such, these become the points of comparison for identifying any deviation from the expected structure. Equations (19) and (20) determine the number of extra or removed Nets and Leaf Cells, respectively, for an evaluated architecture component, i.

$\text{Implementation Difference of Nets} = I_{\Delta X_i} = |Nets_{expected} - Nets_{actual}|$   (19)

$\text{Implementation Difference of Cells} = I_{\Delta Y_i} = |Cells_{expected} - Cells_{actual}|$   (20)


The modified Nets and Cells can then be represented as a ratio to the total Nets and Cells respectively to arrive at a final expression for the error Realization, shown in Equation (21). In order to keep the resolution of the modified circuits from getting washed out in a large design, only the architectures that show a modification to the Nets or Leaf Cells are considered; therefore $I_{\Delta X_i} \neq 0$ and $I_{\Delta Y_i} \neq 0$. The Cells and Nets ratios are averaged together. The final Realization value can be subtracted from 1 in order to invert the scaling such that a 0 indicates a highly detectable error and a 1, an undetectable error.

$R = \dfrac{1}{2}\left(\dfrac{1}{n}\sum_{i=1}^{n}\left[\dfrac{I_{\Delta X_i}}{X_{i\_expected}}\right] + \dfrac{1}{m}\sum_{i=1}^{m}\left[\dfrac{I_{\Delta Y_i}}{Y_{i\_expected}}\right]\right)$ where $0 \le R \le 1$   (21)

where n, m = number of modified architectures evaluated for Nets and Leaf Cells respectively, and $I_{\Delta X_i} \neq 0$ and $I_{\Delta Y_i} \neq 0$
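The sketch below illustrates Equations (19)-(21) over per-architecture Net and Cell counts, followed by the modified EIC of Equation (18); the function name, the dictionaries of counts, and the example P_system value are illustrative assumptions rather than measured data.

    def realization(expected_nets, actual_nets, expected_cells, actual_cells):
        """Equations (19)-(21): average ratio of modified Nets and Cells.

        Each argument maps an architecture component name to its Net or Cell
        count; only components whose counts differ from the reference enter
        the sums (I_dX != 0, I_dY != 0).
        """
        net_ratios = [abs(expected_nets[k] - actual_nets[k]) / expected_nets[k]
                      for k in expected_nets if expected_nets[k] != actual_nets[k]]
        cell_ratios = [abs(expected_cells[k] - actual_cells[k]) / expected_cells[k]
                       for k in expected_cells if expected_cells[k] != actual_cells[k]]
        if not net_ratios and not cell_ratios:
            return 0.0
        mean = lambda xs: sum(xs) / len(xs) if xs else 0.0
        return 0.5 * (mean(net_ratios) + mean(cell_ratios))

    # Hypothetical synthesis reports for a single corrupted sub-architecture
    R = realization({"alu_ctrl": 120}, {"alu_ctrl": 126},
                    {"alu_ctrl": 85}, {"alu_ctrl": 88})
    stealth = 1.0 - R      # 0 = highly detectable, 1 = undetectable
    eic = 0.018 + R        # Equation (18) with an example P_system value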

3.4. Framework for Test Article Development

The scoring of errors has opened a path for the strategic insertion of errors into designs for creating scenarios of obfuscated error test cases. These test cases can mimic designs that contain errors from an adversary or unscrupulous IP supplier. The utilization of these developments will be revisited in later sections of Chapter 4, where a series of corrupt MIPS test articles are created and the errors scored accordingly for gauging the severity of the errors inserted.


Chapter 4: TEST ARTICLE DEVELOPMENT

In order to develop Trust Metrics, several reference designs are required to aid in the process of metric model development. These models, referred to as test articles, are design examples that contain embedded faults or Trojans that represent inserted errors from an adversary or unscrupulous IP supplier. The corpus of test articles creates a space where the various error types can be analyzed from a quantitative perspective with regard to various design profile characteristics such as Logical Equivalence, Power Consumption, Structural Architecture, and Functional Correctness. This lends itself toward developing a Trust Metric Model for the representative Solution Space domains (e.g. Vulnerability, Design Integrity, Error Analysis Domain, etc.) The development of the test articles is therefore one of the central components of the metric development, since the analyses and generated data are foundational for fitting the models to. This chapter will review the process utilized for developing test article design cases. Specifically, a 32-bit Floating Point Adder, a Fixed Point to Floating Point Converter, a MIPS processor, and an Arithmetic Logic Unit with Controller will be discussed.


4.1. Establishing a Reference for Developing Metrics

As previously mentioned, the primary focus of the Test Article Development Domain observed in Figure 8 is to develop a set of design case examples which can be used to establish a deterministic Trust Model for different characteristics of the design (e.g. a Logical Equivalence Trust Model, Functional Correctness Trust Model, Power Consumption Trust Model, etc.) One of the difficulties in accomplishing this is the lack of available controlled reference designs. Initial investigations into the metrics work began with test articles that contained obfuscated errors. These test articles were black box designs where the quantity and types of errors inside were largely unknown. In addition, there were no design specifications made available to grant meaningful insight into the original design functionality. Without any sort of reference bench point, developing integrity metrics with only the use of corrupted test articles was very difficult. As such, an approach was taken to develop a new set of test articles that contained manufactured errors strategically inserted into the design, thus creating a component that could closely mimic a component that an adversary had corrupted. Figure 13 compares the unknown black box test article with a newly developed test article where all of the information is known. The approach of developing new test articles brought transparency to the error designs, enabling one to observe the functional mutations caused by the error. It also established a reference benchmark in a controlled setting for developing the metric models.


[Figure 13 contrasts the two cases: the black box test article has no design specification, unknown errors, and unknown insertion points, with malicious circuitry inserted; the independently developed test article has full functionality known, error structure known, error location observable, and functional corruption observable.]

Figure 13 – Comparison of Black Box Test Article (left) to Independently Developed Test Article with Known Design and Errors (right)

Figure 14 – Training Model Data and Application to Test Set

Figure 14 illustrates how the developed test articles can then be utilized as a training data set to create the Trust Models. Once the model was created from the Training Data set, the Trust Model could be applied to a Test Data set such as the original black-boxed test articles for evaluation. As the data set continues to grow, taking more error scenarios into account in different test article designs, a probabilistic Trust Model approach could be taken which could facilitate a predictive model. In this work, the models are deterministic in nature, since the focus has been developing measuring techniques for different design profile domains. The process of developing the training data set shown in Figure 14 is detailed further in Figure 15. A golden, or uncorrupted, version of each test article is established and serves as the design source from which the compromised, or corrupted, test articles are developed. The corrupted test articles are developed through the strategic insertion of errors into the golden design. In Figure 15, one can see that the error may be inserted at either the RTL or Netlist Level. A taxonomy of different error characteristics provides a means for selecting a spectrum of error types. The error scoring developed from the EIC allowed errors with higher severity to be chosen for insertion as well as errors with a very low severity.

There are two paths displayed in Figure 15. The first path inserts the error at the RTL and then synthesizes the RTL into a Netlist, embedding the error into the gate-level description. The second path synthesizes the RTL into a gate-level Netlist and then inserts the error into the Netlist.


Figure 15 – Test Article Development for Actual and Expected Design Scenarios

The Abstraction Boundary is the point where generalized equivalence checks spanning across multiple profile dimensions of the design can be executed. This allows one to analyze deviations observed in the Actual Design when compared to the Expected Design as shown in Figure 15. By developing test articles in this manner, expected and actual scenarios of the design were created.

The expected scenario represents a design containing no Trust concerns and maintains the characteristics intended by the system engineer. The actual scenario is representative of what was actually manufactured at the foundry or the untrusted IP that was received from the IP supplier. Additional circuitry integrated or modified would cause deviation away from the expected characteristics.

The test article designs were explored and developed utilizing a Field Programmable Gate Array (FPGA) flow. When considering the costs for fabricating a design, FPGAs are significantly cheaper than Application Specific Integrated Circuit (ASIC) chips and carry negligible implementation time when compared to an ASIC tape out. As such, the Xilinx Spartan 3E, Spartan 6, and Zynq-7000 FPGAs were utilized for the test article development. The development boards used were the Digilent Basys2 and Zedboard boards, as shown in Figure 16. The Xilinx ISE and Vivado environments were utilized for HDL modeling, simulation, synthesis, and Place & Route implementation. These tools were used in conjunction with Mentor Graphics Questa/ModelSIM for design verification.

Figure 16 – FPGA Boards Utilized in Test Article Development [30]


4.2. Taxonomy of Hardware Errors

In order to establish compromised test article scenarios, a selection process for errors needed to be developed such that a wide range of error types could be chosen for insertion into the design. Cataloging different characteristics of each error was therefore a viable way of creating distinction between them. A hardware error taxonomy which catalogs error types is displayed in Figure 17 [3]. The taxonomy is presented in such a way that all error types can be mapped to it; the full spectrum of error types, from a small fabrication fault to a complex Trojan, is mappable to the taxonomy. The three characteristic categories of each error described in the hardware error taxonomy are the Action, Activation, and Physical Characteristics.

4.2.1. Action Characteristics

The Action characteristics of the error address how the error behaves. For a hardware Trojan, this would be the damage assignment that gets executed once it is activated. For a fabrication flaw, this is simply the performance consequence caused by the fault (e.g. a wire shorted to ground causes a logic bit to never flip.) The three arenas that Action characteristics encompass are Transmit Information, Modify Specification, and Modify Function. In other words, these are categorized action outcomes of the activated error. Transmit Information catalogs errors that are able to stream or export data from the system to an outside entity. Modify Specification addresses errors that make the original design specification defective or compromise the reliability of the design. Finally, the error could modify the actual functionality of the design such that the desired functionality deviates from the expected behavior.


4.2.2. Activation Characteristics

An error can be internally or externally activated. The error is internally activated by either remaining statically activated or by satisfying a specific condition that allows it to turn ON. If it is activated by a condition, it can be satisfied through logic or sensor implementation. The logic could be in the form of a counter timer that runs out, a unique input combination, or some other logic structure design. The sensor could be a monitoring device that triggers the error ON once a certain threshold is reached (e.g. temperature limit.) The external activations require an outside antenna, sensor, or a hybrid combination of both in order to turn the error ON.

4.2.3. Physical Characteristics

The error’s physical characteristics can be cataloged according to its realization into the design.

The physical realization covers the error distribution within and throughout the system, the layout structure, the required layout real estate, and the type of physical implementation. The physical type can be considered functional, meaning additional gates were added or deleted from the design in order to modify the functionality. It can also be considered parametric, indicating that the error was realized by modifying the existing wires or logic design.


[Figure 17 depicts the hardware error taxonomy tree: Action Characteristics (Transmit Information, Modify Specification, Modify Function), Activation Characteristics (External Activation via antenna or sensor; Internal Activation, either Always ON or Conditional through logic or a sensor), and Physical Characteristics (distribution, layout structure, size, and functional or parametric type).]

Figure 17 – Hardware Error Taxonomy [3]

[Figure 18 maps the ALU Trojan onto the taxonomy: Action = Modify Function (P = 1); Activation = Internal, Conditional, logic-triggered by input data (T = 2); Physical = loose distribution, changed layout structure, large size, functional gate-level change (D = 2); yielding EIC = 5, Medium Cost.]

Figure 18 – Scoring ALU Trojan Error


Figure 18 shows an example of the ALU with the embedded error discussed in Section 3.1.1 being mapped to the error taxonomy.

4.3. 32-Bit Floating Point Adder Test Article

The first test article that will be reviewed is the 32-bit single precision Floating Point Adder (FPA) from [31] that was used to explore the RTL/Netlist boundary. Once the reference FPA was established, the FPA design was compromised with inserted errors, thus creating many more test articles to mimic tampered designs. The FPA design is able to receive two inputs in the IEEE Standard 754 single precision format and produces a single precision 32-bit output. The FPA has the capacity to support inputs that are NaN, ±∞, ±0, normalized numbers, and denormalized numbers. Figure 19 displays a macro model diagram of the FPA module. The inputs must be latched into the FPA first before the Add operation is performed. When the DRIVE input is HIGH, the result is driven onto the FPA output.

Figure 19 – Single Precision 32-Bit Floating Point Adder


Figure 20 – Generalized Verification Testbench

Figure 20 displays a generalized testbench approach for verifying the design's functionality. The top testbench stimulates a behavioral model of the design in order to generate an expected set of outputs for the FPA under a given coverage test scheme. The second testbench (shown in the lower half) stimulates the architecture dataflow design with the same coverage test scheme. The output results (considered the actual output) can then be referenced against the expected results in order to verify correct functionality of the design. An error monitor in the testbench identifies the points in the design where there is a discrepancy between the expected and actual design functionality. These are points where the design must be corrected to ensure functional correctness.

As part of the test article development, a functional test plan was developed. This test plan would later be implemented as a means for evaluating the Functional Integrity of the actual design. The FPA had a test space of 2^65; therefore, exhaustive testing was not practical. The test scheme was designed in order to maximize testing coverage without being exhaustive. As such, the test scheme validated corner cases and special conditions in addition to the general functional operation.


Table 3 displays all of the special case test vectors that were used for coverage. One can see that a total of 73 special case test vectors were required for the FPA special case testing coverage.

Table 3 – Special Case Test Vectors for FPA Coverage


Table 4 – Test Vectors for Boundary Coverage

In addition to the special case test vectors, boundary coverage test vectors were designed for more thorough testing coverage. Table 4 outlines the boundary coverage vectors used for the FPA. Referencing Table 4, the boundary condition tests are outlined along with the expected result. In Test 1, a denormalized number is added to another denormalized number. The result is a maximum denormalized number. In the right-most column, the test vectors for Input A and Input B are specified along with the appropriate output result Z. With the special cases and boundary condition vectors established, random test vectors were also generated for additional coverage. A MATLAB program was written to generate approximately 5000 random test vectors that would also be injected into the FPA. A MATLAB script was utilized in conjunction with the vector generation program for converting the vectors from the IEEE 754 Floating Point Standard into a decimal equivalent for more intuitive inspection. Regarding the vector generation process, each test vector was divided into three sub-parts maintaining consistency with the IEEE Standard 754 Single Precision number format. The Most Significant Bit (MSB) was used to denote the sign (i.e. positive or negative value) of the vector. For most cases, the sign bit was generated randomly. For the boundary conditions (i.e. the zero condition), the difference between +0 and -0 was tested. Therefore, the sign bit was set manually rather than randomly. The second vector sub-part is an 8-bit representation of the floating point exponent. The exponent for the denormalized numbers and ±0 was a value of zero (i.e. 00000000.) The normalized numbers had an exponent ranging from 1 to 254 (i.e. 00000001 to 11111110.) The NaN and ±∞ cases carried an exponent value of 255 (i.e. 11111111.) The third vector sub-part is the fractional part of the floating point number. For the zero and infinity test cases, the fractional part is zero. For the normalized and denormalized numbers, the fractional parts are non-zero.
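The dissertation's generator was written in MATLAB; the Python sketch below is only an illustrative stand-in showing how random IEEE 754 single precision vectors built from the three sub-parts described above could be produced and converted to a decimal equivalent for inspection. The function names are assumptions.

    import random
    import struct

    def random_fpa_vector(kind="normalized"):
        """Build a 32-bit IEEE 754 single precision test vector (sketch)."""
        sign = random.randint(0, 1)
        if kind == "normalized":
            exponent, fraction = random.randint(1, 254), random.randint(1, (1 << 23) - 1)
        elif kind == "denormalized":
            exponent, fraction = 0, random.randint(1, (1 << 23) - 1)
        elif kind == "zero":
            exponent, fraction = 0, 0
        else:  # "inf"
            exponent, fraction = 255, 0
        return (sign << 31) | (exponent << 23) | fraction

    def to_decimal(bits):
        """Convert the 32-bit pattern to its decimal equivalent for inspection."""
        return struct.unpack(">f", bits.to_bytes(4, "big"))[0]

    vec = random_fpa_vector("denormalized")
    print(f"{vec:08X} -> {to_decimal(vec)}")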


4.3.1. Reference Floating Point Adder Model

With the coverage test vectors determined, they were then collected into a test file and injected into the FPA behavioral model. Each respective output was then written to an output file to become the expected result. As previously mentioned, the generated results of the reference FPA architecture had to be equivalent to the output file of expected results. Areas where they were not equivalent were indications of sections of the design where logic errors still remained and needed to be fixed. Another testbench was developed in order to stimulate the FPA dataflow architecture for verifying correct operation per the required behavioral specification. An assertion was set up within the FPA verification testbench to evaluate whether the generated result of the dataflow architecture matched the expected result of the behavioral model. For cases where they were not equivalent, an error flag was sent to the testbench monitor. The code snippet displayed in Figure 21 shows how the assertion was implemented. A conditional statement compares the FPA output C generated from the dataflow design to what is expected from the executable behavioral specification output.

IF ((C /= std_expect_val) AND (NOT(C(30 DOWNTO 23) = "11111111")
    AND NOT(std_expect_val(30 DOWNTO 23) = "11111111"))) THEN
  err_num := err_num + 1;
  error   <= err_num;
  err_sig <= '1', '0' AFTER 10 ns;  -- Display error flag in waveform
END IF;

Figure 21 – Testbench Assertion for Evaluating Expected and Actual Result

For cases where the output, C, of the model was not equal to the expected output, the testbench generated an error flag that was displayed in the waveform. By doing this, the errors could be easily identified and located in order to be addressed. Figure 22 displays the waveform that was generated for the coverage testing of the FPA dataflow architecture. In particular, Figure 22 shows how the assertion was used to compare the expected and actual results. One will notice that the row /fpa_dat/err_sig contains two error flags. For this specific case, these errors pertained to the case where Input A is +0 and Input B is -0. The expected result was to be +0; however, it was actually -0. Figure 23 displays a detail view of the waveform so that one can see how the tests were executed and displayed. For example, at simulation time 412800 ns, Input A was 7233CB79 and Input B was 7C94085F (both values displayed in hexadecimal.) The output was driven at 412900 ns and had a value of 7C940864. One can see that both the expected value and generated value matched; hence there was no error flag displayed in the error bar.

Figure 22 – Waveform of FPA Dataflow Architecture Displaying Error Flag

Figure 23 – Detail Waveform of FPA Dataflow Architecture

4.3.2. Error Insertion into Floating Point Adder Models

There were several ways the reference FPA was compromised in order to create tampered test article designs. The first was done by taking the reference Netlist and intentionally re-wiring parts of the design that would cause faulty behavior. The goal of this was to corrupt the design at a lower level than the RTL in order to make the error points more difficult to find. Figure 24 shows an example of the errors inserted into the Netlist of the FPA.

499  X_FF #(
500    .INIT ( 1'b0 ))
501  Aint_0 (
502    .CLK(latch_BUFGP),
503    .I(A_0_IBUF_35),
504    .O(Aint[0]),
505    .CE(VCC),
506    .SET(VCC),   // Inserted Error: tied SET to VCC instead of GND
507    .RST(VCC)    // Inserted Error: tied RST to VCC instead of GND
508  );
509  X_FF #(
510    .INIT ( 1'b0 ))
511  Aint_2 (
512    .CLK(latch_BUFGP),
513    .I(A_2_IBUF_57),
514    .O(Aint[2]),
515    .CE(GND),    // Inserted Error: tied CE to GND instead of VCC
516    .SET(GND),
517    .RST(GND)
518  );

Figure 24 – Small Errors Added at the Netlist Level

At lines 506 and 507, the SET and RESET inputs of the D Flip-Flop with asynchronous clear, preset, and clock enable were changed and tied to VCC instead of GND. Also, at line 515, the clock enable was tied to GND instead of VCC. Other test articles were developed by utilizing the RTL model of the FPA and synthesizing a corrupted design for implementation into the FPGA. Figure 25 displays a code snippet example showing the error insertion at the RTL level. The signal expgt at line 87 was made to be LOW instead of HIGH. This seemingly small modification will affect the output results of the FPA specifically in the case where the exponent of Input A is larger than the exponent of Input B. Hardware Trojans were also designed and entered at the RTL level of the design. These Trojans were designed to not be easily observed or tested for.

86  -- generate exponent signals
87  expgt <= '0' when (expa > expb) else '0';  -- Inserted Error: originally was expgt <= '1'
88  expeq <= '1' when (expa = expb) else '0';
89  explt <= '1' when (expa < expb) else '0';
90  expa0 <= '1' when (expa = zeroexp) else '0';
91  expa1 <= '1' when (expa = oneexp) else '0';
92  expb0 <= '1' when (expb = zeroexp) else '0';
93  expb1 <= '1' when (expb = oneexp) else '0';

Figure 25 – Errors Added at the RTL

4.4. Fixed Point to Floating Point Converter Test Article

A test article to convert fixed point values into floating point representation was developed as a way to enhance the FPA previously discussed. Figure 26 displays a macro block diagram of the converter. The fixed input is defined as a 12-bit signed value. dw is a specification for the full word length (set to dw=12) and fw is the length of the fractional bits of the input word (set to fw=8.) The integer portion of the value is therefore only 3 bits, whereas the fractional component is 8. Figure 27 shows the bit definition layout for an input word. Figure 28 shows the output bit definition for the converter output. The output is 32 bits, conforming to the IEEE 754 Floating Point Standard: the most significant bit is the sign bit, the next 8 bits are the exponent, and the remaining 23 bits form the mantissa or fractional portion. A functional test scheme was designed to be exhaustive since the test space was only 2^12.
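A minimal sketch of the conversion the test article performs is given below, assuming the dw=12, fw=8 signed fixed-point format described above; the function name and the use of Python's struct module are illustrative, not the dissertation's implementation.

    import struct

    def fixed_to_float_bits(word, dw=12, fw=8):
        """Convert a dw-bit two's-complement fixed-point word with fw fractional
        bits into its IEEE 754 single precision bit pattern."""
        if word >= 1 << (dw - 1):      # interpret as signed two's complement
            word -= 1 << dw
        value = word / (1 << fw)       # scale by 2^-fw to recover the real value
        return struct.unpack(">I", struct.pack(">f", value))[0]

    # $7FF -> 0111.11111111 -> 7.99609375 (matches Table 5's $40FFE000)
    print(f"{fixed_to_float_bits(0x7FF):08X}")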


Figure 26 – Macro Block Diagram for the Fixed Point to Floating Point Converter

Figure 27 – Fixed to Floating Point Conversion Input Bit Definition

Figure 28 – Fixed to Floating Point Conversion Output Bit Definition

4.5. Floating Point Adder with Fixed Point Conversion (Full System) Test Article

A new test article was developed by integrating the previously reviewed Floating Point Adder and Fixed Point Converter test articles together into a larger system, referred to as the Full System test article. This provided a richer test article system that lends itself to future expansion into the mixed signal domain through the integration of Analog-to-Digital Converters (ADCs) and analog sensors. This sub-section will discuss the Full System design utilized as a reference, as well as review a variety of the errors that were inserted into the test article set.


4.5.1. Reference Floating Point Adder with Fixed Point Conversion Model

A block diagram of the Full System test article can be observed in Figure 29. The diagram shows each of the major design components (e.g. Floating Point Adder, Output Buffer, etc.) as well as the three testbenches utilized for functional verification and other model checking. Testbench 1 reads in an exhaustive test vector set generated from a Python Script along with the expected outputs generated from the MATLAB behavioral models. Testbench 2 reads in a user-defined vector for purposes of isolating and performing deep analysis on specific problem areas observed in the design (i.e. functional corruption caused by an error.) Testbench 3 injects a set of exhaustive input vectors into the design and outputs a text file of the design outputs. The Output Buffer puts a delay buffer on the system output in order to allow time for the result to be setup. It essentially holds the output value state until it is ready to be updated with a new output result.

Table 5 shows some sample input and output data for the Full System test article. The table is divided up according to each component, showing the input values as well as the output values for each. For the Fixed to Floating Point Converter, Input A and Input B are shown in both hexadecimal and the actual decimal value. One should note that $7FF translates to 0111.11111111 (note the binary point.) This is because dw=12 and fw=8. Output A and Output B are both then converted into 32-bit Single Precision IEEE 754 representation and become the two inputs of the Floating Point Adder. The Data Propagation specifies the amount of time it takes before the output result is observable after new inputs have been injected into the component.


Figure 29 – Block Diagram of Floating Point Adder with Fixed Point Conversion Test Article

Converter Input A | Converter Input B | Converter Output A | Converter Output B | FPA Input A | FPA Input B | FPA Result | Data Propagation ∆T | Output Buffer
$001 (0.00390625) | $40D (4.05078125) | $3B800000 | $4081A000 | $3B800000 | $4081A000 | $4081C000 | 675 ns | $4081C000
$007 (0.02734375) | $44F (4.30859375) | $3CE00000 | $4089E000 | $3CE00000 | $4089E000 | $408AC000 | 675 ns | $408AC000
$007 (0.02734375) | $C4F (-3.69140625) | $4080E000 | $C06C4000 | $4080E000 | $C06C4000 | $3EAC0000 | 675 ns | $3EAC0000
$7FF (7.99609375) | $40D (4.05078125) | $40FFE000 | $4081A000 | $40FFE000 | $4081A000 | $4140C000 | 675 ns | $4140C000
$E3F (-1.75390625) | $7FF (7.99609375) | $BFE08000 | $40FFE000 | $BFE08000 | $40FFE000 | $40C7C000 | 675 ns | $40C7C000
$B8F (-4.44140625) | $BFF (-4.00390625) | $C08E2000 | $C0802000 | $C08E2000 | $C0802000 | $C1072000 | 675 ns | $C1072000

Table 5 – Full System Test Article Sample Data

Figure 30 shows a waveform of the test article along with the timing of the output once new values on the input are observed. The result of Fixed Inputs A and B being added together can be seen at a time ∆T = 675 ns once the new inputs have been injected. The test vectors utilized in this design were exhaustive relative to the Fixed to Floating Point Converters. Since the converters had a confined output boundary, this bounded the range of input values that could have been taken by the Floating Point Adder.


Figure 30 – Waveform of Full System Test Article

4.5.2. Error Insertion into Full System

Once the Full System test article was established, a series of errors were inserted into several of the components. The inserted errors ranged from Stuck-At faults to well-hidden malicious Trojans. This created a spectrum of errors with varying Payloads (i.e. damage capability) to mimic adversarial tampering. Table 6 presents details of each error as well as its insertion location in the Full System test article and activation mechanism (i.e. trigger.) The test articles listed in the table will be referenced later in Chapter 5 on quantifying Design Integrity.

Test Article | Error Location | Error Trigger | Description
No Error TA | None | None | No malicious circuitry added to design
Error 1 TA | Output Buffer | Time Bomb with Counter | Denial of Service attack launched once pre-set time count is met
Error 2 TA | Output Buffer | Counter Trigger | Slows down performance through counter delays
Error 3 TA | Top Module | Siphon Enable | Data is siphoned to unmonitored port when requested
Error 4 TA | Fixed to Floating Point Conversion | No Trigger | 30th bit of converter output stuck at logic HIGH

Table 6 – Error Insertion into Full System Test Article


The No Error TA is a reference test article that had no malicious circuitry added. Error 1 TA contained a Trojan counter that would run until a specified count was met. Once the count is met, a Denial of Service is forced by not allowing an output to be observable on the output port. Error 2 TA delayed each output with a counter incrementing up to 256 and then to 511; the output is only sent to the port when the counter hits either of these values. The error causes the output to be delayed by 2560 ns. For rapidly changing inputs, this causes the performance of the design to lock up and not send the correct output. For slower clocking, it means the FPA will not perform as quickly due to the delay. Error 3 TA was inserted into the Top Level Module. It contained an enable switch which, when enabled, would allow data being propagated to the output to be siphoned and streamed to another port. For Error 4 TA, a Stuck-At fault was added into both of the Fixed to IEEE 754 Conversion designs, forcing the 30th bit of the output to be stuck at Logic HIGH. For any signal values that carried a low state at that bit, an error would be observed.

4.6. MIPS Processor Test Article

The MIPS Processor test article was utilized as a complex design that integrates a variety of lower level components together and serves as an example of the IC processors utilized in sophisticated embedded systems. The MIPS design contained an 8-bit architecture based on the Patterson and Hennessy MIPS design [32]. Figure 31 shows the Top Level design along with the corresponding input/output ports of the MIPS architecture. The external memory houses the programs that are processed by the MIPS. The reset returns the Program Counter (PC) to 0. The processor sends an 8-bit address, adr, asserting either memread or memwrite. When executing a read cycle, the memory returns a value on the memdata lines. For write cycles, the memory accepts inputs from writedata.

[Figure 31 shows a crystal oscillator and 2-phase clock generator (ph1, ph2) driving the MIPS processor, which exchanges the 8-bit adr, writedata, and memdata buses with the external memory under memread/memwrite control, along with a reset input.]

Figure 31 – 8-bit MIPS Processor Top Level [33]

4.6.1. Operation of MIPS Processor

The MIPS test article implements a simple Reduced Instruction Set Computing (RISC) architecture with 32-bit instructions and was able to handle LB, SB, ADD, SUB, AND, OR, SLT, BEQ, and J instructions. The function and encoding of each instruction are outlined below in Table 7. Figure 32 details the MIPS encoding formats for Register, Immediate, and Jump Types. For Register Type encoding, the operation code (op code) is the same, but each instruction carries a unique Function Code. The Immediate format reserves 16 bits for the immediate addressing. 26 bits are reserved in the Jump Type for the destination addressing.

Instruction | Function | Encoding | Op Code | Function Code
ADD $1, $2, $3 | Addition: $1 ← $2 + $3 | Register Type | 000000 | 100000
SUB $1, $2, $3 | Subtraction: $1 ← $2 - $3 | Register Type | 000000 | 100010
AND $1, $2, $3 | Bitwise AND: $1 ← $2 AND $3 | Register Type | 000000 | 100100
OR $1, $2, $3 | Bitwise OR: $1 ← $2 OR $3 | Register Type | 000000 | 100101
SLT $1, $2, $3 | Set Less Than: $1 ← 1 if $2 < $3, else $1 ← 0 | Register Type | 000000 | 101010
ADDI $1, $2, imm | Add Immediate: $1 ← $2 + imm | Immediate Type | 001000 | N/A
BEQ $1, $2, imm | Branch if Equal: PC ← PC + imm_addr | Immediate Type | 000100 | N/A
J destination | Jump: PC ← PC + imm_addr | Jump Type | 000010 | N/A
LB $1, imm($2) | Load Byte: $1 ← mem[$2 + imm] | Immediate Type | 100000 | N/A
SB $1, imm($2) | Store Byte: mem[$2 + imm] ← $1 | Immediate Type | 101000 | N/A

Table 7 – MIPS Instruction Set
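To make the three encoding formats of Figure 32 and the op/function codes of Table 7 concrete, the sketch below assembles example instruction words. The field widths (op 6, rs/rt/rd 5, shamt 5, funct 6, immediate 16, jump address 26) follow the standard MIPS formats, and the helper names are illustrative assumptions.

    def r_type(op, rs, rt, rd, shamt, funct):
        """Register Type: op | rs | rt | rd | shamt | funct."""
        return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct

    def i_type(op, rs, rt, imm):
        """Immediate Type: op | rs | rt | 16-bit immediate."""
        return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

    def j_type(op, addr):
        """Jump Type: op | 26-bit destination address."""
        return (op << 26) | (addr & 0x3FFFFFF)

    # ADD $1, $2, $3  -> op 000000, funct 100000 (Table 7)
    print(f"{r_type(0b000000, 2, 3, 1, 0, 0b100000):08X}")
    # ADDI $1, $2, 8  -> op 001000
    print(f"{i_type(0b001000, 2, 1, 8):08X}")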


Figure 33 drops down a hierarchy level from the macro diagram shown in Figure 31 and displays the MIPS partitioned into three top level units: the Controller, the ALU Controller, and the Datapath. The Controller encompasses operation of the finite state machine and the two gates used to compute pcen. The ALU Controller consists of the combinational logic used for driving the ALU component of the processor. The Datapath is organized according to bit slice and contains the registers, buses, and functional units such as the ALU, which performs the data processing operations.

Figure 32 – MIPS Encoding Formats

Figure 33 – Block Diagram of MIPS Controller and Datapath [33]


Each instruction is 32 bits wide and is loaded in four successive fetch cycles across the 8-bit path to external memory. Figure 34 shows a more detailed diagram of the MIPS processor Datapath with the Control Unit. The processor can be operated in single cycle mode (i.e. one instruction per cycle) or pipelined in order to improve performance (i.e. handling multiple instructions concurrently.) The operation of the processor can be described as a series of process steps executed:

1. Instruction Fetch

2. Instruction decode and register read

3. Execution, memory address calculation, or branch

4. Memory access or R-type instruction completion

5. Memory Write-back

6. Update Program Counter

The first step is for the instruction to be fetched from the physical memory (e.g. Random Access Memory.) The instruction is copied from memory and placed into the Instruction Register for decoding. Once the instruction is fetched, it is decoded and the operands of the instruction are taken from their respective registers and read. In step three, the instruction is executed by the Arithmetic Logic Unit (ALU) operation. Any memory address calculations or branches occur in this step. Step four occurs with access to data memory and register-based (R-type) instruction completion. In step five, the results of the operations performed by the ALU are written back to the appropriate registers in the register file. Finally, the Program Counter is incremented in order to prepare for the next instruction.


Figure 34 – Block Diagram of MIPS Processor Datapath and Control Unit [33]

4.6.2. Arithmetic Logic Unit (ALU) and ALU Controller Test Article

The ALU is a sub-component within the MIPS Processor. The ALU receives computation instructions from the ALU Controller (e.g. add, subtract, logical AND, logical OR, etc.) and performs the selected operation on the inputs A and B of the ALU. A block diagram for the ALU and the Controller is shown in Figure 35. The Controller selection conditions for specific ALU operations are outlined in Table 8. In the ALU Operation and Function columns, the parts that are marked with an X identify the "don't care" scenarios, and the opposing column (either ALU Operation or Function) takes precedence.


Figure 35 – Block Diagram of 8-Bit ALU with Controller

ALU Operation | Function | ALU Control | Description
00 | X | 010 | Add operation for lb/sb/addi
01 | X | 110 | Subtract for BEQ
X | 100000 | 010 | Add operation
X | 100010 | 110 | Subtract operation
X | 100100 | 000 | Logical AND operation
X | 100101 | 001 | Logical OR operation
X | 101010 | 111 | Set on Less

Table 8 – ALU Controller Operations
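A small sketch of the decode behavior captured in Table 8 is shown below; treating the don't-care entries as a fall-through on the ALU Operation field is an assumption about how the table is read, and the function name is illustrative.

    def alu_control(alu_op, funct=None):
        """Return the 3-bit ALU control word per Table 8."""
        if alu_op == "00":          # lb/sb/addi: always add, funct is don't care
            return "010"
        if alu_op == "01":          # BEQ: always subtract
            return "110"
        # Otherwise (R-type): decode the function code field
        return {"100000": "010",    # ADD
                "100010": "110",    # SUB
                "100100": "000",    # AND
                "100101": "001",    # OR
                "101010": "111"}[funct]  # SLT (Set on Less)

    print(alu_control("00"), alu_control("01"), alu_control("10", "100100"))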

4.6.3. Integrating Corrupted ALU Controllers into Larger MIPS Processor System

Since the ALU and Controller are at the heart of the processor, errors were inserted into them as a means of corrupting the functional performance of the MIPS system. Figure 36 shows the test setup that was utilized for identifying the corrupted functionality. The corrupted MIPS containing the errors was run with the same test program as the reference MIPS. Monitors were set up to observe any deviation across the signal lines. A comparison could then be made between each MIPS at the individual evaluation points. The evaluation points where they were not equal were noted as corruption points caused by the embedded error.

Figure 36 – Test Setup for MIPS Processor

Four different errors were inserted into the ALU Controller, as detailed in Table 2 of Section 3.1. The corrupted ALU Controllers were integrated into the MIPS and the EIC was utilized to quantify the error damage inflicted on the larger system. Boundaries were defined around the ALU Controller, ALU output, and MIPS processor output signals as shown in Figure 37. This allowed one to observe how the erroneous functionality of the ALU Controller propagated out and affected the respective ALU and MIPS performance. Each boundary has a corresponding boundary error Payload, ρi, which represents the functional corruption observed at that boundary. In this case, three boundaries are drawn, one at the output of each component. This could easily be expanded to the larger system that would house the MIPS.


[Figure 37 depicts the error inserted into the ALU Controller and the nested measurement boundaries drawn at the ALU Controller, ALU, and MIPS processor outputs, each with a corresponding boundary Payload, within a larger system boundary.]

Figure 37 – Error Propagation through MIPS Processor

Table 9 shows the impact of the inserted errors on the functionality of the Controller, ALU, and MIPS processor. ρboundary was calculated at each component boundary and Psystem was determined by Equation (17) as the final Payload of the error on the system (outermost boundary). The Payload measurement at each boundary allows one to compare and differentiate between each error quantitatively. When analyzing the table, one should look down the vertical column at each boundary in order to get the best comparison between the four errors. For example, when evaluating the error impact at the ALU boundary, ρ2, one should note that Error 3 had the most damage (ρ2 = 0.92) whereas Error 2 had the least (ρ2 = 0.12). The final system value is an aggregate of all of the boundaries and provides an indication of the damage caused to the system as the error propagates through it. Looking at the ρ values at a single boundary provides a more even comparison between the error Payloads, since the same test schemes are utilized for evaluating each error. Errors 3 and 4 had the largest impact on the overall system with measured system error Payloads, Psystem, of 0.5140 and 0.3216 respectively. This is consistent with the higher EIC determined in Table 2.

Error | Measure | Controller (ρ1) | ALU (ρ2) | MIPS (ρ3) | Psystem
Error 1 | ε observed | 4 | 196094 | 195 |
Error 1 | ρ | 0.2 | 0.37 | 0.24 | 0.0181
Error 1 | Ttotal | 20 | 524288 | 808 |
Error 2 | ε observed | 2 | 65535 | 63 |
Error 2 | ρ | 0.1 | 0.12 | 0.08 | 0.0010
Error 2 | Ttotal | 20 | 524288 | 808 |
Error 3 | ε observed | 20 | 483842 | 450 |
Error 3 | ρ | 1 | 0.92 | 0.56 | 0.5140
Error 3 | Ttotal | 20 | 524288 | 808 |
Error 4 | ε observed | 16 | 385282 | 442 |
Error 4 | ρ | 0.8 | 0.73 | 0.55 | 0.3216
Error 4 | Ttotal | 20 | 524288 | 808 |

Table 9 – Measured System Payload on MIPS from Corrupted Controller
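The boundary Payloads in Table 9 follow directly from the observed error counts and the total evaluation points at each boundary. The short Python sketch below is illustrative only (it is not the dissertation's tooling); the helper names are chosen here for readability, and the system Payload is taken as the product of the boundary Payloads, an assumption that is consistent with the Psystem values tabulated above rather than a restatement of Equation (17).

# Illustrative sketch: boundary Payloads from observed errors and total
# evaluation points, using the Error 1 values tabulated in Table 9.
def boundary_payload(errors_observed, total_points):
    # rho = observed errors / total evaluation points at one boundary
    return errors_observed / total_points

def system_payload(rhos):
    # Psystem assumed here to be the product of the boundary Payloads
    product = 1.0
    for rho in rhos:
        product *= rho
    return product

errors = [4, 196094, 195]        # Controller, ALU, MIPS boundaries (Error 1)
totals = [20, 524288, 808]
rhos = [boundary_payload(e, t) for e, t in zip(errors, totals)]
print([round(r, 2) for r in rhos])        # [0.2, 0.37, 0.24]
print(round(system_payload(rhos), 4))     # ~0.0181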

Errors 1 and 2 had a much lower System Payload of 0.0181 and 0.0010 respectively. This is consistent with the low EIC obtained by inspection. Figure 38 and Figure 39 add another dimension to both Table 2 and Table 3 by tracking the error occurrence accumulation over increasing clock cycles. Another dimension to the Payload is the rate at which these errors occur, expressed in Equation (22) as the slope in Figure 38 and Figure 39, where y = errors and x = clock cycles.

\varepsilon_{rate} = \left[ \frac{dy}{dx} \right]_{Error} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}    (22)

The higher the value of εrate, the more severe the Payload of the error. Figure 38 shows the error accumulation at the error source location in the ALU Controller. εrate is highest for Error 3 and lowest for Error 2. Figure 39 shows the error accumulation at the MIPS boundary after the error has propagated through all other boundaries. Errors 3 and 4 have the highest εrate, whereas Errors 1 and 2 have a much lower εrate.
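In practice, εrate can be estimated by fitting a slope to the cumulative error count observed over elapsed clock cycles. The Python sketch below is a minimal illustration of that idea; the sample accumulation data is hypothetical and is not taken from Figure 38 or Figure 39.

# Illustrative sketch: estimating the error rate of Equation (22) as the
# least-squares slope of cumulative observed errors versus clock cycles.
def error_rate(clock_cycles, cumulative_errors):
    n = len(clock_cycles)
    mean_x = sum(clock_cycles) / n
    mean_y = sum(cumulative_errors) / n
    numerator = sum((x - mean_x) * (y - mean_y)
                    for x, y in zip(clock_cycles, cumulative_errors))
    denominator = sum((x - mean_x) ** 2 for x in clock_cycles)
    return numerator / denominator     # errors per clock cycle

cycles = [0, 100, 200, 300, 400]       # hypothetical monitoring points
errors = [0, 23, 47, 71, 96]           # hypothetical accumulation at a boundary
print(round(error_rate(cycles, errors), 3))   # ~0.24 errors/cycle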

Figure 38 – Observed Errors on ALU Controller

Figure 39 – Observed Errors on MIPS


4.6.4. Error Realization in the ALU Controller

The error Realization measuring techniques discussed in Section 3.3 were applied to the four errors inserted into the ALU Controllers. Table 10 and Table 11 present the data collected for the error realization analysis on each of the four corrupted ALU Controllers embedded into the MIPS Processor. One can see that the far left set of columns represents the expected MIPS data acquired through the synthesis process. The information is represented hierarchically according to architecture tiers. The 1st Architecture Tier is the Top Level architecture. The 2nd Architecture Tier is comprised of the Controller and Datapath. The 3rd Architecture Tier is comprised of all of the submodules utilized in the implementation of the design. The Nets and Leaf Cells created in synthesis are shown next to each respective component architecture. These represent the expected Nets, Xi,expected, and the expected Leaf Cells, Yi,expected.


GOLDEN MIPS vs. Error 1 MIPS (Psystem = 0.0181)

1st Tier | 2nd Tier | 3rd Tier | i | Nets (Xi,exp) | Cells (Yi,exp) | Nets (Xi) | Cells (Yi) | IΔXi | IΔYi | IΔXi/Xi,exp | IΔYi/Yi,exp
MIPS | - | - | 1 | 161 | 29 | 139 | 29 | 22 | 0 | 0.136646 | 0
MIPS | Controller (cont) | - | 2 | 142 | 78 | 102 | 53 | 40 | 25 | 0.28169 | 0.320513
MIPS | Datapath (dp) | - | 3 | 156 | 0 | 156 | 0 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | alunit (ALU) | 4 | 57 | 11 | 62 | 12 | 5 | 1 | 0.087719 | 0.090909
MIPS | Datapath (dp) | Areg | 5 | 19 | 9 | 19 | 9 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | IR0 | 6 | 42 | 16 | 78 | 32 | 36 | 16 | 0.857143 | 1
MIPS | Datapath (dp) | IR1 | 7 | 16 | 8 | 16 | 8 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | IR2 | 8 | 17 | 9 | 17 | 9 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | IR3 | 9 | 23 | 14 | 23 | 14 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | MDR | 10 | 19 | 10 | 19 | 10 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | PCreg | 11 | 19 | 8 | 19 | 8 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | RegMUX | 12 | 13 | 3 | 13 | 3 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | RES | 13 | 19 | 10 | 19 | 10 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | RF | 14 | 36 | 5 | 36 | 5 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | WRD | 15 | 19 | 9 | 19 | 9 | 0 | 0 | 0 | 0
Total (Nets, Leaf Cells) | | | | 758 | 219 | 737 | 211 | 103 | 42 | |
Grand Total (Combined) | | | | 977 | | 948 | | 145 | | |
Average (Nets, Leaf Cells) | | | | | | | | | | 0.3408 | 0.705711
Realization (R): 0.52325526    1 − R: 0.47674474

GOLDEN MIPS vs. Error 2 MIPS (Psystem = 0.0010)

1st Tier | 2nd Tier | 3rd Tier | i | Nets (Xi,exp) | Cells (Yi,exp) | Nets (Xi) | Cells (Yi) | IΔXi | IΔYi | IΔXi/Xi,exp | IΔYi/Yi,exp
MIPS | - | - | 1 | 161 | 29 | 168 | 29 | 7 | 0 | 0.0435 | 0
MIPS | Controller (cont) | - | 2 | 142 | 78 | 171 | 101 | 29 | 23 | 0.2042 | 0.2949
MIPS | Datapath (dp) | - | 3 | 156 | 0 | 159 | 0 | 3 | 0 | 0.0192 | 0
MIPS | Datapath (dp) | alunit (ALU) | 4 | 57 | 11 | 57 | 11 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | Areg | 5 | 19 | 9 | 19 | 9 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | IR0 | 6 | 42 | 16 | 32 | 17 | 10 | 1 | 0.2381 | 0.0625
MIPS | Datapath (dp) | IR1 | 7 | 16 | 8 | 16 | 8 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | IR2 | 8 | 17 | 9 | 17 | 9 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | IR3 | 9 | 23 | 14 | 23 | 14 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | MDR | 10 | 19 | 10 | 19 | 10 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | PCreg | 11 | 19 | 8 | 19 | 8 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | RegMUX | 12 | 13 | 3 | 13 | 3 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | RES | 13 | 19 | 10 | 19 | 10 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | RF | 14 | 36 | 5 | 36 | 5 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | WRD | 15 | 19 | 9 | 19 | 9 | 0 | 0 | 0 | 0
Total (Nets, Leaf Cells) | | | | 758 | 219 | 787 | 243 | 49 | 24 | |
Grand Total (Combined) | | | | 977 | | 1030 | | 73 | | |
Average (Nets, Leaf Cells) | | | | | | | | | | 0.1263 | 0.1787
Realization (R): 0.152471651    1 − R: 0.847528349

Table 10 – Realization of Errors 1 and 2 Embedded in ALU Controller


GOLDEN MIPS vs. Error 3 MIPS (Psystem = 0.5140)

1st Tier | 2nd Tier | 3rd Tier | i | Nets (Xi,exp) | Cells (Yi,exp) | Nets (Xi) | Cells (Yi) | IΔXi | IΔYi | IΔXi/Xi,exp | IΔYi/Yi,exp
MIPS | - | - | 1 | 161 | 29 | 163 | 29 | 2 | 0 | 0.012422 | 0
MIPS | Controller (cont) | - | 2 | 142 | 78 | 166 | 97 | 24 | 19 | 0.169014 | 0.24359
MIPS | Datapath (dp) | - | 3 | 156 | 0 | 154 | 0 | 2 | 0 | 0.012821 | 0
MIPS | Datapath (dp) | alunit (ALU) | 4 | 57 | 11 | 55 | 10 | 2 | 1 | 0.035088 | 0.090909
MIPS | Datapath (dp) | Areg | 5 | 19 | 9 | 19 | 9 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | IR0 | 6 | 42 | 16 | 33 | 15 | 9 | 1 | 0.214286 | 0.0625
MIPS | Datapath (dp) | IR1 | 7 | 16 | 8 | 16 | 8 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | IR2 | 8 | 17 | 9 | 17 | 9 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | IR3 | 9 | 23 | 14 | 23 | 14 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | MDR | 10 | 19 | 10 | 19 | 10 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | PCreg | 11 | 19 | 8 | 19 | 8 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | RegMUX | 12 | 13 | 3 | 13 | 3 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | RES | 13 | 19 | 10 | 19 | 10 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | RF | 14 | 36 | 5 | 36 | 5 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | WRD | 15 | 19 | 9 | 19 | 9 | 0 | 0 | 0 | 0
Total (Nets, Leaf Cells) | | | | 758 | 219 | 771 | 236 | 39 | 21 | |
Grand Total (Combined) | | | | 977 | | 1007 | | 60 | | |
Average (Nets, Leaf Cells) | | | | | | | | | | 0.088726 | 0.132333
Realization (R): 0.110529512    1 − R: 0.889470488

GOLDEN MIPS vs. Error 4 MIPS (Psystem = 0.3216)

1st Tier | 2nd Tier | 3rd Tier | i | Nets (Xi,exp) | Cells (Yi,exp) | Nets (Xi) | Cells (Yi) | IΔXi | IΔYi | IΔXi/Xi,exp | IΔYi/Yi,exp
MIPS | - | - | 1 | 161 | 29 | 164 | 29 | 3 | 0 | 0.0186 | 0
MIPS | Controller (cont) | - | 2 | 142 | 78 | 168 | 98 | 26 | 20 | 0.1831 | 0.2564
MIPS | Datapath (dp) | - | 3 | 156 | 0 | 163 | 0 | 7 | 0 | 0.0449 | 0
MIPS | Datapath (dp) | alunit (ALU) | 4 | 57 | 11 | 61 | 12 | 4 | 1 | 0.0702 | 0.0909
MIPS | Datapath (dp) | Areg | 5 | 19 | 9 | 19 | 9 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | IR0 | 6 | 42 | 16 | 56 | 26 | 14 | 10 | 0.3333 | 0.625
MIPS | Datapath (dp) | IR1 | 7 | 16 | 8 | 16 | 8 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | IR2 | 8 | 17 | 9 | 17 | 9 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | IR3 | 9 | 23 | 14 | 23 | 14 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | MDR | 10 | 19 | 10 | 19 | 10 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | PCreg | 11 | 19 | 8 | 19 | 8 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | RegMUX | 12 | 13 | 3 | 13 | 3 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | RES | 13 | 19 | 10 | 19 | 10 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | RF | 14 | 36 | 5 | 36 | 5 | 0 | 0 | 0 | 0
MIPS | Datapath (dp) | WRD | 15 | 19 | 9 | 19 | 9 | 0 | 0 | 0 | 0
Total (Nets, Leaf Cells) | | | | 758 | 219 | 812 | 250 | 54 | 31 | |
Grand Total (Combined) | | | | 977 | | 1062 | | 85 | | |
Average (Nets, Leaf Cells) | | | | | | | | | | 0.13 | 0.3241
Realization (R): 0.227064494    1 − R: 0.772935506

Table 11 – Realization of Errors 3 and 4 Embedded in ALU Controller


The columns to the right of the reference MIPS data display the synthesis results for each respective MIPS that harbors a unique error in the ALU Controller. IXi and IYi are the implementation measures of the error relative to the architecture module being evaluated and are expressed as Equation (23) and Equation (24) respectively. A value of 0 indicates no implementation in the module (i.e. no Nets or Leaf Cells were modified). A value of 1 indicates the highest amount of implementation circuitry (i.e. the majority of Nets and Leaf Cells were modified).

I_{X_i} = \frac{I\Delta X_i}{X_{i,expected}}, \quad where \; 0 \le I_{X_i} \le 1    (23)

I_{Y_i} = \frac{I\Delta Y_i}{Y_{i,expected}}, \quad where \; 0 \le I_{Y_i} \le 1    (24)

For cases where Xi,expected or Yi,expected do not exist (i.e. an added architecture module is not represented in the reference version), R for the evaluation point is 1. This is a likely scenario if a malicious module is added into the design. IXi and IYi can now be averaged separately and combined together for a final R measure as expressed in Equation (21). R is subtracted from 1 in order to invert the scaling. From Table 12, one can see that the realization measure (1−R) for Error 1 was the lowest at (1−R) = 0.4767, which indicates medium detectability according to the inequality displayed in Equation (25). Errors 2, 3, and 4 were evaluated as high detectability, with Error 3 measuring (1−R) = 0.8895. Equation (26) displays the inequality for the objective measure of the EIC.
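A minimal Python sketch of this Realization calculation is shown below. It is illustrative only: the per-module averaging is taken over the modules that show a change, which is consistent with the Average rows of Tables 10 and 11, and an added module with no reference counterpart is scored as 1, as described above.

# Illustrative sketch of the Realization measure: per-module Net and Leaf Cell
# deltas (Equations (23) and (24)) averaged and combined into R, then inverted.
def realization(modules):
    # modules: list of dicts with expected/actual Nets (x) and Leaf Cells (y)
    ix, iy = [], []
    for m in modules:
        dx = abs(m["x_actual"] - m["x_expected"])
        dy = abs(m["y_actual"] - m["y_expected"])
        if dx:
            ix.append(dx / m["x_expected"] if m["x_expected"] else 1.0)
        if dy:
            iy.append(dy / m["y_expected"] if m["y_expected"] else 1.0)
    r = 0.5 * (sum(ix) / len(ix) + sum(iy) / len(iy)) if ix and iy else 0.0
    return r, 1.0 - r

# Modules that changed for Error 2 (Table 10): MIPS, Controller, Datapath, IR0
error2 = [dict(x_expected=161, y_expected=29, x_actual=168, y_actual=29),
          dict(x_expected=142, y_expected=78, x_actual=171, y_actual=101),
          dict(x_expected=156, y_expected=0,  x_actual=159, y_actual=0),
          dict(x_expected=42,  y_expected=16, x_actual=32,  y_actual=17)]
r, one_minus_r = realization(error2)
print(round(one_minus_r, 4))   # ~0.8475, matching Table 12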

The conclusion one can draw from this analysis is that Error 3 would be the easiest to identify due to the higher number of structural changes required for realizing the error when compared to a lesser error such as Error 1. An interesting observation from Table 12 is that the error System Payload, Psystem, was also included and does not necessarily correlate to the Realization of the error. In other words, an error that maintains a high Realization rating does not necessarily carry a high Payload.

Low Detectability: 0.0000 ≤ (1−R) < 0.3333
Medium Detectability: 0.3333 ≤ (1−R) < 0.6666    (25)
High Detectability: 0.6666 ≤ (1−R) ≤ 1.0000

Low Cost: 0.0000 ≤ EIC < 0.6666
Medium Cost: 0.6666 ≤ EIC < 1.3333    (26)
High Cost: 1.3333 ≤ EIC ≤ 2.0000
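The benchmark ranges in Equations (25) and (26) translate directly into a simple classification routine. The Python sketch below is illustrative; the EIC values here are formed as the sum of (1−R) and Psystem, which is an assumption consistent with the values tabulated in Table 12 rather than a restatement of the EIC definition from Section 3.1.

# Illustrative sketch applying the detectability and cost benchmarks of
# Equations (25) and (26) to the four embedded errors.
def detectability(one_minus_r):
    if one_minus_r < 0.3333:
        return "Low Detectability"
    return "Medium Detectability" if one_minus_r < 0.6666 else "High Detectability"

def cost_rating(eic):
    if eic < 0.6666:
        return "Low Cost"
    return "Medium Cost" if eic < 1.3333 else "High Cost"

errors = {"Error 1": (0.4767, 0.0181), "Error 2": (0.8475, 0.0010),
          "Error 3": (0.8895, 0.5140), "Error 4": (0.7729, 0.3216)}
for name, (one_minus_r, payload) in errors.items():
    eic = one_minus_r + payload   # assumption consistent with Table 12
    print(name, detectability(one_minus_r), round(eic, 4), cost_rating(eic))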

Error No. | 1−R | Interpretation | Payload (Psystem) | EIC | EIC Description
Error 1 | 0.4767 | Medium Detectability | 0.0181 | 0.4948 | Low Cost
Error 2 | 0.8475 | High Detectability | 0.0010 | 0.8485 | Medium Cost
Error 3 | 0.8895 | High Detectability | 0.5140 | 1.4035 | High Cost
Error 4 | 0.7729 | High Detectability | 0.3216 | 1.0945 | Medium Cost

Table 12 – Error Realization Results from Embedded MIPS Errors

To show this in the test cases, one can see that although Error 1 had medium detectability, its System Payload was only measured as Psystem = 0.0181. Conversely, Error 2 had high detectability yet contained the lowest System Payload, measured as Psystem = 0.0010.


Figure 40 – Error Ranking Based on Detectability, System Payload, and EIC

Figure 40 displays a graph that visualizes how each of the four error scenarios compares with regard to Detectability, System Payload, and Error Implementation Cost. The line onto which each error is mapped represents the range of possible values an error can take. This allows the errors to be ranked such that the severity of one can be considered higher or lower than another.


Chapter 5: QUANTIFYING DESIGN INTEGRITY

Quantifying Design Integrity (DI) is critical to evaluating the trustworthiness of a design in question. The integrity of a design can be defined as the amount of deviation observed in a one-to-one mapping of the questionable design to its reference specification. In essence, one is seeking to answer the question, "Does the design reliably operate the way that it was intended to without any anomalous behavior?" Highest integrity therefore consists of minimal deviation (within a specified performance margin) from the original reference specification. Lowest integrity would indicate maximum deviation from the reference specification such that it does not map to the reference. The development and utilization of Trust Metrics for quantifying DI can therefore provide measurable insight into how closely the actual hardware matches the original design, as well as a measurement of how far it has deviated from it. Large deviations could indicate the presence of a pernicious hardware modification. As discussed in Chapter 1, current metrics tend to be applicable only at the supplier level of abstraction and do not address the Trust concerns at the lower component levels (e.g. gate and layout levels). By honing in on and quantifying the integrity at the design level, greater resolution can be achieved on a part-to-part basis when evaluating hardware Trust. In addition, as new standards for Trusted Microelectronics are developed and employed in the industry, integrity figures of merit may be standardized for Trusted Part certifications. These figures of merit therefore contain invaluable insight as hardware is vetted prior to insertion into larger systems.

Chapter 5 provides an overview of the Design Integrity model concept, which forms the basis for the subsequent sub-sections that discuss integrity models for Logical Equivalence, Power Consumption, Functional Correctness, Signal Activity Rate, and Structural Architecture. The various measurement techniques are presented, showing how each is leveraged to arrive at an integrity metric for its respective sub-domain. Finally, several test cases are examined that apply the techniques to evaluate the integrity of questionable designs.

5.1. Multi-Pass Approach to Trust Verification

When performing any sort of error detection analysis, there is always a concern for false positives and false negatives. Figure 41 describes each of the four possible scenarios that can occur in error detection. A true positive or true negative is a scenario where the detection mechanism was successful in its assessment. A false positive (i.e. false alarm) occurs when the presence of an error is inferred from the detection mechanism even though, in reality, no error is present in the design. A false negative occurs when no error is detected even though a real error actually resides in the system. Since the DI Analysis is being utilized as a means for determining the integrity of a questionable design, the DI measurements could also be leveraged to gain insight into the presence of any errors.


Figure 41 – DI Analysis True/False Positives and Negatives

[Figure 42 depicts a spiral flow of equivalence checks (Specification/RTL, RTL/Netlist, Manufacture/Netlist) traversed top-down from the highest level of abstraction and bottom-up from the lowest, over multiple passes (Pass 1, Pass 2, ..., Pass N) toward a Trusted Design.]

Figure 42 – Generalization of Equivalence Checks across Design Path

Figure 42 shows a spiral flow that incorporates generalized equivalence checks across the design path of a chip [34]. Each quadrant represents a level of design abstraction, starting with the Specification as the highest and the Manufactured Layout as the lowest. As one traverses across different abstraction levels, the boundary between each represents a generalized equivalence check that can be performed at that abstraction boundary (e.g. Logical Equivalence Check, Functional Equivalence Check, etc.). Multiple passes can be made through the spiral flow in order to optimize the efficiency of resource utilization. The first pass is done with minimum resource cost and is used to flag or remove false positives that have been detected. The second pass focuses on addressing the true positives detected in the design. The subsequent passes require higher resource cost; therefore, by eliminating as many false positives in the first pass as possible, the overall resource cost is minimized.

5.2. Discretized Design Integrity Model

In order to evaluate the deviation of a questionable design away from its expected performance, the design is parsed into five measurable character sub-domain profiles: Logical Equivalence, Signal Activity Rate, Functional Correctness, Structural Architecture, and Power Consumption, as shown in Figure 43. By evaluating these five sub-domains independently, one can acquire greater resolution into the design's performance and characteristics from multiple viewpoints. Figure 44 extends the collection of sub-domain profiles into an analysis that can be conducted on each profile individually. Conceptually, both the expected and actual characteristic domains are compared and the deviation is measured by a technique pertinent to the profile being analyzed (e.g. measures of dynamic and static power used in the Power Consumption domain).


Figure 43 – Parsing Design into Sub-domain Profiles

Figure 44 – Generalized Deviation of Actual Away from Expected Character Profile

A uniform mapping of the expected and actual performance domains would show no deviation (measured as 1.0) and indicate the highest design integrity for the respective domain. In a scenario where the questionable design had undergone modifications that change the performance, a non-uniform overlap of the actual and expected performance domains would be observed in one or more of the evaluated domains. The actual profile would have deviated outside of the one-to-one mapping against the expected profile. A [1, 0] scaling, normalized for each domain, is superimposed as a measurement scale and spans the range of possible sub-domain deviation distances. This benchmark scale measures and quantifies the amount of deviation of the actual design performance away from the expected design performance.

When all of the profile domains are analyzed, their normalized deviation measurements are aggregated together to arrive at a final Design Integrity, DI, measurement for the questionable design, expressed as Equation (27). Since each measurement is normalized, the different weight of each domain is pushed into the Correlation Factor, βi, which takes the non-uniform nature of each domain into account. In this work, βi was evaluated as uniform across all components (i.e. βi = 1) until further experimentation quantifies the amount of possible non-uniformity. 𝒯i is the measured value of the integrity test when applied to a specific design profile i. Equation (28) takes the measured DI and expresses it as a Figure of Deviation (FOD) that indicates the percentage of deviation away from the expected design.

DI = Design \; Integrity = \sum_{i=1}^{n} \beta_i \mathcal{T}_i    (27)

where \mathcal{T}_i = normalized domain measure (0 \le \mathcal{T}_i \le 1), and \beta_i = Correlation Factor, the weight for \mathcal{T}_i (here \beta_i = 1)

Figure 45 is the Design Integrity scale that is utilized for the measurement of a single design profile characteristic. Regarding the single domain profile, the scale is [0, 1]. Figure 46 displays the Design Integrity scaling of all the profiles aggregated together as determined by Equation (27). For the aggregated DI, the possible value is in the range of [0, 5]. The following sections will focus on techniques utilized for measuring the deviation in each of the sub-domains.


Figure 45 – Design Integrity Scale for Single Design Profile

Figure 46 – Design Integrity Scale – Aggregated Profiles

FOD = \left[ 1 - \frac{measured \; DI}{possible \; DI} \right] \cdot 100\%    (28)
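The aggregation in Equation (27) and the Figure of Deviation in Equation (28) reduce to a few lines of arithmetic. The Python sketch below is a minimal illustration; the uniform weights follow the βi = 1 assumption stated above, and the sample domain scores are the TA2 values that appear later in Table 17.

# Illustrative sketch of Equations (27) and (28): aggregating the normalized
# sub-domain measures into DI and expressing the result as a Figure of Deviation.
def design_integrity(domain_scores, weights=None):
    # domain_scores: normalized measures on [0, 1]; weights: Correlation Factors
    if weights is None:
        weights = [1.0] * len(domain_scores)       # beta_i = 1 in this work
    return sum(b * t for b, t in zip(weights, domain_scores))

def figure_of_deviation(measured_di, possible_di):
    return (1.0 - measured_di / possible_di) * 100.0    # percent deviation

ta2 = [0.67, 0.98, 0.13, 0.56, 0.99]     # SR, P, LE, S, F integrity (Table 17)
di = design_integrity(ta2)
print(round(di, 2), round(figure_of_deviation(di, 5.0), 1))   # 3.33 33.4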

5.3. Logical Equivalence Integrity

The Logical Equivalence Integrity, LEintegrity, assesses the degree to which the logic state points of the design in question map to the original reference. Cadence Conformal was the tool utilized for conducting the logical equivalence checks. Both the reference design and the compromised design were brought into the equivalence checking environment and broken down into comparison key points that could be analyzed. The key points were determined by identifying the registers, inputs, and outputs of the design. The process of equivalence checking utilizes Formal Verification to prove the logical equivalence between the expected and actual designs.

5.3.1. Utilization of Formal Verification for Determining LEintegrity

Formal Verification is a mathematical proof-based approach to logic verification that exhaustively proves the logic state properties of a given design. It encompasses two main paradigms: 1) model checking by injecting coverage test vectors within a testbench environment and 2) performing logical equivalence checks over multiple levels of abstraction of the design. Model checking is typically used for uncovering design flaws or bugs in the verification phase of the design flow prior to fabrication. The Formal Verification process is executed with the use of logical equivalence checks, which compare different design boundary abstractions for logic state equivalence. Logical equivalence checking involves verifying the Boolean logic of the given design at the same or different levels of abstraction (e.g. RTL-to-RTL, RTL-to-Gate, Gate-to-Gate, etc.). Historically, logical equivalence checking has been utilized in synthesis revision tracking and for implementation verification of white box designs. Regarding Design Integrity, its ability to observe design equivalence across multiple abstraction levels is leveraged and applied for detecting any new or modified logic.

As previously mentioned and displayed in Figure 15, the Design Integrity analysis measures the deviation between the expected design and the actual design. The Trusted Design Path, shown in Figure 47, illustrates how the comparison between the expected and actual design can be made when evaluating the logical equivalence integrity for a questionable design. The Traditional Design Path is discretized into four abstraction levels: System Specification, RTL, Netlist, and Layout. The Traditional Design Path begins at the highest abstraction level, the Specification, then moves down into the RTL and Netlist, and finally to the Layout, the lowest level of design abstraction. The Trusted Design Path is defined as the reverse of the Traditional Design Path. One can see that it begins at the Layout and finishes at the Specification level. Once the hardware has been received, Formal Verification is utilized as a means for confirming that the fabricated hardware (actual design) logically matches the design from the Traditional Design Path (expected design). In cases of black box IP, reference specifications are usually provided along with the necessary layout detail required for manufacturing (e.g. GDSII format).


Figure 47 – Trusted Design Path Flow

Behavioral models of the IP can be generated, or the layout can be reverse engineered through conversion tools as discussed in [35], to extract the Netlist and RTL representations so that they can be validated against the original reference.

Figure 48 – Logical Equivalence Check between the Reference and Untrusted Design


Figure 48 illustrates how the equivalence check process is conducted with an RTL model, considered to be the reference design from the Traditional Design Path, and the Gate-level Netlist model extracted from the Trusted Design Path. The Gate-level Netlist is the design that was potentially compromised. For this example, it is assumed that the Netlist was successfully obtained from the layout. When the two designs are brought into the equivalence checking environment, comparison key points are defined to map a connection between the reference and questionable designs. A mapped key point is a node location between the two designs which is determined to be identical. For example, based on the architecture and logic cones of each design, input X1 of the reference RTL is mapped to the input X1 of the questionable Netlist. Likewise, similar mapping points are established for registers and architecture outputs. Once the comparison key points are defined, a series of test vectors automatically generated by the equivalence checking tool are injected into both designs to stimulate logic state changes at every key point. Each key point state is evaluated as logically equivalent or non-equivalent between the reference and questionable designs. The vectors are chosen by the verification environment such that every state point between the designs is toggled, thus ensuring exhaustive testing. The equivalence check identifies the points between the two designs that are not equivalent, raising the concern for and identifying the location of the inserted error. Referencing Figure 48, the malicious circuitry in the Netlist would cause several non-equivalent points between the two designs. Equation (29) and Equation (30) are expressions for the Equivalent Points and Total Comparison Points respectively. There are m Total Comparison Points. bi is a binary value that evaluates whether the comparison key point, Pi, was found to be equivalent. For a non-equivalent point, bi = 0; for an equivalent point, bi = 1. The utilization factor of each Pi is represented as σi and gives more weight to more heavily utilized points and less weight to less utilized points.


Points_{EQ} = \sum_{i=1}^{m} b_i P_i \sigma_i, \quad where \; b_i = \begin{cases} 0, & P_i \; is \; nonequivalent \\ 1, & P_i \; is \; equivalent \end{cases}    (29)

Points_{COMPARED} = \sum_{i=1}^{m} P_i \sigma_i, \quad where \; Points_{EQ} \subseteq m    (30)

LEintegrity can now be expressed as Equation (31), which takes the ratio of Equivalent Points to the Total Comparison Points.

LE_{integrity} = \left[ \frac{Points_{EQ}}{Points_{COMPARED}} \right], \quad where \; 0 \le LE_{integrity} \le 1    (31)
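The Python sketch below is a minimal illustration of Equations (29) through (31): each comparison key point is treated as a (is_equivalent, utilization) pair, and LEintegrity is the weighted ratio of equivalent points to all compared points. The key-point data here is hypothetical and is not the Conformal output discussed in the next section.

# Illustrative sketch of Equations (29)-(31): LEintegrity as the weighted ratio
# of equivalent comparison key points to the total compared key points.
def le_integrity(key_points):
    # key_points: list of (is_equivalent, utilization_sigma) tuples
    points_eq = sum(sigma for equivalent, sigma in key_points if equivalent)
    points_compared = sum(sigma for _, sigma in key_points)
    return points_eq / points_compared

# Hypothetical design: 100 key points with uniform utilization, 4 non-equivalent
points = [(True, 1.0)] * 96 + [(False, 1.0)] * 4
print(le_integrity(points))   # 0.96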

5.3.2. Logical Equivalence Integrity Domain Test Article

The Floating Point Adder (FPA) discussed in Section 4.3 is utilized to show how the logical equivalence technique is applied for measuring LEintegrity. Figure 49 shows a snippet of the synthesized Netlist of the FPA at the locations where the errors were inserted. At lines 506 and 507, the SET and RESET inputs of the D Flip-Flop with Asynchronous Clear, Preset, and Clock Enable were changed such that both were tied to the power rail. At line 515, the Clock Enable was tied to ground. Figure 50 shows the Cadence Conformal output window after the comparison test was executed. The left window details the logical equivalence check result for each comparison key point. One can see that there are several non-equivalent points (marked in red). The right side details the results of the check for the entire design. One can observe that 93 of the key points were determined to be logically equivalent; however, three of the key points were determined to be non-equivalent.


499  X_FF #(
500    .INIT ( 1'b0 ))
501  Aint_0 (
502    .CLK(latch_BUFGP),
503    .I(A_0_IBUF_35),
504    .O(Aint[0]),
505    .CE(VCC),
506    .SET(VCC),   // Inserted Error: tied SET to VCC instead of GND
507    .RST(VCC)    // Inserted Error: tied RST to VCC instead of GND
508  );
509  X_FF #(
510    .INIT ( 1'b0 ))
511  Aint_2 (
512    .CLK(latch_BUFGP),
513    .I(A_2_IBUF_57),
514    .O(Aint[2]),
515    .CE(GND),    // Inserted Error: tied CE to GND instead of VCC
516    .SET(GND),
517    .RST(GND)
518  );

Figure 49 – Embedded Errors into Netlist of Floating Point Adder

Figure 50 – Equivalence at each Key Point (left) and Design Results (right) for Test Article 1

When the Design Integrity technique for the Logical Equivalence Domain is applied to the FPA design, an LEintegrity measure of 0.96 is obtained. This implies a Figure of Deviation of 3.125% away from the expected logical equivalence. Figure 51 shows the actual domain being mapped to the expected domain along with the measured deviation of the actual design (containing the error) away from the expected. The conclusion drawn from the measurements in this scenario is that the design in question has high integrity; however, there are concerns that warrant an additional investigation into the questionable design.

Figure 51 – Deviation of Actual Logical Equivalence from Expected Domain

With the non-equivalent points known, a deeper analysis can be conducted to determine the source of the error. When one selects the last non-equivalent point in Figure 50 (Aint_reg[0]) and runs a diagnosis, a comparison report is opened specifying the details of the respective non-equivalent point. Figure 52 displays the report generated by the equivalence check. The D Flip-Flop (comparison ID 162) in the reference model (labeled "Golden") was determined not to be equivalent to the D Flip-Flop (comparison ID 99) in the synthesized Netlist (labeled "Revised").

Figure 52 – Comparison Point Diagnosis to Uncover Non-equivalence in Test Article 1


From this point, a schematic diagnosis is instantiated as shown in Figure 53, which allows one to see clearly how the FPA design was corrupted by the errors. One can see that the SET and RESET of the D Flip-Flop in the expected design were tied to ground; however, in the corrupted design, they were tied to power. Going back to the original Netlist that had the inserted errors (reference Figure 49), one can see that indeed the SET and RESET of the D Flip-Flop Aint_0 were in fact tied to the power rail.

Figure 53 – Schematic Diagnosis of Comparison Point within Conformal

5.4. Power Consumption Integrity

The Power Consumption Integrity sub-domain, Pintegrity, measures how closely the questionable design aligns to the original reference from a power perspective. Namely, for a given test scheme, does the power consumed in the simulation tests conducted in the design phase match the actual power consumption after the design was implemented? Equation (32) expresses the expected and actual power consumptions, Pexpected and Pactual, at an arbitrary power source point, i. The power source points can be added together to arrive at a total power consumption for both the expected and actual cases. The difference between the two is represented as ΔPdist and expressed as Equation (33). The final Pintegrity is determined by Equation (34).


P_{actual} = \left[ V_i (I_{i\_dynamic} + I_{i\_static}) \right]_{actual}, \quad P_{expected} = \left[ V_i (I_{i\_dynamic} + I_{i\_static}) \right]_{expected}    (32)

\Delta P_{dist} = \left| \sum_{i=1}^{n} (P_{actual})_i - \sum_{i=1}^{m} (P_{expected})_i \right|    (33)

where n, m = total source points for the questionable and reference designs

P_{integrity} = \frac{P_{expected} - \Delta P_{dist}}{P_{expected}}, \quad such \; that \; 0 \le P_{integrity} \le 1    (34)
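A minimal Python sketch of Equations (32) through (34) is shown below. The source-point values are hypothetical placeholders; in the test case of the next section the expected and actual totals come from the synthesis power report rather than from per-point voltage and current measurements.

# Illustrative sketch of Equations (32)-(34): totals of expected and actual
# power are compared and the deviation is folded into Pintegrity.
def total_power(source_points):
    # source_points: list of (voltage, dynamic_current, static_current) tuples
    return sum(v * (i_dyn + i_stat) for v, i_dyn, i_stat in source_points)

def p_integrity(p_expected, p_actual):
    delta_p_dist = abs(p_actual - p_expected)
    return max(0.0, (p_expected - delta_p_dist) / p_expected)

expected = [(1.0, 0.010, 0.026)]        # hypothetical single supply, ~36 mW
actual   = [(1.0, 0.0105, 0.0261)]      # slightly higher dynamic draw
print(round(p_integrity(total_power(expected), total_power(actual)), 4))  # ~0.9833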

5.4.1. Power Consumption Integrity Domain Test Article

A Trojan was inserted into the ALU and Controller Test Article discussed in Section 4.6.2 in order to evaluate the Design Integrity from the Power Consumption perspective. Figure 54 illustrates the system with the inserted Trojan. The ALU and Controller are central components in a larger processor system (e.g. a MIPS Processor); therefore, any corruption in their functionality could cause greater damage beyond the components the error was embedded in. The inserted Trojan was designed to remain passive under normal operating conditions, thus making it unobservable with traditional functional testing. As the system clock runs, a counter is incremented with every clock cycle. Once the counter hits the defined count, the Trojan is activated and the output of the ALU is XOR'd with a constant, thereby corrupting the result. The question to ponder is, "Can the Power Consumption Integrity measuring techniques expose the presence of the Trojan?" A test scheme is applied to the reference design in order to obtain an expected power consumption, and then the same test scheme is applied to the corrupted design to determine the actual power consumption.

Figure 54 – Trojan Inserted into Test Article for Evaluating Power Consumption Integrity

The results of the test case are presented in Table 13. One can see that the Resource Type column breaks the power sources down into one of five different categories. The totals are aggregated together for a final total power. From here, the power deviation can be measured and the final integrity metric determined. It can be observed that the Trojan embedded test article had a Pintegrity of 0.9867, which shows only a minor disturbance of the expected power consumption. In order to have more granularity, one can look at the Pintegrity obtained for each individual resource type, shown in the far right column of Table 13. When these are averaged together, effectively normalizing each resource type, a more accurate Pintegrity of 0.664 is obtained. It can be concluded that the effects of the embedded error become visible when viewed in the power consumption domain.


Resource Type | Reference Test Article Power (mW) | Trojan 1 Test Article Power (mW) | Pintegrity
Clocks | 0 | 0.45 | 0
Logic | 0.02 | 0.02 | 1.00
Signals | 0.03 | 0.05 | 0.33
I/O | 0 | 0 | 1
Static Power | 36.07 | 36.08 | 0.99
Total | 36.12 | 36.60 | -
ΔPdist | 0.00 | 0.48 | -
Pintegrity | 1.0000 | 0.9867 | 0.664

Table 13 – Results of Power Consumption Integrity Evaluation on Test Article

Figure 55 illustrates the amount of deviation that was observed when the expected and actual Power Consumptions are compared. One can see that the design has deviated 33.6% relative to the linear reference point.

Figure 55 – Deviation of Actual Power Consumption from Expected Domain Profile

5.5. Signal Activity Rate

The Signal Rate, SR, defines the number of times the evaluated element changes state over the duration of a given test scheme and is expressed in units of millions of transitions per second (Mtr/s). Equation (35) determines the Signal Rate for element i, where fCLK is the clock frequency and σi is the utilization or toggle rate percentage of the element [36].

SR_i = f_{CLK} \, \sigma_i    (35)

The SR can be generalized for a design as the average Signal Rate and expressed as the actual and expected signal rate displayed in Equation (36) and Equation (37). The signals have been categorized as data, input/output (IO), and logic signals with a total signal quantity of a, b, and c respectively for each. This can be determined for both the actual and expected Signal Rate averages. SRactual and SRexpected can then be applied to Equation (38) to determine the deviation distance. Equation (39) is the final expression for the normalized SRintegrity.

SR_{actual} = \left[ \frac{1}{a} \sum_{i=1}^{a} (SR_{data})_i + \frac{1}{b} \sum_{i=1}^{b} (SR_{IO})_i + \frac{1}{c} \sum_{i=1}^{c} (SR_{logic})_i \right]_{actual}    (36)

SR_{expected} = \left[ \frac{1}{a} \sum_{i=1}^{a} (SR_{data})_i + \frac{1}{b} \sum_{i=1}^{b} (SR_{IO})_i + \frac{1}{c} \sum_{i=1}^{c} (SR_{logic})_i \right]_{expected}    (37)

\Delta SR_{dist} = \left| SR_{expected} - SR_{actual} \right|    (38)

SR_{integrity} = \frac{SR_{expected} - \Delta SR_{dist}}{SR_{expected}}, \quad such \; that \; 0 \le SR_{integrity} \le 1    (39)
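The Python sketch below is a minimal illustration of Equations (36) through (39): the per-category average Signal Rates are summed for the expected and actual designs, and the deviation is folded into SRintegrity. The per-signal rates used here are hypothetical; the measured values for the test article appear in Table 14.

# Illustrative sketch of Equations (36)-(39): category-averaged Signal Rates
# for the expected and actual designs are compared to form SRintegrity.
def average_signal_rate(rates):
    return sum(rates) / len(rates) if rates else 0.0

def total_signal_rate(data_rates, io_rates, logic_rates):
    return (average_signal_rate(data_rates) + average_signal_rate(io_rates)
            + average_signal_rate(logic_rates))

def sr_integrity(sr_expected, sr_actual):
    delta_sr_dist = abs(sr_expected - sr_actual)
    return max(0.0, (sr_expected - delta_sr_dist) / sr_expected)

sr_exp = total_signal_rate([2.5, 2.5], [0.0], [2.0, 2.0])   # hypothetical, Mtr/s
sr_act = total_signal_rate([1.9, 1.8], [0.0], [1.2, 1.1])
print(round(sr_integrity(sr_exp, sr_act), 3))               # ~0.667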

5.5.1. Signal Activity Rate Integrity Domain Test Article

The test article illustrated in Figure 54 was examined from the perspective of Signal Activity Rate Integrity. Table 14 displays the data collected for a sample Signal Activity Rate Domain test. The resource types looked at were Logic, I/O, and General signal types. The average signal rate for each category is determined from the total signal rate and the number of signals per resource type. The averages are then added together to arrive at a final Signal Rate. One can see that the reference test article has no deviation and therefore produces an SRintegrity of 1.00. The test article with the embedded Trojan contained a ∆SRdist of 1.463, which produced an SRintegrity of 0.676. From the perspective of the Signal Activity Rate profile, one can conclude that the effects of the embedded Trojan are observable in this domain. There were a number of new signals added that were not in the original design. The SRintegrity of 0.676 captures and shows quantifiably how far the design has deviated from the expected reference.

Resource Type | Reference TA: No. Signals (i) | Total Signal Rate | Average Signal Rate | Trojan 1 TA: No. Signals (i) | Total Signal Rate | Average Signal Rate
Logic | 25 | 50.3 | 2.012 | 60 | 70.7 | 1.178
General | 48 | 119.96 | 2.499 | 75 | 140.23 | 1.869
I/O | 32 | 0 | 0 | 33 | 0 | 0
Sum of SR | - | - | 4.511 | - | - | 3.048
ΔSRdist | | | 0.00 | | | 1.463
SRintegrity | | | 1.00 | | | 0.676

Table 14 – Results of Signal Rate Integrity Evaluation on Test Article

Figure 56 illustrates the deviation of the actual Signal Activity Rate away from the expected domain. A Figure of Deviation of 32.4% is obtained from the compromised design. This is intuitive because the counter circuit toggles with each clock cycle, thereby increasing the number of total state changes over the given reference period.


Figure 56 – Deviation of Actual Signal Activity from Expected Domain Profile

5.6. Functional Integrity

The Functional Integrity, Fintegrity, is evaluated by observing the number of errors that occur, εobserved, for a given verification test scheme. TPtotal is the total number of verification test points used for verifying the design functionality, and εobserved is the number of error cases accumulated from the test scheme. Figure 57 illustrates the testbench setup employed to verify the questionable design's functionality. The verification test scheme is designed to stimulate both the original reference and questionable designs such that the functionality of both designs can be compared. Test schemes can range from exhaustive testing to ones that only provide corner and basic functionality coverage. For every test where the questionable design does not match the reference design, an error is observed. Fintegrity can then be expressed as Equation (40), the ratio of successful tests (i.e. Expected Result equals Actual Result) to the total tests made.

F_{integrity} = \frac{TP_{total} - \varepsilon_{observed}}{TP_{total}}, \quad such \; that \; 0 \le F_{integrity} \le 1    (40)
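The Python sketch below mirrors the Figure 57 setup and Equation (40) at a toy scale: a reference model and a questionable model are driven with the same test vectors, mismatches are counted as observed errors, and the ratio gives Fintegrity. The two models here are hypothetical stand-in functions, not the dissertation's HDL testbench.

# Illustrative sketch of Equation (40): compare a questionable design against
# the golden reference over a test scheme and count the observed errors.
def f_integrity(reference_fn, questionable_fn, test_points):
    errors_observed = sum(1 for tp in test_points
                          if reference_fn(*tp) != questionable_fn(*tp))
    return (len(test_points) - errors_observed) / len(test_points)

def golden_alu(a, b):
    return a & b                     # intended Logical AND behavior

def corrupted_alu(a, b):
    result = a & b
    return result ^ 0x1 if a == 0xF else result   # hypothetical trigger condition

vectors = [(a, b) for a in range(16) for b in range(16)]   # exhaustive 4-bit sweep
print(f_integrity(golden_alu, corrupted_alu, vectors))     # 0.9375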


[Figure 57 depicts the Golden Design and the Questionable Design (DUT) both driven by the functional verification test scheme; the Expected Result and Actual Result are compared, and any mismatch is logged as an observed error.]

Figure 57 – Test Setup for Evaluating Questionable Design Functionality

5.6.1. Functional Integrity Domain Test Article

The test article used to demonstrate the Functional Integrity had an error embedded in it that corrupted the output of the ALU for every Logical AND operation. Table 15 presents the data that was obtained from the analysis. A test scheme was designed to maximize the coverage for the functional analysis. The total tests executed were 524,288 and one can see that the compromised test article had 65,537 errors observed. The Fintegrity for the compromised test article was therefore determined to be 0.87 which indicates a mild yet concerning amount of functional corruption in the design.

Resource Type | Reference Test Article | Trojan 2 Test Article
Errors Observed | 0 | 65537
Total Tests | 524288 | 524288
Ratio | 0 | 0.125
1 − Ratio | 1 | 0.874
Fintegrity | 1.00 | 0.87

Table 15 – Results of Functional Integrity Evaluation on Test Article


In a similar fashion to the other parsed out domains, Figure 58 illustrates the functional deviation of the actual design away from the expected. A Figure of Deviation of 13% is obtained.

Figure 58 – Deviation of Functionality from Expected Domain Profile

5.7. Structural Architecture Integrity

The Structural Analysis looks at the architecture components generated once the design has been synthesized into a Gate Level Netlist. For an FPGA, when the synthesis process is executed, the design Nets and Leaf Cells are represented hierarchically as sub-component architectures. As such, these become the points of comparison for identifying any deviation from the expected structure. Equation (41) and (42) determine the number of extra or removed Nets and Leaf Cells respectively for an evaluated architecture component, i.

S\Delta X_i = \left| Nets_{expected} - Nets_{actual} \right|    (41)

S\Delta Y_i = \left| Cells_{expected} - Cells_{actual} \right|    (42)

The modified Nets and Cells can then be represented as a ratio against the total Nets and Cells to arrive at the Structural Integrity, Sintegrity, expressed in Equation (44). In order to maintain the resolution of the modified circuits in a large design, only the architectures that show a modification to the Nets or Leaf Cells are considered; therefore SΔXi ≠ 0 and SΔYi ≠ 0.

\Delta S = \frac{1}{2} \left( \frac{1}{n} \sum_{i=1}^{n} \left[ \frac{S\Delta X_i}{X_{i\_expected}} \right] + \frac{1}{m} \sum_{i=1}^{m} \left[ \frac{S\Delta Y_i}{Y_{i\_expected}} \right] \right)    (43)

S_{integrity} = 1 - \Delta S    (44)

where n, m = number of modified architectures evaluated for Nets and Leaf Cells respectively (S\Delta X_i \ne 0 and S\Delta Y_i \ne 0)
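A minimal Python sketch of Equations (41) through (44) is given below, using the module counts that appear in Table 16. Note that it follows the split Net/Leaf-Cell averaging written in Equation (43); the aggregate row of Table 16 instead pools all of the non-zero change ratios into a single average, so the value it reports (0.5607) differs slightly from the split-average result.

# Illustrative sketch of Equations (41)-(44): only modules whose Net or Leaf
# Cell counts changed contribute to the deviation averages.
def s_integrity(modules):
    # modules: list of (nets_expected, nets_actual, cells_expected, cells_actual)
    net_ratios, cell_ratios = [], []
    for n_exp, n_act, c_exp, c_act in modules:
        d_nets, d_cells = abs(n_exp - n_act), abs(c_exp - c_act)
        if d_nets:
            net_ratios.append(d_nets / n_exp if n_exp else 1.0)
        if d_cells:
            cell_ratios.append(d_cells / c_exp if c_exp else 1.0)
    delta_s = 0.5 * (sum(net_ratios) / len(net_ratios)
                     + sum(cell_ratios) / len(cell_ratios))
    return 1.0 - delta_s

table16 = [(64, 66, 34, 35),   # Top Level Module
           (59, 67, 21, 21),   # ALU (Leaf Cells unchanged)
           (0, 43, 0, 31)]     # Trojan Counter (absent from the reference)
print(round(s_integrity(table16), 3))   # ~0.548 with the split averaging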

Table 16 displays the data collected for a sample Structural Architecture analysis. Each of the modules is broken down into Nets and Leaf Cells, and the changes between the actual and expected test article are tallied in the ∆S column. The number of Nets and Leaf Cells increased in the actual test article when compared to the expected one. This was due to the counter circuit detailed in Figure 54. The Structural Architecture integrity was measured as 0.5607. Figure 59 graphically shows the deviation away from the expected mapping.

Module Component | Structure Breakdown | Expected TA (Sexpected) | Actual TA1 (Sactual) | Changes ∆S = |Sexpected − Sactual| | Change Ratio ∆S/Sexpected
Top Level Module | Nets | 64 | 66 | 2 | 0.03125
Top Level Module | Leaf Cells | 34 | 35 | 1 | 0.029411765
ALU | Nets | 59 | 67 | 8 | 0.13559322
ALU | Leaf Cells | 21 | 21 | 0 | 0
Trojan Counter | Nets | 0 | 43 | 43 | 1
Trojan Counter | Leaf Cells | 0 | 31 | 31 | 1
TOTAL NETS | | 123 | 133 | |
TOTAL LEAF CELLS | | 55 | 56 | |
GRAND TOTAL OF ALL NETS/LEAF CELLS | | 178 | 189 | |
Total / no. of change categories | | | | | 0.439250997
1 − Total | | | | | 0.560749003
Sintegrity | | | | | 0.560749003

Table 16 – Results of Structural Architecture Integrity Evaluation on Test Article


Figure 59 – Deviation of Structural Architecture from Expected Domain Profile

5.8. Aggregation of DI Techniques on Simple Test Cases

Several test cases were subjected to the analysis technique in order to demonstrate the usefulness of the DI metric and FOD. The ALU and ALU Controller had errors of varying magnitudes inserted into several instantiations of the design, much like what was presented in Figure 54. The diagram shows Test Article (TA) 2, one of the designed Trojans, which consisted of a counter circuit that taps into the system clock (CLK) path. The counter is incremented at a rate proportional to the CLK frequency of the circuit. Once the counter reaches a user-defined count, a binary constant is XOR'd with the true ALU output, resulting in an incorrect output value. Table 17 presents the results collected from the analyses on each of the TA design cases. One can observe that, as expected, TA1 and TA5 had the highest integrity. TA1 had no malicious content added and therefore scored the maximum value (1.0) in each of the performance domains, which translated to 0% deviation in the FOD. TA5 had additional wires added that were benign in nature, which resulted in no change to the integrity. TA2 and TA3 had malicious content added that affected one or more of the performance domains, causing a much lower DI score. TA4 had pernicious content added that corrupted all the results of the ALU, effectively causing the lowest DI scoring and the highest FOD deviation at 40.2%. The highlighted domain categories identify the lowest scoring domains for each TA analyzed and represent the profiles of the test article that made the error observable.

Test Article | SRintegrity | Pintegrity | LEintegrity | Sintegrity | Fintegrity | Design Integrity | FOD | Description of Error Insertion
TA1 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 5.00 | 0.00% | No malicious content added
TA2 | 0.67 | 0.98 | 0.13 | 0.56 | 0.99 | 3.33 | 33.40% | Counter added to trigger XOR operation on output
TA3 | 1.00 | 0.99 | 0.00 | 0.97 | 0.87 | 3.83 | 23.40% | AND function with bit-wise invert
TA4 | 1.00 | 0.99 | 0.00 | 1.00 | 0.00 | 2.99 | 40.20% | All signal bits inverted at output
TA5 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 5.00 | 0.00% | Stray signal wire added

Table 17 – Results of Design Integrity Analysis on Test Case


Chapter 6: DESIGN INTEGRITY METRICS APPLIED TO TEST CASE EMBEDDED SYSTEM

This chapter expands on the Design Integrity techniques that were presented in the previous chapter by using a larger, more complex test article set. The Floating Point Adder with Fixed Point Conversion (Full System) discussed in Section 4.5 is utilized as a test case in order to apply the integrity evaluation on five of the Full System test articles that had errors inserted in them. The integrity of each test article is measured based on the DI analytics, allowing each system test case to be ranked and rated according to highest and lowest integrity. In addition, this chapter investigates a method for quantifying the quality of the reference being utilized in the DI analysis. Several different references are created to mimic design references that are likely to be found in practice. The Reference Quality metric is then utilized as a confidence measure and used in conjunction with the DI measure in order to obtain a final Trust Measure Figure of Merit. The Trust Measure factors the DI and Reference Quality metrics together and reduces the integrity analysis to a single value, allowing a straightforward comparison between the different designs. Finally, the Trust Measure indication is correlated to the error cost measure in order to observe the impact that a given error has on the trustworthiness of the design.


6.1. Full System Test Cases

The Full System test cases were designed with the errors outlined in Table 6. Figure 60 presents a top level block diagram of the test system as detailed in Section 4.5, comprised of two Fixed Point Converters, a Floating Point Adder, and an Output Buffer. The system allows two 12-bit fixed point inputs to be converted into the single precision IEEE 754 Standard Floating Point Format. The two values are then added together and the result propagated to the system output. Errors were placed into specific locations in the components of the design in order to model a potentially malicious IP supplier.

Figure 60 – Test Case Block Diagram

From here, the DI analysis was applied to all five test article cases. Table 18 presents the results of the analysis and details each domain profile measurement as well as the aggregated Design Integrity measure. Each of the domains is evaluated on a [0, 1] scale and marked with a color indicative of the DI scale shown in Figure 45. The Design Integrity column is consistent with the aggregate scale in Figure 46. One can see that the integrity for the No Error TA was highest, followed by the Error 4 TA. The Error 1 TA and Error 2 TA were determined to have the lowest integrities. Based on the analysis, the DI metric shows measurable differentiation between all five of the test cases and lends itself to ranking each test article in order from highest to lowest Trust.

Test Article | Pintegrity | Fintegrity | SRintegrity | Sintegrity | LEintegrity | Design Integrity
No Error TA | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 5.00
Error 1 TA | 0.7647 | 0.0034 | 0.8555 | 0.9424 | 0.7083 | 3.27
Error 2 TA | 0.7059 | 0.1330 | 0.7365 | 0.6102 | 0.7648 | 2.95
Error 3 TA | 0.7059 | 1.0000 | 0.9956 | 0.8803 | 0.8016 | 4.38
Error 4 TA | 0.9412 | 0.4993 | 0.9938 | 0.9704 | 0.9949 | 4.40

Table 18 – Design Integrity Results for the Full System Test Cases

6.2. Quantifying the Reference Quality

One question that intuitively arises when investigating the integrity analytics revolves around the quality of the reference being utilized in the analysis. As such, formulating a metric for quantifying the reference quality, RQ, and correlating it to the obtained DI metric allows one to place higher or lower confidence in the DI measures that are obtained. It also lends itself to comparing different reference types and ranking one against another for overall usefulness. The reference quality is determined by Equation (45), where n is the number of integrity domains the reference can evaluate and N is the total possible domains to evaluate. In this work, N = 5; however, as the DI analysis expands into new domains, it will be modified to N = 6, N = 7, etc.

R_Q = \frac{n}{N}, \quad where \; 0 \le R_Q \le 1 \; and \; n \le N    (45)

Table 19 shows five different reference types that were used in the DI analysis. An "X" is marked in each domain that is analyzable by the reference being evaluated. The Reference Quality metric is then scored accordingly. RQ can be used in conjunction with the DI metric to arrive at a final design Trust Measure, expressed in Equation (46), that is indicative of the confidence one can have in the insights afforded by the DI metric. Equation (47) represents the normalized DI, which allows all the DI analyses, regardless of the value of n, to map to the [0, 1] scale system.

Reference No. | P | F | SR | LE | S | n/N | Reference Quality (RQ) | Description and Format
Reference 1 | X | X | X | X | X | 5/5 | 1.00 | Synthesizable Behavioral Design (VHDL)
Reference 2 | X | | | | X | 2/5 | 0.40 | Datasheet Specification (MS Word)
Reference 3 | | X | | | | 1/5 | 0.20 | Executable Specification (MATLAB)
Reference 4 | X | X | | | X | 3/5 | 0.60 | Datasheet with Executable Specification (MATLAB/MS Word)
Reference 5 | X | | X | X | X | 4/5 | 0.80 | Synthesized Netlist

Table 19 – Description of References

Trust \; Measure = TM = DI_{norm} \cdot R_Q    (46)

DI_{norm} = \frac{DI}{n}    (47)
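The Python sketch below is a minimal illustration of Equations (45) through (47): the Reference Quality scales the normalized Design Integrity into the final Trust Measure. The example values are taken from Table 20 (Error 1 TA evaluated with Reference 5, n = 4).

# Illustrative sketch of Equations (45)-(47): Reference Quality scales the
# normalized Design Integrity into the Trust Measure.
def reference_quality(n_analyzable, total_domains=5):
    return n_analyzable / total_domains              # Equation (45)

def trust_measure(di, n_analyzable, total_domains=5):
    di_norm = di / n_analyzable                      # Equation (47)
    return di_norm * reference_quality(n_analyzable, total_domains)   # Eq. (46)

# Error 1 TA with Reference 5 (n = 4 analyzable domains), values from Table 20
print(round(trust_measure(3.27, 4), 2))              # ~0.65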

Table 20 revisits the DI metrics presented in Table 18 and displays how the TA cases would be evaluated in the DI analysis from the spectrum of different references detailed in Table 19.

Reference 1 was the highest quality because it applied to all five domains of the DI analysis. Reference 3 was the lowest quality, lending itself to being utilized in only one domain. One observation to note concerns the Error 3 TA. DInorm was measured as 0.88 for Reference 1, but measured as 1.00 by Reference 3. This is because Reference 3 does not afford the level of observability into the design that Reference 1 does for tracking the deviations caused by the error. In other words, the error is not observable if Reference 3 is used for the DI analysis. Based on this information, one could be led to believe that the Error 3 TA was of highest trust. The Trust Measure, however, accounts for the poor reference quality and adjusts the scoring to 0.20, which is significantly lower than the Reference 1 Trust Measure of 0.88.

Reference 1 (RQ = 1), Reference 2 (RQ = 2/5), and Reference 3 (RQ = 1/5):

Test Article | N | n | RQ | DI | DInorm | TM | n | RQ | DI | DInorm | TM | n | RQ | DI | DInorm | TM
No Error TA | 5 | 5 | 1 | 5.00 | 1.00 | 1.00 | 2 | 0.4 | 2.00 | 1.00 | 0.40 | 1 | 0.2 | 1.00 | 1.00 | 0.20
Error 1 TA | 5 | 5 | 1 | 3.27 | 0.65 | 0.65 | 2 | 0.4 | 1.71 | 0.85 | 0.34 | 1 | 0.2 | 0.00 | 0.00 | 0.00
Error 2 TA | 5 | 5 | 1 | 2.95 | 0.59 | 0.59 | 2 | 0.4 | 1.32 | 0.66 | 0.26 | 1 | 0.2 | 0.13 | 0.13 | 0.03
Error 3 TA | 5 | 5 | 1 | 4.38 | 0.88 | 0.88 | 2 | 0.4 | 1.59 | 0.79 | 0.32 | 1 | 0.2 | 1.00 | 1.00 | 0.20
Error 4 TA | 5 | 5 | 1 | 4.40 | 0.88 | 0.88 | 2 | 0.4 | 1.91 | 0.96 | 0.38 | 1 | 0.2 | 0.50 | 0.50 | 0.10

Reference 4 (RQ = 3/5) and Reference 5 (RQ = 4/5):

Test Article | N | n | RQ | DI | DInorm | TM | n | RQ | DI | DInorm | TM
No Error TA | 5 | 3 | 0.6 | 3.00 | 1.00 | 0.60 | 4 | 0.8 | 4.00 | 1.00 | 0.80
Error 1 TA | 5 | 3 | 0.6 | 1.71 | 0.57 | 0.34 | 4 | 0.8 | 3.27 | 0.82 | 0.65
Error 2 TA | 5 | 3 | 0.6 | 1.45 | 0.48 | 0.29 | 4 | 0.8 | 2.82 | 0.70 | 0.56
Error 3 TA | 5 | 3 | 0.6 | 2.59 | 0.86 | 0.52 | 4 | 0.8 | 3.38 | 0.85 | 0.68
Error 4 TA | 5 | 3 | 0.6 | 2.41 | 0.80 | 0.48 | 4 | 0.8 | 3.90 | 0.98 | 0.78

Trust Scaling benchmark:
1.00 : Highest Trust
0.80 - 0.99
0.60 - 0.79
0.40 - 0.59
0.20 - 0.39
0.00 - 0.19 : Lowest Trust

Table 20 – Comparison of Different References Types

The value of the TM maps to the Trust Scaling benchmark scale shown in Table 20. This effectively establishes a final Trust Figure of Merit for evaluating a design's trustworthiness. If a design is to be regarded with Highest Trust, it must be evaluated with the highest quality reference (RQ = 1) and must show no deviation from each expected domain profile in the DI analysis. The Trust Scaling allows a tradeoff to be made between the quality of the reference and the amount of design trust given for a less than optimum DI analysis (e.g. only three domain profiles analyzed). For instance, a design may score very high with a lesser quality reference; however, the highest achievable Trust Measure is adjusted, and the questionable design equates to a medium level of Trust. This is because the unanalyzed domains remain unobservable and need to be considered when assigning a final Trust value.

6.3. Distance Measures Relating Trust Measures to Error Implementation Cost

When considering the Trust Measure Figure of Merit, it is an intuitive step to correlate the TM value to the cost of the error that caused the deviation. As such, one can look to methods of distance measuring as a means for analyzing the relationship between the TM and the error cost, EIC. Equation (48) employs the Euclidean Distance, D, as a means for determining the distance between a reference test article and a compromised test article. The distance measure can also be employed for comparing the deviations from an ideal reference when utilizing less than ideal references (i.e. low RQ).

Euclidean \; Distance = D = \sqrt{ \sum_{i=1}^{n} (q_i - p_i)^2 }    (48)

Table 21 shows each of the Full System test articles and their scoring according to the EIC inspection scoring discussed in Section 3.1. The EIC was normalized, EICnorm, to a [0, 1] scale similar to the TM scale. Min-Max Normalization expressed as Equation (49) was utilized to normalize the raw EIC values to the new scale.

v' = \left[ \frac{v - old\_min_A}{old\_max_A - old\_min_A} \right] (new\_max_A - new\_min_A) + new\_min_A    (49)
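The Python sketch below illustrates Equations (48) and (49) together: a raw EIC value is min-max normalized onto [0, 1], and the Euclidean Distance of a (TM, EICnorm) point from the ideal (TM = 1, EICnorm = 0) is computed. The old-scale maximum of 9 used for normalization is an assumption that is consistent with the Table 21 values (e.g. a raw EIC of 7 mapping to 0.78).

# Illustrative sketch of Equations (48) and (49): min-max normalization of the
# raw EIC and the Euclidean Distance from the ideal (TM = 1, EICnorm = 0) point.
import math

def min_max_normalize(v, old_min, old_max, new_min=0.0, new_max=1.0):
    return (v - old_min) / (old_max - old_min) * (new_max - new_min) + new_min

def distance_from_ideal(tm, eic_norm, ideal=(1.0, 0.0)):
    return math.sqrt((tm - ideal[0]) ** 2 + (eic_norm - ideal[1]) ** 2)

eic_norm = min_max_normalize(7, 0, 9)    # Error 1 TA raw EIC of 7, assumed max of 9
print(round(eic_norm, 2))                             # ~0.78
print(round(distance_from_ideal(0.65, eic_norm), 4))  # ~0.8529 (Table 22, Reference 1)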

The TM values from Table 20 are displayed next to the EIC scoring for each test article. Table 22 expands on this by presenting a distance matrix that specifies the distance each test article and reference deviates away from the highest integrity design and highest reference quality. A distance of 0.00 represents no deviation away from the ideal reference design (i.e. no inserted errors and the highest quality reference used for the DI analysis: TM = 1, EIC = 0).

Test Article | P | T | D | EIC | EICnorm | Cost | Trust Measure (TM): Ref 1 | Ref 2 | Ref 3 | Ref 4 | Ref 5
Golden TA (TA0) | 0 | 0 | 0 | 0 | 0 | None | 1.00 | 0.40 | 0.20 | 0.60 | 0.80
Error 1 TA (TA1) | 3 | 2 | 2 | 7 | 0.78 | High | 0.65 | 0.34 | 0.00 | 0.34 | 0.65
Error 2 TA (TA2) | 2 | 1 | 2 | 5 | 0.56 | Medium | 0.59 | 0.26 | 0.03 | 0.29 | 0.56
Error 3 TA (TA3) | 0 | 3 | 2 | 5 | 0.56 | Medium | 0.88 | 0.32 | 0.20 | 0.52 | 0.68
Error 4 TA (TA4) | 1 | 0 | 1 | 2 | 0.22 | Low | 0.88 | 0.38 | 0.10 | 0.48 | 0.78

Table 21 – Test Article Error Scoring and Trust Measure across References

Test Article | EICnorm | Ref 1 | Ref 2 | Ref 3 | Ref 4 | Ref 5
TA0 | 0 | 0.0000 | 0.6000 | 0.8000 | 0.4000 | 0.2000
TA1 | 0.78 | 0.8529 | 1.0208 | 1.2677 | 1.0204 | 0.8532
TA2 | 0.56 | 0.6940 | 0.9254 | 1.1230 | 0.9044 | 0.7100
TA3 | 0.56 | 0.5734 | 0.8830 | 0.9765 | 0.7394 | 0.6466
TA4 | 0.22 | 0.2506 | 0.6557 | 0.9266 | 0.5626 | 0.3111

Table 22 – Distance Matrix of Normalized EIC and TM for each Reference

The correlation between the TM and EIC can be illustrated by the graphs shown in Figure 61. Each plot captures the distance of each test article away from the ideal reference with highest integrity (golden reference). The Golden Reference is marked at the (1, 0) point in each plot. Each plot represents a reference with each TM for the test article mapped to it. One can see that the lower the quality of the reference, the further away all of the test articles are from the Golden Reference point. For example, Reference 3 (RQ = 0.20) shows the largest deviation from the Golden Reference due to the poor quality of Reference 3. In addition, one can see that even the No Error TA is still a far distance from the Golden Reference due to the offsetting factor of the low RQ value.


Figure 61 – Correlating the Trust Measure to Error Implementation Cost


In order to gain more insight into the damage impact of an inserted error on the Trust of the design, the Payload component of the EIC measure was looked at individually and correlated to TM. Table 23 displays an updated distance matrix that utilizes the normalized Payload to characterize the error. Figure 62 shows a graphical representation of the correlation across all five references. By removing the T and D components of the EIC, one can correlate the pure damage capability of the error to the Trust Measure. This can be seen in the general trend of longer distances between the ideal Golden Reference point and the actual design when compared back to the original analysis. One observation to note is that if the error had no impact on the functionality (i.e. the P value was low), this translated to a shorter distance due to the overall lack of impact on the Trust Measure. One can see that this is the case when TA3 is compared with Reference 1 for both the Figure 61 and Figure 62 analyses. In Figure 61, the T and D components were factored in, creating a higher normalized EIC and thus giving the false impression that the error had a large impact on the Design Integrity. When they are removed, the Pnorm value is much lower than the original EICnorm. The result shows a much shorter distance because the damage capacity of the error was in reality not severe.

        Pnorm   Ref 1    Ref 2    Ref 3    Ref 4    Ref 5
TA0     0.00    0.0000   0.6000   0.8000   0.4000   0.2000
TA1     1.00    1.0579   1.1974   1.4137   1.1970   1.0581
TA2     0.67    0.7854   0.9959   1.1817   0.9763   0.7996
TA3     0.00    0.1233   0.6828   0.8000   0.4828   0.3233
TA4     0.33    0.3512   0.7003   0.9587   0.6141   0.3966

Table 23 – Distance Matrix of Normalized Payload and TM for each Reference


Figure 62 – Correlating the Trust Measure to Error Payload


Chapter 7: IP-XACT

IP-XACT is an IEEE standard (IEEE 1685) that defines the constructs and syntax for generating a high quality Extensible Markup Language (XML) description of Intellectual Property (IP). The standardized XML description greatly simplifies the process of IP reuse and distribution. The system-on-chip design flow, applicable to both FPGAs and ASICs, can be sped up tremendously by using IP-XACT. Components such as bus infrastructure, interrupt controllers, timers, UARTs, and many other pieces of IP do not give a competitive advantage to any specific vendor or provider. By allowing the reuse and marketing of these IP components, one can ensure a much faster time to market.

Invariably, one of the issues that arises with the sharing and reuse of IP is the concern of its trustworthiness and the potential for an adversary to modify the design with malicious content. IP-XACT presents an infrastructure that can be leveraged for tracking and observing design tampering scenarios through the XML code that is generated when the IP is packaged. This chapter will provide an overview of IP-XACT contributed by [37] and look at certain aspects that could be leveraged for Trust.


7.1. IP-XACT Hierarchy

The constructs and syntax used in the IP-XACT Standard can be likened to an Object Oriented Design (OOD) approach. There are seven top-level component descriptions within the IP-XACT standard that enumerate this approach: Bus Definitions, Abstraction Definitions, Components, Designs, Abstractors, Generator Chains, and Design Configurations [38]. Each level of hierarchy has required fields that pertain to the time stamp and origin of the IP, as well as specific characteristics of the design and model parameters that describe the IP. An example of this can be seen in the Component top-level description: the XML description for the model parameters of the component is found within the Model sub-element, which is one of the 16 sub-elements of Component. The OOD approach is even more apparent in the hierarchy when looking at the Views sub-element within the Model sub-element. By using the Hierarchical View, one can either reference the Design top-level element to instantiate a hardware description, or import an HDL file such as VHDL or Verilog. Using the hierarchical approach ensures that the design adheres to the IP-XACT standard. If the hierarchy is terminated by choosing the non-hierarchical view, potential Trust vulnerabilities are created.

Figure 63 outlines a specific hierarchical case. The greyed out boxes are elements at the same level of hierarchy that are not pertinent to the example case. The blue boxes show the trustworthy IP-XACT flow while the red boxes show Trust vulnerabilities. The flow diagram also highlights the use of a hierarchical reference for an IP-XACT design. The highest level of hierarchy lies at the top of the diagram. It can be clearly seen that there are optional sub-elements designated for Vendor Extensions, which pose a notable vulnerability at many levels of the hierarchy (e.g. harmful content can be injected with this element). Another potential issue is the IP supplier's option to go with a non-hierarchical view of the design. Termination at this step poses a large Trust concern in that the design from that point on no longer adheres to the IP-XACT Standard; rather, it may be imported into the IP as some hardware description. In order for a user to ensure that the IP inserted through this process is not harmful, a method for vetting the IP-XACT model is needed.

Figure 63 – Example of Hierarchical Structure of the IP-XACT Standard
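A minimal sketch of how such a vetting pass over a packaged component description might look is given below. It simply flags vendorExtensions elements and views that terminate the hierarchy (i.e. carry no hierarchyRef child). The element names follow the IEEE 1685 schema, but the namespace-agnostic matching and the reporting policy are illustrative assumptions rather than a prescribed method.

    import xml.etree.ElementTree as ET

    def local_name(tag):
        # Strip the XML namespace, e.g. '{...}vendorExtensions' -> 'vendorExtensions'.
        return tag.split('}')[-1]

    def vet_component(xml_path):
        # Flag the two Trust concerns discussed above: vendorExtensions elements
        # (possible injection points) and non-hierarchical views (hierarchy terminated).
        findings = []
        for elem in ET.parse(xml_path).getroot().iter():
            name = local_name(elem.tag)
            if name == "vendorExtensions":
                findings.append("vendorExtensions element present (possible injection point)")
            elif name == "view":
                children = {local_name(child.tag) for child in elem}
                if "hierarchyRef" not in children:
                    findings.append("non-hierarchical view (design leaves the IP-XACT hierarchy)")
        return findings

    # Hypothetical usage on a packaged component description:
    # for issue in vet_component("component.xml"):
    #     print("WARNING:", issue)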

7.1.1. Use Model of IP-XACT

The IP-XACT Standard details the constructs and syntax; however, the use of the IP-XACT model and extensive checking are left to the discretion of the supplier and user of the standardized IP. Figure 65 shows a model flow for how the IP is packaged along with the various verification steps required. The XML description of the IP-XACT definition is first generated automatically in what is known as the Packaging step. The tools used for this automated XML creation range from very simple Register Transfer Level (RTL) parsing editors all the way to open-ended automated systems that include user extensibility for the vendor [39].


One of the unique points of the Packaging step is the fact that packaging must be present in every vendor's system. Regardless of the nature of a particular vendor's IP interaction (supplier or user of IP), the XML must be either created or re-created. When leveraging the IP-XACT XML for Trust, it is essential to verify that the XML descriptions themselves are trustworthy and free of misleading changes and edits made by a malicious supplier. The flow diagram in Figure 64 explains the methodology behind this scenario. It outlines a case in which malicious IP is modified (e.g. error insertion, supplier carelessness, etc.) in the XML description through a manual edit. When the user receives the IP, a misleading XML description is received that gives a false representation of the IP sent, thereby creating the Trust issue.

Figure 64 – Packaging Step for both User and Supplier End of the IP Transfer

The next step is to check that the XML description has the correct syntax and that the schema (i.e. the language for enumerating constraints) is being followed. This step is called the Validation step of the process. It should be noted that this Validation step can be completed using integrated third-party tools such as Altova XMLSpy to validate the XML. Some IP providers and other users of IP-XACT rely on the Validation step alone to deem the IP as authenticated. This not only poses a functional issue to the user (e.g. the IP does not do what it was intended to do), but it also creates a major Trust concern; in other words, the IP could maintain the correct syntax yet still harbor pernicious content. The XML description then goes through a Screening phase where the semantics as well as the completeness of the description are checked. This step accounts for the divergence of the format from a standard XML point of view as well as an IP-XACT point of view (e.g. do all of the enumerated registers in an IP description fit in the specified address block?). The completeness of the model within itself is also checked at this point (e.g. is there a file reference in a Component level description that points to a non-existent file?)
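As a concrete illustration of the Screening-style check mentioned above (do all of the enumerated registers fit in the specified address block?), a minimal sketch is given below. The element names (addressBlock, range, register, addressOffset, size) follow the IEEE 1685 memory map structure, but the byte-addressing assumption and the simple containment rule are illustrative simplifications.

    import xml.etree.ElementTree as ET

    def _local(tag):
        return tag.split('}')[-1]

    def _child_text(elem, name):
        # Text of the first child whose local name matches, or None.
        for child in elem:
            if _local(child.tag) == name:
                return child.text
        return None

    def screen_address_blocks(xml_path):
        # Report any register whose offset plus size spills past its address block.
        # Assumes byte addressing and decimal/hex literals (e.g. '0x10').
        problems = []
        root = ET.parse(xml_path).getroot()
        for block in (e for e in root.iter() if _local(e.tag) == "addressBlock"):
            block_name = _child_text(block, "name") or "<unnamed>"
            block_range = int(_child_text(block, "range") or "0", 0)
            for reg in (e for e in block.iter() if _local(e.tag) == "register"):
                offset = int(_child_text(reg, "addressOffset") or "0", 0)
                size_bytes = max(int(_child_text(reg, "size") or "0", 0) // 8, 1)
                if offset + size_bytes > block_range:
                    problems.append(f"register '{_child_text(reg, 'name')}' "
                                    f"overflows address block '{block_name}'")
        return problems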

The Validation and Screening steps give a good diagnostic review of the completeness of the IP, as well as its adherence to the IP-XACT standard and the basic syntax rules of XML. The metric that assigns a real value to completeness, and to the impact a piece of IP will have, is covered in the succeeding steps. The subsequent Reporting phase gives a much truer account of the completeness of the IP and of the effect the missing information will have. The measurement of the effect the missing information has on the component as a whole is a pseudo Impact Analysis report that gives an IP user an idea of what kind of issues the missing or incorrect IP will cause within the design. The scope of this Reporting phase is specific to the IP being transferred, and the report will give results based on the operability of that specific piece of IP.

The next check is the System Verification step. It is at this point that the IP being transferred is put into the context of the system it is being transferred into. This is a crucial step in a standardized IP sharing scheme because, even with a passed Reporting phase (i.e. the IP being transferred is correct within its own scope), the system can still fail. Verifying the interoperability of the model is therefore the main function of the System Verification phase.

Figure 65 displays the feedback structure of a typical use model in IP-XACT. The Validation step shown in this diagram is integrated with the Packager, but a third party tool such as XMLSpy can be used separately from the Packager. Trust needs to be validated (primarily in the three grey phases) in order to ensure the XML checking flow is secure.

Figure 65 – High-Level Block Diagram of Typical IP-XACT Use Model

7.1.2. Acceptance in the EDA Community

The IP-XACT Standard is being integrated into several Electronic Design Automation (EDA) environments at different levels depending on the toolset vendor. The Spirit Consortium, a group of vendors and users of EDA tools that defines standards for Systems on Chip, was responsible for the creation of IP-XACT, and all of its members, whether users or suppliers, use IP-XACT to some capacity. Among the Board of Directors of the Spirit Consortium, Cadence Design Systems, NXP Semiconductors, and Mentor Graphics are all verified users of the IP-XACT Standard.


A good example of verified use of IP-XACT in the FPGA domain is the Xilinx Vivado Design Suite. In the context of Vivado, the user can bring in outside IP that is either IP-XACT or standard IP. Xilinx will check the IP for adherence to the standard if it is IP-XACT, or generate (package) an IP-XACT XML description of non-standardized IP. It then adds all IP-XACT IP to its built-in IP Catalog, which contains a repository of basic FPGA building block components that have IP-XACT XML descriptions. Ease of performing the System Verification step is achieved by having a repository of IP-XACT IP in the EDA toolset.

7.2. Analyzing Trust with IP-XACT

Trust can be monitored in a variety of ways using the constructs and syntax within the IP-XACT Standard. At a high level, such as the Model sub-element, the version identifier, top-level porting parameters, and bus connections may be packaged by the user of the IP, and the IP-XACT description quantified by either comparison or completeness checks similar to those in the Use Model section. This method is heavily dependent on the user having a complete Packager; as such, vulnerabilities could be exploited through manual tampering of the XML by a malicious supplier. Another construct that can be used to track Trust is the unique version identifier that is required when defining an object. The Vendor, Library, Name, and Version (VLNV) of the IP object must always be specified, and oftentimes is used as a reference within an IP description (e.g. ports are connected to different bus interfaces via VLNV.) This can be used to create a repository of trusted IP sources, as well as to provide measures of deviation away from the trusted IP. Tracking the provenance of IP through VLNV could allow the creation of a Trusted Repository.
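A minimal sketch of how such a VLNV-keyed Trusted Repository might be organized is shown below. The idea of storing a known-good fingerprint (for example, a previously recorded viewChecksum or a hash of the packaged XML) against each VLNV is an illustrative assumption, not something prescribed by the IP-XACT standard, and the entries shown are hypothetical.

    from typing import Dict, NamedTuple, Optional

    class VLNV(NamedTuple):
        # Vendor, Library, Name, Version: the identifier required for every IP-XACT object.
        vendor: str
        library: str
        name: str
        version: str

    # Hypothetical repository of trusted IP sources and their known-good fingerprints.
    trusted_repo: Dict[VLNV, str] = {
        VLNV("example.org", "trust_lib", "mips_core", "1.0"): "a3f9c2d1",
    }

    def check_provenance(vlnv: VLNV, fingerprint: str) -> Optional[str]:
        # Return None when the IP matches the repository, otherwise a warning string.
        expected = trusted_repo.get(vlnv)
        if expected is None:
            return "VLNV is not present in the Trusted Repository"
        if expected != fingerprint:
            return "fingerprint deviates from the trusted version of this IP"
        return None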


In order to analyze some of the usefulness of an IP-XACT description, an example design was packaged using the Xilinx Vivado Design Suite. The MIPS test article from Section 4.6.2 was packaged and an IP-XACT XML description was created. As illustrated in Figure 66, malicious circuitry was then added into the ALU and ALU Controller components of the MIPS, the design was re-packaged, and new IP-XACT XML was created.

Figure 66 – Example IP with Malicious Circuitry Added

The two XML descriptions were then compared line by line for differences. Aside from nomenclature differences, two usable differences emerged from the experiment, both of which happened to be parameters unique to the Vivado software. The first was a vendor extension that records the exact time stamp of when the IP was packaged, a variable called coreCreationDateTime. The second was a design identification parameter based on the IP implementation: a hexadecimal value called the viewChecksum parameter, intended to track that the design is loaded correctly into the tool. Every design contains a unique viewChecksum unless there is absolutely no change: no constraint, no pin, no IP, and no version change. This can be used to verify the trustworthiness of the IP even in cases where a malicious supplier makes changes to the original XML document, because the viewChecksum parameter is generated when a design is instantiated into Vivado. Figure 67 shows the comparison of the MIPS reference and the Corrupted MIPS design. One can see that the viewChecksum parameter is now different from the original reference version.

Figure 67 – Comparison of Checksum Values
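A minimal sketch of how such a comparison might be scripted is given below; it walks both packaged XML descriptions and reports where the coreCreationDateTime and viewChecksum parameters differ. The file names are hypothetical, and the lookup matches on local element names rather than on the exact Vivado vendor-extension schema.

    import xml.etree.ElementTree as ET

    PARAMS_OF_INTEREST = ("coreCreationDateTime", "viewChecksum")

    def extract_params(xml_path):
        # Collect every occurrence of the parameters named above, ignoring namespaces.
        values = {}
        for elem in ET.parse(xml_path).getroot().iter():
            local = elem.tag.split('}')[-1]
            if local in PARAMS_OF_INTEREST and elem.text:
                values.setdefault(local, []).append(elem.text.strip())
        return values

    def compare_descriptions(reference_xml, questionable_xml):
        ref, quest = extract_params(reference_xml), extract_params(questionable_xml)
        for name in PARAMS_OF_INTEREST:
            if ref.get(name) != quest.get(name):
                print(f"{name} differs: {ref.get(name)} vs. {quest.get(name)}")

    # Hypothetical usage with the packaged reference and corrupted MIPS descriptions:
    # compare_descriptions("mips_reference.xml", "mips_corrupted.xml")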


Chapter 8: CONCLUSION

The goal of the research reviewed in this dissertation was to develop quantifying metrics that would give increased insight into the hardware Trust concerns observed in the IC supply chain. This was achieved by improving the resolution of existing Trust evaluation techniques, moving the Trust assessment points from the supplier level down to the design component level of abstraction. By accomplishing this, hardware Trust can now be evaluated on a component-to-component level rather than just a supplier-to-supplier level, which effectively improves the resolution of the Trust assessment process and allows for a more thorough vetting of hardware. From a larger perspective, the accomplishments of this research made extensions to Verification Science.

A Trust Metric Solution Space was developed to constrain a previously loosely-defined Trust Metric Problem Space. The Solution Space was defined as a collection of domain areas that presented a roadmap-like approach to exploring quantifying Trust Metrics and developing Trust Models. Furthermore, the Trust Metric Solution Space established a coalescing point for the broader metrics work being conducted within the Trusted Microelectronics community to map to, thus creating a reference point which encourages collaboration and progress on existing and future work. The Solution Space was defined in a modular fashion such that new domains or future advances to the metrics could be easily integrated.

A large portion of the metric development process hinged on the development of reference test articles that established obfuscated error scenarios set to mimic real world adversarial intrusions and fault instances. Behavioral references were created from high level design specifications that could capture the expected design behavior in order to be compared to the actual or questionable design. An Error Implementation Cost scoring framework was established to quantify the large variation observed in error types and severity levels. This allowed measurable differentiation to be benchmarked between different errors. By accomplishing this, errors are now able to be ranked and rated, allowing one to judge error severity objectively against other errors. This created a strategic approach to test article creation, as varying degrees of design corruption could be embedded into each test article for a comprehensive spectrum of error severity scenarios. These test articles were then utilized for developing and evaluating the domain analytics that led to the Design Integrity metrics of Trust.

The Design Integrity metric parsed the design out into five different domain profiles (Logical Equivalence, Power Consumption, Functional Accuracy, Structural Architecture, and Signal Activity Rate) and was evaluated with measurement analytics specific to each respective domain. The measurement techniques created a means for measuring the deviation of a questionable design (the actual design) away from its reference (the expected design.) Each analytic was mapped to a normalized [0, 1] scale, and these values were then aggregated together across the domains to arrive at a final Design Integrity metric. The Design Integrity was defined on a [0, 5] scale with 5.0 indicating the highest Design Integrity and 0.0 the lowest. Several test case scenarios containing varying levels of inserted corruption were subjected to the Design Integrity evaluation process as a means for quantifying the integrities of the test cases against one another. The execution of this process allowed one to judge one design as measurably more or less trustworthy than another design.

Finally, a Reference Quality metric was developed in order to compare and rank the different references used in the DI analysis. The Reference Quality essentially evaluated how much confidence one could place in the integrity measures obtained from the DI analytics. As such, it was used in conjunction with the DI metric to arrive at a final Trust Measure Figure of Merit. The Trust Measure attenuated the DI measurements to allow an equivalent comparison environment in which different designs can be compared regardless of the reference type. The Trust Measure Figure of Merit contributed a means for benchmarking and comparing Trust Metrics obtained from various DI analyses. The comparison that can be made between different Trust Metrics allows one to evaluate any future metrics against the established ones to show measurable progress.

8.1. Contribution

The research work reviewed in this dissertation contributes to the emerging field of Trusted Microelectronics, which has risen out of the hardware Trust concerns caused by higher levels of outsourcing and the increased vulnerability points associated with the exponential growth trends of the Internet of Things. Metrics that can be used for evaluating the trustworthiness of a questionable design are an area of Trusted Microelectronics that has remained largely undefined and only applicable to higher levels of supply chain abstraction. By constraining the Trust Metric Problem Space and contributing a modular roadmap Solution Space, a path has been paved for future work to be integrated into the metrics progress.


The Trust Metrics contributed from this work evaluate the integrity, or trustworthiness, of the component at the design level which, to the best of the author's knowledge, has not previously been done. In addition, a means for quantifying errors to create measurable differentiation when compared to one another will allow quantitative description and cataloging of real world Trojans and faults as they are identified in practice. Quantitative descriptions of errors now provide an objective means for ranking and rating errors that previously relied on human intuition and insight for assessment. This work can be utilized to explore more automated approaches to test article creation that would be necessary for pursuing probabilistic models of Trust. Finally, the Trust Measure Figure of Merit utilized for evaluating the quality of the Trust Metric provides a benchmarking standard that previously did not exist and provides a method for showing marked improvement on any future metrics.

8.2. Recommendations for Future Work

The contribution of this research work has opened up several new paths that need to be explored in order to extend the progress made on metrics for Trusted Microelectronics. The abstraction below the component and design level, which looks at the device physics and physical semiconductor layers, remains to be explored from a metric perspective. Referencing the Trust Metric Solution Space shown in Figure 68, work remains to be conducted in the Vulnerability Analysis Domain and involves developing measurement techniques that can quantify the vulnerability in different areas of the design at the RTL, Gate Netlist, and Layout levels of design abstraction.


Figure 68 – Constrained Trust Metric Solution Space

These analytics would be instrumental for evaluating and comparing different mitigation strategies in order to improve design security. In addition, they could help identify areas of the design that need to be held to higher scrutiny and more exhaustive testing due to a lack of observability or reachability. Future Domains should be expanded into the other overlapping sections of Figure 2, defining techniques as well as metrics specific to each sub-field.


The development of Trust Metrics can be viewed as an iterative process, with future iterations improving on and adding to the previous ones. This work looked at the DI domain profiles for Logical Equivalence, Power Consumption, Structural Architecture, Functional Accuracy, and Signal Activity Rate to perform integrity analytics in order to arrive at the aggregated DI metric. There are other unexplored characteristic domains of a design that could offer increased dimensionality to the DI metric. Such domains could include Timing, Operational Anomalies, Speed, and other measurable aspects of the circuit design.

Other future work could be explored by expanding the metrics into the analog and mixed signal domains. One could attach an Analog-to-Digital Converter (ADC) to the two inputs of the Figure 29 test article to allow the integration of analog components. Each ADC could be attached to an analog sensor which would read in analog data to be converted into fixed point and then floating point as shown in Figure 69.

Figure 69 – Expanding the Full System into the Analog Domain


The Correlation Factor, βi, was utilized as a weighting parameter for correlating the normalized measures of each DI domain together in order to arrive at a single DI value. Since this work focused on the measuring techniques and analytics of each domain, the Correlation Factor was assumed to be uniform across all DI domains, though it is understood that they are non-uniform in practice. As such, research needs to be conducted in order to determine the true Correlation Factor between each of the DI domains. This will greatly improve the fidelity of the DI metric.
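For reference, one way to write this aggregation under the uniform weighting assumption used in this work is sketched below; the symbol d_i for the normalized measure of the i-th DI domain is illustrative and may differ from the notation used in earlier chapters.

    DI = \sum_{i=1}^{5} \beta_i \, d_i , \qquad d_i \in [0, 1],
    \qquad \beta_i = 1 \;\; \forall i \;\;\Rightarrow\;\; DI \in [0, 5].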

One of the realizations made in this work is that the Trust Metrics problem largely revolves around finding and developing the correct mathematics to describe the deviation distances. These distance measures are believed to be nonlinear in nature; however, more work needs to be done to fully explore and vet these theories. Linear distance measures such as Euclidean Distance were utilized as a first iteration for approximating the deviation measure. Future work should explore nonlinear distance measures that can be correlated to the TM and EIC.

This work also utilized deterministic models for exploring Trust. The development of the EIC measure allows for a large variety of errors to now be utilized in the test article development process which could be automated for mass production of obfuscated error insertion scenarios. This process could be executed through the use of XML tag schemas that insert varying error severities into designs based on the Hardware Description Language constructs. With this process automated, a large database of corrupted test articles could be utilized for establishing probabilistic models of trust that could be predictive of a questionable design’s integrity.

8.3. The Broader Impact

The concern for hardware Trust continues to intensify as society grows in reliance on electronics and technology's ability to automate, control, and simplify everyday tasks. This research work therefore has the potential to have a broad impact in the commercial and civil sectors of society. The ability of Trust Metrics to vet a questionable design in terms of malicious content prior to insertion into a larger system has major implications with regard to human safety and national security.

It is almost certain that standards for Trust will need to be developed in order to realize Trusted Design certifications that industry networks will inevitably start requiring. Design Integrity and other Trust Metrics will surely play a major role in the quantifying component of these future Trust standards. This research work is therefore well-positioned to be a major contributor to prospective Trust standards or certifications that will be developed in the future for the protection of our society.


REFERENCES

[1] S. Askari and M. Nourani, "A Design for Reliability Methodology Based on Selective Overdesign," in Design and Test Workshop (IDT), 2010.

[2] A. Thierer and A. Castillo, "Projecting the Growth and Economic Impact of the Internet of Things," Mercatus Center - George Mason University, 15 June 2015. [Online]. Available: http://mercatus.org/publication/projecting-growth-and-economic-impact-internet-things. [Accessed 23 September 2015].

[3] M. Tehranipoor and F. Koushanfar, "A Survey of Hardware Trojan Taxonomy and Detection," IEEE, Vols. 0740-7475/10/, p. 11, ©2010.

[4] M. Tehranipoor, D. Forte, R. Karri, F. Koushanfar and M. Potkonjak, "Trust Hub," National Science Foundation, [Online]. Available: www.trust-hub.org. [Accessed 8 February 2017].

[5] H. Salmani, M. Tehranipoor and J. Plusquellic, "New Design Strategy for Improving Hardware Trojan Detection and Reducing Trojan Activation Time," in International Workshop on Hardware-Oriented Security and Trust (HOST), 2009.


[6] M. Abramovici and P. Bradley, "Integrated Circuit Security - New Threats and Solutions," in Cyber Security and Information Intelligence Research Workshop, Oak Ridge, 2009.

[7] M. Jagasivamani, P. Gadfort, M. Sika, M. Bajura and M. Fritze, "Split-Fabrication Obfuscation: Metrics and Techniques," in IEEE International Symposium on Hardware-Oriented Security and Trust (HOST), 2014.

[8] J. Rajendran, O. Sinanoglu and R. Karri, "VLSI Testing based Security Metric for IC Camouflaging," in IEEE International Test Conference, 2013.

[9] H. Salmani and M. Tehranipoor, "Analyzing Circuit Vulnerability to Hardware Trojan Insertion at the Behavioral Level," in International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFTS), 2013.

[10] M. Tehranipoor, H. Salmani and X. Zhang, Integrated Circuit Authentication: Hardware Trojans and Counterfeit Detection, Springer International Publishing, 2014, pp. 135-140.

[11] S. Bhunia, M. S. Hsiao, M. Banga and S. Narasimhan, "Hardware Trojan Attacks: Threat Analysis and Countermeasures," Proceedings of the IEEE, vol. 102, no. 8, pp. 1229-1247, August 2014.

[12] S. Ali, S. Chakraborty, D. Mukhopadhyay and S. Bhunia, "Multi-level Attacks: an Emerging Security Concern for Cryptographic Hardware," in European Design and Automation Association, 2011.

[13] A. G. Voyiatzis and D. N. Serpanos, "Active Hardware Attacks and Proactive Countermeasures," in International Symposium on Computers and Communications (ISCC), 2002.

[14] J.-F. Tian, Q.-G. Sun and Q. Liu, "A Searching Model of Trustworthy Supply Chain -- TSFM," in International Conference on Information Management, Innovation Management and Industrial Engineering, 2009.

[15] P. Ratnasingam , P. A. Pavlou and Y.-h. Tan , "The Importance of Technology Trust for B2B Electronic Commerce," in Bled Electronic Commerce Conference, 2002.

[16] Y. Ma and M. Zhang, "A Computation Model of Trustworthy Degree," in International Symposium on Intelligent Information Technology Application Workshops, 2008.

[17] S. Moein and F. Gebali, "A Formal Methodology for Quantifying Overt Hardware Attacks," in Advances in Information Science and Computer Engineering, Dubai, 2015.

[18] D. Pentrack, L. Neal, J. Lloyd and A. Gahoonia, "Quantifying System Trust and Microelectronics Integrity," in GOMAC, 2015.


[19] R. Paul, I.-L. Yen, F. Bastani, J. Dong, W.-T. Tsai, K. Kavi, A. Ghafoor and J. Srivastava, "An Ontology-Based Integrated Assessment Framework for High-Assurance Systems," in IEEE International Conference on Semantic Computing, 2008.

[20] J.-H. Cho, P. M. Hurley and S. Xu, "Metrics and Measurement of Trustworthy Systems," in MILCOM, 2016.

[21] L. Feiten, M. Sauer, T. Schubert, V. Tomashevich, I. Polian and B. Becker, "Formal Vulnerability Analysis of Security Components," IEEE Transactions on Computer-aided Design of Integrated Circuits and Systems, vol. DOI 10.1109/TCAD.2015.2448687, 2015.

[22] H. Salmani, M. Tehranipoor and R. Karri, "On Design Vulnerability Analysis and Trust Benchmarks Development," in IEEE, 2013.

[23] M. Rostami, F. Koushanfar and R. Karri, "A Primer on Hardware Security: Models, Methods, and Metrics," Proceedings of the IEEE, vol. 102, no. 8, pp. 1283-1295, 2014.

[24] R. Karri, J. Rajendran, K. Rosenfeld and M. Tehranipoor, "Trustworthy Hardware: Identifying and Classifying Hardware Trojans," Computer, pp. 39-46, October 2010.

[25] R. J. Hayne, "Behavioral Fault Modeling in a VHDL Synthesis Environment," University of Virginia Library, Charlottesville, 1999.

[26] The Mitre Corporation, "Common Weakness Enumeration," Mitre, [Online]. Available: http://cwe.mitre.org. [Accessed 7 August 2014].

[27] Y. Jin, N. Kupp and Y. Makris, "Experiences in Hardware Trojan Design and Implementation," in IEEE International Workshop on Hardware-Oriented Security and Trust, 978-1-4244-4804-3/09/, 2009.

[28] H. Salmani, M. Tehranipoor and J. Plusquellic, "A Novel Technique for Improving Hardware Trojan Detection and Reducing Trojan Activation Time," 1063-8210 IEEE, ©2011.

[29] Xilinx Inc., "Vivado Design Suite - UG912 Vivado Properties," 20 December 2013. [Online]. Available: http://www.xilinx.com/support/documentation/sw_manuals/xilinx2013_4/ug912-vivado-properties.pdf. [Accessed 29 February 2016].

[30] Digilent Inc., "Digilent Inc.," [Online]. Available: http://store.digilentinc.com/. [Accessed 7 February 2016].

[31] J. E. DeGroat, "IEEE 754 32-bit Floating Point Adder," The Ohio State University, [Online]. Available: http://www2.ece.ohio-state.edu/~degroat/ECE5462. [Accessed December 2012].

[32] D. A. Patterson and J. L. Hennessy, Computer Organization and Design - The Hardware/Software Interface, Waltham, MA 02451: Elsevier, Inc., 2012.


[33] N. H. E. Weste and D. M. Harris, CMOS VLSI Design - A Circuits and Systems Perspective, Fourth ed., Boston, MA: Addison-Wesley, 2011, pp. 39-48.

[34] A. Kimura, K.-w. Liu, S. Prabhu, S. Bibyk and G. Creech, "Trusted Verification Test Bench Development for Phase-Locked Loop (PLL) Hardware Insertion," in Midwest Symposium on Circuits and Systems (MWSCAS), 2013.

[35] T. Meade, S. Zhang and Y. Jin, "Netlist Reverse Engineering for High-Level Functionality Reconstruction," in 21st Asia and South Pacific Design Automation Conference, Macau, ©2016 IEEE, 978-1-4673-9568-7.

[36] Xilinx Inc., "Vivado Design Suite - UG907 Power Analysis and Optimization," 25 July 2012. [Online]. Available: http://www.xilinx.com/support/documentation/sw_manuals/xilinx2012_2/ug907-vivado-power-analysis-optimization.pdf.

[37] G. Fragasse, A. Kimura and G. Creech, "White Paper - Using the Framework of High Quality IP-XACT XML for the Advancement of Trust Metrics and Tracking a Quantitative Trust Level," The Ohio State University, Columbus, OH, 2016.

[38] IEEE/IEC International Standard, "IP-XACT, Standard Structure for Packaging, Integrating, and Reusing IP within Tool Flows," March 24, 2015.

[39] M. van Hintum and P. Williams, "The Value of High Quality IP-XACT XML," April 2009. [Online]. Available: http://www.design-reuse.com/articles/19895/ip-xact-xml.html.

[40] M. Tehranipoor, H. Salmani and X. Zhang , "Integrated Circuit Authentication - Hardware Trojans and Counterfeit Detection," Springer International Publishing Switzerland , 2014.

[41] G. Qu and M. Potkonjak, "Intellectual Property Protection in VLSI Designs: Theory and Practice," Springer Science + Business Media, 2004, pp. 4-7.

[42] N. H. E. Weste and D. M. Harris, CMOS VLSI Design - A Circuits and Systems Perspective, Fourth ed., Boston, MA: Addison-Wesley, 2011.
