Configuration Scrubbing Architectures for High-Reliability FPGA Systems

Brigham Young University
BYU ScholarsArchive
Theses and Dissertations
2015-12-01

Aaron Gerald Stoddard
Brigham Young University

BYU ScholarsArchive Citation
Stoddard, Aaron Gerald, "Configuration Scrubbing Architectures for High-Reliability FPGA Systems" (2015). Theses and Dissertations. 5704. https://scholarsarchive.byu.edu/etd/5704

A thesis submitted to the faculty of Brigham Young University in partial fulfillment of the requirements for the degree of Master of Science

Michael J. Wirthlin, Chair
Brent E. Nelson
Gregory P. Nordin

Department of Electrical and Computer Engineering
Brigham Young University
December 2015

Copyright © 2015 Aaron Gerald Stoddard
All Rights Reserved

ABSTRACT

Field Programmable Gate Arrays (FPGAs) are being used more frequently in space applications because of their reconfigurability and intensive processing capabilities. FPGAs in environments like space are susceptible to ionizing radiation which can cause Single Event Upsets (SEUs) in the FPGA's configuration memory. These upsets may cause the programmed user design on the FPGA to deviate from its normal behavior. Space missions cannot afford to allow important data processing applications to become corrupted due to these radiation upsets.
Configuration scrubbing is an upset mitigation technique that detects and corrects upsets in an FPGA's configuration memory. Configuration scrubbing periodically monitors an FPGA's configuration memory utilizing mechanisms such as Error Correction Codes (ECCs), Cyclic Redundancy Checks (CRCs), a protected golden file, and partial reconfiguration to detect and correct upset memory bits. This work presents improved Xilinx 7-Series configuration scrubbing architectures that achieve minimal hardware footprints, competitive performance metrics, and robust detection and correction capabilities. The two principal scrubbing architectures presented in this work are the readback and hybrid scrubbers which detect and correct Single Bit Upsets (SBUs) and Multi-Bit Upsets (MBUs). Harnessing the performance advantages granted by the 7-Series internal Readback CRC scan, a hybrid scrubber built in software for the Zynq XZC07020 FPGA has been measured to correct SBUs in 8.024 ms, even-numbered MBUs in 13.38 ms, and odd-numbered MBUs in 21.40 ms. It can also perform a full readback scrub of the entire device in under two seconds. These scrubbing architectures were validated in radiation beam tests, where one of the architectures corrected MBUs as large as sixteen bits in a single frame.

Keywords: FPGA, radiation testing, BYU, scrubbing, configuration, readback, reliability, TMR, SEU, ECC, CRC, MBU, Zynq, PCAP, Xilinx, bitstream, upset mitigation, 7-Series

ACKNOWLEDGMENTS

This work could not have been accomplished without the help of my advisor, Dr. Michael Wirthlin. His encouragement and tutelage were paramount to overcoming each obstacle that we encountered. Dr. Brent Nelson and Dr. Gregory Nordin were also instrumental in improving the clarity and cohesion of my writing.

I sincerely appreciate the members of the scrubbing team in the BYU Configurable Computing Lab. Without their help, this work would not be possible.
Ammon Gruwell, Alex Harding, and Peter Zabriskie were all vital in developing innovative ideas to tackle the many complications that we faced in our scrubbing journey. I would also like to thank Alex Wilson and Jordan Anderson, who integrated the Readback and Hybrid Scrubbers into the Linux CSP flight software, making it possible for these scrubbers to actually go up into outer space. They also made it possible to get results from these scrubbers functioning in radiation beam tests.

My gratitude also goes to CISCO and to the Los Alamos Neutron Science Center (LANSCE), which provided valuable radiation beam time to validate the scrubbers presented in this thesis. Their technical staffs and facilities were tremendously helpful.

Finally, I could not have done what I did without the continuing support of my family, particularly from my uncle, who first inspired me to pursue my education in this field; my father, who provided significant logistical assistance to make my education possible; and my wife, whose daily support, patience, and constant encouragement have been an unending source of motivation and determination.

This work was supported by the I/UCRC Program of the National Science Foundation under Grant No. 1265957 through the NSF Center for High-Performance Reconfigurable Computing (CHREC) and the Utah NASA Space Grant Consortium.

Contents

List of Tables
List of Figures

1 Introduction
  1.1 Thesis Contributions
  1.2 Thesis Organization
2 FPGA Reliability in High-Radiation Environments
  2.1 Ionizing Radiation Overview
    2.1.1 Radiation Sources
    2.1.2 Radiation: Single Event Effects
    2.1.3 FPGA Transistor Density
  2.2 General FPGA Upset Mitigation Techniques
    2.2.1 Radiation Hardening
    2.2.2 Error Correction Code
    2.2.3 Cyclic Redundancy Check
    2.2.4 Triple Modular Redundancy
    2.2.5 Memory Scrubbing
    2.2.6 Configuration Scrubbing
3 Configuration of Xilinx FPGAs
  3.1 Xilinx FPGA Architecture
  3.2 Configuration Bitstreams
    3.2.1 Frames
    3.2.2 Bitstream Composition
  3.3 Configuration Module
    3.3.1 Configuration Process
    3.3.2 Configuration Interfaces
    3.3.3 Configuration Registers
    3.3.4 Synchronization Commands
    3.3.5 Configuration Operations
  3.4 Readback
    3.4.1 Bit Classifications
    3.4.2 Dummy Frames
  3.5 Partial Reconfiguration
4 SEU Configuration Memory Protection for 7-Series FPGAs
  4.1 7-Series Frame Layout
    4.1.1 Frame Block Types
    4.1.2 ECC Word
    4.1.3 Frame Interleaving
  4.2 FRAME_ECCE2
    4.2.1 FRAME_ECCE2 Signals
    4.2.2 CRCERROR Signal
  4.3 Readback CRC
    4.3.1 Readback CRC Action
    4.3.2 Upset Types
    4.3.3 Multiple Frames with Upsets
    4.3.4 Readback CRC Behavior
    4.3.5 Simultaneous Upset Example
5 FPGA Scrubbing Techniques
  5.1 Scrubbing Basics
  5.2 External and Internal Scrubbers
  5.3 Common Scrubbing Strategies
    5.3.1 Blind Scrubbing
    5.3.2 Readback Scrubbing
  5.4 Scrubbing Examples
    5.4.1 Virtex-4 Scrubbers
    5.4.2 Virtex-5 Scrubbers
    5.4.3 Virtex-6 Scrubbers
  5.5 Hybrid Scrubbing
    5.5.1 Early Hybrid Scrubbers
    5.5.2 Xilinx Soft Error Mitigation IP (SEM IP)
6 7-Series Scrubbing Architectures
  6.1 Frame Row Boundaries
    6.1.1 FAR Auto-Increment Plateauing Behavior
    6.1.2 Using Dummy Frames to Overcome Row Boundaries
  6.2 Scrubbing Files
    6.2.1 Golden Data
    6.2.2 Mask File
    6.2.3 Essential Bits File
    6.2.4 Frame Address (FRADs) List
  6.3 Blind Scrubbing Architecture
  6.4 Readback Scrubbing Architecture
  6.5 Hybrid Scrubbing Architecture
    6.5.1 Hybrid Scrubbing Components
    6.5.2 Hybrid Flow
7 Zynq-7000 Scrubbing
  7.1 Scrubbing with the Zynq-7000 Family
    7.1.1 Processor Configuration Access Port (PCAP)
    7.1.2 Device Configuration Interface (DevC)
    7.1.3 PCAP Limitations
  7.2 Zynq Readback Scrubber
    7.2.1 Performing Readback on the Zynq
    7.2.2 Readback Radiation Testing
  7.3 Zynq Hybrid Scrubber
    7.3.1 Hybrid Hardware
    7.3.2 Hybrid Scrubber Performance
    7.3.3 Hybrid Radiation Testing
8 Conclusion
Acronyms
Bibliography
A TMR Markov Modeling
B PCAP Transfer Process
C Proprietary CRC Registers
D Configuration Command Sequences for the PCAP
  D.1 Write Operations
    D.1.1 Write Configuration Registers
    D.1.2 Write Configuration Data
  D.2 Read Operations
    D.2.1 Read Configuration Registers
    D.2.2 Performing Readback of Configuration Data
E Scrubbing Other Memory Types
  E.1 Block RAM ECC
  E.2 Dynamic Reconfiguration Ports
  E.3 Readback Capture
F FRADs List for Zynq XZC07020 FPGA
G Excerpts from Hybrid Scrubber Log

List of Tables

2.1 Largest and Smallest Bitstream Sizes for Recent Xilinx Families
3.1 CLB Resources for Virtex 7-Series FPGAs
3.2 Number of 32-bit Words in a Frame for Different Xilinx Families
3.3 Xilinx 7-Series Configuration Registers Used in Scrubbing [1]
