SIMD Within a Register on Linear Feedback Shift Registers
Total Page:16
File Type:pdf, Size:1020Kb
SIMD within a register on linear feedback shift registers Item Type Other Authors Ott, Karl Download date 27/09/2021 17:15:00 Link to Item http://hdl.handle.net/11122/8856 SIMD Within A Register on Linear Feedback Shift Registers A Project By Karl Ott Presented to the faculty of University of Alaska Fairbanks In Partial Fulfillment of the Requirements of MASTER OF SCIENCE IN COMPUTER SCIENCE Fairbanks Alaska April 22, 2015 SIMD Within A Register on Linear v Feedback"V • . Shift A Project By Karl Ott RECOMMENDED: Date Date Date APPROVED: Department Head, Computer Science Department Date Dean> College of Engineering and Mines Date Dean of the Graduate School Date Date Abstract Linear feedback shift registers (LFSRs) are used throughout a subset of cryptography. They have long been deployed as a means to generate a pseudo-random number stream. The random number generation provided by the LFSRs has been utilized in stream ciphers ranging from consumer to military grade. For example GSM privacy relies on the A5/1 stream cipher which in turn relies on LFSRs to generate the keystream. They are deployed because they are easy to construct, yet still provide strong cryptographic properties. The scope of this project is to speed up the simulation of LFSRs. The method of speeding up LFSRs is to use parallel operations to operate on multiple LFSRs at once. This is accomplished by using a method of SIMD. The method is SIMD within a register (SWAR). SWAR uses general purpose machine registers (eg. rax on an x86_64 machine). This means that 64 LFSRs can be simulated at once with one machine register using SWAR. This has the trade off of latency vs throughput. 2 Acknowledgments Firstly I would like to extend my sincere gratitude to my advisor, Dr. Lawlor. He has been indis pensable through the course of this project by serving as a constant source of good ideas, invaluable advice, quality feedback, and excellent instruction as a professor. One would be hard pressed to find a better advisor. I would also like to thank the other members of my committee, Dr. Genetti and Dr. Hartman. Both have taught me a significant amount as professors during my undergraduate career and con tinue to do through gradate school. This project would not have been possible without the education provided by Dr. Hay, Dr. Chappell, and Dr. Nance. I would also like to thank the rest of the profes sors who have taught me through my academic career. Fellow graduate students Matt VanVeldhuizen and Ben Bettisworth deserve recognition for putting up with my constant barrage of ”sanity checks”. Drew ”Don/Juan” Lanning and Chris Woodruff also deserve thanks for being such excellent friends over the course of my life thus far. Lastly, I would like to than my family. Specifically my parents for instilling me with a thirst for knowledge, and doing anything and everything to put me on a path towards success. Their constant support has been of the utmost importance throughout my academic career. 3 Table O f Contents 1 Introduction 7 2 Background 8 2.1 Linear Feedback Shift R egisters........................................................................................ 8 2.2 A 5 / 1 ........................................................................................................................................ 8 2.3 SIMD Within a Register.................................................................................................... 9 2.4 Using SWAR on A5/1 .......................................................................................................... 12 2.5 C U D A ....................................................................................................................................... 13 2.6 Using CU DA on A5/1 .......................................................................................................... 14 3 Results 16 3.1 Equipment Used ................................................................................................................... 16 3.2 CPU Results......................................................................................................................... 17 3.3 CU DA Results ...................................................................................................................... 21 3.4 CPU vs GPU ......................................................................................................................... 22 4 Related Work 26 5 Conclusion 27 6 References 28 a Appendix 29 A.1 a5.c........................................................................................................................................... 29 A.2 sa5.c ........................................................................................................................................ 33 A.3 pa5.c ........................................................................................................................................ 37 A.4 psa5.c ..................................................................................................................................... 41 A.5 cudaa5.cu................................................................................................................................. 45 A.6 cudasa5.cu ............................................................................................................................ 49 A.7 timing.h .................................................................................................................................. 53 A.8 timing.c— c p p ...................................................................................................................... 54 A.9 CPU and GPU Specifications ........................................................................................... 55 A.10 Computer 1 Tim ings .......................................................................................................... 56 A.11 Computer 2 T im in g s .......................................................................................................... 59 A.12 Computer 3 timings ............................................................................................................. 61 List of Figures 1 Layout of A5/1 cipher [12] .................................................................................................... 9 2 Serial 8-bit a d d itio n s............................................................................................................. 10 3 SWAR 8-bit a d d itio n ............................................................................................................. 10 4 SWAR 7-bit addition ............................................................................................................. 11 5 SWAR vs serial LFSRs ....................................................................................................... 11 4 6 CPU implementations on Computer 1 ................................................................................ 17 7 CPU results on Computer 2 ................................................................................................ 18 8 CPU results on Computer 3 ................................................................................................ 18 9 GPU implementations on Computer 1 .............................................................................. 22 10 CPU vs GPU implementations ............................................................................................ 23 11 Zoomed and Adjusted CPU vs GPU implementations .................................................. 24 List of Tables 1 Specifications of LFSRs in A5/1 [3] ...................................................................................... 8 2 A5/1 majority function .......................................................................................................... 12 3 K-map for A5/1 majority function ...................................................................................... 12 4 Hardware used .......................................................................................................................... 16 5 Factor speedup/slowdown at 1 simulation on Computer 1 .............................................. 18 6 Factor speedup/slowdown at 256 simulations on Computer 1 ........................................ 18 7 Factor speedup/slowdown at 65535 simulations on Computer 1 .................................. 19 8 Computer 1 Ideal Speedup Factor ....................................................................................... 19 9 Computer 1 Actual Speedup Factor ...................................................................................... 19 10 Computer 1 Ideal Speed Factor ............................................................................................ 19 11 Computer 1 Actual Speed Factor ......................................................................................... 19 12 Computer 1 Ideal Speed Factor ............................................................................................ 20 13 Computer 1 Actual Speed Factor ......................................................................................... 20 14 Calculated Factors for Computer 2 at 65536 Simulations ............................................... 20 15 Computer 2 Ideal Speed Factor ............................................................................................ 20 16 Computer 2 Actual Speed Factor ........................................................................................