Ralf Zimmermann Bochum, June 2015 Copyright © 2015 by Ralf Zimmermann

CRYPTANALYSIS USING RECONFIGURABLE HARDWARE CLUSTERS FOR HIGH-PERFORMANCE COMPUTING

DISSERTATION submitted for the degree of Doktor-Ingenieur to the Faculty of Electrical Engineering and Information Technology at Ruhr-Universität Bochum

by Ralf Zimmermann
Bochum, June 2015

Copyright © 2015 by Ralf Zimmermann. All rights reserved. Printed in Germany.

To my beloved wife, Heike.

Ralf Zimmermann
Place of birth: Cologne, Germany
Author’s contact information: [email protected] www.rub.de

Thesis Advisor: Prof. Dr.-Ing. Christof Paar, Ruhr-Universität Bochum, Germany
Secondary Referee: Prof. Dr. Tanja Lange, Technische Universiteit Eindhoven, Netherlands
Thesis submitted: June 10th, 2015
Thesis defense: July 13th, 2015
Last revision: March 16, 2016

Source: “Piled Higher and Deeper” by Jorge Cham, www.phdcomics.com

Abstract

Today, we share our thoughts, habits, and acquaintances in social networks at every step we take in our lives and use network-based services like smart grid, home automation, and the Internet of Things. As the connectivity and data flow between sensors and networks grow, we rely more and more on cryptographic primitives to prevent misuse of services, protect data, and ensure data integrity, authenticity, and confidentiality, provided that the primitives remain secure for as long as the data is considered useful. History shows the need for thorough cryptanalysis not only on the theoretical level but also using state-of-the-art technology: by applying the best implementation of suitable attacks to cutting-edge hardware, we derive upper bounds on the security level of cryptographic algorithms. This allows us to suggest adjustments of security parameters or to exchange primitives at an early stage. The focus of this thesis is an analysis of the effects of hardware acceleration using clusters of reconfigurable devices for cryptanalytical tasks and security evaluations of practical attacks.
As not all tasks are equally suitable for hardware implementations, this thesis covers different areas of cryptography and cryptanalysis in four major projects, i.e., algebraic attacks on stream ciphers, post-quantum cryptography, password search, and elliptic curve cryptography.

The first project, Dynamic Cube Attack on the Grain-128 Stream Cipher, introduces a new type of algebraic attack against the Grain-128 stream cipher, based on an improved version of cube testers. Verifying the attack required special-purpose hardware.

The second project covers Password Search against Key Derivation Functions and evaluates the security of two of the current standards in password-based key derivation: PBKDF2 and bcrypt. We analyze the effects of special-purpose hardware for both low-power attacks and well-funded, powerful adversaries.

In the third project, Elliptic Curve Discrete Logarithm Problem on sect113r2, we target the ECDL computation on the sect113r2 elliptic curve, an unbroken SECG standard binary elliptic curve. We implemented Pollard’s rho algorithm in combination with the negation-map technique on FPGAs to increase the efficiency of the random walk, which had not been done before.

The last part consists of the project Information Set Decoding against McEliece, in which we designed the first hardware-accelerated implementation of an Information Set Decoding (ISD) attack against the code-based cryptosystem McEliece. We present a proof-of-concept implementation of ISD on reconfigurable devices and discuss the benefits and restrictions of our hardware approach to provide a solid basis for upcoming hardware implementations.
The results of the projects show that special-purpose hardware is an important platform for accelerating cryptanalytic tasks and that, even though the speed gain depends heavily on the algorithm and the choice of hardware platform, it plays a key role in practical attacks and security evaluations of new cryptographic primitives. Consequently, the design of new primitives now invests considerable effort in reducing the advantage of massively parallelized and energy-efficient attack implementations.

Keywords

Cryptanalysis, Reconfigurable Hardware, FPGA, Cluster, High-Performance Computation, Implementation.

Kurzfassung (German Summary)

High-Performance Computers Built from Reconfigurable Hardware for Applications in Cryptanalysis

Nowadays, we have become accustomed to sharing our thoughts, habits, and acquaintances in social networks at any time. To this end, we use network-based services such as the smart grid, remotely controlled home automation, or the Internet of Things. As the connectivity between people and networks and the flow of data increase, so does the importance of reliable protection against data misuse. For this, we rely on cryptographic primitives, which we use to protect data integrity, authenticity, and confidentiality. These primitives must remain secure for as long as the data may potentially be used. History has shown that cryptanalysis is not only of theoretical importance but must also take the current state of technology into account. By using optimal attacks in combination with the most modern hardware, the security level of cryptographic algorithms can be bounded from above. This makes it possible to propose adjustments to security parameters or the replacement of algorithms at an early stage.

The focus of this thesis is the analysis of the impact of hardware acceleration by high-performance computers built from reconfigurable hardware on applications in cryptanalysis. In addition, the resulting effects on security estimates are examined. Since not all cryptographic primitives are equally suited to hardware implementation, this thesis presents four projects from different subfields of cryptology, in particular stream ciphers, efficient password search, elliptic curve cryptography, and post-quantum cryptography.

The first project describes a new algebraic attack against the stream cipher Grain-128, based on an improved version of cube testers. Validating the attack with a simulation algorithm requires specialized hardware, because a software approach is not efficient enough.

The second project deals with efficient password search against key derivation functions and examines the security of two of the current standards in password-based key derivation: PBKDF2 and bcrypt. It analyzes the effects of specialized hardware for energy-efficient attacks and for adversaries with corresponding financial resources.

The third project concerns the computation of the discrete logarithm on the elliptic curve sect113r2, a so far unbroken binary curve among the SECG standard curves over the field F_{2^113}. For the first time, the parallel Pollard's rho algorithm was implemented in hardware in combination with the negation-map technique in order to increase the efficiency of the random-walk iteration.

The last part covers the first hardware-accelerated implementation of an Information Set Decoding attack on the post-quantum cryptosystem McEliece. The proof-of-concept implementation serves as a basis for discussing the advantages and limitations of the hardware design, which entail significant differences in the choice of parameters and optimizations.

The results of the projects show that the impact of hardware acceleration differs in magnitude across the various areas of cryptanalysis. Nevertheless, high-performance computers and highly parallel implementations are increasingly moving into the focus of security researchers, because the relative cost of carrying out attacks is becoming ever more attractive. Accordingly, the definition of new cryptographic primitives now places great emphasis on countermeasures against the advantages an attacker gains from massive parallelization and energy-efficient implementations.

Schlagworte (Keywords)

Cryptanalysis, Reconfigurable Hardware, FPGA, High-Performance Computers, High-Speed Computation, Implementation.

Acknowledgements

This thesis is the result of the last 5 years, which I spent at the Chair for Embedded Security at the Ruhr-University Bochum, at conferences, workshops, and summer schools all around the world, and commuting far more than 100 000 km on countless (usually delayed) trains between Mainz and Bochum. Here, I would like to express my gratitude and thank those who made all of this possible and enjoyable.

First and foremost, I would like to thank my family for all of the support throughout the years and thank my wife, Heike, in particular, who managed to act as a counterbalance and married me in spite of my unrealistic years-to-graduate estimation, the long long-distance relationship, and the work I frequently brought home to ruin her plans for our weekends. Thank you for all your support, your faith, and your love.

Coming back to academia, I am very grateful to my supervisor, Christof Paar.
Aside from the scientific guidance, helpful advice, and the contribution of research ideas, you always managed to motivate and encourage me. Thank you very much! I would also like to thank my thesis committee, especially Tanja Lange, who provided me with advice and suggestions whenever I met her. I am very grateful for the wonderful

