Usuba, Optimizing Bitslicing Compiler Darius Mercadier

Usuba, Optimizing Bitslicing Compiler Darius Mercadier To cite this version: Darius Mercadier. Usuba, Optimizing Bitslicing Compiler. Programming Languages [cs.PL]. Sorbonne Université (France), 2020. English. tel-03133456 HAL Id: tel-03133456 https://tel.archives-ouvertes.fr/tel-03133456 Submitted on 6 Feb 2021 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. THESE` DE DOCTORAT DE SORBONNE UNIVERSITE´ Specialit´ e´ Informatique Ecole´ doctorale Informatique, Tel´ ecommunications´ et Electronique´ (Paris) Present´ ee´ par Darius MERCADIER Pour obtenir le grade de DOCTEUR de SORBONNE UNIVERSITE´ Sujet de la these` : Usuba, Optimizing Bitslicing Compiler soutenue le 20 novembre 2020 devant le jury compose´ de : M. Gilles MULLER Directeur de these` M. Pierre-Evariste´ DAGAND Encadrant de these` M. Karthik BHARGAVAN Rapporteur Mme. Sandrine BLAZY Rapporteur Mme. Caroline COLLANGE Examinateur M. Xavier LEROY Examinateur M. Thomas PORNIN Examinateur M. Damien VERGNAUD Examinateur Abstract Bitslicing is a technique commonly used in cryptography to implement high-throughput parallel and constant-time symmetric primitives. However, writing, optimizing and protecting bitsliced implementations by hand are tedious tasks, requiring knowledge in cryptography, CPU microarchitectures and side-channel attacks. The resulting programs tend to be hard to maintain due to their high complexity. To overcome those issues, we propose Usuba, a high-level domain-specific language to write symmetric cryptographic primitives. Usuba allows developers to write high-level specifications of ciphers without worrying about the actual parallelization: an Usuba program is a scalar description of a cipher, from which the Usuba compiler, Usubac, automatically produces vectorized bitsliced code. When targeting high-end Intel CPUs, the Usubac applies several domain-specific optimizations, such as interleaving and custom instruction-scheduling algorithms. We are thus able to match the throughputs of hand-tuned assembly and C implementations of several widely used ciphers. Futhermore, in order to protect cryptographic implementations on embedded devices against side-channel attacks, we extend our compiler in two ways. First, we integrate into Usubac state-of-the-art techniques in higher-order masking to generate implementations that are provably secure against power-analysis attacks. Second, we implement a backend for SKIVA, a custom 32-bit CPU enabling the combination of countermeasures against power-based and timing-based leakage, as well as fault injection. 3 Résumé Le bitslicing est une technique utilisee´ pour implementer´ des primitives cryptographiques efficaces et s’executant´ en temps constant. Cependant, ecrire,´ optimiser, et securiser´ manuellement des programmes bitslices´ est une tacheˆ fastidieuse, necessitant´ des con- naissances en cryptographie, en microarchitecture des processeurs et en attaques par canaux caches.´ Afin de remedier´ a` ces difficultes,´ nous proposons Usuba, un langage dedi´ e´ permettant d’implementer´ des algorithmes de cryptographie symetrique.´ Usuba permet aux developpeurs´ d’ecrire´ des specifications´ de haut niveau sans se soucier de leur parallelisation:´ un programme Usuba est une description scalaire d’une primitive, a` partir de laquelle le compilateur Usuba, Usubac, produit automatiquement un code bitslice´ et vectorise.´ Afin de produire du code efficace pour les processeurs haut de gamme, Usubac ap- plique plusieurs optimisations specialement´ conçues pour les primitives cryptographiques, telles que l’entrelacement et l’ordonnancement d’instructions. Ainsi, le code produit par notre compilateur offre des performances comparables a` du code assembleur ou C opti- mise´ a` la main. De plus, afin de gen´ erer´ des implementations´ securis´ ees´ contre des attaques par canaux caches,´ nous proposons deux extensions de Usubac. Lorsque les attaques par analyse de courant sont un risque a` considerer,´ Usubac est capable de proteger´ les implementations´ qu’il produit a` l’aide de masquage booleen.´ Si, additionellement, des attaques par injection de fautes doivent etreˆ prevenues,´ alors Usubac peut gen´ erer´ du code pour SKIVA, un processeur 32-bit offrant des instructions permettant de combiner des contre-mesures pour du code bitslice.´ 5 Acknowledgments Note: the names mentioned in this section are ordered using a non-disclosed algorithm. I recom- mend that you do not try to find a logic in the ordering; and especially that you do not wrongly assume that first is better than last. First, I would like to thank Sandrine Blazy and Karthik Bhargavan for reviewing my manuscript. Having spent months trying to produce a good manuscript, I was very pleased that you enjoyed reading it, and your positive reviews really warmed my heart. I would also like to thank Damien Vergnaud, Xavier Leroy, Caroline Collange and Thomas Pornin for being part of the jury. Of course, I am immensely grateful to Pierre-Evariste´ Dagand and Gilles Muller for supervising and guiding me throughout these 3 years. Pierre-Evariste,´ you always pushed me to strive for perfection, to leave no unturned stones, to try every possibilities, and to do the best work possible. Even though I probably wasn’t the most docile PhD student, I enjoyed working with you, and I learned a lot during these 3 years, in large part thanks to you. Or, to put it otherwise (just for you), I won’t say that I am unhappy to not have done this Ph.D. in another way. A lot of work in this thesis would not have been possible without collaborations with great researchers. Patrick Schaumont, Pantea Kiaei, Raphael¨ Wintersdorff, Sonia Bela¨ıd, Matthieu Rivain, Lionel Lacassagne, Karine Heydemann, I’m glad to have worked with you, and I hope this is neither the end of our collaborations, nor of our friendship. It was in particular really interesting to work with people from different sub-fields of cryptography: on one side, Patrick and Pantea, with your hardware expertise and low-level un- derstanding of side-channel attacks, and on the other side, Raphael,¨ Sonia and Matthieu, with your theoretical approach to the problem. While not an official collaborator, I’m infinitely grateful to my brother Hector who helped me visualize and interpret quite a lot of data from my benchmarks. I’d also like to thank some of my professors who inspired me to pursue this PhD and become a researcher: Jer´ emie´ Salvucci (although your last advice regarding the PhD was “run away”), Fred´ eric´ Peschanski, Antoine Genitrini, Emmanuel Chailloux, Julien Sopena, Quentin Meunier. Three years in the same office working on the same topic would certainly have been too much for me, if it weren’t for the amazing people surrounding me. In particular, Redha, Lucas, Antoine, Cedric´ and Pierre, whose eternal commitment to not being too serious made it a pleasure to work around them. And more generally, other PhD stu- dents, interns and post-doc from the Whisper, DELYS, MoVe and APR teams. There is more to life than just research (hope I’m not offending too many people with that sentence). And one of the major reasons that those last 3 years were great is the time I spent with my friends (not really sorted): Pierre B, Julien P, Juliette, Camille, Theo,´ Guillaume, Ninon, Jordi, Remy,´ Astrid and all the ones I’m forgetting. Thank you also to my cycling partners (Yacine, Jerome, Vicky and Antoine), and my Counter-Strike mates (Xili, Mando, John, Pasjordi, Arta, Thaur). I’d like to thank my family for always being there and for being so great. While I might not show it, I’m glad to have you in my life. 7 8 Finally, I would like to thank the most important person to me, Marjorie, without whom life would be so grim. Beyond being the most amazing girlfriend ever, you are the nicest and most caring person I know, and living with you is an unending pleasure. Thank you for helping me get through the finish line of this PhD (I would probably have taken a least a few more months without you pushing me to work), and thank you for being in my life ~ Contents 1 Introduction 13 1.1 Contributions . 21 1.2 Background . 22 1.3 Conclusion . 47 2 Usuba, Informally 49 2.1 Data Layout . 50 2.2 Syntax & Semantics, Informally . 51 2.3 Types . 54 2.4 Applications . 60 2.5 Conclusion . 68 3 Usuba, Greek Edition 73 3.1 Syntax . 74 3.2 Type System . 76 3.3 Monomorphization . 78 3.4 Semantics . 81 3.5 Usuba0 ....................................... 83 3.6 Translation to PseudoC .............................. 84 3.7 Conclusion . 90 4 Usubac 91 4.1 Frontend . 91 4.2 Backend . 96 4.3 Conclusion . 123 5 Performance Evaluation 125 5.1 Usuba vs. Reference Implementations . 125 5.2 Scalability . 130 5.3 Polymorphic Cryptographic Implementations . 132 5.4 Conclusion . 134 6 Protecting Against Power-based Side-channel Attacks 135 6.1 Masking . 136 6.2 Tornado ....................................... 136 6.3 Evaluation . 140 6.4 Conclusion . 147 7 Protecting Against Fault Attacks and Power-based Side-channels Attacks 151 7.1 Fault Attacks . 151 7.2 SKIVA . 152 7.3 Evaluation . 158 9 7.4 Conclusion . 161 8 Conclusion 163 8.1 Future Work . 164 Bibliography 175 Appendix A Backend optimizations 195 List of Figures 1.1 The RECTANGLE cipher . 19 1.2 Skylake’s pipeline . 23 1.3 Some SIMD instructions and intrinsics . 25 1.4 Our benchmark code . 27 1.5 Bitslicing example with 3-bit data and 4-bit registers . 28 1.6 The AVX-512 vpandd instruction .

Usuba, Optimizing Bitslicing Compiler Darius Mercadier

Maldet: How to Detect the Malware?

System Design for a Computational-RAM Logic-In-Memory Parailel-Processing Machine

Talking Book Topics November-December 2017

9781412929998.Pdf

X86-64 Calling Conventions

For Those Who Won't Fail Themselves

Block Ciphers: Fast Implementations on X86-64 Architecture

Works by Tom Stoppard

Sample Applications User Guides Release 20.05.0

An FPGA-Accelerated Embedded Convolutional Neural Network

Advanced Synthesis Cookbook

X86 Assembly Language Reference Manual