Secure, Fast and Verified Cryptographic Applications: a Scalable Approach
Jean-Karim Zinzindohoué-Marsaudon

To cite this version:
Jean-Karim Zinzindohoué-Marsaudon. Secure, fast and verified cryptographic applications: a scalable approach. Cryptography and Security [cs.CR]. Université Paris sciences et lettres, 2018. English. NNT: 2018PSLEE052. tel-01981380v2

HAL Id: tel-01981380
https://tel.archives-ouvertes.fr/tel-01981380v2
Submitted on 10 Jan 2020

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

THÈSE DE DOCTORAT (Doctoral Thesis)
Université de recherche Paris Sciences et Lettres – PSL Research University
Prepared at ENS Ulm – INRIA Paris

SECURE, FAST AND VERIFIED CRYPTOGRAPHIC APPLICATIONS: A SCALABLE APPROACH
French title: IMPLEMENTATIONS CRYPTOGRAPHIQUES SURES, PERFORMANTES ET VERIFIEES : UNE APPROCHE PASSANT A L'ECHELLE

Doctoral school: ED386 – Sciences Mathematiques de Paris Centre
Specialty: Computer Science (Informatique)

Composition of the jury:
Ms. Véronique CORTIER, LORIA, Reviewer (Rapporteur), President of the jury
Mr. Pierre-Alain FOUQUE, Université Rennes 1, Reviewer (Rapporteur)
Mr. Peter SCHWABE, Radboud University, Jury member
Mr. Cédric FOURNET, Microsoft Research, Jury member
Mr. Pierre-Yves STRUB, Ecole Polytechnique, Jury member
Mr. Karthikeyan BHARGAVAN, INRIA Paris, Jury member

Defended by Jean-Karim ZINZINDOHOUE-MARSAUDON on July 3rd, 2018
Supervised by Karthikeyan BHARGAVAN

SECURE, FAST AND VERIFIED CRYPTOGRAPHIC APPLICATIONS: A SCALABLE APPROACH
Jean-Karim ZINZINDOHOUE-MARSAUDON
Ph.D. Thesis
President of the jury: Ms. Véronique CORTIER
July 3rd, 2018

Abstract

The security of Internet applications relies crucially on the secure design and robust implementation of cryptographic algorithms and protocols. This thesis presents a new, scalable and extensible approach for verifying state-of-the-art bignum algorithms, as found in popular cryptographic implementations. Our code and proofs are written in F∗, a proof-oriented language which offers a very rich and expressive type system, with dependent types, refinement types, effects and customizable memory models. The natural way of writing and verifying higher-order functional code in F∗ prioritizes code sharing and proof composition, but this results in low performance for cryptographic code. We propose a new language, Low∗, a fragment of F∗ which can be seen as a shallow embedding of C in F∗ and safely compiled to C code. Nonetheless, Low∗ retains the full expressiveness and verification power of the F∗ system at the specification and proof level. We use Low∗ to implement cryptographic code, incorporating state-of-the-art optimizations from existing C libraries. We use F∗ to verify this code for functional correctness, memory safety and secret independence. We present HACL∗, a full-fledged and fully verified cryptographic library whose performance is on par with, if not better than, that of the reference C code. Several algorithms from HACL∗ are now part of NSS, Mozilla's cryptographic library, notably used in the Firefox web browser and the Red Hat operating system. Finally, we apply our techniques to miTLS, a verified implementation of the Transport Layer Security (TLS) protocol. We show how they extend to cryptographic proofs, state-machine implementations and message parsing verification.
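To give a concrete flavor of the refinement types mentioned in the abstract, the following is a minimal, hypothetical F∗ sketch; it is not taken from the thesis, and the module name, the placeholder prime and the function names are invented for illustration. A prime-field element is a natural number refined to lie below the prime, and the proof that field addition stays within that bound is discharged automatically by the SMT solver behind F∗.

    module Sketch

    (* A small placeholder prime, for illustration only. *)
    let p : pos = 23

    (* A field element is a natural number strictly smaller than p. *)
    type elem = e:nat{e < p}

    (* Field addition, specified over mathematical integers; to give the
       result the type elem, F* must prove that (a + b) % p is again
       smaller than p, which the SMT backend does automatically. *)
    let fadd (a:elem) (b:elem) : elem = (a + b) % p

Chapters 3 and 5 describe how the library is structured around high-level specifications written in roughly this style, against which the optimized low-level Low∗ code is then verified.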
Contents

1 Introduction
  1.1 Contributions
  1.2 Related work
2 Verified Programming in F∗
  2.1 Raising the Level of Trust in Software Development
  2.2 F∗, a proof-oriented programming language
  2.3 Syntax
  2.4 Types
  2.5 Proofs
  2.6 Effects
  2.7 Proofs with effects
  2.8 Libraries
  2.9 A feel for the SMT encoding
3 An Extensible Bignum Framework for Cryptography
  3.1 A Dire Need for Trust in Cryptography
  3.2 Prime Field Arithmetic
  3.3 A verified generic bignum library
  3.4 A second approach: balancing code sharing and performance
4 Low∗, a Low-Level Programming Subset of F∗
  4.1 Verification Approach
  4.2 The Low∗ language
  4.3 A formal translation from Low∗ to Clight
  4.4 The KreMLin compiler
  4.5 KreMLin: a Compiler from Low∗ to C
5 HACL∗, a Fast and Verified Reference Cryptographic Library
  5.1 A self-contained, modern and efficient cryptographic library
  5.2 Methodology: Structuring the Library Between Specifications and Low-Level Code
  5.3 High-Level Specifications
  5.4 Low-level Low∗ code
  5.5 Vectorization
  5.6 Evaluation (LoC, verification time, benchmarks etc.)
6 Going Further: Building Secure Cryptographic Applications
  6.1 Towards Cryptographic Proofs of Protocols
  6.2 A Verified State Machine for OpenSSL
  6.3 Parsing Protocol Messages
7 Conclusion
Bibliography

Chapter 1

Introduction

The last few decades have seen a booming expansion of information technologies. In less than half a century, the Internet went from a restricted military network to an indispensable communication technology that we all rely on. This incredible development was backed by dramatic improvements in computational power, storage capacity and worldwide connectivity, to the point that sharing large amounts of data and computational resources across the planet is now an easy feat. New generations of children are raised in a connected world where everything is immediate and decentralized, and information technologies are now pervasive. The development of new usages and features has been the main driver of this technological and societal revolution, led by young information technology (IT) companies that have grown to be richer, and perhaps more powerful, than some states. Unfortunately, other subjects of concern such as security and privacy have not received as much attention, mostly for business and marketing reasons: the hype supporting new products always originates from new, distinguishing features, while improvements to the overall quality of products in terms of safety, security or privacy are much harder to show off and market.

This is not without consequences. Current users of communication systems around the globe are mostly unaware that they inherited many legacy protocols, software designs and, more generally, conceptual views from their predecessors. Because of the pace at which new standards, technologies and usages appear and disappear, IT companies focus on maintaining compatibility between their latest products and legacy items.
Thus low-level libraries, which are at the core of many of those products, tend to encompass a broad range of features, from legacy ones to the most recent or advanced, in a mix of old and new code. These libraries are essential building blocks for software applications, providing for instance access to the system kernel, to graphical interfaces, and to cryptographic or networking functionality, with new, non-interoperable versions of those components appearing every few months or years. Hence, these libraries are regularly updated, integrating new features but retaining the old ones, which may still be useful to a number of products. This process results in ever larger and more complex codebases, almost impossible to audit and fully review, and sometimes leads to major issues: software components are usually not completely isolated, and a flaw in some remote, unused part of a core library can be exploited by an attacker to corrupt the whole system.

This raises the following questions: should we trust legacy code which is seldom used, might not have been developed to current security standards, and may very well be flawed in a way that would compromise the whole codebase? And conversely, should we trust the latest implementations, which have not yet received much attention and may be similarly flawed? Obviously we should be extremely cautious in both cases, and even then, this is under the assumption that the code is open source, so that the community can review it and contribute. Proprietary code that almost no one gets their hands on should be considered with the utmost care.

The research community is always keen on studying the security of new systems from different and often innovative angles, and so these evolving technologies have long been analyzed and criticized, and new approaches have been proposed. Unfortunately, exchanges between academic researchers and the IT industry are limited. Although there are some ties between the two communities, in practice it is not so common for academia to consider industrial constraints, or for industry to consider the latest academic research innovations. No one is to blame: the industry has its own set of constraints, which often do not apply in a research environment, and companies are interested in improvements from the research community only insofar as they can readily be integrated into their processes at negligible cost or with a high return on investment, which does not happen often. However, as computer science is still young and undergoing heavy changes, now may be a good time to encourage further cooperation, especially on security-related aspects.

The ambition of this work is to improve security in software development by bringing state-of-the-art methods from academia together with the specific needs of the industry and real-world applications, and to build new ways to address the latter using the former. This idea of making research more practical, or making the industry more research-aware, is not novel. Nonetheless, as new concerns emerge around the security and privacy aspects of computer science, contributing to making communication systems more reliable, robust and trustworthy appears to be a challenging topic, one that can quickly produce results and have a significant long-term impact.