Start your ENGINEs: dynamically loadable contemporary crypto

Nicola Tuveri, Tampere University, Tampere, Finland
Billy Bob Brumley, Tampere University, Tampere, Finland

Abstract: Software ever-increasingly relies on building blocks implemented by security libraries, which provide access to evolving standards, protocols, and cryptographic primitives. These libraries are often subject to complex development models and long decision-making processes, which limit the ability of contributors to participate in the development process, hinder the deployment of scientific results and pose challenges for OS maintainers. In this paper, focusing on OpenSSL as a de facto standard, we analyze these limits, their impact on the security of modern systems, and their significance for researchers. We propose the OpenSSL ENGINE API as a tool in a framework to overcome these limits, describing how it fits in the OpenSSL architecture, its features, and a technical review of its internals. We evaluate our methodology by instantiating libsuola, a new ENGINE providing support for emerging cryptographic standards such as X25519 and Ed25519 for currently deployed versions of OpenSSL, performing benchmarks to demonstrate the viability and benefits. The results confirm that the ENGINE API offers (1) an ideal architecture to address wide-ranging security concerns; (2) a valuable tool to enhance future research by easing testing and facilitating the dissemination of novel results in real-world systems; and (3) a means to bridge the gaps between research results and currently deployed systems.

I. INTRODUCTION

Following current common best practices for the development of secure systems, most applications rely on cryptographic primitives for which widely accepted standards have been published after extensive studies to ensure that breaking them is believed computationally infeasible. As the available computation power increases and new attacks and algorithms to improve the performance of theoretic attacks are developed, these standards are periodically revised to raise the security level of the recommended primitives.

Considering the complexity of protocols and algorithms and the number of potential pitfalls concerning details at every level of abstraction, the software implementation of such evolving standards is a daunting and complex task, and application developers are strongly advised not to implement their own crypto but to rely on existing well-established cryptographic libraries, among which OpenSSL¹ is the most widely adopted. Still, in addition to the complexity of their programming interfaces, bugs and defects in these libraries are often the cause of cryptographic failures in the security layer of modern information systems.

¹ https://www.openssl.org

Previous research focused on the development of new cryptographic software libraries as a solution to these problems: with respect to this paper, the NaCl [1] project and its fork libsodium² are especially relevant, as they aim at providing practical and efficient strong confidentiality and integrity with a special emphasis on the simplicity of the provided programming interface. Unfortunately, years later, we observe that this approach, although gathering momentum and being adopted in various new projects, fails to meet the needs of mainstream projects.

² https://libsodium.org/

Another related development sprouting off this research is HACL*³ [2], a verified cryptographic library that implements the NaCl API: it provides mathematical guarantees on memory safety, timing side-channel safety and functional correctness (with respect to the published primitive specification), while producing portable code that is as fast as state-of-the-art C implementations.

³ https://github.com/mitls/hacl-star

Alongside the development of NaCl, research efforts were also spent on the related SUPERCOP benchmarking suite as part of the eBACS project [3] for the benchmarking of cryptographic systems: many researchers from industry and academia submitted new cryptographic primitives or alternative implementations for benchmarking. Included submissions are benchmarked across different architectures and toolchains and undergo several automated tests. But still, most of these programming and scientific efforts fail to reach widespread adoption through mainstream libraries. We defer to Section II for an in-depth analysis of the limits of the OpenSSL project from the point of view of researchers that might help explain this gap. Here, to further substantiate the existence and relevance of this gap, we focus on the implementation of emerging cryptographic standards based on Curve25519 [4, 5].

X25519, the Diffie-Hellman cryptosystem originally released in 2005, promises, due to the properties of the underlying curve design, simpler and faster implementations, with enhanced resistance to side-channel attacks. Ed25519, formally introduced in 2011, is a digital signature system based on the twisted Edwards equivalent of Curve25519. It delivers fast digital signature generation and verification, short signatures, and enhanced resistance to side-channel attacks.

These cryptosystems have since gathered momentum, gaining official support in OpenSSH in 2014, in BoringSSL in 2015, in the Google Chrome Internet browser TLS/QUIC protocol support in April 2016, and finally becoming part of IETF RFC 7748 [6] (X25519) in January 2016 and RFC 8032 [7] (Ed25519) in January 2017, and becoming officially recommended for implementations of TLS 1.3 (RFC 8446 [8]) and earlier versions (RFC 8422 [9]) since August 2018.

OpenSSL officially added support for X25519 in August 2016, with release 1.1.0, while Ed25519 support landed in a release version only very recently, after a long development cycle, with the release of 1.1.1 on September 11, 2018. While implementation details are examined in Section III-D, here we highlight that the original implementation chosen by the OpenSSL development team favored portability over optimization, and as a result using these cryptosystems in applications built on top of OpenSSL does not always yield the expected performance in comparison with other elliptic curve (EC) implementations (e.g., NIST P-256).

This does not seem to be a consequence of a lack of trusted optimized implementations for the most widespread architectures supported by OpenSSL, as the mentioned SUPERCOP project includes several implementations tested on different architectures, and alternative libraries like libsodium and BoringSSL ship optimized implementations for popular architectures. It rather shows that new research, even after the time required for gaining trust and widespread adoption in other mainstream projects and standards, still requires a long additional time to be included in OpenSSL, affecting researchers, developers and users.

Our contribution: To address these limits, in this work: (1) we present an analysis of the OpenSSL ENGINE API and its benefits for bridging gaps between cryptographic research and practical real-world implementations of cryptosystems (the analysis in itself is novel, as available resources appear to be outdated); (2) we develop an ENGINE to demonstrate how to use the ENGINE API as a framework to transparently integrate alternative implementations or new functionality in OpenSSL, making them available to existing applications; (3) we evaluate experimental results, by benchmarking our libsuola ENGINE, demonstrating the viability and the benefits of the proposed solution across different versions of OpenSSL.

As discussed later in Section III, the entire purpose of the OpenSSL ENGINE API is to provide alternative implementations; our novelty instead lies in our “shallow” engine concept, bridging APIs of existing libraries to seamlessly realize this functionality and allowing easy selection of several different backend providers for it. Even though performance gains are a nice side effect, the main value of the proposed framework comes from (1) the integration of missing functionality in end-of-life, yet still widely deployed, versions of OpenSSL; (2) transparent access to the integrated functionality for existing applications, requiring exclusively changes to system configuration; (3) freedom of choosing alternative implementations for new or existing functionality (e.g. choosing a formally verified implementation like HACL*); (4) ease of testing and benchmarking in real-world scenarios, and dissemination of novel implementations and primitives to a larger user audience, for researchers.

Outline: Section II presents further claims motivating our contribution. Section III contains an analysis of the OpenSSL architecture and the ENGINE API. Section IV presents libsuola, an ENGINE we developed to demonstrate how to use the ENGINE API as a framework to offer alternative or missing implementations in OpenSSL. In Section V we present and evaluate experimental results comparing the default implementation of X25519 and Ed25519 (or the lack thereof) against the implementations provided through our custom ENGINE, and analyze the limits of our proposed methodology and how it addresses the concerns exposed in Section II. Section VI presents a brief analysis of related work, focusing on the mechanisms deployed by other security libraries, frameworks and operating systems to allow the transparent adoption of alternative cryptographic implementations. We conclude in Section VII.

II. MOTIVATION

As mentioned in Section I, this work is partially motivated by the rigidity of currently deployed, ubiquitous software libraries. This rigidity impacts real-world security via at least two avenues, which this section summarizes: it amplifies software assurance issues due to the small circle of contributors, and furthermore restricts the practical ability of OS vendors to provide predictable and steady support throughout the product life cycle.

While our intent is not to belittle the efforts of the OpenSSL project, to motivate our contributions we briefly highlight here some limits of the project from the point of view of researchers that might explain the gap highlighted in the introduction: (1) lack of unified quality reference documentation on the overall software architecture of the library, on API design choices, and frequently on the details of public and internal functions and data structures; (2) complex ad-hoc build system; (3) strong constraints on the choice of programming languages (C and custom “augmented” ASM syntax) and coding style; (4) as a large and complex project under very active development, it implies higher maintenance costs for external developers to keep their submissions up-to-date during the contribution process, which in turn can become quite long due to the double review constraint (see e.g. [10] which, while contributed in early 2009, did not get mainlined until late 2011, finally reaching a release version in early 2012); (5) contributing a new feature usually requires splitting the contributed code into smaller units to facilitate and speed up the review process and collect feedback and consensus on development choices for the subsequent units. In turn, generally, this might slow down the overall development process and increase the maintenance costs for contributing developers; (6) inclusion of new implementations or features usually undergoes a long decision-making process rarely compatible with research timelines; (7) trust, quality assurance, and IPR issues force the core development team to be very conservative in the decision-making process, thus even in cases where time is a minor issue, the final result can still be a rejection. In these cases, it is up to the researchers to maintain a custom set of patches and documentation to make their contribution available to users and other researchers for further research (see e.g. [11] where the proposed cipher suite is no longer compatible with the newer OpenSSL API).

A. Software assurance

The OpenSSL codebase is maintained by a small team of developers, mostly on a voluntary basis, and few individuals working full-time on the project. This practically restricts the ability of contributors and end users to, e.g., control what alternative cryptographic primitive implementations are featured in the library. Upkeep of end user custom builds is costly and does not scale well, hence the logical choice is to stick to the version and feature set provided by the OS vendor, overwhelmingly driven by the choices of OpenSSL core developers themselves. This is a poor model for security-critical software: it leads to, e.g., limited accountability in the decision-making process, and lack of assurance that the included code is functionally correct. In what follows, we give some examples of extremely security-critical code paths in OpenSSL that lack software assurance (see e.g. [12] for a broader survey). We strongly reiterate here that our work should not be construed as diminishing the contributions of the OpenSSL project (which at RWC’18 was awarded the Levchin prize “for dramatic improvements to the code quality”); rather, we use these observations to motivate our research and later demonstrate how our framework can alleviate this security burden, shifting it back towards developers and cryptographers.

Software defects: Biham et al. introduced the concept of bug attacks in 2008 [13], highlighting the importance of cryptography implementation correctness to protect private keys. In 2011, Brumley et al. presented the first (and only, as far as we are aware) real-world bug attack [14], remotely recovering P-256 private keys from a TLS server by exploiting a carry propagation software defect in OpenSSL 0.9.8g finite field arithmetic (CVE-2011-4354). The defect (and vulnerability) resurfaced later in the Nettle cryptography library [15, Sec. 3.1]. CVE-2014-3570 discloses a defect in multi-precision integer squaring, potentially affecting RSA, DSA, and ECDH cryptosystems in OpenSSL 0.9.8+. CVE-2016-7055 discloses a carry propagation bug leading to incorrect Montgomery multiplication results for 256-bit inputs, affecting ECDH cryptosystems for the Brainpool P-512 elliptic curve in OpenSSL 1.0.2+. CVE-2017-3736 discloses a carry propagation bug leading to incorrect Montgomery squaring results, affecting RSA, DSA, and DH cryptosystems in OpenSSL 1.0.2+; CVE-2017-3732 and CVE-2015-3193 are similar but distinct defects. CVE-2017-3738 discloses an overflow bug in Montgomery arithmetic for 1024-bit moduli leading to incorrect multiplication results, affecting RSA, DSA, and DH cryptosystems in OpenSSL 1.0.2+. To summarize, these security vulnerability disclosures unfortunately document a track record of functional correctness failures in OpenSSL, putting the security of private keys at risk. Our proposed methodology allows any developer to provide alternative implementations of cryptosystems decoupled from the OpenSSL codebase with higher functional correctness assurance, e.g., formally verified libraries [2, 16]. We demonstrate this by providing an option to select HACL* [2] as the backend provider of the X25519 and Ed25519 implementations.

Side-channel security: In 2004, Percival discovered [17] the first cache-timing attack on public key cryptography in OpenSSL, recovering RSA keys by exploiting key-dependent table lookups in sliding window exponentiation (CVE-2005-0109). OpenSSL responded by (1) implementing (against expert advice to avoid cache line-level lookups [18, Sec. 15]) the scatter-gather technique [19] during exponentiation to reduce the amount of leakage; (2) adding a runtime flag that controls whether or not to follow this secure code path. Scatter-gather is not a full mitigation but a hedge [20], and in 2016 Yarom et al. implemented an attack utilizing cache-bank conflicts [21]. OpenSSL responded (CVE-2016-0702) by tweaking scatter-gather parameters to further reduce the amount of leakage, hence the root cause of the vulnerability is still present in OpenSSL as of this writing. Also in 2016, Pereida García et al. discovered a defect in the way OpenSSL handles the runtime flag added in 2005 and used it to recover DSA private keys from TLS and SSH servers [22], i.e. the fix to the 2005 vulnerability was in fact not being activated: a bug present for over a decade (CVE-2016-2178).

In 2009, Brumley and Hakala discovered the first cache-timing attack on ECC in OpenSSL, recovering private keys from scalar multiplication [23]. OpenSSL responded by implementing constant-time paths on certain architectures for elliptic curves P-224, P-256, and P-521, based initially on Käsper’s work [24] and later on a P-256 contribution from Intel [25]. The OpenSSL team declined contributed patches to comprehensively mitigate the vulnerability [26]. With the vulnerability left in place for all other curves, in 2014 Benger et al. improved the 2009 result and targeted the 256-bit Bitcoin curve [27]. In 2015, van de Pol et al. further reduced the number of required signatures to recover private keys [28], and Allan et al. reduced it even further in 2016 [29], all exploiting the initial code path shown vulnerable in 2009 but only recently fixed [30] due to CVE-2018-5407 [31].

To summarize, unfortunately OpenSSL has a dubious track record regarding side-channel security: from stopgap countermeasures that do not stand the test of time, to implemented mitigations that are never activated due to defects, to no response at all when code paths are shown to be vulnerable. Our proposed methodology allows any developer to override these OpenSSL design decisions and provide side-channel secure implementations largely independent from the OpenSSL codebase.

B. Release strategies

Reconciling software package EOL with OS distribution EOL is a challenging task w.r.t. both feature set and security. For example, consider Red Hat Enterprise Linux (RHEL) version 7 with a 10 June 2014 release date. At that time, the stable version of OpenSSL was 1.0.1 and RHEL7 shipped with that package. Once a distribution chooses a software package version, this is essentially written in stone: the stock version of OpenSSL on RHEL7 started at 1.0.1 and will stay at 1.0.1 until RHEL7 EOL in 2024. Security issues will be patched with backported fixes and applied to this distribution’s OpenSSL flavor by the OS vendor. The reason the version is fixed is that other applications will link against e.g. shared libraries and, since there is no guarantee newer versions will maintain backwards compatibility, the simplest solution is not to change the version number.

[Figure 1: timeline from 2012 to 2024 comparing the support periods of OpenSSL 1.0.1, OpenSSL 1.0.2 LTS, OpenSSL 1.1.0 and OpenSSL 1.1.1 LTS with those of Debian 7 (1.0.1t), Debian 8 (1.0.1t), Debian 9 (1.1.0j), Ubuntu 14 LTS (1.0.1f), Ubuntu 16 LTS (1.0.2g), Ubuntu 18 LTS (1.1.0g) and RHEL 7 (1.0.1e).]

Fig. 1. OpenSSL release strategy vs. OS vendor release strategy: OS version EOL exceeds the OpenSSL version EOL.

TABLE I
SELECTIVE OPENSSL FEATURES ACROSS VERSIONS. X25519 ONLY APPEARS SINCE VERSION 1.1.0. ED25519 SUPPORT WAS ADDED IN OPENSSL VERSION 1.1.1, RELEASED ON SEPTEMBER 11, 2018.

Feature       1.0.1   1.0.2   1.1.0   1.1.1
nistz256 (a)    -       X       X       X
X25519          -       -       X       X
Ed25519         -       -       -       X

(a) Only for x86-64 platforms. On other 64-bit platforms a 64-bit portable C implementation based on [24] is available as a non-default compilation option. Any other platform uses the default generic implementation.

Considering bugs, it is an extremely challenging task for the OS vendor to determine which fixes to backport. On the security side, they essentially rely on the risk analysis that takes place during the responsible disclosure process. This process is perilous: e.g., the previously discussed P-256 defect exploited by [14] was not flagged by either the OpenSSL security team or the OS vendors as a security issue, and hence went unpatched over four years in the wild before backported fixes rolled out to address CVE-2011-4354.

This task is slightly easier when the software in question is still maintained and has not reached EOL: the burden falls upon the software package project in question. But what happens when the software package reaches EOL before the OS does? This implies that official fixes are not available from the software package project, and the vendors must rely on contributed and third-party fixes and security analysis.

Before OpenSSL 1.0.2, OpenSSL had no official release strategy and roadmap. Understandably, this made the OS vendor’s job difficult, as there was no guarantee on how long official security fixes would be available. With the new release strategy at the end of 2014, OS vendors can make more informed choices regarding OpenSSL version inclusion, yet there will still be a coverage gap. For example, consulting Table I and Figure 1, stock OpenSSL on RHEL7 will not feature X25519 or Ed25519, and since OpenSSL 1.0.1 reached EOL at the end of 2016, Red Hat is on the hook for backporting third-party security fixes until the end of 2024, as the OpenSSL team will no longer provide official security fixes.

To summarize, as Figure 1 shows, the expected lifetime of OpenSSL versions is in fact much longer than the official EOL stated by OpenSSL. Compounded with the feature sets in Table I, OS vendors for currently deployed distributions are essentially stuck with little to no OpenSSL support for emerging cryptography standards such as X25519 and Ed25519.

III. OPENSSL AND THE ENGINE API

OpenSSL is an open source project consisting of a general-purpose cryptographic library, an SSL/TLS library and toolkit, and a collection of command line tools to generate and handle keys, certificates, PKIs and other cryptographic objects and execute cryptographic operations.

The project officially started in December 1998, with release 0.9.1c forking the SSLeay project by Eric Andrew Young and Tim Hudson. Since then the project has seen a total of thirteen major releases, of which three are currently actively supported: (1) 1.0.2 is the old Long Term Support version, initially released in January 2015 and supported until the end of December 2019; (2) 1.1.0 is the old stable version, released in August 2016 and officially supported in security-fix only mode until September 2019; (3) 1.1.1 is the current stable version and the latest LTS release, released on September 11, 2018 after a long development cycle.

Being ubiquitous on the server side (e.g. powering the HTTPS support for the Apache and nginx web servers, which combined cover almost two-thirds of Internet active sites according to Netcraft’s July 2017 Web Server Survey⁴) and in many client tools, the OpenSSL project has become a de facto standard for Internet security.

⁴ https://news.netcraft.com/archives/2017/07/20/july-2017-web-server-survey.html

The project is written mainly in the C programming language, with assembly for optimized implementations, and supports a plethora of platforms, including a wide range of hardware architectures running most Unix and Unix-like operating systems, OpenVMS and Microsoft Windows.

A. Architecture of the OpenSSL project

Figure 2 depicts the current architecture of the OpenSSL project. It is based on the current 1.1.1 version, but can be applied with minor changes also to the 1.0.2 and 1.1.0 release branches.

[Figure 2: layered block diagram. Top: 3rd party binaries and OpenSSL binaries. Middle: 3rd party libraries, OpenSSL libssl, and OpenSSL libcrypto. Inside libcrypto: containers and encodings (KDF, TS, OCSP, CT, X509, CONF, PKCS#12, PKCS#7, CMS, PEM, STORE, UI, ASN1); EVP (EVP_PKEY, EVP_MD, EVP_CIPHER); OBJECTS table; ENGINE API and low-level crypto (built-in ENGINEs, RSA, DH, DSA, EC, ECX, ...); BIO (I/O abstraction: network sockets, memory buffers, files, filters, etc.); ERR; low-level generic modules RAND (random number gen.), BN (arbitrary prec. int), CRYPTO (memory, threads, ...), BUFFER (in-mem byte buffers), ASYNC (async. jobs), COMP (zlib, compression). Side: 3rd party ENGINEs. Bottom: OS / System libraries, then HW.]

Fig. 2. Architecture diagram of the OpenSSL project.

The three main blocks in the diagram are the OpenSSL binaries, consisting of the command-line tools included in the project, which are linked against the other two main blocks of the OpenSSL project diagram, representing the two software libraries implementing the core project functionality.

OpenSSL libssl implements the SSL/TLS library, dealing with the implementation details of the standardized network protocols and extensions, linking against OpenSSL libcrypto for the implementation of the underlying cryptographic functionality required by the network protocols.

OpenSSL libcrypto contains the actual cryptographic implementations and also acts as a portability layer across the different platforms supported by the OpenSSL project, and is further organized into several independent modules.

Focusing on the OpenSSL libcrypto block, the diagram is organized so that, in general, the vertical position is correlated with the level of abstraction of each depicted module (going from the top application-oriented modules to the bottom modules providing lower-level implementations or low-level functionality mostly dealing with operating systems and system libraries abstractions), while the horizontal position, with exceptions, is coupled with the “topic” of each module or group of modules (the left part of the libcrypto block is more closely related to cryptographic implementations, while the right part comprises increasingly general purpose modules).

From the point of view of an application wishing to use OpenSSL to perform cryptographic operations including encryption and decryption, digital signature schemes, key derivation algorithms, and message digests, the high-level EVP API is the recommended interface: the EVP block provides an abstraction level to handle these operations using abstract methods and data types to decouple applications from the details of the actual low-level implementations of each cryptosystem. In particular, among the sub-modules included in the EVP API,⁵ EVP_PKEY abstracts asymmetric cryptosystems, EVP_MD abstracts message digest cryptosystems, and EVP_CIPHER abstracts symmetric encryption/decryption cryptosystems.

⁵ https://www.openssl.org/docs/man1.1.1/crypto/evp.html

These abstractions are made possible by the use of abstract context and method structures which describe each EVP cryptosystem in terms of pointers to metadata structures describing its parameters, the functions to manipulate such metadata structures or derived data structures describing the internal status of any of its instances, and the functions operating on such instances to implement the actual cryptosystem functionality.

Therefore, most EVP API functions ultimately act as wrappers around the library internal OBJECTS table, which can be queried by a numeric identifier (NID) to retrieve the actual method structures associated with a particular indexed cryptosystem.

This indirection mechanism can also be used as a way to provide multiple alternative implementations for a given cryptosystem, by manipulating the querying algorithm to select among the method structures registered for the same NID: this approach is what ultimately powers the ENGINE API described in the following paragraphs.

B. What is an OpenSSL ENGINE?

OpenSSL 0.9.6 introduced a new component to support alternative cryptography implementations, most commonly for interfacing with external crypto devices (e.g. accelerator cards), in the form of ENGINE objects.⁶

⁶ https://github.com/openssl/openssl/blob/OpenSSL_1_1_1-stable/README.ENGINE

These objects act as “containers for implementations of cryptographic algorithms” and can be statically linked in the OpenSSL library at compile-time or dynamically loaded at run-time in and out of the running application, with low overhead, from external binary objects implementing the ENGINE API⁷ through a special built-in ENGINE called “dynamic”.

⁷ https://www.openssl.org/docs/man1.1.1/man3/ENGINE_init.html

Such dynamic ENGINEs are particularly interesting due to the flexibility they provide: (1) they allow replacing compiled-in implementations affected by known problems with newer ones, maintaining compatibility with existing applications; (2) they allow hardware vendors to release self-contained shared libraries to add support for arbitrary hardware to work with applications based on OpenSSL, keeping their software outside of the main OpenSSL codebase; (3) they allow reducing the memory impact of OpenSSL, by not statically linking support for unneeded hardware at compile-time, in favor of system configuration or automatically probing for supported devices at run-time and dynamically loading only the required cryptographic modules.

Dynamic ENGINEs are also interesting for software implementations: (1) they allow replacing compiled-in software implementations with newer alternative implementations in case of bugs, vulnerabilities or sub-optimal performance; (2) they provide an alternative in case of issues with the OpenSSL core development team decision-making process, decoupling the decision to adopt the OpenSSL library from the choice of individual cryptosystem implementations, without incurring the costs of maintaining a fork of the OpenSSL project or patch sets, while also providing transparent binary compatibility with existing applications; (3) they allow backporting newer cryptosystems to previous versions of the OpenSSL library and existing applications based on it; (4) they allow easily adding new cryptosystems, or new implementations for already compiled-in cryptosystems, to the OpenSSL library, providing a convenient way to test and benchmark new software implementations in a real-world context; (5) they offer a greater degree of freedom from the OpenSSL project toolchain, allowing developers to use different programming languages and build systems, further lowering the development and maintenance costs for developing plugin alternative implementations or new functionality; (6) they offer flexibility to solve licensing issues: currently the OpenSSL project is released under a “dual license” scheme, under the OpenSSL License (a derivative of the Apache License 1.0) and the SSLeay License (similar to a 4-clause BSD License), and is in the process of transitioning to the Apache License 2.0. Contributors are thus forced to release their work under these licenses, which may be an issue especially when reusing code from projects released under a proprietary license or an incompatible copyleft license. Being objects dynamically loaded at runtime, engines can benefit from usually more flexible licensing requirements, providing a bridge towards software released under different licenses.

Functionality provided by ENGINEs: The cryptographic functionality that can be provided by an ENGINE implementation includes the following abstractions: (1) RSA_METHOD, DSA_METHOD, DH_METHOD, EC_METHOD: providing alternative RSA/DSA/etc. implementations; (2) RAND_METHOD: providing alternative (pseudo-)random number generation implementations; (3) EVP_CIPHER: providing alternative (symmetric) cipher algorithms; (4) EVP_MD: providing alternative message digest algorithms; (5) EVP_PKEY: providing alternative public-key algorithms.

C. Anatomy of a dynamic ENGINE

At the highest level of abstraction, a dynamic ENGINE can be split into two functional blocks.

One block contains all the alternative implementations for the cryptosystems provided by the ENGINE: this part mainly consists of a collection of structs for each cryptosystem, each linking to the actual functions implementing its operations. For example, an EVP_MD message digest struct would reference the actual init(), update(), and final() functions implementing the OpenSSL message digest streaming API, in addition to some utility functions allowing the OpenSSL library to cleanly handle, clone and destroy instances of the provided message digest implementation.

Every such struct would be individually registered against the OpenSSL library during the bind() process, and structs of the same kind (e.g. all the EVP_MD structs, all the EVP_PKEY_meth structs, etc.) are glued together by functions registered in the ENGINE object that allow the OpenSSL library to query the engine for lists of provided algorithms or a specific algorithm indexed by NID.

The other block contains the bind() method and the initialization and deinitialization functions: (1) the bind() method is called by the OpenSSL built-in dynamic ENGINE upon load and is used to set the internal state of the ENGINE object and allocate needed resources, to set its id and name, and the pointers to the init(), finish(), and destroy() functions; (2) the init() function is called to derive a fully initialized functional reference to the ENGINE from a structural reference; (3) the finish() function is called when releasing an ENGINE functional reference, to free up any resource allocated to it; (4) the destroy() function is called upon unloading the ENGINE, when the last structural reference to it is released, to cleanly free any resource allocated upon loading it into memory.

D. Curve25519, X25519 and Ed25519 in OpenSSL

OpenSSL 1.1.0 introduced support for X25519: as a consequence of the way Curve25519 and X25519 are defined, instead of adding the curve inside the EC module containing every other elliptic-curve cryptosystem implementation based on prime or binary fields, the OpenSSL developers decided to add a dedicated ECX sub-module defining EVP_PKEY_meth structures directly linking to a self-contained portable C implementation of Curve25519 and the underlying finite field arithmetic.

The actual low-level portable C implementation was closely based on the ref10 version of Ed25519 in SUPERCOP 20141124 which, although portable, suffers huge speed penalties compared to implementations optimized for specific architectures or even portable 64-bit C code.

The latest release of OpenSSL (1.1.1) expanded the ECX sub-module to add a new EVP_PKEY_meth supporting the Ed25519 digital signature scheme (based on a separate portable C implementation). During its development cycle, the X25519 low-level scalar multiplication was also revised, adding a specialized portable 64-bit C implementation, and an assembly implementation for x86_64.

As a result, comparing the benchmarks of the X25519 (in OpenSSL 1.1.0) and Ed25519 (in OpenSSL 1.1.1) cryptosystems with operations of analogous EC cryptosystems (e.g. over the NIST P-256 elliptic curve) does not yield the performance benefits expected from [4, 5], as shown later in Section V.

IV. THE LIBSUOLA ENGINE

libsuola⁸ takes advantage of the ENGINE API described in Section III to provide alternative software implementations for X25519 and Ed25519, working as a bridge between OpenSSL and an external library.

libsuola itself is a shallow loadable module, i.e. it does not contain cryptographic implementations: the core part of libsuola is the code facing the OpenSSL APIs, that takes

configuration files or programmatically use the OpenSSL API from application code to explicitly load libsuola.

Independently of the way it is instructed to load the libsuola ENGINE, internally the OpenSSL library creates an instance of the built-in dynamic ENGINE, which in turn uses the OS dynamic loader to load libsuola.

As soon as it is loaded into memory, the dynamic ENGINE calls the libsuola suola_bind() function which in turn: (1) sets the id and name fields in the ENGINE structure; (2) sets pointers to the destroy(), init() and finish() functions in the ENGINE structure; (3) registers NIDs for X25519, Ed25519 and the identity message digest: NID_X25519 is defined in OpenSSL since version 1.1.0, while NID_ED25519 is only defined in 1.1.1. This step ensures that even in previous versions of OpenSSL the internal OBJECTS table includes definitions for both cryptosystems; (4) creates the structures to handle each implemented cryptosystem: the EVP_MD md_identity describing the identity message digest and two (EVP_PKEY_meth, EVP_PKEY_ASN1_meth) pairs of structures for X25519 and Ed25519. Each of these structures is also registered in the internal OpenSSL OBJECTS table under the corresponding NID; (5) sets pointers to the digests(), pkey_meths() and pkey_asn1_meths() callbacks in the ENGINE structure: these callbacks allow the OpenSSL library to manipulate the ENGINE handle, querying for lists of implemented methods indexed by NID; (6) calls the provider suola_implementation_init() function to initialize the internals of the provider implementation.

The rest of libsuola consists of the collection of functions used to implement the functionality described by each of the structures registered in the suola_bind() function, which map OpenSSL structures to the provider functionality.
care of initializing data structures and registering abstract These are briefly described in the following paragraphs. methods; these abstractions are routed to a “provider” module A. libsuola X25519 in libsuola, which finally links against the selected external library to provide the actual cryptographic functionality. To X25519 is implemented through an EVP_PKEY_meth demonstrate the potential of the proposed methodology, we and a corresponding EVP_PKEY_ASN1_meth. The first one provide three different providers, among which the user can mainly includes the definition of a keygen() and a de- select at build time, each linking against a different implemen- rive() method, while the second includes methods for tation, and further described in Section IV-D. encoding and decoding X25519 private and public values. Figure 3 shows the architecture of libsuola (compiled The keygen() method delegates its core functionality selecting the libsodium provider) and its interactions with to the suola_scalarmult_curve25519_base() func- libsodium and OpenSSL. Selecting a different provider tion, which is then routed to the actual implementation through results in linking against another library or embedding, through the selected provider. static linking, an implementation inside the provider object, The derive() method, called once the peer key has but the changes affect only the provider object, which is the been set, performs a generic point scalar multiplication over only element providing cryptographic functionality. Curve25519 through the provider suola_scalarmult_- For the installed applications that will benefit from the curve25519() function. added functionality or alternative implementations, lib- B. 
libsuola Ed25519 suola is completely transparent: at run time when OpenSSL is loaded by the application, the library reads from system- Similarly, Ed25519 is also implemented through an EVP_- wide configuration files, and is instructed to subsequently load PKEY_meth and a corresponding EVP_PKEY_ASN1_meth. the libsuola ENGINE. Alternatively, through environment The first one mainly includes the definition of a keygen(), variables, it is possible to have application-specific OpenSSL a sign() and a derive() method; the second includes methods for encoding and decoding Ed25519 private and 8https://github.com/romen/libsuola public values, as well as a control method that the OpenSSL Fig. 3. libsuola architecture and its interactions with libsodium and OpenSSL. library uses to query for the default message digest algorithm C. libsuola md_identity associated with the digital signature cryptosystem, which in EVP_MD md_identity libsuola is set to return the NID for md_identity. The is a custom message digest algorithm behaving as an identity function to be used as The keygen() method for Ed25519 is mostly a wrapper the pre-hash algorithm in the libsuola implementation of around the provider suola_sign_ed25519_seed_key- PureEd25519: the output of this message digest is a copy of pair() function. the input message. The sign() and verify() methods are wrappers around suola_sign_ed25519_detached() and The EVP_MD API in OpenSSL can be reduced to a stream- suola_sign_ed25519_verify_detached() ing approach based on calling an init() function before functions in the provider. These functions implement a starting the hash computation, repeated calls to an update() PureEdDSA signature scheme, i.e. where the message to be function until the whole input message is consumed, and a signed is directly used as an input to the signature generation final call to a final() function to finalize the hash and algorithm, without a pre-hashing step as opposed to traditional retrieve its output. 
For md_identity these functions are digital signature schemes based on RSA, DSA or ECDSA or implemented as described below: to the PreHashEdDSA signature scheme. init() Securely allocates an empty small buffer as the OpenSSL 1.1.1 introduced new API functions to support internal status of the message digest algorithm, tracks its one-shot signature generation and verification cryptosystems length and sets to 0 the counter of used buffer bytes; like Ed25519, but using this approach in libsuola would update() Depending on the size of the input block and the make it incompatible with previous versions of OpenSSL. availability of unused space, the internal buffer is securely Instead, we chose to work around the limitations of the tradi- reallocated to sufficiently increase its size. A copy of the tional API by adding a custom md_identity message digest new input block is then concatenated to the existing data method to be used in the default pre-hash step performed by in the buffer while tracking the length of the whole buffer OpenSSL before the actual signature generation or verification. and of the data contained in it; final() OpenSSL EVP_MD API limits the maximum size of-the-art C implementations, while implementing standard of a message digest algorithm output thus, considering countermeasures to timing side-channel attacks”. that no actual finalization is required and that the md_- Through our proposed methodology it then becomes pos- identity algorithm is used only by code inside lib- sible to integrate the guarantees of projects like this, that suola, we decided to work around this limit by gaining mathematically prove the absence of the defects we listed in access to the internal md_identity buffer directly Section II-A, without needing to alter existing application to from the Ed25519 EVP_PKEY_meth implementation, use a different cryptographic library. without ever using the final() function. 
As a result, HACL* uses the same C API as libsodium for the NaCl the implementation of this function simply returns an constructions, so excluding some obvious prefix renaming, the error. mappings described in Figure 3 with respect to libsodium are analogous to the ones provided by this provider for the D. Providers HACL* library. libsuola-donna: As a proof of concept, we implemented lib- The actual cryptographic functionality provided by an alternative version of libsuola that statically links at suola is entirely contained in the provider module. It deter- compile-time against an implementation of Curve25519 and libsuola mines which library is linked against, and routes X25519 based on the work of Adam Langley and Andrew the function calls from the OpenSSL structures described Moon.9 above to the relevant functions in the selected implementation. Through this provider, the contents of the EVP_PKEY_- To demonstrate the potential of the proposed methodology, meth structures for Ed25519 and X25519 are routed as fol- we provide three different alternative providers: (1) the first lows: (1) the keygen() functions ultimately link to, respec- provider, to which Figure 3 refers, links against libsodium, tively, ed25519_publickey() and curved25519_- a fork of NaCl; (2) a second provider links against the scalarmult_basepoint() from floodyberry/ed- HACL* library, providing a formally verified implementation, 25519-donna/ed25519.c; (2) the Ed25519 sign() through an API compatible with NaCl; (3) as a proof of and verify() functions are implemented via, respectively, concept, we develop a third provider which does not depend ed25519_sign() and ed25519_sign_open() from on an external library and internally embeds the provided floodyberry/ed25519-donna/ed25519.c; (3) the functionality through static linking. 
X25519 derive() function is not supported by the code libsuola-sodium: For our first provider, libsodium was in the floodyberry/ed25519-donna repository, so selected as a modern cryptographic library, designed empha- we opted to use the portable 64-bit C donna implemen- sizing high security and ease-of-use, and addressing from its tation included in SUPERCOP10 in crypto_scalar- inception as a fork of NaCl many of the shortcomings we mult/curve25519/donna_c64/smult.c. This imple- described in OpenSSL. mentation is not compatible with our 32-bit ARM test- While the aim of our project is to provide support for ing environment, so on this platform at build time we re- emerging cryptography standards such as X25519 and Ed- place it with another implementation from SUPERCOP, in 25519 for currently deployed versions of OpenSSL, as de- crypto_scalarmult/curve25519/neon2/scalar- tailed in Section V, we also notice that, on the platforms mult.s, contributed by Bernstein and Schwabe [32], opti- we tested, the alternative implementations provided through mized for the NEON Advanced SIMD extension. libsodium generally performs equally or better than the built-in implementations included in the previous OpenSSL V. EXPERIMENTAL RESULTS 1.1.0 release (supporting X25519 only). Even in the latest 1.1.1 Our last contribution consists in an experimental evaluation release of OpenSSL (supporting both X25519 and Ed25519), of the libsuola ENGINE, carried out by benchmarking the the libsodium Ed25519 implementation provided through provided operations across different versions of OpenSSL on libsuola appears to be faster than the OpenSSL default different target architectures, by analyzing how the proposed implementation. methodology effectively addresses the concerns listed in Sec- When this provider is selected, as depicted in Figure 3, tion II and its limits. libsuola is dynamically linked against libsodium, hence A. 
Benchmarks when OpenSSL loads the ENGINE, the dynamic loader would automatically load both libsuola and libsodium. For the benchmarks, for which the detailed results are The suola_implementation_init() function in reported in Appendix A, we targeted two platforms to address this case calls sodium_init() during the binding process both the desktop/server scenario and a mobile/embedded sys- to initialize libsodium internals. tem scenario: a 4-core/8-threads 3.4GHz Intel Core i5-6700 Skylake CPU (Table III) and a quad-core 1.2GHz Broadcom libsuola-hacl: HACL* [2] is a very interesting target for BCM2837 CPU on a Raspberry Pi 3B on a 32-bit (Table IV) our framework, as it presents a practical implementation of a and a 64-bit environment (Table V). cryptographic library, which is formally verified for memory safety and functional correctness with respect to its published 9https://github.com/floodyberry/ed25519-donna standard specification, and aims to be “as fast as state- 10http://bench.cr.yp.to/supercop.html We collected the measurements using a benchmarking ap- TABLE II plication derived from the OpenSSL speed app, modified to OPENSSL SECURITY STATISTICS ACROSS OS VENDORS, ASOF 08 APRIL 2019. THEADVISORYCOUNTSARERESTRICTEDTO CVES AFFECTING use the EVP API for every operation and to measure CPU LIBCRYPTO. cycles for a fixed number of repeated runs of the specified operations (in contrast with measuring the number of repeated OpenSSL version 1.0.2 1.1.0 1.1.1 CVEs, total 63 23 3 runs over a fixed amount of time). We linked the benchmarking CVEs, libssl 22 9 0 application, and libsuola in its three different versions CVEs, libcrypto 41 14 3 (i.e. one for each provider selected at compile time), against Amazon Linux Security Advisories 16 8 – Amazon Linux 2 Security Advisories 3 3 1 OpenSSL versions 1.0.2r (old LTS), 1.1.0j (old stable) and Debian Security Advisories 14 8 2 1.1.1b (current LTS release). 
Debian LTS Advisories 9 5 1 The table reports the execution cost in CPU cycles for Fedora Security Updates 42 18 1 FreeBSD Security Advisories 29 14 2 the following cryptosystem operations: key generation, ECDH Gentoo Linux Security Advisories 11 5 – shared secret derivation, digital signature generation, and Oracle Security Advisories 22 7 – digital signature verification. As a baseline reference, the Redhat Security Advisories 33 12 – Slackware Security Advisories 15 6 1 table also presents the execution cost of these operations for SUSE Security Updates 81 47 10 cryptosystems based on the fast nistz256 implementation Ubuntu Security Notices 15 7 1 for the popular NIST P-256 elliptic curve. The benchmarks demonstrate that our methodology achieves the goal of adding missing functionality to OpenSSL trans- there are no vulnerabilities, but is consistent with no patches parently for existing applications, and to replace the default found in said package builds. We carried out our analysis implementations with alternative ones. separately for all OpenSSL versions which are currently Excluding X25519 primitives in OpenSSL 1.1.1b, we also not EOL: 1.0.2, 1.1.0, and 1.1.1. Summarizing the Table II discovered that using libsuola-sodium, on top of adding results, we conclude that our approach of dedicated backend missing functionality, generally also improves the performance crypto providers for an ENGINE do not suffer from the same of the listed primitives, as libsodium selects implementa- classes of security issues as the analogous functionality in tions that are more optimized for the test platforms. libcrypto, supporting our claim. We also note that, as claimed in [2], HACL* does generally It can also be argued that libsuola itself is adding achieve relatively good performance, comparable with the even more surface for bugs and defects, that might result in default implementations in OpenSSL, and notably in the case additional risks. 
While this is definitely true for any additional of X25519 derive() in OpenSSL 1.1.0j, the HACL* imple- line of code in a software system, we designed libsuola as mentation is even twice as fast as the default implementation a “shallow” ENGINE to minimize the risk of introducing such on x86-64. defects: libsuola itself is not providing any cryptographic functionality, and we designed it to maximize readability and B. Analysis and security evaluation maintainability making it modular. In Section I and Section II we listed a series of concerns Also, compared with the process of patching OpenSSL regarding software assurance, release strategies and limits of directly to add missing or alternative primitives, modifying our ENGINE OpenSSL as a platform for researchers to motivate our work. proposed template is generally an easier process: the In this section we analyze how our proposed methodology actual cryptographic functionality can be implemented using addresses those concerns and its limits. any language and build system as long as it produces C Software assurance: We claim that the ENGINE approach bindings that can be mapped in a dedicated “provider” module, ENGINE is useful in preventing various vulnerabilities that plague while the functionality remains separate and mostly OpenSSL, including e.g. traditional software defects, arith- reusable, which further minimizes the risks of introducing metic defects in e.g. hand-coded assembly, and various side- software defects. channel attack vectors. To support this claim, we use the Release strategies: One of our original goals was to fa- following metric. We first enumerate all the CVEs issued by cilitate the integration of missing functionality in end-of-life, OpenSSL and then fetch the related security metadata for each yet still widely deployed, versions of OpenSSL, motivated by CVE. 
To semi-automate this, we utilize the Computer Inci- different release strategies applied by software providers in dent Response Center Luxembourg (CIRCL) database11 that real-world systems as described in Section II-B. provides a contextual feed of security vulnerabilities. We then We demonstrated how the proposed approach allows to count the number of security advisories issued across selected easily add or backport functionality in OpenSSL 1.0.2, and vendors. We partially validated the results by examining the how this is done transparently for existing applications. Debian and Ubuntu patches applied to package builds. Alternatives to our approach usually require to manually As a case study, we did the same analysis for libsodium, recompile and install more recent versions of OpenSSL and for which we found no CVEs issued—which does not mean then repeat the same process for each component of the software stack of the target system that depends on OpenSSL. 11https://www.circl.lu/services/data-feeds-cve/ Sometime this is not even possible if existing applications do not support the newer release of OpenSSL, in which case One can argue that this cost should be justified, e.g. in the system administrator would need to design a patch (or comparison with loading into memory only the library actually retrieve a trusted one) for the target software. This approach is providing the cryptographic implementation (in our case either impractical in most medium- or large-scale deployments, as it libsodium or HACL*). It is true that if the application is costly to maintain and definitely adds even more surface for was rewritten to only use the primitives in libsodium the rise of critical software defects and security vulnerabilities. —i.e. directly linking against libsodium —the memory The same can be told about planning to replace the crypto- cost would be more than halved compared to our approach. graphic provider for existing applications based on OpenSSL. 
However: (1) this would require redesigning every single Excluding some exceptions where software is designed with application that currently depends on OpenSSL, defying our a “cryptographic module abstraction layer” to easily swap goal of providing the additional functionality transparently to the default cryptographic module with alternative supported existing applications; (2) part of the design of libsodium ones, most applications do not usually allow this level of (and NaCl) is based on providing only an opinionated set of flexibility, and extensive patches would be required to modify, primitives, therefore applications patched to rely exclusively for example, an application using OpenSSL to serve TLS con- on libsodium for their cryptographic needs would lose nections to use a different library for the TLS stack supporting generality and compatibility with other applications, which is libsodium as the cryptographic primitive provider. often not an option; (3) libsodium only provides crypto- Accessibility for researchers: Another goal of our research graphic primitives, so redesigning the applications as has been listed in Section I aimed at delivering a framework to enable proposed would also incur the additional cost (and security researchers to do testing and benchmarks in real-world sce- risks) of selecting and adapting to yet another library to narios, and to easily disseminate novel implementations and implement the protocol functionality (e.g. TLS) on top of primitives to a larger user audience. libsodium. Our proposed methodology achieve this by lowering the For these reasons we do not believe that such comparison is development and maintenance costs of adding functionality to fair, because what forks of NaCl gain in elegance, conciseness, the widely used OpenSSL library. 
ease-of-use, performance and security comes at the cost of Freedom of choice in programming languages and building not being interoperable with projects that do not implement tools for the actual cryptographic implementation makes it the same set of primitives, and with both the benefits and the easier to plug functionality into OpenSSL. It also lowers drawbacks of being only a cryptographic library and not a full the long term maintenance costs, especially considering the stack library. usually long and possibly unfruitful process of proposing the libsuola memory overhead: Regarding the overhead of addition of novel or alternative functionality into the main libsuola itself, with respect to memory consumption, its OpenSSL codebase. Moreover, the higher degree of freedom footprint is negligible compared with the memory require- about licensing allows to further reduce the entry barrier for ments of OpenSSL across all the tested architectures and adding functionality to applications based on OpenSSL. OpenSSL versions. Finally, it provides an interesting framework to collect real To give some figures, in the case of our x86_64 target data about researchers’ novel contributions, which are readily platform and OpenSSL 1.1.1, the resident set size in memory comparable with existing functionality by reusing existing of libsuola.so alone totals 60 KB, compared with over benchmarking facilities. Moreover, being an optional compo- 2 MB of memory occupied by OpenSSL alone (excluding the nent and easy to plug in at run-time, it makes it easier for additional 1.2 MB occupied by libc, libpthread, and researchers to reach a wider user audience for more extensive libdl which are required by OpenSSL). testing or for releasing their projects. libsuola computational overhead: With respect to com- putation, excluding from our analysis md_identity which libsuola C. 
Limits is not of particular relevance, itself adds some computational overhead once when it is dynamically loaded by The scope is limited to OpenSSL: First and foremost, our OpenSSL during initialization, and a small computational cost proposed methodology applies to OpenSSL only. While it can every time the functions wrapping the actual cryptographic be argued that this is sufficient considering the portion of the primitives in the provider module are called. whole Internet that currently depends on OpenSSL as part of OpenSSL uses exactly the same level of indirection whether its security stack, we recognize that this is a strong limit of the function implementing a primitive resides in OpenSSL our proposed methodology. In Section VI we discuss potential itself or in an external ENGINE. The additional cost comes research directions to overcome this limit in the future. from the fact that the implementing function is resolved with Overhead of linking to other libraries: Another limit of our the memory address of the function in the provider object approach is the memory consumption: when using libsuola inside libsuola, which then calls the actual function in to integrate missing functionality in OpenSSL, in addition to libsodium, HACL* or the internal object with the donna the memory required by OpenSSL itself, libsuola and the implementation. library it depends on (libsodium or HACL*) are also loaded With the libsuola-donna case the wrapper is actually into memory. required as the donna implementation does not implement the NaCl C API, but in the case of libsuola-sodium and natively supported by OpenSSL: e.g. PKCS #11 support relies libsuola-hacl, where the wrapper only calls the relevant on an external dynamic ENGINE. 
function in the target library, this overhead could be erased Operating System assisted crypto: Operating Systems are by annotating the source code of the providers with compiler by design natural candidates to provide security services and attributes to instead resolve the final function pointer at load abstraction of hardware capabilities to users and applications time, so that OpenSSL would call directly the relevant function of a system. Modern OS designs provide a hardware ab- in the target library. straction layer (HAL), separation of security contexts and This further optimization would come at the cost of higher levels, access control, authentication of loadable modules and complexity, inferior code readability and limited portability, binaries, have privileged access to cryptographic units and co- which we deemed to be not in line with the stated goals of processors, and usually already require a set of cryptographic our project. We also believe that researchers who strongly need primitives for their internal functions. to squeeze out every single superfluous assembly instruction, The OpenBSD/FreeBSD Cryptographic Framework (OCF) while testing their code against OpenSSL through our pro- provides convenient access to the kernel cryptographic func- posed methodology, will have no problem implementing the tionalities (including symmetric and public-key crypto, hashes, described expedient. MACs, access to crypto hardware accelerators and RNGs) to userspace through a /dev/crypto pseudo-device [40, 41] VI.RELATED WORK and a standard ioctl interface, simplifying the development We focused our analysis on the OpenSSL project and of applications and reducing the required cryptographic aware- on ENGINEs as the native mechanism to handle alternative ness of application developers. cryptographic implementations for the supported primitives. 
In this section, we present an overview of how other security standards, libraries, and frameworks address extensibility.

Describing programming interfaces and application protocols to provide interoperability and transparency between applications and cryptographic providers in the form of hardware and software cryptographic modules is a decades-old problem. Standardization bodies, operating systems, security software and hardware vendors, manufacturers and various organizations have over the years tackled the issue with similar goals but with different approaches (e.g. regarding the cryptographic awareness required to adopt the proposed solution), trust models and features.

Standards: One of the more relevant and widespread cryptographic standards is PKCS #11 [33]: it specifies the “Cryptoki” API for devices that hold cryptographic information and perform cryptographic functions, presenting applications with an abstract “cryptographic token” view, thus providing a separation layer between applications and specific algorithms and implementations. The PKCS #11 API defines abstract object types to represent symmetric and asymmetric crypto keys, digital signature keys, X.509 certificates, hash functions, MACs, and other common cryptographic objects, and all the functions needed to generate, modify, use and delete such objects. Just like OpenSSL ENGINEs, PKCS #11 was originally designed for hardware devices, but the level of abstraction provided allows it to be used also for software cryptographic modules. By design, PKCS #11 defines use cases with many applications and many tokens, allowing interaction and the possibility to choose among alternative implementations.

Other related standards that aim to separate applications from cryptographic implementation details include GSS-API [34, 35] and IDUP-GSS-API [36], CDSA [37], SS-API [38], and the Simple Cryptographic Program Interface [39]. These standards were excluded as an alternative to the presented libsuola approach because currently they are not

Recent versions of the Linux kernel include the AF_ALG interface (providing access to the kernel symmetric crypto functions through the socket interface), while independent developers maintain a set of patches¹² to provide a /dev/crypto ioctl interface compatible with the OCF one. See [42] for a performance analysis of this framework in different usage scenarios and [43] for a proposal to enhance this framework to provide decoupling of cryptographic keys from applications for increased properties.

A different project provides a complete port of OCF to the Linux kernel¹³, including the ioctl interface but using the ported implementations instead of the native Linux crypto capabilities.

Microsoft Windows OSs expose a Cryptographic Application Programming Interface (CAPI, a.k.a. CryptoAPI), providing an abstraction layer and a set of dynamically linked libraries decoupling applications from the functionality provided by Cryptographic Service Providers (CSP). The supported functionality includes RNGs, symmetric and public-key crypto, hashes and MACs, authentication, PKI and key storage. The “Cryptographic API: Next Generation” (CNG)¹⁴ is the latest long-term replacement of the original API, providing a compatibility layer and featuring better support for configuration, cryptographic agility, and a wider selection of supported primitives, covering the whole NSA Suite B [44] (including ECC).

Support of OS-provided cryptographic functionality in OpenSSL is generally implemented through dedicated ENGINEs.

Other security libraries and frameworks: Mozilla NSS¹⁵ is another widespread set of libraries supporting cross-platform development of security-enabled client and server applications. It has native PKCS #11 support for hardware and software security modules, and indeed the internal cryptosystem implementations are encapsulated in a software PKCS #11 token, allowing selection among alternative implementations of crypto primitives during configuration.

GnuTLS¹⁶ is a high-level library providing TLS support, and relies on other libraries for crypto primitives. It features a multilayered architecture, based on a Cryptography Provider Layer, which provides abstraction from the actual providers for individual primitives and allows transparent access to implementations provided by the underlying cryptographic software library (i.e. libgcrypt or nettle), to OS assisted crypto (e.g. /dev/crypto or CAPI), non-privileged CPU crypto instructions (e.g. Intel AES-NI), and TPM or hardware and software cryptographic modules through a rich native support for PKCS #11. The architecture also provides functions to override symmetric crypto implementations and “abstract private keys” as a way to handle abstractions over keys stored in hardware modules (PKCS #11 or TPM) or over operations implemented directly using an external API.

ARM mbed TLS¹⁷ (formerly known as PolarSSL) has optional support for Hardware Security Modules (HSM) with PKCS #11, and supports the replacement of specific functions or full cryptosystem modules with alternative implementations, but limited to compile time.¹⁸

cryptlib [45, 46] has built-in native support for selected hardware crypto accelerators and for PKCS #11 devices, ensuring that the API for any crypto device is identical to the API of cryptlib native crypto implementations, allowing easy and transparent migration of applications from the native software implementations to the use of crypto devices.

The Java SDK includes the Java Secure Socket Extension (JSSE)¹⁹ and the Java Cryptography Architecture (JCA).²⁰ These modular frameworks provide an abstraction layer respectively for secure network protocols like TLS and for cryptographic primitive implementations. Through the factory method design pattern and the Service Provider Interface (SPI) programming interface, JCA supports the registration of Cryptographic Service Provider (CSP) instances. Applications using the framework can either select a specific CSP for a primitive or let the system configuration select the most suitable implementation at runtime, attaining complete decoupling between applications and crypto implementations.

Other libraries have a completely opposite design philosophy: NaCl and libsodium, for example, are intentionally designed as simplified and opinionated libraries to avoid “cryptographic disasters” due to insufficient cryptographic awareness of application developers. As such, cryptographic agility and configurability are considered “antifeatures” due to the inherent complexity costs and the risk of enabling adopters to select insecure combinations of primitives.

From this brief survey, PKCS #11, due to its widespread support, appears to be a suitable candidate to provide alternative crypto primitive implementations portable across different libraries. Unfortunately, OpenSSL does not natively support PKCS #11, but only through a third party ENGINE implementation. Nonetheless, this application of PKCS #11 seems to be an interesting direction for further research on this topic.

VII. CONCLUSION

In this work, we analyzed issues affecting real-world security across currently deployed, ubiquitous cryptography software libraries. The consequences derived from these issues include: (1) limited assurance that software implementations are functionally correct; (2) low or non-existent accountability in the software engineering decision-making process; (3) costly and nonscalable upkeep of custom builds augmented with new features; and (4) additional security risks due to conflicting release strategies between cryptographic libraries and OS vendors.

In addition, we also examined some factors that hinder researchers and practitioners from achieving timely and widespread dissemination of their scientific results in real-world applications, thus impeding the potential impact of current cryptography research.

As a possible solution, we presented the adoption of OpenSSL ENGINEs as a framework to overcome these limits and as a convenient tool for researchers to disseminate, test and benchmark their contributions in real-world applications. We demonstrated the usage, viability, limits and benefits of this framework through libsuola, a project that aims at providing support for emerging cryptography standards such as X25519 and Ed25519 for currently deployed versions of OpenSSL, while being completely transparent for existing applications.

Due to the nature of this research, we expect it to be the foundation for future work aiming at bridging the gaps between research results and real-world applications. Our methodology has already been applied [47] in NIST’s ongoing post-quantum cryptosystem standardization competition [48]. We intend to explore automated ENGINE construction, potentially leading to tooling that rigs together popular research-driven APIs (e.g. SUPERCOP) with practice-driven APIs. Moreover, we also expect it to serve as a means to enhance future research efforts, by providing a framework to ease the testing and benchmarking of novel scientific results in real-world systems and settings.

¹² http://cryptodev-linux.org
¹³ http://ocf-linux.sourceforge.net/
¹⁴ https://msdn.microsoft.com/en-us/library/windows/desktop/aa375276.aspx
¹⁵ https://developer.mozilla.org/en-US/docs/Mozilla/Projects/NSS

¹⁶ https://www.gnutls.org/
¹⁷ https://tls.mbed.org/
¹⁸ https://tls.mbed.org/kb/development/hw_acc_guidelines
¹⁹ https://docs.oracle.com/javase/8/docs/technotes/guides/security/jsse/JSSERefGuide.html
²⁰ https://docs.oracle.com/javase/8/docs/technotes/guides/security/crypto/CryptoSpec.html

Acknowledgments

Supported in part by Academy of Finland grant 303814. This article is based in part upon work from COST Action IC1403 CRYPTACUS, supported by COST (European Cooperation in Science and Technology).

REFERENCES

[1] D. J. Bernstein, T. Lange, and P. Schwabe, “The security impact of a new cryptographic library,” in Progress in Cryptology - LATINCRYPT 2012 - 2nd International Conference on Cryptology and Information Security in Latin America, Santiago, Chile, October 7-10, 2012. Proceedings, ser. Lecture Notes in Computer Science, A. Hevia and G. Neven, Eds., vol. 7533. Springer, 2012, pp. 159–176. [Online]. Available: https://doi.org/10.1007/978-3-642-33481-8_9
[2] J. K. Zinzindohoué, K. Bhargavan, J. Protzenko, and B. Beurdouche, “HACL*: A verified modern cryptographic library,” in Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS 2017, Dallas, TX, USA, October 30 - November 03, 2017, B. M. Thuraisingham, D. Evans, T. Malkin, and D. Xu, Eds. ACM, 2017, pp. 1789–1806. [Online]. Available: http://doi.acm.org/10.1145/3133956.3134043
[3] D. J. Bernstein and T. Lange, “eBACS: ECRYPT Benchmarking of Cryptographic Systems,” [Online; accessed 30-August-2017]. [Online]. Available: https://bench.cr.yp.to
[4] D. J. Bernstein, “Curve25519: New Diffie-Hellman speed records,” in Public Key Cryptography - PKC 2006, 9th International Conference on Theory and Practice of Public-Key Cryptography, New York, NY, USA, April 24-26, 2006, Proceedings, ser. Lecture Notes in Computer Science, M. Yung, Y. Dodis, A. Kiayias, and T. Malkin, Eds., vol. 3958. Springer, 2006, pp. 207–228. [Online]. Available: https://doi.org/10.1007/11745853_14
[5] D. J. Bernstein, N. Duif, T. Lange, P. Schwabe, and B. Yang, “High-speed high-security signatures,” J. Cryptographic Engineering, vol. 2, no. 2, pp. 77–89, 2012. [Online]. Available: https://doi.org/10.1007/s13389-012-0027-1
[6] A. Langley, M. Hamburg, and S. Turner, “Elliptic Curves for Security,” Internet Requests for Comments, RFC Editor, RFC 7748, Jan. 2016. [Online]. Available: https://datatracker.ietf.org/doc/rfc7748/
[7] S. Josefsson and I. Liusvaara, “Edwards-Curve Digital Signature Algorithm (EdDSA),” Internet Requests for Comments, RFC Editor, RFC 8032, Jan. 2017. [Online]. Available: https://datatracker.ietf.org/doc/rfc8032/
[8] E. Rescorla, “The Transport Layer Security (TLS) protocol version 1.3,” Internet Requests for Comments, RFC Editor, RFC 8446, Aug. 2018. [Online]. Available: https://datatracker.ietf.org/doc/rfc8446/
[9] Y. Nir, S. Josefsson, and M. Pégourié-Gonnard, “Elliptic Curve Cryptography (ECC) Cipher Suites for Transport Layer Security (TLS) Versions 1.2 and Earlier,” Internet Requests for Comments, RFC Editor, RFC 8422, Aug. 2018. [Online]. Available: https://datatracker.ietf.org/doc/rfc8422/
[10] E. Käsper and P. Schwabe, “Faster and timing-attack resistant AES-GCM,” in Cryptographic Hardware and Embedded Systems - CHES 2009, 11th International Workshop, Lausanne, Switzerland, September 6-9, 2009, Proceedings, ser. Lecture Notes in Computer Science, C. Clavier and K. Gaj, Eds., vol. 5747. Springer, 2009, pp. 1–17. [Online]. Available: https://doi.org/10.1007/978-3-642-04138-9_1
[11] J. W. Bos, C. Costello, L. Ducas, I. Mironov, M. Naehrig, V. Nikolaenko, A. Raghunathan, and D. Stebila, “Frodo: Take off the ring! Practical, quantum-secure key exchange from LWE,” in Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, October 24-28, 2016, E. R. Weippl, S. Katzenbeisser, C. Kruegel, A. C. Myers, and S. Halevi, Eds. ACM, 2016, pp. 1006–1018. [Online]. Available: http://doi.acm.org/10.1145/2976749.2978425
[12] C. Meyer and J. Schwenk, “SoK: Lessons learned from SSL/TLS attacks,” in Information Security Applications - 14th International Workshop, WISA 2013, Jeju Island, Korea, August 19-21, 2013, Revised Selected Papers, ser. Lecture Notes in Computer Science, Y. Kim, H. Lee, and A. Perrig, Eds., vol. 8267. Springer, 2013, pp. 189–209. [Online]. Available: https://doi.org/10.1007/978-3-319-05149-9_12
[13] E. Biham, Y. Carmeli, and A. Shamir, “Bug attacks,” in Advances in Cryptology - CRYPTO 2008, 28th Annual International Cryptology Conference, Santa Barbara, CA, USA, August 17-21, 2008. Proceedings, ser. Lecture Notes in Computer Science, D. A. Wagner, Ed., vol. 5157. Springer, 2008, pp. 221–240. [Online]. Available: https://doi.org/10.1007/978-3-540-85174-5_13
[14] B. B. Brumley, M. Barbosa, D. Page, and F. Vercauteren, “Practical realisation and elimination of an ECC-related software bug attack,” in Topics in Cryptology - CT-RSA 2012 - The Cryptographers’ Track at the RSA Conference 2012, San Francisco, CA, USA, February 27 - March 2, 2012. Proceedings, ser. Lecture Notes in Computer Science, O. Dunkelman, Ed., vol. 7178. Springer, 2012, pp. 171–186. [Online]. Available: https://doi.org/10.1007/978-3-642-27954-6_11
[15] R. Dubois, “Trapping ECC with invalid curve bug attacks,” IACR Cryptology ePrint Archive, vol. 2017, no. 554, 2017. [Online]. Available: http://eprint.iacr.org/2017/554
[16] J. B. Almeida, M. Barbosa, G. Barthe, A. Blot, B. Grégoire, V. Laporte, T. Oliveira, H. Pacheco, B. Schmidt, and P. Strub, “Jasmin: High-assurance and high-speed cryptography,” in Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS 2017, Dallas, TX, USA, October 30 - November 03, 2017, B. M. Thuraisingham, D. Evans, T. Malkin, and D. Xu, Eds. ACM, 2017, pp. 1807–1823. [Online]. Available: http://doi.acm.org/10.1145/3133956.3134078
[17] C. Percival, “Cache missing for fun and profit,” in BSDCan 2005, Ottawa, Canada, May 13-14, 2005, Proceedings, 2005. [Online]. Available: http://www.daemonology.net/papers/cachemissing.pdf
[18] D. J. Bernstein, “Cache-timing attacks on AES,” 2005. [Online]. Available: http://cr.yp.to/papers.html#cachetiming
[19] S. Gueron, “Efficient software implementations of modular exponentiation,” J. Cryptographic Engineering, vol. 2, no. 1, pp. 31–43, 2012. [Online]. Available: https://doi.org/10.1007/s13389-012-0031-5
[20] D. J. Bernstein and P. Schwabe, “A word of warning,” August 2013, CHES 2013 Rump Session. [Online]. Available: https://cryptojedi.org/peter/data/chesrump-20130822.pdf
[21] Y. Yarom, D. Genkin, and N. Heninger, “CacheBleed: A timing attack on OpenSSL constant time RSA,” in Cryptographic Hardware and Embedded Systems - CHES 2016 - 18th International Conference, Santa Barbara, CA, USA, August 17-19, 2016, Proceedings, ser. Lecture Notes in Computer Science, B. Gierlichs and A. Y. Poschmann, Eds., vol. 9813. Springer, 2016, pp. 346–367. [Online]. Available: https://doi.org/10.1007/978-3-662-53140-2_17
[22] C. Pereida García, B. B. Brumley, and Y. Yarom, ““Make sure DSA signing exponentiations really are constant-time”,” in Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, October 24-28, 2016, E. R. Weippl, S. Katzenbeisser, C. Kruegel, A. C. Myers, and S. Halevi, Eds. ACM, 2016, pp. 1639–1650. [Online]. Available: http://doi.acm.org/10.1145/2976749.2978420
[23] B. B. Brumley and R. M. Hakala, “Cache-timing template attacks,” in Advances in Cryptology - ASIACRYPT 2009, 15th International Conference on the Theory and Application of Cryptology and Information Security, Tokyo, Japan, December 6-10, 2009. Proceedings, ser. Lecture Notes in Computer Science, M. Matsui, Ed., vol. 5912. Springer, 2009, pp. 667–684. [Online]. Available: https://doi.org/10.1007/978-3-642-10366-7_39
[24] E. Käsper, “Fast elliptic curve cryptography in OpenSSL,” in Financial Cryptography and Data Security - FC 2011 Workshops, RLCPS and WECSR 2011, Rodney Bay, St. Lucia, February 28 - March 4, 2011, Revised Selected Papers, ser. Lecture Notes in Computer Science, G. Danezis, S. Dietrich, and K. Sako, Eds., vol. 7126. Springer, 2011, pp. 27–39. [Online]. Available: https://doi.org/10.1007/978-3-642-29889-9_4
[25] S. Gueron and V. Krasnov, “Fast prime field elliptic-curve cryptography with 256-bit primes,” J. Cryptographic Engineering, vol. 5, no. 2, pp. 141–151, 2015. [Online]. Available: https://doi.org/10.1007/s13389-014-0090-x
[26] B. B. Brumley, “Faster software for fast endomorphisms,” in Constructive Side-Channel Analysis and Secure Design - 6th International Workshop, COSADE 2015, Berlin, Germany, April 13-14, 2015. Revised Selected Papers, ser. Lecture Notes in Computer Science, S. Mangard and A. Y. Poschmann, Eds., vol. 9064. Springer, 2015, pp. 127–140. [Online]. Available: https://doi.org/10.1007/978-3-319-21476-4_9
[27] N. Benger, J. van de Pol, N. P. Smart, and Y. Yarom, ““Ooh Aah... Just a Little Bit”: A small amount of side channel can go a long way,” in Cryptographic Hardware and Embedded Systems - CHES 2014 - 16th International Workshop, Busan, South Korea, September 23-26, 2014. Proceedings, ser. Lecture Notes in Computer Science, L. Batina and M. Robshaw, Eds., vol. 8731. Springer, 2014, pp. 75–92. [Online]. Available: https://doi.org/10.1007/978-3-662-44709-3_5
[28] J. van de Pol, N. P. Smart, and Y. Yarom, “Just a little bit more,” in Topics in Cryptology - CT-RSA 2015, The Cryptographer’s Track at the RSA Conference 2015, San Francisco, CA, USA, April 20-24, 2015. Proceedings, ser. Lecture Notes in Computer Science, K. Nyberg, Ed., vol. 9048. Springer, 2015, pp. 3–21. [Online]. Available: https://doi.org/10.1007/978-3-319-16715-2_1
[29] T. Allan, B. B. Brumley, K. E. Falkner, J. van de Pol, and Y. Yarom, “Amplifying side channels through performance degradation,” in Proceedings of the 32nd Annual Conference on Computer Security Applications, ACSAC 2016, Los Angeles, CA, USA, December 5-9, 2016, S. Schwab, W. K. Robertson, and D. Balzarotti, Eds. ACM, 2016, pp. 422–435. [Online]. Available: http://doi.acm.org/10.1145/2991079.2991084
[30] N. Tuveri, S. ul Hassan, C. Pereida García, and B. B. Brumley, “Side-channel analysis of SM2: A late-stage featurization case study,” in Proceedings of the 34th Annual Computer Security Applications Conference, ACSAC 2018, San Juan, PR, USA, December 03-07, 2018. ACM, 2018, pp. 147–160. [Online]. Available: https://doi.org/10.1145/3274694.3274725
[31] A. C. Aldaya, B. B. Brumley, S. ul Hassan, C. Pereida García, and N. Tuveri, “Port contention for fun and profit,” in 2019 IEEE Symposium on Security and Privacy, SP 2019, Proceedings, 20-22 May 2019, San Francisco, California, USA. IEEE, 2019, pp. 1036–1053. [Online]. Available: https://doi.org/10.1109/SP.2019.00066
[32] D. J. Bernstein and P. Schwabe, “NEON crypto,” in Cryptographic Hardware and Embedded Systems - CHES 2012 - 14th International Workshop, Leuven, Belgium, September 9-12, 2012. Proceedings, ser. Lecture Notes in Computer Science, E. Prouff and P. Schaumont, Eds., vol. 7428. Springer, 2012, pp. 320–339. [Online]. Available: https://doi.org/10.1007/978-3-642-33027-8_19
[33] OASIS PKCS 11 TC, “PKCS #11 Cryptographic Token Interface Base Specification Version 2.40 Plus Errata 01,” OASIS Standard, OASIS, Organization for the Advancement of Structured Information Standards, Standard incorporating approved Errata, May 2016. [Online]. Available: http://docs.oasis-open.org/pkcs11/pkcs11-base/v2.40/errata01/os/pkcs11-base-v2.40-errata01-os-complete.html
[34] J. Linn, “Generic Security Service Application Program Interface, Version 2,” Internet Requests for Comments, RFC Editor, RFC 2078, Jan. 1997. [Online]. Available: https://datatracker.ietf.org/doc/rfc2078/
[35] J. Linn, “Generic Security Service Application Program Interface Version 2, Update 1,” Internet Requests for Comments, RFC Editor, RFC 2743, Jan. 2000. [Online]. Available: https://datatracker.ietf.org/doc/rfc2743/
[36] D. C. Adams, “Independent Data Unit Protection Generic Security Service Application Program Interface (IDUP-GSS-API),” Internet Requests for Comments, RFC Editor, RFC 2479, Dec. 1998. [Online]. Available: https://datatracker.ietf.org/doc/rfc2479/
[37] The Open Group, “Common Data Security Architecture (CDSA), Version 2,” May 2000.
[38] NSA Cross-Organization CAPI Team, “Security service API: Cryptographic API recommendation,” 1996.
[39] V. Smyslov, “Simple Cryptographic Program Interface (Crypto API),” Internet Requests for Comments, RFC Editor, RFC 2628, Jun. 1999. [Online]. Available: https://datatracker.ietf.org/doc/rfc2628/
[40] A. D. Keromytis, J. L. Wright, and T. de Raadt, “The design of the OpenBSD cryptographic framework,” in Proceedings of the General Track: 2003 USENIX Annual Technical Conference, June 9-14, 2003, San Antonio, Texas, USA. USENIX, 2003, pp. 181–196. [Online]. Available: http://www.usenix.org/events/usenix03/tech/keromytis.html
[41] A. D. Keromytis, J. L. Wright, T. de Raadt, and M. Burnside, “Cryptography as an operating system service: A case study,” ACM Trans. Comput. Syst., vol. 24, no. 1, pp. 1–38, 2006. [Online]. Available: http://doi.acm.org/10.1145/1124153.1124154
[42] Y. Dutta and V. Sethi, “Performance analysis of cryptographic acceleration in multicore environment,” in Quality, Reliability, Security and Robustness in Heterogeneous Networks - 9th International Conference, QShine 2013, Greater Noida, India, January 11-12, 2013, Revised Selected Papers, ser. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, K. Singh and A. K. Awasthi, Eds., vol. 115. Springer, 2013, pp. 658–667. [Online]. Available: https://doi.org/10.1007/978-3-642-37949-9_57
[43] N. Mavrogiannopoulos, M. Trmac, and B. Preneel, “A linux kernel cryptographic framework: decoupling cryptographic keys from applications,” in Proceedings of the ACM Symposium on Applied Computing, SAC 2012, Riva, Trento, Italy, March 26-30, 2012, S. Ossowski and P. Lecca, Eds. ACM, 2012, pp. 1435–1442. [Online]. Available: http://doi.acm.org/10.1145/2245276.2232006
[44] U.S. National Security Agency, “NSA Suite B Cryptography,” January 2009. [Online]. Available: http://www.nsa.gov/ia/programs/suiteb_cryptography/
[45] P. Gutmann, “cryptlib security toolkit, 3.4.3.1,” January 2017. [Online]. Available: https://www.cs.auckland.ac.nz/~pgut001/cryptlib/
[46] P. Gutmann, Cryptographic Security Architecture. Design and Verification, 1st ed. Springer-Verlag New York, 2004.
[47] A. Barnett and A. Byrne, “Lattice-based cryptographic key management prototype,” H2020: Secure Architectures of Future Emerging Cryptography (SAFEcrypto), Deliverable D8.3. [Online]. Available: https://cordis.europa.eu/project/rcn/194240/results/en
[48] L. Chen, S. Jordan, Y.-K. Liu, D. Moody, R. Peralta, R. Perlner, and D. Smith-Tone, “Report on post-quantum cryptography,” National Institute of Standards and Technology, NISTIR 8105, April 2016. [Online]. Available: http://doi.org/10.6028/NIST.IR.8105

APPENDIX A
DETAILED BENCHMARK RESULTS

TABLE III
Benchmark results on a 4-cores/8-threads Intel Core i5-6700 CPU (Skylake) running at 3.4GHz, with Enhanced Intel SpeedStep Technology and Intel Turbo Boost Technology disabled. The workstation is equipped with 32GB 2667MHz DRAM, and runs Ubuntu 18.04.2 LTS with Linux x86_64 kernel 4.15.0-45-generic. For each operation we collected 12800 samples and computed the median value. In parentheses, the CPU cycles ratio between the current column and the reference (leftmost) value.

CPU cycles per operation, OpenSSL-1.0.2r
Operation         default   libsuola-sodium   libsuola-donna    libsuola-hacl
nistz256 keygen   41528     -                 -                 -
X25519 keygen     -         128581            90320 (1.4x)      166736 (0.8x)
Ed25519 keygen    -         80286             92710 (0.9x)      300552 (0.3x)
nistz256 derive   172436    -                 -                 -
X25519 derive     -         135892            149116 (0.9x)     159182 (0.9x)
nistz256 sign     118090    -                 -                 -
Ed25519 sign      -         82160             88426 (0.9x)      587925 (0.1x)
nistz256 verify   253644    -                 -                 -
Ed25519 verify    -         207832            284916 (0.7x)     611872 (0.3x)

CPU cycles per operation, OpenSSL-1.1.0j
Operation         default   libsuola-sodium   libsuola-donna    libsuola-hacl
nistz256 keygen   43542     -                 -                 -
X25519 keygen     117484    129992 (0.9x)     91510 (1.3x)      167815 (0.7x)
Ed25519 keygen    -         80250             92924 (0.9x)      301606 (0.3x)
nistz256 derive   172404    -                 -                 -
X25519 derive     342231    136888 (2.5x)     149159 (2.3x)     159084 (2.2x)
nistz256 sign     124082    -                 -                 -
Ed25519 sign      -         82161             87204 (0.9x)      587762 (0.1x)
nistz256 verify   259612    -                 -                 -
Ed25519 verify    -         205114            286386 (0.7x)     611105 (0.3x)

CPU cycles per operation, OpenSSL-1.1.1b
Operation         default   libsuola-sodium   libsuola-donna    libsuola-hacl
nistz256 keygen   38877     -                 -                 -
X25519 keygen     114124    128595 (0.9x)     86838 (1.3x)      162968 (0.7x)
Ed25519 keygen    115230    80232 (1.4x)      88382 (1.3x)      297225 (0.4x)
nistz256 derive   171662    -                 -                 -
X25519 derive     118312    135398 (0.9x)     149148 (0.8x)     159346 (0.7x)
nistz256 sign     75490     -                 -                 -
Ed25519 sign      114782    82287 (1.4x)      87428 (1.3x)      587884 (0.2x)
nistz256 verify   231329    -                 -                 -
Ed25519 verify    374236    207457 (1.8x)     284330 (1.3x)     611566 (0.6x)

TABLE IV
Benchmark results on a Raspberry Pi 3B, equipped with a quad-core 1.2GHz Broadcom BCM2837 64bit CPU and 1GB RAM, running Ubuntu 18.04.2 LTS and 32-bit armv7l Linux kernel version 4.15.0-1031-raspi2. A CPU core was reserved exclusively for the benchmark process, and frequency scaling disabled via software. We monitored the RTOS running on the embedded VideoCore VPU controlling the SoC to ensure that undervoltage, frequency capping and external throttling were avoided for the duration of the benchmark. For each operation we collected 12800 samples and computed the median value. In parentheses, the CPU cycles ratio between the current column and the reference (leftmost) value.

CPU cycles per operation, OpenSSL-1.0.2r
Operation         default    libsuola-sodium   libsuola-donna
nistp256 keygen   5915353    -                 -
X25519 keygen     -          582951            506892 (1.2x)
Ed25519 keygen    -          593302            513263 (1.2x)
nistp256 derive   6054496    -                 -
X25519 derive     -          1569018           484920 (3.2x)
nistp256 sign     6320842    -                 -
Ed25519 sign      -          616191            503163 (1.2x)
nistp256 verify   4155613    -                 -
Ed25519 verify    -          1766874           1509083 (1.2x)

CPU cycles per operation, OpenSSL-1.1.0j
Operation         default    libsuola-sodium   libsuola-donna
nistz256 keygen   219973     -                 -
X25519 keygen     575106     583914 (1.0x)     510528 (1.1x)
Ed25519 keygen    -          595987            516408 (1.2x)
nistz256 derive   1149047    -                 -
X25519 derive     1546213    1568875 (1.0x)    484890 (3.2x)
nistz256 sign     553960     -                 -
Ed25519 sign      -          618496            502348 (1.2x)
nistz256 verify   1503140    -                 -
Ed25519 verify    -          1771464           1508409 (1.2x)

CPU cycles per operation, OpenSSL-1.1.1b
Operation         default    libsuola-sodium   libsuola-donna
nistz256 keygen   221288     -                 -
X25519 keygen     582162     583748 (1.0x)     510915 (1.1x)
Ed25519 keygen    587208     596005 (1.0x)     516872 (1.1x)
nistz256 derive   1147024    -                 -
X25519 derive     1550240    1568880 (1.0x)    484880 (3.2x)
nistz256 sign     515348     -                 -
Ed25519 sign      576812     621402 (0.9x)     505244 (1.1x)
nistz256 verify   1559354    -                 -
Ed25519 verify    1757511    1792064 (1.0x)    1530556 (1.1x)

TABLE V
Benchmark results on a Raspberry Pi 3B, equipped with a quad-core 1.2GHz Broadcom BCM2837 64bit CPU and 1GB RAM, running Ubuntu 18.04.2 LTS and 64-bit aarch64 Linux kernel version 4.15.0-1031-raspi2. A CPU core was reserved exclusively for the benchmark process, and frequency scaling disabled via software. We monitored the RTOS running on the embedded VideoCore VPU controlling the SoC to ensure that undervoltage, frequency capping and external throttling were avoided for the duration of the benchmark. For each operation we collected 12800 samples and computed the median value. In parentheses, the CPU cycles ratio between the current column and the reference (leftmost) value.

CPU cycles per operation, OpenSSL-1.0.2r
Operation         default    libsuola-sodium   libsuola-donna    libsuola-hacl
nistp256 keygen   401806     -                 -                 -
X25519 keygen     -          213138            182785 (1.2x)     524648 (0.4x)
Ed25519 keygen    -          216960            186450 (1.2x)     993557 (0.2x)
nistp256 derive   1087916    -                 -                 -
X25519 derive     -          558776            479496 (1.2x)     514212 (1.1x)
nistp256 sign     808516     -                 -                 -
Ed25519 sign      -          221356            174227 (1.3x)     1956526 (0.1x)
nistp256 verify   1525158    -                 -                 -
Ed25519 verify    -          656302            566792 (1.2x)     2027707 (0.3x)

CPU cycles per operation, OpenSSL-1.1.0j
Operation         default    libsuola-sodium   libsuola-donna    libsuola-hacl
nistz256 keygen   140707     -                 -                 -
X25519 keygen     242538     213084 (1.1x)     184170 (1.3x)     526310 (0.5x)
Ed25519 keygen    -          217356            187964 (1.2x)     995222 (0.2x)
nistz256 derive   539104     -                 -                 -
X25519 derive     536012     558572 (1.0x)     479776 (1.1x)     514541 (1.0x)
nistz256 sign     328727     -                 -                 -
Ed25519 sign      -          221468            174797 (1.3x)     1956594 (0.1x)
nistz256 verify   792487     -                 -                 -
Ed25519 verify    -          652126            566749 (1.2x)     2027784 (0.3x)

CPU cycles per operation, OpenSSL-1.1.1b
Operation         default    libsuola-sodium   libsuola-donna    libsuola-hacl
nistz256 keygen   144158     -                 -                 -
X25519 keygen     250089     212957 (1.2x)     187706 (1.3x)     530116 (0.5x)
Ed25519 keygen    253400     217181 (1.2x)     191261 (1.3x)     998444 (0.3x)
nistz256 derive   539346     -                 -                 -
X25519 derive     519758     558584 (0.9x)     479632 (1.1x)     514342 (1.0x)
nistz256 sign     238736     -                 -                 -
Ed25519 sign      235394     221840 (1.1x)     174794 (1.3x)     1956938 (0.1x)
nistz256 verify   721034     -                 -                 -
Ed25519 verify    619356     648358 (1.0x)     563364 (1.1x)     2028430 (0.3x)
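As a reading aid for the tables in this appendix, the following minimal Python sketch shows how a parenthesized ratio can be recomputed from two tabulated medians. It rests on an assumption checked only against the printed numbers: the ratios are consistent with dividing the reference (leftmost available) value by the current column's value, so that figures above 1.0x mean fewer CPU cycles than the reference. The helper names `median_cycles` and `cycles_ratio` and the raw sample list are hypothetical and are not part of the benchmark tooling described in the paper.

```python
from statistics import median


def median_cycles(samples):
    """Median of the per-operation CPU cycle counts collected in one run."""
    return median(samples)


def cycles_ratio(reference_cycles, current_cycles):
    """Reference cycles divided by current cycles, formatted like the tables."""
    return f"{reference_cycles / current_cycles:.1f}x"


# Hypothetical raw samples, only to exercise median_cycles(); the actual
# benchmarks collect 12800 samples per operation.
samples = [90310, 90320, 90335, 90320, 91002]
print(median_cycles(samples))

# Medians taken from Table III, OpenSSL-1.0.2r, X25519 keygen.
# libsuola-sodium is the leftmost available value, hence the reference.
libsuola_sodium = 128581
libsuola_donna = 90320
libsuola_hacl = 166736
print(cycles_ratio(libsuola_sodium, libsuola_donna))
print(cycles_ratio(libsuola_sodium, libsuola_hacl))
```

Under this reading, a row whose default column is empty uses the first populated column as its reference, which is why that column carries no ratio of its own.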