
Masaryk University Faculty of Informatics

Comparing X.509 Certificate Validation Errors Across TLS libraries

Bachelor’s Thesis

Pavol Žáčik

Brno, Spring 2021

Declaration

Hereby I declare that this paper is my original authorial work, which I have worked out on my own. All sources, references, and literature used or excerpted during elaboration of this work are properly cited and listed in complete reference to the due source.

Pavol Žáčik

Advisor: RNDr. Martin Ukrop

Acknowledgements

My deep gratitude goes to my advisor, Martin Ukrop, for his invaluable tutorship over the last two years. It is a pleasure to be guided by you. I thank my family for their support, advice and calm. I am grateful to Mrs. Szórádová for her motivational words of worries. Finally, my warmest thanks go to Maťka, for her tender care, avid interest, and ceaseless love.

Abstract

IT professionals often encounter certificate validation errors when dealing with TLS. In such situations, their decisions may be crucial for the security of the systems they implement. However, errors differ depending on the TLS library used, and official documentation usually does not help much. This thesis compares certificate validation errors occurring in five common TLS libraries. To do so, it employs a custom set of erroneous certificates. Furthermore, a simple TLS connection is implemented in the five libraries. As a result, we establish a mapping between the corresponding errors from different libraries. The mapping is published online, together with the erroneous certificates and TLS source code. All three resources are intended for developers who require guidance.

Keywords

certificate validation, documentation, usable security, validation error, X.509, TLS library

Contents

1 Introduction

2 Crafting malformed certificates
  2.1 Public key certificates
    2.1.1 X.509 certificate profile
    2.1.2 Certificate path validation
    2.1.3 Certificate revocation mechanisms
  2.2 Creating a certificate dataset
    2.2.1 Prototype solution and its limits
    2.2.2 Abstract Syntax Notation 1
    2.2.3 Constructing arbitrary ASN.1 structures
    2.2.4 X.509 Python module
    2.2.5 Generating invalid certificate chains
    2.2.6 Final implementation

3 Validating certificates within TLS
  3.1 TLS protocol
    3.1.1 TLS handshake
    3.1.2 Security properties of TLS
    3.1.3 TLS deployment
  3.2 Implementing certificate validation
    3.2.1 Prototype solution
    3.2.2 Client-side TLS connection
    3.2.3 Library choice
    3.2.4 TLS client implementation
    3.2.5 Server implementation

4 Comparing errors across libraries
  4.1 Goals
  4.2 Build
  4.3 Result processing
    4.3.1 Collecting error data
    4.3.2 Observations
    4.3.3 Cross-library error linking
  4.4 Results evaluation
    4.4.1 Error taxonomy improvements

5 Deployment
  5.1 Error documentation
  5.2 Developer guides

6 Related work
  6.1 Malformed certificates
  6.2 Improving certificate infrastructure
  6.3 X.509 and TLS usability

7 Conclusion
  7.1 Limitations
  7.2 Future work

References

Appendix: Source code and build

1 Introduction

X.509 public key certificates play a key part within multiple modern security protocols, including those most commonly used, such as TLS. The purpose of certificates in the protocol is rather simple. Still, many end users and developers alike fail to understand the consequences of deploying and trusting invalid ones [5, 84].

Pitfalls of certificate validation. Before one can rely on a certificate, it must be properly validated. However, due to the complex nature of certificates, validating them is a rather involved procedure, with its description spanning multiple documents [29, 71, 74, 75]. There are many ways in which a certificate or a certificate chain can become erroneous. During the validation process, all possible misconfigurations need to be checked. Failing to do so properly may lead to accepting a forged certificate and thus to security threats [35]. Researchers have been assessing certificate validation source code in cryptographic libraries for years and are still able to find new security holes within it [19, 24, 49]. Even if we assume that no such holes exist, there are new issues to consider when developers start using the libraries’ source code in their applications.

Usable documentation. Some errors occurring during certificate validation pose no or only minimal risks. As a consequence, developers may have to differentiate between “benign” and “malign” errors. These decisions can be crucial for security, as they affect many end users [40]. To help developers make better decisions, cryptographic libraries’ authors should strive to provide detailed and usable documentation regarding both the deployment of certificate validation and the errors occurring therein. Unfortunately, previous research shows that such attempts are scarce [1, 39]. To further complicate the issue, the taxonomy of possible errors is chaotic and not at all unified. Each cryptographic library uses its own set of error codes when referring to certificate deformities. Hence, developers face extra trouble when transitioning from one library to another. Unifying the system would be difficult, but previous successful attempts [34, 44] have proven similar tasks possible.

Our contributions. As part of the Usable X.509 Errors project [85], this thesis aims to provide a supplementary resource for developers to consult when dealing with certificate validation. We divide our contributions into three main parts.

1. Erroneous certificate dataset. We establish a dataset of more than 60 distinct malformed certificate chains, which are publicly available for testing purposes. Each chain is initially crafted by hand, but the dataset itself is dynamically generated (Chapter 2).

2. Client-side TLS implementation. We implement a basic client-side TLS connection in five cryptographic libraries. Each implementation properly validates server certificates. Furthermore, the source code of three TLS clients is published online in the form of well-documented developer guides (Chapter 3).

3. Cross-library error mapping. Combining the two previous contributions, we develop an automated system which compares certificate validation errors occurring within the five TLS-enabled libraries. This comparison aims to be one of the initial steps in improving X.509-related documentation (Chapter 4).

Together, the three listed topics form the main scope of this thesis. The rest is laid out as follows.

Deployment. All of our newly created resources are deployed on the website x509errors.org. Neither the deployment process nor the website design lies within the main focus of the thesis, but they are briefly described to give an idea of how our results are published (Chapter 5).

Related work. The list of previous research related to our work is relatively wide. We present relevant research efforts and locate our contributions among them (Chapter 6).

Conclusion. Our work can be further extended in multiple directions. Along with the summary of the thesis, we list its limits and propose possible future extensions and improvements (Chapter 7).

Acknowledgment of collaboration

As already mentioned, this thesis is part of the Usable X.509 Errors project [85]. Multiple other students collaborate on the project, and thus the work presented cannot be fully my own. Here, I list all of their individual contributions.

My thesis advisor and the project lead, Martin Ukrop, helped with prototype development of both certificate generation (Section 2.2.1) and certificate validation (Section 3.2.1). Both were completely replaced, but I include them for illustrative purposes. Additionally, the design of the website where our results are deployed (Chapter 5) is almost exclusively his effort. Therefore, it is presented only in brief.

Matěj Grabovský implemented initial versions of three out of five TLS clients (Section 3.2.4), specifically of OpenSSL, GnuTLS, and TLS. His work was then further improved and refactored.

Eric Vincent Valčík helped with website development (Section 5.1) and with collecting library error data (Section 4.3.1). He implemented a prototype of certificate mapping, which is not mentioned in the thesis, since it was completely rewritten.

Unless explicitly stated otherwise, all further work presented in the following chapters is my own.

2 Crafting malformed certificates

In order to be able to compare certificate validation error messages occurring within cryptographic libraries, we must prepare a certificate dataset to validate. Such a dataset should be reasonably large and diverse so that it will exhibit as many validation errors as possible. This chapter begins with a description of the X.509 certificate profile (Section 2.1), followed by the tools we implemented to create arbitrary certificate structures (Section 2.2).

2.1 Public key certificates

Using asymmetric cryptography gives rise to some major obstacles. When keys are designed to be public, their authenticity is at risk. The encrypting party needs to be certain that a public encryption key is, in fact, a complement to a private key owned by the intended recipient. Such a requirement is crucial; otherwise, any untrusted adversary could claim to be the recipient and present their own public key instead. To resolve this issue, a public key certificate is employed. As its name suggests, this document binds a public key to a subject’s identity, thus certifying the possession of the associated private key by the subject. The binding is done by means of a digital signature wherein the signer must be a trusted authority, usually called a certificate authority (CA).

2.1.1 X.509 certificate profile

The X.509 certificate profile, currently defined in RFC 5280 [29], has been the de facto standard of public key certificates for more than three decades [36, 62]. The document specifies both the structure and the semantics of a digital certificate, building upon the ASN.1 description language [45]. Three versions of X.509 certificates were defined throughout their history, the first two being rather short-lived and soon replaced by version 3 (v3).

All X.509-compliant certificates must consist of three main fields. A tbsCertificate (certificate “to be signed”) holds all information about a certificate in plaintext. The issuer of the certificate signs this information using an algorithm specified in the second, signatureAlgorithm field, storing the result in the third, signature field.

Within the tbsCertificate itself, multiple subfields are required to be present. What follows is a short description of both the mandatory and the optional ones.

Version. This field specifies the version of the certificate. It should therefore contain an integer value ranging from 0 to 2, since the three existing versions are indexed from 0.

Serial Number. Every certificate issuer should keep track of the certificates it issued. This is done by assigning a unique integer to each issued certificate and storing it in the serialNumber field.

Signature. The signature field contains an object identifier¹ of the algorithm used to sign the certificate. Notice that such a field is already present in the outer structure. These two must unconditionally match.

Issuer. The name of the issuing CA is stored in the issuer field. To conform to the specification, the name must be equal to the subject name in the issuing CA certificate.

Validity. In order to reduce the negative impacts of their possible misuse, certificates should not be issued for an indefinite period. The validity field determines the lifetime of a certificate, which includes the time when it becomes valid as well as its expiration time.

Subject. The subject field identifies the entity to which the public key certificate belongs. This entity must be the sole owner of the associated private key.

Subject Public Key Info. The public key itself is contained within the subjectPublicKeyInfo field, together with an OID of its algorithm, such as RSA or ECDSA.

Issuer Unique ID and Subject Unique ID (optional). These two fields were originally intended to uniquely identify the issuer and the subject, but their usage is currently discouraged [29]. Instead, two extensions, Authority Key Identifier and Subject Key Identifier, should be employed to identify both entities by their public keys.

Extensions (optional). With the release of X.509v3, the previously listed fields were deemed insufficient for practical use and the option to include arbitrary extensions was introduced. For this purpose, the Extensions field can be used.

In order to conform to the most recent X.509 certificate profile, certain certificates have to include some standard extensions. We list the basic ones below.

• Authority Key Identifier and Subject Key Identifier. These two extensions serve as unique identifiers of both the issuer and the subject public key. They usually contain a hash value of the key in question.

• Basic Constraints. The Basic Constraints extension is used to determine whether a certificate belongs to a certificate authority. Moreover, it can constrain the length of a certificate chain.

• Key Usage and Extended Key Usage. A public key contained in a certificate usually cannot be used for arbitrary purposes. The permitted uses of the key, such as signing other certificates or key establishment, are listed in the Key Usage and Extended Key Usage extensions.

• Issuer Alternative Name and Subject Alternative Name. The subject and issuer fields do not provide the possibility to specify certain types of names, such as URIs or IP addresses. Hence, these two extensions may be used in addition.

• Name Constraints. The Name Constraints extension is used in CA certificates to constrain the names of certificates that an authority can issue.

• Certificate Policies. The Certificate Policies extension puts further requirements on certificate usage. These requirements may be arbitrary, for instance constraining how the associated private key must be stored.

Each certificate extension must also contain an additional boolean field named critical, which indicates whether the extension must be processed during certificate validation.

¹ Object identifiers (OIDs) are a standardized [47] tree-based scheme for naming objects such as algorithms and data structures [61], extensively used in the public key infrastructure.

2.1.2 Certificate path validation

Along with the definition of certificate structure, RFC 5280 also specifies basic steps to take in the process of validating a certificate path. Three fundamental procedures need to be implemented.

Building a certificate path. For the public key infrastructure to work, one must always regard at least one entity’s public key as trusted. If this condition were not met, verifying signatures on certificates would not be possible, since there would be no public key to verify with. In the usual case, the self-signed certificates of “trust anchors”, also called root certificate authorities, are stored locally. Root authorities sign certificates for level 1 intermediate authorities, who in turn sign certificates for next level authorities, eventually closing the path by signing a common “end entity” certificate. A certificate that is to be validated is normally supplied in the form of a certificate chain, containing all CA certificates except the root one. Building the path then consists of matching the subject and issuer fields in the supplied chain. Lastly, a trusted root certificate must be found so that its subject matches the issuer of the last certificate in the chain.

Signature verification. After building the path, all signatures must be verified to ensure that the certificates are not forged. That is, each signature on a certificate is verified with the public key of the certificate issuer. This is a critical step, since attackers are able to craft a certificate with any given subject and issuer fields.

Hostname check. One important input into the validation procedure is the host name. This is essentially the identity of the subject that we want to communicate with. It is necessary to check the equality of this name and the subject field (or Subject Alternative Name) in the end certificate. If this check were not performed, an attacker could present any valid certificate chain built up to some trusted root.
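The path building and hostname check described above can be sketched in a few lines of Python. This is a minimal illustration under strong assumptions, not the validation logic of any library: the `Cert` class, `build_path`, and `hostname_matches` are hypothetical names, certificates are reduced to plain name strings, signatures are not verified at all, and the hostname comparison is an exact match without wildcard handling.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Cert:
    """Hypothetical, heavily simplified certificate: names only, no signature."""
    subject: str
    issuer: str
    alt_names: Tuple[str, ...] = ()


def build_path(chain: List[Cert], trust_store: Dict[str, Cert]) -> List[Cert]:
    """Order the supplied chain from the end entity up to a trusted root.

    chain[0] is assumed to be the end-entity certificate; the remaining
    certificates may be supplied in any order.
    """
    by_subject = {cert.subject: cert for cert in chain[1:]}
    path = [chain[0]]
    # Follow issuer names until one of them is a locally stored trust anchor.
    while path[-1].issuer not in trust_store:
        issuer_cert = by_subject.pop(path[-1].issuer, None)
        if issuer_cert is None:
            raise ValueError(f"no issuer found for {path[-1].subject!r}")
        path.append(issuer_cert)
    path.append(trust_store[path[-1].issuer])
    return path


def hostname_matches(end_entity: Cert, hostname: str) -> bool:
    """Exact, case-insensitive comparison against SAN entries or the subject."""
    names = end_entity.alt_names or (end_entity.subject,)
    return hostname.lower() in (name.lower() for name in names)


# Example data (all names are made up for illustration).
root = Cert("Root CA", "Root CA")
inter = Cert("Intermediate CA", "Root CA")
leaf = Cert("example.org", "Intermediate CA",
            alt_names=("example.org", "www.example.org"))
path = build_path([leaf, inter], {"Root CA": root})
```

In this sketch, a self-signed end-entity certificate or a chain ending in an unknown root both surface as the same `ValueError`, which mirrors how coarse a purely name-based check is without the signature verification step.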

Still, these are just a fraction of all checks that need to be performed. Some properties are required to hold for almost every certificate field. As an example, a certificate expires once its validity period has passed. It should also be rejected when it is of an invalid version, or when the two signature algorithm fields do not match. Moreover, extensions have their own sets of rules to follow during the validation.
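To make the field-level checks just mentioned more concrete, the following Python sketch tests the version, the two signature algorithm fields, and the validity period. It is an illustration under assumptions: `CertFields` and `field_errors` are hypothetical names, the error strings are invented, and a real validator performs many more checks.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass
class CertFields:
    """Hypothetical subset of certificate fields relevant to these checks."""
    version: int              # 0, 1 or 2 (meaning v1, v2 or v3)
    tbs_signature_alg: str    # OID in the signature field of tbsCertificate
    outer_signature_alg: str  # OID in the outer signatureAlgorithm field
    not_before: datetime
    not_after: datetime


def field_errors(cert: CertFields, now: datetime) -> List[str]:
    """Return a list of illustrative error labels (empty if all checks pass)."""
    errors = []
    if cert.version not in (0, 1, 2):
        errors.append("invalid version")
    if cert.tbs_signature_alg != cert.outer_signature_alg:
        errors.append("signature algorithm mismatch")
    if now < cert.not_before:
        errors.append("not yet valid")
    if now > cert.not_after:
        errors.append("expired")
    return errors


SHA256_RSA = "1.2.840.113549.1.1.11"  # sha256WithRSAEncryption
SHA1_RSA = "1.2.840.113549.1.1.5"     # sha1WithRSAEncryption

ok_cert = CertFields(2, SHA256_RSA, SHA256_RSA,
                     datetime(2021, 1, 1, tzinfo=timezone.utc),
                     datetime(2022, 1, 1, tzinfo=timezone.utc))
bad_cert = CertFields(3, SHA256_RSA, SHA1_RSA,
                      datetime(2019, 1, 1, tzinfo=timezone.utc),
                      datetime(2020, 1, 1, tzinfo=timezone.utc))
check_time = datetime(2021, 6, 1, tzinfo=timezone.utc)
```

Collecting all errors instead of stopping at the first one is a design choice of this sketch; libraries differ in whether they report one error or several.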

2.1.3 Certificate revocation mechanisms

One further issue emerges when a certificate has to be deemed invalid before its validity period ends. This could happen for a variety of reasons, most importantly the compromise of the private key associated with the certificate. There is no way to “change” a certificate after it was issued, hence a completely new mechanism has to be established to deal with this problem. As of now, three such mechanisms are in use concurrently.

Certificate revocation list (CRL). The original solution, provided within RFC 5280, is the use of certificate revocation lists. These are documents published regularly by certificate authorities. The simplest CRL contains only the serial numbers of revoked (prematurely invalidated) certificates issued by a CA and is signed by the same CA. During certificate validation, a revocation list must be retrieved from a location specified in a CRL Distribution Points extension. A subsequent check is performed to validate it and to determine whether it contains the serial number of the certificate in question.

Online Certificate Status Protocol (OCSP). Not too long after they were introduced, CRLs were deemed unscalable due to their size and the overhead they introduce [55, 59]. The Online Certificate Status Protocol [75] is supposed to eliminate these scalability issues. The protocol enables the validator to request revocation information about a certificate by contacting the CA directly. This approach solves the problem of CRLs being too large, since the response only carries information about one certificate.

TLS Certificate Status Request (OCSP stapling). An alternative way of distributing certificate status information is defined in RFC 6066 [31]. The OCSP response is sent together with the certificate during a TLS handshake. This transfers the responsibility for contacting the CA to the peer, thus reducing overhead for the validating party. When combined with the OCSP “must-staple” extension [41] and the ability to send status responses for each certificate in the supplied chain [67], this mechanism represents the current state of the art.

In practice, the quality of revocation support differs greatly among TLS libraries. We discuss this issue further in Section 3.2.4.
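As a rough illustration of the CRL lookup described in this section, the sketch below models a revocation list as a set of serial numbers. Everything here is an assumption made for illustration: `SimpleCRL` and `revocation_status` are hypothetical names, the CRL signature is not verified, and the three returned labels are chosen for readability rather than taken from any library.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import FrozenSet


@dataclass(frozen=True)
class SimpleCRL:
    """Hypothetical, unsigned stand-in for a certificate revocation list."""
    issuer: str
    next_update: datetime
    revoked_serials: FrozenSet[int]


def revocation_status(crl: SimpleCRL, cert_issuer: str,
                      cert_serial: int, now: datetime) -> str:
    """Return 'good', 'revoked' or 'unknown' for one certificate."""
    if crl.issuer != cert_issuer:
        return "unknown"  # the list was published by a different CA
    if now > crl.next_update:
        return "unknown"  # the list is stale and cannot be relied upon
    return "revoked" if cert_serial in crl.revoked_serials else "good"


crl = SimpleCRL(issuer="Intermediate CA",
                next_update=datetime(2021, 7, 1, tzinfo=timezone.utc),
                revoked_serials=frozenset({17, 42}))
now = datetime(2021, 6, 1, tzinfo=timezone.utc)
```

The sketch also hints at why CRLs scale poorly: the whole set of revoked serial numbers has to be transferred and held by the validator even when only one certificate is of interest.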

2.2 Creating a certificate dataset

There exists a great variety of cryptographic libraries which provide means of generating certificates. Their tools come either in the form of APIs, such as OpenSSL’s libcrypto [53] and Go’s x509 package [88], or in the form of CLIs, such as GnuTLS’ certtool [22].

2.2.1 Prototype solution and its limits

In our first prototype, we approach the task of generating erroneous certificates using a standard tool, the certtool command-line utility provided by GnuTLS. Using this simple tool, we can create a self-signed root certificate and a single end entity certificate as follows.

    # Generate a new private key for the root CA
    certtool --generate-privkey --outfile "ca_key.pem"
    # Generate a self-signed root CA certificate
    certtool --generate-self-signed --load-privkey "ca_key.pem" \
        --template "ca.cfg" --outfile "ca.pem"
    # Generate a new private key for the end entity
    certtool --generate-privkey --outfile "end_key.pem"
    # Sign a new end entity certificate using the CA key
    certtool --generate-certificate --load-privkey "end_key.pem" \
        --load-ca-privkey "ca_key.pem" --load-ca-certificate "ca.pem" \
        --template "end.cfg" --outfile "end.pem"

Inside the .cfg files, we can define the values of certificate fields. A very basic certificate authority configuration file ca.cfg may look as below.

    # The common name of the certificate subject
    cn = "Root CA"
    # In how many days this certificate will expire
    expiration_days = 365
    # This is a CA certificate
    ca

As mentioned in Section 2.1.1, the Basic Constraints extension determines whether a certificate belongs to a certificate authority or not. To do so, it contains a boolean field named CA. The X.509 certificate profile mandates that every CA certificate must have the value in this field set to true. Thus, if we were to delete the corresponding line in ca.cfg (the last one), certificate validation of end.pem would fail.

Limits. As demonstrated, some certificate chain deformities can be reproduced using certtool without much effort. However, there are three specific certificate properties that neither certtool nor other available tools can easily replicate.

1. Arbitrary extensions. One issue arises when we want to include arbitrary extensions in the certificate. Within certtool, only a limited set of standard extensions can be defined. Other libraries provide the possibility to add any extension, but the data of the extension itself can only be specified as a bytestring, which is highly impractical.

2. Nonsense values. In order to trigger certain types of validation errors, we may want to introduce nonsensical values into a certificate. As an example, we can try to put a non-existent OID into the signatureAlgorithm field, or try to set the certificate serialNumber to -1. In most tools, making such changes is problematic (and rightfully so). The tools either perform a sanity check on the values given, which usually results in an error, or they do not allow modification of some fields at all.

3. Syntactical deformities. No well-known tools provide the option to generate syntactically incorrect certificates. Such behaviour is undeniably an intended feature which improves usability. However, being able to do so would lead us to a new category of validation errors – parsing errors. To induce this error type, we may try filling the issuer field with a time value. Alternatively, we may evaluate the consequences of not including the field at all.

2.2.2 Abstract Syntax Notation 1

Assessing the possibilities of existing certificate-generating tools brings us to the conclusion that a more low-level apparatus may suit our needs better. It is therefore necessary to understand the inner formal structure of an X.509 certificate.

As already mentioned, the X.509 certificate profile builds on top of a single description language, the Abstract Syntax Notation 1 (ASN.1). The language, widely employed within network protocols, is used to formally describe data with no respect to their actual physical representation [48]. All certificate related data, be it certificates themselves, CRLs or OCSP requests, are defined using ASN.1. One of the simplest certificate subfields, subjectPublicKeyInfo, is specified using ASN.1 as follows.

    SubjectPublicKeyInfo ::= SEQUENCE {
        algorithm        AlgorithmIdentifier,
        subjectPublicKey BIT STRING
    }

The SEQUENCE keyword states that the whole field will consist of multiple ordered substructures. In this case, there are only two. The subjectPublicKey is of type BIT STRING, which is a primitive ASN.1 type. This is not true for the type of the algorithm field, which must be further defined.

    AlgorithmIdentifier ::= SEQUENCE {
        algorithm  OBJECT IDENTIFIER,
        parameters ANY DEFINED BY algorithm OPTIONAL
    }

In the second definition, we finally end up with two primitive ASN.1 types². Hence, the definition of subjectPublicKeyInfo is complete. By combining and nesting simple structures such as this one, we are able to assemble very complex ones. The resulting “prototypes” can then be filled with actual data. ASN.1 is an abstract language, but there are standardized ways to represent it and encode it.

² The ANY DEFINED BY type is, in fact, obsolete and was replaced by more elaborate structures [45]. However, it still persists in RFC 5280.

ASN.1 encodings

The ITU-T X.690 recommendation [46] defines three binary encodings for ASN.1 data. The one most prominently used to encode certificates is DER (Distinguished Encoding Rules). DER encodes each ASN.1 structure uniquely, which is crucial for documents that are to be digitally signed.

Historically, another encoding has become more commonly used. The so-called PEM, deriving its name from Privacy Enhanced Mail [54], was never standardized for use in certificates, but is nevertheless the default format in many cryptographic libraries, e.g. OpenSSL. It consists of a base64-encoded DER, prefixed and suffixed with labels such as “-----BEGIN CERTIFICATE-----” and “-----END CERTIFICATE-----”.

There are some further features of ASN.1 that need to be taken into consideration when encoding it. Those will be explained in the following sections, when relevant.
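To illustrate how DER and PEM relate, the following standard-library Python sketch hand-encodes an AlgorithmIdentifier for rsaEncryption as tag-length-value triples and then wraps the bytes in PEM armor. It is a teaching aid under assumptions rather than a complete encoder: the helper names (`der_length`, `der_tlv`, `der_oid`, `pem_wrap`) are hypothetical, only single-byte tags are handled, and the “EXAMPLE DATA” label is made up because the encoded structure is not a certificate.

```python
import base64


def der_length(n: int) -> bytes:
    """Encode a DER length: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def der_tlv(tag: int, content: bytes) -> bytes:
    """Concatenate tag, length and content (single-byte tags only)."""
    return bytes([tag]) + der_length(len(content)) + content


def der_oid(dotted: str) -> bytes:
    """Encode an OBJECT IDENTIFIER given in dotted notation."""
    first, second, *rest = (int(arc) for arc in dotted.split("."))
    body = bytearray()
    # The first two arcs are merged into one value; every value is then
    # written in base 128 with the high bit set on all but the last byte.
    for arc in [40 * first + second, *rest]:
        chunk = [arc & 0x7F]
        arc >>= 7
        while arc:
            chunk.append(0x80 | (arc & 0x7F))
            arc >>= 7
        body.extend(reversed(chunk))
    return der_tlv(0x06, bytes(body))


def pem_wrap(der: bytes, label: str) -> str:
    """Base64-encode DER data and add BEGIN/END lines (64 chars per line)."""
    b64 = base64.b64encode(der).decode("ascii")
    lines = [b64[i:i + 64] for i in range(0, len(b64), 64)]
    return "\n".join([f"-----BEGIN {label}-----", *lines,
                      f"-----END {label}-----"]) + "\n"


# AlgorithmIdentifier ::= SEQUENCE { rsaEncryption OID, NULL parameters }
alg_id = der_tlv(0x30, der_oid("1.2.840.113549.1.1.1") + der_tlv(0x05, b""))
pem = pem_wrap(alg_id, "EXAMPLE DATA")
```

Because every value has exactly one valid DER encoding, two parties that encode the same tbsCertificate obtain identical bytes, which is what makes signing the encoded form meaningful.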

2.2.3 Constructing arbitrary ASN.1 structures

To overcome the limits listed in Section 2.2.1, we need to be able to change the ASN.1 certificate structure in an arbitrary way. Internally, all aforementioned cryptographic libraries and tools work with ASN.1. While it would be theoretically possible to construct any malformed certificate by abusing their inner workings, other tools may be more convenient to use.

Criteria for an ASN.1 encoder. We require that our chosen tool satisfies the following three criteria.

1. Functional. The tool must support encoding of every primitive ASN.1 type present in the X.509 profile. It should also be able to load an ASN.1 specification and encode its structures, which can be arbitrary.

2. Open-source. Since we make all of our source code public, we restrict ourselves only to open-source and free tools.

3. Lightweight. Ideally, the tool should be self-standing and independent of any large libraries. Again, this is due to the fact that we aim to make our implementation publicly available.

The number of libraries providing ASN.1 functionality is relatively high. Surprisingly, only a fraction of them meets the desired requirements. OSS Nokalva [10] offers powerful ASN.1 toolkits for C, C++, C# and Java, but they are both proprietary and paid. The same holds for tools provided by ASN Lab [9]. On the other hand, Go’s asn1 package [11], python-asn1 [7], and other open-source packages do not offer the ability to import custom ASN.1 specification files. Only two known candidates meet our criteria: the asn1c C library [86] and the asn1tools Python package [58]. We chose the latter, since it is simpler to use and more suitable for scripting.

Encoding a custom ASN.1 structure. Recall the formal ASN.1 definition of SubjectPublicKeyInfo from the previous section. If we assume that a file named x509.asn contains this exact definition, we can encode a public key using the following simple steps.

    import asn1tools

    # Load the X.509 ASN.1 specification file and select the DER codec
    asn = asn1tools.compile_files('x509.asn', 'der')

    # Set the 'algorithm' to be 'rsaEncryption'
    alg_oid = {'algorithm': '1.2.840.113549.1.1.1'}

    # Set 30 arbitrary bits into 'subjectPublicKey' as (data, number of bits)
    public_key = (b'\x30\xcf\x2a\xf0', 30)

    # Fill in both subfields of 'SubjectPublicKeyInfo'
    public_key_info = {'algorithm': alg_oid, 'subjectPublicKey': public_key}

    # DER encode the whole structure
    der = asn.encode('SubjectPublicKeyInfo', public_key_info)

We see that the steps are quite self-descriptive, since they closely follow the ASN.1 structure of SubjectPublicKeyInfo.

2.2.4 X.509 Python module

Using asn1tools and the described techniques, we have developed a Python module which can encode all certificate and CRL fields, all standard and some non-standard extensions. Adding further extensions is easily possible by defining them within the ASN.1 specification file and implementing a single function.

With such a module at hand, generating a valid certificate is straightforward. The process consists of the following:

1. Generating key pairs. First, two pairs of asymmetric keys are generated, one for the issuer, the other one for the subject (to be certified). For the purpose of generating keys, we use the PyCryptodome [33] package.

2. Filling in the “to-be-signed” certificate. We fill in a tbsCertificate according to the rules of the X.509 profile. This is the step which makes use of our newly established Python module. The tbsCertificate is DER-encoded using asn1tools into binary data, but we also keep the non-encoded version.

3. Signing the certificate. We sign the encoded version of the “to-be-signed” certificate with the issuer private key, making use of PyCryptodome once again.

4. Encoding the certificate. The non-encoded tbsCertificate is merged with the signature to create the outermost certificate structure. The whole structure is DER-encoded into a binary certificate, which may further be converted into the PEM format and exported into a file.

In a very similar fashion, we can generate certificate revocation lists. Their outermost structure is identical to that of certificates, although the “to-be-signed” field is completely different.

If we need to create a certificate chain consisting of two or more certificates, we encode each certificate separately and then simply concatenate their PEM encodings into a single file.
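The chain concatenation step can be sketched with the Python standard library alone. The helpers below (`to_pem`, `join_chain`, `split_chain`) are hypothetical names, the byte strings are placeholders rather than real certificates, and the ordering with the end entity first is an assumption of this example, not something the text above prescribes.

```python
import base64
import re
from typing import List

PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----\s+(.*?)\s+-----END CERTIFICATE-----",
    re.DOTALL,
)


def to_pem(der: bytes) -> str:
    """Wrap DER bytes into a single PEM certificate block."""
    b64 = base64.b64encode(der).decode("ascii")
    body = "\n".join(b64[i:i + 64] for i in range(0, len(b64), 64))
    return f"-----BEGIN CERTIFICATE-----\n{body}\n-----END CERTIFICATE-----\n"


def join_chain(ders: List[bytes]) -> str:
    """Concatenate the PEM encodings of all certificates into one text."""
    return "".join(to_pem(der) for der in ders)


def split_chain(pem_text: str) -> List[bytes]:
    """Recover the individual DER blobs from a concatenated PEM text."""
    return [base64.b64decode(m.group(1)) for m in PEM_BLOCK.finditer(pem_text)]


# Placeholder byte strings standing in for DER-encoded certificates.
end_entity_der = b"\x30\x03\x02\x01\x01"
intermediate_der = b"\x30\x03\x02\x01\x02" * 30
chain_pem = join_chain([end_entity_der, intermediate_der])
```

Writing `chain_pem` to a file would then yield a single chain file of the kind described above.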

2.2.5 Generating invalid certificate chains

Once certificate generation is established, introducing various flaws and deformities is quite simple. Depending on the stage in which a flaw is inserted, we split them into multiple categories.

Wrong ASN.1 specification. By using an ASN.1 specification file with invalid definitions, we can induce syntactical flaws. Certificates containing such flaws should not pass through parsing procedures. A wrong ASN.1 specification may contain a definition like this one.

    SubjectPublicKeyInfo ::= SEQUENCE {
        algorithm        AlgorithmIdentifier,
        subjectPublicKey INTEGER
    }

Certificate parsers expect a BIT STRING in the subjectPublicKey field. Hence, replacing it with an INTEGER will lead to parsing errors.

Even more elaborate flaws can be inserted when we introduce custom ASN.1 tags into the procedure. Every primitive ASN.1 type has its own unique number assigned, called a tag. A BIT STRING has tag 3, whereas an INTEGER has tag 2. Tags always precede the data when it is encoded. This makes parsing much easier, since the parser knows what data will follow afterwards.

When we replace the BIT STRING with an INTEGER, the tag changes too. Thus, a parser which expects a BIT STRING will fail without even seeing the data themselves, purely because of an invalid tag number. However, the ASN.1 specification file allows for manually changing the tag for each structure. We can then produce more involved flaws as follows.

    SubjectPublicKeyInfo ::= SEQUENCE {
        algorithm        AlgorithmIdentifier,
        subjectPublicKey [UNIVERSAL 3] INTEGER
    }

Changing the tag number³ will trick certificate parsers into processing a BIT STRING afterwards. Since they expect a BIT STRING, parsing will not fail at first. However, an INTEGER will follow, causing an error.

³ The UNIVERSAL keyword specifies that a tag number belongs to one of the primitive ASN.1 types. There are other types of ASN.1 tags, not relevant to this example.

Nonsense or prohibited value. A large number of errors can be introduced by not following the rules of RFC 5280 or other relevant documents. This usually means setting some field or extension to a well-typed value which does not make sense, or is explicitly prohibited. Recall the public key algorithm identifier field from Section 2.2.2. We can fill any object identifier into it, but OIDs are used to describe many things. For instance, value “2.16.203” refers to the Czech Republic, and will thus certainly not qualify as a public key algorithm.

On the other hand, if we aim to craft a certificate which violates the written rules of X.509, we may set the CA bit in Basic Constraints to false, while at the same time specifying the optional pathLenConstraint. The pathLenConstraint field places a restriction on the number of certificates which can appear in a certificate chain after a given certificate. The proposed combination of properties is explicitly prohibited, since pathLenConstraint is only meaningful in CA certificates. This is clear, since no certificates can follow a non-CA certificate in a certificate path.

Post-signature change. Making any change within tbsCertificate after it has been signed has a predictable result during validation: the certificate signature fails to verify.

Invalid chaining or revocation status. Most commonly, certificate validation errors occur when the validated chain is not constructed properly up to a trusted root [3, 68]. Examples of such erroneous chains include:

• A self-signed end-entity certificate.

• An unknown root CA.

• Mismatch between the issuer field of the end-entity certificate and the subject field of its issuing CA certificate.

• Exceeded length of the chain, according to the pathLenConstraint in Basic Constraints.

Furthermore, certificates will prompt errors when they are present on a certificate revocation list. The same holds for being declared revoked by an OCSP responder.
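The structural chain flaws listed above can be sketched as simple checks over certificate names. This is a hedged illustration only: the Cert class and check_chain function are hypothetical, signatures and revocation are not verified, and pathLenConstraint is interpreted as in RFC 5280 (it counts intermediate CA certificates that follow, not the end entity).

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Cert:
    subject: str
    issuer: str
    is_ca: bool = False
    path_len: Optional[int] = None  # pathLenConstraint, if present


def check_chain(chain: List[Cert], trusted_roots: Set[str]) -> List[str]:
    """Return structural problems of a chain ordered end entity first."""
    problems = []
    leaf = chain[0]
    if leaf.subject == leaf.issuer:
        problems.append("self-signed end-entity certificate")
    for child, parent in zip(chain, chain[1:]):
        if child.issuer != parent.subject:
            problems.append(f"issuer mismatch at {child.subject}")
    if chain[-1].issuer not in trusted_roots:
        problems.append("unknown root CA")
    for index, cert in enumerate(chain):
        if not cert.is_ca and cert.path_len is not None:
            problems.append(f"pathLenConstraint in non-CA certificate {cert.subject}")
        # Intermediate CA certificates between this CA and the end entity.
        intermediates_below = index - 1
        if cert.is_ca and cert.path_len is not None and intermediates_below > cert.path_len:
            problems.append(f"path length exceeded at {cert.subject}")
    return problems


chain = [
    Cert("example.org", "CA 2"),
    Cert("CA 2", "CA 1", is_ca=True),
    Cert("CA 1", "Root", is_ca=True, path_len=0),
]
problems = check_chain(chain, trusted_roots={"Root"})
```

Here "CA 1" allows no further intermediate CA below it, yet "CA 2" follows, so the sketch reports an exceeded path length.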

Discrepancy with validation options. Certificate validation is not an optionless procedure. It can take various inputs, such as hostnames, explained in Section 2.1.2. Examples of other inputs include constraints on public key algorithms, trusted root certificates and required extension values, such as certificate policies. Time is also an input, though rarely explicitly stated. However valid certificate chains may be by themselves, validating them with incompatible settings will also lead to errors. Hence, we are able to induce a new set of errors by storing a set of validation options beside each certificate chain.
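To make the point concrete, the following sketch shows a certificate that is fine by itself but fails under incompatible options. The LeafInfo and ValidationOptions classes are hypothetical stand-ins, not the format of the stored option files, and hostname matching is reduced to exact comparison without wildcards.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class LeafInfo:
    """Hypothetical summary of an end-entity certificate."""
    hostnames: List[str]
    not_before: datetime
    not_after: datetime


@dataclass
class ValidationOptions:
    """Hypothetical subset of validation inputs stored beside a chain."""
    hostname: Optional[str] = None
    at_time: Optional[datetime] = None


def check_options(leaf: LeafInfo, options: ValidationOptions) -> List[str]:
    """Return errors caused by the options rather than by the certificate."""
    errors = []
    now = options.at_time or datetime.now(timezone.utc)
    if not leaf.not_before <= now <= leaf.not_after:
        errors.append("certificate not valid at the requested time")
    if options.hostname is not None and options.hostname not in leaf.hostnames:
        errors.append("hostname mismatch")
    return errors


leaf = LeafInfo(
    hostnames=["x509errors.org"],
    not_before=datetime(2021, 1, 1, tzinfo=timezone.utc),
    not_after=datetime(2021, 12, 31, tzinfo=timezone.utc),
)
inside = datetime(2021, 6, 1, tzinfo=timezone.utc)
compatible = check_options(leaf, ValidationOptions("x509errors.org", inside))
incompatible = check_options(leaf, ValidationOptions("example.org", inside))
```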

2.2.6 Final implementation

Our final implementation of malformed certificate generation consists of the following components.

Helper Python modules. We implemented three Python modules to ease the certificate generation itself:

• The x509 module is intended to encode X.509 certificate fields and extensions. This is the module presented in Section 2.2.4. For every certificate field or extension, a single function is present within. It also contains high-level functions for generating default valid certificates.

• The crypto module provides access to cryptographic primitives, namely key pair generation and signatures. It is a simple wrapper around the utilized features of PyCryptodome.

• The io module performs basic input and output operations, such as importing and exporting keys and certificates, together with ASN.1 format conversion.

ASN.1 directory. All ASN.1 specification files, valid and invalid alike, are contained within one directory. These files are then loaded into certificate-generating scripts to encode all data.

Trust anchor script. We use our own root certificate authority as a default trust anchor for all certificate chains. A single Python script generates its certificate, making use of the helper libraries.

Scripts for individual chains. To generate the malformed chains, one Python script is present for each, together with a file containing validation options. As of now, the repository contains 61 distinct scripts. Figure 1 shows the number of chains in each category, as divided in the previous section. New chains can be added easily, by simply copying a script for a valid chain and making the desired changes.

The chains are generated dynamically – two runs of the same script produce two different certificates, but with the same flaw. This is because keys are generated randomly and, more importantly, the validity field depends on the current time (if it did not, all chains would eventually become expired).

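[Bar chart. Vertical axis: number of chains (0 to 40). Categories on the horizontal axis: Syntax flaw, Wrong value, Invalid signature, Bad chaining, Bad validation options, Valid. Bar labels in the extracted text: 33, 13, 7, 3, 4, 1.]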

Figure 1: Number of crafted certificate chains in different categories.
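The time-dependent validity field mentioned above can be sketched as follows. This is an illustrative assumption about how such a window might be computed, not the thesis' actual code; the function names and the offset_days parameter are hypothetical. Times are formatted as ASN.1 UTCTime (YYMMDDHHMMSSZ).

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


def utc_time(moment: datetime) -> str:
    """Format a datetime as an ASN.1 UTCTime string."""
    return moment.astimezone(timezone.utc).strftime("%y%m%d%H%M%SZ")


def validity_window(
    days: int = 30, offset_days: int = 0, now: Optional[datetime] = None
) -> Tuple[str, str]:
    """Return (notBefore, notAfter) relative to the current time.

    A negative offset_days shifts the window into the past, which is one
    way to obtain a certificate that is already expired when generated.
    """
    now = now or datetime.now(timezone.utc)
    not_before = now + timedelta(days=offset_days)
    not_after = not_before + timedelta(days=days)
    return utc_time(not_before), utc_time(not_after)


fixed_now = datetime(2021, 5, 1, 12, 0, 0, tzinfo=timezone.utc)
valid_window = validity_window(now=fixed_now)
expired_window = validity_window(offset_days=-60, now=fixed_now)
```

Because the window is derived from the moment of generation, a freshly built valid chain never expires before it is used, while the expired variant stays expired on every build.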

3 Validating certificates within TLS

Having built a certificate dataset to validate, our next step is to create a means of validating that dataset. For this purpose, we use tools and interfaces provided by TLS-enabled libraries. We impose three requirements on our certificate validation: it must perform all necessary checks; it must yield the same errors as certificate validation in the wild; and it must be implemented in multiple commonly used TLS libraries, so we can compare them. At the beginning of this chapter, we briefly describe the TLS protocol (Section 3.1). Subsequently, we present the tools that we use to validate certificates (Section 3.2).

3.1 TLS protocol

X.509 certificates are in use in multiple security infrastructures, including S/MIME [76], eIDAS [70] and code signing [27], but they play their most prominent role within TLS [71]. TLS (Transport Layer Security) is currently the most commonly used security protocol [62]. Its name conveys its purpose well – the protocol secures application data by encrypting them, and as such passes them to the transport network layer below.

3.1.1 TLS handshake

TLS is intended for use in client–server communication. Before these two parties can exchange encrypted application data, they must perform a TLS handshake. In the procedure, they authenticate each other (optionally, in the case of the client), and agree on encryption keys. The exact order and structure of exchanged handshake messages depends on the protocol version and setting. In very specific cases, the authentication process can be skipped and certificates do not get used at all¹. Nonetheless, the usual setting requires the following three steps to be performed.

1. This setting, known as anonymous key exchange, does not provide any guarantees about the identity of the peer. Hence, it is intended for use only when active attackers are not a concern, or when another secure channel is already established.

1. Server certificate validation. The server must include its public key certificate in its initial handshake messages. Subsequently, the client verifies the validity of the certificate.

2. Server authentication. To prove possession of the private key associated with the public key certificate, the server must use that key to decrypt or sign some data, depending on the setting.

3. Key exchange. Both parties share their respective supported ciphersuites and then establish symmetric session keys, usually through some variation of the Diffie–Hellman protocol.

In some cases, the server may require the client to authenticate itself as well. In that case, steps 1 and 2 are performed twice, once for each communicating party.

3.1.2 Security properties of TLS

If TLS is properly implemented, post-handshake communication comes with three security properties:

1. Peer authentication. One or both parties are authenticated. That is, the other party is sure that it performed key exchange with the entity that it desired to communicate with.

2. Confidentiality. All application data are encrypted using the newly established symmetric session keys. Hence, their contents are private.

3. Data integrity. The integrity of transmitted data is protected using a keyed MAC function. If any data were altered along the way, the receiving party would notice a disturbance and request their re-transmission.

Note that data authenticity is not ensured. When the established session keys are compromised, an attacker can impersonate both parties.
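The keyed MAC check behind the data integrity property can be illustrated with Python's standard hmac module. This is a generic sketch of a keyed MAC, assuming HMAC-SHA-256 and a placeholder key; it does not reproduce the actual TLS record protection, which depends on the negotiated ciphersuite.

```python
import hashlib
import hmac


def mac_tag(key: bytes, message: bytes) -> bytes:
    """Compute a keyed MAC over the message (HMAC-SHA-256 in this sketch)."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_tag(key: bytes, message: bytes, received_tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(mac_tag(key, message), received_tag)


session_key = b"\x01" * 32  # placeholder for an established session key
tag = mac_tag(session_key, b"application data")
unaltered = verify_tag(session_key, b"application data", tag)
altered = verify_tag(session_key, b"application dat4", tag)
```

Anyone holding the session key can produce a valid tag, which is why compromised session keys allow an attacker to impersonate both parties.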

The role of certificates. As is obvious from the listed properties, the presence of public key certificates in the protocol is critical. In their absence, the protocol would still provide confidentiality and data integrity, yet if we are unsure about whom we share the confidential data with, both properties would come to naught. Similar arguments apply when certificate validation is incorrectly implemented, or when major certificate validation errors are dismissed as harmless by users or developers. In that case, an active attacker can forge a certificate, pass the certificate validation, and pretend to be someone else.

3.1.3 TLS deployment

In practice, TLS can be deployed with a diverse range of protocols in the adjacent network layers – the transport layer below, and the application layer above.

Transport layer. TLS is designed to run on top of a reliable transport protocol, with TCP (Transmission Control Protocol) being the most widely used one. However, TLS was also adapted (into so-called DTLS [72]) to run on top of unreliable protocols, such as UDP.

Application layer. When SSL (the predecessor of TLS) was released, many historically insecure application protocols were modified to support it underneath. These modifications, namely of HTTP, SMTP or FTP, are now standards in secure internet communication. Nevertheless, any application can employ TLS to secure its communication, too. Unlike popular browsers or mail clients, such an application may not be developed by a security professional, which often results in flawed implementations. Thus, further sections of this chapter focus on this specific use of TLS.

3.2 Implementing certificate validation

The task of implementing certificate validation can be approached in multiple different ways. However, not all of them are fully suitable for our needs. We present the simplistic approach, which does not employ TLS at all (Section 3.2.1). From Section 3.2.2 onward, we discuss our final implementation, which does make use of TLS.

3.2.1 Prototype solution

Some of the widely used TLS libraries are shipped with pre-installed command-line utilities that perform certificate validation. For the sake of simplicity, we used these utilities to validate our initial datasets. Specifically, we used four libraries that possess such utilities: OpenSSL, GnuTLS, Botan, and Mbed TLS.

The usage of a command-line certificate validation tool is usually quite straightforward. As an input, it simply takes a certificate chain to validate. Unless trusted root certificates are explicitly provided, most tools will trust the default ones stored locally². Depending on the specific library, other options can be set, namely hostnames and requirements on revocation checking. With openssl verify, we would validate a certificate chain as below. We assume that the file root_ca.pem contains a certificate of the only trusted root certificate authority.

openssl verify -CAfile "root_ca.pem" -verify_hostname "x509errors.org" "chain_to_validate.pem"

A valid chain will lead to the following output:

chain_to_validate.pem: OK

If, on the other hand, an error occurs during the validation, a message is simply printed out onto the standard output.

error 18 at 0 depth lookup: self signed certificate
error chain_to_validate.pem: verification failed

In this case, the chain in question contains a single untrusted self-signed certificate. Thus, it was rightly rejected. If we validated this chain with multiple tools, we could then easily compare the occurring error messages, and deem them correspondent to each other.
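To compare such messages across tools, the relevant parts of the output first have to be extracted. The following sketch parses the error line shown above; the regular expression and the function name are assumptions made for illustration and only cover this one output format.

```python
import re
from typing import Optional, Tuple

# Matches lines such as "error 18 at 0 depth lookup: self signed certificate".
ERROR_LINE = re.compile(r"^error (\d+) at (\d+) depth lookup: ?(.*)$", re.MULTILINE)


def parse_verify_error(output: str) -> Optional[Tuple[int, int, str]]:
    """Return (error number, depth, message) or None if no error line is found."""
    match = ERROR_LINE.search(output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), match.group(3).strip()


sample_output = (
    "error 18 at 0 depth lookup: self signed certificate\n"
    "error chain_to_validate.pem: verification failed\n"
)
parsed = parse_verify_error(sample_output)
```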

2. On Linux, trusted root certificates are usually present in a single directory, such as /etc/ssl/certs/ or /usr/local/share/ca-certificates/.

Limitations. While this approach to certificate validation is simple, it brings about a multitude of problems:

1. Availability of the tools. Merely a fraction of all available TLS libraries provide such command-line tools for certificate validation. As of now, we are aware of only five libraries that do so. Since we want to be able to work with as many libraries as possible, this is a major obstacle.

2. Incompleteness of the tools. Command-line tools that do exist are mostly primitive, and their interfaces do not cover all capabilities of their respective underlying libraries. For instance, the Botan library is fully capable of hostname checking, but does not provide that option in its CLI. The only exception to this point is openssl verify, which contains a vast array of options.

3. Artificiality of the process. Lastly, and perhaps most importantly, validating locally stored certificates is a heavily artificial scenario. Command-line validation tools are used solely for purposes such as testing, while certificate validation itself is usually part of a higher-level protocol, most commonly TLS.

As a result of these limitations, our final solution uses an entirely different set of tools to validate certificates. One of our goals is to use tools that are most representative of real-world certificate usage, which would bring us closer to developers’ perspective.

3.2.2 Client-side TLS connection

When securing an application with TLS, one of the simplest things one may want to implement is a client-side TLS connection with only the server being authenticated. Since this type of TLS usage is prevalent in the wild [35], we chose it as a basis for our certificate validation scenario. The following three sections present our implementation of a client-side TLS connection in five commonly used TLS libraries.

3.2.3 Library choice

To reach the largest number of developers, our intention was to pick the most widely used TLS libraries. Yet, we are currently not aware of any research regarding high-level TLS library popularity. Nemec et al. [60] have compared the popularity of low-level cryptographic libraries by examining the distribution of public keys generated by each library. Subsequently, they have analyzed large datasets of public key certificates, e.g. Censys scans [20], to determine where public keys most commonly come from. Most TLS libraries either directly provide low-level cryptography functionality, or they build on top of some other low-level library. Hence, we chose five popular libraries based on the mentioned research.

OpenSSL. OpenSSL [64] is by far the most dominant all-purpose cryptographic and TLS library. It comes pre-installed in a multitude of Linux distributions.

GnuTLS. GnuTLS [73] is a TLS library written on top of Nettle [57], a low-level cryptographic library. Like OpenSSL, it is written in C, and aims to provide a simple interface to TLS.

Mbed TLS. Mbed TLS [80] is another C library, intended for use in applications with restricted resources.

Botan. Botan [56] is a cryptography library which aims to be a modern solution for C++ applications. It also provides language bindings for C, Python, Rust, and Ruby.

OpenJDK. OpenJDK [63] is an open-source implementation of the Java Platform. Unlike the others, it is not a library per se, but it includes all tools needed to implement TLS, and is commonly used to do so.

3.2.4 TLS client implementation

All five TLS clients are implemented according to documentation. They are not fully complete, since they close the connection immediately after the TLS handshake. This is because we are interested solely in certificate validation errors, which occur during the handshake. However, we publish three out of the five clients online as guides. These do contain all necessary features and we describe them further in Section 5.2.

Though some steps may be hidden by certain libraries, all clients perform roughly the following ones.

1. Processing validation options. Firstly, the client processes all command-line arguments which represent validation options. These are the validation options that we store separately for each malformed certificate chain, as described in Section 2.2.6. They include trusted root certificates, the expected certificate hostname and many others.

2. Establishing an underlying TCP/IP connection. Next, the client connects to the server using the TCP/IP protocol. The secure TLS connection will be built on top of this one.

3. Preparing necessary data structures. Before the connection can be established, various data structures must be initialized and options set. This includes setting the validation options collected in step 1. Apart from those, the clients use default options, recommended by their libraries3.

4. Performing the TLS handshake. In this step, the handshake is finally performed. When an error occurs during the handshake protocol, most libraries immediately fail, and rightly so. In our case, we force the client to continue, in order to learn more information about the occurring error.

5. Announcing the result of certificate validation. To be specific, we require that the clients print out the error message of an error which occurred during certificate validation. Usually, a few additional functions have to be called to do so.

6. Closing the connection. Lastly, we close the connection without sending or receiving any application data. The TLS connection should be closed in an orderly manner, by sending a special “close notify” message to the peer and waiting for its reply.
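The six steps above can be sketched with Python's standard ssl module. This is not one of the five clients from the thesis, only a rough analogue under stated assumptions: the function names are hypothetical, and unlike the thesis clients, Python's ssl aborts the handshake on a validation error rather than continuing.

```python
import socket
import ssl
from typing import Optional


def make_context(cafile: Optional[str] = None) -> ssl.SSLContext:
    """Steps 1 and 3: turn validation options into a configured context.

    With a cafile, only those roots are trusted; without one, the system
    default roots are loaded. Library defaults are kept otherwise.
    """
    return ssl.create_default_context(cafile=cafile)


def validate_server(
    host: str, port: int, hostname: str, cafile: Optional[str] = None
) -> str:
    """Connect, perform the handshake and report the validation result."""
    context = make_context(cafile)
    # Step 2: the underlying TCP/IP connection.
    with socket.create_connection((host, port), timeout=5) as tcp:
        try:
            # Step 4: the TLS handshake, including certificate validation.
            with context.wrap_socket(tcp, server_hostname=hostname):
                # Step 6: leaving the block closes the connection without
                # exchanging application data.
                return "OK"
        except ssl.SSLCertVerificationError as err:
            # Step 5: announce the validation error.
            return f"{err.verify_code}: {err.verify_message}"
```

A call such as validate_server("localhost", 4433, "x509errors.org", "root_ca.pem") would return either "OK" or the verification error code and message; the port number here is an arbitrary example.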

3. Necessary settings include supported ciphersuites or acceptable TLS versions. Libraries usually recommend safe settings as defaults, e.g. only TLS v1.2 and higher.

Revocation checking. The used libraries provide relatively simple API calls for enabling almost all critical certificate validation settings. There is a single exception – revocation checks. Having a locally stored certificate revocation list at hand, we can enable offline revocation checks easily. However, such a scenario would be rare in real-world applications. More often, the checks must be performed online by contacting CRL or OCSP servers during the handshake protocol.

Excluding Mbed TLS, all libraries are able to retrieve data from the CRL Distribution Points and Authority Information Access certificate extensions. Both extensions should contain the URIs of servers providing revocation information (CRL and OCSP respectively). However, clients must contact these servers manually during the handshake, since the libraries do not provide such a feature.

The situation is a bit better when it comes to validating stapled OCSP responses. OpenSSL and GnuTLS check them automatically when present, considering the OCSP “must-staple” certificate extension as well. Botan and OpenJDK require additional API calls, while Mbed TLS does not support OCSP at all.

As a result of these limitations, the support for revocation checking differs in our implementations. We include offline revocation checks in Mbed TLS. We support OCSP stapling in OpenSSL and GnuTLS. The GnuTLS implementation also includes experimental online CRL and OCSP checking.

3.2.5 Server implementation

Since we validate certificates via a client-side TLS connection, we must also run a TLS server. We are interested purely in handshake messages, so we can use a “dummy” server which does not respond to any further requests afterwards. We use two separate TLS servers, “main” and “backup”. The reasons are explained below, as we briefly describe both.

Main TLS server. To run our main TLS server, we use a standard Python package, asyncio [12]. The package is intended for use in a variety of concurrent applications. Among other things, it can wrap the client/server functionality of ssl [79], another standard Python library. As an added value, it can establish a server which accepts multiple connections at once.

Internally, the ssl package runs regular OpenSSL. Unfortunately, OpenSSL performs basic checks on the certificate and on the private key when they are imported for use. This behaviour is certainly useful in normal usage, but it rejects some of our erroneous certificates, e.g. those associated with weak private keys.

Backup TLS server. To deploy certificates rejected by the main server, we use a command-line server shipped with Botan [28]. It is rather primitive and its documentation explicitly states that it is intended for testing purposes only. As a backup, we deem it sufficient. This server does not reject certificates like Python does. On the other hand, it fails to process syntactically malformed certificates, sending invalid data during the handshake instead.

As a result, we use two separate TLS servers to deploy certificates. For each malformed certificate chain, we specify which of the two should be used, depending on the certificate flaw.
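A “dummy” main server of the kind described above might look roughly like the following sketch. It is an assumption-laden illustration rather than the thesis' server: the function names, the default port and the file paths are hypothetical. Note that load_cert_chain is the point at which OpenSSL may reject a malformed certificate or key.

```python
import asyncio
import ssl


def make_server_context(certfile: str, keyfile: str) -> ssl.SSLContext:
    """Load the chain and key; OpenSSL runs its basic checks here."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.load_cert_chain(certfile=certfile, keyfile=keyfile)
    return context


async def handle(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    """Close each connection right after the handshake; no data is served."""
    writer.close()
    await writer.wait_closed()


async def serve(certfile: str, keyfile: str, host: str = "127.0.0.1", port: int = 4433) -> None:
    """Accept multiple TLS connections concurrently until cancelled."""
    context = make_server_context(certfile, keyfile)
    server = await asyncio.start_server(handle, host, port, ssl=context)
    async with server:
        await server.serve_forever()
```

The server would be started with asyncio.run(serve("chain.pem", "key.pem")), where both file names are placeholders.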

In some cases, we need to establish two additional HTTP servers that act as a CRL distribution point and an OCSP responder.

CRL distribution point. A CRL distribution point is a simple HTTP server which only serves a single file – the most recent CRL. We implement it using the Python http.server package [43].

OCSP responder. In contrast, the functionality of an OCSP server is much more elaborate. The responder must be able to process an OCSP status request, create a response and sign it. For this purpose, we could employ openssl-ocsp [65], a command-line utility. However, we do not compare any OCSP-related validation errors yet, due to the difficulties explained earlier.
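A single-file CRL distribution point can be sketched with http.server as follows. The helper names are hypothetical, the CRL bytes are a placeholder, and the handler answers every GET request with the same CRL regardless of the requested path, which is an assumption of this sketch.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_crl_handler(crl_der: bytes):
    """Build a request handler that always returns the given CRL."""

    class CRLHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            self.send_header("Content-Type", "application/pkix-crl")
            self.send_header("Content-Length", str(len(crl_der)))
            self.end_headers()
            self.wfile.write(crl_der)

    return CRLHandler


def make_crl_server(crl_der: bytes, host: str = "127.0.0.1", port: int = 8080) -> HTTPServer:
    """Create (but do not start) the HTTP server for the distribution point."""
    return HTTPServer((host, port), make_crl_handler(crl_der))


handler_class = make_crl_handler(b"\x30\x00")  # placeholder bytes, not a real CRL
```

Calling serve_forever() on the object returned by make_crl_server would start serving; the URI of this server is what the CRL Distribution Points extension of a test certificate would point to.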

4 Comparing errors across libraries

At this point, we have a certificate dataset at hand, and we are able to establish a client-side TLS connection with proper certificate validation. This chapter describes the process of merging the two. At the beginning of the chapter, we explain our motivation for comparing certificate validation errors across libraries (Section 4.1). We carry on with the build process of our system (Section 4.2). Next, we explain how the system processes certificate validation data (Section 4.3), before evaluating the results (Section 4.4).

4.1 Goals

Comparing the behaviour of TLS libraries when confronted with erroneous certificates may seem to be a futile task. We present the reasons for doing so below.

Error correspondence. Our main goal is to find a correspondence between errors that occur within different TLS libraries. We believe that establishing such correspondence may prove useful for multiple purposes:

1. More effective documentation. A parallel effort in the Usable X.509 errors project aims to improve documentation of individual error messages. Currently, most libraries provide error documentation that consists of one or two sentences, hence not explaining the problem well [84]. However, previous research shows that proper documentation is valuable and can lead to more secure code [2]. In her master's thesis, Balážová [15] concludes that most developers would prefer error documentation to be more detailed and to include sections such as security perspective. Moreover, she designs improved documentation for three certificate validation errors occurring in the OpenSSL library. While rewriting the documentation proves to be useful, it is a costly process, considering the sheer amount of validation errors among libraries [84]. Grouping errors with the same meaning would imply that less documentation needs to be written overall.

2. Transitioning from one library to another. Establishing a mapping between certificate validation error messages in different libraries could also help developers who transition between those libraries. A less security-savvy developer may understand the implications of an OpenSSL error X509_V_ERR_HOSTNAME_MISMATCH, if they have already dealt with it before. However, if they encounter the error GNUTLS_CERT_UNEXPECTED_OWNER, it may not be clear that it has, in fact, the same cause¹ as the former one. By having access to a resource where these two errors are linked, a developer may evaluate the security implications better, and solve the problem faster.

3. Deciding on the feasibility of error taxonomy unification. Standardizing a set of certificate validation errors would arguably eliminate the problems that we list in the previous two points, but it is questionable whether such a thing would even be possible, beneficial, and appropriate. Performing the comparison is the first step in deciding on these issues. Further research would then be required to assess the differences and granularity of errors in different libraries. Even if the unification of validation errors proves to be possible, the wider security community would have to be open to the changes. Moreover, authors of the TLS libraries themselves would have to be able to incorporate the changes into their libraries.

Looking for bugs. Aside from error comparison, our secondary goal is to assess possible abnormalities in the validation results. To do so, we look for patterns therein, for instance a library accepting a certificate chain when all other libraries reject it. Such behaviour may point to a missing or incorrectly implemented check in certificate validation. However, this direction of research is not our main point of focus. As we discuss in Section 6.1, others have employed more advanced techniques for such testing.

1. These two errors are both caused by an invalid hostname in the certificate, as discussed in Section 2.1.2.

4.2 Build

In order to compare certificate validation errors in a reliable way, we construct a system which performs all the necessary steps automatically. To be more specific, it must generate all certificate chains, validate them, and process the validation results. This section summarizes the procedures that take place when the entire system is built for the first time. The build process is controlled by Makefiles, therefore any subsequent build only interacts with source code and data which have changed since the previous build. Figure 2 describes how individual system components interact with data and each other. We have already described most of the components and data in the previous two chapters, except for the ones that concern validation results (right side of the figure). The data are split into two categories: static data that we prepared manually beforehand (blue), and data that are generated dynamically with each build (orange).

[Figure 2 diagram. Components shown: helper Python libraries (X.509, IO, Crypto); ASN.1 specification files; certificate scripts (root generating script, chain generating scripts); root CA certificate, root private key, private keys and certificate chains; servers (TLS server, CRL distribution point, OCSP responder); TLS clients (OpenSSL, GnuTLS, Mbed TLS, Botan, OpenJDK); validation settings; possible validation errors; validation results; mapping script; library error mapping.]

Figure 2: High-level overview of the entire system. We distinguish between manually collected data, source code and automatically generated data. Solid lines represent data flow, dashed lines are communication/usage.

When make is executed for the first time, the following steps are performed, in this order:

1. TLS Client build. All TLS clients are compiled into executable binaries.

2. Root authority certificate generation. The root authority generating script is executed, yielding a private key and a single self-signed CA certificate.

3. Malformed chain generation. Each certificate script is executed, outputting the private key of the end entity, a certificate chain, and in some cases other data, such as CRLs. The topmost intermediate CA certificate in each chain is signed using the root CA key generated in the previous step (unless the flaw within the chain contradicts this rule).

4. TLS handshakes. The main TLS server loads a certificate chain and the matching private key. Then it listens for connections on localhost, eventually performing a TLS handshake with all five TLS clients. For some chains, CRL and OCSP servers have to be started too, loading the necessary data (CRLs and OCSP status information). Each client validates the server certificate chain during the handshake, using the predefined validation settings. The resultant message is saved into a file. When all clients’ handshakes are finished, running servers are killed, and the whole step repeats for the next certificate chain.

5. Result processing. Lastly, the validation results are evaluated by a mapping script. In the process, the script consults sets of all possible error messages. We describe both the pre-collected error data and the mapping script in the following section.
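The handshake step (step 4) amounts to a loop over chains and clients. The sketch below shows only that control flow; the actual build is driven by Makefiles, and every name here (validate_all, start_servers, run_client) is a hypothetical stand-in supplied by the caller.

```python
from typing import Callable, Dict, Iterable


def validate_all(
    chains: Iterable[str],
    clients: Iterable[str],
    start_servers: Callable[[str], Callable[[], None]],
    run_client: Callable[[str, str], str],
) -> Dict[str, Dict[str, str]]:
    """For each chain: start servers, run every client, stop servers."""
    clients = list(clients)
    results: Dict[str, Dict[str, str]] = {}
    for chain in chains:
        # start_servers returns a callable that stops what it started.
        stop_servers = start_servers(chain)
        try:
            results[chain] = {
                client: run_client(client, chain) for client in clients
            }
        finally:
            # Servers are stopped before the next chain is processed.
            stop_servers()
    return results
```

Passing the server and client actions in as callables keeps the loop independent of how the servers and clients are actually launched.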

4.3 Result processing

To recognize errors occurring during certificate validation, we establish a set of all possible errors in each library (Section 4.3.1). Further in this section, we observe the validation results, describing the limits of cross-library error mapping (Section 4.3.2). Lastly, we explain how the results are processed during the build (Section 4.3.3).

4.3.1 Collecting error data

When certificate validation fails in one of our TLS clients, the client prints out an error message to the standard error output. Usually, the error message will correspond to a single error that the library recognizes. To give an example, OpenSSL prints its messages like below.

switch (error_code) {
    case X509_V_ERR_CERT_HAS_EXPIRED:
        return "certificate has expired";
    case X509_V_ERR_DEPTH_ZERO_SELF_SIGNED_CERT:
        return "self signed certificate";
}

As we can see, an expired certificate will induce an error with the code X509_V_ERR_CERT_HAS_EXPIRED and the OpenSSL client will print out a single string: “certificate has expired”.

Certain TLS libraries may print out additional information about the error, such as the subject name of the faulty certificate. However, we are not interested in that kind of information, since we only want to know which error occurred. To make these decisions easier, we collected data about all possible certificate validation errors for each library². This process needs to be done only once, but the resulting database must be updated whenever the libraries themselves change.

It is important to mention that numbers of possible error messages differ significantly among libraries, as already noted by Ukrop et al. [84]. We list the current numbers in Table 1.

2. In OpenJDK, we did not manage to collect all possible errors, since the errors are thrown as exceptions with string messages. No list of all exception messages exists, therefore we only work with errors that we successfully reproduced.


Table 1: Number of validation errors as stated by each library and number of other errors that may occur during validation.

Library name   Validation errors   Other related errors   Total known errors
OpenSSL        96                  0                      96
GnuTLS         19                  10                     29
Mbed TLS       20                  21                     41
Botan          50                  4                      54
OpenJDK        14                  5                      19

To store possible library errors, we use a single YAML file, containing errors and their respective messages. In the case of OpenSSL, the file contains the following excerpt.

- code: X509_V_ERR_CERT_HAS_EXPIRED
  message: |
    certificate has expired
- code: X509_V_ERR_DEPTH_ZERO_SELF_SIGNED_CERT
  message: |
    self signed certificate

We make use of these data further in this chapter, when linking errors together (Section 4.3.3).
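One way such data could be used to decide which error occurred is a lookup of known messages in the client output. The sketch below is an assumption about that step, not the thesis' mapping script: the error data are written as Python literals mirroring the YAML excerpt (to stay within the standard library), and identify_error is a hypothetical helper.

```python
from typing import Dict, List, Optional

# Mirrors the structure of the YAML excerpt above.
OPENSSL_ERRORS: List[Dict[str, str]] = [
    {"code": "X509_V_ERR_CERT_HAS_EXPIRED", "message": "certificate has expired\n"},
    {"code": "X509_V_ERR_DEPTH_ZERO_SELF_SIGNED_CERT", "message": "self signed certificate\n"},
]


def identify_error(client_output: str, known_errors: List[Dict[str, str]]) -> Optional[str]:
    """Return the code whose message occurs in the client output.

    Additional information printed around the message is ignored. If
    several messages match, the longest one wins, so that a message which
    is a prefix of another does not shadow the more specific error.
    """
    matches = [
        error for error in known_errors
        if error["message"].strip() in client_output
    ]
    if not matches:
        return None
    return max(matches, key=lambda error: len(error["message"]))["code"]


code = identify_error(
    "error 18 at 0 depth lookup: self signed certificate", OPENSSL_ERRORS
)
```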

4.3.2 Observations

After all certificate chains are validated, the validation results are stored in a single file. For each chain, the file contains error messages reported by each TLS client.

    INVALID_SIGNATURE:
      botan: Signature error : ...The signature in the certificate is invalid.
      mbedtls: ...not correctly signed by the trusted CA
      openjdk: ...signature check failed
      openssl: certificate signature failure

In this case, the name we gave to the chain (INVALID_SIGNATURE) is rather descriptive. The end entity certificate is signed by a random key, not corresponding to the public key of the issuer CA. Hence, all libraries correctly reject the chain.

When the results are carefully examined, it is important to notice that the mapping of error messages across libraries is not bijective. That is, multiple errors in OpenSSL may map to a single error in GnuTLS. For instance, GNUTLS_CERT_SIGNER_CONSTRAINTS_FAILURE encompasses multiple OpenSSL errors that regard the Name Constraints and Basic Constraints extensions, including X509_V_ERR_SUBTREE_MINMAX and X509_V_ERR_PATH_LENGTH_EXCEEDED. This crucial observation is in line with the different numbers of possible errors in each library. The only conclusion we can deduce from the observation is that OpenSSL developers prefer to differentiate between the errors on a finer scale.

Concluding that we will not be able to create a 1:1 mapping of validation errors across libraries, we can try to assess whether at least a 1:N mapping would be possible to establish. Hence, we need to evaluate whether it is true that if a single GnuTLS error covers multiple OpenSSL errors, none of these OpenSSL errors occurs simultaneously with any other GnuTLS error. This relationship must hold between each possible pair of libraries, without exception. Upon closer inspection, it is obvious that even such a less restrictive mapping is impossible to create. Notice the following three validation results, which serve as a counterexample.

    INVALID_SIGNATURE:
      mbedtls: ...not correctly signed by the trusted CA
      openssl: certificate signature failure

    ISSUER_CA_FALSE:
      mbedtls: ...not correctly signed by the trusted CA
      openssl: invalid CA certificate

    WRONG_SIGNATURE_ALGORITHM:
      mbedtls: ...Signature algorithms do not match...
      openssl: certificate signature failure

When validating the ISSUER_CA_FALSE chain, Mbed TLS returns the same error message as it does for INVALID_SIGNATURE, while OpenSSL does not. That being said, when the WRONG_SIGNATURE_ALGORITHM chain is validated, OpenSSL does return the same error message as in the case of INVALID_SIGNATURE, while Mbed TLS does not.

This observation directly implies that the cross-library error mapping is neither 1:1, nor 1:N, but M:N. As a result, the only information we are able to collect about a given error is a set of errors that, at some point, occur simultaneously with the given error.
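The counterexample can also be checked mechanically. The following Python sketch is illustrative only: it encodes the three validation results above as a plain dictionary, and the helper `is_one_to_n` is a hypothetical name, not part of the thesis tooling. The helper tests whether every error of the finer-grained library always occurs together with the same single error of the coarser library, which is the condition a 1:N mapping would require.

```python
from collections import defaultdict

# The three validation results from above, stored as
# chain name -> {library: reported message} (messages shortened as in the text).
RESULTS = {
    "INVALID_SIGNATURE": {
        "mbedtls": "not correctly signed by the trusted CA",
        "openssl": "certificate signature failure",
    },
    "ISSUER_CA_FALSE": {
        "mbedtls": "not correctly signed by the trusted CA",
        "openssl": "invalid CA certificate",
    },
    "WRONG_SIGNATURE_ALGORITHM": {
        "mbedtls": "Signature algorithms do not match",
        "openssl": "certificate signature failure",
    },
}


def is_one_to_n(results, coarse, fine):
    """Return True if each error of `fine` co-occurs with a single error of `coarse`."""
    partners = defaultdict(set)
    for reported in results.values():
        partners[reported[fine]].add(reported[coarse])
    return all(len(coarse_errors) == 1 for coarse_errors in partners.values())


# Neither direction satisfies the 1:N condition on these three chains.
print(is_one_to_n(RESULTS, coarse="mbedtls", fine="openssl"))  # False
print(is_one_to_n(RESULTS, coarse="openssl", fine="mbedtls"))  # False
```

Both calls return False: the OpenSSL message “certificate signature failure” co-occurs with two Mbed TLS messages, and the Mbed TLS message “not correctly signed by the trusted CA” co-occurs with two OpenSSL messages.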

4.3.3 Cross-library error linking

The cross-library linking of certificate validation errors is done by a simple Python script. Its functionality is threefold: firstly, it determines whether a particular error message actually belongs to some error in the given library; secondly, it links errors across libraries based on common certificate chains; and lastly, it stores the linkage conveniently.

Matching with an error code. Recall from Section 4.3.1 that we store all possible error codes and their messages in a file like below.

    - code: GNUTLS_CERT_EXPIRED
      message: |
        The certificate chain uses expired certificate.

Let us assume that our GnuTLS client returns the message: “The certificate is NOT trusted. The certificate chain uses expired certificate.” Clearly, the listed error (GNUTLS_CERT_EXPIRED) has occurred, but its message was printed along with some extra words. To match the returned message with the right error code, we simply loop through all possible error codes. In the usual case, we check whether its error message is a substring of the returned message. If this condition holds, we declare it a match.

Error correspondence. For each error, we can compute the set of corresponding errors in two steps.

1. Firstly, we calculate a set of all certificate chains that yield the given error during validation. Formally, if $e$ is the assessed error from library $i$, we compute the following set $X_e$:

$$X_e = \{c \in C \mid \nu_i(c) = e\},$$

where $\nu_i : C \to E_i$ is the certificate validation function of library $i$, $C$ is the set of all malformed certificate chains, and $E_i$ is the set of all possible errors in library $i$.

2. Secondly, we compute sets of corresponding errors. We declare two errors $e$, $f$ correspondent to each other if and only if they share at least one common certificate chain. The binary correspondence relation, $R$, is thus defined as follows:

$$e \mathrel{R} f \iff X_e \cap X_f \neq \emptyset$$
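The two steps above can be sketched in Python as follows. This is a minimal illustration rather than the script used in the thesis: the function names `chains_yielding` and `corresponds` are hypothetical, errors are identified by a (library, message) pair, and the data are the three counterexample chains from Section 4.3.2 under the simplifying assumption of one error per library and chain.

```python
# Validation outcomes, stored as chain name -> {library: reported message}.
RESULTS = {
    "INVALID_SIGNATURE": {
        "mbedtls": "not correctly signed by the trusted CA",
        "openssl": "certificate signature failure",
    },
    "ISSUER_CA_FALSE": {
        "mbedtls": "not correctly signed by the trusted CA",
        "openssl": "invalid CA certificate",
    },
    "WRONG_SIGNATURE_ALGORITHM": {
        "mbedtls": "Signature algorithms do not match",
        "openssl": "certificate signature failure",
    },
}


def chains_yielding(results, library, error):
    """Step 1: the set X_e of chains for which `library` reports `error`."""
    return {
        chain
        for chain, reported in results.items()
        if reported.get(library) == error
    }


def corresponds(results, e, f):
    """Step 2: e R f holds if X_e and X_f share at least one chain."""
    x_e = chains_yielding(results, *e)
    x_f = chains_yielding(results, *f)
    return bool(x_e & x_f)


e = ("openssl", "certificate signature failure")
f = ("mbedtls", "not correctly signed by the trusted CA")
g = ("mbedtls", "Signature algorithms do not match")

print(sorted(chains_yielding(RESULTS, *e)))
print(corresponds(RESULTS, e, f), corresponds(RESULTS, e, g), corresponds(RESULTS, f, g))
```

On this data, the set for `e` contains INVALID_SIGNATURE and WRONG_SIGNATURE_ALGORITHM, and `e` corresponds to both `f` and `g`, while `f` and `g` share no chain.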

Notice that the error correspondence, as we define it, is not an equivalence relation since it is not transitive. An error e can share common chains with errors f, g, but f and g do not have to share a common chain. This observation is consistent with the actual cross-library error mapping, which is of type M:N.

There is a single pitfall that we need to mention. Throughout the chapter, we assume that a library can only return a single error when validating a certificate chain. In general, this is not true for some libraries, namely GnuTLS and Mbed TLS. We do take this into account when computing the mapping, but do not describe the process for simplicity.

Storing correspondence results. Once again, the results of the linkage are stored in a single file for each analyzed library. An excerpt from a mapping file of the Botan library may look as below.

    CANNOT_ESTABLISH_TRUST:
      chains:
        - SELF_SIGNED_END_ENTITY
        - SELF_SIGNED_INTERMEDIATE
      correspondence:
        gnutls:
          - GNUTLS_CERT_SIGNER_NOT_FOUND
        mbedtls:
          - MBEDTLS_X509_BADCERT_NOT_TRUSTED
        openjdk:
          - UNABLE_TO_FIND_VALID_CERTIFICATION_PATH
        openssl:
          - X509_V_ERR_DEPTH_ZERO_SELF_SIGNED_CERT
          - X509_V_ERR_SELF_SIGNED_CERT_IN_CHAIN

In this case, the set $X_e$ ($e$ being CANNOT_ESTABLISH_TRUST) contains two chains, both caused by untrusted self-signed certificates. The error is in correspondence relation $R$ with 5 other errors. We group them by library, in order to ease data manipulation.
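The substring matching described at the start of this section can be sketched as follows. This is an assumption-laden illustration, not the actual script: the function name `match_error_codes` is hypothetical, and the error database is written as Python data instead of YAML so that no third-party parser is needed. Only codes and messages quoted in this chapter are used.

```python
# Excerpt of the per-library error database from Section 4.3.1.
# A YAML block scalar ("|") keeps a trailing newline, which is imitated here.
ERRORS = {
    "gnutls": [
        {"code": "GNUTLS_CERT_EXPIRED",
         "message": "The certificate chain uses expired certificate.\n"},
    ],
    "openssl": [
        {"code": "X509_V_ERR_CERT_HAS_EXPIRED",
         "message": "certificate has expired\n"},
        {"code": "X509_V_ERR_DEPTH_ZERO_SELF_SIGNED_CERT",
         "message": "self signed certificate\n"},
    ],
}


def match_error_codes(library, returned_message):
    """Return the codes of all known errors whose message occurs in the client output."""
    return [
        entry["code"]
        for entry in ERRORS[library]
        # Strip the trailing newline before the substring test.
        if entry["message"].strip() in returned_message
    ]


output = ("The certificate is NOT trusted. "
          "The certificate chain uses expired certificate.")
print(match_error_codes("gnutls", output))   # ['GNUTLS_CERT_EXPIRED']
print(match_error_codes("openssl", output))  # []
```

Returning a list rather than a single code is a design choice of this sketch. It would let one client message match several errors, which may be one way to handle libraries that report more than one error per chain; the thesis does not describe how that case is actually treated.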

4.4 Results evaluation

Within this section, we look at the correspondence data and briefly evaluate them. We assess inconsistencies in the libraries' behaviour to see whether there are possible bugs in their certificate validation. Afterwards, we suggest error taxonomy improvements (Section 4.4.1).

Error coverage. Fig. 3 shows how our certificate dataset covers the sets of possible errors in the five libraries. As of now, we have replicated 31/96 errors in OpenSSL, 14/29 errors in GnuTLS, 13/41 errors in Mbed TLS and 20/54 errors in Botan. Our certificate chains induce 19 distinct errors in OpenJDK, but we do not have data about the complete set of errors, as already mentioned.

Since the cross-library error mapping is constructed with respect to certificate chains, the same errors also have corresponding errors assigned. There are some exceptions; errors that depend purely on validation settings of some library (usually OpenSSL) do not form any correspondence relations. Some errors are explicitly declared obsolete or deprecated by their documentation. Thus, they cannot be replicated and the figure displays them in gray.

[Bar chart; legend: Deprecated, Not covered, Covered; vertical axis from 0 to 100; one bar each for OpenSSL, GnuTLS, Mbed TLS, Botan, and OpenJDK.]

Figure 3: Number of successfully reproduced errors in assessed libraries.
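As a worked check of the coverage figures, the short Python sketch below recomputes the ratios from the counts stated above and the totals in Table 1. The names `COVERAGE` and `coverage_ratio` are illustrative only. OpenJDK is given no total, since its complete error set is unknown.

```python
# (reproduced errors, total known errors) per library.
COVERAGE = {
    "OpenSSL": (31, 96),
    "GnuTLS": (14, 29),
    "Mbed TLS": (13, 41),
    "Botan": (20, 54),
    "OpenJDK": (19, None),  # complete set of errors is not known
}


def coverage_ratio(reproduced, total):
    """Return the reproduced share of known errors, or None if the total is unknown."""
    return None if total is None else reproduced / total


for library, (reproduced, total) in COVERAGE.items():
    ratio = coverage_ratio(reproduced, total)
    shown = "n/a" if ratio is None else f"{ratio:.0%}"
    print(f"{library:9} {reproduced:3} reproduced, coverage {shown}")

# Total number of distinct errors reproduced across the five libraries.
print(sum(reproduced for reproduced, _ in COVERAGE.values()))
```

The ratios come out at roughly 32% for OpenSSL, 48% for GnuTLS, 32% for Mbed TLS and 37% for Botan, and the reproduced counts sum to 97.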

Library differences. The extent of differences between errors from different libraries depends largely on the nature of certificate flaws. For common and simple flaws, e.g. expired certificate or wrong hostname, the mapping is straightforward. All libraries have unique error codes reserved for them, hence the mapping is bijective when we choose any pair of libraries.

However, in extreme cases, a single error in one library may encompass as many as 10 errors in other libraries. This is the case of GNUTLS_CERT_SIGNER_CONSTRAINTS_FAILURE, which corresponds to 10 OpenSSL errors, 2 Mbed TLS errors, 5 Botan errors and 7 OpenJDK errors. In fact, the mapping of extension-related errors is rather chaotic in general. Each library chooses its own granularity: OpenSSL rarely returns the same error twice, while Mbed TLS often plainly states that “extensions are invalid”.

We ought to note that some of our mappings may be imprecise when a library does not recognize non-standard certificate extensions. For instance, Botan returns the error UNKNOWN_CRITICAL_EXTENSION when validating chains that have flaws within the Proxy Cert Info extension [83]. Further cleaning of these results would be required to make them more precise.

Unlike other libraries, OpenSSL does not clearly separate semantic certificate validation errors from other errors that may occur in the validation. Thus, its official error list includes unrelated errors like X509_V_ERR_OUT_OF_MEM (error when allocating memory during validation) or X509_V_ERR_INVALID_CALL (bad call within the C structure used for validation). We did not try replicating these kinds of errors.

OpenSSL’s list of certificate validation errors also contains parsing errors, while other libraries return high-level errors instead. For instance, a badly formatted expiration date causes a very specific error in OpenSSL (X509_V_ERR_NOT_AFTER_FIELD_INVALID). In contrast, GnuTLS returns a general parsing error (GNUTLS_E_ASN1_DER_ERROR).

Overlooked flaws. Surprisingly, we have discovered a considerable number of certificate flaws that are overlooked by TLS libraries. We list the concrete instances below.

• Basic Constraints not critical. According to RFC 5280, the Basic Constraints extension must be set to critical in CA certificates. No library performs this check.

• PathLenConstraint present in non-CA certificate. Similarly, libraries do not mind the pathLenConstraint field in certificates not belonging to certificate authorities. Yet, the field must not be present in end entity certificates.

• PathLenConstraint negative. OpenJDK and GnuTLS accept certificate chains wherein the intermediate CA certificate has a negative pathLenConstraint, even though they correctly reject other chains that violate this constraint.

• Empty Key Usage or Extended Key Usage. Botan and GnuTLS accept server certificates with empty Key Usage and Extended Key Usage extensions. Hence, they will accept even certificates not intended for use by TLS servers.

• Name constraints violations. Botan and OpenJDK accept certificates that do not conform to constraints set by their issuer's Name Constraints extension (specifically, they violate the excludedSubtrees field). OpenSSL and OpenJDK accept certificates with empty Name Constraints, which is prohibited by RFC 5280.

• Badly formatted Subject Alternative Name. Botan, GnuTLS and OpenJDK accept certificates with null bytes in their Subject Alternative Name.

• Invalid certificate versions. GnuTLS and OpenSSL accept certificates of version 4. Moreover, OpenSSL accepts version 1 certificates with extensions, unless the strict validation flag is enabled.

• Signature algorithm mismatch. Botan accepts an end entity certificate where the public key algorithm identifier does not match the public key.

Note that some libraries ignore certain flaws on purpose, for reasons such as backward compatibility [6]. Moreover, we are currently not aware of any attacks that could abuse the listed behaviour.

4.4.1 Error taxonomy improvements

Based on our observations of the error comparison, we propose three possible improvements to error classification.

1. Unifying the names of common errors. The names of simple errors with clear cross-library mapping could be unified. That way, a transitioning developer would not have to guess whether these five errors represent the same flaw.

• CANNOT_ESTABLISH_TRUST
• X509_V_ERR_UNABLE_TO_GET_ISSUER_CERT_LOCALLY
• MBEDTLS_X509_BADCERT_NOT_TRUSTED
• GNUTLS_CERT_SIGNER_NOT_FOUND
• NO_TRUST_ANCHOR

2. One error per extension. In certificate extensions, there are lots of things that can go wrong. One approach to unifying error taxonomy would be to have a single validation error for each extension. That is, all certificate flaws related to a given extension would yield the same error. By itself, this would probably decrease error understandability. However, the approach could be combined with detailed documentation which would explain all possible violations of the given extension (e.g. by extracting rules from RFCs [25]).

3. Completely separating parsing errors. Treating parsing and formatting errors as a completely different category could also help in untangling the error taxonomy. In their essence, they do not fit into certificate path validation, as defined in the X.509 profile.

Further research and communication with security professionals would be required to decide whether the proposed changes are feasible and appropriate to employ.

5 Deployment

We publish all of our data and resources on the website x509errors.org. Our intentions are to help developers not by replacing existing TLS documentation, but by supplementing it. Therefore, we aim to have our resources accurate, detailed, and intuitive to use.

This chapter presents the features of the website, splitting them into two parts. Section 5.1 explains how we publish certificate validation error data, error correspondence, and malformed certificates; while Section 5.2 describes step-by-step guides that should help developers implement basic client-side TLS connection in three different libraries.

5.1 Error documentation

As its primary goal, our website aggregates data and documentation about certificate validation errors into a single place. Each TLS library has its own subpage where all its errors are listed. We use error data collected in Section 4.3.1 to display all possible library errors.

Figure 4 shows an excerpt from the OpenSSL page, certificate extension errors. The errors are clearly distinguished by their original library names, since developers are likely to look for those.

Figure 4: OpenSSL errors related to Basic Constraints and extensions in general, as published on the website x509errors.org.

To make X509_ V_ the ERR_ websiteUNABLE_ TO_ easier VERIFY_ L toEAF_ navigate, SIGNATURE we split errors into different categories, which loosely correspond to those established in Section 2.2.5. The sortingX509_ V_ ERR was_ PATH performed_ LOOP manually. For each category, we list a short description X509_ V_ ER andR_ OC linkSP_ CE relevantRT_ UNKNOW documents,N such as Requests for Comments. Each error code displays additional information when clicked on, as seen X in50 9Figure_ V_ ERR_ A 5KI.D_ Specifically, SKID_ MISMATCH we list the original library documentation and theX509_ original V_ ERR_ AKID error_ ISSUER message._ SERIAL_ MIS IfMAT ourCH certificate dataset contains certifi-

 cate chains which yield the given error, we provide them for download as  X509_ V_ ERR_ SUBJECT_ ISSUER_ MISMATCH 

F ZIP archives. Furthermore, all corresponding errors from other libraries, e e

d 

b as computed X509_ V_ ERR_ in OC SSectionP_ VERIFY_ F 4.3AILE,D are linked. a c

k So as to aid developers even more, chosen errors are getting covered   with X redesigned509_ V_ ERR_ CERT documentation._ UNTRUSTED The documentation contains sections explaining the root cause of the error, its security implications, and steps to perform in order to solve the issue. Albeit important, this new documentationBasic extension iserrnotors a subject of this thesis, since it is being written byError as rel parallelated to extension efforts in general o ofr to t Balážováhe BasicConstraints [st15anda].rd extension. Relevant links: Certificate Extensions  (RFC 5280), BasicConstraints Extension  (RFC 5280)

 X509_ V_ ERR_ UNSUPPORTED_ EXTENSION_ FEATURE 

 X509_ V_ ERR_ INVALID_ CA  

 Original documentation:

A CA certificate is invalid. Either it is not a CA or its extensions are not consistent with the supplied purpose. (source )

 Original error message:

invalid CA certificate (source )

 Example certificates

Below you can download one or more example malformed certificates causing X509_ V_ ERR_ INVALID_ CA in OpenSSL. If you are interested in generating these certificates yourself, see the corresponding generating script for each case on the project Github.

Case  ISSUER_CA_FALSE (see the  generation script )

 Corresponding errors

What validation errors do other libraries give for certificates causing X509_ V_ ERR_ INVALID_ CA in OpenSSL? Below, you can see the basic overview based on the example certificates from the previous section. (The list may be incomplete.)

GnuTLS: GNUTLS_ CERT_ SIGNER_ NOT_ CA Botan: CA_ CERT_ NOT_ FOR_ CERT_ ISSUER Mbed TLS: MBEDTLS_ X509_ BADCERT_ NOT_ TRUSTED OpenJDK: PKIX_ PATH_ VALIDATION_ FAILED, NOT_ A_ CA_ CERTIFICATE

 X509_ V_ ERR_ PATH_ LENGTH_ EXCEEDED   

 X509_ V_ ERR_ UNHANDLED_ CRITICAL_ EXTENSION    Figure 5: Detailed information about X509_V_ERR_INVALID_CA, an OpenSSL X509_ V error,_ ERR_ UN asHAN publishedDLED_ CRITIC onAL_ Cx509errors.orgRL_ EXTENSION .

5.2 Developer guides

An additional section of the website contains guides intended for use by developers (Figure 6). As of now, three guides aimed at implementing basic client-side TLS connection are available, based on the source code described in Section 3.2.4. Specifically, we cover OpenSSL, GnuTLS and Mbed TLS.

Figure 6: Introduction to the Mbed TLS guide on establishing client-side TLS connection, as published on x509errors.org.

Similar resources and their drawbacks. Online resources that provide guidance during implementation of TLS connections do exist, but they suffer from the following issues:

1. Difficult navigability. The original documentation of TLS libraries usually explains the functionality of individual functions within their APIs rather well. However, the documentation may be difficult to navigate for a developer that has never implemented TLS before [39]. Such a developer may struggle to determine the correct order of functions to use. Moreover, they may fail to enable important settings, such as the whole certificate validation procedure. This may lead to functional, but not necessarily secure implementations. Some libraries do also provide step-by-step TLS guides. GnuTLS documentation contains a structured multi-page manual on how to use TLS in applications [42]. However, it lists a vast array of options for each step, which may be overwhelming [39]. The same holds for the Security Developer’s Guide [77] by Oracle, which is simply too extensive to navigate easily.

2. Obsolescence. The interfaces of TLS libraries change over time, and thus some resources are becoming obsolete. However useful they may have been in the past, not being recent enough is a major issue. Counterintuitively, even official documentation is outdated in some cases. In 2012, the Fedora Security Team published a concise guide on implementing TLS clients in four TLS libraries [87], as a part of their Defensive Coding document. Since the guide has not been updated in the last 7 years, it contains deprecated function calls. Yet, we base the content and structure of our guides on this work, because it is well written. Similarly, OpenSSL contains a structured TLS client example on its wiki page [81]. Parts of it have been updated over time, but it still leaves hostname validation as a manual “exercise for the reader”. This is in spite of the fact that hostname validation can be enabled by calling a single function in the latest OpenSSL API.

3. Inaccuracy or incompleteness. Together with obsolescence, inaccuracy and incompleteness are the main problems of resources commonly found on internet forums. Research by Acar et al. [2] suggests that it is possible and efficient to write code using sources such as Stack Overflow [82], but it leads to less secure code, as compared to documentation. In the case of Mbed TLS, incomplete examples are present in the documentation, too. Its tutorial [14] presents an overview of upgrading “insecure” TCP to “secure” TLS, but explicitly disables certificate validation, purely noting that this approach “should not be used in full applications”.

4. Missing explanation. When other options fail, developers may turn to a different type of resource: existing source code. For instance, OpenSSL and Botan refer to the source code of their command line tools as “example code” [81, 37], yet the code itself is scarce in comments or explanations. As a result, developers do not acquire information about which commands and steps are necessary, and which are optional. Moreover, the code contains a lot of overhead, such as command line argument parsing, which makes it less readable.

With our guides, we aim to mitigate the listed problems by being concise, yet accurate and up to date. All three guides comprise the following features, as seen in Figure 7:

1. Step-by-step manual. Each guide is visually split into multiple consecutive sections to ease navigation. Together, the sections cover all of the necessary steps in establishing TLS connection. Online revocation checking is not yet covered (except for OCSP stapling in GnuTLS), for reasons explained in Section 3.2.4.

2. Explanation of each step. All steps are accompanied with at least a single descriptive line, which explains their purpose. Introductions to each section provide a more high-level description of the steps which follow. For some critical steps, we also briefly explain why they are necessary from a security perspective.


Figure 7: A single optional section from the Mbed TLS guide, as published on the website x509errors.org. The section deals with the process of retrieving the certificate validation result.

3. Optional and alternative settings. If various alternative approaches are possible, we list all of them, and explain the use cases of each one. Optional steps, which may increase security, are present as well. Both of these are visually distinguished from standard sections.

4. Links to documentation and relevant standards. Importantly, links to original documentation are provided for each function call. Additional resources, including standards and RFCs, are also linked when relevant. This way, the developer ought to immediately know where to look for more details.

Similarly to the error documentation, these guides do not intend to replace any of the previously listed resources. Instead, they cover the most important steps, further prompting the reader to look into the official library documentation when necessary.

6 Related work

This thesis touches upon a rather broad spectrum of topics. Thus, there are plenty of existing efforts which intersect our work. This chapter presents the relevant research, splitting it into three sections: erroneous certificate datasets, improving public key infrastructure, and usability of TLS implementations. Within each section, we situate our work amongst the existing research.

6.1 Malformed certificates

Validating sets of erroneous certificates is not a novel technique of testing. Most libraries that implement certificate validation include diverse testing datasets themselves [18, 38, 66]. Nonetheless, research clearly shows that these do not cover all possible certificate flaws.

Certificate mutation. The idea to “mutate” certificates first appears in the work of Brubaker et al. [19]. Their paper proposes a way to automatically generate synthetic certificates, so-called “frankencerts”. These certificates are constructed by randomly combining parts of existing “seed” certificates. Frankencerts were successfully used to uncover flaws in libraries that implement certificate validation. Inspired by the work of Brubaker, several other authors employ elaborate techniques to produce more certificate discrepancies with fewer generated certificates:

• Y. Chen and Su [26] approach mutation using MCMC sampling (see footnote 1);
• Kleine and Simos [49] use advanced combinatorial methods;
• Zhu et al. [89] utilize source code coverage data;
• Chao Chen et al. [24] employ methods of deep learning;
• Chau et al. [23] use techniques of symbolic execution;
• Chu Chen et al. [25] extract rules from RFCs and purposely disobey them.

1. Markov chain Monte Carlo (MCMC) sampling is an effective probabilistic method of drawing unique data samples from an unknown distribution [69].

All listed authors create their datasets in a more elaborate way than us. However, the purpose of their research differs greatly from ours, too. The papers aim at constructing incorrectly accepted and incorrectly rejected certificates, while our main intention is to construct correctly rejected certificates and compare the error messages they yield. It would be possible to use these datasets to compare certificate validation errors among libraries, yet our manual technique enables us to induce specific error messages more precisely and more reliably. With their “RFCcerts”, Chu Chen et al. get the closest to our work, since they generate non-conforming certificates according to rules extracted from Requests for Comments, i.e. not randomly. Hence, their approach might be the most suitable to employ in our future work.

Online test suites. Some erroneous certificate datasets are accessible online too. A Chromium project, badssl.com [13], contains subdomains configured with wrong certificate chains. It is intended for manual testing of TLS clients in browsers and non-browser software alike. In a very similar fashion, bettertls.com [17] provides a test suite specifically aimed at flaws related to the Name Constraints extension. As opposed to our work, these two resources only focus on the most common and most harmful certificate errors.

6.2 Improving certificate infrastructure

New techniques are being proposed to enhance the security levels of the public key infrastructure. The research is wide in this area, dealing with public certificate logs, implementation of certificate validation, and evolving methods of certificate revocation.

Certificate transparency. Multiple certificate authorities were compromised in recent years. In reaction to these incidents, Laurie [51] has established Certificate Transparency (CT) [21, 52] to mitigate their negative consequences. CT is intended to publicly log information about issued TLS certificates. In an ideal case, all issued TLS certificates would be logged in CT and all TLS clients would demand CT data during validation. Thus, if a certificate authority gets compromised, all wrongly issued certificates are deemed invalid or revoked.

Certificate transparency is slowly paving its way into TLS libraries. Implementing CT checks would enhance non-browser TLS security similarly as in the case of browsers, where it is already adopted.

Certificate validation improvements. In addition to the certificate mutation research, further effort has been made to identify issues within real-world implementations of certificate validation. Barenghi et al. [16] construct a detailed X.509 certificate parser. They remark that most implementations of certificate validation intertwine parsing and semantic validation, instead of separating the two. As a result, commonly used libraries often accept syntactically malformed certificates. Detaching the parsing into an independent procedure would likely slow down the validation process as a whole, but it would make the process less prone to error. Moreover, it would enable complete separation of syntactical error messages from the semantical ones, which is one of the improvements that we suggest in Section 4.4.1.

Revocation checking. As is apparent from Section 3.2.4, revocation checking is very difficult to implement. Since it introduces significant overhead, TLS libraries rarely do it by default. Furthermore, some of the relevant standards [31, 41, 75] are quite recent, and thus real-world implementations currently fall behind. However, research is still in progress, focusing on more scalable approaches. Larisch et al. [50] present CRLite [30], a light-weight CRL that is already in use within Firefox Nightly. Their solution transfers overhead from OCSP responders onto separate servers that make use of Certificate Transparency logs. Instead of obtaining revocation data during TLS handshakes, clients periodically download compact revocation lists, thus reducing TLS latency. More recently, Smith et al. [78] address the scalability concerns of current revocation methods by proposing certificate revocation vectors.
These vectors contain the revocation status of each certificate in a single bit. Similarly to CRLite, they would be periodically pushed to clients to minimize the overhead in TLS handshakes.

Finally, including any form of revocation checking by default would make TLS libraries more secure. Moreover, it would improve their usability for those developers who must implement revocation checks.

6.3 X.509 and TLS usability

In the past, research in usable security was aimed primarily at actions of end users [4, 5, 32]. However, the impact of developers’ choices is usually much larger. This section lists related usable security research concerning APIs and documentation, especially within TLS.

Georgiev et al. [35] manage to discover numerous vulnerabilities in non-browser certificate validation. Most of the vulnerabilities are consequences of poorly designed APIs of TLS libraries. Examples include unsafe default ciphers or disabled hostname validation. Backed by such discoveries, Green and Smith [40] are among the first to publicly call for improvement of security APIs. They construct a set of recommendations, including 10 principles to follow in order to make these APIs usable.

Acar et al. [2] conduct research to determine whether there is some correlation between the type of resources that developers use, and the security of the code they write. They conclude that documentation is the most reliable resource, but developers more often turn to online forums. Thus, there exists a demand for usable and quality documentation. In another study, Acar et al. [1] evaluate the usability of different low-level cryptographic APIs, concluding that developers often write insecure code without realizing it. Moreover, their results show that improved documentation can promote security.

In their master's theses, Grabovský [39] and Armknecht [8] both conduct similar studies in parallel, focusing instead on higher-level TLS APIs, such as the ones employed in Chapter 3. Their findings are in sync with those of Acar et al., pointing to incomplete and unclear documentation.

Our work builds upon a study administered by Ukrop et al. [84], which assesses the understandability of certificate validation error messages. Its outcomes suggest that re-wording the messages and their documentation can have significant effects on perceived security risks, and thus overall security.

7 Conclusion

We created a system for comparing errors that occur during validation of X.509 certificate chains. Using the system, we assessed five sets of errors from commonly used TLS libraries. We compared these sets with each other, linking errors with equivalent meanings, and thus established an error correspondence relation among the libraries.

The comparison is intended to ease developers’ transition from one TLS library to another. More importantly, it should allow for writing more effective and library-independent documentation of certificate validation errors. That is, proper and detailed documentation (which proves to be crucial for security but currently does not exist) would not have to be written for each TLS library separately. Based on the comparison, we suggest possible improvements of the error ecosystem.

Two “byproducts” were created throughout the process, though they are no less significant. Firstly, we crafted a set of over 60 erroneous certificate chains. These were necessary to induce certificate validation errors, and they managed to do so for 97 distinct errors over the five libraries. Secondly, we implemented a basic client-side TLS connection in all five libraries. We accessed certificate validation APIs only through these clients, so as to replicate the validation errors within a realistic certificate usage scenario.

The error comparison results, certificate chains, and TLS source code are all published on x509errors.org in an appropriate form. The system can be easily extended with new chains and libraries alike.
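The idea of an error correspondence relation can be illustrated with a minimal Python sketch. It assumes that two errors are linked whenever different libraries report them for the same certificate chain; the actual mapping script may apply different criteria. The chain names and the observation data below are illustrative only and are not the results of this thesis, although the error identifiers themselves exist in the respective libraries.

```python
from collections import defaultdict

# Illustrative observations only (NOT the thesis results): for each test
# chain (hypothetical names), the error each library is assumed to report.
observations = {
    "expired-leaf": {
        "OpenSSL": "X509_V_ERR_CERT_HAS_EXPIRED",
        "GnuTLS": "GNUTLS_CERT_EXPIRED",
        "mbedTLS": "MBEDTLS_X509_BADCERT_EXPIRED",
    },
    "not-yet-valid-leaf": {
        "OpenSSL": "X509_V_ERR_CERT_NOT_YET_VALID",
        "GnuTLS": "GNUTLS_CERT_NOT_ACTIVATED",
        "mbedTLS": "MBEDTLS_X509_BADCERT_FUTURE",
    },
}


def correspondence(observations):
    """Link errors from different libraries that occur on the same chain."""
    links = defaultdict(set)
    for per_library in observations.values():
        items = list(per_library.items())
        for lib_a, err_a in items:
            for lib_b, err_b in items:
                if lib_a != lib_b:
                    # The relation is symmetric: both directions get added.
                    links[(lib_a, err_a)].add((lib_b, err_b))
    return links


links = correspondence(observations)
for target in sorted(links[("OpenSSL", "X509_V_ERR_CERT_HAS_EXPIRED")]):
    print(target)
```

Keying the relation by (library, error) pairs keeps it symmetric and makes it straightforward to add a further library by extending the observation data.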

7.1 Limitations

Throughout the work, we have encountered multiple problems and limitations of varied severity. We list them here.

Unknown error data. While it is possible to obtain the set of all errors in most libraries, we did not manage to do so in the case of OpenJDK. OpenJDK raises errors as exceptions, but the set of all exceptions and their messages is difficult to determine. The same might happen with libraries added in the future. Yet, the problem is not unsolvable and could be settled by a thorough analysis of the libraries’ source code.
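A first step of such a source code analysis could look like the following Python sketch. It is only an assumption about how the analysis might start and was not part of this work: it scans Java source text for throw statements whose first argument is a string literal. The Java snippet and its messages are hypothetical and serve only to exercise the function.

```python
import re

# Matches `throw new SomeException("message"` in Java source text and
# captures the exception class and the leading string literal.
THROW_RE = re.compile(
    r'throw\s+new\s+(\w+Exception)\s*\(\s*"((?:[^"\\]|\\.)*)"'
)


def collect_exceptions(java_source):
    """Return a sorted list of (exception class, message literal) pairs."""
    return sorted(set(THROW_RE.findall(java_source)))


# Hypothetical Java snippet, used only to exercise the function.
sample = '''
if (expired) {
    throw new CertificateExpiredException("NotAfter: " + date);
}
throw new CertPathValidatorException("basic constraints check failed");
'''

found = collect_exceptions(sample)
for exception_class, message in found:
    print(exception_class, repr(message))
```

The sketch captures only the leading literal of a message. Messages assembled from variables or passed through helper methods would still require manual inspection.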

Difficult revocation checking. Some validation checks, especially the ones regarding revocation, have proven difficult to implement in most TLS libraries. Hence, only offline revocation checking is implemented so far. It is questionable how many applications actually implement revocation checking, given the default settings and possibilities provided by the libraries. This should be a subject of further research.

Ambiguous mapping. Lastly, the usefulness of the established error mapping is limited, given the vast differences between libraries and the ways they deal with certificate flaws. We were not able to construct a truly unambiguous correspondence, simply because the cross-library error relations are too complicated.
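The ambiguity can be made concrete with a short Python sketch. All error names are placeholders for two hypothetical libraries A and B, and the pairs are invented for illustration: an error of library A is flagged as ambiguous when it corresponds to more than one error of library B.

```python
from collections import defaultdict

# Hypothetical pairs (error in library A, error in library B) assumed to be
# observed on the same certificate chain; every name is a placeholder.
pairs = [
    ("A_ERR_UNTRUSTED", "B_ERR_NO_ISSUER"),
    ("A_ERR_UNTRUSTED", "B_ERR_SELF_SIGNED"),
    ("A_ERR_EXPIRED", "B_ERR_EXPIRED"),
]


def ambiguous(pairs):
    """Return the A-errors that correspond to more than one B-error."""
    targets = defaultdict(set)
    for err_a, err_b in pairs:
        targets[err_a].add(err_b)
    return {a: sorted(bs) for a, bs in targets.items() if len(bs) > 1}


print(ambiguous(pairs))
```

In this invented data, one coarse error of library A covers two finer-grained errors of library B, so no one-to-one correspondence exists between the two libraries.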

7.2 Future work

There are multiple ways in which our work can be built upon. Here, we suggest possible directions for future research and work, in arbitrary order.

Fixing issues in TLS libraries. We plan to report possible bugs in certificate validation implementations, as listed in Section 4.4. If the issues are confirmed, pull requests should also be made to fix them.

Further improving the resources. The resources we publish can still be improved. In terms of quantity, new certificate chains, new libraries and new developer guides (e.g. concerning server-side TLS connection) can be added. In terms of quality, a parallel effort is underway in the Usable X.509 errors project, with the aim of writing detailed and usable error documentation, as previously stated.

Advanced certificate generation. To increase the coverage of library errors, we might evaluate whether it would be possible to employ more elaborate methods of creating malformed certificate chains. The work listed in Section 6.1 may serve as an inspiration.

Error taxonomy redesign. Lastly, a discussion should be started within the wider security community regarding the current chaos among the errors. A major coordinated redesign of the error ecosystem may help developers in the long run.

References

[1] Yasemin Acar et al. “Comparing the Usability of Cryptographic APIs”. In: 2017 IEEE Symposium on Security and Privacy (SP). 2017, pp. 154–171. doi: 10.1109/SP.2017.52.
[2] Yasemin Acar et al. “You Get Where You’re Looking for: The Impact of Information Sources on Code Security”. In: 2016 IEEE Symposium on Security and Privacy (SP). 2016, pp. 289–305. doi: 10.1109/SP.2016.25.
[3] Mustafa Emre Acer et al. “Where the Wild Warnings Are: Root Causes of Chrome HTTPS Certificate Errors”. In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. 2017, pp. 1407–1420. doi: 10.1145/3133956.3134007.
[4] Anne Adams and Martina Angela Sasse. “Users Are Not the Enemy”. In: Communications of the ACM 42.12 (1999), pp. 40–46. doi: 10.1145/322796.322806.
[5] Devdatta Akhawe and Adrienne Porter Felt. “Alice in Warningland: A Large-Scale Field Study of Browser Security Warning Effectiveness”. In: 22nd USENIX Security Symposium. 2013, pp. 257–272. url: https://usenix.org/conference/usenixsecurity13/technical-sessions/presentation/akhawe.
[6] Allow certificates with Basic Constraints CA:false, pathlen:0. Pull Request at GitHub. url: https://github.com/openssl/openssl/pull/11463 (visited on 05/23/2021).
[7] Sebastien Andrivet. python-asn1. GitHub repository. url: https://github.com/andrivet/python-asn1 (visited on 05/23/2021).
[8] Jonathan Blake Armknecht. “A Developer Usability Study of TLS Libraries”. MA thesis. Brigham Young University, Provo, 2020. url: https://scholarsarchive.byu.edu/etd/8685/ (visited on 05/23/2021).
[9] ASN Lab. ASN Lab. url: https://asnlab.org/ (visited on 05/23/2021).
[10] ASN.1 and LTE/5G Tools and Services. OSS Nokalva, Inc. url: https://www.oss.com/ (visited on 05/23/2021).

[11] asn1 - The Go Programming Language. Go package. Google Inc. url: https://golang.org/pkg/encoding/asn1/ (visited on 05/23/2021).
[12] asyncio — Asynchronous I/O. Python library. The Python Software Foundation. url: https://docs.python.org/3/library/asyncio.html (visited on 05/23/2021).
[13] badssl.com. The Chromium Project. url: https://badssl.com (visited on 05/23/2021).
[14] Paul Bakker. mbed TLS tutorial - Knowledge base. ARM. url: https://tls.mbed.org/kb/how-to/mbedtls-tutorial (visited on 05/23/2021).
[15] Michaela Balážová. “Usable documentation for certificate validation errors”. MA thesis. Masaryk University, Faculty of Informatics, Brno, 2020. url: https://is.muni.cz/th/e0wem/ (visited on 05/23/2021).
[16] Alessandro Barenghi, Nicholas Mainardi, and Gerardo Pelosi. “Systematic Parsing of X.509: Eradicating Security Issues with a Parse Tree”. In: Journal of Computer Security 26 (2018), pp. 1–33. doi: 10.3233/JCS-171110.
[17] BetterTLS: Name Constraints. Netflix. url: https://bettertls.com/ (visited on 05/23/2021).
[18] botan/src/tests/data/x509/x509test at master. GitHub repository. url: https://github.com/randombit/botan/tree/master/src/tests/data/x509/x509test (visited on 05/23/2021).
[19] Chad Brubaker et al. “Using Frankencerts for Automated Adversarial Testing of Certificate Validation in SSL/TLS Implementations”. In: 2014 IEEE Symposium on Security and Privacy. 2014, pp. 114–129. doi: 10.1109/SP.2014.15.
[20] Censys Certificate search. Censys. url: https://censys.io/certificates (visited on 05/23/2021).
[21] Certificate Transparency. Google. url: https://certificate.transparency.dev (visited on 05/23/2021).
[22] certtool Invocation. url: https://gnutls.org/manual/html_node/certtool-Invocation.html (visited on 05/23/2021).
[23] Sze Yiu Chau et al. “SymCerts: Practical Symbolic Execution for Exposing Noncompliance in X.509 Certificate Validation Implementations”. In: 2017 IEEE Symposium on Security and Privacy (SP). 2017, pp. 503–520. doi: 10.1109/SP.2017.40.

[24] Chao Chen et al. “DRLgencert: Deep Learning-Based Automated Testing of Certificate Verification in SSL/TLS Implementations”. In: 2018 IEEE International Conference on Software Maintenance and Evolution (ICSME). 2018, pp. 48–58. doi: 10.1109/ICSME.2018.00014.
[25] Chu Chen et al. “RFC-Directed Differential Testing of Certificate Validation in SSL/TLS Implementations”. In: 2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE). 2018, pp. 859–870. doi: 10.1145/3180155.3180226.
[26] Yuting Chen and Zhendong Su. “Guided Differential Testing of Certificate Validation in SSL/TLS Implementations”. In: Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering. 2015, pp. 793–804. doi: 10.1145/2786805.2786835.
[27] Code Signing. Certificate Authority Security Council (Public Key Infrastructure Consortium). url: https://pkic.org/uploads/2013/10/CASC-Code-Signing.pdf (visited on 05/23/2021).
[28] Command Line Interface — Botan (TLS Server/Client). url: https://botan.randombit.net/handbook/cli.html#tls-server-client (visited on 05/23/2021).
[29] David Cooper et al. Internet X.509 Public Key Infrastructure Certificate and Certificate Revocation List (CRL) Profile. RFC 5280. RFC Editor, 2008. doi: 10.17487/RFC5280.
[30] crlite. GitHub repository. Mozilla. url: https://github.com/mozilla/crlite (visited on 05/23/2021).
[31] Donald Eastlake. Transport Layer Security (TLS) Extensions: Extension Definitions. RFC 6066. RFC Editor, 2011. doi: 10.17487/RFC6066.
[32] Serge Egelman, Lorrie Faith Cranor, and Jason Hong. “You’ve Been Warned: An Empirical Study of the Effectiveness of Web Browser Phishing Warnings”. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 2008, pp. 1065–1074. doi: 10.1145/1357054.1357219.
[33] Helder Eijs. PyCryptodome. url: https://pycryptodome.org (visited on 05/23/2021).
[34] Roy Fielding and Julian Reschke. Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content: Response Status Codes. RFC 7231. RFC Editor, 2014. doi: 10.17487/RFC7231.

[35] Martin Georgiev et al. “The most dangerous code in the world: validating SSL certificates in non-browser software”. In: Proceedings of the 2012 ACM conference on Computer and communications security. 2012, pp. 38–49. doi: 10.1145/2382196.2382204.
[36] Ed Gerck. “Overview of Certification Systems: X.509, PKIX, CA, PGP & SKIP”. In: The Bell Newsletter ISSN 1530-048X, Vol. 1 (2000), pp. 3–8. doi: 10.13140/RG.2.1.1274.2489.
[37] Getting started — Botan. url: https://botan.randombit.net/handbook/ (visited on 05/23/2021).
[38] gnutls/tests/certs at master. GitLab repository. url: https://gitlab.com/gnutls/gnutls/-/tree/master/tests/certs (visited on 05/23/2021).
[39] Matěj Grabovský. “Usability Analysis of TLS API Documentation”. MA thesis. Masaryk University, Faculty of Informatics, Brno, 2020. url: https://is.muni.cz/th/dw1iw/ (visited on 05/23/2021).
[40] Matthew Green and Matthew Smith. “Developers are Not the Enemy!: The Need for Usable Security APIs”. In: IEEE Security & Privacy 14.5 (2016), pp. 40–46. doi: 10.1109/MSP.2016.111.
[41] Phillip Hallam-Baker. X.509v3 Transport Layer Security (TLS) Feature Extension. RFC 7633. RFC Editor, 2015. doi: 10.17487/RFC7633.
[42] How to use GnuTLS in applications. url: https://gnutls.org/manual/html_node/How-to-use-GnuTLS-in-applications.html (visited on 05/23/2021).
[43] http.server — HTTP servers. Python library. The Python Software Foundation. url: https://docs.python.org/3/library/http.server.html (visited on 05/23/2021).
[44] “IEEE Standard for Information Technology: Portable Interface (POSIX(TM)) Base Specifications, Issue 7”. In: IEEE Std 1003.1-2017 (Revision of IEEE Std 1003.1-2008) (2018), pp. 1–3951. doi: 10.1109/IEEESTD.2018.8277153.
[45] Information technology - Abstract Syntax Notation One (ASN.1): Specification of basic notation. ITU Recommendation X.680. International Telecommunication Union, 2021. doi: 11.1002/1000/14468.

[46] Information technology - ASN.1 encoding rules: Specification of Basic Encoding Rules (BER), Canonical Encoding Rules (CER) and Distinguished Encoding Rules (DER). ITU Recommendation X.690. International Telecommunication Union, 2021. doi: 11.1002/1000/14472.
[47] Information technology – Procedures for the operation of object identifier registration authorities: General procedures and top arcs of the international object identifier tree. ITU Recommendation X.660. International Telecommunication Union, 2011. doi: 11.1002/1000/11336.
[48] Introduction to ASN.1. International Telecommunication Union. url: https://itu.int/en/ITU-T/asn1/Pages/introduction.aspx (visited on 05/23/2021).
[49] Kristofer Kleine and Dimitris E. Simos. “Coveringcerts: Combinatorial Methods for X.509 Certificate Testing”. In: 2017 IEEE International Conference on Software Testing, Verification and Validation (ICST). 2017, pp. 69–79. doi: 10.1109/ICST.2017.14.
[50] James Larisch et al. “CRLite: A Scalable System for Pushing All TLS Revocations to All Browsers”. In: 2017 IEEE Symposium on Security and Privacy (SP). 2017, pp. 539–556. doi: 10.1109/SP.2017.17.
[51] Ben Laurie. “Certificate Transparency: Public, Verifiable, Append-Only Logs”. In: Queue 12.8 (2014), pp. 10–19. doi: 10.1145/2668152.2668154.
[52] Ben Laurie, Adam Langley, and Emilia Kasper. Certificate Transparency. RFC 6962. RFC Editor, 2013. doi: 10.17487/RFC6962.
[53] Libcrypto API - OpenSSLWiki. url: https://wiki.openssl.org/index.php/Libcrypto_API (visited on 05/23/2021).
[54] John Linn. Privacy Enhancement for Internet Electronic Mail: Part I: Message Encryption and Authentication Procedures. RFC 1421. RFC Editor, 1993. doi: 10.17487/RFC1421.
[55] Yabing Liu et al. “An End-to-End Measurement of Certificate Revocation in the Web’s PKI”. In: Proceedings of the 2015 Internet Measurement Conference. 2015, pp. 183–196. doi: 10.1145/2815675.2815685.
[56] Jack Lloyd. Botan: Crypto and TLS for Modern C++. url: https://botan.randombit.net/ (visited on 05/23/2021).


[57] Niels Möller. Nettle - a low-level cryptographic library. url: https://lysator.liu.se/~nisse/nettle/ (visited on 05/23/2021).
[58] Erik Moqvist. asn1tools. GitHub repository. url: https://github.com/eerimoq/asn1tools (visited on 05/23/2021).
[59] Moni Naor and Kobbi Nissim. “Certificate revocation and certificate update”. In: IEEE Journal on Selected Areas in Communications 18.4 (2000), pp. 561–570. doi: 10.1109/49.839932.
[60] Matus Nemec et al. “Measuring Popularity of Cryptographic Libraries in Internet-Wide Scans”. In: Proceedings of the 33rd Annual Computer Security Applications Conference. 2017, pp. 162–175. doi: 10.1145/3134600.3134612.
[61] OID Repository. url: http://oid-info.com/#oid (visited on 05/23/2021).
[62] Paul C van Oorschot. Computer security and the internet: Tools and jewels. 1st ed. Springer Nature, 2020. Chap. Public-Key Certificate Management and Use Cases. doi: 10.1007/978-3-030-33649-3.
[63] OpenJDK. Oracle Corporation. url: https://openjdk.java.net/ (visited on 05/23/2021).
[64] OpenSSL – Cryptography and SSL/TLS toolkit. OpenSSL Software Foundation. url: https://openssl.org/ (visited on 05/23/2021).
[65] openssl-ocsp. OpenSSL Software Foundation. url: https://openssl.org/docs/man1.1.1/man1/openssl-ocsp.html (visited on 05/23/2021).
[66] openssl/test/certs at master. GitHub repository. url: https://github.com/openssl/openssl/tree/master/test/certs (visited on 05/23/2021).
[67] Yngve Pettersen. The Transport Layer Security (TLS) Multiple Certificate Status Request Extension. RFC 6961. RFC Editor, 2013. doi: 10.17487/RFC6961.
[68] Radim Podola. “Quantitative analysis of TLS certificate validity on the Internet”. MA thesis. Masaryk University, Faculty of Informatics, Brno, 2020. url: https://is.muni.cz/th/l47tp/ (visited on 05/23/2021).

[69] Don van Ravenzwaaij, Peter Cassey, and Scott Brown. “A simple introduction to Markov Chain Monte–Carlo sampling”. In: Psychonomic Bulletin & Review 25 (2016), pp. 143–154. doi: 10.3758/s13423-016-1015-8.
[70] “Regulation (EU) No 910/2014 of the European Parliament and of the Council of 23 July 2014 on electronic identification and trust services for electronic transactions in the internal market and repealing Directive 1999/93/EC”. In: OJ L 257 (2014), pp. 73–114. url: https://eur-lex.europa.eu/eli/reg/2014/910/oj (visited on 05/23/2021).
[71] Eric Rescorla. The Transport Layer Security (TLS) Protocol Version 1.3. RFC 8446. RFC Editor, 2018. doi: 10.17487/RFC8446.
[72] Eric Rescorla and Nagendra Modadugu. Datagram Transport Layer Security Version 1.2. RFC 6347. RFC Editor, 2012. doi: 10.17487/RFC6347.
[73] Tim Rühsen et al. The GnuTLS Transport Layer Security Library. url: https://gnutls.org/ (visited on 05/23/2021).
[74] Peter Saint-Andre and Jeff Hodges. Representation and Verification of Domain-Based Application Service Identity within Internet Public Key Infrastructure Using X.509 (PKIX) Certificates in the Context of Transport Layer Security (TLS). RFC 6125. RFC Editor, 2011. doi: 10.17487/RFC6125.
[75] Stefan Santesson et al. X.509 Internet Public Key Infrastructure Online Certificate Status Protocol - OCSP. RFC 6960. RFC Editor, 2013. doi: 10.17487/RFC6960.
[76] Jim Schaad, Blake Ramsdell, and Sean Turner. Secure/Multipurpose Internet Mail Extensions (S/MIME) Version 4.0 Message Specification. RFC 8551. RFC Editor, 2019. doi: 10.17487/RFC8551.
[77] Security Developer’s Guide (Java Platform, Standard Edition). Oracle. url: https://docs.oracle.com/en/java/javase/16/security/security-developer-guide.pdf (visited on 05/23/2021).
[78] Trevor Smith, Luke Dickenson, and Kent Seamons. “Let’s Revoke: Scalable Global Certificate Revocation”. In: Network and Distributed System Security Symposium. 2020. doi: 10.14722/ndss.2020.24084.

[79] ssl — TLS/SSL wrapper for socket objects. Python library. The Python Software Foundation. url: https://docs.python.org/3/library/ssl.html (visited on 05/23/2021).
[80] SSL Library mbedTLS / PolarSSL. ARM. url: https://tls.mbed.org/ (visited on 05/23/2021).
[81] SSL/TLS Client - OpenSSLWiki. url: https://wiki.openssl.org/index.php/SSL/TLS_Client (visited on 05/23/2021).
[82] Stack Overflow. Stack Exchange Inc. url: https://stackoverflow.com/ (visited on 05/23/2021).
[83] Steven Tuecke et al. Internet X.509 Public Key Infrastructure (PKI) Proxy Certificate Profile. RFC 3820. RFC Editor, 2004. doi: 10.17487/RFC3820.
[84] Martin Ukrop, Lydia Kraus, and Vashek Matyas. “Will You Trust This TLS Certificate? Perceptions of People Working in IT (Extended Version)”. In: Digital Threats: Research and Practice 1.4 (2020). doi: 10.1145/3419472.
[85] Martin Ukrop et al. Usable X.509 Errors. Centre for Research on Cryptography and Security, Masaryk University, Brno. url: https://x509errors.org (visited on 05/23/2021).
[86] Lev Walkin. Open Source ASN.1 Compiler. url: http://lionet.info/asn1c/compiler.html (visited on 05/23/2021).
[87] Florian Weimer. TLS Clients (Defensive coding). Fedora Security Team. url: https://docs.fedoraproject.org/en-US/Fedora_Security_Team/1/html/Defensive_Coding/sect-Defensive_Coding-TLS-Client.html (visited on 05/23/2021).
[88] x509 - The Go Programming Language. Go package. Google Inc. url: https://golang.org/pkg/crypto/x509 (visited on 05/23/2021).
[89] Jiayu Zhu et al. “Guided, Deep Testing of X.509 Certificate Validation via Coverage Transfer Graphs”. In: 2020 IEEE International Conference on Software Maintenance and Evolution (ICSME). 2020, pp. 243–254. doi: 10.1109/ICSME46990.2020.00032.

Appendix: Source code and build

The attached archive contains the entire Usable X.509 errors repository, as available on https://github.com/crocs-muni/usable-cert-validation. Only certain parts of the repository contain my authorial work, but it is included as a whole so that the system can be built completely. The following tree describes the structure of data and source code related to this thesis.

/
  README.md ...... Readme with instructions for the build.
  Makefile ...... Primary Makefile (Section 4.2).
  _data
    errors ...... Collected error data (Section 4.3.1).
  _guides ...... TLS developer guides (Section 5.2).
  validation
    mapper.py ...... The mapping script (Section 4.3.3).
    servers
      server.py ...... Main TLS server (Section 3.2.5).
    clients ...... TLS clients (Section 3.2.4).
    certs
      asn ...... ASN.1 specification files (Section 2.2.6).
      utils ...... Helper Python libraries (Section 2.2.6).
      scripts ...... Certificate-generating scripts (Section 2.2.6).

The system is built according to the steps in README.md. Because of its numerous dependencies, it runs only on Linux. Once the system is built, the following data are created.

/
  validation
    certs
      build ...... Built certificate chains (Section 2.2.5).
    results ...... Unprocessed results (Section 4.3.2).
  _data
    mapping ...... Final error mapping data (Section 4.3.3).

When built with make local, the website is automatically run locally. Alternatively, it runs online on https://x509errors.org.
