<<

The Broken Shield: Measuring Revocation Effectiveness in the Windows Code-Signing PKI Doowon Kim and Bum Jun Kwon, University of Maryland, College Park; Kristián Kozák, Masaryk University, Czech Republic; Christopher Gates, Symantec; Tudor Dumitras, University of Maryland, College Park https://www.usenix.org/conference/usenixsecurity18/presentation/kim

This paper is included in the Proceedings of the 27th USENIX Security Symposium. August 15–17, 2018 • Baltimore, MD, USA ISBN 978-1-931971-46-1

Open access to the Proceedings of the 27th USENIX Security Symposium is sponsored by USENIX. The Broken Shield: Measuring Revocation Effectiveness in the Windows Code-Signing PKI

Doowon Kim Bum Jun Kwon Kristian´ Kozak´ University of Maryland University of Maryland Masaryk University

Christopher Gates Tudor Dumitras, Symantec Research Labs University of Maryland

Abstract code. A common security policy is to trust executables that carry valid signatures from unsuspicious publishers. Recent measurement studies have highlighted security The premise for trusting these executables is that the threats against the code-signing public infrastructure signing keys are not controlled by malicious actors. (PKI), such as certificates that had been compromised Unfortunately, anecdotal evidence and recent measure- or issued directly to the authors. The primary ments of the Windows code-signing ecosystem have doc- mechanism for mitigating these threats is to revoke the umented cases of signed malware [8, 9, 12, 23, 26] and abusive certificates. However, the distributed yet closed potentially unwanted programs (PUPs) [1, 13, 17, 28], nature of the code signing PKI makes it difficult to evalu- where the trusted certificates were either compromised ate the effectiveness of revocations in this ecosystem. In or issued directly to the malware authors. The pri- consequence, the magnitude of signed malware threat is mary defense against these threats is to revoke the cer- not fully understood. tificates involved in the abuse. For the better studied In this paper, we collect seven datasets, including the Web’s PKI, prior measurements have uncovered impor- largest corpus of code-signing certificates, and we com- tant problems with this approach, including long revo- bine them to analyze the revocation process from end to cation delays [6, 29, 30], large bandwidth costs for dis- end. Effective revocations rely on three roles: (1) discov- seminating the revocation information [19], and clients ering the abusive certificates, (2) revoking the certificates that do not check whether certificates are revoked [19]. effectively, and (3) disseminating the revocation infor- In contrast, little is currently known about the effec- mation for clients. We assess the challenge for discover- tiveness of revocations in the code signing PKI. With- ing compromised certificates and the subsequent revoca- out this understanding, platform security protections risk tion delays. We show that erroneously setting revocation making incorrect assumptions about how critical revo- dates causes signed malware to remain valid even after cations are for end-host security and about the practical the certificate has been revoked. We also report failures challenges for implementing effective revocations in the in disseminating the revocations, leading clients to con- code-signing ecosystem. tinue trusting the revoked certificates. Code signing uses a default-valid trust model, where certificate chains remain trusted until proven compro- 1 Introduction mised. Due to this fact, missing or delayed revocations for a certificate involved in abuse allow bad actors to gen- The code-signing Public Key Infrastructure (PKI) is a erate trusted executables until the certificate expires or is fundamental building block for establishing trust in com- successfully added to a revocation list. puter [22]. This PKI allows software publish- Abusive code-signing certificates may also a ers to sign their executables and to embed certificates security threat beyond their expiration dates, which is an that bind the signing keys to the publishers’ real-world important distinction from the Web’s PKI where the ex- identities. In turn, client platforms can verify the signa- piration date limits the use of a compromised certificate tures and check the publishers, to confirm the integrity and also puts a limit on how long a revocation for that of third-party programs and to avoid executing malicious certificate must be maintained. To avoid re-signing and

USENIX Association 27th USENIX Security Symposium 851 Role Finding Implication The mark-recapture estimation for the number of compro- Discovery There might be malware with compromised certificates that mised certificates suggests that even a large AV vendor can of remain a threat for a long time without being detected. Potentially only see about 36.5% of the population. CAs took on average 171.4 days to revoke the compro- Compromised Compromised certificates are not discovered and revoked mised certificates after the malware signed with the cer- Certificates for a long time. tificates appeared in the wild. Setting CAs erroneously set effective revocation dates for 62 cer- Wrong effective revocation date setting results in the sur- Revocation Date tificates, causing 402 signed malware to remain valid. vival of signed malware although its certificates is revoked. Clients have no way to check the revocation status of the 788 certificates contain neither CRLs nor OCSP points. certificates. 13 CRLs and 15 OCSP servers had reachability issues. Dissemination OCSP servers responded with unknown or unauthorized of messages. CAs improperly maintain their CRLs and OCSP servers. Revocation 19 certificates have inconsistent responses from CRLs and Information OCSP; they are valid from OCSP but are revoked in CRLs. Errors in the revocation process are made, and later re- 278 revoked certificates were added and then later removed tracted. CAs misunderstood the code signing PKI and re- from 18 CRLs. moved expired certificates from CRLs.

Table 1: Summary of findings. distributing binaries when a signing certificate expires, files, while soft revocations that set a revocation date af- Windows developers may extend the validity of binaries ter the issuance date may not cover undiscovered signed they release by including a trusted timestamp, provided malware. Moreover, the CAs also must properly main- by a Time-Stamping Authority (TSA), that certifies the tain their revocation infrastructure so that the informa- signing time of a binary. If a malicious binary is cor- tion of compromise can be disseminated to the clients. If rectly signed and timestamped before the expiration date the dissemination is not handled as it should be, it may of the certificate, it will remain trusted even after its cer- reduce the incentives for revoking code signing certifi- tificate expires—unless the certificate is revoked. This cates. These challenges render the code signing ecosys- means that prompt and effective revocations, even of ex- tem opaque and difficult to audit, which contributes to an pired certificates, are critical in the code signing PKI. under-appreciation of the security threats that result from An effective revocation process faces additional chal- ineffective revocations. lenges in the code signing ecosystem. This process in- In this paper, we present an end-to-end measurement volves three roles: (1) discovering certificates that are of certificate revocations in the code signing PKI; in compromised or controlled by malicious actors; (2) re- particular, how effective is the current revocation pro- voking these certificates effectively; and (3) disseminat- cess from discovery to dissemination, and what threats ing the revocation information so that it is broadly avail- are introduced if the process is not properly done. Our able. work extends prior works in the code signing PKI; previ- Unlike in the Web’s PKI, where potentially com- ous studies have focused on signed PUPs [1, 13, 28] and promised certificates can be discovered systematically signed malware [12], but there is no study of code sign- through network scanning [6, 29, 30], in the code sign- ing certificate revocation process yet. Unlike the prior ing PKI this requires discovering signed malware or PUP studies in the Web’s PKI [2, 6, 7, 10, 19] where TLS cer- samples on end-hosts around the world. Security compa- tificate can be collected by scanning the Internet, we are nies involved in this discovery process cannot observe all unable to utilize a comprehensive corpus of code sign- the hosts where a maliciously signed binary may appear. ing certificates since there is no official repository for This also makes it a challenge to detect the total number code signing certificates. To overcome the challenge, of certificates that are actively being used to sign mal- we utilize data sets that are publicly released from prior ware, which leads to an incorrect perception about the research [1, 13] and increase our coverage with Syman- need and urgency of revocations. Even though a signed tec’s internal repository of binary samples. We extract malicious binary is discovered, it is difficult to determine 145,582 unique leaf code signing certificates from the the date when a certificate revocation should become ef- data sets. From the code signing certificates, we also ex- fective. Hard revocations that invalidate the entire life tract 215 Certificate Revocation Lists (CRLs) used only of the certificate may invalidate too many benign signed for code signing certificates, and 131 Online Certificate

852 27th USENIX Security Symposium USENIX Association Status Protocol (OCSP) points. We periodically probe when they are first seen, and periodically after that to the collected CRLs to check their status to collect the re- make sure the certificate is still valid. vocation publication date; the date on which a certificate is revoked by a CA and the revocation information is dis- Authenticode. In the Windows platforms, seminated. Authenticode [21] is the code signing standard designed We highlight the nine findings from our analysis in to digitally sign Windows files including executables the revocation process in the three roles and the result- (.exe), dynamically loaded libraries (.dll), cabinet files ing security implication as depicted in Table 1. To allow (.cab), ActiveX controls (.ctl, and .ocx), catalogs (.cat) the security research community to reproduce and ex- files, etc. The standard relies on Public Key Cryptog- tend our study, we make three data sets publicly available raphy Standard (PKCS) #7 [11] that stores X.509 code at http://signedmalware.org; (1) Revocation information signing certificate chains, X.509 TSA certificate chains, (D2), (2) Revocation Publication Date List (D3), and (3) a , and a hash value of a PE file, with no CRL/OCSP reachability history (D7) 1. encrypted data. In summary, we make the following contributions: (1) we collect a large corpus of code signing certificates and Trusted timestamping. Unlike the Web’s PKI, the code the revocation information, (2) we conduct the first end- signing PKI provides trusted timestamping. Trusted to-end measurement of the code signing certificate revo- timestamping is a way to attest that the code was signed cation process, (3) we use our data to estimate a lower at a specific date and time. The timestamp is issued and bound on the number of compromised certificates, (4) signed by Time Stamping Authority (TSA) during the we highlight the problems in the three parts of the re- signing process. The trusted timestamp guarantees that vocation process as well as new threats that result from the signature is generated within the validity period of a those problems, and (5) we discuss suggestions/recom- certificate to extend the trust in the signed program code mendations to improve the security of the code signing even after the certificate expires. Unfortunately, mal- ecosystem. ware writers also benefit from this mechanism. Properly signed and trusted timestamped malware can be trusted 2 Problem Statement and remain valid even after its certificate expiration date.

In this section, we provide a brief overview of the code Trends of code signing abuse. Digitally signed malware signing PKI, with an emphasis on certificate revocation. can help to bypass some of the protection mechanisms We also discuss the implications of code signing as it for end-users such as Windows’ User Account Control currently exists, and highlight the research questions for (UAC) and some Anti-Virus (AV) engines. Therefore, investigating the effectiveness of the revocation process. malware authors have abused the code signing PKI and signed their malware code with the certificates either 2.1 Code Signing PKI stolen or fraudulently issued to malware authors: for example, , , and [8, 9, 23]. Kim et The code signing PKI provides a mechanism to validate al. [12] presented threat models that emphasize three the authenticity of a software publisher and the integrity types of weaknesses in the code signing PKI: (1) in- of a binary executable. adequate client-side protections, (2) publisher-side key Code signing process. Similar to the Web’s PKI (e.g., mismanagement, and (3) CA-side verification failures. TLS), the software publishers first ask a Certificate Au- Those weakness can breach the trust in the Windows’ thority (CA) to issue code signing certificates based on code signing PKI. the X.509 v3 certificate standard [4], and they use the Moreover, malware authors also use the underground certificates to sign their binary files. In the process of black markets to purchase code signing certificates. Ac- signing a binary file, the hash value is first computed, cording to prior work [14], the certificates are being and then the hash value is digitally signed with the soft- sold at $350–$1,000 for a code signing certificate and at ware publisher’s private key. Finally, the original code $1,600–$3,000 for an EV code signing certificate. Also, is bundled with the signature as well as the public part they reported that about 60% of the compromised certifi- of the code signing certificate. The end users check the cates in their data sets used to sign malware within the validity of the certificates used to sign the program code first month after its issue date. They claimed this finding 1Due to the agreement terms, we are unable to publicize the data as a new evidence of the growing prevalence of certifi- sets collected in the Symantec internal repository. cates issued for abuse.

USENIX Association 27th USENIX Security Symposium 853 2.2 Revocation Process Malware Valid (") Benign code Invalid Certificate revocation is the primary defense against the Not revoked certificate abuse of code signing. CAs are responsible for revok- ing certificates for reasons such as: the private key as- !" !$1 !'1 !'2 !$2 !'3 !# sociated with a certificate is made public, the entity be- Revoked certificate (effective revocation date tr) hind the certificate becomes untrusted, the certificate is used to sign malware even if the source is unknown, or ! ! ! t ! ! ! ! if a certificate is erroneously issued [12]. The revocation " $1 '1 r '2 $2 '3 # process consists of three roles: (1) promptly discovering ("") Revocation Delay compromised certificates, (2) performing an effective re- vocation of the certificate, and (3) disseminating the re- !#Certificate !) Compromise !* Revocation vocation information. expired detected published Discovery of potentially compromised certificates. It is not clearly stated in the requirements [3] who is Figure 1: An example of (i) an effective revocation date responsible for discovering compromised certificates. (tr) that determines the validity of signed malware and However, the notification of abuse often comes exter- (ii) a revocation delay (tp - td) (ti: issue date, te: expira- nally, from Anti-virus (AV) companies, researchers or tion date, tr: effective revocation date, tb: signing date of the companies that own the certificates. Once notified, a benign program, tm: signing date of malware, td: detec- the CAs, who have issued the certificates, are required to tion date, and tp: revocation publication date). When an promptly investigate and revoke the abused certificates. effective revocation date is set at tr, the malware signed at tm1 validates continuously as it was signed before tr. The delay between the initial discovery (td) and the time when the revocation information is made public (i.e., re- vocation publication date (t )) should be as short as pos- p tocol (OCSP) [24]. sible. Figure 1 depicts the case where the discovery hap- • CRLs contain the revocation information (certificate pened after the expiration (t ). Due to trusted timestamp- e serial numbers, (effective) revocation date, revocation ing, the revocation should be performed even after the reason) of certificates that have been revoked. Each expiration date of the certificate. The revocation delay CRL is updated based on their CA’s issuance policy; can be defined as t −t . p d for example, they can be issued when a new revoked Setting the revocation date. Once the CAs confirm the certificate is inserted, or a specific time of day or a abuse, in collaboration with the certificate owners, they day of month. The location of the CRL is specified have to decide the effective revocation date (tr) due to at CRL Distribution Point (CDP) of the X.509 certifi- the trusted timestamping. The effective revocation date cate. Clients have to periodically download the entire determines which binaries will be impacted. Suppose CRL (not just recent changes) to check the latest revo- we have a code signing certificate valid between ti (is- cations. sue date) and te (expiration date). We sign a binary with • OCSP was introduced to resolve the network overhead the certificate during its validity period. If a certificate problems of CRL. Clients can simply query an OCSP is found to be compromised in some way at td (detection server for a certain certificate, which helps mitigate the date), it must be revoked. At this point the CA also must network overhead at the server as well as clients. Au- set tr (effective revocation date) for the certificate. As thority Information Access (AIA), an extension field shown in Figure 1, any binary signed by the certificate in a X.509 certificate specifies OCSP point for each after tr, regardless of the trusted timestamp, will become certificate. invalid. However, a binary signed with a trusted times- The TLS CAs are typically not responsible for providing tamp before tr remains valid. the revocation status of expired certificates. The code Dissemination of revocation information. CAs must signing CAs, however, must maintain and provide the then disseminate the revoked certificate information. Un- revocation information of all certificates that they have like the discovery and setting the revocation date, CAs issued including expired certificates due to the trusted are solely responsible for this part of the revocation pro- timestamp [3, 20]. Since the trusted timestamp extends cess. The two predominant ways to disseminate certifi- the life of a signed binary, CAs must maintain the CRLs cate revocation information are (1) Certificate Revoca- and OCSP in perpetuity to make revocation information tion List (CRL) [4] and (2) Online Certificate Status Pro- always-available for clients.

854 27th USENIX Security Symposium USENIX Association 2.3 Effectiveness of Revocation Process trust certificates when revocation information is unavail- able. In this setting, all signed malicious files can remain In this sub-section we discuss the revocation process. We valid even after the certificate is already revoked if the break this part into four sub-questions: revocation status information is unavailable. Therefore, Q1. How many certificates are being used to sign mal- it is important to check if the revocation information is ware? The revocation process starts from the discovery properly maintained and disseminated by CAs. of compromised certificates. We begin our study by esti- mating the magnitude of the current threat that should be the target of the revocation process. 2.4 Our Goal and Non-Goal Q2. How prompt is the revocation process? When alerted to a certificate problem, CAs have to begin inves- In the Web’s PKI (e.g., TLS), the security issues of cer- tigating the reports within 24 hours and revoke the com- tificate revocation have been well-understood [6, 19, 30]. promised certificates and publish the revocation infor- In contrast, little is known about code signing certifi- mation within seven days or get reasonable cause from cate revocation: in particular, the revocation process (1) the owner of the certificate to delay [3]. The date when promptly discovering compromised certificates, (2) re- the revocation information is available to the public (i.e., voking the compromised certificates effectively, and (3) added to the CRL or OCSP), is defined as revocation disseminating the revocation information. In this paper, publication date (tp). There are currently some report- our goal is to systemically measure the problems in the ing mechanisms in place to allow an outside party, such revocation process and new threats introduced by these as an AV company or researcher, to report misuse of cer- problems. Our non-goals include fully characterizing (1) tificates to CAs [3]. Due to this adhoc process, there may CA’s internal infrastructure problems, (2) their internal be delays from initial evidence of compromise (td) to the revocation policies, and (3) Windows platforms internal revocation published date (tp). revocation checking policies. Q3. Are effective revocation dates set properly? When revoking a certificate, the CA must set the date when the revocation should be considered active (effec- 2.5 Challenges for Measuring Revocation tive revocation date (tr)). Because of the trusted times- tamp, any binary signed with the certificate before the In our study, the challenges for measuring code sign- effective revocation date (t ) is still considered trusted, r ing certificate revocation are (1) visibility and (2) timing. while any file signed and timestamped after the effec- Visibility is an issue because, unlike on the open Inter- tive revocation date (t ) is considered untrusted. Two r net, there is no easy way to identify all the certificates strategies are used, hard revocation where t = t , and r i that are actively being used in the wild. Instead, we have soft revocation where t < t ≤ t . Hard revocation has i r e to find data from sources that provide as wide a view of the advantage that all malicious signed files are un- the ecosystem as possible. Timing is a problem because trusted, but the side effect is that all benign files also if we observe only a single version of the CRL, we can become untrusted. Soft revocation tries to match the only see the effective revocation date (t ), which helps date more closely to the date when the certificate was r define which files should be untrusted, but not when the compromised, which means some benign files will still trust was lost. To see the revocation publication date (t ), be trusted. If this date is not set correctly, then signed p when a certificate appears on a revocation list, we must malware (i.e., malware is signed before the date, t < t ) m r actively monitor the CRLs over an extended period of may still exist and continue to be trusted as the example time. shown in Figure 1. Q4. Is revocation information served properly? Client-side platforms (e.g., Windows) check the valid- ity of both leaf and intermediate certificates used to sign 3 Data Collection program code. According to the specification [3], a bi- nary should be considered unsigned when it is not possi- There are no publicly available datasets that are used to ble to check the revocation status. Suppose that a client perform research on code signing certificates. In this sec- platform does not follow the specification, but instead tion we describe our data collection methodology and applies a soft-fail revocation checking policy; the soft- how we measure the revocation process for code signing fail revocation checking policy is for client platforms to certs.

USENIX Association 27th USENIX Security Symposium 855 Malsign Malcert Symantec WINE Total* CA Leaf Certificates PKCS #7 2,171 801,995 149,840 11,108 965,114 44,014 (30.23%) CS certs.** 2,106 1,121 145,411 1,137 145,582 Thawte 26,884 (18.47%) CRL URLs 55 60 403 49 413 Comodo 24,780 (17.02%) OCSP URLs 24 24 130 16 131 GlobalSign 12,079 (8.30%) Symantec 8,913 (6.12%) Table 2: Summary of the fundamental data. (*: total DigiCert 8,300 (5.70%) number of unique data, **: CS stands for code signing – Go Daddy 7,376 (5.07%) some certificates have parsing errors.) WoSign 3,796 (2.61%) Certum 1,874 (1.29%) StartCom 1,830 (1.26%) 3.1 Fundamental Data (D1 – D2) Other 4,281 (2.94%) Total 145,582 (100%) The code signing certificates are the to collect addi- tional information since they include the revocation dis- Table 3: Top 10 Code signing Certificate Authorities. tribution points (CRLs and OCSP points) and other in- The top 10 CAs account for 97% of the certificates in formation that we monitor. Here we describe how we our data set (D1). collect the code signing certificates and the revocation information. Table 2 shows the breakdown of the funda- mental data. code signing certificates. First, we extract only a leaf cer- Code signing certificates (D1). There is a publicly avail- tificate from each PKCS #7 file by filtering out interme- able corpus of TLS certificates at Censys.io 2, collected diate certificates and TSA certificates, and we select only by scanning all IPv4 network address. In contrast, there code signing certificates using the keyword of “Code is no large public corpus of code signing certificates ob- Signing” in the extendedKeyUsage extension field. We served in the wild. We use multiple data sets that are are unable to parse 1,989 leaf certificates due to parsing publicly released from prior research [1,5,13] and a pro- errors. 145,582 unique leaf code signing certificates (ex- prietary repository of binary samples. The data sets are: tracted from 965,114 binary samples) legitimately issued • Malsign. Kotzias et al. [13] evaluated signed mali- from CAs remain after we remove duplicate leaf certifi- cious PE files and they publicly released the 2,171 leaf cates (85.2% leaf certificates are duplicate) and two self- code signing certificates used to sign the PE files. signed certificates. Table 3 shows the number of code • Malcert. Alrawi et al. [1] examined 3.3 million sam- signing certificates for the top-ten most popular CAs in ples collected from a commercial feed of a private our data set (D1). company, and they shared 801,995 signed PE sam- The D1 data set is used for (1) the trend of revocation ples. The reason for the large reduction from PKCS setting policy (Section 5.1), (2) the certificates without #7 to CS certs for Malcert in Table 2 is that most of CRL and OCSP (Section 6.3), (3) the inconsistent re- the PKCS #7 files were duplicate code signing certifi- sponses from CRLs and OCSP (Section 6.3), and (4) the cates used to sign binaries with different hashes. unknown or unauthorized responses from OCSP (Section • Symantec data set. Symantec has an internal reposi- 6.3). tory of binary files, from which they extracted a sam- Revocation information (D2). The CRLs and OCSP ple of 149,840 PKCS #7 files for analysis. points (URLs) are specified at the CRLDistributionPoints • Samples from WINE [5] and VirusTotal. To get more and AuthorityInfoAccess extensions respectively. We code signing certificates, we also select around 300 extract the CRL and OCSP points from 145,582 leaf PE files for each CA from WINE (c.f., Section 3.3) code signing certificates that we find in the four data and download the samples from VirusTotal using the sets. Most (137,027, 94.1%) certificates contain both download API; 11,108 PE samples are collected. The CRL and OCSP points; only CRL points are specified in details of VirusTotal will be explained in Section 3.3. 7,794 (5.3%) certificates and only OCSP points are ex- A PKCS #7 [11] file includes code signing certificate pressed in 98 (0.06%) certificates. We observe a total of chains, TSA certificate chains, a signature, and a hash 413 unique CRLs, however CRLs can be used for other value of a PE file. The data sets consist of PKCS #7 files purposes such as TLS. Therefore, we manually search except for the Malsign data set that provides only leaf Censys.io for each CRL and filter out CRLs used for other purposes. Eventually, 215 CRLs that are used only 2https://censys.io for code signing remain. We observed 131 unique points

856 27th USENIX Security Symposium USENIX Association for OCSP. This D2 data set is used to examine the prob- problems in revocation date setting (Section 5.1). lems in effective revocation date setting (Section 5.1), the Symantec metadata telemetry (D5). For the revoked transient certificates in CRLs (Section 6.3), and the no certificates observed by our revocation publication date longer updated CRLs (Section 6.3). collection system, we also received meta information about the binaries signed by the 2,617 code signing cer- 3.2 Revocation Publication Date List (D3) tificates from Symantec, using the serial numbers of the certificate to identify the set of the affected binaries. The A CRL contains the serial numbers of revoked certifi- information is similar to WINE, but for a more recent cates, revocation date, and reason code. The revocation time period than what is in WINE (from Jan. 1st, 2016 date field is effective revocation date (tr) (c.f., Section to Sept. 10th, 2017) so that we could observe informa- 2.2) that determines the validity of signed program code. tion related to more recent certificates and revocations In other words, the revocation information in CRLs does that we track in D3. The data consist of the serial num- not contain the date on which the certificates become re- ber of the signing certificates, the SHA256 hash of the voked. Therefore, we devise a system, called revoca- binary, the first seen timestamp. Symantec provided us tion publication date collection system that collects re- ground truth for identifying malware among these signed voked serial numbers once a day from our CRL data set binaries as well. With the ground truth, we identify the in order to detect the revocation publication date (tp), certificates used on signed malware. This D5 data set is when the certificate is added to CRL or OCSP servers. used to estimate malware signing certificates in the wild This information can be used to measure the revocation (Section 4.1), and to examine the revocation delay (Sec- delay between a malicious signed binary appearing in tion 4.2). the wild and a CA revoking the compromised certificate. VirusTotal (D6). Because the previous two data sets do From the 215 CRLs, we observe 2,617 unique certifi- not provide actual binaries, we use VirusTotal [27] to find cates added to the CRLs between Apr. 16th, 2017 to specific binaries and to perform further analysis. Virus- Sept. 10th, 2017. This D3 data set is used to examine the Total provides a service that analyzes potentially mali- revocation delay (Section 4.2). cious binary files and URLs using up to 63 different anti- virus engines. The analysis is triggered when a sample 3.3 Binary Sample Information (D4 – D6) is submitted, the report is kept in a database and exposed externally via an API. We use the private API to collect Among our measurements, there exist several research the following information from these reports: the signed questions which require information about the signed bi- date of the binary, the number of AV engines detected naries. For example, to measure the malware which is the file as malicious, and the first submission timestamp still valid due to the ineffective revocation date setting, to VirusTotal. we need a view of the binaries signed with a revoked cer- VirusTotal also allows users to apply rule-based tificate and information to determine their maliciousness matching on the incoming submissions, which can help and their signing date. Therefore, we collect information researchers find a specific type of malware. This plat- about the signed binaries from three data sets: WINE, form is called VirusTotal Hunting 3, and it uses YARA Symantec, and VirusTotal. 4 to define rules. We write a YARA rule that triggered Worldwide Intelligence Network Environment when a binary was signed and at least 10 AVengines con- (WINE) (D4). WINE [5] provides security telemetry vict the binary. From each report, we extract the SHA256 submitted from 10.9 million Symantec customers around hash of the binary, the first submission date, and the se- the world that opt into this data sharing. Among the rial number of the leaf code signing certificate. The data various data sets in WINE, we use the binary reputation collection began on Apr. 18th, 2017. The extracted data data that contains metadata of binary files that are seen set is used in the estimation of malware signing certifi- on endpoints. We extract the following information cates (Section 4.1), and to examine the revocation delay from this data set: the SHA256 hash value of the (Section 4.2). file, the server-side timestamp, and the names of the We also use the VirusTotal download API to download publisher and the CA which are extracted from the code the actual binary of a given hash when necessary (e.g., to signing certificate. Note that detailed information of the collect the certificate to extract the CRL/OCSP informa- certificate (e.g., a serial number of the certificate, CRL) tion). is not provided in WINE. Also, WINE does not provide 3https://www.virustotal.com/#/hunting-overview the actual binary. This D4 data set is used to examine the 4http://virustotal.github.io/yara/

USENIX Association 27th USENIX Security Symposium 857 3.4 CRL/OCSP Reachability History (D7) official repository for code signing certificates and the signed binaries. To overcome this problem, we employ CRL reachability history. For the list of CRLs we have the mark-recapture analysis [15]. This technique was in our data set, we check the reachability of the CRLs originally developed for measuring wildlife populations. daily from Aug. 10th, 2017 to Sept. 10th, 2017. If a The goal of mark-recapture is to estimate the size N of a CRL is unreachable, we record the timestamp and the population that cannot be observed in its entirety. In our reason of failure to the log. This D7 data set is used for case, N is the number of certificates employed by digi- measuring the unreachability of CRLs (Section 6.2). tally signed malware. The technique requires two sepa- OCSP reachability history. Similar to the reachability rate samples drawn, with replacement, from the popula- checker for CRLs, we also develop an OCSP reachabil- tion. The first sampling results in the capture of n1 sub- ity checker. The checker tests the reachability of each jects. These subjects are marked and released in the wild. OCSPs we found from the four data sets every 30 min- The second sampling results in the capture of n2 subjects, utes. Rather than simply pinging the domain, it queries among which p bears the marks from the previous sam- each OCSP points with the certificates that contain the pling. In other words, p is the size of the intersection of OCSP point over the OCSP protocol using Openssl. Sim- the two samples, denoting the subjects that have been re- ilarly, the timestamp and the reasons are logged if not captured. An estimator Nˆ for the total population N can reachable. It has been running with 131 unique OCSP then be computed as: points from Aug. 10th, 2017 to Sept. 10th, 2017. This n n Nˆ = 1 2 (1) D7 data set is used for measuring the unreachability of p OCSP points (Section 6.2). We apply the mark-recapture technique to the malware signing certificates from two different data sets: Syman- 4 Discovery of Potentially Compromised tec telemetry (D5) and VirusTotal (D6). We consider that Certificates each data set is a sample of the total population of po- tentially compromised certificates. Specifically, n1 and There are many reasons for revoking a code signing cer- n2 represent the numbers of certificates that should have tificate, and in general it is difficult to determine whether been revoked, as they are known to sign malware, from and when a certificate should have been revoked. How- the Symantec and VirusTotal data sets respectively. ever, one situation warrants a prompt certificate revoca- tion: when the corresponding private key has been used Assumptions and interpretation. Mark-recapture to sign malicious code [3]. We therefore compute a con- makes three assumptions about the population and the servative estimate of the number of certificates used to sampling process that may not hold in our case. First, sign malware in the wild, and we compare it with the the subjects in the population should have an equal coverage of a major security company to assess the odds chance of being captured; in other words, the population of discovering all the potentially compromised certifi- is homogeneous. However, the certificate population is cates (Section 4.1). Furthermore, after a signed malware unlikely to be homogeneous. For example, a certificate sample has been discovered, the information must reach used by a popular software company would have a higher the principal responsible for revoking the code signing chance of appearing in our datasets. Second, the samples certificate, and the principal must add the certificate to from the population should be independent. That is, Certificate Revocation List (CRL). We therefore ana- the initial capture should not affect the likelihood of lyze the delay between the time when this information recapture. This assumption ensures that the proportion is available to the community and the time when the cer- of recaptured subjects in the second sample p/n2 is tificate appears on a CRL (Section 4.2). the same as the proportion of marked subjects out of the total population n1/N, which leads to Equation 1. 4.1 Mark-recapture Population Estima- However, security companies share malware feeds with tion each other, which raises the probability of recapture for the potentially compromised certificates captured in the The process of revocation starts from discovering the cer- first sample. Third, the population should be closed. tificates used in malware. To understand how effective A population is closed when its size does not fluctuate the discovery phase is, we need to answer our first re- due to the birth and death of its members. However, our search question, Q1. How many certificates are used population changes over time, as certificates are issued to sign malware in the wild? However, there exists no and revoked.

858 27th USENIX Security Symposium USENIX Association USENIX Association number the with estimations these compare also We riod. timations Results. certificates. compromised potentially a as interpreted of increase an of estimation to lead would intersection sets the data two the between exis- population that The real imply the would set. timates data certificates either such in of occur tence not targeted may in preva- and used only low attacks are they a because have example for may lence, certificates some Hunting. VirusTotal Furthermore, the a in have occurring may of 2017, certificates probability lower April older these before with issued signed malware certificates include sets for data period 4/18/17 collection the between 9/10/17, estimates and daily our compute we lation, closed day. approximately each is within interest of population revo- our its daily, on ( population date the publication leaves cation certificate a con- that we sider com- reasoning, same potentially the of Using population certificates. the promised join they Symantec as VirusTotal, when the for is date this in submission timestamp first the seen and telemetry first the as certificate N week). that of start the is during certificates revoked (4 newly of number certificates total estimation the the signing and between comparison (b) malware time number over observed blue) in as and red Trend as estimation (a) (mark-recapture 2: Figure ˆ Number of certificates / eaaeyfrec a.W e h it aefreach for date birth the set We day. each for separately omtgt h mato o-ooeeu popu- non-homogeneous a of impact the mitigate To estimate we issue, last the of impact the minimize To 18 1000 1200 1400 1600 1800 200 400 600 800 4/17 - 4/23 0 / 4/24 - 4/30 17

5/1 - 5/7 − N Accumulated Observation Accumulated

ˆ 5/8 - 5/14 iue2a hw h vrg fordiyes- daily our of average the shows 2(a) Figure 5/15 - 5/21 o ahwe uigormaueetpe- measurement our during week each for , 9 5/22 - 5/28 / N

10 5/29 - 6/4 u siaini hsscinsol be should section this in estimation Our . p oe bound lower 6/5 - 6/11 /

hc ol lorsl na under- an in result also would which , 6/12 - 6/18 7 telblsat rm4 from starts label (the 17)

Period (week) Period 6/19 - 6/25

6/26 - 7/2 (a) 7/3 - 7/9 7/10 - 7/16 t

p 7/17 - 7/23 .BcueCL r updated are CRLs Because ). 7/24 - 7/30 Bound) (Lower Estimation N 7/31 - 8/6 iial,dependencies Similarly, . o h repplto of population true the for 8/7 - 8/13 8/14 - 8/20 8/21 - 8/27

8/28 - 9/3 9/4 - 9/10 D6 hl h two the While . Estimation / N ˆ (b) 7sneit since 17 RevokedNewly underes- 0 500 1000 1500 2000 2500 week ( CRLs on observed D5 we certificates the from the revoked? is certificate promptly corresponding how discovered, is malware signed is the question ter research second malicious our sign Therefore, to used code. been are has CAs they certificate after the 2.3, that days alerted seven Section within in certificate a the discussed revoke explore must As to approach driven process. publica- data revocation a revocation take the We Their for date. tion approximation malware. an revoked sign used never were to and 5.6 that used certificates over first included for are estimation they threat after compromised a the years remain of 80% certificates that estimated code-signing [12] al. et Kim Delay Revocation 4.2 po- revocation. of their and discovery certificates the compromised between tentially delay the We investigate discoveries. next older as to effectively, correspond mitigated may is revocations threat the security not the does sign that this that revoked, imply eventually certificates are the wild the all in if malware even However, reality. of larger in number much be the may bound, certificates lower compromised potentially a is that, estimation note We our re- malware. because signed in of done discovery are the estima- revocations to close sponse most our that in- indicate revocations, not could the do tion for CRLs the reason While the dicate CRLs. cer- the signing to code the added of tificates 95.1% represents dur- period this certificates ing compromised potentially of population set data from ( re- 9/10/17, which date certificates, publication revoked vocation newly of the number with (4/18–9/10/17) actual period measurement the the compare during observed we certificates the 2(b) all Figure on estimation in mark-recapture revocations, the on cess certificates. compromised potentially large the infor- a of observe an portion not and do VirusTotal Symantec like aggregator like mation company even security that major suggests a This certificates. 2.74 of is number estimation. population observed the estimated not of the had date and age, the wild the by in revoked certifi- malware been signing sign to code used 1,004–1,786 were cates least at that estimate ( ob- ( telemetry VirusTotal certificates Symantec of the sets from the daily of served union the actually is we which that observe, certificates compromised potentially of oilsrt h feto h nfcetdsoeypro- discovery inefficient the of effect the illustrate To eervkdb h n forosrainpro.Drn h last the During period. observation our of end the by revoked were 5 eas h yatctlmtydtstwscletdstarting collected was dataset telemetry Symantec the Because n 1 = ,wihpeet sfo aiga cuaeestimation. accurate an making from us prevents which 1, D6 .Ecuigtels ek(/–/0,we (9/4–9/10), week last the Excluding ). 27th USENIX Security Symposium 859 D3 h ubro h estimated the of number The . t p sbten41/7and 4/18/17 between is ) D3 ,altecrictsin certificates the all ), × agrta the than larger D5 5 n from and ) naver- On 2 Af- Q2. t 860 27th USENIXSecurity Symposium do or manner timely a in either information CAs the that receive imply delays not long do The days, days). 324.9 (std 38 months) median (5.6 days 171.4 is The delay distribution. average cumulative a shows 3 Figure days; 1553 Results. CRL. its to added ( date publication revocation lay the compute We compromised. likely certificate was the the that when suspecting date started community the security for estimate this conservative abuse, a the represents of aware were ( vendors date anti-virus discovery multiple the as sample malware signed a of and VirusTotal, samples. the in to- sign to samples In used unique certificates unique binaries. 19,053 254 the find of we timestamp tal retrieve submission also consen- first we reports using VirusTotal the malware the From signed results. the sus label and binary rely the we binaries ( these VirusTotal collect on not does certificates. revoked Symantec the Since with signed the hashes 146,286 in and certificates cer- revoked revoked (17.9%) the 468 find with signed are from tificates that files binaries for Syman- ( use telemetry We metadata malware. tec signed corresponding the for voked; tem our ( by date vided publication revocation the of tion certifi- compromised the revoke cate. and CAs which certificates on compromised dates the with which signed on malware dates the the between delays Revocation 3: Figure p ewe p.1t,21 n et 0h 2017. 10th, Sept. and 2017 16th, Apr. between o ahcrict,w s h aletdtcindate detection earliest the use we certificate, each For date discovery the determine to is challenge next Our estima- accurate an need we question, this answer To ( ( t

D3 CDF p 0.2 0.4 0.6 0.8 1.0 − 0 D3 .W ou nyo etfiae hthv enre- been have that certificates on only focus We ). 0 t d nlds267cd inn etfiae,with certificates, signing code 2,617 includes h eoaindlyrne rmoedyto day one from ranges delay revocation The stedfeec ewe hsdt n the and date this between difference the as ) eoainpbiaindt olcinsys- collection date publication revocation D3 D6 fte267rvkdcricts we certificates, revoked 2,617 the Of . and ) Revocation Delay (day) Delay Revocation 500 AVClass D5 oietf e fhashes of set a identify to ) t p ,we h etfiaewas certificate the when ), 2]t e eotof report a get to [25] 1000 t p .Ti spro- is This ). eoainde- revocation D5 1500 aaset, data t d .As ). t htte aeise.Cscnset can CAs issued. certificates have the they revoking that when dates revocation set must properly? is set ques- answer dates next to the have we process, revocation tion effective an Setting have Date To Revocation in Problems 5.1 com- is certificate the in trust when promised. period the cover to proper the a this call determine (we be date must revocation could CA the certificates efficiently, compromised discovered potentially if Even Date Revocation the Setting 5 the of discovery the after malware. for average, signed threat on this months, to five exposed over remain [3]. users Group consequence, Working In Signing the by Code set Forum requirements CA/Browser minimum the follow strictly not (stacked). Num- certificates revoked trends: of setting ber date revocation Effective 4: Figure h te hand, other the to date) tion n oaC eeal re oset to tries generally CA date, a revocation effective so the and on depends binary signed a eoetecrictsuigorcletd1552code 145,582 ( certificates collected signing our they using when certificates date revocation the effective revoke the set CAs the how setting. date revocation effective ef- of Trend wrong revocation. soft the in by setting led date ( revocation problem set fective security data the our identify using also time, over the how changed and is soft), trend or the hard set (e.g., CAs date the revocation how effective understand better to date policies revocation setting CAs’ examine We malware). signed cate oldest the to close date) e eprto ae,called date), (expiration Number of certificates 1000 1200 1400 200 400 600 800 0 2002 2003 t Hard revocation Hard Soft revocation i 2004 isedt) called date), (issue t r 2005 a estt n aebetween date any to set be can D1 2006 sdsrbdi eto .,CAs 2.3, Section in described As .Frt ecektecertificate’s the check we First, ). 2007 t m 2008 Period (Year)Period tedt nwihtecertifi- the which on date (the 3 r fetv revocation effective Are Q3. otrevocation soft 2009 2010 fetv eoaindate revocation effective 2011 t r adrevocation hard USENIX Association efcierevocation (effective 2012 t r 2013 efcierevoca- (effective 2014 h rs in trust The . 2015 eexamine We 2016 D1 2017 .We ). t On . i and ) < ti = ti ≤ te > te Total 1.0 Comodo 0 426 1,437 17 1,880 0.8 Thawte 0 74 1,055 39 1,168 Go Daddy 2 14 672 18 706 0.6 Verisign 2 59 430 51 542 CDF Digicert 1 161 323 3 488 0.4 Starfield 0 3 153 2 158 Symantec 0 33 89 1 123 0.2 Wosign 0 57 17 0 74 Startcom 0 0 47 0 47 0 0 200 400 600 800 1000 Certum 0 1 9 0 10 Revocation Date Setting Error (Days) Other 0 96 117 1 214 Total 5 924 4,349 132 5,410 Figure 5: CDF of the revocation date setting error (tr − tm): difference between the effective revocation date and Table 4: Effective revocation date setting policy for top the first malware signing date of a certificate. 10 CAs (ti: issue date, te: expiration date).

date. As shown in Table 4, most CAs have set the ef- revocation status using CRL points, specified at its cer- fective revocation dates even after its certificate expira- tificate extension field. Table 4. shows the breakdown of tion date. In this case, the effective revocation dates be- the effective revocation date setting policy. We observe come ineffective. In other words, the revoked certificates that 5,410 (3.7% out of 145,582 certificates) certificates should not affect any properly signed and timestamped are explicitly revoked. Of those, 96% (5,196) certificates sample including malware should remain valid. have been issued by the top 10 CAs; most (1,880, 34.8%) We measure how many CAs erroneously set effective revoked certificates are issued by Comodo, followed by revocation dates, and how many signed malware still re- Thawte (1,168, 21.6%). Most (4,481, 82.8%) revoked mains valid even after the certificates used to sign are certificates take soft revocation while only 17.2% certifi- revoked. To examine the erroneous effective revocation cates perform hard revocation. date problems, the information (e.g., signing date) of bi- Most CAs apply both hard revocation and soft revo- nary samples signed with the revoked certificates is nec- cation when revoking a certificate. Soft revocation is essary. We use WINE data set (D4), and query Virus- more common than hard revocation in all CAs except for Total with the 12,351,946 signed hashes from WINE. Wosign; in particular, Startcom has never performed hard Only 4,729,023 (38.3%) samples have sigcheck informa- revocation in our observation. Interestingly, three CAs tion in its VirusTotal report; and the 4,729,023 samples (Go Daddy, Verisign, and Digicert) set the effective re- are signed with 45,613 unique certificates. We are un- vocation date to before their certificates’ issue date. The able to directly obtain the effective revocation dates of two certificates of Go Daddy were set to one day before the 45,613 certificates because of the following two rea- their issue date, and the one certificate of Digicert was sons. First, the search index service of VirusTotal sup- set to five days before its issue date. However, other two ports only 80TB of data, or about a month of samples certificates of Verisign were set to around five months so that we cannot query VirusTotal for all old samples. and nine months respectively before their issue date. It is Second, the VirusTotal reports contain neither CRLs nor considered hard revocation; therefore, there are no secu- OCSP points to check the revocation status and to ob- rity threats to clients. Figure 4 presents the total number tain effective revocation dates. Therefore, we query the of soft and hard revocations. The total number of revo- CRLs we have collected (D2) to check whether or not the cation has made a drastic increase since 2012. It is also certificate is revoked and to obtain effective revocation worth noting that the numbers for 2016 and 2017 are not dates (tr) if revoked. This process gives us 1,022 revoked yet final, as we have already seen in the previous section, certificates (out of 45,613 certificates). due to revocation delay these numbers should continue We find that CAs applied the soft revocation policy to grow in the future. to revoke 891 (87.2%) certificates. Of those, the effec- Ineffective revocation date setting. So far we have seen tive revocation date (tr) of 45 (5.1%) certificates were the dominance of soft revocation among the CAs. As erroneously set by CAs. The affected CAs are summa- mentioned in Section 2.3, soft revocation may result in rized in Table 5. We also measure how many malware the survival of signed malware even after a certificate has signed with the certificates are still valid due to the in- been revoked if a CA sets the wrong effective revocation effective revocation dates. We first use AVClass [25] to

USENIX Association 27th USENIX Security Symposium 861 label malware using the VirusTotal reports. For the la- to allow execution with no prompts unless the revoca- beled malware sample, we extract the signed date tm. If tion is explicitly found, for all unknown and unexpected we find a signed malware with tm < tr, we say the effec- cases the assumption is that it is safe to proceed. We ob- tive revocation date is erroneously set and the malware served that some of the problems in the revocation infor- remains valid. We find that 250 malware (5.3% out of mation dissemination, when combined with the enforce- the 4,716 malware) signed with the 45 certificates still ment policy of Windows, could allow binaries with re- remain valid. The still-valid signed malware should be voked certificates to be executed without security warn- revoked, but due to the CAs’ error, they remain valid ing messages. and a security threat to clients. The number of still-valid signed malware is relatively small in our data sets, but we believe that more still-valid signed malware can be 6.2 Unavailable Revocation Information found in the wild since our data sets are limited, and do In the code signing PKI, CAs must maintain the revoca- not cover all samples in the wild. Figure 5 shows the dif- tion information indefinitely since the trusted timestamp ference between the effective revocation date (t ) and the r extends the life of the certificate for an unknown length oldest signing date of signed malware (t ) of the certifi- m much longer than the certificates lifetime. This is an im- cate. The shortest difference is one day, and the longest portant difference between code signing and Web’s PKI. difference is 1019 days (2.8 years). Clients may execute This means that revocation status information has to be or install the still-valid malware because the executions always-available and updated much longer than the life of the malware do not trigger any warnings for clients of the certificates [20]. There are several cases when the even though its certificate is already revoked. revocation status information for a certificate is not avail- able for clients. The results and affected CAs are sum- 6 Dissemination of Revocation Informa- marized in Table 5. tion Certificates without CRL and OCSP. The first problem arises when there are no CRL or OCSP points embedded After compromised certificates are properly revoked and in certificates. Code signing certificates that follow the the appropriate effective revocation dates are decided, X.509 v3 standard must include CRLs and OCSP points the next step for CAs is to make the revocation public for clients to check revocation status. However, we ob- and maintain its availability. We first take a look into serve that 788 (0.5% out of 145,582) certificates contain the enforcement of the Windows platforms6 since clients neither CRLs nor OCSP points from the corpus of leaf can be affected depending on the enforcement policies in code signing certificates (D1). This means that clients client-side platforms for checking revocation status in- have no way to check the revocation status for these formation. Then, we examine the security problems in certificates. Of the 788 certificates that contain neither dissemination of revocation status information and try to CRLs nor OCSP points, most (676, 85.8%) were issued answer our last research question Q4. Is revocation in- by Thawte, and they were issued before 2003. Recently, formation served properly? in 2014, iTrusChina issued a code signing certificate to without revocation information; thus, the prob- lem does persist. The 788 certificates with no CRLs and 6.1 Enforcement in Windows OCSP points have already expired. Therefore, no new binaries can be signed with these certificates. However, Client-side platforms must check the validity of code old binaries (including malware) already signed with the signing certificates when a signed binary is encountered. certificates can be valid as long as it contains a trusted When there is a failure or inconsistent state at some timestamp. point in the revocation infrastructure, it matters how the We also examine how it affects the Windows plat- endpoint, where the binary is being executed, handles forms. We download several samples signed with the that failure. Windows considers binary samples signed certificates from VirusTotal. We then inspect the certifi- with revoked certificates as unsigned samples and dis- cate of the samples on Windows 7 and 10 to observe how plays “unknown publisher” in a security warning mes- the Windows platforms check the validity of the sample. sage. Windows also typically follows the soft-fail policy In both versions of Windows, a message saying “The re- vocation function was unable to check revocation for the 6According to Net Market Share (https://www. netmarketshare.com), since more than 75% of Windows clients use certificate” is displayed if you manually inspect the cer- Windows 7 and Windows 10, we focus on only these two platforms. tificate (seen in Figure 6 in the appendix), but the certifi-

862 27th USENIX Security Symposium USENIX Association USENIX Association or root the case, sign this to for used that are suggest We they if files. even malicious this them revoking for never certificates or impli- all CA serious revoking have explicitly could either not which cation; do purchased, domain We be the time still this can lapse. at because domain details its many related of too provide the part certificate that let down a and shut by business reseller issued the were however The CRL reseller; available. longer this no with is certificates point CRL that the means at which exists still server a but the address removed domain. has the CA from the that CRL indicates HTTP which error, 404 two produce example, (http://crl.globalsign.net/ObjectSign.crl, For http://www.startssl.com/crtc2-crl.crl) Error. points Found Not CRLs 404 HTTP to due observa- our with during left available period. are never tion were we that reachable, CRLs generally 13 CRL were the removing that After local- URLs system. probably monitoring few were our to a that ized However, issues institution’s caused our which for network issues day. networking one were there in CRLs times, least 55 at that observe unreachable we are 2017), unreachability 10th, the Sept. record 2017- 16th, we ( that CRLs in Recall the points of OCSP set. and data CRLs the our of unreachability the ine server. OCSP and CRLs Unreachable re- already and compromised be voked). might it the worst, though even (at unavail- is able appendix certificate the the of in information 7 status revocation Figure trusted in normal a seen to presents as prompt attempt file the clients file, when a fact, such In execute Windows. of check- revocation policy soft-fail ing the to due trusted appears cate n oanhsbe ogtb oanreseller, domain a by bought been has domain One unreachable are CRLs (38.4%) 5 CRLs, 13 the Of D7 rnin et.i CRLs in certs. Transient response OCSP Unauthorized or Unknown OCSP and CRLs from responses Inconsistent points CRLs or OCSP Unreachable points OCSP and CRLs without Certs. date revocation Ineffective .Drn u bevto eid(Apr. period observation our During ). al :Msaaeetise on costetp1 CAs. 10 top the across found issues Mismanagement 5: Table

susfound, Issues = enwexam- now We # susntfound not Issues = # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # Comodo h fetdOS Rs h necaiiycnbe can unreachability The by URLs. caused OCSP removing affected after the GlobalSign) and Certum, alTrustFinder, eight Glob- WoSign, by StartSSL, operated Comodo, Verisign, URLs (AOL, CAs infrastructure, OCSP 15 internal unreachable institution have we our on problems age hyudt n eiseterCL.O 1 CRLs, 215 Of CRLs. their re-issue and update update they next the from and days week, a the once at timestamp least at re-issued CRLs. be updated The longer 2017. No 10th, 5. Table in Sept. summarized to are 2017 CAs we affected the 16th, during Apr. issues OCSPs from and mismanagement period CRLs the some observing while highlight found we OCSPs and Here CRLs in Mismanagement 6.3 Windows the to policy. this due checking valid revocation with remain soft-fail signed can malware, certificate of including type binary, qui- Any happens also failure etly. the OCSP then and gone, CRL permanently the are when that qui- means failures this these However, network etly. handles a Windows have so not and to reasons connection, valid many are there since information. status revocation un- to for are query try certificates to the who where with aware clients signed its code and program verify maintained, and CRLs longer cur- are one no servers AOL’s both rently the operate However, and servers. CA, OCSP a two be to used AOL allowed not ( operated. longer no are resellers their if or servers CRLs OCSP maintain and over take should CAs intermediate D7 necal RsadOS onsaecommon, are points OCSP and CRLs Unreachable severs OCSP of unreachability the measure also We

.A ehv xeinesm ewr n stor- and network some experience have we As ). Thawte Go Daddy a hostname bad thisUpdate o xml,i h aeof case the in example, For . Verisign 27th USENIX Security Symposium 863

nextUpdate Digicert

Starfield ed[] eeaiehwoften how examine We [3]. field , timeout Symantec edsol els hnten than less be should field

ealta Rsshould CRLs that Recall WoSign , forbidden Startcom Certum a hostname bad and , method , 57 CRLs are never updated at all since their nextUpdate revoked. timestamps are not changed in the observation period of We observe that 19 certificates have inconsistent re- our revocation publication date collection system. Most sponses from CRLs and OCSP from our data set (D1); the (34 of 57, 59.6%) CRLs are issued by Shanghai Elec- certificates are valid according to the OCSP, but are re- tronic CA, and well-known CAs’ CRLs are not found in voked in the corresponding CRLs 7. To examine how the the 57 CRLs. Moreover, most (130, 89.7% out of 145 inconsistency between OCSP and CRLs affects the Win- CRLs except for unreachable CRLs and not-updated- dows platforms we download the binary samples signed CRLs) CRLs are updated and re-issued every day. It in- with these certificates from VirusTotal and check its re- dicates that CAs re-issue their CRLs when revoked serial vocation status in the Windows platforms. These down- numbers are added. loaded samples are classified as malware by most AV Transient certificates in CRLs. Recall that code sign- vendors and their certificates are explicitly revoked in ing CAs must maintain and provide the revocation status the CRL. Therefore, the samples must be invalid and not information of all certificates including expired ones be- be executed. However, the Windows platforms present cause of the trusted timestamp. However, we find that these signed malware as valid, due to the inconsistency 278 certificates are added and then later removed from between OCSP and CRLs. The Windows policy is to 18 CRLs. The CRLs are maintained by ten different first check the OCSP. If the response from the OCSP in- CAs including GlobalSign, Certum, Entrust, Digicert, dicates the certificate is valid, then Windows does not and Comodo. Most removed serial numbers are never double-check the status using CRLs. To prevent this sort re-added to its CRL. However, one serial number of Dig- of threats caused by mismanagement issues, Windows icert is re-added to the CRL after 106 days. should double-check certificate revocation status using We reach out to the CAs to try and understand the fac- both OCSP and CRLs. tors that go into a decision to remove a revocation from The 19 certificate were issued by Go Daddy; three the CRLs. One CA replied that they had a flaw in their certificates were issued by Starfield Technologies (re- revocation system that removes certificates after the cer- lated to Go Daddy). We believe that Go Daddy and tificate expired, and they fix the flaw to keep the certifi- Starfield Technologies may share the same infrastruc- cates on the CRL indefinitely thanks to our report. tures for revocation information repositories; the infras- The disappeared serial numbers from CRLs are un- tructures may cause the inconsistency problem. It indi- likely to affect the Windows platforms as long as cer- cates that CAs must keep monitoring the consistence be- tificates have both CRLs and OCSP points since in Win- tween CRLs and OCSP responses. dows, OCSP is always preferred over CRL to check re- Unknown or unauthorized responses from OCSP. Ac- vocation status. However, when code signing certificates cording to the OCSP specification, the OCSP respon- contain only CRL points, Windows must rely on only the ders (servers) should return three statuses for a certifi- CRL mechanism. In our data set (D1), the 28,386 leaf cate; good, revoked, and unknown [24]. The unknown code signing certificates (19.4% out of 145,582) con- state indicates that the responder is unaware of the sta- tain one of the 18 CRL points that have experienced se- tus of the certificate being requested. Surprisingly, in our rial numbers disappearance. Most certificates (82.8%) data set (D1), the three OCSP servers (Certum, Shanghai have both the CRL and OCSP points, but the 4,878 Electronic CA, and LuxTrust) respond that they are un- (17.2%) certificates issued by GlobalSign include only aware of the status of their 669 certificates; almost all of CRL. Therefore, the Windows platforms must rely on the certificates (658, 98%) are issued from Certum; the only the specified CRL points to check revocation sta- rest of them (2%) are issued by Shanghai Electronic CA tus. If revoked serial numbers are removed from CRLs, and LuxTrust. any program code including malware signed with one of OCSP responders may also respond with an error the 4,878 certificates can remain valid even though the message. The error message has the five types; mal- certificate is already revoked. formedRequest, internalError, tryLater, sigRequired, Inconsistent responses from CRLs and OCSP. Since 7We consider only this case where the responses from OCSP in- CAs are distributing revocation information through dicate the certificates are valid, but revoked in CRLs since only this CRLs and OCSP, and one is a fallback mechanism for case can lead to security threats where Windows users are allowed to the other. We expect that the state in the CRL and OCSP execute the binary samples with revoked certificates. However, the re- would be consistent; for example, when the serial num- versed inconsistent responses (revoked in OCSP and valid in CRLs) do not affect Windows in terms of security as Windows believes cer- ber of a revoked certificate is found in a CRL, the cor- tificates are revoked when the responses from OCSP indicate revoked, responding OCSP will also return that the certificate is and it displays warning messages for Windows users.

864 27th USENIX Security Symposium USENIX Association and unauthorized. The unauthorized response means the population should be homogeneous, 2) the samples that; (1) the client is not authorized to query the OCSP should be independent, and 3) it should be a closed pop- server, or (2) the OCSP server is unable to respond au- ulation. It results in underestimating the true population thoritatively [24]. In the OCSP server-side case, OCSP of the potentially compromised certificates. Therefore, responders return an unauthorized error message when the actual severity of the threat might be much more (1) they are not authorized to access the revocation significant. However, the results suggest that even with records for the certificate, or (2) when they remove the the underestimation, the number doubles the number of revocation records of expired certificates and are unable malware-signing certificates observed by Symantec and to locate the records for requested certificates. We exam- VirusTotal combined (which is a precise measurement, ine how many OCSP servers return the error messages not an estimate). This puts the challenge of discover- for the requested certificates that they have issued. In our ing compromised certificates into perspective, as a major data set (D1), we observe that 2,129 certificates (1.5% out security company and an information aggregator cannot of 145,582) have the unauthorized error messages; most see most of these certificates. Additionally, it provides certificates (1,515, 71.2%) are issued by Go Daddy. To a possible explanation for the long revocation delays we figure out whether client or server-side causes the prob- report. lem, we check the revocation status of the certificates through OCSP using OpenSSL, and using SignTool on the Windows platforms. Both tools receive the unautho- 8 Discussion rized error messages, which indicates that this problem The findings from our measurement study (Section 4–6) results from the server-side, not the client-side. suggest the current revocation systems based on CRLs The unknown or unauthorized responses from OCSP and OCSP are facing several problems including (1) dif- may not affect Windows platforms in terms of secu- ficulties in discovering compromised certificates, (2) re- rity since they also check CRLs if they receive those vocation delay, (3) ineffective revocation dates, and (4) responses. However, it indicates that CAs improperly improper maintenance of the revocation information. We maintain their OCSP servers. discuss several preliminary recommendations for the ef- fective code signing PKI and how a new design could 7 Limitation address the current problems in revocation. Recommendation. We suggest the following properties Data sets collection. Due to the nature of how signed bi- for the revocation system: naries are distributed (various distribution mechanisms), • Publicize the issuances of certificates and signed bi- there is no easy way to collect all signed binaries and naries. As depicted in Section 4, CAs have difficulties code signing certificates in the wild. For example, some in discovering compromised certificates that they have binaries come directly from websites, but others come issued due to the nature of the code signing PKI. If after running installers or updaters or from external stor- CAs or owners of certificates are informed and aware age. More importantly malicious binaries often are tar- that their certificates are abused, CAs would promptly geted and the samples are hard find or only available for and properly revoke the compromised certificates. For a short time. This is an important difference between the this goal, similar to TLS certificate transparency [18], code signing PKI and the Web’s PKI as it relates to mea- we suggest a new certificate transparency system for surement studies. TLS certificates collected through net- the code signing PKI. In this system, CAs should log work scanners provide a view of the publicly accessible the issuances of code signing certificates when issuing Web’s PKI, however our collected code signing certifi- new certificates. The distinct feature from TLS certifi- cates may not be representative of the entire code signing cate transparency is that publishers are required to log PKI ecosystem as the collected data sets do not cover all the history of when/what binaries (to be publicly dis- certificates and signed samples in the wild. Therefore, tributed) are signed with their private keys. Along with we attempt to collect the broadest view of code sign- code signing certificates, the hash values of signed bi- ing certificates, and also try to approximate how large naries are logged in the proposed system. The system compromised code signing certificates are with the mark- should be available to the public so that anyone can recapture estimation. audit and monitor the logs. Using the logs, CAs and Mark-recapture population estimation. As we dis- owners are able to know the first date of when a certifi- cussed in Section 4.1, the characteristics of the data vi- cate becomes compromised, which results in a proper olates the assumptions of Mark-recapture algorithm: 1) effective revocation date.

USENIX Association 27th USENIX Security Symposium 865 • Better dissemination of revocation information. The Web browsers often failed to check the revocation status CAs should better understand the code signing PKI due to the expensive revocation status checking in terms and properly maintain their revocation systems (CRLs of bandwidth and latency. Kumar et al. [16] measured the and OCSP servers) to have better availability and con- mismanagement of OCSP and CRLs in the Web’s PKI: sistency that can help clients correctly check the re- specifically endpoint availability, uptime, and error re- vocation status of certificates. Moreover, rather than sponses. maintaining their own separate infrastructures only for dissemination of revocation information, they may use 10 Conclusion our proposed code signing certificates transparency to log their revocation information. Certificate revocation is the primary defense against the • More conservative Windows’ checking policy. Win- abuse in the code signing PKI. An effective certificate re- dows should double-check the revocation status of vocation process consists of three roles: (1) discovering code signing certificates for the inconsistent responses compromised certificates, (2) revoking the compromised from OCSP and CRLs. Moreover, Windows should certificates with a meaningful date, and (3) disseminat- apply the hard-fail revocation checking policy for bet- ing the revocation information. However, we found that ter security. the revocation processes can have security problems, and new security threats can be introduced by the problems. 9 Related Work In the discovery phase, CAs take on average 5.6 months to revoke the compromised certificates after the certifi- We discuss related work in two key areas: identifying the cates was used to sign a known malicious binary. The code signing PKI abuse and measuring revocation prob- mark-recapture estimation of compromised certificates lems in the Web’s PKI. point to the fact that it is difficult to find abusive cer- Code signing PKI abuse. [28], Kotzias et tificates in the wild. The validity of a signed sample al. [13], and Alrawi et al. [1] examined the signed ma- is determined by the effective revocation date, but CAs licious PE files. They found that the most malicious PE improperly set effective revocation dates. The inaccu- files were PUP, and they were signed with code signing rate effective revocation dates mean that signed malware certificates legitimately issued from CAs. On the con- remains valid even after its certificate is revoked. Al- trary, Kim et al. [12] focused on the breaches of the trust though CAs properly and promptly revoke the compro- in the code signing PKI ecosystems; many certificates as- mised certificates, clients can be exposed to signed mal- sociated with stolen private keys were used to sign mal- ware attacks due to CAs’ mismanagements of CRL and ware. These studies briefly introduced a few of the re- OCSP. There are many cases that we have seen where vocation problems, but they did not make a distinction clients are unable to check certificate revocation status between the effective revocation date (tr) and the revoca- due to (1) missing CRLs and OCSP points, (2) unreach- tion publication date (tp) and only measured the former. able CRLs and OCSP points, (3) CRLs that are no longer This may result in an inaccurate estimation of the revo- updated, (4) revoked certificates that are mistakenly re- cation delay. In contrast, we measured tp by periodically moved from a CRL, (5) inconsistent responses from CRL collecting CRLs. Additionally, we analyzed the revoca- and OCSP, and (6) unknown or unauthorized responses tion process from end-to-end and we report new findings from OCSP. These discoveries highlight various proper- regarding the discovery of compromised certificates and ties of the code signing PKI and its revocation process the dissemination of revocation information. that should be monitored more actively due to the secu- rity implications that they create. Revocations problems in the Web’s PKI. Compared to the code signing PKI, the Web’s PKI ecosystems has been well studied since many network scanners have Acknowledgement been introduced to collect data: e.g., Zmap [7]. Zhang et al. [30] and Durumeric et al. [6] have found that the We thank the reviewers and our shepherd, number of revocations increased after the Heartbleed Mohammad Mannan, for their feedback. We also thank announcement. However, the majority of the compro- VirusTotal for access to their service and Symantec for mised certificates were not revoked even after new cer- making data available through the WINE platform. This tificates were re-issued. Liu et al. took a close look research was partially supported by the National Science at the TLS certificate revocation [19]. They found that Foundation (award CNS-1564143) and the Department a large fraction of TLS revoked certificates are served. of Defense.

866 27th USENIX Security Symposium USENIX Association References [15]K REBS,C.J., ETAL. Ecological methodology. Tech. rep., Harper & Row New York, 1989. [1]A LRAWI,O., AND MOHAISEN, A. Chains of distrust: To- wards understanding certificates used for signing malicious ap- [16]K UMAR,D.,WANG,Z.,HYDER,M.,DICKINSON,J.,BECK, plications. In Proceedings of the 25th International Conference G.,ADRIAN,D.,MASON,J.,DURUMERIC,Z.,HALDERMAN, Companion on World Wide Web (Republic and Canton of Geneva, J.A., AND BAILEY, M. Tracking certificate misissuance in the Switzerland, 2016), WWW ’16 Companion, International World wild. In 2018 IEEE Symposium on Security and Privacy (SP), Wide Web Conferences Steering Committee, pp. 451–456. pp. 288–301. [2]C ANGIALOSI, F., CHUNG, T., CHOFFNES,D.,LEVIN,D., [17]K WON,B.J.,SRINIVAS, V., DESHPANDE,A., AND DUMI- MAGGS,B.M.,MISLOVE,A., AND WILSON, C. Measurement TRAS, T. Catching worms, trojan horses and pups: Unsupervised and analysis of private key sharing in the https ecosystem. In Pro- detection of silent delivery campaigns. In 24th Annual Network ceedings of the 2016 ACM SIGSAC Conference on Computer and and Distributed System Security Symposium, NDSS 2017, San Communications Security (New York, NY, USA, 2016), CCS ’16, Diego, , USA, February 26 - March 1, 2017 (2017). ACM, pp. 628–640. [18]L AURIE,B.,LANGLEY,A., AND KASPER, E. Certificate trans- [3]C ODESIGNINGWORKINGGROUP. Minimum requirements for parency. RFC 6962, RFC Editor, June 2013. the issuance and management of publicly-trusted code signing certificates. Tech. rep., 2016. [19]L IU, Y., TOME, W., ZHANG,L.,CHOFFNES,D.,LEVIN,D., [4]C OOPER,D.,SANTESSON,S.,FARRELL,S.,BOEYEN,S., MAGGS,B.,MISLOVE,A.,SCHULMAN,A., AND WILSON, HOUSLEY,R., AND POLK, W. Internet X.509 public key in- C. An End-to-End Measurement of Certificate Revocation in the frastructure certificate and certificate revocation list (crl) profile. Web’s PKI. ACM Press, pp. 183–196. RFC 5280, RFC Editor, May 2008. http://www.rfc-editor. [20]M ICROSOFT. Code-Signing Best Practices. Tech. rep., org/rfc/rfc5280.txt. 2007. https://msdn.microsoft.com/en-us/library/ [5]D UMITRAS, , T., AND SHOU, D. Toward a standard benchmark windows/hardware/dn653556. for research: The Worldwide Intelligence Net- work Environment (WINE). In EuroSys BADGERS Workshop [21]M ICROSOFT. Windows Authenticode portable ex- (Salzburg, Austria, Apr 2011). ecutable signature format, Mar 2008. http: //download.microsoft.com/download/9/c/ [6]D URUMERIC,Z.,KASTEN,J.,ADRIAN,D.,HALDERMAN, 5/9c5b2167-8017-4bae-9fde-d599bac8184a/ J.A.,BAILEY,M.,LI, F., WEAVER,N.,AMANN,J.,BEEK- Authenticode_PE.docx. MAN,J.,PAYER,M., AND PAXSON, V. The matter of Heart- bleed. In Proceedings of the 2014 Conference on Internet Mea- [22]P ARNO,B.,MCCUNE,J.M., AND PERRIG, A. Bootstrapping surement Conference (New York, NY, USA, 2014), IMC ’14, trust in commodity computers. In IEEE Symposium on Security ACM, pp. 475–488. and Privacy (2010), pp. 414–429. [7]D URUMERIC,Z.,WUSTROW,E., AND HALDERMAN,J.A. [23]R ESEARCH,K.L.G., AND TEAM, A. The duqu 2.0 persistence ZMap: Fast internet-wide scanning and its security applica- module, Jun 2015. tions. In Proceedings of the 22Nd USENIX Conference on Secu- rity (Berkeley, CA, USA, 2013), SEC’13, USENIX Association, [24]S ANTESSON,S.,MYERS,M.,ANKNEY,R.,MALPANI,A., pp. 605–620. GALPERIN,S., AND ADAMS, C. X.509 internet public key in- [8]F ALLIERE,N.,O’MURCHU,L., AND CHIEN, E. W32.Stuxnet frastructure online certificate status protocol - ocsp. RFC 6960, dossier. Symantec Whitepaper, February 2011. RFC Editor, June 2013. http://www.rfc-editor.org/rfc/ [9] GOODIN, D. Stuxnet spawn infected kaspersky using stolen rfc6960.txt. foxconn digital certificates, Jun 2015. [25]S EBASTIAN´ ,M.,RIVERA,R.,KOTZIAS, P., AND CA- [10]H OLZ,R.,BRAUN,L.,KAMMENHUBER,N., AND CARLE, BALLERO, J. Avclass: A tool for massive malware labeling. In G. The SSL landscape: a thorough analysis of the X.509 PKI International Symposium on Research in Attacks, Intrusions, and using active and passive measurements. In Proceedings of the Defenses (2016), Springer, pp. 230–253. 2011 ACM SIGCOMM conference on Internet measurement con- [26]S WIAT. Flame malware explained, Jun 2012. ference (2011), ACM, pp. 427–444. [11]K ALISKI, B. PKCS #7: Cryptographic message syntax ver- [27]V IRUSTOTAL. www.virustotal.com, 2017. sion 1.5. RFC 2315, RFC Editor, March 1998. http://www. [28]W OOD, M. Want My Autograph? The Use and Abuse of Digi- rfc-editor.org/rfc/rfc2315.txt. tal Signatures by Malware. Virus Bulletin Conference September [12]K IM,D.,KWON,B.J., AND DUMITRAS, , T. Certified malware: 2010, September (2010), 1–8. Measuring breaches of trust in the windows code-signing pki. In Proceedings of the 2017 ACM SIGSAC Conference on Computer [29]Y ILEK,S.,RESCORLA,E.,SHACHAM,H.,ENRIGHT,B., AND and Communications Security (2017), CCS ’17. SAVAGE, S. When private keys are public: Results from the 2008 debian openssl vulnerability. In Proceedings of the 9th ACM SIG- [13]K OTZIAS, P., MATIC,S.,RIVERA,R., AND CABALLERO,J. Certified pup: Abuse in authenticode code signing. In Proceed- COMM Conference on Internet Measurement (New York, NY, ings of the 22Nd ACM SIGSAC Conference on Computer and USA, 2009), IMC ’09, ACM, pp. 15–27. Communications Security (New York, NY, USA, 2015), CCS ’15, [30]Z HANG,L.,CHOFFNES,D.,LEVIN,D.,DUMITRAS, T., MIS- ACM, pp. 465–478. LOVE,A.,SCHULMAN,A., AND WILSON, C. Analysis of ssl [14]K OZAK´ ,K.,KWON,B.J.,KIM,D.,GATES,C., AND certificate reissues and revocations in the of heartbleed. DUMITRAS¸, T. Issued for abuse: Measuring the underground In Proceedings of the 2014 Conference on Internet Measure- trade in code signing certificate. In 17th Annual Workshop on the ment Conference (New York, NY, USA, 2014), IMC ’14, ACM, Economics of Information Security (WEIS) (2018). pp. 489–502.

USENIX Association 27th USENIX Security Symposium 867 Appendix

A Screenshots

Figure 7: Screenshot of the prompt displayed in Win- dows 10 when executing a signed binary file with miss- ing certificate revocation information. In this case, nor CRL or OCSP information is provided in the certificate. Figure 6: Screenshot of Windows 10 when a certificate without revocation information.

868 27th USENIX Security Symposium USENIX Association