Last-Mile TLS Interception: Analysis and Observation of the Non-Public HTTPS Ecosystem

Last-Mile TLS Interception: Analysis and Observation of the Non-Public HTTPS Ecosystem Xavier de Carné de Carnavalet A thesis in The Concordia Institute for Information Systems Engineering Presented in Partial Fulfillment of the Requirements For the Degree of Doctor of Philosophy (Information and Systems Engineering) at Concordia University Montréal, Québec, Canada July 2019 c Xavier de Carné de Carnavalet, 2019 CONCORDIA UNIVERSITY School of Graduate Studies This is to certify that the thesis prepared By: Mr. Xavier de Carné de Carnavalet Entitled: Last-Mile TLS Interception: Analysis and Observation of the Non-Public HTTPS Ecosystem and submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Information and Systems Engineering) complies with the regulations of this University and meets the accepted standards with re- spect to originality and quality. Signed by the final examining committee: Chair Dr. William Lynch External Examiner Dr. Carlisle Adams External to Program Dr. Wahab Hamou-Lhadj Examiner Dr. Amr Youssef Examiner Dr. Jeremy Clark Thesis Supervisor Dr. Mohammad Mannan Approved by Dr. Mohammad Mannan, Graduate Program Director July 24, 2019 Dr. Amir Asif, Dean Gina Cody School of Engineering and Computer Science Abstract Last-Mile TLS Interception: Analysis and Observation of the Non-Public HTTPS Ecosystem Xavier de Carné de Carnavalet, Ph.D. Concordia University, 2019 Transport Layer Security (TLS) is one of the most widely deployed cryptographic protocols on the Internet that provides confidentiality, integrity, and a certain degree of authenticity of the communications between clients and servers. Following Snowden’s revelations on US surveillance programs, the adoption of TLS has steadily increased. However, encrypted traffic prevents legitimate inspection. Therefore, security solutions such as personal an- tiviruses and enterprise firewalls may intercept encrypted connections in search for mali- cious or unauthorized content. Therefore, the end-to-end property of TLS is broken by these TLS proxies (a.k.a. middleboxes) for arguably laudable reasons; yet, may pose a security risk. While TLS clients and servers have been analyzed to some extent, such proxies have remained unexplored until recently. We propose a framework for analyzing client- end TLS proxies, and apply it to 14 consumer antivirus and parental control applications as they break end-to-end TLS connections. Overall, the security of TLS connections was systematically worsened compared to the guarantees provided by modern browsers. Next, we aim at exploring the non-public HTTPS ecosystem, composed of locally- trusted proxy-issued certificates, from the user’s perspective and from several countries in residential and enterprise settings. We focus our analysis on the long tail of interception events. We characterize the customers of network appliances, ranging from small/medium iii businesses and institutes to hospitals, hotels, resorts, insurance companies, and govern- ment agencies. We also discover regional cases of traffic interception malware/adware that mostly rely on the same Software Development Kit (i.e., NetFilter). Our scanning and analysis techniques allow us to identify more middleboxes and intercepting apps than pre- viously found from privileged server vantages looking at billions of connections. We further perform a longitudinal study over six years of the evolution of a prominent traffic-intercepting adware found in our dataset: Wajam. We expose the TLS interception techniques it has used and the weaknesses it has introduced on hundreds of millions of user devices. This study also (re)opens the neglected problem of privacy-invasive adware, by showing how adware evolves sometimes stronger than even advanced malware and poses significant detection and reverse-engineering challenges. Overall, whether beneficial or not, TLS interception often has detrimental impacts on security without the end-user being alerted. iv Acknowledgments I would like to express my deepest gratitude and appreciation to my supervisor Dr. Mo- hammad Mannan, for his continuous support, guidance, and pushing me to surpass myself. His patience and dedication contributed to making this thesis possible. My journey into the Ph.D. was an adventure, and I am grateful to those I met and who supported me along the way. In particular, my love and appreciation go to Mengyuan Zhang. I wish to thank all members of the Madiba Security Research Group, especially Liany- ing Zhao (Viau), as well as the rest of my research colleagues of the CIISE department, for their enthusiastic discussions. I also would like to express my gratitude to my friends and family, who helped me keep on track. v Contents List of Figures xiii List of Tables xv 1 Introduction 1 1.1 Motivation . 1 1.2 Thesis Statement . 3 1.3 Objectives and Contributions . 3 1.4 Related Publications . 5 1.5 Outline . 5 2 Background 6 2.1 SSL/TLS . 6 2.2 Terminology . 7 2.3 Trusted Root CA Stores . 8 2.3.1 System CA Store . 8 2.3.2 Third-party CA Stores . 8 2.4 OS-provided APIs for Key Storage . 9 2.5 Insertions in Trusted Stores: Implications . 10 2.6 Client-side TLS Proxies and Appliances . 11 vi 3 Literature Review 13 3.1 Surveys on SSL/TLS and the CA Infrastructure . 13 3.2 Certificate Collection and Analyses . 14 3.2.1 Internet-wide Active Scans . 14 3.2.2 Passive Certificate Collection . 18 3.3 TLS Proxy-oriented Analyses and Protocols . 19 3.3.1 Network Appliances . 20 3.3.2 Software Proxies . 20 3.3.3 TLS Proxy Protocols . 21 3.4 Implementation Verification . 21 3.4.1 Certificate Generation for Testing Purposes . 22 3.4.2 Source Code Analysis . 23 3.4.3 TLS Implementation Testing . 23 3.5 Miscellaneous . 23 3.5.1 Related Technologies . 24 3.5.2 Mimicking TLS handshakes . 24 4 Analyzing Client-end TLS Interception Software 25 4.1 Methodology . 25 4.1.1 Analysis Framework . 25 4.1.1.1 Root Certificate and Private Key . 26 4.1.1.2 Certificate Validation . 26 4.1.1.3 Server-end Parameters . 26 4.1.1.4 Client-end Transparency . 27 4.1.2 Threat Model . 27 4.1.3 Product Selection . 28 4.2 Contributions . 28 vii 4.3 Major Findings . 31 4.4 Private Key Extraction . 32 4.4.1 Locating Private Keys in Files and Windows Registry . 33 4.4.2 Application-protected Private Keys . 34 4.4.2.1 Identify the Process Responsible for TLS Filtering . 34 4.4.2.2 Retrieving Passphrases . 35 4.4.2.3 Encrypted Containers . 36 4.4.3 Security Considerations . 36 4.5 Limitations of Existing TLS Test Suites . 38 4.5.1 Certificate Verification . 39 4.5.2 TLS Security Parameters . 41 4.6 Our TLS Proxy Testing Framework . 41 4.6.1 Test Environment . 41 4.6.2 Certificate Validation Testing . 42 4.6.3 Proxy-embedded Trusted Stores . 44 4.6.4 TLS Versions and Known Attacks . 46 4.7 Results Analysis . 47 4.7.1 Root Certificates . 47 4.7.1.1 Certificate Generation . 48 4.7.1.2 Third-party Trusted Stores . 49 4.7.1.3 Self-acceptance . 49 4.7.1.4 Filtering Conditions . 49 4.7.1.5 Expired Product Licenses . 50 4.7.1.6 Uninstallation . 50 4.7.2 Private Key Protections . 50 4.7.2.1 Passphrase-protected Private Keys . 51 viii 4.7.2.2 Encrypted Containers . 52 4.7.3 Certificate Validation and Trusted Stores . 53 4.7.3.1 Invalid Chain of Trust . 53 4.7.3.2 Weak and Deprecated Encryption/signing Algorithms . 55 4.7.3.3 Proxy-embedded Trusted Store . 55 4.7.4 TLS Parameters . 57 4.7.4.1 SSL/TLS Versions . 57 4.7.4.2 Certificate Security Parameters . 59 4.7.4.3 Cipher Suites . 60 4.7.4.4 Known Attacks . 60 4.8 Practical Attacks . 61 4.9 Company Notifications and Responses . 64 4.10 Recommendations for Safer TLS Proxying . 65 4.11 Conclusion . 70 5 A Client-side View of the HTTPS Ecosystem 71 5.1 Introduction . 71 5.2 First Data Collection: L17 . 76 5.2.1 Data Collection Methodology . 77 5.2.1.1 Luminati . 77 5.2.1.2 Domain Datasets . 78 5.2.1.3 Country List . 80 5.2.1.4 Browser-like TLS Handshake Simulation . 81 5.2.1.5 Scanning Methodology . 84 5.2.1.6 Verifying Certificates . 85 5.2.1.7 Analysis Methodology . 87 5.2.2 Findings . 89 ix 5.2.2.1 Personal Filters & Enterprise Middleboxes Identification 90 5.2.2.2 Middleboxes . 92 5.2.2.3 NetFilter-based Interceptions . 95 5.2.2.4 New Trends . 99 5.2.2.5 Country-wide Censorship and ISP-level Interception . 100 5.2.2.6 Likely Malware . 103 5.2.2.7 False Positives . 103 5.2.2.8 Remaining Unknown Certificates . 105 5.2.3 Discussion on Network Errors . 106 5.2.4 Trusted Certificates and CT logs . 108 5.3 Second Data Collection: L19 . 109 5.3.1 Data Collection Methodology . 109 5.3.1.1 Domain Datasets . 110 5.3.1.2 Country List . 112 5.3.1.3 Browser-like TLS Handshake Simulation for TLS 1.3 . 112 5.3.1.4 Scanning Methodology . 114 5.3.1.5 Verifying Certificates . 117 5.3.2 Findings . 118 5.3.2.1 Enterprise Proxies and Home Filters . 119 5.3.2.2 ISP-level Injection . 120 5.3.2.3 NetFilter-based Interceptions ..

Last-Mile TLS Interception: Analysis and Observation of the Non-Public HTTPS Ecosystem

Using Frankencerts for Automated Adversarial Testing of Certificate

Hola Dating App, Your Guide to Find a Suitable Personality Based on Yours

Opera Mini Application for Android

Measuring the Rapid Growth of HSTS and HPKP Deployments

Comptia Security+ 501

Sizzle: a Standards-Based End-To-End Security Architecture for the Embedded Internet

A Systematic Empirical Analysis of Unwanted Software Abuse, Prevalence, Distribution, and Economics

Data Privacy: the Current Legal Landscape February 2016

Tracking Users Across the Web Via TLS Session Resumption

Weather Investment

HOLA Lite Setup

An Empirical Analysis of Email Delivery Security