Large Scale Measurement on the Adoption of Encrypted DNS

Sebastián García
Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic

Karel Hynek
Faculty of Information Technology, Czech Technical University in Prague and CESNET z.s.p.o., Prague, Czech Republic

Dmitrii Vekshin
Avast s.r.o., Prague, Czech Republic

Tomáš Čejka
CESNET z.s.p.o., Prague, Czech Republic

Armin Wasicek
Avast s.r.o., Prague, Czech Republic

Figure 1: Sum of DoH, DoT, and DoQ flows for Organization 2 over 16 months; the x-axis shows the days of the month.

arXiv:2107.04436v1 [cs.CR] 9 Jul 2021

ABSTRACT

Several encryption proposals for DNS have been presented since 2016, but their adoption has not yet been comprehensively studied. This research measured the current adoption of DoH (DNS over HTTPS), DoT (DNS over TLS), and DoQ (DNS over QUIC) for five months at the beginning of 2021 by three different organizations with global coverage. By comparing the total values, the amount of requests per user, and the seasonality of the traffic, it was possible to obtain the current adoption trends. Moreover, we actively scanned the Internet for still-unknown working DoH servers and compared them with a novel curated list of well-known DoH servers.

We conclude that despite growing in 2020, during the first five months of 2021 there was statistically significant evidence that the average amount of Internet traffic for DoH, DoT and DoQ remained stationary. However, we found that the number of still-unknown, ready-to-use DoH servers grew 4 times. These measurements suggest that even though the amount of encrypted DNS traffic is currently not growing, there may soon be more connections to those unknown DoH servers for benign and malicious purposes.

KEYWORDS

DoH, DoT, Encrypted DNS, Network Measurement, Network Trends

1 INTRODUCTION

The Domain Name System (DNS) is a core service that plays an essential role in network communication, but it does not meet the current requirements of privacy and security. In recent years there has been an increasing number of proposed protocols to enhance its privacy and confidentiality via encryption [24]. This trend started around 2014 [17], with protocols like DNS over HTTPS (DoH), DNS over TLS (DoT), or DNS over QUIC (DoQ).

There have been previous measurements of the prevalence of encrypted DNS on the Internet, focused on specific protocols of encrypted DNS [6, 25]. However, there is a lack of real-world observations of the volume of encrypted DNS traffic, and there is no study of the long-term trends in its adoption. Moreover, the number of DoH servers seems to be small and highly controlled by organizations.

This paper presents the first comprehensive measurement, comparison and analysis of the volume of DoH, DoT and DoQ traffic from the perspective of three large organizations: a) the backbone infrastructure of an Internet Service Provider, b) the network infrastructure of a large University, and c) the traffic of millions of endpoints from a global security company.

The methodology of our analysis has two main parts. First, a measurement of the total number of DoH servers on the Internet via a global scan and verification. Second, a deep statistical analysis of the trends, stationarity, patterns and ratios between the three organizations during the first five months of 2021. The analysis is completed by a comparison of the trends with the traffic from 2020. All trends and ratios are statistically tested for similarities, and the stationarity of the time series is verified with the Augmented Dickey–Fuller test.

The results show that despite the growth of encrypted DNS traffic in 2020, the first five months of 2021 show a statistically significant stationarity of the traffic. With some differences for DoH and DoT traffic, these protocols are used to encrypt up to 0.01% of the total DNS traffic. Regarding the unknown DoH servers on the Internet, results show that there are at least 4 times more working DoH servers than appear in the most comprehensive and current list of well-known DoH servers. These services could be easily exploited by security threats.

The contributions of this paper are:

• A comprehensive dataset of well-known DoH providers.
• A dataset of unknown and unpublished DoH servers obtained by a global Internet scan.
• A measurement and comparison of DoH, DoT and DoQ traffic in three large organizations.
• A new Nmap NSE script tool to verify DoH servers.

2 RELATED WORK

DNS encryption is a relatively novel approach to improve users' privacy. The IETF standardized DNS over TLS (DoT) in 2016 as RFC 7858 [18], followed by DNS over HTTPS (DoH) in 2018 as RFC 8484 [15], and the third approach, DNS over QUIC (DoQ), is still an RFC draft [19]. Since DNS encryption modifies the usage of one of the oldest and most essential protocols on the Internet, many research studies focused on the consequences of large-scale DNS encryption.

Borgolte et al. [2] provided a general discussion about DoH and several areas such as performance, security, and privacy. Similarly, Fidler et al. [7] presented challenges that network operators are going to face due to DNS encryption. Both studies concluded that large deployments of privacy-preserving DNS will be a problem for users' security, since automated network security tools rely on unencrypted DNS. This phenomenon is further researched by Bumanglag et al. [3], who analyzed DoH traffic and discussed the challenges related to its detection, since DoH uses a well-known port allocated for HTTPS communication (443/TCP). The Bumanglag paper also mentioned that the cybersecurity community suggests allowing HTTPS traffic to pass only through inspection proxies that would decrypt and re-encrypt all HTTPS traffic.

The detection of DoH in different traffic was studied in multiple papers. There are currently two main approaches to DoH detection. The first approach is the detection of regular behavior patterns by Hjelm et al. [13]. They claim that by using the Real Intelligence Threat Analytics (RITA) framework, they were able to identify behavioral patterns of DoH and successfully detect them. The second approach employs Machine Learning principles and was introduced by Vekshin et al. [35]. Their AdaBoost decision tree classifier was able to detect DoH connections with an accuracy of more than 99 %. A limitation was that their classifier worked only with DoH created by web browsers and could not detect single DoH queries.

Malicious DoH studies were limited only to data exfiltration via DNS. MontazeriShatoori et al. [29] created a classifier that can distinguish between a DoH tunnel and regular DoH with outstanding precision. Similar results were achieved on the same dataset by Singh et al. [34] and Banadaki [1].

Several studies [4, 16, 21, 33] also challenge the privacy-preserving characteristics of encrypted DNS. Studies from Siby et al. [33] and Bushart et al. [4] brought interesting results by performing very accurate website fingerprinting using only DoH and DoT traffic. Even though their results showed a limitation of encryption, their proposed attack is difficult to reproduce in a real environment and could be mitigated by proper payload padding. Nevertheless, the main reason for privacy concerns is still the transition of visibility from local DNS resolvers controlled by Internet Service Providers (ISPs) to global DoH service providers that centralize the entire process of domain name translation.

The previously mentioned studies focused mainly on encrypted DNS characteristics, weaknesses, and detection possibilities. However, none of them dealt with the adoption and actual popularity of DNS encryption. Deccio et al. [5] studied in 2019 the adoption of DoT and DoH by open resolvers. Their results show that the adoption is very poor: from around 1.2 million open resolvers, only 1,747 (0.15 %) supported DoT, and only 9 supported DoH. Doan et al. [6] performed a similar measurement focusing on DoT in 2020 and found 2,151 open resolvers supporting DoT, which shows an increasing trend in resolver adoption. However, these data show only the adoption across open DNS resolvers and do not consider specialized DoT/DoH proxies.

Furthermore, Doan et al. [6] and Lu et al. [25] also measured trends in encrypted DNS usage. Both studies used a dataset from ISP backbone lines collected in 2019, when encrypted DNS was not enabled by default in browsers and was much less popular. Additionally, both studies focused mainly on measuring the prevalence of DoT, and the DoH measurement was very limited. Last but not least, their studies did not consider DNS over QUIC.

We are not aware of any previous work that tried to enumerate all DoH resolvers to infer the encrypted DNS preva-

Table 1: Summary of the most important Encrypted DNS methods

                DoT         DoH      DoQ
L4 Protocol     TCP/UDP     TCP      UDP
Port            853         443      784
Protocol        TLS/DTLS    HTTPS    QUIC

3.1 DNS over TLS (DoT)

DNS over TLS (DoT) is specified by RFC 7858 [18]. According to the specification, the DNS wire-format messages defined in RFC 1035 [28] are sent over a secure TLS or Datagram TLS connection. IANA reserved port 853/TCP, which should be used by default in all DoT clients and resolvers. Since the packets are sent over a dedicated port, the traffic can be easily recognized, blocked, or filtered by a network administrator. However, it is worth noting that the RFC standard allows using DoT over ports other than 853/TCP; therefore, it is expected that DoT clients and resolvers will have the option to change the port in their configuration.

3.2 DNS over HTTPS (DoH)

The DNS over HTTPS (DoH) protocol was adopted as RFC 8484 [15] in October 2018.
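As an illustration of how the same RFC 1035 wire-format message is carried by the two standardized protocols, the following minimal Python sketch builds one query and frames it for DoT (a two-byte length prefix, as in DNS over TCP) and for a DoH GET request (a base64url-encoded `dns` parameter, as defined in RFC 8484). It is a sketch under stated assumptions and not part of the measurement tooling of this paper: the helper names, the query name example.com, and the resolver URL are hypothetical, and no network traffic is sent.

```python
import base64
import struct

DOT_PORT = 853  # default DoT port (Table 1)
DOH_PORT = 443  # DoH shares the HTTPS port (Table 1)


def build_query(name: str, qtype: int = 1, query_id: int = 0) -> bytes:
    """Build a minimal RFC 1035 wire-format query (one question, RD set)."""
    # Header: ID, flags (0x0100 = recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: length-prefixed labels terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE (1 = A record) and QCLASS (1 = IN).
    return header + qname + struct.pack("!HH", qtype, 1)


def frame_for_dot(message: bytes) -> bytes:
    """DoT reuses DNS-over-TCP framing: a two-byte big-endian length prefix."""
    return struct.pack("!H", len(message)) + message


def doh_get_url(message: bytes, endpoint: str) -> str:
    """RFC 8484 GET: base64url-encode the message without '=' padding."""
    encoded = base64.urlsafe_b64encode(message).rstrip(b"=").decode("ascii")
    return f"{endpoint}?dns={encoded}"


query = build_query("example.com")
dot_payload = frame_for_dot(query)  # bytes that would be written to the TLS stream
# Hypothetical resolver URL, used only to show the shape of the request.
doh_url = doh_get_url(query, "https://doh.example/dns-query")
print(len(query), len(dot_payload))
print(doh_url)
```

The sketch shows why the two protocols differ in visibility to a network administrator: the DoT payload travels on a dedicated port, while the DoH request is an ordinary HTTPS URL on port 443.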