Pdf Monetization to the Benefit of Attackers
Total Page:16
File Type:pdf, Size:1020Kb
SIDN Labs https://sidnlabs.nl February 13, 2017 Peer-reviewed Publication Title: Domain names abuse and TLDs: from monetization towards mitigation Authors: Giovane C. M. Moura, Moritz M¨uller,Marco Davids, Maarten Wullink, and Cristian Hesselman Venue: 3rd IEEE/IFIP Workshop on Security for Emerging Dis- tributed Network Technologies (DISSECT 2017), co-located with IFIP/IEEE International Symposium on Integrated Network Man- agement (IM 2017), Lisbon, Portugal. Conference dates: May 8th to 12th, 2017. Citation: • Moura, G.C. M., M¨uller,M., Davids, M., Wullink, M, Hesselman, C.: Do- main names abuse and TLDs: from monetization towards mitigation. 3rd IEEE/IFIP Workshop on Security for Emerging Distributed Network Technolo- gies (DISSECT 2017), co-located with IFIP/IEEE International Symposium on Integrated Network Management (IM 2017). Lisbon, Portugal, May 2017 (to appear) • Bibtex: @inproceedings{sidn-dissect-2017, author = {{Giovane C. M. Moura, Moritz Muller, Marco Davids, Maarten Wullink, and Cristian Hesselman}}, booktitle={{ 3rd IEEE/IFIP Workshop on Security for Emerging Distributed Network Technologies (DISSECT 2017), co-located with IFIP/IEEE International Symposium on Integrated Network Management (IM 2017)}}, title={Domain names abuse and TLDs: from monetization towards mitigation}}, year={2017}, month={May}, } 1 Domain names abuse and TLDs: from monetization towards mitigation Giovane C. M. Moura, Moritz Muller,¨ Marco Davids, Maarten Wullink, and Cristian Hesselman SIDN Labs Stichting Internet Domeinregistratie Nederland (SIDN) Arnhem, The Netherlands Email: ffi[email protected] Abstract—Hidden behind domain names, there are lucrative Domain Registration Domain Resolution (and ingenious) business models that misuse/abuse the DNS Authori- Registrar Zone tative DNS namespace and employ a diversified form of monetization. To Registrant Registry User / Reseller File Name Resolvers curb some of those abuses, many research works have been pro- Servers posed. However, while having a clear contribution and advancing the state-of-the-art, these works are constrained by their limited Active Scans datasets and none of them present a survey on the forms of DNS abuse. In this paper, we address these limitations by presenting a case study in one top-level domain (TLD) operator (.nl) with RegDB Records AuthDNS diverse longitudinal datasets. We then cover eight business models Datasets that DNS abusers employ and their respective monetization form, and discuss how TLD operators can employ these datasets to Fig. 1. TLD Operations: registration (left), domain name resolution (right), detect these forms of abuse. and derived datasets. I. INTRODUCTION Domain names have long been misused for different types of curb each form of abuse. The main contribution of this paper abuse: phishing, malware distribution, spamming, and botnet is, therefore, a survey of domain related abuses including their command-and-control (C&C) are just some of them. Underly- underlying business models, and a discussion on how TLD ing each of these forms of abuse, we find profitable business operators can use their datasets to mitigate them. models, which provide the incentives for these abusers to The remainder of this paper is divided as follows: we cover continue with such activities. the two basic services provided by TLDs (domain registration To curb such practices, the research community has been and name resolution) in xII. Then, we introduce in xIII the active in proposing various solutions, such as [1], [2], [3], datasets we used . Following that, we present in xIV a survey [4], [5], [6]. While these works advance the state-of-the art on the types of DNS abuse and their underlying business and have a clear contribution, they are faced with two main models and their respective implications on the datasets. shortcomings: (i) they are constrained by type and/or duration Finally, conclusions and future work are discussed in xV. of their respectively available datasets (due to the difficulty in obtaining such datasets) and (ii) while these solutions cover II. BACKGROUND different sorts of abuse, we lack a survey on domain-related A. Domain Registration abuses, which leaves the question of how much ground has Domain registration consists of creating a unique domain not been covered yet unanswered. name that ultimately is added to the zone file of a DNS This paper addresses both issues: by carrying out a case zone. Typically, it involves the so-called triple-R: registrant, study on the top-level domain (TLD) of the Netherlands (.nl), registrar (or reseller), and registry (or TLD operator). Figure 1 we address the first issue by analyzing three longitudinal summarizes the process (left part). To register a domain in a datasets readily available to TLD operators (xIII): historical specific TLD, first a registrant (a user) choses a registrar (e.g: registration database (as opposed to hard-to-parse [7] and GoDaddy) that is accredited by the TLD of his/her choice. yet incomplete whois records), traffic to authoritative name Once the requirements of the registrar are fulfilled (personal servers (centralized view instead of DNS resolvers traffic), and data, payment), it contacts the registry and registers the the infrastructure used by the domains (obtained using DNS requested domain on behalf of the user1. Different registrars scans). have different registration interfaces, but the communications We address the second issue by presenting a survey on between the registrar and registry are typically performed domain abuses (xIV) and discuss their underlying business using the Extensible Provisioning Protocol (EPP, RFC 5730). models and respective monetization methods. We demonstrate how they create patterns in our datasets, and discuss how TLD 1Some registries allow domain tasting, in which a user may try a domain operators [8] can leverage these to develop methods tailored to for a few days for free (domain tasting), but is not the case for .nl. Business Spam RegDB AuthDNS Records Lit Domains are registered for a certain period of time (de- Phishing(0-day) Yes Weak Strong Weak [3], [6] pending on the registry), and after expiration, they can enter Phishing(comp.) Yes None Strong Weak [17] Parking (Ads) No Strong Weak Strong [18], [19] a Redemption Grade Period (RGP), in which the former Parking (Mal) No Strong Weak Strong [18], [19] registrant can still renew the domain. After this period has Fake Goods Yes Weak Weak Medium [6], [20] expired, the domain is deleted and other registrants can register Drop-Catch No Medium Medium Weak [21] Botnet C&C No Medium Strong ? [22] it (this depends on the policy of the registry and .nl domains Blackhat SEO No Medium Medium Strong [23], [24] are made available after 40 days of the expiration date). TABLE I Each registry, in turn, maintains its own registration BUSINESS MODELS AND DATASETS/SIGNAL “STRENGTH”, AND database, which then is used to generate a Zone File (Figure 1) RESEARCH WORKS THAT COVER THOSE. that contains the list of all active and delegated domains sampled view (due to caching on the resolvers [13]) of all under the respective TLD and their respective DNS records. queries issued to .nl. Similarly to the registration database, Ultimately, the zone files are used as input files on the 2 researchers usually do not have access to this type of data – authoritative name servers for the particular TLD . These when they have it typically covers a snapshot of it. We, on the zone files are also frequently updated, and each TLD operator other hand, have been continuously storing this data since May chooses how often–.nl updates its zone files every hour. 2014. We use our open-source Hadoop-based ENTRADA [14] B. Domain Resolution to store and process this dataset. Records: last, .nl zone files contains information about all Domain name resolution consists of resolving a domain active domains, but not all the DNS records [11]. To obtain name into, ultimately, its IP addresses or other specific types such information and types of records, we utilize the daily of DNS records [11]. We summarize this process on the right scans (Figure 1) to our zone [15]. side of Figure 1. First a user attempts to access a web site Access to these datasets is regulated by our publicly avail- (e.g,: example.nl). The stub DNS resolver on his/her computer able data privacy framework that conforms to both EU and sends a DNS request to its DNS resolver, typically provided Dutch legislation [16]. We refer the interested reader to [8] for by the ISP. The DNS resolver, in turn, contacts one of the a discussion on a security and stability role of TLDs, including root.hints Root DNS servers [12] (as provided by file) to privacy management. obtain the authoritative name server for .nl. Then, it will send another request to one of the authoritative servers of .nl, which IV. MONETIZATION METHODS AND MITIGATION respond for example.nl. Caching on DNS resolvers [13] may eliminate some of these steps. Finally, the resolver responds How can one monetize using domain names? Answering to the user with the required DNS record. this question allows us to understand the underlying business models employed by domain name abusers. These business III. DATASETS AND TLDS VANTAGE POINT models vary significantly, leaving an, often distinctive, “trace” RegDB: at domain registration side (Figure 1), we have ac- on different types of data sets we discussed in xIII, which can, cess to the historical database of .nl, which contains historical in turn, be used to mitigate such abuses. Table I summarizes information about registration and removal of domains from the relationship between commonly observed business models its respective zone files. We refer to this dataset as RegDB. and the three datasets we covered in xIII, and whether they This dataset contains complete information about registrant use spam to advertise their domains. and registrar (and resellers, if applicable), as well as some There are, however, other business models that can be used of the DNS records of the respective domain (NS, DS, and to monetize on DNS – such as compromising a registrar or DNS glue [11]) for a period of 20+ years.